Archive
Check if a program is already running in Unix
There is more than one way to do it. The safest is probably to check whether /home/lsc/OH_YES_I_AM_RUNNING exists and believe it. This is called the file.PID method and is widely used (Apache has used it for a long, long time). It needs a file, and it needs cleanup if you reboot your server in the middle of something (and surely you do not want to delete old pid files yourself).
OK, you often see this:
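Here is a minimal sketch of the pid-file idea, assuming the program writes its own pid into the file at startup. The names PIDFILE and is_running and the /tmp/program.pid path are only illustrative. The check-then-write step is not atomic, and kill -0 also fails for a live process that belongs to another user, so take it as a starting point rather than a finished lock.

```shell
#!/bin/sh
# Sketch of the pid-file check. PIDFILE and is_running are illustrative names,
# not taken from any particular program.
PIDFILE=${PIDFILE:-/tmp/program.pid}

# Succeeds (returns 0) when the file holds the pid of a process we can signal.
is_running() {
    [ -s "$1" ] || return 1          # no file, or empty file: not running
    pid=$(cat "$1")
    kill -0 "$pid" 2>/dev/null       # signal 0: existence check only, nothing is sent
}

if is_running "$PIDFILE"
then
    echo "already running with pid $(cat "$PIDFILE")"
else
    echo $$ > "$PIDFILE"             # a stale file left by a reboot is simply overwritten
    trap 'rm -f "$PIDFILE"' EXIT     # remove the file on normal exit
    echo "started with pid $$"
fi
```

Because the stored pid is verified with kill -0, a file left over from a crash or reboot usually does no harm, unless the pid has since been reused by another process.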
ps -ef | grep program
This lists all processes and keeps the lines that contain program. So if someone runs vi program or anything worse (emacs?), you will get more rows than needed.
Maybe it is fine to run program with different arguments; this must be decided.
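One way to avoid the vi/grep false positives is to compare the command columns exactly instead of grepping the whole line. The sketch below assumes a ps that supports the POSIX -o option. count_instances is a made-up helper name, and the vi and grep lines in the sample are invented for illustration. It only addresses the extra rows, not the reliability of ps itself.

```shell
#!/bin/sh
# count_instances INTERPRETER SCRIPT
# Reads "pid args" lines on stdin and counts those whose command is exactly
# INTERPRETER followed by SCRIPT. Hypothetical helper, for illustration only.
count_instances() {
    awk -v interp="$1" -v script="$2" \
        '$2 == interp && $3 == script { n++ } END { print n + 0 }'
}

# Canned input in the layout produced by: ps -e -o pid= -o args=
# The first two lines come from the test case; the last two are invented.
sample='9240796 /bin/ksh ./x2.sh
20840608 /bin/ksh ./x1.sh
4242 vi ./x1.sh
4243 grep x1.sh'

printf '%s\n' "$sample" | count_instances /bin/ksh ./x1.sh   # prints 1
```

Against live data the call would be ps -e -o pid= -o args= | count_instances /bin/ksh ./x1.sh.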
Well, take a simple test case :
x1.sh and x2.sh :
#!/bin/ksh
while :
do
date > /dev/null
done
Let’s try to use ps:
$ nohup ./x1.sh &
$ nohup ./x2.sh &
$ jobs
[2] + Running nohup ./x2.sh &
[1] - Running nohup ./x1.sh &
$ ps -ef | egrep 'x[12]'
u22 9240796 6226164 30 14:56:52 pts/2 0:00 /bin/ksh ./x2.sh
u22 20840608 6226164 31 14:56:48 pts/2 0:01 /bin/ksh ./x1.sh
So far so good: I see one instance of each program.
Let’s see if the results are consistent over time:
$ n=9999;while :
do
ps -ef |
egrep 'x[12].sh'>f
if [ $(wc -l <f) != $n ]
then
n=$(wc -l <f)
echo
date
cat f
echo "==> $n"
fi
done
Fri Oct 28 15:01:01 CEST 2011
u22 9240796 6226164 32 14:56:52 pts/2 0:14 /bin/ksh ./x2.sh
u22 20840608 6226164 28 14:56:48 pts/2 0:14 /bin/ksh ./x1.sh
==> 2
Fri Oct 28 15:01:08 CEST 2011
u22 9240796 6226164 50 14:56:52 pts/2 0:14 /bin/ksh ./x2.sh
==> 1
Fri Oct 28 15:01:09 CEST 2011
u22 9240796 6226164 52 14:56:52 pts/2 0:14 /bin/ksh ./x2.sh
u22 20840608 6226164 53 14:56:48 pts/2 0:15 /bin/ksh ./x1.sh
==> 2
Fri Oct 28 15:01:17 CEST 2011
u22 9240796 6226164 40 14:56:52 pts/2 0:15 /bin/ksh ./x2.sh
u22 10944520 9240796 0 15:01:17 pts/2 0:00 /bin/ksh ./x2.sh
u22 20840608 6226164 31 14:56:48 pts/2 0:16 /bin/ksh ./x1.sh
==> 3
The fact that a subshell of x2 (pid 10944520) appears is not a problem for me. I have much more of a problem at 15:01:08, where x1 disappeared!
Conclusion : you cannot trust ps
shell and list of files
How do you loop through a list of files?
For instance, you want to archive then delete all pdf documents in the current directory:
Bad practice :
tar cvf f.tar *.pdf
rm *.pdf
There are multiple issues with the commands above.
1) new files could arrive during the tar, so the rm would delete files that have not been archived
filelist=$(ls *.pdf)
tar cvf f.tar $filelist
rm $filelist
2) if there is no file, tar and rm will return an error
filelist=$(ls|grep '\.pdf$')
if [ -n "$filelist" ]
then
tar cvf f.tar $filelist
rm $filelist
fi
3) this will not work for a long list (above 100k documents)
# keep the list in a file so it never has to fit on a command line
filelist=/tmp/filelist.$(date "+%Y%m%d%H%M%S").$$.$RANDOM
ls|grep '\.pdf$' > $filelist
if [ -s "$filelist" ]
then
# L tells tar to read the names to archive from the list file
tar cvfL f.tar $filelist
for f in $(<$filelist)
do
rm $f
done
fi
As you see, this requires special handling. tar (on AIX, for instance) uses the -L option to accept a list of files, and rm can delete files one by one (or in bunches with xargs -L).
This 100’000 limit (it may vary with your shell/OS) is something that often gets forgotten.
Typical errors that can occur are
ksh: no space
bash: Arg list too long
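The same idea (snapshot the list into a file, archive from the file, delete exactly what was listed) can be sketched as a small function. This version assumes GNU tar, where -T reads the names from a file, and a mktemp command. It also assumes file names contain no newlines. archive_and_delete is a made-up name, and ARCHIVE should be an absolute path.

```shell
#!/bin/sh
# archive_and_delete DIR ARCHIVE
# Archives the *.pdf files present in DIR right now, then deletes exactly those.
# Assumptions: GNU tar (-T FILE), mktemp, no newlines in file names.
archive_and_delete() {
    dir=$1 archive=$2 status=0
    list=$(mktemp) || return 1
    # find writes one name per line; no glob expansion, so no argument-list limit.
    # "! -name . -prune" keeps find from descending into subdirectories.
    ( cd "$dir" && find . ! -name . -prune -type f -name '*.pdf' ) > "$list"
    if [ -s "$list" ]
    then
        # delete only if tar succeeded, and only the names that were archived
        ( cd "$dir" && tar -cf "$archive" -T "$list" &&
          while IFS= read -r f; do rm -- "$f"; done < "$list" ) || status=1
    fi
    rm -f "$list"
    return $status
}

# Example call (commented out so that running this file deletes nothing):
# archive_and_delete "$PWD" /tmp/f.tar
```

Files that arrive after the snapshot are neither archived nor deleted, and an empty directory produces no archive and no error.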
pstree in AIX
For those who do not want to download some Linux-like freeware onto their AIX box, use ps -T:
ps -fT 2412672
UID PID PPID C STIME TTY TIME CMD
oracle 2412672 1 0 Sep 05 - 0:00 /u01/app/oracle/product/OAS
oracle 630956 2412672 0 Sep 05 - 6:11 \--/u01/app/oracle/prod
oracle 1347672 630956 0 Sep 05 - 15:32 |\--/u01/app/oracle/
oracle 1437836 630956 0 Sep 05 - 1:02 |\--/u01/app/oracle/
oracle 880820 1437836 0 Sep 05 - 0:32 | |\--/u01/app/ora
oracle 1036532 1437836 0 Sep 05 - 0:00 | |\--/u01/app/ora
oracle 1134796 1437836 0 Sep 05 - 0:01 | |\--/u01/app/ora
oracle 1343712 1437836 0 Sep 05 - 0:33 | |\--/u01/app/ora
oracle 1368166 1437836 0 Sep 05 - 1:11 | |\--/u01/app/ora
oracle 1384684 1437836 0 Sep 05 - 0:33 | |\--/u01/app/ora
oracle 1392862 1437836 0 Sep 05 - 0:32 | |\--/u01/app/ora
oracle 1396898 1437836 0 Sep 05 - 0:33 | |\--/u01/app/ora
oracle 1482978 1437836 0 Sep 05 - 0:32 | |\--/u01/app/ora
oracle 1527890 1437836 0 Sep 05 - 0:00 | |\--/u01/app/ora
oracle 1781798 1437836 0 Sep 05 - 0:32 | |\--/u01/app/ora
oracle 2195474 1437836 0 Sep 26 - 0:13 | \--/u01/app/ora
oracle 1626296 630956 0 Sep 05 - 13:49 \--/u01/app/oracle/
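On a box where ps has no -T option, a similar tree can be rebuilt from the pid and ppid columns. This is only a rough sketch: ptree is a made-up function name, the sample processes are invented, and the input is assumed to come from ps -e -o pid= -o ppid= -o args=.

```shell
#!/bin/sh
# ptree ROOT
# Reads "pid ppid command" lines on stdin and prints ROOT and its descendants,
# indented two spaces per level. Hypothetical helper, for illustration only.
ptree() {
    awk -v root="$1" '
        {
            pid = $1; ppid = $2
            sub(/^[ \t]*[0-9]+[ \t]+[0-9]+[ \t]+/, "")    # keep only the command text
            cmd[pid] = $0
            if (pid != ppid)                              # guard against a self-parented pid
                kids[ppid] = kids[ppid] " " pid           # children, in input order
        }
        function show(p, depth,    n, i, list, pad) {
            pad = ""
            for (i = 0; i < depth; i++) pad = pad "  "
            print pad p " " cmd[p]
            n = split(kids[p], list, " ")
            for (i = 1; i <= n; i++) show(list[i], depth + 1)
        }
        END { if (root in cmd) show(root, 0) }
    '
}

# Invented sample in "pid ppid command" layout
sample='100 1 parent
101 100 child-a
102 101 grandchild
103 100 child-b
200 1 other'

printf '%s\n' "$sample" | ptree 100
```

Against live data the call would be ps -e -o pid= -o ppid= -o args= | ptree 2412672.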
Large zip on Windows
I have never been a Microsoft fanatic nor an anti-Microsoft terrorist, but today I could not believe that large compressed folders got corrupted in Windows!
I sent a relatively small zip file (5 GB, peanuts) from AIX to Windows via sftp, and in Windows Explorer some files in the compressed folder (read: zip) were just pointing to the wrong content.
I had some issues with large zip files on Unix, but that was last century! How come a modern filesystem/operating system has such issues?
I found a few bugs on support.microsoft.com.
Ex: Compressed folder becomes corrupted when larger than 2 gigabytes
Workaround : make sure that you limit the size of a compressed folder to 2 GB or less
Amazing!
_optimizer_random_plan parameter
I was trying to find a workaround for a bug in 11.2.0.2
SELECT * FROM
(SELECT 2 B FROM DUAL WHERE DUMMY = 'Y'),
(SELECT 3 C FROM DUAL WHERE DUMMY LIKE '%')
WHERE C = B(+);
B C
---------- ----------
2 3
---------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
---------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | 4 | 4 (0)| 00:00:01 |
| 1 | NESTED LOOPS OUTER| | 1 | 4 | 4 (0)| 00:00:01 |
|* 2 | TABLE ACCESS FULL| DUAL | 1 | 2 | 2 (0)| 00:00:01 |
|* 3 | TABLE ACCESS FULL| DUAL | 1 | 2 | 2 (0)| 00:00:01 |
---------------------------------------------------------------------------
As DUMMY is 'X', not 'Y', B cannot be 2; the B column should be null.
OK, I tried:
alter session set "_optimizer_random_plan"=1;
SELECT * FROM
(SELECT 2 B FROM DUAL WHERE DUMMY = 'Y'),
(SELECT 3 C FROM DUAL WHERE DUMMY LIKE '%')
WHERE C = B(+);
B C
---------- ----------
3
Execution Plan
----------------------------------------------------------
Plan hash value: 837538736
-------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost |
-------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1146 | 5730 | 27G|
| 1 | MERGE JOIN OUTER | | 603K| 2946K| 27G|
|* 2 | TABLE ACCESS FULL | DUAL | 392K| 767K| 136K|
| 3 | VIEW | | 2 | 6 | 69180 |
|* 4 | FILTER | | | | |
|* 5 | TABLE ACCESS FULL| DUAL | 123K| 240K| 69180 |
-------------------------------------------------------------
Cool, I got correct results! The fact that the cost jumped from 4 to 27 billion is just a minor annoyance, I suppose.
I also tried
alter session set "_optimizer_random_plan"=0; -- default
alter session set "_complex_view_merging"=false;
SELECT * FROM
(SELECT 2 B FROM DUAL WHERE DUMMY = 'Y'),
(SELECT 3 C FROM DUAL WHERE DUMMY LIKE '%')
WHERE C = B(+);
B C
---------- ----------
3
-----------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
-----------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | 5 | 2 (0)| 00:00:01 |
| 1 | NESTED LOOPS OUTER | | 1 | 5 | 2 (0)| 00:00:01 |
|* 2 | TABLE ACCESS FULL | DUAL | 1 | 2 | 2 (0)| 00:00:01 |
| 3 | VIEW | | 1 | 3 | | |
|* 4 | FILTER | | | | | |
|* 5 | TABLE ACCESS FULL| DUAL | 1 | 2 | 2 (0)| 00:00:01 |
-----------------------------------------------------------------------------
The cost is now 2 instead of 4 (with 5 bytes instead of 4), and the results are correct.
The first thing I did was open an SR; now I am impatiently waiting for Oracle Support guidance…
Oh yes, it is a “c”!
Rumours about what would come after 8i (internet) and 10g (grid) were around, but now it is official: there will be an Oracle Enterprise Manager 12c.
Read more : http://www.oracle.com/us/products/enterprise-manager/index.html
And in the blogosphere http://orana.info

Oracle 11.2.0.3
Just installed Oracle 11gR2 patchset 2 on Solaris SPARC 64-bit. It works like a charm; I am impatiently waiting for the AIX patchset.
https://updates.oracle.com/Orion/PatchDetails/process_form?patch_num=10404530