  The platform is Red Hat AS 3 with Oracle 9.2.0.4.



[Fri Apr 13 06:00:03 2007] [error] shm.create(): error creating shm 2 No such file or directory
[Fri Apr 13 06:00:03 2007] [error] shm.create(): error creating shm /home/apache/logs/shm.file
[Fri Apr 13 06:00:03 2007] [warn] pid file /home/apache/logs/httpd.pid overwritten -- Unclean shutdown of previous Apache run?
[Fri Apr 13 06:00:03 2007] [emerg] (28)No space left on device: Couldn't create accept lock





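  This morning I got a message saying that the newly launched Apache application service had stopped. I opened the log and saw it was the shared memory problem again. Without a second thought I ran my old cleanup script on the system and restarted Apache. OK, the system was working again.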

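  A few minutes later things got serious: I was told the Oracle service was down. I checked right away, and ps -ef showed that the Oracle processes were all gone. The alert log had this: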


Errors in file /opt/oracle/admin/sc1/bdump/sc1_reco_5195.trc:
ORA-27157: OS post/wait facility removed
ORA-27300: OS system dependent operation:semop failed with status: 43
ORA-27301: OS failure message: Identifier removed
ORA-27302: failure occurred at: sskgpwwait1
Fri Apr 13 10:10:46 2007
Errors in file /opt/oracle/admin/sc1/bdump/sc1_smon_5193.trc:
ORA-27157: OS post/wait facility removed
ORA-27300: OS system dependent operation:semop failed with status: 43
ORA-27301: OS failure message: Identifier removed
ORA-27302: failure occurred at: sskgpwwait1
Fri Apr 13 10:10:46 2007
RECO: terminating instance due to error 27157
Fri Apr 13 10:10:46 2007
Errors in file /opt/oracle/admin/sc1/udump/sc1_ora_23824.trc:
ORA-27153: wait operation failed
ORA-27300: OS system dependent operation:semop failed with status: 22
ORA-27301: OS failure message: Invalid argument
ORA-27302: failure occurred at: sskgpwwait2
Fri Apr 13 10:10:46 2007
Errors in file /opt/oracle/admin/sc1/bdump/sc1_lgwr_5189.trc:


[root@oracle]# ipcs -s
------ Semaphore Arrays --------
key semid owner perms nsems
0x00000000 4849664 nobody 600 1
0x00000000 4882433 nobody 600 1
0x00000000 4915202 nobody 600 1
0x00000000 4947971 nobody 600 1
0x00000000 4980740 nobody 600 1
0xbeae576c 5111813 oracle 640 201
0xbeae576d 5144582 oracle 640 201
0xbeae576e 5177351 oracle 640 201
0xbeae576f 5210120 oracle 640 201
0xbeae5770 5242889 oracle 640 201
0x00000000 5275658 nobody 600 1
0x00000000 5308427 nobody 600 1
0x00000000 5341196 nobody 600 1
0x00000000 5373965 nobody 600 1
0x00000000 5406734 nobody 600 1
0x00000000 5439503 nobody 600 1
0x00000000 5472272 nobody 600 1
0x00000000 5505041 nobody 600 1
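
A listing like this is easier to judge when it is summarized by owner before anything is deleted. The following is only a minimal sketch: `count_sems_by_owner` is a hypothetical helper name, and it assumes the five-column `key semid owner perms nsems` layout shown above.

```shell
# Sketch: count semaphore arrays per owner, reading `ipcs -s` output on stdin.
# count_sems_by_owner is a hypothetical helper name, not part of any tool.
count_sems_by_owner() {
    # Only rows whose first field is a hex key are data rows; the owner is field 3.
    awk '$1 ~ /^0x/ { n[$3]++ } END { for (o in n) print o, n[o] }' | sort
}
```

It would be used as `ipcs -s | count_sems_by_owner`. Any owner that shows up unexpectedly in the summary is a sign that a blanket cleanup is unsafe.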

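  Sure enough, Oracle's IPC resources were in there (its semaphore arrays, as the listing shows), and my script never checked who owned what it was deleting. If you only want to remove the ones that belong to the apache user, you can do it like this:

  ipcs -s | grep apache | perl -e 'while (<STDIN>) { @a=split(/\s+/); print `ipcrm sem $a[1]`}'

  If anyone's setup is similar to mine, be careful.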
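
One way to make an owner-restricted cleanup harder to get wrong is to match the owner column exactly and to print the removal commands before running them. This is a sketch under assumptions, not the script that was actually used: `sem_ids_for_owner` and `print_ipcrm_commands` are hypothetical helper names, and the owner to pass in has to be read off your own `ipcs -s` output (in the listing above, the non-Oracle arrays are owned by `nobody`).

```shell
# Sketch: list the semaphore IDs owned by exactly one user, reading
# `ipcs -s` output on stdin. Comparing the owner column exactly (rather
# than grepping the whole line) avoids matching rows of other users.
# Both function names are hypothetical helpers.
sem_ids_for_owner() {
    # Data rows have the form: key semid owner perms nsems
    awk -v owner="$1" '$1 ~ /^0x/ && $3 == owner { print $2 }'
}

# Dry run: print the ipcrm commands instead of executing them, so the
# list can be reviewed before anything is removed.
print_ipcrm_commands() {
    sem_ids_for_owner "$1" | while read -r semid; do
        echo "ipcrm sem $semid"
    done
}
```

A dry run would look like `ipcs -s | print_ipcrm_commands nobody`. Only once the printed list contains no Oracle entries would the output be piped to `sh`. The `ipcrm sem <id>` form follows the command in this post; newer versions of `ipcrm` use `ipcrm -s <id>` instead.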



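  Then again, had this been an important business database, the failure caused by an operation like this would have been terrifying (although on an important system, a mistake like this is unlikely to happen in the first place). So a DBA should think twice, and then think again, before acting.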

