OHASD Agents do not start
ohasd.bin spawns four agents/monitors to start resources:
- oraagent: responsible for ora.asm, ora.evmd, ora.gipcd, ora.gpnpd, ora.mdnsd, etc.
- orarootagent: responsible for ora.crsd, ora.ctssd, ora.diskmon, ora.drivers.acfs, etc.
- cssdagent / cssdmonitor: responsible for ora.cssd (for ocssd.bin) and ora.cssdmonitor (for cssdmonitor itself)
If ohasd.bin cannot start any of the above agents properly, the clusterware will not
reach a healthy state.
Potential Problems
1. A common cause of agent failure is wrong ownership or permissions on the agent's log file or log directory.
2. If an agent binary (oraagent.bin, orarootagent.bin, etc.) is corrupted, the agent will not start and its related resources will not come up.
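Both problems can be checked from the shell before restarting anything. The following is a minimal sketch, not an Oracle-supplied tool: check_paths is a hypothetical helper name, it assumes GNU stat (Linux), and the paths in the usage comment are only examples taken from this 11.2 installation. The helper does not know the correct owner or mode, so compare its output with a healthy node.

```shell
#!/bin/sh
# check_paths (hypothetical helper): report owner, group and octal mode for
# each path given as an argument. Prints "MISSING <path>" for paths that do
# not exist and returns non-zero if at least one path is missing.
check_paths() {
    rc=0
    for p in "$@"; do
        if [ -e "$p" ]; then
            # GNU stat format: %U owner, %G group, %a octal mode, %n file name
            stat -c 'OK %U:%G %a %n' "$p"
        else
            echo "MISSING $p"
            rc=1
        fi
    done
    return $rc
}

# Example call (paths are assumptions based on this installation):
#   check_paths "$GRID_HOME/log/grac41" "$GRID_HOME/log/grac41/ohasd" \
#               "$GRID_HOME/bin/oraagent.bin" "$GRID_HOME/bin/orarootagent.bin"
```

A MISSING line or an owner that differs from a working node points to problem 1. A corrupted binary (problem 2) is not detected by this check; comparing the file with the copy on a healthy node would be needed for that.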
Debugging CRS startup if trace file location is not accessible
Action - Rename the trace directory to simulate the problem
[grid@grac41 log]$ mv $GRID_HOME/log/grac41 $GRID_HOME/log/grac41_nw
[grid@grac41 log]$ crsctl start crs
CRS-4124: Oracle High Availability Services startup failed.
CRS-4000: Command Start failed, or completed with errors.
Process Status and CRS status
[root@grac41 .oracle]# ps -elf | egrep "PID|d.bin|ohas|oraagent|orarootagent|cssdagent|cssdmonitor" | grep -v grep
F S UID PID PPID C PRI NI ADDR SZ WCHAN STIME TTY TIME CMD
4 S root 5396 1 0 80 0 - 2847 pipe_w 10:52 ? 00:00:00 /bin/sh /etc/init.d/init.ohasd run
4 S root 26705 25370 1 80 0 - 47207 hrtime 14:05 pts/7 00:00:00 /u01/app/11204/grid/bin/crsctl.bin start crs
[root@grac41 .oracle]# crsctl check crs
CRS-4639: Could not contact Oracle High Availability Services
OS Tracefile: /var/log/messages
May 13 13:48:27 grac41 root: exec /u01/app/11204/grid/perl/bin/perl -I/u01/app/11204/grid/perl/lib
/u01/app/11204/grid/bin/crswrapexece.pl
/u01/app/11204/grid/crs/install/s_crsconfig_grac41_env.txt /u01/app/11204/grid/bin/ohasd.bin "reboot"
May 13 13:48:27 grac41 OHASD[22203]: OHASD exiting; Directory /u01/app/11204/grid/log/grac41/ohasd not found
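Because OHASD writes its exit reason to syslog, the relevant line can be pulled out directly. This is a small sketch that assumes the message format shown above ("OHASD[pid]: OHASD exiting; reason"); ohasd_exit_reasons is a hypothetical function name and /var/log/messages is only the default used here.

```shell
#!/bin/sh
# ohasd_exit_reasons (hypothetical helper): print the reason text of every
# "OHASD exiting" message found in a syslog file.
# $1 = syslog file, defaults to /var/log/messages
ohasd_exit_reasons() {
    logfile=${1:-/var/log/messages}
    # Keep only lines logged by OHASD[<pid>], then strip everything up to
    # and including "OHASD exiting; " so only the reason remains.
    grep 'OHASD\[[0-9]*\]: OHASD exiting' "$logfile" |
        sed 's/^.*OHASD exiting; //'
}

# Example call (reading /var/log/messages usually requires root):
#   ohasd_exit_reasons /var/log/messages
```

For the log excerpt above this prints the line naming the missing ohasd directory, which is the same finding the strace run below confirms.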
Debugging steps
[root@grac41 gpnpd]# strace -f -o ohas.trc crsctl start crs
CRS-4124: Oracle High Availability Services startup failed.
CRS-4000: Command Start failed, or completed with errors.
[root@grac41 gpnpd]# grep ohasd ohas.trc
...
22203 execve("/u01/app/11204/grid/bin/ohasd.bin", ["/u01/app/11204/grid/bin/ohasd.bi"..., "reboot"], [/* 60 vars */]) = 0
22203 stat("/u01/app/11204/grid/log/grac41/ohasd", 0x7fff17d68f40) = -1 ENOENT (No such file or directory)
==> Directory /u01/app/11204/grid/log/grac41/ohasd is missing (ENOENT); wrong permissions would show up as EACCES instead
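Grepping for a process name works when the failing call is already suspected. A broader filter is to list every path whose system call failed with ENOENT or EACCES. The sketch below assumes the default strace line format shown above, with the path as the first quoted argument; strace_failed_paths is a hypothetical function name.

```shell
#!/bin/sh
# strace_failed_paths (hypothetical helper): list "<errno> <path>" for every
# system call in a strace output file that failed with ENOENT or EACCES.
# $1 = strace output file (for example the ohas.trc written by strace -o)
strace_failed_paths() {
    # First keep only the failed calls, then capture the first quoted
    # argument (the path) and the errno name after "= -1 ".
    grep -E '= -1 (ENOENT|EACCES)' "$1" |
        sed -n 's/^[^"]*"\([^"]*\)".*= -1 \([A-Z]*\).*$/\2 \1/p'
}

# Example call:
#   strace_failed_paths ohas.trc | grep "$GRID_HOME/log"
```

Many ENOENT results in a trace are harmless (library and locale lookups, for instance), so filtering on the Grid home log directory, as in the example call, keeps the output manageable. In this test case, renaming the directory back reverses the change made under "Action".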
Using cluvfy comp olr
[grid@grac41 ~]$ cluvfy comp olr
Verifying OLR integrity
Checking OLR integrity...
Checking OLR config file...
ERROR:
2014-05-17 18:26:41.576: CLSD: A file system error occurred while attempting to create default permissions for
file "/u01/app/11204/grid/log/grac41/alertgrac41.log" during alert open processing for
process "client". Additional diagnostics:
LFI-00133: Trying to create file /u01/app/11204/grid/log/grac41/alertgrac41.log
that already exists.
LFI-01517: open() failed(OSD return value = 13).
2014-05-17 18:26:41.585: CLSD: An error was encountered while attempting to
open alert log "/u01/app/11204/grid/log/grac41/alertgrac41.log".
Additional diagnostics: (:CLSD00155:) 2014-05-17 18:26:41.585:
OLR config file check successful
Checking OLR file attributes...
ERROR:
PRVF-4187 : OLR file check failed on the following nodes:
grac41
grac41:PRVF-4127 : Unable to obtain OLR location
/u01/app/11204/grid/bin/ocrcheck -config -local
<CV_CMD>/u01/app/11204/grid/bin/ocrcheck -config -local </CV_CMD><CV_VAL>2014-05-17 18:26:45.202:
CLSD: A file system error occurred while attempting to create default permissions for file
"/u01/app/11204/grid/log/grac41/alertgrac41.log" during alert open processing for process "client".
Additional diagnostics: LFI-00133: Trying to create file /u01/app/11204/grid/log/grac41/alertgrac41.log
that already exists.
LFI-01517: open() failed(OSD return value = 13).
2014-05-17 18:26:45.202:
CLSD: An error was encountered while attempting to open alert log
"/u01/app/11204/grid/log/grac41/alertgrac41.log". Additional diagnostics: (:CLSD00155:)
2014-05-17 18:26:45.202:
CLSD: Alert logging terminated for process client. File name: "/u01/app/11204/grid/log/grac41/alertgrac41.log"
2014-05-17 18:26:45.202:
CLSD: A file system error occurred while attempting to create default permissions for file
"/u01/app/11204/grid/log/grac41/client/ocrcheck_7617.log" during log open processing for process "client".
Additional diagnostics: LFI-00133: Trying to create file /u01/app/11204/grid/log/grac41/client/ocrcheck_7617.log
that already exists.
LFI-01517: open() failed(OSD return value = 13).
2014-05-17 18:26:45.202:
CLSD: An error was encountered while attempting to open log file
"/u01/app/11204/grid/log/grac41/client/ocrcheck_7617.log".
Additional diagnostics: (:CLSD00153:)
2014-05-17 18:26:45.202:
CLSD: Logging terminated for process client. File name: "/u01/app/11204/grid/log/grac41/client/ocrcheck_7617.log"
Oracle Local Registry configuration is :
Device/File Name : /u01/app/11204/grid/cdata/grac41.olr
</CV_VAL><CV_VRES>0</CV_VRES><CV_LOG>Exectask: runexe was successful</CV_LOG><CV_ERES>0</CV_ERES>
OLR integrity check failed
Verification of OLR integrity was unsuccessf