CRSD.BIN does not start
- Note: in 11.2 ASM starts before crsd.bin, and brings up the diskgroup automatically if it contains the OCR.
- Typical Problems :
- OCR not acessible
- Networking problems ( includes firewall, Nameserver problems and Network related errors ) ==> For futher details see GENERIC Networking troubleshooting chapter
- Common File Protection problems ( Oracle executable, Log Files, IPC sockets ) ==> For futher details see GENERIC File Protection troubleshooting chapter
Case #1: OCR not accessbible - CRS-5019 error
Reported error in ocssd.log : [check] DgpAgent::queryDgStatus no data found in v$asm_diskgroup_stat
Reported Clusterware Error in CW alert.log: CRS-5019:All OCR locations are on ASM disk groups [OCR3],
and none of these disk groups are mounted.
Details are at "(:CLSN00100:)" in ".../ohasd/oraagent_grid/oraagent_grid.log".
Testing scenario :
In file /etc/oracle/ocr.loc
Change entry ocrconfig_loc=+OCR to ocrconfig_loc=+OCR3
Clusterware status :
# crsctl check crs
CRS-4638: Oracle High Availability Services is online
CRS-4535: Cannot communicate with Cluster Ready Services
CRS-4529: Cluster Synchronization Services is online
CRS-4534: Cannot communicate with Event Manager
# crsi
NAME TARGET STATE SERVER STATE_DETAILS
------------------------- ---------- ---------- ------------ ------------------
ora.asm ONLINE INTERMEDIATE grac41 OCR not started
ora.cluster_interconnect.haip ONLINE ONLINE grac41
ora.crf ONLINE ONLINE grac41
ora.crsd ONLINE OFFLINE
ora.cssd ONLINE ONLINE grac41
ora.cssdmonitor ONLINE ONLINE grac41
ora.ctssd ONLINE ONLINE grac41 OBSERVER
ora.diskmon OFFLINE OFFLINE
ora.drivers.acfs ONLINE ONLINE grac41
ora.evmd ONLINE INTERMEDIATE grac41
ora.gipcd ONLINE ONLINE grac41
ora.gpnpd ONLINE ONLINE grac41
ora.mdnsd ONLINE ONLINE grac41
--> Both resoruces ora.evmd and ora.asm reports their state in INTERMEDIATE - CRSD/EVMD doesn't come up
# ps -elf | egrep "PID|d.bin|ohas|oraagent|orarootagent|cssdagent|cssdmonitor" | grep -v grep
F S UID PID PPID C PRI NI ADDR SZ WCHAN STIME TTY TIME CMD
4 S root 6098 1 0 80 0 - 2846 pipe_w 04:44 ? 00:00:00 /bin/sh /etc/init.d/init.ohasd run
4 S root 10132 1 1 80 0 - 179077 futex_ 10:09 ? 00:00:07 /u01/app/11204/grid/bin/ohasd.bin reboot
4 S grid 10295 1 0 80 0 - 175970 futex_ 10:09 ? 00:00:03 /u01/app/11204/grid/bin/oraagent.bin
0 S grid 10306 1 0 80 0 - 74289 poll_s 10:09 ? 00:00:00 /u01/app/11204/grid/bin/mdnsd.bin
0 S grid 10317 1 0 80 0 - 127382 hrtime 10:09 ? 00:00:01 /u01/app/11204/grid/bin/gpnpd.bin
0 S grid 10328 1 1 80 0 - 159711 hrtime 10:09 ? 00:00:04 /u01/app/11204/grid/bin/gipcd.bin
4 S root 10330 1 1 80 0 - 168735 futex_ 10:09 ? 00:00:04 /u01/app/11204/grid/bin/orarootagent.bin
4 S root 10344 1 6 -40 - - 160907 hrtime 10:09 ? 00:00:25 /u01/app/11204/grid/bin/osysmond.bin
4 S root 10355 1 0 -40 - - 162793 futex_ 10:09 ? 00:00:01 /u01/app/11204/grid/bin/cssdmonitor
4 S root 10374 1 0 -40 - - 162921 futex_ 10:09 ? 00:00:01 /u01/app/11204/grid/bin/cssdagent
4 S grid 10385 1 4 -40 - - 166593 futex_ 10:09 ? 00:00:17 /u01/app/11204/grid/bin/ocssd.bin
4 S root 10733 1 0 -40 - - 127270 poll_s 10:09 ? 00:00:03 /u01/app/11204/grid/bin/ologgerd
4 S root 11286 1 0 80 0 - 159388 futex_ 10:09 ? 00:00:03 /u01/app/11204/grid/bin/octssd.bin reboot
0 S grid 11307 1 0 80 0 - 75351 hrtime 10:09 ? 00:00:00 /u01/app/11204/grid/bin/evmd.bin
Tracefile Details :
alertgrac41.log:
2014-05-24 10:13:48.085:
[/u01/app/11204/grid/bin/oraagent.bin(10295)]CRS-5019:All OCR locations are on ASM disk groups [OCR3],
and none of these disk groups are mounted.
Details are at "(:CLSN00100:)" in ".../agent/ohasd/oraagent_grid/oraagent_grid.log".
2014-05-24 10:14:18.144:
[/u01/app/11204/grid/bin/oraagent.bin(10295)]CRS-5019:All OCR locations are on ASM disk groups [OCR3],
and none of these disk groups are mounted.
Details are at "(:CLSN00100:)" in ".../agent/ohasd/oraagent_grid/oraagent_grid.log".
ohasd/oraagent_grid/oraagent_grid.log
2014-05-24 10:10:47.848: [ora.asm][4179105536]{0:0:2} [check] AsmAgent::check ocrCheck 1 m_OcrOnline 0 m_OcrTimer 30
2014-05-24 10:10:47.848: [ora.asm][4179105536]{0:0:2} [check] DgpAgent::initOcrDgpSet { entry
2014-05-24 10:10:47.848: [ora.asm][4179105536]{0:0:2} [check] DgpAgent::initOcrDgpSet procr_get_conf: retval [0]
configured [1] local only [0] error buffer []
2014-05-24 10:10:47.848: [ora.asm][4179105536]{0:0:2} [check] DgpAgent::initOcrDgpSet procr_get_conf: OCR loc [0], Disk Group : [+OCR3]
2014-05-24 10:10:47.848: [ora.asm][4179105536]{0:0:2} [check] DgpAgent::initOcrDgpSet m_ocrDgpSet 02d965f8 dgName OCR3
2014-05-24 10:10:47.848: [ora.asm][4179105536]{0:0:2} [check] DgpAgent::initOcrDgpSet ocrret 0 found 1
2014-05-24 10:10:47.848: [ora.asm][4179105536]{0:0:2} [check] DgpAgent::initOcrDgpSet ocrDgpSet OCR3
2014-05-24 10:10:47.848: [ora.asm][4179105536]{0:0:2} [check] DgpAgent::initOcrDgpSet exit }
2014-05-24 10:10:47.848: [ora.asm][4179105536]{0:0:2} [check] DgpAgent::ocrDgCheck Entry {
2014-05-24 10:10:47.848: [ora.asm][4179105536]{0:0:2} [check] DgpAgent::getConnxn connected
2014-05-24 10:10:47.850: [ora.asm][4179105536]{0:0:2} [check] DgpAgent::queryDgStatus excp no data found
2014-05-24 10:10:47.850: [ora.asm][4179105536]{0:0:2} [check] DgpAgent::queryDgStatus no data found in v$asm_diskgroup_stat
2014-05-24 10:10:47.851: [ora.asm][4179105536]{0:0:2} [check] DgpAgent::queryDgStatus dgName OCR3 ret 1
2014-05-24 10:10:47.851: [ora.asm][4179105536]{0:0:2} [check] (:CLSN00100:)DgpAgent::ocrDgCheck OCR dgName OCR3 state 1
2014-05-24 10:10:47.851: [ora.asm][4179105536]{0:0:2} [check] AsmAgent::check ocrCheck 2 m_OcrOnline 0 m_OcrTimer 31
2014-05-24 10:10:47.851: [ora.asm][4179105536]{0:0:2} [check] CrsCmd::ClscrsCmdData::stat entity 5 statflag 32 useFilter 1
2014-05-24 10:10:48.053: [ COMMCRS][3665618688]clsc_connect: (0x7f77d4101f30) no listener at (ADDRESS=(PROTOCOL=IPC)(KEY=CRSD_UI_SOCKET))
Debugging steps
Identify OCR ASM DG on a running instance with asmcmd
[grid@grac42 ~]$ asmcmd lsdg
State Type Rebal Sector Block AU Total_MB Offline_disks Voting_files Name
MOUNTED NORMAL N 512 4096 1048576 3057 0 N ASMLIB_DG/
MOUNTED NORMAL N 512 4096 1048576 30708 0 N DATA/
MOUNTED EXTERN N 512 4096 1048576 40946 0 N FRA/
MOUNTED NORMAL N 512 4096 1048576 6141 0 Y OCR/
MOUNTED NORMAL N 512 4096 1048576 3057 0 N SSD/
--> diskgroup OCR is serving our voting disks ( and not DG +OCR3 )
Check availale disks:
[grid@grac42 ~]$ asmcmd lsdsk -k
Total_MB Free_MB OS_MB Name Failgroup Failgroup_Type Library Path
10236 2809 10236 DATA_0001 DATA_0001 REGULAR System /dev/asmdisk1_udev_sdc1
10236 2804 10236 DATA_0002 DATA_0002 REGULAR System /dev/asmdisk1_udev_sdd1
10236 2804 10236 DATA_0003 DATA_0003 REGULAR System /dev/asmdisk1_udev_sde1
2047 1696 2047 OCR_0000 OCR_0000 REGULAR System /dev/asmdisk1_udev_sdf1
2047 1696 2047 OCR_0001 OCR_0001 REGULAR System /dev/asmdisk1_udev_sdg1
2047 1697 2047 OCR_0002 OCR_0002 REGULAR System /dev/asmdisk1_udev_sdh1
1019 875 1019 SSD_0000 SSD_0000 REGULAR System /dev/asmdisk8_ssd1
1019 875 1019 SSD_0001 SSD_0001 REGULAR System /dev/asmdisk8_ssd2
1019 875 1019 SSD_0002 SSD_0002 REGULAR System /dev/asmdisk8_ssd3
20473 10467 20473 FRA_0001 FRA_0001 REGULAR System /dev/asmdisk_fra1
20473 10461 20473 FRA_0002 FRA_0002 REGULAR System /dev/asmdisk_fra2
1019 882 1019 ASMLIB_DG_0000 ASMLIB_DG_0000 REGULAR System /dev/oracleasm/disks/ASMLIB_DISK1
1019 882 1019 ASMLIB_DG_0001 ASMLIB_DG_0001 REGULAR System /dev/oracleasm/disks/ASMLIB_DISK2
1019 882 1019 ASMLIB_DG_0002 ASMLIB_DG_0002 REGULAR System /dev/oracleasm/disks/ASMLIB_DISK3
--> /dev/asmdisk1_udev_sdf1, /dev/asmdisk1_udev_sdg1 , /dev/asmdisk1_udev_sdh1 are
our disk serving the voting files
Verify on a working system the voting disk
[grid@grac42 ~]$ kfed read /dev/asmdisk1_udev_sdf1 | egrep 'vf|name'
kfdhdb.dskname: OCR_0000 ; 0x028: length=8
kfdhdb.grpname: OCR ; 0x048: length=3
kfdhdb.fgname: OCR_0000 ; 0x068: length=8
kfdhdb.capname: ; 0x088: length=0
kfdhdb.vfstart: 320 ; 0x0ec: 0x00000140
kfdhdb.vfend: 352 ; 0x0f0: 0x00000160
Note: If the markers between vfstart & vfend are not 0 then disk does contain voting disk !
Check OCR with ocrcheck
[grid@grac41 ~]$ ocrcheck
PROT-602: Failed to retrieve data from the cluster registry
PROC-26: Error while accessing the physical storage
Tracing ocrcheck
[grid@grac41 ~]$ strace -f -o ocrcheck.trc ocrcheck
[grid@grac41 ~]$ grep open ocrcheck.trc | grep ocr.loc
17530 open("/etc/oracle/ocr.loc", O_RDONLY) = 6
--> ocrcheck reads /etc/oracle/ocr.loc
[grid@grac41 ~]$ cat /etc/oracle/ocr.loc
ocrconfig_loc=+OCR3
local_only=false
==> +OCR3 is a wrong entry
Verify CW state with cluvfy
[grid@grac42 ~]$ cluvfy comp ocr -n grac42,grac41
Verifying OCR integrity
Checking OCR integrity...
Checking the absence of a non-clustered configuration...
All nodes free of non-clustered, local-only configurations
ERROR:
PRVF-4193 : Asm is not running on the following nodes.
Proceeding with the remaining nodes.
grac41
Checking OCR config file "/etc/oracle/ocr.loc"...
OCR config file "/etc/oracle/ocr.loc" check successful
ERROR:
PRVF-4195 : Disk group for ocr location "+OCR" not available on the following nodes:
grac41
NOTE:
This check does not verify the integrity of the OCR contents.
Execute 'ocrcheck' as a privileged user to verify the contents of OCR.
OCR integrity check failed
Verification of OCR integrity was unsuccessful.
Checks did not pass for the following node(s):
grac41