CRS does not start: GIPC error [29] msg [gipcretConnectionRefused]
- Check your disk space: # df -k
- Check whether a firewall is active: # service iptables status (this check is important)
- Use nslookup and ping to verify your cluster interconnect
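For a quick first pass, the disk-space and firewall checks can be bundled into one small script. This is a minimal sketch, not part of the original procedure; `service iptables` applies to EL6-style systems, so the script falls back to a manual hint when the command is missing:

```shell
# First-pass checks before digging into CRS traces.
# (service/iptables are EL6-style commands; on other systems check the
# firewall with iptables -L or firewall-cmd instead.)
echo "== disk space =="
df -k / | tail -1
echo "== firewall =="
if command -v service >/dev/null 2>&1; then
  service iptables status 2>&1 || true
else
  echo "service command not available - check the firewall manually"
fi
# Interconnect name resolution is checked separately with nslookup/ping,
# e.g.: nslookup grac41int && ping -c 1 grac41int
```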
Errors:
  GIPC reports error [29] msg [gipcretConnectionRefused]
  CHM reports: clsu_get_private_ip failed
Check CRS status
[root@grac41 Desktop]# crsctl check crs
CRS-4638: Oracle High Availability Services is online
CRS-4535: Cannot communicate with Cluster Ready Services
CRS-4529: Cluster Synchronization Services is online
CRS-4534: Cannot communicate with Event Manager
[root@grac41 network-scripts]# my_crs_stat_init
NAME                           TARGET     STATE      SERVER     STATE_DETAILS
------------------------------ ---------- ---------- ---------- ------------------
ora.asm                        ONLINE     OFFLINE               Instance Shutdown
ora.cluster_interconnect.haip  ONLINE     OFFLINE
ora.crf                        ONLINE     ONLINE     grac41
ora.crsd                       ONLINE     OFFLINE
ora.cssd                       ONLINE     UNKNOWN    grac41
ora.cssdmonitor                ONLINE     ONLINE     grac41
ora.ctssd                      ONLINE     OFFLINE
ora.diskmon                    OFFLINE    OFFLINE
ora.drivers.acfs               ONLINE     ONLINE     grac41
ora.evmd                       ONLINE     OFFLINE
ora.gipcd                      ONLINE     ONLINE     grac41
ora.gpnpd                      ONLINE     ONLINE     grac41
ora.mdnsd                      ONLINE     ONLINE     grac41
--> ASM, HAIP, CRSD, CTSSD, DISKMON, EVMD resources are OFFLINE!
Check traces - ohasd trace file
[root@grac41 ohasd]# cat ohasd.log | grep -i failed
2014-04-22 15:09:17.966: [   AGFW][2735122176]{0:0:2} ora.cluster_interconnect.haip 1 1 received state from probe request. Old state = UNKNOWN, New state = FAILED
2014-04-22 15:09:30.292: [   GPNP][2745628416]clsgpnp_getCachedProfileEx: [at clsgpnp.c:623] Result: (26) CLSGPNP_NO_PROFILE. Failed to get offline GPnP service profile.
2014-04-22 15:09:30.602: [   GPNP][2717640448]clsgpnp_getCachedProfileEx: [at clsgpnp.c:623] Result: (26) CLSGPNP_NO_PROFILE. Failed to get offline GPnP service profile.
--> HAIP goes to FAILED status
Try to find any repeatedly updated trace files - maybe some RAC process is trying to fix the network problem
[grid@grac41 grac41]$ date; find . -type f -printf "%CY-%Cm-%Cd %CH:%CM:%CS %h/%f\n" | sort -n | tail -5
Tue Apr 22 13:24:40 CEST 2014
2014-04-22 13:24:30.0571859790 ./gpnpd/gpnpd.log
2014-04-22 13:24:33.0756944610 ./agent/ohasd/oracssdmonitor_root/oracssdmonitor_root.log
2014-04-22 13:24:38.0881994320 ./ohasd/ohasd.log
2014-04-22 13:24:38.3523314350 ./gipcd/gipcd.log
2014-04-22 13:24:39.0876989250 ./crfmond/crfmond.log
[grid@grac41 grac41]$ date; find . -type f -printf "%CY-%Cm-%Cd %CH:%CM:%CS %h/%f\n" | sort -n | tail -5
Tue Apr 22 13:24:43 CEST 2014
2014-04-22 13:24:30.0571859790 ./gpnpd/gpnpd.log
2014-04-22 13:24:33.0756944610 ./agent/ohasd/oracssdmonitor_root/oracssdmonitor_root.log
2014-04-22 13:24:43.1007044060 ./ohasd/ohasd.log
2014-04-22 13:24:43.3668374000 ./gipcd/gipcd.log
2014-04-22 13:24:43.7580328990 ./crfmond/crfmond.log
[grid@grac41 grac41]$ date; find . -type f -printf "%CY-%Cm-%Cd %CH:%CM:%CS %h/%f\n" | sort -n | tail -5
Tue Apr 22 13:24:47 CEST 2014
2014-04-22 13:24:30.0571859790 ./gpnpd/gpnpd.log
2014-04-22 13:24:33.0756944610 ./agent/ohasd/oracssdmonitor_root/oracssdmonitor_root.log
2014-04-22 13:24:43.1007044060 ./ohasd/ohasd.log
2014-04-22 13:24:44.0972023860 ./crfmond/crfmond.log
2014-04-22 13:24:46.4033548850 ./gipcd/gipcd.log
--> Here we can see that ./ohasd/ohasd.log, ./gipcd/gipcd.log and ./crfmond/crfmond.log are updated on every run.
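The `find -printf | sort | tail` trick above can be reproduced on any scratch directory. A minimal self-contained sketch (using mtime via `%T` instead of the `%C` ctime used above, with made-up file names and timestamps):

```shell
# Create a scratch directory with three "trace files" of known ages,
# then list the most recently modified one - the busiest trace sorts last.
d=$(mktemp -d)
touch -d '2014-04-22 13:24:30' "$d/gpnpd.log"
touch -d '2014-04-22 13:24:43' "$d/ohasd.log"
touch -d '2014-04-22 13:24:46' "$d/gipcd.log"
find "$d" -type f -printf '%TY-%Tm-%Td %TH:%TM:%TS %f\n' | sort -n | tail -1
rm -rf "$d"
```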
Use tail to see what's going on:
[grid@grac41 grac41]$ tail -f ./gpnpd/gpnpd.log
2014-04-22 13:19:59.175: [ OCRMSG][4002494208]GIPC error [29] msg [gipcretConnectionRefused]
2014-04-22 13:21:29.469: [ OCRMSG][4002494208]GIPC error [29] msg [gipcretConnectionRefused]
2014-04-22 13:22:59.792: [ OCRMSG][4002494208]GIPC error [29] msg [gipcretConnectionRefused]
2014-04-22 13:24:30.057: [ OCRMSG][4002494208]GIPC error [29] msg [gipcretConnectionRefused]
2014-04-22 13:26:00.383: [ OCRMSG][4002494208]GIPC error [29] msg [gipcretConnectionRefused]
2014-04-22 13:27:30.622: [ OCRMSG][4002494208]GIPC error [29] msg [gipcretConnectionRefused]
2014-04-22 13:29:00.869: [ OCRMSG][4002494208]GIPC error [29] msg [gipcretConnectionRefused]
2014-04-22 13:30:31.203: [ OCRMSG][4002494208]GIPC error [29] msg [gipcretConnectionRefused]
2014-04-22 13:32:01.459: [ OCRMSG][4002494208]GIPC error [29] msg [gipcretConnectionRefused]
2014-04-22 13:33:31.770: [ OCRMSG][4002494208]GIPC error [29] msg [gipcretConnectionRefused]
[grid@grac41 grac41]$ tail -f ./ohasd/ohasd.log
2014-04-22 13:33:42.806: [GIPCHDEM][2222126848]gipchaDaemonInfRequest: sent local interfaceRequest, hctx 0x2d03370 [0000000000000010] { gipchaContext : host 'grac41', name 'CLSFRAME_grac4', luid '57127705-00000000', numNode 0, numInf 0, usrFlags 0x0, flags 0x63 } to gipcd
2014-04-22 13:33:47.817: [GIPCHDEM][2222126848]gipchaDaemonInfRequest: sent local interfaceRequest, hctx 0x2d03370 [0000000000000010] { gipchaContext : host 'grac41', name 'CLSFRAME_grac4', luid '57127705-00000000', numNode 0, numInf 0, usrFlags 0x0, flags 0x63 } to gipcd
2014-04-22 13:33:52.839: [GIPCHDEM][2222126848]gipchaDaemonInfRequest: sent local interfaceRequest, hctx 0x2d03370 [0000000000000010] { gipchaContext : host 'grac41', name 'CLSFRAME_grac4', luid '57127705-00000000', numNode 0, numInf 0, usrFlags 0x0, flags 0x63 } to gipcd
2014-04-22 13:33:57.848: [GIPCHDEM][2222126848]gipchaDaemonInfRequest: sent local interfaceRequest, hctx 0x2d03370 [0000000000000010] { gipchaContext : host 'grac41', name 'CLSFRAME_grac4', luid '57127705-00000000', numNode 0, numInf 0, usrFlags 0x0, flags 0x63 } to gipcd
2014-04-22 13:34:03.859: [GIPCHDEM][2222126848]gipchaDaemonInfRequest: sent local interfaceRequest, hctx 0x2d03370 [0000000000000010] { gipchaContext : host 'grac41', name 'CLSFRAME_grac4', luid '57127705-00000000', numNode 0, numInf 0, usrFlags 0x0, flags 0x63 } to gipcd
2014-04-22 13:34:09.874: [GIPCHDEM][2222126848]gipchaDaemonInfRequest: sent local interfaceRequest, hctx 0x2d03370 [0000000000000010] { gipchaContext : host 'grac41', name 'CLSFRAME_grac4', luid '57127705-00000000', numNode 0, numInf 0, usrFlags 0x0, flags 0x63 } to gipcd
2014-04-22 13:34:15.881: [GIPCHDEM][2222126848]gipchaDaemonInfRequest: sent local interfaceRequest, hctx 0x2d03370 [0000000000000010] { gipchaContext : host 'grac41', name 'CLSFRAME_grac4', luid '57127705-00000000', numNode 0, numInf 0, usrFlags 0x0, flags 0x63 } to gipcd
2014-04-22 13:34:20.900: [GIPCHDEM][2222126848]gipchaDaemonInfRequest: sent local interfaceRequest, hctx 0x2d03370 [0000000000000010] { gipchaContext : host 'grac41', name 'CLSFRAME_grac4', luid '57127705-00000000', numNode 0, numInf 0, usrFlags 0x0, flags 0x63 } to gipcd
2014-04-22 13:34:25.920: [GIPCHDEM][2222126848]gipchaDaemonInfRequest: sent local interfaceRequest, hctx 0x2d03370 [0000000000000010] { gipchaContext : host 'grac41', name 'CLSFRAME_grac4', luid '57127705-00000000', numNode 0, numInf 0, usrFlags 0x0, flags 0x63 } to gipcd
2014-04-22 13:34:30.934: [GIPCHDEM][2222126848]gipchaDaemonInfRequest: sent local interfaceRequest, hctx 0x2d03370 [0000000000000010] { gipchaContext : host 'grac41', name 'CLSFRAME_grac4', luid '57127705-00000000', numNode 0, numInf 0, usrFlags 0x0, flags 0x63 } to gipcd
[grid@grac41 grac41]$ tail -f ./crfmond/crfmond.log
[ CLWAL][467654400]clsw_Initialize: OLR initlevel [70000]
2014-04-22 13:34:49.349: [ CRFM][467654400]crfm_connstr: clsu_get_private_ip failed(7).
2014-04-22 13:34:49.458: [ CRFM][467654400]crfm_connect_to: send fail(gipcret: 13)
2014-04-22 13:34:49.458: [ CRFM][467654400]crfmctx dump follows
2014-04-22 13:34:49.458: [ CRFM][467654400]****************************
2014-04-22 13:34:49.458: [ CRFM][467654400]crfm_dumpctx: connection local name: tcp://0.0.0.0:45871
2014-04-22 13:34:49.458: [ CRFM][467654400]crfm_dumpctx: connection peer name: tcp://192.168.1.101:61021
2014-04-22 13:34:49.458: [ CRFM][467654400]crfm_dumpctx: connaddr: tcp://grac41:61021
2014-04-22 13:34:49.458: [ CRFM][467654400]crfm_dumpctx: ctype: 2
2014-04-22 13:34:49.458: [ CRFM][467654400]crfm_dumpctx: mytype: 0
2014-04-22 13:34:49.458: [ CRFM][467654400]crfm_dumpctx: hostname grac41
2014-04-22 13:34:49.458: [ CRFM][467654400]crfm_dumpctx: myport:
2014-04-22 13:34:49.458: [ CRFM][467654400]crfm_dumpctx: rhostname
2014-04-22 13:34:49.458: [ CRFM][467654400]crfm_dumpctx: rport:
2014-04-22 13:34:49.458: [ CRFM][467654400]crfm_dumpctx: flags: 1
2014-04-22 13:34:49.458: [ CRFM][467654400]****************************
According to the above traces, clsu_get_private_ip failed to get the private IP for tcp://192.168.1.101
Check network status and DNS
[root@grac41 Desktop]# ifconfig
eth1      Link encap:Ethernet  HWaddr 08:00:27:89:E9:A2
          inet addr:192.168.2.101  Bcast:192.168.2.255  Mask:255.255.255.0
          inet6 addr: fe80::a00:27ff:fe89:e9a2/64 Scope:Link
          UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
          RX packets:17148 errors:0 dropped:0 overruns:0 frame:0
          TX packets:13307 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:1000
          RX bytes:22041591 (21.0 MiB)  TX bytes:1211055 (1.1 MiB)
          Interrupt:9 Base address:0xd240
eth2      Link encap:Ethernet  HWaddr 08:00:27:6B:E2:BD
          inet addr:192.168.1.101  Bcast:192.168.1.255  Mask:255.255.255.0
          inet6 addr: fe80::a00:27ff:fe6b:e2bd/64 Scope:Link
          UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
          RX packets:17517 errors:0 dropped:0 overruns:0 frame:0
          TX packets:13475 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:1000
          RX bytes:22191772 (21.1 MiB)  TX bytes:1230703 (1.1 MiB)
          Interrupt:5 Base address:0xd260
--> Check public and private interface for errors / Looks good
[root@grac41 Desktop]# nslookup grac41
Name:    grac41.example.com
Address: 192.168.1.101
[root@grac41 Desktop]# nslookup grac41int
Name:    grac41int.example.com
Address: 192.168.2.101
[root@grac41 Desktop]# nslookup 192.168.1.101
101.1.168.192.in-addr.arpa    name = grac41.example.com.
[root@grac41 Desktop]# nslookup 192.168.2.101
101.2.168.192.in-addr.arpa    name = grac41int.example.com.
--> DNS and network seem to be OK
Restart CRS
[root@grac41 Desktop]# crsctl stop crs -f
CRS-2791: Starting shutdown of Oracle High Availability Services-managed resources on 'grac41'
CRS-2673: Attempting to stop 'ora.crf' on 'grac41'
CRS-2673: Attempting to stop 'ora.ctssd' on 'grac41'
CRS-2673: Attempting to stop 'ora.evmd' on 'grac41'
...
CRS-2673: Attempting to stop 'ora.gpnpd' on 'grac41'
CRS-2677: Stop of 'ora.gpnpd' on 'grac41' succeeded
CRS-2793: Shutdown of Oracle High Availability Services-managed resources on 'grac41' has completed
CRS-4133: Oracle High Availability Services has been stopped.
Cleanup /var/tmp/.oracle
# rm /var/tmp/.oracle/*
[root@grac41 Desktop]# crsctl start crs
[root@grac41 Desktop]# crsctl check crs
CRS-4638: Oracle High Availability Services is online
CRS-4535: Cannot communicate with Cluster Ready Services
CRS-4529: Cluster Synchronization Services is online
CRS-4534: Cannot communicate with Event Manager
--> Problem persists
Check OS logfile
# cat /var/log/messages
--> Nothing related
Run ocrcheck (and ocrdump) to check whether we can access our OCR repository
[root@grac41 Desktop]# ocrcheck
Status of Oracle Cluster Registry is as follows :
         Version                  :          3
         Total space (kbytes)     :     262120
         Used space (kbytes)      :       4076
         Available space (kbytes) :     258044
         ID                       :  630679368
         Device/File Name         :       +OCR
                                    Device/File integrity check succeeded
                                    Device/File not configured
                                    Device/File not configured
                                    Device/File not configured
                                    Device/File not configured
         Cluster registry integrity check succeeded
         Logical corruption check succeeded
Query voting disk:
[grid@grac41 grac41]$ crsctl query css votedisk
##  STATE    File Universal Id                File Name                   Disk group
--  -----    -----------------                ---------                   ---------
 1. ONLINE   b0e94e5d83054fe9bf58b6b98bfacd65 (/dev/asmdisk1_udev_sdf1)   [OCR]
 2. ONLINE   88c2a08b4c8c4f85bf0109e0990388e4 (/dev/asmdisk1_udev_sdg1)   [OCR]
 3. ONLINE   1108f9a41e814fb2bfed879ff0039dd0 (/dev/asmdisk1_udev_sdh1)   [OCR]
Located 3 voting disk(s).
Debugging the GIPCD and GPnPD daemons using strace
As the GIPCD and GPnPD daemon traces get updated every 5s, let's check the gipcd process with strace:
# ps -elf | egrep 'gpnpd.bin|gipcd.bin'
# strace -t -f -p 24376 2>&1 | grep '192.168' | grep eth
[pid 24872] 09:17:28 <... ioctl resumed> 200, {{"lo", {AF_INET, inet_addr("127.0.0.1")}}, {"eth0", {AF_INET, inet_addr("10.0.2.15")}}, {"eth1", {AF_INET, inet_addr("192.168.2.101")}}, {"eth2", {AF_INET, inet_addr("192.168.1.101")}}, {"virbr0", {AF_INET, inet_addr("192.168.122.1")}}}}) = 0
[pid 24870] 09:17:28 <... ioctl resumed> , {ifr_name="eth1", ifr_addr={AF_INET, inet_addr("192.168.2.101")}}) = 0
[pid 24870] 09:17:28 <... ioctl resumed> , {ifr_name="eth1", ifr_broadaddr={AF_INET, inet_addr("192.168.2.255")}}) = 0
[pid 24872] 09:17:28 <... ioctl resumed> , {ifr_name="eth1", ifr_addr={AF_INET, inet_addr("192.168.2.101")}}) = 0
[pid 24870] 09:17:28 <... ioctl resumed> , {ifr_name="eth2", ifr_addr={AF_INET, inet_addr("192.168.1.101")}}) = 0
[pid 24872] 09:17:28 <... ioctl resumed> , {ifr_name="eth1", ifr_broadaddr={AF_INET, inet_addr("192.168.2.255")}}) = 0
[pid 24870] 09:17:28 <... ioctl resumed> , {ifr_name="eth2", ifr_broadaddr={AF_INET, inet_addr("192.168.1.255")}}) = 0
[pid 24872] 09:17:28 <... ioctl resumed> , {ifr_name="eth2", ifr_addr={AF_INET, inet_addr("192.168.1.101")}}) = 0
[pid 24872] 09:17:28 <... ioctl resumed> , {ifr_name="eth2", ifr_broadaddr={AF_INET, inet_addr("192.168.1.255")}}) = 0
..
[pid 24872] 09:17:33 <... ioctl resumed> 200, {{"lo", {AF_INET, inet_addr("127.0.0.1")}}, {"eth0", {AF_INET, inet_addr("10.0.2.15")}}, {"eth1", {AF_INET, inet_addr("192.168.2.101")}}, {"eth2", {AF_INET, inet_addr("192.168.1.101")}}, {"virbr0", {AF_INET, inet_addr("192.168.122.1")}}}}) = 0
[pid 24870] 09:17:33 <... ioctl resumed> , {ifr_name="eth1", ifr_addr={AF_INET, inet_addr("192.168.2.101")}}) = 0
[pid 24870] 09:17:33 <... ioctl resumed> , {ifr_name="eth1", ifr_broadaddr={AF_INET, inet_addr("192.168.2.255")}}) = 0
[pid 24870] 09:17:33 <... ioctl resumed> , {ifr_name="eth2", ifr_addr={AF_INET, inet_addr("192.168.1.101")}}) = 0
[pid 24870] 09:17:33 <... ioctl resumed> , {ifr_name="eth2", ifr_broadaddr={AF_INET, inet_addr("192.168.1.255")}}) = 0
[pid 24872] 09:17:33 <... ioctl resumed> , {ifr_name="eth1", ifr_addr={AF_INET, inet_addr("192.168.2.101")}}) = 0
[pid 24872] 09:17:33 <... ioctl resumed> , {ifr_name="eth1", ifr_broadaddr={AF_INET, inet_addr("192.168.2.255")}}) = 0
[pid 24872] 09:17:33 <... ioctl resumed> , {ifr_name="eth2", ifr_addr={AF_INET, inet_addr("192.168.1.101")}}) = 0
..
--> Again we don't get an OS error, but the same ioctl() calls run in a loop. It seems gipcd is not happy with the information returned by the ioctl() call and re-reads it every 5 seconds.
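The interface-to-address map that gipcd sees in those ioctl() results can be pulled out of the strace output with a small grep/sed filter. A sketch using one result line copied from the session above as sample input (in practice, pipe the live strace output through the same filter):

```shell
# One captured SIOCGIFCONF result line from the strace session above.
strace_line='[pid 24872] 09:17:28 <... ioctl resumed> 200, {{"lo", {AF_INET, inet_addr("127.0.0.1")}}, {"eth1", {AF_INET, inet_addr("192.168.2.101")}}, {"eth2", {AF_INET, inet_addr("192.168.1.101")}}, {"virbr0", {AF_INET, inet_addr("192.168.122.1")}}}}) = 0'

# Print "<interface> <IP>" pairs, one per line.
echo "$strace_line" |
  grep -o '"[a-z0-9]*", {AF_INET, inet_addr("[0-9.]*")' |
  sed 's/"\([a-z0-9]*\)", {AF_INET, inet_addr("\([0-9.]*\)")/\1 \2/'
```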
Check GPnP profile
[root@grac41 Desktop]# gpnptool get > profile.xml
Edit profile.xml and extract the adapter usage:
<gpnp:Network-Profile><gpnp:HostNetwork id="gen" HostName="*">
<gpnp:Network id="net1" IP="192.168.1.0" Adapter="eth1" Use="public"/>
<gpnp:Network id="net2" IP="192.168.2.0" Adapter="eth2" Use="cluster_interconnect"/>
Verify with ifconfig
[root@grac41 Desktop]# ifconfig | egrep 'HWaddr|inet addr'
eth1      Link encap:Ethernet  HWaddr 08:00:27:89:E9:A2
          inet addr:192.168.2.101  Bcast:192.168.2.255  Mask:255.255.255.0
eth2      Link encap:Ethernet  HWaddr 08:00:27:6B:E2:BD
          inet addr:192.168.1.101  Bcast:192.168.1.255  Mask:255.255.255.0
          inet addr:127.0.0.1  Mask:255.0.0.0
--> eth1 is using 192.168.2.101, but according to the GPnP profile it should use 192.168.1.101
    eth2 is using 192.168.1.101, but according to the GPnP profile it should use 192.168.2.101
Problem found: during manual editing of ifcfg-eth1 and ifcfg-eth2 (/etc/sysconfig/network-scripts), the HWADDR entries were filled in wrongly.
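A mix-up like this can be caught mechanically by comparing the HWADDR entries in the ifcfg files with the MAC addresses the kernel actually reports. A minimal sketch; the sample data below is the broken configuration captured above, held in shell variables (in practice, read the files from /etc/sysconfig/network-scripts and the real MACs from ifconfig or ip link):

```shell
# ifcfg contents as captured above (the broken state).
ifcfg_eth1='IPADDR=192.168.1.101
NAME=eth1
HWADDR=08:00:27:6B:E2:BD'
ifcfg_eth2='HWADDR=08:00:27:89:E9:A2
IPADDR=192.168.2.101
NAME=eth2'

# Compare the configured HWADDR against the MAC the kernel reports.
check() {  # check <ifname> <ifcfg-content> <kernel-mac>
  cfg_mac=$(echo "$2" | sed -n 's/^HWADDR=//p')
  if [ "$cfg_mac" = "$3" ]; then
    echo "$1: HWADDR matches"
  else
    echo "$1: MISMATCH ifcfg=$cfg_mac kernel=$3"
  fi
}
# Kernel MACs taken from the ifconfig output above.
check eth1 "$ifcfg_eth1" "08:00:27:89:E9:A2"   # prints a MISMATCH line
check eth2 "$ifcfg_eth2" "08:00:27:6B:E2:BD"   # prints a MISMATCH line
```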
Reconfigure/restart network and CRS
[root@grac41 network-scripts]# cat ifcfg-eth2
HWADDR=08:00:27:89:E9:A2
IPADDR=192.168.2.101
NAME=eth2
[root@grac41 network-scripts]# cat ifcfg-eth1
IPADDR=192.168.1.101
NAME=eth1
HWADDR=08:00:27:6B:E2:BD
After changing the HWADDR entries to match the ifconfig output above, the network looks good:
[root@grac41 network-scripts]# service network restart
[root@grac41 network-scripts]# ifconfig | egrep 'HWaddr|inet addr'
eth1      Link encap:Ethernet  HWaddr 08:00:27:89:E9:A2
          inet addr:192.168.1.101  Bcast:192.168.1.255  Mask:255.255.255.0
eth2      Link encap:Ethernet  HWaddr 08:00:27:6B:E2:BD
          inet addr:192.168.2.101  Bcast:192.168.2.255  Mask:255.255.255.0
Restart CRS
[root@grac41 network-scripts]# crsctl stop crs -f
[root@grac41 network-scripts]# crsctl start crs
[root@grac41 network-scripts]# crsctl check cluster -all
**************************************************************
grac41:
CRS-4537: Cluster Ready Services is online
CRS-4529: Cluster Synchronization Services is online
CRS-4533: Event Manager is online
**************************************************************
grac42:
CRS-4537: Cluster Ready Services is online
CRS-4529: Cluster Synchronization Services is online
CRS-4533: Event Manager is online
**************************************************************
grac43:
CRS-4537: Cluster Ready Services is online
CRS-4529: Cluster Synchronization Services is online
CRS-4533: Event Manager is online
**************************************************************
Lessons learned
- Verify carefully that IP addresses and network device names are consistent clusterwide
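That clusterwide check can also be scripted: capture the interface-to-subnet map on each node and compare. A minimal sketch with hypothetical sample data for two nodes (in practice the maps would come from something like `ssh <node> "ip -o -4 addr show"`):

```shell
# Interface -> address maps as they might be captured from two nodes
# (hypothetical sample data).
node1_map='eth1 192.168.1.101/24
eth2 192.168.2.101/24'
node2_map='eth1 192.168.1.102/24
eth2 192.168.2.102/24'

# Mask out the host part so only the device/subnet pairing is compared.
n1=$(echo "$node1_map" | sed 's#\.[0-9]*/#.0/#')
n2=$(echo "$node2_map" | sed 's#\.[0-9]*/#.0/#')
if [ "$n1" = "$n2" ]; then
  echo "interface/subnet mapping is clusterwide consistent"
else
  echo "MISMATCH between nodes:"
  echo "$n1"
  echo "$n2"
fi
```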
[root@gract1 Desktop]# crsi
***** Local Resources: *****
Resource NAME                  INST TARGET   STATE        SERVER   STATE_DETAILS
------------------------------ ---- -------- ------------ -------- -------------
ora.asm                         1   ONLINE   OFFLINE      -        STABLE
ora.cluster_interconnect.haip   1   ONLINE   OFFLINE      -        STABLE
ora.crf                         1   ONLINE   OFFLINE      -        STABLE
ora.crsd                        1   ONLINE   OFFLINE      -        STABLE
ora.cssd                        1   ONLINE   OFFLINE      -        STABLE
ora.cssdmonitor                 1   OFFLINE  OFFLINE      -        STABLE
ora.ctssd                       1   ONLINE   OFFLINE      -        STABLE
ora.diskmon                     1   OFFLINE  OFFLINE      -        STABLE
ora.drivers.acfs                1   ONLINE   ONLINE       gract1   STABLE
ora.evmd                        1   ONLINE   OFFLINE      gract1   STARTING
ora.gipcd                       1   ONLINE   OFFLINE      -        STABLE
ora.gpnpd                       1   ONLINE   OFFLINE      -        STABLE
ora.mdnsd                       1   ONLINE   OFFLINE      gract1   STARTING
ora.storage                     1   ONLINE   OFFLINE      -        STABLE
Related client trace
2014-08-22 10:57:07.750: [ OCRMSG][2296473152]prom_waitconnect: CONN NOT ESTABLISHED (0,29,1,2)
2014-08-22 10:57:07.750: [ OCRMSG][2296473152]GIPC error [29] msg [gipcretConnectionRefused]
2014-08-22 10:57:07.750: [ OCRMSG][2296473152]prom_connect: error while waiting for connection complete [24]
2014-08-22 10:57:07.821: [ OCRMSG][2296473152]prom_waitconnect: CONN NOT ESTABLISHED (0,29,1,2)
2014-08-22 10:57:07.821: [ OCRMSG][2296473152]GIPC error [29] msg [gipcretConnectionRefused]
2014-08-22 10:57:07.821: [ OCRMSG][2296473152]prom_connect: error while waiting for connection complete [24]
Root cause: file system full (100%) - no traces can be written
# df -k
Filesystem           1K-blocks      Used Available Use% Mounted on
/dev/mapper/vg_oel64-lv_root
                      39603624  37798864         0 100% /
tmpfs                  4194304       272   4194032   1% /dev/shm
/dev/sda1               495844    101751    368493  22% /boot
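A full root filesystem like this can be flagged with a one-line awk filter over `df` output. A minimal sketch; the sample below is the above output restated in POSIX `df -kP` form (one line per filesystem), and the 95% threshold is an arbitrary choice - in practice, pipe live `df -kP` output into the same awk:

```shell
# Captured df output in -P (POSIX) format, so each filesystem is one line.
df_sample='Filesystem 1K-blocks Used Available Use% Mounted on
/dev/mapper/vg_oel64-lv_root 39603624 37798864 0 100% /
tmpfs 4194304 272 4194032 1% /dev/shm
/dev/sda1 495844 101751 368493 22% /boot'

# Warn for any filesystem at or above 95% usage.
echo "$df_sample" | awk 'NR > 1 { sub(/%/, "", $5); if ($5 + 0 >= 95) print $6, "is", $5 "% full" }'
# prints: / is 100% full
```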
***** Cluster Resources: *****
Resource NAME                  INST TARGET   STATE        SERVER   STATE_DETAILS
------------------------------ ---- -------- ------------ -------- -------------
ora.asm                         1   ONLINE   OFFLINE      -        STABLE
ora.cluster_interconnect.haip   1   ONLINE   OFFLINE      -        STABLE
ora.crf                         1   ONLINE   OFFLINE      -        STABLE
ora.crsd                        1   ONLINE   OFFLINE      -        STABLE
ora.cssd                        1   ONLINE   OFFLINE      -        STABLE
ora.cssdmonitor                 1   ONLINE   ONLINE       gract2   STABLE
ora.ctssd                       1   ONLINE   OFFLINE      -        STABLE
ora.diskmon                     1   OFFLINE  OFFLINE      -        STABLE
ora.drivers.acfs                1   ONLINE   ONLINE       gract2   STABLE
ora.evmd                        1   ONLINE   INTERMEDIATE gract2   STABLE
ora.gipcd                       1   ONLINE   ONLINE       gract2   STABLE
ora.gpnpd                       1   ONLINE   ONLINE       gract2   STABLE
ora.mdnsd                       1   ONLINE   ONLINE       gract2   STABLE
ora.storage                     1   ONLINE   OFFLINE      -        STABLE
--> CSSD doesn't become ONLINE
Client log:
2014-08-23 11:49:21.920: [ OCRMSG][2580342528]GIPC error [29] msg [gipcretConnectionRefused]
2014-08-23 11:49:42.948: [ OCRMSG][2580342528]GIPC error [29] msg [gipcretConnectionRefused]
2014-08-23 11:50:10.978: [ OCRMSG][2580342528]GIPC error [29] msg [gipcretConnectionRefused]
2014-08-23 11:50:46.008: [ OCRMSG][2580342528]GIPC error [29] msg [gipcretConnectionRefused]
2014-08-23 11:51:28.042: [ OCRMSG][2580342528]GIPC error [29] msg [gipcretConnectionRefused]
2014-08-23 11:51:28.042: [ OCRMSG][2580342528]GIPC error [29] msg [gipcretConnectionRefused]
20665 <... connect resumed> ) = 0
20665 connect(66, {sa_family=AF_FILE, path="/var/tmp/.oracle/sOHASD_UI_SOCKET"}, 110 <unfinished ...>
20665 <... connect resumed> ) = 0
20665 connect(73, {sa_family=AF_FILE, path="/var/tmp/.oracle/sprocr_local_conn_0_PROC"}, 110 <unfinished ...>
20665 <... connect resumed> ) = -1 ECONNREFUSED (Connection refused)
ocssd.log:
2014-08-23 12:32:58.427: [ CSSD][1279260416]clssnmvDHBValidateNCopy: node 1, gract1, has a disk HB, but no network HB, DHB has rcfg 304252836, wrtcnt, 3207223, LATS 4294823390, lastSeqNo 3207220, uniqueness 1408783210, timestamp 1408789980/5988764
2014-08-23 12:32:58.427: [ CSSD][1283991296]clssnmvDHBValidateNCopy: node 1, gract1, has a disk HB, but no network HB, DHB has rcfg 304252836, wrtcnt, 3207224, LATS 4294823390, lastSeqNo 3207221, uniqueness 1408783210, timestamp 1408789980/5988864
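"has a disk HB, but no network HB" is the classic signature of a blocked interconnect (for example a firewall) and can be grepped for directly. A small sketch using one line copied from the trace above as sample input:

```shell
# One captured ocssd.log line from above.
css_line='2014-08-23 12:32:58.427: [ CSSD][1279260416]clssnmvDHBValidateNCopy: node 1, gract1, has a disk HB, but no network HB'
# In practice run the same grep against the live log:
#   grep -c 'has a disk HB, but no network HB' ocssd.log
echo "$css_line" | grep -c 'has a disk HB, but no network HB'
# prints: 1
```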
- Fix: disable the firewall
Problem: on OEL 6 the firewall was not disabled even after running chkconfig iptables off and rebooting.
To fix the problem you also need to disable libvirtd:
# chkconfig libvirtd off
# chkconfig libvirt-guests off
# chkconfig ip6tables off
# chkconfig iptables off
# chkconfig --list | egrep 'iptables|ip6tables|libvirt'
ip6tables 0:off 1:off 2:off 3:off 4:off 5:off 6:off
iptables 0:off 1:off 2:off 3:off 4:off 5:off 6:off
libvirt-guests 0:off 1:off 2:off 3:off 4:off 5:off 6:off
libvirtd 0:off 1:off 2:off 3:off 4:off 5:off 6:off
After a reboot the firewall should be disabled now
[root@grac43 ~]# service iptables status
iptables: Firewall is not running.
References
- Grid Infrastructure Redundant Interconnect and ora.cluster_interconnect.haip (Doc ID 1210883.1)
- Grid Infrastructure Installation root.sh Failed with “Failed to start CTSS” (Doc ID 1277307.1)
- Troubleshoot Grid Infrastructure Startup Issues (Doc ID 1050908.1)
- Top 5 Grid Infrastructure Startup Issues (Doc ID 1368382.1)