Friday, August 12, 2016

CLUSTERWARE TROUBLESHOOTING


1. Useful "OS command" and "ClusterWare commands", "GREP commands " , 
"OS logfile location"  & "Clusterware logfile" location details
2.  Generic troubleshooting hints :  How to review ClusterWare trace files
3,  GENERIC Networking troubleshooting 
4, Debugging complete ClusterWare startup with strace
5, Debugging a single ClusterWare process with strace


1. Useful OS and Clusterware commands, grep commands, OS logfile locations and Clusterware logfile locations
1 : Clusterware logfile structure
CW Alert.log    alert<hostname>.log   ( the most important one! )
OHASD           ohasd.log
CSSD            ocssd.log
EVMD            evmd.log
CRSD            crsd.log

MDNSD           mdnsd.log
GIPCD           gipcd.log
GPNPD           gpnpd.log

Agent directories
agent/ohasd
agent/ohasd/oraagent_grid
agent/ohasd/oracssdagent_root
agent/ohasd/oracssdmonitor_root
agent/ohasd/orarootagent_root
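All of these logs live under $GRID_HOME/log/<hostname>/ in 11.2. A quick way to confirm the layout on your own node
( the paths below assume the GRID_HOME /u01/app/11204/grid and node grac41 used throughout this post ):
# GRID_HOME=/u01/app/11204/grid
# ls -l  $GRID_HOME/log/grac41/alertgrac41.log
# ls -l  $GRID_HOME/log/grac41/{ohasd,cssd,evmd,crsd,mdnsd,gipcd,gpnpd}/*.log
# ls -ld $GRID_HOME/log/grac41/agent/ohasd/*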

2 :  OS System logs

    HPUX       /var/adm/syslog/syslog.log
    AIX        /bin/errpt -a
    Linux      /var/log/messages
    Windows    Application/System logs (review/export as .TXT with the Windows Event Viewer)
    Solaris    /var/adm/messages

    Linux Sample
    # grep 'May 20' ./grac41/var/log/messages > SYSLOG
    --> Check SYSLOG for relevant errors

    A typical CW error could look like:
    # cat  /var/log/messages
    May 13 13:48:27 grac41 OHASD[22203]: OHASD exiting; Directory /u01/app/11204/grid/log/grac41/ohasd not found

3 : Useful commands for a quick check of clusterware status

It may be useful to run all the commands below just to get an idea of what is and isn't working

3.1 : OS commands ( assume we have CW startup problems on grac41 )
# ping grac41
# route -n
# /bin/netstat -in
# /sbin/ifconfig -a
# /bin/ping -s <MTU> -c 2 -I source_IP nodename
# /bin/traceroute -s source_IP -r -F  nodename-priv <MTU-28>
# /usr/bin/nslookup  grac41
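
For example, with an MTU of 1500 on the private interconnect the last two checks could look like this
( 192.168.2.101 is the local private IP used later in this post, grac42-priv is a hypothetical private
hostname of the remote node, and 1472 = 1500 - 28 bytes of IP/UDP header overhead ):
# /bin/ping -s 1500 -c 2 -I 192.168.2.101 grac42-priv
# /bin/traceroute -s 192.168.2.101 -r -F grac42-priv 1472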

3.2 : Clusterware commands to debug startup problems
Check Clusterware status
# crsctl check crs
# crsctl check css
# crsctl check evm
# crsctl stat res -t -init

If the OHASD stack is completely up and running you can check your cluster resources with
# crsctl stat res -t

3.3 : Checking OLR  to debug startup problems
# ocrcheck -local
# ocrcheck -local -config

3.4 : Checking OCR/Votedisks  to debug startup problems
$ crsctl query css votedisk

Next 2 commands will only work when startup problems are fixed
$ ocrcheck
$ ocrcheck -config

3.5 : Checking GPnP  to debug startup problems
# $GRID_HOME/bin/gpnptool get
For further debugging
# $GRID_HOME/bin/gpnptool lfind  
# $GRID_HOME/bin/gpnptool getpval -asm_spf -p=/u01/app/11204/grid/gpnp/profiles/peer/profile.xml
# $GRID_HOME/bin/gpnptool check -p=/u01/app/11204/grid/gpnp/profiles/peer/profile.xml
# $GRID_HOME/bin/gpnptool verify -p=/u01/app/11204/grid/gpnp/profiles/peer/profile.xml -w="/u01/app/11204/grid/gpnp/grac41/wallets/peer" -wu=peer

3.6 : Cluvfy commands to debug startup problems
Network problems:
$ cluvfy comp nodereach -n grac41 -verbose
Identify the interfaces used for public and private traffic and check the related networks
$ cluvfy comp nodecon -n grac41,grac42 -i eth1  -verbose    ( public Interface )
$ cluvfy comp nodecon -n grac41,grac42 -i eth2  -verbose    ( private Interface )
$ cluvfy comp nodecon  -n grac41 -verbose
Testing multicast communication for multicast group "230.0.1.0":
$ cluvfy  stage -post hwos -n grac42

Cluvfy commands to verify ASM DG and Voting disk location
Note: Run cluvfy from a working Node ( grac42 ) to get more details
[grid@grac42 ~]$ cluvfy comp vdisk -n grac41
  ERROR: PRVF-5157 : Could not verify ASM group "OCR" for Voting Disk location "/dev/asmdisk1_udev_sdh1"
  --> From the error code we know ASM disk group + Voting Disk location
$ cluvfy comp olr -verbose
$ cluvfy comp software -verbose
$ cluvfy comp ocr -n grac42,grac41
$ cluvfy comp sys -n grac41 -p crs -verbose

comp healthcheck is quite helpful to get an overview, but as OHASD is not running most of the
errors are related to the CW startup problem.
$ cluvfy comp healthcheck -collect cluster -html
$ firefox cvucheckreport_523201416347.html

4 :  Useful grep Commands
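
Note: fn.sh and fn_egrep.sh used below are small helper scripts, not part of the Oracle distribution. They simply
search all trace files below the current directory and print each matching file name followed by the matching lines.
A minimal sketch of fn_egrep.sh ( assuming it is run from the unzipped TFA/diagnostic root ) could look like this:
#!/bin/bash
# fn_egrep.sh (sketch) - egrep all trace/log files below the current directory for a pattern
# Usage: fn_egrep.sh "pattern1|pattern2"
pattern="$1"
find . -type f \( -name "*.log" -o -name "*.trc" -o -name "*.l[0-9]*" \) | sort | while read -r f
do
    if egrep -q -e "$pattern" "$f" 2>/dev/null; then
        echo "TraceFileName: $f"
        egrep -e "$pattern" "$f"
        echo "--------------------------------------------------------------------"
    fi
done
# fn.sh works the same way but takes a plain search string and prints "Search String: <arg>" first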

GPnP profile is not accessible - gpnpd needs to be fully up to serve profile
$ fn_egrep.sh "Cannot get GPnP profile|Error put-profile CALL"
TraceFileName: ./grac41/agent/ohasd/orarootagent_root/orarootagent_root.log
2014-05-20 10:26:44.532: [ default][1199552256]Cannot get GPnP profile. Error CLSGPNP_NO_DAEMON (GPNPD daemon is not running).
Cannot get GPnP profile
2014-04-21 15:27:06.838: [    GPNP][132114176]clsgpnp_profileCallUrlInt: [at clsgpnp.c:2243] Result: (13) CLSGPNP_NO_DAEMON.
Error put-profile CALL to remote "tcp://grac41:56376" disco "mdns:service:gpnp._tcp.local.://grac41:56376/agent=gpnpd,cname=grac4,host=grac41,pid=4548/gpnpd h:grac41 c:grac4"

Network socket file doesn't have appropriate ownership or permission
# fn_egrep.sh "clsclisten: Permission denied"
 [ COMMCRS][3534915328]clsclisten: Permission denied for (ADDRESS=(PROTOCOL=ipc)(KEY=grac41DBG_MDNSD))

Problems with Private Interconnect
$ fn.sh "2014-06-03" | egrep 'but no network HB|TraceFileName'
Search String:  no network HB
TraceFileName: ./cssd/ocssd.log
2014-06-02 12:51:52.564: [    CSSD][2682775296]clssnmvDHBValidateNcopy: node 3, grac43, has a disk HB, but no network HB , ..
or
$ fn_egrep.sh "failed to resolve|gipcretFail|gipcretConnectionRefused" | egrep 'TraceFile|2014-05-20 11:0'
TraceFileName: ./grac41/crsd/crsd.log and  ./grac41/evmd/evmd.log may report
2014-05-20 11:04:02.563: [GIPCXCPT][154781440] gipchaInternalResolve: failed to resolve ret gipcretKeyNotFound (36), host 'grac41', port 'ffac-854b-c525-6f9c', hctx 0x2ed3940 [0000000000000010] { gipchaContext : host 'grac41', name 'd541-9a1e-7807-8f4a', luid 'f733b93a-00000000', numNode 0, numInf 1, usrFlags 0x0, flags 0x5 }, ret gipcretKeyNotFound (36)
2014-05-20 11:04:02.563: [GIPCHGEN][154781440] gipchaResolveF [gipcmodGipcResolve : gipcmodGipc.c : 806]: EXCEPTION[ ret gipcretKeyNotFound (36) ]  failed to resolve ctx 0x2ed3940 [0000000000000010] { gipchaContext : host 'grac41', name 'd541-9a1e-7807-8f4a', luid 'f733b93a-00000000', numNode 0, numInf 1, usrFlags 0x0, flags 0x5 }, host 'grac41', port 'ffac-854b-c525-6f9c', flags 0x0

Is there a valid CI network device ?
# fn_egrep.sh "NETDATA" | egrep 'TraceFile|2014-06-03'
TraceFileName: ./gipcd/gipcd.log
2014-06-03 07:48:45.401: [ CLSINET][3977414400] Returning NETDATA: 1 interfaces <-- ok
2014-06-03 07:52:51.589: [ CLSINET][1140848384] Returning NETDATA: 0 interfaces <-- problems !

Are Voting Disks accessible?
$ fn_egrep.sh "Successful discovery"
TraceFileName: ./grac41/cssd/ocssd.log
2014-05-22 13:41:38.776: [    CSSD][1839290112]clssnmvDiskVerify: Successful discovery of 0 disks
2. Generic troubleshooting hints: How to review CW trace files
1 : Limit trace file size and count by using the TFA command:  tfactl diagcollect
Note: only a single-node collection is necessary for a CW startup problem - here node grac41 has CW startup problems
# tfactl diagcollect -node  grac41 -from "May/20/2014 06:00:00" -to "May/20/2014 15:00:00"
--> Scanning files from May/20/2014 06:00:00 to May/20/2014 15:00:00
...
Logs are collected to:
/u01/app/grid/tfa/repository/collection_Wed_May_21_09_19_10_CEST_2014_node_grac41/grac41.tfa_Wed_May_21_09_19_10_CEST_2014.zip

Extract zip file and scan for various Clusterware errors
# mkdir /u01/TFA
# cd /u01/TFA
# unzip /u01/app/grid/tfa/repository/collection_Wed_May_21_09_19_10_CEST_2014_node_grac41/grac41.tfa_Wed_May_21_09_19_10_CEST_2014.zip

Locate important files in our unzipped TFA repository
# pwd
/u01/TFA/
# find . -name "alert*"
./grac41/u01/app/11204/grid/log/grac41/alertgrac41.log
./grac41/asm/+asm/+ASM1/trace/alert_+ASM1.log
./grac41/rdbms/grac4/grac41/trace/alert_grac41.log

# find . -name "mess*"
./grac41/var/log/messages
./grac41/var/log/messages-20140504

2 : Review Clusterware alert.log  errors
#   get_ca.sh alertgrac41.log 2014-05-23
-> File searched:  alertgrac41.log
-> Start search timestamp   :  2014-05-23
->   End search timestamp   :
Begin: CNT 0 -  TS --
2014-05-23 15:29:31.297:  [mdnsd(16387)]CRS-5602:mDNS service stopping by request.
2014-05-23 15:29:34.211:  [gpnpd(16398)]CRS-2329:GPNPD on node grac41 shutdown.
2014-05-23 15:29:45.785:  [ohasd(2736)]CRS-2112:The OLR service started on node grac41.
2014-05-23 15:29:45.845:  [ohasd(2736)]CRS-1301:Oracle High Availability Service started on node grac41.
2014-05-23 15:29:45.861:  [ohasd(2736)]CRS-8017:location: /etc/oracle/lastgasp has 2 reboot advisory log files, 0 were announced and 0 errors occurred
2014-05-23 15:29:49.999:  [/u01/app/11204/grid/bin/orarootagent.bin(2798)]CRS-2302:Cannot get GPnP profile.
Error CLSGPNP_NO_DAEMON (GPNPD daemon is not running).
2014-05-23 15:29:55.075:  [ohasd(2736)]CRS-2302:Cannot get GPnP profile. Error CLSGPNP_NO_DAEMON (GPNPD daemon is not running).
2014-05-23 15:29:55.081:  [gpnpd(2934)]CRS-2328:GPNPD started on node grac41.
2014-05-23 15:29:57.576:  [cssd(3040)]CRS-1713:CSSD daemon is started in clustered mode
2014-05-23 15:29:59.331:  [ohasd(2736)]CRS-2767:Resource state recovery not attempted for 'ora.diskmon' as its target state is OFFLINE
2014-05-23 15:29:59.331:  [ohasd(2736)]CRS-2769:Unable to failover resource 'ora.diskmon'.
2014-05-23 15:30:01.905:  [cssd(3040)]CRS-1714:Unable to discover any voting files, retrying discovery in 15 seconds;
Details at (:CSSNM00070:) in /u01/app/11204/grid/log/grac41/cssd/ocssd.log
2014-05-23 15:30:16.945:  [cssd(3040)]CRS-1714:Unable to discover any voting files, retrying discovery in 15 seconds;
Details at (:CSSNM00070:) in /u01/app/11204/grid/log/grac41/cssd/ocssd.log
--> The script get_ca.sh adds a timestamp, reduces the output and only dumps errors for a certain day
In the above sample we can easily pinpoint the problem to a voting disk issue
If you can't find an obvious reason you still need to review your Clusterware alert.log
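
get_ca.sh is another small helper, not shipped with Oracle. A minimal sketch, assuming the 11.2 alert<hostname>.log
format where each timestamp line precedes its message line(s):
#!/bin/bash
# get_ca.sh (sketch) - dump Clusterware alert.log messages for a given day,
# prefixing each message with the timestamp line that precedes it in the file
# Usage: get_ca.sh <alert log> <start day YYYY-MM-DD> [<end day YYYY-MM-DD>]
file="$1"; start="$2"; end="${3:-$2}"
echo "-> File searched:  $file"
echo "-> Start search timestamp   :  $start"
echo "->   End search timestamp   :  $3"
awk -v s="$start" -v e="$end" '
  /^20[0-9][0-9]-[0-9][0-9]-[0-9][0-9] / { ts = $1 " " $2; keep = ($1 >= s && $1 <= e); next }
  keep && NF                             { print ts "  " $0 }
' "$file"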

3 : Review Clusterware logfiles for errors and resources failing or not starting up
fn.sh 2014-05-25  | egrep 'TraceFileName|CRS-|ORA-|TNS-|LFI-|KFNDG-|KFED-|KFOD-|CLSDNSSD-|CLSGN-|CLSMDNS-|CLS.-|NDFN-|EVM-|GIPC-|PRK.-|PRV.-|PRC.-|PRIF-|SCLS-|PROC-|PROCL-|PROT-|PROTL-'
TraceFileName: ./crsd/crsd.log
2014-05-25 11:29:04.968: [   CRSPE][1872742144]{1:62631:2} CRS-2672: Attempting to start 'ora.SSD.dg' on 'grac41'
2014-05-25 11:29:04.999: [   CRSPE][1872742144]{1:62631:2} CRS-2672: Attempting to start 'ora.DATA.dg' on 'grac41'
2014-05-25 11:29:05.063: [   CRSPE][1872742144]{1:62631:2} CRS-2672: Attempting to start 'ora.FRA.dg' on 'grac41
....

If a certain resource like ora.net1.network doesn't start, grep for details using the resource name
[grid@grac42 grac42]$ fn.sh "ora.net1.network" | egrep '2014-05-31 11|TraceFileName'
TraceFileName: ./agent/crsd/orarootagent_root/orarootagent_root.log
2014-05-31 11:58:27.899: [    AGFW][1829066496]{2:12808:5} Agent received the message: RESOURCE_START[ora.net1.network grac42 1] ID 4098:403
2014-05-31 11:58:27.899: [    AGFW][1829066496]{2:12808:5} Preparing START command for: ora.net1.network grac42 1
2014-05-31 11:58:27.899: [    AGFW][1829066496]{2:12808:5} ora.net1.network grac42 1 state changed from: OFFLINE to: STARTING
2014-05-31 11:58:27.917: [    AGFW][1826965248]{2:12808:5} Command: start for resource: ora.net1.network grac42 1 completed with status: SUCCESS
2014-05-31 11:58:27.919: [    AGFW][1829066496]{2:12808:5} Agent sending reply for: RESOURCE_START[ora.net1.network grac42 1] ID 4098:403
2014-05-31 11:58:27.969: [    AGFW][1829066496]{2:12808:5} ora.net1.network grac42 1 state changed from: STARTING to: UNKNOWN

4 : Review ASM alert.log

5 : Track generation of new Tints and follow these Tints
$ fn.sh 2014-05-25 | egrep 'TraceFileName|Generating new Tint'
TraceFileName: ./agent/ohasd/orarootagent_root/orarootagent_root.log
2014-05-25 13:52:07.862: [    AGFW][2550134528]{0:11:3} Generating new Tint for unplanned state change. Original Tint: {0:0:2}
2014-05-25 13:52:36.126: [    AGFW][2550134528]{0:11:6} Generating new Tint for unplanned state change. Original Tint: {0:0:2}
[grid@grac41 grac41]$ fn.sh "{0:11:3}" | more
Search String:  {0:11:3}
TraceFileName: ./alertgrac41.log
[/u01/app/11204/grid/bin/cssdmonitor(1833)]CRS-5822:Agent '/u01/app/11204/grid/bin/cssdmonitor_root' disconnected from server. Details at (:CRSAGF00117:) {0:11:3}
in /u01/app/11204/grid/log/grac41/agent/ohasd/oracssdmonitor_root/oracssdmonitor_root.log.
--------------------------------------------------------------------
TraceFileName: ./agent/ohasd/oracssdmonitor_root/oracssdmonitor_root.log
2014-05-19 14:34:03.312: [   AGENT][3481253632]{0:11:3} {0:11:3} Created alert : (:CRSAGF00117:) : 
                          Disconnected from server, Agent is shutting down.
2014-05-19 14:34:03.312: [ USRTHRD][3481253632]{0:11:3} clsncssd_exit: CSSD Agent was asked to exit with exit code 2
2014-05-19 14:34:03.312: [ USRTHRD][3481253632]{0:11:3} clsncssd_exit: No connection with CSS, exiting.

6 : Investigate GENERIC File Protection problems  for Log File Location, Ownership and Permissions
==> For further details see GENERIC File Protection troubleshooting chapter

7 : Investigate GENERIC Networking problems
==> For further details see GENERIC Networking troubleshooting chapter

8 : Check logs for a certain resource for a certain time  ( in this sample ora.gpnpd resource was used)
[root@grac41 grac41]# fn.sh "ora.gpnpd" | egrep  "TraceFileName|2014-05-22 07" | more
TraceFileName: ./agent/ohasd/oraagent_grid/oraagent_grid.log
2014-05-22 07:18:27.281: [ora.gpnpd][3696797440]{0:0:2} [check] clsdmc_respget return: status=0, ecode=0
2014-05-22 07:18:57.291: [ora.gpnpd][3698898688]{0:0:2} [check] clsdmc_respget return: status=0, ecode=0
2014-05-22 07:19:27.290: [ora.gpnpd][3698898688]{0:0:2} [check] clsdmc_respget return: status=0, ecode=0

9 : Check related traces depending on startup dependencies
The GPnP daemon has the following dependencies:  OHASD (root) starts --> OHASD oraagent (grid) starts --> GPnP (grid)
The following trace files need to be reviewed first for additional info
./grac41/u01/app/11204/grid/log/grac41/gpnpd/gpnpd.log ( see above )
./grac41/u01/app/11204/grid/log/grac41/ohasd/ohasd.log
./grac41/u01/app/11204/grid/log/grac41/agent/ohasd/oraagent_grid/oraagent_grid.log

10 : Check for trace files updated very frequently ( this helps to identify looping processes )
# date ; find . -type f -printf "%CY-%Cm-%Cd %CH:%CM:%CS  %h/%f\n" | sort -n | tail -5
Thu May 22 07:52:45 CEST 2014
2014-05-22 07:52:12.1781722420  ./grac41/alertgrac41.log
2014-05-22 07:52:44.8401175210  ./grac41/agent/ohasd/oraagent_grid/oraagent_grid.log
2014-05-22 07:52:45.2701299670  ./grac41/client/crsctl_grid.log
2014-05-22 07:52:45.2901305450  ./grac41/ohasd/ohasd.log
2014-05-22 07:52:45.3221314710  ./grac41/client/olsnodes.log
# date ; find . -type f -printf "%CY-%Cm-%Cd %CH:%CM:%CS  %h/%f\n" | sort -n | tail -5
Thu May 22 07:52:48 CEST 2014
2014-05-22 07:52:12.1781722420  ./grac41/alertgrac41.log
2014-05-22 07:52:47.3701907460  ./grac41/client/crsctl_grid.log
2014-05-22 07:52:47.3901913240  ./grac41/ohasd/ohasd.log
2014-05-22 07:52:47.4241923080  ./grac41/client/olsnodes.log
2014-05-22 07:52:48.3812200070  ./grac41/agent/ohasd/oraagent_grid/oraagent_grid.log

Check these trace files with tail -f
# tail -f ./grac41/agent/ohasd/oraagent_grid/oraagent_grid.log
[  clsdmc][1975785216]Fail to connect (ADDRESS=(PROTOCOL=ipc)(KEY=grac41DBG_GPNPD)) with status 9
2014-05-22 07:56:11.198: [ora.gpnpd][1975785216]{0:0:2} [start] Error = error 9 encountered when connecting to GPNPD
2014-05-22 07:56:12.199: [ora.gpnpd][1975785216]{0:0:2} [start] without returnbuf
2014-05-22 07:56:12.382: [ COMMCRS][1966565120]clsc_connect: (0x7fd9500cc070) no listener at (ADDRESS=(PROTOCOL=ipc)(KEY=grac41DBG_GPNPD))

[  clsdmc][1975785216]Fail to connect (ADDRESS=(PROTOCOL=ipc)(KEY=grac41DBG_GPNPD)) with status 9
2014-05-22 07:56:12.382: [ora.gpnpd][1975785216]{0:0:2} [start] Error = error 9 encountered when connecting to GPNPD
2014-05-22 07:56:13.382: [ora.gpnpd][1975785216]{0:0:2} [start] without returnbuf
2014-05-22 07:56:13.553: [ COMMCRS][1966565120]clsc_connect: (0x7fd9500cc070) no listener at (ADDRESS=(PROTOCOL=ipc)(KEY=grac41DBG_GPNPD))
--> The problem is that we can't connect to the GPNPD listener

# ps -elf | egrep "PID|d.bin|ohas" | grep -v grep
F S UID        PID  PPID  C PRI  NI ADDR SZ WCHAN  STIME TTY          TIME CMD
4 S root      1560     1  0 -40   - - 160927 futex_ 07:56 ?       00:00:00 /u01/app/11204/grid/bin/cssdmonitor
4 S root      4494     1  0  80   0 -  2846 pipe_w May21 ?        00:00:00 /bin/sh /etc/init.d/init.ohasd run
4 S root     30441     1 23  80   0 - 176982 futex_ 07:49 ?       00:02:04 /u01/app/11204/grid/bin/ohasd.bin reboot
4 S grid     30612     1  0  80   0 - 167327 futex_ 07:50 ?       00:00:05 /u01/app/11204/grid/bin/oraagent.bin
0 S grid     30623     1  0  80   0 - 74289 poll_s 07:50 ?        00:00:00 /u01/app/11204/grid/bin/mdnsd.bin
--> Indeed gpnpd is not running. Investigate this further by debugging CW with strace

11 : If still no root cause was found, try to grep all messages for that period and review the output carefully
fn.sh "2014-05-20 07:2"   | more
Search String:  2014-05-20 07:2
...
--------------------------------------------------------------------
TraceFileName: ./grac41/u01/app/11204/grid/log/grac41/agent/ohasd/oraagent_grid/oraagent_grid.l01
2014-05-20 07:23:04.338: [    AGFW][374716160]{0:0:2} Agent received the message: AGENT_HB[Engine] ID 12293:1236
....
--------------------------------------------------------------------
TraceFileName: ./grac41/u01/app/11204/grid/log/grac41/gpnpd/gpnpd.log
2014-05-20 07:23:13.385: [ default][4133218080]
2014-05-20 07:23:13.385: [ default][4133218080]gpnpd START pid=16641 Oracle Grid Plug-and-Play Daemon

3. GENERIC File Protection problems: Log File Location, Ownership and Permissions
  • Resource reports status STARTING for a long time before failing with CRS errors
  • After some time  resource becomes OFFLINE
Debug startup problems for the GPnP daemon
Case #1 : Check that the GPnP daemon can write to its trace file location and that new timestamps are written
       The following directory/files need to have proper protections:
           Trace directory :   ./log/grac41/gpnpd
           STDOUT log file :   ./log/grac41/gpnpd/gpnpdOUT.log
           Error  log file :   ./log/grac41/gpnpd/gpnpd.log
       If gpnpdOUT.log and gpnpd.log are not updated when starting the GPnP daemon you need to review your file protections
       Sample for GPnP resource:
        #  ls -ld ./grac41/u01/app/11204/grid/log/grac41/gpnpd
         drwxr-xr-x. 2 root root 4096 May 21 09:52 ./grac41/u01/app/11204/grid/log/grac41/gpnpd
        #  ls -l ./grac41/u01/app/11204/grid/log/grac41/gpnpd
         -rw-r--r--. 1 root root 420013 May 21 09:35 gpnpd.log
         -rw-r--r--. 1 root root  26567 May 21 09:31 gpnpdOUT.log

       ==> Here we can see that the trace files are owned by root, which is wrong!
       After changing the directory and trace files with chown grid:oinstall ... the traces were successfully written (see the sketch at the end of this case)
       If unsure about the protections, verify them against a cluster node where CRS is up and running
        # ls -ld /u01/app/11204/grid/log/grac41/gpnpd
         drwxr-x---. 2 grid oinstall 4096 May 20 13:53 /u01/app/11204/grid/log/grac41/gpnpd
        # ls -l /u01/app/11204/grid/log/grac41/gpnpd
         -rw-r--r--. 1 grid oinstall  122217 May 19 13:35 gpnpd_1.log
         -rw-r--r--. 1 grid oinstall 1747836 May 20 12:32 gpnpd.log
         -rw-r--r--. 1 grid oinstall   26567 May 20 12:31 gpnpdOUT.log
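
       A possible fix for the ownership problem seen in this case ( sketch - run as root, path assumes the GRID_HOME used in this post ):
        # chown -R grid:oinstall /u01/app/11204/grid/log/grac41/gpnpd
        # ls -ld /u01/app/11204/grid/log/grac41/gpnpd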

Case #2 : Check that IPC sockets have proper protection  ( this info is not available via the TFA collector )
       ( verify this with a node where CRS is up and running )
       #  ls -l /var/tmp/.oracle  |  grep -i  gpnp
        srwxrwxrwx. 1 grid oinstall 0 May 20 12:31 ora_gipc_GPNPD_grac41
        -rw-r--r--. 1 grid oinstall 0 May 20 10:11 ora_gipc_GPNPD_grac41_lock
        srwxrwxrwx. 1 grid oinstall 0 May 20 12:31 sgrac41DBG_GPNPD
     ==> Again check this against a working cluster node
         You may consider comparing all IPC socket info available in /var/tmp/.oracle - see the sketch below
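
      A quick way to do that comparison ( sketch - assumes ssh access to the healthy node grac42 ):
      # ls -l /var/tmp/.oracle              > /tmp/sockets_local.txt
      # ssh grac42 'ls -l /var/tmp/.oracle' > /tmp/sockets_grac42.txt
      # sdiff -w 200 /tmp/sockets_local.txt /tmp/sockets_grac42.txt | less
      Differences in the permission or owner columns point to the broken entries ( note that the node name is embedded in many socket names ).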

      The same sort of debugging can be used for other processes like the MDNSD daemon.
      Here the MDNSD process can't listen on its grac41DBG_MDNSD IPC socket and terminates
      Grep command:  
      $ fn_egrep.sh "clsclisten: Permission denied"
        TraceFileName: ./grac41/mdnsd/mdnsd.log
      2014-05-19 08:08:53.097: [ COMMCRS][2179102464]clsclisten: Permission denied for (ADDRESS=(PROTOCOL=ipc)(KEY=grac41DBG_MDNSD))
      2014-05-19 08:10:58.177: [ COMMCRS][3534915328]clsclisten: Permission denied for (ADDRESS=(PROTOCOL=ipc)(KEY=grac41DBG_MDNSD))
       Trace file extract:   ./grac41/mdnsd/mdnsd.log
     2014-05-19 08:08:53.087: [ default][2187654912]mdnsd START pid=11855
     2014-05-19 08:08:53.097: [ COMMCRS][2179102464]clsclisten: Permission denied for (ADDRESS=(PROTOCOL=ipc)(KEY=grac41DBG_MDNSD))
       --> Permission problem for the MDNSD resource
     2014-05-19 08:08:53.097: [  clsdmt][2181203712]Fail to listen to (ADDRESS=(PROTOCOL=ipc)(KEY=grac41DBG_MDNSD))
     2014-05-19 08:08:53.097: [  clsdmt][2181203712]Terminating process
     2014-05-19 08:08:53.097: [    MDNS][2181203712] clsdm requested mdnsd exit
     2014-05-19 08:08:53.097: [    MDNS][2181203712] mdnsd exit
     2014-05-19 08:10:58.168: [ default][3543467776]

Case #3 : Check gpnpd.log for a successful write of the related PID file
       # egrep "PID for the Process|Creating PID|Writing PID" ./grac41/u01/app/11204/grid/log/grac41/gpnpd/gpnpd.log
        2014-05-20 07:23:13.417: [  clsdmt][4121581312]PID for the Process [16641], connkey 10
        2014-05-20 07:23:13.417: [  clsdmt][4121581312]Creating PID [16641] file for home /u01/app/11204/grid host grac41 bin gpnp to /u01/app/11204/grid/gpnp/init/
        2014-05-20 07:23:13.417: [  clsdmt][4121581312]Writing PID [16641] to the file [/u01/app/11204/grid/gpnp/init/grac41.pid]

Case #4 : Check the gpnpd.log file for fatal errors like PROCL-5 and PROCL-26
       # less ./grac41/u01/app/11204/grid/log/grac41/gpnpd/gpnpd.log   
        2014-05-20 07:23:14.377: [    GPNP][4133218080]clsgpnpd_openLocalProfile: [at clsgpnpd.c:3477] Got local profile from file cache provider (LCP-FS).
        2014-05-20 07:23:14.380: [    GPNP][4133218080]clsgpnpd_openLocalProfile: [at clsgpnpd.c:3532] Got local profile from OLR cache provider (LCP-OLR).
        2014-05-20 07:23:14.385: [    GPNP][4133218080]procr_open_key_ext: OLR api procr_open_key_ext failed for key SYSTEM.GPnP.profiles.peer.pending
        2014-05-20 07:23:14.386: [    GPNP][4133218080]procr_open_key_ext: OLR current boot level : 7
        2014-05-20 07:23:14.386: [    GPNP][4133218080]procr_open_key_ext: OLR error code    : 5
        2014-05-20 07:23:14.386: [    GPNP][4133218080]procr_open_key_ext: OLR error message : PROCL-5: User does not have permission to perform a local
                                                       registry operation on this key. Authentication error [User does not have permission to perform this operation] [0]
        2014-05-20 07:23:14.386: [    GPNP][4133218080]clsgpnpco_ocr2profile: [at clsgpnpco.c:578] Result: (58) CLSGPNP_OCR_ERR. Failed to open requested OLR Profile.
        2014-05-20 07:23:14.386: [    GPNP][4133218080]clsgpnpd_lOpen: [at clsgpnpd.c:1734] Listening on ipc://GPNPD_grac41
        2014-05-20 07:23:14.386: [    GPNP][4133218080]clsgpnpd_lOpen: [at clsgpnpd.c:1743] GIPC gipcretFail (1) gipcListen listen failure on
        2014-05-20 07:23:14.386: [ default][4133218080]GPNPD failed to start listening for GPnP peers.
        2014-05-20 07:23:14.388: [    GPNP][4133218080]clsgpnpd_term: [at clsgpnpd.c:1344] STOP GPnPD terminating. Closing connections...
        2014-05-20 07:23:14.400: [ default][4133218080]clsgpnpd_term STOP terminating.

4. GENERIC Networking troubleshooting
  • The private IP address is not directly used by Clusterware
  • If the IP changes from 192.168.2.102 to 192.168.2.108, CW still comes up as the network address 192.168.2.0 does not change
  • If the IP changes from 192.168.2.102 to 192.168.3.103, CW doesn't come up as the network address changed from 192.168.2.0 to 192.168.3.0 - the network addresses for the private and public networks are stored in the GPnP profile.xml
  • Check the crfmond/crfmond.trc trace file for private network errors ( useful for CI errors )
  • If you get any GIPC error message, always think of a real network problem first
Case #1 : Nameserver not running/available

Reported error in evmd.log                :  [  OCRMSG][3360876320]GIPC error [29] msg [gipcretConnectionRefused
Reported Clusterware Error in CW alert.log:  CRS-5011:Check of resource "+ASM" failed:

Testing scenario :
Stop nameserver and restart CRS
# crsctl start crs
CRS-4123: Oracle High Availability Services has been started.

Clusterware status
[root@grac41 gpnpd]# crsctl check crs
CRS-4638: Oracle High Availability Services is online
CRS-4535: Cannot communicate with Cluster Ready Services
CRS-4529: Cluster Synchronization Services is online
CRS-4534: Cannot communicate with Event Manager
--> CSS, HAS are ONLINE - EVM and CRS are OFFLINE

# ps -elf | egrep "PID|d.bin|ohas|oraagent|orarootagent|cssdagent|cssdmonitor" | grep -v grep
F S UID        PID  PPID  C PRI  NI ADDR SZ WCHAN  STIME TTY          TIME CMD
4 S root      5396     1  0  80   0 -  2847 pipe_w 10:52 ?        00:00:00 /bin/sh /etc/init.d/init.ohasd run
4 S root      9526     1  3  80   0 - 178980 futex_ 14:47 ?       00:00:08 /u01/app/11204/grid/bin/ohasd.bin reboot
4 S grid      9705     1  0  80   0 - 174922 futex_ 14:47 ?       00:00:00 /u01/app/11204/grid/bin/oraagent.bin
0 S grid      9716     1  0  80   0 - 74289 poll_s 14:47 ?        00:00:00 /u01/app/11204/grid/bin/mdnsd.bin
0 S grid      9749     1  1  80   0 - 127375 hrtime 14:47 ?       00:00:02 /u01/app/11204/grid/bin/gpnpd.bin
0 S grid      9796     1  1  80   0 - 159711 hrtime 14:47 ?       00:00:04 /u01/app/11204/grid/bin/gipcd.bin
4 S root      9799     1  1  80   0 - 168656 futex_ 14:47 ?       00:00:03 /u01/app/11204/grid/bin/orarootagent.bin
4 S root      9812     1  3 -40   - - 160908 hrtime 14:47 ?       00:00:08 /u01/app/11204/grid/bin/osysmond.bin
4 S root      9823     1  0 -40   - - 162793 futex_ 14:47 ?       00:00:00 /u01/app/11204/grid/bin/cssdmonitor
4 S root      9842     1  0 -40   - - 162920 futex_ 14:47 ?       00:00:00 /u01/app/11204/grid/bin/cssdagent
4 S grid      9855     1  2 -40   - - 166594 futex_ 14:47 ?       00:00:04 /u01/app/11204/grid/bin/ocssd.bin
4 S root     10884     1  1  80   0 - 159388 futex_ 14:48 ?       00:00:02 /u01/app/11204/grid/bin/octssd.bin reboot
0 S grid     10904     1  0  80   0 - 75285 hrtime 14:48 ?        00:00:00 /u01/app/11204/grid/bin/evmd.bin

$ crsi
NAME                           TARGET     STATE           SERVER       STATE_DETAILS   
-------------------------      ---------- ----------      ------------ ------------------
ora.asm                        ONLINE     OFFLINE         CLEANING      
ora.cluster_interconnect.haip  ONLINE     ONLINE          grac41        
ora.crf                        ONLINE     ONLINE          grac41        
ora.crsd                       ONLINE     OFFLINE                       
ora.cssd                       ONLINE     ONLINE          grac41        
ora.cssdmonitor                ONLINE     ONLINE          grac41        
ora.ctssd                      ONLINE     ONLINE          grac41       OBSERVER
ora.diskmon                    OFFLINE    OFFLINE                       
ora.drivers.acfs               ONLINE     ONLINE          grac41        
ora.evmd                       ONLINE     INTERMEDIATE    grac41        
ora.gipcd                      ONLINE     ONLINE          grac41        
ora.gpnpd                      ONLINE     ONLINE          grac41        
ora.mdnsd                      ONLINE     ONLINE          grac41   
--> The Event Manager is in INTERMEDIATE state --> need to review the EVMD logfile first

Detailed Tracefile report
Grep Syntax:
$ fn_egrep.sh "failed to resolve|gipcretFail" | egrep 'TraceFile|2014-05-20 11:0'
Failed case:
TraceFileName: ./grac41/crsd/crsd.log and  ./grac41/evmd/evmd.log may report
2014-05-20 11:04:02.563: [GIPCXCPT][154781440] gipchaInternalResolve: failed to resolve ret gipcretKeyNotFound (36),
           host 'grac41', port 'ffac-854b-c525-6f9c', hctx 0x2ed3940 [0000000000000010] { gipchaContext :
           host 'grac41', name 'd541-9a1e-7807-8f4a', luid 'f733b93a-00000000', numNode 0, numInf 1, usrFlags 0x0, flags 0x5 }, ret gipcretKeyNotFound (36)
2014-05-20 11:04:02.563: [GIPCHGEN][154781440] gipchaResolveF [gipcmodGipcResolve : gipcmodGipc.c : 806]:
           EXCEPTION[ ret gipcretKeyNotFound (36) ]  failed to resolve ctx 0x2ed3940 [0000000000000010] { gipchaContext :
           host 'grac41', name 'd541-9a1e-7807-8f4a', luid 'f733b93a-00000000', numNode 0, numInf 1, usrFlags 0x0, flags 0x5 },
           host 'grac41', port 'ffac-854b-c525-6f9c', flags 0x0
--> Both requests are trying to use a host name.
    If this can't be resolved we very likely have a name server problem!

TraceFileName: ./grac41/ohasd/ohasd.log reports
2014-05-20 11:03:21.364: [GIPCXCPT][2905085696]gipchaInternalReadGpnp: No network info configured in GPNP,
                                               using defaults, ret gipcretFail (1)

TraceFileName: ./evmd/evmd.log
2014-05-13 15:01:00.690: [  OCRMSG][2621794080]prom_waitconnect: CONN NOT ESTABLISHED (0,29,1,2)
2014-05-13 15:01:00.690: [  OCRMSG][2621794080]GIPC error [29] msg [gipcretConnectionRefused]
2014-05-13 15:01:00.690: [  OCRMSG][2621794080]prom_connect: error while waiting for connection complete [24]
2014-05-13 15:01:00.690: [  CRSOCR][2621794080] OCR context init failure. 
                          Error: PROC-32: Cluster Ready Services on the local node

TraceFileName: ./grac41/gpnpd/gpnpd.log
2014-05-22 11:18:02.209: [  OCRCLI][1738393344]proac_con_init: Local listener using IPC. [(ADDRESS=(PROTOCOL=ipc)(KEY=procr_local_conn_0_PROC))]
2014-05-22 11:18:02.209: [  OCRMSG][1738393344]prom_waitconnect: CONN NOT ESTABLISHED (0,29,1,2)
2014-05-22 11:18:02.209: [  OCRMSG][1738393344]GIPC error [29] msg [gipcretConnectionRefused]
2014-05-22 11:18:02.209: [  OCRMSG][1738393344]prom_connect: error while waiting for connection complete [24]
2014-05-22 11:18:02.209: [  OCRCLI][1738393344]proac_con_init: Failed to connect to server [24]
2014-05-22 11:18:02.209: [  OCRCLI][1738393344]proac_con_init: Post sema. Con count [1]
                         [  OCRCLI][1738393344]ac_init:2: Could not initialize con structures. proac_con_init failed with [32]

Debug problem using cluvfy  
[grid@grac42 ~]$  cluvfy comp nodereach -n grac41
Verifying node reachability
Checking node reachability...
PRVF-6006 : Unable to reach any of the nodes
PRKN-1034 : Failed to retrieve IP address of host "grac41"
==> Confirmation that we have a Name Server problem
Verification of node reachability was unsuccessful on all the specified nodes.
[grid@grac42 ~]$  cluvfy comp nodecon  -n grac41
Verifying node connectivity
ERROR:
PRVF-6006 : Unable to reach any of the nodes
PRKN-1034 : Failed to retrieve IP address of host "grac41"
Verification cannot proceed
Verification of node connectivity was unsuccessful on all the specified nodes.

Debug the problem with OS commands like ping and nslookup
==> For further details see the GENERIC Networking troubleshooting chapter
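
A minimal OS-level confirmation ( on the failing node ) that the name server itself is the problem:
# nslookup grac41
# cat /etc/resolv.conf
# ping -c 2 $(awk '/^nameserver/ {print $2; exit}' /etc/resolv.conf)
If the nslookup fails and the configured name server does not answer the ping, the CRS errors above are just a consequence of the DNS outage.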


Case #2 : Private Interface down or wrong IP address- CSSD not starting

Reported Clusterware Error in CW alert.log:  [/u01/app/11204/grid/bin/cssdagent(16445)]
                                             CRS-5818:Aborted command 'start' for resource 'ora.cssd'.
Reported in ocssd.log                     :  [    CSSD][491194112]clssnmvDHBValidateNcopy:
                                             node 1, grac41, has a disk HB, but no network HB,
Reported in crfmond.log                   :  [    CRFM][4239771392]crfm_connect_to:
                                              Wait failed with gipcret: 16 for conaddr tcp://192.168.2.103:61020

Testing scenario :
Shutdown private interface
[root@grac42 ~]# ifconfig eth2 down
[root@grac42 ~]# ifconfig eth2
eth2      Link encap:Ethernet  HWaddr 08:00:27:DF:79:B9  
          BROADCAST MULTICAST  MTU:1500  Metric:1
          RX packets:754556 errors:0 dropped:0 overruns:0 frame:0
          TX packets:631900 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:1000
          RX bytes:378302114 (360.7 MiB)  TX bytes:221328282 (211.0 MiB)
[root@grac42 ~]# crsctl start crs
CRS-4123: Oracle High Availability Services has been started.

Clusterware status :
[root@grac42 grac42]# crsctl check crs
CRS-4638: Oracle High Availability Services is online
CRS-4535: Cannot communicate with Cluster Ready Services
CRS-4530: Communications failure contacting Cluster Synchronization Services daemon
CRS-4534: Cannot communicate with Event Manager

[grid@grac42 grac42]$ crsi
NAME                           TARGET     STATE           SERVER       STATE_DETAILS   
-------------------------      ---------- ----------      ------------ ------------------
ora.asm                        ONLINE     OFFLINE         grac41       Instance Shutdown
ora.cluster_interconnect.haip  ONLINE     OFFLINE                        
ora.crf                        ONLINE     ONLINE          grac42         
ora.crsd                       ONLINE     OFFLINE                        
ora.cssd                       ONLINE     OFFLINE         STARTING       
ora.cssdmonitor                ONLINE     ONLINE          grac42         
ora.ctssd                      ONLINE     OFFLINE                        
ora.diskmon                    OFFLINE    OFFLINE                        
ora.drivers.acfs               ONLINE     OFFLINE                        
ora.evmd                       ONLINE     OFFLINE                        
ora.gipcd                      ONLINE     ONLINE          grac42         
ora.gpnpd                      ONLINE     ONLINE          grac42         
ora.mdnsd                      ONLINE     ONLINE          grac42  
--> CSSD in mode STARTING and not progressing over time
    After some minutes CSSD goes OFFLINE

Tracefile Details:
alertgrac42.log:
[cssd(16469)]CRS-1656:The CSS daemon is terminating due to a fatal error;
             Details at (:CSSSC00012:) in .../cssd/ocssd.log
2014-05-31 14:15:41.828:
[cssd(16469)]CRS-1603:CSSD on node grac42 shutdown by user.
2014-05-31 14:15:41.828:
[/u01/app/11204/grid/bin/cssdagent(16445)]CRS-5818:Aborted command 'start' for resource 'ora.cssd'.
             Details at (:CRSAGF00113:) {0:0:2} in ../agent/ohasd/oracssdagent_root/oracssdagent_root.log.

ocssd.log:
2014-05-31 14:23:11.534: [    CSSD][491194112]clssnmvDHBValidateNcopy: node 1, grac41,
                         has a disk HB, but no network HB,
                         DHB has rcfg 296672934, wrtcnt, 25000048, LATS 9730774, lastSeqNo 25000045,
                         uniqueness 1401378465, timestamp 1401538986/54631634
2014-05-31 14:23:11.550: [    CSSD][481683200]clssnmvDHBValidateNcopy: node 1, grac41,
                         has a disk HB, but no network HB,
                         DHB has rcfg 296672934, wrtcnt, 25000050, LATS 9730794, lastSeqNo 25000047,
                         uniqueness 1401378465, timestamp 1401538986/54632024

Using grep to locate errors
CRF resource is checking CI every 5s and reports errors:
$ fn.sh 2014-06-03 | egrep 'TraceFileName|failed'
TraceFileName: ./crfmond/crfmond.log
2014-06-03 08:28:40.859: [    CRFM][4239771392]crfm_connect_to:
                         Wait failed with gipcret: 16 for conaddr tcp://192.168.2.103:61020
2014-06-03 08:28:46.065: [    CRFM][4239771392]crfm_connect_to: Wait failed with gipcret: 16 for conaddr tcp://192.168.2.103:61020
2014-06-03 08:28:51.271: [    CRFM][4239771392]crfm_connect_to: Wait failed with gipcret: 16 for conaddr tcp://192.168.2.103:61020
2014-06-03 08:37:11.264: [ CSSCLNT][4243982080]clsssInitNative: connect to (ADDRESS=(PROTOCOL=ipc)(KEY=OCSSD_LL_grac42_)) failed, rc 13
       DHB has rcfg 296672934, wrtcnt, 23890957, LATS 9730794, lastSeqNo 23890939, uniqueness 1401292202, timestamp 1401538981/89600384

Do we have any network problems ?
$ fn.sh "2014-06-03" | egrep 'but no network HB|TraceFileName'
Search String:  no network HB
TraceFileName: ./cssd/ocssd.log
2014-06-02 12:51:52.564: [    CSSD][2682775296]clssnmvDHBValidateNcopy: node 3, grac43,
                          has a disk HB, but no network HB, DHB has rcfg 297162159, wrtcnt, 24036295, LATS 167224,
                          lastSeqNo 24036293, uniqueness 1401692525, timestamp 1401706311/13179394
2014-06-02 12:51:53.569: [    CSSD][2682775296]clssnmvDHBValidateNcopy: node 1, grac41,
                          has a disk HB, but no network HB, DHB has rcfg 297162159, wrtcnt, 25145340, LATS 168224,
                          lastSeqNo 25145334, uniqueness 1401692481, timestamp 1401706313/13192074

Is there a valid CI network device ?
# fn_egrep.sh "NETDATA" | egrep 'TraceFile|2014-06-03'
TraceFileName: ./gipcd/gipcd.log
2014-06-03 07:48:40.372: [ CLSINET][3977414400] Returning NETDATA: 1 interfaces
2014-06-03 07:48:45.401: [ CLSINET][3977414400] Returning NETDATA: 1 interfaces <-- ok
2014-06-03 07:52:51.589: [ CLSINET][1140848384] Returning NETDATA: 0 interfaces <-- problem with CI
2014-06-03 07:52:51.669: [ CLSINET][1492440832] Returning NETDATA: 0 interfaces

Debug with cluvfy
[grid@grac41 ~]$   cluvfy comp nodecon -n grac41,grac42 -i eth1,eth2
Verifying node connectivity
Checking node connectivity...
Checking hosts config file...
Verification of the hosts config file successful
ERROR:
PRVG-11049 : Interface "eth2" does not exist on nodes "grac42"
Check: Node connectivity for interface "eth1"
Node connectivity passed for interface "eth1"
TCP connectivity check passed for subnet "192.168.1.0"
Check: Node connectivity for interface "eth2"
Node connectivity failed for interface "eth2"
TCP connectivity check passed for subnet "192.168.2.0"
Checking subnet mask consistency...
Subnet mask consistency check passed for subnet "10.0.2.0".
Subnet mask consistency check passed for subnet "192.168.1.0".
Subnet mask consistency check passed for subnet "192.168.2.0".
Subnet mask consistency check passed for subnet "169.254.0.0".
Subnet mask consistency check passed for subnet "192.168.122.0".
Subnet mask consistency check passed.
Node connectivity check failed
Verification of node connectivity was unsuccessful on all the specified nodes.

Debug using OS commands
[grid@grac42 NET]$ /bin/ping -s 1500 -c 2 -I 192.168.2.102 192.168.2.101
bind: Cannot assign requested address
[grid@grac42 NET]$  /bin/ping -s 1500 -c 2 -I 192.168.2.102 192.168.2.102
bind: Cannot assign requested address

Verify the GPnP profile and find out the CI device
[root@grac42 crfmond]# $GRID_HOME/bin/gpnptool get 2>/dev/null  |  xmllint --format - | egrep 'CSS-Profile|ASM-Profile|Network id'
    <gpnp:HostNetwork id="gen" HostName="*">
      <gpnp:Network id="net1" IP="192.168.1.0" Adapter="eth1" Use="public"/>
      <gpnp:Network id="net2" IP="192.168.2.0" Adapter="eth2" Use="cluster_interconnect"/>
--> eth2 is our cluster interconnect

[root@grac42 crfmond]#  ifconfig eth2
eth2      Link encap:Ethernet  HWaddr 08:00:27:DF:79:B9
          inet addr:192.168.3.102  Bcast:192.168.3.255  Mask:255.255.255.0
          inet6 addr: fe80::a00:27ff:fedf:79b9/64 Scope:Link
          UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
-> Here we have a wrong address: 192.168.3.102 should be 192.168.2.102
   Note: the CI device should have the following flags: UP BROADCAST RUNNING MULTICAST
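
A fix sketch for this test case ( run as root on grac42 ): re-assign the interconnect address recorded in the GPnP profile,
make it persistent ( e.g. in /etc/sysconfig/network-scripts/ifcfg-eth2 on this Linux setup ) and restart the stack:
# ifconfig eth2 192.168.2.102 netmask 255.255.255.0 up
# crsctl stop crs -f
# crsctl start crs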


Case #3 : Public Interface down - Public network ora.net1.network not starting
Reported in  ./agent/crsd/orarootagent_root/orarootagent_root.log
    2014-05-31 11:58:27.899: [    AGFW][1829066496]{2:12808:5} Preparing START command for: ora.net1.network grac42 1
    2014-05-31 11:58:27.969: [    AGFW][1829066496]{2:12808:5} ora.net1.network grac42 1 state changed from: STARTING to: UNKNOWN
    2014-05-31 11:58:27.969: [    AGFW][1829066496]{2:12808:5} ora.net1.network grac42 1 would be continued to monitored!
Reported Clusterware Error in CW alert.log:  no errors reported  

Testing scenario :
- Shutdown public interface
[root@grac42 evmd]# ifconfig eth1 down
[root@grac42 evmd]# ifconfig eth1
eth1      Link encap:Ethernet  HWaddr 08:00:27:63:08:07  
          BROADCAST MULTICAST  MTU:1500  Metric:1
          RX packets:2889 errors:0 dropped:0 overruns:0 frame:0
          TX packets:2458 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:1000
          RX bytes:484535 (473.1 KiB)  TX bytes:316359 (308.9 KiB)
[root@grac42 ~]# crsctl start crs
CRS-4123: Oracle High Availability Services has been started.

Clusterware status :
[grid@grac42 grac42]$ crsctl check crs
CRS-4638: Oracle High Availability Services is online
CRS-4537: Cluster Ready Services is online
CRS-4529: Cluster Synchronization Services is online
CRS-4533: Event Manager is online
--> OHASD stack is ok !
[grid@grac42 grac42]$ crsi
NAME                           TARGET     STATE           SERVER       STATE_DETAILS   
-------------------------      ---------- ----------      ------------ ------------------
ora.asm                        ONLINE     ONLINE          grac42       Started  
ora.cluster_interconnect.haip  ONLINE     ONLINE          grac42         
ora.crf                        ONLINE     ONLINE          grac42         
ora.crsd                       ONLINE     ONLINE          grac42         
ora.cssd                       ONLINE     ONLINE          grac42         
ora.cssdmonitor                ONLINE     ONLINE          grac42         
ora.ctssd                      ONLINE     ONLINE          grac42       OBSERVER  
ora.diskmon                    OFFLINE    OFFLINE                        
ora.drivers.acfs               ONLINE     ONLINE          grac42         
ora.evmd                       ONLINE     ONLINE          grac42         
ora.gipcd                      ONLINE     ONLINE          grac42         
ora.gpnpd                      ONLINE     ONLINE          grac42         
ora.mdnsd                      ONLINE     ONLINE          grac42       
--> OHASD stack is up and running

[grid@grac42 grac42]$ crs  | grep grac42
NAME                           TARGET     STATE           SERVER       STATE_DETAILS   
-------------------------      ---------- ----------      ------------ ------------------
ora.ASMLIB_DG.dg               ONLINE     ONLINE          grac42        
ora.DATA.dg                    ONLINE     ONLINE          grac42        
ora.FRA.dg                     ONLINE     ONLINE          grac42        
ora.LISTENER.lsnr              ONLINE     OFFLINE         grac42        
ora.OCR.dg                     ONLINE     ONLINE          grac42        
ora.SSD.dg                     ONLINE     ONLINE          grac42        
ora.asm                        ONLINE     ONLINE          grac42       Started
ora.gsd                        OFFLINE    OFFLINE         grac42        
ora.net1.network               ONLINE     OFFLINE         grac42        
ora.ons                        ONLINE     OFFLINE         grac42        
ora.registry.acfs              ONLINE     ONLINE          grac42        
ora.grac4.db                   ONLINE     OFFLINE         grac42       Instance Shutdown
ora.grac4.grac42.svc           ONLINE     OFFLINE         grac42        
ora.grac42.vip                 ONLINE     INTERMEDIATE    grac43       FAILED OVER
--> ora.net1.network is OFFLINE
    ora.grac42.vip is in status INTERMEDIATE - FAILED OVER to grac43

Check messages logged for resource ora.net1.network from 2014-05-31 11:00:00 to 2014-05-31 11:59:59
[grid@grac42 grac42]$ fn.sh "ora.net1.network" | egrep '2014-05-31 11|TraceFileName'
TraceFileName: ./agent/crsd/orarootagent_root/orarootagent_root.log
2014-05-31 11:58:27.899: [    AGFW][1829066496]{2:12808:5} Agent received the message: RESOURCE_START[ora.net1.network grac42 1] ID 4098:403
2014-05-31 11:58:27.899: [    AGFW][1829066496]{2:12808:5} Preparing START command for: ora.net1.network grac42 1
2014-05-31 11:58:27.899: [    AGFW][1829066496]{2:12808:5} ora.net1.network grac42 1 state changed from: OFFLINE to: STARTING
2014-05-31 11:58:27.917: [    AGFW][1826965248]{2:12808:5} Command: start for resource: ora.net1.network grac42 1 completed with status: SUCCESS
2014-05-31 11:58:27.919: [    AGFW][1829066496]{2:12808:5} Agent sending reply for: RESOURCE_START[ora.net1.network grac42 1] ID 4098:403
2014-05-31 11:58:27.969: [    AGFW][1829066496]{2:12808:5} ora.net1.network grac42 1 state changed from: STARTING to: UNKNOWN
2014-05-31 11:58:27.969: [    AGFW][1829066496]{2:12808:5} ora.net1.network grac42 1 would be continued to monitored!
2014-05-31 11:58:27.969: [    AGFW][1829066496]{2:12808:5} Started implicit monitor for [ora.net1.network grac42 1] interval=1000 delay=1000
2014-05-31 11:58:27.969: [    AGFW][1829066496]{2:12808:5} Agent sending last reply for: RESOURCE_START[ora.net1.network grac42 1] ID 4098:403
2014-05-31 11:58:27.982: [    AGFW][1829066496]{2:12808:5} Agent received the message: RESOURCE_CLEAN[ora.net1.network grac42 1] ID 4100:409
2014-05-31 11:58:27.982: [    AGFW][1829066496]{2:12808:5} Preparing CLEAN command for: ora.net1.network grac42 1
2014-05-31 11:58:27.982: [    AGFW][1829066496]{2:12808:5} ora.net1.network grac42 1 state changed from: UNKNOWN to: CLEANING
2014-05-31 11:58:27.983: [    AGFW][1826965248]{2:12808:5} Command: clean for resource: ora.net1.network grac42 1 completed with status: SUCCESS
2014-05-31 11:58:27.984: [    AGFW][1829066496]{2:12808:5} Agent sending reply for: RESOURCE_CLEAN[ora.net1.network grac42 1] ID 4100:409
2014-05-31 11:58:27.984: [    AGFW][1829066496]{2:12808:5} ora.net1.network grac42 1 state changed from: CLEANING to: OFFLINE

Debug with cluvfy
Run cluvfy on the failing node
[grid@grac42 grac42]$ cluvfy comp nodereach -n grac42
Verifying node reachability
Checking node reachability...
PRVF-6006 : Unable to reach any of the nodes
PRKC-1071 : Nodes "grac42" did not respond to ping in "3" seconds,
PRKN-1035 : Host "grac42" is unreachable
Verification of node reachability was unsuccessful on all the specified nodes.

Debug with OS Commands
[grid@grac42 NET]$ /bin/ping -s 1500 -c 2 -I 192.168.1.102 grac42
bind: Cannot assign requested address
--> Here we are failing as eth1 is not up and running

[grid@grac42 NET]$ /bin/ping -s 1500 -c 2 -I 192.168.1.102 grac41
ping: unknown host grac41
--> Here we are failing as the name server cannot be reached
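
The fix for this test case is simply to bring the public interface back up and restart the network resource
( sketch - run as root on grac42 ):
# ifconfig eth1 up
# crsctl start resource ora.net1.network -n grac42
Afterwards verify with crs | grep grac42 that ora.net1.network, the listener, ora.ons and the VIP come back ONLINE on grac42.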

5. Debugging complete CW startup with strace
  • The resource STATE remains STARTING for a long time
  • The resource process gets restarted quickly but does not successfully start at all
  • Note: strace will only help for protection or connection issues.
  • If there is a logical corruption you need to review the CW log files
Command used as user root : # strace -t -f -o /tmp/ohasd.trc crsctl start crs
CW status : GPNP daemon doesn't come up
Verify OHASD stack
NAME                           TARGET     STATE           SERVER       STATE_DETAILS
-------------------------      ---------- ----------      ------------ ------------------
ora.asm                        ONLINE     OFFLINE                      Instance Shutdown
ora.cluster_interconnect.haip  ONLINE     OFFLINE
ora.crf                        ONLINE     OFFLINE
ora.crsd                       ONLINE     OFFLINE
ora.cssd                       ONLINE     OFFLINE
ora.cssdmonitor                ONLINE     OFFLINE
ora.ctssd                      ONLINE     OFFLINE
ora.diskmon                    ONLINE     OFFLINE
ora.drivers.acfs               ONLINE     ONLINE          grac41
ora.evmd                       ONLINE     OFFLINE
ora.gipcd                      ONLINE     OFFLINE
ora.gpnpd                      ONLINE     OFFLINE         STARTING
ora.mdnsd                      ONLINE     ONLINE          grac41
--> gpnpd daemon does not progress over time : STATE shows STARTING 

Now stop CW and restart CW startup with strace -t -f  

[root@grac41 gpnpd]# strace -t -f -o /tmp/ohasd.trc crsctl start crs
[root@grac41 gpnpd]# grep -i gpnpd /tmp/ohasd.trc | more

Check whether gpnpd shell and gpnpd.bin were scheduled for running :
root@grac41 log]# grep -i execve /tmp/ohasd.trc | grep gpnp
9866  08:13:56 execve("/u01/app/11204/grid/bin/gpnpd", ["/u01/app/11204/grid/bin/gpnpd"], [/* 72 vars */] <unfinished ...>
9866  08:13:56 execve("/u01/app/11204/grid/bin/gpnpd.bin", ["/u01/app/11204/grid/bin/gpnpd.bi"...], [/* 72 vars */] <unfinished ...>
11017 08:16:01 execve("/u01/app/11204/grid/bin/gpnpd", ["/u01/app/11204/grid/bin/gpnpd"], [/* 72 vars */] <unfinished ...>
11017 08:16:01 execve("/u01/app/11204/grid/bin/gpnpd.bin", ["/u01/app/11204/grid/bin/gpnpd.bi"...], [/* 72 vars */] <unfinished ...>
--> gpnpd.bin was started twice within a few minutes - seems we have a problem here; check the return codes

Check ohasd.trc for errors like:
$ egrep 'EACCES|ENOENT|EADDRINUSE|ECONNREFUSED|EPERM' /tmp/ohasd.trc
Check ohasd.trc for certain return codes
# grep EACCES /tmp/ohasd.trc
# grep ENOENT ohasd.trc      # returns a lot of info

Linux error codes leading to a failed CW startup
EACCES :
  open("/u01/app/11204/grid/log/grac41/gpnpd/gpnpdOUT.log", O_RDWR|O_CREAT|O_APPEND, 0644) = -1 EACCES (Permission denied)
ENOENT :
  stat("/u01/app/11204/grid/log/grac41/ohasd", 0x7fff17d68f40) = -1 ENOENT (No such file or directory)
EADDRINUSE :
[pid  7391] bind(6, {sa_family=AF_FILE, path="/var/tmp/.oracle/sgrac41DBG_MDNSD"}, 110) = -1 EADDRINUSE (Address already in use)
ECONNREFUSED :
[pid  7391] connect(6, {sa_family=AF_FILE, path="/var/tmp/.oracle/sgrac41DBG_MDNSD"}, 110) = -1 ECONNREFUSED (Connection refused)
EPERM :
[pid  7391] unlink("/var/tmp/.oracle/sgrac41DBG_MDNSD") = -1 EPERM (Operation not permitted)

Check for LOGfile and PID file usage
PID file usage :
# grep "\.pid"  ohasd.trc
Successful open of mdns/init/grac41.pid through MDNSD
9848  08:13:55 stat("/u01/app/11204/grid/mdns/init/grac41.pid",  <unfinished ...>
9848  08:13:55 stat("/u01/app/11204/grid/mdns/init/grac41.pid",  <unfinished ...>
9848  08:13:55 access("/u01/app/11204/grid/mdns/init/grac41.pid", F_OK <unfinished ...>
9848  08:13:55 statfs("/u01/app/11204/grid/mdns/init/grac41.pid",  <unfinished ...>
9848  08:13:55 open("/u01/app/11204/grid/mdns/init/grac41.pid", O_WRONLY|O_CREAT|O_TRUNC, 0666) = 7
9841  08:13:56 stat("/u01/app/11204/grid/mdns/init/grac41.pid", {st_mode=S_IFREG|0644, st_size=5, ...}) = 0
9841  08:13:56 stat("/u01/app/11204/grid/mdns/init/grac41.pid", {st_mode=S_IFREG|0644, st_size=5, ...}) = 0
9841  08:13:56 access("/u01/app/11204/grid/mdns/init/grac41.pid", F_OK) = 0
9841  08:13:56 statfs("/u01/app/11204/grid/mdns/init/grac41.pid", {f_type="EXT2_SUPER_MAGIC", f_bsize=4096,
                 f_blocks=9900906, f_bfree=2564283, f_bavail=2061346, f_files=2514944, f_ffree=2079685, f_fsid={1657171729, 223082106}, f_namelen=255, f_frsize=4096}) = 0
9841  08:13:56 open("/u01/app/11204/grid/mdns/init/grac41.pid", O_RDONLY) = 27
9845  08:13:56 open("/var/tmp/.oracle/mdnsd.pid", O_WRONLY|O_CREAT|O_TRUNC, 0666 <unfinished ...>

Failed open of  gpnp/init/grac41.pid through GPNPD
9842  08:16:00 stat("/u01/app/11204/grid/gpnp/init/grac41.pid",  <unfinished ...>
9842  08:16:00 stat("/u01/app/11204/grid/gpnp/init/grac41.pid",  <unfinished ...>
9842  08:16:00 access("/u01/app/11204/grid/gpnp/init/grac41.pid", F_OK <unfinished ...>
9842  08:16:00 statfs("/u01/app/11204/grid/gpnp/init/grac41.pid",  <unfinished ...>
9842  08:16:00 open("/u01/app/11204/grid/gpnp/init/grac41.pid", O_RDONLY <unfinished ...>
9860  08:16:01 stat("/u01/app/11204/grid/gpnp/init/grac41.pid",  <unfinished ...>
9860  08:16:01 stat("/u01/app/11204/grid/gpnp/init/grac41.pid",  <unfinished ...>
9860  08:16:01 access("/u01/app/11204/grid/gpnp/init/grac41.pid", F_OK <unfinished ...>
9860  08:16:01 statfs("/u01/app/11204/grid/gpnp/init/grac41.pid",  <unfinished ...>
9860  08:16:01 open("/u01/app/11204/grid/gpnp/init/grac41.pid", O_RDONLY <unfinished ...>

Successful open of logfile log/grac41/mdnsd/mdnsd.log through MDNSD
# grep "\.log" /tmp/ohasd.trc       # this one is very helpful
9845  08:13:55 open("/u01/app/11204/grid/log/grac41/mdnsd/mdnsd.log", O_WRONLY|O_APPEND) = 4
9845  08:13:55 stat("/u01/app/11204/grid/log/grac41/mdnsd/mdnsd.log", {st_mode=S_IFREG|0644, st_size=509983, ...}) = 0
9845  08:13:55 chmod("/u01/app/11204/grid/log/grac41/mdnsd/mdnsd.log", 0644) = 0
9845  08:13:55 stat("/u01/app/11204/grid/log/grac41/mdnsd/mdnsd.log", {st_mode=S_IFREG|0644, st_size=509983, ...}) = 0
9845  08:13:55 chmod("/u01/app/11204/grid/log/grac41/mdnsd/mdnsd.log", 0644) = 0
9845  08:13:55 stat("/u01/app/11204/grid/log/grac41/mdnsd/mdnsd.log", {st_mode=S_IFREG|0644, st_size=509983, ...}) = 0
9845  08:13:55 stat("/u01/app/11204/grid/log/grac41/mdnsd/mdnsd.log", {st_mode=S_IFREG|0644, st_size=509983, ...}) = 0
9845  08:13:55 access("/u01/app/11204/grid/log/grac41/mdnsd/mdnsd.log", F_OK) = 0
9845  08:13:55 statfs("/u01/app/11204/grid/log/grac41/mdnsd/mdnsd.log", {f_type="EXT2_SUPER_MAGIC", f_bsize=4096, f_blocks=9900906,
               f_bfree=2564312, f_bavail=2061375, f_files=2514944, f_ffree=2079685, f_fsid={1657171729, 223082106}, f_namelen=255, f_frsize=4096}) = 0
9845  08:13:55 open("/u01/app/11204/grid/log/grac41/mdnsd/mdnsd.log", O_WRONLY|O_APPEND) = 4
9845  08:13:55 stat("/u01/app/11204/grid/log/grac41/mdnsd/mdnsd.log", {st_mode=S_IFREG|0644, st_size=509983, ...}) = 0
9845  08:13:55 stat("/u01/app/11204/grid/log/grac41/mdnsd/mdnsd.log",  <unfinished ...>

Failed open of logfile log/grac41/gpnpd/gpnpdOUT.log through GPNPD
9866  08:13:56 open("/u01/app/11204/grid/log/grac41/gpnpd/gpnpdOUT.log", O_RDWR|O_CREAT|O_APPEND, 0644 <unfinished ...>
       --> We need to get a file descriptor back from the open call
9842  08:15:56 stat("/u01/app/11204/grid/log/grac41/alertgrac41.log", {st_mode=S_IFREG|0664, st_size=1877580, ...}) = 0
9842  08:15:56 stat("/u01/app/11204/grid/log/grac41/alertgrac41.log", {st_mode=S_IFREG|0664, st_size=1877580, ...}) = 0

Checking IPC socket usage
Successful opening of IPC sockets through the MDNSD process
grep MDNSD /tmp/ohasd.trc
9849  08:13:56 chmod("/var/tmp/.oracle/sgrac41DBG_MDNSD", 0777) = 0
9862  08:13:56 access("/var/tmp/.oracle/sgrac41DBG_MDNSD", F_OK <unfinished ...>
9862  08:13:56 connect(28, {sa_family=AF_FILE, path="/var/tmp/.oracle/sgrac41DBG_MDNSD"}, 110 <unfinished ...>
9849  08:13:56 <... getsockname resumed> {sa_family=AF_FILE, path="/var/tmp/.oracle/sgrac41DBG_MDNSD"}, [36]) = 0
9849  08:13:56 chmod("/var/tmp/.oracle/sgrac41DBG_MDNSD", 0777 <unfinished ...>
--> connect was successful at 08:13:56 - further processing with system calls like bind() will be seen in the trace
    No further connect requests are happening for /var/tmp/.oracle/sgrac41DBG_MDNSD
Complete log including a successful connect
9862  08:13:56 connect(28, {sa_family=AF_FILE, path="/var/tmp/.oracle/sgrac41DBG_MDNSD"}, 110 <unfinished ...>
9834  08:13:56 <... times resumed> {tms_utime=3, tms_stime=4, tms_cutime=0, tms_cstime=0}) = 435715860
9862  08:13:56 <... connect resumed> )  = 0
9862  08:13:56 connect(28, {sa_family=AF_FILE, path="/var/tmp/.oracle/sgrac41DBG_MDNSD"}, 110 <unfinished ...>
9834  08:13:56 <... times resumed> {tms_utime=3, tms_stime=4, tms_cutime=0, tms_cstime=0}) = 435715860
9862  08:13:56 <... connect resumed> )  = 0
..
9849  08:13:55 bind(6, {sa_family=AF_FILE, path="/var/tmp/.oracle/sgrac41DBG_MDNSD"}, 110) = 0
..
9849  08:13:55 listen(6, 500 <unfinished ...>
9731  08:13:55 nanosleep({0, 1000000},  <unfinished ...>
9849  08:13:55 <... listen resumed> )   = 0
--> After a successful listen system call clients can connect
    To allow clients to connect we need successful bind(), listen() and connect() system calls!


Failed opening of IPC sockets through the GPNPD process
grep GPNPD /tmp/ohasd.trc
9924  08:14:37 access("/var/tmp/.oracle/sgrac41DBG_GPNPD", F_OK) = 0
9924  08:14:37 connect(30, {sa_family=AF_FILE, path="/var/tmp/.oracle/sgrac41DBG_GPNPD"}, 110)
                = -1 ECONNREFUSED (Connection refused)
9924  08:14:37 access("/var/tmp/.oracle/sgrac41DBG_GPNPD", F_OK) = 0
9924  08:14:37 connect(30, {sa_family=AF_FILE, path="/var/tmp/.oracle/sgrac41DBG_GPNPD"}, 110) =
               = -1 ECONNREFUSED (Connection refused)
--> The connect request was unsuccessful and was repeated again and again.
6. Debugging a single CW process with strace
  • The resource STATE remains STARTING for a long time
  • The resource process gets restarted quickly but does not successfully start at all
  • Note: strace will only help for protection or connection issues.
  • If there is a logical corruption you need to review the CW log files
[root@grac41 .oracle]# crsi
NAME                           TARGET     STATE           SERVER       STATE_DETAILS   
-------------------------      ---------- ----------      ------------ ------------------
ora.asm                        ONLINE     OFFLINE                      Instance Shutdown
ora.cluster_interconnect.haip  ONLINE     OFFLINE                        
ora.crf                        ONLINE     OFFLINE                        
ora.crsd                       ONLINE     OFFLINE                        
ora.cssd                       ONLINE     OFFLINE                        
ora.cssdmonitor                ONLINE     OFFLINE                        
ora.ctssd                      ONLINE     OFFLINE                        
ora.diskmon                    ONLINE     OFFLINE                        
ora.drivers.acfs               ONLINE     OFFLINE                        
ora.evmd                       ONLINE     OFFLINE                        
ora.gipcd                      ONLINE     OFFLINE                        
ora.gpnpd                      ONLINE     OFFLINE         STARTING       
ora.mdnsd                      ONLINE     ONLINE          grac41

Check whether the related process is running:
[root@grac41 .oracle]#  ps -elf | egrep "PID|d.bin|ohas|oraagent|orarootagent|cssdagent|cssdmonitor" | grep -v grep
F S UID        PID  PPID  C PRI  NI ADDR SZ WCHAN  STIME TTY          TIME CMD
4 S root      7023     1  0  80   0 -  2846 pipe_w 06:01 ?        00:00:00 /bin/sh /etc/init.d/init.ohasd run
0 S grid     17764 10515  0  80   0 -  1113 wait   12:11 pts/8    00:00:00 strace -t -f -o /tmp/mdnsd.trc /u01/app/11204/grid/bin/mdnsd
0 S grid     17767 17764  0  80   0 - 78862 poll_s 12:11 ?        00:00:00 /u01/app/11204/grid/bin/mdnsd.bin
4 S root     18501     1 23  80   0 - 176836 futex_ 12:13 ?       00:05:07 /u01/app/11204/grid/bin/ohasd.bin reboot
4 S grid     18567     1  1  80   0 - 170697 futex_ 12:13 ?       00:00:14 /u01/app/11204/grid/bin/oraagent.bin
4 S root     20306     1  0 -40   - - 160927 futex_ 12:19 ?       00:00:03 /u01/app/11204/grid/bin/cssdmonitor
4 S root     21396     1  0  80   0 - 163115 futex_ 12:23 ?       00:00:00 /u01/app/11204/grid/bin/orarootagent.bin
--> gpnpd.bin is not running - but was restarted very often
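
To confirm the restart loop, watch whether the PID keeps changing (a minimal sketch; the [g] bracket trick keeps grep from matching itself):
$ watch -n 5 'ps -ef | grep [g]pnpd.bin'
--> a new PID every few seconds means the agent restarts the daemon over and over without it ever coming up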

Starting gpnpd.bin process with strace :  $ strace -t -f -o /tmp/gpnpd.trc /u01/app/11204/grid/bin/gpnpd

Note: this is very dangerous, as you need to know which user starts this process.
If you run this process as the wrong user, the OLR, OCR, IPC socket privileges and log file locations can be corrupted!
Before tracing a single process you may run strace -t -f -o ohasd.trc crsctl start crs ( see the chapter above ),
as this command always starts the processes as the correct user, in the correct order, and pulls up the needed resources.
Run the following commands only on your test system and only as a last resort.
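
To find out which user is supposed to start the daemon, the OHASD resource profile can be queried first (a sketch; attribute names and output format may differ between versions):
# crsctl stat res ora.gpnpd -init -p | grep -i '^ACL'
--> the owner listed in the ACL attribute is the user that must be used for the manual start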

Manually start gpnpd as user grid with strace attached!
[grid@grac41 grac41]$  strace -t -f -o /tmp/gpnpd.trc /u01/app/11204/grid/bin/gpnpd
Unable to open file /u01/app/11204/grid/log/grac41/gpnpd/gpnpdOUT.log: %

[grid@grac41 grac41]$   egrep 'EACCES|ENOENT|EADDRINUSE|ECONNREFUSED|EPERM' /tmp/gpnpd.trc
25251 12:37:39 open("/u01/app/11204/grid/log/grac41/gpnpd/gpnpdOUT.log",
              O_RDWR|O_CREAT|O_APPEND, 0644) = -1 EACCES (Permission denied)
--> Fix:  # chown grid:oinstall /u01/app/11204/grid/log/grac41/gpnpd/gpnpdOUT.log
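
A quick check that the ownership fix took effect (a minimal sketch):
$ ls -l /u01/app/11204/grid/log/grac41/gpnpd/gpnpdOUT.log
--> the file should now be owned by grid:oinstall and be writable by the grid user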

Repeat the above commands as long as errors are reported
[grid@grac41 grac41]$  strace -t -f -o /tmp/gpnpd.trc /u01/app/11204/grid/bin/gpnpd
[grid@grac41 grac41]$   egrep 'EACCES|ENOENT|EADDRINUSE|ECONNREFUSED|EPERM' /tmp/gpnpd.trc
27089 12:44:35 connect(45, {sa_family=AF_FILE, path="/var/tmp/.oracle/sprocr_local_conn_0_PROC"}, 110)
               = -1 ENOENT (No such file or directory)
27089 12:44:58 connect(45, {sa_family=AF_FILE, path="/var/tmp/.oracle/sprocr_local_conn_0_PROC"}, 110)
               = -1 ENOENT (No such file or directory)
..
32486 13:03:52 connect(45, {sa_family=AF_FILE, path="/var/tmp/.oracle/sprocr_local_conn_0_PROC"}, 110)
                = -1 ENOENT (No such file or directory)
--> Connect was unsuccessful - check IPC socket protections !
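
ENOENT means the socket file itself does not exist, i.e. the process that is supposed to create it never got that far; a quick existence check (a minimal sketch using the socket path from the trace):
$ ls -l /var/tmp/.oracle/sprocr_local_conn_0_PROC
--> if the file is missing, the process that creates this socket is not up - fix that startup problem first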

Note strace will only help for protection or connection issues.
If there is a logical corruption you need to review CW log files

Details of a logical OLR corruption with PROCL-5 error :   ./gpnpd/gpnpd.log
..
[   CLWAL][3606726432]clsw_Initialize: OLR initlevel [70000]
[  OCRAPI][3606726432]a_init:10: AUTH LOC [/u01/app/11204/grid/srvm/auth]
[  OCRMSG][3606726432]prom_init: Successfully registered comp [OCRMSG] in clsd.
2014-05-20 07:57:51.118: [  OCRAPI][3606726432]a_init:11: Messaging init successful.
[  OCRCLI][3606726432]oac_init: Successfully registered comp [OCRCLI] in clsd.
2014-05-20 07:57:51.118: [  OCRCLI][3606726432]proac_con_init: Local listener using IPC. [(ADDRESS=(PROTOCOL=ipc)(KEY=procr_local_conn_0_PROL))]
2014-05-20 07:57:51.119: [  OCRCLI][3606726432]proac_con_init: Successfully connected to the server
2014-05-20 07:57:51.119: [  OCRCLI][3606726432]proac_con_init: Post sema. Con count [1]
2014-05-20 07:57:51.120: [  OCRAPI][3606726432]a_init:12: Client init successful.
2014-05-20 07:57:51.120: [  OCRAPI][3606726432]a_init:21: OCR init successful. Init Level [7]
2014-05-20 07:57:51.120: [  OCRAPI][3606726432]a_init:2: Init Level [7]
2014-05-20 07:57:51.132: [  OCRCLI][3606726432]proac_con_init: Post sema. Con count [2]
[  clsdmt][3595089664]Listening to (ADDRESS=(PROTOCOL=ipc)(KEY=grac41DBG_GPNPD))
2014-05-20 07:57:51.133: [  clsdmt][3595089664]PID for the Process [31034], connkey 10
2014-05-20 07:57:51.133: [  clsdmt][3595089664]Creating PID [31034] file for home /u01/app/11204/grid host grac41 bin gpnp to /u01/app/11204/grid/gpnp/init/
2014-05-20 07:57:51.133: [  clsdmt][3595089664]Writing PID [31034] to the file [/u01/app/11204/grid/gpnp/init/grac41.pid]
2014-05-20 07:57:52.108: [    GPNP][3606726432]clsgpnpd_validateProfile: [at clsgpnpd.c:2919] GPnPD taken cluster name 'grac4'
2014-05-20 07:57:52.108: [    GPNP][3606726432]clsgpnpd_openLocalProfile: [at clsgpnpd.c:3477] Got local profile from file cache provider (LCP-FS).
2014-05-20 07:57:52.111: [    GPNP][3606726432]clsgpnpd_openLocalProfile: [at clsgpnpd.c:3532] Got local profile from OLR cache provider (LCP-OLR).
2014-05-20 07:57:52.113: [    GPNP][3606726432]procr_open_key_ext: OLR api procr_open_key_ext failed for key SYSTEM.GPnP.profiles.peer.pending
2014-05-20 07:57:52.113: [    GPNP][3606726432]procr_open_key_ext: OLR current boot level : 7
2014-05-20 07:57:52.113: [    GPNP][3606726432]procr_open_key_ext: OLR error code    : 5
2014-05-20 07:57:52.126: [    GPNP][3606726432]procr_open_key_ext: OLR error message : PROCL-5: User does not have permission to perform a local registry operation on this key.
                                   Authentication error [User does not have permission to perform this operation] [0]
2014-05-20 07:57:52.126: [    GPNP][3606726432]clsgpnpco_ocr2profile: [at clsgpnpco.c:578] Result: (58) CLSGPNP_OCR_ERR. Failed to open requested OLR Profile.
2014-05-20 07:57:52.127: [    GPNP][3606726432]clsgpnpd_lOpen: [at clsgpnpd.c:1734] Listening on ipc://GPNPD_grac41
2014-05-20 07:57:52.127: [    GPNP][3606726432]clsgpnpd_lOpen: [at clsgpnpd.c:1743] GIPC gipcretFail (1) gipcListen listen failure on
2014-05-20 07:57:52.127: [ default][3606726432]GPNPD failed to start listening for GPnP peers.
2014-05-20 07:57:52.135: [    GPNP][3606726432]clsgpnpd_term: [at clsgpnpd.c:1344] STOP GPnPD terminating. Closing connections...
2014-05-20 07:57:52.137: [ default][3606726432]clsgpnpd_term STOP terminating.
2014-05-20 07:57:53.136: [  OCRAPI][3606726432]a_terminate:1:current ref count = 1
2014-05-20 07:57:53.136: [  OCRAPI][3606726432]a_terminate:1:current ref count = 0
--> Fatal OLR error ==> OLR is corrupted ==> GPnPD terminates.
For details on how to fix the PROCL-5 error please read the following link.