Cluster Logger Service (Ologgerd)
This process is one of the components of the Cluster Health Monitor. There is only one master ologgerd and one replica ologgerd per cluster. The master ologgerd receives the data from the other nodes and saves it in the repository, compressing the data before persisting it to save disk space. In an environment with multiple nodes, a replica ologgerd is also started on a node where the master ologgerd is not running. The master ologgerd keeps the replica in sync by sending it the data. The replica ologgerd takes over if the master ologgerd dies, and a new replica ologgerd is started when the replica ologgerd dies.
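In a script, the master node can be picked out of the oclumon output; a minimal sketch, assuming the "Master = &lt;node&gt;" output format of oclumon manage -get MASTER (the sample string below is a stand-in for the live call):

```shell
# Extract the master node name from oclumon's "Master = <node>" output.
# The sample string stands in for: oclumon manage -get MASTER
sample_output="Master = grac43"
master_node=$(echo "$sample_output" | awk -F' = ' '/Master/ { print $2 }')
echo "CHM master is: $master_node"   # -> CHM master is: grac43
```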
Using oclumon we are going to diagnose the problem and verify that ologgerd is up and running again.
Locate CHM log directory
Check CHM resource status and locate the master node
[grid@grac41 ~]$ $GRID_HOME/bin/crsctl status res ora.crf -init
NAME=ora.crf
TYPE=ora.crf.type
TARGET=ONLINE
STATE=ONLINE on grac41
[grid@grac41 ~]$ oclumon manage -get MASTER
Master = grac43
Log in to grac43 and locate the CHM log directory ( ologgerd process )
[root@grac43 ~]# ps -elf | grep ologgerd | grep -v grep
.... /u01/app/11204/grid/bin/ologgerd -M -d /u01/app/11204/grid/crf/db/grac43
Comparison of OSWatcher and CHM
- CHM CPU overhead for a single run is lower than OSWatcher's, as CHM doesn't use iostat/vmstat to collect data
- OSWatcher runs with user priority, whereas CHM runs with RT priority ( CHM should be able to collect data even under CPU starvation )
- OSWatcher does a better job tracing network-related stats with tools like top, traceroute, and netstat
- TFA can reduce the number of uploaded files
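The priority difference in the second bullet can be seen in ps via the scheduling class column (TS = timeshare, RR/FF = realtime). A sketch, where the sample stands in for live ps -eo pid,cls,comm output and the awk filter is a hypothetical helper:

```shell
# Print only processes running with a realtime scheduling class (RR or FF).
# The sample stands in for: ps -eo pid,cls,comm | egrep "sysmond|loggerd|OSWatcher"
ps_sample="27213 RR osysmond.bin
27227 RR ologgerd
30111 TS OSWatcher.sh"
echo "$ps_sample" | awk '$2 == "RR" || $2 == "FF" { print $1, $3 }'
```

On a live node the CHM daemons should appear in this list, while the OSWatcher shell scripts should not.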
Starting and stopping the ora.crf resource starts and stops CHM.
Check status:
$GRID_HOME/bin/crsctl status res ora.crf -init
To stop CHM (or ora.crf resource managed by ohasd)
$GRID_HOME/bin/crsctl stop res ora.crf -init
To start CHM (or ora.crf resource managed by ohasd)
$GRID_HOME/bin/crsctl start res ora.crf -init
Check status on a specific node:
$ ssh grac42 $GRID_HOME/bin/crsctl status res ora.crf -init | grep STATE
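The grep on STATE can be turned into a pass/fail check; a minimal sketch with a hypothetical helper chk_crf, where the here-doc sample stands in for live crsctl status res ora.crf -init output:

```shell
# Hypothetical helper: succeed only if crsctl output on stdin shows STATE=ONLINE.
chk_crf() {
    if grep -q '^STATE=ONLINE'; then
        echo "ora.crf is online"
    else
        echo "ora.crf is NOT online"
        return 1
    fi
}
# Sample standing in for: ssh grac42 $GRID_HOME/bin/crsctl status res ora.crf -init
chk_crf <<'EOF'
NAME=ora.crf
TYPE=ora.crf.type
TARGET=ONLINE
STATE=ONLINE on grac41
EOF
```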
Error CRS-9011 running oclumon – ologgerd daemon not started
$ oclumon dumpnodeview -n grac41 -last "00:15:00"
CRS-9011-Error dumpnodeview: Failed to initialize connection to the Cluster Logger Service
$ ps -ef | egrep "sysmond|loggerd"
root      3820     1  2 Feb20 ?        00:26:15 /u01/app/11204/grid/bin/osysmond.bin
--> Ologgerd daemon is not running
Fix:
1. Stop ora.crf as root user on all nodes
# /u01/app/11204/grid/bin/crsctl stop res ora.crf -init
CRS-2673: Attempting to stop 'ora.crf' on 'grac41'
CRS-2677: Stop of 'ora.crf' on 'grac41' succeeded
2. Comment out the "BDBSIZE" entry and save the changes ( file $GRID_HOME/crf/admin/crfgrac41.ora )
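Commenting out the entry can be scripted with sed; a sketch that works on a scratch copy rather than the real file (the file content below is illustrative, and the real $GRID_HOME/crf/admin/crfgrac41.ora should be backed up before editing):

```shell
# Comment out the BDBSIZE entry. The scratch file stands in for
# $GRID_HOME/crf/admin/crfgrac41.ora; its content here is illustrative.
cfg=/tmp/crfgrac41.ora
printf 'BDBLOC=default\nBDBSIZE=2048\n' > "$cfg"
sed -i.bak 's/^BDBSIZE/#BDBSIZE/' "$cfg"   # keeps a .bak backup
grep BDBSIZE "$cfg"                        # -> #BDBSIZE=2048
```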
3. Start the ora.crf resource on all nodes
# /u01/app/11204/grid/bin/crsctl start res ora.crf -init
CRS-2672: Attempting to start 'ora.crf' on 'grac41'
CRS-2676: Start of 'ora.crf' on 'grac41' succeeded
4. Verify that the ologgerd daemon is running
# ps -ef | egrep "sysmond|loggerd"
root     27213     1  4 11:22 ?        00:00:00 /u01/app/11204/grid/bin/osysmond.bin
root     27227     1  4 11:22 ?        00:00:00 /u01/app/11204/grid/bin/ologgerd -M -d /u01/app/11204/grid/crf/db/grac41
root     27243 20061  0 11:22 pts/7    00:00:00 egrep sysmond|loggerd
--> Ologgerd daemon is now running
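The ps check in step 4 can also be scripted; a minimal sketch with a hypothetical helper chk_ologgerd, where the here-doc sample stands in for live ps -ef output:

```shell
# Hypothetical helper: succeed only if an ologgerd process shows up in the
# ps output read from stdin (the grep -v drops any grep process itself).
chk_ologgerd() {
    if grep -v grep | grep -q '/ologgerd '; then
        echo "ologgerd is running"
    else
        echo "ologgerd is NOT running"
        return 1
    fi
}
# Sample standing in for: ps -ef | egrep "sysmond|loggerd"
chk_ologgerd <<'EOF'
root 27213 1 4 11:22 ? 00:00:00 /u01/app/11204/grid/bin/osysmond.bin
root 27227 1 4 11:22 ? 00:00:00 /u01/app/11204/grid/bin/ologgerd -M -d /u01/app/11204/grid/crf/db/grac41
EOF
```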
5. Verify that oclumon is working now
$ oclumon manage -get MASTER
Master = grac41
Done