What is ASM?
In Oracle Database 10g/11g there are
two types of instances: database and ASM instances. The ASM instance, which is
generally named +ASM, is started with the INSTANCE_TYPE=ASM init.ora parameter.
This parameter, when set, signals the Oracle initialization routine to start an
ASM instance and not a standard database instance. Unlike the standard database
instance, the ASM instance contains no physical database files, such as logfiles,
controlfiles, or datafiles, and requires only a few init.ora parameters for
startup.
Upon startup, an ASM instance will
spawn all the basic background processes, plus some new ones that are specific
to the operation of ASM. The STARTUP clauses for ASM instances are similar to
those for database instances. For example, RESTRICT prevents database instances
from connecting to this ASM instance. NOMOUNT starts up an ASM instance without
mounting any disk group. The MOUNT option simply mounts all defined disk groups.
For RAC configurations, the ASM SID
is +ASMx, where x represents the instance number.
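As a quick sketch (assuming 11g, where you connect as SYSASM; in 10g you would connect as SYSDBA), the usual STARTUP clauses look like this from SQL*Plus on the ASM instance:
$ export ORACLE_SID=+ASM
$ sqlplus / as sysasm
SQL> startup nomount;              -- start the ASM instance without mounting any disk group
SQL> alter diskgroup all mount;    -- mount every disk group listed in ASM_DISKGROUPS
SQL> shutdown immediate;
SQL> startup restrict;             -- prevents database instances from connecting to this ASM instance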
What are the key benefits of ASM?
ASM provides filesystem and volume
manager capabilities built into the Oracle database kernel. With this
capability, ASM simplifies storage management tasks, such as creating/laying
out databases and disk space management. Since ASM allows disk management to be
done using familiar create/alter/drop SQL statements, DBAs do not need to learn
a new skill set or make crucial decisions on provisioning.
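For example, the whole lifecycle of a disk group is handled with familiar SQL; the disk group name and device paths below are illustrative assumptions only:
SQL> create diskgroup data normal redundancy disk '/dev/sdb1', '/dev/sdc1';
SQL> alter diskgroup data add disk '/dev/sdd1';
SQL> alter diskgroup data drop disk DATA_0002;
SQL> drop diskgroup data including contents;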
The following are some key benefits
of ASM:
·
ASM spreads I/O evenly across all
available disk drives to prevent hot spots and maximize performance.
·
ASM eliminates the need for over
provisioning and maximizes storage resource utilization facilitating database
consolidation.
·
Inherent large file support.
·
Performs automatic online
redistribution after the incremental addition or removal of storage
capacity.
·
Maintains redundant copies of data
to provide high availability, or leverages 3rd party RAID functionality.
·
Supports Oracle Database as well as
Oracle Real Application Clusters (RAC).
·
Capable of leveraging 3rd party
multipathing technologies.
·
For simplicity and easier migration
to ASM, an Oracle database can contain ASM and non-ASM files.
·
Any new files can be created as ASM
files whilst existing files can also be migrated to ASM.
·
RMAN commands enable non-ASM managed
files to be relocated to an ASM disk group.
·
Enterprise Manager Database Control
or Grid Control can be used to manage ASM disk and file activities.
Describe the ASM architecture.
Automatic Storage Management (ASM) instance: the instance that manages the diskgroup metadata.
Disk groups: logical groupings of disks; the disk group determines the file mirroring options.
ASM disks: LUNs presented to ASM.
ASM files: files that are stored in ASM disk groups are called ASM files; this includes database files.
Notes:
Many databases can connect as clients to a single ASM instance.
The ASM instance name should be +ASM only.
One diskgroup can serve many databases.
How does the database connect to the ASM instance?
The database communicates with the ASM instance using the ASMB (umbilicus) process. Once the database obtains the necessary extents from the extent map, all database I/O going forward is processed by the database processes, bypassing ASM. Thus we say ASM is not really in the I/O path. So, to the question of how to make ASM go faster: you don't have to.
What
init.ora parameters does a user need to configure for ASM instances?
The default parameter settings work
perfectly for ASM. The only parameters needed for 11g ASM:
• PROCESSES*
• ASM_DISKSTRING*
• ASM_DISKGROUPS
• INSTANCE_TYPE
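A minimal ASM pfile might therefore look like the sketch below; the discovery string and disk group names are assumptions for illustration:
instance_type=ASM
asm_diskstring='/dev/sd*'
asm_diskgroups='DATA','FRA'
processes=100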
How does the database interact with
the ASM instance and how do I make ASM go faster?
ASM is not in the I/O path so ASM
does not impede the database file access. Since the RDBMS instance is performing
raw I/O, the I/O is as fast as possible.
Do I need to define the RDBMS
FILESYSTEMIO_OPTIONS parameter when I use ASM?
No. The RDBMS does I/O directly to the raw disk devices; the FILESYSTEMIO_OPTIONS parameter is only for filesystems.
Why Oracle recommends two
diskgroups?
Oracle recommends two diskgroups to
provide a balance of manageability, utilization, and performance.
We have a 16 TB database. I’m
curious about the number of disk groups we should use; e.g. 1 large disk group,
a couple of disk groups, or otherwise?
For VLDBs you will probably end up with different storage tiers; e.g., some of our large customers have Tier1 (RAID10 FC), Tier2 (RAID5 FC), Tier3 (SATA), etc. Each one of these is mapped to a diskgroup.
We have a new app and don’t know our
access pattern, but assuming mostly sequential access, what size would be a
good AU fit?
For 11g ASM/RDBMS it is recommended
to use 4MB ASM AU for disk groups. See Metalink Note 810484.1
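As a hedged example, the AU size is set as a disk group attribute at creation time and cannot be changed later; the disk group name and device path are assumptions:
SQL> create diskgroup data external redundancy
     disk '/dev/mapper/asm_lun1'
     attribute 'au_size'='4M', 'compatible.asm'='11.2', 'compatible.rdbms'='11.2';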
Would it be better to use BIGFILE
tablespaces, or standard tablespaces for ASM?
The use of Bigfile tablespaces has
no bearing on ASM (or vice versa). In fact most database object related
decisions are transparent to ASM.
What is the best LUN size for ASM?
There is no best size! In most cases
the storage team will dictate to you based on their standardized LUN size. The
ASM administrator merely has to communicate the ASM best practices and
application characteristics to the storage folks:
• Need equally sized / performance LUNs
• Minimum of 4 LUNs
• The capacity requirement
• The workload characteristic (random r/w, sequential r/w) & any response time SLA
Using this info and their standards, the storage folks should build a nice LUN group set for you.
In 11g RAC we want to separate ASM
admins from DBAs and create different users and groups. How do we set this up?
For clarification
• Separate Oracle Home for ASM and RDBMS.
• RDBMS instance connects to ASM using OSDBA group of the ASM instance.
Thus, software owner for each RDBMS instance connecting to ASM must be
a member of ASM’s OSDBA group.
• Choose a different OSDBA group for ASM instance (asmdba) than for
RDBMS instance (dba)
• In 11g, ASM administrator has to be member of a separate SYSASM group to
separate ASM Admin and DBAs.
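A minimal OS-level sketch of this separation on Linux, assuming hypothetical owners grid (Grid Infrastructure/ASM) and oracle (RDBMS); group names other than asmdba and dba are illustrative:
# groupadd asmadmin     # OSASM group for ASM administration (SYSASM)
# groupadd asmdba       # OSDBA group of the ASM instance
# groupadd dba          # OSDBA group of the RDBMS instance
# usermod -a -G asmadmin,asmdba grid
# usermod -a -G asmdba,dba oracle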
Can my RDBMS and ASM instances run
different versions?
Yes. ASM can be at a higher version
or at a lower version than its client databases. There are two
components of compatibility:
Software compatibility
Diskgroup compatibility attributes:
compatible.asm
compatible.rdbms
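For example, the disk group attributes can be advanced (they cannot be lowered) and inspected with SQL; the disk group name is an assumption:
SQL> alter diskgroup data set attribute 'compatible.asm' = '11.2';
SQL> alter diskgroup data set attribute 'compatible.rdbms' = '11.2';
SQL> select name, value from v$asm_attribute where name like 'compatible%';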
Where do I run my database listener
from; i.e., ASM HOME or DB HOME?
It is recommended to run the
listener from the ASM HOME. This is particularly important for RAC env, since
the listener is a node-level resource. In this config, you can create additional
[user] listeners from the database homes as needed.
How do I backup my ASM instance?
Not applicable! ASM has no files to
back up, as it does not contain controlfiles, redo logs, etc.
When should I use RMAN and when
should I use ASMCMD copy?
·
RMAN is the recommended and most
complete and flexible method to backup and transport database files in
ASM.
ASMCMD copy is good for copying single files
• Supports all Oracle file types
• Can be used to instantiate a Data Guard environment
• Does not update the controlfile
• Does not create OMF files
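A rough illustration of both approaches; the datafile number, file names, and paths below are made up for the example:
RMAN> backup as copy datafile 4 format '+DATA';
RMAN> backup database format '+FRA';
ASMCMD> cp +DATA/orcl/datafile/users.259.123456789 /tmp/users01.dbf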
I’m going to do add disks to my ASM
diskgroup, how long will this rebalance take?
Rebalance time is heavily driven by three items:
1) Amount of data currently in the diskgroup
2) IO bandwidth available on the server
3) ASM_POWER_LIMIT or Rebalance Power Level
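As a sketch, you can raise the rebalance power for a one-off operation and watch the time estimate in V$ASM_OPERATION; the disk group name and power value are assumptions:
SQL> alter diskgroup data rebalance power 8;
SQL> select group_number, operation, state, power, est_minutes from v$asm_operation;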
We are migrating to a new storage
array. How do I move my ASM database from storage A to storage B?
Given that the new and old storage
are both visible to ASM, simply add the new disks to the ASM disk group and
drop the old disks. ASM rebalance will migrate data online.
Note 428681.1 covers how to move
OCR/Voting disks to the new storage array
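A hedged sketch of that online migration, assuming the new LUNs are visible under /dev/mapper and the old ASM disk names are DATA_0000 and DATA_0001; doing the add and the drop in a single statement triggers only one rebalance:
SQL> alter diskgroup data
     add disk '/dev/mapper/new_lun1', '/dev/mapper/new_lun2'
     drop disk DATA_0000, DATA_0001
     rebalance power 8;
SQL> select * from v$asm_operation;   -- wait until no rows return before detaching the old array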
Is it possible to unplug an ASM disk
group from one platform and plug into a server on another platform (for example,
from Solaris to Linux)?
No. Cross-platform disk group migration is not supported. To move datafiles between platforms of different endianness, you need to use XTTS, Data Pump, or Streams.
How does ASM work with multipathing
software?
It works great! Multipathing software
is at a layer lower than ASM, and thus is transparent.
You may need to adjust ASM_DISKSTRING to specify only the path to the multipathing pseudo devices.
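For example, with Linux device-mapper multipath (the path pattern is an assumption, and this presumes the ASM instance uses an spfile):
SQL> alter system set asm_diskstring = '/dev/mapper/*' scope=both;
SQL> select path from v$asm_disk;    -- confirm only the multipath pseudo devices are discovered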
Is ASM constantly rebalancing to
manage “hot spots”?
No! ASM provides even distribution of extents across all disks in a disk group. Since each disk has an equal number of extents, no single disk will be hotter than another. Thus the answer is no: ASM does not dynamically move hot spots, because hot spots simply do not occur in ASM configurations. Rebalance only occurs on storage configuration changes (e.g. add, drop, or resize disks).
What are the file types that ASM
support and keep in disk groups?
Control files
Flashback logs
Data Pump dump sets
Data files
DB SPFILE
Data Guard configuration
Temporary data files
RMAN backup sets
Change tracking bitmaps
Online redo logs
RMAN data file copies
OCR files
Archive logs
Transport data files
ASM SPFILE
List Key benefits of ASM?
·
Stripes files rather than logical
volumes
·
Provides redundancy on a file basis
·
Enables online disk reconfiguration
and dynamic rebalancing
·
Reduces the time significantly to
resynchronize a transient failure by tracking changes while disk is offline
·
Provides adjustable rebalancing
speed
·
Is cluster-aware
·
Supports reading from mirrored copy
instead of primary copy for extended clusters
·
Is automatically installed as part
of the Grid Infrastructure
What is ASM Striping?
ASM can use variable size data
extents to support larger files, reduce memory requirements, and improve
performance.
Each data extent resides on an
individual disk.
Data extents consist of one or more
allocation units.
The data extent size is:
·
Equal to AU for the first 20,000
extents (0–19999)
·
Equal to 4 × AU for the next 20,000
extents (20000–39999)
·
Equal to 16 × AU for extents above
40,000
ASM stripes files using extents with
a coarse method for load balancing or a fine method to reduce latency.
·
Coarse-grained striping is always
equal to the effective AU size.
·
Fine-grained striping is always
equal to 128 KB.
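A quick worked example, assuming a 1 MB AU: extents 0 to 19999 are 1 AU (1 MB) each, so the first roughly 19.5 GB of a file uses 1 MB extents; extents 20000 to 39999 are 4 AU (4 MB) each, covering roughly the next 78 GB; every extent beyond that is 16 AU (16 MB). With a 4 MB AU these figures scale up by a factor of four.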
How many ASM Diskgroups can be
created under one ASM Instance?
ASM imposes the following limits:
·
63 disk groups in a storage system
·
10,000 ASM disks in a storage system
·
Two-terabyte maximum storage for
each ASM disk (non-Exadata)
·
Four-petabyte maximum storage for
each ASM disk (Exadata)
·
40-exabyte maximum storage for each
storage system
·
1 million files for each disk group
·
ASM file size limits (database limit
is 128 TB):
1.
External redundancy maximum file
size is 140 PB.
2.
Normal redundancy maximum file size
is 42 PB.
3.
High redundancy maximum file size is
15 PB.
What is a diskgroup?
A disk group consists of multiple
disks and is the fundamental object that ASM manages. Each disk group contains
the metadata that is required for the management of space in the disk group.
The ASM instance manages the metadata about the files in a Disk Group in the
same way that a file system manages metadata about its files. However, the vast
majority of I/O operations do not pass through the ASM instance. In a moment we
will look at how file
I/O works with respect to the ASM instance.
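To see the disk groups an ASM instance manages, with their redundancy type and free space, a simple query works; the column selection is illustrative:
SQL> select name, state, type, total_mb, free_mb from v$asm_diskgroup;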
Diagram of how the database interacts with ASM when a request is made to read or open a datafile.
1A. The database issues an open of a database file.
1B. ASM sends the extent map for the file to the database instance. Starting with 11g, the RDBMS only receives the first 60 extents; the remaining extents in the extent map are paged in on demand, providing a faster open.
2A/2B. The database now reads directly from disk.
3A. An RDBMS foreground initiates a create tablespace, for example.
3B. ASM does the allocation, essentially reserving the allocation units for the file creation.
3C. Once the allocation phase is done, the extent map is sent to the RDBMS.
3D. The RDBMS initialization phase kicks in. In this phase the RDBMS initializes all the reserved AUs.
3E. If file creation is successful, the RDBMS commits the file creation.
Going forward all I/Os are done by
the RDBMS directly.
Can the disks in a diskgroup be of varied sizes? For example, one disk is 100GB and another disk is 50GB. If so, how does ASM manage the extents?
Yes, disk sizes can vary. Oracle ASM manages data efficiently and intelligently by placing extents proportional to the size of each disk in the disk group; bigger disks hold more extents than smaller ones.
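You can observe this proportional placement by comparing per-disk usage in V$ASM_DISK; a sketch, where group number 1 is an assumption:
SQL> select name, total_mb, free_mb, total_mb - free_mb as used_mb
     from v$asm_disk
     where group_number = 1;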
31) What is Intelligent Data
Placement?
32) What is ASM preferred Mirror
read? How does it useful?
33) What is ACFS?
34) What is ADVM?
What is the major difference between
10g and 11g RAC?
Well, there is not much difference
between 10g and 11gR1 RAC.
But there is a significant
difference in 11gR2.
In 10g and 11gR1 RAC, the following were managed by Oracle CRS:
o Databases
o Instances
o Applications
o Node Monitoring
o Event Services
o High Availability
From 11gR2 onwards, it is a complete HA stack managing and providing the following resources, much like other cluster software such as VCS:
·
Databases
·
Instances
·
Applications
·
Cluster Management
·
Node Management
·
Event Services
·
High Availability
·
Network Management (provides DNS/GNS/mDNSD services on behalf of other traditional services), SCAN (Single Client Access Name), and HAIP
·
Storage Management (with the help of ASM and the new ACFS filesystem)
·
Time synchronization (rather than depending on traditional NTP)
·
Removed the OS-dependent hang checker; node monitoring is handled by Clusterware's own additional monitor processes
What are Oracle Cluster Components?
Cluster Interconnect (HAIP)
Shared Storage (OCR/Voting Disk)
Clusterware software
What are Oracle RAC Components?
VIP, Node apps etc.
What are the Oracle kernel components? (i.e., how does an Oracle RAC database differ from a normal single-instance database in terms of binaries and processes?)
Basically, the Oracle kernel needs to be relinked with the RAC ON option when you convert to RAC; that is the difference, as it enables the RAC background processes such as LMON, LCK, LMD, and LMS.
To turn on RAC
# link the oracle libraries
$ cd $ORACLE_HOME/rdbms/lib
$ make -f ins_rdbms.mk rac_on
# rebuild oracle
$ cd $ORACLE_HOME/bin
$ relink oracle
Oracle RAC is composed of two or
more database instances. They are composed of Memory structures and background
processes, the same as a single-instance database. Oracle RAC instances use two services, GES (Global Enqueue Service) and GCS (Global Cache Service), that enable Cache Fusion. Oracle RAC instances are composed of the following background processes:
ACMS—Atomic Controlfile to Memory
Service (ACMS)
GTX0-j—Global Transaction Process
LMON—Global Enqueue Service Monitor
LMD—Global Enqueue Service Daemon
LMS—Global Cache Service Process
LCK0—Instance Enqueue Process
RMSn—Oracle RAC Management Processes (RMSn)
RSMN—Remote Slave Monitor
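A quick way to confirm these processes on a running RAC node (the instance name orcl1 is an assumption):
$ ps -ef | egrep 'ora_(lmon|lmd|lms|lck|acms|gtx)' | grep orcl1
SQL> select name, description from v$bgprocess where name like 'LM%';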
What is Clusterware?
Software that provides various
interfaces and services for a cluster. Typically, this includes capabilities
that:
·
Allow the cluster to be managed as a
whole
·
Protect the integrity of the cluster
·
Maintain a registry of resources
across the cluster
·
Deal with changes to the cluster
·
Provide a common view of resources
What are the background processes that exist in 11gR2 and their functionality?
crsd: The CRS daemon (crsd) manages cluster resources based on configuration information that is stored in Oracle Cluster Registry (OCR) for each resource. This includes start, stop, monitor, and failover operations. The crsd process generates events when the status of a resource changes.
cssd: Cluster Synchronization Service (CSS) manages the cluster configuration by controlling which nodes are members of the cluster and by notifying members when a node joins or leaves the cluster. If you are using certified third-party clusterware, then CSS processes interface with your clusterware to manage node membership information. CSS has three separate processes: the CSS daemon (ocssd), the CSS agent (cssdagent), and the CSS monitor (cssdmonitor). The cssdagent process monitors the cluster and provides input/output fencing. This service was formerly provided by the Oracle Process Monitor daemon (oprocd), also known as OraFenceService on Windows. A cssdagent failure results in Oracle Clusterware restarting the node.
diskmon: The Disk Monitor daemon (diskmon) monitors and performs input/output fencing for Oracle Exadata Storage Server. As Exadata storage can be added to any Oracle RAC node at any point in time, the diskmon daemon is always started when ocssd is started.
evmd: The Event Manager (EVM) is a background process that publishes Oracle Clusterware events.
mdnsd: Multicast domain name service (mDNS) allows DNS requests. The mDNS process is a background process on Linux and UNIX, and a service on Windows.
gnsd: Oracle Grid Naming Service (GNS) is a gateway between the cluster mDNS and external DNS servers. The GNS process performs name resolution within the cluster.
ons: Oracle Notification Service (ONS) is a publish-and-subscribe service for communicating Fast Application Notification (FAN) events.
oraagent: Extends clusterware to support Oracle-specific requirements and complex resources. It runs server callout scripts when FAN events occur. This process was known as RACG in Oracle Clusterware 11g Release 1 (11.1).
orarootagent: The Oracle root agent (orarootagent) is a specialized oraagent process that helps CRSD manage resources owned by root, such as the network and the Grid virtual IP address.
oclskd: The cluster kill daemon (oclskd) handles instance/node eviction requests that have been escalated to CSS.
gipcd: The Grid IPC daemon (gipcd) is a helper daemon for the communications infrastructure.
ctssd: The cluster time synchronization daemon (ctssd) manages time synchronization between nodes, rather than depending on NTP.
Under which user or owner does each process start?
Oracle High Availability Service: ohasd (owner: init, root)
Cluster Ready Services (CRS): crsd (owner: root)
Cluster Synchronization Service (CSS): ocssd, cssdmonitor, cssdagent (owner: grid owner)
Event Manager (EVM): evmd, evmlogger (owner: grid owner)
Cluster Time Synchronization Service (CTSS): octssd (owner: root)
Oracle Notification Service (ONS): ons, eons (owner: grid owner)
Oracle Agent: oraagent (owner: grid owner)
Oracle Root Agent: orarootagent (owner: root)
Grid Naming Service (GNS): gnsd (owner: root)
Grid Plug and Play (GPnP): gpnpd (owner: grid owner)
Multicast domain name service (mDNS): mdnsd (owner: grid owner)
What is startup sequence in Oracle
11g RAC? 11g RAC startup sequence?
This is about understanding the startup sequence of the Grid Infrastructure daemons and their resources in 11gR2 RAC. In 11g RAC, a.k.a. Grid Infrastructure, there are additional background daemons and agents, and the Oracle documentation is not very clear about the order in which they start. The explanation below follows the startup-sequence diagram.
OHASD Phase:
- OHASD (Oracle High Availability Services daemon) starts first.
OHASD Agent Phase:
- The OHASD agent (oraagent) starts and in turn this will start:
gipcd: Grid interprocess communication daemon, used for monitoring the cluster interconnect
mdnsd: Multicast DNS service; it resolves DNS requests on behalf of GNS
gns: The Grid Naming Service (GNS), a gateway between DNS and mdnsd, resolves DNS requests
gpnpd: Grid Plug and Play daemon. Basically a profile, similar to the OCR contents, stored in XML format in $GI_HOME/gpnp/profiles/<peer>. This is also used by OCSSD to read the ASM disk locations and start up without ASM having to be up; moreover, it provides the plug-and-play profile that can be distributed across the nodes of the cluster.
evmd/evmlogger: The EVM service is provided by the evmd daemon, which gives information about events happening in the cluster: stop node, start node, start instance, etc.
- cssdagent (cluster synchronization service agent) starts and in turn starts:
ocssd: Cluster synchronization service daemon, which manages node membership in the cluster
If cssdagent finds that ocssd is down, it will reboot the node to protect data integrity.
- cssdmonitor (cluster synchronization service monitor) replaces oprocd and provides I/O fencing.
- The OHASD orarootagent starts and in turn starts:
crsd.bin: Cluster Ready Services, which manages high availability of cluster resources, like stopping, starting, failing over, etc.
diskmon.bin: Disk monitor daemon; provides I/O fencing for Exadata storage
octssd.bin: Cluster time synchronization services daemon; provides time synchronization for the cluster, managing it itself rather than depending on the OS
CRSD Agent Phase: crsd.bin starts two more agents.
The crsd orarootagent (Oracle root agent) starts and in turn this will start:
gns: The Grid Naming Service, which performs name resolution within the cluster
gns vip: The virtual IP used by GNS
Network: Monitors the networks and provides HAIP to the cluster interconnects
Scan vip: Monitors the SCAN VIP; if it is found failed or unreachable, it is failed over to another node
Node vip: Monitors the node VIP; if it is found failed or unreachable, it is failed over to another node
The crsd oraagent (Oracle agent) starts and in turn it will start the following (the same functionality was managed in 10g and 11gR1 by the racgmain and racgimon background processes, and is now handled by the crsd oraagent itself):
ASM & disk groups: Starts and monitors the local ASM instance
ONS: FAN feature; provides notifications to interested clients
eONS: FAN feature; provides notifications to interested clients
SCAN listener: Starts and monitors the SCAN listener
Node listener: Starts and monitors the node listener
As you said, the voting and OCR disks reside in ASM disk groups, but as per the startup sequence OCSSD starts before ASM. How is that possible? How does OCSSD start if the voting disk and OCR reside in ASM disk groups?
You might wonder how CSSD, which is
required to start the clustered ASM instance, can be started if voting disks
are stored in ASM? This sounds like a chicken-and-egg problem: without access
to the voting disks there is no CSS, hence the node cannot join the cluster.
But without being part of the cluster, CSSD cannot start the ASM instance. To
solve this problem the ASM disk headers have new metadata in 11.2: you can use
kfed to read the header of an ASM disk containing a voting disk. The
kfdhdb.vfstart and kfdhdb.vfend fields tell CSS where to find the voting file.
This does not require the ASM instance to be up. Once the voting disks are
located, CSS can access them and join the cluster.
To resolve this issue, Oracle ASM reserves several blocks at a fixed location on every Oracle ASM disk used for storing the voting disk. As a result, Oracle Clusterware can access the voting disks present in ASM even if the ASM instance is down, and CSS can continue to maintain the Oracle cluster even if the ASM instance has failed. The physical location of the voting files on the ASM disks is fixed, i.e. the cluster stack does not rely on a running ASM instance to access the files. The location of the file is visible in the ASM disk header (dumping the file out of ASM with dd is quite easy):
oracle@rac1:~/ [+ASM1] kfed read /dev/sdf | grep -E 'vfstart|vfend'
kfdhdb.vfstart: 96 ; 0x0ec: 0x00000060
kfdhdb.vfend: 128 ; 0x0f0: 0x00000080
– The voting disk is not striped but put as a whole on ASM Disks
– In the event that the disk containing the voting disk fails, Oracle ASM will choose another disk on which to store this data.
How does SCAN work?
1. The client connects through the SCAN name of the cluster (remember all three IP addresses resolve round-robin to the same host name, the SCAN name; in this case our SCAN name is cluster01-scan.cluster01.example.com).
2. The request reaches the DNS server in your corporate network and resolves to one of the three addresses.
a. If GNS (Grid Naming Service) is configured, that is, a subdomain is configured in the DNS entry to resolve cluster addresses, the request will be handed over to GNS (gnsd).
3. In our case assume there is no GNS; the request is handled by the SCAN listeners, whose endpoints are configured to route to the database listeners.
4. The database listeners receive the request and process it further.
5. In the case of node addition (a 4th listener), clients need not know about or change anything in their TNS entries (the address of the 4th node/instance) as they are just using the SCAN.
6. The same applies in the case of node deletion.
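To see how the SCAN resolves and where the SCAN listeners are running, commands along these lines can be used (the SCAN name is the one from the example above):
$ nslookup cluster01-scan.cluster01.example.com    # should return all three SCAN addresses, rotated round-robin
$ srvctl config scan
$ srvctl status scan_listener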
What is GNS?
Grid Naming Service is an alternative to DNS; it acts as a subdomain in your DNS but is managed by Oracle. With GNS, the connection is routed to the cluster IPs and managed internally.
What is GPNP?
Grid Plug and Play, along with GNS, provides dynamic addition and removal of nodes in the cluster.
In previous releases, adding or
removing servers in a cluster required extensive manual preparation.
In Oracle Database 11g Release
2, GPnP allows each node to perform the following tasks dynamically:
o Negotiating appropriate network identities for itself
o Acquiring additional information from a configuration
profile
o Configuring or reconfiguring itself using profile data,
making host names and addresses resolvable on the network
For example, a domain configuration should contain:
·
–Cluster name: cluster01
·
–Network domain: example.com
·
–GPnP domain: cluster01.example.com
To add a node, simply connect the
server to the cluster and allow the cluster to configure the node.
To make it happen, Oracle uses the
profile located in $GI_HOME/gpnp/profiles/peer/profile.xml, which contains the
cluster resources, for example the ASM disk locations, etc.
This profile is read locally, or from a remote machine when a node is plugged into the cluster, so that the node can be dynamically added to the cluster.
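As a hedged example, the profile the node is actually using can be dumped with the gpnptool utility from the Grid home, or read directly from the file mentioned above:
$ $GI_HOME/bin/gpnptool get
$ cat $GI_HOME/gpnp/profiles/peer/profile.xml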
List some of the background processes that are used in ASM.
RBAL: Opens all device files as part of discovery and coordinates the rebalance activity
ARBn: One or more slave processes that do the rebalance activity
GMON: Responsible for managing disk-level activities such as drop or offline and advancing the ASM disk group compatibility
MARK: Marks ASM allocation units as stale when needed
Onnn: One or more ASM slave processes forming a pool of connections to the ASM instance for exchanging messages
PZ9n: One or more parallel slave processes used in fetching data from GV$ views on clustered ASM installations
What is node listener?
In 11gR2 the listeners run from the Grid Infrastructure software home.
·
The node listener is a process that
helps establish network connections from ASM clients to the ASM instance.
·
Runs by default from the Grid
$ORACLE_HOME/bin directory
·
Listens on port 1521 by default
·
Is the same as a database instance
listener
·
Is capable of listening for all
database instances on the same machine in addition to the ASM instance
·
Can run concurrently with separate
database listeners or be replaced by a separate database listener
·
Is named tnslsnr on the Linux
platform
What is SCAN listener?
A SCAN listener is additional to the node listeners. It listens for incoming database connection requests from clients that arrive through the SCAN IPs, and its endpoints are configured so that it routes each connection request to a particular node listener.
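For example, the SCAN listener configuration and status can be checked with srvctl; LISTENER_SCAN1 below is the default name of the first SCAN listener:
$ srvctl config scan_listener
$ srvctl status scan_listener
$ lsnrctl status LISTENER_SCAN1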
What is the difference between
CRSCTL and SRVCTL?
crsctl manages clusterware-related
operations:
·
Starting and stopping Oracle
Clusterware
·
Enabling and disabling Oracle
Clusterware daemons
·
Registering cluster resources
srvctl manages Oracle resource–related
operations:
·
Starting and stopping database
instances and services
·
Also, from 11gR2 onwards, it manages cluster resources like network, VIP, disks, etc.
How to control Oracle Clusterware?
To start or stop Oracle Clusterware
on a specific node:
# crsctl stop crs
# crsctl start crs
To enable or disable Oracle
Clusterware on a specific node:
# crsctl enable crs
# crsctl disable crs
How to check the cluster (all nodes)
status?
To check the viability of Cluster
Synchronization Services (CSS) across nodes:
$ crsctl check cluster
CRS-4537: Cluster Ready Services is
online
CRS-4529: Cluster Synchronization
Services is online
CRS-4533: Event Manager is online
How to check the cluster (one node)
status?
$ crsctl check crs
CRS-4638: Oracle High Availability Services
is online
CRS-4537: Cluster Ready Services is
online
CRS-4529: Cluster Synchronization
Services is online
CRS-4533: Event Manager is online
How to find Voting Disk location?
•To determine the location of the
voting disk:
# crsctl query css votedisk
## STATE File Universal Id File Name
Disk group
– —– —————– ———- ———-
1. ONLINE
8c2e45d734c64f8abf9f136990f3daf8 (ASMDISK01) [DATA]
2. ONLINE
99bc153df3b84fb4bf071d916089fd4a (ASMDISK02) [DATA]
3. ONLINE
0b090b6b19154fc1bf5913bc70340921 (ASMDISK03) [DATA]
Located 3 voting disk(s).
How to find Location of OCR?
·
cat /etc/oracle/ocr.loc
ocrconfig_loc=+DATA
local_only=FALSE
·
# ocrcheck (also reports on OCR integrity)
What are the types of ASM mirroring?
External redundancy: supported mirroring level: Unprotected (none); default: Unprotected (none)
Normal redundancy: supported mirroring levels: Two-way, Three-way, Unprotected (none); default: Two-way
High redundancy: supported mirroring level: Three-way; default: Three-way
How to find the cluster network
settings?
To determine the list of interfaces
available to the cluster:
$ oifcfg iflist -p -n
To determine the public and private
interfaces that have been configured:
$ oifcfg getif
eth0 192.0.2.0 global public
eth1 192.168.1.0 global cluster_interconnect
To determine the Virtual IP (VIP)
host name, VIP address, VIP subnet mask, and VIP interface name:
$ srvctl config nodeapps -a
VIP exists.:host01
VIP exists.:
/192.0.2.247/192.0.2.247/255.255.255.0/eth0
…
How to change Cluster interconnect
in RAC?
On a single node in the cluster, add
the new global interface specification:
$ oifcfg setif -global
eth2/192.0.2.0:cluster_interconnect
Verify the changes with oifcfg getif
and then stop Clusterware on all nodes by running the following command as root
on each node:
# oifcfg getif
# crsctl stop crs
Assign the network address to the
new network adapters on all nodes using ifconfig:
#ifconfig eth2 192.0.2.15
netmask 255.255.255.0 \ broadcast 192.0.2.255
Remove the former adapter/subnet
specification and restart Clusterware:
$ oifcfg delif -global
eth1/192.168.1.0
# crsctl start crs
Managing or Modifying SCAN in Oracle
RAC?
To add a SCAN VIP resource:
$ srvctl add scan -n cluster01-scan
To remove Clusterware resources from
SCAN VIPs:
$ srvctl remove scan [-f]
To add a SCAN listener resource:
$ srvctl add scan_listener
$ srvctl add scan_listener -p 1521
To remove Clusterware resources from
all SCAN listeners:
$ srvctl remove scan_listener [-f]
How to check the node connectivity
in Oracle Grid Infrastructure?
$ cluvfy comp nodecon -n all -verbose
Can I stop all nodes in one command? Meaning, stopping the whole cluster?
In 10g it is not possible, whereas in 11g it is possible:
[root@pic1]# crsctl start cluster -all
[root@pic2]# crsctl stop cluster -all
What is OLR? Which of the following
statements regarding the Oracle Local Registry (OLR) is true?
1.Each cluster node has a local
registry for node-specific resources.
2.The OLR should be manually created
after installing Grid Infrastructure on each node in the cluster.
3.One of its functions is to
facilitate Clusterware startup in situations where the ASM stores the OCR and
voting disks.
4.You can check the status of the
OLR using ocrcheck.
What is the runfixup.sh script in an Oracle Clusterware 11g Release 2 installation?
With Oracle Clusterware 11g release
2, Oracle Universal Installer (OUI) detects when the minimum requirements for
an installation are not met, and creates shell scripts, called fixup scripts,
to finish incomplete system configuration steps. If OUI detects an incomplete
task, then it generates fixup scripts (runfixup.sh). You can run the fixup
script after you click the Fix and Check Again Button.
The Fixup script does the following:
If necessary sets kernel parameters
to values required for successful installation, including:
·
Shared memory parameters.
·
Open file descriptor and UDP
send/receive parameters.
·
Sets permissions on the Oracle Inventory (central inventory) directory.
·
Reconfigures primary and secondary group memberships for the installation owner, if necessary, for the Oracle Inventory directory and the operating system privileges groups.
·
Sets shell limits if necessary to
required values
Update 12-May-2013, Some practical questions added here
1. Viewing Contents in OCR/Voting disks
There are three possible ways to view the OCR contents: a. ocrdump, b. crs_stat -p, or c. by using strings on the OCR file. The voting disk contents are not persistent and are normally not viewed, because the voting disk contents will be overwritten; if you still need to view them, strings can be used.
2. Server pools – Read in my blog
3. Verifying Cluster Interconnect
Cluster interconnects can be verified by: i. oifcfg getif; ii. the AWR report; iii. show parameter cluster_interconnect; iv. srvctl config network
4. Is the SCAN IP required, or can we disable it?
The SCAN IP can be disabled if it is not required; however, the SCAN IP is mandatory during RAC installation. Enabling/disabling the SCAN IP is mostly done in Oracle Apps environments by the concurrent manager (a kind of job scheduler in Oracle Apps). To disable the SCAN IP:
i. Do not use the SCAN IP at the client end.
ii. Stop the SCAN listener: srvctl stop scan_listener
iii. Stop SCAN: srvctl stop scan (this will stop the SCAN VIPs)
iv. Disable SCAN and the SCAN listener: srvctl disable scan
5. Migrating to new diskgroup scenarios
a. Case 1: Migrating a disk group from one storage array to another with the same name
1. Consider the disk group is DATA.
2. Create new disks in DATA pointing towards the new storage (EMC).
a) Partitioning/provisioning is done by the storage team, and they give you the device name or mapper name, like /dev/mapper/asakljdlas
3. Add the new disk to diskgroup DATA:
alter diskgroup data add disk '/dev/mapper/asakljdlas';
4. Drop the old disks from DATA, with which rebalancing is done automatically. If you want, you can speed up the rebalance with alter system set asm_power_limit=12 for full throttle.
alter diskgroup data drop disk 'path to hitachi storage';
Note: you can get the device name from the path column of v$asm_disk.
5. Request the SAN team to detach the old storage (HITACHI).
b. Case 2: Migrating a disk group from one to another with a different diskgroup name
1) Create the disk group with the new name on the new storage.
2) Create the spfile in the new diskgroup and change the parameters (scope=spfile) for control files etc.
3) Take a controlfile backup in the format '+newdiskgroup'.
4) Shut down the database, then startup nomount.
5) Restore the controlfile from the backup (now the controlfile will be restored to the new diskgroup).
6) Take an RMAN backup as copy of the whole database with the new format:
RMAN> backup database as copy format '+newdiskgroupname';
7) RMAN> switch database to copy;
8) Verify dba_data_files, dba_temp_files, and v$log to confirm all files are pointing to the new diskgroup name.
c. Case 3: Migrating a disk group to new storage when no additional diskgroup name is given
1) Take an RMAN backup as copy of the whole database with the new format and place it on disk.
2) Prepare rename commands from v$log, v$datafile, etc. (dynamic queries).
3) Take a backup of the pfile and modify the following to refer to the new diskgroup name: control_files, db_create_file_dest, db_create_online_log_dest_1, db_create_online_log_dest_2, db_recovery_file_dest
4) Stop the database.
5) Unmount the diskgroup: asmcmd umount ORA_DATA
6) Use the renamedg command (11gR2 only) to rename to the new diskgroup: renamedg phase=both dgname=ORA_DATA newdgname=NEW_DATA verbose=true
7) Mount the diskgroup: asmcmd mount NEW_DATA
8) Start the database in mount with the new pfile taken in step 3.
9) Run the rename-file scripts generated at step 2.
10) Add the diskgroup to the cluster (if using RAC):
srvctl modify database -d orcl -p +NEW_FRA/orcl/spfileorcl.ora
srvctl modify database -d orcl -a "NEW_DATA"
srvctl config database -d orcl
srvctl start database -d orcl
11) Delete the old diskgroup resource from the cluster: crsctl delete resource ora.ORA_DATA.dg
12) Open the database.
7. Database rename in RAC: what could be the checklist for you?
a. Take the outputs of all the services that are running on the databases.
b. Set cluster_database=FALSE.
c. Drop all the services associated with the database.
d. Stop the database.
e. Startup mount.
f. Use nid to change the DB name.
Generic question: if using ASM, the usual location for a datafile would be '+DATA/datafile/OLDDBNAME/system01.dbf'. Does nid change this path too, to reflect the new DB name? Yes it will; by using the proper directory structure it will create links to the original directory structure, i.e. '+DATA/datafile/NEWDBNAME/system01.dbf'. (This has to be tested; we don't have a test bed, but thanks to Anji who confirmed it will.)
g. Change the parameters according to the new database name.
h. Change the password file.
i. Stop the database.
j. Mount the database.
k. Open the database with resetlogs.
l. Create spfile from pfile.
m. Add the database to the cluster.
n. Create the services that were dropped prior to the rename.
o. Bounce the database.
8. How to find which database a particular service is attached to
When you have a large number of databases running on the server, you cannot check them one by one manually. Write a shell script that reads the database names from oratab and loops over them, passing the service name to srvctl to get the result.
#!/bin/ksh
# Pass the service name as the first argument ($1).
# The script reads every database name from /etc/oratab and asks
# srvctl which database reports the service as running.
ORACLE_HOME=
PATH=$ORACLE_HOME/bin:$PATH
LD_LIBRARY_PATH=${SAVE_LLP}:${ORACLE_HOME}/lib
export TNS_ADMIN ORACLE_HOME PATH LD_LIBRARY_PATH
# First field of each non-comment oratab entry is the DB name
for INSTANCE in `cat /etc/oratab|grep -v "^#"|cut -f1 -d: -s`
do
export ORACLE_SID=$INSTANCE
# Print the srvctl output only when the service is reported as running
echo `srvctl status service -d $INSTANCE -s $1| grep -i "is running"`
done
9. Difference between OHAS and CRS
OHAS is the complete cluster stack, which includes some kernel-level tasks like managing the network, time synchronization, disks, etc., whereas CRS has the ability to manage resources like databases, listeners, applications, etc. With both of these, Oracle provides high-availability clustering services rather than only an affinity to databases.