11gR2 Clusterware and Grid Home - What You Need to Know

11gR2 Clusterware Key Facts

  • 11gR2 Clusterware must be up and running before you install an 11gR2 Real Application Clusters (RAC) database.
  • The GRID home consists of Oracle Clusterware and ASM; ASM should not be installed in a separate home.
  • The 11gR2 Clusterware can be installed in "Standalone" mode for ASM and/or "Oracle Restart" single-node support. This clusterware is a subset of the full clusterware described in this document.
  • The 11gR2 Clusterware can be run by itself or on top of vendor clusterware.  See the certification matrix for certified combinations. Ref: Note: 184875.1 "How To Check The Certification Matrix for Real Application Clusters"
  • The GRID Home and the RAC/DB Home must be installed in different locations.
  • The 11gR2 Clusterware requires shared OCR and voting files.  These can be stored in ASM or on a cluster filesystem.
  • The OCR is backed up automatically every 4 hours to <GRID_HOME>/cdata/<clustername>/ and can be restored via ocrconfig.
  • The voting file is backed up into the OCR at every configuration change and can be restored via crsctl.  (See the first command sketch after this list.)
  • The 11gR2 Clusterware requires at least one private network for inter-node communication and at least one public network for external communication.  Several virtual IPs need to be registered in DNS: the node VIPs (one per node) and the SCAN VIPs (three).  This can be done manually by your network administrator, or you can configure GNS (Grid Naming Service) in the Oracle clusterware to handle it for you (note that GNS requires its own VIP).
  • A SCAN (Single Client Access Name) is provided for clients to connect to (see the lookup example after this list).  For more information on SCAN see Note: 887522.1
  • The root.sh script at the end of the clusterware installation starts the clusterware stack.  For information on troubleshooting root.sh issues see Note: 1053970.1
  • Only one set of clusterware daemons can be running per node.
  • On Unix, the clusterware stack is started via the init.ohasd script referenced in /etc/inittab with "respawn".
  • A node can be evicted (rebooted) if it is deemed unhealthy.  This is done so that the health of the entire cluster can be maintained.  For more information on this see: Note: 1050693.1 "Troubleshooting 11.2 Clusterware Node Evictions (Reboots)"
  • Either have vendor time synchronization software (such as NTP) fully configured and running, or leave it entirely unconfigured and let CTSS handle time synchronization.  See Note: 1054006.1 for more information.
  • If installing DB homes of a lower version, you will need to pin the nodes in the clusterware or you will see ORA-29702 errors (see the pinning example after this list).  See Note: 946332.1 and Note: 948456.1 for more information.
  • The clusterware stack can be started by booting the machine, by running "crsctl start crs" to start the stack on the local node, or by running "crsctl start cluster" to start the clusterware on all nodes.  Note that crsctl is in the <GRID_HOME>/bin directory, and that "crsctl start cluster" will only work if ohasd is already running.  (See the last command sketch after this list.)
  • The clusterware stack can be stopped by shutting down the machine, by running "crsctl stop crs" to stop the stack on the local node, or by running "crsctl stop cluster" to stop the clusterware on all nodes.
  • Killing clusterware daemons is not supported.
  • Database instances are now part of the .db resources in "crsctl stat res -t" output; there is no separate .inst resource for an 11gR2 instance.
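
As a quick illustration of the OCR and voting file facts above, here is a minimal command sketch.  It assumes you run as root from <GRID_HOME>/bin; the backup file name and the +DATA diskgroup are illustrative examples, not values from this note.

    # List the automatic OCR backups (taken every 4 hours)
    ocrconfig -showbackup

    # Restore the OCR from one of the listed backups
    # (the stack must be down; the file name below is an example)
    ocrconfig -restore <GRID_HOME>/cdata/<clustername>/backup00.ocr

    # Show the voting files currently in use
    crsctl query css votedisk

    # Re-create the voting files on an ASM diskgroup (name is an example)
    crsctl replace votedisk +DATA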
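
To check the DNS registrations, you can look up the SCAN; it should resolve to its three VIPs.  The output below is abbreviated, and the cluster name, domain, and addresses are hypothetical:

    $ nslookup mycluster-scan.example.com
    Name:    mycluster-scan.example.com
    Address: 192.0.2.101
    Name:    mycluster-scan.example.com
    Address: 192.0.2.102
    Name:    mycluster-scan.example.com
    Address: 192.0.2.103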
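
If you do need to pin nodes for a pre-11.2 database home, the pinning is a one-liner per node (run as root; the node name racnode1 is hypothetical):

    # Pin a node so pre-11.2 databases can run on it
    crsctl pin css -n racnode1

    # Verify the pinned/unpinned state of the nodes
    olsnodes -t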
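
Finally, the last command sketch shows the start/stop and status commands mentioned above (run as root from <GRID_HOME>/bin; "-all" extends "crsctl start cluster" to every node):

    # Start the stack on the local node
    crsctl start crs

    # Start the clusterware on all nodes (requires ohasd to be running on them)
    crsctl start cluster -all

    # Stop the stack on the local node
    crsctl stop crs

    # Show cluster resources; instances appear under the ora.<dbname>.db resource
    crsctl stat res -t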

Note that it is also a good idea to follow the RAC Assurance best practices in Note: 810394.1

Clusterware Startup Sequence

The following is the Clusterware startup sequence (image from the "Oracle Clusterware Administration and Deployment Guide"):

Don't let this picture scare you too much.  You aren't responsible for managing all of these processes; that is the Clusterware's job!

Short summary of the startup sequence: INIT spawns init.ohasd (with respawn), which in turn starts the OHASD process (Oracle High Availability Services Daemon).  This daemon then spawns the four agent processes listed below.
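
For example, on Linux the /etc/inittab entry typically looks like the line below (the "h1" identifier and the runlevels can vary by platform):

    h1:35:respawn:/etc/init.d/init.ohasd run >/dev/null 2>&1 </dev/null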

Level 1: OHASD Spawns:

  • cssdagent - Agent responsible for spawning CSSD.
  • orarootagent - Agent responsible for managing all root owned ohasd resources.
  • oraagent - Agent responsible for managing all oracle owned ohasd resources.
  • cssdmonitor - Monitors CSSD and node health (along with the cssdagent).

Level 2: OHASD rootagent spawns:

  • CRSD - Primary daemon responsible for managing cluster resources.
  • CTSSD - Cluster Time Synchronization Services Daemon
  • Diskmon
  • ACFS (ASM Cluster File System) Drivers

Level 2: OHASD oraagent spawns:

  • MDNSD - Multicast DNS daemon, used for DNS lookups
  • GIPCD - Used for inter-process and inter-node communication
  • GPNPD - Grid Plug & Play Profile Daemon
  • EVMD - Event Monitor Daemon
  • ASM - Resource for monitoring ASM instances

Level 3: CRSD spawns:

  • orarootagent - Agent responsible for managing all root owned crsd resources.
  • oraagent - Agent responsible for managing all oracle owned crsd resources.

Level 4: CRSD rootagent spawns:

  • Network resource - To monitor the public network
  • SCAN VIP(s) - Single Client Access Name Virtual IPs
  • Node VIPs - One per node
  • ACFS Registry - For mounting the ASM Cluster File System
  • GNS VIP (optional) - VIP for GNS

Level 4: CRSD oraagent spawns:

  • ASM Resource - ASM Instance(s) resource
  • Diskgroup - Used for managing/monitoring ASM diskgroups.
  • DB Resource - Used for monitoring and managing the DB and instances
  • SCAN Listener - Listener for single client access name, listening on SCAN VIP
  • Listener - Node listener listening on the Node VIP
  • Services - Used for monitoring and managing services
  • ONS - Oracle Notification Service
  • eONS - Enhanced Oracle Notification Service
  • GSD - For 9i backward compatibility
  • GNS (optional) - Grid Naming Service - Performs name resolution

[Image: diagram of the four Clusterware startup levels described above]
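
You can see the OHASD-managed (lower-stack) resources from Levels 1 and 2 by querying the "init" resources.  The output below is an abbreviated, illustrative sample; the node name racnode1 is hypothetical:

    $ crsctl stat res -t -init
    --------------------------------------------------------------------------
    NAME            TARGET  STATE        SERVER          STATE_DETAILS
    --------------------------------------------------------------------------
    ora.asm         ONLINE  ONLINE       racnode1        Started
    ora.cssd        ONLINE  ONLINE       racnode1
    ora.cssdmonitor ONLINE  ONLINE       racnode1
    ora.crsd        ONLINE  ONLINE       racnode1
    ora.ctssd       ONLINE  ONLINE       racnode1        OBSERVER
    ora.evmd        ONLINE  ONLINE       racnode1
    ...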
