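On Exadata, whenever a problem arises that is not specific to the database itself, it is best to run exachk for a health check first. exachk collects a comprehensive set of information and saves a great deal of tedious manual gathering. Once collection finishes, it gives an overall assessment of the system's health: the report covers software, hardware, firmware versions, and configuration, so you can spot suspicious items and narrow the scope for the next stage of diagnosis.
This article records the basic usage of exachk, which can be downloaded from MOS document 1070954.1.
First, set two environment variables, RAT_ORACLE_HOME and RAT_EXADATA_VERSION; otherwise exachk reports errors later in the run: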
[oracle@dm02db01 dbhome_1]$ echo $ORACLE_HOME
/u01/app/oracle/product/11.2.0.4/dbhome_1
[oracle@dm02db01 dbhome_1]$ export RAT_ORACLE_HOME=/u01/app/oracle/product/11.2.0.4/dbhome_1
[oracle@dm02db01 exachk]$ rpm -qa |grep exadata
exadata-oswatcher-11.2.3.3.0.131014.1-1
exadata-asr-11.2.3.3.0.131014.1-1
exadata-sun-computenode-11.2.3.3.0.131014.1-1
exadata-base-11.2.3.3.0.131014.1-1
exadata-applyconfig-11.2.3.3.0.131014.1-1
exadata-ibdiagtools-11.2.3.3.0.131014.1-1
exadata-exachk-11.2.3.3.0.131014.1-1
exadata-validations-compute-11.2.3.3.0.131014.1-1
exadata-ipconf-11.2.3.3.0.131014.1-1
exadata-commonnode-11.2.3.3.0.131014.1-1
exadata-firmware-compute-11.2.3.3.0.131014.1-1
exadata-sun-computenode-minimum-11.2.3.3.0.131014.1-1
[oracle@dm02db01 exachk]$ export RAT_EXADATA_VERSION=11.2.3.3.0
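The version exported above is simply the first five dot-separated fields of the exadata-base package name in the `rpm -qa` listing. As a minimal sketch, that extraction could be scripted rather than typed by hand. The function name `exadata_version_from_pkg` is hypothetical and not part of exachk, and the sketch assumes the package name always follows the `exadata-base-<version>.<build>-<release>` pattern seen in the listing.

```shell
# Hypothetical helper (not part of exachk): derive the five-part image
# version from an exadata-base package name as printed by `rpm -qa`.
exadata_version_from_pkg() {
    # Keep the first five dot-separated numeric fields after the prefix,
    # e.g. exadata-base-11.2.3.3.0.131014.1-1 -> 11.2.3.3.0
    printf '%s\n' "$1" |
        sed -n 's/^exadata-base-\([0-9]*\.[0-9]*\.[0-9]*\.[0-9]*\.[0-9]*\)\..*$/\1/p'
}

# Demonstration using the package name from the listing above.
RAT_EXADATA_VERSION=$(exadata_version_from_pkg "exadata-base-11.2.3.3.0.131014.1-1")
export RAT_EXADATA_VERSION
echo "RAT_EXADATA_VERSION=$RAT_EXADATA_VERSION"
```

On a real compute node the argument would come from `rpm -qa | grep '^exadata-base-'` instead of a literal string. If the package name does not match the assumed pattern, the function prints nothing and the variable ends up empty, so check it before running exachk.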
Then run exachk:
[oracle@dm02db01 dbhome_1]$ cd /opt/oracle.SupportTools/
[oracle@dm02db01 oracle.SupportTools]$ cd exachk
[oracle@dm02db01 exachk]$ ./exachk
CRS stack is running and CRS_HOME is not set. Do you want to set CRS_HOME to /u01/app/11.2.0.4/grid?[y/n][y] --confirm the CRS_HOME path
Checking ssh user equivalency settings on all nodes in cluster
Node dm02db02 is configured for ssh user equivalency for oracle user
Searching for running databases . . . . .
. . . . . . . . . . . . . .
List of running databases registered in OCR
1. bdataedw
2. bdataetl
3. cata
4. edw
5. ETL
6. OMSSTD
7. portalstd
8. rdsdbstd
9. All
10. None
Select respective number to choose database for checking best practices. For multiple databases, select 9 for All or comma separated number like 1,2 etc [1-10][9]. --choose the databases to check: 1-8 are the eight databases that were detected, 9 checks all of them, and 10 skips the database checks.
Searching out ORACLE_HOME for selected databases.
. . . . . . . . . . . . . . . . . . .
ls: /u01/app/oracle/product/11.2.0.4/dbhome_1ORACLE_HOME_OLD/bin/oracle: No such file or directory
Checking Status of Oracle Software Stack - Clusterware, ASM, RDBMS
. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
-------------------------------------------------------------------------------------------------------
Oracle Stack Status
-------------------------------------------------------------------------------------------------------
Host Name  CRS Installed  ASM HOME  RDBMS Installed  CRS UP  ASM UP  RDBMS UP  DB Instance Name
-------------------------------------------------------------------------------------------------------
dm02db01   Yes            Yes       Yes              Yes     Yes     Yes       bdataedw1 bdataetl1 cata1 edw1 ETL1 OMS3 portal1 rdsdb1
dm02db02   Yes            Yes       Yes              Yes     Yes     Yes       bdataedw2 bdataetl2 cata2 edw2 ETL2 OMS4 portal2 rdsdb2
-------------------------------------------------------------------------------------------------------
root user equivalence is not setup between dm02db01 and STORAGE SERVER dm02cel01.
1. Enter 1 if you will enter root password for each STORAGE SERVER when prompted.
2. Enter 2 to exit and configure root user equivalence manually and re-run exachk.
3. Enter 3 to skip checking best practices on STORAGE SERVER.
Please indicate your selection from one of the above options[1-3][1]:-
Is root password same on all STORAGE SERVER[y/n][y]
Enter root password for STORAGE SERVER :- --the root password shared by all cell nodes
root password for 192.168.0.19 was incorrect. 2 retries remaining.
Enter root password for 192.168.0.19 :-
root password for 192.168.0.19 was incorrect. 1 retries remaining.
Enter root password for 192.168.0.19 :-
root password for 192.168.0.19 was incorrect. root privileged checks will not be executed on 192.168.0.19
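--If a node's root password differs from the others, exachk prompts for it separately. If you do not know it, exachk skips that node during collection, and the other nodes are not affected.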
expect: spawn id exp6 not open
while executing
"expect "*?assword:*""
expect: spawn id exp6 not open
while executing
"expect "*?assword:*""
expect: spawn id exp6 not open
while executing
"expect "*?assword:*""
expect: spawn id exp6 not open
while executing
"expect "*?assword:*""
120 of the included audit checks require root privileged data collection on DATABASE SERVER. If sudo is not configured or the root password is not available, audit checks which require root privileged data collection can be skipped.
1. Enter 1 if you will enter root password for each on DATABASE SERVER host when prompted
2. Enter 2 if you have sudo configured for oracle user to execute root_exachk.sh script on DATABASE SERVER
3. Enter 3 to skip the root privileged collections on DATABASE SERVER
4. Enter 4 to exit and work with the SA to configure sudo on DATABASE SERVER or to arrange for root access and run the tool later.
Please indicate your selection from one of the above options[1-4][1]:-
Is root password same on all compute nodes?[y/n][y]
Enter root password on DATABASE SERVER:- --the root password for all DB nodes
9 of the included audit checks require root privileged data collection on INFINIBAND SWITCH .
1. Enter 1 if you will enter root password for each INFINIBAND SWITCH when prompted
2. Enter 2 to exit and to arrange for root access and run the exachk later.
3. Enter 3 to skip checking best practices on INFINIBAND SWITCH
Please indicate your selection from one of the above options[1-3][1]:-
Is root password same on all INFINIBAND SWITCH ?[y/n][y] --the root password for the InfiniBand switches
Enter root password for INFINIBAND SWITCH :-
root passwords for following nodes are incorrect.
You can still continue but root privileged checks will not be executed on following nodes.
1. 192.168.0.19
Do you want to continue[y/n][y]:-
*** Checking Best Practice Recommendations (PASS/WARNING/FAIL) ***
Log file for collections and audit checks are at
/opt/oracle.SupportTools/exachk/exachk_112114_162425/exachk.log
=============================================================
Node name - dm02db01
=============================================================
Collecting - ASM DIsk I/O stats
Collecting - ASM Disk Groups
Collecting - ASM Diskgroup Attributes
Collecting - ASM disk partnership imbalance
Collecting - ASM initialization parameters
Collecting - Active sessions load balance for bdataedw database
Collecting - Active sessions load balance for bdataetl database
Collecting - Active sessions load balance for cata database
Collecting - Active sessions load balance for edw database
..............
Collecting patch inventory on CRS HOME /u01/app/11.2.0.4/grid
Collecting patch inventory on ORACLE_HOME /u01/app/oracle/product/11.2.0.4/dbhome_1
Collecting patch inventory on ORACLE_HOME /u01/app2/oracle/product/11.2.0.2/dbhome_1
---------------------------------------------------------------------------------
Detailed report (html) - /opt/oracle.SupportTools/exachk/exachk_rdsdbstd_112114_162425/exachk_rdsdbstd_112114_162425.html
UPLOAD(if required) - /opt/oracle.SupportTools/exachk/exachk_rdsdbstd_112114_162425.zip
exachk has now finished. You can download /opt/oracle.SupportTools/exachk/exachk_rdsdbstd_112114_162425.zip, or open /opt/oracle.SupportTools/exachk/exachk_rdsdbstd_112114_162425/exachk_rdsdbstd_112114_162425.html to view the report.
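Each run leaves a new timestamped zip in the exachk directory, so it can be handy to pick out the newest one before copying it off the node. The following is a minimal sketch, not something exachk provides: the function name `latest_exachk_zip` is hypothetical, and it assumes the zip files keep the `exachk_*.zip` naming shown in the output above.

```shell
# Hypothetical helper (not part of exachk): print the most recently
# modified exachk_*.zip in the given directory, or nothing if none exist.
latest_exachk_zip() {
    dir=${1:-/opt/oracle.SupportTools/exachk}
    # ls -t sorts newest first; the error from an unmatched glob is discarded.
    ls -t "$dir"/exachk_*.zip 2>/dev/null | head -n 1
}
```

Called with no argument, it looks in /opt/oracle.SupportTools/exachk, the directory used in this run. The path it prints can then be passed to whatever copy tool you normally use.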
The report gives a fairly intuitive view of the problems that currently exist; the interface looks like this: