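When I first took over an Oracle database, this problem troubled me for a while. Here is how I eventually resolved it.
Oracle version: 11gR2
OS: CentOS 6.4
Reproducing the problem:
1. After taking over the database I wrote a backup script, shown below: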
-----------------------------------------------------------------------------------------------------------------------------------
#!/bin/bash
# Name: rmanbk_level0.sh
# Written by: Datura at 2014/11/11 v1.0
# Description: Takes a level 0 RMAN backup of the orcl database
# Variable definitions
lock_file=/tmp/rmanbk.lock
# Look up the UID of the oracle user directly (grep on /etc/passwd can match more than one line)
oracleid=$(id -u oracle)
# Check whether the script is already running
if [ -f $lock_file ];then
pid=`cat $lock_file`
ps $pid &> /dev/null
[ $? -eq 0 ] && echo "Script is running..." && exit 1
fi
# Create the process lock
echo $$ > $lock_file
# Only allow the oracle user to run this script
[ $UID -ne $oracleid ] && echo "Please run as oracle !!" && exit 4
# Set environment variables
export ORACLE_BASE=/u01/app/oracle
export ORACLE_HOME=$ORACLE_BASE/product/11.2.0/db_1
export PATH=$ORACLE_HOME/bin:$PATH
export ORACLE_SID=orcl1
export NLS_LANG=AMERICAN_AMERICA.AL32UTF8
echo on
rman target / msglog=/storage/script/log/rmanbk_level0_`date +%Y-%m-%d-%H:%M:%S`.log <<EOF
RUN {
allocate channel c1 type disk;
allocate channel c2 type disk;
allocate channel c3 type disk;
allocate channel c4 type disk;
backup as compressed backupset incremental level 0 database format '/storage/arch/rman/rmanbk_level0_%d_%I_%s_%p_%T.bkp';
crosscheck archivelog all;
backup archivelog all format '/storage/arch/rman/rmanbk_archivelog_%d_%I_%s_%p_%T.bkp';
crosscheck backup;
delete noprompt obsolete;
delete noprompt expired backupset;
release channel c1;
release channel c2;
release channel c3;
release channel c4;
}
exit
EOF
echo off
find /storage/arch/rman/ -mtime -0.5 -type f -exec zip /storage/arch/rman/rmanbk_level0`date +%F`.zip {} \;
scp /storage/arch/rman/rmanbk_level0`date +%F`.zip backup.demon.com:/home/oracle/orabackup
rm -rf /storage/arch/rman/rmanbk_level0`date +%F`.zip
find /storage/script/log -mtime +7 -exec rm -rf {} \;
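The script above writes its PID to the lock file but never removes it, and relies on the PID check to detect a stale lock. A minimal sketch of the same idea with explicit cleanup follows. The function names `acquire_lock` and `release_lock` are hypothetical and not part of the original script, and `kill -0` only works reliably when the lock holder runs as the same user, which is assumed here because the script is meant to run as oracle only.

```shell
#!/bin/bash
# Hypothetical helpers (not in the original script): PID-based lock with cleanup.

# acquire_lock FILE: fail if FILE names a live process, otherwise record our PID.
acquire_lock() {
    local lock_file=$1
    local pid
    if [ -f "$lock_file" ]; then
        pid=$(cat "$lock_file")
        # kill -0 sends no signal; it only tests whether the PID exists
        # (assumption: the lock holder runs as the same user)
        if [ -n "$pid" ] && kill -0 "$pid" 2>/dev/null; then
            echo "Script is running (pid $pid)..." >&2
            return 1
        fi
    fi
    echo $$ > "$lock_file"
}

# release_lock FILE: remove the lock file.
release_lock() {
    rm -f "$1"
}
```

In the backup script this could be used as `acquire_lock "$lock_file" || exit 1` followed by `trap 'release_lock "$lock_file"' EXIT`, so the lock disappears even when the script exits early.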
2. One day the following error appeared in the backup log
--------------------------------------------------------------------------------------------------------------------------------------------------
RMAN-06207: WARNING: 1 objects could not be deleted for DISK channel(s) due
RMAN-06208: to mismatched status. Use CROSSCHECK command to fix status
RMAN-06210: List of Mismatched objects
RMAN-06211: =========================================================
RMAN-06212: Object Type Filename/Handle
RMAN-06213: --------------- ---------------------------------------------------
RMAN-06214: Datafile Copy +DATA/orcl/snapc_orcl.f
3. How I handled the error
----------------------------------------------------------------------------------------------------------------------------------------------------
RMAN> crosscheck backupset;
RMAN> report obsolete;
RMAN retention policy will be applied to the command
RMAN retention policy is set to recovery window of 2 days
Report of obsolete backups and copies
Type Key Completion Time Filename/Handle
-------------------- ------ ------------------ --------------------
Control File Copy 5 12-DEC-14 +DATA/orcl/snapc_orcl.f
RMAN> delete noprompt obsolete;
RMAN-00571: ====================================================
RMAN-00569: ===============ERROR MESSAGE STACK FOLLOWS ===============
RMAN-00571: ====================================================
RMAN-03009: failure of delete command on c2 channel at 11/20/2014 09:03:14
ORA-19606: Cannot copy or restore to snapshot control file
RMAN> show snapshot controlfile name;
RMAN configuration parameters for database with db_unique_name ORCL are:
CONFIGURE SNAPSHOT CONTROLFILE NAME TO '/u01/app/oracle/product/11.2.0/db_1/dbs/snapcf_orcl1.f'; # default
RMAN> configure snapshot controlfile name to '/u01/app/oracle/product/11.2.0/db_1/dbs/snapcf_orcl1.f_bak';
new RMAN configuration parameters:
CONFIGURE SNAPSHOT CONTROLFILE NAME TO '/u01/app/oracle/product/11.2.0/db_1/dbs/snapcf_orcl1.f_bak';
new RMAN configuration parameters are successfully stored
RMAN> crosscheck controlfilecopy '/u01/app/oracle/product/11.2.0/db_1/dbs/snapcf_orcl1.f';
released channel: ORA_DISK_1
allocated channel: ORA_DISK_1
channel ORA_DISK_1: SID=80 instance=orcl1 device type=DISK
validation failed for control file copy
control file copy file name=/u01/app/oracle/product/11.2.0/db_1/dbs/snapcf_orcl1.f RECID=2 STAMP=863884566
Crosschecked 1 objects
RMAN> delete expired controlfilecopy '/u01/app/oracle/product/11.2.0/db_1/dbs/snapcf_orcl1.f';
released channel: ORA_DISK_1
allocated channel: ORA_DISK_1
channel ORA_DISK_1: SID=80 instance=orcl1 device type=DISK
List of Control File Copies
=========================================================================================
Key S Completion Time Ckp SCN Ckp Time
------- - --------------- ---------- ---------------
2 X 17-NOV-14 67553950 17-NOV-14
Name: /u01/app/oracle/product/11.2.0/db_1/dbs/snapcf_orcl1.f
Tag: TAG20141117T155602
Do you really want to delete the above objects (enter YES or NO)? yes
deleted control file copy
control file copy file name=/u01/app/oracle/product/11.2.0/db_1/dbs/snapcf_orcl1.f RECID=2 STAMP=863884566
Deleted 1 EXPIRED objects
RMAN> configure snapshot controlfile name to '/u01/app/oracle/product/11.2.0/db_1/dbs/snapcf_orcl1.f';
old RMAN configuration parameters:
CONFIGURE SNAPSHOT CONTROLFILE NAME TO '/u01/app/oracle/product/11.2.0/db_1/dbs/snapcf_orcl1.f_bak';
new RMAN configuration parameters:
CONFIGURE SNAPSHOT CONTROLFILE NAME TO '/u01/app/oracle/product/11.2.0/db_1/dbs/snapcf_orcl1.f';
new RMAN configuration parameters are successfully stored
RMAN> configure snapshot controlfile name clear;
old RMAN configuration parameters:
CONFIGURE SNAPSHOT CONTROLFILE NAME TO '/u01/app/oracle/product/11.2.0/db_1/dbs/snapcf_orcl1.f';
RMAN configuration parameters are successfully reset to default value
4. The routine steps above fixed the problem for the moment, but within a few days the same kind of failure came back
-------------------------------------------------------------------------------------------------------------------------------------------------------------------------
RMAN-00571: ==========================================================
RMAN-00569: ================= ERROR MESSAGE STACK FOLLOWS ===================
RMAN-00571: ==========================================================
RMAN-03009: failure of Control File and SPFILE Autobackup command on c1 channel at 12/12/2014 01:05:19
ORA-00245: control file backup failed; target is likely on a local file system
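5. The routine steps were only a temporary fix and did not address the root cause, so I went through the official documentation and found the following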
-------------------------------------------------------------------------------------------------------------------------------------------------------------------------
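Starting with 11gR2, backing up the control file no longer requires holding the controlfile enqueue.
Nothing changes for non-RAC databases.
In a RAC environment, however, the control file backup mechanism has changed:
every node in the cluster must be able to access the snapshot control file, so it has to be visible to all instances.
If the snapshot control file is not on a shared device, the error above is raised when RMAN backs up the snapshot control file.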
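As a rough illustration of that requirement, the sketch below extracts the configured path from the output of `show snapshot controlfile name;` and checks whether it looks shared. The function names `snapshot_cf_path` and `looks_shared` are hypothetical, and the check is only a heuristic: it treats an ASM path (starting with `+`) or a path under a mount prefix that you declare to be shared as acceptable. It does not verify that the mount really is shared across nodes.

```shell
#!/bin/bash
# Hypothetical helpers, not part of RMAN or the original script.

# snapshot_cf_path: read "show snapshot controlfile name;" output on stdin
# and print the configured file name.
snapshot_cf_path() {
    sed -n "s/^CONFIGURE SNAPSHOT CONTROLFILE NAME TO '\(.*\)';.*$/\1/p"
}

# looks_shared PATH [SHARED_PREFIX...]: succeed if PATH is in an ASM disk group
# or under one of the prefixes the caller asserts are shared mounts.
looks_shared() {
    local path=$1
    shift
    case $path in
        +*) return 0 ;;          # ASM disk group, visible to all instances
    esac
    local prefix
    for prefix in "$@"; do
        case $path in
            "${prefix%/}"/*) return 0 ;;
        esac
    done
    return 1
}
```

For example, `rman target / <<< 'show snapshot controlfile name;' | snapshot_cf_path` would print the current setting, and `looks_shared "$path" /storage` would succeed only if that path is in ASM or under `/storage` (assuming `/storage` is the shared mount in this environment).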
6. Based on this information I made the following adjustment
------------------------------------------------------------------------------------------------------------------------------------------
RMAN> show snapshot controlfile name;
RMAN> configure snapshot controlfile name to '/storage/snap_control/snapcf_%d_%I_%s_%p_%T.f';
RMAN> configure snapshot controlfile name to '+DATA/orcl/snapcf_orcl.f'; # alternatively, point it at an ASM disk group (format variables such as %d cannot be used in a disk group path)
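After moving the snapshot control file to shared storage, the errors above did not come back.
When something breaks, we tend to fall back on experience and the usual routine fixes.
Sometimes, though, reading the official documentation is the better choice.
It may not give you the exact steps to follow,
but it can point you toward the root cause.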
-------------------------------------------------------The above is my personal view; corrections are welcome-----------------------------------------------------------------------