[Translated from a MOS article] RMAN duplicate hangs when using DCD and TCPS

Source:

RMAN Duplicate hangs when using DCD and TCPS (Doc ID 1676197.1)

Applies to:

Oracle Database - Enterprise Edition - Version 11.2.0.1 and later

Information in this document applies to any platform.

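Symptoms:

An RMAN active duplicate for standby hangs during the datafile copy phase. SSL Oracle Net (TCPS) and Dead Connection Detection (DCD) are both in use.

The hang is intermittent: sometimes the duplicate completes, and at other times it hangs for days until the processes are killed at both the operating system and database level.

RMAN debug output shows the following message repeating: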

RMAN-06731: command backup:x% complete, time left HH:MM:SS

Sample RMAN debug output:

RMAN-12016: using channel ORA_DISK_8
RMAN-08580: channel ORA_DISK_1: starting datafile copy
RMAN-08522: input datafile file number=00012 name=+OFD_DATA/ofmim01q/datafile/ofm_tbs_oaam_indx.272.810048785
...
RMAN-08581: channel ORA_DISK_4: datafile copy complete, elapsed time: 00:00:16
RMAN-08592: output file name=+OFN_DAT/ofmiy01q/datafile/ofm_ias_iau.373.842790419 tag=TAG20140321T065222
RMAN-08581: channel ORA_DISK_7: datafile copy complete, elapsed time: 00:00:16
RMAN-06731: command backup:94.1% complete, time left 00:21:05

//
// RMAN-06731 and % complete repeats here
// Process is completely stalled 

RMAN-06731: command backup:94.1% complete, time left 00:21:05

On the primary database, eight sessions are hung, and their wait time on the "remote db file write" event simply keeps increasing:

SQL> select SID ,SERIAL# , INST_ID , USERNAME, OSUSER || '@' || MACHINE OSINFO, SUBSTR(PROGRAM,0,20) PROGRAM,
  2  TO_CHAR(LOGON_TIME,'yyyy-mm-dd hh24:mi:ss') LOGON_TIME, EVENT, SECONDS_IN_WAIT SIW from gv$session where type <> 'BACKGROUND' and PROGRAM like 'rman%'
  3  ORDER BY USERNAME, INST_ID, SID;

 SID SERIAL# INST_ID USERNAME OSINFO            PROGRAM              LOGON_TIME           EVENT                                 SIW
---- ------- ------- -------- ----------------- -------------------- -------------------- ------------------------------ ----------
 632    5635       2 SYS      [email protected]   [email protected] (TNS V 2014-04-08 17:29:27  SQL*Net message from client            34
 758    2535       2 SYS      [email protected]   [email protected] (TNS V 2014-04-08 17:29:30  SQL*Net message from client             4
 948     441       2 SYS      [email protected]   [email protected] (TNS V 2014-04-08 17:29:35  remote db file write                52036
1010     369       2 SYS      [email protected]   [email protected] (TNS V 2014-04-08 17:29:36  remote db file write                35532
1073     215       2 SYS      [email protected]   [email protected] (TNS V 2014-04-08 17:29:37  remote db file write                52935
1136     291       2 SYS      [email protected]   [email protected] (TNS V 2014-04-08 17:29:37  remote db file write                54014
1199     753       2 SYS      [email protected]   [email protected] (TNS V 2014-04-08 17:29:38  remote db file write                41651
1325    1595       2 SYS      [email protected]   [email protected] (TNS V 2014-04-08 17:29:39  remote db file write                42730
1388    2121       2 SYS      [email protected]   [email protected] (TNS V 2014-04-08 17:29:39  remote db file write                50771
1451    1351       2 SYS      [email protected]   [email protected] (TNS V 2014-04-08 17:29:40  remote db file write                47650

The hung processes must be killed at both the database and OS level.
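
As a rough sketch of the database-side kill, the hypothetical helper below (not part of the MOS note) builds `ALTER SYSTEM KILL SESSION` statements from SID, SERIAL#, and INST_ID values such as those in the query output above. It assumes a release that accepts the `@inst_id` form of the statement and that a DBA reviews the statements before running them. The matching OS processes still have to be killed separately.

```python
def kill_session_sql(sid, serial, inst_id=None):
    """Build an ALTER SYSTEM KILL SESSION statement for one session.

    Hypothetical helper for illustration; inputs are coerced to int so that
    only numeric identifiers end up in the generated statement.
    """
    target = f"{int(sid)},{int(serial)}"
    if inst_id is not None:
        # The '@inst_id' suffix targets a session on a specific RAC instance.
        target += f",@{int(inst_id)}"
    return f"ALTER SYSTEM KILL SESSION '{target}' IMMEDIATE;"


if __name__ == "__main__":
    # (SID, SERIAL#, INST_ID) of two of the hung sessions in the sample output.
    for row in [(948, 441, 2), (1010, 369, 2)]:
        print(kill_session_sql(*row))
```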

Cause:

Unfortunately, the combination of expire_time and TCPS is not supported by Oracle, because the NTZ layer (used for TCPS communication) uses routines that are not async-signal-safe.

Using routines that are not async-signal-safe can cause unpredictable results such as hangs and crashes.

Solution:

Do not use DCD with SSL Oracle Net. Remove sqlnet.expire_time from the sqlnet.ora file or set it to 0 (zero).
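
To verify that DCD is really off, one option is to scan the sqlnet.ora contents for the parameter. The following is a minimal sketch with hypothetical function names. It assumes that `#` starts a comment, that the parameter name is case-insensitive, and that a missing parameter means DCD is disabled.

```python
import re

# Matches a line such as "SQLNET.EXPIRE_TIME = 10" (value in minutes).
_EXPIRE_TIME = re.compile(r"sqlnet\.expire_time\s*=\s*(\d+)\s*$", re.IGNORECASE)


def dcd_expire_time(sqlnet_ora_text):
    """Return the configured SQLNET.EXPIRE_TIME in minutes, or 0 if it is not set."""
    value = 0
    for raw_line in sqlnet_ora_text.splitlines():
        line = raw_line.split("#", 1)[0].strip()  # drop comments and whitespace
        match = _EXPIRE_TIME.match(line)
        if match:
            value = int(match.group(1))  # if repeated, the last setting is reported
    return value


def dcd_enabled(sqlnet_ora_text):
    """DCD is active only when the expire time is greater than zero."""
    return dcd_expire_time(sqlnet_ora_text) > 0


if __name__ == "__main__":
    sample = "# DCD probe interval\nSQLNET.EXPIRE_TIME = 10\n"
    print(dcd_enabled(sample))
```

A result of `True` for a sqlnet.ora that is used together with TCPS would indicate the unsupported combination described above.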

If you need to keep the connection alive due to firewall issues, consider using the operating system's TCP KEEPALIVE parameters instead, for example:

TCP_KEEPIDLE (the amount of time until the first keepalive packet is sent)

TCP_KEEPCNT (the number of probes to send)

TCP_KEEPINTVL (the interval between keepalive packets)
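
To make the three parameters above concrete, here is a small sketch that sets them on a single socket. It only illustrates what each option controls and does not configure Oracle Net itself. Database connections normally pick up the operating system's defaults, which on Linux are, as an assumption about the platform, the `net.ipv4.tcp_keepalive_time`, `net.ipv4.tcp_keepalive_probes`, and `net.ipv4.tcp_keepalive_intvl` kernel settings. The function name and the numeric values are illustrative, not recommendations.

```python
import socket


def enable_keepalive(sock, idle_s=600, interval_s=60, probes=5):
    """Turn on TCP keepalive for one socket; return the per-socket options applied."""
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_KEEPALIVE, 1)
    wanted = (
        ("TCP_KEEPIDLE", idle_s),       # idle seconds before the first probe
        ("TCP_KEEPINTVL", interval_s),  # seconds between probes
        ("TCP_KEEPCNT", probes),        # number of probes to send
    )
    applied = {}
    for name, value in wanted:
        option = getattr(socket, name, None)  # not every platform exposes all three
        if option is not None:
            sock.setsockopt(socket.IPPROTO_TCP, option, value)
            applied[name] = value
    return applied


if __name__ == "__main__":
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as s:
        print(enable_keepalive(s))
```

To keep a firewall from dropping an idle connection, the idle time would need to be shorter than the firewall's own idle timeout.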

Otherwise, if you need to use DCD, you must use non-SSL Oracle Net.

Time: 2024-12-05 10:01:02
