The maddening APAR IT08059

Recently, to fix a crash problem on one of a customer's important databases, we upgraded from v97fp9 to v97fp10.

A few days later, the active log filled up. I was woken by a call from the customer early in the morning and rushed on site.

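It turned out that one application had been holding the oldest transaction log for a long time. It could not be forced, so it was most likely hung.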

I collected the diagnostic data and found the RCA within 10 minutes: we had hit APAR IT08059.

IT08059:

Interrupting a lock escalation may result in a latch not being
released, which in turn may cause subsequent latch contention,
resulting in performance degradation.
 
The problem can occur on version 9.7 fix pack 10, version 10.1
fix pack 5, or version 10.5 fix pack 5 only.
 
Analysis of the performance problem will show a latch wait on
the SQLP_LTRN_CHAIN__entry_latch.  The latch may or may not have
an owner, and the owner may or may not be the same transaction
that is requesting the latch.
 
Waiting on latch type: (SQLO_LT_SQLP_LTRN_CHAIN__entry_latch) -
Address: (7000001c2638260), Line: 106, File: sqlplsc.C

Local Fix:
Try to avoid lock escalation or interrupting a lock escalation.

So where did this APAR come from? It turns out it was introduced by the fix for APAR IT03126. What a trap.

IT03126: APPLICATION PERFORMING LOCK ESCALATION CANNOT BE FORCED

Error description:

An application that is holding a large number of locks may take
a long time to complete lock escalation.  As a result, other
applications may experience lock waits, timeouts, deadlocks.
Lock escalation is not interruptable, and as a result, the
system may appear to be hanging.  Currently, the only options
are to forcibly bring down DB2, or wait until the lock
escalation completes.

This APAR will add an enhancement to allow the lock escalation
process to be interrupted, so that an agent that is in lock
escalation can be forced.

Diagnostic data:

2015-04-01-09.21.53.295718+480 E565684697A825     LEVEL: Warning
PID     : 49152100             TID  : 37376       PROC : db2sysc 1
INSTANCE: db2inst1             NODE : 001         DB   : ******
APPHDL  : 0-9316               APPID: 10.1.72.116.9563.150401013601
AUTHID  : ******
EDUID   : 37376                EDUNAME: db2agntp (*****) 1
FUNCTION: DB2 UDB, data management, sqldEscalateLocks, probe:2
MESSAGE : ADM5500W  The database manager is performing lock escalation. The
          affected application is named "****.exe", and is associated with
          the workload name "****" and application ID
          "*****.9563.150401013601"  at member "1". The total number of
          locks currently held  is "101489", and the target number of locks to
          hold is "50744".  Reason code: "1"

-- The lock escalation took place between 09:21:53 and 09:21:55, and it was interrupted during that window.

2015-04-01-09.21.54.915969+480 I565686072A537     LEVEL: Error
PID     : 21430854             TID  : 74023       PROC : db2sysc 0
INSTANCE: db2inst1             NODE : 000         DB   : ******
APPHDL  : 0-9316               APPID: 10.1.72.116.9563.150401013601
AUTHID  : ******
EDUID   : 74023                EDUNAME: db2agent (******) 0
FUNCTION: DB2 UDB, common communication, sqlcctcptest, probe:11
MESSAGE : Detected client termination
DATA #1 : Hexdump, 2 bytes
0x0A000000473E78B2 : 0036

<StackTrace>
-------Frame------ ------Function + Offset------
0x0900000024DF105C sqloXlatchConflict + 0x23C
0x0900000024DF1320 [email protected]@clone0 + 0x78
0x0900000024E4A980 sqloxltc_track[email protected]glueBED + 0xE0
0x0900000024323B80 sqlplrm__FP8sqeAgent + 0x140
0x0900000024FAF494 sqlpxrbk__FP8sqeAgentP15SQLXA_CALL_INFOPiP9SQLP_GXIDPP11sqlo_xlatch + 0x24
0x09000000247E53F4 sqlrrbck_dps__FP8sqlrr_cbiN22P15SQLXA_CALL_INFOP9SQLP_GXID + 0x7F0
0x0900000025480E90 sqlrr_tran_router__FP8sqlrr_cb + 0x558
0x09000000253F8210 sqlrr_subagent_router__FP8sqeAgentP12SQLE_DB2RA_T + 0x32C
0x0900000023EF1BA8 sqleSubRequestRouter__FP8sqeAgentPUiT2 + 0x5F4
0x0900000023EEEA4C sqleProcessSubRequest__FP8sqeAgent + 0x764
0x09000000245B5D4C RunEDU__8sqeAgentFv + 0x2EC
0x0900000024D49AB8 EDUDriver__9sqzEDUObjFv + 0xDC
0x0900000024D3E18C sqloEDUEntry + 0x254
</StackTrace>

<LatchInformation>

Waiting on latch type: (SQLO_LT_SQLP_LTRN_CHAIN__entry_latch) - Address: (a000200000869e0), Line: 441, File: sqlplrm.C

Holding Latch type: (SQLO_LT_SQLP_TENTRY__tranEntryLatch) - Address: (a00020000086900), Line: 748, File: /view/db2_v97fp10_aix64_s141015/vbs/engn/include/sqlpi_inlines.h HoldCount: 1
Holding Latch type: (SQLO_LT_SQLP_LTRN_CHAIN__entry_latch) - Address: (a000200000869e0), Line: 458, File: sqlplcl.C HoldCount: 1
Holding Latch type: (SQLO_LT_SQLP_LTRN__cursor_latch) - Address: (a00020000086fbc), Line: 430, File: sqlplrm.C HoldCount: 1
</LatchInformation>
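The key observation in the latch section above is that the EDU is waiting on SQLO_LT_SQLP_LTRN_CHAIN__entry_latch at address a000200000869e0 while also being listed as a holder of a latch at that same address, which matches the APAR's note that the owner may be the same transaction that is requesting the latch. The check can be sketched as follows. This is only a minimal illustration that assumes the latch lines look exactly like the ones quoted in this post; the function name `self_blocked_latches` and the regular expression are hypothetical helpers, not part of any DB2 tool.

```python
import re

# Assumed line shape, taken from the trap excerpt in this post:
#   Waiting on latch type: (NAME) - Address: (HEX), Line: N, File: F
#   Holding Latch type: (NAME) - Address: (HEX), Line: N, File: F HoldCount: N
LATCH_RE = re.compile(
    r"^(?P<kind>Waiting on latch type|Holding Latch type): "
    r"\((?P<type>[^)]+)\) - Address: \((?P<addr>[0-9a-fA-F]+)\)"
)


def self_blocked_latches(text):
    """Return (latch type, address) pairs that are both waited on and held."""
    waiting, holding = [], []
    for line in text.splitlines():
        match = LATCH_RE.match(line.strip())
        if match is None:
            continue
        entry = (match.group("type"), match.group("addr").lower())
        if match.group("kind").startswith("Waiting"):
            waiting.append(entry)
        else:
            holding.append(entry)
    # A latch in both lists means the EDU is blocked on a latch it already owns.
    return [entry for entry in waiting if entry in holding]


# Latch lines copied from the excerpt above (long include path shortened).
SAMPLE = """
Waiting on latch type: (SQLO_LT_SQLP_LTRN_CHAIN__entry_latch) - Address: (a000200000869e0), Line: 441, File: sqlplrm.C
Holding Latch type: (SQLO_LT_SQLP_TENTRY__tranEntryLatch) - Address: (a00020000086900), Line: 748, File: sqlpi_inlines.h HoldCount: 1
Holding Latch type: (SQLO_LT_SQLP_LTRN_CHAIN__entry_latch) - Address: (a000200000869e0), Line: 458, File: sqlplcl.C HoldCount: 1
Holding Latch type: (SQLO_LT_SQLP_LTRN__cursor_latch) - Address: (a00020000086fbc), Line: 430, File: sqlplrm.C HoldCount: 1
"""

if __name__ == "__main__":
    for latch_type, address in self_blocked_latches(SAMPLE):
        print(f"self-blocked on {latch_type} at {address}")
```

On the sample lines, the only latch reported is the LTRN_CHAIN entry latch, taken at sqlplcl.C line 458 and requested again at sqlplrm.C line 441 during the rollback.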

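APAR IT08059 exists only in v97fp10, v10.1fp5 and v10.5fp5, which are all the latest fix packs at the moment. The Local Fix does not look very practical, so the only option we had was to roll back to fp9, and we requested an fp9 special build from the lab.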

Time: 2024-10-05 09:32:29
