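Symptoms:
The database server responded to ping, but SSH connections failed; the application and PL/SQL Developer
could not connect either. At that point the only option was to reboot the server.
Server environment: Oracle 10.2.0.1 + RHEL 5.8
After the reboot, I checked the instance alert log: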
Wed Apr 30 13:12:24 2014
Memory Notification: Library Cache Object loaded into SGA
Heap size 2210K exceeds notification threshold (2048K)
KGL object name :XDB.XDbD/PLZ01TcHgNAgAIIegtw==
Wed Apr 30 14:00:16 2014
Thread 1 advanced to log sequence 24932
  Current log# 1 seq# 24932 mem# 0: /data/oracle/product/10.2.0/db_1/oradata/urpdb/redo01.log
Wed Apr 30 15:00:16 2014
Thread 1 advanced to log sequence 24933
  Current log# 3 seq# 24933 mem# 0: /data/oracle/product/10.2.0/db_1/oradata/urpdb/redo03.log
Wed Apr 30 15:15:05 2014
Thread 1 advanced to log sequence 24934
  Current log# 2 seq# 24934 mem# 0: /data/oracle/product/10.2.0/db_1/oradata/urpdb/redo02.log
Wed Apr 30 15:16:02 2014
Thread 1 advanced to log sequence 24935
  Current log# 1 seq# 24935 mem# 0: /data/oracle/product/10.2.0/db_1/oradata/urpdb/redo01.log
Wed Apr 30 15:17:42 2014
Thread 1 cannot allocate new log, sequence 24936
Checkpoint not complete
Then I checked the system log:
Apr 30 15:37:17 wiscomApp1 kernel: [<c0456331>] out_of_memory+0x72/0x1a5
Apr 30 15:37:17 wiscomApp1 kernel: [<c0457806>] __alloc_pages+0x216/0x297
Apr 30 15:37:17 wiscomApp1 kernel: [<c0458a73>] __do_page_cache_readahead+0xc4/0x1c6
Apr 30 15:37:17 wiscomApp1 kernel: [<c045304c>] sync_page+0x0/0x3b
Apr 30 15:37:17 wiscomApp1 kernel: [<c044e161>] __delayacct_blkio_end+0x32/0x35
Apr 30 15:37:17 wiscomApp1 kernel: [<c06077cf>] __wait_on_bit_lock+0x4b/0x52
Apr 30 15:37:17 wiscomApp1 kernel: [<c0452fc7>] __lock_page+0x52/0x59
Apr 30 15:37:17 wiscomApp1 kernel: [<c04558e3>] filemap_nopage+0x151/0x312
Apr 30 15:37:17 wiscomApp1 kernel: [<c045f306>] __handle_mm_fault+0x1d0/0xb62
Apr 30 15:37:17 wiscomApp1 kernel: [<c0609886>] do_page_fault+0x2a5/0x5d3
Apr 30 15:37:17 wiscomApp1 kernel: [<c0448f0d>] audit_syscall_entry+0x14b/0x17d
Apr 30 15:37:17 wiscomApp1 kernel: [<c06095e1>] do_page_fault+0x0/0x5d3
Apr 30 15:37:17 wiscomApp1 kernel: [<c0405a71>] error_code+0x39/0x40
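These two logs show that at 13:12 the instance log warned that an object loaded into the SGA exceeded the default threshold,
and that at 15:37:17 the operating system reported an out-of-memory condition. The out-of-memory error is probably directly related to the instance.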
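To see how far apart the events in the two logs are, the timestamps can be compared directly. The following is a minimal Python sketch: the helper names are made up for illustration, and the year 2014 for the syslog entry is an assumption taken from the alert log, since syslog lines carry no year.

```python
from datetime import datetime

def parse_alert_ts(text):
    """Parse an Oracle alert-log timestamp such as 'Wed Apr 30 13:12:24 2014'."""
    return datetime.strptime(text, "%a %b %d %H:%M:%S %Y")

def parse_syslog_ts(text, year):
    """Parse a syslog timestamp such as 'Apr 30 15:37:17'; the caller supplies the year."""
    return datetime.strptime(f"{text} {year}", "%b %d %H:%M:%S %Y")

# Timestamps copied from the two logs above
notification = parse_alert_ts("Wed Apr 30 13:12:24 2014")     # Memory Notification
checkpoint_wait = parse_alert_ts("Wed Apr 30 15:17:42 2014")  # Checkpoint not complete
oom = parse_syslog_ts("Apr 30 15:37:17", 2014)                # out_of_memory trace (year assumed)

print("notification -> OOM:", oom - notification)
print("checkpoint not complete -> OOM:", oom - checkpoint_wait)
```

The memory notification comes 2:24:53 before the out-of-memory trace, and the last alert-log entry ("Checkpoint not complete") comes 0:19:35 before it.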
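Looking at the server again: it has 8 GB of physical memory, but the SGA is only 2 GB. I also happened to notice that the operating system is 32-bit Red Hat
Linux. Ugh!
My first thought was that the only way to fix this for good is to reinstall the operating system, reinstall the database, and migrate the data.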
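Later I wanted to understand what this message in the instance log means:
Memory Notification: Library Cache Object loaded into SGA
Heap size 2210K exceeds notification threshold (2048K)
I found that the article at http://blog.itpub.net/519536/viewspace-659979 analyzes it well.
For this particular system, though, it can only be a secondary issue. To really fix the problem, the approach above is still the way to go.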
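============== Key excerpt from the linked article: ===========================
In Oracle 10.2.0.1, the hidden parameter _kgl_large_heap_warning_threshold defaults to 2M.
It sets the size threshold for objects loaded into memory: when a loaded object is larger than 2M, a message is written to the alert log.
The 2M default is rather small, so this message is easy to run into on 10.2.0.1.
In 10.2.0.2 the default was raised to 50M.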
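The effect of the two defaults on the message from this system can be checked with a little arithmetic. The sketch below is illustrative Python, not Oracle code: the regular expression and function names are assumptions, and it treats 2M and 50M as 2048K and 51200K.

```python
import re

# Matches the alert-log line, e.g. "Heap size 2210K exceeds notification threshold (2048K)"
HEAP_LINE = re.compile(
    r"Heap\s+size\s+(\d+)K\s+exceeds\s+notification\s+threshold\s+\((\d+)K\)"
)

def parse_heap_notification(line):
    """Return (heap_kb, threshold_kb) from a notification line, or None if it does not match."""
    m = HEAP_LINE.search(line)
    return (int(m.group(1)), int(m.group(2))) if m else None

def would_warn(heap_kb, threshold_mb):
    """True if an object of heap_kb kilobytes is larger than a threshold given in megabytes."""
    return heap_kb > threshold_mb * 1024

heap_kb, threshold_kb = parse_heap_notification(
    "Heap size 2210K exceeds notification threshold (2048K)"
)
print(would_warn(heap_kb, 2))   # against the 10.2.0.1 default of 2M
print(would_warn(heap_kb, 50))  # against the 10.2.0.2 default of 50M
```

The 2210K object from this alert log is above the 2M default but far below 50M, so under the 10.2.0.2 default it would not have produced the message.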
Memory Notification: Library Cache Object loaded into SGA