APPLIES TO:
Oracle Database - Enterprise Edition - Version 12.1.0.2 and later
Linux x86-64
SYMPTOMS
A 12.1 RAC database fails to start via srvctl. The alert log shows the instance starting up and then going down.
alert_ORCL2.log inside <oracle base>/diag/rdbms/<db name>/<SID name>/trace:
----------------------------------------------------------------------------------------------------------------
Fri Nov 03 15:06:25 2017
Adjusting the default value of parameter parallel_max_servers
from 960 to 486 due to the value of parameter processes (600)
Starting ORACLE instance (normal) (OS id: 19684)
.
.
Fri Nov 03 15:10:45 2017
Process startup failed, error stack:
Fri Nov 03 15:10:45 2017
Errors in file /u01/app/oracle/diag/rdbms/ORCL/ORCL2/trace/ORCL2_psp0_19706.trc:
ORA-27300: OS system dependent operation:fork failed with status: 11
ORA-27301: OS failure message: Resource temporarily unavailable
ORA-27302: failure occurred at: skgpspawn3
Fri Nov 03 15:10:46 2017
Shutting down instance (abort)
License high water mark = 2
Fri Nov 03 15:10:46 2017
USER (ospid: 22067): terminating the instance
Fri Nov 03 15:10:47 2017
Instance terminated by USER, pid = 22067
Fri Nov 03 15:10:47 2017
Instance shutdown complete
Errors in file //u01/app/oracle/diag/rdbms/ORCL/ORCL2/trace/ORCL2_mmon_19853.trc (incident=14681):
ORA-00600: internal error code, arguments: [KSLGES_3], [], [], [], [], [], [], [], [], [], [], []
ORA-27300: OS system dependent operation:semop failed with status: 43
ORA-27301: OS failure message: Identifier removed
ORA-27302: failure occurred at: sskgpwwait1
Incident details in: /u01/app/oracle/diag/rdbms/ORCL/ORCL2/incident/incdir_14681/ORCL2_mmon_19853_i14681.trc
Use ADRCI or Support Workbench to package the incident.
See Note 411.1 at My Oracle Support for error and packaging details.
ORCL2_psp0_19706.trc:
-------------------------------------------------------------------------------------------------------------------
*** 2017-11-03 15:10:44.989
Process startup failed, error stack:
ORA-27300: OS system dependent operation:fork failed with status: 11
ORA-27301: OS failure message: Resource temporarily unavailable
ORA-27302: failure occurred at: skgpspawn3
*** 2017-11-03 15:10:45.993
Process startup failed, error stack:
ORA-27300: OS system dependent operation:fork failed with status: 11
ORA-27301: OS failure message: Resource temporarily unavailable
ORA-27302: failure occurred at: skgpspawn3
ORCL2_mmon_19853.trc:
-------------------------------------------------------------------------------------------------------------
*** 2017-11-03 15:09:57.908
***KELR Apply Log: unable to schedule MMON Slave, error 3
*** 2017-11-03 15:10:47.847
Incident 14681 created, dump file: /u01/app/oracle/diag/rdbms/ORCL/ORCL2/incident/incdir_14681/ORCL2_mmon_19853_i14681.trc
ORA-00600: internal error code, arguments: [KSLGES_3], [], [], [], [], [], [], [], [], [], [], []
ORA-27300: OS system dependent operation:semop failed with status: 43
ORA-27301: OS failure message: Identifier removed
ORA-27302: failure occurred at: sskgpwwait1
KEBM: MMON action policy violation. 'PQ: Adjust Slave Pool' viol=0; err=600
error 0 detected in background process
kgxgnsdr: clssgsshdereg: warning: return status 26 (-558242808 )
OPIRIP: Uncaught error 447. Error stack:
ORA-00447: fatal error in background process
ORA-00600: internal error code, arguments: [KSLGES_3], [], [], [], [], [], [], [], [], [], [], []
ORA-27300: OS system dependent operation:semop failed with status: 43
ORA-27301: OS failure message: Identifier removed
ORA-27302: failure occurred at: sskgpwwait1
kgxgnsdr: clssgsshdereg: warning: return status 26 (-813323584 )
The following error is reported in the OS logs:
messages:
2017-11-03T15:10:11.690393+01:00 rachost1 Oracle Audit[7888]: LENGTH : '200' ACTION :[52] 'ALTER DATABASE MOUNT /* db agent *//* {2:39656:2} */' DATABASE USER:[1] '/' PRIVILEGE :[6] 'SYSDBA' CLIENT USER:[6] 'oracle' CLIENT TERMINAL:[0] '' STATUS:[1] '0' DBID:[10] '2949004148'
2017-11-03T15:10:47.700094+01:00 rachost1 kernel: [ 173.788859] cgroup: fork rejected by pids controller in /system.slice/ohasd.service <<<<<<<<<<<<<<<<<<<<<
The database can be started successfully with sqlplus.
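To confirm that the pids controller is what rejects the fork, the OS log can be searched for the kernel message shown above. The following is a minimal sketch, not an Oracle-supplied tool: the function name count_pids_rejections is hypothetical, and the log path in the usage comment is an assumption about where syslog writes on the host.

```shell
#!/bin/sh
# Hypothetical helper: count kernel messages in which the pids cgroup
# controller rejected a fork inside the ohasd.service cgroup.
# $1: path to a syslog-style log file.
count_pids_rejections() {
    # grep -c prints the number of matching lines (0 when none match).
    grep -c 'fork rejected by pids controller in /system.slice/ohasd.service' "$1"
}

# Usage (log path is an assumption; adjust to the host's syslog target):
#   count_pids_rejections /var/log/messages
```

A non-zero count around the time of the failed startup points to the task limit rather than to an Oracle-side process limit.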
CAUSE
This is due to the PIDs cgroup controller introduced with SUSE12.
From the "Support for PIDs cgroup Controller" section on https://www.suse.com/releasenotes/x86_64/SUSE-SLES/12-SP2/#fate-320358
"To control the default TasksMax= setting for services and scopes running on the system, use the system.conf setting DefaultTasksMax=. This setting defaults to 512, which means services that are not explicitly configured otherwise will only be able to create 512 processes or threads at maximum.
For thread- or process-heavy services, you may need to set a higher TasksMax value. In such cases, set TasksMax directly in the specific unit files. Either choose a numeric value or even infinity."
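From SLES12 onwards, systemd is used instead of SysV init, and the ohasd service is by default allowed a maximum of 512 tasks (processes or threads). An instance started via srvctl is spawned by the Clusterware agents and so runs inside the ohasd.service cgroup, where its processes count against this limit. An instance started from sqlplus runs outside that cgroup, which is why that startup succeeds.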
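To see how close ohasd is to its limit, the TasksMax and TasksCurrent properties reported by systemctl show can be compared. The sketch below is illustrative only: the function name tasks_headroom is hypothetical, and it assumes the systemd version on the host exposes both properties as Key=Value lines.

```shell
#!/bin/sh
# Hypothetical helper: read "Key=Value" lines (as printed by `systemctl show`)
# on stdin and print how many more tasks the unit may create.
tasks_headroom() {
    awk -F= '
        $1 == "TasksMax"     { max = $2 }
        $1 == "TasksCurrent" { cur = $2 }
        END {
            # Some systemd versions print the maximum 64-bit value for "no limit".
            if (max == "infinity" || max == "18446744073709551615") { print "unlimited"; exit }
            # Missing or unset values cannot be compared.
            if (max == "" || cur == "" || cur == "[not set]") { print "unknown"; exit }
            print max - cur
        }'
}

# Usage on a live system (assumes both properties are available):
#   systemctl show ohasd.service -p TasksMax -p TasksCurrent | tasks_headroom
```

A small or zero result while the instance is starting is consistent with the fork failures shown in the trace files.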
SOLUTION
Set DefaultTasksMax to 65535 in the file /etc/systemd/system.conf, or set an appropriate TasksMax value for the ohasd systemd service.
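The first option, raising DefaultTasksMax in /etc/systemd/system.conf, could be scripted roughly as follows. This is a sketch under assumptions: the function name set_default_tasksmax is hypothetical, GNU sed is assumed for the -i option, and the file is assumed to contain only the [Manager] section so that an appended line lands in the right place.

```shell
#!/bin/sh
# Hypothetical helper: set DefaultTasksMax in a systemd manager config file.
# $1: path to the config file, $2: new value (a number or "infinity").
set_default_tasksmax() {
    conf=$1
    value=$2
    if grep -Eq '^[#[:space:]]*DefaultTasksMax=' "$conf"; then
        # Replace an existing (possibly commented-out) setting in place.
        sed -i -E "s/^[#[:space:]]*DefaultTasksMax=.*/DefaultTasksMax=$value/" "$conf"
    else
        # No such line yet: append it (assumes [Manager] is the only section).
        printf 'DefaultTasksMax=%s\n' "$value" >> "$conf"
    fi
}

# Usage as root (back up the file first):
#   set_default_tasksmax /etc/systemd/system.conf 65535
```

Note that this changes the default for every service without an explicit TasksMax, and systemd typically has to be re-executed or the host rebooted before the new default applies.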
For example, using a drop-in file for the ohasd service:
# cat /etc/systemd/system/ohasd.service.d/lunar.conf
[Service]
TasksMax=16384      <<<<<<<<<<<

# systemctl status ohasd
● ohasd.service - LSB: Start and Stop Oracle High Availability Service
   Loaded: loaded (/etc/init.d/ohasd; bad; vendor preset: disabled)
  Drop-In: /etc/systemd/system/ohasd.service.d
           └─lunar.conf
   Active: active (exited) since Mon 2017-11-13 14:29:23 CET; 3h 5min ago
     Docs: man:systemd-sysv-generator(8)
  Process: 4876 ExecStart=/etc/init.d/ohasd start (code=exited, status=0/SUCCESS)
    Tasks: 612 (limit: 16384)      <<<<<<=================== the limit here was previously 512
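A drop-in like the one shown above can be created with a few commands. The following is a minimal sketch: the function name write_tasksmax_dropin and the file name tasksmax.conf are hypothetical (any *.conf name in the drop-in directory works, as lunar.conf does in the example), and the value 16384 is simply the one used in this note.

```shell
#!/bin/sh
# Hypothetical helper: write a systemd drop-in that sets TasksMax.
# $1: drop-in directory, $2: task limit (a number or "infinity").
write_tasksmax_dropin() {
    dropin_dir=$1
    limit=$2
    mkdir -p "$dropin_dir" || return 1
    # The file name is arbitrary; systemd reads every *.conf in the directory.
    printf '[Service]\nTasksMax=%s\n' "$limit" > "$dropin_dir/tasksmax.conf"
}

# Usage as root:
#   write_tasksmax_dropin /etc/systemd/system/ohasd.service.d 16384
#   systemctl daemon-reload     # make systemd re-read unit files and drop-ins
#   systemctl status ohasd      # the "Tasks:" line should show the new limit
```

If the status output still shows the old limit after the reload, restarting the Clusterware stack on that node may be needed; that step is an assumption and is not stated in the original note.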
Original article: https://www.cnblogs.com/zylong-sys/p/12054385.html