CRS does not start GIPC error: [29] msg [gipcretConnectionRefused]

Table of Contents

  • What to do first?
  • Scenario 1: Wrong IP Address
  • Scenario 2: Filesystem full (12c)
  • Scenario 3: Firewall ON
  • References

What to do first?

  • Check your disk space using:  # df
  • Check whether a firewall is running:  # service iptables status  ( <— this check is very important )
  • Use nslookup and ping to verify your cluster interconnect
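These first checks can be put into one small script; a minimal sketch, run as root on each node (grac41int is this cluster's interconnect name, adjust for your environment):

```shell
# Quick first-pass checks before digging into CRS traces
df -hP                                        # any filesystem at 100%?
service iptables status 2>/dev/null || true   # firewall state (RHEL/OL 5/6 style)
nslookup grac41int || true                    # does the interconnect name resolve?
ping -c 2 grac41int || true                   # is the interconnect reachable?
```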
Scenario 1: Wrong IP Address

Errors:
   GIPC reports error [29] msg [gipcretConnectionRefused]
   CHM reports clsu_get_private_ip failed

Check CRS status
[root@grac41 Desktop]# crsctl check crs
CRS-4638: Oracle High Availability Services is online
CRS-4535: Cannot communicate with Cluster Ready Services
CRS-4529: Cluster Synchronization Services is online
CRS-4534: Cannot communicate with Event Manager

[root@grac41 network-scripts]# my_crs_stat_init
NAME                           TARGET     STATE           SERVER       STATE_DETAILS
-------------------------      ---------- ----------      ------------ ------------------
ora.asm                        ONLINE     OFFLINE                      Instance Shutdown
ora.cluster_interconnect.haip  ONLINE     OFFLINE
ora.crf                        ONLINE     ONLINE          grac41
ora.crsd                       ONLINE     OFFLINE
ora.cssd                       ONLINE     UNKNOWN         grac41
ora.cssdmonitor                ONLINE     ONLINE          grac41
ora.ctssd                      ONLINE     OFFLINE
ora.diskmon                    OFFLINE    OFFLINE
ora.drivers.acfs               ONLINE     ONLINE          grac41
ora.evmd                       ONLINE     OFFLINE
ora.gipcd                      ONLINE     ONLINE          grac41
ora.gpnpd                      ONLINE     ONLINE          grac41
ora.mdnsd                      ONLINE     ONLINE          grac41
--> ASM, HAIP, CRSD, CTSSD, DISKMON and EVMD resources are OFFLINE!

Check traces: the ohasd trace file
[root@grac41 ohasd]# cat ohasd.log | grep -i failed
2014-04-22 15:09:17.966: [    AGFW][2735122176]{0:0:2} ora.cluster_interconnect.haip 1 1 received state from probe request. Old state = UNKNOWN, New state = FAILED
2014-04-22 15:09:30.292: [    GPNP][2745628416]clsgpnp_getCachedProfileEx: [at clsgpnp.c:623] Result: (26) CLSGPNP_NO_PROFILE. Failed to get offline GPnP service profile.
2014-04-22 15:09:30.602: [    GPNP][2717640448]clsgpnp_getCachedProfileEx: [at clsgpnp.c:623] Result: (26) CLSGPNP_NO_PROFILE. Failed to get offline GPnP service profile.
--> HAIP goes to FAILED status 

Check for trace files that are updated again and again - maybe some RAC process keeps retrying to fix the network problem
[grid@grac41 grac41]$ date;  find . -type f -printf "%CY-%Cm-%Cd %CH:%CM:%CS  %h/%f\n" | sort -n | tail -5
Tue Apr 22 13:24:40 CEST 2014
2014-04-22 13:24:30.0571859790  ./gpnpd/gpnpd.log
2014-04-22 13:24:33.0756944610  ./agent/ohasd/oracssdmonitor_root/oracssdmonitor_root.log
2014-04-22 13:24:38.0881994320  ./ohasd/ohasd.log
2014-04-22 13:24:38.3523314350  ./gipcd/gipcd.log
2014-04-22 13:24:39.0876989250  ./crfmond/crfmond.log

[grid@grac41 grac41]$ date;  find . -type f -printf "%CY-%Cm-%Cd %CH:%CM:%CS  %h/%f\n" | sort -n | tail -5
Tue Apr 22 13:24:43 CEST 2014
2014-04-22 13:24:30.0571859790  ./gpnpd/gpnpd.log
2014-04-22 13:24:33.0756944610  ./agent/ohasd/oracssdmonitor_root/oracssdmonitor_root.log
2014-04-22 13:24:43.1007044060  ./ohasd/ohasd.log
2014-04-22 13:24:43.3668374000  ./gipcd/gipcd.log
2014-04-22 13:24:43.7580328990  ./crfmond/crfmond.log

[grid@grac41 grac41]$ date;  find . -type f -printf "%CY-%Cm-%Cd %CH:%CM:%CS  %h/%f\n" | sort -n | tail -5
Tue Apr 22 13:24:47 CEST 2014
2014-04-22 13:24:30.0571859790  ./gpnpd/gpnpd.log
2014-04-22 13:24:33.0756944610  ./agent/ohasd/oracssdmonitor_root/oracssdmonitor_root.log
2014-04-22 13:24:43.1007044060  ./ohasd/ohasd.log
2014-04-22 13:24:44.0972023860  ./crfmond/crfmond.log
2014-04-22 13:24:46.4033548850  ./gipcd/gipcd.log
--> Here we can see that ./ohasd/ohasd.log, ./gipcd/gipcd.log and ./crfmond/crfmond.log are updated every few seconds
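The repeated polling above can be done with a single re-runnable pipeline; a sketch using the file modification time (%T@) instead of the ctime fields of the original command:

```shell
# Print the 5 most recently modified files under the current directory
# (%T@ = modification time as epoch seconds; GNU find)
find . -type f -printf "%T@ %p\n" | sort -n | tail -5
# To repeat it every 5 seconds without retyping:
#   watch -n 5 'find . -type f -printf "%T@ %p\n" | sort -n | tail -5'
```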

Use tail to see what's going on:
[grid@grac41 grac41]$ tail -f ./gpnpd/gpnpd.log
2014-04-22 13:19:59.175: [  OCRMSG][4002494208]GIPC error [29] msg [gipcretConnectionRefused]
2014-04-22 13:21:29.469: [  OCRMSG][4002494208]GIPC error [29] msg [gipcretConnectionRefused]
2014-04-22 13:22:59.792: [  OCRMSG][4002494208]GIPC error [29] msg [gipcretConnectionRefused]
2014-04-22 13:24:30.057: [  OCRMSG][4002494208]GIPC error [29] msg [gipcretConnectionRefused]
2014-04-22 13:26:00.383: [  OCRMSG][4002494208]GIPC error [29] msg [gipcretConnectionRefused]
2014-04-22 13:27:30.622: [  OCRMSG][4002494208]GIPC error [29] msg [gipcretConnectionRefused]
2014-04-22 13:29:00.869: [  OCRMSG][4002494208]GIPC error [29] msg [gipcretConnectionRefused]
2014-04-22 13:30:31.203: [  OCRMSG][4002494208]GIPC error [29] msg [gipcretConnectionRefused]
2014-04-22 13:32:01.459: [  OCRMSG][4002494208]GIPC error [29] msg [gipcretConnectionRefused]
2014-04-22 13:33:31.770: [  OCRMSG][4002494208]GIPC error [29] msg [gipcretConnectionRefused]

[grid@grac41 grac41]$ tail -f ./ohasd/ohasd.log
2014-04-22 13:33:42.806: [GIPCHDEM][2222126848]gipchaDaemonInfRequest: sent local interfaceRequest,  hctx 0x2d03370 [0000000000000010] { gipchaContext : host 'grac41', name 'CLSFRAME_grac4', luid '57127705-00000000', numNode 0, numInf 0, usrFlags 0x0, flags 0x63 } to gipcd
2014-04-22 13:33:47.817: [GIPCHDEM][2222126848]gipchaDaemonInfRequest: sent local interfaceRequest,  hctx 0x2d03370 [0000000000000010] { gipchaContext : host 'grac41', name 'CLSFRAME_grac4', luid '57127705-00000000', numNode 0, numInf 0, usrFlags 0x0, flags 0x63 } to gipcd
2014-04-22 13:33:52.839: [GIPCHDEM][2222126848]gipchaDaemonInfRequest: sent local interfaceRequest,  hctx 0x2d03370 [0000000000000010] { gipchaContext : host 'grac41', name 'CLSFRAME_grac4', luid '57127705-00000000', numNode 0, numInf 0, usrFlags 0x0, flags 0x63 } to gipcd
2014-04-22 13:33:57.848: [GIPCHDEM][2222126848]gipchaDaemonInfRequest: sent local interfaceRequest,  hctx 0x2d03370 [0000000000000010] { gipchaContext : host 'grac41', name 'CLSFRAME_grac4', luid '57127705-00000000', numNode 0, numInf 0, usrFlags 0x0, flags 0x63 } to gipcd
2014-04-22 13:34:03.859: [GIPCHDEM][2222126848]gipchaDaemonInfRequest: sent local interfaceRequest,  hctx 0x2d03370 [0000000000000010] { gipchaContext : host 'grac41', name 'CLSFRAME_grac4', luid '57127705-00000000', numNode 0, numInf 0, usrFlags 0x0, flags 0x63 } to gipcd
2014-04-22 13:34:09.874: [GIPCHDEM][2222126848]gipchaDaemonInfRequest: sent local interfaceRequest,  hctx 0x2d03370 [0000000000000010] { gipchaContext : host 'grac41', name 'CLSFRAME_grac4', luid '57127705-00000000', numNode 0, numInf 0, usrFlags 0x0, flags 0x63 } to gipcd
2014-04-22 13:34:15.881: [GIPCHDEM][2222126848]gipchaDaemonInfRequest: sent local interfaceRequest,  hctx 0x2d03370 [0000000000000010] { gipchaContext : host 'grac41', name 'CLSFRAME_grac4', luid '57127705-00000000', numNode 0, numInf 0, usrFlags 0x0, flags 0x63 } to gipcd
2014-04-22 13:34:20.900: [GIPCHDEM][2222126848]gipchaDaemonInfRequest: sent local interfaceRequest,  hctx 0x2d03370 [0000000000000010] { gipchaContext : host 'grac41', name 'CLSFRAME_grac4', luid '57127705-00000000', numNode 0, numInf 0, usrFlags 0x0, flags 0x63 } to gipcd
2014-04-22 13:34:25.920: [GIPCHDEM][2222126848]gipchaDaemonInfRequest: sent local interfaceRequest,  hctx 0x2d03370 [0000000000000010] { gipchaContext : host 'grac41', name 'CLSFRAME_grac4', luid '57127705-00000000', numNode 0, numInf 0, usrFlags 0x0, flags 0x63 } to gipcd
2014-04-22 13:34:30.934: [GIPCHDEM][2222126848]gipchaDaemonInfRequest: sent local interfaceRequest,  hctx 0x2d03370 [0000000000000010] { gipchaContext : host 'grac41', name 'CLSFRAME_grac4', luid '57127705-00000000', numNode 0, numInf 0, usrFlags 0x0, flags 0x63 } to gipcd

[grid@grac41 grac41]$ tail -f ./crfmond/crfmond.log
[   CLWAL][467654400]clsw_Initialize: OLR initlevel [70000]
2014-04-22 13:34:49.349: [    CRFM][467654400]crfm_connstr: clsu_get_private_ip failed(7).
2014-04-22 13:34:49.458: [    CRFM][467654400]crfm_connect_to: send fail(gipcret: 13)
2014-04-22 13:34:49.458: [    CRFM][467654400]crfmctx dump follows
2014-04-22 13:34:49.458: [    CRFM][467654400]****************************
2014-04-22 13:34:49.458: [    CRFM][467654400]crfm_dumpctx: connection local name: tcp://0.0.0.0:45871
2014-04-22 13:34:49.458: [    CRFM][467654400]crfm_dumpctx: connection peer name:  tcp://192.168.1.101:61021
2014-04-22 13:34:49.458: [    CRFM][467654400]crfm_dumpctx: connaddr:  tcp://grac41:61021
2014-04-22 13:34:49.458: [    CRFM][467654400]crfm_dumpctx: ctype:  2
2014-04-22 13:34:49.458: [    CRFM][467654400]crfm_dumpctx: mytype:  0
2014-04-22 13:34:49.458: [    CRFM][467654400]crfm_dumpctx: hostname  grac41
2014-04-22 13:34:49.458: [    CRFM][467654400]crfm_dumpctx: myport:
2014-04-22 13:34:49.458: [    CRFM][467654400]crfm_dumpctx: rhostname
2014-04-22 13:34:49.458: [    CRFM][467654400]crfm_dumpctx: rport:
2014-04-22 13:34:49.458: [    CRFM][467654400]crfm_dumpctx: flags:  1
2014-04-22 13:34:49.458: [    CRFM][467654400]****************************

According to the traces above, clsu_get_private_ip failed(7): CHM cannot determine the private IP and instead tries to connect to tcp://192.168.1.101 (the address the hostname grac41 resolves to)

Check Network status and DNS
[root@grac41 Desktop]# ifconfig
eth1      Link encap:Ethernet  HWaddr 08:00:27:89:E9:A2
          inet addr:192.168.2.101  Bcast:192.168.2.255  Mask:255.255.255.0
          inet6 addr: fe80::a00:27ff:fe89:e9a2/64 Scope:Link
          UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
          RX packets:17148 errors:0 dropped:0 overruns:0 frame:0
          TX packets:13307 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:1000
          RX bytes:22041591 (21.0 MiB)  TX bytes:1211055 (1.1 MiB)
          Interrupt:9 Base address:0xd240 

eth2      Link encap:Ethernet  HWaddr 08:00:27:6B:E2:BD
          inet addr:192.168.1.101  Bcast:192.168.1.255  Mask:255.255.255.0
          inet6 addr: fe80::a00:27ff:fe6b:e2bd/64 Scope:Link
          UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
          RX packets:17517 errors:0 dropped:0 overruns:0 frame:0
          TX packets:13475 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:1000
          RX bytes:22191772 (21.1 MiB)  TX bytes:1230703 (1.1 MiB)
          Interrupt:5 Base address:0xd260
--> Check the public and private interfaces for errors: looks good

[root@grac41 Desktop]# nslookup grac41
Name:    grac41.example.com
Address: 192.168.1.101

[root@grac41 Desktop]# nslookup grac41int
Name:    grac41int.example.com
Address: 192.168.2.101

[root@grac41 Desktop]# nslookup 192.168.1.101
101.1.168.192.in-addr.arpa    name = grac41.example.com.

[root@grac41 Desktop]# nslookup 192.168.2.101
101.2.168.192.in-addr.arpa    name = grac41int.example.com.
--> DNS and network seem to be OK
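The four lookups above can be bundled into a small helper that checks that forward and reverse DNS agree; a sketch using getent (check_dns is a hypothetical name, not an Oracle tool):

```shell
# Verify that forward and reverse DNS agree for a host name
check_dns() {
  host=$1
  ip=$(getent hosts "$host" | awk '{print $1; exit}')
  [ -n "$ip" ] || { echo "$host: forward lookup FAILED"; return 1; }
  name=$(getent hosts "$ip" | awk '{print $2; exit}')
  case $name in
    "$host"|"$host".*) echo "$host -> $ip -> $name: OK" ;;
    *)                 echo "$host -> $ip -> $name: MISMATCH"; return 1 ;;
  esac
}
check_dns grac41    || true   # public name
check_dns grac41int || true   # interconnect name
```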

Restart CRS

[root@grac41 Desktop]# crsctl stop crs -f
CRS-2791: Starting shutdown of Oracle High Availability Services-managed resources on 'grac41'
CRS-2673: Attempting to stop 'ora.crf' on 'grac41'
CRS-2673: Attempting to stop 'ora.ctssd' on 'grac41'
CRS-2673: Attempting to stop 'ora.evmd' on 'grac41'
...
CRS-2673: Attempting to stop 'ora.gpnpd' on 'grac41'
CRS-2677: Stop of 'ora.gpnpd' on 'grac41' succeeded
CRS-2793: Shutdown of Oracle High Availability Services-managed resources on 'grac41' has completed
CRS-4133: Oracle High Availability Services has been stopped.

Cleanup /var/tmp/.oracle
# rm  /var/tmp/.oracle/*
[root@grac41 Desktop]# crsctl start crs
[root@grac41 Desktop]# crsctl check crs
CRS-4638: Oracle High Availability Services is online
CRS-4535: Cannot communicate with Cluster Ready Services
CRS-4529: Cluster Synchronization Services is online
CRS-4534: Cannot communicate with Event Manager
--> Problem persists

Check OS logfile
#  cat /var/log/messages
--> Nothing related

Run ocrcheck (and ocrdump) to check whether we can access our OCR repository
[root@grac41 Desktop]# ocrcheck
Status of Oracle Cluster Registry is as follows :
     Version                  :          3
     Total space (kbytes)     :     262120
     Used space (kbytes)      :       4076
     Available space (kbytes) :     258044
     ID                       :  630679368
     Device/File Name         :       +OCR
                                    Device/File integrity check succeeded
                                    Device/File not configured
                                    Device/File not configured
                                    Device/File not configured
                                    Device/File not configured
     Cluster registry integrity check succeeded
     Logical corruption check succeeded

Query the voting disks:
[grid@grac41 grac41]$ crsctl query css votedisk
##  STATE    File Universal Id                File Name Disk group
--  -----    -----------------                --------- ---------
 1. ONLINE   b0e94e5d83054fe9bf58b6b98bfacd65 (/dev/asmdisk1_udev_sdf1) [OCR]
 2. ONLINE   88c2a08b4c8c4f85bf0109e0990388e4 (/dev/asmdisk1_udev_sdg1) [OCR]
 3. ONLINE   1108f9a41e814fb2bfed879ff0039dd0 (/dev/asmdisk1_udev_sdh1) [OCR]
Located 3 voting disk(s).

Debugging the GIPCD and GPnPD daemons using strace
As the GIPCD and GPnPD daemon traces get updated every 5 s, let's check the gipcd process with strace:
# ps -elf | egrep 'gpnpd.bin|gipcd.bin'
# strace -t -f -p 24376 2>&1 | grep '192.168' | grep eth
[pid 24872] 09:17:28 <... ioctl resumed> 200, {{"lo", {AF_INET, inet_addr("127.0.0.1")}}, {"eth0", {AF_INET, inet_addr("10.0.2.15")}}, {"eth1", {AF_INET, inet_addr("192.168.2.101")}}, {"eth2", {AF_INET, inet_addr("192.168.1.101")}}, {"virbr0", {AF_INET, inet_addr("192.168.122.1")}}}}) = 0
[pid 24870] 09:17:28 <... ioctl resumed> , {ifr_name="eth1", ifr_addr={AF_INET, inet_addr("192.168.2.101")}}) = 0
[pid 24870] 09:17:28 <... ioctl resumed> , {ifr_name="eth1", ifr_broadaddr={AF_INET, inet_addr("192.168.2.255")}}) = 0
[pid 24872] 09:17:28 <... ioctl resumed> , {ifr_name="eth1", ifr_addr={AF_INET, inet_addr("192.168.2.101")}}) = 0
[pid 24870] 09:17:28 <... ioctl resumed> , {ifr_name="eth2", ifr_addr={AF_INET, inet_addr("192.168.1.101")}}) = 0
[pid 24872] 09:17:28 <... ioctl resumed> , {ifr_name="eth1", ifr_broadaddr={AF_INET, inet_addr("192.168.2.255")}}) = 0
[pid 24870] 09:17:28 <... ioctl resumed> , {ifr_name="eth2", ifr_broadaddr={AF_INET, inet_addr("192.168.1.255")}}) = 0
[pid 24872] 09:17:28 <... ioctl resumed> , {ifr_name="eth2", ifr_addr={AF_INET, inet_addr("192.168.1.101")}}) = 0
[pid 24872] 09:17:28 <... ioctl resumed> , {ifr_name="eth2", ifr_broadaddr={AF_INET, inet_addr("192.168.1.255")}}) = 0
..
[pid 24872] 09:17:33 <... ioctl resumed> 200, {{"lo", {AF_INET, inet_addr("127.0.0.1")}}, {"eth0", {AF_INET, inet_addr("10.0.2.15")}}, {"eth1", {AF_INET, inet_addr("192.168.2.101")}}, {"eth2", {AF_INET, inet_addr("192.168.1.101")}}, {"virbr0", {AF_INET, inet_addr("192.168.122.1")}}}}) = 0
[pid 24870] 09:17:33 <... ioctl resumed> , {ifr_name="eth1", ifr_addr={AF_INET, inet_addr("192.168.2.101")}}) = 0
[pid 24870] 09:17:33 <... ioctl resumed> , {ifr_name="eth1", ifr_broadaddr={AF_INET, inet_addr("192.168.2.255")}}) = 0
[pid 24870] 09:17:33 <... ioctl resumed> , {ifr_name="eth2", ifr_addr={AF_INET, inet_addr("192.168.1.101")}}) = 0
[pid 24870] 09:17:33 <... ioctl resumed> , {ifr_name="eth2", ifr_broadaddr={AF_INET, inet_addr("192.168.1.255")}}) = 0
[pid 24872] 09:17:33 <... ioctl resumed> , {ifr_name="eth1", ifr_addr={AF_INET, inet_addr("192.168.2.101")}}) = 0
[pid 24872] 09:17:33 <... ioctl resumed> , {ifr_name="eth1", ifr_broadaddr={AF_INET, inet_addr("192.168.2.255")}}) = 0
[pid 24872] 09:17:33 <... ioctl resumed> , {ifr_name="eth2", ifr_addr={AF_INET, inet_addr("192.168.1.101")}}) = 0
..
--> Again we don't get an OS error, but the same ioctl() calls are repeated over and over.
    It seems gipcd is not satisfied with the interface information returned by ioctl() and re-reads it every 5 seconds.

Check the GPnP profile
[root@grac41 Desktop]# gpnptool get > profile.xml
Open profile.xml and look at the adapter usage:
<gpnp:Network-Profile><gpnp:HostNetwork id="gen" HostName="*">
   <gpnp:Network id="net1" IP="192.168.1.0" Adapter="eth1" Use="public"/>
   <gpnp:Network id="net2" IP="192.168.2.0" Adapter="eth2" Use="cluster_interconnect"/>
Verify with ifconfig
[root@grac41 Desktop]# ifconfig | egrep 'HWaddr|inet addr'
eth1      Link encap:Ethernet  HWaddr 08:00:27:89:E9:A2
          inet addr:192.168.2.101  Bcast:192.168.2.255  Mask:255.255.255.0
eth2      Link encap:Ethernet  HWaddr 08:00:27:6B:E2:BD
          inet addr:192.168.1.101  Bcast:192.168.1.255  Mask:255.255.255.0
          inet addr:127.0.0.1  Mask:255.0.0.0
--> eth1 is using 192.168.2.101, but according to the GPnP profile it should use 192.168.1.101
    eth2 is using 192.168.1.101, but according to the GPnP profile it should use 192.168.2.101
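A trivial way to make the mismatch explicit is to reduce each interface address to its /24 network (all netmasks here are 255.255.255.0) and compare it with the IP attribute of the matching gpnp:Network element; net24 is a hypothetical helper:

```shell
# Reduce an IPv4 address to its /24 network for comparison with the profile
net24() { echo "$1" | awk -F. '{print $1 "." $2 "." $3 ".0"}'; }
net24 192.168.2.101   # eth1's actual network -> 192.168.2.0 (profile says 192.168.1.0)
net24 192.168.1.101   # eth2's actual network -> 192.168.1.0 (profile says 192.168.2.0)
```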

Problem found:
During manual editing of ifcfg-eth1 and ifcfg-eth2 ( /etc/sysconfig/network-scripts ) the HWADDR entries were filled in wrongly (swapped)

Reconfigure the network and restart CRS
[root@grac41 network-scripts]# cat ifcfg-eth2
HWADDR=08:00:27:89:E9:A2
IPADDR=192.168.2.101
NAME=eth2
[root@grac41 network-scripts]# cat ifcfg-eth1
IPADDR=192.168.1.101
NAME=eth1
HWADDR=08:00:27:6B:E2:BD
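With the two HWADDR values swapped, the corrected files (only the relevant lines shown; MACs taken from the ifconfig output) would look like:

```shell
# /etc/sysconfig/network-scripts/ifcfg-eth1
HWADDR=08:00:27:89:E9:A2
IPADDR=192.168.1.101
NAME=eth1

# /etc/sysconfig/network-scripts/ifcfg-eth2
HWADDR=08:00:27:6B:E2:BD
IPADDR=192.168.2.101
NAME=eth2
```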

After changing the HWADDR entries to match the ifconfig output above, the network looks good:
[root@grac41 network-scripts]# service network restart
[root@grac41 network-scripts]# ifconfig | egrep 'HWaddr|inet addr'
eth1      Link encap:Ethernet  HWaddr 08:00:27:89:E9:A2
          inet addr:192.168.1.101  Bcast:192.168.1.255  Mask:255.255.255.0
eth2      Link encap:Ethernet  HWaddr 08:00:27:6B:E2:BD
          inet addr:192.168.2.101  Bcast:192.168.2.255  Mask:255.255.255.0

Restart CRS
[root@grac41 network-scripts]# crsctl stop crs -f
[root@grac41 network-scripts]# crsctl start crs
[root@grac41 network-scripts]# crsctl check cluster -all
**************************************************************
grac41:
CRS-4537: Cluster Ready Services is online
CRS-4529: Cluster Synchronization Services is online
CRS-4533: Event Manager is online
**************************************************************
grac42:
CRS-4537: Cluster Ready Services is online
CRS-4529: Cluster Synchronization Services is online
CRS-4533: Event Manager is online
**************************************************************
grac43:
CRS-4537: Cluster Ready Services is online
CRS-4529: Cluster Synchronization Services is online
CRS-4533: Event Manager is online
**************************************************************

Lessons learned
 - Verify carefully that IP addresses and network device names are consistent cluster-wide

Scenario 2: Filesystem full (12c)
[root@gract1 Desktop]# crsi
*****  Local Resources: *****
Resource NAME               INST   TARGET       STATE        SERVER          STATE_DETAILS
--------------------------- ----   ------------ ------------ --------------- -----------------------------------------
ora.asm                        1   ONLINE       OFFLINE      -               STABLE
ora.cluster_interconnect.haip  1   ONLINE       OFFLINE      -               STABLE
ora.crf                        1   ONLINE       OFFLINE      -               STABLE
ora.crsd                       1   ONLINE       OFFLINE      -               STABLE
ora.cssd                       1   ONLINE       OFFLINE      -               STABLE
ora.cssdmonitor                1   OFFLINE      OFFLINE      -               STABLE
ora.ctssd                      1   ONLINE       OFFLINE      -               STABLE
ora.diskmon                    1   OFFLINE      OFFLINE      -               STABLE
ora.drivers.acfs               1   ONLINE       ONLINE       gract1          STABLE
ora.evmd                       1   ONLINE       OFFLINE      gract1          STARTING
ora.gipcd                      1   ONLINE       OFFLINE      -               STABLE
ora.gpnpd                      1   ONLINE       OFFLINE      -               STABLE
ora.mdnsd                      1   ONLINE       OFFLINE      gract1          STARTING
ora.storage                    1   ONLINE       OFFLINE      -               STABLE

Related client trace
2014-08-22 10:57:07.750: [  OCRMSG][2296473152]prom_waitconnect: CONN NOT ESTABLISHED (0,29,1,2)
2014-08-22 10:57:07.750: [  OCRMSG][2296473152]GIPC error [29] msg [gipcretConnectionRefused]
2014-08-22 10:57:07.750: [  OCRMSG][2296473152]prom_connect: error while waiting for connection complete [24]
2014-08-22 10:57:07.821: [  OCRMSG][2296473152]prom_waitconnect: CONN NOT ESTABLISHED (0,29,1,2)
2014-08-22 10:57:07.821: [  OCRMSG][2296473152]GIPC error [29] msg [gipcretConnectionRefused]
2014-08-22 10:57:07.821: [  OCRMSG][2296473152]prom_connect: error while waiting for connection complete [24]

Root cause: filesystem 100% full, so no traces can be written
# df -k
Filesystem           1K-blocks      Used Available Use% Mounted on
/dev/mapper/vg_oel64-lv_root
                      39603624  37798864         0 100% /
tmpfs                  4194304       272   4194032   1% /dev/shm
/dev/sda1               495844    101751    368493  22% /boot
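To confirm the full filesystem and locate the biggest space consumers (the Grid home path below is an example, adjust to your installation):

```shell
# How full is the root filesystem?
df -P / | awk 'NR==2 {print $5, "used on", $6}'
# Which trace directories consume the most space? (path is an example)
du -sk /u01/app/12.1.0/grid/log/* 2>/dev/null | sort -n | tail -5
```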
Scenario 3: Firewall ON
*****  Cluster Resources: *****
Resource NAME               INST   TARGET    STATE        SERVER          STATE_DETAILS
--------------------------- ----   ------------ ------------ --------------- -----------------------------------------
ora.asm                        1   ONLINE    OFFLINE      -               STABLE
ora.cluster_interconnect.haip  1   ONLINE    OFFLINE      -               STABLE
ora.crf                        1   ONLINE    OFFLINE      -               STABLE
ora.crsd                       1   ONLINE    OFFLINE      -               STABLE
ora.cssd                       1   ONLINE    OFFLINE      -               STABLE
ora.cssdmonitor                1   ONLINE    ONLINE       gract2          STABLE
ora.ctssd                      1   ONLINE    OFFLINE      -               STABLE
ora.diskmon                    1   OFFLINE    OFFLINE      -               STABLE
ora.drivers.acfs               1   ONLINE    ONLINE       gract2          STABLE
ora.evmd                       1   ONLINE    INTERMEDIATE gract2          STABLE
ora.gipcd                      1   ONLINE    ONLINE       gract2          STABLE
ora.gpnpd                      1   ONLINE    ONLINE       gract2          STABLE
ora.mdnsd                      1   ONLINE    ONLINE       gract2          STABLE
ora.storage                    1   ONLINE    OFFLINE      -               STABLE

--> CSSD doesn't become ONLINE

Client log:
2014-08-23 11:49:21.920: [  OCRMSG][2580342528]GIPC error [29] msg [gipcretConnectionRefused]
2014-08-23 11:49:42.948: [  OCRMSG][2580342528]GIPC error [29] msg [gipcretConnectionRefused]
2014-08-23 11:50:10.978: [  OCRMSG][2580342528]GIPC error [29] msg [gipcretConnectionRefused]
2014-08-23 11:50:46.008: [  OCRMSG][2580342528]GIPC error [29] msg [gipcretConnectionRefused]
2014-08-23 11:51:28.042: [  OCRMSG][2580342528]GIPC error [29] msg [gipcretConnectionRefused]
2014-08-23 11:51:28.042: [  OCRMSG][2580342528]GIPC error [29] msg [gipcretConnectionRefused]

Matching strace output (the connect to the local OCR socket is refused):
20665 <... connect resumed> )           = 0
20665 connect(66, {sa_family=AF_FILE, path="/var/tmp/.oracle/sOHASD_UI_SOCKET"}, 110 <unfinished ...>
20665 <... connect resumed> )           = 0
20665 connect(73, {sa_family=AF_FILE, path="/var/tmp/.oracle/sprocr_local_conn_0_PROC"}, 110 <unfinished ...>
20665 <... connect resumed> )           = -1 ECONNREFUSED (Connection refused)

ocssd.log:
2014-08-23 12:32:58.427: [    CSSD][1279260416]clssnmvDHBValidateNCopy: node 1, gract1, has a disk HB, but no network HB,
         DHB has rcfg 304252836, wrtcnt, 3207223, LATS 4294823390, lastSeqNo 3207220, uniqueness 1408783210, timestamp 1408789980/5988764
2014-08-23 12:32:58.427: [    CSSD][1283991296]clssnmvDHBValidateNCopy: node 1, gract1, has a disk HB, but no network HB,
         DHB has rcfg 304252836, wrtcnt, 3207224, LATS 4294823390, lastSeqNo 3207221, uniqueness 1408783210, timestamp 1408789980/5988864
Fix: disable the firewall on all nodes
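On this platform (RHEL/OL with the iptables service) disabling the firewall would typically look like the following, run as root on every node; these are the standard RHEL 5/6 service controls, adapt for firewalld on RHEL 7 and later:

```shell
service iptables stop      # stop the firewall now
chkconfig iptables off     # keep it off across reboots
service iptables status    # verify it is no longer running
```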
References
Grid Infrastructure Redundant Interconnect and ora.cluster_interconnect.haip (Doc ID 1210883.1)
Grid Infrastructure Installation root.sh Failed with “Failed to start CTSS” (Doc ID 1277307.1)
Troubleshoot Grid Infrastructure Startup Issues (Doc ID 1050908.1)
Top 5 Grid Infrastructure Startup Issues (Doc ID 1368382.1)

Original article: https://www.cnblogs.com/chendian0/p/10880576.html
