Oracle RAC Crash Analysis and Troubleshooting

The instance on node 2 of a customer's RAC database went down on its own. The analysis report follows.

I. Review of symptoms:

When node 2 failed, its alert log showed the following:

Thread 2 advanced to log sequence 77740 (LGWR switch)
  Current log# 24 seq# 77740 mem# 0: /dev/rcrm4_rd2_91_2G
  Current log# 24 seq# 77740 mem# 1: /dev/rcrm4_rd2_92_2G
Mon Mar 28 09:45:23 BEIST 2011
PMON failed to acquire latch, see PMON dump
Mon Mar 28 09:46:52 BEIST 2011
PMON failed to acquire latch, see PMON dump
Mon Mar 28 09:47:52 BEIST 2011
PMON failed to acquire latch, see PMON dump
Mon Mar 28 09:47:53 BEIST 2011
Thread 2 advanced to log sequence 77741 (LGWR switch)
  Current log# 25 seq# 77741 mem# 0: /dev/rcrm5_rd2_101_2G
  Current log# 25 seq# 77741 mem# 1: /dev/rcrm5_rd2_102_2G
Mon Mar 28 09:48:52 BEIST 2011
PMON failed to acquire latch, see PMON dump
Mon Mar 28 09:49:52 BEIST 2011
PMON failed to acquire latch, see PMON dump
Mon Mar 28 09:50:52 BEIST 2011
PMON failed to acquire latch, see PMON dump
Mon Mar 28 09:51:52 BEIST 2011
PMON failed to acquire latch, see PMON dump
Mon Mar 28 09:52:53 BEIST 2011
PMON failed to acquire latch, see PMON dump
Mon Mar 28 09:53:53 BEIST 2011
PMON failed to acquire latch, see PMON dump
Mon Mar 28 09:54:53 BEIST 2011
PMON failed to acquire latch, see PMON dump
Mon Mar 28 09:55:53 BEIST 2011
PMON failed to acquire latch, see PMON dump
Mon Mar 28 09:56:53 BEIST 2011
PMON failed to acquire latch, see PMON dump
Mon Mar 28 09:57:53 BEIST 2011
PMON failed to acquire latch, see PMON dump
Mon Mar 28 09:58:44 BEIST 2011
Received an instance abort message from instance 1 (reason 0x0)
Please check instance 1 alert and LMON trace files for detail.
LMD0: terminating instance due to error 481
Mon Mar 28 09:58:45 BEIST 2011
System state dump is made for local instance
System State dumped to trace file /opt/oracle/admin/crmdb/bdump/crmdb2_diag_3732132.trc
Mon Mar 28 09:58:45 BEIST 2011
Shutting down instance (abort)
License high water mark = 2290

Judging from the alert log, the main symptom on node 2 was that the PMON process could not acquire a latch (PMON failed to acquire latch