环境是一套11.2.0.3 2nodes RAC on hpux-ia31, alert中出现ora-32701 hangmgr错误, 从trace文件中发现是m000进程是mmon的辅助进程,用于flush AWR相关数据,有一个wait event: enq: WF – contention, 这也是flush AWR数据时相关的enqueue等待,但是blocker进程是not in wait, (另mmon hang是可以直接kill 该进程spid, 通常会在不重启实例的前掉下重启该进程)。 这里简单的记录该问题
# db alert
Thread 2 advanced to log sequence 170773 (LGWR switch)
Current log# 8 seq# 170773 mem# 0: /dev/tza_oravg02/rtza_redo08
Thu May 26 17:08:11 2016
Errors in file /oracle/app/oracle/diag/rdbms/ucisa/anbob2/trace/anbob2_dia0_29439.trc (incident=720137):
ORA-32701: Possible hangs up to hang ID=0 detected
Incident details in: /oracle/app/oracle/diag/rdbms/ucisa/anbob2/incident/incdir_720137/anbob2_dia0_29439_i720137.trc
Thu May 26 17:08:12 2016
Sweep [inc][720137]: completed
System State dumped to trace file /oracle/app/oracle/diag/rdbms/ucisa/anbob2/incident/incdir_720137/anbob2_m001_22862_i720137_a.trc
DIA0 terminating blocker (ospid: 13964 sid: 5649 ser#: 7241) of hang with ID = 3
requested by master DIA0 process on instance 1
Hang Resolution Reason: Although the number of affected sessions did not
justify automatic hang resolution initially, this previously ignored
hang was automatically resolved.
by terminating session sid: 5649 ospid: 13964
DIA0 successfully terminated session sid:5649 ospid:13964 with status 31.
Thu May 26 17:09:46 2016
Errors in file /oracle/app/oracle/diag/rdbms/ucisa/anbob2/trace/anbob2_dia0_29439.trc (incident=720138):
ORA-32701: Possible hangs up to hang ID=0 detected
Incident details in: /oracle/app/oracle/diag/rdbms/ucisa/anbob2/incident/incdir_720138/anbob2_dia0_29439_i720138.trc
Thu May 26 17:09:47 2016
Sweep [inc][720138]: completed
Thu May 26 17:09:47 2016
Sweep [inc2][720138]: completed
Sweep [inc2][720137]: completed
System State dumped to trace file /oracle/app/oracle/diag/rdbms/ucisa/anbob2/incident/incdir_720138/anbob2_m003_23913_i720138_a.trc
DIA0 terminating blocker (ospid: 13964 sid: 5649 ser#: 7241) of hang with ID = 3
requested by master DIA0 process on instance 1
Hang Resolution Reason: Although the number of affected sessions did not
justify automatic hang resolution initially, this previously ignored
hang was automatically resolved.
by terminating the process
DIA0 successfully terminated process ospid:13964.
Thu May 26 17:17:16 2016
Thread 2 advanced to log sequence 170774 (LGWR switch)
Current log# 9 seq# 170774 mem# 0: /dev/tza_oravg02/rtza_redo09
# trace file: /oracle/app/oracle/diag/rdbms/ucisa/anbob2/trace/anbob2_dia0_29439.trc
Dump continued from file: /oracle/app/oracle/diag/rdbms/ucisa/anbob2/trace/anbob2_dia0_29439.trc
ORA-32701: Possible hangs up to hang ID=0 detected
========= Dump for incident 720137 (ORA 32701) ========
----- Beginning of Customized Incident Dump(s) -----
There are resolvable hangs on your system. Hang Manger will
attempt to resolve these hangs. Some information about these
is output below. Complete information is available in an
incident trace file on instance 1.
If the hang is to be resolved by terminating the session or
process that is the root or victim of the hang, additional
information will be output on the local instance of that
session or process. Below are the hangs for which resolution
will be attempted.
*** 2016-05-26 17:08:11.765
Resolvable Hangs in the System
Root Chain Total Hang
Hang Hang Inst Root #hung #hung Hang Hang Resolution
ID Type Status Num Sess Sess Sess Conf Span Action
----- ---- -------- ---- ----- ----- ----- ------ ------ -------------------
3 HANG RSLNPEND 2 5649 2 2 HIGH GLOBAL Terminate Process
Hang Resolution Reason: Although the number of affected sessions did not
justify automatic hang resolution initially, this previously ignored
hang was automatically resolved.
inst# SessId Ser# OSPID PrcNm Event
----- ------ ----- --------- ----- -----
1 6238 38599 3697 M000 enq: WF - contention
2 5649 7241 13964 M000 not in wait
Dumping process info of pid[212.13964] (sid:5649, ser#:7241)
requested by master DIA0 process on instance 1.
*** 2016-05-26 17:08:11.765
Process diagnostic dump for oracle@qdtza2 (M000), OS id=13964,
pid: 212, proc_ser: 66, sid: 5649, sess_ser: 7241
-------------------------------------------------------------------------------
os thread scheduling delay history: (sampling every 1.000000 secs)
0.000000 secs at [ 17:08:11 ]
NOTE: scheduling delay has not been sampled for 0.103181 secs 0.000667 secs from [ 17:08:07 - 17:08:12 ], 5 sec avg
0.003777 secs from [ 17:07:12 - 17:08:12 ], 1 min avg
0.000976 secs from [ 17:03:12 - 17:08:12 ], 5 min avg
loadavg : 0.19 0.18 0.17 <<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<
Swapinfo :
Avail = 380207.03Mb Used = 165771.44Mb
Swap free = 214385.22Mb Kernel rsvd = 15198.44Mb
Free Mem = 83933.84Mb
F S UID PID PPID C PRI NI ADDR SZ WCHAN STIME TTY TIME COMD
1401 R oracle 13964 1 254 178 20 e000000d77ef2040 102092 - 17:00:09 ? 7:57 ora_m000_anbob2 <<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<
Short stack dump:
ksedsts()+544<-ksdxfstk()+48<-ksdxcb()+3216<-sspuser()+688<-<-kghalf()+385<-kqlfbctc()+1024<-kgligi()+128<-kglic_cbk()+832<-kglic0()+1376<-kglic()+144<-kqlfbct()+560<-qerfxFetch()+2784<-qerjotRowProc()+2432<-qerhjWalkHashBucket()+768<-qerhjInnerProbeHashTable()+1168<-qerfxFetch()+1680<-rwsfcd()+320<-qerhjFetch()+1040<-qerjotFetch()+3664<-rwsfcd()+320<-qerltcFetch()+1696<-insexe()+1584<-opiexe()+16192<-kpoal8()+4624<-opiodr()+2416<-kpoodrc()+64<-rpiswu2()+1120<-kpoodr()+1008<-upirtrc()+2640<-kpurcsc()+256<-kpuexec()+6672<-OCIStmtExecute()+80<-kewrose_oci_stmt_exec()+96<-kewrgwxf1_gwrsql_exft_1()+576<-kewrgwxf_gwrsql_exft()+928<-kewrftbs_flush_table_by_sql()+352<-kewrfabt_flush_attribute_table()+1120<-kewrfsb_flush_sqlbind()+80<-kewrft_flush_table()+800<-kewrftec_flush_table_ehdlcx()+832<-kewrftsq_flush_topsql()+624<-kewrft_flush_table()+800<-kewrftec_flush_table_ehdlcx()+832<-kewrfat_flush_all_tables()+2144<-kewrfos_flush_onesnap()+272<-kewrfsc_flush_snapshot_c()+832<-kewrafs_auto_flush_slave()+1008<-kebm_slave_main()+944<-ksvrdp()+5440<-opirip()+1360<-opidrv()+1152<-sou2o()+256<-opimai_real()+352<-ssthrdmain()+576<-main()+336<-main_opd_entry()+80
-------------------------------------------------------------------------------
Process diagnostic dump actual duration=0.392000 sec
(max dump time=15.000000 sec)
<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<
*** 2016-05-26 17:08:12.156
current sql: insert into wrh$_sql_bind_metadata (snap_id, dbid, sql_id, name, position, dup_position, datatype, datatype_string, character_sid, precision, scale, max_length) SELECT /*+ ordered use_nl(bnd) index(bnd sql_id) */ :lah_snap_id, :dbid, bnd.sql_id, name, position, dup_position, datatype, dataty
----------------------------------------
SO: 0xc000002071d7c428, type: 2, owner: 0x0000000000000000, flag: INIT/-/-/0x00 if: 0x3 c: 0x3
proc=0xc000002071d7c428, name=process, file=ksu.h LINE:12616 ID:, pg=0
(process) Oracle pid:212, ser:66, calls cur/top: 0xc000001fb321ec20/0xc000001fb31300e8
flags : (0x2) SYSTEM
flags2: (0x30), flags3: (0x10)
intr error: 0, call error: 0, sess error: 0, txn error 0
intr queue: empty
ksudlp FALSE at location: 0
(post info) last post received: 0 0 80
last post received-location: kji.h LINE:3418 ID:kjata: wake up enqueue owner
last process to post me: c0000020b1d72b30 2 6
last post sent: 0 0 26
last post sent-location: ksa2.h LINE:285 ID:ksasnd
last process posted by me: c000002071d60198 2 6
(latch info) wait_event=0 bits=0
Process Group: DEFAULT, pseudo proc: 0xc0000020a20d45a0
O/S info: user: oracle, term: UNKNOWN, ospid: 13964
OSD pid info: Unix process pid: 13964, image: oracle@qdtza2 (M000)
----------------------------------------
SO: 0xc0000020505807f8, type: 10, owner: 0xc000002071d7c428, flag: INIT/-/-/0x00 if: 0x1 c: 0x1
proc=0xc000002071d7c428, name=FileOpenBlock, file=ksfd.h LINE:6448 ID:, pg=0
(FOB) c0000020505807f8 flags=2050 fib=c0000020c8afa058 incno=0 pending i/o cnt=0
fname=/dev/tza_oravg04/rtza_lv16_403
fno=820 lblksz=16384 fsiz=1105919
service name: SYS$BACKGROUND
Current Wait Stack:
Not in wait; last wait ended 7 min 48 sec ago <<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<
There are 1 sessions blocked by this session.
Dumping one waiter:
inst: 1, sid: 6238, ser: 38599
wait event: 'enq: WF - contention'
p1: 'name|mode'=0x57460006
p2: '0'=0x46
p3: '0'=0x0
row_wait_obj#: 4294967295, block#: 0, row#: 0, file# 0 <<<<<<<<<<<<<<<<<<<<<<<<<<
min_blocked_time: 445 secs, waiter_cache_ver: 50989
Wait State:
fixed_waits=0 flags=0x20 boundary=0x0000000000000000/-1
Session Wait History:
elapsed time of 7 min 48 sec since last wait
0: waited for 'gc cr block 2-way'
=0x3, =0x115a3, =0x1
wait_id=5905 seq_num=5906 snap_id=1
wait times: snap=0.000424 sec, exc=0.000424 sec, total=0.000424 sec
wait times: max=infinite
wait counts: calls=1 os=1
occurred after 0.000712 sec of elapsed time
1: waited for 'gc current block 2-way'
=0x3, =0x13148, =0x1
wait_id=5904 seq_num=5905 snap_id=1
wait times: snap=0.000325 sec, exc=0.000325 sec, total=0.000325 sec
wait times: max=infinite
wait counts: calls=1 os=1
occurred after 0.001039 sec of elapsed time
2: waited for 'enq: WF - contention'
name|mode=0x57460006, 0=0x38, 0=0x0
wait_id=5903 seq_num=5904 snap_id=1
wait times: snap=0.000257 sec, exc=0.000257 sec, total=0.000257 sec
wait times: max=infinite
wait counts: calls=2 os=2
occurred after 1.078632 sec of elapsed time
3: waited for 'enq: WF - contention'
name|mode=0x57460006, 0=0x46, 0=0x0
wait_id=5902 seq_num=5903 snap_id=1
wait times: snap=0.000282 sec, exc=0.000282 sec, total=0.000282 sec
wait times: max=infinite
wait counts: calls=2 os=2
occurred after 0.002117 sec of elapsed time
4: waited for 'gc current block 2-way'
=0x123, =0x31410, =0x2000001
wait_id=5901 seq_num=5902 snap_id=1
wait times: snap=0.000276 sec, exc=0.000276 sec, total=0.000276 sec
wait times: max=infinite
wait counts: calls=1 os=1
occurred after 0.000529 sec of elapsed time
5: waited for 'gc current block 2-way'
=0x123, =0x71a2, =0x2000009
wait_id=5900 seq_num=5901 snap_id=1
wait times: snap=0.000141 sec, exc=0.000141 sec, total=0.000141 sec
wait times: max=infinite
wait counts: calls=1 os=1
occurred after 0.000231 sec of elapsed time
...
解决方法:
有个类似 Bug 20733574 : HANG DETECTED WHILE CREATING AWR SNAPSHOT, 但当时因为收集信息不足,开发未确认为bug;
还有一种解决方法是跳过收集与该表相关的数据(这个案例是wrh$_sql_bind_metadata记录的是SQL Bind Metadata),使用下面的命令:
alter system set “_awr_disabled_flush_tables” = ‘wrh$_sql_bind_metadata’;
该命令是动态的,不需要重启实例, 禁用多个表时参数用逗号分隔;
原理禁用了和上面表相关的统计信息flush ash to awr, 除了上述的bug ,还有其它一些原因时如空间不足、hang、也可以禁用,除了上面的禁用方法还有另一种禁用方法,先查询AWR Table Info (KEW layer)使用下面的SQL:
select table_id_kewrtb, table_name_kewrtb from x$kewrtb order by 1;
从上面的查询中得到table_id用于下面的level 值,比如我们这里表wrh$_sql_bind_metadata对应的是56
alter session set events ‘immediate trace name awr_flush_table_off level 56’;
后期如果想打开flushing to awr如果使用第一种方法就是把”_awr_disabled_flush_tables” reset掉;如果第二种方法禁用,打开使用
alter session set events ‘immediate trace name awr_flush_table_on level 56’;
迷雾城堡免广告 最新版v0.1.30
迷雾城堡免广告是一款非常好玩的模拟建造类手游,玩家无需看广告
鉴车大师免广告 安卓版v1.2.2
鉴车大师免广告是一款非常好玩的模拟类手游,玩家在游戏中不用看
从前有条街 安卓最新版v1.5
从前有条街是一款非常好玩的模拟经营类手游,玩家在游戏中将会进
我的世界源之界冰火魔龙 最新版v阿夜整合
我的世界源之界冰火魔龙模组整合包是一款像素风格的沙河模拟生存
假面骑士创骑腰带模拟器 安卓版v6
假面骑士创骑腰带模拟器是一个专为喜欢假面骑士的用户打造的变身