本来是一件很简单的事情,restore文件,然后recover归档,恢复到某个点,然后open resetlogs 打开数据库,但是居然报错,ora-600 [4097],很常见的一个错误,不过比较怪异的是,这里并没有直接提示是哪个回滚段有问题,如下是trace内容:
ksedmp: internal or fatal error
ORA-00600: internal error code, arguments: [4097], [], [], [], [], [], [], []
Current SQL statement for this session:
update undo$ set name=:2,file#=:3,block#=:4,status$=:5,user#=:6,undosqn=:7,xactsqn=:8,scnbas=:9,scnwrp=:10,inst#=:11,ts#=:12,spare1=:13 where us#=:1
----- Call Stack Trace -----
calling call entry argument values in hex
location type point (? means dubious value)
-------------------- -------- -------------------- ----------------------------
ksedst+001c bl ksedst1 000000000 ? 000000000 ?
ksedmp+0290 bl ksedst 104A56548 ?
ksfdmp+0018 bl 03F563A4
kgeriv+0108 bl _ptrgl
kgesiv+0080 bl kgeriv 000000018 ? FFFFFFFFFFDB5C0 ?
000000000 ? 10564B600 ?
7000004F76CA698 ?
ksesic0+0060 bl kgesiv 000000000 ? 700000350CD7B3C ?
FFFFFFFFFFDB050 ?
FFFFFFFFFFDB598 ? 000000000 ?
ktugti+07cc bl ksesic0 100100001001 ? 0000010E4 ?
000000000 ? 000000000 ?
1104B74D0 ? 000000080 ?
1100DFC10 ? 000000007 ?
ktcwit1+0684 bl ktugti 700000506D0CC10 ? 410000FC8 ?
0056815A0 ? 147AE1410000FC8 ?
3B00000015 ? 4400000005 ?
4EB10018078 ? 21FE7BFCF8 ?
ktbgfi+1390 bl ktcwit1 FFFFFFFFFFDB5C0 ?
FFFFFFFFFFDB598 ? 20010D18C ?
41022B190 ? 000000000 ?
147AE1411D203D0 ?
FFFFFFFFFFDB5B0 ? 111D20390 ?
kdddgb+08b0 bl ktbgfi 011D16648 ? 111D44B58 ?
000000000 ? 111D16770 ?
FFFFFFFFFFDB8C0 ?
4844484304CD2968 ?
1020E0968 ?
kdusru+15d8 bl kdddgb 000000000 ? 000000000 ?
000000000 ?
kauupd+0230 bl kdusru 000000000 ? 000000000 ?
000000000 ? 000000000 ?
updrow+10fc bl kauupd 111D643F0 ? 7000004ECF8A480 ?
1104D87D0 ?
4004824000000000 ?
7000004ECF8C978 ?
E0004F60D27C8 ? FFFFE6D50 ?
1104D6E60 ?
qerupRowProcedure+0 bl updrow 1100C82A8 ? 7FFF04E5836C ?
050
qerupFetch+053c bl 03F52E00
updaul+0e0c bl 01FC3DDC
updThreePhaseExe+0e bl updaul 7000004ECF7FF48 ?
ec FFFFFFFFFFE8328 ? 000000000 ?
updexe+02f8 bl updThreePhaseExe FFFFFFFFFFE8580 ? 100000000 ?
000000000 ? 1104DCEA0 ?
opiexe+2868 bl updexe 111D7C110 ? 300000418 ?
opiodr+0ae0 bl _ptrgl
rpidrus+01bc bl opiodr 400000000 ? 4104DCEA0 ?
FFFFFFFFFFEC030 ? 204E92D50 ?
skgmstack+00c8 bl _ptrgl
rpidru+0088 bl skgmstack 700000505CE041C ? 000000000 ?
000000002 ? 000000000 ?
FFFFFFFFFFEBBC8 ?
rpiswu2+034c bl _ptrgl
rpidrv+095c bl rpiswu2 700000505CE03E0 ?
FFFFFFFFFFEBB30 ?
FFFFFFFFFFEC210 ?
882244200030B558 ?
10107B9CC ? 000000000 ?
FFFFFFFFFFEBF30 ? 000000000 ?
rpiexe+005c bl rpidrv 200000000 ? 400000000 ?
FFFFFFFFFFEC030 ? 000000000 ?
ktuscu+0284 bl 01FC42B8
kqrcmt+0404 bl _ptrgl
ktcrcm+052c bl kqrcmt 7000004F76CA698 ? 100000000 ?
000000000 ?
ktuiup+056c bl ktcrcm 7000004F76CA698 ? 000000000 ?
000000000 ? 000000000 ?
000000000 ? 100000001 ?
000000000 ? 000000000 ?
ktuini+0064 bl ktuiup 000000000 ?
adbdrv+1984 bl ktuini 010441180 ?
opiexe+2c98 bl adbdrv
opiosq0+19f0 bl opiexe 000000000 ? 000000000 ?
FFFFFFFFFFF8F20 ?
kpooprx+0168 bl opiosq0 3F8FECB10 ? 700000010003520 ?
7000004F8FECA90 ?
A40001101960A8 ?
kpoal8+0400 bl kpooprx FFFFFFFFFFFB774 ?
FFFFFFFFFFFB518 ?
1D0000001D ? 100000001 ?
000000000 ? A40000000000A4 ?
000000000 ? 1103A5678 ?
opiodr+0ae0 bl _ptrgl
ttcpip+1020 bl _ptrgl
opitsk+1124 bl ttcpip 1100CB4B0 ? 9001000A0080860 ?
FFFFFFFFFFFB750 ? 11044D010 ?
FFFFFFFFFFFB750 ? 11044D090 ?
FFFFFFFFFFFB750 ?
9001000A0080860 ?
opiino+0990 bl opitsk 1E00000000 ? 000000000 ?
opiodr+0ae0 bl _ptrgl
opidrv+0484 bl 01FC4CDC
sou2o+0090 bl opidrv 3C02877CFC ? 4A006F398 ?
FFFFFFFFFFFF6B0 ?
opimai_real+01bc bl 01FC306C
main+0098 bl opimai_real 000000000 ? 000000000 ?
__start+0070 bl main 000000000 ? 000000000 ?
其实我们可以尝试reset incarnation,然后再去restore归档,然后recover,想想麻烦,反正是测试,所以继续搞下去。
首先利用10046 event 来跟踪一下,发现如下sql报错:
PARSING IN CURSOR #2 len=148 dep=1 uid=0 oct=6 lid=0 tim=11525224803938 hv=3540833987 ad='9f8d140'
update undo$ set name=:2,file#=:3,block#=:4,status$=:5,user#=:6,undosqn=:7,xactsqn=:8,scnbas=:9,scnwrp=:10,inst#=:11,ts#=:12,spare1=:13 where us#=:1
END OF STMT
PARSE #2:c=0,e=10,p=0,cr=0,cu=0,mis=0,r=0,dep=1,og=4,tim=11525224803936
BINDS #2:
kkscoacd
Bind#0
oacdty=01 mxl=32(10) mxlc=00 mal=00 scl=00 pre=00
oacflg=18 fl2=0001 frm=01 csi=852 siz=32 off=0
kxsbbbfp=70000050bfe3922 bln=32 avl=10 flg=09
value="_SYSSMU29$"
Bind#1
实际上因为oracle 在open的时候会去判断回滚端上是否存在事物,如果存在,那么就会进行update,如果进行update那么也就说明正在open的时候需要更新回滚端的信息。这里尝试使用参数将上述几个回滚端屏蔽掉,发现仍然无法open,再次寻找10046 trace,发现原来是另外一个回滚段可能有问题,如下:
Cursor#2(1104c1ad8) state=BOUND curiob=11136c0d0
curflg=5 fl2=0 par=1104c1a70 ses=700000505ce03e0
sqltxt(70000050ebb91d8)=update undo$ set name=:2,file#=:3,block#=:4,status$=:5,user#=:6,undosqn=:7,xactsqn=:8,scnbas=:9,scnwrp=:10,inst#=:11,ts#=:12,spare1=:13 where us#=:1
hash=9caba1288112094d5553173dd30cd6c3
parent=7000004edfe6078 maxchild=01 plk=7000004f179cc38 ppn=n
cursor instantiation=11136c0d0 used=1435474913
child#0(70000050ebb8fb0) pcs=7000004edfe5c88
clk=7000004f1788cd0 ci=7000004edfe5370 pn=700000509fa4f00 ctx=7000004ecf7ff48
kgsccflg=1 llk[11136c0d8,11136c0d8] idx=c4
xscflg=e0100666 fl2=d100400 fl3=4022218c fl4=100
Bind bytecodes
Opcode = 5 Bind Rpi Scalar Sql In (not out) Nocopy
Offsi = 48, Offsi = 0
Opcode = 1 Unoptimized
Offsi = 48, Offsi = 32
Opcode = 1 Unoptimized
Offsi = 48, Offsi = 64
Opcode = 1 Unoptimized
Offsi = 48, Offsi = 96
Opcode = 1 Unoptimized
Offsi = 48, Offsi = 128
Opcode = 1 Unoptimized
Offsi = 48, Offsi = 160
Opcode = 1 Unoptimized
Offsi = 48, Offsi = 192
Opcode = 1 Unoptimized
Offsi = 48, Offsi = 224
Opcode = 1 Unoptimized
Offsi = 48, Offsi = 256
Opcode = 1 Unoptimized
Offsi = 48, Offsi = 288
Opcode = 1 Unoptimized
Offsi = 48, Offsi = 320
Opcode = 1 Unoptimized
Offsi = 48, Offsi = 352
Opcode = 1 Unoptimized
Offsi = 48, Offsi = 384
kkscoacd
Bind#0
oacdty=01 mxl=32(10) mxlc=00 mal=00 scl=00 pre=00
oacflg=18 fl2=0001 frm=01 csi=852 siz=32 off=0
kxsbbbfp=70000050afdf68a bln=32 avl=10 flg=09
value="_SYSSMU61$"
果断再次屏蔽,然后尝试open resetlogs,发热仍然报错,原来这个回滚端用无法直接offline,隐含参数不好用,因此直接bbed 修改状态吧,如下:
BBED> p *kdbr[7]
rowdata[7302]
-------------
ub1 rowdata[7302] @7662 0x0c
BBED> x /1rncnnnnnnnnnnn
rowdata[7302] @7662
-------------
flag@7662: 0x0c (KDRHFL, KDRHFF)
lock@7663: 0x00
cols@7664: 17
hrid@7665:0x0040006a.3d
col 0[2] @7671: 61
col 1[10] @7674: _SYSSMU61$
col 2[2] @7685: 1
col 3[2] @7688: 200
col 4[4] @7691: 34489
col 5[6] @7696: 4196918701
col 6[3] @7703: 3364
col 7[5] @7707: 8202997
col 8[4] @7713: 23884
col 9[1] @7718: 0
col 10[2] @7720: 3
col 11[2] @7723: 1
col 12[0] @7726: *NULL*
col 13[0] @7727: *NULL*
col 14[0] @7728: *NULL*
col 15[0] @7729: *NULL*
col 16[2] @7730: 1
BBED> modify /x c103 offset 7721
File: /crm/oradata02/rngc_system.dbf (1)
Block: 110 Offsets: 7721 to 7726 Dba:0x0040006e
------------------------------------------------------------------------
c10302c1 02ff
<32 bytes per line>
BBED> sum apply
Check value for File 1, Block 110:
current = 0x704c, required = 0x704c
修改之后成功open 数据库。
not connected> alter database open resetlogs;
Database altered.
虽然打开了,但是奇葩的还在后面,当我shutdown 再次启动,居然无法启动了。 报错ora-01555,比较经典的错误。
ARC0: Becoming the 'no SRL' ARCH
Sun Jun 28 16:08:22 2015
ARC1: Becoming the heartbeat ARCH
Sun Jun 28 16:08:22 2015
SMON: enabling cache recovery
Sun Jun 28 16:08:22 2015
ORA-01555 caused by SQL statement below (SQL ID: 7bd391hat42zk, Query Duration=0 sec, SCN: 0x0d27.0a1ce29d):
Sun Jun 28 16:08:22 2015
select /*+ rule */ name,file#,block#,status$,user#,undosqn,xactsqn,scnbas,scnwrp,DECODE(inst#,0,NULL,inst#),ts#,spare1 from undo$ where us#=:1
Sun Jun 28 16:08:22 2015
Errors in file /oracle/app/oracle/admin/ibsscrm/udump/xxxx_ora_30212428.trc:
ORA-00604: error occurred at recursive SQL level 1
ORA-01555: snapshot too old: rollback segment number 0 with name "SYSTEM" too small
Error 604 happened during db open, shutting down database
USER: terminating instance due to error 604
Instance terminated by USER, pid = 30212428
ORA-1092 signalled during: alter database open...
比较郁闷的是system 回滚段。很明显这也跟scn有关系,aix平台,尝试oradebug 修改scn,发现比较费劲。
最后果断bbed 再次修改block(仍然通过10046 trace 寻找相关的block).
BBED> p ktbbh
struct ktbbh, 72 bytes @20
ub1 ktbbhtyp @20 0x01 (KDDBTDATA)
union ktbbhsid, 4 bytes @24
ub4 ktbbhsg1 @24 0x0000000f
ub4 ktbbhod1 @24 0x0000000f
struct ktbbhcsc, 8 bytes @28
ub4 kscnbas @28 0x0a1ba8da
ub2 kscnwrp @32 0x0d27
b2 ktbbhict @36 2
ub1 ktbbhflg @38 0x02 (NONE)
ub1 ktbbhfsl @39 0x00
ub4 ktbbhfnx @40 0x00000000
struct ktbbhitl[0], 24 bytes @44
struct ktbitxid, 8 bytes @44
ub2 kxidusn @44 0x0000
ub2 kxidslt @46 0x002a
ub4 kxidsqn @48 0x000004eb
struct ktbituba, 8 bytes @52
ub4 kubadba @52 0x00400195
ub2 kubaseq @56 0x0238
ub1 kubarec @58 0x0b
ub2 ktbitflg @60 0x0001 (NONE)
union _ktbitun, 2 bytes @62
b2 _ktbitfsc @62 0
ub2 _ktbitwrp @62 0x0000
ub4 ktbitbas @64 0x00000000
struct ktbbhitl[1], 24 bytes @68
struct ktbitxid, 8 bytes @68
ub2 kxidusn @68 0x0000
ub2 kxidslt @70 0x0007
ub4 kxidsqn @72 0x000004e5
struct ktbituba, 8 bytes @76
ub4 kubadba @76 0x00400017
ub2 kubaseq @80 0x0235
ub1 kubarec @82 0x11
ub2 ktbitflg @84 0x8000 (KTBFCOM)
union _ktbitun, 2 bytes @86
b2 _ktbitfsc @86 3367
ub2 _ktbitwrp @86 0x0d27
ub4 ktbitbas @88 0x0a1ba8d9
BED> d /v offset 60 count 2
File: /crm/oradata02/rngc_system.dbf (1)
Block: 106 Offsets: 60 to 61 Dba:0x0040006a
-------------------------------------------------------
0001 l ..
<16 bytes per line>
BBED> modify /x 00 offset 61
File: /crm/oradata02/rngc_system.dbf (1)
Block: 106 Offsets: 61 to 62 Dba:0x0040006a
------------------------------------------------------------------------
0000
<32 bytes per line>
BBED> sum apply
Check value for File 1, Block 106:
current = 0x3972, required = 0x3972
BBED> verify
DBVERIFY - Verification starting
FILE = /crm/oradata02/rngc_system.dbf
BLOCK = 106
Block Checking: DBA = 4194410, Block Type = KTB-managed data block
data header at 0x1101fb05c
kdbchk: row locked by non-existent transaction
table=0 slot=124
lockid=1 ktbbhitc=2
Block 106 failed with check code 6101
DBVERIFY - Verification complete
Total Blocks Examined : 1
Total Blocks Processed (Data) : 1
Total Blocks Failing (Data) : 1
Total Blocks Processed (Index): 0
Total Blocks Failing (Index): 0
Total Blocks Empty : 0
Total Blocks Marked Corrupt : 0
Total Blocks Influx : 0
BBED> p *kdbr[124]
rowdata[65]
-----------
ub1 rowdata[65] @431 0x2c
BBED> x /1rncnnnnnnnnnnn
rowdata[65] @431
-----------
flag@431: 0x2c (KDRHFL, KDRHFF, KDRHFH)
lock@432: 0x01
cols@433: 17
col 0[3] @434: 124
col 1[11] @438: _SYSSMU124$
col 2[2] @450: 1
col 3[3] @453: 208
col 4[3] @457: 1881
col 5[6] @461: 4246102093
col 6[3] @468: 3364
col 7[5] @472: 2167495
col 8[4] @478: 60563
col 9[1] @483: 0
col 10[2] @485: 3
col 11[2] @488: 1
col 12[0] @491: *NULL*
col 13[0] @492: *NULL*
col 14[0] @493: *NULL*
col 15[0] @494: *NULL*
col 16[2] @495: 1
BBED> d /v offset 432 count 2
File: /crm/oradata02/rngc_system.dbf (1)
Block: 106 Offsets: 432 to 433 Dba:0x0040006a
-------------------------------------------------------
0111 l ..
<16 bytes per line>
BBED> modify /x 00 offset 432
File: /crm/oradata02/rngc_system.dbf (1)
Block: 106 Offsets: 432 to 433 Dba:0x0040006a
------------------------------------------------------------------------
0011
<32 bytes per line>
BBED> sum apply
Check value for File 1, Block 106:
current = 0x3872, required = 0x3872
BBED> verify
DBVERIFY - Verification starting
FILE = /crm/oradata02/rngc_system.dbf
BLOCK = 106
DBVERIFY - Verification complete
Total Blocks Examined : 1
Total Blocks Processed (Data) : 1
Total Blocks Failing (Data) : 0
Total Blocks Processed (Index): 0
Total Blocks Failing (Index): 0
Total Blocks Empty : 0
Total Blocks Marked Corrupt : 0
Total Blocks Influx : 0
BBED> set file 1 block 110
FILE# 1
BLOCK# 110
BBED> p ktbbh
struct ktbbh, 48 bytes @20
ub1 ktbbhtyp @20 0x01 (KDDBTDATA)
union ktbbhsid, 4 bytes @24
ub4 ktbbhsg1 @24 0x0000000f
ub4 ktbbhod1 @24 0x0000000f
struct ktbbhcsc, 8 bytes @28
ub4 kscnbas @28 0x0a1ba9bf
ub2 kscnwrp @32 0x0d27
b2 ktbbhict @36 1
ub1 ktbbhflg @38 0x02 (NONE)
ub1 ktbbhfsl @39 0x00
ub4 ktbbhfnx @40 0x00000000
struct ktbbhitl[0], 24 bytes @44
struct ktbitxid, 8 bytes @44
ub2 kxidusn @44 0x0000
ub2 kxidslt @46 0x0044
ub4 kxidsqn @48 0x000004eb
struct ktbituba, 8 bytes @52
ub4 kubadba @52 0x00400195
ub2 kubaseq @56 0x0238
ub1 kubarec @58 0x1d
ub2 ktbitflg @60 0x0001 (NONE)
union _ktbitun, 2 bytes @62
b2 _ktbitfsc @62 0
ub2 _ktbitwrp @62 0x0000
ub4 ktbitbas @64 0x00000000
BBED> d /v offset 60 count 2
File: /crm/oradata02/rngc_system.dbf (1)
Block: 110 Offsets: 60 to 61 Dba:0x0040006e
-------------------------------------------------------
0001 l ..
<16 bytes per line>
BBED> modify /x 8000
File: /crm/oradata02/rngc_system.dbf (1)
Block: 110 Offsets: 60 to 61 Dba:0x0040006e
------------------------------------------------------------------------
8000
<32 bytes per line>
BBED> sum apply
Check value for File 1, Block 110:
current = 0xefb1, required = 0xefb1
BBED> verify
DBVERIFY - Verification starting
FILE = /crm/oradata02/rngc_system.dbf
BLOCK = 110
Block Checking: DBA = 4194414, Block Type = KTB-managed data block
data header at 0x11021d044
kdbchk: row locked by non-existent transaction
table=0 slot=8
lockid=1 ktbbhitc=1
Block 110 failed with check code 6101
DBVERIFY - Verification complete
Total Blocks Examined : 1
Total Blocks Processed (Data) : 1
Total Blocks Failing (Data) : 1
Total Blocks Processed (Index): 0
Total Blocks Failing (Index): 0
Total Blocks Empty : 0
Total Blocks Marked Corrupt : 0
Total Blocks Influx : 0
BBED> p *kdbr[8]
rowdata[7231]
-------------
ub1 rowdata[7231] @7591 0x0c
BBED> x /1rncnnnnnnnnnnn
rowdata[7231] @7591
-------------
flag@7591: 0x0c (KDRHFL, KDRHFF)
lock@7592: 0x01
cols@7593: 17
hrid@7594:0x0040006b.7
col 0[3] @7600: 132
col 1[11] @7604: _SYSSMU132$
col 2[2] @7616: 1
col 3[2] @7619: 9
col 4[2] @7622: 89
col 5[6] @7625: 4246102099
col 6[3] @7632: 3364
col 7[5] @7636: 2064336
col 8[4] @7642: 55781
col 9[1] @7647: 0
col 10[2] @7649: 3
col 11[2] @7652: 1
col 12[0] @7655: *NULL*
col 13[0] @7656: *NULL*
col 14[0] @7657: *NULL*
col 15[0] @7658: *NULL*
col 16[2] @7659: 1
BBED> d /v offset 7592 count 2
File: /crm/oradata02/rngc_system.dbf (1)
Block: 110 Offsets: 7592 to 7593 Dba:0x0040006e
-------------------------------------------------------
0111 l ..
<16 bytes per line>
BBED> modify /x 00 offset 7592
File: /crm/oradata02/rngc_system.dbf (1)
Block: 110 Offsets: 7592 to 7593 Dba:0x0040006e
------------------------------------------------------------------------
0011
<32 bytes per line>
BBED> sum apply
Check value for File 1, Block 110:
current = 0xeeb1, required = 0xeeb1
BBED> verify
DBVERIFY - Verification starting
FILE = /crm/oradata02/rngc_system.dbf
BLOCK = 110
DBVERIFY - Verification complete
Total Blocks Examined : 1
Total Blocks Processed (Data) : 1
Total Blocks Failing (Data) : 0
Total Blocks Processed (Index): 0
Total Blocks Failing (Index): 0
Total Blocks Empty : 0
Total Blocks Marked Corrupt : 0
Total Blocks Influx : 0
BBED>
最后再次open,发现一切顺利。
not connected> startup mount
ORACLE instance started.
Total System Global Area 2.1475E+10 bytes
Fixed Size 2122472 bytes
Variable Size 6425677080 bytes
Database Buffers 1.5032E+10 bytes
Redo Buffers 14651392 bytes
Database mounted.
not connected> alter database open;
Database altered.
由于是测试环境,因此可以随便折腾,生产库,建议不要这样玩,可不好哦~~