这是友情支持一个朋友的数据库恢复case,昨天圣诞节的时候来个求助,只能速度帮忙解决了好过节去了。首先我们来看下的alert log都有哪些信息:
Wed Dec 23 15:19:21 2015
SMON: enabling tx recovery
Wed Dec 23 15:19:21 2015
Database Characterset is ZHS16GBK
Opening with internal Resource Manager plan
where NUMA PG = 2, CPUs = 8
Wed Dec 23 15:19:22 2015
Errors in file d:oracleproduct10.2.0adminajzqudumpajzq_ora_1712.trc:
ORA-00600: internal error code, arguments: [4194], [45], [45], [], [], [], [], []
Doing block recovery for file 2 block 1618943
Block recovery from logseq 4, block 63 to scn 7662329350
Wed Dec 23 15:19:23 2015
Recovery of Online Redo Log: Thread 1 Group 1 Seq 4 Reading mem 0
Mem# 0: D:ORACLEPRODUCT10.2.0ORADATAxxxxREDO01.LOG
Block recovery completed at rba 4.4749.16, scn 1.3367362055
Doing block recovery for file 2 block 121
Block recovery from logseq 4, block 63 to scn 7662329346
Wed Dec 23 15:19:23 2015
Recovery of Online Redo Log: Thread 1 Group 1 Seq 4 Reading mem 0
Mem# 0: D:ORACLEPRODUCT10.2.0ORADATAxxxxREDO01.LOG
Block recovery completed at rba 4.3977.16, scn 1.3367362051
Wed Dec 23 15:19:24 2015
SMON: Restarting fast_start parallel rollback
Wed Dec 23 15:19:24 2015
Errors in file d:oracleproduct10.2.0adminxxxxudumpxxxx_ora_1712.trc:
ORA-00600: internal error code, arguments: [4194], [45], [45], [], [], [], [], []
Wed Dec 23 15:19:24 2015
DEBUG: Replaying xcb 0xf0adfbe0, pmd 0x20588f48 for failed op 8
Doing block recovery for file 2 block 1618943
Block recovery from logseq 4, block 63 to scn 7662329350
Wed Dec 23 15:19:24 2015
Recovery of Online Redo Log: Thread 1 Group 1 Seq 4 Reading mem 0
Mem# 0: D:ORACLEPRODUCT10.2.0ORADATAxxxxREDO01.LOG
Block recovery completed at rba 4.4749.16, scn 1.3367362055
Wed Dec 23 15:21:28 2015
Errors in file d:oracleproduct10.2.0adminxxxxudumpxxxx_ora_1712.trc:
ORA-00600: internal error code, arguments: [4194], [45], [45], [], [], [], [], []
ORA-00600: internal error code, arguments: [4194], [45], [45], [],
.....
Wed Dec 23 16:47:26 2015
Errors in file d:oracleproduct10.2.0adminxxxxbdumpxxxx_p005_8980.trc:
ORA-00600: internal error code, arguments: [2037], [141460651], [141460651], [162], [6], [1], [3367403517], [3141222167]
Wed Dec 23 16:47:26 2015
Hex dump of (file 92, block 3926982) in trace file d:oracleproduct10.2.0adminxxxxbdumpxxxx_dbw0_8864.trc
Corrupt block relative dba: 0x173bebc6 (file 92, block 3926982)
Bad header found during preparing block for write
Data in bad block:
type: 23 format: 7 rdba: 0xbd3b3f17
last change scn: 0x0001.c8b6be51 seq: 0x5 flg: 0x00
spare1: 0x3b spare2: 0xbc spare3: 0xc
consistency value in tail: 0xbe511705
check value in block header: 0x0
block checksum disabled
Wed Dec 23 16:47:26 2015
Errors in file d:oracleproduct10.2.0adminxxxxbdumpxxxx_p005_8980.trc:
ORA-00600: internal error code, arguments: [2037], [141460651], [141460651], [162], [6], [1], [3367403517], [3141222167]
Wed Dec 23 16:47:27 2015
Errors in file d:oracleproduct10.2.0adminxxxxbdumpxxxx_dbw0_8864.trc:
ORA-00600: internal error code, arguments: [kcbzpbuf_1], [4], [1], [], [], [], [], []
Wed Dec 23 16:47:27 2015
Errors in file d:oracleproduct10.2.0adminxxxxbdumpxxxx_dbw0_8864.trc:
ORA-00600: internal error code, arguments: [kcbzpbuf_1], [4], [1], [], [], [], [], []
DBW0: terminating instance due to error 471
我们可以看到,这个数据库已经宕机好几天了,在23号的时候就已经无法正常open了。上述的结果错误也很常见很简单。朋友经过处理后顺利把数据库open了,可是打开之后创建新的undo 表空间之后尝试重启后就再也打开不开了。遇到了如下的错误:
Fri Dec 25 11:44:17 2015
Recovery of Online Redo Log: Thread 1 Group 3 Seq 6 Reading mem 0
Mem# 0: D:ORACLEPRODUCT10.2.0ORADATAxxxxREDO03.LOG
Fri Dec 25 11:44:17 2015
Completed redo application
Fri Dec 25 11:44:17 2015
Hex dump of (file 44, block 518322) in trace file d:oracleproduct10.2.0adminxxxxbdumpxxxx_dbw2_21232.trc
Corrupt block relative dba: 0x0b07e8b2 (file 44, block 518322)
Bad header found during preparing block for write
Data in bad block:
type: 82 format: 2 rdba: 0x1c4ee895
last change scn: 0x0001.c8bbad54 seq: 0x21 flg: 0xf0
spare1: 0x52 spare2: 0x4d spare3: 0x895b
consistency value in tail: 0xad545221
check value in block header: 0x0
block checksum disabled
Fri Dec 25 11:44:18 2015
Errors in file d:oracleproduct10.2.0adminxxxxbdumpxxxx_dbw2_21232.trc:
ORA-00600: internal error code, arguments: [kcbzpbuf_1], [4], [1], [], [], [], [], []
Fri Dec 25 11:44:18 2015
Errors in file d:oracleproduct10.2.0adminxxxxbdumpxxxx_dbw2_21232.trc:
ORA-00600: internal error code, arguments: [kcbzpbuf_1], [4], [1], [], [], [], [], []
DBW2: terminating instance due to error 471
Fri Dec 25 11:44:18 2015
Errors in file d:oracleproduct10.2.0adminxxxxbdumpxxxx_p014_20704.trc:
ORA-00471: DBWR process terminated with error
这个错误我也是第一次遇见。从kcbzpbuf函数来看,是dbwr进出写脏块时发现了坏块。很明显,从上面的坏块信息来看,dba地址0x0b07e8b2 和 rdba地址0x1c4ee895是不匹配,这肯定没法正常写入的。
针对这个错误我事后在metalink 搜索了一下,还是有不少相关的文档甚至是bug,这里不多说了。处理方式很简单,首先加入如下2个隐含参数尝试打开数据库:
*._allow_resetlogs_corruption=TRUE
*._allow_error_simulation=TRUE
做了不完全恢复后,尝试open resetlogs时发现遇到了如下错误:
ri Dec 25 12:20:03 2015
SMON: enabling cache recovery
Fri Dec 25 12:20:04 2015
Errors in file d:oracleproduct10.2.0adminxxxxudumpxxxx_ora_20176.trc:
ORA-00600: internal error code, arguments: [2662], [1], [3367742663], [1], [3367742798], [4194313], [], []
Fri Dec 25 12:20:05 2015
Errors in file d:oracleproduct10.2.0adminxxxxudumpxxxx_ora_20176.trc:
ORA-00600: internal error code, arguments: [2662], [1], [3367742663], [1], [3367742798], [4194313], [], []
Fri Dec 25 12:20:05 2015
Error 600 happened during db open, shutting down database
USER: terminating instance due to error 600
Fri Dec 25 12:20:05 2015
Errors in file d:oracleproduct10.2.0adminxxxxbdumpxxxx_p015_21472.trc:
ORA-00600: internal error code, arguments: [15784], [600], [], [], [], [], [], []
ORA-00600: internal error code, arguments: [], [], [], [], [], [], [], []
Fri Dec 25 12:20:05 2015
Errors in file d:oracleproduct10.2.0adminxxxxbdumpxxxx_p014_9072.trc:
ORA-00600: internal error code, arguments: [15784], [600], [], [], [], [], [], []
ORA-00600: internal error code, arguments: [], [], [], [], [], [], [], []
看到这里2个错误,很多人可能会很迷惑,到底数据库无法open是因为哪个错误导致的呢? 这里稍微注意一下前后报错的顺序就知道了,很明显是前面的ora-00600 [2662]错误导致无法open resetlogs完成。
对于ora-00600 [2662]错误这里不再解释了,SCN的问题。这里直接推进scn即可。
将数据库启动到mount状态,执行如下命令:
SQL> alter system set job_queue_processes=0;
Session altered.
SQL> startup mount
ORACLE instance started.
Total System Global Area 7516192768 bytes
Fixed Size 2175728 bytes
Variable Size 1232850192 bytes
Database Buffers 6274678784 bytes
Redo Buffers 6488064 bytes
Database mounted.
SQL> alter session set events 'IMMEDIATE trace name adjust_scn level 10';
Session altered.
SQL> alter database open;
Database altered.
很顺利就打开了数据库。由于之前朋友已经创建了新的undo表空间,我只需要帮忙把undo重建就完事了,结束支持任务。尝试进行drop tablespace undotbs1 including contents and datafiles时,遇到ora-01548错误。
在alter session set “_smu_debug_mode” = 4; 后都无法进行drop,那只能继续使用下隐含参数了:
_offline_rollback_segments
_corrupted_rollback_segments
通过这2个隐含参数,可以顺利将旧的undo 表空间drop掉,然后修改pfile文件去掉不需要的一些隐含参数,顺利打开数据库。
到这里结束我的半小时支持友情支持任务!悲剧的下午又开始优化某客户的SQL。。。。。
备注:
1) 对于这种异常强制打开的数据库,建议进行全库检查,包括是否有坏块等,甚至检查数据字典是否一致等等(实际上检查之前的alert log发现, 之前数据库在恢复过程中就报了不少的坏块)
2) 在这个年代,还有3TB的数据库,而且是windows,更悲剧的还是非归档,没有备份,这是不应该的。
3) 我们应该多考虑下系统的容灾建设,哪怕是有备份。
电神魔傀2街机免费版 官方版v1.2.1
下载三国战纪2手游腾讯渠道服 安卓版v2.41.0.0
下载三国战纪2手游抖音渠道服 安卓版v2.41.0.0
下载三国战纪2折扣服 安卓版v2.41.0.0
下载叫我大掌柜小米版 安卓版v7.4.4
叫我大掌柜小米版是这款模拟经营类手游的渠道服版本,在此版本中
cooking fever正版 安卓最新版v23.0.2
cooking fever正版是一款非常好玩的模拟经营类手游
咖啡厅的生活故事 最新版v1.7
咖啡厅的生活故事是一款模拟经营游戏,玩家们在游戏中可以经营一
迅猛龙模拟器金币不减反增版 v1.1.8
迅猛龙模拟器无限金币版是一款动物模拟类游戏,玩家们将在游戏中
泽塔奥特曼升华器免广告版 v1.4
泽塔奥特曼升华器去广告版是游戏的破解版本,在该版本中为玩家去