一次简单的Oracle恢复Case实战记录
发表于:2025-01-20 作者:千家信息网编辑
千家信息网最后更新 2025年01月20日,发现问题某网友的数据库由于坏盘了,并且存储掉电,导致数据库无法open了。单看其数据库alert log的错误来看,是非常之简单的,如下:Fri Oct 26 10:33:53 2018Recover
千家信息网最后更新 2025年01月20日一次简单的Oracle恢复Case实战记录
发现问题
某网友的数据库由于坏盘了,并且存储掉电,导致数据库无法open了。单看其数据库alert log的错误来看,是非常之简单的,如下:
Fri Oct 26 10:33:53 2018Recovery of Online Redo Log: Thread 1 Group 3 Seq 39 Reading mem 0Mem# 0: /fs/fs/oradata/orcl/redo03.logBlock recovery stopped at EOT rba 39.77.16Block recovery completed at rba 39.77.16, scn 0.1002048587ORACLE Instance orcl (pid = 8) - Error 600 encountered while recovering transaction (9, 30) on object 9149.Fri Oct 26 10:33:53 2018Errors in file /fs/fs/oradata/admin/orcl/bdump/orcl_smon_192644.trc:ORA-00600: internal error code, arguments: [6856], [0], [43], [], [], [], [], []Fri Oct 26 10:33:56 2018Errors in file /fs/fs/oradata/admin/orcl/bdump/orcl_smon_192644.trc:ORA-00600: internal error code, arguments: [4194], [33], [36], [], [], [], [], []Doing block recovery for file 2 block 713Block recovery from logseq 39, block 82 to scn 1002048595
对于这种错误,很明显,屏蔽回滚段即可,屏蔽之后可顺利打开数据库,不过后面很快又会crash掉,因此重建undo也就绕过这个问题了。
打开数据库之后,再去观察数据库,会发现alert log有不少的错误,如下所示:
Fri Oct 26 11:01:46 2018Errors in file /fs/fs/oradata/admin/orcl/bdump/orcl_mmon_385148.trc:ORA-00600: internal error code, arguments: [17147], [0x110549070], [], [], [], [], [], []Fri Oct 26 11:01:46 2018Errors in file /fs/fs/oradata/admin/orcl/bdump/orcl_m001_373218.trc:ORA-00600: internal error code, arguments: [kdddgb5], [196650], [0], [], [], [], [], []ORA-600 encountered when generating server alert SMG-4120Fri Oct 26 11:01:47 2018Errors in file /fs/fs/oradata/admin/orcl/bdump/orcl_mmon_385148.trc:ORA-00600: internal error code, arguments: [KGHALO4], [0x11047F6F0], [], [], [], [], [], []ORA-600 encountered when generating server alert SMG-4121Fri Oct 26 11:01:48 2018Errors in file /fs/fs/oradata/admin/orcl/bdump/orcl_mmon_385148.trc:ORA-00600: internal error code, arguments: [KGHALO4], [0x11047F6F0], [], [], [], [], [], []ORA-600 encountered when generating server alert SMG-4121Fri Oct 26 11:01:50 2018Errors in file /fs/fs/oradata/admin/orcl/bdump/orcl_m001_373218.trc:ORA-00600: internal error code, arguments: [kdddgb5], [196650], [0], [], [], [], [], []Fri Oct 26 11:02:22 2018Errors in file /fs/fs/oradata/admin/orcl/bdump/orcl_mmon_385148.trc:ORA-00600: internal error code, arguments: [17114], [0x110549070], [], [], [], [], [], []Fri Oct 26 11:02:23 2018Errors in file /fs/fs/oradata/admin/orcl/bdump/orcl_mmon_385148.trc:ORA-00600: internal error code, arguments: [kebm_mmon_main_1], [39], [], [], [], [], [], []ORA-00039: error during periodic actionORA-00600: internal error code, arguments: [17114], [0x110549070], [], [], [], [], [], []Fri Oct 26 11:03:30 2018Restarting dead background process MMON
除此之外,由于之外alert log有坏块报错,因此对system进行了dbv检查,发现确实存在少量坏块,如下:
DBVERIFY: Release 10.2.0.4.0 - Production on Fri Oct 26 10:37:20 2018 Copyright (c) 1982, 2007, Oracle. All rights reserved. DBVERIFY - Verification starting : FILE = system01.dbf DBV-00200: Block, DBA 4255202, already marked corruptBlock Checking: DBA = 4258751, Block Type = KTB-managed data blockdata header at 0x11022a05ckdbchk: fsbo(596) wrong, (hsz 4178)Page 64447 failed with check code 6129Block Checking: DBA = 4259386, Block Type = KTB-managed data block**** kdxcofbo = 208 != 24---- end index block validationPage 65082 failed with check code 6401Block Checking: DBA = 4269609, Block Type = Unlimited data segment headerIncorrect extent count in the extent map: 16777317Block Checking: DBA = 4269612, Block Type = KTB-managed data block**** kdxcofbo = 224 != 216---- end index block validationPage 75308 failed with check code 6401Block Checking: DBA = 4269615, Block Type = KTB-managed data block**** actual rows locked by itl 2 = 1 != # in trans. header = 0---- end index block validationPage 75311 failed with check code 6401Page 85271 is influx - most likely media corruptCorrupt block relative dba: 0x00414d17 (file 1, block 85271)Fractured block found during dbv:Data in bad block:type: 6 format: 2 rdba: 0x00414d17last change scn: 0x0000.3afaf495 seq: 0x1 flg: 0x04spare1: 0x0 spare2: 0x0 spare3: 0x0consistency value in tail: 0xfe830601check value in block header: 0x96c6computed block checksum: 0x3c6b Page 85383 is influx - most likely media corruptCorrupt block relative dba: 0x00414d87 (file 1, block 85383)Fractured block found during dbv:Data in bad block:type: 6 format: 2 rdba: 0x00414d87last change scn: 0x0000.3b6b9d19 seq: 0x1 flg: 0x06spare1: 0x0 spare2: 0x0 spare3: 0x0consistency value in tail: 0x970f0601check value in block header: 0xe825computed block checksum: 0x3c6b DBVERIFY - Verification complete Total Pages Examined : 640000Total Pages Processed (Data) : 116312Total Pages Failing (Data) : 1Total Pages Processed (Index): 65914Total Pages Failing (Index): 3Total Pages Processed (Other): 64634Total Pages Processed (Seg) : 0Total Pages Failing (Seg) : 0Total Pages Empty : 393138Total Pages Marked Corrupt : 3Total Pages Influx : 2Highest block SCN : 1002028510 (0.1002028510)
这部分错误,其实处理起来也不困难,部分是业务表的index,但是其他的几乎都是AWR相关基表,有2个坏块跟是system相关的基表和索引,分别是I_H_OBJ#_COL#和COM$ ,HISTGRM$。
对于业务索引,很简单,直接drop 重建即可,对于这个sys的index,可以通过设置38003 event进行drop重建。
对于基表COM$,HISTGRM$,由于是非bootstrap$核心对象,其实也可以处理掉的。
处理方法
不过考虑到这种毕竟是存储掉电,undo异常的情况,还是重建库更稳妥一些。最后补充一点,这个库稍微有点奇葩的地方是全库1.2TB,其中有个表的LOB自动980GB,重建数据库是相对较慢的。对于大表,且有LOB自动,通常建议基于分片,否则会报ORA-01555错误的,如下是常用的一个基于rowid的分片脚本,供大家参考:
set verify off undefine rowid_ranges undefine segment_name undefine owner set head off set pages 0 set trimspool on select 'where rowid between ''' || sys.dbms_rowid.rowid_create(1, d.oid, c.fid1, c.bid1, 0) || ''' and ''' || sys.dbms_rowid.rowid_create(1, d.oid, c.fid2, c.bid2, 9999) || '''' || ';' from (select distinct b.rn, first_value(a.fid) over(partition by b.rn order by a.fid, a.bid rows between unbounded preceding and unbounded following) fid1, last_value(a.fid) over(partition by b.rn order by a.fid, a.bid rows between unbounded preceding and unbounded following) fid2, first_value(decode(sign(range2 - range1), 1, a.bid + ((b.rn - a.range1) * a.chunks1), a.bid)) over(partition by b.rn order by a.fid, a.bid rows between unbounded preceding and unbounded following) bid1, last_value(decode(sign(range2 - range1), 1, a.bid + ((b.rn - a.range1 + 1) * a.chunks1) - 1, (a.bid + a.blocks - 1))) over(partition by b.rn order by a.fid, a.bid rows between unbounded preceding and unbounded following) bid2 from (select fid, bid, blocks, chunks1, trunc((sum2 - blocks + 1 - 0.1) / chunks1) range1, trunc((sum2 - 0.1) / chunks1) range2 from (select /*+ rule */ relative_fno fid, block_id bid, blocks, sum(blocks) over() sum1, trunc((sum(blocks) over()) / &&rowid_ranges) chunks1, sum(blocks) over(order by relative_fno, block_id) sum2 from dba_extents where segment_name = upper('&&segment_name') and owner = upper('&&owner')) where sum1 > &&rowid_ranges) a, (select rownum - 1 rn from dual connect by level <= &&rowid_ranges) b where b.rn between a.range1 and a.range2) c, (select max(data_object_id) oid from dba_objects where object_name = upper('&&segment_name') and owner = upper('&&owner') and data_object_id is not null) d /
总结
以上就是这篇文章的全部内容了,希望本文的内容对大家的学习或者工作具有一定的参考学习价值,如果有疑问大家可以留言交流,谢谢大家对的支持。
数据
数据库
错误
处理
业务
内容
索引
问题
参考
存储
学习
明显
稳妥
困难
价值
可以通过
地方
奇葩
对象
就是
数据库的安全要保护哪些东西
数据库安全各自的含义是什么
生产安全数据库录入
数据库的安全性及管理
数据库安全策略包含哪些
海淀数据库安全审计系统
建立农村房屋安全信息数据库
易用的数据库客户端支持安全管理
连接数据库失败ssl安全错误
数据库的锁怎样保障安全
软件开发专业体会心得
信标服务器
传奇服务器怎么监控玩家
2020年网络安全试点通知
维护网络安全 英文
软件开发从业人员的供需状况
山西电商软件开发定制
网络安全作业119
公安部网络安全监测平台
蔬菜鱼肉配送软件开发公司
mysql服务器登录
2k19连接不到服务器xbox
和创互联网科技
中国软件开发人才缺口
淮安云服务器哪个厂家质量好
数据库sdk
服务器usb安装系统安装
企业号服务器配置
网络安全管理工程师
网络安全防火墙分析题
服务器如何租用
传奇服务器怎么监控玩家
js 用什么链接数据库
网络安全行业工资好低
公安部网络安全监测平台
软件开发培训去哪里学
网络安全防诈骗手抄报A4纸
软件测试为什么要学数据库
软件开发笔记软件
专业的期货软件开发