千家信息网

long长事务回滚的模拟与定位

发表于:2024-11-19 作者:千家信息网编辑
千家信息网最后更新 2024年11月19日,大型事务的回滚 大型事务的回滚产生非常大的代价,不仅锁定需要的资源,并且消耗的CPU和IO,尤其是IO将极为密集。尤其在KILL大型事务之前检查事务究竟有多大可能是必要的,同时我们也需要知道回滚已经进
千家信息网最后更新 2024年11月19日long长事务回滚的模拟与定位大型事务的回滚 大型事务的回滚产生非常大的代价,不仅锁定需要的资源,并且消耗的CPU和IO,尤其是IO将极为密集。尤其在KILL大型事务之前检查事务究竟有多大可能是必要的,同时我们也需要知道回滚已经进行了多少程度。
一、模拟:

1.删除500多万的数据。


SQL> conn hr/hr;
Connected.
SQL> create table test as select * from dba_objects;

Table created.

SQL> insert into test select * from test;

87055 rows created.

SQL> insert into test select * from test;

174110 rows created.

SQL> insert into test select * from test;

348220 rows created.

SQL> insert into test select * from test;

696440 rows created.

SQL> insert into test select * from test;

1392880 rows created.

SQL> select count(*) from test;

COUNT(*)
----------
2785760

SQL> insert into test select * from test;

2785760 rows created.

SQL> commit;

Commit complete.

SQL> select count(*) from test;

COUNT(*)
----------
5571520

SQL> analyze table test compute statistics;

Table analyzed.

SQL>
SQL> select sid from v$mystat where rownum=1;

SID
----------
30

SQL>

--模拟删除,不commit
SQL> delete test;

5571520 rows deleted.



--另开窗口,查询kill掉会话

SQL> select sid,serial#,sql_id,event,blocking_session from v$session where sid=30;

SID SERIAL# SQL_ID EVENT BLOCKING_SESSION
---------- ---------- ------------- ---------------------------------------------------------------- ----------------
30 165 7qqwcq9td6akt log buffer space 11

SQL> select sql_text from v$sql where sql_id='7qqwcq9td6akt';

SQL_TEXT
----------------------------------------------------------------------
delete test

SQL> alter system kill session '30,165' immediate;

System altered.

--回到原来窗口验证:
SQL> select count(*) from test;
select count(*) from test
*
ERROR at line 1:
ORA-03135: connection lost contact
Process ID: 28346
Session ID: 30 Serial number: 165


二、定位:
查看回滚进度:
可以通过以下两个视图查看回滚的进度,通过单位时间内恢复的undo block来估算恢复时间:
1. 通过x$ktuxe
alter session set nls_date_format='yyyy-mm-dd hh34:mi:ss';
select ADDR,KTUXEUSN,KTUXESLT,KTUXESQN,KTUXESIZ,sysdate from x$ktuxe where KTUXECFL='DEAD' and KTUXESIZ >0;


SQL> alter session set nls_date_format='yyyy-mm-dd hh34:mi:ss';

Session altered.

SQL> select ADDR,KTUXEUSN,KTUXESLT,KTUXESQN,KTUXESIZ,sysdate from x$ktuxe where KTUXECFL='DEAD' and KTUXESIZ >0;

ADDR KTUXEUSN KTUXESLT KTUXESQN KTUXESIZ SYSDATE
---------------- ---------- ---------- ---------- ---------- -------------------
00007F170E8AAC20 5 11 7784 84438 2017-12-09 14:19:22

SQL>

可以通过KTUXESLT ,KTUXESQN这两个字段,然后用以下脚本回滚得出大概需要的时间:
set serveroutput on
declare
l_start number;
l_end number;
begin
select ktuxesiz
into l_start
from x$ktuxe
where KTUXEUSN = 5
and KTUXESLT = 11;
dbms_lock.sleep(60);
select ktuxesiz
into l_end
from x$ktuxe
where KTUXEUSN = 5
and KTUXESLT = 11;
dbms_output.put_line('time est Day:' ||
round(l_end / (l_start - l_end) / 60 / 24, 2));
end;
/

time est Day:.01

PL/SQL procedure successfully completed.

SQL> SQL>

2. 通过v$fast_start_trancsations 状态为recovering表示恢复中;
select USN,SLT,SEQ,STATE,UNDOBLOCKSDONE,UNDOBLOCKSTOTAL,CPUTIME,XID,sysdate from V$FAST_START_TRANSACTIONS;

SQL> select USN,SLT,SEQ,STATE,UNDOBLOCKSDONE,UNDOBLOCKSTOTAL,CPUTIME,XID,sysdate from V$FAST_START_TRANSACTIONS;

USN SLT SEQ STATE UNDOBLOCKSDONE UNDOBLOCKSTOTAL CPUTIME XID SYSDATE
---------- ---------- ---------- ---------------- -------------- --------------- ---------- ---------------- -------------------
5 11 7784 RECOVERING 123491 179829 120 05000B00681E0000 2017-12-09 14:20:25

通过如下视图观察回滚是串行还是并行回滚的,如下图应是并行恢复的,
V$FAST_START_SERVERS provides information about all the recovery slaves performing parallel transaction recovery.

通过xid字段与v$fast_start_trancsations关联。
select * from v$fast_start_servers where xid in (select XID from V$FAST_START_TRANSACTIONS);

SQL> select * from v$fast_start_servers where xid in (select XID from V$FAST_START_TRANSACTIONS);

STATE UNDOBLOCKSDONE PID XID
----------- -------------- ---------- ----------------
RECOVERING 133950 20 05000B00681E0000

3.查看回滚对象
通过dump undo block方式查看回滚的对象:

首先通过v$fast_start_trancsations的usn字段查询到使用的回滚段。

SQL> select * from v$rollname where usn=5;

USN NAME
---------- ------------------------------
5 _SYSSMU5_898567397$

dump这个undo block,因为dump的文件很大,在查询出对象的object_id后,kill掉这个dump会话。
alter system dump undo block "" XID ;

SQL> alter system dump undo block "_SYSSMU5_898567397$" xid 5 11 7784;

。。。。。。。。。


[oracle@wang trace]$ ls -lrt
total 635992
-rw-r----- 1 oracle oinstall 81 Apr 27 2017 DBdb_ora_9045.trm
-rw-r----- 1 oracle oinstall 59 Apr 27 2017 DBdb_mman_9065.trm
-rw-r----- 1 oracle oinstall 60 Apr 27 2017 DBdb_ora_9084.trm
-rw-r----- 1 oracle oinstall 111 Apr 27 2017 DBdb_ora_9099.trm
.......................................
-rw-r----- 1 oracle oinstall 12306 Dec 9 14:23 DBdb_ora_28113.trm
-rw-r----- 1 oracle oinstall 536798705 Dec 9 14:23 DBdb_ora_28113.trc
[oracle@wang trace]$
[oracle@wang trace]$
[oracle@wang trace]$
[oracle@wang trace]$ grep objn DBdb_ora_28113.trc | head -5
* Rec #0x11 slt: 0x0b objn: 90373(0x00016105) objd: 90373 tblspc: 4(0x00000004)
* Rec #0x10 slt: 0x0b objn: 90373(0x00016105) objd: 90373 tblspc: 4(0x00000004)
* Rec #0xf slt: 0x0b objn: 90373(0x00016105) objd: 90373 tblspc: 4(0x00000004)
* Rec #0xe slt: 0x0b objn: 90373(0x00016105) objd: 90373 tblspc: 4(0x00000004)
* Rec #0xd slt: 0x0b objn: 90373(0x00016105) objd: 90373 tblspc: 4(0x00000004)
[oracle@wang trace]$
[oracle@wang trace]$

可以查询到objn为121192,对应的就是dba_objects的object_id,即mosongtao.rollback_test,正是前边测试的对象。再查询v$session_longops配合username,last_update_time,target,可以大概定位到执行sql_id。

注意:在查询到object_id后手工停掉dump undo block 动作

SQL> alter system dump undo block "_SYSSMU5_898567397$" xid 5 11 7784;

^C^C^C^C^C^C
alter system dump undo block "_SYSSMU5_898567397$" xid 5 11 7784
*
ERROR at line 1:
ORA-01013: user requested cancel of current operation

SQL> SQL> SQL> SQL>


4.查询sql
SQL> select SID,TARGET,SQL_ID,START_TIME,LAST_UPDATE_TIME from v$session_longops where target like '%TEST%' order by LAST_UPDATE_TIME desc;

SID TARGET SQL_ID START_TIME LAST_UPDATE_
---------- ---------------------------------------------------------------- ------------- ------------ ------------
30 HR.TEST 7qqwcq9td6akt 09-DEC-17 09-DEC-17

SQL> select sql_text from v$sql where sql_id='7qqwcq9td6akt';

SQL_TEXT
-------------------------------------------------------------------------
delete test

SQL>








0