近期一些典型的Case
招行问题分析及建议
针对近期工作中出现的问题,我选取了一些典型案例并加上自己的分析。
1. ftpdbn1-deadlock问题:
通过打开event monitor抓取的数据,发现是由一个存储过程大量执行同一个表的delete语句导致的死锁。存储过程定义如下:
CREATEPROCEDURE CIFDBO.SP_FRM_E_FMEVRFCA_LOG_ADD_V1 (
IN pMDL_INS_COD CIFDBO.MDL_INS_COD_ARRAY,
......
IN pREC_CNT CIFDBO.INT_ARRAY,
OUT pRtCode INTEGER
)
SPECIFIC SP_FRM_E_FMEVRFCA_LOG_ADD_V1
P1: BEGIN
DECLARE vEVT_ID CHAR(20);
DECLARE i, n INTEGER;
DECLARE vTimestamp TIMESTAMP;
DECLARE vDateDelete TIMESTAMP;
DECLARE SQLCODE INTEGER;
DECLAREEXIT HANDLER FOR SQLEXCEPTION
BEGIN
SET pRtCode = SQLCODE;
END;
SET pRtCode =-1;
SET vTimestamp =CURRENT TIMESTAMP;
SET vDateDelete = vTimestamp -3 MONTHS;
DELETEFROM FRM.FMEVRFCA_LOG WHERE CRT_TIM < vDateDelete;
SET n = CARDINALITY(pTBL_NAM);
SET i =1;
WHILE (i <= n) DO
INSERTINTO FRM.FMEVRFCA_LOG
(
MDL_INS_COD ,
SEQ_NO ,
TBL_NAM ,
OPR_COD ,
UPD_TIM ,
REM_KEY ,
REC_CNT ,
CRT_TIM
)
VALUES
(
pMDL_INS_COD[i],
pSEQ_NO[i] ,
pTBL_NAM[i] ,
pOPR_COD[i] ,
pUPD_TIM[i] ,
pREM_KEY[i] ,
pREC_CNT[i] ,
vTimestamp
);
SET i = i +1;
END WHILE;
SET pRtCode =0;
END P1
优化建议: 为避免将delete语句移出存储过程,由一个单独的task定期执行!
2. osfdb01-ORSDB 锁升级
造成锁升级的SQL语句(锁升级发生在表OWK.EMP_STAFF_ORG上):
SELECT tsk.RSP_TSK_CODE, tsk.PRJ_PUB_CODE, tsk.RSP_PRJ_CODE, prj.RSP_PRJ_NAME, tsk.RSM_ID, tsk.RSM_NAME, tsk.RSM_ORG, prj.RSP_FREQ, pub.BRANCH_ID,pub.BRANCH_NAME, tsk.CHK_OBJ, tsk.ATT_NAME, tsk.CHK_DATE, tsk.PLM_RSV, tsk.REMARK, tsk.DUE_DATE, tsk.UPDATE_USER, (case when (tsk.RSP_TSK_STATUS ='FINISHED'or tsk.RSP_TSK_STATUS='ODFINISHED') then tsk.UPDATE_TIME elsenullend) as UPDATE_TIME, tsk.RSP_TSK_STATUS
FROM OBS.RSP_TASK tsk left join OBS.RSP_PRJ_PUB pub on tsk.PRJ_PUB_CODE = pub.PRJ_PUB_CODE left join OBS.RSP_PRJ prj on tsk.RSP_PRJ_CODE = prj.RSP_PRJ_CODE
WHERE 1=1AND tsk.RSM_ID IN (
select sta_id from OWK.EMP_STAFF where sta_id='157495'union
select sta.sta_id from OWK.EMP_STAFF_ORG sta where sta.MFLAG=1andexists (select1from OWK.EMP_ORG org where sta.ORG_CODE = org.ORG_CODE and org.PTH like'100001/105990/106075%') union
select sta.sta_id from OWK.EMP_STAFF_ORG sta where sta.MFLAG=1andexists (select1from OWK.EMP_ORG org where sta.ORG_CODE = org.ORG_CODE and org.PTH like'100001/105990/106110%') union
select sta.sta_id from OWK.EMP_STAFF_ORG sta where sta.MFLAG=1andexists (select1from OWK.EMP_ORG org where sta.ORG_CODE = org.ORG_CODE and org.PTH like'100001/105990/112405%') union
select sta.sta_id from OWK.EMP_STAFF_ORG sta where sta.MFLAG=1andexists (select1from OWK.EMP_ORG org where sta.ORG_CODE = org.ORG_CODE and org.PTH like'100001/105990/116955%') union
select sta.sta_id from OWK.EMP_STAFF_ORG sta where sta.MFLAG=1andexists (select1from OWK.EMP_ORG org where sta.ORG_CODE = org.ORG_CODE and org.PTH like'100001/105990/106085%') union
select sta.sta_id from OWK.EMP_STAFF_ORG sta where sta.MFLAG=1andexists (select1from OWK.EMP_ORG org where sta.ORG_CODE = org.ORG_CODE and org.PTH like'100001/105990/106095%') )
AND tsk.DUE_DATE >='2017-07-01'
AND tsk.DUE_DATE <='2017-07-31'
ORDERBYYEAR(tsk.CREATE_TIME) DESC, pub.BRANCH_ID DESC
抓出关于表OWK.EMP_STAFF_ORG部分的SQL
select sta_id from OWK.EMP_STAFF where sta_id='157495'union
select sta.sta_id from OWK.EMP_STAFF_ORG sta where sta.MFLAG=1andexists
(select1from OWK.EMP_ORG org where sta.ORG_CODE = org.ORG_CODE and org.PTH like'100001/105990/106075%') union
select sta.sta_id from OWK.EMP_STAFF_ORG sta where sta.MFLAG=1andexists
(select1from OWK.EMP_ORG org where sta.ORG_CODE = org.ORG_CODE and org.PTH like'100001/105990/106110%') union
select sta.sta_id from OWK.EMP_STAFF_ORG sta where sta.MFLAG=1andexists
(select1from OWK.EMP_ORG org where sta.ORG_CODE = org.ORG_CODE and org.PTH like'100001/105990/112405%') union
select sta.sta_id from OWK.EMP_STAFF_ORG sta where sta.MFLAG=1andexists
(select1from OWK.EMP_ORG org where sta.ORG_CODE = org.ORG_CODE and org.PTH like'100001/105990/116955%') union
select sta.sta_id from OWK.EMP_STAFF_ORG sta where sta.MFLAG=1andexists
(select1from OWK.EMP_ORG org where sta.ORG_CODE = org.ORG_CODE and org.PTH like'100001/105990/106085%') union
select sta.sta_id from OWK.EMP_STAFF_ORG sta where sta.MFLAG=1andexists
(select1from OWK.EMP_ORG org where sta.ORG_CODE = org.ORG_CODE and org.PTH like'100001/105990/106095%')
用db2expln得到它的执行计划:
Optimizer Plan:
Rows
Operator
(ID)
Cost
6.79546
RETURN
( 1)
6214.79
|
6.79546
TBSCAN
( 2)
6214.79
|
6.79546
SORT
( 3)
6214.79
|
41.2667
UNION
( 4)
6214.77
+-----------------------+------------------------+-----------------+-----+-----------------------+---------------+-------------------+
6.79546 6.28938 6.79546 6.79546 6.79546 1 6.79546
HSJOIN HSJOIN HSJOIN HSJOIN HSJOIN IXSCAN HSJOIN
( 5) ( 8) (11) (14) (17) (20) (21)
1034.53 1034.53 1034.53 1034.53 1034.53 7.58089 1034.53
/ \ / \ / \ / \ / \ | / \
102296 1.08021 102296 0.999762 102296 1.08021 102296 1.08021 102296 1.08021 101993 102296 1.08021
TBSCAN TBSCAN TBSCAN TBSCAN TBSCAN TBSCAN TBSCAN TBSCAN TBSCAN TBSCAN Index: TBSCAN TBSCAN
( 6) ( 7) ( 9) (10) (12) (13) (15) (16) (18) (19) OWK (22) (23)
293.6 735.957 293.6 735.957 293.6 735.957 293.6 735.957 293.6 735.957 P_STAFF_ID 293.6 735.957
| | | | | | | | | | | |
102296 16261 102296 16261 102296 16261 102296 16261 102296 16261 102296 16261
Table: Table: Table: Table: Table: Table: Table: Table: Table: Table: Table: Table:
OWK OWK OWK OWK OWK OWK OWK OWK OWK OWK OWK OWK
EMP_STAFF_ORG EMP_ORG EMP_STAFF_ORG EMP_ORG EMP_STAFF_ORG EMP_ORG EMP_STAFF_ORG EMP_ORG EMP_STAFF_ORG EMP_ORG EMP_STAFF_ORG EMP_ORG
Estimated Cost = 6214.787109
将SQL改编如下:
select sta_id from OWK.EMP_STAFF where sta_id='157495'union
select sta.sta_id from OWK.EMP_STAFF_ORG sta where sta.MFLAG=1andexists
(select1from OWK.EMP_ORG org where sta.ORG_CODE = org.ORG_CODE and
substr(org.PTH,1,20) in ('100001/105990/106075','100001/105990/106110','100001/105990/112405','100001/105990/116955','100001/105990/106085','100001/105990/106095'))
用db2expln得到新SQL的执行计划:
Optimizer Plan:
Rows
Operator
(ID)
Cost
40.2545
RETURN
( 1)
1041.11
|
40.2545
TBSCAN
( 2)
1041.11
|
40.2545
SORT
( 3)
1041.11
|
41.2545
UNION
( 4)
1041.09
/ \
40.2545 1
HSJOIN IXSCAN
( 5) ( 8)
1033.51 7.58089
/ \ |
102296 6.39886 101993
TBSCAN TBSCAN Index:
( 6) ( 7) OWK
293.6 734.935 P_STAFF_ID
| |
102296 16261
Table: Table:
OWK OWK
EMP_STAFF_ORG EMP_ORG
Estimated Cardinality = 40.254482
简单改写后,返回的结果集不变,但效率提高了很多。可见SQL语句
3. didisrvdb02-逻辑读高问题:
逻辑读高的SQL:
update (select TRN_STATUS,REAL_SERIAL
from DPAY.TAB_OUSYS_INFO1
where TRN_STATUS=:L0 and CUST_ACCNO=:L1 and REAL_SERIAL=:L2 and ID>=:L3 and ID<=:L4
orderby id ascfetch first 5000 rows only)
set TRN_STATUS=:L5 ,REAL_SERIAL=:L6
查看表 DPAY.TAB_OUSYS_INFO1上的索引:
索引名 索引包含的列
SQL160106192202630 +MERCH_DATE+MERCH_SERIAL
OUSYS1_INDEX_1 +ID+TRN_BATCH
OUSYS1_INDEX_2 +TRN_STATUS+CUST_ACCNO+REAL_SERIAL+SEND_FLAG
OUSYS1_INDEX_3 +TRN_STATUS+MERCH_DATE+ID
OUSYS1_INDEX_4 +ID
表的行数及索引的键值情况如下:
db2 "select count(1) from DPAY.TAB_OUSYS_INFO1"
2685595
db2 "select count(distinct SEND_FLAG) from DPAY.TAB_OUSYS_INFO1"
1
db2 "select count(distinct TRN_STATUS) from DPAY.TAB_OUSYS_INFO1"
4
db2 "select count(distinct CUST_ACCNO) from DPAY.TAB_OUSYS_INFO1"
1
db2 "select count(distinct REAL_SERIAL) from DPAY.TAB_OUSYS_INFO1"
1247
db2 "select count(distinct ID) from DPAY.TAB_OUSYS_INFO1"
2685074
db2 "select count(distinct MERCH_DATE) from DPAY.TAB_OUSYS_INFO1"
6
db2 "select count(distinct MERCH_SERIAL) from DPAY.TAB_OUSYS_INFO1"
2685339
从以上的数据可以看出,此表上的索引建立的很不合理。根据索引建立原则,我们应该选择强键值列作为索引列,而且越强的越要越放在前面,所以此表上的索引应该做如下优化:
l 对索引SQL160106192202630进行改造,使索引包含的列为+MERCH_SERIAL 或+MERCH_SERIAL+MERCH_DATE,这样根据MERCH_SERIAL检索时也可以用到此索引。
l 索引OUSYS1_INDEX_1可以保留,但意义不大,因为ID已经是强键值列。
l 对索引OUSYS1_INDEX_2进行改造,只保留REAL_SERIAL列,至少也应该将REAL_SERIAL列放在最前面。
l 索引OUSYS1_INDEX_3可以删除,至少也应该将ID列放在最前面。
l 保留OUSYS1_INDEX_4索引。
数据库活动时间为2016-01-06-20.19.41:
db2 "SELECT SUBSTR(DB_NAME, 1, 20) AS DB_NAME, DB_STATUS, SERVER_PLATFORM, DB_LOCATION, DB_CONN_TIME FROM SYSIBMADM.SNAPDB"
DB_NAME DB_STATUS SERVER_PLATFORM DB_LOCATION DB_CONN_TIME
-------------------------------------------------------------------------------------------------------------------------------- ---------------- --------------- ------------ --------------------------
DIDIPRI ACTIVE AIX64 LOCAL 2016-01-06-20.19.41.644178
检查数据库的索引使用情况:
db2 "SELECT VARCHAR(S.INDSCHEMA, 20) AS INDSCHEMA,VARCHAR(S.INDNAME, 20) AS INDNAME,T.DATA_PARTITION_ID, T.MEMBER,T.INDEX_SCANS,T.INDEX_ONLY_SCANS FROM TABLE(MON_GET_INDEX(null,null, -2)) as T, SYSCAT.INDEXES AS S WHERE T.TABSCHEMA = S.TABSCHEMA AND T.TABNAME = S.TABNAME AND T.IID = S.IID and S.INDSCHEMA not like 'SYS%' ORDER BY INDEX_SCANS DESC"|more
数据库中非系统索引共336个:
db2 "SELECT count(*) FROM TABLE(MON_GET_INDEX(null,null, -2)) as T, SYSCAT.INDEXES AS S WHERE T.TABSCHEMA = S.TABSCHEMA AND T.TABNAME = S.TABNAME AND T.IID = S.IID and S.INDSCHEMA not like 'SYS%'"
336
有99个索引自数据库激活以来从未使用:
db2 "SELECT count(*) FROM TABLE(MON_GET_INDEX(null,null, -2)) as T, SYSCAT.INDEXES AS S WHERE T.TABSCHEMA = S.TABSCHEMA AND T.TABNAME = S.TABNAME AND T.IID = S.IID and S.INDSCHEMA not like 'SYS%' and T.INDEX_SCANS=0"
99
4. 问题分析及建议:
根据以上的问题分析,应用方面存在如下几个问题:
l 只考虑功能的实现,对是否会造成锁竞争,SQL语句执行效率是否底下考虑不足。
l 数据库存在大量的无效索引和不合理的索引。
几点建议如下:
l 在数据库报告中增加"从未使用的索引"项,并考虑删除以节省空间和提高效率。
l 对开发人员进行锁机制,SQL优化,建立高效索引方面的培训,以从源头解决问题,减少运维压力。