设为首页 | 加入收藏 RSS订阅 | 高级搜索 | 收藏本站 | 论坛
用户名: 密码: 验证码: gdcode 注册
首页 | 财税咨询 | 税收实务 | 纳税辅导 | 税收优惠 | 税务策划 | 财税实务 | 财务研究 | 税务知识 | 财软 | 学院 | 下载 | 数据库技术 | 论坛
logo

Oracle中捕获问题SQL解决CPU过渡消耗

时间:2007-06-15 来源: 作者: 【字体: 减小 增大点击: 收藏 | 投稿
  
本文通过实际业务系统中调整的一个案例,试图给出一个常见CPU消耗问题的一个诊断方法.大多数情况下,系统的性能问题都是由不良SQL代码引起的,那么作为DBA,怎样发现和解决这些SQL问题就显得尤为重要.

  本案例平台为UNIX,所以不可避免的应用了一些Unix下常用的工具.如vmstat,top等.

  本文适宜读者范围:中高级.

  系统环境:

   OS:Solaris8

   Oracle:8.1.7.4

  问题描述:

  开发人员报告系统运行缓慢,已经影响业务系统正常使用.请求协助诊断.

  1.登陆数据库主机

  使用vmstat检查,发现CPU资源已经耗尽,大量任务位于运行队列:

bash-2.03$vmstat3
procsmemorypagediskfaultscpu
rbwswapfreeremfpipofrdesrs6s9s1sdinsycsussyid
0005504232146411200000000110429496719600-84-5-145

财管 家园 fs119.net


1310053680721518360566910220001003011791827959730
1310053773281522464817190220001002766801925779640
1300053824001524776676820000000003570853433169730
134005373616152051212710780220001003838958436239640
13600536939215184961079240550000002920857326399730
1320053649121516224635780000000003358794431199730
1290053586481511712189123600000000033661036531359550
129005354528151130412011940000000403235886429119640
1280053468481507704998230000000303189904830749640
1250053412481504704808430220006103563951433149550
1330053327441501112797980000000103218880529029730
12900532538414973681076430220001403184829728799640
1260053631441514320817530000000002533740921649730
1360053556241510512169566786000000103002860028109640
13010535144815029362675801821000000003126781229009640
12900534725614995681559132220000102225807619419820
116005338192149540017711620000000101947778116399730
  2.使用Top命令

  观察进程CPU耗用,发现没有明显过高CPU使用的进程 财软联.盟.fs119.net

$top
lastpid:28313;loadaverages:99.90,117.54,125.7123:28:38
296processes:186sleeping,99running,2zombie,9oncpu
CPUstates:0.0%idle,96.5%user,3.5%kernel,0.0%iowait,0.0%swap
Memory:4096Mreal,1404Mfree,2185Mswapinuse,5114Mswapfree
PIDUSERNAMETHRPRINICESIZERESSTATETIMECPUCOMMAND
27082oracle8i13301328M1309Mrun0:171.29%oracle
26719oracle8i15501327M1306Msleep0:291.11%oracle
28103oracle8i13501327M1304Mrun0:061.10%oracle
28161oracle8i12501327M1305Mrun0:041.10%oracle
26199oracle8i14501328M1309Mrun0:421.10%oracle
26892oracle8i13301328M1310Mrun0:241.09%oracle
27805oracle8i14501327M1306Mcpu/10:101.04%oracle
23800oracle8i12301327M1306Mrun1:281.03%oracle
25197oracle8i13401328M1309Mrun0:571.03%oracle
21593oracle8i13301327M1306Mrun2:121.01%oracle
27616oracle8i14501329M1311Mrun0:141.01%oracle
27821oracle8i14301327M1306Mrun0:101.00%oracle
26517oracle8i13301328M1309Mrun0:330.97%oracle

财管家 园 fs119.net


25785oracle8i14401328M1309Mrun0:460.96%oracle
26241oracle8i14501327M1306Mrun0:420.96%oracle
  3.检查进程数量

bash-2.03$ps-ef|grepora|wc-l
258
bash-2.03$ps-ef|grepora|wc-l
275
bash-2.03$ps-ef|grepora|wc-l
274
bash-2.03$ps-ef|grepora|wc-l
278

bash-2.03$ps-ef|grepora|wc-l
277
bash-2.03$ps-ef|grepora|wc-l
366


  发现系统存在大量Oracle进程,大约在300左右,大量进程消耗了几乎所有CPU资源,而正常情况下Oracle连接数应该在100左右. 财软联.盟.fs119.net

财软.联盟.fs119.net

财,软联盟,fs119.net


财软,联盟,fs119.net

财管家园.fs119.net

  4.检查数据库

  查询v$session_wait获取各进程等待事件

SQL>selectsid,event,p1,p1textfromv$session_wait;
SIDEVENTP1P1TEXT
------------------------------------------------------------------------------------------------------------------
124latchfree1.6144E10address
1pmontimer300duration
2rdbmsipcmessage300timeout
3rdbmsipcmessage300timeout
11rdbmsipcmessage30000timeout
6rdbmsipcmessage180000timeout
4rdbmsipcmessage300timeout
134rdbmsipcmessage6000timeout
147rdbmsipcmessage6000timeout
275rdbmsipcmessage17995timeout
274rdbmsipcmessage6000timeout
SIDEVENTP1P1TEXT
------------------------------------------------------------------------------------------------------------------
118rdbmsipcmessage6000timeout
7bufferbusywaits17file#
56bufferbusywaits17file#
161bufferbusywaits17file#
195bufferbusywaits17file#

财管家园.fs119.net

311bufferbusywaits17file#
314bufferbusywaits17file#
205bufferbusywaits17file#
269bufferbusywaits17file#
200bufferbusywaits17file#
164bufferbusywaits17file#
SIDEVENTP1P1TEXT
------------------------------------------------------------------------------------------------------------------

140bufferbusywaits17file#
66bufferbusywaits17file#
10dbfilesequentialread17file#
18dbfilesequentialread17file#
54dbfilesequentialread17file#
49dbfilesequentialread17file#
48dbfilesequentialread17file#
46dbfilesequentialread17file#
45dbfilesequentialread17file#
35dbfilesequentialread17file#
30dbfilesequentialread17file#
SIDEVENTP1P1TEXT
------------------------------------------------------------------------------------------------------------------
29dbfilesequentialread17file#
22dbfilesequentialread17file#
178dbfilesequentialread17file#
175dbfilesequentialread17file#
171dbfilesequentialread17file#
123dbfilesequentialread17file#
121dbfilesequentialread17file#

财管家.园.fs119.net


120dbfilesequentialread17file#
117dbfilesequentialread17file#
114dbfilesequentialread17file#
113dbfilesequentialread17file#
SIDEVENTP1P1TEXT
------------------------------------------------------------------------------------------------------------------
111dbfilesequentialread17file#
107dbfilesequentialread17file#
80dbfilesequentialread17file#
222dbfilesequentialread17file#
218dbfilesequentialread17file#
216dbfilesequentialread17file#
213dbfilesequentialread17file#
199dbfilesequentialread17file#
198dbfilesequentialread17file#
194dbfilesequentialread17file#
192dbfilesequentialread17file#
SIDEVENTP1P1TEXT
------------------------------------------------------------------------------------------------------------------
188dbfilesequentialread17file#
249dbfilesequentialread17file#

242dbfilesequentialread17file#
239dbfilesequentialread17file#
236dbfilesequentialread17file#
235dbfilesequentialread17file#
234dbfilesequentialread17file# 财管家,园,fs119.net
233dbfilesequentialread17file#
230dbfilesequentialread17file#
227dbfilesequentialread17file#
336dbfilesequentialread17file#
SIDEVENTP1P1TEXT
------------------------------------------------------------------------------------------------------------------
333dbfilesequentialread17file#
331dbfilesequentialread17file#
329dbfilesequentialread17file#
327dbfilesequentialread17file#
325dbfilesequentialread17file#
324dbfilesequentialread17file#
320dbfilesequentialread17file#
318dbfilesequentialread17file#
317dbfilesequentialread17file#
316dbfilesequentialread17file#
313dbfilesequentialread17file#
SIDEVENTP1P1TEXT
------------------------------------------------------------------------------------------------------------------
305dbfilesequentialread17file#
303dbfilesequentialread17file#
301dbfilesequentialread17file#
293dbfilesequentialread17file#
290dbfilesequentialread17file#
288dbfilesequentialread17file#
287dbfilesequentialread17file#
273dbfilesequentialread17file# 财管.家园.fs119.net
271dbfilesequentialread17file#
257dbfilesequentialread17file#
256dbfilesequentialread17file#
SIDEVENTP1P1TEXT
------------------------------------------------------------------------------------------------------------------
254dbfilesequentialread17file#
252dbfilesequentialread17file#
159dbfilesequentialread17file#
153dbfilesequentialread17file#

146dbfilesequentialread17file#
142dbfilesequentialread17file#
135dbfilesequentialread17file#
133dbfilesequentialread17file#
132dbfilesequentialread17file#
126dbfilesequentialread17file#
79dbfilesequentialread17file#
SIDEVENTP1P1TEXT
------------------------------------------------------------------------------------------------------------------
77dbfilesequentialread17file#
72dbfilesequentialread17file#
70dbfilesequentialread17file#
69dbfilesequentialread17file#
67dbfilesequentialread17file#
63dbfilesequentialread17file#
55dbfilesequentialread17file#
102dbfilesequentialread17file#
96dbfilesequentialread17file#
财软联.盟.fs119.net

95dbfilesequentialread17file#
91dbfilesequentialread17file#
SIDEVENTP1P1TEXT
------------------------------------------------------------------------------------------------------------------
81dbfilesequentialread17file#
15dbfilesequentialread17file#
19dbfilescatteredread17file#
50dbfilescatteredread17file#
285dbfilescatteredread17file#
279dbfilescatteredread17file#
255dbfilescatteredread17file#
243dbfilescatteredread17file#
196dbfilescatteredread17file#
187dbfilescatteredread17file#
170dbfilescatteredread17file#
SIDEVENTP1P1TEXT
------------------------------------------------------------------------------------------------------------------
162dbfilescatteredread17file#
138dbfilescatteredread17file#
110dbfilescatteredread17file#
108dbfilescatteredread17file#
92dbfilescatteredread17file#
330dbfilescatteredread17file#

310dbfilescatteredread17file#
302dbfilescatteredread17file#
299dbfilescatteredread17file#
89dbfilescatteredread17file#

财,管家园,fs119.net


5smontimer300sleeptime
SIDEVENTP1P1TEXT
------------------------------------------------------------------------------------------------------------------
20SQL*Netmessagetoclient1952673792driverid
103SQL*Netmessagetoclient1650815232driverid
....
148SQL*Netmoredatafromclient1952673792driverid
291SQL*Netmoredatafromclient1952673792driverid
244rowsselected.
  发现存在大量dbfilescatteredread及dbfilesequentialread等待.显然全表扫描等操作成为系统最严重的性能影响因素.

财.软联盟.fs119.net

财软联,盟,fs119.net

财软联 盟 fs119.net


财管家,园,fs119.net

财管家园.fs119.net

  5.捕获相关SQL

  这里用到了我的以下脚本getsqlbysid:

SELECTsql_text
FROMv$sqltexta
WHEREa.hash_value=(SELECTsql_hash_value
FROMv$sessionb
WHEREb.SID=’&sid’)
ORDERBYpieceASC
/
  该脚本根据用户sid,结合v$session和v$sqltext视图,获得用户sql语句的完整文本。用该脚本,通过从v$session_wait中获得的等待全表或索引扫描的进程SID,捕获问题sql:

SQL>@getsql
Entervalueforsid:18
old5:whereb.sid=’&sid’
new5:whereb.sid=’18’

SQL_TEXT
----------------------------------------------------------------
selecti.vc2title,i.numinfoguidfromhs_infoiwherei.intenab
ledflag=1andi.intpublishstate=1andi.datpublishdate<=
sysdateandi.numcatalogguid=2047orderbyi.datpublishdated

财管家园.fs119.net

esc,i.numorderdesc
SQL>/
Entervalueforsid:54
old5:whereb.sid=’&sid’
new5:whereb.sid=’54’
SQL_TEXT
----------------------------------------------------------------
selecti.vc2title,i.numinfoguidfromhs_infoiwherei.intenab
ledflag=1andi.intpublishstate=1andi.datpublishdate<=
sysdateandi.numcatalogguid=33orderbyi.datpublishdatedes
c,i.numorderdesc
SQL>/
Entervalueforsid:49
old5:whereb.sid=’&sid’
new5:whereb.sid=’49’
SQL_TEXT
----------------------------------------------------------------
selecti.vc2title,i.numinfoguidfromhs_infoiwherei.intenab
ledflag=1andi.intpublishstate=1andi.datpublishdate<=
sysdateandi.numcatalogguid=26orderbyi.datpublishdatedes
c,i.numorderdesc

  对几个进程进行跟踪,分别得到以上SQL语句,这些SQL可能就是问题产生的根源。以上语句如果良好编码应该使用绑定变量.但是现在这个不是我们关心的。使用该应用用户连接,检查以上SQL的执行计划: 财,管家园,fs119.net

SQL>setautotracetraceexplain
SQL>selecti.vc2title,i.numinfoguid
2fromhs_infoiwherei.intenabledflag=1
3andi.intpublishstate=1andi.datpublishdate<=sysdate
4andi.numcatalogguid=3475
5orderbyi.datpublishdatedesc,i.numorderdesc;
ExecutionPlan

----------------------------------------------------------
0SELECTSTATEMENTOptimizer=CHOOSE(Cost=228Card=1Bytes=106)
10SORT(ORDERBY)(Cost=228Card=1Bytes=106)
21TABLEACCESS(FULL)OF’HS_INFO’(Cost=218Card=1Bytes=106)
SQL>selectcount(*)fromhs_info;
COUNT(*)
----------
227404
  以上查询使用了全表扫描,该表这里有22万记录,全表扫描已经不再适合。

  检查该表,存在以下索引:

SQL>selectindex_name,index_typefromuser_indexeswheretable_name=’HS_INFO’; 财管家园.fs119.net
INDEX_NAMEINDEX_TYPE
---------------------------------------------------------
HSIDX_INFO1FUNCTION-BASEDNORMAL
HSIDX_INFO_SEARCHKEYDOMAIN
PK_HS_INFONORMAL
  检查索引键值:

SQL>selectindex_name,column_name
2fromuser_ind_columnswheretable_name=’HS_INFO’;
INDEX_NAMECOLUMN_NAME
--------------------------------------------------
HSIDX_INFO1NUMORDER
HSIDX_INFO1SYS_NC00024$
HSIDX_INFO_SEARCHKEYVC2INDEXWORDS
PK_HS_INFONUMINFOGUID
SQL>deschs_info
NameNull?Type
-------------------------------------------------------------------------------------
NUMINFOGUIDNOTNULLNUMBER(15)
NUMCATALOGGUIDNOTNULLNUMBER(15)
INTTEXTTYPENOTNULLNUMBER(38)
VC2TITLENOTNULLVARCHAR2(60)
VC2AUTHORVARCHAR2(100)

NUMPREVINFOGUIDNUMBER(15)
NUMNEXTINFOGUIDNUMBER(15)
NUMORDERNOTNULLNUMBER(15)
DATPUBLISHDATENOTNULLDATE
INTPUBLISHSTATENOTNULLNUMBER(38) 财管,家园,fs119.net
VC2PUBLISHERIDVARCHAR2(30)
VC2INDEXWORDSVARCHAR2(200)
VC2WAPPREVPATHVARCHAR2(200)
VC2WEBPREVPATHVARCHAR2(200)
VC2WAP2PREVPATHVARCHAR2(200)
NUMVISITEDNOTNULLNUMBER(15)
INTENABLEDFLAGNOTNULLNUMBER(38)
DATCREATETIMENOTNULLDATE
DATMODIFYTIMENOTNULLDATE
VC2NOTESVARCHAR2(1000)
INTINFOTYPENOTNULLNUMBER(38)
VC2PRIZEFLAGVARCHAR2(1)
VC2DESCVARCHAR2(1000)
  6.决定创建新的索引以消除全表扫描

  检查发现在numcatalogguid字段上并没有索引,该字段具有很好的区分度,考虑在该字段创建索引以消除全表扫描。

SQL>createindexhs_info_NUMCATALOGGUIDonhs_info(NUMCATALOGGUID);
Indexcreated.
SQL>setautotracetraceexplain
SQL>selecti.vc2title,i.numinfoguid
2fromhs_infoiwherei.intenabledflag=1
3andi.intpublishstate=1andi.datpublishdate<=sysdate
4andi.numcatalogguid=3475 财管家园,fs119.net
5orderbyi.datpublishdatedesc,i.numorderdesc;
ExecutionPlan
----------------------------------------------------------
0SELECTSTATEMENTOptimizer=CHOOSE(Cost=12Card=1Bytes=106)
10SORT(ORDERBY)(Cost=12Card=1Bytes=106)
21TABLEACCESS(BYINDEXROWID)OF’HS_INFO’(Cost=2Card=1
Bytes=106)
32INDEX(RANGESCAN)OF’HS_INFO_NUMCATALOGGUID’
(NON-UNIQUE)(Cost=1Card=1)
  7.观察系统状况

  原大量等待消失

SQL>selectsid,event,p1,p1textfromv$session_waitwhereeventnotlike’SQL%’;
SIDEVENTP1P1TEXT
------------------------------------------------------------------------------------------------------------------
1pmontimer300duration
2rdbmsipcmessage300timeout
3rdbmsipcmessage300timeout
6rdbmsipcmessage180000timeout
59rdbmsipcmessage6000timeout
118rdbmsipcmessage6000timeout
275rdbmsipcmessage30000timeout
147rdbmsipcmessage6000timeout 财管,家园,fs119.net
62rdbmsipcmessage6000timeout
11rdbmsipcmessage30000timeout
4rdbmsipcmessage300timeout
SIDEVENTP1P1TEXT
------------------------------------------------------------------------------------------------------------------
305dbfilesequentialread17file#
356dbfilesequentialread17file#
19dbfilescatteredread17file#
5smontimer300sleeptime
15rowsselected.
  持续观察的CPU使用情况

bash-2.03$vmstat3
procsmemorypagediskfaultscpu
rbwswapfreeremfpipofrdesrs6s9s1sdinsycsussyid
20005421792150348838434136000000202931779526229190
23105416080150063295734560000000029498057259889110
2600541201614984802101170215500021033019647311690100
2500539491214901602421606560000001031339318285089110
4000539020014881121621393660000000028489080250290100
4000537712014817921361180120220001102846909925939280
3600536321614751681341169530000032028717989262188120 财管家,园,fs119.net
390053489361469160157144821000000000366010062348088120
35005344552146647271556000000002885766326359280

3400534301614654164438677000000003197848629029280
31005331568145969617814911220000003032379461300589110
3100531779214530087671980000000003292873630259370
3120531114414495522351263692200010034739535335788120
25005300240144392010875718220001102377787622749550
190052959041441840503770000000101915659815999810
----以上为创建索引之前部分
----以下为创建索引之后部分,CPU使用率恢复正常
procsmemorypagediskfaultscpu
rbwswapfreeremfpipofrdesrs6s9s1sdinsycsussyid
4010529004014392083153894822000160363113414520661930
01052371921414744731674945000002703264135584941521434
00051636321380608747658510000000102617122913901461241
10050902241348152712607929000000602825124164178501239
10050236721317296714618324000000503166124244745471340
00049558721287136737625816000000302890117774432441244
1004887888125646480962348220000202809120664247451243
财软 联盟 fs119.net

000482891212282003122364135500021024106816349238657
00048568161240168813800000010023144026323234462
0004874176124771208600000000022983930332435263
2004926088127082434560000000000219246942612291655
0005427320151295253694000000320244350853340331255
0005509120155313603700000000023093908332135164
000556204815770001623400000000025075187343335857
00056656721623848252189682200010020916548293934561
000565475216182085173160000000022264218305135460
000572702416511202825400000000021264224298238260
000572318416488809356282200011023715140343238359
000573074416525127177262200010024654442357536361
  至此,此问题得以解决.

  8.性能何以提高?

  回答这个问题似乎是多余的,我只想重申一点:

  有效的降低SQL的逻辑读是SQL优化的基本原则之一,我们来比较一下前后两种执行方式的读取及性能差异。

  全表扫描的性能:

SQL>selecti.vc2title,i.numinfoguid 财软联 盟 fs119.net
2fromhs_infoiwherei.intenabledflag=1

3andi.intpublishstate=1andi.datpublishdate<=sysdate
4andi.numcatalogguid=3475
5orderbyi.datpublishdatedesc,i.numorderdesc;
352rowsselected.
ExecutionPlan
----------------------------------------------------------
0SELECTSTATEMENTOptimizer=CHOOSE(Cost=541Card=1Bytes=106)
10SORT(ORDERBY)(Cost=541Card=1Bytes=106)
21TABLEACCESS(FULL)OF’HS_INFO’(Cost=531Card=1Bytes=106)
Statistics
----------------------------------------------------------
0recursivecalls
25dbblockgets
3499consistentgets
258physicalreads
0redosize
14279bytessentviaSQL*Nettoclient
2222bytesreceivedviaSQL*Netfromclient
25SQL*Netroundtripsto/fromclient
2sorts(memory)
0sorts(disk)
352rowsprocessed
  使用索引的性能:

SQL>selecti.vc2title,i.numinfoguid
2fromhs_infoiwherei.intenabledflag=1
3andi.intpublishstate=1andi.datpublishdate<=sysdate 财管家园,fs119.net
4andi.numcatalogguid=3475
5orderbyi.datpublishdatedesc,i.numorderdesc;
352rowsselected.
ExecutionPlan
----------------------------------------------------------
0SELECTSTATEMENTOptimizer=CHOOSE(Cost=12Card=1Bytes=106)
10SORT(ORDERBY)(Cost=12Card=1Bytes=106)
21TABLEACCESS(BYINDEXROWID)OF’HS_INFO’(Cost=2Card=1
Bytes=106)
32INDEX(RANGESCAN)OF’HS_INFO_NUMCATALOGGUID’
(NON-UNIQUE)(Cost=1Card=1)

Statistics
----------------------------------------------------------
0recursivecalls
0dbblockgets
89consistentgets
0physicalreads
0redosize
14279bytessentviaSQL*Nettoclient
2222bytesreceivedviaSQL*Netfromclient
25SQL*Netroundtripsto/fromclient
1sorts(memory)
0sorts(disk)
352rowsprocessed
  consistentgets从3499到89,我们看到性能得到了巨大的提高。

  结束语:

  通常,开发人员很少注意SQL代码的效率,他们更着眼于功能的实现.至于性能问题通常被认为是次要的,而且在应用系统开发初期,由于数据库数据量较少,对于查询SQL语句等,不容易体会出各种SQL句法的性能差异. 财管,家园,fs119.net

  但是一旦这些应用作为生产系统上线运行,随着数据库中数据量的增加,大量并发访问,系统的响应速度可能就会成为系统需要解决的最主要的问题之一.

  在少量用户下性能可以接受的SQL,可能在大量用户并发的条件下就会成为性能瓶颈。在我这个案例中,开发人员很难相信仅只一条SQL语句就导致了整个数据库的性能下降。

  然而事实就是如此,一条低效的SQL语句就可能毁掉你的数据库,所以在系统设计及开发过程中,你必须考虑到诸多细节,严格的测试也是提早发现问题的有效方法。

  如果不幸以上环节都被忽略,那么,DBA(也许就是你)就是最后的一环,你必须能够快速的诊断并解决各种复杂问题。

财软联盟 fs119.net

财管家园.fs119.net

财,管家园,fs119.net


文章摘自网络,如有侵权,请与我们联系.
数据统计中!!
上一篇:Oracle平台应用数据库系统的设计与开发
下一篇:Oracle与SQLServer在企业应用中的比较

用户名: 密码: 匿名? 注册