Advanced SQL Tuning
by Michael Bryzek (mbryzek@arsdigita.com)
Submitted on: 1999-11-12
Last updated: 1999-11-12
By now you've invested in a great RDBMS and even tried to optimize
some of your queries (see http://www.arsdigita.com/books/sql/tuning.html)
by creating indexes (and possibly moving
tablespaces onto physically separate drives). You find that your
application is still not fast enough. You have three choices:
- Close shop and go on vacation
- Hire a real DBA
- Expand your knowledge of SQL
Though the first option is nice, it's hard to go on vacation if you
don't have a job. And the second option could be quite expensive (not
to mention that good DBAs are hard to find). It's time to look at a couple of ways to tune the
SQL you're using on your tables with hundreds of thousands or
millions of rows. The hope is that a few examples will add a few new
tools to your SQL toolbox to help you tune your own problematic queries.
From here on, we assume you've created your plan table, located the
problematic query, and tried indexing the appropriate
columns. If not, read Philip Greenspun's Indexing and Tuning Page (http://www.arsdigita.com/books/sql/tuning.html)
first. And just in case it's not clear enough, make sure you run:

SQL> analyze table <your table>
     compute statistics for all columns;

to help Oracle's cost-based optimizer do the "right thing."
Here are the steps to further optimize your SQL:
- Copy the problematic query into emacs
- Open a sqlplus session
- Become creative
How can you become creative? I'm not sure, but thanks to the help
of Alpha (one of the guys who actually wrote Intermedia), we were
able to replace a query that took about 10 seconds with one that took
about a tenth of a second. That's pretty creative.
Setting up your sqlplus session
One of the major problems in tuning expensive queries is that it takes
too long to repeatedly run variations of the query. Instead of running
the queries, we focus on reading and understanding the
execution plan. Once we have found an execution plan that we think
looks good, we'll run the query with "set autotrace on" to make sure
that we're not hitting the disk too often.
In general, there are at least two ways to look at the execution plan of a query without running it:
Method 1 (not recommended):
SQL> explain plan for <your query>
SQL> select rtrim(lpad(' ',2*level)||
rtrim(operation)||' '||
rtrim(options)||' '||
object_name) query_plan
from plan_table
connect by prior id=parent_id
start with id=0;
SQL> truncate table plan_table;
(Note that we truncate plan_table only so that the next time we
use explain plan, we read back just the query we ran. I put the select
statement above into a file, format_plan.sql, that I run whenever I
need to format a plan.)
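For reference, here is what such a format_plan.sql can look like; it is just the
select statement above saved to a file, with the truncate tacked on:

-- format_plan.sql: pretty-print the most recently explained query,
-- then clear the plan table for next time
select rtrim(lpad(' ',2*level)||
       rtrim(operation)||' '||
       rtrim(options)||' '||
       object_name) query_plan
from plan_table
connect by prior id=parent_id
start with id=0;

truncate table plan_table;

After each explain plan, you then just type:

SQL> @format_plan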
Method 2 (recommended because of its simplicity):
SQL> set autotrace traceonly explain
SQL> <your query>
Another useful tool for tuning in general is the use of trace files and tkprof
to look at what is going on while your query executes. Example 3 illustrates
the importance of these tools.
SQL Tuning Example 1:
Context
GuideStar is a website focused
on delivering financial information about nonprofits to potential
investors. The financial information is taken from the tax forms that
all non-profits file every year (Form 990 or 990-EZ). We receive
digitized versions of these forms and run a series of math and spelling
checks to ensure that the appropriate numbers add up as they should.
When a form has no more errors, it is approved for public use.
The Tables
I wanted to get back 50 form_ids for forms that had at least one
error in the state "found." Forms are stored in a table called
f990_header with a primary key of form_id and a boolean approved_p that
indicates whether the form is approved. Errors and their
related states are stored in a table called gs_990_errors which has
a primary key pair (form_id, check_id) where check_id refers to the
type of error flagged. Each form_id in f990_header has about 30-40
corresponding rows in gs_990_errors. At the time I was working on this
query, f990_header had about 140,000 rows and gs_990_errors had
about 4.5 million rows.
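To fix ideas, here's a sketch of the relevant schema. Only the table names, the
keys, and the approved_p flag come from the description above; the column types
are assumptions:

create table f990_header (
    form_id     integer primary key,
    approved_p  char(1) not null     -- 'Y' or 'N'; exact representation is an assumption
    -- ... plus the rest of the form's columns
);

create table gs_990_errors (
    form_id     integer references f990_header,
    check_id    integer,             -- the type of error flagged
    error_state varchar2(30),        -- e.g., 'found'; type is an assumption
    primary key (form_id, check_id)
);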
Tuning
The original query looked like this:
SQL> set autotrace traceonly explain
SQL> select h.form_id
from f990_header h, gs_990_errors e
where approved_p='N'
and rownum <= 50
and e.form_id = h.form_id
and e.error_state = 'found';
Execution Plan
----------------------------------------------------------
0 SELECT STATEMENT Optimizer=CHOOSE (Cost=41454 Card=67621 Byt
es=4665849)
1 0 COUNT (STOPKEY)
2 1 MERGE JOIN (Cost=41454 Card=67621 Bytes=4665849)
3 2 SORT (JOIN) (Cost=1818 Card=67621 Bytes=1961009)
4 3 TABLE ACCESS (FULL) OF 'F990_HEADER' (Cost=716 Card=
67621 Bytes=1961009)
5 2 SORT (JOIN) (Cost=39636 Card=1040847 Bytes=41633880)
6 5 TABLE ACCESS (FULL) OF 'GS_990_ERRORS' (Cost=4079 Ca
rd=1040847 Bytes=41633880)
The query plan is about as bad as it gets. I'm doing a full table scan
of f990_header and gs_990_errors. (Note: If you were to actually run the
query with autotrace on, you would see that I'm hitting the disk
over 10,000 times.)
Let's time how long it actually takes to run the query:
SQL> set timing on;
SQL> set autotrace off;
SQL> select h.form_id
from f990_header h, gs_990_errors e
where approved_p='N'
and rownum <= 50
and e.form_id = h.form_id
and e.error_state = 'found';
50 rows selected.
Elapsed: 00:00:09.52
The first step in improving this query is to look at what I'm actually
interested in obtaining: the form_id from f990_header. I don't care at all about any data in
gs_990_errors. I rewrote the query as:
SQL> set autotrace traceonly explain
SQL> set timing off
SQL> select h.form_id
from f990_header h
where h.approved_p='N'
and h.form_id = (select e.form_id
from gs_990_errors e
where rownum < 2
and e.form_id = h.form_id
and e.error_state = 'found')
and rownum <= 50;
Execution Plan
----------------------------------------------------------
0 SELECT STATEMENT Optimizer=CHOOSE (Cost=716 Card=1 Bytes=42)
1 0 COUNT (STOPKEY)
2 1 FILTER
3 2 TABLE ACCESS (FULL) OF 'F990_HEADER' (Cost=716 Card=1
Bytes=42)
4 2 COUNT (STOPKEY)
5 4 TABLE ACCESS (BY INDEX ROWID) OF 'GS_990_ERRORS' (Co
st=4 Card=1 Bytes=45)
6 5 INDEX (RANGE SCAN) OF 'GS_990_ERRORS_FORM_CHECK_ID
X' (NON-UNIQUE) (Cost=3 Card=1)
That's quite a bit better. We got rid of the full table scan on
gs_990_errors and are now simply hitting the index on
gs_990_errors(form_id, check_id). Could we do better? Maybe.
I created an index on approved_p, naively hoping that the query would
use it (I even gave it a hint to do so; see the sketch below). However, over 40% of the
rows in f990_header have approved_p='N',
which means the full table scan will be more efficient than the index
scan. The optimizer is doing the right thing here.
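For the curious, the hint I gave looked something like this (the index name is
hypothetical; yours will be whatever you called the index on approved_p):

select /*+ INDEX(h f990_header_approved_idx) */ h.form_id
from f990_header h
where h.approved_p = 'N'
and rownum <= 50;

With 40% of the rows matching, fetching rows one at a time by index rowid costs
more than scanning the table straight through, so the full scan wins.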
My friend Oumi pointed out that
we could simplify the query a bit since we don't even need to return
e.form_id from the inner select:
SQL> select h.form_id
from f990_header h
where h.approved_p='N'
and exists (select 1
from gs_990_errors e
where rownum < 2
and e.form_id = h.form_id
and e.error_state = 'found')
and rownum <= 50;
50 rows selected.
Elapsed: 00:00:00.01
From almost 10 seconds to .01 - that's about 1,000 times faster!
Here's the intuition behind this example: Only join two tables when
you really need data from both tables. Otherwise, a faster query
using a subquery probably exists. I found that this general rule works
best when all tables being accessed are fairly large (greater than 50,000 rows)
and one of the tables is at least five times larger than the rest.
It's worth noting that our example is somewhat unusual in that we are only accessing
fifty rows of gs_990_errors, meaning that we really win big over a full table scan
of gs_990_errors. I tested a query that counted the total number of forms with an
error in the state "found," forcing a full table scan in both cases. Using a standard
join, the query took 17.65 seconds while the subquery method took 6.58 seconds. Not the
1,000-fold improvement we saw above, but still almost three times faster!
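The counting queries I compared looked roughly like this (a reconstruction, since
the original statements aren't shown here):

-- standard join
select count(distinct h.form_id)
from f990_header h, gs_990_errors e
where e.form_id = h.form_id
and e.error_state = 'found';

-- subquery method
select count(*)
from f990_header h
where exists (select 1
              from gs_990_errors e
              where e.form_id = h.form_id
              and e.error_state = 'found');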
SQL Tuning Example 2:
Context
Another friend of mine, also named Michael, was tuning a query for
a bonds site. The idea was to identify a particular bond and the
number of positions a user (or users) had executed in that bond.
In this example, we are actually tuning views. Michael
later told me that the query could also have been rewritten using the base
tables to solve the problem we had. In any case, it's a good example of
SQL tuning.
The Views
The bonds are stored in a view called bonds with primary key bond_id.
Positions are stored in orders_executable. Some bonds may not have any
corresponding rows in the orders_executable view.
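The view definitions themselves aren't shown here, but to fix ideas, here is a
guess at the shape of orders_executable based on the execution plans below (the
column list and the status filter are pure assumptions):

create or replace view orders_executable as
select order_id, bond_id
from orders
where order_state = 'executable';   -- assumed filter; the plans below hit ORDER_STATUS_IDX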
Tuning
The original query looked like this:
SQL> set autotrace traceonly explain
SQL> select b.bond_id, sum(decode(o.order_id, null, 0, 1)) as n_orders
from bonds b, orders_executable o
where b.moody_long_term_rtg_sort >= (select ordinal_num
from moody_ratings
where code='AAA')
and b.moody_long_term_rtg_sort <= (select ordinal_num
from moody_ratings
where code='A')
and b.bond_id = o.bond_id(+)
group by b.bond_id;
Execution Plan
----------------------------------------------------------
0 SELECT STATEMENT Optimizer=CHOOSE
1 0 SORT (GROUP BY)
2 1 FILTER
3 2 MERGE JOIN (OUTER)
4 3 SORT (JOIN)
5 4 VIEW OF 'BONDS'
6 5 NESTED LOOPS (OUTER)
7 6 TABLE ACCESS (FULL) OF 'BOND_DESCRIPTIONS'
8 6 INDEX (UNIQUE SCAN) OF 'SYS_C0057002' (UNIQUE)
9 3 SORT (JOIN)
10 9 VIEW OF 'ORDERS_EXECUTABLE'
11 10 TABLE ACCESS (BY INDEX ROWID) OF 'ORDERS'
12 11 INDEX (RANGE SCAN) OF 'ORDER_STATUS_IDX' (NON-UNIQUE)
13 2 TABLE ACCESS (BY INDEX ROWID) OF 'MOODY_RATINGS'
14 13 INDEX (UNIQUE SCAN) OF 'SYS_C0057182' (UNIQUE)
15 2 TABLE ACCESS (BY INDEX ROWID) OF 'MOODY_RATINGS'
16 15 INDEX (UNIQUE SCAN) OF 'SYS_C0057182' (UNIQUE)
In line 7 of the plan you see that we are doing a full table scan of the
bond_descriptions table. To tune this query, we:
- Identify what information we need: bond_id and the number of orders
- Identify the minimum set of tables we need to get that information: The
number of orders definitely must come from
orders_executable table. However, bond_id can come from either orders_executable
or bonds. Our first instinct would be to eliminate using the bonds table at all.
This doesn't work here since we need the outer join on orders_executable to get
the bond_id's which do not have any corresponding rows in orders_executable.
- Try to replace a full table scan with a smaller view of the data that we really need:
In our case, we are only using a subset of the bonds table (namely those bonds whose
moody_ratings fall in some range).
We rewrote the query by replacing bonds with a "view created on the fly":
select b.bond_id, sum(decode(o.order_id, null, 0, 1)) as n_orders
from (select bond_id from bonds
where moody_long_term_rtg_sort >= (select ordinal_num from
moody_ratings where code='AAA')
and moody_long_term_rtg_sort <= (select ordinal_num from
moody_ratings where code='A')) b, orders_executable o
where b.bond_id = o.bond_id(+)
group by b.bond_id;
Execution Plan
----------------------------------------------------------
0 SELECT STATEMENT Optimizer=CHOOSE
1 0 SORT (GROUP BY)
2 1 MERGE JOIN (OUTER)
3 2 SORT (JOIN)
4 3 VIEW
5 4 NESTED LOOPS (OUTER)
6 5 TABLE ACCESS (BY INDEX ROWID) OF 'BOND_DESCRIPTIONS'
7 6 INDEX (RANGE SCAN) OF 'BOND_MOODY_LT_RTG_SORT_IDX' (NON-UNIQUE)
8 7 TABLE ACCESS (BY INDEX ROWID) OF 'MOODY_RATINGS'
9 8 INDEX (UNIQUE SCAN) OF 'SYS_C0057182' (UNIQUE)
10 5 INDEX (UNIQUE SCAN) OF 'SYS_C0057002' (UNIQUE)
11 2 SORT (JOIN)
12 11 VIEW OF 'ORDERS_EXECUTABLE'
13 12 TABLE ACCESS (BY INDEX ROWID) OF 'ORDERS'
14 13 INDEX (RANGE SCAN) OF 'ORDER_STATUS_IDX' (NON-UNIQUE)
The full table scan we previously saw in line 7 of the plan is now replaced with index lookups to
create the view that we then join with orders_executable.
Did we really win? In this case, yes! The full table scan was our bottleneck. However,
keep in mind that this method of optimizing a SQL query might not help much if the
view you create on the fly contains a good portion of the rows in your table.
Example 3: Tuning With Trace Files and tkprof
Sometimes it's hard to understand why a particular query, procedure, or function
is taking so long to execute. The best solution here is to create a trace file
to log what's going on during execution and to feed that trace file to tkprof
which formats and sorts the trace file.
I use a Site-wide Search table (see http://www.arsdigita.com/doc/site-wide-search ) named search to allow users to search all the content
on GuideStar ( http://www.guidestar.org ), and I created a pl/sql procedure
that updates one row of the search table. This procedure is not very efficient:
it takes over eight seconds per row on average. Multiply that by 700,000 rows
and it would take
about two months to update every row. Let's find out why this procedure takes
so long to execute. Note that autotrace doesn't give us any information
about the execution of pl/sql procedures (at least I don't know how to get it!).
The first step is to make sure timing is on inside Oracle (this ensures that the trace
file will contain information regarding execution time):
- Open up the Oracle init file ($ORACLE_BASE/admin/ora8/pfile/initora8.ini)
in your favorite editor
- Make sure it says somewhere "timed_statistics = true". If timed_statistics is set
to false, set it to true, shut down the database, and start it back up.
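If bouncing the database is inconvenient, you can also turn on timed statistics
for just your session; this is a standard alternative:

SQL> alter session set timed_statistics = true;

With timing in place, turn on SQL trace for your session and run the procedure: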
SQL> alter session set sql_trace=true;
Session altered.
SQL> BEGIN gs_update_search_for_ein('25-0965554'); END;
2 /
PL/SQL procedure successfully completed.
SQL> alter session set sql_trace=false;
Session altered.
Now we look at the trace file:
- From a unix shell as the oracle user, go to the trace files
directory ($ORACLE_BASE/admin/ora8/udump)
- Figure out which trace file was generated (use ls -lrt *.trc to sort all trace
files in ascending order of time last modified. The last file is probably the one you want)
- Run tkprof (my trace file was called ora8_ora_24611.trc):
> tkprof ora8_ora_24611.trc ora8_ora_24611.prf sort=prsdsk,exedsk,fchdsk
About the sort arguments to tkprof (each sorts the statements by disk reads during a phase):
- prsdsk - disk reads during parse (parsing your SQL)
- exedsk - disk reads during execute (opening the cursor)
- fchdsk - disk reads during fetch (walking down the cursor)
- Look at the file ora8_ora_24611.prf:
UPDATE SEARCH SET GIF_P=:b1,FIN_P=:b2,BMF_P=:b3,F990_P=:b4,PDF_P=:b5,FORM_ID=
:b6,ADDRESS=:b7,DESCRIPTION=:b8,URL=:b9,ORG_NAME=:b10,UPPER_ORG_NAME=
UPPER(:b10),ZIP=:b12,CITY=:b13,UPPER_CITY=UPPER(:b13),STATE=:b15,LONGITUDE=
:b16,LATITUDE=:b17,NTEE_CD_BMF=:b18,NTEE_CD_1=:b19,NTEE_CD_2=:b20,NTEE_CD_3=
:b21,NTEE_CD_BMF_ALL=:b22,NTEE_CD_1_ALL=:b23,NTEE_CD_2_ALL=:b24,
NTEE_CD_3_ALL=:b25,NTEE_CD_BMF_RANGE=:b26,NTEE_CD_1_RANGE=:b27,
NTEE_CD_2_RANGE=:b28,NTEE_CD_3_RANGE=:b29,KEYWORDS=EMPTY_CLOB(),
REVENUE_CODE=:b30,FOUNDN_IRS=:b31,DATA_QUALITY=:b32,PROCESSED_P='t'
WHERE
EIN = :b33
call count cpu elapsed disk query current rows
------- ------ -------- ---------- ---------- ---------- ---------- ----------
Parse 1 0.01 0.01 0 0 0 0
Execute 1 0.63 6.68 3126 3866 364 1
Fetch 0 0.00 0.00 0 0 0 0
------- ------ -------- ---------- ---------- ---------- ---------- ----------
total 2 0.64 6.69 3126 3866 364 1
In this case, the most expensive query was the update on the table search:
it took over 3,000 disk accesses! It turns out that every column
being updated is itself indexed, and several pairs of columns are indexed as well.
There are over 25 indexes on this table, all of which must be maintained on every DML statement.
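If you want to check how heavily one of your own tables is indexed, the standard
Oracle data dictionary views suffice (the table name here comes from this example):

SQL> select index_name, column_name
     from user_ind_columns
     where table_name = 'SEARCH'
     order by index_name, column_position;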
Knowing that the problem was disk I/O, I:
- Separated my tablespaces over 4 disks
- Ran several instances of the above procedure in parallel
The point here is that usually the hardest part of tuning is
finding the bottleneck - SQL trace files and tkprof can help.
asj-editors@arsdigita.com