Tips for Using Oracle
(mostly Oracle 8 but also 7.3), by Philip Greenspun (philg@mit.edu)
ArsDigita Systems Journal
This is still a useful document but it is old (mostly
1998) and some of the advice is irrelevant in newer versions of Oracle.
I set up this page as a place to vent my rage against Oracle. It
ended up being useful to share experience but not much rage. Why not?
Oracle lets me sleep at night. With some other RDBMSes (notably
Illustra, a prematurely commercialized version of Postgres), I used to
have to manually unwedge my deadlocked database at 3 am, 11 am, 1
pm,... you get the idea. In Oracle, readers never wait for writers or
vice versa. I can serve my 18,000 daily users. I can EXPORT a
consistent backup dump. I can update a 20,000 row table. I can
rebuild an index. All simultaneously.
So there are lots of things about Oracle that I wish were different.
There are lots of things that have wasted my time. There are lots of
things that were painful to set up. However, once set up properly,
Oracle seems to be very reliable and well-behaved.
Inspiration: http://www.scorecard.org. This is a
server that gets 30 requests/second at peak hours (it was a top story on
ABC News, in Newsweek, in the New York Times, etc.). Each request does
between 1 and 5 Oracle queries. It just about uses up an old pizza-box
size Unix machine (Sun Ultra 2 with dual 167 MHz CPUs). If you use
Oracle 8 intelligently (and connect to Oracle 8 intelligently;
we use AOLserver), there is no scalability problem for Internet sites
(though you can create one and increase your job security by purchasing
an application server).
Hire the best dba that you can
If you take a smart MIT kid and have him install Oracle on your machine
he will very likely accept the default settings for most parameters and
tablespace sizes. When you're done, you'll have a nice setup for a
10-year-old computer with a 250 MB disk drive. I.e., you'll have a
$150,000 computer with 50 GB of disk space yet you won't be able to
build an index on a 20,000-row table because a rollback segment isn't
large enough and won't autoextend to use any of the 7 GB left on that
particular disk drive. You'll have the wrong block size (see below) and
the data and indices won't be spread across the disks properly. Most
operations will be taking five times as long as they should.
Fire your dba ...
Fire your dba if he or she can't make your Oracle installation pass the
following acceptance tests:
Lay in a big stock of books
You'll probably want to buy all of the Oracle
books that I recommend from my bookshelf and then some. You also
want to clear out about 100 MB of disk space on your Web server. Load
the entire contents of the Oracle documentation CD-ROM (which is in
HTML) onto your hard disk and make it available via the Web to your
developers. I can't for the life of me figure out why this isn't simply
available at www.oracle.com but it isn't.
Dating and Oracle
As the author of Using the Internet to
Pick up Babes and/or Hunks, I've spent quite a bit of time trying to
figure out how to get more dates using my Oracle knowledge. I was doing
pretty well with the line "I am an SAP consultant making $7000/day" but
women in bars refused to go out with me when I couldn't answer the
following question: "Why doesn't Oracle support ANSI SQL datetime
datatypes?"
With an ANSI-standard database, you have DATE
and
TIMESTAMP
data types and a whole bunch of interval types.
With an ANSI RDBMS, you can store time with precision down to
microseconds (or optionally even finer, but the standard mandates that
implementations hold 6 significant digits after the second). ANSI
timestamps look like "1963-09-28 23:05:14" (September 28, 1963 at 11:05
pm and 14 seconds).
With Oracle, you have only one datetime datatype: DATE
.
But it is actually the same granularity of information as an
ANSI-standard TIMESTAMP(0)
, i.e., it stores the time down
to single-second precision. That wouldn't be so bad except that Oracle
SQL does not implement any of the ANSI syntax or functions. By default,
Oracle would say "28-SEP-63". Note that this is an invitation to Year
2000 bugs as well as a nightmare for people porting systems from other
RDBMS products. Note further that Oracle dates will not sort
lexicographically. If you start naming files with an Oracle date in the
filename, they won't sort in any meaningful order.
A good general rule with Oracle is that every parameter in the default
Oracle installation is wrong. Date format is no exception. Here's what
you should have for starters in your $ORACLE_HOME/dbs/initora8.ora:
# Use ANSI date format by default!
nls_date_format = "YYYY-MM-DD"
Alternatively, you can decide that your entire database will really
behave like TIMESTAMP(0) and use "YYYY-MM-DD HH24:MI:SS".
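If you can't touch the init file (or just want to experiment), the same default can be set for a single session. This is standard Oracle syntax, though using it here is my suggestion rather than anything from the original setup:

```sql
-- set the ANSI-style date format for the current session only,
-- without touching $ORACLE_HOME/dbs/initora8.ora
alter session set nls_date_format = 'YYYY-MM-DD HH24:MI:SS';
```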
Remember that Oracle's "hidden precision" can make queries fail in
unexpected ways. For example, if you ask Oracle to print out the date,
you'll get precision down to the day:
SQL> select sysdate from dual;
SYSDATE
----------
1998-07-12
However, the internal precision will make naive comparisons fail:
create table test_dates (
the_value varchar(20),
the_date date
);
insert into test_dates values ('happy',sysdate);
insert into test_dates values ('happy',sysdate);
insert into test_dates values ('joy',sysdate);
insert into test_dates values ('joy',sysdate);
*** brief pause ***
SQL> select * from test_dates where the_date = sysdate;
no rows selected
We didn't get any rows because none matched down to the second. Are we
stuck? No. We can rely on the fact that Oracle stores dates as
numbers:
SQL> select * from test_dates where trunc(the_date) = trunc(sysdate);
THE_VALUE THE_DATE
-------------------- ----------
happy 1998-07-12
happy 1998-07-12
joy 1998-07-12
joy 1998-07-12
An arguably cleaner approach is to use to_char
:
select * from test_dates where to_char(the_date,'YYYY-MM-DD') = to_char(sysdate,'YYYY-MM-DD');
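Keep in mind that wrapping the column in trunc or to_char defeats any index on the_date. A sketch of an index-friendly way to ask the same question (my addition; compare against a half-open range instead of truncating the column):

```sql
-- matches everything from midnight today up to (but not including)
-- midnight tomorrow; an index on the_date remains usable
select *
from test_dates
where the_date >= trunc(sysdate)
and the_date < trunc(sysdate) + 1;
```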
How to rename a column in Oracle
Renaming or removing a column from an Oracle table is easy. If you type
alter table phone_directory rename column name to full_name;
(a statement that works nicely in some other RDBMSes), then you get a
very unpleasant error message. It turns out that the easiest (only) way
to rename a column is
drop table phone_directory;
edit SQL file
create table phone_directory ...
This is fixed to some extent for those who have ascended in Oracle
8.1 heaven.
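The drop-and-recreate dance can at least be done without losing data, via CREATE TABLE AS SELECT. A sketch, assuming for illustration that phone_directory has just the two columns name and phone_number:

```sql
-- copy the data into a new table, renaming the column in the select list
create table phone_directory_new as
select name as full_name, phone_number
from phone_directory;

drop table phone_directory;
rename phone_directory_new to phone_directory;
-- caveat: constraints, indices, grants, and triggers on the old table
-- are not copied and must be recreated by hand
```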
Think carefully about long strings
The longest VARCHAR that you can store in Oracle 8 is 4000 bytes. For
most of my Web applications, this turns out to be long enough for 97% of
user-entered data but the remaining 3% are sufficiently important that I
can't use VARCHAR.
The traditional RDBMS approach is to use a LONG column. This should
not be used in Oracle 8, which has Character Large Objects (CLOBs).
They are an industrial-strength implementation of an idea that was
floating around in research RDBMSes and in the old Illustra system. A
CLOB can be up to 4 GB in size and can sit in a separate tablespace from
the table that references it. But if the character string is less than
4000 bytes, by default it gets stored "inline" (with the rest of the
stuff in the row).
create table foobar
( mykey integer,
moby clob );
insert into foobar values ( 1, 'foo');
The first time I used CLOBs was for an Oracle 8 port of my Q&A forum
software (what you see running at photo.net). I
use it for the MESSAGE column. In my Illustra implementation, it turned
out that only 46 out of the 12,000 rows in the table had messages longer
than 4000 bytes (the VARCHAR limit in Oracle 8). But those 46 messages
were very interesting and sometimes contained quoted material from other
works. So it seemed desirable to allow users to post extremely long
messages if they want.
Minor Caveats:
- CLOBs don't work like strings. You can't ask for the LENGTH
of a CLOB column, for example. You can work around this with PL/SQL
calls but it isn't much fun.
- LOBs are not allowed in GROUP BY, ORDER BY, SELECT DISTINCT,
aggregates, or JOINs.
- "alter table bboard modify message clob;" does not
work (even with no rows in the table). If you want to use CLOBs, you
apparently have to say so at table creation time.
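For the LENGTH problem, the PL/SQL workaround is the DBMS_LOB package. A sketch against the foobar table defined above:

```sql
-- fetch the LOB locator into a PL/SQL variable, then ask DBMS_LOB;
-- type "set serveroutput on" in SQL*Plus first to see the result
declare
  the_lob clob;
begin
  select moby into the_lob from foobar where mykey = 1;
  dbms_output.put_line('length: ' || dbms_lob.getlength(the_lob));
end;
/
```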
If you thought that these limitations were bad, you haven't gotten to
the big one: the Oracle SQL parser can only handle string literals up
to 4000 characters in length. SQL*Plus is even more restricted. It can
handle strings up to 2500 characters in length. From most Oracle
clients, there will in fact be no way to insert data longer than 4000
characters long into a table. A statement/system that works perfectly
with a 4000-character string will fail completely with a 4001-character
string. You have to completely redesign the way you're doing things to
work with strings that might be long.
My partner Cotton was custom writing me an Oracle 8 driver for AOLserver
(my preferred RDBMS client). So he decided to use C bind variables
instead. It turned out that these didn't work either for strings longer
than 4000 chars. There is some special CLOB type for C that you could
use if you knew in advance that the column was CLOB. But of course
Cotton's C code was just taking queries from my AOLserver Tcl code.
Without querying the database before every INSERT or SELECT, he had no
way of knowing which columns were going to be CLOB.
One of the most heavily touted features of Oracle 8 is that you can
partition tables, e.g., say "every row with an order_date column less
than January 1, 1997 goes in tablespace A; every row with order_date
less than January 1, 1998 goes in tablespace B; all the rest go in
tablespace C." Assuming that these tablespaces are all on separate disk
drives, this means that when you do a GROUP BY on orders during the last
month or two, Oracle isn't sifting through many years worth of data; it
only has to scan the partition of the table that resides in tablespace
C. Partitioning seems to be a good idea, or at least the Informix
customers who've been using it over the years seem to think so. But if
you are a CLOB Achiever, you won't be using partitioning.
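Here's roughly what the order_date example looks like in Oracle 8 DDL (a sketch; the table columns and tablespace names are assumptions for illustration):

```sql
create table orders (
  order_id integer primary key,
  order_date date
)
partition by range (order_date) (
  -- rows before 1997 land on one disk ...
  partition orders_pre_1997 values less than
    (to_date('1997-01-01','YYYY-MM-DD')) tablespace tablespace_a,
  -- ... 1997 rows on another ...
  partition orders_pre_1998 values less than
    (to_date('1998-01-01','YYYY-MM-DD')) tablespace tablespace_b,
  -- ... and everything recent in its own partition
  partition orders_current values less than (maxvalue) tablespace tablespace_c
);
```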
Right now, my take on CLOBs is that they are so unpleasant to use as
to be almost not worth it. It has taken Cotton longer to get this one
feature working than everything else he did with his driver. Informix
Universal Server lets you have 32,000-character long VARCHARs. The
Solid embedded RDBMS allowed VARCHARs up to 2 GB in length! One hopes
that Oracle will see the light eventually and choose abstraction over
whatever efficiency they're gaining now with these limits.
Think even more carefully about long strings
I tend to develop a data model, fill it up, see how fast it runs, and
then add indices when particular operations seem sluggish. I also tend
to think that most text columns, e.g., VARCHARs and CLOBs should be able
to hold an arbitrary amount of user-entered text. This is a very bad
design combination with Oracle. You can't build an index on a CLOB.
You can't build an index on a VARCHAR that is potentially longer than
758 bytes. You can't alter a table to decrease the maximum size of a
VARCHAR column unless every row has a NULL value for that column. So if
you casually say "VARCHAR(4000)" for a column that typically has a dozen
bytes in it, you are preventing yourself from ever adding an index to
that column (unless you want to copy all the data out of the table and
then back in).
Here's how much fun life is if you're as stupid as I am...
- when copying my bboard
software from Illustra to Oracle, I changed the type of the SORT_KEY
column from "TEXT" (Illustra's super-winning catch-all string datatype)
to VARCHAR(4000). I wasn't sure if anyone would ever need a threaded
discussion where the threads were nested 2000 levels (each level takes
up two characters), but my program would be ready!
- I discovered that the operation of finding rows related to a message
in the Q&A interface was sluggish and requiring a sequential table scan
(see my tuning tips below). The query did a "SORT_KEY LIKE 'foo%'".
Since the % was at the end of the pattern, I knew that I could speed
this up with an index on sort_key.
- The maximum length sort_key in the database was 11 bytes. But I
couldn't alter the column length unless I NULLed out all the data.
- For safety, I did a
"create table bboard_copy as select * from
bboard;"
- I did an
"alter table bboard add (scratch varchar(4000));"
- I copied the sort_key info into the scratch column:
SQL> update bboard set scratch = sort_key;
20453 rows updated.
Elapsed: 00:14:20.59
Note the elapsed time: 14 minutes for 20,000 rows, about 23 rows/second.
- During this 14-minute period, my Web services ground to a halt. My
server monitors screamed in agony, AOLserver threads blocked because
Oracle never responded to queries. I was used to this kind of thing
with Illustra so I didn't worry. I knew everything would come back to
normal as soon as the update completed. I was a bit confused because
Oracle is supposed to have multiversioning and not make readers wait for
writers.
- The Web services were not restored when Oracle came back. In fact,
even simple SQL queries from a SQL*Plus session were taking 150 seconds
that normally took one or two seconds.
- I went to a Unix shell and su'd to root and executed
"/etc/init.d/dbshut stop" (can take as long as 2 minutes; I think Oracle
is resyncing the datafiles with the redo logs) and then
"/etc/init.d/dbstart start". Oracle came back to life (except for
ConText full text searching because I forgot to type
"/etc/init.d/context start").
I'm very disturbed by the fact that I had to stop and restart Oracle.
Unless there were a bunch of queries still executing in the kernel
despite disconnected clients, I have no idea why this was necessary. I
noticed something disturbing in the startup:
SVRMGR> ORACLE instance started.
Total System Global Area 21633944 bytes
Fixed Size 44924 bytes
Variable Size 20655132 bytes
Database Buffers 409600 bytes
Redo Buffers 524288 bytes
Database mounted.
The 21 MB system global area didn't sound so bad. But the 409K database
buffer size didn't look right to me. Perhaps it can grow dynamically?
I poked around in $ORACLE_HOME/dbs/initora8.ora:
db_block_buffers = 200 # SMALL
Hmmm... default block size of 2048 bytes (see below) times 200 = 409600
bytes. So with 256 MB of RAM, my computer was only able to use 400 KB
as a cache for Oracle data. Oracle
SQL High-Performance Tuning says that "many applications get good
performance from a buffer cache as small as 10 MB." Oops. The book
goes on to note that "high performance applications may have buffer
caches of 50 to 100 MB and caches of over 200 MB are not rare." Double
oops. How often are we missing our monster 400 KB cache?
SQL> SELECT name, value
FROM v$sysstat
WHERE name IN ('db block gets', 'consistent gets','physical reads');
NAME VALUE
---------------------------------------------------------------- ----------
db block gets 62969
consistent gets 49707710
physical reads 44449762
Doh! 44 million physical reads from the disk in a few hours! Oracle
says
Hit Ratio = 1 - ( physical reads / (db block gets + consistent gets) )
In this case, our hit ratio is 11%. So 89% of the time, we had to go to
the disk. If a disk can do 100 operations/second, doing 44 million
operations would take 440,000 seconds, or about 5 days. We're doing a
bit better than this partly because we have lots of disks on this
machine but mostly because the Unix file system is obviously caching
much of the information.
We need to increase db_block_buffers until we hit the cache at least 90%
of the time according to Oracle. On a big machine with a light load, my
personal preference would be to hit the cache at least 95% of the time.
Memory is cheap.
If our problems weren't so pathetically obvious, we'd have to think
about the fact that a hit in the Oracle cache might still be a miss
because that part of the SGA had been paged out to disk by the operating
system.
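Concretely: with the default 2048-byte blocks, getting to the 10 MB buffer cache that the book calls minimally adequate means something like the following in initora8.ora (a sketch; pick a number that fits your RAM and leaves room for the rest of the SGA):

```
# 5000 buffers * 2048-byte blocks = roughly 10 MB of data cache
db_block_buffers = 5000
```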
In addition to caching user data, Oracle separately caches housekeeping
items in the "shared pool". The person who installed this Oracle for
some reason decided to not take the 3.5 MB default because initora8.ora
also contains
shared_pool_size = 20000000
Is this a reasonable size? Well, there doesn't seem to be too much
free:
SQL> select * from v$sgastat where name = 'free memory';
POOL NAME BYTES
----------- -------------------------- ----------
shared pool free memory 667596
All but 667K of the 20 MB are in use for caching parsed queries and data
dictionary information:
SQL> SELECT SUM(pins) "Executions",
SUM(reloads) "Cache Misses while Executing"
FROM v$librarycache;
Executions Cache Misses while Executing
---------- ----------------------------
130800 828
I think this means that only 828 times out of 130,800 did Oracle have to
reparse a query. I'm surprised that the hit rate is so high, actually,
because I don't use bind variables in my queries and queries can't be
cached if they don't match character for character. In other words,
"select * from users where user_id = 67" won't match
"select * from users where user_id = 156".
The Oracle tuning guide suggests checking for data dictionary cache
performance with the following query:
SQL> select (sum(gets - getmisses - usage - fixed)) / sum(gets) "Row Cache"
from v$rowcache;
Row Cache
----------
.995257207
99.5% of the time, Oracle was able to find data dictionary information
in its cache and did not have to go to the Unix file system to get data
about the structure of tables or the Oracle system itself. Anyway, I
guess it is nice to have 20 MB of shared pool, but probably this is not
the best use of memory when you only have 400 KB to cache user data.
OK.. so I upped db_block_buffers to 2000 (from 200) and decided to try
updating bboard again with an index for SORT_KEY . Here's the original
problem, a query that I do on every bulletin board posting page:
SQL> select decode(email,'philg@mit.edu','f','t') as not_maintainer_p, to_char(posting_time,'YYYY-MM-DD') as posting_date,bboard.*
from bboard
where sort_key like '0006XO%'
and msg_id <> '0006XO'
order by not_maintainer_p, sort_key;
Elapsed: 00:00:03.80
Execution Plan
----------------------------------------------------------
0 SELECT STATEMENT Optimizer=CHOOSE (Cost=1811 Card=1 Bytes=828)
1 0 SORT (ORDER BY) (Cost=1811 Card=1 Bytes=828)
2 1 TABLE ACCESS (FULL) OF 'BBOARD' (Cost=1808 Card=1 Bytes=828)
Statistics
----------------------------------------------------------
0 recursive calls
4 db block gets
16544 consistent gets
13742 physical reads
0 redo size
2163 bytes sent via SQL*Net to client
995 bytes received via SQL*Net from client
5 SQL*Net roundtrips to/from client
2 sorts (memory)
0 sorts (disk)
1 rows processed
Note that this is doing a full table scan and 13,742 physical reads.
Ordinarily at least some of these blocks would be read from the cache
except that I think Oracle tries to avoid using the cache when doing
full table scans (unless you put in an optimizer hint instructing it to
do so and analyze the table (see below)). This is because one sequential scanning loser dragging a 20 MB
table through the cache would displace all the other users' cached
blocks.
Following a painful 15 minutes of downtime ...
Execution Plan
----------------------------------------------------------
0 SELECT STATEMENT Optimizer=CHOOSE (Cost=6 Card=1 Bytes=828)
1 0 SORT (ORDER BY) (Cost=6 Card=1 Bytes=828)
2 1 TABLE ACCESS (BY INDEX ROWID) OF 'BBOARD' (Cost=3 Card=1 Bytes=828)
3 2 INDEX (RANGE SCAN) OF 'BBOARD_BY_SORT_KEY' (NON-UNIQUE) (Cost=2 Card=1)
Statistics
----------------------------------------------------------
0 recursive calls
0 db block gets
6 consistent gets
0 physical reads
0 redo size
2170 bytes sent via SQL*Net to client
995 bytes received via SQL*Net from client
5 SQL*Net roundtrips to/from client
2 sorts (memory)
0 sorts (disk)
1 rows processed
We're down to 0 physical reads from 13,000+. Not so bad. I still wish
I understood why tables could not be read while an index was being built
or every row in the table was being updated.
Be wary of SQL*Plus
My preferred way of dealing with computers is Emacs. So my primary way
of talking to Oracle during development is by spawning an Emacs shell
and then running SQL*Plus, the standard Oracle shell-level client.
The first thing that I've observed with SQL*Plus is that legal SQL often
generates errors when typed into SQL*Plus. Here's what I do to make
things palatable for SQL*Plus:
- remove all comments from my code
- type "set scan off" (so that SQL*Plus doesn't stop when it sees an
ampersand and ask you to fill in the variable value)
- give up and feed an error-producing SQL statement into Oracle via AOLserver where it is accepted just
fine. Note that AOLserver has a magic feature whereby if you grab
http://foobar.com/yow.sql it asks you "oh, this is a SQL file, would you
like me to stuff it into one of my already-setup database connections?"
Here's an example of a definition that makes SQL*Plus unhappy but that
AOLserver + Cotton's driver + OCI is able to handle just fine:
create table photo_cds (
photocd_id varchar(20) not null primary key,
photographer_id integer not null references photographers, -- no orphans
client_id integer references clients, -- may often be null
-- do we have a 6-resolution pro disk?
pro_disk_p char(1) default 'f' check (pro_disk_p in ('t','f')),
sharpen_p char(1) check (sharpen_p in ('t','f')),
add_borders_p char(1) check (add_borders_p in ('t','f')),
copyright_label varchar(100),
-- bit vectors done with ASCII 0 and 1; probably convert this to
-- Oracle 8 abstract data type
jpeg_resolutions char(6),
-- on which resolutions to write copyright label
copyright_resolutions char(6),
-- how this will be published
url_stub varchar(100), -- e.g., 'pcd3735/'
images_public_p char(1) check (images_public_p in ('t','f'))
);
unknown command beginning "_public_p ..." - rest of line ignored.
SQL> unknown command ")" - rest of line ignored.
SQL> SQL> select * from photo_cds;
Be afraid of SQL*Plus
SQL*Plus does not operate by default in autocommit mode. This is so
that you can do things like the following:
SQL> update bank_accounts set balance = 0;
13,456 rows updated.
*** panic ***
SQL> rollback;
Rollback complete.
SQL> update bank_accounts set balance = 0 where account_number = 3745;
1 row updated.
SQL> commit;
Commit complete.
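If you'd rather have a SQL*Plus session behave like a typical Web-server database API, you can flip the switch per session (a standard SQL*Plus setting; whether you want it is a matter of taste, since it also gives up the panic-rollback safety net above):

```sql
-- commit automatically after each successful statement
SQL> set autocommit on
```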
In some ideal world you would have separate development and production
Oracle installations and would never be typing arbitrary SQL at a
SQL*Plus session on a production system. In reality, here's what
happens:
- Joe Nerd needs to go into SQL*Plus and update a row, perhaps
incrementing a counter. He does this but forgets to commit or exit from
SQL*Plus before going to lunch.
- You've got a simultaneously running Web service that expects to
process 50 transactions/second. It needs to update the same row that
Joe Nerd has touched in his SQL*Plus implicit transaction. Oracle makes
the Web server hang until Joe Nerd either rolls back or commits.
- With Perl/CGI, your Unix box keeps spawning CGI processes that in
turn spawn Oracle server processes that in turn wait for Joe Nerd to get
back from lunch and type "commit". Your Unix box's process table and
virtual memory fill up and eventually new users will get "could not fork
new process" error messages. Moreover, all the other services supported
by that computer (e.g., other Web sites) grind to a halt. With a
connection-pooling Web server such as AOLserver, the rest of the Unix
box isn't trashed. However, very quickly all the configured Oracle
connections are used up by waiting inserts or updates and the AOLserver
will run out of threads and start serving "server busy" pages to new
users.
If you're not Joe Nerd, you have no idea why this is all happening. You
have a couple of choices. The first is to shutdown and restart Oracle:
# su oracle
ora8@orchid/solaris$ svrmgrl
SVRMGR> connect internal
Connected.
SVRMGR> shutdown immediate
Database closed.
Database dismounted.
ORACLE instance shut down.
SVRMGR> startup
ORACLE instance started.
...
Remember that you might have to restart Web servers and/or the Oracle
ConText servers.
An alternative is to poke around among the DBA views and try to figure
out who is causing the problem. Let's artificially create the problem:
In SQL*Plus session 1:

SQL> create table foobar (
       the_key integer primary key
     );
SQL> insert into foobar (the_key)
     values (1);
1 row created.

In SQL*Plus session 2:

SQL> insert into foobar (the_key)
     values (1);
*** waits forever ***
The second session will wait until the user in the first session types
"commit" (in which case the second session will raise an error about the
primary key constraint being violated) or "rollback" (in which case the
second session will report success in inserting the row).
If you log in with dba privileges, e.g., as SYSTEM, you can do the
following:
column username format a15
column program format a20
column machine format a10
column n_seconds format a6
select username,v$session.sid,v$session.serial#, ltrim(v$session_wait.seconds_in_wait) as n_seconds, process, machine, terminal, program
from v$session, v$session_wait
where v$session.sid = v$session_wait.sid
and v$session.sid in (select sid from v$lock where lmode=6);
USERNAME SID SERIAL# N_SECO PROCESS MACHINE TERMINAL
--------------- ---------- ---------- ------ --------- ---------- ----------
PROGRAM
--------------------
3 1 882 1338 homepage ?
oracle@homepage (LGWR)
SCOTT 8 33 1448 16351 homepage ttypa
sqlplus@homepage (TNS V1-V3)
SCOTT 11 24 1484 16344 homepage ttyp9
sqlplus@homepage (TNS V1-V3)
I have no idea what the magic lmode=6
clause does, but
apparently it gives you the correct processes. One of them is the
Oracle log writer (LGWR) process. You definitely don't want to mess
with that! The other two are sessions for user SCOTT that have been
waiting for more than 1400 seconds. You can kill one of them (the
one that has been waiting longer) by specifying the session id (SID) and
serial number:
alter system kill session '11,24';
Here's what you see. In the killed session:

SQL> select sysdate from dual;
select sysdate from dual
*
ERROR at line 1:
ORA-00028: your session has been killed

In the other session, the blocked insert finally completes:

1 row created.
Of course, the nice thing to do would be to figure out what tables and
rows were being locked. But I haven't figured out how to do this.
Remember that the default behavior of ConText is not what you want
For Web publishing, you desperately need a full-text search engine for
strings in your RDBMS (see Chapter
11 of my book). Oracle is one of the handful of RDBMS vendors that
provides one. Unfortunately, the default behavior of ConText is exact
phrase matching. So if you pipe a user's query direct to ConText, it
will only return rows that contain exactly that phrase. This is
obviously not what users will expect if they've been accustomed to
Lycos, Excite, and AltaVista. If you want ConText to return rows that
are reasonable matches, in descending order of reasonableness, in
response to a user query of "dogs running wild", here's what you need to
do....
select *
from bboard
where contains (indexed_column, '$(dogs,running,wild)', 10) > 0
order by score(10) desc
It is very difficult to simply feed a user query from a Web form into
ConText. If you just put it directly through, ConText does precise
phrase matching, probably not what Web users expect. If you blindly
split on spaces and then join with commas to produce queries like the
one above, then the user will get an error if his query contained any
ConText operators (e.g., "and") or any stop words. I've found with
my Q&A forum that I get much better results if I try using the NEAR
operator first (";") and then resort to the accumulate operator (",") as
above.
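A sketch of that two-pass strategy, with the NEAR operator (";") tried first and the accumulate operator (",") as the fallback. This is my paraphrase of the approach, not code from the original, and it drops the "$" stemming seen in the earlier example:

```sql
-- pass 1: NEAR; favors rows where the words occur close together
select * from bboard
where contains (indexed_column, 'dogs;running;wild', 10) > 0
order by score(10) desc;

-- pass 2, only if pass 1 returns nothing: accumulate, any-word matching
select * from bboard
where contains (indexed_column, 'dogs,running,wild', 10) > 0
order by score(10) desc;
```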
Another bad feature of ConText is that it has a very heavy flavor of
"I'm designed for indexing big documents, each of which resides in one
column, presumably as a LONG or a CLOB." Most of my Web stuff has data
spread among multiple columns, e.g., the name, email, subject, and body
fields of a bboard posting. You can create a concatenating view with
two columns, one for the primary key and one with all the subsidiary
columns mushed together with "||". Unfortunately, this doesn't work if
the total length of the mush is more than 4000 characters or if one of
the underlying columns is a CLOB. In that mournful case, you're forced
to maintain, either with database triggers or in your application code,
a denormalized copy of all the indexed text mushed together into a CLOB.
ConText the speed demon: searching for "nikon zoom lens" in my bulletin
board took 2.8 seconds in Illustra/PLS. ConText took about 10 seconds.
Searching for one word that was not present in any row took 0.17 seconds
in Illustra/PLS, 1.5 seconds in Oracle/ConText (but then I dropped and
rebuilt the ConText index and the time fell to 0.4 seconds).
One of my clients couldn't get ConText to work at all. He'd configured
it to use its own tablespace but it insisted on filling up the SYSTEM
tablespace instead. Attempting to index 300 MB of text failed and left
behind 100 MB of garbage in the SYSTEM tablespace, thus rendering the
entire Oracle installation useless. After the two of us spent a few
nights on the phone with Oracle support in Australia, I wrote this...
-- here's our ConText replacement
-- we take the indexed_stuff string (presumably straight from the DB)
-- and a space-separated list of query tokens (presumably from the user)
-- we loop through the space_sep_list and compare to INDEXED_STUFF
-- we return 0 if we didn't find any, plus 10 for every match
-- this doesn't work though for views in which string concatenation will
-- overflow (see next function)
create or replace function pseudo_contains (indexed_stuff IN varchar, space_sep_list_untrimmed IN varchar)
return integer
IS
  space_sep_list varchar(32000);
  upper_indexed_stuff varchar(32000);
  -- if you call this var START you get hosed royally
  first_space integer;
  score integer;
BEGIN
  space_sep_list := upper(ltrim(rtrim(space_sep_list_untrimmed)));
  upper_indexed_stuff := upper(indexed_stuff);
  score := 0;
  IF space_sep_list is null or indexed_stuff is null THEN
    RETURN score;
  END IF;
  LOOP
    first_space := instr(space_sep_list,' ');
    IF first_space = 0 THEN
      -- one token or maybe end of list
      IF instr(upper_indexed_stuff,space_sep_list) <> 0 THEN
        RETURN score+10;
      END IF;
      RETURN score;
    ELSE
      -- first_space <> 0
      IF instr(upper_indexed_stuff,substr(space_sep_list,1,first_space-1)) <> 0 THEN
        score := score + 10;
      END IF;
    END IF;
    space_sep_list := substr(space_sep_list,first_space+1);
  END LOOP;
END pseudo_contains;
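Since the function doesn't touch the database, it should be callable straight from SQL. A usage sketch, assuming the text being searched lives in a VARCHAR column (as in the concatenating-view approach above); the msg_id and indexed_stuff names are assumptions:

```sql
-- ConText-free search: a score above zero means at least one token matched
select msg_id, pseudo_contains(indexed_stuff, 'nikon zoom lens') as score
from bboard_index_view
where pseudo_contains(indexed_stuff, 'nikon zoom lens') > 0
order by score desc;
```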
September 13, 1998: the same thing happened to me behind the
db.photo.net server. In response to us inserting 16 MB of stuff to be
indexed (in a table that is part of the ArsDigita Community System)
ConText filled up the TOOLS tablespace with a 2 GB "service queue" of
stuff that it supposedly needed to do. Since we'd set our database
files to autoextend, this resulted in a database file longer than 2 GB
(and a full hard disk), which apparently makes Oracle unhappy in
profound ways. This brought down the core Oracle server itself and we
spent five solid days trying to extricate ConText from our
installation. Here's what I've learned about ConText:
- make sure that the ConText user's default tablespaces are created
specially for that user, i.e., that the ConText installation won't use
anyone else's tablespace for its service queue or PL/SQL procs or
thesaurus tables. This way if ConText runs wild, you can just drop the
whole tablespace and save the rest of your RDBMS installation.
- you probably need a full-time administrator just for the ConText
option; expect ConText by itself to generate as much in the way of admin
tasks as the rest of Oracle
- since getting ConText to index stuff requires a lot of
denormalization, it might be a similar amount of programming work to
download the PLS software from http://www.pls.com
and write some scripts to sync up an external PLS index with the data in
your Oracle database. If there is a problem with PLS, it will not
affect your RDBMS installation. PLS is extremely smart about
linguistics (the best system I've personally used in this area) and is
free now that America Online has acquired the company. America Online
depends heavily on Unix-based PLS (which is why they bought the company)
for its 11 million users so you can be pretty sure that it is reliable.
ConText is due for a rewrite to go with Release 8.1 of the Oracle server.
'' is the same as NULL
A pure system using three-valued logic would treat the empty string as
different from NULL. As far as I can tell, this behavior is required by
the SQL-92 standard. Illustra works this way. Oracle says that a
future release may work this way. Oracle 8 does not work this way.
Here's an example. We define a table of names and phone numbers.
create table empty_string_test (
name varchar(100),
phone_number varchar(100)
);
Suppose Philip Greenspun and Cotton Seed are using a Web page where they can
type their phone numbers into a form. Cotton chooses to leave this
field blank. So that information should be represented as '' (empty
string).
insert into empty_string_test values ('Philip Greenspun','617 253-8574');
insert into empty_string_test values ('Cotton Seed','');
Jin Choi's name is entered from a table of mail-order customers where no
phone numbers are kept. So we insert a row for him with a NULL.
insert into empty_string_test values ('Jin Choi',NULL);
Now we want to know how many rows we entered from the mail-order
customers table...
select count(*) from empty_string_test where phone_number is null;
COUNT(*)
----------
2
Oops! Oracle tells us that there were 2 instead of 1. It gets worse.
Try to figure out how many people refused to enter their phone number on
the form...
select count(*) from empty_string_test where phone_number = '';
COUNT(*)
----------
0
But you just inserted a row for Cotton where phone_number was ''! Let's
say that you want to get a list of people to call. You aren't going to
want rows for which there isn't a phone number, so you use an "is not
null" clause:
select * from empty_string_test where phone_number is not null;
NAME
--------------------------------------------------------------------------------
PHONE_NUMBER
--------------------------------------------------------------------------------
Philip Greenspun
617 253-8574
Note the obtuseness of SQL*Plus in formatting here. Because these are
VARCHAR(100), it allocates 100 characters for each column and can't
display them on the same line. Anyway, you only get one row back. In
pure ANSI SQL you'd expect to get two rows, one of them Cotton Seed with
his empty string. Oracle's documentation seems to threaten a move in the
direction of the standard. So you program defensively...
select * from empty_string_test
where phone_number is not null
and phone_number <> '';
no rows selected
That extra condition ("<> ''") should have been redundant with the "is
not null". But it seems to have wiped out all the rows. I guess this
isn't too surprising given
select * from empty_string_test where phone_number <> '';
no rows selected
As far as I can tell, there is no way to program this that does not lock
you into Oracle's non-standard behavior.
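To see what the standard behavior looks like, here is the same experiment run against SQLite (via Python's sqlite3 module), which, like the SQL standard, treats '' and NULL as distinct values. This is a comparison sketch, not Oracle:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("create table empty_string_test (name text, phone_number text)")
conn.execute("insert into empty_string_test values ('Philip Greenspun','617 253-8574')")
conn.execute("insert into empty_string_test values ('Cotton Seed','')")
conn.execute("insert into empty_string_test values ('Jin Choi',NULL)")

# Only Jin Choi's row is NULL; Cotton's empty string is a real value.
nulls = conn.execute(
    "select count(*) from empty_string_test where phone_number is null").fetchone()[0]
empties = conn.execute(
    "select count(*) from empty_string_test where phone_number = ''").fetchone()[0]
print(nulls, empties)  # 1 1 (Oracle 8 would give 2 and 0)
```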
= is not necessarily commutative
If A=B does that mean that B=A? If you say "yes" then grab yourself a
Ph.D. in mathematics. But don't try to get a job programming Oracle
statements with subqueries. Here's a SQL statement from my bulletin
board system (described ad nauseam in Chapter 13 of my book). It is
supposed to find unanswered messages.
select msg_id, one_line, sort_key, email, name
from bboard bbd1
where topic = 'photo.net'
and (select count(*) from bboard bbd2 where bbd2.refers_to = bbd1.msg_id) = 0
and refers_to is null
Oracle does not like this: "ORA-00936: missing expression". So I
flipped through the subquery examples in
Oracle8:
The Complete Reference. They all put the subquery after a column
name. Swapping the 0 and the subquery...
select msg_id, one_line, sort_key, email, name
from bboard bbd1
where topic = 'photo.net'
and 0 = (select count(*) from bboard bbd2 where bbd2.refers_to = bbd1.msg_id)
and refers_to is null
... worked like a charm.
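For what it's worth, a standard-conforming parser accepts the subquery on either side of the "=". Here is a sketch with SQLite standing in for Oracle; both orderings parse and return the same rows:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("create table bboard (msg_id integer, refers_to integer)")
conn.executemany("insert into bboard values (?, ?)",
                 [(1, None), (2, 1), (3, None)])

# Subquery on the right of "=" ...
q1 = conn.execute("""select msg_id from bboard bbd1
                     where (select count(*) from bboard bbd2
                            where bbd2.refers_to = bbd1.msg_id) = 0
                       and refers_to is null""").fetchall()
# ... and on the left; SQLite takes both.
q2 = conn.execute("""select msg_id from bboard bbd1
                     where 0 = (select count(*) from bboard bbd2
                                where bbd2.refers_to = bbd1.msg_id)
                       and refers_to is null""").fetchall()
print(q1 == q2, q1)  # True [(3,)]
```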
Partitioning breaks the semantics of SQL
Suppose you're running a Web personal ads service. You have 200,000
members but, since you've done such a great job matching them, only
20,000 are currently looking for new dates. You sometimes want to scan
all 200,000 for reports, etc., but for day-to-day searching you are only
considering the 20,000 active lookers. You want these mushed together
in a separate collection of blocks and not intermingled with the 180,000
inactive users. This way, you only need 10% as much RAM to cache your
working set. The obvious way to do this is partitioning:
create table lonelyhearts (
lonelyheart_id integer primary key using index tablespace index01,
first_names varchar(100),
last_name varchar(100),
desperation_status integer,
bad_acne_p char(1),
had_plastic_surgery_p char(1),
needs_plastic_surgery_p char(1),
...
)
partition by range (desperation_status)
(partition inactive
values less than (100)
tablespace data02,
partition active
values less than (maxvalue)
tablespace data01
);
We've chosen to represent someone's desperation level with an integer,
presumably so that we can add and subtract increments depending on how
thoroughly they are surfing the site. Any row with a desperation_status
of 100 or more gets put into the "active" partition in tablespace
DATA01. Folks whose desperation_status is less than 100 are put into
the "inactive" partition. Note that the primary key index, which covers
all the rows, is stuck into a third tablespace because of the PRIMARY KEY
USING INDEX TABLESPACE INDEX01
annotation.
How does this work? Great! If you SELECT ... WHERE
DESPERATION_STATUS >= 100
then Oracle very cleverly hits only the
active partition.
Can you go home happy? Uh.... try changing a user from active to
inactive or vice versa. Oracle raises an error. If an otherwise legal
SQL UPDATE would cause a row to move from one partition to another,
Oracle won't let you do it. Thus the db administrator's decision to
partition a table to get higher performance must be examined by all the
application programmers who use the table.
Is this so bad? No. Just write a PL/SQL program to update the
desperation_status of a row. It will:
- read "all the columns" from the table where lonelyheart_id = N and
store them in a PL/SQL record of type lonelyhearts%ROWTYPE
- delete from lonelyhearts where lonelyheart_id = N
- update the PL/SQL record
- insert from the PL/SQL record into the table
It turns out that all of this is easy except the last step. There is no
way in PL/SQL to say "stuff everything from this record into a table
row". This is so commonly needed that Oracle even took the trouble to
document the limitation (though one wishes they'd put the effort into
enhancing the language instead). So you have to explicitly enumerate
the columns of the LONELYHEARTS table in your PL/SQL procedure. This is
tedious if there are 100 columns but doable. The problem is that
maintenance of this kind of system is impossible. A programmer
will inevitably come in one day and add a column to the LONELYHEARTS
table and forget to add it to the PL/SQL procedure. Rows that have
moved from one partition to another will silently lose their value in
this new column. It will take months to debug this problem.
How have I worked around this limitation? With the awesome programming
power of ... AOLserver Tcl. Because of the powerful API, it turns out
to be only a few lines of Tcl code to do all the steps above in a
transaction.
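The same pattern can be sketched in Python against SQLite (the AOLserver/Tcl details aside; table and column names are from the example above). The point is that the column list is discovered at run time, so the add-a-column maintenance trap disappears:

```python
import sqlite3

def move_row(conn, table, key_col, key_val, updates):
    """Re-home a row via select-all / delete / reinsert, discovering the
    column list at run time so a newly added column can't be silently lost."""
    cur = conn.execute(f"select * from {table} where {key_col} = ?", (key_val,))
    cols = [d[0] for d in cur.description]
    row = dict(zip(cols, cur.fetchone()))
    row.update(updates)
    with conn:  # delete + reinsert in one transaction
        conn.execute(f"delete from {table} where {key_col} = ?", (key_val,))
        conn.execute(
            f"insert into {table} ({','.join(cols)}) values ({','.join('?' * len(cols))})",
            [row[c] for c in cols])

conn = sqlite3.connect(":memory:")
conn.execute("""create table lonelyhearts
    (lonelyheart_id integer primary key, desperation_status integer)""")
conn.execute("insert into lonelyhearts values (37, 50)")
move_row(conn, "lonelyhearts", "lonelyheart_id", 37, {"desperation_status": 150})
```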
If you install AOLserver and apply this trick to an already-partitioned
data model, will your troubles be over? No. Most likely there are other
tables that refer to
the LONELYHEARTS table, e.g.,
create table email_alerts (
email_alert_id integer primary key,
lonelyheart_id integer not null references lonelyhearts,
...
)
You find that your transaction is aborted because there are rows in
other tables referencing "lonelyheart_id 37" and you therefore can't
delete the row containing this id. You can't do the insert before you
do the delete because the LONELYHEARTS table has a primary key
constraint. There is actually a way to tell Oracle to wait until a
transaction closes before checking an integrity constraint:
create table email_alerts (
email_alert_id integer primary key,
lonelyheart_id integer not null references lonelyhearts initially deferred deferrable,
...
)
The "deferrable" tells Oracle that this constraint may be
deferred. The "initially deferred" tells Oracle that this constraint
should be checked only at the end of each transaction (i.e., at commit).
Your pain is at an end.
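SQLite happens to implement the same SQL-92 deferrable-constraint machinery, so the delete-then-reinsert dance can be sketched and tested there (SQLite standing in for Oracle; table names borrowed from the example above):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.isolation_level = None          # autocommit; manage transactions by hand
conn.execute("pragma foreign_keys = on")
conn.execute("create table lonelyhearts (lonelyheart_id integer primary key)")
conn.execute("""create table email_alerts (
    email_alert_id integer primary key,
    lonelyheart_id integer not null
        references lonelyhearts deferrable initially deferred)""")
conn.execute("insert into lonelyhearts values (37)")
conn.execute("insert into email_alerts values (1, 37)")

# Delete and reinsert lonelyheart 37 inside one transaction; the foreign
# key is checked only at COMMIT, so the transient dangling reference is OK.
conn.execute("begin")
conn.execute("delete from lonelyhearts where lonelyheart_id = 37")
conn.execute("insert into lonelyhearts values (37)")
conn.execute("commit")
```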
Note that Informix Universal Server gets this right. Fragmenting a
table has performance/caching/tuning implications. However, no
application code SQL has to be examined or changed. Informix will
transparently migrate rows from one tablespace to another if necessary.
Altering Check Constraints
A nice clean Oracle data model will have check constraints that keep
application programs (e.g., Web scripts) from stuffing garbage data into
columns, e.g.,
presentation_type varchar(20) check(presentation_type in ('q_and_a','threads'))
makes sure that a topic in my
bulletin board system is either a Q&A forum or a traditional
USENET-style threaded discussion. This was fine until I copied the
software over to www.scorecard.org where we wanted to
tag threads with state, county, or zip code. This would be a new kind
of forum: "usgeospatial". I'd already defined the table and was running
a few forums. So I couldn't just change the CREATE TABLE
statement. Here's what you actually have to do:
-- first find the constraint name (since we weren't clever enough
-- to explicitly name it in the CREATE TABLE)
select constraint_name, search_condition
from user_constraints
where table_name = 'BBOARD_TOPICS';
-- drop the check constraint
alter table bboard_topics drop constraint SYS_C002703;
-- add it back with the extra value
alter table bboard_topics
add (constraint check_presentation_type check(presentation_type in ('q_and_a','threads', 'usgeospatial')));
Note that this time we gave it a name so that we'd not have to search
the USER_CONSTRAINTS
table next time.
Tracing and tuning Oracle SQL
If you really care about doing this right, you should read Guy
Harrison's Oracle
SQL High-Performance Tuning. But if you are satisfied to be an Oracle
Caveman like me, here are the steps:
- find a URL that is running too slowly
- stick "Verbose=On" in the [ns/db/pool/**poolname**] section of your
AOLserver .ini file; this will cause AOLserver to write all of its db
queries into the error log ("/home/nsadmin/log/server.log" by default)
- request the problematic URL from a Web browser
- fire up Emacs and load the AOLserver error log into a buffer; spawn
a shell and run sqlplus from the shell, logging in with the same
username/password as used by AOLserver
- you can now cut (from server.log) and paste (into sqlplus) the
queries performed by the script backing the slow URL. However, first
you must turn on tracing so that you can see what Oracle is
doing.
SQL> set autotrace on
Unable to verify PLAN_TABLE format or existence
Error enabling EXPLAIN report
Oops! It turns out that autotrace needs a PLAN_TABLE to record query
plans in, and a default installation doesn't create one. For each user
that wants to trace queries, you need to feed sqlplus the file
$ORACLE_HOME/rdbms/admin/utlxplan.sql, which contains a single table
definition:
create table PLAN_TABLE (
statement_id varchar2(30),
timestamp date,
remarks varchar2(80),
operation varchar2(30),
options varchar2(30),
object_node varchar2(128),
object_owner varchar2(30),
object_name varchar2(30),
object_instance numeric,
object_type varchar2(30),
optimizer varchar2(255),
search_columns number,
id numeric,
parent_id numeric,
position numeric,
cost numeric,
cardinality numeric,
bytes numeric,
other_tag varchar2(255),
partition_start varchar2(255),
partition_stop varchar2(255),
partition_id numeric,
other long);
- Type "set autotrace on" again (it should work now; if you get an
error about the PLUSTRACE role then tell your dbadmin to run
$ORACLE_HOME/sqlplus/admin/plustrce.sql as SYS then GRANT your user that
role).
- Type "set timing on" (you'll get reports of elapsed time)
- cut and paste the query of interest.
Now that we're all set up, let's look at a couple of examples.
Tracing/Tuning Case 1: did we already insert the message?
The SQL here comes from my
bulletin board system. In the bad old Illustra days, it took so
long to do an insert that users would keep hitting "Reload" on their
browsers. When they were all done, there were three copies of a message
in the bulletin board. Before trying an insert, therefore, I decided it
would be best to check the bboard to see if there was already a message
with exactly the same values in the ONE_LINE and MESSAGE columns.
Because MESSAGE is a CLOB column, you can't just do the obvious "="
comparison and need to call the PL/SQL function "DBMS_LOB.INSTR".
Furthermore, this query will fail if the message body being tested is
longer than 4000 characters (since the SQL parser can't handle strings
longer than 4000 chars). I had to wrap a Tcl catch around the query.
Anyway, here's the SQL Plus session
SQL> select count(*) from bboard
where topic = 'photo.net'
and one_line = 'foo'
and dbms_lob.instr(message,'bar') > 0 ;
2 3 4
COUNT(*)
----------
0
Execution Plan
----------------------------------------------------------
0 SELECT STATEMENT Optimizer=CHOOSE
1 0 SORT (AGGREGATE)
2 1 TABLE ACCESS (BY INDEX ROWID) OF 'BBOARD'
3 2 INDEX (RANGE SCAN) OF 'BBOARD_BY_TOPIC' (NON-UNIQUE)
Statistics
----------------------------------------------------------
0 recursive calls
0 db block gets
59967 consistent gets
10299 physical reads
0 redo size
570 bytes sent via SQL*Net to client
741 bytes received via SQL*Net from client
4 SQL*Net roundtrips to/from client
1 sorts (memory)
0 sorts (disk)
1 rows processed
The first thing to note here is "10,299 physical reads". Disk drives
are very slow. You don't really want to be doing more than a handful of
physical reads. Let's look at the heart of the query plan:
2 1 TABLE ACCESS (BY INDEX ROWID) OF 'BBOARD'
3 2 INDEX (RANGE SCAN) OF 'BBOARD_BY_TOPIC' (NON-UNIQUE)
Looks to me as though Oracle is hitting the "bboard_by_topic" index for
the ROWIDs of "just the rows that have a topic of 'photo.net'". It is
then using the ROWID, an internal Oracle pointer, to pull the actual
rows from the BBOARD table. Presumably Oracle will then count up just
those rows where the ONE_LINE and MESSAGE columns are appropriate. This
might not actually be so bad in an installation where there were 500
different discussion groups. Hitting the index would eliminate 499/500
rows. But BBOARD_BY_TOPIC isn't a very selective index, as
"select topic, count(*) from bboard group by topic order by count(*)
desc" reveals:
TOPIC                   COUNT(*)
--------------------- ----------
photo.net                  14159
Nature Photography          3289
Medium Format Digest        1639
Ask Philip                    91
web/db                        62
The BBOARD table only has about 19,000 rows and the photo.net topic has
14,000 of them, about 75%. So the index didn't do us much good. In
fact, you'd have expected Oracle not to use the index. A full table
scan is generally faster than an index scan if more than 20% of the rows
need to be examined. Why didn't Oracle do the full table scan? Because
the table hadn't been "analyzed". There were no statistics for the
cost-based optimizer, so the older rule-based optimizer was employed.
You have to periodically tell Oracle to build statistics on tables if
you want the fancy optimizer:
SQL> analyze table bboard compute statistics;
Table analyzed.
SQL> select count(*) from bboard
where topic = 'photo.net'
and one_line = 'foo'
and dbms_lob.instr(message,'bar') > 0 ;
2 3 4
COUNT(*)
----------
0
Execution Plan
----------------------------------------------------------
0 SELECT STATEMENT Optimizer=CHOOSE (Cost=1808 Card=1 Bytes=828)
1 0 SORT (AGGREGATE)
2 1 TABLE ACCESS (FULL) OF 'BBOARD' (Cost=1808 Card=1 Bytes=828)
Statistics
----------------------------------------------------------
0 recursive calls
4 db block gets
74280 consistent gets
12266 physical reads
0 redo size
572 bytes sent via SQL*Net to client
741 bytes received via SQL*Net from client
4 SQL*Net roundtrips to/from client
1 sorts (memory)
0 sorts (disk)
1 rows processed
The final numbers don't look much better. But at least the cost-based
optimizer has figured out that the topic index won't be worth much. Now
we're just scanning the full BBOARD table. While transferring 20,000
rows from Illustra to Oracle, I'd not created any indices. This sped
up loading, but then I was so happy to have the system running
deadlock-free that I forgot to recreate an index that I'd been using on
the Illustra system expressly for the purpose of making this query fast.
SQL> create index bboard_index_by_one_line
on bboard ( one_line ) tablespace philgidx;
Index created.
Bboard postings are now indexed by subject line. Note that the index is
shoved off into a separate tablespace from the main table. This speeds
up inserts and updates since index and table data can be written to two
separate disk drives in parallel (assuming the tablespaces are on
different drives). Let's try the query again:
SQL> select count(*) from bboard
where topic = 'photo.net'
and one_line = 'foo'
and dbms_lob.instr(message,'bar') > 0 ;
2 3 4
COUNT(*)
----------
0
Execution Plan
----------------------------------------------------------
0 SELECT STATEMENT Optimizer=CHOOSE (Cost=2 Card=1 Bytes=828)
1 0 SORT (AGGREGATE)
2 1 TABLE ACCESS (BY INDEX ROWID) OF 'BBOARD' (Cost=2 Card=1 Bytes=828)
3 2 INDEX (RANGE SCAN) OF 'BBOARD_INDEX_BY_ONE_LINE' (NON-UNIQUE) (Cost=1 Card=1)
Statistics
----------------------------------------------------------
0 recursive calls
0 db block gets
3 consistent gets
3 physical reads
0 redo size
573 bytes sent via SQL*Net to client
741 bytes received via SQL*Net from client
4 SQL*Net roundtrips to/from client
1 sorts (memory)
0 sorts (disk)
1 rows processed
We've brought physical reads down from 12266 to 3. Oracle is checking
the ONE_LINE index and then poking at the main table using the ROWIDs
retrieved from the index. It might actually be better to build a
concatenated index on the name of the person posting and the subject
line, but I'm not going to sweat because this is good enough for me.
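You can watch the same effect in miniature with SQLite's EXPLAIN QUERY PLAN (a sketch, not Oracle's optimizer): once an index on one_line exists, the plan searches the index instead of scanning the whole table:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("create table bboard (topic text, one_line text, message text)")
conn.execute("create index bboard_index_by_one_line on bboard (one_line)")
plan = conn.execute("""explain query plan
    select count(*) from bboard
    where topic = 'photo.net' and one_line = 'foo'""").fetchall()
print(plan)  # the plan should mention bboard_index_by_one_line, not a full scan
```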
Tracing/Tuning Case 2: new questions
At the top of each forum page, e.g.,
http://photo.net/bboard/q-and-a.tcl?topic=photo.net,
I like to show questions asked in the last 7 days. I programmed this in
AOLserver Tcl so that the user would get a partial page before I
attempted the query, but still there was a perceptible lag before any
questions would appear.
SQL> select msg_id, one_line, sort_key, email, name
from bboard
where topic = 'photo.net'
and refers_to is null
and posting_time > (sysdate - 7)
order by sort_key desc;
...
61 rows selected.
Execution Plan
----------------------------------------------------------
0 SELECT STATEMENT Optimizer=CHOOSE (Cost=1828 Card=33 Bytes=27324)
1 0 SORT (ORDER BY) (Cost=1828 Card=33 Bytes=27324)
2 1 TABLE ACCESS (FULL) OF 'BBOARD' (Cost=1808 Card=33 Bytes=27324)
Statistics
----------------------------------------------------------
0 recursive calls
4 db block gets
13188 consistent gets
12071 physical reads
0 redo size
7369 bytes sent via SQL*Net to client
1234 bytes received via SQL*Net from client
8 SQL*Net roundtrips to/from client
2 sorts (memory)
0 sorts (disk)
61 rows processed
A full table scan and 12,071 physical reads just to get 61 rows! It was
time to get medieval on this query. If the query's WHERE clause
contained topic, refers_to, and posting_time, then I'd build a
concatenated index on all three columns:
SQL> create index bboard_for_new_questions on bboard ( topic, refers_to, posting_time ) tablespace philgidx;
Index created.
SQL> select msg_id, one_line, sort_key, email, name
from bboard
where topic = 'photo.net'
and refers_to is null
and posting_time > (sysdate - 7)
order by sort_key desc;
...
61 rows selected.
Execution Plan
----------------------------------------------------------
0 SELECT STATEMENT Optimizer=CHOOSE (Cost=23 Card=33 Bytes=27324)
1 0 SORT (ORDER BY) (Cost=23 Card=33 Bytes=27324)
2 1 TABLE ACCESS (BY INDEX ROWID) OF 'BBOARD' (Cost=3 Card=33 Bytes=27324)
3 2 INDEX (RANGE SCAN) OF 'BBOARD_FOR_NEW_QUESTIONS' (NON-UNIQUE) (Cost=2 Card=33)
Statistics
----------------------------------------------------------
0 recursive calls
0 db block gets
66 consistent gets
60 physical reads
0 redo size
7369 bytes sent via SQL*Net to client
1234 bytes received via SQL*Net from client
8 SQL*Net roundtrips to/from client
2 sorts (memory)
0 sorts (disk)
61 rows processed
60 reads is better than 12,000. One bit of clean-up, though. There is
no reason to have a BBOARD_BY_TOPIC index if we are going to keep this
BBOARD_FOR_NEW_QUESTIONS index, whose first column is TOPIC. The query
optimizer can use BBOARD_FOR_NEW_QUESTIONS even when the SQL only
restricts based on the TOPIC column. The redundant index won't cause
any services to fail, but it will slow down inserts.
SQL> drop index bboard_by_topic;
Index dropped.
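The leading-column rule is easy to verify on other SQL engines too. Here is a SQLite sketch (again a stand-in for Oracle) showing the three-column index satisfying a topic-only query:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""create table bboard
    (topic text, refers_to integer, posting_time real, one_line text)""")
conn.execute("""create index bboard_for_new_questions
    on bboard (topic, refers_to, posting_time)""")
plan = conn.execute("""explain query plan
    select count(*) from bboard where topic = 'photo.net'""").fetchall()
print(plan)  # the plan uses bboard_for_new_questions via its leading column
```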
I was so pleased with myself that I decided to drop an index on bboard
by the REFERS_TO column, reasoning that nobody ever queries REFERS_TO
without also querying on TOPIC. Therefore they could just use the first
two columns in the BBOARD_FOR_NEW_QUESTIONS index. Here's my query
looking for unanswered questions (from the above section on whether "=" is
commutative):
SQL> select msg_id, one_line, sort_key, email, name
from bboard bbd1
where topic = 'photo.net'
and 0 = (select count(*) from bboard bbd2 where bbd2.refers_to = bbd1.msg_id)
and refers_to is null
order by sort_key desc;
...
57 rows selected.
Execution Plan
----------------------------------------------------------
0 SELECT STATEMENT Optimizer=CHOOSE (Cost=49 Card=33 Bytes=27324)
1 0 SORT (ORDER BY) (Cost=49 Card=33 Bytes=27324)
2 1 FILTER
3 2 TABLE ACCESS (BY INDEX ROWID) OF 'BBOARD' (Cost=29 Card=33 Bytes=27324)
4 3 INDEX (RANGE SCAN) OF 'BBOARD_FOR_NEW_QUESTIONS' (NON-UNIQUE) (Cost=2 Card=33)
5 2 INDEX (FULL SCAN) OF 'BBOARD_FOR_NEW_QUESTIONS' (NON-UNIQUE) (Cost=26 Card=7 Bytes=56)
Statistics
----------------------------------------------------------
0 recursive calls
0 db block gets
589843 consistent gets
497938 physical reads
0 redo size
6923 bytes sent via SQL*Net to client
1173 bytes received via SQL*Net from client
7 SQL*Net roundtrips to/from client
2 sorts (memory)
0 sorts (disk)
57 rows processed
Ouch! 497,938 physical reads. Let's try it with the index in place:
SQL> create index bboard_index_by_refers_to on bboard ( refers_to )
tablespace philgidx;
Index created.
SQL> select msg_id, one_line, sort_key, email, name
from bboard bbd1
where topic = 'photo.net'
and 0 = (select count(*) from bboard bbd2 where bbd2.refers_to = bbd1.msg_id)
and refers_to is null
order by sort_key desc;
...
57 rows selected.
Execution Plan
----------------------------------------------------------
0 SELECT STATEMENT Optimizer=CHOOSE (Cost=49 Card=33 Bytes=27324)
1 0 SORT (ORDER BY) (Cost=49 Card=33 Bytes=27324)
2 1 FILTER
3 2 TABLE ACCESS (BY INDEX ROWID) OF 'BBOARD' (Cost=29 Card=33 Bytes=27324)
4 3 INDEX (RANGE SCAN) OF 'BBOARD_FOR_NEW_QUESTIONS' (NON-UNIQUE) (Cost=2 Card=33)
5 2 INDEX (RANGE SCAN) OF 'BBOARD_INDEX_BY_REFERS_TO' (NON-UNIQUE) (Cost=1 Card=7 Bytes=56)
Statistics
----------------------------------------------------------
0 recursive calls
0 db block gets
8752 consistent gets
2233 physical reads
0 redo size
6926 bytes sent via SQL*Net to client
1173 bytes received via SQL*Net from client
7 SQL*Net roundtrips to/from client
2 sorts (memory)
0 sorts (disk)
57 rows processed
This is still a fairly expensive query, but 200 times faster than
before and it executes in a fraction of a second. That's probably fast
enough considering that this is an infrequently requested page.
Tracing/Tuning Case 3: forcing Oracle to cache a full table scan
You may have a Web site that is basically giving users access to a huge
table. For maximum flexibility, it might be the case that this table
needs to be sequentially scanned for every query. In general, Oracle
won't cache blocks retrieved during a full table scan. The Oracle
tuning guide helpfully suggests that you include the following cache
hints in your SQL:
select /*+ FULL (students) CACHE(students) */ count(*) from students;
You will find, however, that this doesn't work if your buffer cache
(controlled by db_block_buffers; see above) isn't large enough to
contain the table. Oracle is smart and ignores your hint. After you've
reconfigured your Oracle installation to have a larger buffer cache,
you'll probably find that Oracle is still ignoring your cache
hint. That's because you also need to
analyze table students compute statistics;
and then Oracle will work as advertised in the tuning guide. It makes
sense when you think about it because Oracle can't realistically start
stuffing things into the cache unless it knows roughly how large the
table is.
db_block_size
Every Oracle and third-party tuning publication says "you have to have
your database block size at least as large as your operating system
block size". This is because it might actually take longer to read 2K
bytes (the Oracle default) from the Unix file system than a whole block
(8K bytes on most systems). Supposedly you also want to make sure that
your block size is larger than all of your rows. So if you are using
CLOBs, which stay in the row until they are 4000 bytes long, you'll
presumably want to use 8K blocks (most tuning books recommend 4K or
8K).
All of the Oracle and computer trade publications confidently tell you
to just look in $ORACLE_HOME/dbs/init.ora and you'll find the
db_block_size. This would be pretty easy if it were true. In fact, if
you run the standard Oracle install scripts, db_block_size is never
specified anywhere.
After 45 minutes on the phone with Oracle support, I learned that you
need to run "svrmgrl", connect internal, and type "show parameters":
bash-2.01$ svrmgrl
Oracle Server Manager Release 3.0.3.0.0 - Production
(c) Copyright 1997, Oracle Corporation. All Rights Reserved.
Oracle8 Enterprise Edition Release 8.0.3.0.0 - Production
With the Partitioning and Objects options
PL/SQL Release 8.0.3.0.0 - Production
SVRMGR> connect internal
Connected.
SVRMGR> show parameters
NAME TYPE VALUE
----------------------------------- ------- ------------------------------
O7_DICTIONARY_ACCESSIBILITY boolean TRUE
allow_partial_sn_results boolean FALSE
always_anti_join string NESTED_LOOPS
aq_tm_processes integer 0
arch_io_slaves integer 0
audit_file_dest string ?/rdbms/audit
audit_trail string NONE
b_tree_bitmap_plans boolean FALSE
background_core_dump string full
background_dump_dest string /ora8/m01/app/oracle/admin/ora
backup_disk_io_slaves integer 0
backup_tape_io_slaves boolean FALSE
bitmap_merge_area_size integer 1048576
blank_trimming boolean FALSE
buffer_pool_keep string
buffer_pool_recycle string
cache_size_threshold integer 20
cleanup_rollback_entries integer 20
close_cached_open_cursors boolean FALSE
commit_point_strength integer 1
compatible string 8.0.0
compatible_no_recovery string 8.0.0
control_file_record_keep_time integer 7
control_files string /ora8/m02/oradata/ora8/control
core_dump_dest string /ora8/m01/app/oracle/admin/ora
cpu_count integer 1
create_bitmap_area_size integer 8388608
cursor_space_for_time boolean FALSE
db_block_buffers integer 200
db_block_checkpoint_batch integer 8
db_block_checksum boolean FALSE
db_block_lru_extended_statistics integer 0
db_block_lru_latches integer 1
db_block_lru_statistics boolean FALSE
db_block_max_dirty_target integer 4294967294
db_block_size integer 2048
db_domain string WORLD
db_file_direct_io_count integer 64
db_file_multiblock_read_count integer 8
db_file_name_convert string
db_file_simultaneous_writes integer 4
db_files integer 80
db_name string ora8
dblink_encrypt_login boolean FALSE
dbwr_io_slaves integer 0
delayed_logging_block_cleanouts boolean TRUE
discrete_transactions_enabled boolean FALSE
disk_asynch_io boolean TRUE
distributed_lock_timeout integer 60
distributed_recovery_connection_hol integer 200
distributed_transactions integer 16
dml_locks integer 100
enqueue_resources integer 155
event string
fixed_date string
freeze_DB_for_fast_instance_recover boolean FALSE
gc_defer_time integer 10
gc_files_to_locks string
gc_lck_procs integer 1
gc_releasable_locks integer 0
gc_rollback_locks string
global_names boolean TRUE
hash_area_size integer 0
hash_join_enabled boolean TRUE
hash_multiblock_io_count integer 8
ifile file /ora8/m01/app/oracle/admin/ora
instance_groups string
instance_number integer 0
job_queue_interval integer 60
job_queue_keep_connections boolean FALSE
job_queue_processes integer 0
large_pool_min_alloc string 16K
large_pool_size string 0
lgwr_io_slaves integer 0
license_max_sessions integer 0
license_max_users integer 0
license_sessions_warning integer 0
lm_locks integer 12000
lm_procs integer 127
lm_ress integer 6000
local_listener string
lock_name_space string
lock_shared_memory boolean FALSE
log_archive_buffer_size integer 64
log_archive_buffers integer 4
log_archive_dest string ?/dbs/arch
log_archive_duplex_dest string
log_archive_format string %t_%s.dbf
log_archive_min_succeed_dest integer 1
log_archive_start boolean FALSE
log_block_checksum boolean FALSE
log_buffer integer 8192
log_checkpoint_interval integer 10000
log_checkpoint_timeout integer 0
log_checkpoints_to_alert boolean FALSE
log_file_name_convert string
log_files integer 255
log_simultaneous_copies integer 0
log_small_entry_max_size integer 80
max_commit_propagation_delay integer 90000
max_dump_file_size string 10240
max_enabled_roles integer 20
max_rollback_segments integer 30
max_transaction_branches integer 8
mts_dispatchers string
mts_listener_address string
mts_max_dispatchers integer 5
mts_max_servers integer 20
mts_multiple_listeners boolean FALSE
mts_rate_log_size string
mts_rate_scale string
mts_servers integer 0
mts_service string ora8
nls_currency string
nls_date_format string
nls_date_language string
nls_iso_currency string
nls_language string AMERICAN
nls_numeric_characters string
nls_sort string
nls_territory string AMERICA
object_cache_max_size_percent integer 10
object_cache_optimal_size integer 102400
open_cursors integer 50
open_links integer 4
open_links_per_instance integer 4
ops_admin_group string
optimizer_mode string CHOOSE
optimizer_percent_parallel integer 0
optimizer_search_limit integer 5
oracle_trace_collection_name string
oracle_trace_collection_path string ?/otrace/admin/cdf
oracle_trace_collection_size integer 5242880
oracle_trace_enable boolean FALSE
oracle_trace_facility_name string oracled
oracle_trace_facility_path string ?/otrace/admin/fdf
os_authent_prefix string ops$
os_roles boolean FALSE
parallel_default_max_instances integer 0
parallel_instance_group string
parallel_max_servers integer 5
parallel_min_message_pool integer 48420
parallel_min_percent integer 0
parallel_min_servers integer 0
parallel_server boolean FALSE
parallel_server_idle_time integer 5
parallel_transaction_resource_timeo integer 300
partition_view_enabled boolean FALSE
plsql_v2_compatibility boolean FALSE
pre_page_sga boolean FALSE
processes integer 50
recovery_parallelism integer 0
reduce_alarm boolean FALSE
remote_dependencies_mode string TIMESTAMP
remote_login_passwordfile string NONE
remote_os_authent boolean FALSE
remote_os_roles boolean FALSE
replication_dependency_tracking boolean TRUE
resource_limit boolean FALSE
rollback_segments string r01, r02, r03, r04
row_cache_cursors integer 10
row_locking string always
sequence_cache_entries integer 10
sequence_cache_hash_buckets integer 10
serial_reuse string DISABLE
serializable boolean FALSE
session_cached_cursors integer 0
session_max_open_files integer 0
sessions integer 60
shadow_core_dump string full
shared_pool_reserved_min_alloc string 5K
shared_pool_reserved_size string 175000
shared_pool_size string 3500000
snapshot_refresh_interval integer 60
snapshot_refresh_keep_connections boolean FALSE
snapshot_refresh_processes integer 0
sort_area_retained_size integer 0
sort_area_size integer 65536
sort_direct_writes string AUTO
sort_read_fac integer 5
sort_spacemap_size integer 512
sort_write_buffer_size integer 32768
sort_write_buffers integer 2
spin_count integer 1
spread_extents boolean TRUE
sql92_security boolean FALSE
sql_trace boolean FALSE
star_transformation_enabled boolean FALSE
tape_asynch_io boolean TRUE
temporary_table_locks integer 60
text_enable boolean FALSE
thread integer 0
timed_os_statistics integer 0
timed_statistics boolean FALSE
transaction_auditing boolean TRUE
transactions integer 66
transactions_per_rollback_segment integer 11
user_dump_dest string /ora8/m01/app/oracle/admin/ora
utl_file_dir string
In this case, it looks like the default for HP Unix is in fact 2048
bytes (I tried this on a SPARC Solaris machine also and the default
installation also resulted in an Oracle with 2K blocks).
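Rather than grovel through the full parameter dump, you can ask the server directly for its block size. A quick sketch (works in 7.3 and 8; you need a connection that can read v$parameter, e.g., a dba account):

```sql
REM in SQL*Plus, "show parameter db_block_size" does the same thing
select name, value
from v$parameter
where name = 'db_block_size';
```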
What about Unix?
All of the "Oracle for Dummies" books claim that the Unix file system
defaults to 8K blocks, but they don't say how to determine what you've
actually got on an already-built system. Fanatical in my
devotion to bringing readers the best possible information, I read
dozens of man pages, the source code for Linux, 10 volumes of Oracle
documentation, and then... called the HP Unix technical support 800
number. Wayne Guerin set me straight in about two minutes...
First, find out the "block device name" of the file system
using "df -k":
# df -k
Filesystem 1024-blocks Used Available Capacity Mounted on
/dev/vg00/lvol3 1228800 1201755 25049 98% /
/dev/vg00/lvol1 47829 14205 28841 33% /stand
/dev/vg00/lvol4 20480 19322 1104 95% /home
/dev/dsk/c0t3d0 4103198 2541381 1151497 69% /extra2
/dev/dsk/c0t1d0 4103198 3203259 489619 87% /extra1
Suppose we're interested in the file system we know as "/extra2". That
means we need to feed a block device name of "/dev/dsk/c0t3d0" to the
"tunefs" command:
# /usr/sbin/tunefs -v /dev/dsk/c0t3d0
super block last mounted on: /extra2
magic 95014 clean FS_OK time Sat Dec 20 22:45:39 1997
sblkno 16 cblkno 24 iblkno 32 dblkno 232
sbsize 2048 cgsize 3072 cgoffset 40 cgmask 0xfffffff0
ncg 421 size 4194157 blocks 4103198
bsize 8192 bshift 13 bmask 0xffffe000
fsize 1024 fshift 10 fmask 0xfffffc00
frag 8 fragshift 3 fsbtodb 0
minfree 10% maxbpg 312
maxcontig 1 rotdelay 0ms rps 60
csaddr 232 cssize 7168 csshift 9 csmask 0xfffffe00
ntrak 16 nsect 39 spc 624 ncyl 6722
cpg 16 bpg 1248 fpg 9984 ipg 1600
nindir 2048 inopb 64 nspf 1
nbfree 194670 ndir 2168 nifree 631327 nffree 4457
cgrotor 304 fmod 0 ronly 0
fname fpack
cylinders in last group 2
blocks in last group 156
Sure enough, this has the standard "bsize" of 8192 (8K). If tunefs
doesn't have this kind of display option on your Unix box, or if you are
an HP-UX Achiever running the VxFS journaled file system, then you need
to use "mkfs -m", which gives you back the command that you would have
had to type to create the file system:
# /usr/sbin/mkfs -m /dev/dsk/c0t3d0
mkfs -F hfs -L /dev/dsk/c0t3d0 4194157 39 16 8192 1024 16 10 60 6237
Note that a VxFS journaled file system defaults to a block size of
1024. Here's one of my already-created ones:
# /usr/sbin/mkfs -m /dev/vg00/lvol4
mkfs -F vxfs -o ninode=unlimited,bsize=1024,version=3,inosize=256,logsize=1024,nolargefiles /dev/vg00/lvol4 20480
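What about Linux, which the HP support line obviously won't help you with? Here's a sketch assuming a reasonably recent GNU coreutils stat (older systems will want the tune2fs approach described in the reader comments below):

```shell
# -f asks about the filesystem containing the path rather than the
# file itself; %s is the optimal transfer block size.  Any path on
# the filesystem in question will do.
stat -f -c 'block size: %s' /
```

On an ext2 partition you can also read the block size straight off the raw device with tune2fs -l /dev/XXX (as root).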
What should you do if you find that your block size is smaller than your
typical row lengths and/or your operating system block size? Here's my
favorite quote from a "for Dummies"-type computer trade book (page
275 of the Oracle8 DBA Handbook):
"The effect of increasing the database block size is stunning. In most
environments, at least two block sizes are supported--for example, 2K
and 4K. Most of the installation routines are set to use the lower of
the two. However, using the next higher value for the block size may
improve the performance of query-intensive operations by up to 50
percent."
"This gain comes relatively free of charge. To increase the database
block size, the entire database must be rebuilt, and all of the old
database files have to be deleted."
[emphasis added]
Well now, that doesn't sound too difficult, does it? I often rebuild my
entire database and delete all the tablespace files while my Web
services are on-line, just for fun.
When a tablespace fills up
The default rollback segment (RBS) tablespace seems to be much too
small (maybe 15 MB) to do interesting transactions, e.g., rebuild a
ConText index. Nor does it autoextend by default. So you'll often see
big updates failing with
ORA-01562: failed to extend rollback segment number 5
ORA-01650: unable to extend rollback segment R04 by 64 in tablespace RBS
This is pretty humiliating if you have $500,000 worth of computer
hardware and are just trying to work with some wimpy 20 MB table.
Probably it means that the rest of your installation was badly done as
well. Anyway, if you have dba privileges, you can investigate the
problem with
SELECT file_name, bytes, tablespace_name
FROM sys.dba_data_files;
I think there might be a way to fix the existing rollback data file to
make it autoextend. Alternatively, you can just add a datafile,
possibly on a different disk:
ALTER TABLESPACE RBS
ADD DATAFILE '/ora8/m03/oradata/ora8/rbs02.dbf' SIZE 50M
AUTOEXTEND ON
NEXT 5M
MAXSIZE 500M;
The datafile name here is a complete Unix path. This one is going to
start at 50 MB and grow up to 500 MB if necessary.
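It turns out you can also fix the existing datafile in place. The syntax below is a sketch; substitute the path that dba_data_files reported and your own size limits:

```sql
REM path is an example; use the one dba_data_files shows you
ALTER DATABASE DATAFILE '/ora8/m01/oradata/ora8/rbs01.dbf'
AUTOEXTEND ON
NEXT 5M
MAXSIZE 500M;
```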
I'm OK, you're dysfunctional
You cannot define what Informix would call a "functional index" in
Oracle until version 8.1.5. That is to say, you can't say "build me
an index on UPPER(EMAIL) in the USERS table". This sounds trivial but
it has horrifying implications. If you want to do a case-insensitive
search for the user with an email address of "foo@bar.com" then you'll
be sequentially scanning the entire table. If you want to enforce
uniqueness of the EMAIL_ADDRESS column in a case-insensitive manner,
you'll have to write a PL/SQL trigger that sequentially scans the
table before allowing an insert to proceed. If the USERS table has 1
million rows, this will take forever.
The only solution that I can think of is to use triggers to maintain a
separate column called UPPER_EMAIL_ADDRESS that stores the uppercased
version. Then declare it to be UNIQUE or define a UNIQUE index on that
column. But the whole process is so ugly that I'm reluctant to admit to
having done it.
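For the morbidly curious, here's roughly what the ugly version looks like (table, column, and trigger names are made up for illustration; the unique index does the actual enforcement, the trigger just keeps the shadow column in sync):

```sql
REM shadow column plus unique index; names here are hypothetical
alter table users add (upper_email_address varchar(100));

create unique index users_by_upper_email on users (upper_email_address);

create or replace trigger users_upper_email_trigger
before insert or update of email_address on users
for each row
begin
  -- keep the shadow column consistent with email_address
  :new.upper_email_address := upper(:new.email_address);
end;
/
```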
Here's how it works in Informix Universal Server:
create unique index users_by_email on users(upper(email));
You can do this as well in the current (September 2000; 8.1.7) version
of Oracle.
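In Oracle the CREATE INDEX syntax is the same as Informix's; the caveats below are from my reading of the 8i docs, so double-check them against your release (in 8i you also need the QUERY REWRITE privilege to create the index, and the cost-based optimizer with fresh statistics to actually use it):

```sql
REM needed per-session for the optimizer to consider the index in 8i
alter session set query_rewrite_enabled = true;

create unique index users_by_email on users (upper(email));

REM function-based indexes are only used by the cost-based optimizer
analyze table users compute statistics;
```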
The Oracle SQL Parser's Error Messages Won't Help You
If you type an SQL statement that Oracle doesn't like, it will give you
an error message. Unfortunately, the error message is almost never
helpful. Here's a table definition that SQL*Plus doesn't like.
create table photos (
photo_id integer not null primary key,
photographer_id integer not null references photographers,
film_roll_id integer references film_rolls,
photocd_id integer references photo_cds,
cd_image_number integer, -- will be null unless photocd_id is set
frame_number integer, -- will be null unless film_roll_id is set
-- oftentimes scans are not properly oriented and/or the film
-- was done emulsion-side out instead of emulsion-side in;
-- we need to store enough info to undo these
required_rotation integer, -- in degrees to the right, e.g., 90 for CW; NULL or 0 means none
requires_flip_p char(1) check (requires_flip_p in ('t','f')),
-- image conversion stuff
-- bit vectors done with ASCII 0 and 1; probably convert this to
-- Oracle 8 abstract data type
jpeg_resolutions char(6), -- only makes sense for images off PhotoCD
-- is this image going to be searchable by the public?
public_p char(1) check (public_p in ('t','f')), -- searchable?
model_released_p char(1) default 'f' check (model_released_p in ('t','f')),
exposure_date date,
title varchar(100),
-- this makes the most sense if we are delivering back to the user
-- a collection of JPEGs for serving from his own site; for us, we'd
-- probably rather store images in files tagged with photo_id
filename_stub varchar(100), -- we may append frame number or cd image number
caption varchar(4000),
tech_details varchar(4000), -- f-stop, shutter speed, film used
tutorial_info varchar(4000),
copyright_label varchar(100),
-- extra stuff for image retrieval
search_keywords varchar(4000),
-- can others else use this picture?
rights_personal_web_p char(1) check (rights_personal_web_p in ('t','f')),
rights_personal_print_p char(1) check (rights_personal_print_p in ('t','f')),
rights_nonprofit_web_p char(1) check (rights_nonprofit_web_p in ('t','f')),
rights_nonprofit_print_p char(1) check (rights_nonprofit_print_p in ('t','f')),
rights_comm_web_p char(1) check (rights_comm_web_p in ('t','f')),
rights_comm_print_p char(1) check (rights_comm_print_p in ('t','f'))
);
unknown command beginning "eb_p in ('..." - rest of line ignored.
For a list of known commands enter HELP
and to leave enter EXIT.
SQL> unknown command beginning "rights_per..." - rest of line ignored.
SQL> unknown command beginning "rights_non..." - rest of line ignored.
SQL> unknown command beginning "rights_non..." - rest of line ignored.
SQL> unknown command beginning "rights_com..." - rest of line ignored.
For a list of known commands enter HELP
and to leave enter EXIT.
SQL> unknown command beginning "rights_com..." - rest of line ignored.
SQL> unknown command ")" - rest of line ignored.
SQL> SQL>
So I stripped out all the comments and fed it to SQL*Plus again.
Same error message. So I fed the same SQL to AOLserver and got back the
following:
ORA-02267: column type incompatible with referenced column type SQL: [too long]
So I took out all the REFERENCES constraints and went back to
AOLserver's SQL evaluator and ended up with
ORA-00911: invalid character SQL: [too long]
I eventually figured out that PHOTOCD_ID should actually be VARCHAR(20)
rather than INTEGER. My mistake but certainly Oracle was no help in
finding it. So I fixed the problem and resubmitted the CREATE TABLE to
SQL*Plus.
unknown command beginning "or image r..." - rest of line ignored.
SQL> unknown command beginning "search_key..." - rest of line ignored.
I took out all the comments and tried again:
invalid starting line number
SQL> unknown command beginning "rights_per..." - rest of line ignored.
SQL> unknown command beginning "rights_non..." - rest of line ignored.
I fed the comment-free statement to AOLserver's SQL evaluator:
ORA-00911: invalid character SQL: [too long]
So I split up the table definition into a smaller CREATE and then a
subsequent ALTER and fed it to SQL*Plus (still without comments):
SQL> SQL>
create table photos (
photo_id integer not null primary key,
photographer_id integer not null references photographers,
film_roll_id integer references film_rolls,
photocd_id varchar(20) references photo_cds,
cd_image_number integer,
frame_number integer,
required_rotation integer,
requires_flip_p char(1) check (requires_flip_p in ('t','f')),
jpeg_resolutions char(6),
public_p char(1) check (public_p in ('t','f')),
model_released_p char(1) default 'f' check (model_released_p in ('t','f')),
exposure_date date,
title varchar(100),
filename_stub varchar(100),
caption varchar(4000),
tech_details varchar(4000),
tutorial_info varchar(4000),
copyright_label varchar(100),
search_keywords varchar(4000));
SQL> SQL> SQL> 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20
Table created.
SQL> alter table photos add (
rights_personal_web_p char(1) check (rights_personal_web_p in ('t','f')),
rights_personal_print_p char(1) check (rights_personal_print_p in ('t','f')),
rights_nonprofit_web_p char(1) check (rights_nonprofit_web_p in ('t','f')),
rights_nonprofit_print_p char(1) check (rights_nonprofit_print_p in ('t','f')),
rights_comm_web_p char(1) check (rights_comm_web_p in ('t','f')),
rights_comm_print_p char(1) check (rights_comm_print_p in ('t','f'))
);
2 3 4 5 6 7 8
Table altered.
SQL> SQL>
Worked perfectly! Only took me an hour to find one bad column
definition and define one table. I still hate how Illustra kept me up
at night with its deadlocks, but the same process would have taken about
2 minutes with Illustra since its SQL parser would have said "I don't
like PHOTOCD_ID".
Bumping up the log file size
More fun fixing up the unusable defaults set by the Oracle Installer
(world's greatest source of misery caused by a single program)...
Here's how to get from measly 512k logfiles to something bigger, so you
don't get a logfile switch every few seconds....
REM find out where the logfiles are now and how big they are
select v$logfile.member, v$log.status, v$log.bytes
from v$log, v$logfile
where v$log.group# = v$logfile.group#
/
REM add 3 new logfiles that are the right size...
REM of course a production system would add 2 or more members
REM per group on different devices... here we only have 1 member/group
alter database add logfile '/user2/oracle73/dbs/oradata/oracle73/newredo1.log'
size 10m;
alter database add logfile '/user2/oracle73/dbs/oradata/oracle73/newredo2.log'
size 10m;
alter database add logfile '/user2/oracle73/dbs/oradata/oracle73/newredo3.log'
size 10m;
REM issue the following until the CURRENT logfile is one of the
REM NEW logfiles. The first query above shows you the current logfile....
alter system switch logfile;
alter system switch logfile;
REM Now, drop the little logfiles we don't need anymore. You need to erase or
REM rm them from the filesystem yourself AFTER you drop them from the database.
alter database drop logfile
'/user2/oracle73/dbs/oradata/oracle73/redooracle7301.log';
alter database drop logfile
'/user2/oracle73/dbs/oradata/oracle73/redooracle7302.log';
alter database drop logfile
'/user2/oracle73/dbs/oradata/oracle73/redooracle7303.log';
PL/SQL Korner
Writing PL/SQL is pretty easy. The hard part for me is always getting
back from Oracle what I typed in. Here's the magic that you need:
select line, text
from dba_source
where name = 'YOUR_PROC_NAME'
order by line;
Note that you need dba privileges to execute this query.
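In SQL*Plus the TEXT column comes back badly wrapped unless you widen things first. These settings are a matter of taste:

```sql
REM suppress headings/pagination and widen the output
set pagesize 0
set linesize 200
column text format a120
select line, text
from dba_source
where name = 'YOUR_PROC_NAME'
order by line;
```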
Everyday problem solving
If Oracle rains on your parade with
ORA-02292: integrity constraint (PHOTONET.SYS_C003676)
violated - child record found
then you need to find the offending table with
select table_name
from user_constraints
where constraint_name = 'SYS_C003676';
If Oracle won't let you drop the table user_groups because
some other table is referring to it, type
column table_name format a20
column constraint_name format a15
column r_constraint_name format a15
select uc1.table_name, uc1.constraint_name, uc1.R_constraint_name, uc2.table_name
from user_constraints uc1, user_constraints uc2
where uc1.R_constraint_name = uc2.constraint_name
and uc1.constraint_type = 'R'
and uc2.table_name = 'USER_GROUPS';
TABLE_NAME CONSTRAINT_NAME R_CONSTRAINT_NA TABLE_NAME
-------------------- --------------- --------------- --------------------
LYDIA_INFO SYS_C004316 SYS_C003225 USER_GROUPS
MISC_INFO SYS_C004308 SYS_C003225 USER_GROUPS
SITE_STAFF_INFO SYS_C004306 SYS_C003225 USER_GROUPS
(Remember that you need to use uppercase "USER_GROUPS" in your query.)
You can infer from this query result that you'll have to drop
lydia_info, misc_info, and site_staff_info first.
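Alternatively, if what you want is user_groups gone and the referencing tables kept, Oracle will throw away the foreign key constraints for you:

```sql
drop table user_groups cascade constraints;
```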
asj-editors@arsdigita.com
Reader's Comments
A great free tool for working with oracle is Toad.
In particular it makes it very easy to edit PL/SQL
procedures in the database.
Get it at: http://www.toadsoft.com
-- David Tauzell, August 29, 1998
I'm not a DBA, I'm a sysadmin. But as a sysadmin,
you have to work with DBAs and you learn little
things about the RDBMS, like how to
startup/shutdown, how you do a backup, how you
find out which areas of the disk the RDBMS has
glommed onto, etc.
From my perspective, Informix is a far
better-designed product than Oracle. One of the
DBAs at work says Informix is five years ahead of
Oracle in technology, and I believe it.
Informix is also vastly easier to use and to
learn; after a year of working at an Oracle site I
had learned how to startup/shutdown. After a
month of working at an Informix site, I had
learned startup/shutdown, how to do backups, how
to find out what areas of the disk the database
has glommed onto, how to install the Perl DBI
module for Informix, how to run it from an
Intel/FreeBSD box to an HPPA/HP-UX box, etc.
Think of a computer-industry leader who is
fantastically rich, fantastically driven by the
art of the deal, who likes fast cars, who has
near-total market share, and whose products have
kinda crummy pragmatics (buggy, hard to work with,
hard to troubleshoot, beloved of managerial
types).
Not Bill Gates, it's Larry Ellison.
-- Daniel Boyd, September 22, 1998
Oracle documentation
The Oracle documentation became available around
a year ago at http://technet.oracle.com; you have
to register there before you can use it.
Locking
The lmode=6 clause finds all the exclusive locks.
A more direct way to find the blocking session
and the sessions that are blocked would use the
v$lock.block column.
col type heading "Lock Type" format a15
col lmode heading "Mode Held"
col request heading "Mode Requested"
col block heading "Blocking?"
select sid,type,lmode,request,block
from v$lock
where block != 0 or
request != 0
order by id1,id2,block desc
 SID Lock Type       Mode Held  Mode Requested  Blocking?
---- --------------- ---------- --------------  ---------
  16 TX                       6               0          1
  31 TX                       0               6          0
  15 TX                       0               6          0
From this you know that session 16 is blocking
sessions 31 and 15. You can now query v$sqltext
to find out what the sessions are doing.
select s.sid,s.serial#,a.sql_text
from v$sqltext a,v$session s
where a.hash_value = s.sql_hash_value and
s.sid in &sidstring
order by s.sid,a.hash_value,a.piece
SQL> /
Enter value for sidstring: (16,31,15)
 SID    SERIAL# SQL_TEXT
---- ---------- --------------------------------------------------------------
  15         30 update x set the_data = 'this is not a test' where the_key=1
  16         15 update x set the_data = 'this is a test' where the_key=1
  31        185 update x set the_data = 'this is also a test' where the_key=1
'' is the same as Null
For the next time around... you could default the
phone_number column to 'n/a' when you create the
table. If you load data behind the scenes you
could use insert statements that omit the
phone_number column. The loaded data will have
pn='n/a'; the data entered on the form will have
either null, or an actual phone number...
PL/SQL
You can see the text of your own
packages, procedures, and functions without the
select any table privilege:
select line, text
from user_source
where name = 'YOUR_PROC_NAME'
order by type, line;
-- Chaim Katz, January 5, 1999
I would have to agree with the Informix vs. Oracle comments. As far as 99% of the user-base is concerned, both products supply about the same level of service, cost about the same amount of money, and run about as fast.
I should mention at this point that I'm an Informix DBA by profession who was forced to become an Oracle DBA about a year ago due to interesting management decisions. This probably gives me a skewed perspective on things, so keep that in mind.
Anyway, the one place that the two systems really diverge in my opinion is in basic entry-level administration. While both systems provide a complete set of 'system tables' that an advanced DBA can write queries on, Informix is light-years ahead as far as friendly common tasks go. A good example of this would be the amount of effort to do common administrative procedures. Some examples follow.
I was assured in official Oracle training that I wanted to rebuild my b-tree indexes periodically to reclaim the space lost when rows were deleted from the main table. This surprised me, as Informix takes care of this detail automatically as data is inserted and deleted. I don't actually recall the command to do it while the index is in place, and Oracle doesn't provide a 'how to do daily maintenance' user guide (at least, not in my opinion).
To determine the amount of space used in every data file in the system, the Informix command to use would be 'onstat -d'. In Oracle, I have to write a query in PL/SQL to accomplish the same task.
In Informix, I can get a complete point-in-time binary backup of a running system by typing 'ontape -c -l0' (Create a level 0 backup). This can go to a tape, or a single file on disk. Oracle, on the other hand, requires you to:
- Obtain a list of all current datafiles.
- For each file:
Manually suspend activity on the file.
Copy the file to a free area.
Restore activity on the file.
- Add the system configuration files to the free area.
- Tar the individual files together to a tape.
Admittedly, this is not difficult to do. It requires a fairly short shell script that can be cobbled together in an hour or so. However, when running an enterprise-level system this has to be tested and validated. You'd probably want to add some form of intelligence to the script to restore access to locked files if the script is killed in mid-backup as well.
Quite simply, Informix has either automated or provided utilities to cover 99% of standard maintenance issues (consolidation, backups, etc). They both give you sufficient information that you could write your own if you wished, but Informix assumes that you'll be wanting to make hot backups, etc. and provides tested, supported tools to make your life easier.
After a couple of years of dealing with Oracle, I am beginning to see why Oracle DBAs are more in demand than Informix DBAs. I can (and do) administrate over 100 live production OLTP Informix databases with more ease and confidence than I have when working with our 2-3 internal Oracle development systems.
Aargh.
-- Richard Stanford, February 24, 1999
Another way to handle long strings is to break them up into segments. Instead of storing the text of a message directly in a message table, you add another table, say msg_text, with columns msg_id, seg_num, and seg_text. On insert, the message is split into segments limited to the smaller of the query limit and the database varchar limit, minus the overhead of the SQL query (the "insert..." stuff). It makes for an ugly insert operation, though. I had to do this to work with PostgreSQL, which has an 8K limit. If you split on a word boundary (a whitespace character) then a full-text indexer might be able to make sense of this.
-- Ari Halberstadt, March 4, 1999
ConText.. what ConText? Take a look at the Oracle Home Page.. Search engine licensed by Oracle is the Altavista Engine..... Or am I missing something here ???
-- Ravi Nagpal, May 20, 1999
Re: How to rename a column in Oracle
You can do something like the following,
though I agree it shouldn't be necessary:
SQL> describe phone_directory
Name Null? Type
------------------------------- -------- ----
NAME VARCHAR2(100)
PHONE VARCHAR2(100)
SQL> create table new_phone_directory (full_name varchar(100), phone varchar(100));
Table created.
SQL> insert into new_phone_directory (full_name, phone)
2 select name, phone from phone_directory;
1 row created.
SQL> commit;
Commit complete.
SQL> drop table phone_directory;
Table dropped.
SQL> rename new_phone_directory to phone_directory;
Table renamed.
-- Larry Mulcahy, November 15, 1999
Doctor Philip Greenspun, About the Oracle Documentation you say "I can't for the life of me figure out why this isn't simply available at www.oracle.com but it isn't.".
However, Oracle has all of their documentation on its technical site: technet.oracle.com. You have to register yourself to access the technet but it's free. Besides, nowadays, everybody asks for registering. Even to add this comment! :))
For instance, the Oracle8i Server Documentation, Release 8.1.5
may be found at http://technet.oracle.com/doc/server815.htm.
Best regards and thanks for your nice site,
Pedro Bizarro
-- Pedro Bizarro, December 27, 1999
This page is very helpful. Thanks. One thing: I see that Informix is more appreciated than Oracle. I think there is one point where Oracle is nevertheless interesting: the use of SQL for all administration processes of the database. This will be the most useful, I believe, for tomorrow's networked applications.
Note that it's possible with Oracle to compile all database operations into SQL scripts, which is very effective for migration or structural evolution of a distributed architecture.
Nevertheless, Oracle suffers from one great problem: a lot of important administration tasks cannot be done outside SQL statements, even if the most common tasks are automated under the enterprise tools.
Cedric A.
-- C A, January 4, 2000
To determine block size on Linux, use tune2fs. According to the documentation I found, block size can be Linux distribution specific. On Red Hat 6.1, it appears to be 4096, and tune2fs can be found at: /sbin/tune2fs.
-- Jerry Asher, January 14, 2000
More Linux filesystem stuff: On my (Debian) system, tune2fs will only change stuff; to see the blocksize use 'dumpe2fs -h' (without -h, you get a dump of the entire ondisk block structure, which can be interesting, but is probably not what you want). The blocksize is determined when the filesystem is created (mke2fs). I've got two at 4k, probably the Debian default, and one at 1k that I
created later, intending to put lots of small files on it (C code, sorry Phil). Now that Oracle is living on it, the choice seems less wise.
You'll need to be root to use this, as it directly accesses the raw disk device.
-- Steve Greenland, January 24, 2000
I just came across a HOWTO for installing Oracle 8i on Linux that may be of help to anyone attempting such a project:http://www.zx81.org.uk/computing/oracle/oracle-howto/oracle-howto.html
Good Luck.
-- John Milazzo, February 18, 2000
To get block size on RH Linux you can do a tune2fs -l /dev/disk1
-- Robert Bigin, March 1, 2000
If you use Oracle server, you need a copy of Embarcadero's DB Artisan.
It's an amazing product that allows you to change tables, indexes, and tablespaces without having to learn the arcane syntax and strange table names of Oracle innards. It's a DBA in a box and hugely better than the obtuse GUI that comes with Oracle. Unfortunately I think it runs only under Windoze. Be careful though, it's a powerful tool and can powerfully ruin your production DB in seconds. No affiliation, blah, blah...
-- Rob Quinn, March 22, 2000
The nice folk at Orasoft have produced some superb GTK+ based tools for working with Oracle. Our resident Oracle DBA swears by them.
-- Rodger Donaldson, August 3, 2000
I'm a DBA. I've never used Informix, but from what I've heard it's good. I've been using Oracle for almost 5 years now. What I can say is that it's a real headache in all parts, from the licensing to the technical side.
-- Shahrizal Muhamad Kasim, September 11, 2000
I've been working with Oracle for almost 20 years and found your perspective err, ummm entertaining. I think what you fail to realize is that Oracle is a career, not a lifestyle choice. :) Please keep reinforcing the importance of a good DBA; emphasis on the good. A good DBA has no problem sleeping at night.
-- Ruth Dejam, October 4, 2000
There exists the very nice SQL*Plus replacement -
yasql.
Take a look at this if you do not like SQL*Plus.
Oh, yes. It is written in perl...
-- Marcin Kasperski, May 19, 2001
Related Links
- Search engine for Oracle docs- It isn't always easy to find information in the Oracle docs in their "raw" form as found on technet.oracle.com. tahiti.oracle.com is a combination search engine + expert system that lets you find specific information within the database server docs. The system is built according to "Philip and Alex's" principles (no tcl though :-). (contributed by John Russell)
- Another search engine for the Oracle documentation (and more) - The Ixora web site incorporates another search engine for the Oracle documentation. It does not have all the expert system features of the one above, but is very good at pointing you to the right parts of the documentation very quickly. This is partly because it knows about synonyms and abbreviations for Oracle terminology. So if you search for information about "locks" you will not miss results that use the term "enqueues" instead, and so on.
The other nice thing about the Ixora search engine is that it simultaneously searches the best of the web - about 40 sites with high quality Oracle information - and it gives you links at the bottom of the results to continue your search on Google, Usenet, TechNet or MetaLink. (contributed by Steve Adams)