ACS Documentation : ACS Core Architecture Guide : Database Access API
db_* calls that accept a SQL/DML/DDL statement as an argument now require a statement name to be provided as the first argument. See SQL Abstraction below for details.
There were four significant problems with the way ACS previously used
the database (i.e., directly through the ns_db
interface):
Under the ns_db interface, begin transaction really means "turn auto-commit mode off" and end transaction means "commit the current transaction and turn auto-commit mode on." Thus if transactional code needed to call a routine which itself needed to operate transactionally, the semantics were non-obvious. Consider:
proc foo { db args } {
    db_transaction {
        ...
    }
}

db_transaction {
    db_dml unused "insert into greeble(bork) values(33)"
    foo $db
    db_dml unused "insert into greeble(bork) values(50)"
}

This would insert greeble #33 and do all the stuff in foo transactionally, but the end transaction in foo would actually cause a commit, and greeble #50 would later be inserted in auto-commit mode. This could cause subtle bugs: e.g., if the insert for greeble #50 failed, part of the "transaction" would already have been committed! This is not a good thing.
Every selection relied on the set_variables_after_query routine, which depends on an uplevel variable named selection (likewise for set_variables_after_subquery and subselection).
SQL had to be written for a single RDBMS, even though different database systems express the same logic with different constructs (e.g., DECODE on Oracle and CASE ... WHEN on Postgres).
To be clear, SQL abstraction is not fully implemented in ACS 3.4. The statement names supplied to each call are not used by the API at all. The API's design for SQL abstraction is in fact incomplete; unresolved issues include:
how to build WHERE clause criteria dynamically; how to construct an ORDER BY clause dynamically (Ben Adida has a proposed solution for this); and how to specify a statement's interface (e.g., what bind variables it expects and what columns its SELECT clause must contain if it's a query) without actually implementing the statement in a specific SQL dialect.
set_variables_after_query

set_variables_after_query is gone! (Well, it's still there, but you'll never need to use it.) The new API routines set local variables automatically. For instance:
db_1row select_names "select first_names, last_name from users where user_id = [ad_get_user_id]"
doc_body_append "Hello, $first_names $last_name!"

Like ns_db 1row, this will bomb if the query doesn't return any rows (no such user exists). If this isn't what you want, you can write:
if { [db_0or1row select_names "select first_names, last_name from users where user_id = [ad_get_user_id]"] } {
    doc_body_append "Hello, $first_names $last_name!"
} else {
    # Executed if the query returns no rows.
    doc_body_append "There's no such user!"
}

Selecting a bunch of rows is a lot prettier now:
db_foreach select_names "select first_names, last_name from users" {
    doc_body_append "Say hi to $first_names $last_name for me!<br>"
}

That's right, db_foreach is now like ns_db select plus a while loop plus set_variables_after_query plus an if statement (containing code to be executed if no rows are returned).
db_foreach select_names "select first_names, last_name from users where last_name like 'S%'" {
    doc_body_append "Say hi to $first_names $last_name for me!<br>"
} if_no_rows {
    doc_body_append "There aren't any users with last names beginning with S!"
}
doc_body_append "<ul>"

db_foreach select_names "select first_names, last_name, user_id from users" {
    # Automatically allocated a database handle from the main pool.
    doc_body_append "<li>User $first_names $last_name\n<ul>"

    db_foreach select_groups "select group_id from user_group_map where user_id = $user_id" {
        # There's a selection in progress, so we allocate a database handle
        # from the subquery pool for this selection.
        doc_body_append "<li>Member of group #$group_id.\n"
    } if_no_rows {
        # Not a member of any groups.
        doc_body_append "<li>Not a member of any group.\n"
    }
}

doc_body_append "</ul>"
db_release_unused_handles

A new handle isn't actually allocated and released for every selection, of course; as a performance optimization, the API keeps old handles around until db_release_unused_handles is invoked (or the script terminates). Note that there is no analogue to ns_db gethandle: the handle is always allocated automatically the first time it's needed.
db_dml presentation_delete "delete from wp_presentations where presentation_id = $presentation_id"

But what if $presentation_id isn't an integer? Suppose it's something like "3 or 1 = 1", yielding a query like

delete from wp_presentations where presentation_id = 3 or 1 = 1

which deletes every presentation in the database! We can get around this by carefully checking every string included in a SQL query, but a safer way is to use bind variables. This enables the following syntax:
db_dml presentation_delete "
    delete from wp_presentations where presentation_id = :presentation_id
"

The value for presentation_id is pulled from the Tcl variable $presentation_id (in the current stack frame), so even if input checking is spotty and $presentation_id is set to "3 or 1 = 1" as above, Oracle will attempt to delete presentations whose presentation ID is literally '3 or 1 = 1' (i.e., no presentations, since '3 or 1 = 1' can't possibly be a valid integer primary key for wp_presentations).
In general, since Oracle always considers the values of bind variables to be
literals, it becomes more difficult for users to perform URL surgery to trick
scripts into running dangerous queries and DML.
Every db_* command accepting a SQL command as an argument now supports bind variables. You can either use the -bind switch to provide an ns_set with bind variable values, or use the -bind switch to explicitly provide a list of bind variable names and values. The default behavior (when the -bind switch is omitted) is that these procedures expect to find local variables that correspond in name to the referenced bind variables, e.g.:
set user_id 123456
set role "administrator"

db_foreach user_group_memberships_by_role {
    select g.group_id, g.group_name
    from user_groups g, user_group_map map
    where g.group_id = map.group_id
    and map.user_id = :user_id
    and map.role = :role
} {
    # do something for each group of which user 123456 is in the role
    # of "administrator"
}

The value of the local Tcl variable user_id (123456) is bound to the user_id bind variable.
The -bind switch can take the name of an ns_set containing keys for each bind variable named in the query, e.g.:
set bind_vars [ns_set create]
ns_set put $bind_vars user_id 123456
ns_set put $bind_vars role "administrator"

db_foreach user_group_memberships_by_role {
    select g.group_id, g.group_name
    from user_groups g, user_group_map map
    where g.group_id = map.group_id
    and map.user_id = :user_id
    and map.role = :role
} -bind $bind_vars {
    # do something for each group in which user 123456 has the role
    # of "administrator"
}

Alternatively, as an argument to -bind you can specify a list of alternating name/value pairs for bind variables:
db_foreach user_group_memberships_by_role {
    select g.group_id, g.group_name
    from user_groups g, user_group_map map
    where g.group_id = map.group_id
    and map.user_id = :user_id
    and map.role = :role
} -bind [list user_id 123456 role "administrator"] {
    # do something for each group in which user 123456 has the role
    # of "administrator"
}
When processing a DML statement, Oracle coerces the empty string into null. (This coercion does not occur in the WHERE clause of a query, i.e., col = '' and col is null are not equivalent.)
As a result, when using bind variables, the only way to make Oracle
set a column value to null
is to set the corresponding
bind variable to the empty string, since a bind variable whose value
is the string "null" will be interpreted as the literal string "null".
These Oracle quirks complicate the process of writing clear and abstract DML. Here is an example that illustrates why:
#
# Given the table:
#
#   create table foo (
#       bar integer,
#       baz varchar(10)
#   );
#

set bar ""
set baz ""

db_dml foo_create "insert into foo(bar, baz) values(:bar, :baz)"
#
# the values of the "bar" and "baz" columns in the new row are both
# null, because Oracle has coerced the empty string (even for the
# numeric column "bar") into null in both cases

Since databases other than Oracle do not coerce empty strings into null, this code has different semantics depending on the underlying database (i.e., the row that gets inserted may not have null as its column values), which defeats the purpose of SQL abstraction.
Therefore, the Database Access API provides a database-independent way
to represent null
(instead of the Oracle-specific idiom
of the empty string): db_null
.
Use it instead of the empty string whenever you want to set a column
value explicitly to null
, e.g.:
set bar [db_null]
set baz [db_null]

db_dml foo_create "insert into foo(bar, baz) values(:bar, :baz)"
#
# sets the values for both the "bar" and "baz" columns to null
db_foreach, db_0or1row, and db_1row place the results of queries in Tcl variables, so you can say:
db_foreach users_select "select first_names, last_name from users" {
    doc_body_append "<li>$first_names $last_name\n"
}

However, sometimes this is not sufficient: you may need to examine the rows returned, to dynamically determine the set of columns returned by the query, or to avoid collisions with existing variables. You can use the -column_array and -column_set switches to db_foreach, db_0or1row, and db_1row to instruct the database routines to place the results in a Tcl array or ns_set, respectively, where the keys are the column names and the values are the column values. For example:
db_foreach users_select "select first_names, last_name from users" -column_set columns {
    # Now $columns is an ns_set.
    doc_body_append "<li>"
    for { set i 0 } { $i < [ns_set size $columns] } { incr i } {
        doc_body_append "[ns_set key $columns $i] is [ns_set value $columns $i]. \n"
    }
}

This will write out, for each row, every column name and its value.
Note that you never have to use ns_db anymore (including ns_db gethandle)! Just start doing stuff, and (if you want) call db_release_unused_handles when you're done as a hint to release the database handle.
db_null

Returns a value which represents the SQL value null when used in a bind variable. See Nulls and Bind Variables above.
db_foreach statement-name sql [ -bind bind_set_id | -bind bind_value_list ] \
[ -column_array array_name | -column_set set_name ] \
code_block [ if_no_rows if_no_rows_block ]
Performs the SQL query sql, executing code_block once for each row with variables set to column values (or a set or array populated if -column_array or -column_set is specified). If the query returns no rows, executes if_no_rows_block (if provided).
Example:
db_foreach select_foo "select foo, bar from greeble" {
    doc_body_append "<li>foo=$foo; bar=$bar\n"
} if_no_rows {
    doc_body_append "<li>There are no greebles in the database.\n"
}

The code block may contain break statements (which terminate the loop and flush the database handle) and continue statements (which continue to the next row of the loop).
db_1row statement-name sql [ -bind bind_set_id | -bind bind_value_list ] \
[ -column_array array_name | -column_set set_name ]
Performs the SQL query sql, setting variables to column values. Raises an error if the query does not return exactly 1 row.
Example:
db_1row select_foo "select foo, bar from greeble where greeble_id = $greeble_id"
# Bombs if there's no such greeble!
# Now $foo and $bar are set.
db_0or1row statement-name sql [ -bind bind_set_id | -bind bind_value_list ] \
[ -column_array array_name | -column_set set_name ]
Performs the SQL query sql. If a row is returned, sets variables to column values and returns 1. If no rows are returned, returns 0. If more than one row is returned, throws an error.
We plan to defer to the OpenACS team for the Postgres implementation
of the API.
db_string statement-name sql [ -default default ] [ -bind bind_set_id | -bind bind_value_list ]
Returns the first column of the result of the SQL query sql. If sql doesn't return a row, returns default (or throws an error if default is unspecified). Analogous to database_to_tcl_string and database_to_tcl_string_or_null.
db_list statement-name sql [ -bind bind_set_id | -bind bind_value_list ]
Returns a Tcl list of the values in the first column of the result of the SQL query sql. If sql doesn't return any rows, returns an empty list. Analogous to database_to_tcl_list.
db_list_of_lists statement-name sql [ -bind bind_set_id | -bind bind_value_list ]
Returns a Tcl list, each element of which is a list of all column values in a row of the result of the SQL query sql. If sql doesn't return any rows, returns an empty list. (Analogous to database_to_tcl_list_list.)
db_dml statement-name sql \
[ -bind bind_set_id | -bind bind_value_list ] \
[ -blobs blob_list | -clobs clob_list |
-blob_files blob_file_list | -clob_files clob_file_list ]
Performs the DML or DDL statement sql.

If a length-n list of blobs or clobs is provided, the SQL should return n blobs or clobs into the bind variables :1, :2, ... :n. blobs or clobs, if specified, should be a list of individual BLOBs or CLOBs to insert; blob_files or clob_files, if specified, should be a list of paths to files containing the data to insert. Only one of -blobs, -clobs, -blob_files, and -clob_files may be provided.
Example:

db_dml insert_photos "
    insert into photos(photo_id, image, thumbnail_image)
    values(photo_id_seq.nextval, empty_blob(), empty_blob())
    returning image, thumbnail_image into :1, :2
" -blob_files [list "/var/tmp/the_photo" "/var/tmp/the_thumbnail"]

This inserts a new row into the photos table, with the contents of the files /var/tmp/the_photo and /var/tmp/the_thumbnail in the image and thumbnail_image columns, respectively.
db_write_clob statement-name sql [ -bind bind_set_id | -bind bind_value_list ]
db_write_blob statement-name sql [ -bind bind_set_id | -bind bind_value_list ]
db_blob_get_file statement-name sql [ -bind bind_set_id | -bind bind_value_list ]
Analogous to ns_ora write_clob/write_blob/blob_get_file.
db_exec_plsql statement-name sql [ -bind bind_set_id | -bind bind_value_list ]

Executes sql as a PL/SQL block (the surrounding BEGIN and END; are supplied automatically). If the bind variable :1 is provided in the statement, it is considered an OUT bind variable and its value is returned.
db_exec_plsql last_visit_rotate ":1 := sec_rotate_last_visit(:browser_id, :current_time)"
db_release_unused_handles

Releases any allocated, unused database handles.
db_transaction transaction_code [
on_error error_code ]
Executes transaction_code with transactional semantics: either all of the database commands within transaction_code are committed to the database, or none of them are. Nested transactions are supported (end transaction is transparently ns_db dml'ed when the outermost transaction completes).

The optional on_error block is specified as db_transaction {transaction_code} on_error {error_code}. Any error generated in transaction_code will be caught automatically, and process control will transfer to error_code with a variable errmsg set. The error_code block can then clean up after the error, such as by presenting a usable error message to the user. Following the execution of error_code, the transaction will be aborted.
Alternatively, the command db_continue_transaction can be issued to continue the transaction. This command will commit any successful database commands when the transaction completes, assuming no further errors are raised. If you want to explicitly abort the transaction, call db_abort_transaction from within the transaction_code block or the error_code block.
Example 2:

db_transaction {
    db_dml test "nonsense"
} on_error {
    ad_return_complaint "The DML failed."
}

In this example, db_dml triggers an error, so control passes to the on_error block, which prints a readable error.
Example 3:

db_transaction {
    db_dml test {insert into footest values(1)}
    nonsense
    db_dml test {insert into footest values(2)}
}

In this example, the second command, "nonsense", triggers an error. There is no on_error block, so the transaction is immediately halted and aborted.
Example 4:

db_transaction {
    db_dml test {insert into footest values(1)}
    db_transaction {
        db_dml test {insert into footest values(2)}
        db_abort_transaction
    }
    db_dml test {insert into footest values(3)}
} on_error {
    db_continue_transaction
}

In this example, all of the DML statements are executed and committed. The call to db_abort_transaction signals that the transaction should be aborted, which activates the higher-level on_error block. That code issues a db_continue_transaction, which commits the transaction. Had there not been an on_error block, none of the DML statements would have been committed.
Example:

proc replace_the_foo { col } {
    db_transaction {
        db_dml delete_foo "delete from foo"
        db_dml insert_foo "insert into foo(col) values($col)"
    }
}

proc print_the_foo {} {
    doc_body_append "foo is [db_string get_foo "select col from foo"]<br>\n"
}

replace_the_foo 8
print_the_foo ; # Writes out "foo is 8"

db_transaction {
    replace_the_foo 14
    print_the_foo ; # Writes out "foo is 14"
    db_dml insert_other "insert into some_other_table(col) values(999)"
    ...
    db_abort_transaction
}

print_the_foo ; # Writes out "foo is 8"
db_abort_transaction

Aborts all levels of a transaction. That is, if this is called within several nested transactions, all of them are terminated. Use this instead of db_dml "abort" "abort transaction".
db_continue_transaction

Continues a transaction that would otherwise be aborted. May be called from a db_transaction on_error code block only.
db_resultrows

Returns the number of rows affected or returned by the previous statement.
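A usage sketch, assuming db_resultrows reports the rows affected by the most recent DML statement (the table and statement name are illustrative):

```tcl
db_dml prune_greebles "delete from greeble where bork < :threshold"
doc_body_append "Deleted [db_resultrows] greebles.\n"
```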
db_with_handle var code_block
Places a database handle into the variable var and executes code_block. This is useful when you don't want to have to use the new API (db_foreach, db_1row, etc.), but need to use database handles explicitly. For example:
proc lookup_the_foo { foo } {
db_with_handle db {
return [database_to_tcl_string $db "select ..."]
}
}
db_with_handle db {
# Now there's a database handle in $db.
set selection [ns_db select $db "select foo from bar"]
while { [ns_db getrow $db $selection] } {
set_variables_after_query
lookup_the_foo $foo
}
}
db_nullify_empty_string string

The convenience function db_nullify_empty_string returns [db_null] if its string argument is the empty string, and can be used to encapsulate another Oracle quirk:

set baz ""

# Clean out the foo table
#
db_dml unused "delete from foo"

db_dml unused "insert into foo(baz) values('$baz')"

set n_rows [db_string unused "select count(*) from foo where baz is null"]
#
# $n_rows is 1; in effect, the "baz is null" criterion is matching
# the empty string we just inserted (because of Oracle's coercion
# quirk)

To balance out this asymmetry, you can explicitly set baz to null by writing:

db_dml foo_insert "insert into foo(baz) values(:1)" -bind [list 1 [db_nullify_empty_string $baz]]
Implementation Design (work in progress)
The ideas here are preliminary, so please send feedback to michael@arsdigita.com. There
may well be a much simpler, superior design that I (Michael) am just
missing right now. If so, please let me know!
The basic idea is to translate the logical statement-name into an actual SQL statement, written in the appropriate SQL dialect for the RDBMS that is in use. The sql argument is essentially a convenience that enables the SQL for the "default dialect" to be written inline. For 3.4, we will probably use configuration parameters to tell the Database Access API what the default dialect is and what dialect is actually in use:
[ns/server/server_name/acs]
...
DefaultSQLDialect=oracle8
SQLDialect=postgres7

(An alternative approach would be to use the ACS Package Manager, i.e., install a "pseudo-package" with no actual code to indicate what RDBMS is installed. Then, the Database Access API could query the APM to figure out what SQL dialect to employ.)
For instructing the Database Access API to translate a named statement in a specific SQL dialect, we may define a new API call:

db_implement_statement statement_location statement_name sql_dialect sql

which would be called at server initialization time. The Database Access API will then know to use the SQL statement appropriate for the specified SQLDialect. (The name db_implement_statement is very tentative.)
Issues:

Is requiring each call to db_implement_statement to explicitly specify the statement location (e.g., "/bboard/q-and-a") too much of a pain? Can we make this more convenient somehow?

For sites with many statements rewritten in the specified SQLDialect, reading the rewritten SQL into memory for the life of the server may not be a good idea. The three basic approaches I can think of to implement the db_implement_statement API are to keep the rewritten statements in an nsv array, or to use one of the two file-based schemes described below.

Should we scrap the db_implement_statement API altogether?
One possibility is a file-based approach, where the alternative SQL statements would live in conventionally named and located files (e.g., /bboard/q-and-a.postgres7 would contain Postgres 7 versions of the SQL statements in /bboard/q-and-a.tcl). A potential con of this approach is that the Database Access API would have to perform file I/O for every SQL statement that's been rewritten. This may be a non-issue; I don't actually know. (We could augment this approach with caching too, perhaps a fixed-size LRU cache.)

Another similar approach would be just to have one massive, magic file for each SQL dialect that maps each statement identifier (location plus name) to the corresponding statement.
Version 2.1 of the ArsDigita Oracle Driver adds a set of ns_ora analogs for the following ns_db calls: 0or1row, 1row, select, and dml. (It also adds ns_ora array_dml.) Thus, the groundwork for implementing the above API for ACS/Oracle is already established.
jsalz@mit.edu