Index: openacs-4/packages/acs-core-docs/www/db-api.html =================================================================== RCS file: /usr/local/cvsroot/openacs-4/packages/acs-core-docs/www/db-api.html,v diff -u -r1.2 -r1.3 --- openacs-4/packages/acs-core-docs/www/db-api.html 17 Oct 2001 20:39:25 -0000 1.2 +++ openacs-4/packages/acs-core-docs/www/db-api.html 2 Feb 2002 03:47:32 -0000 1.3 @@ -1,28 +1,66 @@ -
-Table of Contents
By Pete Su
- This document is a short tutorial on the Database API, meant for - experienced ACS developers to get a jump-start on its usage. For a - more complete discussion of the DB API, you can read - the documentation in - the kernel. -
- Here's a typical block of code from an ACS 3.x dynamic page: -
-set tcl_var "foo" + + + + +The Database Access API + + + + + + + + + ++ ++ By Pete Su and + Jon Salz. Modified + by Roberto Mello. +
++ +++ One of OpenACS's great strengths is that code written for it is + very close to the database. It is very easy to interact with the + database from anywhere within OpenACS. Our goal is to develop a + coherent API for database access which makes this even easier. +
++ ++ Here's a typical block of code from an OpenACS 3.x dynamic page: +
++set tcl_var "foo" set db [ns_db gethandle] -ns_db dml $db "begin transaction" +ns_db dml $db "begin transaction" -set sql "select +set sql "select foo, bar, baz from some_table, some_other_table where some_table.id = some_other_table.id and some_table.condition_p = '$tcl_var' - " + " set selection [ns_db select $db $sql] set count 0 @@ -36,36 +74,50 @@ } ns_db releasehandle $db -
- Writing code like this had the following annoyances: + +
+ Writing code like this had the following annoyances: -
- It was repetitive, tedious and error prone to write the same type of - loops over and over again. -
- Using Tcl variable interpolation in a literal string, to pass values - from the page to the database, is error prone, relatively inefficient, - and a good way to compromise the security of a web site. -
- Magic like set_variables_after_query made code confusing. -
- The scope of transactions is not clear from reading the code. -
- Passing handles around explicitly made it easy to use them in bad - ways, like holding a handle for too long while returning data to a - user's browser. -
+ It was repetitive, tedious and error prone to write the same type of + loops over and over again. +
+ Using Tcl variable interpolation in a literal string, to pass values + from the page to the database, is error prone, relatively inefficient, + and a good way to compromise the security of a web site. +
+ Magic like set_variables_after_query made code confusing. +
+ The scope of transactions is not clear from reading the code. +
+ Passing handles around explicitly made it easy to use them in bad + ways, like holding a handle for too long while returning data to a + user's browser. +
- Introduced in ACS 3.4, the new Database API is meant to save - developers from the above tedium and provide a more structured syntax - for specifying database operations, including transactions. -
- Here is how you would code up the example above using the new API. -
+ ++ Introduced in ACS 3.4, the new Database API is meant to save + developers from the above tedium and provide a more structured syntax + for specifying database operations, including transactions. +
+ +
- There are several things to note here: + +
+ There are several things to note here: -
- No explicit code for grabbing and releasing handles. Usage of the - Database API implicitly deals with all handle management issues. -
- The new command db_transaction makes the scope of a - transaction clear. db_transaction takes the code block - argument and automatically runs it in the context of a transaction. -
- The new command db_foreach writes our old while loop for us. -
- Every SQL query has a name, meant to be unique within the server - instance (though this is not enforced). -
- Finally and most importantly, there is a new scheme for passing data - from a Tcl variable to the database, which we'll cover next. -
+ No explicit code for grabbing and releasing handles. Usage of the + Database API implicitly deals with all handle management issues. +
+ The new command db_transaction + makes the scope of a transaction + clear. db_transaction takes the + code block argument and automatically runs it in the context of a + transaction. +
+ The new command db_foreach writes + our old while loop for us. +
+ Every SQL query has a name, meant to be unique within the server + instance (though this is not enforced). +
+ Finally and most importantly, there is a new scheme for passing data + from a Tcl variable to the database, which we'll cover next. +
- - Bind variables are placeholders for literal values in an SQL query - being sent to the server. Take the example query above: in the old - way, data was generally passed to Oracle directly, via Tcl string - interpolation. So in the example above, the actual query we send would - look like this: -
+ + ++ ++ Bind variables are placeholders for literal values in an SQL query + being sent to the server. Take the example query above: in the old + way, data was generally passed to Oracle directly, via Tcl string + interpolation. So in the example above, the actual query we send would + look like this: +
+select foo, bar, baz from some_table, some_other_table where some_table.id=some_other_table.id and some_table.condition_p = 'foo' -
- There are a few problems with this: first, if the literal value is a - huge string, then we waste a lot of time in the database server doing - useless parsing. Second, if the literal value contains characters like - single quotes, we have to be careful to double-quote them, because not - quoting them will lead to surprising errors. Third, no type checking - occurs on the literal value. Finally, if the Tcl variable is passed in - or between web forms or otherwise subject to external modification, - there is nothing keeping malicious users from setting the Tcl variable - to some string that changes the query textually. This type of attack, - called SQL smuggling, can be very damaging - entire tables can be - exposed or have their contents deleted, for example. Another very - important reason for using bind variables is performance. Oracle caches - all previously parsed queries. If there are values in the where clause, - that is how the query is cached. It also performs bind variable - susbstitution after parsing the SQL statement. This means that SQL - statements that use bind variables will always match (assuming all else is - the same) while SQL statements that do not use bind variables will not - match unless the values in the statement are exactly the same. This will - improve performance considerably. -
- To fix all these problems, we replace literal values in the query with - a placeholder character, and then send the data along after. So the - query looks like this: -
+++ There are a few problems with this: +
+++
- + If the literal value is a + huge string, then we waste a lot of time in the database server doing + useless parsing. +
+- + Second, if the literal value contains characters like + single quotes, we have to be careful to double-quote them, because not + quoting them will lead to surprising errors. +
+- +
++ Third, no type checking + occurs on the literal value. Finally, if the Tcl variable is passed in + or between web forms or otherwise subject to external modification, + there is nothing keeping malicious users from setting the Tcl variable + to some string that changes the query textually. +
++ This type of attack, + called SQL smuggling, can be very + damaging - entire tables can be + exposed or have their contents deleted, for example. Another very + important reason for using bind variables is performance. Oracle caches + all previously parsed queries. If there are values in the where clause, + that is how the query is cached. It also performs bind variable + susbstitution after parsing the SQL statement. This means that SQL + statements that use bind variables will always match (assuming all else is + the same) while SQL statements that do not use bind variables will not + match unless the values in the statement are exactly the same. This will + improve performance considerably. +
++ To fix all these problems, we replace literal values in the query with + a placeholder character, and then send the data along after. So the + query looks like this: +
+select foo, bar, baz from some_table, some_other_table where some_table.id = some_other_table.id and some_table.condition_p = ? -
- The '?' character means "This will be filled in later with literal - data". In use, you might write code that looks like this: -
-set statement [prepare_query " +++ The '?' character means "This will be filled in later with literal + data". In use, you might write code that looks like this: +
++set statement [prepare_query " select foo, bar, baz from some_table, some_other_table where some_table.id = some_other_table.id and some_table.condition_p = ? - "] + "] [bind_param $statement 1 $tcl_var] -
- The above example is meant to be psuedo-Tcl - no API like this - actually exists. What happens is that we first send the SQL statement - to the server for parsing, then later we bind values to the - placeholders, and send those values along seperately. This seperate - binding step is where the term bind variable comes from. -
- This split has several advantages. First, type checking happens on the - literal. If the column we are comparing against holds numbers, and we - send a string, we get a nice error. Second, since string literals are - no longer in the query, no extra quoting is required. Third, - substitution of bind variables cannot change the actual text of the - query, only the literal values in the placeholders. -
- The database API makes bind variables easy to use by hooking them - smoothly into the Tcl runtime. Rather than using a '?' as a generic - placeholder, you use a colon followed by the name of the Tcl variable - that you wish to pass as a literal. So here's the final, real-life - form of the example query: -
+++ The above example is meant to be psuedo-Tcl - no API like this + actually exists. What happens is that we first send the SQL statement + to the server for parsing, then later we bind values to the + placeholders, and send those values along seperately. This seperate + binding step is where the term bind variable comes from. +
++ This split has several advantages. First, type checking happens on the + literal. If the column we are comparing against holds numbers, and we + send a string, we get a nice error. Second, since string literals are + no longer in the query, no extra quoting is required. Third, + substitution of bind variables cannot change the actual text of the + query, only the literal values in the placeholders. +
++ The database API makes bind variables easy to use by hooking them + smoothly into the Tcl runtime. Rather than using a '?' as a generic + placeholder, you use a colon followed by the name of the Tcl variable + that you wish to pass as a literal. So here's the final, real-life + form of the example query: +
+select foo, bar, baz from some_table, some_other_table where some_table.id = some_other_table.id and some_table.condition_p = :tcl_var -
- The database API parses the query and pulls out all the bind variable - specifications and replaces them with generic placeholders. It then - automatically pulls the values of the named Tcl vars out of the - runtime environment of the script, and passes them to the database. -
- Note that while this looks like a simple syntactic change, it really - is very different from how we've written queries in the past. You use - bind variables to replace what would otherwise be a literal value in a - query, and Tcl style string interpolation does not happen. So you - cannot do something like: -
-set table "baz" -set condition "where foo = bar" +++ The database API parses the query and pulls out all the bind variable + specifications and replaces them with generic placeholders. It then + automatically pulls the values of the named Tcl vars out of the + runtime environment of the script, and passes them to the database. +
++ Note that while this looks like a simple syntactic change, it really + is very different from how we've written queries in the past. You use + bind variables to replace what would otherwise be a literal value in a + query, and Tcl style string interpolation does not happen. So you + cannot do something like: +
++set table "baz" +set condition "where foo = bar" db_foreach my_query { select :table from some_table where :condition } -
- SQL will not allow a literal to occur where we've put the bind - variables, so the query is syntactically incorrect. You have to - remember that while the bind variable syntax looks similar to variable - interpolation in Tcl, it is not the same thing at all. -
- Finally, the DB API has several different styles for passing bind - variable values to queries. In general, use the style presented here - because it is the most convenient. You'll see the other styles - described in the detailed API document, but they're not important. -
- The Database API has several functions that wrap familiar parts of the - AOLserver database API. We won't cover the details of all the calls - here; instead, we'll list the most frequently used calls, with links - to further detail: + +
+ SQL will not allow a literal to occur where we've put the bind + variables, so the query is syntactically incorrect. You have to + remember that while the bind variable syntax looks similar to variable + interpolation in Tcl, it is not the same thing at all. +
++ Finally, the DB API has several different styles for passing bind + variable values to queries. In general, use the style presented here + because it is the most convenient. +
+Every db_* command accepting a SQL command as an argument + supports bind variables. You can either
++ Specify the -bind switch to provide a set with bind variable + values, or +
+ Specify the -bind switch to explicitly provide a list of + bind variable names and values, or +
+ Not specify a bind variable list at all, in which case Tcl variables are + used as bind variables. +
+ The default behavior (i.e., if the -bind switch is omitted) is + that these procedures expect to find local variables that correspond in name + to the referenced bind variables, e.g.: +
+-+set user_id 123456 +set role "administrator" -
- db_null
- Returns a value which can be used in a bind variable to represent - the SQL value null. You should use this instead of empty - strings when passing nulls to the database, because using the empty - string as the null value is not portable across databases. See - Nulls and Bind Variables. -
- - db_foreach - query_name sql { code } -
Runs a query returning multiple rows and executes a code block for - each row returned. -
- - db_1row - query_name sql
This is similar to the old ns_db 1row call.
- - db_0or1row - query_name sql
This is similar to the old call ns_db 0or1row
- - db_string - query_name sql -
Returns the first column of the result of a SQL query as a - string. Analogous to database_to_tcl_string and - database_to_tcl_string_or_null. -
- - db_list - query_name sql
Returns a Tcl list of values from the first column of each result row - of a SQL query. Similar to the old database_to_tcl_list. -
- - db_dml - query_name dml -
Performs a DML or DDL statement.
- - db_transaction - { code } -
Performs a block of Tcl code in the scope of a transaction.
- - db_abort_transaction - { code } -
Aborts a transaction.
($Id$)
+ The value of the local Tcl variable user_id (123456) is bound to + the user_id bind variable. +
+The -bind switch can takes the name of an ns_set + containing keys for each bind variable named in the query, e.g.:
++ +set bind_vars [ns_set create] +ns_set put $bind_vars user_id 123456 +ns_set put $bind_vars role "administrator" + +db_foreach user_group_memberships_by_role { + select g.group_id, g.group_name + from user_groups g, user_group_map map + where g.group_id = map.user_id + and map.user_id = :user_id + and map.role = :role +} -bind $bind_vars { + # do something for each group in which user 123456 has the role + # of "administrator" +} + ++
+ Alternatively, as an argument to -bind you can specify a list of + alternating name/value pairs for bind variables: +
++ +db_foreach user_group_memberships_by_role { + select g.group_id, g.group_name + from user_groups g, user_group_map map + where g.group_id = map.user_id + and map.user_id = :user_id + and map.role = :role +} -bind [list user_id 123456 role "administrator"] { + # do something for each group in which user 123456 has the role + # of "administrator" +} + ++
+ When processing a DML statement, Oracle coerces empty strings into + null. (This coercion does not occur in the + WHERE clause of a query, i.e. + col = '' and + col is null are not equivalent.) +
+As a result, when using bind variables, the only way to make Oracle set a + column value to null is to set the corresponding bind variable + to the empty string, since a bind variable whose value is the string + "null" will be interpreted as the literal string + "null".
+These Oracle quirks complicate the process of writing clear and abstract + DML difficult. Here is an example that illustrates why:
++ +# +# Given the table: +# +# create table foo ( +# bar integer, +# baz varchar(10) +# ); +# + +set bar "" +set baz "" + +db_dml foo_create "insert into foo(bar, baz) values(:bar, :baz)" +# +# the values of the "bar" and "baz" columns in the new row are both +# null, because Oracle has coerced the empty string (even for the +# numeric column "bar") into null in both cases + ++
+ Since databases other than Oracle do not coerce empty strings into + null, this code has different semantics depending on the + underlying database (i.e., the row that gets inserted may not have null as + its column values), which defeats the purpose of SQL abstraction. +
+Therefore, the Database Access API provides a database-independent way to + represent null (instead of the Oracle-specific idiom of the + empty string): db_null.
+Use it instead of the empty string whenever you want to set a column value + explicitly to null, e.g.:
++ +set bar [db_null] +set baz [db_null] + +db_dml foo_create "insert into foo(bar, baz) values(:bar, :baz)" +# +# sets the values for both the "bar" and "baz" columns to null + ++
+ The database library can transparently maintain pools of sequence values, so + that each request for a new sequence value (using db_nextval) + does not incur a roundtrip to the server. For instance, this functionality is + very useful in the security/sessions library, which very frequently allocates + values from the sec_id_seq sequence. To utilize this + functionality for a particular sequence, register the sequence to be pooled, + either using the db_register_pooled_sequence procedure at server + startup time, or by including a configuration parameter of the form +
++ +PoolSequence.sequence_name_seq=count + ++
+ in any configuration section in the yourservername.ini + file, e.g., e.g., +
+
+
+[ns/server/yourservername/acs/security]
+PoolSequence.sec_id_seq=20
+
+
++ The database library will allocate this number of sequence values at server + startup. It will periodically scan pools and allocate new values for + sequences which are less than half-full. (This normally occurs every 60 + seconds, and is configurable via the + PooledSequenceUpdateInterval parameter in the + [ns/server/ + yourservername + /acs/database] configuration + section.) +
++ The Database API has several functions that wrap familiar parts of the + AOLserver database API. +
++ Note that you never have to use ns_db anymore (including + ns_db gethandle)! Just start doing stuff, and (if you want) call + db_release_unused_handles when you're done as a hint to + release the database handle. +
++db_abort_transaction ++
Aborts all levels of a transaction. That is if this is called within + several nested transactions, all of them are terminated. Use this insetead of + db_dml "abort" "abort transaction". + + +
++db_null ++
+ Returns a value which can be used in a bind variable + to represent the SQL value + null. + See Nulls and + Bind Variables above. +
++db_foreach statement-name sql [ -bind bind_set_id | -bind bind_value_list ] \ + [ -column_array array_name | -column_set set_name ] \ + code_block [ if_no_rows if_no_rows_block ] ++
+ Performs the SQL query + sql + , executing + code_block + once for each row + with variables set to column values (or a set or array + populated if + -column_array or + column_set is + specified). If the query returns no rows, executes + if_no_rows_block + (if provided). +
+Example:
++ +db_foreach select_foo "select foo, bar from greeble" { + doc_body_append "<li>foo=$foo; bar=$bar\n" +} if_no_rows { + doc_body_append "<li>There are no greebles in the database.\n" +} + ++
+ The code block may contain break statements (which terminate the + loop and flush the database handle) and continue statements + (which continue to the next row of the loop).
++db_1row statement-name sql [ -bind bind_set_id | -bind bind_value_list ] \ + [ -column_array array_name | -column_set set_name ] ++
+ Performs the SQL query + sql, + setting variables to column values. Raises an error if the + query does not return exactly 1 row. +
+Example:
++ +db_1row select_foo "select foo, bar from greeble where greeble_id = $greeble_id" +# Bombs if there's no such greeble! +# Now $foo and $bar are set. + ++
+db_0or1row statement-name sql [ -bind bind_set_id | -bind bind_value_list ] \ + [ -column_array array_name | -column_set set_name ] ++
+ Performs the SQL query + sql. + If a row is returned, sets variables to column values and + returns 1. If no rows are returned, returns 0. If more + than one row is returned, throws an error. +
+
+db_nextval sequence-name
+
++ Returns the next value for the sequence sequence-name (using a + SQL statement like SELECT + sequence-name.nextval FROM + DUAL). If sequence pooling is enabled for the sequence, transparently + uses a value from the pool if available to save a round-trip to the database + (see Sequence Pooling). +
++db_register_pooled_sequence sequence-name pool-size ++
Registers the sequence sequence-name to be pooled, with a pool + size of pool-size sequence values + (see Sequence Pooling). + +
++db_string statement-name sql [ -default default ] [ -bind bind_set_id | -bind bind_value_list ] ++
Returns the first column of the result of SQL query + sql. + If sql doesn't return a + row, returns + default + (or throws an error if + default is unspecified). Analogous to + database_to_tcl_string and + database_to_tcl_string_or_null. + +
++db_list statement-name sql [ -bind bind_set_id | -bind bind_value_list ] ++
Returns a Tcl list of the values in the first column of the result of SQL + query + sql. + If sql doesn't + return any rows, returns an empty list. Analogous to + database_to_tcl_list. + +
++db_list_of_lists statement-name sql [ -bind bind_set_id | -bind bind_value_list ] ++
Returns a Tcl list, each element of which is a list of all column values + in a row of the result of SQL query sql. If + sql doesn't return any rows, returns an empty list. + (Analogous to database_to_tcl_list_list.) + +
++db_dml statement-name sql \ + [ -bind bind_set_id | -bind bind_value_list ] \ + [ -blobs blob_list | -clobs clob_list | + -blob_files blob_file_list | -clob_files clob_file_list ] ++
Performs the DML or DDL statement sql.
+If a length-n list of blobs or clobs is provided, then the SQL + should return n blobs or clobs into the bind variables + :1, :2, ... :n. + blobs or clobs, if specified, + should be a list of individual BLOBs or CLOBs to insert; + blob_files or clob_files, if + specified, should be a list of paths to files containing the data to + insert. Only one of -blobs, -clobs, + -blob_files, and -clob_files may be provided.
+Example:
++ +db_dml insert_photos " + insert photos(photo_id, image, thumbnail_image) + values(photo_id_seq.nextval, empty_blob(), empty_blob()) + returning image, thumbnail_image into :1, :2 + " -blob_files [list "/var/tmp/the_photo" "/var/tmp/the_thumbnail"] + ++
+ This inserts a new row into the photos table, with the contents + of the files /var/tmp/the_photo and + /var/tmp/the_thumbnail in the image and + thumbnail columns, respectively. + + +
++db_write_clob statement-name sql [ -bind bind_set_id | -bind bind_value_list ] + +db_write_blob statement-name sql [ -bind bind_set_id | -bind bind_value_list ] + +db_blob_get_file statement-name sql [ -bind bind_set_id | -bind bind_value_list ] ++
Analagous to ns_ora write_clob/write_blob/blob_get_file. + + +
++ db_release_unused_handles ++
Releases any allocated, unused database handles.
++db_transaction code_block [ on_error { code_block } ] ++
Executes code_block transactionally. Nested + transactions are supported (end transaction is transparently + ns_db dml'ed when the outermost transaction completes). The + db_abort_transaction command can be used to abort all levels of + transactions. It is possible to specify an optional on_error + code block that will be executed if some code in code_block throws + an exception. The variable errmsg will be bound in that scope. + If there is no on_error code, any errors will be propagated.
+Example:
++ +proc replace_the_foo { col } { + db_transaction { + db_dml "delete from foo" + db_dml "insert into foo(col) values($col)" + } +} + +proc print_the_foo {} { + doc_body_append "foo is [db_string "select col from foo"]<br>\n" +} + +replace_the_foo 8 +print_the_foo ; # Writes out "foo is 8" + +db_transaction { + replace_the_foo 14 + print_the_foo ; # Writes out "foo is 14" + db_dml "insert into some_other_table(col) values(999)" + ... + db_abort_transaction +} on_error { + doc_body_append "Error in transaction: $errmsg" +} + + +print_the_foo ; # Writes out "foo is 8" + ++
+db_resultrows ++
Returns the number of rows affected or returned by the previous + statement. + + +
++db_with_handle var code_block ++
Places a database handle into the variable var and + executes code_block. This is useful when you don't + want to have to use the new API (db_foreach, + db_1row, etc.), but need to use database handles explicitly.
+Example:
++ +proc lookup_the_foo { foo } { + db_with_handle db { + return [db_string unused "select ..."] + } +} + +db_with_handle db { + # Now there's a database handle in $db. + set selection [ns_db select $db "select foo from bar"] + while { [ns_db getrow $db $selection] } { + set_variables_after_query + + lookup_the_foo $foo + } +} + ++
+db_nullify_empty_string string
+
+For true SQL purists, we provide the convenience function + db_nullify_empty_string, which returns + [db_null] if its string argument is the empty string + and can be used to encapsulate another Oracle quirk:
++ +set baz "" + +# Clean out the foo table +# +db_dml unused "delete from foo" + +db_dml unused "insert into foo(baz) values('$baz')" + +set n_rows [db_string unused "select count(*) from foo where baz is null"] +# +# $n_rows is 1; in effect, the "baz is null" criterion is matching +# the empty string we just inserted (because of Oracle's coercion +# quirk) + ++
+ To balance out this asymmetry, you can explicitly set baz to + null by writing: +
++ +db_dml foo_insert "insert into foo(baz) values(:1)" {[db_nullify_empty_string $baz]} + ++
+