5. The Database Access API

Table of Contents

5.1. Overview
5.2. The Old Way
5.3. The New Way
5.4. Bind Variables
5.5. Basic API

5.1. Overview

- This document is a short tutorial on the Database API, meant for - experienced ACS developers to get a jump-start on its usage. For a - more complete discussion of the DB API, you can read - the documentation in - the kernel. -

5.2. The Old Way

- Here's a typical block of code from an ACS 3.x dynamic page: -

-set tcl_var "foo"
+
+
+
+
+The Database Access API
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+Prev� Chapter 4. OpenACS Developer's Guide �Next
+
+
+
+
+
+The Database Access API
+
+    By Pete Su and
+    Jon Salz. Modified
+    by Roberto Mello.
+  
+
+
+Overview
+
+      One of OpenACS's great strengths is that code written for it is
+      very close to the database. It is very easy to interact with the
+      database from anywhere within OpenACS. Our goal is to develop a
+      coherent API for database access which makes this even easier.
+    
+
+
+
+The Old Way
+
+      Here's a typical block of code from an OpenACS 3.x dynamic page:
+    
++set tcl_var "foo"
 
 set db [ns_db gethandle]
 
-ns_db dml $db "begin transaction"
+ns_db dml $db "begin transaction"
 
-set sql "select 
+set sql "select 
              foo, 
              bar, 
              baz 
 	 from some_table, 
 	     some_other_table
 	 where some_table.id = some_other_table.id  
 	     and some_table.condition_p = '$tcl_var'
-	 "
+	 "
 
 set selection [ns_db select $db $sql] 
 set count 0 
@@ -36,36 +74,50 @@
 }
 
 ns_db releasehandle $db
-

- Writing code like this had the following annoyances: + +

+ Writing code like this had the following annoyances: -

- It was repetitive, tedious and error prone to write the same type of - loops over and over again. -
- Using Tcl variable interpolation in a literal string, to pass values - from the page to the database, is error prone, relatively inefficient, - and a good way to compromise the security of a web site. -
- Magic like set_variables_after_query made code confusing. -
- The scope of transactions is not clear from reading the code. -
- Passing handles around explicitly made it easy to use them in bad - ways, like holding a handle for too long while returning data to a - user's browser. -

+ It was repetitive, tedious and error prone to write the same type of + loops over and over again. +
+ Using Tcl variable interpolation in a literal string, to pass values + from the page to the database, is error prone, relatively inefficient, + and a good way to compromise the security of a web site. +
+ Magic like set_variables_after_query made code confusing. +
+ The scope of transactions is not clear from reading the code. +
+ Passing handles around explicitly made it easy to use them in bad + ways, like holding a handle for too long while returning data to a + user's browser. +

- Introduced in ACS 3.4, the new Database API is meant to save - developers from the above tedium and provide a more structured syntax - for specifying database operations, including transactions. -

5.3. The New Way

- Here is how you would code up the example above using the new API. -

+    
+
+      Introduced in ACS 3.4, the new Database API is meant to save
+      developers from the above tedium and provide a more structured syntax
+      for specifying database operations, including transactions.
+    
+
+
+
+The New Way
+
+      Here is how you would code up the example above using the new API.
+    
+ set count 0 
 
-set tcl_var "foo"
+set tcl_var "foo"
 
 set sql {
        	select 
@@ -84,181 +136,804 @@
             	incr count
         }
 }
-

- There are several things to note here: + +

+ There are several things to note here: -

- No explicit code for grabbing and releasing handles. Usage of the - Database API implicitly deals with all handle management issues. -
- The new command db_transaction makes the scope of a - transaction clear. db_transaction takes the code block - argument and automatically runs it in the context of a transaction. -
- The new command db_foreach writes our old while loop for us. -
- Every SQL query has a name, meant to be unique within the server - instance (though this is not enforced). -
- Finally and most importantly, there is a new scheme for passing data - from a Tcl variable to the database, which we'll cover next. -

+ No explicit code for grabbing and releasing handles. Usage of the + Database API implicitly deals with all handle management issues. +
+ The new command db_transaction + makes the scope of a transaction + clear. db_transaction takes the + code block argument and automatically runs it in the context of a + transaction. +
+ The new command db_foreach writes + our old while loop for us. +
+ Every SQL query has a name, meant to be unique within the server + instance (though this is not enforced). +
+ Finally and most importantly, there is a new scheme for passing data + from a Tcl variable to the database, which we'll cover next. +

5.4. Bind Variables

- - Bind variables are placeholders for literal values in an SQL query - being sent to the server. Take the example query above: in the old - way, data was generally passed to Oracle directly, via Tcl string - interpolation. So in the example above, the actual query we send would - look like this: -

+ +

+Bind Variables

+ Bind variables are placeholders for literal values in an SQL query + being sent to the server. Take the example query above: in the old + way, data was generally passed to Oracle directly, via Tcl string + interpolation. So in the example above, the actual query we send would + look like this: +

 select 
     foo, 
     bar, 
     baz 
 from some_table, some_other_table
 where some_table.id=some_other_table.id  
 and some_table.condition_p = 'foo'
-

- There are a few problems with this: first, if the literal value is a - huge string, then we waste a lot of time in the database server doing - useless parsing. Second, if the literal value contains characters like - single quotes, we have to be careful to double-quote them, because not - quoting them will lead to surprising errors. Third, no type checking - occurs on the literal value. Finally, if the Tcl variable is passed in - or between web forms or otherwise subject to external modification, - there is nothing keeping malicious users from setting the Tcl variable - to some string that changes the query textually. This type of attack, - called SQL smuggling, can be very damaging - entire tables can be - exposed or have their contents deleted, for example. Another very - important reason for using bind variables is performance. Oracle caches - all previously parsed queries. If there are values in the where clause, - that is how the query is cached. It also performs bind variable - susbstitution after parsing the SQL statement. This means that SQL - statements that use bind variables will always match (assuming all else is - the same) while SQL statements that do not use bind variables will not - match unless the values in the statement are exactly the same. This will - improve performance considerably. -

- To fix all these problems, we replace literal values in the query with - a placeholder character, and then send the data along after. So the - query looks like this: -

+    
+
+ There are a few problems with this: +
+
+
+ If the literal value is a + huge string, then we waste a lot of time in the database server doing + useless parsing. +
+
+ Second, if the literal value contains characters like + single quotes, we have to be careful to double-quote them, because not + quoting them will lead to surprising errors. +
+
+
+ Third, no type checking + occurs on the literal value. Finally, if the Tcl variable is passed in + or between web forms or otherwise subject to external modification, + there is nothing keeping malicious users from setting the Tcl variable + to some string that changes the query textually. +
+
+ This type of attack, + called SQL smuggling, can be very + damaging - entire tables can be + exposed or have their contents deleted, for example. Another very + important reason for using bind variables is performance. Oracle caches + all previously parsed queries. If there are values in the where clause, + that is how the query is cached. It also performs bind variable + susbstitution after parsing the SQL statement. This means that SQL + statements that use bind variables will always match (assuming all else is + the same) while SQL statements that do not use bind variables will not + match unless the values in the statement are exactly the same. This will + improve performance considerably. +
+
+
+
+ To fix all these problems, we replace literal values in the query with + a placeholder character, and then send the data along after. So the + query looks like this: +
+
 select 
     foo, 
     bar, 
     baz 
 from some_table, some_other_table
 where some_table.id = some_other_table.id
 and some_table.condition_p = ?
-		

- The '?' character means "This will be filled in later with literal - data". In use, you might write code that looks like this: -

-set statement [prepare_query "
+

+ The '?' character means "This will be filled in later with literal + data". In use, you might write code that looks like this: +

+set statement [prepare_query "
     		select 
         		foo, 
         		bar, 
         		baz 
     		from some_table, some_other_table
     		where some_table.id = some_other_table.id  
     		and some_table.condition_p = ?
-	       "]
+	       "]
 
 [bind_param $statement 1 $tcl_var]
-

- The above example is meant to be psuedo-Tcl - no API like this - actually exists. What happens is that we first send the SQL statement - to the server for parsing, then later we bind values to the - placeholders, and send those values along seperately. This seperate - binding step is where the term bind variable comes from. -

- This split has several advantages. First, type checking happens on the - literal. If the column we are comparing against holds numbers, and we - send a string, we get a nice error. Second, since string literals are - no longer in the query, no extra quoting is required. Third, - substitution of bind variables cannot change the actual text of the - query, only the literal values in the placeholders. -

- The database API makes bind variables easy to use by hooking them - smoothly into the Tcl runtime. Rather than using a '?' as a generic - placeholder, you use a colon followed by the name of the Tcl variable - that you wish to pass as a literal. So here's the final, real-life - form of the example query: -

+    
+
+ The above example is meant to be psuedo-Tcl - no API like this + actually exists. What happens is that we first send the SQL statement + to the server for parsing, then later we bind values to the + placeholders, and send those values along seperately. This seperate + binding step is where the term bind variable comes from. +
+
+ This split has several advantages. First, type checking happens on the + literal. If the column we are comparing against holds numbers, and we + send a string, we get a nice error. Second, since string literals are + no longer in the query, no extra quoting is required. Third, + substitution of bind variables cannot change the actual text of the + query, only the literal values in the placeholders. +
+
+ The database API makes bind variables easy to use by hooking them + smoothly into the Tcl runtime. Rather than using a '?' as a generic + placeholder, you use a colon followed by the name of the Tcl variable + that you wish to pass as a literal. So here's the final, real-life + form of the example query: +
+
 select 
     foo, 
     bar, 
     baz 
 from some_table, some_other_table
 where some_table.id = some_other_table.id  
 and some_table.condition_p = :tcl_var
-		

- The database API parses the query and pulls out all the bind variable - specifications and replaces them with generic placeholders. It then - automatically pulls the values of the named Tcl vars out of the - runtime environment of the script, and passes them to the database. -

- Note that while this looks like a simple syntactic change, it really - is very different from how we've written queries in the past. You use - bind variables to replace what would otherwise be a literal value in a - query, and Tcl style string interpolation does not happen. So you - cannot do something like: -

-set table "baz"
-set condition "where foo = bar"
+    
+
+ The database API parses the query and pulls out all the bind variable + specifications and replaces them with generic placeholders. It then + automatically pulls the values of the named Tcl vars out of the + runtime environment of the script, and passes them to the database. +
+
+ Note that while this looks like a simple syntactic change, it really + is very different from how we've written queries in the past. You use + bind variables to replace what would otherwise be a literal value in a + query, and Tcl style string interpolation does not happen. So you + cannot do something like: +
+
+set table "baz"
+set condition "where foo = bar"
 
 db_foreach my_query { select :table from some_table where :condition }
-		

- SQL will not allow a literal to occur where we've put the bind - variables, so the query is syntactically incorrect. You have to - remember that while the bind variable syntax looks similar to variable - interpolation in Tcl, it is not the same thing at all. -

- Finally, the DB API has several different styles for passing bind - variable values to queries. In general, use the style presented here - because it is the most convenient. You'll see the other styles - described in the detailed API document, but they're not important. -

5.5. Basic API

- The Database API has several functions that wrap familiar parts of the - AOLserver database API. We won't cover the details of all the calls - here; instead, we'll list the most frequently used calls, with links - to further detail: + +

+ SQL will not allow a literal to occur where we've put the bind + variables, so the query is syntactically incorrect. You have to + remember that while the bind variable syntax looks similar to variable + interpolation in Tcl, it is not the same thing at all. +

+ Finally, the DB API has several different styles for passing bind + variable values to queries. In general, use the style presented here + because it is the most convenient. +

+Usage

Every db_* command accepting a SQL command as an argument + supports bind variables. You can either

+ Specify the -bind switch to provide a set with bind variable + values, or +
+ Specify the -bind switch to explicitly provide a list of + bind variable names and values, or +
+ Not specify a bind variable list at all, in which case Tcl variables are + used as bind variables. +

+ The default behavior (i.e., if the -bind switch is omitted) is + that these procedures expect to find local variables that correspond in name + to the referenced bind variables, e.g.: +

db_null: - Returns a value which can be used in a bind variable to represent - the SQL value null. You should use this instead of empty - strings when passing nulls to the database, because using the empty - string as the null value is not portable across databases. See - Nulls and Bind Variables. -
- db_foreach - query_name sql { code } -: Runs a query returning multiple rows and executes a code block for - each row returned. -
- db_1row - query_name sql: This is similar to the old ns_db 1row call.
- db_0or1row - query_name sql: This is similar to the old call ns_db 0or1row
- db_string - query_name sql -: Returns the first column of the result of a SQL query as a - string. Analogous to database_to_tcl_string and - database_to_tcl_string_or_null. -
- db_list - query_name sql: Returns a Tcl list of values from the first column of each result row - of a SQL query. Similar to the old database_to_tcl_list. -
- db_dml - query_name dml -: Performs a DML or DDL statement.
- db_transaction - { code } -: Performs a block of Tcl code in the scope of a transaction.
- db_abort_transaction - { code } -: Aborts a transaction.

+set user_id 123456 +set role "administrator" -

($Id$)

acs-docs@arsdigita.com

+db_foreach user_group_memberships_by_role { + select g.group_id, g.group_name + from user_groups g, user_group_map map + where g.group_id = map.user_id + and map.user_id = :user_id + and map.role = :role +} { + # do something for each group of which user 123456 is in the role + # of "administrator" +} + + +

+ The value of the local Tcl variable user_id (123456) is bound to + the user_id bind variable. +

The -bind switch can takes the name of an ns_set + containing keys for each bind variable named in the query, e.g.:

+
+set bind_vars [ns_set create]
+ns_set put $bind_vars user_id 123456
+ns_set put $bind_vars role "administrator"
+
+db_foreach user_group_memberships_by_role {
+    select g.group_id, g.group_name
+    from user_groups g, user_group_map map
+    where g.group_id = map.user_id
+    and map.user_id = :user_id
+    and map.role = :role
+} -bind $bind_vars {
+    # do something for each group in which user 123456 has the role
+    # of "administrator"
+}
+
+

+ Alternatively, as an argument to -bind you can specify a list of + alternating name/value pairs for bind variables: +

+	
+db_foreach user_group_memberships_by_role {
+    select g.group_id, g.group_name
+    from user_groups g, user_group_map map
+    where g.group_id = map.user_id
+    and map.user_id = :user_id
+    and map.role = :role
+} -bind [list user_id 123456 role "administrator"] {
+    # do something for each group in which user 123456 has the role
+    # of "administrator"
+}
+
+

+Basic API

+ The Database API has several functions that wrap familiar parts of the + AOLserver database API. +

+ Note that you never have to use ns_db anymore (including + ns_db gethandle)! Just start doing stuff, and (if you want) call + db_release_unused_handles when you're done as a hint to + release the database handle. +

+ + db_abort_transaction + +

+db_abort_transaction
+

Aborts all levels of a transaction. That is if this is called within + several nested transactions, all of them are terminated. Use this insetead of + db_dml "abort" "abort transaction". + + +

+ + + db_null + + + +

+db_null
+

+ Returns a value which can be used in a bind variable + to represent the SQL value + null. + See Nulls and + Bind Variables above. +

+ + + db_foreach + + +

+db_foreach statement-name sql [ -bind bind_set_id | -bind bind_value_list ] \
+    [ -column_array array_name | -column_set set_name ] \
+    code_block [ if_no_rows if_no_rows_block ]
+

+ Performs the SQL query + sql + , executing + code_block + once for each row + with variables set to column values (or a set or array + populated if + -column_array or + column_set is + specified). If the query returns no rows, executes + if_no_rows_block + (if provided). +

Example:

+
+db_foreach select_foo "select foo, bar from greeble" {
+    doc_body_append "<li>foo=$foo; bar=$bar\n"
+} if_no_rows {
+    doc_body_append "<li>There are no greebles in the database.\n"
+}
+
+

+ The code block may contain break statements (which terminate the + loop and flush the database handle) and continue statements + (which continue to the next row of the loop).

+ + + db_1row + + +

+db_1row statement-name sql [ -bind bind_set_id | -bind bind_value_list ] \
+    [ -column_array array_name | -column_set set_name ]
+

+ Performs the SQL query + sql, + setting variables to column values. Raises an error if the + query does not return exactly 1 row. +

Example:

+
+db_1row select_foo "select foo, bar from greeble where greeble_id = $greeble_id"
+# Bombs if there's no such greeble!
+# Now $foo and $bar are set.
+
+

+ + + db_0or1row + + +

+db_0or1row statement-name sql [ -bind bind_set_id | -bind bind_value_list ] \
+    [ -column_array array_name | -column_set set_name ]
+

+ Performs the SQL query + sql. + If a row is returned, sets variables to column values and + returns 1. If no rows are returned, returns 0. If more + than one row is returned, throws an error. +

db_nextval

+db_nextval sequence-name
+

+ Returns the next value for the sequence sequence-name (using a + SQL statement like SELECT + sequence-name.nextval FROM + DUAL). If sequence pooling is enabled for the sequence, transparently + uses a value from the pool if available to save a round-trip to the database + (see Sequence Pooling). +

+ + + db_register_pooled_sequence + + +

+db_register_pooled_sequence sequence-name pool-size
+

Registers the sequence sequence-name to be pooled, with a pool + size of pool-size sequence values + (see Sequence Pooling). + +

db_string

+db_string statement-name sql [ -default default ] [ -bind bind_set_id | -bind bind_value_list ]
+

Returns the first column of the result of SQL query + sql. + If sql doesn't return a + row, returns + default + (or throws an error if + default is unspecified). Analogous to + database_to_tcl_string and + database_to_tcl_string_or_null. + +

db_list

+db_list statement-name sql [ -bind bind_set_id | -bind bind_value_list ]
+

Returns a Tcl list of the values in the first column of the result of SQL + query + sql. + If sql doesn't + return any rows, returns an empty list. Analogous to + database_to_tcl_list. + +

db_list_of_lists

+db_list_of_lists statement-name sql [ -bind bind_set_id | -bind bind_value_list ]
+

Returns a Tcl list, each element of which is a list of all column values + in a row of the result of SQL query sql. If + sql doesn't return any rows, returns an empty list. + (Analogous to database_to_tcl_list_list.) + +

db_dml

+db_dml statement-name sql \
+    [ -bind bind_set_id | -bind bind_value_list ] \
+    [ -blobs blob_list | -clobs clob_list |
+      -blob_files blob_file_list | -clob_files clob_file_list ]
+

Performs the DML or DDL statement sql.

If a length-n list of blobs or clobs is provided, then the SQL + should return n blobs or clobs into the bind variables + :1, :2, ... :n. + blobs or clobs, if specified, + should be a list of individual BLOBs or CLOBs to insert; + blob_files or clob_files, if + specified, should be a list of paths to files containing the data to + insert. Only one of -blobs, -clobs, + -blob_files, and -clob_files may be provided.

Example:

+
+db_dml insert_photos "
+        insert photos(photo_id, image, thumbnail_image)
+        values(photo_id_seq.nextval, empty_blob(), empty_blob())
+        returning image, thumbnail_image into :1, :2
+    "  -blob_files [list "/var/tmp/the_photo" "/var/tmp/the_thumbnail"] 
+
+

+ This inserts a new row into the photos table, with the contents + of the files /var/tmp/the_photo and + /var/tmp/the_thumbnail in the image and + thumbnail columns, respectively. + + +

+ db_write_clob, + db_write_blob, + db_blob_get_file +

+db_write_clob statement-name sql [ -bind bind_set_id | -bind bind_value_list ]
+
+db_write_blob statement-name sql [ -bind bind_set_id | -bind bind_value_list ]
+
+db_blob_get_file statement-name sql [ -bind bind_set_id | -bind bind_value_list ]
+

Analagous to ns_ora write_clob/write_blob/blob_get_file. + + +

db_release_unused_handles

+	    db_release_unused_handles
+

Releases any allocated, unused database handles.

db_transaction

+db_transaction code_block [ on_error { code_block } ]
+

Executes code_block transactionally. Nested + transactions are supported (end transaction is transparently + ns_db dml'ed when the outermost transaction completes). The + db_abort_transaction command can be used to abort all levels of + transactions. It is possible to specify an optional on_error + code block that will be executed if some code in code_block throws + an exception. The variable errmsg will be bound in that scope. + If there is no on_error code, any errors will be propagated.

Example:

+
+proc replace_the_foo { col } {
+    db_transaction {
+        db_dml "delete from foo"
+        db_dml "insert into foo(col) values($col)"
+    }
+}
+
+proc print_the_foo {} {
+    doc_body_append "foo is [db_string "select col from foo"]<br>\n"
+}
+
+replace_the_foo 8
+print_the_foo ; # Writes out "foo is 8"
+
+db_transaction {
+    replace_the_foo 14
+    print_the_foo ; # Writes out "foo is 14"
+    db_dml "insert into some_other_table(col) values(999)"
+    ...
+    db_abort_transaction
+} on_error {
+    doc_body_append "Error in transaction: $errmsg"
+}
+    
+
+print_the_foo ; # Writes out "foo is 8"
+
+

db_resultrows

+db_resultrows
+

Returns the number of rows affected or returned by the previous + statement. + + +

db_with_handle

+db_with_handle var code_block
+

Places a database handle into the variable var and + executes code_block. This is useful when you don't + want to have to use the new API (db_foreach, + db_1row, etc.), but need to use database handles explicitly.

Example:

+
+proc lookup_the_foo { foo } {
+    db_with_handle db {
+        return [db_string unused "select ..."]
+    }
+}
+
+db_with_handle db {
+    # Now there's a database handle in $db.
+    set selection [ns_db select $db "select foo from bar"]
+    while { [ns_db getrow $db $selection] } {
+        set_variables_after_query
+
+        lookup_the_foo $foo
+    }
+}
+
+

+ + + db_nullify_empty_string + + +

+db_nullify_empty_string string
+

For true SQL purists, we provide the convenience function + db_nullify_empty_string, which returns + [db_null] if its string argument is the empty string + and can be used to encapsulate another Oracle quirk:

+
+set baz ""
+
+# Clean out the foo table
+#
+db_dml unused "delete from foo"
+
+db_dml unused "insert into foo(baz) values('$baz')"
+
+set n_rows [db_string unused "select count(*) from foo where baz is null"]
+#
+# $n_rows is 1; in effect, the "baz is null" criterion is matching
+# the empty string we just inserted (because of Oracle's coercion
+# quirk)
+
+

+ To balance out this asymmetry, you can explicitly set baz to + null by writing: +

+
+db_dml foo_insert "insert into foo(baz) values(:1)" {[db_nullify_empty_string $baz]}
+
+

($Id$)

+Prev�	Home	�Next +
The Request Processor�	Up	�Using Templates in OpenACS 4