When using AOLserver, remember that there are effectively two types of global namespace, not one:
Server-global: As you'd expect, there is only one server-global namespace per server, and variables set within it can be accessed by any Tcl code running subsequently, in any of the server's threads. To set/get server-global variables, use AOLserver 3's nsv API (which supersedes ns_share from the pre-3.0 API).
Script-global: Each Tcl script (ADP, Tcl page, registered proc, filter, etc.) executing within an AOLserver thread has its own global namespace. Any variable set in the top level of a script is, by definition, script-global, meaning that it is accessible only by subsequent code in the same script and only for the duration of the current script execution.
The Tcl built-in command global accesses script-global, not server-global, variables from within a procedure. This distinction is important to understand in order to use global correctly when programming AOLserver.
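As a minimal sketch of how global behaves inside a procedure, the following plain-Tcl fragment may help. The names greeting, call_count, show_greeting and count_calls are hypothetical and exist only for illustration; nothing here touches the server-global nsv API.

```tcl
# Top level of the script: this variable is script-global.
set greeting "hello"

proc show_greeting {} {
    # Without this declaration, greeting would be an unset local variable.
    global greeting
    return "$greeting, world"
}

proc count_calls {} {
    # global also lets a procedure create or update a script-global variable.
    global call_count
    if { ![info exists call_count] } {
        set call_count 0
    }
    incr call_count
}

set message [show_greeting]
count_calls
count_calls
```

After this runs, message holds the text built from the script-global greeting, and call_count is visible at the top level of the same script. Under AOLserver, neither variable would be expected to survive into the next request.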
Also, AOLserver purges all script-global variables in a thread (i.e., Tcl interpreter) between HTTP requests. If it didn't, that would affect (and complicate) our use of script-global variables dramatically, which would then be better described as thread-global variables. Given AOLserver's behaviour, however, "script-global" is a more appropriate term.
ns_schedule_proc and ad_schedule_proc each take a -thread flag to cause a scheduled procedure to run asynchronously, in its own thread. It almost always seems like a good idea to specify this switch, but there's a problem.
It turns out that whenever a task scheduled with ns_schedule_proc -thread or ad_schedule_proc -thread t is run, AOLserver creates a brand new thread and a brand new interpreter, and reinitializes the procedure table (essentially, loads all procedures that were created during server initialization into the new interpreter). This happens every time the task is executed, and it is a very expensive process that should not be taken lightly!
The moral: if you have a lightweight scheduled procedure which runs frequently, don't use the -thread switch.
Note also that the thread is initialized with a copy of what was installed during server startup, so if the procedure table has changed since startup (e.g. using the APM watch facility), that will not be reflected in the scheduled thread.
The return command in Tcl returns control to the caller procedure. This definition allows nested procedures to work properly. However, this definition also means that nested procedures cannot use return to end an entire thread. This situation is most common in exception conditions that can be triggered from inside a procedure, e.g., a permission-denied exception. At this point, the procedure that detects invalid permission wants to write an error message to the user and completely abort execution of the caller thread. return doesn't work, because the procedure may be nested several levels deep. We therefore use ad_script_abort to abort the remainder of the thread.
Note that using return instead of ad_script_abort may raise some security issues: an attacker could call a page that performed some DML statement, pass in some arguments, and get a permission-denied error, but the DML statement would still be executed because the thread was not stopped. Note that return -code return can be used in circumstances where the procedure will only be called from two levels deep.
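The difference between a plain return and return -code return can be sketched in plain Tcl. The procedures require_positive and handle_request are hypothetical names used only for this illustration, and the sketch does not use ad_script_abort.

```tcl
# Hypothetical check: on failure, make the calling procedure return as well.
proc require_positive { n } {
    if { $n <= 0 } {
        # -code return ends this proc and also its immediate caller,
        # which then returns the value given here.
        return -code return "rejected"
    }
    return "ok"
}

proc handle_request { n } {
    require_positive $n
    # Reached only when require_positive returned normally.
    return "processed $n"
}

set good [handle_request 5]
set bad [handle_request -1]
```

Here good is "processed 5" and bad is "rejected". The effect reaches exactly one level up: if handle_request were itself called from another procedure, that outer procedure would carry on as usual, which is why this approach only suits shallow call chains.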
Many functions have a single return value. For instance, util_email_valid_p returns a number: 1 or 0. Other functions need to return a composite value. For instance, consider a function that looks up a user's name and email address, given an ID. One way to implement this is to return a two-element list and document that the first element contains the name, and the second contains the email address. The problem with this technique is that, because Tcl does not support constants, calling procedures that return lists in this way necessitates the use of magic numbers, e.g.:

    set user_info [ad_get_user_info $user_id]
    set first_name [lindex $user_info 0]
    set email [lindex $user_info 1]

AOLserver/Tcl generally has three mechanisms that we like, for returning more than one value from a function. When to use which depends on the circumstances.
Using Arrays and Pass-By-Value
The one we generally prefer is returning an array get-formatted list. It has all the nice properties of pass-by-value, and it uses Tcl arrays, which have good native support.

    ad_proc ad_get_user_info { user_id } {
        db_1row user_info {
            select first_names, last_name, email
            from users
            where user_id = :user_id
        }
        return [list \
            name "$first_names $last_name" \
            email $email \
            namelink "<a href=\"/shared/community-member?user_id=[ns_urlencode $user_id]\">$first_names $last_name</a>" \
            emaillink "<a href=\"mailto:$email\">$email</a>"]
    }

    array set user_info [ad_get_user_info $user_id]

    doc_body_append "$user_info(namelink) ($user_info(emaillink))"

You could also have done this by using an array internally and using array get:

    ad_proc ad_get_user_info { user_id } {
        db_1row user_info {
            select first_names, last_name, email
            from users
            where user_id = :user_id
        }
        set user_info(name) "$first_names $last_name"
        set user_info(email) $email
        set user_info(namelink) "<a href=\"/shared/community-member?user_id=[ns_urlencode $user_id]\">$first_names $last_name</a>"
        set user_info(emaillink) "<a href=\"mailto:$email\">$email</a>"
        return [array get user_info]
    }

Using Arrays and Pass-By-Reference
Sometimes pass-by-value incurs too much overhead, and you'd rather pass-by-reference. This is typically the case when you're writing a proc that uses arrays internally to build up some value, the array has many entries, and you're planning on calling the proc many times. In this case, pass-by-value is expensive, and you'd use pass-by-reference.
The transformation of the array into a list and back to an array takes, in our test environment, approximately 10 microseconds per entry of 100 characters in length. Thus you can process about 100 entries per millisecond. The time depends almost completely on the number of entries, and almost not at all on the size of the entries.
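To get a rough figure for your own environment, a measurement along these lines could be used. It is only a sketch in plain Tcl: the entry count, the value length and the names big and round_trip are assumptions chosen for illustration, and the numbers it prints will differ from machine to machine.

```tcl
# Build an array with 1000 entries, each value 100 characters long.
set entries 1000
set value [string repeat "x" 100]
for { set i 0 } { $i < $entries } { incr i } {
    set big($i) $value
}

# One round trip: the caller flattens the array with array get,
# and the receiver rebuilds it with array set (pass-by-value).
proc round_trip { list_form } {
    array set copy $list_form
    return [array size copy]
}

set copied [round_trip [array get big]]

# time reports the average number of microseconds per iteration.
set timing [time { round_trip [array get big] } 100]
set per_entry [expr { [lindex $timing 0] / double($entries) }]
puts "about $per_entry microseconds per entry"
```

Changing the length of value while keeping entries fixed is one way to check the claim that the cost depends mostly on the number of entries.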
You implement pass-by-reference in Tcl by taking the name of an array as an argument and upvar it.

    ad_proc ad_get_user_info {
        -array:required
        user_id
    } {
        upvar $array user_info
        db_1row user_info {
            select first_names, last_name, email
            from users
            where user_id = :user_id
        }
        set user_info(name) "$first_names $last_name"
        set user_info(email) $email
        set user_info(namelink) "<a href=\"/shared/community-member?user_id=[ns_urlencode $user_id]\">$first_names $last_name</a>"
        set user_info(emaillink) "<a href=\"mailto:$email\">$email</a>"
    }

    ad_get_user_info -array user_info $user_id

    doc_body_append "$user_info(namelink) ($user_info(emaillink))"

We prefer pass-by-value over pass-by-reference. Pass-by-reference makes the code harder to read and debug, because changing a value in one place has side effects in other places. Especially if you have a chain of upvars through several layers of the call stack, you'll have a hard time debugging.
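The kind of side effect meant here can be shown with a small plain-Tcl sketch. The procedures inner and outer and the array record are hypothetical names; the point is only that a write two levels down changes the caller's data without any visible assignment at the call site.

```tcl
proc inner { array_name } {
    # Links data to the array in outer's frame, which is itself a link.
    upvar $array_name data
    set data(status) "changed by inner"
}

proc outer { array_name } {
    upvar $array_name data
    # Nothing on this line suggests that the caller's array is modified.
    inner data
}

set record(status) "original"
outer record
set final_status $record(status)
```

After the call, final_status is "changed by inner", even though the top-level code never assigned to record(status) again.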
Multisets: Using ns_sets and Pass-By-Reference
An array is a type of set, which means you can't have multiple entries with the same key. A data structure that can have multiple entries for the same key is known as a multiset or bag.
If your data can have multiple entries with the same key, you should use the AOLserver built-in ns_set. You can also do a case-insensitive lookup on an ns_set, something you can't easily do on an array. This is especially useful for things like HTTP headers, which happen to have these exact properties.
You always use pass-by-reference with ns_sets, since they don't have any built-in way of generating and reconstructing themselves from a string representation. Instead, you pass the handle to the set.

    ad_proc ad_get_user_info {
        -set:required
        user_id
    } {
        db_1row user_info {
            select first_names, last_name, email
            from users
            where user_id = :user_id
        }
        ns_set put $set name "$first_names $last_name"
        ns_set put $set email $email
        ns_set put $set namelink "<a href=\"/shared/community-member?user_id=[ns_urlencode $user_id]\">$first_names $last_name</a>"
        ns_set put $set emaillink "<a href=\"mailto:$email\">$email</a>"
    }

    set user_info [ns_set create]
    ad_get_user_info -set $user_info $user_id

    doc_body_append "[ns_set get $user_info namelink] ([ns_set get $user_info emaillink])"

We don't recommend ns_set as a general mechanism for passing sets (as opposed to multisets) of data. Not only do they inherently use pass-by-reference, which we dislike, they're also somewhat clumsy to use, since Tcl doesn't have built-in syntactic support for them.
Consider for example a loop over the entries in an ns_set as compared to an array:

    # ns_set variant
    set size [ns_set size $myset]
    for { set i 0 } { $i < $size } { incr i } {
        puts "[ns_set key $myset $i] = [ns_set value $myset $i]"
    }

    # array variant
    foreach name [array names myarray] {
        puts "$name = $myarray($name)"
    }

And this example of constructing a value:

    # ns_set variant
    set myset [ns_set create]
    ns_set put $myset foo $foo
    ns_set put $myset baz $baz
    return $myset

    # array variant
    return [list \
        foo $foo \
        baz $baz \
    ]

ns_sets are designed to be lightweight, so memory consumption should not be a problem. However, when using ns_set get to perform lookup by name, they perform a linear lookup, whereas arrays use a hash table, so ns_sets are slower than arrays when the number of entries is large.