Index: openacs-4/packages/acs-templating/www/doc/noquote.adp =================================================================== RCS file: /usr/local/cvsroot/openacs-4/packages/acs-templating/www/doc/noquote.adp,v diff -u -N -r1.3.2.3 -r1.3.2.4 --- openacs-4/packages/acs-templating/www/doc/noquote.adp 1 Dec 2015 11:18:00 -0000 1.3.2.3 +++ openacs-4/packages/acs-templating/www/doc/noquote.adp 5 Jul 2016 12:14:22 -0000 1.3.2.4 @@ -20,14 +20,14 @@ presentation that is output to the user.

Before introduction of a templating system to ACS, pages were built by outputting HTML text directly from Tcl code. Therefore it was hard for a designer or a later reviewer to change the -appearance of the page. "Change the color of the table? How do I do -that when I cannot even find the body tag?" At this point some -suggest to embed Tcl code in the document rather than the other way -around, like PHP does. But it doesn't solve the problem, because -the code is still tightly coupled with the markup, requiring -programmer-level understanding for every change. The only workable -solution is to try to uncouple the presentation from the design as -much as possible.

ACS 4.0 addressed the problem by introducing a custom-written +appearance of the page. "Change the color of the table? How do +I do that when I cannot even find the body tag?" At this point +some suggest to embed Tcl code in the document rather than the +other way around, like PHP does. But it doesn't solve the +problem, because the code is still tightly coupled with the markup, +requiring programmer-level understanding for every change. The only +workable solution is to try to uncouple the presentation from the +design as much as possible.

ACS 4.0 addressed the problem by introducing a custom-written templating system loosely based on the already-present capabilities of the AolServer, the ADP pages. Unlike the ADP system, which allowed the coder to register his own tags to encapsulate @@ -50,22 +50,25 @@ Quoting.

In the context of HTML, we define quoting as transforming text in such a way that the HTML-rendered version of the transformed text is identical to the original text. Thus one way to quote the -text "<i>" is to transform it to "&lt;i&gt;". When a -browser renders the transformed text, entities &lt; and -&gt; are converted back to < and >, which makes the -rendered version of the transformation equal to the original.

The easiest way to guarantee correct transformation in all cases -is to "escape" ("quote") all the characters that HTML considers -special. In the minimalistic case, it is enough to transform &, -<, and > into their quoted equivalents, &amp;, &lt;, -and &gt; respectively. For additional usefulness in quoted -fields, it's a good idea to also quote double and single quotes -into &quot; and &#39; respectively.

All of this assumes that the text to be quoted is not meant to +text "<i>" is to transform it to +"&lt;i&gt;". When a browser renders the +transformed text, entities &lt; and &gt; are converted back +to < and >, which makes the rendered version of the +transformation equal to the original.

The easiest way to guarantee correct transformation in all cases +is to "escape" ("quote") all the characters +that HTML considers special. In the minimalistic case, it is enough +to transform &, <, and > into their quoted equivalents, +&amp;, &lt;, and &gt; respectively. For additional +usefulness in quoted fields, it's a good idea to also quote +double and single quotes into &quot; and &#39; +respectively.
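The transformation described above can be sketched in plain Tcl. This is only an illustration: quote_html_sketch is a hypothetical name, not an ACS or AOLserver procedure, and it assumes the five characters listed in the paragraph are the only ones that need escaping.

```tcl
# Hypothetical helper (not part of ACS): escapes &, <, >, " and '.
proc quote_html_sketch {text} {
    # string map scans the input once, so the "&" inside an inserted
    # entity such as "&lt;" is never escaped a second time.
    set entities [list & "&amp;" < "&lt;" > "&gt;" \" "&quot;" ' "&#39;"]
    return [string map $entities $text]
}

puts [quote_html_sketch "<i>"]      ;# prints &lt;i&gt;
puts [quote_html_sketch {0 < 1}]    ;# prints 0 &lt; 1
```

A single-pass mapping is used here so that the order of the replacements does not matter; a chain of separate substitutions would have to handle & first.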

All of this assumes that the text to be quoted is not meant to be rendered as HTML in the first place. So if your text contains -"<i>word</i>", and you expect the word to show up in -italic, you should not quote that entire string. However, if word -in fact comes from the database and you don't want it to, for -instance, close the <i> behind your back, you should quote -it, and then enclose it between <i> and </i>.

The ACS has a procedure that performs HTML quoting, +"<i>word</i>", and you expect the word to +show up in italic, you should not quote that entire string. +However, if word in fact comes from the database and you don't +want it to, for instance, close the <i> behind your back, you +should quote it, and then enclose it between <i> and +</i>.
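The "quote it, and then enclose it" rule can be sketched as follows. The helper name and the sample value are assumptions made for illustration; in ACS code the quoting step would be done by ns_quotehtml.

```tcl
# Hypothetical stand-in for ns_quotehtml, reduced to the minimal case.
proc quote_html_sketch {text} {
    return [string map [list & "&amp;" < "&lt;" > "&gt;"] $text]
}

# Pretend this value came from the database.
set word "</i><b>oops"

# Quote the untrusted value first, then wrap it in trusted markup.
set html "<i>[quote_html_sketch $word]</i>"
puts $html   ;# prints <i>&lt;/i&gt;&lt;b&gt;oops</i>
```

Quoting the whole string after wrapping would instead escape the surrounding tags as well, so the word would no longer render in italic.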

The ACS has a procedure that performs HTML quoting, ns_quotehtml. It accepts the string that needs to be quoted, and returns the quoted string. In ACS 3.x, properly written code was expected to call ns_quotehtml every time it published a string to a @@ -77,11 +80,12 @@ doc_body_append "</ul>\n"

Obviously, this was very error-prone, and more often than not, the programmers would forget to quote the variables that come from -the database or from the user. This would "usually" work, but in -some cases it would lead to broken pages and even pose a security -problem. For instance, one could imagine a mathematicians' forum -being named "0 < 1", or an HTML designers' forum being named -"The Woes of <h1>".

In some cases the published variable must not be quoted. +the database or from the user. This would "usually" work, +but in some cases it would lead to broken pages and even pose a +security problem. For instance, one could imagine a +mathematicians' forum being named "0 < 1", or an +HTML designers' forum being named "The Woes of +<h1>".

In some cases the published variable must not be quoted. Examples for that are the bboard postings that are posted in HTML, or variables containing the result of export_form_vars. All in all, the decision about when to quote had to be made by the programmer @@ -91,19 +95,19 @@ templating system, would provide an easy and obvious solution for the (lack of) quoting problem. It turned out that this did not happen, partly because no easy solution exists, and partly because -the issue was ignored or postponed.

Let's review the ACS 3.x code from above. The most important +the issue was ignored or postponed.

Let's review the ACS 3.x code from above. The most important change is that it comes in two parts: the presentation template, and the programming logic code. The template will look like this:

 <ul> <multiple name=forums> <li>Forum:
   <tt>\@forums.name\@</tt> </multiple> </ul>
 

Once you understand the (simple) workings of the multiple tag, this version strikes you as much more readable than the old one. -But we're not done yet: we need to write the Tcl code that grabs -forum names from the database. The db_multirow proc is designed -exactly for this; it retrieves rows from the database and assigns -variables from each row to template variables in each pass of a -multiple of our choice.

+But we're not done yet: we need to write the Tcl code that
+grabs forum names from the database. The db_multirow proc is
+designed exactly for this; it retrieves rows from the database and
+assigns variables from each row to template variables in each pass
+of a multiple of our choice.

 db_multirow forums get_forum_names { SELECT name FROM forums }
 

At this point the careful reader will wonder at which point the forum name gets quoted, and if so, how does the templating system @@ -116,17 +120,17 @@ data in the loop. That is ugly and error-prone because it is more typing and it requires you to explicitly name the variables you wish to export at several points. It is exactly the kind of ugly -code that db_multirow was designed to avoid.

The alternative approach means less typing, but it's even uglier -in its own subtle way. The trick is to remember that our templating -still supports all the ADP features, including embedding Tcl code -in the template. Thus instead of referring to the multirow variable -with the \@forums.name\@ variable substitutions, we use +code that db_multirow was designed to avoid.

The alternative approach means less typing, but it's even +uglier in its own subtle way. The trick is to remember that our +templating still supports all the ADP features, including embedding +Tcl code in the template. Thus instead of referring to the multirow +variable with the \@forums.name\@ variable substitutions, we use <%= [ns_quotehtml \@forums.name\@] %>. This works correctly, but obviously breaks the abstraction barrier between ADP and Tcl syntaxes. The practical result of breaking the abstraction is that every occurrence of Tcl code in an ADP template will have to be painstakingly reviewed and converted once ADPs -start being invoked by Java code rather than Tcl.

At this point, most programmers simply give up and don't quote their variables at all . +start being invoked by Java code rather than Tcl.

At this point, most programmers simply give up and don't quote their variables at all. Quoting is handled only in the areas where it is really crucial and where not handling it would cause immediate and visible breakage, such as in the case of displaying the bodies of bboard articles. @@ -145,8 +149,8 @@ variable fall into one of three categories:

  1. Those that need to be quoted -- names and descriptions of objects, and in general stuff that ultimately comes from the -user.

  2. Those for which it doesn't make a difference whether they are -quoted or not -- e.g. all the database IDs.

  3. Those that must not be quoted -- e.g. exported form vars stored +user.

  2. Those for which it doesn't make a difference whether they +are quoted or not -- e.g. all the database IDs.

  3. Those that must not be quoted -- e.g. exported form vars stored to a variable.

Finally we also remembered the fact that almost none of the variables are quoted in the current source base.

Our reasoning went further: if it is a fact that most variables @@ -164,7 +168,7 @@ ADPs and replaced the instances of \@foo\@ where foo contained HTML code with \@foo;noquote\@.
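The quote-by-default behaviour with a ;noquote opt-out can be mimicked with a small stand-alone Tcl sketch. It is not the templating system's implementation: render_sketch and quote_html_sketch are hypothetical names, the variable names are made up, and the code writes the substitution markers without the backslash escapes used elsewhere on this page.

```tcl
# Hypothetical stand-in for ns_quotehtml, reduced to the minimal case.
proc quote_html_sketch {text} {
    return [string map [list & "&amp;" < "&lt;" > "&gt;"] $text]
}

# Hypothetical renderer: @name@ is quoted, @name;noquote@ is inserted as is.
# values is a dict mapping variable names to their contents.
proc render_sketch {template values} {
    set out ""
    set pos 0
    set re {@([A-Za-z0-9_.]+)(;noquote)?@}
    while {[regexp -indices -start $pos $re $template all name flag]} {
        lassign $all first last
        # Copy the literal text that precedes this substitution.
        append out [string range $template $pos [expr {$first - 1}]]
        set value [dict get $values [string range $template {*}$name]]
        # An unmatched optional group is reported as {-1 -1}.
        if {[lindex $flag 0] == -1} {
            set value [quote_html_sketch $value]
        }
        append out $value
        set pos [expr {$last + 1}]
    }
    append out [string range $template $pos end]
    return $out
}

set vars [dict create name "0 < 1" form "<input>"]
puts [render_sketch {<li>@name@ @form;noquote@</li>} $vars]
# prints <li>0 &lt; 1 <input></li>
```

With this default, forgetting the modifier produces visibly escaped markup rather than unquoted user data, which matches the kind of breakage described in the following paragraph.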

The change took two people less than one day for the system that consisted of core ACS 4.0.1, and modules bboard, news, chat, and -bookmarks. (We were also doing other things, so it's hard to +bookmarks. (We were also doing other things, so it's hard to measure correctly.) During two of the following days, we would find a broken page from time to time, typically by spotting the obviously visible HTML markup. Such a page would get fixed it in a @@ -177,8 +181,7 @@ included into the next ACS release. Since the change is incompatible, it will be announced to module owners and the general public. Explanation on how to port your existing modules and the -"gotchas" that one can expect follows in a separate -document .

The discussion about speed, i.e. benchmarking results before and +"gotchas" that one can expect follows in a separate document .

The discussion about speed, i.e. benchmarking results before and after the change, is also available.

Hrvoje Niksic