<html> <head> <!--AD_DND--> <title>Establishing Style and Supporting Multi-Lingualism</title> </head> <body bgcolor=#ffffff text=#000000> <h2>Establishing Style and Supporting Multi-Lingualism</h2> using <a href="index.html">the ArsDigita Community System</a> by <a href="http://photo.net/philg/">Philip Greenspun</a> <hr> <ul> <li>User-accessible directory: none <li>Site administrator directory: none (<a href="styles.tcl">/doc/styles.tcl</a> is as close as it gets) <li>data model : none; entirely Tcl- and virtual memory-based <li>Tcl scripts: /tcl/ad-style.tcl <li>style definitions: by convention in /tcl/sitename-styles.tcl <li>Templates: /web/yourservername/templates/ </ul> This document explains how to establish site-wide style and presentation conventions. A core element of the system is AOLserver's ADP template parsing system. <h3>The Big Problem</h3> Here are some of the challenges that we need to attack: <ul> <li>consistent look and style conventions across thousands of pages <li>publishers who change their mind about how the site should look <li>users who are browsing on a simple device (e.g., Nokia 9000 cell phone) and need a text-only site <li>users who can't read English (even though it was good enough for Jesus Christ) </ul> <h3>Some Trivial Solutions</h3> Suppose that you simply want consistent look and feel, changeable by editing only one file, across thousands of dynamic pages: <ul> <li>edit the ad.ini file to set site background and text colors (these parameters are read by <code>ad_header</code>) <li>bash the source code for <code>ad_header</code> in /tcl/ad-defs.tcl to do something more dramatic (e.g., display a company logo at the top right of each page). <li>bash the source code for <code>ad_footer</code> in /tcl/ad-defs.tcl to do something consistent at the bottom of all the pages on a site </ul> What about static HTML pages? 
You can put <code>regsub</code> calls in <code>ad_serve_html_page</code> (in /tcl/ad-html.tcl) to consistently change the appearance of outgoing pages. <p> If you're building from scratch, you could build in ADP instead of HTML and use <code>ns_register_adptag</code> to augment the HTML a bit. <h3>Some Trivial Solutions in a Perfect World</h3> In a perfect world, you'd modify <code>ad_header</code> or your static HTML to reference a cascading style sheet (CSS). See <a href="http://photo.net/wtr/thebook/html.html">the HTML chapter of <cite>Philip and Alex's Guide to Web Publishing</cite></a> for an explanation and also do a View Source on the document to see a style sheet reference from the HEAD of a document. <p> This doesn't work out too great because (1) only the 4.0 browsers interpret style sheets, and (2) Brand M and Brand N browsers do very different things given the same instructions (each implements a subset of the CSS standard). <h3>Why These Trivial Solutions Won't Work for You</h3> Publishers and the designers they hire want to control much more than background, text, alink, and vlink colors. They want to move around the elements on each page. <p> So what's the big deal? Let them write whatever HTML they want. <p> The problem is that they want control over pages that are generated by querying the database and executing procedures but they don't want to learn how to program. Your naive solution is to let the designers build static HTML files and show them to you. You'll work these elements into Tcl string literals and write programs that print them to the browser. In the end you'll have programs that query the database and produce output exactly like what the designer wanted... on Monday. By Friday, the designer has changed his or her mind. Would you rather spend your life attacking the hard problem of Web-based collaboration or moving strings around inside .tcl pages? 
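<p> For concreteness, here is a minimal sketch of the naive approach described above. It is plain Tcl that runs in tclsh. The procedure name <code>naive_member_page</code> and the hard-coded rows are hypothetical (they are not part of the toolkit), and a real page would get its rows from a database query rather than a literal list. <blockquote> <pre>
# Hypothetical example: the designer's HTML frozen into Tcl string literals.
# The rows argument stands in for the result of a database query.
proc naive_member_page {rows} {
    set page "&lt;html&gt;&lt;head&gt;&lt;title&gt;Members&lt;/title&gt;&lt;/head&gt;\n"
    append page "&lt;body bgcolor=#ffffff text=#000000&gt;\n"
    append page "&lt;h2&gt;Members&lt;/h2&gt;\n&lt;ul&gt;\n"
    foreach row $rows {
        set name  [lindex $row 0]
        set email [lindex $row 1]
        # Every piece of markup the designer chose lives inside this program.
        append page "&lt;li&gt;&lt;a href=\"mailto:$email\"&gt;$name&lt;/a&gt;\n"
    }
    append page "&lt;/ul&gt;\n&lt;/body&gt;&lt;/html&gt;\n"
    return $page
}

puts [naive_member_page {{"Jane Doe" jane@example.com} {"John Roe" john@example.com}}]
</pre> </blockquote> If the designer decides on Friday that the list should become a table, or that the heading belongs below it, a programmer has to edit and retest this procedure even though nothing about the query has changed.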
<h3>Templates</h3> Suppose that you send your staff the following message: <blockquote> <pre>
To: Web Developers

I want you to put all the SQL queries into Tcl functions that get loaded
at server start-up time. The graphic designers are to build ADP pages
that call a Tcl procedure which will set a bunch of local variables with
values from the database. They are then to stick &lt;%= $variable_name %&gt;
in the ADP page wherever they want one of the variables to appear.

Alternatively, write .tcl scripts that implement the "business logic"
and, after stuffing a bunch of local vars, call ns_adp_parse to drag in
the ADP created by the graphic designer.
</pre> </blockquote> In the future, a change to the look of a site won't require a programmer, only someone who knows HTML and who is careful enough not to disturb references to variables. <h3>Putting It All Together</h3> Putting it all together in an ArsDigita Community System-based site: <ul> <li>We define a set of styles using <code>ad_register_styletag</code>. This procedure will (a) record that we've got a style we want to use site-wide, (b) register an ADP tag, and (c) create a Tcl function to be used by straight .tcl pages (and that is also called by the ADP subsystem) <li>We create a convention that /www/foo/bar.tcl looks for ADP templates at /templates/foo/bar.* <li>We create a /style section in the ad.ini file to specify whether or not plain and fancy templates are available ("foobar.plain.adp" and "foobar.fancy.adp"), whether or not we're trying to be multilingual, and what language is our system default ("foobar.plain.en.adp" for English, "foobar.fancy.fr.adp" for a French graphics version). </ul> Why are the templates stored under a separate directory structure from the .tcl scripts? Isn't this inconvenient? Yes, if you're one person maintaining a site. However, the whole point of this system is that a bunch of programmers and designers are collaborating.
The programmers will probably be happier if the designers never get FTP access to the directories containing .tcl scripts. Also, from a security point of view, if someone is going to upload files to your server via FTP, you don't want them ending up directly underneath the Web server root. <p> Caveat nerdor: remember that AOLserver sources private Tcl libraries alphabetically. So your calls to <code>ad_register_styletag</code> must be in a Tcl file that sorts alphabetically after "ad-style.tcl" (we suggest that you stick to a convention of "sitename-styles.tcl", e.g., "photonet-styles.tcl" would be the photo.net styles). <h3>How we represent languages</h3> Languages are represented by lowercase ISO 639 two-character abbreviations, e.g., "en" for English, "km" for Cambodian, "ja" for Japanese (<em>not</em> "jp" as you might expect; jp is the country code for Japan, not the language code for the Japanese language). For a complete list, check your Netscape preferences (click on "languages" and then try to add one), visit <a href="http://www.w3.org/International/O-charset-lang.html">http://www.w3.org/International/O-charset-lang.html</a>, or refer to this list below (we're not going to make sure that it is kept up to date, so you might want to visit the source). 
<p> <TABLE ALIGN="CENTER"> <TR><TH>Language Name</TH><TH>Code</TH><TH>Language Family</TH></TR> <TR><TD>Abkhazian</TD><TH>ab</TH><TD>Ibero-caucasian</TD></TR> <TR><TD>Afan (Oromo)</TD><TH>om</TH><TD>Hamitic</TD></TR> <TR><TD>Afar</TD><TH>aa</TH><TD>Hamitic</TD></TR> <TR><TD>Afrikaans</TD><TH>af</TH><TD>Germanic</TD></TR> <TR><TD>Albanian</TD><TH>sq</TH><TD>Indo-european (other)</TD></TR> <TR><TD>Amharic</TD><TH>am</TH><TD>Semitic</TD></TR> <TR><TD>Arabic</TD><TH>ar</TH><TD>Semitic</TD></TR> <TR><TD>Armenian</TD><TH>hy</TH><TD>Indo-european (other)</TD></TR> <TR><TD>Assamese</TD><TH>as</TH><TD>Indian</TD></TR> <TR><TD>Aymara</TD><TH>ay</TH><TD>Amerindian</TD></TR> <TR><TD>Azerbaijani</TD><TH>az</TH><TD>Turkic/altaic</TD></TR> <TR><TD>Bashkir</TD><TH>ba</TH><TD>Turkic/altaic</TD></TR> <TR><TD>Basque</TD><TH>eu</TH><TD>Basque</TD></TR> <TR><TD>Bengali;bangla</TD><TH>bn</TH><TD>Indian</TD></TR> <TR><TD>Bhutani</TD><TH>dz</TH><TD>Asian</TD></TR> <TR><TD>Bihari</TD><TH>bh</TH><TD>Indian</TD></TR> <TR><TD>Bislama</TD><TH>bi</TH><TD>[notgiven]</TD></TR> <TR><TD>Breton</TD><TH>br</TH><TD>Celtic</TD></TR> <TR><TD>Bulgarian</TD><TH>bg</TH><TD>Slavic</TD></TR> <TR><TD>Burmese</TD><TH>my</TH><TD>Asian</TD></TR> <TR><TD>Byelorussian</TD><TH>be</TH><TD>Slavic</TD></TR> <TR><TD>Cambodian</TD><TH>km</TH><TD>Asian</TD></TR> <TR><TD>Catalan</TD><TH>ca</TH><TD>Romance</TD></TR> <TR><TD>Chinese</TD><TH>zh</TH><TD>Asian</TD></TR> <TR><TD>Corsican</TD><TH>co</TH><TD>Romance</TD></TR> <TR><TD>Croatian</TD><TH>hr</TH><TD>Slavic</TD></TR> <TR><TD>Czech</TD><TH>cs</TH><TD>Slavic</TD></TR> <TR><TD>Danish</TD><TH>da</TH><TD>Germanic</TD></TR> <TR><TD>Dutch</TD><TH>nl</TH><TD>Germanic</TD></TR> <TR><TD>English</TD><TH>en</TH><TD>Germanic</TD></TR> <TR><TD>Esperanto</TD><TH>eo</TH><TD>International aux.</TD></TR> <TR><TD>Estonian</TD><TH>et</TH><TD>Finno-ugric</TD></TR> <TR><TD>Faroese</TD><TH>fo</TH><TD>Germanic</TD></TR> <TR><TD>Fiji</TD><TH>fj</TH><TD>Oceanic/indonesian</TD></TR> 
<TR><TD>Finnish</TD><TH>fi</TH><TD>Finno-ugric</TD></TR> <TR><TD>French</TD><TH>fr</TH><TD>Romance</TD></TR> <TR><TD>Frisian</TD><TH>fy</TH><TD>Germanic</TD></TR> <TR><TD>Galician</TD><TH>gl</TH><TD>Romance</TD></TR> <TR><TD>Georgian</TD><TH>ka</TH><TD>Ibero-caucasian</TD></TR> <TR><TD>German</TD><TH>de</TH><TD>Germanic</TD></TR> <TR><TD>Greek</TD><TH>el</TH><TD>Latin/greek</TD></TR> <TR><TD>Greenlandic</TD><TH>kl</TH><TD>Eskimo</TD></TR> <TR><TD>Guarani</TD><TH>gn</TH><TD>Amerindian</TD></TR> <TR><TD>Gujarati</TD><TH>gu</TH><TD>Indian</TD></TR> <TR><TD>Hausa</TD><TH>ha</TH><TD>Negro-african</TD></TR> <TR><TD>Hebrew</TD><TH>iw</TH><TD>Semitic</TD></TR> <TR><TD>Hindi</TD><TH>hi</TH><TD>Indian</TD></TR> <TR><TD>Hungarian</TD><TH>hu</TH><TD>Finno-ugric</TD></TR> <TR><TD>Icelandic</TD><TH>is</TH><TD>Germanic</TD></TR> <TR><TD>Indonesian</TD><TH>in</TH><TD>Oceanic/indonesian</TD></TR> <TR><TD>Interlingua</TD><TH>ia</TH><TD>International aux.</TD></TR> <TR><TD>Interlingue</TD><TH>ie</TH><TD>International aux.</TD></TR> <TR><TD>Inupiak</TD><TH>ik</TH><TD>Eskimo</TD></TR> <TR><TD>Irish</TD><TH>ga</TH><TD>Celtic</TD></TR> <TR><TD>Italian</TD><TH>it</TH><TD>Romance</TD></TR> <TR><TD>Japanese</TD><TH>ja</TH><TD>Asian</TD></TR> <TR><TD>Javanese</TD><TH>jv</TH><TD>Oceanic/indonesian</TD></TR> <TR><TD>Kannada</TD><TH>kn</TH><TD>Dravidian</TD></TR> <TR><TD>Kashmiri</TD><TH>ks</TH><TD>Indian</TD></TR> <TR><TD>Kazakh</TD><TH>kk</TH><TD>Turkic/altaic</TD></TR> <TR><TD>Kinyarwanda</TD><TH>rw</TH><TD>Negro-african</TD></TR> <TR><TD>Kirghiz</TD><TH>ky</TH><TD>Turkic/altaic</TD></TR> <TR><TD>Kurundi</TD><TH>rn</TH><TD>Negro-african</TD></TR> <TR><TD>Korean</TD><TH>ko</TH><TD>Asian</TD></TR> <TR><TD>Kurdish</TD><TH>ku</TH><TD>Iranian</TD></TR> <TR><TD>Laothian</TD><TH>lo</TH><TD>Asian</TD></TR> <TR><TD>Latin</TD><TH>la</TH><TD>Latin/greek</TD></TR> <TR><TD>Latvian;lettish</TD><TH>lv</TH><TD>Baltic</TD></TR> <TR><TD>Lingala</TD><TH>ln</TH><TD>Negro-african</TD></TR> 
<TR><TD>Lithuanian</TD><TH>lt</TH><TD>Baltic</TD></TR> <TR><TD>Macedonian</TD><TH>mk</TH><TD>Slavic</TD></TR> <TR><TD>Malagasy</TD><TH>mg</TH><TD>Oceanic/indonesian</TD></TR> <TR><TD>Malay</TD><TH>ms</TH><TD>Oceanic/indonesian</TD></TR> <TR><TD>Malayalam</TD><TH>ml</TH><TD>Dravidian</TD></TR> <TR><TD>Maltese</TD><TH>mt</TH><TD>Semitic</TD></TR> <TR><TD>Maori</TD><TH>mi</TH><TD>Oceanic/indonesian</TD></TR> <TR><TD>Marathi</TD><TH>mr</TH><TD>Indian</TD></TR> <TR><TD>Moldavian</TD><TH>mo</TH><TD>Romance</TD></TR> <TR><TD>Mongolian</TD><TH>mn</TH><TD>[notgiven]</TD></TR> <TR><TD>Nauru</TD><TH>na</TH><TD>[notgiven]</TD></TR> <TR><TD>Nepali</TD><TH>ne</TH><TD>Indian</TD></TR> <TR><TD>Norwegian</TD><TH>no</TH><TD>Germanic</TD></TR> <TR><TD>Occitan</TD><TH>oc</TH><TD>Romance</TD></TR> <TR><TD>Oriya</TD><TH>or</TH><TD>Indian</TD></TR> <TR><TD>Pashto;pushto</TD><TH>ps</TH><TD>Iranian</TD></TR> <TR><TD>Persian (farsi)</TD><TH>fa</TH><TD>Iranian</TD></TR> <TR><TD>Polish</TD><TH>pl</TH><TD>Slavic</TD></TR> <TR><TD>Portuguese</TD><TH>pt</TH><TD>Romance</TD></TR> <TR><TD>Punjabi</TD><TH>pa</TH><TD>Indian</TD></TR> <TR><TD>Quechua</TD><TH>qu</TH><TD>Amerindian</TD></TR> <TR><TD>Rhaeto-romance</TD><TH>rm</TH><TD>Romance</TD></TR> <TR><TD>Romanian</TD><TH>ro</TH><TD>Romance</TD></TR> <TR><TD>Russian</TD><TH>ru</TH><TD>Slavic</TD></TR> <TR><TD>Samoan</TD><TH>sm</TH><TD>Oceanic/indonesian</TD></TR> <TR><TD>Sangho</TD><TH>sg</TH><TD>Negro-african</TD></TR> <TR><TD>Sanskrit</TD><TH>sa</TH><TD>Indian</TD></TR> <TR><TD>Scots gaelic</TD><TH>gd</TH><TD>Celtic</TD></TR> <TR><TD>Serbian</TD><TH>sr</TH><TD>Slavic</TD></TR> <TR><TD>Serbo-croatian</TD><TH>sh</TH><TD>Slavic</TD></TR> <TR><TD>Sesotho</TD><TH>st</TH><TD>Negro-african</TD></TR> <TR><TD>Setswana</TD><TH>tn</TH><TD>Negro-african</TD></TR> <TR><TD>Shona</TD><TH>sn</TH><TD>Negro-african</TD></TR> <TR><TD>Sindhi</TD><TH>sd</TH><TD>Indian</TD></TR> <TR><TD>Singhalese</TD><TH>si</TH><TD>Indian</TD></TR> 
<TR><TD>Siswati</TD><TH>ss</TH><TD>Negro-african</TD></TR> <TR><TD>Slovak</TD><TH>sk</TH><TD>Slavic</TD></TR> <TR><TD>Slovenian</TD><TH>sl</TH><TD>Slavic</TD></TR> <TR><TD>Somali</TD><TH>so</TH><TD>Hamitic</TD></TR> <TR><TD>Spanish</TD><TH>es</TH><TD>Romance</TD></TR> <TR><TD>Sundanese</TD><TH>su</TH><TD>Oceanic/indonesian</TD></TR> <TR><TD>Swahili</TD><TH>sw</TH><TD>Negro-african</TD></TR> <TR><TD>Swedish</TD><TH>sv</TH><TD>Germanic</TD></TR> <TR><TD>Tagalog</TD><TH>tl</TH><TD>Oceanic/indonesian</TD></TR> <TR><TD>Tajik</TD><TH>tg</TH><TD>Iranian</TD></TR> <TR><TD>Tamil</TD><TH>ta</TH><TD>Dravidian</TD></TR> <TR><TD>Tatar</TD><TH>tt</TH><TD>Turkic/altaic</TD></TR> <TR><TD>Telugu</TD><TH>te</TH><TD>Dravidian</TD></TR> <TR><TD>Thai</TD><TH>th</TH><TD>Asian</TD></TR> <TR><TD>Tibetan</TD><TH>bo</TH><TD>Asian</TD></TR> <TR><TD>Tigrinya</TD><TH>ti</TH><TD>Semitic</TD></TR> <TR><TD>Tonga</TD><TH>to</TH><TD>Oceanic/indonesian</TD></TR> <TR><TD>Tsonga</TD><TH>ts</TH><TD>Negro-african</TD></TR> <TR><TD>Turkish</TD><TH>tr</TH><TD>Turkic/altaic</TD></TR> <TR><TD>Turkmen</TD><TH>tk</TH><TD>Turkic/altaic</TD></TR> <TR><TD>Twi</TD><TH>tw</TH><TD>Negro-african</TD></TR> <TR><TD>Ukrainian</TD><TH>uk</TH><TD>Slavic</TD></TR> <TR><TD>Urdu</TD><TH>ur</TH><TD>Indian</TD></TR> <TR><TD>Uzbek</TD><TH>uz</TH><TD>Turkic/altaic</TD></TR> <TR><TD>Vietnamese</TD><TH>vi</TH><TD>Asian</TD></TR> <TR><TD>Volapuk</TD><TH>vo</TH><TD>International aux.</TD></TR> <TR><TD>Welsh</TD><TH>cy</TH><TD>Celtic</TD></TR> <TR><TD>Wolof</TD><TH>wo</TH><TD>Negro-african</TD></TR> <TR><TD>Xhosa</TD><TH>xh</TH><TD>Negro-african</TD></TR> <TR><TD>Yiddish</TD><TH>ji</TH><TD>Germanic</TD></TR> <TR><TD>Yoruba</TD><TH>yo</TH><TD>Negro-african</TD></TR> <TR><TD>Zulu</TD><TH>zu</TH><TD>Negro-african</TD></TR> </TABLE> <p> What about language variants, e.g., British English versus correct English? 
The standard way to handle variants is with suffixes, e.g., "zh-CN" and "zh-TW" for Chinese from China and Taiwan respectively, "en-GB" and "en-US" for UK and US English, "fr-CA" and "fr-FR" for Quebecois and French French. We think this is cumbersome and can't imagine anyone wanting to have templates named "foobar.fancy.en-US.adp". Our system doesn't require that the two-character coding be ISO-standard. A publisher who wished to serve British and American readers could use "gb" and "us", for example. Non-standard? Yes. But in my defence, let me note that if you've flown over to England in an aeroplane, gone out in a mackintosh with a brolly, and rotted your teeth on fairy cakes with coloured frosting, you probably have worse problems than non-standard file names. <h3>How we pick the right template</h3> At the end of /foo/bar.tcl, release your database handle (good practice; this way other threads can reuse it while AOLserver is streaming bytes out to your client) and then call <code>ad_return_template</code>. <p> If you need to set a cookie, bash <code>ns_conn outputheaders</code>. <p> How does <code>ad_return_template</code> work? It goes up one Tcl level so that it can have access to all the local vars that bar.tcl might have set. Then it looks at the user's language and graphics preferences (from the <code>users_preferences</code> table defined in community-core.sql). Then it looks in the templates subtree of the file system to see what the closest matching template is (language preference overrides graphics preference). <p> Note that <code>ad_return_template</code> returns headers and content bytes to the connection but does not terminate the thread. So you can do logging or other database activity after the parsed ADP template has been served to the user. <h3>Standard Cookie Names</h3> If you're supporting registered users, you'll be pulling graphics and language preferences from <code>users_preferences</code>. 
You might want to offer casual users a choice of languages or graphics complexity (see <a href="http://scorecard.org">scorecard.org</a> for an example). In this case, you need to use cookies to record what the user said he or she wanted. <p> It is tough to know how and where the publisher will want to present users with language and graphics options. But we can build standard Tcl API calls into /tcl/ad-style.tcl if we agree to standardize on cookie names. So let's agree on the same names as the columns in <code>users_preferences</code>: "prefer_text_only_p" (value "t" or "f") and "language_preference" (two-char lowercase code). <p> Note that the code in ad-style.tcl will only look for cookies if PlainFancyCookieP and LanguageCookieP parameters are turned on. <hr> <a href="http://photo.net/philg/"><address>philg@mit.edu</address></a> </body> </html>