Index: openacs-4/packages/acs-core-docs/www/xml/index.xml
===================================================================
RCS file: /usr/local/cvsroot/openacs-4/packages/acs-core-docs/www/xml/index.xml,v
diff -u -N -r1.8 -r1.9
--- openacs-4/packages/acs-core-docs/www/xml/index.xml 22 Sep 2002 22:44:28 -0000 1.8
+++ openacs-4/packages/acs-core-docs/www/xml/index.xml 1 Oct 2002 09:42:43 -0000 1.9
@@ -21,6 +21,7 @@
+
@@ -32,6 +33,7 @@
+
@@ -186,6 +188,7 @@
&templates;
&permissions;
&subsites;
+ &i18n-devel;
@@ -270,6 +273,7 @@
&subsites-design;
&apm-req;
&apm-design;
+ &i18n-req;
&security-req;
&security-design;
Index: openacs-4/packages/acs-core-docs/www/xml/developers-guide/i18n.xml
===================================================================
RCS file: /usr/local/cvsroot/openacs-4/packages/acs-core-docs/www/xml/developers-guide/i18n.xml,v
diff -u -N
--- /dev/null 1 Jan 1970 00:00:00 -0000
+++ openacs-4/packages/acs-core-docs/www/xml/developers-guide/i18n.xml 1 Oct 2002 09:42:46 -0000 1.1
@@ -0,0 +1,105 @@
+
+
+
+
+ By Peter Marklund
+
+
+
+ Introduction
+
+
+ This document describes how to develop internationalized OpenACS packages.
+
+
+
+
+ Multilingual Text - Using the Message Catalog
+
+
+ In this section we present the mechanisms that OpenACS provides to allow your
+ OpenACS packages to handle text in multiple languages.
+
+
+
+ Multilingual OpenACS Parameters
+
+
+ The syntax for storing multilingual pieces of text in APM parameters is identical
+ to the one used for adp templates. Any message catalog keys in APM parameters should
+ be surrounded by hash marks and will be replaced by the parameter::get procedure if
+ it is invoked with the -localize flag. The following three examples illustrate:
+
+
+
+
+
+
+
+
+
+
+ Parameter Name
+ Parameter Value
+ Command used to retrieve Value
+ Retrieved Value
+
+
+
+
+ class_instance_pages_csv
+ #dotlrn.class_page_home_title#,Simple 2-Column;#dotlrn.class_page_calendar_title#,Simple 1-Column;#dotlrn.class_page_file_storage_title#,Simple 1-Column
+ parameter::get -localize -parameter class_instance_pages_csv
+ Kurs Startseite,Simple 2-Column;Kalender,Simple 1-Column;Dateien,Simple 1-Column
+
+
+ departments_pretty_name
+ #departments_pretty_name#
+ parameter::get -localize -parameter departments_pretty_name
+ Abteilung
+
+
+ ...
+
+
+ departments_pretty_name
+ #departments_pretty_name#
+ parameter::get -parameter departments_pretty_name
+ #departments_pretty_name#
+
+
+
+
+
+
+ The value in the rightmost column in the table above is the value returned by an invocation
+ of parameter::get. Note that for localization to happen you must use the -localize flag.
+ The locale used for the message lookup will be
+ the locale of the current request, or if there is no current request, the site-wide default locale
+ (set by the parameter SiteWideLocale of the acs-lang package).
+
+
+
+
+
+ Multilingual Page Templates (.adp Files)
+
+
+ There are two syntaxes to choose from for doing message catalog lookups in adp templates.
+ Any message catalog keys surrounded by hash marks (i.e. #message_key#) will be replaced
+ with the corresponding text in the message catalog (the procedure
+ lang::message::lookup is used for the lookup) using the locale of the request (given by
+ ad_conn locale). If no message can be retrieved from the message catalog, then
+ a translation missing message will be used instead.
+
+
+
+ The other syntax for message lookups in adp pages is <trn key="message_key">default text</trn>.
+ Use the trn tag if you want to provide a default message
+ in the template. The default message is in the body of the trn tag and is mandatory.
+ The default message is only used if no message could be retrieved from the message catalog.
+
+
+
+
+
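The hash-mark substitution described above can be sketched in a few lines. This is an illustrative Python sketch, not the OpenACS Tcl implementation: `CATALOG`, `localize`, and `get_parameter` are hypothetical names, the in-memory dictionary stands in for the message catalog, and the bracketed "translation missing" text is a placeholder, since the section does not give the real wording.

```python
import re

# Hypothetical in-memory catalog: {locale: {message_key: translated text}}.
CATALOG = {
    "de_DE": {
        "dotlrn.class_page_home_title": "Kurs Startseite",
        "dotlrn.class_page_calendar_title": "Kalender",
    },
}

# Assumed key syntax: letters, digits, underscores, dots, or hyphens
# enclosed by a pair of hash marks.
KEY_PATTERN = re.compile(r"#([A-Za-z0-9_.\-]+)#")


def localize(text, locale, catalog=CATALOG):
    """Replace every #message_key# in text with the catalog entry for locale."""
    messages = catalog.get(locale, {})

    def substitute(match):
        key = match.group(1)
        # Placeholder for the "translation missing" message.
        return messages.get(key, f"[translation missing: {key}]")

    return KEY_PATTERN.sub(substitute, text)


def get_parameter(value, locale, localize_flag=False):
    """Return the raw parameter value unless localization is requested."""
    return localize(value, locale) if localize_flag else value
```

With the locale `de_DE`, calling `get_parameter` with `localize_flag=True` on the value `#dotlrn.class_page_home_title#,Simple 2-Column` yields `Kurs Startseite,Simple 2-Column`. Without the flag the value comes back unchanged, which mirrors the last row of the table.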
Index: openacs-4/packages/acs-core-docs/www/xml/kernel/i18n-requirements.xml
===================================================================
RCS file: /usr/local/cvsroot/openacs-4/packages/acs-core-docs/www/xml/kernel/i18n-requirements.xml,v
diff -u -N
--- /dev/null 1 Jan 1970 00:00:00 -0000
+++ openacs-4/packages/acs-core-docs/www/xml/kernel/i18n-requirements.xml 1 Oct 2002 09:42:47 -0000 1.1
@@ -0,0 +1,750 @@
+
+ OpenACS &version; Internationalization Requirements
+
+
+ by Henry Minsky,
+ Yon Feldman,
+ Lars Pind,
+ Peter Marklund,
+ Christian Hvid,
+ and others.
+
+
+
+ Introduction
+
+
+ This document describes the requirements for functionality in
+ the OpenACS platform to support globalization of the core and optional
+ modules. The goal is to make it possible to support delivery of
+ applications which work properly in multiple locales with the
+ lowest development and maintenance cost.
+
+
+
+
+ Definitions
+
+
+
+ internationalization (i18n)
+
+
+ The provision within a computer program of the
+ capability of making itself adaptable to the requirements of different
+ native languages, local customs and coded character sets.
+
+
+
+
+
+ locale
+
+
+ The definition of the subset of a user's environment that depends on
+ language and cultural conventions.
+
+
+
+
+
+ localization (L10n)
+
+
+ The process of establishing information within a computer system
+ specific to the operation of particular native languages, local
+ customs and coded character sets.
+
+
+
+
+
+ globalization
+
+
+ A product development approach which ensures that software products
+ are usable in the worldwide markets through a combination of
+ internationalization and localization.
+
+
+
+
+
+
+
+
+
+ Vision Statement
+
+The Mozilla project suggests keeping two catchy phrases in
+mind when thinking about globalization:
+
+
+
+One code base for the world
+
+
+
+English is just another language
+
+
+
+Building an application often involves making a number of
+assumptions on the part of the developers which depend on their own
+culture. These include constant strings in the user interface and
+system error messages, names of countries, cities, order of given
+and family names for people, syntax of numeric and date strings and
+collation order of strings.
+
+The ACS should be able to operate in languages and regions
+beyond US English. The goal of ACS Globalization is to provide a
+clean and efficient way to factor out the locale dependent
+functionality from our applications, in order to be able to easily
+swap in alternate localizations.
+
+This in turn will reduce redundant, costly, and error prone
+rework when targeting the toolkit or applications built with the
+toolkit to another locale.
+
+The cost of porting the ACS to another locale without some
+kind of globalization support would be large and ongoing, since
+without a mechanism to incorporate the locale-specific changes
+cleanly back into the code base, it would require making a new fork
+of the source code for each locale.
+
+System/Application Overview
+
+A globalized application will perform some or all of the
+following steps to handle a page request for a specific
+locale:
+
+
+
+Decide what the target locale is for an incoming page
+request
+
+
+
+Decide which character set encoding the output should be
+delivered in
+
+
+
+If a script file to handle the request needs to be loaded
+from disk, determine if a character set conversion needs to be
+performed when loading the script
+
+
+
+If needed, locale-specific resources are fetched. These can
+include text, graphics, or other resources that would vary with the
+target locale.
+
+
+
+If content data is fetched from the database, check for
+locale-specific versions of the data (e.g. country names).
+
+
+
+Source code should use a message catalog API to translate
+constant strings in the code to the target locale
+
+
+
+Perform locale-specific linguistic sorting on data if
+needed
+
+
+
+If the user submitted form input data, decide what character
+set encoding conversion if any is needed. Parse locale-specific
+quantities if needed (number formats, date formats).
+
+
+
+If templating is being used, select correct locale-specific
+template to merge with content
+
+
+
+Format output data quantities in locale-specific manner
+(date, time, numeric, currency). If templating is being used, this
+may be done either before and/or after merging the data with a
+template.
+
+
+
+Since the internationalization APIs may potentially be used
+on every page in an application, the overhead for adding
+internationalization to a module or application must not cause a
+significant time delay in handling page requests.
+
+In many cases there are facilities in Oracle to perform
+various localization functions, and also there are facilities in
+Java which we will want to move to. So the design to meet the
+requirements will tend to rely on these capabilities, or close
+approximations to them where possible, in order to make it easier
+to maintain Tcl and Java ACS versions.
+
+Use-cases and User-scenarios
+
+Here are the cases that we need to be able to handle
+efficiently:
+
+
+
+A developer needs to author a web site/application in a
+language besides English, and possibly a character set besides
+ISO-8859-1. This includes the operation of the ACS itself, i.e.,
+navigation, admin pages for modules, error messages, as well as
+additional modules or content supplied by the web site
+developer.
+
+What do they need to modify to make this work? Can their
+localization work be easily folded in to future releases of
+ACS?
+
+
+
+A developer needs to author a web site which operates in
+multiple languages simultaneously. For example, arsDigita.com with
+content and navigation in English, German, and Japanese.
+
+The site would have an end-user visible UI to support these
+languages, and the content management system must allow articles to
+be posted in these languages. In some cases it may be necessary to
+make the modules' admin UIs operate in more than one
+supported language, while in other cases the backend admin
+interface can operate in a single language.
+
+
+
+A developer is writing a new module, and wants to make it
+easy for someone to localize it. There should be a clear path to
+author the module so that future developers can easily add support
+for other locales. This would include support for creating
+resources such as message catalogs, non-text assets such as
+graphics, and use of templates which help to separate application
+logic from presentation.
+
+
+
+Competitive
+Analysis
+
+Other application servers: ATG Dynamo, Broadvision, Vignette,
+...? Does anyone know how they deal with i18n?
+
+Related
+Links
+
+
+
+System/Package "coversheet" - where all
+documentation for this software is linked off of
+
+
+
+Design document
+
+
+
+Developer's guide
+
+
+
+User's guide
+
+
+
+Other-cool-system-related-to-this-one document
+
+LI18NUX 2000 Globalization Specification:
+http://www.li18nux.net/
+
+Mozilla
+i18N Guidelines:
+http://www.mozilla.org/docs/refList/i18n/l12yGuidelines.html
+
+ISO
+639:1988 Code for the representation of names of languages
+http://sunsite.berkeley.edu/amher/iso_639.html
+
+ISO 3166-1:1997
+Codes for the representation of names of countries and their
+subdivisions Part 1: Country codes
+http://www.niso.org/3166.html
+
+IANA
+Registry of Character Sets
+
+
+
+Test plan
+
+
+
+Competitive system(s)
+
+
+
+Requirements
+
+Because the requirements for globalization affect many areas
+of the system, we will break up the requirements into phases, with
+a base required set of features, and then stages of increasing
+functionality.
+
+Locales
+
+10.0
+A standard representation of locale will be used throughout
+the system. A locale refers to a language and territory, and is
+uniquely identified by a combination of ISO language and ISO
+country abbreviations.
+
+
+See
+Content
+Repository Requirement 100.20
+
+10.10 Provide a consistent
+representation and API for creating and referencing a locale
+
+10.20 There will be a Tcl library of
+locale-aware formatting and parsing functions for numbers, dates
+and times. Note that Java has builtin support for these
+already.
+
+10.30 For each locale there will be
+default date, number and currency formats.
+
+
+Associating a Locale with a Request
+
+20.0
+The request processor must have a mechanism for associating a
+locale with each request. This locale is then used to select the
+appropriate template for a request, and will also be passed as the
+locale argument to the message catalog or locale-specific
+formatting functions.
+
+
+20.10 The locale for a request should be
+computed by the following method, in descending order of
+priority:
+
+
+
+get locale associated with subsite or package id
+
+
+
+get locale from user preference
+
+
+
+get locale from site wide default
+
+20.20 An API will be provided for
+getting the current request locale from the
+ad_conn structure.
+
+
+
+
+Resource Bundles / Content Repository
+
+30.0
+A mechanism must be provided for a developer to group a set
+of arbitrary content resources together, keyed by a unique
+identifier and a locale.
+
+For example, what approaches could be used to implement a
+localizable nav-bar mechanism for a site? A navigation bar might be
+made up of a set of text strings and graphics, where the graphics
+themselves are locale-specific, such as images of English or
+Japanese text (as on www.arsdigita.com). It should be easy to
+specify alternate configurations of text and graphics to lay out
+the page for different locales.
+
+Design note: Alternative mechanisms to implement this
+functionality might include using templates, Java ResourceBundles,
+content-item containers in the Content Repository, or some
+convention assigning a common prefix to key strings in the message
+catalog.
+
+Message Catalog for String Translation
+
+40.0
+A message catalog facility will provide a database of
+translations for constant strings for multilingual applications. It
+must support the following:
+
+
+40.10 Each message will be referenced via
+a unique key.
+
+40.20 The key for a message will have
+some hierarchical structure to it, so that sets of messages can be
+grouped with respect to a module name or package path.
+
+40.30 The API for lookup of a message
+will take a locale and message key as arguments, and return the
+appropriate translation of that message for the specified
+locale.
+
+40.40 The API for lookup of a message
+will accept an optional default string which can be used if the
+message key is not found in the catalog. This lets the developer
+get code working and tested in a single language before having to
+initialize or update a message catalog.
+
+40.50 For use within templates, custom
+tags which invoke the message lookup API will be provided.
+
+40.60 Provide a method for importing and
+exporting a flat file of translation strings, in order to make it
+as easy as possible to create and modify message translations in
+bulk without having to use a web interface.
+
+40.70 Since translations may be in
+different character sets, there must be provision for writing and
+reading catalog files in different character sets. A mechanism must
+exist for identifying the character set of a catalog file before
+reading it.
+
+40.80 There should be a mechanism for
+tracking dependencies in the message catalog, so that if a string
+is modified, the other translations of that string can be flagged
+as needing update.
+
+40.90 The message lookup must be as
+efficient as possible so as not to slow down the delivery of
+pages.
+
+Design question: Is there any reason to implement
+the message catalog on top of the content repository as the
+underlying storage and retrieval service, with a layer of caching
+for performance? Would we get a nice user interface and version
+control almost for free?
+
+
+Character Set Encoding
+
+Character Sets
+50.0 A locale will have a primary
+associated character set which is used to encode text in the
+language. When given a locale, we can query the system for the
+associated character set to use.
+
+The assumption is that we are going to use Unicode in our
+database to hold all text data. Our current programming
+environments (Tcl/Oracle or Java/Oracle) operate on Unicode data
+internally. However, since Unicode is not yet commonly used in
+browsers and authoring tools, the system must be able to read and
+write other character sets. In particular, conversions to and from
+Unicode will need to be explicitly performed at the following
+times:
+
+
+
+Loading source files (.tcl or .adp) or content files from the
+filesystem
+
+
+
+Accepting form input data from users
+
+
+
+Delivering text output to a browser
+
+
+
+Composing an email message
+
+
+
+Writing data to the filesystem
+
+
+
+Design question: Do we want to mandate that all
+template files be stored in UTF8? I don't think so, because
+most people don't have Unicode editors, or don't want to be
+bothered with an extra step to convert files to UTF8 and back when
+editing them in their favorite editor.
+
+Same question for script and template files, how do
+we know what language and character set they are authored in?
+Should we overload the filename suffix (e.g.,
+'.shiftjis.adp',
+'.ja_JP.euc.adp')?
+
+The simplest design is probably just to assign a
+default mapping from each locale to a character set: e.g. ja_JP
+-> ShiftJIS, fr_FR -> ISO-8859-1. +++ (see new ACS/Java
+notes) +++
+
+
+
+ Tcl Source File Character Set
+
+
+ There are two classes of Tcl files loaded by the system;
+ library files loaded at server startup, and page script files,
+ which are run on each page request.
+
+ Should we require all Tcl files be stored as UTF8?
+ That seems too much of a burden on developers.
+
+ 50.10 Tcl library files can be authored
+ in any character set. The system must have a way to determine the
+ character set before loading the files, probably from the
+ filename.
+
+ 50.20 Tcl page script files can be
+ authored in any character set. The system must have a way to
+ determine the character set before loading the files, probably from
+ the filename.
+
+
+
+
+ Submitted Form Data Character Set
+
+ 50.30 Data which is submitted with an
+ HTTP request using a GET or POST method may be in any character
+ set. The system must be able to determine the encoding of the form
+ data and convert it to Unicode on demand.
+
+ 50.35 The developer must be able to
+ override the default system choice of character set when parsing
+ and validating user form data.
+
+ 50.30.10 Extra hair: In Japan and some
+ other Asian languages where there are multiple character set
+ encodings in common use, the server may need to attempt to do an
+ auto-detection of the character set, because buggy browsers may
+ submit form data in an unexpected alternate encoding.
+
+
+
+ Output Character Set
+
+
+ 50.40 The output character set for a
+ page request will be determined by default by the locale associated
+ with the request (see requirement 20.0).
+
+ 50.50 It must be possible for a
+ developer to manually override the output character set encoding
+ for a request using an API function.
+
+
+
+
+
+
+
+ACS Kernel Issues
+
+
+60.10 All ACS error messages must use
+the message catalog and the request locale to generate error
+messages for the appropriate locale.
+
+60.20 Web server error messages such as
+404, 500, etc. must also be delivered in the appropriate
+locale.
+
+60.30 Where files are written or read
+from disk, their filenames must use a character set and character
+values which are safe for the underlying operating system.
+
+
+Templates
+
+
+70.0 For a given abstract URL, the
+designer may create multiple locale-specific template files
+(one per locale or language).
+
+70.10 For a given page request, the
+system must be able to select an appropriate locale-specific
+template file to use. The request locale is computed as per
+requirement 20.0.
+
+Design note: this would probably be implemented by
+suffixing the locale or a locale abbreviation to the template
+filename, such as foo.ja.adp or foo.en_GB.adp.
+
+70.20 A template file may be created for
+a partial locale (language only, without a territory), and the
+request processor should be able to find the closest match for the
+current request locale.
+
+70.30 A template file may be created in
+any character set. The system must have a way to know which
+character set a template file contains, so it can properly process
+it.
+
+Formatting
+Datasource Output in Templates
+
+70.50 The properties of a datasource
+column may include a datatype so that the templating system can
+format the output for the current locale. The datatype is defined
+by a standard ACS datatype plus a format token or format string,
+for example: a date column might be specified as
+'current_date:date LONG,' or 'current_date:date
+"YYYY-Mon-DD"'
+
+Forms
+
+
+70.60 The forms API must support
+construction of locale-specific HTML form widgets, such as date
+entry widgets, and form validation of user input data for
+locale-specific data, such as dates or numbers.
+
+70.70 For forms which allow users to
+upload files, a standard method for a user to indicate the charset
+of a text file being uploaded must be provided.
+
+Design note: this presumably applies to uploading
+data to the content repository as well
+
+
+Sorting and Searching
+
+
+80.10 Support API for correct collation
+(sorting order) on lists of strings in a locale-dependent way.
+
+80.20 For the Tcl API, we will say that
+locale-dependent sorting will use Oracle SQL operations (i.e., we
+won't provide a Tcl API for this). We require a Tcl API
+function to return the correct incantation of NLS_SORT to use for a
+given locale with ORDER BY clauses in
+queries.
+
+80.40 The system must handle full-text
+search in any supported language.
+
+
+Time Zones
+
+
+90.10 Provide API support for specifying
+a time zone
+
+90.20 Provide an API for computing time
+and date operations which are aware of timezones. So for example a
+calendar module can properly synchronize items inserted into a
+calendar from users in different time zones using their own local
+times.
+
+90.30 Store all dates and times in
+universal time zone, UTC.
+
+90.40 For a registered user, a time
+zone preference should be stored.
+
+90.50 For a non-registered user a time
+zone preference should be attached via a session or else UTC should
+be used to display every date and time.
+
+90.60 The default if we can't
+determine a time zone is to display all dates and times in some
+universal time zone such as GMT.
+
+
+Database
+
+
+100.10 Since UTF8 strings can use up to
+three (UCS2) or six (UCS4) bytes per character, make sure that
+column size declarations in the schema are large enough to
+accommodate required data (such as email addresses in
+Japanese).
+
+
+Email and
+Messaging
+
+When sending an email message, just as when delivering the
+content of a web page over an HTTP connection, it is necessary to be
+able to specify what character set encoding to use.
+
+
+110.10 The email message sending API
+will allow for a character set encoding to be specified.
+
+110.20 The email accepting API will
+allow for the character set to be parsed correctly (hopefully a well
+formatted message will have a MIME character set content type header).
+
+
+
+
+ Implementation Notes
+
+ Because globalization touches many different parts of the system,
+ we want to reduce the implementation risk by breaking the
+ implementation into phases.
+
+
+
+
+ Revision History
+
+
+
+
+
+ Document Revision #
+ Action Taken, Notes
+ When?
+ By Whom?
+
+
+
+ 0.4
+ converting from HTML to DocBook and importing the document to the OpenACS
+ kernel documents. This was done as a part of the internationalization of
+ OpenACS and .LRN for the Heidelberg University in Germany
+ 12 September 2002
+ Peter Marklund
+
+
+
+ 0.3
+ comments from Christian
+ 1/14/2000
+ Henry Minsky
+
+
+
+ 0.2
+ Minor typos fixed, clarifications to wording
+ 11/14/2000
+ Henry Minsky
+
+
+
+ 0.1
+ Creation
+ 11/08/2000
+ Henry Minsky
+
+
+
+
+
+
+
+
+
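Requirements 20.10, 70.10, and 70.20 above describe two small lookup procedures: choosing the request locale by priority, and finding the closest locale-specific template. The following Python sketch illustrates both under stated assumptions; it is not taken from the OpenACS code base. All function names are hypothetical. The filename convention follows the design note under 70.10 (`foo.ja.adp`, `foo.en_GB.adp`). The final fallback to an unsuffixed `foo.adp` is an assumption the requirements do not spell out.

```python
def resolve_request_locale(package_locale, user_locale, site_default):
    """Pick the request locale in the priority order of requirement 20.10:
    subsite/package locale, then user preference, then site-wide default."""
    for candidate in (package_locale, user_locale):
        if candidate:
            return candidate
    return site_default


def template_candidates(base, locale, suffix=".adp"):
    """List template filenames to try, most specific first."""
    language = locale.split("_", 1)[0]
    names = [f"{base}.{locale}{suffix}"]
    if language != locale:
        # Partial locale (language only), as allowed by requirement 70.20.
        names.append(f"{base}.{language}{suffix}")
    # Assumed fallback: the template without any locale suffix.
    names.append(f"{base}{suffix}")
    return names


def select_template(base, locale, available):
    """Return the first candidate present in the set of available files."""
    for name in template_candidates(base, locale):
        if name in available:
            return name
    return None
```

For a request resolved to `ja_JP` with only `foo.ja.adp` and `foo.adp` on disk, `select_template("foo", "ja_JP", {"foo.ja.adp", "foo.adp"})` picks `foo.ja.adp`, the closest match for the request locale.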