Index: openacs-4/packages/acs-lang/www/doc/i18n-requirements.adp =================================================================== RCS file: /usr/local/cvsroot/openacs-4/packages/acs-lang/www/doc/i18n-requirements.adp,v diff -u -r1.1.2.6 -r1.1.2.7 --- openacs-4/packages/acs-lang/www/doc/i18n-requirements.adp 21 Aug 2015 10:49:20 -0000 1.1.2.6 +++ openacs-4/packages/acs-lang/www/doc/i18n-requirements.adp 25 Aug 2015 18:02:07 -0000 1.1.2.7 @@ -2,15 +2,17 @@ {/doc/acs-lang {Localization}} {ACS 4 Globalization Requirements} ACS 4 Globalization Requirements +

ACS 4 Globalization Requirements

+

by Henry Minsky, Yon Feldman, Lars Pind, others

+

I. Introduction

- -

ACS 4 Globalization Requirements

by Henry Minsky, Yon Feldman, Lars Pind, others

I. Introduction

This document describes the requirements for functionality in the ACS platform to support globalization of the core and optional modules. The goal is to make it possible to support delivery of applications which work properly in multiple locales with the lowest development and maintenance cost. -

Definitions

+

Definitions

+
internationalization (i18n)

The provision within a computer program of the capability of making itself adaptable to the requirements of different native languages, local customs and coded character sets.

locale

The definition of the subset of a user's environment that @@ -19,27 +21,35 @@ customs and coded character sets.

globalization

A product development approach which ensures that software products are usable in the worldwide markets through a combination of internationalization and localization.

-

II. Vision Statement

+
+

II. Vision Statement

+ The Mozilla project suggests keeping two catchy phrases in mind when thinking about globalization:

Building an application often involves making a number of + +

Building an application often involves making a number of assumptions on the part of the developers which depend on their own culture. These include constant strings in the user interface and system error messages, names of countries, cities, order of given and family names for people, syntax of numeric and date strings and -collation order of strings.

The ACS should be able to operate in languages and regions +collation order of strings.

+

The ACS should be able to operate in languages and regions beyond US English. The goal of ACS Globalization is to provide a clean and efficient way to factor out the locale dependent functionality from our applications, in order to be able to easily -swap in alternate localizations.

This in turn will reduce redundant, costly, and error prone +swap in alternate localizations.

+

This in turn will reduce redundant, costly, and error prone rework when targeting the toolkit or applications built with the -toolkit to another locale.

The cost of porting the ACS to another locale without some kind +toolkit to another locale.

+

The cost of porting the ACS to another locale without some kind of globalization support would be large and ongoing, since without a mechanism to incorporate the locale-specific changes cleanly back into the code base, it would require making a new fork of the -source code for each locale.

III. System/Application Overview

+source code for each locale.

+

III. System/Application Overview

+ A globalized application will perform some or all of the following steps to handle a page request for a specific locale:
    @@ -60,15 +70,19 @@ time, numeric, currency). If templating is being used, this may be done either before and/or after merging the data with a template. -

Since the internationalization APIs may potentially be used on + +

Since the internationalization APIs may potentially be used on every page in an application, the overhead for adding internationalization to a module or application must not cause a -significant time delay in handling page requests.

In many cases there are facilities in Oracle to perform various +significant time delay in handling page requests.

+

In many cases there are facilities in Oracle to perform various localization functions, and also there are facilities in Java which we will want to move to. So the design to meet the requirements will tend to rely on these capabilities, or close approximations to them where possible, in order to make it easier to maintain Tcl and -Java ACS versions.

IV. Use-cases and User-scenarios

+Java ACS versions.

+

IV. Use-cases and User-scenarios

+ Here are the cases that we need to be able to handle efficiently:
  1. A developer needs to author a web site/application in a @@ -95,8 +109,12 @@ such as message catalogs, non-text assets such as graphics, and use of templates which help to separate application logic from presentation.
  2. -

Competitive Analysis

Other application servers: ATG Dynamo, Broadvision, Vignette, -... ? Anyone know how they deal with i18n ?

V. Related Links

+

VI Requirements

+ Because the requirements for globalization affect many areas of the system, we will break up the requirements into phases, with a base required set of features, and then stages of increasing functionality. -

VI.A Locales

10.0 A standard representation of locale will be used +

VI.A Locales

+10.0 + A standard representation of locale will be used throughout the system. A locale refers to a language and territory, and is uniquely identified by a combination of ISO language and ISO country abbreviations. @@ -127,7 +149,10 @@ Note that Java has builtin support for these already.

10.30 For each locale there will be default date, number and currency formats.
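As an illustration of requirement 10.0, here is a minimal sketch of a normalized locale identifier built from an ISO language code and an ISO country code. The `Locale` class and `parse_locale` function are hypothetical names, and the two-letter `language_COUNTRY` form (as in `ja_JP` and `fr_FR`, used later in this document) is an assumption, not the representation the ACS defines.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Locale:
    """Hypothetical locale value: a language plus a territory."""
    language: str  # ISO language abbreviation, lowercase (assumed two letters)
    country: str   # ISO country abbreviation, uppercase (assumed two letters)

    def __str__(self):
        return f"{self.language}_{self.country}"


def parse_locale(identifier):
    """Parse 'ja_JP' or 'ja-jp' into a normalized Locale, or raise ValueError."""
    parts = identifier.replace("-", "_").split("_")
    if len(parts) != 2:
        raise ValueError(f"expected language_COUNTRY, got {identifier!r}")
    language, country = parts
    if not (len(language) == 2 and language.isalpha()):
        raise ValueError(f"bad language code in {identifier!r}")
    if not (len(country) == 2 and country.isalpha()):
        raise ValueError(f"bad country code in {identifier!r}")
    return Locale(language.lower(), country.upper())
```

Normalizing case and separator at the boundary means the rest of the system can compare locales by simple equality.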

-

VI.B Associating a Locale with a Request

20.0 The request processor must have a mechanism for + +

VI.B Associating a Locale with a Request

+20.0 + The request processor must have a mechanism for associating a locale with each request. This locale is then used to select the appropriate template for a request, and will also be passed as the locale argument to the message catalog or @@ -142,7 +167,10 @@ request locale from the ad_conn structure.

-

VI.C Resource Bundles / Content Repository

30.0 A mechanism must be provided for a developer to group a + +

VI.C Resource Bundles / Content Repository

+30.0 + A mechanism must be provided for a developer to group a set of arbitrary content resources together, keyed by a unique identifier and a locale.

For example, what approaches could be used to implement a @@ -151,11 +179,15 @@ themselves are locale-specific, such as images of English or Japanese text (as on www.arsdigita.com). It should be easy to specify alternate configurations of text and graphics to lay out -the page for different locales.

Design note: Alternative mechanisms to implement this +the page for different locales.

+

Design note: Alternative mechanisms to implement this functionality might include using templates, Java ResourceBundles, content-item containers in the Content Repository, or some convention assigning a common prefix to key strings in the message -catalog.

VI.D Message Catalog for String Translation

40.0 A message catalog facility will provide a database of +catalog.

+

VI.D Message Catalog for String Translation

+40.0 + A message catalog facility will provide a database of translations for constant strings for multilingual applications. It must support the following:
@@ -194,33 +226,42 @@ caching for performance? Would we get a nice user interface and version control almost for free?
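One way to picture requirement 40.0 is a lookup table keyed by message key and locale. The sketch below is an in-memory stand-in, not the ACS message catalog API: the `MessageCatalog` class, its method names, and the fallback to a default locale (and finally to the key itself) are all assumptions made for illustration.

```python
class MessageCatalog:
    """Hypothetical in-memory message catalog keyed by (key, locale)."""

    def __init__(self, default_locale="en_US"):
        self.default_locale = default_locale
        self._messages = {}

    def register(self, key, locale, text):
        """Store the translation of `key` for `locale`."""
        self._messages[(key, locale)] = text

    def lookup(self, key, locale):
        """Return the translation for `locale`, falling back to the
        default locale, and finally to the key itself (assumed policy)."""
        for candidate in (locale, self.default_locale):
            text = self._messages.get((key, candidate))
            if text is not None:
                return text
        return key


catalog = MessageCatalog()
catalog.register("greeting", "en_US", "Welcome")
catalog.register("greeting", "fr_FR", "Bienvenue")
```

A real implementation would back this table with the database and a cache, since lookups may happen on every page request.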

-

VI.E Character Set Encoding

Character Sets

+ +

VI.E Character Set Encoding

+Character Sets +

50.0 A locale will have a primary associated character set which is used to encode text in the language. When given a locale, we can query the system for the associated character set to -use.

The assumption is that we are going to use Unicode in our +use.

+

The assumption is that we are going to use Unicode in our database to hold all text data. Our current programming environments (Tcl/Oracle or Java/Oracle) operate on Unicode data internally. However, since Unicode is not yet commonly used in browsers and authoring tools, the system must be able to read and write other character sets. In particular, conversions to and from Unicode will need to be explicitly performed at the following -times:

+


Design question: Do we want to mandate that all template files be stored in UTF8? I don't think so, because most people don't have Unicode editors, or don't want to be bothered with an extra step to convert files to UTF8 and back when editing them in their favorite editor. -

Same question for script and template +

+

Same question for script and template files, how do we know what language and character set they are authored in? Should we overload the filename suffix (e.g., -'.shiftjis.adp', '.ja_JP.euc.adp')?

The simplest design is probably just to +'.shiftjis.adp', '.ja_JP.euc.adp')?

+

The simplest design is probably just to assign a default mapping from each locale to a character set: e.g. ja_JP -> ShiftJIS, fr_FR -> ISO-8859-1. +++ (see new ACS/Java -notes) +++

+notes) +++
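The default mapping from locale to character set suggested above can be sketched as a small table plus conversion helpers. The two entries come from the text; the function names, the UTF-8 fallback for unmapped locales, and the choice to replace unencodable characters are assumptions for illustration only.

```python
# Default locale-to-charset table; the two entries are the examples
# given in the text, written with Python codec names.
DEFAULT_CHARSETS = {
    "ja_JP": "shift_jis",
    "fr_FR": "iso-8859-1",
}

# Assumption: locales without an entry fall back to UTF-8.
FALLBACK_CHARSET = "utf-8"


def charset_for_locale(locale):
    """Return the primary character set associated with a locale."""
    return DEFAULT_CHARSETS.get(locale, FALLBACK_CHARSET)


def encode_for_locale(text, locale):
    """Convert internal Unicode text to the locale's character set.
    Characters the charset cannot represent become '?' (assumed policy)."""
    return text.encode(charset_for_locale(locale), errors="replace")


def decode_from_locale(data, locale):
    """Convert bytes in the locale's character set to Unicode text."""
    return data.decode(charset_for_locale(locale))
```

Keeping the conversion at the input and output boundaries lets everything in between operate on Unicode only.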

+

Tcl Source File Character Set

There are two classes of Tcl files loaded by the system; library files loaded at server startup, and page script files, which are @@ -255,7 +296,9 @@ 50.50 It must be possible for a developer to manually override the output character set encoding for a request using an API function.

-

VI.F ACS Kernel Issues

+
+

VI.F ACS Kernel Issues

+
60.10 All ACS error messages must use the message catalog and the request locale to generate the error message for the appropriate locale. @@ -265,7 +308,9 @@ 60.30 Where files are written or read from disk, their filenames must use a character set and character values which are safe for the underlying operating system.

-

VI.G Templates

+
+

VI.G Templates

+
70.0 For a given abstract URL, the designer may create multiple locale-specific template files (one per locale or language) @@ -297,7 +342,9 @@ standard method for a user to indicate the charset of a text file being uploaded must be provided.

Design note: this presumably applies to uploading data to the content repository as well

-

VI.H Sorting and Searching

+
+

VI.H Sorting and Searching

+
80.10 Support API for correct collation (sorting order) on lists of strings in a locale-dependent way.

@@ -308,7 +355,9 @@ ORDER BY clauses in queries.

80.40 The system must handle full-text search in any supported language.

-

VI.G Time Zones

+
+

VI.G Time Zones

+
90.10 Provide API support for specifying a time zone

@@ -326,11 +375,15 @@ 90.60 The default if we can't determine a time zone is to display all dates and times in some universal time zone such as GMT.
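Requirement 90.60 can be sketched as a display helper that falls back to a universal time zone when the user's zone is unknown. The function name and the representation of a time zone as a fixed offset in minutes are assumptions made to keep the example self-contained; a fixed offset ignores daylight saving rules, which a real time zone API would have to handle.

```python
from datetime import datetime, timedelta, timezone


def to_display_time(utc_time, user_offset_minutes=None):
    """Convert a timezone-aware UTC datetime for display.

    If the user's offset from UTC is unknown (None), display in UTC,
    matching the universal-time default described in 90.60.
    """
    if user_offset_minutes is None:
        zone = timezone.utc
    else:
        zone = timezone(timedelta(minutes=user_offset_minutes))
    return utc_time.astimezone(zone)


stored = datetime(2000, 1, 14, 12, 0, tzinfo=timezone.utc)
```

Here `stored` stands for a timestamp kept in universal time; only the presentation changes per user.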

-

VI.H Database

+

+

VI.H Database

+

100.10 Since UTF8 strings can use up to three (UCS2) or six (UCS4) bytes per character, make sure that column size declarations in the schema are large enough to accommodate required -data (such as email addresses in Japanese).

VI.I Email and Messaging

+data (such as email addresses in Japanese).
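To make the sizing concern in 100.10 concrete, the sketch below compares character counts with UTF-8 byte counts. The helper names are hypothetical, and the check assumes the column size is declared in bytes rather than characters.

```python
def utf8_length(text):
    """Number of bytes the text occupies when encoded as UTF-8."""
    return len(text.encode("utf-8"))


def fits_column(text, column_bytes):
    """True if the UTF-8 encoding of text fits a column sized in bytes."""
    return utf8_length(text) <= column_bytes


ascii_name = "abc"                  # 3 characters, 1 byte each
japanese = "\u65e5\u672c\u8a9e"     # 3 characters, 3 bytes each in UTF-8
```

Both strings are three characters long, but the Japanese one needs nine bytes, so a column sized for three ASCII characters would truncate or reject it.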

+

VI.I Email and Messaging

+ When sending an email message, just as when delivering the content in a web page over an HTTP connection, it is necessary to be able to specify what character set encoding to use. @@ -341,11 +394,14 @@ 110.20 The email accepting API will allow for the character set to be parsed correctly (hopefully a well formatted message will have a MIME character set content type header)
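The two directions described here, choosing a character set when sending and reading it back from the MIME header when receiving, can be sketched with Python's standard `email` package. This is not the ACS email API; `build_message` and `read_body` are hypothetical helpers, and the US-ASCII default for messages without a charset parameter is an assumption.

```python
from email import message_from_bytes
from email.message import EmailMessage


def build_message(body, charset):
    """Create a text message whose body is encoded in the given charset.
    The charset is recorded in the Content-Type header."""
    msg = EmailMessage()
    msg["Subject"] = "Example"
    msg.set_content(body, charset=charset)
    return msg


def read_body(raw, default_charset="us-ascii"):
    """Decode the body of a single-part message using the charset named
    in its Content-Type header, or a default if none is given."""
    msg = message_from_bytes(raw)
    charset = msg.get_content_charset() or default_charset
    payload = msg.get_payload(decode=True)  # undo the transfer encoding
    return payload.decode(charset)


raw = build_message("Caf\u00e9 au lait", "iso-8859-1").as_bytes()
```

Calling `read_body(raw)` recovers the original text (plus a trailing newline added when the body was set), provided the sender declared the charset correctly.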

-

Implementation Notes

+ +

Implementation Notes

+ Because globalization touches many different parts of the system, we want to reduce the implementation risk by breaking the implementation into phases. -

VII. Revision History

+

VII. Revision History

+
@@ -356,5 +412,7 @@ -
Document Revision #Action Taken, NotesWhen?By Whom?
0.3comments from Christian1/14/2000Henry Minsky

hqm\@arsdigita.com

Last modified: $Date$

- + +
+
hqm\@arsdigita.com
+

Last modified: $Date$