Building a Multilingual Web Service Using ACS
by Jeff Davis (davis@arsdigita.com), John Lowry (lowry@arsdigita.com), Henry Minsky (hqm@ai.mit.edu)
Submitted on: 2000-07-24
Last updated: 2000-07-24
ArsDigita : ArsDigita Systems Journal : One article
Imagine you are a web publisher with a very big idea. You want your
idea to be accessible to users in different parts of the world in
whatever language they speak. The first part of this goal is easy. In
fact, nothing is required of the publisher. Web sites can already be
accessed from almost any part of the world. The second part is more
difficult. At first glance, the solution seems simply a matter of
translating content into each language the site supports. But what
happens when the pages of a site are dynamically generated by a
computer program? Very few web publishers have successfully implemented
such a site.
You should read this article if your idea is so compelling that you
are not daunted by the obstacles that have held back all but the most
ambitious publishers. We describe a solution for building a
multilingual web site in which all the pages are
dynamically generated. We use the ACS toolkit to build the
site, but we provide sufficient information that non-ACS users can
adapt our solutions.
Contents
- Big Picture
Very few organizations have successfully implemented a multilingual
site. We give an overview of the extreme organizational and
technological challenges that are involved.
- Language issues
This section describes how to identify a user's language preference
and how to send output to the user in that language. We also discuss
how to organize the process of translating content into different
languages.
- Localization
A locale is the set of language and cultural rules used to
format dates, numbers, and monetary amounts. This section describes an
API that can be used to read and write localized data.
- Character encodings
The default web character set is ISO Latin 1, which can encode most
characters in Western European languages. This section discusses how
to handle characters that cannot be encoded in ISO Latin 1. It
includes a sidebar on character entity references in HTML files.
- Timezones
This section discusses why we need to store timezone information. It
describes how to convert dates from local time to Universal Time and
vice versa.
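To make the timezone item above concrete, here is a minimal sketch of converting a local time to Universal Time and back. It is written in Python rather than with the ACS code, the function names are our own for illustration, and it assumes the caller already knows the user's offset from UTC in hours. A fixed offset ignores daylight-saving rules, which is one reason a real site needs to store fuller timezone information.

```python
from datetime import datetime, timedelta, timezone


def local_to_utc(local_naive, utc_offset_hours):
    """Interpret a naive local timestamp at the given offset and return it in UTC."""
    local_tz = timezone(timedelta(hours=utc_offset_hours))
    return local_naive.replace(tzinfo=local_tz).astimezone(timezone.utc)


def utc_to_local(utc_dt, utc_offset_hours):
    """Convert an aware UTC timestamp to local time at the given offset."""
    local_tz = timezone(timedelta(hours=utc_offset_hours))
    return utc_dt.astimezone(local_tz)


# Hypothetical example: a timestamp entered by a user whose clock is at UTC-4.
entered_local = datetime(2000, 7, 24, 9, 30)
stored_utc = local_to_utc(entered_local, -4)
print(stored_utc.isoformat())                   # 2000-07-24T13:30:00+00:00

# The same instant shown to a reader whose clock is at UTC+2.
print(utc_to_local(stored_utc, 2).isoformat())  # 2000-07-24T15:30:00+02:00
```

Storing the UTC value and converting only at display time means that readers in different timezones see the same instant expressed in their own local time.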
Demo site
If you want to see our code working, go to our demonstration of a multilingual
site, which supports four languages: English, German, French, and
Spanish. Each page on the site is dynamically generated and can be
viewed in any of the four supported languages.
You may be disappointed by the quality of translation if you are a
native speaker of one of these languages. That is because we have used
translation software rather than human translators to create the
non-English content. You must have cookies enabled in your web browser
to switch to a different language.
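As an illustration of how a site like this might decide which language to serve, here is a minimal Python sketch. It is not the ACS implementation. We assume, for the example only, that an explicit choice saved in a cookie wins, that the browser's Accept-Language header is the fallback, and that English is the default. The four language codes match the languages of the demo site; the function names are hypothetical.

```python
SUPPORTED = ("en", "de", "fr", "es")   # languages offered by the demo site
DEFAULT_LANGUAGE = "en"                # assumed fallback for this sketch


def parse_accept_language(header):
    """Return the language tags of an Accept-Language header, best first."""
    ranked = []
    for position, item in enumerate(header.split(",")):
        parts = item.strip().split(";")
        tag = parts[0].strip().lower()
        if not tag:
            continue
        quality = 1.0                  # a missing q parameter means q=1
        for param in parts[1:]:
            name, _, value = param.strip().partition("=")
            if name.strip() == "q":
                try:
                    quality = float(value)
                except ValueError:
                    quality = 0.0      # treat a malformed q as "not acceptable"
        if quality > 0:
            # Sort by descending quality, then by order of appearance.
            ranked.append((-quality, position, tag))
    return [tag for _, _, tag in sorted(ranked)]


def choose_language(cookie_value=None, accept_language=""):
    """Pick a supported language: cookie first, then header, then default."""
    if cookie_value:
        primary = cookie_value.lower().split("-")[0]
        if primary in SUPPORTED:
            return primary
    for tag in parse_accept_language(accept_language):
        primary = tag.split("-")[0]    # "fr-ca" falls back to "fr"
        if primary in SUPPORTED:
            return primary
    return DEFAULT_LANGUAGE


print(choose_language("de", "fr"))                            # de
print(choose_language(None, "ja, fr-CA;q=0.8, es;q=0.9"))     # es
print(choose_language(None, ""))                              # en
```

Checking the cookie first lets a user override the browser setting, which would be consistent with the demo site needing cookies enabled to switch languages.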
Code status
A set of patches for serving web pages in multiple character encodings
using the ACS is available from http://imode.arsdigita.com/i18n/.
In a short while, we will release the code that was used to implement
the language, localization and timezone parts of the demo site.
Credits
This article is based on recent work on multilingual web sites.
The team was led by Jeff Davis, and its members included Marc Anderson,
Ashok Argent-Katwala, John Lowry, Henry Minsky, Sebastian Skracic, and
Kai Wu.
More information
- ACS toolkit: http://www.arsdigita.com/doc/
- Demonstration multilingual web site: http://multilingual.arsdigita.com/
- Patches to the ACS for serving web pages in multiple character
encodings: http://imode.arsdigita.com/i18n/
- W3C Internationalization and Localization: http://www.w3.org/International/Overview.html
asj-editors@arsdigita.com
Reader's Comments
I just want to point out that the language features listed here are available for ACS 4.x in the acs-lang package available in the repository:
ACS Localization Utilities
I spent some time before I figured that out.
Jacob
-- Jacob Williams, May 23, 2001