Common Mistakes When Planning Website / Application Globalisation
(Edited by freelance Chinese translator li – English to Chinese or Chinese to English translation services)
When creating websites or applications that are designed to be multilingual, software developers who are new to globalisation tend to make the same mistakes. Presented here are some of the more common mistakes we see on a regular basis.
1. String Concatenation Problems
It is common for English-speaking developers to append an "s" to a word in order to pluralise it, but this should be avoided because it is an English-only construct. Other string concatenation problems occur when a sentence is assembled from fragments: words may need masculine or feminine endings in other languages, and some languages place a value at the end of the sentence rather than the beginning. It is therefore a good idea to keep each sentence whole and use tokens to represent variables, so that the translator can restructure the sentence to suit the language. C# provides the System.String.Format method for this purpose, and most programming languages have equivalent functionality.
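As a rough illustration of the token approach in Java, the sketch below uses java.text.MessageFormat, which plays a similar role to String.Format. The class name, the method name and both patterns are assumptions made for this example; the second pattern is only an English stand-in for a translation whose grammar puts the total first.

```java
import java.text.MessageFormat;
import java.util.Locale;

final class TokenisedMessages {
    // Source-language pattern: the current page comes first.
    static final String ENGLISH_PATTERN = "Page {0} of {1}";

    // Hypothetical stand-in for a translation that needs the total first.
    static final String REORDERED_PATTERN = "{1} pages in total, now showing page {0}";

    // Fills the numbered tokens; the calling code is the same for every pattern.
    static String render(String pattern, int current, int total) {
        return new MessageFormat(pattern, Locale.ENGLISH)
                .format(new Object[] {current, total});
    }

    public static void main(String[] args) {
        System.out.println(render(ENGLISH_PATTERN, 3, 10));   // Page 3 of 10
        System.out.println(render(REORDERED_PATTERN, 3, 10)); // 10 pages in total, now showing page 3
    }
}
```

Because the translator receives the whole sentence with its tokens, the word order can change without any change to the code that supplies the values.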
2. Text Encodings
When creating Web pages or applications that write to the file system, it is important to use a text encoding that supports the full range of characters you need to express. Many English-speaking developers use ASCII or the Windows-1252 character set by default; a better choice is UTF-8, which can represent Chinese, Japanese and every other script covered by Unicode while remaining backwards compatible with ASCII for English text. UTF-8 is the encoding assumed for XML documents that carry no encoding declaration or byte order mark, as defined by the W3C, and it is the default output encoding of ASP.NET. It is also important to consider text encoding when designing database structures, for instance by using Unicode database fields (which may take up twice as much space as their non-Unicode equivalents, as with the Microsoft SQL Server nchar and nvarchar types, which use two bytes for most characters) and by marking the language used within a text field so that it can be easily identified later.
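The difference between the encodings can be seen in a few lines of Java. This is a minimal sketch with made-up class and method names; the Chinese sample text is written with Unicode escapes only so that the source file itself stays plain ASCII.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

final class EncodingDemo {
    // Two Chinese characters (the word for the Chinese language), as escapes.
    static final String CHINESE = "\u4e2d\u6587";

    static byte[] utf8(String text) {
        return text.getBytes(StandardCharsets.UTF_8);
    }

    static byte[] ascii(String text) {
        return text.getBytes(StandardCharsets.US_ASCII);
    }

    // Encode to bytes and decode again with the same charset.
    static String roundTripUtf8(String text) {
        return new String(utf8(text), StandardCharsets.UTF_8);
    }

    static String roundTripAscii(String text) {
        return new String(ascii(text), StandardCharsets.US_ASCII);
    }

    public static void main(String[] args) {
        // English text produces identical bytes in ASCII and UTF-8.
        System.out.println(Arrays.equals(utf8("Hello"), ascii("Hello"))); // true
        // Each of these Chinese characters takes three bytes in UTF-8.
        System.out.println(utf8(CHINESE).length);                         // 6
        // UTF-8 survives the round trip; ASCII replaces each character with '?'.
        System.out.println(roundTripUtf8(CHINESE).equals(CHINESE));       // true
        System.out.println(roundTripAscii(CHINESE));                      // ??
    }
}
```

Naming the charset explicitly on every read and write, rather than relying on the platform default, is what keeps the text intact when files move between machines.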
3. Graphics
When designing the user interface of an application, a common problem is that text in other languages can be longer than the English equivalent, causing text to be hidden by other elements or to wrap in undesired ways. German, in particular, can be up to 50% longer than its English equivalent. When designing a user interface or sending text for translation, it may be worth investing the time up front to provide translators with examples of where the text will be used, or to design the user interface so that it adjusts elegantly to different text lengths. Problems with graphics can be far reaching, from country-specific telephone numbers (forgetting to add the country code) to a lack of space in which to write the translation. Another common problem is simply forgetting to send graphical elements for translation.
4. Lack of Context
Sometimes our customers send us lists of strings in XML format without providing any context, such as a description of where each string is used (next to a text box, for example). Many of our translators then have to ask where the label is to be used so that they can provide a proper translation, which increases turnaround time. When creating a list of terms for translation, it is very useful for translators if the customer provides screenshots or writes a description of where each string will be used, especially if the job is urgent.
5. Input and Output Styles
Many English-speaking developers are not familiar with the ways that other cultures express dates and numbers. For instance, in much of continental Europe and South America the comma is used as the decimal separator, whereas in English-speaking countries the comma is used as a thousands separator. As a result, we see a lot of JavaScript in which number parsing fails, and a lot of confusion regarding the correct input method.
The .NET Framework provides the System.Globalization.CultureInfo class, which you can use to format values as strings or to parse input in a culturally sensitive way. You can set System.Threading.Thread.CurrentThread.CurrentCulture to a CultureInfo, after which ToString() automatically produces output in that culture's format, including month names in the correct language. (CurrentUICulture, by contrast, controls which translated resources are loaded.) In Java, similar functionality is driven by java.util.Locale together with the formatters in java.text.
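As a sketch of the Java side, the following uses java.text.NumberFormat with an explicit Locale to parse and format the same value for German and British users. The class and method names are invented for the example, and the two-decimal-place setting is an arbitrary choice.

```java
import java.text.NumberFormat;
import java.text.ParseException;
import java.util.Locale;

final class LocaleNumbers {
    // Parses user input according to the separators of the given locale.
    static double parse(String input, Locale locale) {
        try {
            return NumberFormat.getNumberInstance(locale).parse(input).doubleValue();
        } catch (ParseException e) {
            throw new IllegalArgumentException("Not a number for " + locale + ": " + input, e);
        }
    }

    // Formats a value with exactly two decimal places for the given locale.
    static String format(double value, Locale locale) {
        NumberFormat formatter = NumberFormat.getNumberInstance(locale);
        formatter.setMinimumFractionDigits(2);
        formatter.setMaximumFractionDigits(2);
        return formatter.format(value);
    }

    public static void main(String[] args) {
        System.out.println(format(1234.56, Locale.UK));         // 1,234.56
        System.out.println(format(1234.56, Locale.GERMANY));    // 1.234,56
        System.out.println(parse("1.234,56", Locale.GERMANY));  // 1234.56
    }
}
```

Passing the locale explicitly, rather than relying on the server's default, matters in Web applications where one process serves users from many countries.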
When sharing dates between systems, the best format to use is the ISO 8601 date format, YYYY-MM-DD, which is close to the year-month-day order used in Japan. The use of the hyphen rather than the slash signals that the ISO format is intended. For reference, the US expresses dates as MM/DD/YYYY whereas the UK expresses dates as DD/MM/YYYY, which shows how cultures that speak the same language can have different ways of expressing the same information.
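The ambiguity is easy to demonstrate with java.time. In this sketch (class, field and method names are illustrative), the same text is read once as a US date and once as a UK date, then written out in ISO 8601 form for exchange with another system.

```java
import java.time.LocalDate;
import java.time.format.DateTimeFormatter;
import java.util.Locale;

final class IsoDates {
    // Numeric patterns for the two conventions mentioned in the text.
    static final DateTimeFormatter US = DateTimeFormatter.ofPattern("MM/dd/uuuu", Locale.ROOT);
    static final DateTimeFormatter UK = DateTimeFormatter.ofPattern("dd/MM/uuuu", Locale.ROOT);

    // Reads a date in the given source format and returns it as YYYY-MM-DD.
    static String toIso(String text, DateTimeFormatter sourceFormat) {
        return LocalDate.parse(text, sourceFormat).format(DateTimeFormatter.ISO_LOCAL_DATE);
    }

    public static void main(String[] args) {
        System.out.println(toIso("03/04/2009", US)); // 2009-03-04 (4 March)
        System.out.println(toIso("03/04/2009", UK)); // 2009-04-03 (3 April)
    }
}
```

Once a date has been converted to the ISO form, the receiving system no longer needs to know which convention the user originally typed it in.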
Many developers use three drop-down lists (combo boxes / select boxes) for date entry, but it is important to make sure that the method makes sense across all cultures, for instance by using four digits to represent the year rather than two (01, 02, 03, 04, 05) and by using month names rather than numbers, which could be confused with the day of the month.
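One possible way to produce the option labels for such lists in Java is sketched below. The class and method names are hypothetical, and how the labels are rendered into the page is left out.

```java
import java.time.Month;
import java.time.format.TextStyle;
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

final class DateEntryOptions {
    // Month names in the user's language, in calendar order.
    static List<String> monthNames(Locale locale) {
        List<String> names = new ArrayList<>();
        for (Month month : Month.values()) {
            names.add(month.getDisplayName(TextStyle.FULL, locale));
        }
        return names;
    }

    // Four-digit year labels for the inclusive range first..last.
    static List<String> years(int first, int last) {
        List<String> labels = new ArrayList<>();
        for (int year = first; year <= last; year++) {
            labels.add(String.format(Locale.ROOT, "%04d", year));
        }
        return labels;
    }

    public static void main(String[] args) {
        System.out.println(monthNames(Locale.ENGLISH));
        System.out.println(monthNames(Locale.FRENCH));
        System.out.println(years(2008, 2010)); // [2008, 2009, 2010]
    }
}
```

The value submitted for each month option can still be its number; only the visible label needs to be localised.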
from: Adrian Hesketh, Lead Systems Developer thebigword