Before Multilingual Computing
In the beginning there was the mainframe, and the mainframe had an English keyboard, and the keyboard had no accent characters, and all were content. Then others heard of the computer and its miraculous powers, and there was rejoicing around the world. However, the keyboard had no accents and the keyboard could not make math symbols and the keyboard could not output Russian, Chinese, Spanish or any language other than English or Swahili. Then a great sadness fell upon the nations of the world, and the great international computer scientists met and asked "What shall we do?" (and some may have whispered "Drat those non-English scripts! Now what?") Thirty years later, we are finally beginning to approach a universal solution to support computing in all languages.
In today's world, it is in fact possible to type and process languages other than English, but few users would claim that it is as easy as typing and processing English. ANGEL Help Desk Coordinator and Hungarian native speaker Irma Giannetti noted, "I just stumbled onto the Hungarian keyboard by accident; it shouldn't be so difficult." However, the Penn State computing and foreign language communities have been working together in the past few years to provide utilities and instructions for students, staff, and faculty to make the process a little easier.
Penn State's Role in Facilitating International Computing
What exactly has Penn State done to facilitate non-English computing? There are a variety of factors and tools to consider, but with suggestions from different language departments, Classroom and Lab Computing, part of Information Technology Services (ITS), has installed the following utilities in its University Park Student Computing Labs:
In addition, both Teaching and Learning with Technology (TLT), a unit of ITS, and the Center for Language Acquisition provide online instructions for configuring computers to work with a variety of languages from Europe, Asia, the Middle East and elsewhere. Some of this documentation has been modified by language programs such as the Spanish Basic Language Program for use in working with their own languages, and some members of the Help Desk staff are providing information to help their customers activate and use the utilities necessary to work with non-English languages on their systems.
Finally, the Penn State CALPER Foreign Language Resource Center (http://calper.la.psu.edu/cmc.php), administered by the Center for Language Acquisition, has developed Unicode-compliant (see below) blog, Wiki, online chat, and message board tools which are open to all Penn State and non-Penn State users.
Who Benefits?
The major beneficiaries of these utilities are obviously students and instructors working in foreign languages, but they are not the only users who can take advantage of these resources. Students, staff, and instructors from outside the U.S. or those who know other languages can also benefit. In fact, the installation of fonts for South Asian scripts from India was not suggested by a language department, but by Jerrold Maddox, an instructor in the College of Arts and Architecture who teaches a course in Web development and is encouraging international students to post material in their native languages.
Similarly, both TLT and the Penn State Center for Language Acquisition have consulted with instructors who need to be able to type or read languages besides English, but are not able to easily change their computer settings to do so.
The Importance of Unicode
Ten years ago, configuring your computer system to use languages other than English primarily meant installing some special fonts on your machine and memorizing some keyboard codes. However, the demands of the Internet along with electronic communication and data transmission have forced the technology community to reevaluate how language data is processed. In the older "internationalization" models, users had to have the same system or the same fonts in order to read the non-English text correctly. If a user's computer did not have the correct font, the content would be unreadable.
Different operating system vendors and countries came up with their own standards for different languages, but the content was not always transferable. A Russian document written in Windows might not be readable on a Macintosh or Linux machine and vice versa. To counteract these compatibility problems, an encoding system called Unicode was developed and accepted by most major technology vendors including Microsoft, Apple, Adobe, Macromedia, and others.
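The compatibility problem described above can be reproduced in a few lines of Python. This is only a sketch under stated assumptions: the sample word and the choice of code page 1251 (a common pre-Unicode Windows encoding for Russian) and KOI8-R (a common Unix encoding for Russian) are illustrative and do not come from any Penn State tool.

```python
# Illustrative sample: the Russian word for "hello".
text = "привет"

# Save the text the way a pre-Unicode Windows system might (code page 1251).
windows_bytes = text.encode("cp1251")

# Read the same bytes with a different legacy table (KOI8-R, common on Unix).
# Every byte is valid in KOI8-R, but each one maps to a different letter.
misread = windows_bytes.decode("koi8_r")

# Reading with the matching table recovers the original text.
recovered = windows_bytes.decode("cp1251")

# With Unicode (here stored as UTF-8), writer and reader share one table.
utf8_round_trip = text.encode("utf-8").decode("utf-8")

print("written:  ", text)
print("misread:  ", misread)
print("recovered:", recovered)
```

The bytes never change; only the table used to interpret them does, which is why the reader had to know in advance which system produced the document.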
The Unicode system essentially assigns a unique number to every character in every script of the world's languages. Fonts have been redesigned so that the correct character is displayed when the system sees a particular Unicode numeric code. As long as an appropriate Unicode-compliant font is in place, a computer should be able to correctly display the text, even if it was originally written with a different font. This is similar to English text, which remains readable even if the font switches from Times New Roman to Arial.
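To make the idea of "a unique number for every character" concrete, the short Python sketch below prints the Unicode number (code point) and official name for a few characters. The sample characters and the helper name `describe` are chosen here for illustration only.

```python
import unicodedata

def describe(ch):
    """Return the code point and official Unicode name of one character."""
    return f"U+{ord(ch):04X} {unicodedata.name(ch)}"

# Illustrative samples: Latin, accented Latin, Cyrillic, Chinese, Devanagari.
samples = ["A", "\u00e9", "\u042f", "\u4e2d", "\u0905"]

for ch in samples:
    print(ch, describe(ch))
```

The number is the same on every platform; which font draws the character is a separate matter.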
Configuring Your System
Although most technology vendors agree on the value of Unicode, there has been a sizeable lag between creating the theoretical standard and implementing it for different platforms, applications, and scripts. Unicode compliance for East Asian languages is fairly complete across platforms, but Unicode support for other scripts, such as Armenian or Georgian, is still lagging. Other factors also affect how utilities function. One is whether a script is typed left-to-right (LTR) or right-to-left (RTL), as in the Middle East. Another is whether each symbol represents a letter, a syllable (as in some Japanese scripts), or a concept (as in Chinese). In short, languages and scripts each have their own quirks.
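Some of these script properties are recorded in the Unicode character database itself. As a rough sketch, the Python below looks up the writing direction of a few sample characters; the helper name `direction` and its simplified three-way grouping are assumptions made for this example, since the full Unicode bidirectional algorithm is considerably more involved.

```python
import unicodedata

def direction(ch):
    """Simplified direction lookup based on the Unicode bidirectional class."""
    bidi = unicodedata.bidirectional(ch)
    if bidi in ("R", "AL"):   # Hebrew-style and Arabic-style letters
        return "RTL"
    if bidi == "L":           # Latin, Cyrillic, Japanese, Chinese, etc.
        return "LTR"
    return "other"            # digits, punctuation, spaces, and so on

samples = [
    "a",        # Latin letter
    "\u05d0",   # Hebrew letter alef
    "\u0627",   # Arabic letter alef
    "\u304b",   # Japanese hiragana syllable "ka"
    "\u4e2d",   # Chinese ideograph
]

for ch in samples:
    print(direction(ch), unicodedata.name(ch))
```

The character names also hint at the letter, syllable, or ideograph distinction, which is part of why input utilities differ so much from script to script.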
To fully internationalize your system for a particular language, your computer should have a Unicode-compliant font for the script and a matching keyboard or input utility which correctly matches the character with its Unicode value. In addition, some users may require fonts and utilities which work with pre-Unicode systems or "legacy" encodings. The "By Language" section of the TLT Computing with Accents Web site at http://tlt.its.psu.edu/suggestions/international/bylanguage/index.html includes pages for different languages which list fonts, utilities, and other factors important for each language. (Note: Western European languages like French, Spanish, or German use the same fonts as English; the only difference is that certain accent codes must be memorized.)
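What a keyboard utility does for accented letters can be sketched in Python. The table below is hypothetical: it is not the layout of any real keyboard utility, and the names `DEAD_KEYS` and `compose` are invented for this example. It simply shows how an accent key followed by a letter can be matched to a single Unicode value.

```python
import unicodedata

# Hypothetical accent-key table: key typed first -> Unicode combining accent.
DEAD_KEYS = {
    "'": "\u0301",  # combining acute accent
    "`": "\u0300",  # combining grave accent
    "^": "\u0302",  # combining circumflex accent
    '"': "\u0308",  # combining diaeresis (umlaut)
    "~": "\u0303",  # combining tilde
}

def compose(accent_key, letter):
    """Combine an accent key and a base letter into one accented character."""
    combined = letter + DEAD_KEYS[accent_key]
    # NFC normalization merges letter + accent into the single
    # precomposed character when Unicode defines one.
    return unicodedata.normalize("NFC", combined)

for accent_key, letter in [("'", "e"), ("~", "n"), ('"', "u"), ("`", "a")]:
    ch = compose(accent_key, letter)
    print(f"{accent_key} + {letter} -> {ch} (U+{ord(ch):04X})")
```

Real input utilities for scripts such as Chinese or Japanese work quite differently, but the goal is the same: each keystroke sequence ends in a well-defined Unicode value.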
Once these fonts and keyboard utilities are installed, it is a matter of activating the keyboard and using software applications which can recognize them. Currently, these include recent versions of Microsoft Office (Office 2004 only on the Macintosh platform), StarOffice, Adobe InDesign and Photoshop, Netscape Composer, Notepad for Windows, TextEdit for the Macintosh, and others. Some products may be able to read Unicode text, but may not be able to type it. For these products, cutting and pasting from another text editor may be the best solution for now.
Another point to consider is that just because you are able to read and work with Unicode on your machine, it does not mean all other users are equally enabled. You may be able to create a Unicode-compliant Web site and view it correctly, but users with a different browser or missing a font may not be so lucky. You may need to provide instructions on how to help other users configure their systems. As you can see, foreign language technology is advancing, but it is still nowhere near as universally or easily implemented as English.
The Future
Although the current methods for multilingual computing are more complex than most users would like, the outlook is actually very promising. Three years ago, only Windows and Unix were fully Unicode-compliant and only the major scripts of the world were supported in all browsers. It was very difficult to post material in other scripts on the Web without very specialized knowledge of encoding systems. Now all modern operating systems are in Unicode, all the recent versions of each browser recognize Unicode, and support for lesser-known scripts has been included in most platforms.
Most importantly, more Unicode-aware applications and keyboards have been developed, so it is easier than ever to create and share Unicode documents. In Netscape, Mozilla, and Firefox, instructors can type material in different scripts directly into ANGEL and other Web tools and the content will be readable on other browsers. Unicode has become so common that many academic consortia and international user groups are developing fonts and utilities free for academic and commercial use. Multilingual support is not just becoming universal; it is becoming open-source.
Where to Learn More: Penn State Resources
TLT Computing with Accents
http://tlt.its.psu.edu/suggestions/international/
Center for Language Acquisition (click "Resources")
http://language.la.psu.edu/
Penn State CALPER Foreign Language CMCTools
http://calper.la.psu.edu/cmc.php
Foreign Languages and ANGEL
http://tlt.its.psu.edu/suggestions/international/web/mozillatype.html
Where to Learn More: Unicode
Alan Wood's Unicode Resources
http://www.alanwood.net/unicode/index.html
Jukka Korpela's Tutorial on Character Coding
http://www.cs.tut.fi/~jkorpela/chars.html
Unicode Consortium
http://www.unicode.org/
Tex Texin Internationalization Guy
http://www.i18nguy.com/