Modality Translation Services on Demand
Making the World More Accessible For All

Gottfried Zimmermann, Ph.D.,
Gregg Vanderheiden, Ph.D.,
Trace R&D Center, 2107 Engineering Centers Bldg., 1550 Engineering Dr., Madison, WI 53706


Two things must occur for a person to use information: the information must be accessible, and it must be presented in a way, or mode, that the person can understand. This paper introduces the "Modality Translation Services" concept, a set of remote services that provide instant translation from one presentation mode to another, anywhere and at any time. It explains these services, outlines potential applications, and shows how the concept could benefit people with disabilities as well as people who are not disabled but experience functional limitations.


Thomas Jefferson's words, "Information is the currency of democracy," pertain to today's information society more than ever. Exclusion from information can keep a person from fully participating in society. The problem is not a shortage of information; indeed, we often experience information overload. The question is how we can access information in the form we need (in the appropriate "currency") to be able to use it. For example, a person driving a car should not read e-mail on a screen, because the eyes are busy watching the road and traffic. However, the driver could use a "text-to-speech service" to voice the e-mail messages. Another example is a blind person participating in a business meeting where a diagram is being discussed. Here, an "image description service" could provide a verbal translation of the visual diagram.


The "Modality Translation Services" concept envisions a variety of remote services that are available anywhere, at any time [1]. These services are becoming possible as a result of recent advances in wide-area, high-bandwidth networks and wireless communication technologies.

Service Spectrum

Modality translation services render information from one specific presentation form (mode) into another. Each service in this wide spectrum is tailored to a person's communication needs arising from temporary or permanent functional limitations (see Figure 1).


Figure 1: Modality Translation Service Spectrum
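One way to picture the service spectrum is as a lookup table from a (source mode, target mode) pair to a service. The sketch below is only an illustration under that assumption; the `ServiceRegistry` class, the mode names, and the placeholder services are hypothetical and do not describe an existing implementation.

```python
from typing import Callable, Dict, Tuple

Mode = str                          # e.g. "text", "speech", "image" (hypothetical labels)
Translator = Callable[[str], str]   # stand-in signature: content in, content out


class ServiceRegistry:
    """Maps a (source mode, target mode) pair to a translation service."""

    def __init__(self) -> None:
        self._services: Dict[Tuple[Mode, Mode], Translator] = {}

    def register(self, source: Mode, target: Mode, service: Translator) -> None:
        self._services[(source, target)] = service

    def translate(self, source: Mode, target: Mode, content: str) -> str:
        try:
            service = self._services[(source, target)]
        except KeyError:
            raise LookupError(f"no service translates {source} to {target}") from None
        return service(content)


# Placeholder services that only tag their input; real services would call
# a speech synthesizer, a human describer, and so on.
registry = ServiceRegistry()
registry.register("text", "speech", lambda content: "[spoken] " + content)
registry.register("image", "text", lambda content: "[description of] " + content)
```

A client would then request, for example, `registry.translate("text", "speech", "New e-mail from Ann")` without needing to know whether the service behind it is automated or human-assisted.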

Try Harder Feature

While some of these services can already be provided in a fully automated manner (e.g., text-to-speech synthesizers for the text-to-speech service), others may need human assistance for some time (e.g., the speech-to-text, language level translation, and image/video description services). Although emerging technologies will allow more services to be automated, early implementations may not be mature enough in some cases. In these situations a "Try Harder" feature could harness more powerful applications (advanced network services) and bring human assistance into the automatic translation process when the technology is not effective in certain environments or for certain materials [2].
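The escalation behind a "Try Harder" feature could be sketched as an ordered list of tiers that is walked until one of them produces a good enough result. This is a minimal sketch under assumptions that are not in the paper: each tier reports a confidence score between 0 and 1, a fixed threshold decides when to stop, and the names `Tier`, `Result`, `try_harder`, and the three demo tiers are all hypothetical.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional


@dataclass
class Result:
    text: str
    confidence: float  # 0.0 to 1.0, self-reported by the tier (an assumption)
    tier: str


@dataclass
class Tier:
    name: str
    translate: Callable[[str], Result]


def try_harder(source: str, tiers: List[Tier], threshold: float = 0.8) -> Result:
    """Walk the tiers in order; return the first result that meets the threshold.

    If no tier meets the threshold, the most confident result seen is returned.
    """
    if not tiers:
        raise ValueError("at least one tier is required")
    best: Optional[Result] = None
    for tier in tiers:
        result = tier.translate(source)
        if best is None or result.confidence > best.confidence:
            best = result
        if result.confidence >= threshold:
            return result
    return best


# Stand-in tiers with fixed confidence values, for illustration only.
DEMO_TIERS = [
    Tier("local", lambda s: Result(s, 0.4, "local")),      # on-device engine
    Tier("network", lambda s: Result(s, 0.7, "network")),  # more powerful remote engine
    Tier("human", lambda s: Result(s, 1.0, "human")),      # human-assisted service
]
```

With these stand-ins, `try_harder("noisy recording", DEMO_TIERS)` falls through to the human-assisted tier, while a lower threshold such as 0.6 stops at the network tier. Ordering the tiers from cheapest to most expensive keeps human assistance as the last resort.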

Service Access Devices

To use these on-demand translation services, a person needs a device that connects remotely to a global, high-bandwidth network and renders information on a display or through other output units. Although any kind of computer system can serve as an access device, it is small, wireless devices that bring the real "anywhere, at any time" feature to the user. Examples include handheld computers and cell phones outfitted with earbuds, buttonhole microphones, or eyeglasses with a built-in monitor.

Who are the users?

We identify four user groups that could benefit from the "Modality Translation Services" concept, differing only in the kind of functional limitation they encounter:


Many of these services are already implemented in a human-assisted, semi-automatic, or fully automatic manner. Examples include, for the speech-to-text service, Ultratec's Instant Captioning™ technology [3] and the Classroom Captioner from Personal Captioning Systems [4]; for the speech-to-sign service, the Signing Avatar™ from VCom3D [5]; and, for the international language service, the AltaVista Babel Fish translation service powered by SYSTRAN [6].

To make these services available to a broad user base, they should be embedded in a globally available telecommunication network. As part of the Partnership for Advanced Computational Infrastructure (PACI) [7], the Trace Center is currently investigating options and promoting feasible solutions for modality translation services on the Grid and on other next-generation networks, services, and computational resources [8].


This paper was partly funded by the National Science Foundation (NSF) in the context of the Universal Design/Disability Access Program (UD/DA) [8].

