Modality Translation And Assistance Services:
A Challenge For Artificial Intelligence

Gottfried Zimmermann, Ph.D.; Gregg Vanderheiden, Ph.D.
Trace R&D Center, University of Wisconsin-Madison
2107 Engineering Centers Bldg., 1550 Engineering Dr., Madison, WI 53706 USA

This paper introduces the "Modality Translation and Assistance Services" concept: a variety of remote services, available anywhere and anytime, that enhance the lives of people with and without disabilities.  It identifies the research and development challenges that must be met to implement automated personal services with Artificial Intelligence technologies.


Today, people with functional limitations such as hearing, visual, and cognitive impairments rely on human-assisted services, such as text transcription and sign language interpretation, that have to be arranged ahead of time and provided on-site.  This dependence can pose severe constraints, because other people must be present for the person to communicate, get access to public information, and live independently.  However, recent advancements in wide-area, high-bandwidth networks and wireless communication technologies could be used to provide personal services, such as instant text transcription or sign language interpretation, remotely and on demand.  Moreover, these on-demand services would also benefit people without disabilities.  The text transcription service could provide a speech input mode for small wireless devices with tiny or no keyboards.  A manager could use the same service to get accurate minutes of an important business meeting instantly.  In fact, there are many past examples of inventions that were created for use by people with disabilities and turned out to improve the quality of life for everybody (e.g., the typewriter and the telephone).

[Figure: Modality Translation graphic]

Modality Translation and Assistance Services Concept

Modality translation and assistance services render information from one specific presentation form (mode) to another, or provide other forms of assistance on demand.  Within the wide spectrum of possible services, each service is tailored to a person's communication and assistance needs arising from temporary or permanent functional limitations.  Users can connect to a service through a variety of stationary or mobile devices; examples include handheld computers and cell phones, outfitted with earbuds, buttonhole microphones, and eyeglasses with a built-in monitor.  While some of these services can be provided on site in a fully automated manner today (layer "local automatic services"), others require more advanced implementations and computational resources available through a wide-area network (layer "network enhanced services").  Others still may depend on human assistance for some time (layer "human assisted services").  Although more automated services will be implemented with emerging technologies, the early implementations may not be as mature as needed in some cases.  In these situations, a "Try Harder" feature could be used to harness more powerful applications on the network, and to bring human assistance into the automatic translation process when technology fails to be effective in certain environments and for certain problem classes.

Toward an Automated Service Model

Humans can remotely provide all services today.  However, automated services implemented in the local and network enhanced layers could facilitate a more cost-effective and scalable service model.  Among the services shown, only one (text-to-speech) can be delivered solely by computers today.  Others (speech-to-text and international language) are already available as automated services, but still rely on human assistance for verifying results and making corrections if needed.  For the remaining services (speech-to-sign, sign-to-speech, assistance mentoring, language level, and image/video description), no implementation is yet mature enough to be used even in conjunction with human assistance.

Artificial intelligence could provide key technologies to facilitate automated implementations of modality translation and assistance services of tomorrow.  Relevant research and development areas include:

In all these areas, the "Try Harder" feature allows for a smooth transition from automated to human-provided service implementations.  A probabilistic model should be part of the service implementations.  By keeping track of the probabilities of its output and applying heuristics, a service could automatically consult a more sophisticated implementation (machine or human-based) for certain parts of the problem, or transfer the whole assignment to a superior (or inferior) service implementation, if appropriate.
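The escalation logic behind "Try Harder" can be illustrated with a minimal sketch. It assumes that every service implementation reports a confidence value between 0 and 1 together with its output, and that a fixed threshold decides when to consult the next layer. The names `ServiceLayer` and `try_harder`, the threshold of 0.8, and the three stub translators are hypothetical and stand in for real local, network, and human-assisted implementations. Only the per-part escalation is shown; transferring a whole assignment to another implementation would need additional heuristics.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple

# Assumption: an implementation maps one input segment to (output, confidence).
Translator = Callable[[str], Tuple[str, float]]


@dataclass
class ServiceLayer:
    name: str
    translate: Translator


def try_harder(segments: List[str], layers: List[ServiceLayer],
               threshold: float = 0.8) -> List[Tuple[str, str, float]]:
    """Handle each segment with the first layer that is confident enough.

    Layers are ordered from cheapest to most capable. If no layer reaches
    the threshold, the result of the last layer is kept.
    """
    if not layers:
        raise ValueError("at least one service layer is required")
    results = []
    for segment in segments:
        for layer in layers:
            output, confidence = layer.translate(segment)
            if confidence >= threshold:
                break  # good enough, no need to escalate further
        results.append((output, layer.name, confidence))
    return results


# Hypothetical stubs: real services would run recognition or translation.
def local_stub(segment: str) -> Tuple[str, float]:
    confidence = 0.9 if len(segment.split()) <= 3 else 0.5
    return segment.upper(), confidence


def network_stub(segment: str) -> Tuple[str, float]:
    confidence = 0.6 if "noise" in segment else 0.85
    return segment.upper(), confidence


def human_stub(segment: str) -> Tuple[str, float]:
    return segment.upper(), 1.0


LAYERS = [
    ServiceLayer("local automatic", local_stub),
    ServiceLayer("network enhanced", network_stub),
    ServiceLayer("human assisted", human_stub),
]

if __name__ == "__main__":
    demo = ["hello there", "please read the sign aloud",
            "loud noise in the crowded hall"]
    for output, layer_name, confidence in try_harder(demo, LAYERS):
        print(f"{layer_name}: {output} ({confidence:.2f})")
```

In this sketch only the segments that an automated layer cannot handle with sufficient confidence reach the human-assisted layer, which is where the cost and scalability benefit of the layered model would come from.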


This work was partly funded by the National Science Foundation (USA) via the Alliance Partnership for Advanced Computational Infrastructure and the National Institute on Disability and Rehabilitation Research (NIDRR), US Department of Education, under grants H133E980008 and H133E990006.  Opinions expressed are those of the authors and not the funding agencies.

For more information please visit the Trace UD/DA Project page.