Use of a Common Table Architecture for Creating Hands Free, Eyes Free, Noisy Environment (Flex-Modal, Flex-Input) Interfaces.

Gregg C. Vanderheiden, Ph.D.

Trace R & D Center, Industrial Engineering Department, University of Wisconsin-Madison, Madison, WI 53705, e-mail


As we look to the design of nomadic systems, we need to find ways to create interfaces that the user can adapt to meet the constraints of different environments. We need to create interfaces that we can use whether we are walking down the hall, driving the car, sitting at our workstation, or sitting in a meeting; that we can use when we're under stress or distracted; and that make it easy for us to locate and use new services. This requires an interface that at one point in the day can be used without using vision (e.g., when driving a car), but at other times may need to be totally visual (e.g., on a noisy prop airplane or in a meeting where you cannot use sound). It should also allow you to use a full keyboard and mouse (e.g., when seated) but not require use of these devices since they may not always be usable (e.g., while driving or while walking).

Interestingly, this same interface flexibility (including the ability to change modalities) is what is needed to provide access to next generation systems by people with disabilities. That is, if we create fully nomadic systems, we will have solved most of the problems and issues around providing access for people with disabilities, people with reading problems, and people who are aging.

The task in both of these pursuits is identifying ways to create interfaces that are Flex-Modal (the user can choose the sensory modality for information presentation) and Flex-Input (the user can choose which technique they use to input information to the device). Proposed is a method that uses a common table of semantic items that can be built in advance or on the fly. This can then be used as a basis for flex-modal and flex-input interface options.

1. Introduction

As applications get more interactive, involve animated information, and take on more of an "appliance" character, the strategy of relying on assistive technologies to provide access is becoming less effective. We are now seeing systems where users are able to navigate through stores or whole shopping malls using 3D virtual environments on their web browsers. There is also an increasing number of 3-dimensional educational programs. Within 2 years, it will be possible for even the smallest, poorest school to have the most sophisticated chemistry laboratories, physics labs and electronics training labs using nothing more than a room full of donated surplus computers. Students will be able to manipulate virtual chemicals, conduct experiments using highly sophisticated virtual equipment, and build circuits using virtual components, and have their creations on screen behave just as if they had concocted the real thing. Within 5-8 years, sales forces will be outfitted with small pocket-size appliances that they can use to stay in contact with the office and have full access to all their files and applications wherever they are. Any document, any book, any information anywhere that they have authority to access will be available to them wherever they are in the world - even if there are no electrical outlets for 1000 miles.

2. Emerging Problems

These emerging systems are going to provide new access challenges for a number of reasons.

As these new devices come on line and are integrated into the school, work and daily living environments, access to them will become essential for those who want to attend regular schools, compete in the job market or even have access to the online commerce and shopping.

Current access approaches will not be very effective with these systems. Screen reading technologies work by giving a verbal description of the visual representation, relying on the screen reading software (or the user) to figure out what the information on screen is trying to convey. That approach will not work on these dynamic, animated, 3D graphic representations. Such tools may be able to tell the user what is there, but will not be able to convey the meaning or relationship of the different components.

For example, a demonstration of gas dynamics is available that consists of images of a balloon, a thermometer and a heater. As you look at the balloon, you can see little gas molecules bouncing around inside moving somewhat slowly. As you turn up the heat you see the temperature increase and the gas molecules speed up. They hit the sides of the balloon harder and it expands; a few of them even escape through the sides of the balloon. Increasing the temperature increases the speed of the molecules, the size of the balloon, and the number of escaping molecules. Increase it too much and the balloon pops.

A screen reader will tell you "90, 80, 70, 60, 50, 40, 30, 20, 10," which doesn't tell you much. (The only text on screen is the numbers on the thermometer.) With active accessibility, you could learn that there was a balloon, molecules, a thermometer and a way to turn the heat up. You would also be able to turn the heat up and perhaps learn the status of the objects as you stepped around to them. You would not, however, be able to operate the heat and simultaneously get feedback on the balloon behavior and the behavior of the gas molecules in an instructive manner. If the application were constructed to operate in a verbal fashion, however, it would both be accessible if one could not see and provide more information than would be possible as a visual only application.

Similarly, a 3D shopping application which involved travelling about a simulated shopping mall (which could be 3 dimensional itself - like a weightless space shopping center) would be completely inaccessible to a screen reader.

3. Building Access Into the Application

3.1. Multiple Benefits

Although applications such as these would not be very accessible to screen readers, they would be accessible even to users who were blind or deaf-blind if a verbal (voice or braille) interface were built into them (see below). In addition, they would also be usable by someone who was driving a car, since they could be accessed using the verbal interface. Finally, since the interface could be navigated and controlled using text strings, it would be much easier to create intelligent or semi-intelligent electronic agents which would be able to encounter and navigate the mall to catalog or index its contents or even to do straightforward shopping. As discussed below, by designing the application in this fashion, a company could create a single application that could be accessed visually, by phone, on a small PDA display, or by intelligent agent. At the same time, however, these interfaces need to be both operable by and efficient for experienced and power users.

3.2. Requirements of Cross-Disability Accessible Systems

In order to build cross-disability access into a system, two key elements will be required.

Modality independence (Flex-Modal) -- The interfaces will need to allow the user to choose the sensory modalities that are appropriate to the environment, situation, or user. Text-based systems will need to allow users to display information visually at some times and auditorially at others - on high-resolution displays when they are available and on smaller, low-resolution displays when that is all that is handy. We refer to an interface which provides this choice of modalities as Flex-Modal.

Flexible/adaptable input (Flex-input) -- We will need interfaces that can take advantage of fine motor movements and three-dimensional gestures when a user's situation and/or abilities allow, but that can also be operated using speech, keyboard, or other input techniques when that is all that is practical given the environment the user is in, the activities they're engaged in, or any motor constraints.

The first requires that information be stored and available in either modality-independent or parallel-modality form.

Modality-independent refers to information that is stored in a form which is not tied to any particular form of presentation. For example, ASCII text is not inherently visual, auditory, or tactile. It can be easily presented visually on a visual display or printer, but can just as easily be presented auditorially through a voice synthesizer or tactually through a dynamic braille display.
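As a minimal sketch of this idea, the same stored text can be handed to whichever presenter the user has chosen. The function names, the modality keys, and the tagged strings below are assumptions made for illustration; they stand in for a real display, voice synthesizer, and dynamic braille display so that the example runs without hardware.

```python
# Hypothetical stand-ins for real output devices. Each one only returns a
# tagged string; a real system would drive a display, a voice synthesizer,
# or a dynamic braille display here.
def to_screen(text: str) -> str:
    return f"[display] {text}"


def to_speech(text: str) -> str:
    return f"[voice] {text}"


def to_braille(text: str) -> str:
    return f"[braille] {text}"


# Modality names are illustrative labels, not terms defined by the paper.
PRESENTERS = {
    "visual": to_screen,
    "auditory": to_speech,
    "tactile": to_braille,
}


def present(text: str, modality: str) -> str:
    """Send the same stored text to whichever modality the user chose."""
    try:
        presenter = PRESENTERS[modality]
    except KeyError:
        raise ValueError(f"unsupported modality: {modality!r}") from None
    return presenter(text)


if __name__ == "__main__":
    stored = "Thermometer reads 70 degrees."
    for modality in PRESENTERS:
        print(present(stored, modality))
```

The stored string is never changed; only the choice of presenter differs, which is what makes the information modality-independent in this sketch.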

Parallel-modality refers to information which is stored in multiple modalities. An example would be a movie which includes a description of the audio track (e.g., captions) and a description of the video track in audio and electronic text format so that all (or essentially all) of the information can be presented visually, auditorially, or tactually at the user's request based upon their needs, preferences, or environmental situation.
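The movie example can be sketched as a choice among parallel tracks. This is only an illustration under stated assumptions: the track names and the two flags are hypothetical, and the flags are meant to cover both a user's abilities and the current environment (for instance, `can_see` is false while driving, and `can_hear` is false on a noisy airplane).

```python
def tracks_for(can_see: bool, can_hear: bool) -> list:
    """Pick one rendition of the picture and one of the sound.

    Track names are hypothetical labels for the parallel renditions
    stored with the movie.
    """
    tracks = []

    # Picture content: the video itself if it can be seen, otherwise a
    # description of it, spoken if possible and as text if not.
    if can_see:
        tracks.append("video")
    elif can_hear:
        tracks.append("video_description_audio")
    else:
        tracks.append("video_description_text")

    # Sound content: the audio itself if it can be heard, otherwise captions.
    if can_hear:
        tracks.append("audio")
    else:
        tracks.append("captions")

    return tracks


if __name__ == "__main__":
    for see in (True, False):
        for hear in (True, False):
            print(f"can_see={see}, can_hear={hear}: {tracks_for(see, hear)}")
```

When neither flag is set, both selected tracks are electronic text, which could then be routed to a braille display.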

Although this is simple to state -- that information would be available in modality independent or parallel format -- it will range from quite simple to simply impossible to do depending on the nature of the information to be presented. Trying to present Picasso's Guernica in text form is not possible (although a description of it could be, as well as impressions of some viewers). For a great deal of information, however, ASCII text descriptions already exist or can be fairly straightforwardly created.

The user-selectable interface is more difficult. It amounts to both separating the interface from the application and implementing multiple simultaneous interfaces in the same device.

4. Common Table Architecture

One approach to doing this is what might be termed a Common Table Architecture. From this a universal list of information and action items can be generated and used to provide interfaces that work across environments or abilities. It has been successfully used in touchscreen kiosk applications and can also be extended to situations like virtual shopping centers and other similar applications that largely deal with navigating among objects and either picking or viewing them.

By maintaining an updated listing of all of the information available to the user at any point in time, as well as all of the actions or commands available or displayed for the user, it becomes relatively easy to provide great flexibility in the techniques available to a user to operate the device or system.

For example, in a 3D virtual shopping mall, a database could be used to generate the image seen by the user and to react to user movements or choices of objects in the view. If properly constructed, this database would be able to provide a listing of all of the objects in view as well as information about any actionable objects presented to the user at any point. By including verbal (e.g., text) information about the various objects and items, it is possible for individuals to navigate and use this 3D virtual shopping system in a wide variety of ways, including purely verbally.
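One way such a listing might be structured is sketched below. The class names, field names, and the sample mall view are all assumptions made for this example, not a description of any existing implementation. The application refreshes the table whenever the view changes, and any interface can then ask for everything in view or only the actionable items.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TableItem:
    """One entry in the common table (hypothetical field names)."""
    item_id: str
    label: str               # verbal (text) name of the object
    description: str = ""    # optional extra verbal information
    actions: tuple = ()      # names of actions the user may take on it


class CommonTable:
    """Keeps an up-to-date list of what is currently available to the user."""

    def __init__(self):
        self._items = {}

    def refresh(self, items):
        """Replace the table contents with the items now in view."""
        self._items = {item.item_id: item for item in items}

    def items(self):
        return list(self._items.values())

    def actionable(self):
        return [item for item in self._items.values() if item.actions]

    def describe(self):
        """Return one line of text per item, suitable for any modality."""
        lines = []
        for item in self._items.values():
            line = item.label
            if item.description:
                line += f": {item.description}"
            if item.actions:
                line += f" (actions: {', '.join(item.actions)})"
            lines.append(line)
        return lines


def example_view():
    """A made-up view inside a virtual mall, for illustration only."""
    return [
        TableItem("door-1", "Bookstore entrance", "on your left", ("enter",)),
        TableItem("fountain", "Fountain"),
        TableItem("sign-1", "Mall directory", actions=("read",)),
    ]


if __name__ == "__main__":
    table = CommonTable()
    table.refresh(example_view())
    for line in table.describe():
        print(line)
```

Because every entry carries text, the same table can feed a visual list, a speech synthesizer, or a software agent without changes to the application that fills it.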

If you maintain or have available such a list, it is then fairly straightforward to create a very flexible set of alternate selection techniques. These can accommodate the widely varying physical and sensory abilities that individuals may have, and can also provide input flexibility to nomadic users in different environments or situations (e.g., walking, wearing heavy gloves, etc.).
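To illustrate how several selection techniques can sit on top of one list, the sketch below implements three of them: selection by number, selection by typed or spoken name, and step scanning. These three were chosen for this example and are not presented as the paper's own set of operating modes; all function and class names are hypothetical.

```python
def select_by_number(labels, number):
    """Direct selection: the user keys or says a 1-based position."""
    if not 1 <= number <= len(labels):
        raise IndexError(f"no item number {number}")
    return labels[number - 1]


def select_by_name(labels, spoken):
    """Selection by typed or spoken name.

    Accepts any case-insensitive prefix that matches exactly one label.
    """
    wanted = spoken.strip().lower()
    matches = [label for label in labels if label.lower().startswith(wanted)]
    if len(matches) != 1:
        raise LookupError(f"{spoken!r} matches {len(matches)} items")
    return matches[0]


class StepScanner:
    """Step scanning: one control moves a highlight, another selects.

    This suits input limited to one or two switches.
    """

    def __init__(self, labels):
        if not labels:
            raise ValueError("nothing to scan")
        self._labels = list(labels)
        self._pos = 0

    def current(self):
        return self._labels[self._pos]

    def next(self):
        # Wrap around to the first item after the last one.
        self._pos = (self._pos + 1) % len(self._labels)
        return self.current()

    def select(self):
        return self.current()


if __name__ == "__main__":
    labels = ["Bookstore entrance", "Fountain", "Mall directory"]
    print(select_by_number(labels, 2))
    print(select_by_name(labels, "mall"))
    scanner = StepScanner(labels)
    scanner.next()
    print(scanner.select())
```

All three techniques take the same list of labels as input, so adding another technique does not require any change to the application that produces the list.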

A suggested selection of operating modes might be:

This combination would provide a set of interfaces which would be able to accommodate a wide variety of users, including people who have difficulty reading; who cannot read at all; who have low vision; who are completely blind; who are hard of hearing; who are deaf; who cannot speak; who have physical disabilities; who are completely paralyzed; and who are deaf-blind -- as well as people with no disabilities, power users and novices. It would also provide the ability to access the device in harsh environments or remote fashion.

Moreover, once the Common Table or List is available, it is very straightforward to create all of the above interface options in a product. You basically have whatever standard interface you would like and then the above set of techniques for accessing the List. Once the code for the techniques is developed, it is relatively easily transferred to other devices, even if they have quite different standard interfaces.
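The portability claim can be illustrated with a technique written only against the list, not against any particular device. In this sketch the `HasList` protocol, the `current_list` method, and both device classes are hypothetical; the two devices keep their data in different forms, standing in for products with quite different standard interfaces.

```python
from typing import Callable, List, Protocol


class HasList(Protocol):
    """Anything that can report its current list of item labels."""

    def current_list(self) -> List[str]: ...


def read_list_aloud(device: HasList, speak: Callable[[str], None]) -> int:
    """A list-based technique: speak every available item, numbered.

    `speak` is whatever output function the device provides; here it is
    any callable that accepts a string.
    """
    items = device.current_list()
    for number, label in enumerate(items, start=1):
        speak(f"{number}. {label}")
    return len(items)


class TouchscreenKiosk:
    """Made-up device whose standard interface is on-screen buttons."""

    def __init__(self):
        self.buttons = ["Store directory", "Map", "Help"]

    def current_list(self) -> List[str]:
        return list(self.buttons)


class VirtualMall:
    """Made-up device whose standard interface is a 3D view of objects."""

    def __init__(self):
        self.objects_in_view = {
            "door-1": "Bookstore entrance",
            "fountain": "Fountain",
        }

    def current_list(self) -> List[str]:
        return list(self.objects_in_view.values())


if __name__ == "__main__":
    for device in (TouchscreenKiosk(), VirtualMall()):
        read_list_aloud(device, print)
```

The technique is written once and reused unchanged; each device only has to expose its list.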

4.1. Application

The Common Table or List approach has been used to develop very flexible interfaces for multimedia kiosks. It has been used commercially to create kiosks at the Mall of America that are accessible to people with low vision or blindness, who are hard of hearing or deaf, who have difficulty reading or cannot read, or who have limited reach or manipulation. It can also be used to address other applications that largely deal with navigating among objects and either picking or viewing them, including 3D shopping applications. There is still much work to be done to handle the more complex graphic user interfaces, interactive and animated interfaces, and immersive environments.

More information about all of the topics in this paper can be found at Designing an Accessible World at

This work was funded in part by Grant #H133E30012 from the National Institute on Disability and Rehabilitation Research, US Department of Education. The views expressed in this paper do not necessarily reflect those of NIDRR.