Experimental Text Captioning on the Grid
The Trace Center has developed an experimental text captioning service, which is now available to the Access Grid community. T-Trans, a speech-to-text modality translation service, is integrated into the Grid-based collaborative visualization workspaces to allow participation by individuals who are deaf or have hearing impairments. It will also help participants who find themselves in noisy environments, who have poor or lost audio, who are distracted by side conversations, or who have difficulty following a presentation due to language differences. An additional benefit is the availability of a transcript for archiving, indexing, and searching the audio portion of a presentation or interaction.
The T-Trans system currently relies on remote human captioners from Caption First, in Franklin, Illinois. The text is displayed on a screen near the presenter. The system allows a content expert at the presentation (or remotely) to correct errors in the provided text, particularly for names and technical terms. The translation from speech to text is done using a Computer-Assisted Real-time Transcription (CART) keyboard and system, which is the most accurate process available today. In the future, this service could be provided by automated, speaker-independent speech recognition systems running on Grid resources, while still allowing local human correction of names or terms.
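The correction step described above can be illustrated with a minimal sketch in Python. It is not the T-Trans implementation: the function name `apply_corrections` and the idea of a correction glossary kept by the content expert are assumptions made purely for illustration. The sketch replaces whole-word mis-transcriptions of names and technical terms in a line of caption text.

```python
import re


def apply_corrections(caption: str, corrections: dict[str, str]) -> str:
    """Replace mis-transcribed names or terms with the expert's corrections.

    `corrections` is a hypothetical glossary mapping a wrong phrase to the
    right one. Matching is case-insensitive and limited to whole words.
    """
    if not corrections:
        return caption
    # Try longer phrases first so a multi-word entry wins over a shorter one.
    keys = sorted(corrections, key=len, reverse=True)
    pattern = re.compile(
        r"\b(?:" + "|".join(re.escape(k) for k in keys) + r")\b",
        re.IGNORECASE,
    )
    lookup = {k.lower(): v for k, v in corrections.items()}
    return pattern.sub(lambda m: lookup[m.group(0).lower()], caption)


if __name__ == "__main__":
    glossary = {"axis grid": "Access Grid", "tea trans": "T-Trans"}
    print(apply_corrections("Welcome to the axis grid demo of tea trans.", glossary))
```

A glossary of this kind would let one correction apply to every later occurrence of the same name, which matters in a live session where the same terms recur.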
The Trace Center is currently working with the NCSA Alliance Expedition for Scientific Workspaces of the Future (SWOF) team to make T-Trans available for Access Grid sessions. For more information, or to download and schedule a session, please visit the T-Trans Modality Translation Service page. This effort is the first step towards the implementation of the Translation, Modality Transformation, and Assistance concept described on the reverse side.
The Trace Center is part of the College of Engineering at the University of Wisconsin-Madison, and is a member of the Education, Outreach and Training (EOT) team of the National Computational Science Alliance (funded by the National Science Foundation). This work is also supported in part by the Rehabilitation Engineering Research Center on Telecommunication Access (grant #H133E990006) funded by the National Institute on Disability and Rehabilitation Research.
Beyond Text Captioning - Translation and Modality Translation for the Grid
The spectrum of services envisioned includes:
- Speech recognition (speech to text) - for people who are hard of hearing or deaf (so they could see what is being voiced); participants with mobile Internet devices with restricted input and output capabilities; or for someone who would like a transcript of a conversation or meeting.
- Sign Language (speech to sign language) and Sign Language Recognition (sign language to speech) - for individuals who are deaf and communicate better or more efficiently using sign language.
- International Language Translation - for cross-cultural or trans-global communications.
- Assistance/Mentoring - for providing expertise or assistance to people with cognitive impairments or people in a mentoring or learning environment; or for tapping into specialized world-class expertise (e.g. for medical doctors, scientists).
- Language Simplification - for individuals with cognitive or language impairment who encounter complex content in conversations or documents; or for a general audience member listening to highly specialized professionals.
- Print Recognition and Image/Video Description - for people with visual impairments to access visual information (e.g. charts, or video clips presented in a meeting); and for generating an electronic index for visual documents.
Network Enabled Personal Services on Demand
Some of these services might be available in fully automated form in the near future, while others will not be for a very long time (decades). Also, services that become automatic in the near future may not work in all environments. A "try harder" feature is therefore proposed in the infrastructure that would allow users to move easily from inexpensive local automatic services (such as speech-to-text) to more sophisticated (and expensive) network-based automation, or to even more expensive human assistance as needed, instantly and on demand. "Remote services on demand" also provides an efficient and flexible service model by eliminating the need for travel and long-term arrangements, and offers job opportunities for people with disabilities.
For more information, please visit the Modality Translation Services Program page.
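The "try harder" escalation described above can be sketched as a simple tier selector in Python. This is a hypothetical illustration, not part of the proposed infrastructure: the `Tier` class, the `try_harder` function, the tier names, and the relative cost figures are all assumptions. The sketch orders the available services by cost and, on each request, moves the user to the next more capable (and more expensive) one.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass(frozen=True)
class Tier:
    name: str    # hypothetical service label
    cost: float  # hypothetical relative cost, used only for ordering


def try_harder(tiers: Sequence[Tier], current: Optional[str] = None) -> Optional[Tier]:
    """Return the next more expensive tier after `current`.

    With no current tier, return the cheapest one. Return None when the
    most expensive tier is already in use. Raises ValueError if `current`
    does not name one of the tiers.
    """
    ordered = sorted(tiers, key=lambda t: t.cost)
    if current is None:
        return ordered[0] if ordered else None
    names = [t.name for t in ordered]
    position = names.index(current)
    if position + 1 < len(ordered):
        return ordered[position + 1]
    return None


if __name__ == "__main__":
    services = [
        Tier("human-captioner", 100.0),
        Tier("local-speech-to-text", 1.0),
        Tier("network-speech-to-text", 10.0),
    ]
    tier = try_harder(services)
    while tier is not None:
        print(tier.name)
        tier = try_harder(services, tier.name)
```

Keeping escalation as a single step, rather than jumping straight to human assistance, reflects the cost ordering in the text: the user pays for a more expensive service only when the cheaper one proves inadequate.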