System and method for the secure, real-time, high accuracy conversion of speech to text
United States 7,539,086
The system is designed to interface with external devices and services, to transcribe audio that may be stored elsewhere, such as a wireless phone's voice mail, or that occurs between two or more parties, such as a conference call. An audio stream is separated into many audio shreds, each of which has a duration of only a few seconds and cannot reveal the context of the conversation. A workforce of geographically distributed transcription agents who transcribe the audio shreds is able to generate a transcription in real time, with many agents working in parallel on a single conversation. No one agent (or group of agents) receives a sufficient number of audio shreds to reconstruct the context of any conversation. The use of human transcribers allows the system to overcome limitations typical of computer-based speech recognition and permits accurate transcription of general-quality speech even in acoustically hostile environments.
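
The shredding and distribution described above can be illustrated with a minimal sketch. It is not the patented implementation. The names (Shred, shred_stream, assign_shreds), the fixed 3-second shred length, the per-agent cap, and the rule that consecutive shreds go to different agents are all assumptions made for illustration. The abstract states only that shreds last a few seconds and that no agent receives enough of them to reconstruct the conversation. The sketch caps individual agents only and does not model groups of agents.

```python
import random
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Shred:
    """A short slice of one conversation's audio (hypothetical structure)."""
    conversation_id: str
    index: int
    start_s: float
    end_s: float


def shred_stream(conversation_id, duration_s, shred_len_s=3.0):
    """Cut a stream of duration_s seconds into consecutive short shreds."""
    shreds = []
    start = 0.0
    while start < duration_s:
        end = min(start + shred_len_s, duration_s)
        shreds.append(Shred(conversation_id, len(shreds), start, end))
        start = end
    return shreds


def assign_shreds(shreds, agents, max_per_agent, seed=None):
    """Map each shred index to an agent.

    Assumed policy: an agent handles at most max_per_agent shreds of the
    conversation and never receives two consecutive shreds.
    """
    rng = random.Random(seed)
    counts = defaultdict(int)
    assignment = {}
    previous_agent = None
    for shred in shreds:
        eligible = [
            agent for agent in agents
            if counts[agent] < max_per_agent and agent != previous_agent
        ]
        if not eligible:
            raise ValueError("not enough agents to satisfy the assignment policy")
        agent = rng.choice(eligible)
        counts[agent] += 1
        assignment[shred.index] = agent
        previous_agent = agent
    return assignment


if __name__ == "__main__":
    shreds = shred_stream("call-1", duration_s=10.0)
    agents = ["agent-a", "agent-b", "agent-c", "agent-d", "agent-e"]
    for index, agent in assign_shreds(shreds, agents, max_per_agent=1).items():
        print(index, agent)
```

With these assumed defaults, a 10-second stream yields four shreds, the last one shorter than the rest. Because each shred is assigned independently, many agents can transcribe in parallel, and the completed text can be reassembled in order from the shred indices.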