TalkTo™ is a suite of proprietary Machine Learning & Microphone Processing algorithms that enables voice control in the noisy & unpredictable spaces of everyday life.



TalkTo Variants

Voice-controlled products span the spectrum of embedded design, and so does TalkTo. Each TalkTo variant is scaled and optimized for the capabilities and requirements of the target application.

Smart Speaker


Set Top Box

Battery Optimized Devices


Multi-channel Soundbar

Smart Home / IoT



TV / Smart Screen







Anatomy of a Voice-Controlled Product



Audio Front End (AFE)

Conceptually, the AFE serves as a microphone cleaner. It uses the raw, noisy audio from the microphones to detect, extract, and clean any speech-activity found in the ambient sound field.  A cleaned, reconstructed voice-stream is output to the Wake Word Engine.

Wake Word Engine (WWE)

The Wake Word Engine continuously scans the AFE-provided voice-stream, looking for a specific phrase (e.g. "Alexa"). WWE accuracy is one of the primary drivers of UX quality, and it depends heavily on the purity of the AFE-generated voice-stream.

Back End Services

Once the wake-word is detected, the extracted voice-command is passed to a cloud-based or local voice-service, which converts speech to text (ASR) and determines user intent (NLU). Again, accuracy relies on a high-quality voice-stream from the Audio Front End.
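The three stages above form a simple chain: AFE, then WWE, then back end. The sketch below illustrates that data flow only. It is not the TalkTo API: `VoicePipeline`, `average_mics`, `loud_frame_detector`, `count_frames`, and the `command_frames` parameter are hypothetical names, and the toy stand-ins (channel averaging, a peak threshold) are assumptions chosen purely to keep the example runnable.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, List

Frame = List[float]  # one block of audio samples for a single channel


@dataclass
class VoicePipeline:
    """Hypothetical wiring of the AFE -> WWE -> back-end chain."""
    afe: Callable[[List[Frame]], Frame]   # multi-mic block -> one cleaned frame
    wwe: Callable[[Frame], bool]          # cleaned frame -> wake-word detected?
    backend: Callable[[List[Frame]], str] # command frames -> result (ASR/NLU)
    command_frames: int = 2               # assumed fixed command length, in frames

    def run(self, mic_blocks: Iterable[List[Frame]]) -> List[str]:
        results: List[str] = []
        command: List[Frame] = []
        remaining = 0  # frames still to capture after a wake-word hit
        for block in mic_blocks:
            clean = self.afe(block)  # every block passes through the AFE first
            if remaining > 0:
                # After the wake-word, cleaned frames are collected as the command.
                command.append(clean)
                remaining -= 1
                if remaining == 0:
                    results.append(self.backend(command))
                    command = []
            elif self.wwe(clean):
                remaining = self.command_frames
        return results


# Toy stand-ins (assumptions, not real signal processing):
def average_mics(block: List[Frame]) -> Frame:
    """Pretend AFE: average the microphone channels sample by sample."""
    return [sum(samples) / len(block) for samples in zip(*block)]


def loud_frame_detector(frame: Frame) -> bool:
    """Pretend WWE: fire when the frame peak exceeds 0.9."""
    return max(abs(x) for x in frame) > 0.9


def count_frames(frames: List[Frame]) -> str:
    """Pretend voice service: report how many frames it received."""
    return f"command of {len(frames)} frames"


pipeline = VoicePipeline(average_mics, loud_frame_detector, count_frames)
```

Because both the WWE and the back end only ever see the AFE output in this arrangement, any noise the AFE leaves in the voice-stream reaches every later stage, which is why the text stresses AFE quality.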

Speaker Processing

For a high-quality voice-experience, products with audio-output(s) must have low speaker-distortion. To achieve this, the output-audio is processed to ensure it doesn't exceed the physical capabilities of the speaker(s). Audio Weaver® is a fast-path solution to this problem.
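As a rough illustration of keeping output audio within a speaker's limits, here is a minimal block-based peak limiter. It is a generic textbook-style sketch, not how Audio Weaver works: the function name `limit_blocks`, the `ceiling` and `release` parameters, and their default values are all assumptions, and a production design would typically be considerably more sophisticated.

```python
from typing import List


def limit_blocks(blocks: List[List[float]],
                 ceiling: float = 0.8,
                 release: float = 0.5) -> List[List[float]]:
    """Scale each block so its peak never exceeds `ceiling`.

    Gain drops immediately when a block is too loud (instant attack) and
    recovers gradually toward unity afterwards (exponential release).
    """
    if ceiling <= 0 or not 0 < release <= 1:
        raise ValueError("ceiling must be > 0 and release must be in (0, 1]")
    gain = 1.0
    out: List[List[float]] = []
    for block in blocks:
        peak = max((abs(x) for x in block), default=0.0)
        # Largest gain that keeps this block at or under the ceiling.
        target = 1.0 if peak <= ceiling else ceiling / peak
        if target < gain:
            gain = target                     # clamp down at once
        else:
            gain += release * (target - gain) # ease back up, never past target
        out.append([x * gain for x in block])
    return out
```

Since the gain never rises above the per-block target, every output sample stays at or below `ceiling` in magnitude; the gradual release avoids abrupt jumps in level after a loud passage.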





TalkTo Performance


The video below shows the 6-microphone, Amazon-qualified TalkTo Set Top Box AFE running on an Amlogic A113 board. The system is running stand-alone, directly below the TV's speakers (playout at 90+ dBC!). Even without the closed-loop benefit of acoustic echo cancellation (AEC), since there is no reference signal to cancel against, the TalkTo AFE is still able to detect and extract faint voice-commands from the ambient sound-field, allowing the WWE and Voice Services to deliver a robust user-experience even in the harshest of conditions!






For more information or to schedule a live demo, please contact us.