Speech Recognition – Smart Microphone – Jetson Development Kits

This article starts a new series on Speech Recognition. A “smart microphone” is an array of microphones with special signal processing hardware to locate and isolate speech, even in noisy environments. Looky here:

Background

All the cool kids now have in-home, voice-activated devices like Amazon Echo or Google Home. These devices can play your favorite music, answer questions, read books, control home automation, and do all those other things people in the 1960s thought the future was about. For the most part, the speech recognition on the devices works well, although you may occasionally find yourself with an extra dollhouse or two.

One of the enabling technologies of these devices is the microphone array. Several microphones are placed in a circle, and their output is sent to a Digital Signal Processor, or DSP for short. The DSP runs several specialized algorithms: localization detects where a voice originates, and audio beamforming focuses on that direction while reducing echo and reverberation in the signal. The result is an audio stream that is an accurate representation of the original voice.
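To make localization and beamforming a little more concrete, here is a minimal textbook-style sketch in plain Python. It is not the algorithm the DSP in any particular product runs. It assumes just two microphones and a far-away sound source, and the function names, the 16 kHz sample rate, and the 5 cm microphone spacing used in the example are all made up for illustration. The idea: the same sound reaches each microphone at a slightly different time, so finding that delay gives a direction, and undoing the delay before averaging the channels reinforces the voice.

```python
import math

SPEED_OF_SOUND = 343.0  # m/s, approximate value for room-temperature air


def estimate_delay(ref, other, max_lag):
    """Return the lag (in samples) at which `other` best lines up with `ref`.

    A positive result means the sound reached `other` later than `ref`.
    This is a brute-force cross-correlation over lags in [-max_lag, max_lag].
    """
    best_lag, best_score = 0, float("-inf")
    for lag in range(-max_lag, max_lag + 1):
        score = 0.0
        for i, r in enumerate(ref):
            j = i + lag
            if 0 <= j < len(other):
                score += r * other[j]
        if score > best_score:
            best_lag, best_score = lag, score
    return best_lag


def arrival_angle(lag, sample_rate, mic_spacing):
    """Convert a two-microphone delay into an angle in degrees.

    0 degrees means the source is straight ahead (equidistant from both mics).
    """
    ratio = (lag / sample_rate) * SPEED_OF_SOUND / mic_spacing
    ratio = max(-1.0, min(1.0, ratio))  # clamp against rounding and noise
    return math.degrees(math.asin(ratio))


def delay_and_sum(channels, lags):
    """Undo each channel's lag, then average the aligned samples."""
    n = len(channels[0])
    out = []
    for i in range(n):
        total, count = 0.0, 0
        for channel, lag in zip(channels, lags):
            j = i + lag
            if 0 <= j < n:
                total += channel[j]
                count += 1
        out.append(total / count if count else 0.0)
    return out


# Hypothetical example: the same short pulse, arriving 3 samples later at mic B.
mic_a = [0.0] * 20
mic_b = [0.0] * 20
mic_a[5], mic_a[6] = 1.0, 0.5
mic_b[8], mic_b[9] = 1.0, 0.5

lag = estimate_delay(mic_a, mic_b, max_lag=5)
aligned = delay_and_sum([mic_a, mic_b], [0, lag])
```

A real array does this across all of its microphones, continuously, and combines it with echo cancellation and dereverberation, which is why dedicated DSP hardware is used.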

Once a suitable audio stream has been acquired, the stream can be either processed locally or sent to a server for further processing. In the case of something like an Amazon Echo, a local processor “listens” to the incoming audio stream for a keyword trigger, e.g. “Alexa”. Once the keyword has been identified, the rest of the audio stream is sent to online servers, which run speech recognition on the stream and then parse the recognized speech into “actions”. The service then sends the action back to the device. These actions vary from device to device, but typically allow the user to ask the device to play music, control home automation devices, or answer questions. Amazon, Google and Microsoft all have APIs for sending audio to their online services.
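The flow described above can be sketched as a small state machine. This is only an illustration of the control flow, not any vendor's API: `run_assistant` and its three callback parameters are hypothetical names, and in the example plain strings stand in for audio frames so the sketch can run without a microphone or a network connection.

```python
def run_assistant(frames, is_wake_word, is_silence, recognize_remote):
    """Yield one action for each utterance that follows a wake word.

    frames           -- iterable of audio frames (any objects)
    is_wake_word     -- local keyword detector, checked on every frame while idle
    is_silence       -- detector for the pause that ends an utterance
    recognize_remote -- stand-in for the online service: takes the buffered
                        frames and returns an action
    """
    listening = False
    utterance = []
    for frame in frames:
        if not listening:
            # Idle: nothing leaves the device until the keyword is heard.
            if is_wake_word(frame):
                listening = True
                utterance = []
        elif is_silence(frame):
            # End of the request: hand the buffered audio to the service.
            if utterance:
                yield recognize_remote(utterance)
            listening = False
        else:
            utterance.append(frame)
    # The stream ended in the middle of a request; send what we have.
    if listening and utterance:
        yield recognize_remote(utterance)


# Hypothetical example with words standing in for audio frames.
sent_to_service = []


def fake_service(utterance):
    sent_to_service.append(list(utterance))
    return {"action": utterance[0], "argument": " ".join(utterance[1:])}


frames = ["tv", "noise", "alexa", "play", "jazz", "", "more", "noise"]
actions = list(
    run_assistant(
        frames,
        is_wake_word=lambda f: f == "alexa",
        is_silence=lambda f: f == "",
        recognize_remote=fake_service,
    )
)
```

Keeping the keyword detector local means only the audio that follows the trigger is ever handed to the remote recognizer.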

The online services have large databases of speech, which they use with machine learning techniques to train their speech recognizers. You may have noticed that many of the online services have become significantly better at recognizing speech over the last couple of years. This improvement is mostly due to advances in machine learning.

Speech Recognition for the rest of us

The consumer devices are interesting, and the technology for smart microphones is now available separately from several manufacturers. The video shows a Seeed Studio ReSpeaker, which was ordered through a Kickstarter campaign.

The Far Field Microphone Array is built around an XVSM-2000 chip from XMOS. Watch the video for a rundown of the rest of the fun hardware that is available on the ReSpeaker, with sprinkles like RGB LEDs and an Arduino-type processor. The Jetson can talk to either the ReSpeaker Core or the Microphone Array over USB.

Conclusion

Over the course of the next few articles, we’ll figure out how to interface with the Microphone Array, gather the audio stream, and then perform speech recognition both locally and through online services.

6 Comments

    • There are several manufacturers of such devices. The one in the video is a ReSpeaker from Seeed Studio. To me, the Mic Array ($79 USD) is the interesting piece (https://www.seeedstudio.com/ReSpeaker-Mic-Array-Far-field-w%2F-7-PDM-Microphones-p-2719.html). The ReSpeaker Core has 1 mic in the middle, along with a WiFi chip, LEDs and such. The Mic Array appears to be able to run independently of the Core, and has 7 mics, LEDs, and all the DSP goodies built in. But I have just started work on getting the Microphone Array to work on the Jetson. When the Mic Array is plugged into the Jetson, it does act as a USB mic, so at least it appears to work.

  1. Hello,
    Did you test recognition?
    Could you do a video with noise, and play back a recorded sample so we can hear the echo cancellation and other goodies?
    Christmas is past, but we always need to hope!! Lol
    Thanks Jetson hacks for all your posts. (I’m a happy owner of a TX1, and it’s due to you. Thanks again, and don’t stop.)
    PS: I’ll bother you regularly, lol.
    Vincent

    • The microphone array can be used standalone, or attached to the ReSpeaker Core. The headers come attached to the mic array, no soldering necessary.
      The Mic Array board can be used independently either through the micro-USB connection or through the headers.
