Numerous mobile and smart home devices now use voice detection technologies. One of the crucial features of these voice interfaces is the so-called keyword spotting system used to “wake up” this type of voice detector. Can one quickly build one's own speech recognition detector using free data and open-source software packages? Can one achieve this with a relatively simple, low-memory model that can run on mobile devices? In this talk, we will present our journey of building a keyword spotting speech recognition model with Convolutional Neural Networks (CNNs) and open-source data from the TensorFlow and AIY Speech Commands Dataset. We will present our CNN models and show their performance. We will also describe what we have learned about assembling a suitable computing (CPU/GPU/memory) architecture for this type of computational problem at a lower cost. A brief history of Machine Learning and Neural Networks/Deep Learning will also be reviewed.
Neural Networks: A Speech Recognition Journey