The blog has moved to http://carlitoscontraptions.com/.

June 23, 2007

Speech Recognition Using FPGA Technology

My friends David and Kanwen and I implemented a speech recognition system on an FPGA development board (Altera DE2 Board) for the Design Project course at McGill (ECSE 494). We did this in two steps: first we wrote a prototype of the algorithm in MATLAB (I may port it to Octave later), and then we wrote the hardware description for the FPGA.

MATLAB Prototype

Inspired by the algorithm described on a site from the University of Toronto, we wrote two MATLAB scripts: train.m and recogniz.m.

train.m deals with the training phase, in which many versions of a sound (a spoken word, for instance) are input and averaged in the frequency domain, thus generating the sound's “reference fingerprint”.
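As a rough illustration, the training step could be sketched in Python as follows. This is not the original train.m: the function names are made up for the example, and I am assuming that the average is taken over the magnitude spectra of equal-length recordings. A naive DFT is used so the sketch needs nothing beyond the standard library.

```python
import cmath


def magnitude_spectrum(samples):
    """Return the DFT magnitudes for bins 0..N/2 of a real signal.

    Naive O(N^2) DFT, good enough for a short illustration.
    """
    n = len(samples)
    spectrum = []
    for k in range(n // 2 + 1):
        acc = sum(x * cmath.exp(-2j * cmath.pi * k * t / n)
                  for t, x in enumerate(samples))
        spectrum.append(abs(acc))
    return spectrum


def train(recordings):
    """Average the magnitude spectra of several equal-length recordings.

    The result plays the role of the "reference fingerprint".
    Assumption: magnitudes (not complex values) are averaged.
    """
    spectra = [magnitude_spectrum(r) for r in recordings]
    count = len(spectra)
    return [sum(bins) / count for bins in zip(*spectra)]
```

For example, train([take1, take2, take3]) would return one averaged spectrum for three recordings of the same word, provided they have all been cut to the same length.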

recogniz.m deals with the recognition phase, where a sound is input, translated to the frequency domain (i.e. its fingerprint is generated), and compared to the reference fingerprint by computing the Euclidean distance between them (as if both fingerprints were vectors).
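The comparison itself is simple enough to sketch in a few lines of Python. Again, this is only an illustration and not the original recogniz.m: the is_match helper and its threshold parameter are assumptions of mine, since the post does not say how the distance is turned into a yes/no decision.

```python
import math


def euclidean_distance(fingerprint, reference):
    """Euclidean distance between two fingerprints treated as vectors."""
    if len(fingerprint) != len(reference):
        raise ValueError("fingerprints must have the same length")
    return math.sqrt(sum((a - b) ** 2
                         for a, b in zip(fingerprint, reference)))


def is_match(fingerprint, reference, threshold):
    """Hypothetical decision rule: accept when the distance is small enough."""
    return euclidean_distance(fingerprint, reference) <= threshold
```

The smaller the distance, the closer the input sound is to the trained word. A suitable threshold would have to be found by experiment.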

Both scripts need to detect the beginning of the sound (i.e. tell when the spoken word begins). They do so by averaging two adjacent groups of 1024 sound samples (in the time domain) and computing the difference between the averages. So, if there is a sudden increase in the sound's amplitude, the difference will be significant and the sound is assumed to start after that sudden increase. The sound's length is fixed to 1.024 s (see the picture below for more details).
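Here is a minimal Python sketch of that start detector. It rests on a few assumptions that are not spelled out above: the "average" is taken over absolute sample values (a plain mean of an audio signal stays close to zero), the groups do not overlap, and the threshold value is just a placeholder.

```python
def mean_abs(block):
    """Mean absolute amplitude of a block of samples."""
    return sum(abs(s) for s in block) / len(block)


def find_onset(samples, group=1024, threshold=10.0):
    """Return the index where the sound is assumed to start, or None.

    Compares each group of `group` samples with the next one and reports
    the start of the louder group when the average jumps by more than
    `threshold` (a fixed, hypothetical value).
    """
    for start in range(0, len(samples) - 2 * group + 1, group):
        previous = mean_abs(samples[start:start + group])
        current = mean_abs(samples[start + group:start + 2 * group])
        if current - previous > threshold:
            return start + group
    return None
```

Because the threshold is fixed, a very quiet recording may never trigger the detector, in which case the function returns None.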

Note that the scripts use 16-bit WAV files as input @ 22050 Hz (this is the default Windows Sound Recorder output; I could not record in Linux because the mic would not work). The sound input is downsampled and quantized in order to get it down to 8 bits/sample @ 5 kHz for processing.
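The downsampling and quantization could be sketched like this in Python. The details are assumptions: I use a decimation factor of 4 (22050 Hz / 4 = 5512.5 Hz, which is only roughly 5 kHz), a plain average as a crude low-pass filter, and a simple division by 256 to go from 16 to 8 bits. The original scripts may well do this differently.

```python
import math


def downsample(samples, factor=4):
    """Average each run of `factor` samples into a single sample.

    Averaging acts as a crude low-pass filter before decimation.
    Leftover samples at the end are dropped.
    """
    usable = len(samples) - len(samples) % factor
    return [sum(samples[i:i + factor]) / factor
            for i in range(0, usable, factor)]


def quantize_to_8bit(samples):
    """Map signed 16-bit values (-32768..32767) to signed 8-bit (-128..127)."""
    return [max(-128, min(127, math.floor(s / 256))) for s in samples]
```

Calling quantize_to_8bit(downsample(wav_samples)) on the raw 16-bit samples gives a signal that is a quarter of the length and small enough for cheap processing.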

Also, you might encounter problems if the sound file is too short (it should last for more than 1.1 s), or if its volume level is too low (this happens because the detector threshold is fixed).

Hardware Implementation

Once we had played enough with the MATLAB prototype parameters, we mapped the algorithm into combinational logic and finite state machines (FSM) by breaking it down into independent modules.

For more details about the hardware implementation and the project in general you can read the full project report. You may also want to see the slides for a presentation we did (below).



Unfortunately, I cannot post the project files (i.e. VHDL code).

Here is a little video demo, enjoy:




Note that all the documentation for this project was done using the very excellent OpenOffice.org.

7 comments:

elio said...

hello
I'm elio, philippines...
I'm having a hard time looking for a good reference book discussing thoroughly FPGA. i saw your project and i was amazed on how it works. what books must I acquire to understand more the program. if you have the books, can i download it and have it too?
Thank you...
God Bless

Carlos said...

I don't know of any book dealing specifically with FPGAs. The book I used for the Digital System Design class (the one where I learned to use FPGAs) was Fundamentals of Digital Logic with VHDL Design, Second Edition. I don't really remember reading it, but it looks like I did. I think you could try to get it online (although it may be illegal). Please send me an email if you need more info on the book.

elio said...

thank you...
by the way, may i know who's the author of the book?

Carlos said...

The authors are Stephen Brown and Zvonko Vranesic. Here is a link to the book's website: http://highered.mcgraw-hill.com/sites/0072460857/

sarfaraz said...

Hi, I am sarfaraz. I read this project and found it very good, and I have taken an idea from it: I now want to implement this speech recognition method using VHDL and an FPGA for car locking, unlocking and ignition, which is my final year project. Please help me with this. I hope you will.
my email addresses are
sarfarazattariciit@gmail.com

sarfaraz_attari_ciit@yahoo.com

tahder said...

Hi Carlos! I am new to FPGA (had just started early this year) and I find your blog interesting and helpful. btw, I had added your blog site in my links. Here's my url
http://fpga-dsp-scratch.blogspot.com/

Thank you so much for sharing! Your site helps many :).

saurabh chadha said...

Sir, I have read your entire project and found it very interesting. I am also working on FPGA-based voice recognition, using the same Altera DE2 board. I am having problems programming the codec: we have to load the control word into the codec. How can I check the output of the ADCDAT pin, where the digital data will be? Please help me in this regard.