Gerome Jan Llames
MEECE - CCO
Project in MEE 1231: Digital Signal Processing
Yes or No Speech Recognition
Objectives:
1. To build a program, using Digital Signal Processing, that detects whether a speech
signal is a Yes or a No.
2. To identify the differences between a yes signal and a no signal.
INTRODUCTION
According to Wikipedia, in computer science and electrical engineering, speech recognition (SR)
is the translation of spoken words into text. It is also known as "automatic speech recognition" (ASR),
"computer speech recognition", or just "speech to text" (STT). Some SR systems use "speaker
independent speech recognition" while others use "training" where an individual speaker reads sections
of text into the SR system. These systems analyze the person's specific voice and use it to fine-tune the
recognition of that person's speech, resulting in more accurate transcription. Systems that do not use
training are called "speaker independent" systems. Systems that use training are called "speaker
dependent" systems.
Speech recognition applications include voice user interfaces such as voice dialing (e.g. "Call
home"), call routing (e.g. "I would like to make a collect call"), domotic appliance control, search (e.g.
find a podcast where particular words were spoken), simple data entry (e.g., entering a credit card
number), preparation of structured documents (e.g. a radiology report), speech-to-text processing
(e.g., word processors or emails), and aircraft (usually termed Direct Voice Input).
Although speech recognition in general is a very complex problem, even a simple program can
distinguish between the two words yes and no. Any two words could be used for a project like
this, but yes and no were chosen because real systems perform exactly this task. For example, calls
to a company and telephone surveys are often handled by automated systems that ask the person
questions and attempt to determine the response using speech recognition. In a situation where the
question is answered by yes or no, a yes/no speech recognition system is very useful.
DIGITAL SIGNAL PROCESSING SYSTEM
System Workflow
This project runs in real time. It starts with the speech signal, yes or no, captured with the
built-in microphone of the computer. The signal is then processed in MATLAB, where the Fast Fourier
Transform (fft) is applied and an f value is obtained. Finally, the program displays Yes if the f value
is below the threshold and No otherwise.
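
The workflow above can be sketched in a few lines. The original project was written in MATLAB and its code is not included in this report, so the following is only a Python sketch under stated assumptions. The f value is taken here to be the ratio of spectral energy below a cutoff frequency to the energy above it, a hypothetical definition chosen so that yes, with its high-frequency 's', scores low. The names fft, f_value and classify, the 3000 Hz cutoff and the synthetic test signals are all illustrative. Only the threshold of 12 comes from the report.

```python
import cmath
import math


def fft(x):
    """Recursive radix-2 FFT; len(x) must be a power of two."""
    n = len(x)
    if n == 1:
        return [complex(x[0])]
    even = fft(x[0::2])
    odd = fft(x[1::2])
    out = [0j] * n
    for k in range(n // 2):
        t = cmath.exp(-2j * cmath.pi * k / n) * odd[k]
        out[k] = even[k] + t
        out[k + n // 2] = even[k] - t
    return out


def f_value(samples, fs, cutoff_hz=3000.0):
    """ASSUMED feature: low-band energy divided by high-band energy."""
    n = len(samples)
    spectrum = fft(samples)
    low = 0.0
    high = 0.0
    for k in range(1, n // 2):  # skip DC, use positive frequencies only
        energy = abs(spectrum[k]) ** 2
        if k * fs / n < cutoff_hz:
            low += energy
        else:
            high += energy
    return low / high if high > 0 else float("inf")


def classify(samples, fs, threshold=12.0):
    """Below the threshold is 'Yes', otherwise 'No' (threshold from the report)."""
    return "Yes" if f_value(samples, fs) < threshold else "No"


# Synthetic stand-ins for recordings (not real speech).
fs = 44100
n = 4096
t = [i / fs for i in range(n)]
no_like = [math.sin(2 * math.pi * 1000 * ti) for ti in t]
yes_like = [
    math.sin(2 * math.pi * 1400 * ti) + 0.5 * math.sin(2 * math.pi * 6000 * ti)
    for ti in t
]

print(classify(yes_like, fs))
print(classify(no_like, fs))
```

With these synthetic inputs, the signal that contains a 6000 Hz component is labelled Yes and the pure 1000 Hz tone is labelled No. Real recordings would need trimming and zero-padding to a power-of-two length before this simple fft could be used.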
DATA ANALYSIS
The two tables show the data collected from 8 real-time tests of each word. They list the voice, the
approximate frequency, the approximate amplitude and the f value of each recording. From these data, the
word yes has more content at higher frequencies than the word no, while the amplitude is on average
higher when no is spoken. With the threshold set at 12, every yes recording has an f value below the
threshold and every no recording has one above it, so all 16 recordings are classified correctly
(100% accuracy).
Voice         Frequency    Amplitude    f value
YesMine       ~ 1400 Hz    ~ 0.3         5.9218
YesBaruc      ~ 1800 Hz    ~ 0.25        1.8685
YesAyn        ~ 1600 Hz    ~ 0.6         4.2221
YesEra        ~ 1400 Hz    ~ 1.0         7.2813
YesGeorge     ~ 1400 Hz    ~ 0.45        5.5708
YesSoheib     ~  600 Hz    ~ 0.35        9.6850
YesJaybee     ~ 1800 Hz    ~ 0.7         3.7257
YesJudilyn    ~ 1400 Hz    ~ 0.3         5.3822
Table 1: Data analysis of Yes
Voice         Frequency    Amplitude    f value
NoMine        ~ 1000 Hz    ~ 0.4        21.3383
NoBaruc       ~ 1000 Hz    ~ 0.4        18.6965
NoAyn         ~  900 Hz    ~ 0.7        18.0442
NoEra         ~ 1200 Hz    ~ 0.4        20.2817
NoGeorge      ~ 1000 Hz    ~ 0.9        25.5547
NoSoheib      ~  450 Hz    ~ 0.28       18.1870
NoJaybee      ~ 1000 Hz    ~ 0.8        30.7095
NoJudilyn     ~ 1600 Hz    ~ 0.8        15.3298
Table 2: Data analysis of No
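
The accuracy claim can be checked directly from the f values in Table 1 and Table 2. The short Python sketch below applies the threshold of 12 to each value and counts the correct decisions. The variable and function names are illustrative and not taken from the MATLAB project.

```python
THRESHOLD = 12.0  # threshold stated in the report

# f values copied from Table 1 and Table 2
yes_f = {
    "YesMine": 5.9218, "YesBaruc": 1.8685, "YesAyn": 4.2221, "YesEra": 7.2813,
    "YesGeorge": 5.5708, "YesSoheib": 9.6850, "YesJaybee": 3.7257, "YesJudilyn": 5.3822,
}
no_f = {
    "NoMine": 21.3383, "NoBaruc": 18.6965, "NoAyn": 18.0442, "NoEra": 20.2817,
    "NoGeorge": 25.5547, "NoSoheib": 18.1870, "NoJaybee": 30.7095, "NoJudilyn": 15.3298,
}


def classify_f(f):
    """Decision rule from the report: below the threshold is Yes, otherwise No."""
    return "Yes" if f < THRESHOLD else "No"


# Pair each expected label with the label the rule produces.
results = [("Yes", classify_f(f)) for f in yes_f.values()]
results += [("No", classify_f(f)) for f in no_f.values()]

correct = sum(1 for expected, predicted in results if expected == predicted)
accuracy = correct / len(results)

print(f"{correct}/{len(results)} correct ({accuracy:.0%})")  # 16/16 correct (100%)
print(f"largest Yes f value:  {max(yes_f.values()):.4f}")    # 9.6850
print(f"smallest No f value:  {min(no_f.values()):.4f}")     # 15.3298
```

The largest yes value (9.6850, YesSoheib) and the smallest no value (15.3298, NoJudilyn) lie on opposite sides of 12, which is why the threshold separates these 16 recordings.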
SCREENSHOTS OF THE DATA IN TABLE 1 & TABLE 2
[Screenshots: one plot per recording, YesMine to YesJudilyn and NoMine to NoJudilyn]
HOW TO RUN THE PROGRAM?
Step 1: Run the YesOrNoRecorder m-file. "YesOrNoRecorder"
Step 2: Run the Yes_Or_No_Project function file with a sampling frequency of 44100 Hz, which is more
than twice the highest frequency in the speech signal, as the sampling theorem requires. "Yes_Or_No_Project(x,44100)"
Watch the video here: http://www.youtube.com/watch?v=EOcp7pxQOBA&feature=youtu.be
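
As a side note on the sampling frequency in Step 2, a sampling rate fs can represent frequencies only up to fs / 2 (the Nyquist frequency), and an fft of N samples has a frequency spacing of fs / N between bins. The Python sketch below shows this arithmetic. The one-second recording length and the helper names are assumptions made for illustration and do not come from the project.

```python
def nyquist_hz(fs):
    """Highest frequency that a sampling rate of fs can represent."""
    return fs / 2


def bin_to_hz(k, fs, n):
    """Centre frequency of FFT bin k for an n-point transform."""
    return k * fs / n


def hz_to_bin(f, fs, n):
    """Nearest FFT bin index for frequency f."""
    return round(f * n / fs)


fs = 44100  # sampling frequency used in Step 2
n = 44100   # ASSUMED: a one-second recording

print(nyquist_hz(fs))          # 22050.0
print(bin_to_hz(1, fs, n))     # 1.0 (bin spacing in Hz)
print(hz_to_bin(1400, fs, n))  # 1400
```

All of the frequencies listed in Table 1 and Table 2 (450 Hz to 1800 Hz) are far below the 22050 Hz limit.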
LIMITATIONS
No external microphone was used; I used the built-in microphone of my laptop.
Before each recording, I made sure that there were no unnecessary loud noises. Using a
microphone with noise filtering would be a better choice.
CONCLUSION
Yes or No speech recognition is a good example of a system that involves Digital Signal
Processing. It makes use of the well-known Fast Fourier Transform (fft). Before I started the
project, I had already found some interesting facts about the two words yes and no. Yes contains an
unvoiced consonant ('s') and no contains a voiced consonant ('n'). A voiced consonant is produced
with vibration of the vocal cords, while an unvoiced consonant is produced without it. Unvoiced
consonants also have more of their energy at high frequencies than voiced ones. This was confirmed when
the testing was conducted.
The data gathered during testing show the difference between the words yes and
no. In the plots of the signal x, the yes signals had more content at high frequencies
than the no signals, because of the 's' sound. However, the amplitude of the no signals was on
average higher than that of the yes signals.
Finally, based on the data, the objectives were achieved: all 16 test recordings were classified
correctly (100% accuracy).
