Course: Digital Speech Processing
Code: 3ФЕИТ05005
ECTS points: 6 ECTS
Number of classes per week: 3+0+0+3
Lecturer: Asst. Prof. Dr. Branislav Gerazov
Course Goals (acquired competencies): The goal of the course is for students to acquire broad knowledge of techniques for the analysis, synthesis, and recognition of speech signals. It introduces the various approaches to and applications of digital speech processing through study of the state of the art.
Course Syllabus:
1. Fundamentals of digital audio: principles of digitisation, oversampling, jitter
2. Working with audio signals in the digital domain: quantisation, dither, noise shaping
3. Fourier transform, Z-transform, amplitude and phase spectrum
4. Sliding window method, short-time Fourier transform (STFT), spectrograms
5. Fundamentals of digital filters: filtering, FIR, IIR, FIR filter design
6. LP, HP, BP, BS, and notch filters; filterbanks; equalisation
7. Basics of speech production, source-filter model, LP analysis, vocoder; speech compression: LPC-10, CELP
8. Machine learning for speech signals, basics of ASR, feature extraction, MFCCs
9. DTW, HMM, GMM
10. Deep learning in ASR: NN, DNN, RNN, LSTM, CNN
11. Speaker recognition: GMM, EM, UBM, LLRM
12. Speech synthesis: concatenative, articulatory and formant synthesis; parametric synthesis with HMM and NN
13. Deep learning for speech synthesis: WaveNet
Literature:
Required Literature
No. | Author | Title | Publisher | Year |
1 | Lawrence Rabiner, Ronald Schafer | Theory and Applications of Digital Speech Processing | Pearson | 2010 |
Additional Literature
No. | Author | Title | Publisher | Year |
1 | Lawrence Rabiner, Biing-Hwang Juang | Fundamentals of Speech Recognition | Prentice Hall | 1993 |
2 | Dong Yu, Li Deng | Automatic Speech Recognition: A Deep Learning Approach | Springer | 2015 |
3 | Xuedong Huang, Alex Acero, Hsiao-Wuen Hon | Spoken Language Processing: A Guide to Theory, Algorithm and System Development | Prentice Hall | 2001 |