Computer Vision News - November 2020

Young Scientist Winner
Best of MICCAI 2020

Ultra2Speech - A Deep Learning Framework for Formant Frequency Estimation and Tracking from Ultrasound Tongue Images

Pramit Saha is a Master’s student in the Electrical and Computer Engineering department at the University of British Columbia (UBC), under the supervision of Professor Sidney Fels. He was one of only four authors awarded the title of Young Scientist at MICCAI 2020, in recognition of his paper on developing an ultrasound-based silent-speech interface. He tells us more about his work.

Silent-speech interfaces (SSIs) are devices that enable speech communication and can improve quality of life for people who have lost their voice, for example due to laryngectomy or diseases of the vocal cords. This work proposes an SSI that synthesizes speech from ultrasound tongue videos by automatically tracking the tongue contour.

When someone forms speech in their mind, it is called imagined speech. This is followed by the transfer of information through the central nervous system to the different muscles responsible for the articulatory movements of the larynx, tongue, and lips. The changing shape and position of the tongue play a significant role in shaping the sound by creating different resonances. Because of these resonances, some frequencies become more prominent than others and show up as dark bands on the spectrogram. These are called formant frequencies.

With this SSI, people can express their thoughts by moving their tongue as in normal speech, but with a non-invasive ultrasound probe beneath their chin and a virtual speech synthesizer. The ultrasound videos are mapped to speech formant frequencies using deep learning techniques, which connect the articulatory movements recorded in the ultrasound video to the acoustic responses recorded through a microphone. This predicts the sequence of formant frequencies
