
vowel triangle visualizer

A speech science demo built to engage students at the 2023 EE Summer School, IISc Bangalore

spire-lab

speech-processing

python

Overview

During an EE Summer School event, many engineering students visited research labs in the EE Department at IISc Bangalore. As a member of the Speech and Audio Processing Lab, I wanted to create an engaging demo to showcase speech processing concepts. I developed an interactive tool that visualizes a user's vowel triangle area in real time as they speak sustained vowels into a microphone.

This allowed students to see the components of speech signals and to notice the similarities and differences in formant frequencies between individuals, along with their significance. The demo was built using Python and speech processing libraries. It was a great experience to interact with the students and provide an effective hands-on learning experience.

Background

Speech signals contain resonant frequencies of the vocal tract, called formants, whose positions differ from vowel to vowel. The first two formant frequencies (F1 and F2) are the most important for distinguishing vowels. We can estimate F1 and F2 using techniques like linear predictive coding (LPC) analysis, which models the vocal tract as a filter and estimates its parameters.
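As a rough illustration of LPC-based formant estimation, the sketch below processes a single frame using the autocorrelation method in plain NumPy. It is not the demo's actual code: the function names, the 0.97 pre-emphasis coefficient, the order rule of thumb, and the 90 Hz and 400 Hz thresholds are all assumptions chosen for illustration.

```python
import numpy as np


def lpc_coefficients(frame, order):
    """Autocorrelation-method LPC. Returns [1, a1, ..., a_order]."""
    n = len(frame)
    # Autocorrelation at lags 0..order
    r = np.correlate(frame, frame, mode="full")[n - 1 : n + order]
    # Toeplitz normal equations: R a = -r[1:]
    R = np.array([[r[abs(i - j)] for j in range(order)] for i in range(order)])
    a = np.linalg.solve(R, -r[1 : order + 1])
    return np.concatenate(([1.0], a))


def estimate_formants(frame, sr, order=None, pre_emphasis=0.97,
                      min_freq=90.0, max_bandwidth=400.0):
    """Estimate formant frequencies (Hz) of one frame, sorted ascending."""
    frame = np.asarray(frame, dtype=float)
    if order is None:
        order = 2 + int(sr) // 1000  # assumed rule of thumb, not from the demo
    # Pre-emphasis boosts high frequencies; the window reduces edge effects
    emphasized = np.append(frame[0], frame[1:] - pre_emphasis * frame[:-1])
    windowed = emphasized * np.hamming(len(emphasized))
    a = lpc_coefficients(windowed, order)
    # Poles of the all-pole model; keep one of each complex-conjugate pair
    roots = np.roots(a)
    roots = roots[np.imag(roots) > 0]
    freqs = np.angle(roots) * sr / (2 * np.pi)
    bandwidths = -np.log(np.abs(roots)) * sr / np.pi
    # Discard very low or very broad resonances (assumed thresholds)
    keep = (freqs > min_freq) & (bandwidths < max_bandwidth)
    return np.sort(freqs[keep])
```

For a sustained vowel, the two lowest values returned would serve as the F1 and F2 estimates for that frame.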

However, formant estimation can be challenging in noisy and reverberant environments. Methods like pre-emphasis filtering, windowing, and smoothing can improve accuracy, but a more robust formant tracker that adapts to changes in the signal still needs to be developed. Formant analysis also has uses beyond speech processing: it can aid the diagnosis of certain voice and speech disorders.
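One simple form of the smoothing mentioned above is a running median over per-frame formant estimates, which suppresses isolated outlier frames. The sketch below is only an illustration under assumptions: the helper name median_smooth and the window width of 5 frames are hypothetical, and the sample values are made up rather than measured.

```python
import numpy as np


def median_smooth(track, width=5):
    """Running median over a formant track; width should be odd."""
    track = np.asarray(track, dtype=float)
    half = width // 2
    # Repeat the edge values so the output has the same length as the input
    padded = np.pad(track, half, mode="edge")
    windows = np.lib.stride_tricks.sliding_window_view(padded, width)
    return np.median(windows, axis=1)


# Illustrative F1 track (Hz) with one spurious frame at index 3
f1_track = [700, 710, 705, 1500, 702, 698, 704]
smoothed = median_smooth(f1_track)
```

Here the spurious 1500 Hz frame is replaced by 705 Hz, the median of its five-frame neighbourhood. A median is used instead of a moving average because a single bad frame does not pull the result toward the outlier.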

Demo Development

For this project, I used Python with libraries like Librosa to analyze live speech input and extract the first two formant frequencies. The formants are plotted on a vowel quadrilateral in real time, with the vowel space animated as the user speaks sustained vowels.

This allows the user to see their vowel triangle area and how it morphs as they move between different vowels. The interactive visual feedback was designed to be intuitive and engaging for the students.
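The vowel triangle area itself can be computed from three points in the (F1, F2) plane with the shoelace formula. The sketch below assumes the corner vowels are /a/, /i/ and /u/, which is a common convention but is not stated for this demo. The function name and the formant values are illustrative, not measurements from the event.

```python
def vowel_triangle_area(a, i, u):
    """Area (Hz^2) of the triangle whose corners are (F1, F2) pairs."""
    (x1, y1), (x2, y2), (x3, y3) = a, i, u
    # Shoelace formula for a triangle
    return 0.5 * abs(x1 * (y2 - y3) + x2 * (y3 - y1) + x3 * (y1 - y2))


# Illustrative (F1, F2) values in Hz for the assumed corner vowels
area = vowel_triangle_area(a=(700, 1200), i=(300, 2300), u=(350, 800))
print(area)  # 272500.0
```

Recomputing this area whenever a corner estimate is updated would produce a live area readout like the one described above.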

[Figure: vowel triangle visualisation]

Outcomes

The students were eager to try the demo and enjoyed seeing the visualisation change as they explored different vowel sounds. It even turned into a challenge to see who could get the largest area! Many were fascinated by the speech science concepts behind it. I hope this encouraged them to explore speech processing further in their studies.

The demo provided an effective introduction to core concepts like formants and, judging by the students' feedback, made a strong impression. Overall, it achieved its goal of showcasing speech processing principles through an interactive tool.

Discussion

Developing this demo was an insightful experience. I gained valuable skills in formant analysis and creative visualisation techniques.

If I were to expand on this project, collecting a dataset of different speakers could make it possible to create personalized vowel space visualisations. Additional preprocessing and noise reduction would also help with more challenging audio. This was an exciting opportunity to engage students with interactive technology.

I hope to develop more tools like this to make fundamental engineering concepts come alive through demos.

todo: request for pictures from the event