Tutorials

Tutorial 1

  • Speech Processing for Health: Overview, Challenges and Opportunities

Dr. Nicholas Cummins

Department of Biostatistics & Health Informatics, Institute of Psychiatry, Psychology & Neuroscience, King’s College London

Speech is a unique health signal: it carries a distinctive combination of cognitive, neuromuscular and physiological information. Despite this, speech analysis remains underutilised in healthcare. With this in mind, the tutorial will give an overview of both core and state-of-the-art concepts in speech processing for health analysis. The content will cover the physiological aspects of speech production, highlighting how changes in health state affect these processes; the extraction of linguistic and paralinguistic features; and machine learning techniques commonly used in this area. As speech processing for health is a growing field of research, we will also dedicate part of the tutorial to highlighting key research challenges and discussing the opportunities they present. The tutorial is aimed at the broader ASRU audience. It will introduce the fundamentals of speech processing, language processing and machine learning for researchers newer to the field, while also covering recent deep learning approaches for intermediate to advanced researchers in the area.

Tutorial 2

  • SpeechBrain: Unifying Speech Technologies and Deep Learning With an Open Source Toolkit

SpeechBrain is an open-source conversational AI toolkit. It is designed to facilitate the research and development of neural speech processing technologies by being simple, flexible, user-friendly, and well-documented. It currently supports a large variety of speech, audio, and language processing tasks, including speech recognition, speaker recognition, speech enhancement, speech separation, and spoken language understanding, to name a few.

With this tutorial, we would like to present SpeechBrain and its latest updates to the ASRU 2021 attendees. We will describe the core architecture, which is designed to support several tasks of common interest and allows users to naturally conceive, compare and share novel conversational AI pipelines. We will show the performance achieved by the toolkit on a wide range of benchmarks. Finally, we will provide a guided tour of training recipes, pretrained models, and inference scripts, as well as tutorials that allow anyone with basic Python proficiency to familiarize themselves with speech technologies.

 https://speechbrain.github.io/about.html