Speech Processing Tasks
Speech Recognition From Scratch
Ravanelli M. & Parcollet T. |
Apr. 2021 |
Difficulty: medium |
Time: 45min |
Do you want to figure out how to implement your own speech recognizer with SpeechBrain? Look no further, you're in the right place. This self-contained tutorial walks you through all the steps needed to implement an offline, end-to-end, attention-based speech recognizer, and helps you "connect the dots" across the stages of training a modern speech recognizer. We cover data preparation, tokenizer training, the language model, the ASR model, and inference, and we explain how to train the model on your own data.
Metrics for Speech Recognition
de Langen S. |
Sep. 2024 |
Difficulty: medium |
Time: 30min |
Estimating the accuracy of a speech recognition model is not a trivial problem. The Word Error Rate (WER) and Character Error Rate (CER) are the standard metrics, but researchers have been developing alternatives that correlate better with human evaluation (such as SemDist).
This tutorial introduces some alternative ASR metrics and their flexible integration into SpeechBrain, which can help you research, use or develop new metrics.
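As a point of reference for the standard metrics mentioned above, here is a minimal sketch of WER and CER computed as a Levenshtein edit distance divided by the reference length. It is a plain-Python illustration, not SpeechBrain's implementation; the function names `edit_distance`, `error_rate`, `wer`, and `cer` are hypothetical, and whitespace tokenization without text normalization is an assumption.

```python
def edit_distance(ref, hyp):
    """Levenshtein distance between two token sequences."""
    # prev[j] holds the distance between the reference prefix seen so far
    # and the first j hypothesis tokens.
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(
                prev[j] + 1,         # deletion (reference token missing)
                curr[j - 1] + 1,     # insertion (extra hypothesis token)
                prev[j - 1] + cost,  # match or substitution
            ))
        prev = curr
    return prev[-1]


def error_rate(ref_tokens, hyp_tokens):
    """Edit distance normalized by the reference length."""
    if not ref_tokens:
        raise ValueError("reference must contain at least one token")
    return edit_distance(ref_tokens, hyp_tokens) / len(ref_tokens)


def wer(reference, hypothesis):
    # Assumption: words are separated by whitespace, no normalization.
    return error_rate(reference.split(), hypothesis.split())


def cer(reference, hypothesis):
    # Assumption: every character, including spaces, counts as a token.
    return error_rate(list(reference), list(hypothesis))


if __name__ == "__main__":
    print(wer("the cat sat on the mat", "the cat sit on mat"))
    print(cer("sat", "sit"))
```

Because both metrics only count token-level edits, two hypotheses with the same WER can differ a lot in meaning, which is the gap that semantic metrics such as SemDist try to address.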
Source Separation
Subakan C. |
Jan. 2021 |
Difficulty: medium |
Time: 30min |
In source separation, the goal is to recover the individual sources from an observed mixture signal, which is a superposition of several sources. In this tutorial, we cover a few examples of performing source separation with SpeechBrain.
Speech Enhancement From Scratch
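To make the mixture model concrete, the sketch below builds a mixture as the sample-wise sum of two synthetic sources. It illustrates only the problem setup, not a separation method and not SpeechBrain code; the helper names `sine` and `mix`, the 8 kHz sample rate, and the tone frequencies are all assumptions chosen for the example.

```python
import math


def sine(freq_hz, n_samples, sample_rate=8000):
    """Generate a pure tone as a list of float samples."""
    return [
        math.sin(2 * math.pi * freq_hz * n / sample_rate)
        for n in range(n_samples)
    ]


def mix(sources):
    """Superpose equal-length sources by summing them sample by sample."""
    return [sum(samples) for samples in zip(*sources)]


# Two hypothetical sources: a 440 Hz tone and a 1000 Hz tone.
source_a = sine(440, 160)
source_b = sine(1000, 160)

# The observed signal is the only thing a separation model gets to see.
mixture = mix([source_a, source_b])

# With oracle access to one source, the other is just the residual.
# A real separator has to estimate both sources from the mixture alone.
residual = [m - a for m, a in zip(mixture, source_a)]
```

The residual matches `source_b` up to floating-point error, which shows why the task is trivial with oracle knowledge and hard without it.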
Plantinga P. |
Feb. 2021 |
Difficulty: medium |
Time: 30min |
So you want to do regression tasks with speech? Look no further, you're in the right place. This tutorial will walk you through a basic speech enhancement template with SpeechBrain to show all the components needed for making a new recipe.
Speech Classification From Scratch
Ravanelli M. |
Jan. 2021 |
Difficulty: medium |
Time: 30min |
In this tutorial, we show how to use SpeechBrain to implement an utterance-level speech classifier. It might help if you want to develop systems for speaker-id, language-id, emotion recognition, sound classification, keyword spotting, and many other tasks.
Voice Activity Detection
Ravanelli M. |
Sep. 2021 |
Difficulty: easy |
Time: 15min |
In this tutorial, we show how to use SpeechBrain for voice activity detection (VAD). The tutorial describes how to train a neural VAD and use it for inference on long audio recordings.