speechbrain.lobes.models

Package defining neural network models (CRDNN, Xvectors, …)

speechbrain.lobes.models.CRDNN

A combination of Convolutional, Recurrent, and Fully-connected networks.

speechbrain.lobes.models.Cnn14

This file implements the CNN14 model from https://arxiv.org/abs/1912.10211

speechbrain.lobes.models.ContextNet

The SpeechBrain implementation of ContextNet, from https://arxiv.org/pdf/2005.03191.pdf

speechbrain.lobes.models.ECAPA_TDNN

A popular speaker recognition and diarization model.

speechbrain.lobes.models.ESPnetVGG

This lobe replicates the encoder first introduced in ESPnet v1.

speechbrain.lobes.models.EnhanceResnet

Wide ResNet for Speech Enhancement.

speechbrain.lobes.models.FastSpeech2

Neural network modules for the model from "FastSpeech 2: Fast and High-Quality End-to-End Text to Speech". Authors: Sathvik Udupa (2022), Pradnya Kandarkar (2023), Yingzhi Wang (2023).

speechbrain.lobes.models.HifiGAN

Neural network modules for HiFi-GAN: Generative Adversarial Networks for Efficient and High Fidelity Speech Synthesis

speechbrain.lobes.models.L2I

This file provides the classes and functions needed to implement the Listen-to-Interpret (L2I) interpretation method from https://arxiv.org/abs/2202.11479v2

speechbrain.lobes.models.MetricGAN

Generator and discriminator used in MetricGAN

speechbrain.lobes.models.MetricGAN_U

Generator and discriminator used in MetricGAN-U

speechbrain.lobes.models.PIQ

This file provides the classes and functions needed to implement Posthoc Interpretations via Quantization (PIQ).

speechbrain.lobes.models.RNNLM

Implementation of a Recurrent Language Model.

speechbrain.lobes.models.Tacotron2

Neural network modules for the Tacotron2 end-to-end neural Text-to-Speech (TTS) model

speechbrain.lobes.models.VanillaNN

Vanilla Neural Network for simple tests.

speechbrain.lobes.models.Xvector

A popular speaker recognition and diarization model.

speechbrain.lobes.models.conv_tasnet

Implementation of a popular speech separation model.

speechbrain.lobes.models.convolution

This is a module to assemble a (depthwise) convolution encoder with or without residual connections.

speechbrain.lobes.models.dual_path

Library to support dual-path speech separation.

speechbrain.lobes.models.fairseq_wav2vec

This lobe enables the integration of fairseq pretrained wav2vec models.

speechbrain.lobes.models.huggingface_wav2vec

This lobe enables the integration of huggingface pretrained wav2vec2/hubert/wavlm models.

speechbrain.lobes.models.huggingface_whisper

This lobe enables the integration of huggingface pretrained whisper models.

speechbrain.lobes.models.resepformer

Library for the Resource-Efficient Sepformer.

speechbrain.lobes.models.segan_model

This file contains two PyTorch modules which together make up the SEGAN model architecture (based on the paper: Pascual et al. https://arxiv.org/pdf/1703.09452.pdf).

speechbrain.lobes.models.wav2vec

Components necessary to build a wav2vec 2.0 architecture following the original paper: https://arxiv.org/abs/2006.11477.

speechbrain.lobes.models.g2p

speechbrain.lobes.models.transformer

High level processing blocks.
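
Each entry above is an ordinary Python submodule, so it can be imported by its dotted name. As a minimal sketch, the following resolves a submodule by name and returns `None` when the import fails, for instance because SpeechBrain or a dependency the submodule needs is not installed. The helper `load_lobe` is hypothetical and not part of SpeechBrain; it relies only on the standard-library `importlib`.

```python
import importlib


def load_lobe(name, package="speechbrain.lobes.models"):
    """Import `package.name` and return the module, or None on ImportError.

    `load_lobe` is an illustrative helper, not a SpeechBrain function.
    """
    full_name = f"{package}.{name}"
    try:
        return importlib.import_module(full_name)
    except ImportError:
        # Covers a missing package, a missing submodule, or a missing
        # dependency that the submodule imports at load time.
        return None
```

For example, `load_lobe("CRDNN")` would return the `speechbrain.lobes.models.CRDNN` module when SpeechBrain is installed, and `None` otherwise.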