speechbrain.lobes.models

Package defining neural network models (CRDNN, Xvectors …)

`speechbrain.lobes.models.BESTRQ`	A few components to support BEST-RQ training as described in the original paper: https://arxiv.org/pdf/2202.01855.
`speechbrain.lobes.models.CRDNN`	A combination of Convolutional, Recurrent, and Fully-connected networks.
`speechbrain.lobes.models.Cnn14`	This file implements the CNN14 model from https://arxiv.org/abs/1912.10211
`speechbrain.lobes.models.ContextNet`	The SpeechBrain implementation of ContextNet by https://arxiv.org/pdf/2005.03191.pdf
`speechbrain.lobes.models.DiffWave`	Neural network modules for DIFFWAVE: A VERSATILE DIFFUSION MODEL FOR AUDIO SYNTHESIS
`speechbrain.lobes.models.ECAPA_TDNN`	A popular speaker recognition and diarization model.
`speechbrain.lobes.models.ESPnetVGG`	This lobe replicates the encoder first introduced in ESPnet v1.
`speechbrain.lobes.models.EnhanceResnet`	Wide ResNet for Speech Enhancement.
`speechbrain.lobes.models.FastSpeech2`	Neural network modules for the FastSpeech 2: Fast and High-Quality End-to-End Text to Speech synthesis model. Authors: Sathvik Udupa (2022), Pradnya Kandarkar (2023), Yingzhi Wang (2023).
`speechbrain.lobes.models.GatedNN`	Gated Neural Network variant of `VanillaNN` for simple feed-forward tests.
`speechbrain.lobes.models.HifiGAN`	Neural network modules for the HiFi-GAN: Generative Adversarial Networks for Efficient and High Fidelity Speech Synthesis
`speechbrain.lobes.models.L2I`	This file provides the classes and functions needed to implement the Listen-to-Interpret (L2I) interpretation method from https://arxiv.org/abs/2202.11479v2
`speechbrain.lobes.models.MSTacotron2`	Neural network modules for the Zero-Shot Multi-Speaker Tacotron2 end-to-end neural Text-to-Speech (TTS) model
`speechbrain.lobes.models.MetricGAN`	Generator and discriminator used in MetricGAN
`speechbrain.lobes.models.MetricGAN_U`	Generator and discriminator used in MetricGAN-U
`speechbrain.lobes.models.PIQ`	This file provides the classes and functions needed to implement Posthoc Interpretations via Quantization (PIQ).
`speechbrain.lobes.models.RNNLM`	Implementation of a Recurrent Language Model.
`speechbrain.lobes.models.ResNet`	ResNet PreActivated for speaker verification
`speechbrain.lobes.models.Tacotron2`	Neural network modules for the Tacotron2 end-to-end neural Text-to-Speech (TTS) model
`speechbrain.lobes.models.VanillaNN`	Vanilla Neural Network for simple tests.
`speechbrain.lobes.models.Xvector`	A popular speaker recognition and diarization model.
`speechbrain.lobes.models.beats`	This lobe enables the integration of pretrained BEATs: Audio Pre-Training with Acoustic Tokenizers.
`speechbrain.lobes.models.bsq`	Binary spherical quantizer.
`speechbrain.lobes.models.conv_tasnet`	Implementation of a popular speech separation model.
`speechbrain.lobes.models.convolution`	A module for assembling a (depthwise) convolution encoder, with or without residual connections.
`speechbrain.lobes.models.dual_path`	Library to support dual-path speech separation.
`speechbrain.lobes.models.fairseq_wav2vec`	This lobe enables the integration of fairseq pretrained wav2vec models.
`speechbrain.lobes.models.kmeans`	This file ensures old links to kmeans continue to work while providing a Deprecation warning
`speechbrain.lobes.models.resepformer`	Library for the Resource-Efficient Sepformer.
`speechbrain.lobes.models.segan_model`	This file contains two PyTorch modules which together make up the SEGAN model architecture (based on the paper: Pascual et al. https://arxiv.org/pdf/1703.09452.pdf).
`speechbrain.lobes.models.wav2vec`	Components necessary to build a wav2vec 2.0 architecture following the original paper: https://arxiv.org/abs/2006.11477.

`speechbrain.lobes.models.discrete`	High level processing blocks.
`speechbrain.lobes.models.g2p`
`speechbrain.lobes.models.transformer`	High level processing blocks.