speechbrain.inference.enhancement module

Specifies the inference interfaces for speech enhancement modules.

Authors:

Aku Rouhe 2021
Peter Plantinga 2021
Loren Lugosch 2020
Mirco Ravanelli 2020
Titouan Parcollet 2021
Abdel Heba 2021
Andreas Nautsch 2022, 2023
Pooneh Mousavi 2023
Sylvain de Langen 2023
Adel Moumen 2023
Pradnya Kandarkar 2023

Summary

Classes:

`SpectralMaskEnhancement`	A ready-to-use model for speech enhancement.
`WaveformEnhancement`	A ready-to-use model for speech enhancement.

Reference

class speechbrain.inference.enhancement.SpectralMaskEnhancement(modules=None, hparams=None, run_opts=None, freeze_params=True)[source]

Bases: Pretrained

A ready-to-use model for speech enhancement.

Parameters:

See Pretrained.

Example

>>> import torch
>>> from speechbrain.inference.enhancement import SpectralMaskEnhancement
>>> # Model is downloaded from the speechbrain HuggingFace repo
>>> tmpdir = getfixture("tmpdir")
>>> enhancer = SpectralMaskEnhancement.from_hparams(
...     source="speechbrain/metricgan-plus-voicebank",
...     savedir=tmpdir,
... )
>>> enhanced = enhancer.enhance_file(
...     "speechbrain/metricgan-plus-voicebank/example.wav"
... )

HPARAMS_NEEDED = ['compute_stft', 'spectral_magnitude', 'resynth']

MODULES_NEEDED = ['enhance_model']

compute_features(wavs)[source]

Compute the log spectral magnitude features for masking.

Parameters:

wavs (torch.Tensor) – A batch of waveforms to convert to log spectral magnitudes.
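The following is a minimal sketch of what a log spectral magnitude feature can look like, written with plain PyTorch rather than the model's own `compute_stft` and `spectral_magnitude` hparams. The function name `log_spectral_magnitude`, the FFT size, the hop length, the window, and the `log1p` compression are all assumptions made for illustration; the pretrained model's hparams file defines the real settings.

```python
import torch


def log_spectral_magnitude(wavs, n_fft=512, hop_length=256):
    """Hypothetical helper: [batch, time] waveforms -> [batch, frames, freq] features."""
    window = torch.hamming_window(n_fft)
    # Complex STFT with shape [batch, n_fft // 2 + 1, frames].
    spec = torch.stft(
        wavs,
        n_fft=n_fft,
        hop_length=hop_length,
        window=window,
        return_complex=True,
    )
    # Magnitude of each time-frequency bin (assumed; a power spectrum is also common).
    mags = spec.abs()
    # log(1 + x) keeps silent bins at 0 instead of -inf.
    return torch.log1p(mags).transpose(1, 2)


wavs = torch.randn(2, 16000)  # two one-second clips at an assumed 16 kHz
feats = log_spectral_magnitude(wavs)
```

The features are non-negative and have one row per STFT frame, which is the layout a masking model would typically consume.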

enhance_batch(noisy, lengths=None)[source]

Enhance a batch of noisy waveforms.

Parameters:

noisy (torch.Tensor) – A batch of waveforms to perform enhancement on.
lengths (torch.Tensor) – The lengths of the waveforms if the enhancement model handles them.

Returns:

A batch of enhanced waveforms of the same shape as input.

Return type:

torch.Tensor
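As a rough illustration of how the two arguments fit together, the sketch below pads waveforms of different durations into one batch and derives a `lengths` tensor. It assumes the convention used elsewhere in SpeechBrain, where lengths are expressed relative to the longest item in the batch (1.0 means unpadded); `pad_with_relative_lengths` is a hypothetical helper, not part of this module.

```python
import torch
from torch.nn.utils.rnn import pad_sequence


def pad_with_relative_lengths(wavs):
    """Hypothetical helper: zero-pad 1-D waveforms and return relative lengths."""
    # Stack into [batch, max_time], padding shorter items with zeros on the right.
    padded = pad_sequence(wavs, batch_first=True)
    # Fraction of the padded duration that each item actually occupies.
    num_samples = torch.tensor([w.shape[0] for w in wavs], dtype=torch.float32)
    lengths = num_samples / padded.shape[1]
    return padded, lengths


noisy, lengths = pad_with_relative_lengths(
    [torch.randn(16000), torch.randn(12000)]
)
# With an enhancer loaded as in the class example, the call would then be:
# enhanced = enhancer.enhance_batch(noisy, lengths=lengths)
```

Here `noisy` has shape `[2, 16000]` and `lengths` holds `1.0` and `0.75`.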

enhance_file(filename, output_filename=None, **kwargs)[source]

Enhance a wav file.

Parameters:

filename (str) – Location on disk to load file for enhancement.
output_filename (str) – If provided, writes enhanced data to this file.
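One plausible way a single recording maps onto the batch interface is sketched below: add a batch dimension, treat the item as full length, enhance, and drop the batch dimension again. This is an assumption about the general pattern, not a statement of how `enhance_file` is implemented; `enhance_single` and the stand-in enhancer `halve` are hypothetical names, and file loading and saving are left out.

```python
import torch


def enhance_single(wav, enhance_batch_fn):
    """Hypothetical helper: run a batch-level enhancer on one 1-D waveform."""
    batch = wav.unsqueeze(0)  # [time] -> [1, time]
    lengths = torch.tensor([1.0])  # a single unpadded item has full relative length
    enhanced = enhance_batch_fn(batch, lengths)
    return enhanced.squeeze(0)  # [1, time] -> [time]


def halve(noisy, lengths=None):
    """Stand-in for a real enhancer: scales the signal and keeps its shape."""
    return 0.5 * noisy


wav = torch.randn(16000)
enhanced = enhance_single(wav, halve)
```

In practice the stand-in would be replaced by the `enhance_batch` method of a loaded enhancer.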

training: bool

class speechbrain.inference.enhancement.WaveformEnhancement(modules=None, hparams=None, run_opts=None, freeze_params=True)[source]

Bases: Pretrained

A ready-to-use model for speech enhancement.

Parameters:

See Pretrained.

Example

>>> from speechbrain.inference.enhancement import WaveformEnhancement
>>> # Model is downloaded from the speechbrain HuggingFace repo
>>> tmpdir = getfixture("tmpdir")
>>> enhancer = WaveformEnhancement.from_hparams(
...     source="speechbrain/mtl-mimic-voicebank",
...     savedir=tmpdir,
... )
>>> enhanced = enhancer.enhance_file(
...     "speechbrain/mtl-mimic-voicebank/example.wav"
... )

MODULES_NEEDED = ['enhance_model']

enhance_batch(noisy, lengths=None)[source]

Enhance a batch of noisy waveforms.

Parameters:

noisy (torch.Tensor) – A batch of waveforms to perform enhancement on.
lengths (torch.Tensor) – The lengths of the waveforms if the enhancement model handles them.

Returns:

A batch of enhanced waveforms of the same shape as input.

Return type:

torch.Tensor

enhance_file(filename, output_filename=None, **kwargs)[source]

Enhance a wav file.

Parameters:

filename (str) – Location on disk to load file for enhancement.
output_filename (str) – If provided, writes enhanced data to this file.

forward(noisy, lengths=None)[source]

Runs enhancement on the noisy input.

training: bool