speechbrain.inference.enhancement module

Specifies the inference interfaces for speech enhancement modules.

Authors:
  • Aku Rouhe 2021

  • Peter Plantinga 2021

  • Loren Lugosch 2020

  • Mirco Ravanelli 2020

  • Titouan Parcollet 2021

  • Abdel Heba 2021

  • Andreas Nautsch 2022, 2023

  • Pooneh Mousavi 2023

  • Sylvain de Langen 2023

  • Adel Moumen 2023

  • Pradnya Kandarkar 2023

Summary

Classes:

SpectralMaskEnhancement

A ready-to-use model for speech enhancement based on spectral masking.

WaveformEnhancement

A ready-to-use model for speech enhancement that operates on waveforms.

Reference

class speechbrain.inference.enhancement.SpectralMaskEnhancement(modules=None, hparams=None, run_opts=None, freeze_params=True)[source]

Bases: Pretrained

A ready-to-use model for speech enhancement.

Parameters:

See Pretrained.

Example

>>> import torch
>>> from speechbrain.inference.enhancement import SpectralMaskEnhancement
>>> # Model is downloaded from the speechbrain HuggingFace repo
>>> tmpdir = getfixture("tmpdir")
>>> enhancer = SpectralMaskEnhancement.from_hparams(
...     source="speechbrain/metricgan-plus-voicebank",
...     savedir=tmpdir,
... )
>>> enhanced = enhancer.enhance_file(
...     "speechbrain/metricgan-plus-voicebank/example.wav"
... )
HPARAMS_NEEDED = ['compute_stft', 'spectral_magnitude', 'resynth']
MODULES_NEEDED = ['enhance_model']
compute_features(wavs)[source]

Compute the log spectral magnitude features for masking.

Parameters:

wavs (torch.Tensor) – A batch of waveforms to convert to log spectral magnitudes.
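As a rough illustration of what "log spectral magnitude" means, the sketch below computes it for a single short frame in plain Python. It is not the library's implementation: the real method relies on the compute_stft and spectral_magnitude hyperparameters listed in HPARAMS_NEEDED, whereas this example uses a naive DFT, and the log1p compression is an assumption that this page does not state. The helper name log_spectral_magnitude is hypothetical.

```python
import cmath
import math


def log_spectral_magnitude(frame):
    """Return log1p(|DFT(frame)|) for the non-negative frequency bins.

    Hypothetical helper: a naive DFT stands in for a real STFT, and
    log1p is an assumed choice of log compression.
    """
    n = len(frame)
    feats = []
    for k in range(n // 2 + 1):
        # k-th DFT coefficient of the frame
        coeff = sum(
            x * cmath.exp(-2j * cmath.pi * k * t / n)
            for t, x in enumerate(frame)
        )
        # Magnitude, then log compression (always >= 0)
        feats.append(math.log1p(abs(coeff)))
    return feats


# A 16-sample sinusoid with exactly two cycles per frame
frame = [math.sin(2 * math.pi * 2 * t / 16) for t in range(16)]
feats = log_spectral_magnitude(frame)
peak_bin = max(range(len(feats)), key=feats.__getitem__)
```

The energy of the sinusoid concentrates in a single frequency bin, which is the kind of structure a masking model can exploit.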

enhance_batch(noisy, lengths=None)[source]

Enhance a batch of noisy waveforms.

Parameters:
  • noisy (torch.Tensor) – A batch of waveforms to perform enhancement on.

  • lengths (torch.Tensor) – The lengths of the waveforms. Pass this only if the enhancement model accepts lengths.

Returns:

A batch of enhanced waveforms of the same shape as input.

Return type:

torch.Tensor
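The class name suggests that enhancement works by scaling spectral features with a predicted mask. The following is a minimal sketch of that idea in plain Python, not the library's code: it assumes the mask multiplies log1p-compressed magnitudes element-wise and that expm1 undoes the compression afterwards. The function apply_mask and all values are hypothetical, and the resynthesis step that turns magnitudes back into a waveform (the resynth hyperparameter) is not shown.

```python
import math


def apply_mask(log_feats, mask):
    """Scale log-compressed features by a mask, then undo the compression.

    Hypothetical helper; assumes log1p compression and an element-wise mask.
    """
    masked = [m * f for m, f in zip(mask, log_feats)]
    return [math.expm1(v) for v in masked]


# Made-up spectral magnitudes for three frequency bins
noisy_mag = [0.5, 4.0, 0.2]
log_feats = [math.log1p(v) for v in noisy_mag]

# A mask value of 0 removes a bin, 1 keeps it, values in between attenuate it
mask = [0.0, 1.0, 0.5]
enhanced_mag = apply_mask(log_feats, mask)
```

Under these assumptions, a bin with mask 1 keeps its original magnitude and a bin with mask 0 is silenced.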

enhance_file(filename, output_filename=None, **kwargs)[source]

Enhance a wav file.

Parameters:
  • filename (str) – Location on disk to load file for enhancement.

  • output_filename (str) – If provided, writes enhanced data to this file.
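A short sketch of using output_filename to write the enhanced audio to disk. The wrapper enhance_to_file, its default savedir, and the file paths in the usage note are hypothetical; only from_hparams, enhance_file, and the model source come from the example above. The import sits inside the function so that the helper can be defined without SpeechBrain installed.

```python
def enhance_to_file(
    noisy_path,
    enhanced_path,
    savedir="pretrained_models/metricgan-plus-voicebank",  # hypothetical cache directory
):
    """Enhance the file at noisy_path and write the result to enhanced_path."""
    # Imported lazily so that defining this helper does not require SpeechBrain.
    from speechbrain.inference.enhancement import SpectralMaskEnhancement

    enhancer = SpectralMaskEnhancement.from_hparams(
        source="speechbrain/metricgan-plus-voicebank",
        savedir=savedir,
    )
    # Passing output_filename asks enhance_file to write the enhanced data.
    return enhancer.enhance_file(noisy_path, output_filename=enhanced_path)
```

A call such as enhance_to_file("noisy.wav", "enhanced.wav") would download the model on first use; both file names are placeholders.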

training: bool
class speechbrain.inference.enhancement.WaveformEnhancement(modules=None, hparams=None, run_opts=None, freeze_params=True)[source]

Bases: Pretrained

A ready-to-use model for speech enhancement.

Parameters:

See Pretrained.

Example

>>> from speechbrain.inference.enhancement import WaveformEnhancement
>>> # Model is downloaded from the speechbrain HuggingFace repo
>>> tmpdir = getfixture("tmpdir")
>>> enhancer = WaveformEnhancement.from_hparams(
...     source="speechbrain/mtl-mimic-voicebank",
...     savedir=tmpdir,
... )
>>> enhanced = enhancer.enhance_file(
...     "speechbrain/mtl-mimic-voicebank/example.wav"
... )
MODULES_NEEDED = ['enhance_model']
enhance_batch(noisy, lengths=None)[source]

Enhance a batch of noisy waveforms.

Parameters:
  • noisy (torch.Tensor) – A batch of waveforms to perform enhancement on.

  • lengths (torch.Tensor) – The lengths of the waveforms. Pass this only if the enhancement model accepts lengths.

Returns:

A batch of enhanced waveforms of the same shape as input.

Return type:

torch.Tensor
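When waveforms of different durations are batched, they must be padded to a common length, and the lengths argument tells the model which part of each row is real audio. The sketch below assumes the relative-length convention used elsewhere in SpeechBrain (each length is a fraction of the longest waveform in the batch), which this page does not state. It uses plain Python lists instead of tensors to stay dependency-free, and pad_batch is a hypothetical helper.

```python
def pad_batch(waveforms, pad_value=0.0):
    """Right-pad waveforms to a common length and return relative lengths.

    Hypothetical helper; relative lengths are an assumed convention.
    """
    max_len = max(len(w) for w in waveforms)
    padded = [list(w) + [pad_value] * (max_len - len(w)) for w in waveforms]
    # Each length is expressed as a fraction of the longest waveform.
    lengths = [len(w) / max_len for w in waveforms]
    return padded, lengths


batch, lengths = pad_batch([[0.1, 0.2, 0.3, 0.4], [0.5, 0.6]])
```

In real use, batch and lengths would be converted to torch.Tensor objects before being passed to enhance_batch.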

enhance_file(filename, output_filename=None, **kwargs)[source]

Enhance a wav file.

Parameters:
  • filename (str) – Location on disk to load file for enhancement.

  • output_filename (str) – If provided, writes enhanced data to this file.

forward(noisy, lengths=None)[source]

Runs enhancement on the noisy input.

training: bool