speechbrain.nnet.loss.transducer_loss module
Transducer loss implementation (depends on numba)
- Authors
Abdelwahab Heba 2020
Titouan Parcollet 2023
Summary
Classes:
- Transducer : Implements the Transducer loss computation with the forward-backward algorithm. Naive implementation of Sequence Transduction with Recurrent Neural Networks (Graves, 2012): https://arxiv.org/pdf/1211.3711.pdf
- TransducerLoss : Implements the Transducer loss computation with the forward-backward algorithm.
Functions:
- cu_kernel_backward : Compute the backward pass for the forward-backward algorithm using a Numba CUDA kernel.
- cu_kernel_compute_grad : Compute the gradient for the forward-backward algorithm using a Numba CUDA kernel.
- cu_kernel_forward : Compute the forward pass for the forward-backward algorithm using a Numba CUDA kernel.
Reference
- speechbrain.nnet.loss.transducer_loss.cu_kernel_forward(log_probs, labels, alpha, log_p, T, U, blank, lock)[source]
Compute the forward pass for the forward-backward algorithm using a Numba CUDA kernel. Naive implementation of Sequence Transduction with Recurrent Neural Networks (Graves, 2012): https://arxiv.org/pdf/1211.3711.pdf. A CPU reference sketch of the recursion follows the parameter list below.
- Parameters:
log_probs (tensor) – 4D Tensor of (batch x TimeLength x LabelLength x outputDim) from the Transducer network.
labels (tensor) – 2D Tensor of (batch x MaxSeqLabelLength) containing targets of the batch with zero padding.
alpha (tensor) – 3D Tensor of (batch x TimeLength x LabelLength) for forward computation.
log_p (tensor) – 1D Tensor of (batch) for forward cost computation.
T (tensor) – 1D Tensor of (batch) containing the time length of each utterance.
U (tensor) – 1D Tensor of (batch) containing the label length of each target.
blank (int) – Blank index.
lock (tensor) – 2D Tensor of (batch x LabelLength) containing 0/1 locks used to synchronize the parallel computation.
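To make the recursion concrete, here is a minimal single-utterance CPU sketch of the alpha recursion that this kernel parallelizes. It assumes log_probs of shape (T, U, V) for one utterance and U = label length + 1; the helper name naive_forward is hypothetical and not part of this module.

import torch

def naive_forward(log_probs, labels, T, U, blank):
    # log_probs: (T, U, V) log-probabilities for one utterance (illustrative
    # sketch of the lattice recursion, not the CUDA kernel itself)
    alpha = torch.full((T, U), float("-inf"))
    alpha[0, 0] = 0.0
    for t in range(T):
        for u in range(U):
            if t > 0:  # reach (t, u) by emitting blank at (t - 1, u)
                alpha[t, u] = torch.logaddexp(
                    alpha[t, u], alpha[t - 1, u] + log_probs[t - 1, u, blank]
                )
            if u > 0:  # reach (t, u) by emitting labels[u - 1] at (t, u - 1)
                alpha[t, u] = torch.logaddexp(
                    alpha[t, u], alpha[t, u - 1] + log_probs[t, u - 1, labels[u - 1]]
                )
    # a final blank emission leaves the lattice from (T - 1, U - 1)
    return alpha[T - 1, U - 1] + log_probs[T - 1, U - 1, blank]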
- speechbrain.nnet.loss.transducer_loss.cu_kernel_backward(log_probs, labels, beta, log_p, T, U, blank, lock)[source]
Compute the backward pass for the forward-backward algorithm using a Numba CUDA kernel. Naive implementation of Sequence Transduction with Recurrent Neural Networks (Graves, 2012): https://arxiv.org/pdf/1211.3711.pdf. A matching CPU sketch of the beta recursion follows the parameter list below.
- Parameters:
log_probs (tensor) – 4D Tensor of (batch x TimeLength x LabelLength x outputDim) from the Transducer network.
labels (tensor) – 2D Tensor of (batch x MaxSeqLabelLength) containing targets of the batch with zero padding.
beta (tensor) – 3D Tensor of (batch x TimeLength x LabelLength) for backward computation.
log_p (tensor) – 1D Tensor of (batch) for backward cost computation.
T (tensor) – 1D Tensor of (batch) containing the time length of each utterance.
U (tensor) – 1D Tensor of (batch) containing the label length of each target.
blank (int) – Blank index.
lock (tensor) – 2D Tensor of (batch x LabelLength) containing 0/1 locks used to synchronize the parallel computation.
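For symmetry, a minimal single-utterance CPU sketch of the beta recursion, under the same assumptions as naive_forward above (the helper name naive_backward is hypothetical); beta[0, 0] recovers the same log-likelihood as the forward pass.

import torch

def naive_backward(log_probs, labels, T, U, blank):
    # beta[t, u]: log-probability of completing the alignment from node (t, u)
    beta = torch.full((T, U), float("-inf"))
    beta[T - 1, U - 1] = log_probs[T - 1, U - 1, blank]  # final blank emission
    for t in range(T - 1, -1, -1):
        for u in range(U - 1, -1, -1):
            if t < T - 1:  # leave (t, u) by emitting blank
                beta[t, u] = torch.logaddexp(
                    beta[t, u], beta[t + 1, u] + log_probs[t, u, blank]
                )
            if u < U - 1:  # leave (t, u) by emitting labels[u]
                beta[t, u] = torch.logaddexp(
                    beta[t, u], beta[t, u + 1] + log_probs[t, u, labels[u]]
                )
    return beta[0, 0]  # equals the forward log-likelihood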
- speechbrain.nnet.loss.transducer_loss.cu_kernel_compute_grad(log_probs, labels, alpha, beta, grads, T, U, blank)[source]
Compute the gradient for the forward-backward algorithm using a Numba CUDA kernel. Naive implementation of Sequence Transduction with Recurrent Neural Networks (Graves, 2012): https://arxiv.org/pdf/1211.3711.pdf. The gradient formula is sketched after the parameter list below.
- Parameters:
log_probs (tensor) – 4D Tensor of (batch x TimeLength x LabelLength x outputDim) from the Transducer network.
labels (tensor) – 2D Tensor of (batch x MaxSeqLabelLength) containing targets of the batch with zero padding.
alpha (tensor) – 3D Tensor of (batch x TimeLength x LabelLength) for forward computation.
beta (tensor) – 3D Tensor of (batch x TimeLength x LabelLength) for backward computation.
grads (tensor) – 4D Tensor of (batch x TimeLength x LabelLength x outputDim) in which the computed gradients are stored.
T (tensor) – 1D Tensor of (batch) containing the time length of each utterance.
U (tensor) – 1D Tensor of (batch) containing the label length of each target.
blank (int) – Blank index.
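Given alpha and beta, the gradient of the negative log-likelihood follows Graves (2012): at each lattice node only the blank entry and the next target label receive a non-zero gradient, with every entry of log_probs treated as an independent input. A single-utterance sketch under the same assumptions as above (the helper name naive_grads is hypothetical, not the kernel's API):

import torch

def naive_grads(log_probs, labels, alpha, beta, T, U, blank):
    log_like = beta[0, 0]  # total log-likelihood of the target sequence
    grads = torch.zeros_like(log_probs)
    # the final blank transition leaves the lattice, so its beta term is 0
    grads[T - 1, U - 1, blank] = -torch.exp(
        alpha[T - 1, U - 1] + log_probs[T - 1, U - 1, blank] - log_like
    )
    for t in range(T):
        for u in range(U):
            if t < T - 1:  # blank transition (t, u) -> (t + 1, u)
                grads[t, u, blank] = -torch.exp(
                    alpha[t, u] + beta[t + 1, u]
                    + log_probs[t, u, blank] - log_like
                )
            if u < U - 1:  # label transition (t, u) -> (t, u + 1)
                k = labels[u]
                grads[t, u, k] = -torch.exp(
                    alpha[t, u] + beta[t, u + 1]
                    + log_probs[t, u, k] - log_like
                )
    return grads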
- class speechbrain.nnet.loss.transducer_loss.Transducer(*args, **kwargs)[source]
Bases:
Function
This class implements the Transducer loss computation with the forward-backward algorithm. Naive implementation of Sequence Transduction with Recurrent Neural Networks (Graves, 2012): https://arxiv.org/pdf/1211.3711.pdf
This class uses torch.autograd.Function: because the loss is computed with the forward-backward algorithm, the gradient must be derived manually rather than left to autograd.
This class cannot be instantiated directly; please refer to the TransducerLoss class instead.
It is also possible to use this class directly through Transducer.apply.
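For readers unfamiliar with the pattern, the toy torch.autograd.Function below (SquareFn is purely illustrative and not part of SpeechBrain) shows why such a class is used through .apply rather than instantiated, and how a manually derived gradient is returned from backward:

import torch

class SquareFn(torch.autograd.Function):
    # Same pattern as Transducer: forward() computes the value,
    # backward() returns a hand-derived gradient.
    @staticmethod
    def forward(ctx, x):
        ctx.save_for_backward(x)  # stash inputs needed by backward()
        return x * x

    @staticmethod
    def backward(ctx, grad_output):
        (x,) = ctx.saved_tensors
        return grad_output * 2 * x  # d(x^2)/dx = 2x

x = torch.tensor(3.0, requires_grad=True)
y = SquareFn.apply(x)  # Functions are applied, never instantiated
y.backward()
# x.grad is now tensor(6.)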
- class speechbrain.nnet.loss.transducer_loss.TransducerLoss(blank=0, reduction='mean')[source]
Bases:
Module
This class implements the Transducer loss computation with the forward-backward algorithm. Naive implementation of Sequence Transduction with Recurrent Neural Networks (Graves, 2012): https://arxiv.org/pdf/1211.3711.pdf
TransducerLoss (an nn.Module) uses Transducer (an autograd.Function) to compute the forward-backward loss and its gradients.
Input tensors must be on a CUDA device.
Example
>>> import torch
>>> loss = TransducerLoss(blank=0)
>>> logits = torch.randn((1,2,3,5)).cuda().requires_grad_()
>>> labels = torch.Tensor([[1,2]]).cuda().int()
>>> act_length = torch.Tensor([2]).cuda().int()
>>> # U = label_length+1
>>> label_length = torch.Tensor([2]).cuda().int()
>>> l = loss(logits, labels, act_length, label_length)
>>> l.backward()