speechbrain.nnet.normalization module

Library implementing normalization.

Authors

Mirco Ravanelli 2020

Summary

Classes:

`BatchNorm1d`	Applies 1d batch normalization to the input tensor.
`BatchNorm2d`	Applies 2d batch normalization to the input tensor.
`InstanceNorm1d`	Applies 1d instance normalization to the input tensor.
`InstanceNorm2d`	Applies 2d instance normalization to the input tensor.
`LayerNorm`	Applies layer normalization to the input tensor.

Reference

class speechbrain.nnet.normalization.BatchNorm1d(input_shape=None, input_size=None, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True, combine_batch_time=False, skip_transpose=False)

Bases: torch.nn.modules.module.Module

Applies 1d batch normalization to the input tensor.

Parameters

input_shape (tuple) – The expected shape of the input. Alternatively, use input_size.
input_size (int) – The expected size of the input. Alternatively, use input_shape.
eps (float) – A small value added to the variance estimate in the denominator to improve numerical stability.
momentum (float) – The weight given to the current batch statistics when updating running_mean and running_var.
affine (bool) – When set to True, the affine parameters are learned.
track_running_stats (bool) – When set to True, this module tracks the running mean and variance, and when set to False, this module does not track such statistics.
combine_batch_time (bool) – When True, the batch and time axes are combined before normalization.
skip_transpose (bool) – Whether to skip the transposition of the input that is otherwise applied before normalization.

Example

>>> input = torch.randn(100, 10)
>>> norm = BatchNorm1d(input_shape=input.shape)
>>> output = norm(input)
>>> output.shape
torch.Size([100, 10])
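
The roles of eps and momentum can be checked numerically. The following is a minimal sketch that uses torch.nn.BatchNorm1d directly, on the assumption that the SpeechBrain class delegates to it; the variable names are illustrative only.

```python
import torch

torch.manual_seed(0)
eps, momentum = 1e-5, 0.1
x = torch.randn(100, 10)  # (batch, features)

# Plain PyTorch layer, assumed here to behave like the wrapped module.
bn = torch.nn.BatchNorm1d(10, eps=eps, momentum=momentum, affine=False)
bn.train()
y = bn(x)

# Training mode normalizes with the biased batch variance; eps is added
# to the variance before the square root.
batch_mean = x.mean(dim=0)
batch_var = x.var(dim=0, unbiased=False)
y_manual = (x - batch_mean) / torch.sqrt(batch_var + eps)

# Running statistics start at mean 0 and variance 1, then move toward the
# batch statistics by a factor of momentum (the unbiased variance is used).
expected_mean = momentum * batch_mean
expected_var = (1 - momentum) * torch.ones(10) + momentum * x.var(dim=0, unbiased=True)

output_matches = torch.allclose(y, y_manual, atol=1e-5)
mean_matches = torch.allclose(bn.running_mean, expected_mean, atol=1e-6)
var_matches = torch.allclose(bn.running_var, expected_var, atol=1e-6)
```

A larger momentum makes the running statistics follow recent batches more closely; a smaller one averages over a longer history.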

forward(x)

Returns the normalized input tensor.

Parameters: x (torch.Tensor (batch, time, [channels])) – Input to normalize. 2d or 3d tensors are expected. 4d tensors can be used when combine_batch_time=True.
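
For a 3d input laid out as (batch, time, channels), the statistics are per channel, pooled over batch and time. The sketch below illustrates two ways of reaching that with plain PyTorch: transposing to the channels-first layout that torch.nn.BatchNorm1d expects, or merging the batch and time axes. It assumes the SpeechBrain module performs equivalent reshaping internally; the sizes are arbitrary.

```python
import torch

torch.manual_seed(0)
batch, time, channels = 8, 50, 40
x = torch.randn(batch, time, channels)  # (batch, time, channels)

bn = torch.nn.BatchNorm1d(channels, affine=False)
bn.train()

# Option 1: move channels to axis 1, normalize, then move them back.
y_transposed = bn(x.transpose(-1, 1)).transpose(1, -1)

# Option 2: merge batch and time into one axis, normalize, then restore.
y_combined = bn(x.reshape(batch * time, channels)).reshape(x.shape)

# Both paths pool statistics over batch and time for each channel.
same_result = torch.allclose(y_transposed, y_combined, atol=1e-5)
```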

training: bool
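
The training flag is inherited from torch.nn.Module and is toggled by train() and eval(). For batch normalization it decides which statistics are used. The sketch below shows this with torch.nn.BatchNorm1d, assuming the SpeechBrain wrapper follows the same behavior.

```python
import torch

torch.manual_seed(0)
x = torch.randn(100, 10) * 3 + 5  # data with non-zero mean, non-unit variance

bn = torch.nn.BatchNorm1d(10, affine=False, track_running_stats=True)

bn.train()
flag_in_train = bn.training  # True
_ = bn(x)                    # uses batch statistics, updates running ones

bn.eval()
flag_in_eval = bn.training   # False
y_eval = bn(x)               # uses the stored running statistics

# In eval mode the output depends only on running_mean and running_var.
y_expected = (x - bn.running_mean) / torch.sqrt(bn.running_var + bn.eps)
eval_uses_running_stats = torch.allclose(y_eval, y_expected, atol=1e-5)
```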

class speechbrain.nnet.normalization.BatchNorm2d(input_shape=None, input_size=None, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)

Bases: torch.nn.modules.module.Module

Applies 2d batch normalization to the input tensor.

Parameters

input_shape (tuple) – The expected shape of the input. Alternatively, use input_size.
input_size (int) – The expected size of the input. Alternatively, use input_shape.
eps (float) – A small value added to the variance estimate in the denominator to improve numerical stability.
momentum (float) – The weight given to the current batch statistics when updating running_mean and running_var.
affine (bool) – When set to True, the affine parameters are learned.
track_running_stats (bool) – When set to True, this module tracks the running mean and variance, and when set to False, this module does not track such statistics.

Example

>>> input = torch.randn(100, 10, 5, 20)
>>> norm = BatchNorm2d(input_shape=input.shape)
>>> output = norm(input)
>>> output.shape
torch.Size([100, 10, 5, 20])
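
What "2d batch normalization" means for a (batch, time, channel1, channel2) tensor can be sketched with torch.nn.BatchNorm2d. The sketch assumes the last axis is treated as the channel axis and is moved to position 1 before normalization; that layout choice is an assumption of this example, not something stated above.

```python
import torch

torch.manual_seed(0)
x = torch.randn(100, 10, 5, 20)  # (batch, time, channel1, channel2)

# One mean/variance pair per entry of the last axis (assumed channel axis).
bn = torch.nn.BatchNorm2d(20, affine=False)
bn.train()

# (100, 10, 5, 20) -> (100, 20, 5, 10), normalize, then restore the layout.
y = bn(x.transpose(-1, 1)).transpose(1, -1)

# Pooled over all other axes, each channel is close to zero mean, unit variance.
per_channel_mean = y.mean(dim=(0, 1, 2))
per_channel_var = y.var(dim=(0, 1, 2), unbiased=False)

mean_is_zero = torch.allclose(per_channel_mean, torch.zeros(20), atol=1e-4)
var_is_one = torch.allclose(per_channel_var, torch.ones(20), atol=1e-3)
```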

forward(x)

Returns the normalized input tensor.

Parameters: x (torch.Tensor (batch, time, channel1, channel2)) – input to normalize. 4d tensors are expected.

training: bool

class speechbrain.nnet.normalization.LayerNorm(input_size=None, input_shape=None, eps=1e-05, elementwise_affine=True)

Bases: torch.nn.modules.module.Module

Applies layer normalization to the input tensor.

Parameters

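input_size (int) – The expected size of the input. Alternatively, use input_shape.
input_shape (tuple) – The expected shape of the input. Alternatively, use input_size.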
eps (float) – A small value added to the variance estimate in the denominator to improve numerical stability.
elementwise_affine (bool) – If True, this module has learnable per-element affine parameters initialized to ones (for weights) and zeros (for biases).

Example

>>> input = torch.randn(100, 101, 128)
>>> norm = LayerNorm(input_shape=input.shape)
>>> output = norm(input)
>>> output.shape
torch.Size([100, 101, 128])
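
Unlike batch normalization, layer normalization computes its statistics separately for every position, over the feature axis only. The sketch below checks this with torch.nn.LayerNorm on the last axis, which is assumed to mirror what the SpeechBrain class does for a (batch, time, channels) input. It also shows the initial values of the elementwise affine parameters.

```python
import torch

torch.manual_seed(0)
x = torch.randn(100, 101, 128)  # (batch, time, channels)

ln = torch.nn.LayerNorm(128, eps=1e-5, elementwise_affine=True)
y = ln(x)

# Each (batch, time) position is normalized over its 128 features.
per_position_mean = y.mean(dim=-1)
per_position_var = y.var(dim=-1, unbiased=False)

mean_is_zero = torch.allclose(per_position_mean, torch.zeros(100, 101), atol=1e-5)
var_is_one = torch.allclose(per_position_var, torch.ones(100, 101), atol=1e-3)

# The affine parameters start as the identity: weights of one, biases of zero.
weight_is_ones = torch.equal(ln.weight.detach(), torch.ones(128))
bias_is_zeros = torch.equal(ln.bias.detach(), torch.zeros(128))
```

Because no statistics are shared across the batch, the result for one utterance does not depend on the others in the same batch.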

forward(x)

Returns the normalized input tensor.

Parameters: x (torch.Tensor (batch, time, channels)) – input to normalize. 3d or 4d tensors are expected.

training: bool

class speechbrain.nnet.normalization.InstanceNorm1d(input_shape=None, input_size=None, eps=1e-05, momentum=0.1, track_running_stats=True, affine=False)

Bases: torch.nn.modules.module.Module

Applies 1d instance normalization to the input tensor.

Parameters

input_shape (tuple) – The expected shape of the input. Alternatively, use input_size.
input_size (int) – The expected size of the input. Alternatively, use input_shape.
eps (float) – A small value added to the variance estimate in the denominator to improve numerical stability.
momentum (float) – The weight given to the current statistics when updating running_mean and running_var.
track_running_stats (bool) – When set to True, this module tracks the running mean and variance, and when set to False, this module does not track such statistics.
affine (bool) – When set to True, this module has learnable affine parameters, initialized the same way as for batch normalization. Default: False.

Example

>>> input = torch.randn(100, 10, 20)
>>> norm = InstanceNorm1d(input_shape=input.shape)
>>> output = norm(input)
>>> output.shape
torch.Size([100, 10, 20])
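
Instance normalization computes one mean and variance per sample and per channel, over the time axis. The sketch below verifies this with torch.nn.InstanceNorm1d, assuming the (batch, time, channels) input is transposed to channels-first before the call, as PyTorch requires.

```python
import torch

torch.manual_seed(0)
x = torch.randn(100, 10, 20)  # (batch, time, channels)

inorm = torch.nn.InstanceNorm1d(20, eps=1e-5, affine=False)
inorm.train()

# (batch, time, channels) -> (batch, channels, time), normalize, then restore.
y = inorm(x.transpose(-1, 1)).transpose(1, -1)

# For every sample and channel, the values along time have ~zero mean
# and ~unit (biased) variance.
mean_over_time = y.mean(dim=1)
var_over_time = y.var(dim=1, unbiased=False)

mean_is_zero = torch.allclose(mean_over_time, torch.zeros(100, 20), atol=1e-4)
var_is_one = torch.allclose(var_over_time, torch.ones(100, 20), atol=1e-3)
```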

forward(x)

Returns the normalized input tensor.

Parameters: x (torch.Tensor (batch, time, channels)) – input to normalize. 3d tensors are expected.

training: bool

class speechbrain.nnet.normalization.InstanceNorm2d(input_shape=None, input_size=None, eps=1e-05, momentum=0.1, track_running_stats=True, affine=False)

Bases: torch.nn.modules.module.Module

Applies 2d instance normalization to the input tensor.

Parameters

input_shape (tuple) – The expected shape of the input. Alternatively, use input_size.
input_size (int) – The expected size of the input. Alternatively, use input_shape.
eps (float) – A small value added to the variance estimate in the denominator to improve numerical stability.
momentum (float) – The weight given to the current statistics when updating running_mean and running_var.
track_running_stats (bool) – When set to True, this module tracks the running mean and variance, and when set to False, this module does not track such statistics.
affine (bool) – When set to True, this module has learnable affine parameters, initialized the same way as for batch normalization. Default: False.

Example

>>> input = torch.randn(100, 10, 20, 2)
>>> norm = InstanceNorm2d(input_shape=input.shape)
>>> output = norm(input)
>>> output.shape
torch.Size([100, 10, 20, 2])
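
A practical consequence of instance normalization is that, in training mode, each sample is normalized independently of the rest of the batch. The sketch below demonstrates this with torch.nn.InstanceNorm2d; treating the last axis as the channel axis is an assumption of this example, and normalize is a hypothetical helper defined only for illustration.

```python
import torch

torch.manual_seed(0)
x = torch.randn(4, 10, 20, 2)  # (batch, time, channel1, channel2)

inorm = torch.nn.InstanceNorm2d(2, affine=False)
inorm.train()


def normalize(t):
    """Hypothetical helper: channels-last in, channels-last out."""
    return inorm(t.transpose(-1, 1)).transpose(1, -1)


y_full_batch = normalize(x)    # all four samples together
y_first_only = normalize(x[:1])  # the first sample on its own

# The first sample gets the same output in both cases.
batch_independent = torch.allclose(y_full_batch[:1], y_first_only, atol=1e-6)
```

The same check would fail for batch normalization in training mode, where the statistics are shared across the batch.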

forward(x)

Returns the normalized input tensor.

Parameters: x (torch.Tensor (batch, time, channel1, channel2)) – input to normalize. 4d tensors are expected.

training: bool