speechbrain.lobes.models.convolution module

This module assembles a (depthwise) convolutional encoder with or without residual connections.

Authors

Jianyuan Zhong 2020

Summary

Classes:

`ConvBlock`	An implementation of a convolution block with 1d or 2d (depthwise) convolutions.
`ConvolutionFrontEnd`	Assembles a (depthwise) convolutional encoder with or without residual connections.

Reference

class speechbrain.lobes.models.convolution.ConvolutionFrontEnd(input_shape, num_blocks=3, num_layers_per_block=5, out_channels=[128, 256, 512], kernel_sizes=[3, 3, 3], strides=[1, 2, 2], dilations=[1, 1, 1], residuals=[True, True, True], conv_module=<class 'speechbrain.nnet.CNN.Conv2d'>, activation=<class 'torch.nn.modules.activation.LeakyReLU'>, norm=<class 'speechbrain.nnet.normalization.LayerNorm'>, dropout=0.1)

Bases: Sequential

Assembles a (depthwise) convolutional encoder with or without residual connections.

Arguments

input_shape: tuple: Shape of the expected input tensor.
num_blocks: int: Number of blocks (default 3).
num_layers_per_block: int: Number of convolution layers in each block (default 5).
out_channels: Optional(list[int]): Number of output channels for each block (default [128, 256, 512]).
kernel_sizes: Optional(list[int]): Kernel size of the convolution layers in each block (default [3, 3, 3]).
strides: Optional(list[int]): Striding factor for each block; the stride is applied at the last convolution layer of each block (default [1, 2, 2]).
dilations: Optional(list[int]): Dilation factor for each block (default [1, 1, 1]).
residuals: Optional(list[bool]): Whether to apply a residual connection at each block (default [True, True, True]).
conv_module: class: Convolution class used to build the layers (default speechbrain.nnet.CNN.Conv2d).
activation: torch class: Activation function for each block (default LeakyReLU).
norm: torch class: Normalization to regularize the model (default LayerNorm).
dropout: float: Dropout (default 0.1).
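
The per-block list arguments can be easier to read once expanded into one entry per convolution layer. The sketch below is plain Python and does not call SpeechBrain; `LayerPlan` and `plan_layers` are hypothetical names introduced for illustration. It assumes that every layer in a block shares that block's channel count and kernel size, and it applies the block's stride only at the last layer of the block, as the `strides` description states. The length check is an addition of this sketch, not documented library behavior.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LayerPlan:
    """Hypothetical record for one convolution layer (not a SpeechBrain type)."""
    block: int
    layer: int
    out_channels: int
    kernel_size: int
    stride: int


def plan_layers(num_blocks=3, num_layers_per_block=5,
                out_channels=(128, 256, 512), kernel_sizes=(3, 3, 3),
                strides=(1, 2, 2)):
    """Expand per-block lists into one LayerPlan per convolution layer."""
    # Assumption of this sketch: each list must have one entry per block.
    for name, values in (("out_channels", out_channels),
                         ("kernel_sizes", kernel_sizes),
                         ("strides", strides)):
        if len(values) != num_blocks:
            raise ValueError(
                f"{name} needs {num_blocks} entries, got {len(values)}")

    plan = []
    for b in range(num_blocks):
        for i in range(num_layers_per_block):
            is_last = i == num_layers_per_block - 1
            # The block's stride is used only at its last layer; others use 1.
            plan.append(LayerPlan(b, i, out_channels[b], kernel_sizes[b],
                                  strides[b] if is_last else 1))
    return plan


plan = plan_layers()
print(len(plan))                                  # 15
print([p.stride for p in plan if p.block == 1])   # [1, 1, 1, 1, 2]
```

With the default arguments this yields 3 blocks of 5 layers each, and only the final layer of the second and third blocks downsamples.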

Example

>>> import torch
>>> from speechbrain.lobes.models.convolution import ConvolutionFrontEnd
>>> x = torch.rand((8, 30, 10))
>>> conv = ConvolutionFrontEnd(input_shape=x.shape)
>>> out = conv(x)
>>> out.shape
torch.Size([8, 8, 3, 512])
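
The output shape in the example above can be reproduced with simple arithmetic. The sketch below is plain Python and does not call SpeechBrain; `front_end_output_shape` is a hypothetical helper. It assumes "same"-style padding, so that a strided layer maps a length n to ceil(n / stride), and it assumes the stride acts on both the time and feature axes and that the channel axis is appended last. These assumptions are consistent with the shape printed above but are not taken from the library source.

```python
import math


def front_end_output_shape(input_shape, out_channels, strides):
    """Estimate the output shape for a (batch, time, features) input."""
    batch, time, feats = input_shape
    for stride in strides:
        # Assumed 'same'-style padding: length n becomes ceil(n / stride).
        time = math.ceil(time / stride)
        feats = math.ceil(feats / stride)
    # The channel count of the last block becomes the trailing dimension.
    return (batch, time, feats, out_channels[-1])


print(front_end_output_shape((8, 30, 10), [128, 256, 512], [1, 2, 2]))
# (8, 8, 3, 512)
```

Here the time axis goes 30, 30, 15, 8 and the feature axis goes 10, 10, 5, 3 across the three blocks.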

class speechbrain.lobes.models.convolution.ConvBlock(num_layers, out_channels, input_shape, kernel_size=3, stride=1, dilation=1, residual=False, conv_module=<class 'speechbrain.nnet.CNN.Conv2d'>, activation=<class 'torch.nn.modules.activation.LeakyReLU'>, norm=None, dropout=0.1)

Bases: Module

An implementation of a convolution block with 1d or 2d (depthwise) convolutions.

Parameters

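num_layers (int) – Number of depthwise convolution layers for this block.
out_channels (int) – Number of output channels of this block.
input_shape (tuple) – Shape of the expected input tensor.
kernel_size (int) – Kernel size of convolution layers (default 3).
stride (int) – Striding factor for this block (default 1).
dilation (int) – Dilation factor (default 1).
residual (bool) – Whether to apply a residual connection at this block (default False).
conv_module (class) – Convolution class used to build the layers (default speechbrain.nnet.CNN.Conv2d).
activation (torch class) – Activation function for this block (default LeakyReLU).
norm (torch class) – Normalization to regularize the model (default None).
dropout (float) – Dropout (default 0.1).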

Example

>>> import torch
>>> from speechbrain.lobes.models.convolution import ConvBlock
>>> x = torch.rand((8, 30, 10))
>>> conv = ConvBlock(2, 16, input_shape=x.shape)
>>> out = conv(x)
>>> out.shape
torch.Size([8, 30, 10, 16])

training: bool

forward(x): Processes the input tensor x and returns the output tensor.