speechbrain.lobes.models.ResNet module
ResNet PreActivated for speaker verification
- Authors
Mickael Rouvier 2022
Summary
Classes:
BasicBlock – An implementation of ResNet Block.
Classifier – This class implements cosine similarity on top of the features.
ResNet – An implementation of ResNet
SEBasicBlock – An implementation of Squeeze-and-Excitation ResNet Block.
SEBlock – An implementation of Squeeze-and-Excitation Block.
Functions:
conv1x1 – 2D convolution with kernel_size = 1
conv3x3 – 2D convolution with kernel_size = 3
Reference
- speechbrain.lobes.models.ResNet.conv3x3(in_planes, out_planes, stride=1)[source]
2D convolution with kernel_size = 3
- speechbrain.lobes.models.ResNet.conv1x1(in_planes, out_planes, stride=1)[source]
2D convolution with kernel_size = 1
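Assuming these helpers wrap torch.nn.Conv2d in the usual ResNet fashion (padding of 1 for the 3x3 kernel, no padding for the 1x1 kernel), a hedged usage sketch:
>>> import torch
>>> from speechbrain.lobes.models.ResNet import conv1x1, conv3x3
>>> x = torch.rand(1, 64, 80, 40)            # [batch, channels, freq, time]
>>> conv3x3(64, 64)(x).shape                 # padding keeps the spatial dims
torch.Size([1, 64, 80, 40])
>>> conv1x1(64, 128, stride=2)(x).shape      # channel projection + downsampling
torch.Size([1, 128, 40, 20])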
- class speechbrain.lobes.models.ResNet.SEBlock(channels, reduction=1, activation=<class 'torch.nn.modules.activation.ReLU'>)[source]
Bases: Module
An implementation of Squeeze-and-Excitation Block.
- Parameters:
channels (int) – Number of input channels.
reduction (int) – The reduction factor of channels.
activation (torch class) – A class for constructing the activation layers.
Example
>>> inp_tensor = torch.rand([1, 64, 80, 40])
>>> se_layer = SEBlock(64)
>>> out_tensor = se_layer(inp_tensor)
>>> out_tensor.shape
torch.Size([1, 64, 80, 40])
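Not SpeechBrain's exact implementation, but a minimal sketch of what a squeeze-and-excitation block computes: squeeze the feature map to per-channel statistics, pass them through a bottleneck, and use sigmoid gates to rescale the channels (the reduction of 4 below is an arbitrary choice for illustration):
>>> import torch
>>> x = torch.rand(1, 64, 80, 40)
>>> s = torch.nn.functional.adaptive_avg_pool2d(x, 1)    # squeeze: [1, 64, 1, 1]
>>> fc1 = torch.nn.Conv2d(64, 64 // 4, kernel_size=1)    # bottleneck
>>> fc2 = torch.nn.Conv2d(64 // 4, 64, kernel_size=1)
>>> gates = torch.sigmoid(fc2(torch.relu(fc1(s))))       # per-channel gates in (0, 1)
>>> (x * gates).shape                                     # excitation: rescale each channel
torch.Size([1, 64, 80, 40])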
- class speechbrain.lobes.models.ResNet.BasicBlock(in_channels, out_channels, stride=1, downsample=None, activation=<class 'torch.nn.modules.activation.ReLU'>)[source]
Bases: Module
An implementation of ResNet Block.
- Parameters:
in_channels (int) – Number of input channels.
out_channels (int) – Number of output channels.
stride (int) – Factor that reduces the spatial dimensionality.
downsample (torch function) – A function that downsamples the identity of the block when stride != 1.
activation (torch class) – A class for constructing the activation layers.
Example
>>> inp_tensor = torch.rand([1, 64, 80, 40])
>>> layer = BasicBlock(64, 64, stride=1)
>>> out_tensor = layer(inp_tensor)
>>> out_tensor.shape
torch.Size([1, 64, 80, 40])
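When stride != 1 or the channel count changes, the identity branch has to be projected to match the main branch, and downsample can be any module or callable that does this. The following is a hedged sketch; the conv1x1 plus batch-norm projection is an assumption about one valid choice, not necessarily the one used in the recipes:
>>> import torch
>>> import torch.nn as nn
>>> from speechbrain.lobes.models.ResNet import BasicBlock, conv1x1
>>> inp_tensor = torch.rand([1, 64, 80, 40])
>>> downsample = nn.Sequential(conv1x1(64, 128, stride=2), nn.BatchNorm2d(128))
>>> layer = BasicBlock(64, 128, stride=2, downsample=downsample)
>>> layer(inp_tensor).shape                      # spatial dims halved, channels doubled
torch.Size([1, 128, 40, 20])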
- class speechbrain.lobes.models.ResNet.SEBasicBlock(in_channels, out_channels, reduction=1, stride=1, downsample=None, activation=<class 'torch.nn.modules.activation.ReLU'>)[source]
Bases: Module
An implementation of Squeeze-and-Excitation ResNet Block.
- Parameters:
in_channels (int) – Number of input channels.
out_channels (int) – Number of output channels.
reduction (int) – The reduction factor of channels.
stride (int) – Factor that reduces the spatial dimensionality.
downsample (torch function) – A function that downsamples the identity of the block when stride != 1.
activation (torch class) – A class for constructing the activation layers.
Example
>>> inp_tensor = torch.rand([1, 64, 80, 40])
>>> layer = SEBasicBlock(64, 64, stride=1)
>>> out_tensor = layer(inp_tensor)
>>> out_tensor.shape
torch.Size([1, 64, 80, 40])
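The squeeze-and-excitation literature commonly uses a reduction factor larger than the default of 1; a hedged usage sketch assuming the reduction divides the channel count, as is standard:
>>> import torch
>>> from speechbrain.lobes.models.ResNet import SEBasicBlock
>>> inp_tensor = torch.rand([1, 64, 80, 40])
>>> layer = SEBasicBlock(64, 64, reduction=16)   # SE bottleneck of 64 // 16 = 4 channels
>>> layer(inp_tensor).shape
torch.Size([1, 64, 80, 40])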
- class speechbrain.lobes.models.ResNet.ResNet(input_size=80, device='cpu', activation=<class 'torch.nn.modules.activation.ReLU'>, channels=[128, 128, 256, 256], block_sizes=[3, 4, 6, 3], strides=[1, 2, 2, 2], lin_neurons=256)[source]
Bases: Module
An implementation of ResNet
- Parameters:
input_size (int) – Expected size of the input dimension.
device (str) – Device used, e.g., 'cpu' or 'cuda'.
activation (torch class) – A class for constructing the activation layers.
channels (list of ints) – Number of channels used per stage.
block_sizes (list of ints) – Number of blocks created per stage.
strides (list of ints) – Stride per stage.
lin_neurons (int) – Number of neurons in linear layers.
Example
>>> input_feats = torch.rand([2, 400, 80])
>>> compute_embedding = ResNet(lin_neurons=256)
>>> outputs = compute_embedding(input_feats)
>>> outputs.shape
torch.Size([2, 256])
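In practice the model consumes filterbank features rather than raw audio. A hedged end-to-end sketch assuming speechbrain.lobes.features.Fbank with n_mels=80 as the front end (a sketch, not a verified recipe):
>>> import torch
>>> from speechbrain.lobes.features import Fbank
>>> from speechbrain.lobes.models.ResNet import ResNet
>>> signal = torch.rand(2, 32000)                 # two ~2 s utterances at 16 kHz
>>> feats = Fbank(n_mels=80)(signal)              # [batch, frames, 80]
>>> embedding_model = ResNet(input_size=80, lin_neurons=256)
>>> embedding_model(feats).shape                  # one speaker embedding per utterance
torch.Size([2, 256])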
- class speechbrain.lobes.models.ResNet.Classifier(input_size, device='cpu', lin_blocks=0, lin_neurons=256, out_neurons=1211)[source]
Bases: Module
This class implements cosine similarity on top of the features.
- Parameters:
input_size (int) – Expected size of the input dimension.
device (str) – Device used, e.g., 'cpu' or 'cuda'.
lin_blocks (int) – Number of linear layers.
lin_neurons (int) – Number of neurons in linear layers.
out_neurons (int) – Number of output neurons (number of classes).
Example
>>> classify = Classifier(input_size=2, lin_neurons=2, out_neurons=2)
>>> outputs = torch.tensor([[1., -1.], [-9., 1.], [0.9, 0.1], [0.1, 0.9]])
>>> outputs = outputs.unsqueeze(1)
>>> cos = classify(outputs)
>>> (cos < -1.0).long().sum()
tensor(0)
>>> (cos > 1.0).long().sum()
tensor(0)
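A hedged sketch of how the Classifier might sit on top of ResNet embeddings for speaker identification; the default out_neurons=1211 matches the VoxCeleb1 speaker count, and during training the cosine scores are typically scaled and fed to a margin-based softmax loss (an assumption about usage, not something stated in this page):
>>> import torch
>>> from speechbrain.lobes.models.ResNet import ResNet, Classifier
>>> embedding_model = ResNet(lin_neurons=256)
>>> classify = Classifier(input_size=256, out_neurons=1211)
>>> feats = torch.rand(2, 400, 80)                # [batch, frames, n_mels]
>>> embeddings = embedding_model(feats)           # [2, 256]
>>> scores = classify(embeddings.unsqueeze(1))    # one cosine score per speaker class
>>> ((scores >= -1.0) & (scores <= 1.0)).all()    # scores are bounded like cosines
tensor(True)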