speechbrain.lobes.models.Xvector module
A popular speaker recognition and diarization model.
- Authors
Nauman Dawalatabad 2020
Mirco Ravanelli 2020
Summary
Classes:
Classifier | This class implements the last MLP on top of x-vector features.
Discriminator | This class implements a discriminator on top of x-vector features.
Xvector | This model extracts x-vectors for speaker recognition and diarization.
Reference
- class speechbrain.lobes.models.Xvector.Xvector(device='cpu', activation=<class 'torch.nn.modules.activation.LeakyReLU'>, tdnn_blocks=5, tdnn_channels=[512, 512, 512, 512, 1500], tdnn_kernel_sizes=[5, 3, 3, 1, 1], tdnn_dilations=[1, 2, 3, 1, 1], lin_neurons=512, in_channels=40)
Bases: Module
This model extracts x-vectors for speaker recognition and diarization.
- Parameters:
device (str) – Device to use, e.g. "cpu" or "cuda".
activation (torch class) – A class for constructing the activation layers.
tdnn_blocks (int) – Number of time-delay neural network (TDNN) layers.
tdnn_channels (list of ints) – List of output channels for each TDNN layer.
tdnn_kernel_sizes (list of ints) – List of kernel sizes for each TDNN layer.
tdnn_dilations (list of ints) – List of kernel dilations for each TDNN layer.
lin_neurons (int) – Number of neurons in the linear layers.
in_channels (int) – Expected size of the input features.
Example
>>> import torch
>>> from speechbrain.lobes.models.Xvector import Xvector
>>> compute_xvect = Xvector("cpu")
>>> input_feats = torch.rand([5, 10, 40])
>>> outputs = compute_xvect(input_feats)
>>> outputs.shape
torch.Size([5, 1, 512])
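The default tdnn_kernel_sizes and tdnn_dilations determine how many input frames contribute to each TDNN output frame. The sketch below works that out in plain Python. It assumes every TDNN layer behaves like a stride-1 dilated 1-D convolution, which the parameter list above does not state, and receptive_field is a hypothetical helper, not part of SpeechBrain.

```python
def receptive_field(kernel_sizes, dilations):
    """Number of input frames seen by one output frame.

    Assumes stride-1 dilated 1-D convolutions applied in order
    (an assumption, not confirmed by the documentation above).
    """
    if len(kernel_sizes) != len(dilations):
        raise ValueError("kernel_sizes and dilations must have the same length")
    field = 1
    for kernel, dilation in zip(kernel_sizes, dilations):
        # Each layer widens the context by (kernel - 1) * dilation frames.
        field += (kernel - 1) * dilation
    return field


# Default values copied from the Xvector signature.
default_kernel_sizes = [5, 3, 3, 1, 1]
default_dilations = [1, 2, 3, 1, 1]
context = receptive_field(default_kernel_sizes, default_dilations)
print(context)  # 15
```

Under that assumption, the last two layers (kernel size 1) add no temporal context; only the first three layers widen the window.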
- forward(x, lens=None)
Returns the x-vectors.
- Parameters:
x (torch.Tensor) – Input features for extracting x-vectors.
lens (torch.Tensor) – The corresponding relative lengths of the inputs.
- Returns:
x – X-vectors.
- Return type:
torch.Tensor
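The lens argument holds relative rather than absolute lengths. A minimal sketch of one way to derive them, assuming each value is an utterance's frame count divided by the longest frame count in the padded batch, so the longest utterance gets 1.0. That convention is an assumption here, and relative_lengths is a hypothetical helper.

```python
def relative_lengths(frame_counts):
    """Convert absolute frame counts to fractions of the longest item.

    Assumes "relative length" means count / max(count) within the batch
    (an assumption; the documentation above does not define it).
    """
    if not frame_counts:
        raise ValueError("frame_counts must not be empty")
    longest = max(frame_counts)
    if longest <= 0:
        raise ValueError("the longest item must have at least one frame")
    return [count / longest for count in frame_counts]


# Three utterances padded to 10 frames in one batch.
lens = relative_lengths([10, 5, 8])
print(lens)  # [1.0, 0.5, 0.8]
```

The resulting list could then be wrapped with torch.tensor(lens) and passed as the lens argument of forward.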
- class speechbrain.lobes.models.Xvector.Classifier(input_shape, activation=<class 'torch.nn.modules.activation.LeakyReLU'>, lin_blocks=1, lin_neurons=512, out_neurons=1211)
Bases: Sequential
This class implements the last MLP on top of x-vector features.
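- Parameters:
input_shape (tuple) – Expected shape of the input.
activation (torch class) – A class for constructing the activation layers.
lin_blocks (int) – Number of linear layers.
lin_neurons (int) – Number of neurons in the linear layers.
out_neurons (int) – Number of output neurons.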
Example
>>> import torch
>>> from speechbrain.lobes.models.Xvector import Xvector, Classifier
>>> input_feats = torch.rand([5, 10, 40])
>>> compute_xvect = Xvector()
>>> xvects = compute_xvect(input_feats)
>>> classify = Classifier(input_shape=xvects.shape)
>>> output = classify(xvects)
>>> output.shape
torch.Size([5, 1, 1211])
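The classifier output in the example has shape [5, 1, 1211]: one score per output neuron for each input. The sketch below turns such scores into a predicted class index, using nested lists in place of tensors so it runs without any dependencies. Treating the largest score as the predicted class is an assumption, and predict_classes is a hypothetical helper.

```python
def predict_classes(scores):
    """Pick the highest-scoring class for each item.

    scores: nested list shaped [batch, 1, out_neurons].
    Assumes a larger score means a more likely class (an assumption).
    """
    predictions = []
    for item in scores:
        class_scores = item[0]  # drop the singleton middle dimension
        best = max(range(len(class_scores)), key=class_scores.__getitem__)
        predictions.append(best)
    return predictions


# Toy batch: 2 items, 3 classes (instead of 1211).
toy_scores = [[[-2.3, -0.1, -4.0]], [[-0.2, -3.1, -1.9]]]
print(predict_classes(toy_scores))  # [1, 0]
```

With real tensors, output.argmax(dim=-1) performs the same selection.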
- class speechbrain.lobes.models.Xvector.Discriminator(input_shape, activation=<class 'torch.nn.modules.activation.LeakyReLU'>, lin_blocks=1, lin_neurons=512, out_neurons=1)
Bases: Sequential
This class implements a discriminator on top of x-vector features.
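- Parameters:
input_shape (tuple) – Expected shape of the input.
activation (torch class) – A class for constructing the activation layers.
lin_blocks (int) – Number of linear layers.
lin_neurons (int) – Number of neurons in the linear layers.
out_neurons (int) – Size of the output vector.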
Example
>>> import torch
>>> from speechbrain.lobes.models.Xvector import Xvector, Discriminator
>>> input_feats = torch.rand([5, 10, 40])
>>> compute_xvect = Xvector()
>>> xvects = compute_xvect(input_feats)
>>> discriminate = Discriminator(xvects.shape)
>>> output = discriminate(xvects)
>>> output.shape
torch.Size([5, 1, 1])
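With the default out_neurons=1, the discriminator yields a single value per input, as the [5, 1, 1] shape in the example shows. One possible way to read that value as a binary decision is sketched below. It assumes the output is an unnormalized score (a logit), which the documentation above does not confirm, and scores_to_probabilities is a hypothetical helper that works on nested lists rather than tensors.

```python
import math


def sigmoid(value):
    """Numerically stable logistic function."""
    if value >= 0:
        return 1.0 / (1.0 + math.exp(-value))
    exp_value = math.exp(value)
    return exp_value / (1.0 + exp_value)


def scores_to_probabilities(scores):
    """Map each raw score to a probability in (0, 1).

    scores: nested list shaped [batch, 1, 1].
    Assumes the scores are logits (an assumption).
    """
    return [sigmoid(item[0][0]) for item in scores]


# Toy batch of three discriminator outputs.
toy_outputs = [[[0.0]], [[2.0]], [[-2.0]]]
probabilities = scores_to_probabilities(toy_outputs)
decisions = [p >= 0.5 for p in probabilities]
print(decisions)  # [True, True, False]
```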