speechbrain.nnet.pooling module
Library implementing pooling.
- Authors
Titouan Parcollet 2020
Mirco Ravanelli 2020
Nauman Dawalatabad 2020
Jianyuan Zhong 2020
Summary
Classes:
AdaptivePool | This class implements the adaptive average pooling.
Pooling1d | This class implements 1d pooling of the input tensor.
Pooling2d | This class implements 2d pooling of the input tensor.
StatisticsPooling | This class implements a statistics pooling layer.
Reference
- class speechbrain.nnet.pooling.Pooling1d(pool_type, kernel_size, input_dims=3, pool_axis=1, ceil_mode=False, padding=0, dilation=1, stride=None)
Bases:
torch.nn.modules.module.Module
This class implements 1d pooling of the input tensor.
- Parameters
pool_type (str) – The type of pooling function to use ('avg' or 'max').
kernel_size (int) – The kernel size, which defines the width of the pooling window. For instance, kernel_size=3 pools over windows of 3 elements.
input_dims (int) – The count of dimensions expected in the input.
pool_axis (int) – The axis where the pooling is applied.
stride (int) – The stride size. Defaults to None, in which case the underlying PyTorch pooling layer uses kernel_size as the stride.
padding (int) – It is the number of padding elements to apply.
dilation (int) – Controls the dilation factor of pooling.
ceil_mode (bool) – When True, uses ceil instead of floor to compute the output shape.
Example
>>> import torch
>>> from speechbrain.nnet.pooling import Pooling1d
>>> pool = Pooling1d('max', 3)
>>> inputs = torch.rand(10, 12, 40)
>>> output = pool(inputs)
>>> output.shape
torch.Size([10, 4, 40])
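The output shape in the example (12 elements reduced to 4 along the pooled axis with kernel_size=3) follows from the usual pooling length arithmetic. The sketch below is plain Python and does not call SpeechBrain. `pooled_length` is a hypothetical helper name, and the sketch assumes that Pooling1d follows the output-length rule documented for PyTorch's 1d pooling layers, with the stride falling back to kernel_size when it is None.

```python
def pooled_length(length, kernel_size, stride=None, padding=0,
                  dilation=1, ceil_mode=False):
    """Length of the pooled axis after 1d pooling (hypothetical helper).

    Assumption: the rule matches the one documented for PyTorch pooling:
    out = round((L + 2*padding - dilation*(kernel_size - 1) - 1) / stride) + 1
    where round is floor by default and ceil when ceil_mode is True.
    Simplification: PyTorch also drops a final window that would start
    beyond the input and its left padding; that corner case is not modelled.
    """
    if stride is None:
        stride = kernel_size  # assumed default, as in PyTorch pooling layers
    span = length + 2 * padding - dilation * (kernel_size - 1) - 1
    if ceil_mode:
        steps = -(-span // stride)  # integer ceiling division
    else:
        steps = span // stride      # integer floor division
    return steps + 1


print(pooled_length(12, 3))                  # 4
print(pooled_length(13, 3))                  # 4
print(pooled_length(13, 3, ceil_mode=True))  # 5
```

With 13 input elements, the floor rule discards the trailing element while ceil_mode=True keeps a final partial window. Applied per pooled axis, the same arithmetic is consistent with the Pooling2d example on this page (15 to 3 with a kernel of 5, and 12 to 4 with a kernel of 3).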
- class speechbrain.nnet.pooling.Pooling2d(pool_type, kernel_size, pool_axis=(1, 2), ceil_mode=False, padding=0, dilation=1, stride=None)
Bases:
torch.nn.modules.module.Module
This class implements 2d pooling of the input tensor.
- Parameters
pool_type (str) – The type of pooling function to use ('avg' or 'max').
pool_axis (tuple) – The two axes over which pooling is applied.
kernel_size (int or tuple) – The kernel size, which defines the pooling window. For instance, kernel_size=(3, 3) performs 2D pooling with a 3x3 kernel.
stride (int) – It is the stride size.
padding (int) – It is the number of padding elements to apply.
dilation (int) – Controls the dilation factor of pooling.
ceil_mode (bool) – When True, uses ceil instead of floor to compute the output shape.
Example
>>> import torch
>>> from speechbrain.nnet.pooling import Pooling2d
>>> pool = Pooling2d('max', (5, 3))
>>> inputs = torch.rand(10, 15, 12)
>>> output = pool(inputs)
>>> output.shape
torch.Size([10, 3, 4])
- class speechbrain.nnet.pooling.StatisticsPooling
Bases:
torch.nn.modules.module.Module
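This class implements a statistics pooling layer.
It returns the concatenated mean and standard deviation of the input tensor, so the last dimension of the output is twice the feature dimension of the input.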
Example
>>> import torch
>>> from speechbrain.nnet.pooling import StatisticsPooling
>>> inp_tensor = torch.rand([5, 100, 50])
>>> sp_layer = StatisticsPooling()
>>> out_tensor = sp_layer(inp_tensor)
>>> out_tensor.shape
torch.Size([5, 1, 100])
- forward(x, lengths=None)
Calculates mean and std for a batch (input tensor).
- Parameters
x (torch.Tensor) – It represents a tensor for a mini-batch.
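To make the shape change in the example concrete ([5, 100, 50] becoming [5, 1, 100]), here is a minimal plain-Python sketch of statistics pooling on nested lists. It does not use SpeechBrain or torch. `statistics_pool` is a hypothetical name, and the use of the sample standard deviation and of axis 1 as the pooled axis are assumptions inferred from the example, not confirmed details of the layer. The optional lengths argument of forward is not modelled.

```python
from statistics import mean, stdev


def statistics_pool(batch):
    """Pool each utterance over its frames (hypothetical helper).

    batch is a nested list indexed as [utterance][frame][feature].
    Each utterance needs at least two frames, because the sample
    standard deviation is undefined for a single value.
    Returns a nested list of shape [utterance][1][2 * features]:
    the per-feature means followed by the per-feature standard deviations.
    """
    pooled = []
    for utterance in batch:
        # Transpose (frame, feature) into (feature, frame).
        per_feature = list(zip(*utterance))
        means = [mean(values) for values in per_feature]
        stds = [stdev(values) for values in per_feature]  # assumed: sample std
        pooled.append([means + stds])
    return pooled


# One utterance with 3 frames and 2 features.
batch = [[[1.0, 10.0], [3.0, 10.0], [5.0, 10.0]]]
print(statistics_pool(batch))
```

The frame axis collapses to a single entry and the feature axis doubles, which mirrors the shapes in the example above.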
- class speechbrain.nnet.pooling.AdaptivePool(output_size)
Bases:
torch.nn.modules.module.Module
This class implements the adaptive average pooling.
- Parameters
output_size (int) – The size of the output.
Example
>>> import torch
>>> from speechbrain.nnet.pooling import AdaptivePool
>>> pool = AdaptivePool(1)
>>> inp = torch.randn([8, 120, 40])
>>> output = pool(inp)
>>> output.shape
torch.Size([8, 1, 40])
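Adaptive pooling fixes the output size and derives the windows from it, instead of fixing the kernel size. The plain-Python sketch below illustrates this along a single axis. `adaptive_avg_pool_1d` is a hypothetical name, and the window boundaries follow the rule used by PyTorch's adaptive average pooling (start at floor(i * L / out), end at ceil((i + 1) * L / out)). That AdaptivePool behaves this way along its pooled axis is an assumption, not something this page states.

```python
def adaptive_avg_pool_1d(values, output_size):
    """Average `values` into `output_size` bins (hypothetical helper).

    Assumed window rule (as in PyTorch adaptive average pooling):
    bin i covers indices floor(i * L / out) up to, but not including,
    ceil((i + 1) * L / out). Bins may overlap when L is not a
    multiple of output_size.
    """
    length = len(values)
    pooled = []
    for i in range(output_size):
        start = (i * length) // output_size
        end = ((i + 1) * length + output_size - 1) // output_size  # ceiling
        window = values[start:end]
        pooled.append(sum(window) / len(window))
    return pooled


print(adaptive_avg_pool_1d([1, 2, 3, 4, 5, 6], 1))  # [3.5]
print(adaptive_avg_pool_1d([1, 2, 3, 4, 5, 6], 2))  # [2.0, 5.0]
print(adaptive_avg_pool_1d([1, 2, 3, 4, 5], 2))     # [2.0, 4.0]
```

With output_size=1 the result is the mean over the whole axis, which matches the example above where 120 elements along axis 1 are reduced to 1.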