speechbrain.nnet.quaternion_networks.q_pooling module

Library implementing quaternion-valued max and average pooling layers.

Authors
  • Drew Wagner 2024

Summary

Classes:

QPooling2d

This class implements the quaternion average pooling and max pooling by magnitude as described in: "Geometric methods of perceptual organisation for computer vision", Altamirano G.

Reference

class speechbrain.nnet.quaternion_networks.q_pooling.QPooling2d(pool_type, kernel_size, pool_axis=(1, 2), ceil_mode=False, padding=0, dilation=1, stride=None)

Bases: Pooling2d

This class implements the quaternion average pooling and max pooling by magnitude as described in: “Geometric methods of perceptual organisation for computer vision”, Altamirano G.
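
As a rough illustration of what "max pooling by magnitude" means, the sketch below uses plain PyTorch rather than this class. It assumes a hypothetical layout of (batch, 4, height, width) with the four quaternion components on axis 1, which is not necessarily the layout QPooling2d expects, and the function name is made up for this example.

```python
import torch
import torch.nn.functional as F


def quaternion_max_pool_by_magnitude(q, kernel_size):
    """Keep, in each window, the whole quaternion with the largest norm.

    q: tensor of shape (batch, 4, height, width); axis 1 holds the
    components r, i, j, k (assumed layout, for illustration only).
    """
    # Norm of each quaternion: shape (batch, 1, height, width).
    magnitude = q.norm(dim=1, keepdim=True)
    # Positions of the largest magnitudes, as flat indices into height * width.
    _, idx = F.max_pool2d(magnitude, kernel_size, return_indices=True)
    batch, _, out_h, out_w = idx.shape
    # Pick the same position from all four components.
    flat = q.flatten(2)                      # (batch, 4, height * width)
    idx = idx.flatten(2).expand(-1, 4, -1)   # (batch, 4, out_h * out_w)
    return flat.gather(2, idx).view(batch, 4, out_h, out_w)


q = torch.randn(2, 4, 6, 6)
pooled = quaternion_max_pool_by_magnitude(q, kernel_size=2)
print(pooled.shape)
```

Selecting by magnitude keeps the four components of each quaternion together, whereas an ordinary max pool would pick each component independently. Average pooling needs no such step, because averaging each component separately already averages the quaternions.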

Parameters:
  • pool_type (str) – The type of pooling function to use ('avg' or 'max').

  • kernel_size (int or tuple) – The kernel size, which defines the pooling window. For instance, kernel_size=(3, 3) performs 2D pooling with a 3x3 kernel.

  • pool_axis (tuple) – The axes along which pooling is applied.

  • ceil_mode (bool) – When True, will use ceil instead of floor to compute the output shape.

  • padding (int) – The number of padding elements to apply.

  • dilation (int) – Controls the dilation factor of pooling.

  • stride (int) – The stride size.
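
To see how these parameters interact, here is a small sketch of the output length along one pooled axis. It follows the formula used by PyTorch's pooling layers and assumes that stride=None falls back to the kernel size, which is consistent with the shapes in the example on this page. The helper name pooled_length is hypothetical and not part of SpeechBrain.

```python
def pooled_length(length, kernel_size, stride=None, padding=0,
                  dilation=1, ceil_mode=False):
    """Output length along one pooled axis (PyTorch pooling convention)."""
    if stride is None:
        stride = kernel_size  # assumption: stride defaults to the kernel size
    span = length + 2 * padding - dilation * (kernel_size - 1) - 1
    if ceil_mode:
        out = -(-span // stride) + 1  # ceiling division
        # PyTorch drops a last window that would start past the padded input.
        if (out - 1) * stride >= length + padding:
            out -= 1
    else:
        out = span // stride + 1  # floor division
    return out


# Axes of length 15 and 12 pooled with a (5, 3) kernel.
print(pooled_length(15, 5), pooled_length(12, 3))
# Effect of ceil_mode on a length that is not a multiple of the kernel size.
print(pooled_length(16, 5), pooled_length(16, 5, ceil_mode=True))
```

With ceil_mode=True, a final partial window is kept instead of being discarded, so the output can be one element longer along that axis.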

Example

>>> import torch
>>> from speechbrain.nnet.quaternion_networks.q_pooling import QPooling2d
>>> pool = QPooling2d('max', (5, 3))
>>> inputs = torch.rand(10, 15, 12)
>>> output = pool(inputs)
>>> output.shape
torch.Size([10, 3, 4])

forward(x)

Performs 2D pooling on the input tensor.

Parameters:

x (torch.Tensor) – The input tensor for a mini-batch.

Returns:

The pooled tensor.