speechbrain.utils.quirks module

Global changes and platform/GPU-specific quirks, i.e. workarounds and saner defaults, sometimes due to platform-specific issues.

Author:
  • Sylvain de Langen 2024

Summary

Functions:

allow_tf32

On CUDA backends (potentially including ROCm), enables TensorFloat32 support for CuDNN and the matmul operator.

apply_quirks

Apply quirks depending on the platform.

disable_cudnn_benchmarking

Disables CuDNN benchmarking.

disable_jit_profiling

Disables JIT profiling to avoid performance issues on highly dynamic shapes.

log_applied_quirks

Logs whichever quirks have been applied by apply_quirks.

Reference

speechbrain.utils.quirks.disable_cudnn_benchmarking()[source]

Disables CuDNN benchmarking. No-op on platforms where it is already off by default.

Benchmarking, when enabled, theoretically improves convolution performance by automatically comparing different kernels for some operations.

However, benchmarking must be re-run for every unique input shape, which makes it unsuitable for highly dynamic shapes. Since SpeechBrain tends to use highly varied shapes without attempting to pad the differences away, leaving benchmarking enabled can severely degrade training performance.

This function disables it, as we currently consider no benchmarking the saner default for avoiding performance bugs.

As of PyTorch 2.3.0, the default is False for CUDA GPUs, but True for HIP GPUs.

The HIP equivalent to CuDNN is MIOpen, but it is controlled through the same PyTorch API.
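Under the hood this amounts to flipping a single PyTorch flag. A minimal sketch, assuming PyTorch is importable; the wrapper name here is illustrative, not SpeechBrain's actual implementation:

```python
def disable_cudnn_benchmarking_sketch() -> bool:
    """Return True if the flag was set, False if torch is unavailable."""
    try:
        import torch
    except ImportError:
        return False  # nothing to do in this sketch without PyTorch
    # Stop cuDNN (or MIOpen on HIP, controlled via the same flag) from
    # re-benchmarking kernels for every new input shape.
    torch.backends.cudnn.benchmark = False
    return True
```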

speechbrain.utils.quirks.disable_jit_profiling()[source]

Disables JIT profiling to avoid performance issues on highly dynamic shapes.
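Turning off JIT profiling typically goes through internal, version-dependent PyTorch executor flags. A sketch assuming those private APIs are present (they are not a stable interface and may change between releases):

```python
def disable_jit_profiling_sketch() -> bool:
    """Return True if profiling was disabled, False if unavailable."""
    try:
        import torch
        # Private PyTorch APIs; names may differ across versions.
        torch._C._jit_set_profiling_executor(False)
        torch._C._jit_set_profiling_mode(False)
    except (ImportError, AttributeError):
        return False
    return True
```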

speechbrain.utils.quirks.allow_tf32()[source]

On CUDA backends (potentially including ROCm), enables TensorFloat32 support for CuDNN and the matmul operator.

This allows performing certain operations transparently at a lower precision, even in fp32 math when AMP is not in use, when otherwise tensor cores would not be used. TF32 supports accumulation into fp32, so the concern for overflowing is somewhat mitigated.

On NVIDIA GPUs, this is available since Ampere (e.g. A100).

See PyTorch documentation for more details.
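The TF32 toggles are exposed through `torch.backends`. A minimal sketch, assuming PyTorch is importable; the wrapper name is illustrative:

```python
def allow_tf32_sketch() -> bool:
    """Return True if the TF32 flags were set, False if torch is unavailable."""
    try:
        import torch
    except ImportError:
        return False  # nothing to do in this sketch without PyTorch
    # Let fp32 matmuls and cuDNN convolutions use TensorFloat32 tensor
    # cores (reduced-precision mantissa, accumulation into fp32).
    torch.backends.cuda.matmul.allow_tf32 = True
    torch.backends.cudnn.allow_tf32 = True
    return True
```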

speechbrain.utils.quirks.KNOWN_QUIRKS = {'allow_tf32': <function allow_tf32>, 'disable_cudnn_benchmarking': <function disable_cudnn_benchmarking>, 'disable_jit_profiling': <function disable_jit_profiling>}

Known quirk list, mapping quirk names to the functions that apply them.

speechbrain.utils.quirks.applied_quirks = {'allow_tf32', 'disable_jit_profiling'}

Applied quirk list. Populated by apply_quirks, which skips any quirks listed in the SB_DISABLE_QUIRKS environment variable, a comma-separated list of quirks to disable.

speechbrain.utils.quirks.apply_quirks()[source]

Apply quirks depending on the platform. Also populates applied_quirks.
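The registry-plus-exclusion pattern described above can be sketched in plain Python. The names `known`, `applied`, and the no-op quirk bodies below are illustrative stand-ins, not SpeechBrain's actual implementation; only the SB_DISABLE_QUIRKS variable name comes from the documentation:

```python
import os

# Hypothetical mini-registry mirroring the KNOWN_QUIRKS / applied_quirks
# pattern; real quirk functions would mutate PyTorch backend flags.
known = {
    "disable_jit_profiling": lambda: None,
    "allow_tf32": lambda: None,
    "disable_cudnn_benchmarking": lambda: None,
}
applied = set()

def apply_quirks_sketch():
    """Apply every known quirk except those listed in SB_DISABLE_QUIRKS."""
    raw = os.environ.get("SB_DISABLE_QUIRKS", "")
    excluded = set(filter(None, raw.split(",")))
    for name, quirk_fn in known.items():
        if name not in excluded:
            quirk_fn()
            applied.add(name)
    return applied
```

For example, launching with `SB_DISABLE_QUIRKS=allow_tf32` would leave the allow_tf32 quirk unapplied while the other two still run.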

speechbrain.utils.quirks.log_applied_quirks()[source]

Logs whichever quirks have been applied by apply_quirks.