============================= Audio loading troubleshooting ============================= This page is intended to document how to install torchaudio backends and provides troubleshooting steps for your audio loading troubles. Introduction ============ SpeechBrain relies on `torchaudio `_ for loading audio files in most cases. Please first try to **update torchaudio** if you are encountering issues. Please also ensure that you are using the correct PyTorch version for your installed torchaudio version. As of torchaudio `2.2.0`, three backends are supported: ``ffmpeg``, ``sox`` and ``soundfile``. torchaudio documents how their backends are found in their `optional dependency docs `_. You can determine which backends are available in your environment by running :func:`torchaudio.list_audio_backends`. .. warning:: **A backend can *silently* fail to load** if initialization failed and will be omitted from this list. .. warning:: **Not every backend can support any codec.** For instance, at the time of writing, the torchaudio SoX backend cannot handle MP3 and the SoundFile backend cannot handle AAC (usually ``.m4a``), both of which are found in certain popular speech datasets. However, most common formats are typically well supported by all backends (``.wav``/``.ogg`` vorbis/opus/``.flac``). Recommended install steps ========================= Often, torchaudio will work out of the box. On certain systems, there might not be a working backend installed. We recommend you try if any of those steps fixes your issue: - On Linux, if you have superuser rights, install ffmpeg and/or libsndfile and/or SoX through your distribution's package manager. - On Windows/Linux/macOS, you can try installing ffmpeg through Conda (see `ffmpeg`_), which does not require superuser rights (provided Conda is available). - On macOS, alternatively, it appears to be possible to install ffmpeg through Homebrew. Make sure that you are installing a version compatible with torchaudio (see `ffmpeg`_). - On Windows/Linux/macOS, `SoundFile `_ has started shipping with a prebuilt ``libsndfile``, which does not require admin rights. Try installing or updating it. See the linked page for more details. Note for developers & breaking torchaudio `2.x` changes ======================================================= With torchaudio `<2.x`, backends were selected through ``torchaudio.set_audio_backend``. This function was deprecated and then removed in the `2.x` branch of torchaudio and is no longer used in SpeechBrain. Since then, the backend is (optionally) selected through the ``backend`` argument of :func:`torchaudio.load` and :func:`torchaudio.info`. Installing/troubleshooting backends =================================== ffmpeg ------ torchaudio compiles their ffmpeg backend for a **specific range** of ffmpeg versions. ffmpeg is commonly already installed on common Linux distributions. On Ubuntu, it can be installed through ``sudo apt install ffmpeg``. Depending on your OS version, it is possible that your installed ffmpeg version is not supported by torchaudio (if too recent or too old). If you believe this to be the case, you can try installing a specific version of the ``ffmpeg`` package as supplied by `conda-forge `_. See `torchaudio documentation on optional dependencies `_ for more details. SoundFile --------- torchaudio can use `soundfile `_ as an audio backend, which depends on ``libsndfile``. Starting with SoundFile 0.12.0, this package bundles a prebuilt ``libsndfile`` for a number of platforms. Refer to the project page for more details. SoX --- Starting with torchaudio 0.12.0, the SoX backend no longer supports mp3 files. Starting with torchaudio 2.1.0, torchaudio no longer compiles and bundles SoX by itself, and expects it to be provided by the system. If you have upgraded from an earlier version and can no longer load audio files, it may be due to this. In this case, you may need to install SoX or use a different backend.