speechbrain.utils.profiling module

Polymorphic decorators to handle PyTorch profiling and benchmarking.

  • Andreas Nautsch 2022
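One reading of "polymorphic" here, judging from the signatures on this page (every decorator takes `func` as an optional first argument), is that the same decorator can be applied bare or with keyword arguments. The following is a minimal sketch of that pattern only. The `timed` decorator and its `repeat` parameter are hypothetical and not part of this module.

```python
import functools
import time
from typing import Callable, Optional


def timed(func: Optional[Callable] = None, repeat: int = 1):
    """Hypothetical decorator usable as @timed or as @timed(repeat=3)."""
    if func is None:
        # Used with arguments only: return a decorator that awaits the function.
        return functools.partial(timed, repeat=repeat)

    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        start = time.perf_counter()
        for _ in range(repeat):
            result = func(*args, **kwargs)
        # Keep simple statistics on the wrapper itself.
        wrapper.elapsed = time.perf_counter() - start
        wrapper.calls += repeat
        return result

    wrapper.elapsed = 0.0
    wrapper.calls = 0
    return wrapper


@timed  # bare form: func is passed directly
def square(x):
    return x * x


@timed(repeat=3)  # parameterised form: func is None on the first call
def cube(x):
    return x ** 3
```

Both forms yield a wrapped function; only the path through the `func is None` branch differs.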




Functions:

  • events_diff – Takes two EventList objects and filters out events of equal value (default: by event count).

  • export – Exports current and aggregated traces for: - Chrome tensorboard - FlameGraph (and sets default parameters for the log file folder and filenames).

  • hook_brain_methods – For instances of speechbrain.core.Brain, critical functions are hooked to profiler start/stop methods.

  • prepare_profiler_for_brain – Sets up a torch.profiler.profile to also (a) aggregate traces issued from various interactions with speechbrain.core.Brain instances and (b) hook a method to merge_traces.

  • profile – Wrapper to create a PyTorch profiler to benchmark training/inference of speechbrain.core.Brain instances.

  • profile_analyst – Pre-configured profiling for a fully detailed benchmark - analyst perspective.

  • profile_optimiser – Pre-configured profiling for a detailed benchmark (better suited to speed optimisation than @profile_analyst).

  • profile_report – Pre-configured profiling for a reporting benchmark (uses a different scheduler than @profile_optimiser).

  • report_memory – Summary reporting of memory usage - see: torch.autograd.profiler_util

  • report_time – Summary reporting of total time - see: torch.autograd.profiler_util

  • schedule – Wrapper to create a `torch.profiler.schedule` (sets default parameters for warm-up).

  • set_profiler_attr – Sets handler for profiler: scheduler or trace export.

speechbrain.utils.profiling.set_profiler_attr(func: object, set_attr: str, handler: Callable)

Sets handler for profiler: scheduler or trace export.

speechbrain.utils.profiling.schedule(func: Optional[object] = None, wait: int = 2, warmup: int = 2, active: int = 2, repeat: int = 1, skip_first: int = 0)

Wrapper to create a `torch.profiler.schedule` (sets default parameters for warm-up).
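To make the default parameters concrete, here is a plain-Python sketch of the wait/warm-up/active phase logic as described in the torch.profiler.schedule documentation. It is an approximation for illustration, not the library's code. `Action` and `sketch_schedule` are hypothetical names, and `Action` only mirrors the member names of torch.profiler.ProfilerAction.

```python
from enum import Enum


class Action(Enum):
    """Hypothetical stand-in mirroring torch.profiler.ProfilerAction names."""
    NONE = 0
    WARMUP = 1
    RECORD = 2
    RECORD_AND_SAVE = 3


def sketch_schedule(wait=2, warmup=2, active=2, repeat=1, skip_first=0):
    """Return a step -> Action function that mimics the documented phases."""
    cycle = wait + warmup + active

    def action_for(step: int) -> Action:
        if step < skip_first:
            return Action.NONE
        step -= skip_first
        if repeat > 0 and step >= repeat * cycle:
            return Action.NONE  # all requested cycles are finished
        pos = step % cycle
        if pos < wait:
            return Action.NONE
        if pos < wait + warmup:
            return Action.WARMUP
        # The last active step of a cycle also triggers saving the trace.
        return Action.RECORD if pos < cycle - 1 else Action.RECORD_AND_SAVE

    return action_for


plan = sketch_schedule()  # same defaults as in the signature above
phases = [plan(step).name for step in range(8)]
```

Under these assumptions, the defaults skip two steps, warm up for two, record for two, and then stay idle because `repeat=1` allows a single cycle.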

speechbrain.utils.profiling.export(func: Optional[object] = None, dir_name: str = './log/', worker_name: Optional[str] = None, use_gzip: bool = False)

Exports current and aggregated traces (and sets default parameters for the log file folder and filenames) for:

  • Chrome tensorboard

  • FlameGraph

speechbrain.utils.profiling.prepare_profiler_for_brain(prof: profile)

Sets up a torch.profiler.profile to also (a) aggregate traces issued from various interactions with speechbrain.core.Brain instances and (b) hook a method to merge_traces.

speechbrain.utils.profiling.hook_brain_methods(func: object, prof: profile, class_hooks: Optional[Iterable[str]] = None)

For instances of speechbrain.core.Brain, critical functions are hooked to profiler start/stop methods.
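The hooking idea can be illustrated without PyTorch. The sketch below wraps selected methods of an object so that each call is bracketed by a profiler's start and stop. `FakeProfiler`, `TinyBrain` and `hook_methods` are hypothetical stand-ins, and the actual implementation in this module may differ in its details.

```python
import functools


class FakeProfiler:
    """Hypothetical stand-in for a profiler exposing start() and stop()."""

    def __init__(self):
        self.log = []

    def start(self):
        self.log.append("start")

    def stop(self):
        self.log.append("stop")


class TinyBrain:
    """Hypothetical stand-in for a Brain-like object."""

    def fit(self):
        return "fitted"

    def evaluate(self):
        return "evaluated"


def hook_methods(obj, prof, names=("fit", "evaluate")):
    """Wrap the named methods of obj so each call is bracketed by start/stop."""

    def hook(method):
        @functools.wraps(method)
        def hooked(*args, **kwargs):
            prof.start()
            try:
                return method(*args, **kwargs)
            finally:
                prof.stop()  # stop even if the wrapped method raises

        return hooked

    for name in names:
        method = getattr(obj, name, None)
        if callable(method):
            # Shadow the bound method with the hooked version on this instance.
            setattr(obj, name, hook(method))


prof = FakeProfiler()
brain = TinyBrain()
hook_methods(brain, prof)
result = brain.fit()
```

The default names used here follow the `['fit', 'evaluate']` default that this page documents for `class_hooks`.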

speechbrain.utils.profiling.profile(func: Optional[object] = None, class_hooks: Optional[Iterable[str]] = None, activities: Optional[Iterable[ProfilerActivity]] = None, schedule: Optional[Callable[[int], ProfilerAction]] = None, on_trace_ready: Optional[Callable[[...], Any]] = None, record_shapes: bool = False, profile_memory: bool = False, with_stack: bool = False, with_flops: bool = False, with_modules: bool = False) → object

Wrapper to create a PyTorch profiler to benchmark training/inference of speechbrain.core.Brain instances. See torch.profiler.profile documentation for details (brief summary below).

  • func (object) – A speechbrain.core.Brain class or a (train/eval) function to be profiled.

  • class_hooks (iterable) – List of method/function names of speechbrain.core.Brain that should also be profiled. Otherwise, only the __init__ constructor is profiled when decorating a Brain class. Default: ['fit', 'evaluate'] for classes, and None for functions.

  • activities (iterable) – List of activity groups. Default: ProfilerActivity.CPU and (when available) ProfilerActivity.CUDA. (Default value should be ok for most cases.)

  • schedule (callable) – Waits a specified number of steps for PyTorch to warm up; see the schedule decorator above. Default: ProfilerAction.RECORD (immediately starts recording).

  • on_trace_ready (callable) – Specifies which benchmark record should be saved (after each scheduled step); see the export decorator above. Default: None (the collected report is picked up once profiling has ended, without per-step details).

  • record_shapes (bool) – Save input shapes of operations (allows benchmark data to be grouped by input shape after profiling). Default: False.

  • profile_memory (bool) – Track tensor memory allocation/deallocation. Default: False.

  • with_stack (bool) – Record source information (file and line number). Default: False.

  • with_flops (bool) – Estimate the number of FLOPs. Default: False.

  • with_modules (bool) – Record module hierarchy (including function names). Default: False.


>>> import torch
>>> @profile
... def run(x : torch.Tensor):
...     y = x ** 2
...     z = y ** 3
...     return y.backward()  # y.backward() returns None --> return value is substituted with profiler
>>> data = torch.randn((1, 1), requires_grad=True)
>>> prof = run(data)
>>> out = [len(prof.events()), len(prof.key_averages()), prof.profiler.total_average().count]

speechbrain.utils.profiling.profile_analyst(func: Optional[object] = None, class_hooks: Optional[Iterable[str]] = None)

Pre-configured profiling for a fully detailed benchmark - analyst perspective.

Creating this analyst view adds overhead (it disables some PyTorch optimisations); use @profile_optimiser to benefit from those optimisations and to optimise your modules further.

speechbrain.utils.profiling.profile_optimiser(func: Optional[object] = None, class_hooks: Optional[Iterable[str]] = None)

Pre-configured profiling for a detailed benchmark (better suited to speed optimisation than @profile_analyst).

speechbrain.utils.profiling.profile_report(func: Optional[object] = None, class_hooks: Optional[Iterable[str]] = None)

Pre-configured profiling for a reporting benchmark (uses a different scheduler than @profile_optimiser).

speechbrain.utils.profiling.events_diff(a: EventList, b: EventList, filter_by: str = 'count')

Takes two EventList objects and filters out events of equal value (default: by event count).

The result of this diff is intended for visualisation only (to see the difference between implementations).
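As an illustration of the filtering idea, the sketch below drops events whose chosen attribute is equal in both lists and keeps the rest. It assumes events are matched by a `key` name. The `Event` tuple and `diff_events` function are hypothetical stand-ins and do not use the real EventList type.

```python
from collections import namedtuple

# Hypothetical stand-in for a profiler event average: a key plus some metrics.
Event = namedtuple("Event", ["key", "count", "cpu_time"])


def diff_events(a, b, filter_by="count"):
    """Drop events whose `filter_by` value is equal in both lists."""
    a_vals = {e.key: getattr(e, filter_by) for e in a}
    b_vals = {e.key: getattr(e, filter_by) for e in b}
    # Keys present in both lists with the same value carry no difference.
    same = {k for k in a_vals if k in b_vals and a_vals[k] == b_vals[k]}
    # Return filtered copies; the inputs are left untouched.
    return (
        [e for e in a if e.key not in same],
        [e for e in b if e.key not in same],
    )


run_a = [Event("aten::mul", 2, 10.0), Event("aten::pow", 1, 5.0)]
run_b = [Event("aten::mul", 2, 12.0), Event("aten::add", 3, 4.0)]
only_a, only_b = diff_events(run_a, run_b)
```

Here `aten::mul` is called twice in both runs, so it is removed from both sides when filtering by count, while the events unique to each run remain.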

speechbrain.utils.profiling.report_time(events: object, verbose=False, upper_control_limit=False)

Summary reporting of total time - see: torch.autograd.profiler_util

speechbrain.utils.profiling.report_memory(handler: object, verbose=False)

Summary reporting of memory usage - see: torch.autograd.profiler_util