speechbrain.utils.dynamic_chunk_training module
Configuration and utility classes for Dynamic Chunk Training, as often used to train streaming-capable speech recognition models.
The definition of Dynamic Chunk Training is based on that of the following paper, though a lot of the literature refers to the same definition: https://arxiv.org/abs/2012.05481
Authors * Sylvain de Langen 2023
Summary
Classes:
- DynChunkTrainConfig: Dynamic Chunk Training configuration object for use with transformers, often in ASR for streaming.
- DynChunkTrainConfigRandomSampler: Helper class to generate a DynChunkTrainConfig at runtime depending on the current stage.
Reference
- class speechbrain.utils.dynamic_chunk_training.DynChunkTrainConfig(chunk_size: int, left_context_size: int | None = None)
Bases: object
Dynamic Chunk Training configuration object for use with transformers, often in ASR for streaming.
This object may be used both to configure masking at training time and for run-time configuration of DynChunkTrain-ready models.
- chunk_size: int
Size in frames of a single chunk, always >0. If chunkwise streaming should be disabled at some point, pass an optional streaming config parameter.
- left_context_size: int | None = None
Number of chunks (not frames) visible to the left, always >=0. If zero, then chunks can never attend to any past chunk. If None, the left context is infinite (but use .is_infinite_left_context() for such a check).
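The interaction between chunk_size and left_context_size can be illustrated with a small masking sketch. The helper below, chunk_attention_mask, is hypothetical and not part of SpeechBrain. It assumes that a frame may attend to every frame of its own chunk and of up to left_context_size preceding chunks, which is the usual Dynamic Chunk Training reading of these two fields; the library's actual masking code may differ.

```python
from typing import List, Optional


def chunk_attention_mask(
    num_frames: int, chunk_size: int, left_context_size: Optional[int] = None
) -> List[List[bool]]:
    """Hypothetical helper: mask[i][j] is True when frame i may attend to frame j."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be > 0")
    if left_context_size is not None and left_context_size < 0:
        raise ValueError("left_context_size must be >= 0 or None")

    mask = []
    for i in range(num_frames):
        query_chunk = i // chunk_size
        row = []
        for j in range(num_frames):
            key_chunk = j // chunk_size
            # Never look at chunks to the right of the query's own chunk.
            visible = key_chunk <= query_chunk
            # With a finite left context, also hide chunks that are too old.
            # Note that the limit is counted in chunks, not in frames.
            if left_context_size is not None:
                visible = visible and key_chunk >= query_chunk - left_context_size
            row.append(visible)
        mask.append(row)
    return mask


# 6 frames, chunks of 2 frames, 1 past chunk visible.
mask = chunk_attention_mask(num_frames=6, chunk_size=2, left_context_size=1)
for row in mask:
    print("".join("x" if visible else "." for visible in row))
```

With these values the sketch prints the rows xx...., xx...., xxxx.., xxxx.., ..xxxx, ..xxxx: the last chunk sees itself and one chunk to its left, which is 2 frames of left context because left_context_size is expressed in chunks.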
- class speechbrain.utils.dynamic_chunk_training.DynChunkTrainConfigRandomSampler(chunkwise_prob: float, chunk_size_min: int, chunk_size_max: int, limited_left_context_prob: float, left_context_chunks_min: int, left_context_chunks_max: int, test_config: DynChunkTrainConfig | None = None, valid_config: DynChunkTrainConfig | None = None)
Bases: object
Helper class to generate a DynChunkTrainConfig at runtime depending on the current stage.
Example
>>> from speechbrain.core import Stage
>>> from speechbrain.utils.dynamic_chunk_training import DynChunkTrainConfig
>>> from speechbrain.utils.dynamic_chunk_training import DynChunkTrainConfigRandomSampler
>>> # for the purpose of this example, we test a scenario with a 100%
>>> # chance of the (24, None) scenario to occur
>>> sampler = DynChunkTrainConfigRandomSampler(
...     chunkwise_prob=1.0,
...     chunk_size_min=24,
...     chunk_size_max=24,
...     limited_left_context_prob=0.0,
...     left_context_chunks_min=16,
...     left_context_chunks_max=16,
...     test_config=DynChunkTrainConfig(32, 16),
...     valid_config=None
... )
>>> one_train_config = sampler(Stage.TRAIN)
>>> one_train_config
DynChunkTrainConfig(chunk_size=24, left_context_size=None)
>>> one_train_config.is_infinite_left_context()
True
>>> sampler(Stage.TEST)
DynChunkTrainConfig(chunk_size=32, left_context_size=16)
- chunkwise_prob: float
When sampling (during Stage.TRAIN), the probability that a finite chunk size will be used. In the other case, any chunk can attend to the full past and future context.
- limited_left_context_prob: float
When sampling a random chunk size, the probability that the left context will be limited. In the other case, any chunk can attend to the full past context.
- left_context_chunks_min: int
When sampling a random left context size, the minimum number of left context chunks that can be picked.
- left_context_chunks_max: int
When sampling a random left context size, the maximum number of left context chunks that can be picked.
- test_config: DynChunkTrainConfig | None = None
The configuration that should be used for Stage.TEST. When None, evaluation is done with full context (i.e. non-streaming).
- valid_config: DynChunkTrainConfig | None = None
The configuration that should be used for Stage.VALID. When None, evaluation is done with full context (i.e. non-streaming).
- __call__(stage: Stage) → DynChunkTrainConfig
In training stage, samples a random DynChunkTrain configuration. During validation or testing, returns the relevant configuration.
- Parameters:
stage (speechbrain.core.Stage) – Current stage of training or evaluation. In training mode, a random DynChunkTrainConfig will be sampled according to the specified probabilities and ranges. During evaluation, the relevant DynChunkTrainConfig attribute will be picked.
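The stage-dependent behaviour described above can be sketched with the standard library alone. Everything in this block is a stand-in written for illustration: Stage, ChunkConfig, and SamplerSketch are hypothetical names that mirror the documented fields, not SpeechBrain's classes. The sketch also assumes that the min/max bounds are inclusive and that the "full context" case is represented by returning None; the real implementation may make different choices.

```python
import random
from dataclasses import dataclass
from enum import Enum, auto
from typing import Optional


class Stage(Enum):
    """Stand-in for speechbrain.core.Stage (assumption for this sketch)."""

    TRAIN = auto()
    VALID = auto()
    TEST = auto()


@dataclass
class ChunkConfig:
    """Stand-in for DynChunkTrainConfig."""

    chunk_size: int
    left_context_size: Optional[int] = None


@dataclass
class SamplerSketch:
    """Stand-in for DynChunkTrainConfigRandomSampler."""

    chunkwise_prob: float
    chunk_size_min: int
    chunk_size_max: int
    limited_left_context_prob: float
    left_context_chunks_min: int
    left_context_chunks_max: int
    test_config: Optional[ChunkConfig] = None
    valid_config: Optional[ChunkConfig] = None

    def __call__(self, stage: Stage) -> Optional[ChunkConfig]:
        # Evaluation stages return their fixed configuration unchanged.
        if stage == Stage.TEST:
            return self.test_config
        if stage == Stage.VALID:
            return self.valid_config

        # Training: first decide whether to chunk at all.
        if random.random() >= self.chunkwise_prob:
            return None  # assumption: None stands for full past and future context

        chunk_size = random.randint(self.chunk_size_min, self.chunk_size_max)

        # Then decide whether the left context is limited or infinite.
        left_context = None
        if random.random() < self.limited_left_context_prob:
            left_context = random.randint(
                self.left_context_chunks_min, self.left_context_chunks_max
            )
        return ChunkConfig(chunk_size, left_context)


sampler = SamplerSketch(
    chunkwise_prob=1.0,
    chunk_size_min=24,
    chunk_size_max=24,
    limited_left_context_prob=0.0,
    left_context_chunks_min=16,
    left_context_chunks_max=16,
    test_config=ChunkConfig(32, 16),
)
print(sampler(Stage.TRAIN))
print(sampler(Stage.TEST))
print(sampler(Stage.VALID))
```

Under these assumptions, a training step falls into one of three cases: full context with probability 1 - chunkwise_prob, chunked with infinite left context with probability chunkwise_prob * (1 - limited_left_context_prob), and chunked with limited left context with probability chunkwise_prob * limited_left_context_prob.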