speechbrain.utils.data_utils module
This library gathers utilities for data I/O operations.
- Authors
Mirco Ravanelli 2020
Aku Rouhe 2020
Samuele Cornell 2020
Summary
Functions:
- Given a list of torch tensors, it batches them together by padding to the right on each dimension in order to get the same length for all.
- Shuffles batches of fixed size within a sequence.
- Concatenates multiple padded feature tensors into a single padded tensor in a vectorized manner without including the padding in the final tensor, adding padding only at the end.
- Returns all possible key-value combinations from the given dictionary.
- Returns a generator of permutations of the specified values dictionary.
- Returns standard distribution statistics (mean, std, min, max).
- Downloads the file from the given source and saves it in the given destination path.
- Returns a list of files found within a folder.
- Gets a list from the selected field of the input csv file.
- Creates a tensor with a range along a single dimension, matching the shape of the given features tensor.
- A metric function that computes the maximum of each sample.
- A metric function that computes the mean of each sample, excluding padding.
- A metric function that computes the minimum of each sample.
- A metric function that computes the standard deviation of each sample, excluding padding.
- A swiss-army-knife helper function that matches the shape of a tensor to that of another tensor - useful for masks, etc.
- Makes a tensor from a list of batch values.
- Returns all dimensions of the specified tensor except the batch dimension.
- Adds extra padding to the specified dimension of a tensor to make it divisible by the specified factor.
- This function takes a torch tensor of arbitrary shape and pads it to target shape by appending values on the right.
- Yields each (key, value) of a nested dictionary.
- Moves data to device, or other type, and handles containers.
- Similar function to dict.update, but for a nested dict.
- Converts a namedtuple or dictionary containing tensors to their scalar values.
- Sets user write permissions on all the files in the given folder.
- A very basic functional version of str.split.
- Returns a list of splits in the sequence.
- Splits a path into source and filename.
- Trims the specified tensor to match the shape of another tensor (at most).
- Trims the specified tensor to match the specified shape.
- Produces Python lists given a batch of sentences with their corresponding relative lengths.
- Unsqueezes a 1-D tensor to the specified number of dimensions, preserving one dimension and creating "dummy" dimensions elsewhere.
- Reshapes the tensor to a shape compatible with the target tensor; only valid if x.dim() <= y.dim().
Reference
- speechbrain.utils.data_utils.undo_padding(batch, lengths)[source]
Produces Python lists given a batch of sentences with their corresponding relative lengths.
- Parameters:
batch (tensor) – Batch of sentences gathered in a batch.
lengths (tensor) – Relative length of each sentence in the batch.
Example
>>> batch=torch.rand([4,100])
>>> lengths=torch.tensor([0.5,0.6,0.7,1.0])
>>> snt_list=undo_padding(batch, lengths)
>>> len(snt_list)
4
- speechbrain.utils.data_utils.get_all_files(dirName, match_and=None, match_or=None, exclude_and=None, exclude_or=None)[source]
Returns a list of files found within a folder.
Different options can be used to restrict the search to some specific patterns.
- Parameters:
dirName (str) – The directory to search.
match_and (list) – A list that contains patterns to match. The file is returned if it matches all the entries in match_and.
match_or (list) – A list that contains patterns to match. The file is returned if it matches one or more of the entries in match_or.
exclude_and (list) – A list that contains patterns to match. The file is returned if it matches none of the entries in exclude_and.
exclude_or (list) – A list that contains patterns to match. The file is returned if it fails to match one of the entries in exclude_or.
Example
>>> get_all_files('tests/samples/RIRs', match_and=['3.wav'])
['tests/samples/RIRs/rir3.wav']
- speechbrain.utils.data_utils.get_list_from_csv(csvfile, field, delimiter=',', skipinitialspace=True)[source]
Gets a list from the selected field of the input csv file.
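Example (an illustrative sketch added here, not part of the original docstring; it assumes the csv has a header row that names the fields)
>>> path = "example_data.csv"  # hypothetical file written just for this example
>>> with open(path, "w") as f:
...     _ = f.write("ID,wav\n")
...     _ = f.write("utt1,a.wav\n")
...     _ = f.write("utt2,b.wav\n")
>>> get_list_from_csv(path, field="wav")
['a.wav', 'b.wav']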
- speechbrain.utils.data_utils.split_list(seq, num)[source]
Returns a list of splits in the sequence.
- Parameters:
seq (iterable) – The input list, to be split.
num (int) – The number of chunks to produce.
Example
>>> split_list([1, 2, 3, 4, 5, 6, 7, 8, 9], 4)
[[1, 2], [3, 4], [5, 6], [7, 8, 9]]
- speechbrain.utils.data_utils.recursive_items(dictionary)[source]
Yield each (key, value) of a nested dictionary.
- Parameters:
dictionary (dict) – The nested dictionary to list.
- Yields:
(key, value) tuples from the dictionary.
Example
>>> rec_dict={'lev1': {'lev2': {'lev3': 'current_val'}}}
>>> [item for item in recursive_items(rec_dict)]
[('lev3', 'current_val')]
- speechbrain.utils.data_utils.recursive_update(d, u, must_match=False)[source]
Similar function to dict.update, but for a nested dict.
From: https://stackoverflow.com/a/3233356
If you have a nested mapping structure, for example:
{"a": 1, "b": {"c": 2}}
Say you want to update the above structure with:
{"b": {"d": 3}}
This function will produce:
{"a": 1, "b": {"c": 2, "d": 3}}
Instead of:
{"a": 1, "b": {"d": 3}}
- Parameters:
Example
>>> d = {'a': 1, 'b': {'c': 2}}
>>> recursive_update(d, {'b': {'d': 3}})
>>> d
{'a': 1, 'b': {'c': 2, 'd': 3}}
- speechbrain.utils.data_utils.download_file(source, dest, unpack=False, dest_unpack=None, replace_existing=False, write_permissions=False)[source]
Downloads the file from the given source and saves it in the given destination path.
- Parameters:
source (path or url) – Path of the source file; if the source is an external link, the file is downloaded from the web.
dest (path) – Destination path.
unpack (bool) – If True, it unpacks the data in the dest folder.
dest_unpack (path) – Path where to store the unpacked dataset.
replace_existing (bool) – If True, replaces the existing files.
write_permissions (bool) – When set to True, all the files in the dest_unpack directory will be granted write permissions. This option is active only when unpack=True.
- speechbrain.utils.data_utils.set_writing_permissions(folder_path)[source]
This function sets user write permissions on all the files in the given folder.
- Parameters:
folder_path (str) – Folder whose files will be granted write permissions.
- speechbrain.utils.data_utils.pad_right_to(tensor: torch.Tensor, target_shape: (list, tuple), mode='constant', value=0)[source]
This function takes a torch tensor of arbitrary shape and pads it to target shape by appending values on the right.
- Parameters:
tensor (torch.Tensor) – Input tensor whose dimension we need to pad.
target_shape (list or tuple) – Target shape we want for the target tensor. Its length must be equal to tensor.ndim.
mode (str) – Pad mode, please refer to torch.nn.functional.pad documentation.
value (float) – Pad value, please refer to torch.nn.functional.pad documentation.
- Returns:
tensor (torch.Tensor) – Padded tensor.
valid_vals (list) – List containing proportion for each dimension of original, non-padded values.
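Example (a minimal sketch added for illustration; the valid_vals shown assume each proportion is the original size over the target size per dimension, as described above)
>>> x = torch.ones(2, 3)
>>> x_pad, valid = pad_right_to(x, (2, 5))
>>> x_pad.shape
torch.Size([2, 5])
>>> valid
[1.0, 0.6]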
- speechbrain.utils.data_utils.batch_pad_right(tensors: list, mode='constant', value=0)[source]
Given a list of torch tensors, it batches them together by padding to the right on each dimension in order to get the same length for all.
- Parameters:
tensors (list) – List of tensors to pad and batch together.
mode (str) – Pad mode, please refer to torch.nn.functional.pad documentation.
value (float) – Pad value, please refer to torch.nn.functional.pad documentation.
- Returns:
tensor (torch.Tensor) – Padded tensor.
valid_vals (list) – List containing proportion for each dimension of original, non-padded values.
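Example (an illustrative sketch, not from the original docstring; only the padded shape is shown, since the exact form of the returned valid proportions is not documented here)
>>> a = torch.ones(3)
>>> b = torch.ones(5)
>>> batch, lens = batch_pad_right([a, b])
>>> batch.shape
torch.Size([2, 5])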
- speechbrain.utils.data_utils.split_by_whitespace(text)[source]
A very basic functional version of str.split
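Example (illustrative, assuming the plain str.split semantics described above: splitting on runs of whitespace and ignoring leading/trailing whitespace)
>>> split_by_whitespace("  hello   world ")
['hello', 'world']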
- speechbrain.utils.data_utils.recursive_to(data, *args, **kwargs)[source]
Moves data to device, or other type, and handles containers.
Very similar to torch.utils.data._utils.pin_memory.pin_memory, but applies .to() instead.
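Example (an illustrative sketch; a device move is the typical use, but a dtype conversion is shown so the snippet runs without a GPU, assuming .to() keyword arguments are forwarded as described)
>>> batch = {"wav": torch.ones(2, 3), "extras": [torch.zeros(2)]}
>>> moved = recursive_to(batch, dtype=torch.float16)
>>> moved["wav"].dtype
torch.float16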
- speechbrain.utils.data_utils.mod_default_collate(batch)[source]
Makes a tensor from list of batch values.
Note that this doesn’t need to zip(*) values together as PaddedBatch connects them already (by key).
Here the idea is not to error out.
This is modified from: https://github.com/pytorch/pytorch/blob/c0deb231db76dbea8a9d326401417f7d1ce96ed5/torch/utils/data/_utils/collate.py#L42
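Example (illustrative sketch, assuming the default_collate-style behavior of stacking a list of same-shaped tensors)
>>> mod_default_collate([torch.tensor(1.0), torch.tensor(2.0), torch.tensor(3.0)])
tensor([1., 2., 3.])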
- speechbrain.utils.data_utils.split_path(path)[source]
Splits a path into source and filename.
This also handles URLs and Huggingface hub paths, in addition to regular paths.
- Parameters:
path (str or FetchSource) –
- Returns:
str – Source
str – Filename
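Example (illustrative sketch for a plain relative path; the source part depends on whether the path is local, a URL, or a Huggingface hub path)
>>> source, filename = split_path("tests/samples/RIRs/rir3.wav")
>>> filename
'rir3.wav'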
- speechbrain.utils.data_utils.scalarize(value)[source]
Converts a namedtuple or dictionary containing tensors to their scalar values.
- Parameters:
value (dict or namedtuple) – a dictionary or named tuple of tensors
- Returns:
result – a result dictionary
- Return type:
dict
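Example (an illustrative sketch, assuming each tensor holds a single element so that a scalar can be extracted)
>>> stats = {"loss": torch.tensor(0.25), "error": torch.tensor(0.5)}
>>> scalarize(stats)
{'loss': 0.25, 'error': 0.5}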
- speechbrain.utils.data_utils.unsqueeze_as(x, target)[source]
Reshape the tensor to be of a shape compatible with the target tensor, only valid if x.dim() <= y.dim()
- Parameters:
x (torch.Tensor) – the original tensor
target (torch.Tensor) – the tensor whose shape to match
- Returns:
result – a view of tensor x reshaped to a shape compatible with the target
- Return type:
torch.Tensor
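Example (illustrative sketch: a 1-D lengths tensor is given trailing singleton dimensions so it can broadcast against a 3-D features tensor)
>>> lens = torch.ones(8)
>>> feats = torch.ones(8, 100, 40)
>>> unsqueeze_as(lens, feats).shape
torch.Size([8, 1, 1])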
- speechbrain.utils.data_utils.pad_divisible(tensor, length=None, factor=2, len_dim=1, pad_value=0)[source]
Adds extra padding to the specified dimension of a tensor to make it divisible by the specified factor. This is useful when passing variable-length sequences to downsampling UNets or other similar architectures in which inputs are expected to be divisible by the downsampling factor
- Parameters:
tensor (torch.Tensor) – the tensor to be padded, of arbitrary dimension
length (torch.Tensor) – a 1-D tensor of relative lengths
factor (int) – the divisibility factor
len_dim (int) – the index of the dimension used as the length
pad_value (int) – the value with which outputs will be padded
- Returns:
tensor_padded (torch.Tensor) – the tensor, with additional padding if required
length (torch.Tensor) – the adjusted length tensor, if provided
Example
>>> x = torch.tensor([[1, 2, 3, 4],
...                   [5, 6, 0, 0]])
>>> lens = torch.tensor([1., .5])
>>> x_pad, lens_pad = pad_divisible(x, length=lens, factor=5)
>>> x_pad
tensor([[1, 2, 3, 4, 0],
        [5, 6, 0, 0, 0]])
>>> lens_pad
tensor([0.8000, 0.4000])
- speechbrain.utils.data_utils.trim_to_shape(tensor, shape)[source]
Trims the specified tensor to match the specified shape
- Parameters:
tensor (torch.Tensor) – a tensor
shape (enumerable) – the desired shape
- Returns:
tensor – the trimmed tensor
- Return type:
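Example (illustrative sketch of the documented behavior)
>>> x = torch.arange(12).reshape(3, 4)
>>> trim_to_shape(x, (2, 3)).shape
torch.Size([2, 3])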
- speechbrain.utils.data_utils.trim_as(tensor, other)[source]
Trims the specified tensor to match the shape of another tensor (at most)
- Parameters:
tensor (torch.Tensor) – a tensor
other (torch.Tensor) – the tensor whose shape to match
- Returns:
tensor – the trimmed tensor
- Return type:
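Example (illustrative sketch for the simple case where the tensor is at least as large as the reference in every dimension)
>>> x = torch.zeros(4, 6)
>>> ref = torch.zeros(2, 5)
>>> trim_as(x, ref).shape
torch.Size([2, 5])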
- speechbrain.utils.data_utils.match_shape(tensor, other)[source]
A swiss-army-knife helper function that matches the shape of a tensor to that of another tensor - useful for masks, etc.
- Parameters:
tensor (torch.Tensor) – a tensor
other (torch.Tensor) – the tensor whose shape to match
- Returns:
tensor – the tensor with matching shape
- Return type:
- speechbrain.utils.data_utils.batch_shuffle(items, batch_size)[source]
Shuffles batches of fixed size within a sequence
- Parameters:
items (sequence) – a tensor or an indexable sequence, such as a list
batch_size (int) – the batch size
- Returns:
items – the original items. If a tensor was passed, a tensor will be returned. Otherwise, it will return a list
- Return type:
sequence
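Example (illustrative sketch; the result is random, with consecutive groups of batch_size items kept together, so only the preserved shape is shown)
>>> items = torch.arange(12)
>>> shuffled = batch_shuffle(items, batch_size=4)
>>> shuffled.shape
torch.Size([12])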
- speechbrain.utils.data_utils.concat_padded_features(feats, lens, dim=1, feats_slice_start=None, feats_slice_end=None)[source]
Concatenates multiple padded feature tensors into a single padded tensor in a vectorized manner without including the padding in the final tensor, adding padding only at the end. The function supports optional relative slicing of the tensors.
One possible use case is to concatenate batches of spectrograms or audio.
- Parameters:
feats (list) – a list of padded tensors
lens (list) – a list of length tensors
feats_slice_start (list) – offsets, relative to the beginning of the sequence, for each of the tensors being concatenated. This is useful if only a subsequence of some slices is included
feats_slice_end (list) – offsets, relative to the end of the sequence, for each of the tensors being concatenated. This is useful if only a subsequence of some slices is included
- Returns:
out – a concatenated tensor
- Return type:
- speechbrain.utils.data_utils.unsqueeze_1d(value, dim, value_dim)[source]
Unsqueezes a 1-D tensor to the specified number of dimensions, preserving one dimension and creating "dummy" dimensions elsewhere
- Parameters:
value (torch.Tensor) – A 1-D tensor
dim (int) – the number of dimensions
value_dim (int) – the dimension that the value tensor represents
- Returns:
result – a dim-dimensional tensor
- Return type:
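Example (illustrative sketch: a 1-D tensor of relative lengths is reshaped into a 3-dimensional tensor with dummy dimensions everywhere except dimension 0)
>>> lengths = torch.tensor([0.5, 0.75, 1.0])
>>> unsqueeze_1d(lengths, dim=3, value_dim=0).shape
torch.Size([3, 1, 1])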
- speechbrain.utils.data_utils.length_range(feats, len_dim)[source]
Creates a tensor with a range along a single dimension, matching the shape of the given features tensor
- Parameters:
feats (torch.Tensor) – a features tensor of arbitrary shape
len_dim (int) – the dimension used as length
- Returns:
result – a tensor matching the shape of feats, with a 0-to-max-length range along the length dimension, repeated across the other dimensions
- Return type:
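Example (illustrative sketch; along len_dim each position holds its own index, repeated over the remaining dimensions, so only the matching shape is shown)
>>> feats = torch.zeros(2, 4, 3)
>>> length_range(feats, len_dim=1).shape
torch.Size([2, 4, 3])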
- speechbrain.utils.data_utils.non_batch_dims(sample)[source]
Returns all dimensions of the specified tensor except the batch dimension
- Parameters:
sample (torch.Tensor) – an arbitrary tensor
- Returns:
dims – a list of dimensions
- Return type:
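Example (illustrative sketch for a batch of 4 spectrogram-like tensors)
>>> non_batch_dims(torch.zeros(4, 80, 100))
[1, 2]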
- speechbrain.utils.data_utils.masked_mean(sample, mask=None)[source]
A metric function that computes the mean of each sample, excluding padding
- Parameters:
sample (torch.Tensor) – a tensor of spectrograms
mask (torch.Tensor) – a length mask
- Returns:
result – a tensor of means
- Return type:
torch.Tensor
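Example (an illustrative sketch; it assumes a boolean mask where True marks valid positions, so the padded last element of the first row is excluded from its mean)
>>> x = torch.tensor([[1., 2., 0.],
...                   [3., 3., 3.]])
>>> mask = torch.tensor([[True, True, False],
...                      [True, True, True]])
>>> masked_mean(x, mask).tolist()
[1.5, 3.0]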
- speechbrain.utils.data_utils.masked_std(sample, mask=None)[source]
A metric function that computes the standard deviation of each sample, excluding padding
- Parameters:
sample (torch.Tensor) – a tensor of spectrograms
mask (torch.Tensor) – a length mask
- Returns:
result – a tensor of standard deviations
- Return type:
torch.Tensor
- speechbrain.utils.data_utils.masked_min(sample, mask=None)[source]
A metric function that computes the minimum of each sample
- Parameters:
sample (torch.Tensor) – a tensor of spectrograms
mask (torch.Tensor) – a length mask
- Returns:
result – a tensor of minima
- Return type:
torch.Tensor
- speechbrain.utils.data_utils.masked_max(sample, mask=None)[source]
A metric function that computes the maximum of each sample
- Parameters:
sample (torch.Tensor) – a tensor of spectrograms
mask (torch.Tensor) – a length mask
- Returns:
result – a tensor of maxima
- Return type:
torch.Tensor
- speechbrain.utils.data_utils.dist_stats(sample, mask=None)[source]
Returns standard distribution statistics (mean, std, min, max)
- Parameters:
sample (torch.Tensor) – a tensor of spectrograms
mask (torch.Tensor) – a length mask
- Returns:
result – the distribution statistics (mean, std, min, max) of the sample
- Return type:
- speechbrain.utils.data_utils.dict_value_combinations(values)[source]
Returns all possible key-value combinations from the given dictionary