speechbrain.integrations.huggingface.wordemb.util module
Utilities for word embeddings
Authors
* Artem Ploujnikov 2021
Summary
Functions:
expand_to_chars | Expands word embeddings to a sequence of character embeddings, assigning each character the word embedding of the word to which it belongs
Reference
- speechbrain.integrations.huggingface.wordemb.util.expand_to_chars(emb, seq, seq_len, word_separator)
Expands word embeddings to a sequence of character embeddings, assigning each character the word embedding of the word to which it belongs
- Parameters:
emb (torch.Tensor) – a tensor of word embeddings
seq (torch.Tensor) – a tensor of character embeddings
seq_len (torch.Tensor) – a tensor of character embedding lengths
word_separator (torch.Tensor) – the word separator being used
- Returns:
char_word_emb – a combined character + word embedding tensor
- Return type:
torch.Tensor
Example
>>> import torch
>>> from speechbrain.integrations.huggingface.wordemb.util import expand_to_chars
>>> emb = torch.tensor(
...     [
...         [[1.0, 2.0, 3.0], [3.0, 1.0, 2.0], [0.0, 0.0, 0.0]],
...         [[1.0, 3.0, 2.0], [3.0, 2.0, 1.0], [2.0, 3.0, 1.0]],
...     ]
... )
>>> seq = torch.tensor([[1, 2, 0, 2, 1, 0], [1, 0, 1, 2, 0, 2]])
>>> seq_len = torch.tensor([4, 5])
>>> word_separator = 0
>>> expand_to_chars(emb, seq, seq_len, word_separator)
tensor([[[1., 2., 3.],
         [1., 2., 3.],
         [0., 0., 0.],
         [3., 1., 2.],
         [3., 1., 2.],
         [0., 0., 0.]],
        [[1., 3., 2.],
         [0., 0., 0.],
         [3., 2., 1.],
         [3., 2., 1.],
         [0., 0., 0.],
         [2., 3., 1.]]])
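The mapping shown in the example above can be illustrated with a minimal pure-Python sketch. This is not the library's implementation: the helper name `expand_to_chars_sketch` is hypothetical, it handles a single sequence rather than a batch, and it leaves out `seq_len` because the description above does not say how the lengths are applied. It assumes, based only on the example output, that each separator position receives a zero vector and moves the lookup on to the next word.

```python
def expand_to_chars_sketch(emb, seq, word_separator):
    """Illustrative only: expand word embeddings to character positions.

    emb: list of word embedding vectors, one per word (single sequence)
    seq: list of character ids
    word_separator: the id that marks a word boundary
    """
    zero = [0.0] * len(emb[0])
    result = []
    word_idx = 0  # index of the word the current character belongs to
    for char in seq:
        if char == word_separator:
            # Assumption taken from the example output: separators get
            # a zero vector, and the following characters belong to the
            # next word.
            result.append(list(zero))
            word_idx += 1
        else:
            # Every non-separator character copies its word's embedding.
            result.append(list(emb[word_idx]))
    return result


# First batch item from the example above.
emb = [[1.0, 2.0, 3.0], [3.0, 1.0, 2.0], [0.0, 0.0, 0.0]]
seq = [1, 2, 0, 2, 1, 0]
for row in expand_to_chars_sketch(emb, seq, word_separator=0):
    print(row)
```

The printed rows correspond to the first block of the tensor in the example. Applying the same helper to the second batch item reproduces the second block as well.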