speechbrain.utils.semdist module
Provides a metrics class for the SemDist metric.
Authors:
* Sylvain de Langen 2024
Summary
Classes:
BaseSemDistStats | Base class to implement the SemDist metric, for the variants that estimate a single cosine similarity per pair of target and predicted texts.
SemDistStats | Computes the SemDist metric with a provided HuggingFace Transformers text encoder.
Reference
- class speechbrain.utils.semdist.BaseSemDistStats(embed_function: Callable[[List[str]], Tensor], scale: float = 1000.0, batch_size: int = 64)
Bases: MetricStats
Base class to implement the SemDist metric, for the variants that estimate a single cosine similarity per pair of target and predicted texts. The SemDist metrics are described in the paper "Evaluating User Perception of Speech Recognition System Quality with Semantic Distance Metric".
- Parameters:
embed_function (Callable[[List[str]], torch.Tensor]) – Given a list of sentences, return their summarized embedding using the method of your choice (e.g. mean pooling).
scale (float, optional) – The α scale applied to the cosine similarity result for clarity. The default is 1000, in order to match the authors' recommendation.
batch_size (int, optional) – How many pairs of utterances should be considered at once. Higher is faster but may result in OOM.
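To make the role of batch_size concrete, here is a minimal sketch of chunking pairs so that the embedding callable sees at most batch_size sentences per call. It is an illustration only: `iter_pair_batches` and `toy_embed` are hypothetical names that are not part of SpeechBrain, and `toy_embed` returns plain lists, whereas a real embed_function would return a torch.Tensor produced by a text encoder.

```python
from typing import Iterator, List, Sequence, Tuple


def iter_pair_batches(
    refs: Sequence[str], hyps: Sequence[str], batch_size: int = 64
) -> Iterator[Tuple[List[str], List[str]]]:
    """Yield (reference, hypothesis) chunks of at most batch_size pairs."""
    if len(refs) != len(hyps):
        raise ValueError("refs and hyps must have the same length")
    for start in range(0, len(refs), batch_size):
        end = start + batch_size
        yield list(refs[start:end]), list(hyps[start:end])


def toy_embed(sentences: List[str]) -> List[List[float]]:
    """Hypothetical stand-in for embed_function: one vector per sentence.

    A real embed_function would return a torch.Tensor of shape
    (len(sentences), embedding_dim).
    """
    return [[float(len(s)), float(s.count(" ") + 1)] for s in sentences]


refs = ["the cat sat", "hello world", "good morning"]
hyps = ["the cat sat", "hello word", "good mourning"]

for ref_batch, hyp_batch in iter_pair_batches(refs, hyps, batch_size=2):
    ref_emb = toy_embed(ref_batch)  # one call per chunk of references
    hyp_emb = toy_embed(hyp_batch)  # one call per chunk of hypotheses
    print(len(ref_emb), len(hyp_emb))
```

With three pairs and batch_size=2, the loop runs twice (a chunk of two pairs, then a chunk of one). A larger batch_size means fewer encoder calls, at the cost of more memory per call.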
- summarize(field=None)
Summarize the SemDist metric scores. Performs the actual embedding function call and SemDist calculation.
Full set of fields:
- semdist: The average SemDist over all utterances, multiplied by the scale optionally specified at initialization.
Additionally, a scores list is populated by this function for each pair of sentences. Each entry of that list is a dict, with the fields:
- key: the ID of the utterance.
- semdist: The SemDist of the utterance, multiplied by the scale.
- Parameters:
field (str, optional) – The field to return, if you are only interested in one of them. If specified, a single float is returned, otherwise, a dict is.
- Returns:
dict from str to float, if field is None – A dictionary of the fields documented above.
float, if field is not None – The single field selected by field.
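As a rough illustration of the summary and scores structures described above, the sketch below works on precomputed embedding vectors in plain Python. It assumes that the per-pair score is (1 - cosine similarity) multiplied by the scale, which is how the paper defines a semantic distance; the helper names `cosine_similarity` and `summarize_semdist` are hypothetical and are not SpeechBrain functions. Unlike the real method, which stores scores on the object, this sketch returns it alongside the result.

```python
import math
from typing import Dict, List, Optional, Sequence, Tuple, Union


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity between two equal-length, non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def summarize_semdist(
    ids: Sequence[str],
    ref_embs: Sequence[Sequence[float]],
    hyp_embs: Sequence[Sequence[float]],
    scale: float = 1000.0,
    field: Optional[str] = None,
) -> Tuple[Union[float, Dict[str, float]], List[dict]]:
    """Return (summary or single field, per-utterance scores)."""
    scores = []
    for key, ref, hyp in zip(ids, ref_embs, hyp_embs):
        # Assumption: distance = 1 - cosine similarity, then scaled.
        semdist = (1.0 - cosine_similarity(ref, hyp)) * scale
        scores.append({"key": key, "semdist": semdist})

    summary = {"semdist": sum(s["semdist"] for s in scores) / len(scores)}
    if field is not None:
        return summary[field], scores
    return summary, scores


summary, scores = summarize_semdist(
    ids=["utt1", "utt2"],
    ref_embs=[[1.0, 0.0], [1.0, 0.0]],
    hyp_embs=[[1.0, 0.0], [0.0, 1.0]],
)
print(summary)
print(scores)
```

In this toy input, the first pair has identical embeddings and the second pair has orthogonal ones, so the per-utterance values sit at the two ends of the 0 to scale range and the summary is their mean.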
- class speechbrain.utils.semdist.SemDistStats(lm, method: Literal['meanpool', 'cls'] = 'meanpool', *args, **kwargs)
Bases: BaseSemDistStats
Computes the SemDist metric with a provided HuggingFace Transformers text encoder.
- Parameters:
lm (speechbrain.lobes.models.huggingface_transformers.TextEncoder) – HF Transformers tokenizer and text encoder wrapper to use as a LM.
method ("meanpool" or "cls") –
"meanpool" (default): Computes the mean of all contextualized embeddings, excluding padding tokens.
"cls": Exclusively uses the first contextualized embedding. With BERT-like tokenizers this is the [CLS] token, which is typically intended to capture classification information.
*args – Extra positional arguments passed to the base constructor.
**kwargs – Extra keyword arguments passed to the base constructor.
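The difference between the two pooling methods can be sketched on a single sentence in plain Python. This is an illustration under stated assumptions, not the SpeechBrain implementation: `mean_pool` and `cls_pool` are hypothetical helpers, the attention mask convention (1 for real tokens, 0 for padding) follows common HuggingFace usage, and a real encoder would operate on batched torch tensors rather than lists.

```python
from typing import List

Vector = List[float]


def mean_pool(token_embs: List[Vector], attention_mask: List[int]) -> Vector:
    """Average the embeddings of non-padding tokens (mask value 1)."""
    kept = [emb for emb, m in zip(token_embs, attention_mask) if m == 1]
    if not kept:
        raise ValueError("attention_mask selects no tokens")
    dim = len(kept[0])
    return [sum(emb[i] for emb in kept) / len(kept) for i in range(dim)]


def cls_pool(token_embs: List[Vector]) -> Vector:
    """Return the first token embedding ([CLS] with BERT-like tokenizers)."""
    return list(token_embs[0])


# One sentence: [CLS], two word tokens, then one padding position.
token_embs = [[1.0, 0.0], [3.0, 2.0], [5.0, 4.0], [9.0, 9.0]]
attention_mask = [1, 1, 1, 0]

print(mean_pool(token_embs, attention_mask))  # [3.0, 2.0]
print(cls_pool(token_embs))  # [1.0, 0.0]
```

The padding vector [9.0, 9.0] has no effect on the mean-pooled result, which is the point of masking: without it, sentence embeddings would depend on how much padding a batch happens to contain.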