speechbrain.utils.bleu module

Library for computing the BLEU score

Authors

Mirco Ravanelli 2021

Summary

Classes:

BLEUStats

A class for tracking BLEU (https://www.aclweb.org/anthology/P02-1040.pdf).

Functions:

merge_words

Merge successive words into a phrase, inserting a space between words.

Reference

speechbrain.utils.bleu.merge_words(sequences)

Merge successive words into a phrase, inserting a space between words.

Parameters: sequences (list) – A list of word sequences; each item is itself a list of words.
Returns: A list of phrases, one per input sequence.
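
As a rough illustration of the behavior described above, the following standalone sketch joins each word list with single spaces. The name `merge_words_sketch` is hypothetical and is not part of SpeechBrain; the library's own implementation may differ in detail.

```python
def merge_words_sketch(sequences):
    """Join each inner list of words into one space-separated phrase.

    Hypothetical stand-in that mirrors the documented behavior of
    speechbrain.utils.bleu.merge_words; it is not the library's own code.
    """
    return [" ".join(words) for words in sequences]


phrases = merge_words_sketch([["the", "cat", "sat"], ["hello", "world"]])
print(phrases)  # ['the cat sat', 'hello world']
```

Each input sequence yields exactly one output phrase, so the returned list has the same length as `sequences`.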

class speechbrain.utils.bleu.BLEUStats(lang='en', merge_words=True)

Bases: MetricStats

A class for tracking BLEU (https://www.aclweb.org/anthology/P02-1040.pdf).

Parameters: merge_words (bool) – Whether to merge successive words to create sentences.

Example

>>> from speechbrain.utils.bleu import BLEUStats
>>> bleu = BLEUStats()
>>> i2l = {0: 'a', 1: 'b'}
>>> bleu.append(
...     ids=['utterance1'],
...     predict=[[0, 1, 1]],
...     targets=[[[0, 1, 0]], [[0, 1, 1]], [[1, 1, 0]]],
...     ind2lab=lambda batch: [[i2l[int(x)] for x in seq] for seq in batch],
... )
>>> stats = bleu.summarize()
>>> stats['BLEU']
0.0

append(ids, predict, targets, ind2lab=None)

Add stats to the relevant containers. See MetricStats.append().

Parameters:
    ids (list) – List of ids corresponding to utterances.
    predict (torch.tensor) – A predicted output, for comparison with the target output.
    targets (list) – List of references (when measuring BLEU, one sentence can have more than one target translation).
    ind2lab (callable) – Callable that maps from indices to labels, operating on batches, for writing alignments.
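
To make the expected shapes concrete, here is a small standalone sketch of an `ind2lab` callable and of a `targets` list holding several references, modeled on the class example. The label map `i2l` and the helper name `ind2lab_sketch` are illustrative assumptions, and the reading of the `targets` nesting (outer list of reference sets, each holding one sequence per utterance) is inferred from that example rather than stated by the documentation.

```python
# Hypothetical index-to-label map, as in the class example.
i2l = {0: "a", 1: "b"}


def ind2lab_sketch(batch):
    """Map a batch of index sequences to a batch of label sequences."""
    return [[i2l[int(x)] for x in seq] for seq in batch]


# One utterance in the batch, with three alternative references.
# Assumed layout: the outer list enumerates reference sets, and each
# reference set contains one sequence per utterance in the batch.
predict = [[0, 1, 1]]
targets = [[[0, 1, 0]], [[0, 1, 1]], [[1, 1, 0]]]

predicted_labels = ind2lab_sketch(predict)
reference_labels = [ind2lab_sketch(ref_set) for ref_set in targets]

print(predicted_labels)  # [['a', 'b', 'b']]
print(reference_labels)  # [[['a', 'b', 'a']], [['a', 'b', 'b']], [['b', 'b', 'a']]]
```

The callable takes a whole batch and returns a whole batch, which is why the same function can be applied to `predict` and to each reference set in turn.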

summarize(field=None)

Summarize the BLEU score and return relevant statistics. See MetricStats.summarize().

write_stats(filestream)

Write all relevant info (e.g., error rate alignments) to file. See MetricStats.write_stats().