speechbrain.inference.text module

Specifies the inference interfaces for text-processing modules.

Authors:
  • Aku Rouhe 2021

  • Peter Plantinga 2021

  • Loren Lugosch 2020

  • Mirco Ravanelli 2020

  • Titouan Parcollet 2021

  • Abdel Heba 2021

  • Andreas Nautsch 2022, 2023

  • Pooneh Mousavi 2023

  • Sylvain de Langen 2023

  • Adel Moumen 2023

  • Pradnya Kandarkar 2023

Summary

Classes:

GPTResponseGenerator

A ready-to-use Response Generator model

GraphemeToPhoneme

A pretrained model implementation for Grapheme-to-Phoneme (G2P) models that take raw natural language text as input and return the corresponding phoneme sequences

Llama2ResponseGenerator

A ready-to-use Response Generator model

ResponseGenerator

A ready-to-use Response Generator model

Reference

class speechbrain.inference.text.GraphemeToPhoneme(*args, **kwargs)[source]

Bases: Pretrained, EncodeDecodePipelineMixin

A pretrained model implementation for Grapheme-to-Phoneme (G2P) models that take raw natural language text as input and return the corresponding phoneme sequences

Parameters:
  • *args (tuple)

  • **kwargs (dict) – Arguments are forwarded to Pretrained parent class.

Example

>>> text = ("English is tough. It can be understood "
...         "through thorough thought though")
>>> from speechbrain.inference.text import GraphemeToPhoneme
>>> tmpdir = getfixture('tmpdir')
>>> g2p = GraphemeToPhoneme.from_hparams('path/to/model', savedir=tmpdir) 
>>> phonemes = g2p.g2p(text) 
INPUT_STATIC_KEYS = ['txt']

OUTPUT_KEYS = ['phonemes']

property phonemes

Returns the available phonemes

property language

Returns the language for which this model is available

g2p(text)[source]

Performs the Grapheme-to-Phoneme conversion

Parameters:

text (str or list[str]) – a single string to be converted to phonemes, or a sequence of strings

Returns:

result – if a single string was provided, the return value is a single list of phonemes; if a list of strings was provided, a list of phoneme lists is returned

Return type:

list
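
A hedged usage sketch continuing the class example above; the phoneme outputs shown are illustrative ARPABET-style placeholders, not actual model outputs:

>>> g2p.g2p("thought")
['TH', 'AO', 'T']
>>> g2p.g2p(["thought", "though"])
[['TH', 'AO', 'T'], ['DH', 'OW']]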

load_dependencies()[source]

Loads any relevant model dependencies

__call__(text)[source]

A convenience callable wrapper - equivalent to calling g2p

Parameters:

text (str or list[str]) – a single string to be converted to phonemes, or a sequence of strings

Returns:

result – if a single string was provided, the return value is a single list of phonemes; if a list of strings was provided, a list of phoneme lists is returned

Return type:

list

forward(noisy, lengths=None)[source]

Runs the grapheme-to-phoneme conversion on the given input (equivalent to calling g2p)

class speechbrain.inference.text.ResponseGenerator(*args, **kwargs)[source]

Bases: Pretrained

A ready-to-use Response Generator model

The class can be used to generate and continue a dialogue given the user's input. The given YAML must contain the fields specified in the *_NEEDED[] lists. It must be used with custom.py to load the expanded model with added tokens such as bos, eos, and the speaker tokens.

Parameters:
  • *args (tuple)

  • **kwargs (dict) – Arguments are forwarded to Pretrained parent class.

MODULES_NEEDED = ['model']
generate_response(turn)[source]

Complete a dialogue given the user's input.

Parameters:

turn (str) – User input, which is the last turn of the dialogue.

Returns:

Generated response for the user input based on the dialogue history.

Return type:

response
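
A hedged multi-turn sketch, assuming a concrete subclass such as GPTResponseGenerator has already been loaded as res_gen_model; the generator keeps the dialogue history, so successive calls continue the same conversation:

>>> reply = res_gen_model.generate_response("I need a hotel in the centre.")  
>>> follow_up = res_gen_model.generate_response("It should have free parking.")  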

prepare_input()[source]

Users should modify this function according to their own tasks.

generate()[source]

Users should modify this function according to their own tasks.
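
A minimal subclass sketch of these two hooks; the self.history attribute and the trivial echo logic below are assumptions for illustration, not the behavior of the shipped subclasses:

>>> from speechbrain.inference.text import ResponseGenerator
>>> class EchoResponseGenerator(ResponseGenerator):
...     def prepare_input(self):
...         # Join the stored dialogue turns into one prompt string.
...         return " ".join(self.history)
...     def generate(self, inputs):
...         # Trivial stand-in "generation": echo the prepared prompt.
...         return [inputs]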

class speechbrain.inference.text.GPTResponseGenerator(*args, **kwargs)[source]

Bases: ResponseGenerator

A ready-to-use Response Generator model

The class can be used to generate and continue a dialogue given the user's input. The given YAML must contain the fields specified in the *_NEEDED[] lists. It must be used with custom.py to load the expanded GPT model with added tokens such as bos, eos, and the speaker tokens.

Parameters:
  • *args (tuple)

  • **kwargs (dict) – Arguments are forwarded to Pretrained parent class.

Example

>>> from speechbrain.inference.text import GPTResponseGenerator
>>> tmpdir = getfixture("tmpdir")
>>> res_gen_model = GPTResponseGenerator.from_hparams(source="speechbrain/MultiWOZ-GPT-Response_Generation",
... pymodule_file="custom.py")  
>>> response = res_gen_model.generate_response("I want to book a table for dinner")  

generate(inputs)[source]

Complete a dialogue given the user’s input.

Parameters:

inputs (tuple) – a pair of history_bos, the tokenized history+input values with the appropriate speaker token prepended to each turn, and history_token_type, which gives the type of each token based on who uttered it (either User or System).

Returns:

Generated hypothesis for the user input based on the dialogue history.

Return type:

response
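
generate_response wires prepare_input and generate together; a hedged sketch of that flow, reusing res_gen_model from the example above:

>>> history_bos, history_token_type = res_gen_model.prepare_input()  
>>> hyps = res_gen_model.generate((history_bos, history_token_type))  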

prepare_input()[source]

Convert user input and previous histories to the format acceptable for the GPT model.

It concatenates all previous history with the current input and truncates the result based on the max_history value. It then tokenizes the input and generates an additional channel that determines the type of each token (System or User), as illustrated in the sketch after the returns list below.

Returns:

  • history_bos (torch.Tensor) – Tokenized history+input values with the appropriate speaker token prepended to each turn.

  • history_token_type (torch.LongTensor) – Type of each token based on who uttered it (either User or System)
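
A hedged illustration of how such a token-type channel can be built for a two-speaker dialogue; all token ids below are made up for the example:

>>> import torch
>>> user_id, system_id = 101, 102            # hypothetical speaker token ids
>>> user_turn = [user_id, 7, 42, 3]          # speaker token + tokenized turn
>>> system_turn = [system_id, 55, 9]
>>> history_bos = torch.tensor([user_turn + system_turn])
>>> history_token_type = torch.LongTensor(
...     [[user_id] * len(user_turn) + [system_id] * len(system_turn)])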

class speechbrain.inference.text.Llama2ResponseGenerator(*args, **kwargs)[source]

Bases: ResponseGenerator

A ready-to-use Response Generator model

The class can be used to generate and continue a dialogue given the user's input. The given YAML must contain the fields specified in the *_NEEDED[] lists. It must be used with custom.py to load the expanded Llama2 model with added tokens such as bos, eos, and the speaker tokens.

Parameters:
  • *args (tuple)

  • **kwargs (dict) – Arguments are forwarded to Pretrained parent class.

Example

>>> from speechbrain.inference.text import Llama2ResponseGenerator
>>> tmpdir = getfixture("tmpdir")
>>> res_gen_model = Llama2ResponseGenerator.from_hparams(source="speechbrain/MultiWOZ-Llama2-Response_Generation",
... pymodule_file="custom.py")  
>>> response = res_gen_model.generate_response("I want to book a table for dinner")  

generate(inputs)[source]

Complete a dialogue given the user's input.

Parameters:

inputs (torch.Tensor) – The prompt_bos tensor: prompted inputs to be passed to the Llama2 model for generation.

Returns:

Generated hypothesis for the user input based on the dialogue history.

Return type:

response

prepare_input()[source]

Convert user input and previous histories to the format acceptable for the Llama2 model.

It concatenates all previous history with the current input, truncates the result based on the max_history value, then tokenizes it and adds the prompts, as sketched below.

Returns:

prompt_bos – Tokenized history+input values with appropriate prompt.

Return type:

torch.Tensor
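
A hedged sketch of the truncation-and-prompting step; the role bookkeeping and the [INST]/[/INST] chat template below are assumptions for illustration, not the exact recipe:

>>> history = [("user", "Book me a table."),
...            ("system", "For how many people?"),
...            ("user", "Four, please.")]
>>> max_history = 2
>>> kept = history[-max_history:]            # drop the oldest turns
>>> prompt = " ".join(f"[INST] {t} [/INST]" if role == "user" else t
...                   for role, t in kept)

Tokenizing prompt with a bos token prepended would then yield a prompt_bos-style tensor.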