hyperpyyaml.core module

This library gathers utilities for hyperpyyaml loading

Authors
  • Peter Plantinga 2020

  • Aku Rouhe 2020

Summary

Classes:

Placeholder

Class for dumping !PLACEHOLDER tags to yaml

RefTag

Class for dumping !ref tags to yaml

Functions:

deref

Find the value referred to by a reference in bracket notation

dump_hyperpyyaml

Dump yaml including placeholder and reference tags.

load_hyperpyyaml

This function implements the HyperPyYAML syntax

parse_arithmetic

Parses simple arithmetic operations in references

recursive_resolve

Resolve a reference to a value, following chained references

recursive_update

Similar function to dict.update, but for a nested dict.

resolve_references

Resolves inter-document references, a component of HyperPyYAML.

Reference

hyperpyyaml.core.load_hyperpyyaml(yaml_stream, overrides=None, overrides_must_match=True)

This function implements the HyperPyYAML syntax

The purpose of this syntax is to allow compact, structured definitions of hyperparameters and functions. This function implements a few extensions to the YAML syntax, listed below.

PyYAML complex tag shortcuts

Part of our clean structured hyperparameter interface is being able to specify python objects easily and cleanly. This is possible with native YAML using the following syntax:

alignment_saver: !!python/object/new:speechbrain.data_io.TensorSaver
    kwargs: {save_dir: results/asr/ali}

However, due to the extensive use within speechbrain yaml files, we have added a shortcut for this that has the following syntax:

alignment_saver: !new:speechbrain.data_io.TensorSaver
    save_dir: results/asr/ali

In this example, alignment_saver will be an instance of the TensorSaver class, with 'results/asr/ali' passed to the __init__() method as a keyword argument. This is equivalent to:

import speechbrain.data_io
alignment_saver = speechbrain.data_io.TensorSaver(
    save_dir='results/asr/ali'
)

We have also implemented a few more shortcuts:

!!python/name: => !name:
!!python/module: => !module:
!!python/object/apply: => !apply:

References and copies

Internal references to any node in the file are allowed. Any node with the tag !ref creates an object reference to the YAML object at the <key[subkey]> location within the same file, following chains of references.

output_folder: results/asr
alignment_saver: !new:speechbrain.data_io.TensorSaver
    save_dir: !ref <output_folder>

String values are handled specially: references are substituted but the rest of the string is left in place, allowing filepaths to be easily extended:

output_folder: results/asr
alignment_saver: !new:speechbrain.data_io.TensorSaver
    save_dir: !ref <output_folder>/ali  # results/asr/ali
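
The string substitution described above can be pictured with a small regular-expression sketch. The names REF_PATTERN and interpolate are assumptions made for this example, and the sketch only handles flat keys, not nested <key[subkey]> references.

```python
import re

# Matches a reference such as <output_folder> (flat keys only in this sketch).
REF_PATTERN = re.compile(r"<([^<>]+)>")


def interpolate(text, values):
    """Replace each <key> in `text` with str(values[key]); keep the rest as-is."""
    return REF_PATTERN.sub(lambda match: str(values[match.group(1)]), text)


values = {"output_folder": "results/asr"}
save_dir = interpolate("<output_folder>/ali", values)
print(save_dir)  # results/asr/ali
```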

A more complex example for demonstration purposes:

key1: {a: !new:object {arg1: 1}}
key2: !ref <key1[a]>

Here, key2 will contain a reference to the a object, so changing a.arg1 will also change key2.arg1. If you need a deep copy of the object instead of a shallow reference, you can use a similar syntax with the tag !copy. For example:

key1: {a: !new:object {arg1: 1}}
key2: !copy <key1[a]>
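
The difference between the two tags mirrors the difference between sharing an object and deep-copying it in plain Python. The following sketch uses a stand-in class named Thing (hypothetical, standing in for any object built with !new:) to show the behaviour one would expect.

```python
import copy


class Thing:
    """Stand-in for an object created by a !new: tag (illustrative only)."""

    def __init__(self, arg1):
        self.arg1 = arg1


key1 = {"a": Thing(arg1=1)}
ref_value = key1["a"]                  # like !ref: the very same object
copy_value = copy.deepcopy(key1["a"])  # like !copy: an independent object

key1["a"].arg1 = 2
print(ref_value.arg1)   # 2, the change is visible through the reference
print(copy_value.arg1)  # 1, the copy is unaffected
```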

References also support very basic arithmetic, so:

key1: 1
key2: !ref <key1> + 3  # this is 4

Tuples

One last minor enhancement is an implicit tuple resolver. A string value such as (3, 4) is given the tag !tuple and is then interpreted as a tuple.
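
One way to picture the tuple handling is a pattern check followed by a safe literal evaluation. This is only a sketch: TUPLE_PATTERN and maybe_tuple are hypothetical names, and the library itself registers a YAML resolver and constructor rather than post-processing strings like this.

```python
import ast
import re

# A scalar that starts with "(" and ends with ")" is a tuple candidate.
TUPLE_PATTERN = re.compile(r"^\(.*\)$")


def maybe_tuple(scalar):
    """Return a tuple if `scalar` is a string that looks like one, else `scalar`."""
    if not isinstance(scalar, str) or not TUPLE_PATTERN.match(scalar.strip()):
        return scalar
    try:
        value = ast.literal_eval(scalar.strip())
    except (ValueError, SyntaxError):
        return scalar  # not a literal Python expression, leave untouched
    return value if isinstance(value, tuple) else scalar


print(maybe_tuple("(3, 4)"))       # (3, 4)
print(maybe_tuple("results/asr"))  # results/asr
```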

Parameters
  • yaml_stream (stream) – A file-like object or string from which to read.

  • overrides (mapping or str) – A set of overrides for the values read from the stream. Because YAML is a nested structure, the overrides can be nested too. See recursive_update.

  • overrides_must_match (bool) – Whether an error will be thrown when an override does not match a corresponding key in the yaml_stream.

  • return_dict (bool) – Whether to return a dictionary rather than the default namespace.

Returns

hparams – Reflects the structure of yaml_stream.

Return type

dict

Example

>>> yaml_string = """
... a: 3
... thing: !new:collections.Counter
...     b: !ref <a>
... """
>>> params = load_hyperpyyaml(yaml_string)
>>> params["thing"]
Counter({'b': 3})

class hyperpyyaml.core.RefTag(ref_str)

Bases: object

Class for dumping !ref tags to yaml

Parameters

ref_str (str) – String including yaml keys in <key> notation

Example

See dump_hyperpyyaml

yaml_tag = '!ref'

classmethod to_yaml(representer, node)

class hyperpyyaml.core.Placeholder

Bases: object

Class for dumping !PLACEHOLDER tags to yaml

Example

See dump_hyperpyyaml

yaml_tag = '!PLACEHOLDER'

classmethod to_yaml(representer, node)

hyperpyyaml.core.dump_hyperpyyaml(yaml_tree, output_stream, *args, **kwargs)

Dump yaml including placeholder and reference tags.

Parameters
  • yaml_tree (dict) – An object to dump

  • output_stream (stream) – A file stream for putting the yaml

  • *args – Arguments to forward to ruamel.yaml.YAML().dump()

  • **kwargs – Arguments to forward to ruamel.yaml.YAML().dump()

Example

>>> to_yaml = {'a': Placeholder(), 'b': RefTag('<a>')}
>>> stringio = StringIO()
>>> dump_hyperpyyaml(to_yaml, stringio)
>>> stringio.getvalue()
'a: !PLACEHOLDER\nb: !ref <a>\n'

hyperpyyaml.core.resolve_references(yaml_stream, overrides=None, overrides_must_match=False)

Resolves inter-document references, a component of HyperPyYAML.

Parameters
  • yaml_stream (stream) – A file-like object or string with the contents of a yaml file written with the HyperPyYAML syntax.

  • overrides (mapping or str) – Replacement values, either in a yaml-formatted string or a dict.

  • overrides_must_match (bool) – Whether an error will be thrown when an override does not match a corresponding key in the yaml_stream. This is the opposite default from load_hyperpyyaml because resolve_references doesn’t need to be as strict by default.

Returns

A yaml-formatted stream with all references and overrides resolved.

Return type

stream

Example

>>> yaml_string = """
... constants:
...     a: 3
...     b: !ref <constants[a]>
... """
>>> overrides = {'constants': {'a': 4}}
>>> resolve_references(yaml_string, overrides).getvalue()
'constants:\n  a: 4\n  b: 4\n'

hyperpyyaml.core.deref(ref, full_tree, copy_mode=False)

Find the value referred to by a reference in bracket notation

Parameters
  • ref (str) – The location of the requested value, e.g. ‘constants[param]’

  • full_tree (dict) – The dictionary to use for finding values

  • copy_mode (bool) – Whether to copy the node before dereferencing.

Returns

The node in the full_tree dictionary referenced by ref.

Return type

node

Example

>>> deref('constants[a][b]', {'constants': {'a': {'b': 'c'}}})
'c'
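
The lookup in the example above can be approximated in a few lines of plain Python. The function name lookup is hypothetical, and the sketch assumes a tree of nested dicts addressed as key[subkey][...]; it does not cover copy_mode.

```python
def lookup(ref, tree):
    """Follow a reference such as 'constants[a][b]' through nested dicts."""
    node = tree
    for part in ref.split("["):
        key = part.rstrip("]")  # 'a]' becomes 'a'
        if not isinstance(node, dict) or key not in node:
            raise ValueError(f"The reference {ref!r} is not valid")
        node = node[key]
    return node


tree = {"constants": {"a": {"b": "c"}}}
print(lookup("constants[a][b]", tree))  # c
print(lookup("constants[a]", tree))     # {'b': 'c'}
```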

hyperpyyaml.core.recursive_resolve(reference, reference_list, full_tree, copy_mode=False)

Resolve a reference to a value, following chained references

Parameters
  • reference (str) – a string containing ‘<x[y]>’ in it where x[y] refers to a scalar node in the file.

  • reference_list (list) – list of prior references in the chain, in order to catch circular references.

  • full_tree (dict) – the dictionary in which to find all references and their values.

  • copy_mode (bool) – Whether to perform a deep copy of the referenced node, rather than a shallow reference to the same object.

Returns

The dereferenced value, with possible string interpolation and arithmetic parsing.

Return type

scalar

Example

>>> tree = {'a': 3, 'b': 'x', 'c': '<a>', 'd': '<c>/<c>', 'e': '<b>/<b>'}
>>> recursive_resolve('<d>', [], tree)
1.0
>>> recursive_resolve('<e>', [], tree)
'x/x'
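
To make the role of reference_list concrete, here is a simplified sketch of chained resolution with circular-reference detection. The name resolve is hypothetical; unlike recursive_resolve, this version always returns a string, handles flat keys only, and skips the arithmetic step.

```python
import re

REF_PATTERN = re.compile(r"<([^<>]+)>")


def resolve(reference, seen, tree):
    """Substitute <key> references in `reference`, following chains.

    `seen` lists the keys already visited on the current chain, so that a
    key appearing twice can be reported as a circular reference.
    """

    def substitute(match):
        key = match.group(1)
        if key in seen:
            raise ValueError(f"Circular reference detected: {seen + [key]}")
        value = tree[key]
        if isinstance(value, str):
            # The value may itself contain references, so resolve it too.
            value = resolve(value, seen + [key], tree)
        return str(value)

    return REF_PATTERN.sub(substitute, reference)


tree = {"a": 3, "b": "x", "c": "<a>", "e": "<b>/<b>", "loop": "<loop>"}
print(resolve("<e>", [], tree))  # x/x
print(resolve("<c>", [], tree))  # 3 (as a string in this sketch)
```

Calling resolve("<loop>", [], tree) raises ValueError, because the key loop shows up again while it is still being resolved.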

hyperpyyaml.core.parse_arithmetic(reference_string)

Parses simple arithmetic operations in references

Adapted from https://stackoverflow.com/a/9558001/1761970

Parameters

reference_string (str) – A string with references and possible arithmetic operations.

Returns

Result of parsing and applying the arithmetic.

Return type

int or float

Example

>>> parse_arithmetic('2 * 6')
12
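
The Stack Overflow answer cited above evaluates expressions by walking a Python AST with a whitelist of operators. A minimal sketch in that spirit follows; the name evaluate and the exact set of supported operators are assumptions for this example, not a description of the library's code.

```python
import ast
import operator

# Whitelisted operators; anything else is rejected.
OPERATORS = {
    ast.Add: operator.add,
    ast.Sub: operator.sub,
    ast.Mult: operator.mul,
    ast.Div: operator.truediv,
    ast.USub: operator.neg,
}


def evaluate(expression):
    """Safely evaluate a small arithmetic expression given as a string."""

    def visit(node):
        if isinstance(node, ast.Constant) and type(node.value) in (int, float):
            return node.value
        if isinstance(node, ast.BinOp) and type(node.op) in OPERATORS:
            return OPERATORS[type(node.op)](visit(node.left), visit(node.right))
        if isinstance(node, ast.UnaryOp) and type(node.op) in OPERATORS:
            return OPERATORS[type(node.op)](visit(node.operand))
        raise ValueError(f"Unsupported expression: {expression!r}")

    return visit(ast.parse(expression, mode="eval").body)


print(evaluate("2 * 6"))  # 12
print(evaluate("3 / 3"))  # 1.0
print(evaluate("1 + 3"))  # 4
```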

hyperpyyaml.core.recursive_update(d, u, must_match=False)

Similar function to dict.update, but for a nested dict.

From: https://stackoverflow.com/a/3233356

If you have a nested mapping structure, for example:

{"a": 1, "b": {"c": 2}}

Say you want to update the above structure with:

{"b": {"d": 3}}

This function will produce:

{"a": 1, "b": {"c": 2, "d": 3}}

Instead of:

{"a": 1, "b": {"d": 3}}

Parameters
  • d (dict) – mapping to be updated

  • u (dict) – mapping to update with

  • must_match (bool) – Whether to throw an error if the key in u does not exist in d.

Example

>>> d = {'a': 1, 'b': {'c': 2}}
>>> recursive_update(d, {'b': {'d': 3}})
>>> d
{'a': 1, 'b': {'c': 2, 'd': 3}}
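
For comparison with dict.update, a nested merge along these lines could be sketched as below. The name nested_update is hypothetical and the body is an assumption about how such a merge might be written, not the library's implementation.

```python
from collections.abc import Mapping


def nested_update(d, u, must_match=False):
    """Merge mapping `u` into mapping `d` in place, descending into sub-mappings."""
    for key, value in u.items():
        if must_match and key not in d:
            raise KeyError(f"Override {key!r} not found in: {list(d)}")
        if isinstance(value, Mapping) and isinstance(d.get(key), Mapping):
            nested_update(d[key], value, must_match)  # merge, do not replace
        else:
            d[key] = value


d = {"a": 1, "b": {"c": 2}}
nested_update(d, {"b": {"d": 3}})
print(d)  # {'a': 1, 'b': {'c': 2, 'd': 3}}

flat = {"a": 1, "b": {"c": 2}}
flat.update({"b": {"d": 3}})  # plain dict.update replaces the whole sub-dict
print(flat)  # {'a': 1, 'b': {'d': 3}}
```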