speechbrain.processing.decomposition module

Generalized Eigenvalue Decomposition.

This library contains different methods to adjust the format of complex Hermitian matrices and find their eigenvectors and eigenvalues.

Authors
  • William Aris 2020

  • Francois Grondin 2020

Summary

Functions:

f

Transform 1.

finv

Inverse transform 1.

g

Transform 2.

gevd

This method computes the eigenvectors and the eigenvalues of complex Hermitian matrices.

ginv

Inverse transform 2.

inv

Inverse Hermitian Matrix.

pos_def

Diagonal modification.

svdl

Singular Value Decomposition (Left Singular Vectors).

Reference

speechbrain.processing.decomposition.gevd(a, b=None)[source]

This method computes the eigenvectors and the eigenvalues of complex Hermitian matrices. The method finds a solution to the problem AV = BVD, where the columns of V are the eigenvectors and D is the diagonal matrix of eigenvalues.

The eigenvectors returned by the method (vs) are stored in a tensor with the following format (*,C,C,2).

The eigenvalues returned by the method (ds) are stored in a tensor with the following format (*,C,C,2).
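As a rough illustration of the problem AV = BVD, the NumPy sketch below solves a small generalized Hermitian eigenproblem by reducing it to a standard one with a Cholesky factor of B. This is one textbook route and is not claimed to be how gevd is implemented. The helper name generalized_eigh and the matrix B are hypothetical, and full complex arrays are used instead of the packed tensor format.

```python
import numpy as np


def generalized_eigh(a, b):
    """Solve a @ v = b @ v @ diag(d) for Hermitian a and Hermitian positive-definite b."""
    chol = np.linalg.cholesky(b)              # b = chol @ chol^H, chol is lower triangular
    chol_inv = np.linalg.inv(chol)
    c = chol_inv @ a @ chol_inv.conj().T      # equivalent standard Hermitian problem
    d, y = np.linalg.eigh(c)                  # real eigenvalues in ascending order
    v = chol_inv.conj().T @ y                 # map eigenvectors back: v = chol^-H @ y
    return v, d


A = np.array(
    [[52, 34 + 37j, 16 + 28j],
     [34 - 37j, 125, 41 + 3j],
     [16 - 28j, 41 - 3j, 62]]
)
# Hypothetical Hermitian positive-definite B, chosen only for illustration.
B = np.array(
    [[2, 1j, 0],
     [-1j, 3, 0.5],
     [0, 0.5, 1]]
)

V, D = generalized_eigh(A, B)

# Largest entry of A V - B V D; it should be at round-off level.
residual = np.abs(A @ V - B @ V @ np.diag(D)).max()
```

With B set to the identity, the same routine reduces to an ordinary Hermitian eigendecomposition, which matches the documented behaviour when b is None.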

Parameters
  • a (tensor) – A first input matrix. It is equivalent to the matrix A in the equation in the description above. The tensor must have the following format: (*,2,C+P).

  • b (tensor) – A second input matrix. It is equivalent to the matrix B in the equation in the description above. The tensor must have the following format: (*,2,C+P). This argument is optional and its default value is None. If b is None, it is replaced by the identity matrix in the computations.
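The (*,2,C+P) format stores only the upper triangle of a Hermitian matrix: one row of real parts and one row of imaginary parts, with C diagonal entries and P off-diagonal pairs. The pure-Python sketch below packs a 3 x 3 matrix this way. The row-major ordering of the upper triangle is an assumption inferred from the example tensor in this section, and pack_upper is a hypothetical helper, not part of the module.

```python
def pack_upper(m):
    """Pack a Hermitian matrix (list of lists) into [real_row, imag_row].

    Entries are taken from the upper triangle in row-major order (assumed layout).
    """
    c = len(m)
    entries = [complex(m[i][j]) for i in range(c) for j in range(i, c)]
    return [[z.real for z in entries], [z.imag for z in entries]]


A = [
    [52, 34 + 37j, 16 + 28j],
    [34 - 37j, 125, 41 + 3j],
    [16 - 28j, 41 - 3j, 62],
]

packed = pack_upper(A)
# For C = 3 the packed length is C + P = 3 + 3 = 6.
```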

Example

Suppose we would like to compute eigenvalues/eigenvectors on the following complex Hermitian matrix:

A = [ 52         34 + 37j   16 + 28j ;
      34 - 37j   125        41 + 3j  ;
      16 - 28j   41 - 3j    62       ]

>>> import torch
>>> from speechbrain.processing.decomposition import gevd
>>> a = torch.FloatTensor([[52,34,16,125,41,62],[0,37,28,0,3,0]])
>>> vs, ds = gevd(a)

This corresponds to:

D = [ 20.9513   0         0        ;
      0         43.9420   0        ;
      0         0         174.1067 ]

V = [  0.085976 - 0.85184j    -0.24620 + 0.12244j   -0.24868 - 0.35991j  ;
      -0.16006 + 0.20244j      0.37084 + 0.40173j   -0.79175 - 0.087312j ;
      -0.43990 + 0.082884j    -0.36724 - 0.70045j   -0.41728 + 0j        ]

where

A = VDV^-1
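The relation A = VDV^-1 can be cross-checked independently of SpeechBrain. The sketch below uses numpy.linalg.eigh on the same matrix as a full complex array. Eigenvectors are only defined up to a unit-modulus phase, so the columns of V may differ from the values printed above while still satisfying the same identity.

```python
import numpy as np

A = np.array(
    [[52, 34 + 37j, 16 + 28j],
     [34 - 37j, 125, 41 + 3j],
     [16 - 28j, 41 - 3j, 62]]
)

# eigh returns real eigenvalues in ascending order and orthonormal eigenvectors.
D, V = np.linalg.eigh(A)

# For a Hermitian matrix V is unitary, so V^-1 is the conjugate transpose of V.
reconstructed = V @ np.diag(D) @ V.conj().T
max_error = np.abs(reconstructed - A).max()
```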

speechbrain.processing.decomposition.svdl(a)[source]

Singular Value Decomposition (Left Singular Vectors).

This function finds the eigenvalues and eigenvectors of the input multiplied by its transpose (a x a.T).

The function will return (in this order):
  1. The eigenvectors in a tensor with the format (*,C,C,2)

  2. The eigenvalues in a tensor with the format (*,C,C,2)

Parameters

a (tensor) – A complex input matrix to work with. The tensor must have the following format: (*,2,C+P).
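The link between left singular vectors and the eigendecomposition of the input times its transpose can be sketched with NumPy. The sketch assumes that, for a complex input, the transpose means the conjugate transpose, and it works on a full complex array rather than the packed format.

```python
import numpy as np

A = np.array(
    [[52, 34 + 37j, 16 + 28j],
     [34 - 37j, 125, 41 + 3j],
     [16 - 28j, 41 - 3j, 62]]
)

# Eigendecomposition of A A^H: the eigenvectors are left singular vectors of A.
gram = A @ A.conj().T
d, u = np.linalg.eigh(gram)

# Singular values of A (returned in descending order by NumPy).
s = np.linalg.svd(A, compute_uv=False)

# The eigenvalues of A A^H are the squared singular values of A.
squares_match = np.allclose(np.sort(s ** 2), d)
```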

Example

>>> import torch
>>> from speechbrain.processing.features import STFT
>>> from speechbrain.processing.multi_mic import Covariance
>>> from speechbrain.processing.decomposition import svdl
>>> from speechbrain.dataio.dataio import read_audio_multichannel
>>> xs_speech = read_audio_multichannel(
...    'samples/audio_samples/multi_mic/speech_-0.82918_0.55279_-0.082918.flac'
... )
>>> xs_noise = read_audio_multichannel('samples/audio_samples/multi_mic/noise_diffuse.flac')
>>> xs = xs_speech + 0.05 * xs_noise
>>> xs = xs.unsqueeze(0).float()
>>>
>>> stft = STFT(sample_rate=16000)
>>> cov = Covariance()
>>>
>>> Xs = stft(xs)
>>> XXs = cov(Xs)
>>> us, ds = svdl(XXs)

speechbrain.processing.decomposition.f(ws)[source]

Transform 1.

This method takes a complex Hermitian matrix represented by its upper triangular part and converts it to a block matrix representing the full original matrix with real numbers. The output tensor will have the following format: (*,2C,2C)

Parameters

ws (tensor) – An input matrix. The tensor must have the following format: (*,2,C+P)
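A minimal NumPy sketch of this kind of transform follows. It expands the packed upper triangle into the full Hermitian matrix and then builds a real block matrix of size 2C x 2C. The quadrant layout [[Re, -Im], [Im, Re]] and the row-major packing order are assumptions made for illustration; the exact block arrangement used by f is not stated here, and f_sketch is a hypothetical name. Batch dimensions are omitted.

```python
import math

import numpy as np


def f_sketch(ws):
    """(2, C*(C+1)/2) packed upper triangle -> (2C, 2C) real block matrix."""
    n = ws.shape[-1]
    c = (math.isqrt(1 + 8 * n) - 1) // 2      # recover C from the packed length
    rows, cols = np.triu_indices(c)           # row-major upper-triangle positions
    upper = ws[0] + 1j * ws[1]
    full = np.zeros((c, c), dtype=complex)
    full[cols, rows] = upper.conj()           # lower triangle by Hermitian symmetry
    full[rows, cols] = upper                  # upper triangle and diagonal
    # Assumed real embedding of a complex matrix.
    return np.block([[full.real, -full.imag], [full.imag, full.real]])


a = np.array([[52, 34, 16, 125, 41, 62], [0, 37, 28, 0, 3, 0]], dtype=float)
a_block = f_sketch(a)
```

With this layout a Hermitian input yields a real symmetric block matrix whose eigenvalues are those of the original matrix, each appearing twice, so real symmetric solvers can be applied to it.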

speechbrain.processing.decomposition.finv(wsh)[source]

Inverse transform 1.

This method takes a block matrix representing a complex Hermitian matrix and converts it to a complex matrix represented by its upper triangular part. The result will have the following format: (*,2,C+P)

Parameters

wsh (tensor) – An input matrix. The tensor must have the following format: (*,2C,2C)
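The reverse direction can be sketched in the same spirit: read the real and imaginary parts out of the block matrix and keep only the upper triangle. As before, the quadrant layout [[Re, -Im], [Im, Re]] and the row-major packing order are assumptions, finv_sketch is a hypothetical name, and batch dimensions are omitted.

```python
import numpy as np


def finv_sketch(wsh):
    """(2C, 2C) real block matrix -> (2, C*(C+1)/2) packed upper triangle."""
    c = wsh.shape[-1] // 2
    re = wsh[:c, :c]                  # top-left quadrant holds the real part
    im = wsh[c:, :c]                  # bottom-left quadrant holds the imaginary part
    rows, cols = np.triu_indices(c)   # row-major upper-triangle positions
    return np.stack([re[rows, cols], im[rows, cols]])


A = np.array(
    [[52, 34 + 37j, 16 + 28j],
     [34 - 37j, 125, 41 + 3j],
     [16 - 28j, 41 - 3j, 62]]
)
A_block = np.block([[A.real, -A.imag], [A.imag, A.real]])

packed = finv_sketch(A_block)
```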

speechbrain.processing.decomposition.g(ws)[source]

Transform 2.

This method takes a full complex matrix and converts it to a block matrix. The result will have the following format: (*,2C,2C).

Parameters

ws (tensor) – An input matrix. The tensor must have the following format: (*,C,C,2)

speechbrain.processing.decomposition.ginv(wsh)[source]

Inverse transform 2.

This method takes a complex Hermitian matrix represented by a block matrix and converts it to a full complex matrix. The result will have the following format: (*,C,C,2)

Parameters

wsh (tensor) – An input matrix. The tensor must have the following format: (*,2C,2C)
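Transform 2 and its inverse can be illustrated with a pair of hypothetical NumPy helpers, g_sketch and ginv_sketch. The sketch assumes that the last axis of the (C,C,2) format holds the real and imaginary parts and that the block matrix uses the quadrant layout [[Re, -Im], [Im, Re]]; neither detail is confirmed by the text above. It also shows why such an embedding is convenient: complex matrix products become real matrix products.

```python
import numpy as np


def g_sketch(ws):
    """(C, C, 2) real/imaginary stack -> (2C, 2C) real block matrix (assumed layout)."""
    re, im = ws[..., 0], ws[..., 1]
    return np.block([[re, -im], [im, re]])


def ginv_sketch(wsh):
    """(2C, 2C) real block matrix -> (C, C, 2) real/imaginary stack."""
    c = wsh.shape[-1] // 2
    return np.stack([wsh[:c, :c], wsh[c:, :c]], axis=-1)


rng = np.random.default_rng(0)
x = rng.standard_normal((3, 3, 2))
y = rng.standard_normal((3, 3, 2))

# The two transforms undo each other.
round_trip_ok = np.allclose(ginv_sketch(g_sketch(x)), x)

# The embedding preserves products: g(x) @ g(y) equals g of the complex product.
xc = x[..., 0] + 1j * x[..., 1]
yc = y[..., 0] + 1j * y[..., 1]
prod = xc @ yc
prod_stacked = np.stack([prod.real, prod.imag], axis=-1)
product_ok = np.allclose(g_sketch(x) @ g_sketch(y), g_sketch(prod_stacked))
```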

speechbrain.processing.decomposition.pos_def(ws, alpha=0.001, eps=1e-20)[source]

Diagonal modification.

This method takes a complex Hermitian matrix represented by its upper triangular part and adds the value of its trace multiplied by alpha to the real part of its diagonal. The output will have the format: (*,2,C+P)

Parameters
  • ws (tensor) – An input matrix. The tensor must have the following format: (*,2,C+P)

  • alpha (float) – A coefficient to multiply the trace. The default value is 0.001.

  • eps (float) – A small value to increase the real part of the diagonal. The default value is 1e-20.
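The diagonal modification can be sketched in pure Python on the packed format. The sketch assumes that eps is simply added together with alpha times the trace, and that the upper triangle is packed in row-major order; pos_def_sketch is a hypothetical helper and batch dimensions are omitted.

```python
import math


def pos_def_sketch(packed, alpha=0.001, eps=1e-20):
    """Add alpha * trace + eps to the real diagonal of a packed Hermitian matrix."""
    re, im = packed
    n = len(re)
    c = (math.isqrt(1 + 8 * n) - 1) // 2                     # matrix size C
    # Position of diagonal entry (i, i) in the row-major packed upper triangle.
    diag_idx = [i * c - i * (i - 1) // 2 for i in range(c)]
    trace = sum(re[k] for k in diag_idx)
    new_re = list(re)
    for k in diag_idx:
        new_re[k] += alpha * trace + eps                     # assumed combination
    return [new_re, list(im)]                                # imaginary parts unchanged


packed = [[52.0, 34.0, 16.0, 125.0, 41.0, 62.0], [0.0, 37.0, 28.0, 0.0, 3.0, 0.0]]
loaded = pos_def_sketch(packed)
```

Here the trace is 239, so with the default alpha each diagonal entry grows by about 0.239 while the off-diagonal entries stay the same. Shifting the diagonal this way moves every eigenvalue up by the same amount, which helps keep a nearly singular covariance matrix invertible.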

speechbrain.processing.decomposition.inv(x)[source]

Inverse Hermitian Matrix.

This method finds the inverse of a complex Hermitian matrix represented by its upper triangular part. The result will have the following format: (*, C, C, 2).

Parameters

x (tensor) – An input matrix to work with. The tensor must have the following format: (*, 2, C+P)

Example

>>> import torch
>>>
>>> from speechbrain.dataio.dataio import read_audio
>>> from speechbrain.processing.features import STFT
>>> from speechbrain.processing.multi_mic import Covariance
>>> from speechbrain.processing.decomposition import inv
>>>
>>> xs_speech = read_audio(
...    'samples/audio_samples/multi_mic/speech_-0.82918_0.55279_-0.082918.flac'
... )
>>> xs_noise = read_audio('samples/audio_samples/multi_mic/noise_0.70225_-0.70225_0.11704.flac')
>>> xs = xs_speech + 0.05 * xs_noise
>>> xs = xs.unsqueeze(0).float()
>>>
>>> stft = STFT(sample_rate=16000)
>>> cov = Covariance()
>>>
>>> Xs = stft(xs)
>>> XXs = cov(Xs)
>>> XXs_inv = inv(XXs)
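
For a check that does not need audio files, the properties expected of such an inverse can be sketched with NumPy on a small Hermitian matrix. This works on a full complex array and only mimics the (C,C,2) output layout by stacking real and imaginary parts on the last axis, which is an assumption about that format.

```python
import numpy as np

A = np.array(
    [[52, 34 + 37j, 16 + 28j],
     [34 - 37j, 125, 41 + 3j],
     [16 - 28j, 41 - 3j, 62]]
)

A_inv = np.linalg.inv(A)

# Assumed (C, C, 2) layout: real part in [..., 0], imaginary part in [..., 1].
A_inv_stacked = np.stack([A_inv.real, A_inv.imag], axis=-1)

# The inverse of a Hermitian matrix is itself Hermitian, and A @ A_inv is the identity.
identity_error = np.abs(A @ A_inv - np.eye(3)).max()
hermitian_error = np.abs(A_inv - A_inv.conj().T).max()
```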