fi.helsinki.cs.ohtu.mpeg2.audio.mpa
Class StandardPAModel1

java.lang.Object
  extended by fi.helsinki.cs.ohtu.mpeg2.audio.mpa.StandardPAModel1
All Implemented Interfaces:
PsychoacousticModel

public class StandardPAModel1
extends java.lang.Object
implements PsychoacousticModel

Implements Psychoacoustic Model 1 as described in Annex D of ISO/IEC 11172-3.


Nested Class Summary
protected static class StandardPAModel1.TonalFlag
          used to group signal components based on their tonality
 
Field Summary
protected static double[] ABS_THR_L_II
          Absolute thresholds (dB) for Layer II at the relevant sampling frequency.
protected static double[] ABS_THR_L_II_32
          Table D.1d. - Absolute thresholds (dB) for Layer II at 32 kHz.
protected static double[] ABS_THR_L_II_441
          Table D.1e. - Absolute thresholds (dB) for Layer II at 44.1 kHz.
protected static double[] ABS_THR_L_II_48
          Table D.1f. - Absolute thresholds (dB) for Layer II at 48 kHz.
protected  int bitrate
          current encoding bitrate
protected static double[] CB_BOUND_BARK_L_II
          Critical band boundaries Bark[z] for Layer II at the relevant sampling frequency.
protected static double[] CB_BOUND_BARK_L_II_32
          Table D.2d. - Critical band boundaries Bark[z] for Layer II at 32 kHz.
protected static double[] CB_BOUND_BARK_L_II_441
          Table D.2e. - Critical band boundaries Bark[z] for Layer II at 44.1 kHz.
protected static double[] CB_BOUND_BARK_L_II_48
          Table D.2f. - Critical band boundaries Bark[z] for Layer II at 48 kHz.
protected static double[] CB_BOUND_HZ_L_II
          Critical band boundary frequencies (Hz) for Layer II at the relevant sampling frequency.
protected static double[] CB_BOUND_HZ_L_II_32
          Table D.2d. - Critical band boundary frequencies (Hz) for Layer II at 32 kHz.
protected static double[] CB_BOUND_HZ_L_II_441
          Table D.2e. - Critical band boundary frequencies (Hz) for Layer II at 44.1 kHz.
protected static double[] CB_BOUND_HZ_L_II_48
          Table D.2f. - Critical band boundary frequencies (Hz) for Layer II at 48 kHz.
protected static double[] CR_BR_L_II
          Critical band rates (z) for Layer II at the relevant sampling frequency.
protected static double[] CR_BR_L_II_32
          Table D.1d. - Critical band rates (z) for Layer II at 32 kHz.
protected static double[] CR_BR_L_II_441
          Table D.1e. - Critical band rates (z) for Layer II at 44.1 kHz.
protected static double[] CR_BR_L_II_48
          Table D.1f. - Critical band rates (z) for Layer II at 48 kHz.
protected  int FRAME_LENGTH
          input audio frame length
protected static double[] FREQ_L_II
          Frequencies (Hz) for Layer II at the relevant sampling frequency.
protected static double[] FREQ_L_II_32
          Table D.1d. - Frequencies (Hz) for Layer II at 32 kHz.
protected static double[] FREQ_L_II_441
          Table D.1e. - Frequencies (Hz) for Layer II at 44.1 kHz.
protected static double[] FREQ_L_II_48
          Table D.1f. - Frequencies (Hz) for Layer II at 48 kHz.
protected  int fs
          sampling frequency
protected static int[] INDEX_FBC_L_II
          Index of table F&CB for Layer II at the relevant sampling frequency.
protected static int[] INDEX_FBC_L_II_32
          Table D.2d. - Index of table F&CB for Layer II at 32 kHz.
protected static int[] INDEX_FBC_L_II_441
          Table D.2e. - Index of table F&CB for Layer II at 44.1 kHz.
protected static int[] INDEX_FBC_L_II_48
          Table D.2f. - Index of table F&CB for Layer II at 48 kHz.
protected  int SBLIMIT
          number of subbands
private static double[] SCALEFACTORS
          Scalefactors table taken from table 3-B.1 of ISO 11172-3.
protected  int SPECTRUM_LENGTH
          the number of points in the frequency domain representation
protected  int SUBS_DOMAIN_LENGTH
          number of points in the subsampled domain
protected  int TRANSFORM_LENGTH
          the Fourier transform length
 
Constructor Summary
StandardPAModel1()
           Must exist for testing purposes.
StandardPAModel1(int bitrate, AudioEncoder.SampleRate sampleRate)
          Create a new instance of StandardPAModel1.
 
Method Summary
protected  double[] computeDFT(double[] x)
           Compute the Discrete Fourier Transform of length TRANSFORM_LENGTH for the argument x.
protected  double[] computeFFT(double[] x)
           Compute a Fast Fourier Transform with the minim library component.
protected  double[] computeGMTs(double[][] LTm, double[] LTq)
           Step 7 - Determine the Global Masking Thresholds.
protected  double[][] computeIMTs(java.util.ArrayList<java.lang.Integer> relevantPoints, StandardPAModel1.TonalFlag[] tonalities, double[] powers, int[] indexMapping)
           Step 6 - Calculate the Individual Masking Thresholds.
protected  double[] computeMMTs(double[] LTg, int[] indexMapping)
           Step 8 - Determine the Minimum Masking Thresholds in each sub-band.
protected  double[] computeSMRs(double[] Lsb, double[] LTmin)
           Calculate the Signal-to-Mask Ratio for each sub-band.
 double[] computeSMRs(double[] samples, int[][] scF)
           Compute the Signal-to-Mask Ratios for the sub-bands of the input PCM audio frame.
protected  double[] computeSPLs(double[] X, int[][] scF)
          Step 2 - Determine the Sound Pressure Level per subband.
protected  int[] createMapping()
           Create a mapping table between the subsampled spectrum points (126 at 48 kHz) and the full frequency spectrum (512 points).
protected  java.util.ArrayList<java.lang.Integer> decimateMaskers(int[] map, double[] LTq, StandardPAModel1.TonalFlag[] tonalities, double[] Xm)
           Step 5 - Decimate the maskers to obtain the relevant points of the spectrum for the next steps.
protected  StandardPAModel1.TonalFlag[] findTonalComponents(double[] spectrum, double[] componentPowers)
          Step 4 - Find the tonal (sinusoid-like) and non-tonal (noise) components of the signal.
protected  double[] getAbsoluteThresholds(int bitrate)
           Step 3 - Determine the absolute thresholds (threshold in quiet).
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Field Detail

FREQ_L_II_32

protected static final double[] FREQ_L_II_32
Table D.1d. - Frequencies (Hz) for Layer II at 32 kHz.


CR_BR_L_II_32

protected static final double[] CR_BR_L_II_32
Table D.1d. - Critical band rates (z) for Layer II at 32 kHz.


ABS_THR_L_II_32

protected static final double[] ABS_THR_L_II_32
Table D.1d. - Absolute thresholds (dB) for Layer II at 32 kHz.


INDEX_FBC_L_II_32

protected static final int[] INDEX_FBC_L_II_32
Table D.2d. - Index of table F&CB for Layer II at 32 kHz.


CB_BOUND_HZ_L_II_32

protected static final double[] CB_BOUND_HZ_L_II_32
Table D.2d. - Critical band boundary frequencies (Hz) for Layer II at 32 kHz.


CB_BOUND_BARK_L_II_32

protected static final double[] CB_BOUND_BARK_L_II_32
Table D.2d. - Critical band boundaries Bark[z] for Layer II at 32 kHz.


FREQ_L_II_441

protected static final double[] FREQ_L_II_441
Table D.1e. - Frequencies (Hz) for Layer II at 44.1 kHz.


CR_BR_L_II_441

protected static final double[] CR_BR_L_II_441
Table D.1e. - Critical band rates (z) for Layer II at 44.1 kHz.


ABS_THR_L_II_441

protected static final double[] ABS_THR_L_II_441
Table D.1e. - Absolute thresholds (dB) for Layer II at 44.1 kHz.


INDEX_FBC_L_II_441

protected static final int[] INDEX_FBC_L_II_441
Table D.2e. - Index of table F&CB for Layer II at 44.1 kHz.


CB_BOUND_HZ_L_II_441

protected static final double[] CB_BOUND_HZ_L_II_441
Table D.2e. - Critical band boundary frequencies (Hz) for Layer II at 44.1 kHz.


CB_BOUND_BARK_L_II_441

protected static final double[] CB_BOUND_BARK_L_II_441
Table D.2e. - Critical band boundaries Bark[z] for Layer II at 44.1 kHz.


FREQ_L_II_48

protected static final double[] FREQ_L_II_48
Table D.1f. - Frequencies (Hz) for Layer II at 48 kHz.


CR_BR_L_II_48

protected static final double[] CR_BR_L_II_48
Table D.1f. - Critical band rates (z) for Layer II at 48 kHz.


ABS_THR_L_II_48

protected static final double[] ABS_THR_L_II_48
Table D.1f. - Absolute thresholds (dB) for Layer II at 48 kHz.


INDEX_FBC_L_II_48

protected static final int[] INDEX_FBC_L_II_48
Table D.2f. - Index of table F&CB for Layer II at 48 kHz.


CB_BOUND_HZ_L_II_48

protected static final double[] CB_BOUND_HZ_L_II_48
Table D.2f. - Critical band boundary frequencies (Hz) for Layer II at 48 kHz.


CB_BOUND_BARK_L_II_48

protected static final double[] CB_BOUND_BARK_L_II_48
Table D.2f. - Critical band boundaries Bark[z] for Layer II at 48 kHz.


SCALEFACTORS

private static final double[] SCALEFACTORS
Scalefactors table taken from table 3-B.1 of ISO 11172-3.


FREQ_L_II

protected static double[] FREQ_L_II
Frequencies (Hz) for Layer II at the relevant sampling frequency.


CR_BR_L_II

protected static double[] CR_BR_L_II
Critical band rates (z) for Layer II at the relevant sampling frequency.


ABS_THR_L_II

protected static double[] ABS_THR_L_II
Absolute thresholds (dB) for Layer II at the relevant sampling frequency.


INDEX_FBC_L_II

protected static int[] INDEX_FBC_L_II
Index of table F&CB for Layer II at the relevant sampling frequency.


CB_BOUND_HZ_L_II

protected static double[] CB_BOUND_HZ_L_II
Critical band boundary frequencies (Hz) for Layer II at the relevant sampling frequency.


CB_BOUND_BARK_L_II

protected static double[] CB_BOUND_BARK_L_II
Critical band boundaries Bark[z] for Layer II at the relevant sampling frequency.


SBLIMIT

protected final int SBLIMIT
number of subbands

See Also:
Constant Field Values

FRAME_LENGTH

protected final int FRAME_LENGTH
input audio frame length

See Also:
Constant Field Values

TRANSFORM_LENGTH

protected final int TRANSFORM_LENGTH
the Fourier transform length

See Also:
Constant Field Values

SPECTRUM_LENGTH

protected final int SPECTRUM_LENGTH
the number of points in the frequency domain representation

See Also:
Constant Field Values

SUBS_DOMAIN_LENGTH

protected int SUBS_DOMAIN_LENGTH
number of points in the subsampled domain


bitrate

protected int bitrate
current encoding bitrate


fs

protected int fs
sampling frequency

Constructor Detail

StandardPAModel1

public StandardPAModel1()

Must exist for testing purposes.


StandardPAModel1

public StandardPAModel1(int bitrate,
                        AudioEncoder.SampleRate sampleRate)
Create a new instance of StandardPAModel1.

Parameters:
bitrate - The encoding bitrate.
sampleRate - The sampling frequency.
Method Detail

computeSMRs

public double[] computeSMRs(double[] samples,
                            int[][] scF)

Compute the Signal-to-Mask Ratios for the sub-bands of the input PCM audio frame. The variable names are taken directly from Annex D but are listed below for convenience.

     X               the frequency spectrum (Fourier transformed input frame)
     Lsb(n)          the sound pressure level in subband n
     LTq(k)          the absolute threshold (threshold in quiet) for spectral line k
     Xtm, Xnm        tonal component, non-tonal component (masker)
     LTtm, LTnm      individual masking threshold (t = tonal, n = non-tonal)
     z               critical band rate
     avtm, avnm      masking index
     LTg             global masking threshold
     LTmin           minimum masking threshold
     SMRsb           signal-to-mask ratio                
 

A Bark is the width of a critical band.

Specified by:
computeSMRs in interface PsychoacousticModel
Parameters:
samples - The input PCM frame whose samples are doubles in ]-1.0, 1.0[.
scF - The scalefactors (int[SBLIMIT][3]) for each subband.
Returns:
A double[SBLIMIT] containing the signal-to-mask ratios for each subband.

computeFFT

protected double[] computeFFT(double[] x)

Compute a Fast Fourier Transform with the minim library component.

Parameters:
x - The input sequence.

computeDFT

protected double[] computeDFT(double[] x)

Compute the Discrete Fourier Transform of length TRANSFORM_LENGTH for the argument x.

See http://www.dspguide.com/ch8.htm
and
http://en.wikipedia.org/wiki/Eulers_formula .

The complex transform is broken down into two real transforms, one for the real component and the other for the imaginary component. By Euler's formula, the real part corresponds to a cosine function and the imaginary part to a sine function. Together the two components determine the signal's amplitude and phase at each frequency: the magnitude sqrt(re^2 + im^2) gives the amplitude and the angle atan2(im, re) gives the phase.

The notation follows the convention of the ISO 11172-3 document with some minor differences and additions as follows:

 
     x (lower-case)   -   the original signal (the sequence to be transformed)
     w                -   the original signal windowed by a Hann window
     X (upper-case)   -   the transformed sequence
     N                -   the input sequence and transform length
     k                -   indexing variable
     n                -   indexing variable
     re               -   the real component of the complex transform
     im               -   the imaginary component of the complex transform
 

NOTE: What is the squaring of the absolute value in the formula for?

Parameters:
x - The sequence of N samples to be transformed.
Returns:
The sequence of the N / 2 transformed samples.
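
The description above can be illustrated with a minimal sketch of a Hann-windowed DFT. The class and method names are hypothetical and are not part of StandardPAModel1; the window is a plain Hann window without the extra normalisation factor the standard applies, and the result is returned as a power spectrum in dB. Squaring the magnitude (re^2 + im^2) yields the power of each spectral line, which is the quantity the 10 * log10 conversion to dB operates on.

```java
/** Hypothetical helper for illustration; not part of StandardPAModel1. */
final class HannDftSketch {

    /**
     * Returns the power spectrum in dB for the first N / 2 spectral lines
     * of x, using a direct O(N^2) DFT of the Hann-windowed input.
     */
    static double[] powerSpectrumDb(double[] x) {
        int n = x.length;
        double[] w = new double[n];
        for (int i = 0; i < n; i++) {
            // Plain Hann window (assumption: no additional normalisation factor).
            w[i] = x[i] * 0.5 * (1.0 - Math.cos(2.0 * Math.PI * i / n));
        }
        double[] spectrum = new double[n / 2];
        for (int k = 0; k < n / 2; k++) {
            double re = 0.0;
            double im = 0.0;
            for (int i = 0; i < n; i++) {
                double angle = 2.0 * Math.PI * k * i / n;
                re += w[i] * Math.cos(angle); // correlation with a cosine
                im -= w[i] * Math.sin(angle); // correlation with a sine
            }
            re /= n;
            im /= n;
            double power = re * re + im * im; // squared magnitude |X(k)|^2
            // The small floor avoids taking the logarithm of zero.
            spectrum[k] = 10.0 * Math.log10(Math.max(power, 1e-20));
        }
        return spectrum;
    }
}
```

Feeding the sketch a cosine that completes exactly 8 periods over 64 samples produces a peak at spectral line 8, with the Hann window spreading some energy into lines 7 and 9.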

computeSPLs

protected double[] computeSPLs(double[] X,
                               int[][] scF)
Step 2 - Determine the Sound Pressure Level per subband.

Parameters:
X - The frequency spectrum of the audio frame.
scF - The scalefactors as an array[SBLIMIT][3] containing the scalefactors of the subband data.
Returns:
A double[SBLIMIT] containing the SPLs per subband.
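
A minimal sketch of this step, assuming the Annex D rule that the level of a subband is the larger of its strongest spectral line and 20 * log10(scfmax * 32768) - 10 dB. The class name is hypothetical, and the sketch takes the largest scalefactor value per subband directly instead of the int[][] scalefactor indices used by the method above.

```java
/** Hypothetical helper for illustration; not part of StandardPAModel1. */
final class SplSketch {

    /**
     * @param spectrumDb power spectrum in dB, evenly divided among the subbands
     * @param scfMax     largest scalefactor value of each subband (assumed input)
     * @return the sound pressure level Lsb(n) in dB for each subband n
     */
    static double[] soundPressureLevels(double[] spectrumDb, double[] scfMax) {
        int subbands = scfMax.length;
        int linesPerSubband = spectrumDb.length / subbands;
        double[] lsb = new double[subbands];
        for (int n = 0; n < subbands; n++) {
            // Level implied by the scalefactor of the subband.
            double level = 20.0 * Math.log10(scfMax[n] * 32768.0) - 10.0;
            // Compare against every spectral line that falls into subband n.
            for (int k = n * linesPerSubband; k < (n + 1) * linesPerSubband; k++) {
                level = Math.max(level, spectrumDb[k]);
            }
            lsb[n] = level;
        }
        return lsb;
    }
}
```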

getAbsoluteThresholds

protected double[] getAbsoluteThresholds(int bitrate)

Step 3 - Determine the absolute thresholds (threshold in quiet).

Parameters:
bitrate - The bitrate.
Returns:
The appropriate table of absolute thresholds.
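
How the bitrate influences the returned table is not stated here. One plausible reading, offered only as an assumption, is the Annex D convention of lowering the absolute threshold by 12 dB at bitrates of 96 kbit/s per channel and above. The sketch below shows that convention with a hypothetical helper; it is not the documented behaviour of getAbsoluteThresholds.

```java
/** Hypothetical helper for illustration; not part of StandardPAModel1. */
final class AbsoluteThresholdSketch {

    /**
     * Returns a copy of the threshold table with a bitrate-dependent offset.
     * Assumption: -12 dB at 96 kbit/s per channel and above, 0 dB below.
     */
    static double[] withBitrateOffset(double[] tableDb, int bitrateKbpsPerChannel) {
        double offset = bitrateKbpsPerChannel >= 96 ? -12.0 : 0.0;
        double[] out = new double[tableDb.length];
        for (int i = 0; i < tableDb.length; i++) {
            out[i] = tableDb[i] + offset;
        }
        return out;
    }
}
```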

findTonalComponents

protected StandardPAModel1.TonalFlag[] findTonalComponents(double[] spectrum,
                                                           double[] componentPowers)
Step 4 - Find the tonal (sinusoid-like) and non-tonal (noise) components of the signal.

Parameters:
spectrum - The frequency spectrum of the audio frame.
componentPowers - A list of powers of the tonal and non-tonal components.
Returns:
An array[SPECTRUM_LENGTH] of flags denoting the tonality of the critical spectral lines.
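
A simplified sketch of the tonal search: a spectral line is flagged as tonal when it is a local maximum and exceeds its neighbours at distance 2 by at least 7 dB. The standard widens the examined neighbourhood at higher frequencies and also derives non-tonal components per critical band; both are omitted here. The enum and its values are hypothetical stand-ins for StandardPAModel1.TonalFlag, whose actual constants are not listed in this document.

```java
import java.util.Arrays;

/** Hypothetical helper for illustration; not part of StandardPAModel1. */
final class TonalSketch {

    /** Stand-in for StandardPAModel1.TonalFlag (actual values unknown). */
    enum Flag { TONAL, NOT_EXAMINED }

    static Flag[] findTonal(double[] spectrumDb) {
        Flag[] flags = new Flag[spectrumDb.length];
        Arrays.fill(flags, Flag.NOT_EXAMINED);
        for (int k = 2; k < spectrumDb.length - 2; k++) {
            // Local maximum relative to the direct neighbours.
            boolean localMax = spectrumDb[k] > spectrumDb[k - 1]
                    && spectrumDb[k] >= spectrumDb[k + 1];
            // At least 7 dB above the lines two positions away on either side.
            boolean prominent = spectrumDb[k] - spectrumDb[k - 2] >= 7.0
                    && spectrumDb[k] - spectrumDb[k + 2] >= 7.0;
            if (localMax && prominent) {
                flags[k] = Flag.TONAL;
            }
        }
        return flags;
    }
}
```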

decimateMaskers

protected java.util.ArrayList<java.lang.Integer> decimateMaskers(int[] map,
                                                                 double[] LTq,
                                                                 StandardPAModel1.TonalFlag[] tonalities,
                                                                 double[] Xm)

Step 5 - Decimate the maskers to obtain the relevant points of the spectrum for the next steps.

Parameters:
map - A mapping between the complete and the subsampled spectrum.
LTq - The thresholds in quiet (absolute threshold).
tonalities - A list of flags denoting the tonality of the spectral lines.
Xm - The sound pressure levels of the masking components.
Returns:
A list containing the decimated indices of the spectral lines.
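
The decimation can be sketched as two passes: first drop every masker whose level lies below the threshold in quiet, then, among maskers closer than 0.5 Bark to each other, keep only the stronger one. The names and the array layout (all arrays indexed by spectral line) are assumptions for illustration, and the sketch applies the 0.5 Bark rule to every masker, whereas the standard restricts it to tonal components.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper for illustration; not part of StandardPAModel1. */
final class DecimationSketch {

    /**
     * @param maskers indices of candidate maskers in ascending order
     * @param levelDb level of each spectral line in dB
     * @param ltq     threshold in quiet for each spectral line in dB
     * @param bark    critical band rate z of each spectral line
     */
    static List<Integer> decimate(List<Integer> maskers, double[] levelDb,
                                  double[] ltq, double[] bark) {
        // Pass 1: remove maskers below the threshold in quiet.
        List<Integer> audible = new ArrayList<>();
        for (int k : maskers) {
            if (levelDb[k] >= ltq[k]) {
                audible.add(k);
            }
        }
        // Pass 2: within 0.5 Bark keep only the stronger masker.
        List<Integer> kept = new ArrayList<>();
        for (int k : audible) {
            if (!kept.isEmpty()) {
                int prev = kept.get(kept.size() - 1);
                if (bark[k] - bark[prev] < 0.5) {
                    if (levelDb[k] > levelDb[prev]) {
                        kept.set(kept.size() - 1, k);
                    }
                    continue;
                }
            }
            kept.add(k);
        }
        return kept;
    }
}
```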

computeIMTs

protected double[][] computeIMTs(java.util.ArrayList<java.lang.Integer> relevantPoints,
                                 StandardPAModel1.TonalFlag[] tonalities,
                                 double[] powers,
                                 int[] indexMapping)

Step 6 - Calculate the Individual Masking Thresholds.

Parameters:
relevantPoints - The spectrum points in (0 ... N/2) that will be considered in the calculation.
tonalities - A list of flags denoting the tonality of the masking components.
powers - Powers of the masking components.
indexMapping - A mapping between the complete and the subsampled spectrum.
Returns:
The Individual Masking Thresholds.
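
A sketch of the per-masker calculation, assuming the Annex D formulas: the individual threshold is the masker level plus a masking index (avtm or avnm) plus a masking function vf of the Bark distance dz between masked point and masker. The class and method names are hypothetical, and the constants are quoted from the standard as recalled, so they should be checked against Annex D before reuse.

```java
/** Hypothetical helper for illustration; not part of StandardPAModel1. */
final class MaskingThresholdSketch {

    /** Masking index avtm for a tonal masker at critical band rate z. */
    static double tonalMaskingIndex(double z) {
        return -1.525 - 0.275 * z - 4.5;
    }

    /** Masking index avnm for a non-tonal masker at critical band rate z. */
    static double nonTonalMaskingIndex(double z) {
        return -1.525 - 0.175 * z - 0.5;
    }

    /** Masking function vf for Bark distance dz and masker level x in dB. */
    static double maskingFunction(double dz, double x) {
        if (dz >= -3 && dz < -1) return 17 * (dz + 1) - (0.4 * x + 6);
        if (dz >= -1 && dz < 0) return (0.4 * x + 6) * dz;
        if (dz >= 0 && dz < 1) return -17 * dz;
        if (dz >= 1 && dz < 8) return -(dz - 1) * (17 - 0.15 * x) - 17;
        // Outside [-3, 8) Bark the masker is not considered.
        return Double.NEGATIVE_INFINITY;
    }

    /** Individual masking threshold of one masker at one masked point. */
    static double individualThreshold(double maskerDb, double zMasker,
                                      double zMasked, boolean tonal) {
        double av = tonal ? tonalMaskingIndex(zMasker) : nonTonalMaskingIndex(zMasker);
        return maskerDb + av + maskingFunction(zMasked - zMasker, maskerDb);
    }
}
```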

computeGMTs

protected double[] computeGMTs(double[][] LTm,
                               double[] LTq)

Step 7 - Determine the Global Masking Thresholds.

Parameters:
LTm - The individual masking thresholds.
LTq - The thresholds in quiet.
Returns:
The global masking threshold LTg(i) for each index i of the subsampled spectrum.
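
The global threshold at each point is the power sum of the threshold in quiet and all individual masking thresholds at that point, converted back to dB. The sketch below assumes that LTm is laid out as LTm[masker][point]; that layout and the class name are assumptions, not taken from the implementation.

```java
/** Hypothetical helper for illustration; not part of StandardPAModel1. */
final class GlobalThresholdSketch {

    /**
     * @param ltm individual masking thresholds in dB, assumed as ltm[masker][point]
     * @param ltq threshold in quiet in dB for each point
     * @return the global masking threshold LTg in dB for each point
     */
    static double[] globalThresholds(double[][] ltm, double[] ltq) {
        double[] ltg = new double[ltq.length];
        for (int i = 0; i < ltq.length; i++) {
            // Sum in the power domain, starting with the threshold in quiet.
            double sum = Math.pow(10.0, ltq[i] / 10.0);
            for (double[] masker : ltm) {
                // A threshold of negative infinity contributes zero power.
                sum += Math.pow(10.0, masker[i] / 10.0);
            }
            ltg[i] = 10.0 * Math.log10(sum);
        }
        return ltg;
    }
}
```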

createMapping

protected int[] createMapping()

Create a mapping table between the subsampled spectrum points (126 at 48 kHz) and the full frequency spectrum (512 points).

Returns:
The point mapping between frequencies and spectrum points, that is: for each k in [0, N/2 - 1] the i in [0, 125] that most closely corresponds to the frequency at spectral line k.
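
A nearest-frequency mapping of this kind can be sketched as follows. The class name and parameters are hypothetical: tableHz stands in for the frequency table of the subsampled spectrum (such as FREQ_L_II), and spectral line k is assumed to lie at k * fs / N, where N is twice the spectrum length.

```java
/** Hypothetical helper for illustration; not part of StandardPAModel1. */
final class MappingSketch {

    /**
     * @param tableHz        frequencies of the subsampled spectrum points in Hz
     * @param spectrumLength number of spectral lines (N / 2)
     * @param fs             sampling frequency in Hz
     * @return for each spectral line k the index i of the closest table frequency
     */
    static int[] createMapping(double[] tableHz, int spectrumLength, double fs) {
        int[] map = new int[spectrumLength];
        double lineSpacing = fs / (2.0 * spectrumLength); // equals fs / N
        for (int k = 0; k < spectrumLength; k++) {
            double f = k * lineSpacing;
            int best = 0;
            for (int i = 1; i < tableHz.length; i++) {
                // On a tie the lower index is kept.
                if (Math.abs(tableHz[i] - f) < Math.abs(tableHz[best] - f)) {
                    best = i;
                }
            }
            map[k] = best;
        }
        return map;
    }
}
```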

computeMMTs

protected double[] computeMMTs(double[] LTg,
                               int[] indexMapping)

Step 8 - Determine the Minimum Masking Thresholds in each sub-band.

Parameters:
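LTg - The global masking thresholds.
indexMapping - A mapping between the complete and the subsampled spectrum.
Returns:
A double[SBLIMIT] containing the minimum masking threshold for each subband.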
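
A sketch of the minimum search, assuming that the spectral lines are divided evenly among the subbands and that indexMapping maps each spectral line to a point of the subsampled spectrum. The class name and the explicit subband count parameter are hypothetical.

```java
/** Hypothetical helper for illustration; not part of StandardPAModel1. */
final class MinimumThresholdSketch {

    /**
     * @param ltg          global masking thresholds in dB (subsampled spectrum)
     * @param indexMapping for each spectral line the index into ltg
     * @param subbands     number of subbands
     * @return the minimum masking threshold LTmin in dB for each subband
     */
    static double[] minimumThresholds(double[] ltg, int[] indexMapping, int subbands) {
        int linesPerSubband = indexMapping.length / subbands;
        double[] ltMin = new double[subbands];
        for (int n = 0; n < subbands; n++) {
            double min = Double.POSITIVE_INFINITY;
            // Take the lowest global threshold among the lines of subband n.
            for (int k = n * linesPerSubband; k < (n + 1) * linesPerSubband; k++) {
                min = Math.min(min, ltg[indexMapping[k]]);
            }
            ltMin[n] = min;
        }
        return ltMin;
    }
}
```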

computeSMRs

protected double[] computeSMRs(double[] Lsb,
                               double[] LTmin)

Calculate the Signal-to-Mask Ratio for each sub-band.

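Parameters:
Lsb - The sound pressure level in each subband.
LTmin - The minimum masking threshold in each subband.
Returns:
A double[] containing the signal-to-mask ratio for each subband.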
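
With both inputs in dB, the signal-to-mask ratio of a subband is the difference between its sound pressure level and its minimum masking threshold. A minimal sketch with a hypothetical class name:

```java
/** Hypothetical helper for illustration; not part of StandardPAModel1. */
final class SmrSketch {

    /** Returns SMRsb(n) = Lsb(n) - LTmin(n) in dB for each subband n. */
    static double[] signalToMaskRatios(double[] lsb, double[] ltMin) {
        double[] smr = new double[lsb.length];
        for (int n = 0; n < lsb.length; n++) {
            smr[n] = lsb[n] - ltMin[n];
        }
        return smr;
    }
}
```

A positive ratio means the signal in that subband rises above the masking threshold, so quantisation noise there would be audible unless enough bits are allocated.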