Simultaneous Masking in the Frequency Domain

Vicente González Ruiz & Savins Puertas Martín & Marcos Lupión Lorente

March 23, 2025

1 Frequency Masking

The HAS (Human Auditory System) has a finite frequency resolution. As a consequence, a weaker audio signal (the maskee) becomes inaudible in the presence of (is masked by) a louder audio signal (the masker) when the two are close enough in the frequency domain [1] (and also in time, i.e., in the same chunk). When this happens, the subband [2] that contains the maskee can be quantized more coarsely, because the resulting quantization noise in that subband is not perceived (see Figure 1).

Figure 1: An example of simultaneous masking generated by a 1 kHz tone. In the vicinity of the tone, the ToH (threshold of hearing) is raised.

2 A dynamic computation of the Quantization Step Sizes

  1. Given a decomposition of a chunk into subbands, $W=\{w_s\}$, determine the energy $\{E(w_s)\}$ of each subband. For this, it is a good idea to use the same bandwidth for all the subbands, so that their energies are directly comparable.
  2. Find the subband with the highest energy:

    (1) $w_m = \arg\max_{w_i \in W} E(w_i) := \{w \in W : E(w_i) \le E(w) \text{ for all } w_i \in W\}$.
  3. Let $\Delta_m$ be the current quantization step size (QSS) of the subband $w_m$. Compute the set of optimal¹ QSSs as

    (2) $\Delta := \{\dots, 3\Delta_{m-2}, 2\Delta_{m-1}, \Delta_m, 2\Delta_{m+1}, 3\Delta_{m+2}, \dots\}$.
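
As a rough illustration, the three steps above could be sketched in Python as follows. This is only one possible reading of Eq. (2), not a reference solution. It assumes that the energy of a subband is the sum of its squared coefficients, that every subband $w_s$ has its own current QSS $\Delta_s$, and that the new QSS of $w_s$ is $(|s-m|+1)\Delta_s$. All function names are hypothetical and are not part of InterCom.

```python
# Hypothetical helper names; none of them belong to InterCom.

def subband_energies(subbands):
    """Step 1: energy of each subband, assumed to be the sum of squares."""
    return [sum(c * c for c in w) for w in subbands]


def masker_index(energies):
    """Step 2: index m of the highest-energy subband (first one on ties)."""
    return max(range(len(energies)), key=energies.__getitem__)


def masked_step_sizes(subbands, current_qss):
    """Step 3: scale the QSS of subband s by |s - m| + 1 (assumed reading of Eq. (2))."""
    if len(subbands) != len(current_qss):
        raise ValueError("one QSS is required per subband")
    m = masker_index(subband_energies(subbands))
    return [(abs(s - m) + 1) * delta for s, delta in enumerate(current_qss)]


# Five equal-width subbands; the third one (m = 2) carries most of the energy.
subbands = [[0.1, -0.2], [0.5, 0.4], [3.0, -2.5], [0.3, 0.1], [0.0, 0.05]]
print(masked_step_sizes(subbands, [8] * 5))  # [24, 16, 8, 16, 24]
```

With a common starting QSS of 8, the masker subband keeps its step size and the multiplier grows by one for each subband of distance from it, as listed in Eq. (2).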

3 Deliverables

Implement the algorithm described in Section 2 in a module named simultaneous_masking.py, for use in InterCom.

Mark: 10 points.

4 Resources

[1]   M. Bosi and R. E. Goldberg. Introduction to Digital Audio Coding and Standards. Kluwer Academic Publishers, 2003.

[2]   M. Vetterli and J. Kovačević. Wavelets and Subband Coding. Prentice Hall, 1995.

¹ From a perceptual perspective.