The HAS (Human Auditory System) has a finite frequency resolution: a
weaker audio signal (the maskee) becomes inaudible in the presence of (is
masked by) a louder one (the masker) when the two are close enough[1]
in the frequency domain (and, obviously, in time). When this happens, the
subband [2] that contains the maskee can be quantized more coarsely
without the listener perceiving the extra quantization noise in that subband
(see Figure 1).
2 A dynamic computation of the Quantization Step Sizes
Given a decomposition of a chunk \({\mathbf W}=\{{\mathbf w}_s\}\), determine the energy \(\{E({\mathbf w}_s)\}\) of each subband.
For the energies to be comparable, it is a good idea to have the same bandwidth in all the subbands.
Find the subband with the highest energy: \begin {equation} {\mathbf w}_m = \underset {{\mathbf w}_i \in {\mathbf W}}{\operatorname {arg\,max}}~E({\mathbf w}_i) := \{{\mathbf w}_m \in {\mathbf W} ~:~ E({\mathbf w}_i) \leq E({\mathbf w}_m) \text { for all } {\mathbf w}_i \in {\mathbf W} \}. \end {equation}
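The two steps above (energy per subband, then the arg max) can be sketched as follows. This is a minimal illustration, not the reference implementation: it assumes that the energy \(E({\mathbf w}_s)\) is the sum of the squared coefficients of the subband, and that the chunk is available as a list of NumPy arrays. The names `subband_energies` and `loudest_subband` are hypothetical.

```python
import numpy as np

def subband_energies(subbands):
    """Energy of each subband, assumed here to be the sum of squared coefficients."""
    return np.array([np.sum(np.asarray(w, dtype=np.float64) ** 2) for w in subbands])

def loudest_subband(subbands):
    """Return the index m of the subband with the highest energy, and all energies."""
    energies = subband_energies(subbands)
    return int(np.argmax(energies)), energies

# Toy chunk with three subbands of equal length (equal bandwidth).
chunk = [
    np.array([1.0, -1.0, 1.0, -1.0]),
    np.array([4.0, -3.0, 2.0, -1.0]),
    np.array([0.5, 0.0, -0.5, 0.0]),
]
m, energies = loudest_subband(chunk)
print(m, energies)  # the second subband (index 1) has the highest energy
```

If several subbands share the maximum energy, `np.argmax` returns the first one; whether that tie-break is appropriate is left open here.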
Let \({\mathbf \Delta }_m\) be the current QSS of the subband \({\mathbf w}_m\). Compute the set of
optimal\(^1\)
QSSs, in which the multiplier applied to each QSS grows with the distance of its subband to \({\mathbf w}_m\), as \begin {equation} {\mathbf \Delta }^* := \{\cdots ,3{\mathbf \Delta }_{m-2},2{\mathbf \Delta }_{m-1},{\mathbf \Delta }_m,2{\mathbf \Delta }_{m+1},3{\mathbf \Delta }_{m+2}, \cdots \}. \end {equation}
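The set above can be computed with a few lines of NumPy. The sketch below assumes that the centre term of the set corresponds to the most energetic subband, that the current QSSs are stored in an array indexed by subband, and that the multiplier is one plus the distance (in subbands) to that centre, which is how the pattern \(\cdots, 3, 2, 1, 2, 3, \cdots\) is read here. The function name `masked_step_sizes` is hypothetical.

```python
import numpy as np

def masked_step_sizes(current_qss, m):
    """Scale each QSS by (1 + distance to subband m), following the pattern 3, 2, 1, 2, 3."""
    current_qss = np.asarray(current_qss, dtype=np.float64)
    distance = np.abs(np.arange(len(current_qss)) - m)
    return (1 + distance) * current_qss

# Five subbands that currently share the same QSS; subband 2 is the masker.
new_qss = masked_step_sizes([32, 32, 32, 32, 32], m=2)
print(new_qss)  # [96. 64. 32. 64. 96.]
```

The masker keeps its step size, while its neighbours are quantized more coarsely the farther they are from it, as the formula states. Whether a linear growth of the multiplier is perceptually adequate would have to be checked by listening tests.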
3 Deliverables
Implement the algorithm described in Section 2 in a module named
simultaneous_masking.py. You should extend the classes defined in
advanced_ToH.py.