HO-SIRR

Description for the HO-SIRR application.

Application description

Higher-order Spatial Impulse Response Rendering (HO-SIRR) is a rendering method, which can synthesise output loudspeaker array room impulse responses (RIRs) using input spherical harmonic (Ambisonic/B-Format) RIRs of arbitrary order [1,2]. The method makes assumptions regarding the composition of the sound-field and extracts spatial parameters over time, which allows it to map the input to the output in an adaptive and more informed manner; when compared to purely linear methods such as Ambisonics.

The idea is that you then convolve a monophonic source with this loudspeaker array RIR, and it will be reproduced and exhibit the spatial characteristics of the captured space. Note that the HO-SIRR algorithm is an extention of the original first-order SIRR formulation, first proposed back in 2005 [3,4], by employing the higher-order analysis principles described in [5], which permits higher spatial accuracy during the mapping provided that higher-order components are available.

Note that this HO-SIRR application is essentially a direct port of the HO-SIRR MATLAB toolbox, which is slighly more configurable than the C/C++ implementation (and easier to augment) and can be found here.


The suggested workflow is:

  • Measure a room impulse response (RIR) of a space with a spherical microphone array (e.g. using HAART), and convert it into an Ambisonic/B-format RIR (e.g. using sparta_array2sh).
  • Load this B-Format/Ambisonic RIR into the HOSIRR App/plug-in and specify your loudspeaker array directions and desired rendering configuration (although, the default should suffice for most purposes).
  • Click “Render”, and then “Save”, to export the resulting loudspeaker array RIR as a multi-channel .wav file.
  • Then simply convolve this loudspeaker array RIR with a monophonic source signal, and it will be reproduced over the loudspeaker array (also exhibiting the spatial characteristics of the captured space). Plug-ins such as Xvolver, X-MCFX, sparta_matrixconv (included in the installer), and mcfx_convolver, are well suited to this convolution task.

Listening test results at a glance

The perceptual performance of HO-SIRR was evaluated based on formal listening tests in [1], where it was compared to Mode-Matching Ambisonics decoding. It was found that if the mono signal is quite stationary (such as a trombone recording), then first-order SIRR renderings can sound almost equivalent to 5th order Ambisonics. However, if the mono signal is more transient (such as a kick drum or speech sample), then the benefits of the higher-order SIRR renderings are revealed. For an in-depth description of the listening test and a discussion of the results, see: [1].


About the authors

  • Leo McCormack: a doctoral candidate at Aalto University.
  • Archontis Politis: post doctorate researcher at Tampere University, specialising in spatial sound recording and reproduction, acoustic scene analysis and microphone array processing.
  • Ville Pulkki: Professor at Aalto University, known for VBAP, SIRR, DirAC and eccentric behaviour.

License

This application may be used for academic, personal, and/or commercial use. The source code may also be used for commercial purposes, provided that the terms of the GPLv3 license are fulfilled. This requires that the original code and/or any derived works must also be open-sourced and made available under the same GPLv3 license, if it is to be used for commercial purposes.

References

[1] McCormack, L., Pulkki, V., Politis, A., Scheuregger, O. and Marschall, M. (2020). Higher-Order Spatial Impulse Response Rendering: Investigating the Perceived Effects of Spherical Order, Dedicated Diffuse Rendering, and Frequency Resolution.
Journal of the Audio Engineering Society, 68(5), pp.338-354.

[2] McCormack, L., Politis, A., Scheuregger, O., and Pulkki, V. (2019). Higher-order processing of spatial impulse responses.
Proceedings of the 23rd International Congress on Acoustics, 9–13 September 2019 in Aachen, Germany.

[3] Merimaa, J. and Pulkki, V. (2005). Spatial impulse response rendering I: Analysis and synthesis
Journal of the Audio Engineering Society, 53(12), pp.1115-1127.

[4] Pulkki, V. and Merimaa, J. (2006). Spatial impulse response rendering II: Reproduction of diffuse sound and listening tests
Journal of the Audio Engineering Society, 54(1/2), pp.3-20.

[5] Politis, A. and Pulkki, V. (2016). Acoustic intensity, energy-density and diffuseness estimation in a directionally-constrained region
arXiv preprint arXiv:1609.03409.

Edit this page on GitHub