Second-order decoding equations

From Ambisonia

This article is based on a post to the sursound e-mailing list.

From: Fons Adriaensen <fons kokkinizita.net>
To: sursound@music.vt.edu
Date: Sat, 24 May 2008 01:22:42 +0200
Subject: Re: [Sursound] Near Field Distance Coding filters was [ambisonics paner with distance

The extension to 2nd order you propose seems to be based on the assumption that the encoding gains for a source at (Az, El) are also the decoding gains for a speaker in that direction. It's an easy mistake, and you're not the first to fall into that trap.

The assumption is true (for regular layouts) only if you use the *normalised* form of the spherical harmonics, but not if unequal gains are applied as in the Furse-Malham set. By coincidence it happens to work for first order 2D because in that case the three F-M components W,X,Y are all equal to sqrt(1/2) times the 2D-normalised ones. So the only error is an irrelevant constant gain factor.

I'll try to complete the picture below. Please note that this applies to non-minimal, regular layouts only.

"Non-minimal" means there must be more speakers than channels. "Regular" means the same angle between all pairs of adjacent speakers for 2D, and octahedron, cube, dodecahedron or icosahedron for 3D.

It's near impossible to give simple formulas for the general non-regular case because the procedure involves a matrix inversion. This can be written out of course, but would fill some pages. In practice nobody would use such equations: the matrix is just inverted numerically. The right way to do the matrix inversion is via SVD, which makes it possible to detect an ill-conditioned matrix and also provides the means to take corrective action.
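As a rough illustration of the numerical route, the sketch below inverts a speaker encoding matrix with NumPy's SVD. The post does not spell out which matrix is inverted; here it is assumed to be the matrix whose columns are the N3D encodings of the speaker directions, so that re-encoding the speaker feeds reproduces the input. The function names, the first-order 2D restriction and the tolerance `rel_tol` are illustrative choices, not part of the original post.

```python
import numpy as np

def encode_first_order_2d(azimuths):
    """Rows are W, X, Y (N3D, elevation 0); one column per speaker azimuth in radians."""
    az = np.asarray(azimuths, dtype=float)
    return np.vstack([np.ones_like(az),
                      np.sqrt(3) * np.cos(az),
                      np.sqrt(3) * np.sin(az)])

def invert_via_svd(c, rel_tol=1e-6):
    """Pseudo-inverse of c via SVD, plus the condition number of c."""
    u, s, vh = np.linalg.svd(c, full_matrices=False)
    cond = s[0] / s[-1] if s[-1] > 0 else np.inf
    # One possible corrective action (an assumption): discard singular
    # values that are tiny relative to the largest one.
    keep = s > rel_tol * s[0]
    s_inv = np.where(keep, 1.0 / np.where(keep, s, 1.0), 0.0)
    d = (vh.T * s_inv) @ u.T   # shape: speakers x channels
    return d, cond

# A non-regular five-speaker layout (hypothetical angles, in degrees).
layout = np.radians([0.0, 50.0, 130.0, 200.0, 300.0])
c = encode_first_order_2d(layout)
decoder, cond = invert_via_svd(c)
```

A large `cond` signals an ill-conditioned layout. For a regular layout this inverse reduces to the simple per-channel gains described in steps 1 and 2 below.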


Step 1: Encode the speaker positions

We will use the normalised 3D (N3D) form of the spherical harmonics. Normalised means that when you average the square of any of the nine functions below over the entire sphere (integrate it and divide by the sphere's area, 4*pi), the result is equal to 1.

A = azimuth, E = elevation.

We first convert to direction cosines, as this saves on sin() and cos() calls.

x = cos(E) * cos(A)
y = cos(E) * sin(A)
z = sin(E)

Then

W(A,E) = 1
X(A,E) = sqrt(3) * x
Y(A,E) = sqrt(3) * y
Z(A,E) = sqrt(3) * z
R(A,E) = sqrt(5) * (1.5 * z * z - 0.5)
S(A,E) = sqrt(15) * x * z
T(A,E) = sqrt(15) * y * z
U(A,E) = sqrt(15) * (x * x - y * y) / 2
V(A,E) = sqrt(15) * x * y
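
The formulas above translate almost line for line into code. The following is a minimal sketch; the function name `encode_n3d`, the use of radians and the dictionary return type are assumptions made for illustration.

```python
import math

def encode_n3d(azimuth, elevation):
    """Return the nine N3D gains for a direction given in radians."""
    # Direction cosines, computed once to save on sin() and cos() calls.
    x = math.cos(elevation) * math.cos(azimuth)
    y = math.cos(elevation) * math.sin(azimuth)
    z = math.sin(elevation)
    return {
        "W": 1.0,
        "X": math.sqrt(3) * x,
        "Y": math.sqrt(3) * y,
        "Z": math.sqrt(3) * z,
        "R": math.sqrt(5) * (1.5 * z * z - 0.5),
        "S": math.sqrt(15) * x * z,
        "T": math.sqrt(15) * y * z,
        "U": math.sqrt(15) * (x * x - y * y) / 2,
        "V": math.sqrt(15) * x * y,
    }

front = encode_n3d(0.0, 0.0)        # straight ahead
up = encode_n3d(0.0, math.pi / 2)   # straight up
```

For the frontal direction only W, X, R and U are non-zero; for the zenith only W, Z and R are.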

Of course you only need to calculate the encodings for the channels you want to use. We will consider four cases:

1st order, 2D: W, X, Y
1st order, 3D: W, X, Y, Z
2nd order, 2D: W, X, Y, U, V
2nd order, 3D: all

(Optional) Divide all coefficients by N, the number of speakers. This ensures that the sum of all speaker outputs will be equal to the W input.
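
A sketch of the channel selection and the optional division by N, using a cube as the example layout. The names `CHANNELS` and `step1_coefficients` and the list-of-dictionaries layout are assumptions for illustration only.

```python
import math

# Channels used in each of the four cases, keyed by (order, dimensions).
CHANNELS = {(1, 2): "WXY", (1, 3): "WXYZ", (2, 2): "WXYUV", (2, 3): "WXYZRSTUV"}

def encode_n3d(az, el):
    """N3D gains for a direction in radians, as in step 1."""
    x = math.cos(el) * math.cos(az)
    y = math.cos(el) * math.sin(az)
    z = math.sin(el)
    r3, r5, r15 = math.sqrt(3), math.sqrt(5), math.sqrt(15)
    return {"W": 1.0, "X": r3 * x, "Y": r3 * y, "Z": r3 * z,
            "R": r5 * (1.5 * z * z - 0.5), "S": r15 * x * z, "T": r15 * y * z,
            "U": r15 * (x * x - y * y) / 2, "V": r15 * x * y}

def step1_coefficients(speakers, order, dim):
    """One row of coefficients per speaker, already divided by N."""
    n = len(speakers)
    rows = []
    for az, el in speakers:
        gains = encode_n3d(az, el)
        rows.append({ch: gains[ch] / n for ch in CHANNELS[(order, dim)]})
    return rows

# Cube: four speakers above and four below, at elevation atan(1/sqrt(2)).
cube_el = math.atan(1 / math.sqrt(2))
cube = [(math.radians(az), sign * cube_el)
        for sign in (1, -1) for az in (45, 135, 225, 315)]
rows = step1_coefficients(cube, order=1, dim=3)

# Summing each coefficient over all speakers: W gives 1, the rest give 0,
# so the sum of the speaker outputs equals the W input.
column_sums = {ch: sum(row[ch] for row in rows) for ch in "WXYZ"}
```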

Step 2: Apply gain corrections for 2D

(This step is what remains of the matrix inversion for a regular layout.)

For 3D, the gains calculated in step 1 are the decoder coefficients for max-rV, provided the input is in the N3D format. To convert to F-M, go to step 5.

For 2D, multiply X,Y by 2/3 and U,V by 8/15. The result is the max-rV decoder under the same condition as for the 3D case.
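
The 2D correction can be checked numerically. The sketch below builds a second-order max-rV decoder for a regular hexagon, decodes an N3D-encoded source and computes the velocity vector as the pressure-weighted average of the speaker directions. The helper names and the 20-degree test source are assumptions for illustration.

```python
import math

# Step 2 corrections for horizontal layouts; W is left unchanged.
GAIN_2D = {"W": 1.0, "X": 2 / 3, "Y": 2 / 3, "U": 8 / 15, "V": 8 / 15}

def encode_2d(az):
    """N3D gains of the horizontal channels for azimuth az (elevation 0)."""
    x, y = math.cos(az), math.sin(az)
    return {"W": 1.0, "X": math.sqrt(3) * x, "Y": math.sqrt(3) * y,
            "U": math.sqrt(15) * (x * x - y * y) / 2, "V": math.sqrt(15) * x * y}

def max_rv_decoder_2d(azimuths):
    """Steps 1 and 2: encode each speaker, divide by N, apply the 2D gains."""
    n = len(azimuths)
    return [{ch: GAIN_2D[ch] * g / n for ch, g in encode_2d(az).items()}
            for az in azimuths]

def decode(decoder, signal):
    """Speaker feeds for one set of N3D input values."""
    return [sum(row[ch] * signal[ch] for ch in row) for row in decoder]

hexagon = [math.radians(60 * k) for k in range(6)]   # 6 speakers, 5 channels
decoder = max_rv_decoder_2d(hexagon)

src = math.radians(20)
feeds = decode(decoder, encode_2d(src))

total = sum(feeds)
rv_x = sum(f * math.cos(a) for f, a in zip(feeds, hexagon)) / total
rv_y = sum(f * math.sin(a) for f, a in zip(feeds, hexagon)) / total
```

With these gains the velocity vector has unit length and points at the source, and the feeds sum to the W input. Dropping the 2/3 and 8/15 factors breaks both properties.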

Step 3: Modify gains for max-rE and in-phase

Starting with the gains found in step 2, multiply by the factors given below to find the max-rE or in-phase decoders. W is not modified in this step.

For max-rE:      X Y Z    R S T U V
1st order, 2D    0.707    -
1st order, 3D    0.577    -
2nd order, 2D    0.866    0.500
2nd order, 3D    0.775    0.400

For in-phase:    X Y Z    R S T U V
1st order, 2D    0.500    -
1st order, 3D    0.333    -
2nd order, 2D    0.667    0.167
2nd order, 3D    0.500    0.100

(These numbers are taken from Jerome Daniel's PhD thesis.)
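
One way to apply these factors is a simple lookup keyed by flavour, order and dimension. The sketch below assumes one dictionary of coefficients per speaker; the names `STEP3_GAINS` and `apply_flavour` are hypothetical.

```python
# (first-order factor, second-order factor); None where there is no 2nd order.
STEP3_GAINS = {
    "max-rE": {(1, 2): (0.707, None), (1, 3): (0.577, None),
               (2, 2): (0.866, 0.500), (2, 3): (0.775, 0.400)},
    "in-phase": {(1, 2): (0.500, None), (1, 3): (0.333, None),
                 (2, 2): (0.667, 0.167), (2, 3): (0.500, 0.100)},
}
FIRST_ORDER = "XYZ"
SECOND_ORDER = "RSTUV"

def apply_flavour(row, flavour, order, dim):
    """Scale one speaker's step 2 coefficients; W is passed through unchanged."""
    g1, g2 = STEP3_GAINS[flavour][(order, dim)]
    out = {}
    for ch, coeff in row.items():
        if ch in FIRST_ORDER:
            out[ch] = coeff * g1
        elif ch in SECOND_ORDER:
            if g2 is None:
                raise ValueError("second-order channel in a first-order decoder")
            out[ch] = coeff * g2
        else:
            out[ch] = coeff
    return out

# Example coefficients for one speaker (arbitrary illustrative values).
row = {"W": 0.25, "X": 0.2, "Y": 0.1, "U": 0.05, "V": 0.0}
max_re_row = apply_flavour(row, "max-rE", order=2, dim=2)
in_phase_row = apply_flavour(row, "in-phase", order=2, dim=2)
```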

Step 4: Sum of powers normalisation

At HF we want the sum of powers to add up to unity instead of the sum of pressures. You may want to do this (giving a dual-band decoder) even if you use the same 'flavour' for both LF and HF.

To achieve this apply a gain factor of sqrt(N/v) to all components of the HF part of the decoder, with N = number of speakers, and v = the value given below (again from Daniel's thesis):

type       order   dimension   v
max-rV     1       2D          3.000
max-rV     1       3D          4.000
max-rV     2       2D          5.000
max-rV     2       3D          9.000
max-rE     1       2D          2.000
max-rE     1       3D          2.000
max-rE     2       2D          3.000
max-rE     2       3D          3.600
in-phase   1       2D          1.500
in-phase   1       3D          1.333
in-phase   2       2D          1.944
in-phase   2       3D          1.800
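
The effect of the sqrt(N/v) factor can be seen on a small case. The sketch below takes a first-order max-rV decoder for a square, assumes the optional division by N from step 1 was applied, and compares the sum of squared speaker feeds before and after the HF gain. The names `V_TABLE` and `hf_gain` and the 30-degree test source are assumptions for illustration.

```python
import math

# v values from the table above, keyed by (type, order, dimensions).
V_TABLE = {
    ("max-rV", 1, 2): 3.000, ("max-rV", 1, 3): 4.000,
    ("max-rV", 2, 2): 5.000, ("max-rV", 2, 3): 9.000,
    ("max-rE", 1, 2): 2.000, ("max-rE", 1, 3): 2.000,
    ("max-rE", 2, 2): 3.000, ("max-rE", 2, 3): 3.600,
    ("in-phase", 1, 2): 1.500, ("in-phase", 1, 3): 1.333,
    ("in-phase", 2, 2): 1.944, ("in-phase", 2, 3): 1.800,
}

def hf_gain(flavour, order, dim, n_speakers):
    """Gain applied to all components of the HF part of the decoder."""
    return math.sqrt(n_speakers / V_TABLE[(flavour, order, dim)])

square = [math.radians(45 + 90 * k) for k in range(4)]
n = len(square)
src = math.radians(30)

# N3D-encoded source on the horizontal plane.
w, x, y = 1.0, math.sqrt(3) * math.cos(src), math.sqrt(3) * math.sin(src)

# Steps 1 and 2 for each speaker: divide by N, scale X and Y by 2/3.
feeds = [(w
          + (2 / 3) * math.sqrt(3) * math.cos(a) * x
          + (2 / 3) * math.sqrt(3) * math.sin(a) * y) / n
         for a in square]

lf_power = sum(f * f for f in feeds)            # equals v / N here
g = hf_gain("max-rV", 1, 2, n)
hf_power = sum((g * f) ** 2 for f in feeds)     # normalised to unity
```

Without the gain the sum of powers is v/N = 0.75; with it the sum is 1, independent of the source direction.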

Step 5: Adjust for Furse-Malham gains

If the decoder input uses the F-M set rather than N3D, apply the following gains to the corresponding input signals or coefficients:

W  1.414214f      = sqrt(2)
X  1.732051f      = sqrt(3)
Y  1.732051f
Z  1.732051f
R  2.236068f      = sqrt(5)
S  1.936492f      = sqrt(15)/2
T  1.936492f
U  1.936492f
V  1.936492f
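
As a sketch, the same table can be kept in exact form and applied either to the input signals or to the decoder coefficients for each channel. The names `FUMA_TO_N3D` and `apply_fuma_gains` are hypothetical. The example relies on the F-M convention that a frontal source is encoded with W = sqrt(1/2) and X = 1.

```python
import math

# Per-channel gains that bring F-M weighted values to N3D.
FUMA_TO_N3D = {
    "W": math.sqrt(2),
    "X": math.sqrt(3), "Y": math.sqrt(3), "Z": math.sqrt(3),
    "R": math.sqrt(5),
    "S": math.sqrt(15) / 2, "T": math.sqrt(15) / 2,
    "U": math.sqrt(15) / 2, "V": math.sqrt(15) / 2,
}

def apply_fuma_gains(values):
    """Scale input samples or decoder coefficients, channel by channel."""
    return {ch: FUMA_TO_N3D[ch] * v for ch, v in values.items()}

# First-order F-M encoding of a source straight ahead.
fuma_front = {"W": math.sqrt(0.5), "X": 1.0, "Y": 0.0, "Z": 0.0}
n3d_front = apply_fuma_gains(fuma_front)
```

After the conversion the frontal source has W = 1 and X = sqrt(3), matching the N3D encoding from step 1.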

That's it

That's it. For a dual-band decoder you would go through steps 1 and 2 once, use steps 3 and 4 for the HF part, and finally step 5 on both parts.
