4.2.7.5. Normalized Line Spectral Frequency (LSF) and Linear Predictive Coding (LPC) Coefficients
A set of normalized Line Spectral Frequency (LSF) coefficients follow the quantization gains in the bitstream and represent the Linear Predictive Coding (LPC) coefficients for the current SILK frame.
Once decoded, the normalized LSFs form an increasing list of Q15 values between 0 and 1. These represent the interleaved zeros on the upper half of the unit circle (between 0 and pi, hence "normalized") in the standard decomposition [SPECTRAL-PAIRS] of the LPC filter into a symmetric part and an anti-symmetric part (P and Q in Section 4.2.7.5.6). Because of non-linear effects in the decoding process, an implementation SHOULD match the fixed-point arithmetic described in this section exactly. An encoder SHOULD also use the same process. The normalized LSFs are coded using a two-stage vector quantizer (VQ) (Sections 4.2.7.5.1 and 4.2.7.5.2). NB and MB frames use an order-10 predictor, while WB frames use an order-16 predictor. Thus, each of these two cases uses a different set of tables. After reconstructing the normalized LSFs (Section 4.2.7.5.3), the decoder runs them through a stabilization process (Section 4.2.7.5.4), interpolates them between frames (Section 4.2.7.5.5), converts them back into LPC coefficients (Section 4.2.7.5.6), and then runs them through further processes to limit the range of the coefficients (Section 4.2.7.5.7) and the gain of the filter (Section 4.2.7.5.8). All of this is necessary to ensure the reconstruction process is stable.4.2.7.5.1. Normalized LSF Stage 1 Decoding
The first VQ stage uses a 32-element codebook, coded with one of the PDFs in Table 14, depending on the audio bandwidth and the signal type of the current SILK frame. This yields a single index, I1, for the entire frame, which 1. Indexes an element in a coarse codebook, 2. Selects the PDFs for the second stage of the VQ, and 3. Selects the prediction weights used to remove intra-frame redundancy from the second stage. The actual codebook elements are listed in Tables 23 and 24, but they are not needed until the last stages of reconstructing the LSF coefficients.
+-----------+----------+--------------------------------------------+ | Audio | Signal | PDF | | Bandwidth | Type | | +-----------+----------+--------------------------------------------+ | NB or MB | Inactive | {44, 34, 30, 19, 21, 12, 11, 3, 3, 2, 16, | | | or | 2, 2, 1, 5, 2, 1, 3, 3, 1, 1, 2, 2, 2, 3, | | | unvoiced | 1, 9, 9, 2, 7, 2, 1}/256 | | | | | | NB or MB | Voiced | {1, 10, 1, 8, 3, 8, 8, 14, 13, 14, 1, 14, | | | | 12, 13, 11, 11, 12, 11, 10, 10, 11, 8, 9, | | | | 8, 7, 8, 1, 1, 6, 1, 6, 5}/256 | | | | | | WB | Inactive | {31, 21, 3, 17, 1, 8, 17, 4, 1, 18, 16, 4, | | | or | 2, 3, 1, 10, 1, 3, 16, 11, 16, 2, 2, 3, 2, | | | unvoiced | 11, 1, 4, 9, 8, 7, 3}/256 | | | | | | WB | Voiced | {1, 4, 16, 5, 18, 11, 5, 14, 15, 1, 3, 12, | | | | 13, 14, 14, 6, 14, 12, 2, 6, 1, 12, 12, | | | | 11, 10, 3, 10, 5, 1, 1, 1, 3}/256 | +-----------+----------+--------------------------------------------+ Table 14: PDFs for Normalized LSF Stage-1 Index Decoding4.2.7.5.2. Normalized LSF Stage 2 Decoding
A total of 16 PDFs are available for the LSF residual in the second stage: the 8 (a...h) for NB and MB frames given in Table 15, and the 8 (i...p) for WB frames given in Table 16. Which PDF is used for which coefficient is driven by the index, I1, decoded in the first stage. Table 17 lists the letter of the corresponding PDF for each normalized LSF coefficient for NB and MB, and Table 18 lists the same information for WB.
+----------+--------------------------------------+ | Codebook | PDF | +----------+--------------------------------------+ | a | {1, 1, 1, 15, 224, 11, 1, 1, 1}/256 | | | | | b | {1, 1, 2, 34, 183, 32, 1, 1, 1}/256 | | | | | c | {1, 1, 4, 42, 149, 55, 2, 1, 1}/256 | | | | | d | {1, 1, 8, 52, 123, 61, 8, 1, 1}/256 | | | | | e | {1, 3, 16, 53, 101, 74, 6, 1, 1}/256 | | | | | f | {1, 3, 17, 55, 90, 73, 15, 1, 1}/256 | | | | | g | {1, 7, 24, 53, 74, 67, 26, 3, 1}/256 | | | | | h | {1, 1, 18, 63, 78, 58, 30, 6, 1}/256 | +----------+--------------------------------------+ Table 15: PDFs for NB/MB Normalized LSF Stage-2 Index Decoding +----------+---------------------------------------+ | Codebook | PDF | +----------+---------------------------------------+ | i | {1, 1, 1, 9, 232, 9, 1, 1, 1}/256 | | | | | j | {1, 1, 2, 28, 186, 35, 1, 1, 1}/256 | | | | | k | {1, 1, 3, 42, 152, 53, 2, 1, 1}/256 | | | | | l | {1, 1, 10, 49, 126, 65, 2, 1, 1}/256 | | | | | m | {1, 4, 19, 48, 100, 77, 5, 1, 1}/256 | | | | | n | {1, 1, 14, 54, 100, 72, 12, 1, 1}/256 | | | | | o | {1, 1, 15, 61, 87, 61, 25, 4, 1}/256 | | | | | p | {1, 7, 21, 50, 77, 81, 17, 1, 1}/256 | +----------+---------------------------------------+ Table 16: PDFs for WB Normalized LSF Stage-2 Index Decoding
+----+---------------------+
| I1 | Coefficient |
+----+---------------------+
| | 0 1 2 3 4 5 6 7 8 9 |
| 0 | a a a a a a a a a a |
| | |
| 1 | b d b c c b c b b b |
| | |
| 2 | c b b b b b b b b b |
| | |
| 3 | b c c c c b c b b b |
| | |
| 4 | c d d d d c c c c c |
| | |
| 5 | a f d d c c c c b b |
| | |
| g | a c c c c c c c c b |
| | |
| 7 | c d g e e e f e f f |
| | |
| 8 | c e f f e f e g e e |
| | |
| 9 | c e e h e f e f f e |
| | |
| 10 | e d d d c d c c c c |
| | |
| 11 | b f f g e f e f f f |
| | |
| 12 | c h e g f f f f f f |
| | |
| 13 | c h f f f f f g f e |
| | |
| 14 | d d f e e f e f e e |
| | |
| 15 | c d d f f e e e e e |
| | |
| 16 | c e e g e f e f f f |
| | |
| 17 | c f e g f f f e f e |
| | |
| 18 | c h e f e f e f f f |
| | |
| 19 | c f e g h g f g f e |
| | |
| 20 | d g h e g f f g e f |
| | |
| 21 | c h g e e e f e f f |
| | |
| 22 | e f f e g g f g f e | | | | | 23 | c f f g f g e g e e | | | | | 24 | e f f f d h e f f e | | | | | 25 | c d e f f g e f f e | | | | | 26 | c d c d d e c d d d | | | | | 27 | b b c c c c c d c c | | | | | 28 | e f f g g g f g e f | | | | | 29 | d f f e e e e d d c | | | | | 30 | c f d h f f e e f e | | | | | 31 | e e f e f g f g f e | +----+---------------------+ Table 17: Codebook Selection for NB/MB Normalized LSF Stage-2 Index Decoding +----+------------------------------------------------+ | I1 | Coefficient | +----+------------------------------------------------+ | | 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 | | | | | 0 | i i i i i i i i i i i i i i i i | | | | | 1 | k l l l l l k k k k k j j j i l | | | | | 2 | k n n l p m m n k n m n n m l l | | | | | 3 | i k j k k j j j j j i i i i i j | | | | | 4 | i o n m o m p n m m m n n m m l | | | | | 5 | i l n n m l l n l l l l l l k m | | | | | 6 | i i i i i i i i i i i i i i i i | | | | | 7 | i k o l p k n l m n n m l l k l | | | | | 8 | i o k o o m n m o n m m n l l l | | | | | 9 | k j i i i i i i i i i i i i i i |
| | | | 10 | i j i i i i i i i i i i i i i j | | | | | 11 | k k l m n l l l l l l l k k j l | | | | | 12 | k k l l m l l l l l l l l k j l | | | | | 13 | l m m m o m m n l n m m n m l m | | | | | 14 | i o m n m p n k o n p m m l n l | | | | | 15 | i j i j j j j j j j i i i i j i | | | | | 16 | j o n p n m n l m n m m m l l m | | | | | 17 | j l l m m l l n k l l n n n l m | | | | | 18 | k l l k k k l k j k j k j j j m | | | | | 19 | i k l n l l k k k j j i i i i i | | | | | 20 | l m l n l l k k j j j j j k k m | | | | | 21 | k o l p p m n m n l n l l k l l | | | | | 22 | k l n o o l n l m m l l l l k m | | | | | 23 | j l l m m m m l n n n l j j j j | | | | | 24 | k n l o o m p m m n l m m l l l | | | | | 25 | i o j j i i i i i i i i i i i i | | | | | 26 | i o o l n k n n l m m p p m m m | | | | | 27 | l l p l n m l l l k k l l l k l | | | | | 28 | i i j i i i k j k j j k k k j j | | | | | 29 | i l k n l l k l k j i i j i i j | | | | | 30 | l n n m p n l l k l k k j i j i | | | | | 31 | k l n l m l l l k j k o m i i i | +----+------------------------------------------------+ Table 18: Codebook Selection for WB Normalized LSF Stage-2 Index Decoding
Decoding the second stage residual proceeds as follows. For each coefficient, the decoder reads a symbol using the PDF corresponding to I1 from either Table 17 or Table 18, and subtracts 4 from the result to give an index in the range -4 to 4, inclusive. If the index is either -4 or 4, it reads a second symbol using the PDF in Table 19, and adds the value of this second symbol to the index, using the same sign. This gives the index, I2[k], a total range of -10 to 10, inclusive. +-------------------------------+ | PDF | +-------------------------------+ | {156, 60, 24, 9, 4, 2, 1}/256 | +-------------------------------+ Table 19: PDF for Normalized LSF Index Extension Decoding The decoded indices from both stages are translated back into normalized LSF coefficients in silk_NLSF_decode() (NLSF_decode.c). The stage-2 indices represent residuals after both the first stage of the VQ and a separate backwards-prediction step. The backwards prediction process in the encoder subtracts a prediction from each residual formed by a multiple of the coefficient that follows it. The decoder must undo this process. Table 20 contains lists of prediction weights for each coefficient. There are two lists for NB and MB, and another two lists for WB, giving two possible prediction weights for each coefficient.
+-------------+-----+-----+-----+-----+ | Coefficient | A | B | C | D | +-------------+-----+-----+-----+-----+ | 0 | 179 | 116 | 175 | 68 | | | | | | | | 1 | 138 | 67 | 148 | 62 | | | | | | | | 2 | 140 | 82 | 160 | 66 | | | | | | | | 3 | 148 | 59 | 176 | 60 | | | | | | | | 4 | 151 | 92 | 178 | 72 | | | | | | | | 5 | 149 | 72 | 173 | 117 | | | | | | | | 6 | 153 | 100 | 174 | 85 | | | | | | | | 7 | 151 | 89 | 164 | 90 | | | | | | | | 8 | 163 | 92 | 177 | 118 | | | | | | | | 9 | | | 174 | 136 | | | | | | | | 10 | | | 196 | 151 | | | | | | | | 11 | | | 182 | 142 | | | | | | | | 12 | | | 198 | 160 | | | | | | | | 13 | | | 192 | 142 | | | | | | | | 14 | | | 182 | 155 | +-------------+-----+-----+-----+-----+ Table 20: Prediction Weights for Normalized LSF Decoding The prediction is undone using the procedure implemented in silk_NLSF_residual_dequant() (NLSF_decode.c), which is as follows. Each coefficient selects its prediction weight from one of the two lists based on the stage-1 index, I1. Table 21 gives the selections for each coefficient for NB and MB, and Table 22 gives the selections for WB. Let d_LPC be the order of the codebook, i.e., 10 for NB and MB, and 16 for WB, and let pred_Q8[k] be the weight for the k'th coefficient selected by this process for 0 <= k < d_LPC-1. Then, the stage-2 residual for each coefficient is computed via res_Q10[k] = (k+1 < d_LPC ? (res_Q10[k+1]*pred_Q8[k])>>8 : 0) + ((((I2[k]<<10) - sign(I2[k])*102)*qstep)>>16) ,
where qstep is the Q16 quantization step size, which is 11796 for NB
and MB and 9830 for WB (representing step sizes of approximately 0.18
and 0.15, respectively).
+----+-------------------+
| I1 | Coefficient |
+----+-------------------+
| | 0 1 2 3 4 5 6 7 8 |
| | |
| 0 | A B A A A A A A A |
| | |
| 1 | B A A A A A A A A |
| | |
| 2 | A A A A A A A A A |
| | |
| 3 | B B B A A A A B A |
| | |
| 4 | A B A A A A A A A |
| | |
| 5 | A B A A A A A A A |
| | |
| 6 | B A B B A A A B A |
| | |
| 7 | A B B A A B B A A |
| | |
| 8 | A A B B A B A B B |
| | |
| 9 | A A B B A A B B B |
| | |
| 10 | A A A A A A A A A |
| | |
| 11 | A B A B B B B B A |
| | |
| 12 | A B A B B B B B A |
| | |
| 13 | A B B B B B B B A |
| | |
| 14 | B A B B A B B B B |
| | |
| 15 | A B B B B B A B A |
| | |
| 16 | A A B B A B A B A |
| | |
| 17 | A A B B B A B B B |
| | |
| 18 | A B B A A B B B A |
| | |
| 19 | A A A B B B A B A |
| | | | 20 | A B B A A B A B A | | | | | 21 | A B B A A A B B A | | | | | 22 | A A A A A B B B B | | | | | 23 | A A B B A A A B B | | | | | 24 | A A A B A B B B B | | | | | 25 | A B B B B B B B A | | | | | 26 | A A A A A A A A A | | | | | 27 | A A A A A A A A A | | | | | 28 | A A B A B B A B A | | | | | 29 | B A A B A A A A A | | | | | 30 | A A A B B A B A B | | | | | 31 | B A B B A B B B B | +----+-------------------+ Table 21: Prediction Weight Selection for NB/MB Normalized LSF Decoding +----+---------------------------------------------+ | I1 | Coefficient | +----+---------------------------------------------+ | | 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 | | | | | 0 | C C C C C C C C C C C C C C D | | | | | 1 | C C C C C C C C C C C C C C C | | | | | 2 | C C D C C D D D C D D D D C C | | | | | 3 | C C C C C C C C C C C C D C C | | | | | 4 | C D D C D C D D C D D D D D C | | | | | 5 | C C D C C C C C C C C C C C C |
| | |
| 6 | D C C C C C C C C C C D C D C |
| | |
| 7 | C D D C C C D C D D D C D C D |
| | |
| 8 | C D C D D C D C D C D D D D D |
| | |
| 9 | C C C C C C C C C C C C C C D |
| | |
| 10 | C D C C C C C C C C C C C C C |
| | |
| 11 | C C D C D D D D D D D C D C C |
| | |
| 12 | C C D C C D C D C D C C D C C |
| | |
| 13 | C C C C D D C D C D D D D C C |
| | |
| 14 | C D C C C D D C D D D C D D D |
| | |
| 15 | C C D D C C C C C C C C D D C |
| | |
| 16 | C D D C D C D D D D D C D C C |
| | |
| 17 | C C D C C C C D C C D D D C C |
| | |
| 18 | C C C C C C C C C C C C C C D |
| | |
| 19 | C C C C C C C C C C C C D C C |
| | |
| 20 | C C C C C C C C C C C C C C C |
| | |
| 21 | C D C D C D D C D C D C D D C |
| | |
| 22 | C C D D D D C D D C C D D C C |
| | |
| 23 | C D D C D C D C D C C C C D C |
| | |
| 24 | C C C D D C D C D D D D D D D |
| | |
| 25 | C C C C C C C C C C C C C C D |
| | |
| 26 | C D D C C C D D C C D D D D D |
| | |
| 27 | C C C C C D C D D D D C D D D |
| | |
| 28 | C C C C C C C C C C C C C C D |
| | |
| 29 | C C C C C C C C C C C C C C D |
| | | | 30 | D C C C C C C C C C C D C C C | | | | | 31 | C C D C C D D D C C D C C D C | +----+---------------------------------------------+ Table 22: Prediction Weight Selection for WB Normalized LSF Decoding4.2.7.5.3. Reconstructing the Normalized LSF Coefficients
Once the stage-1 index I1 and the stage-2 residual res_Q10[] have been decoded, the final normalized LSF coefficients can be reconstructed. The spectral distortion introduced by the quantization of each LSF coefficient varies, so the stage-2 residual is weighted accordingly, using the low-complexity Inverse Harmonic Mean Weighting (IHMW) function proposed in [LAROIA-ICASSP]. The weights are derived directly from the stage-1 codebook vector. Let cb1_Q8[k] be the k'th entry of the stage-1 codebook vector from Table 23 or Table 24. Then, for 0 <= k < d_LPC, the following expression computes the square of the weight as a Q18 value: w2_Q18[k] = (1024/(cb1_Q8[k] - cb1_Q8[k-1]) + 1024/(cb1_Q8[k+1] - cb1_Q8[k])) << 16 where cb1_Q8[-1] = 0 and cb1_Q8[d_LPC] = 256, and the division is integer division. This is reduced to an unsquared, Q9 value using the following square-root approximation: i = ilog(w2_Q18[k]) f = (w2_Q18[k]>>(i-8)) & 127 y = ((i&1) ? 32768 : 46214) >> ((32-i)>>1) w_Q9[k] = y + ((213*f*y)>>16) The constant 46214 here is approximately the square root of 2 in Q15. The cb1_Q8[] vector completely determines these weights, and they may be tabulated and stored as 13-bit unsigned values (with a range of 1819 to 5227, inclusive) to avoid computing them when decoding. The reference implementation already requires code to compute these weights on unquantized coefficients in the encoder, in silk_NLSF_VQ_weights_laroia() (NLSF_VQ_weights_laroia.c) and its callers, so it reuses that code in the decoder instead of using a pre-computed table to reduce the amount of ROM required.
+----+----------------------------------------+
| I1 | Codebook (Q8) |
+----+----------------------------------------+
| | 0 1 2 3 4 5 6 7 8 9 |
| | |
| 0 | 12 35 60 83 108 132 157 180 206 228 |
| | |
| 1 | 15 32 55 77 101 125 151 175 201 225 |
| | |
| 2 | 19 42 66 89 114 137 162 184 209 230 |
| | |
| 3 | 12 25 50 72 97 120 147 172 200 223 |
| | |
| 4 | 26 44 69 90 114 135 159 180 205 225 |
| | |
| 5 | 13 22 53 80 106 130 156 180 205 228 |
| | |
| 6 | 15 25 44 64 90 115 142 168 196 222 |
| | |
| 7 | 19 24 62 82 100 120 145 168 190 214 |
| | |
| 8 | 22 31 50 79 103 120 151 170 203 227 |
| | |
| 9 | 21 29 45 65 106 124 150 171 196 224 |
| | |
| 10 | 30 49 75 97 121 142 165 186 209 229 |
| | |
| 11 | 19 25 52 70 93 116 143 166 192 219 |
| | |
| 12 | 26 34 62 75 97 118 145 167 194 217 |
| | |
| 13 | 25 33 56 70 91 113 143 165 196 223 |
| | |
| 14 | 21 34 51 72 97 117 145 171 196 222 |
| | |
| 15 | 20 29 50 67 90 117 144 168 197 221 |
| | |
| 16 | 22 31 48 66 95 117 146 168 196 222 |
| | |
| 17 | 24 33 51 77 116 134 158 180 200 224 |
| | |
| 18 | 21 28 70 87 106 124 149 170 194 217 |
| | |
| 19 | 26 33 53 64 83 117 152 173 204 225 |
| | |
| 20 | 27 34 65 95 108 129 155 174 210 225 |
| | |
| 21 | 20 26 72 99 113 131 154 176 200 219 |
| | | | 22 | 34 43 61 78 93 114 155 177 205 229 | | | | | 23 | 23 29 54 97 124 138 163 179 209 229 | | | | | 24 | 30 38 56 89 118 129 158 178 200 231 | | | | | 25 | 21 29 49 63 85 111 142 163 193 222 | | | | | 26 | 27 48 77 103 133 158 179 196 215 232 | | | | | 27 | 29 47 74 99 124 151 176 198 220 237 | | | | | 28 | 33 42 61 76 93 121 155 174 207 225 | | | | | 29 | 29 53 87 112 136 154 170 188 208 227 | | | | | 30 | 24 30 52 84 131 150 166 186 203 229 | | | | | 31 | 37 48 64 84 104 118 156 177 201 230 | +----+----------------------------------------+ Table 23: NB/MB Normalized LSF Stage-1 Codebook Vectors +----+------------------------------------------------------------+ | I1 | Codebook (Q8) | +----+------------------------------------------------------------+ | | 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 | | | | | 0 | 7 23 38 54 69 85 100 116 131 147 162 178 193 208 223 239 | | | | | 1 | 13 25 41 55 69 83 98 112 127 142 157 171 187 203 220 236 | | | | | 2 | 15 21 34 51 61 78 92 106 126 136 152 167 185 205 225 240 | | | | | 3 | 10 21 36 50 63 79 95 110 126 141 157 173 189 205 221 237 | | | | | 4 | 17 20 37 51 59 78 89 107 123 134 150 164 184 205 224 240 | | | | | 5 | 10 15 32 51 67 81 96 112 129 142 158 173 189 204 220 236 | | | | | 6 | 8 21 37 51 65 79 98 113 126 138 155 168 179 192 209 218 | | | | | 7 | 12 15 34 55 63 78 87 108 118 131 148 167 185 203 219 236 | | | | | 8 | 16 19 32 36 56 79 91 108 118 136 154 171 186 204 220 237 | | | | | 9 | 11 28 43 58 74 89 105 120 135 150 165 180 196 211 226 241 |
| | | | 10 | 6 16 33 46 60 75 92 107 123 137 156 169 185 199 214 225 | | | | | 11 | 11 19 30 44 57 74 89 105 121 135 152 169 186 202 218 234 | | | | | 12 | 12 19 29 46 57 71 88 100 120 132 148 165 182 199 216 233 | | | | | 13 | 17 23 35 46 56 77 92 106 123 134 152 167 185 204 222 237 | | | | | 14 | 14 17 45 53 63 75 89 107 115 132 151 171 188 206 221 240 | | | | | 15 | 9 16 29 40 56 71 88 103 119 137 154 171 189 205 222 237 | | | | | 16 | 16 19 36 48 57 76 87 105 118 132 150 167 185 202 218 236 | | | | | 17 | 12 17 29 54 71 81 94 104 126 136 149 164 182 201 221 237 | | | | | 18 | 15 28 47 62 79 97 115 129 142 155 168 180 194 208 223 238 | | | | | 19 | 8 14 30 45 62 78 94 111 127 143 159 175 192 207 223 239 | | | | | 20 | 17 30 49 62 79 92 107 119 132 145 160 174 190 204 220 235 | | | | | 21 | 14 19 36 45 61 76 91 108 121 138 154 172 189 205 222 238 | | | | | 22 | 12 18 31 45 60 76 91 107 123 138 154 171 187 204 221 236 | | | | | 23 | 13 17 31 43 53 70 83 103 114 131 149 167 185 203 220 237 | | | | | 24 | 17 22 35 42 58 78 93 110 125 139 155 170 188 206 224 240 | | | | | 25 | 8 15 34 50 67 83 99 115 131 146 162 178 193 209 224 239 | | | | | 26 | 13 16 41 66 73 86 95 111 128 137 150 163 183 206 225 241 | | | | | 27 | 17 25 37 52 63 75 92 102 119 132 144 160 175 191 212 231 | | | | | 28 | 19 31 49 65 83 100 117 133 147 161 174 187 200 213 227 242 | | | | | 29 | 18 31 52 68 88 103 117 126 138 149 163 177 192 207 223 239 | | | | | 30 | 16 29 47 61 76 90 106 119 133 147 161 176 193 209 224 240 | | | | | 31 | 15 21 35 50 61 73 86 97 110 119 129 141 175 198 218 237 | +----+------------------------------------------------------------+ Table 24: WB Normalized LSF Stage-1 Codebook Vectors
Given the stage-1 codebook entry cb1_Q8[], the stage-2 residual res_Q10[], and their corresponding weights, w_Q9[], the reconstructed normalized LSF coefficients are NLSF_Q15[k] = clamp(0, (cb1_Q8[k]<<7) + (res_Q10[k]<<14)/w_Q9[k], 32767) where the division is integer division. However, nothing in either the reconstruction process or the quantization process in the encoder thus far guarantees that the coefficients are monotonically increasing and separated well enough to ensure a stable filter [KABAL86]. When using the reference encoder, roughly 2% of frames violate this constraint. The next section describes a stabilization procedure used to make these guarantees.4.2.7.5.4. Normalized LSF Stabilization
The normalized LSF stabilization procedure is implemented in silk_NLSF_stabilize() (NLSF_stabilize.c). This process ensures that consecutive values of the normalized LSF coefficients, NLSF_Q15[], are spaced some minimum distance apart (predetermined to be the 0.01 percentile of a large training set). Table 25 gives the minimum spacings for NB and MB and those for WB, where row k is the minimum allowed value of NLSF_Q15[k]-NLSF_Q15[k-1]. For the purposes of computing this spacing for the first and last coefficient, NLSF_Q15[-1] is taken to be 0 and NLSF_Q15[d_LPC] is taken to be 32768.
+-------------+-----------+-----+ | Coefficient | NB and MB | WB | +-------------+-----------+-----+ | 0 | 250 | 100 | | | | | | 1 | 3 | 3 | | | | | | 2 | 6 | 40 | | | | | | 3 | 3 | 3 | | | | | | 4 | 3 | 3 | | | | | | 5 | 3 | 3 | | | | | | 6 | 4 | 5 | | | | | | 7 | 3 | 14 | | | | | | 8 | 3 | 14 | | | | | | 9 | 3 | 10 | | | | | | 10 | 461 | 11 | | | | | | 11 | | 3 | | | | | | 12 | | 8 | | | | | | 13 | | 9 | | | | | | 14 | | 7 | | | | | | 15 | | 3 | | | | | | 16 | | 347 | +-------------+-----------+-----+ Table 25: Minimum Spacing for Normalized LSF Coefficients The procedure starts off by trying to make small adjustments that attempt to minimize the amount of distortion introduced. After 20 such adjustments, it falls back to a more direct method that guarantees the constraints are enforced but may require large adjustments.
Let NDeltaMin_Q15[k] be the minimum required spacing for the current audio bandwidth from Table 25. First, the procedure finds the index i where NLSF_Q15[i] - NLSF_Q15[i-1] - NDeltaMin_Q15[i] is the smallest, breaking ties by using the lower value of i. If this value is non-negative, then the stabilization stops; the coefficients satisfy all the constraints. Otherwise, if i == 0, it sets NLSF_Q15[0] to NDeltaMin_Q15[0], and if i == d_LPC, it sets NLSF_Q15[d_LPC-1] to (32768 - NDeltaMin_Q15[d_LPC]). For all other values of i, both NLSF_Q15[i-1] and NLSF_Q15[i] are updated as follows: i-1 __ min_center_Q15 = (NDeltaMin_Q15[i]>>1) + \ NDeltaMin_Q15[k] /_ k=0 d_LPC __ max_center_Q15 = 32768 - (NDeltaMin_Q15[i]>>1) - \ NDeltaMin_Q15[k] /_ k=i+1 center_freq_Q15 = clamp(min_center_Q15[i], (NLSF_Q15[i-1] + NLSF_Q15[i] + 1)>>1 max_center_Q15[i]) NLSF_Q15[i-1] = center_freq_Q15 - (NDeltaMin_Q15[i]>>1) NLSF_Q15[i] = NLSF_Q15[i-1] + NDeltaMin_Q15[i] Then, the procedure repeats again, until it has either executed 20 times or stopped because the coefficients satisfy all the constraints. After the 20th repetition of the above procedure, the following fallback procedure executes once. First, the values of NLSF_Q15[k] for 0 <= k < d_LPC are sorted in ascending order. Then, for each value of k from 0 to d_LPC-1, NLSF_Q15[k] is set to max(NLSF_Q15[k], NLSF_Q15[k-1] + NDeltaMin_Q15[k]) Next, for each value of k from d_LPC-1 down to 0, NLSF_Q15[k] is set to min(NLSF_Q15[k], NLSF_Q15[k+1] - NDeltaMin_Q15[k+1]) There is no need to check if the coefficients satisfy all the constraints before applying this fallback procedure. If they do, then it will not change their values.
4.2.7.5.5. Normalized LSF Interpolation
For 20 ms SILK frames, the first half of the frame (i.e., the first two subframes) may use normalized LSF coefficients that are interpolated between the decoded LSFs for the most recent coded frame (in the same channel) and the current frame. A Q2 interpolation factor follows the LSF coefficient indices in the bitstream, which is decoded using the PDF in Table 26. This happens in silk_decode_indices() (decode_indices.c). After either o An uncoded regular SILK frame in the side channel, or o A decoder reset (see Section 4.5.2), the decoder still decodes this factor, but ignores its value and always uses 4 instead. For 10 ms SILK frames, this factor is not stored at all. +---------------------------+ | PDF | +---------------------------+ | {13, 22, 29, 11, 181}/256 | +---------------------------+ Table 26: PDF for Normalized LSF Interpolation Index Let n2_Q15[k] be the normalized LSF coefficients decoded by the procedure in Section 4.2.7.5, n0_Q15[k] be the LSF coefficients decoded for the prior frame, and w_Q2 be the interpolation factor. Then, the normalized LSF coefficients used for the first half of a 20 ms frame, n1_Q15[k], are n1_Q15[k] = n0_Q15[k] + (w_Q2*(n2_Q15[k] - n0_Q15[k]) >> 2) This interpolation is performed in silk_decode_parameters() (decode_parameters.c).4.2.7.5.6. Converting Normalized LSFs to LPC Coefficients
Any LPC filter A(z) can be split into a symmetric part P(z) and an anti-symmetric part Q(z) such that d_LPC __ -k 1 A(z) = 1 - \ a[k] * z = - * (P(z) + Q(z)) /_ 2 k=1
with -d_LPC-1 -1 P(z) = A(z) + z * A(z ) -d_LPC-1 -1 Q(z) = A(z) - z * A(z ) The even normalized LSF coefficients correspond to a pair of conjugate roots of P(z), while the odd coefficients correspond to a pair of conjugate roots of Q(z), all of which lie on the unit circle. In addition, P(z) has a root at pi and Q(z) has a root at 0. Thus, they may be reconstructed mathematically from a set of normalized LSF coefficients, n[k], as d_LPC/2-1 -1 ___ -1 -2 P(z) = (1 + z ) * | | (1 - 2*cos(pi*n[2*k])*z + z ) k=0 d_LPC/2-1 -1 ___ -1 -2 Q(z) = (1 - z ) * | | (1 - 2*cos(pi*n[2*k+1])*z + z ) k=0 However, SILK performs this reconstruction using a fixed-point approximation so that all decoders can reproduce it in a bit-exact manner to avoid prediction drift. The function silk_NLSF2A() (NLSF2A.c) implements this procedure. To start, it approximates cos(pi*n[k]) using a table lookup with linear interpolation. The encoder SHOULD use the inverse of this piecewise linear approximation, rather than the true inverse of the cosine function, when deriving the normalized LSF coefficients. These values are also re-ordered to improve numerical accuracy when constructing the LPC polynomials.
+-------------+-----------+----+ | Coefficient | NB and MB | WB | +-------------+-----------+----+ | 0 | 0 | 0 | | | | | | 1 | 9 | 15 | | | | | | 2 | 6 | 8 | | | | | | 3 | 3 | 7 | | | | | | 4 | 4 | 4 | | | | | | 5 | 5 | 11 | | | | | | 6 | 8 | 12 | | | | | | 7 | 1 | 3 | | | | | | 8 | 2 | 2 | | | | | | 9 | 7 | 13 | | | | | | 10 | | 10 | | | | | | 11 | | 5 | | | | | | 12 | | 6 | | | | | | 13 | | 9 | | | | | | 14 | | 14 | | | | | | 15 | | 1 | +-------------+-----------+----+ Table 27: LSF Ordering for Polynomial Evaluation The top 7 bits of each normalized LSF coefficient index a value in the table, and the next 8 bits interpolate between it and the next value. Let i = (n[k] >> 8) be the integer index and f = (n[k] & 255) be the fractional part of a given coefficient. Then, the re-ordered, approximated cosine, c_Q17[ordering[k]], is c_Q17[ordering[k]] = (cos_Q12[i]*256 + (cos_Q12[i+1]-cos_Q12[i])*f + 4) >> 3
where ordering[k] is the k'th entry of the column of Table 27 corresponding to the current audio bandwidth and cos_Q12[i] is the i'th entry of Table 28. +-----+-------+-------+-------+-------+ | i | +0 | +1 | +2 | +3 | +-----+-------+-------+-------+-------+ | 0 | 4096 | 4095 | 4091 | 4085 | | | | | | | | 4 | 4076 | 4065 | 4052 | 4036 | | | | | | | | 8 | 4017 | 3997 | 3973 | 3948 | | | | | | | | 12 | 3920 | 3889 | 3857 | 3822 | | | | | | | | 16 | 3784 | 3745 | 3703 | 3659 | | | | | | | | 20 | 3613 | 3564 | 3513 | 3461 | | | | | | | | 24 | 3406 | 3349 | 3290 | 3229 | | | | | | | | 28 | 3166 | 3102 | 3035 | 2967 | | | | | | | | 32 | 2896 | 2824 | 2751 | 2676 | | | | | | | | 36 | 2599 | 2520 | 2440 | 2359 | | | | | | | | 40 | 2276 | 2191 | 2106 | 2019 | | | | | | | | 44 | 1931 | 1842 | 1751 | 1660 | | | | | | | | 48 | 1568 | 1474 | 1380 | 1285 | | | | | | | | 52 | 1189 | 1093 | 995 | 897 | | | | | | | | 56 | 799 | 700 | 601 | 501 | | | | | | | | 60 | 401 | 301 | 201 | 101 | | | | | | | | 64 | 0 | -101 | -201 | -301 | | | | | | | | 68 | -401 | -501 | -601 | -700 | | | | | | | | 72 | -799 | -897 | -995 | -1093 | | | | | | | | 76 | -1189 | -1285 | -1380 | -1474 | | | | | | | | 80 | -1568 | -1660 | -1751 | -1842 |
| | | | | | | 84 | -1931 | -2019 | -2106 | -2191 | | | | | | | | 88 | -2276 | -2359 | -2440 | -2520 | | | | | | | | 92 | -2599 | -2676 | -2751 | -2824 | | | | | | | | 96 | -2896 | -2967 | -3035 | -3102 | | | | | | | | 100 | -3166 | -3229 | -3290 | -3349 | | | | | | | | 104 | -3406 | -3461 | -3513 | -3564 | | | | | | | | 108 | -3613 | -3659 | -3703 | -3745 | | | | | | | | 112 | -3784 | -3822 | -3857 | -3889 | | | | | | | | 116 | -3920 | -3948 | -3973 | -3997 | | | | | | | | 120 | -4017 | -4036 | -4052 | -4065 | | | | | | | | 124 | -4076 | -4085 | -4091 | -4095 | | | | | | | | 128 | -4096 | | | | +-----+-------+-------+-------+-------+ Table 28: Q12 Cosine Table for LSF Conversion Given the list of cosine values, silk_NLSF2A_find_poly() (NLSF2A.c) computes the coefficients of P and Q, described here via a simple recurrence. Let p_Q16[k][j] and q_Q16[k][j] be the coefficients of the products of the first (k+1) root pairs for P and Q, with j indexing the coefficient number. Only the first (k+2) coefficients are needed, as the products are symmetric. Let p_Q16[0][0] = q_Q16[0][0] = 1<<16, p_Q16[0][1] = -c_Q17[0], q_Q16[0][1] = -c_Q17[1], and d2 = d_LPC/2. As boundary conditions, assume p_Q16[k][j] = q_Q16[k][j] = 0 for all j < 0. Also, assume p_Q16[k][k+2] = p_Q16[k][k] and q_Q16[k][k+2] = q_Q16[k][k] (because of the symmetry). Then, for 0 < k < d2 and 0 <= j <= k+1, p_Q16[k][j] = p_Q16[k-1][j] + p_Q16[k-1][j-2] - ((c_Q17[2*k]*p_Q16[k-1][j-1] + 32768)>>16) q_Q16[k][j] = q_Q16[k-1][j] + q_Q16[k-1][j-2] - ((c_Q17[2*k+1]*q_Q16[k-1][j-1] + 32768)>>16)
The use of Q17 values for the cosine terms in an otherwise Q16 expression implicitly scales them by a factor of 2. The multiplications in this recurrence may require up to 48 bits of precision in the result to avoid overflow. In practice, each row of the recurrence only depends on the previous row, so an implementation does not need to store all of them. silk_NLSF2A() uses the values from the last row of this recurrence to reconstruct a 32-bit version of the LPC filter (without the leading 1.0 coefficient), a32_Q17[k], 0 <= k < d2: a32_Q17[k] = -(q_Q16[d2-1][k+1] - q_Q16[d2-1][k]) - (p_Q16[d2-1][k+1] + p_Q16[d2-1][k])) a32_Q17[d_LPC-k-1] = (q_Q16[d2-1][k+1] - q_Q16[d2-1][k]) - (p_Q16[d2-1][k+1] + p_Q16[d2-1][k])) The sum and difference of two terms from each of the p_Q16 and q_Q16 coefficient lists reflect the (1 + z**-1) and (1 - z**-1) factors of P and Q, respectively. The promotion of the expression from Q16 to Q17 implicitly scales the result by 1/2.4.2.7.5.7. Limiting the Range of the LPC Coefficients
The a32_Q17[] coefficients are too large to fit in a 16-bit value, which significantly increases the cost of applying this filter in fixed-point decoders. Reducing them to Q12 precision doesn't incur any significant quality loss, but still does not guarantee they will fit. silk_NLSF2A() applies up to 10 rounds of bandwidth expansion to limit the dynamic range of these coefficients. Even floating-point decoders SHOULD perform these steps, to avoid mismatch. For each round, the process first finds the index k such that abs(a32_Q17[k]) is largest, breaking ties by choosing the lowest value of k. Then, it computes the corresponding Q12 precision value, maxabs_Q12, subject to an upper bound to avoid overflow in subsequent computations: maxabs_Q12 = min((maxabs_Q17 + 16) >> 5, 163838) If this is larger than 32767, the procedure derives the chirp factor, sc_Q16[0], to use in the bandwidth expansion as (maxabs_Q12 - 32767) << 14 sc_Q16[0] = 65470 - -------------------------- (maxabs_Q12 * (k+1)) >> 2
where the division here is integer division. This is an approximation of the chirp factor needed to reduce the target coefficient to 32767, though it is both less than 0.999 and, for k > 0 when maxabs_Q12 is much greater than 32767, still slightly too large. The upper bound on maxabs_Q12, 163838, was chosen because it is equal to ((2**31 - 1) >> 14) + 32767, i.e., the largest value of maxabs_Q12 that would not overflow the numerator in the equation above when stored in a signed 32-bit integer. silk_bwexpander_32() (bwexpander_32.c) performs the bandwidth expansion (again, only when maxabs_Q12 is greater than 32767) using the following recurrence: a32_Q17[k] = (a32_Q17[k]*sc_Q16[k]) >> 16 sc_Q16[k+1] = (sc_Q16[0]*sc_Q16[k] + 32768) >> 16 The first multiply may require up to 48 bits of precision in the result to avoid overflow. The second multiply must be unsigned to avoid overflow with only 32 bits of precision. The reference implementation uses a slightly more complex formulation that avoids the 32-bit overflow using signed multiplication, but is otherwise equivalent. After 10 rounds of bandwidth expansion are performed, they are simply saturated to 16 bits: a32_Q17[k] = clamp(-32768, (a32_Q17[k] + 16) >> 5, 32767) << 5 Because this performs the actual saturation in the Q12 domain, but converts the coefficients back to the Q17 domain for the purposes of prediction gain limiting, this step must be performed after the 10th round of bandwidth expansion, regardless of whether or not the Q12 version of any coefficient still overflows a 16-bit integer. This saturation is not performed if maxabs_Q12 drops to 32767 or less prior to the 10th round.4.2.7.5.8. Limiting the Prediction Gain of the LPC Filter
The prediction gain of an LPC synthesis filter is the square root of the output energy when the filter is excited by a unit-energy impulse. Even if the Q12 coefficients would fit, the resulting filter may still have a significant gain (especially for voiced sounds), making the filter unstable. silk_NLSF2A() applies up to 16 additional rounds of bandwidth expansion to limit the prediction gain. Instead of controlling the amount of bandwidth expansion using the prediction gain itself (which may diverge to infinity for an unstable filter), silk_NLSF2A() uses silk_LPC_inverse_pred_gain_QA()
(LPC_inv_pred_gain.c) to compute the reflection coefficients associated with the filter. The filter is stable if and only if the magnitude of these coefficients is sufficiently less than one. The reflection coefficients, rc[k], can be computed using a simple Levinson recurrence, initialized with the LPC coefficients a[d_LPC- 1][n] = a[n], and then updated via rc[k] = -a[k][k] , a[k][n] - a[k][k-n-1]*rc[k] a[k-1][n] = --------------------------- 2 1 - rc[k] However, silk_LPC_inverse_pred_gain_QA() approximates this using fixed-point arithmetic to guarantee reproducible results across platforms and implementations. Since small changes in the coefficients can make a stable filter unstable, it takes the real Q12 coefficients that will be used during reconstruction as input. Thus, let a32_Q12[n] = (a32_Q17[n] + 16) >> 5 be the Q12 version of the LPC coefficients that will eventually be used. As a simple initial check, the decoder computes the DC response as d_PLC-1 __ DC_resp = \ a32_Q12[n] /_ n=0 and if DC_resp > 4096, the filter is unstable. Increasing the precision of these Q12 coefficients to Q24 for intermediate computations allows more accurate computation of the reflection coefficients, so the decoder initializes the recurrence via inv_gain_Q30[d_LPC] = 1 << 30 a32_Q24[d_LPC-1][n] = a32_Q12[n] << 12
Then, for each k from d_LPC-1 down to 0, if abs(a32_Q24[k][k]) > 16773022, the filter is unstable and the recurrence stops. The constant 16773022 here is approximately 0.99975 in Q24. Otherwise, the inverse of the prediction gain, inv_gain_Q30[k], is updated via rc_Q31[k] = -a32_Q24[k][k] << 7 div_Q30[k] = (1<<30) - (rc_Q31[k]*rc_Q31[k] >> 32) inv_gain_Q30[k] = (inv_gain_Q30[k+1]*div_Q30[k] >> 32) << 2 and if inv_gain_Q30[k] < 107374, the filter is unstable and the recurrence stops. The constant 107374 here is approximately 1/10000 in Q30. If neither of these checks determine that the filter is unstable and k > 0, row k-1 of a32_Q24 is computed from row k as b1[k] = ilog(div_Q30[k]) b2[k] = b1[k] - 16 (1<<29) - 1 inv_Qb2[k] = ----------------------- div_Q30[k] >> (b2[k]+1) err_Q29[k] = (1<<29) - ((div_Q30[k]<<(15-b2[k]))*inv_Qb2[k] >> 16) gain_Qb1[k] = ((inv_Qb2[k] << 16) + (err_Q29[k]*inv_Qb2[k] >> 13)) num_Q24[k-1][n] = a32_Q24[k][n] - ((a32_Q24[k][k-n-1]*rc_Q31[k] + (1<<30)) >> 31) a32_Q24[k-1][n] = (num_Q24[k-1][n]*gain_Qb1[k] + (1<<(b1[k]-1))) >> b1[k] where 0 <= n < k. In the above, rc_Q31[k] are the reflection coefficients. div_Q30[k] is the denominator for each iteration, and gain_Qb1[k] is its multiplicative inverse (with b1[k] fractional bits, where b1[k] ranges from 20 to 31). inv_Qb2[k], which ranges from 16384 to 32767, is a low-precision version of that inverse (with b2[k] fractional bits). err_Q29[k] is the residual error, ranging from -32763 to 32392, which is used to improve the accuracy. The values t_Q24[k-1][n] for each n are the numerators for the next row of coefficients in the recursion, and a32_Q24[k-1][n] is the final version of that row. Every multiply in this procedure except the one used to compute gain_Qb1[k] requires more than 32 bits of precision,
but otherwise all intermediate results fit in 32 bits or less. In practice, because each row only depends on the next one, an implementation does not need to store them all. If abs(a32_Q24[k][k]) <= 16773022 and inv_gain_Q30[k] >= 107374 for 0 <= k < d_LPC, then the filter is considered stable. However, the problem of determining stability is ill-conditioned when the filter contains several reflection coefficients whose magnitude is very close to one. This fixed-point algorithm is not mathematically guaranteed to correctly classify filters as stable or unstable in this case, though it does very well in practice. On round i, 0 <= i < 16, if the filter passes these stability checks, then this procedure stops, and the final LPC coefficients to use for reconstruction in Section 4.2.7.9.2 are a_Q12[k] = (a32_Q17[k] + 16) >> 5 Otherwise, a round of bandwidth expansion is applied using the same procedure as in Section 4.2.7.5.7, with sc_Q16[0] = 65536 - (2<<i) During round 15, sc_Q16[0] becomes 0 in the above equation, so a_Q12[k] is set to 0 for all k, guaranteeing a stable filter.