The TSVCIS codec augments the standard MELP 2400, 1200, and 600 bitrates and hence uses 22.5, 67.5, or 90 ms frames with a sampling rate clock of 8 kHz, so the RTP timestamp MUST be in units of 1/8000 of a second.
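As an illustration only, the timestamp arithmetic implied by the 8 kHz clock can be sketched in Python. The names `ticks_per_frame` and `next_timestamp` are hypothetical helpers, not part of this payload format; the sketch assumes all frames in a packet use the same coder bitrate and that the timestamp wraps at 32 bits as in RTP.

```python
from fractions import Fraction

CLOCK_HZ = 8000  # RTP timestamp units are 1/8000 of a second

# Frame duration in milliseconds for each MELPe coder bitrate (bps).
FRAME_MS = {2400: Fraction("22.5"), 1200: Fraction("67.5"), 600: Fraction(90)}


def ticks_per_frame(bitrate_bps: int) -> int:
    """Return the number of 8 kHz clock ticks covered by one coder frame."""
    ticks = FRAME_MS[bitrate_bps] * CLOCK_HZ / 1000
    # All three frame durations are a whole number of ticks.
    assert ticks.denominator == 1
    return int(ticks)


def next_timestamp(timestamp: int, bitrate_bps: int, frames_in_packet: int) -> int:
    """Timestamp of the packet that follows one carrying `frames_in_packet` frames."""
    step = frames_in_packet * ticks_per_frame(bitrate_bps)
    return (timestamp + step) % 2**32  # RTP timestamps are 32 bits wide
```

Under these assumptions, one frame advances the timestamp by 180, 540, or 720 ticks at 2400, 1200, or 600 bps, respectively.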
The RTP payload for TSVCIS has the format shown in Figure 1. No additional header specific to this payload format is needed. This format is intended for situations where the sender and the receiver send one or more codec data frames per packet.
 0                   1                   2                   3
 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                          RTP Header                           |
+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+
|                                                               |
+                 one or more frames of TSVCIS                  |
|                                                               |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
The RTP header of the packetized encoded TSVCIS speech has the expected values as described in [RFC 3550]. The usage of the M bit SHOULD be as specified in the applicable RTP profile; for example, [RFC 3551] specifies that if the sender does not suppress silence (i.e., sends a frame on every frame interval), the M bit will always be zero. When more than one codec data frame is present in a single RTP packet, the timestamp specified is that of the oldest data frame represented in the RTP packet.
The assignment of an RTP payload type for this new packet format is outside the scope of this document and will not be specified here. It is expected that the RTP profile for a particular class of applications will assign a payload type for this encoding; if that is not done, then a payload type in the dynamic range shall be chosen by the sender.
The TSVCIS speech coder includes all three MELPe coder rates used as base speech parameters or as speech coders for bandwidth-restricted links. RTP packetization of MELPe follows [RFC 8130] and is repeated here for all three MELPe rates, with the recommendations of that document now regarded as requirements. The bits previously labeled as RSVA, RSVB, and RSVC in [RFC 8130] SHOULD be filled with rate code bits CODA, CODB, and CODC, as shown in Table 1 (compatible with Table 7 in Section 3.3 of RFC 8130).
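+---------------+------+------+------+--------+
| Coder Bitrate | CODA | CODB | CODC | Length |
+===============+======+======+======+========+
| 2400 bps      | 0    | 0    | N/A  | 7      |
+---------------+------+------+------+--------+
| 1200 bps      | 1    | 0    | 0    | 11     |
+---------------+------+------+------+--------+
| 600 bps       | 0    | 1    | N/A  | 7      |
+---------------+------+------+------+--------+
| Comfort Noise | 1    | 0    | 1    | 2      |
+---------------+------+------+------+--------+
| TSVCIS Data   | 1    | 1    | N/A  | var.   |
+---------------+------+------+------+--------+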
Table 1: TSVCIS/MELPe Frame Bitrate Indicators and Frame Length
The total number of bits used to describe one MELPe frame of 2400 bps speech is 54, which fits in 7 octets (with two rate code bits). For MELPe 1200 bps speech, the total number of bits used is 81, which fits in 11 octets (with three rate code bits and four unused bits). For MELPe 600 bps speech, the total number of bits used is 54, which fits in 7 octets (with two rate code bits). The comfort noise frame consists of 13 bits, which fits in 2 octets (with three rate code bits). TSVCIS packed parameters will use the last code combination in a trailing byte as discussed in Section 3.2.
It should be noted that CODB for MELPe 600 bps mode MAY deviate from the value in Table 1 when bit 55 is used as an alternating 1/0 end-to-end framing bit. Frame decoding would remain distinct because CODA being zero, on its own, indicates a 7-byte frame for either a 2400 or 600 bps rate, and the use of 600 bps speech coding could be deduced from the RTP timestamp (and anticipated by the Session Description Protocol (SDP) negotiations).
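The mapping in Table 1 can be illustrated with a short Python sketch. It assumes the rate code bits occupy the most significant bits of a frame's last octet (CODA in the MSB, then CODB, then CODC), as drawn in the payload figures; the function name and the string labels are hypothetical. When CODB carries an end-to-end framing bit, the 2400/600 label returned here is only a guess and the rate has to come from the timestamp or SDP instead.

```python
from typing import Optional, Tuple


def classify_last_octet(octet: int) -> Tuple[str, Optional[int]]:
    """Map the last octet of a frame to (frame kind, frame length in octets).

    The length is None for TSVCIS data, whose size is carried in the
    count field rather than implied by the rate code.
    """
    coda = (octet >> 7) & 1
    codb = (octet >> 6) & 1
    codc = (octet >> 5) & 1

    if coda == 0:
        # CODA = 0 alone implies a 7-octet frame (2400 or 600 bps).
        return ("melpe-600" if codb else "melpe-2400", 7)
    if codb == 1:
        # CODA = 1, CODB = 1: TSVCIS data, variable length.
        return ("tsvcis-data", None)
    if codc == 0:
        # CODA = 1, CODB = 0, CODC = 0: 1200 bps frame.
        return ("melpe-1200", 11)
    # CODA = 1, CODB = 0, CODC = 1: comfort noise.
    return ("comfort-noise", 2)
```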
The 2400 bps MELPe RTP payload is constructed as per Figure 2. Note that CODA MUST be filled with 0 and CODB SHOULD be filled with 0 as per Section 3.1. CODB MAY contain an end-to-end framing bit if required by the endpoints.
 MSB                                                 LSB
   0      1      2      3      4      5      6      7
+------+------+------+------+------+------+------+------+
| B_08 | B_07 | B_06 | B_05 | B_04 | B_03 | B_02 | B_01 |
+------+------+------+------+------+------+------+------+
| B_16 | B_15 | B_14 | B_13 | B_12 | B_11 | B_10 | B_09 |
+------+------+------+------+------+------+------+------+
| B_24 | B_23 | B_22 | B_21 | B_20 | B_19 | B_18 | B_17 |
+------+------+------+------+------+------+------+------+
| B_32 | B_31 | B_30 | B_29 | B_28 | B_27 | B_26 | B_25 |
+------+------+------+------+------+------+------+------+
| B_40 | B_39 | B_38 | B_37 | B_36 | B_35 | B_34 | B_33 |
+------+------+------+------+------+------+------+------+
| B_48 | B_47 | B_46 | B_45 | B_44 | B_43 | B_42 | B_41 |
+------+------+------+------+------+------+------+------+
| CODA | CODB | B_54 | B_53 | B_52 | B_51 | B_50 | B_49 |
+------+------+------+------+------+------+------+------+
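The bit placement in Figure 2 can be sketched in Python as follows. This is a minimal illustration, assuming the 54 coder bits are supplied as a list in order B_01 to B_54; the function name `pack_melpe_2400` is hypothetical.

```python
from typing import Sequence


def pack_melpe_2400(bits: Sequence[int], codb: int = 0) -> bytes:
    """Pack bits B_01..B_54 into the 7-octet layout of Figure 2.

    bits[0] is B_01 and lands in the LSB of the first octet. CODA is
    always 0; CODB defaults to 0 but may carry a framing bit.
    """
    if len(bits) != 54:
        raise ValueError("a 2400 bps MELPe frame has exactly 54 bits")

    out = bytearray(7)
    for i, bit in enumerate(bits):
        # Each octet is filled from its least significant bit upward.
        out[i // 8] |= (bit & 1) << (i % 8)

    out[6] |= (codb & 1) << 6  # CODB sits next to the MSB of the last octet
    # CODA (the MSB of the last octet) is left at 0.
    return bytes(out)
```

The same layout applies to the 600 bps frame of Figure 4, with CODB normally set to 1.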
The 1200 bps MELPe RTP payload is constructed as per Figure 3. Note that CODA, CODB, and CODC MUST be filled with 1, 0, and 0, respectively, as per Section 3.1. RSV0 MUST be coded as 0.
 MSB                                                 LSB
   0      1      2      3      4      5      6      7
+------+------+------+------+------+------+------+------+
| B_08 | B_07 | B_06 | B_05 | B_04 | B_03 | B_02 | B_01 |
+------+------+------+------+------+------+------+------+
| B_16 | B_15 | B_14 | B_13 | B_12 | B_11 | B_10 | B_09 |
+------+------+------+------+------+------+------+------+
| B_24 | B_23 | B_22 | B_21 | B_20 | B_19 | B_18 | B_17 |
+------+------+------+------+------+------+------+------+
| B_32 | B_31 | B_30 | B_29 | B_28 | B_27 | B_26 | B_25 |
+------+------+------+------+------+------+------+------+
| B_40 | B_39 | B_38 | B_37 | B_36 | B_35 | B_34 | B_33 |
+------+------+------+------+------+------+------+------+
| B_48 | B_47 | B_46 | B_45 | B_44 | B_43 | B_42 | B_41 |
+------+------+------+------+------+------+------+------+
| B_56 | B_55 | B_54 | B_53 | B_52 | B_51 | B_50 | B_49 |
+------+------+------+------+------+------+------+------+
| B_64 | B_63 | B_62 | B_61 | B_60 | B_59 | B_58 | B_57 |
+------+------+------+------+------+------+------+------+
| B_72 | B_71 | B_70 | B_69 | B_68 | B_67 | B_66 | B_65 |
+------+------+------+------+------+------+------+------+
| B_80 | B_79 | B_78 | B_77 | B_76 | B_75 | B_74 | B_73 |
+------+------+------+------+------+------+------+------+
| CODA | CODB | CODC | RSV0 | RSV0 | RSV0 | RSV0 | B_81 |
+------+------+------+------+------+------+------+------+
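For the receiving side, the Figure 3 layout can be unpacked as in the sketch below. The function name is hypothetical, and rejecting a frame whose RSV0 bits are nonzero is a strictness choice made for this example; the text only requires senders to code RSV0 as 0.

```python
from typing import List


def unpack_melpe_1200(frame: bytes) -> List[int]:
    """Return bits B_01..B_81 from an 11-octet 1200 bps frame (Figure 3)."""
    if len(frame) != 11:
        raise ValueError("a 1200 bps MELPe frame occupies 11 octets")

    last = frame[10]
    if last >> 5 != 0b100:
        # CODA, CODB, CODC must read 1, 0, 0.
        raise ValueError("rate code bits do not indicate 1200 bps")
    if last & 0b00011110:
        # The four RSV0 bits sit between CODC and B_81.
        raise ValueError("RSV0 bits are not zero")

    # Bit B_(i+1) is stored in octet i // 8 at bit position i % 8 (LSB = 0).
    return [(frame[i // 8] >> (i % 8)) & 1 for i in range(81)]
```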
The 600 bps MELPe RTP payload is constructed as per Figure 4. Note that CODA MUST be filled with 0 and CODB SHOULD be filled with 1 as per Section 3.1. CODB MAY contain an end-to-end framing bit if required by the endpoints.
 MSB                                                 LSB
   0      1      2      3      4      5      6      7
+------+------+------+------+------+------+------+------+
| B_08 | B_07 | B_06 | B_05 | B_04 | B_03 | B_02 | B_01 |
+------+------+------+------+------+------+------+------+
| B_16 | B_15 | B_14 | B_13 | B_12 | B_11 | B_10 | B_09 |
+------+------+------+------+------+------+------+------+
| B_24 | B_23 | B_22 | B_21 | B_20 | B_19 | B_18 | B_17 |
+------+------+------+------+------+------+------+------+
| B_32 | B_31 | B_30 | B_29 | B_28 | B_27 | B_26 | B_25 |
+------+------+------+------+------+------+------+------+
| B_40 | B_39 | B_38 | B_37 | B_36 | B_35 | B_34 | B_33 |
+------+------+------+------+------+------+------+------+
| B_48 | B_47 | B_46 | B_45 | B_44 | B_43 | B_42 | B_41 |
+------+------+------+------+------+------+------+------+
| CODA | CODB | B_54 | B_53 | B_52 | B_51 | B_50 | B_49 |
+------+------+------+------+------+------+------+------+
The comfort noise MELPe RTP payload is constructed as per Figure 5. Note that CODA, CODB, and CODC MUST be filled with 1, 0, and 1, respectively, as per Section 3.1.
 MSB                                                 LSB
   0      1      2      3      4      5      6      7
+------+------+------+------+------+------+------+------+
| B_08 | B_07 | B_06 | B_05 | B_04 | B_03 | B_02 | B_01 |
+------+------+------+------+------+------+------+------+
| CODA | CODB | CODC | B_13 | B_12 | B_11 | B_10 | B_09 |
+------+------+------+------+------+------+------+------+
The TSVCIS augmented speech data as packed parameters MUST be placed immediately after a corresponding MELPe 2400 bps payload in the same RTP packet. The packed parameters are counted in octets (TC). The preferred placement SHOULD be used for TSVCIS payloads with TC less than or equal to 77 octets; this is shown in Figure 6. In the preferred placement, a single trailing octet SHALL be appended to include a two-bit rate code, CODA and CODB (both bits set to one), and a six-bit modified count (MTC). The special modified count value of all ones (representing an MTC value of 63) SHALL NOT be used for this format as it is used as the indicator for the alternate packing format shown next. In a standard implementation, the TSVCIS speech coder uses a minimum of 15 octets for parameters in octet packed form. The modified count (MTC) MUST be reduced by 15 from the full octet count (TC); that is, MTC = TC - 15. This accommodates a maximum of 77 parameter octets (the maximum value of MTC is 62; 77 is the sum of 62 + 15).
       MSB                                                 LSB
         0      1      2      3      4      5      6      7
      +------+------+------+------+------+------+------+------+
    1 | T008 | T007 | T006 | T005 | T004 | T003 | T002 | T001 |
      +------+------+------+------+------+------+------+------+
    2 | T016 | T015 | T014 | T013 | T012 | T011 | T010 | T009 |
      +------+------+------+------+------+------+------+------+
    3 | T024 | T023 | T022 | T021 | T020 | T019 | T018 | T017 |
      +------+------+------+------+------+------+------+------+
    4 | T032 | T031 | T030 | T029 | T028 | T027 | T026 | T025 |
      +------+------+------+------+------+------+------+------+
    5 | T040 | T039 | T038 | T037 | T036 | T035 | T034 | T033 |
      +------+------+------+------+------+------+------+------+
    6 | T048 | T047 | T046 | T045 | T044 | T043 | T042 | T041 |
      +------+------+------+------+------+------+------+------+
    7 | T056 | T055 | T054 | T053 | T052 | T051 | T050 | T049 |
      +------+------+------+------+------+------+------+------+
    8 | T064 | T063 | T062 | T061 | T060 | T059 | T058 | T057 |
      +------+------+------+------+------+------+------+------+
    9 | T072 | T071 | T070 | T069 | T068 | T067 | T066 | T065 |
      +------+------+------+------+------+------+------+------+
   10 | T080 | T079 | T078 | T077 | T076 | T075 | T074 | T073 |
      +------+------+------+------+------+------+------+------+
   11 | T088 | T087 | T086 | T085 | T084 | T083 | T082 | T081 |
      +------+------+------+------+------+------+------+------+
   12 | T096 | T095 | T094 | T093 | T092 | T091 | T090 | T089 |
      +------+------+------+------+------+------+------+------+
   13 | T104 | T103 | T102 | T101 | T100 | T099 | T098 | T097 |
      +------+------+------+------+------+------+------+------+
   14 | T112 | T111 | T110 | T109 | T108 | T107 | T106 | T105 |
      +------+------+------+------+------+------+------+------+
   15 | T120 | T119 | T118 | T117 | T116 | T115 | T114 | T113 |
      +------+------+------+------+------+------+------+------+
      |                        . . . .                        |
      +------+------+------+------+------+------+------+------+
 TC+1 | CODA | CODB |          modified octet count           |
      +------+------+------+------+------+------+------+------+
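The trailing octet of the preferred placement can be computed as in this minimal Python sketch. The function name `preferred_trailer` is hypothetical, and the count is assumed to occupy the six least significant bits, as drawn in Figure 6.

```python
MIN_TC = 15            # minimum parameter octets in a standard implementation
MAX_PREFERRED_TC = 77  # 62 (largest usable MTC) + 15


def preferred_trailer(tc: int) -> int:
    """Return the single trailing octet for a TSVCIS payload of `tc` octets.

    Raises ValueError when `tc` cannot be expressed in the preferred
    format, in which case the alternate two-octet trailer is needed.
    """
    if not MIN_TC <= tc <= MAX_PREFERRED_TC:
        raise ValueError("TC outside the range of the preferred placement")

    mtc = tc - MIN_TC          # MTC = TC - 15, so 0 <= MTC <= 62
    return 0b11000000 | mtc    # CODA = 1, CODB = 1, then the 6-bit MTC
```

For example, TC = 15 yields 0xC0 and TC = 77 yields 0xFE; the value 0xFF (MTC = 63) is never produced because it marks the alternate format.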
In order to accommodate all other NRL VDR configurations, an alternate parameter placement MUST use two trailing bytes as shown in Figure 7. The last trailing byte MUST be filled with a two-bit rate code, CODA and CODB (both bits set to one), and its six-bit count field MUST be filled with ones. The second to last trailing byte MUST contain the parameter count (TC) in octets (a value from 1 to 255, inclusive). The value of zero SHALL be considered as reserved.
       MSB                                                 LSB
         0      1      2      3      4      5      6      7
      +------+------+------+------+------+------+------+------+
    1 | T008 | T007 | T006 | T005 | T004 | T003 | T002 | T001 |
      +------+------+------+------+------+------+------+------+
    2 | T016 | T015 | T014 | T013 | T012 | T011 | T010 | T009 |
      +------+------+------+------+------+------+------+------+
      |                        . . . .                        |
      +------+------+------+------+------+------+------+------+
 TC+1 |                      octet count                      |
      +------+------+------+------+------+------+------+------+
 TC+2 | CODA | CODB |  1   |  1   |  1   |  1   |  1   |  1   |
      +------+------+------+------+------+------+------+------+
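A receiver has to tell the two trailer formats apart before it can locate the parameter octets. The following Python sketch shows one way to do so; the function name `read_tsvcis_trailer` is hypothetical, and treating the reserved count of zero as an error is an assumption of this example.

```python
from typing import Tuple


def read_tsvcis_trailer(data: bytes) -> Tuple[int, int]:
    """Decode the trailer at the end of `data`.

    Returns (TC, trailer length in octets), where TC is the number of
    packed parameter octets that precede the trailer.
    """
    if not data or data[-1] >> 6 != 0b11:
        raise ValueError("last octet does not carry the TSVCIS rate code")

    mtc = data[-1] & 0x3F
    if mtc != 63:
        # Preferred placement: one trailing octet, TC = MTC + 15.
        return mtc + 15, 1

    # Alternate placement: count field is all ones, TC is in the octet before.
    if len(data) < 2:
        raise ValueError("alternate trailer is missing its octet count")
    tc = data[-2]
    if tc == 0:
        raise ValueError("an octet count of zero is reserved")
    return tc, 2
```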
A TSVCIS RTP packet payload consists of zero or more consecutive TSVCIS coder frames (each consisting of MELPe 2400 and TSVCIS coder data), with the oldest frame first, followed by zero or one MELPe comfort noise frame. The presence of a comfort noise frame can be determined by its rate code bits in its last octet.
The default packetization interval is one coder frame (22.5, 67.5, or 90 ms) according to the coder bitrate (2400, 1200, or 600 bps). For some applications, a longer packetization interval is used to reduce the packet rate.
A TSVCIS RTP packet without coder and comfort noise frames MAY be used periodically by an endpoint to indicate connectivity by an otherwise idle receiver.
TSVCIS coder frames in a single RTP packet MAY have varying TSVCIS parameter octet counts. The packed parameter octet count (length) of each frame is indicated in its trailing byte(s). All MELPe frames in a single RTP packet MUST be of the same coder bitrate. For all MELPe coder frames, the coder rate bits in the trailing byte identify the contents and length as per Table 1.
It is important to observe that senders have the following additional restrictions:

-  Senders SHOULD NOT include more TSVCIS or MELPe frames in a single RTP packet than will fit in the MTU of the RTP transport protocol.

-  Frames MUST NOT be split between RTP packets.
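The MTU restriction can be turned into a simple bound on frames per packet. The sketch below is an illustration under stated assumptions: the default overhead of 40 octets corresponds to IPv4 (20) + UDP (8) + a fixed RTP header (12) with no CSRC entries, header extensions, or security transform; the function name is hypothetical; and all frames are taken to have the same fixed size.

```python
def max_frames_per_packet(mtu: int, frame_octets: int, overhead_octets: int = 40) -> int:
    """Largest number of whole fixed-size frames that fit under the MTU.

    `overhead_octets` is an assumed IPv4 + UDP + RTP header size; adjust
    it for IPv6, CSRC lists, header extensions, or SRTP.
    """
    if frame_octets <= 0:
        raise ValueError("frame size must be positive")
    # Frames are never split, so only whole frames count.
    return max((mtu - overhead_octets) // frame_octets, 0)
```

With a 1500-octet MTU and these assumptions, up to 208 seven-octet MELPe 2400 frames fit in one packet, although delay considerations normally call for far fewer.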
It is RECOMMENDED that the number of frames contained within an RTP packet be consistent with the application. For example, in telephony and other real-time applications where delay is important, the fewer frames per packet, the lower the delay. However, for bandwidth-constrained links or delay-insensitive streaming messaging applications, more than one frame per packet or many frames per packet would be acceptable.
Information describing the number of frames contained in an RTP packet is not transmitted as part of the RTP payload. The way to determine the number of TSVCIS/MELPe frames is to identify each frame type and length, thereby counting the total number of octets within the RTP packet.
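Because every frame type carries its rate code (and, for TSVCIS data, its octet count) at the end of the frame, one possible way to count frames is to walk the payload from its last octet toward its first. The Python sketch below illustrates that approach; it is not a procedure given in this document. The function name and frame labels are hypothetical, the MELPe 2400 portion of a TSVCIS frame is not validated, and 2400 and 600 bps frames are reported under one label because CODB may carry a framing bit.

```python
from typing import List, Tuple


def split_frames(payload: bytes) -> List[Tuple[str, bytes]]:
    """Split an RTP payload into (kind, frame octets), oldest frame first."""
    frames = []
    end = len(payload)
    while end > 0:
        last = payload[end - 1]
        coda, codb, codc = (last >> 7) & 1, (last >> 6) & 1, (last >> 5) & 1

        if coda == 0:
            kind, size = "melpe-2400-or-600", 7
        elif codb == 0:
            kind, size = ("comfort-noise", 2) if codc else ("melpe-1200", 11)
        else:
            mtc = last & 0x3F
            if mtc == 63:
                # Alternate placement: TC is in the second to last octet.
                if end < 2 or payload[end - 2] == 0:
                    raise ValueError("bad alternate TSVCIS trailer")
                tc, trailer = payload[end - 2], 2
            else:
                # Preferred placement: TC = MTC + 15.
                tc, trailer = mtc + 15, 1
            # A TSVCIS frame is a 7-octet MELPe 2400 frame, TC parameter
            # octets, and the trailer.
            kind, size = "tsvcis", 7 + tc + trailer

        if size > end:
            raise ValueError("payload ends in the middle of a frame")
        frames.append((kind, payload[end - size:end]))
        end -= size

    frames.reverse()
    return frames
```

The number of frames is then `len(split_frames(payload))`; an empty payload yields an empty list.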
The target bitrate of TSVCIS can be adjusted at any point in time, thus allowing congestion management. Furthermore, the amount of encoded speech or audio data carried in a single packet can be used for congestion control, since the packet rate is inversely proportional to the packet duration. A lower packet transmission rate reduces the amount of header overhead but at the same time increases latency and loss sensitivity, so it ought to be used with care.
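The trade-off between packet rate and header overhead can be made concrete with a little arithmetic. The Python sketch below assumes 40 octets of IPv4/UDP/RTP header per packet and plain 7-octet MELPe 2400 frames of 22.5 ms; both defaults and the function names are assumptions of this example.

```python
def packets_per_second(frames_per_packet: int, frame_ms: float = 22.5) -> float:
    """Packet rate; inversely proportional to the packet duration."""
    return 1000.0 / (frames_per_packet * frame_ms)


def wire_bitrate_bps(frames_per_packet: int,
                     frame_octets: int = 7,
                     frame_ms: float = 22.5,
                     overhead_octets: int = 40) -> float:
    """Approximate on-the-wire bitrate including per-packet headers."""
    packet_bits = 8 * (overhead_octets + frames_per_packet * frame_octets)
    packet_seconds = frames_per_packet * frame_ms / 1000.0
    return packet_bits / packet_seconds
```

Under these assumptions, one frame per packet costs roughly 16.7 kbps on the wire, while four frames per packet costs roughly 6.0 kbps at the price of 90 ms of speech per packet.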
Since UDP does not provide congestion control, applications that use RTP over UDP SHOULD implement their own congestion control above the UDP layer [RFC 8085] and MAY also implement a transport circuit breaker [RFC 8083]. Work in the RMCAT Working Group [RMCAT] describes the interactions and conceptual interfaces necessary between the application components that relate to congestion control, including the RTP layer, the higher-level media codec control layer, and the lower-level transport interface, as well as components dedicated to congestion control functions.