Content for  TS 26.223  Word version:  18.0.0

1  Scope

The present document specifies a client for the IMS-based telepresence service supporting conversational speech, video and text transported over RTP. Telepresence is defined as a conference with interactive audio-visual communications experience between remote locations, where the users enjoy a strong sense of realism and presence between all participants (i.e. as if they are in the same location) by optimizing a variety of attributes such as audio and video quality, eye contact, body language, spatial audio, coordinated environments and natural image size. A telepresence system is defined as a set of functions, devices and network elements which are able to capture, deliver, manage and render multiple high quality interactive audio and video signals in a telepresence conference. An appropriate number of devices (e.g. cameras, screens, loudspeakers, microphones, codecs) and environmental characteristics are used to establish telepresence.
The media handling capabilities of a telepresence client (TP UE) are specified in the present document. A TP UE supports the media handling capabilities of a Multimedia Telephony Service for IMS (MTSI) UE as specified in TS 26.114, and it also supports more advanced media handling capabilities. The media handling aspects of a TP UE within the scope of the present document include media codecs, media configuration and session control, data transport, audio/video parameters, and interworking with MTSI.

2  References

The following documents contain provisions which, through reference in this text, constitute provisions of the present document.
  • References are either specific (identified by date of publication, edition number, version number, etc.) or non-specific.
  • For a specific reference, subsequent revisions do not apply.
  • For a non-specific reference, the latest version applies. In the case of a reference to a 3GPP document (including a GSM document), a non-specific reference implicitly refers to the latest version of that document in the same Release as the present document.
[1]
TR 21.905: "Vocabulary for 3GPP Specifications".
[2]
TS 26.114: "IP Multimedia Subsystem (IMS); Multimedia telephony; Media handling and interaction".
[3]
TS 22.228: "Service requirements for the Internet Protocol (IP) multimedia core network subsystem (IMS); Stage 1".
[4]
TS 24.229: "IP multimedia call control protocol based on Session Initiation Protocol (SIP) and Session Description Protocol (SDP); Stage 3".
[5]
TS 24.147: "Conferencing using the IP Multimedia (IM) Core Network (CN) subsystem; Stage 3".
[6]
TS 24.103: "Telepresence using the IP Multimedia (IM) Core Network (CN) Subsystem (IMS); Stage 3".
[7]
RFC 8845:  (January 2021) "Framework for Telepresence Multi-Streams".
[8]
RFC 8850:  (January 2021) "Controlling Multiple Streams for Telepresence (CLUE) Protocol Data Channel".
[9]
RFC 8848:  (January 2021) "Session Signaling for Controlling Multiple Streams for Telepresence (CLUE)".
[10]
RFC 8846:  (January 2021) "An XML Schema for the Controlling Multiple Streams for Telepresence (CLUE) Data Model".
[11]
RFC 8847:  (January 2021) "Protocol for Controlling Multiple Streams for Telepresence (CLUE)".
[12]  Void
[13]
RFC 8841:  (January 2021) "Session Description Protocol (SDP) Offer/Answer Procedures for Stream Control Transmission Protocol (SCTP) over Datagram Transport Layer Security (DTLS) Transport".
[14]
RFC 8864:  (January 2021) "Negotiation Data Channels Using the Session Description Protocol (SDP)".
[15]  Void
[16]
ITU-T Recommendation H.264 (04/2013): "Advanced video coding for generic audiovisual services".
[17]
ITU-T Recommendation H.265 (04/2013): "High efficiency video coding".
[18]
RFC 6184  (2011): "RTP Payload Format for H.264 Video", Y.-K. Wang, R. Even, T. Kristensen, R. Jesup.
[19]
RFC 7798  (2016): "RTP Payload Format for High Efficiency Video Coding (HEVC)", Y.-K. Wang, Y. Sanchez, T. Schierl, S. Wenger, M. M. Hannuksela.
[20]
RFC 3264  (2002): "An Offer/Answer Model with the Session Description Protocol (SDP)", J. Rosenberg and H. Schulzrinne.
[21]
RFC 8853:  (January 2021) "Using Simulcast in Session Description Protocol (SDP) and RTP Sessions".
[22]
Recommendation ITU-T H.241 (02/2012): "Extended video procedures and control signals for ITU-T H.300-series terminals".
[23]
RFC 6236  (2011): "Negotiation of Generic Image Attributes in the Session Description Protocol (SDP)", I. Johansson and K. Jung.
[24]
Recommendation ITU-T H.245 (05/2011): "Control protocol for multimedia communication".
[25]
RFC 4566  (2006): "SDP: Session Description Protocol", M. Handley, V. Jacobson and C. Perkins.
[26]
RFC 6464:  "A Real-time Transport Protocol (RTP) Header Extension for Client-to-Mixer Audio Level Indication".
[27]
TS 26.441: "Codec for Enhanced Voice Services (EVS); General Overview".
[28]
TS 26.442: "Codec for Enhanced Voice Services (EVS); ANSI C code (fixed-point)".
[29]
TS 26.445: "Codec for Enhanced Voice Services (EVS); Detailed Algorithmic Description".
[30]
TS 26.450: "Codec for Enhanced Voice Services (EVS); Discontinuous Transmission (DTX)".
[31]
TS 26.171: "Speech codec speech processing functions; Adaptive Multi-Rate - Wideband (AMR-WB) speech codec; General description".
[32]
TS 26.190: "Speech codec speech processing functions; Adaptive Multi-Rate - Wideband (AMR-WB) speech codec; Transcoding functions".
[33]
TS 26.173: "ANSI-C code for the Adaptive Multi Rate - Wideband (AMR-WB) speech codec".
[34]
TS 26.204: "Speech codec speech processing functions; Adaptive Multi-Rate - Wideband (AMR-WB) speech codec; ANSI-C code".
[35]
TS 26.071: "Mandatory Speech Codec speech processing functions; AMR Speech CODEC; General description".
[36]
TS 26.090: "Mandatory Speech Codec speech processing functions; Adaptive Multi-Rate (AMR) speech codec; Transcoding functions".
[37]
TS 26.073: "ANSI-C code for the Adaptive Multi Rate (AMR) speech codec".
[38]
TS 26.104: "ANSI-C code for the floating-point Adaptive Multi Rate (AMR) speech codec".
[39]
Recommendation ITU-T F.734 (10/2014): "Definitions, requirements, and use cases for Telepresence Systems".
[40]
Recommendation ITU-T G.1091 (10/2014): "Quality of Experience requirements for telepresence services".
[41]
Recommendation ITU-T H.TPS-AV (02/2015): "Audio/video parameters for telepresence systems".
[42]
Recommendation ITU-T H.420 (10/2014): "Telepresence System Architecture".
[43]
Recommendation ITU-T H.323 (12/2009): "Packet-based multimedia communication systems".
[44]
TS 26.093: "Mandatory speech codec speech processing functions; Adaptive Multi-Rate (AMR) speech codec; Source controlled rate operation".
[45]
TS 26.193: "Speech codec speech processing functions; Adaptive Multi-Rate - Wideband (AMR-WB) speech codec; Source controlled rate operation".
[46]
TS 26.446: "Codec for Enhanced Voice Services (EVS); AMR-WB Backward Compatible Functions".
[47]
TS 26.443: "Codec for Enhanced Voice Services (EVS); ANSI C code (floating-point)".
[48]
RFC 4574:  "The Session Description Protocol (SDP) Label Attribute".
[49]
TS 26.452: "Codec for Enhanced Voice Services (EVS); ANSI C code; Alternative fixed-point using updated basic operators".

3  Definitions and abbreviations

3.1  Definitions

For the purposes of the present document, the terms and definitions given in TR 21.905 and the following apply. A term defined in the present document takes precedence over the definition of the same term, if any, in TR 21.905.
Conference:
An IP multimedia session with two or more participants. Each conference has a "conference focus". A conference can be uniquely identified by a user. Examples of a conference are a Telepresence session or a multimedia game in which the conference focus is located in a game server.
IM session:
An IP multimedia (IM) session is a set of multimedia senders and receivers and the data streams flowing from senders to receivers. IP multimedia sessions are supported by the IP multimedia CN Subsystem and are enabled by IP connectivity bearers (e.g. GPRS as a bearer). A user may invoke concurrent IP multimedia sessions.
Telepresence:
A conference with interactive audio-visual communications experience between remote locations, where the users enjoy a strong sense of realism and presence between all participants by optimizing a variety of attributes such as audio and video quality, eye contact, body language, spatial audio, coordinated environments and natural image size.
Telepresence System:
A set of functions, devices and network elements which are able to capture, deliver, manage and render multiple high quality interactive audio and video signals in a Telepresence conference. An appropriate number of devices (e.g. cameras, screens, loudspeakers, microphones, codecs) and environmental characteristics are used to establish Telepresence.

3.2  Abbreviations

For the purposes of the present document, the abbreviations given in TR 21.905 and the following apply.
An abbreviation defined in the present document takes precedence over the definition of the same abbreviation, if any, in TR 21.905.
AS
Application Server
AVC
Advanced Video Coding
BFCP
Binary Floor Control Protocol
CBP
Constrained Baseline Profile
CHP
Constrained High Profile
CLUE
ControLling mUltiple streams for tElepresence
DTLS
Datagram Transport Layer Security
FIR
Full Intra Request
HEVC
High Efficiency Video Coding
ICE
Interactive Connectivity Establishment
IDR
Instantaneous Decoding Refresh
IRAP
Intra Random Access Point
MCC
Multiple Content Capture
MIME
Multipurpose Internet Mail Extensions
MRFC
Multimedia Resource Function Controller
MRFP
Multimedia Resource Function Processor
MTSI
Multimedia Telephony Service for IMS
PPS
Picture Parameter Set
RTP
Real-time Transport Protocol
SCTP
Stream Control Transmission Protocol
SDP
Session Description Protocol
SEI
Supplemental Enhancement Information
SPS
Sequence Parameter Set
SSRC
Synchronization Source Identifier
TP
Telepresence
TP UE
TelePresence User Equipment
VPS
Video Parameter Set
XML
EXtensible Markup Language

4  System Description

4.1  Overview

The use cases and requirements on IMS-based telepresence are defined in TS 22.228 to enable telepresence support in IMS applications.
Enabling telepresence support involves updating and enhancing the existing IMS procedures for point-to-point calls as specified in TS 24.229 and for multiparty conferences as specified in TS 24.147. This has been addressed in TS 24.103, which incorporates IETF's ControLling mUltiple streams for tElepresence (CLUE) framework [7]-[10] with the Session Initiation Protocol (SIP), Session Description Protocol (SDP) and Binary Floor Control Protocol (BFCP) to facilitate controlling multiple spatially related media streams in an IM session supporting telepresence.
In order to provide a "being there" experience for conversational audio and video telepresence sessions between remote locations, where the users enjoy a strong sense of realism and presence, capabilities and preferences need to be co-ordinated and negotiated between local and remote participants such as:
  • audio and video spatial composition information; e.g. spatial relationship of two or more objects (audio/video sources) in the same room to allow for accurate reproduction on the receiver side
  • capabilities of cameras, screens, microphones and loudspeakers, and their relative spatial relationships
  • meeting description, such as view information, language information, participant information, participant type, etc.
CLUE provides media advertisement and configuration to facilitate controlling and negotiating multiple spatially related media streams in an IMS conference supporting telepresence. It takes into account capability information (e.g. screen size, number of screens and cameras, codecs) so that a sending system, receiving system, or intermediate system can make decisions about transmitting, selecting, and rendering media streams. With the establishment of the CLUE data channel, the participants have consented to use the CLUE protocol mechanisms to determine the capabilities of each of the endpoints with respect to multiple streams support, via the exchange of an XML-based data format. The exchange of each participant's CLUE "advertisement" and "configure" messages serves to achieve a common view of the media components sent and received in the IM session supporting telepresence.
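As a purely illustrative sketch of the idea that a receiving system selects among advertised media based on its own capabilities, the Python fragment below models a toy consumer. The names `Capture` and `choose_captures`, the capture identifiers, and the selection policy (all audio, at most one video capture per local screen) are assumptions made for this example; they are not defined by TS 24.103 or the CLUE RFCs, and real CLUE messages are XML documents that this sketch does not model.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Capture:
    """Hypothetical stand-in for one advertised media capture."""
    capture_id: str  # illustrative identifier, e.g. "VC0"
    media: str       # "audio" or "video"


def choose_captures(advertised, num_screens):
    """Toy policy: keep every audio capture and at most one video
    capture per local screen. Returns the chosen capture identifiers,
    which a consumer could then list in its "configure" message."""
    chosen = [c.capture_id for c in advertised if c.media == "audio"]
    videos = [c.capture_id for c in advertised if c.media == "video"]
    chosen.extend(videos[:num_screens])
    return chosen


# A three-camera room advertised to a consumer that has two screens.
advertised = [
    Capture("VC0", "video"),
    Capture("VC1", "video"),
    Capture("VC2", "video"),
    Capture("AC0", "audio"),
]
selection = choose_captures(advertised, num_screens=2)
```

An actual endpoint would typically also weigh spatial information, encoding limits and scene descriptions when making this choice.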
TS 24.103 specifies procedures to deal with multiple spatially related media streams according to the CLUE framework, in order to support telepresence and to interwork with IM sessions, as follows:
  1. Initiation of telepresence using IMS: an initial offer/answer exchange establishes a basic media session and a CLUE channel, CLUE "advertisement" and "configure" exchanges then determine the media components used in the session, and an SDP offer/answer in a re-INVITE request completes the session establishment (see RFC 8845 for the general idea);
  2. Release or leaving of an IM session supporting telepresence, which requires removing the corresponding CLUE channel;
  3. Update of an ongoing IM session supporting telepresence, triggered by CLUE exchanges modifying existing CLUE information; for example, a new participant at an endpoint may require the establishment of a specific media stream;
  4. Presentation during an IM session supporting telepresence, which may also be initiated by the exchange of CLUE messages and may require an updated SDP offer/answer and activation of BFCP for floor control; and
  5. Interworking with a normal IM session, so that normal IMS users are able to join telepresence using IMS.
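The ordering of the initiation procedure in item 1 can be sketched as a small state machine. This is a simplified illustration only: the phase names, event names and the `advance` helper are hypothetical, and the sketch ignores that CLUE exchanges run in both directions and can be repeated during a session.

```python
from enum import Enum, auto


class Phase(Enum):
    """Hypothetical phase names for the initiation flow in item 1."""
    IDLE = auto()
    BASIC_SESSION = auto()  # initial SDP offer/answer done, CLUE channel set up
    ADVERTISED = auto()     # CLUE "advertisement" exchanged
    CONFIGURED = auto()     # CLUE "configure" exchanged
    ESTABLISHED = auto()    # SDP offer/answer in re-INVITE completed


# event name -> (phase required before the event, phase after the event)
TRANSITIONS = {
    "initial_offer_answer": (Phase.IDLE, Phase.BASIC_SESSION),
    "clue_advertisement": (Phase.BASIC_SESSION, Phase.ADVERTISED),
    "clue_configure": (Phase.ADVERTISED, Phase.CONFIGURED),
    "reinvite_offer_answer": (Phase.CONFIGURED, Phase.ESTABLISHED),
}


def advance(phase, event):
    """Return the next phase, or raise ValueError if the event is out of order."""
    required, next_phase = TRANSITIONS[event]
    if phase is not required:
        raise ValueError(f"{event} not allowed in phase {phase.name}")
    return next_phase


# Walk through the initiation sequence in the order described in item 1.
phase = Phase.IDLE
for event in (
    "initial_offer_answer",
    "clue_advertisement",
    "clue_configure",
    "reinvite_offer_answer",
):
    phase = advance(phase, event)
```

Encoding the required predecessor of each event makes the dependency explicit: the CLUE exchanges need the data channel from the initial offer/answer, and the final offer/answer needs the outcome of the CLUE exchanges.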

4.2  TP UE

A TP UE shall support the functional components and user plane protocol stack of an MTSI client as specified in clause 4.2 of TS 26.114. Moreover, a TP UE shall support the data channel capabilities of a DCMTSI client as described in clause 6.2.10 of TS 26.114. However, the DCMTSI client support for 'bootstrap data channels' described in clause 6.2.10 of TS 26.114 is not expected from TP UEs.
A TP UE shall use the IMS data channel capability of a DCMTSI client for the exchange of CLUE messages based on DTLS/SCTP (Stream Control Transmission Protocol over Datagram Transport Layer Security) RFC 8841 negotiated via the initial SDP offer and answer, in order to open the CLUE data channel based on an SCTP stream in each direction, following RFC 8864.
A TP UE offers a DTLS/SCTP association together with the media format indicating the use of a data channel in the first SDP offer or subsequent SDP offers. A TP UE can further open the data channel via the SDP-based "SCTP over DTLS" data channel negotiation mechanism to indicate a specific non-conversational application (e.g. the CLUE protocol) carried over it.
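To make the SDP side of this more concrete, the following Python sketch assembles the lines of a media description that offers an SCTP-over-DTLS association with one data channel for CLUE. The attribute syntax follows RFC 8841 (`UDP/DTLS/SCTP`, `webrtc-datachannel`, `a=sctp-port`, `a=max-message-size`) and RFC 8864 (`a=dcmap`); the function name, the port numbers, the stream identifier and the message size are illustrative assumptions, and the DTLS-related attributes a complete offer needs (for example `a=setup` and `a=fingerprint`) are omitted for brevity.

```python
def clue_data_channel_media_section(port, sctp_port=5000, stream_id=2,
                                    max_message_size=100000):
    """Build SDP lines offering a CLUE data channel (illustrative values).

    port             -- UDP port placed in the m= line
    sctp_port        -- SCTP port carried in a=sctp-port (RFC 8841)
    stream_id        -- SCTP stream identifier used in a=dcmap (RFC 8864)
    max_message_size -- value for a=max-message-size (RFC 8841)
    """
    return [
        # SCTP over DTLS over UDP, with the data channel media format.
        f"m=application {port} UDP/DTLS/SCTP webrtc-datachannel",
        f"a=sctp-port:{sctp_port}",
        f"a=max-message-size:{max_message_size}",
        # Map one SCTP stream to the CLUE subprotocol, delivered in order.
        f'a=dcmap:{stream_id} subprotocol="CLUE";ordered=true',
    ]


# Example media section with a hypothetical UDP port.
sdp_lines = clue_data_channel_media_section(port=54111)
sdp_text = "\r\n".join(sdp_lines) + "\r\n"
```

The resulting lines would form one media description within a larger SDP offer that also carries the audio and video media descriptions of the basic session.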
The protocol stack of a TP UE with CLUE data channel is shown in Figure 4.1.
Figure 4.1: Protocol stack of a TP UE