In 3GPP, the use of WebRTC technologies has been investigated since Release-12 (around 2014). A network-based architecture for WebRTC access to IMS is specified in
Annex U of
TS 23.228, and its stage 3 is specified in
TS 24.371. These specifications define functional entities including the WIC (WebRTC IMS Client) and the eP-CSCF (P-CSCF enhanced for WebRTC). The eP-CSCF is assumed to be located in the home IMS domain and communicates with other IMS entities using the existing interfaces. For the C-plane signalling between the WIC and the eP-CSCF, these specifications define an option to use SIP over WebSocket, whose information model can also be used with options other than SIP over WebSocket. However, many real-time communication services use JSON-based lightweight signalling protocols, which are flexible, extensible, and can be optimized for new XR conversational applications. These characteristics recall the original design principle of WebRTC: by design, WebRTC does not mandate any particular C-plane signalling and allows a wide range of C-plane signalling protocols. This document revisits this design principle and investigates a new SIP-decoupled C-plane signalling, called native WebRTC.
Regarding the level of signalling detail,
TS 24.371 specifies a signalling transport mechanism using SIP over WebSocket, but it is not a mandatory mechanism for the eP-CSCF. Even though there are other options, such as XMPP or other application protocols over WebSocket, a RESTful interface, etc.,
TS 24.371 does not specify any details of C-plane signalling using other options. Each service provider (e.g., operator) develops its own application by following the guidelines in
TS 24.371. Its subscribers download the application and can connect to the service, and to other subscribers, only within the same service. Detailed C-plane signalling is left open to each operator's design. In contrast, this document identifies a new C-plane signalling in detail (as an interface specification), to the extent that client implementations based on it are sufficiently interoperable. This enables connectivity to any operator or roaming service for new XR real-time communications. Operators can provide a common interface according to well-defined C-plane signalling specifications, and clients can connect to any operator via that interface (see
Figure 4.1-1).
The new C-plane signalling protocol (i.e., RESPECT) studied in this document is intended for the control of various media sessions on the following interfaces:
-
UNI: The interface between the operator network and a UE (e.g., a smartphone or a content server of a content provider).
-
NNI: The interface between two different operator networks, or between an operator network and a service provider network.
A UE and a content provider can set up a media session by using RESPECT for session control on the UNI. A service operator can set up a media session by using RESPECT for session control on the NNI.
Figure 4.2-1 shows the high-level network model, indicating the above interfaces and the media sessions established using RESPECT via the functional entities that support it (described in
clause 6.2).
Using RESPECT has the following benefits:
-
A UE (including the equipment of a content provider) that is compliant with RESPECT can connect to any operator network that supports RESPECT and set up a media session in that operator network, based on the same signalling requirements.
-
A UE (including the equipment of a content provider) that is compliant with RESPECT can connect to services provided by another operator network or a service provider network via the NNI, based on the same signalling requirements.
-
Content providers can set up an operator-assisted media session (e.g., a media session with QoS) with UEs connected to the operator network via the UNI, by connecting to the operator network via the UNI.
-
Service providers can set up an operator-assisted media session (e.g., a media session with QoS) with UEs connected to the operator network via the UNI, by connecting to the operator network via the NNI.
Terminology
User Equipment (UE):
User equipment, including servers acting as user equipment, such as a content server of a content provider. User equipment includes a WebRTC endpoint supporting RESPECT.
Operator:
A mobile or fixed network operator that provides telecommunication services.
Service Provider (SP):
A third-party service provider that connects its service to an operator network via the NNI. An OTT service is a typical example of a service provided by a service provider. Network operators are excluded from this definition in this document.
Content Provider (CP):
A third-party service provider that connects its service to an operator network via the UNI. Network operators are excluded from this definition in this document.
UNI:
User-to-Network Interface. The interface between a UE and a network.
NNI:
Network-to-Network Interface. The interface between two different networks.
C-plane signalling can be realized in roughly four ways, classified in terms of their protocol stacks (see
Figure 4.3-1).
The first method is MTSI-based, using SIP and SDP. General C-plane signalling requirements for conversational services can be covered by SIP, and it interoperates well with the existing 5G core network. This method is to be addressed in IMS-based AR Conversational Services (IBACS).
The second is the method specified in
TS 24.371. It enables WebRTC clients to communicate over an IMS-based core network; for C-plane signalling, only the interfaces for downloading dedicated applications and the signalling path using WebSocket are specified. Typical implementations adopt SIP-like protocols over WebSocket. In most cases, these are partially SIP-compliant or tightly coupled with SIP in order to adapt WebRTC clients to the IMS domain.
The third method is an alternative to the second method, which uses a SIP-like protocol over WebSocket. The third method also uses a signalling protocol over WebSocket, but one that is decoupled from SIP. It can be more lightweight, omitting features that are not used in XR conversational services. Some constraints on SDP are necessary for interoperability. Non-browser-based implementations are also in scope. This method is the main subject of this document.
The fourth is a general WebRTC protocol stack in which the C-plane signalling is not specified and is left open to the users (i.e., service providers). The C-plane protocol may be SIP, XMPP, HTTP, etc. A general WebRTC application uses SDP syntax compliant with
RFC 8866 for its internal representation when setting the local and remote descriptions. The C-plane protocol may have its own on-the-wire format for SDP, which can be constructed from SDP and serialized back out to SDP.
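As an illustration of the last point, the following minimal sketch shows one way an on-the-wire format could be constructed from SDP and serialized back out to SDP. The JSON layout, the function names sdp_to_wire and wire_to_sdp, and the sample offer are assumptions made for illustration only; they are not defined by RESPECT, TS 24.371 or RFC 8866.

```python
import json

# Sample SDP offer (illustrative values only).
OFFER = (
    "v=0\r\n"
    "o=- 4611731400430051336 2 IN IP4 127.0.0.1\r\n"
    "s=-\r\n"
    "t=0 0\r\n"
    "m=audio 9 UDP/TLS/RTP/SAVPF 111\r\n"
    "c=IN IP4 0.0.0.0\r\n"
    "a=rtpmap:111 opus/48000/2\r\n"
)


def sdp_to_wire(sdp: str) -> str:
    """Build a hypothetical JSON wire format from an SDP string.

    Lines before the first "m=" line are kept as session-level lines;
    each "m=" line starts a new media section. Line order is preserved.
    """
    session = []
    media = []
    for line in sdp.splitlines():
        if not line:
            continue
        key, sep, value = line.partition("=")
        if not sep or len(key) != 1:
            raise ValueError(f"malformed SDP line: {line!r}")
        if key == "m":
            media.append([])  # open a new media section
        target = media[-1] if media else session
        target.append([key, value])
    return json.dumps({"session": session, "media": media})


def wire_to_sdp(wire: str) -> str:
    """Serialize the hypothetical JSON wire format back out to SDP."""
    doc = json.loads(wire)
    lines = [f"{key}={value}" for key, value in doc["session"]]
    for section in doc["media"]:
        lines.extend(f"{key}={value}" for key, value in section)
    # Terminate every SDP line with CRLF.
    return "".join(line + "\r\n" for line in lines)


if __name__ == "__main__":
    wire = sdp_to_wire(OFFER)
    print(wire)
    print(wire_to_sdp(wire) == OFFER)
```

Because the sketch keeps every line and its order, the SDP handed to the WebRTC endpoint after the round trip is identical to the original. A real wire format would more likely carry structured fields, subject to the SDP constraints needed for interoperability.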