In the WebRTC and IMS post we briefly described the IMS and WebRTC integration. We explained that WebRTC allows rapid development of clients. The clients still need some infrastructure for signalling and services, and that is the IMS. The network element which acts as an interface between these two worlds, the world of the web and the world of IMS, is called a WebRTC GW. The WebRTC GW is a collection of network functions needed for protocol translation, interworking and authentication procedures. It can be implemented as an enhancement of already present elements (e.g. eP-CSCF) or as a new stand-alone entity.
As we said, the flows and procedures are described in the reference architecture for WebRTC – IMS communication in 3GPP TR 23.701 and 3GPP TR 33.871. Some new information and experience can also be found in the GSMA document WebRTC to complement IP Communication Services.
At a high level the WebRTC GW translates between HTTP/WebSocket (ws) and SIP in both directions. When we go into a bit more detail, there are many issues which have to be addressed:
- Authentication and Security Issues
- Protocol Issues
- Network Issues
- Media Issues
- Legal Issues
Mind that this overview can’t be final or complete in any way, so please take it as a kind of brainstorming. Any comments are welcome!
Authentication and Security
Security in the IMS is based on a long-term secret key. The key is known to both the IP Multimedia Services Identity Module (ISIM) and the home network’s Authentication Centre (AuC) which is a part of the HSS. The ISIM module acts as a storage for the shared secret (K) and accompanying Authentication and Key Agreement (AKA) algorithms, and is usually embedded on a Universal Integrated Circuit Card (UICC). Access to the shared secret is limited. The module takes AKA parameters as input and outputs the resulting AKA parameters and the authentication response. Thus, it never exposes the actual shared secret to the outside world.
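The "black box" behaviour of the ISIM can be illustrated with a small sketch. It is only a toy: HMAC-SHA256 stands in for the real AKA functions, the AUTN check (authentication of the network) is left out, and the names `IsimSketch` and `network_expected_res` are made up for this example.

```python
import hashlib
import hmac


class IsimSketch:
    """Toy stand-in for an ISIM: it holds K and only returns derived values."""

    def __init__(self, shared_secret: bytes) -> None:
        # A real UICC protects K in hardware; here it is just a private attribute.
        self._k = shared_secret

    def authenticate(self, rand: bytes) -> dict:
        # Assumption: HMAC-SHA256 replaces the real AKA functions for brevity.
        def derive(label: bytes) -> bytes:
            return hmac.new(self._k, label + rand, hashlib.sha256).digest()[:16]

        # Only the response and session keys leave the module, never K itself.
        return {"RES": derive(b"res"), "CK": derive(b"ck"), "IK": derive(b"ik")}


def network_expected_res(shared_secret: bytes, rand: bytes) -> bytes:
    """What the AuC side would compute for the same challenge in this toy model."""
    return hmac.new(shared_secret, b"res" + rand, hashlib.sha256).digest()[:16]
```

The network compares its expected value with the RES it receives; the secret itself is never sent in either direction.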
But the WebRTC client doesn’t have any ISIM (naturally, as it doesn’t have any SIM card), and we don’t want to transmit the secret over the Internet. So how can the client authenticate to the IMS network? We usually combine two factors: something the user knows and something the user has. During the first registration we can make sure that the user really owns the IMPU/MSISDN he wants to associate with the web client. The easiest way is to send an SMS with a one-time password, which proves ownership of the SIM card for the given IMPU/MSISDN. (We already know this from many RCS deployments.)
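A minimal sketch of the one-time password check could look as follows. The class name `OtpStore`, the 6-digit code and the 5-minute lifetime are assumptions for illustration; the actual SMS delivery is left out.

```python
import hmac
import secrets
import time
from typing import Optional


class OtpStore:
    """Issues single-use codes per MSISDN and verifies them once."""

    def __init__(self, ttl_seconds: int = 300) -> None:
        self.ttl = ttl_seconds
        self._pending = {}  # msisdn -> (code, expiry timestamp)

    def issue(self, msisdn: str, now: Optional[float] = None) -> str:
        now = time.time() if now is None else now
        code = f"{secrets.randbelow(10**6):06d}"
        self._pending[msisdn] = (code, now + self.ttl)
        return code  # goes to the SMS gateway, never back to the web client

    def verify(self, msisdn: str, code: str, now: Optional[float] = None) -> bool:
        now = time.time() if now is None else now
        entry = self._pending.pop(msisdn, None)  # any attempt consumes the code
        if entry is None:
            return False
        expected, expiry = entry
        return now <= expiry and hmac.compare_digest(expected, code)
```

Consuming the code on the first attempt, whether it succeeds or not, is a deliberate choice here: it limits guessing at the price of the user having to request a new SMS after a typo.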
For subsequent logins the user can either use some dedicated credentials or we can use e.g. OAuth/OpenID. But who does the IMS registration when the client doesn’t have the secret? This is a task for the WebRTC GW.
As with a VoLTE client, the WebRTC client should be able to manage the supplementary services. It can’t connect directly via the Ut interface for the same reason (no ISIM). That means the WebRTC GW has to authenticate towards the Authentication Proxy and then proxy all the XCAP traffic.
If the client needs more information (e.g. configuration, a shared address book, access to the shared conversation history), we can either follow the same authentication pattern or generate a temporary token for the WebRTC client (as with OAuth).
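The temporary token idea can be sketched with a signed, expiring token. This is not OAuth itself, just an HMAC-signed payload to show the principle; `GW_SIGNING_KEY`, `issue_token` and `check_token` are hypothetical names.

```python
import base64
import hashlib
import hmac
import json
import time
from typing import Optional

GW_SIGNING_KEY = b"demo-key-change-me"  # hypothetical; a real key lives in a secret store


def issue_token(impu: str, scope: str, ttl: int, now: Optional[float] = None) -> str:
    now = time.time() if now is None else now
    payload = json.dumps({"impu": impu, "scope": scope, "exp": int(now) + ttl}).encode()
    body = base64.urlsafe_b64encode(payload).decode()
    sig = hmac.new(GW_SIGNING_KEY, body.encode(), hashlib.sha256).hexdigest()
    return f"{body}.{sig}"


def check_token(token: str, scope: str, now: Optional[float] = None) -> Optional[str]:
    """Return the IMPU the token was issued for, or None if it is not acceptable."""
    now = time.time() if now is None else now
    body, _, sig = token.rpartition(".")
    expected = hmac.new(GW_SIGNING_KEY, body.encode(), hashlib.sha256).hexdigest()
    if not hmac.compare_digest(expected, sig):
        return None  # tampered or malformed; the payload is not even decoded
    claims = json.loads(base64.urlsafe_b64decode(body))
    if claims["scope"] != scope or now > claims["exp"]:
        return None
    return claims["impu"]
```

The client only ever holds the short-lived token, while the long-term credentials stay on the gateway side.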
Protocol Issues
In IMS we use SIP, RTP and MSRP. In WebRTC we use WebSocket (ws)/HTTP and RTP.
Let’s start with the signalling. SIP is a stateful peer-to-peer protocol, whereas HTTP is a stateless client-server protocol. From our point of view the WebRTC GW has to be an HTTP server. If we want to inform the WebRTC user that he is invited to a new session, we have to use either the (bi-directional) ws or HTTP long-polling (or some kind of push notification) to forward the information.
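For the long-polling variant, the gateway needs somewhere to park events until the client asks for them. A minimal sketch, assuming one in-memory mailbox per user (the `NotificationHub` name and the event format are made up; a real gateway would also protect the mailbox table against concurrent access and persist events):

```python
import queue
from collections import defaultdict
from typing import Optional


class NotificationHub:
    """Per-user mailboxes that a long-poll HTTP handler could block on."""

    def __init__(self) -> None:
        self._mailboxes = defaultdict(queue.Queue)

    def push(self, user: str, event: dict) -> None:
        # Called when the SIP side has something for this user, e.g. an INVITE.
        self._mailboxes[user].put(event)

    def poll(self, user: str, timeout: float) -> Optional[dict]:
        # A long-poll request waits here until an event arrives or the timeout fires.
        try:
            return self._mailboxes[user].get(timeout=timeout)
        except queue.Empty:
            return None  # the client simply polls again
```

With a WebSocket the same `push` would write directly to the open connection instead of a queue.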
WebRTC doesn’t define signalling and it is very flexible when it comes to identities. The WebRTC GW has to be able to translate a particular signalling method and its identities into SIP signalling. One of the possibilities is to use the RESTful Network API for WebRTC Signaling defined by OMA. Either way there are many things we should think about (e.g. multi-client support, forking, PANI, sip.instance, registration/deregistration, etc.)
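To make the translation more concrete, here is a rough sketch that turns a JSON-style call request from the web client into a SIP INVITE. The JSON field names, the `identity_map` lookup and the gateway Contact URI are assumptions; a real gateway would add many more headers (Route, P-Preferred-Identity, PANI and so on).

```python
import uuid


def web_invite_to_sip(msg: dict, identity_map: dict, gw_host: str, sdp: str) -> str:
    caller = identity_map[msg["from"]]  # web login -> IMPU (hypothetical mapping)
    callee = msg["to"]
    body = sdp.replace("\r\n", "\n").replace("\n", "\r\n")  # SDP lines end with CRLF
    lines = [
        f"INVITE {callee} SIP/2.0",
        f"Via: SIP/2.0/TCP {gw_host};branch=z9hG4bK{uuid.uuid4().hex}",
        "Max-Forwards: 70",
        f"From: <{caller}>;tag={uuid.uuid4().hex[:8]}",
        f"To: <{callee}>",
        f"Call-ID: {uuid.uuid4().hex}@{gw_host}",
        "CSeq: 1 INVITE",
        f"Contact: <sip:gw@{gw_host};transport=tcp>",
        "Content-Type: application/sdp",
        f"Content-Length: {len(body.encode())}",
    ]
    return "\r\n".join(lines) + "\r\n\r\n" + body
```

The interesting part is the identity lookup: whatever the web side uses as a login has to end up as an IMPU the IMS core knows.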
As we discussed above, the WebRTC GW also has to do the SIP registration on behalf of the user.
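A registration done on behalf of the user could be sketched like this. The host names are placeholders, the Authorization header (and the whole challenge/response round trip) is omitted, and the `+sip.instance` Contact parameter is included only because multi-client support was mentioned above.

```python
import uuid


def build_register(impu: str, home_domain: str, gw_host: str,
                   instance_id: str, call_id: str, cseq: int,
                   expires: int = 3600) -> str:
    """Build a REGISTER for one web client; expires=0 removes the binding."""
    contact = (
        f"<sip:{instance_id}@{gw_host};transport=tcp>"
        f';+sip.instance="<urn:uuid:{instance_id}>"'
    )
    lines = [
        f"REGISTER sip:{home_domain} SIP/2.0",
        f"Via: SIP/2.0/TCP {gw_host};branch=z9hG4bK{uuid.uuid4().hex}",
        "Max-Forwards: 70",
        f"From: <{impu}>;tag={uuid.uuid4().hex[:8]}",
        f"To: <{impu}>",
        f"Call-ID: {call_id}",  # stays the same across refreshes of this binding
        f"CSeq: {cseq} REGISTER",
        f"Contact: {contact}",
        f"Expires: {expires}",
        "Content-Length: 0",
    ]
    return "\r\n".join(lines) + "\r\n\r\n"
```

When the web session ends (logout, closed tab, token expiry), the gateway would send the same request with `expires=0` and an incremented CSeq.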
For media, QoS and NAT traversal we use the Session Description Protocol (SDP), which is a mandatory part of both WebRTC signalling and SIP. The WebRTC GW has to be able to modify the SDP (e.g. change codecs, IPs, ports, etc.)
BTW. W3C is developing ORTC (Object Real-time Communications) for WebRTC. This is commonly referred to as WebRTC 1.1. In contrast to the current WebRTC 1.0 APIs, ORTC does not mandate a media signaling protocol or format. As a result, ORTC does not utilize SDP within its APIs, nor does it mandate support for the Offer/Answer state machine! This can be a complication for the WebRTC GW interworking function.
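As a small illustration of such an SDP modification, the sketch below replaces the connection address and audio port with the gateway's own and drops payload types the other side should not see. It handles only IPv4 `c=` lines and a single `m=audio` section, and leaves the `o=` line untouched; `rewrite_sdp` is a name made up for this example.

```python
def rewrite_sdp(sdp: str, new_ip: str, new_port: int, allowed_payloads: set) -> str:
    out = []
    for line in sdp.splitlines():
        if line.startswith("c=IN IP4 "):
            line = f"c=IN IP4 {new_ip}"  # media should now go to the gateway
        elif line.startswith("m=audio "):
            parts = line.split()  # m=audio <port> <proto> <payload types...>
            kept = [pt for pt in parts[3:] if pt in allowed_payloads]
            line = " ".join([parts[0], str(new_port), parts[2]] + kept)
        elif line.startswith(("a=rtpmap:", "a=fmtp:")):
            pt = line.split(":", 1)[1].split()[0]
            if pt not in allowed_payloads:
                continue  # drop attributes of removed codecs
        out.append(line)
    return "\r\n".join(out) + "\r\n"


offer = (
    "v=0\r\no=- 1 1 IN IP4 10.0.0.5\r\ns=-\r\nc=IN IP4 10.0.0.5\r\nt=0 0\r\n"
    "m=audio 49170 RTP/AVP 111 0 8\r\n"
    "a=rtpmap:111 opus/48000/2\r\na=fmtp:111 minptime=10\r\n"
    "a=rtpmap:0 PCMU/8000\r\na=rtpmap:8 PCMA/8000\r\n"
)
print(rewrite_sdp(offer, "192.0.2.10", 30000, {"0", "8"}))
```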
For media transport we use RTP in both WebRTC and IMS. However, when two do the same thing, it is not necessarily the same. In WebRTC, multiplexing RTP and RTCP on the same port is required (see draft-ietf-rtcweb-rtp-usage-24). In IMS we traditionally use two UDP ports for RTP and RTCP. Similarly, in WebRTC voice and video are bundled into a single RTP session on one port. In IMS we use dedicated streams.
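On the SDP level, the RTP/RTCP difference could be handled roughly as below when the offer is forwarded to the IMS leg: the `a=rtcp-mux` attribute is removed and an explicit `a=rtcp` attribute (RFC 3605) announces the separate RTCP port. Using "RTP port + 1" is the traditional convention and an assumption here, and of course the gateway must also split the packets themselves, not only rewrite the SDP.

```python
def demux_rtcp_for_ims(sdp: str) -> str:
    out, pending = [], None
    for line in sdp.splitlines():
        if line == "a=rtcp-mux" or line.startswith("a=rtcp:"):
            continue  # drop WebRTC-side RTCP signalling, re-added per section below
        if line.startswith("m="):
            if pending:
                out.append(pending)  # close the previous media section
            port = int(line.split()[1])
            # Port 0 means the stream is disabled, so no RTCP port is announced.
            pending = f"a=rtcp:{port + 1}" if port else None
        out.append(line)
    if pending:
        out.append(pending)
    return "\r\n".join(out) + "\r\n"
```

The attribute is appended at the end of each media section so that it never lands in front of a media-level `c=` line.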
Network Issues
In IMS we typically use an SBC and latching for NAT traversal. In WebRTC we use ICE/STUN/TURN. More about both techniques can be found in the Crack the NAT post.
The WebRTC GW has to support the ICE framework and implement a TURN proxy. Because of Lawful Intercept it may be necessary to relay all the traffic to the A-SBC. As we will mention later, the WebRTC GW is also responsible for media transcoding when one of the participants is in the IMS network. So besides the pure data relay, the WebRTC GW also has to transcode the media streams.
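If all media has to pass through the network (e.g. for Lawful Intercept), one simple approach is to keep only the relayed ICE candidates, so that direct host and server-reflexive paths are never tried. The sketch below does that filtering and also shows the standard ICE candidate priority formula for reference; the function names are made up.

```python
def relay_only(candidates: list) -> list:
    """Keep only TURN-relayed candidates so media is forced through the relay."""
    kept = []
    for cand in candidates:
        fields = cand.split()
        # candidate:<foundation> <component> <transport> <priority> <ip> <port> typ <type> ...
        if "typ" in fields and fields[fields.index("typ") + 1] == "relay":
            kept.append(cand)
    return kept


def ice_priority(type_pref: int, local_pref: int, component_id: int) -> int:
    # ICE formula: 2^24 * type preference + 2^8 * local preference + (256 - component ID)
    return (type_pref << 24) + (local_pref << 8) + (256 - component_id)


cands = [
    "candidate:1 1 udp 2130706431 10.0.0.5 54321 typ host",
    "candidate:2 1 udp 1694498815 198.51.100.7 54321 typ srflx raddr 10.0.0.5 rport 54321",
    "candidate:3 1 udp 16777215 192.0.2.10 40000 typ relay raddr 198.51.100.7 rport 54321",
]
print(relay_only(cands))
```

In a browser the same effect is usually requested on the client side by restricting the ICE transport policy to relay; filtering on the gateway is an extra safeguard rather than a replacement.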
Media Issues
Different codecs are supported in IMS and WebRTC. The task for the WebRTC GW is to do the transcoding.
The WebRTC media codecs are defined in the IETF draft-ietf-rtcweb-audio and draft-ietf-rtcweb-video documents.
In short, the codecs relevant for WebRTC are:
- Opus (royalty free, RFC 6716)
- iSAC (internet Speech Audio Codec)
- iLBC (internet Low Bitrate Codec, RFC 3951)
- G.711 (alaw/ulaw)
- VP8 Chrome, Firefox
- H.264 Browser (Ericsson Lab), Firefox (planned)
In IMS we typically support G.729, G.723.1, AMR (NB, and WB which is also known as G.722.2), etc.
So as we can see, for WebRTC-IMS we have to do the transcoding.
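The decision whether transcoding is needed at all can be sketched as a simple intersection of the two codec lists. The example lists are illustrative only, and codec names are compared as plain strings.

```python
def pick_codecs(webrtc_offer: list, ims_supported: list) -> dict:
    """Choose a codec per leg; transcode only when no common codec exists."""
    common = [c for c in webrtc_offer if c in ims_supported]
    if common:
        # Both legs can use the same codec, so the gateway just relays the media.
        return {"transcode": False, "webrtc_leg": common[0], "ims_leg": common[0]}
    # No overlap: each leg uses its own preferred codec and the GW transcodes.
    return {"transcode": True, "webrtc_leg": webrtc_offer[0], "ims_leg": ims_supported[0]}


# Illustrative lists, ordered by preference
print(pick_codecs(["opus", "PCMU", "PCMA"], ["AMR-WB", "AMR", "G729"]))
print(pick_codecs(["opus", "PCMU", "PCMA"], ["AMR-WB", "PCMA"]))
```

G.711 is often the only common codec, so a gateway may still prefer transcoding (e.g. Opus to AMR-WB) to keep wideband quality; that policy is not modelled here.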
And last but not least, don’t forget DTMF. The WebRTC GW supports DTMF as defined in RFC 4733.
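For illustration, the 4-byte telephone-event payload from RFC 4733 can be built like this (event code, end bit plus volume, duration in RTP timestamp units). The function name and the default volume are assumptions; RTP header handling and the usual retransmission of the final packet are left out.

```python
import struct

# Event codes for DTMF: digits 0-9, then * = 10, # = 11, A-D = 12-15
DTMF_EVENTS = {**{str(d): d for d in range(10)},
               "*": 10, "#": 11, "A": 12, "B": 13, "C": 14, "D": 15}


def dtmf_payload(key: str, duration: int, volume: int = 10, end: bool = False) -> bytes:
    if not 0 <= volume <= 63:
        raise ValueError("volume is a 6-bit field")
    if not 0 <= duration <= 0xFFFF:
        raise ValueError("duration is a 16-bit field")
    # Second byte: E (end) bit, reserved bit set to 0, then the 6-bit volume.
    flags = (0x80 if end else 0) | volume
    return struct.pack("!BBH", DTMF_EVENTS[key.upper()], flags, duration)


print(dtmf_payload("5", 160).hex())
print(dtmf_payload("#", 800, end=True).hex())
```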
Some updates about media adaptation can be found here.
Legal Issues
There are two basic requirements on a Tier 1 IMS network. The network has to be able to support:
- Emergency Calls
- Lawful Intercept
When we talk about Emergency Call (EC) support we are interested in two main things: EC detection and user location. Detecting the EC should not be a big issue. However, with general Internet access we should technically support all emergency numbers of all the countries where the user can log on. (Kind of like with VoWifi.) A bigger problem is to identify the location from which the user is connected. It is not easy to obtain the real location of the user (e.g. because of a VPN or NAT involved).
Lawful Intercept (LI) support depends on the country where the WebRTC GW and/or the IMS network is physically present. If required, the WebRTC GW has to provide the X1, X2 and X3 interfaces. LI is still mostly done on the A-SBC.
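The detection part can be sketched as a lookup against a per-country table. The table below is a tiny, incomplete sample for illustration, and the fallback for an unknown location (treat any listed number as an emergency) is an assumption, not a regulatory statement.

```python
from typing import Optional

# Illustrative and far from complete
EMERGENCY_NUMBERS = {
    "US": {"911"},
    "GB": {"999", "112"},
    "DE": {"112", "110"},
    "CZ": {"112", "150", "155", "158"},
}


def is_emergency(dialed: str, country: Optional[str]) -> bool:
    if dialed.startswith("urn:service:sos"):
        return True  # emergency service URN, independent of location
    digits = "".join(ch for ch in dialed if ch.isdigit())
    if country in EMERGENCY_NUMBERS:
        return digits in EMERGENCY_NUMBERS[country]
    # Location unknown (VPN, NAT, ...): be conservative and check every table.
    return any(digits in numbers for numbers in EMERGENCY_NUMBERS.values())


print(is_emergency("911", "DE"), is_emergency("911", None))
```

This also shows why the location matters: the same dialled string is an emergency call in one country and an ordinary (or invalid) number in another.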