It is really hard to predict the future. The authors of SIP and SDP designed (1996) a great concept which addressed the needs of real-time communication, and more, for the next two decades. But they also believed that Network Address Translation (NAT) was only a temporary solution which would become obsolete once everyone used IPv6. In 2015 we still use NATs and I’d think (! the same mistake again) that we’ll use them for a couple more years.
NAT is a technique which, in conjunction with IP masquerading, became a popular and essential tool for conserving global address space in the face of IPv4 address exhaustion. These days NAT is also used for security reasons, e.g. topology hiding, port and IP restrictions etc.
The basic functionality of NAT is to translate one IP address into another. Typically we find NATs which hide a whole private network behind one public IP (one-to-many NAT). The traffic can then originate only from the private network (the private IP space is not directly addressable from the public network).
Why do we care about NAT anyway? And what’s wrong with SIP?
Right. Let’s recall that SIP+SDP are used to establish a media session. It means we’re exchanging the IP addresses of the originator and recipient, which will then be used for the data stream (e.g. RTP, MSRP). These IP addresses are carried in the SIP body, in the SDP content.
The media communication is then established on these IP:ports. But when the addresses and ports are private, the other clients can’t use them, because the endpoints can’t reach each other directly.
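To make the problem concrete, here is a minimal sketch of what such an offer might look like and how one could spot the private address in it. The SDP body, the address 192.168.1.10 and the helper names are all made up for illustration, and only a session-level c= line is handled.

```python
import ipaddress

# Hypothetical SDP offer from a UE behind a NAT (all values are made up).
SDP_OFFER = """v=0
o=alice 2890844526 2890844526 IN IP4 192.168.1.10
s=-
c=IN IP4 192.168.1.10
t=0 0
m=audio 49170 RTP/AVP 0
a=rtpmap:0 PCMU/8000
"""


def media_endpoints(sdp: str) -> list:
    """Return (ip, port) for every m= line, using the session-level c= line."""
    address = None
    endpoints = []
    for line in sdp.splitlines():
        if line.startswith("c=IN IP4 "):
            address = line.split()[2]      # third token is the address
        elif line.startswith("m="):
            port = int(line.split()[1])    # second token is the media port
            endpoints.append((address, port))
    return endpoints


def is_private_endpoint(ip: str) -> bool:
    """True if the address belongs to a private (non-routable) range."""
    return ipaddress.ip_address(ip).is_private


for ip, port in media_endpoints(SDP_OFFER):
    status = "private, unreachable from the Internet" if is_private_endpoint(ip) else "public"
    print(f"{ip}:{port} is {status}")
```

A remote party that receives this offer would try to send RTP to 192.168.1.10:49170, which means nothing outside the sender's own network.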
There are 3 basic issues with NAT traversal for SIP/SDP:
- As mentioned, the IP:port encoded in SDP bodies by NATed UEs can’t be used across the Internet, because it represents the private addressing information of the UE rather than the address/port that the NAT will map it to.
- The policies applied by NATs, and explicit in firewalls, are such that packets from outside the NAT cannot reach the UE until the UE sends packets out first. (This is not an issue for e.g. an HTTP client, as it always initiates the communication.)
- Some NATs apply endpoint-dependent filtering on incoming packets, as described in RFC 4787, and thus a UE may only be able to receive packets from the same remote peer IP:port as it sends packets out to.
There are several types of NAT. Either we can find a fixed public IP:port which can be used for communication over the public Internet, or we can’t predict what public IP:port tuple the NAT will assign to a new communication stream.
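The second and third issues can be illustrated with a toy model of NAT filtering. This is only a sketch: the class FilteringNat and its methods are hypothetical names, and real NATs also track timeouts, protocols and port ranges.

```python
class FilteringNat:
    """Toy model of inbound filtering on a NAT (no timers, UDP only)."""

    def __init__(self, endpoint_dependent: bool):
        self.endpoint_dependent = endpoint_dependent
        self._pinholes = {}  # internal (ip, port) -> set of remote (ip, port)

    def outbound(self, internal, remote):
        """An outbound packet opens a pinhole for the internal endpoint."""
        self._pinholes.setdefault(internal, set()).add(remote)

    def allows_inbound(self, remote, internal) -> bool:
        """Inbound traffic passes only through an existing pinhole."""
        remotes = self._pinholes.get(internal)
        if not remotes:
            return False                  # the UE has not sent anything yet
        if self.endpoint_dependent:
            return remote in remotes      # only peers the UE already talked to
        return True                       # endpoint-independent filtering


ue = ("10.0.0.5", 5004)
peer_a = ("198.51.100.1", 6000)
peer_b = ("198.51.100.2", 6000)

nat = FilteringNat(endpoint_dependent=True)
print(nat.allows_inbound(peer_a, ue))   # nothing sent yet
nat.outbound(ue, peer_a)
print(nat.allows_inbound(peer_a, ue), nat.allows_inbound(peer_b, ue))
```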
- Full Cone NAT
- Restricted Cone NAT
- Port-restricted Cone NAT
- Symmetric NAT
The details can be found on Wikipedia. For now it is important that for Cone NATs we can find a ‘reflexive’ address, i.e. the public IP:port which can be put in the SDP for the future RTP communication. In the case of a Symmetric NAT this is not possible, as the NAT assigns a new public IP:port for each destination. Hence we need to use some kind of proxy and put its IP:port in the SDP instead.
So far we’ve been talking about NAT implemented on the IP layer. But because SIP/SDP contain numeric IP addresses and ports, we have to be able to provide NAT functionality on the service/application layer too. That is a job for the Session Border Controller (SBC).
Moreover, the SBC also provides a way to deal with a NAT within the access network: the NAT traversal functionality.
Hosted NAT Traversal on SBC
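The difference between a cone and a symmetric NAT can be sketched as a difference in how the mapping table is keyed. The classes below are hypothetical and model only the mapping behaviour (the three cone variants differ in filtering, which is ignored here); all addresses are documentation examples.

```python
import itertools


class ConeNat:
    """Endpoint-independent mapping: one public port per internal IP:port."""

    def __init__(self, public_ip: str, first_port: int = 40000):
        self.public_ip = public_ip
        self._ports = itertools.count(first_port)
        self._mappings = {}

    def _key(self, src, dst):
        return src                       # destination does not matter

    def map_outbound(self, src, dst):
        """Return the public (ip, port) used for a packet from src to dst."""
        key = self._key(src, dst)
        if key not in self._mappings:
            self._mappings[key] = (self.public_ip, next(self._ports))
        return self._mappings[key]


class SymmetricNat(ConeNat):
    """Endpoint-dependent mapping: a new public port per destination."""

    def _key(self, src, dst):
        return (src, dst)


ue = ("10.0.0.5", 5004)
stun_server = ("198.51.100.1", 3478)
remote_peer = ("198.51.100.2", 6000)

cone = ConeNat("192.0.2.1")
sym = SymmetricNat("192.0.2.1")

# Behind a cone NAT the address seen by a helper server is also valid for the peer.
print(cone.map_outbound(ue, stun_server) == cone.map_outbound(ue, remote_peer))
# Behind a symmetric NAT the two mappings differ.
print(sym.map_outbound(ue, stun_server) == sym.map_outbound(ue, remote_peer))
```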
The offer/answer media negotiation model is such that once an offer is sent, the client generating the offer needs to be prepared to receive media on the advertised address/ports. In practice such media may or may not be received, depending on the implementations participating in a given session, local policies, and call scenario. For example, if a SIP SDP Offer originally came from a UE behind a NAT, the SIP SBC cannot send media to it until an SDP Answer is given to the UE and ‘latching’ occurs. Another example: when a SIP SBC sends an SDP Offer in a SIP INVITE to a residential customer’s UE and receives back SDP in a 18x response, the SBC may decide, for policy reasons, not to send media to that customer UE until a SIP 200 response has been received (e.g., to prevent toll fraud).
In IMS we typically use an SBC which performs the ‘latching’, also called Hosted NAT Traversal (HNT). The SBC replaces the IP:ports in the SDP with its own address and then waits for the RTP data sent from the UEs. It then locks (latches) the IP:port from which the RTP came to its internal address.
Note that in order to have the latching working correctly, the UE behind the NAT needs to support symmetric RTP. That means it needs to send data from the same ports it listens on for inbound packets. Nowadays almost all SIP and XMPP clients support it. UEs also have to begin sending media packets independently, without waiting for packets from the other side.
Media relays also need to begin receiving media before they start sending. If several signaling intermediaries performing HNT are involved, deadlocks can occur, because each relay waits for the other one to send first.
SBCs sometimes support only UDP-based media latching, and in particular RTP/RTCP. TCP-based latching is a bit more complicated, and involves forcing the UE behind the NAT to be the TCP client and to send the initial SYN-flagged TCP packet to the SBC (i.e., to be the ‘active’ mode side of a TCP-based media session).
HNT and latching are generally found to work reliably. However, some issues have been identified. The first one is that UEs are not aware of it occurring. This makes it impossible for the mechanism to be used with protocols such as Interactive Connectivity Establishment (ICE) that try various traversal techniques in an effort to choose the one that best suits a particular situation. Overwriting address information in offers and answers may actually completely prevent UEs from using ICE because of the ice-mismatch rules described in RFC 5245.
The second issue raised by IETF participants is that it causes media to go through a relay instead of directly over the IP-routed path between the two participating UEs. (However, this is not an issue in telecom networks because we always want to proxy the media for various reasons.)
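The two steps, rewriting the SDP and latching onto the first RTP source, can be sketched roughly as follows. This is a toy model and not how any particular SBC is implemented: the class LatchingRelay, its methods and all addresses are assumptions, and real products may re-latch or apply further policy.

```python
import re


class LatchingRelay:
    """Toy model of one SBC media port that latches onto the first RTP source."""

    def __init__(self, relay_ip: str, relay_port: int):
        self.relay_ip = relay_ip
        self.relay_port = relay_port
        self.latched = None  # (ip, port) learned from the first RTP packet

    def rewrite_sdp(self, sdp: str) -> str:
        """Replace the connection address and audio port with the relay's own."""
        sdp = re.sub(r"^c=IN IP4 \S+", f"c=IN IP4 {self.relay_ip}", sdp, flags=re.M)
        return re.sub(r"^m=audio \d+", f"m=audio {self.relay_port}", sdp, flags=re.M)

    def on_rtp(self, source) -> bool:
        """Latch on the first packet; accept later packets only from that source."""
        if self.latched is None:
            self.latched = source
        return source == self.latched


relay = LatchingRelay("198.51.100.7", 20000)
offer = "v=0\nc=IN IP4 192.168.1.10\nm=audio 49170 RTP/AVP 0\n"
print(relay.rewrite_sdp(offer))

# The first RTP packet arrives from the NAT's public side, not from 192.168.1.10.
relay.on_rtp(("192.0.2.1", 40000))
print(relay.latched)
```

The point of the design is that the relay never trusts the address in the SDP; it only trusts where the packets actually come from.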
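Symmetric RTP boils down to using one socket for both directions. A minimal sketch on the loopback interface, assuming plain UDP and a placeholder payload instead of real RTP (the function name is hypothetical):

```python
import socket


def symmetric_rtp_demo():
    """Send from the same UDP socket we listen on; return (listen_port, seen_port)."""
    peer = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    ue = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    try:
        peer.bind(("127.0.0.1", 0))      # port 0: let the OS pick a free port
        peer.settimeout(2.0)
        ue.bind(("127.0.0.1", 0))        # the port the UE would advertise in SDP
        listen_port = ue.getsockname()[1]

        # Sending from the bound socket makes the source port equal the listen port.
        ue.sendto(b"rtp-placeholder", peer.getsockname())
        _, source = peer.recvfrom(2048)
        return listen_port, source[1]
    finally:
        ue.close()
        peer.close()


listen_port, seen_port = symmetric_rtp_demo()
print(listen_port == seen_port)
```

A client that sent from a second, unbound socket would show up at the relay with a different source port, and the latched address would not lead back to its listening port.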
And last but not least there are some security concerns related to the enabling of the relay by the SBC.
For these reasons WebRTC suggests using the ICE/STUN/TURN mechanisms for NAT traversal.
Session Traversal Utilities for NAT (STUN) is a standardized set of methods and a network protocol which allows an end host to discover its public IP address (RFC 5389). STUN requires assistance from a third-party network server (STUN server) located on the opposite (public) side of the NAT. The STUN server echoes back the source address it sees on each client request, so the client can learn its public IP:port and possibly identify the type of the NAT involved. Because this address is learned from a conversation with the STUN server, it is of no use behind a symmetric NAT, which assigns a different mapping for every destination.
STUN is very easy to implement and we can find plenty of free STUN servers on the Internet. The STUN protocol is also used for pinhole management (keeping the NAT channel open), short-term credentials for authentication and many other things.
OK, this works for Cone NATs. So what do we do with a symmetric NAT? In that case we need to use some kind of proxy. For that we have Traversal Using Relays around NAT (TURN), defined in RFC 5766.
Note that TURN also defines a secure way to reserve resources on the TURN server.
ICE allows us to combine STUN and TURN. Basically, if possible STUN is used (as it is cheaper); if not, we use the TURN relay. ICE defines a framework which uses SDP to transmit ‘ICE candidates’. These candidates represent the available addresses which can be used for the data transmission.
When the client wants to establish a media session, it first gathers the candidates. We have 3 basic types:
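On the wire this is a small binary exchange. The sketch below builds an RFC 5389 Binding Request and decodes the XOR-MAPPED-ADDRESS attribute of a response. The message types, attribute type and magic cookie come from the RFC; the function names are mine, only IPv4 is handled, and no packet is actually sent (a real client would send the request over UDP to a STUN server, usually on port 3478).

```python
import os
import socket
import struct

MAGIC_COOKIE = 0x2112A442
BINDING_REQUEST = 0x0001
BINDING_SUCCESS = 0x0101
XOR_MAPPED_ADDRESS = 0x0020


def build_binding_request(transaction_id=None) -> bytes:
    """20-byte STUN header: type, length 0, magic cookie, 96-bit transaction ID."""
    transaction_id = transaction_id or os.urandom(12)
    return struct.pack("!HHI12s", BINDING_REQUEST, 0, MAGIC_COOKIE, transaction_id)


def parse_xor_mapped_address(message: bytes):
    """Return the reflexive (ip, port) from a Binding Success Response, or None."""
    if len(message) < 20:
        return None
    msg_type, length, cookie = struct.unpack("!HHI", message[:8])
    if msg_type != BINDING_SUCCESS or cookie != MAGIC_COOKIE:
        return None
    offset, end = 20, 20 + length
    while offset + 4 <= end:
        attr_type, attr_len = struct.unpack("!HH", message[offset:offset + 4])
        value = message[offset + 4:offset + 4 + attr_len]
        if attr_type == XOR_MAPPED_ADDRESS and len(value) >= 8 and value[1] == 0x01:
            xport, xaddr = struct.unpack("!HI", value[2:8])
            port = xport ^ (MAGIC_COOKIE >> 16)          # XOR with top 16 cookie bits
            ip = socket.inet_ntoa(struct.pack("!I", xaddr ^ MAGIC_COOKIE))
            return ip, port
        offset += 4 + attr_len + (-attr_len % 4)         # values are padded to 4 bytes
    return None


request = build_binding_request()
print(len(request), request[:8].hex())
```

The address is XOR-ed with the magic cookie so that NATs which rewrite anything that looks like their own IP address inside a payload do not corrupt it.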
- Host candidates
- Server reflexive candidates (STUN)
- Relayed candidates (TURN)
An ICE candidate is encoded in the SDP as an a=candidate attribute line.
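As a rough illustration, RFC 5245 defines the a=candidate attribute with the fields foundation, component ID, transport, priority, address, port and type, optionally followed by a related address and port (raddr/rport). The sketch below parses one such line; the class and function names are hypothetical and the sample line uses example addresses, not real ones.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass
class Candidate:
    foundation: str
    component: int
    transport: str
    priority: int
    address: str
    port: int
    cand_type: str                                  # host, srflx or relay
    related: Optional[Tuple[str, int]] = None       # raddr/rport, if present


def parse_candidate(line: str) -> Candidate:
    """Parse a single 'a=candidate:...' SDP attribute line."""
    _, sep, body = line.strip().partition("candidate:")
    fields = body.split()
    if not sep or len(fields) < 8 or fields[6] != "typ":
        raise ValueError(f"not a candidate line: {line!r}")
    # Everything after the type comes in name/value pairs.
    extras = dict(zip(fields[8::2], fields[9::2]))
    related = None
    if "raddr" in extras and "rport" in extras:
        related = (extras["raddr"], int(extras["rport"]))
    return Candidate(fields[0], int(fields[1]), fields[2], int(fields[3]),
                     fields[4], int(fields[5]), fields[7], related)


SAMPLE = "a=candidate:2 1 UDP 1694498815 192.0.2.3 45664 typ srflx raddr 10.0.1.1 rport 8998"
print(parse_candidate(SAMPLE))
```

For a server reflexive candidate the related address is the private host address from which the public one was derived.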
So during the SDP negotiation the clients can select the most appropriate IP addresses for the media traffic.
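How "most appropriate" is expressed can be sketched with the priority formula recommended in RFC 5245: priority = 2^24 * type preference + 2^8 * local preference + (256 - component ID), with recommended type preferences of 126 for host, 100 for server reflexive and 0 for relayed candidates. The function name and the default local preference of 65535 (a single-interface host) are assumptions of this sketch, and real ICE agents pick the final pair only after connectivity checks succeed.

```python
# Recommended type preferences from RFC 5245 for the three types listed above.
TYPE_PREFERENCE = {"host": 126, "srflx": 100, "relay": 0}


def candidate_priority(cand_type: str, local_pref: int = 65535, component_id: int = 1) -> int:
    """Compute the candidate priority using the RFC 5245 recommended formula."""
    return (TYPE_PREFERENCE[cand_type] << 24) + (local_pref << 8) + (256 - component_id)


# Hypothetical gathered candidates: (type, address, port).
gathered = [
    ("relay", "198.51.100.9", 50000),
    ("host", "10.0.1.1", 8998),
    ("srflx", "192.0.2.3", 45664),
]

# Direct paths are tried before the (more expensive) relay.
ordered = sorted(gathered, key=lambda c: candidate_priority(c[0]), reverse=True)
for cand_type, ip, port in ordered:
    print(cand_type, f"{ip}:{port}", candidate_priority(cand_type))
```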
For more details see RFC 5245. Note that ICE exists in lite and full implementations (RFC 5245). The latest extension to ICE is Trickle ICE: Incremental Provisioning of Candidates for the ICE Protocol. The idea is to enable ICE agents to send and receive candidates incrementally rather than exchanging complete lists. Hence ICE agents can start connectivity checks while they are still gathering candidates. This mechanism, called “trickle ICE”, considerably shortens the time necessary for ICE processing. At the time of writing the specification is still a draft.