When we talked about the Diameter protocol in Diameter Overview, we mentioned that it relies on transport level protocols – TCP or SCTP.
I guess most people are familiar with TCP, but not everyone is familiar with SCTP yet. That’s a pity, because the Stream Control Transmission Protocol (SCTP) was designed especially for telcos, as TCP has some limitations when it comes to signaling transport. Let’s explore SCTP’s basic features today. By the way, SCTP is also a must in legacy networks, because it is part of the SIGTRAN stack (SS7 over IP).
And last but not least, SCTP can also transport the SIP protocol (RFC 4168). It is not that common, but some operators benefit from this option, e.g. on the NNI.
What is SCTP?
SCTP is defined in RFC 4960. There is also RFC 3286, which is not a typical spec but rather an introduction to SCTP.
We said that SCTP is a transport level protocol. Like TCP, it provides a reliable transport service: it makes sure that data is transported across the network without error and in sequence. SCTP is rate-adaptive and provides congestion control, too. Unlike TCP, which is byte-oriented and does not preserve any implicit structure within a transmitted byte stream, SCTP is message-oriented (like UDP) and preserves the boundaries of individual messages.
The protocol is full-duplex and session-oriented. A relationship between exactly two endpoints of an association is established before data is transmitted. This relationship is maintained until all data transmission has been successfully completed.
Besides that, SCTP provides plenty of functions that are critical for signaling transport, with its need for additional performance and reliability. The most important features are SCTP multi-streaming and SCTP multi-homing.
SCTP packets have a very simple structure. Each consists of two basic sections: the common header and one or more chunks (control or data).
The important values from the SCTP flow point of view are the chunk types:
|ID||Chunk type||Description|
|0||DATA||Payload data|
|1||INIT||Initiation|
|2||INIT ACK||Initiation acknowledgement|
|3||SACK||Selective acknowledgement|
|4||HEARTBEAT||Heartbeat request|
|5||HEARTBEAT ACK||Heartbeat acknowledgement|
|6||ABORT||Abort|
|7||SHUTDOWN||Shutdown|
|8||SHUTDOWN ACK||Shutdown acknowledgement|
|9||ERROR||Operation error|
|10||COOKIE ECHO||State cookie|
|11||COOKIE ACK||Cookie acknowledgement|
|12||ECNE||Explicit congestion notification echo (reserved)|
|13||CWR||Congestion window reduced (reserved)|
|14||SHUTDOWN COMPLETE||Shutdown complete|
|15-62||N/A||Reserved by IETF|
|63||N/A||IETF-defined chunk extensions|
|64-126||N/A||Reserved by IETF|
|127||N/A||IETF-defined chunk extensions|
|128-190||N/A||Reserved by IETF|
|191||N/A||IETF-defined chunk extensions|
|192-254||N/A||Reserved by IETF|
|255||N/A||IETF-defined chunk extensions|
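To make the packet layout more tangible, here is a minimal Python sketch that splits a raw SCTP packet into the common header and its chunks. The function name `parse_sctp_packet` and the hand-built sample packet are my own illustration, not part of any library. The sketch assumes a well-formed packet and does not verify the checksum (real SCTP uses CRC32c; the sample simply carries zero).

```python
import struct

# Chunk type names as defined in RFC 4960, section 3.2.
CHUNK_TYPES = {
    0: "DATA", 1: "INIT", 2: "INIT ACK", 3: "SACK", 4: "HEARTBEAT",
    5: "HEARTBEAT ACK", 6: "ABORT", 7: "SHUTDOWN", 8: "SHUTDOWN ACK",
    9: "ERROR", 10: "COOKIE ECHO", 11: "COOKIE ACK", 12: "ECNE",
    13: "CWR", 14: "SHUTDOWN COMPLETE",
}


def parse_sctp_packet(packet):
    """Split a raw SCTP packet into its common header and a list of chunks."""
    # Common header (12 bytes): source port, destination port,
    # verification tag, checksum. All fields are in network byte order.
    src_port, dst_port, vtag, checksum = struct.unpack_from("!HHII", packet, 0)
    header = {"src_port": src_port, "dst_port": dst_port,
              "verification_tag": vtag, "checksum": checksum}

    chunks = []
    offset = 12
    while offset + 4 <= len(packet):
        # Chunk header (4 bytes): type, flags, length (header + value).
        ctype, flags, length = struct.unpack_from("!BBH", packet, offset)
        if length < 4:
            raise ValueError("chunk length must cover its 4-byte header")
        value = packet[offset + 4:offset + length]
        chunks.append({"type": CHUNK_TYPES.get(ctype, f"UNKNOWN({ctype})"),
                       "flags": flags, "value": value})
        # Chunks are padded to a multiple of 4 bytes; the padding is not
        # counted in the length field.
        offset += (length + 3) & ~3
    return header, chunks


# Hand-built sample: a COOKIE ECHO carrying a 5-byte value (padded to 8),
# bundled with a COOKIE ACK that has no value at all.
sample = (struct.pack("!HHII", 3868, 3868, 0x12345678, 0)
          + struct.pack("!BBH", 10, 0, 9) + b"hello" + b"\x00" * 3
          + struct.pack("!BBH", 11, 0, 4))
header, chunks = parse_sctp_packet(sample)
print(header["src_port"], [c["type"] for c in chunks])
```

Note how two chunks travel in one packet behind a single common header; this is the bundling that the comparison table at the end of the article refers to.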
In order to establish an SCTP session we have to perform a 4-way handshake. SCTP uses a so-called “cookie” mechanism to guard the session against some types of denial-of-service attacks (such as a SYN flood in the case of TCP). How does it work?
In the first message, the INIT chunk, SCTP endpoint A sends a random number in the Initiate Tag field.
The other SCTP endpoint Z immediately responds with an INIT ACK chunk. In the response, besides other parameters, the Verification Tag field is set to A’s Initiate Tag value. The endpoint Z sends in return its own tag in the Initiate Tag field. The response also contains a State Cookie. Inside this State Cookie, the SCTP endpoint Z transmits a Transmission Control Block (TCB).
The TCB is an internal data structure created for each existing SCTP association to another peer. It contains all the status and operational information the endpoint needs to maintain and manage the association. RFC 4960 defines which parameters are necessary and which are recommended for the TCB. Simply put, the cookie contains all the information needed to create the session on endpoint Z, along with a Message Authentication Code (MAC, see RFC 2104), a timestamp, and the lifespan of the State Cookie.
The important thing is that endpoint Z doesn’t create any context (TCB) for this association yet. It doesn’t allocate any resources, keep state or start any timers. Instead, the TCB is sent as a cookie to endpoint A. Thanks to this mechanism, SCTP is less vulnerable to DoS attacks.
When the endpoint A receives an INIT ACK chunk with a State Cookie parameter, it immediately sends back a COOKIE ECHO to Z. The message contains the received State Cookie as the value of the COOKIE ECHO chunk.
Once endpoint Z receives the COOKIE ECHO, it verifies the content and uses the TCB data to create the session. As confirmation, a COOKIE ACK is sent back to the originating peer A.
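The stateless cookie trick described above can be sketched in a few lines of Python. This is only an illustration under my own assumptions: the TCB is a plain dictionary serialized as JSON, the MAC is HMAC-SHA256, and the names `make_state_cookie`, `open_state_cookie`, `SECRET_KEY` and `COOKIE_LIFESPAN` are hypothetical. Real stacks use their own binary layout.

```python
import hashlib
import hmac
import json
import os
import time

SECRET_KEY = os.urandom(32)   # known only to endpoint Z (assumption)
COOKIE_LIFESPAN = 60.0        # seconds; illustrative value only
MAC_LEN = 32                  # size of a SHA-256 digest


def make_state_cookie(tcb, now=None):
    """Z side, on INIT: pack the would-be TCB into a signed cookie.

    Nothing is stored locally; the cookie travels to A inside the INIT ACK.
    """
    now = time.time() if now is None else now
    body = json.dumps({"tcb": tcb, "created": now,
                       "lifespan": COOKIE_LIFESPAN}, sort_keys=True).encode()
    mac = hmac.new(SECRET_KEY, body, hashlib.sha256).digest()
    return mac + body


def open_state_cookie(cookie, now=None):
    """Z side, on COOKIE ECHO: check MAC and lifespan, then return the TCB."""
    now = time.time() if now is None else now
    mac, body = cookie[:MAC_LEN], cookie[MAC_LEN:]
    expected = hmac.new(SECRET_KEY, body, hashlib.sha256).digest()
    if not hmac.compare_digest(mac, expected):
        raise ValueError("cookie MAC mismatch")
    fields = json.loads(body)
    if now - fields["created"] > fields["lifespan"]:
        raise ValueError("stale cookie")
    # Only at this point would Z allocate resources for the association.
    return fields["tcb"]


# A's Initiate Tag, Z's Initiate Tag and A's address, as Z would record them.
tcb = {"peer_tag": 0x1111, "my_tag": 0x2222, "peer_addr": "192.0.2.1"}
cookie = make_state_cookie(tcb)          # sent in INIT ACK
restored = open_state_cookie(cookie)     # received back in COOKIE ECHO
print(restored == tcb)
```

An attacker flooding Z with INIT chunks only makes Z compute MACs; no memory is held per request, which is the whole point of the mechanism.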
Each session is identified by an Association Index. From the practical point of view it is good to know that Wireshark supports SCTP streams and provides some handy features (Analyze/SCTP).
Once the session is established, the endpoints can exchange data. DATA chunk exchange in SCTP is similar to TCP’s Selective ACK procedure. Receipt of DATA chunks is acknowledged by sending SACK chunks. In TCP we acknowledge the number of received bytes. In SCTP we use the Transmission Sequence Number (TSN) to acknowledge received chunks. A SACK carries a cumulative TSN, indicating the range of chunks received without a gap, and optionally non-cumulative TSN ranges, implying gaps in the received TSN sequence.
If the sending SCTP endpoint doesn’t receive a SACK covering a given TSN, it retransmits the chunk. However, it is also possible for the sender to decide that the data is no longer valid and simply skip it (e.g. some multimedia data); this is the optional partially reliable mode.
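How a receiver turns the set of TSNs it has seen into a cumulative TSN plus gap reports can be sketched as follows. The function name `build_sack` is my own, and the sketch ignores 32-bit TSN wrap-around for simplicity. The gap blocks are expressed as offsets relative to the cumulative TSN, which is how RFC 4960 encodes them.

```python
def build_sack(last_cum_tsn, received_tsns):
    """Return (cumulative TSN ack, list of gap ack blocks).

    last_cum_tsn  -- highest TSN acknowledged cumulatively so far
    received_tsns -- set of TSNs received above that point
    """
    # Advance the cumulative point over every consecutively received chunk.
    cum = last_cum_tsn
    while cum + 1 in received_tsns:
        cum += 1

    # Everything received above the cumulative point is reported as
    # (start, end) offsets relative to the cumulative TSN.
    gaps = []
    for tsn in sorted(t for t in received_tsns if t > cum):
        offset = tsn - cum
        if gaps and gaps[-1][1] == offset - 1:
            gaps[-1] = (gaps[-1][0], offset)   # extend the current block
        else:
            gaps.append((offset, offset))      # start a new block
    return cum, gaps


# TSNs 102, 105 and 106 were lost on the way.
cum, gaps = build_sack(99, {100, 101, 103, 104, 107})
print(cum, gaps)   # 101 [(2, 3), (6, 6)]
```

From this SACK the sender learns that everything up to TSN 101 has arrived, that 103-104 and 107 have arrived as well, and therefore that 102, 105 and 106 are candidates for retransmission.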
By default, an SCTP endpoint monitors the reachability of the idle destination transport address(es) of its peer by sending a HEARTBEAT chunk periodically to the destination transport address(es). The peers respond with HEARTBEAT ACK.
Shutdown and Abort
The session can be closed by the SCTP Shutdown procedure, which uses three messages (SHUTDOWN, SHUTDOWN ACK, SHUTDOWN COMPLETE) to allow a graceful shutdown. In contrast to TCP, SCTP does not support a “half-open” connection (what TCP people usually call half-closed). In TCP, when one endpoint indicates that it has no more data to send, the other can still continue to send data (indefinitely). In SCTP we assume that once the shutdown procedure begins, both parties stop sending any new data. They only have to finish acknowledging previously sent data messages.
For an immediate shutdown in case of an error, an Abort procedure is available.
TCP uses only a single stream of data. Within this stream, TCP makes sure that data is delivered with the byte sequence preserved. In case of data loss or a sequence order error, TCP must delay delivery of data until the correct sequencing is restored, either by receipt of the out-of-sequence message or by retransmission of the lost message.
The key feature of SCTP is multi-streaming, hence the protocol name: Stream Control Transmission Protocol. Multi-streaming allows data to be partitioned into multiple streams. Each stream is independent of the others, so a message loss in one of them doesn’t affect delivery in the other streams.
This is achieved by creating independence between data transmission and data delivery. More specifically, each payload DATA chunk in the protocol uses two sets of sequence numbers, the Transmission Sequence Number (TSN) and a Stream ID/Stream Sequence Number (SN) pair. TSN governs the transmission of messages and the detection of message loss. The Stream ID + SN determine the sequence of delivery of received data.
SCTP allows the receiver to determine immediately when a gap in the transmission sequence occurs (e.g., due to message loss), and also whether or not messages received following the gap are within an affected stream.
If a message is received within the affected stream, there will be a corresponding gap in the SN, while messages from other streams will not show a gap. The receiver can therefore continue to deliver messages to the unaffected streams while buffering messages in the affected stream until retransmission occurs.
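The effect is easy to see in a small simulation of the receiver side. The class below is a sketch of my own (the name `StreamReassembler` is hypothetical); it tracks only the Stream ID and SN of each DATA chunk, starts every stream at SN 0 and ignores the 16-bit wrap-around of the SN.

```python
from collections import defaultdict


class StreamReassembler:
    """Delivers messages in SN order per stream, independently of other streams."""

    def __init__(self):
        self.next_sn = defaultdict(int)     # stream id -> next expected SN
        self.pending = defaultdict(dict)    # stream id -> {SN: payload}

    def receive(self, stream_id, sn, payload):
        """Accept one DATA chunk; return the payloads that can be delivered now."""
        self.pending[stream_id][sn] = payload
        delivered = []
        # Deliver as long as the next expected SN of this stream is buffered.
        while self.next_sn[stream_id] in self.pending[stream_id]:
            delivered.append(self.pending[stream_id].pop(self.next_sn[stream_id]))
            self.next_sn[stream_id] += 1
        return delivered


r = StreamReassembler()
# The first message of stream 0 (SN 0) is lost and arrives last.
print(r.receive(0, 1, "a1"))   # []            stream 0 has to wait
print(r.receive(1, 0, "b0"))   # ['b0']        stream 1 is not affected
print(r.receive(1, 1, "b1"))   # ['b1']
print(r.receive(0, 0, "a0"))   # ['a0', 'a1']  retransmission unblocks stream 0
```

With a single TCP connection, `b0` and `b1` would have been stuck behind the missing `a0` as well.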
The number of Outbound Streams (OS) and the Maximum Inbound Streams (MIS) are negotiated during the initialization process. The OS value of one party has to respect the MIS of the other party; otherwise an error is reported and the association is aborted.
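As a tiny sketch of this check, assuming the strict behaviour described above (abort on mismatch): the names `check_outbound_streams` and `AssociationAborted` are hypothetical. As far as I know, an implementation may alternatively reduce its outbound streams to the peer's MIS instead of aborting, so treat this as one possible policy.

```python
class AssociationAborted(Exception):
    """Raised when the stream negotiation fails (hypothetical helper)."""


def check_outbound_streams(requested_os, peer_mis):
    """Validate our requested outbound stream count against the peer's MIS."""
    if requested_os < 1:
        raise ValueError("an association needs at least one outbound stream")
    if requested_os > peer_mis:
        # The peer cannot accept that many inbound streams.
        raise AssociationAborted(
            f"requested {requested_os} outbound streams, peer allows {peer_mis}")
    return requested_os


print(check_outbound_streams(10, 16))   # 10
```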
If we want redundant physical connectivity between two nodes, we can either have redundancy on the link layer (bonded interfaces, vNICs, etc.), or we need several parallel TCP/UDP sessions. SCTP offers redundancy on the network layer, as a single SCTP endpoint can bind an association to multiple IP addresses. This can be beneficial especially when an application makes use of multiple nodes (blades, VMs).
SCTP does not do load sharing, which means we use multi-homing for redundancy only. One IP address is configured as primary and serves as the destination address for all DATA chunks in normal transmission. For retransmission of DATA chunks the secondary IP addresses are used. A continued failure to send to the primary IP ultimately results in the decision to transmit all DATA chunks to a secondary address until heartbeats can reestablish the reachability of the primary destination.
The endpoints exchange their lists of addresses during the initiation of the association. Note that a single SCTP port number is used across the entire address list at an endpoint for a specific session.
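The failover logic can be sketched as a small path table with one error counter per destination address. The class and method names are my own illustration. The threshold is the `Path.Max.Retrans` protocol parameter, for which RFC 4960 suggests a value of 5; a destination becomes inactive once its error counter exceeds it.

```python
PATH_MAX_RETRANS = 5   # value suggested by RFC 4960 for Path.Max.Retrans


class PeerPaths:
    """Tracks reachability of a multi-homed peer (illustrative sketch)."""

    def __init__(self, addresses):
        self.addresses = list(addresses)
        self.primary = self.addresses[0]          # first address is primary
        self.errors = {a: 0 for a in self.addresses}

    def is_active(self, addr):
        return self.errors[addr] <= PATH_MAX_RETRANS

    def on_timeout(self, addr):
        """A retransmission timer expired or a HEARTBEAT went unanswered."""
        self.errors[addr] += 1

    def on_ack(self, addr):
        """A SACK or HEARTBEAT ACK arrived for data sent to this address."""
        self.errors[addr] = 0

    def destination_for_new_data(self):
        """New DATA goes to the primary as long as it is considered active."""
        if self.is_active(self.primary):
            return self.primary
        for addr in self.addresses:
            if self.is_active(addr):
                return addr
        raise ConnectionError("no active path to the peer")

    def destination_for_retransmission(self, last_used):
        """Retransmissions prefer an active address other than the last one."""
        for addr in self.addresses:
            if addr != last_used and self.is_active(addr):
                return addr
        return last_used


paths = PeerPaths(["10.0.0.1", "10.1.0.1"])
print(paths.destination_for_new_data())                   # 10.0.0.1
print(paths.destination_for_retransmission("10.0.0.1"))   # 10.1.0.1
for _ in range(PATH_MAX_RETRANS + 1):
    paths.on_timeout("10.0.0.1")                          # primary keeps failing
print(paths.destination_for_new_data())                   # 10.1.0.1
paths.on_ack("10.0.0.1")                                  # HEARTBEAT ACK received
print(paths.destination_for_new_data())                   # 10.0.0.1
```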
SCTP, TCP, UDP Comparison
We can’t say that any of the transport level protocols is better than the others. There are good use cases for SCTP as well as for TCP or UDP. Although SCTP is the preferred option for Diameter communication (e.g. the Cx and Sh interfaces), TCP can do a good job here too. Which protocol is best depends very much on the network design and also on what people are familiar with.
|Feature||SCTP||TCP||UDP|
|Reliable data transfer||yes||yes||no|
|Partial-reliable data transfer||optional||no||no|
|Ordered data delivery||yes||yes||no|
|Unordered data delivery||yes||no||yes|
|Preservation of message boundaries||yes||no||yes|
|Path MTU discovery||yes||yes||no|
|Application PDU fragmentation||yes||yes||no|
|Application PDU bundling||yes||yes||no|
|Protection against SYN flooding attacks||yes||no||n/a|
|Allows half-closed connections||no||yes||n/a|
|Pseudo-header for checksum||no (uses vtags)||yes||yes|
|Time wait state||for vtags||for 4-tuple||n/a|