Saturday, March 5, 2011

Bridging SIP videophones to RTSP providers

Just a quick post about bridging SIP video phones to RTSP video content providers. The goal is to let SIP phones play audio and video from regular RTSP providers such as YouTube. SIP and RTSP are unrelated protocols, but both carry SDP inside their messages, so it shouldn't be too hard to make them talk to each other.

App overview

To achieve the goal we will need a SIP stack, an RTSP stack and potentially an RTP stack. For SIP we'll use Mobicents Sip Servlets; for RTSP we can use JBoss Netty, whose RTSP codecs were contributed by the Mobicents Media Server team. RTP is really optional, because we should be able to wire the client and the RTSP server to exchange RTP directly with each other, so the app server does not need to understand the media traffic.

Basically, a Sip Servlets application will initiate the RTSP play sequence for each INVITE transaction and tear down the RTSP session when the client disconnects with a BYE.

Simple enough. Let's see the code we need:

    @Override
    protected void doInvite(SipServletRequest request) throws ServletException,
            IOException {

        RTSPStack rtsp = new RTSPStack();

        // Try to take the RTSP URI from the SIP request "To" header,
        // falling back to the request URI
        String rtspParameter = request.getTo().getURI().getParameter("rtsp");
        if (rtspParameter == null) {
            rtspParameter = request.getRequestURI().getParameter("rtsp");
        }
        if (rtspParameter != null) {
            rtsp.uri = "rtsp://" + rtspParameter;
        }

        // .. SDP pre-processing ..

        rtsp.init(); // start the OPTIONS, DESCRIBE, SETUP, PLAY sequence

        // .. SDP post-processing ..

        // Answer the INVITE with the (translated) SDP from the RTSP server
        SipServletResponse sipServletResponse = request.createResponse(SipServletResponse.SC_OK);
        sipServletResponse.setContent(rtsp.sdp, "application/sdp");
        sipServletResponse.send();
    }

Negotiating the SDP

The SDP used by normal SIP phones is slightly different from the SDP used by most RTSP servers, but it is close enough. If your phones and RTSP servers know exactly the same codecs and payload types, the integration is effortless. For YouTube it requires a few pre- and post-processing steps on both sides:
  • Extract the audio and video RTP port numbers from the client SDP advertised in the INVITE
  • Translate the SDP that the RTSP server returns. The Wireshark screenshot on the right shows the captured RTSP SDP (note the track information and the way ports are shown in the RTSP response).
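
As a rough sketch of the first step (this is not the code from the Mobicents example; the SdpPorts class and its method names are made up for illustration), the ports can be read from the "m=" lines of the client SDP and turned into the Transport header value that an RTSP SETUP request carries. It assumes plain RTP/AVP over UDP and the usual convention that RTCP uses the next port up:

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical helper, not part of the Mobicents example project. */
final class SdpPorts {

    /** Returns media type -> RTP port for every "m=" line, in SDP order. */
    static Map<String, Integer> mediaPorts(String sdp) {
        Map<String, Integer> ports = new LinkedHashMap<>();
        for (String line : sdp.split("\r?\n")) {
            if (!line.startsWith("m=")) {
                continue;
            }
            // m=<media> <port>[/<number of ports>] <proto> <fmt> ...
            String[] fields = line.substring(2).trim().split("\\s+");
            if (fields.length < 2) {
                continue;
            }
            String port = fields[1].split("/")[0];
            ports.put(fields[0], Integer.parseInt(port));
        }
        return ports;
    }

    /** Builds the value of the RTSP SETUP "Transport" header for a client RTP port. */
    static String transportHeader(int rtpPort) {
        // Assumption: RTCP is expected on the port right after the RTP port
        return "RTP/AVP;unicast;client_port=" + rtpPort + "-" + (rtpPort + 1);
    }

    public static void main(String[] args) {
        // Made-up client SDP, using documentation addresses
        String sdp = "v=0\r\n"
                + "o=- 0 0 IN IP4 192.0.2.10\r\n"
                + "s=-\r\n"
                + "c=IN IP4 192.0.2.10\r\n"
                + "t=0 0\r\n"
                + "m=audio 49170 RTP/AVP 99\r\n"
                + "m=video 51372 RTP/AVP 98\r\n";
        Map<String, Integer> ports = mediaPorts(sdp);
        System.out.println(transportHeader(ports.get("audio")));
        System.out.println(transportHeader(ports.get("video")));
    }
}
```

One SETUP is sent per track, so the audio and video ports each end up in their own Transport header.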
One big challenge for all this to work is to have a phone that understands the codecs used by many RTSP providers. Most commonly those are AMR (narrowband, 8 kHz) for audio and H263-2000 for video, advertised with the following RTP map:

    a=rtpmap:99 AMR/8000/1
    a=rtpmap:98 H263-2000/90000

Unfortunately, neither of these codecs is supported by any free SIP phone, which makes it very difficult to build a one-for-all solution and test it without transcoding through a media server. Transcoding should be avoided, because it adds significant computational cost and makes the project harder to scale.

Matching the RTP/SDP payload type

Even without transcoding, you will often have to deal with RTP payload type incompatibility. IANA maintains the registry of standard RTP payload types. RTSP from YouTube uses the dynamic payload type range (96-127), where the numbers don't correspond to any particular codec and it is up to the phone to interpret them correctly, by matching the codec name string or by some other method. This can also be compensated for on the SIP server side. If you have phones that understand exactly the same payload types as the RTSP server, then great - no extra work is needed. If not, you will want to convert the types.
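
To illustrate the codec name matching, here is a minimal sketch that builds a translation table from the payload types in the RTSP server's SDP to the ones the phone advertises. RtpMapMatcher is a hypothetical class, not something from the example project, and it matches on the encoding name only, ignoring clock rate and channel count, which a real implementation would probably want to compare as well:

```java
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;

/** Hypothetical helper, not part of the Mobicents example project. */
final class RtpMapMatcher {

    /** Parses "a=rtpmap:<pt> <encoding>/<clock>[/<params>]" lines into encoding name -> payload type. */
    static Map<String, Integer> codecToPayloadType(String sdp) {
        Map<String, Integer> result = new HashMap<>();
        for (String line : sdp.split("\r?\n")) {
            if (!line.startsWith("a=rtpmap:")) {
                continue;
            }
            String[] parts = line.substring("a=rtpmap:".length()).trim().split("\\s+", 2);
            if (parts.length < 2) {
                continue;
            }
            // Encoding names are compared case-insensitively
            String codec = parts[1].split("/")[0].toUpperCase(Locale.ROOT);
            result.put(codec, Integer.parseInt(parts[0]));
        }
        return result;
    }

    /** Returns server payload type -> phone payload type for codecs both sides advertise. */
    static Map<Integer, Integer> translation(String serverSdp, String phoneSdp) {
        Map<String, Integer> server = codecToPayloadType(serverSdp);
        Map<String, Integer> phone = codecToPayloadType(phoneSdp);
        Map<Integer, Integer> table = new HashMap<>();
        for (Map.Entry<String, Integer> entry : server.entrySet()) {
            Integer phonePt = phone.get(entry.getKey());
            if (phonePt != null) {
                table.put(entry.getValue(), phonePt);
            }
        }
        return table;
    }
}
```

A codec that only one side advertises simply gets no entry in the table, which is the case where transcoding would be needed.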

To allow payload type modification we will need to bring the app server back into the RTP path to translate the RTP. We don't need a media server for this, just a very lightweight stack that passes the RTP traffic through. I added the RTPPacketForwarder class to the project, which can be managed by the RTSPStack. Basically it binds sockets on both interfaces of the server and forwards the RTP packets unchanged, except for the payload type, which can optionally be overwritten to match the payload number required by the phone. Note that when payload number modification is required, the SDP must also be adapted in the Sip Servlets app to tell both the phone and the RTSP server to talk to the ports owned by the RTPPacketForwarder.
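
The payload type rewrite itself is only a few bit operations on the second byte of the RTP header, where the top bit is the marker and the lower seven bits are the payload type. The sketch below shows the idea; it is not the RTPPacketForwarder from the project, and the PayloadTypeRewriter name, the buffer size and the single-packet forwardOnce method are assumptions made for the example:

```java
import java.io.IOException;
import java.net.DatagramPacket;
import java.net.DatagramSocket;
import java.net.SocketAddress;

/** Hypothetical sketch, not the RTPPacketForwarder class from the project. */
final class PayloadTypeRewriter {

    /** Rewrites the payload type in place; returns true if the packet was changed. */
    static boolean rewrite(byte[] packet, int length, int fromPt, int toPt) {
        if (length < 12) {
            return false; // shorter than the fixed RTP header
        }
        int version = (packet[0] & 0xC0) >>> 6;
        if (version != 2) {
            return false; // not an RTP version 2 packet
        }
        int payloadType = packet[1] & 0x7F;
        if (payloadType != fromPt) {
            return false;
        }
        // Keep the marker bit (0x80), replace the 7-bit payload type
        packet[1] = (byte) ((packet[1] & 0x80) | (toPt & 0x7F));
        return true;
    }

    /** Receives one datagram on 'in' and relays it through 'out' to 'target'. */
    static void forwardOnce(DatagramSocket in, DatagramSocket out, SocketAddress target,
                            int fromPt, int toPt) throws IOException {
        byte[] buffer = new byte[2048]; // assumed to be large enough for one RTP packet
        DatagramPacket received = new DatagramPacket(buffer, buffer.length);
        in.receive(received);
        rewrite(buffer, received.getLength(), fromPt, toPt);
        out.send(new DatagramPacket(buffer, received.getLength(), target));
    }
}
```

In a real forwarder forwardOnce would run in a loop on its own thread for each media stream, and only on the RTP port; RTCP packets use the same byte differently and should be passed through untouched.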

NAT/firewall problems

SIP already has means for NAT traversal, but for RTP it's a different story. There are two cases here: a firewall between the client and MSS (Mobicents Sip Servlets), or a firewall between MSS and the RTSP server. In the first case it would be the responsibility of the SIP client to work around it. In the second case our server will have to implement at least a dumb RTP stack to pass packets from one side of the firewall to the other; a modification to the RTPPacketForwarder would enable this. The stun4j module is added to the project to allow STUN lookup of the addresses if needed. That would once again require extra SDP modification to change the port numbers and addresses. The good news is that this is not required in many cases, because a good number of firewalls act as an application layer gateway, or simply forward certain port ranges to the computer behind the firewall, either automatically or after the router is asked to do it.
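
The SDP modification mentioned above could look roughly like the sketch below, which swaps the connection address and the media ports in an SDP body. The SdpRewriter class is hypothetical and the public address is just a parameter here; in the project it would come from wherever you get it, for example a STUN lookup. It only handles "c=IN IP4" lines and leaves the "o=" line alone:

```java
import java.util.Map;

/** Hypothetical helper, not part of the Mobicents example project. */
final class SdpRewriter {

    /**
     * Replaces the address in every "c=IN IP4" line and the port of each "m=" line
     * whose media type has an entry in newPorts. Other lines are copied unchanged.
     */
    static String rewrite(String sdp, String newAddress, Map<String, Integer> newPorts) {
        StringBuilder out = new StringBuilder();
        for (String line : sdp.split("\r?\n")) {
            if (line.startsWith("c=IN IP4 ")) {
                line = "c=IN IP4 " + newAddress;
            } else if (line.startsWith("m=")) {
                // m=<media> <port> <rest of the line>
                String[] fields = line.substring(2).split(" ", 3);
                Integer port = newPorts.get(fields[0]);
                if (port != null && fields.length == 3) {
                    line = "m=" + fields[0] + " " + port + " " + fields[2];
                }
            }
            out.append(line).append("\r\n");
        }
        return out.toString();
    }
}
```

The same function works in both directions: once for the SDP sent to the phone and once for the ports announced to the RTSP server.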


The example is available in the Mobicents SVN. You will probably need to customize it for your particular SIP phone and RTSP provider, depending on the supported codecs and network topology. By default it is configured for YouTube.

svn co