R&D

18 Mar

Recipe for Fast Channel Zapping in IPTV Deployments

As I mentioned in my previous posts, I want to outline one of our ideas for improving the quality of experience of television over IP networks: how to match the fast channel zapping of old analogue television while staying true to open standards by remaining within the MPEG framework.

Since Industria is very true to its vision of enabling a genuine entertainment experience on devices such as televisions, we want to do what we can to solve the challenge of clumsy channel changing in a typical IPTV environment.

As previously posted, I am very happy with our blazing fast and intuitive user interface. It means that we are doing our part as well as we can. Nevertheless, IPTV’s current inability to change between channels in a timely fashion is, in our minds, one of the single biggest problems of our industry.

As we know, channel zapping is mostly outside the scope of IPTV middleware solutions like Zignal, but there is a role we can play in improving it. One solution we have been toying with and brainstorming about for the last three years at Industria is the idea of unicasting to the set-top box during the zapping period and merging seamlessly with the multicast stream shortly afterwards. (I have sometimes called this an Intra Coded Frames on Demand service.)

In this blog I discuss one potential way to solve and implement this. It’s a bit technical, as I rely massively on input from my fellow engineers.

The improvement is only possible on a network with a return channel, unicast delivery and a significant amount of spare bandwidth on the access line. It is not possible, for example, on traditional DVB-S, DVB-C and DVB-T networks. If implemented, this improvement gives IPTV service providers an advantage in service quality, compared to other IPTV service providers. The improvement can be applied equally well to MPEG-2 and H.264 streams.

I give credit to my fellow engineers (Boyan and Baldur) for helping conceptualize this. Now I hope that some other engineers or entrepreneurs out there can take this even further and get this implemented in the real world.

First, let’s outline three different perspectives on the scenario of this implementation:

  1. The IPTV Set-top Box Perspective
  2. The Network Perspective
  3. The GOP Server Perspective

Below is a more detailed analysis of these three perspectives.

1.) IPTV Set-top Box Perspective (Zignal)

The IPTV set-top box scenario is best explained through the following steps:

  • Step 1: User presses button on remote.
  • Step 2: STB leaves previous channel and resets decoder. In a properly configured network this is a very fast operation taking not more than 10 ms. This should not be a blocking operation for the STB software, i.e. no waiting involved.
  • Step 3: STB opens UDP socket.
  • Step 4: STB software notifies GOP Server. Message includes multicast group and port.
  • Step 5: STB joins multicast group (sends IGMPv2 membership report).
  • Step 6: For at most 500ms, traffic from both the GOP server and the multicast group arrives at the STB. Both flows are destined for the same UDP port: traffic from the GOP server is unicast, the rest is multicast. The STB should be able to receive both types of traffic on a single socket and to deal with duplicate and out-of-order traffic. To do this reliably, RTP transport is preferred.
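
Steps 3 to 5 can be sketched with Python's standard socket module. This is only an illustration, not Zignal code: the 6-byte request layout, the function names and the addresses in the usage note are assumptions made for the example.

```python
import socket
import struct


def build_zap_request(group: str, port: int) -> bytes:
    """Hypothetical request: multicast group (4 bytes) + UDP port (2 bytes)."""
    return socket.inet_aton(group) + struct.pack("!H", port)


def build_mreq(group: str, local_ip: str = "0.0.0.0") -> bytes:
    """Membership request for IP_ADD_MEMBERSHIP: group + local interface."""
    return socket.inet_aton(group) + socket.inet_aton(local_ip)


def zap(group: str, port: int, gop_server: tuple) -> socket.socket:
    """Open the socket, notify the GOP server, then join the multicast group."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)   # step 3
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    # Binding to the "any" address lets one socket receive both the unicast
    # burst and the multicast stream, provided both use the same port.
    sock.bind(("0.0.0.0", port))
    # Step 4: a single small datagram; sendto() does not wait for a reply.
    sock.sendto(build_zap_request(group, port), gop_server)
    # Step 5: the kernel sends the IGMP membership report on our behalf.
    sock.setsockopt(socket.IPPROTO_IP, socket.IP_ADD_MEMBERSHIP,
                    build_mreq(group))
    return sock
```

A call such as `zap("239.1.1.1", 5004, ("10.0.0.5", 9000))` (made-up addresses) would return a socket on which both the burst and the multicast stream arrive. Which IGMP version is used is decided by the operating system, not by this code.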

Notes to the above scenarios:

  1. Notification as explained in step 4 is done through a UDP message and is not a blocking procedure for the STB software. If UDP is not possible, a persistent TCP session would do the job as well, because the message would fit in a single MSS.
  2. The STB would not bind the UDP socket to the multicast group address; it would bind to 0.0.0.0 (any local address) instead. This means that the STB would be able to receive unicast and multicast traffic on the same socket. Even if the current implementation does not bind to an “any” address, this would be a minor change to the STB software.
  3. The MPEG-TS layer or MPEG-2 video decoder must be able to deal with repeated and out-of-order 188-byte MPEG-TS packets, eliminating them by time stamp or sequence number. The easiest way to do this is to use RTP rather than plain UDP for sending data.
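
As a rough illustration of note 3, the sketch below merges the packets collected during the overlap window using the 16-bit RTP sequence number. It assumes the GOP server forwards the original RTP packets unchanged, so that a packet carries the same sequence number whether it arrives by unicast or multicast; the function names are made up for the example.

```python
import struct

RTP_HEADER_LEN = 12  # fixed header, assuming no CSRC entries or extensions


def rtp_seq(packet: bytes) -> int:
    """Return the 16-bit RTP sequence number (bytes 2-3, network order)."""
    return struct.unpack_from("!H", packet, 2)[0]


def signed_distance(seq: int, ref: int) -> int:
    """Distance from ref to seq in -32768..32767, tolerating wrap-around."""
    return ((seq - ref + 0x8000) & 0xFFFF) - 0x8000


def merge_window(packets: list) -> list:
    """Drop duplicates and return the payloads in sequence order."""
    if not packets:
        return []
    ref = rtp_seq(packets[0])
    unique = {}
    for pkt in packets:
        # Keep the first copy of each sequence number, whichever path
        # (unicast burst or multicast stream) delivered it.
        unique.setdefault(signed_distance(rtp_seq(pkt), ref), pkt)
    return [unique[d][RTP_HEADER_LEN:] for d in sorted(unique)]
```

Ordering by distance from a reference packet, rather than by the raw number, keeps the result correct when the sequence number wraps from 65535 to 0 in the middle of the window.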

2.) Network Perspective

  • Step 2: IGMP Leave. IGMP fast-leave kicks in. Multicast group left.
  • Step 4: Single UDP message or TCP segment. Forwarded to GOP server.
  • Step 5: IGMP membership report. Join PIM group. SPT switch-over. Deliver Multicast packets.
  • Step 6: Traffic from multicast group flows through, together with traffic from GOP server. To avoid interface overload when zapping, traffic from GOP server should be treated as less-than-best-effort (LBE).

3.) GOP Server Perspective

The GOP Server keeps a copy of the current incomplete GOP in its memory. In a small setup, a GOP server would listen to a lot of multicast groups.

  • Step 4: The GOP server receives a request.
  • Step 6: The GOP server sends a shaped burst of UDP packets covering the stream from the beginning of the GOP to the current time plus a constant. To avoid interface queue overload when zapping, traffic from the GOP server should be treated as less-than-best-effort (LBE). The traffic is also shaped to a predefined bandwidth. For a 4Mbps stream, the size of the burst can be as big as 250KB.
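
A minimal sketch of such a shaped, low-priority burst is shown below. It makes several assumptions: pacing is done in user space with `time.sleep`, the 15Mbps default is taken from the timeline assumptions further down, and the CS1 code point stands in for LBE. CS1 (DSCP 8) has commonly been used for lower-effort traffic, but which marking the network actually treats as LBE is the operator's decision.

```python
import socket
import time

CS1_DSCP = 8  # assumed LBE marking; depends on the operator's QoS policy


def dscp_to_tos(dscp: int) -> int:
    """The DSCP occupies the upper six bits of the IP TOS byte."""
    return dscp << 2


def pacing_delay(sent_bytes: int, rate_bps: int) -> float:
    """Seconds after burst start at which the next datagram may be sent."""
    return sent_bytes * 8 / rate_bps


def send_burst(datagrams: list, dest: tuple,
               rate_bps: int = 15_000_000) -> None:
    """Send cached datagrams to one STB, paced to rate_bps and marked CS1."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    sock.setsockopt(socket.IPPROTO_IP, socket.IP_TOS, dscp_to_tos(CS1_DSCP))
    start = time.monotonic()
    sent_bytes = 0
    for data in datagrams:
        # Sleep until the bytes already sent fit within the shaped rate.
        wait = start + pacing_delay(sent_bytes, rate_bps) - time.monotonic()
        if wait > 0:
            time.sleep(wait)
        sock.sendto(data, dest)
        sent_bytes += len(data)
    sock.close()
```

With these numbers a 250KB burst takes about 133ms to send. A production server would more likely rely on kernel or hardware traffic shaping than on sleeping in a loop.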

Notes on the GOP Server:

  • The server needs to cache all the relevant data of the current incomplete GOP. This is easily accomplished by parsing the transport stream and reading the MPEG sequence and GOP headers when they arrive. No further processing or decoding of the stream needs to be done on the server.
  • There is no need for decryption or interaction with the CA/DRM system, as the encryption is actually scrambling, and it is done in the MPEG elementary streams; transport stream tables are clear text. The STB handles the descrambling by itself. The only interaction needed is with the middleware: the middleware needs to send a request to the GOP cache server as fast as possible, and the server will flood the current GOP to the unicast IP address of the STB. That solution is simple and will work with most IPTV headends. Our major issue is the time from the button press on the remote to the arrival of the GOP. As Zignal currently does not support OOB signalling or UDP messaging directly from the STB, either TCP or “through-Zignal-Server” signalling has to be implemented. Because of the TCP architecture, we will have at least 2 RTTs between the STB and the GOP server, plus twice the OS user-process-to-kernel jitter and delay (which is very low in Linux). This means that it will take something close to 2 * RTT (if it is 10ms - 20ms) to notify the GOP server. If the messaging is passed through the Zignal server, the delay will at least double (provided the Zignal server architecture does not introduce delay of its own). Notifications could also be transmitted by UDP from the set-top box to the GOP server, with the addition of a lightweight client on the box that interfaces with the middleware.
  • The current incomplete GOP will be transmitted by unicast to the STB. There can be an issue with the buffer in the STB: imagine a situation where only the last frames of a GOP remain to be played. We send the full GOP, then the second GOP arrives, and the STB has to receive both GOPs or the picture will freeze. This can be addressed on the set-top box, either by having a buffer large enough for two whole GOPs or by implementing a selective buffer purge. It is hard to address on the server side, because the network latency can vary. I believe that this solution can be easily developed and will shorten the channel change time significantly.
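
The caching described in the first note could look roughly like the sketch below, with one cache per channel. It is a simplification under stated assumptions: an MPEG-2 stream in which a sequence header precedes every GOP, a start code that never straddles two TS packets, headers that are readable on the server, and no PID filtering. The class and function names are invented for the example, and an H.264 stream would need a different start-of-GOP test.

```python
TS_PACKET_LEN = 188
SYNC_BYTE = 0x47
SEQUENCE_HEADER = b"\x00\x00\x01\xb3"  # MPEG-2 video sequence header code


def starts_gop(ts_packet: bytes) -> bool:
    """Heuristic: the packet starts a PES packet and has a sequence header."""
    if len(ts_packet) != TS_PACKET_LEN or ts_packet[0] != SYNC_BYTE:
        return False
    payload_unit_start = bool(ts_packet[1] & 0x40)
    return payload_unit_start and SEQUENCE_HEADER in ts_packet[4:]


class GopCache:
    """Holds every TS packet since the most recent GOP start of a channel."""

    def __init__(self) -> None:
        self.packets = []

    def feed(self, ts_packet: bytes) -> None:
        """Call for each TS packet received from the multicast group."""
        if starts_gop(ts_packet):
            self.packets = []  # a new GOP begins: discard the old one
        self.packets.append(ts_packet)

    def burst(self) -> bytes:
        """The data to send to an STB that has just zapped to this channel."""
        return b"".join(self.packets)
```

Nothing is decoded here: the server only looks for a start code and otherwise copies packets through, in line with the note above.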

Timeline | Assumptions

  • Button-press to IGMP leave = 200 ms (rather slow STB reaction)
  • IGMP leave = 10 ms
  • Decoder reset (a blocking operation) = 50 ms
  • Decoder start-up, once the jitter buffer is full = 50 ms
  • IGMP join = 10 ms
  • Network delay = 10 ms
  • Stream bandwidth = 4Mbps
  • Stream is shaped, i.e. an I frame takes much more time to transmit than other frames
  • GOP server burst shape = 15Mbps
  • GOP server guard time = 20ms
  • Join time offset = average case (250ms after beginning of GOP transmission)
  • Jitter buffer threshold in STB = 2Mb (250KB)

Timeline list | Without the improvement

  1. 0ms - User presses remote button
  2. 200ms (+200) - IGMP leave, decoder reset start
  3. 210ms (+10) - last Multicast packet from previous stream arrives
  4. 250ms (+50 from nr.2) - decoder reset end, Open UDP socket, IGMP join
  5. 260ms (+10) - first multicast packet from new stream arrives
  6. 510ms (+250) - GOP begins
  7. 1010ms (+500) - jitter buffer threshold reached, decoder start
  8. 1060ms (+50) - First picture displayed

Timeline list | With improvement

  1. 0ms - User presses remote button
  2. 200ms (+200) - IGMP leave, decoder reset start
  3. 210ms (+10) - last Multicast packet from previous stream arrives
  4. 250ms (+50 from nr.2) - decoder reset end, Open UDP socket, signal GOP server, IGMP join
  5. 260ms (+10) - first multicast packet from new stream arrives. Request arrives at GOP server, GOP burst begins at server
  6. 270ms (+20 from nr.4) - GOP burst starts arriving at STB (20ms roundtrip to GOP server). The burst is 3.08Mb (exactly 1 GOP + 250ms random access time + 20ms guard time), which takes 206ms at 15Mbps. The first GOP (2Mb) will arrive in 134ms
  7. 404ms (+134) - Jitter buffer threshold reached, decoder start
  8. 454ms (+50 from nr.7) - First picture displayed
  9. 476ms (+206 from nr.6) - GOP burst ends, 20ms overlap with multicast

GOP burst example 1

Timeline Discussion

In the above example the proposed improvement reduces actual channel zapping time by 606ms. It is worth noting that if STB-associated delays (reaction, decoder reset, decoder start) were zero, the picture would appear only 154ms after the remote button press (20ms roundtrip + 134ms for first GOP and buffer threshold), which is about 4 PAL frames at 40ms per frame. This is very close to an analogue TV experience.
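
For readers who want to check the arithmetic, the short script below recomputes both timelines from the listed assumptions. The 500ms GOP length is not stated explicitly; it is inferred from the 2Mb buffer threshold at 4Mbps. Transmission times are rounded up to whole milliseconds, as in the lists above.

```python
def ceil_ms(bits: int, rate_bps: int) -> int:
    """Transmission time in whole milliseconds, rounded up."""
    return -(-bits * 1000 // rate_bps)


STREAM_BPS = 4_000_000
BURST_BPS = 15_000_000
GOP_MS = 500             # inferred: one GOP equals the 2Mb buffer threshold
REACTION_MS = 200        # button press to IGMP leave
RESET_MS = 50            # blocking decoder reset
NETWORK_MS = 10          # one-way network delay
JOIN_OFFSET_MS = 250     # average wait for the next GOP
GUARD_MS = 20
DECODER_START_MS = 50

gop_bits = STREAM_BPS * GOP_MS // 1000

# Without the improvement: wait for the next GOP to begin, then fill the
# jitter buffer at the stream rate.
first_packet = REACTION_MS + RESET_MS + NETWORK_MS
without = (first_packet + JOIN_OFFSET_MS
           + ceil_ms(gop_bits, STREAM_BPS) + DECODER_START_MS)

# With the improvement: the burst starts arriving one round trip after the
# request and fills the buffer at the burst rate.
burst_start = REACTION_MS + RESET_MS + 2 * NETWORK_MS
with_burst = burst_start + ceil_ms(gop_bits, BURST_BPS) + DECODER_START_MS
burst_bits = gop_bits + STREAM_BPS * (JOIN_OFFSET_MS + GUARD_MS) // 1000
burst_end = burst_start + ceil_ms(burst_bits, BURST_BPS)

print(without, with_burst, without - with_burst, burst_end)
# prints: 1060 454 606 476
```

Changing `BURST_BPS` or `JOIN_OFFSET_MS` shows how sensitive the gain is to the spare bandwidth on the access line and to where in the GOP the user happens to zap.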

GOP burst example 2

A solution like this could be quite a natural extension for companies within the video server field. GOP Server functionality would be added as an additional service. The business case for service providers to add servers to enable a fast channel zapping experience for their customers could easily be calculated and would have similarities to VOD network capability cost calculations. More servers can be added on demand, based on what level of experience the service provider would like to enable for its subscribers.

I could also foresee a project like this being implemented as an open source cooperation project between Industria, several vendors and service providers, in terms of fixing one of our main challenges in enabling a true television experience over IP networks. Please get in touch with us if you are interested in this.

One Comment

  1. varun
    March 19, 2008 at 05:27

    Hi.
    this is a very interesting post. I am curious to know how you measure the various time durations. I would love to have a detailed explanation. :)
