Fundamentals of LL-DASH and LL-HLS

This guide explains the basic principles behind low-latency streaming with LL-DASH and LL-HLS.

Live Latency

Live latency results from the player maintaining a certain buffer level at all times during playback. This buffer helps prevent playback interruptions when network conditions fluctuate. The target buffer level of our player is set to a fixed value, for example 40 seconds, which requires the playhead to stay at least 40 seconds behind the stream's live edge and therefore results in a playback latency of 40 seconds. Reducing the target buffer level consequently also reduces the latency. However, this approach reaches its limits at latency values of around 10 seconds and below, where the risk of playback interruptions from running out of buffer increases. It is essentially a tradeoff between latency and playback stability.

The DASH and HLS specifications have added methods that optimize how video players acquire segments at the live edge of the stream, so that the player can maintain a buffer level high enough for smooth playback.

Fundamentals of LL-DASH and LL-HLS

The main principle is that the server offers the segment currently being produced (i.e. the live edge segment) to the client early, as partially available while its production is still in progress. This enables the player to load and consume the parts of the segment that are already available while the remainder is still being produced. The content must be encoded as chunked CMAF for the player to be able to consume it in parts.

DASH and HLS achieve this in slightly different ways, which are outlined below:

Basics of LL-DASH

  • The manifest signals that the live edge segment is partially available once its production has started.
  • The server offers the partially available segment through HTTP Chunked Transfer Encoding (CTE). This enables the segment data to be streamed to the client over a single network request while the segment is being produced.
  • The client/player loads the partially available segment using HTTP CTE (via the browser's Fetch API). While the response data is streaming in, complete CMAF chunks are identified and decoded while the segment request is still ongoing.
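
The client-side step above can be illustrated with a minimal TypeScript sketch. It is not the player's actual implementation. It assumes that each CMAF chunk ends with an `mdat` box (usually preceded by a `moof` box) and that all box sizes fit into 32 bits. The names `extractCompleteChunks`, `streamSegment` and `onChunk` are hypothetical.

```typescript
// Result of scanning the bytes received so far.
interface ParseResult {
  chunks: Uint8Array[]; // complete CMAF chunks, ready to be handed to the decoder
  rest: Uint8Array;     // incomplete tail, kept until more data arrives
}

// Hypothetical helper: walks the top-level ISO BMFF boxes in `buffer` and
// cuts a chunk after every fully received `mdat` box (assumption of this sketch).
function extractCompleteChunks(buffer: Uint8Array): ParseResult {
  const view = new DataView(buffer.buffer, buffer.byteOffset, buffer.byteLength);
  const chunks: Uint8Array[] = [];
  let chunkStart = 0;
  let offset = 0;
  while (offset + 8 <= buffer.byteLength) {
    const size = view.getUint32(offset); // box size, 32-bit big-endian
    if (size < 8) {
      // size 0 (box extends to end of file) and size 1 (64-bit size) are not handled here
      throw new Error("Unsupported box size in this sketch");
    }
    if (offset + size > buffer.byteLength) break; // box not fully received yet
    const type = String.fromCharCode(
      view.getUint8(offset + 4), view.getUint8(offset + 5),
      view.getUint8(offset + 6), view.getUint8(offset + 7),
    );
    offset += size;
    if (type === "mdat") {
      chunks.push(buffer.subarray(chunkStart, offset));
      chunkStart = offset;
    }
  }
  return { chunks, rest: buffer.subarray(chunkStart) };
}

// Hypothetical loader: reads one segment response incrementally through the
// Fetch API and reports each complete chunk while the request is still open.
async function streamSegment(url: string, onChunk: (chunk: Uint8Array) => void): Promise<void> {
  const response = await fetch(url);
  if (!response.ok || !response.body) {
    throw new Error(`Segment request failed with status ${response.status}`);
  }
  const reader = response.body.getReader();
  let pending: Uint8Array = new Uint8Array(0);
  for (;;) {
    const { done, value } = await reader.read();
    if (done) break;
    // Append the newly received bytes to the unparsed remainder.
    const merged = new Uint8Array(pending.byteLength + value.byteLength);
    merged.set(pending, 0);
    merged.set(value, pending.byteLength);
    const { chunks, rest } = extractCompleteChunks(merged);
    chunks.forEach((chunk) => onChunk(chunk));
    pending = rest;
  }
}
```

In a browser, `onChunk` would typically append the data to a Media Source Extensions `SourceBuffer`. Keeping the box parsing in a pure function separates it from the network loop, so it can be exercised with synthetic data.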

Basics of LL-HLS

  • Partial Segments: The live edge segment is not offered early but parts of it are offered as individual files (or byte ranges) as they get produced. Each part consists of one or multiple CMAF chunks. The player loads these parts through individual network requests and thereby acquires the data of the segment before it becomes available in full.
  • Preload Hints in the media playlists offer parts before their actual availability. This allows the player to open the network request early and save network latency (time to first byte).
  • Blocking Playlist Reloads are used to optimize part discovery by the player. They allow the player to receive a playlist update exactly when the next part becomes available, which eliminates the timing issues of playlist polling.
  • Playlist Delta Updates are used to reduce network and processing load on the client for streams with long live windows. They allow the player to request a shorter version (a delta update) of the full playlist.
  • Rendition Reports are used to minimize the number of round trips required when switching renditions.
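
As an illustration of how a client might use some of these mechanisms, the following TypeScript sketch builds a playlist request URL with the `_HLS_msn`, `_HLS_part` and `_HLS_skip` delivery directives defined by the HLS specification, and reads the `TYPE` and `URI` attributes of an `EXT-X-PRELOAD-HINT` tag. It is a simplified sketch, not the player's implementation. The function names, the option names and the example URLs are hypothetical, and the server is assumed to advertise support for blocking reloads and delta updates in its playlist.

```typescript
interface ReloadOptions {
  nextMsn: number;   // media sequence number of the segment that contains the awaited part
  nextPart?: number; // zero-based index of the awaited part within that segment
  skip?: boolean;    // true to request a Playlist Delta Update
}

// Hypothetical helper: adds the delivery directives for a blocking playlist
// reload (and optionally a delta update) to the media playlist URL.
function buildBlockingReloadUrl(playlistUrl: string, opts: ReloadOptions): string {
  const url = new URL(playlistUrl);
  url.searchParams.set("_HLS_msn", String(opts.nextMsn));
  if (opts.nextPart !== undefined) {
    url.searchParams.set("_HLS_part", String(opts.nextPart));
  }
  if (opts.skip) {
    url.searchParams.set("_HLS_skip", "YES");
  }
  return url.toString();
}

// Hypothetical helper: extracts TYPE and URI from a preload hint line, or
// returns null if the line is not a usable EXT-X-PRELOAD-HINT tag.
function parsePreloadHint(line: string): { type: string; uri: string } | null {
  const prefix = "#EXT-X-PRELOAD-HINT:";
  if (!line.startsWith(prefix)) return null;
  const attrs = line.slice(prefix.length);
  const type = /(?:^|,)TYPE=([A-Z]+)/.exec(attrs)?.[1];
  const uri = /(?:^|,)URI="([^"]*)"/.exec(attrs)?.[1];
  return type && uri ? { type, uri } : null;
}

// Example with made-up values: wait for part 2 of segment 273 and ask for a delta update.
const reloadUrl = buildBlockingReloadUrl("https://example.com/live/video.m3u8", {
  nextMsn: 273,
  nextPart: 2,
  skip: true,
});
// reloadUrl is "https://example.com/live/video.m3u8?_HLS_msn=273&_HLS_part=2&_HLS_skip=YES"
```

A player would issue the request for the hinted part URI right away and, in parallel, send the blocking playlist request, which the server answers once the playlist contains the requested part.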