HTTP Live Streaming (HLS) delivers video over plain HTTP. As streaming shifted away from proprietary solutions like Flash and RTMP, protocols adopted HTTP for its reliability, and modern HTML5 players now rely on it for streaming.
But HTTP wasn't built for live streaming. It focuses on reliable delivery, not speed. As streaming platforms push for real-time interaction, this creates a fundamental problem: How do we achieve low latency while keeping HTTP's advantages?
HLS breaks video into small HTTP-delivered chunks that play in any HTML5 player. Your browser loads these chunks like any other web content - through the same networks that deliver websites and images. This makes HLS reliable and scalable, but it also introduces delay as chunks move through the delivery chain.
The protocol's design serves two key goals: broad device support and reliable playback. It succeeds at both, but at the cost of added latency, often 20-30 seconds between capture and playback. Unlike real-time protocols, HLS prioritizes consistent playback over immediate delivery.
HLS introduces latency because it breaks video into small chunks (segments) that must be fully downloaded before playback can start. Imagine waiting for a full bucket of water before pouring instead of letting it flow continuously. This segmented approach introduces multiple delays.
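To put rough numbers on it (illustrative, since player defaults vary): with 6-second segments and a player that buffers three segments before starting playback, the buffer alone accounts for about 3 × 6 s = 18 s, before encoding, packaging, and CDN delivery add their share. That is how traditional HLS lands in the 20-30 second range.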
HLS moves video through a structured pipeline, each step building on the last. Think of it like an assembly line where raw video transforms into downloadable chunks.
The first step is encoding. The encoder receives raw live video as a continuous stream of frames and compresses this feed in real time, turning bulky video data into efficient H.264 or HEVC formats.
Next comes segmentation. Compressed video is split into playable chunks, typically as MPEG-2 Transport Stream (*.ts) files. Each segment contains a few seconds of video and audio, packaged together in a format players can understand. These segments form the basic building blocks of HLS delivery.
Manifest files track these segments. The master manifest (*.m3u8) lists different quality versions of your stream. Child manifests detail the segments within each quality level. As new segments are created, these manifests update to reflect the latest content. It's a constant process of listing, updating, and expiring old segments.
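As a sketch (the filenames, bitrates, and sequence numbers here are illustrative), a master manifest and one of its child manifests look roughly like this:

```
#EXTM3U
#EXT-X-STREAM-INF:BANDWIDTH=2800000,RESOLUTION=1280x720
720p/index.m3u8
#EXT-X-STREAM-INF:BANDWIDTH=5000000,RESOLUTION=1920x1080
1080p/index.m3u8
```

```
#EXTM3U
#EXT-X-VERSION:3
#EXT-X-TARGETDURATION:6
#EXT-X-MEDIA-SEQUENCE:120
#EXTINF:6.0,
segment120.ts
#EXTINF:6.0,
segment121.ts
#EXTINF:6.0,
segment122.ts
```

The #EXT-X-MEDIA-SEQUENCE value ticks upward as old segments expire from the playlist - the listing, updating, and expiring cycle in action.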
Players work through this chain from the top down: they start with the master manifest, choose a quality level, fetch the child manifest, and finally download segments. Each step requires its own HTTP request.
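In practice, the startup sequence looks something like this (paths illustrative):

```
GET /live/master.m3u8           → pick a rendition based on available bandwidth
GET /live/720p/index.m3u8       → read the current segment list
GET /live/720p/segment121.ts    → download, buffer, decode
GET /live/720p/segment122.ts    → keep the buffer filled
```

Each round trip adds its own slice of delay, which is why the request chain itself contributes to HLS latency.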
Different streaming protocols tackle the latency challenge through distinct approaches. Each makes specific trade-offs between speed, scale, and complexity.
Before comparing HLS with the alternatives, it's worth noting that FastPix's video API delivers over HLS for exactly this reason: broad compatibility across devices makes it a reliable choice for adaptive bitrate streaming.
WebRTC takes the direct approach. By establishing peer-to-peer connections over User Datagram Protocol (UDP), it achieves sub-second latency. Video flows directly between participants without server-side packaging or buffering.
This works great for video calls but struggles at scale. Each new viewer needs its own connection, which quickly overwhelms servers as the audience grows. Maintaining quality across varying network conditions is also challenging, since peer-to-peer connections are sensitive to bandwidth fluctuations.
For interactive experiences like video conferencing and online gaming, WebRTC is the clear choice. It enables real-time data channels with a latency of 200-500ms.
DASH mirrors many HLS concepts but adds flexibility. Its design supports various media formats and delivery methods. Recent DASH-LL (DASH – Low Latency) implementations achieve 3-6 second latency through chunk-based delivery and HTTP/2 push.
HTTP/2 push is a feature that allows the server to send multiple pieces of content to the player proactively without waiting for the player to request each one. This means that when a player connects to the server, it can receive not just the initial video segment but also additional segments or resources that it is likely to need soon.
For broadcast-scale streaming with millions of viewers, DASH with CMAF works best, utilizing existing HTTP infrastructure and supporting standard DRM.
CMAF is a media format that simplifies video delivery, while DASH is a streaming protocol for adaptive bitrate streaming. When used together, CMAF provides a standardized way to package media content, which DASH can then stream efficiently.
SRT prioritizes reliable delivery. It adds error correction and congestion control on top of UDP, producing robust streams that handle network problems gracefully. This makes it a strong fit for sending video between production points or to distribution servers.
However, limited player support restricts its use for direct viewer delivery because not all media players are compatible with SRT. For live video contributions, SRT shines by handling unstable networks with excellent security, achieving a latency of 400ms-1s.
The choice between protocols depends on your specific needs. Each excels in particular scenarios: WebRTC for sub-second interactivity, DASH with CMAF for broadcast scale, SRT for contribution over unreliable networks, and HLS when device reach and reliability matter most.
HLS latency affects stream quality beyond simple delay. When latency increases, video segments pile up in the buffer, using more device memory and processing power. This forces players to choose between smooth playback and staying close to live content.
With higher latency, the player's adaptive bitrate (ABR) algorithm becomes less effective. By the time the algorithm detects network congestion and switches to a lower bitrate, the viewer has already buffered several high-bitrate segments. This leads to more severe quality drops instead of smooth transitions.
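As an illustrative example: a player holding 18 seconds of 5 Mbps video has already committed roughly 5 Mbps × 18 s ≈ 11 MB of high-bitrate content. When congestion hits, all of it must either play out or be discarded and re-downloaded before the lower-bitrate switch takes effect.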
High latency also affects error recovery. When network errors occur, players must reload the manifest, download new segments, and rebuild the buffer. With a 30-second latency, viewers miss significant content during recovery. In comparison, low-latency streams can recover faster and skip less content.
Traditional HLS wasn't designed for live streaming. Its 20-30 second latency became a major limitation as live streaming grew. Platforms needed faster delivery for sports, gaming, and interactive content.
Early attempts to reduce latency focused on shorter segments, reducing segment duration from 6 seconds to 2 seconds. However, this led to new challenges, such as increased server load and players struggling with frequent requests, while the fundamental HLS architecture still imposed delays.
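The arithmetic shows why this approach hit a ceiling: a player that buffers three segments still waits roughly 3 × 2 s = 6 s on the buffer alone, plus encoding and delivery time. Shorter segments trimmed latency but couldn't get near real time.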
CDN providers attempted to optimize edge delivery and reduce cache times, which helped but didn't solve the core issue: HLS's segment-based design required multiple segments in the buffer before playback could start.
Apple launched LL-HLS in 2019 as a significant upgrade to the HLS protocol. This enhancement redefined how HLS manages live streaming, aiming for sub-2-second latency while preserving the scalability and reliability that HLS is known for.
LL-HLS brought three main features: partial segment delivery, blocking playlist reloads, and preload hints.
These changes maintain backward compatibility. Older players can still play LL-HLS streams using traditional methods, while newer players use the low-latency features.
The core innovation in LL-HLS is partial segment delivery. Instead of waiting for 6-second segments, the encoder creates small chunks called "parts" - typically 200-500ms each. Players can start playback as soon as they receive the first few parts.
The manifest structure changes to support this:
```
#EXT-X-PART:DURATION=0.33334,URI="filePart1.mp4"
#EXT-X-PART:DURATION=0.33334,URI="filePart2.mp4"
```
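In a full low-latency playlist, these part entries sit alongside tags that advertise the server's low-latency capabilities. A minimal sketch, with illustrative values (PART-HOLD-BACK is typically about three part durations):

```
#EXT-X-SERVER-CONTROL:CAN-BLOCK-RELOAD=YES,PART-HOLD-BACK=1.0
#EXT-X-PART-INF:PART-TARGET=0.33334
#EXTINF:2.0,
fileSequence100.mp4
#EXT-X-PART:DURATION=0.33334,URI="filePart1.mp4"
```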
Blocking playlist reload transforms manifest updates. Instead of polling on a timer, the player requests the next playlist update in advance over a persistent HTTP connection, and the server holds that request open until a new part is available. This eliminates polling delay and reduces server load.
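The spec implements this with delivery directives: query parameters that name the media sequence number and part the player is waiting for (path illustrative):

```
GET /live/720p/index.m3u8?_HLS_msn=121&_HLS_part=2
```

The server blocks this request until part 2 of segment 121 is published, then responds immediately with the updated playlist.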
Players can request the next part before it's complete, reducing gaps between parts during playback. The preload hints feature helps players prepare for upcoming segments:
```
#EXT-X-PRELOAD-HINT:TYPE=PART,URI="nextPart.mp4"
```
Managing HLS latency involves careful attention to both encoding and delivery methods. Focusing on synchronization and optimizing settings can improve the responsiveness of your stream.
Want to make video streaming easier? Whatever your latency needs may be, FastPix gets it done. Our video API supports on-demand video and live streaming, making it easy to deliver great content. With features like in-video AI and a customizable player, you can create a viewing experience your audience will enjoy.
Our media studio also simplifies content management. Whether you're streaming live events or sharing videos on demand, we’ve got the tools to help you get it done. Let’s turn your video ideas into reality.
Sign up today and start streaming with FastPix!
Latency in HLS streaming mainly results from segment size, buffering, and network conditions. HLS typically uses chunked segments that introduce delays, as each segment must load before playing. Reducing latency is crucial, especially for live sports, gaming, and interactive streams, where real-time engagement is key. Minimizing latency ensures viewers experience content close to real-time, enhancing engagement and reducing lag-induced frustrations.
Key techniques include using Low-Latency HLS (LL-HLS), which allows smaller, faster segments and reduces latency to as low as 2-5 seconds. LL-HLS relies on smaller GOPs (group of pictures) and CMAF (Common Media Application Format) to speed up data delivery. Implementing adaptive bitrate streaming and leveraging edge computing or CDN services can further reduce latency by serving data from locations closer to viewers.
LL-HLS optimizes traditional HLS by enabling real-time segment transmission, allowing segments to be partially delivered, and reducing buffering. While standard HLS segments introduce delays, LL-HLS segments are encoded and transmitted in smaller parts, significantly reducing the delay to just a few seconds. This change makes LL-HLS more suitable for interactive or time-sensitive applications like auctions, Q&As, or gaming.
Network infrastructure, such as CDNs and edge computing, plays a critical role. CDNs reduce latency by caching content on servers closer to the end users, minimizing data travel. Edge computing places data processing closer to the source, reducing latency in interactive streams. For broadcasters, optimizing network routes and using dedicated streaming servers can further improve latency.
A practical latency target for HLS is around 2-5 seconds for a near real-time experience. To maintain this, monitor streaming metrics, optimize encoder settings, and adjust buffer sizes as needed. Regularly testing network conditions and ensuring adaptive bitrate streaming is enabled can also help maintain consistent low latency for different viewing environments.