Media over QUIC (MoQ): The Next-Generation Protocol for Live Streaming
Every engineer who has built a live streaming system has eventually arrived at the same uncomfortable conversation with their product team: "We can have sub-second latency, or we can have millions of concurrent viewers. Pick one."
This is the streaming trilemma: the three-way trade-off between latency, scale, and architectural simplicity that has constrained live media delivery for the better part of two decades. WebRTC gave us the speed of a video call but crumbled under broadcast-scale fan-out. HLS and DASH gave us global distribution but punished viewers with 5–30 second delays. Every other protocol, such as SRT, RTMP, and LL-HLS, patched one side of the triangle while quietly making the others worse.
Media over QUIC (MoQ) is the IETF's attempt to collapse that triangle entirely. It is a pub/sub transport protocol built on QUIC (the same foundation as HTTP/3) designed from scratch to deliver sub-second latency at CDN scale, without the architectural complexity that has made WebRTC painful to deploy at broadcast volumes. Whether it succeeds is still an open question, but the organisations placing bets on it (Cloudflare, Google, Meta, Cisco, and Akamai among them) suggest this is not another niche protocol experiment.
This is a deep dive into what MoQ is, how it compares to the protocols it may eventually replace, who is building it, and what the production readiness picture looks like in April 2026.
Table of contents
- The streaming trilemma
- What is media over QUIC?
- MoQ vs WebRTC: coexistence, not replacement
- Who is building MoQ?
- Production readiness status
- What MoQ means for video platforms
- FAQ
The streaming trilemma
To understand why MoQ matters, it helps to understand the problem it is solving, and why every previous solution has required an uncomfortable compromise.
WebRTC: real-time, but not really scalable
WebRTC was designed for video conferencing. It delivers sub-100ms latency through direct peer-to-peer connections using UDP-based RTP streams. For a two-person call, or a small group meeting, this is exactly the right architecture.
The problem is that WebRTC's peer-to-peer model does not scale to broadcast. When 100,000 people tune into a live event, you cannot establish 100,000 individual peer connections from the broadcaster. Selective Forwarding Units (SFUs) help by relaying streams server-side, but even SFU-based architectures require significant per-connection state management that makes large-scale fan-out expensive, operationally complex, and prone to cascading failure under unexpected load spikes.
WebRTC also carries substantial connection overhead: ICE negotiation, STUN/TURN traversal, DTLS handshake, and a signalling layer that must run before a single frame of media reaches the viewer. This machinery is appropriate for conferencing, where bidirectional communication justifies the complexity. For one-to-many live streaming, it is significant overhead for a use case that does not need it.
HLS and DASH: scalable, but slow
Apple's HTTP Live Streaming and MPEG-DASH solved the scalability problem elegantly. By encoding video into short segments and delivering them over HTTP, these protocols leveraged the global CDN infrastructure that already existed for web content. A live stream on HLS can serve millions of simultaneous viewers with no special infrastructure beyond a standard CDN.
The cost is latency. Classic HLS operates with a latency of 20–45 seconds. Low-Latency HLS (LL-HLS) and Low-Latency DASH improved this considerably (to roughly 2–4 seconds under ideal conditions), but achieving that in practice requires careful tuning of segment duration, CDN configuration, and player buffering behaviour. The inherent polling model of HTTP means players must repeatedly request new segments, introducing round-trip overhead that makes sub-second delivery practically impossible.
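The latency gap can be made concrete with a rough budget. The sketch below assumes a player buffers a fixed number of whole segments (or partial segments, for the low-latency variants) before it starts playback, and adds encode and CDN delays on top. The function name and every figure in it are illustrative assumptions, not measurements from any particular deployment.

```python
def segmented_latency_s(segment_s: float, buffered_segments: int,
                        encode_s: float, cdn_s: float) -> float:
    """Rough glass-to-glass latency for a segment-based protocol.

    Assumes the player waits for `buffered_segments` complete segments
    before starting playback (a simplification of real player behaviour).
    """
    return encode_s + cdn_s + segment_s * buffered_segments


# Hypothetical classic HLS setup: 6 s segments, three buffered segments.
classic = segmented_latency_s(segment_s=6.0, buffered_segments=3,
                              encode_s=1.5, cdn_s=0.5)

# Hypothetical low-latency setup: 0.5 s parts, three buffered parts.
low_latency = segmented_latency_s(segment_s=0.5, buffered_segments=3,
                                  encode_s=0.5, cdn_s=0.5)

print(classic, low_latency)
```

Under these assumptions the segment buffer dominates the budget in both cases, which is why shrinking segments helps but cannot remove the floor that a request-per-segment model imposes.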
MoQ vs HLS: why this comparison matters
The MoQ vs HLS framing is where the stakes become clearest. HLS's scale is real and proven. The question MoQ is answering is: can you have that scale without the latency penalty? The early evidence suggests yes, but with the significant caveat that the ecosystem maturity of HLS (a decade-old standard with universal toolchain support) is not something MoQ can replicate overnight.
What is media over QUIC?
Media over QUIC is a publish-subscribe transport protocol being developed by the IETF's MoQ Working Group. The latest version of the core transport specification, draft-ietf-moq-transport-17, was published on 2 March 2026, with co-editors from Cisco, Google, and Meta.
The QUIC foundation
QUIC is a multiplexed, encrypted transport protocol originally developed by Google and now standardised as RFC 9000. It operates over UDP, avoiding TCP's head-of-line blocking while still providing the reliability, ordering, and congestion control that applications need. HTTP/3 runs over QUIC, which means the protocol already underpins a substantial fraction of global web traffic.
The QUIC properties that make it attractive for media are:
- Multiplexed streams with independent delivery: a dropped packet on one stream does not block delivery on others, unlike TCP
- Partial reliability: a QUIC sender can abandon data rather than retransmit it (by resetting a stream, or by using the unreliable datagram extension in RFC 9221), which is the correct trade-off for live media where a stale video frame is worse than a missing one
- Built-in encryption: TLS 1.3 is baked into QUIC at the transport layer, not layered on top
- Connection migration: QUIC connections survive IP address changes, which matters for mobile clients
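The first property in the list above is easiest to see in a toy model. The sketch below is not a QUIC implementation: it simply compares when each packet becomes readable by the application if ordering is enforced across the whole connection (TCP-like) versus per stream (QUIC-like). The `Packet` class, the stream names, and the timings are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Packet:
    stream: str      # logical stream the payload belongs to
    arrival_ms: int  # when the bytes reach the receiver (after any retransmit)


def delivery_times(packets, per_stream_ordering):
    """Return the time at which each packet becomes readable by the app.

    `packets` is listed in send order. A packet is readable only once every
    earlier packet sharing its ordering domain has arrived.
    """
    ready_at = {}  # ordering domain -> latest arrival seen so far
    out = []
    for p in packets:
        key = p.stream if per_stream_ordering else "connection"
        ready_at[key] = max(ready_at.get(key, 0), p.arrival_ms)
        out.append(ready_at[key])
    return out


# Send order: video, audio, video, audio. The first video packet is lost
# and its retransmission only arrives at 120 ms.
packets = [Packet("video", 120), Packet("audio", 11),
           Packet("video", 12), Packet("audio", 13)]

tcp_like = delivery_times(packets, per_stream_ordering=False)
quic_like = delivery_times(packets, per_stream_ordering=True)
print(tcp_like)
print(quic_like)
```

In the connection-wide case the lost video packet holds back the audio as well; with per-stream ordering only the video stream waits for the retransmission.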
The pub/sub model and relay architecture
The MOQT transport layer sits above QUIC (or WebTransport, for browser environments) and defines a publish-subscribe model for media delivery. Rather than the request-response model of HTTP or the peer-to-peer model of WebRTC, MOQT uses an architecture built around publishers, subscribers, and relays.
A publisher (the live encoder or broadcaster) connects to a relay and publishes tracks. Subscribers (viewers) connect to relays and subscribe to tracks. The relay network handles fan-out: a single published stream can be distributed to any number of subscribers without the publisher needing to maintain individual connections to each one.
Publisher → [Relay] → [Relay] → [Relay] → Subscriber
↘ [Relay] → Subscriber
↘ [Relay] → [Relay] → Subscriber
↘ Subscriber
This relay network model is architecturally analogous to how CDNs distribute HTTP content (edge nodes subscribe to content from origin nodes on demand), but operates at the transport layer rather than the HTTP layer, with the latency benefits that entails.
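The fan-out behaviour described above can be sketched in a few lines. This is a toy in-memory model, not the MOQT wire protocol: the `Relay` class, its methods, and the callback-based delivery are all assumptions made for illustration. The point it shows is that a relay opens a single upstream subscription per track, no matter how many downstream subscribers it serves.

```python
from collections import defaultdict


class Relay:
    """Toy relay: one upstream subscription per track, many subscribers."""

    def __init__(self, name, upstream=None):
        self.name = name
        self.upstream = upstream
        self.subscribers = defaultdict(list)  # track name -> callbacks
        self.upstream_subscriptions = 0

    def subscribe(self, track, callback):
        first = track not in self.subscribers
        self.subscribers[track].append(callback)
        # Only the first local subscriber triggers a subscription upstream.
        if first and self.upstream is not None:
            self.upstream_subscriptions += 1
            self.upstream.subscribe(track, lambda obj: self.publish(track, obj))

    def publish(self, track, obj):
        # Forward the object to every subscriber of this track.
        for callback in list(self.subscribers.get(track, [])):
            callback(obj)


origin = Relay("origin")
edge = Relay("edge", upstream=origin)

received = []
for viewer in range(3):
    edge.subscribe("video", lambda obj, v=viewer: received.append((v, obj)))

origin.publish("video", "frame-1")
print(received, edge.upstream_subscriptions)
```

Three viewers receive the frame, yet the origin sees only one subscriber (the edge relay), which is the property that keeps publisher load flat as the audience grows.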
Tracks, groups, and objects
MOQT defines a three-level data model:
Objects are the base unit, namely arbitrary byte sequences with headers containing metadata. A video frame, an audio segment, a subtitle cue, or any other piece of time-aligned data is an object.
Groups are ordered sequences of objects, typically bounded by video keyframes (I-frames). A group can be independently decoded, so subscribing to a stream mid-broadcast means starting from the next group boundary, not waiting for the beginning of the transmission.
Tracks are collections of groups from a single publisher. A video track, an audio track, and a caption track can be separately subscribed to, prioritised differently, and dropped independently under network pressure.
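The three-level model maps naturally onto nested data structures. The sketch below is an illustrative in-memory representation only; the class and field names are hypothetical and do not reflect the header layout in the specification. It starts a new group at each keyframe and shows one possible late-join policy: begin at the most recent group boundary.

```python
from dataclasses import dataclass, field


@dataclass
class MediaObject:
    object_id: int
    payload: bytes


@dataclass
class Group:
    group_id: int
    objects: list = field(default_factory=list)


@dataclass
class Track:
    name: str
    groups: list = field(default_factory=list)

    def append(self, payload: bytes, keyframe: bool) -> None:
        # A keyframe (or the very first frame) opens a new group.
        if keyframe or not self.groups:
            self.groups.append(Group(group_id=len(self.groups)))
        group = self.groups[-1]
        group.objects.append(MediaObject(len(group.objects), payload))

    def join_point(self) -> Group:
        """Most recent group, which by construction starts at a keyframe."""
        return self.groups[-1]


video = Track("video")
for i, is_key in enumerate([True, False, False, True, False]):
    video.append(f"frame-{i}".encode(), keyframe=is_key)

late_joiner_start = video.join_point()
print(late_joiner_start.group_id, late_joiner_start.objects[0].payload)
```

Because each group begins with independently decodable data, the late joiner here can start at `frame-3` without ever fetching the first three frames.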
Media agnosticism
An important clarification in the latest draft: despite its name, MOQT is explicitly media agnostic. As the specification states, it "can be used for a wide range of use cases" beyond video. Researchers are already exploring MOQT as a transport for the Model Context Protocol (MCP) for AI agent communications, demonstrating that the pub/sub architecture generalises well beyond live streaming.
MoQ vs WebRTC: coexistence, not replacement
The MoQ vs WebRTC conversation is one of the most common and most frequently misframed debates in streaming engineering right now. The short answer: they solve different problems, and the right architecture for most platforms in 2026 uses both.
| | WebRTC | MoQ (MOQT) |
|---|---|---|
| Primary use case | Real-time bidirectional conferencing | Scalable low-latency media distribution |
| Latency | < 100ms (sub-second) | < 500ms (sub-second, targeting < 200ms) |
| Fan-out scale | Hundreds per SFU (requires federation) | Millions via relay tree |
| Connection setup | High overhead (ICE, DTLS, STUN/TURN, signalling) | Low overhead via QUIC/WebTransport |
| Browser support | Universal (all major browsers, all versions) | Requires WebTransport (all major browsers as of March 2026) |
| Protocol status | RFC 8829 (stable standard) | IETF draft-17 (approaching RFC) |
| Directionality | Bidirectional (designed for it) | Primarily unidirectional pub/sub |
| Complexity | High (SFU, ICE, STUN/TURN infrastructure) | Lower (relay network, no ICE/DTLS overhead) |
| Ideal for | Video conferencing, live collaboration | Live sports, concerts, news, interactive broadcast |
WebRTC remains the unambiguous choice for real-time bidirectional communication, such as conferencing, interview platforms, collaborative tools, and voice AI. Its browser support is universal and its media stack (codec negotiation, echo cancellation, bandwidth estimation) handles the complexity of interactive communication automatically.
MoQ's advantage emerges when the use case is distribution rather than conversation. A sports broadcaster streaming to 500,000 concurrent viewers does not need the bidirectional overhead of WebRTC. What they need is sub-second latency at CDN-level scale, which is exactly the problem MoQ's relay architecture addresses.
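A quick back-of-the-envelope calculation shows why a relay tree reaches this scale with little depth. The sketch below assumes every relay can serve the same number of downstream connections; that per-relay fan-out figure is a hypothetical parameter, not a documented limit of any relay implementation.

```python
def relay_tiers(viewers: int, fan_out: int) -> int:
    """Smallest number of relay tiers such that fan_out ** tiers >= viewers."""
    if viewers < 1 or fan_out < 2:
        raise ValueError("need viewers >= 1 and fan_out >= 2")
    tiers, capacity = 0, 1
    while capacity < viewers:
        capacity *= fan_out
        tiers += 1
    return tiers


# 500,000 viewers with an assumed 1,000 downstream connections per relay.
tiers_needed = relay_tiers(500_000, fan_out=1_000)

# The same audience with a more conservative assumed fan-out of 100.
tiers_conservative = relay_tiers(500_000, fan_out=100)

print(tiers_needed, tiers_conservative)
```

Under either assumption the publisher still maintains a single connection to its first relay; the depth of the tree, not the publisher, absorbs the audience size.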
The hybrid architecture that most streaming platforms will arrive at: WebRTC for the interactive layer (panellists, commentators, broadcast tools), MoQ for the distribution layer (the audience receiving the stream). This is not a zero-sum replacement, but rather a complementary stack where each protocol operates in its domain of advantage.
Who is building MoQ?
MoQ in 2026 is not a research project. The organisations building production MoQ infrastructure represent a significant portion of global internet traffic.
Cloudflare: the first MoQ CDN
In August 2025, Cloudflare launched what it described as the first MoQ relay network in production, running across its global infrastructure in more than 330 cities. Every Cloudflare server is now a MoQ relay, so it is not a test deployment, but production infrastructure at the same scale as Cloudflare's existing services.
The implementation, moq-rs, is open source and compatible with multiple publishers, players, and tools across the MoQ ecosystem. Cloudflare has also been an active IETF contributor, with engineers co-authoring Internet-Drafts on relay DoS mitigations and CDN provisioning for MoQ.
OpenMOQ Software Consortium
Founded by Red5, Akamai, CDN77, Cisco, Synamedia, and YouTube, the OpenMOQ Software Consortium is the industry body coordinating open-source MoQ implementation work. As of April 2026, the Consortium has completed its governance setup, established a technical roadmap, and moved into the core infrastructure development stage.
Eleven vendors (Ant Media, AWS, Bitmovin, Broadpeak, CacheFly, Cloudflare, Nomad Media, Oracle, Norsk, Synamedia, and Red5) demonstrated their first MoQ implementations at NAB Show 2026 in Las Vegas, representing the largest coordinated MoQ interoperability demonstration to date.
Oracle OCI: enterprise MoQ at the edge
The Oracle Video @ Edge (OVE) platform functions as a MOQT relay network for enterprise media workflows. At NAB Show 2026, Oracle demonstrated multi-partner MoQ workflows integrating Ateme (encoding), Broadpeak (packaging), and Cloudflare (CDN delivery), connected through a shared MOQT relay fabric. OVE acts as a coordination layer, managing session lifecycle, routing, and telemetry across a multi-vendor streaming pipeline.
Google and Meta: IETF co-authors
Engineers from Google (Ian Swett) and Meta (Alan Frindell) are listed as editors of the core MOQT specification. This is not passive endorsement, since the editors are responsible for the technical direction of the draft and the resolution of implementation issues raised by the working group. Both companies have MoQ implementations that participate in interoperability testing.
moq.dev: open-source Rust and TypeScript
The moq.dev project maintains open-source MoQ libraries in both Rust and TypeScript with equivalent APIs. The project includes moq-lite (the simplified transport subset), hang (media-specific encoding and streaming abstractions), and relay infrastructure. The hang.live demo runs a live public MoQ stream demonstrating sub-second latency in a browser with no plugins required.
Production readiness status
Where things actually stand in April 2026
Honest assessment: MoQ is production-ready for controlled deployments, early-mover experimental workloads, and ingest pipelines. It is not yet ready for consumer-facing deployments that require universal browser compatibility and polished fallback handling, but that gap is closing faster than most expected.
WebTransport reaches Baseline
The critical infrastructure dependency for MoQ in browsers is WebTransport, which is the W3C API that exposes QUIC's capabilities to browser applications. As of March 2026, Safari 26.4 shipped WebTransport out of the box, joining Chrome, Firefox, and Edge. This crossed the threshold for "Baseline" status, which is the web platform's designation for features safe to use in production across all major browsers.
This is a meaningful inflection point. Until March 2026, Safari's absence from WebTransport meant that every iOS device and every Safari user was excluded from MoQ-based web applications. That constraint is now removed.
The draft incompatibility challenge
The transport specification has moved through multiple revisions, and implementations written against draft-14 or draft-15 may not interoperate cleanly with draft-17. This is the typical messy middle of IETF standardisation. Organisations building on MoQ today are accepting some version-lock risk in exchange for early mover advantage. The specification is expected to reach RFC status in 2027 or 2028, at which point the incompatibility problem becomes historical.
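The version-lock risk can be pictured as a simple set intersection. The sketch below is purely illustrative: the function name and the use of bare draft numbers are assumptions, and it does not reproduce the negotiation mechanism defined in the specification. It only shows why two endpoints pinned to different drafts may have nothing in common.

```python
def pick_draft(client_drafts, relay_drafts):
    """Return the highest draft number both sides support, or None."""
    common = set(client_drafts) & set(relay_drafts)
    return max(common) if common else None


# A client built against drafts 14 and 15 talking to a relay on 15 and 17.
chosen = pick_draft([14, 15], [15, 17])

# A client pinned to draft-14 talking to a relay that only speaks draft-17.
stranded = pick_draft([14], [17])

print(chosen, stranded)
```

In practice this is why teams deploying today tend to pin the player, relay, and publisher to the same draft and upgrade them together.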
The adoption question
There is also a counter-narrative worth taking seriously. In a May 2026 analysis, BlogGeek.me's Tsahi Levent-Levi argued that MoQ is showing the classic signs of a vendor-pushed protocol rather than a customer-pulled one: despite Cloudflare's 330+ city relay network, no flagship broadcaster or consumer platform has yet been publicly named as an end-user customer, and the typical enterprise question is still framed around whether it is too early to start looking at MoQ, not when to migrate.
The pessimistic case rests on three concrete points. First, MoQ is not a transparent upgrade in the way HTTP/3 was. HTTP/3 rolled out invisibly because the application layer did not change; MoQ requires player rewrites, encoder reworks, and operational adjustments that need a clear commercial trigger. Second, the protocol surface is still fragmented, with competing media-format proposals (WARP, LOC, CMSF, MSF) meaning that even vendors agreeing on transport have not yet agreed on what flows through it, and RFC publication for the core transport is now realistically targeted for late 2027 to mid-2028. Third, MoQ is competing against incumbents whose infrastructure is amortised, whose toolchains are mature, and whose engineers are easy to hire. Unlike WebRTC, which enabled a use case (browser-native video calling) that did not previously exist, MoQ proposes architectural improvements on use cases that already have working solutions, which is a harder commercial case to make.
None of this is an argument that MoQ will fail. The protocol is sound, the working group is serious, and the relay infrastructure is real. But the timeline from "production-grade relay network exists" to "a meaningful share of live streaming traffic actually runs on it" is likely to be measured in years, not quarters. For platforms evaluating MoQ today, this reinforces the practical advice rather than overturning it: build familiarity now, prototype against the available relay networks, but do not bet a 2027 product roadmap on mass MoQ deployment that the market has not yet validated.
moq-lite: the pragmatic path to deployment
moq-lite is a simplified, forwards-compatible subset of the full MOQT specification, developed by Luke Curley (formerly of Twitch and Discord) and used by the moq.dev project. It strips away features that have been contentious or slow to stabilise in the working group, exposing a clean pub/sub API that works with any MoQ CDN.
As the moq.dev documentation states: "The principles behind MoQ are fantastic, but standards are slow and involve too much arguing/bloat." moq-lite is the pragmatist's answer: a subset that is forwards-compatible with full MOQT (draft-14 and above) and deployable today, without waiting for every corner case in the specification to be resolved.
The Rust implementation of moq-lite is available on crates.io. The TypeScript implementation targets browsers via WebTransport. Both share equivalent APIs, which means relay infrastructure can be shared between server-side and browser-side components.
What MoQ means for video platforms
For platforms that bridge conferencing and broadcast, such as video meeting tools that offer live streaming, event platforms, or hybrid conference-to-audience solutions, MoQ represents a meaningful architectural opportunity.
The hybrid architecture
The most likely deployment pattern for platforms with both interactive and broadcast components is a hybrid stack:
- WebRTC handles the interactive session layer: participant video, audio, screen sharing, and bidirectional communication between speakers and moderators
- MoQ handles the distribution layer: taking the mixed output of the conferencing session and distributing it at broadcast scale to audiences who are watching but not participating
This is architecturally clean because WebRTC and MoQ address different points in the pipeline. The SFU that manages the WebRTC conference can publish a composite stream into a MoQ relay network, which then fans it out to viewer audiences without the SFU needing to manage per-viewer connections.
Restreaming and MoQ ingest
For platforms that offer restreaming, which is the ability to push a live stream to multiple destinations simultaneously (YouTube, Twitch, a custom RTMP endpoint), MoQ's ingest model is relevant. MoQ ingest is lower-overhead than RTMP for the publisher and, because it needs no peer connection negotiation, simpler to implement than WebRTC ingest. As MoQ support arrives in broadcast tools like OBS (GStreamer plugins are already in development), the ingest-to-relay path becomes a natural alternative to RTMP-based restreaming workflows.
The timing question
For streaming engineers and architects evaluating where to invest in 2026: MoQ is worth building familiarity with and prototyping against now. The specification is stable enough to evaluate, the relay infrastructure (Cloudflare, OCI) is production-grade, and the ecosystem of implementations is broad enough to develop against.
Universal consumer deployment, meaning the scenario where you ship MoQ to end users without a WebRTC or HLS fallback, is a 2027–2028 story at the earliest, contingent on RFC publication and wider toolchain support. The window for early adoption without the risk is narrowing.
Architecture diagram

FAQ
Will MoQ replace WebRTC?
No, and this is an important distinction. WebRTC and MoQ are designed for different use cases that only partially overlap. WebRTC is optimised for real-time bidirectional communication: conferencing, voice calls, collaborative tools. Its ICE/DTLS/STUN stack, while complex, handles the NAT traversal and peer connection establishment that interactive communication requires. MoQ is a publish-subscribe distribution protocol that is fundamentally unidirectional in its pub/sub model and optimised for fan-out to large audiences. The architectures where MoQ will displace WebRTC are those where WebRTC was being misused for broadcast distribution, such as one-to-many delivery built on SFU cascades to serve audiences at scale. For actual conferencing, WebRTC will remain the correct choice for the foreseeable future.
Who is using MoQ in production?
As of May 2026, Cloudflare has the largest known production MoQ deployment: a relay network across 330+ cities running on the same infrastructure as its existing CDN services. WINK Streaming has a production MoQ implementation for MediaMTX achieving 200–300ms latency, currently deployed for government traffic camera networks and public safety systems. nanocosmos launched MoQ support on its nanoStream platform at IBC 2025. Oracle has a production MoQ relay service in Oracle Video @ Edge (OVE). Red5 has a beta MoQ offering built on OCI infrastructure in partnership with CacheFly. Eleven vendors demonstrated interoperating MoQ implementations at NAB Show 2026.
What is the OpenMOQ Software Consortium?
The OpenMOQ Software Consortium is an industry body founded to develop high-performance, open-source MoQ infrastructure suitable for real-world production deployment. Its founding members include Red5, Akamai, CDN77, Cisco, Synamedia, Oracle, and YouTube (Google). Academic members include Universität Klagenfurt and Özyeğin University. With its governance and technical roadmap in place, the Consortium is now focused on building open-source relay and player implementations that allow the industry to deploy MoQ without proprietary lock-in.
Is MoQ ready for production?
It depends on the use case. For controlled enterprise deployments, server-to-server ingest workflows, and applications where you control the browser environment, MoQ is deployable today, as Cloudflare's relay network is production-grade infrastructure. For consumer-facing broadcast applications requiring universal browser compatibility, the picture improved significantly in March 2026 when Safari shipped WebTransport, crossing the Baseline threshold. However, the transport specification is still at draft-17 and has not yet been published as an RFC, which means implementations may require updates as the specification is finalised. The honest assessment: experimental production deployment is viable now; universal consumer deployment is a 2027–2028 story.
What is the difference between MoQ and WebRTC?
MoQ (specifically the MOQT transport) and WebRTC differ at the architectural level. WebRTC is a complete framework for peer-to-peer real-time communication, including a media stack (codecs, echo cancellation, bandwidth estimation), a signalling model, and a security architecture (DTLS-SRTP). MoQ is a transport protocol: a lower-level primitive for publishing and subscribing to media tracks via relay networks. WebRTC requires ICE/STUN/TURN infrastructure and a signalling layer before any media flows; MoQ establishes connections with lower overhead via QUIC's 0-RTT capability. WebRTC scales via SFUs with per-connection state management; MoQ scales via relay trees with shared caching and pull-through subscription. For interactive conferencing: WebRTC. For scalable low-latency broadcast: MoQ.
Conclusion
Two decades of streaming engineering have produced a fragmented landscape: one protocol for real-time communication, another for broadcast distribution, a third for ingest, and an ever-growing stack of CDN configurations, SFU clusters, and TURN servers to hold it together.
MoQ is an attempt to rebuild from a cleaner foundation: a single transport protocol, grounded in QUIC, that can span the latency-scale spectrum without requiring separate architectures for each point on it. In its simplest form, MoQ is pub/sub media delivery over QUIC, with a relay model that scales like a CDN and latency that approaches WebRTC.
The ecosystem is real, the relay infrastructure is production-grade, and browser support has crossed into Baseline territory. What remains is RFC publication, toolchain maturation, and the kind of broad deployment experience that turns a promising protocol into an industry standard.
For streaming engineers: now is the right time to build familiarity with MoQ, run experiments against Cloudflare's relay network, and design your hybrid architectures. The production window is opening.
→ Explore Digital Samba's streaming features including our restreaming API, which connects to RTMP endpoints and is being monitored closely for MoQ ingest support as the ecosystem matures.
References
- Bitmovin. (2026). Media over QUIC (MoQ) with Bitmovin and Cloudflare. https://bitmovin.com/blog/media-over-quic-bitmovin-cloudflare/
- Cloudflare. (2025, August 29). MoQ: Refactoring the internet's real-time media stack. https://blog.cloudflare.com/moq/
- English, M., Pardue, L., Sharma, A., & Swett, I. (2026, March 2). Denial-of-service considerations for Media over QUIC relay deployments (draft-englishm-moq-relay-dos-00). IETF. https://datatracker.ietf.org/doc/draft-englishm-moq-relay-dos/
- Internet Engineering Task Force (IETF). (n.d.). Media over QUIC (MoQ) working group. https://datatracker.ietf.org/group/moq/about/
- Kurinnoi, P. (2026, April). Media over QUIC (MoQ): The protocol that could finally unify streaming. Medium - Video Tech. https://medium.com/video-tech/media-over-quic-moq-the-protocol-that-could-finally-unify-streaming-8b95972db9ce
- Law, W. (2026, January 19). MOQT Streaming Format (draft-ietf-moq-msf-00). IETF. https://datatracker.ietf.org/doc/draft-ietf-moq-msf/
- Levent-Levi, T. (2026, May). The MoQ adoption problem. BlogGeek.me. https://bloggeek.me/moq-adoption-problem/
- moq.dev. (2025). The first MoQ CDN: Cloudflare. https://moq.dev/blog/first-cdn/
- moq.dev. (n.d.). Media over QUIC. https://moq.dev/
- moq-dev. (n.d.). MoQ: Media over QUIC library in Rust + TypeScript [GitHub repository]. https://github.com/moq-dev/moq
- Nandakumar, S., Vasiliev, V., Swett, I. (Ed.), & Frindell, A. (Ed.). (2026, March 2). Media over QUIC Transport (draft-ietf-moq-transport-17). IETF. https://datatracker.ietf.org/doc/html/draft-ietf-moq-transport-17
- OpenMOQ Software Consortium. (n.d.). OpenMOQ: Advancing MOQ protocol. https://openmoq.org/
- Oracle. (2026, April). Oracle Video @ Edge: Showcasing Media over QUIC partner workflows at NAB. Oracle Cloud Infrastructure Blog. https://blogs.oracle.com/cloud-infrastructure/oracle-video-edge-media-over-quic-workflows-nab
- Red5. (2026, April). What is MOQ (Media over QUIC) and why it matters. https://www.red5.net/blog/what-is-moq-media-over-quic/
- Red5. (2026, January 26). 6 MOQ players you need to know about: Pros and cons. https://www.red5.net/blog/6-moq-players-you-need-to-know-about/
- Red5. (2026). MOQ vs WebRTC: Why both protocols can and should exist in live streaming space in 2026. https://www.red5.net/blog/moq-vs-webrtc/
- Red5. (2025, December). Media over QUIC (MoQ): Beta access. https://www.red5.net/media-over-quic-moq/
- Red5. (2025, December 24). Red5 joined the OpenMOQ Software Consortium. https://www.red5.net/blog/red5-joined-openmoq/
- WebRTC.ventures. (2026, April). WebTransport is now Baseline: Here's what that means for real-time media. https://webrtc.ventures/2026/04/webtransport-is-now-baseline-what-it-means-for-real-time-media/
- WINK Streaming. (2025). Media over QUIC (MoQ) implementation: Technical analysis & browser reality. https://www.wink.co/documentation/WINK-MoQ-Implementation-Analysis-2025.php