Resolving SRTP Authentication Failures in Janus RTP Forwarding

by Nina Benkotic

February 27, 2026

When building large-scale WebRTC infrastructure, most production issues are not caused by broken encryption or failed negotiation. They arise from subtle state mismatches between components that all behave correctly in isolation but drift out of alignment over time.

One such issue we recently investigated involved intermittent SRTP auth fail errors during RTP forwarding in Janus. At first glance, it looked like a cryptographic problem. In reality, it was a rollover counter desynchronisation triggered by browser SSRC reuse - and it only manifested after several minutes of continuous media transmission.

This article explains what happened, why it happens, and how to fix it reliably in production systems.

Table of contents

The scenario
SRTP and the 16-bit sequence number
Why the problem only appears after several minutes
Why RTP forwarding is particularly vulnerable
Diagnosing the issue
The practical solution: force a new SSRC
Why not attempt ROC resynchronisation?
Operational considerations
Lessons for WebRTC infrastructure
Final thoughts

The scenario

In a typical deployment, Janus acts as a selective forwarding unit (SFU), receiving encrypted media from participants and forwarding it either to other participants, external systems, recording pipelines, or broadcast endpoints.

Under normal operation, the media pipeline functions smoothly and predictably. Encrypted audio and video streams flow securely over SRTP, RTP forwarding is established using a clearly defined SSRC, and incoming packets are authenticated and decrypted without issue. As long as the transport state remains consistent on both sides, the system maintains integrity, confidentiality, and stable media delivery throughout the session.

However, under specific circumstances, forwarding would suddenly start failing with errors similar to: SRTP unprotect error: srtp_err_status_auth_fail

Interestingly, this did not occur immediately after a participant joined. Nor did it happen consistently across all sessions. The failure only appeared after:

A participant had camera or microphone enabled for several minutes
The participant disabled and later re-enabled media
RTP forwarding was still using the original SSRC

The key to understanding the issue lies in how SRTP tracks packet integrity.

SRTP and the 16-bit sequence number

SRTP protects RTP streams against replay attacks using a packet index constructed from:

The 16-bit RTP sequence number

The RTP header includes a 16-bit sequence number that increments by one for each packet sent, allowing receivers to detect packet loss, reorder arriving packets, and identify duplicates. Because it is limited to 16 bits, it wraps back to zero after 65,536 packets, which makes rollover tracking necessary during longer media sessions.

A rollover counter (ROC)

The rollover counter increments each time the 16-bit RTP sequence number wraps, extending the effective packet index used by SRTP for authentication and replay protection. Both sender and receiver must maintain identical ROC state, otherwise packet authentication fails due to mismatched rollover assumptions.

Together, the 16-bit sequence number and the 32-bit ROC form a 48-bit packet index, 2^16 × ROC + SEQ, which SRTP uses for authentication and replay protection.
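As a minimal sketch of how the two values combine, the packet index defined in RFC 3711 can be computed as follows. The function name is illustrative and not taken from any SRTP library.

```python
SEQ_BITS = 16
SEQ_MOD = 1 << SEQ_BITS  # 65,536 possible sequence numbers


def packet_index(roc: int, seq: int) -> int:
    """Return the 48-bit SRTP packet index: 2^16 * ROC + SEQ."""
    if not 0 <= seq < SEQ_MOD:
        raise ValueError("seq must fit in 16 bits")
    if not 0 <= roc < (1 << 32):
        raise ValueError("roc must fit in 32 bits")
    return roc * SEQ_MOD + seq


# The last packet before a wrap and the first packet after it are adjacent
# in index space, even though the 16-bit sequence number drops back to zero.
print(packet_index(0, 65535))  # 65535
print(packet_index(1, 0))      # 65536
```

Two peers that see the same sequence number but hold different ROC values therefore compute indices that differ by a multiple of 65,536.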

This design works well, but it introduces statefulness. Sender and receiver must agree not only on the current sequence progression but also on how many times the sequence number has wrapped. If the two sides disagree about the ROC, authentication fails.

Crucially, because the sequence number is 16-bit, the wrap - and therefore the first ROC increment - does not happen immediately. At common video packet rates, it takes minutes of continuous transmission before the counter rolls over.

That timing detail is what made this issue particularly subtle.
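To put rough numbers on that timing, the sketch below estimates how long a stream runs before its first wrap. The packet rates are illustrative assumptions (about 50 packets per second for audio with 20 ms frames, a few hundred to a thousand for video), not measurements from this investigation.

```python
SEQ_MOD = 1 << 16  # the 16-bit sequence number wraps after 65,536 packets


def seconds_until_first_wrap(packets_per_second: float, initial_seq: int = 0) -> float:
    """Time until the sequence number first wraps at a constant packet rate."""
    if packets_per_second <= 0:
        raise ValueError("packets_per_second must be positive")
    return (SEQ_MOD - initial_seq) / packets_per_second


# Assumed rates: audio at 50 pkt/s, video at 300 and 1000 pkt/s.
for rate in (50, 300, 1000):
    minutes = seconds_until_first_wrap(rate) / 60
    print(f"{rate:>5} pkt/s -> first wrap after {minutes:.1f} min")
# Prints 21.8 min, 3.6 min and 1.1 min respectively.
```

Senders normally start from a random initial sequence number, so the first wrap can arrive sooner than these figures suggest; the `initial_seq` parameter models that.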

Why the problem only appears after several minutes

During the initial phase of a session, the sequence number increases but does not yet wrap. The ROC remains zero. If a participant disables and re-enables their camera at this stage, sequence behaviour may restart or shift, but ROC alignment is often still recoverable because both sides are effectively working within the first 16-bit window.

However, once enough packets have been transmitted for the sequence number to wrap:

The sender increments its ROC: The sender increases its rollover counter each time the 16-bit RTP sequence number wraps back to zero after reaching its maximum value.
The receiver increments its ROC: The receiver independently updates its rollover counter when it detects that the incoming sequence number has wrapped.
Both sides now depend on a synchronised rollover state: Once rollover has occurred, successful SRTP authentication requires both sender and receiver to maintain identical rollover counter values for packet index calculation.

If the browser later restarts media and resets its internal SRTP state, but reuses the same SSRC, the ROC at the sender side returns to zero. Meanwhile, Janus forwarding may still assume the previous ROC value.

From that point onward, packet authentication fails. So the issue does not happen immediately, as it requires:

Media to remain enabled long enough for at least one 16-bit sequence wrap
The browser to restart media
The SSRC to remain unchanged
Forwarding to continue using the previous SRTP context

Only then does the mismatch become visible.
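The conditions above can be reproduced in a small simulation. This is a simplified sketch of the receiver-side index estimation described in RFC 3711 (Appendix A), with no cryptography and no replay window; the class name and the restart sequence number of 5000 are assumptions chosen for illustration.

```python
SEQ_MOD = 1 << 16   # sequence numbers wrap at 65,536
HALF = 1 << 15      # half the sequence space, used to disambiguate wraps


class ReceiverContext:
    """Simplified receiver rollover state for one SSRC."""

    def __init__(self) -> None:
        self.roc = 0
        self.s_l = None  # highest sequence number accepted so far

    def estimate_index(self, seq: int) -> int:
        """Guess the 48-bit packet index for an incoming sequence number."""
        if self.s_l is None:
            return seq  # first packet: assume ROC = 0
        if self.s_l < HALF:
            v = self.roc - 1 if seq - self.s_l > HALF else self.roc
        else:
            v = self.roc + 1 if self.s_l - HALF > seq else self.roc
        return (v % (1 << 32)) * SEQ_MOD + seq

    def accept(self, seq: int) -> int:
        """Update state as if the packet had authenticated successfully."""
        index = self.estimate_index(seq)
        v = index >> 16
        if self.s_l is None or v == (self.roc + 1) % (1 << 32):
            self.roc, self.s_l = v, seq
        elif v == self.roc and seq > self.s_l:
            self.s_l = seq
        return index


# 1. Media flows long enough for one wrap plus 1,000 more packets.
forwarder = ReceiverContext()
for n in range(SEQ_MOD + 1000):
    forwarder.accept(n % SEQ_MOD)
print(forwarder.roc, forwarder.s_l)  # 1 999

# 2. The sender restarts media: ROC back to 0, same SSRC, new sequence start.
sender_roc, sender_seq = 0, 5000
sender_index = sender_roc * SEQ_MOD + sender_seq

# 3. The forwarding context still believes one wrap has happened.
forwarder_index = forwarder.estimate_index(sender_seq)
print(sender_index, forwarder_index)  # 5000 70536
```

The two sides now disagree by exactly one rollover. Had the restart happened before the first wrap, both would still be at ROC 0 and the estimate would have matched.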

Browser behaviour: SSRC reuse

Modern browsers are designed to preserve session continuity wherever possible, so when a user disables and later re-enables their camera or microphone, the WebRTC stack often keeps the same SSRC to maintain stream identity. At the same time, it may restart packet sequencing internally and reinitialise its SRTP state as if the media flow were beginning anew. From the browser’s perspective this behaviour is entirely valid, but it effectively resets transport-level state while presenting the stream as unchanged to downstream components.

The media stream identity remains consistent, signalling does not necessarily renegotiate the SSRC, and downstream consumers can treat it as the same stream.

But SRTP replay protection is stateful. If Janus forwarding continues using the existing SRTP context tied to that SSRC, it will:

Retain the old ROC value
Expect sequence progression consistent with earlier packets
Reject incoming packets as unauthenticated

The result is srtp_err_status_auth_fail.
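The reason a stale ROC surfaces as an authentication failure, rather than as garbled media, is that the ROC is part of the data covered by the authentication tag. The sketch below assumes the HMAC-SHA1 transform with an 80-bit tag from RFC 3711; the key and packet bytes are placeholders, and key derivation and encryption are left out.

```python
import hashlib
import hmac
import struct


def srtp_auth_tag(auth_key: bytes, authenticated_portion: bytes, roc: int) -> bytes:
    """HMAC-SHA1 over (authenticated portion || ROC), truncated to 80 bits."""
    message = authenticated_portion + struct.pack("!I", roc)
    return hmac.new(auth_key, message, hashlib.sha1).digest()[:10]


auth_key = bytes(range(20))  # placeholder, not a derived SRTP session key
# Fixed 12-byte RTP header (version 2, PT 96, seq 5000) plus dummy payload.
packet = struct.pack("!BBHII", 0x80, 96, 5000, 0, 0x1234ABCD) + b"encrypted-payload"

sender_tag = srtp_auth_tag(auth_key, packet, roc=0)     # sender after its reset
forwarder_tag = srtp_auth_tag(auth_key, packet, roc=1)  # stale forwarding context

print(hmac.compare_digest(sender_tag, forwarder_tag))  # False
```

The key and the packet bytes are identical on both sides; only the implied ROC differs, and that alone is enough for the tag comparison to fail.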

Why RTP forwarding is particularly vulnerable

Within the room itself, Janus manages peer streams dynamically. It has more flexibility to react to renegotiation events and track internal state transitions.

RTP forwarding is commonly set up once at the start of a session and then treated as a stable, long-running pipeline. It generally assumes that the SSRC will remain constant, that the associated SRTP context will persist for the lifetime of the stream, and that rollover tracking will continue uninterrupted. This design works well under steady conditions, where media flows continuously and no internal state is reset mid-session.

The difficulty arises when a browser restarts media while keeping the same SSRC, because the forwarding layer retains its previous replay protection state without any indication that the sender has reset its internal counters. From the sender’s perspective, the stream has effectively begun again, whereas the forwarding context assumes it is still processing a continuous flow of packets. As a result, the calculated packet index diverges, and SRTP authentication fails because the rollover expectations on each side no longer align.

Diagnosing the issue

The behaviour can be confusing in production environments because:

It only occurs after several minutes of media transmission
It depends on specific user interaction (disable/enable)
It may appear intermittent
Restarting forwarding to another Janus streaming plugin context (port) fixes the problem

A key diagnostic clue is timing. If failures only appear after extended media activity and never during short sessions, sequence rollover should immediately become a suspect.

Another strong indicator is that restarting forwarding resolves the issue without renegotiating encryption keys. That suggests state desynchronisation rather than key mismatch.

The practical solution: force a new SSRC

The most robust and operationally safe fix is straightforward: when media is restarted, stop RTP forwarding and reinitialise it with a new SSRC.

Assigning a new SSRC when restarting forwarding effectively creates a clean boundary for the stream. A fresh SRTP context is initialised, the rollover counter begins again from its initial state, and both sender and receiver rebuild their packet indices from a common baseline. With this realignment in place, SRTP authentication proceeds normally because there is no longer any disagreement about rollover history or packet numbering.

In practical terms, this requires detecting moments when the media lifecycle changes, such as track replacement, camera or microphone re-enablement, or a publisher renegotiation event. When such a change occurs, the existing RTP forwarding session should be stopped and a new one started with an explicitly assigned SSRC. By doing so, the system avoids carrying forward stale replay protection state and ensures that encryption contexts remain synchronised across components.
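A minimal sketch of that stop-and-restart step is shown below. It only builds the request bodies and does not talk to a server. The field names follow the VideoRoom plugin's `stop_rtp_forward` and `rtp_forward` requests in their single-stream form; newer multistream Janus versions describe forwarders through a `streams` array instead, so check the documentation for your version. The helper names are hypothetical, and SRTP parameters and secrets are omitted.

```python
import secrets
from typing import Dict, Optional, Tuple


def new_ssrc(previous: Optional[int] = None) -> int:
    """Pick a random 32-bit SSRC that differs from the previous one.

    Zero is skipped on the assumption that it may be treated as "unset".
    """
    while True:
        candidate = secrets.randbits(32)
        if candidate != 0 and candidate != previous:
            return candidate


def build_restart_requests(
    room: int,
    publisher_id: int,
    old_stream_id: int,
    host: str,
    video_port: int,
    previous_ssrc: Optional[int],
) -> Tuple[Dict, Dict]:
    """Return (stop, start) request bodies for rotating a video forwarder."""
    stop = {
        "request": "stop_rtp_forward",
        "room": room,
        "publisher_id": publisher_id,
        "stream_id": old_stream_id,
    }
    start = {
        "request": "rtp_forward",
        "room": room,
        "publisher_id": publisher_id,
        "host": host,
        "video_port": video_port,
        "video_ssrc": new_ssrc(previous_ssrc),  # explicit, freshly rotated SSRC
    }
    return stop, start


# Example values are placeholders.
stop, start = build_restart_requests(1234, 42, 7, "10.0.0.5", 5004, previous_ssrc=1111)
print(stop["request"], start["request"], start["video_ssrc"] != 1111)
```

Sending the stop request first and waiting for its acknowledgement before starting the new forwarder keeps the receiving side from seeing two SSRCs on the same port at once.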

Why not attempt ROC resynchronisation?

In theory, one could attempt to detect ROC divergence and resynchronise dynamically. In practice, this approach is risky and complex.

SRTP replay protection is designed to prevent packet injection and replay attacks. Relaxing validation rules to accommodate uncertain rollover state undermines security guarantees.

Additionally:

Inferring correct ROC values mid-stream is non-trivial: rollover detection depends on precise sequence progression and timing assumptions that may no longer hold after a media restart. Any attempt to reconstruct that state requires carefully analysing packet flow without introducing ambiguity.
Mistakes can lead to either packet rejection or weakened security: an incorrect rollover assumption may cause valid packets to fail authentication or allow replayed packets to pass verification. Both outcomes are undesirable in production systems where reliability and security must coexist.
Behaviour varies across browsers and implementations: different WebRTC stacks handle SSRC reuse and SRTP reinitialisation slightly differently, which makes generic recovery logic difficult to implement reliably across heterogeneous client environments.

For production systems, deterministic reinitialisation is safer than heuristic repair.

Operational considerations

In large-scale WebRTC deployments, media restarts are a routine part of everyday operation rather than an edge case. Users regularly toggle their cameras and microphones, network fluctuations can trigger renegotiation events, and hardware such as webcams or headsets may be unplugged and reconnected during active sessions. These normal behaviours introduce frequent lifecycle transitions that infrastructure must handle gracefully without assuming uninterrupted media continuity.

Given this, forwarding pipelines should treat media restarts as identity boundaries. Even if the SSRC remains unchanged upstream, forwarding contexts should not assume continuity after device reconfiguration.

A practical approach includes:

Monitoring for track-level changes
Logging sequence progression for diagnostics
Enforcing SSRC rotation during restart events

This ensures that forwarding remains robust even under unpredictable client behaviour.
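For the sequence-logging part, a lightweight tracker along the following lines can be enough. It reads only the fixed RTP header, which SRTP leaves unencrypted, and labels each packet so that wraps and suspicious jumps show up in logs. The class name, the labels and the `max_gap` threshold are assumptions for illustration, not part of Janus.

```python
import struct
from typing import Dict, Tuple

SEQ_MOD = 1 << 16
HALF = 1 << 15


def parse_rtp_header(packet: bytes) -> Tuple[int, int]:
    """Return (sequence_number, ssrc) from the fixed 12-byte RTP header."""
    if len(packet) < 12:
        raise ValueError("RTP packets carry at least a 12-byte header")
    _flags, _marker_pt, seq, _timestamp, ssrc = struct.unpack("!BBHII", packet[:12])
    return seq, ssrc


class SequenceTracker:
    """Per-SSRC logging heuristic; it does not replace SRTP's own state."""

    def __init__(self, max_gap: int = 1000) -> None:
        self.max_gap = max_gap  # forward gaps above this are flagged
        self.last_seq: Dict[int, int] = {}
        self.wraps: Dict[int, int] = {}

    def observe(self, packet: bytes) -> str:
        seq, ssrc = parse_rtp_header(packet)
        prev = self.last_seq.get(ssrc)
        self.last_seq[ssrc] = seq
        if prev is None:
            self.wraps[ssrc] = 0
            return "first"
        delta = (seq - prev) % SEQ_MOD
        if delta == 0:
            return "duplicate"
        if delta >= HALF:
            return "backward"  # reordering, or a restart that landed lower
        if delta > self.max_gap:
            return "jump"      # heavy loss, or a restart that landed higher
        if seq < prev:
            self.wraps[ssrc] += 1
            return "wrap"
        return "ok"


def fake_packet(seq: int, ssrc: int = 0x1234ABCD) -> bytes:
    """Build a header-only RTP packet for the demonstration."""
    return struct.pack("!BBHII", 0x80, 96, seq, 0, ssrc)


tracker = SequenceTracker()
events = [tracker.observe(fake_packet(s)) for s in (65534, 65535, 0, 1, 5000)]
print(events)  # ['first', 'ok', 'wrap', 'ok', 'jump']
```

A "jump" or "backward" event on an SSRC that has already recorded at least one wrap is the pattern worth alerting on, since that is when a reset on the sender side leaves the forwarding context with a stale ROC.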

Lessons for WebRTC infrastructure

This case highlights several broader engineering principles:

1. Legacy field sizes still matter

The 16-bit RTP sequence number is a decades-old design decision. Its limitations surface in modern high-bitrate video scenarios where wraparound happens relatively quickly. Even small header fields can introduce surprising behaviour when state persists across restarts.

2. Statefulness requires lifecycle discipline

Encryption contexts are not stateless wrappers. They accumulate assumptions about packet ordering and rollover. When stream lifecycles change, cryptographic state must be reconsidered - even if signalling appears stable.

3. Continuity at one layer is not continuity at another

Browsers optimise for application-level continuity. Media servers optimise for transport integrity. Forwarding pipelines often assume identity persistence. These goals do not always align automatically. Explicit lifecycle management bridges the gap.

Final thoughts

SRTP authentication failures are often treated as cryptographic anomalies. In reality, many are state management issues triggered by perfectly valid behaviour at another layer.

In this case, the combination of:

A 16-bit packet counter
ROC increment after several minutes
Browser SSRC reuse
Persistent forwarding contexts

created a narrow but reproducible failure condition.

The resolution was not to modify encryption, nor to weaken replay protection, but simply to treat media restart as a boundary and rotate SSRC accordingly.

In production WebRTC systems, stability often comes from recognising when identity must change - even if everything upstream suggests it has not.

Understanding these subtle protocol interactions is essential when operating selective forwarding units at scale. Sometimes, the safest fix is not a complex algorithm, but a clean reset at the right moment.

At Digital Samba, we continuously refine our media infrastructure to ensure reliable, secure video communication at scale. Engineering challenges like this one are reminders that real-time systems demand both protocol expertise and careful lifecycle design.
