mirror of https://github.com/TelegramMessenger/Telegram-iOS.git synced 2026-07-05 19:28:46 +02:00

Isaac 5962a563e4 feat: tgcalls CLI test tool with group SFU, video, and adaptation

Squashed buildout of the tgcalls testbench:

- CLI test tool with --mode p2p/reflector/group/group-churn,
  cross-version interop (--version, --version2), and quiet/summary output
- Linux toolchain + Docker multi-stage build, AWS Fargate mass test harness,
  local parallel mass test harness with signaling loss simulation
- SCTP writable gate, retransmission timer tuning, role-based handshake
- InstanceV2CompatImpl (PeerConnection backend with V2Impl signaling) and
  SignalingTranslator for v14.0.0 interop
- In-process Go/Pion SFU (ICE+DTLS+SRTP+SCTP per participant) with audio
  RTP forwarding, ActiveAudio/VideoSsrcs data channel broadcast, RTCP
  feedback path, and CGo c-archive integration
- GroupInstanceReferenceImpl (PeerConnection group-call) and mixed-impl
  group mode (--reference-participants), with SDP munging for simulcast
- H264 simulcast group video (FakeVideoTrackSource pattern generator,
  FakeVideoSink frame counting, --video flag, two-pass channel setup,
  reactive video setup from ActiveVideoSsrcs)
- Group churn stress mode (--mode group-churn, --churn-cycles)
- SFU stream-quality adaptation: BandwidthEstimator, LayerSelector
  state machine, RtxRingBuffer, simulcast SSRC rewrite
- Transport-cc feedback generation, NetworkSimulator (delay/jitter/loss/
  token-bucket bandwidth), --network-scenario step-down-up
- CLAUDE.md updates throughout

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

2026-04-30 18:28:43 +02:00

tgcalls Library

The tgcalls VoIP library source. See the root CLAUDE.md for build instructions and the project overview.

macOS Build Support

This repo has been patched to support native macOS arm64 builds (darwin_arm64 CPU) in addition to the original iOS targets. Changes made:

third-party/webrtc/BUILD — added @platforms//os:linux to arch_specific_cflags select (fixes macOS getting Linux flags via //conditions:default); moved cocoa_threading.mm from cc_library to webrtc_platform_helpers objc_library (Bazel 8 rejects .mm in cc_library); replaced UIKit with AppKit for macOS
third-party/openh264/BUILD — added //conditions:default to select() statements
third-party/webrtc/absl/absl/base/attributes.h — disabled ABSL_ATTRIBUTE_LIFETIME_BOUND (newer Xcode clang rejects it on void-returning functions)
8 third-party BUILD files + 8 build shell scripts — added darwin_arm64 -> macos_arm64 architecture support (opus, libvpx, ffmpeg, dav1d, mozjpeg, webp, libjxl, td)

Linux Build Support

The repo supports native Linux arm64 and x86_64 builds. Key changes from the iOS/macOS-only baseline:

.bazelrc — Apple toolchain settings under build:macos, Linux uses default CC toolchain via build:linux (auto-selected by --enable_platform_specific_config)
build-system/BUILD — linux_arm64 and linux_x86_64 config_settings
objc_library → cc_library conversions for pure C/C++ targets (ogg, opusfile, rnnoise, opus, libvpx, dav1d, ffmpeg wrappers, WebRTC main target)
WebRTC BUILD — platform flags via select() (-DWEBRTC_LINUX vs -DWEBRTC_MAC), stdlib task queue instead of GCD on Linux, macOS-only sources excluded
Third-party genrule build scripts — Linux architecture cases added (libvpx, dav1d, ffmpeg), system cmake/meson/ninja used instead of downloaded macOS binaries
BoringSSL — _Generic C11 guarded for C++ mode (GCC compatibility)
tgcalls headers — #include <cstdint> added for GCC 15 strictness

SCTP Signaling

Writable Gate (role-based handshake ordering)

tgcalls uses a custom SCTP association (via dc-sctp) over the signaling channel for reliable message delivery. SignalingSctpConnection wraps DcSctpTransport with a SignalingPacketTransport shim.

The SCTP handshake is ordered using DcSctpTransport's writable gate (MaybeConnectSocket()), mirroring how WebRTC PeerConnection uses DTLS writable state to control SCTP connection timing:

Caller (isOutgoing=true): SignalingPacketTransport starts writable → Connect() fires immediately → sends INIT
Callee (isOutgoing=false): starts not-writable → Connect() deferred → on first receiveExternal(), setWritable(true) fires SignalWritableState → MaybeConnectSocket() → Connect()

The callee's Connect() and processing of the caller's INIT happen synchronously within the same BlockingCall on the network thread (RFC 4960 §5.2.1 simultaneous-open).
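The ordering above can be modeled as a gate that connects only once the transport has been started and is writable. This is a toy model under that assumption, not the DcSctpTransport code; `WritableGate`, `start`, `setWritable`, and `connected` are hypothetical names introduced for illustration.

```cpp
#include <cassert>

// Toy model of the writable gate. All names here are illustrative and are not
// tgcalls or WebRTC APIs.
class WritableGate {
public:
    // The caller (initiator) starts writable; the callee starts not-writable.
    explicit WritableGate(bool isInitiator) : writable_(isInitiator) {}

    // Analogous to starting the transport: connect now if writable, else defer.
    void start() {
        started_ = true;
        maybeConnect();
    }

    // Analogous to the callee's first inbound packet flipping the writable state.
    void setWritable(bool writable) {
        writable_ = writable;
        maybeConnect();
    }

    bool connected() const { return connected_; }

private:
    void maybeConnect() {
        if (started_ && writable_ && !connected_) {
            connected_ = true;  // a real transport would send INIT here
        }
    }

    bool started_ = false;
    bool writable_ = false;
    bool connected_ = false;
};

// Usage: the callee stays unconnected until its first inbound packet arrives.
inline void exampleCalleeFlow() {
    WritableGate callee(/*isInitiator=*/false);
    callee.start();
    assert(!callee.connected());
    callee.setWritable(true);
    assert(callee.connected());
}
```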

Key files:

SignalingSctpConnection.cpp — SignalingPacketTransport writable state, setWritable(), constructor takes isInitiator
InstanceV2Impl.cpp / InstanceV2ReferenceImpl.cpp — pass _encryptionKey.isOutgoing as isInitiator
third-party/webrtc/webrtc/media/sctp/dcsctp_transport.cc:662-667 — MaybeConnectSocket() gate (unmodified)

Timer Tuning (CustomDcSctpSocket)

WebRTC's stock DcSctpSocket has a bug: max_timer_backoff_duration is wired to the T3-rtx (data retransmission) timer but not to the t1_init and t1_cookie (handshake) timers. The handshake timers use unlimited exponential backoff (1000, 2000, 4000, 8000ms...), causing the SCTP handshake to stall for 20+ seconds under packet loss with simultaneous-open (both sides call Connect()).

Fix: CustomDcSctpSocket (in tgcalls/v2/) is a copy of DcSctpSocket with the 6-line fix that passes max_timer_backoff_duration to the t1_init and t1_cookie timer constructors. A CustomDcSctpSocketFactory in SignalingSctpConnection.cpp creates it instead of the stock socket, with configurable timer overrides. WebRTC source is untouched.

Default signaling SCTP timer values (set in SignalingSctpConnection::Options):

Setting	WebRTC Default	Signaling Override
`t1_init_timeout`	1000ms	400ms
`t1_cookie_timeout`	1000ms	400ms
`max_timer_backoff_duration`	3000ms	750ms
`max_init_retransmits`	8	unlimited (from `DcSctpTransport::Start`)

Retry pattern: 400ms, 750ms, 750ms, 750ms... (~18 attempts in 15s). At 30% loss, 100% success rate over 5000 runs.
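The retry pattern can be reproduced with a small helper, assuming the timer doubles on every expiry and is clamped to the backoff cap. This is a sketch of that arithmetic only, not the dcsctp timer implementation; `backoffDelayMs` is a hypothetical name.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>

// Delay before the n-th timer expiry (n = 0 is the first), doubling each time.
// maxBackoffMs == 0 means "no cap", which models the stock handshake timers.
inline int64_t backoffDelayMs(int64_t baseMs, int64_t maxBackoffMs, int expiry) {
    assert(baseMs > 0 && baseMs <= 1000000 && expiry >= 0 && expiry < 32);
    const int64_t delay = baseMs << expiry;
    return maxBackoffMs > 0 ? std::min(delay, maxBackoffMs) : delay;
}
```

With the signaling overrides, `backoffDelayMs(400, 750, n)` yields 400 for n = 0 and 750 for every later expiry. With the uncapped stock handshake timers, `backoffDelayMs(1000, 0, n)` yields 1000, 2000, 4000, 8000 for n = 0..3.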

These values are configurable via JSON custom parameters (passed to InstanceV2Impl via config.customParameters):

network_sctp_t1_init_ms — T1-init timeout (0 = use default 400ms)
network_sctp_t1_cookie_ms — T1-cookie timeout (0 = use default 400ms)
network_sctp_max_backoff_ms — max timer backoff cap (0 = use default 750ms)

Key files:

tgcalls/v2/CustomDcSctpSocket.h/.cpp — patched DcSctpSocket copy
tgcalls/v2/SignalingSctpConnection.cpp — CustomDcSctpSocketFactory, timer option plumbing
tgcalls/v2/InstanceV2Impl.cpp — reads JSON params, passes Options to SignalingSctpConnection

InstanceV2CompatImpl (version 14.0.0)

A cross-version interop implementation that uses WebRTC PeerConnection internally (like InstanceV2ReferenceImpl) but speaks V2Impl's signaling protocol (InitialSetupMessage, NegotiateChannelsMessage, CandidatesMessage). This enables bidirectional calls between PeerConnection-based clients and V2Impl-based clients (versions 7.0.0–13.0.0).

Architecture

PeerConnection <-> SignalingTranslator <-> EncryptedConnection <-> SignalingSctpConnection

SignalingTranslator (tgcalls/v2/SignalingTranslator.h/.cpp): Converts between cricket::SessionDescription (PeerConnection's internal format) and V2Impl signaling messages. Uses JsepSessionDescription programmatic API — no SDP string round-trips.
Outbound: PeerConnection generates offer/answer → SignalingTranslator extracts InitialSetupMessage (transport params) + NegotiateChannelsMessage (media contents)
Inbound: Buffers both messages until complete → builds cricket::SessionDescription → wraps in JsepSessionDescription → SetRemoteDescription

Key Design Decisions

No data channel with V2Impl peers: WebRTC data channel requires PeerConnection on both sides. V2Impl uses NativeNetworkingImpl (no PeerConnection). When paired with V2Impl, the data channel m-line is padded as rejected in the remote answer so PeerConnection accepts it. For CompatImpl↔CompatImpl calls, the data channel works normally.
Caller-only renegotiation: Only the outgoing side triggers offers from onRenegotiationNeeded to prevent unsolicited offer storms.
MediaState via signaling: MediaStateMessage sent over the SCTP signaling channel (not data channel), ensuring it works with both V2Impl and CompatImpl peers.
Sequential content IDs: Uses "0", "1", ... as m-line mids, matching PeerConnection's default scheme.
Shared conversion functions: convertContentInfoToSignalingContent() and convertSignalingContentToContentInfo() extracted to Signaling.h/.cpp for use by both ContentNegotiationContext (V2Impl) and SignalingTranslator (CompatImpl).

Cross-Version Testing

# CompatImpl caller → V2Impl callee
./bazel-bin/tools/tgcalls_cli/tgcalls_cli --mode p2p --version 14.0.0 --version2 13.0.0 --duration 10 --quiet

# V2Impl caller → CompatImpl callee
./bazel-bin/tools/tgcalls_cli/tgcalls_cli --mode p2p --version 13.0.0 --version2 14.0.0 --duration 10 --quiet

# With lossy signaling
./bazel-bin/tools/tgcalls_cli/tgcalls_cli --mode p2p --version 14.0.0 --version2 13.0.0 --duration 15 --drop-rate 0.3 --delay 50-200 --quiet

100% success rate at 30% loss in both directions (tested with 50 sequential + 20 parallel runs each direction).

Key files:

tgcalls/v2/InstanceV2CompatImpl.h/.cpp — main implementation
tgcalls/v2/SignalingTranslator.h/.cpp — cricket↔signaling conversion
tgcalls/v2/Signaling.h/.cpp — shared conversion functions (convertContentInfoToSignalingContent, convertSignalingContentToContentInfo)

GroupInstanceCustomImpl (Group Calls)

The group call implementation lives in tgcalls/group/GroupInstanceCustomImpl.cpp (~4700 lines). It uses a client-server model with an SFU, unlike 1:1 calls, which are peer-to-peer.

Protocol Stack

Join signaling: JSON over application layer (emitJoinPayload → app sends to SFU → setJoinResponsePayload)
Transport: ICE + DTLS-SRTP over UDP (standard WebRTC transport, NOT PeerConnection)
Media: RTP/RTCP with Opus audio (48kHz, 2ch, 32kbps), optional VP8/H264/VP9 video
Control: SCTP data channel over DTLS for Colibri protocol (video constraints, debug messages)

Join Flow

Client calls emitJoinPayload() → generates JSON with audio SSRC, ICE ufrag/pwd, DTLS fingerprint
Application sends JSON to SFU server
Server responds with its ICE candidates, DTLS fingerprint, video codec info
Client calls setJoinResponsePayload(json) → ICE/DTLS negotiation begins
On connection: networkStateUpdated callback fires

Participant Discovery

Unknown SSRC arrives in RTP → receiveUnknownSsrcPacket() → maybeRequestUnknownSsrc(ssrc)
App's requestMediaChannelDescriptions callback queries server for SSRC→participant mapping
addIncomingAudioChannel(ssrc, userId) creates decoder channel

Colibri Data Channel Messages

// SFU → Client
{"colibriClass": "SenderVideoConstraints", "videoConstraints": {"idealHeight": 360}}

// Client → SFU
{"colibriClass": "ReceiverVideoConstraints", "defaultConstraints": {"maxHeight": 0},
 "constraints": {"endpoint1": {"minHeight": 720, "maxHeight": 720}}}

Key Files

tgcalls/group/GroupInstanceCustomImpl.h/.cpp — main implementation
tgcalls/group/GroupNetworkManager.h/.cpp — ICE/DTLS/SRTP transport
tgcalls/group/GroupJoinPayloadInternal.h/.cpp — join JSON serialization

GroupInstanceReferenceImpl (PeerConnection-based Group Calls)

An alternative group call implementation that uses standard WebRTC PeerConnection instead of the manual ICE/DTLS/SRTP management in GroupInstanceCustomImpl. Supports both audio and video (H264 simulcast). Implements the same GroupInstanceInterface.

Architecture

GroupInstanceReferenceImpl
  └── PeerConnection (single, to SFU)
        ├── sendrecv audio transceiver (outgoing audio)
        ├── sendonly video transceiver (outgoing H264 simulcast, SDP-munged SSRCs)
        ├── recvonly audio transceivers (one per remote SSRC, added dynamically)
        ├── recvonly video transceivers (one per remote endpoint, added dynamically)
        └── data channel ("data", for ActiveAudioSsrcs / ActiveVideoSsrcs)

How It Differs from CustomImpl

Aspect	CustomImpl	ReferenceImpl
Transport	Manual ICE/DTLS/SRTP via GroupNetworkManager	WebRTC PeerConnection
SDP	None (custom JSON protocol)	Local SDP construction, translates to/from JSON
SSRC discovery	`unknownSsrcPacketReceived` on raw RTP	`ActiveAudioSsrcs`/`ActiveVideoSsrcs` data channel messages from SFU
Audio channels	Manual `IncomingAudioChannel` per SSRC	PeerConnection recvonly transceivers
Audio levels	RTP header extension parsing	Synthetic levels based on known SSRCs
Video outgoing	Manual `cricket::VideoChannel` with direct SSRC control	PeerConnection sendonly transceiver + SDP munging for simulcast SSRCs
Video incoming	Manual `IncomingVideoChannel` per endpoint	PeerConnection recvonly transceivers with SSRCs in answer
Video decode	Manual decoder lifecycle	PeerConnection handles internally
Code size	~4700 lines	~1500 lines

Join Flow (SDP Translation)

Create PeerConnection with Opus audio transceiver, sendonly video transceiver (no track), and data channel
createOffer → munge video SSRCs (replace PeerConnection's auto-generated SSRCs with pre-allocated simulcast SSRCs) → SetLocalDescription → extract ICE/DTLS params from local SDP
Serialize as JSON (same format as CustomImpl): {ssrc, ufrag, pwd, fingerprints, ssrc-groups}
Parse SFU response JSON → construct JsepSessionDescription("answer") programmatically via cricket::SessionDescription API (no SDP string parsing)
SetRemoteDescription → ICE/DTLS connects via PeerConnection internals
Add remote ICE candidates via AddIceCandidate after SetRemoteDescription
Activate outgoing video: attach FakeVideoTrackSource track to the existing sendonly transceiver via sender()->SetTrack() — no renegotiation needed

Dynamic Participant Handling

Audio:

SFU sends {"colibriClass":"ActiveAudioSsrcs","ssrcs":[54321,98765]} over data channel
Client diffs against known SSRCs
New SSRCs: add recvonly audio transceiver → renegotiate (new offer + constructed answer mirroring offer mids)
Removed SSRCs: clean up from tracking map
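The diff step can be sketched as a comparison of two SSRC sets. This is a minimal illustration, not the ReferenceImpl code; `SsrcDiff` and `diffSsrcs` are hypothetical names.

```cpp
#include <cassert>
#include <cstdint>
#include <set>
#include <vector>

struct SsrcDiff {
    std::vector<uint32_t> added;    // need a new recvonly transceiver
    std::vector<uint32_t> removed;  // drop from the tracking map
};

// Compare the currently tracked SSRCs against the latest ActiveAudioSsrcs list.
inline SsrcDiff diffSsrcs(const std::set<uint32_t>& known,
                          const std::set<uint32_t>& active) {
    SsrcDiff diff;
    for (uint32_t ssrc : active) {
        if (known.count(ssrc) == 0) diff.added.push_back(ssrc);
    }
    for (uint32_t ssrc : known) {
        if (active.count(ssrc) == 0) diff.removed.push_back(ssrc);
    }
    return diff;
}

// Usage: one participant joined (98765) while 54321 stayed.
inline void exampleDiff() {
    const SsrcDiff diff = diffSsrcs({54321u}, {54321u, 98765u});
    assert(diff.added.size() == 1 && diff.added[0] == 98765u);
    assert(diff.removed.empty());
}
```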

Video:

SFU sends ActiveVideoSsrcs over data channel → forwarded to app via dataChannelMessageReceived
App calls setRequestedVideoChannels() → adds recvonly video transceivers, sends ReceiverVideoConstraints over data channel
Renegotiate: new offer → munge outgoing video SSRCs → SetLocalDescription → build answer with incoming video SSRCs → SetRemoteDescription
wirePendingVideoSinks(): attach FakeVideoSink to the recvonly transceiver's receiver track after SetRemoteDescription completes
Renegotiations are serialized (_isRenegotiating / _pendingRenegotiation flags) to prevent overlapping offer/answer cycles
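The two-flag serialization can be sketched as follows. This is a toy model, not the ReferenceImpl code: `RenegotiationGate`, `request`, and `onCycleComplete` are hypothetical names, and it omits the extra check that a deferred cycle only runs when unnegotiated transceivers exist.

```cpp
#include <cassert>
#include <functional>
#include <utility>

// Toy model of serialized renegotiation. Names are illustrative only.
class RenegotiationGate {
public:
    explicit RenegotiationGate(std::function<void()> startCycle)
        : startCycle_(std::move(startCycle)) {}

    // Called whenever transceivers change and an offer/answer cycle is needed.
    void request() {
        if (isRenegotiating_) {
            pending_ = true;  // coalesce into a single follow-up cycle
            return;
        }
        isRenegotiating_ = true;
        startCycle_();
    }

    // Called when SetRemoteDescription for the current cycle has completed.
    void onCycleComplete() {
        assert(isRenegotiating_);
        isRenegotiating_ = false;
        if (pending_) {
            pending_ = false;
            request();
        }
    }

private:
    std::function<void()> startCycle_;
    bool isRenegotiating_ = false;
    bool pending_ = false;
};
```

Any number of requests arriving during a running cycle collapse into exactly one follow-up cycle, so offers never overlap.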

Outgoing Video: SDP Munging for Simulcast

PeerConnection's API doesn't support SSRC-based simulcast directly (only RID-based, which doesn't put SSRCs in the SDP). The workaround:

Pre-allocate 6 random video SSRCs at construction: 3 layers × (primary + RTX)
Add a sendonly video transceiver in start() with no track
Before SetLocalDescription, mungeVideoSsrcsInOffer() replaces the video m-line's auto-generated StreamParams with our pre-allocated SSRCs + SIM + FID groups
UpdateLocalStreams_w() in WebRTC's channel.cc sees SSRCs already present and skips generation
Later, setVideoSource() just calls sender()->SetTrack() — no renegotiation
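The SSRC grouping the munging step installs can be sketched as one SIM group over the three primaries plus one FID group per layer. This is an illustration of the grouping only, not mungeVideoSsrcsInOffer(); `SsrcGroup`, `buildSimulcastGroups`, and the input layout are assumptions.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

struct SsrcGroup {
    std::string semantics;        // "SIM" or "FID"
    std::vector<uint32_t> ssrcs;
};

// Input layout assumed here: {primary0, rtx0, primary1, rtx1, primary2, rtx2}.
inline std::vector<SsrcGroup> buildSimulcastGroups(const std::array<uint32_t, 6>& ssrcs) {
    std::vector<SsrcGroup> groups;
    // SIM ties the three primary SSRCs together as simulcast layers.
    groups.push_back(SsrcGroup{"SIM", {ssrcs[0], ssrcs[2], ssrcs[4]}});
    // FID pairs each primary SSRC with its RTX SSRC.
    for (std::size_t layer = 0; layer < 3; ++layer) {
        groups.push_back(SsrcGroup{"FID", {ssrcs[2 * layer], ssrcs[2 * layer + 1]}});
    }
    assert(groups.size() == 4);
    return groups;
}
```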

Incoming Video: SSRC-Based Demux

The answer for incoming video m-lines includes remote SSRCs from VideoChannelDescription.ssrcGroups. This is required because CustomImpl sets the WebRTC-Video-DiscardPacketsWithUnknownSsrc field trial process-wide, which disables unsignaled stream creation. Without explicit SSRCs, PeerConnection drops incoming video packets in mixed groups.

Key Implementation Details

ICE roles: PeerConnection uses standard ICE (full agent, controlling when remote is ICE-lite). The SFU uses Accept for PeerConnection clients vs Dial for CustomImpl clients.
Loopback: PeerConnectionFactory::Options::network_ignore_mask = 0 enables loopback interface gathering for localhost SFU.
MID exclusion: The buildRemoteAnswer() excludes the urn:ietf:params:rtp-hdrext:sdes:mid RTP header extension from ALL m-lines (audio and video). The SFU forwards raw RTP with the sender's MID value, which would cause the BUNDLE demuxer to route packets to the wrong channel. Without MID, PeerConnection falls back to SSRC/PT-based routing.
RTP header extensions: Copied from the local offer per m-line (minus MID), ensuring BUNDLE-safe IDs. Hardcoding IDs risks collisions across the BUNDLE group.
SDP mid matching: During renegotiation, the constructed remote answer mirrors the local offer's m-line structure and mids exactly. Mismatched mids cause SetRemoteDescription to fail.
Audio level reporting: Uses synthetic levels (0.1) for all known remote SSRCs, since the SFU forwards RTP with extension IDs that may not match PeerConnection's negotiated mapping.
Video sink wiring: OnTrack doesn't fire for locally-created recvonly transceivers. Sinks are wired explicitly in wirePendingVideoSinks() after SetRemoteDescription completes, and also in addIncomingVideoOutput() if the track already exists.
H264 codec in answer: PT 104 (primary) + PT 105 (RTX, apt=104), matching WebRTC's assignPayloadTypes order. RTCP feedback: nack, nack pli, ccm fir, goog-remb, transport-cc.
Renegotiation serialization: Only one offer/answer cycle runs at a time. Deferred renegotiations only fire if there are unnegotiated transceivers (no mid assigned yet), avoiding redundant cycles.

Key Files

tgcalls/group/GroupInstanceReferenceImpl.h/.cpp — implementation
tgcalls/group/GroupInstanceImpl.h — shared GroupInstanceInterface

Video Support Pitfalls

Critical findings from implementing video in the test SFU — relevant for anyone working on group video:

H264 Decoder Requires Two Build Flags

The WebRTC BUILD needs BOTH -DWEBRTC_USE_H264 (encoder, OpenH264) AND -DWEBRTC_USE_H264_DECODER (decoder, FFmpeg). Without the decoder flag, H264Decoder::Create() returns nullptr and WebRTC silently falls back to NullVideoDecoder which accepts frames but never decodes them — no error logged. The encoder works fine without the decoder flag, making this easy to miss.

FFmpeg 7+ Removed `reordered_opaque`

h264_decoder_impl.cc uses AVCodecContext::reordered_opaque and AVFrame::reordered_opaque for passing timestamps through the decode pipeline. FFmpeg 7+ removed this field. The fix uses AVPacket::pts instead. IMPORTANT: AVCodecContext::opaque is already used to store the H264DecoderImpl* pointer (line 74 of AVGetBuffer2) — do NOT use it for timestamps.

Outgoing Video Channel Steals Incoming RTP

GroupInstanceCustomImpl creates separate cricket::VideoChannel objects for outgoing and incoming video, all sharing the same RtpTransport. The outgoing channel's WebRtcVideoReceiveChannel has an "unsignalled SSRC" handler that creates default receive streams for unknown SSRCs. When video RTP from other participants arrives before IncomingVideoChannel registers its SSRCs, the outgoing channel intercepts the packets permanently. Fix: enable the WebRTC-Video-DiscardPacketsWithUnknownSsrc field trial in the field trial string.

Video Channel Setup Is Reactive, Not Pre-Registered

Video channels are set up reactively when ActiveVideoSsrcs arrives via the data channel — same as the real Telegram app. The dataChannelMessageReceived callback in GroupInstanceDescriptor forwards Colibri messages to the app, which calls setRequestedVideoChannels. The DiscardPacketsWithUnknownSsrc field trial prevents the outgoing channel from stealing RTP packets for SSRCs not yet registered. The SFU sends proactive PLI after constraints arrive, ensuring keyframes are produced after the incoming channel is ready.

SFU Must Send Proactive PLI

WebRTC's VideoReceiveStream2 doesn't immediately request a keyframe when a new receive stream is created — it waits until it detects missing packets or a timeout. The SFU must proactively send PLI to the sender when a receiver first requests video via ReceiverVideoConstraints. Without this, the decoder waits indefinitely for a keyframe.

RTP/RTCP Demux: Marker Bit False Positives

RFC 5761 demultiplexes RTP and RTCP on the second byte of the packet. RTCP packet types occupy 200-211, but an RTP packet with Marker=1 and a dynamic PT ≥ 96 has byte[1] ≥ 224, so a bare byte[1] >= 200 check misclassifies H264 RTP (PT=104, M=1 → byte[1]=232) as RTCP. Correct range: byte[1] >= 200 && byte[1] < 224.
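The corrected check can be written in a few lines. The test SFU is written in Go; this C++ sketch only illustrates the byte arithmetic, and `looksLikeRtcp` and `rtpSecondByte` are hypothetical names.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Classify a packet as RTCP from its second byte, using the bounded range.
inline bool looksLikeRtcp(const uint8_t* data, std::size_t size) {
    if (data == nullptr || size < 2) return false;
    return data[1] >= 200 && data[1] < 224;
}

// Second byte of an RTP header: marker bit (top bit) followed by the 7-bit payload type.
inline uint8_t rtpSecondByte(bool marker, uint8_t payloadType) {
    assert(payloadType < 128);
    return static_cast<uint8_t>((marker ? 0x80 : 0x00) | payloadType);
}
```

For H264 with PT=104 and the marker bit set, `rtpSecondByte(true, 104)` is 232, which the bounded check leaves as RTP.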

SRTCP Requires Separate Contexts from SRTP

Pion's SessionSRTP and SessionSRTCP can't share the same net.Conn (both start read loops that fight for packets). The solution: demux RTCP at the transport level (in PacketDemux), create separate srtp.Context instances for SRTCP decrypt/encrypt using the same DTLS-extracted keys, and handle RTCP manually without SessionSRTCP.

PeerConnection Simulcast SSRCs Require SDP Munging

PeerConnection's API doesn't support SSRC-based simulcast (only RID-based). With RID-based simulcast, SSRCs are NOT in the createOffer SDP — they're generated internally during SetLocalDescription and not accessible via sender->GetParameters() (only primary SSRCs, not RTX). The workaround: add a single-encoding transceiver (no RIDs), then replace the auto-generated StreamParams in the offer with pre-allocated SSRCs + SIM + FID groups before calling SetLocalDescription. UpdateLocalStreams_w() skips generation when SSRCs already exist. IMPORTANT: transceiver->mid() is nullopt before SetLocalDescription — match by content direction, not mid.

MID RTP Header Extension Causes Wrong Channel Routing in SFU

The SFU forwards raw RTP packets including all header extensions. If the sender's video RTP includes a MID extension (e.g., MID="1"), the receiver's PeerConnection BUNDLE demuxer routes the packet to its own mid=1 channel — which is the outgoing video, not the incoming video transceiver. Fix: exclude urn:ietf:params:rtp-hdrext:sdes:mid from ALL m-lines in buildRemoteAnswer(). Without MID negotiated, PeerConnection falls back to SSRC/PT-based routing. This must be done for ALL m-lines (including audio) because the BUNDLE transport shares the extension map across all channels.
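The filtering step amounts to dropping one URI from the extension list copied from the offer. This is a sketch with a stand-in type, not buildRemoteAnswer(); `HeaderExtension` and `withoutMid` are hypothetical names.

```cpp
#include <algorithm>
#include <cassert>
#include <string>
#include <vector>

struct HeaderExtension {  // stand-in for an RTP header extension entry
    std::string uri;
    int id;
};

// Return the extension list with the MID extension removed; IDs are preserved.
inline std::vector<HeaderExtension> withoutMid(std::vector<HeaderExtension> extensions) {
    static const std::string kMidUri = "urn:ietf:params:rtp-hdrext:sdes:mid";
    extensions.erase(
        std::remove_if(extensions.begin(), extensions.end(),
                       [](const HeaderExtension& e) { return e.uri == kMidUri; }),
        extensions.end());
    return extensions;
}

// Usage: apply to every m-line's extensions, audio included.
inline void exampleWithoutMid() {
    const std::vector<HeaderExtension> offer = {
        {"urn:ietf:params:rtp-hdrext:ssrc-audio-level", 1},
        {"urn:ietf:params:rtp-hdrext:sdes:mid", 4},
    };
    const std::vector<HeaderExtension> answer = withoutMid(offer);
    assert(answer.size() == 1 && answer[0].id == 1);
}
```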

`DiscardPacketsWithUnknownSsrc` Is Process-Wide

CustomImpl calls field_trial::InitFieldTrialsFromString(...) which sets WebRTC-Video-DiscardPacketsWithUnknownSsrc/Enabled/ globally for the process. In mixed groups, this prevents ReferenceImpl's PeerConnection from creating unsignaled receive streams for incoming video. Fix: include explicit remote video SSRCs in the buildRemoteAnswer() for incoming video m-lines, so PeerConnection registers SSRC-based demux entries instead of relying on unsignaled stream handling.

`OnTrack` Doesn't Fire for Locally-Created Recvonly Transceivers

When you call AddTransceiver(MEDIA_TYPE_VIDEO, {direction=recvonly}), PeerConnection creates the transceiver and its receiver track immediately. OnTrack only fires when a REMOTE-initiated track is added. For locally-created recvonly transceivers, you must wire sinks explicitly after SetRemoteDescription completes — don't wait for OnTrack.

SSRC Parsing: json11 int_value() Overflows for uint32 > INT_MAX

GoSfu_QueryVideoSsrcs returns SSRCs as uint32 in JSON. For values above INT_MAX (2^31 − 1), json11's int_value() (which returns int) overflows to INT_MAX (2147483647). Fix: use number_value() (returns double) and cast via int64_t to uint32_t.
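The cast chain can be isolated in a helper that takes the double produced by number_value(). This sketch does not depend on json11 itself; `ssrcFromJsonNumber` is a hypothetical name.

```cpp
#include <cassert>
#include <cstdint>

// Convert a JSON number (already read as a double) to a uint32 SSRC.
// Going through int64_t keeps values above INT_MAX intact.
inline uint32_t ssrcFromJsonNumber(double value) {
    assert(value >= 0.0 && value <= 4294967295.0);
    return static_cast<uint32_t>(static_cast<int64_t>(value));
}
```

A double represents every uint32 value exactly, so the conversion is lossless across the full SSRC range.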

Join Payload JSON Field Name: `"sources"` Not `"ssrcs"`

tgcalls serializes video SSRC groups in GroupJoinInternalPayload::serialize() using the key "sources" (not "ssrcs"). The Go SFU's JSON struct tags must match: Sources []int32 `json:"sources"`.

Simulcast Max Layers Depends on Source Resolution, Not Bitrate

WebRTC's kSimulcastFormats table in video/config/simulcast.cc hardcodes max_layers per resolution: 640x360 → 2 layers, 960x540 → 3 layers, 1280x720 → 3 layers. The SimulcastEncoderAdapter uses this to cap the number of encoders regardless of available bitrate. If you need 3 simulcast layers, the source must be at least 960x540. The FakeVideoTrackSource uses 1280x720 for this reason. With 1280x720 and scale factors /4, /2, /1, the layers are 320x180, 640x360, 1280x720.
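The layer sizes and the 960x540 threshold can be checked with a small helper. This sketch encodes only what the paragraph states; it does not reproduce kSimulcastFormats, and `Resolution`, `simulcastLayers`, and `canProduceThreeLayers` are hypothetical names.

```cpp
#include <array>
#include <cassert>

struct Resolution {
    int width;
    int height;
};

// Layer sizes for scale factors /4, /2, /1, lowest layer first.
inline std::array<Resolution, 3> simulcastLayers(Resolution source) {
    assert(source.width > 0 && source.height > 0);
    std::array<Resolution, 3> layers = {{
        {source.width / 4, source.height / 4},
        {source.width / 2, source.height / 2},
        {source.width, source.height},
    }};
    return layers;
}

// Per the text, three layers need a source of at least 960x540.
// Assumption: "at least" is compared by pixel count.
inline bool canProduceThreeLayers(Resolution source) {
    return source.width * source.height >= 960 * 540;
}
```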

SFU Must Rewrite SSRCs When Switching Simulcast Layers

CustomImpl's IncomingVideoChannel calls SetSink(_mainVideoSsrc, ...) where _mainVideoSsrc is the first SSRC in the SIM group (layer 0). The video sink only receives decoded frames from that specific SSRC's receive stream. When the SFU forwards a higher layer's packets, it must rewrite bytes 8-11 of the RTP header to the primary (layer 0) SSRC. RTX packets must similarly be rewritten to the layer 0 FID SSRC. Without this, higher-layer packets are delivered to the wrong receive stream and produce zero decoded frames. This is standard SFU behavior for simulcast — Jitsi and mediasoup do the same.
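The rewrite itself is a four-byte, big-endian overwrite at offset 8 of the fixed RTP header. The test SFU is written in Go; this C++ sketch only shows the byte layout, and `rewriteRtpSsrc` and `readRtpSsrc` are hypothetical names. It assumes the packet is plaintext RTP, since SRTP authenticates the header and a rewrite has to happen between decrypt and re-encrypt.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

constexpr std::size_t kRtpFixedHeaderSize = 12;

// Overwrite the SSRC field (bytes 8-11, network byte order) in place.
// Returns false if the buffer is too short to hold a fixed RTP header.
inline bool rewriteRtpSsrc(uint8_t* packet, std::size_t size, uint32_t newSsrc) {
    if (packet == nullptr || size < kRtpFixedHeaderSize) return false;
    packet[8] = static_cast<uint8_t>(newSsrc >> 24);
    packet[9] = static_cast<uint8_t>(newSsrc >> 16);
    packet[10] = static_cast<uint8_t>(newSsrc >> 8);
    packet[11] = static_cast<uint8_t>(newSsrc);
    return true;
}

// Read the SSRC field back out of a fixed RTP header.
inline uint32_t readRtpSsrc(const uint8_t* packet, std::size_t size) {
    assert(packet != nullptr && size >= kRtpFixedHeaderSize);
    return (static_cast<uint32_t>(packet[8]) << 24) |
           (static_cast<uint32_t>(packet[9]) << 16) |
           (static_cast<uint32_t>(packet[10]) << 8) |
           static_cast<uint32_t>(packet[11]);
}
```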

Sender BWE Start Bitrate Determines Initial Layer Count

adjustBitratePreferences sets start_bitrate_bps = max(min_bitrate_bps, 400k). At 400kbps start, the BitrateAllocator gives L0 (60k) + L1 (110k) = 170k, leaving only 230k for L2 which needs min 300k. Layer 2 is disabled until the GCC ramps up. The SFU's transport-cc feedback enables this ramp-up. The UpdateAllocationLimits log shows total_requested_max_bitrate — if this is below the sum of all layers' min bitrates, some layers are excluded.
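The arithmetic above can be replayed with a greedy, lowest-layer-first budget. This is a deliberately simplified model and not WebRTC's BitrateAllocator; `layersThatFit` is a hypothetical name, and the per-layer figures are the ones quoted in the paragraph.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Count how many layers fit when each layer, lowest first, takes its required
// bitrate out of the available budget.
inline int layersThatFit(int64_t availableBps, const std::vector<int64_t>& perLayerBps) {
    assert(availableBps >= 0);
    int layers = 0;
    for (int64_t need : perLayerBps) {
        if (availableBps < need) break;  // this layer and all above it stay disabled
        availableBps -= need;
        ++layers;
    }
    return layers;
}
```

With a 400k budget and requirements of 60k, 110k, and 300k, two layers fit and 230k is left over, which is below the 300k that L2 needs.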

`assignPayloadTypes` Codec Ordering

WebRTC's assignPayloadTypes assigns dynamic PTs starting at 100 in order: VP8 (100/101), VP9 (102/103), H264 (104/105). Both sender and receiver call this independently with the same codec list, so PTs match. The SFU's join response codec PTs (100 for H264 in our case) are used by configureVideoParams to SELECT which codec to use, but the actual PT assignment comes from assignPayloadTypes.
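The numbering described above follows from handing out consecutive primary/RTX pairs in list order. This sketch assumes exactly that and is not the assignPayloadTypes source; `PayloadTypes` and `assignSequentialPayloadTypes` are hypothetical names.

```cpp
#include <cassert>
#include <string>
#include <vector>

struct PayloadTypes {
    std::string codec;
    int primary;
    int rtx;
};

// Assign a consecutive (primary, RTX) payload type pair to each codec in order.
inline std::vector<PayloadTypes> assignSequentialPayloadTypes(
        const std::vector<std::string>& codecs, int firstPt = 100) {
    std::vector<PayloadTypes> assigned;
    int next = firstPt;
    for (const std::string& codec : codecs) {
        assert(next >= 96 && next + 1 <= 127);  // stay inside the dynamic PT range
        assigned.push_back(PayloadTypes{codec, next, next + 1});
        next += 2;
    }
    return assigned;
}
```

For the list VP8, VP9, H264 this gives H264 the pair 104/105. Because both ends run the same assignment over the same list, the result does not depend on the PT carried in the SFU's join response.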

Known Issues

ThreadLocalObject::~ThreadLocalObject() posts fire-and-forget cleanup tasks to the tgcalls media thread. If the process does orderly static destruction, the static thread pool may be torn down while these tasks are still executing, causing "pure virtual function called". The CLI tool uses _exit() to avoid this. This is not a problem in the real Telegram app.
SignalingSctpConnection::OnReadyToSend() had a missing break after the first send failure in its pending-data flush loop. This could cause application-level message reordering (though the application handles it gracefully via _pendingIceCandidates buffering). Fixed in our fork.
InstanceV2ReferenceImpl::writeStateLogRecords() had a use-after-free: it captured a raw Call* pointer on the media thread and posted it to the worker thread. If stop() called _peerConnection->Close() (which destroys Call) between the post and worker thread execution, the worker thread would dereference a dangling pointer. The call_ptr_ field in WebRTC's PeerConnection is Call* const and is never nulled, so the existing null check didn't catch this. Fixed with an _isStopped atomic flag checked in the worker thread lambda before accessing call. Manifested as ~2% segfault rate under 250-process parallel load; 100% pass rate after fix (5000/5000).
WebRTC's RTC_LOG writes to stdout, not stderr. There is no way to separate it from application output within a single process. The local mass test harness (run-local-test.sh) works around this by using separate processes and checking exit codes rather than parsing output.
