Fix inline emoji/image line-height inflation in InstantPage V2

The V2 line-breaker used `lineAscent` as both the line height and the baseline offset, then inflated it to each inline emoji/image's full visual size and bottom-aligned the attachment on that inflated baseline. A 24pt emoji on a ~17pt line therefore doubled the line height and shoved the text baseline (and all text on the line) down.

Stop inflating the line for emoji/images (only formulas, which carry their own metrics, still grow it) and center each attachment on the font line box at `baselineY - fontLineHeight/2 - size/2`, matching V1 `layoutTextItemWithString` and the chat `InteractiveTextComponent`. The attachment now bleeds symmetrically instead of moving the baseline.

`extraDescent` absorbs tall-attachment bottom overflow so the next line is not overlapped, and the streaming-reveal `characterRect` is centered in lockstep so the reveal mask tracks the cell (reveal cost stays width-only).

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-07-05 19:28:46 +02:00 · 2026-05-31 18:28:05 +02:00 · 2026-05-31 18:28:05 +02:00 · 9205fb2303
commit 9205fb2303
parent a3492c0cb0
2 changed files with 37 additions and 23 deletions
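
The centering arithmetic from the commit message can be checked with a small standalone sketch. It is not the code from `layoutTextItem`: `LineMetrics` and the helper functions are hypothetical, the numeric values (including `baselineToNextTopSlack = 4`) are assumed for illustration, and the `floorToScreenPixels` rounding used in the real frames is left out. It shows that the display frame and the reveal cell land on the same top edge, and that only an attachment taller than the available slack grows `extraDescent`.

```swift
// Hypothetical stand-in for the per-line values; names mirror the diff, numbers are assumed.
struct LineMetrics {
    var originY: Double                 // top of the line (screen coords, y grows downward)
    var fontLineHeight: Double          // lineAscent stays equal to this for emoji/images
    var lineDescent: Double
    var baselineToNextTopSlack: Double  // assumed value; the real one comes from the layout
    var baselineY: Double { originY + fontLineHeight }
}

/// Top edge of an attachment centered on the font line box (pixel flooring omitted).
func centeredTop(_ m: LineMetrics, size: Double) -> Double {
    return m.baselineY - m.fontLineHeight / 2.0 - size / 2.0
}

/// Baseline-relative (positive-up) vertical span of the matching reveal cell.
func cellSpan(_ m: LineMetrics, size: Double) -> (minY: Double, maxY: Double) {
    let minY = m.fontLineHeight / 2.0 - size / 2.0
    return (minY, minY + size)
}

/// Reveal-mask top, using the conversion quoted in the notes: minY + lineAscent - rect.maxY.
func maskTop(_ m: LineMetrics, size: Double) -> Double {
    return m.originY + m.fontLineHeight - cellSpan(m, size: size).maxY
}

/// Extra descent needed so that no centered attachment overlaps the next line.
func extraDescent(_ m: LineMetrics, sizes: [Double]) -> Double {
    var result = max(0.0, m.lineDescent - m.baselineToNextTopSlack)
    for size in sizes {
        let bottom = centeredTop(m, size: size) + size
        result = max(result, bottom - (m.baselineY + m.baselineToNextTopSlack))
    }
    return result
}

let line = LineMetrics(originY: 0.0, fontLineHeight: 17.0, lineDescent: 4.0, baselineToNextTopSlack: 4.0)
print(centeredTop(line, size: 24.0))            // -3.5: bleeds 3.5pt above the line top
print(maskTop(line, size: 24.0))                // -3.5: the mask tracks the same top edge
print(extraDescent(line, sizes: [24.0]))        // 0.0: a 24pt emoji fits in the assumed slack
print(extraDescent(line, sizes: [24.0, 40.0]))  // 7.5: a 40pt image pushes the next line down
```

With these assumed values the baseline stays at 17pt regardless of attachment size, which is the property the change is meant to restore.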
--- a/CLAUDE.md
+++ b/CLAUDE.md
@@ -128,7 +128,8 @@ A V2 `.table` block's item frame is **full-width / flush** with the bubble inter

 - **flatc casing/`required` gotchas.** Edit `RichText.fbs`, not the generated Swift. Scalars (`long`) cannot be `(required)` — only strings/tables can. A union member `RichText_CustomEmoji` generates the Swift enum case `.richtextCustomemoji` (everything after the suffix's first letter is lowercased); the table type stays `TelegramCore_RichText_CustomEmoji` and field accessors keep `.fbs` casing (`value.fileId`). See the `flatbuffers-codegen` memory.
 - **`ChatTextInputTextCustomEmojiAttribute` is reused end-to-end** (display layer ⇄ layout model). The attribute is written to the placeholder in `attributedStringForRichText` and read back by the V2 line-breaker under the SAME key (`ChatTextInputAttributes.customEmoji`); `InlineStickerItemLayer.init` consumes it directly and resolves the file lazily from `fileId`.
-- **Emoji participates in the streaming reveal.** Its placeholder char's `characterRect` is overwritten to a full cell (width = `itemSize`, baseline-relative bottom at `y=0`), so the width-based cost map charges it like other content. `updateEmojiReveal` pops the layer in (alpha 0→1 + scale) when `charIndexInItem < currentRevealCharacterCount`; unrevealed → opacity 0.
+- **Emoji participates in the streaming reveal.** Its placeholder char's `characterRect` is overwritten to a full cell (width = `itemSize`), so the width-based cost map charges it like other content. `updateEmojiReveal` pops the layer in (alpha 0→1 + scale) when `charIndexInItem < currentRevealCharacterCount`; unrevealed → opacity 0.
+- **Inline emoji/images are CENTERED on the font line box, NOT baseline-aligned, and do NOT inflate the line.** The line-breaker keeps `lineAscent = fontLineHeight` (only formulas grow it) and places each attachment at `baselineY − fontLineHeight/2 − size/2`, so a 24pt emoji on a ~17pt line bleeds symmetrically instead of doubling the line height and shoving the text baseline down (the prior `lineAscent = emoji.size` behavior was a regression from V1 `layoutTextItemWithString`, which centers via `(fontLineHeight − imageHeight)/2`). Mirrors the chat `InteractiveTextComponent`. The cell's `characterRect` is centered the same way (`y = fontLineHeight/2 − size/2`) so the reveal mask (`renderer: y = minY + lineAscent − rect.maxY`) tracks it; a tall attachment grows `extraDescent` so the next line isn't overlapped. Three things must stay in lockstep: the display frame, the `characterRect`, and `extraDescent`.
 - **Layers sit ABOVE the reveal mask.** They attach to `InstantPageV2TextView.emojiContainerView` (a sibling above `renderContainer`), NOT inside it — so the reveal mask wipes glyphs while emoji pop in independently. Adding a CTRunDelegate-glyph to the mask would clip-wipe them instead.
 - **Layers are owned by `InstantPageV2View`, not the text view.** Keyed by `InlineStickerItemLayer.Key(id: fileId, index: occurrence)`. The pageView is now REUSED across `stableVersion` bumps (see streaming section), so the inline-emoji dict PERSISTS across chunks; `updateInlineEmoji` prunes stale keys (emoji whose blocks have been removed) and creates/repositions layers for new or unchanged emoji each update pass.
 - **`visibilityRect` gates looping; `nil` means "not visible".** The bubble's `visibility` override pushes a full-width sub-rect to the root `pageView.visibilityRect`, re-pushed in the apply closure after `pageView.frame` is set. `propagateVisibilityRect` converts the rect into each nested V2View's coordinate space (`self.convert(_:to:)`) for details bodies / table cells+title, fanning out via each child's `didSet`.
--- a/submodules/InstantPageUI/Sources/InstantPageV2Layout.swift
+++ b/submodules/InstantPageUI/Sources/InstantPageV2Layout.swift
@@ -2838,13 +2838,15 @@ func layoutTextItem(
                }
            }

+            // Inline emoji and images do NOT inflate the line: they are centered on the font
+            // line box and allowed to bleed above/below (mirroring V1 `layoutTextItemWithString`
+            // and the chat `InteractiveTextComponent`). Their run delegates already report the
+            // font's own ascent/descent, so CoreText lays the line out at the normal height — the
+            // old `lineAscent = emoji.size` inflation both doubled the line height and (because the
+            // baseline sits at the bottom of the box) shoved the text baseline down. Only formulas,
+            // which carry their own typographic metrics, are allowed to grow the line.
            var lineAscent: CGFloat = fontLineHeight
            var lineDescent: CGFloat = fontDescentBelowBaseline
-            for image in pendingImages {
-                if image.size.height > lineAscent {
-                    lineAscent = image.size.height
-                }
-            }
            for formula in pendingFormulas {
                let formulaAscent = formula.attachment.rendered.size.height - formula.attachment.rendered.descent
                if formulaAscent > lineAscent {
@@ -2854,17 +2856,15 @@ func layoutTextItem(
                    lineDescent = formula.attachment.rendered.descent
                }
            }
-            for emoji in pendingEmoji {
-                if emoji.size > lineAscent {
-                    lineAscent = emoji.size
-                }
-            }
            let baselineY = workingLineOrigin.y + lineAscent

            for image in pendingImages {
+                // Center on the font line box (baseline − fontLineHeight/2), matching V1's
+                // `(fontLineHeight - imageHeight) / 2` offset, instead of bottom-aligning on the
+                // baseline. Keeps the text baseline put and lets the image bleed symmetrically.
                let imageFrame = CGRect(
                    x: workingLineOrigin.x + image.xOffset,
-                    y: baselineY - image.size.height,
+                    y: floorToScreenPixels(baselineY - fontLineHeight / 2.0 - image.size.height / 2.0),
                    width: image.size.width,
                    height: image.size.height
                )
@@ -2882,9 +2882,12 @@ func layoutTextItem(
                lineFormulaItems.append(InstantPageTextFormulaRun(frame: formulaFrame, range: formula.range, attachment: attachment))
            }
            for emoji in pendingEmoji {
+                // Center on the font line box (baseline − fontLineHeight/2) so a 24pt emoji on a
+                // ~17pt line bleeds symmetrically rather than forcing the line taller and pushing
+                // the text baseline down. Matches the chat `InteractiveTextComponent` placement.
                let emojiFrame = CGRect(
                    x: workingLineOrigin.x + emoji.xOffset,
-                    y: baselineY - emoji.size,
+                    y: floorToScreenPixels(baselineY - fontLineHeight / 2.0 - emoji.size / 2.0),
                    width: emoji.size,
                    height: emoji.size
                )
@@ -2892,6 +2895,15 @@ func layoutTextItem(
            }

            extraDescent = max(0.0, lineDescent - baselineToNextTopSlack)
+            // A centered attachment taller than the line bleeds below the baseline; grow the
+            // descent so the following line isn't overlapped (mirrors V1's extraDescent handling).
+            // Emoji at the default 24/17 ratio stay within the line slack and contribute nothing.
+            for imageItem in lineImageItems {
+                extraDescent = max(extraDescent, imageItem.frame.maxY - (baselineY + baselineToNextTopSlack))
+            }
+            for emojiItem in lineEmojiItems {
+                extraDescent = max(extraDescent, emojiItem.frame.maxY - (baselineY + baselineToNextTopSlack))
+            }

            if !minimizeWidth && !hadIndexOffset && lineCharacterCount > 1 && lineWidth > currentMaxWidth + 5.0 {
                if let imageItem = lineImageItems.last {
@@ -3025,22 +3037,23 @@ func layoutTextItem(
                    let localIndex = emoji.range.location - lineRange.location
                    if localIndex >= 0 && localIndex < rects.count {
                        let x = CTLineGetOffsetForStringIndex(line, emoji.range.location, nil)
-                        // characterRects are baseline-relative (positive-up). The emoji cell sits
-                        // bottom-on-baseline (see frame loop: y = baselineY - emoji.size), so its
-                        // baseline-relative bottom is 0 and maxY = emoji.size — the width feeds the
-                        // reveal cost map; maxY feeds the reveal-mask y conversion in the renderer.
-                        rects[localIndex] = CGRect(x: x, y: 0.0, width: emoji.size, height: emoji.size)
+                        // characterRects are baseline-relative (positive-up). The emoji cell is now
+                        // centered on the font line box (see frame loop), so in baseline-relative
+                        // coords it spans [fontLineHeight/2 − size/2, fontLineHeight/2 + size/2].
+                        // Width feeds the reveal cost map; maxY feeds the reveal-mask y conversion in
+                        // the renderer (lineAscent − maxY), keeping the mask tracking the centered cell.
+                        rects[localIndex] = CGRect(x: x, y: fontLineHeight / 2.0 - emoji.size / 2.0, width: emoji.size, height: emoji.size)
                    }
                }
                for image in pendingImages {
                    let localIndex = image.range.location - lineRange.location
                    if localIndex >= 0 && localIndex < rects.count {
                        let x = CTLineGetOffsetForStringIndex(line, image.range.location, nil)
-                        // Image cell sits bottom-on-baseline (frame loop: y = baselineY - image.size.height).
-                        // Baseline-relative cell: y = 0, height = image.size.height. The full width feeds
-                        // the reveal cost map so the streaming cursor is charged the image's width when
-                        // crossing it — same as an emoji cell.
-                        rects[localIndex] = CGRect(x: x, y: 0.0, width: image.size.width, height: image.size.height)
+                        // Image cell is centered on the font line box (see frame loop). Baseline-relative
+                        // cell spans [fontLineHeight/2 − height/2, fontLineHeight/2 + height/2]; the full
+                        // width feeds the reveal cost map so the streaming cursor is charged the image's
+                        // width when crossing it — same as an emoji cell.
+                        rects[localIndex] = CGRect(x: x, y: fontLineHeight / 2.0 - image.size.height / 2.0, width: image.size.width, height: image.size.height)
                    }
                }
                lineCharacterRects = rects