fix: voice tracking highlight and mic stall bugs#37
Open
Addresses two user-reported bugs: (1) highlight not tracking at the right speed, jumping erratically or lagging behind speech, and (2) mic appearing to stall out and stop picking up audio after ~60 seconds. Root causes identified and fixed:

**Seamless recognition restart (P0)**
- Split cleanupRecognition() so AVAudioEngine stays alive across SFSpeechRecognitionTask restarts, eliminating audio gaps
- Add pre-emptive 55-second restart timer to beat Apple's ~60s timeout
- Update matchStartOffset to recognizedCharCount before each restart so new sessions match from the correct position
- Thread-safe request swapping via NSLock for audio I/O thread safety
- Add contextualStrings from remaining source text for better STT accuracy

**Fix fuzzy matching false positives (P1)**
- Remove overly permissive `contains` check from isFuzzyMatch that caused "and" to match "demand", "the" to match "other", etc.
- Tighten prefix matching to require minimum 3-char words
- Require exact match for 2-char words (no edit distance tolerance)
- Fix charLevelMatch skip-both fallback: no longer advances lastGoodOrigIndex on genuine mismatches (gibberish no longer matches)
- Fix wordLevelMatch +1 space overcount on last matched word
- Fix unicode scalar vs Character count mismatch in charLevelMatch

**Confidence gating (P2)**
- Replace blind max(charResult, wordResult) with agreement-based selection
- Add sliding window requiring 2-of-3 recent results to agree before committing large forward jumps (small steps always pass through)

**Retry resilience (P3)**
- Distinguish timeout errors (code 1110/216) from real errors
- No retry limit for expected timeouts; immediate soft restart
- Backoff with retry limit only for genuine errors

**Architecture cleanup (P4)**
- Merge two polling timers in observeDismiss() into one
- Fix retain cycle in dismiss() asyncAfter closure
- Add isDismissing guard to prevent double-dismiss
- Fix cancelled-task error callback race in restartTask()

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
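The P1 matching rules can be sketched roughly as follows. This is an illustration only, not the PR's actual `isFuzzyMatch`: the function names `isFuzzyMatchSketch` and `editDistance` are hypothetical, and the one-edit tolerance for words of three or more characters is an assumption, since the description does not state the real threshold.

```swift
// Hypothetical sketch of the tightened matching rules; not the PR's code.

/// Classic Levenshtein distance over Character arrays (grapheme clusters).
func editDistance(_ a: [Character], _ b: [Character]) -> Int {
    if a.isEmpty { return b.count }
    if b.isEmpty { return a.count }
    var previous = Array(0...b.count)
    for i in 1...a.count {
        var current = [i] + Array(repeating: 0, count: b.count)
        for j in 1...b.count {
            let cost = a[i - 1] == b[j - 1] ? 0 : 1
            current[j] = min(previous[j] + 1,        // deletion
                             current[j - 1] + 1,     // insertion
                             previous[j - 1] + cost) // substitution
        }
        previous = current
    }
    return previous[b.count]
}

func isFuzzyMatchSketch(_ spoken: String, _ expected: String) -> Bool {
    let a = Array(spoken.lowercased())
    let b = Array(expected.lowercased())
    if a == b { return true }
    // Words of two characters or fewer must match exactly.
    if a.count <= 2 || b.count <= 2 { return false }
    // Prefix matching is allowed only once both words have 3+ characters.
    // Note there is no substring ("contains") check.
    if a.starts(with: b) || b.starts(with: a) { return true }
    // Assumed tolerance: a single edit for longer words.
    return editDistance(a, b) <= 1
}
```

With these rules "and" no longer matches "demand" and "the" no longer matches "other", while a partial word such as "track" still matches "tracking" through the prefix rule.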
Summary
- Keep `AVAudioEngine` alive across `SFSpeechRecognitionTask` restarts, add pre-emptive 55-second restart timer to beat Apple's ~60s timeout, sync `matchStartOffset` before each restart, and add `contextualStrings` for better STT accuracy
- Remove overly permissive `contains` check (e.g. "and" matching "demand"), fix `charLevelMatch` skip-both fallback that treated gibberish as matches, fix `wordLevelMatch` +1 space overcount, fix unicode scalar vs Character count mismatch
- Replace blind `max(charResult, wordResult)` with agreement-based selection; require 2-of-3 recent results to agree before committing large forward jumps
- Merge two polling timers in `observeDismiss()` into one, fix retain cycle and double-dismiss race in `dismiss()`, fix cancelled-task error callback race in `restartTask()`

Context
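Two user-reported bugs in Word Tracking mode:
1. The highlight did not track at the right speed, jumping erratically or lagging behind speech.
2. The mic appeared to stall and stop picking up audio after ~60 seconds.

Root cause analysis revealed the "mic stalling" was primarily a matching bug masquerading as an audio bug: the mic was working, but `matchStartOffset` staying at 0 after recognition restarts meant new session results couldn't advance the highlight past its current position.

Test plan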
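A toy model like the one below could serve as a starting point for a regression check on the offset fix described under Context. It is a sketch under assumptions, not the app's code: only `matchStartOffset` and `recognizedCharCount` come from the PR description, while `HighlightTracker`, `apply(sessionMatchedChars:)`, `restartSession()`, and the rule that the highlight only moves forward are hypothetical.

```swift
// Hypothetical model of the highlight position across recognition restarts.
struct HighlightTracker {
    var recognizedCharCount = 0   // absolute highlight position in the source text
    var matchStartOffset = 0      // where the current session starts matching
    let syncOffsetOnRestart: Bool // true models the fixed behaviour

    init(syncOffsetOnRestart: Bool) {
        self.syncOffsetOnRestart = syncOffsetOnRestart
    }

    /// A session reports how many source characters its transcript covers,
    /// counted from the point where that session began matching.
    mutating func apply(sessionMatchedChars: Int) {
        let absolute = matchStartOffset + sessionMatchedChars
        // Assumption: the highlight never moves backwards.
        recognizedCharCount = max(recognizedCharCount, absolute)
    }

    /// Starting a new session resets the transcript; the fix carries the offset over.
    mutating func restartSession() {
        if syncOffsetOnRestart {
            matchStartOffset = recognizedCharCount
        }
    }
}

/// Runs one session that matches 120 characters, restarts, then a second
/// session that matches 30 more, and returns the final highlight position.
func simulateTwoSessions(syncOffsetOnRestart: Bool) -> Int {
    var tracker = HighlightTracker(syncOffsetOnRestart: syncOffsetOnRestart)
    tracker.apply(sessionMatchedChars: 120)
    tracker.restartSession()
    tracker.apply(sessionMatchedChars: 30)
    return tracker.recognizedCharCount
}
```

In this model, leaving the offset at 0 keeps the highlight stuck at 120 after the restart, which looks like a stalled mic even though audio is still flowing; syncing the offset lets the second session advance it to 150.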
🤖 Generated with Claude Code