
feat(audio): Call Ducking Compensation — counter-boost other apps during VoIP calls#311

Open
PCopath wants to merge 3 commits into
ronitsingh10:mainfrom
PCopath:feature/call-ducking-compensation

PCopath commented May 25, 2026

Problem

macOS automatically reduces ("ducks") the volume of every other audio source by roughly 20 dB whenever a VoIP / video-call app drives Apple's voice-processing IO unit (FaceTime, Phone, WhatsApp, Zoom, Teams, Discord, …). The ducking is applied inside coreaudiod's mixer for as long as the call app's AUVoiceIOOtherAudioDuckingConfiguration is in effect, and a third-party app cannot disable it via any public API on Apple Silicon with SIP enabled: the legacy lldb AudioDeviceDuck=0xc3 patch is x86-only, and process-level HAL properties for the call daemons are unreachable.

This is the most common reason users reach for SoundSource — their music goes quiet the moment a call comes in. Since FineTune already taps per-process audio with the right APIs and an aggregate-device pipeline, it can compensate for the ducking without users buying a commercial license.

Approach

Pragmatic counter-boost rather than a true bypass:

  1. VoIPCallDetector observes the existing AudioProcessMonitor.activeApps list and matches each entry against an allowlist of well-known VoIP / conferencing bundle IDs (FaceTime, Phone, WhatsApp, Zoom, Teams, Discord, Slack, Skype, Signal, Webex, GoToMeeting, BlueJeans, Jitsi, Element, Rocket.Chat, WeChat, LINE, Viber, Google Meet/Duo) plus a process-name fallback for daemons whose kAudioProcessPropertyBundleID is unreliable (avconferenced, callservicesd, telephonyutilities). Users can extend the allowlist and disable individual built-in entries.

  2. AudioEngine.effectiveVolume(for:) multiplies the per-app gain by 10^(boostDB/20) whenever the feature is enabled AND a call is active AND the target is not itself the call app. The existing 30 ms volume ramp in ProcessTapController keeps the transition click-free; the existing SoftLimiter protects against clipping. Default boost is 12 dB — empirically the sweet spot on built-in MacBook speakers.

  3. tearDownVoIPTaps() invalidates any live tap on a VoIP app and refuses to create new ones while the feature is enabled. Reason: when FineTune routes call audio through its private aggregate device, coreaudiod's voice-processing mixer treats our aggregate output as "other audio" and ducks the call itself, silencing the remote caller. Leaving call apps untapped preserves the system's native voice routing.

  4. Menu-bar header toggle for one-click enable/disable, with the icon doubling as a live status indicator: faded when off, normal-tint when armed, green-filled when actively boosting. Tooltip exposes the detected call app. Detailed knobs (boost dB, allowlist) live in Settings → Audio.
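
The matching rule from step 1 can be sketched roughly as below. This is a minimal illustration, not the PR's `VoIPCallDetector`: the `AudioApp` stand-in, the `CallAppMatcher` type, its property names, and the two sample bundle IDs are assumptions chosen for the example, and whether user-disabled entries should also suppress the name fallback is left open here.

```swift
import Foundation

/// Hypothetical stand-in for one entry of the audio-process list.
struct AudioApp {
    let name: String
    let bundleID: String?
}

/// Illustrative matcher: bundle-ID allowlist first, process-name fallback second.
struct CallAppMatcher {
    // Small sample only; the PR's full allowlist is not reproduced here.
    var builtInBundleIDs: Set<String> = ["com.apple.facetime", "us.zoom.xos"]
    // Daemons whose bundle ID may be nil, matched by lower-cased name.
    var fallbackNames: Set<String> = ["avconferenced", "callservicesd", "telephonyutilities"]
    var userAdded: Set<String> = []
    var userDisabled: Set<String> = []

    func isCallApp(_ app: AudioApp) -> Bool {
        // Compare everything lower-cased so matching is case-insensitive.
        let allowed = builtInBundleIDs
            .union(userAdded.map { $0.lowercased() })
            .subtracting(userDisabled.map { $0.lowercased() })
        if let id = app.bundleID?.lowercased(), allowed.contains(id) {
            return true
        }
        // Fallback for processes with a missing or unlisted bundle ID.
        return fallbackNames.contains(app.name.lowercased())
    }
}
```

With this shape, `CallAppMatcher().isCallApp(AudioApp(name: "avconferenced", bundleID: nil))` is true through the name fallback, while an ordinary player such as `com.apple.Music` is not matched.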

What's changed

| File | Change |
| --- | --- |
| FineTune/Audio/Monitors/VoIPCallDetector.swift | NEW: @Observable @MainActor detector, ~190 LOC |
| FineTune/Audio/Engine/AudioEngine.swift | Wire detector; effectiveVolume × call-boost; tearDownVoIPTaps |
| FineTune/Settings/SettingsManager.swift | New CallDuckingCompensation Codable struct in AppSettings (boostDecibels, enabled, user allow/deny lists) |
| FineTune/Views/Settings/Tabs/AudioTab.swift | "Call Ducking Compensation" settings section (toggle, dB slider 6–24, live status) |
| FineTune/Views/MenuBarPopupView.swift | Compact callDuckingButton next to settingsButton in the popup header |
| FineTuneTests/VoIPCallDetectorTests.swift | NEW: 11 cases covering bundle-ID matching, case insensitivity, name-fallback, user allowlist extensions/removals, callback semantics, concurrent calls |

Testing

  • Full FineTuneTests suite: 822 cases, 0 failures on macOS 26.3.1 / Apple Silicon
  • New VoIPCallDetectorTests: 11 cases, all green
  • Real-world testing on built-in MacBook speakers with FaceTime + Google Chrome (YouTube): remote caller audio at normal level, music ~3-4 dB short of pre-call loudness at the default 12 dB compensation, no clipping or pumping artefacts

Known limitations

  • Compensation is a constant gain stage, so it can't match macOS's dynamic gain reduction perfectly. We explored a compressor + make-up gain alternative but reverted it — the artefact pattern (post-call gain recovery audible as a "swell-then-settle") was worse than the constant-gain residual.
  • Per-tap SoftLimiter will engage on already-loud material above ~16 dB of compensation. Default 12 dB stays out of that region for typical mastered music.
  • VoIP-app coverage is allowlist-based. Apps not on the list won't trigger the feature. Extra bundle IDs are stored in CallDuckingCompensation.extraCallAppBundleIDs, but Settings has no row editor for that list yet (PR welcome).
  • Browser-based calls (Google Meet via Chrome) are out of scope because we can't distinguish "Chrome is on a Meet call" from "Chrome is playing a YouTube video" without injecting into the page.

Why not a HAL plug-in / AudioServerPlugIn?

The "perfect" solution is what SoundSource does: ship a signed AudioServerPlugIn that becomes the user's default output and routes its output without going through coreaudiod's VP-IO mixer. That requires a kext-equivalent permission, a Developer ID signing identity, notarization, and System Settings → Privacy → Audio loading. Out of scope for this PR. The counter-boost approach is a strictly additive change that ships under the same TCC permissions FineTune already requests, and the feature is opt-in.


Happy to iterate on naming, the UI placement, or the bundle-ID allowlist — flagging the unconventional approach upfront so a maintainer can sanity-check it before reading code.

PCopath added 3 commits May 25, 2026 21:49
…king

macOS automatically reduces ("ducks") the volume of every other audio
source by roughly 20 dB whenever a VoIP / video-call app drives Apple's
voice-processing IO unit (FaceTime, Phone, WhatsApp, Zoom, Teams, ...).
This happens inside coreaudiod's mixer and cannot be disabled via any
public API on Apple Silicon with SIP enabled — both the legacy lldb
`AudioDeviceDuck=0xc3` hack and process-level HAL properties are
unreachable on modern systems.

The pragmatic alternative implemented here is to apply an equal,
opposite gain stage to every tapped non-call app while a call is in
progress, cancelling the perceptual effect of the system ducking.

Adds:
- `VoIPCallDetector` — observes the existing audio-process list from
  `AudioProcessMonitor` and matches bundle IDs against a built-in
  allowlist of conferencing apps plus an optional user extension list.
- `CallDuckingCompensation` settings (persisted, off by default) with
  an `enabled` toggle and a `boostDecibels` knob (6–24 dB, default 18).
- A `Call Ducking Compensation` section in the Audio settings tab with
  a toggle, a dB slider, and a live status line.
- A new `callDuckingCompensationFactor(for:)` choke-point in
  `AudioEngine.effectiveVolume(for:deviceUIDs:)` that multiplies the
  per-app gain by `10^(dB/20)` when a call is active and the target
  is not itself the call app. The existing 30 ms volume ramp in
  `ProcessTapController` keeps transitions click-free; the existing
  `SoftLimiter` protects against clipping at the chain's end.
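
The gain arithmetic described above can be sketched as follows. This is an illustration under stated assumptions, not the code in this commit: `linearGain(fromDecibels:)` and the parameter list of `compensationFactor` are hypothetical names, and only the `10^(dB/20)` conversion and the three gating conditions come from the description.

```swift
import Foundation

/// Converts a boost in decibels to a linear amplitude factor: 10^(dB/20).
func linearGain(fromDecibels dB: Double) -> Double {
    pow(10.0, dB / 20.0)
}

/// Hypothetical choke-point: returns 1.0 (no change) unless the feature is
/// enabled, a call is active, and the target is not itself the call app.
func compensationFactor(enabled: Bool,
                        callActive: Bool,
                        targetIsCallApp: Bool,
                        boostDecibels: Double) -> Double {
    guard enabled, callActive, !targetIsCallApp else { return 1.0 }
    return linearGain(fromDecibels: boostDecibels)
}

// Example: a per-app volume of 0.5 with a 12 dB boost during a call.
let userVolume = 0.5
let factor = compensationFactor(enabled: true,
                                callActive: true,
                                targetIsCallApp: false,
                                boostDecibels: 12)
print(userVolume * factor)  // roughly 1.99, since 12 dB is close to a 4x amplitude gain
```

Because the result can exceed unity, a limiter at the end of the chain is what keeps the boosted signal from clipping.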
…back

Two follow-up fixes to the call-ducking compensation feature, surfaced by
real-world testing with FaceTime + Chrome/YouTube on macOS 26.

1. Name-based detector fallback. `kAudioProcessPropertyBundleID` is unreliable
   for the daemons that actually host call audio (`avconferenced`,
   `callservicesd`, `telephonyutilities`): sometimes it is nil, and sometimes
   the bundle-ID lookup races against the process-list refresh. The detector
   now matches on the lower-cased `AudioApp.name` as a fallback when the bundle
   ID doesn't hit the allowlist. In tests, detection went from "FaceTime only"
   to "FaceTime + avconferenced".

2. Skip VoIP apps in the tap pipeline entirely. Even after exempting call apps
   from the counter-boost, the user reported the remote caller's voice still
   came through quieter. Root cause: FineTune was tapping `avconferenced` and
   routing its audio through our private aggregate device. coreaudiod's
   voice-processing mixer then treated our aggregate output as "other audio"
   and ducked the call itself. Now: when call-ducking compensation is enabled,
   `AudioEngine` refuses to create new taps for any app matching
   `VoIPCallDetector.isVoIPApp`, and `tearDownVoIPTaps()` invalidates any
   pre-existing call-app taps both when settings change and on every audio
   process-list refresh.
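
The tap policy in item 2 can be sketched like this. It is a simplified model, not the `AudioEngine` code: `TapRegistry`, its string-keyed tap set, and the injected `isVoIPApp` closure are assumptions made so the example is self-contained.

```swift
/// Hypothetical registry modelling which apps currently have a tap.
struct TapRegistry {
    private(set) var tappedApps: Set<String> = []
    var compensationEnabled = false
    let isVoIPApp: (String) -> Bool

    init(isVoIPApp: @escaping (String) -> Bool) {
        self.isVoIPApp = isVoIPApp
    }

    /// Returns false, and creates nothing, for call apps while compensation is on.
    mutating func createTapIfAllowed(for app: String) -> Bool {
        if compensationEnabled && isVoIPApp(app) { return false }
        tappedApps.insert(app)
        return true
    }

    /// Drops existing taps on call apps. Meant to run when settings change
    /// and on every process-list refresh.
    mutating func tearDownVoIPTaps() {
        guard compensationEnabled else { return }
        tappedApps = tappedApps.filter { !isVoIPApp($0) }
    }
}

var registry = TapRegistry(isVoIPApp: { $0 == "avconferenced" })
_ = registry.createTapIfAllowed(for: "avconferenced")  // tapped while the feature is off
registry.compensationEnabled = true
registry.tearDownVoIPTaps()                             // the pre-existing call tap is removed
```

Running the teardown on every refresh, rather than only on settings changes, covers taps that were created before a process was recognised as a call app.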

Also: lower default `boostDecibels` from 18 → 14, since the per-tap soft
limiter starts audibly compressing transients above ~14 dB of make-up gain
on typical music material.

Adds a diagnostic info log ("Scanning N audio-active processes: …") that
dumps name|bundleID of every audio-active process per detector refresh —
helpful while we tune the heuristic; will be downgraded to .debug or
removed before the feature is considered done.
Surfaces the call-ducking compensation feature with a one-click toggle next
to the existing gear icon, instead of burying it three levels deep in
Settings → Audio. The icon doubles as a live status indicator so users can
tell at a glance whether the feature is dormant, armed, or actively
boosting because a call is in progress.

Visual states (SF Symbol + hierarchical render):
- Disabled:      faded `phone.connection`
- Enabled, idle: normal-tint `phone.connection`
- Active call:   green-filled `phone.connection.fill`
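
One way to model the three states is a small enum that derives the symbol from the two inputs. This is only a sketch: `CallDuckingStatus`, its initializer, and the `isHighlighted` flag are hypothetical, while the SF Symbol names are the ones listed above.

```swift
/// Hypothetical model of the menu-bar icon state.
enum CallDuckingStatus: Equatable {
    case disabled   // feature off
    case armed      // feature on, no call detected
    case boosting   // feature on and a call is active

    init(enabled: Bool, callActive: Bool) {
        if !enabled {
            self = .disabled
        } else if callActive {
            self = .boosting
        } else {
            self = .armed
        }
    }

    /// SF Symbol name for the state.
    var symbolName: String {
        switch self {
        case .disabled, .armed: return "phone.connection"
        case .boosting: return "phone.connection.fill"
        }
    }

    /// True only while a call is being compensated (the green-filled state).
    var isHighlighted: Bool { self == .boosting }
}
```

A detected call while the feature is off still maps to `.disabled`, so the icon never signals boosting unless compensation is actually applied.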

Hover tooltip exposes the same status text including which call app is
currently detected ("…boosting other apps to compensate for facetime,
avconferenced"). Detailed configuration (boost dB, allowlist) stays in
Settings → Audio.

Also:
- Lower default `boostDecibels` from 14 → 12. Empirically the sweet spot
  on built-in MacBook speakers: recovers most of the lost loudness without
  driving typical mastered music into the per-tap SoftLimiter.
- Downgrade the per-refresh "Scanning N audio-active processes" trace from
  `.info` to `.debug`. It was useful while we were hunting missed-match
  cases; routine users don't need to see it in their Console feed.
- Add `VoIPCallDetectorTests` (11 cases) covering bundle-ID matching,
  case insensitivity, name-fallback for daemons with nil bundle IDs,
  user-extended allowlist, user-disabled built-ins, callback semantics,
  and concurrent call apps.

Full test suite (822 cases) green after these changes.
@djbob2000


I appreciate the effort, but I respectfully disagree with this approach.

@EllandeVED


Why?

@djbob2000


> Why?

Since we are using macOS's public AudioTap APIs (unlike SoundSource's proprietary ACE driver), bypassing VoIP apps like FaceTime/Zoom is indeed the only way to keep echo cancellation (AEC) working properly and prevent calls from breaking.
Since the automatic +12 dB boost can cause digital distortion/clipping when audio is already loud, here is what I suggest:

  1. Let's keep the VoIP bypass—it's a great and necessary fix.
  2. Make sure the Call Ducking Compensation boost remains optional (disabled by default, which I see it is in your settings).
  3. In the future, once the feature/agc-and-compressor branch lands, we should integrate this boost with the new compressor/AGC pipeline. Running the compensation through a proper compressor will sound much cleaner and warmer than the current SoftLimiter waveshaping.
