fix: audio export quality and Whisper language selection#4

Open
ScreepCode wants to merge 1 commit into thehwang:main from ScreepCode:fix/audio-quality-and-whisper-language

@ScreepCode ScreepCode commented Jun 15, 2026

Summary

  • Audio too quiet (#2, "Audio export is too quiet"): Raises the system audio capture rate from 16 kHz to 48 kHz and updates the AAC encoder settings to 48 kHz / 128 kbps. The exported audio file now receives the original 48 kHz PCM instead of the already-downsampled 16 kHz buffer that was previously written. Before this change, the Nyquist limit of the 16 kHz stream discarded everything above 8 kHz, making recordings sound thin and quiet.
  • Whisper language hardcoded (#3, "Language selection not respected for mic transcription (Whisper) and summary"): Adds a language property to WhisperEngine and sets it from MeetingRecorder.startRecording() using the 2-letter ISO prefix of recognitionLanguage (e.g. "de-DE" → "de"). Previously, transcribeChunk had the language hardcoded, so the mic channel was always transcribed in one language regardless of the UI selection.
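
The Nyquist argument in the first bullet can be checked with a line of arithmetic. The sketch below is illustrative only: `nyquistLimit` is a hypothetical helper and is not part of this PR.

```swift
/// Hypothetical helper (not in the PR): the highest frequency, in Hz,
/// that a stream sampled at `sampleRate` can represent.
func nyquistLimit(sampleRate: Double) -> Double {
    return sampleRate / 2.0
}

// Old capture rate: content above 8 kHz is lost.
let oldLimit = nyquistLimit(sampleRate: 16_000)
// New capture rate: content up to 24 kHz is preserved.
let newLimit = nyquistLimit(sampleRate: 48_000)

print("16 kHz capture keeps content up to \(oldLimit) Hz")
print("48 kHz capture keeps content up to \(newLimit) Hz")
```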

Changes

| File | Change |
| --- | --- |
| SystemAudioCapture.swift | config.sampleRate: 16 000 → 48 000 |
| MeetingRecorder.swift | audioFileSettings: 48 kHz / 128 kbps; write original PCM to file; fix writeMicAudio memcpy condition |
| WhisperEngine.swift | Add language property, use self.language in transcribeChunk |
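
For orientation, a 48 kHz / 128 kbps AAC settings dictionary might look roughly like the sketch below. The real audioFileSettings in MeetingRecorder.swift is not shown in this description, so the exact keys and the channel count here are assumptions. The AVFoundation constants themselves are standard.

```swift
import AVFoundation

// Sketch only: the actual dictionary in MeetingRecorder.swift may differ.
let audioFileSettings: [String: Any] = [
    AVFormatIDKey: kAudioFormatMPEG4AAC, // AAC, as written into the .m4a files
    AVSampleRateKey: 48_000.0,           // matches the new capture rate
    AVNumberOfChannelsKey: 1,            // assumption: channel count is not stated in the PR
    AVEncoderBitRateKey: 128_000         // 128 kbps
]
```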

Test plan

  • Record a short meeting in German (or another non-English language): the mic channel should now transcribe in the selected language instead of the previously hardcoded one
  • Remote channel (SFSpeech) continues to work as before
  • Exported audio-mic.m4a and audio-system.m4a sound noticeably louder/fuller than before
  • Switching language in UI between sessions changes Whisper transcription language correctly

- SystemAudioCapture: raise sample rate from 16 kHz to 48 kHz so
  exported audio captures frequencies up to 24 kHz instead of being
  limited to 8 kHz (the Nyquist limit at 16 kHz)

- MeetingRecorder: update audio file settings to 48 kHz / 128 kbps AAC;
  write original 48 kHz PCM to the audio file in handleSystemAudioBuffer
  instead of the already-downsampled 16 kHz buffer that was fed to
  SFSpeech; fix writeMicAudio memcpy fast-path to also trigger for
  stereo hardware input (was gated on channelCount == 1 unnecessarily)

- WhisperEngine: add `language` property (default "en"), use it in
  transcribeChunk instead of a hardcoded language string; set it from
  MeetingRecorder.startRecording() via the 2-letter ISO prefix of
  recognitionLanguage (e.g. "de-DE" → "de")
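
The locale-to-language mapping described in the last item could be sketched as follows. `whisperLanguageCode(from:fallback:)` is a hypothetical name, and the fallback behaviour for malformed input is an assumption. The PR states only that the 2-letter prefix is used and that the property defaults to "en".

```swift
/// Hypothetical helper (not the PR's code): reduces an identifier such as
/// "de-DE" to its 2-letter language prefix ("de").
func whisperLanguageCode(from recognitionLanguage: String,
                         fallback: String = "en") -> String {
    // Take the part before the first "-" or "_" and lowercase it.
    let primary = recognitionLanguage
        .split(whereSeparator: { $0 == "-" || $0 == "_" })
        .first
        .map { $0.lowercased() } ?? ""
    // Assumption: anything that is not exactly two letters uses the fallback.
    let isTwoLetters = primary.count == 2 && primary.allSatisfy { $0.isLetter }
    return isTwoLetters ? primary : fallback
}

print(whisperLanguageCode(from: "de-DE"))
```

In startRecording(), the result would then be assigned to the engine's language property before transcription begins.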

Fixes thehwang#2, fixes thehwang#3
@ScreepCode ScreepCode force-pushed the fix/audio-quality-and-whisper-language branch from c211121 to 0c22e3d Compare June 15, 2026 16:35