Skip to content

BillLucky/echocut

Folders and files

NameName
Last commit message
Last commit date

Latest commit

ย 

History

9 Commits
ย 
ย 
ย 
ย 
ย 
ย 
ย 
ย 
ย 
ย 
ย 
ย 
ย 
ย 
ย 
ย 
ย 
ย 
ย 
ย 
ย 
ย 
ย 
ย 
ย 
ย 
ย 
ย 
ย 
ย 
ย 
ย 
ย 
ย 
ย 
ย 
ย 
ย 
ย 
ย 
ย 
ย 
ย 
ย 
ย 
ย 
ย 
ย 
ย 
ย 
ย 
ย 

echocut

English | ็ฎ€ไฝ“ไธญๆ–‡

CI License: Apache-2.0 Node PRs welcome

Turn raw footage into brand-ready, platform-optimized video โ€” with one command.

echocut is a local-first video CLI. Point it at a video and it transcribes the speech, burns in big readable subtitles, adds your brand band + cover, optionally cuts fillers/silence, slices highlights, and writes a multi-platform publish kit โ€” all on your own machine. No cloud upload, no editor, no timeline.

echocut burn talk.mp4 --cut-fillers
# โ†’ talk_burn.mp4  (subtitles + title + brand band + cover + fade)  +  cover.jpg  +  subtitles.srt  +  publish.md

โœจ Features

๐ŸŽฌ Subtitle burn-in โ€” word-level ASR (WhisperX cross-platform; Qwen3/MLX on Apple Silicon), large readable captions, your @brand capsule on every frame
๐Ÿ–ผ๏ธ Brand cover as first frame + smooth fade-out + end CTA card
โœ‚๏ธ Filler / silence cutting at the video-track level (removes "um" and dead air โ€” video, audio and subtitles stay in sync)
๐ŸŽฏ Highlights โ€” slice a long video into shareable clips
๐Ÿ“ Any aspect ratio โ€” vertical / landscape / square / 4:3 auto-fit; --obs mode for face-on-top + screen-below recordings
๐Ÿ“ค Publish kit โ€” titles + descriptions + hashtags for multiple platforms
๐ŸŒ Multi-brand โ€” every brand is one JSON file
โšก Built for long videos โ€” chunked transcription with resume, cross-run transcript cache, hardware encode/decode on Apple Silicon

๐Ÿš€ Quickstart

1. Prerequisites

Dependency Why Install
Node.js 18+ the CLI https://nodejs.org
Python 3.11+ speech-to-text https://python.org
FFmpeg video/audio processing brew install ffmpeg ยท apt install ffmpeg
Ollama local LLM (titles, caption fixes) https://ollama.com โ†’ ollama pull qwen3.5:9b

Platform: works cross-platform via WhisperX (CPU or CUDA). The fastest ASR (qwen3, mlx) is Apple Silicon only and falls back to WhisperX elsewhere. Video encoding uses hardware acceleration on Mac and software libx264 elsewhere (correct, just slower) โ€” see Troubleshooting.

2. Install

Quickstart โ€” macOS (Apple Silicon)

On a Mac (M1/M2/M3/M4), one idempotent script installs everything โ€” Homebrew, FFmpeg, Node, the Python ASR stack, fonts and the CLI:

git clone https://github.com/<you>/echocut.git && cd echocut
bash scripts/setup-macos.sh                  # idempotent โ€” safe to re-run
USE_CN_MIRROR=1 bash scripts/setup-macos.sh  # mainland China โ€” use domestic mirrors

Two gotchas the script handles for you (mind them if you install by hand): FFmpeg must be ffmpeg-full, not the slim ffmpeg (the slim build has no libass โ†’ subtitle burn-in fails), and Node must be the LTS node@22, not the latest (better-sqlite3 has no prebuilt binary for bleeding-edge Node). Full details, mirrors and troubleshooting: docs/INSTALL-MACOS.md.

Manual install (any platform)

git clone https://github.com/<you>/echocut.git && cd echocut
npm install                                                  # Node deps + CLI
python3 -m venv .venv && .venv/bin/pip install -r requirements.txt   # Python ASR
npm run fetch-fonts                                          # download default CJK font (Noto Sans SC, OFL)
cp .env.example .env                                         # optional keys (proxy, MiniMax, โ€ฆ)
npm link                                                     # register the `echocut` command
echocut doctor                                               # environment self-check

3. Make your first video

# Subtitles + title + brand band + cover + publish kit
echocut burn /path/to/video.mp4 --cut-fillers

# OBS screen recording (face on top, screen below) โ€” face stays visible, compact title
echocut burn /path/to/obs.mov --obs --headline "Title" --subline "Subtitle"

# Long landscape tutorial โ€” stays full-screen landscape, readable
echocut burn /path/to/tutorial.mp4 --no-title

# Slice a long video into highlight clips
echocut highlights /path/to/long.mp4 --segments 4

Output โ†’ debug_outputs/video/<run_id>/: the *.mp4, a *_cover.jpg, *.srt, and publish.md.

Re-rendering the same video reuses the cached transcription instantly. --fresh forces a re-transcribe; --reuse-captions <file> skips transcription + LLM entirely.

๐ŸŽจ Make it yours โ€” brands

Every brand is one file: configs/brands/<id>.json โ€” identity, colors, brand capsule, CTA, BGM defaults and the LLM persona. Start from the template:

cp configs/brands/_template.json configs/brands/mybrand.json
#  edit name / colors / @handle / CTA โ€ฆ
echocut burn /path/to/video.mp4 --brand mybrand

See configs/brands/example.json for a filled-in example and _README.md for the field reference.

๐Ÿ“ Output layouts

echocut keeps one frame size per file and adapts the overlays to the source shape. Your @brand capsule is drawn on every frame โ€” it's your traceable mark.

Vertical 9:16 (1080ร—1920) โ€” the default for talking-head clips:

โ”Œโ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”
โ”‚ [@yourhandle]   Headline    โ”‚  โ† brand capsule (top-left) + title + subline
โ”‚                 Subline     โ”‚
โ”œโ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”ค
โ”‚                             โ”‚
โ”‚         video content        โ”‚
โ”‚                             โ”‚
โ”‚   โ”Œโ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”   โ”‚
โ”‚   โ”‚    BIG  SUBTITLE     โ”‚   โ”‚  โ† large captions, emphasis words highlighted
โ”‚   โ””โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”˜   โ”‚
โ””โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”˜
   echocut burn clip.mp4 --cut-fillers

Landscape 16:9 โ€” full-screen screencasts / tutorials stay landscape (readable):

โ”Œโ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”
โ”‚ [@yourhandle]                              โ”‚  โ† capsule only (use --no-title)
โ”‚                                            โ”‚
โ”‚              video content                 โ”‚
โ”‚                                            โ”‚
โ”‚   โ”Œโ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”  โ”‚
โ”‚   โ”‚           BIG  SUBTITLE            โ”‚  โ”‚  โ† captions on a bottom band
โ”‚   โ””โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”˜  โ”‚
โ””โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”˜
   echocut burn tutorial.mp4 --no-title
   (cover is exported as a separate .jpg, not prepended)

OBS โ€” --obs for "face on top, screen below" recordings: a compact top band keeps the face visible, with a small title beside the capsule:

โ”Œโ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”
โ”‚ [@you]  Small title         โ”‚  โ† compact band โ€” face below stays visible
โ”‚  โ”Œโ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”   โ”‚
โ”‚  โ”‚      webcam / face    โ”‚   โ”‚
โ”‚  โ””โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”˜   โ”‚
โ”‚                             โ”‚
โ”‚         screen share         โ”‚
โ”‚   โ”Œโ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”   โ”‚
โ”‚   โ”‚    BIG  SUBTITLE     โ”‚   โ”‚
โ”‚   โ””โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”˜   โ”‚
โ””โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”˜
   echocut burn obs.mov --obs --headline "Title"

Any other shape (e.g. 4:3 live-stream) โ†’ --auto-pad fits it into the target container, --strip-top <px> wipes a top watermark band.

๐Ÿงฐ Commands

echocut is a toolbox. Full reference + agent-friendly guide: docs/CLI.md. Every subcommand has live help โ€” echocut <command> --help โ€” with examples.

Group Commands
Video core burn (transcribeโ†’subtitlesโ†’titleโ†’brandโ†’coverโ†’publish kit) ยท package (already-edited video โ†’ cover+BGM+CTA) ยท batch (a whole folder)
Long video highlights (auto-slice N clips) ยท hls (analyze + list segments) ยท hmk (render chosen segments) ยท afc (article from a segment)
Multi-person panel-clip (panel โ†’ per-speaker clips) ยท identity-card (name/title overlay)
Marketing distribute (per-platform packages) ยท hook-gen (5 opening hooks) ยท cover (standalone cover .jpg) ยท publish (upload โ†’ signed URL)
Text article ยท essay ยท translate ยท cross-lang (zhโ†’en/ja/es) ยท weekly-retro
Media / AI music (BGM) ยท minimax (tts/image/video) ยท vlog / ingest
Ops doctor (self-check) ยท studio (admin UI) ยท brand (list/show/validate)

Most flags live on burn โ€” see the full flag table in docs/CLI.md for --engine, --ratio, --cut-fillers, --golden-hook, --reuse-captions, and more.

๐Ÿ“š More

๐Ÿ“„ License

Apache-2.0. Bundled/declared third-party components keep their own licenses โ€” see NOTICE. The default font (Noto Sans SC) is downloaded at setup under the SIL Open Font License 1.1. The optional Remotion render path is licensed separately and is not required by the default FFmpeg pipeline.