English | ็ฎไฝไธญๆ
Turn raw footage into brand-ready, platform-optimized video โ with one command.
echocut is a local-first video CLI. Point it at a video and it transcribes the
speech, burns in big readable subtitles, adds your brand band + cover, optionally cuts
fillers/silence, slices highlights, and writes a multi-platform publish kit โ all on
your own machine. No cloud upload, no editor, no timeline.
echocut burn talk.mp4 --cut-fillers
# โ talk_burn.mp4 (subtitles + title + brand band + cover + fade) + cover.jpg + subtitles.srt + publish.md| ๐ฌ | Subtitle burn-in โ word-level ASR (WhisperX cross-platform; Qwen3/MLX on Apple Silicon), large readable captions, your @brand capsule on every frame |
| ๐ผ๏ธ | Brand cover as first frame + smooth fade-out + end CTA card |
| โ๏ธ | Filler / silence cutting at the video-track level (removes "um" and dead air โ video, audio and subtitles stay in sync) |
| ๐ฏ | Highlights โ slice a long video into shareable clips |
| ๐ | Any aspect ratio โ vertical / landscape / square / 4:3 auto-fit; --obs mode for face-on-top + screen-below recordings |
| ๐ค | Publish kit โ titles + descriptions + hashtags for multiple platforms |
| ๐ | Multi-brand โ every brand is one JSON file |
| โก | Built for long videos โ chunked transcription with resume, cross-run transcript cache, hardware encode/decode on Apple Silicon |
| Dependency | Why | Install |
|---|---|---|
| Node.js 18+ | the CLI | https://nodejs.org |
| Python 3.11+ | speech-to-text | https://python.org |
| FFmpeg | video/audio processing | brew install ffmpeg ยท apt install ffmpeg |
| Ollama | local LLM (titles, caption fixes) | https://ollama.com โ ollama pull qwen3.5:9b |
Platform: works cross-platform via WhisperX (CPU or CUDA). The fastest ASR (
qwen3,mlx) is Apple Silicon only and falls back to WhisperX elsewhere. Video encoding uses hardware acceleration on Mac and softwarelibx264elsewhere (correct, just slower) โ see Troubleshooting.
On a Mac (M1/M2/M3/M4), one idempotent script installs everything โ Homebrew, FFmpeg, Node, the Python ASR stack, fonts and the CLI:
git clone https://github.com/<you>/echocut.git && cd echocut
bash scripts/setup-macos.sh # idempotent โ safe to re-run
USE_CN_MIRROR=1 bash scripts/setup-macos.sh # mainland China โ use domestic mirrorsTwo gotchas the script handles for you (mind them if you install by hand): FFmpeg must be
ffmpeg-full, not the slim ffmpeg (the slim build has no libass โ subtitle burn-in
fails), and Node must be the LTS node@22, not the latest (better-sqlite3 has no
prebuilt binary for bleeding-edge Node). Full details, mirrors and troubleshooting:
docs/INSTALL-MACOS.md.
git clone https://github.com/<you>/echocut.git && cd echocut
npm install # Node deps + CLI
python3 -m venv .venv && .venv/bin/pip install -r requirements.txt # Python ASR
npm run fetch-fonts # download default CJK font (Noto Sans SC, OFL)
cp .env.example .env # optional keys (proxy, MiniMax, โฆ)
npm link # register the `echocut` command
echocut doctor # environment self-check# Subtitles + title + brand band + cover + publish kit
echocut burn /path/to/video.mp4 --cut-fillers
# OBS screen recording (face on top, screen below) โ face stays visible, compact title
echocut burn /path/to/obs.mov --obs --headline "Title" --subline "Subtitle"
# Long landscape tutorial โ stays full-screen landscape, readable
echocut burn /path/to/tutorial.mp4 --no-title
# Slice a long video into highlight clips
echocut highlights /path/to/long.mp4 --segments 4Output โ debug_outputs/video/<run_id>/: the *.mp4, a *_cover.jpg, *.srt, and publish.md.
Re-rendering the same video reuses the cached transcription instantly.
--freshforces a re-transcribe;--reuse-captions <file>skips transcription + LLM entirely.
Every brand is one file: configs/brands/<id>.json โ identity, colors, brand capsule,
CTA, BGM defaults and the LLM persona. Start from the template:
cp configs/brands/_template.json configs/brands/mybrand.json
# edit name / colors / @handle / CTA โฆ
echocut burn /path/to/video.mp4 --brand mybrandSee configs/brands/example.json for a filled-in example and _README.md for the field reference.
echocut keeps one frame size per file and adapts the overlays to the source shape.
Your @brand capsule is drawn on every frame โ it's your traceable mark.
Vertical 9:16 (1080ร1920) โ the default for talking-head clips:
โโโโโโโโโโโโโโโโโโโโโโโโโโโโโโ
โ [@yourhandle] Headline โ โ brand capsule (top-left) + title + subline
โ Subline โ
โโโโโโโโโโโโโโโโโโโโโโโโโโโโโโค
โ โ
โ video content โ
โ โ
โ โโโโโโโโโโโโโโโโโโโโโโโ โ
โ โ BIG SUBTITLE โ โ โ large captions, emphasis words highlighted
โ โโโโโโโโโโโโโโโโโโโโโโโ โ
โโโโโโโโโโโโโโโโโโโโโโโโโโโโโโ
echocut burn clip.mp4 --cut-fillers
Landscape 16:9 โ full-screen screencasts / tutorials stay landscape (readable):
โโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโ
โ [@yourhandle] โ โ capsule only (use --no-title)
โ โ
โ video content โ
โ โ
โ โโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโ โ
โ โ BIG SUBTITLE โ โ โ captions on a bottom band
โ โโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโ โ
โโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโ
echocut burn tutorial.mp4 --no-title
(cover is exported as a separate .jpg, not prepended)
OBS โ --obs for "face on top, screen below" recordings: a compact top band keeps
the face visible, with a small title beside the capsule:
โโโโโโโโโโโโโโโโโโโโโโโโโโโโโโ
โ [@you] Small title โ โ compact band โ face below stays visible
โ โโโโโโโโโโโโโโโโโโโโโโโโ โ
โ โ webcam / face โ โ
โ โโโโโโโโโโโโโโโโโโโโโโโโ โ
โ โ
โ screen share โ
โ โโโโโโโโโโโโโโโโโโโโโโโ โ
โ โ BIG SUBTITLE โ โ
โ โโโโโโโโโโโโโโโโโโโโโโโ โ
โโโโโโโโโโโโโโโโโโโโโโโโโโโโโโ
echocut burn obs.mov --obs --headline "Title"
Any other shape (e.g. 4:3 live-stream) โ --auto-pad fits it into the target container,
--strip-top <px> wipes a top watermark band.
echocut is a toolbox. Full reference + agent-friendly guide: docs/CLI.md.
Every subcommand has live help โ echocut <command> --help โ with examples.
| Group | Commands |
|---|---|
| Video core | burn (transcribeโsubtitlesโtitleโbrandโcoverโpublish kit) ยท package (already-edited video โ cover+BGM+CTA) ยท batch (a whole folder) |
| Long video | highlights (auto-slice N clips) ยท hls (analyze + list segments) ยท hmk (render chosen segments) ยท afc (article from a segment) |
| Multi-person | panel-clip (panel โ per-speaker clips) ยท identity-card (name/title overlay) |
| Marketing | distribute (per-platform packages) ยท hook-gen (5 opening hooks) ยท cover (standalone cover .jpg) ยท publish (upload โ signed URL) |
| Text | article ยท essay ยท translate ยท cross-lang (zhโen/ja/es) ยท weekly-retro |
| Media / AI | music (BGM) ยท minimax (tts/image/video) ยท vlog / ingest |
| Ops | doctor (self-check) ยท studio (admin UI) ยท brand (list/show/validate) |
Most flags live on burn โ see the full flag table in docs/CLI.md
for --engine, --ratio, --cut-fillers, --golden-hook, --reuse-captions, and more.
- CLI reference (agent-friendly): docs/CLI.md
- Troubleshooting / FAQ: docs/TROUBLESHOOTING.md
- ASR engine selection: docs/ASR-ENGINES.md
- Contributing & dev setup: CONTRIBUTING.md ยท Roadmap: ROADMAP.md
- AI coding tools (Claude Code / Cursor): CLAUDE.md ยท AGENTS.md
- Changelog: CHANGELOG.md โ Security: SECURITY.md
- All commands:
echocut --help(every subcommand has--helpwith examples) - ็ฎไฝไธญๆๆๆกฃ: README.zh-CN.md
Apache-2.0. Bundled/declared third-party components keep their own licenses โ see NOTICE. The default font (Noto Sans SC) is downloaded at setup under the SIL Open Font License 1.1. The optional Remotion render path is licensed separately and is not required by the default FFmpeg pipeline.