jarvis-runtime

A production-grade, multi-channel personal AI-assistant runtime in TypeScript — built to orchestrate Claude Code and Gemini CLIs as long-running, tool-using, autonomous agents.

jarvis-runtime is a long-lived Node service that turns CLI-based coding agents (Claude Code, Gemini CLI) into a persistent assistant you can talk to from Telegram or a local web UI. It manages sessions, routes work across LLM providers, exposes an in-process tool layer (filesystem, full-text search, headless browser, sandboxed shell), and can run multi-step engineering projects autonomously — including overnight, under budget caps and PM-style oversight.

Snapshot notice: This repository is a March 2026 snapshot of a real, working system. It is published as an engineering portfolio reference. The agent ecosystem has moved fast since — see Status & 2026 Context for an honest account of what still holds up and what I'd rebuild today.

Why this exists

Most "AI assistant" demos are a single chat loop around one API. This is the opposite: a runtime with the unglamorous parts that make an agent usable day-to-day — session persistence, provider failover, rate limiting, crash recovery, a scheduler, budget enforcement, and a unified message layer across channels. The interesting design decisions are in the seams between subsystems, not in any single prompt.

Architecture

Nine cooperating subsystems, wired together in src/index.ts:

| # | Subsystem | What it does | Key source |
| --- | --- | --- | --- |
| 1 | Channels | Unified message interface across Telegram (grammY, locked to a single allowed user ID) and a local web UI (browser chat with Web Speech API voice input, localhost-only). Both feed the same gateway/session/tool stack. | `src/channels/` |
| 2 | Gateway | The agentic loop. Builds context, routes to a provider, runs a bounded tool-call loop, applies token-bucket rate limiting, and does retry-with-provider-fallback on failure. | `src/gateway/`, `src/util/rate-limiter.ts` |
| 3 | LLM Router & Cascade | Task-type routing (`chat`, `worker`, `engineering`, …) with `provider:model` override syntax. Providers: Claude Code CLI, Gemini CLI, OpenRouter (HTTP), and Google AI REST. A cascade provider chains free-tier Gemini API keys → paid Anthropic fallbacks for cost-controlled bulk work. | `src/llm/` |
| 4 | Session + Transcripts | In-memory session manager (history window + idle timeout) backed by an append-only JSONL transcript store per chat, plus persistence of CLI session IDs so conversations survive restarts. | `src/session/`, `src/state/` |
| 5 | Search (SQLite + FTS5) | Full-text index over the markdown knowledge base using SQLite FTS5 (porter tokenizer), with incremental mtime-based refresh and snippet/rank scoring. Rebuilt on boot, refreshed on an interval. | `src/search/` |
| 6 | Tool Registry | A single in-process tool layer exposed to LLMs: filesystem read/write/append, search, headless browser (Playwright), sandboxed shell, project control, orchestrator control, and reminders. | `src/tools/`, `src/browser/`, `src/shell/` |
| 7 | Project Autonomy | Autonomous multi-step engineering: plan → execute → validate, with a git branch per project, dependency-ordered tasks, retry/self-healing, and validation commands as the success gate. | `src/project/` |
| 8 | Orchestrator | Spawns long-lived Claude Code sessions over the stream-json protocol, supervised by a cheaper PM model (via OpenRouter) that classifies output and decides `continue` / `wait_for_human` / `stop` / `escalate`. Includes per-session budget limits and loop detection. | `src/orchestrator/` |
| 9 | Overnight Runner & Scheduler | A time-windowed autonomous queue (e.g. 22:00–06:00) with nightly + per-session USD budget caps, plus an interval scheduler driving proactive checks (inbox size, stale memory, stale projects) and morning digests. | `src/overnight/`, `src/scheduler/` |
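
The Gateway row mentions token-bucket rate limiting. As a rough illustration of that technique (not the contents of `src/util/rate-limiter.ts`, which is not shown here), a minimal bucket might look like the sketch below. The class name, constructor parameters, and the injectable clock are all assumptions made for the example.

```typescript
// Hypothetical sketch of a token bucket. Names and parameters are
// illustrative and not taken from src/util/rate-limiter.ts.
class TokenBucket {
  private readonly capacity: number; // maximum burst size
  private readonly refillPerSecond: number; // sustained rate
  private readonly now: () => number; // injectable clock, handy in tests
  private tokens: number;
  private lastRefillMs: number;

  constructor(capacity: number, refillPerSecond: number, now: () => number = Date.now) {
    this.capacity = capacity;
    this.refillPerSecond = refillPerSecond;
    this.now = now;
    this.tokens = capacity; // start full so the first burst is allowed
    this.lastRefillMs = now();
  }

  /** Try to spend `cost` tokens. Returns false when the caller should back off. */
  tryRemove(cost = 1): boolean {
    this.refill();
    if (this.tokens < cost) return false;
    this.tokens -= cost;
    return true;
  }

  /** Add tokens for the time elapsed since the last call, capped at capacity. */
  private refill(): void {
    const nowMs = this.now();
    const elapsedSeconds = (nowMs - this.lastRefillMs) / 1000;
    this.tokens = Math.min(this.capacity, this.tokens + elapsedSeconds * this.refillPerSecond);
    this.lastRefillMs = nowMs;
  }
}
```

A gateway could call `tryRemove()` before each provider request and queue or reject the message when it returns `false`. Whether the real limiter is per-chat, per-provider, or global is not stated in this README.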

Cross-cutting concerns: structured logging with rotation (winston), a /health HTTP endpoint, global crash handlers, request-scoped IDs, and graceful shutdown with a hard timeout guard.
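
The LLM Router row in the table above describes task-type routing with a `provider:model` override. One plausible way to resolve such an override is sketched here. The default routes, provider ids, and model placeholders are invented for illustration, and the task-type union covers only the three types the table names.

```typescript
// Subset of task types: the table lists chat, worker, engineering and elides the rest.
type TaskType = "chat" | "worker" | "engineering";

interface Route {
  provider: string;
  model: string;
}

// Hypothetical defaults. Provider ids and model names are placeholders,
// not values from the real config.
const DEFAULT_ROUTES: Record<TaskType, Route> = {
  chat: { provider: "claude-cli", model: "default" },
  worker: { provider: "cascade", model: "default" },
  engineering: { provider: "claude-cli", model: "default" },
};

/** Parse "provider:model", splitting on the first colon only. */
function parseOverride(spec: string): Route {
  const sep = spec.indexOf(":");
  if (sep <= 0 || sep === spec.length - 1) {
    throw new Error(`Expected "provider:model", got "${spec}"`);
  }
  return { provider: spec.slice(0, sep), model: spec.slice(sep + 1) };
}

/** An explicit override wins; otherwise fall back to the task type's default route. */
function resolveRoute(taskType: TaskType, override?: string): Route {
  return override ? parseOverride(override) : DEFAULT_ROUTES[taskType];
}
```

Splitting on the first colon only is a deliberate choice in this sketch: some model identifiers contain colons themselves, so everything after the first separator is treated as the model name.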

Tech stack

Language / runtime: TypeScript (strict), Node.js ≥ 20, ES modules
Storage: better-sqlite3 with FTS5 for search; append-only JSONL for transcripts
Channels: grammy (Telegram); native ws + a static web UI for the browser channel
Browser automation: playwright (headless Chromium, pooled)
Validation / config: zod schema over a json5 config file with ${ENV} interpolation
Observability: winston + winston-daily-rotate-file
Process management: PM2 (ecosystem.config.cjs)
Testing: vitest

Test status

505 unit/integration tests passing (vitest, 518 total; 13 live-CLI end-to-end tests are env-gated and skipped by default), spanning every subsystem: router/cascade, providers, gateway loop, session/transcript persistence, FTS5 search, tool registry, shell security, browser pool, project planner/runner/healing, orchestrator store/PM classification, overnight scheduling/budgets, and channel command handling.

```bash
npm test
```

Quick start

Prerequisites

Node.js ≥ 20
(Optional) Claude Code CLI authenticated (claude login) for the Anthropic provider
(Optional) Gemini CLI authenticated for the Gemini provider
A Telegram bot token (from @BotFather) if you use the Telegram channel

Install & build

```bash
npm install
npm run build
npm test
```

Configure

Copy the example environment file and fill in your own values:

```bash
cp .env.example .env
```

The Claude and Gemini providers authenticate through their CLIs' subscription logins, so they need no API key in the environment. The pay-per-token fallback path requires only an OpenRouter key; the Google AI REST keys are optional and power the free-tier worker cascade. Provider blocks whose credentials resolve to empty are stripped automatically at boot, so you can run with whatever subset you have.
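
To make the boot-time stripping concrete, here is a small sketch of `${ENV}` interpolation followed by removal of provider blocks whose credentials come out empty. It is an approximation under assumptions: the `apiKey` field name, the provider ids, and the environment variable names are hypothetical, and the real loader validates the json5 config with zod, which this example leaves out.

```typescript
// Hypothetical shape: the real provider blocks and field names may differ.
interface ProviderBlock {
  apiKey?: string; // absent for CLI-authenticated providers
}

type Env = Record<string, string | undefined>;

/** Replace every ${NAME} with env[NAME], or with "" when the variable is unset. */
function interpolateEnv(value: string, env: Env): string {
  return value.replace(/\$\{([A-Za-z0-9_]+)\}/g, (_match, name: string) => env[name] ?? "");
}

/** Keep providers that need no key, and those whose key resolves to a non-empty string. */
function stripUnconfiguredProviders(
  providers: Record<string, ProviderBlock>,
  env: Env,
): Record<string, ProviderBlock> {
  const kept: Record<string, ProviderBlock> = {};
  for (const [name, block] of Object.entries(providers)) {
    if (block.apiKey === undefined) {
      kept[name] = block; // CLI subscription auth, nothing to resolve
      continue;
    }
    const resolved = interpolateEnv(block.apiKey, env).trim();
    if (resolved !== "") kept[name] = { ...block, apiKey: resolved };
  }
  return kept;
}
```

In a Node process the `env` argument would typically be `process.env`. Passing it in explicitly keeps the function easy to test.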

Runtime behavior (which channels, tools, autonomy levels, budgets, and routing are enabled) is configured in config/default.json5. Browser, shell, project, orchestrator, and overnight modes are all individually toggleable and default to safe limits (allowed base dirs, timeouts, max output sizes, USD budget caps).

Run

```bash
# Development (watch mode)
npm run dev

# Production
npm start

# Or under PM2
pm2 start ecosystem.config.cjs
```

Health check: GET /health on the configured health port.
Local web UI (if enabled): http://localhost:3000.

Security model

This is a single-user system by design. The Telegram channel only responds to one configured user ID; the shell executor restricts execution to an allowlist of base directories with timeout, output-size, and concurrency caps plus blocked-pattern filtering; the browser runs headless and pooled; and autonomous modes enforce per-session and nightly USD budget limits with loop detection. Secrets live only in .env (gitignored); config references them via ${ENV} interpolation, never literals.
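
The base-directory allowlist and blocked-pattern filter can be illustrated with a short check like the one below. This is a sketch of the general technique, not the code in `src/shell/`. The `ShellPolicy` shape and function names are assumptions, and the check does not resolve symlinks, which a hardened version would do (for example with `fs.realpath`).

```typescript
import * as path from "node:path";

// Hypothetical policy shape for illustration only.
interface ShellPolicy {
  allowedBaseDirs: readonly string[];
  blockedPatterns: readonly RegExp[]; // avoid the `g` flag: test() would become stateful
}

/** True when `cwd` is one of the base dirs or lies underneath one of them. */
function isInsideAllowedBase(cwd: string, allowedBaseDirs: readonly string[]): boolean {
  const resolved = path.resolve(cwd); // collapses "." and ".." segments
  return allowedBaseDirs.some((base) => {
    const rel = path.relative(path.resolve(base), resolved);
    if (rel === "") return true; // exactly the base dir
    const escapes = rel === ".." || rel.startsWith(".." + path.sep);
    return !escapes && !path.isAbsolute(rel);
  });
}

/** Returns a rejection reason, or null when the command may run. */
function checkCommand(command: string, cwd: string, policy: ShellPolicy): string | null {
  if (!isInsideAllowedBase(cwd, policy.allowedBaseDirs)) {
    return `cwd outside allowed base dirs: ${cwd}`;
  }
  const hit = policy.blockedPatterns.find((pattern) => pattern.test(command));
  return hit ? `command matches blocked pattern ${hit}` : null;
}
```

Comparing with `path.relative` instead of a plain string prefix avoids the classic mistake of treating `/srv/workspace` as if it were inside `/srv/work`. Timeouts, output-size caps, and concurrency caps would sit around the process spawn itself and are not shown.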

Status & 2026 Context

I'm publishing this as an honest snapshot: a real system, frozen at its last commit (2026-03-01), with a clear-eyed read on how it ages. That last commit pre-dates the Claude Agent SDK (2026-04-08) and a wave of agent infrastructure that landed right after, so several subsystems I hand-rolled here now have first-party or best-in-class equivalents. Knowing exactly which ones is the point.

Still defensible — I'd keep these:

In-process tool registry (subsystem 6). Running tools in-process instead of behind out-of-process servers is a deliberate token-economics call: external tool servers re-inject large schemas into context on every turn. For a latency- and cost-sensitive personal runtime, keeping the hot tools in-process is still the right tradeoff.
SQLite + FTS5 for memory/search (subsystem 5). The 2026 consensus has converged toward embedded SQLite-backed agent memory, not away from it. This was ahead of the curve. The clear next step is hybrid retrieval — add a vector column and fuse lexical + semantic ranking with Reciprocal Rank Fusion.
Unified multi-channel session layer (subsystems 1 & 4). No mainstream agent framework cleanly owns session unification across heterogeneous channels (Telegram + web sharing one session/router/tool stack). This remains genuinely useful glue.
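
The hybrid-retrieval step mentioned for subsystem 5 relies on Reciprocal Rank Fusion, which scores each document as the sum of 1 / (k + rank) over every ranking in which it appears. A minimal sketch, assuming each retriever returns an ordered list of document ids (the function name and result shape are illustrative, and k = 60 is the commonly used constant from the original RRF paper):

```typescript
interface FusedHit {
  id: string;
  score: number;
}

/**
 * Fuse several ranked id lists (best first) with Reciprocal Rank Fusion.
 * Each list contributes 1 / (k + rank) per document, with rank starting at 1.
 */
function reciprocalRankFusion(rankings: readonly (readonly string[])[], k = 60): FusedHit[] {
  const scores = new Map<string, number>();
  for (const ranking of rankings) {
    ranking.forEach((id, index) => {
      const rank = index + 1;
      scores.set(id, (scores.get(id) ?? 0) + 1 / (k + rank));
    });
  }
  return Array.from(scores, ([id, score]) => ({ id, score })).sort((a, b) => b.score - a.score);
}

// Example: one list from FTS5 lexical search, one from a vector index.
const lexical = ["a", "b", "c"];
const semantic = ["b", "d", "a"];
const fused = reciprocalRankFusion([lexical, semantic]);
// Documents found by both retrievers ("b" and "a") outrank those found by one.
```

Because RRF uses only ranks, it needs no score normalization between FTS5's rank values and vector distances, which is the main reason it suits this kind of lexical plus semantic fusion.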

What I'd rebuild on newer tooling today:

Overnight queue (subsystem 9) → a durable-execution platform (Trigger.dev v3 / Inngest). My cron-window + budget-cap runner works, but durable execution gives you retries, replay, and observability for free.
Model cascade / provider failover (subsystem 3) → a model gateway (Vercel AI Gateway / OpenRouter routing). My hand-rolled cascade is sound, but gateways now do failover, cost routing, and key rotation as a managed concern.
Orchestrator & project autonomy (subsystems 7 & 8) → the Claude Agent SDK, Vercel AI SDK's agent primitives, or Claude Code's own goal-driven modes. The patterns here (PM-supervised sessions, plan/execute/validate, loop detection, budget gating) are still exactly right — I'd just implement them on the SDK rather than against the raw stream-json protocol.
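
For context on what the hand-rolled overnight runner has to decide before starting work, here is a sketch of a time-window check that handles windows crossing midnight, combined with a budget gate. The policy shape, field names, and the idea of estimating cost up front are assumptions for illustration. The actual logic in `src/overnight/` may differ.

```typescript
// Hypothetical policy shape. Times are local "HH:MM" strings.
interface OvernightPolicy {
  windowStart: string;
  windowEnd: string;
  nightlyBudgetUsd: number;
  perSessionBudgetUsd: number;
}

function toMinutes(hhmm: string): number {
  const match = /^(\d{2}):(\d{2})$/.exec(hhmm);
  if (!match) throw new Error(`Expected HH:MM, got "${hhmm}"`);
  return Number(match[1]) * 60 + Number(match[2]);
}

/** Start is inclusive, end is exclusive. Handles windows such as 22:00 to 06:00. */
function isInsideWindow(nowMinutes: number, start: string, end: string): boolean {
  const s = toMinutes(start);
  const e = toMinutes(end);
  return s <= e
    ? nowMinutes >= s && nowMinutes < e // same-day window
    : nowMinutes >= s || nowMinutes < e; // window wraps past midnight
}

/** A task may start only inside the window and while both budget caps still hold. */
function mayStartTask(
  policy: OvernightPolicy,
  nowMinutes: number,
  spentTonightUsd: number,
  spentThisSessionUsd: number,
  estimatedCostUsd: number,
): boolean {
  return (
    isInsideWindow(nowMinutes, policy.windowStart, policy.windowEnd) &&
    spentTonightUsd + estimatedCostUsd <= policy.nightlyBudgetUsd &&
    spentThisSessionUsd + estimatedCostUsd <= policy.perSessionBudgetUsd
  );
}
```

A durable-execution platform would replace the scheduling and retry side of this, but a budget gate of this kind would likely still live in application code.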

Net: the architecture and the judgment calls hold up; some of the plumbing now has better off-the-shelf parts. That's the honest state of any system built on the leading edge of a fast-moving field — and being precise about it is the whole point of publishing this.

License

MIT. See LICENSE.
