marcusvx/new-ats

ATS Search PoC

A proof of concept for a scalable search architecture in Applicant Tracking System (ATS) platforms. It demonstrates how separating concerns (two focused OpenSearch indexes, CDC via the Postgres WAL, and vector search for semantic recommendations) resolves the "God Document" anti-pattern without adding operational complexity.

See plan.md for the full technical rationale and architecture breakdown.
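The flow described above (WAL change, Kafka event, one of two focused indexes) can be sketched as a small routing function. This is a minimal sketch under stated assumptions, not the repository's implementation: the `ChangeEvent` and `IndexAction` shapes and the `toIndexAction` helper are hypothetical, and only the index names are taken from the mapping files under infra/scripts/opensearch-mappings/. Treating `jobs` changes as a no-op is a simplification made for the example.

```typescript
// Hypothetical shape of a row-level change captured from the Postgres WAL.
// Field names are assumptions for illustration, not the repo's schema.
type ChangeEvent = {
  table: "candidates" | "applications" | "jobs";
  op: "insert" | "update" | "delete";
  id: string;
  row: Record<string, unknown> | null; // null for deletes
};

// An index operation a Kafka consumer could hand to an OpenSearch client.
type IndexAction =
  | { kind: "upsert"; index: string; id: string; doc: Record<string, unknown> }
  | { kind: "delete"; index: string; id: string };

// Index names follow the mapping files candidates_v1.json and applications_v1.json.
const INDEX_BY_TABLE: Partial<Record<ChangeEvent["table"], string>> = {
  candidates: "candidates_v1",
  applications: "applications_v1",
};

// Translate one change event into at most one index action.
// Tables without a dedicated index (here: jobs) yield no action.
function toIndexAction(event: ChangeEvent): IndexAction | null {
  const index = INDEX_BY_TABLE[event.table];
  if (index === undefined) return null;
  if (event.op === "delete" || event.row === null) {
    return { kind: "delete", index, id: event.id };
  }
  return { kind: "upsert", index, id: event.id, doc: event.row };
}
```

Because each table maps to its own index, a candidate update touches only the candidate document and never a combined document that also carries application state.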


Prerequisites

| Tool | Version | Notes |
|---|---|---|
| Node.js | v24 (see `.nvmrc`) | Run `nvm use` to switch automatically |
| npm | 10+ | Comes with Node 24 |
| Docker | 27+ | Required for all infrastructure services |
| Docker Compose | v2 (plugin) | Invoked as `docker compose`, not `docker-compose` |
| jq | any | Used by the Makefile's OpenSearch targets |
| make | any | macOS ships with it via Xcode CLT |

Stack

| Layer | Technology |
|---|---|
| HTTP API | Fastify 5 |
| Database | Postgres 18 + Drizzle ORM |
| CDC / WAL | pg-logical-replication → Kafka |
| Messaging | Apache Kafka 4.2 (KRaft mode, no ZooKeeper) |
| Search | OpenSearch 3.5 |
| Embeddings | OpenAI text-embedding-3-small (optional, Phase 5) |
| Validation | Zod |
| Testing | Vitest |

Monorepo Structure

./
├── apps/
│   ├── api/              # Fastify HTTP server + Kafka consumers
│   │                     # Clean Architecture: domain → application → infrastructure → entrypoint
│   └── cdc-producer/     # Plain Node — reads Postgres WAL, publishes to Kafka
│
├── libs/
│   ├── shared-types/     # Zod schemas for Kafka events and OpenSearch documents
│   ├── db/               # Drizzle schema, migrations, seed script
│   └── embedding/        # IEmbeddingService interface + OpenAI implementation
│
├── infra/
│   ├── docker-compose.yml
│   └── scripts/
│       ├── setup-replication.sql
│       └── opensearch-mappings/
│           ├── candidates_v1.json
│           └── applications_v1.json
│
├── Makefile              # All project commands
└── plan.md               # Architecture and implementation plan

Managed as an npm workspace — dependencies are hoisted to the root node_modules/ with symlinks for local packages.
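The libs/embedding package is described above as an `IEmbeddingService` interface plus an OpenAI implementation. The sketch below shows why such an interface is useful: a deterministic stand-in can replace the OpenAI-backed service in tests. The `embed` method signature, the `FakeEmbeddingService` class, and the `cosineSimilarity` helper are all assumptions made for illustration; the real interface may look different.

```typescript
// Assumed interface shape; the actual one in libs/embedding may differ.
interface IEmbeddingService {
  embed(text: string): Promise<number[]>;
}

// Deterministic stand-in: folds character codes into a small fixed-size
// vector and L2-normalizes it. The result carries no semantic meaning.
class FakeEmbeddingService implements IEmbeddingService {
  private readonly dimensions: number;

  constructor(dimensions = 8) {
    this.dimensions = dimensions;
  }

  async embed(text: string): Promise<number[]> {
    const vector = new Array<number>(this.dimensions).fill(0);
    for (let i = 0; i < text.length; i++) {
      const slot = i % this.dimensions;
      vector[slot] = (vector[slot] ?? 0) + text.charCodeAt(i);
    }
    const norm = Math.hypot(...vector);
    return norm === 0 ? vector : vector.map((v) => v / norm);
  }
}

// Cosine similarity between two vectors of equal length.
function cosineSimilarity(a: number[], b: number[]): number {
  if (a.length !== b.length) throw new Error("vector lengths differ");
  const dot = a.reduce((sum, x, i) => sum + x * (b[i] ?? 0), 0);
  const normA = Math.hypot(...a);
  const normB = Math.hypot(...b);
  return normA === 0 || normB === 0 ? 0 : dot / (normA * normB);
}
```

Code that depends only on the interface can be exercised without an `OPENAI_API_KEY`, which fits the README's description of embeddings as optional.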


Setup

1. Clone and install

git clone <repo-url>
cd new-ats
nvm use          # switches to Node v24 via .nvmrc

2. Configure environment

cp .env.example .env
# Edit .env if you need non-default ports or credentials
# Set OPENAI_API_KEY to enable vector search (Phase 5, optional)

3. First-time setup (one command)

make setup

This runs in order: install → infra-up → db-migrate → db-pub → os-create-indices.

4. Load test data

make db-seed        # 1,000 candidates, 10 jobs, 500 applications

5. Start development servers

make dev            # starts api + cdc-producer in parallel

The API will be available at http://localhost:3000.


Common Commands

make help           # List all available commands

# Infrastructure
make infra-up       # Start Docker services
make infra-down     # Stop Docker services
make infra-logs     # Follow service logs
make infra-ps       # Check service health

# Database
make db-migrate     # Apply Drizzle migrations
make db-seed        # Load test data
make db-reset       # Nuke + restart everything clean

# OpenSearch
make os-create-indices   # Create index mappings
make os-count            # Check document counts

# Development
make dev-api        # Start API only
make dev-cdc        # Start CDC producer only

# Testing & Linting
make test           # Run all unit tests
make test-api       # Run just apps/api tests
make lint           # ESLint + Prettier check
make lint-fix       # Auto-fix

# Cleanup
make clean          # Remove compiled dist/ dirs
make clean-all      # Remove dist/, node_modules/, Docker volumes

API Endpoints

| Method | Path | Description |
|---|---|---|
| GET | /api/search/candidates | Talent pool search. Query parameters: q, skills[], location, seniority, page |
| GET | /api/jobs/:id/applications | Screening view for a job. Query parameters: stage, status, q, page |
| GET | /api/jobs/:id/recommendations | Candidate recommendations. Query parameter: mode (fulltext, vector, or hybrid) |
| POST | /api/candidates/:id/view | Record a profile view (async, fire-and-forget) |
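As an illustration of calling the candidate search endpoint listed above, the following sketch builds a request URL from a typed filter object. The `CandidateSearchParams` type and `buildCandidateSearchUrl` helper are hypothetical. Parameter names follow the endpoint list, but sending `skills[]` as a repeated query key is an assumption about the wire format, not something the README confirms.

```typescript
// Hypothetical filter object; parameter names follow the endpoint list.
type CandidateSearchParams = {
  q?: string;
  skills?: string[];
  location?: string;
  seniority?: string;
  page?: number;
};

// Build the search URL. Repeating the "skills[]" key once per skill is an
// assumed encoding; adjust it to whatever the API actually parses.
function buildCandidateSearchUrl(baseUrl: string, params: CandidateSearchParams): URL {
  const url = new URL("/api/search/candidates", baseUrl);
  if (params.q) url.searchParams.set("q", params.q);
  for (const skill of params.skills ?? []) {
    url.searchParams.append("skills[]", skill);
  }
  if (params.location) url.searchParams.set("location", params.location);
  if (params.seniority) url.searchParams.set("seniority", params.seniority);
  if (params.page !== undefined) url.searchParams.set("page", String(params.page));
  return url;
}

// Example values are made up for illustration.
const exampleUrl = buildCandidateSearchUrl("http://localhost:3000", {
  q: "backend engineer",
  skills: ["typescript", "kafka"],
  page: 2,
});
// With `make dev` running, `await fetch(exampleUrl)` would issue the request.
```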

Environment Variables

| Variable | Default | Description |
|---|---|---|
| DATABASE_URL | postgres://ats:ats@localhost:5432/ats | Postgres connection string |
| KAFKA_BROKERS | localhost:9092 | Comma-separated broker list |
| OPENSEARCH_URL | http://localhost:9200 | OpenSearch node URL |
| OPENAI_API_KEY | (unset) | Enables vector search. Without it, recommendations fall back to full-text search |
| DISABLE_CONSUMERS | (unset) | Set to true to skip starting Kafka consumers (useful in tests) |
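The defaults and the full-text fallback described above can be sketched as a small config loader. This is a sketch under assumptions, not the project's actual code: `AppConfig`, `loadConfig`, and `resolveRecommendationMode` are hypothetical names, and the rule that both `vector` and `hybrid` degrade to `fulltext` when no key is set is one plausible reading of the fallback.

```typescript
type RecommendationMode = "fulltext" | "vector" | "hybrid";

// Hypothetical config shape mirroring the environment variables listed above.
type AppConfig = {
  databaseUrl: string;
  kafkaBrokers: string[];
  opensearchUrl: string;
  openaiApiKey: string | undefined;
  disableConsumers: boolean;
};

// Read settings from an env-like object, applying the documented defaults.
function loadConfig(env: Record<string, string | undefined>): AppConfig {
  return {
    databaseUrl: env["DATABASE_URL"] ?? "postgres://ats:ats@localhost:5432/ats",
    kafkaBrokers: (env["KAFKA_BROKERS"] ?? "localhost:9092")
      .split(",")
      .map((broker) => broker.trim())
      .filter((broker) => broker.length > 0),
    opensearchUrl: env["OPENSEARCH_URL"] ?? "http://localhost:9200",
    // Treat an empty string the same as unset.
    openaiApiKey: env["OPENAI_API_KEY"] || undefined,
    disableConsumers: env["DISABLE_CONSUMERS"] === "true",
  };
}

// Assumed fallback rule: without an API key, every mode degrades to full-text.
function resolveRecommendationMode(
  requested: RecommendationMode,
  config: AppConfig,
): RecommendationMode {
  return config.openaiApiKey === undefined ? "fulltext" : requested;
}
```

In an application this would typically be called once at startup as `loadConfig(process.env)`; passing the environment in as an argument keeps the function easy to test.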
