Chain Index is a focused blockchain data product for teams that need reliable, queryable ERC-20 transfer data without building and operating a full custom indexing stack from scratch.
At a product level, it turns raw on-chain activity into an operational data service: transfers are ingested from an Ethereum-compatible RPC endpoint, normalized, stored in PostgreSQL, exposed over HTTP, and optionally published to Kafka for downstream consumers. At an engineering level, it is a small Go service built around finalized-block indexing, idempotent persistence, reorg handling, and production-friendly observability.
Direct RPC access is a poor fit for most product and data workflows. It is expensive to query repeatedly, difficult to paginate consistently, and awkward to connect to internal systems such as analytics pipelines, customer operations tooling, or risk engines.
Chain Index closes that gap by providing a compact indexing layer for a defined set of ERC-20 tokens.
It is useful when you need to:
- power wallet, token, or treasury activity views in a product UI
- support compliance, finance, or operations teams with searchable transfer history
- feed internal event-driven systems from blockchain activity via Kafka
- backfill historical token flows and then keep them updated in live mode
- avoid coupling product features directly to RPC latency and rate limits
For a configured list of token contract addresses, Chain Index:
- reads finalized blocks from an Ethereum-compatible chain
- fetches ERC-20 Transfer logs for those tokens
- stores indexed blocks and transfer records in PostgreSQL
- exposes transfer history and indexing stats over HTTP
- emits transfer events to either an in-memory broker or Kafka
This makes it suitable both as a standalone indexing service and as a reusable data plane component inside a larger platform.
- Focused scope: indexes ERC-20 Transfer events for explicitly configured token addresses, not arbitrary contract events
- Operationally simple: one Go binary with PostgreSQL, optional Kafka, and Prometheus-compatible metrics
- Useful for both history and realtime: supports backfill mode and continuous live indexing
- Designed for internal product teams: easy to query, easy to deploy, easy to integrate into downstream systems
- Built for correctness over novelty: finalized-block strategy, reorg detection, and idempotent writes reduce data drift
The service follows a ports-and-adapters architecture.
- The indexer reads configuration from environment variables.
- On startup it opens PostgreSQL, applies SQL migrations automatically, and connects to the RPC endpoint.
- It determines the latest finalized block. If the chain does not expose finalized or safe tags, it falls back to latest minus a configurable confirmation depth.
- It resumes from the last stored block or from START_BLOCK for a fresh deployment.
- It fetches block headers in batches, validates parent-child continuity, and detects reorgs.
- For each batch, it fetches ERC-20 Transfer logs for the configured tokens.
- It upserts indexed block metadata and transfer records into PostgreSQL.
- It publishes each transfer to the configured broker.
- It updates metrics and serves query traffic over HTTP.
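The finality and resume steps above can be sketched as follows. This is a minimal illustration rather than the service's actual code: the function names are hypothetical, and resuming at the block after the last stored one is an assumption about how "resumes from the last stored block" is implemented.

```go
package main

import "fmt"

// resolveFinalized returns the highest block considered safe to index.
// If the node reported a finalized (or safe) block, that value is used;
// otherwise it falls back to latest minus the confirmation depth.
func resolveFinalized(finalized uint64, hasFinalized bool, latest, confirmations uint64) uint64 {
	if hasFinalized {
		return finalized
	}
	if latest < confirmations {
		return 0 // chain is shorter than the confirmation depth
	}
	return latest - confirmations
}

// resumeFrom picks the first block to index: the block after the last
// stored one (assumption), or startBlock on a fresh deployment.
func resumeFrom(lastStored uint64, hasStored bool, startBlock uint64) uint64 {
	if hasStored {
		return lastStored + 1
	}
	return startBlock
}

func main() {
	// No finalized tag available, latest block 1000, 12 confirmations.
	target := resolveFinalized(0, false, 1000, 12)
	// Fresh deployment with START_BLOCK=900.
	from := resumeFrom(0, false, 900)
	fmt.Println(from, target) // prints: 900 988
}
```

The underflow guard matters on young or local test chains, where the head can be lower than the configured confirmation depth.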
Chain Index is intentionally opinionated about correctness.
- Finalized-first indexing: the reader prioritizes finalized or safe block tags before using a confirmation-based fallback
- Reorg-aware processing: if a stored block hash no longer matches observed chain history, the service deletes affected rows from the divergence point and replays from there
- Idempotent storage: blocks and transfers are written with upsert semantics, so restarts and retries do not create duplicates
- Retry-aware runtime: transient RPC failures, such as quota (rate-limit) errors and timeouts, are retried with backoff by the indexer supervisor
- Explicit readiness: readiness checks verify both PostgreSQL reachability and RPC access
This makes the service suitable for production workloads where downstream users care about stable, replayable historical data.
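As a rough sketch of the continuity and reorg checks described above, the following compares stored block hashes with freshly observed headers. The `Header` type and both function names are hypothetical and only illustrate the idea; the real service may structure this differently.

```go
package main

import "fmt"

// Header is a hypothetical minimal block header used for this sketch.
type Header struct {
	Number     uint64
	Hash       string
	ParentHash string
}

// contiguous reports whether block numbers increase by one and each
// header's ParentHash equals the previous header's Hash.
func contiguous(batch []Header) bool {
	for i := 1; i < len(batch); i++ {
		if batch[i].Number != batch[i-1].Number+1 || batch[i].ParentHash != batch[i-1].Hash {
			return false
		}
	}
	return true
}

// divergencePoint returns the lowest observed block number whose stored
// hash differs from the observed hash, and false if nothing diverged.
func divergencePoint(stored map[uint64]string, observed []Header) (uint64, bool) {
	var lowest uint64
	found := false
	for _, h := range observed {
		if s, ok := stored[h.Number]; ok && s != h.Hash {
			if !found || h.Number < lowest {
				lowest, found = h.Number, true
			}
		}
	}
	return lowest, found
}

func main() {
	stored := map[uint64]string{100: "0xa", 101: "0xb", 102: "0xc"}
	observed := []Header{
		{Number: 100, Hash: "0xa", ParentHash: "0x9"},
		{Number: 101, Hash: "0xb2", ParentHash: "0xa"},
		{Number: 102, Hash: "0xc2", ParentHash: "0xb2"},
	}
	n, reorg := divergencePoint(stored, observed)
	fmt.Println(contiguous(observed), n, reorg) // prints: true 101 true
}
```

In this sketch, rows at or above the returned block number would be deleted and re-indexed from the observed chain.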
Core runtime components:
- Indexer service: orchestrates batching, finality tracking, reorg handling, persistence, and broker publishing
- Ethereum RPC adapter: fetches block headers and transfer logs from an Ethereum-compatible node
- PostgreSQL store: persists indexed blocks and transfers and serves query workloads
- HTTP API adapter: exposes health, readiness, metrics, stats, and transfer search endpoints
- Broker adapter: publishes transfer payloads to Kafka or stores them in memory for local development and tests
- Observability package: exposes Prometheus metrics for RPC latency, RPC failures, and indexing lag
The product stores two core entities.
Indexed blocks:
- block number
- block hash
- parent hash
- block timestamp
- created and updated timestamps
Transfers:
- block number and block hash
- transaction hash and log index
- from and to addresses
- token address
- raw transfer value as a string
- block timestamp
- created and updated timestamps
Database indexes support common query patterns on sender, recipient, token, and reverse chronological block traversal.
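To make the transfer entity and its idempotent write semantics concrete, here is a small in-memory sketch. The field names are illustrative, and treating (transaction hash, log index) as the unique key is an assumption; the actual PostgreSQL schema may use a different constraint.

```go
package main

import (
	"fmt"
	"time"
)

// Transfer mirrors the fields listed above; names are hypothetical.
type Transfer struct {
	BlockNumber uint64
	BlockHash   string
	TxHash      string
	LogIndex    uint
	From        string
	To          string
	Token       string
	Value       string // raw on-chain amount as a base-10 string
	BlockTime   time.Time
}

// transferKey is the assumed uniqueness key for a transfer row.
type transferKey struct {
	txHash   string
	logIndex uint
}

// Store is an in-memory stand-in for the PostgreSQL table.
type Store struct {
	rows map[transferKey]Transfer
}

func NewStore() *Store { return &Store{rows: make(map[transferKey]Transfer)} }

// Upsert inserts the transfer or overwrites the row with the same key,
// so replaying a batch never creates duplicates.
func (s *Store) Upsert(t Transfer) {
	s.rows[transferKey{t.TxHash, t.LogIndex}] = t
}

// Len returns the number of distinct transfers stored.
func (s *Store) Len() int { return len(s.rows) }

func main() {
	s := NewStore()
	t := Transfer{BlockNumber: 100, TxHash: "0xabc", LogIndex: 2, Value: "1000000", BlockTime: time.Unix(1700000000, 0)}
	s.Upsert(t)
	s.Upsert(t) // replay after a restart or retry
	fmt.Println(s.Len()) // prints: 1
}
```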
- GET /healthz returns a simple process health response
- GET /readyz verifies PostgreSQL and RPC connectivity
- GET /metrics exposes Prometheus metrics
- GET /stats returns aggregate indexing statistics
- GET /transfers returns transfer history filtered by wallet address and/or token address
Example:
curl "http://localhost:8080/transfers?address=0xabc...&limit=50&offset=0"
curl "http://localhost:8080/transfers?token=0xdac17f958d2ee523a2206206994597c13d831ec7&limit=100"
curl "http://localhost:8080/stats"

The transfers endpoint requires at least one of:
- address
- token
Supported pagination parameters:
- limit, default 100, max 500
- offset, default 0
Each indexed transfer can be published as JSON.
- memory broker: useful for tests and local-only runs
- Kafka broker: useful when transfer activity should feed data pipelines, alerting, enrichment, settlement, or product automations
Kafka messages use the transaction hash as the message key and the full transfer payload as the message value.
Prometheus metrics include:
- chain_index_rpc_latency_seconds
- chain_index_rpc_failures_total
- chain_index_indexing_lag_blocks
- chain_index_indexing_lag_seconds
These cover the two most important operational questions:
- is the RPC dependency healthy enough to sustain indexing?
- how far behind realtime is the indexer?
- Go 1.25+
- Docker and Docker Compose
- access to an Ethereum-compatible RPC endpoint
- Copy the environment template.
- Fill in at least RPC_URL and TOKEN_ADDRESSES.
- Start the stack.
cp .env.example .env
docker compose up --build

The compose setup includes:
- PostgreSQL
- Redpanda acting as a Kafka-compatible broker
- Redpanda Console
- the Chain Index application
By default the app is exposed on port 8080 and the Redpanda Console on port 8081.
Start local dependencies first, then run:
go run ./cmd/indexer

Environment variables:
| Variable | Required | Description |
|---|---|---|
| DATABASE_URL | Yes | PostgreSQL connection string |
| RPC_URL | Yes | Ethereum-compatible RPC endpoint |
| HTTP_ADDR | No | HTTP listen address, default :8080 |
| LOG_LEVEL | No | debug, info, warn, or error |
| LOG_FORMAT | No | json or text |
| MODE | Yes | backfill or live |
| START_BLOCK | Yes | First block to index on a fresh deployment |
| END_BLOCK | No | Upper bound for backfill runs |
| BATCH_SIZE | No | Number of blocks processed per batch |
| TOKEN_ADDRESSES | Yes | Comma-separated ERC-20 token contract addresses |
| POLL_INTERVAL | No | How often live mode checks for new finalized blocks |
| RPC_TIMEOUT | No | Timeout per RPC request |
| RPC_RETRIES | No | Retry attempts for RPC calls |
| RPC_BACKOFF | No | Initial RPC retry backoff |
| RPC_FINALITY_FALLBACK_CONFIRMATIONS | No | Confirmation depth used if finalized or safe tags are unavailable |
| BROKER_KIND | Yes | memory or kafka |
| KAFKA_BROKERS | No | Comma-separated Kafka broker list, used when BROKER_KIND is kafka |
| KAFKA_TOPIC | No | Kafka topic for transfer events, used when BROKER_KIND is kafka |
| APP_PORT | Compose only | Host port mapping for the app container |
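As an illustration of how a subset of these variables might be read and validated, consider the sketch below. The `Config` type and `loadConfig` function are hypothetical, lowercasing token addresses is an assumed normalization step, and the real service handles more variables than shown.

```go
package main

import (
	"errors"
	"fmt"
	"os"
	"strings"
)

// Config holds a subset of the settings from the table above.
type Config struct {
	RPCURL     string
	HTTPAddr   string
	Mode       string
	BrokerKind string
	Tokens     []string
}

// loadConfig reads settings through getenv so it can be tested without
// touching the process environment.
func loadConfig(getenv func(string) string) (Config, error) {
	cfg := Config{
		RPCURL:     getenv("RPC_URL"),
		HTTPAddr:   getenv("HTTP_ADDR"),
		Mode:       getenv("MODE"),
		BrokerKind: getenv("BROKER_KIND"),
	}
	if cfg.HTTPAddr == "" {
		cfg.HTTPAddr = ":8080" // documented default
	}
	if cfg.RPCURL == "" {
		return cfg, errors.New("RPC_URL is required")
	}
	if cfg.Mode != "backfill" && cfg.Mode != "live" {
		return cfg, fmt.Errorf("MODE must be backfill or live, got %q", cfg.Mode)
	}
	if cfg.BrokerKind != "memory" && cfg.BrokerKind != "kafka" {
		return cfg, fmt.Errorf("BROKER_KIND must be memory or kafka, got %q", cfg.BrokerKind)
	}
	// Split the comma-separated list, dropping blanks and normalizing case.
	for _, a := range strings.Split(getenv("TOKEN_ADDRESSES"), ",") {
		if a = strings.ToLower(strings.TrimSpace(a)); a != "" {
			cfg.Tokens = append(cfg.Tokens, a)
		}
	}
	if len(cfg.Tokens) == 0 {
		return cfg, errors.New("TOKEN_ADDRESSES is required")
	}
	return cfg, nil
}

func main() {
	cfg, err := loadConfig(os.Getenv)
	if err != nil {
		fmt.Println("config error:", err)
		return
	}
	fmt.Printf("%+v\n", cfg)
}
```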
Typical ways product and platform teams use Chain Index:
- As an internal data service behind dashboards or support tooling
- As a transfer event source for stream-processing systems via Kafka
- As a backfill worker that seeds PostgreSQL before a product launch
- As a lightweight chain-ingestion component inside a broader fintech or web3 platform
This repository is deliberately narrow.
- It indexes only ERC-20 Transfer events
- Token coverage is allowlist-based through configuration
- The query API is optimized for operational lookups; it is not a replacement for broad analytical SQL
- It stores values as strings to preserve on-chain precision and avoid implicit decimal assumptions
That narrow scope is intentional: for many teams, a dependable and understandable indexing service is more valuable than a generic but harder-to-operate indexing platform.
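Because values are stored as raw strings, consumers apply token decimals themselves. The following sketch shows one way to do that without losing precision; `formatUnits` is a hypothetical helper, and the decimals value must come from the caller (for example, from the token contract), since it is not part of the stored transfer record described here.

```go
package main

import (
	"fmt"
	"math/big"
	"strings"
)

// formatUnits renders a raw base-10 integer string as a decimal string
// using the given number of token decimals, with no floating-point math.
func formatUnits(raw string, decimals int) (string, error) {
	if decimals < 0 {
		return "", fmt.Errorf("invalid decimals %d", decimals)
	}
	n, ok := new(big.Int).SetString(raw, 10)
	if !ok || n.Sign() < 0 {
		return "", fmt.Errorf("invalid raw value %q", raw)
	}
	s := n.String()
	if decimals == 0 {
		return s, nil
	}
	// Left-pad so at least one digit remains before the decimal point.
	if len(s) <= decimals {
		s = strings.Repeat("0", decimals-len(s)+1) + s
	}
	whole, frac := s[:len(s)-decimals], s[len(s)-decimals:]
	frac = strings.TrimRight(frac, "0")
	if frac == "" {
		return whole, nil
	}
	return whole + "." + frac, nil
}

func main() {
	out, err := formatUnits("1500000", 6) // e.g. a token with 6 decimals
	fmt.Println(out, err)                 // prints: 1.5 <nil>
}
```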
- Written in Go with a small runtime footprint
- Uses PostgreSQL migrations on startup via Goose
- Ships as a single container image based on a distroless runtime
- Emits logs in JSON or text format, selected with LOG_FORMAT
- Keeps infrastructure optional: Kafka can be switched off in favor of the in-memory broker for basic local flows
Chain Index is a good fit for:
- product managers validating token-driven features and needing fast access to transfer history
- backend engineers who need a dependable blockchain ingestion layer
- data and analytics teams that want a normalized operational dataset
- platform teams that need Kafka-ready blockchain events without maintaining a larger indexing platform
If you need a small, production-ready ERC-20 transfer indexing service that is understandable by both engineers and business stakeholders, this repository is built for that use case.