RepoIntel: Open Source Document Intelligence

Built a document intelligence platform to solve a real problem on our team, then open-sourced it as a one-click Railway template so any team could deploy it in minutes.

Year
2026
Type
Open Source
Built in
48 hours
Stack
Next.js, Ollama, ChromaDB, SQLite
Links
RepoIntel dashboard showing document stats, quick actions, and API endpoints

33 documents, zero findability

I was deep into the design strategy across a large enterprise program spanning multiple workstreams. The output was substantial: analysis documents, gap trackers, blueprint decks, and strategy write-ups, all living in a GitHub repo. These were sensitive internal documents that couldn't be sent to third-party AI services.

The problem was simple: nobody could find anything. Stakeholders would ask "what's the status of the onboarding workflows?" and the answer required opening five different .docx files, cross-referencing an Excel tracker, and remembering which decisions had been made three months ago. Every question meant a manual scavenger hunt. I watched our team burn hours every week just locating information we'd already produced.

This wasn't assigned work. Nobody filed a ticket or raised it in a retro. I spotted the pattern: every meeting started with someone digging through files, every status update required re-reading documents we'd already written, and every new team member faced a weeks-long ramp just to understand what had been decided. I recognized that as a systemic problem, not just an inconvenience, and took it on myself to solve it. The goal was to give our team back the hours we were burning on document archaeology so we could spend them on the actual strategy work. And because these were sensitive internal documents, whatever I built had to keep everything local. No external APIs. No data leaving the machine.

33
Documents indexed
1,847
Chunks extracted
6
Entity types tracked
5
Document formats

Build the tool, then open-source it

The first version was called Recall, an internal tool built for a specific enterprise program. It indexed workflow documents from GitHub, extracted entities (decisions, dependencies, gaps, stakeholders, milestones, workflows), mapped relationships between them, and gave the team a RAG-powered chat interface to ask questions with source citations. Everything ran locally. Ollama handled embeddings and chat inside the container. No data left the machine. No API keys needed. For a team working with sensitive internal documents, this wasn't a nice-to-have; it was a requirement.

It worked. Questions that took 20 minutes of document diving now took 10 seconds. But I realized the problem wasn't unique to our team. Every program with a document corpus has the same findability gap and the same privacy concerns about sending proprietary documents to external AI services. So I rebuilt Recall as RepoIntel, a generalized, open-source version that any team could deploy against their own GitHub repo with the same local-first privacy guarantees.

The key design decision: ship it as a Railway template. One click, environment variables configured, and you have a running instance indexing your repo. No Docker knowledge required. No infrastructure setup. No API keys. The gap between "I have documents" and "I have intelligence" should be measured in minutes, not sprints.

How it all fits together

Source: GitHub repo, synced via push webhooks (.docx, .xlsx, .csv, .md, .pdf)

Indexing pipeline: SHA diff check (changed files only) → document parsers (5 format-specific) → structure-aware chunker (512 tokens, heading hierarchy) → Ollama embeddings (768-dim, runs locally, no API key) → entity extraction (6 types via LLM)

Storage: SQLite (documents, entities, relations, gaps, risks) + ChromaDB (768-dim vector index)

Query engine: intent classifier (factual, synthesis, relational, exploratory) → 4-strategy retriever (routes by intent type) → LLM response (streamed with citations)

Interface: Dashboard (docs, domains, health scores), RAG Chat (multi-turn Q&A with sources), Gap Tracker (status, domain, workflow filters), Risk Radar (6 detection rules, severity scoring)
End-to-end data flow: from GitHub push to user-facing intelligence. Everything runs locally, no data leaves the container.

Indexing that understands structure

01
Incremental indexing via SHA comparison
Every file's SHA hash is stored on first index. On subsequent runs (or webhook triggers), only changed files are reprocessed. For a 33-document corpus, this turns a 10-minute full reindex into a 30-second incremental update.
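The diff step can be sketched as a pure function. This is illustrative only: the function name `diffBySha` and the shapes are assumptions, not RepoIntel's actual API. GitHub's tree API reports a blob SHA per file, so a file is reindexed only when its SHA differs from the one stored at last index time.

```typescript
// Sketch of SHA-based incremental indexing (illustrative; `diffBySha`
// and these shapes are hypothetical, not RepoIntel's actual API).
type FileEntry = { path: string; sha: string };

function diffBySha(
  stored: Map<string, string>, // path -> SHA recorded at last index run
  current: FileEntry[],        // current repo tree from the GitHub API
): { changed: FileEntry[]; deleted: string[] } {
  // A file is "changed" if it's new or its SHA no longer matches.
  const changed = current.filter((f) => stored.get(f.path) !== f.sha);
  // A stored path missing from the current tree was deleted upstream.
  const currentPaths = new Set(current.map((f) => f.path));
  const deleted = [...stored.keys()].filter((p) => !currentPaths.has(p));
  return { changed, deleted };
}
```

Only `changed` flows into the parsing pipeline; `deleted` paths are purged from the index.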
02
Format-specific parsing with heading extraction
Five parsers handle DOCX, XLSX, CSV, Markdown, and PDF. Each extracts structural metadata: heading hierarchy from DOCX, sheet names from XLSX, header rows from CSV. This structure is preserved through chunking so the system knows that a chunk about "gap analysis" came from "Q4 Planning > Workflow 3 > Gap Summary."
03
Structure-aware chunking
Documents are split at heading boundaries rather than arbitrary token counts. Chunks respect the document's own organization: a section stays together if it's under 512 tokens, gets split with overlap if it's longer, and gets merged with its neighbor if it's too short. The section path (e.g., "Heading 1 > Subheading 2") travels with every chunk.
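The split/merge/overlap rules above can be sketched roughly as follows. This is a simplified illustration under stated assumptions: tokens are approximated by a word count rather than a real tokenizer, and the minimum-merge threshold and overlap size are invented for the example, not taken from RepoIntel.

```typescript
// Sketch of heading-boundary chunking (illustrative; word-count "tokens",
// MIN_TOKENS, and the overlap size are assumptions, not RepoIntel's values).
const MAX_TOKENS = 512;
const MIN_TOKENS = 64; // assumed merge threshold
const OVERLAP = 50;    // assumed overlap for oversized sections

type Section = { path: string; text: string }; // path e.g. "Heading 1 > Subheading 2"
type Chunk = { path: string; text: string };

const tokens = (s: string) => s.split(/\s+/).filter(Boolean).length;

function chunkSections(sections: Section[]): Chunk[] {
  const chunks: Chunk[] = [];
  for (const sec of sections) {
    if (tokens(sec.text) <= MAX_TOKENS) {
      const prev = chunks[chunks.length - 1];
      // Too-short section: merge into the previous chunk if it still fits.
      if (prev && tokens(sec.text) < MIN_TOKENS &&
          tokens(prev.text) + tokens(sec.text) <= MAX_TOKENS) {
        prev.text += "\n" + sec.text;
      } else {
        chunks.push({ ...sec });
      }
    } else {
      // Oversized section: split into overlapping windows; the section
      // path travels with every chunk.
      const words = sec.text.split(/\s+/).filter(Boolean);
      for (let i = 0; i < words.length; i += MAX_TOKENS - OVERLAP) {
        chunks.push({ path: sec.path, text: words.slice(i, i + MAX_TOKENS).join(" ") });
      }
    }
  }
  return chunks;
}
```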
04
Entity extraction and relationship mapping
An LLM pass over each chunk extracts six entity types: decisions, dependencies, gaps, stakeholders, milestones, and workflows. A second pass detects relationships: what blocks what, who owns what, what supersedes what. This entity graph is what powers the relational query strategy and the risk detection engine.
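Because the extraction pass relies on LLM output, the results need validation before they enter the entity graph. A minimal sketch, assuming the LLM is prompted to return a JSON array; the schema and the `parseEntities` helper are hypothetical, not RepoIntel's actual code.

```typescript
// Sketch of validating LLM entity output (illustrative; the JSON shape
// and `parseEntities` are assumptions, not RepoIntel's actual schema).
const ENTITY_TYPES = [
  "decision", "dependency", "gap", "stakeholder", "milestone", "workflow",
] as const;
type EntityType = (typeof ENTITY_TYPES)[number];

type Entity = { type: EntityType; name: string; chunkId: string };

function parseEntities(llmJson: string, chunkId: string): Entity[] {
  let raw: unknown;
  try {
    raw = JSON.parse(llmJson); // LLM output may be malformed JSON
  } catch {
    return [];
  }
  if (!Array.isArray(raw)) return [];
  return raw.flatMap((e: any) =>
    ENTITY_TYPES.includes(e?.type) && typeof e?.name === "string"
      ? [{ type: e.type as EntityType, name: e.name, chunkId }]
      : [], // drop entries with unknown or hallucinated types
  );
}
```

Keeping the type list closed means a hallucinated category never pollutes the graph that the relational retriever traverses.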
05
4-strategy RAG retrieval
Not every question needs the same retrieval approach. The intent classifier routes queries to one of four strategies: factual (direct entity lookup), synthesis (broad semantic search), relational (entity graph traversal), or exploratory (risk and gap scanning). A question like "who owns the onboarding workflows?" hits a different pipeline than "what should I be worried about?" And because Ollama runs locally, even the queries themselves never leave your infrastructure.
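The routing idea can be shown with a toy classifier. RepoIntel's actual classifier is model-based; these keyword rules are a stand-in purely to illustrate how one intent maps to one retrieval strategy.

```typescript
// Toy intent router (illustrative; RepoIntel's real classifier is not
// keyword-based -- this only demonstrates the one-intent-per-strategy idea).
type Intent = "factual" | "synthesis" | "relational" | "exploratory";

function classifyIntent(q: string): Intent {
  const s = q.toLowerCase();
  if (/\b(blocks?|depends?|owns?|supersedes?)\b/.test(s)) return "relational";
  if (/\b(risk|worried|gaps?|missing)\b/.test(s)) return "exploratory";
  if (/\b(who|when|which|what is)\b/.test(s)) return "factual";
  return "synthesis"; // broad semantic search is the fallback
}

// Each intent routes to a different retrieval pipeline.
const STRATEGY: Record<Intent, string> = {
  factual: "direct entity lookup",
  synthesis: "broad semantic search",
  relational: "entity graph traversal",
  exploratory: "risk and gap scanning",
};
```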

Four views, one intelligence layer

The UI is built with IBM's Carbon Design System. Each view serves a different mode of interaction: browsing (Dashboard), asking (Chat), tracking (Gap Tracker), and monitoring (Risk Radar). They all read from the same locally stored index, so a gap surfaced in chat is the same gap tracked in the Gap Tracker and scored in the Risk Radar. No external services are called at any point in the user flow.

RepoIntel admin authorization modal for triggering incremental indexing
Admin-protected indexing: incremental mode only processes new or changed files, password-gated for security
RepoIntel chat interface with suggested queries and input field
RAG Chat: suggested queries surface common intents, all processing happens locally via Ollama

Why give it away

RepoIntel started as a tool to solve my team's problem. But the pattern it solves, turning a pile of documents into searchable, structured intelligence without sending anything to a third party, is universal. Every consulting engagement, every enterprise program, every legal team has a document corpus that's both hard to navigate and too sensitive for external AI tools.

Open-sourcing it as a one-click Railway template was a deliberate decision. The deployment barrier had to be zero. No Docker knowledge, no infrastructure provisioning, no config files to edit. Click deploy, add your GitHub token and repo, and you have a running instance in under five minutes. Your data stays on your Railway instance.

The template packages everything into a single container: Next.js for the app, Ollama for local embeddings and chat (zero API keys required by default), ChromaDB for vector storage, and SQLite for structured data. Railway's persistent volume means your index survives redeploys. The entire AI stack runs inside the container, so document content never touches an external server.

1
Click to deploy
0
API keys required
5
Minutes to running
MIT
License

Technical bets that paid off

01
Privacy by architecture, not policy
Ollama runs embeddings and chat locally inside the container. Documents are parsed, chunked, embedded, and queried without a single byte leaving the machine. No API keys are required to run the full platform. This was a non-negotiable design constraint: teams working with sensitive internal documents, legal files, or proprietary strategy shouldn't have to trust a third-party API to search their own work. Cloud providers (Mistral, Anthropic) are available as optional upgrades for teams that want higher-quality extraction, but the default experience is fully local and fully private.
02
Single-container deployment
Instead of orchestrating three separate services (app, vector DB, LLM), everything runs in one Docker container managed by a shell script. This trades some operational elegance for dramatically simpler deployment. For the target user (a small team that wants to deploy and forget), simplicity wins.
03
Structure-aware over naive chunking
Most RAG systems chunk documents at fixed token intervals. RepoIntel chunks at heading boundaries, preserving the document's own organization. This means the retriever returns "the gap analysis section from the onboarding workflow card" rather than "tokens 2048-2560 of some file." The quality difference in answers is substantial.
04
Entity extraction as a first-class feature
Most document search tools stop at semantic similarity. RepoIntel goes further by extracting typed entities (decisions, dependencies, gaps) and mapping relationships between them. This enables queries like "what blocks the platform migration?" that pure vector search can't answer.
05
GitHub webhooks for live sync
A push to the indexed repo triggers incremental reindexing via webhook. Documents stay in sync without manual intervention. Combined with SHA-based diffing, only changed files are reprocessed, so updates are fast even as the corpus grows.
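The webhook trigger reduces to filtering a push payload. A minimal sketch: the `added`/`modified`/`removed` fields follow GitHub's push-event schema, but the `docsToReindex` helper and the extension list are illustrative assumptions.

```typescript
// Sketch of turning a GitHub push payload into reindex work (illustrative;
// `docsToReindex` is hypothetical, but added/modified/removed follow
// GitHub's push-event schema).
const DOC_EXTS = [".docx", ".xlsx", ".csv", ".md", ".pdf"];

type PushCommit = { added: string[]; modified: string[]; removed: string[] };

function docsToReindex(commits: PushCommit[]): { upsert: string[]; remove: string[] } {
  const isDoc = (p: string) => DOC_EXTS.some((ext) => p.endsWith(ext));
  const upsert = new Set<string>();
  const remove = new Set<string>();
  for (const c of commits) {
    for (const p of [...c.added, ...c.modified]) if (isDoc(p)) upsert.add(p);
    // A later removal wins over an earlier add/modify in the same push.
    for (const p of c.removed) if (isDoc(p)) { remove.add(p); upsert.delete(p); }
  }
  return { upsert: [...upsert], remove: [...remove] };
}
```

Combined with the SHA diff, this keeps a webhook-triggered update proportional to the size of the push, not the size of the corpus.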

From document archaeology to instant answers

Recall replaced the weekly document scavenger hunts that were eating hours of the team's time. Questions that required opening five files and cross-referencing a tracker now got answered in seconds through RAG chat with source citations. New team members stopped needing weeks to ramp up on past decisions because the entire knowledge base was searchable from day one.

When RepoIntel launched as an open-source Railway template, the same architecture became available to any team with a GitHub repo and sensitive documents. The one-click deploy model meant teams could go from "I have a document problem" to "I have a running intelligence platform" in under five minutes, with zero data leaving their infrastructure.

Hours
Saved weekly on document search
Days
To onboard vs. weeks
10s
To answer vs. 20 minutes
0
Data sent externally