grabit is a self-contained Go binary that:
- Indexes a local folder or shallow-cloned GitHub repo
- Chunks source files into manageable blocks
- Embeds those blocks using OpenAI embeddings
- Stores an on-disk index (`.grabit/`)
- Lets you run semantic search or ask natural-language questions against the repo
- Answers questions using OpenAI’s Responses API with Retrieval-Augmented Generation (RAG)
This gives you a local “chat with your codebase” workflow that’s portable, fast, and easy to run.
- Go 1.22+
- `git` (only needed when using `--repo` mode for a shallow clone)
- Environment variable: `OPENAI_API_KEY` (required)
- Optional: `GITHUB_TOKEN` (raises rate limits if you extend grabit to call the GitHub API)
- Responses / Q&A: `gpt-4o` (balanced) or `o3-mini` (stronger reasoning, higher cost)
- Embeddings: `text-embedding-3-large` (best recall, 3072-dim) or `text-embedding-3-small` (cheaper, 1536-dim, optional)
```bash
grabit index --path .             # Index current repo
grabit index --repo https://github.com/org/repo
grabit search "rate limiter"      # Semantic search
grabit ask "How does auth work?"  # Ask with RAG
grabit map                        # Show file types & index stats
```

Walk repo:
- Include extensions: `.go`, `.rb`, `.py`, `.ts`, `.js`, `.java`, `.c`, `.cpp`, `.rs`, `.sql`, `.yaml`, `.json`, etc.
- Exclude dirs: `.git`, `node_modules`, `dist`, `build`, `venv`, etc.
- Skip files larger than 5 MB (see the sketch below)
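A minimal sketch of how this walk-and-filter step could look in Go. The extension set, excluded directories, and 5 MB cap mirror the defaults above; the package, function, and variable names are illustrative, not grabit's actual internals:

```go
package indexer

import (
	"io/fs"
	"path/filepath"
)

// Illustrative defaults; grabit's real lists may differ.
var includeExt = map[string]bool{
	".go": true, ".rb": true, ".py": true, ".ts": true, ".js": true,
	".java": true, ".c": true, ".cpp": true, ".rs": true,
	".sql": true, ".yaml": true, ".json": true,
}

var excludeDirs = map[string]bool{
	".git": true, "node_modules": true, "dist": true, "build": true, "venv": true,
}

const maxFileSize = 5 << 20 // skip files larger than 5 MB

// collectFiles walks root and returns the paths that pass all filters.
func collectFiles(root string) ([]string, error) {
	var files []string
	err := filepath.WalkDir(root, func(path string, d fs.DirEntry, err error) error {
		if err != nil {
			return err
		}
		if d.IsDir() {
			if excludeDirs[d.Name()] {
				return filepath.SkipDir // prune excluded directories entirely
			}
			return nil
		}
		if !includeExt[filepath.Ext(path)] {
			return nil
		}
		info, err := d.Info()
		if err != nil || info.Size() > maxFileSize {
			return nil // skip unreadable or oversized files
		}
		files = append(files, path)
		return nil
	})
	return files, err
}
```

Each file that survives these filters is then split into chunks: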
- ~700 bytes per chunk
- ~120-byte overlap between adjacent chunks
- Tracks file + line ranges for citations
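A sketch of byte-window chunking with overlap and line tracking, using the ~700-byte / ~120-byte figures above. The `Chunk` shape and the splitting strategy are assumptions for illustration; the real implementation may split on line or token boundaries instead:

```go
package indexer

import "strings"

// Chunk is one indexed block of a source file (illustrative shape).
type Chunk struct {
	File      string `json:"file"`
	StartLine int    `json:"start_line"`
	EndLine   int    `json:"end_line"`
	Text      string `json:"text"`
}

const (
	chunkSize    = 700 // ~700 bytes per chunk
	chunkOverlap = 120 // ~120 bytes shared with the previous chunk
)

// chunkFile splits content into overlapping byte windows and records
// the 1-based line range each window covers, for later citations.
func chunkFile(path, content string) []Chunk {
	var chunks []Chunk
	step := chunkSize - chunkOverlap
	for start := 0; start < len(content); start += step {
		end := start + chunkSize
		if end > len(content) {
			end = len(content)
		}
		text := content[start:end]
		startLine := 1 + strings.Count(content[:start], "\n")
		endLine := startLine + strings.Count(text, "\n")
		chunks = append(chunks, Chunk{
			File:      path,
			StartLine: startLine,
			EndLine:   endLine,
			Text:      text,
		})
		if end == len(content) {
			break
		}
	}
	return chunks
}
```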
- Use `text-embedding-3-large`
- Batch API calls (default 64 per batch)
- Save vectors alongside text chunks
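A hedged sketch of one batched embedding call using only `net/http` against OpenAI's public `/v1/embeddings` endpoint. The request and response fields are the documented API shape; the helper name and batching around it are illustrative:

```go
package indexer

import (
	"bytes"
	"context"
	"encoding/json"
	"fmt"
	"net/http"
	"os"
)

const embedBatchSize = 64 // default chunks per API call

type embeddingRequest struct {
	Model string   `json:"model"`
	Input []string `json:"input"`
}

type embeddingResponse struct {
	Data []struct {
		Embedding []float32 `json:"embedding"`
	} `json:"data"`
}

// embedBatch sends up to embedBatchSize texts to the embeddings API
// and returns one vector per input, in order.
func embedBatch(ctx context.Context, texts []string) ([][]float32, error) {
	body, err := json.Marshal(embeddingRequest{
		Model: "text-embedding-3-large",
		Input: texts,
	})
	if err != nil {
		return nil, err
	}
	req, err := http.NewRequestWithContext(ctx, http.MethodPost,
		"https://api.openai.com/v1/embeddings", bytes.NewReader(body))
	if err != nil {
		return nil, err
	}
	req.Header.Set("Authorization", "Bearer "+os.Getenv("OPENAI_API_KEY"))
	req.Header.Set("Content-Type", "application/json")

	resp, err := http.DefaultClient.Do(req)
	if err != nil {
		return nil, err
	}
	defer resp.Body.Close()
	if resp.StatusCode != http.StatusOK {
		return nil, fmt.Errorf("embeddings API returned %s", resp.Status)
	}

	var parsed embeddingResponse
	if err := json.NewDecoder(resp.Body).Decode(&parsed); err != nil {
		return nil, err
	}
	vectors := make([][]float32, 0, len(parsed.Data))
	for _, d := range parsed.Data {
		vectors = append(vectors, d.Embedding)
	}
	return vectors, nil
}
```

The indexer would loop over the chunks in slices of `embedBatchSize`, calling `embedBatch` once per slice and attaching each returned vector to its chunk.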
- `.grabit/index.jsonl` → newline-delimited JSON (chunks + vectors)
- `.grabit/meta.json` → model + repo metadata
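One way the on-disk layout could look, assuming one JSON object per line containing a chunk plus its vector; the field names here are an assumed schema, not necessarily grabit's exact one:

```go
package indexer

import (
	"bufio"
	"encoding/json"
	"os"
)

// indexRecord is one line in .grabit/index.jsonl (illustrative schema).
type indexRecord struct {
	File      string    `json:"file"`
	StartLine int       `json:"start_line"`
	EndLine   int       `json:"end_line"`
	Text      string    `json:"text"`
	Vector    []float32 `json:"vector"`
}

// writeIndex writes each record as one JSON line to the index file.
func writeIndex(path string, records []indexRecord) error {
	f, err := os.Create(path)
	if err != nil {
		return err
	}
	defer f.Close()

	w := bufio.NewWriter(f)
	enc := json.NewEncoder(w) // Encode appends a newline after every value
	for _, r := range records {
		if err := enc.Encode(r); err != nil {
			return err
		}
	}
	return w.Flush()
}
```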
- Query → embed → cosine similarity over chunks
- Top-K (default 12) returned
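Retrieval can be a brute-force cosine-similarity scan over the stored vectors. A sketch, reusing the `indexRecord` type from the storage example above (names illustrative):

```go
package indexer

import (
	"math"
	"sort"
)

// cosine returns the cosine similarity between two equal-length vectors.
func cosine(a, b []float32) float64 {
	var dot, na, nb float64
	for i := range a {
		dot += float64(a[i]) * float64(b[i])
		na += float64(a[i]) * float64(a[i])
		nb += float64(b[i]) * float64(b[i])
	}
	if na == 0 || nb == 0 {
		return 0
	}
	return dot / (math.Sqrt(na) * math.Sqrt(nb))
}

// topK scores every record against the query vector and returns the
// k best matches (default k = 12), highest similarity first.
func topK(query []float32, records []indexRecord, k int) []indexRecord {
	type scored struct {
		rec   indexRecord
		score float64
	}
	scoredRecs := make([]scored, 0, len(records))
	for _, r := range records {
		scoredRecs = append(scoredRecs, scored{rec: r, score: cosine(query, r.Vector)})
	}
	sort.Slice(scoredRecs, func(i, j int) bool {
		return scoredRecs[i].score > scoredRecs[j].score
	})
	if k > len(scoredRecs) {
		k = len(scoredRecs)
	}
	out := make([]indexRecord, 0, k)
	for _, s := range scoredRecs[:k] {
		out = append(out, s.rec)
	}
	return out
}
```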
- Include the selected chunks in the context, each prefixed with `FILE: path LINES: start-end` and followed by the snippet text
- Guard against the model’s max token budget
- Instruction: “Answer strictly from context; if not present, say so”
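A sketch of how the context block could be assembled under a rough token budget. The 4-characters-per-token heuristic is an assumption; the citation header follows the `FILE`/`LINES` convention above:

```go
package indexer

import (
	"fmt"
	"strings"
)

const systemInstruction = "Answer strictly from the provided context; " +
	"if the answer is not present, say so."

// buildContext concatenates FILE/LINES-prefixed snippets until adding
// another chunk would exceed a rough character budget (~4 chars per token).
func buildContext(chunks []indexRecord, maxTokens int) string {
	budget := maxTokens * 4
	var b strings.Builder
	for _, c := range chunks {
		snippet := fmt.Sprintf("FILE: %s LINES: %d-%d\n%s\n\n",
			c.File, c.StartLine, c.EndLine, c.Text)
		if b.Len()+len(snippet) > budget {
			break // stay under the token budget
		}
		b.WriteString(snippet)
	}
	return b.String()
}
```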
- Responses API (`/v1/responses`)
- Model = `gpt-4o` (default, overridable via `--model`)
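A minimal sketch of the `/v1/responses` call with plain `net/http`, assuming the simple string `input` form and the `instructions` field, and reusing `systemInstruction` and `buildContext` from the prompting sketch. Parsing pulls the answer text out of the `output` array; error handling and streaming are omitted:

```go
package indexer

import (
	"bytes"
	"context"
	"encoding/json"
	"net/http"
	"os"
	"strings"
)

type responsesRequest struct {
	Model        string `json:"model"`
	Instructions string `json:"instructions"`
	Input        string `json:"input"`
}

type responsesResponse struct {
	Output []struct {
		Content []struct {
			Type string `json:"type"`
			Text string `json:"text"`
		} `json:"content"`
	} `json:"output"`
}

// askModel sends the question plus retrieved context to /v1/responses and
// returns the concatenated text of the model's output.
func askModel(ctx context.Context, model, question, contextBlock string) (string, error) {
	body, err := json.Marshal(responsesRequest{
		Model:        model, // gpt-4o by default, overridable via --model
		Instructions: systemInstruction,
		Input:        "Context:\n" + contextBlock + "\nQuestion: " + question,
	})
	if err != nil {
		return "", err
	}
	req, err := http.NewRequestWithContext(ctx, http.MethodPost,
		"https://api.openai.com/v1/responses", bytes.NewReader(body))
	if err != nil {
		return "", err
	}
	req.Header.Set("Authorization", "Bearer "+os.Getenv("OPENAI_API_KEY"))
	req.Header.Set("Content-Type", "application/json")

	resp, err := http.DefaultClient.Do(req)
	if err != nil {
		return "", err
	}
	defer resp.Body.Close()

	var parsed responsesResponse
	if err := json.NewDecoder(resp.Body).Decode(&parsed); err != nil {
		return "", err
	}
	var sb strings.Builder
	for _, item := range parsed.Output {
		for _, c := range item.Content {
			if c.Type == "output_text" {
				sb.WriteString(c.Text)
			}
		}
	}
	return sb.String(), nil
}
```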
- Parallel embedding worker pool
- Smarter chunking (AST-based for Go/Ruby)
- Hybrid retrieval (keyword + embedding)
- SQLite with `sqlite-vec` for faster search
- Citations in answers (file + line refs)
- Streaming responses for better UX
- RAG eval harness (measure groundedness)
- User installs the binary
- Exports the API key: `export OPENAI_API_KEY=sk-...`
- Runs `grabit index --path .`
- Runs `grabit ask "Where are rate limits enforced?"`
- Gets an answer plus context citations
- Designed to be portable: single binary, no extra DB
- OpenAI key is the only dependency
- Extensible: can later plug into FAISS, pgvector, or SQLite
- After indexing, everything runs locally except the OpenAI calls for query embedding and answer generation