@sychou
Created January 27, 2026 01:30
Claude skill for a poor man's semantic search, built for Obsidian vaults
name: semantic-search
description: Search the vault using semantic query expansion and intelligent result ranking. Use when the user asks questions about vault contents, searches for topics, or needs to find related notes.
user_invocable: true

Semantic Search Skill

Search using LLM-powered query expansion and intelligent result synthesis. No vector database required.

Commands

  • /search <query> - Search and answer a question
  • /search add-mapping - Add a new semantic mapping
  • /search config - Review or edit search configuration

Configuration Files

The skill uses two configuration files in System/Search/:

File                        Purpose
System/Search/Config.md     Directory structure, source hints, temporal patterns
System/Search/Semantics.md  Term variations, synonyms, domain-specific mappings

If these files don't exist, the skill will bootstrap by analyzing the directory structure and asking setup questions.
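The check that decides between bootstrapping and a normal search could look roughly like this. It is a minimal sketch, assuming the vault root is passed in as a path; the helper name `missing_config_files` is hypothetical and not part of the skill.

```python
from pathlib import Path

# File locations as named in this skill; the helper itself is illustrative.
CONFIG_FILES = ("System/Search/Config.md", "System/Search/Semantics.md")


def missing_config_files(vault_root: str) -> list[str]:
    """Return the config files (as relative paths) that do not exist yet."""
    root = Path(vault_root)
    return [rel for rel in CONFIG_FILES if not (root / rel).is_file()]
```

An empty result means both files are present and the search flow can start; anything else suggests running the bootstrapping steps first.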


Bootstrapping (First Run)

If System/Search/Config.md or System/Search/Semantics.md does not exist:

Step 1: Analyze Directory Structure

Run these commands to understand the vault:

# List top-level directories
ls -d */

# Count files by directory
find . -name "*.md" -type f | cut -d'/' -f2 | sort | uniq -c | sort -rn | head -20

# Find deepest nested paths to understand structure (depth first, then path)
find . -name "*.md" -type f | awk -F/ '{print NF-1, $0}' | sort -rn | head -50
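Where shell tools are unavailable, the per-directory count can be approximated in Python. This is a sketch only: the function name `count_notes_by_top_dir` is made up for illustration, and like the `find` pipeline it does not skip hidden directories such as `.obsidian`.

```python
from collections import Counter
from pathlib import Path


def count_notes_by_top_dir(vault_root: str) -> list[tuple[str, int]]:
    """Count .md files per top-level directory, largest first (top 20)."""
    root = Path(vault_root)
    counts = Counter()
    for path in root.rglob("*.md"):
        if not path.is_file():
            continue
        parts = path.relative_to(root).parts
        # Notes that sit directly in the vault root are grouped under ".".
        top = parts[0] if len(parts) > 1 else "."
        counts[top] += 1
    return counts.most_common(20)
```

Grouping root-level notes under "." keeps stray filenames out of the directory list.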

Step 2: Ask Setup Questions

Present these questions to the user using AskUserQuestion:

  1. Content types: "What types of content do you store? (e.g., notes, journal, projects, references)"

  2. Time-based content: "Do you have dated entries like a journal or daily notes? If so, where and what format?"

  3. Key topics: "What main topics or domains does this vault cover?"

  4. Important entities: "Are there people, projects, or concepts that go by multiple names?"

  5. Excluded areas: "Are there directories that should be excluded from search? (e.g., templates, archives)"

Step 3: Generate Config Files

Create System/Search/Config.md and System/Search/Semantics.md based on the analysis and answers.


Search Flow

Step 1: Load Configuration

Read both config files:

  • System/Search/Config.md — Directory mappings, temporal patterns
  • System/Search/Semantics.md — Term expansions

If either is missing, run bootstrapping first.

Step 2: Analyze Query

Determine from the query:

Temporal relevance (from Config.md patterns):

  • recent — Sort newest first (--sortr modified)
  • older — Sort oldest first (--sort modified)
  • chronological — Sample across time range
  • none — Sort by relevance
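The four temporal categories map onto ripgrep sort flags in a fairly mechanical way. The sketch below assumes the classification has already been made; the function name `sort_flags` is hypothetical, and handling "chronological" as oldest-first followed by later sampling is an assumption rather than something the skill specifies.

```python
def sort_flags(temporal: str) -> list[str]:
    """Map a temporal classification to ripgrep sort flags."""
    if temporal == "recent":
        return ["--sortr", "modified"]  # newest first
    if temporal in ("older", "chronological"):
        # Oldest first; for "chronological" the caller samples afterwards.
        return ["--sort", "modified"]
    if temporal == "none":
        return []  # no time ordering; rank by match count instead
    raise ValueError(f"unknown temporal category: {temporal!r}")
```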

Source hints (from Config.md mappings):

  • Match query patterns to likely directories
  • Apply as path filter if confident
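One simple way to apply source hints is keyword matching against the query. The sketch below assumes the hints have been read from Config.md into a dictionary of keyword to directories; that structure and the name `path_filters` are illustrative assumptions, since the skill does not fix the format of Config.md.

```python
import re


def path_filters(query: str, source_hints: dict[str, list[str]]) -> list[str]:
    """Return the directories whose hint keyword appears as a word in the query."""
    words = set(re.findall(r"[a-z0-9]+", query.lower()))
    paths: list[str] = []
    for keyword, dirs in source_hints.items():
        if keyword.lower() in words:
            for directory in dirs:
                if directory not in paths:  # keep order, drop duplicates
                    paths.append(directory)
    return paths
```

An empty list means no hint matched, in which case the search would run without a path filter.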

Step 3: Expand Query Terms

Generate 5-15 search terms by combining:

  1. Semantic mappings — Direct substitutions from Semantics.md
  2. Synonyms — Related words the LLM knows
  3. Variations — Compound forms, abbreviations

Do NOT generate:

  • Random typos or case variants (the -i flag already handles case differences)
  • Overly broad terms
  • Terms outside the vault's domain (check Config.md key entities)
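The expansion step can be sketched as merging three term sources, removing duplicates, and capping the list. Everything here is illustrative: `expand_terms` and `to_pattern` are hypothetical helpers, the mappings are assumed to be keyed by lowercase primary term, and in practice the synonyms would come from the LLM rather than a fixed list.

```python
import re


def expand_terms(query_terms, mappings, extra_terms=(), limit=15):
    """Combine query terms, mapped variations, and extra synonyms, deduplicated."""
    candidates = []
    for term in query_terms:
        candidates.append(term)
        candidates.extend(mappings.get(term.lower(), []))
    candidates.extend(extra_terms)

    seen = set()
    expanded = []
    for term in candidates:
        key = term.lower()  # the search is case-insensitive, so dedupe the same way
        if key not in seen:
            seen.add(key)
            expanded.append(term)
    return expanded[:limit]


def to_pattern(terms):
    """Join terms into a regex alternation, escaping regex metacharacters."""
    escaped = [re.sub(r"([\\.+*?()|\[\]{}^$])", r"\\\1", t) for t in terms]
    return "(" + "|".join(escaped) + ")"
```

Escaping matters for terms such as `c++` or `node.js`, which would otherwise be read as regex syntax.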

Step 4: Execute Search

Build ripgrep command:

rg -l -i "(term1|term2|term3)" --type md [--sort modified | --sortr modified] [path_filter]

Exclude paths from Config.md:

rg -l -i "(terms)" --type md --glob '!System/Templates/*' --glob '!.obsidian/*'
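When the command is assembled programmatically, building it as an argument list avoids shell quoting problems. A minimal sketch, assuming the pattern, sort flags, excluded globs, and path filters have already been worked out; `build_rg_command` is a hypothetical helper name.

```python
def build_rg_command(pattern, sort_flags=(), excludes=(), paths=()):
    """Assemble a ripgrep argument list that lists matching Markdown files."""
    cmd = ["rg", "-l", "-i", pattern, "--type", "md"]
    cmd.extend(sort_flags)
    for glob in excludes:
        # A leading "!" turns the glob into an exclusion.
        cmd.extend(["--glob", "!" + glob])
    cmd.extend(paths)
    return cmd
```

The resulting list can be passed to `subprocess.run(cmd, capture_output=True, text=True)` without going through a shell.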

For relevance ranking:

rg -c -i "(terms)" --type md | sort -t: -k2 -nr
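The same ranking can be done on captured output instead of piping through `sort`. The sketch below assumes the `path:count` line format that `rg -c` prints when searching multiple files, and it splits on the last colon so that paths containing a colon still parse; `rank_by_count` is a hypothetical name.

```python
def rank_by_count(rg_count_output: str) -> list[tuple[str, int]]:
    """Parse `rg -c` output lines ("path:count") and sort by count, highest first."""
    ranked = []
    for line in rg_count_output.splitlines():
        path, sep, count = line.rpartition(":")
        if not sep or not count.isdigit():
            continue  # skip blank or unexpected lines
        ranked.append((path, int(count)))
    ranked.sort(key=lambda item: item[1], reverse=True)
    return ranked
```

Note that `-c` counts matching lines per file; if individual matches are a better relevance signal, ripgrep's `--count-matches` flag may be worth trying instead.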

Step 5: Read Results

Query Type     Files to Read  Selection
Factual        3-5            Highest match count
Exploratory    5-10           Mix of relevance and temporal
Chronological  5-10           Earliest + latest + samples
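For the chronological row, "earliest + latest + samples" can be read as picking evenly spaced files from a list sorted oldest first. That reading is an assumption, and `sample_across_time` is a hypothetical helper that shows one way to do it.

```python
def sample_across_time(files_oldest_first, k=5):
    """Pick k files spread evenly over the list, always keeping first and last."""
    n = len(files_oldest_first)
    if n <= k:
        return list(files_oldest_first)
    if k == 1:
        return [files_oldest_first[0]]
    # Evenly spaced positions from index 0 to index n - 1.
    indices = [round(i * (n - 1) / (k - 1)) for i in range(k)]
    return [files_oldest_first[i] for i in indices]
```

For factual queries the selection is simpler: take the first three to five entries of the list ranked by match count.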

Step 6: Synthesize Answer

  1. Answer the question — Use content from files read
  2. Cite sources — Link files with [[filename]]
  3. Acknowledge gaps — If vault doesn't contain answer, say so
  4. Suggest mappings — If useful terms were missing from Semantics.md

Response format:

[Natural language answer]

**Sources:**
- [[File One]] — what it contributed
- [[File Two]] — what it contributed

[Optional: "Consider adding to Semantics.md: `term: variation1, variation2`"]

Adding Mappings

When searches miss expected results:

  1. Open System/Search/Semantics.md
  2. Add under appropriate category
  3. Format: primary term: variation1, variation2, "phrase variation"

The mappings file grows through use. Don't try to be exhaustive upfront.
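A mapping line in the format above can be parsed with a few string operations. This is a sketch under stated assumptions: `parse_mapping` is a hypothetical helper, the primary term is lowercased for lookup, and commas inside quoted phrases are not supported.

```python
def parse_mapping(line: str) -> tuple[str, list[str]]:
    """Parse 'primary term: variation1, variation2, "phrase variation"'."""
    primary, sep, rest = line.partition(":")
    if not sep or not primary.strip():
        raise ValueError(f"not a mapping line: {line!r}")
    # Split on commas, then drop surrounding whitespace and double quotes.
    variations = [v.strip().strip('"') for v in rest.split(",")]
    return primary.strip().lower(), [v for v in variations if v]
```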


Updating Configuration

When vault structure changes:

  1. Run /search config
  2. Review System/Search/Config.md
  3. Update directory mappings, source hints, or excluded paths

Example Session

User: /search How did we handle authentication in the API?

Step 1 - Load Config:
  Read System/Search/Config.md and System/Search/Semantics.md

Step 2 - Analyze:
  Source hints: "API" → docs/, src/
  Temporal: none

Step 3 - Expand:
  Found in Semantics.md: "authentication: auth, jwt, oauth, login, session"
  Terms: [authentication, auth, jwt, oauth, "bearer token", middleware, login]

Step 4 - Search:
  rg -c -i "(authentication|auth|jwt|oauth|bearer token|middleware|login)" --type md docs/ src/ | sort -t: -k2 -nr

Step 5 - Read top 5 hits by match count

Step 6 - Synthesize answer with citations

Edge Cases

No config files: Run bootstrapping flow

No results: Broaden terms, check whether mappings are missing, and retry without the source hint filter

Too many results: Add more specific terms, apply source hint, focus on highest match counts

New domain/topic: Add to Semantics.md and optionally to Config.md key entities

sychou commented Jan 27, 2026

To use, save this file as <Your Vault>/.claude/skills/search/SKILL.md.