@silverkors
Last active January 25, 2026 22:19
Ora CRM & Dossier System Architecture - Comprehensive system documentation with Mermaid diagrams

Ora CRM & Dossier System Architecture

System Overview

The Ora CRM and dossier system is a relationship management platform built on PostgreSQL with the pgvector extension. It combines automatic signal extraction, time-decay relevance scoring, contact lifecycle management, and relationship tracking.

Key Stats:

  • 16 service modules (~5,100 LOC)
  • 10 core database tables
  • 4 signal sources (iMessage, Email, OraChat, Archive)
  • 6 relationship tiers
  • 4 lifecycle states (candidate → provisional → contact → archived)

1. High-Level Architecture

graph TB
    subgraph "External Sources"
        IM[iMessage]
        EM[Email]
        OC[OraChat]
        AC[Archive Candidates]
        OBS[Obsidian Vault]
        MC[macOS Contacts]
        OL[Outlook]
    end
    
    subgraph "Signal Pipeline"
        SE[Signal Extractors]
        ESE[entity_signal_events]
        SS[ScoringService]
        LS[LifecycleService]
    end
    
    subgraph "CRM Services"
        CS[ContactSyncService]
        ES[EntityRelationshipService]
        DS[DeduplicationService]
        INS[InteractionsService]
        CP[ContactProvisioner]
        DDS[DailyDashboardService]
    end
    
    subgraph "Dossier Services"
        DSR[DossierService]
        SC[ScoreCalculator]
        HS[HealthScorer]
        SD[StaleDetector]
        EL[EntityLinker]
        DEN[DossierEnrichment]
    end
    
    subgraph "Database Layer"
        ENT[memory.entities]
        REL[memory.entity_relationships]
        CDO[memory.contact_dossiers]
        DME[memory.dossier_memories]
        DIN[memory.dossier_interactions]
        SCO[memory.relationship_scores]
        INT[memory.interactions]
        OPP[memory.opportunities]
    end
    
    subgraph "API Layer"
        CA[CRM API Routes]
        DA[Dossier API Routes]
    end
    
    subgraph "Frontend"
        CRM["CRM Contacts Page"]
        CDP["CRM Detail Page"]
        DUP["CRM Duplicates Page"]
    end
    
    IM --> SE
    EM --> SE
    OC --> SE
    AC --> SE
    OBS <--> CS
    MC --> CS
    OL --> CS
    
    SE --> ESE
    ESE --> SS
    SS --> ENT
    ENT --> LS
    LS --> CP
    CP --> OBS
    
    CS <--> ENT
    ES <--> REL
    DS --> ENT
    INS --> INT
    DDS --> SCO
    
    DSR <--> CDO
    SC --> DSR
    HS --> SCO
    SD --> DSR
    EL --> DME
    DEN --> CDO
    
    ENT --> CA
    CDO --> DA
    CA --> CRM
    CA --> CDP
    DA --> CDP
    DA --> DUP
    
    style ENT fill:#e1f5ff
    style CDO fill:#e1f5ff
    style ESE fill:#fff4e1
    style OBS fill:#f0e1ff

2. Database Schema

erDiagram
    entities ||--o{ entity_relationships : "relates to"
    entities ||--o{ entity_signal_events : "generates"
    entities ||--|| contact_dossiers : "has profile"
    entities ||--o{ dossier_memories : "linked to"
    entities ||--o{ dossier_interactions : "has"
    entities ||--o{ interactions : "logged in"
    entities ||--o{ opportunities : "associated with"
    entities ||--o{ relationship_scores : "scored"
    
    contact_dossiers ||--o{ dossier_memories : "contains"
    contact_dossiers ||--o{ dossier_interactions : "includes"
    
    memories ||--o{ dossier_memories : "linked from"
    
    entities {
        text id PK
        text name
        text entity_type
        jsonb contact_info
        jsonb crm_data
        text status "candidate|provisional|contact|archived"
        text relationship_tier
        decimal relevance_score
        timestamp last_signal_at
        integer total_signals
        timestamp promoted_at
        timestamp archived_at
        text obsidian_path
        jsonb external_sources
        timestamp last_contact_at
        timestamp next_follow_up_at
        text merged_into
    }
    
    entity_relationships {
        text entity_id FK
        text related_entity_id FK
        text relationship_type
        timestamp since
        jsonb metadata
    }
    
    entity_signal_events {
        text id PK
        text entity_id FK
        text signal_type
        text signal_source
        text dedupe_key UK
        timestamp occurred_at
        decimal weight
        decimal quality
        decimal source_factor
        decimal score_contribution
        text source_message_id
        text source_thread_id
        text source_candidate_id
    }
    
    contact_dossiers {
        text entity_id FK
        text display_name
        text title
        text organization
        text location
        text timezone
        text relationship_tier
        integer health_score
        timestamp last_interaction_at
        timestamp first_seen_at
        timestamp last_updated_at
        integer memory_count
        integer interaction_count_90d
        text notes
        text[] tags
        boolean pending_followup
        timestamp last_followup_suggested_at
        timestamp followup_cooldown_until
    }
    
    dossier_memories {
        text dossier_id FK
        text memory_id FK
        decimal relevance_score
        text mention_context
    }
    
    dossier_interactions {
        text dossier_id FK
        text interaction_type
        text source
        text source_id
        timestamp interaction_date
        text summary
        jsonb metadata
    }
    
    relationship_scores {
        text entity_id FK
        integer health_score
        decimal recency_score
        decimal frequency_score
        decimal sentiment_score
        integer interaction_count_7d
        integer interaction_count_30d
        integer interaction_count_90d
        timestamp last_interaction_at
        timestamp last_calculated_at
    }
    
    interactions {
        uuid id PK
        text entity_id FK
        text interaction_type
        text title
        text notes
        text location
        integer duration_minutes
        timestamp interaction_at
        jsonb metadata
    }
    
    opportunities {
        uuid id PK
        text entity_id FK
        text stage
        decimal value
        text currency
        integer probability
        timestamp expected_close_date
        timestamp actual_close_date
    }

3. Signal Extraction Pipeline

flowchart TD
    Start([Signal Pipeline Start]) --> Extract[Signal Extractors]
    
    subgraph "Phase 1: Extract"
        Extract --> IMX[iMessageSignalExtractor]
        Extract --> EMX[EmailSignalExtractor]
        Extract --> OCX[OraChatSignalExtractor]
        Extract --> ACX[ArchiveCandidateSignalExtractor]
    end
    
    IMX --> Dedupe[Deduplication Check]
    EMX --> Dedupe
    OCX --> Dedupe
    ACX --> Dedupe
    
    Dedupe -->|Unique| Insert[Insert into entity_signal_events]
    Dedupe -->|Duplicate| Skip[(Skip Duplicate)]
    
    Insert --> Phase2{Phase 2: Score}
    
    Phase2 -->|Enabled| Score[ScoringService]
    Phase2 -->|Skipped| Phase3{Phase 3: Promote}
    
    Score --> Calc[Calculate Decay Scores]
    Calc --> UpdateEnt[Update entities.relevance_score]
    UpdateEnt --> Phase3
    
    Phase3 -->|Enabled| Promote[LifecycleService]
    Phase3 -->|Skipped| Phase4{Phase 4: Interactions}
    
    Promote --> Check{Score Thresholds}
    Check -->|>= 50, >=10 signals| Contact[Set state: contact]
    Check -->|>= 15, >=3 signals| Provisional[Set state: provisional]
    Check -->|< 10, >90d| Archived[Set state: archived]
    Check -->|Default| Candidate[Set state: candidate]
    
    Contact --> Provisioning[ContactProvisioner]
    Provisioning --> CreateFile[Create Obsidian File]
    CreateFile --> Phase4
    
    Provisional --> Phase4
    Archived --> Phase4
    Candidate --> Phase4
    
    Phase4 -->|Enabled| Interact[InteractionsService]
    Phase4 -->|Skipped| End([Pipeline Complete])
    
    Interact --> CreateRec[Create interaction records]
    CreateRec --> End
    
    style Insert fill:#e1f5ff
    style UpdateEnt fill:#fff4e1
    style Contact fill:#e8f5e9
    style CreateFile fill:#f0e1ff
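
The deduplication check in Phase 1 can be pictured as a unique-key test before each insert. The sketch below is illustrative only: the `dedupeKey` recipe, the `SignalStore` class, and the source string literals are assumptions, since the document only states that `entity_signal_events.dedupe_key` is unique. An in-memory `Set` stands in for the database constraint.

```typescript
// Hypothetical shape; field names mirror entity_signal_events, but the
// dedupe-key recipe below is an assumption, not the documented scheme.
interface ExtractedSignal {
  entityId: string;
  signalType: string;
  signalSource: "imessage" | "email" | "orachat" | "archive";
  sourceMessageId: string;
  occurredAt: Date;
}

// Assumed key recipe: source + message id + signal type + entity.
function dedupeKey(s: ExtractedSignal): string {
  return [s.signalSource, s.sourceMessageId, s.signalType, s.entityId].join(":");
}

// A Set stands in for the UNIQUE constraint on dedupe_key.
class SignalStore {
  private readonly seen = new Set<string>();
  private readonly rows: ExtractedSignal[] = [];

  // Returns true when the signal was inserted, false when it was skipped
  // as a duplicate (the "Skip Duplicate" branch in the diagram).
  insert(signal: ExtractedSignal): boolean {
    const key = dedupeKey(signal);
    if (this.seen.has(key)) return false;
    this.seen.add(key);
    this.rows.push(signal);
    return true;
  }

  get count(): number {
    return this.rows.length;
  }
}
```

Re-running an extractor over the same message then becomes a no-op: the second `insert` call for an identical signal returns `false` and the row count stays unchanged.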

4. Time-Decay Scoring Algorithm

flowchart LR
    subgraph "Signal Input"
        S1[Signal Event]
        S2[occurred_at]
        S3[weight]
        S4[quality]
        S5[source_factor]
    end
    
    subgraph "Calculation"
        A1[Calculate age in days]
        A2[Apply decay formula]
        A3[Multiply by weights]
        A4[Sum all signals]
    end
    
    subgraph "Output"
        O1[relevance_score]
        O2[last_signal_at]
        O3[total_signals]
    end
    
    S1 --> A1
    S2 --> A1
    A1 -->|age = now - occurred_at| A2
    S3 --> A3
    S4 --> A3
    S5 --> A3
    A2 -->|decay = e^-0.0154 x age| A3
    A3 -->|contribution = weight x quality x source_factor x decay| A4
    A4 --> O1
    S2 -->|max| O2
    A4 -->|count| O3
    
    style A2 fill:#fff4e1
    style A4 fill:#e1f5ff

Formula:

// For each signal that occurred within the last 180 days:
const ageInDays = (now - signal.occurredAt) / (24 * 60 * 60 * 1000); // both timestamps in milliseconds
const decayFactor = Math.exp(-0.0154 * ageInDays); // 0.0154 ≈ ln(2) / 45, i.e. a 45-day half-life
const contribution = Math.max(
  signal.weight * signal.quality * signal.sourceFactor * decayFactor,
  0.001 // MIN_SCORE_CONTRIBUTION
);
totalScore += contribution;
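
To make the formula concrete, here is a minimal self-contained sketch that sums contributions over a list of signals. The `Signal` interface and the `relevanceScore` function name are hypothetical; the constants (decay rate 0.0154, 180-day window, 0.001 floor) come from the formula above. How the real ScoringService treats signals outside the window is not documented, so skipping them here is an assumption.

```typescript
// Hypothetical input shape; timestamps are epoch milliseconds.
interface Signal {
  occurredAt: number;
  weight: number;
  quality: number;
  sourceFactor: number;
}

const MS_PER_DAY = 24 * 60 * 60 * 1000;
const DECAY_RATE = 0.0154; // ≈ ln(2) / 45, i.e. a 45-day half-life
const WINDOW_DAYS = 180;
const MIN_SCORE_CONTRIBUTION = 0.001;

// Sums the decayed contribution of every signal inside the 180-day window.
function relevanceScore(signals: Signal[], now: number): number {
  let total = 0;
  for (const s of signals) {
    const ageInDays = (now - s.occurredAt) / MS_PER_DAY;
    if (ageInDays > WINDOW_DAYS) continue; // assumption: older signals are ignored
    const decay = Math.exp(-DECAY_RATE * ageInDays);
    total += Math.max(
      s.weight * s.quality * s.sourceFactor * decay,
      MIN_SCORE_CONTRIBUTION
    );
  }
  return total;
}
```

With these constants, a signal of weight 10 (quality and source factor both 1) contributes 10 on the day it occurs and roughly 5 after 45 days, which is what a 45-day half-life means in practice.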

5. Contact Lifecycle State Machine

stateDiagram-v2
    [*] --> Candidate: Initial Detection
    Candidate --> Provisional: score >= 15 AND signals >= 3
    Provisional --> Contact: score >= 50 AND signals >= 10
    Contact --> Archived: score < 10 AND no signals for > 90d
    Provisional --> Archived: score < 10 AND no signals for > 90d
    Candidate --> Archived: score < 10 AND no signals for > 90d
    
    Archived --> Candidate: Manual Restore
    
    Contact --> Provisional: Score drops
    Provisional --> Candidate: Score drops
    
    note right of Candidate
        Score: < 15
        Signals: < 3
        No Obsidian file
    end note
    
    note right of Provisional
        Score: 15-49
        Signals: 3-9
        No Obsidian file
    end note
    
    note right of Contact
        Score: >= 50
        Signals: >= 10
        Has Obsidian file
        Fully enriched dossier
    end note
    
    note right of Archived
        Score: < 10
        No signals for > 90 days
        Preserved in database
    end note
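
The thresholds in the state diagram can be read as a small pure function. This is a sketch under stated assumptions: `EntityStats` and `resolveState` are hypothetical names, the function is stateless (it derives the state from current numbers only), and it does not model the manual restore from archived. The thresholds themselves are the ones shown in the diagram.

```typescript
type LifecycleState = "candidate" | "provisional" | "contact" | "archived";

// Hypothetical snapshot of the inputs the lifecycle rules depend on.
interface EntityStats {
  relevanceScore: number;
  totalSignals: number;
  daysSinceLastSignal: number;
}

// Maps current stats to a lifecycle state using the documented thresholds.
function resolveState(e: EntityStats): LifecycleState {
  // Archive: low score and no signals for more than 90 days.
  if (e.relevanceScore < 10 && e.daysSinceLastSignal > 90) return "archived";
  // Contact: score >= 50 AND at least 10 signals.
  if (e.relevanceScore >= 50 && e.totalSignals >= 10) return "contact";
  // Provisional: score >= 15 AND at least 3 signals.
  if (e.relevanceScore >= 15 && e.totalSignals >= 3) return "provisional";
  // Everything else stays (or falls back to) candidate.
  return "candidate";
}
```

Note that both conditions must hold for a promotion: an entity with a score of 60 but only 5 signals resolves to provisional, not contact.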

6. Dossier Enrichment Flow

flowchart TD
    Start([Entity Update Triggered]) --> Fetch[Fetch entity data]
    
    Fetch --> Enrich[DossierEnrichment Service]
    
    Enrich --> Extract[Extract Profile Info]
    Enrich --> Link[Link Memories]
    Enrich --> AddInt[Add Interactions]
    Enrich --> Calc[Calculate Health Score]
    
    Extract --> P1[Parse title from memories]
    Extract --> P2[Parse organization]
    Extract --> P3[Infer timezone]
    
    Link --> L1[Vector search memories]
    L1 --> L2[Score by relevance]
    L2 --> L3[Top 20 memories]
    
    AddInt --> I1[Aggregate interactions]
    I1 --> I2[Group by type]
    I2 --> I3[Last 90 days]
    
    Calc --> H1[Recency Score: 40%]
    Calc --> H2[Frequency Score: 30%]
    Calc --> H3[Sentiment Score: 30%]
    
    P1 --> Merge[Merge into dossier]
    P2 --> Merge
    P3 --> Merge
    L3 --> Merge
    I3 --> Merge
    H1 --> Merge
    H2 --> Merge
    H3 --> Merge
    
    Merge --> Update[Update contact_dossiers]
    Update --> Cache[Invalidate 5-min cache]
    Cache --> End([Dossier Ready])
    
    style Enrich fill:#e1f5ff
    style Merge fill:#fff4e1
    style Update fill:#e8f5e9
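
The health score step combines three sub-scores with the 40/30/30 weights shown in the diagram. The sketch below assumes each sub-score is already normalized to a 0-100 range and that the result is rounded to an integer (the schema stores `health_score` as an integer); the normalization, the clamping, and the names `HealthInputs` and `healthScore` are assumptions rather than documented behavior.

```typescript
// Hypothetical inputs; each sub-score is assumed to be on a 0-100 scale.
interface HealthInputs {
  recency: number;
  frequency: number;
  sentiment: number;
}

const WEIGHTS = { recency: 0.4, frequency: 0.3, sentiment: 0.3 } as const;

// Clamps a value into the assumed 0-100 range.
function clamp(value: number): number {
  return Math.min(100, Math.max(0, value));
}

// Weighted sum of the three sub-scores, rounded to an integer.
function healthScore(inputs: HealthInputs): number {
  const weighted =
    WEIGHTS.recency * clamp(inputs.recency) +
    WEIGHTS.frequency * clamp(inputs.frequency) +
    WEIGHTS.sentiment * clamp(inputs.sentiment);
  return Math.round(weighted);
}
```

For example, sub-scores of 80 (recency), 60 (frequency), and 40 (sentiment) give 32 + 18 + 12 = 62.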

7. API Structure

flowchart TD
    subgraph "CRM API Routes"
        root["CRM Contacts API"]
        
        root --> GET_LIST[GET /]
        root --> POST_CREATE[POST /]
        
        root_detail["CRM Contact ID API"]
        root_detail --> GET_ONE[GET]
        root_detail --> PUT_UPDATE[PUT]
        root_detail --> DELETE_SOFT[DELETE]
        
        root_timeline["Contact Timeline API"]
        root_timeline --> GET_TIME[GET - Unified Timeline]
        
        root_rel["Contact Relationships API"]
        root_rel --> GET_REL[GET - Relationships]
        root_rel --> POST_REL[POST - Add Relationship]
        
        root_content["Contact Content API"]
        root_content --> GET_CONT[GET - Linked Content]
        
        root_merge["Contact Merge API"]
        root_merge --> POST_MERGE[POST - Merge Duplicate]
        
        root_dup["Duplicates API"]
        root_dup --> GET_DUP[GET - Find Duplicates]
    end
    
    subgraph "Response Types"
        GET_LIST --> RL["contacts array with total"]
        POST_CREATE --> RC["contact object"]
        GET_ONE --> RO["contact with counts"]
        GET_TIME --> RT["timeline array"]
        GET_REL --> RR["relationships or graph"]
        GET_CONT --> RCO["memories and content"]
        POST_MERGE --> RM["merged and archived"]
        GET_DUP --> RD["duplicate groups or stats"]
    end
    
    subgraph "Query Parameters"
        QL[Filters]
        QL --> Q1["type, scope, sensitivity, status"]
        QL --> Q2["relationshipTier, sortBy, sortOrder"]
        QL --> Q3["limit, offset, q search"]
        
        GET_LIST -.-> QL
        GET_DUP -.-> QL
    end
    
    style root fill:#e1f5ff
    style root_detail fill:#e1f5ff
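
As an illustration of how the list endpoint's query parameters fit together, the following sketch builds a request URL from a typed filter object. The parameter names come from the diagram; the route prefix `/api/crm/contacts`, the `asc`/`desc` values for `sortOrder`, and the function name are hypothetical, since the document does not give the actual paths or allowed values.

```typescript
// Parameter names follow the "Query Parameters" subgraph; value types are assumed.
interface ContactListQuery {
  type?: string;
  scope?: string;
  sensitivity?: string;
  status?: "candidate" | "provisional" | "contact" | "archived";
  relationshipTier?: string;
  sortBy?: string;
  sortOrder?: "asc" | "desc";
  limit?: number;
  offset?: number;
  q?: string;
}

// Hypothetical mount point; the real route prefix is not documented.
const CONTACTS_PATH = "/api/crm/contacts";

// Serializes the defined filters into a URL-encoded query string.
function buildContactListUrl(query: ContactListQuery): string {
  const pairs: string[] = [];
  for (const key of Object.keys(query) as (keyof ContactListQuery)[]) {
    const value = query[key];
    if (value === undefined || value === "") continue; // skip unset filters
    pairs.push(`${encodeURIComponent(key)}=${encodeURIComponent(String(value))}`);
  }
  return pairs.length > 0 ? `${CONTACTS_PATH}?${pairs.join("&")}` : CONTACTS_PATH;
}
```

Calling `buildContactListUrl({ status: "contact", limit: 20, q: "ada lovelace" })` yields `/api/crm/contacts?status=contact&limit=20&q=ada%20lovelace` under the assumed prefix.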

8. Obsidian Sync Flow

sequenceDiagram
    participant User as User/CLI
    participant CS as ContactSyncService
    participant Scan as ContactScanner
    participant Vault as Obsidian Vault
    participant DB as Database
    
    User->>CS: sync(direction)
    
    opt Import Mode
        CS->>Scan: scanVault()
        Scan->>Vault: Read contact files
        Vault-->>Scan: File metadata + content
        Scan-->>CS: Changed files since checkpoint
        
        loop For each changed file
            CS->>CS: parseContactFile()
            CS->>DB: Upsert entity
            CS->>DB: Import relationships
        end
        
        CS->>DB: Update checkpoint
    end
    
    opt Export Mode
        CS->>DB: Query entities with obsidian_path
        DB-->>CS: Entities to export
        
        loop For each entity
            CS->>CS: buildContactFile()
            CS->>Vault: Write/overwrite file
        end
    end
    
    opt Bidirectional Mode
        CS->>CS: Import first
        CS->>CS: Resolve conflicts
        CS->>CS: Export changes
    end
    
    CS-->>User: Sync complete
    
    Note over CS,Vault: Conflict Resolution: OBSIDIAN_WINS or DATABASE_WINS or MANUAL
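
The three conflict-resolution strategies named in the note can be sketched as a dispatch over the two competing versions of a contact. Only the strategy names come from the document; the `ContactVersion` and `Resolution` types, the action names, and the `resolveConflict` function are hypothetical and exist purely to illustrate the idea.

```typescript
type ConflictStrategy = "OBSIDIAN_WINS" | "DATABASE_WINS" | "MANUAL";

// Hypothetical minimal view of one side of a conflicting contact.
interface ContactVersion {
  name: string;
  updatedAt: number; // epoch milliseconds
}

// What the sync should do next; action names are illustrative only.
type Resolution =
  | { action: "write_database"; value: ContactVersion }
  | { action: "write_vault"; value: ContactVersion }
  | { action: "queue_for_review"; vault: ContactVersion; database: ContactVersion };

function resolveConflict(
  vault: ContactVersion,
  database: ContactVersion,
  strategy: ConflictStrategy
): Resolution {
  switch (strategy) {
    case "OBSIDIAN_WINS":
      // The vault copy overwrites the database row.
      return { action: "write_database", value: vault };
    case "DATABASE_WINS":
      // The database row overwrites the vault file.
      return { action: "write_vault", value: database };
    case "MANUAL":
      // Neither side is overwritten; both versions are kept for a person to decide.
      return { action: "queue_for_review", vault, database };
  }
}
```

Modelling the outcome as a tagged union keeps the bidirectional mode explicit: the import step produces resolutions, and the export step only writes files for `write_vault` results.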

Summary

The Ora CRM & Dossier system is a comprehensive relationship management platform that:

  1. Automatically detects contacts from four signal sources (iMessage, Email, OraChat, Archive)
  2. Scores relevance using time-decay algorithms (45-day half-life)
  3. Manages lifecycle from candidate → provisional → contact → archived
  4. Syncs bidirectionally with Obsidian as source of truth
  5. Detects duplicates using multi-strategy confidence scoring
  6. Enriches dossiers with linked memories, interactions, and health scores
  7. Provides nurturing dashboards for relationship maintenance
  8. Tracks relationships with graph-based connection mapping

Architecture Highlights:

  • Pipeline-based signal processing with 4 phases
  • Time-decay scoring prevents stale contacts from dominating
  • Obsidian-first design with automatic export/promotion
  • Comprehensive duplicate detection with manual review
  • Health scoring based on recency, frequency, and sentiment
  • RESTful API with filtering, sorting, and pagination
  • React-based frontend with real-time updates

  • Total Lines of Code: ~5,100 across 16 service modules
  • Database Tables: 10 core tables with complex relationships
  • API Endpoints: 12 routes covering full CRUD operations
