Expo SQLite Storage And Cache Plan

Goal

Introduce an expo-sqlite data layer so analysis results survive app restarts, manual/photo cleanup workflows are resumable, and ML indexing work is reused safely when assets or model version change.

Scope Anchors In Current Code

Main analysis flow: app/index.tsx
Selection/cleanup flow: app/pickone.tsx
Global in-memory state: contexts/PhotoGroupsContext.tsx
Embedding pipeline: modules/ImageEmbedding.ts
Similarity/grouping logic: modules/PhotoSimilarity.ts

Data Model (What To Persist)

1) `app_state` (small key-value config/state)

Store:

schema_version
ml_model_version (manual bump when embedding model changes)
last_scan_started_at, last_scan_finished_at
last_cursor_asset_id (resume large scans)
User preferences: similarity threshold/window size/batch size

2) `assets` (normalized photo index)

Store per photo asset:

asset_id (PK, from MediaLibrary identifier)
uri
width, height, creation_time, modification_time
source (library | manual)
is_hidden (soft exclusion from UI)
is_in_pickone_album (cached boolean for cleanup filtering)
updated_at

Purpose: canonical local index and dedupe point for all workflows.

3) `embeddings` (ML index cache)

Store:

asset_id (FK -> assets)
model_version
vector_blob (Float32Array packed as BLOB)
dimensions
computed_at
asset_fingerprint (e.g., modification_time + width + height hash)

Purpose: recompute only when asset fingerprint or model version changes.

4) `analysis_runs` (scan jobs/session)

Store:

run_id (PK)
status (running | completed | cancelled | failed)
started_at, finished_at
params_json (threshold/window/batch)
processed_count, error_count

Purpose: recover/resume long-running scans and provide diagnostics.

5) `photo_groups` (grouping output)

Store:

group_id (PK)
run_id (FK)
group_type (ml_similar | manual_selection)
average_similarity
cover_asset_id
status (new | in_review | reviewed | archived)
created_at, updated_at

6) `group_items` (membership + decision)

Store:

group_id (FK)
asset_id (FK)
rank
decision (undecided | keep | delete)
decided_at

Purpose: persist selection results from pairwise flow.

7) `cleanup_actions` (what user did to clean)

Store:

action_id (PK)
group_id
asset_id
action_type (add_to_pickone_album | restore | skip)
action_status (pending | done | failed)
error_message
created_at, completed_at

Purpose: durable clean-up results/history and retry queue for failures.

8) `manual_import_batches` (manually added photos)

Store:

batch_id (PK)
created_at
source_label (optional)
asset_count

And map imported assets via assets.source='manual' + group_type='manual_selection'.

Cache Policy (What To Cache, Not Treat As Source Of Truth)

DB-backed cache (survives restart, safe to regenerate)

embeddings
photo_groups from ML runs
is_in_pickone_album flags (refresh opportunistically)

Memory-only cache (fast UI)

Current screen thumbnails and pairwise image decode cache
In-flight run buffer before DB flush (batch writes)

Eviction/Invalidation

Invalidate embedding when asset_fingerprint changes or ml_model_version changes
Mark stale groups when any member embedding is stale
Purge old completed runs beyond retention (e.g., keep last 10 runs or 30 days)
Keep manual groups until user archives/deletes

Integration Plan In Existing App

Add DB bootstrap + migrations and initialize at app startup.
Add repository helpers (assets, embeddings, groups, cleanup actions, app state).
Hydrate PhotoGroupsContext from DB on launch (instead of empty memory state).
In app/index.tsx:

Before embedding: upsert assets metadata
Reuse cached embeddings where valid
Persist streaming groups as they are discovered
Persist run progress cursor every N assets

In app/pickone.tsx:

Persist each keep/delete decision to group_items
Persist cleanup action records when adding to album
Update action status after native/MediaLibrary success/failure

On returning to home screen:

Refresh pickone album membership cache for touched assets
Filter/mark groups based on persisted cleanup status

Proposed File Additions/Changes

Add: lib/db/sqlite.ts (connection, pragmas)
Add: lib/db/migrations.ts (migration runner)
Add: lib/db/schema/001_initial.sql
Add: lib/db/repos/assetsRepo.ts
Add: lib/db/repos/embeddingsRepo.ts
Add: lib/db/repos/groupsRepo.ts
Add: lib/db/repos/cleanupRepo.ts
Add: lib/db/repos/appStateRepo.ts
Update: contexts/PhotoGroupsContext.tsx
Update: app/index.tsx
Update: app/pickone.tsx

Write Strategy And Performance

Use transactions for batch inserts/updates (assets + embeddings + groups)
WAL mode + foreign keys on startup pragmas
Batch sizes: 50-200 rows per transaction depending on device
Keep vectors as BLOB for compactness; decode lazily only when needed
Add indexes:
- assets(asset_id)
- embeddings(asset_id, model_version)
- group_items(group_id)
- group_items(asset_id)
- cleanup_actions(action_status, created_at)

Rollout Safety

Phase 1: dual-write (context + DB), read from context
Phase 2: hydrate from DB, context as UI cache
Phase 3: remove temporary fallback paths after validation
Add debug screen/logging for run resume, stale embedding count, pending cleanup actions

kirillshevch/ios-store-plan.md

Select an option

No results found

Select an option

No results found

Expo SQLite Storage And Cache Plan

Goal

Scope Anchors In Current Code

Data Model (What To Persist)

1) `app_state` (small key-value config/state)

2) `assets` (normalized photo index)

3) `embeddings` (ML index cache)

4) `analysis_runs` (scan jobs/session)

5) `photo_groups` (grouping output)

6) `group_items` (membership + decision)

7) `cleanup_actions` (what user did to clean)

8) `manual_import_batches` (manually added photos)

Cache Policy (What To Cache, Not Treat As Source Of Truth)

DB-backed cache (survives restart, safe to regenerate)

Memory-only cache (fast UI)

Eviction/Invalidation

Integration Plan In Existing App

Proposed File Additions/Changes

Write Strategy And Performance

Rollout Safety

kirillshevch/ios-store-plan.md

Expo SQLite Storage And Cache Plan

Goal

Scope Anchors In Current Code

Data Model (What To Persist)

1) app_state (small key-value config/state)

2) assets (normalized photo index)

3) embeddings (ML index cache)

4) analysis_runs (scan jobs/session)

5) photo_groups (grouping output)

6) group_items (membership + decision)

7) cleanup_actions (what user did to clean)

8) manual_import_batches (manually added photos)

Cache Policy (What To Cache, Not Treat As Source Of Truth)

DB-backed cache (survives restart, safe to regenerate)

Memory-only cache (fast UI)

Eviction/Invalidation

Integration Plan In Existing App

Proposed File Additions/Changes

Write Strategy And Performance

Rollout Safety

1) `app_state` (small key-value config/state)

2) `assets` (normalized photo index)

3) `embeddings` (ML index cache)

4) `analysis_runs` (scan jobs/session)

5) `photo_groups` (grouping output)

6) `group_items` (membership + decision)

7) `cleanup_actions` (what user did to clean)

8) `manual_import_batches` (manually added photos)