The announcement blog post is probably the best way to get familiar with the scope of the feature.
The extraction feature requires two Turbopack loaders that integrate with Next.js at build and dev time. This document outlines the required permissions and touch points for a potential SWC/Rust port.
Read Access:
- Initial project scan reads all source files in configured `srcPath` (e.g., `./src`, `./app`)
- In monorepo setups, may need to read outside project root (e.g., `../packages/ui`, `../shared`)
- Reads all message catalog files (e.g., `en.json`, `de.json`, `es.json`) from configured messages directory
- Checks file modification timestamps (`fs.stat`) to detect external edits during dev
Write Access:
- Writes to source locale catalog (e.g., `messages/en.json`)
- Writes to all target locale catalogs (e.g., `messages/de.json`, `messages/es.json`, ...)
- Creates messages directory if it doesn't exist (`fs.mkdir`)
Loader Lifecycle:
- Loader instance persists across HMR (not recreated on hot reload)
- Shared instance across react-client and react-server bundles
- Only lost when dev server fully restarts (e.g., Next.js config change)
Pre-filtering Optimization (Next.js 16+):
- Turbopack peeks into files before invoking the loader
- Only processes files containing `useExtracted` or `getExtracted` strings
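The pre-filter amounts to a cheap substring check that runs before any parsing. A minimal sketch of that idea follows; the function and constant names are hypothetical, and the real check is performed by Turbopack rather than by loader code.

```typescript
// Hypothetical marker list, taken from the two function names mentioned above.
const EXTRACTION_MARKERS = ["useExtracted", "getExtracted"] as const;

// Returns true if the source might contain an extraction call.
// This is a plain substring test, so it can report false positives
// (e.g., a match inside a comment), but never false negatives.
function mightContainExtraction(source: string): boolean {
  return EXTRACTION_MARKERS.some((marker) => source.includes(marker));
}
```

A false positive only costs one unnecessary parse, which is why a substring test is sufficient here.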
Compilation:
- Uses SWC parser (`@swc/core`) to parse TypeScript/JSX/TSX
- Transforms AST to replace:
  - `useExtracted()` → `useTranslations()`
  - `getExtracted()` → `getTranslations()`
  - Message strings → minified keys (e.g., `"Hello"` → `"dPSc42"`)
  - In dev, a fallback is also left in place
- Returns modified source code back to Turbopack
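To illustrate what "minified keys" could look like, here is a sketch that derives a short, stable key from a message string. The hashing scheme is an assumption (SHA-256, base64url, first 6 characters) and `messageKey` is a hypothetical name; the actual extractor may use a different hash, alphabet, or length, so this will not reproduce keys such as `"dPSc42"`.

```typescript
import { createHash } from "node:crypto";

// Assumption: SHA-256 + base64url, truncated to `length` characters.
// The same message always yields the same key, which keeps IDs stable
// across files and across client/server bundles.
function messageKey(message: string, length = 6): string {
  return createHash("sha256")
    .update(message, "utf8")
    .digest("base64url")
    .slice(0, length);
}
```

Any deterministic function of the message text would serve the same purpose; a Rust port would only need to reproduce the exact scheme to keep existing catalogs valid.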
Concurrency:
- Multiple files may be compiled in parallel
- Initial scan must complete before first compilation (blocks loader). This could be relaxed in the future.
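One common way to express "parallel compiles that all wait for a one-time scan" is a shared promise. The sketch below assumes that approach; `ScanGate` and `compileFile` are hypothetical names, and the transform itself is a placeholder.

```typescript
// Hypothetical gate: every compile call awaits the same one-time scan promise.
class ScanGate {
  private scan: Promise<void> | undefined;
  private readonly runInitialScan: () => Promise<void>;

  constructor(runInitialScan: () => Promise<void>) {
    this.runInitialScan = runInitialScan;
  }

  // Starts the scan on first use; later callers reuse the same promise.
  ready(): Promise<void> {
    this.scan ??= this.runInitialScan();
    return this.scan;
  }
}

async function compileFile(gate: ScanGate, source: string): Promise<string> {
  await gate.ready(); // blocks until the initial scan has finished
  return source; // placeholder for the real transform
}
```

Once the scan promise has resolved, `await gate.ready()` no longer waits on any work, so later compilations proceed without blocking on the scan.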
Write Scheduling:
- Batches writes with 50ms debounce to avoid excessive I/O
- First write happens immediately, subsequent saves are debounced
- De-duplicates multiple rapid saves into single write operation
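The scheduling rules above can be sketched as a small class: the first request writes immediately, and later requests restart a 50ms timer so that a burst collapses into one write. `DebouncedSaver` is a hypothetical name and not the actual `SaveScheduler` implementation.

```typescript
class DebouncedSaver {
  private timer: ReturnType<typeof setTimeout> | undefined;
  private hasSaved = false;
  private readonly save: () => void;
  private readonly delayMs: number;

  constructor(save: () => void, delayMs = 50) {
    this.save = save;
    this.delayMs = delayMs;
  }

  schedule(): void {
    if (!this.hasSaved) {
      // First request: write immediately.
      this.hasSaved = true;
      this.save();
      return;
    }
    // Later requests: restart the timer so rapid saves become a single write.
    if (this.timer !== undefined) clearTimeout(this.timer);
    this.timer = setTimeout(() => {
      this.timer = undefined;
      this.save();
    }, this.delayMs);
  }
}
```

For example, calling `schedule()` four times in a row produces one immediate write and one more write after the delay has elapsed.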
Caching:
- LRU cache (750 entries) for compiled sources to avoid re-parsing unchanged files
- Cache key is full source content
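A minimal LRU cache keyed by source content could look like the following sketch. It relies on `Map` preserving insertion order; `LruCache` is a hypothetical name, and the real implementation may differ (for example, by using a library).

```typescript
class LruCache<V> {
  private readonly entries = new Map<string, V>();
  private readonly maxEntries: number;

  constructor(maxEntries = 750) {
    this.maxEntries = maxEntries;
  }

  get(key: string): V | undefined {
    if (!this.entries.has(key)) return undefined;
    const value = this.entries.get(key) as V;
    // Re-insert to mark the entry as most recently used.
    this.entries.delete(key);
    this.entries.set(key, value);
    return value;
  }

  set(key: string, value: V): void {
    this.entries.delete(key);
    this.entries.set(key, value);
    if (this.entries.size > this.maxEntries) {
      // Map iterates in insertion order, so the first key is the least recently used.
      const oldest = this.entries.keys().next().value as string;
      this.entries.delete(oldest);
    }
  }
}
```

With the full source as the key, an unchanged file hits the cache via `cache.get(source)` and an edited file simply misses, so no explicit invalidation is needed.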
Catalog Directory Watcher:
- Watches messages directory for new locale files
- When user adds new locale (e.g., `es.json`), automatically populates with all keys
- Uses Node.js `fs.watch` with non-persistent, non-recursive mode
- Only active when locales are set to `'infer'` mode
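As a rough sketch of the watcher setup, the snippet below calls `fs.watch` with `persistent: false` (so the watcher does not keep the process alive) and `recursive: false`. The helper name `watchLocales`, the callback shape, and the `.json` filter are assumptions for illustration.

```typescript
import * as fs from "node:fs";

// Hypothetical helper: invokes onLocaleFile for every .json file event
// in the messages directory. Populating the new catalog is left to the caller.
function watchLocales(
  messagesDir: string,
  onLocaleFile: (filename: string) => void
): fs.FSWatcher {
  return fs.watch(
    messagesDir,
    { persistent: false, recursive: false },
    (_eventType, filename) => {
      // filename may be null on some platforms.
      if (typeof filename === "string" && filename.endsWith(".json")) {
        onLocaleFile(filename);
      }
    }
  );
}
```

The returned watcher should be closed with `watcher.close()` when the dev server shuts down.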
External Edit Detection:
- Before writing, checks if catalog files were modified externally
- If external modification detected, reads file and merges changes back into memory
- Preserves manual translation edits made while dev server is running
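The check-then-merge step might be sketched as below. The helper names are hypothetical, `lastWriteMs` is assumed to be the modification time recorded after the loader's own last write, and the merge policy (on-disk values win on conflict) is an assumption chosen so that manual translations survive.

```typescript
import * as fs from "node:fs/promises";

type Messages = Record<string, string>;

// True when the file changed after our own last recorded write.
function wasModifiedExternally(mtimeMs: number, lastWriteMs: number): boolean {
  return mtimeMs > lastWriteMs;
}

// Assumption: on-disk values win on conflict, so manual edits are preserved.
function mergeCatalog(inMemory: Messages, onDisk: Messages): Messages {
  return { ...inMemory, ...onDisk };
}

// Runs before each write: re-reads the catalog only if it changed on disk.
async function syncBeforeWrite(
  catalogPath: string,
  lastWriteMs: number,
  inMemory: Messages
): Promise<Messages> {
  const { mtimeMs } = await fs.stat(catalogPath);
  if (!wasModifiedExternally(mtimeMs, lastWriteMs)) return inMemory;
  const onDisk = JSON.parse(await fs.readFile(catalogPath, "utf8")) as Messages;
  return mergeCatalog(inMemory, onDisk);
}
```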
Read Access:
- Reads catalog files (`.json`, `.po`, custom formats)
- No write access needed
Format Support:
- Transforms `.po` files → JavaScript objects
- Transforms `.json` files → JavaScript objects (currently a no-op, but parsed for consistency)
- Extensible for custom formats in future (would be helpful if users could define this in JS)
Output:
- Returns catalog as `export default JSON.parse({...})` for V8 optimization
- Example: `{"NhX4DJ": "Hello", "abc123": "World"}`
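Since `JSON.parse` takes a string, the emitted module has to embed the catalog as a string literal. A sketch of generating that module source follows; `toCatalogModule` is a hypothetical name.

```typescript
// Builds the module source for a catalog.
function toCatalogModule(messages: Record<string, string>): string {
  // Stringify twice: once to produce JSON, and once more to turn that JSON
  // into a safely escaped JavaScript string literal.
  const json = JSON.stringify(messages);
  return `export default JSON.parse(${JSON.stringify(json)});`;
}
```

For the catalog `{"NhX4DJ": "Hello"}`, the result is `export default JSON.parse("{\"NhX4DJ\":\"Hello\"}");`.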
Loader Context:
- Uses `this.resourcePath` to extract locale from filename
- Example: `messages/en.json` → locale `"en"`
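Deriving the locale from the resource path can be done with Node's `path` module. This is a minimal sketch that assumes the locale is the file name without its final extension; `localeFromResourcePath` is a hypothetical name.

```typescript
import * as path from "node:path";

// "messages/en.json" -> "en"
// path.parse drops the directory and the final extension.
function localeFromResourcePath(resourcePath: string): string {
  return path.parse(resourcePath).name;
}
```

Inside a loader, this would be called as `localeFromResourcePath(this.resourcePath)`.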
Dynamic Imports:
- Loader runs optimistically for dynamic imports: `import('../messages/${locale}.json')`
- Turbopack processes all potential candidates in folder (both dev and build)
- Single persistent compiler instance across entire dev session
- Maintains in-memory state of all extracted messages across all files
- Tracks which messages belong to which source files
- Merges references when same message appears in multiple files
- Initial scan loads all existing messages into memory
- Continuous sync between memory state and disk catalogs
- Conflict resolution when files modified externally during dev
- Same compiler instance used for both react-client and react-server
- Ensures consistent message IDs across client/server boundaries
- Must be able to read files outside Next.js project root
- Example: Next.js app in `apps/web`, shared UI in `packages/ui`
- `srcPath` can be array: `["./src", "../packages/ui/src"]`
- Current implementation maintains complex in-memory JavaScript state
- Maps, Sets, LRU caches, async scheduling
- Would need Rust equivalents or JS bridge (SWC plugin?)
- Heavy use of Node.js `fs/promises` APIs
- File watching with `fs.watch`
- Timestamp tracking and comparison
- Formatters loaded dynamically: `formatters[config.format]()`
- Supports pluggable catalog formats (JSON, PO, custom)
- Complex visitor pattern traversing SWC AST
- Scope tracking for variable bindings
- Multiple node type transformations (strings, template literals, objects)
- File: `src/plugin/extractor/extractionLoader.tsx`
- Exports: `export default function extractionLoader(this: TurbopackLoaderContext, source: string)`
- Config type: `ExtractorConfig`
- File: `src/plugin/catalog/catalogLoader.tsx`
- Exports: `export default function catalogLoader(this: TurbopackLoaderContext, source: string)`
- Config type: `CatalogLoaderConfig`
See extractor
- `ExtractionCompiler` - Orchestrates extraction lifecycle
- `CatalogManager` - Manages in-memory message state and persistence
- `MessageExtractor` - AST traversal and transformation
- `CatalogPersister` - File I/O operations
- `SaveScheduler` - Batches and de-duplicates writes
- `CatalogLocales` - Locale discovery and directory watching
Besides some more isolated unit tests, there are mainly integration tests in `ExtractionCompiler.test.tsx`, which try to test as much as possible at once without having to invoke Next.js.
- It seems to be reasonably fast, but faster is of course always better
- When the user changes a message in a file and saves, this triggers the following chain: 1) compile (catalog written), 2) HMR re-render with old catalog, 3) compile new catalog, 4) HMR re-render with new catalog. This is a bit wasteful; a single compile -> re-render chain would be ideal.