This guide defines Claude Code CLI and Codex CLI as general-purpose "intelligent compute nodes." They're not just coding assistants—they're advanced reasoning engines that integrate into automation pipelines.
Directly calling LLM APIs (like chat/completions) is simple, but has clear limitations when handling complex tasks (full-document translation, large-scale refactoring). Using a CLI Agent as an intermediary offers these core advantages:
- Anti-Laziness: When tasks are large, direct API calls often produce outputs like "(omitted for brevity...)" or outright truncation. CLI Agents (especially Claude Code) have native "loop execution" and "self-correction" capabilities: they can sense file state, operate in chunks or loops, and ensure tasks are actually completed.
- Native File Context Support: They automatically handle file reading, encoding, and writing, decoupling "reasoning" from "IO." You just specify the goal; the Agent optimizes the reading details.
- Tool-Rich Environment: They come with mature MCP (Model Context Protocol) plugins. For example, in our environment they can call Tavily for web searches at any time, or write temporary scripts to process data, capabilities that are extremely expensive to simulate via the API.
- Optimized Context Management: Agent frameworks automatically handle Context Window consumption and long-conversation compression, which is more robust and efficient than hand-coded API logic.
In production, never use pipe patterns like `echo | claude` for core logic. We exclusively use File-Based Mode.
- Determinism: The AI's mental model when "editing files" is "complete the work and save," while in "conversation" mode it's "answer the question." The former is less prone to truncation.
- Auditability: File system changes before and after (git diff) are the single source of truth.
- Large Capacity: Bypasses command-line argument length limits, letting the Agent decide how to efficiently read files.
```bash
# 1. Prepare context (store content to process in a file)
cp raw_data.json task_context.json

# 2. Issue instructions (have the Agent modify that file)
codex exec --full-auto "Read task_context.json, translate Chinese to English, preserve JSON structure. Modify the file in place."
```

In our automated translation scripts, we no longer manually parse text to send to APIs; we hand the entire Tiptap JSON directly to the CLI Agent.
```python
import subprocess

# Core logic: Have the Agent modify the target file directly
prompt = f"""You are a professional content translator.
1. Read {target_file}.
2. Translate Tiptap JSON text nodes from Chinese to English.
3. Keep technical terms (like "vibe coding") unchanged.
4. Ensure the file remains valid JSON.
Modify the file in place."""

# Invocation (Codex example, using latest model with reasoning control)
subprocess.run([
    "codex", "exec",
    "--dangerously-bypass-approvals-and-sandbox",
    "-m", "gpt-5.2",                       # Explicitly specify latest model
    "-c", 'model_reasoning_effort="low"',  # Adjust reasoning intensity (low/medium/high)
    "-C", str(target_dir),
    prompt.replace('\0', ''),              # Important: strip NUL bytes to avoid system errors
])
```

In the GPT-5.2 era, we can precisely control the AI's "thinking cost" and "reasoning depth" through parameters.
- `-m, --model`: Recommend using `gpt-5.2`. This version is optimized for Agentic workflows and performs better on long contexts and complex refactoring.
- `-c model_reasoning_effort`:
  - `low`: Suitable for simple format conversion, text translation, README updates. Extremely fast, low cost.
  - `medium` (default): Suitable for routine bug fixes, single-file refactoring.
  - `high`: Suitable for cross-file logic migration, deep code audits.
Performance Tip: For batch translation tasks, forcing `low` can improve response speed by 2x+ with minimal impact on literal translation quality.
For long-running tasks (like large-scale translation), simply waiting for process completion (blocking) loses visibility into task progress. We encourage using Streaming JSON mode.
- Claude Code: Use `--output-format stream-json`.
- Codex: Use `--json`. It outputs structured events containing `thought` (thinking process), `call` (tool invocations), and `response`.
When writing integration scripts, we strongly recommend enabling and printing AI logs in real time (especially `thought` and `call` events).
- Transparency: Developers can instantly see if the AI is reading files correctly or stuck in a loop.
- Debugging Efficiency: When tasks fail, the most telling clue to the cause is usually in the last few `thought` or `call` lines.
- Feedback: When processing large files, scrolling real-time logs provide stronger reassurance than a static progress bar.
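To make this concrete, here is a minimal Python sketch of a streaming wrapper around Codex's `--json` mode. The flags come from the notes above; the exact shape of each event object is an assumption, so the `type` lookup is illustrative and should be adjusted to whatever your Codex version actually emits.

```python
import json
import subprocess

def run_codex_streaming(prompt: str, workdir: str) -> int:
    """Launch Codex with --json and echo each event line as it arrives."""
    proc = subprocess.Popen(
        ["codex", "exec", "--full-auto", "--json", "-C", workdir, prompt.replace("\0", "")],
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,
        text=True,
        bufsize=1,  # line-buffered so events show up immediately
    )
    for line in proc.stdout:
        line = line.strip()
        if not line:
            continue
        try:
            event = json.loads(line)
        except json.JSONDecodeError:
            print(line, flush=True)  # pass through non-JSON output unchanged
            continue
        # Field name below is illustrative; inspect the actual event schema of your CLI version.
        print(f"[codex] {event.get('type', 'event')}: {str(event)[:200]}", flush=True)
    return proc.wait()
```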
For multi-process or long-running tasks, we strongly recommend that applications print intermediate results to stdout in JSON Lines format; even with interleaved multi-process output, it remains easy for machines to parse and monitor.
Standard Log Fields:
{"event": "task_completed", "id": "post_123", "status": "success", "details": "Translated 5 comments"}- Required Fields:
event: Event type (e.g.,start,progress,complete,error).id: Unique task identifier (e.g.,post_id,file_path).status: Current state.
- Purpose:
  - Real-time Observability: Provides instant progress feedback in CI/CD or terminal.
  - Decoupling: Scripts only need to print intermediate state to stdout, not aggregate or parse final disk files. External tools can consume these logs via a pipe if needed.
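A tiny logging helper that emits records in this format might look like the sketch below; the function name `log_event` and the extra `details` kwargs are our own convention, only the `event`/`id`/`status` fields come from the list above.

```python
import json
import sys

def log_event(event: str, task_id: str, status: str, **details) -> None:
    """Emit one JSON Lines record to stdout (flush so multi-process output stays live)."""
    record = {"event": event, "id": task_id, "status": status, **details}
    print(json.dumps(record, ensure_ascii=False), file=sys.stdout, flush=True)

# Example usage:
# log_event("task_completed", "post_123", "success", details="Translated 5 comments")
```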
Warning: If you pass Prompts via stdin, or the Prompt contains invisible `\0` characters, Codex CLI (Rust core) will throw a `nul byte found` error and crash. Countermeasures:
- Prefer passing Prompts as the last command-line argument rather than via pipe.
- Always execute `.replace('\0', '')` on strings before passing them.
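Combining both countermeasures, a defensive wrapper (the function name is ours, not part of either CLI) might look like:

```python
import subprocess

def safe_codex_exec(prompt: str, workdir: str) -> subprocess.CompletedProcess:
    """Strip NUL bytes and pass the prompt as the final argv element, never via stdin."""
    clean_prompt = prompt.replace("\0", "")
    return subprocess.run(
        ["codex", "exec", "--full-auto", "-C", workdir, clean_prompt],
        check=False,  # let the caller inspect returncode instead of raising
    )
```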
| Scenario | Recommended (Claude) | Recommended (Codex) |
|---|---|---|
| Translation/Doc Rewriting | `--permission-mode acceptEdits` | `--full-auto` |
| Automated Testing/Deployment | `--permission-mode bypassPermissions` | `--dangerously-bypass-approvals-and-sandbox` |
If you're writing an Agent (like Gemini) to call these CLIs, provide these "meta-instructions":
"When facing large-scale text processing or file system operations, call the underlying
claudeorcodex.
- Prefer file-based mode—first store content to process in a local temp file.
- Use streaming mode (
--json) and parse events in real-time to monitor progress.- Set appropriate
reasoning_effort(e.g.,lowfor translation).- Clean null characters before passing Prompts, and pass them as command-line arguments."
Note: When Claude quota is limited, prioritize using Codex for testing.
For very large files (over 2000 lines), the system automatically switches to parallel mode to break through single-thread speed bottlenecks:
- Auto-splitting: Files are split into independent chunks of 1000 lines each.
- High Concurrency: Uses `ProcessPoolExecutor` to launch up to 8 workers in parallel (see the sketch after this list).
- Context Preservation: Each worker still reads the complete file to maintain context understanding, but its Prompt strictly limits it to modifying only its assigned line range.
- Result: A 20k-line giant file translates in ~8 minutes (vs 45+ minutes serial).
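A much-simplified sketch of this orchestration is below. The chunk size, worker count, and context-preserving prompt mirror the description above; the function names and the exact prompt wording are illustrative, not the production script.

```python
from concurrent.futures import ProcessPoolExecutor
import subprocess

CHUNK_SIZE = 1000   # lines per chunk
MAX_WORKERS = 8     # parallel Codex workers

def translate_chunk(path: str, start: int, end: int) -> str:
    """Run one Codex worker that reads the whole file but only edits lines start..end."""
    prompt = (
        f"Read {path} for full context, but ONLY modify lines {start}-{end}. "
        "Translate Chinese text nodes to English and keep the file valid JSON. "
        "Modify the file in place."
    )
    subprocess.run(["codex", "exec", "--full-auto", prompt.replace("\0", "")], check=False)
    return f"[L{start}-{end}] done"

def translate_file_parallel(path: str, total_lines: int) -> None:
    ranges = [(i + 1, min(i + CHUNK_SIZE, total_lines))
              for i in range(0, total_lines, CHUNK_SIZE)]
    with ProcessPoolExecutor(max_workers=MAX_WORKERS) as pool:
        futures = [pool.submit(translate_chunk, path, s, e) for s, e in ranges]
        for fut in futures:
            print(fut.result(), flush=True)  # interleaved but real-time progress tags
```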
To handle complex tasks unattended, Prompt design is crucial:
- Chunked Edit Instructions: Explicitly tell the AI it can edit in chunks (e.g., 1000 lines at a time) if the file is too large.
- Persistence Instructions: Use strong directive words (like "MUST persist", "complete the ENTIRE file") to prevent the AI from getting lazy or giving up midway.
- Self-Correction: Require the AI to perform quality checks and JSON format validation before submitting final results.
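As an illustration, an unattended prompt that combines all three elements might read as follows (the wording is ours, not the production prompt):

```python
UNATTENDED_PROMPT = """You are a professional content translator.
1. Read the target file. If it is very large, you may edit it in chunks of ~1000 lines at a time.
2. You MUST persist until the ENTIRE file has been processed; do not stop or summarize partway.
3. Before finishing, re-check your output and validate that the file is still valid JSON.
Modify the file in place."""
```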
To prevent large file processing from being killed unexpectedly, the system uses dynamic timeout strategy:
- Formula: Base 10 minutes + 10 minutes per 5000 lines.
- Range Limits: Minimum 10 minutes, maximum 45 minutes.
- Granularity: Each parallel worker has its own timeout quota, ensuring complex paragraphs have sufficient processing time.
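Expressed in code, the formula is a simple clamp (the function name is illustrative):

```python
def compute_timeout_seconds(total_lines: int) -> int:
    """Base 10 minutes + 10 minutes per 5000 lines, clamped to the 10-45 minute range."""
    minutes = 10 + (total_lines // 5000) * 10
    return max(10, min(45, minutes)) * 60
```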
After testing, this is currently the most cost-effective translation configuration:
- Model: `gpt-5.2` (smarter and more stable than older versions).
- Reasoning Effort: `low`. For translation tasks, "low" is sufficient and fastest; there is no need to pay for "high" deep reasoning.
- Invocation Example:

```bash
codex exec ... -m gpt-5.2 -c 'model_reasoning_effort="low"' ...
```

In concurrent mode, subprocess output gets stuck in buffers if it is not force-flushed.
- Trick: Force `flush=True` in the Python script's `print()` calls.
- Effect: This ensures that even with 8 processes running in parallel, you can see interleaved but real-time progress logs in the terminal (like `[L1-1000]`, `[L2001-3000]`), which is very important for peace of mind during long-running tasks.
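The fix itself is a single keyword argument; the `[L1-1000]` style tag is just whatever label the worker chooses to print:

```python
# Without flush=True, this line may sit in the child process's stdout buffer until exit;
# with it, the progress tag appears in the terminal immediately.
print("[L1-1000] chunk translated", flush=True)
```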