Codex: The Broke Developer's Lifesaver
Open source repository: https://github.com/aiskyhub/codex_with_cc
The Broke Developer's Workflow
Folks, big news!
Have you noticed? Codex is getting smarter and smarter; its code writing and architecture design are simply divine.
But do your eyes go black when you check your balance? Token consumption is getting ridiculously fast!
When you're broke, you innovate; when you're rich... well, rich folks don't need to read this article!
To protect our wallets, this broke developer's workflow has emerged!
Before, you were the one PUAing (pressuring and browbeating) Codex.
Now you PUA Codex, then let Codex PUA Claude Code, so the AIs compete fiercely among themselves first. You only need to review the results Codex has already accepted!
Before, you made Codex shoulder large projects alone. It had to read code, break down requirements, find call chains, modify files, run tests, review failure logs, and then go back to fix things. The main thread became bloated, and tokens burned faster and faster. This is like paying a fortune to hire Elon Musk, only to have him unclog sewers?! What a waste of talent!
Now? Welcome to the ruthless AI contracting team!
Codex sits firmly in the boss's chair giving orders, sub-agents line up to get work done, and Claude Code, running on DeepSeek under the hood, frantically lays bricks (does the grunt work). Long-context exploration? Large-scale refactoring? Endless debugging loops? None of it is force-fed into the main thread anymore; just throw it to the execution layer. The AI contracting team doesn't deserve weekends!
This is how multi-agent systems should truly look: the boss is responsible for overall planning, accepting results, and sending back for rework! The 'oxen and horses' (grunt workers) fiercely compete for tasks, and all the dirty, tiring work is outsourced!
Of course, all of this rests on DeepSeek's ridiculously low prices! 60 million tokens at a 98.5% cache hit rate cost only about two yuan. Because Codex handles overall planning and result acceptance in this workflow, the sub-agents only need to do mindless work, so even the DeepSeek 4 flash model is more than sufficient.
DeepSeek API price list. Cache hit: 0.02 yuan/million tokens. How is this different from giving it away for free? This cache hit discount seems to be permanent, with no time limit.
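A quick back-of-the-envelope check of those figures, as a rough sketch. The token count, hit rate, cache-hit price, and two-yuan total are the numbers quoted above. Treating all 60 million tokens as input tokens is an assumption, and the cache-miss and output prices are not stated here, so the code only derives how much of the bill is left over for them.

```python
# Figures quoted in the article.
TOTAL_TOKENS_M = 60.0        # total tokens used, in millions (assumed to be input tokens)
CACHE_HIT_RATE = 0.985       # fraction of tokens served from cache
HIT_PRICE_YUAN_PER_M = 0.02  # cache-hit price, yuan per million tokens
TOTAL_BILL_YUAN = 2.0        # approximate total bill

hit_tokens_m = TOTAL_TOKENS_M * CACHE_HIT_RATE
miss_tokens_m = TOTAL_TOKENS_M - hit_tokens_m
hit_cost = hit_tokens_m * HIT_PRICE_YUAN_PER_M

# Whatever is left was spent on cache misses and output tokens.
# Their prices are not given in the article, so only the remainder is derived.
remainder = TOTAL_BILL_YUAN - hit_cost

print(f"cache-hit tokens: {hit_tokens_m:.1f}M -> {hit_cost:.3f} yuan")
print(f"other tokens:     {miss_tokens_m:.1f}M -> about {remainder:.2f} yuan")
```

Under these assumptions, roughly 59 million cached tokens account for a little over one yuan, which is why the cache hit rate matters far more than the raw token count.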
DeepSeek for the people, Little D's (DeepSeek's) kindness is endless 😭
One-Line Workflow Installation
Prerequisites are simple:
- Install Claude Code.
- Install CC Switch.
- In CC Switch, switch Claude Code's backend API to DeepSeek.
- Prepare a target project you want to integrate this workflow into.
- Open Codex.
No Codex? Then I'm sorry, this project is not for you. This setup is specifically for Codex to be the leader and sub-agents to be the workers.
Then, send the installation prompt to Codex in your target project (this prompt also pulls updates for the workflow):
Please install or update the sub-agent workflow from https://github.com/aiskyhub/codex_with_cc into the current Codex environment.
Results
- Prompt
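Delegate three sub-agents to analyze the project in depth and each produce an optimization plan. Wherever the three plans contradict each other, keep sending them back to re-verify and revise until they agree, then consolidate everything into a single optimization plan.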
- Creating sub-agents
- Sub-agent executing the Claude Code CLI
- Results sent back for re-validation
How to Command Codex
The most basic way to delegate tasks looks like this:
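Break down task xxx and assign it to multiple sub-agents to implement. You are responsible for reviewing the sub-agents' results; if they don't meet the requirements, send them back for revision until they do.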
If you want to make the chain of responsibility clearer, you can directly issue commands like this:
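You are responsible for decomposition, dispatch, review, and final delivery. Sub-agents are responsible for execution. If a result is not up to standard, send it back for rework until it meets my requirements.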
For large task decomposition, you can do this:
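First read the project and split the work into 3 non-conflicting implementation tasks, then hand each one to a sub-agent. Every sub-agent must report its changed files, validation commands, and risk notes. At the end, you review and integrate everything and run the final validation.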
If you haven't decided on the technical route, you can have multiple sub-agents propose solutions first:
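Launch multiple sub-agents to each propose an implementation plan for xxx. Each plan must cover pros and cons, complexity, risks, and migration cost. After consolidating them, give your recommended plan; do not simply copy any single sub-agent's plan.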
If you're concerned about code quality, you can have one agent write and another specialize in finding flaws:
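Assign one sub-agent to implement xxx, and another dedicated to code review and attacking edge cases. You judge whether each review finding holds: if it does, send it back to the implementing agent to fix; if it doesn't, explain why.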
If you just want to investigate a large module and don't want the main thread to be flooded with all the noise, you can do this:
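Hand the xxx module of the project to a sub-agent for a deep investigation, and require it to output the call chain, key files, potential risks, and suggested changes. Keep only the conclusions; don't stuff all the noise back into the main context.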
The common thread among these types of prompts is clear: the main Codex thread is not responsible for painstakingly doing all the work itself; it's responsible for clearly breaking down tasks, articulating standards, and reviewing results.
In plain terms, don't let the boss do the grunt work. The boss's job is to delegate tasks, accept results, send back for rework, and ultimately be responsible for delivery.
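The control flow these prompts ask Codex to follow can be pictured as a small loop. This is purely an illustrative sketch: in practice the loop lives in the prompt rather than in code, and `run_with_rework`, `worker`, and `leader_review` are hypothetical names, not part of codex_with_cc.

```python
from typing import Callable, Optional


def run_with_rework(
    task: str,
    delegate: Callable[[str, Optional[str]], str],
    review: Callable[[str], Optional[str]],
    max_rounds: int = 3,
) -> tuple[str, int]:
    """Delegate a task, review the result, and send it back until it passes."""
    feedback: Optional[str] = None
    for round_no in range(1, max_rounds + 1):
        result = delegate(task, feedback)
        feedback = review(result)  # None means the leader accepted the result
        if feedback is None:
            return result, round_no
    raise RuntimeError(f"not accepted after {max_rounds} rounds: {feedback}")


def worker(task: str, feedback: Optional[str]) -> str:
    # Stand-in for a sub-agent: it only adds tests once it has been sent back.
    return f"{task}: code + tests" if feedback else f"{task}: code"


def leader_review(result: str) -> Optional[str]:
    # Stand-in for the leader's acceptance check.
    return None if "tests" in result else "missing tests, please add them"


result, rounds = run_with_rework("add retry", worker, leader_review)
print(result, rounds)
```

The cap on rounds is a design choice in this sketch: a leader that can send work back forever can also burn tokens forever.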
Why This Is So Appealing
If you heavily use Codex for projects, you've likely seen this scenario:
A complex requirement is thrown in, and Codex first reads the project, then finds call chains, modifies three to five files, and then runs tests. If a test fails, it starts reading logs. After reading logs, it modifies again. After modifying, it runs again. After running, it discovers another edge case. Finally, the main thread is filled with code snippets, failure logs, intermediate judgments, repair attempts, and various 'let me check again' statements.
It looks fine at first, but then things start to go wrong.
The context gets fatter and fatter, tokens burn faster and faster, and the main Codex's attention is dragged down by intermediate noise. It should be acting as an architect, project manager, reviewer, and ultimate responsible party, but instead, it's forced to become a full-stack laborer, tester, log analyst, and temporary firefighter.
At this point, the most painful thing isn't that 'the AI isn't smart enough,' but that the smart main thread is bogged down by dirty work. You pay for Codex's judgment, yet much of its energy goes to sifting through logs and rereading files, and in the end it has to recover the thread of the task from a pile of context fragments.
This is where DeepSeek shines: let it chew through the high-token, long-context, repetitive reading grunt work. Once the cache hit rate kicks in, it feels like a money-saving cheat code for multi-agent collaboration. Costs don't disappear entirely, but the psychological pressure of 'every time I try another solution, I feel the token pain' drops significantly.
The most frustrating part is that many tasks in large projects cannot be skipped just by being 'smart'.
You need to read unfamiliar modules, scan files, compare solutions, review failure logs, add tests, check boundaries, and conduct reviews. These are all necessary tasks, but they don't all belong in the main thread.
This is where codex_with_cc comes in: it delegates high-token, high-noise, highly repetitive execution tasks, keeping the main Codex clear-headed and letting 'Little D' (DeepSeek) do the grunt work it's best suited for.
It's not that Codex can't do the work; it's that Codex is perfectly suited to be a leader. Making it constantly do grunt work wastes its most valuable abilities: global judgment, task decomposition, risk assessment, and final acceptance.
This is also what makes the workflow satisfying:
- The main thread doesn't have to swallow all the code reading and log noise.
- Sub-agents can handle long context exploration, implementation, review, and rework.
- Claude Code CLI is responsible for executing specific tasks.
- DeepSeek is responsible for digesting large amounts of repetitive context and high-token work.
- The Codex leader ultimately reviews, and if the results are unsatisfactory, sends them back for rework.
What you truly want isn't 'opening several AIs to make it look busy,' but rather for each layer to have its own responsibilities.
The main thread maintains judgment, and the execution layer is responsible for burning through effort.
This Is Not a Prompt Toy
Many so-called multi-agent workflows are actually just prompts written to sound more like company policies.
'You are Agent A, you are responsible for development.'
'You are Agent B, you are responsible for review.'
Sounds very formal, but in practice, it all relies on the AI's self-awareness. There's no session management, no task fingerprinting, no leases, no audit artifacts, no link constraints. How tasks are dispatched, which session was used, whether context was reused, whether validation was run, where results came from—it's all a mess.
codex_with_cc doesn't follow this path.
It does something more fundamental and engineering-oriented: it copies a Codex -> Codex Sub-agent -> Claude Code CLI delegation workflow into any project, making Codex the leader and Claude Code/DeepSeek the execution layer. Dirty work, long context exploration, large-scale code changes, and mutual fault-finding are all thrown to the sub-agents; the main Codex only handles decomposition, scheduling, acceptance, and sending back for rework.
This is no longer the 'please AI, be self-aware' mysticism; instead, it turns multi-agent delegation into a stateful, artifact-driven, bounded, and verifiable workflow.
The real value is hidden in the latter half.
Workflow Breakdown
This chain can be written as one line:
Codex main thread -> Codex sub-agent -> Claude Code CLI -> Claude Code backend model
Each layer has different responsibilities.
The main Codex thread is responsible for understanding requirements, decomposing tasks, creating sub-agents, reviewing results, sending back for rework, and final delivery. It is the Codex leader, the architect, and the one ultimately accountable.
The Codex sub-agent acts as a traceable node in the conversation tree. It calls delegation scripts to hand the token-heavy work (specific implementation, investigation, and review tasks) over to the Claude Code CLI. It's not just a casually opened chat window, but a task node dispatched by the main thread.
The Claude Code CLI is responsible for executing the specific delegated tasks. It can conduct investigations, modify files, run validations, and output structured reports.
If the Claude Code backend is connected to DeepSeek via CC Switch, then a large amount of grunt work involving reading code, modifying code, and running validations can be offloaded to DeepSeek. The key experience mentioned in the README is DeepSeek's incredibly high cache hit rate, which helps minimize repetitive reading, modeling, and token burning.
So the essence of this chain isn't "making multiple models chat together," but rather layering:
- The main Codex thread is responsible for judgment.
- Codex sub-agents are responsible for taking on task nodes.
- Claude Code CLI is responsible for execution.
- DeepSeek is responsible for digesting high-token work.
The main thread won't be flooded with massive amounts of code and logs, sub-agents do the grunt work, and the main Codex remains clear-headed.
Claude Session Reuse Pool
This library has a built-in Claude session reuse pool.
Three types of roles:
- PrimaryReuse: Responsible for serial main session continuation.
- PrimaryAnchor: Responsible for context anchors in parallel batches.
- ParallelPool: Responsible for session pooling of independent branch tasks.
These names sound a bit backend-ish, but they solve a very real problem: don't let every task cold-start.
In large projects, one of the most token-wasting things is for each agent to re-read the same background context. Reading project structure, core modules, constraints, call chains, and then starting work. When there are many tasks, this repetitive reading can be very painful.
The goal of the session reuse pool is to allow similar tasks to reuse stable sessions as much as possible, warming up the context. Long tasks no longer start from scratch every time, and repetitive reading, modeling, and token burning are minimized.
This is also why it's suitable for tasks like long context exploration, large-scale modifications, and multi-agent solution comparisons. Context is not a one-time consumable but a reusable work asset.
To put it plainly: since you've already spent tokens to warm up the context, don't boil the water from scratch every time.
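The reuse idea can be sketched in a few lines. The three role names come from the list above; everything else here (the `SessionPool` class, its `acquire` method, keying sessions by a context string) is a hypothetical illustration and not the library's actual implementation.

```python
import uuid
from dataclasses import dataclass, field
from enum import Enum


class Role(Enum):
    PRIMARY_REUSE = "PrimaryReuse"    # serial main-session continuation
    PRIMARY_ANCHOR = "PrimaryAnchor"  # context anchor for a parallel batch
    PARALLEL_POOL = "ParallelPool"    # pooled sessions for independent branches


@dataclass
class SessionPool:
    # Maps (role, context key) -> session id. Hypothetical structure.
    _warm: dict = field(default_factory=dict)

    def acquire(self, role: Role, context_key: str) -> tuple[str, bool]:
        """Return (session_id, reused), reusing a warm session when one exists."""
        key = (role, context_key)
        if key in self._warm:
            return self._warm[key], True
        session_id = uuid.uuid4().hex  # cold start: a new session has to be warmed up
        self._warm[key] = session_id
        return session_id, False


pool = SessionPool()
first, reused_first = pool.acquire(Role.PRIMARY_REUSE, "billing-module")
second, reused_second = pool.acquire(Role.PRIMARY_REUSE, "billing-module")
print(reused_first, reused_second, first == second)
```

The second call for the same role and context gets the already-warm session back instead of paying for a cold start again.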
Task Fingerprinting, Lease Locks, and Session Recycling
What's the biggest fear in multi-agent parallelism?
Not too few agents, but chaos.
Two workers grabbing the same session, task states unmanaged, a process getting stuck with no one to recycle it. In the end, you don't know who used which context, which task is still alive, or which result is outdated. This kind of parallelism looks busy, but it can easily turn a project into a mess.
codex_with_cc adds engineering constraints to this.
Each delegation generates a fingerprint based on task content, scope, and validation commands. Parallel workers manage session occupancy through leases. Stuck, expired, or vanished process leases are identified and recycled.
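One plausible way to build such a fingerprint is to hash a canonical form of those three inputs. This is a minimal sketch, assuming SHA-256 over sorted JSON; the function name `task_fingerprint` and the choice of hash are illustrative assumptions, not the scheme codex_with_cc actually uses.

```python
import hashlib
import json


def task_fingerprint(task: str, scope: list[str], validation: list[str]) -> str:
    """Hash the task content, scope, and validation commands into one ID."""
    payload = {
        "task": task.strip(),
        "scope": sorted(scope),          # the order of paths should not matter
        "validation": list(validation),  # command order is kept as-is
    }
    canonical = json.dumps(payload, sort_keys=True, ensure_ascii=False)
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


a = task_fingerprint("Add retry to fetch()", ["src/net.py", "tests/test_net.py"], ["pytest -q"])
b = task_fingerprint("Add retry to fetch()", ["tests/test_net.py", "src/net.py"], ["pytest -q"])
c = task_fingerprint("Add retry to fetch()", ["src/net.py"], ["pytest -q"])
print(a == b, a == c)
```

The same task with the same scope yields the same fingerprint regardless of path order, while a different scope yields a different one, so identical delegations can be recognized and matched to an existing session.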
This system sounds like service scheduling, and yes, it is very much like service scheduling. However, what's being scheduled here isn't traditional backend tasks, but AI sub-agent execution chains.
The specific questions it addresses are:
- Who initiated this task?
- What is the scope of this task?
- Which validations should it run?
- Which session did it occupy?
- Is the session still alive?
- Has the lease expired?
- Can the results be traced?
Without solving these problems, multi-agent systems are just a mystery box. Once solved, you can then talk about parallelism, reuse, and replayability.
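Lease recycling of this kind can be sketched as follows. The `Lease` fields, the `recyclable` rule, and the injected `is_alive` check are assumptions made for illustration; the real scripts may track and detect stale leases differently.

```python
import time
from dataclasses import dataclass
from typing import Callable


@dataclass
class Lease:
    session_key: str
    owner_pid: int
    expires_at: float  # Unix timestamp after which the lease is stale


def recyclable(lease: Lease, now: float, is_alive: Callable[[int], bool]) -> bool:
    """A lease can be reclaimed if it has expired or its owner has vanished."""
    return now >= lease.expires_at or not is_alive(lease.owner_pid)


def recycle(
    leases: list[Lease], now: float, is_alive: Callable[[int], bool]
) -> tuple[list[Lease], list[str]]:
    """Split leases into those still held and the session keys freed for reuse."""
    held = [l for l in leases if not recyclable(l, now, is_alive)]
    freed = [l.session_key for l in leases if recyclable(l, now, is_alive)]
    return held, freed


now = time.time()
leases = [
    Lease("s1", owner_pid=101, expires_at=now + 60),  # healthy
    Lease("s2", owner_pid=102, expires_at=now - 1),   # expired
    Lease("s3", owner_pid=103, expires_at=now + 60),  # owner process vanished
]
alive_pids = {101, 102}  # stand-in for a real process liveness check
held, freed = recycle(leases, now, lambda pid: pid in alive_pids)
print([l.session_key for l in held], freed)
```

Passing the liveness check in as a function keeps the sketch platform-neutral; a real implementation would plug in whatever process check its operating system supports.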
Link Constraints, Audit Artifacts, and Validation Scripts
This workflow also has a critical boundary: the main thread cannot directly execute claude.
The script checks for CODEX_CLAUDE_CHILD_THREAD=1, enforcing that Claude Code delegation can only occur within a Codex sub-thread.
This is very important. Because if the main thread could directly run the execution layer arbitrarily, the link would become messy: context pollution, broken audit trails, unclear task responsibilities, and no one to back up the results. In the end, you'd only know that 'the AI seemed to do something,' but not where it started, how it ran, which session it used, or where the output is.
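A guard of this kind might look like the sketch below. The variable name `CODEX_CLAUDE_CHILD_THREAD` and the value `1` come from the text above; the function names and the error message are hypothetical, and the real script may perform the check in a different language or with additional conditions.

```python
import os
import sys
from typing import Mapping, Optional


def in_codex_child_thread(env: Optional[Mapping[str, str]] = None) -> bool:
    """True only when the marker variable is set to exactly "1"."""
    env = os.environ if env is None else env
    return env.get("CODEX_CLAUDE_CHILD_THREAD") == "1"


def require_child_thread(env: Optional[Mapping[str, str]] = None) -> None:
    """Refuse to delegate to Claude Code from the main thread."""
    if not in_codex_child_thread(env):
        sys.exit("Refusing to run: Claude Code delegation is only allowed from a Codex sub-agent.")


print(in_codex_child_thread({"CODEX_CLAUDE_CHILD_THREAD": "1"}))
print(in_codex_child_thread({}))
```

Calling `require_child_thread()` at the top of a delegation script would stop the run before any Claude Code process is started from the wrong place.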
Therefore, this library clearly defines the boundaries:
- The main Codex is responsible for planning, delegation, review, and rework arbitration.
- Codex sub-agents are the entry point for the execution layer.
- Claude Code CLI executes specific tasks.
- Ultimately, the main Codex reviews and delivers.
Each run also generates audit artifacts:
- config_<RunId>.json
- status_<RunId>.json
- prompt_<RunId>.md
- stream_<RunId>.jsonl
- trace_<RunId>.log
- claude_<RunId>.md
These artifacts solve the problem of 'can the task be traced?' How the task was dispatched, which session was used, whether it resumed, what the output was, whether the link was broken—all can be reviewed.
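Because the artifact names follow a fixed pattern per RunId, checking that a run left a complete trail is straightforward. The sketch below uses the file name patterns listed above; the helper names, the flat directory layout, and the example RunId `demo-run` are assumptions for illustration only.

```python
import tempfile
from pathlib import Path

# File name patterns as listed in the article.
ARTIFACT_PATTERNS = [
    "config_{run_id}.json",
    "status_{run_id}.json",
    "prompt_{run_id}.md",
    "stream_{run_id}.jsonl",
    "trace_{run_id}.log",
    "claude_{run_id}.md",
]


def artifact_paths(root: Path, run_id: str) -> list[Path]:
    """Expected artifact files for one run (assumes a flat artifact root)."""
    return [root / pattern.format(run_id=run_id) for pattern in ARTIFACT_PATTERNS]


def missing_artifacts(root: Path, run_id: str) -> list[str]:
    """Names of expected artifacts that are not on disk."""
    return [p.name for p in artifact_paths(root, run_id) if not p.exists()]


with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    run_id = "demo-run"  # hypothetical RunId format
    # Simulate a run that wrote everything except the trace log.
    for path in artifact_paths(root, run_id):
        if not path.name.startswith("trace_"):
            path.write_text("", encoding="utf-8")
    missing = missing_artifacts(root, run_id)

print(missing)
```

A run with a non-empty `missing` list is one whose audit trail is broken, which is exactly the situation the artifacts are meant to expose.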
Finally, validation scripts serve as a safety net, covering runtime validation, session pool validation, artifact validation, and delegate chain validation.
So, multi-sub-agent parallelism isn't a party based on intuition; it's strung together with session state, RunId, SessionKey, artifact root, and link validation.
High EQ way of saying it: An auditable, reusable, concurrent, and replayable multi-agent delegation protocol.
Low EQ way of saying it: Even if Codex is the boss, it still needs office policies and a time clock.
Suitable Scenarios
This workflow is best suited for tasks that are "not difficult individually, but heavily context-dependent overall."
For example, large-scale code reading and module organization. You can have sub-agents investigate a module, output call chains, key files, potential risks, and suggested modifications, with the main Codex only retaining the conclusions and not flooding the main context with all the noise.
For example, multi-file implementation tasks. If a requirement can naturally be broken down into several non-conflicting parts, multiple sub-agents can handle them separately. Each sub-agent provides changed files, validation commands, and risk descriptions, and the main Codex ultimately reviews and integrates them.
For example, multi-solution brainstorming. You can have multiple sub-agents propose solutions, outlining pros and cons, complexity, risks, and migration costs, then have the main Codex summarize and recommend. The key is that the main Codex doesn't directly copy any sub-agent but makes the final judgment.
For example, separating implementation and review. One sub-agent writes code, and another specializes in code review and attacking edge cases. The main Codex determines if the review is valid; if so, it sends it back to the implementation agent for modification; if not, it explains why.
Also, tasks like migration, refactoring, adding tests, and tracing call chains. These tasks may not be difficult, but they are context-heavy and prone to generating a lot of intermediate noise. Delegating them to sub-agents, with the main thread only receiving results, makes the process much cleaner.
Unsuitable Scenarios
Don't use multi-agents for everything.
If it's just changing one or two lines of code, let the main Codex handle it directly. For a small change, the benefits of opening a delegation chain may not outweigh the communication overhead.
If requirements are still changing in real-time, don't rush to throw them to sub-agents. For instance, if product boundaries are unclear, interaction details need back-and-forth confirmation, or business rules themselves are in flux, such tasks should first be stabilized in the main thread.
If file conflicts are extremely high, exercise caution with parallelism. Multiple sub-agents simultaneously modifying the same area can easily lead to stepping on each other's toes. The premise of parallelism is not "more people," but sufficiently clear task boundaries.
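One simple way to test whether task boundaries are clear enough is to compare the file sets each sub-task plans to touch before dispatching anything. This is an illustrative sketch; `scope_conflicts` and the example task names and paths are hypothetical and not part of codex_with_cc.

```python
from itertools import combinations


def scope_conflicts(tasks: dict[str, set[str]]) -> list[tuple[str, str, set[str]]]:
    """Return every pair of tasks whose planned file sets overlap."""
    conflicts = []
    for (name_a, files_a), (name_b, files_b) in combinations(tasks.items(), 2):
        shared = files_a & files_b
        if shared:
            conflicts.append((name_a, name_b, shared))
    return conflicts


tasks = {
    "api": {"src/api.py", "src/models.py"},
    "ui": {"web/app.ts"},
    "db": {"src/models.py", "migrations/001.sql"},
}
for name_a, name_b, shared in scope_conflicts(tasks):
    print(f"{name_a} and {name_b} both touch: {sorted(shared)}")
```

Here the `api` and `db` tasks both plan to edit `src/models.py`, so they should either be merged or run one after the other rather than in parallel.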
So it's not best suited for minor tweaks, but rather for dirty and tiring work like large-scale reading, multi-file implementation, multi-solution exploration, separating implementation and review, migration, refactoring, and test fixes.
The larger the task, the more likely the main thread is to explode; the more likely the main thread is to explode, the more appealing this division of labor becomes.
Conclusion
It's not that Codex can't do the work itself.
But in complex engineering, Codex's most valuable ability isn't "personally modifying a few more files," but rather maintaining global judgment: how to decompose requirements, how to delegate tasks, how to manage risks, how to verify results, where to rework, and what can be delivered.
What codex_with_cc does is separate control from execution.
The main Codex acts as the leader. Codex sub-agents take on tasks. Claude Code CLI executes specific work. DeepSeek digests large amounts of context and repetitive labor. The session reuse pool warms up the context, fingerprinting and leases manage parallelism, and audit artifacts and validation scripts secure the chain.
This is not "opening several AI chat windows."
This is equipping Codex with an execution layer that can delegate, reuse, audit, and validate.
If you've been drowned by token anxiety, context explosion, and a sea of repetitive code reading and logs in large projects, then the value of this workflow is easy to understand: don't let the main thread keep shouldering everything.
Let Codex sit in the leader's chair.
Let sub-agents read, modify, test, and find flaws in each other's work.
If results are unsatisfactory, send them back for rework; if validation fails, continue fixing; if boundaries are unclear, send back for rewriting.
Once this division of labor runs smoothly, you'll find that the real satisfaction isn't "AIs are doing more work," but that the main Codex finally doesn't have to be dragged into the mud by dirty work and can consistently maintain clear judgment, make trade-offs, and take responsibility.