run

This is the primary command for starting the optimization process. It takes several arguments to configure how Weco should optimize your code.

Weco directly modifies the file(s) specified by --source (or --sources): during optimization they are temporarily overwritten while the evaluation command runs. We strongly recommend using version control (such as Git) so you can track changes and revert if needed. At a minimum, keep a backup of your original file(s) before running the command.

When the run completes, you are asked whether to update your file(s) with the best-performing version of the code found during the run. Alternatively, pass --apply-change to the run or resume command to update the file(s) automatically.

Command Arguments

Required:

Argument	Description	Example
`-s, --source`	Path to a single source code file that will be optimized. Use exactly one of `--source` or `--sources`.	`-s model.py`
`--sources`	Paths to multiple source code files to be optimized together (up to 10 files, 200KB per file, 500KB total). Weco can make coordinated changes across file boundaries.	`--sources model.py utils.py config.py`
`-c, --eval-command`	Command to run for evaluating the code in your source file(s). This command should print the target `--metric` and its value to the terminal (stdout/stderr). See Evaluation Requirements below. Required for the default `shell` eval backend; omit when using `--eval-backend langsmith` or `langfuse`.	`-c "python eval.py"`
`-m, --metric`	The name of the metric you want to optimize (e.g., 'accuracy', 'speedup', 'loss'). This metric name does not need to match what's printed by your `--eval-command` exactly (e.g., it's okay to use "speedup" instead of "Speedup:").	`-m speedup`
`-g, --goal`	`maximize`/`max` to maximize the `--metric` or `minimize`/`min` to minimize it.	`-g maximize`
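
A pre-flight check along the following lines can catch oversized inputs before you start a multi-file run. This is only a sketch: `check_sources` is a hypothetical helper, not part of Weco, and it assumes 1 KB means 1024 bytes, which the table above does not specify.

```python
from pathlib import Path

# Limits quoted in the --sources description above.
MAX_FILES = 10
MAX_FILE_BYTES = 200 * 1024    # assumption: 1 KB = 1024 bytes
MAX_TOTAL_BYTES = 500 * 1024   # assumption: 1 KB = 1024 bytes


def check_sources(paths):
    """Return a list of human-readable problems; an empty list means OK."""
    problems = []
    if len(paths) > MAX_FILES:
        problems.append(f"{len(paths)} files given, limit is {MAX_FILES}")
    total = 0
    for p in paths:
        size = Path(p).stat().st_size
        total += size
        if size > MAX_FILE_BYTES:
            problems.append(f"{p}: {size} bytes exceeds the per-file limit")
    if total > MAX_TOTAL_BYTES:
        problems.append(f"total size {total} bytes exceeds the combined limit")
    return problems
```

For example, `check_sources(["model.py", "utils.py", "config.py"])` returns an empty list when all three files are within the limits.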

Optional:

Argument	Description	Default	Example
`-n, --steps`	Number of optimization steps (LLM iterations) to run.	100	`-n 50`
`-M, --model`	Model identifier for the LLM to use (e.g., `gpt-5.2`, `claude-opus-4-5`, `gemini-3.1-pro-preview`). See Supported Models for the complete list of available models.	`gemini-3-flash-preview` (with `--api-key`: your key's provider default)	`-M gpt-5.2`
`-i, --additional-instructions`	Natural language description of specific instructions or path to a file containing detailed instructions to guide the LLM. Supported file formats include `.txt`, `.md`, and `.rst`.	`None`	`-i instructions.md` or `-i "Optimize the model for faster inference"`
`-l, --log-dir`	Path to the directory to log intermediate steps and final optimization result.	`.runs/`	`-l ./logs/`
`--save-logs`	Save execution output for each step to `.runs/<run-id>/outputs/step_<n>.out.txt` with a JSONL index file for tracking.	`False`	`--save-logs`
`--eval-timeout`	Timeout in seconds for each evaluation step. If the timeout is reached, the evaluation step fails and the optimization proceeds to the next step. No timeout by default.	`None`	`--eval-timeout 3600`
`--apply-change`	Automatically apply the best solution to the source file(s) without prompting.	`False`	`--apply-change`
`--require-review`	Require manual review and approval of each proposed change before it is evaluated. See Review Mode.	`False`	`--require-review`
`--eval-backend`	Evaluation backend. `shell` (default) runs `--eval-command` directly; `langsmith` and `langfuse` evaluate against datasets — see LangSmith and LangFuse.	`shell`	`--eval-backend langsmith`
`--output`	Output mode: `rich` for the interactive terminal UI, `plain` for machine-readable output suitable for LLM agents.	`rich`	`--output plain`
`--no-auto-resume`	Disable automatic reconnection/resume on transient network errors.	`False`	`--no-auto-resume`
`--auto-resume-max-attempts`	Maximum auto-resume attempts before giving up and printing the manual resume command.	`5`	`--auto-resume-max-attempts 10`
`--no-open`	Don't auto-open the run's dashboard URL in a browser tab.	`False`	`--no-open`
`--daemon`	Detach the run after creating it: print the run id, exit, and leave the eval loop running in the background (stdout/stderr go to `/tmp/weco-run-<run-id>.log`). Inspect or stop it later with the run subcommands.	`False`	`--daemon`
`--api-key`	API keys for LLM providers in the format `provider=key`. You can specify multiple providers by separating them with spaces. Only available to authenticated users.	`None`	`--api-key openai=your-openai-key gemini=your-gemini-key`
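
When launching runs from a script, it can help to assemble the argument list in one place. The sketch below uses only flags listed in the tables above. `build_run_command` is a hypothetical helper, and the file names are the placeholders used in the examples.

```python
import shlex


def build_run_command(source, eval_command, metric, goal, steps=None,
                      model=None, eval_timeout=None, apply_change=False):
    """Assemble a `weco run` argument list from the documented flags."""
    if goal not in {"maximize", "max", "minimize", "min"}:
        raise ValueError(f"unsupported goal: {goal!r}")
    cmd = ["weco", "run", "--source", source,
           "--eval-command", eval_command,
           "--metric", metric, "--goal", goal]
    if steps is not None:
        cmd += ["--steps", str(steps)]
    if model is not None:
        cmd += ["--model", model]
    if eval_timeout is not None:
        cmd += ["--eval-timeout", str(eval_timeout)]
    if apply_change:
        cmd.append("--apply-change")
    return cmd


cmd = build_run_command("model.py", "python eval.py", "speedup", "maximize",
                        steps=50, eval_timeout=3600)
print(shlex.join(cmd))  # shell-quoted form, handy for logging
```

The returned list can be passed to `subprocess.run(cmd, check=True)` as-is. Keeping it as a list avoids shell quoting problems with the evaluation command.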

Evaluation Requirements

The command specified by --eval-command is crucial for the optimization process. It must:

Execute the modified code from your source file(s)
Assess its performance
Print the metric you specified with --metric along with its numerical value to the terminal

For example, if you set --metric speedup, your evaluation script should output a line like:

speedup: 1.5

Weco will parse this output to extract the numerical value (1.5 in this case) associated with the metric name ('speedup').
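
To check locally that your script emits a line of this shape, something like the following can be run against captured output. It is a sanity check under assumptions, not Weco's parser: `find_metric` is a hypothetical helper, and the accepted separators (`:` or `=`) and case-insensitive match are choices made for this sketch.

```python
import re


def find_metric(output, metric):
    """Return the last numeric value printed next to `metric`, or None.

    Illustrative only: a local sanity check, not Weco's actual parser.
    """
    pattern = re.compile(
        rf"{re.escape(metric)}\s*[:=]\s*([-+]?\d+(?:\.\d+)?(?:[eE][-+]?\d+)?)",
        re.IGNORECASE,
    )
    matches = pattern.findall(output)
    return float(matches[-1]) if matches else None
```

If `find_metric(captured_output, "speedup")` returns `None`, the evaluation command is probably not printing the metric in a recognizable form.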

For detailed guidance on creating effective evaluation scripts, see Writing Good Evaluation Scripts.
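
As a rough illustration of the three requirements above, a minimal evaluation script might look like the sketch below. The functions `baseline` and `candidate` are hypothetical stand-ins defined inline so the example is self-contained. In a real script, `candidate` would be imported from the file passed to --source.

```python
import time


def baseline(n):
    """Stand-in for the original implementation (hypothetical)."""
    total = 0
    for i in range(n):
        total += i * i
    return total


def candidate(n):
    """Stand-in for the code being optimized (hypothetical)."""
    return sum(i * i for i in range(n))


def best_time(fn, n, repeats=5):
    """Best-of-`repeats` wall-clock time for fn(n), in seconds."""
    times = []
    for _ in range(repeats):
        start = time.perf_counter()
        fn(n)
        times.append(time.perf_counter() - start)
    return min(times)


def main(n=100_000):
    # Check correctness before reporting a speed metric.
    assert candidate(n) == baseline(n), "candidate output differs from baseline"
    speedup = best_time(baseline, n) / best_time(candidate, n)
    line = f"speedup: {speedup:.3f}"
    print(line)  # the metric name and value on one line
    return line


if __name__ == "__main__":
    main()
```

Checking correctness before timing means a faster but wrong candidate fails the evaluation instead of reporting a misleading speedup.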

resume

If your optimization run is interrupted (network issues, restart, etc.), resume from the most recent node:

# Resume an interrupted run
weco resume <run_id>

# For example
weco resume 0002e071-1b67-411f-a514-36947f0c4b31

# Resume a completed run with 50 more steps
weco resume <run_id> --steps 50

# Automatically apply the best solution without prompting
weco resume <run_id> --apply-change

# Resume with custom API keys
weco resume <run_id> --api-key openai=your-openai-key gemini=your-gemini-key

The model, log directory, save-logs behavior, and evaluation timeout are all inherited from the original run.

Resume Command Options

Argument	Description	Default
`run_id`	The UUID of the run to resume (required).	-
`-n, --steps`	Run this many more evaluations from the last node. Required when resuming a completed run; optional for terminated/error runs (omit to keep the original budget).	`None`
`--apply-change`	Automatically apply the best solution to the source file(s) without prompting.	`False`
`--output`	Output mode: `rich` for the interactive terminal UI, `plain` for machine-readable output suitable for LLM agents.	`rich`
`--no-auto-resume`	Disable automatic reconnection/resume on transient network errors.	`False`
`--auto-resume-max-attempts`	Maximum auto-resume attempts before giving up and printing the manual resume command.	`5`
`--no-open`	Don't auto-open the run's dashboard URL in a browser tab.	`False`
`--daemon`	Detach the resumed run: print the run id, exit, and leave the eval loop running in the background (stdout/stderr go to `/tmp/weco-run-<run-id>.log`).	`False`
`--api-key`	API keys for LLM providers in the format `provider=key`. You can specify multiple providers by separating them with spaces. Only available to authenticated users.	`None`
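
The rule for --steps in the table above can be encoded in a small wrapper when resuming runs from a script. This is a sketch under assumptions: `build_resume_command` is a hypothetical helper, and the `status` strings are labels chosen here to mirror the wording of the --steps row, not values returned by Weco.

```python
def build_resume_command(run_id, status, steps=None, apply_change=False):
    """Assemble a `weco resume` argument list.

    `status` is a hypothetical label for the run's state: "completed",
    "terminated" or "error", following the --steps description.
    """
    if status == "completed" and steps is None:
        raise ValueError("--steps is required when resuming a completed run")
    cmd = ["weco", "resume", run_id]
    if steps is not None:
        cmd += ["--steps", str(steps)]
    if apply_change:
        cmd.append("--apply-change")
    return cmd
```

For a terminated or errored run, omitting `steps` keeps the original budget, so the command reduces to `weco resume <run_id>`.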

derive

A derived run branches off an existing run at a specific step. The code and metric value from that step become the baseline (step 0) of the new run — no re-evaluation, no wasted compute — and a fresh optimization loop continues from there with new LLM context.

Use this to steer the optimizer in a new direction (e.g. "now focus on memory, not speed"), add constraints partway through, explore an alternative branch from a known-good solution, or extend a completed run with more steps. The parent run is stopped by default so you don't pay for two loops at once.

All runs created by derive share a lineage with their parent — the original (non-derived) run is the lineage root, and every derive from it (or from any of its descendants) belongs to the same lineage. The dashboard shows the full tree.
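
The lineage rule can be pictured as following parent links until a run with no parent is reached. The sketch below is a toy model with made-up run ids. `lineage_root` and the `parent_of` mapping are hypothetical and say nothing about how Weco stores lineages.

```python
def lineage_root(run_id, parent_of):
    """Follow parent links until reaching a run with no parent."""
    seen = set()
    while parent_of.get(run_id) is not None:
        if run_id in seen:
            raise ValueError("cycle in parent links")
        seen.add(run_id)
        run_id = parent_of[run_id]
    return run_id


# Hypothetical run ids: B and C derive from A, and D derives from C.
parent_of = {"A": None, "B": "A", "C": "A", "D": "C"}
roots = {run: lineage_root(run, parent_of) for run in parent_of}
print(roots)  # {'A': 'A', 'B': 'A', 'C': 'A', 'D': 'A'}
```

All four runs resolve to the same root, so they belong to one lineage even though D was derived from a derived run.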

# Derive from the best step seen anywhere in the lineage (default)
weco run derive <run_id>

# Derive from the best step in the specified run only
weco run derive <run_id> --from-step run-best

# Derive from a specific step number
weco run derive <run_id> --from-step 7

# Add steering instructions
weco run derive <run_id> -i "Focus on memory efficiency instead of speed"

# Override the step budget for the derived run
weco run derive <run_id> -n 50

Derive Command Options

Argument	Description	Default
`run_id`	UUID of the parent run to derive from (required).	-
`--from-step`	Where to branch from: `best` (lineage-best), `run-best` (best in this run), a step number, or a node UUID.	`best`
`-n, --steps`	Override the step count for the derived run. If omitted, the parent's step count is inherited.	`None`
`-i, --additional-instructions`	Steering instructions for the derived run (inline text or path to a `.txt`/`.md`/`.rst` file). If omitted, the parent run's instructions are inherited.	`None`
`--api-key`	API keys for LLM providers in the format `provider=key`. Separate multiple providers with spaces.	`None`
`--output`	Output mode: `rich` for the interactive UI, `plain` for machine-readable JSON output (useful for LLM agents).	`rich`
`--no-open`	Don't auto-open the derived run in a browser tab.	`False`
`--daemon`	Detach the derived run: print the run id, exit, and leave the eval loop running in the background (stdout/stderr go to `/tmp/weco-run-<run-id>.log`).	`False`
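
The four accepted forms of --from-step can be told apart as in the sketch below, which may be useful for validating input in a wrapper script. `classify_from_step` is a hypothetical helper, and the returned labels are chosen for illustration only.

```python
import uuid


def classify_from_step(value="best"):
    """Classify a --from-step value into one of the four documented forms."""
    if value == "best":
        return ("lineage-best", None)
    if value == "run-best":
        return ("run-best", None)
    if value.isascii() and value.isdigit():
        return ("step", int(value))
    try:
        # Anything else must parse as a node UUID.
        return ("node", str(uuid.UUID(value)))
    except ValueError:
        raise ValueError(f"not a valid --from-step value: {value!r}") from None
```

For instance, `classify_from_step("7")` yields `("step", 7)`, while a string that is neither a keyword, a number, nor a UUID raises `ValueError`.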

Keep the source files in your working directory consistent with the parent run before deriving. The derived loop uses your local files as the baseline if they exist — any local edits will override the inherited code from the source step. In rich mode you'll be prompted to confirm before the loop starts.

Review mode

Pass --require-review to weco run to approve every proposed change before it is evaluated. Each candidate solution waits in a pending-approval state until you (or your agent) inspect it, optionally rewrite it, and submit it for evaluation — only then does the optimizer propose the next step.

# Start a run that waits for your approval at each step
weco run --source model.py \
     --eval-command "python eval.py" \
     --metric accuracy \
     --goal maximize \
     --require-review

# From another terminal: list nodes awaiting approval (includes the proposed code)
weco run review <run_id>

# Optionally replace a pending node's code with your own edit
weco run revise <run_id> --node <node_id> --source model.py

# Approve: evaluate the pending node and unlock the next proposal
weco run submit <run_id> --node <node_id>

Command	Description	Key options
`weco run review`	List nodes awaiting action (pending approval or evaluation), including their plan and proposed code.	—
`weco run revise`	Replace a pending node's code with your own revision before it is evaluated.	`--node <id>` (required), `--source <file>` or `--sources <files...>`
`weco run submit`	Submit a pending node: the evaluation runs locally and the result unlocks the optimizer's next proposal.	`--node <id>` (required), `--source`/`--sources` (optional — creates a revision first), `-c, --eval-command` (override the stored eval command)

weco run status shows the same pending list without code, and weco run diff is handy for inspecting a pending change against the baseline.

Source path mapping. When passing --source/--sources to revise or submit, you can map a local file to one of the run's source paths with target_path=local_path syntax (e.g. --source module.py=./my_version.py). Without an explicit mapping, files are matched positionally to the run's original source paths.
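
The two matching modes can be sketched as follows. This is an illustration, not Weco's implementation: `resolve_sources` is a hypothetical helper, and its handling of a mix of mapped and unmapped arguments (unmapped files fill the remaining run paths in order) is an assumption the text above does not cover.

```python
def resolve_sources(args, run_paths):
    """Map each of the run's source paths to a local file.

    `args` are the values given to --source/--sources; `run_paths` are
    the run's original source paths, in order.
    """
    mapping = {}
    positional = []
    for arg in args:
        if "=" in arg:
            # Explicit target_path=local_path form.
            target, local = arg.split("=", 1)
            if target not in run_paths:
                raise ValueError(f"{target!r} is not one of the run's source paths")
            mapping[target] = local
        else:
            positional.append(arg)
    # Assumption: unmapped files fill the remaining run paths in order.
    remaining = [p for p in run_paths if p not in mapping]
    if len(positional) > len(remaining):
        raise ValueError("more files given than source paths in the run")
    mapping.update(zip(remaining, positional))
    return mapping
```

With `run_paths = ["module.py"]`, the argument `module.py=./my_version.py` maps `module.py` to `./my_version.py`. A bare `./my_version.py` gives the same result through positional matching.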
