Robust and safe integration of LLM-generated code into autonomous software development pipelines, specifically addressing string formatting vulnerabilities.
GitHub Issue
Mar 23, 2026
## Description
The pipeline crashes in the `CODE_GENERATION` stage due to unsafe usage of Python `.format()` on a prompt string that already contains LLM-generated content with curly braces `{}`.
This results in a `KeyError` when `.format()` attempts to interpret parts of the generated code (e.g., dictionary keys) as format placeholders.
## Reproduction
Run the pipeline with a topic that leads to code generation involving Python dictionaries, for example:
```
researchclaw run \
  --config config.arc.yaml \
  --topic "Reinforcement learning with generative world models" \
  --auto-approve
```
Observed failure (reproducible across runs):
```
Stage CODE_GENERATION failed
KeyError: "\n 11 | 'learning_rate'"
```
and in another run:
```
KeyError: "\n 11 | 'learning_rate_quantum'"
```
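The failure can likely be reproduced without running the full pipeline. The sketch below is a hypothetical stand-in, not the code from `code_agent.py`: `build_prompt_buggy`, the prompt wording, and `llm_code` are assumptions made for illustration. It only mirrors the pattern of inlining generated code and then calling `.format()` on the result.

```python
def build_prompt_buggy(code: str, target_file: str) -> str:
    """Hypothetical stand-in for the failing prompt construction."""
    prompt = (
        # f-string segment: the generated code is inlined here.
        f"Current code:\n{code}\n\n"
        # Plain literal segment: carries a str.format() placeholder.
        "Output the COMPLETE fixed `{target_file}`."
    ).format(target_file=target_file)  # re-parses the inlined code as well
    return prompt


# Assumed example of LLM output: a Python dict literal spanning several lines.
llm_code = "config = {\n    'learning_rate': 1e-3,\n}"

failed_key = None
try:
    build_prompt_buggy(llm_code, "main.py")
except KeyError as exc:
    # The text between "{" and ":" is treated as a keyword-argument name.
    failed_key = exc.args[0]
    print("KeyError:", repr(failed_key))
```

The missing key is the text that follows the opening brace of the dict literal, including the newline, which matches the shape of the keys in the logs above.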
## Root Cause
In `researchclaw/pipeline/code_agent.py`, function `_targeted_file_repair`:
````python
prompt = (
    f"..."
    f"python\n{code}\n\n\n"
    "Output the COMPLETE fixed `{target_file}` in "
    "filename:{target_file}` format..."
# Bug: .format() parses the whole string, including the already inlined `code`.
).format(target_file=target_file)
````
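The first segments are f-strings, which safely inline the LLM output, but the last two segments are plain literals whose `{target_file}` placeholder is filled by calling `.format()` on the entire concatenated string.
By that point the generated content is already part of the string, so any curly braces in `code`, `error_msg`, or other inserted content (very common in Python code) are parsed as replacement fields. A dictionary literal produces the `KeyError` shown above; other brace patterns can raise `IndexError` (an empty `{}`) or `ValueError` (an unmatched `{`) instead.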
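One possible fix, sketched under assumptions, is to make every segment an f-string and drop the `.format()` call, so the string is never parsed a second time. The function name `build_repair_prompt_fstring` and the prompt wording are hypothetical; the real `_targeted_file_repair` contains more context that is elided in this issue.

```python
def build_repair_prompt_fstring(code: str, target_file: str) -> str:
    """Hypothetical rewrite: f-strings only, no second formatting pass."""
    return (
        f"Current code:\n{code}\n\n"
        # This segment is now an f-string too, so no .format() call is needed.
        f"Output the COMPLETE fixed `{target_file}`."
    )


# Generated code with braces passes through unchanged.
prompt = build_repair_prompt_fstring("config = {'learning_rate': 1e-3}", "main.py")
print(prompt)
```

Values interpolated by an f-string are inserted as-is and are not scanned for further replacement fields, which is why braces in `code` are harmless here.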
## Expected Behavior
The repair loop should not crash when LLM-generated code contains curly b...
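A regression check for this behavior could look like the sketch below. It assumes a different, also hypothetical, fix: a static template in which the generated code is passed as an argument to `.format()` rather than inlined beforehand. `REPAIR_TEMPLATE`, `build_repair_prompt`, `TRICKY_SNIPPETS`, and `check_brace_safety` are illustrative names, not part of the repository.

```python
# Static template: contains only placeholders the author controls.
REPAIR_TEMPLATE = (
    "Current code:\n{code}\n\n"
    "Output the COMPLETE fixed `{target_file}`."
)


def build_repair_prompt(code: str, target_file: str) -> str:
    # Argument values are inserted verbatim; str.format() does not re-parse them.
    return REPAIR_TEMPLATE.format(code=code, target_file=target_file)


# Assumed samples of generated code with brace patterns that break a second
# formatting pass: a dict literal, empty braces, format-style fields, and an
# unmatched opening brace.
TRICKY_SNIPPETS = [
    "config = {'learning_rate': 1e-3}",
    "empty = {}",
    "print('{0} {name}')",
    "def f():\n    return {",
]


def check_brace_safety() -> int:
    """Build a prompt for each snippet and verify it survives unchanged."""
    checked = 0
    for snippet in TRICKY_SNIPPETS:
        prompt = build_repair_prompt(snippet, "main.py")
        assert snippet in prompt
        assert "`main.py`" in prompt
        checked += 1
    return checked


print(check_brace_safety())
```

Keeping the template static has the side benefit that the set of placeholders is fixed at authoring time, so generated content can never introduce a new one.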
Source repository: aiming-lab/AutoResearchClaw