An Open-Source Self-Evolving Stock Strategy Research Agent
Watch strategies get diagnosed, mutated, re-tested, and pruned like research assets
AlphaEvo is a self-evolving stock strategy research agent. It turns a readable YAML strategy into a research loop: backtest, diagnose failure, propose a controlled mutation, re-test the new version, and keep the full evidence trail.
Bundled frozen yfinance snapshot, US tech basket, 2025-02-11 to 2026-04-10:
| Strategy | Signals | Win Rate | Avg Return | Max DD | Score |
|---|---|---|---|---|---|
| rsi_reversion_v1 baseline | 0 | 0.0% | 0.00% | 0.0% | 8.1% |
| rsi_reversion_v7 champion | 37 | 48.6% | 2.94% | 23.6% | 68.7% |
What happened: the deterministic research committee flagged an over-confirmed entry stack, unlocked enough signals, widened the stop, extended holding, raised the payoff target, switched support context to MA60, then tightened volume confirmation. Each change was accepted only after retest. The champion keeps train-val and val-test gaps at 7.1% / 7.1% on this snapshot. See the generated report: showcase_rsi_reversion_real_snapshot.md.
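The "accepted only after retest" rule can be sketched as a greedy loop over candidate mutations. This is a minimal illustration, not AlphaEvo's actual implementation: `run_chain`, the `Mutation` type, and the `score` callable are hypothetical names, and the strategy is assumed to be a plain dict parsed from YAML.

```python
from copy import deepcopy
from typing import Callable, Iterable

Strategy = dict                          # assumption: a strategy parsed from YAML
Mutation = Callable[[Strategy], None]    # assumption: edits a strategy in place


def run_chain(
    baseline: Strategy,
    mutations: Iterable[tuple[str, Mutation]],
    score: Callable[[Strategy], float],
) -> tuple[Strategy, list[tuple[str, float, bool]]]:
    """Apply mutations one at a time, keeping each only if the retest improves."""
    champion = deepcopy(baseline)
    best = score(champion)
    history = [("baseline", best, True)]
    for name, mutate in mutations:
        candidate = deepcopy(champion)   # never edit the current champion directly
        mutate(candidate)
        candidate_score = score(candidate)   # the retest
        accepted = candidate_score > best
        history.append((name, candidate_score, accepted))
        if accepted:
            champion, best = candidate, candidate_score
    return champion, history
```

Rejected candidates stay in `history`, which mirrors the idea of keeping the full evidence trail rather than only the winning versions.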
Live yfinance data, configured external LLM provider, alphaevo evolve rsi_reversion_v1 --method llm, April 10, 2026:
| Strategy | Signals | Win Rate | Avg Return | Score |
|---|---|---|---|---|
| rsi_reversion_v1 baseline | 0 | 0.0% | 0.00% | 8.1% |
| rsi_reversion_v3 LLM champion | 498 | 52.6% | 1.22% | 56.3% |
Different data windows and protocols should not be ranked directly; the snapshot showcase is the stable no-key first run, while this path demonstrates the LLM research loop on live market data.
- Backtesting and evaluation: run strategies on real data with sampling, multi-metric scoring, and anti-overfitting checks.
- LLM-guided strategy evolution: diagnose failure modes, propose targeted mutations, and re-test whether the new version is actually better.
- Deterministic research committee: technical, risk, overfit, data-quality, and mutation-planning verdicts without requiring an API key.
- Traceable research workflow: keep reports, LLM evidence, evolution trees, and trajectory exports for every iteration.
git clone https://github.com/ZhuLinsen/alphaevo.git
cd alphaevo
pip install -e .
alphaevo showcase
This runs a real historical-data showcase from a bundled yfinance snapshot:
Showcase Chain: baseline + up to 6 validated mutations
Round 1 │ rsi_reversion_v1 │ Signals: 0 │ Score: 8.1%
Committee: entry stack is too strict
Mutation: entry.logic and -> or
Round 2 │ rsi_reversion_v2 │ Signals: 79 │ Score: 17.8%
Mutation: exit.stop_loss.value 0.05 -> 0.08
Round 7 │ rsi_reversion_v7 │ Signals: 37 │ Score: 68.7% 🏆
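The first mutation in the chain above, entry.logic and -> or, is easier to see with a small evaluator. The sketch below is an assumption about how trigger lists might be combined, not the engine's real code: `entry_fires` and the indicator name `rsi_14` are hypothetical, and only the `>` and `==` operators are confirmed by the strategy YAML in this README.

```python
import operator

# Only ">" and "==" appear in the README's YAML; the rest are assumed.
OPS = {
    ">": operator.gt,
    "<": operator.lt,
    ">=": operator.ge,
    "<=": operator.le,
    "==": operator.eq,
}


def entry_fires(entry: dict, bar: dict) -> bool:
    """Evaluate every trigger against one bar of indicator values."""
    results = [
        OPS[cond["op"]](bar[cond["indicator"]], cond["value"])
        for cond in entry["triggers"]
    ]
    # "and" needs every trigger to pass; "or" needs at least one.
    combine = all if entry["logic"] == "and" else any
    return combine(results)


strict = {
    "logic": "and",
    "triggers": [
        {"indicator": "rsi_14", "op": "<", "value": 30},          # hypothetical name
        {"indicator": "volume_ratio", "op": ">", "value": 1.3},
    ],
}
relaxed = {**strict, "logic": "or"}
bar = {"rsi_14": 28.0, "volume_ratio": 1.1}   # oversold, but no volume surge

print(entry_fires(strict, bar), entry_fires(relaxed, bar))
```

A bar that satisfies only one condition never fires under `and`, which is one plausible way an over-confirmed entry stack ends up with zero signals.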
| Goal | Command | Data | LLM |
|---|---|---|---|
| Real-data showcase | alphaevo showcase | Bundled yfinance snapshot | No |
| Quick first-run demo | alphaevo demo | Bundled yfinance snapshot | No |
| Synthetic dev smoke test | alphaevo demo --synthetic | Synthetic | No |
| Turn a plain-language idea into executable YAML | alphaevo strategy draft "<idea>" --save | None | No |
| Draft, backtest, and optimize a plain-language idea | alphaevo strategy research "<idea>" | Real data | No |
| Revise an existing strategy and validate it | alphaevo strategy improve <id> "<change request>" | Real data | No |
| Live market data smoke test | alphaevo showcase --live or alphaevo demo --real | Live yfinance / akshare | No |
| Fuller real-data backtest | alphaevo run ma_crossover_v1 | Live yfinance | No |
| Test a breakout/volatility-compression template | alphaevo run volatility_compression_breakout_v1 | Live yfinance | No |
| Optimize entry thresholds and exit/risk rules | alphaevo optimize <id> --spaces entry,params,indicator,exit,stoploss,takeprofit,holding | Real data | No |
| Balance win rate and payoff quality | alphaevo optimize <id> --objective quality --min-win-rate 0.5 --min-avg-return 0 --min-profit-loss-ratio 1.0 --max-drawdown 0.35 --min-signals 30 --param-max-changes 2 --max-values-per-param 8 --evaluation-mode fast --full-eval-top 5 | Real data | No |
| Push for higher return quality | alphaevo optimize <id> --objective profit_quality --min-win-rate 0.5 --min-avg-return 0.006 --min-total-return 0.18 --min-profit-loss-ratio 1.1 --joint-top 4 --parallel-workers 4 | Real data | No |
| Reject overfit-looking candidates | alphaevo optimize <id> --objective robust_profit_quality --evaluation-mode fast --full-eval-top 8 --reject-overfit --max-train-val-gap 0.12 --max-val-test-gap 0.10 --max-walk-forward-gap 0.12 --min-walk-forward-pass-rate 0.5 | Real data | No |
| Jointly refine entry and exits | alphaevo optimize <id> --spaces all --objective quality --joint-top 3 --joint-candidates-per-seed 64 | Real data | No |
| Flagship research-agent path | alphaevo evolve <id> --method llm --output reports/ | Real data | Yes |
Real-data commands need a data adapter extra: run pip install -e ".[data-yfinance]" for the default US workflow, pip install -e ".[data-akshare]" for A-shares, or pip install -e ".[data-full]" for both.
alphaevo showcase is the stable real-data first run: it uses the bundled yfinance snapshot and writes a shareable report. alphaevo showcase --live tries live yfinance first and falls back to the snapshot if the provider is unavailable. If you want a stronger first backtest with more symbols on the default yfinance adapter, start with alphaevo run ma_crossover_v1.
--objective robust_profit_quality ranks return quality with an additional
stability score. Robust optimization gates such as --reject-overfit,
--max-train-val-gap, and --max-walk-forward-gap require full candidate
metrics. In fast searches, keep --full-eval-top high enough for the leading
candidates you want to judge.
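To make the gate semantics concrete, here is a minimal sketch of how such thresholds could be applied to one candidate. It is not the optimizer's actual code: `CandidateMetrics`, `Gates`, and `failed_gates` are hypothetical names, the defaults simply copy the flag values from the command table, and treating a missing metric as a failure is an assumption about why fast-mode candidates need full evaluation.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class CandidateMetrics:
    # None means the metric was never computed (e.g. a fast-mode candidate).
    train_val_gap: Optional[float] = None
    val_test_gap: Optional[float] = None
    walk_forward_gap: Optional[float] = None
    walk_forward_pass_rate: Optional[float] = None


@dataclass
class Gates:
    max_train_val_gap: float = 0.12
    max_val_test_gap: float = 0.10
    max_walk_forward_gap: float = 0.12
    min_walk_forward_pass_rate: float = 0.5


def failed_gates(metrics: CandidateMetrics, gates: Gates) -> list[str]:
    """Return a human-readable reason for every gate the candidate fails."""
    upper_bounds = {
        "train_val_gap": gates.max_train_val_gap,
        "val_test_gap": gates.max_val_test_gap,
        "walk_forward_gap": gates.max_walk_forward_gap,
    }
    failures = []
    for name, limit in upper_bounds.items():
        value = getattr(metrics, name)
        if value is None:
            failures.append(f"{name}: not evaluated")
        elif value > limit:
            failures.append(f"{name}: {value:.3f} exceeds {limit:.3f}")
    rate = metrics.walk_forward_pass_rate
    if rate is None:
        failures.append("walk_forward_pass_rate: not evaluated")
    elif rate < gates.min_walk_forward_pass_rate:
        failures.append(
            f"walk_forward_pass_rate: {rate:.3f} below "
            f"{gates.min_walk_forward_pass_rate:.3f}"
        )
    return failures
```

Under these assumed limits, a candidate with a train-validation gap of 0.189 would fail the 0.12 gate, while one with gaps of 0.071 would pass as long as its walk-forward metrics were also computed and within bounds.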
- Flagship LLM proof path: rsi_reversion_v1 and ma_crossover_v1 on real data.
- Most reliable strategy families right now: trend + reversal strategies whose core signals come from OHLCV / benchmark context.
- Return-oriented trend template: volatility_compression_breakout_v1 uses explicit prior-high breakout triggers, range-position guards, volatility-compression filters, and trailing take profit.
- Experimental families: event + rotation strategies still rely partly on proxy context for news / sector-flow semantics, so treat them as research previews rather than the main launch proof.
# From the cloned repo root, add LLM + default real-data support
pip install -e ".[llm,data-yfinance]"
# Set your LLM API key
export ALPHAEVO_API_KEY=your_api_key
export ALPHAEVO_LLM_MODEL=gemini/gemini-2.0-flash # or openai/gpt-4o, etc.
# Run a real-data research loop on the default yfinance adapter
alphaevo run ma_crossover_v1
# Run the flagship LLM research path
alphaevo evolve rsi_reversion_v1 --method llm --rounds 3 --output reports/rsi_evolve/

If you also want the built-in A-share workflow, run pip install -e ".[llm,data-full]" or add pip install -e ".[data-akshare]", then use alphaevo run trend_pullback_rebound_v1 --adapter akshare.
┌─────────────────────────────────────────────────────────┐
│ CLI (Typer + Rich) │
└────────────────────────┬────────────────────────────────┘
│
┌────────────────────────▼────────────────────────────────┐
│ Orchestrator (Pipeline) │
│ generate → sample → backtest → evaluate → reflect │
│ → evolve → leaderboard │
└──┬────────┬────────┬────────┬────────┬────────┬─────────┘
│ │ │ │ │ │
┌──▼──┐ ┌──▼──┐ ┌───▼──┐ ┌──▼──┐ ┌───▼──┐ ┌──▼────────┐
│Data │ │Stgy │ │Sample│ │Back │ │Eval │ │Reflection │
│Layer│ │Layer│ │Layer │ │test │ │Layer │ │Layer │
└─────┘ └─────┘ └──────┘ └─────┘ └──────┘ └───────────┘
| Layer | Purpose |
|---|---|
| Data Layer | Multi-source market data (yfinance, akshare, or daily_stock_analysis plugin) |
| Strategy Layer | Dual-representation: human-readable + executable YAML DSL |
| Sampler Layer | Smart sampling by market regime, style, and strategy scope |
| Backtest Engine | Signal-level simulation with proper slippage and fees |
| Evaluator Layer | Multi-dimensional metrics + anti-overfitting checks |
| Reflection Layer | LLM failure attribution + LLM-first evolution with optional param-search fallback |
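The stage order in the orchestrator diagram can be read as a fixed sequence of functions that pass a shared context along. The sketch below only illustrates that shape under stated assumptions: `run_pipeline`, the `Stage` type, and the `trace` key are hypothetical and do not describe the real orchestrator's interfaces.

```python
from typing import Callable

Context = dict
Stage = Callable[[Context], Context]   # assumption: each stage maps context to context

# Order taken from the orchestrator box in the diagram.
STAGE_ORDER = [
    "generate", "sample", "backtest", "evaluate",
    "reflect", "evolve", "leaderboard",
]


def run_pipeline(stages: dict[str, Stage], context: Context) -> Context:
    """Run every stage in diagram order and record which ones executed."""
    missing = [name for name in STAGE_ORDER if name not in stages]
    if missing:
        raise KeyError(f"missing stages: {missing}")
    context = dict(context)            # leave the caller's dict untouched
    for name in STAGE_ORDER:
        context = stages[name](context)
        context.setdefault("trace", []).append(name)
    return context
```

Keeping the order in one list makes it obvious that evaluation always precedes reflection, so a diagnosis is never produced from metrics that have not been computed yet.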
Round 1: rsi_reversion_v1 — live yfinance data, LLM-only evolve
→ 0 signals
→ Confidence: 8.1% ❌ non-functional
LLM diagnosis: "Entry conditions are contradictory and too strict."
Changes: entry.logic and→or, RSI 30→35
Round 2: rsi_reversion_v2 — strategy becomes tradable
→ 522 signals
→ Win rate: 52.7%, Avg return: +0.96%
→ Confidence: 39.2%
LLM diagnosis: "The OR logic over-corrected and now admits noisy entries."
Changes: RSI 35→32, volume_ratio 1.3→1.15, stop_loss pct→atr
Round 3: rsi_reversion_v3 — champion
→ 498 signals
→ Win rate: 52.6%, Avg return: +1.22%
→ Confidence: 56.3% (+48.2pp from start)
These are real results from an April 10, 2026 run using live yfinance data and a configured external LLM provider with
--method llm. Not synthetic, not heuristic fallback.
Strategies are stored as human-readable YAML, so the LLM can explain and mutate them without hiding the logic in code.
meta:
id: trend_pullback_rebound_v3
  name: Strong-Trend Pullback with High-Volume Engulfing Rebound
version: 3
category: trend
entry:
logic: and
triggers:
- indicator: relative_strength_20d
op: ">"
value: 0.12
guards:
- indicator: ma5_above_ma10
op: "=="
value: true
exit:
triggers:
- indicator: close_below_ma10
op: "=="
value: true
stop_loss:
type: atr
atr_period: 21
multiplier: 2.0
take_profit:
type: rr
value: 2.0
params:
tunable:
- target: entry.triggers[indicator=relative_strength_20d].value
range: [0.05, 0.20]
      step: 0.01

See technical design for the full DSL.
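A tunable entry names a target path, a range, and a step. The sketch below shows one way such a path could be resolved against a parsed strategy and how the candidate values could be enumerated. It is an assumption-laden illustration: `set_target` and `grid` are hypothetical helpers, the path grammar is inferred only from the single example above, and the strategy is a hand-written dict rather than a loaded YAML file.

```python
import re
from decimal import Decimal

# One path segment: a key, optionally followed by [field=value].
_SEGMENT = re.compile(r"^(\w+)(?:\[(\w+)=([^\]]+)\])?$")


def _walk(node, segment: str):
    match = _SEGMENT.match(segment)
    if match is None:
        raise ValueError(f"bad path segment: {segment!r}")
    key, filter_key, filter_value = match.groups()
    child = node[key]
    if filter_key is None:
        return child
    for item in child:                 # pick the list item matching the filter
        if str(item.get(filter_key)) == filter_value:
            return item
    raise KeyError(f"no item with {filter_key}={filter_value} under {key}")


def set_target(strategy: dict, target: str, value) -> None:
    """Set the field named by a dotted target path (filter values must not contain dots)."""
    *parents, leaf = target.split(".")
    node = strategy
    for segment in parents:
        node = _walk(node, segment)
    node[leaf] = value


def grid(low: float, high: float, step: float) -> list[float]:
    """Enumerate low..high inclusive, using Decimal to avoid float drift."""
    lo, hi, st = (Decimal(str(x)) for x in (low, high, step))
    if st <= 0:
        raise ValueError("step must be positive")
    values = []
    current = lo
    while current <= hi:
        values.append(float(current))
        current += st
    return values


strategy = {"entry": {"triggers": [
    {"indicator": "relative_strength_20d", "op": ">", "value": 0.12},
]}}
set_target(strategy, "entry.triggers[indicator=relative_strength_20d].value", 0.15)
candidates = grid(0.05, 0.20, 0.01)
```

Selecting the trigger by indicator name rather than by list position keeps the target stable if triggers are reordered during mutation.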
We validate the system on live yfinance data with real --method llm runs, and we intentionally show both success and honest failure.
| Case | Start | Best Outcome | Why It Matters |
|---|---|---|---|
| rsi_reversion_v1 | 8.1% | 56.3% champion | LLM turned a zero-signal strategy into a tradable one in 3 rounds |
| ma_crossover_v1 | 24.2% | 24.2% champion | LLM proposed changes, but anti-overfit rejected weak generalization |
| sector_rotation_leader_v1 smoke | 11.3% | 12.3% tested | Candidate improved in-sample, but train_val_gap=18.9% blocked promotion |
This is part of the product story, not a footnote: AlphaEvo should be trusted more when it stops honestly than when it invents a prettier curve.
Real factor discovery is also live-tested:
- alphaevo factor discover AAPL proposed 3 factors, passed 3 through sandboxing, validated 2, and registered 2.
- Walkthrough: Factor Discovery Walkthrough
Each real run can export:
- <strategy_id>_research_report.md
- <strategy_id>_llm_evidence.md
- <strategy_id>_research_log.md
- trajectory/*.jsonl|json
For a fixed-input benchmark suite, use:
python scripts/experiments/run_repro_benchmark.py \
--adapter yfinance \
--method llm \
--rounds 3 \
--output results/repro-benchmark/
python scripts/validate_real_data.py \
--adapter yfinance \
--days 365 \
  --output results/real-validation/

See the detailed write-up: April 10, 2026 real LLM validation
Current limits:
- live external LLM providers can still introduce latency spikes and timeout-driven fallback
- some strategies, like ma_crossover_v1, show that a valid LLM diagnosis does not automatically survive anti-overfit checks
- discovered factors are research artifacts first and should still be reviewed before any production-style use
Every real evolution run can export more than scores:
- trajectory.jsonl captures each round as (state -> diagnosis -> hypothesis -> change -> outcome)
- sharegpt.jsonl reformats the same run into SFT-style conversations
- preference.jsonl stores improved vs. non-improved steps for preference learning
That means AlphaEvo is not only a strategy optimizer. It is also a data engine for training better strategy-research agents over time.
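As a rough picture of what a trajectory record and a preference label might look like, consider the sketch below. The field names follow the (state -> diagnosis -> hypothesis -> change -> outcome) tuple, but the exact schema, the `Step` class, and the `score_before` / `score_after` keys are assumptions rather than the exported format.

```python
import json
from dataclasses import asdict, dataclass


@dataclass
class Step:
    state: dict        # e.g. strategy id and metrics before the change
    diagnosis: str
    hypothesis: str
    change: str
    outcome: dict      # assumed keys: "score_before", "score_after"


def to_jsonl(steps: list[Step]) -> str:
    """Serialize one JSON object per line, the usual JSONL layout."""
    return "".join(
        json.dumps(asdict(step), ensure_ascii=False) + "\n" for step in steps
    )


def preference_rows(steps: list[Step]) -> list[dict]:
    """Label each step by whether the retest score went up."""
    rows = []
    for step in steps:
        improved = step.outcome["score_after"] > step.outcome["score_before"]
        rows.append({
            "change": step.change,
            "label": "improved" if improved else "not_improved",
        })
    return rows
```

Because each line is an independent JSON object, runs can be appended to the same file and streamed into a training pipeline without loading everything at once.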
AlphaEvo takes overfitting seriously:
- Time Separation: Train / Validation / Test periods strictly separated
- Walk-Forward: Rolling 12-month train → 1-month test windows
- Complexity Penalty: More conditions = lower score
- Stability Check: Performance must be consistent across years/sectors
- Minimum Signals: Strategies with fewer than 30 signals get a reliability discount
- Parameter Sensitivity: ±10% perturbation test, >30% decay = warning
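Two of the checks above are simple enough to sketch directly: rolling walk-forward windows and the perturbation test. The code below is a minimal illustration under assumptions: `walk_forward_windows` and `sensitivity_warning` are hypothetical names, windows are aligned to calendar months with half-open intervals, and "decay" is taken to mean the relative drop from the unperturbed score.

```python
from datetime import date
from typing import Callable


def add_months(d: date, months: int) -> date:
    """Shift a date to the first day of the month `months` later."""
    index = d.year * 12 + (d.month - 1) + months
    return date(index // 12, index % 12 + 1, 1)


def walk_forward_windows(
    start: date, end: date, train_months: int = 12, test_months: int = 1
) -> list[tuple[date, date, date]]:
    """Return (train_start, train_end, test_end); test runs from train_end to test_end."""
    windows = []
    train_start = date(start.year, start.month, 1)
    while True:
        train_end = add_months(train_start, train_months)
        test_end = add_months(train_end, test_months)
        if test_end > end:
            break
        windows.append((train_start, train_end, test_end))
        train_start = add_months(train_start, test_months)   # roll forward
    return windows


def sensitivity_warning(
    score_fn: Callable[[float], float],
    value: float,
    rel: float = 0.10,
    max_decay: float = 0.30,
) -> bool:
    """Warn when a +/-10% nudge of one parameter costs more than 30% of the score."""
    base = score_fn(value)
    if base <= 0:
        raise ValueError("baseline score must be positive to measure decay")
    worst = min(score_fn(value * (1 - rel)), score_fn(value * (1 + rel)))
    return (base - worst) / base > max_decay
```

For example, a range from January 2024 to April 2025 yields three windows, each training on 12 months and testing on the month that follows.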
Works out of the box with yfinance or akshare for data.
Seamlessly integrates with daily_stock_analysis for multi-source data with automatic fallback.
from alphaevo.data.adapters.dsa import DSAAdapter
data_manager = DataManager([DSAAdapter(dsa_path="/path/to/dsa")])

alphaevo demo              # 🔥 Try instantly (no setup needed)
alphaevo demo --real # Real data demo without API key
alphaevo run <id> # Full research loop
alphaevo evolve <id> --method llm --rounds 3 --output reports/ # LLM-first evolution
alphaevo factor discover <symbol> # LLM-driven factor discovery
alphaevo leaderboard # Strategy rankings
alphaevo tree <id>         # Evolution tree visualization

For the full command surface, run alphaevo --help.
- FunSearch (Nature 2024) — island-style parallel search and branch competition
- OPRO (DeepMind 2023) — optimizer-style prompt refinement with trajectory history
- Voyager (2023) — reusable skill/pattern libraries and long-horizon memory
AlphaEvo adapts these ideas to quantitative strategy research rather than general coding or benchmark optimization.
- Phase 1: Strategy Research Loop (MVP) — backtest engine, indicators, evaluator
- Phase 2: Self-Evolution Pipeline — LLM reflection, mutation, multi-round improvement
- Phase 3: CLI & Orchestration — full command suite, strategy store, leaderboard
- Phase 4: Open-Source Polish — CI/CD, docs, English templates, CHANGELOG
- Phase 5: Market Regime Adaptive Gating — environment detection, strategy routing
- Phase 6: Web UI Dashboard — visualization, interactive strategy exploration
Contributions welcome! See CONTRIBUTING.md for guidelines.
Especially looking for:
- New strategy templates
- Data source adapters
- Evaluation metrics
- UI/visualization improvements
If this project helps your research, consider giving it a ⭐ and sharing it!
📬 Contact & Collaboration
- 🐛 Submit an Issue: bug reports / feature requests
- 📧 zhuls345@gmail.com: business inquiries
- 🔗 daily_stock_analysis: sister project, AI-powered daily stock analysis
This project is for educational and research purposes only. It does not constitute investment advice. The authors are not responsible for any financial losses incurred from using this software. Always do your own research and consult qualified financial advisors before making investment decisions.
Past strategy performance does not guarantee future results. All backtesting results are simulated and may not reflect real market conditions.
Apache-2.0 License — see LICENSE for details.
If you use or build upon this project, a credit with a link back to this repository is appreciated.

