docs(experiment): add select_num A/B/C comparison report (005)

- Experiment: select_num = 1, 2, 3 comparison
- Period: 2020-01-10 ~ 2026-06-02 (1546 trading days)
- Key findings:
  - Top-1: highest return (600%), highest drawdown (-25.5%)
  - Top-3: best risk-adjusted return (Calmar 1.73, Sharpe 1.35)
  - Top-2: balanced middle ground (Calmar 1.69)
- Add rotation/experiment_select_num.py experiment script
- Save report to docs/experiments/005_select_num_comparison.md
2026-06-02 01:32:43 +08:00
parent 07d6f1451c
commit a47af0f0eb
2 changed files with 338 additions and 0 deletions
--- a/docs/experiments/005_select_num_comparison.md
+++ b/docs/experiments/005_select_num_comparison.md
@@ -0,0 +1,145 @@
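 # Experiment Record 005: Effect of the select_num Parameter on Strategy Performance
 ## Experiment Info
 | Item | Details |
 |------|------|
 | Experiment ID | 005 |
 | Experiment date | 2026-06-02 |
 | Experiment type | A/B/C comparison test |
 | Research question | How `select_num` = 1/2/3 affects strategy return and risk under `diversified=true` mode |
 | Config file | `rotation/config_simple.yaml` (L133 `select_num`) |
 | Experiment script | `rotation/experiment_select_num.py` |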
 ---
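 ## 1. Background
 ### Strategy Selection Flow
 ```
 Step 1: Intra-class competition → within each market class, keep only the single highest-scoring instrument (the class champion)
 Step 2: Cross-class ranking → from the class champions, select the Top select_num by score, highest first
 ```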
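The two steps above can be sketched in Python as follows. This is a minimal illustration, not the code in `rotation/simple_rotation.py`: the function name `select_top`, the `(symbol, market, score)` tuple layout, and the sample tickers and scores are all assumptions made for the example, and the dynamic threshold is not modeled.

```python
from typing import Dict, List, Tuple

# Hypothetical layout: (symbol, market class, score)
Candidate = Tuple[str, str, float]


def select_top(candidates: List[Candidate], select_num: int) -> List[str]:
    """Two-step selection: class champions first, then Top select_num across classes."""
    # Step 1: within each market class keep only the highest-scoring instrument.
    champions: Dict[str, Candidate] = {}
    for cand in candidates:
        _, market, score = cand
        if market not in champions or score > champions[market][2]:
            champions[market] = cand
    # Step 2: rank the champions by score, highest first, and take the top select_num.
    ranked = sorted(champions.values(), key=lambda c: c[2], reverse=True)
    return [symbol for symbol, _, _ in ranked[:select_num]]


# Made-up scores for illustration only.
sample = [
    ("EQ_A", "equity", 0.12),
    ("EQ_B", "equity", 0.30),
    ("GOLD", "commodity", 0.20),
    ("OIL", "commodity", 0.05),
    ("BOND", "bond", 0.02),
]
print(select_top(sample, 2))
```

Because only one champion per class survives Step 1, a flow like this can never return more holdings than there are market classes, whatever `select_num` is set to.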
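 ### Core Question
 `select_num` controls the number of instruments in the final portfolio, and so directly sets how concentrated or diversified it is:
 - `select_num=1`: concentrated single-instrument position, no diversification
 - `select_num=2`: hold the champions of 2 classes
 - `select_num=3`: hold the champions of 3 classes (current default)
 **Theoretical expectations:**
 - Fewer holdings mean higher concentration, amplifying both potential return and volatility
 - More holdings mean better diversification and smaller drawdowns, but may admit instruments with lower marginal return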
 ---
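 ## 2. Experiment Design
 ### A/B/C Group Configuration
 | Group | select_num | Holdings | Other settings |
 |------|-----------|---------|---------|
 | **Group A** | 1 | Single instrument | Same fixed configuration |
 | **Group B** | 2 | Two instruments | Same fixed configuration |
 | **Group C** | 3 | Three instruments | Same fixed configuration (current default) |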
 ### Fixed Configuration (identical across all three groups)
 ```yaml
 factor:
  type: "weighted_momentum"
  n_days: 25
 rotation:
  diversified: true
  threshold:
    mode: "dynamic"
    reference: "931862.CSI"    # short-term bond momentum benchmark
 ```
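Since the three groups differ only in `rotation.select_num`, each group's configuration can be derived from one shared base. A minimal sketch, assuming the YAML has already been parsed into a nested dict; `with_select_num` is a hypothetical helper, not part of the repository, and `BASE_CONFIG` holds only the keys shown in this report:

```python
import copy

# Parsed form of the fixed configuration above (only the keys shown in this report).
BASE_CONFIG = {
    "factor": {"type": "weighted_momentum", "n_days": 25},
    "rotation": {
        "diversified": True,
        "threshold": {"mode": "dynamic", "reference": "931862.CSI"},
    },
}


def with_select_num(base: dict, select_num: int) -> dict:
    """Return a deep copy of base with rotation.select_num overridden."""
    cfg = copy.deepcopy(base)  # deep copy so the groups never share nested dicts
    cfg["rotation"]["select_num"] = select_num
    return cfg


group_configs = {
    name: with_select_num(BASE_CONFIG, n) for name, n in (("A", 1), ("B", 2), ("C", 3))
}
for name, cfg in group_configs.items():
    print(name, cfg["rotation"]["select_num"])
```

Copying before overriding keeps the base untouched, so a later group cannot inherit a value written for an earlier one.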
 ### Backtest Period
 2020-01-10 ~ 2026-06-02, **1546 trading days** in total
 ---
 ## 3. Backtest Results
 ### Core Metrics Comparison
 | Metric | Top-1 (Group A) | Top-2 (Group B) | Top-3 (Group C) |
 |------|------------|------------|------------|
 | Cumulative return | **600.31%** | 369.88% | 302.14% |
 | Annualized return | **37.34%** | 28.69% | 25.46% |
 | Max drawdown | -25.53% | -16.93% | **-14.74%** |
 | Sharpe ratio | 1.11 | 1.27 | **1.35** |
 | Calmar ratio | 1.46 | 1.69 | **1.73** |
 | Daily win rate | 54.49% | **55.35%** | 55.18% |
 | Rebalance count | 197 | 319 | 405 |
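
The annualized return and Calmar rows can be cross-checked against the cumulative return and max drawdown rows. The sketch below relies on two conventions that this report does not state and that are assumed here: 252 trading days per year, and Calmar defined as annualized return divided by the magnitude of the max drawdown. All inputs are copied from the table.

```python
TRADING_DAYS = 1546   # length of the backtest period in this report
DAYS_PER_YEAR = 252   # assumption: not stated in this report

# (cumulative return, max drawdown) per group, copied from the table
table = {
    "Top-1": (6.0031, -0.2553),
    "Top-2": (3.6988, -0.1693),
    "Top-3": (3.0214, -0.1474),
}


def annualized(total_return: float, n_days: int, days_per_year: int = DAYS_PER_YEAR) -> float:
    """Geometric annualization of a cumulative return earned over n_days trading days."""
    return (1.0 + total_return) ** (days_per_year / n_days) - 1.0


def calmar(annual_return: float, max_drawdown: float) -> float:
    """Annualized return divided by the magnitude of the max drawdown."""
    return annual_return / abs(max_drawdown)


results = {}
for name, (total, mdd) in table.items():
    ann = annualized(total, TRADING_DAYS)
    results[name] = (ann, calmar(ann, mdd))
    print(f"{name}: annualized {ann:.2%}, Calmar {results[name][1]:.2f}")
```

Under these assumptions the recomputed figures should agree with the table to the precision shown, which suggests the reported metrics follow the same conventions.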
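 ### Key Observations
 **Return:**
 - Top-1 cumulative return (600%) is almost 2x that of Top-3 (302%)
 - Concentrated holdings markedly amplified returns, at the cost of heavier dependence on a single instrument
 **Risk:**
 - Top-3 max drawdown (-14.74%) is about 42% shallower than Top-1 (-25.53%)
 - Top-2 sits in between (-16.93%), already a clear improvement in drawdown control
 **Risk-adjusted return (core metric):**
 - Calmar ratio: Top-3 (1.73) > Top-2 (1.69) > Top-1 (1.46)
 - Sharpe ratio: Top-3 (1.35) > Top-2 (1.27) > Top-1 (1.11)
 - **Diversification yields a better risk/return trade-off**
 **Rebalance frequency:**
 - Top-1 rebalances least (197 times), because a switch requires a ranking change large enough to displace the single held instrument
 - Top-3 rebalances most (405 times), because a change in any one of the held instruments triggers a rebalance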
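
The rebalance-frequency pattern is what one would expect if a rebalance is counted whenever the held set changes from one day to the next. That counting rule is an assumption about how `rebalance_count` is computed, and the rankings below are invented toy data; the sketch illustrates the mechanism and does not prove that a larger `select_num` always rebalances more.

```python
from typing import FrozenSet, List


def count_rebalances(holdings: List[FrozenSet[str]]) -> int:
    """Count the days on which the held set differs from the previous day's."""
    return sum(1 for prev, cur in zip(holdings, holdings[1:]) if prev != cur)


# Daily ranking of class champions, best first (made-up data).
daily_ranking = [
    ["A", "B", "C", "D"],
    ["A", "B", "D", "C"],  # D overtakes C: only the Top-3 set changes
    ["A", "D", "B", "C"],  # D overtakes B: only the Top-2 set changes
    ["A", "D", "C", "B"],  # C overtakes B: only the Top-3 set changes
    ["D", "A", "C", "B"],  # D overtakes A: only the Top-1 set changes
    ["D", "A", "B", "C"],  # B overtakes C: only the Top-3 set changes
    ["D", "B", "A", "C"],  # B overtakes A: only the Top-2 set changes
]

counts = {}
for k in (1, 2, 3):
    held = [frozenset(day[:k]) for day in daily_ranking]
    counts[k] = count_rebalances(held)
    print(f"Top-{k}: {counts[k]} rebalances")
```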
 ---
 ## 4. NAV Curve Comparison
 ![NAV comparison chart](../../results/experiment_select_num/select_num_nav_comparison.png)
 ![Metrics comparison chart](../../results/experiment_select_num/select_num_comparison.png)
 ---
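 ## 5. Conclusions and Recommendations
 ### Core Conclusions
 | Goal | Recommended setting | Rationale |
 |------|---------|------|
 | Maximize absolute return | `select_num=1` | Highest cumulative return, but requires tolerating a larger drawdown |
 | Maximize risk-adjusted return | `select_num=3` | Best Calmar and Sharpe, drawdown under control |
 | Balance the two | `select_num=2` | A compromise between return and drawdown |
 ### Practical Recommendations
 - **The current default `select_num=3` is a reasonable choice**: it has the best Calmar ratio and suits long-term holding
 - With a small account and high risk tolerance, `select_num=1` can be considered for its higher upside
 - The Calmar of `select_num=2` (1.69) is very close to that of Top-3 (1.73) while its cumulative return is higher (369.88% vs 302.14%), so it merits further observation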
 ---
 ## 6. Experiment Data Location
 ```
 results/experiment_select_num/
 ├── select_1/
 │   ├── simple_rotation_nav.csv
 │   ├── simple_rotation_signals.csv
 │   ├── simple_rotation_detail.json
 │   └── simple_rotation_metrics.json
 ├── select_2/
 │   └── ... (same as above)
 ├── select_3/
 │   └── ... (same as above)
 ├── select_num_comparison.png      # metrics comparison bar chart
 ├── select_num_nav_comparison.png  # overlaid NAV curves
 └── experiment_metrics.json        # combined metrics for the three groups
 ```
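To recompute a metric from the exported files, the NAV series can be read back from `simple_rotation_nav.csv`. The sketch below assumes the `date` and `nav` columns that `plot_nav_comparison` in the experiment script reads. The drawdown function is a generic peak-to-trough definition, not necessarily the strategy's own implementation, and the inline CSV is made-up sample data standing in for a real file.

```python
import csv
import io
from typing import Iterable, List


def read_nav(lines: Iterable[str]) -> List[float]:
    """Parse the nav column from CSV text whose header row contains 'nav'."""
    return [float(row["nav"]) for row in csv.DictReader(lines)]


def max_drawdown(nav: List[float]) -> float:
    """Largest peak-to-trough decline as a negative fraction (0.0 if NAV never falls)."""
    peak = nav[0]
    worst = 0.0
    for value in nav:
        peak = max(peak, value)                 # running high-water mark
        worst = min(worst, value / peak - 1.0)  # decline relative to that mark
    return worst


# Made-up sample; for a real run, pass an open file handle instead, e.g.
# open("results/experiment_select_num/select_3/simple_rotation_nav.csv", newline="")
sample_csv = io.StringIO(
    "date,nav\n"
    "2020-01-10,1.00\n"
    "2020-01-13,1.10\n"
    "2020-01-14,0.88\n"
    "2020-01-15,1.21\n"
)
nav = read_nav(sample_csv)
print(f"max drawdown: {max_drawdown(nav):.2%}")
```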
--- a/rotation/experiment_select_num.py
+++ b/rotation/experiment_select_num.py
@@ -0,0 +1,193 @@
 #!/usr/bin/env python3
 """
 select_num A/B/C experiment: compare the performance of Top-1 / Top-2 / Top-3
 Usage:
    python rotation/experiment_select_num.py
 """
 import os
 import sys
 import yaml
 import json
 import numpy as np
 import pandas as pd
 from pathlib import Path
 PROJECT_ROOT = Path(__file__).parent.parent
 sys.path.insert(0, str(PROJECT_ROOT))
 from rotation.simple_rotation import SimpleRotationStrategy
 def run_with_select_num(config_path: str, select_num: int, output_dir: Path) -> dict:
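    """Run the strategy once with select_num overridden."""
    print(f"\n{'='*60}")
    print(f"  Experiment: select_num = {select_num}")
    print(f"{'='*60}\n")
    # Load the base config, override select_num, and write a per-run config file into output_dir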
    with open(config_path, 'r', encoding='utf-8') as f:
        cfg = yaml.safe_load(f)
    cfg['rotation']['select_num'] = select_num
    tmp_path = output_dir / f'config_select_{select_num}.yaml'
    with open(tmp_path, 'w', encoding='utf-8') as f:
        yaml.dump(cfg, f, default_flow_style=False, allow_unicode=True)
    strategy = SimpleRotationStrategy(config_path=str(tmp_path))
    result = strategy.run()
    if result:
        # Export results to a per-run subdirectory
        sub_dir = output_dir / f'select_{select_num}'
        sub_dir.mkdir(parents=True, exist_ok=True)
        strategy.export_results(output_dir=str(sub_dir))
        return result.get('metrics', {})
    return {}
 def print_comparison(all_metrics: dict):
    """Print the comparison table."""
    print(f"\n\n{'='*80}")
    print("  select_num experiment comparison")
    print(f"{'='*80}\n")
    header = f"{'Metric':<16}"
    for n in sorted(all_metrics.keys()):
        header += f"{'Top-'+str(n):>12}"
    print(header)
    print("-" * (16 + 12 * len(all_metrics)))
    rows = [
        ('Total return', 'total_return', '{:.2%}'),
        ('Annual return', 'annual_return', '{:.2%}'),
        ('Max drawdown', 'max_drawdown', '{:.2%}'),
        ('Sharpe ratio', 'sharpe_ratio', '{:.2f}'),
        ('Calmar ratio', 'calmar_ratio', '{:.2f}'),
        ('Daily win rate', 'win_rate', '{:.2%}'),
        ('Trading days', 'n_days', '{}'),
        ('Rebalances', 'rebalance_count', '{}'),
    ]
    for label, key, fmt in rows:
        row = f"{label:<16}"
        for n in sorted(all_metrics.keys()):
            val = all_metrics[n].get(key, 0)
            row += f"{fmt.format(val):>12}"
        print(row)
    print(f"\n{'='*80}")
 def plot_comparison(all_metrics: dict, output_dir: Path):
    """Generate the comparison charts."""
    import matplotlib
    matplotlib.use("Agg")
    import matplotlib.pyplot as plt
    fig, axes = plt.subplots(1, 3, figsize=(16, 5))
    fig.suptitle("select_num A/B/C Experiment", fontsize=14, fontweight="bold")
    nums = sorted(all_metrics.keys())
    colors = ['#E74C3C', '#3498DB', '#2ECC71']
    # 1. Returns
    ax = axes[0]
    annuals = [all_metrics[n].get('annual_return', 0) for n in nums]
    totals = [all_metrics[n].get('total_return', 0) for n in nums]
    x = np.arange(len(nums))
    w = 0.35
    ax.bar(x - w/2, [a*100 for a in annuals], w, label='Annual %', color='#E74C3C', alpha=0.8)
    ax.bar(x + w/2, [t*100 for t in totals], w, label='Total %', color='#3498DB', alpha=0.8)
    ax.set_xticks(x)
    ax.set_xticklabels([f'Top-{n}' for n in nums])
    ax.set_ylabel('Return (%)')
    ax.set_title('Returns')
    ax.legend()
    ax.grid(True, alpha=0.3)
    # 2. Risk
    ax = axes[1]
    dds = [abs(all_metrics[n].get('max_drawdown', 0)) * 100 for n in nums]
    ax.bar(x, dds, color='#E74C3C', alpha=0.7)
    ax.set_xticks(x)
    ax.set_xticklabels([f'Top-{n}' for n in nums])
    ax.set_ylabel('Max Drawdown (%)')
    ax.set_title('Risk')
    ax.grid(True, alpha=0.3)
    # 3. Sharpe & Calmar
    ax = axes[2]
    sharpes = [all_metrics[n].get('sharpe_ratio', 0) for n in nums]
    calmars = [all_metrics[n].get('calmar_ratio', 0) for n in nums]
    ax.bar(x - w/2, sharpes, w, label='Sharpe', color='#2ECC71', alpha=0.8)
    ax.bar(x + w/2, calmars, w, label='Calmar', color='#F39C12', alpha=0.8)
    ax.set_xticks(x)
    ax.set_xticklabels([f'Top-{n}' for n in nums])
    ax.set_ylabel('Ratio')
    ax.set_title('Risk-Adjusted')
    ax.legend()
    ax.grid(True, alpha=0.3)
    plt.tight_layout()
    chart_path = output_dir / 'select_num_comparison.png'
    plt.savefig(str(chart_path), dpi=150, bbox_inches="tight")
    plt.close()
    print(f"\n  + Chart: {chart_path}")
 def plot_nav_comparison(output_dir: Path):
    """Load the three NAV series and plot them on one chart."""
    import matplotlib
    matplotlib.use("Agg")
    import matplotlib.pyplot as plt
    fig, ax = plt.subplots(figsize=(14, 6))
    colors = {'1': '#E74C3C', '2': '#3498DB', '3': '#2ECC71'}
    for n in [1, 2, 3]:
        nav_path = output_dir / f'select_{n}' / 'simple_rotation_nav.csv'
        if nav_path.exists():
            df = pd.read_csv(nav_path, parse_dates=['date'])
            ax.plot(df['date'], df['nav'], label=f'Top-{n}', linewidth=1.5, color=colors[str(n)])
    ax.set_title("NAV Curve Comparison (select_num)", fontsize=14, fontweight="bold")
    ax.set_ylabel("NAV")
    ax.set_yscale("log")
    ax.legend(fontsize=11)
    ax.grid(True, alpha=0.3)
    plt.tight_layout()
    nav_chart = output_dir / 'select_num_nav_comparison.png'
    plt.savefig(str(nav_chart), dpi=150, bbox_inches="tight")
    plt.close()
    print(f"  + NAV Chart: {nav_chart}")
 if __name__ == "__main__":
    os.environ.setdefault('FLASK_API_URL', 'https://k3s.tokenpluse.xyz')
    config_path = str(Path(__file__).parent / 'config_simple.yaml')
    output_dir = PROJECT_ROOT / 'results' / 'experiment_select_num'
    output_dir.mkdir(parents=True, exist_ok=True)
    all_metrics = {}
    for n in [1, 2, 3]:
        metrics = run_with_select_num(config_path, n, output_dir)
        if metrics:
            all_metrics[n] = metrics
    if all_metrics:
        print_comparison(all_metrics)
        plot_comparison(all_metrics, output_dir)
        plot_nav_comparison(output_dir)
        # Save the raw metrics
        metrics_path = output_dir / 'experiment_metrics.json'
        with open(metrics_path, 'w', encoding='utf-8') as f:
            json.dump({str(k): v for k, v in all_metrics.items()}, f, ensure_ascii=False, indent=2)
        print(f"  + Metrics: {metrics_path}")