Methodology

How four AIs reach monthly ETF consensus through structured debate

Full disclosure of agent roles, round structure, weight calibration, and data sources. The same inputs must yield the same conclusion; that reproducibility is our single source of trust.

1. The Four AIs · Role Definitions

🌍

Macro Analyst

Aggregates 13 macro indicators (ISM PMI, unemployment, yield curve, HY spread, etc.) into a regime score. The score is updated monthly on a scale from -1.7 to +1.7.

📈

Momentum Tracker

Monitors the 3-month and 12-month moving averages and the RSI of 12 US sector ETFs. Issues an automatic trim recommendation on overheating signals (e.g., 17 consecutive up-days).
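The overheating check can be sketched as a streak counter over daily closes. This is a minimal illustration, not the production logic: the function names are hypothetical, the threshold of 17 is taken from the example above, and the source does not say whether it is a fixed cutoff.

```python
from typing import Sequence

# Assumed threshold, borrowed from the "17 consecutive up-days" example.
OVERHEAT_STREAK = 17


def trailing_up_days(closes: Sequence[float]) -> int:
    """Count consecutive up-days ending at the most recent close."""
    streak = 0
    # Walk backwards, comparing each close with the one before it.
    for i in range(len(closes) - 1, 0, -1):
        if closes[i] > closes[i - 1]:
            streak += 1
        else:
            break
    return streak


def simple_moving_average(closes: Sequence[float], window: int) -> float:
    """Mean of the last `window` closes."""
    if window <= 0 or len(closes) < window:
        raise ValueError("not enough data for the requested window")
    return sum(closes[-window:]) / window


def is_overheated(closes: Sequence[float], threshold: int = OVERHEAT_STREAK) -> bool:
    """True when the trailing up-day streak reaches the threshold."""
    return trailing_up_days(closes) >= threshold
```

A flat or down day resets the streak to zero, so the signal only fires on an unbroken run of higher closes.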

🛡️

Risk Manager

Tracks the VIX, the HY spread, and the Fed dissent count. Enforces a defensive allocation (XLV / ITA) when a volatility hedge is needed.

🔍

Verifier

Scores the other 3 AIs on their 3-month accuracy and historical miss frequency. Any AI below 70% accuracy is automatically downweighted.

2. 5-Round Debate Structure

Round 1: Performance Review

What happens: Decompose each ETF's actual return against its prediction and label the result as hit / sideways / miss.

Output example: SOXX +12.3%, strong hit / SMH cut, +9~10% rally: partial miss.
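The labeling step might look like the sketch below. The source does not define the cutoffs, so the 2% sideways band, the "up"/"down" prediction format, and the function name are all assumptions made for illustration.

```python
def label_outcome(predicted_direction: str,
                  actual_return_pct: float,
                  sideways_band_pct: float = 2.0) -> str:
    """Label a prediction as 'hit', 'sideways', or 'miss'.

    sideways_band_pct is a hypothetical cutoff: moves smaller than this
    in absolute value are treated as neither a hit nor a miss.
    """
    if predicted_direction not in ("up", "down"):
        raise ValueError("predicted_direction must be 'up' or 'down'")
    if abs(actual_return_pct) < sideways_band_pct:
        return "sideways"
    moved_up = actual_return_pct > 0
    predicted_up = predicted_direction == "up"
    return "hit" if moved_up == predicted_up else "miss"
```

Finer grades such as "strong hit" or "partial miss" would need additional thresholds that the source does not specify.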

Round 2: Macro Inflection Check

What happens: The Macro AI presents the 13-indicator score and its MoM change. The other 3 AIs challenge or reinforce it.

Output example: aggregate -1.7pt, no regime shift → late-cycle maintained.

Round 3: Sector Momentum Mapping

What happens: Map the current momentum score and up-day count onto the 12 sector ETFs. Identify overheated and cheap candidates.

Output example: semis 17 days up → no new chase buys.

Round 4: Weight Consensus

What happens: Each AI proposes its desired weights. Where proposals differ by more than 3pp, the weaker argument concedes. The outcome is decided by strength of logic, not by majority vote.

Output example: SOXX 40 → 38% (2pp preemptive trim).
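One way to picture the concession rule is the sketch below. It assumes each proposal carries a numeric `argument_strength` score, which is a hypothetical stand-in for however the debate judges logic, and it assumes that proposals within the 3pp gap are simply averaged. The source specifies neither detail.

```python
from dataclasses import dataclass
from typing import Sequence

# Gap above which the weaker argument concedes (from the Round 4 description).
CONCESSION_GAP_PP = 3.0


@dataclass(frozen=True)
class Proposal:
    agent: str
    weight_pct: float
    argument_strength: float  # hypothetical score; higher means stronger logic


def resolve_weight(proposals: Sequence[Proposal]) -> float:
    """Resolve one ETF's weight from the agents' proposals."""
    if not proposals:
        raise ValueError("at least one proposal is required")
    weights = [p.weight_pct for p in proposals]
    gap = max(weights) - min(weights)
    if gap > CONCESSION_GAP_PP:
        # Not a vote: the single strongest argument wins outright.
        strongest = max(proposals, key=lambda p: p.argument_strength)
        return strongest.weight_pct
    # Assumption: small disagreements are settled by a plain average.
    return sum(weights) / len(weights)
```

Because the winner is picked by strength rather than headcount, one well-argued outlier can override three agreeing agents.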

Round 5: Verification & New Rules

What happens: The Verifier codifies future check criteria for this month's decision and adds them to the rulebook.

Output example: new rule '+15% over → auto-trim 5pp'.
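A codified rule of this kind could be stored as data and applied mechanically, as in the following sketch. The class and function names are hypothetical, and the return period the 15% refers to is not stated in the source, so the caller supplies whatever period return applies.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TrimRule:
    return_threshold_pct: float  # trigger level for the period return
    trim_pp: float               # percentage points removed from the weight
    added: str                   # month the rule entered the rulebook


# Values taken from the '+15% over -> auto-trim 5pp' example (rule added 2026-05).
AUTO_TRIM = TrimRule(return_threshold_pct=15.0, trim_pp=5.0, added="2026-05")


def apply_trim(weight_pct: float, period_return_pct: float,
               rule: TrimRule = AUTO_TRIM) -> float:
    """Return the weight after applying the rule; never goes below zero."""
    if period_return_pct > rule.return_threshold_pct:
        return max(0.0, weight_pct - rule.trim_pp)
    return weight_pct
```

Where the trimmed 5pp is reallocated is left open here, since the source does not say.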

3. Weight Calibration

  • Base weights: 4 AIs equal (25% each)
  • 3-month accuracy adjustment: the Verifier scores each AI monthly; below 70% accuracy = -5pp, above 90% = +5pp
  • Sector constraints: single ETF ≤ 50%, single sector ≤ 60%
  • Monthly weight change: ±3pp is the statistically justified range. Larger moves require extra evidence.
  • Auto-trim: a gain above the +15% threshold triggers an automatic 5pp trim (rule added 2026-05)
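The calibration rules above can be expressed as a small set of checks. This is a sketch under stated assumptions: the function names and the ETF-to-sector mapping are hypothetical inputs, and the adjusted AI weights are not renormalized to sum to 100% because the source does not say whether that happens.

```python
from typing import Dict, List, Optional

BASE_WEIGHT_PCT = 25.0       # 4 AIs, equal base weight
MAX_ETF_PCT = 50.0           # single-ETF cap
MAX_SECTOR_PCT = 60.0        # single-sector cap
MAX_MONTHLY_CHANGE_PP = 3.0  # larger moves need extra evidence


def adjusted_agent_weight(accuracy_pct: float) -> float:
    """Apply the Verifier's 3-month accuracy adjustment to one AI."""
    if accuracy_pct < 70.0:
        return BASE_WEIGHT_PCT - 5.0
    if accuracy_pct > 90.0:
        return BASE_WEIGHT_PCT + 5.0
    return BASE_WEIGHT_PCT


def constraint_violations(weights: Dict[str, float],
                          sectors: Dict[str, str],
                          previous: Optional[Dict[str, float]] = None) -> List[str]:
    """List every constraint that the proposed ETF weights break."""
    problems: List[str] = []
    sector_totals: Dict[str, float] = {}
    for etf, weight in weights.items():
        if weight > MAX_ETF_PCT:
            problems.append(f"{etf}: {weight:.1f}% exceeds the single-ETF cap")
        sector = sectors[etf]
        sector_totals[sector] = sector_totals.get(sector, 0.0) + weight
        # An ETF absent from last month's weights is treated as a move from 0%.
        if previous is not None:
            change = abs(weight - previous.get(etf, 0.0))
            if change > MAX_MONTHLY_CHANGE_PP:
                problems.append(f"{etf}: {change:.1f}pp change needs extra evidence")
    for sector, total in sector_totals.items():
        if total > MAX_SECTOR_PCT:
            problems.append(f"{sector}: {total:.1f}% exceeds the single-sector cap")
    return problems
```

Returning a list of messages rather than raising on the first failure lets a debate round see every breached limit at once.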

4. Data Sources · License

| Indicator | Source | License / Cadence |
|---|---|---|
| ISM PMI | Institute for Supply Management | Public release (monthly) · FRED MANEMP series |
| Unemployment | BLS · U.S. Bureau of Labor Statistics | Public data (monthly) · FRED UNRATE series |
| Yield Curve | FRED · Fed Reserve of St. Louis | Public data (FRED) · FRED T10Y2Y series |
| Fed Dissent | FOMC Statements + Bloomberg/Reuters | Public + media citation · Manual entry with citation |
| VIX | CBOE (via Yahoo Finance) | Index data (daily close) · yfinance ^VIX |
| HY Spread | ICE BofA US High Yield Master II OAS | Public data (FRED) · FRED BAMLH0A0HYM2 |
| ETF Prices | iShares / State Street / Invesco | Public data (daily) · yfinance + Supabase etf_prices |

⚠ All sources are public data. We only re-process and re-interpret. Data accuracy is the source institution's responsibility.
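For reproducibility, the source list can also be kept as a machine-readable registry. The sketch below copies the series identifiers exactly as the table lists them. The `provider` labels and all names are illustrative assumptions, and ETF prices are omitted because they have no single series identifier.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass(frozen=True)
class DataSource:
    indicator: str
    provider: str             # "fred", "yfinance", or "manual" (assumed labels)
    series_id: Optional[str]  # identifier as listed in the table, if any


SOURCES: Tuple[DataSource, ...] = (
    DataSource("ISM PMI", "fred", "MANEMP"),
    DataSource("Unemployment", "fred", "UNRATE"),
    DataSource("Yield Curve", "fred", "T10Y2Y"),
    DataSource("Fed Dissent", "manual", None),  # entered by hand with citation
    DataSource("VIX", "yfinance", "^VIX"),
    DataSource("HY Spread", "fred", "BAMLH0A0HYM2"),
)


def lookup(indicator: str) -> DataSource:
    """Find the registry entry for an indicator name."""
    for source in SOURCES:
        if source.indicator == indicator:
            return source
    raise KeyError(indicator)
```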

5. Limits and *publishing our misses*

This methodology is a simulation based on historical patterns and does not guarantee future returns. AI consensus has its own *collective bias*: all four AIs can misread the same data in the same way.

When misses occur, we log them in the Missed Museum under 4 columns: decision basis → actual result → what we missed → rule added. Reproducible reasoning, not raw accuracy, is our asset.
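The four-column log entry maps naturally onto a small record type. This is a hypothetical sketch of the shape only; the field names follow the columns above, and the actual storage format is not described in the source.

```python
from dataclasses import dataclass
from typing import List

COLUMNS = ("decision basis", "actual result", "what we missed", "rule added")


@dataclass(frozen=True)
class MissEntry:
    decision_basis: str
    actual_result: str
    what_we_missed: str
    rule_added: str

    def to_row(self) -> List[str]:
        """Return the entry in column order, ready to write to a log."""
        return [self.decision_basis, self.actual_result,
                self.what_we_missed, self.rule_added]
```

Making every field required means a miss cannot be logged without also stating the rule it produced.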
