Note: this experiment has not yet been Codex-validated. Treat the findings as preliminary indications.

FAMILY_AUTOML_PATTERN_DISCOVERY - Experiment Log

Objective: Use AutoML frameworks to automatically discover hidden patterns and beat the current 53.7% improvement baseline at 12-week horizons through computational brute force and automated feature engineering.

Last updated: 2025-12-01
Repo path: hypotheses/FAMILY_AUTOML_PATTERN_DISCOVERY
Codex file: Missing

Experiment Notes

Overview

Status: DEVELOPMENT
Priority: CRITICAL
Target Performance: 60%+ improvement (vs current 53.7% baseline)

Variants

  • Variant A: AutoGluon comprehensive ensemble with 1000+ automated features
  • Variant B: H2O AutoML with deep learning and gradient boosting focus
  • Variant C: TPOT genetic programming for evolutionary pattern discovery

Experimental Plan

Data Requirements

  • Primary: BoerderijApi weekly Dutch potato prices (1,203 observations)
  • Secondary: NDVI satellite data, international markets, weather accumulation, storage indicators
  • Validation: USE ONLY REAL DATA from repository interfaces - NO SYNTHETIC DATA
  • Horizon: 12-week (84 days) - proven optimal for maximum improvement

Feature Engineering Strategy

Automated Feature Explosion: generate 1000+ features programmatically

  • Price transformations: lags 1-52, moving averages, differences, ratios
  • Polynomial interactions: 2nd and 3rd order combinations
  • Fourier transforms: seasonal cycles at multiple frequencies
  • Wavelet decomposition: multi-scale temporal analysis
  • Technical indicators: RSI, MACD, Bollinger Bands
  • Cross-market features: international spreads, correlations
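
The price transformations and Fourier terms listed above can be sketched as follows. This is a minimal illustration using only the standard library, not the experiment's actual pipeline: the function name `build_features`, the default lag set, the window sizes, and the 52-week seasonal period are all assumptions made for the example. Each row uses only prices observed up to and including week `t`, so it can be paired with a target 12 weeks ahead without leakage.

```python
import math

def build_features(prices, lags=(1, 2, 4, 12, 52), ma_windows=(4, 12),
                   period=52, harmonics=2):
    """Return one feature dict per week, built only from prices[0..t]."""
    rows = []
    # First week for which every lag and moving-average window is available.
    start = max(max(lags), max(ma_windows) - 1)
    for t in range(start, len(prices)):
        row = {"t": t}
        for k in lags:                      # lagged price levels
            row[f"lag_{k}"] = prices[t - k]
        for w in ma_windows:                # trailing moving averages
            window = prices[t - w + 1 : t + 1]
            row[f"ma_{w}"] = sum(window) / w
        row["diff_1"] = prices[t] - prices[t - 1]          # weekly change
        prev = prices[t - 1]
        row["ratio_1"] = prices[t] / prev if prev else float("nan")
        for h in range(1, harmonics + 1):   # seasonal Fourier terms
            angle = 2 * math.pi * h * t / period
            row[f"sin_{h}"] = math.sin(angle)
            row[f"cos_{h}"] = math.cos(angle)
        rows.append(row)
    return rows
```

Widening `lags` to every value from 1 to 52 and adding more windows and harmonics is one way to approach the 1000+ feature count, though polynomial interactions would contribute most of that total.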

AutoML Framework Testing

  • AutoGluon: 2-hour comprehensive ensemble search
  • H2O AutoML: 2-hour deep learning + GBM focus
  • TPOT: 4-hour genetic programming evolution
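
The time budgets above could be kept in one place so that each variant receives its limit in the unit its framework expects. The sketch below is a hypothetical configuration helper (`SearchBudget` and `BUDGETS` are names invented here). The comments name the parameters these values would typically be passed to (`time_limit` in AutoGluon's `TabularPredictor.fit`, `max_runtime_secs` in `H2OAutoML`, `max_time_mins` in `TPOTRegressor`); these should be checked against the installed library versions before use.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class SearchBudget:
    variant: str
    framework: str
    hours: float

    @property
    def seconds(self) -> int:
        # AutoGluon (time_limit) and H2O (max_runtime_secs) take seconds.
        return int(self.hours * 3600)

    @property
    def minutes(self) -> int:
        # TPOT (max_time_mins) takes minutes.
        return int(self.hours * 60)

# Budgets as stated in the experimental plan.
BUDGETS = [
    SearchBudget("A", "AutoGluon", 2),
    SearchBudget("B", "H2O AutoML", 2),
    SearchBudget("C", "TPOT", 4),
]

total_hours = sum(b.hours for b in BUDGETS)  # wall-clock total if run sequentially
```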

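Validation Protocol

  • Rolling-origin cross-validation with corrected baselines
  • Test against ALL 4 standard baselines: persistent, seasonal_naive, ar2, historical_mean
  • Statistical testing: Diebold-Mariano (DM) test with the Harvey-Leybourne-Newbold (HLN) small-sample correction, TOST equivalence test against the SESOI (smallest effect size of interest), FDR (false discovery rate) correction
  • Compare against the STRONGEST baseline (lowest error)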
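
A rough sketch of the validation steps above, under stated assumptions: absolute error is the loss, the seasonal period is 52 weeks, and at least 104 weeks are held back for the first origin. None of these is specified in the plan. The `ar2` baseline is omitted to keep the example dependency-free, and the function names are illustrative rather than the repository's own. `dm_hln_statistic` returns the HLN-corrected DM statistic only; a p-value would come from comparing it with a t distribution with n - 1 degrees of freedom.

```python
import math

def persistent(history, horizon):
    """Carry the last observed price forward."""
    return history[-1]

def seasonal_naive(history, horizon, season=52):
    """Use the price observed one season before the target week."""
    return history[len(history) - 1 + horizon - season]

def historical_mean(history, horizon):
    """Forecast the mean of everything observed so far."""
    return sum(history) / len(history)

def rolling_origin_errors(series, forecasters, horizon=12, min_train=104):
    """Absolute h-step errors per forecaster, re-forecasting at every origin."""
    errors = {name: [] for name in forecasters}
    for origin in range(min_train, len(series) - horizon + 1):
        history = series[:origin]              # data known at the origin
        actual = series[origin + horizon - 1]  # value `horizon` weeks later
        for name, forecast in forecasters.items():
            errors[name].append(abs(forecast(history, horizon) - actual))
    return errors

def strongest_baseline(errors):
    """Name and MAE of the baseline with the lowest mean absolute error."""
    maes = {name: sum(e) / len(e) for name, e in errors.items()}
    best = min(maes, key=maes.get)
    return best, maes[best]

def dm_hln_statistic(model_errors, baseline_errors, horizon=12):
    """HLN-corrected Diebold-Mariano statistic; negative favours the model."""
    d = [m - b for m, b in zip(model_errors, baseline_errors)]
    n = len(d)
    mean_d = sum(d) / n

    def autocov(k):
        return sum((d[t] - mean_d) * (d[t - k] - mean_d) for t in range(k, n)) / n

    # Rectangular kernel over horizon - 1 lags, as in the original DM test.
    long_run_var = autocov(0) + 2 * sum(autocov(k) for k in range(1, horizon))
    if long_run_var <= 0:
        raise ValueError("non-positive long-run variance; try another kernel")
    dm = mean_d / math.sqrt(long_run_var / n)
    correction = math.sqrt((n + 1 - 2 * horizon + horizon * (horizon - 1) / n) / n)
    return correction * dm
```

A model's errors would be collected over the same origins as the baselines, then passed to `dm_hln_statistic` together with the error list of whichever baseline `strongest_baseline` selects.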

Success Criteria

  • Primary: Beat the 53.7% baseline by at least 5 percentage points (58.7%+ total improvement)
  • Statistical: p < 0.05 vs strongest baseline after FDR correction
  • Practical: Improvement exceeds 8% SESOI threshold
  • Discovery: Identify top 20 predictive features humans wouldn't consider
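
The first three criteria can be checked mechanically once each variant has an improvement figure and a p-value. The sketch below assumes Benjamini-Hochberg as the FDR procedure (the plan does not name one), reads the primary margin as percentage points because 53.7 + 5 = 58.7, and assumes the SESOI is applied to the same improvement percentage. Both function names are hypothetical.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values in the original order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted

def meets_success_criteria(improvement_pct, adjusted_p, baseline_pct=53.7,
                           margin_pts=5.0, sesoi_pct=8.0, alpha=0.05):
    """Evaluate the primary, statistical, and practical criteria for one variant."""
    return {
        "primary": improvement_pct >= baseline_pct + margin_pts,
        "statistical": adjusted_p < alpha,
        "practical": improvement_pct > sesoi_pct,
    }
```

With three variants, the three raw p-values would be adjusted together by `benjamini_hochberg` before each variant is passed to `meets_success_criteria`.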

Experiment Results

Results will be appended here as variants are completed


Decision Log

Decision summary will be added after all variants complete

No Codex summary

Add codex_validated.md to document the status.