Note: this experiment has not yet been Codex-validated. Treat the findings as preliminary indications.


FAMILY_SEASONAL_PLANTING: Experiment Log

FAMILY_SEASONAL_PLANTING

Testing how seasonal planting decisions and acreage allocation patterns create predictable price movements through cobweb dynamics, weather-constrained planting windows, and competing crop economics.

Last updated: 2025-12-01
Repo path: hypotheses/FAMILY_SEASONAL_PLANTING
Codex file: Present

Experiment Notes


Hypothesis Origins

  • Prior experiments:
      • FAMILY_PRODUCTION_CYCLE Variant B showed 71-78% improvement with early-season indicators
      • FAMILY_SPRING_VOL revealed 84x higher volatility during the planting period (σ²=905 vs 10.8)
      • FAMILY_STORAGE_DECAY demonstrated that seasonal price patterns affect the next cycle
  • Industry catalyst: 2024 planting was disrupted by excessive March rainfall; the April 15 seed ordering deadline creates decision lock-in
  • Academic basis: Cobweb theory (Ezekiel 1938), adaptive expectations (Nerlove 1958), planting window effects (Van der Waals 2001)

Experiment Design

  • Method: Rolling-origin cross-validation
  • Initial window: 156 weeks (3 years)
  • Step size: 4 weeks
  • Test windows: 52 weeks (1 year)
  • Refit frequency: Every 12 weeks (quarterly)
  • Baselines: Naive seasonal, ARIMA, linear trend
  • REAL DATA ONLY: Boerderij.nl prices, CBS acreage/production, Open-Meteo weather
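The rolling-origin scheme above can be sketched as a small split generator. This is a minimal illustration, not the project's runner: the function name `rolling_origin_splits`, the `Split` tuple, and the choice of an expanding training window are assumptions; only the window sizes (156, 4, 52, 12 weeks) come from the design list.

```python
from typing import Iterator, NamedTuple


class Split(NamedTuple):
    train_end: int  # training data is weeks [0, train_end)
    test_end: int   # test data is weeks [train_end, test_end)
    refit: bool     # whether the model is refit at this origin


def rolling_origin_splits(
    n_weeks: int,
    initial_window: int = 156,  # 3 years
    step: int = 4,
    test_window: int = 52,      # 1 year
    refit_every: int = 12,      # quarterly
) -> Iterator[Split]:
    """Yield expanding-window splits over a weekly series of length n_weeks.

    Assumption: the training window expands from week 0; a fixed-length
    sliding window would be an equally plausible reading of the design.
    """
    origin = initial_window
    while origin + test_window <= n_weeks:
        weeks_since_start = origin - initial_window
        yield Split(origin, origin + test_window, weeks_since_start % refit_every == 0)
        origin += step


# Example with a hypothetical 260-week (5-year) series length.
splits = list(rolling_origin_splits(n_weeks=260))
print(len(splits), sum(s.refit for s in splits))
```

With a step of 4 weeks and a refit interval of 12 weeks, every third origin triggers a refit; the origins in between reuse the most recently fitted model.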

Data Sources (REAL DATA ONLY)

  • Boerderij.nl API: Product NL.157.2086 (consumption potatoes) - git:31ab258
  • CBS API: Tables 80780NED (acreage), 85676NED (production) - version 2024-Q4
  • Open-Meteo API: Weather data (52.6°N, 5.7°E) for planting conditions - git:31ab258
  • Commodity APIs: Grain prices for competing crop analysis - latest version
  • NO synthetic, mock, or dummy data permitted

Experiment Runs

Variant A: Previous Season Price Response

  • Status: Not started
  • Model: Ridge regression with price elasticity features
  • Features: prev_harvest_price, price_volatility_planting, yoy_price_change, price_deviation_5y
  • Horizons: 1-month, 2-month, 9-month (focus on harvest impact)
  • Target: Test cobweb dynamics with -0.3 to -0.5 acreage elasticity
  • Expected: >5% MAPE improvement at 9-month horizon

Variant B: Planting Weather Windows

  • Status: Not started
  • Model: Random forest with weather-based planting features
  • Features: optimal_planting_days, planting_delay_indicator, accumulated_gdd_april, excess_rainfall_march
  • Horizons: 1-month, 2-month, 9-month
  • Target: Test if suboptimal planting conditions create predictable supply impacts
  • Expected: >7% improvement when planting conditions are extreme (>1 std dev)
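Two of the Variant B features can be sketched from daily weather records. This is an illustration under stated assumptions: the `DailyWeather` record, the 5 °C base temperature, and the 10 mm wet-day threshold are placeholders, not values taken from the experiment code. The sample rows are toy values to show the arithmetic only and are not project data.

```python
from datetime import date
from typing import NamedTuple, Sequence


class DailyWeather(NamedTuple):
    day: date
    t_min: float      # daily minimum temperature, °C
    t_max: float      # daily maximum temperature, °C
    precip_mm: float  # daily precipitation, mm


def accumulated_gdd(days: Sequence[DailyWeather], year: int, month: int,
                    base_temp_c: float = 5.0) -> float:
    """Sum of growing degree days in one calendar month.

    base_temp_c is an assumed placeholder; the log does not state the base.
    """
    total = 0.0
    for d in days:
        if d.day.year == year and d.day.month == month:
            total += max(0.0, (d.t_min + d.t_max) / 2.0 - base_temp_c)
    return total


def excess_rainfall_days(days: Sequence[DailyWeather], year: int, month: int,
                         threshold_mm: float = 10.0) -> int:
    """Count days in one month whose precipitation exceeds threshold_mm (assumed)."""
    return sum(
        1 for d in days
        if d.day.year == year and d.day.month == month and d.precip_mm > threshold_mm
    )


# Toy rows for illustration only (not Open-Meteo data).
sample = [
    DailyWeather(date(2024, 3, 1), 2.0, 8.0, 12.5),
    DailyWeather(date(2024, 3, 2), 3.0, 9.0, 0.0),
    DailyWeather(date(2024, 4, 1), 4.0, 14.0, 1.0),
    DailyWeather(date(2024, 4, 2), 1.0, 5.0, 0.0),
]
print(accumulated_gdd(sample, 2024, 4), excess_rainfall_days(sample, 2024, 3))
```

Counting wet days rather than summing millimetres matches the unit reported later in the log for excess_rainfall_March (days), though the actual threshold used there is not documented.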

Variant C: Acreage Allocation Model

  • Status: Not started
  • Model: Gradient boosting with multi-source features
  • Features: potato_grain_ratio, competing_crop_returns, cbs_acreage_estimate, acreage_trend
  • Horizons: 2-month, 9-month
  • Target: Test if combined price signals and acreage data improve long-term forecasts
  • Expected: >8% improvement at 9-month horizon when potato/grain ratio >1.5

Statistical Tests

  • Diebold-Mariano test with Harvey-Leybourne-Newbold correction
  • TOST equivalence test with SESOI = 5% improvement
  • Directional accuracy threshold = 60%
  • Bai-Perron test for planting regime changes
  • Bonferroni correction for multiple comparisons (3 variants × 3 horizons)
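The first and last tests in this list can be sketched in plain Python. This is a minimal illustration, not the project's implementation: the function names are hypothetical, squared-error loss is assumed, and the sign convention (positive means the candidate has lower loss) is a choice made here. The corrected statistic would be compared against a Student t distribution with n-1 degrees of freedom, which is left out to keep the sketch dependency-free.

```python
import math
from typing import Sequence


def bonferroni_alpha(alpha: float, n_comparisons: int) -> float:
    """Per-comparison significance level under Bonferroni correction."""
    return alpha / n_comparisons


def dm_hln_statistic(errors_baseline: Sequence[float],
                     errors_candidate: Sequence[float],
                     horizon: int = 1) -> float:
    """Diebold-Mariano statistic with the Harvey-Leybourne-Newbold correction.

    Uses squared-error loss (an assumption). Positive values mean the
    candidate forecast has lower average loss than the baseline.
    """
    n = len(errors_baseline)
    if n != len(errors_candidate) or n <= horizon:
        raise ValueError("error series must have equal length greater than horizon")

    # Loss differential: baseline loss minus candidate loss.
    d = [eb ** 2 - ec ** 2 for eb, ec in zip(errors_baseline, errors_candidate)]
    d_bar = sum(d) / n

    def autocov(k: int) -> float:
        return sum((d[t] - d_bar) * (d[t - k] - d_bar) for t in range(k, n)) / n

    # Long-run variance with autocovariances up to lag horizon-1.
    long_run_var = autocov(0) + 2.0 * sum(autocov(k) for k in range(1, horizon))
    if long_run_var <= 0:
        raise ValueError("non-positive long-run variance estimate")

    dm = d_bar / math.sqrt(long_run_var / n)
    hln = math.sqrt((n + 1 - 2 * horizon + horizon * (horizon - 1) / n) / n)
    return dm * hln


print(round(bonferroni_alpha(0.05, 3), 3), round(bonferroni_alpha(0.05, 9), 4))
```

Dividing 0.05 by three comparisons gives the α=0.017 quoted in the run records; dividing by all nine variant-horizon combinations would give roughly 0.0056.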

Regime Analysis

  • Planting stress years: 2018 (drought), 2021 (cold/wet), 2024 (excessive rain)
  • Normal planting years: 2016, 2017, 2019, 2020, 2022, 2023
  • High price signal years: 2022, 2024 (>20 EUR/100kg at harvest)
  • Test performance separately for each regime

Verdicts

Verdict v1 — 2025-08-16

Variant: A - Previous Season Price Response
Label: INCONCLUSIVE
Scope: NL consumption potatoes, 270-day horizon (harvest impact)
Effect: MAPE improvement = -166.5% (baseline RMSE=12.22, candidate RMSE=36.91)
Stats: DM p=0.089; HLN p=0.661; TOST within ±5.0%? False
Data/Code: git=git:31ab258; data=Boerderij.nl NL.157.2086 (REAL)
MLflow Run: f5f26c2e66e7436d97655f943fa13696
Notes: Model performed worse than the AR(2) baseline. Cobweb features did not capture price dynamics effectively at the 270-day horizon. The high HLN p-value (0.661) indicates no significant improvement. REAL data from Boerderij.nl API used throughout.

HE Notes

  • Created 2025-08-16 based on successful early-indicator patterns from FAMILY_PRODUCTION_CYCLE
  • Builds on volatility insights from FAMILY_SPRING_VOL showing planting period uncertainty
  • Cobweb model provides theoretical foundation for lagged supply response
  • Industry reports confirm April 15 as critical decision date for Dutch growers
  • All variants designed to use ONLY REAL DATA from repository interfaces

Decision Log

2025-08-16: Variant A Results

Variant A (Previous Season Price Response) tested the cobweb model hypothesis using REAL price data from Boerderij.nl API. The model created 12 features based on:
  • Previous harvest prices (Sep-Nov average)
  • Planting window volatility (Mar-Apr)
  • Year-over-year price changes
  • 5-year price deviations
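Three of the feature groups listed above could be derived roughly as follows. This is a sketch under assumptions: the function names are hypothetical, and expressing the year-over-year change and the 5-year deviation as relative differences is a guess at the definitions, which the log does not spell out. The numbers in the example are toy values, not Boerderij.nl prices.

```python
from datetime import date
from statistics import mean
from typing import Dict, List, Optional, Tuple


def harvest_price_by_year(prices: List[Tuple[date, float]]) -> Dict[int, float]:
    """Average price over September-November for each calendar year."""
    by_year: Dict[int, List[float]] = {}
    for day, price in prices:
        if day.month in (9, 10, 11):
            by_year.setdefault(day.year, []).append(price)
    return {year: mean(values) for year, values in by_year.items()}


def cobweb_features(harvest: Dict[int, float],
                    planting_year: int) -> Optional[Dict[str, float]]:
    """Features available at planting time, using only earlier harvests.

    Returns None when fewer than five prior harvest years are available.
    """
    prev = harvest.get(planting_year - 1)
    prev2 = harvest.get(planting_year - 2)
    history = [harvest[y] for y in range(planting_year - 5, planting_year) if y in harvest]
    if prev is None or prev2 is None or len(history) < 5:
        return None
    five_year_mean = mean(history)
    return {
        "prev_harvest_price": prev,
        # Relative definitions below are assumptions.
        "yoy_price_change": (prev - prev2) / prev2,
        "price_deviation_5y": (prev - five_year_mean) / five_year_mean,
    }


# Toy harvest averages for illustration only.
toy_harvest = {2019: 10.0, 2020: 12.0, 2021: 8.0, 2022: 20.0, 2023: 25.0}
print(cobweb_features(toy_harvest, 2024))
```

Restricting every feature to harvests before the planting year keeps the inputs free of look-ahead, which matters for a 270-day horizon that ends in the following harvest window.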

Outcome: INCONCLUSIVE
  • Model underperformed all baselines
  • At 270-day horizon: MAPE 155% vs baseline 58%
  • Statistical tests showed no significant improvement (HLN p=0.661)
  • Cobweb dynamics may be oversimplified or require additional supply-side data

Data Quality: Confirmed use of REAL data throughout
  • Boerderij.nl API provided 438 weeks of price data (2015-2024)
  • No synthetic or mock data used
  • Data quality validated through price range checks

Next Steps:
  • Consider testing Variant B (weather windows) for planting impact
  • Investigate if CBS acreage data integration could improve predictions
  • May need shorter horizons where cobweb effects are more direct

Run b6e7f74f — 2025-08-16

Variant: A - Previous Season Price Response
Label: REFUTED
Scope: NL consumption potatoes, 270-day horizon (harvest impact)
Effect: MAPE improvement = -121.3% (baseline RMSE=4.42, candidate RMSE=9.78)
Stats: DM p=0.000; HLN p=0.000; Bonferroni α=0.017; TOST within ±5.0%? False
Data/Code: git=58fcb18; data=Boerderij.nl API NL.157.2086 (REAL)
MLflow Run: b6e7f74f3e564d86a9a6e34111f2c7d2
Notes: Significantly worse after Bonferroni: -121.3% (p=0.000 < 0.017)
Model: Ridge Regression with 1228 data points from REAL sources only

Run 9633718d — 2025-08-16

Variant: B - Planting Weather Windows
Label: INCONCLUSIVE
Scope: NL consumption potatoes, 270-day horizon (harvest impact)
Effect: MAPE improvement = -124.2% (baseline RMSE=7.11, candidate RMSE=15.94)
Stats: DM p=0.011; HLN p=0.017; Bonferroni α=0.017; TOST within ±7.0%? False
Data/Code: git=58fcb18; data=Open-Meteo API + Boerderij.nl (REAL)
MLflow Run: 9633718d1ffc430e818581ede1a00996
Notes: Insufficient evidence after Bonferroni: p=0.017 >= 0.017
Model: Random Forest with 360 data points from REAL sources only

Run fd6fddd9 — 2025-08-16

Variant: C - Acreage Allocation Model
Label: INCONCLUSIVE
Scope: NL consumption potatoes, 270-day horizon (harvest impact)
Effect: MAPE improvement = -23.7% (baseline RMSE=8.05, candidate RMSE=9.96)
Stats: DM p=0.478; HLN p=0.504; Bonferroni α=0.017; TOST within ±8.0%? False
Data/Code: git=58fcb18; data=CBS CSV + Boerderij.nl API (REAL)
MLflow Run: fd6fddd98563413c95464dae7f3bf283
Notes: Insufficient evidence after Bonferroni: p=0.504 >= 0.017
Model: Gradient Boosting with 10 data points from REAL sources only
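The Effect percentages in the three run records above can be reproduced from the RMSE values printed next to them, as the short check below shows. Although the field is labelled "MAPE improvement", the figures match the relative change in RMSE; whether the pipeline computes them from RMSE or from MAPE is not stated in the log. The `label_run` rule is an assumption inferred from the Notes lines, and "SUPPORTED" is a hypothetical label for a branch that no run in this log reached.

```python
def relative_improvement_pct(baseline: float, candidate: float) -> float:
    """Positive when the candidate error is lower than the baseline error."""
    return (baseline - candidate) / baseline * 100.0


def label_run(improvement_pct: float, p_value: float, alpha: float = 0.017) -> str:
    """Assumed decision rule inferred from the Notes lines of the run records."""
    if p_value >= alpha:
        return "INCONCLUSIVE"
    # "SUPPORTED" is a hypothetical label; no run in this log reached it.
    return "REFUTED" if improvement_pct < 0 else "SUPPORTED"


# (baseline RMSE, candidate RMSE, HLN p-value) as reported in the run records.
RUNS = {
    "A": (4.42, 9.78, 0.000),
    "B": (7.11, 15.94, 0.017),
    "C": (8.05, 9.96, 0.504),
}

for variant, (base, cand, p) in RUNS.items():
    gain = relative_improvement_pct(base, cand)
    print(f"Variant {variant}: {gain:.1f}% -> {label_run(gain, p)}")
```

Variant B sits exactly on the rounded threshold (p=0.017 against α=0.017), which is why it is labelled INCONCLUSIVE rather than REFUTED despite a larger error increase than Variant A.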

Family-Level Verdict: FAMILY INCONCLUSIVE

Summary: All three variants (cobweb elasticity, weather windows, acreage allocation) failed to improve upon baseline forecasts at the 9-month horizon using REAL DATA from repository interfaces.

Key Findings:
  • Variant A (Cobweb): REFUTED - Ridge regression with previous season prices performed significantly worse (-121.3% improvement)
  • Variant B (Weather): INCONCLUSIVE - Random forest with weather features also underperformed (-124.2% improvement), but HLN p=0.017 did not fall below the Bonferroni threshold (α=0.017)
  • Variant C (Acreage): INCONCLUSIVE - Gradient boosting with CBS harvest data showed modest underperformance (-23.7% improvement)

Statistical Rigor: Bonferroni correction applied (α=0.017) across 3 variants. All experiments used ONLY REAL DATA from repository interfaces - no synthetic or mock data.

Mechanistic Insights:
  • Cobweb dynamics may require supply-side data beyond price signals
  • Weather impacts on planting may be too indirect for price prediction at 9-month horizon
  • CBS annual harvest data provides limited forecasting signal relative to weekly price volatility

Data Quality Confirmed: All variants used REAL DATA:
  • Boerderij.nl API: 1,281 weeks of consumption potato prices
  • Open-Meteo API: 3,653 days of weather data (52.6°N, 5.7°E)
  • CBS: 15 years of official harvest statistics

Recommendation: FAMILY_SEASONAL_PLANTING hypotheses require reformulation. Consider shorter forecasting horizons where planting decisions have more direct price impact, or integration with supply-side indicators beyond price and weather signals.

Verdict v3 — 2025-08-16

Variant: B - Planting Weather Windows (Re-run with cached precipitation data)
Label: INCONCLUSIVE
Scope: NL consumption potatoes, all horizons (30-day, 60-day, 270-day)
Effect: 30d: +53.0% vs ARIMA, 60d: +28.1% vs ARIMA, 270d: -62.1% vs ARIMA
Stats: DM p=1.000; HLN p=1.000; TOST within ±7.0%? False
Data/Code: git=unknown; data=Open-Meteo cached JSON with precipitation (REAL), Boerderij.nl NL.157.2086 (REAL)
MLflow Run: e1ed812532b344b786ad89edecd95262
Notes: Re-run using cached weather data WITH precipitation (previous run lacked precipitation). Strong short-term performance (74.8% improvement vs naive at 30-day) but severe degradation at harvest horizon. Weather features: optimal_planting_days (mean=8.5±5.1), planting_delay (mean=9.5±8.2 days), GDD_April (mean=123±42), excess_rainfall_March (mean=1.8±1.0 days), soil_moisture (mean=0.02±0.01). Model captured short-term weather impacts but failed to translate to harvest price predictions.

Verdict v4 — 2025-08-16

Variant: C - Acreage Allocation Model (Enhanced)
Label: INCONCLUSIVE
Scope: NL consumption potatoes, 270-day horizon (harvest impact)
Effect: Unable to evaluate - temporal mismatch between annual CBS data and weekly targets
Stats: DM p=1.000; HLN p=1.000; Bonferroni α=0.017; TOST within ±8.0%? False
Data/Code: git=58fcb18; data=CBS CSV acreage/harvest (REAL), Boerderij.nl NL.157.2086 (REAL)
MLflow Run: 09067ab6858d4bccbce81d7466c7afb8
Notes: Enhanced implementation using REAL CBS acreage data (25 years) and harvest data (15 years). Fundamental challenge: annual CBS statistics cannot effectively predict weekly price movements. Linear interpolation of annual features to weekly frequency resulted in zero variance within years. Model requires either: (1) annual price targets matching CBS data frequency, or (2) higher-frequency acreage indicators. All data was REAL from repository interfaces - NO synthetic data used.
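The frequency mismatch described in these notes, and remedy (1), can be illustrated with a small sketch. It assumes each annual CBS value is carried unchanged across all weeks of its calendar year; the log calls the step interpolation and does not show the exact method. Function names and all numbers are hypothetical toy values, not CBS or Boerderij.nl data.

```python
from datetime import date
from statistics import mean, pvariance
from typing import Dict, List, Tuple


def broadcast_annual_to_weekly(annual: Dict[int, float], weeks: List[date]) -> List[float]:
    """Repeat each year's value for every week in that calendar year (assumed method)."""
    return [annual[w.year] for w in weeks]


def within_year_variance(values: List[float], weeks: List[date]) -> Dict[int, float]:
    """Population variance of a weekly series, computed separately per year."""
    grouped: Dict[int, List[float]] = {}
    for w, v in zip(weeks, values):
        grouped.setdefault(w.year, []).append(v)
    return {year: pvariance(vals) for year, vals in grouped.items()}


def annual_mean_price(weekly: List[Tuple[date, float]]) -> Dict[int, float]:
    """Remedy (1): aggregate weekly prices to an annual target."""
    grouped: Dict[int, List[float]] = {}
    for day, price in weekly:
        grouped.setdefault(day.year, []).append(price)
    return {year: mean(vals) for year, vals in grouped.items()}


# Toy values for illustration only.
weeks = [date(2022, 1, 3), date(2022, 6, 6), date(2023, 1, 2), date(2023, 6, 5)]
acreage = {2022: 100.0, 2023: 110.0}
weekly_feature = broadcast_annual_to_weekly(acreage, weeks)
print(within_year_variance(weekly_feature, weeks))
```

A feature that is constant within each year cannot explain any week-to-week price movement inside that year, so a model trained on weekly targets can only use it to shift the yearly level. Aggregating prices to annual means aligns the target with the feature frequency, at the cost of leaving very few training rows.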

Run b6e7f74f — 2025-08-16

Variant: A - Previous Season Price Response
Label: REFUTED
Scope: NL consumption potatoes, 270-day horizon (harvest impact)
Effect: MAPE improvement = -121.3% (baseline RMSE=4.42, candidate RMSE=9.78)
Stats: DM p=0.000; HLN p=0.000; Bonferroni α=0.017; TOST within ±5.0%? False
Data/Code: git=3296fa3; data=Boerderij.nl API NL.157.2086 (REAL)
MLflow Run: b6e7f74f3e564d86a9a6e34111f2c7d2
Notes: Significantly worse after Bonferroni: -121.3% (p=0.000 < 0.017)
Model: Ridge Regression with 1228 data points from REAL sources only

Run 9633718d — 2025-08-16

Variant: B - Planting Weather Windows
Label: INCONCLUSIVE
Scope: NL consumption potatoes, 270-day horizon (harvest impact)
Effect: MAPE improvement = -124.2% (baseline RMSE=7.11, candidate RMSE=15.94)
Stats: DM p=0.011; HLN p=0.017; Bonferroni α=0.017; TOST within ±7.0%? False
Data/Code: git=3296fa3; data=Open-Meteo API + Boerderij.nl (REAL)
MLflow Run: 9633718d1ffc430e818581ede1a00996
Notes: Insufficient evidence after Bonferroni: p=0.017 >= 0.017
Model: Random Forest with 360 data points from REAL sources only

Run fd6fddd9 — 2025-08-16

Variant: C - Acreage Allocation Model
Label: INCONCLUSIVE
Scope: NL consumption potatoes, 270-day horizon (harvest impact)
Effect: MAPE improvement = -23.7% (baseline RMSE=8.05, candidate RMSE=9.96)
Stats: DM p=0.478; HLN p=0.504; Bonferroni α=0.017; TOST within ±8.0%? False
Data/Code: git=3296fa3; data=CBS CSV + Boerderij.nl API (REAL)
MLflow Run: fd6fddd98563413c95464dae7f3bf283
Notes: Insufficient evidence after Bonferroni: p=0.504 >= 0.017
Model: Gradient Boosting with 10 data points from REAL sources only

Codex Validation

Codex Validation — 2025-11-10

Files Reviewed

  • run_seasonal_planting_experiments.py
  • experiment.md
  • hypothesis.yml

Findings

  1. Real data only. The runner pulls Boerderij prices and Open-Meteo weather; no synthetic fallbacks are referenced.
  2. Experiments executed. Multiple MLflow runs on Aug 16 (IDs listed in experiment.md:69-182) cover variants A–C.
  3. Baselines still stronger. Every run is labeled “INCONCLUSIVE” or “REFUTED,” with negative MAPE improvements (e.g., Variant A −121 % vs AR baseline, Variant B −124 %). DM/HLN tests confirm no statistically significant gains over the price-only baselines.

Verdict

NOT VALIDATED – Despite using real data and running the pipeline, the seasonal planting features worsen forecast accuracy relative to the standard baselines, so the hypothesis remains unvalidated.