Note: this experiment has not yet been Codex-validated. Treat the findings as preliminary indications.

FAMILY_SEED_POTATO_FORWARD_SIGNALS: Experiment Log

This hypothesis tests whether seed potato prices act as 3-4 month forward indicators for consumption potato prices through cost transmission, acreage decisions, and market expectations. Seed potatoes represent 25-30% of production costs and embed forward-looking supply information. The hypothesis uses REAL DATA ONLY from repository interfaces.

Last updated: 2025-12-01
Repo path: hypotheses/FAMILY_SEED_POTATO_FORWARD_SIGNALS
Codex file: Present

Experiment notes

Hypothesis Origins

  • FAMILY_SEASONAL_PLANTING: INCONCLUSIVE, but showed that the planting period is critical
  • FAMILY_PRODUCTION_CYCLE: Used production data but overlooked seed price signals
  • Data Discovery: BoerderijApi contains seed potato prices (NL.157.seed codes) previously unutilized
  • Industry Evidence: Farmers report seed prices drive planting decisions; 25-30% of total costs
  • Academic Basis: Haile et al. (2016) on input prices and supply response; Kouyaté et al. (2016) on seed systems

Experiment Design

  • Method: Rolling-origin cross-validation
  • Training Window: 365 days minimum
  • Step Size: 7 days (weekly)
  • Test Window: 60 days maximum
  • Baselines: ALL mandatory standard baselines (persistent, seasonal_naive, ar2, historical_mean)
  • REAL DATA ONLY: BoerderijApi + CBS API
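
The design above could be turned into fold boundaries roughly as follows. This is a minimal sketch, assuming an expanding training window (the log only specifies a 365-day minimum) and plain calendar dates. `Fold` and `rolling_origin_folds` are hypothetical names for illustration, not functions from run_experiment.py.

```python
from datetime import date, timedelta
from typing import Iterator, NamedTuple


class Fold(NamedTuple):
    train_start: date
    train_end: date   # inclusive; this is the forecast origin
    test_start: date
    test_end: date    # inclusive


def rolling_origin_folds(
    first_day: date,
    last_day: date,
    min_train_days: int = 365,
    step_days: int = 7,
    max_test_days: int = 60,
) -> Iterator[Fold]:
    """Yield expanding-window folds; the origin advances by step_days."""
    origin = first_day + timedelta(days=min_train_days - 1)
    while origin < last_day:
        test_start = origin + timedelta(days=1)
        # The last folds get a shorter test window, cut at the end of the sample.
        test_end = min(origin + timedelta(days=max_test_days), last_day)
        yield Fold(first_day, origin, test_start, test_end)
        origin += timedelta(days=step_days)


# Example date range chosen only for illustration.
folds = list(rolling_origin_folds(date(2020, 1, 1), date(2021, 6, 30)))
print(len(folds), folds[0])
```

Each fold trains on everything up to the origin and evaluates on at most the next 60 days, so no test observation is ever visible during training. A sliding 365-day window would be an equally valid reading of the design and only changes `train_start`.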

Data Sources (REAL DATA ONLY)

  • Seed Prices: BoerderijApi - NL.157.seed varieties (legacy=true) - git:current (planned source; the 2025-08-19 assessment under Verdicts found no such codes)
  • Consumption Prices: BoerderijApi - NL.157.2086 consumption potatoes - git:current
  • Planted Area: CBS API - Table 85677NED for acreage validation - git:current
  • Biological Lag: 16 weeks (3-4 months) from planting to harvest

Experiment Runs

Variant A: Simple Forward Signal Model

  • Status: Not started
  • Model: RandomForest with seed prices at 3-4 month lags
  • Features: Seed prices at 12-16 week lags, momentum, seasonal flags
  • Horizons: 30-day, 60-day
  • Mechanism: Direct forward price transmission
  • Expected: 20-25% improvement over seasonal_naive
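
The lagged seed features for Variant A could be assembled along these lines. This is a sketch under stated assumptions: the series is weekly, momentum is defined as the relative change between the 16-week and 12-week lags (the log does not define it), and `lag_features` is a hypothetical helper. The price series in the example is made up purely to show the indexing and is not real data.

```python
from typing import Dict, List, Optional

LAGS_WEEKS = range(12, 17)  # 12-16 week lags, as listed for Variant A


def lag_features(seed: List[float], t: int) -> Optional[Dict[str, float]]:
    """Features for target week t, using only seed prices at least 12 weeks old.

    Returns None when the required lags fall outside the available series.
    """
    newest = t - min(LAGS_WEEKS)   # index of the 12-week lag
    oldest = t - max(LAGS_WEEKS)   # index of the 16-week lag
    if oldest < 0 or newest >= len(seed):
        return None
    feats = {f"seed_lag_{k}w": seed[t - k] for k in LAGS_WEEKS}
    # Assumed momentum definition: relative change across the lag window.
    feats["seed_momentum"] = seed[newest] / seed[oldest] - 1.0
    return feats


# Illustrative placeholder series only (NOT real seed prices).
seed_prices = [30.0 + 0.5 * week for week in range(30)]
row = lag_features(seed_prices, t=20)
print(row)
```

Because every feature is at least 12 weeks old at the target week, the same row can be built at forecast time for both the 30-day and 60-day horizons without leaking future information.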

Variant B: Cost Transmission Ratio Model

  • Status: Not started
  • Model: GradientBoosting with seed/consumption price ratios
  • Features: Price ratios, profitability signals, margin expectations
  • Horizons: 30-day, 60-day
  • Innovation: Ratio analysis reveals supply response incentives
  • Expected: 23-28% improvement (captures economic decision-making)

Variant C: Dynamic Expectations Model

  • Status: Not started
  • Model: Ensemble (GB 0.4, RF 0.4, Ridge 0.2) with market expectations
  • Features: Seed momentum, volatility, correlation dynamics, harvest expectations
  • Horizons: 30-day, 60-day
  • Complexity: Forward-looking market sentiment from seed trading
  • Expected: 25-30% improvement (highest due to expectation embedding)
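
The ensemble weights for Variant C amount to a weighted average of the three model predictions. A minimal sketch follows, assuming each model has already produced a point forecast. `blend` and the model keys are hypothetical names, and the forecast values are placeholders.

```python
from typing import Dict

# Weights as stated for Variant C: GB 0.4, RF 0.4, Ridge 0.2.
ENSEMBLE_WEIGHTS: Dict[str, float] = {"gb": 0.4, "rf": 0.4, "ridge": 0.2}


def blend(predictions: Dict[str, float], weights: Dict[str, float] = ENSEMBLE_WEIGHTS) -> float:
    """Weighted average of per-model forecasts; weights are renormalised to sum to 1."""
    if set(predictions) != set(weights):
        raise ValueError("predictions and weights must cover the same models")
    total = sum(weights.values())
    return sum(weights[name] * predictions[name] for name in weights) / total


# Placeholder forecasts, for illustration only.
blended = blend({"gb": 21.0, "rf": 19.0, "ridge": 20.0})
print(blended)
```

With scikit-learn, the same weighting could be expressed by passing `weights=[0.4, 0.4, 0.2]` to `VotingRegressor`; whether the runner does this is not stated in the log.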

Statistical Tests

  • Diebold-Mariano test with Harvey-Leybourne-Newbold correction
  • TOST equivalence test with SESOI = 20% improvement
  • Granger causality test for seed→consumption transmission
  • Cross-correlation analysis for optimal lag identification
  • FDR correction for multiple comparisons
  • ALL 4 standard baselines (persistent, seasonal_naive, ar2, historical_mean) included
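
For reference, the Diebold-Mariano statistic with the Harvey-Leybourne-Newbold small-sample correction can be computed as below. This is a stdlib-only sketch, not the runner's implementation: `dm_hln_statistic` is a hypothetical name, `h` is the forecast horizon in steps, and the loss values are placeholders. The result is meant to be compared against a Student-t distribution with n - 1 degrees of freedom.

```python
import math
from typing import Sequence


def dm_hln_statistic(loss_a: Sequence[float], loss_b: Sequence[float], h: int = 1) -> float:
    """HLN-corrected DM statistic; positive means model A has the larger mean loss."""
    if len(loss_a) != len(loss_b):
        raise ValueError("loss series must have equal length")
    d = [a - b for a, b in zip(loss_a, loss_b)]  # loss differential
    n = len(d)
    if not 1 <= h < n:
        raise ValueError("need 1 <= h < number of forecasts")
    d_bar = sum(d) / n

    def autocov(k: int) -> float:
        return sum((d[t] - d_bar) * (d[t - k] - d_bar) for t in range(k, n)) / n

    # Long-run variance using autocovariances up to lag h - 1.
    long_run_var = autocov(0) + 2.0 * sum(autocov(k) for k in range(1, h))
    if long_run_var <= 0:
        raise ValueError("non-positive long-run variance estimate")
    dm = d_bar / math.sqrt(long_run_var / n)
    # Harvey-Leybourne-Newbold small-sample correction factor.
    hln = math.sqrt((n + 1 - 2 * h + h * (h - 1) / n) / n)
    return dm * hln


# Placeholder squared errors, for illustration only.
squared_err_model = [4.0, 1.0, 9.0, 4.0]
squared_err_baseline = [1.0, 1.0, 4.0, 1.0]
stat = dm_hln_statistic(squared_err_model, squared_err_baseline, h=1)
print(stat)
```

For h = 1 the corrected statistic reduces to the ordinary paired t-statistic on the loss differential, which is a convenient sanity check. One such statistic per variant, horizon, and baseline yields the family of p-values to which the FDR correction would apply.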

Transmission Mechanism Analysis

  • Biological lag: 16 weeks (planting to harvest)
  • Cost share: 25-30% of production costs
  • Transmission rate: ~60% of seed cost changes pass through
  • Seed rate: 2,500 kg/ha typical planting density
  • Yield: 45 tons/ha average
  • Minimum margin: 25% for planting decision
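
The parameters above imply some simple arithmetic, sketched below. The combination rules are assumptions, not something the log states: seed cost per kg of output is taken as seed price times seed rate divided by yield, "tons" is read as metric tonnes (1,000 kg), and the pass-through rate is applied linearly. All function names are hypothetical.

```python
from typing import Tuple

SEED_RATE_KG_PER_HA = 2_500       # typical planting density
YIELD_KG_PER_HA = 45_000          # 45 tons/ha, assumed metric tonnes
PASS_THROUGH = 0.60               # ~60% of seed cost changes pass through
COST_SHARE_RANGE = (0.25, 0.30)   # seed share of production costs


def seed_cost_per_kg_output(seed_price_per_kg: float) -> float:
    """Seed cost carried by each kg of harvested potatoes."""
    return seed_price_per_kg * SEED_RATE_KG_PER_HA / YIELD_KG_PER_HA


def passed_through_change(seed_price_change_per_kg: float) -> float:
    """Assumed change in consumption price per kg for a given seed price change per kg."""
    return PASS_THROUGH * seed_cost_per_kg_output(seed_price_change_per_kg)


def cost_impact_range(seed_price_change_pct: float) -> Tuple[float, float]:
    """Passed-through change in total production cost (percent), for both cost shares."""
    low, high = COST_SHARE_RANGE
    return (low * PASS_THROUGH * seed_price_change_pct,
            high * PASS_THROUGH * seed_price_change_pct)


print(passed_through_change(0.10))
print(cost_impact_range(10.0))
```

Under these assumptions, each kg of output carries about 5.6% of the per-kg seed price (2,500 / 45,000), and a 10% seed price rise passes through as roughly 1.5-1.8% of production costs (25-30% share times 60%).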

Verdicts

Data Availability Assessment - 2025-08-19

Verdict: DATA_UNAVAILABLE
Issue: Insufficient seed potato price data for weekly/monthly forecasting

Data Investigation Results:

  • BoerderijApi: NO seed potato product codes found (checked NL.157.seed, NL.157.5200, NL.157.pootgoed)
  • Eurostat: Found seed potato prices (product code 05200000) but ANNUAL frequency only
  • Only 10 records available (2020-2022)
  • Annual data insufficient for 30/60-day forecasting horizons
  • Cannot support 3-4 month biological lag analysis at weekly resolution

Conclusion: FAMILY_SEED_POTATO_FORWARD_SIGNALS cannot be tested without weekly/monthly seed potato price data. Hypothesis requires either:

  1. Weekly seed potato prices from market sources (not available)
  2. Monthly seed potato prices from statistical offices (not found)
  3. Alternative proxy data for seed potato costs (would violate REAL_DATA_ONLY policy)

Recommendation: Archive hypothesis until appropriate data source is identified.

HE Notes

  • Created 2025-08-18 to exploit newly discovered seed potato price data
  • First analysis using seed prices for forward signaling
  • 3-4 month biological lag provides genuine forward visibility
  • Seed costs are the second-largest input after land
  • All variants use ONLY REAL DATA from BoerderijApi
  • SESOI set at 20% due to strong forward information content
  • Critical for early harvest price anticipation

Decision Log

(To be updated after experiment completion)

Codex validation

Codex Validation — 2025-11-10

Files Reviewed

  • run_experiment.py
  • experiment.md
  • hypothesis.yml

Findings

  1. Real seed index integrated. The runner now imports the INSEE IPAMPA “Semences et plants” series via pynsee and resamples it to weekly frequency instead of fabricating seed prices from consumption data.
  2. Consumption data also real. Boerderij weekly consumption prices continue to serve as the target series, so all inputs come from verified sources.
  3. Experiments still pending. experiment.md remains “DATA_UNAVAILABLE,” meaning no cross-validation or baseline comparisons have been executed yet.

Verdict

NOT VALIDATED – The synthetic fallback is gone and the code is connected to real INSEE + Boerderij data, but until the experiments are actually run and compared to price-only baselines, the hypothesis remains unvalidated.