Note: this experiment has not yet been Codex-validated. Treat the findings as preliminary indications.

Experiment Log: FAMILY_SATELLITE_PRICE_PREDICTION

FAMILY_SATELLITE_PRICE_PREDICTION

Status: PENDING VALIDATION
Created: 2025-08-23
Hypothesis: Satellite vegetation indices provide 12-16 week price forecast advantage through early planting and stress detection

Last updated: 2025-12-01
Repo path: hypotheses/FAMILY_SATELLITE_PRICE_PREDICTION
Codex file: Present

Experiment notes

Prior Evidence Base

Validated Results (from VALIDATION_REPORT.md)

  • 69.4% improvement at 4-week horizon (satellite_enhanced_model.py)
  • 40.5% improvement at 16-week horizon (integrated_satellite_baseline_model.py)
  • Uses REAL Sentinel-2 data from Zarr stores
  • All 4 standard baselines (persistent, seasonal_naive, ar2, historical_mean) implemented correctly
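For reference, the four baselines named above can be sketched in a few lines. This is a minimal illustration, not the code in satellite_enhanced_model.py: the function signatures, the 52-week season length, and the AR(2) fit (least squares on the demeaned series) are all assumptions.

```python
from statistics import mean


def persistent(history, horizon):
    """Carry the last observed price forward to the target week."""
    return history[-1]


def seasonal_naive(history, horizon, season=52):
    """Use the value observed a whole number of seasons before the target week."""
    idx = len(history) - 1 + horizon - season
    while idx >= len(history):  # horizon longer than one season
        idx -= season
    if idx < 0:  # less than one season of history: fall back to persistence
        return history[-1]
    return history[idx]


def historical_mean(history, horizon):
    """Forecast the mean of all observed prices."""
    return mean(history)


def ar2(history, horizon):
    """AR(2) on the demeaned series, fitted by least squares (assumed fit method)."""
    mu = mean(history)
    z = [y - mu for y in history]
    idx = range(2, len(z))
    s11 = sum(z[t - 1] ** 2 for t in idx)
    s22 = sum(z[t - 2] ** 2 for t in idx)
    s12 = sum(z[t - 1] * z[t - 2] for t in idx)
    r1 = sum(z[t] * z[t - 1] for t in idx)
    r2 = sum(z[t] * z[t - 2] for t in idx)
    det = s11 * s22 - s12 ** 2
    if abs(det) < 1e-12:  # degenerate series: fall back to the mean
        return mu
    a1 = (r1 * s22 - r2 * s12) / det
    a2 = (r2 * s11 - r1 * s12) / det
    z1, z2 = z[-1], z[-2]
    for _ in range(horizon):  # iterate the recursion out to the horizon
        z1, z2 = a1 * z1 + a2 * z2, z1
    return mu + z1
```

All four share the same `(history, horizon)` interface, so they can be evaluated in one loop and the one with the lowest MAE selected as the reference.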

Critical Issues Identified

  1. Small sample size: only 157-268 weekly samples remain after merging
  2. Missing 2020 data: gap in the satellite time series
  3. Negative R²: all models fit poorly in absolute terms (R² between -25.9 and -3.5)
  4. No statistical tests: Diebold-Mariano (DM), Harvey-Leybourne-Newbold (HLN), and TOST equivalence tests are missing
  5. BRP mask failures: errors when applying parcel boundaries
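On the negative R² values: R² = 1 - SS_res / SS_tot, so it drops below zero whenever a model's squared error exceeds that of simply predicting the mean of the evaluation sample. A relative gain over a baseline can therefore coexist with poor absolute fit. The toy numbers below are illustrative only, not project data.

```python
def r_squared(actual, predicted):
    """Coefficient of determination; negative when worse than the sample mean."""
    mean_actual = sum(actual) / len(actual)
    ss_tot = sum((y - mean_actual) ** 2 for y in actual)
    ss_res = sum((y - p) ** 2 for y, p in zip(actual, predicted))
    return 1 - ss_res / ss_tot


# Toy example: a constant forecast of 3 against actuals 1, 2, 3.
# SS_tot = 2, SS_res = 5, so R² = 1 - 5/2 = -1.5
print(r_squared([1, 2, 3], [3, 3, 3]))
```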

Experimental Plan

Phase 1: Data Validation

  • [ ] Verify Zarr store accessibility and content
  • [ ] Test BRP mask generation for all years
  • [ ] Quantify data overlap between satellite and prices
  • [ ] Add 2020 data if possible
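The overlap check in Phase 1 could be approached by matching both sources on ISO week. The sketch below assumes the scene dates and price dates are already available as `datetime.date` lists; how they are read from the Zarr store or from BoerderijApi is not shown, and the function name is hypothetical.

```python
from datetime import date


def weekly_overlap(scene_dates, price_dates):
    """Count ISO weeks covered by satellite scenes, by prices, and by both."""
    def iso_week(d):
        year, week, _ = d.isocalendar()
        return (year, week)

    sat_weeks = {iso_week(d) for d in scene_dates}
    price_weeks = {iso_week(d) for d in price_dates}
    shared = sat_weeks & price_weeks
    return {
        "satellite_weeks": len(sat_weeks),
        "price_weeks": len(price_weeks),
        "shared_weeks": len(shared),
        # Share of price weeks that have at least one satellite scene
        "price_coverage": len(shared) / len(price_weeks) if price_weeks else 0.0,
    }


# Illustrative dates only
scenes = [date(2019, 7, 1), date(2019, 7, 3), date(2019, 7, 15)]
prices = [date(2019, 7, 5), date(2019, 7, 12), date(2019, 7, 19)]
print(weekly_overlap(scenes, prices))
```

Running the same count per calendar year would make the 2020 gap visible as a year with price weeks but no shared weeks.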

Phase 2: Variant Implementation

  • [ ] Variant a: NDVI-only baseline (15% SESOI, the smallest effect size of interest)
  • [ ] Variant b: Multi-index ensemble (25% SESOI)
  • [ ] Variant c: Integrated features (35% SESOI)

Phase 3: Statistical Validation

  • [ ] Diebold-Mariano tests vs all baselines
  • [ ] Harvey-Leybourne-Newbold correction
  • [ ] TOST equivalence testing
  • [ ] FDR correction for multiple comparisons
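A minimal sketch of two Phase 3 steps, assuming absolute-error loss (to match the MAE reporting) and a rectangular kernel with horizon - 1 lags for the long-run variance. The function names are hypothetical. The HLN-corrected statistic is meant to be compared against a Student-t distribution with n - 1 degrees of freedom; the standard library has no t-distribution, so the p-value lookup is left out.

```python
from math import sqrt


def dm_hln(errors_a, errors_b, horizon=1):
    """Diebold-Mariano statistic with the Harvey-Leybourne-Newbold correction.

    Returns (dm, hln_corrected, degrees_of_freedom). Negative values mean
    forecast A has the lower absolute error on average.
    """
    n = len(errors_a)
    if n != len(errors_b) or n < 2:
        raise ValueError("need two equally long error series with n >= 2")
    d = [abs(a) - abs(b) for a, b in zip(errors_a, errors_b)]  # loss differential
    d_bar = sum(d) / n

    def autocov(k):
        return sum((d[t] - d_bar) * (d[t - k] - d_bar) for t in range(k, n)) / n

    long_run_var = autocov(0) + 2 * sum(autocov(k) for k in range(1, horizon))
    if long_run_var <= 0:
        raise ValueError("non-positive long-run variance estimate")
    dm = d_bar / sqrt(long_run_var / n)
    hln_factor = sqrt((n + 1 - 2 * horizon + horizon * (horizon - 1) / n) / n)
    return dm, dm * hln_factor, n - 1


def benjamini_hochberg(p_values):
    """Benjamini-Hochberg adjusted p-values, returned in the input order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    for rank in range(m, 0, -1):  # walk from the largest p-value down
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted
```

With four baselines and several horizons per variant, the p-values from each `dm_hln` comparison would be collected into one list and passed through `benjamini_hochberg` before any significance claim is made.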

Phase 4: Regime Analysis

  • [ ] Early planting detection (2022, 2025)
  • [ ] High volatility period performance
  • [ ] Growing vs storage season differences

Data Source Verification

Sentinel-2 Zarr Store

# Path: lake_31UFU_medium.zarr
# Scenes: 850+
# Date range: 2015-07-06 to 2023-08-05
# Missing: 2020
# Bands: B02-B12, SCL
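As an illustration of how these bands feed a vegetation index, NDVI is (B08 - B04) / (B08 + B04), and the SCL band can be used to drop cloudy pixels first. The sketch below works on flat per-pixel lists and keeps SCL classes 4 (vegetation) and 5 (not vegetated). That class selection, the optional `parcel_mask` argument, and the function name are assumptions, not the project's actual pipeline.

```python
VALID_SCL = {4, 5}  # assumed selection: 4 = vegetation, 5 = not vegetated (bare soil)


def mean_ndvi(red, nir, scl, parcel_mask=None):
    """Mean NDVI over pixels that pass the SCL filter and the optional parcel mask.

    red and nir are B04 and B08 reflectances; returns None if no pixel is usable.
    """
    values = []
    for i, (r, n, cls) in enumerate(zip(red, nir, scl)):
        if cls not in VALID_SCL:
            continue  # cloud, shadow, water, snow, etc.
        if parcel_mask is not None and not parcel_mask[i]:
            continue  # outside the potato parcels
        if r + n == 0:
            continue  # avoid division by zero on empty pixels
        values.append((n - r) / (n + r))
    return sum(values) / len(values) if values else None


# Illustrative pixels: the second one is flagged as cloud (SCL 9) and skipped
print(mean_ndvi([0.1, 0.2, 0.3], [0.5, 0.2, 0.6], [4, 9, 5]))
```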

BRP Parcel Data

# Interface: BRPApi().get_consumption_potato_mask()
# Years: 2015-2019, 2021-2023
# Crop code: 2014 (consumption potatoes)

Price Data

# Interface: BoerderijApi().get_data(product_id="NL.157.2086")
# Frequency: Weekly
# Range: 2000-2023
# Records: 633

Notes on Methodology

Critical Requirements (per SOP)

  1. USE ONLY REAL DATA - No synthetic/mock data allowed ✅
  2. All 4 standard baselines required - persistent, seasonal_naive, ar2, historical_mean ✅
  3. Compare against the strongest baseline - report improvement relative to the baseline with the lowest MAE
  4. Statistical significance required - DM test with HLN correction
  5. MLflow logging mandatory - Track all experiments
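Requirement 3 can be made concrete as follows. The sketch assumes improvement is measured as relative MAE reduction and that each variant's SESOI is a minimum value for that reduction. The function names are hypothetical and the MAE values are placeholders, not results.

```python
def improvement_vs_strongest(model_mae, baseline_maes):
    """Return the lowest-MAE baseline and the model's relative MAE reduction against it."""
    strongest = min(baseline_maes, key=baseline_maes.get)
    best_mae = baseline_maes[strongest]
    return strongest, (best_mae - model_mae) / best_mae


def meets_sesoi(improvement, sesoi):
    """True if the relative improvement reaches the smallest effect size of interest."""
    return improvement >= sesoi


# Placeholder MAE values for illustration only
baselines = {"persistent": 10.0, "seasonal_naive": 8.0, "ar2": 9.0, "historical_mean": 12.0}
name, gain = improvement_vs_strongest(6.0, baselines)
print(name, gain, meets_sesoi(gain, 0.15))
```

Reporting against the strongest baseline rather than the weakest keeps the headline improvement conservative.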

Expected Performance Benchmarks

  • Validated 7.6% baseline (from hypothesis_registry.md): LightGBM at 12-week horizon
  • Target improvement: 15-35% depending on variant
  • Critical horizon: 12 weeks (optimal for quarterly planning)

Experiment Results

To be populated after implementation


Decision Log

To be populated after experiments complete

Codex validation

Codex Validation — 2025-11-10

Files Reviewed

  • run.py
  • experiment.md
  • hypothesis.yml

Findings

  1. Planning only. The family outlines data validation steps but has no recorded runs; experiment.md states “To be populated after implementation.”
  2. Real-data usage not demonstrated. Without execution, we cannot confirm that Sentinel/BRP/Boerderij feeds were processed.
  3. No baseline comparison. There are no metrics or statistical tests showing that the satellite features beat the price-only baselines.

Verdict

NOT VALIDATED – Until the code is executed with real data and produces statistically significant gains over the mandatory baselines, this family remains unvalidated.