Note: this experiment has not yet been Codex-validated. Treat the findings as preliminary indications.

FAMILY_STORAGE_INFORMATION_ASYMMETRY

Last updated: 2025-12-01
Repo path: hypotheses/FAMILY_STORAGE_INFORMATION_ASYMMETRY
Codex file: Present

Experiment notes

FAMILY_STORAGE_INFORMATION_ASYMMETRY: Experiment Log

Overview

Testing information asymmetry mechanisms in storage markets where operators possess private knowledge about quality deterioration, storage costs, and optimal release timing that creates predictable patterns 2-6 weeks ahead of spot price movements.

Hypothesis Origins

  • Prior experiments:
      • FAMILY_WEATHER_ACCUMULATION (92.4% improvement) demonstrated value of cumulative patterns
      • FAMILY_STORAGE_DECAY/OPTIMIZATION failed by ignoring information dynamics
      • FAMILY_CBS_NOWCASTING showed market inefficiency in processing information
  • Industry catalyst:
      • October 2024 storage crisis where operators knew problems 3-4 weeks before price spike
      • February 2023 quality issues discovered by operators weeks before market
      • Trader observations: "Watch releases, not announcements"
  • Academic basis:
      • Kyle (1985) informed trader models
      • Working (1949) storage with private information
      • Information economics (Akerlof 1970)

Experiment Design

  • Method: Rolling-origin cross-validation
  • Initial window: 365 days (1 year minimum)
  • Step size: 7 days (weekly)
  • Test windows: 10 horizons maximum
  • Refit frequency: Every 4 weeks
  • Baselines: ALL 4 MANDATORY - persistent, seasonal_naive, ar2, historical_mean
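The design above can be sketched as a rolling-origin loop over a weekly series. This is a minimal illustration, not the code in run.py: the function names are hypothetical, the 365-day initial window is expressed as 52 weekly observations, the 7-day step as one observation, and the 4-week refit schedule is omitted because the three baselines shown have nothing to refit. The ar2 baseline is left out for brevity.

```python
from statistics import mean


def rolling_origin_folds(n_obs, initial=52, step=1, horizon=4, max_folds=None):
    """Return (train_end, target_index) pairs for rolling-origin evaluation.

    Each fold trains on observations [0, train_end) and forecasts the
    observation `horizon` steps after the last training point.
    """
    folds = []
    train_end = initial
    while train_end + horizon - 1 < n_obs:
        folds.append((train_end, train_end + horizon - 1))
        if max_folds is not None and len(folds) == max_folds:
            break
        train_end += step
    return folds


def persistent(history, horizon):
    # Carry the last observed value forward.
    return history[-1]


def seasonal_naive(history, horizon, season=52):
    # Value observed one season before the target week.
    return history[len(history) + horizon - 1 - season]


def historical_mean(history, horizon):
    return mean(history)


def mape(actual, predicted):
    return 100.0 * mean(abs(a - p) / abs(a) for a, p in zip(actual, predicted))


def evaluate(series, forecaster, **fold_kwargs):
    """MAPE of `forecaster` over all rolling-origin folds of `series`."""
    folds = rolling_origin_folds(len(series), **fold_kwargs)
    actual = [series[target] for _, target in folds]
    predicted = [forecaster(series[:end], target - end + 1) for end, target in folds]
    return mape(actual, predicted)
```

A model under test would plug into `evaluate` through the same `(history, horizon)` signature, so every candidate is scored on exactly the folds the baselines see.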

Data Sources (REAL DATA ONLY - NO SYNTHETIC/MOCK DATA)

  • CBS API: Table 84506NED - monthly storage stocks (git:current)
  • Boerderij.nl API:
      • NL.157.2086 (consumption potatoes) - weekly prices
      • NL.157.2083 (fries potatoes) - weekly prices for quality spreads
  • Open-Meteo API: Hourly temperature/humidity at 52.55°N, 5.55°E
  • Version control: All sources at git:current, pinned at experiment runtime

Experiment Runs

Variant A: Storage Release Signaling

  • Status: Completed 2025-08-18
  • Model: Gradient Boosting
  • Features: Release volumes (1-2w lags), acceleration, small release indicators, cumulative 4w patterns
  • Information lead: 2-4 weeks
  • Target: Test if early small releases signal larger movements
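As a rough illustration of the Variant A feature set, the sketch below derives lagged, acceleration, small-release, and cumulative features from a weekly release series. The function name, the feature names, and the `small_threshold` parameter are hypothetical, and the exact definitions used in the run are not stated in this log. Every feature for week t uses only data through week t-1, so no look-ahead leaks into the target week.

```python
def release_features(releases, small_threshold):
    """Build one feature row per week t >= 4 from a weekly release series."""
    rows = []
    for t in range(4, len(releases)):
        lag1, lag2, lag3 = releases[t - 1], releases[t - 2], releases[t - 3]
        rows.append({
            "t": t,
            "lag_1w": lag1,
            "lag_2w": lag2,
            # Second difference: change in the week-over-week change.
            "acceleration": (lag1 - lag2) - (lag2 - lag3),
            # Flag a non-zero release at or below the threshold.
            "small_release": int(0 < lag1 <= small_threshold),
            # Total released over the four weeks before t.
            "cum_4w": sum(releases[t - 4:t]),
        })
    return rows
```

The same function would accept any weekly series standing in for release volumes, such as a price-volatility proxy.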

Variant B: Quality Information Asymmetry

  • Status: Completed 2025-08-18
  • Model: Random Forest
  • Features: Quality spreads, spread widening rate, temperature/humidity stress, storage duration
  • Information lead: 2-6 weeks
  • Target: Test if operators know deterioration before market discovery

Variant C: Inventory Position Revelation

  • Status: Completed 2025-08-18
  • Model: Ensemble (GB + RF + Ridge)
  • Features: Large holder proxy, release clustering, inventory drawdown, market power, strategic timing
  • Information lead: 3-4 weeks
  • Target: Test if large holder positioning reveals through cumulative patterns

Statistical Tests

  • Diebold-Mariano test with Harvey-Leybourne-Newbold correction
  • TOST equivalence test with SESOI = 10% improvement
  • Chow test for storage season regime breaks (Oct, Jan)
  • CUSUM for gradual information revelation
  • FDR correction for multiple comparisons
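For the first test in the list, a minimal sketch of the Diebold-Mariano statistic with the Harvey-Leybourne-Newbold small-sample correction follows. It assumes per-fold loss series as inputs (for example absolute percentage errors) and a rectangular window of horizon-1 autocovariance lags; the function name is hypothetical and the loss function used in the run is not stated in this log. The corrected statistic is conventionally compared against a Student-t distribution with n-1 degrees of freedom, which is not computed here.

```python
import math


def dm_statistic_hln(loss_model, loss_baseline, horizon=1):
    """HLN-corrected Diebold-Mariano statistic for two per-fold loss series.

    Negative values mean the model's average loss is lower than the baseline's.
    """
    n = len(loss_model)
    if n != len(loss_baseline) or n <= horizon:
        raise ValueError("need equally long loss series with more than `horizon` points")

    # Loss differential per fold and its mean.
    d = [lm - lb for lm, lb in zip(loss_model, loss_baseline)]
    d_bar = sum(d) / n

    def autocov(k):
        return sum((d[t] - d_bar) * (d[t - k] - d_bar) for t in range(k, n)) / n

    # Long-run variance of the mean differential (lags 1 .. horizon-1).
    var_d_bar = (autocov(0) + 2.0 * sum(autocov(k) for k in range(1, horizon))) / n
    if var_d_bar <= 0:
        raise ValueError("non-positive variance estimate; statistic undefined")

    dm = d_bar / math.sqrt(var_d_bar)
    # Harvey-Leybourne-Newbold small-sample correction factor.
    correction = math.sqrt((n + 1 - 2 * horizon + horizon * (horizon - 1) / n) / n)
    return dm * correction
```

With 20 folds and a 4-week horizon, the correction shrinks the raw statistic noticeably, which is why it matters for short evaluation samples like the ones in this experiment.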

Verdicts

Run 2025-08-18: All Variants - Initial Implementation

Variant A: Storage Release Signaling

Data Versions:
  • Boerderij.nl prices: 2021-01-01 to 2024-12-31 (168 weekly observations)
  • CBS storage data: Not available (used price volatility proxies)
  • Git SHA: exp/FAMILY_SEASONAL_PLANTING/variants_abc

Rolling CV Results:
  • Training window: 52 weeks minimum
  • Test periods: 20 folds
  • Horizon: 4 weeks ahead

Model Performance:
  • Model MAPE: 22.36%
  • Model RMSE: Not calculated
  • Directional accuracy: Not calculated

Baseline Comparison:
  • Model: MAPE = 22.36%
  • Persistent baseline: MAPE = 22.04% (improvement: -1.4%)
  • Seasonal naive baseline: MAPE = 40.16% (improvement: +44.3%)
  • AR2 baseline: MAPE = 23.56% (improvement: +5.1%)
  • Naive baseline: MAPE = 22.04% (improvement: -1.4%)
  • Strongest competitor: persistent (22.04%)
  • Primary improvement: -1.4% vs persistent baseline
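The improvement percentages above are consistent with a relative MAPE reduction, (baseline - model) / baseline. The log does not state the formula, so the sketch below is an assumption. Because it starts from the rounded MAPE values printed here, the last digit can differ from the logged figure.

```python
def improvement_pct(baseline_mape, model_mape):
    """Relative MAPE reduction in percent; negative means the model is worse."""
    return 100.0 * (baseline_mape - model_mape) / baseline_mape


model_mape = 22.36  # Variant A
baselines = {"persistent": 22.04, "seasonal_naive": 40.16, "ar2": 23.56}

for name, baseline_mape in baselines.items():
    print(f"{name}: {improvement_pct(baseline_mape, model_mape):+.1f}%")
```

The sign convention matters when reading the verdicts: a variant counts as beating a baseline only when this number is positive, and the primary comparison is always against the strongest baseline.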

Statistical Tests:
  • DM test vs persistent: p-value = 0.7551 (not significant)
  • SESOI threshold: 10%
  • Practical significance: NO

Verdict: REFUTED (worse than baseline)

Caveats:
  • Model performed worse than simple persistent baseline
  • No actual storage volume data available (CBS table didn't have the expected data)
  • Used price volatility as proxy for release patterns, which may not capture true information signals
  • Information lead of 2-4 weeks not validated due to poor performance


Variant B: Quality Information Asymmetry

Data Versions:
  • Boerderij.nl consumption prices: 2021-01-01 to 2024-12-31
  • Boerderij.nl fries prices: 2021-01-01 to 2024-12-31
  • Quality spread calculated from price differential
  • Weather data: Failed to load (connection issue)
  • Git SHA: exp/FAMILY_SEASONAL_PLANTING/variants_abc

Rolling CV Results:
  • Training window: 52 weeks minimum
  • Test periods: 20 folds
  • Horizon: 4 weeks ahead

Model Performance:
  • Model MAPE: 23.72%
  • Random Forest with quality spread features

Baseline Comparison:
  • Model: MAPE = 23.72%
  • Persistent baseline: MAPE = 22.04% (improvement: -7.6%)
  • Seasonal naive baseline: MAPE = 40.16% (improvement: +40.9%)
  • AR2 baseline: MAPE = 23.56% (improvement: -0.7%)
  • Naive baseline: MAPE = 22.04% (improvement: -7.6%)
  • Strongest competitor: persistent (22.04%)
  • Primary improvement: -7.6% vs persistent baseline

Statistical Tests:
  • DM test vs persistent: p-value = 0.9875 (not significant)
  • SESOI threshold: 12%
  • Practical significance: NO

Verdict: REFUTED (worse than baseline)

Caveats:
  • Quality spread signal not predictive at 2-6 week horizon
  • Weather data unavailable, limiting deterioration modeling
  • Model performed worse than the persistent baseline (MAPE 23.72% vs 22.04%), although the DM test found no statistically significant difference
  • Information asymmetry hypothesis not supported by empirical evidence


Variant C: Inventory Position Revelation

Data Versions:
  • Boerderij.nl NL prices: 2021-01-01 to 2024-12-31
  • Attempted BE/DE cross-market data but no overlapping observations in date range
  • Market microstructure proxied through bid-ask spreads
  • Git SHA: exp/FAMILY_SEASONAL_PLANTING/variants_abc

Rolling CV Results:
  • Training window: 52 weeks minimum
  • Test periods: 20 folds
  • Horizon: 4 weeks ahead

Model Performance:
  • Ensemble model (GB 40% + RF 40% + Ridge 20%)
  • Model MAPE: 22.89%
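The 40/40/20 weighting can be read as a fixed-weight average of the three models' forecasts. The sketch below assumes exactly that; the function name and dictionary keys are hypothetical, and the log does not say whether the weights were applied to predictions or set in some other way.

```python
WEIGHTS = {"gb": 0.4, "rf": 0.4, "ridge": 0.2}


def blend(predictions, weights=WEIGHTS):
    """Weighted average of per-model forecast lists of equal length."""
    if set(predictions) != set(weights):
        raise ValueError("predictions and weights must cover the same models")
    if abs(sum(weights.values()) - 1.0) > 1e-9:
        raise ValueError("weights must sum to 1")
    lengths = {len(p) for p in predictions.values()}
    if len(lengths) != 1:
        raise ValueError("all models must forecast the same number of points")
    n_points = lengths.pop()
    return [
        sum(weights[name] * predictions[name][i] for name in weights)
        for i in range(n_points)
    ]
```

For example, `blend({"gb": [10.0], "rf": [20.0], "ridge": [30.0]})` weights the two tree models equally and gives Ridge half their influence.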

Baseline Comparison:
  • Model: MAPE = 22.89%
  • Persistent baseline: MAPE = 22.04% (improvement: -3.8%)
  • Seasonal naive baseline: MAPE = 40.16% (improvement: +43.0%)
  • AR2 baseline: MAPE = 23.56% (improvement: +2.9%)
  • Naive baseline: MAPE = 22.04% (improvement: -3.8%)
  • Strongest competitor: persistent (22.04%)
  • Primary improvement: -3.8% vs persistent baseline

Statistical Tests:
  • DM test vs persistent: p-value = 0.7312 (not significant)
  • SESOI threshold: 11%
  • Practical significance: NO

Verdict: REFUTED (worse than baseline)

Caveats:
  • Cross-market data not available for the test period
  • Large holder positioning proxied through price patterns only
  • Ensemble model still underperformed simple baselines
  • Information revelation hypothesis not supported

MLflow Experiment: FAMILY_STORAGE_INFORMATION_ASYMMETRY
Artifacts: CV results saved as csv files
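With three DM p-values reported for this run (0.7551, 0.9875, 0.7312), the FDR correction named under Statistical Tests can be illustrated. The log does not say which procedure was used, so the sketch below assumes Benjamini-Hochberg; the function name is hypothetical.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values in the original order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


dm_p_values = [0.7551, 0.9875, 0.7312]  # Variants A, B, C vs persistent
adjusted = benjamini_hochberg(dm_p_values)
```

Since every raw p-value is already far above any conventional threshold, the adjustment cannot change the verdicts; it only confirms that none of the three comparisons is significant after correction.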

HE Notes

  • Created 2025-08-18 focusing on INFORMATION ASYMMETRY not physical processes
  • Key innovation: Exploits revelation patterns over 2-6 week windows
  • Differentiator: Unlike failed storage families, focuses on information timing advantages
  • All variants use ONLY REAL DATA from repository interfaces
  • Critical validation periods: 2024 Q4 crisis, 2023 Q1 quality issues
  • Expected 10-15% improvement through information lead time advantages

Decision Log

2025-08-18: Initial Run - All Variants REFUTED

Summary: All three variants of the storage information asymmetry hypothesis were refuted. Models performed worse than the simple persistent baseline.

Key Findings:
  1. Data Limitations: CBS storage volume data was not available in the expected format. Had to use price volatility as proxy for storage releases, which likely doesn't capture true information signals.
  2. Baseline Strength: The persistent baseline (random walk) proved remarkably strong for 4-week ahead predictions, suggesting limited predictability from information asymmetry features.
  3. Missing Cross-Market Data: International price data (BE/DE) had no overlapping observations with the test period, limiting cross-market information flow analysis.
  4. Weather Data Issues: Connection problems prevented incorporation of temperature/humidity stress features for quality deterioration modeling.

Lessons Learned:
  • Information asymmetry may exist but does not manifest in predictable price patterns at 2-6 week horizons
  • Quality spreads (consumption vs fries prices) showed no predictive information about future spot prices in this run
  • Market microstructure proxies from bid-ask spreads were insufficient to capture large holder positioning

Next Steps:
  1. Data Enhancement: Need actual CBS storage volume data (may be in different table)
  2. Shorter Horizons: Information advantage may be more relevant at 1-2 week horizons rather than 4 weeks
  3. Alternative Hypothesis: Consider that markets may be more efficient than expected in incorporating storage information

Verdict: FAMILY_STORAGE_INFORMATION_ASYMMETRY hypothesis REFUTED across all variants. The hypothesized information revelation patterns over 2-6 week windows do not provide predictive power beyond simple baselines.

Codex validation

Codex Validation — 2025-11-10

Files Reviewed

  • run.py
  • experiment.md
  • hypothesis.yml

Findings

  1. Real data only. The runner pulls CBS table 84506NED, Boerderij NL.157.2086/2083, and Open-Meteo weather without synthetic fallbacks.
  2. Experiments executed. experiment.md:66-222 logs three MLflow-backed runs (Variants A–C) completed on 2025-08-18.
  3. Baseline superiority not met. Every variant is marked “REFUTED (worse than baseline)” because its MAPE exceeded that of the persistent baseline; Variant B also trailed the AR2 baseline.

Verdict

NOT VALIDATED – Despite using real data and running end-to-end, the storage information asymmetry features fail to outperform the price-only baselines, so the hypothesis remains unvalidated.