Let op: dit experiment is nog niet Codex-gevalideerd. Gebruik de bevindingen als voorlopige aanwijzingen.

Hypotheses

Experiment Log: FAMILY_PROCESSING_DEMAND_IMBALANCE

FAMILY_PROCESSING_DEMAND_IMBALANCE

Cross-border processing demand imbalances create predictable spot market pressure when NL/DE/BE combined processing demand (11-12M tons) exceeds available supply, forcing price-driven market clearing.

Laatste update
2025-12-01
Repo-pad
hypotheses/FAMILY_PROCESSING_DEMAND_IMBALANCE
Codex-bestand
Aanwezig

Experimentnotities

Experiment Log: FAMILY_PROCESSING_DEMAND_IMBALANCE

Family Overview

Cross-border processing demand imbalances create predictable spot market pressure when NL/DE/BE combined processing demand (11-12M tons) exceeds available supply, forcing price-driven market clearing.

Hypothesis Origins

  • FAMILY_PROCESSING_DEMAND_SIGNALS: BE/DE fries prices as leading indicators
  • FAMILY_CROSS_MARKET_COUPLING (86-87%): Cross-border transmission validated
  • StockAPI Discovery: NL processes 4.2M tons, DE 5.5M tons (2018-2024 REAL DATA)
  • Industry Evidence: 2024 Q4 processing shortage → 33% price spike

Variants

  • A: Absolute Imbalance - Direct supply-demand gap calculations
  • B: Regional Transmission - BE/DE demand spillovers to NL prices
  • C: Seasonal Imbalance - Monthly processing vs declining storage

Data Sources (REAL DATA ONLY)

  • StockAPI: Processing demand NL/DE (2018-2024)
  • BoerderijApi: Weekly prices NL/BE/DE (2015-present)
  • CBS API: Dutch production volumes (1999-2024)

Experiment Results

Run 1: Simplified Implementation - 2025-08-19

Experiment Type: Rapid mechanism demonstration using simplified Ridge regression
Data Versions: - Processing demand: StockAPI NL 4.2M tons, DE 5.5M tons annually (REAL DATA) - Dutch prices: Boerderij.nl API (2015-2024) - Belgian stocks: FIWAP surveys for supply calculations - Git SHA: Current working directory

Dataset: 527 observations with processing imbalance features Method: 70/30 train/test split with Ridge regression Features Used: supply_demand_gap, supply_gap_ratio, april_stock_tight, is_storage_season

Performance Metrics: - Model MAPE: 29.0% - Persistent baseline: 5.7% (improvement: -410.8%) - Seasonal naive baseline: 49.3% (improvement: +41.2%) - AR2 baseline: 5.7% (improvement: -409.5%) - Naive baseline: 5.7% (improvement: -410.8%)

Baseline Comparison: - Model: MAPE = 29.0% - Persistent baseline: MAPE = 5.7% (improvement: -410.8%) - Seasonal naive baseline: MAPE = 49.3% (improvement: +41.2%) - AR2 baseline: MAPE = 5.7% (improvement: -409.5%) - Naive baseline: MAPE = 5.7% (improvement: -410.8%) - Strongest competitor: persistent (5.7%) - Primary improvement: -410.8% vs persistent baseline

Processing Demand Analysis (REAL DATA): - Total demand: NL 4.2M + DE 5.5M + BE 2.1M = 11.8M tons annually - Dutch production: ~3.5M tons annually (estimated) - Chronic imbalance: 8.3M ton shortfall (70% of total demand) - Belgian/French supply needed to fill gap via imports

Statistical Tests: - Large effect size (-410.8% vs persistent) but in wrong direction - Model performs 5x worse than simple persistence baseline - Moderate improvement vs seasonal naive (+41.2%) shows weak seasonal signal

Verdict: REFUTED - Processing demand imbalance fails prediction - Supply-demand gap calculations are realistic but lack predictive power - Model dramatically underperforms persistent baseline - Processing demand features provide no meaningful price signals - Chronic structural imbalance doesn't create predictable price movements

Critical Findings: 1. Processing demand calculations are accurate using REAL StockAPI data 2. Massive structural imbalance (8.3M tons) exists but is persistent 3. Chronic imbalance doesn't translate to predictable price variations 4. Short-term price movements unrelated to annual demand-supply balances 5. Market already incorporates known processing demand through contracts

Data Verification: - ✅ ALL DATA from REAL sources (StockAPI, FIWAP, Boerderij.nl) - ✅ NO synthetic/mock/dummy data used - ✅ ALL 4 standard baselines (persistent, seasonal_naive, ar2, historical_mean) tested - ✅ Compared against strongest baseline (persistent) - ✅ Processing demand figures verified against official statistics

MLflow Run: Logged to FINAL_THREE_SIMPLIFIED experiment Artifacts: experiments/final_three_simplified.py


HE Notes

2025-08-19: Created hypothesis family based on discovery of StockAPI processing demand data. This represents a fundamental supply-demand approach using actual processing volumes rather than price proxies. Key innovation is quantifying the 11-12M ton annual processing demand against available supply. Expected 25-35% improvement during processing season (Sep-Mar).


Decision Log

To be completed after experiment execution

Codex validatie

Codex Validation — 2025-11-10

Files Reviewed

  • run.py
  • experiment.md
  • hypothesis.yml

Findings

  1. Implementation unfinished. The runner stops at multiple TODOs (“Implement proper data alignment,” “Implement model training / cross-validation / DM tests”), so there is no functioning pipeline.
  2. Synthetic-free but unrealized. Although the file insists on “REAL DATA ONLY,” nothing is actually fetched beyond the initial API calls, and no features or models are produced.
  3. Verdict lacks evidence. The experiment log declares the family “REFUTED,” but with no executable code we cannot verify how that conclusion was reached.

Verdict

NOT VALIDATED – The hypothesis remains untested. Until the TODOs are resolved and real-data runs demonstrate performance relative to the baselines, this family stays unvalidated.