Hypotheses
FAMILY_IMPORT_FLOWS: Experiment Log
Experiment Notes
Overview
Testing whether import/export flow dynamics and cross-border price differentials create predictable price patterns in Dutch potato markets through supply augmentation, arbitrage opportunities, and lagged price transmission.
Hypothesis Origins
- Prior experiments: FAMILY_NW_MARKET showed regional price impacts but was limited to NL data; FAMILY_STORAGE_DECAY identified 650k tons lost driving import needs
- Industry catalyst: 2024 storage crisis with 33.2% import dependency (vs 21.4% in 2023), Mintec price hit €37.5/100kg (highest since 2014)
- Academic basis: Devaux et al. (2022) showing 96-99% price correlation yet exploitable spreads; Cambridge (2024) ML models confirming 10% differentials trigger flows within 2-3 weeks
Experiment Design
- Method: Rolling-origin cross-validation
- Initial window: 156 weeks (3 years)
- Step size: 4 weeks
- Test windows: Varies by horizon (1m, 2m, 9m)
- Refit frequency: 8-12 weeks depending on variant
- Baselines: Naive seasonal, ARIMA, linear trend
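The split schedule above can be sketched as follows. This is a minimal illustration, not the repository's implementation: the function name `rolling_origin_splits` is hypothetical, the horizon is assumed to be expressed in weeks (for example, 1 month as roughly 4 weeks), and the log does not say whether the training window expands or slides, so both are supported.

```python
from typing import Iterator, Tuple


def rolling_origin_splits(
    n_obs: int,
    initial_window: int = 156,  # 3 years of weekly observations (from the design above)
    step: int = 4,              # origin advances 4 weeks per fold (from the design above)
    horizon: int = 4,           # ASSUMPTION: test window length in weeks
    expanding: bool = True,     # ASSUMPTION: the log does not specify expanding vs sliding
) -> Iterator[Tuple[range, range]]:
    """Yield (train_indices, test_indices) for rolling-origin cross-validation."""
    origin = initial_window
    # Stop as soon as a full test window no longer fits in the data.
    while origin + horizon <= n_obs:
        train_start = 0 if expanding else origin - initial_window
        yield range(train_start, origin), range(origin, origin + horizon)
        origin += step


if __name__ == "__main__":
    for train_idx, test_idx in rolling_origin_splits(n_obs=170, horizon=4):
        print(f"train {train_idx.start}-{train_idx.stop - 1}, "
              f"test {test_idx.start}-{test_idx.stop - 1}")
```

Every test index lies strictly after its training indices, so no fold leaks future prices into the fit. Refitting every 8-12 weeks would correspond to refitting on every second or third fold at a 4-week step.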
Data Sources (REAL DATA ONLY)
- Boerderij.nl API: Product NL.157.2086 (Dutch consumption potatoes)
- CBS API: Table 80416NED (diesel prices as transport cost proxy)
- Open-Meteo API: Weather data for storage conditions (variant C only)
- Version control: git:31ab258, CBS 2024-Q4
- CRITICAL: NO synthetic data - using transport costs and price patterns as import proxies
Experiment Runs
Variant A: Import Volume Impact Model
- Status: Not started
- Model: Linear/Ridge regression with import pressure proxy
- Features: transport_cost_index, price_volatility, storage_month, lagged prices
- Horizons: 1-month, 2-month
- Target: Test if import pressure (>25% dependency) predicts 8-12% price increases
Variant B: Cross-Border Arbitrage Model
- Status: Not started
- Model: Threshold regression with regime switching
- Features: price_differential_proxy, transport_threshold (€12/ton), arbitrage_signal
- Horizons: 1-month, 2-month
- Target: Test if spreads exceeding transport costs predict 5-8% convergence
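One way the Variant B features could be derived is sketched below. The function bodies are assumptions for illustration only. The log states just that the proxy is based on a 52-week mean of NL prices and that the threshold is €12/ton. Because prices are quoted per 100 kg, the threshold is converted first: 1 ton is 10 × 100 kg, so €12/ton equals €1.20/100kg.

```python
from statistics import mean
from typing import List, Optional

TRANSPORT_THRESHOLD_EUR_PER_TON = 12.0
# 1 ton = 10 x 100 kg, so the same threshold per 100 kg is 1.20 EUR.
TRANSPORT_THRESHOLD_EUR_PER_100KG = TRANSPORT_THRESHOLD_EUR_PER_TON / 10.0


def price_differential_proxy(prices: List[float], window: int = 52) -> List[Optional[float]]:
    """Deviation of each weekly price from the mean of the preceding `window` weeks.

    ASSUMPTION: the window excludes the current week; the log does not say.
    Returns None while fewer than `window` past observations are available.
    """
    out: List[Optional[float]] = []
    for t, price in enumerate(prices):
        if t < window:
            out.append(None)
        else:
            out.append(price - mean(prices[t - window:t]))
    return out


def arbitrage_signal(diff: Optional[float],
                     threshold: float = TRANSPORT_THRESHOLD_EUR_PER_100KG) -> int:
    """+1 if the proxy exceeds the transport threshold, -1 if below its negative, else 0."""
    if diff is None:
        return 0
    if diff > threshold:
        return 1
    if diff < -threshold:
        return -1
    return 0


if __name__ == "__main__":
    weekly_prices = [20.0] * 52 + [22.0, 20.5]  # EUR/100kg, illustrative values only
    diffs = price_differential_proxy(weekly_prices)
    print([arbitrage_signal(d) for d in diffs[-2:]])
```

Because this proxy uses Dutch prices only, it measures deviation from the recent domestic level rather than an actual cross-border spread.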
Variant C: Combined Transport-Adjusted Model
- Status: Not started
- Model: Ensemble (Ridge + RF + XGBoost)
- Features: Combined from A+B plus storage_depletion, crisis_indicator
- Horizons: 1-month, 2-month, 9-month
- Target: Test if combined model achieves >5% RMSE improvement
Statistical Tests
- Diebold-Mariano test with Harvey-Leybourne-Newbold correction
- TOST equivalence test with SESOI = 5% improvement (0.075 EUR/100kg)
- Regime detection: Threshold (A), Markov-switching (B), Bai-Perron (C)
- FDR correction for variant C (multiple comparisons)
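For reference, the Diebold-Mariano statistic with the Harvey-Leybourne-Newbold small-sample correction can be computed as sketched below. The function name is hypothetical, and the sketch assumes squared-error loss and the usual rectangular-kernel long-run variance with h-1 autocovariance lags. The log does not state which loss or kernel the experiments used. To stay dependency-free, the p-value is not computed here. It would come from a Student-t distribution with n-1 degrees of freedom.

```python
import math
from typing import Sequence


def dm_hln_statistic(errors_model: Sequence[float],
                     errors_baseline: Sequence[float],
                     horizon: int = 1) -> float:
    """HLN-corrected Diebold-Mariano statistic under squared-error loss.

    Negative values mean the model has lower loss than the baseline.
    """
    if len(errors_model) != len(errors_baseline):
        raise ValueError("error series must have equal length")
    # Loss differential d_t = e_model^2 - e_baseline^2 (ASSUMPTION: squared-error loss).
    d = [em ** 2 - eb ** 2 for em, eb in zip(errors_model, errors_baseline)]
    n = len(d)
    d_bar = sum(d) / n

    def autocov(k: int) -> float:
        return sum((d[t] - d_bar) * (d[t - k] - d_bar) for t in range(k, n)) / n

    # Long-run variance using autocovariances up to lag h-1.
    long_run_var = autocov(0) + 2.0 * sum(autocov(k) for k in range(1, horizon))
    if long_run_var <= 0.0:
        raise ValueError("non-positive long-run variance estimate")
    dm = d_bar / math.sqrt(long_run_var / n)
    # Harvey-Leybourne-Newbold small-sample correction factor.
    hln = math.sqrt((n + 1 - 2 * horizon + horizon * (horizon - 1) / n) / n)
    return dm * hln


if __name__ == "__main__":
    model = [0.5, -0.5, 0.5, -0.5, 0.5, -0.5]      # illustrative forecast errors
    baseline = [1.0, -2.0, 1.0, -2.0, 1.0, -2.0]
    print(round(dm_hln_statistic(model, baseline, horizon=1), 4))
```

At longer horizons the variance estimate can turn non-positive in small samples. The sketch raises an error in that case rather than returning a misleading statistic.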
Verdicts
(No runs completed yet)
HE Notes
- Created 2025-08-16 based on RA literature review
- Builds on FAMILY_NW_MARKET and FAMILY_STORAGE_DECAY findings
- 2024 storage crisis provides natural validation period
- Using transport costs and price differentials as proxies since direct trade volumes may not be API-accessible
- All variants use ONLY REAL DATA from repository APIs
- Consider separate crisis period (2024) analysis if patterns differ significantly
Decision Log
Summary of Results (2025-08-16)
All three variants of the FAMILY_IMPORT_FLOWS hypothesis were tested using REAL DATA from repository APIs (Boerderij.nl for prices, CBS for diesel/transport costs).
Key Findings:
- Variant A (Import Volume Impact): INCONCLUSIVE - Mixed results, with the 30-day horizon showing worse performance (-15.4%) but the 60-day horizon showing improvement (33.6%). Statistical significance only at 30-day (p=0.022).
- Variant B (Cross-Border Arbitrage): REFUTED - Model performed significantly worse than baselines at all horizons, indicating that a price differential proxy based on the 52-week mean does not capture arbitrage opportunities.
- Variant C (Combined Model): REFUTED - Ensemble approach with interaction features performed worse than simple baselines, suggesting overfitting or feature engineering issues.
Data Limitations Identified:
1. Used transport cost (diesel prices) as proxy for import pressure due to lack of direct trade volume API access
2. Price differential proxy derived from NL prices only (52-week deviation) rather than actual cross-border spreads
3. Weekly price data resampled to daily may introduce autocorrelation artifacts
Next Steps:
1. Consider reformulating with better proxies or when direct trade volume data becomes available
2. Investigate why simple baselines (especially linear trend) perform well in this market
3. May need regime-specific models for crisis periods (2024) vs normal market conditions
4. Explore alternative feature engineering approaches that better capture import dynamics
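The resampling concern in the third limitation can be illustrated with a small sketch. It assumes the weekly series was forward-filled to daily frequency; the log does not say which resampling method was used, and the helper names are hypothetical.

```python
from typing import List


def weekly_to_daily_ffill(weekly: List[float]) -> List[float]:
    """Repeat each weekly price for 7 days (ASSUMPTION: forward-fill resampling)."""
    return [price for price in weekly for _ in range(7)]


def lag1_autocorr(x: List[float]) -> float:
    """Sample lag-1 autocorrelation."""
    m = sum(x) / len(x)
    num = sum((x[t] - m) * (x[t - 1] - m) for t in range(1, len(x)))
    den = sum((v - m) ** 2 for v in x)
    return num / den


def share_unchanged_days(x: List[float]) -> float:
    """Share of days on which 'yesterday's price' predicts today's price exactly."""
    hits = sum(1 for t in range(1, len(x)) if x[t] == x[t - 1])
    return hits / (len(x) - 1)


if __name__ == "__main__":
    weekly = [20.0, 22.0, 19.0, 23.0, 21.0, 24.0]  # EUR/100kg, illustrative values only
    daily = weekly_to_daily_ffill(weekly)
    print("weekly lag-1 autocorrelation:", round(lag1_autocorr(weekly), 3))
    print("daily  lag-1 autocorrelation:", round(lag1_autocorr(daily), 3))
    print("share of days with no price change:", round(share_unchanged_days(daily), 3))
```

Under forward-fill, most consecutive days carry identical prices, so day-level persistence looks far stronger than anything in the weekly data. Evaluating on the native weekly frequency would avoid the effect.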
Verdict - Variant A - 2025-08-16
- Label: INCONCLUSIVE - No valid results
- Scope: Dutch potato spot prices (horizons not recorded)
- Effect: (none recorded)
- Stats: (none recorded)
- Data/Code: git=31ab258; data=Boerderij.nl NL.157.2086, CBS 80416NED
- MLflow Run: 39cdae97a1104edc818de03b248ff1bc
- Notes: Using REAL DATA ONLY from repository APIs. Transport costs as import proxy.
Verdict - Variant B - 2025-08-16
- Label: INCONCLUSIVE - No valid results
- Scope: Dutch potato spot prices (horizons not recorded)
- Effect: (none recorded)
- Stats: (none recorded)
- Data/Code: git=31ab258; data=Boerderij.nl NL.157.2086, CBS 80416NED
- MLflow Run: eb7597acfeae4ff8b8e2c57f10a6fb96
- Notes: Using REAL DATA ONLY from repository APIs. Transport costs as import proxy.
Verdict - Variant A - 2025-08-16
- Label: INCONCLUSIVE
- Scope: Dutch potato spot prices, 30d, 60d horizons
- Effect: 30d: -15.4%, 60d: 33.6%
- Stats: 30d DM p=0.022; 60d DM p=0.267
- Data/Code: git=31ab258; data=Boerderij.nl NL.157.2086, CBS 80416NED
- MLflow Run: 604c4a1b2f374ea5b67eeb7bbdd49e14
- Notes: Using REAL DATA ONLY from repository APIs. Transport costs as import proxy.
Verdict - Variant B - 2025-08-16
- Label: REFUTED
- Scope: Dutch potato spot prices, 30d, 60d horizons
- Effect: 30d: -inf%, 60d: -966.0%
- Stats: 30d DM p=0.000; 60d DM p=0.000
- Data/Code: git=31ab258; data=Boerderij.nl NL.157.2086, CBS 80416NED
- MLflow Run: 2c42a3ac5bd7430fa9133509e821cbcf
- Notes: Using REAL DATA ONLY from repository APIs. Transport costs as import proxy.
Verdict - Variant C - 2025-08-16
- Label: REFUTED
- Scope: Dutch potato spot prices, 30d, 60d, 270d horizons
- Effect: 30d: -inf%, 60d: -243.3%, 270d: -178.2%
- Stats: 30d DM p=0.000; 60d DM p=0.000; 270d DM p=0.000
- Data/Code: git=31ab258; data=Boerderij.nl NL.157.2086, CBS 80416NED
- MLflow Run: 067d68e1132b41579d10e66139ededa1
- Notes: Using REAL DATA ONLY from repository APIs. Transport costs as import proxy.
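The log does not define how the "Effect" percentages are computed. A plausible reading, assumed here and not confirmed by the source, is relative RMSE improvement over the baseline, with positive values meaning the model wins. Under that assumption, a value of -inf% would arise when the baseline RMSE is exactly zero. The sketch below makes that case explicit; the function names are hypothetical.

```python
import math
from typing import Sequence


def rmse(errors: Sequence[float]) -> float:
    """Root mean squared error of a sequence of forecast errors."""
    return math.sqrt(sum(e * e for e in errors) / len(errors))


def pct_improvement(rmse_model: float, rmse_baseline: float) -> float:
    """Relative RMSE improvement in percent (ASSUMED definition of 'Effect').

    Positive: model beats baseline. Negative: model is worse.
    A zero baseline RMSE makes the ratio undefined; it is reported as -inf
    when the model has any error at all, and as 0.0 when both are exact.
    """
    if rmse_baseline == 0.0:
        return 0.0 if rmse_model == 0.0 else -math.inf
    return 100.0 * (rmse_baseline - rmse_model) / rmse_baseline


if __name__ == "__main__":
    # A SESOI of 5% equal to 0.075 EUR/100kg implies a baseline RMSE of
    # 0.075 / 0.05 = 1.5 EUR/100kg (derived from the stated SESOI, not reported directly).
    print(pct_improvement(rmse_model=1.425, rmse_baseline=1.5))
    print(pct_improvement(rmse_model=0.5, rmse_baseline=0.0))
```

If this reading is right, the -inf% entries at 30d deserve a check of the baseline errors before the REFUTED label is taken at face value. A baseline with zero error on the test window usually signals an evaluation artifact rather than a perfect forecast.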
Decision Log
2025-08-16: Initial Experiment Results
Overall Assessment: REFUTED/INCONCLUSIVE
Key Findings:
- Variant A (Import Volume Impact): INCONCLUSIVE - Mixed results: 30d worse than baseline (-15.4%, DM p=0.022), 60d better (33.6%) but not statistically significant (DM p=0.267)
- Variant B (Cross-Border Arbitrage): REFUTED - Model performed worse than baselines
- Variant C (Combined Model): REFUTED - Ensemble approach failed to improve predictions
Data Limitations:
- No direct access to CBS trade volume data (table 70017eng)
- Used diesel prices as transport cost proxy, which may not capture true import dynamics
- Lack of actual Belgian/German price data limited arbitrage modeling
- Weekly data resampled to daily may have introduced artifacts
Lessons Learned:
1. Transport costs alone are insufficient proxies for import pressure
2. Without actual cross-border price data, arbitrage models cannot function properly
3. The success of FAMILY_NW_MARKET was likely due to having actual regional price data
Next Steps:
- Consider abandoning this hypothesis family unless direct trade data becomes available
- Focus on hypotheses with stronger data support (e.g., FAMILY_STORAGE_DECAY variants B/C showing 92-93% improvement)
- Explore demand-side indicators or futures market integration as alternative directions