Green & Soon: Are Climate Model Forecasts Useful for Policy Making?

Effect of Variable Choice on Reliability and Predictive Validity

Statistical fit alone does not make a model useful for policy decisions. It is also necessary to show that the model provides out-of-estimation-sample forecasts that are more accurate and reliable than those from plausible alternative models, including a simple benchmark.

The UN’s IPCC advises governments with forecasts of global average temperature drawn from models based on hypotheses of causality. Specifically, the models assume manmade warming, principally from carbon dioxide emissions (Anthro), tempered by the effects of volcanic eruptions (Volcanic) and by variations in the Sun’s energy (Solar). Out-of-sample forecasts from that model, with and without the IPCC’s favoured measure of Solar, were compared with forecasts from models that excluded human influence and included Volcanic and one of two independent measures of Solar. The models were used to forecast Northern Hemisphere land temperatures and, to avoid urban heat island effects, rural-only temperatures. Benchmark forecasts were obtained by extrapolating estimation-sample median temperatures.
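The benchmark comparison described above can be sketched in a few lines of Python. This is a minimal illustration, not the study's code: the function names are hypothetical, the numbers are synthetic, and the use of mean absolute error as the accuracy measure is an assumption made for the example.

```python
from statistics import median


def median_benchmark_forecast(estimation_sample, horizon):
    """Flat forecast: extrapolate the estimation-sample median over the horizon."""
    level = median(estimation_sample)
    return [level] * horizon


def mean_absolute_error(forecasts, actuals):
    """Average absolute difference between forecasts and observed values."""
    if len(forecasts) != len(actuals) or not actuals:
        raise ValueError("forecasts and actuals must be non-empty and equal in length")
    return sum(abs(f - a) for f, a in zip(forecasts, actuals)) / len(actuals)


def relative_mae(model_forecasts, benchmark_forecasts, actuals):
    """Model error divided by benchmark error; below 1 means the model beat the benchmark."""
    return (mean_absolute_error(model_forecasts, actuals)
            / mean_absolute_error(benchmark_forecasts, actuals))


# Synthetic temperature anomalies, for illustration only (not data from the study).
estimation = [0.1, -0.2, 0.0, 0.3, 0.1]   # estimation sample
holdout = [0.2, 0.4, 0.3]                 # out-of-sample observations
candidate = [0.25, 0.35, 0.3]             # forecasts from a hypothetical causal model

benchmark = median_benchmark_forecast(estimation, len(holdout))
ratio = relative_mae(candidate, benchmark, holdout)
```

A ratio above 1 would indicate that the candidate model forecast less accurately than simply extrapolating the median, which is the kind of outcome the comparison is designed to detect.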

The independent solar models reduced forecast errors relative to those of the benchmark model for all eight combinations of four estimation periods and the two temperature variables tested. The models that included the IPCC’s Anthro variable reduced errors for only three of the eight combinations and produced extreme forecast errors from most model estimation periods. The correlation between estimation sample statistical fit and forecast accuracy was -0.26.
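As a rough sketch of how a correlation between in-sample fit and out-of-sample accuracy might be computed, the Pearson coefficient can be calculated over one pair of scores per model and estimation period. The function name and the values below are hypothetical and synthetic. They do not reproduce the study's figures, and the choice of fit and accuracy measures is an assumption.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length, at least 2 items each")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / sqrt(var_x * var_y)


# Synthetic scores, one pair per (model, estimation period); higher is better for both.
in_sample_fit = [0.2, 0.4, 0.6, 0.8]           # e.g. an R-squared-style fit measure
out_of_sample_accuracy = [0.9, 0.5, 0.7, 0.3]  # e.g. an accuracy score on held-out data

r = pearson_r(in_sample_fit, out_of_sample_accuracy)
```

A negative coefficient, as in this synthetic case, means that the better-fitting models tended to forecast less accurately.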

Further tests might identify better models: only one extrapolation model and only two of many possible independent solar models were tested, and combinations of forecasts from different methods were not examined.

The anthropogenic models’ unreliability would appear to void their policy relevance. In practice, even the models validated in this study may fail to improve accuracy relative to naïve forecasts because the future values of the causal variables are themselves uncertain. Our findings emphasise that out-of-sample forecast errors, not statistical fit, should be used to choose between models (hypotheses).