Publication

Gap-filling eddy covariance methane fluxes: Comparison of machine learning model predictions and uncertainties at FLUXNET-CH4 wetlands

Authors

Irvin, Jeremy; Zhou, Sharon; McNicol, Gavin; Lu, Fred; Liu, Vincent; Fluet-Chouinard, Etienne; Ouyang, Zutao; Knox, Sara; Lucas-Moffat, Antje; Trotta, Carlo; Sullivan, Ryan

Abstract

Time series of methane fluxes measured by eddy covariance require gap-filling to estimate annual emissions. Gap-filling methane fluxes is challenging because of high variability and complex responses to multiple drivers. To date, there is no widely established gap-filling standard for methane, with regard to both the best model algorithms and the best predictors. In this study, we address the need for standardization by synthesizing results of gap-filling methods applied at 17 wetland sites spanning boreal to tropical regions including all major wetland classes and two rice paddies. We introduce new procedures for: 1) creating realistic artificial gap scenarios, 2) training and evaluating gap-filling models without overstating performance, and 3) predicting half-hourly methane fluxes and annual emissions with robust uncertainty estimates. We tested a conventional method (marginal distribution sampling) and four machine learning algorithms - penalized linear regression, artificial neural networks, random forests, and boosted decision trees - and four predictor sets, including temporal, meteorological, ecosystem carbon and energy flux, and soil predictors. We find that the conventional method can achieve similar median performance to the machine learning models but is worse than the best machine learning models and relatively insensitive to predictor choices. Of the machine learning models, decision tree algorithms performed the best in cross-validation experiments, even with a baseline predictor set, and artificial neural networks showed comparable performance when using all predictors. Soil temperature was frequently the most important predictor whilst water table depth was important at sites with substantial water table fluctuations, highlighting the value of data on soil conditions. Raw gap-filling uncertainties from the machine learning models were underestimated and we propose a method to calibrate uncertainties to observations. Finally, we gap-fill and provide summary evaluation metrics for all 81 sites in the FLUXNET-CH4 community dataset and publicly release the Python code for model development, evaluation, and uncertainty estimation.
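The idea of calibrating underestimated uncertainties to observations can be illustrated with a minimal sketch. It is not the procedure from the study, which the abstract does not spell out. It assumes a single multiplicative scale factor applied to the raw predicted standard deviations, chosen so that a nominal 95% interval covers 95% of held-out observations. The function names (`coverage`, `calibrate_scale`), the Gaussian-style interval multiplier `z`, and the synthetic residuals are all hypothetical choices made for illustration.

```python
import math
import random


def coverage(residuals, sigmas, scale, z=1.96):
    """Fraction of observations inside the +/- z * scale * sigma interval."""
    inside = sum(abs(r) / (z * s) <= scale for r, s in zip(residuals, sigmas))
    return inside / len(residuals)


def calibrate_scale(residuals, sigmas, target=0.95, z=1.96):
    """Smallest scale factor whose interval coverage reaches the target."""
    # An observation is covered as soon as scale >= |residual| / (z * sigma).
    ratios = sorted(abs(r) / (z * s) for r, s in zip(residuals, sigmas))
    k = math.ceil(target * len(ratios))  # number of points that must be covered
    return ratios[k - 1]


# Synthetic held-out set (assumption): true error spread is 2.0,
# but the model reports a raw standard deviation of 1.0.
rng = random.Random(0)
residuals = [rng.gauss(0.0, 2.0) for _ in range(1000)]
sigmas = [1.0] * len(residuals)

raw_cov = coverage(residuals, sigmas, scale=1.0)
scale = calibrate_scale(residuals, sigmas)
cal_cov = coverage(residuals, sigmas, scale=scale)

print(f"raw coverage: {raw_cov:.3f}")
print(f"scale factor: {scale:.3f}")
print(f"calibrated coverage: {cal_cov:.3f}")
```

In this sketch the scale factor is fitted and checked on the same synthetic points for brevity. In practice the factor would be fitted on one held-out set and its coverage verified on another, so that the calibration itself is not overstated.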