%0 Journal Article
%T GlobalWheatYield4km: a global wheat yield dataset at 4-km resolution during 1982–2020 based on deep learning approach
%A Luo, Yuchuan
%A Zhang, Zhao
%A Cao, Juan
%A Zhang, Liangliang
%A Zhang, Jing
%A Han, Jichong
%A Zhuang, Huimin
%A Cheng, Fei
%A Xu, Jialu
%A Tao, Fulu
%J EGUsphere
%D 2022
%V 2022
%P 1-21
%F pub.1153614644
%X Accurate and spatially explicit information on global crop yield is paramount for guiding policy-making and ensuring food security. However, most public datasets are at coarse resolution in both space and time. Here, we used data-driven models to develop a 4-km dataset of global wheat yield (GlobalWheatYield4km) from 1982 to 2020. First, we proposed a phenology-based approach to map spatial distributions of spring and winter wheat. Then we determined the optimal grid-scale yield estimation model by comparing the performance of two data-driven models (i.e., Random Forest (RF) and Long Short-Term Memory (LSTM)), with publicly available data (i.e., satellite and climatic data from the Google Earth Engine (GEE) platform, soil properties, and subnational-level census data covering 11000 political units). The results showed that GlobalWheatYield4km captured 82 % of yield variations with RMSE of 619.8 kg/ha across all subnational regions and years. In addition, our dataset had a higher accuracy (R² 0.71) as compared with Spatial Production Allocation Model (SPAM) (R² 0.49) across all subnational regions and three years. The GlobalWheatYield4km dataset might play important roles in modelling crop system and assessing climate impact over larger areas (DOI of the referenced dataset: https://doi.org/10.6084/m9.figshare.10025006; Luo et al., 2022b).
%R 10.5194/essd-2022-423
%U https://essd.copernicus.org/preprints/essd-2022-423/essd-2022-423.pdf
%U https://app.dimensions.ai/details/publication/pub.1153614644
%U https://doi.org/10.5194/essd-2022-423