Which of the following descriptions best represent the overarching design of your forecasting model?
- Machine learning-based weather prediction.
What techniques did you use to initialise your model? (For example: data sources and processing of initial conditions)
Daily ERA5 data with multiple variables are used to initialise Fengshun. Z-score normalization is applied to all input and output variables.
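For reference, a minimal sketch of the z-score normalization step is given below. The function names and the per-variable statistics are illustrative assumptions, with the means and standard deviations taken to be computed over the ERA5 training period; this is not the exact preprocessing code used for Fengshun.

```python
import numpy as np

def zscore_normalise(x, mean, std, eps=1e-8):
    """Normalise one variable to zero mean and unit variance.

    x    : array of shape (..., lat, lon) for a single variable
    mean : per-variable mean over the ERA5 training period
    std  : per-variable standard deviation over the same period
    """
    return (x - mean) / (std + eps)

def zscore_denormalise(x_norm, mean, std, eps=1e-8):
    """Invert the normalisation to recover physical units."""
    return x_norm * (std + eps) + mean

# Illustrative usage with a hypothetical 2-m temperature field (values in K).
t2m = 270.0 + 20.0 * np.random.rand(121, 240)   # placeholder daily ERA5 field
t2m_mean, t2m_std = 278.5, 21.3                 # hypothetical training-period statistics
t2m_norm = zscore_normalise(t2m, t2m_mean, t2m_std)
```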
If any, what data does your model rely on for real-time forecasting purposes?
Daily ERA5 was used for real-time forecasting, although it only becomes available with a delay of several days.
What types of datasets were used for model training? (For example: observational datasets, reanalysis data, NWP outputs or satellite data)
Only daily ERA5 was used for training.
Please provide an overview of your final ML/AI model architecture (For example: key design features, specific algorithms or frameworks used, and any pre- or post-processing steps)
Fengshun follows the architecture of the previously published FuXi-S2S model and consists of three primary components: an encoder, a perturbation module, and a decoder. The encoder processes predicted weather parameters from the two preceding time steps, with each time step representing 1 day. Specifically, the inputs are passed through a two-dimensional convolution layer with a kernel size of two, which halves the spatial dimensions of the input data. Hidden features are then derived from 12 repeated transformer blocks. The input to the encoder is a data cube that combines both upper-air and surface variables across the two preceding time steps. To account for the accumulation of forecast error over time, the forecast lead time is also included in the encoder's input. The encoder additionally produces a low-rank multivariate Gaussian distribution, from which intermediate perturbation vectors are sampled. These vectors, after being weighted by a learned weight vector, yield the final perturbation vectors. The decoder then processes the perturbed hidden features through 24 transformer blocks and a fully connected layer, producing the final ensemble output. The number of ensemble members equals the number of samples drawn from the Gaussian distribution.
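For concreteness, a minimal PyTorch-style sketch of this encoder / perturbation / decoder layout is shown below. It is an illustration under simplifying assumptions: the class names, hidden dimension, Gaussian rank, token layout, and the omitted up-sampling back to the original grid are all placeholders, and this is not the released Fengshun/FuXi-S2S code.

```python
import torch
import torch.nn as nn


class TransformerBlock(nn.Module):
    """Simplified stand-in for the transformer blocks used in the model."""

    def __init__(self, dim, heads=8):
        super().__init__()
        self.norm1 = nn.LayerNorm(dim)
        self.attn = nn.MultiheadAttention(dim, heads, batch_first=True)
        self.norm2 = nn.LayerNorm(dim)
        self.mlp = nn.Sequential(
            nn.Linear(dim, 4 * dim), nn.GELU(), nn.Linear(4 * dim, dim)
        )

    def forward(self, x):
        h = self.norm1(x)
        x = x + self.attn(h, h, h, need_weights=False)[0]
        return x + self.mlp(self.norm2(x))


class FengshunSketch(nn.Module):
    """Illustrative layout only; hyperparameters and shapes are placeholders."""

    def __init__(self, in_channels, dim=256, rank=8):
        super().__init__()
        # Encoder: a kernel-size-2, stride-2 convolution halves the spatial
        # dimensions, followed by 12 repeated transformer blocks.
        self.patchify = nn.Conv2d(in_channels, dim, kernel_size=2, stride=2)
        self.encoder = nn.Sequential(*[TransformerBlock(dim) for _ in range(12)])
        # Perturbation module: parameters of a low-rank multivariate Gaussian
        # plus a learned weight vector applied to the sampled perturbations.
        self.to_mu = nn.Linear(dim, dim)
        self.to_logvar = nn.Linear(dim, dim)
        self.to_factors = nn.Linear(dim, dim * rank)
        self.weight = nn.Parameter(torch.ones(dim))
        self.rank = rank
        # Decoder: 24 transformer blocks and a fully connected output layer.
        # (Up-sampling back to the original grid is omitted in this sketch.)
        self.decoder = nn.Sequential(*[TransformerBlock(dim) for _ in range(24)])
        self.head = nn.Linear(dim, in_channels)

    def forward(self, x, n_members=4):
        # x: (batch, channels, lat, lon), where the channels stack upper-air and
        # surface variables of the two preceding daily steps plus the lead time.
        h = self.patchify(x)                                   # halve spatial dims
        b, c, _, _ = h.shape
        tokens = self.encoder(h.flatten(2).transpose(1, 2))    # (batch, tokens, dim)

        mu = self.to_mu(tokens)
        std = torch.exp(0.5 * self.to_logvar(tokens))
        factors = self.to_factors(tokens).view(b, tokens.shape[1], c, self.rank)

        members = []
        for _ in range(n_members):                             # one Gaussian sample per member
            eps_diag = torch.randn_like(mu)
            eps_rank = torch.randn(b, tokens.shape[1], self.rank, 1, device=x.device)
            sample = mu + std * eps_diag + (factors @ eps_rank).squeeze(-1)
            perturbed = tokens + self.weight * sample          # weighted perturbation
            members.append(self.head(self.decoder(perturbed)))
        return torch.stack(members, dim=1)                     # (batch, members, tokens, channels)
```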
Have you published or presented any work related to this forecasting model? If yes, could you share references or links?
https://doi.org/10.1038/s41467-024-50714-1
Before submitting your forecasts to the AI Weather Quest, did you validate your model against observational or independent datasets? If so, how?
No.
Did you face any challenges during model development, and how did you address them?
This competition evaluates quintile probability forecasts, which differs from the MSE loss usually used to train AI weather models. We are still struggling to find a better way to address this challenge.
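For illustration only, the sketch below shows a differentiable ranked probability score over the five quintile categories, one possible alternative objective to MSE. The function and variable names are hypothetical, and this is not the objective used in the submitted model.

```python
import torch
import torch.nn.functional as F

def quintile_rps(probs, obs_category):
    """Mean ranked probability score over five quintile categories.

    probs        : (batch, 5) predicted quintile probabilities (rows sum to 1)
    obs_category : (batch,) observed quintile index in {0, ..., 4}
    """
    cdf_forecast = torch.cumsum(probs, dim=-1)
    cdf_obs = torch.cumsum(F.one_hot(obs_category, num_classes=5).float(), dim=-1)
    return ((cdf_forecast - cdf_obs) ** 2).sum(dim=-1).mean()

# Example: probabilities from a softmax head and observed categories.
logits = torch.randn(8, 5, requires_grad=True)
probs = torch.softmax(logits, dim=-1)
obs = torch.randint(0, 5, (8,))
loss = quintile_rps(probs, obs)
loss.backward()   # differentiable with respect to the logits
```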
Are there any limitations to your current model that you aim to address in future iterations?
Fine-scale spatial detail tends to be smoothed out in the Fengshun output.
Are there any other AI/ML model components or innovations that you wish to highlight?
Whenever a climate model is updated, a large set of historical hindcasts is needed to regenerate the quintile climatology in model space, which costs additional time and storage. The submitted Fengshun instead attempts to predict the quintile probabilities directly.
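For comparison, the conventional route that this direct approach avoids is sketched below: quintile thresholds are derived from a hindcast climatology (per variable, calendar date and lead time), and ensemble members are then binned against them. The function names and array shapes are illustrative assumptions, not our pipeline.

```python
import numpy as np

def quintile_edges(climatology_samples):
    """Quintile thresholds (20th/40th/60th/80th percentiles) from a climatology.

    climatology_samples : (n_years, lat, lon) hindcast values for one variable,
                          calendar date and forecast lead time.
    """
    return np.percentile(climatology_samples, [20, 40, 60, 80], axis=0)

def ensemble_to_quintile_probs(ensemble, edges):
    """Fraction of ensemble members falling in each of the five quintile bins.

    ensemble : (n_members, lat, lon) forecast values
    edges    : (4, lat, lon) thresholds from quintile_edges
    """
    # Category index 0..4: number of thresholds each member exceeds at each grid point.
    cats = (ensemble[None, ...] > edges[:, None, ...]).sum(axis=0)
    return np.stack([(cats == k).mean(axis=0) for k in range(5)])   # (5, lat, lon)

# Illustrative usage with random placeholder data.
clim = np.random.randn(20, 121, 240)             # 20 hindcast years
edges = quintile_edges(clim)
ens = np.random.randn(11, 121, 240)              # 11 ensemble members
probs = ensemble_to_quintile_probs(ens, edges)   # sums to 1 at every grid point
```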
Who contributed to the development of this model? Please list all individuals who contributed to this model, along with their specific roles (e.g., data preparation, model architecture, model validation, etc) to acknowledge individual contributions.
Overall technical roadmap: LI Hao, LU Bo.
Model architecture: CHEN Lei, DOU Zesheng.
Data preparation: ZHOU Chenguang, WANG Chenpeng, WU Jie.
Model validation: HU Jiahui, ZHONG Xiaohui, ZHAO Yang, QIAN Qifeng.
Computational resource optimization: ZHAO Chunyan, XIN Yuhang.