MetaCarbon3060
Members
First name (team leader)
Yuxuan
Last name
Liu
Organisation name
MetaCarbon
Organisation type
Small & Medium Enterprise or Startup
Organisation location
China
First name
Lianjun
Last name
Wu
Organisation name
MetaCarbon
Organisation type
Small & Medium Enterprise or Startup
Organisation location
China
First name
Bing
Last name
Yu
Organisation name
MetaCarbon
Organisation type
Small & Medium Enterprise or Startup
Organisation location
China
First name
Yukai
Last name
Liu
Organisation name
MetaCarbon
Organisation type
Small & Medium Enterprise or Startup
Organisation location
China
First name
Yu
Last name
Wang
Organisation name
MetaCarbon
Organisation type
Small & Medium Enterprise or Startup
Organisation location
China
First name
Shengchen
Last name
Zhu
Organisation name
MetaCarbon
Organisation type
Small & Medium Enterprise or Startup
Organisation location
China
First name
Xiaoduan
Last name
Feng
Organisation name
MetaCarbon
Organisation type
Small & Medium Enterprise or Startup
Organisation location
China
First name
Zhizhan
Last name
Zhao
Organisation name
MetaCarbon
Organisation type
Small & Medium Enterprise or Startup
Organisation location
China
Models
Model name
PuyunLDM
Number of individuals supporting model development:
6-10
Maximum number of Central Processing Units (CPUs) supporting model development or forecast production:
< 8
Maximum number of Graphics Processing Units (GPUs) supporting model development or forecast production:
< 4
How would you best classify the IT system used for model development or forecast production:
Single node system
Model summary questionnaire for model PuyunLDM
Please note that the list below shows all questionnaires submitted for this model.
They are displayed from the most recent to the earliest, covering each 13-week competition period in which the team competed with this model.
Which of the following descriptions best represent the overarching design of your forecasting model?
- Machine learning-based weather prediction.
What techniques did you use to initialise your model? (For example: data sources and processing of initial conditions)
Our forecasts are initialized from the ERA5T reanalysis.
If any, what data does your model rely on for real-time forecasting purposes?
The input to PuyunLDM consists of two consecutive ERA5 states at time t and t−24 h. Specifically, the model ingests three categories of variables: (1) pressure-level variables — temperature, specific humidity, geopotential height, and horizontal wind components at 13 levels (1000, 925, 850, 700, 600, 500, 400, 300, 250, 200, 150, 100, and 50 hPa); (2) surface variables — 10 m wind components, 2 m temperature, mean sea level pressure, and total precipitation; and (3) static forcing fields — orography and land-sea mask.
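As a rough illustration of the input described above, the sketch below enumerates the channels implied by that variable list. The short variable names, the channel ordering, and the choice to append the static fields once rather than per time step are assumptions made for illustration; the answer above does not specify them.

```python
# Hypothetical channel layout for the inputs listed above.
# Names and ordering are illustrative assumptions, not the team's actual layout.
PRESSURE_VARS = [
    "temperature",
    "specific_humidity",
    "geopotential_height",
    "u_wind",
    "v_wind",
]
LEVELS_HPA = [1000, 925, 850, 700, 600, 500, 400, 300, 250, 200, 150, 100, 50]
SURFACE_VARS = ["u10", "v10", "t2m", "mslp", "total_precipitation"]
STATIC_VARS = ["orography", "land_sea_mask"]


def channel_names():
    """Return one name per input channel for the two stacked states plus statics."""
    # One atmospheric state: every pressure variable at every level, then surface fields.
    per_state = [f"{var}_{level}" for var in PRESSURE_VARS for level in LEVELS_HPA]
    per_state += SURFACE_VARS
    # Two consecutive states (t-24h and t), assumed to be stacked along the channel axis.
    names = [f"{name}@{lag}" for lag in ("t-24h", "t") for name in per_state]
    # Static forcing fields are assumed to be appended once.
    return names + STATIC_VARS


if __name__ == "__main__":
    print(len(channel_names()))
```

Under these assumptions each state contributes 5 × 13 + 5 = 70 channels, so the two states plus the two static fields give 142 channels in total.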
What types of datasets were used for model training? (For example: observational datasets, reanalysis data, NWP outputs or satellite data)
Only the ERA5 dataset.
Please provide an overview of your final ML/AI model architecture (For example: key design features, specific algorithms or frameworks used, and any pre- or post-processing steps)
PuyunLDM is a latent diffusion weather forecasting model based on DCAE and DiT. It first normalizes multi-variable ERA5 atmospheric fields, applying a log transform before normalization for accumulated precipitation, and then uses a pretrained DCAE to encode them into low-dimensional latent representations. During training, the model conditions on latent states from previous timesteps, adds random noise of varying intensities to the target latent state, and trains the DiT denoiser to recover the clean latent representation under the corresponding diffusion noise level, mainly using a weighted MSE loss. During inference, the model starts from random noise and applies multi-step DPM-Solver denoising under the guidance of the conditioning latent states to generate future latent states, which are then decoded by the DCAE back into meteorological fields and inverse-normalized. Ensemble forecasts are generated naturally from the stochastic diffusion process by sampling multiple plausible future latent trajectories from the same initial condition, producing diverse forecast members that represent predictive uncertainty.
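The preprocessing and training steps described above can be sketched in a few lines of NumPy. This is a minimal sketch under stated assumptions, not the PuyunLDM implementation: the `log1p` form of the log transform, the variance-preserving noise mix, the function names, and the statistics used here are all hypothetical, and the DCAE encoder and DiT denoiser are left out entirely.

```python
import numpy as np


def normalize_precip(tp, mean, std):
    """Log-transform accumulated precipitation, then standardize.

    Assumption: log1p is used as the log transform; the exact form is not stated.
    """
    return (np.log1p(tp) - mean) / std


def denormalize_precip(z, mean, std):
    """Invert normalize_precip: undo the standardization, then the log transform."""
    return np.expm1(z * std + mean)


def add_noise(clean_latent, sigma, rng):
    """Mix a clean latent with Gaussian noise at level sigma in [0, 1].

    Assumption: a variance-preserving mix with alpha**2 + sigma**2 == 1.
    """
    noise = rng.standard_normal(clean_latent.shape)
    alpha = np.sqrt(1.0 - sigma**2)
    return alpha * clean_latent + sigma * noise, noise


def weighted_mse(prediction, target, weights):
    """Mean of the weighted squared error (weights broadcast against the arrays)."""
    return float(np.mean(weights * (prediction - target) ** 2))


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    tp = np.array([0.0, 0.5, 12.0])  # made-up precipitation values
    z = normalize_precip(tp, mean=0.3, std=0.8)
    print(np.allclose(denormalize_precip(z, mean=0.3, std=0.8), tp))

    latent = rng.standard_normal((4, 8, 8))  # stand-in for an encoded target state
    noisy, _ = add_noise(latent, sigma=0.5, rng=rng)
    # A trained denoiser would map (noisy, sigma, conditioning latents) to the
    # clean latent; here the loss is only evaluated on the noisy input itself.
    print(weighted_mse(noisy, latent, weights=np.ones_like(latent)))
```

In a real training loop the noise level would be drawn at random for each sample, and the loss would compare the denoiser's output, not the noisy input, against the clean latent.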
Have you published or presented any work related to this forecasting model? If yes, could you share references or links?
https://arxiv.org/abs/2602.11807
Before submitting your forecasts to the AI Weather Quest, did you validate your model against observational or independent datasets? If so, how?
The model was trained on ERA5 data up to 2022 and verified over the period 2023–2025. It was also fine-tuned on recent years (2023–2025).
Did you face any challenges during model development, and how did you address them?
A primary challenge is that this competition uses a quintile-based evaluation, which differs significantly from the MSE loss commonly used to train AI models. While MSE focuses on minimizing average squared errors, quintile-based metrics require accurate modeling of the probability distribution across ensemble members. We are still actively working on this challenge and looking for better ways to align the latent diffusion model's training objective with the quintile-based evaluation. Although PuyunLDM already enhances latent diffusability and generates ensemble forecasts efficiently, adapting its loss formulation to match the competition's metric remains an ongoing effort.
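To make the contrast with MSE concrete, the sketch below turns an ensemble into quintile probabilities and scores them with a ranked probability score. It is an illustration only: the function names, the toy quintile edges, and the choice of the ranked probability score as the categorical metric are assumptions, and the competition's exact evaluation procedure is not described in this questionnaire.

```python
import numpy as np


def quintile_probabilities(members, edges):
    """Fraction of ensemble members falling in each of the five quintile bins.

    `edges` holds the four climatological quintile boundaries in ascending order.
    """
    members = np.asarray(members, dtype=float).ravel()
    edges = np.asarray(edges, dtype=float)
    # Bin index 0..4: the number of edges less than or equal to each member value.
    bins = np.searchsorted(edges, members, side="right")
    counts = np.bincount(bins, minlength=5)
    return counts / members.size


def ranked_probability_score(probs, observed_bin):
    """Sum of squared differences between cumulative forecast and observed distributions."""
    observed = np.zeros(5)
    observed[observed_bin] = 1.0
    return float(np.sum((np.cumsum(probs) - np.cumsum(observed)) ** 2))


if __name__ == "__main__":
    edges = [1.0, 2.0, 3.0, 4.0]  # toy quintile boundaries
    members = [0.5, 1.5, 1.5, 2.5, 3.5, 4.5, 4.5, 4.5, 0.2, 2.2]
    probs = quintile_probabilities(members, edges)
    print(probs)
    print(ranked_probability_score(probs, observed_bin=4))
```

The score depends on how the ensemble spreads across the bins rather than on the error of the ensemble mean, which is why a model tuned only for MSE can score poorly on a metric of this kind.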
Are there any limitations to your current model that you aim to address in future iterations?
The predictions are overly smooth and lose fine-scale detail.
Are there any other AI/ML model components or innovations that you wish to highlight?
No.
Who contributed to the development of this model? Please list all individuals who contributed to this model, along with their specific roles (e.g., data preparation, model architecture, model validation, etc) to acknowledge individual contributions.
Data preparation, model architecture, model validation: Shengchen Zhu, Yuxuan Liu, Lianjun Wu, Yukai Liu, Xiaoduan Feng, Zhizhan Zhao, Bing Yu, Yu Wang.
Model name
PuyunEnsemble
Number of individuals supporting model development:
6-10
Maximum number of Central Processing Units (CPUs) supporting model development or forecast production:
< 8
Maximum number of Graphics Processing Units (GPUs) supporting model development or forecast production:
< 4
How would you best classify the IT system used for model development or forecast production:
Single node system
Model name
PuyunWeather
Number of individuals supporting model development:
6-10
Maximum number of Central Processing Units (CPUs) supporting model development or forecast production:
< 8
Maximum number of Graphics Processing Units (GPUs) supporting model development or forecast production:
< 4
How would you best classify the IT system used for model development or forecast production:
Single node system
Submitted forecast data in previous period(s)
Please note: Submitted forecast data is only publicly available once the evaluation of a full competitive period has been completed. See the competition's full detailed schedule with submitted data publication dates for each period here.