SwissAIClimate
Members
First name (team leader)
Fanny
Last name
Lehmann
Organisation name
ETH Zurich
Organisation type
Research Organisation (Academic, Independent, etc.)
Organisation location
Switzerland
First name
Firat
Last name
Ozdemir
Organisation name
Swiss Data Science Center
Organisation type
Research Organisation (Academic, Independent, etc.)
Organisation location
Switzerland
First name
Salman
Last name
Mohebi
Organisation name
Swiss Data Science Center
Organisation type
Research Organisation (Academic, Independent, etc.)
Organisation location
Switzerland
First name
Yun
Last name
Cheng
Organisation name
Swiss Data Science Center
Organisation type
Research Organisation (Academic, Independent, etc.)
Organisation location
Switzerland
First name
Sebastian
Last name
Schemm
Organisation name
University of Cambridge
Organisation type
Research Organisation (Academic, Independent, etc.)
Organisation location
United Kingdom
First name
Torsten
Last name
Hoefler
Organisation name
ETH Zurich
Organisation type
Research Organisation (Academic, Independent, etc.)
Organisation location
Switzerland
First name
Mathieu
Last name
Salzmann
Organisation name
EPFL
Organisation type
Research Organisation (Academic, Independent, etc.)
Organisation location
Switzerland
First name
Matthias
Last name
Meyer
Organisation name
Swiss Data Science Center
Organisation type
Research Organisation (Academic, Independent, etc.)
Organisation location
Switzerland
First name
Piotr
Last name
Wilczynski
Organisation name
ETH
Organisation type
Academic (Student)
Organisation location
Switzerland
Models
Model name
ESFM
Number of individuals supporting model development:
1-5
Maximum number of Central Processing Units (CPUs) supporting model development or forecast production:
< 8
Maximum number of Graphics Processing Units (GPUs) supporting model development or forecast production:
16-64
How would you best classify the IT system used for model development or forecast production:
High-Performance Computing (HPC) Cluster
Model summary questionnaire for model ESFM
Please note that the list below shows all questionnaires submitted for this model.
They are displayed from the most recent to the earliest, covering each 13-week competition period in which the team competed with this model.
Which of the following descriptions best represent the overarching design of your forecasting model?
- Machine learning-based weather prediction.
What techniques did you use to initialise your model? (For example: data sources and processing of initial conditions)
The model is initialized from ERA5 at 0.25° resolution, using the 12 most recent samples available at the time of submission, spaced 6 hours apart. From each initial sample, the model runs an autoregressive prediction of 150 steps, with a lead time of 6 hours per step, to cover the entire forecast period.
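The initialization scheme described above can be sketched as follows. This is a minimal illustration, not the team's code: `init_times`, `rollout`, `toy_step`, and the example date are hypothetical names and values chosen for the sketch, and the one-step model is replaced by a trivial scalar function. Only the three constants come from the answer above.

```python
from datetime import datetime, timedelta

STEP_HOURS = 6   # lead time of one autoregressive step (from the text)
N_STEPS = 150    # rollout length per initial sample (from the text)
N_INIT = 12      # number of most recent ERA5 samples used (from the text)


def init_times(latest: datetime) -> list[datetime]:
    """Return the N_INIT most recent 6-hourly initial times, oldest first."""
    return [latest - timedelta(hours=STEP_HOURS * k) for k in reversed(range(N_INIT))]


def rollout(state, step_fn, n_steps=N_STEPS):
    """Apply a one-step model repeatedly, feeding each output back as input."""
    states = []
    for _ in range(n_steps):
        state = step_fn(state)
        states.append(state)
    return states


def toy_step(x: float) -> float:
    """Hypothetical stand-in for the 6-hour forecast model."""
    return 0.99 * x


latest = datetime(2025, 1, 1, 18)          # hypothetical latest ERA5 sample
times = init_times(latest)                 # 12 initial times spanning 66 hours
trajectories = [rollout(1.0, toy_step) for _ in times]
horizon_days = N_STEPS * STEP_HOURS / 24   # 150 steps of 6 hours = 37.5 days
```

With these numbers, each rollout covers 900 hours (37.5 days), and the 12 initial times span 66 hours.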
If any, what data does your model rely on for real-time forecasting purposes?
Our model is not designed specifically for real-time forecasting.
What types of datasets were used for model training? (For example: observational datasets, reanalysis data, NWP outputs or satellite data)
The model was trained with only ERA5 data at 0.25° resolution, from 1979 to 2019.
Please provide an overview of your final ML/AI model architecture (For example: key design features, specific algorithms or frameworks used, and any pre- or post-processing steps)
Our ESFM model builds on the Aurora architecture, which has an encoder-processor-decoder structure. The key modification used in the AI Weather Quest challenge is the addition of multiple decoder heads per variable, which yield probabilistic predictions. Starting from the pretrained Aurora checkpoint, we further train the model with 8 decoder heads for 18k steps on 32 GPUs. The training loss combines latitude-weighted MAE and CRPS.
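To make the loss description concrete, here is a minimal pure-Python sketch of a latitude-weighted MAE combined with an ensemble CRPS. Several choices are assumptions, since the answer does not specify them: cosine-latitude weights normalized to mean 1, the standard empirical CRPS estimator over the head outputs, MAE applied to the ensemble mean, and a hypothetical mixing weight `alpha`. All function names are illustrative.

```python
import math


def lat_weights(lats_deg):
    """Cosine-latitude weights, normalized so that their mean is 1."""
    w = [math.cos(math.radians(lat)) for lat in lats_deg]
    mean_w = sum(w) / len(w)
    return [wi / mean_w for wi in w]


def crps_ensemble(members, obs):
    """Empirical CRPS of an ensemble: E|X - y| - 0.5 * E|X - X'|."""
    m = len(members)
    skill = sum(abs(x - obs) for x in members) / m
    spread = sum(abs(a - b) for a in members for b in members) / (m * m)
    return skill - 0.5 * spread


def combined_loss(pred, target, lats_deg, alpha=0.5):
    """Mix latitude-weighted MAE (on the ensemble mean) and CRPS.

    pred[i] holds the head outputs at latitude i; target[i] is the truth.
    alpha is a hypothetical mixing weight, not taken from the source.
    """
    w = lat_weights(lats_deg)
    mae = 0.0
    crps = 0.0
    for wi, members, y in zip(w, pred, target):
        mean = sum(members) / len(members)
        mae += wi * abs(mean - y)
        crps += wi * crps_ensemble(members, y)
    n = len(target)
    return alpha * mae / n + (1 - alpha) * crps / n


# Two heads per point, three latitudes; the ensemble mean matches the truth,
# so only the CRPS term contributes.
example = combined_loss([[0.0, 2.0]] * 3, [1.0, 1.0, 1.0], [-60.0, 0.0, 60.0])
```

In this toy case the MAE term vanishes while the CRPS term still penalizes the spread of the two heads around the truth, which is the reason to combine a deterministic and a probabilistic score.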
Have you published or presented any work related to this forecasting model? If yes, could you share references or links?
A preprint will be released soon.
Before submitting your forecasts to the AI Weather Quest, did you validate your model against observational or independent datasets? If so, how?
Our forecasts are not externally validated. We directly submit the output of the autoregressive predictions.
Did you face any challenges during model development, and how did you address them?
The main challenges are technical and computational difficulties related to the size of the model (1.3B parameters). The training is remarkably stable in terms of convergence of the loss function but is impacted by various errors (e.g., NCCL errors, CPU out-of-memory errors).
Are there any limitations to your current model that you aim to address in future iterations?
The main limitation of our model is that it was trained only on 1-step-ahead prediction; no rollout loss was employed. As a result, training does not optimize over multiple consecutive autoregressive predictions. In the next competition phase, our model will feature rollout loss training.
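The difference between one-step and rollout objectives can be illustrated with a toy scalar example. This sketch only evaluates the two losses; in actual rollout training, gradients would also be propagated through the chained predictions. The function names, the toy series, and the deliberately biased model are all hypothetical.

```python
def one_step_loss(step_fn, series):
    """Mean absolute error when every prediction starts from the true state."""
    errs = [abs(step_fn(series[t]) - series[t + 1]) for t in range(len(series) - 1)]
    return sum(errs) / len(errs)


def rollout_loss(step_fn, series, k):
    """Mean absolute error over k autoregressive steps started from series[0]."""
    state = series[0]
    total = 0.0
    for t in range(1, k + 1):
        state = step_fn(state)  # the model consumes its own previous output
        total += abs(state - series[t])
    return total / k


truth = [1, 2, 4, 8]                  # toy "true" trajectory (doubles each step)
biased_step = lambda x: 2 * x + 1     # toy model with a constant one-step bias

single = one_step_loss(biased_step, truth)    # errors 1, 1, 1
chained = rollout_loss(biased_step, truth, 3)  # errors 1, 3, 7
```

The one-step error stays constant, while the chained error grows with each step because the model is fed its own biased output. A one-step objective never sees this accumulation.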
A second limitation, observed during the AI Weather Quest challenge, concerns the dispersion of the ensemble members predicted by the decoder heads. We observed that our 8 decoder heads are overdispersive, leading to a few members that fall below the 20% quantile or above the 80% quantile. We are addressing this limitation in our ongoing developments.
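One way to inspect how ensemble members are distributed across quintile bins is to count the fraction of members between climatological thresholds. The sketch below assumes four ascending thresholds (the 20th, 40th, 60th, and 80th percentiles) are already available; `quintile_probabilities` and the example values are hypothetical and do not reflect the team's actual post-processing.

```python
from bisect import bisect_right


def quintile_probabilities(members, edges):
    """Return the fraction of members in each of the five quintile bins.

    edges: four ascending thresholds (20th, 40th, 60th, 80th percentiles).
    A member equal to a threshold is counted in the upper bin.
    """
    counts = [0] * 5
    for x in members:
        counts[bisect_right(edges, x)] += 1
    return [c / len(members) for c in counts]


edges = [1.0, 2.0, 3.0, 4.0]                          # hypothetical thresholds
members = [0.5, 1.5, 2.5, 3.5, 4.5, 4.6, 0.1, 2.2]    # hypothetical 8 members
probs = quintile_probabilities(members, edges)
```

For a well-calibrated ensemble, the share of members in each bin averages about 20% over many forecasts, so persistent excess mass in the first and last bins points to too much spread.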
Are there any other AI/ML model components or innovations that you wish to highlight?
Other innovations brought by ESFM relate to data loading, variable encoding, and pretraining strategies, but they were not used in the AI Weather Quest challenge.
Who contributed to the development of this model? Please list all individuals who contributed to this model, along with their specific roles (e.g., data preparation, model architecture, model validation, etc) to acknowledge individual contributions.
This team has chosen to keep its participants anonymous.
Model name
ESFM2
Number of individuals supporting model development:
1-5
Maximum number of Central Processing Units (CPUs) supporting model development or forecast production:
< 8
Maximum number of Graphics Processing Units (GPUs) supporting model development or forecast production:
16-64
How would you best classify the IT system used for model development or forecast production:
High-Performance Computing (HPC) Cluster
Model name
ESFM3
Number of individuals supporting model development:
1-5
Maximum number of Central Processing Units (CPUs) supporting model development or forecast production:
< 8
Maximum number of Graphics Processing Units (GPUs) supporting model development or forecast production:
16-64
How would you best classify the IT system used for model development or forecast production:
High-Performance Computing (HPC) Cluster
Submitted forecast data in previous period(s)
Please note: Submitted forecast data is only publicly available once the evaluation of a full competitive period has been completed. See the competition's full detailed schedule with submitted data publication dates for each period here.