
SwissAIClimate

Members

First name (team leader)

Fanny

Last name

Lehmann

Organisation name

ETH Zurich

Organisation type

Research Organisation (Academic, Independent, etc.)

Organisation location

Switzerland

First name

Firat

Last name

Ozdemir

Organisation name

Swiss Data Science Center

Organisation type

Research Organisation (Academic, Independent, etc.)

Organisation location

Switzerland

First name

Salman

Last name

Mohebi

Organisation name

Swiss Data Science Center

Organisation type

Research Organisation (Academic, Independent, etc.)

Organisation location

Switzerland

First name

Yun

Last name

Cheng

Organisation name

Swiss Data Science Center

Organisation type

Research Organisation (Academic, Independent, etc.)

Organisation location

Switzerland

First name

Sebastian

Last name

Schemm

Organisation name

University of Cambridge

Organisation type

Research Organisation (Academic, Independent, etc.)

Organisation location

United Kingdom

First name

Torsten

Last name

Hoefler

Organisation name

ETH Zurich

Organisation type

Research Organisation (Academic, Independent, etc.)

Organisation location

Switzerland

First name

Mathieu

Last name

Salzmann

Organisation name

EPFL

Organisation type

Research Organisation (Academic, Independent, etc.)

Organisation location

Switzerland

First name

Matthias

Last name

Meyer

Organisation name

Swiss Data Science Center

Organisation type

Research Organisation (Academic, Independent, etc.)

Organisation location

Switzerland

First name

Piotr

Last name

Wilczynski

Organisation name

ETH

Organisation type

Academic (Student)

Organisation location

Switzerland

Models

Model name

ESFM

Number of individuals supporting model development:

1-5

Maximum number of Central Processing Units (CPUs) supporting model development or forecast production:

< 8

Maximum number of Graphics Processing Units (GPUs) supporting model development or forecast production:

16-64

How would you best classify the IT system used for model development or forecast production:

High-Performance Computing (HPC) Cluster

Model summary questionnaire for model ESFM

Please note that the list below shows all questionnaires submitted for this model.
They are displayed from the most recent to the earliest, covering each 13-week competition period in which the team competed with this model.

Which of the following descriptions best represent the overarching design of your forecasting model?

Machine learning-based weather prediction.

What techniques did you use to initialise your model? (For example: data sources and processing of initial conditions)

The model is initialized with ERA5 at 0.25° resolution, using the 12 most recent samples available at the time of submission, spaced 6 hours apart. From each initial sample, the model runs an autoregressive prediction of 150 steps to cover the entire forecast period. Each step has a lead time of 6 hours, so each rollout spans 900 hours (37.5 days).
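The initialization scheme described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: `toy_step` is a hypothetical stand-in for the trained network, the date is invented, and the names `lagged_init_times` and `rollout` are not taken from the ESFM code base. Only the numbers (12 initial samples, 150 steps, 6-hour step) come from the answer.

```python
from datetime import datetime, timedelta

STEP_HOURS = 6   # lead time of a single model step (from the answer above)
N_STEPS = 150    # autoregressive steps per initial sample (from the answer above)
N_INITS = 12     # number of lagged ERA5 initial samples (from the answer above)


def lagged_init_times(latest, n_inits=N_INITS, step_hours=STEP_HOURS):
    """Return the n_inits most recent analysis times, oldest first."""
    return [latest - timedelta(hours=step_hours * k) for k in reversed(range(n_inits))]


def rollout(step_fn, state, n_steps=N_STEPS):
    """Apply a one-step model repeatedly, feeding each output back as input."""
    trajectory = []
    for _ in range(n_steps):
        state = step_fn(state)
        trajectory.append(state)
    return trajectory


def toy_step(state):
    """Hypothetical stand-in for the network: damp the state by 1% per step."""
    return 0.99 * state


latest = datetime(2025, 9, 1, 18)  # hypothetical time of the newest ERA5 sample
inits = lagged_init_times(latest)
# One trajectory per initial time; the scalar 1.0 stands in for a full ERA5 state.
trajectories = {t0: rollout(toy_step, 1.0) for t0 in inits}
horizon_days = N_STEPS * STEP_HOURS / 24  # length of one rollout in days
```

With these settings the oldest initial time lies 66 hours before the newest one, and every rollout covers 37.5 days.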

If any, what data does your model rely on for real-time forecasting purposes?

Our model is not designed specifically for real-time forecasting.

What types of datasets were used for model training? (For example: observational datasets, reanalysis data, NWP outputs or satellite data)

The model was trained only on ERA5 data at 0.25° resolution, covering 1979 to 2019.

Please provide an overview of your final ML/AI model architecture (For example: key design features, specific algorithms or frameworks used, and any pre- or post-processing steps)

Our ESFM model builds on the Aurora architecture, with an encoder-processor-decoder structure. The key modification to the Aurora architecture used in the AI Weather Quest is the addition of multiple decoder heads for each variable, which together produce probabilistic predictions. We start from the pretrained Aurora checkpoint and further train our model with 8 decoder heads for 18k steps on 32 GPUs. The training loss is a combination of latitude-weighted MAE and CRPS.
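To make the loss description concrete, here is a minimal pure-Python sketch of a latitude-weighted MAE combined with an ensemble CRPS. Several details are assumptions, since the answer does not give them: the cos(latitude) weighting normalised to mean 1, the standard empirical CRPS estimator, applying the MAE to every head separately, and the mixing weight `alpha`. All function names are hypothetical.

```python
import math


def latitude_weights(lats_deg):
    """cos(latitude) weights, normalised to mean 1 (assumed convention)."""
    raw = [math.cos(math.radians(lat)) for lat in lats_deg]
    mean = sum(raw) / len(raw)
    return [w / mean for w in raw]


def ensemble_crps(members, obs):
    """Empirical CRPS at one point: E|X - y| - 0.5 * E|X - X'|."""
    m = len(members)
    skill = sum(abs(x - obs) for x in members) / m
    spread = sum(abs(a - b) for a in members for b in members) / (m * m)
    return skill - 0.5 * spread


def combined_loss(pred, target, lats_deg, alpha=0.5):
    """Mix latitude-weighted MAE and CRPS.

    pred[h][i] is the output of decoder head h at latitude row i,
    target[i] is the reference value at row i.
    alpha is a hypothetical mixing weight, not a value from the source.
    """
    weights = latitude_weights(lats_deg)
    n_heads, n_rows = len(pred), len(target)
    # MAE averaged over all heads and rows (assumption: applied per head).
    mae = sum(
        weights[i] * abs(pred[h][i] - target[i])
        for h in range(n_heads)
        for i in range(n_rows)
    ) / (n_heads * n_rows)
    # CRPS treats the heads at each row as one ensemble.
    crps = sum(
        weights[i] * ensemble_crps([pred[h][i] for h in range(n_heads)], target[i])
        for i in range(n_rows)
    ) / n_rows
    return alpha * mae + (1.0 - alpha) * crps
```

The MAE term pulls each head towards the target, while the spread term inside the CRPS rewards heads that differ from one another, which is what lets a set of deterministic heads behave like an ensemble.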

Have you published or presented any work related to this forecasting model? If yes, could you share references or links?

A preprint will be released soon.

Before submitting your forecasts to the AI Weather Quest, did you validate your model against observational or independent datasets? If so, how?

Our forecasts are not externally validated. We directly submit the output of the autoregressive predictions.

Did you face any challenges during model development, and how did you address them?

The main challenges are technical and computational difficulties related to the size of the model (1.3B parameters). Training is remarkably stable in terms of loss convergence but is affected by various runtime errors (e.g., NCCL errors, CPU out-of-memory errors).

Are there any limitations to your current model that you aim to address in future iterations?

The main limitation of our model is that it has been trained with only 1-step-ahead prediction; no rollout loss function was employed. As a result, the model is not optimized over multiple consecutive autoregressive predictions. In the next competition phase, our model will feature rollout loss training. A second limitation was observed during the AI Weather Quest and is related to the dispersion of the ensemble members predicted by the decoder heads. We observed that our 8 decoder heads are overdispersive, leading to a few members that are below the 20% quantile or above the 80% quantile. We are addressing this limitation in our ongoing developments.
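The dispersion issue can be checked with a small diagnostic that counts how many ensemble members fall into each quintile category. The sketch below is an illustration only: it assumes forecasts are expressed as member shares in five categories delimited by four climatological quantile edges, and both the edge values and the member values are invented.

```python
def quintile_probabilities(members, edges):
    """Share of members in each bin delimited by the sorted quantile edges."""
    counts = [0] * (len(edges) + 1)
    for x in members:
        # The bin index is the number of edges at or below the member value.
        counts[sum(1 for e in edges if x >= e)] += 1
    return [c / len(members) for c in counts]


def tail_fractions(members, q20, q80):
    """Fraction of members below the 20% quantile and above the 80% quantile."""
    n = len(members)
    below = sum(1 for x in members if x < q20) / n
    above = sum(1 for x in members if x > q80) / n
    return below, above


# Hypothetical climatological edges (20%, 40%, 60%, 80%) and 8 member values.
edges = [-1.0, -0.3, 0.3, 1.0]
members = [-1.5, -0.8, -0.2, 0.0, 0.1, 0.4, 0.9, 1.2]

probs = quintile_probabilities(members, edges)
tails = tail_fractions(members, edges[0], edges[-1])
```

With 8 members, every category probability is a multiple of 1/8, so a single member landing in an outer category already shifts that probability by 0.125.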

Are there any other AI/ML model components or innovations that you wish to highlight?

Other innovations introduced by ESFM relate to data loading, variable encoding, and pretraining strategies, but they were not used in the AI Weather Quest.

Who contributed to the development of this model? Please list all individuals who contributed to this model, along with their specific roles (e.g., data preparation, model architecture, model validation, etc) to acknowledge individual contributions.

This team has chosen to keep its participants anonymous.

Model name

ESFM2

Number of individuals supporting model development:

1-5

Maximum number of Central Processing Units (CPUs) supporting model development or forecast production:

< 8

Maximum number of Graphics Processing Units (GPUs) supporting model development or forecast production:

16-64

How would you best classify the IT system used for model development or forecast production:

High-Performance Computing (HPC) Cluster

Model name

ESFM3

Number of individuals supporting model development:

1-5

Maximum number of Central Processing Units (CPUs) supporting model development or forecast production:

< 8

Maximum number of Graphics Processing Units (GPUs) supporting model development or forecast production:

16-64

How would you best classify the IT system used for model development or forecast production:

High-Performance Computing (HPC) Cluster

Submitted forecast data in previous period(s)

Please note: Submitted forecast data is only publicly available once the evaluation of a full competitive period has been completed. See the competition's full detailed schedule with submitted data publication dates for each period here.

Access forecasts data

SON 2025 Period
