NvidiaDistill

Members

First name (team leader)

Noah

Last name

Brenowitz

Organisation name

NVIDIA

Organisation type

Large Tech Company

Organisation location

United States of America

First name

Scott

Last name

Martin

Organisation name

NVIDIA

Organisation type

Large Tech Company

Organisation location

United States of America

Model

Model name

DLESyM10K

Number of individuals supporting model development:

1-5

Maximum number of Central Processing Units (CPUs) supporting model development or forecast production:

8-48

Maximum number of Graphics Processing Units (GPUs) supporting model development or forecast production:

4-16

How would you best classify the IT system used for model development or forecast production:

High-Performance Computing (HPC) Cluster

Model summary questionnaire for model DLESyM10K

Please note that the list below shows all questionnaires submitted for this model.
They are displayed from the most recent to the earliest, covering each 13-week competition period in which the team competed with this model.

Which of the following descriptions best represent the overarching design of your forecasting model?

Machine learning-based weather prediction.
Ensemble-based model, aggregating multiple predictions to assess uncertainty and variability.

What techniques did you use to initialise your model? (For example: data sources and processing of initial conditions)

Initialized from four days of IFS daily averages and Copernicus SST, regridded to the HEALPix64 grid.
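To give a sense of the grid size, here is a minimal sketch that computes the pixel count and approximate resolution of a HEALPix grid. It assumes "HEALPix64" means the HEALPix parameter nside = 64, which the answer above does not state explicitly. The function names are illustrative and do not come from the team's code.

```python
import math


def healpix_npix(nside: int) -> int:
    """Number of equal-area pixels on a HEALPix grid with the given nside."""
    return 12 * nside * nside


def approx_resolution_deg(nside: int) -> float:
    """Square root of the per-pixel solid angle, expressed in degrees."""
    pixel_area_sr = 4.0 * math.pi / healpix_npix(nside)
    return math.degrees(math.sqrt(pixel_area_sr))


# Assumption: "HEALPix64" corresponds to nside = 64.
print(healpix_npix(64))
print(round(approx_resolution_deg(64), 2))
```

Under that assumption, each field has 49,152 equal-area pixels at roughly 0.92 degrees spacing.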

If any, what data does your model rely on for real-time forecasting purposes?

As above.

What types of datasets were used for model training? (For example: observational datasets, reanalysis data, NWP outputs or satellite data)

Trained on 10,000 years of DLESyM simulations. The DLESyM simulations were run from ERA5 initial conditions to generate a large volume of training data; a distilled model was then trained for S2S forecasting on the DLESyM output. Finally, the model was fine-tuned on ERA5.

Please provide an overview of your final ML/AI model architecture (For example: key design features, specific algorithms or frameworks used, and any pre- or post-processing steps)

Architecture: a conditional diffusion model implemented on the HEALPix64 grid, broadly following the architecture used in cBottle. Framework: (1) generate a huge synthetic dataset with DLESyM (which was itself trained on ERA5); (2) train a 'distilled' diffusion model to produce an ensemble forecast at the AIWQ lead times in a single model timestep (no autoregression); (3) fine-tune this distilled model on ERA5; (4) calculate quintile probabilities using model-climate quintile boundaries from a 20-year climatology.
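The final step, turning an ensemble into quintile probabilities, can be sketched for a single grid point and lead time as below. This is only an illustration under stated assumptions: the function names are hypothetical, the climatology sample is synthetic, and the answer does not say how the team estimates or applies the boundaries (interpolation method, handling of ties).

```python
from bisect import bisect_right
from statistics import quantiles


def quintile_edges(climatology):
    """Four boundaries (20th, 40th, 60th, 80th percentiles) of a climatology sample."""
    return quantiles(climatology, n=5, method="inclusive")


def quintile_probabilities(members, edges):
    """Fraction of ensemble members falling in each of the five quintile bins."""
    counts = [0] * 5
    for value in members:
        # bisect_right sends a value equal to a boundary into the upper bin.
        counts[bisect_right(edges, value)] += 1
    return [count / len(members) for count in counts]


# Hypothetical model climatology for one grid point (synthetic values 0..100).
edges = quintile_edges(list(range(101)))
# Hypothetical five-member ensemble forecast for the same grid point.
probs = quintile_probabilities([10, 30, 30, 50, 90], edges)
print(edges, probs)
```

Counting members per bin is the simplest estimator; with a small ensemble it gives coarse probabilities, so a real system might smooth or calibrate them.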

Have you published or presented any work related to this forecasting model? If yes, could you share references or links?

arXiv paper (under review at JAMES): https://doi.org/10.48550/arXiv.2512.22814
AMS talk: https://ams.confex.com/ams/106ANNUAL/meetingapp.cgi/Paper/476944

Before submitting your forecasts to the AI Weather Quest, did you validate your model against observational or independent datasets? If so, how?

Hindcasts were validated against ERA5, using ERA5 data already processed in house, and benchmarked against ECMWF ENS S2S real-time forecasts from the S2S database.
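The answer does not say which verification metric was used. As one common choice for quintile probability forecasts, the sketch below computes a ranked probability score (RPS) and the corresponding skill score (RPSS) relative to a reference forecast. The function names, the unnormalised form of the RPS, and the example numbers are assumptions made for illustration.

```python
from itertools import accumulate


def rps(probs, observed_bin):
    """Ranked probability score: sum of squared differences between cumulative
    forecast probabilities and the cumulative one-hot observation (unnormalised)."""
    cum_forecast = list(accumulate(probs))
    cum_observed = [1.0 if k >= observed_bin else 0.0 for k in range(len(probs))]
    return sum((f - o) ** 2 for f, o in zip(cum_forecast, cum_observed))


def rpss(probs, reference_probs, observed_bin):
    """Skill relative to a reference: 1 is perfect, 0 matches the reference."""
    return 1.0 - rps(probs, observed_bin) / rps(reference_probs, observed_bin)


# Hypothetical forecast, verified against an observation in the second quintile
# (bin index 1), with equal-odds climatology as the reference.
forecast = [0.2, 0.4, 0.2, 0.0, 0.2]
climatology = [0.2] * 5
print(rpss(forecast, climatology, observed_bin=1))
```

In practice such scores are averaged over many grid points and forecast dates before forming the skill score, and the same calculation can use another forecast system as the reference instead of climatology.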

Did you face any challenges during model development, and how did you address them?

Nothing major, but some experimentation with the loss and tuning was needed to achieve the present results.

Are there any limitations to your current model that you aim to address in future iterations?

This first proof of concept was trained on v1 of DLESyM, which forecasts only 8 atmospheric variables and 1 ocean variable. In future we hope to train our model on recent state-of-the-art AI Earth system models such as SamudrACE or DLESyM v2, which should better represent the drivers of S2S and seasonal dynamics.

Are there any other AI/ML model components or innovations that you wish to highlight?

No.

Who contributed to the development of this model? Please list all individuals who contributed to this model, along with their specific roles (e.g., data preparation, model architecture, model validation, etc) to acknowledge individual contributions.

Scott Martin (work done during internship at NVIDIA): data preparation, model development, model validation, model architecture, realtime data preparation. Noah Brenowitz: data preparation, model architecture, project concept, mentorship. Dale Durran: project concept, mentorship. Mike Pritchard: project concept, mentorship.

Which of the following descriptions best represent the overarching design of your forecasting model?

Machine learning-based weather prediction.
Ensemble-based model, aggregating multiple predictions to assess uncertainty and variability.