CliMA

Members

First name (team leader)

Costa

Last name

Christopoulos

Organisation name

Climate Modeling Alliance (CliMA)

Organisation type

Research Organisation (Academic, Independent, etc.)

Organisation location

United States of America

First name

Nat

Last name

Efrat-Henrici

Organisation name

Climate Modeling Alliance (CliMA)

Organisation type

Research Organisation (Academic, Independent, etc.)

Organisation location

United States of America

First name

Kevin

Last name

Phan

Organisation name

Climate Modeling Alliance (CliMA)

Organisation type

Research Organisation (Academic, Independent, etc.)

Organisation location

United States of America

First name

Jordan

Last name

Benjamin

Organisation name

Climate Modeling Alliance (CliMA)

Organisation type

Research Organisation (Academic, Independent, etc.)

Organisation location

United States of America

First name

Julian

Last name

Schmitt

Organisation name

Climate Modeling Alliance (CliMA)

Organisation type

Research Organisation (Academic, Independent, etc.)

Organisation location

United States of America

First name

Ronak

Last name

Patel

Organisation name

Climate Modeling Alliance (CliMA)

Organisation type

Research Organisation (Academic, Independent, etc.)

Organisation location

United States of America

First name

Oliver

Last name

Dunbar

Organisation name

Climate Modeling Alliance (CliMA)

Organisation type

Research Organisation (Academic, Independent, etc.)

Organisation location

United States of America

First name

Tapio

Last name

Schneider

Organisation name

Climate Modeling Alliance (CliMA)

Organisation type

Research Organisation (Academic, Independent, etc.)

Organisation location

United States of America

First name

Haakon

Last name

Ervik

Organisation name

Climate Modeling Alliance (CliMA)

Organisation type

Research Organisation (Academic, Independent, etc.)

Organisation location

United States of America

First name

Zhaoyi

Last name

Shen

Organisation name

Climate Modeling Alliance (CliMA)

Organisation type

Research Organisation (Academic, Independent, etc.)

Organisation location

United States of America

First name

Katherine

Last name

Deck

Organisation name

Climate Modeling Alliance (CliMA)

Organisation type

Research Organisation (Academic, Independent, etc.)

Organisation location

United States of America

Models

Model name

CliMAWeather

Number of individuals supporting model development:

11-20

Maximum number of Central Processing Units (CPUs) supporting model development or forecast production:

1,000-5,000

Maximum number of Graphics Processing Units (GPUs) supporting model development or forecast production:

64-256

How would you best classify the IT system used for model development or forecast production:

Cloud computing system

Documentation

View Documentation

Model summary questionnaire for model CliMAWeather

Please note that the list below shows all questionnaires submitted for this model.
They are displayed from the most recent to the earliest, covering each 13-week competition period in which the team competed with this model.

Which of the following descriptions best represent the overarching design of your forecasting model?

Post-processing of numerical weather prediction (NWP) data.
Statistical model focused on generating quintile probabilities.
Hybrid model that integrates physical simulations with machine learning or statistical techniques.

What techniques did you use to initialise your model? (For example: data sources and processing of initial conditions)

Our approach relies on post-processing output from a coupled atmosphere-land model, ClimaCoupler.jl. The coupler requires an initial state and time-varying forcings. The land and atmosphere components are initialized directly from the ERA5 state. A hydrostatic vertical pressure profile is derived from the near-surface atmospheric pressure. Sea surface temperature and sea ice concentration are persisted from the ERA5 initial condition through the forecast period.
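As an illustration of the hydrostatic step, the sketch below integrates the hydrostatic equation upward from a near-surface pressure. It is not taken from ClimaCoupler.jl: the function name `hydrostatic_pressure`, the dry-air gas constant, and the use of a layer-mean temperature are all assumptions made for this example.

```python
import math

G = 9.80665    # gravitational acceleration, m s^-2
R_D = 287.05   # specific gas constant of dry air, J kg^-1 K^-1 (moisture ignored: an assumption)

def hydrostatic_pressure(p_surface, heights, temperatures):
    """Integrate dp/dz = -p * g / (R_d * T) upward from the lowest level.

    heights      -- increasing level heights in metres; heights[0] is the near-surface level
    temperatures -- temperatures in kelvin at the same levels
    Returns the pressure (same unit as p_surface) at every level.
    """
    if len(heights) != len(temperatures):
        raise ValueError("heights and temperatures must have the same length")
    pressures = [p_surface]
    for k in range(1, len(heights)):
        dz = heights[k] - heights[k - 1]
        # Treat each layer as isothermal at the mean of its bounding temperatures.
        t_mean = 0.5 * (temperatures[k] + temperatures[k - 1])
        pressures.append(pressures[-1] * math.exp(-G * dz / (R_D * t_mean)))
    return pressures

if __name__ == "__main__":
    levels = [0.0, 500.0, 1500.0, 3000.0]
    temps = [288.0, 285.0, 278.0, 268.0]
    for z, p in zip(levels, hydrostatic_pressure(101325.0, levels, temps)):
        print(f"z = {z:7.1f} m  p = {p:10.1f} Pa")
```

Each layer is integrated analytically, so the profile decreases monotonically with height for any positive temperatures.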

If any, what data does your model rely on for real-time forecasting purposes?

Most recently available 3D ERA5 state for land and atmosphere. ERA5 sea surface temperature and sea ice concentration are prescribed.
Initialization variables:
Atmosphere Model (ClimaAtmos.jl): Temperature (3D), Specific Humidity (3D), U/V winds (3D), Near-surface pressure (2D)
Land Model (ClimaLand.jl): Soil Temperature (3D), Volumetric fraction of water (3D), Skin temperature (2D), Snow water equivalent (2D), Temperature of snow layer (2D)
Prescribed Land Fields: Leaf Area Index [Full integrated land model]; Albedo [bucket model] (2D)
Auxiliary/Prescribed Fields: Sea surface temperature (2D), Sea ice concentration (2D), Surface elevation (2D)

What types of datasets were used for model training? (For example: observational datasets, reanalysis data, NWP outputs or satellite data)

Coupled model output and ERA5 reanalysis (2018 - present)

Please provide an overview of your final ML/AI model architecture (For example: key design features, specific algorithms or frameworks used, and any pre- or post-processing steps)

Coupled Atmosphere-Land Model: The coupled model (ClimaCoupler.jl), initialized from ERA5 as detailed above, is run for 4 weeks.

ML Post-processing: We use a data-driven correction framework that maps simulated ClimaCoupler forecast fields to ERA5. A pointwise machine learning model is trained to adjust the dynamical forecasts from the coupled model using analogs from comparable time periods over the historical period (since 2018). The model takes as input spatial, temporal, and dynamical variables from the coupled model and produces both a mean correction and an initial uncertainty estimate. Ensemble forecasts are generated by sampling from this predictive distribution. To address underdispersion in the ensemble, we introduce an empirically derived component that captures nonlocal contributions to uncertainty. This additive term is estimated from historical forecast residuals and combined with local error estimates.
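The sampling step described above can be sketched as follows. This is a minimal illustration, not the team's implementation: it assumes a Gaussian predictive distribution, assumes the nonlocal term is combined with the local estimate by adding variances, and uses hypothetical names (`sample_ensemble`, `quintile_probabilities`) and made-up numbers.

```python
import math
import random

def sample_ensemble(mean, sigma_local, sigma_nonlocal, n_members, rng):
    """Draw ensemble members from a Gaussian centred on the corrected forecast.

    Assumption: local and nonlocal uncertainty are independent, so their
    variances add before sampling.
    """
    sigma_total = math.sqrt(sigma_local ** 2 + sigma_nonlocal ** 2)
    return [rng.gauss(mean, sigma_total) for _ in range(n_members)]

def quintile_probabilities(members, edges):
    """Fraction of members in each of the five bins defined by four increasing
    climatological quintile edges."""
    if len(edges) != 4:
        raise ValueError("expected four quintile edges")
    counts = [0] * 5
    for value in members:
        # The bin index equals the number of edges lying below the value.
        counts[sum(1 for edge in edges if value > edge)] += 1
    return [count / len(members) for count in counts]

if __name__ == "__main__":
    rng = random.Random(0)
    members = sample_ensemble(mean=1.2, sigma_local=0.8, sigma_nonlocal=0.6,
                              n_members=1000, rng=rng)
    # Illustrative climatological quintile edges for a temperature anomaly (K).
    print(quintile_probabilities(members, edges=[-1.0, -0.3, 0.3, 1.0]))
```

Adding the nonlocal variance widens the sampled distribution, which spreads probability across more quintile bins and counteracts underdispersion.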

Have you published or presented any work related to this forecasting model? If yes, could you share references or links?

Results specific to subseasonal forecasting with CliMA will be presented at AMS 2026. Recent papers highlighting components of the CliMA model:

Yatunin, D., Byrne, S., Kawczynski, C., Kandala, S., Bozzola, G., Sridhar, A., Shen, Z., Jaruga, A., Sloan, J., He, J., Huang, D.Z., Barra, V., Knoth, O., Ullrich, P., Schneider, T., 2025: The CliMA atmosphere dynamical core: Concepts, numerics, and scaling. Journal of Advances in Modeling Earth Systems, submitted.

Deck, K., Braghiere, R. K., Renchon, A. A., Sloan, J., Bozzola, G., Speer, E., Mackay, B., Reddy, T., Phan, K., Gagne-Landmann, A. L., Yatunin, D., Charbonneau, A., Efrat-Henrici, N., Bach, E., Ma, S., Gentine, P., Frankenberg, C., Bloom, A., Wang, Y., Longo, M., Schneider, T., 2025: ClimaLand: A land surface model for advancing climate modeling with machine learning and data-driven parameterizations. Journal of Advances in Modeling Earth Systems, submitted.

Before submitting your forecasts to the AI Weather Quest, did you validate your model against observational or independent datasets? If so, how?

Training is performed on years from 2018, with 2024 held out as a validation year. Hyperparameters are tuned to optimize performance in the validation year. For uncertainty quantification, we assess reliability by examining the fraction of grid cells within predefined, predicted probability bins (0-0.2, 0.2-0.8, 0.8-1.0). A well-calibrated model should yield approximately 20%, 60%, and 20% of grid cells in the lower, middle, and upper ranges, respectively. The model was subsequently tuned to improve this alignment.
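A reliability check of this kind can be sketched as below. The answer does not say exactly which quantity is binned; this example assumes it is the predictive cumulative probability evaluated at the verifying value in each grid cell (a probability integral transform), which is uniformly distributed for a calibrated forecast and therefore gives the 20%/60%/20% split. The Gaussian form and the names `gaussian_pit` and `bin_fractions` are assumptions for illustration.

```python
import math

def gaussian_pit(observed, mean, sigma):
    """Predictive CDF of a Gaussian forecast evaluated at the observed value."""
    return 0.5 * (1.0 + math.erf((observed - mean) / (sigma * math.sqrt(2.0))))

def bin_fractions(probabilities, lower=0.2, upper=0.8):
    """Fractions of values falling in [0, lower), [lower, upper) and [upper, 1]."""
    n = len(probabilities)
    low = sum(1 for p in probabilities if p < lower)
    high = sum(1 for p in probabilities if p >= upper)
    return low / n, (n - low - high) / n, high / n

if __name__ == "__main__":
    # Made-up (observed, forecast mean, forecast sigma) triples, one per grid cell.
    cells = [(0.1, 0.0, 1.0), (1.9, 0.5, 1.0), (-1.4, 0.2, 1.2), (0.3, 0.4, 0.9)]
    pits = [gaussian_pit(obs, mu, sd) for obs, mu, sd in cells]
    print(bin_fractions(pits))
```

Under this reading, too many cells in the two outer bins indicates an overconfident (underdispersed) forecast, and too many in the middle bin indicates an overdispersed one.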

Did you face any challenges during model development, and how did you address them?

1) Biases in raw model simulations: addressed through core model development, bug fixes, and coupling improvements. 2) Overconfident forecasts/overfitting: addressed by adding noise and modifying hyperparameters.

Are there any limitations to your current model that you aim to address in future iterations?

1) A single deterministic model forecast is mapped to quantities, so uncertainty from initial conditions and forcings is not properly captured. 2) The local model prevents learning non-local, larger-scale features and regimes. 3) Currently, sea ice concentration and sea surface temperature are persisted. We would like to move to a climatological reference or to couple full-complexity sea ice and ocean models.

Are there any other AI/ML model components or innovations that you wish to highlight?

Ongoing work employs ensemble-based data assimilation for parameter estimation, which will allow simultaneous calibration and fine-tuning of physics parameters in the numerical model and parameters in the ML post-processor.

Who contributed to the development of this model? Please list all individuals who contributed to this model, along with their specific roles (e.g., data preparation, model architecture, model validation, etc) to acknowledge individual contributions.

This team has chosen to keep its participants anonymous.

Model name

CliMAWeather2

Number of individuals supporting model development:

11-20

Maximum number of Central Processing Units (CPUs) supporting model development or forecast production:

1,000-5,000

Maximum number of Graphics Processing Units (GPUs) supporting model development or forecast production:

64-256

How would you best classify the IT system used for model development or forecast production:

Cloud computing system

Model summary questionnaire for model CliMAWeather2

Which of the following descriptions best represent the overarching design of your forecasting model?

Statistical model focused on generating quintile probabilities.
An empirical model that utilises historical weather patterns.

What techniques did you use to initialise your model? (For example: data sources and processing of initial conditions)

The model is initialized with analysis provided by ERA5.

If any, what data does your model rely on for real-time forecasting purposes?

Key indices of slowly varying large-scale modes of variability, such as ENSO.

What types of datasets were used for model training? (For example: observational datasets, reanalysis data, NWP outputs or satellite data)

ERA5 reanalysis for 1980-2025.

Please provide an overview of your final ML/AI model architecture (For example: key design features, specific algorithms or frameworks used, and any pre- or post-processing steps)

We use a stochastic model that relates key large-scale climate drivers, including long-term climate trends, to surface conditions such as 2-m temperature, MSLP, and precipitation. We generate forecast ensembles by drawing multiple realizations from the stochastic model.
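The answer does not specify the form of the stochastic model. As a deliberately simplified stand-in, the sketch below fits a linear relation between one climate index and a surface anomaly, then draws realizations by adding Gaussian noise with the residual standard deviation. The single predictor, the Gaussian noise, the function names, and the data are all assumptions for illustration; a long-term trend could enter as a further predictor.

```python
import math
import random

def fit_linear(index, target):
    """Ordinary least squares fit of target = intercept + slope * index.

    Returns (intercept, slope, residual standard deviation).
    """
    n = len(index)
    mean_x = sum(index) / n
    mean_y = sum(target) / n
    sxx = sum((x - mean_x) ** 2 for x in index)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(index, target))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    residuals = [y - (intercept + slope * x) for x, y in zip(index, target)]
    sigma = math.sqrt(sum(r * r for r in residuals) / (n - 2))
    return intercept, slope, sigma

def draw_realizations(index_value, params, n_members, rng):
    """Sample ensemble members around the fitted relation for one index value."""
    intercept, slope, sigma = params
    centre = intercept + slope * index_value
    return [rng.gauss(centre, sigma) for _ in range(n_members)]

if __name__ == "__main__":
    # Made-up historical pairs: climate index value, 2-m temperature anomaly (K).
    index = [-1.5, -0.8, -0.2, 0.3, 0.9, 1.6]
    anomaly = [-0.9, -0.6, 0.1, 0.0, 0.7, 1.1]
    params = fit_linear(index, anomaly)
    print(draw_realizations(1.0, params, n_members=5, rng=random.Random(1)))
```

The resulting members can be counted against climatological quintile edges to produce quintile probabilities.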

Have you published or presented any work related to this forecasting model? If yes, could you share references or links?

No.

Before submitting your forecasts to the AI Weather Quest, did you validate your model against observational or independent datasets? If so, how?

We validated the model against historical ERA5 data. However, we also changed the model in the course of the Weather Quest competition.

Did you face any challenges during model development, and how did you address them?

We continually improved the model during the competition period; development is ongoing.

Are there any limitations to your current model that you aim to address in future iterations?

We plan to refine how slow climate modes are incorporated and to expand the historical training dataset. Further tuning will be guided by validation results.

Are there any other AI/ML model components or innovations that you wish to highlight?

Not at this time.

This team has chosen to keep its participants anonymous.

Submitted forecast data in previous period(s)

Please note: Submitted forecast data is only publicly available once the evaluation of a full competitive period has been completed. See the competition's full detailed schedule with submitted data publication dates for each period here.

Access forecast data