Neuralio
Members
- Stelios Kotsopoulos (team leader): Neuralio AI, Small & Medium Enterprise or Startup, Greece
- Paraskevi Vourlioti: Neuralio AI, Small & Medium Enterprise or Startup, Greece
- Theano Mamouka: Neuralio AI, Small & Medium Enterprise or Startup, Greece
- George Gousios: Neuralio AI, Small & Medium Enterprise or Startup, Greece
Model
Model name: NeuWeather
Number of individuals supporting model development: 1-5
Maximum number of Central Processing Units (CPUs) supporting model development or forecast production: < 8
Maximum number of Graphics Processing Units (GPUs) supporting model development or forecast production: < 4
How would you best classify the IT system used for model development or forecast production: Single node system
Model summary questionnaire for model NeuWeather
Please note that the list below shows all questionnaires submitted for this model.
They are displayed from the most recent to the earliest, covering each 13-week competition period in which the team competed with this model.
Which of the following descriptions best represent the overarching design of your forecasting model?
- Machine learning-based weather prediction.
- Hybrid model that integrates physical simulations with machine learning or statistical techniques.
What techniques did you use to initialise your model? (For example: data sources and processing of initial conditions)
Model Initialization Techniques

Data Sources
------------
ECMWF Open Data: Real-time ensemble forecasts (51 members: 1 control + 50 perturbed)
Downloaded from https://data.ecmwf.int/ecpds/home/opendata/
GRIB2 format, converted to NetCDF
ERA5 Reanalysis (for training): 32 years (1991-2022) of historical data
Initial Conditions Processing

Variable Extraction
-------------------
2-meter temperature (2t → tas)
Mean sea level pressure (msl → mslp)
Total precipitation (tp → pr)
Spatial Processing:
-------------------
Regridded to 1.5° regular lat-lon grid (240×121)
Using CDO remapnn (nearest neighbor interpolation)
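The nearest-neighbour remapping performed by CDO's remapnn can be illustrated with a minimal numpy sketch for regular grids. This is not the operational pipeline (which calls CDO itself); the function name `remap_nearest` and the grid setup are assumptions for illustration.

```python
import numpy as np

def remap_nearest(field, src_lats, src_lons, dst_lats, dst_lons):
    """Nearest-neighbour remapping of a 2-D (lat, lon) field, illustrating
    what CDO remapnn does for regular lat-lon grids."""
    # For each target coordinate, find the closest source coordinate
    lat_idx = np.abs(src_lats[:, None] - dst_lats[None, :]).argmin(axis=0)
    lon_idx = np.abs(src_lons[:, None] - dst_lons[None, :]).argmin(axis=0)
    return field[np.ix_(lat_idx, lon_idx)]

# 0.25 deg source grid (as in the ECMWF open-data files) to the 1.5 deg target
src_lats = np.arange(90, -90.1, -0.25)
src_lons = np.arange(0, 360, 0.25)
dst_lats = np.arange(90, -90.1, -1.5)   # 121 latitudes
dst_lons = np.arange(0, 360, 1.5)       # 240 longitudes
field = np.random.rand(src_lats.size, src_lons.size)
out = remap_nearest(field, src_lats, src_lons, dst_lats, dst_lons)
# out has the competition grid shape (121, 240)
```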
Temporal Processing:
--------------------
Downloads 0-21 hour forecasts at 3-hour intervals
Merges timesteps using CDO mergetime
Creates 7-day input sequence for model
Model Architecture
------------------
Pretrained ClimaX (1.40625° resolution) with frozen transformer layers
Spherical harmonic embedding for climate data representation
Quantile regression output: predicts Q20, Q40, Q50, Q60, Q80
1.3B parameters total
Training Approach
-----------------
Anomaly-based training: Model predicts deviations from day-of-year climatology
Standardization using seasonal cycle amplitude (Arctic ~10K, Tropics ~1K)
Latitude-weighted loss (area weighting)
Pinball loss for quantile regression
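The latitude-weighted pinball loss described above can be sketched in numpy. This is an illustration, not the team's implementation; the name `pinball_loss` and the broadcasting layout are assumptions.

```python
import numpy as np

QUANTILES = np.array([0.2, 0.4, 0.5, 0.6, 0.8])  # Q20...Q80 as in the model

def pinball_loss(y_true, y_pred, quantiles=QUANTILES, lats=None):
    """Latitude-weighted pinball (quantile) loss.
    y_true: (lat, lon) target anomalies
    y_pred: (n_quantiles, lat, lon) predicted quantiles
    """
    err = y_true[None, :, :] - y_pred            # positive when under-predicting
    q = quantiles[:, None, None]
    loss = np.maximum(q * err, (q - 1.0) * err)  # standard pinball penalty
    if lats is not None:
        w = np.cos(np.deg2rad(lats))             # area weighting ~ cos(lat)
        w = w / w.mean()
        loss = loss * w[None, :, None]
    return loss.mean()

lats = np.linspace(90, -90, 121)
# A perfect forecast has zero pinball loss
zero_loss = pinball_loss(np.zeros((121, 240)), np.zeros((5, 121, 240)), lats=lats)
```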
Additional Features
-------------------
Static features: land-sea mask, lat/lon encodings
MJO indices (NOAA ROMI) for tropical variability
Seasonal harmonics encoding
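The seasonal harmonics encoding can be sketched as sin/cos features of the day of year; the helper name `seasonal_harmonics` and the choice of two harmonics are assumptions for illustration.

```python
import numpy as np

def seasonal_harmonics(day_of_year, n_harmonics=2):
    """Encode day-of-year as sin/cos harmonics of the annual cycle,
    giving the model smooth seasonal awareness (illustrative helper)."""
    t = 2.0 * np.pi * (day_of_year / 365.25)
    feats = []
    for k in range(1, n_harmonics + 1):
        feats.append(np.sin(k * t))
        feats.append(np.cos(k * t))
    return np.array(feats)

enc = seasonal_harmonics(1)  # early January -> 4 features for 2 harmonics
```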
Inference
----------
Each of 51 ensemble members processed independently
Quantile predictions converted to quintile probabilities via piecewise-linear CDF interpolation
Final forecast: mean probabilities across all members
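The quantile-to-quintile conversion above can be sketched as follows: the five predicted quantiles define a piecewise-linear CDF, which is evaluated at the climatological quintile boundaries; differencing the CDF gives the probability mass in each quintile. A minimal numpy illustration (the function name and the clipped-tail treatment outside the predicted range are assumptions):

```python
import numpy as np

QUANTILE_LEVELS = np.array([0.2, 0.4, 0.5, 0.6, 0.8])  # Q20...Q80

def quintile_probs(pred_quantiles, quintile_edges):
    """Convert predicted quantile values into probabilities for the five
    climatological quintile categories via a piecewise-linear CDF.
    pred_quantiles: (5,) predicted Q20...Q80 values, non-decreasing
    quintile_edges: (4,) climatological quintile boundaries
    """
    # Evaluate the piecewise-linear CDF defined by the predicted quantiles
    # at the quintile boundaries; tails are clipped to 0 and 1 here, a
    # crude extension of the CDF outside the predicted range.
    cdf_at_edges = np.interp(quintile_edges, pred_quantiles, QUANTILE_LEVELS,
                             left=0.0, right=1.0)
    cdf = np.concatenate(([0.0], cdf_at_edges, [1.0]))
    return np.diff(cdf)  # probability mass per quintile, sums to 1

# If the forecast quantiles coincide with climatology, every quintile
# receives the climatological probability of 0.2
pred = np.array([-0.84, -0.25, 0.0, 0.25, 0.84])
edges = np.array([-0.84, -0.25, 0.25, 0.84])
probs = quintile_probs(pred, edges)
```

In the full pipeline this conversion runs per grid point and per ensemble member, and the member probabilities are then averaged.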
If any, what data does your model rely on for real-time forecasting purposes?
Real-Time Data Dependencies

Primary Data Source
-------------------
ECMWF Open Data Portal - Live ensemble forecasts
---------------------------------------------------
URL: https://data.ecmwf.int/ecpds/home/opendata/{date}/{hour}z/ifs/0p25/enfo/
Updated operationally (typically 00z and 12z runs)
Download format: GRIB2 files
Real-Time Variables Required
---------------------------
2-meter temperature (2t)
Mean sea level pressure (msl)
Total precipitation (tp)
Ensemble Configuration
----------------------
51 ensemble members total:
1 control member (perturbation number = 0)
50 perturbed members (perturbation numbers 1-50)
Temporal Requirements
---------------------
Forecast hours: 0, 3, 6, 9, 12, 15, 18, 21 (8 timesteps)
Creates 7-day input sequence for the model
Uses most recent available initialization date
Secondary Data Sources
----------------------
NOAA MJO Indices (ROMI)
-----------------------
Real-time RMM1/RMM2 indices for Madden-Julian Oscillation
Source: ./data/mjo/mjo_indices.txt (updated periodically)
Used as additional input features for tropical variability
No Other External Data
----------------------
The model does not rely on:
Real-time observations
Satellite data
Radar data
Station measurements
What types of datasets were used for model training? (For example: observational datasets, reanalysis data, NWP outputs or satellite data)
ERA5 post-processed reanalysis datasets, prepared and provided by the competition's organisers.
-----------------------------------------------------------------------------------
Time period: 1991-2022 (32 years)
Variables: 2-meter temperature, mean sea level pressure, total precipitation
Resolution: 1.5° regular lat-lon grid (240×121)
Temporal: Daily data
Please provide an overview of your final ML/AI model architecture (For example: key design features, specific algorithms or frameworks used, and any pre- or post-processing steps)
**Model Architecture Overview**
**Core Architecture: ClimaX-Harmonics with Quantile Regression**
Pretrained Vision Transformer (ViT) adapted for subseasonal climate forecasting with spherical harmonic embeddings.
**Key Design Features**
*Pretrained Foundation:*
- ClimaX pretrained weights (1.40625° resolution)
- Frozen transformer layers (48 parameters)
- Fine-tuned residual MLP blocks (1 block)
*Spherical Harmonics Integration:*
- SphericalVariableEmbedding for Earth's geometry
- Spectral coefficients: 14,641
- Grid: 121×240 (latitude×longitude) at 1.5° resolution
*Quantile Regression Output:*
- Predicts 5 quantiles: Q20, Q40, Q50, Q60, Q80
- Pinball loss for quantile training
- Learned aggregation layer
*Anomaly-Based Training:*
- Model predicts deviations from day-of-year climatology
- Standardized by seasonal cycle amplitude (Arctic ~10K, Tropics ~1K)
**Model Hyperparameters**
- Embed dimension: 1024
- Depth: 8 transformer blocks
- Dropout: 0.3
- Weight decay: 1e-3
- Learning rate: 2e-5
- Batch size: 32
- Total parameters: ~1.3B
**Input/Output**
- Input sequence: 7 days
- Output: 32 days forecast
- Input channels: 15 (3 predictors + 7 static + 5 MJO)
- Predictors: temperature, precipitation, pressure
**Framework Stack**
- Deep Learning: PyTorch
- Distributed Training: Hugging Face Accelerate (multi-GPU)
- Mixed Precision: BF16
- Data Processing: xarray, numpy, CDO
**Loss Functions**
- Primary: Pinball loss (quantile regression)
- Secondary: Spectral loss in spherical harmonics domain
- Latitude weighting: Area-weighted (cos(lat))
- Crossing penalty: Enforces quantile monotonicity
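The crossing penalty can be sketched as follows: any amount by which a lower quantile exceeds a higher one is penalized. A minimal numpy illustration (the function name is an assumption):

```python
import numpy as np

def crossing_penalty(pred_quantiles):
    """Penalty enforcing quantile monotonicity along the last axis.
    pred_quantiles: (..., n_quantiles), ordered Q20 -> Q80."""
    # Positive wherever a lower quantile exceeds the next higher one
    violation = pred_quantiles[..., :-1] - pred_quantiles[..., 1:]
    return np.maximum(violation, 0.0).mean()

ok = crossing_penalty(np.array([1.0, 2.0, 3.0, 4.0, 5.0]))   # monotone: 0
bad = crossing_penalty(np.array([1.0, 0.5, 3.0, 4.0, 5.0]))  # Q20 > Q40: > 0
```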
**Pre-Processing Steps**
1. Regrid ECMWF data to 1.5° using CDO remapnn
2. Extract variables: temperature, pressure, precipitation
3. Add static features: land-sea mask, lat/lon encodings
4. Add MJO indices (RMM1, RMM2, amplitude, phase)
5. Normalize using fitted StandardScaler
6. Convert targets to anomalies from climatology
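Step 6, together with the seasonal-amplitude standardization described in the training approach, can be sketched as below; the helper name and array layouts are assumptions.

```python
import numpy as np

def to_standardized_anomaly(field, day_of_year, climatology, amplitude):
    """Convert a (lat, lon) field to a standardized anomaly: subtract the
    day-of-year climatology and divide by the seasonal cycle amplitude
    (roughly ~10 K in the Arctic, ~1 K in the tropics).
    climatology: (366, lat, lon); amplitude: (lat, lon)."""
    anom = field - climatology[day_of_year - 1]
    return anom / np.maximum(amplitude, 1e-6)  # guard against zero amplitude

clim = np.zeros((366, 2, 2))
amp = np.full((2, 2), 10.0)  # Arctic-like seasonal amplitude
# A 5 K anomaly over a 10 K amplitude -> 0.5 standardized units
z = to_standardized_anomaly(np.full((2, 2), 5.0), 15, clim, amp)
```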
**Post-Processing Steps**
1. Inverse anomaly transform (add climatology back)
2. Reshape to spatial grid (121×240)
3. Save as NetCDF (CF-1.6 compliant)
4. Convert quantile predictions to quintile probabilities via piecewise-linear CDF interpolation
5. Average across 51 ensemble members
6. Calculate weekly means for Days 19-25 and Days 26-32
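Step 6's weekly aggregation maps the 32-day daily forecast onto the two submission windows. A minimal sketch, using 1-based day numbering as in the text:

```python
import numpy as np

# Toy (32, lat, lon) daily forecast whose value at each step equals the
# day number, so the weekly means are easy to verify
daily = np.arange(1, 33, dtype=float)[:, None, None] * np.ones((1, 2, 2))

week3 = daily[18:25].mean(axis=0)  # Days 19-25 (indices 18..24)
week4 = daily[25:32].mean(axis=0)  # Days 26-32 (indices 25..31)
```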
**Additional Features**
- Static features: land-sea mask, lat/lon encodings
- MJO indices (RMM1, RMM2)
- Latitude-weighted loss (area scheme)
**Training**
- Data: ERA5 reanalysis (1991-2022)
- Train/Val split: 90%/10% (year-based)
- Early stopping with patience
- Mixed precision (BF16)
- Multi-GPU with Hugging Face Accelerate
**Evaluation Metrics**
- RPSS (Ranked Probability Skill Score) - competition metric
- MAE, RMSE for validation
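RPSS, the competition metric, can be sketched for a single quintile forecast; climatology (equal 0.2 odds per quintile) serves as the reference. Function names are illustrative, not the competition's scoring code.

```python
import numpy as np

def rps(forecast_probs, obs_category, n_cat=5):
    """Ranked Probability Score: squared distance between the cumulative
    forecast and cumulative observation over the ordered categories."""
    obs = np.zeros(n_cat)
    obs[obs_category] = 1.0
    return np.sum((np.cumsum(forecast_probs) - np.cumsum(obs)) ** 2)

def rpss(forecast_probs, obs_category, ref_probs=None):
    """Skill score relative to a reference; climatology by default."""
    if ref_probs is None:
        ref_probs = np.full(5, 0.2)
    return 1.0 - rps(forecast_probs, obs_category) / rps(ref_probs, obs_category)
```

A confident correct forecast scores 1, climatology itself scores 0, and a forecast worse than climatology goes negative.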
Have you published or presented any work related to this forecasting model? If yes, could you share references or links?
No
Before submitting your forecasts to the AI Weather Quest, did you validate your model against observational or independent datasets? If so, how?
Yes, using a held-out year-based validation split (2019-2021) from the ERA5 training data. No independent observational datasets were used.
Did you face any challenges during model development, and how did you address them?
**1. Memory Management**
*Challenge:* Large spatial grids (121×240) × 15 input channels × 32-day output × 5 quantiles caused OOM errors on GPU.
*Solutions:*
- Mixed precision training (BF16)
- Batch-wise scaler fitting instead of loading full dataset
- Gradient accumulation (effective batch size management)
- Pre-flattened climatology arrays to avoid repeated memory allocations
**2. Spherical Geometry**
*Challenge:* Earth's curved surface poorly represented in standard grid approaches; polar regions have different grid cell sizes.
*Solutions:*
- Spherical harmonic embeddings (SphericalVariableEmbedding)
- Latitude-weighted loss function (area weighting with cos(lat))
- Spectral loss in frequency domain
**3. Quantile Spread Calibration**
*Challenge:* Model's predicted quantile spread didn't match climatological variability, causing poor RPSS despite good MAE.
*Solutions:*
- Anomaly-based training (predict deviations from climatology)
- Standardization by seasonal cycle amplitude
- Piecewise-linear CDF interpolation for quantile-to-quintile conversion
**4. Validation Plateau**
*Challenge:* Validation loss plateaued early while training loss continued decreasing (overfitting risk).
*Solutions:*
- Frozen pretrained ClimaX transformer layers
- Only fine-tune 1 residual MLP block
- Early stopping with patience
- Dropout (0.3) and weight decay (1e-3)
**5. Subseasonal Predictability Limits**
*Challenge:* Skill at 3-4 week lead times is fundamentally limited by atmospheric chaos.
*Solutions:*
- MJO indices as additional predictors (source of subseasonal predictability)
- Ensemble processing (51 ECMWF members)
- Probabilistic output (quintile probabilities) rather than deterministic forecasts
Are there any limitations to your current model that you aim to address in future iterations?
**Current Limitations and Future Improvements**
**1. Limited Training Data Diversity**
*Limitation:* Model trained only on ERA5 reanalysis (1991-2022). No multi-source data fusion.
*Future:* Incorporate satellite observations, station data, and other reanalysis products for improved robustness.
**2. Single Variable Output**
*Limitation:* Currently predicts only 2-meter temperature; precipitation not included.
*Future:* Extend to multi-variable prediction (temperature and precipitation) for complete subseasonal forecasts.
**3. Quantile Spread Calibration**
*Limitation:* Model's predicted uncertainty spread doesn't perfectly match climatological variability in all regions, particularly at high latitudes.
*Future:* Implement region-specific calibration or direct training against competition's 20-year quintile boundaries.
**4. Static Ensemble Processing**
*Limitation:* All 51 ECMWF ensemble members processed independently and simply averaged.
*Future:* Learn ensemble weighting or use attention mechanisms to weight members based on their reliability.
**5. No Online Learning**
*Limitation:* Model is static after training; doesn't adapt to recent observations or forecast errors.
*Future:* Implement online fine-tuning or adaptive bias correction based on recent verification.
**6. Computational Cost**
*Limitation:* 1.3B parameter model requires significant GPU resources for both training and inference.
*Future:* Knowledge distillation to smaller models or more efficient architectures for operational deployment.
**7. Limited Teleconnection Exploitation**
*Limitation:* Only MJO indices used; other sources of subseasonal predictability not explicitly modeled.
*Future:* Add stratospheric conditions, sea ice extent, soil moisture, and other teleconnection indices (ENSO, NAO, etc.).
Are there any other AI/ML model components or innovations that you wish to highlight?
**Additional Innovations**
**1. Pretrained Climate Foundation Model**
Leveraged ClimaX - a foundation model pretrained on diverse climate data - rather than training from scratch. This provides:
- Better generalization from climate-specific learned representations
- Reduced training time and data requirements
- Transfer learning benefits for subseasonal forecasting
**2. Spherical Harmonic Embeddings**
Custom `SphericalVariableEmbedding` layer that respects Earth's spherical geometry:
- 14,641 spectral coefficients
- Proper handling of polar singularities
- Spectral loss in frequency domain for physically consistent predictions
**3. Anomaly-Based Learning**
Model predicts deviations from day-of-year climatology rather than absolute temperatures:
- Removes seasonal cycle (easier learning task)
- Focuses model capacity on anomalies (the signal that matters)
- Natural regularization toward climatology
**4. Quantile Regression for Probabilistic Forecasts**
Direct prediction of uncertainty through quantile outputs (Q20, Q40, Q50, Q60, Q80):
- Pinball loss for proper quantile estimation
- Learned aggregation across quantiles
- Monotonicity enforcement (crossing penalty)
**5. Multi-Scale Feature Integration**
Combines multiple information sources in a unified architecture:
- Dynamic atmospheric fields (temperature, pressure, precipitation)
- Static geographic features (land-sea mask, topography encodings)
- Teleconnection indices (MJO RMM1/RMM2)
- Temporal harmonics for seasonal awareness
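Assembling the 15 input channels (3 predictors + 7 static + 5 MJO, per the architecture overview) can be sketched as below; broadcasting the scalar MJO indices to spatial maps is an assumption about the layout, not a confirmed detail.

```python
import numpy as np

nlat, nlon = 121, 240
dynamic = np.random.rand(3, nlat, nlon)  # temperature, pressure, precipitation
static = np.random.rand(7, nlat, nlon)   # land-sea mask, lat/lon encodings, ...
mjo = np.random.rand(5)                  # RMM1, RMM2, amplitude, phase, ...

# Scalar teleconnection indices are broadcast to the spatial grid so all
# inputs share a single (channel, lat, lon) layout
mjo_maps = np.broadcast_to(mjo[:, None, None], (5, nlat, nlon))
x = np.concatenate([dynamic, static, mjo_maps], axis=0)
# x is the 15-channel model input
```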
**6. Hybrid Loss Function**
Combination of spatial and spectral losses:
- Pinball loss for accurate quantile prediction
- Spectral loss in spherical harmonics domain
- Latitude weighting for proper global averaging
Who contributed to the development of this model? Please list all individuals who contributed to this model, along with their specific roles (e.g., data preparation, model architecture, model validation, etc) to acknowledge individual contributions.
Data preparation: Theano Mamouka
Model architecture, training, feature engineering, evaluation: George Gousios
Methodology, analysis, investigation: Paraskevi Vourlioti
Supervision, research, team administration: Stelios Kotsopoulos
Which of the following descriptions best represent the overarching design of your forecasting model?
- Machine learning-based weather prediction.
- Hybrid model that integrates physical simulations with machine learning or statistical techniques.
What techniques did you use to initialise your model? (For example: data sources and processing of initial conditions)
Model Initialization Techniques
Data Sources
ECMWF Open Data: Real-time ensemble forecasts (51 members: 1 control + 50 perturbed)
Downloaded from https://data.ecmwf.int/ecpds/home/opendata/
GRIB2 format, converted to NetCDF
Initial Conditions Processing
Variable Extraction:
2-meter temperature (2t → tas)
Mean sea level pressure (msl → mslp)
Total precipitation (tp → pr)
Spatial Processing:
Regridded to 1.5° regular lat-lon grid (240×121)
Using CDO remapnn (nearest neighbor interpolation)
Temporal Processing:
Downloads 0-21 hour forecasts at 3-hour intervals
Merges timesteps using CDO mergetime
Creates 7-day input sequence for model
Data Normalization:
Applies transform fitted during training (pickled scaler)
Transform includes feature scaling for X and target scaling for y
Inverse transform applied to predictions
Key Processing Steps
NaN handling: Replaces NaNs with zeros
Harmonics conversion: Optional spherical harmonics data conversion
Ensemble processing: Each of 51 members processed independently, then averaged for final forecast
If any, what data does your model rely on for real-time forecasting purposes?
Real-Time Data Dependencies
Primary Data Source
ECMWF Open Data Portal - Live ensemble forecasts
URL: https://data.ecmwf.int/ecpds/home/opendata/{date}/{hour}z/ifs/0p25/enfo/
Updated operationally (typically 00z and 12z runs)
Download format: GRIB2 files
Real-Time Variables Required:
2-meter temperature (2t)
Mean sea level pressure (msl)
Total precipitation (tp)
Ensemble Configuration:
51 ensemble members total:
1 control member (perturbation number = 0)
50 perturbed members (perturbation numbers 1-50)
Temporal Requirements:
Forecast hours: 0, 3, 6, 9, 12, 15, 18, 21 (8 timesteps)
Creates 7-day input sequence for the model
Uses most recent available initialization date
Processing Pipeline:
init_date='20251006'
init_hour='00'
year='2025'
No Other External Data
The model does not rely on:
Real-time observations
Satellite data
Radar data
Station measurements
What types of datasets were used for model training? (For example: observational datasets, reanalysis data, NWP outputs or satellite data)
ERA5 post-processed reanalysis datasets, prepared and provided by the competition's organisers.
Please provide an overview of your final ML/AI model architecture (For example: key design features, specific algorithms or frameworks used, and any pre- or post-processing steps)
Model Architecture Overview
Core Architecture: ClimaX with Spherical Harmonics
Model Type: Vision Transformer (ViT) adapted for climate data
Key Design Features
Spherical Harmonics Integration:
Handles Earth's spherical geometry
Spectral loss in frequency domain
Grid type: Equiangular (default)
Spatial dimensions: 121×240 (latitude×longitude)
Multi-Variable Processing:
Input channels: 3 (temperature, pressure, precipitation)
Output: Temperature forecasts (32 days)
Patch-based processing (ViT approach)
Temporal Component:
Sequence length: 7 days input
Prediction horizon: 32 days output
LSTM layers for temporal encoding
Model Hyperparameters
Saved configuration:
var_embed_dim: 128+ (variable embedding dimension)
depth: 4+ (transformer blocks)
heads: Multi-head attention (≥4)
mlp_dim: hidden_dim × 4
dropout: 0.2-0.3
spectral_loss_weight: 0.1
Framework Stack
Deep Learning: PyTorch
Distributed Training: Hugging Face Accelerate
Data Processing: xarray, numpy
Optimization:
Mixed precision training
Gradient accumulation (4 steps)
Early stopping (patience=8)
Pre-Processing Steps
Spatial Processing:
Regrid to 1.5° using CDO remapnn
Pad to power-of-2 dimensions for ViT
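The power-of-2 padding for the ViT patching can be sketched as below; the helper names are illustrative.

```python
import numpy as np

def next_pow2(n):
    """Smallest power of two >= n."""
    return 1 << (n - 1).bit_length()

def pad_for_vit(field):
    """Zero-pad a (lat, lon) field so both spatial dimensions are powers
    of two and divide evenly into ViT patches (illustrative helper)."""
    nlat, nlon = field.shape
    pad_lat = next_pow2(nlat) - nlat
    pad_lon = next_pow2(nlon) - nlon
    return np.pad(field, ((0, pad_lat), (0, pad_lon)))

# The 121x240 competition grid pads to 128x256
padded = pad_for_vit(np.ones((121, 240)))
```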
Normalization:
Fitted on 3000 samples
Separate feature (X) and target (y) transforms
Saved as pickle: *_transform.pkl
Harmonics Conversion:
converter = HarmonicsDataConverter(original_spatial_dims, (nlat, nlon))
Loss Functions
Dual Loss Approach:
Spatial Loss: Standard MSE/MAE in physical space
Spectral Loss: Spherical harmonics frequency domain
harmonics_loss = SpectralLoss(
nlat=nlat,
nlon=nlon,
grid='equiangular',
spectral_weight=0.1
)
Post-Processing Steps
Inverse Transform:
Denormalize predictions to physical units
Reshape to Spatial Grid:
Convert flattened output back to 121×240 grid
NetCDF Output:
CF-1.6 compliant
Daily forecasts for 32 days
Proper coordinate metadata
Weekly Aggregation:
Calculate weekly means from daily predictions
Ensemble Processing:
Average 51 ensemble member predictions
Convert to quintile probabilities
forecast_pbs_week1_MEAN = np.mean(all_forecast_pbs_week1, axis=0)
Evaluation Metrics
MAE (Mean Absolute Error)
RMSE (Root Mean Square Error)
Custom ECMWF metrics
Have you published or presented any work related to this forecasting model? If yes, could you share references or links?
No
Before submitting your forecasts to the AI Weather Quest, did you validate your model against observational or independent datasets? If so, how?
No
Did you face any challenges during model development, and how did you address them?
Memory Management Issues
Challenge
Large spatial grids (121×240) × multiple variables × long sequences caused OOM errors
Solutions:
- Memory optimization flag: args.use_memory_optimizations = True
- Batch-wise scaler fitting instead of loading the full dataset
- Gradient checkpointing for large inputs
- Dropping incomplete batches to save memory
- Mixed precision training: use_mixed_precision=True
- Gradient accumulation: accumulation_steps=4
Spherical Geometry Handling
Challenge
Earth's curved surface poorly represented in standard CNN/grid approaches
Solution: Spherical Harmonics Integration
Patch Size Compatibility for Vision Transformer
Irregular spatial dimensions don't divide evenly into patches
Solution: Padding to power-of-2
Are there any limitations to your current model that you aim to address in future iterations?
Optimize the model's ability to produce realistic temperature predictions during inference; at present the forecasts show limited spatial variability and little differentiation between ensemble members.
Are there any other AI/ML model components or innovations that you wish to highlight?
Spherical Harmonics Integration (Primary Innovation)
ClimaX-Harmonics Architecture
Innovation: Physics-informed deep learning that respects Earth's spherical geometry
Why it matters:
Standard CNNs treat lat-lon grids as flat images (wrong!)
Spherical harmonics preserve geometric properties at poles
Spectral loss enforces physically-consistent spatial patterns
Better captures global circulation patterns
Dual-Domain Loss Function
Innovation: Training in both spatial AND frequency domains simultaneously
train_model_with_harmonics_loss(
accelerator,
model,
device,
train_loader,
val_loader,
harmonics_loss_fn=harmonics_loss,
spectral_loss_weight=0.1, # 10% spectral, 90% spatial
...
)
Benefits:
Spatial loss: Accurate point-wise predictions
Spectral loss: Smooth, physically realistic large-scale patterns
Prevents checkerboard artifacts and unrealistic small-scale noise
Distributed Training with Accelerate
Innovation:
Seamless multi-GPU training without code changes
Handles gradient synchronization automatically
find_unused_parameters=True allows flexible model architectures
Who contributed to the development of this model? Please list all individuals who contributed to this model, along with their specific roles (e.g., data preparation, model architecture, model validation, etc) to acknowledge individual contributions.
Data preparation: Theano Mamouka
Model architecture: George Gousios
Methodology, analysis, investigation: Paraskevi Vourlioti
Supervision, research, team administration: Stelios Kotsopoulos
Submitted forecast data in previous period(s)
Please note: Submitted forecast data is only publicly available once the evaluation of a full competitive period has been completed. See the competition's full detailed schedule with submitted data publication dates for each period here.