Neuralio

Members

First name (team leader)
Stelios
Last name
Kotsopoulos
Organisation name
Neuralio AI
Organisation type
Small & Medium Enterprise or Startup
Organisation location
Greece
First name
Paraskevi
Last name
Vourlioti
Organisation name
Neuralio AI
Organisation type
Small & Medium Enterprise or Startup
Organisation location
Greece
First name
Theano
Last name
Mamouka
Organisation name
Neuralio AI
Organisation type
Small & Medium Enterprise or Startup
Organisation location
Greece
First name
George
Last name
Gousios
Organisation name
Neuralio AI
Organisation type
Small & Medium Enterprise or Startup
Organisation location
Greece

Model

Model name

NeuWeather
Number of individuals supporting model development:
1-5
Maximum number of Central Processing Units (CPUs) supporting model development or forecast production:
< 8
Maximum number of Graphics Processing Units (GPUs) supporting model development or forecast production:
< 4
How would you best classify the IT system used for model development or forecast production:
Single node system

Model summary questionnaire for model NeuWeather

Please note that the list below shows all questionnaires submitted for this model.
They are displayed from the most recent to the earliest, covering each 13-week competition period in which the team competed with this model.

Which of the following descriptions best represent the overarching design of your forecasting model?
  • Machine learning-based weather prediction.
  • Hybrid model that integrates physical simulations with machine learning or statistical techniques.
What techniques did you use to initialise your model? (For example: data sources and processing of initial conditions)
Model Initialization Techniques

Data Sources
------------
ECMWF Open Data:
- Real-time ensemble forecasts (51 members: 1 control + 50 perturbed)
- Downloaded from https://data.ecmwf.int/ecpds/home/opendata/
- GRIB2 format, converted to NetCDF

ERA5 Reanalysis (for training):
- 32 years (1991-2022) of historical data

Initial Conditions Processing

Variable Extraction:
--------------------
- 2-meter temperature (2t → tas)
- Mean sea level pressure (msl → mslp)
- Total precipitation (tp → pr)

Spatial Processing:
-------------------
- Regridded to 1.5° regular lat-lon grid (240×121)
- Using CDO remapnn (nearest neighbor interpolation)

Temporal Processing:
--------------------
- Downloads 0-21 hour forecasts at 3-hour intervals
- Merges timesteps using CDO mergetime
- Creates 7-day input sequence for model

Model Architecture
------------------
- Pretrained ClimaX (1.40625° resolution) with frozen transformer layers
- Spherical harmonic embedding for climate data representation
- Quantile regression output: predicts Q20, Q40, Q50, Q60, Q80
- 1.3B parameters total

Training Approach
-----------------
- Anomaly-based training: model predicts deviations from day-of-year climatology
- Standardization using seasonal cycle amplitude (Arctic ~10 K, Tropics ~1 K)
- Latitude-weighted loss (area weighting)
- Pinball loss for quantile regression

Additional Features
-------------------
- Static features: land-sea mask, lat/lon encodings
- MJO indices (NOAA ROMI) for tropical variability
- Seasonal harmonics encoding

Inference
---------
- Each of 51 ensemble members processed independently
- Quantile predictions converted to quintile probabilities via piecewise-linear CDF interpolation
- Final forecast: mean probabilities across all members
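The quantile-to-quintile conversion named under Inference can be sketched roughly as below. This is an illustration rather than the team's code: the function name `quintile_probs`, the tail treatment (extending the outer CDF segments linearly until they reach 0 and 1), and the example numbers are all assumptions. Only the five quantile levels, the piecewise-linear interpolation, and the averaging across members come from the answer above.

```python
import numpy as np

LEVELS = np.array([0.2, 0.4, 0.5, 0.6, 0.8])  # quantile levels named in the answer

def quintile_probs(quantiles, clim_edges):
    """Probability of each climatological quintile for one grid point.

    quantiles:  5 predicted values at LEVELS, in increasing order.
    clim_edges: 4 climatological quintile boundaries (20/40/60/80th percentiles).
    """
    q = np.asarray(quantiles, dtype=float)
    # ASSUMPTION: the tails are closed by mirroring the outer segments,
    # so the CDF reaches 0 below Q20 and 1 above Q80.
    lo = q[0] - (q[1] - q[0])
    hi = q[-1] + (q[-1] - q[-2])
    xs = np.concatenate(([lo], q, [hi]))
    ps = np.concatenate(([0.0], LEVELS, [1.0]))
    # Piecewise-linear CDF evaluated at the climatological boundaries.
    cdf_at_edges = np.interp(clim_edges, xs, ps)
    cdf = np.concatenate(([0.0], cdf_at_edges, [1.0]))
    return np.diff(cdf)

# Hypothetical example: two "members", one matching climatology, one shifted warm.
clim_edges = np.array([-1.0, -0.3, 0.3, 1.0])
member_quantiles = np.array([
    [-1.0, -0.3, 0.0, 0.3, 1.0],
    [0.0, 0.7, 1.0, 1.3, 2.0],
])
member_probs = np.array([quintile_probs(q, clim_edges) for q in member_quantiles])
forecast_probs = member_probs.mean(axis=0)  # mean probabilities across members
```

A member whose quantiles coincide with the climatological boundaries yields equal probabilities for all five quintiles, while a warm-shifted member moves probability mass toward the upper quintiles.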
If any, what data does your model rely on for real-time forecasting purposes?
Real-Time Data Dependencies

Primary Data Source
-------------------
ECMWF Open Data Portal - Live ensemble forecasts
- URL: https://data.ecmwf.int/ecpds/home/opendata/{date}/{hour}z/ifs/0p25/enfo/
- Updated operationally (typically 00z and 12z runs)
- Download format: GRIB2 files

Real-Time Variables Required
----------------------------
- 2-meter temperature (2t)
- Mean sea level pressure (msl)
- Total precipitation (tp)

Ensemble Configuration
----------------------
51 ensemble members total:
- 1 control member (perturbation number = 0)
- 50 perturbed members (perturbation numbers 1-50)

Temporal Requirements
---------------------
- Forecast hours: 0, 3, 6, 9, 12, 15, 18, 21 (8 timesteps)
- Creates 7-day input sequence for the model
- Uses most recent available initialization date

Secondary Data Sources
----------------------
NOAA MJO Indices (ROMI)
=======================
- Real-time RMM1/RMM2 indices for Madden-Julian Oscillation
- Source: ./data/mjo/mjo_indices.txt (updated periodically)
- Used as additional input features for tropical variability

No Other External Data
----------------------
The model does not rely on:
- Real-time observations
- Satellite data
- Radar data
- Station measurements
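As a rough illustration of how the URL template, forecast hours, and member numbering above fit together, the sketch below builds the directory URL for one initialization. The helper name `enfo_directory` is hypothetical, and the `YYYYMMDD` date and two-digit hour formats are assumptions about how `{date}` and `{hour}` are filled. File names inside the directory are deliberately not guessed.

```python
from datetime import date

BASE = "https://data.ecmwf.int/ecpds/home/opendata"
FORECAST_HOURS = list(range(0, 24, 3))   # 0, 3, ..., 21 (8 timesteps)
MEMBER_NUMBERS = list(range(51))         # 0 = control, 1-50 = perturbed

def enfo_directory(init_date: date, init_hour: int) -> str:
    """Fill the {date}/{hour}z template (date as YYYYMMDD, hour zero-padded: assumed)."""
    return f"{BASE}/{init_date:%Y%m%d}/{init_hour:02d}z/ifs/0p25/enfo/"

# Example with an arbitrary initialization date and the 00z run.
example_url = enfo_directory(date(2025, 10, 6), 0)
```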
What types of datasets were used for model training? (For example: observational datasets, reanalysis data, NWP outputs or satellite data)
ERA5 post-processed reanalysis datasets, prepared and provided by the competition's organisers.
- Time period: 1991-2022 (32 years)
- Variables: 2-meter temperature, mean sea level pressure, total precipitation
- Resolution: 1.5° regular lat-lon grid (240×121)
- Temporal: Daily data
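For orientation, the grid and period quoted above can be reproduced with a few lines of NumPy. This is only a sketch: the north-to-south latitude ordering and the 0° starting longitude are assumptions, since the answer does not state the coordinate conventions of the provided files.

```python
import numpy as np

RES = 1.5  # grid spacing in degrees

# ASSUMPTION: latitudes run north to south and include both poles.
lat = np.linspace(90.0, -90.0, int(180 / RES) + 1)   # 121 points
# ASSUMPTION: longitudes start at 0 and run eastward.
lon = np.arange(0.0, 360.0, RES)                     # 240 points

train_years = np.arange(1991, 2023)                  # 1991-2022 inclusive
field_shape = (lat.size, lon.size)                   # one daily 2-D field
```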
Please provide an overview of your final ML/AI model architecture (For example: key design features, specific algorithms or frameworks used, and any pre- or post-processing steps)
**Model Architecture Overview**

**Core Architecture: ClimaX-Harmonics with Quantile Regression**

Pretrained Vision Transformer (ViT) adapted for subseasonal climate forecasting with spherical harmonic embeddings.

**Key Design Features**

*Pretrained Foundation:*
- ClimaX pretrained weights (1.40625° resolution)
- Frozen transformer layers (48 parameters)
- Fine-tuned residual MLP blocks (1 block)

*Spherical Harmonics Integration:*
- SphericalVariableEmbedding for Earth's geometry
- Spectral coefficients: 14,641
- Grid: 121×240 (latitude×longitude) at 1.5° resolution

*Quantile Regression Output:*
- Predicts 5 quantiles: Q20, Q40, Q50, Q60, Q80
- Pinball loss for quantile training
- Learned aggregation layer

*Anomaly-Based Training:*
- Model predicts deviations from day-of-year climatology
- Standardized by seasonal cycle amplitude (Arctic ~10 K, Tropics ~1 K)

**Model Hyperparameters**
- Embed dimension: 1024
- Depth: 8 transformer blocks
- Dropout: 0.3
- Weight decay: 1e-3
- Learning rate: 2e-5
- Batch size: 32
- Total parameters: ~1.3B

**Input/Output**
- Input sequence: 7 days
- Output: 32 days forecast
- Input channels: 15 (3 predictors + 7 static + 5 MJO)
- Predictors: temperature, precipitation, pressure

**Framework Stack**
- Deep Learning: PyTorch
- Distributed Training: Hugging Face Accelerate (multi-GPU)
- Mixed Precision: BF16
- Data Processing: xarray, numpy, CDO

**Loss Functions**
- Primary: Pinball loss (quantile regression)
- Secondary: Spectral loss in spherical harmonics domain
- Latitude weighting: Area-weighted (cos(lat))
- Crossing penalty: Enforces quantile monotonicity

**Pre-Processing Steps**
1. Regrid ECMWF data to 1.5° using CDO remapnn
2. Extract variables: temperature, pressure, precipitation
3. Add static features: land-sea mask, lat/lon encodings
4. Add MJO indices (RMM1, RMM2, amplitude, phase)
5. Normalize using fitted StandardScaler
6. Convert targets to anomalies from climatology

**Post-Processing Steps**
1. Inverse anomaly transform (add climatology back)
2. Reshape to spatial grid (121×240)
3. Save as NetCDF (CF-1.6 compliant)
4. Convert quantile predictions to quintile probabilities via piecewise-linear CDF interpolation
5. Average across 51 ensemble members
6. Calculate weekly means for Days 19-25 and Days 26-32

**Additional Features**
- Static features: land-sea mask, lat/lon encodings
- MJO indices (RMM1, RMM2)
- Latitude-weighted loss (area scheme)

**Training**
- Data: ERA5 reanalysis (1991-2022)
- Train/Val split: 90%/10% (year-based)
- Early stopping with patience
- Mixed precision (BF16)
- Multi-GPU with Hugging Face Accelerate

**Evaluation Metrics**
- RPSS (Ranked Probability Skill Score) - competition metric
- MAE, RMSE for validation
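The loss terms listed above (pinball loss, cos(lat) area weighting, crossing penalty) can be sketched in NumPy as follows. This is a simplified illustration, not the PyTorch training code: the function names, the normalisation of the weights to mean 1, and the `crossing_weight` parameter are assumptions, and the spectral term is left out.

```python
import numpy as np

LEVELS = np.array([0.2, 0.4, 0.5, 0.6, 0.8])  # Q20, Q40, Q50, Q60, Q80

def pinball_loss(pred, target, lat_deg):
    """Latitude-weighted pinball loss.

    pred: (5, nlat, nlon) predicted quantiles; target: (nlat, nlon);
    lat_deg: (nlat,) latitudes in degrees.
    """
    w = np.cos(np.deg2rad(lat_deg))
    w = w / w.mean()                              # ASSUMPTION: weights scaled to mean 1
    diff = target[None] - pred                    # positive when the model under-predicts
    tau = LEVELS[:, None, None]
    per_point = np.maximum(tau * diff, (tau - 1.0) * diff)
    return float((per_point * w[None, :, None]).mean())

def crossing_penalty(pred):
    """Mean amount by which a lower quantile exceeds the next higher one."""
    return float(np.maximum(pred[:-1] - pred[1:], 0.0).mean())

def quantile_loss(pred, target, lat_deg, crossing_weight=1.0):
    # crossing_weight is a hypothetical knob; the answer does not give its value.
    return pinball_loss(pred, target, lat_deg) + crossing_weight * crossing_penalty(pred)

# Toy check on a small grid.
lat_deg = np.linspace(90.0, -90.0, 13)
target = np.ones((13, 24))
flat_pred = np.zeros((5, 13, 24))
ordered_pred = LEVELS[:, None, None] * np.ones((5, 13, 24))
```

The pinball term penalises under-prediction of a quantile in proportion to its level and over-prediction in proportion to one minus its level, which is what makes each output channel converge toward the intended quantile.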
Have you published or presented any work related to this forecasting model? If yes, could you share references or links?
No
Before submitting your forecasts to the AI Weather Quest, did you validate your model against observational or independent datasets? If so, how?
Yes, using a held-out year-based validation split (2019-2021) from the ERA5 training data. No independent observational datasets were used.
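A year-based hold-out of this kind might look like the sketch below. The names are hypothetical, and keeping 2022 in the training set is an assumption: the answer states only that 2019-2021 were held out.

```python
ALL_YEARS = list(range(1991, 2023))   # ERA5 training period, 1991-2022
VAL_YEARS = {2019, 2020, 2021}        # held-out validation years

def split_by_year(sample_years):
    """Return (train_idx, val_idx) so that no year contributes to both sets."""
    train_idx = [i for i, y in enumerate(sample_years) if y not in VAL_YEARS]
    val_idx = [i for i, y in enumerate(sample_years) if y in VAL_YEARS]
    return train_idx, val_idx

# One sample per year is enough to show the proportions.
train_idx, val_idx = split_by_year(ALL_YEARS)
val_fraction = len(val_idx) / len(ALL_YEARS)
```

Three of 32 years is a little over 9% of the data, in line with the roughly 90%/10% year-based split quoted in the architecture answer. Splitting by whole years avoids leaking neighbouring days of the same season between training and validation.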
Did you face any challenges during model development, and how did you address them?
**1. Memory Management**

*Challenge:* Large spatial grids (121×240) × 15 input channels × 32-day output × 5 quantiles caused OOM errors on GPU.

*Solutions:*
- Mixed precision training (BF16)
- Batch-wise scaler fitting instead of loading full dataset
- Gradient accumulation (effective batch size management)
- Pre-flattened climatology arrays to avoid repeated memory allocations

**2. Spherical Geometry**

*Challenge:* Earth's curved surface poorly represented in standard grid approaches; polar regions have different grid cell sizes.

*Solutions:*
- Spherical harmonic embeddings (SphericalVariableEmbedding)
- Latitude-weighted loss function (area weighting with cos(lat))
- Spectral loss in frequency domain

**3. Quantile Spread Calibration**

*Challenge:* Model's predicted quantile spread didn't match climatological variability, causing poor RPSS despite good MAE.

*Solutions:*
- Anomaly-based training (predict deviations from climatology)
- Standardization by seasonal cycle amplitude
- Piecewise-linear CDF interpolation for quantile-to-quintile conversion

**4. Validation Plateau**

*Challenge:* Validation loss plateaued early while training loss continued decreasing (overfitting risk).

*Solutions:*
- Frozen pretrained ClimaX transformer layers
- Only fine-tune 1 residual MLP block
- Early stopping with patience
- Dropout (0.3) and weight decay (1e-3)

**5. Subseasonal Predictability Limits**

*Challenge:* Skill at 3-4 week lead times is fundamentally limited by atmospheric chaos.

*Solutions:*
- MJO indices as additional predictors (source of subseasonal predictability)
- Ensemble processing (51 ECMWF members)
- Probabilistic output (quintile probabilities) rather than deterministic forecasts
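Batch-wise scaler fitting, one of the memory measures listed above, can be illustrated with a running mean and variance that is updated one batch at a time. The class below is a hypothetical NumPy stand-in, not the team's code; scikit-learn's `StandardScaler.partial_fit` provides the same kind of incremental update.

```python
import numpy as np

class RunningScaler:
    """Accumulates per-feature mean and variance without holding all data in memory."""

    def __init__(self, n_features):
        self.n = 0
        self.mean = np.zeros(n_features)
        self.m2 = np.zeros(n_features)   # sum of squared deviations from the mean

    def partial_fit(self, batch):
        batch = np.asarray(batch, dtype=float)
        b_n = batch.shape[0]
        b_mean = batch.mean(axis=0)
        b_m2 = ((batch - b_mean) ** 2).sum(axis=0)
        delta = b_mean - self.mean
        total = self.n + b_n
        # Merge the batch statistics into the running statistics.
        self.mean = self.mean + delta * b_n / total
        self.m2 = self.m2 + b_m2 + delta ** 2 * self.n * b_n / total
        self.n = total
        return self

    @property
    def std(self):
        return np.sqrt(self.m2 / self.n)   # population standard deviation

    def transform(self, x):
        return (x - self.mean) / self.std

# Toy usage: fit in batches of 32, then compare against the full-data statistics.
rng = np.random.default_rng(0)
data = rng.normal(size=(100, 3))
scaler = RunningScaler(3)
for start in range(0, 100, 32):
    scaler.partial_fit(data[start:start + 32])
```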
Are there any limitations to your current model that you aim to address in future iterations?
**Current Limitations and Future Improvements**

**1. Limited Training Data Diversity**

*Limitation:* Model trained only on ERA5 reanalysis (1991-2022). No multi-source data fusion.

*Future:* Incorporate satellite observations, station data, and other reanalysis products for improved robustness.

**2. Single Variable Output**

*Limitation:* Currently predicts only 2-meter temperature; precipitation not included.

*Future:* Extend to multi-variable prediction (temperature and precipitation) for complete subseasonal forecasts.

**3. Quantile Spread Calibration**

*Limitation:* Model's predicted uncertainty spread doesn't perfectly match climatological variability in all regions, particularly at high latitudes.

*Future:* Implement region-specific calibration or direct training against competition's 20-year quintile boundaries.

**4. Static Ensemble Processing**

*Limitation:* All 51 ECMWF ensemble members processed independently and simply averaged.

*Future:* Learn ensemble weighting or use attention mechanisms to weight members based on their reliability.

**5. No Online Learning**

*Limitation:* Model is static after training; doesn't adapt to recent observations or forecast errors.

*Future:* Implement online fine-tuning or adaptive bias correction based on recent verification.

**6. Computational Cost**

*Limitation:* 1.3B parameter model requires significant GPU resources for both training and inference.

*Future:* Knowledge distillation to smaller models or more efficient architectures for operational deployment.

**7. Limited Teleconnection Exploitation**

*Limitation:* Only MJO indices used; other sources of subseasonal predictability not explicitly modeled.

*Future:* Add stratospheric conditions, sea ice extent, soil moisture, and other teleconnection indices (ENSO, NAO, etc.).
Are there any other AI/ML model components or innovations that you wish to highlight?
**Additional Innovations**

**1. Pretrained Climate Foundation Model**

Leveraged ClimaX - a foundation model pretrained on diverse climate data - rather than training from scratch. This provides:
- Better generalization from climate-specific learned representations
- Reduced training time and data requirements
- Transfer learning benefits for subseasonal forecasting

**2. Spherical Harmonic Embeddings**

Custom `SphericalVariableEmbedding` layer that respects Earth's spherical geometry:
- 14,641 spectral coefficients
- Proper handling of polar singularities
- Spectral loss in frequency domain for physically consistent predictions

**3. Anomaly-Based Learning**

Model predicts deviations from day-of-year climatology rather than absolute temperatures:
- Removes seasonal cycle (easier learning task)
- Focuses model capacity on anomalies (the signal that matters)
- Natural regularization toward climatology

**4. Quantile Regression for Probabilistic Forecasts**

Direct prediction of uncertainty through quantile outputs (Q20, Q40, Q50, Q60, Q80):
- Pinball loss for proper quantile estimation
- Learned aggregation across quantiles
- Monotonicity enforcement (crossing penalty)

**5. Multi-Scale Feature Integration**

Combines multiple information sources in a unified architecture:
- Dynamic atmospheric fields (temperature, pressure, precipitation)
- Static geographic features (land-sea mask, topography encodings)
- Teleconnection indices (MJO RMM1/RMM2)
- Temporal harmonics for seasonal awareness

**6. Hybrid Loss Function**

Combination of spatial and spectral losses:
- Pinball loss for accurate quantile prediction
- Spectral loss in spherical harmonics domain
- Latitude weighting for proper global averaging
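The anomaly-based learning idea (item 3 above) can be sketched as a forward and inverse transform around a day-of-year climatology. The function names are hypothetical, and defining the "seasonal cycle amplitude" as the standard deviation of the climatology over the year is an assumption; the answer does not say how the amplitude is computed.

```python
import numpy as np

def doy_climatology(values, doy, n_days=366):
    """Mean per day of year. values: (time, ...); doy: (time,) integers 1..366."""
    clim = np.full((n_days,) + values.shape[1:], np.nan)
    for d in range(1, n_days + 1):
        sel = doy == d
        if sel.any():
            clim[d - 1] = values[sel].mean(axis=0)
    return clim

def seasonal_amplitude(clim):
    # ASSUMPTION: amplitude = spread of the climatology across the year.
    return np.nanstd(clim, axis=0)

def to_anomaly(values, doy, clim, amplitude):
    """Remove the seasonal cycle and standardise by its amplitude."""
    return (values - clim[doy - 1]) / amplitude

def from_anomaly(anom, doy, clim, amplitude):
    """Inverse transform: add the climatology back."""
    return anom * amplitude + clim[doy - 1]

# Toy data: two 365-day years of a sinusoidal cycle, one year 1 K warm, one 1 K cold.
doy = np.tile(np.arange(1, 366), 2)
cycle = 10.0 * np.sin(2.0 * np.pi * (doy - 1) / 365.0)
values = (cycle + np.repeat([1.0, -1.0], 365))[:, None]   # shape (730, 1)

clim = doy_climatology(values, doy)
amp = seasonal_amplitude(clim)
anom = to_anomaly(values, doy, clim, amp)
restored = from_anomaly(anom, doy, clim, amp)
```

In the toy data the anomalies isolate the 1 K year-to-year offset, scaled by the size of the seasonal cycle, which mirrors the motivation stated above: the network no longer has to learn the seasonal cycle itself.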
Who contributed to the development of this model? Please list all individuals who contributed to this model, along with their specific roles (e.g., data preparation, model architecture, model validation, etc) to acknowledge individual contributions.
- Data preparation: Theano Mamouka
- Model architecture, training, feature engineering, evaluation: George Gousios
- Methodology, analysis, investigation: Paraskevi Vourlioti
- Supervision, research, team administration: Stelios Kotsopoulos

Which of the following descriptions best represent the overarching design of your forecasting model?
  • Machine learning-based weather prediction.
  • Hybrid model that integrates physical simulations with machine learning or statistical techniques.
What techniques did you use to initialise your model? (For example: data sources and processing of initial conditions)
Model Initialization Techniques

Data Sources
ECMWF Open Data:
- Real-time ensemble forecasts (51 members: 1 control + 50 perturbed)
- Downloaded from https://data.ecmwf.int/ecpds/home/opendata/
- GRIB2 format, converted to NetCDF

Initial Conditions Processing

Variable Extraction:
- 2-meter temperature (2t → tas)
- Mean sea level pressure (msl → mslp)
- Total precipitation (tp → pr)

Spatial Processing:
- Regridded to 1.5° regular lat-lon grid (240×121)
- Using CDO remapnn (nearest neighbor interpolation)
- Grid specification

Temporal Processing:
- Downloads 0-21 hour forecasts at 3-hour intervals
- Merges timesteps using CDO mergetime
- Creates 7-day input sequence for model

Data Normalization:
- Applies transform fitted during training (pickled scaler)
- Transform includes feature scaling for X and target scaling for y
- Inverse transform applied to predictions

Key Processing Steps
- NaN handling: Replaces NaNs with zeros
- Harmonics conversion: Optional spherical harmonics data conversion
- Ensemble processing: Each of 51 members processed independently, then averaged for final forecast
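The normalization and NaN-handling steps above might be wired together roughly as follows. This is a sketch under stated assumptions: the dictionary-based transform, the function names, and the choice to replace NaNs before scaling are all hypothetical, and an in-memory `pickle` round trip stands in for the pickled scaler file.

```python
import pickle
import numpy as np

def fit_transform(x_train, y_train):
    """Per-feature scaling for inputs X and targets y (stand-in for the pickled transform)."""
    return {
        "x_mean": x_train.mean(axis=0), "x_std": x_train.std(axis=0),
        "y_mean": y_train.mean(axis=0), "y_std": y_train.std(axis=0),
    }

def prepare_inputs(x, transform):
    # NaN handling: replace NaNs with zeros (ASSUMPTION: done before scaling).
    x = np.where(np.isnan(x), 0.0, x)
    return (x - transform["x_mean"]) / transform["x_std"]

def restore_outputs(y_scaled, transform):
    """Inverse transform applied to predictions."""
    return y_scaled * transform["y_std"] + transform["y_mean"]

# Toy data standing in for training samples.
rng = np.random.default_rng(0)
x_train = rng.normal(280.0, 10.0, size=(100, 3))
y_train = rng.normal(280.0, 10.0, size=(100, 2))

blob = pickle.dumps(fit_transform(x_train, y_train))   # saved at training time
transform = pickle.loads(blob)                         # loaded at inference time

x_live = x_train[:4].copy()
x_live[0, 1] = np.nan                                  # simulate a missing value
x_scaled = prepare_inputs(x_live, transform)
y_scaled = (y_train - transform["y_mean"]) / transform["y_std"]
y_restored = restore_outputs(y_scaled, transform)
```

Note that a zero inserted before scaling lands far from the feature mean for fields such as temperature in kelvin, so the order of the two steps matters; the answer above does not state which order was used.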
If any, what data does your model rely on for real-time forecasting purposes?
Real-Time Data Dependencies

Primary Data Source
ECMWF Open Data Portal - Live ensemble forecasts
- URL: https://data.ecmwf.int/ecpds/home/opendata/{date}/{hour}z/ifs/0p25/enfo/
- Updated operationally (typically 00z and 12z runs)
- Download format: GRIB2 files

Real-Time Variables Required:
- 2-meter temperature (2t)
- Mean sea level pressure (msl)
- Total precipitation (tp)

Ensemble Configuration:
51 ensemble members total:
- 1 control member (perturbation number = 0)
- 50 perturbed members (perturbation numbers 1-50)

Temporal Requirements:
- Forecast hours: 0, 3, 6, 9, 12, 15, 18, 21 (8 timesteps)
- Creates 7-day input sequence for the model
- Uses most recent available initialization date

Processing Pipeline:
    init_date='20251006'
    init_hour='00'
    year='2025'

No Other External Data
The model does not rely on:
- Real-time observations
- Satellite data
- Radar data
- Station measurements
What types of datasets were used for model training? (For example: observational datasets, reanalysis data, NWP outputs or satellite data)
ERA5 post-processed reanalysis datasets, prepared and provided by the competition's organisers.
Please provide an overview of your final ML/AI model architecture (For example: key design features, specific algorithms or frameworks used, and any pre- or post-processing steps)
Model Architecture Overview

Core Architecture: ClimaX with Spherical Harmonics
Model Type: Vision Transformer (ViT) adapted for climate data

Key Design Features

Spherical Harmonics Integration:
- Handles Earth's spherical geometry
- Spectral loss in frequency domain
- Grid type: Equiangular (default)
- Spatial dimensions: 121×240 (latitude×longitude)

Multi-Variable Processing:
- Input channels: 3 (temperature, pressure, precipitation)
- Output: Temperature forecasts (32 days)
- Patch-based processing (ViT approach)

Temporal Component:
- Sequence length: 7 days input
- Prediction horizon: 32 days output
- LSTM layers for temporal encoding

Model Hyperparameters Saved:
- var_embed_dim: 128+ (variable embedding dimension)
- depth: 4+ (transformer blocks)
- heads: Multi-head attention (≥4)
- mlp_dim: hidden_dim × 4
- dropout: 0.2-0.3
- spectral_loss_weight: 0.1

Framework Stack
- Deep Learning: PyTorch
- Distributed Training: Hugging Face Accelerate
- Data Processing: xarray, numpy

Optimization:
- Mixed precision training
- Gradient accumulation (4 steps)
- Early stopping (patience=8)

Pre-Processing Steps

Spatial Processing:
- Regrid to 1.5° using CDO remapnn
- Pad to power-of-2 dimensions for ViT

Normalization:
- Fitted on 3000 samples
- Separate feature (X) and target (y) transforms
- Saved as pickle: *_transform.pkl

Harmonics Conversion:
    converter = HarmonicsDataConverter(original_spatial_dims, (nlat, nlon))

Loss Functions

Dual Loss Approach:
- Spatial Loss: Standard MSE/MAE in physical space
- Spectral Loss: Spherical harmonics frequency domain

    harmonics_loss = SpectralLoss(
        nlat=nlat,
        nlon=nlon,
        grid='equiangular',
        spectral_weight=0.1
    )

Post-Processing Steps
- Inverse Transform: Denormalize predictions to physical units
- Reshape to Spatial Grid: Convert flattened output back to 121×240 grid
- NetCDF Output: CF-1.6 compliant; daily forecasts for 32 days; proper coordinate metadata
- Weekly Aggregation: Calculate weekly means from daily predictions
- Ensemble Processing: Average 51 ensemble member predictions; convert to quintile probabilities

    forecast_pbs_week1_MEAN = np.mean(all_forecast_pbs_week1, axis=0)

Evaluation Metrics
- MAE (Mean Absolute Error)
- RMSE (Root Mean Square Error)
- Custom ECMWF metrics
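The weekly aggregation and ensemble averaging described in the post-processing steps can be sketched as below. Several details are assumptions rather than facts from the answer: that index 0 of the daily axis is lead day 1, that the two weekly windows are lead days 19-25 and 26-32 (the competition's forecast windows), and that per-member probabilities are one-hot quintile categories, which would make the `np.mean(..., axis=0)` call above a count of member votes. The helper names are hypothetical.

```python
import numpy as np

def weekly_means(daily):
    """daily: (members, 32, nlat, nlon); index 0 of axis 1 is lead day 1 (assumed)."""
    week1 = daily[:, 18:25].mean(axis=1)   # lead days 19-25
    week2 = daily[:, 25:32].mean(axis=1)   # lead days 26-32
    return week1, week2

def member_quintile_onehot(week_mean, clim_edges):
    """week_mean: (members, nlat, nlon); clim_edges: (4, nlat, nlon).

    Returns (members, 5, nlat, nlon) with a 1 in the quintile each member falls in.
    """
    category = (week_mean[:, None] > clim_edges[None]).sum(axis=1)   # values 0..4
    return np.eye(5)[category].transpose(0, 3, 1, 2)

# Hypothetical toy data: 4 members on a 2x3 grid, value = lead day + member index.
days = np.arange(1, 33, dtype=float).reshape(1, 32, 1, 1)
offsets = np.arange(4, dtype=float).reshape(4, 1, 1, 1)
daily = np.broadcast_to(days + offsets, (4, 32, 2, 3))
clim_edges = np.broadcast_to(
    np.array([22.5, 23.5, 24.5, 25.5]).reshape(4, 1, 1), (4, 2, 3)
)

week1, week2 = weekly_means(daily)
all_forecast_pbs_week1 = member_quintile_onehot(week1, clim_edges)
forecast_pbs_week1_MEAN = np.mean(all_forecast_pbs_week1, axis=0)   # (5, nlat, nlon)
```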
Have you published or presented any work related to this forecasting model? If yes, could you share references or links?
No
Before submitting your forecasts to the AI Weather Quest, did you validate your model against observational or independent datasets? If so, how?
No
Did you face any challenges during model development, and how did you address them?
Memory Management Issues
Challenge: Large spatial grids (121×240) × multiple variables × long sequences caused OOM errors.
Solutions:
- Memory optimization flag: args.use_memory_optimizations = True
- Batch scaler fitting instead of full dataset
- Gradient checkpointing for large inputs
Additional optimizations:
- Drop incomplete batches to save memory
- Mixed precision training: use_mixed_precision=True
- Gradient accumulation: accumulation_steps=4

Spherical Geometry Handling
Challenge: Earth's curved surface poorly represented in standard CNN/grid approaches.
Solution: Spherical Harmonics Integration

Patch Size Compatibility for Vision Transformer
Challenge: Irregular spatial dimensions don't divide evenly into patches.
Solution: Padding to power-of-2
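The power-of-2 padding mentioned as the patch-size fix can be illustrated with `np.pad`. The helper names, the zero fill value, and padding only at the end of each axis are assumptions; the answer does not say how the padded cells are filled or where they are placed.

```python
import numpy as np

def next_pow2(n):
    """Smallest power of two that is >= n (for n >= 1)."""
    return 1 << (n - 1).bit_length()

def pad_to_pow2(field):
    """Zero-pad the last two axes (lat, lon) up to the next power of two."""
    nlat, nlon = field.shape[-2:]
    pad_lat = next_pow2(nlat) - nlat
    pad_lon = next_pow2(nlon) - nlon
    pad = [(0, 0)] * (field.ndim - 2) + [(0, pad_lat), (0, pad_lon)]
    return np.pad(field, pad, mode="constant"), (nlat, nlon)

def crop(field, shape):
    """Undo the padding after the model has produced its output."""
    return field[..., :shape[0], :shape[1]]

# Example: 3 channels on the 121×240 grid become 128×256.
rng = np.random.default_rng(0)
field = rng.normal(size=(3, 121, 240))
padded, original_shape = pad_to_pow2(field)
restored = crop(padded, original_shape)
```

With 128×256 inputs, any power-of-two patch size up to 128 divides both dimensions evenly, which is the compatibility problem the padding is meant to solve.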
Are there any limitations to your current model that you aim to address in future iterations?
Improve the model's ability to produce realistic temperature predictions at inference time. Currently the predictions show little spatial variability and little differentiation between ensemble members.
Are there any other AI/ML model components or innovations that you wish to highlight?
Spherical Harmonics Integration (Primary Innovation)

ClimaX-Harmonics Architecture
Innovation: Physics-informed deep learning that respects Earth's spherical geometry

Why it matters:
- Standard CNNs treat lat-lon grids as flat images (wrong!)
- Spherical harmonics preserve geometric properties at poles
- Spectral loss enforces physically-consistent spatial patterns
- Better captures global circulation patterns

Dual-Domain Loss Function
Innovation: Training in both spatial AND frequency domains simultaneously

    train_model_with_harmonics_loss(
        accelerator, model, device,
        train_loader, val_loader,
        harmonics_loss_fn=harmonics_loss,
        spectral_loss_weight=0.1,  # 10% spectral, 90% spatial
        ...
    )

Benefits:
- Spatial loss: Accurate point-wise predictions
- Spectral loss: Smooth, physically realistic large-scale patterns
- Prevents checkerboard artifacts and unrealistic small-scale noise

Distributed Training with Accelerate
Innovation: Seamless multi-GPU training without code changes
- Handles gradient synchronization automatically
- find_unused_parameters=True allows flexible model architectures
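The dual-domain idea can be illustrated with a small NumPy sketch that mixes a spatial and a spectral error term with the 0.1 weight quoted above. It is not the team's `SpectralLoss`: a plain 2-D FFT is used here as a stand-in for the spherical harmonic transform, MSE is assumed for both terms, and the function name `dual_domain_loss` is hypothetical.

```python
import numpy as np

def dual_domain_loss(pred, target, spectral_weight=0.1):
    """Weighted sum of a spatial MSE and a spectral-domain MSE.

    pred, target: (nlat, nlon) fields.
    """
    spatial = np.mean((pred - target) ** 2)
    # STAND-IN: orthonormal 2-D FFT instead of a spherical harmonic transform.
    spec_pred = np.fft.rfft2(pred, norm="ortho")
    spec_true = np.fft.rfft2(target, norm="ortho")
    spectral = np.mean(np.abs(spec_pred - spec_true) ** 2)
    # 10% spectral, 90% spatial when spectral_weight = 0.1.
    return (1.0 - spectral_weight) * spatial + spectral_weight * spectral

# Toy fields on the 121×240 grid: the prediction is biased by 0.1 everywhere.
rng = np.random.default_rng(0)
target = rng.normal(size=(121, 240))
pred = target + 0.1
loss = dual_domain_loss(pred, target)
```

A uniform bias such as the one in the toy example concentrates in the lowest wavenumber of the spectrum, so the spectral term reacts strongly to large-scale errors, which matches the stated aim of encouraging realistic large-scale patterns.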
Who contributed to the development of this model? Please list all individuals who contributed to this model, along with their specific roles (e.g., data preparation, model architecture, model validation, etc) to acknowledge individual contributions.
- Data preparation: Theano Mamouka
- Model architecture: George Gousios
- Methodology, analysis, investigation: Paraskevi Vourlioti
- Supervision, research, team administration: Stelios Kotsopoulos

Submitted forecast data in previous period(s)

Please note: Submitted forecast data is only publicly available once the evaluation of a full competitive period has been completed. See the competition's full detailed schedule with submitted data publication dates for each period here.

Access forecasts data

Participation

Competition Period

For the selected competition period, the table below shows the variables submitted each week by the respective team.

Table columns: Week, followed by three variables for each forecast window (first window: Days 19 to 25; second window: Days 26 to 32): near-surface (2m) temperature (tas), mean sea level pressure (mslp), and precipitation (pr).

This team did not submit any entries to the competition