Neuralio
Members
First name (team leader)
Stelios
Last name
Kotsopoulos
Organisation name
Neuralio AI
Organisation type
Small & Medium Enterprise or Startup
Organisation location
Greece
First name
Paraskevi
Last name
Vourlioti
Organisation name
Neuralio AI
Organisation type
Small & Medium Enterprise or Startup
Organisation location
Greece
First name
Theano
Last name
Mamouka
Organisation name
Neuralio AI
Organisation type
Small & Medium Enterprise or Startup
Organisation location
Greece
First name
George
Last name
Gousios
Organisation name
Neuralio AI
Organisation type
Small & Medium Enterprise or Startup
Organisation location
Greece
Model
Model name
NeuWeather
Number of individuals supporting model development:
1-5
Maximum number of Central Processing Units (CPUs) supporting model development or forecast production:
< 8
Maximum number of Graphics Processing Units (GPUs) supporting model development or forecast production:
< 4
How would you best classify the IT system used for model development or forecast production:
Single node system
Model summary questionnaire for model NeuWeather
Please note that the list below shows all questionnaires submitted for this model.
They are displayed from the most recent to the earliest, covering each 13-week competition period in which the team competed with this model.
Which of the following descriptions best represent the overarching design of your forecasting model?
- Machine learning-based weather prediction.
- Hybrid model that integrates physical simulations with machine learning or statistical techniques.
What techniques did you use to initialise your model? (For example: data sources and processing of initial conditions)
Model Initialization Techniques
Data Sources
ECMWF Open Data: Real-time ensemble forecasts (51 members: 1 control + 50 perturbed)
Downloaded from https://data.ecmwf.int/ecpds/home/opendata/
GRIB2 format, converted to NetCDF
Initial Conditions Processing
Variable Extraction:
2-meter temperature (2t → tas)
Mean sea level pressure (msl → mslp)
Total precipitation (tp → pr)
Spatial Processing:
Regridded to 1.5° regular lat-lon grid (240×121)
Using CDO remapnn (nearest neighbor interpolation)
Grid specification
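As a rough illustration of the spatial step above: 1.5° is an exact multiple of 0.25°, so nearest-neighbour remapping between two aligned regular grids reduces to keeping every sixth grid point. The sketch below is a NumPy stand-in for that idea, not the team's CDO call. It assumes a 721×1440 source grid whose first point coincides with the first point of the target grid, and the function name is hypothetical.

```python
import numpy as np

def subsample_to_1p5deg(field_0p25: np.ndarray) -> np.ndarray:
    """Nearest-neighbour remap from a 0.25 deg grid to a 1.5 deg grid.

    Assumes the last two axes are (lat, lon) with shape (721, 1440) and
    that both grids share their first point, so every target point lies
    exactly on a source point and no interpolation is needed.
    """
    if field_0p25.shape[-2:] != (721, 1440):
        raise ValueError("expected a (..., 721, 1440) field")
    step = 6  # 1.5 deg / 0.25 deg
    return field_0p25[..., ::step, ::step]

# Synthetic field whose value equals its latitude, for a quick check.
lat = np.linspace(90.0, -90.0, 721)
lon = np.arange(0.0, 360.0, 0.25)
field = lat[:, None] + 0.0 * lon[None, :]
coarse = subsample_to_1p5deg(field)  # shape (121, 240)
```

If the source grid were offset from the target grid, a true nearest-neighbour search (as CDO performs) would be needed instead of plain striding.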
Temporal Processing:
Downloads 0-21 hour forecasts at 3-hour intervals
Merges timesteps using CDO mergetime
Creates 7-day input sequence for model
Data Normalization:
Applies transform fitted during training (pickled scaler)
Transform includes feature scaling for X and target scaling for y
Inverse transform applied to predictions
Key Processing Steps
NaN handling: Replaces NaNs with zeros
Harmonics conversion: Optional spherical harmonics data conversion
Ensemble processing: Each of 51 members processed independently, then averaged for final forecast
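The NaN handling and ensemble averaging steps can be sketched as follows. This is a minimal illustration under assumptions: `predict_fn` is a hypothetical placeholder for the trained model's inference call, and each member is assumed to arrive as a NumPy array.

```python
import numpy as np

def ensemble_forecast(members, predict_fn):
    """Run each ensemble member through the model and average the results.

    members    : iterable of arrays, one per ensemble member
    predict_fn : hypothetical callable standing in for model inference
    """
    predictions = []
    for member in members:
        cleaned = np.nan_to_num(member, nan=0.0)  # replace NaNs with zeros
        predictions.append(predict_fn(cleaned))   # members handled independently
    return np.mean(predictions, axis=0)           # ensemble mean

# Toy example with two "members" and an identity model.
toy_members = np.array([[1.0, np.nan], [3.0, 4.0]])
toy_mean = ensemble_forecast(toy_members, predict_fn=lambda x: x)
```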
If any, what data does your model rely on for real-time forecasting purposes?
Real-Time Data Dependencies
Primary Data Source
ECMWF Open Data Portal - Live ensemble forecasts
URL: https://data.ecmwf.int/ecpds/home/opendata/{date}/{hour}z/ifs/0p25/enfo/
Updated operationally (typically 00z and 12z runs)
Download format: GRIB2 files
Real-Time Variables Required:
2-meter temperature (2t)
Mean sea level pressure (msl)
Total precipitation (tp)
Ensemble Configuration:
51 ensemble members total:
1 control member (perturbation number = 0)
50 perturbed members (perturbation numbers 1-50)
Temporal Requirements:
Forecast hours: 0, 3, 6, 9, 12, 15, 18, 21 (8 timesteps)
Creates 7-day input sequence for the model
Uses most recent available initialization date
Processing Pipeline:
init_date='20251006'
init_hour='00'
year='2025'
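A small sketch of how the directory URL and forecast steps listed above could be assembled from `init_date` and `init_hour`. The helper name and the date validation are assumptions added for illustration; file names inside the directory are not specified in the answers and are not guessed here.

```python
from datetime import datetime

BASE_URL = "https://data.ecmwf.int/ecpds/home/opendata"

def enfo_directory_url(init_date: str, init_hour: str) -> str:
    """Build the ensemble-forecast directory URL for one initialisation.

    init_date is 'YYYYMMDD' and init_hour is 'HH'; strptime raises
    ValueError if the pair is not a valid date and hour.
    """
    datetime.strptime(init_date + init_hour, "%Y%m%d%H")
    return f"{BASE_URL}/{init_date}/{init_hour}z/ifs/0p25/enfo/"

# Forecast hours 0, 3, ..., 21 (8 timesteps), as listed above.
FORECAST_HOURS = list(range(0, 22, 3))

url = enfo_directory_url("20251006", "00")
```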
No Other External Data
The model does not rely on:
Real-time observations
Satellite data
Radar data
Station measurements
What types of datasets were used for model training? (For example: observational datasets, reanalysis data, NWP outputs or satellite data)
ERA5 post-processed reanalysis datasets, prepared and provided by the competition's organisers.
Please provide an overview of your final ML/AI model architecture (For example: key design features, specific algorithms or frameworks used, and any pre- or post-processing steps)
Model Architecture Overview
Core Architecture: ClimaX with Spherical Harmonics
Model Type: Vision Transformer (ViT) adapted for climate data
Key Design Features
Spherical Harmonics Integration:
Handles Earth's spherical geometry
Spectral loss in frequency domain
Grid type: Equiangular (default)
Spatial dimensions: 121×240 (latitude×longitude)
Multi-Variable Processing:
Input channels: 3 (temperature, pressure, precipitation)
Output: Temperature forecasts (32 days)
Patch-based processing (ViT approach)
Temporal Component:
Sequence length: 7 days input
Prediction horizon: 32 days output
LSTM layers for temporal encoding
Model Hyperparameters
Saved:
var_embed_dim: 128+ (variable embedding dimension)
depth: 4+ (transformer blocks)
heads: Multi-head attention (≥4)
mlp_dim: hidden_dim × 4
dropout: 0.2-0.3
spectral_loss_weight: 0.1
Framework Stack
Deep Learning: PyTorch
Distributed Training: Hugging Face Accelerate
Data Processing: xarray, numpy
Optimization:
Mixed precision training
Gradient accumulation (4 steps)
Early stopping (patience=8)
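Early stopping with a patience of 8 might look like the sketch below. It assumes the monitored quantity is a validation loss where lower is better; the class name and the `min_delta` parameter are hypothetical and not taken from the team's code.

```python
class EarlyStopping:
    """Stop training once the monitored loss has not improved for `patience` epochs."""

    def __init__(self, patience: int = 8, min_delta: float = 0.0):
        self.patience = patience
        self.min_delta = min_delta  # hypothetical: minimum change counted as improvement
        self.best = float("inf")
        self.bad_epochs = 0

    def step(self, val_loss: float) -> bool:
        """Record one epoch's loss and return True when training should stop."""
        if val_loss < self.best - self.min_delta:
            self.best = val_loss
            self.bad_epochs = 0
        else:
            self.bad_epochs += 1
        return self.bad_epochs >= self.patience

# Toy loss curve: improves twice, then stalls.
stopper = EarlyStopping(patience=8)
losses = [1.0, 0.9] + [0.95] * 10
stopped_at = next(i for i, loss in enumerate(losses) if stopper.step(loss))
```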
Pre-Processing Steps
Spatial Processing:
Regrid to 1.5° using CDO remapnn
Pad to power-of-2 dimensions for ViT
Normalization:
Fitted on 3000 samples
Separate feature (X) and target (y) transforms
Saved as pickle: *_transform.pkl
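The normalization step can be illustrated with a short sketch. The answers do not say which scaler is used, so z-score scaling is assumed here, the function names are hypothetical, and random data stands in for the 3000 fitting samples.

```python
import pickle
import numpy as np

def fit_params(samples: np.ndarray) -> dict:
    """Fit per-feature mean and standard deviation (assumed z-score scaling)."""
    std = samples.std(axis=0)
    std = np.where(std == 0.0, 1.0, std)  # avoid division by zero
    return {"mean": samples.mean(axis=0), "std": std}

def apply_transform(params: dict, data: np.ndarray) -> np.ndarray:
    return (data - params["mean"]) / params["std"]

def invert_transform(params: dict, data: np.ndarray) -> np.ndarray:
    return data * params["std"] + params["mean"]

rng = np.random.default_rng(0)
X = rng.normal(280.0, 10.0, size=(3000, 3))  # stand-in features
y = rng.normal(285.0, 8.0, size=(3000, 1))   # stand-in targets

# Separate transforms for features and targets, stored together.
blob = pickle.dumps({"X": fit_params(X), "y": fit_params(y)})
restored = pickle.loads(blob)

y_norm = apply_transform(restored["y"], y)
y_back = invert_transform(restored["y"], y_norm)  # back to physical units
```

Keeping the target transform alongside the feature transform is what allows predictions to be denormalized at inference time.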
Harmonics Conversion:
converter = HarmonicsDataConverter(original_spatial_dims, (nlat, nlon))
Loss Functions
Dual Loss Approach:
Spatial Loss: Standard MSE/MAE in physical space
Spectral Loss: Spherical harmonics frequency domain
harmonics_loss = SpectralLoss(
nlat=nlat,
nlon=nlon,
grid='equiangular',
spectral_weight=0.1
)
Post-Processing Steps
Inverse Transform:
Denormalize predictions to physical units
Reshape to Spatial Grid:
Convert flattened output back to 121×240 grid
NetCDF Output:
CF-1.6 compliant
Daily forecasts for 32 days
Proper coordinate metadata
Weekly Aggregation:
Calculate weekly means from daily predictions
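A minimal sketch of the weekly aggregation, assuming daily predictions are stacked along the first axis and a week is a block of seven consecutive lead days. Which lead days define each forecast week is an assumption here, and the function name is hypothetical.

```python
import numpy as np

def weekly_mean(daily: np.ndarray, first_day: int, n_days: int = 7) -> np.ndarray:
    """Average `n_days` consecutive daily fields starting at 1-based lead day `first_day`.

    daily has shape (n_lead_days, lat, lon).
    """
    start = first_day - 1
    stop = start + n_days
    if start < 0 or stop > daily.shape[0]:
        raise ValueError("requested window falls outside the forecast range")
    return daily[start:stop].mean(axis=0)

# Toy data: the field on lead day d (1-based) is filled with the value d - 1.
daily = np.arange(32, dtype=float)[:, None, None] * np.ones((1, 121, 240))
week = weekly_mean(daily, first_day=26)  # lead days 26 to 32
```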
Ensemble Processing:
Average 51 ensemble member predictions
Convert to quintile probabilities
forecast_pbs_week1_MEAN = np.mean(all_forecast_pbs_week1, axis=0)
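One way the averaging in the line above could yield quintile probabilities is sketched here. It assumes each member's weekly mean is assigned to one of five categories using four climatological quintile edges per grid point, and that the per-member one-hot categories are then averaged over members. The edges, shapes, and function name are illustrative assumptions.

```python
import numpy as np

def quintile_probabilities(member_values: np.ndarray,
                           quintile_edges: np.ndarray) -> np.ndarray:
    """Turn ensemble member values into quintile probabilities.

    member_values  : (n_members, lat, lon)
    quintile_edges : (4, lat, lon), ascending climatological thresholds
    returns        : (5, lat, lon), summing to 1 along the first axis
    """
    # Category 0..4 = number of edges that each member value exceeds.
    category = (member_values[:, None] > quintile_edges[None]).sum(axis=1)
    # One-hot encode per member, then average over members.
    one_hot = category[:, None] == np.arange(5)[None, :, None, None]
    return one_hot.mean(axis=0)

# Toy example on a 1x1 grid with four members.
edges = np.array([1.0, 2.0, 3.0, 4.0]).reshape(4, 1, 1)
members = np.array([0.5, 1.5, 1.5, 4.5]).reshape(4, 1, 1)
probs = quintile_probabilities(members, edges)
```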
Evaluation Metrics
MAE (Mean Absolute Error)
RMSE (Root Mean Square Error)
Custom ECMWF metrics
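For reference, MAE and RMSE can be computed as below. This is an unweighted sketch; whether the team applies latitude weighting is not stated, and the custom ECMWF metrics are not reproduced here.

```python
import numpy as np

def mae(pred: np.ndarray, target: np.ndarray) -> float:
    """Mean absolute error over all elements."""
    return float(np.mean(np.abs(pred - target)))

def rmse(pred: np.ndarray, target: np.ndarray) -> float:
    """Root mean square error over all elements."""
    return float(np.sqrt(np.mean((pred - target) ** 2)))

pred = np.array([1.0, 2.0, 4.0])
target = np.array([1.0, 4.0, 0.0])
mae_value = mae(pred, target)
rmse_value = rmse(pred, target)
```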
Have you published or presented any work related to this forecasting model? If yes, could you share references or links?
No
Before submitting your forecasts to the AI Weather Quest, did you validate your model against observational or independent datasets? If so, how?
No
Did you face any challenges during model development, and how did you address them?
Memory Management Issues
Challenge
Large spatial grids (121×240) × multiple variables × long sequences caused out-of-memory (OOM) errors
Solutions:
# Memory optimization flag
args.use_memory_optimizations = True
Batch scaler fitting instead of full dataset
Gradient checkpointing for large inputs
Additional optimizations:
# Drop incomplete batches to save memory
Mixed precision training:
use_mixed_precision=True
Gradient accumulation:
accumulation_steps=4
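The batch scaler fitting listed among the solutions above can be illustrated with running statistics that are updated one batch at a time, so the full dataset never has to sit in memory. This sketch uses the standard pairwise mean/variance update; the class name is hypothetical, and the team's scaler may work differently.

```python
import numpy as np

class RunningStats:
    """Accumulate per-feature mean and variance over successive batches."""

    def __init__(self):
        self.count = 0
        self.mean = 0.0
        self.m2 = 0.0  # running sum of squared deviations from the mean

    def update(self, batch: np.ndarray) -> None:
        batch = np.asarray(batch, dtype=np.float64)
        n_b = batch.shape[0]
        mean_b = batch.mean(axis=0)
        m2_b = ((batch - mean_b) ** 2).sum(axis=0)
        delta = mean_b - self.mean
        total = self.count + n_b
        # Merge the batch statistics into the running statistics.
        self.mean = self.mean + delta * n_b / total
        self.m2 = self.m2 + m2_b + delta ** 2 * self.count * n_b / total
        self.count = total

    @property
    def std(self) -> np.ndarray:
        return np.sqrt(self.m2 / self.count)  # population standard deviation

rng = np.random.default_rng(0)
data = rng.normal(size=(3000, 3))
stats = RunningStats()
for start in range(0, data.shape[0], 256):
    stats.update(data[start:start + 256])
```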
Spherical Geometry Handling
Challenge
Earth's curved surface poorly represented in standard CNN/grid approaches
Solution: Spherical Harmonics Integration
Patch Size Compatibility for Vision Transformer
Challenge
Irregular spatial dimensions don't divide evenly into patches
Solution: Padding to power-of-2
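A sketch of the padding step, assuming zero padding is added at the end of the latitude and longitude axes and cropped off again after inference. The padding mode and side are assumptions, and the function names are hypothetical.

```python
import numpy as np

def next_power_of_two(n: int) -> int:
    """Smallest power of two that is >= n (for n >= 1)."""
    return 1 << (n - 1).bit_length()

def pad_to_power_of_two(field: np.ndarray):
    """Zero-pad the last two axes (lat, lon) up to powers of two."""
    nlat, nlon = field.shape[-2:]
    pad_lat = next_power_of_two(nlat) - nlat
    pad_lon = next_power_of_two(nlon) - nlon
    pad_width = [(0, 0)] * (field.ndim - 2) + [(0, pad_lat), (0, pad_lon)]
    padded = np.pad(field, pad_width, mode="constant", constant_values=0.0)
    return padded, (nlat, nlon)

def crop_to_original(padded: np.ndarray, original_shape) -> np.ndarray:
    """Remove the padding added by pad_to_power_of_two."""
    nlat, nlon = original_shape
    return padded[..., :nlat, :nlon]

x = np.random.default_rng(0).normal(size=(3, 121, 240))  # 3 channels
x_padded, shape = pad_to_power_of_two(x)                 # (3, 128, 256)
x_restored = crop_to_original(x_padded, shape)
```

With 128×256 inputs, any power-of-two patch size up to 128 divides both dimensions evenly.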
Are there any limitations to your current model that you aim to address in future iterations?
Improve the model's ability to produce realistic temperature predictions at inference time. At present the forecasts show little spatial variability and little differentiation between ensemble members.
Are there any other AI/ML model components or innovations that you wish to highlight?
Spherical Harmonics Integration (Primary Innovation)
ClimaX-Harmonics Architecture
Innovation: Physics-informed deep learning that respects Earth's spherical geometry
Key Components:
Why it matters:
Standard CNNs treat lat-lon grids as flat images, ignoring the convergence of meridians toward the poles
Spherical harmonics preserve geometric properties at poles
Spectral loss enforces physically-consistent spatial patterns
Better captures global circulation patterns
Dual-Domain Loss Function
Innovation: Training in both spatial AND frequency domains simultaneously
train_model_with_harmonics_loss(
accelerator,
model,
device,
train_loader,
val_loader,
harmonics_loss_fn=harmonics_loss,
spectral_loss_weight=0.1, # 10% spectral, 90% spatial
...
)
Benefits:
Spatial loss: Accurate point-wise predictions
Spectral loss: Smooth, physically realistic large-scale patterns
Prevents checkerboard artifacts and unrealistic small-scale noise
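The dual-domain idea can be sketched in NumPy as below. Note the substitution: a real FFT along longitude is used as a simplified stand-in for a spherical harmonic transform, so this is not the team's `SpectralLoss`. The 90/10 weighting follows the comment in the call above, and the function name is hypothetical.

```python
import numpy as np

def dual_domain_loss(pred: np.ndarray, target: np.ndarray,
                     spectral_weight: float = 0.1) -> float:
    """Weighted sum of a grid-point MSE and a spectral-coefficient MSE.

    pred and target have shape (..., lat, lon). The spectral term
    compares Fourier coefficients along longitude (stand-in for
    spherical harmonic coefficients).
    """
    spatial = np.mean((pred - target) ** 2)
    pred_hat = np.fft.rfft(pred, axis=-1, norm="ortho")
    target_hat = np.fft.rfft(target, axis=-1, norm="ortho")
    spectral = np.mean(np.abs(pred_hat - target_hat) ** 2)
    return float((1.0 - spectral_weight) * spatial + spectral_weight * spectral)

rng = np.random.default_rng(0)
a = rng.normal(size=(121, 240))
b = rng.normal(size=(121, 240))
loss_same = dual_domain_loss(a, a)
loss_diff = dual_domain_loss(a, b)
loss_spatial_only = dual_domain_loss(a, b, spectral_weight=0.0)
```

In a training loop the same combination would be computed on tensors with a differentiable transform so that gradients flow through both terms.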
Distributed Training with Accelerate
Innovation:
Seamless multi-GPU training without code changes
Handles gradient synchronization automatically
find_unused_parameters=True allows flexible model architectures
Who contributed to the development of this model? Please list all individuals who contributed to this model, along with their specific roles (e.g., data preparation, model architecture, model validation, etc) to acknowledge individual contributions.
Data preparation: Theano Mamouka
Model architecture: George Gousios
Methodology, analysis, investigation: Paraskevi Vourlioti
Effective supervision, researching, team administration: Stelios Kotsopoulos
Submitted forecast data in previous period(s)
Please note: Submitted forecast data is only publicly available once the evaluation of a full competitive period has been completed. See the competition's full detailed schedule with submitted data publication dates for each period here.