Skip to content

[2178] Add data reader for offgrid inference#2426

Open
vhertel wants to merge 4 commits into
ecmwf:developfrom
vhertel:vhertel/develop/2178-offgrid-evaluation
Open

[2178] Add data reader for offgrid inference#2426
vhertel wants to merge 4 commits into
ecmwf:developfrom
vhertel:vhertel/develop/2178-offgrid-evaluation

Conversation

@vhertel

@vhertel vhertel commented May 29, 2026

Copy link
Copy Markdown

Description

Summary

Adds support for offgrid inference, enabling the model to produce forecasts at arbitrary lat/lon target locations rather than being constrained to the original data grid. Introduces a dedicated offgrid data reader, wires it into the multi-stream data sampler, and provides configs for running and evaluating offgrid forecasting.

New config keys

Adds a test_config.offgrid_inference block to the config:

test_config:
  offgrid_inference:
    # absolute path to .npy file with shape (N, 2) [lat, lon] used for offgrid inference
    coords: /e/scratch/weatherai/shared_work/offgrid-test/offgrid_regular.npy
    # absolute path to .npy file with shape (N, G) used for offgrid inference; use null if no geoinfos are available
    geoinfos: /e/scratch/weatherai/shared_work/offgrid-test/offgrid_regular_geoinfos.npy
    # temporal spacing between offgrid samples, e.g. 6h
    frequency: 6h

How to run

# Train
uv run --offline train --base-config config/config_offgrid_forecasting.yml

# Inference (regular grid)
uv run --offline inference --from-run-id kzyr6530 --options test_config.output.num_samples=8

# Inference (SYNOP grid)
uv run --offline inference --from-run-id kzyr6530 --options test_config.output.num_samples=8 test_config.offgrid_inference.coords=/e/scratch/weatherai/shared_work/offgrid-test/offgrid_synop.npy test_config.offgrid_inference.geoinfos=/e/scratch/weatherai/shared_work/offgrid-test/offgrid_synop_geoinfos.npy

# Evaluate
uv run --offline evaluate --config config/evaluate/eval_config_offgrid.yml

Implementation notes

  • In MultiStreamDataSampler, a DataReaderOffgrid is instantiated per file and stream and inherits metadata (channels and normalization) from the original data reader. This lets channels and normalization parameters be loaded from the dataset regardless of type (anemoi, obs, fesom), but the two steps can be merged if preferred.
  • Geoinfos are not yet supported.

Testing

Tested with a uniform grid and with random coordinates for SYNOP. Both grid files are temporarily stored at:

  • /e/scratch/weatherai/shared_work/offgrid-test/offgrid_regular.npy
  • /e/scratch/weatherai/shared_work/offgrid-test/offgrid_regular_geoinfos.npy
  • /e/scratch/weatherai/shared_work/offgrid-test/offgrid_synop.npy
  • /e/scratch/weatherai/shared_work/offgrid-test/offgrid_synop_geoinfos.npy
map_toqb9dv6_preds_0_2023-10-01T0600_ERA5_global_10v_fstep_001 map_thb1rpuj_preds_0_2023-10-01T0600_ERA5_global_10v_fstep_001

Issue Number

Addresses #2178

Checklist before asking for review

  • I have performed a self-review of my code
  • My changes comply with basic sanity checks:
    • I have fixed formatting issues with ./scripts/actions.sh lint
    • I have run unit tests with ./scripts/actions.sh unit-test
    • I have documented my code and I have updated the docstrings.
    • I have added unit tests, if relevant
  • I have tried my changes with data and code:
    • I have run the integration tests with ./scripts/actions.sh integration-test
    • (bigger changes) I have run a full training and I have written in the comment the run_id(s): launch-slurm.py --time 60
    • (bigger changes and experiments) I have shared a hegdedoc in the github issue with all the configurations and runs for this experiments
  • I have informed and aligned with people impacted by my change:
    • for config changes: the MatterMost channels and/or a design doc
    • for changes of dependencies: the MatterMost software development channel

@vhertel vhertel marked this pull request as draft May 29, 2026 17:55

@grassesi grassesi left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In general I like the Decorator pattern you are employing here, since it keeps the new behavior mostly contained in one place with very few changes to the initialization logic l. I would like to this fact sharpened/highlighted more. I have just some questions (maybe I am missing context here) and a few nitpicks

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You are (silently) changing the behavior for all people using the config_forecasting.yml. If this was by mistake please revert. If this is intended, please make it a separate PR and communicate in the developer channel ahead of time.

Copy link
Copy Markdown
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good point! The config matches the current version in the develop branch, so it should merge without changes. Shall I revert the file to the previous version?

Comment on lines +119 to +133
else:
# empty source channels (needed from base class)
self.source_idx: NDArray[np.int64] = np.array([], dtype=np.int64)
self.source_channels: list[str] = []

# empty target channels (needed from base class)
self.target_idx: NDArray[np.int64] = np.array([], dtype=np.int64)
self.target_channels: list[str] = []

# empty target channel weights
self.target_channel_weights: list[float] = []

# neutral normalization
self.mean = np.zeros(0, dtype=np.float32)
self.stdev = np.ones(0, dtype=np.float32)

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I dont see the point of this branch. Looking at how this class is instantiated, it always wraps another DataReader. Do you plan on modifying the instantiation logic so this might happen? I would just make the ref_reader obligatory. This would also focus this design.

Copy link
Copy Markdown
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for pointing this out; I made the ref_reader mandatory and removed the corresponding logic.

Comment on lines 143 to 150

return np.arange(perms_len)
# Setup for offgrid inference and evaluation
offgrid_eval = mode_cfg.get("offgrid_eval", {})
# Path to .npy file containing offgrid coordinates
self.offgrid_template = offgrid_eval.get("grid", None)
# Frequency defined by config (offgrid_eval.frequency) with fallback to time_window_step
self.offgrid_frequency = offgrid_eval.get("frequency", mode_cfg.time_window_step)

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Are you planning on using self.offgrid_template and self.offgrid_frequency somewhere else in this class or by any downstream code? If not please remove since this pollutes the namespace of this class.

Copy link
Copy Markdown
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These attributes are only used here in the MultiStreamDataSampler, so I changed them to plain variables.

)
# load offgrid dataset if specified
if self.offgrid_template is not None:
filename = pathlib.Path(str(self.offgrid_template))

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nitpick: I dont think the str typecast is needed here.

Copy link
Copy Markdown
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Removed it :)

Comment thread config/config_offgrid_forecasting.yml Outdated
Comment on lines +227 to +233
offgrid_eval:
# absolute path to .npy file with shape (N, 2) [lat, lon] used for offgrid inference
grid: /e/scratch/weatherai/shared_work/offgrid-test/offgrid_regular.npy
# temporal spacing between offgrid samples, e.g. 6h
frequency: 6h
# TODO add support for geoinfos
geoinfos: null

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If I understand correctly this intends to trick the decoder into generating data on a different grid? Since it is output related, does it maybe make sense to put it under <mode>_config.output? At least I would rename offgid_eval to something like offgrid_inference to be consistent with the existing terminology used in the code.

Copy link
Copy Markdown
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@clessig what is your opinion on the design choice here?

@vhertel vhertel marked this pull request as ready for review June 2, 2026 13:26
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

Status: No status

Development

Successfully merging this pull request may close these issues.

2 participants