Rust library for loading, structuring, and querying astronomical observation datasets — with trajectory grouping, multi-observer support, and efficient lookups.

- Serialisation / deserialisation (`serde` feature): persist an [`ObsDataset`] to JSON (or any other `serde`-compatible format) and restore it without losing observations or custom observers. Runtime-only state (MPC network cache) is automatically re-initialised on deserialisation.
- Polars ingestion (`polars` feature): load observations from a `DataFrame` or `LazyFrame` with full schema validation.
- Parallel iteration (`parallel` feature): iterate over observations, nights, and trajectories in parallel via rayon, with zero data copying.
- ADES ingestion (`ades` feature): load observations directly from MPC ADES XML files, with automatic MPC observer resolution.
- MPC 80-column ingestion (`mpc_80_col` feature): load observations from the classic MPC fixed-width 80-column ASCII format.
- Parquet ingestion via DataFusion (`datafusion` feature): load observations from any Parquet file reachable by URI (`file://`, `http://`, `https://`, `hdfs://`) using Apache Arrow / DataFusion.
- Multi-observer support: MPC observatory codes (resolved lazily from the MPC website), custom geodetic sites (interned and deduplicated), or unknown observer.
- Trajectory grouping: group observations by a `traj_id` column; supports both integer (`UInt32`) and string (`String`) identifiers.
- Three astrometric error models: FCCT14, CBM10, and VFCC17, used to assign measurement accuracies to MPC-coded observatories.

Add `photom` to your `Cargo.toml`. Without any optional features:

```toml
[dependencies]
photom = "0.1"
```

Enable individual features as needed:

```toml
[dependencies]
photom = { version = "0.1", features = ["polars", "parallel", "ades", "mpc_80_col", "datafusion", "serde"] }
```

All features are independent and can be combined freely.

`ObsDataset` implements the standard `serde::Serialize` / `serde::Deserialize`
traits and works with any `serde`-compatible format (JSON, MessagePack, …).

```rust
use photom::observation_dataset::ObsDataset;

// Serialise: format-agnostic (use any serde serializer).
let json = serde_json::to_string(&dataset)?;
std::fs::write("dataset.json", &json)?;

// Deserialise with the default index layout (Split, always safe).
let json = std::fs::read_to_string("dataset.json")?;
let restored: ObsDataset = serde_json::from_str(&json)?;

// Binary format (rmp-serde / MessagePack).
let bytes: Vec<u8> = rmp_serde::to_vec(&dataset)?;
let restored: ObsDataset = rmp_serde::from_slice(&bytes)?;
```

For potentially faster look-ups you can request a contiguous index layout via
[`ObsDatasetSeed`] (a [`serde::de::DeserializeSeed`] implementation).
Any format that exposes its `Deserializer` struct publicly works; both
`serde_json` and `rmp-serde` do:

```rust
use photom::{IndexLayout, ObsDatasetSeed};
use serde::de::DeserializeSeed as _;

// JSON
let mut de = serde_json::Deserializer::from_str(&json);
let restored = ObsDatasetSeed { layout: IndexLayout::TryContiguous }
    .deserialize(&mut de)?;

// MessagePack (rmp-serde, compact binary)
let mut de = rmp_serde::Deserializer::new(bytes.as_slice());
let restored = ObsDatasetSeed { layout: IndexLayout::TryContiguous }
    .deserialize(&mut de)?;
```

`TryContiguous` falls back to `Split` automatically for any index group whose
observations are not stored contiguously.

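
As an illustration of that fallback, here is a minimal std-only sketch. It is not the crate's implementation: `GroupIndex` and `build_index` are hypothetical names, and the sketch assumes a group counts as contiguous when its sorted observation positions form one unbroken run.

```rust
use std::ops::Range;

/// Hypothetical index for one group (a night or a trajectory).
#[derive(Debug, PartialEq)]
enum GroupIndex {
    /// All members sit next to each other: a single range is enough.
    Contiguous(Range<usize>),
    /// Members are scattered: every position is kept explicitly.
    Split(Vec<usize>),
}

/// Try the contiguous layout first and fall back to `Split` otherwise.
/// `positions` is expected to be sorted in ascending order.
fn build_index(positions: &[usize]) -> GroupIndex {
    // An unbroken run means every position is its predecessor plus one.
    let is_run = positions.windows(2).all(|w| w[1] == w[0] + 1);
    match (positions.first(), positions.last()) {
        (Some(&first), Some(&last)) if is_run => GroupIndex::Contiguous(first..last + 1),
        _ => GroupIndex::Split(positions.to_vec()),
    }
}
```

Under this rule, positions `[3, 4, 5]` collapse to the single range `3..6`, while `[3, 5, 6]` keeps its explicit position list.
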
What is persisted
| State | Persisted? | Notes |
|---|---|---|
| Observations | Yes | Full list in insertion order |
| Custom geodetic observers | Yes | All sites and their coordinates |
| Astrometric error model | Yes | FCCT14, CBM10, VFCC17, or None |
| MPC network cache | No | Fetched lazily on first use |
| Trajectory aliases | Yes | Fully round-tripped |
| Night / trajectory indices | Yes | Membership stored per-observation; rebuilt on load |

```rust
use photom::observation_dataset::ObsDataset;
use photom::io::polars::FromPolarsArgs;

let dataset = ObsDataset::from_polars(&df, FromPolarsArgs::default())?;
for obs in dataset.iter_observations() {
    println!("{:?}", obs);
}
```

```rust
use photom::observation_dataset::ObsDataset;
use photom::io::polars::FromPolarsArgs;

let dataset = ObsDataset::from_lazy(df.lazy(), FromPolarsArgs::default())?;
```

```rust
use photom::observation_dataset::ObsDataset;
use photom::io::datafusion::LoadObsArgs;

let dataset = ObsDataset::from_parquet_uri(
    "file:///data/observations.parquet",
    LoadObsArgs::default(),
)?;
println!("{} observations loaded", dataset.observation_count());
```

```rust
use photom::observation_dataset::ObsDataset;

// error_ra and error_dec are optional fallback uncertainties in arcseconds.
let dataset = ObsDataset::from_ades("observations.xml", Some(0.5), Some(0.5))?;
```

```rust
use photom::observation_dataset::ObsDataset;

let dataset = ObsDataset::from_mpc_80_col("observations.txt")?;
```

```rust
use photom::observation_dataset::ObsDataset;
use rayon::iter::ParallelIterator;

let count = dataset.par_iter_observations().count();
if let Some(par_iter) = dataset.par_iter_full_night() {
    par_iter.for_each(|(night_id, obs)| {
        println!("night {:?}: obs id {}", night_id, obs.id());
    });
}
```

`EquCoord` bundles a sky position (RA, Dec) with its 1-σ uncertainties.
All values are stored internally in radians; use `from_degrees` to supply
degrees.

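
To make the radians convention concrete, the following std-only sketch converts degrees at the boundary and computes a great-circle separation in radians with the Vincenty formula. It is independent of the crate: `separation_rad` and `separation_from_degrees` are hypothetical helpers, not photom's API.

```rust
/// Great-circle separation between two (RA, Dec) positions, all in radians.
/// The Vincenty form stays accurate for very small and near-antipodal separations.
fn separation_rad(ra1: f64, dec1: f64, ra2: f64, dec2: f64) -> f64 {
    let (s1, c1) = dec1.sin_cos();
    let (s2, c2) = dec2.sin_cos();
    let (sd, cd) = (ra2 - ra1).sin_cos();
    // Numerator: norm of the cross product; denominator: dot product.
    let num = (c2 * sd).hypot(c1 * s2 - s1 * c2 * cd);
    let den = s1 * s2 + c1 * c2 * cd;
    num.atan2(den)
}

/// Same computation with inputs in degrees; the result is still in radians.
fn separation_from_degrees(ra1: f64, dec1: f64, ra2: f64, dec2: f64) -> f64 {
    separation_rad(
        ra1.to_radians(),
        dec1.to_radians(),
        ra2.to_radians(),
        dec2.to_radians(),
    )
}
```

Two points on the celestial equator that are 90° apart in RA come out as π/2 radians, and the two celestial poles as π radians.
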

```rust
use photom::coordinates::equatorial::EquCoord;
use photom::coordinates::cartesian::CartesianCoord;

// Construct from degrees (converted to radians internally).
let a = EquCoord::from_degrees(10.0, 0.001, 20.0, 0.001);
let b = EquCoord::from_degrees(10.5, 0.001, 20.5, 0.001);

// Great-circle separation via the Vincenty formula (result in radians).
let sep = a.angular_separation(&b);

// Vector-averaging midpoint on the sphere.
let mid = a.spherical_midpoint(&b);

// Projection onto the unit sphere: the position is preserved, uncertainties are discarded.
let cart = CartesianCoord::from(a);

// Recover equatorial angles (errors set to zero).
let back: EquCoord = cart.into();

// Propagate astrometric covariance through the spherical → Cartesian mapping.
// Returns CartesianCoordCov with the full 3×3 covariance matrix.
let cov = a.to_cartesian_cov();

// Inverse: propagate back to equatorial marginal 1-σ errors.
let recovered = cov.to_equatorial();
```

`EclipticCoord` stores an ecliptic sky position; `EclipticCoordCov` additionally
carries the full 2×2 covariance.

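
For readers who want the geometry behind the position-only conversion, here is a std-only sketch of the equatorial ↔ ecliptic rotation. It is not the crate's code: the function names are hypothetical, and the obliquity is assumed to be the J2000 mean value of 84381.406 arcsec (IAU 2006), which may differ from the constant photom uses.

```rust
use std::f64::consts::TAU;

/// Assumed J2000 mean obliquity of the ecliptic, in arcseconds (IAU 2006).
const OBLIQUITY_ARCSEC: f64 = 84381.406;

fn obliquity_rad() -> f64 {
    (OBLIQUITY_ARCSEC / 3600.0).to_radians()
}

/// Rotate a (longitude, latitude) direction about the x-axis by `angle` (radians).
fn rotate_about_x(lon: f64, lat: f64, angle: f64) -> (f64, f64) {
    let (sin_a, cos_a) = angle.sin_cos();
    // Unit vector of the input direction.
    let (x, y, z) = (lat.cos() * lon.cos(), lat.cos() * lon.sin(), lat.sin());
    let y_r = y * cos_a + z * sin_a;
    let z_r = -y * sin_a + z * cos_a;
    // Longitude wrapped to [0, 2π); latitude clamped against rounding noise.
    (y_r.atan2(x).rem_euclid(TAU), z_r.clamp(-1.0, 1.0).asin())
}

/// (RA, Dec) → (ecliptic longitude, ecliptic latitude), radians.
fn equ_to_ecl(ra: f64, dec: f64) -> (f64, f64) {
    rotate_about_x(ra, dec, obliquity_rad())
}

/// (ecliptic longitude, ecliptic latitude) → (RA, Dec), radians.
fn ecl_to_equ(lon: f64, lat: f64) -> (f64, f64) {
    rotate_about_x(lon, lat, -obliquity_rad())
}
```

The sketch handles positions only; propagating a covariance additionally requires the Jacobian of this mapping, which is not shown here.
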

```rust
use photom::coordinates::equatorial::{EquCoord, EquCoordCov};
use photom::coordinates::ecliptic::{EclipticCoord, EclipticCoordCov};

// Position-only: uncertainties are discarded.
let eq = EquCoord::from_degrees(83.82, 0.0, 22.01, 0.0); // near Crab Nebula
let ecl = EclipticCoord::from(eq);
let (lon_deg, lat_deg) = ecl.to_degrees();

// Inverse conversion back to equatorial.
let back: EquCoord = ecl.into();

// Full covariance propagation (1 arcsec errors).
let eq_with_err = EquCoord::from_degrees(83.82, 1.0 / 3600.0, 22.01, 1.0 / 3600.0);
let eq_cov = EquCoordCov::from_equ(eq_with_err);
let ecl_cov = EclipticCoordCov::from(eq_cov);

// Marginal 1-σ errors and full 2×2 covariance in the ecliptic frame.
let sigma_lon = ecl_cov.coord.lon_error;
let sigma_lat = ecl_cov.coord.lat_error;
let cov_lon_lat = ecl_cov.cov.xy; // off-diagonal covariance term

// Round-trip back to equatorial covariance.
let eq_cov2 = EquCoordCov::from(ecl_cov);
```

`Cov2` is a compact symmetric 2×2 covariance matrix for astrometric error
ellipses expressed in a local tangent-plane frame.

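
The quantities such a matrix provides have simple closed forms. The std-only sketch below shows them on a hypothetical `SymCov2` stand-in; it illustrates the math only and makes no claim about how the crate's `Cov2` is implemented.

```rust
/// Symmetric 2×2 covariance [[xx, xy], [xy, yy]] (hypothetical stand-in type).
#[derive(Debug, Clone, Copy)]
struct SymCov2 {
    xx: f64,
    xy: f64,
    yy: f64,
}

impl SymCov2 {
    /// Eigenvalues as (largest, smallest), from the closed-form 2×2 solution.
    fn eigenvalues(&self) -> (f64, f64) {
        let mean = 0.5 * (self.xx + self.yy);
        let radius = (0.5 * (self.xx - self.yy)).hypot(self.xy);
        (mean + radius, mean - radius)
    }

    /// Semi-axes of the 1-σ ellipse: square roots of the eigenvalues.
    fn semi_axes(&self) -> (f64, f64) {
        let (l_max, l_min) = self.eigenvalues();
        (l_max.max(0.0).sqrt(), l_min.max(0.0).sqrt())
    }

    /// Squared Mahalanobis distance dᵀ C⁻¹ d, or `None` if C is not invertible.
    fn mahalanobis_sq(&self, d: [f64; 2]) -> Option<f64> {
        let det = self.xx * self.yy - self.xy * self.xy;
        if det <= 0.0 || !det.is_finite() {
            return None;
        }
        let [dx, dy] = d;
        // C⁻¹ = 1/det · [[yy, -xy], [-xy, xx]]
        Some((self.yy * dx * dx - 2.0 * self.xy * dx * dy + self.xx * dy * dy) / det)
    }
}
```

For a diagonal covariance with variances 4 and 1, the semi-axes are 2 and 1, and an offset of 2 along the first axis lies at a squared Mahalanobis distance of 1.
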

```rust
use photom::coordinates::cov2::Cov2;
use photom::coordinates::equatorial::EquCoord;

// Build a diagonal covariance from the marginal errors of an EquCoord.
let coord = EquCoord::from_degrees(45.0, 0.001, 20.0, 0.002);
let cov = Cov2::from_equ(&coord);

// Semi-axes of the 1-σ confidence ellipse.
let sigma_major = cov.lambda_max().max(0.0).sqrt();
let sigma_minor = cov.lambda_min().max(0.0).sqrt();

// Mahalanobis distance for an offset vector (radians).
let offset = [1e-4_f64, 0.0_f64];
if let Some(d2) = cov.mahalanobis_sq(offset) {
    let _ = d2.sqrt(); // normalised distance
}

// Add isotropic process noise q·I (Kalman-style inflation).
let inflated = cov.inflate_isotropic(1e-8);
```

`TangentPlane` projects sky positions near a chosen tangent point onto a local
2-D Cartesian frame. Great circles project to straight lines, making this ideal
for short-arc astrometry and kinematic linking.

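
The projection described here is the standard gnomonic one. As a reference for the underlying formulas, here is a std-only sketch with hypothetical free functions (`gnomonic_forward`, `gnomonic_inverse`); it assumes all angles are in radians and is not taken from the crate's source.

```rust
/// Project (ra, dec) onto the plane tangent to the sphere at (ra0, dec0).
/// Returns standard coordinates (xi, eta), or `None` when the point lies
/// 90° or more from the tangent point, where the projection is undefined.
fn gnomonic_forward(ra0: f64, dec0: f64, ra: f64, dec: f64) -> Option<(f64, f64)> {
    let (sd0, cd0) = dec0.sin_cos();
    let (sd, cd) = dec.sin_cos();
    let (sda, cda) = (ra - ra0).sin_cos();
    // Cosine of the angular distance to the tangent point.
    let cos_c = sd0 * sd + cd0 * cd * cda;
    if cos_c <= 0.0 {
        return None;
    }
    let xi = cd * sda / cos_c;
    let eta = (cd0 * sd - sd0 * cd * cda) / cos_c;
    Some((xi, eta))
}

/// Map standard coordinates (xi, eta) back to (ra, dec).
fn gnomonic_inverse(ra0: f64, dec0: f64, xi: f64, eta: f64) -> (f64, f64) {
    let (sd0, cd0) = dec0.sin_cos();
    let denom = cd0 - eta * sd0;
    let ra = ra0 + xi.atan2(denom);
    let dec = (sd0 + eta * cd0).atan2(xi.hypot(denom));
    (ra, dec)
}
```

The tangent point itself maps to the origin, and a point 45° east of a tangent point on the equator maps to `xi = tan(45°) = 1`, `eta = 0`.
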

```rust
use photom::coordinates::equatorial::EquCoord;
use photom::coordinates::gnomonic_projection::{TangentPlane, TangentVec};

// Define the tangent point (degrees, converted internally to radians).
let ref_coord = EquCoord::from_degrees(45.0, 0.0, 20.0, 0.0);
let plane = TangentPlane::new(ref_coord);

// Forward projection: sky → tangent plane.
let target = EquCoord::from_degrees(45.5, 0.0, 20.5, 0.0);
let tp = plane.project(&target);

// Inverse projection: tangent plane → sky.
let sky = tp.unproject();

// Squared Euclidean distance between two projected points (radians²).
let other = plane.project(&EquCoord::from_degrees(45.1, 0.0, 20.1, 0.0));
let d2 = tp.dist2(&other);

// Translate a projected point by a displacement vector.
let v = TangentVec { dx: 1e-3, dy: -1e-3 };
let shifted = tp + v;
```

All column values for `ra`, `ra_err`, `dec`, `dec_err`, `obs_lon`, `obs_lat`, `obs_ra_acc`, and `obs_dec_acc` must be supplied in radians. No unit conversion is performed during ingestion.


| Column | Polars type | Arrow type | Unit | Description |
|---|---|---|---|---|
| `id` | `UInt64` | `UInt64` | - | Unique observation identifier |
| `ra` | `Float64` | `Float64` | rad | Right ascension |
| `ra_err` | `Float64` | `Float64` | rad | 1-σ right ascension uncertainty |
| `dec` | `Float64` | `Float64` | rad | Declination |
| `dec_err` | `Float64` | `Float64` | rad | 1-σ declination uncertainty |
| `magnitude` | `Float64` | `Float64` | mag | Apparent magnitude |
| `mag_err` | `Float64` | `Float64` | mag | 1-σ magnitude uncertainty |
| `filter` | `String` | `Utf8` / `UInt8` / `UInt16` / `UInt32` | - | Photometric filter label or code |
| `mjd_tt` | `Float64` | `Float64` | MJD (TT) | Epoch (Modified Julian Date, Terrestrial Time) |

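
Because ingestion performs no unit conversion, catalogues that store angles in degrees and uncertainties in arcseconds need converting before they are loaded. A minimal std-only sketch of that preprocessing step follows; `RawRow` and `to_schema_units` are hypothetical names for illustration and are not part of the crate.

```rust
/// Arcseconds to radians (1 arcsec = 1/3600 degree).
fn arcsec_to_rad(arcsec: f64) -> f64 {
    (arcsec / 3600.0).to_radians()
}

/// Hypothetical raw catalogue row: angles in degrees, errors in arcseconds.
struct RawRow {
    ra_deg: f64,
    dec_deg: f64,
    ra_err_arcsec: f64,
    dec_err_arcsec: f64,
}

/// Values for the `ra`, `ra_err`, `dec`, `dec_err` columns, all in radians.
fn to_schema_units(row: &RawRow) -> [f64; 4] {
    [
        row.ra_deg.to_radians(),
        arcsec_to_rad(row.ra_err_arcsec),
        row.dec_deg.to_radians(),
        arcsec_to_rad(row.dec_err_arcsec),
    ]
}
```

As a sanity check, 1 arcsecond is about 4.848e-6 rad, so a typical sub-arcsecond uncertainty should appear in the `ra_err` and `dec_err` columns as a value of order 1e-6 or smaller.
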

| Column | Polars type | Arrow type | Unit | Description |
|---|---|---|---|---|
| `obs_lon` | `Float64` | `Float64` | rad | Geodetic longitude, east of Greenwich |
| `obs_lat` | `Float64` | `Float64` | rad | Geodetic latitude |
| `obs_alt` | `Float64` | `Float64` | m | Altitude above the reference ellipsoid |
| `obs_ra_acc` | `Float64` | `Float64` | rad | 1-σ RA measurement accuracy (required when the geodetic triplet is set) |
| `obs_dec_acc` | `Float64` | `Float64` | rad | 1-σ Dec measurement accuracy (required when the geodetic triplet is set) |
| `mpc_code_obs` | `String` | `Utf8` | - | Three-byte ASCII MPC code (takes precedence over geodetic columns) |

| Column | Polars type | Arrow type | Description |
|---|---|---|---|
| `traj_id` | `UInt32` or `String` | `UInt32` or `Utf8` | Trajectory identifier; nullable (null rows are loaded but not assigned to any trajectory) |
| `night_id` | `UInt32` | `UInt32` | Night identifier; nullable (null rows are included but not assigned to any night) |

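
The nullable grouping semantics can be pictured with a small std-only sketch: rows with an identifier are bucketed by it, and null rows are kept but left ungrouped. `group_by_trajectory` is a hypothetical helper written for this illustration; it is generic over the key so that both integer and string identifiers work, and it does not reflect the crate's internal index structures.

```rust
use std::collections::BTreeMap;

/// Group observation ids by trajectory id. `None` models a null `traj_id`:
/// such rows are retained in the second return value instead of being dropped.
fn group_by_trajectory<K: Ord + Clone>(
    rows: &[(u64, Option<K>)],
) -> (BTreeMap<K, Vec<u64>>, Vec<u64>) {
    let mut groups: BTreeMap<K, Vec<u64>> = BTreeMap::new();
    let mut unassigned = Vec::new();
    for (obs_id, traj_id) in rows {
        match traj_id {
            Some(key) => groups.entry(key.clone()).or_default().push(*obs_id),
            None => unassigned.push(*obs_id),
        }
    }
    (groups, unassigned)
}
```

The same function accepts `Option<u32>` keys and `Option<String>` keys, mirroring the two identifier types the `traj_id` column supports.
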
Each row's observer is resolved in the following order of precedence:

1. `mpc_code_obs` non-null → `ObserverId::MpcCode` (MPC site, resolved lazily from the MPC website).
2. `obs_lon`, `obs_lat`, and `obs_alt` all non-null → `ObserverId::IntId` (custom geodetic site). `obs_ra_acc` and `obs_dec_acc` must also be non-null.
3. Otherwise → no observer (`None`).

A partially-null geodetic triplet (one or two of the three columns non-null) is always an ingestion error. A complete triplet without accuracy values is also an error.

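
The precedence rules above can be sketched as a small decision function. Everything in this block is hypothetical (`Observer`, `ResolveError`, `ObserverCols`, `resolve`) and only mirrors the rules as stated. It reads "always" literally, so a partial triplet is rejected even when an MPC code is present; the crate may order these checks differently.

```rust
/// Hypothetical stand-in for the resolved observer of one row.
#[derive(Debug, PartialEq)]
enum Observer {
    Mpc(String),
    Geodetic { lon: f64, lat: f64, alt: f64, ra_acc: f64, dec_acc: f64 },
}

#[derive(Debug, PartialEq)]
enum ResolveError {
    PartialTriplet,
    MissingAccuracy,
}

/// Observer-related cells of one row; `None` models a null value.
#[derive(Default)]
struct ObserverCols<'a> {
    mpc_code: Option<&'a str>,
    lon: Option<f64>,
    lat: Option<f64>,
    alt: Option<f64>,
    ra_acc: Option<f64>,
    dec_acc: Option<f64>,
}

fn resolve(c: &ObserverCols<'_>) -> Result<Option<Observer>, ResolveError> {
    let triplet = (c.lon, c.lat, c.alt);
    // One or two of the three geodetic columns set: always an error.
    if !matches!(triplet, (Some(_), Some(_), Some(_)) | (None, None, None)) {
        return Err(ResolveError::PartialTriplet);
    }
    // 1. An MPC code takes precedence over the geodetic columns.
    if let Some(code) = c.mpc_code {
        return Ok(Some(Observer::Mpc(code.to_string())));
    }
    match (triplet, c.ra_acc, c.dec_acc) {
        // 2. Complete triplet with both accuracies: custom geodetic site.
        ((Some(lon), Some(lat), Some(alt)), Some(ra_acc), Some(dec_acc)) => {
            Ok(Some(Observer::Geodetic { lon, lat, alt, ra_acc, dec_acc }))
        }
        // Complete triplet but an accuracy is missing: error.
        ((Some(_), Some(_), Some(_)), _, _) => Err(ResolveError::MissingAccuracy),
        // 3. Nothing set: no observer.
        _ => Ok(None),
    }
}
```

Returning `Result<Option<_>, _>` keeps the two outcomes apart: `Ok(None)` is a valid row without an observer, while `Err(_)` is a malformed row.
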

| Field | Type | Default | Description |
|---|---|---|---|
| `error_model` | `Option<ObsErrorModel>` | `None` | Astrometric error model for MPC-coded observatories |
| `do_rechunk` | `Option<bool>` | `Some(false)` | Force single-chunk layout before ingestion |
| `contiguous_choice` | `Option<ContiguousChoice>` | `Some(ContiguousNight)` | Sort by night or trajectory for compact index ranges |

| Field | Type | Default | Description |
|---|---|---|---|
| `error_model` | `Option<ObsErrorModel>` | `None` | Astrometric error model for MPC-coded observatories |
| `contiguous_choice` | `Option<ContiguousChoice>` | `Some(ContiguousNight)` | Sort by night or trajectory for compact index ranges |


| Alias | Underlying type | Unit |
|---|---|---|
| `Arcseconds` | `f64` | Angle in arcseconds |
| `Radians` | `f64` | Angle in radians |
| `Degrees` | `f64` | Angle in degrees |
| `MJDTT` | `f64` | Modified Julian Date (Terrestrial Time) |
| `Meters` | `f64` | Distance in metres |


| Error type | Feature | Description |
|---|---|---|
| `PolarsError` | `polars` | Schema validation, type mismatch, null in required column, partial geodetic triplet, missing accuracy, invalid MPC code |
| `LoadObsError` | `datafusion` | URI resolution failure, resource not found, DataFusion I/O error, Arrow column error |
| `AdesError` | `ades` | XML parse error, missing mandatory field, unresolvable observatory |
| `Mpc80ColError` | `mpc_80_col` | Parse error in the fixed-width 80-column format |
| `ObserverError` | - | Invalid float value, MPC code not found or malformed |

To compile the documentation locally, run the following command in the terminal:

```sh
RUSTDOCFLAGS="--html-in-header $(pwd)/katex-header.html" cargo doc --no-deps --all-features
```

The DataFusion tests require the `large-test-fixtures` feature to run. The large Parquet fixtures have been excluded from the crates.io package and are gated behind this feature.

To run the full test suite including DataFusion:

```sh
cargo test --features "datafusion,large-test-fixtures"
```

All other tests are gated behind their associated features and do not require this additional flag.

photom requires Rust 1.94.0 or later.
This project is licensed under the CeCILL-C Free Software License Agreement.