feat: Support excluding recent builds using timestamp (--exclude-newer) #4228
jezdez wants to merge 5 commits into
Convert the exclude_newer duration/timestamp from conda's config into a Unix timestamp cutoff and pass it to libmambapy's Database constructor as exclude_newer_timestamp. This enables native --exclude-newer filtering in the libmamba solver backend.
Depends on:
- conda/conda#15761 (exclude_newer config + parse_duration_to_seconds)
- mamba-org/mamba#4228 (exclude_newer_timestamp in Database::Settings)
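A minimal sketch of the conversion described in this commit message, assuming exclude_newer is either a compact duration such as "7d" or an ISO 8601 timestamp. The helper name exclude_newer_to_cutoff, the unit table, and the UTC default for naive timestamps are all hypothetical; conda's actual parse_duration_to_seconds may accept different forms.

```python
import re
from datetime import datetime, timezone

# Hypothetical unit table; not taken from conda's parser.
_UNIT_SECONDS = {"s": 1, "m": 60, "h": 3600, "d": 86400, "w": 604800}
_DURATION = re.compile(r"(\d+)([smhdw])")


def exclude_newer_to_cutoff(value, now):
    """Return a Unix cutoff in seconds, or None when no value is configured.

    `value` is a duration relative to `now` (e.g. "7d") or an ISO 8601
    timestamp (e.g. "2024-01-01T00:00:00+00:00").
    """
    if not value:
        return None
    text = value.strip()
    match = _DURATION.fullmatch(text)
    if match:
        amount, unit = match.groups()
        return int(now) - int(amount) * _UNIT_SECONDS[unit]
    parsed = datetime.fromisoformat(text)
    if parsed.tzinfo is None:
        # Assumption: naive timestamps are interpreted as UTC.
        parsed = parsed.replace(tzinfo=timezone.utc)
    return int(parsed.timestamp())
```

The resulting integer is what would be handed to the Database constructor as exclude_newer_timestamp; returning None keeps the "no cutoff" case distinct from a cutoff of 0.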
Codecov Report
❌ Patch coverage is
Additional details and impacted files
@@ Coverage Diff @@
## main #4228 +/- ##
==========================================
+ Coverage 55.08% 55.10% +0.02%
==========================================
Files 240 241 +1
Lines 30329 30346 +17
Branches 3241 3246 +5
==========================================
+ Hits 16707 16723 +16
- Misses 13619 13620 +1
Partials 3 3
|
exclude_newer_timestamp to Database::Settings
The kwarg doesn't exist in released libmambapy yet (pending mamba-org/mamba#4228), so passing None causes a TypeError. Only include the kwarg when there's an actual timestamp value.
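The fix described in this commit message can be sketched as follows. OldDatabase and NewDatabase are hypothetical stand-ins for a libmambapy release without and with the new keyword; they are not the real constructors, and make_database is an illustrative name.

```python
class OldDatabase:
    """Stand-in for a release that does not know the new keyword."""

    def __init__(self, channel_params):
        self.channel_params = channel_params
        self.exclude_newer_timestamp = None


class NewDatabase(OldDatabase):
    """Stand-in for a release that accepts exclude_newer_timestamp."""

    def __init__(self, channel_params, exclude_newer_timestamp=None):
        super().__init__(channel_params)
        self.exclude_newer_timestamp = exclude_newer_timestamp


def make_database(database_cls, channel_params, cutoff=None):
    # Build the keyword arguments lazily: passing the keyword at all,
    # even with None, raises TypeError on the older constructor.
    kwargs = {}
    if cutoff is not None:
        kwargs["exclude_newer_timestamp"] = cutoff
    return database_cls(channel_params, **kwargs)
```

With this shape, users who never set a cutoff keep working against older releases, and only users who opt in need a libmambapy that supports the keyword.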
|
Thank you for this contribution, @jezdez. 🙂 Please let us know when this can be reviewed. |
| } | ||
| if constexpr (std::is_same_v<Action, Solution::Reinstall> | ||
| || std::is_same_v<Action, Solution::Omit>) | ||
| if constexpr (std::is_same_v<Action, Solution::Reinstall> || std::is_same_v<Action, Solution::Omit>) |
It seems like a spurious change which needs to be reverted.
Fixed in fbf2a8b by restoring the original line wrapping, so this formatting-only change no longer appears in the PR diff.
| on_parsed(filename); | ||
| if (exclude_newer_timestamp && pkg_timestamp > *exclude_newer_timestamp) | ||
| { | ||
| repo.remove_solvable(id, /* reuse_id= */ true); |
Side note: removing the solvable here seems the most appropriate place to me. With sharded repodata, we might even be able to filter builds by timestamp at an earlier stage (see #4214), but I am not sure the benefit would be significant: it would add maintenance and complexity for what is probably a small reduction in overhead.
Agreed. I kept the native libmamba implementation at the solvable insertion/removal point for this PR. The conda-side scoped policy uses Python repodata filtering before libmamba; earlier shard-level filtering can be revisited separately once the shard APIs settle.
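As an illustration of the Python-side scoped filtering mentioned in this reply, here is a minimal sketch assuming records are plain dicts with a name and a timestamp in seconds. The function name filter_records and the shape of package_cutoffs are hypothetical; they are not the conda-libmamba-solver API.

```python
def filter_records(records, global_cutoff=None, package_cutoffs=None):
    """Drop records newer than their applicable cutoff.

    A per-package cutoff, when present, overrides the global one.
    Assumption: records without a timestamp are treated as timestamp 0
    and are therefore always kept.
    """
    package_cutoffs = package_cutoffs or {}
    kept = []
    for record in records:
        cutoff = package_cutoffs.get(record["name"], global_cutoff)
        if cutoff is not None and record.get("timestamp", 0) > cutoff:
            continue
        kept.append(record)
    return kept
```

When package_cutoffs is empty, this reduces to a single global cutoff, which is the case the native libmamba filter in this PR handles on its own.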
|
It could be interesting to have a variant of this offering some kind of "as-of" for reproducibility without a lock. |
|
Hello @jezdez, Would you authorize us to finish this PR? 🙂 |
|
Thanks @jjerphan, I’d appreciate the handover if you’re still willing to help finish this PR. One thing worth keeping explicit while doing that: this PR currently adds only the low-level/global libmamba API. For conda, the downstream solver integrations use libmamba’s native global cutoff only for the simple global-policy case. When conda has channel or package overrides, conda-libmamba-solver applies the conda policy in Python-side repodata filtering before handing records to libmamba. So I would not try to squeeze scoped policy into this PR unless standalone mamba/micromamba wants its own matching UX and public API; that likely deserves a separate design/API change. I also pushed fbf2a8b to address the review comments and align the native filter with conda’s timestamp semantics by preferring |
|
Thank you for explaining this. To me it makes sense to respect the policies that have been adopted by conda and to implement what is necessary to support them (or at least first the global |
603645e to 1f95c97
|
Everything should have been implemented in the last commits (it makes sense to me to have everything in a |
| exclude_newer.get_cli_config<std::string>(), | ||
| exclude_newer.description() | ||
| ); | ||
|
|
Shouldn't we add an option for exclude_newer_package as well?
exclude_newer_package is, as far as I understand this issue and from the current implementation proposal for conda, only part of the configuration.
We could introduce it on the CLI, but I do not really know how one could specify per-package policies.
I wonder whether we should use elements of std::chrono now since we support C++20.
We need to check if we can parse all the formats, but if so, that would make the code much simpler.
6a96376 to de43e12
| solver::Request::Flags solver_flags = {}; | ||
| std::string exclude_newer; | ||
| std::map<std::string, std::string> exclude_newer_package; | ||
| ExcludeNewerPolicy exclude_newer_policy; |
Not sure if we really want to include the whole ExcludeNewerPolicy in the context. Seems ok for now (not too heavy), but maybe others in the team would mind.
I guess this is the best tradeoff to keep up with the ecosystem.
|
I'm wondering if we should add a section in the documentation explaining the usage. |
b930219 to 4223123
4223123 to fb568b6
|
I would wait for it to have at least one other review and preferably two. |
d635b0a to 4aca7a7
|
More generally, using |
When set, packages with a timestamp newer than the cutoff are excluded during repodata loading. This covers both ingestion paths:
- JSON (mamba_read_json -> set_repo_solvables_impl): after parsing a solvable, check its timestamp and remove it if newer than the cutoff
- PackageInfo (add_repo_from_packages_impl_loop): skip the package before adding a solvable
The .solv cache path is not filtered; callers (e.g. conda-libmamba-solver) are expected to invalidate the cache when exclude_newer changes.
Also moves MAX_CONDA_TIMESTAMP to helpers.hpp so both helpers.cpp and database.cpp can use it without duplication.
Python bindings expose the new parameter as an optional exclude_newer_timestamp keyword argument on Database.__init__().
This enables conda's --exclude-newer feature to work with the libmamba solver backend. See conda/conda#15759 for full tracking.
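Since the .solv cache path is not filtered, a caller has to make sure a cache written under one cutoff is not reused under another. One way to do that, sketched here with a hypothetical solv_cache_key helper, is to fold the cutoff into the cache key. This is an assumption about how a caller could handle it, not how conda-libmamba-solver does.

```python
import hashlib


def solv_cache_key(repodata_url, exclude_newer_timestamp=None):
    """Derive a cache key that changes whenever the cutoff changes.

    Hypothetical helper: a .solv file stored under this key is only
    reused for the same URL and the same cutoff (including "no cutoff").
    """
    payload = f"{repodata_url}|exclude_newer={exclude_newer_timestamp}"
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()[:16]
```

A relative cutoff such as "7 days ago" moves on every run, so a caller using this scheme would likely want to round the cutoff (for example to the hour) before keying on it, otherwise the cache would never be hit.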
- C++ tests (test_database.cpp): verify add_repo_from_packages filters packages by timestamp, normalizes millisecond timestamps, and filters packages loaded from repodata JSON
- Python tests (test_solver_libsolv.py): verify the libmambapy binding accepts exclude_newer_timestamp, filters packages from both the packages API and repodata JSON, and that None keeps all packages
The repodata JSON filter was reading solv.timestamp() to check against the cutoff, but libsolv attributes require internalize() before they can be read back. Since internalize() runs after all packages are loaded, the timestamp was always 0 at filter time. Fix by passing the parsed timestamp out of set_solvable via an out parameter and using that directly in the filter comparison. Also fix the millisecond normalization test to use a timestamp that actually triggers the normalization path (> MAX_CONDA_TIMESTAMP), and add subdir/depends fields to the Python repodata test fixture to match real repodata structure.
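The millisecond normalization mentioned above can be sketched in Python as follows. The constant value is assumed to match MAX_CONDA_TIMESTAMP in helpers.hpp (the last second of year 9999, in seconds), and the function mirrors normalize_conda_timestamp in name only; the C++ implementation may differ in detail.

```python
# Assumed value: 9999-12-31T23:59:59 UTC expressed in seconds.
MAX_CONDA_TIMESTAMP = 253402300799


def normalize_conda_timestamp(timestamp):
    """Return the timestamp in seconds.

    Values above MAX_CONDA_TIMESTAMP cannot be plausible second counts,
    so they are assumed to be milliseconds and divided by 1000.
    """
    if timestamp > MAX_CONDA_TIMESTAMP:
        return timestamp // 1000
    return timestamp
```

This also shows why the test fixture had to change: a value at or below the threshold passes through untouched, so only a timestamp above MAX_CONDA_TIMESTAMP exercises the division.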
Apply clang-format to ternary expression in database.cpp and a pre-existing line-break issue in helpers.cpp.
a41f30a to 96d2ca7
Description
Adds an optional exclude_newer_timestamp field to Database::Settings, allowing callers to exclude packages with a timestamp newer than the given Unix epoch cutoff during repodata loading.
This enables conda's --exclude-newer feature to work with the libmamba solver backend, matching what rattler already supports natively. See conda/conda#15759 for the full tracking and conda/ceps#154 for the CEP proposing indexed_timestamp in repodata.
This PR adds only the low-level global libmamba API: a single cutoff timestamp applied when loading a repo. It intentionally does not implement conda's richer channel- or package-scoped exclude_newer policy; conda-libmamba-solver applies scoped policy in Python-side repodata filtering when needed and uses this native cutoff for the simple global-policy case.
Fix #4254.
How it works
The cutoff is applied at load time in both ingestion paths:
- JSON (mamba_read_json -> set_repo_solvables_impl): after a solvable is parsed, its policy timestamp is compared to the cutoff. If newer, the solvable is removed via repo.remove_solvable(id, true) -- the same pattern used for parse failures. For repodata JSON, indexed_timestamp is preferred over timestamp when present (matching conda semantics).
- PackageInfo (add_repo_from_packages_impl_loop): the package timestamp (normalized from ms to seconds) is checked before adding a solvable. If newer, the package is skipped entirely.
The .solv cache path is not filtered. Callers (e.g. conda-libmamba-solver) are expected to invalidate the cache when exclude_newer_timestamp changes, the same way they handle repodata_filters.
Other changes
- Moved MAX_CONDA_TIMESTAMP from the anonymous namespace in helpers.cpp to helpers.hpp and added normalize_conda_timestamp for shared timestamp normalization.
- Database.__init__() accepts an optional exclude_newer_timestamp keyword argument.
Context
- --exclude-newer is a supply chain security feature that excludes recently published packages, giving security vendors time to flag malicious uploads. It's already supported by uv, pip, pixi, and rattler.
- indexed_timestamp: conda/ceps#154
Test plan
- add_repo_from_packages, millisecond normalization, repodata JSON filtering, unfiltered vs filtered package_count, and indexed_timestamp precedence
- None default
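To make the load-time behaviour in the description concrete, here is a Python sketch of the same policy applied to a repodata dict: indexed_timestamp takes precedence over timestamp when present, millisecond values are normalized to seconds, and a None cutoff keeps everything. The names policy_timestamp and filter_repodata are hypothetical, the constant value is an assumption, and the native implementation works on libsolv solvables rather than dicts.

```python
# Assumed to match MAX_CONDA_TIMESTAMP in helpers.hpp.
MAX_CONDA_TIMESTAMP = 253402300799


def policy_timestamp(record):
    """Timestamp in seconds used for the cutoff comparison."""
    raw = record.get("indexed_timestamp")
    if raw is None:
        # Fall back to the build timestamp; missing counts as 0.
        raw = record.get("timestamp", 0)
    return raw // 1000 if raw > MAX_CONDA_TIMESTAMP else raw


def filter_repodata(repodata, cutoff=None):
    """Return a copy of repodata without packages newer than the cutoff."""
    if cutoff is None:
        return repodata
    filtered = dict(repodata)
    for key in ("packages", "packages.conda"):
        filtered[key] = {
            filename: record
            for filename, record in repodata.get(key, {}).items()
            if policy_timestamp(record) <= cutoff
        }
    return filtered
```

A package whose build timestamp is old but whose indexed_timestamp is recent is excluded under this policy, which is the precedence case listed in the test plan.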