[None][infra] Reduce Docker image layer count in release stage#14972
[None][infra] Reduce Docker image layer count in release stage#14972tburt-nv wants to merge 2 commits into
Conversation
Consolidate all COPY and RUN steps in the release stage into a single RUN using bind mounts and cp. This replaces 7 COPY layers and 2 RUN layers with a single RUN layer. Also use a bind mount for the OSS attribution script instead of COPY+RUN+rm. Reduces the release stage from ~13 layers to 3 (pip install, copy+setup, OSS attribution), saving ~10 layers in the published image. Signed-off-by: jbernloehr <jbernloehr@nvidia.com>
|
/bot run --stage-list "Build-Docker-Images" |
|
PR_Github #52147 [ run ] triggered by Bot. Commit: |
|
PR_Github #52147 [ run ] completed with state
|
Signed-off-by: Tyler Burt <195370667+tburt-nv@users.noreply.github.com>
|
/bot run |
|
PR_Github #52374 [ run ] triggered by Bot. Commit: |
📝 WalkthroughWalkthroughThe PR optimizes the Docker release stage by replacing sequential COPY statements with BuildKit bind mounts for more efficient artifact assembly. It removes benchmark source files, validates library presence, creates package symlinks, and refactors OSS attribution generation. Jenkins CI image tag properties are updated to reference the newly built images. ChangesDocker Build and CI Image Updates
Estimated code review effort🎯 2 (Simple) | ⏱️ ~10 minutes 🚥 Pre-merge checks | ✅ 5✅ Passed checks (5 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches🧪 Generate unit tests (beta)
Comment |
There was a problem hiding this comment.
Actionable comments posted: 1
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@docker/Dockerfile.multi`:
- Line 165: The ldd predicate can silently pass when there are no tensorrt_llm
lines; change the check to first assert the dependency exists and then assert it
is not "not found" — e.g. run ldd -v bin/executorWorker and test with grep -q
'tensorrt_llm' (fail if missing) and only then check that the matched lines do
NOT contain "not found" (e.g. ! grep tensorrt_llm | grep -q "not found");
reference the binary name bin/executorWorker and the dependency token
tensorrt_llm when updating the Dockerfile multi-line gate.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Enterprise
Run ID: 14825cce-cb85-4f12-b17e-5176b75b12ae
📒 Files selected for processing (2)
docker/Dockerfile.multijenkins/current_image_tags.properties
|
/bot run --disable-fail-fast |
|
PR_Github #52374 [ run ] completed with state
|
|
PR_Github #52388 [ run ] triggered by Bot. Commit: |
|
PR_Github #52388 [ run ] completed with state |
|
/bot run --disable-fail-fast |
|
PR_Github #52414 [ run ] triggered by Bot. Commit: |
Description
This PR is intended to supersede @janbernloehr's PR #11987, since it needs some minor tweaks.
Here is Jan's original summary:
Follow-up to #11720 which already consolidated the devel stage layers using bind mounts.
This PR applies the same pattern to the release stage:
Replace 7 COPY layers + 2 RUN layers with a single RUN using --mount=type=bind and cp, consolidating all file copies, symlink creation, and cleanup into one layer
Replace the OSS attribution COPY + RUN + rm with a single RUN using a bind mount
Reduces the release stage from ~13 layers to 3 (pip install, copy+setup+cleanup, OSS attribution), saving ~10 layers in the published image and leaving more headroom under Docker's 128-layer overlay2 limit for downstream consumers.
Test Coverage
PR Checklist
Please review the following before submitting your PR:
PR description clearly explains what and why. If using CodeRabbit's summary, please make sure it makes sense.
PR Follows TRT-LLM CODING GUIDELINES to the best of your knowledge.
Test cases are provided for new code paths (see test instructions)
If PR introduces API changes, an appropriate PR label is added - either
api-compatibleorapi-breaking. Forapi-breaking, includeBREAKINGin the PR title.Any new dependencies have been scanned for license and vulnerabilities
CODEOWNERS updated if ownership changes
Documentation updated as needed
Update tava architecture diagram if there is a significant design change in PR.
The reviewers assigned automatically/manually are appropriate for the PR.
Please check this after reviewing the above items as appropriate for this PR.
GitHub Bot Help
To see a list of available CI bot commands, please comment
/bot help.Summary by CodeRabbit