76 changes: 76 additions & 0 deletions .github/workflows/test-nvcr-pull.yml
@@ -0,0 +1,76 @@
# nvcr.io image pull + size — single-job smoke test.
#
# Pulls a specific nvcr.io image and reports its on-disk size. Does
# not include `docker login` — relies on whatever credentials the
# runner is configured with (the Packer-baked NGC pull-secret on
# nv-gha-runners, if present). A failure with an auth error is
# itself useful signal that the runner is not pre-configured for
# the target namespace.
#
# Triggers: same as `secret-scan.yml` — copy-pr-bot mirror branches
# (`pull-request/[0-9]+`) on NVIDIA/cccl + workflow_dispatch.

name: nvcr image pull + size

run-name: nvcr pull — ${{ github.ref_name }}

on:
  push:
    branches:
      - "pull-request/[0-9]+"
  workflow_dispatch:

concurrency:
  group: ${{ github.workflow }}-on-${{ github.event_name }}-from-${{ github.ref_name }}
  cancel-in-progress: true

permissions:
  contents: read

jobs:
  pull-and-size:
    name: docker pull + size
    # NV self-hosted CPU runner on NVIDIA/cccl; GitHub-hosted fallback
    # on contributor forks where nv-gha-runners labels do not resolve.
    runs-on: ${{ github.repository == 'NVIDIA/cccl' && 'linux-amd64-cpu4' || 'ubuntu-latest' }}
    # Cap stuck pulls / registry hangs so a wedged job doesn't sit on a
    # self-hosted runner indefinitely. 15 min covers a multi-GB image
    # pull on a busy runner with headroom; well under GitHub's default
    # 6-hour job timeout.
    timeout-minutes: 15
    env:
      NVCR_IMAGE: nvcr.io/nvidian/prodsec/pulse-trufflehog:1.33

⚠️ Potential issue | 🟠 Major | ⚡ Quick win

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
set -euo pipefail
FILE=".github/workflows/test-nvcr-pull.yml"
echo "=== file ==="
ls -l "$FILE"
echo
echo "=== snippet around line 37 ==="
nl -ba "$FILE" | sed -n '1,120p'

Repository: NVIDIA/cccl

Length of output: 247


🏁 Script executed:

#!/bin/bash
set -euo pipefail
FILE=".github/workflows/test-nvcr-pull.yml"

echo "=== file ==="
ls -l "$FILE"
echo

echo "=== lines 1-120 with numbers (no nl) ==="
cat -n "$FILE" | sed -n '1,120p'
echo

echo "=== lines around NVCR_IMAGE ==="
rg -n "NVCR_IMAGE" "$FILE" || true
echo

echo "=== lines around job steps cleanup/docker rmi ==="
rg -n "docker rmi|cleanup" "$FILE" || true
echo

echo "=== search for timeout-minutes ==="
rg -n "timeout-minutes" "$FILE" || true

Repository: NVIDIA/cccl

Length of output: 2871


  • important: Pin NVCR_IMAGE to an immutable digest instead of the mutable :1.33 tag to keep the workflow reproducible and strengthen supply-chain trust (line 37).

  • suggestion: Add timeout-minutes to jobs.pull-and-size and avoid masking cleanup failures (docker rmi ... || true) so self-hosted runner hygiene issues don’t get hidden (lines 31-37, 52-54).
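
To illustrate the digest-pinning point, here is a minimal sketch of a guard that a workflow step could run before pulling. The function names `is_digest_pinned` and `check_image_ref` are hypothetical and not part of this PR, and the warning text is only illustrative. The check assumes a pinned reference ends in `@sha256:` followed by 64 lowercase hex characters.

```shell
#!/bin/sh
# Hypothetical helper (an assumption, not code from this PR): succeeds only
# when the image reference ends in an immutable sha256 digest rather than a
# mutable tag such as :1.33.
is_digest_pinned() {
  printf '%s\n' "$1" | grep -Eq '@sha256:[0-9a-f]{64}$'
}

# Hypothetical guard for a workflow step: prints a status line and returns
# non-zero when the reference is not pinned, so the step can fail or warn.
check_image_ref() {
  if is_digest_pinned "$1"; then
    echo "pinned: $1"
  else
    echo "::warning::image reference is not pinned by digest: $1"
    return 1
  fi
}
```

A step could call `check_image_ref "${NVCR_IMAGE}"` ahead of `docker pull`. After a successful pull, the digest to pin can usually be read with `docker image inspect --format '{{index .RepoDigests 0}}' "${NVCR_IMAGE}"`, assuming the image was pulled from a registry and so has a repo digest recorded.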


    steps:
      - name: docker pull
        run: docker pull "${NVCR_IMAGE}"

      - name: Report size
        run: |
          set -euo pipefail
          bytes=$(docker image inspect "${NVCR_IMAGE}" --format '{{.Size}}')
          mib=$(awk -v b="${bytes}" 'BEGIN { printf "%.1f", b/1024/1024 }')
          gib=$(awk -v b="${bytes}" 'BEGIN { printf "%.2f", b/1024/1024/1024 }')
          echo "Image: ${NVCR_IMAGE}"
          echo "Size: ${bytes} bytes (${mib} MiB / ${gib} GiB)"

      - name: Cleanup
        if: always()
        # Only attempt removal if the image is actually present locally.
        # If the pull failed (e.g. auth error on a private namespace),
        # `docker rmi` would fail with "no such image" — that's expected,
        # not a real disk-growth signal, so we skip silently. If the
        # image IS present and `rmi` fails, that's a genuine problem on
        # a self-hosted runner (leaks layers across runs) — surface as
        # a warning so it shows up in the run log.
        run: |
          set -euo pipefail
          if ! docker image inspect "${NVCR_IMAGE}" >/dev/null 2>&1; then
            echo "cleanup: image not present locally (likely pull failed); nothing to remove"
            exit 0
          fi
          if ! docker rmi "${NVCR_IMAGE}" >/dev/null 2>&1; then
            echo "::warning::cleanup: failed to remove ${NVCR_IMAGE} from runner cache; manual cleanup may be needed to avoid disk growth on this runner"
          else
            echo "cleanup: removed ${NVCR_IMAGE}"
          fi