
[libcu++] Rename max to highest in arguments framework #9246

Merged
pciolkosz merged 2 commits into NVIDIA:main from pciolkosz:rename_max_to_highest
Jun 4, 2026

Conversation

@pciolkosz
Contributor

Initially, lowest and max were selected in the arguments framework to mirror numeric_limits, but max has some issues on Windows, and I don't necessarily like the asymmetry. Since the arguments are mostly just created by users and consumed by our APIs, I think we can deviate from numeric_limits and use lowest and highest instead, which is symmetrical and avoids the Windows issues.
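To make the naming argument concrete, here is a minimal sketch of a bounds type that pairs lowest with highest. The demo::bounds type and its contains member are hypothetical and are not the libcu++ argument framework; only std::numeric_limits is a real API here.

```cpp
#include <limits>

// Hypothetical illustration only: demo::bounds is not the libcu++ type.
namespace demo {

template <class T>
struct bounds {
  // Defaults span the whole representable range of T.
  // numeric_limits pairs lowest() with max(); this sketch names the
  // upper end "highest" so both members read symmetrically.
  T lowest  = std::numeric_limits<T>::lowest();
  T highest = (std::numeric_limits<T>::max)();

  // True when v lies inside the closed interval [lowest, highest].
  constexpr bool contains(T v) const { return lowest <= v && v <= highest; }
};

} // namespace demo
```

A caller would write something like demo::bounds<int>{0, 10}.contains(5); the member names never spell max, so a function-like max macro has nothing to expand.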

@pciolkosz pciolkosz requested a review from a team as a code owner June 3, 2026 23:02
@pciolkosz pciolkosz requested a review from davebayer June 3, 2026 23:02
@github-project-automation github-project-automation Bot moved this to Todo in CCCL Jun 3, 2026
@cccl-authenticator-app cccl-authenticator-app Bot moved this from Todo to In Review in CCCL Jun 3, 2026
@coderabbitai
Contributor

coderabbitai Bot commented Jun 3, 2026

Review Change Stack

ℹ️ Recent review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Enterprise

Run ID: 5cf19ffe-5073-4bca-abea-55105e000cbf

📥 Commits

Reviewing files that changed from the base of the PR and between 2ecd653 and 8ef2fe5.

📒 Files selected for processing (11)
  • cub/cub/agent/agent_batched_topk.cuh
  • cub/cub/device/dispatch/dispatch_batched_topk.cuh
  • cub/cub/device/dispatch/kernels/kernel_batched_topk.cuh
  • libcudacxx/include/cuda/__argument/argument.h
  • libcudacxx/include/cuda/__argument/argument_bounds.h
  • libcudacxx/test/libcudacxx/cuda/argument/argument_traits.pass.cpp
  • libcudacxx/test/libcudacxx/cuda/argument/deferred_argument.pass.cpp
  • libcudacxx/test/libcudacxx/cuda/argument/dynamic_argument.pass.cpp
  • libcudacxx/test/libcudacxx/cuda/argument/static_argument.pass.cpp
  • libcudacxx/test/libcudacxx/cuda/argument/static_bounds_conversion.fail.cpp
  • libcudacxx/test/libcudacxx/cuda/argument/usage_example.pass.cpp
✅ Files skipped from review due to trivial changes (2)
  • libcudacxx/include/cuda/__argument/argument_bounds.h
  • libcudacxx/test/libcudacxx/cuda/argument/deferred_argument.pass.cpp
🚧 Files skipped from review as they are similar to previous changes (6)
  • libcudacxx/test/libcudacxx/cuda/argument/static_bounds_conversion.fail.cpp
  • libcudacxx/test/libcudacxx/cuda/argument/dynamic_argument.pass.cpp
  • libcudacxx/test/libcudacxx/cuda/argument/static_argument.pass.cpp
  • libcudacxx/test/libcudacxx/cuda/argument/usage_example.pass.cpp
  • libcudacxx/test/libcudacxx/cuda/argument/argument_traits.pass.cpp
  • libcudacxx/include/cuda/__argument/argument.h

Note: CodeRabbit is enabled on this repository as a convenience for maintainers
and contributors. Use your best judgment when considering its review comments and
suggestions — a suggested change may be inadequate, unnecessary, or safe to ignore.
Contributors are not expected to address every comment. Human reviews are what
ultimately matter for merging.

Overview

This change renames the upper bound terminology in the CUDA argument framework from "max" to "highest" to improve API symmetry and avoid Windows-specific macro interference. The term "max" is replaced consistently throughout the framework and its tests, while "lowest" remains unchanged as the lower bound.

Key Changes

libcudacxx/include/cuda/__argument/argument.h (Primary changes)

  • Variable template specialization: __valid_static_bounds_v now uses __static_bounds<_Lowest, _Highest> template parameters
  • Helper functions renamed: __wrapper_static_max → __wrapper_static_highest, __effective_max → __effective_highest
  • Deduction guides updated for __immediate, __immediate_sequence, __deferred, and __deferred_sequence to use _Highest template parameter
  • Constant bound computation helpers: __constant_compute_max → __constant_compute_highest, __constant_sequence_compute_max → __constant_sequence_compute_highest
  • Traits member renamed: __traits_impl<...>::max → __traits_impl<...>::highest
  • Free-function API renamed: __max_() overload set → __highest_() overload set

libcudacxx/include/cuda/__argument/argument_bounds.h

  • Minor change to __runtime_bounds default initialization to wrap std::numeric_limits<_Tp>::max() as (::cuda::std::numeric_limits<_Tp>::max)() to prevent macro interference
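The parenthesization trick can be shown in isolation. The sketch below simulates the function-like max macro that <windows.h> defines when NOMINMAX is not set; the macro definition and the int_upper name are illustrative assumptions, not code from this PR.

```cpp
#include <limits>

// Simulate the function-like macro that <windows.h> defines unless NOMINMAX
// is set. It is defined after <limits> so the header itself is unaffected.
#define max(a, b) (((a) > (b)) ? (a) : (b))

// With the macro active, writing std::numeric_limits<int>::max() is a
// preprocessing error, because the macro expects two arguments. Wrapping the
// qualified name in parentheses means the token max is followed by ")" rather
// than "(", so the macro is not invoked and the member function is called.
constexpr int int_upper = (std::numeric_limits<int>::max)();

#undef max // keep the simulated macro from leaking into later code

static_assert(int_upper == std::numeric_limits<int>::max(),
              "the parenthesized call yields the same value");
```

The same pattern works for min and for any other member name that a platform header happens to define as a function-like macro.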

Test Files Updated

  • libcudacxx/test/libcudacxx/cuda/argument/argument_traits.pass.cpp
  • libcudacxx/test/libcudacxx/cuda/argument/deferred_argument.pass.cpp
  • libcudacxx/test/libcudacxx/cuda/argument/dynamic_argument.pass.cpp
  • libcudacxx/test/libcudacxx/cuda/argument/static_argument.pass.cpp
  • libcudacxx/test/libcudacxx/cuda/argument/static_bounds_conversion.fail.cpp
  • libcudacxx/test/libcudacxx/cuda/argument/usage_example.pass.cpp

All tests updated to reflect the renaming: __traits<T>::max → __traits<T>::highest and __max_() → __highest_() calls throughout static and runtime bound validation scenarios.

CUB usages updated

  • cub/cub/agent/agent_batched_topk.cuh
  • cub/cub/device/dispatch/dispatch_batched_topk.cuh
  • cub/cub/device/dispatch/kernels/kernel_batched_topk.cuh

Upper-bound references and compile-time decisions were switched from max to highest in policy/segment-size logic.
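As a rough illustration of a compile-time decision driven by an upper bound, consider the sketch below. Everything in namespace demo is hypothetical (the bounded_size type, the small_segment_limit threshold, and the helper name); the actual CUB policies are not reproduced here.

```cpp
#include <cstddef>

namespace demo {

// Hypothetical stand-in for an argument whose value is only known at run time
// but whose bounds are known at compile time.
template <std::size_t Lowest, std::size_t Highest>
struct bounded_size {
  static_assert(Lowest <= Highest, "lowest must not exceed highest");
  static constexpr std::size_t lowest  = Lowest;
  static constexpr std::size_t highest = Highest;
  std::size_t value;
};

// Hypothetical threshold chosen for this example only.
constexpr std::size_t small_segment_limit = 1024;

// True when every permitted segment size fits the small-segment path, so the
// choice can be made at compile time from the upper bound alone.
template <class SegmentSizeT>
constexpr bool uses_small_segment_path() {
  return SegmentSizeT::highest <= small_segment_limit;
}

} // namespace demo
```

With these definitions, demo::uses_small_segment_path<demo::bounded_size<1, 256>>() is true and the same query for bounded_size<1, 4096> is false, without inspecting any run-time value.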

Impact

This is a pure refactoring with no functional changes—the behavior of the arguments framework remains identical. The change improves API naming consistency ("lowest"/"highest" symmetry) and eliminates Windows macro name collision risks.

Walkthrough

This PR renames the upper-bound API and internals from "max" to "highest" across libcudacxx argument-bounds, updates deduction guides and traits, replaces the __max_ free-function set with __highest_, fixes a macro-safety initializer, and updates CUB callsites and tests to match.

Changes

Argument-bounds "max" to "highest" terminology refactor

Layer / File(s): Summary

  • Bound validation and helper renames (libcudacxx/include/cuda/__argument/argument.h): __valid_static_bounds_v specialized for __static_bounds<_Lowest,_Highest>; introduced __wrapper_static_highest and __effective_highest; static-element validation updated to use the new helpers; __highest_ free-function added.
  • Deduction guides for wrapper types (libcudacxx/include/cuda/__argument/argument.h): All deduction guides for __immediate, __immediate_sequence, __deferred, and __deferred_sequence updated to use __static_bounds<_Lowest,_Highest>.
  • Traits computation and member rename (libcudacxx/include/cuda/__argument/argument.h): __constant_compute_highest / __constant_sequence_compute_highest replace "max" variants; __traits_impl specializations renamed member max → highest.
  • Free-function API replacement (libcudacxx/include/cuda/__argument/argument.h): Replaced __max_ overload set with __highest_; wrapper overloads validate intersections and return __effective_highest.
  • Macro collision safety (libcudacxx/include/cuda/__argument/argument_bounds.h): __runtime_bounds default __upper_ initializer parenthesizes (::cuda::std::numeric_limits<_Tp>::max)() to avoid macro collisions.
  • CUB batched-topk callsites (cub/cub/agent/agent_batched_topk.cuh, cub/cub/device/dispatch/dispatch_batched_topk.cuh, cub/cub/device/dispatch/kernels/kernel_batched_topk.cuh): Policy selection and small-segment predicates updated to use __traits<...>::highest instead of ::max; minor parenthesization adjustment around numeric_limits<key_t>::max() in padding logic.
  • Test assertion updates (libcudacxx/test/libcudacxx/cuda/argument/*): All tests updated: __traits<...>::max → __traits<...>::highest and __max_(...) → __highest_(...) across trait/unit/usage tests.
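The summary mentions validating intersections and returning an effective upper bound. One plausible reading, sketched here with hypothetical names, is that the effective bound is the tighter of a static and a runtime upper bound. This is an assumption about the semantics, not the libcu++ implementation.

```cpp
#include <algorithm>

namespace demo {

// Hypothetical closed interval [lowest, highest]; not the libcu++ bounds types.
struct interval {
  int lowest;
  int highest;
};

// Two closed intervals intersect when neither lies entirely above the other.
constexpr bool intersects(interval a, interval b) {
  return a.lowest <= b.highest && b.lowest <= a.highest;
}

// Assumed meaning of an "effective" upper bound: the tighter of the two.
// std::min is constexpr from C++14 onward.
constexpr int effective_highest(interval static_bounds, interval runtime_bounds) {
  return std::min(static_bounds.highest, runtime_bounds.highest);
}

} // namespace demo
```

For static bounds [0, 10] and runtime bounds [5, 20], the intervals intersect and the effective upper bound under this reading is 10.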

Possibly related PRs

  • NVIDIA/cccl#9074: Earlier changes to CUB batched-topk that referenced traits::max and would need the same max→highest renaming.
  • NVIDIA/cccl#8875: Introduced the original lowest/max argument-bounds API that this PR renames to highest equivalents.

Suggested reviewers

  • gevtushenko
  • davebayer

Warning

There were issues while running some tools. Please review the errors and either fix the tool's configuration or disable the tool if it's a critical failure.

🔧 Infer (1.2.0)
libcudacxx/test/libcudacxx/cuda/argument/deferred_argument.pass.cpp

libcudacxx/test/libcudacxx/cuda/argument/deferred_argument.pass.cpp:11:10: fatal error: 'cuda/_argument' file not found
11 | #include <cuda/_argument>
| ^~~~~~~~~~~~~~~~~~
1 error generated.
libcudacxx/test/libcudacxx/cuda/argument/deferred_argument.pass.cpp:150:3-155:3: ERROR translating statement 'CompoundStmt'
Aborting translation of method 'test' in file 'libcudacxx/test/libcudacxx/cuda/argument/deferred_argument.pass.cpp': "Assert_failure src/clang/cAst_utils.ml:249:53"
Uncaught Internal Error: "Assert_failure src/clang/cAst_utils.ml:249:53"
Error backtrace:
Raised at ClangFrontend__CAst_utils.get_decl_from_typ_ptr in file "src/clang/cAst_utils.ml", line 249, characters 53-65
Called from ClangFrontend__CTrans.CTrans_funct.get_destructor_decl_ref in file "src/clang/cTrans.ml", line 658, characters 12-59
Called from ClangFrontend__CTrans.CTrans_funct.destructor_calls.(fun) in file "src/clang/cTrans.ml", line 2048, characters 12-69
Called from Base__List.rev_fil

... [truncated 2200 characters] ...

ed from ClangFrontend__CTrans.CTrans_funct.exec_with_node_creation in file "src/clang/cTrans.ml" (inlined), line 104, characters 20-38
Called from ClangFrontend__CTrans.CTrans_funct.get_clang_stmt_trans in file "src/clang/cTrans.ml" (inlined), line 5395, characters 4-69
Called from ClangFrontend__CTrans.CTrans_funct.get_custom_stmt_trans in file "src/clang/cTrans.ml", line 5401, characters 8-55
Called from ClangFrontend__CTrans.CTrans_funct.exec_trans_instrs.exec_trans_instrs_rev in file "src/clang/cTrans.ml" (inlined), line 5365, characters 28-54
Called from ClangFrontend__CTrans.CTrans_funct.exec_trans_instrs in file "src/clang/cTrans.ml" (inlined), line 5389, characters 6-70
Called from ClangFrontend__CTrans.CTrans_funct.instructions_trans in file "src/clang/cTrans.ml", line 5451, chara

libcudacxx/test/libcudacxx/cuda/argument/argument_traits.pass.cpp

libcudacxx/test/libcudacxx/cuda/argument/argument_traits.pass.cpp:11:10: fatal error: 'cuda/_argument' file not found
11 | #include <cuda/argument>
| ^~~~~~~~~~~~~~~~~~
1 error generated.
Error: the following clang command did not run successfully:
/opt/infer-linux-x86_64-v1.2.0/lib/infer/facebook-clang-plugins/clang/install/bin/clang-18
@/tmp/coderabbit-infer/8ef2fe52e82d6f8ebe68d1e9b8b161b917cbc457-07842b3345a172fc/tmp/clang_command_.tmp.15a262.txt
++Contents of '/tmp/coderabbit-infer/8ef2fe52e82d6f8ebe68d1e9b8b161b917cbc457-07842b3345a172fc/tmp/clang_command
.tmp.15a262.txt':
"-cc1" "-load"
"/opt/infer-linux-x86_64-v1.2.0/lib/infer/infer/bin/../../facebook-clang-plugins/libtooling/build/FacebookClangPlugin.dylib"
"-add-plugin" "BiniouASTExporter" "-plugin-arg-BiniouASTExporter" "-"
"-plugin-arg-BiniouASTExporter" "PREPEND_CURRENT_DIR=1"
"-plugin-arg-BiniouASTExporter" "MAX_STRING_SIZE=65535" "-cc1" "-triple"
"x86_64-unknown-li

... [truncated 1169 characters] ...

al-isystem" "/usr/local/include" "-internal-isystem"
"/usr/lib/gcc/x86_64-linux-gnu/12/../../../../x86_64-linux-gnu/include"
"-internal-externc-isystem" "/usr/include/x86_64-linux-gnu"
"-internal-externc-isystem" "/include" "-internal-externc-isystem"
"/usr/include" "-Wno-ignored-optimization-argument" "-Wno-everything"
"-fdeprecated-macro" "-ferror-limit" "19" "-fgnuc-version=4.2.1"
"-fskip-odr-check-in-gmf" "-fcxx-exceptions" "-fexceptions"
"-D__GCC_HAVE_DWARF2_CFI_ASM=1" "-o"
"/tmp/coderabbit-infer/07842b3345a172fc/file.o" "-x" "c++"
"libcudacxx/test/libcudacxx/cuda/argument/argument_traits.pass.cpp" "-O0"
"-fno-builtin" "-include"
"/opt/infer-linux-x86_64-v1.2.0/lib/infer/infer/bin/../lib/clang_wrappers/global_defines.h"
"-Wno-everything"

libcudacxx/test/libcudacxx/cuda/argument/dynamic_argument.pass.cpp

libcudacxx/test/libcudacxx/cuda/argument/dynamic_argument.pass.cpp:11:10: fatal error: 'cuda/_argument' file not found
11 | #include <cuda/_argument>
| ^~~~~~~~~~~~~~~~~~
1 error generated.
libcudacxx/test/libcudacxx/cuda/argument/dynamic_argument.pass.cpp:161:3-166:3: ERROR translating statement 'CompoundStmt'
Aborting translation of method 'test' in file 'libcudacxx/test/libcudacxx/cuda/argument/dynamic_argument.pass.cpp': "Assert_failure src/clang/cAst_utils.ml:249:53"
Uncaught Internal Error: "Assert_failure src/clang/cAst_utils.ml:249:53"
Error backtrace:
Raised at ClangFrontend__CAst_utils.get_decl_from_typ_ptr in file "src/clang/cAst_utils.ml", line 249, characters 53-65
Called from ClangFrontend__CTrans.CTrans_funct.get_destructor_decl_ref in file "src/clang/cTrans.ml", line 658, characters 12-59
Called from ClangFrontend__CTrans.CTrans_funct.destructor_calls.(fun) in file "src/clang/cTrans.ml", line 2048, characters 12-69
Called from Base__List.rev_filter

... [truncated 2200 characters] ...

from ClangFrontend__CTrans.CTrans_funct.exec_with_node_creation in file "src/clang/cTrans.ml" (inlined), line 104, characters 20-38
Called from ClangFrontend__CTrans.CTrans_funct.get_clang_stmt_trans in file "src/clang/cTrans.ml" (inlined), line 5395, characters 4-69
Called from ClangFrontend__CTrans.CTrans_funct.get_custom_stmt_trans in file "src/clang/cTrans.ml", line 5401, characters 8-55
Called from ClangFrontend__CTrans.CTrans_funct.exec_trans_instrs.exec_trans_instrs_rev in file "src/clang/cTrans.ml" (inlined), line 5365, characters 28-54
Called from ClangFrontend__CTrans.CTrans_funct.exec_trans_instrs in file "src/clang/cTrans.ml" (inlined), line 5389, characters 6-70
Called from ClangFrontend__CTrans.CTrans_funct.instructions_trans in file "src/clang/cTrans.ml", line 5451, characte

  • 3 others

Comment @coderabbitai help to get the list of available commands and usage tips.

@pciolkosz pciolkosz requested a review from gevtushenko June 3, 2026 23:13
@github-actions

This comment has been minimized.

@pciolkosz pciolkosz force-pushed the rename_max_to_highest branch from 2ecd653 to 8ef2fe5 Compare June 4, 2026 01:33
@pciolkosz pciolkosz requested a review from a team as a code owner June 4, 2026 01:33
Contributor

@coderabbitai coderabbitai Bot left a comment


Actionable comments posted: 1

🧹 Nitpick comments (2)
cub/cub/device/dispatch/dispatch_batched_topk.cuh (1)

92-92: ⚡ Quick win

suggestion: Qualify the touched free-function calls from the global namespace. The edited call sites still use wrap_select_direction(...) / params::get_param(...) unqualified, which does not match the header rule for free-function calls.

As per coding guidelines, "All calls to free functions must be fully qualified starting from the global namespace, e.g., ::cuda::ceil_div."

Also applies to: 151-151, 200-200, 256-256, 300-300
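The reason such a rule exists can be sketched with a small example of argument-dependent lookup (ADL). The namespaces demo_lib and demo_user and the describe function are hypothetical; they only illustrate how an unqualified call inside a template can pick up an unrelated function from the argument's namespace.

```cpp
namespace demo_lib {

// Library helper that generic code intends to call.
template <class T>
constexpr int describe(const T&) { return 0; }

// Unqualified call: ADL may select an overload from the argument's namespace.
template <class T>
constexpr int unqualified(const T& v) { return describe(v); }

// Qualified from the global namespace: always resolves to demo_lib::describe.
template <class T>
constexpr int qualified(const T& v) { return ::demo_lib::describe(v); }

} // namespace demo_lib

namespace demo_user {

struct widget {};

// Unrelated user function that happens to share the helper's name.
constexpr int describe(const widget&) { return 1; }

} // namespace demo_user
```

Here demo_lib::unqualified(demo_user::widget{}) returns 1 because the non-template demo_user::describe is found through ADL and preferred, while demo_lib::qualified(demo_user::widget{}) returns 0.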

cub/cub/agent/agent_batched_topk.cuh (1)

219-220: ⚡ Quick win

suggestion: Cache params::get_param(k_param, segment_id) once before the min. This path currently evaluates the per-segment k parameter twice, which is an extra device-side fetch in the hot small-segment path and makes the type expression harder to read.

As per coding guidelines, "cub/**/*: Focus on algorithm correctness, temporary-storage protocol, dispatch/policy selection, stream behavior, CUDA error handling, synchronization, memory access safety, performance regressions, and test coverage."
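The caching suggestion can be sketched as follows. The k_lookup type, its get member, and the example_k data are hypothetical stand-ins; the actual agent code and the signature of params::get_param are not shown on this page and are not reproduced here.

```cpp
#include <algorithm>

namespace demo {

// Hypothetical per-segment parameter source; in device code the read in get()
// could be a global-memory load.
struct k_lookup {
  const int* per_segment_k;
  constexpr int get(int segment_id) const { return per_segment_k[segment_id]; }
};

// Example data used only to exercise the sketch.
constexpr int example_k[] = {3, 100};

// Reads the per-segment k once, then clamps it to the segment size.
constexpr int clamped_k(const k_lookup& k_param, int segment_id, int segment_size) {
  const int k = k_param.get(segment_id); // single fetch, reused below
  return std::min(k, segment_size);
}

} // namespace demo
```

Naming the fetched value also gives the clamp a simple operand type, which addresses the readability point in the suggestion.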



Contributor

@coderabbitai coderabbitai Bot left a comment


Caution

Inline review comments failed to post. This is likely due to GitHub's internal server error or limits when posting large numbers of comments. If you are seeing this consistently it is likely a permissions issue. Please check "Moderation" -> "Code review limits" under your organization settings.

🛑 Comments failed to post (1)
cub/cub/device/dispatch/dispatch_batched_topk.cuh (1)

242-243: ⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

suggestion: Fix the static_assert text to mention num_segments, not segment sizes. The predicate here is on NumSegmentsParameterT, so the current message points users at the wrong argument when this fails.
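A minimal sketch of the point being made: each static_assert message should name the argument its predicate actually tests. The dispatch_check template and the is_integral predicates below are hypothetical; the real predicate on NumSegmentsParameterT is not shown on this page.

```cpp
#include <type_traits>

namespace demo {

// Hypothetical compile-time checks for two distinct arguments.
template <class NumSegmentsT, class SegmentSizeT>
struct dispatch_check {
  // The message names num_segments because the predicate tests NumSegmentsT.
  static_assert(std::is_integral<NumSegmentsT>::value,
                "num_segments must have an integral type");
  // The message names segment sizes because the predicate tests SegmentSizeT.
  static_assert(std::is_integral<SegmentSizeT>::value,
                "segment sizes must have an integral type");
  static constexpr bool ok = true;
};

} // namespace demo
```

If the first message mentioned segment sizes instead, a user passing a bad num_segments argument would be directed to the wrong parameter.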

@github-actions
Contributor

github-actions Bot commented Jun 4, 2026

🥳 CI Workflow Results

🟩 Finished in 2h 33m: Pass: 100%/340 | Total: 3d 19h | Max: 1h 09m | Hits: 93%/503670

See results here.

@pciolkosz pciolkosz merged commit 6b7ae38 into NVIDIA:main Jun 4, 2026
362 checks passed
@miscco
Contributor

miscco commented Jun 4, 2026

Not really a fan given that lowest is actually part of numeric_limits so now we have an actual divergence
