UCP/CORE: Print debugging tables on context and ep creation #11510
Draft
guy-ealey-morag wants to merge 12 commits into
Signed-off-by: Guy Ealey Morag <gealeymorag@nvidia.com>
What?
Print human-readable tables on context and endpoint creation that describe the available transports and devices, and show which of them are actively in use.
Why?
Users often run UCX (or NIXL with UCX) and get unexpected behavior due to misconfiguration (in env vars or in the container).
Printing the transport and device tables allows users to see which transports are available, which devices are visible, and which transports and devices are enabled or disabled in their current configuration. The tables can also report that a transport that was compiled into UCX is unsupported in their current environment.
On endpoint creation we print the transports and devices that were selected, along with the lane types.
How?
All of the information about the transports and devices is collected during initialization. It is then printed as a table at context init and at every endpoint init, depending on the env var UCX_PRINT_TRANSPORT_TABLES.
Examples:
(The timestamp/process prefix added by ucs_log_print_compact was removed here for improved readability.)
ucx_perftest with cuda
ucx_perftest with tcp
Unsupported transports:
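As a rough illustration of the collect-then-print flow described under "How?", here is a minimal standalone C sketch. It does not use any UCX code or APIs: every name in it (tl_entry_t, tl_status_t, tl_table_format, tables_enabled) is hypothetical, the column layout is invented, and the accepted env var values (anything starting with y, Y or 1) are an assumption rather than the behavior of UCX_PRINT_TRANSPORT_TABLES.

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical per-entry status; these values are not taken from UCX. */
typedef enum {
    TL_STATUS_ENABLED,
    TL_STATUS_DISABLED,
    TL_STATUS_UNSUPPORTED
} tl_status_t;

/* One row collected during initialization (hypothetical layout). */
typedef struct {
    const char  *transport;
    const char  *device;
    tl_status_t status;
} tl_entry_t;

static const char *tl_status_str(tl_status_t status)
{
    switch (status) {
    case TL_STATUS_ENABLED:     return "enabled";
    case TL_STATUS_DISABLED:    return "disabled";
    case TL_STATUS_UNSUPPORTED: return "unsupported";
    }
    return "unknown";
}

/* Assumed convention: printing is on when the variable starts with y/Y/1. */
static int tables_enabled(const char *env_name)
{
    const char *value = getenv(env_name);

    return (value != NULL) && (value[0] != '\0') &&
           (strchr("yY1", value[0]) != NULL);
}

/*
 * Format the collected entries as a fixed-width table into buf.
 * Returns the number of bytes written, or -1 if buf is too small.
 */
static int tl_table_format(const tl_entry_t *entries, size_t count,
                           char *buf, size_t size)
{
    size_t used;
    size_t i;
    int n;

    n = snprintf(buf, size, "| %-10s | %-10s | %-11s |\n",
                 "Transport", "Device", "Status");
    if ((n < 0) || ((size_t)n >= size)) {
        return -1;
    }
    used = (size_t)n;

    for (i = 0; i < count; ++i) {
        n = snprintf(buf + used, size - used, "| %-10s | %-10s | %-11s |\n",
                     entries[i].transport, entries[i].device,
                     tl_status_str(entries[i].status));
        if ((n < 0) || ((size_t)n >= size - used)) {
            return -1;
        }
        used += (size_t)n;
    }

    return (int)used;
}
```

A caller would fill an array of tl_entry_t while probing transports, then call tl_table_format and log the buffer only when tables_enabled("UCX_PRINT_TRANSPORT_TABLES") returns nonzero. Formatting into a buffer instead of printing directly keeps the table as one unit that can be passed to whatever logging function is in use.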