Skip to content

UCP/CUDA: Fix Valgrind failures with CUDA managed memory#11499

Open
gleon99 wants to merge 1 commit into
openucx:masterfrom
gleon99:valgrind-cuda-managed-leak
Open

UCP/CUDA: Fix Valgrind failures with CUDA managed memory#11499
gleon99 wants to merge 1 commit into
openucx:masterfrom
gleon99:valgrind-cuda-managed-leak

Conversation

@gleon99
Copy link
Copy Markdown
Contributor

@gleon99 gleon99 commented May 28, 2026

What?

Fix Valgrind failures with CUDA managed memory

Why?

Internal issue

@gleon99 gleon99 added the WIP-DNM Work in progress / Do not review label May 28, 2026
@gleon99 gleon99 requested a review from rakhmets May 29, 2026 12:09
@gleon99 gleon99 removed the WIP-DNM Work in progress / Do not review label May 29, 2026
{
return (event->comp != NULL) &&
(event->comp->func == uct_cuda_base_stream_flushed_cb);
return event->event == NULL;
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why is the previous check not working here?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It dereferences comp->func, which can point into a UCP request. Valgrind may mark that request memory undefined/noaccess.


status = ucs_memtype_cache_lookup(address, length, mem_info);
if (ucs_likely(status == UCS_ERR_NO_ELEM)) {
if (ucs_unlikely(RUNNING_ON_VALGRIND)) {
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why needed with valgrind? is cuda/managed memory not present in memtype cache?

ucs_queue_push(&iface->super.active_queue, &q_desc->queue);
}

VALGRIND_MAKE_MEM_DEFINED(&cuda_event->event, sizeof(cuda_event->event));
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

i guess we need to do it after ucs_mpool_get, because the reason the event is valid is because it's cuda_event->event is coming from the mpool, and it should be probably before used in line 333

ucs_queue_push(&iface->super.active_queue, &q_desc->queue);
}

VALGRIND_MAKE_MEM_DEFINED(&cuda_ipc_event->super.event,
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

same here -- maybe need inline wrapper to get object (derived from) uct_cuda_event_desc_t from a memory pool and make ->event defined?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants