Skip to content

Fix udev enumerator leaks and a wrong null-check in device discovery#477

Open
rocker-zhang wants to merge 1 commit into
Syllo:masterfrom
rocker-zhang:fix-udev-enumerator-leaks
Open

Fix udev enumerator leaks and a wrong null-check in device discovery#477
rocker-zhang wants to merge 1 commit into
Syllo:masterfrom
rocker-zhang:fix-udev-enumerator-leaks

Conversation

@rocker-zhang
Copy link
Copy Markdown

While reading the udev device-discovery code I noticed the enumerator is
leaked on a few early-return paths, in the same spirit as the recent cleanup
fixes (#467, #468).

Three functions create an enumerator with nvtop_enumerator_new() and then
return early when a match/alloc step fails, without calling
nvtop_enumerator_unref(): nvtop_device_get_hwmon() and the intel/v3d
gpuinfo_*_get_device_handles(). The leaked enumerator owns a udev handle and
a udev_enumerate object. The no-hwmon branch in nvtop_device_get_hwmon() runs
for every device that has no hwmon node, so this one leaks on a normal run
rather than only on failure. I routed the early exits through a single unref
with the goto-cleanup pattern already used in nvtop_enumerator_new(), leaving
the success path unchanged.

The same commit also fixes nvtop_enumerator_new(): it checked enumerator
(the address of the caller's variable, never NULL) instead of *enumerator
(the calloc result), so an out-of-memory calloc would dereference NULL instead
of returning -1.

Validated on x86_64 (all backends + libdrm) and arm64 (NVIDIA + v3d): clean
build with -Wall -Wextra, the gtest suite passes, and --snapshot on a live
GPU still enumerates devices and reports stats. I also confirmed the leak with
LeakSanitizer on both arches: a build that skips the unref on the early-return
path leaks the enumerator (the udev handle and udev_enumerate object), and the
fix brings it to zero. clang's static analyzer flags the leak in
nvtop_device_get_hwmon() before the change and is clean after.

nvtop_device_get_hwmon() and the intel/v3d gpuinfo_*_get_device_handles()
functions return early when a udev enumerator API call fails, but those
early returns never release the enumerator, leaking the udev handle and
the udev_enumerate object. In nvtop_device_get_hwmon() the no-hwmon path
is reached for every device without an hwmon node, so it leaks on a normal
startup, not just on error. Route the early exits through a single
nvtop_enumerator_unref() using the goto-cleanup pattern already used in
nvtop_enumerator_new().

nvtop_enumerator_new() also tested the address of the caller's pointer,
which is never NULL, instead of the calloc() result. An allocation failure
would therefore fall through and dereference a NULL pointer instead of
returning -1; check *enumerator instead.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant