CONFIGURE: added a configure option to add SONAME suffix#11483
CONFIGURE: added a configure option to add SONAME suffix#11483roiedanino wants to merge 8 commits into
Conversation
Signed-off-by: Roie Danino <rdanino@nvidia.com>
| AS_IF([test "x$with_soname_suffix" != xno], | ||
| [AS_IF([test "x$with_soname_suffix" = xyes], | ||
| [AC_MSG_ERROR([--with-soname-suffix requires an explicit suffix value])]) | ||
| AS_IF([printf '%s\n' "$with_soname_suffix" | grep -Eq '^@<:@A-Za-z0-9@:>@@<:@A-Za-z0-9_.-@:>@*$'], |
There was a problem hiding this comment.
[P2] [important] Dotted suffixes break module loading
The option accepts dots, but libtool -release 1.22 produces names like libucs-1.22.so.0; UCX's module loader derives module_ext from the first dot in the loaded libucs basename, so it would look for module files ending in .22.so.0 instead of .so.0. That means accepted values such as --with-soname-suffix=1.22 can prevent optional UCT/UCM modules from loading. Either reject dots here or fix the loader to derive the real shared-library extension, and add a small configure/runtime check for a dotted suffix.
| -DUCX_CONFIG_DIR=\"$(ucx_config_dir)\" | ||
| libucs_la_CFLAGS = $(BASE_CFLAGS) $(LT_CFLAGS) | ||
| libucs_la_LDFLAGS = -ldl $(BFD_LDFLAGS) -version-info $(SOVERSION) | ||
| libucs_la_LDFLAGS = -ldl $(BFD_LDFLAGS) -version-info $(SOVERSION) $(UCX_LT_RELEASE) |
There was a problem hiding this comment.
[P2] [important] libucs_signal is left unsuffixed
The new suffix is applied to libucm/libucs/libuct/libucp, but libucs_signal is also an installed versioned UCX shared library. If the goal is avoiding collisions with already-loaded UCX DSOs, leaving this SONAME as libucs_signal.so.0 keeps one UCX entry point outside the suffix scheme. Add $(UCX_LT_RELEASE) here as well, or explicitly document that this option is limited to the four core libraries.
|
🚫 @roiedanino — not authorized to trigger a review on this repo. Contact a repo admin to be added to the allowed-users list. |
|
🤖 Starting review — findings will be posted here when done. |
svc-nvidia-pr-review
left a comment
There was a problem hiding this comment.
Code Review
4 findings — 🚨 1 blocker,
4 findings posted as inline comments.
| AS_IF([printf '%s\n' "$with_soname_suffix" | grep -Eq '^@<:@A-Za-z0-9@:>@@<:@A-Za-z0-9_.-@:>@*$'], | ||
| [], | ||
| [AC_MSG_ERROR([--with-soname-suffix must contain only letters, digits, dots, underscores, and dashes])]) | ||
| UCX_LT_RELEASE="-release $with_soname_suffix" |
There was a problem hiding this comment.
🚨 BLOCKER
libtool's -release changes the installed library filename (e.g. libucp-SUFFIX.so.0), not just the SONAME. With this set, ucx.pc (which emits -lucp) and cmake/ucx-targets.cmake.in (which hardcodes libucp.so/libucs.so/libuct.so) will fail to locate the libs for downstream consumers. Either also template these files with the suffix, or use a mechanism that only alters the SONAME (e.g., -Wl,-soname=...) if the intent really is SONAME-only as the option name suggests.
| ucx_soname_suffix_summary="<disabled>" | ||
| AC_ARG_WITH([soname-suffix], | ||
| AS_HELP_STRING([--with-soname-suffix=SUFFIX], | ||
| [Append SUFFIX to UCX core shared library SONAMEs. Disabled by default. [default=NO]]), |
There was a problem hiding this comment.
the option is named --with-soname-suffix but -release actually appends the suffix to the library filename (and hence SONAME), not only to the SONAME. pls clarify the help text or rename (e.g. --with-library-suffix) so users aren't surprised by the renamed .so file.
| AS_IF([test "x$with_soname_suffix" != xno], | ||
| [AS_IF([test "x$with_soname_suffix" = xyes], | ||
| [AC_MSG_ERROR([--with-soname-suffix requires an explicit suffix value])]) | ||
| AS_IF([printf '%s\n' "$with_soname_suffix" | grep -Eq '^@<:@A-Za-z0-9@:>@@<:@A-Za-z0-9_.-@:>@*$'], |
There was a problem hiding this comment.
the regex printf | grep -Eq works but is heavier than needed; a case statement (as used elsewhere in autoconf) would avoid spawning subprocesses and be more portable. minor.
| AS_IF([printf '%s\n' "$with_soname_suffix" | grep -Eq '^@<:@A-Za-z0-9@:>@@<:@A-Za-z0-9_.-@:>@*$'], | ||
| [], | ||
| [AC_MSG_ERROR([--with-soname-suffix must contain only letters, digits, dots, underscores, and dashes])]) | ||
| UCX_LT_RELEASE="-release $with_soname_suffix" |
There was a problem hiding this comment.
ℹ️ INFO
only the four core libs (ucm/ucp/ucs/uct) get $(UCX_LT_RELEASE); the per-transport module libs (e.g. src/uct/ib/Makefile.am, src/ucm/cuda/Makefile.am, …) still use plain -version-info $(SOVERSION). is the asymmetry intentional? if two UCX trees with different suffixes are installed in the same prefix, the modules will collide.
Signed-off-by: Roie Danino <rdanino@nvidia.com>
|
🤖 Starting review — findings will be posted here when done. |
svc-nvidia-pr-review
left a comment
There was a problem hiding this comment.
Code Review
4 findings —
3 findings posted as inline comments.
General Findings
ℹ️ INFO
no test coverage: nothing in test/ exercises --with-soname-suffix or ucs_module_normalize_base. Can we add a small gtest that constructs a fake module filename with the configured suffix and verifies the loader strips it, similar to existing test_module usage? Otherwise this path is only validated by hand.
| [Suffix appended to private UCX module library names]) | ||
| ucx_soname_suffix_summary="$with_soname_suffix"]) | ||
| AC_SUBST([UCX_LT_RELEASE]) | ||
| UCT_MODULE_LDFLAGS='$(UCX_LT_RELEASE)' |
There was a problem hiding this comment.
UCT_MODULE_LDFLAGS is just $(UCX_LT_RELEASE) and the rest of the diff already uses $(UCX_LT_RELEASE) directly in ucm/, ucp/, ucs/, perf/. Why have two names for the same thing? Can we drop UCT_MODULE_LDFLAGS and use $(UCX_LT_RELEASE) everywhere?
| [], | ||
| [AC_MSG_ERROR([--with-soname-suffix must contain only letters, digits, dots, underscores, and dashes])]) | ||
| UCX_LT_RELEASE="-release $with_soname_suffix" | ||
| AC_DEFINE_UNQUOTED([UCX_MODULE_SONAME_SUFFIX], |
There was a problem hiding this comment.
minor: the macro name UCX_MODULE_SONAME_SUFFIX is misleading — libtool -release modifies the library file name (and link/SONAME), it doesn't append to the ELF SONAME. Maybe UCX_MODULE_FILE_SUFFIX or UCX_MODULE_RELEASE_SUFFIX would be clearer.
| size_t base_len = strlen(base); | ||
|
|
||
| if ((base_len > suffix_len) && | ||
| !strcmp(base + base_len - suffix_len, UCX_MODULE_SONAME_SUFFIX)) { |
There was a problem hiding this comment.
minor: this only strips the suffix when it appears at the very end of base. After ucs_module_filename_to_base that is what we get for libtool -release output (e.g. libuct_cuda-MYSUFFIX), so it works, but a one-line comment here describing the expected input (lib<framework>_<mod>-<suffix>) would help future readers understand why a tail-only check is sufficient.
Signed-off-by: Roie Danino <rdanino@nvidia.com>
|
🤖 Starting review — findings will be posted here when done. |
svc-nvidia-pr-review
left a comment
There was a problem hiding this comment.
Code Review
4 findings — 🚨 1 blocker,
4 findings posted as inline comments.
| libuct_knem_la_LIBADD = $(top_builddir)/src/ucs/libucs.la \ | ||
| $(top_builddir)/src/uct/libuct.la | ||
| libuct_knem_la_LDFLAGS = -version-info $(SOVERSION) $(UCT_MODULE_LDFLAGS) | ||
| libuct_knem_la_LDFLAGS = -version-info $(SOVERSION) $(UCX_LT_RELEASE) |
There was a problem hiding this comment.
this replaces $(UCT_MODULE_LDFLAGS) with $(UCX_LT_RELEASE) instead of appending — the other Makefile.am entries in this PR all kept their existing flags and only added $(UCX_LT_RELEASE). UCT_MODULE_LDFLAGS is currently unset so it's harmless, but was the drop intentional?
| } | ||
|
|
||
| ucs_module_filename_to_base(entry->d_name, base, sizeof(base)); | ||
| ucs_module_normalize_base(base); |
There was a problem hiding this comment.
ucs_module_normalize_base will also strip the configured suffix from any unrelated module name that happens to end with it (e.g. a third-party libuct_foo-ci.so not built with --with-soname-suffix=ci). Worth scoping on the -<suffix> separator anchored to the framework prefix, or at least a comment that such collisions are accepted.
| #include <string.h> | ||
|
|
||
|
|
||
| static inline void ucs_module_normalize_base(char *base) |
There was a problem hiding this comment.
ℹ️ INFO
minor: a dedicated header for a single 8-line static inline helper feels heavy. It exists only so test_sys.cc can reach the function — could keep the helper static in module.c and expose a test hook, or declare it non-inline in module.h.
Signed-off-by: Roie Danino <rdanino@nvidia.com>
Signed-off-by: Roie Danino <rdanino@nvidia.com>
Signed-off-by: Roie Danino <rdanino@nvidia.com>
What?
Adds --with-soname-suffix option in configure which when set adds:
-release SUFFIX
for example:
libucp_la_LDFLAGS = -ldl -version-info $(SOVERSION) -release
examplewill produce:
libucp-example.so.0.0.0
libucp-example.so.0 -> libucp-example.so.0.0.0
libucp.so -> libucp-example.so.0.0.0
Why?
This allows creating purpose-specific
.sofiles to avoid collisions with already loaded UCX versions when binding to specific versions or binaries is required.This by itself won't provide a full isolation without using
RTLD_DEEPBINDduringdlopen.