Date: Thu, 14 Apr 2022 23:25:07 -0700
From: Mark Millard <marklmi@yahoo.com>
To: jbo@insane.engineer, FreeBSD Hackers <freebsd-hackers@freebsd.org>
Subject: Re: llvm & RTTI over shared libraries
Message-ID: <9ADA04B1-2A0F-4B96-8510-88A5E4E1E2C0@yahoo.com>
References: <9ADA04B1-2A0F-4B96-8510-88A5E4E1E2C0.ref@yahoo.com>

From: <jbo_at_insane.engineer> wrote on
Date: Thu, 14 Apr 2022 16:36:24 +0000 :

(I've line-split the text.)

> I'm in the middle of moving to FreeBSD as my primary development platform (desktop wise).
> As such, I am currently building various software tools I've written over the years on
> FreeBSD for the first time. Most of those were developed on either Linux+GCC or on
> Windows+Mingw (MinGW -> GCC).
>
> Today I found myself debugging a piece of software which runs fine on FreeBSD when
> compiled with gcc11 but not so much when compiling with clang14.
> I managed to track down the problem but I lack the deeper understanding to resolve
> this properly - so here we are.
>
> The software in question is written in C++20 and consists of:
> - An interface library (just a bunch of header files).
> - A main executable.
> - A bunch of plugins which the executable loads via dlopen().
>
> The interface headers provide several types. Let's call them A, B, C and D, where B,
> C and D inherit from A.
> The plugins use std::dynamic_pointer_cast() to cast an std::shared_ptr<A> (received
> via the plugin interface) to the derived classes such as std::shared_ptr<B>.
> This is where the trouble begins.
>
> If everything (the main executable and the plugins) is compiled using gcc11, everything
> works "as I expect it".
> However, when compiling everything with clang14, the main executable is able to load the
> plugins successfully but those std::dynamic_pointer_cast() calls within the plugins
> always return nullptr.
>
> After some research I seem to understand that the way that RTTI is handled over shared
> library boundaries is different between GCC and LLVM.
> This is where my understanding starts to get less solid.
>
> I read the manual page of dlopen(3). It would seem like the flag RTLD_GLOBAL would be
> potentially interesting to me: "Symbols from this shared object [...] of needed objects
> will be available for re-solving undefined references from all other shared objects."
> The software (which "works as intended" when compiled with GCC) was so far only calling
> dlopen(..., RTLD_LAZY).
> I'm not even sure whether this applies to my situation. My gut feeling tells me that I'm
> heading down the wrong direction here. After all, the main executable is able to load
> the plugins and to call the plugin's function which receives an std::shared_ptr<A>
> as parameter just fine, also when compiled with LLVM.
> Is the problem I'm experiencing related to the way that the plugin (shared library) is
> loaded or the way that the symbols are being exported?
> In the current state, the plugins do not explicitly export any symbols.
>
> Here's a heavily simplified version of my scenario:

The simplified example was not designed to compile and test, so I
made some guesses and wrote my own. The .cpp files have comments
giving the compile/link commands used, and example c++ and g++11
compile/link/run sequences follow the source code.

The code is not well commented, nor does it deal with error
handling or the like. But it is fairly short overall.

# more base_plugin.h
#include <memory>

// For its own libbase_plugin.so file, load time bound, no dlopen used for it:

struct base {
        virtual ~base();
};

struct base_plugin {
        virtual std::shared_ptr<base> create_data_instance() = 0;
        virtual void action(std::shared_ptr<base> data) = 0;
        virtual ~base_plugin();
};

extern "C" // for each derived plugin .so file:
{
        using plugin_instance_creator= base_plugin* (*)();
        const char plugin_instance_creator_name[] = "create_plugin_instance"; // Lookup via dlsym.

        using plugin_instance_destroyer= void (*)(base_plugin*);
        const char plugin_instance_destroyer_name[] = "destroy_plugin_instance"; // Lookup via dlsym.
};

# more base_plugin.cpp
// c++ -std=c++20 -O0 -g -fPIC -lc++ -olibbase_plugin.so -shared base_plugin.cpp
// g++11 -std=c++20 -O0 -g -fPIC -lstdc++ -olibbase_plugin.so -shared base_plugin.cpp

#include "base_plugin.h"

base::~base() {}
base_plugin::~base_plugin() {}

# more main_using_plugin.cpp
// c++ -std=c++20 -O0 -g -fPIC -lc++ -L. -lbase_plugin -Wl,-rpath=. \
//     -omain_using_plugin main_using_plugin.cpp
// g++11 -std=c++20 -O0 -g -fPIC -lstdc++ -L. -lbase_plugin -Wl,-rpath=. \
//     -Wl,-rpath=/usr/local/lib/gcc11 -omain_using_plugin main_using_plugin.cpp

#include "base_plugin.h"
#include <dlfcn.h>

int main()
{
        auto dl= dlopen("./libsharedlib_plugin.so",RTLD_LAZY); // hardcoded .so path for the example

        union { void* as_voidptr; plugin_instance_creator as_plugin_instance_creator; } creator_plugin_func;
        creator_plugin_func.as_voidptr= dlsym(dl,plugin_instance_creator_name);

        union { void* as_voidptr; plugin_instance_destroyer as_plugin_instance_destroyer; } destroyer_plugin_func;
        destroyer_plugin_func.as_voidptr= dlsym(dl,plugin_instance_destroyer_name);

        auto plugin= (creator_plugin_func.as_plugin_instance_creator)();

        { // Local scope for data
                std::shared_ptr<base> data{plugin->create_data_instance()};
                plugin->action(data);
        }

        // Presume for the example that nothing requires the plugin after here.
        (destroyer_plugin_func.as_plugin_instance_destroyer)(plugin);
        destroyer_plugin_func.as_voidptr= nullptr;
        dlclose(dl);
}

NOTE: So, other than the dlopen, the above has no direct tie to the
specific dynamically loaded plugin. The base_plugin is in a .so but
is load-time bound instead of using dlopen. That .so would be used
by all the plugins found via dlopen. (I only made one example.)

As for the .so used via dlopen/dlsym/dlclose . . .

# more sharedlib_plugin.h
#include "base_plugin.h"

// For its own libsharedlib_plugin.so file, where dlopen is used to find it:

struct sharedlib : base {
        int v;
};

struct sharedlib_plugin : base_plugin {
        std::shared_ptr<base> create_data_instance() override;
        void action(std::shared_ptr<base> base) override;
};

# more sharedlib_plugin.cpp
// c++ -std=c++20 -O0 -g -fPIC -lc++ -olibsharedlib_plugin.so -shared sharedlib_plugin.cpp
// g++11 -std=c++20 -O0 -g -fPIC -lstdc++ -olibsharedlib_plugin.so -shared sharedlib_plugin.cpp

#include "sharedlib_plugin.h"
#include <iostream>

std::shared_ptr<base> sharedlib_plugin::create_data_instance()
{
        std::cout << "create_data_instance in use from dlopen'd .so\n";
        return std::static_pointer_cast<base>(std::make_shared<sharedlib>());
}

void sharedlib_plugin::action(std::shared_ptr<base> b)
{
        std::cout << "action in use from dlopen'd .so class\n";
        auto separate_share = std::dynamic_pointer_cast<sharedlib>(b);
        if (separate_share->v || 1 < separate_share.use_count())
                std::cout << "separate_share is not nullptr (would crash otherwise)\n";
}

extern "C" base_plugin* create_plugin_instance()
{
        std::cout << "create_plugin_instance in use from dlopen'd .so\n";
        return new sharedlib_plugin();
}

extern "C" void destroy_plugin_instance(const base_plugin* plugin)
{
        std::cout << "destroy_plugin_instance in use from dlopen'd .so\n";
        delete plugin;
}

# c++ -std=c++20 -O0 -g -fPIC -lc++ -olibbase_plugin.so -shared base_plugin.cpp
# c++ -std=c++20 -O0 -g -fPIC -lc++ -L. -lbase_plugin -Wl,-rpath=. \
      -omain_using_plugin main_using_plugin.cpp
# c++ -std=c++20 -O0 -g -fPIC -lc++ -olibsharedlib_plugin.so -shared sharedlib_plugin.cpp
# ./main_using_plugin
create_plugin_instance in use from dlopen'd .so
create_data_instance in use from dlopen'd .so
action in use from dlopen'd .so class
separate_share is not nullptr (would crash otherwise)
destroy_plugin_instance in use from dlopen'd .so

For reference:

# ldd main_using_plugin
main_using_plugin:
        libc++.so.1 => /lib/libc++.so.1 (0x819d0000)
        libcxxrt.so.1 => /lib/libcxxrt.so.1 (0x82735000)
        libbase_plugin.so => ./libbase_plugin.so (0x8328d000)
        libm.so.5 => /lib/libm.so.5 (0x83c47000)
        libgcc_s.so.1 => /lib/libgcc_s.so.1 (0x85861000)
        libc.so.7 => /lib/libc.so.7 (0x848f9000)

# ldd ./libsharedlib_plugin.so
./libsharedlib_plugin.so:
        libc++.so.1 => /lib/libc++.so.1 (0x3b69aeeb6000)
        libcxxrt.so.1 => /lib/libcxxrt.so.1 (0x3b69af6f2000)
        libm.so.5 => /lib/libm.so.5 (0x3b69afd1f000)
        libgcc_s.so.1 => /lib/libgcc_s.so.1 (0x3b69b0303000)
        libc.so.7 => /lib/libc.so.7 (0x3b69aafdb000)

As for g++11 use . . .

Testing with g++11 does involve additional/adjusted
command line options:

    -Wl,-rpath=/usr/local/lib/gcc11/ ( for main_using_plugin.cpp )
    -lstdc++ (for all 3 .cpp files)

(FreeBSD's libgcc_s.so.1 does not cover everything
needed for all architectures for g++11's code
generation. I was working in a context where using
/usr/local/lib/gcc11//libgcc_s.so.1 was important.)

# g++11 -std=c++20 -O0 -g -fPIC -lstdc++ -olibbase_plugin.so -shared base_plugin.cpp
# g++11 -std=c++20 -O0 -g -fPIC -lstdc++ -L. -lbase_plugin -Wl,-rpath=. \
      -Wl,-rpath=/usr/local/lib/gcc11 -omain_using_plugin main_using_plugin.cpp
# g++11 -std=c++20 -O0 -g -fPIC -lstdc++ -olibsharedlib_plugin.so -shared sharedlib_plugin.cpp
# ./main_using_plugin
create_plugin_instance in use from dlopen'd .so
create_data_instance in use from dlopen'd .so
action in use from dlopen'd .so class
separate_share is not nullptr (would crash otherwise)
destroy_plugin_instance in use from dlopen'd .so

For reference:

# ldd main_using_plugin
main_using_plugin:
        libstdc++.so.6 => /usr/local/lib/gcc11//libstdc++.so.6 (0x83a00000)
        libbase_plugin.so => ./libbase_plugin.so (0x8213d000)
        libm.so.5 => /lib/libm.so.5 (0x82207000)
        libgcc_s.so.1 => /usr/local/lib/gcc11//libgcc_s.so.1 (0x82c66000)
        libc.so.7 => /lib/libc.so.7 (0x849c4000)

# ldd ./libsharedlib_plugin.so
./libsharedlib_plugin.so:
        libstdc++.so.6 => /usr/local/lib/gcc11/libstdc++.so.6 (0x1c2a7b800000)
        libm.so.5 => /lib/libm.so.5 (0x1c2a7bb1c000)
        libgcc_s.so.1 => /lib/libgcc_s.so.1 (0x1c2a7c416000)
        libc.so.7 => /lib/libc.so.7 (0x1c2a780e8000)

Overall: Looks to me like both the system clang/llvm and g++11
contexts are working.

(The platform context was aarch64 main [so: 14], in case it matters.)

===
Mark Millard
marklmi at yahoo.com
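
P.S. The action() in sharedlib_plugin.cpp dereferences the result of
std::dynamic_pointer_cast unconditionally, so a failed cast shows up
only as a crash. When chasing a nullptr result like the one described
in the original question, a checked variant that reports the dynamic
type might be more convenient. The following is only a rough sketch
under assumptions: checked_cast_report and the type named unrelated
are hypothetical names that are not part of the example, and base and
sharedlib are redefined here as single-file stand-ins. Being one
translation unit, it does not itself reproduce any cross-library
behavior.

```cpp
#include <memory>
#include <string>
#include <typeinfo>

// Hypothetical single-file stand-ins for the types used in the example.
struct base { virtual ~base() = default; };
struct sharedlib : base { int v = 0; };
struct unrelated : base {};   // hypothetical: exists only to show a failing cast

// Hypothetical helper: try the cast and describe the outcome instead of
// dereferencing a possibly-null result.
template <class Derived>
std::string checked_cast_report(const std::shared_ptr<base>& b)
{
    if (!b)
        return "input is nullptr";

    if (auto d = std::dynamic_pointer_cast<Derived>(b))
        return "cast ok";

    // typeid on a reference to a polymorphic object yields the dynamic type.
    // The text returned by name() is implementation-defined.
    const base& ref = *b;
    return std::string("cast failed: dynamic type ") + typeid(ref).name()
         + ", wanted " + typeid(Derived).name();
}
```

Inside a plugin's action(), something like
std::cout << checked_cast_report<sharedlib>(b) << "\n";
before using the cast result would print the report rather than
crash on a null pointer.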