Some people using VW 7.x have reported freezes or crashes of VW on Linux distros with newer revs of glibc, particularly on Red Hat 8 and Red Hat 9. We've had a look at the problem, and believe that we have a bug in VW. Here's how to deal with this at present:
Some new Linux systems install multiple libc.so.6 libraries, typically in /lib, /lib/i686, and /lib/tls. The difference between the libraries is the version of the kernel they assume to be running. Each library states this assumption in an ELF section named .note.ABI-tag that can be viewed using:
#> objdump -s -j .note.ABI-tag <ELF file>
The last three 32-bit unsigned ints in this tag contain the kernel version, or "operating system ABI", required by the library. This was done as part of the new Native POSIX Thread Library (NPTL) implementation. One big difference is that newer kernels support thread-local storage, so the libraries built for them keep the 'errno' global in thread-local storage instead of in a traditional global variable.
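To make the layout of that tag concrete, here is a minimal Python sketch that decodes the raw bytes of a .note.ABI-tag section. It assumes the usual GNU note layout (a 12-byte header, the name "GNU", then four 32-bit words: an OS identifier followed by the three kernel version numbers) and little-endian data as on x86. The function name parse_abi_tag and the sample version 2.4.20 are made up for illustration; they do not come from the bug report.

```python
import struct


def parse_abi_tag(section: bytes, byteorder: str = "<"):
    """Decode the raw contents of a .note.ABI-tag section.

    Returns (os_id, (major, minor, patch)). byteorder is a struct
    prefix: "<" for little-endian (x86), ">" for big-endian.
    """
    # ELF note header: name size, descriptor size, note type.
    namesz, descsz, note_type = struct.unpack_from(byteorder + "3I", section, 0)
    name_end = 12 + namesz
    name = section[12:name_end].rstrip(b"\0")
    if name != b"GNU" or note_type != 1 or descsz < 16:
        raise ValueError("not a GNU ABI tag note")
    # The descriptor follows the name, padded to a 4-byte boundary.
    desc_off = (name_end + 3) & ~3
    os_id, major, minor, patch = struct.unpack_from(
        byteorder + "4I", section, desc_off
    )
    return os_id, (major, minor, patch)


# Hand-built sample note claiming Linux (os_id 0) and kernel 2.4.20.
sample = struct.pack("<3I4s4I", 4, 16, 1, b"GNU\0", 0, 2, 4, 20)
print(parse_abi_tag(sample))  # (0, (2, 4, 20))
```

In practice the section bytes would have to be extracted from the library first, for example from the hex dump that objdump prints; that step is left out here.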
UnixSystemSupport has a #libraryDirectories attribute that includes '/lib', '/usr/shlib', and '/usr/lib'. On Martin's system, the loader resolves the engine's symbols in /lib/tls/libc-2.3.2.so and /lib/tls/libm-2.3.2.so. However, because the paths are hard-coded in the attributes of UnixSystemSupport, VW opens /lib/libc-2.3.2.so when it tries to find the function pointers needed by the OSTimeZone package, which leaves two different copies of glibc loaded in the same process. Additionally, this second libc-2.3.2.so is loaded in RTLD_GLOBAL mode by primitive 330.
I discovered that I could eliminate the system freeze in two different ways:
- set LD_ASSUME_KERNEL=2.2.5 in the environment before running vw
This causes the loader to resolve the engine against the libc in /lib. Since the UnixSystemSupport methods search for the same library, there is only one glibc library loaded.
- empty the #libraryDirectories attribute on UnixSystemSupport and add 'libc.so.6' to the head of the #libraryFiles list.
This causes dlopen() to be called on the name "libc.so.6", which uses the platform's library search path. This ensures that the libc.so.6 that gets loaded is the same library that the loader used when it loaded the engine.
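Both work-arounds aim at the same end state: a single glibc in the process. One way to check that, sketched here in Python, is to look at the memory map of the running VM under /proc and list the distinct libc files that appear in it. The helper name loaded_libc_paths and the sample map lines below are illustrative assumptions, not output captured from a real VW process.

```python
import re


def loaded_libc_paths(maps_text: str) -> set:
    """Return the distinct libc files named in a /proc/<pid>/maps dump."""
    paths = set()
    for line in maps_text.splitlines():
        # Lines for mapped files have six fields; the last is the path.
        fields = line.split(None, 5)
        if len(fields) != 6:
            continue
        path = fields[5].strip()
        # Match libc.so.6 or libc-<version>.so, but not libcrypt and friends.
        if re.search(r"/libc[-.][^/]*$", path):
            paths.add(path)
    return paths


# Made-up excerpt showing the problem case: two different libc files mapped.
sample_maps = """\
40000000-40012000 r-xp 00000000 03:02 1111 /lib/ld-2.3.2.so
40020000-40150000 r-xp 00000000 03:02 2222 /lib/tls/libc-2.3.2.so
40150000-40153000 rw-p 00130000 03:02 2222 /lib/tls/libc-2.3.2.so
40200000-40330000 r-xp 00000000 03:02 3333 /lib/libc-2.3.2.so
40400000-40405000 r-xp 00000000 03:02 4444 /lib/libcrypt-2.3.2.so
40500000-40600000 rw-p 00000000 00:00 0
"""
print(sorted(loaded_libc_paths(sample_maps)))
```

On a live system the text would come from reading /proc/<pid>/maps for the VM's process id. More than one path in the result suggests the double-loading described above; exactly one suggests the work-around took effect.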
That's part of the text - with work-arounds you can apply in your image - for the internal bug report on this issue. Let us know if you have problems with this!