Date:      Sun, 16 Sep 2018 15:43:34 +0000
From:      bugzilla-noreply@freebsd.org
To:        ports-bugs@FreeBSD.org
Subject:   [Bug 231402] textproc/kf5-syntax-highlighting: does not build on systems with VLAN interfaces
Message-ID:  <bug-231402-7788@https.bugs.freebsd.org/bugzilla/>

https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=231402

            Bug ID: 231402
           Summary: textproc/kf5-syntax-highlighting: does not build on
                    systems with VLAN interfaces
           Product: Ports & Packages
           Version: Latest
          Hardware: amd64
                OS: Any
            Status: New
          Severity: Affects Only Me
          Priority: ---
         Component: Individual Port(s)
          Assignee: kde@FreeBSD.org
          Reporter: lantw44@gmail.com
             Flags: maintainer-feedback?(kde@FreeBSD.org)

kf5-syntax-highlighting fails to build with an undefined symbol error on a
FreeBSD 11.2 system that has at least one VLAN network interface. I know it is
odd for the network configuration of a system to affect a build, but that is
what I found after 3 days of debugging. Here are the error messages:

[94/132] cd
/tmp/wrkdirs/usr/ports/textproc/kf5-syntax-highlighting/work/.build/data &&
/tmp/wrkdirs/usr/ports/textproc/kf5-syntax-highlighting/work/.build/bin/katehighlightingindexer
/tmp/wrkdirs/usr/ports/textproc/kf5-syntax-highlighting/work/.build/data/index.katesyntax
/tmp/wrkdirs/usr/ports/textproc/kf5-syntax-highlighting/work/syntax-highlighting-5.49.0/data/schema/language.xsd
/tmp/wrkdirs/usr/ports/textproc/kf5-syntax-highlighting/work/.build/data/syntax-data.qrc
FAILED: data/index.katesyntax
cd /tmp/wrkdirs/usr/ports/textproc/kf5-syntax-highlighting/work/.build/data &&
/tmp/wrkdirs/usr/ports/textproc/kf5-syntax-highlighting/work/.build/bin/katehighlightingindexer
/tmp/wrkdirs/usr/ports/textproc/kf5-syntax-highlighting/work/.build/data/index.katesyntax
/tmp/wrkdirs/usr/ports/textproc/kf5-syntax-highlighting/work/syntax-highlighting-5.49.0/data/schema/language.xsd
/tmp/wrkdirs/usr/ports/textproc/kf5-syntax-highlighting/work/.build/data/syntax-data.qrc
/usr/local/lib/qt5/plugins/bearer/libqgenericbearer.so: Undefined symbol
"_ZN17QNetworkInterfaceC1ERKS_@Qt_5"
ninja: build stopped: subcommand failed.

I guess this is a memory corruption issue in the Qt5 network module, which may
pass the kernel a bad pointer and cause the kernel to overwrite data belonging
to the runtime linker. The symbol '_ZN17QNetworkInterfaceC1ERKS_' does exist in
/usr/local/lib/qt5/libQt5Network.so.5, and
/usr/local/lib/qt5/plugins/bearer/libqgenericbearer.so correctly lists
libQt5Network.so.5 as its dependency with NEEDED, but the runtime linker
rejects the symbol in libQt5Network.so.5 when comparing version tags.

Steps to reproduce the problem:

1. Install FreeBSD 11.2 amd64 and download the ports tree. Whether it is a
physical machine or a virtual machine doesn't matter.
2. Create a VLAN network interface. This can be done with the command
'ifconfig vlan3 create vlan 3 vlandev re0', where 're0' is your network
interface.
3. Make sure the runtime linker /libexec/ld-elf.so.1 is compiled with the -O2
option. This is the default, so you don't have to do anything in this step
unless you don't use the binaries distributed by the FreeBSD project.
4. Install the textproc/qt5-xmlpatterns port with portmaster.
5. Build textproc/kf5-syntax-highlighting.

It was tested on FreeBSD 11.2-RELEASE-p3 amd64 with ports revision 479821. I
could reproduce it on 3 systems (physical machine, virtual machine, jail on
virtual machine) and each of them runs on different hardware.

I mentioned qt5-xmlpatterns above because it is an optional dependency of
kf5-syntax-highlighting. kf5-syntax-highlighting can be built without problems
when qt5-xmlpatterns is not installed, but in that case it also doesn't link
to qt5-network. kf5-syntax-highlighting automatically picks up qt5-xmlpatterns
during the configure phase, and it is qt5-xmlpatterns that causes
kf5-syntax-highlighting to load qt5-network during the build.

The following are the results of my debugging. I haven't found the root cause
of the problem, but I think these notes may be useful for further debugging.

I started by checking symbol tables of both libqgenericbearer.so and
libQt5Network.so.5.

$ pkg which /usr/local/lib/qt5/plugins/bearer/libqgenericbearer.so
/usr/local/lib/qt5/plugins/bearer/libqgenericbearer.so was installed by package
qt5-network-5.11.1
$ readelf -aW /usr/local/lib/qt5/plugins/bearer/libqgenericbearer.so
Symbol table (.dynsym) contains 140 entries:
   Num:    Value          Size Type    Bind   Vis      Ndx Name
    69: 0000000000000000    21 FUNC    GLOBAL DEFAULT  UND _ZN17QNetworkInterfaceC1ERKS_@Qt_5 (2)

$ pkg which /usr/local/lib/qt5/libQt5Network.so.5
/usr/local/lib/qt5/libQt5Network.so.5 was installed by package
qt5-network-5.11.1
$ readelf -aW /usr/local/lib/qt5/libQt5Network.so.5
Symbol table (.dynsym) contains 2161 entries:
   Num:    Value          Size Type    Bind   Vis      Ndx Name
  1245: 00000000000c7790    21 FUNC    GLOBAL DEFAULT   12 _ZN17QNetworkInterfaceC1ERKS_@@Qt_5 (3)

The plugin links to libQt5Network.so.5 properly:

$ ldd /tmp/wrkdirs/usr/ports/textproc/kf5-syntax-highlighting/work/.build/bin/katehighlightingindexer
/tmp/wrkdirs/usr/ports/textproc/kf5-syntax-highlighting/work/.build/bin/katehighlightingindexer:
        libQt5XmlPatterns.so.5 => /usr/local/lib/qt5/libQt5XmlPatterns.so.5 (0x800a00000)
        libQt5Network.so.5 => /usr/local/lib/qt5/libQt5Network.so.5 (0x801033000)
        libQt5Core.so.5 => /usr/local/lib/qt5/libQt5Core.so.5 (0x801400000)
        libc++.so.1 =3D> /usr/lib/libc++.so.1 (0x801aec000)
        ...

$ ldd /usr/local/lib/qt5/plugins/bearer/libqgenericbearer.so
/usr/local/lib/qt5/plugins/bearer/libqgenericbearer.so:
        libQt5Network.so.5 => /usr/local/lib/qt5/libQt5Network.so.5 (0x80120c000)
        libQt5Core.so.5 => /usr/local/lib/qt5/libQt5Core.so.5 (0x801600000)
        libc++.so.1 =3D> /usr/lib/libc++.so.1 (0x801cec000)
        ...

But the program which throws the undefined symbol error,
katehighlightingindexer, doesn't link to libqgenericbearer.so. This suggests
that libqgenericbearer.so is loaded by calling dlopen.

I set a breakpoint on dlopen in GDB, and yes, it is called with:
dlopen("/usr/local/lib/qt5/plugins/bearer/libqgenericbearer.so",
RTLD_NODELETE | RTLD_LAZY);

The return value of dlopen is correct. The plugin is properly loaded, and the
hash of the version entry is 363045.

(gdb) b dlopen
Function "dlopen" not defined.
Make breakpoint pending on future shared library load? (y or [n]) y
Breakpoint 1 (dlopen) pending.

(gdb) r 1 2 3
Starting program:
/tmp/wrkdirs/usr/ports/textproc/kf5-syntax-highlighting/work/.build/bin/kat=
ehighlightingindexer
1 2 3
[New LWP 101325 of process 74133]

Thread 1 hit Breakpoint 1, dlopen (name=0x805415498
"/usr/local/lib/qt5/plugins/bearer/libqgenericbearer.so", mode=4097) at
/usr/src/libexec/rtld-elf/rtld.c:3193
warning: Source file is more recent than executable.
3193            return (rtld_dlopen(name, -1, mode));

(gdb) finish
Run till exit from #0  dlopen (name=0x805415498
"/usr/local/lib/qt5/plugins/bearer/libqgenericbearer.so", mode=4097) at
/usr/src/libexec/rtld-elf/rtld.c:3193
0x000000080165a731 in ?? () from /usr/local/lib/qt5/libQt5Core.so.5
Value returned is $2 = (void *) 0x80067e000

(gdb) p ((Obj_Entry *)(0x80067e000))->vertab[2]
$3 = {hash = 363045, flags = 0, name = 0x807202678 "Qt_5", file = 0x8072025de
"libQt5Network.so.5"}
(gdb) p ((Obj_Entry *)(0x80067e000))->path
$8 = 0x800634f40 "/usr/local/lib/qt5/plugins/bearer/libqgenericbearer.so"

The number '2' seems to come from the '(2)' suffix in the output of readelf. I
assume it means the version tag used by the symbol has index 2.
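
As a cross-check on the value 363045: it is what the System V ELF hash
function yields for the version name "Qt_5", which is how the hash field of
an ELF version entry is normally derived. The sketch below is only an
illustration; sysv_elf_hash and check_qt5_hash are names made up here and are
not taken from rtld.c.

```c
#include <assert.h>
#include <stdint.h>

/*
 * System V ELF hash as described in the ELF gABI. The function name is
 * hypothetical; rtld has its own implementation.
 */
uint32_t sysv_elf_hash(const char *name)
{
    const unsigned char *p = (const unsigned char *)name;
    uint32_t h = 0, g;

    while (*p != '\0') {
        h = (h << 4) + *p++;      /* shift in the next byte */
        g = h & 0xf0000000u;      /* top nibble, if any */
        if (g != 0)
            h ^= g >> 24;         /* fold it back into the low bits */
        h &= ~g;                  /* and clear it */
    }
    return h;
}

/* Self-check against the value GDB printed for vertab[2].hash. */
void check_qt5_hash(void)
{
    assert(sysv_elf_hash("Qt_5") == 363045u);
}
```

So an intact entry named "Qt_5" is expected to carry 363045, and any other
value in that field while the name is unchanged points at the field having
been overwritten after load.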

(gdb) b _rtld_bind if $_streq(obj->path,
"/usr/local/lib/qt5/plugins/bearer/libqgenericbearer.so") &&
obj->vertab[2].hash != 363045
Breakpoint 3 at 0x80060f907: file /usr/src/libexec/rtld-elf/rtld.c, line 810.

(gdb) c
Continuing.
[Switching to LWP 101325 of process 74133]

Thread 2 hit Breakpoint 3, _rtld_bind (obj=0x80067e000, reloff=1272) at
/usr/src/libexec/rtld-elf/rtld.c:810
810         rlock_acquire(rtld_bind_lock, &lockstate);

(gdb) p obj->vertab[2]
$17 = {hash = 32, flags = 0, name = 0x807202678 "Qt_5", file = 0x8072025de
"libQt5Network.so.5"}

The value of the hash field of the version entry has changed from 363045 to 32.
The value '32' isn't random. I always get the same value here. If you follow
the execution of the relevant _rtld_bind call, you will find it fails to match
the version tag in file /usr/src/libexec/rtld-elf/rtld.c, function
matched_symbol, line 4329:

4329                 if (obj->vertab[verndx].hash != req->ventry->hash ||
4330                     strcmp(obj->vertab[verndx].name, req->ventry->name)) {
4331                         /*
4332                          * Version does not match. Look if this is a
4333                          * global symbol and if it is not hidden. If
4334                          * global symbol (verndx < 2) is available,
4335                          * use it. Do not return symbol if we are
4336                          * called by dlvsym, because dlvsym looks for
4337                          * a specific version and default one is not
4338                          * what dlvsym wants.
4339                          */
4340                         if ((req->flags & SYMLOOK_DLSYM) ||
4341                             (verndx >= VER_NDX_GIVEN) ||
4342                             (obj->versyms[symnum] & VER_NDX_HIDDEN))
4343                                 return (false);
4344                 }

verndx is 2, and req->ventry->hash is 363045. If obj->vertab[2].hash hadn't
been modified, the runtime linker would pick this symbol and execution could
continue.
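
The effect of the overwritten hash on that comparison can be replayed in
isolation. This is a simplified sketch, not rtld code: struct ver_entry,
version_matches and replay_observed_states are hypothetical stand-ins that
keep only the two fields the quoted condition looks at.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical, trimmed-down stand-in for a version table entry. */
struct ver_entry {
    uint32_t    hash;
    const char *name;
};

/* Same shape as the quoted test: hash and name must both agree. */
bool version_matches(const struct ver_entry *a, const struct ver_entry *b)
{
    return a->hash == b->hash && strcmp(a->name, b->name) == 0;
}

/* Replays the two states seen in GDB: hash 363045, then hash 32. */
void replay_observed_states(void)
{
    struct ver_entry expected  = { 363045u, "Qt_5" };
    struct ver_entry intact    = { 363045u, "Qt_5" };
    struct ver_entry clobbered = { 32u,     "Qt_5" };

    assert(version_matches(&intact, &expected));
    assert(!version_matches(&clobbered, &expected));
}
```

The name still compares equal in the second case, so the changed hash alone is
enough to make the entry count as a version mismatch.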

I tried to set a hardware watchpoint on obj->vertab[2].hash in GDB, but the
watchpoint was never hit. I also tried a software watchpoint on the same
address, and the result wasn't always the same. Most of the time it ran forever
and I interrupted it after a few minutes, but sometimes it stopped at
instructions which should not modify memory, such as 'mov r15,QWORD PTR
fs:0x10' and 'mov r15,rdi'. Therefore, I suspected that the hash value was
modified by the kernel, but the 'catch syscall' command in GDB didn't seem to
work for me. GDB kept printing 'Thread 2 received signal SIGSYS, Bad system
call.' and made the program behave abnormally. I decided to use DTrace to track
the hash value changes for me:

# dtrace -n 'syscall:::entry, syscall:::return /pid == 99608/ {
    printf("%s %u ==> %x %x %x %x", probefunc,
        *(unsigned int *)copyin(0x801242230, 4), arg0, arg1, arg2, arg3); }'

dtrace: description 'syscall:::entry, syscall:::return ' matched 2168 probes
CPU     ID                    FUNCTION:NAME
  1  80243                      ioctl:entry ioctl 363045 ==> 8 c0306938 7fffdfffd770 0
  1  80244                     ioctl:return ioctl 32 ==> 0 0 0 0

0x801242230 was the address of the hash variable obtained from GDB. It seems it
was an 'ioctl(8, SIOCGIFMEDIA, 0x7fffdfffd730)' call that changed the value. 8
was a socket file descriptor created by calling 'socket(PF_INET, SOCK_DGRAM |
SOCK_CLOEXEC, 0)'. 0x7fffdfffd730 looked like a pointer into the stack, as
'procstat -v' said this region grew down. I stopped debugging here and
temporarily removed the VLAN interface with 'ifconfig vlan3 destroy' to let
portmaster upgrade kf5-syntax-highlighting and hundreds of other ports for me.
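
The request number c0306938 in the DTrace output can be unpacked to
cross-check that it names SIOCGIFMEDIA. This is a minimal sketch that assumes
the ioctl bit layout from FreeBSD's <sys/ioccom.h>. The constants are restated
locally under made-up SK_ names so that it builds on any system, and
decode_ioctl and struct ioc_fields are helpers invented for this note.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed to mirror IOCPARM_MASK, IOC_OUT and IOC_IN in <sys/ioccom.h>. */
#define SK_IOCPARM_MASK 0x1fffu
#define SK_IOC_OUT      0x40000000u   /* kernel copies the argument out */
#define SK_IOC_IN       0x80000000u   /* kernel copies the argument in */

/* Hypothetical container for the decoded fields. */
struct ioc_fields {
    int      copies_in;
    int      copies_out;
    unsigned len;     /* size of the argument structure in bytes */
    char     group;   /* ioctl group letter */
    unsigned num;     /* command number within the group */
};

struct ioc_fields decode_ioctl(uint32_t cmd)
{
    struct ioc_fields f;

    f.copies_in  = (cmd & SK_IOC_IN) != 0;
    f.copies_out = (cmd & SK_IOC_OUT) != 0;
    f.len        = (cmd >> 16) & SK_IOCPARM_MASK;
    f.group      = (char)((cmd >> 8) & 0xffu);
    f.num        = cmd & 0xffu;
    return f;
}

/* Self-check against the request number captured by DTrace. */
void check_captured_request(void)
{
    struct ioc_fields f = decode_ioctl(0xc0306938u);

    assert(f.copies_in && f.copies_out);
    assert(f.len == 48 && f.group == 'i' && f.num == 56);
}
```

For c0306938 this gives an in/out request in group 'i', number 56, with a
48-byte argument. That agrees with FreeBSD's definition of SIOCGIFMEDIA as
_IOWR('i', 56, struct ifmediareq), so the kernel is allowed to write results
back for this call.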

The conclusion is that I probably have to read the code of qt5-network in order
to figure out what really happens. I found 3 ways to work around the problem
on affected systems:

1. Remove all VLAN interfaces, which may not be possible if your networking
environment requires them.
2. Use Clang 6 shipped with FreeBSD base to recompile /libexec/ld-elf.so.1 with
-O1, -O0, or -DDEBUG.
3. Use GCC 8 from ports to recompile /libexec/ld-elf.so.1 with -O0. Using -O1
or -DDEBUG doesn't help when using GCC.

In fact, I didn't replace /libexec/ld-elf.so.1 on the system because that is
risky. I did the tests either by running the compiled ld-elf.so.1 under
/usr/src/libexec/rtld-elf directly as an executable or by modifying the
interpreter path stored in the katehighlightingindexer executable with the
'patchelf --set-interpreter' command.

-- 
You are receiving this mail because:
You are the assignee for the bug.


