Date: Tue, 5 Jun 2012 16:34:17 +0200
From: Jeremie Le Hen <jlh@FreeBSD.org>
To: Konstantin Belousov <kostikbel@gmail.com>
Cc: toolchain@freebsd.org
Subject: Re: libunwind-based pstack(1)
Message-ID: <20120605143417.GJ47353@felucia.tataz.chchile.org>
In-Reply-To: <20120524122518.GJ2358@deviant.kiev.zoral.com.ua>
References: <20120524122518.GJ2358@deviant.kiev.zoral.com.ua>

Hi Kostik,

On Thu, May 24, 2012 at 03:25:18PM +0300, Konstantin Belousov wrote:
> Hi,
> I reimplemented pstack(1) using libunwind. The source is available at
> the git repository at http://people.freebsd.org/~kib/git/pstacku.git/ .
> To use it, you should also use git HEAD of the libunwind from
> http://libunwind.nongnu.org, I do not think that version from ports
> will work. Due to libunwind use, this pstack works on i386 and amd64.
> When libunwind/FreeBSD is ported to other arches, adding corresponding
> support to pstack is quite easy.
>
> So far, I tried to implement most of the features supported by original
> pstack, but there are limitations due to use of libunwind. Only libthr
> supported as the threading library, you probably get some funny results
> for libc_r and kse-based libpthread.
>
> The big unimplemented feature is coredump stack dumping, but libunwind
> only got support for Linux coredump backtracing a day ago, and I did not
> yet looked at porting this to FreeBSD.
>
> Lesser implemented but not properly working feature is the arguments
> printing. I might fix this later.

I tried it on RELENG_9/amd64 and it mostly works, but most of the time
I only get the stack address and an offset from the _init symbol when
the address points into the binary text, or "????????" when it is in a
shared library.  Supposedly this is because there is no known symbol
for the function, but having used Solaris' pstack quite extensively, I
was a little bit disappointed.

% obiwan:/opt/bin# ./pstack -O 2237
% 2237: /usr/local/libexec/postfix/nqmgr (osrel 0)
% Thread 101037:
% 0x80455153c ???????? in /lib/libcrypto.so.6
% 0x42118a event_loop+0x11a in /usr/local/libexec/postfix/nqmgr
% 0x40b217 trigger_server_main+0xbd7 in /usr/local/libexec/postfix/nqmgr
% 0x403089 main+0xd9 in /usr/local/libexec/postfix/nqmgr
% 0x402f1c _start+0x9c in /usr/local/libexec/postfix/nqmgr
% 0x800d85000 ???????? in /usr/local/lib/libpcre.so.1

% obiwan:/opt/bin# ./pstack -O 1937
% 1937: /usr/sbin/rpcbind (osrel 0)
% Thread 101018:
% 0x800d6f7cc ???????? in /lib/libutil.so.9
% 0x4057f2 _init+0x3292 in /usr/sbin/rpcbind
% 0x404d1a _init+0x27ba in /usr/sbin/rpcbind
% 0x402d7c _init+0x81c in /usr/sbin/rpcbind
% 0x8027a7000 ???????? in /usr/lib/libwrap.so.6

Here is the output from nawk compiled from pkgsrc on Solaris 11 (but
after checking, all binaries and shared libraries are compiled with
debugging symbols):

% root@gandalf:/root# /usr/pkg/bin/nawk '{print}' &
% [1] 10960
% root@gandalf:/root# pstack 10960
% 10960: /usr/pkg/bin/nawk {print}
% feee25b5 read (0, fef68ce0, 400)
% feeab91c _filbuf (8079690, 8082910, 80528fc, fef70210) + d3
% 08059b5f readrec (8047c94, 8047c98, 8079690, 80596a8) + 170
% 0805985b getrec (8079f78, 8078e14, 1, 7f7f7f7f) + 1e7
% 0805b255 program (80825f0, 103, fef58b60, fef58b60, 8047d78, 8047d30) + b8
% 0805b131 execute (80825e0, 807a0e0, fef58b60, 805b073, 8047d30, 8047d78) + a9
% 0805b07e run (80825e0, 8066a90, 8047d48, 8057f6f) + 16
% 08057fdf main (1, 8047d74, 8047d80, 8047d68, 8054122, 8060e60) + 41b
% 08054183 _start (2, 8047e34, 8047e34, 0, 8047e4e, 8047e69) + 83
% [1] + Stopped (SIGTTIN) /usr/pkg/bin/nawk '{print}' &

Sometimes, supposedly when the binary doesn't exist any longer, the
output is wrong, as if there's absolutely no frame belonging to the
binary in the stack:

% obiwan:/opt/bin# ps xp 2055
%   PID  TT  STAT      TIME COMMAND
%  2055  ??  Ss     0:30.96 /usr/sbin/cron -s
% obiwan:/opt/bin# ./pstack -O 2055
% 2055: ???????? (osrel 0)
% Thread 100364:
% 0x801fb84cc ???????? in /lib/libutil.so.9
% 0x801f2479d ???????? in /lib/libutil.so.9

After restarting cron(8):

% obiwan:/opt/bin# ps xp 61470
%   PID  TT  STAT      TIME COMMAND
% 61470  ??  Is     0:00.01 /usr/sbin/cron -s
% obiwan:/opt/bin# ./pstack -O 61470
% 61470: /usr/sbin/cron (osrel 0)
% Thread 103137:
% 0x800d1b4cc __sys_nanosleep+0xc in /lib/libc.so.7
% 0x800c8779d sleep+0x3d in /lib/libc.so.7
% 0x402f28 _init+0xd70 in /usr/sbin/cron
% 0x40294c _init+0x794 in /usr/sbin/cron
% 0x800624000 ???????? in /libexec/ld-elf.so.1

After recompiling cron(8) with DEBUG_FLAGS=-g:

% obiwan:/opt/bin# ps xp 61730
%   PID  TT  STAT      TIME COMMAND
% 61730  ??  Is     0:00.00 /usr/sbin/cron -s
% obiwan:/opt/bin# ./pstack -O 61730
% 61730: /usr/sbin/cron (osrel 0)
% Thread 103140:
% 0x800d1b4cc __sys_nanosleep+0xc in /lib/libc.so.7
% 0x800c8779d sleep+0x3d in /lib/libc.so.7
% 0x402f28 main+0x478 in /usr/sbin/cron
% 0x40294c _start+0x9c in /usr/sbin/cron
% 0x800624000 ???????? in /libexec/ld-elf.so.1

Last, option although documented in usage() and implemented is not in
the getopt(3) string.  Though, according to the source, I don't think
the messages it reports may be useful to the user :-).

-- 
Jeremie Le Hen

Men are born free and equal.  Later on, they're on their own.
                                Jean Yanne