Date:      Tue, 5 Jun 2012 16:34:17 +0200
From:      Jeremie Le Hen <jlh@FreeBSD.org>
To:        Konstantin Belousov <kostikbel@gmail.com>
Cc:        toolchain@freebsd.org
Subject:   Re: libunwind-based pstack(1)
Message-ID:  <20120605143417.GJ47353@felucia.tataz.chchile.org>
In-Reply-To: <20120524122518.GJ2358@deviant.kiev.zoral.com.ua>
References:  <20120524122518.GJ2358@deviant.kiev.zoral.com.ua>

Hi Kostik,

On Thu, May 24, 2012 at 03:25:18PM +0300, Konstantin Belousov wrote:
> Hi,
> I reimplemented pstack(1) using libunwind. The source is available at
> the git repository at http://people.freebsd.org/~kib/git/pstacku.git/ .
> To use it, you should also use git HEAD of the libunwind from
> http://libunwind.nongnu.org, I do not think that version from ports
> will work. Due to libunwind use, this pstack works on i386 and amd64.
> When libunwind/FreeBSD is ported to other arches, adding corresponding
> support to pstack is quite easy.
> 
> So far, I tried to implement most of the features supported by original
> pstack, but there are limitations due to use of libunwind. Only libthr
> supported as the threading library, you probably get some funny results
> for libc_r and kse-based libpthread.
> 
> The big unimplemented feature is coredump stack dumping, but libunwind
> only got support for Linux coredump backtracing a day ago, and I did not
> yet looked at porting this to FreeBSD.
> 
> Lesser implemented but not properly working feature is the arguments 
> printing. I might fix this later.

I tried it on RELENG_9/amd64 and it mostly works, but most of the time I
only get the stack address and an offset from the _init symbol when the
address points into the binary text, or "????????" when it points into a
shared library.  Supposedly this is because there is no known symbol for
the function, but having used Solaris' pstack quite extensively, I was a
little bit disappointed.

% obiwan:/opt/bin# ./pstack -O 2237
% 2237: /usr/local/libexec/postfix/nqmgr (osrel 0)
% Thread 101037:
%  0x80455153c ???????? in /lib/libcrypto.so.6
%  0x42118a event_loop+0x11a in /usr/local/libexec/postfix/nqmgr
%  0x40b217 trigger_server_main+0xbd7 in /usr/local/libexec/postfix/nqmgr
%  0x403089 main+0xd9 in /usr/local/libexec/postfix/nqmgr
%  0x402f1c _start+0x9c in /usr/local/libexec/postfix/nqmgr
%  0x800d85000 ???????? in /usr/local/lib/libpcre.so.1

% obiwan:/opt/bin# ./pstack -O 1937
% 1937: /usr/sbin/rpcbind (osrel 0)
% Thread 101018:
%  0x800d6f7cc ???????? in /lib/libutil.so.9
%  0x4057f2 _init+0x3292 in /usr/sbin/rpcbind
%  0x404d1a _init+0x27ba in /usr/sbin/rpcbind
%  0x402d7c _init+0x81c in /usr/sbin/rpcbind
%  0x8027a7000 ???????? in /usr/lib/libwrap.so.6

For comparison, here is the Solaris pstack output for nawk compiled from
pkgsrc on Solaris 11 (though, after checking, all binaries and shared
libraries there are compiled with debugging symbols):

% root@gandalf:/root# /usr/pkg/bin/nawk '{print}' &
% [1]     10960
% root@gandalf:/root# pstack 10960
% 10960:  /usr/pkg/bin/nawk {print}
%  feee25b5 read     (0, fef68ce0, 400)
%  feeab91c _filbuf  (8079690, 8082910, 80528fc, fef70210) + d3
%  08059b5f readrec  (8047c94, 8047c98, 8079690, 80596a8) + 170
%  0805985b getrec   (8079f78, 8078e14, 1, 7f7f7f7f) + 1e7
%  0805b255 program  (80825f0, 103, fef58b60, fef58b60, 8047d78, 8047d30) + b8
%  0805b131 execute  (80825e0, 807a0e0, fef58b60, 805b073, 8047d30, 8047d78) + a9
%  0805b07e run      (80825e0, 8066a90, 8047d48, 8057f6f) + 16
%  08057fdf main     (1, 8047d74, 8047d80, 8047d68, 8054122, 8060e60) + 41b
%  08054183 _start   (2, 8047e34, 8047e34, 0, 8047e4e, 8047e69) + 83
% [1] + Stopped (SIGTTIN)        /usr/pkg/bin/nawk '{print}' &


Sometimes, supposedly when the binary no longer exists on disk, the output
is wrong, as if no frame in the stack belonged to the binary at all:

% obiwan:/opt/bin# ps xp 2055
%   PID  TT  STAT    TIME COMMAND
%  2055  ??  Ss   0:30.96 /usr/sbin/cron -s
% obiwan:/opt/bin# ./pstack -O 2055
% 2055: ???????? (osrel 0)
% Thread 100364:
%  0x801fb84cc ???????? in /lib/libutil.so.9
%  0x801f2479d ???????? in /lib/libutil.so.9

After restarting cron(8):

% obiwan:/opt/bin# ps xp 61470
%   PID  TT  STAT    TIME COMMAND
% 61470  ??  Is   0:00.01 /usr/sbin/cron -s
% obiwan:/opt/bin# ./pstack -O 61470
% 61470: /usr/sbin/cron (osrel 0)
% Thread 103137:
%  0x800d1b4cc __sys_nanosleep+0xc in /lib/libc.so.7
%  0x800c8779d sleep+0x3d in /lib/libc.so.7
%  0x402f28 _init+0xd70 in /usr/sbin/cron
%  0x40294c _init+0x794 in /usr/sbin/cron
%  0x800624000 ???????? in /libexec/ld-elf.so.1

After recompiling cron(8) with DEBUG_FLAGS=-g:

% obiwan:/opt/bin# ps xp 61730
%   PID  TT  STAT    TIME COMMAND
% 61730  ??  Is   0:00.00 /usr/sbin/cron -s
% obiwan:/opt/bin# ./pstack -O 61730
% 61730: /usr/sbin/cron (osrel 0)
% Thread 103140:
%  0x800d1b4cc __sys_nanosleep+0xc in /lib/libc.so.7
%  0x800c8779d sleep+0x3d in /lib/libc.so.7
%  0x402f28 main+0x478 in /usr/sbin/cron
%  0x40294c _start+0x9c in /usr/sbin/cron
%  0x800624000 ???????? in /libexec/ld-elf.so.1
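
The difference between the last two traces is consistent with a plain
"nearest preceding symbol" lookup.  The sketch below is only an
illustration of that idea in Python, not the actual pstack or libunwind
code.  The symbolize() helper is hypothetical, and the symbol start
addresses are back-computed from the two cron(8) traces above (for
example 0x402f28 - 0xd70 = 0x4021b8 for _init); they are not read from
the binary.

```python
from bisect import bisect_right


def symbolize(addr, symbols):
    """Return 'name+0xoff' for the nearest symbol at or below addr.

    symbols maps a symbol name to its start address.  If no symbol
    starts at or below addr, return the '????????' placeholder.
    """
    table = sorted((start, name) for name, start in symbols.items())
    starts = [start for start, _ in table]
    i = bisect_right(starts, addr) - 1
    if i < 0:
        return "????????"
    start, name = table[i]
    return f"{name}+{addr - start:#x}"


# Assumed symbol tables, derived from the offsets printed in the traces.
only_init = {"_init": 0x4021B8}
with_symbols = {"_init": 0x4021B8, "_start": 0x4028B0, "main": 0x402AB0}

for addr in (0x402F28, 0x40294C):
    before = symbolize(addr, only_init)
    after = symbolize(addr, with_symbols)
    print(f"{addr:#x}  {before}  ->  {after}")
```

With only _init visible, both return addresses resolve to an offset from
_init, as in the first cron trace; once main and _start are in the table,
the same addresses resolve to main+0x478 and _start+0x9c.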


Lastly, an option that is documented in usage() and implemented is missing
from the getopt(3) string.  Judging from the source, though, I don't think
the messages it reports would be useful to the user :-).
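
To illustrate the effect of a flag missing from the option string, here
is a small sketch using Python's getopt module, which is modelled on the
C getopt() interface.  Everything in it is an assumption made for the
example: the parse() helper is hypothetical, "x" is a placeholder letter
rather than the real option name, and "O" is shown as a plain flag only
because it appears in the commands above; the real option string in the
pstack source may differ.

```python
import getopt


def parse(argv, optstring):
    """Parse argv with the given option string.

    Returns (opts, rest) on success, or a 'rejected: ...' string when
    getopt refuses an option that is not listed in optstring.
    """
    try:
        opts, rest = getopt.getopt(argv, optstring)
    except getopt.GetoptError as err:
        return f"rejected: {err}"
    return opts, rest


# "x" is absent from the option string: the flag is refused before any
# handler for it could run, even if such a handler exists.
print(parse(["-x", "2237"], "O"))

# Once "x" is listed, the same command line parses and the handler
# would be reached.
print(parse(["-x", "2237"], "Ox"))
```

The handler code and the usage() text can both be in place, but the
parser never hands the option over until its letter is in the string.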

-- 
Jeremie Le Hen

Men are born free and equal.  Later on, they're on their own.
				Jean Yanne


