Date: Sat, 15 Aug 2009 14:20:43 +0100
From: Bruce Cran
To: Thomas Backman
Cc: FreeBSD current
Subject: Re: ps -axl during textdumps occasionally segfaults with a HUGE ps.core
Message-ID: <20090815142043.2b18dae0@tau.draftnet>

On Fri, 14 Aug 2009 11:05:05 +0200
Thomas Backman wrote:

> Looks like you're right!
> I tried the same deal:
> [root@chaos ~]# time ps -axl -M /var/crash/vmcore.45.NMAP_SCAN
> Segmentation fault: 11 (core dumped)
>
> real 0m46.005s
> user 0m0.000s
> sys 0m7.753s
>
> (All the time taken, according to the hard drive noise, was to save
> the core dump, which existed long before it returned to the shell.)
>
> [root@chaos ~]# gdb /bin/ps ps.core
> GNU gdb 6.1.1 [FreeBSD]
> Copyright 2004 Free Software Foundation, Inc.
> GDB is free software, covered by the GNU General Public License, and
> you are welcome to change it and/or distribute copies of it under
> certain conditions.
> Type "show copying" to see the conditions.
> There is absolutely no warranty for GDB. Type "show warranty" for
> details.
> This GDB was configured as "amd64-marcel-freebsd"...(no debugging
> symbols found)...
> Core was generated by `ps'.
> Program terminated with signal 11, Segmentation fault.
> Reading symbols from /lib/libm.so.5...(no debugging symbols
> found)...done.
> Loaded symbols for /lib/libm.so.5
> Reading symbols from /lib/libkvm.so.5...(no debugging symbols
> found)...done.
> Loaded symbols for /lib/libkvm.so.5
> Reading symbols from /lib/libc.so.7...(no debugging symbols
> found)...done.
> Loaded symbols for /lib/libc.so.7
> Reading symbols from /libexec/ld-elf.so.1...(no debugging symbols
> found)...done.
> Loaded symbols for /libexec/ld-elf.so.1
> #0 0x0000000800960b9b in strlen () from /lib/libc.so.7
> (gdb) bt
> #0 0x0000000800960b9b in strlen () from /lib/libc.so.7
> #1 0x0000000800959812 in open () from /lib/libc.so.7
> #2 0x00000008008f0546 in vsnprintf () from /lib/libc.so.7
> #3 0x0000000800772d79 in _kvm_err () from /lib/libkvm.so.5
> #4 0x00000008007707f7 in kvm_getprocs () from /lib/libkvm.so.5
> #5 0x0000000000405322 in uname ()
> #6 0x0000000000401f0e in ?? ()
> #7 0x0000000800539000 in ?? ()
> #8 0x0000000000000000 in ?? ()
> #9 0x0000000000000000 in ?? ()
> ...
> #639 0x9066669066669066 in ?? ()
> #640 0x00007fffffffec38 in ?? ()
> #641 0x0000000000000004 in ?? ()
> #642 0x00007fffffffec60 in ?? ()
> #643 0x0000000000000012 in ?? ()
> Cannot access memory at address 0x800000000000
>
> Crash in strlen() this time, rather than bcopy(), but uname() is
> still the root cause, I guess...?
>

I managed to get a full backtrace and can at least see what's causing
the crash: it seems the code steps past the end of the nlist array and
calls vsnprintf with a bad argument. kvm_nlist returns -1 to report
that the symbol table couldn't be read, but the code assumes it has
returned a positive number to indicate that there's an invalid entry,
so it starts searching for an entry where n_type is 0. No entry in the
array has n_type 0 (see the "print nl" output below), so the search
runs off the end.

tau# gdb ps
GNU gdb 6.1.1 [FreeBSD]
[...]
(gdb) break kvm_proc.c:631
No source file named kvm_proc.c.
Make breakpoint pending on future shared library load? (y or [n]) y
Breakpoint 1 (kvm_proc.c:631) pending.
(gdb) run -ax -M /var/crash/vmcore.3
Starting program: /bin/ps -ax -M /var/crash/vmcore.3
Breakpoint 2 at 0x80076f2f8: file /usr/src/lib/libkvm/kvm_proc.c, line 631.
Pending breakpoint "kvm_proc.c:631" resolved

Program received signal SIGSEGV, Segmentation fault.
0x000000080096340b in strlen (str=Variable "str" is not available.
) at /usr/src/lib/libc/string/strlen.c:88
88 if (*p == '\0')
(gdb) bt
#0 0x000000080096340b in strlen (str=Variable "str" is not available.
) at /usr/src/lib/libc/string/strlen.c:88
#1 0x000000080095c082 in __vfprintf (fp=0x7fffffffd9a0,
   fmt0=0x800773915 "%s: no such symbol", ap=0x7fffffffdb10)
   at /usr/src/lib/libc/stdio/vfprintf.c:825
#2 0x00000008008cc696 in vsnprintf (str=Variable "str" is not available.
) at /usr/src/lib/libc/stdio/vsnprintf.c:70
#3 0x0000000800772e89 in _kvm_err (kd=Variable "kd" is not available.
) at /usr/src/lib/libkvm/kvm.c:104
#4 0x0000000800770907 in kvm_getprocs (kd=0x800b02300, op=8, arg=0,
   cnt=0x7fffffffdf1c) at /usr/src/lib/libkvm/kvm_proc.c:561
#5 0x0000000000405322 in main (argc=4, argv=0x7fffffffe9a8)
   at /usr/src/bin/ps/ps.c:511
(gdb) frame 4
#4 0x0000000800770907 in kvm_getprocs (kd=0x800b02300, op=8, arg=0,
   cnt=0x7fffffffdf1c) at /usr/src/lib/libkvm/kvm_proc.c:561
561 _kvm_err(kd, kd->program,
(gdb) list
556 nl[5].n_name = 0;
557
558 if (kvm_nlist(kd, nl) != 0) {
559 for (p = nl; p->n_type != 0; ++p)
560 ;
561 _kvm_err(kd, kd->program,
562 "%s: no such symbol", p->n_name);
563 return (0);
564 }
565 if (KREAD(kd, nl[0].n_value, &nprocs)) {
(gdb) print nl
$1 = {{n_name = 0x8007738ef "_nprocs", n_type = 240 'ð', n_other = -1 'ÿ',
    n_desc = -1, n_value = 34365215744},
  {n_name = 0x8007738f7 "_allproc", n_type = 160 ' ', n_other = -100 '\234',
    n_desc = 80, n_value = 0},
  {n_name = 0x800773900 "_zombproc", n_type = 57 '9', n_other = 2 '\002',
    n_desc = 81, n_value = 34367538496},
  {n_name = 0x80077390a "_ticks", n_type = 74 'J', n_other = 0 '\0',
    n_desc = 0, n_value = 34365215744},
  {n_name = 0x800773911 "_hz", n_type = 168 '¨', n_other = -23 'é',
    n_desc = -1, n_value = 140737488349576},
  {n_name = 0x0, n_type = 1 '\001', n_other = 0 '\0',
    n_desc = 0, n_value = 34365024109}}

-- 
Bruce Cran
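
For illustration only, here is a minimal sketch of how the search at
kvm_proc.c:558-562 could be made safe. It is not the actual libkvm
code or a committed fix. The names sym_entry, first_unresolved and
describe_lookup_failure are hypothetical stand-ins, and the return
convention (0 = all symbols found, positive = number of unresolved
entries, -1 = symbol table unreadable) is assumed from the behaviour
described above. The scan stops at the NULL n_name terminator, and the
-1 case is reported separately instead of being treated as a missing
symbol.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical stand-in for struct nlist: only the fields the scan uses. */
struct sym_entry {
	const char *n_name;	/* symbol name; NULL terminates the array */
	unsigned char n_type;	/* 0 means the lookup did not resolve it */
};

/*
 * Return the name of the first unresolved entry, or NULL if every
 * entry before the terminator was resolved.  The loop is bounded by
 * the NULL n_name terminator, so it cannot walk off the array.
 */
static const char *
first_unresolved(const struct sym_entry *nl)
{
	assert(nl != NULL);
	for (; nl->n_name != NULL; ++nl)
		if (nl->n_type == 0)
			return (nl->n_name);
	return (NULL);
}

/*
 * Turn a lookup result rc into an error message in buf.
 * Returns 0 when rc reports success (buf untouched), -1 otherwise.
 */
static int
describe_lookup_failure(int rc, const struct sym_entry *nl, char *buf,
    size_t len)
{
	const char *name;

	if (rc == 0)
		return (0);
	if (rc < 0) {
		/* Table unreadable: there is no particular symbol to blame. */
		snprintf(buf, len, "cannot read symbol table");
		return (-1);
	}
	name = first_unresolved(nl);
	snprintf(buf, len, "%s: no such symbol",
	    name != NULL ? name : "(unknown)");
	return (-1);
}
```

With an array like the one printed above (no entry with n_type 0) and
a result of -1, this sketch produces "cannot read symbol table" and
never dereferences anything past the terminator.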