Date: Wed, 26 Aug 2009 22:35:31 +0100
From: Bruce Cran <bruce@cran.org.uk>
To: John Baldwin <jhb@freebsd.org>
Cc: brooks@freebsd.org, current@freebsd.org
Subject: Re: patches to fix "ps -M" as used in crashinfo(8)
Message-ID: <20090826223531.11364956@gluon.draftnet>
In-Reply-To: <200908260844.16767.jhb@freebsd.org>
References: <20090824230145.75824e5f@gluon.draftnet> <200908260844.16767.jhb@freebsd.org>
On 26/08/2009 13:44, John Baldwin wrote:
> On Monday 24 August 2009 6:01:45 pm Bruce Cran wrote:
>
>> I've recently been debugging a series of problems with running ps(1)
>> on crash dumps, and now have a couple of patches: the bugs cause
>> ps(1) to crash while crashinfo(8) is being run during boot, dumping a
>> 1GB ps.core file in the root filesystem.
>>
>> The patches are at
>> http://www.cran.org.uk/~brucec/freebsd/pr137890.kvm_proc.c.diff and
>> http://www.cran.org.uk/~brucec/freebsd/pr137890.ps.c.diff
>>
>> The problem with ps.c is that like pkill(1) and w(1), they all
>> initialize the execfile argument to kvm_open or kvm_openfiles to
>> "/dev/null" instead of NULL, causing the default usage of "ps
>> -M /var/crash/vmcore.x" to fail because libkvm fails to
>> fstat /dev/null. They only work if "-N" is also specified.
>>
> Note that crashinfo specifies both -M and -N:
>
> echo "------------------------------------------------------------------------"
> echo "ps -axl"
> echo
> ps -M $VMCORE -N $KERNEL -axl
> echo
>

I realised that just after posting, when I checked how it runs ps.
When I saw the segfault at bootup I think I just ran
"ps -ax -M /var/crash/vmcore.x" and saw it segfault too, so I jumped to
the wrong conclusion. In the end there were a couple of ways to get it
to crash, and I'm not convinced I've found them all yet.

> I'm not sure that 'ps -M blah' without '-N' should really work.
> Also, I'm not sure how fstat() of /dev/null could fail?
>

The documentation (for ps and the equivalent parameter for kvm_open)
seems to say that if you don't specify "-N" then the currently running
kernel is used, as specified by getbootfile(3). I don't know if that
makes sense or not.

The code which involved fstat was in __aout_fdnlist in
lib/libc/gen/nlist.c:

	/* check that file is at least as large as struct exec! */
	if ((_fstat(fd, &st) < 0) || (st.st_size < sizeof(struct exec)))
		return (-1);

I guess it was the second check that was failing and causing the
function to return, and not the fstat call itself: /dev/null reports a
size of 0, which is smaller than struct exec.

> The kvm_nlist() bug in libkvm should probably still be fixed, and the
> ngroups one you might want to poke brooks@ about.
>

-- 
Bruce Cran
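
To check the guess about which half of that condition rejects
/dev/null, the two tests can be split apart. The following is only a
minimal sketch and not the libc code: the enum, the function name
check_header_guard() and the min_size parameter are hypothetical names
made up for illustration, and min_size stands in for
sizeof(struct exec).

```c
#include <fcntl.h>
#include <stddef.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

/* Which part of the nlist.c-style guard rejected the file (hypothetical names). */
enum guard_result {
	GUARD_OK,		/* both checks passed */
	GUARD_OPEN_FAILED,	/* could not open the path at all */
	GUARD_FSTAT_FAILED,	/* fstat() itself returned an error */
	GUARD_TOO_SMALL		/* fstat() worked, but st_size < min_size */
};

/*
 * Open path read-only and apply the same two tests as the guard quoted
 * above, reporting separately which one fails.  min_size is a stand-in
 * for sizeof(struct exec).
 */
static enum guard_result
check_header_guard(const char *path, size_t min_size)
{
	struct stat st;
	enum guard_result res = GUARD_OK;
	int fd;

	fd = open(path, O_RDONLY);
	if (fd < 0)
		return (GUARD_OPEN_FAILED);

	if (fstat(fd, &st) < 0)
		res = GUARD_FSTAT_FAILED;
	else if (st.st_size < (off_t)min_size)
		res = GUARD_TOO_SMALL;

	close(fd);
	return (res);
}
```

Calling check_header_guard("/dev/null", 32) should come back as
GUARD_TOO_SMALL rather than GUARD_FSTAT_FAILED, since fstat() on
/dev/null succeeds and reports st_size as 0. The 32 is just an assumed
header size for the example.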