Date:      Mon, 16 Nov 2015 16:37:32 -0800
From:      John Baldwin <jhb@freebsd.org>
To:        Marius Strobl <marius@alchemy.franken.de>
Cc:        freebsd-arch@freebsd.org
Subject:   Re: Supporting cross-debugging vmcores in libkvm (Testing needed)
Message-ID:  <5992121.1Qh8fceFnn@ralph.baldwin.cx>
In-Reply-To: <20151116230439.GA77914@alchemy.franken.de>
References:  <3121152.ujdxFEovO3@ralph.baldwin.cx> <5385051.zAN7Yc63R0@ralph.baldwin.cx> <20151116230439.GA77914@alchemy.franken.de>

On Tuesday, November 17, 2015 12:04:39 AM Marius Strobl wrote:
> On Fri, Nov 13, 2015 at 11:50:37AM -0800, John Baldwin wrote:
> > On Friday, November 13, 2015 12:41:46 AM Marius Strobl wrote:
> > > On Thu, Nov 12, 2015 at 02:36:42PM -0800, John Baldwin wrote:
> > > > On Monday, August 31, 2015 02:21:19 PM John Baldwin wrote:
> > > > > On Wednesday, August 12, 2015 10:50:20 AM John Baldwin wrote:
> > > > > > On Tuesday, August 04, 2015 10:56:09 AM John Baldwin wrote:
> > > > > > > Many debuggers (recent gdb and lldb) support cross-architecture debugging 
> > > > > > > just fine.  My current WIP port of kgdb to gdb7 supports cross-debugging for
> > > > > > > remote targets already, but I wanted it to also support cross-debugging for
> > > > > > > vmcores.
> > > > > > > 
> > > > > > > The existing libkvm/kgdb code in the tree has some limited support for
> > > > > > > cross-debugging.  It requires building a custom libkvm (e.g. libkvm-i386.a)
> > > > > > > and custom kgdb for each target platform.  However, gdb (and lldb) both
> > > > > > > support multiple targets in a single binary, so I'd like to have a single
> > > > > > > kgdb binary that can cross-debug anything.
> > > > > > > 
> > > > > > > I started hacking on libkvm last weekend and have a prototype that I've used
> > > > > > > (along with some patches to my kgdb port) to debug an amd64 vmcore on an
> > > > > > > i386 machine and vice versa.
> > > > > > >
> > > > > > > ...
> > > > > > > 
> > > > > > > What I'm mostly after is comments on the API, etc.  Once that is settled I
> > > > > > > will move forward on converting and/or stubbing the other backends (the
> > > > > > > stub route would be to only support other backends on native systems for
> > > > > > > now).
> > > > > > 
> > > > > > I guess this is closer to a nuclear power plant than a bikeshed judging by the
> > > > > > feedback.  I have ported the rest of the MD backends and verified that the
> > > > > > updated libkvm passes a universe build (including various static assertions
> > > > > > for the duplicated constants in other backends).  What I have not done is any
> > > > > > runtime testing and I would like to ask for help with that now.  In particular
> > > > > > I need someone to test that kgdb and/or ps works against a native core dump
> > > > > > on all platforms other than amd64 and i386.  Note that some of the trickiness
> > > > > > is that the backends now have to make runtime decisions for things that were
> > > > > > previously compile-time decisions.  The biggest one affected by this is the
> > > > > > MIPS backend as that backend handles three ABIs (mipso32, mipsn32, and mipsn64).
> > > > > > I believe I have the handling for that correct (mips[on]32 use 32-bit KSEGs
> > > > > > > whereas mipsn64 uses the extended segments and compat32 KSEGs, and mipso32
> > > > > > uses 32-bit PTEs and mipsn32/n64 both use 64-bit PTEs) (plus both endians
> > > > > > for both in theory).  The ARM backend also handles both endians (in theory).
> > > > > > 
> > > > > > Another wrinkle is that sparc64 uses its own dump format instead of writing
> > > > > > out an ELF file.  I had to convert the header structures to use fixed-width
> > > > > > types to be cross-friendly.  It would be good to ensure that a new libkvm
> > > > > > can read a vmcore from an old kernel and vice versa to make sure my conversion
> > > > > > is correct (I added an explicit padding field that I believe was implicit
> > > > > > before).
> > > > > > 
> > > > > > > The code is currently available for review in phabricator at
> > > > > > https://reviews.freebsd.org/D3341
> > > > > > 
> > > > > > To test, you can run 'arc patch D3341' in a clean tree to apply the patch.
> > > > > 
> > > > > I've just rebased this to port aarch64's minidump support.  I just need people
> > > > > willing and able to test on non-x86.  Testing with the in-tree kgdb using an
> > > > > updated libkvm would be sufficient.
> > > > 
> > > > After a lot of crickets, I have updated the manpages for the new API.  I will
> > > > commit this "soon".  If you want kgdb to keep working on your non-x86
> > > > platform, this is your chance to test this before it hits the tree.
> > > > 
> > > 
> > > What exact test procedure do you suggest for full coverage of an
> > > architecture?
> > 
> > Just ensuring that kgdb and things like ps -M <core> -N <kernel> still work.
> 
> With the patch from D3341 applied, kgdb(1) still seems to work fine on
> sparc64. However, `ps -M <core> -N <kernel>` doesn't; it just prints
> the header and then exits after a short pause. Using the same core and
> kernel with ps(1) on a machine with userland built without your patch,
> ps(1) just segfaults after a short period of time. I can't tell whether
> that's a regression or not as I've never used ps(1) on a core before
> and you also have added padding to struct sparc64_dump_hdr, which might
> be responsible for triggering the segfault. On the other hand, an old
> kgdb(1) seemingly works fine with the new core.

Hmm, I had thought that the old and new sparc64_dump_hdr would be the
same?  I was just using fixed width types so that any platform could
#include the header and get the same layout.  In particular, I don't
want the dump format to change on disk after this change so that once
kgdb (or lldb) has cross-debugging support we can read both old and
new sparc64 vmcores.
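
To make the fixed-width idea concrete, here is a rough sketch.  It is a
made-up header for illustration only, not the actual sparc64_dump_hdr
layout; every name in it is hypothetical.  The point is that the explicit
pad field pins the layout even on hosts where 64-bit integers are only
4-byte aligned inside structs (i386, for example), and the static
assertions catch any drift at compile time:

```c
#include <assert.h>	/* static_assert (C11) */
#include <stddef.h>	/* offsetof */
#include <stdint.h>	/* fixed-width integer types */

/*
 * Hypothetical on-disk dump header, for illustration only.  The field
 * names are invented and do not match any real FreeBSD structure.
 */
struct ex_dump_hdr {
	uint64_t	ex_hdr_size;	/* size of this header in bytes */
	uint64_t	ex_table_pa;	/* physical address of some table */
	uint32_t	ex_nregions;	/* number of memory regions */
	uint32_t	ex_pad;		/* explicit padding, same on every host */
	uint64_t	ex_table_mask;	/* follows the pad at a fixed offset */
};

/* Fail the build if any host ABI lays the header out differently. */
static_assert(sizeof(struct ex_dump_hdr) == 32, "header size changed");
static_assert(offsetof(struct ex_dump_hdr, ex_table_mask) == 24,
    "field after the padding moved");
```

Without ex_pad, a host that aligns uint64_t to 8 bytes would insert the
same 4 bytes implicitly, while a host that aligns it to 4 bytes would not,
and the two would disagree about where ex_table_mask lives.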

> FYI, I needed the following patch on top of D3341 (based on the amd64
> counterpart):
> --- lib/libkvm/kvm_minidump_aarch64.c	2015-11-16 23:41:58.075242000 +0100
> +++ lib/libkvm/kvm_minidump_aarch64.c	2015-11-16 13:25:26.411577000 +0100
> @@ -122,7 +122,7 @@
>  		return (-1);
>  	}
>  	if (pread(kd->pmfd, bitmap, vmst->hdr.bitmapsize, off) !=
> -	    vmst->hdr.bitmapsize) {
> +	    (ssize_t)vmst->hdr.bitmapsize) {
>  		_kvm_err(kd, kd->program,
>  		    "cannot read %d bytes for page bitmap",
>  		    vmst->hdr.bitmapsize);
> @@ -215,7 +215,7 @@
>  	}
>  
>  invalid:
> -	_kvm_err(kd, 0, "invalid address (0x%lx)", va);
> +	_kvm_err(kd, 0, "invalid address (0x%jx)", (uintmax_t)va);
>  	return (0);
>  }

Oops, yes.  I fixed this in my git branch when I built universe with it
recently but I might not have pushed that update to phabricator yet.
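
For reference, a minimal sketch of the two idioms in that patch.  The
helper names are hypothetical and are not libkvm functions, and the
uint32_t length is an assumption about the header field's type.  pread()
returns a signed ssize_t, so the unsigned length is cast before the
comparison; a 64-bit address is cast to uintmax_t and printed with %jx,
since %lx is only correct where long is 64 bits wide:

```c
#include <stdint.h>
#include <stdio.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper: read exactly len bytes at offset off.
 * Returns 0 on success, -1 on error or short read.
 */
static int
ex_read_exact(int fd, void *buf, uint32_t len, off_t off)
{
	/* Compare signed to signed so an error return of -1 stays -1. */
	if (pread(fd, buf, len, off) != (ssize_t)len)
		return (-1);
	return (0);
}

/*
 * Hypothetical helper: format a 64-bit virtual address portably.
 * Returns the number of characters snprintf() would have written.
 */
static int
ex_format_addr(char *buf, size_t buflen, uint64_t va)
{
	return (snprintf(buf, buflen, "invalid address (0x%jx)",
	    (uintmax_t)va));
}
```

Both forms compile the same way on 32-bit and 64-bit hosts, which matters
once a single libkvm has to build everywhere in a universe run.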

> Also, parallel builds failed with something not finding libelf but
> building with a single job succeeded. I don't know whether D3341
> introduces that or if it's a bug in head (the latter probably is
> unlikely but I didn't investigate).

Hmm, it is true that libkvm now depends on libelf.  My -j 16 tinderbox
builds did not trip over that, and lib/Makefile has libelf in its
"early" list of libraries (SUBDIR_ORDERED), so it seems like it should
be built before libkvm is tried?

> > Btw, Mark Linimon tried to generate a crashdump for me on his sparc64 running
> > HEAD recently so I could test the updated kgdb but it failed to generate a
> > dump.
> 
> Ah, that reminds me of something; fixed in r290957.

Thanks!

-- 
John Baldwin


