Subject: Re: strange kernel crash
From: Andriy Gapon
To: John Baldwin
Cc: freebsd-current@FreeBSD.org, Hans Petter Selasky, FreeBSD Hackers
Date: Wed, 11 Nov 2015 10:01:36 +0200
Message-ID: <5642F5E0.4050402@FreeBSD.org>
In-Reply-To: <18887451.3zmRk4crln@ralph.baldwin.cx>
References: <563C8CED.3020101@FreeBSD.org> <2278845.gkxYBUMIWE@ralph.baldwin.cx> <5641AF48.1000507@FreeBSD.org> <18887451.3zmRk4crln@ralph.baldwin.cx>

On 10/11/2015 20:42, John Baldwin wrote:
> On Tuesday, November 10, 2015 10:48:08 AM Andriy Gapon wrote:
>> On 09/11/2015 22:16, John Baldwin wrote:
>>> On Friday, November 06, 2015 07:02:59 PM Hans Petter Selasky wrote:
>>>> On 11/06/15 12:20, Andriy Gapon wrote:
>>>>> Now the strange part:
>>>>>
>>>>>    0xffffffff80619a18 <+744>:   jne    0xffffffff80619a61 <__mtx_lock_flags+817>
>>>>>    0xffffffff80619a1a <+746>:   mov    %rbx,(%rsp)
>>>>> => 0xffffffff80619a1e <+750>:   movq   $0x0,0x18(%rsp)
>>>>>    0xffffffff80619a27 <+759>:   movq   $0x0,0x10(%rsp)
>>>>>    0xffffffff80619a30 <+768>:   movq   $0x0,0x8(%rsp)
>>>>
>>>> Were these instructions dumped from RAM or from the kernel ELF file?
>>>
>>> Probably not from RAM.  You can use 'info files' in gdb to see what is
>>> handling the address range in question (core vs executable).  x/i in ddb
>>> would have been the "real" truth.
>>
>> Yes, according to the output of files it looks like gdb would read that data
>> from the text section of the kernel file.
>>
>> How about libkvm?  Would kvm_read read data from the core file?
>
> kvm_read should only access the vmcore, yes.
>
>> I've written the following small program (cut down dmesg.c, actually):
>> https://people.freebsd.org/~avg/vmcore_read.c
>>
>> (kgdb) disassemble /r
>> => 0xffffffff80619a1e <+750>:   48 c7 44 24 18 00 00 00 00      movq   $0x0,0x18(%rsp)
>>
>> $ vmcore_read -N /boot/kernel.29/kernel -M /var/crash/vmcore.29 0xffffffff80619a1e 9
>> 48 c7 44 24 18 00 00 00 00
>>
>> Seems like the code is intact.
>>
>> P.S.
>> 1. To correct something I said earlier, the fault is #UD, not #GP.
>> 2. The only "suspicious" activity at the time of the crash was the execution of
>> a bhyve VM.
>
> Was the crash in the guest or the host?  UD# seems even more bizarre.

It was the host.
This is bizarre indeed.
I can think only of two possibilities:
- new CPU erratum
- corrupted data somehow getting into the instruction cache, but the correct
  data being read during the crash dump (i.e. flaky memory)

-- 
Andriy Gapon
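
For readers who want to repeat the byte-level check from the thread, here is a minimal sketch in Python. It is not the vmcore_read.c program linked above and does not use libkvm. It only takes the two hex dumps quoted in this message, compares them, and decodes the nine bytes by hand. The function names are hypothetical, and the decoder deliberately handles just the one encoding seen here (REX.W, opcode C7 /0, an [rsp + disp8] operand, a 32-bit immediate).

```python
# Bytes quoted in the thread for address 0xffffffff80619a1e.
KERNEL_FILE_BYTES = "48 c7 44 24 18 00 00 00 00"  # from kgdb "disassemble /r"
VMCORE_BYTES = "48 c7 44 24 18 00 00 00 00"       # from the vmcore_read run


def parse_hex(dump: str) -> bytes:
    """Turn a space-separated hex dump into raw bytes."""
    return bytes.fromhex(dump)


def decode_mov_imm_to_rsp_disp8(code: bytes) -> tuple[int, int]:
    """Decode 'movq $imm32, disp8(%rsp)' and return (disp8, imm32).

    Hypothetical helper: it rejects every other instruction form.
    """
    if len(code) != 9:
        raise ValueError("expected exactly 9 bytes")
    rex, opcode, modrm, sib = code[0], code[1], code[2], code[3]
    if rex != 0x48:        # REX.W: 64-bit operand size
        raise ValueError("missing REX.W prefix")
    if opcode != 0xC7:     # C7 /0: MOV r/m, imm32
        raise ValueError("not opcode C7")
    mod, reg, rm = modrm >> 6, (modrm >> 3) & 7, modrm & 7
    if (mod, reg, rm) != (1, 0, 4):   # disp8 follows, /0, SIB byte present
        raise ValueError("unexpected ModRM byte")
    if sib != 0x24:        # scale 1, no index, base = rsp
        raise ValueError("unexpected SIB byte")
    disp8 = int.from_bytes(code[4:5], "little", signed=True)
    imm32 = int.from_bytes(code[5:9], "little", signed=True)
    return disp8, imm32


if __name__ == "__main__":
    file_bytes = parse_hex(KERNEL_FILE_BYTES)
    core_bytes = parse_hex(VMCORE_BYTES)
    print("kernel file and vmcore agree:", file_bytes == core_bytes)
    disp, imm = decode_mov_imm_to_rsp_disp8(core_bytes)
    print(f"movq $0x{imm:x},0x{disp:x}(%rsp)")
```

If both checks pass, the bytes saved in the vmcore form a valid instruction, which is why a #UD at that address is hard to explain from the dump alone.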