Date: Wed, 19 Sep 2018 14:11:56 -0700
From: Steve Kargl
To: Mark Johnston
Cc: freebsd-current@freebsd.org
Subject: Re: ALPHA4 panic in VM
Message-ID: <20180919211156.GA1677@troutmask.apl.washington.edu>
Reply-To: sgk@troutmask.apl.washington.edu
References: <20180919200152.GA1164@troutmask.apl.washington.edu> <20180919210211.GC99168@raichu>
In-Reply-To: <20180919210211.GC99168@raichu>
List-Id: Discussions about the use of FreeBSD-current

On Wed, Sep 19, 2018 at
05:02:11PM -0400, Mark Johnston wrote:
> On Wed, Sep 19, 2018 at 01:01:52PM -0700, Steve Kargl wrote:
> > I have the kernel and core file if more information is needed.
> >
> > % cat info.2
> >
> > Dump header from device: /dev/ada0p3
> >   Architecture: amd64
> >   Architecture Version: 2
> >   Dump Length: 2348281856
> >   Blocksize: 512
> >   Compression: none
> >   Dumptime: Wed Sep 19 12:29:59 2018
> >   Hostname: troutmask.apl.washington.edu
> >   Magic: FreeBSD Kernel Dump
> >   Version String: FreeBSD 12.0-ALPHA4 #0 r338505: Thu Sep 6 13:45:34 PDT 2018
> >     kargl@troutmask.apl.washington.edu:/usr/obj/usr/src/amd64.amd64/sys/SPEW
> >   Panic String: page fault
> >   Dump Parity: 2676008548
> >   Bounds: 2
> >   Dump Status: good
> >
> > % more core.txt.2
> > Fatal trap 12: page fault while in kernel mode
> > cpuid = 1; apic id = 11
> > fault virtual address = 0xffffb8000719a428
>
> This seems to be the result of a bit-flip. cred is 0xffffb8000719a400,
> which is almost but not quite in the direct map. In particular we have:
>
> (kgdb) frame 10
> #10 0xffffffff8083e07d in vm_object_destroy (object=) at /usr/src/sys/vm/vm_object.c:703
> 703             swap_release_by_cred(object->charge, object->cred);
> (kgdb) p object
> $8 =
> (kgdb) p *(vm_object_t)$r13
> $9 = {
>   ...
>   cred = 0xffffb8000719a400,
>   charge = 28672,
>   umtx_data = 0x0
> }
> (kgdb) p *(struct ucred *)0xfffff8000719a400
> $10 = {
>   cr_ref = 5737,
>   cr_uid = 1001,
>   cr_ruid = 1001,
>   cr_svuid = 1001,
>   cr_ngroups = 7,
>   cr_rgid = 1001,
>   cr_svgid = 1001,
>   cr_uidinfo = 0xfffff80007285500,
>   cr_ruidinfo = 0xfffff80007285500,
>   cr_prison = 0xffffffff80a9de10 ,
>   ...
>
> That is, flipping one of the bits in the fault address leads me to a
> valid ucred. This could in principle be the result of a software bug,
> but I'd be more inclined to suspect the hardware.

Mark,

Thanks for looking into the problem. This system has been
running for probably 2 years or so without issues.
I guess it's time to pull out memtest86+ (or similar) to see
if hardware is starting to fail.

-- 
Steve
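
For anyone following the thread, the single-bit observation in the quoted analysis can be checked by hand. The sketch below is only an illustration and not part of the original exchange: it takes the three addresses from the kgdb output, XORs the corrupted cred pointer against the address where a plausible ucred was found, and lists which bit positions differ. The helper name differing_bits and the constant names are made up for this example.

```python
# Addresses copied from the kgdb session quoted in this thread.
BAD_CRED = 0xffffb8000719a400    # object->cred as stored in the vm_object
GOOD_CRED = 0xfffff8000719a400   # address where kgdb shows a plausible ucred
FAULT_ADDR = 0xffffb8000719a428  # fault virtual address from core.txt.2


def differing_bits(a, b):
    """Return the bit positions (0 = least significant) where a and b differ."""
    diff = a ^ b
    return [i for i in range(diff.bit_length()) if (diff >> i) & 1]


if __name__ == "__main__":
    # A single entry here means the two pointers differ by exactly one bit.
    print("differing bit positions:", differing_bits(BAD_CRED, GOOD_CRED))
    print("xor mask: %#018x" % (BAD_CRED ^ GOOD_CRED))
    # The faulting access lands a small offset past the corrupted pointer.
    print("fault offset from bad cred: %#x" % (FAULT_ADDR - BAD_CRED))
```

The two pointers differ only in bit 46, and the fault address sits 0x28 bytes past the corrupted pointer, which is consistent with a field being read through that pointer. This only shows that the addresses are one bit apart; it cannot by itself distinguish failing memory from a software bug, so running memtest86+ is still the sensible next step.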