Date:      Wed, 6 Aug 2014 23:45:06 +0200 (CEST)
From:      Trond Endrestøl <Trond.Endrestol@fagskolen.gjovik.no>
To:        current@freebsd.org
Subject:   Re: panic: atpic_assign_cpu: bad cookie [Was: Build machine OK; laptop panics @r269515]
Message-ID:  <alpine.BSF.2.11.1408062342340.64214@mail.fig.ol.no>
In-Reply-To: <alpine.BSF.2.11.1408062244020.64214@mail.fig.ol.no>
References:  <20140804194759.GT1228@albert.catwhisker.org> <20140805142914.GJ1228@albert.catwhisker.org> <C248C4AE-65AB-406A-A523-F7D7FAA8FBBE@FreeBSD.org> <53E1450D.5090708@protected-networks.net> <53E14CD1.20308@protected-networks.net> <8B832384-C1CC-4622-BA67-7447ECE317C9@FreeBSD.org> <alpine.BSF.2.11.1408061914220.64214@mail.fig.ol.no> <alpine.BSF.2.11.1408062244020.64214@mail.fig.ol.no>

On Wed, 6 Aug 2014 22:48+0200, Trond Endrestøl wrote:

> On Wed, 6 Aug 2014 19:21+0200, Trond Endrestøl wrote:
> 
> > On Tue, 5 Aug 2014 20:49-0700, John Baldwin wrote:
> > 
> > > 
> > > On Aug 5, 2014, at 2:29 PM, Michael Butler <imb@protected-networks.net> wrote:
> > > 
> > > > On 08/05/14 16:56, Michael Butler wrote:
> > > >> On 08/05/14 16:02, John Baldwin wrote:
> > > >> 
> > > >>> My guess is that the recent Xen changes tickled something.
> > > >> 
> > > >> I can confirm this on a kernel which is otherwise up to date ..
> > > >> 
> > > >> FreeBSD toshi.auburn.protected-networks.net 11.0-CURRENT FreeBSD
> > > >> 11.0-CURRENT #2 r269608M: Tue Aug  5 16:48:12 EDT 2014
> > > >> 
> > > >> I backed out all of SVN r269507 through r269515.
> > > >> 
> > > >> Now working ..
> > > > 
> > > > [ .. snip .. ]
> > > > 
> > > >> Now to see if it's related to the other machine's disk woes (it's a
> > > >> single-core device),
> > > > 
> > > > And it fixes the inability to probe disks on my single-core machine :-)
> > > 
> > > It looks like the MADT code to probe the I/O APICs isn't working so 
> > > it's trying to fall back to using the ATPIC while using SMP (which 
> > > doesn't work).  I know it's a pain on a laptop, but if it is at all 
> > > possible to capture either a verbose or non-verbose dmesg that would 
> > > really help narrow it down.
> > > 
> > > Also, if anyone can try reverting just the MADT-related changes in 
> > > the recent Xen changes to see if you can narrow down which exact one 
> > > triggers the panic that would be really helpful.
> > 
> > I noticed this panic on i386 head r269607 yesterday, running in VBox 
> > on Windows 7 SP1 x64, on an Intel(R) Core(TM) i7 CPU 960 @ 3.20GHz 
> > (3175.67-MHz 686-class CPU).
> > 
> > Go to http://ximalas.info/~trond/atpic_assign_cpu/ where you'll find a 
> > verbose dmesg from i386 head r268838 from the same VM and a couple of 
> > screenshots of the crash while booting r269607 with the verbose flag 
> > on.
> > 
> > I'm rewinding /usr/src to r269507, and I'll take it from there, one 
> > commit at a time.
> 
> Reverting r269510 did the trick, i.e.:
> 
> cd /usr/src && svn up && svn diff -r 269510:269509 | patch
> 
> My i386 head VM is running smoothly with r269641M, with M meaning only 
> the above reversal.
> 
> > I'll also try to investigate this panic using my amd64 head VM.
> 
> Work in progress.

amd64 is unaffected, as r269644 worked without any modifications.

I'm guessing the changes to sys/x86/x86/local_apic.c and 
sys/x86/xen/xen_intr.c come as a pair. If not, then the changes done 
to sys/x86/x86/local_apic.c are the culprit somehow.
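
The single-revision back-out quoted above generalises to every commit in the suspect range. What follows is only a minimal sketch, not the procedure actually used in this thread (which rewound the tree to r269507 and stepped forward): it prints, without executing, the reverse-diff command for each revision from r269507 through r269515. The helper name revert_cmd is hypothetical; the command it prints is the one quoted above with the revision numbers substituted.

```shell
#!/bin/sh
# Dry run only: print (do not execute) the command that backs out one revision.
# revert_cmd is a hypothetical helper name, not part of svn.
revert_cmd() {
    rev=$1
    prev=$((rev - 1))
    # Diffing from rev back to rev-1 yields the reverse of that commit.
    printf 'svn diff -r %s:%s | patch\n' "$rev" "$prev"
}

# Candidate range named in the thread: r269507 through r269515.
rev=269507
while [ "$rev" -le 269515 ]; do
    revert_cmd "$rev"
    rev=$((rev + 1))
done
```

Each printed line would be run from /usr/src on an otherwise up-to-date tree, one at a time, with a kernel rebuild and boot test in between, so that only one suspect commit is backed out per test.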


Trond,
going to bed.

-- 
+-------------------------------+------------------------------------+
| Vennlig hilsen,               | Best regards,                      |
| Trond Endrestøl,              | Trond Endrestøl,                   |
| IT-ansvarlig,                 | System administrator,              |
| Fagskolen Innlandet,          | Gjøvik Technical College, Norway,  |
| tlf. mob.   952 62 567,       | Cellular...: +47 952 62 567,       |
| sentralbord 61 14 54 00.      | Switchboard: +47 61 14 54 00.      |
+-------------------------------+------------------------------------+
From owner-freebsd-current@FreeBSD.ORG  Thu Aug  7 06:11:20 2014
Date: Thu, 7 Aug 2014 08:11:11 +0200
From: Peter Holm <peter@holm.cc>
To: Konstantin Belousov <kostikbel@gmail.com>
Subject: Re: r269147: NULL mp in getnewvnode() via kern_proc_filedesc_out()
Message-ID: <20140807061111.GA6625@x2.osted.lan>
References: <53E1975D.4010703@FreeBSD.org> <20140806031932.GE93733@kib.kiev.ua>
 <53E1A76A.1070400@FreeBSD.org> <53E26A6C.4000806@FreeBSD.org>
 <20140806181512.GK93733@kib.kiev.ua>
In-Reply-To: <20140806181512.GK93733@kib.kiev.ua>
Cc: current@freebsd.org, Bryan Drewery <bdrewery@freebsd.org>

On Wed, Aug 06, 2014 at 09:15:12PM +0300, Konstantin Belousov wrote:
> On Wed, Aug 06, 2014 at 12:48:28PM -0500, Bryan Drewery wrote:
> > On 8/5/2014 10:56 PM, Bryan Drewery wrote:
> > > On 8/5/2014 10:19 PM, Konstantin Belousov wrote:
> > >> On Tue, Aug 05, 2014 at 09:47:57PM -0500, Bryan Drewery wrote:
> > >>> Has anyone else encountered this? Got it while running poudriere.
> > >>>
> > >>>> NULL mp in getnewvnode()
> > >>>> [...]
> > >>>> vn_fullpath1() at vn_fullpath1+0x19d/frame 0xfffffe1247d8e540
> > >>>> vn_fullpath() at vn_fullpath+0xc1/frame 0xfffffe1247d8e590
> > >>>> export_fd_to_sb() at export_fd_to_sb+0x489/frame 0xfffffe1247d8e7c0
> > >>>> kern_proc_filedesc_out() at kern_proc_filedesc_out+0x234/frame
> > >>>> 0xfffffe1247d8e840
> > >>>> sysctl_kern_proc_filedesc() at sysctl_kern_proc_filedesc+0x84/frame
> > >>>> 0xfffffe1247d8e900
> > >>>> sysctl_root_handler_locked() at
> > >>>> sysctl_root_handler_locked+0x68/frame 0xfffffe1247d8e940
> > >>>> sysctl_root() at sysctl_root+0x18e/frame 0xfffffe1247d8e990
> > >>>> userland_sysctl() at userland_sysctl+0x192/frame 0xfffffe1247d8ea30
> > >>>> sys___sysctl() at sys___sysctl+0x74/frame 0xfffffe1247d8eae0
> > >>>> amd64_syscall() at amd64_syscall+0x25a/frame 0xfffffe1247d8ebf0
> > >>>> Xfast_syscall() at Xfast_syscall+0xfb/frame 0xfffffe1247d8ebf0
> > >>>
> > >>> Unfortunately I have no dump as the kmem was too large compared to my
> > >>> swap, and I didn't get to the console before some of the text was
> > >>> overwritten. Perhaps it will hit it again soon after reboot and I'll get
> > >>> a core.
> > >>
> > >> "NULL mp in getnewvnode()" is only the printf(), it is not a panic or
> > >> KASSERT.  The event does not stop the machine, nor does it print a
> > >> backtrace.
> > >>
> > >> You mentioned that you were unable to dump, so did the system panic?
> > >> Without full log of the panic messages and backtrace, it is impossible
> > >> to start guessing what the problem is.
> > >>
> > >> That said, the printf seemingly outlived its usefulness.
> > >>
> > > 
> > > Got it. I've set debug.debugger_on_panic=1 to not auto reboot on panic
> > > next time this happens. I had it at 0 which was causing the lack of
> > > information in these.
> > 
> > Here is the full trace:
> > 
> > 
> > > NULL mp in getnewvnode()
> > > VNASSERT failed
> > > 0xfffff806071dc760: tag null, type VDIR
> > >     usecount 1, writecount 0, refcount 1 mountedhere 0
> > >     flags ()
> > >     lock type zfs: EXCL by thread 0xfffff8009a53f490 (pid 1028, tmux, tid 100881)
> > >         vp=0xfffff806071dc760, lowervp=0xfffff8013157f588
> > > panic: Don't call insmntque(foo, NULL)
> > > cpuid = 5
> > > KDB: stack backtrace:
> > > db_trace_self_wrapper() at db_trace_self_wrapper+0x2b/frame 0xfffffe1247e76b50
> > > kdb_backtrace() at kdb_backtrace+0x39/frame 0xfffffe1247e76c00
> > > vpanic() at vpanic+0x126/frame 0xfffffe1247e76c40
> > > kassert_panic() at kassert_panic+0x139/frame 0xfffffe1247e76cb0
> > > insmntque1() at insmntque1+0x230/frame 0xfffffe1247e76cf0
> > > null_nodeget() at null_nodeget+0x158/frame 0xfffffe1247e76d60
> > > null_lookup() at null_lookup+0xeb/frame 0xfffffe1247e76dd0
> > > VOP_LOOKUP_APV() at VOP_LOOKUP_APV+0xf1/frame 0xfffffe1247e76e00
> > > lookup() at lookup+0x5ad/frame 0xfffffe1247e76e90
> > > namei() at namei+0x4e4/frame 0xfffffe1247e76f50
> > > vn_open_cred() at vn_open_cred+0x27a/frame 0xfffffe1247e770a0
> > > vop_stdvptocnp() at vop_stdvptocnp+0x161/frame 0xfffffe1247e773e0
> > > null_vptocnp() at null_vptocnp+0x2b/frame 0xfffffe1247e77440
> > > VOP_VPTOCNP_APV() at VOP_VPTOCNP_APV+0xf7/frame 0xfffffe1247e77470
> > > vn_vptocnp_locked() at vn_vptocnp_locked+0x118/frame 0xfffffe1247e774e0
> > > vn_fullpath1() at vn_fullpath1+0x19d/frame 0xfffffe1247e77540
> > > vn_fullpath() at vn_fullpath+0xc1/frame 0xfffffe1247e77590
> > > export_fd_to_sb() at export_fd_to_sb+0x489/frame 0xfffffe1247e777c0
> > > kern_proc_filedesc_out() at kern_proc_filedesc_out+0x234/frame 0xfffffe1247e77840
> > > sysctl_kern_proc_filedesc() at sysctl_kern_proc_filedesc+0x84/frame 0xfffffe1247e77900
> > > sysctl_root_handler_locked() at sysctl_root_handler_locked+0x68/frame 0xfffffe1247e77940
> > > sysctl_root() at sysctl_root+0x18e/frame 0xfffffe1247e77990
> > > userland_sysctl() at userland_sysctl+0x192/frame 0xfffffe1247e77a30
> > > sys___sysctl() at sys___sysctl+0x74/frame 0xfffffe1247e77ae0
> > > amd64_syscall() at amd64_syscall+0x25a/frame 0xfffffe1247e77bf0
> > > Xfast_syscall() at Xfast_syscall+0xfb/frame 0xfffffe1247e77bf0
> > > --- syscall (202, FreeBSD ELF64, sys___sysctl), rip = 0x801041fca, rsp = 0x7fffffffd878, rbp = 0x7fffffffd8b0 ---
> > > KDB: enter: panic
> > > [ thread pid 1028 tid 100881 ]
> > > Stopped at      kdb_enter+0x3e: movq    $0,kdb_why
> > > db> call doadump()
> > > 
> > > Dump failed. Partition too small.
> > > = 0
> > 
> 
> Try this.
> 
> diff --git a/sys/fs/nullfs/null_vnops.c b/sys/fs/nullfs/null_vnops.c
> index 481644c..e803c24 100644
> --- a/sys/fs/nullfs/null_vnops.c

With this patch I get a 
panic: Lock (lockmgr) null not locked @ kern/vfs_default.c:523.

Details @ http://people.freebsd.org/~pho/stress/log/kostik698.txt

- Peter
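
When comparing long KDB backtraces such as the one quoted above, it can help to reduce each frame to its function name so the call chain is readable at a glance. The following is a minimal sketch, assuming the "name() at name+offset/frame address" layout seen in this thread; trace_funcs is a hypothetical helper name, and the sed expression also strips leading mail-quote markers.

```shell
#!/bin/sh
# trace_funcs is a hypothetical helper: read a KDB backtrace on stdin and
# print one function name per frame, in the order the frames appear.
trace_funcs() {
    # Skip any leading mail-quote markers, then keep the name before "() at ".
    sed -n 's/^[> ]*\([A-Za-z_][A-Za-z0-9_]*\)() at .*/\1/p'
}

# Three frames copied from the trace in this thread.
trace_funcs <<'EOF'
> > > insmntque1() at insmntque1+0x230/frame 0xfffffe1247e76cf0
> > > null_nodeget() at null_nodeget+0x158/frame 0xfffffe1247e76d60
> > > null_lookup() at null_lookup+0xeb/frame 0xfffffe1247e76dd0
EOF
```

Lines that do not look like a frame (the panic message, the "--- syscall" marker, wrapped frame addresses) are dropped, so frames whose "at" clause was wrapped onto the next line by a mail client would be missed by this pattern.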


