Date:      Thu, 25 May 2017 09:28:55 -0400
From:      Adam McDougall <mcdouga9@egr.msu.edu>
To:        Roger Pau Monné <royger@FreeBSD.org>
Cc:        stable@freebsd.org, cperciva@freebsd.org
Subject:   Re: Boot hang on Xen after r318347/(310418)
Message-ID:  <20170525132854.GA7604@egr.msu.edu>
In-Reply-To: <20170525094103.iedycf2t4dy367fc@dhcp-3-128.uk.xensource.com>
References:  <20170524223307.GS79337@egr.msu.edu> <20170525094103.iedycf2t4dy367fc@dhcp-3-128.uk.xensource.com>

On Thu, May 25, 2017 at 10:41:03AM +0100, Roger Pau Monné wrote:

> On Wed, May 24, 2017 at 06:33:07PM -0400, Adam McDougall wrote:
> > Hello,
> > 
> > Recently I made a new build of 11-STABLE but encountered a boot hang
> > at this state:
> > http://www.egr.msu.edu/~mcdouga9/pics/r318347-smp-hang.png
> > 
> > It is easy to reproduce: I can just boot from any 11 or 12 ISO that
> > contains the commit.
> 
> I have just tested latest HEAD (r318861) and stable/11 (r318854) and
> they both work fine in my environment (a VM with 4 vCPUs and 2GB of
> RAM on OSS Xen 4.9). I'm also adding Colin in case he has some input,
> he has been doing some tests on HEAD and AFAIK he hasn't seen any
> issues.
> 
> > I compiled various svn revisions to confirm that r318347 caused the 
> > issue and r318346 is fine. With r318347 or later including the latest 
> > 11-STABLE, the system will only boot with one virtual CPU in XenServer. 
> > Any more cpus and it hangs. I also tried a 12 kernel from head this 
> > afternoon and I have the same hang. I had this issue on XenServer 7 
> > (Xen 4.7) and XenServer 6.5 (Xen 4.4). I did most of my testing on 7. I 
> > also did much of my testing with a GENERIC kernel to try to rule out 
> > kernel configuration mistakes. When it hangs, the performance 
> > monitoring in Xen tells me at least one CPU is pegged. r318674 boots 
> > fine on physical hardware without Xen involved.
> > 
> > Looking at r318347, which mentions EARLY_AP_STARTUP, and later seeing
> > r318763, which enables EARLY_AP_STARTUP in GENERIC, I tried adding it to
> > my kernel, but that turned the hang into a panic that occurs with any
> > number of CPUs:
> > http://www.egr.msu.edu/~mcdouga9/pics/r318347-early-ap-startup-panic.png
> 
> I guess this is on stable/11, right? The panic looks easier to debug
> than the hang, so let's start with this one. Can you enable the serial
> console and kernel debug options in order to get a trace? With just
> this it's almost impossible to know what went wrong.

Yes, this was on stable/11 amd64.
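
For reference, a minimal sketch of the guest-side settings commonly used to send the FreeBSD console to the first serial port so that a panic trace can be captured. This is an assumption about a typical setup, not something confirmed in this thread; the speed value is illustrative, and the host-side step of exposing the VM's serial port in XenServer is not shown.

```
# /boot/loader.conf (sketch; values are assumptions, adjust as needed)
boot_multicons="YES"              # keep both video and serial consoles active
boot_serial="YES"                 # prefer the serial console
comconsole_speed="115200"         # illustrative speed, match the capturing side
console="comconsole,vidconsole"   # serial first, video second
boot_verbose="YES"                # optional: more detail during boot
```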

> If you still have that kernel around (and its debug symbols), can you
> do:
> 
> $ addr2line -e /usr/lib/debug/boot/kernel/kernel.debug 0xffffffff80793344
> 
> (The address is the instruction pointer on the crash image, I think I
> got it right)

I'll reproduce this soon and get the results from that command.

> In order to compile a stable/11 kernel with full debugging support you
> will have to add:
> 
> # For full debugger support use (turn off in stable branch):
> options 	BUF_TRACKING		# Track buffer history
> options 	DDB			# Support DDB.
> options 	FULL_BUF_TRACKING	# Track more buffer history
> options 	GDB			# Support remote GDB.
> options 	DEADLKRES		# Enable the deadlock resolver
> options 	INVARIANTS		# Enable calls of extra sanity checking
> options 	INVARIANT_SUPPORT	# Extra sanity checks of internal structures, required by INVARIANTS
> options 	WITNESS			# Enable checks to detect deadlocks and cycles
> options 	WITNESS_SKIPSPIN	# Don't run witness on spinlocks for speed
> options 	MALLOC_DEBUG_MAXZONES=8	# Separate malloc(9) zones
> 
> To your kernel config file.

I'll work on that soon too when I get a chance, thanks.
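
For reference, one way to apply the options above without editing GENERIC directly is a small config file that includes it. This is only a sketch: the file name and ident XENDEBUG are hypothetical, and the option list is copied from the quoted message.

```
# /usr/src/sys/amd64/conf/XENDEBUG   (hypothetical file name)
include GENERIC
ident   XENDEBUG

# Debug options quoted earlier in this message
options 	BUF_TRACKING		# Track buffer history
options 	DDB			# Support DDB.
options 	FULL_BUF_TRACKING	# Track more buffer history
options 	GDB			# Support remote GDB.
options 	DEADLKRES		# Enable the deadlock resolver
options 	INVARIANTS		# Enable calls of extra sanity checking
options 	INVARIANT_SUPPORT	# Extra sanity checks of internal structures, required by INVARIANTS
options 	WITNESS			# Enable checks to detect deadlocks and cycles
options 	WITNESS_SKIPSPIN	# Don't run witness on spinlocks for speed
options 	MALLOC_DEBUG_MAXZONES=8	# Separate malloc(9) zones
```

Such a config would normally be built and installed from /usr/src with "make buildkernel KERNCONF=XENDEBUG" followed by "make installkernel KERNCONF=XENDEBUG".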

> 
> Just to be sure, this is an amd64 kernel, right?

yes

> 
> Roger.