From: John Baldwin
To: Cy Schubert
Cc: Hans Petter Selasky, FreeBSD Current, Konstantin Belousov
Subject: Re: Strange issue after early AP startup
Date: Tue, 17 Jan 2017 17:20:48 -0800
Message-ID: <1922021.4HJeqFJ74r@ralph.baldwin.cx>
In-Reply-To: <201701180108.v0I18wd1035225@slippy.cwsent.com>
List-Id: Discussions about the use of FreeBSD-current

On Tuesday, January 17, 2017 05:08:58 PM Cy Schubert wrote:
> In message <1492450.XZfNz8zFfg@ralph.baldwin.cx>, John Baldwin writes:
> > On Tuesday, January 17, 2017 12:53:19 PM Cy Schubert wrote:
> > > In message , Hans Petter Selasky writes:
> > > > Hi,
> > > >
> > > > When booting I observe an additional 30-second delay after this print:
> > > >
> > > > > Timecounters tick every 1.000 msec
> > > >
> > > > ~30 second delay and boot continues like normal.
> > > >
> > > > Checking "vmstat -i" reveals that some timers have been running loose.
> > > >
> > > > > cpu0:timer 44300 442
> > > > > cpu1:timer 40561 404
> > > > > cpu3:timer 48462822 483058
> > > > > cpu2:timer 48477898 483209
> > > >
> > > > Trying to add delays and/or prints around the Timecounters printout
> > > > makes the issue go away. Any ideas for debugging?
> > > >
> > > > Looks like a startup race to me.
> > >
> > > just picking a random email to reply to, I'm seeing a different issue with
> > > early AP startup. It affects one of my four machines, my laptop. My three
> > > server systems downstairs have no problem however my laptop will reboot
> > > repeatedly at:
> > >
> > > Jan 17 11:55:16 slippy kernel: cd0: Attempt to query device size failed:
> > > NOT READY, Medium not present - tray closed
> >
> > So it panics and reboots after this?
>
> Yes, it goes into a panic/reboot loop for a few iterations until it
> successfully boots. Disabling early AP startup allows it to boot up without
> the assumed race.

Can you add DDB to the kernel config (and remove DDB_UNATTENDED) to get it
to break into DDB when it panics to get the panic message (and a stack trace
as well)?

-- 
John Baldwin