From: Mark Johnston
Date: Sun, 23 Jul 2017 19:15:04 -0700
To: Mark Martinec
Cc: freebsd-stable@freebsd.org
Subject: Re: The 11.1-RC3 can only boot and attach disks in "Safe mode", otherwise gets stuck attaching
Message-ID: <20170724021504.GA97170@raichu>
In-Reply-To: <81295bcacd7c44813de8d346c88cbb65@ijs.si>

On Thu, Jul 20, 2017 at 03:45:39PM +0200, Mark Martinec wrote:
> 2017-07-20 02:03, Mark Johnston wrote:
> > One thing to try at this point would be to disable EARLY_AP_STARTUP in
> > the kernel config. That is, take a configuration with which you're able
> > to reproduce the hang during boot, and remove "options
> > EARLY_AP_STARTUP".
>
> Done. And it avoids the problem altogether! Thanks.
> Tried a reboot several times and it succeeds every time.

Thanks. Sorry for the delayed follow-up.

>
> Here is all that I had in a config file for building a kernel,
> i.e. I took away the 'options DDB' which also seemingly avoided
> the problem:
>   include GENERIC
>   ident NELI
>   nooptions EARLY_AP_STARTUP

Could you try re-enabling EARLY_AP_STARTUP, applying the patch at the
end of this email, and see if the message "sleeping before eventtimer
init" appears in the boot output? If it does, it'll be followed by a
backtrace that might be useful for tracking down the hang. It might
produce false positives, but we'll see.

>
> > This feature has a fairly large impact on the bootup process and has
> > had a few problems that manifested as hangs during boot. There was at
> > least one other case where an innocuous change to the kernel
> > configuration "fixed" the problem by introducing some second-order
> > effect (causing kernel threads to be scheduled in a different
> > order, for instance).
> >
> > Regardless of whether the suggestion above makes a difference, it would
> > be helpful to see verbose dmesgs from both a clean boot and a boot that
> > hangs. If disabling EARLY_AP_STARTUP helps, then we can try adding some
> > assertions that will cause the system to panic when the hang occurs,
> > making it easier to see what's going on.
>
> Hmmm.
> I have now saved a couple of versions of /var/run/dmesg.boot
> (in boot_verbose mode) when EARLY_AP_STARTUP is disabled and
> the boot is successful. However, I don't know how to capture
> such log when booting hangs, as I have no serial interface
> and the boot never completes. All I have is a screen photo
> of the last state when a hang occurs (showing ada disks
> successfully attached, followed immediately by the attempt
> to attach a da disk, which hangs).

Ok, let's not worry about this for now.

Index: sys/kern/kern_clock.c
===================================================================
--- sys/kern/kern_clock.c	(revision 321401)
+++ sys/kern/kern_clock.c	(working copy)
@@ -385,6 +385,8 @@
 static int devpoll_run = 0;
 #endif
 
+bool inited_clocks = false;
+
 /*
  * Initialize clock frequencies and start both clocks running.
  */
@@ -412,6 +414,8 @@
 #ifdef SW_WATCHDOG
 	EVENTHANDLER_REGISTER(watchdog_list, watchdog_config, NULL, 0);
 #endif
+
+	inited_clocks = true;
 }
 
 /*
Index: sys/kern/kern_synch.c
===================================================================
--- sys/kern/kern_synch.c	(revision 321401)
+++ sys/kern/kern_synch.c	(working copy)
@@ -298,6 +298,8 @@
 	return (rval);
 }
 
+extern bool inited_clocks;
+
 /*
  * pause() delays the calling thread by the given number of system ticks.
  * During cold bootup, pause() uses the DELAY() function instead of
@@ -330,6 +332,10 @@
 		DELAY(sbt);
 		return (0);
 	}
+	if (cold && !inited_clocks) {
+		printf("%s: sleeping before eventtimer init\n", curthread->td_name);
+		kdb_backtrace();
+	}
 	return (_sleep(&pause_wchan[curcpu], NULL, 0, wmesg, sbt, pr, flags));
 }
 
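
For anyone who wants to see the logic of the check without building a
kernel, here is a rough userspace sketch of what the patch is meant to
catch. It is only a model under stated assumptions: the sim_* names are
made up for illustration and are not kernel symbols, the "cold" flag is
simulated, and the real patch additionally prints a backtrace via
kdb_backtrace(), which has no equivalent here.

```c
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical stand-ins for kernel state; not the real kernel symbols. */
static bool sim_cold = true;		/* models the kernel's "cold" flag */
static bool sim_inited_clocks = false;	/* models the flag added by the patch */
static int sim_warnings = 0;		/* number of diagnostics emitted so far */

/* Models the end of initclocks(): after this, the flag reads true. */
static void
sim_initclocks(void)
{
	sim_inited_clocks = true;
}

/*
 * Models the check added before the _sleep() call: report (and count) a
 * caller that is about to sleep while the system is still cold and the
 * clocks have not been initialized. Returns true if a warning was printed.
 */
static bool
sim_pause(const char *thread_name)
{
	if (sim_cold && !sim_inited_clocks) {
		printf("%s: sleeping before eventtimer init\n", thread_name);
		sim_warnings++;
		return (true);
	}
	return (false);
}
```

Calling sim_pause() before sim_initclocks() prints the warning; calling
it afterwards is silent. That ordering is what the message in the boot
output is intended to reveal.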