From owner-freebsd-stable@FreeBSD.ORG Wed Jun 19 12:21:59 2013
Delivered-To: freebsd-stable@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1])
	by hub.freebsd.org (Postfix) with ESMTP id 615BED74;
	Wed, 19 Jun 2013 12:21:59 +0000 (UTC)
	(envelope-from jdc@koitsu.org)
Received: from relay3-d.mail.gandi.net (relay3-d.mail.gandi.net [217.70.183.195])
	by mx1.freebsd.org (Postfix) with ESMTP id DD52F1D40;
	Wed, 19 Jun 2013 12:21:58 +0000 (UTC)
Received: from mfilter10-d.gandi.net (mfilter10-d.gandi.net [217.70.178.139])
	by relay3-d.mail.gandi.net (Postfix) with ESMTP id B7AAAA80F0;
	Wed, 19 Jun 2013 14:21:47 +0200 (CEST)
X-Virus-Scanned: Debian amavisd-new at mfilter10-d.gandi.net
Received: from relay3-d.mail.gandi.net ([217.70.183.195])
	by mfilter10-d.gandi.net (mfilter10-d.gandi.net [10.0.15.180])
	(amavisd-new, port 10024) with ESMTP id IDUPL15MSMFb;
	Wed, 19 Jun 2013 14:21:46 +0200 (CEST)
X-Originating-IP: 76.102.14.35
Received: from jdc.koitsu.org (c-76-102-14-35.hsd1.ca.comcast.net [76.102.14.35])
	(Authenticated sender: jdc@koitsu.org)
	by relay3-d.mail.gandi.net (Postfix) with ESMTPSA id 7969EA80C0;
	Wed, 19 Jun 2013 14:21:45 +0200 (CEST)
Received: by icarus.home.lan (Postfix, from userid 1000)
	id 6BAA773A1C; Wed, 19 Jun 2013 05:21:43 -0700 (PDT)
Date: Wed, 19 Jun 2013 05:21:43 -0700
From: Jeremy Chadwick
To: Adam Strohl
Subject: Re: shutdown -r / shutdown -h / reboot all hang and don't cleanly dismount
Message-ID: <20130619122143.GA70813@icarus.home.lan>
References: <51C1979D.3010305@ateamsystems.com>
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
In-Reply-To: <51C1979D.3010305@ateamsystems.com>
User-Agent: Mutt/1.5.21 (2010-09-15)
Cc: freebsd-stable@freebsd.org
X-BeenThere: freebsd-stable@freebsd.org
X-Mailman-Version: 2.1.14
Precedence: list
List-Id: Production branch of FreeBSD source code
X-List-Received-Date: Wed, 19 Jun 2013 12:21:59 -0000

On Wed, Jun 19, 2013 at 06:35:57PM +0700, Adam Strohl wrote:
> Hello -STABLE@,
>
> So I've seen this situation seemingly randomly on a number of both
> physical 9.1 boxes as well as VMs for I would say 6-9 months at
> least. I finally have a physical box here that reproduces it
> consistently that I can reboot easily (ie; not a production/client
> server).
>
> No matter what I do:
>
> reboot
> shutdown -p
> shutdown -r
>
> This specific server will stop at "All buffers synced" and not
> actually power down or reboot. KB input seems to be ignored. This
> server is a ZFS NAS (with GMIRROR for boot blocks) but the other
> boxes which show this are using GMIRRORs for root/swap/boot (no
> ZFS).
>
> Here is what happens on the console: http://i.imgur.com/1H8JMyB.jpg
>
> When I reset the server it appears that disks were not dismounted
> cleanly ... on this ZFS box it comes back quick because ZFS is good
> like that but on the other servers with GMIRROR roots rebuilding the
> GMIRROR and fscking at the same time is murder on the
> disk/performance until it finishes.

1. You mention "as well as VMs". Anything under a "virtual machine" or
   under a hypervisor is going to be very, very, **VERY** different than
   bare metal. So I hope the issues you're talking about above are on
   bare metal -- I will assume so.

2. We need to know what version of "9.1" you're using, i.e. 9.1-RELEASE.
   If you use stable/9 (RELENG_9) we need to see uname -a output (you
   can hide the machine name if you want).

3. Can we please have dmesg from this machine? The controller and some
   other hardware details matter.

4. Does "sysctl hw.usb.no_shutdown_wait=1" help you?

5. Does "sysctl hw.acpi.handle_reboot=1" help you?

6. Does "sysctl hw.acpi.disable_on_reboot=1" help you?

7. If none of the above helps, can you please boot verbose mode and then
   when the system "locks up" on "shutdown -r now" take a picture of the
   VGA console?

8. Does the machine run moused(8) (check the process list please, do not
   rely on rc.conf)?

> Another interesting thing is that this particular server runs slapd
> (OpenLDAP) which, when it comes back up, has a "corrupted" DB
> (easily fixed with db_recover, but still). This might be because FS
> commits aren't happening at the end. I can even manually stop
> slapd (service slapd stop) then run sync(8) (I assume this does
> something for ZFS too) and it still comes back as hosed if I reboot
> shortly after. If I start/stop slapd it's fine. So I feel like
> there is an FS/dismount thing going on here.

sync(8) does not do what you think it does. Please read (not skim) this
entire thread starting here:

http://lists.freebsd.org/pipermail/freebsd-fs/2013-April/thread.html#16982
http://lists.freebsd.org/pipermail/freebsd-fs/2013-April/016982.html

Your problem is related to unclean shutdown; fix that and your issues go
away.

> Additional information: I also have some boxes which will reboot
> (ie; they don't freeze like some do at the end) but they don't
> dismount cleanly either and have to rebuild both GMIRROR and fsck.
> This might be a different issue, too.

Every issue needs to be handled/treated separately.

-- 
| Jeremy Chadwick                                   jdc@koitsu.org |
| UNIX Systems Administrator                http://jdc.koitsu.org/ |
| Making life hard for others since 1977.             PGP 4BD6C0CB |