From: Adam Strohl
Date: Wed, 19 Jun 2013 18:35:57 +0700
To: freebsd-stable@freebsd.org
Subject: shutdown -r / shutdown -h / reboot all hang and don't cleanly dismount
Message-ID: <51C1979D.3010305@ateamsystems.com>

Hello
-STABLE@,

I've seen this situation, seemingly at random, on a number of 9.1 boxes, both physical machines and VMs, for at least 6-9 months I would say. I finally have a physical box here that reproduces it consistently and that I can reboot easily (i.e., not a production/client server).

No matter which of these I run:

    reboot
    shutdown -p
    shutdown -r

this specific server stops at "All buffers synced" and does not actually power down or reboot. Keyboard input seems to be ignored.

This server is a ZFS NAS (with GMIRROR for boot blocks), but the other boxes that show this use GMIRRORs for root/swap/boot (no ZFS).

Here is what happens on the console:
http://i.imgur.com/1H8JMyB.jpg

When I reset the server, it appears that the disks were not dismounted cleanly. On this ZFS box it comes back quickly because ZFS is good like that, but on the other servers with GMIRROR roots, rebuilding the GMIRROR and fscking at the same time is murder on disk performance until it finishes.

Another interesting thing is that this particular server runs slapd (OpenLDAP), which has a "corrupted" DB when it comes back up (easily fixed with db_recover, but still). This might be because FS commits aren't happening at the end. I can even manually stop slapd (service slapd stop), then run sync(8) (I assume this does something for ZFS too), and the DB still comes back hosed if I reboot shortly after. If I start/stop slapd it's fine. So I feel like there is an FS/dismount thing going on here.

Additional information: I also have some boxes which will reboot (i.e., they don't freeze at the end like some do), but they don't dismount cleanly either and have to both rebuild the GMIRROR and fsck. This might be a different issue, too.

Anyone have any thoughts? Let me know if I can provide more details etc.

--
Adam Strohl
http://www.ateamsystems.com/