Message-Id: <200408260600.i7Q60Z2m005989@gw.catspoiler.org>
Date: Wed, 25 Aug 2004 23:00:35 -0700 (PDT)
From: Don Lewis
To: noackjr@alumni.rice.edu
In-Reply-To: <412D46D3.9010900@alumni.rice.edu>
cc: freebsd-current@FreeBSD.org
cc: bettan@nerim.net
Subject: Re: reboot on freebsd 5.3-beta1

On 25 Aug, Jon Noack wrote:
> On 08/25/04 13:31, Don Lewis wrote:
>> On 25 Aug, bettan wrote:
>>> When i reboot , i have Syncing disk , vnodes remaining and numbers
>>> but it isn't quickly and i don't umount my files systems before the
>>> reboot.
>>
>> It should print a number once per second. The numbers should quickly
>> decrease to something in the low single digits. It is not unusual to see
>> the number decrease to zero and then bounce back up a couple of times.
>> Spending about 10 seconds or so at this stage is not unexpected.
>
> This is a relatively recent change in behavior to workaround the fact
> that IDE/ATA controllers/drives report that they have successfully
> written data before they actually perform the write. With the old quick
> sync, people were experiencing corruption and data loss when rebooting.
> The new behavior is much more conservative and is designed to minimize
> risk.

Before this change, the system attempted to sync all the file systems
after it disabled the syncer thread. The problem was that if soft
updates was in use and there was a lot of file system activity shortly
before system shutdown, there could be some mutually dependent file
system writes buffered that the final sync code couldn't handle. When
this happened, the system would get stuck at the "syncing disks, buffers
remaining..." stage and, after an initial decrease, the number of buffers
would stabilize at some non-zero value. Eventually the final sync code
would time out and shut down the system with the mounted file systems
marked unclean. I could easily reproduce this problem by rebooting
shortly after running mergemaster, so I got in the habit of running the
sync command three times and waiting a bit before rebooting as a
workaround.

The IDE/ATA problem only happened when powering off the system. I
*think* it might have been fixed by explicitly telling the drives to do
a cache flush. There is also a tuneable delay
(kern.shutdown.poweroff_delay) before turning off the power to give the
drives time to flush their write caches even if they ignore any explicit
flush command.

The system shutdown delay currently observed at the "Syncing disk,
vnodes remaining" stage should be roughly the same as the delay
previously seen at the "syncing disks, buffers remaining..." stage. I've
got some ideas on how to speed it up.
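
To make the old "buffers remaining" stall concrete, here is a toy model
in Python. It is only a sketch and not the kernel's final sync code: the
function name final_sync, the dependency map, and the
max_stalled_passes limit are all made up for illustration. Each pass
writes every buffer whose dependencies have already reached disk, and
the loop gives up once several passes in a row make no progress.

```python
def final_sync(deps, max_stalled_passes=3):
    """Toy model of a final sync loop (hypothetical, not kernel code).

    deps maps a buffer id to the set of buffer ids that must be
    written before it. Returns (history, clean), where history is the
    number of buffers remaining after each pass and clean says whether
    everything was written before the loop gave up.
    """
    pending = set(deps)
    history = []
    stalled = 0
    while pending and stalled < max_stalled_passes:
        # A buffer is writable once none of its dependencies is pending.
        writable = {b for b in pending if not (deps[b] & pending)}
        pending -= writable
        history.append(len(pending))
        # Count consecutive passes that wrote nothing (assumed timeout rule).
        stalled = 0 if writable else stalled + 1
    return history, not pending


if __name__ == "__main__":
    # "c" and "d" depend on each other, so this loop can never write them.
    deps = {"a": set(), "b": {"a"}, "c": {"d"}, "d": {"c"}}
    history, clean = final_sync(deps)
    print("buffers remaining per pass:", history)
    print("file systems clean:", clean)
```

With the mutually dependent pair in the example, the count drops from 3
to 2 and then stays at 2 until the loop gives up, which is the same
shape as the behavior described above: an initial decrease, a non-zero
plateau, and an unclean shutdown.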