Date: Wed, 10 Apr 2013 15:02:56 -0500
From: Kevin Day <toasty@dragondata.com>
To: "freebsd-fs@FreeBSD.org Filesystems" <freebsd-fs@freebsd.org>
Subject: Does sync(8) really flush everything? Lost writes with journaled SU after sync+power cycle
Message-ID: <87CC14D8-7DC6-481A-8F85-46629F6D2249@dragondata.com>

Working with an environment where a system (with journaled soft-updates) is going to be notified that it's going to be losing power shortly, and needs to shut down daemons and flush everything to disk. It doesn't actually shut down though, because the "power down now" command may get cancelled and we need to bring things back up. My understanding was that we could call sync(8), then just wait for the power to drop.

The problem is that we were frequently losing the last 30-60 seconds' worth of filesystem changes prior to the shutdown, i.e. newly created directories would disappear or fsck would reclaim them and throw them into lost+found.

I confirmed that there is no caching disk controller, and write caching is disabled on the drives themselves, and the problem continued.

On a whim, after running sync(8) once and waiting 10 seconds, I did "mount -u -o ro -f /" to force the filesystem into read-only mode. It took about 8 seconds to finish, gstat showed a lot of write activity, and SIGINFO on the mount command showed:

    load: 0.01 cmd: mount 15775 [biowr] 3.62r 0.00u 0.55s 5% 1644k
    load: 0.03 cmd: mount 15775 [runnable] 4.41r 0.00u 0.65s 6% 1644k
    load: 0.03 cmd: mount 15775 [biowr] 5.00r 0.00u 0.72s 6% 1644k
    load: 0.03 cmd: mount 15775 [biowr] 5.70r 0.00u 0.80s 6% 1644k
    load: 0.03 cmd: mount 15775 [biowr] 6.03r 0.00u 0.84s 6% 1644k
    load: 0.03 cmd: mount 15775 [running] 6.27r 0.00u 0.87s 6% 1644k
    load: 0.03 cmd: mount 15775 [biowr] 6.51r 0.00u 0.90s 7% 1644k
    load: 0.03 cmd: mount 15775 [biowr] 6.69r 0.00u 0.92s 6% 1644k
    load: 0.03 cmd: mount 15775 [biowr] 6.90r 0.00u 0.94s 6% 1644k
    load: 0.03 cmd: mount 15775 [biowr] 7.04r 0.00u 0.96s 7% 1644k
    load: 0.03 cmd: mount 15775 [biowr] 7.20r 0.00u 0.98s 7% 1644k

If sync's man page is true ("force completion of pending disk writes (flush cache)"), and there is zero filesystem activity occurring, shouldn't that be enough to ensure no corruption after a power cycle? If sync really is flushing everything, what's all the write activity happening when degrading from rw to ro?

Is there a better way to get things into a stable state on disk, yet not fully shut down, so that we can recover from this if the shutdown order is cancelled?

For me, this is easily reproducible with:

    mkdir /root/test
    sync
    sleep 10
    (hit reset button)

The problem doesn't happen with:

    mkdir /root/test
    mount -u -o ro -f /
    (hit reset button)

It's great that we're not ending up in an inconsistent state, but I was expecting sync to prevent this.

-- Kevin
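
One alternative to a global sync that might be worth trying is an explicit fsync(2) on the new directory and on its parent, so that both the directory and the entry naming it are pushed to disk. The sketch below is in Python only for brevity. The helper name mkdir_durable is hypothetical, and whether this actually avoids the lost writes on journaled soft-updates is an assumption that has not been tested here.

```python
import os


def _fsync_dir(path):
    """Open a directory read-only and fsync it, closing the fd afterwards."""
    fd = os.open(path, os.O_RDONLY)
    try:
        os.fsync(fd)
    finally:
        os.close(fd)


def mkdir_durable(path, mode=0o755):
    """Create a directory, then fsync it and its parent directory.

    Hypothetical helper for illustration; assumes the filesystem honours
    fsync(2) on directory file descriptors.
    """
    os.mkdir(path, mode)
    # Flush the new directory's own metadata.
    _fsync_dir(path)
    # Flush the parent so the entry naming the new directory is written too.
    _fsync_dir(os.path.dirname(os.path.abspath(path)))
```

For the reproduction case this would be mkdir_durable("/root/test") in place of the mkdir and sync steps, followed by the same reset-button test.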