Date: Sat, 7 Mar 2009 22:55:50 GMT
From: Dieter <freebsd@sopwith.solgatos.com>
To: freebsd-gnats-submit@FreeBSD.org
Subject: kern/132397: reboot causes filesystem corruption
Message-ID: <200903072255.n27Mtos4024466@www.freebsd.org>
Resent-Message-ID: <200903072300.n27N05RS079446@freefall.freebsd.org>

>Number: 132397
>Category: kern
>Synopsis: reboot causes filesystem corruption
>Confidential: no
>Severity: critical
>Priority: high
>Responsible: freebsd-bugs
>State: open
>Quarter:
>Keywords:
>Date-Required:
>Class: sw-bug
>Submitter-Id: current-users
>Arrival-Date: Sat Mar 07 23:00:05 UTC 2009
>Closed-Date:
>Last-Modified:
>Originator: Dieter
>Release: 7.1
>Organization:
>Environment:
7.1-RELEASE amd64
>Description:
FreeBSD 7.1 amd64
soft updates, disk write cache off

System was running fine. I typed reboot. It printed out a *very* long
string of numbers of buffers left (see below); usually the string is
very short (one line). It gave up with 3 buffers left to go. (why?)
A filesystem is now corrupt and fsck causes a panic.

Through the miracle of soft updates, filesystems can now survive power
outages and the reset button with no damage, but an orderly reboot
causes unfixable corruption?

THIS IS COMPLETELY UNACCEPTABLE AND NEEDS TO BE FIXED!

Q1) What "timed out"? (see below) kproc_shutdown() maybe?

Q2) Why does the number of buffers sometimes go up?

Q3) Why doesn't it get all the buffers synced out to disk?
It is like something is still generating dirty buffers, but
aren't all processes killed before it gets to this point?

I suppose I can crank up

    static int kproc_shutdown_wait = 60;

in kern_shutdown.c, but that just gives it more time. It could still fail.

kern_kthread.c says:

     * Advise a kernel process to suspend (or resume) in its main loop.
     * Participation is voluntary.

Voluntary, eh?

Q4) Could something refuse to suspend and cause this? Why is this allowed?

In kern_shutdown.c:

    /*
     * With soft updates, some buffers that are
     * written will be remarked as dirty until other
     * buffers are written.
     */
    for (iter = pbusy = 0; iter < 20; iter++) {

I suppose the "20" could be increased, but again, it could still fail.

Q5) Does this buffer syncing not follow the soft updates protocol?
If not, why not?

There is something fundamentally wrong here, but I don't know what.
Barring a disk write error, which is not the case here, this should
never happen.

Waiting (max 60 seconds) for system process `syncer' to stop...
Syncing disks, vnodes remaining...1 3 1 3 2 0 1 2 0 1 0 2 2 0 2 2 2 2 2 2 0 2 0 2 2 2 2 0 2 2 2 2 0 1 0 2 0 2 2 2 2 0 2 0 2 1 0 2 2 0 2 0 1 2 2 2 1 0 1 2 2 timed out
Syncing disks, buffers remaining... 40 40 34 33 33 17 17 11 10 10 2 1 1 37 39 17 17 5 5 39 40 23 23 8 8 2 1 1 37 38 17 17 5 5 40 41 23 23 8 7 7 1 1 34 35 17 17 7 6 6 1 31 35 23 22 22 21 21 9 9 2 1 1 27 26 26 21 21 21 1 1 27 24 24 21 21 10 1 0 4 3 3 30 28 28 22 21 21 21 30 22 21 21 21 1 1 27 25 25 22 22 10 10 3 3 27 24 2 3 23 21 21 7 6 6 30 27 26 26 21 21 21 31 22 22 21 21 8 7 7 1 1 26 22 22 21 21 9 9 2 2 28 25 25 21 21 21 1 1 25 22 22 21 21 9 9 2 2 28 26 26 21 21 21 1 1 26 24 2 4 21 21 21 30 22 21 21 21 3 2 2 28 26 25 25 21 21 21 1 1 26 23 23 21 21 21 31 22 22 21 21 10 9 9 2 2 28 25 24 24 21 21 10 10 2 2 29 26 26 21 21 21 1 1 27 24 24 21 21 21 31 23 22 22 21 21 7 7 31 30 29 29 22 22 21 21 10 9 9 3 2 2 28 26 25 25 21 21 21 1 1 26 23 23 21 21 10 10 3 3 30 27 27 21 21 21 2 2 29 26 26 21 21 21 1 31 36 23 23 21 21 10 10 3 3 30 28 27 27 21 21 21 1 1 26 23 23 21 21 10 10 2 1 1 27 24 23 23 21 21 6 6 31 28 27 27 21 21 21 2 1 1 27 25 25 21 21 21 29 21 21 21 2 2 28 25 25 21 21 21 31 22 22 21 21 9 9 3 2 2 29 27 26 26 21 21 21 1 1 27 24 24 21 21 21 31 23 22 22 21 21 9 9 3 3 30 27 27 21 21 21 1 1 27 24 24 22 22 21 21 5 5 31 28 28 21 21 21 3 3 28 25 25 21 21 21 1 1 26 21 21 21 2 2 29 26 25 25 21 21 10 10 2 2 28 24 24 21 21 10 10 1 1 27 24 23 23 21 21 9 8 8 1 1 27 24 24 21 21 10 9 9 2 1 1 25 22 22 21 21 9 9 3 2 2 28 24 23 23 21 21 9 9 3 3 30 26 26 21 21 21 1 1 26 22 22 21 21 10 10 4 3 3 30 27 26 26 21 21 21 31 22 21 21 21 4 4 30 26 26 21 21 21 1 1 27 23 23 21 21 10 10 3 3 28 25 25 21 21 21 1 31 35 22 22 21 21 10 1 0 3 3 28 26 25 25 21 21 21 1 1 26 23 23 21 21 21 31 22 21 21 21 3 3 29 26 26 21 21 21 1 1 26 23 23 21 21 7 7 1 1 26 23 23 21 21 9 9 1 40 46 37 37 33 32 32 26 26
14 14 5 5 40 40 24 23 23 6 6 1 38 44 36 36 31 31 24 24 10 10 3 3 39 41 24 24 14 14 5 5 40 41 24 24 9 9 3 3 39 41 25 25 10 9 9 4 4 40 41 25 25 10 10 2 2 38 39 2 3 23 7 7 1 1 36 37 8 8 1 1 38 37 37 32 31 31 25 24 24 9 9 3 3 36 36 22 16 16 6 6 31 28 28 22 21 21 21 31 22 21 21 21 31 22 21 21 21 3 2 2 28 25 24 24 21 21 21 1 1 26 22 22 21 21 9 9 2 1 1 27 25 24 24 21 21 21 1 1 25 22 22 21 21 9 9 3 2 2 29 26 25 25 21 21 21 1 1 26 23 23 21 21 10 10 3 3 29 26 26 21 21 21 1 1 26 23 23 2 1 21 21 31 23 22 22 21 21 10 10 4 4 29 26 26 21 21 21 1 1 26 23 23 21 21 21 1 31 35 22 22 21 21 9 9 1 1 27 24 23 23 21 21 21 1 1 26 22 22 21 21 9 8 8 2 1 1 27 2 5 24 24 21 21 7 7 29 27 26 26 21 21 21 31 22 22 21 21 8 7 7 1 1 26 23 22 22 21 2 1 10 10 4 4 30 27 27 21 21 21 1 1 27 24 24 21 21 21 1 31 35 22 21 21 21 4 3 3 20 17 16 16 15 14 15 5 5 2 2 3 3 3 3 5 3 3 5 3 3 3 5 3 2 5 3 3 4 3 3 2 1 1 3 4 3 3 3 3 3 1 1 3 3 5 3 3 4 4 3 3 3 3 3 5 5 3 2 3 3 3 3 5 3 2 5 3 3 4 3 3 4 3 3 5 4 4 3 2 5 3 2 3 2 2 3 3 3 3 5 3 3 4 3 3 5 4 3 3 3 5 3 2 3 3 3 3 5 3 3 3 5 4 4 3 3 3 3 5 1 1 3 3 3 3 2 3 3 2 2 3 3 3 3 3 3 3 2 2 5 3 3 3 4 3 3 3 3 2 3 2 5 3 3 3 3 3 3 2 2 2 3 2 3 3 2 2 3 3 3 3 2 2 3 3 3 3 3 3 3 3 3 2 2 5 3 3 3 3 2 2 3 5 3 3 3 3 3 2 2 3 3 2 2 2 2 3 3 2 2 3 2 2 5 3 2 2 3 5 1 2 3 3 3 4 5 3 2 2 3 2 2 3 3 2 2 5 3 3 3 3 2 3 5 3 3 5 3 3 2 2 2 3 3 3 3 3 3 2 3 3 5 3 2 5 3 2 3 3 3 3 3 2 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 6 3 3 3 3 5 3 3 3 3 3 3 3 3 2 2 3 5 3 2 5 3 2 3 3 3 3 3 3 3 3 3 2 3 3 3 3 5 3 2 5 3 2 3 3 3 3 5 3 2 3 3 3 3 3 3 3 3 3 5 3 3 5 3 3 5 5 3 3 3 3 3 5 5 3 3 5 3 3 5 3 3 5 3 3 5 1 1 3 4 5 3 3 3 5 3 3 3 3 3 3 3 4 3 2 3 3 3 5 3 2 3 3 3 4 3 3 3 3 3 3 3 4 3 2 3 3 3 3 3 3 3 3 3 3 3 5 3 3 2 2 3 5 3 2 5 3 2 3 3 5 3 3 3 3 3 5 2 5 3 3 4 3 3 3 5 3 2 5 3 3 3 3 5 3 3 3 4 3 3 3 3 3 5 3 3 3 3 4 3 3 3 5 3 3 5 4 4 5 3 2 5 3 2 3 5 3 3 3 3 3 3 3 3 3 3 3 5 3 3 1 3 3 5 4 4 3 2 3 3 3 3 3 5 3 3 5 3 3 4 3 2 3 3 5 3 3 2 1 1 5 3 3 3 5 4 4 3 2 5 3 3 3 3 3 3 3 5 3 2 3 2 3 3 3 3 5 3 3 3 3 3 3 3 3 3 3 3 3 3 3 5 3 3 3 5 3 3 3 4 3 3 4 3 3 
3 3 5 2 3 3 3 2 3 3 3 3 5 3 2 3 4 3 2 3 1 1 3 4 4 3 3 3 4 3 3 3 3 2 2 4 3 3 4 3 3 3 2 2 3 3 3 3 3 2 2 3 3 3 3 2 2 4 3 3 3 2 2 3 3 3 3 3 3 3 2 3 3 2 2 3 3 2 2 3 3 3 2 2 3 3 2 2 3 3 2 3 2 2 3 3 3 3 3 3 3 3 2 2 2 3 3 3 3 4 3 3 2 2 3 3 3 3 3 3 2 3 4 2 2 1 1 3 4 3 3 3 3 2 2 4 3 3 3 3 3 2 2 3 2 4 2 2 2 3 3 3 3 2 2 3 3 3 3 3 3 3 3 3 3 3 3

I wasn't able to capture all of it.
>How-To-Repeat:
unknown
>Fix:
unknown
>Release-Note:
>Audit-Trail:
>Unformatted:
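
For readers following Q2 and Q3, here is a minimal userland sketch of a
bounded retry loop shaped like the `for (iter = pbusy = 0; iter < 20; iter++)`
line quoted from kern_shutdown.c. It is not the kernel code. The names
sync_retry_model, samples, and used are hypothetical, and the rule that the
counter resets whenever the busy count drops is an assumption that is not
quoted in this report. If that assumption holds, it would explain how the
output can run far past 20 entries and still end with buffers left over.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Model of a "sync until clean" loop with a bounded retry budget.
 * samples[] stands in for successive busy-buffer counts; in a real
 * kernel these would come from scanning the buffer cache.
 *
 * Returns the last busy count seen: 0 means everything was synced,
 * nonzero means the loop gave up (or ran out of samples).
 * *used receives the number of samples consumed.
 */
static int
sync_retry_model(const int *samples, size_t nsamples, int max_iter,
    size_t *used)
{
	int iter, pbusy, nbusy = 0;
	size_t i = 0;

	assert(samples != NULL && used != NULL);

	for (iter = pbusy = 0; iter < max_iter && i < nsamples; iter++) {
		nbusy = samples[i++];
		if (nbusy == 0)
			break;		/* all buffers written */
		/* ASSUMPTION: any progress restarts the retry budget. */
		if (nbusy < pbusy)
			iter = 0;
		pbusy = nbusy;
	}
	*used = i;
	return (nbusy);
}
```

Under this model, a count that keeps dipping and rising again (buffers
being re-marked dirty) restarts the budget every time it dips, so the loop
can print a very long list. Once the count stops dropping, for example 25
consecutive samples of 3 with max_iter set to 20, the model consumes 20
samples and gives up with 3 still busy.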