Date: Fri, 12 Dec 2014 18:16:17 +0100
From: Walter Hop <freebsd@spam.lifeforms.nl>
To: freebsd-fs@freebsd.org
Subject: Serious FS hangs and panics on 10.1
Message-ID: <553B39FA-7DBC-4536-9FD4-11A98E0D4740@spam.lifeforms.nl>
Hi all,

As some may have read on -stable, various users are having system hangs since 10.1-RC when unmounting the root filesystem on 10.1 with UFS+softupdates. I'll recap: hangs occur for instance when /sbin/init has been meddled with, so people experience it generally after running freebsd-update. With the 10.1-p1 update, the bug and mailinglist posts got additional activity, so it's a recurring theme. I verified the problem still exists in CURRENT, and found lock order reversals which may or may not be related. (https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=195458)

Now the above problem has a simple mitigation: just disable softupdates before doing freebsd-update, and you won't hang. Okay, a little startling, but I’m still sleeping okay.

Now today, the 10.1 story seems to look a lot worse, with a 10.1 box getting back-to-back kernel panics in VFS functions. This is a box serving SVN repositories, and SVN is known to exercise a filesystem pretty thoroughly (even uncovering NTFS bugs in pre-SP1 Windows7). We’ve updated this box from 10.0 to 10.1 a week ago. The four panics that we saw (trace below), had the exact same instruction pointer and stack trace, so I'm pretty positive we're not looking at a random hardware fluke.

The last panics were spaced only minutes apart, which was pretty scary. I was fearing persistent disk corruption, but the panics stopped when... I disabled softupdates! This was my first shot, as this also solved my other stability problem on 10.1. Anyway, the machine has been stable so far.

Maybe these two problems are unrelated, it might be too early to tell, but in any case, I am getting the strong vibe that something was changed in UFS/VFS/softupdates between 10.0 and 10.1 that's possibly very problematic and has a risk of causing data loss in the future.

Our experience with 10.0 has been remarkably good (same for earlier releases for that matter... in fact I don't think I can remember the last kernel panic in production at all.. maybe on 5.2-STABLE?) So, that's why we were very happy to see 10.1; but it feels really troublesome in the filesystem department, which is very uncharacteristic for FreeBSD.

That said, I'd prefer spending some more energy on getting 10.1 working well, rather than downgrading or jumping to other systems... But I think it really needs some love. Any ideas on what we could do?

Thanks!
WH

-- 
Walter Hop | PGP key: https://lifeforms.nl/pgp

Panic:

kernel: Fatal trap 12: page fault while in kernel mode
kernel: cpuid = 0; apic id = 00
kernel: fault virtual address = 0x30058
kernel: fault code = supervisor write data, page not present
kernel: instruction pointer = 0x20:0xffffffff8090e46a
kernel: stack pointer = 0x28:0xfffffe000024d780
kernel: frame pointer = 0x28:0xfffffe000024d850
kernel: code segment = base 0x0, limit 0xfffff, type 0x1b
kernel: = DPL 0, pres 1, long 1, def32 0, gran 1
kernel: processor eflags = interrupt enabled, resume, IOPL = 0
kernel: current process = 27466 (httpd)
kernel: trap number = 12
kernel: panic: page fault
kernel: cpuid = 0
kernel: KDB: stack backtrace:
kernel: #0 0xffffffff80963000 at kdb_backtrace+0x60
kernel: #1 0xffffffff80928125 at panic+0x155
kernel: #2 0xffffffff80d24f1f at trap_fatal+0x38f
kernel: #3 0xffffffff80d25238 at trap_pfault+0x308
kernel: #4 0xffffffff80d2489a at trap+0x47a
kernel: #5 0xffffffff80d0a782 at calltrap+0x8
kernel: #6 0xffffffff8090ec35 at lf_advlock+0x45
kernel: #7 0xffffffff809b8e69 at vop_stdadvlock+0xa9
kernel: #8 0xffffffff80e44247 at VOP_ADVLOCK_APV+0xa7
kernel: #9 0xffffffff808e4919 at kern_fcntl+0xb39
kernel: #10 0xffffffff808e3d5c at kern_fcntl_freebsd+0xac
kernel: #11 0xffffffff80d25851 at amd64_syscall+0x351
kernel: #12 0xffffffff80d0aa6b at Xfast_syscall+0xfb
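
For context on the trace: frames #6 through #10 (kern_fcntl_freebsd, kern_fcntl, VOP_ADVLOCK_APV, vop_stdadvlock, lf_advlock) are the fcntl(2) advisory record locking path, entered here from the httpd process. The following is only a minimal sketch of the kind of userland call that goes down that path, assuming a Python interpreter is at hand. It is not a reproducer for the panic, and the function name and the lock file name are hypothetical, chosen for illustration.

```python
import fcntl
import os
import tempfile


def lock_and_unlock(path):
    """Take and release an exclusive POSIX advisory lock on the whole file.

    Hypothetical helper for illustration; returns True if the lock was
    acquired and released, False if the lock call failed.
    """
    fd = os.open(path, os.O_RDWR | os.O_CREAT, 0o600)
    try:
        # lockf() is a wrapper around fcntl() record locking (F_SETLK when
        # LOCK_NB is given). On FreeBSD that syscall goes through
        # kern_fcntl and VOP_ADVLOCK, the frames named in the trace above.
        fcntl.lockf(fd, fcntl.LOCK_EX | fcntl.LOCK_NB)
        fcntl.lockf(fd, fcntl.LOCK_UN)
        return True
    except OSError:
        return False
    finally:
        os.close(fd)


if __name__ == "__main__":
    # Use a throwaway directory so nothing is left behind.
    with tempfile.TemporaryDirectory() as d:
        print(lock_and_unlock(os.path.join(d, "demo.lock")))
```

Running something like this in a loop on the affected filesystem might help tell whether plain advisory locking is enough to trigger the fault, or whether it needs the concurrent load the SVN/httpd workload produces. That is a guess, not something verified here.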