Date: Thu, 30 Jul 2009 14:55:48 +0200
From: Rene Ladan <rene@freebsd.org>
To: Kostik Belousov <kostikbel@gmail.com>
Cc: freebsd-fs@freebsd.org
Subject: Re: kern/136945: [ufs] [lor] filedesc structure/ufs (poll)
Message-ID: <e890cae60907300555x63de4a0dva503171d8fe2d3e6@mail.gmail.com>
In-Reply-To: <20090730092507.GF1884@deviant.kiev.zoral.com.ua>
References: <200907271400.n6RE05Rv056472@freefall.freebsd.org>
 <200907290742.20838.jhb@freebsd.org>
 <e890cae60907290820i65abae2fracbc5ab935465089@mail.gmail.com>
 <200907291135.17569.jhb@freebsd.org>
 <e890cae60907300205v5d3d5586qe86969bd28fe8621@mail.gmail.com>
 <20090730092507.GF1884@deviant.kiev.zoral.com.ua>
[-- Attachment #1 --]
2009/7/30 Kostik Belousov <kostikbel@gmail.com>:
> On Thu, Jul 30, 2009 at 11:05:32AM +0200, Rene Ladan wrote:
>> 2009/7/29 John Baldwin <jhb@freebsd.org>:
>> > On Wednesday 29 July 2009 11:20:21 am Rene Ladan wrote:
>> >> 2009/7/29 John Baldwin <jhb@freebsd.org>:
>> >> > On Wednesday 29 July 2009 5:52:24 am Rene Ladan wrote:
>> >> >> 2009/7/28 John Baldwin <jhb@freebsd.org>:
>> >> >> > On Tuesday 28 July 2009 10:03:40 am Rene Ladan wrote:
>> >> >> >> 2009/7/28 John Baldwin <jhb@freebsd.org>:
>> >> >> >> > On Monday 27 July 2009 10:00:05 am Rene Ladan wrote:
>> >> >> >> >> The following reply was made to PR kern/136945; it has been noted by GNATS.
>> >> >> >> >>
>> >> >> >> >> From: Rene Ladan <rene@freebsd.org>
>> >> >> >> >> To: John Baldwin <jhb@freebsd.org>
>> >> >> >> >> Cc: bug-followup@freebsd.org
>> >> >> >> >> Subject: Re: kern/136945: [ufs] [lor] filedesc structure/ufs (poll)
>> >> >> >> >> Date: Mon, 27 Jul 2009 15:51:15 +0200
>> >> >> >> >>
>> >> >> >> >> 2009/7/27 John Baldwin <jhb@freebsd.org>:
>> >> >> >> >> > I would actually expect this to be the correct order for these two locks.
>> >> >> >> >> > Can you capture the output of the 'debug.witness.fullgraph' sysctl to a file?
>> >> >> >> >> >
>> >> >> >> >> Yes, see attachment. I'm still running the same 8.0-BETA2.
>> >> >> >> >
>> >> >> >> > Hmm, the attachment was eaten by a grue, can you post the file somewhere?
>> >> >> >> >
>> >> >> >> Yes, see ftp://rene-ladan.nl/pub/freebsd/kern_136945.txt
>> >> >> >
>> >> >> > Ok, it looks like it did encounter a UFS -> filedesc order at some point.  Can
>> >> >> > you patch sys/kern/subr_witness.c to add a section to the order_lists[] array
>> >> >> > after the 'ZFS locking list' and before the spin locks list that looks like
>> >> >> > this:
>> >> >> >
>> >> >> >         { "filedesc structure", &lock_class_sx },
>> >> >> >         { "ufs", &lock_class_lockmgr},
>> >> >> >         { NULL, NULL },
>> >> >> >
>> >> >> The LOR seems to be gone, previously it showed up only once right
>> >> >> after booting the system.
>> >> >>
>> >> >> But now a new LOR (according to the LOR page) seems to pop up:
>> >> >> Trying to mount root from ufs:/dev/ad0s1a
>> >> >> lock order reversal:
>> >> >>  1st 0xffffff0002a4ad80 ufs (ufs) @ /usr/src/sys/ufs/ffs/ffs_vfsops.c:1465
>> >> >>  2nd 0xffffff0002b29a48 filedesc structure (filedesc structure) @
>> >> >> /usr/src/sys/kern/kern_descrip.c:2478
>> >> >> KDB: stack backtrace:
>> >> >> db_trace_self_wrapper() at db_trace_self_wrapper+0x2a
>> >> >> _witness_debugger() at _witness_debugger+0x49
>> >> >> witness_checkorder() at witness_checkorder+0x7ea
>> >> >> _sx_xlock() at _sx_xlock+0x44
>> >> >> mountcheckdirs() at mountcheckdirs+0x80
>> >> >> vfs_donmount() at vfs_donmount+0xfbf
>> >> >> kernel_mount() at kernel_mount+0xa1
>> >> >> vfs_mountroot_try() at vfs_mountroot_try+0x177
>> >> >> vfs_mountroot() at vfs_mountroot+0x47d
>> >> >> start_init() at start_init+0x62
>> >> >> fork_exit() at fork_exit+0x12a
>> >> >> fork_trampoline() at fork_trampoline+0xe
>> >> >> --- trap 0, rip = 0, rsp = 0xffffff800001ad30, rbp = 0 ---
>> >> >>
>> >> >> The output of `df' and `mount' looks ok.
>> >> >
>> >> > Yes, this is the "real" LOR as "filedesc" -> "ufs" in the poll() case should
>> >> > be the normal order.  I believe this should fix it.  mountcheckdirs() doesn't
>> >> > need the vnodes locked, it just needs the caller to hold references on them
>> >> > so they aren't recycled:
>> >> >
>> >> > --- //depot/projects/smpng/sys/kern/vfs_mount.c#96
>> >> > +++ /home/jhb/work/p4/smpng/sys/kern/vfs_mount.c
>> >> > @@ -1069,9 +1069,10 @@
>> >> >                 vfs_event_signal(NULL, VQ_MOUNT, 0);
>> >> >                 if (VFS_ROOT(mp, LK_EXCLUSIVE, &newdp))
>> >> >                         panic("mount: lost mount");
>> >> > +               VOP_UNLOCK(newdp, 0);
>> >> > +               VOP_UNLOCK(vp, 0);
>> >> >                 mountcheckdirs(vp, newdp);
>> >> > -               vput(newdp);
>> >> > -               VOP_UNLOCK(vp, 0);
>> >> > +               vrele(newdp);
>> >> >                 if ((mp->mnt_flag & MNT_RDONLY) == 0)
>> >> >                         error = vfs_allocate_syncvnode(mp);
>> >> >                 vfs_unbusy(mp);
>> >> >
>> >> The LOR is still present, but at a different place without the
>> >> mountcheckdirs() call (not on the LOR page either):
>> >
>> > Ok, try this patch as well:
>> >
>> > --- //depot/projects/smpng/sys/kern/vfs_mount.c#97
>> > +++ /home/jhb/work/p4/smpng/sys/kern/vfs_mount.c
>> > @@ -1481,6 +1481,8 @@
>> >         if (VFS_ROOT(TAILQ_FIRST(&mountlist), LK_EXCLUSIVE, &rootvnode))
>> >                 panic("Cannot find root vnode");
>> >
>> > +       VOP_UNLOCK(rootvnode, 0);
>> > +
>> >         p = curthread->td_proc;
>> >         FILEDESC_XLOCK(p->p_fd);
>> >
>> > @@ -1496,8 +1498,6 @@
>> >
>> >         FILEDESC_XUNLOCK(p->p_fd);
>> >
>> > -       VOP_UNLOCK(rootvnode, 0);
>> > -
>> >         EVENTHANDLER_INVOKE(mountroot);
>> >  }
>> >
>> Still no luck, I now get a LOR that is similar to LOR 281 right after booting:
>>
>> lock order reversal:
>>  1st 0xffffff0002c2c7f8 ufs (ufs) @ /usr/src/sys/kern/vfs_subr.c:2083
>>  2nd 0xffffff0002b2a248 filedesc structure (filedesc structure) @
>> /usr/src/sys/kern/vfs_syscalls.c:3776
>> KDB: stack backtrace:
>> db_trace_self_wrapper() at db_trace_self_wrapper+0x2a
>> _witness_debugger() at _witness_debugger+0x49
>> witness_checkorder() at witness_checkorder+0x7ea
>> _sx_slock() at _sx_slock+0x44
>> kern_mkdirat() at kern_mkdirat+0x201
>> syscall() at syscall+0x1af
>> Xfast_syscall() at Xfast_syscall+0xe1
>> --- syscall (136, FreeBSD ELF64, mkdir), rip = 0x800729dac, rsp =
>> 0x7fffffffec88, rbp = 0x7fffffffef66 ---
>
> Remove the FILEDESC_SLOCK()/FILEDESC_SUNLOCK() calls from kern_mkdirat().
>
I removed the two lines at sys/kern/vfs_syscalls.c (3776 and 3778), but there
still seem to be some LORs (attached). The two LORs about the reboot call are
from before Kostik's patch.

[-- Attachment #2 --]
FreeBSD 8.0-BETA2 #2: Thu Jul 30 09:xx:xx CEST 2009

lock order reversal: (#276)
lock order reversal:
 1st 0xffffff0002b50270 ufs (ufs) @ /usr/src/sys/kern/vfs_mount.c:1200
 2nd 0xffffffff80bdcca0 allproc (allproc) @ /usr/src/sys/kern/kern_descrip.c:2473
KDB: stack backtrace:
db_trace_self_wrapper() at db_trace_self_wrapper+0x2a
_witness_debugger() at _witness_debugger+0x49
witness_checkorder() at witness_checkorder+0x7ea
_sx_slock() at _sx_slock+0x44
mountcheckdirs() at mountcheckdirs+0x3f
dounmount() at dounmount+0x477
vfs_unmountall() at vfs_unmountall+0x54
boot() at boot+0x818
mkdumpheader() at mkdumpheader
syscall() at syscall+0x1af
Xfast_syscall() at Xfast_syscall+0xe1
--- syscall (55, FreeBSD ELF64, reboot), rip = 0x80078f85c, rsp = 0x7fffffffeab8, rbp = 0 ---

FreeBSD 8.0-BETA2 #3: Thu Jul 30 13:29:46 CEST 2009

lock order reversal:
 1st 0xffffff00510a5d80 ufs (ufs) @ /usr/src/sys/kern/kern_exec.c:570
 2nd 0xffffff0002dfe248 filedesc structure (filedesc structure) @ /usr/src/sys/kern/kern_descrip.c:1864
KDB: stack backtrace:
db_trace_self_wrapper() at db_trace_self_wrapper+0x2a
_witness_debugger() at _witness_debugger+0x49
witness_checkorder() at witness_checkorder+0x7ea
_sx_xlock() at _sx_xlock+0x44
setugidsafety() at setugidsafety+0x40
kern_execve() at kern_execve+0xf22
execve() at execve+0x38
syscall() at syscall+0x1af
Xfast_syscall() at Xfast_syscall+0xe1
--- syscall (59, FreeBSD ELF64, execve), rip = 0x8007c3d0c, rsp = 0x7fffffffec48, rbp = 0x7fffffffed50 ---

lock order reversal: (like #261)
 1st 0xffffff8029512438 bufwait (bufwait) @ /usr/src/sys/kern/vfs_bio.c:2558
 2nd 0xffffff0002c44400 dirhash (dirhash) @ /usr/src/sys/ufs/ufs/ufs_dirhash.c:285
KDB: stack backtrace:
db_trace_self_wrapper() at db_trace_self_wrapper+0x2a
_witness_debugger() at _witness_debugger+0x49
witness_checkorder() at witness_checkorder+0x7ea
_sx_xlock() at _sx_xlock+0x44
ufsdirhash_acquire() at ufsdirhash_acquire+0x29
ufsdirhash_move() at ufsdirhash_move+0x19
ufs_direnter() at ufs_direnter+0x4a9
ufs_makeinode() at ufs_makeinode+0x2a7
VOP_CREATE_APV() at VOP_CREATE_APV+0x8d
vn_open_cred() at vn_open_cred+0x406
kern_openat() at kern_openat+0x163
syscall() at syscall+0x1af
Xfast_syscall() at Xfast_syscall+0xe1
--- syscall (5, FreeBSD ELF64, open), rip = 0x8009dadec, rsp = 0x7fffffffe5c8, rbp = 0x1b6 ---
Want to link to this message? Use this URL: <https://mail-archive.FreeBSD.org/cgi/mid.cgi?e890cae60907300555x63de4a0dva503171d8fe2d3e6>
