Date: Sun, 19 Dec 2010 19:10:04 -0500 (EST)
From: Benjamin Kaduk <kaduk@MIT.EDU>
To: freebsd-fs@freebsd.org
Subject: debugging process in bovlbx state
Message-ID: <alpine.GSO.1.10.1012191909420.640@multics.mit.edu>
Hi all,

I'm working on bringing the out-of-tree OpenAFS network filesystem up to date for FreeBSD 7.3-RELEASE, and I think I need some help to fix this bug. I should preface this by noting that there is a whole slew of lock order reversals that I haven't even tried to track down, but I do not believe that this hang is a deadlock, since 'show alllocks' in DDB does not show anything that seems interesting. Any pointers for things to look at would be appreciated; more details of the failing case are below.

In order to get the afs kernel module to load, I needed to tweak a few lines of code in getpages(), as I had previously cribbed a bunch of changes/updates from the experimental NFS client while getting AFS to work on current FreeBSD. In particular, vm_page_set_valid is not present in 7.3, so I am currently running with:

--- a/src/afs/FBSD/osi_vnodeops.c
+++ b/src/afs/FBSD/osi_vnodeops.c
@@ -890,12 +890,8 @@ afs_vop_getpages(struct vop_getpages_args *ap)
                 * Read operation filled a partial page.
                 */
                m->valid = 0;
-               vm_page_set_valid(m, 0, size - toff);
-#ifndef AFS_FBSD80_ENV
-               vm_page_undirty(m);
-#else
+               vm_page_set_validclean(m, 0, size - toff);
                KASSERT(m->dirty == 0, ("afs_getpages: page %p is dirty", m));
-#endif
            }

But my knowledge of vm_page_* is approximately nil, so there's no reason to think everything was correct even before that patch.

Anyway, my test case is running libarchive's configure script with source and destination directories in (different places in) AFS. It only gets twenty lines in, ending with:

checking for gcc option to accept ISO C89... none needed
checking for style of include used by make... GNU
checking dependency style of gcc...
^Tload: 0.04 cmd: cp 1250 [bovlbx] 0.00u 0.00

procstat -kk reports:

mega-man# procstat -kk 1250
  PID    TID COMM             TDNAME           KSTACK
 1250 100060 cp               -                mi_switch+0x233 sleepq_switch+0xe9 sleepq_wait+0x44 _sleep+0x3a0 vm_object_pip_wait+0x4e bufobj_invalbuf+0x10e afs_GetVCache+0x2f7

The call to vinvalbuf in afs_GetVCache is here:

    iheldthelock = VOP_ISLOCKED(vp, curthread);
    if (!iheldthelock)
        vn_lock(vp, LK_EXCLUSIVE | LK_RETRY, curthread);
    AFS_GUNLOCK();
    vinvalbuf(vp, V_SAVE, curthread, PINOD, 0);
    AFS_GLOCK();
    if (!iheldthelock)
        VOP_UNLOCK(vp, LK_EXCLUSIVE, curthread);

Which is not very enlightening. I kind of suspect that some flags on the bufobj were erroneously set elsewhere and it is only now popping up.

afs_GetVCache is in this source file:
http://git.openafs.org/?p=openafs.git;a=blob;f=src/afs/afs_vcache.c;h=26ed2c2be271048509425583f0cc2de6c4166c4b;hb=HEAD
and {get,put}pages in this:
http://git.openafs.org/?p=openafs.git;a=blob;f=src/afs/FBSD/osi_vnodeops.c;h=7ae6571adb74d69cfe25e3190ade3b22dc8cdab8;hb=HEAD

Thanks,

Ben Kaduk