Date: Mon, 20 Dec 2010 07:53:06 -0500
From: John Baldwin <jhb@freebsd.org>
To: freebsd-fs@freebsd.org
Subject: Re: debugging process in bovlbx state
Message-ID: <201012200753.06804.jhb@freebsd.org>
In-Reply-To: <alpine.GSO.1.10.1012191909420.640@multics.mit.edu>
References: <alpine.GSO.1.10.1012191909420.640@multics.mit.edu>

On Sunday, December 19, 2010 7:10:04 pm Benjamin Kaduk wrote:
> Hi all,
>
> I'm working on bringing the out-of-tree OpenAFS network filesystem
> up-to-date for FreeBSD 7.3-RELEASE, and I think I need some help to fix
> this bug.
> I should preface my discourse with the fact that there is a whole slew of
> lock order reversals that I haven't even tried to track down, but I do not
> believe that this hang is a deadlock since 'show alllocks' in DDB does not
> show anything that seems interesting.
>
> Any pointers for things to look at would be appreciated; more details of
> the failing case below.
>
>
> In order to get the afs kernel module to load, I needed to tweak a few
> lines of code in getpages(), as I had previously cribbed a bunch of
> changes/updates from the experimental NFS client while getting AFS to work
> on current freebsd.  In particular, vm_page_set_valid is not present in
> 7.3, so I am currently running with:
> --- a/src/afs/FBSD/osi_vnodeops.c
> +++ b/src/afs/FBSD/osi_vnodeops.c
> @@ -890,12 +890,8 @@ afs_vop_getpages(struct vop_getpages_args *ap)
>                   * Read operation filled a partial page.
>                   */
>                  m->valid = 0;
> -                vm_page_set_valid(m, 0, size - toff);
> -#ifndef AFS_FBSD80_ENV
> -                vm_page_undirty(m);
> -#else
> +                vm_page_set_validclean(m, 0, size - toff);
>                  KASSERT(m->dirty == 0, ("afs_getpages: page %p is dirty", m));
> -#endif
>              }
>
>
> But my knowledge of vm_page_* is approximately nil, so there's no reason
> to think everything was correct even before that patch.
>
> Anyway, my test case is running libarchive's configure script with source
> and destination directories in (different places in) AFS.  It only gets
> twenty lines in, ending with:
> checking for gcc option to accept ISO C89... none needed
> checking for style of include used by make... GNU
> checking dependency style of gcc...
> ^Tload: 0.04 cmd: cp 1250 [bovlbx] 0.00u 0.00
>
> procstat -kk reports:
> mega-man# procstat -kk 1250
>    PID    TID COMM             TDNAME           KSTACK
>   1250 100060 cp               -                mi_switch+0x233
> sleepq_switch+0xe9 sleepq_wait+0x44 _sleep+0x3a0 vm_object_pip_wait+0x4e
> bufobj_invalbuf+0x10e afs_GetVCache+0x2f7
>
> The call to vinvalbuf in afs_GetVCache is here:
> 1646         iheldthelock = VOP_ISLOCKED(vp, curthread);

This is probably wrong.  VOP_ISLOCKED() can return four different values:

- LK_SHARED: (someone, possibly curthread) holds a shared lock
- LK_EXCLUSIVE: curthread holds an exclusive lock
- LK_EXCLOTHER: some other thread holds an exclusive lock
- 0: no thread holds any lock.

This means if another thread has the vnode locked, you don't try to lock it. :)
Do you actually know that this routine can be called without the vnode locked
by the current thread?

--
John Baldwin
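
The four return values listed above can be illustrated with a small userspace sketch. It is not taken from OpenAFS or the FreeBSD kernel: the enum constants and both function names are hypothetical stand-ins, and it assumes the surrounding code treats any nonzero result as "I already hold the lock", which is what the reply suggests but the quoted excerpt does not show.

```c
#include <stdbool.h>

/*
 * Stand-in status codes for illustration only.  Kernel code would use
 * LK_SHARED, LK_EXCLUSIVE and LK_EXCLOTHER from <sys/lockmgr.h>; the
 * numeric values here are arbitrary, apart from 0 meaning "unlocked".
 */
enum demo_lockstatus {
	DEMO_UNLOCKED = 0,	/* no thread holds any lock */
	DEMO_SHARED,		/* someone, possibly curthread, holds a shared lock */
	DEMO_EXCLUSIVE,		/* curthread holds an exclusive lock */
	DEMO_EXCLOTHER		/* some other thread holds an exclusive lock */
};

/* Assumed problematic pattern: any nonzero status is read as "I hold the lock". */
bool
naive_i_hold_lock(enum demo_lockstatus st)
{
	return (st != DEMO_UNLOCKED);
}

/* Only the exclusive-by-curthread status proves the current thread is the owner. */
bool
curthread_holds_exclusive(enum demo_lockstatus st)
{
	return (st == DEMO_EXCLUSIVE);
}
```

With DEMO_EXCLOTHER the naive check reports the lock as held even though another thread owns it, so the caller would skip locking; the stricter check does not. The shared case stays ambiguous in either version, since the status alone does not say which thread holds the shared lock.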