Date:      Sun, 19 Dec 2010 19:10:04 -0500 (EST)
From:      Benjamin Kaduk <kaduk@MIT.EDU>
To:        freebsd-fs@freebsd.org
Subject:   debugging process in bovlbx state
Message-ID:  <alpine.GSO.1.10.1012191909420.640@multics.mit.edu>

Hi all,

I'm working on bringing the out-of-tree OpenAFS network filesystem 
up-to-date for FreeBSD 7.3-RELEASE, and I think I need some help to fix 
this bug.
I should preface my discourse with the fact that there is a whole slew of 
lock order reversals that I haven't even tried to track down, but I do not 
believe that this hang is a deadlock, since 'show alllocks' in DDB does not 
show anything that seems interesting.

Any pointers for things to look at would be appreciated; more details of 
the failing case below.


In order to get the afs kernel module to load, I needed to tweak a few 
lines of code in getpages(), as I had previously cribbed a bunch of 
changes/updates from the experimental NFS client while getting AFS to work 
on FreeBSD-CURRENT.  In particular, vm_page_set_valid is not present in 
7.3, so I am currently running with:
--- a/src/afs/FBSD/osi_vnodeops.c
+++ b/src/afs/FBSD/osi_vnodeops.c
@@ -890,12 +890,8 @@ afs_vop_getpages(struct vop_getpages_args *ap)
               * Read operation filled a partial page.
               */
              m->valid = 0;
-           vm_page_set_valid(m, 0, size - toff);
-#ifndef AFS_FBSD80_ENV
-           vm_page_undirty(m);
-#else
+           vm_page_set_validclean(m, 0, size - toff);
              KASSERT(m->dirty == 0, ("afs_getpages: page %p is dirty", m));
-#endif
          }


But my knowledge of vm_page_* is approximately nil, so there's no reason 
to think everything was correct even before that patch.
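
For context on what the patched call is supposed to record, here is a small user-space sketch of the bookkeeping as I understand it: m->valid holds one bit per DEV_BSIZE chunk of the page, and the (base, size) arguments pick which bits to set. The names page_bits, MODEL_PAGE_SIZE and MODEL_DEV_BSHIFT are hypothetical, the 4096-byte page and 512-byte chunk sizes are assumptions, and the mask arithmetic is modeled on my recollection of the kernel rather than copied from it.

```c
#include <assert.h>

#define MODEL_PAGE_SIZE  4096	/* assumed page size */
#define MODEL_DEV_BSHIFT 9	/* assumed 512-byte DEV_BSIZE chunks */

/*
 * Return a mask with one bit set for every 512-byte chunk touched by
 * the byte range [base, base + size) within a single page.
 */
static unsigned int
page_bits(int base, int size)
{
	int first_bit, last_bit;

	assert(base >= 0 && size >= 0 && base + size <= MODEL_PAGE_SIZE);
	if (size == 0)
		return (0);
	first_bit = base >> MODEL_DEV_BSHIFT;
	last_bit = (base + size - 1) >> MODEL_DEV_BSHIFT;
	/* Bits first_bit..last_bit inclusive. */
	return ((2u << last_bit) - (1u << first_bit));
}
```

Under these assumptions a full page gives 0xff, while a read that filled only the first 1000 bytes gives 0x3: a partially covered chunk still counts as touched, so in this model the remainder of that chunk has to hold sane contents for the bit to be truthful.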

Anyway, my test case is running libarchive's configure script with source 
and destination directories in (different places in) AFS.  It only gets 
twenty lines in, ending with:
checking for gcc option to accept ISO C89... none needed
checking for style of include used by make... GNU
checking dependency style of gcc...
^Tload: 0.04 cmd: cp 1250 [bovlbx] 0.00u 0.00

procstat -kk reports:
mega-man# procstat -kk 1250
    PID    TID COMM             TDNAME           KSTACK
   1250 100060 cp               -                mi_switch+0x233 
sleepq_switch+0xe9 sleepq_wait+0x44 _sleep+0x3a0 vm_object_pip_wait+0x4e 
bufobj_invalbuf+0x10e afs_GetVCache+0x2f7
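
To make the stack above concrete, here is a user-space model of the wait I believe the thread is stuck in. It assumes that vm_object_pip_wait() sleeps for as long as the object's paging-in-progress count is nonzero, and that every path which raises the count must later lower it and wake the sleepers. All of the model_* names are hypothetical, and pthreads stand in for the kernel's mutex and sleep primitives.

```c
#include <assert.h>
#include <pthread.h>

/* Hypothetical stand-in for the paging state of a VM object. */
struct model_object {
	pthread_mutex_t	lock;
	pthread_cond_t	pip_done;
	int		paging_in_progress;
};

#define MODEL_OBJECT_INIT \
	{ PTHREAD_MUTEX_INITIALIZER, PTHREAD_COND_INITIALIZER, 0 }

/* A pager announces that it has started I/O on the object. */
static void
model_pip_add(struct model_object *obj)
{
	pthread_mutex_lock(&obj->lock);
	obj->paging_in_progress++;
	pthread_mutex_unlock(&obj->lock);
}

/* The pager finishes; the last one out wakes any waiters. */
static void
model_pip_wakeup(struct model_object *obj)
{
	pthread_mutex_lock(&obj->lock);
	assert(obj->paging_in_progress > 0);
	if (--obj->paging_in_progress == 0)
		pthread_cond_broadcast(&obj->pip_done);
	pthread_mutex_unlock(&obj->lock);
}

/* Sleep until no paging is in progress on the object. */
static void
model_pip_wait(struct model_object *obj)
{
	pthread_mutex_lock(&obj->lock);
	while (obj->paging_in_progress != 0)
		pthread_cond_wait(&obj->pip_done, &obj->lock);
	pthread_mutex_unlock(&obj->lock);
}

/* Non-blocking probe: returns 1 if model_pip_wait() would sleep. */
static int
model_pip_would_block(struct model_object *obj)
{
	int busy;

	pthread_mutex_lock(&obj->lock);
	busy = (obj->paging_in_progress != 0);
	pthread_mutex_unlock(&obj->lock);
	return (busy);
}
```

In this model, a path that calls model_pip_add() and then returns without the matching model_pip_wakeup() leaves every later model_pip_wait() asleep forever while holding no lock. That would be consistent with a hang in which 'show alllocks' shows nothing interesting, though it is only one possible explanation.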

The call to vinvalbuf in afs_GetVCache is here:
     iheldthelock = VOP_ISLOCKED(vp, curthread);
     if (!iheldthelock)
         vn_lock(vp, LK_EXCLUSIVE | LK_RETRY, curthread);
     AFS_GUNLOCK();
     vinvalbuf(vp, V_SAVE, curthread, PINOD, 0);
     AFS_GLOCK();
     if (!iheldthelock)
         VOP_UNLOCK(vp, LK_EXCLUSIVE, curthread);

Which is not very enlightening.  I kind of suspect that some flags on the 
bufobj were erroneously set elsewhere and it is only now popping up.

afs_GetVCache is in this source file:
http://git.openafs.org/?p=openafs.git;a=blob;f=src/afs/afs_vcache.c;h=26ed2c2be271048509425583f0cc2de6c4166c4b;hb=HEAD
and {get,put}pages in this:
http://git.openafs.org/?p=openafs.git;a=blob;f=src/afs/FBSD/osi_vnodeops.c;h=7ae6571adb74d69cfe25e3190ade3b22dc8cdab8;hb=HEAD


Thanks,

Ben Kaduk


