Date:      Tue, 12 Oct 2010 08:45:43 -0700
From:      Jeremy Chadwick <freebsd@jdc.parodius.com>
To:        Andriy Gapon <avg@icyb.net.ua>
Cc:        freebsd-fs@freebsd.org
Subject:   Re: Locked up processes after upgrade to ZFS v15
Message-ID:  <20101012154543.GA35378@icarus.home.lan>
In-Reply-To: <4CB47E3F.3050002@icyb.net.ua>
References:  <20101011183707.GA13925@icarus.home.lan> <4CB3870F.7070107@icyb.net.ua> <20101012100709.GA29861@icarus.home.lan> <4CB4429C.9040109@icyb.net.ua> <20101012130245.GA32584@icarus.home.lan> <4CB46CE9.20905@icyb.net.ua> <20101012143559.GA34396@icarus.home.lan> <4CB47355.1050109@icyb.net.ua> <20101012151852.GA35014@icarus.home.lan> <4CB47E3F.3050002@icyb.net.ua>

On Tue, Oct 12, 2010 at 06:26:55PM +0300, Andriy Gapon wrote:
> on 12/10/2010 18:18 Jeremy Chadwick said the following:
> > Got it -- just finished and is currently running/working.  I also
> > installed ports/sysutils/DTraceToolkit and shells/ksh93 "just in case".
> > 
> > testbox# dtrace -l | head
> >    ID   PROVIDER            MODULE                          FUNCTION NAME
> >     1     dtrace                                                     BEGIN
> >     2     dtrace                                                     END
> >     3     dtrace                                                     ERROR
> >     4   dtmalloc                                                 fbt malloc
> >     5   dtmalloc                                                 fbt free
> >     6   dtmalloc                                              cyclic malloc
> >     7   dtmalloc                                              cyclic free
> >     8   dtmalloc                                          zones_data malloc
> >     9   dtmalloc                                          zones_data free
> > 
> > I can provide you root-level access to the box as well as serial console
> > if you'd prefer to do the debugging yourself, otherwise step me through
> > what's needed and I'll be happy to act as remote hands.
> 
> Great!  Let's start now :)
> I would like you to run the following script with "dtrace -s <script name>" in one
> terminal while running sendfile patched regression test (with TEST_EXTRA=100) in
> another.  After sendfile program finishes, please ^C the DTrace script.
> Please show me complete output that you'll get from the DTrace script.
> Thanks!
> 
> fbt::vm_fault:entry
> /execname == "sendfile"/
> {
>         self->vm_fault = 1;
> }
> 
> fbt::vm_fault:return
> /execname == "sendfile"/
> {
>         self->vm_fault = 0;
> }
> 
> fbt::zfs_freebsd_read:entry
> /self->vm_fault/
> {
>         self->zfs_read = 1;
> }
> 
> fbt::zfs_freebsd_read:return
> /self->vm_fault/
> {
>         self->zfs_read = 0;
> }
> 
> fbt::vm_page_lookup:return
> /self->zfs_read && arg1 != 0/
> {
>         @stacks[stack()] = count();
>         printf("\n");
>         printf("valid = 0x%02x\n", ((vm_page_t)arg1)->valid);
>         printf("flags = 0x%04x\n", ((vm_page_t)arg1)->flags);
>         printf("oflags = 0x%04x\n", ((vm_page_t)arg1)->oflags);
>         printf("pindex = %u\n", ((vm_page_t)arg1)->pindex);
>         printf("object = %p\n", ((vm_page_t)arg1)->object);
> }

Okay, I realised what I did wrong with my first run of your modified
sendfile test -- the code defaults to using /tmp, and I idiotically
forgot to point it at a ZFS filesystem (/tmp isn't ZFS on the testbox).
Now that I've changed it to /home, I can reproduce the problem.
Excellent!

Secondly: the testbox is running kernel/world source from October 8th.
I *have not* applied your kernel patch at this point.  Just an FYI.

So here's what I get.  Note that the sendfile process locks up at the
"mmap test" step, sleeping in zfsmrb state (see the ps and procstat
output below).


Terminal #1 (sendfile)
------------------------
testbox# ./sendfile
1..11
ok 1
ok 2
ok 3
ok 4
ok 5
ok 6
ok 7
ok 8
ok 9
ok 10
ok 11
mmap test
^C


Terminal #2 (DTrace script + ps output)
-----------------------------------------
testbox# ./zfs_sendfile.d
dtrace: script './zfs_sendfile.d' matched 5 probes
CPU     ID                    FUNCTION:NAME
  1  22458            vm_page_lookup:return
valid = 0x01
flags = 0x0000
oflags = 0x0001
pindex = 4
object = c614e550

^C


              0xc457b43d
              kernel`VOP_READ_APV+0x7a
              kernel`vnode_pager_generic_getpages+0x329
              kernel`vop_stdgetpages+0x29
              kernel`VOP_GETPAGES_APV+0x83
              kernel`vnode_pager_getpages+0x19a
              kernel`vm_fault+0x1139
              kernel`trap_pfault+0x173
              kernel`trap+0x2cb
              kernel`0xc07d29bc
                1

testbox# ps -axl | grep sendfile
    0  1318  1132   0  52  0  3324  1024 zfsmrb DL+   u0    0:00.01 ./sendfile
    0  1333  1170   0  44  0  3444  1200 -      R+     0    0:00.00 grep sendfile
testbox# procstat -k -k 1318
  PID    TID COMM             TDNAME           KSTACK
 1318 100126 sendfile         -                mi_switch+0x11b sleepq_switch+0xc1 sleepq_wait+0x39 _sleep+0x282 vm_page_sleep+0xd5 zfs_freebsd_read+0x2f3 VOP_READ_APV+0x7a vnode_pager_generic_getpages+0x329 vop_stdgetpages+0x29 VOP_GETPAGES_APV+0x83 vnode_pager_getpages+0x19a vm_fault+0x1139 trap_pfault+0x173 trap+0x2cb calltrap+0x6
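
The KSTACK column above is easier to read with one frame per line. A
minimal sketch, assuming the five-column layout shown above (PID, TID,
COMM, TDNAME, KSTACK) and that neither COMM nor TDNAME contains spaces;
the helper name kstack_frames is hypothetical, not part of procstat:

```shell
# Hypothetical helper: print each frame of a "procstat -k -k" KSTACK
# on its own line, innermost frame first.
# Assumes the frames start at the fifth whitespace-separated field and
# that the first input line is the column header.
kstack_frames() {
    awk 'NR > 1 { for (i = 5; i <= NF; i++) print $i }'
}
```

It would be used as "procstat -k -k 1318 | kstack_frames". For the stack
above, the first lines printed would be mi_switch+0x11b and
sleepq_switch+0xc1, which makes the vm_page_sleep call from
zfs_freebsd_read easier to spot.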

-- 
| Jeremy Chadwick                                   jdc@parodius.com |
| Parodius Networking                       http://www.parodius.com/ |
| UNIX Systems Administrator                  Mountain View, CA, USA |
| Making life hard for others since 1977.              PGP: 4BD6C0CB |



