Skip site navigation (1)Skip section navigation (2)
Date:      Wed, 13 Jan 2010 11:13:14 +0200
From:      Mikolaj Golub <to.my.trociny@gmail.com>
To:        freebsd-fs@FreeBSD.org
Subject:   Re: FreeBSD NFS client/Linux NFS server issue
Message-ID:  <86tyuqnz9x.fsf@zhuzha.ua1>
In-Reply-To: <86ocl272mb.fsf@kopusha.onet> (Mikolaj Golub's message of "Sun\, 10 Jan 2010 11\:03\:56 %2B0200")
References:  <86ocl272mb.fsf@kopusha.onet>

next in thread | previous in thread | raw e-mail | index | archive | help
On Sun, 10 Jan 2010 11:03:56 +0200 Mikolaj Golub wrote:

> Hi,
>
> I wrote about this problem some times ago to freebsd-stable
>
> http://lists.freebsd.org/pipermail/freebsd-stable/2009-December/053746.html
>
> but there have not been any response there so far and also we have collected
> some additional information since then so I am trying again here :-)
>
> We have CentOS 5.3 NFS server and many FreeBSD clients. The problem have been
> observed since we started upgrade FreeBSD host from 6.3 to 7.1 so we thought
> this is 7.1 related bu recently we had the similar issue with 6.3 host and
> actually it looks like the problem is rather on NFS server side, but I would
> like to describe the problem here and ask some advise.
>
> So the "problem" FreeBSD clients have several NFS mounts from CentOS 5.3
> server. Mount options: rw,-3,-T,-s,-i,-r=32768,-w=32768,-o=noinet6
>
> At some point on one of the clients some of the mounts might get stuck -- we
> observe one or two our php applications hanged on writing to NFS folder in
> nfs_flush()->bufobj_wwait() and all other processes that try to access this
> NFS mount hang acquiring nfs vn_lock hold by one of the above php processes.
>
> For one of the incident we were tcpdumping "problem" NFS connection for about
> 1 hour and during this hour an activity was observed only once:
>
> 08:20:38.281422 IP (tos 0x0, ttl 64, id 56110, offset 0, flags [DF], proto TCP (6), length 140) 172.30.10.27.344496259 > 172.30.10.121.2049: 88 access fh[1:9300:10df8001] 003f
> 08:20:38.281554 IP (tos 0x0, ttl 64, id 26624, offset 0, flags [DF], proto TCP (6), length 52) 172.30.10.121.2049 > 172.30.10.27.971: ., cksum 0xca5e (correct), 89408667:89408667(0) ack 1517941890 win 46 <nop,nop,timestamp 901975640 111169517>
>
> The client sent rpc ACCESS request for root exported inode, received tcp ack
> response (so tcp connection was ok) but did not receive any RPC reply from the
> server.
>
> So it looks like the problem on NFS server side. But for me it looks a bit
> strange that freebsd client is sending rpc packets so rarely. Shouldn't it
> retransmit them more frequently? For another incident we monitored tcp
> connection for 4 minutes and did not see any packets then. Unfortunately we
> can't run tcpdumping long time as these are production servers and we need to
> reboot hosts to restore normal operations.

There are some new details and now it looks like rather client side
problem. We set up tcpdump monitoring of 2049 port on the "problem" NFS
clients and now have full tcpdump log + kernel crush dump for the latest
incident.

So this time again the first gotten stuck processes were php scripts writing
log files to NFS folder. 

(kgdb) thr 152
[Switching to thread 152 (Thread 100073)]#0  sched_switch (td=0xc6bf4690, newtd=Variable "newtd" is not available.
)
    at /usr/src/sys/kern/sched_ule.c:1944
1944                    cpuid = PCPU_GET(cpuid);
(kgdb) bt
#0  sched_switch (td=0xc6bf4690, newtd=Variable "newtd" is not available.
) at /usr/src/sys/kern/sched_ule.c:1944
#1  0xc07cabe6 in mi_switch (flags=Variable "flags" is not available.
) at /usr/src/sys/kern/kern_synch.c:440
#2  0xc07f42fb in sleepq_switch (wchan=Variable "wchan" is not available.
) at /usr/src/sys/kern/subr_sleepqueue.c:497
#3  0xc07f460c in sleepq_catch_signals (wchan=0xcc2f2ee8) at /usr/src/sys/kern/subr_sleepqueue.c:417
#4  0xc07f4ebd in sleepq_wait_sig (wchan=0xcc2f2ee8) at /usr/src/sys/kern/subr_sleepqueue.c:594
#5  0xc07cb047 in _sleep (ident=0xcc2f2ee8, lock=0xcc2f2e8c, priority=333, wmesg=0xc0b731ed "bo_wwait", timo=0)
    at /usr/src/sys/kern/kern_synch.c:224
#6  0xc0827295 in bufobj_wwait (bo=0xcc2f2ec4, slpflag=256, timeo=0) at /usr/src/sys/kern/vfs_bio.c:3870
#7  0xc0966307 in nfs_flush (vp=0xcc2f2e04, waitfor=1, td=0xc6bf4690, commit=1)
    at /usr/src/sys/nfsclient/nfs_vnops.c:2989
#8  0xc09667ce in nfs_fsync (ap=0xed1928ec) at /usr/src/sys/nfsclient/nfs_vnops.c:2725
#9  0xc0aee5d2 in VOP_FSYNC_APV (vop=0xc0c2b920, a=0xed1928ec) at vnode_if.c:1007
#10 0xc0827864 in bufsync (bo=0xcc2f2ec4, waitfor=1, td=0xc6bf4690) at vnode_if.h:538
#11 0xc083f354 in bufobj_invalbuf (bo=0xcc2f2ec4, flags=1, td=0xc6bf4690, slpflag=256, slptimeo=0)
    at /usr/src/sys/kern/vfs_subr.c:1066
#12 0xc083f6e2 in vinvalbuf (vp=0xcc2f2e04, flags=1, td=0xc6bf4690, slpflag=256, slptimeo=0)
    at /usr/src/sys/kern/vfs_subr.c:1142
#13 0xc094f216 in nfs_vinvalbuf (vp=0xcc2f2e04, flags=Variable "flags" is not available.
) at /usr/src/sys/nfsclient/nfs_bio.c:1326
#14 0xc0951825 in nfs_write (ap=0xed192bc4) at /usr/src/sys/nfsclient/nfs_bio.c:918
#15 0xc0aef956 in VOP_WRITE_APV (vop=0xc0c2b920, a=0xed192bc4) at vnode_if.c:691
#16 0xc0850097 in vn_write (fp=0xc95087b8, uio=0xed192c60, active_cred=0xc84db000, flags=0, td=0xc6bf4690)
    at vnode_if.h:373
#17 0xc07f9d17 in dofilewrite (td=0xc6bf4690, fd=6, fp=0xc95087b8, auio=0xed192c60, offset=-1, flags=0)
    at file.h:256
#18 0xc07f9ff8 in kern_writev (td=0xc6bf4690, fd=6, auio=0xed192c60) at /usr/src/sys/kern/sys_generic.c:401
#19 0xc07fa06f in write (td=0xc6bf4690, uap=0xed192cfc) at /usr/src/sys/kern/sys_generic.c:317
#20 0xc0ad9c75 in syscall (frame=0xed192d38) at /usr/src/sys/i386/i386/trap.c:1090
#21 0xc0ac01b0 in Xint0x80_syscall () at /usr/src/sys/i386/i386/exception.s:255
#22 0x00000033 in ?? ()
Previous frame inner to this frame (corrupt stack?)
(kgdb) fr 13
#13 0xc094f216 in nfs_vinvalbuf (vp=0xcc2f2e04, flags=Variable "flags" is not available.
) at /usr/src/sys/nfsclient/nfs_bio.c:1326
1326            error = vinvalbuf(vp, flags, td, slpflag, 0);
(kgdb) p * (struct nfsnode *) vp->v_data
$5 = {n_mtx = {lock_object = {lo_name = 0xc0b80584 "NFSnode lock", lo_type = 0xc0b80584 "NFSnode lock", 
      lo_flags = 16973824, lo_witness_data = {lod_list = {stqe_next = 0x0}, lod_witness = 0x0}}, mtx_lock = 4, 
    mtx_recurse = 0}, n_size = 133758, n_brev = 0, n_lrev = 0, n_vattr = {va_type = VREG, va_mode = 420, 
    va_nlink = 1, va_uid = 4002, va_gid = 4002, va_fsid = 50396930, va_fileid = 213581825, va_size = 133758, 
    va_blocksize = 512, va_atime = {tv_sec = 1263232925, tv_nsec = 0}, va_mtime = {tv_sec = 1263232924, 
      tv_nsec = 0}, va_ctime = {tv_sec = 1263232924, tv_nsec = 0}, va_birthtime = {tv_sec = 0, tv_nsec = 0}, 
    va_gen = 0, va_flags = 0, va_rdev = 0, va_bytes = 143360, va_filerev = 0, va_vaflags = 0, va_spare = 0}, 
  n_attrstamp = 0, n_mode = 13, n_modeuid = 4002, n_modestamp = 1263232925, n_mtime = {tv_sec = 1263232029, 
    tv_nsec = 0}, n_ctime = 1263232029, n_expiry = 0, n_fhp = 0xce9002a4, n_vnode = 0xcc2f2e04, n_dvp = 0x0, 
  n_error = 0, n_un1 = {nf_atim = {tv_sec = 0, tv_nsec = 0}, nd_cookieverf = {nfsuquad = {0, 0}}, 
    nd4_cookieverf = "\000\000\000\000\000\000\000"}, n_un2 = {nf_mtim = {tv_sec = 0, tv_nsec = 0}, 
    nd_direof = 0}, n_un3 = {nf_silly = 0x0, nd_cook = {lh_first = 0x0}}, n_fhsize = 20, n_flag = 4, n_fh = {
    fh_generic = {fh_fsid = {val = {16777217, 37632}}, fh_fid = {fid_len = 16385, fid_reserved = 4277, 
        fid_data = "\001\000»\fRüÿ¯\000\000\000\000\000\000\000"}}, 
    fh_bytes = "\001\000\000\001\000\223\000\000\001@µ\020\001\000»\fRüÿ¯", '\0' <repeats 107 times>}, 
  n_rfc = {next = {tqe_next = 0x0, tqe_prev = 0x0}, refcnt = 0, lop = 0x0, np = 0x0, 
    stateid = '\0' <repeats 15 times>}, n_wfc = {next = {tqe_next = 0x0, tqe_prev = 0x0}, refcnt = 0, 
    lop = 0x0, np = 0x0, stateid = '\0' <repeats 15 times>}, n_name = 0x0, n_namelen = 0, 
  n_directio_opens = 0, n_directio_asyncwr = 0, n_ac_ts = {nfs_ac_ts_tid = 100073, nfs_ac_ts_pid = 64011, 
    nfs_ac_ts_syscalls = 1660222}}


(kgdb) thr 153
[Switching to thread 153 (Thread 100535)]#0  sched_switch (td=0xcc1d2d20, newtd=Variable "newtd" is not available.
)
    at /usr/src/sys/kern/sched_ule.c:1944
1944                    cpuid = PCPU_GET(cpuid);
(kgdb) bt
#0  sched_switch (td=0xcc1d2d20, newtd=Variable "newtd" is not available.
) at /usr/src/sys/kern/sched_ule.c:1944
#1  0xc07cabe6 in mi_switch (flags=Variable "flags" is not available.
) at /usr/src/sys/kern/kern_synch.c:440
#2  0xc07f42fb in sleepq_switch (wchan=Variable "wchan" is not available.
) at /usr/src/sys/kern/subr_sleepqueue.c:497
#3  0xc07f460c in sleepq_catch_signals (wchan=0xc7a42420) at /usr/src/sys/kern/subr_sleepqueue.c:417
#4  0xc07f4ebd in sleepq_wait_sig (wchan=0xc7a42420) at /usr/src/sys/kern/subr_sleepqueue.c:594
#5  0xc07cb047 in _sleep (ident=0xc7a42420, lock=0xc7a423c4, priority=333, wmesg=0xc0b731ed "bo_wwait", timo=0)
    at /usr/src/sys/kern/kern_synch.c:224
#6  0xc0827295 in bufobj_wwait (bo=0xc7a423fc, slpflag=256, timeo=0) at /usr/src/sys/kern/vfs_bio.c:3870
#7  0xc0966307 in nfs_flush (vp=0xc7a4233c, waitfor=1, td=0xcc1d2d20, commit=1)
    at /usr/src/sys/nfsclient/nfs_vnops.c:2989
#8  0xc09667ce in nfs_fsync (ap=0xed8978ec) at /usr/src/sys/nfsclient/nfs_vnops.c:2725
#9  0xc0aee5d2 in VOP_FSYNC_APV (vop=0xc0c2b920, a=0xed8978ec) at vnode_if.c:1007
#10 0xc0827864 in bufsync (bo=0xc7a423fc, waitfor=1, td=0xcc1d2d20) at vnode_if.h:538
#11 0xc083f354 in bufobj_invalbuf (bo=0xc7a423fc, flags=1, td=0xcc1d2d20, slpflag=256, slptimeo=0)
    at /usr/src/sys/kern/vfs_subr.c:1066
#12 0xc083f6e2 in vinvalbuf (vp=0xc7a4233c, flags=1, td=0xcc1d2d20, slpflag=256, slptimeo=0)
    at /usr/src/sys/kern/vfs_subr.c:1142
#13 0xc094f216 in nfs_vinvalbuf (vp=0xc7a4233c, flags=Variable "flags" is not available.
) at /usr/src/sys/nfsclient/nfs_bio.c:1326
#14 0xc0951825 in nfs_write (ap=0xed897bc4) at /usr/src/sys/nfsclient/nfs_bio.c:918
#15 0xc0aef956 in VOP_WRITE_APV (vop=0xc0c2b920, a=0xed897bc4) at vnode_if.c:691
#16 0xc0850097 in vn_write (fp=0xcdd5a000, uio=0xed897c60, active_cred=0xc8d5c200, flags=0, td=0xcc1d2d20)
    at vnode_if.h:373
#17 0xc07f9d17 in dofilewrite (td=0xcc1d2d20, fd=6, fp=0xcdd5a000, auio=0xed897c60, offset=-1, flags=0)
    at file.h:256
#18 0xc07f9ff8 in kern_writev (td=0xcc1d2d20, fd=6, auio=0xed897c60) at /usr/src/sys/kern/sys_generic.c:401
#19 0xc07fa06f in write (td=0xcc1d2d20, uap=0xed897cfc) at /usr/src/sys/kern/sys_generic.c:317
#20 0xc0ad9c75 in syscall (frame=0xed897d38) at /usr/src/sys/i386/i386/trap.c:1090
#21 0xc0ac01b0 in Xint0x80_syscall () at /usr/src/sys/i386/i386/exception.s:255
#22 0x00000033 in ?? ()
Previous frame inner to this frame (corrupt stack?)
(kgdb) fr 13
#13 0xc094f216 in nfs_vinvalbuf (vp=0xc7a4233c, flags=Variable "flags" is not available.
) at /usr/src/sys/nfsclient/nfs_bio.c:1326
1326            error = vinvalbuf(vp, flags, td, slpflag, 0);
(kgdb) p * (struct nfsnode *) vp->v_data
$6 = {n_mtx = {lock_object = {lo_name = 0xc0b80584 "NFSnode lock", lo_type = 0xc0b80584 "NFSnode lock", 
      lo_flags = 16973824, lo_witness_data = {lod_list = {stqe_next = 0x0}, lod_witness = 0x0}}, mtx_lock = 4, 
    mtx_recurse = 0}, n_size = 14566, n_brev = 0, n_lrev = 0, n_vattr = {va_type = VREG, va_mode = 420, 
    va_nlink = 1, va_uid = 4002, va_gid = 4002, va_fsid = 50396930, va_fileid = 116310018, va_size = 14566, 
    va_blocksize = 512, va_atime = {tv_sec = 1263232925, tv_nsec = 0}, va_mtime = {tv_sec = 1263232924, 
      tv_nsec = 0}, va_ctime = {tv_sec = 1263232924, tv_nsec = 0}, va_birthtime = {tv_sec = 0, tv_nsec = 0}, 
    va_gen = 0, va_flags = 0, va_rdev = 0, va_bytes = 20480, va_filerev = 0, va_vaflags = 0, va_spare = 0}, 
  n_attrstamp = 0, n_mode = 13, n_modeuid = 4002, n_modestamp = 1263232925, n_mtime = {tv_sec = 1263232029, 
    tv_nsec = 0}, n_ctime = 1263232029, n_expiry = 0, n_fhp = 0xce2ee2a4, n_vnode = 0xc7a4233c, n_dvp = 0x0, 
  n_error = 0, n_un1 = {nf_atim = {tv_sec = 0, tv_nsec = 0}, nd_cookieverf = {nfsuquad = {0, 0}}, 
    nd4_cookieverf = "\000\000\000\000\000\000\000"}, n_un2 = {nf_mtim = {tv_sec = 0, tv_nsec = 0}, 
    nd_direof = 0}, n_un3 = {nf_silly = 0x0, nd_cook = {lh_first = 0x0}}, n_fhsize = 20, n_flag = 4, n_fh = {
    fh_generic = {fh_fsid = {val = {16777217, 37632}}, fh_fid = {fid_len = 16385, fid_reserved = 4277, 
        fid_data = "\002Àî\006+Ë\223\v\000\000\000\000\000\000\000"}}, 
    fh_bytes = "\001\000\000\001\000\223\000\000\001@µ\020\002Àî\006+Ë\223\v", '\0' <repeats 107 times>}, 
  n_rfc = {next = {tqe_next = 0x0, tqe_prev = 0x0}, refcnt = 0, lop = 0x0, np = 0x0, 
    stateid = '\0' <repeats 15 times>}, n_wfc = {next = {tqe_next = 0x0, tqe_prev = 0x0}, refcnt = 0, 
    lop = 0x0, np = 0x0, stateid = '\0' <repeats 15 times>}, n_name = 0x0, n_namelen = 0, 
  n_directio_opens = 0, n_directio_asyncwr = 0, n_ac_ts = {nfs_ac_ts_tid = 100535, nfs_ac_ts_pid = 64012, 
    nfs_ac_ts_syscalls = 1926475}}

So inodes for this files are known (50396930 and 50396930) and I can track nfs
operations in tcpdump for these files. These are log files -- php scripts
write there some summary information. One of the script was managed to write
the following to the file before the hang:

Update upload started at 2010-01-11 18:02:01 GMT: 
filename: [XXXXXX_UPDATE_011110_094706.ZIP]
date: [2010-01-11]

    FSR equipment file, secs: 0.63525295257568. 
    Upload summary:
          inserted:                                3
          updated:                                 3
          up-to-date:                              1495

And rpc requests in wireshark for this file:

26756	18:02:00.648590	172.30.10.83	172.30.10.54	NFS	V3 LOOKUP Reply (Call In 26728), FH:0x29d8e1c4
26756	18:02:00.648590	172.30.10.83	172.30.10.54	NFS	V3 LOOKUP Reply (Call In 26728), FH:0x29d8e1c4
26757	18:02:00.648675	172.30.10.54	172.30.10.83	NFS	V3 ACCESS Call (Reply In 26759), FH:0x29d8e1c4
26759	18:02:00.648988	172.30.10.83	172.30.10.54	NFS	V3 ACCESS Reply (Call In 26757)
28216	18:02:01.342201	172.30.10.54	172.30.10.83	NFS	V3 ACCESS Call (Reply In 28217), FH:0x29d8e1c4
28217	18:02:01.342433	172.30.10.83	172.30.10.54	NFS	V3 ACCESS Reply (Call In 28216)
28218	18:02:01.343514	172.30.10.54	172.30.10.83	NFS	V3 READ Call (Reply In 28239), FH:0x29d8e1c4 Offset:0 Len:14282
28239	18:02:01.356049	172.30.10.83	172.30.10.54	NFS	V3 READ Reply (Call In 28218) Len:14167
28938	18:02:01.995767	172.30.10.54	172.30.10.83	NFS	V3 WRITE Call (Reply In 28939), FH:0x29d8e1c4 Offset:14167 Len:115 UNSTABLE
28939	18:02:02.014548	172.30.10.83	172.30.10.54	NFS	V3 WRITE Reply (Call In 28938) Len:115 UNSTABLE
28940	18:02:02.014626	172.30.10.54	172.30.10.83	NFS	V3 COMMIT Call (Reply In 29569), FH:0x29d8e1c4
29569	18:02:02.223967	172.30.10.83	172.30.10.54	NFS	V3 COMMIT Reply (Call In 28940)
29572	18:02:02.224055	172.30.10.54	172.30.10.83	NFS	V3 ACCESS Call (Reply In 29575), FH:0x29d8e1c4
29575	18:02:02.224474	172.30.10.83	172.30.10.54	NFS	V3 ACCESS Reply (Call In 29572)
29578	18:02:02.224552	172.30.10.54	172.30.10.83	NFS	V3 READ Call (Reply In 29607), FH:0x29d8e1c4 Offset:0 Len:14331
29607	18:02:02.418815	172.30.10.83	172.30.10.54	NFS	V3 READ Reply (Call In 29578) Len:14282
29612	18:02:02.418936	172.30.10.54	172.30.10.83	NFS	V3 WRITE Call (Reply In 29613), FH:0x29d8e1c4 Offset:14282 Len:49 UNSTABLE
29613	18:02:02.430813	172.30.10.83	172.30.10.54	NFS	V3 WRITE Reply (Call In 29612) Len:49 UNSTABLE
29614	18:02:02.430909	172.30.10.54	172.30.10.83	NFS	V3 COMMIT Call (Reply In 29634), FH:0x29d8e1c4
29634	18:02:02.731175	172.30.10.83	172.30.10.54	NFS	V3 COMMIT Reply (Call In 29614)
29635	18:02:02.731296	172.30.10.54	172.30.10.83	NFS	V3 ACCESS Call (Reply In 29640), FH:0x29d8e1c4
29640	18:02:02.732761	172.30.10.83	172.30.10.54	NFS	V3 ACCESS Reply (Call In 29635)
29641	18:02:02.732925	172.30.10.54	172.30.10.83	NFS	V3 READ Call (Reply In 29667), FH:0x29d8e1c4 Offset:0 Len:14351
29667	18:02:02.736009	172.30.10.83	172.30.10.54	NFS	V3 READ Reply (Call In 29641) Len:14331
29669	18:02:02.736172	172.30.10.54	172.30.10.83	NFS	V3 WRITE Call (Reply In 29675), FH:0x29d8e1c4 Offset:14331 Len:20 UNSTABLE
29675	18:02:02.799350	172.30.10.83	172.30.10.54	NFS	V3 WRITE Reply (Call In 29669) Len:20 UNSTABLE
29676	18:02:02.799439	172.30.10.54	172.30.10.83	NFS	V3 COMMIT Call (Reply In 29764), FH:0x29d8e1c4
29764	18:02:03.267882	172.30.10.83	172.30.10.54	NFS	V3 COMMIT Reply (Call In 29676)
29767	18:02:03.268003	172.30.10.54	172.30.10.83	NFS	V3 ACCESS Call (Reply In 29770), FH:0x29d8e1c4
29770	18:02:03.268902	172.30.10.83	172.30.10.54	NFS	V3 ACCESS Reply (Call In 29767)
29771	18:02:03.268980	172.30.10.54	172.30.10.83	NFS	V3 READ Call (Reply In 29787), FH:0x29d8e1c4 Offset:0 Len:14404
29787	18:02:03.274837	172.30.10.83	172.30.10.54	NFS	V3 READ Reply (Call In 29771) Len:14351
29788	18:02:03.275013	172.30.10.54	172.30.10.83	NFS	V3 WRITE Call (Reply In 29791), FH:0x29d8e1c4 Offset:14351 Len:53 UNSTABLE
29791	18:02:03.373167	172.30.10.83	172.30.10.54	NFS	V3 WRITE Reply (Call In 29788) Len:53 UNSTABLE
29792	18:02:03.373252	172.30.10.54	172.30.10.83	NFS	V3 COMMIT Call (Reply In 29839), FH:0x29d8e1c4
29839	18:02:03.989291	172.30.10.83	172.30.10.54	NFS	V3 COMMIT Reply (Call In 29792)
29840	18:02:03.989382	172.30.10.54	172.30.10.83	NFS	V3 ACCESS Call (Reply In 29842), FH:0x29d8e1c4
29842	18:02:03.989563	172.30.10.83	172.30.10.54	NFS	V3 ACCESS Reply (Call In 29840)
29843	18:02:03.989627	172.30.10.54	172.30.10.83	NFS	V3 READ Call (Reply In 29859), FH:0x29d8e1c4 Offset:0 Len:14457
29859	18:02:03.991310	172.30.10.83	172.30.10.54	NFS	V3 READ Reply (Call In 29843) Len:14404
29860	18:02:03.991477	172.30.10.54	172.30.10.83	NFS	V3 WRITE Call (Reply In 29861), FH:0x29d8e1c4 Offset:14404 Len:53 UNSTABLE
29861	18:02:03.994289	172.30.10.83	172.30.10.54	NFS	V3 WRITE Reply (Call In 29860) Len:53 UNSTABLE
29862	18:02:03.994360	172.30.10.54	172.30.10.83	NFS	V3 COMMIT Call (Reply In 30299), FH:0x29d8e1c4
30299	18:02:04.911587	172.30.10.83	172.30.10.54	NFS	[TCP Previous segment lost] V3 COMMIT Reply (Call In 29862)
30301	18:02:04.911675	172.30.10.54	172.30.10.83	NFS	V3 ACCESS Call (Reply In 30356), FH:0x29d8e1c4
30356	18:02:04.917405	172.30.10.83	172.30.10.54	NFS	V3 ACCESS Reply (Call In 30301)
30360	18:02:04.917482	172.30.10.54	172.30.10.83	NFS	V3 READ Call (Reply In 30539), FH:0x29d8e1c4 Offset:0 Len:14513
30539	18:02:04.929902	172.30.10.83	172.30.10.54	NFS	V3 READ Reply (Call In 30360) Len:14457
30551	18:02:04.930074	172.30.10.54	172.30.10.83	NFS	V3 WRITE Call (Reply In 30593), FH:0x29d8e1c4 Offset:14457 Len:56 UNSTABLE
30593	18:02:04.932002	172.30.10.83	172.30.10.54	NFS	V3 WRITE Reply (Call In 30551) Len:56 UNSTABLE
30594	18:02:04.932104	172.30.10.54	172.30.10.83	NFS	V3 COMMIT Call (Reply In 30907), FH:0x29d8e1c4
30907	18:02:05.037990	172.30.10.83	172.30.10.54	NFS	V3 COMMIT Reply (Call In 30594)
30912	18:02:05.038141	172.30.10.54	172.30.10.83	NFS	V3 ACCESS Call (Reply In 30982), FH:0x29d8e1c4
30982	18:02:05.049686	172.30.10.83	172.30.10.54	NFS	V3 ACCESS Reply (Call In 30912)
30991	18:02:05.050141	172.30.10.54	172.30.10.83	NFS	V3 READ Call (Reply In 31178), FH:0x29d8e1c4 Offset:0 Len:14566
31178	18:02:05.077695	172.30.10.83	172.30.10.54	NFS	V3 READ Reply (Call In 30991) Len:14513

So because it was appending to the file every php write call caused the
sequence of the following rpc: ACCESS - READ - WRITE - COMMIT. And trying to
flush the next line of the log it got stuck after READ call (the next should
be WRITE call but client never did it).

The same thing is for other log file written by othe php process. The last rpc
for this file:

30990 18:02:05.050063 172.30.10.54 172.30.10.83 NFS V3 READ Call (Reply In 31068), FH:0x532fa29d Offset:131072 Len:2686
31068 18:02:05.062801 172.30.10.83 172.30.10.54 NFS V3 READ Reply (Call In 30990) Len:2685

A bit later there were several successful COMMIT calls (when php processes
were closing other files I think). And other NFS activity was observed -- our
nagios checks and other applications, which was just looking for presence and
status of certain files, were running successfully and in tcpdump there are
successful readdir/access/lookup/fstat calls. df utility did not hanged then
too.

Later when our engineer tried to access the mounted folder with mc the
process locked acquiring nfs vn_lock held by php script (td=0xc6bf4690):

(kgdb) thr 155
[Switching to thread 155 (Thread 100252)]#0  sched_switch (td=0xc70d98c0, newtd=Variable "newtd" is not available.
)
    at /usr/src/sys/kern/sched_ule.c:1944
1944                    cpuid = PCPU_GET(cpuid);
(kgdb) bt
#0  sched_switch (td=0xc70d98c0, newtd=Variable "newtd" is not available.
) at /usr/src/sys/kern/sched_ule.c:1944
#1  0xc07cabe6 in mi_switch (flags=Variable "flags" is not available.
) at /usr/src/sys/kern/kern_synch.c:440
#2  0xc07f42fb in sleepq_switch (wchan=Variable "wchan" is not available.
) at /usr/src/sys/kern/subr_sleepqueue.c:497
#3  0xc07f4946 in sleepq_wait (wchan=0xcc2f2e5c) at /usr/src/sys/kern/subr_sleepqueue.c:580
#4  0xc07cb056 in _sleep (ident=0xcc2f2e5c, lock=0xc0c77730, priority=80, wmesg=0xc0b80b92 "nfs", timo=0)
    at /usr/src/sys/kern/kern_synch.c:226
#5  0xc07adf5a in acquire (lkpp=0xed46e7f0, extflags=Variable "extflags" is not available.
) at /usr/src/sys/kern/kern_lock.c:151
#6  0xc07ae84c in _lockmgr (lkp=0xcc2f2e5c, flags=8194, interlkp=0xcc2f2e8c, td=0xc70d98c0, 
    file=0xc0b74aeb "/usr/src/sys/kern/vfs_subr.c", line=2061) at /usr/src/sys/kern/kern_lock.c:384
#7  0xc0832470 in vop_stdlock (ap=0xed46e840) at /usr/src/sys/kern/vfs_default.c:305
#8  0xc0aef4f6 in VOP_LOCK1_APV (vop=0xc0c1d5c0, a=0xed46e840) at vnode_if.c:1618
#9  0xc084ed86 in _vn_lock (vp=0xcc2f2e04, flags=8194, td=0xc70d98c0, 
    file=0xc0b74aeb "/usr/src/sys/kern/vfs_subr.c", line=2061) at vnode_if.h:851
#10 0xc0841d84 in vget (vp=0xcc2f2e04, flags=8194, td=0xc70d98c0) at /usr/src/sys/kern/vfs_subr.c:2061
#11 0xc08355b3 in vfs_hash_get (mp=0xc7047b30, hash=2492309256, flags=Variable "flags" is not available.
) at /usr/src/sys/kern/vfs_hash.c:81
#12 0xc09534d4 in nfs_nget (mntp=0xc7047b30, fhp=0xcb8d0868, fhsize=20, npp=0xed46e9f0, flags=2)
    at /usr/src/sys/nfsclient/nfs_node.c:120
#13 0xc0964a05 in nfs_lookup (ap=0xed46ea84) at /usr/src/sys/nfsclient/nfs_vnops.c:947
#14 0xc0aefbe6 in VOP_LOOKUP_APV (vop=0xc0c2b920, a=0xed46ea84) at vnode_if.c:99
#15 0xc0836841 in lookup (ndp=0xed46eb48) at vnode_if.h:57
#16 0xc083756f in namei (ndp=0xed46eb48) at /usr/src/sys/kern/vfs_lookup.c:219
#17 0xc0844fef in kern_lstat (td=0xc70d98c0, path=0x48705400 <Address 0x48705400 out of bounds>, 
    pathseg=UIO_USERSPACE, sbp=0xed46ec18) at /usr/src/sys/kern/vfs_syscalls.c:2169
#18 0xc08451af in lstat (td=0xc70d98c0, uap=0xed46ecfc) at /usr/src/sys/kern/vfs_syscalls.c:2152
#19 0xc0ad9c75 in syscall (frame=0xed46ed38) at /usr/src/sys/i386/i386/trap.c:1090
#20 0xc0ac01b0 in Xint0x80_syscall () at /usr/src/sys/i386/i386/exception.s:255
#21 0x00000033 in ?? ()
Previous frame inner to this frame (corrupt stack?)
(kgdb) fr 6
#6  0xc07ae84c in _lockmgr (lkp=0xcc2f2e5c, flags=8194, interlkp=0xcc2f2e8c, td=0xc70d98c0, 
    file=0xc0b74aeb "/usr/src/sys/kern/vfs_subr.c", line=2061) at /usr/src/sys/kern/kern_lock.c:384
384                     error = acquire(&lkp, extflags, (LK_HAVE_EXCL | LK_WANT_EXCL), &contested, &waitstart);
(kgdb) p *lkp
$8 = {
  lk_object = {
    lo_name = 0xc0b80b92 "nfs", 
    lo_type = 0xc0b80b92 "nfs", 
    lo_flags = 70844416, 
    lo_witness_data = {
      lod_list = {
        stqe_next = 0x0
      }, 
      lod_witness = 0x0
    }
  }, 
  lk_interlock = 0xc0c77730, 
  lk_flags = 33816640, 
  lk_sharecount = 0, 
  lk_waitcount = 1, 
  lk_exclusivecount = 1, 
  lk_prio = 80, 
  lk_timo = 51, 
  lk_lockholder = 0xc6bf4690, 
  lk_newlock = 0x0
}

After this all processes that tried to access the share and df utility got
stuck and no any traffic in tcpdump was observed.

This time in crush dump I see 19 nfsiod processes but according to *nmp all
were working with another mounts (which did not get stuck), threads look
similar to this one:

(kgdb) bt
#0  sched_switch (td=0xcc179230, newtd=Variable "newtd" is not available.
) at /usr/src/sys/kern/sched_ule.c:1944
#1  0xc07cabe6 in mi_switch (flags=Variable "flags" is not available.
) at /usr/src/sys/kern/kern_synch.c:440
#2  0xc07f42fb in sleepq_switch (wchan=Variable "wchan" is not available.
) at /usr/src/sys/kern/subr_sleepqueue.c:497
#3  0xc07f460c in sleepq_catch_signals (wchan=0xc0c89f00) at /usr/src/sys/kern/subr_sleepqueue.c:417
#4  0xc07f4db9 in sleepq_timedwait_sig (wchan=0xc0c89f00) at /usr/src/sys/kern/subr_sleepqueue.c:631
#5  0xc07cb021 in _sleep (ident=0xc0c89f00, lock=0xc0c89ed0, priority=348, wmesg=0xc0b69679 "-", timo=120000)
    at /usr/src/sys/kern/kern_synch.c:220
#6  0xc095b46e in nfssvc_iod (instance=0xc0c89880) at /usr/src/sys/nfsclient/nfs_nfsiod.c:244
#7  0xc079e6c9 in fork_exit (callout=0xc095b380 <nfssvc_iod>, arg=0xc0c89880, frame=0xed981d38)
    at /usr/src/sys/kern/kern_fork.c:804
#8  0xc0ac01c0 in fork_trampoline () at /usr/src/sys/i386/i386/exception.s:264
(kgdb) fr 6
#6  0xc095b46e in nfssvc_iod (instance=0xc0c89880) at /usr/src/sys/nfsclient/nfs_nfsiod.c:244
244                     error = msleep(&nfs_iodwant[myiod], &nfs_iod_mtx, PWAIT | PCATCH,
(kgdb) list
239                     nfs_iodmount[myiod] = NULL;
240                     /*
241                      * Always keep at least nfs_iodmin kthreads.
242                      */
243                     timo = (myiod < nfs_iodmin) ? 0 : nfs_iodmaxidle * hz;
244                     error = msleep(&nfs_iodwant[myiod], &nfs_iod_mtx, PWAIT | PCATCH,
245                         "-", timo);
246                     if (error) {
247                             nmp = nfs_iodmount[myiod];
248                             /*
(kgdb) p *nmp
$2 = {nm_mtx = {lock_object = {lo_name = 0xc0b808ee "NFSmount lock", lo_type = 0xc0b808ee "NFSmount lock", 
      lo_flags = 16973824, lo_witness_data = {lod_list = {stqe_next = 0x0}, lod_witness = 0x0}}, mtx_lock = 4, 
    mtx_recurse = 0}, nm_flag = 35399, nm_state = 1310720, nm_mountp = 0xc6d98b30, nm_numgrps = 16, 
  nm_fh = "\001\000\000\000\000\223\000\000\001\000Í\a", '\0' <repeats 115 times>, nm_fhsize = 12, 
  nm_rpcclnt = {rc_flag = 0, rc_wsize = 0, rc_rsize = 0, rc_name = 0x0, rc_so = 0x0, rc_sotype = 0, 
    rc_soproto = 0, rc_soflags = 0, rc_timeo = 0, rc_retry = 0, rc_srtt = {0, 0, 0, 0}, rc_sdrtt = {0, 0, 0, 
      0}, rc_sent = 0, rc_cwnd = 0, rc_timeouts = 0, rc_deadthresh = 0, rc_authtype = 0, rc_auth = 0x0, 
    rc_prog = 0x0, rc_proctlen = 0, rc_proct = 0x0}, nm_so = 0xc95201a0, nm_sotype = 1, nm_soproto = 0, 
  nm_soflags = 44, nm_nam = 0xc69a4ac0, nm_timeo = 6000, nm_retry = 2, nm_srtt = {15, 15, 15, 15}, nm_sdrtt = {
    3, 3, 3, 3}, nm_sent = -256, nm_cwnd = 4192, nm_timeouts = 0, nm_deadthresh = 9, nm_rsize = 32768, 
  nm_wsize = 32768, nm_readdirsize = 4096, nm_readahead = 1, nm_wcommitsize = 1177026, nm_acdirmin = 30, 
  nm_acdirmax = 60, nm_acregmin = 3, nm_acregmax = 60, nm_verf = "KJ¡~\000\n\207é", nm_bufq = {
    tqh_first = 0x0, tqh_last = 0xc70493c0}, nm_bufqlen = 0, nm_bufqwant = 0, nm_bufqiods = 0, 
  nm_maxfilesize = 1099511627775, nm_rpcops = 0xc0c2b5bc, nm_tprintf_initial_delay = 12, 
  nm_tprintf_delay = 30, nm_nfstcpstate = {rpcresid = 0, flags = 1, sock_send_inprog = 0}, 
  nm_hostname = "172.30.10.50\000/var/www/app10", '\0' <repeats 60 times>, nm_clientid = 0, nm_fsid = {val = {
      0, 0}}, nm_lease_time = 0, nm_last_renewal = 0}

-- 
Mikolaj Golub



Want to link to this message? Use this URL: <https://mail-archive.FreeBSD.org/cgi/mid.cgi?86tyuqnz9x.fsf>