Date:      Thu, 17 Apr 2014 00:01:16 -0400
From:      Garrett Wollman <wollman@bimajority.org>
To:        freebsd-fs@freebsd.org
Subject:   NFS behavior on a ZFS dataset with no quota remaining
Message-ID:  <21327.21004.879860.960260@hergotha.csail.mit.edu>

Recently one of our users managed to constipate one of our NFS servers
in an odd way.  They hit the quota on their dataset, and rather than
having all of their writes error out as they should have, the NFS
server instead stopped responding to all requests.  While this was
happening, sysctl vfs.nfsd reported:

vfs.nfsd.disable_checkutf8: 0
vfs.nfsd.server_max_nfsvers: 4
vfs.nfsd.server_min_nfsvers: 2
vfs.nfsd.nfs_privport: 0
vfs.nfsd.async: 0
vfs.nfsd.enable_locallocks: 0
vfs.nfsd.issue_delegations: 0
vfs.nfsd.commit_miss: 0
vfs.nfsd.commit_blks: 0
vfs.nfsd.mirrormnt: 1
vfs.nfsd.cachetcp: 1
vfs.nfsd.tcpcachetimeo: 300
vfs.nfsd.udphighwater: 500
vfs.nfsd.tcphighwater: 150000
vfs.nfsd.minthreads: 16
vfs.nfsd.maxthreads: 64
vfs.nfsd.threads: 18
vfs.nfsd.request_space_used: 36520872
vfs.nfsd.request_space_used_highest: 47536420
vfs.nfsd.request_space_high: 47185920
vfs.nfsd.request_space_low: 31457280
vfs.nfsd.request_space_throttled: 1
vfs.nfsd.request_space_throttle_count: 8451
vfs.nfsd.fha.enable: 1
vfs.nfsd.fha.bin_shift: 22
vfs.nfsd.fha.max_nfsds_per_fh: 8
vfs.nfsd.fha.max_reqs_per_nfsd: 0
vfs.nfsd.fha.fhe_stats: fhe 0xfffffe103fcab6c0: {
    fh: 32922711030235146
    num_rw: 457
    num_exclusive: 7
    num_threads: 2
    thread 0xfffffe0112f72a00 offset 26738688 (count 457)
    thread 0xfffffe04b0751080 offset 4390912 (count 7)
}, fhe 0xfffffe02fe8acd80: {
    fh: 32922925778599946
    num_rw: 90
    num_exclusive: 0
    num_threads: 2
    thread 0xfffffe0e77ee2c80 offset 6946816 (count 17)
    thread 0xfffffe0d25c1f280 offset 2752512 (count 73)
}
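
The relevant bit there seems to be request_space_throttled flipping to
1 after request_space_used_highest blew past request_space_high.  If
anyone wants to watch for the same condition, a quick poll along these
lines should catch it (sketch only; the interval is arbitrary):

    #!/bin/sh
    # Poll the nfsd request-space counters every few seconds so the
    # moment the server starts throttling is visible in the output.
    while :; do
        date
        sysctl vfs.nfsd.request_space_used \
               vfs.nfsd.request_space_throttled \
               vfs.nfsd.request_space_throttle_count \
               vfs.nfsd.threads
        sleep 5
    done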

I increased their quota by a terabyte, and NFS immediately started
working again, for all clients.  But this seems, um, very bad.  Can
anyone explain what's going on in either NFS or ZFS that could cause
this?  I must emphasize that the zpool was by no means out of space;
it was merely one client dataset (out of many) that hit its quota.
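
For concreteness, the fix amounted to something like this (the pool and
dataset names, and the sizes, are made up here):

    # hypothetical names/sizes; the real dataset is one of many on the pool
    zfs get quota,used,available tank/exports/someuser
    zfs set quota=6T tank/exports/someuser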

A few seconds after increasing the quota, the sysctl tree looked like
this:

vfs.nfsd.disable_checkutf8: 0
vfs.nfsd.server_max_nfsvers: 4
vfs.nfsd.server_min_nfsvers: 2
vfs.nfsd.nfs_privport: 0
vfs.nfsd.async: 0
vfs.nfsd.enable_locallocks: 0
vfs.nfsd.issue_delegations: 0
vfs.nfsd.commit_miss: 0
vfs.nfsd.commit_blks: 0
vfs.nfsd.mirrormnt: 1
vfs.nfsd.cachetcp: 1
vfs.nfsd.tcpcachetimeo: 300
vfs.nfsd.udphighwater: 500
vfs.nfsd.tcphighwater: 150000
vfs.nfsd.minthreads: 16
vfs.nfsd.maxthreads: 64
vfs.nfsd.threads: 36
vfs.nfsd.request_space_used: 71688
vfs.nfsd.request_space_used_highest: 47536420
vfs.nfsd.request_space_high: 47185920
vfs.nfsd.request_space_low: 31457280
vfs.nfsd.request_space_throttled: 0
vfs.nfsd.request_space_throttle_count: 8455
vfs.nfsd.fha.enable: 1
vfs.nfsd.fha.bin_shift: 22
vfs.nfsd.fha.max_nfsds_per_fh: 8
vfs.nfsd.fha.max_reqs_per_nfsd: 0
vfs.nfsd.fha.fhe_stats: fhe 0xfffffe10807ad540: {
    fh: 32896773722738482
    num_rw: 9
    num_exclusive: 0
    num_threads: 1
    thread 0xfffffe0035dc6380 offset 131072 (count 9)
}, fhe 0xfffffe14197de7c0: {
    fh: 32757316134636194
    num_rw: 8
    num_exclusive: 0
    num_threads: 1
    thread 0xfffffe02c3290800 offset 131072 (count 8)
}, fhe 0xfffffe06f2280cc0: {
    fh: 32869182852828802
    num_rw: 2
    num_exclusive: 0
    num_threads: 1
    thread 0xfffffe038b3a8200 offset 0 (count 2)
}, fhe 0xfffffe0c90f5f400: {
    fh: 32493416164103072
    num_rw: 2
    num_exclusive: 0
    num_threads: 1
    thread 0xfffffe0ea55e4d00 offset 0 (count 2)
}, fhe 0xfffffe0ca9bd3d40: {
    fh: 32896984176135987
    num_rw: 2
    num_exclusive: 0
    num_threads: 1
    thread 0xfffffe038b41ee00 offset 0 (count 2)
}, fhe 0xfffffe07c47884c0: {
    fh: 32897044305678131
    num_rw: 4
    num_exclusive: 0
    num_threads: 1
    thread 0xfffffe03aff63300 offset 131072 (count 4)
}, fhe 0xfffffe0aa9b151c0: {
    fh: 32892809467924243
    num_rw: 2
    num_exclusive: 0
    num_threads: 1
    thread 0xfffffe0f928e3780 offset 0 (count 2)
}, fhe 0xfffffe0762c91300: {
    fh: 32609079633383714
    num_rw: 1
    num_exclusive: 0
    num_threads: 1
    thread 0xfffffe0a44496700 offset 0 (count 1)
}, fhe 0xfffffe11b0bf43c0: {
    fh: 32869363241455234
    num_rw: 2
    num_exclusive: 0
    num_threads: 1
    thread 0xfffffe0d550b4900 offset 0 (count 2)
}, fhe 0xfffffe1771ebd740: {
    fh: 32753381944593018
    num_rw: 6
    num_exclusive: 0
    num_threads: 1
    thread 0xfffffe1342368700 offset 131072 (count 6)
}, fhe 0xfffffe0ba23a52c0: {
    fh: 32679023175800193
    num_rw: 1
    num_exclusive: 0
    num_threads: 1
    thread 0xfffffe07460a8280 offset 0 (count 1)
}, fhe 0xfffffe092bd460c0: {
    fh: 32770347065412426
    num_rw: 2
    num_exclusive: 0
    num_threads: 1
    thread 0xfffffe0446182400 offset 0 (count 2)
}, fhe 0xfffffe07d65df600: {
    fh: 32416961451261960
    num_rw: 1
    num_exclusive: 0
    num_threads: 1
    thread 0xfffffe1596ead400 offset 0 (count 1)
}, fhe 0xfffffe036487ab40: {
    fh: 32746333903260256
    num_rw: 1
    num_exclusive: 0
    num_threads: 1
    thread 0xfffffe0955989380 offset 0 (count 1)
}, fhe 0xfffffe12db02e640: {
    fh: 32803607292153112
    num_rw: 0
    num_exclusive: 1
    num_threads: 1
    thread 0xfffffe0a88c8b780 offset 0 (count 1)
}, fhe 0xfffffe0c50823a00: {
    fh: 32696404908442640
    num_rw: 2
    num_exclusive: 0
    num_threads: 1
    thread 0xfffffe04e4a7fc00 offset 1305526272 (count 2)
}, fhe 0xfffffe1193fd7000: {
    fh: 32623167126115560
    num_rw: 2
    num_exclusive: 0
    num_threads: 1
    thread 0xfffffe0551cb4280 offset 0 (count 2)
}, fhe 0xfffffe0eeacd33c0: {
    fh: 32809096260357425
    num_rw: 2
    num_exclusive: 0
    num_threads: 1
    thread 0xfffffe1516ecdc80 offset 0 (count 2)
}

Unfortunately, I did not think to take a procstat -kk of the nfsd
threads.
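If it happens again, I'll grab kernel stacks before touching the quota,
something along the lines of:

    # dump kernel stacks for every thread of the nfsd processes
    procstat -kk $(pgrep nfsd)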

-GAWollman


