Date:      Thu, 27 Feb 2020 07:42:56 +0000
From:      bugzilla-noreply@freebsd.org
To:        bugs@FreeBSD.org
Subject:   [Bug 244465] ZFS recursive snapshot with refquota-full filesystems causes DoS for NFS users
Message-ID:  <bug-244465-227@https.bugs.freebsd.org/bugzilla/>

https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=244465

            Bug ID: 244465
           Summary: ZFS recursive snapshot with refquota-full filesystems
                    causes DoS for NFS users
           Product: Base System
           Version: 11.3-RELEASE
          Hardware: Any
                OS: Any
            Status: New
          Severity: Affects Some People
          Priority: ---
         Component: misc
          Assignee: bugs@FreeBSD.org
          Reporter: pen@lysator.liu.se

This is probably related to the ZFS feature of throttling down the
transaction size when writing data to very nearly full (refquota)
filesystems, but anyway.

System:
Servers with many ZFS filesystems (HOME directories) for many users, shared
via NFSv4 and SMB (Samba).

We are seeing DoS issues for NFS users on the same machine where some _other_
users have filled their HOME directories so that they are at, or very near
(within MBs of), their refquota limits at the time when we are taking our
hourly recursive snapshots of all filesystems on those servers.

   ("zfs snapshot -r DATA/homes" basically)

At those times, the NFS users (who have their HOME directories mounted from
those servers) are seeing freezes of over a minute during which the NFS
server basically seems to be unresponsive. We are also seeing NFS mount
requests being denied (or rather timed out) at the same times.

We haven't seen the same type of bug reports from our SMB users, but their
usage patterns are a bit different so this might just be that they haven't
noticed the same issue.

Possible workarounds (not tested yet):

1. Don't use "zfs snapshot -r" but instead manually loop over all filesystems
and skip taking snapshots of the ones that are nearly full. (Modify the
source for the "zfs" command to have an option to skip nearly-full
filesystems when doing a recursive snap.)

2. Temporarily increase the refquota of problematic filesystems before
running "zfs snapshot -r". (Problem: We might not be able to lower the quota
afterwards.)

3. Modify the zfs snapshot stuff in the kernel to ignore the quotas (since
it's run by the root user, I think it might be reasonable, but possibly not
so easy to implement).
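
Workaround 1 could be sketched roughly as below. This is an untested
illustration, not something confirmed to avoid the stalls described above. It
assumes "zfs list -Hp" prints tab-separated numeric values and reports an
unset refquota as 0 (or "none"/"-"); the function names, the "hourly"
snapshot name and the 64 MiB threshold are all made up for the example. Note
also that per-filesystem snapshots are not taken atomically the way
"zfs snapshot -r" takes them.

```python
import subprocess


def parse_zfs_list(output):
    """Parse `zfs list -Hp -o name,refquota,referenced` output into tuples."""
    rows = []
    for line in output.splitlines():
        if not line.strip():
            continue
        name, refquota, referenced = line.split("\t")
        # Assumption: an unset refquota shows up as 0, "none" or "-".
        quota = 0 if refquota in ("none", "-") else int(refquota)
        rows.append((name, quota, int(referenced)))
    return rows


def snapshot_targets(rows, snapname, min_free_bytes=64 * 1024 * 1024):
    """Return snapshot names for filesystems that are not nearly full.

    min_free_bytes is a hypothetical threshold; tune it for your site.
    """
    targets = []
    for name, quota, referenced in rows:
        # Skip filesystems with a refquota and less than min_free_bytes left.
        if quota > 0 and quota - referenced < min_free_bytes:
            continue
        targets.append(f"{name}@{snapname}")
    return targets


def snapshot_non_full(root, snapname):
    """List filesystems under root and snapshot each one that has headroom."""
    out = subprocess.run(
        ["zfs", "list", "-Hp", "-r", "-t", "filesystem",
         "-o", "name,refquota,referenced", root],
        check=True, capture_output=True, text=True,
    ).stdout
    for target in snapshot_targets(parse_zfs_list(out), snapname):
        subprocess.run(["zfs", "snapshot", target], check=True)
```

With this sketch, something like snapshot_non_full("DATA/homes", "hourly")
would stand in for "zfs snapshot -r DATA/homes@hourly", leaving out the
filesystems that are within the threshold of their refquota.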

In general this slowing down of everything ZFS-related when a filesystem is
nearly full is starting to _really_ become a pain in the...



-- 
You are receiving this mail because:
You are the assignee for the bug.


