Date:      Fri, 18 Oct 2024 06:25:09 +0000
From:      bugzilla-noreply@freebsd.org
To:        bugs@FreeBSD.org
Subject:   [Bug 282169] zfs rename deadlock with mountd, df & fstat (and possibly others)
Message-ID:  <bug-282169-227@https.bugs.freebsd.org/bugzilla/>

https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=282169

            Bug ID: 282169
           Summary: zfs rename deadlock with mountd, df & fstat (and
                    possibly others)
           Product: Base System
           Version: 13.4-RELEASE
          Hardware: Any
                OS: Any
            Status: New
          Severity: Affects Some People
          Priority: ---
         Component: kern
          Assignee: bugs@FreeBSD.org
          Reporter: pen@lysator.liu.se

Created attachment 254323
  --> https://bugs.freebsd.org/bugzilla/attachment.cgi?id=254323&action=edit
Output from "procstat -kk -a" at the time of the deadlock

I ran into a deadlock while running "zfs rename" on a large number of
filesystems on one of our production servers a couple of days ago.

I was renaming them from DATA/{staff,students}/<user> to
DATA/archive/{staff,students}/<user>.

While running that script, the first 167 (out of about 20000) filesystems
worked fine, and then it deadlocked: mountd stopped servicing new requests,
and a "df -it zfs" never finished, nor did an "fuser" command.

(I run another script from cron every minute that logs the state of the
system by saving the output from "df -it zfs", "fuser", "procstat -kk -a" and
a bunch of other commands; that script also stopped working at the time of
the deadlock.)


Looking at the output from "procstat -kk -a" (attached), it seems the hanging
processes were blocked on ZFS locks.

I found an old bug report from around 2016 (209158) where something similar
is discussed, but that was with the old FreeBSD-ZFS code.

I eventually had to hard-reset the server since it never recovered (at least
not in the 5 hours I waited).

-- 
You are receiving this mail because:
You are the assignee for the bug.


