Date:      Thu, 22 Aug 2019 21:41:27 +0000
From:      bugzilla-noreply@freebsd.org
To:        bugs@FreeBSD.org
Subject:   [Bug 240047] more and more processes get stuck waiting for ufs and zfs until system is rendered inaccessible
Message-ID:  <bug-240047-227@https.bugs.freebsd.org/bugzilla/>

https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=240047

            Bug ID: 240047
           Summary: more and more processes get stuck waiting for ufs and
                    zfs until system is rendered inaccessible
           Product: Base System
           Version: 12.0-RELEASE
          Hardware: amd64
                OS: Any
            Status: New
          Severity: Affects Some People
          Priority: ---
         Component: kern
          Assignee: bugs@FreeBSD.org
          Reporter: fuz@fuz.su

I'm at a conference running an open FTP server.  Files are served by FTP via
ftpd(8), NFS via nfsd(8), and HTTP via Apache 2.4.  The server has its root on
UFS and the remaining files spread over three ZFS pools, one of which is
currently replacing a (working) disk:

$ zpool list -v
NAME                                                         SIZE  ALLOC   FREE  CKPOINT  EXPANDSZ   FRAG    CAP  DEDUP  HEALTH  ALTROOT
disk12                                                      18.1T  14.0T  4.16T        -         -     3%    77%  1.00x  ONLINE  -
  da3                                                       9.06T  6.98T  2.08T        -         -     3%    77%
  diskid/DISK-7JG9E40C%20%20%20%20%20%20%20%20%20%20%20%20  9.06T  6.98T  2.08T        -         -     3%    77%
cache                                                           -      -      -        -         -      -
  ada0p2                                                     170G  3.98G   166G        -         -     0%     2%
disk34                                                      18.1T  14.8T  3.33T        -         -     4%    81%  1.00x  ONLINE  -
  da2                                                       9.06T  7.39T  1.67T        -         -     4%    81%
  da1                                                       9.06T  7.41T  1.66T        -         -     4%    81%
cache                                                           -      -      -        -         -      -
  ada0p5                                                     170G  5.14G   165G        -         -     0%     3%
disk56                                                      18.1T  14.0T  4.15T        -         -     1%    77%  1.00x  ONLINE  -
  replacing                                                 9.06T  6.97T  2.10T        -         -     1%    76%
    da0                                                         -      -      -        -         -      -      -
    da4                                                         -      -      -        -         -      -      -
  diskid/DISK-7PGVBGZC%20%20%20%20%20%20%20%20%20%20%20%20  9.06T  7.01T  2.06T        -         -     1%    77%
cache                                                           -      -      -        -         -      -
  ada0p6                                                     170G  6.03G   164G        -         -     0%     3%

$ df -h
Filesystem         Size    Used   Avail Capacity  Mounted on
/dev/ada0p4        375G     68G    278G    20%    /
devfs              1.0K    1.0K      0B   100%    /dev
tmpfs               33G     76K     33G     0%    /var/run
tmpfs               33G    4.0K     33G     0%    /tmp
tmpfs               33G    156K     33G     0%    /var/log
fdescfs            1.0K    1.0K      0B   100%    /dev/fd
procfs             4.0K    4.0K      0B   100%    /proc
disk12              18T     14T    3.6T    80%    /disk12
disk34              17T     14T    2.8T    83%    /disk34
disk56              18T     14T    3.6T    80%    /disk56
disk34/zeug        3.6T    864G    2.8T    23%    /usr/home/fuz/zeug
<above>:/disk12     18T     14T    3.6T    80%    /export
<above>:/disk34     35T     32T    2.8T    92%    /export
<above>:/disk56     52T     49T    3.6T    93%    /export

Files are served over a 10 GbE connection with an average bandwidth of around
200 MB/s; the limit seems to be the number of IOPS:

$ zpool iostat
               capacity     operations    bandwidth
pool        alloc   free   read  write   read  write
----------  -----  -----  -----  -----  -----  -----
disk12      14.0T  4.16T    254      0  34.8M  6.16K
disk34      14.8T  3.33T    261     29  35.0M  1.20M
disk56      14.0T  4.15T    882     29   118M   191K
----------  -----  -----  -----  -----  -----  -----
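
As a rough cross-check of the iostat figures above, the read columns can be
summed and divided to get a mean read size per operation.  The following is
only a sketch: the function name summarize_reads is hypothetical, and it
assumes the K/M/G/T suffixes printed by zpool iostat are powers of 1024.  Note
also that zpool iostat without an interval reports averages since boot, not
instantaneous values.

```shell
# Sum the read operations and read bandwidth of `zpool iostat` output (stdin)
# and derive the mean size of a read.  Hypothetical helper, not part of ZFS.
summarize_reads() {
    awk '
        # Convert values such as "34.8M" to a plain number.
        # Assumption: suffixes are powers of 1024.
        function scale(v,    n, u) {
            n = v + 0
            u = substr(v, length(v), 1)
            if (u == "K") return n * 1024
            if (u == "M") return n * 1024 * 1024
            if (u == "G") return n * 1024 * 1024 * 1024
            if (u == "T") return n * 1024 * 1024 * 1024 * 1024
            return n
        }
        # Pool rows have 7 fields; skip the column header and dashed rules.
        NF == 7 && $1 != "pool" && $1 !~ /^-+$/ {
            ops += scale($4)
            bw  += scale($6)
        }
        END {
            if (ops > 0)
                printf "read_ops=%d read_MiB=%.1f avg_read_KiB=%.0f\n",
                       ops, bw / 1048576, bw / ops / 1024
        }
    '
}
```

Run as `zpool iostat | summarize_reads`.  For the three pool rows above this
gives read_ops=1397 read_MiB=187.8 avg_read_KiB=138, i.e. roughly 1400 reads
per second at about 138 KiB each, in line with the ~200 MB/s estimate.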

RAM is about half used and nothing seems to indicate any resource exhaustion.

$ vmstat
procs  memory       page                    disks     faults         cpu
r b w  avm   fre   flt  re  pi  po    fr   sr ad0 da0   in    sy    cs us sy id
0 0 0 1.0T  666M   451 1197 436   0 64834 14532   0   0 28631 18084 93822  0 17 83

The only sysctl set is kern.racct.enable=1



After a while, more and more httpd and ftpd processes get stuck in a ufs or
zfs wait state.  They cannot be killed.  I have since rebooted the server a
bunch of times and the problem keeps reappearing.
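
One way to track which processes are stuck is to filter ps(1) output by wait
channel.  This is a minimal sketch, not taken from the affected system: the
function name stuck_by_wchan is hypothetical, and it assumes the input has the
column order produced by `ps -axo pid,state,wchan,comm`.

```shell
# Print processes whose wait channel (3rd column) matches a pattern.
# Expects `ps -axo pid,state,wchan,comm` output on stdin.
# Hypothetical helper; the default pattern covers the ufs and zfs states.
stuck_by_wchan() {
    pattern=${1:-'ufs|zfs'}
    # NR > 1 skips the ps header line.
    awk -v pat="$pattern" 'NR > 1 && $3 ~ pat { print $1, $2, $3, $4 }'
}
```

Run as `ps -axo pid,state,wchan,comm | stuck_by_wchan`.  For any PID it
reports, `procstat -kk <pid>` should show the kernel stack the process is
sleeping in, which is likely the most useful thing to attach to a report like
this one.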

-- 
You are receiving this mail because:
You are the assignee for the bug.