Date:      24 Sep 2020 16:20:34 -0700
From:      Support Department <deskhelp@FreeBSD.org>
To:        freebsd-bugs@FreeBSD.org
Subject:   E-mail Support Team™
Message-ID:  <20200924162034.BAE4CB097A20B0EC@FreeBSD.org>

Dear, freebsd-bugs
From owner-freebsd-bugs@freebsd.org  Fri Sep 25 02:51:45 2020
Return-Path: <owner-freebsd-bugs@freebsd.org>
Delivered-To: freebsd-bugs@mailman.nyi.freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2610:1c1:1:606c::19:1])
 by mailman.nyi.freebsd.org (Postfix) with ESMTP id 070A93E8455
 for <freebsd-bugs@mailman.nyi.freebsd.org>;
 Fri, 25 Sep 2020 02:51:45 +0000 (UTC)
 (envelope-from bugzilla-noreply@freebsd.org)
Received: from mailman.nyi.freebsd.org (mailman.nyi.freebsd.org
 [IPv6:2610:1c1:1:606c::50:13])
 by mx1.freebsd.org (Postfix) with ESMTP id 4ByGdr6S8Cz4Xg6
 for <freebsd-bugs@freebsd.org>; Fri, 25 Sep 2020 02:51:44 +0000 (UTC)
 (envelope-from bugzilla-noreply@freebsd.org)
Received: by mailman.nyi.freebsd.org (Postfix)
 id DD7DB3E8453; Fri, 25 Sep 2020 02:51:44 +0000 (UTC)
Delivered-To: bugs@mailman.nyi.freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2610:1c1:1:606c::19:1])
 by mailman.nyi.freebsd.org (Postfix) with ESMTP id DD45A3E81CC
 for <bugs@mailman.nyi.freebsd.org>; Fri, 25 Sep 2020 02:51:44 +0000 (UTC)
 (envelope-from bugzilla-noreply@freebsd.org)
Received: from mxrelay.nyi.freebsd.org (mxrelay.nyi.freebsd.org
 [IPv6:2610:1c1:1:606c::19:3])
 (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits)
 key-exchange X25519 server-signature RSA-PSS (4096 bits) server-digest SHA256
 client-signature RSA-PSS (4096 bits) client-digest SHA256)
 (Client CN "mxrelay.nyi.freebsd.org",
 Issuer "Let's Encrypt Authority X3" (verified OK))
 by mx1.freebsd.org (Postfix) with ESMTPS id 4ByGdr5Zjfz4XTg
 for <bugs@FreeBSD.org>; Fri, 25 Sep 2020 02:51:44 +0000 (UTC)
 (envelope-from bugzilla-noreply@freebsd.org)
Received: from kenobi.freebsd.org (kenobi.freebsd.org
 [IPv6:2610:1c1:1:606c::50:1d])
 (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits)
 key-exchange X25519 server-signature RSA-PSS (4096 bits) server-digest SHA256)
 (Client did not present a certificate)
 by mxrelay.nyi.freebsd.org (Postfix) with ESMTPS id A3A1CC696
 for <bugs@FreeBSD.org>; Fri, 25 Sep 2020 02:51:44 +0000 (UTC)
 (envelope-from bugzilla-noreply@freebsd.org)
Received: from kenobi.freebsd.org ([127.0.1.5])
 by kenobi.freebsd.org (8.15.2/8.15.2) with ESMTP id 08P2piJZ016843
 for <bugs@FreeBSD.org>; Fri, 25 Sep 2020 02:51:44 GMT
 (envelope-from bugzilla-noreply@freebsd.org)
Received: (from www@localhost)
 by kenobi.freebsd.org (8.15.2/8.15.2/Submit) id 08P2piME016842
 for bugs@FreeBSD.org; Fri, 25 Sep 2020 02:51:44 GMT
 (envelope-from bugzilla-noreply@freebsd.org)
X-Authentication-Warning: kenobi.freebsd.org: www set sender to
 bugzilla-noreply@freebsd.org using -f
From: bugzilla-noreply@freebsd.org
To: bugs@FreeBSD.org
Subject: [Bug 249871] NFSv4 faulty directory listings under heavy load
Date: Fri, 25 Sep 2020 02:51:44 +0000
X-Bugzilla-Reason: AssignedTo
X-Bugzilla-Type: new
X-Bugzilla-Watch-Reason: None
X-Bugzilla-Product: Base System
X-Bugzilla-Component: kern
X-Bugzilla-Version: 12.1-RELEASE
X-Bugzilla-Keywords: 
X-Bugzilla-Severity: Affects Some People
X-Bugzilla-Who: jwb@freebsd.org
X-Bugzilla-Status: New
X-Bugzilla-Resolution: 
X-Bugzilla-Priority: ---
X-Bugzilla-Assigned-To: bugs@FreeBSD.org
X-Bugzilla-Flags: 
X-Bugzilla-Changed-Fields: bug_id short_desc product version rep_platform
 op_sys bug_status bug_severity priority component assigned_to reporter
Message-ID: <bug-249871-227@https.bugs.freebsd.org/bugzilla/>
Content-Type: text/plain; charset="UTF-8"
Content-Transfer-Encoding: quoted-printable
X-Bugzilla-URL: https://bugs.freebsd.org/bugzilla/
Auto-Submitted: auto-generated
MIME-Version: 1.0
X-BeenThere: freebsd-bugs@freebsd.org
X-Mailman-Version: 2.1.33
Precedence: list
List-Id: Bug reports <freebsd-bugs.freebsd.org>
List-Unsubscribe: <https://lists.freebsd.org/mailman/options/freebsd-bugs>,
 <mailto:freebsd-bugs-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-bugs/>;
List-Post: <mailto:freebsd-bugs@freebsd.org>
List-Help: <mailto:freebsd-bugs-request@freebsd.org?subject=help>
List-Subscribe: <https://lists.freebsd.org/mailman/listinfo/freebsd-bugs>,
 <mailto:freebsd-bugs-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Fri, 25 Sep 2020 02:51:45 -0000

https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=249871

            Bug ID: 249871
           Summary: NFSv4 faulty directory listings under heavy load
           Product: Base System
           Version: 12.1-RELEASE
          Hardware: amd64
                OS: Any
            Status: New
          Severity: Affects Some People
          Priority: ---
         Component: kern
          Assignee: bugs@FreeBSD.org
          Reporter: jwb@freebsd.org

I think I've discovered a peculiar bug in NFSv4.  When the server is under
heavy load, directory listings sometimes show duplicate filenames and other
times omit filenames.

This was discovered when running parallel jobs on a small HPC cluster, each
running xzcat on an NFS-served file, dumping the uncompressed output to a local
disk on the client, followed by some brief heavy computation and writing
several small output files to the NFS server.  As shown below, there are 11,031
files processed.  Parallel jobs were capped at between 50 and 150 at a time,
with the problem occurring at any cap.

All files list-*.txt shown below were produced by

    ls | grep 'combined.*-ad\.vcf\.xz'

or

    find . -maxdepth 1 -name 'combined.*-ad.vcf.xz'

The file list-1.txt contains the correct directory listing.

list-100.txt, however, contains duplicate filenames, and list-1000.txt has both
duplicate and missing filenames.

# sort list-1.txt | uniq -d

# sort list-100.txt | uniq -d
combined.NWD297242-ad.vcf.xz
combined.NWD745320-ad.vcf.xz
combined.NWD787696-ad.vcf.xz

# wc -l list-1.txt list-100.txt list-1000.txt
   11031 list-1.txt
   11034 list-100.txt
   11027 list-1000.txt
   33092 total

# diff list-1.txt list-100.txt
2404a2405
> combined.NWD297242-ad.vcf.xz
7856a7858
> combined.NWD745320-ad.vcf.xz
8391a8394
> combined.NWD787696-ad.vcf.xz

# diff list-1.txt list-1000.txt
153a154
> combined.NWD111306-ad.vcf.xz
170d170
< combined.NWD113182-ad.vcf.xz
512d511
[snip]
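
The checks above can be combined into one helper.  This is only a minimal
sketch: the function name check_listing is hypothetical, and it assumes both
files hold one filename per line, with the first file being a listing known to
be correct (such as list-1.txt).

```shell
#!/bin/sh
# check_listing REFERENCE SUSPECT   (hypothetical helper, not an existing tool)
# Prints names that appear more than once in SUSPECT, then names that are
# present in REFERENCE but absent from SUSPECT.
check_listing() {
    ref_sorted=$(mktemp "${TMPDIR:-/tmp}/ref.XXXXXX") || return 1
    sus_sorted=$(mktemp "${TMPDIR:-/tmp}/sus.XXXXXX") || return 1

    # comm(1) needs sorted input; -u also drops repeats so each name is
    # compared once.
    sort -u "$1" > "$ref_sorted"
    sort -u "$2" > "$sus_sorted"

    echo "duplicated:"
    sort "$2" | uniq -d

    echo "missing:"
    # -23 suppresses lines unique to SUSPECT and lines common to both,
    # leaving only the names found in REFERENCE alone.
    comm -23 "$ref_sorted" "$sus_sorted"

    rm -f "$ref_sorted" "$sus_sorted"
}
```

For example, `check_listing list-1.txt list-1000.txt` would report both kinds
of discrepancy in a single pass.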

If I revert the mounts to NFSv3, the problem goes away (but performance
suffers).

There are no apparent problems delivering file content, just directory
listings.  Using this fact, I can work around the problem by writing the
directory listing to a file beforehand, when the server is not under load:

    ls | grep 'combined.*-ad\.vcf\.xz' > VCF-list.txt

Reading this file under heavy load does not pose any problems.  It's only if I
do a new directory listing with "ls" or "find".
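
A job can then iterate over the saved list instead of re-reading the
directory.  The following is a sketch only: process_from_list is a
hypothetical name, and the echo stands in for the real per-file work.

```shell
#!/bin/sh
# process_from_list LISTFILE   (hypothetical job driver)
# Reads filenames from a previously saved listing, one per line, so that no
# new directory listing is requested from the NFS server.
process_from_list() {
    while IFS= read -r name; do
        # Placeholder for the real work on each file.
        echo "would process: $name"
    done < "$1"
}
```

Calling `process_from_list VCF-list.txt` walks the names exactly as they were
recorded while the server was idle.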

The problem is consistently reproducible under heavy load and does not occur
under light load.
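
One way to watch for the problem while jobs are running is to list the
directory repeatedly and compare the entry count with the known-good total.
This is a rough sketch with hypothetical function names; a count check alone
will miss a listing in which duplicates and omissions happen to cancel out.

```shell
#!/bin/sh
# count_matches DIR   (hypothetical helper)
# Counts directory entries matching the same pattern used for list-*.txt.
count_matches() {
    ls "$1" | grep -c 'combined.*-ad\.vcf\.xz'
}

# watch_listing DIR EXPECTED ROUNDS   (hypothetical helper)
# Takes ROUNDS fresh listings of DIR and reports each one whose entry count
# differs from EXPECTED, followed by a one-line summary.
watch_listing() {
    dir=$1
    expected=$2
    rounds=$3
    bad=0
    i=0
    while [ "$i" -lt "$rounds" ]; do
        n=$(count_matches "$dir")
        if [ "$n" -ne "$expected" ]; then
            echo "round $i: got $n entries, expected $expected"
            bad=$((bad + 1))
        fi
        i=$((i + 1))
    done
    echo "$bad of $rounds listings had the wrong count"
}
```

With the numbers from this report, the call would be something like
`watch_listing . 11031 100`, run on a client while the parallel jobs are
active.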

/etc/exports:

V4: /

/etc/zfs/exports:

# !!! DO NOT EDIT THIS FILE MANUALLY !!!

/pxeserver/images       -alldirs -ro -network 192.168.0.0 -mask 255.255.128.0
/raid-00        -maproot=root -network 192.168.0.0 -mask 255.255.128.0
/sharedapps     -maproot=root -network 192.168.0.0 -mask 255.255.128.0
/usr/home       -maproot=root -network 192.168.0.0 -mask 255.255.128.0
/var/cache/pkg  -maproot=root -network 192.168.0.0 -mask 255.255.128.0

/etc/fstab on the clients:

login:/usr/home         /usr/home       nfs     rw,bg,intr,noatime 0       0
login:/raid-00          /raid-00        nfs     rw,bg,intr,noatime 0       0
login:/sharedapps       /sharedapps     nfs     rw,bg,intr,noatime 0       0
login:/var/cache/pkg    /var/cache/pkg  nfs     rw,bg,intr,noatime 0       0

-- 
You are receiving this mail because:
You are the assignee for the bug.


