From owner-freebsd-fs@freebsd.org Thu Dec 10 16:16:04 2020
From: J David
Date: Thu, 10 Dec 2020 11:15:50 -0500
Subject: Major issues with nfsv4
To: freebsd-fs@freebsd.org
Content-Type: text/plain; charset="UTF-8"

Recently, we attempted to get with the 2000's and try switching from
NFSv3 to NFSv4 on our 12.2 servers. This has not gone well.

Any system we switch to NFSv4 mounts is functionally unusable, pegged
at 100% system CPU usage, load average 70+, largely from nfscl threads
and client processes using NFS.

Dmesg shows NFS-related messages:

$ dmesg | fgrep -i nfs | sort | uniq -c | sort -n
   1 nfsv4 err=10010
   4 nfsv4 client/server protocol prob err=10026
  29 nfscl: never fnd open

Nfsstat shows no client activity; "nfsstat -e -c 1" and "nfsstat -c 1"
both report:

 GtAttr Lookup Rdlink   Read  Write Rename Access  Rddir
      0      0      0      0      0      0      0      0
      0      0      0      0      0      0      0      0
      0      0      0      0      0      0      0      0
      0      0      0      0      0      0      0      0
      0      0      0      0      0      0      0      0
      0      0      0      0      0      0      0      0
      0      0      0      0      0      0      0      0
      0      0      0      0      0      0      0      0
      0      0      0      0      0      0      0      0
      0      0      0      0      0      0      0      0
      0      0      0      0      0      0      0      0
      0      0      0      0      0      0      0      0
      0      0      0      0      0      0      0      0
      0      0      0      0      0      0      0      0
      0      0      0      0      0      0      0      0
      0      0      0      0      0      0      0      0

Meanwhile, tcpdump on the client shows an endless stream of getattr
requests at the exact same time nfsstat -c says nothing is happening:

$ sudo tcpdump -n -i net1 -c 10 port 2049 and src 172.20.200.39
14:47:27.037974 IP 172.20.200.39.727 > 172.20.20.161.2049: Flags [.],
ack 72561, win 545, options [nop,nop,TS val 234259249 ecr 4155804100],
length 0
14:47:27.046282 IP 172.20.200.39.727 > 172.20.20.161.2049: Flags [P.],
seq 139940:140092, ack 72561, win 545, options [nop,nop,TS val
234259259 ecr 4155804100], length 152: NFS request xid 1544756021 148
getattr fh 0,5/0
14:47:27.051260 IP 172.20.200.39.727 > 172.20.20.161.2049: Flags [P.],
seq 140092:140248, ack 72641, win 545, options [nop,nop,TS val
234259269 ecr 4155804104], length 156: NFS request xid 1544756022 152
getattr fh 0,5/0
14:47:27.063372 IP 172.20.200.39.727 > 172.20.20.161.2049: Flags [P.],
seq 140248:140404, ack 72721, win 545, options [nop,nop,TS val
234259279 ecr 4155804106], length 156: NFS request xid 1544756023 152
getattr fh 0,5/0
14:47:27.068646 IP 172.20.200.39.727 > 172.20.20.161.2049: Flags [P.],
seq 140404:140556, ack 72801, win 545, options [nop,nop,TS val
234259279 ecr 4155804108], length 152: NFS request xid 1544756024 148
getattr fh 0,5/0
14:47:27.080627 IP 172.20.200.39.727 > 172.20.20.161.2049: Flags [P.],
seq 140556:140712, ack 72881, win 545, options [nop,nop,TS val
234259299 ecr 4155804110], length 156: NFS request xid 1544756025 152
getattr fh 0,5/0
14:47:27.085224 IP 172.20.200.39.727 > 172.20.20.161.2049: Flags [P.],
seq 140712:140868, ack 72961, win 545, options [nop,nop,TS val
234259299 ecr 4155804112], length 156: NFS request xid 1544756026 152
getattr fh 0,5/0
14:47:27.096802 IP 172.20.200.39.727 > 172.20.20.161.2049: Flags [P.],
seq 140868:141024, ack 73041, win 545, options [nop,nop,TS val
234259309 ecr 4155804114], length 156: NFS request xid 1544756027 152
getattr fh 0,5/0
14:47:27.101849 IP 172.20.200.39.727 > 172.20.20.161.2049: Flags [P.],
seq 141024:141180, ack 73121, win 545, options [nop,nop,TS val
234259319 ecr 4155804116], length 156: NFS request xid 1544756028 152
getattr fh 0,5/0
14:47:27.112905 IP 172.20.200.39.727 > 172.20.20.161.2049: Flags [P.],
seq 141180:141336, ack 73201, win 545, options [nop,nop,TS val
234259329 ecr 4155804118], length 156: NFS request xid 1544756029 152
getattr fh 0,5/0

Only 10 shown here for brevity, but:

$ sudo tcpdump -n -i net1 -c 10000 port 2049 and src 172.20.200.39 |
fgrep getattr | wc -l
tcpdump: verbose output suppressed, use -v or -vv for full protocol decode
listening on net1, link-type EN10MB (Ethernet), capture size 262144 bytes
10000 packets captured
20060 packets received by filter
0 packets dropped by kernel
    9759

There are no dropped packets or network problems:

$ netstat -in -I net1
Name    Mtu Network       Address              Ipkts Ierrs Idrop    Opkts Oerrs  Coll
net1   1500               12:33:df:5f:79:d7 40988832     0     0 48760307     0     0
net1      - 172.20.0.0/16 172.20.200.39     40942065     -     - 48756241     -     -

The mount flags in fstab are:

ro,nfsv4,nosuid

The mount flags as reported by "nfsstat -m" are:

nfsv4,minorversion=0,tcp,resvport,hard,cto,sec=sys,acdirmin=3,acdirmax=60,acregmin=5,acregmax=60,nametimeo=60,negnametimeo=60,rsize=65536,wsize=65536,readdirsize=65536,readahead=1,wcommitsize=16777216,timeout=120,retrans=2147483647

Today, I managed to kill everything down to one user process that was
exhibiting this behavior. After a kill -9 on that process, it went to
"REsJ" but continued to burn the same amount of CPU (all system).
Oddly the run state / wait channel was just "CPU1." Running "ktrace"
did not produce any trace records. Probably that is predictable for a
process in E state; if the process had crossed the user/kernel
boundary in a way ktrace could detect, it would have exited.

At that point, I started unmounting filesystems. Everything but the
NFS filesystem used by that process unmounted cleanly. The umount for
that filesystem went to D state for about a minute and then kicked
back "Device busy." That's fair, if awfully slow.

Meanwhile, that user process continued burning system CPU with the E
flag set, not doing anything whatsoever in userspace, still producing
300+ "getattr fh 0,5/0" per second according to tcpdump and 0
according to nfsstat.

Eventually, I rebooted with fstab set back to nfsv3.

This feels like the user process is in a system call that is stuck in
an endless loop repeating some operation that generates that getattr
request. But that is a feeling, not a fact.

This is fairly easy to reproduce; it seems pretty consistent within a
few hours (a day at most) any time I switch the relevant mounts to
nfsv4. Reverting to nfsv3 makes this issue completely disappear.

What on earth could be going on here? What other information can I
provide that would help track this down?

Thanks for any advice!
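
The mismatch described above (tcpdump showing a steady getattr stream
while nfsstat reports zero) can be put into numbers from tcpdump's text
output. What follows is only a minimal Python sketch, not part of the
original report. It assumes tcpdump's default HH:MM:SS.ffffff timestamp
at the start of each line, one packet per line (as tcpdump writes it to
a pipe, before any mail wrapping), and a capture that does not cross
midnight. The names getattr_rate and TS_RE are made up for illustration.

```python
import re
from datetime import datetime

# Leading timestamp in tcpdump's default text output (assumed format).
TS_RE = re.compile(r"^(\d{2}:\d{2}:\d{2}\.\d{6}) ")


def getattr_rate(lines):
    """Return (count, seconds, per_second) for lines that mention getattr.

    `lines` is any iterable of tcpdump text lines, one packet per line.
    The rate is intervals per second between first and last match.
    """
    stamps = []
    for line in lines:
        m = TS_RE.match(line)
        # Only count NFS getattr requests; plain ACKs are skipped.
        if m and " getattr " in line:
            stamps.append(datetime.strptime(m.group(1), "%H:%M:%S.%f"))
    if len(stamps) < 2:
        return len(stamps), 0.0, 0.0
    span = (stamps[-1] - stamps[0]).total_seconds()
    rate = (len(stamps) - 1) / span if span > 0 else 0.0
    return len(stamps), span, rate
```

Passing sys.stdin to getattr_rate() while piping the tcpdump command
into the script gives a figure that can be compared directly with the
per-second GtAttr column from "nfsstat -c 1".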
From owner-freebsd-fs@freebsd.org Thu Dec 10 20:04:44 2020
Subject: Re: Major issues with nfsv4
From: Peter Eriksson
Date: Thu, 10 Dec 2020 21:00:07 +0100
Cc: FreeBSD FS
Message-Id: <976EC1BD-AB37-478C-B567-E8013E80F071@lysator.liu.se>
To: J David
Content-Type: text/plain; charset=utf-8

Any particular reason you chose to use NFSv4.0 and not NFSv4.1?

Also, it might be useful if you could show the configuration you are
using on the server and the clients. Are the clients FreeBSD 12.2 also
or (more common) some Linux variant?

We are using NFS v4.0 and v4.1 with great success here from our FreeBSD
12.1, 12.2 and 11.3 servers, from various Linux clients (and some
OmniOS clients, only 4.0 on those), with Kerberos.

With NFSv4 there are some additional things you need to set up compared
to NFSv3. For example, the NFS domain name must be the same on servers
and clients, you must run the nfsuserd daemon, and you must have the V4
export line.

Our NFS server setup:

> root:/etc # egrep 'nfs|gss|sec' rc.conf rc.conf.d/* /boot/loader.conf /etc/sysctl.conf exports zfs/exports
>
> rc.conf:gssd_enable="YES"
> rc.conf:nfs_server_enable="YES"
> rc.conf:nfsv4_server_enable="YES"
> rc.conf:nfscbd_enable="YES"
>
> rc.conf.d/nfsuserd:nfsuserd_enable="YES"
> rc.conf.d/nfsuserd:nfsuserd_flags="-manage-gids -domain your.nfs.domain.id 16"
>
> exports:V4: /export -sec=krb5:krb5i:krb5p
>
> zfs/exports:/export/staff -sec=krb5:krb5i:krb5p

On a Linux client (Debian for example) you need to configure the
NFS domain, make sure the idmap/gssd stuff is running and make sure you
nfsmount correctly…

/etc/default/nfs-common
  NEED_IDMAPD=yes
  NEED_GSSD=yes

/etc/idmapd.conf
  [general]
  Domain = your.nfs.domain.id
  Local-Realms = YOUR-KRB5-REALM

/etc/nfsmount.conf
  [NFSMount_Global_Options]
  Defaultvers = 4.1

Packages needed on Linux clients:
  keyutils nfs-kernel-server (on Debian 9)
  nfs-utils libnfsidmap nfs4-acl-tools rpcgssd (CentOS 7)

We use “fstype=nfs4,sec=krb5” when mounting on the Linux clients. At
least on CentOS 7, if you use “fstype=nfs,vers=4,sec=krb5” then it will
use 4.0 instead of the highest supported NFS version…

- Peter

> On 10 Dec 2020, at 17:15, J David wrote:
>
> Recently, we attempted to get with the 2000's and try switching from
> NFSv3 to NFSv4 on our 12.2 servers. This has not gone well.
>
> Any system we switch to NFSv4 mounts is functionally unusable, pegged
> at 100% system CPU usage, load average 70+, largely from nfscl threads
> and client processes using NFS.
> [...]

From owner-freebsd-fs@freebsd.org Thu Dec 10 20:30:57 2020
References: <976EC1BD-AB37-478C-B567-E8013E80F071@lysator.liu.se>
In-Reply-To: <976EC1BD-AB37-478C-B567-E8013E80F071@lysator.liu.se>
From: J David
Date: Thu, 10 Dec 2020 15:30:42 -0500
Subject: Re: Major issues with nfsv4
To: Peter Eriksson
Cc: FreeBSD FS
Content-Type: text/plain; charset="UTF-8"

Ah, oops. The "12.2 servers" referred to at the top of the message
are the NFS *clients* in this scenario. They are application servers,
not NFS servers. Sorry for the confusing overloaded usage of "server"
there!

Everything in the message (dmesg, tcpdump, nfsstat, etc.) is from the
perspective of a FreeBSD 12.2 NFS client, which is where the problems
are occurring.

Our Linux servers (machines? instances? hosts? nodes?) that are NFS
clients have been running NFSv4 against the same servers for many
years without incident.

Thanks!
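
Since the dmesg lines come from the client, the err= numbers in them
are NFSv4 status codes as the client saw them. As a hedged aid for
reading such output, here is a small Python lookup sketch. The table is
a deliberately partial subset of the status values defined in RFC 7530,
and the names NFS4_STATUS and annotate are illustrative, not anything
shipped with FreeBSD.

```python
import re

# Partial table of NFSv4 status codes (values as defined in RFC 7530).
# Only a handful of state-related codes are listed here.
NFS4_STATUS = {
    10010: "NFS4ERR_DENIED",
    10011: "NFS4ERR_EXPIRED",
    10022: "NFS4ERR_STALE_CLIENTID",
    10023: "NFS4ERR_STALE_STATEID",
    10024: "NFS4ERR_OLD_STATEID",
    10025: "NFS4ERR_BAD_STATEID",
    10026: "NFS4ERR_BAD_SEQID",
}

ERR_RE = re.compile(r"err=(\d+)")


def annotate(line):
    """Append the symbolic status name to a dmesg line carrying err=N."""
    line = line.rstrip("\n")
    m = ERR_RE.search(line)
    if not m:
        return line  # no error code on this line, leave it untouched
    name = NFS4_STATUS.get(int(m.group(1)), "not in this partial table")
    return f"{line} ({name})"
```

Applied to the two codes reported earlier, this labels 10010 as
NFS4ERR_DENIED and 10026 as NFS4ERR_BAD_SEQID. Both concern lock and
open state, which at least fits the "nfscl: never fnd open" messages,
though that is a reading of the codes, not a diagnosis.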

From owner-freebsd-fs@freebsd.org Thu Dec 10 20:40:13 2020
From: J David
Date: Thu, 10 Dec 2020 15:39:59 -0500
Subject: Re: Major issues with nfsv4
To: Konstantin Belousov
Cc: freebsd-fs@freebsd.org
Content-Type: text/plain; charset="UTF-8"

On Thu, Dec 10, 2020 at 1:20 PM Konstantin Belousov wrote:
> Show procstat -kk -p output for it.

I will add this to the list of things to try the next time I provoke
this issue. As you might expect, the people working on these machines
don't appreciate these issues, so my goal is to gather as much of a
strategy as I can before doing so again.

Thanks!

From owner-freebsd-fs@freebsd.org Fri Dec 11 01:00:03 2020
From: Rick Macklem
To: J David , "freebsd-fs@freebsd.org"
Subject: Re: Major issues with nfsv4
Date: Fri, 11 Dec 2020 00:59:58 +0000
Content-Type: text/plain; charset="iso-8859-1"
Content-Transfer-Encoding: quoted-printable
MIME-Version: 1.0
Authentication-Results: mx1.freebsd.org; dkim=pass header.d=uoguelph.ca header.s=selector1 header.b=M3+QOVQT; arc=pass (microsoft.com:s=arcselector9901:i=1); dmarc=pass (policy=none) header.from=uoguelph.ca; spf=pass (mx1.freebsd.org: domain of rmacklem@uoguelph.ca designates 2a01:111:f400:fe5d::602 as permitted
sender) smtp.mailfrom=rmacklem@uoguelph.ca X-Spamd-Result: default: False [-4.00 / 15.00]; TO_DN_EQ_ADDR_SOME(0.00)[]; TO_DN_SOME(0.00)[]; R_SPF_ALLOW(-0.20)[+ip6:2a01:111:f400::/48]; RCVD_COUNT_THREE(0.00)[3]; DKIM_TRACE(0.00)[uoguelph.ca:+]; RCPT_COUNT_TWO(0.00)[2]; DMARC_POLICY_ALLOW(-0.50)[uoguelph.ca,none]; NEURAL_HAM_SHORT(-1.00)[-1.000]; FREEMAIL_TO(0.00)[gmail.com,freebsd.org]; FROM_EQ_ENVFROM(0.00)[]; RCVD_TLS_LAST(0.00)[]; RBL_DBL_DONT_QUERY_IPS(0.00)[2a01:111:f400:fe5d::602:from]; ARC_ALLOW(-1.00)[microsoft.com:s=arcselector9901:i=1]; MIME_TRACE(0.00)[0:+]; ASN(0.00)[asn:8075, ipnet:2a01:111:f000::/36, country:US]; NEURAL_HAM_MEDIUM(-1.00)[-1.000]; R_DKIM_ALLOW(-0.20)[uoguelph.ca:s=selector1]; FREEFALL_USER(0.00)[rmacklem]; FROM_HAS_DN(0.00)[]; TAGGED_RCPT(0.00)[]; MIME_GOOD(-0.10)[text/plain]; SPAMHAUS_ZRD(0.00)[2a01:111:f400:fe5d::602:from:127.0.2.255]; DWL_DNSWL_LOW(-1.00)[uoguelph.ca:dkim]; TO_MATCH_ENVRCPT_SOME(0.00)[]; NEURAL_SPAM_LONG(1.00)[1.000]; MAILMAN_DEST(0.00)[freebsd-fs] X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.34 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Fri, 11 Dec 2020 01:00:03 -0000 J. David wrote:=0A= >Recently, we attempted to get with the 2000's and try switching from=0A= >NFSv3 to NFSv4 on our 12.2 servers. 
>This has not gone well.
>
>Any system we switch to NFSv4 mounts is functionally unusable, pegged
>at 100% system CPU usage, load average 70+, largely from nfscl threads
>and client processes using NFS.
>
>Dmesg shows NFS-related messages:
>
>$ dmesg | fgrep -i nfs | sort | uniq -c | sort -n
>   1 nfsv4 err=10010
>   4 nfsv4 client/server protocol prob err=10026
>  29 nfscl: never fnd open
Add "minorversion=1" to your FreeBSD NFS client mount options
and error 10026 should go away (and I suspect that the 10010 will
go away too).

The correct semantics for handling the "seqid" field that
serializes open/lock operations for NFSv4.0 are difficult to get
correct (and might now be broken in the client, since the
original code written 20 years ago depended on exclusive
vnode locking and hasn't been updated or interop tested with
non-FreeBSD NFS servers for ages).
--> NFSv4.0 is close to 20 years old and has been fixed/superseded
    by NFSv4.1 for many years now.
--> NFSv4.1 (and NFSv4.2) replaced the "seqid" stuff with something
    called "sessions", which works better.

I have been tempted to make FreeBSD NFSv4 mounts use 4.1/4.2
by default to avoid problems with NFSv4.0, but I've hesitated since
the change could be considered a POLA violation.

NFSv4.0 is like any .0 release. There were significant issues with the
protocol fixed by NFSv4.1.

If you still have problems when using NFSv4.1, post again.
Btw, "nfsstat -m" shows what the client mount options actually are.

rick


Nfsstat shows no client activity; "nfsstat -e -c 1" and "nfsstat -c 1"
both report:

 GtAttr Lookup Rdlink   Read  Write Rename Access  Rddir
      0      0      0      0      0      0      0      0
      0      0      0      0      0      0      0      0
      0      0      0      0      0      0      0      0
      0      0      0      0      0      0      0      0
      0      0      0      0      0      0      0      0
      0      0      0      0      0      0      0      0
      0      0      0      0      0      0      0      0
      0      0      0      0      0      0      0      0
      0      0      0      0      0      0      0      0
      0      0      0      0      0      0      0      0
      0      0      0      0      0      0      0      0
      0      0      0      0      0      0      0      0
      0      0      0      0      0      0      0      0
      0      0      0      0      0      0      0      0
      0      0      0      0      0      0      0      0
      0      0      0      0      0      0      0      0

Meanwhile, tcpdump on the client shows an endless stream of getattr
requests at the exact same time nfsstat -c says nothing is happening:

$ sudo tcpdump -n -i net1 -c 10 port 2049 and src 172.20.200.39
14:47:27.037974 IP 172.20.200.39.727 > 172.20.20.161.2049: Flags [.],
ack 72561, win 545, options [nop,nop,TS val 234259249 ecr 4155804100],
length 0
14:47:27.046282 IP 172.20.200.39.727 > 172.20.20.161.2049: Flags [P.],
seq 139940:140092, ack 72561, win 545, options [nop,nop,TS val
234259259 ecr 4155804100], length 152: NFS request xid 1544756021 148
getattr fh 0,5/0
14:47:27.051260 IP 172.20.200.39.727 > 172.20.20.161.2049: Flags [P.],
seq 140092:140248, ack 72641, win 545, options [nop,nop,TS val
234259269 ecr 4155804104], length 156: NFS request xid 1544756022 152
getattr fh 0,5/0
14:47:27.063372 IP 172.20.200.39.727 > 172.20.20.161.2049: Flags [P.],
seq 140248:140404, ack 72721, win 545, options [nop,nop,TS val
234259279 ecr 4155804106], length 156: NFS request xid 1544756023 152
getattr fh 0,5/0
14:47:27.068646 IP 172.20.200.39.727 > 172.20.20.161.2049: Flags [P.],
seq 140404:140556, ack 72801, win 545, options [nop,nop,TS val
234259279 ecr 4155804108], length 152: NFS request xid 1544756024 148
getattr fh 0,5/0
14:47:27.080627 IP 172.20.200.39.727 > 172.20.20.161.2049: Flags [P.],
seq 140556:140712, ack 72881, win 545, options [nop,nop,TS val
234259299 ecr 4155804110], length 156: NFS request xid 1544756025 152
getattr fh 0,5/0
14:47:27.085224 IP 172.20.200.39.727 > 172.20.20.161.2049: Flags [P.],
seq 140712:140868, ack 72961, win 545, options [nop,nop,TS val
234259299 ecr 4155804112], length 156: NFS request xid 1544756026 152
getattr fh 0,5/0
14:47:27.096802 IP 172.20.200.39.727 > 172.20.20.161.2049: Flags [P.],
seq 140868:141024, ack 73041, win 545, options [nop,nop,TS val
234259309 ecr 4155804114], length 156: NFS request xid 1544756027 152
getattr fh 0,5/0
14:47:27.101849 IP 172.20.200.39.727 > 172.20.20.161.2049: Flags [P.],
seq 141024:141180, ack 73121, win 545, options [nop,nop,TS val
234259319 ecr 4155804116], length 156: NFS request xid 1544756028 152
getattr fh 0,5/0
14:47:27.112905 IP 172.20.200.39.727 > 172.20.20.161.2049: Flags [P.],
seq 141180:141336, ack 73201, win 545, options [nop,nop,TS val
234259329 ecr 4155804118], length 156: NFS request xid 1544756029 152
getattr fh 0,5/0

Only 10 shown here for brevity, but:

$ sudo tcpdump -n -i net1 -c 10000 port 2049 and src 172.20.200.39 |
fgrep getattr | wc -l
tcpdump: verbose output suppressed, use -v or -vv for full protocol decode
listening on net1, link-type EN10MB (Ethernet), capture size 262144 bytes
10000 packets captured
20060 packets received by filter
0 packets dropped by kernel
9759

There are no dropped packets or network problems:

$ netstat -in -I net1
Name    Mtu Network       Address              Ipkts Ierrs Idrop    Opkts Oerrs  Coll
net1   1500               12:33:df:5f:79:d7 40988832     0     0 48760307     0     0
net1      - 172.20.0.0/16 172.20.200.39     40942065     -     - 48756241     -     -

The mount flags in fstab are:

ro,nfsv4,nosuid

The mount flags as reported by "nfsstat -m" are:

nfsv4,minorversion=0,tcp,resvport,hard,cto,sec=sys,acdirmin=3,acdirmax=60,acregmin=5,acregmax=60,nametimeo=60,negnametimeo=60,rsize=65536,wsize=65536,readdirsize=65536,readahead=1,wcommitsize=16777216,timeout=120,retrans=2147483647

Today, I managed to kill everything down to one user process that was
exhibiting this behavior. After a kill -9 on that process, it went to
"REsJ" but continued to burn the same amount of CPU (all system).
Oddly the run state / wait channel was just "CPU1." Running "ktrace"
did not produce any trace records. Probably that is predictable for a
process in E state; if the process had crossed the user/kernel
boundary in a way ktrace could detect, it would have exited.

At that point, I started unmounting filesystems. Everything but the
NFS filesystem used by that process unmounted cleanly. The umount for
that filesystem went to D state for about a minute and then kicked
back "Device busy." That's fair, if awfully slow.

Meanwhile, that user process continued burning system CPU with the E
flag set, not doing anything whatsoever in userspace, still producing
300+ "getattr fh 0,5/0" per second according to tcpdump and 0
according to nfsstat.

Eventually, I rebooted with fstab set back to nfsv3.

This feels like the user process is in a system call that is stuck in
an endless loop repeating some operation that generates that getattr
request. But that is a feeling, not a fact.

This is fairly easy to reproduce; it seems pretty consistent within a
few hours (a day at most) any time I switch the relevant mounts to
nfsv4. Reverting to nfsv3 makes this issue completely disappear.

What on earth could be going on here?
What other information can I
provide that would help track this down?

Thanks for any advice!
_______________________________________________
freebsd-fs@freebsd.org mailing list
https://lists.freebsd.org/mailman/listinfo/freebsd-fs
To unsubscribe, send any mail to "freebsd-fs-unsubscribe@freebsd.org"

From owner-freebsd-fs@freebsd.org Fri Dec 11 01:21:14 2020
From: Rick Macklem
To: J David, Peter Eriksson
CC: FreeBSD FS
Subject: Re: Major issues with nfsv4
Date: Fri, 11 Dec 2020 01:21:11 +0000
References: <976EC1BD-AB37-478C-B567-E8013E80F071@lysator.liu.se>
MIME-Version: 1.0
Content-Type: text/plain; charset="iso-8859-1"
X-List-Received-Date: Fri, 11 Dec 2020 01:21:14 -0000

J. David wrote:
>Ah, oops. The "12.2 servers" referred to at the top of the message
>are the NFS *clients* in this scenario. They are application servers,
>not NFS servers.
>Sorry for the confusing overloaded usage of "server"
>there!
So what is your NFS server running?

Btw, if it happens to be a Linux system and you aren't using Kerberos,
it will expect Users/Groups as the numbers in strings by default.
To do that, do not start the nfsuserd(8) daemon on the client and
instead add the following line to the client's /etc/sysctl.conf file:
vfs.nfs.enable_uidtostring=1

When User/Group mapping is broken, you'll see lots of files owned
by "nobody".

Also, if you do want to see what the NFS packets look like, you can
capture packets with tcpdump, but then look at them in wireshark.
# tcpdump -s 0 -w out.pcap host
- then look at out.pcap in wireshark. Unlike tcpdump, wireshark
  knows how to parse NFS messages properly.

rick
ps: Once you have switched to NFSv4.1 and have User/Group
    mapping working, I suspect the NFS clients will be ok.
    Using NFSv4.1 also avoids FreeBSD NFS server issues w.r.t.
    tuning the DRC, since it is not used by NFSv4.1 (again, fixed
    by sessions).

Everything in the message (dmesg, tcpdump, nfsstat, etc.) is from the
perspective of a FreeBSD 12.2 NFS client, which is where the problems
are occurring.

Our Linux servers (machines? instances? hosts? nodes?) that are NFS
clients have been running NFSv4 against the same servers for many
years without incident.

Thanks!

From owner-freebsd-fs@freebsd.org Fri Dec 11 16:33:24 2020
Subject: forcing nfsv4 versions from the server? (was Re: Major issues with nfsv4
To: Rick Macklem, J David, "freebsd-fs@freebsd.org"
From: mike tancsa
Message-ID: <318cbeaf-ce39-6ed7-3c64-8dc0efc540ce@sentex.net>
Date: Fri, 11 Dec 2020 11:33:22 -0500
MIME-Version: 1.0
Content-Type: text/plain; charset=utf-8
X-List-Received-Date: Fri, 11 Dec 2020 16:33:24 -0000

On 12/10/2020 7:59 PM, Rick Macklem wrote:
> J.
David wrote:
>> Recently, we attempted to get with the 2000's and try switching from
>> NFSv3 to NFSv4 on our 12.2 servers. This has not gone well.
>>
>> Any system we switch to NFSv4 mounts is functionally unusable, pegged
>> at 100% system CPU usage, load average 70+, largely from nfscl threads
>> and client processes using NFS.
>>
>> Dmesg shows NFS-related messages:
>>
>> $ dmesg | fgrep -i nfs | sort | uniq -c | sort -n
>>    1 nfsv4 err=10010
>>    4 nfsv4 client/server protocol prob err=10026
>>   29 nfscl: never fnd open
> Add "minorversion=1" to your FreeBSD NFS client mount options
> and error 10026 should go away (and I suspect that the 10010 will
> go away too.

Hi Rick,

    I never knew there was such an important difference. Is there a way
on the server side to force only v4.1 connections from the client when
they try to do a v4.x mount?

    ---Mike

From owner-freebsd-fs@freebsd.org Fri Dec 11 16:57:19 2020
Date: Fri, 11 Dec 2020 17:57:13 +0100
From: Ali Abdallah
To: freebsd-fs@freebsd.org
Cc: freebsd-stable@freebsd.org
Subject: Consistency of pkg db on UFS
Message-ID: <20201211165713.syvzamtdtrbrgx44@frix230>
MIME-Version: 1.0
Content-Type: text/plain; charset=utf-8
Content-Disposition: inline
X-List-Received-Date: Fri, 11 Dec 2020 16:57:19 -0000

Hello,

I've come across the following issue on a freshly installed system with
a UFS SU+j root partition.

# pkg install -y some_package
The package management tool is not yet installed on your system.
Do you want to fetch and install it now? [y/N]: y
# kldload SOME_MODULE -> crash happens here (due to drm for example)

On next boot any attempt to use pkg results in the following message:

# pkg install nano
The package management tool is not yet installed on your system.
Do you want to fetch and install it now? [y/N]: y
Bootstrapping pkg from pkg+http://pkg.FreeBSD.org/FreeBSD:12:amd64/quarterly, please wait...
Verifying signature with trusted certificate pkg.freebsd.org.2013102301... done
Installing pkg-1.15.10...
the most recent version of pkg-1.15.10 is already installed It seems that /var/db/pkg/local.sqlite contains entry for the installed packages, but those packages didn't make it to the filesystem. Is this because the db is fsynced while the actual package data is not? The issue is very easy to reproduce (on a VM for instance). (I was not able to reproduce on a ZFS root filesystem). Regards, Ali From owner-freebsd-fs@freebsd.org Fri Dec 11 20:30:44 2020 Return-Path: Delivered-To: freebsd-fs@mailman.nyi.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2610:1c1:1:606c::19:1]) by mailman.nyi.freebsd.org (Postfix) with ESMTP id 8954F4B9E61 for ; Fri, 11 Dec 2020 20:30:44 +0000 (UTC) (envelope-from jdavidlists@gmail.com) Received: from mail-lj1-x22f.google.com (mail-lj1-x22f.google.com [IPv6:2a00:1450:4864:20::22f]) (using TLSv1.3 with cipher TLS_AES_128_GCM_SHA256 (128/128 bits) key-exchange X25519 server-signature RSA-PSS (4096 bits) server-digest SHA256 client-signature RSA-PSS (2048 bits) client-digest SHA256) (Client CN "smtp.gmail.com", Issuer "GTS CA 1O1" (verified OK)) by mx1.freebsd.org (Postfix) with ESMTPS id 4Ct2VC6N5lz4r3l; Fri, 11 Dec 2020 20:30:43 +0000 (UTC) (envelope-from jdavidlists@gmail.com) Received: by mail-lj1-x22f.google.com with SMTP id f11so12426779ljn.2; Fri, 11 Dec 2020 12:30:43 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20161025; h=mime-version:references:in-reply-to:from:date:message-id:subject:to :cc; bh=ZVTEIgZn702gSvneh1YX0fnGddHB1nVchrnrrZv7/Zc=; b=SzB6y2jM728B1WHmkDJBJgDvg3tb4cXAO/X/iBuzyji1ZY5WREkpN+tMmh63DoFQJx ZD/aMA3kju6+x6y5fnLrv6l7wZ2JjbZNkDHFXKVz7gj5HrBZmG+obaeqPh6sOboT4/zP nyodPVGpC3FiqdE7/I3a+tD+Tc8Uc8WEhhUc+hIp0PV/jRUIKlxeul9H02khOs81pHs/ Mmh2Qcj+v11BFLkLFZ2vhe60mPo3KNVdZX6GT4hMWb/2vrb2oNYayYe1mp5vGjr47Ufb gtnf0h5hzRcc+PJmzVU8XJ3QRBa1lPoyTd6Onl60vzjk5llFJSR/L7hrRAUz9LM7wy06 Km+A== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; 
h=x-gm-message-state:mime-version:references:in-reply-to:from:date :message-id:subject:to:cc; bh=ZVTEIgZn702gSvneh1YX0fnGddHB1nVchrnrrZv7/Zc=; b=n+9MNpvvv2sIK76BmNEZVJ9DOrEYS2TeGJUXghMHvWdz/bZevZQT4JujXeY2nO47pV RwI6tn0ilcyYSfZsEnXl6nrNgol8l171ujpNgT20/Mp+HssFWysmjF4WJO5CCv9cuGL3 aT+jCiS2vQyb22NOTqfzuACBpII6A+gCtD0BpKxhTd/+kO92McZYu4i/Sqe87506fCRl EvWUkYjJXW1eB0QlYdAW28MO9MYG/SkRQdH7L3akO3YP8QnL2O5M4eWBJTgsgUbLHsjj 8VWQ2kEY916K2UGDm37KIEno3tXaVEYcZV1XJJfTLqL5zbmboXusONMI0AJOjyCNMvQT eMlQ== X-Gm-Message-State: AOAM533rAMCem+oTMwCuHEmntul3UgsdQ5IbvibKDxo6BuVnsKYuebs7 Iu3kDFbvf/EfuTdHLJ5MWiodsF6mcYhg1IthMohVGhmDFeIj0Q== X-Google-Smtp-Source: ABdhPJwYiR6lZvd0Vk1OltM5p7D+f5MYMLaaH4w4WqcIZg0QZTgoXK52BMEoQv1LTQHj6OnHst0rbjd8CrT0eAKTk10= X-Received: by 2002:a05:651c:1282:: with SMTP id 2mr5827212ljc.383.1607718640491; Fri, 11 Dec 2020 12:30:40 -0800 (PST) MIME-Version: 1.0 References: In-Reply-To: From: J David Date: Fri, 11 Dec 2020 15:30:29 -0500 Message-ID: Subject: Re: Major issues with nfsv4 To: Konstantin Belousov Cc: freebsd-fs@freebsd.org Content-Type: text/plain; charset="UTF-8" X-Rspamd-Queue-Id: 4Ct2VC6N5lz4r3l X-Spamd-Bar: - Authentication-Results: mx1.freebsd.org; dkim=pass header.d=gmail.com header.s=20161025 header.b=SzB6y2jM; dmarc=pass (policy=none) header.from=gmail.com; spf=pass (mx1.freebsd.org: domain of jdavidlists@gmail.com designates 2a00:1450:4864:20::22f as permitted sender) smtp.mailfrom=jdavidlists@gmail.com X-Spamd-Result: default: False [-1.98 / 15.00]; TO_DN_SOME(0.00)[]; FREEMAIL_FROM(0.00)[gmail.com]; R_SPF_ALLOW(-0.20)[+ip6:2a00:1450:4000::/36]; DKIM_TRACE(0.00)[gmail.com:+]; RCPT_COUNT_TWO(0.00)[2]; DMARC_POLICY_ALLOW(-0.50)[gmail.com,none]; NEURAL_HAM_SHORT(-0.98)[-0.984]; FROM_EQ_ENVFROM(0.00)[]; MIME_TRACE(0.00)[0:+]; RBL_DBL_DONT_QUERY_IPS(0.00)[2a00:1450:4864:20::22f:from]; FREEMAIL_ENVFROM(0.00)[gmail.com]; ASN(0.00)[asn:15169, ipnet:2a00:1450::/32, country:US]; TAGGED_FROM(0.00)[]; DWL_DNSWL_NONE(0.00)[gmail.com:dkim]; 
ARC_NA(0.00)[]; NEURAL_HAM_MEDIUM(-1.00)[-1.000]; R_DKIM_ALLOW(-0.20)[gmail.com:s=20161025]; FROM_HAS_DN(0.00)[]; TO_MATCH_ENVRCPT_ALL(0.00)[]; MIME_GOOD(-0.10)[text/plain]; SPAMHAUS_ZRD(0.00)[2a00:1450:4864:20::22f:from:127.0.2.255]; NEURAL_SPAM_LONG(1.00)[1.000]; RCVD_IN_DNSWL_NONE(0.00)[2a00:1450:4864:20::22f:from]; RCVD_COUNT_TWO(0.00)[2]; RCVD_TLS_ALL(0.00)[]; MAILMAN_DEST(0.00)[freebsd-fs] X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.34 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Fri, 11 Dec 2020 20:30:44 -0000 On Thu, Dec 10, 2020 at 1:20 PM Konstantin Belousov wrote: > E means exiting process. Is it multithreaded ? > Show procstat -kk -p output for it. To answer this separately, procstat -kk of an exiting process generating huge volumes of getattr requests produces nothing but the headers: # ps Haxlww | fgrep DNE 0 21281 18549 1 20 0 11196 2560 piperd S+ 1 0:00.00 fgrep DNE 125428 9661 1 0 36 15 0 16 nfsreq DNE+J 3- 3:22.54 job_exec # proctstat -kk 9661 PID TID COMM TDNAME KSTACK This happened while retesting on NFSv4.1. Although I don't know if the process was originally multithreaded, it appears it wasn't even single-threaded by the time it got into this state. Thanks! 
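For readers unfamiliar with the state column in the ps output above, here is a minimal Python sketch that spells out a string such as DNE+J flag by flag. The flag meanings are paraphrased from memory of the "state" keyword description in FreeBSD ps(1) and should be treated as an assumption to check against the man page for your release; `decode_ps_state` is a hypothetical helper name, not an existing tool.

```python
# Flag meanings paraphrased from the "state" keyword in FreeBSD ps(1).
# This table is an assumption; verify it against your release's man page.
FIRST_CHAR = {
    "D": "uninterruptible (disk or other short-term) wait",
    "I": "idle (sleeping longer than about 20 seconds)",
    "L": "waiting to acquire a lock",
    "R": "runnable",
    "S": "sleeping (less than about 20 seconds)",
    "T": "stopped",
    "W": "idle interrupt thread",
    "Z": "zombie",
}

EXTRA_FLAGS = {
    "+": "in the foreground process group of its control terminal",
    "<": "raised CPU scheduling priority",
    "C": "capability mode",
    "E": "trying to exit",
    "J": "in a jail",
    "L": "pages locked in core",
    "N": "reduced CPU scheduling priority (nice)",
    "s": "session leader",
    "V": "parent suspended during vfork",
    "W": "swapped out",
    "X": "being traced or debugged",
}


def decode_ps_state(state):
    """Return one human-readable description per character of a ps state string."""
    if not state:
        raise ValueError("empty state string")
    descriptions = [FIRST_CHAR.get(state[0], "unknown run state %r" % state[0])]
    for flag in state[1:]:
        descriptions.append(EXTRA_FLAGS.get(flag, "unknown flag %r" % flag))
    return descriptions


if __name__ == "__main__":
    # The state reported for job_exec in the message above.
    for line in decode_ps_state("DNE+J"):
        print(line)
```

Read this way, the job_exec process is in an uninterruptible wait (the wait channel is nfsreq), niced, trying to exit, and jailed, which matches Konstantin's remark that E marks an exiting process.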
From owner-freebsd-fs@freebsd.org Fri Dec 11 21:52:30 2020
From: J David
Date: Fri, 11 Dec 2020 16:52:16 -0500
Subject: Re: Major issues with nfsv4
To: Rick Macklem
Cc: "freebsd-fs@freebsd.org"

Unfortunately, switching the FreeBSD NFS clients to NFSv4.1 did not
resolve our issue. But I've narrowed down the problem to a harmful
interaction between NFSv4 and nullfs.

These FreeBSD NFS clients form a pool of application servers that run
jobs for the application. A given job needs read-write access to its
data and read-only access to the set of binaries it needs to run.

The job data is horizontally partitioned across a set of directory
trees spread over one set of NFS servers. A separate set of NFS
servers store the read-only binary roots.

The jobs are assigned to these machines by a scheduler. A job might
take five milliseconds or five days.

Historically, we have mounted the job data trees and the various
binary roots on each application server over NFSv3. When a job
starts, its setup binds the needed data and binaries into a jail via
nullfs, then runs the job in the jail. This approach has worked
perfectly for 10+ years.

After I switched a server to NFSv4.1 to test that recommendation, it
started having the same load problems as NFSv4. As a test, I altered
it to mount NFS directly in the jails for both the data and the
binaries. As "nullfs-NFS" jobs finished and "direct NFS" jobs
started, the load and CPU usage started to fall dramatically.

The critical problem with this approach is that privileged TCP ports
are a finite resource. At two per job, this creates two issues.

First, there's a hard limit on both simultaneous jobs per server
inconsistent with the hardware's capabilities. Second, due to
TIME_WAIT, it places a hard limit on job throughput. In practice,
these limits also interfere with each other; the more simultaneous
long jobs are running, the more impact TIME_WAIT has on short job
throughput.

While it's certainly possible to configure NFS not to require reserved
ports, the slightest possibility of a non-root user establishing a
session to the NFS server kills that as an option.

Turning down TIME_WAIT helps, though the ability to do that only on
the interface facing the NFS server would be more palatable than doing
it globally.

Adjusting net.inet.ip.portrange.lowlast does not seem to help. The
code at sys/nfs/krpc_subr.c correctly uses ports between
IPPORT_RESERVED and IPPORT_RESERVED/2 instead of ipport_lowfirstauto
and ipport_lowlastauto. But is that the correct place to look for
NFSv4.1?

How explosive would adding SO_REUSEADDR to the NFS client be? It's
not a full solution, but it would handle the TIME_WAIT side of the
issue.

Even so, there may be no workaround for the simultaneous mount limit
as long as reserved ports are required. Solving the negative
interaction with nullfs seems like the only long-term fix.

What would be a good next step there?

Thanks!
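The two limits described in the message above can be put into rough numbers. The following Python sketch is a back-of-the-envelope model only. It assumes the usable pool is the range from IPPORT_RESERVED/2 up to IPPORT_RESERVED (about 512 ports, per the message), that every job holds two ports for its whole lifetime, and that each port then stays unusable for a TIME_WAIT period of 60 seconds. The 60-second figure is an assumption (twice a 30-second MSL), not a value taken from the thread, and the function names are hypothetical.

```python
IPPORT_RESERVED = 1024  # value defined in <netinet/in.h>


def reserved_port_pool(low=IPPORT_RESERVED // 2, high=IPPORT_RESERVED):
    """Size of the half-open port range [low, high) available to NFS mounts."""
    return high - low


def max_simultaneous_jobs(pool, ports_per_job=2):
    """Hard cap on concurrent jobs when every job pins ports_per_job ports."""
    return pool // ports_per_job


def max_short_job_rate(pool, long_jobs, ports_per_job=2,
                       job_seconds=0.005, time_wait_seconds=60.0):
    """Steady-state short jobs per second that the remaining ports can sustain.

    Each short job ties up its ports for job_seconds plus time_wait_seconds
    (an assumed TIME_WAIT duration), so by Little's law:
        rate * ports_per_job * (job_seconds + time_wait_seconds) <= free ports
    """
    free_ports = pool - long_jobs * ports_per_job
    if free_ports <= 0:
        return 0.0
    return free_ports / (ports_per_job * (job_seconds + time_wait_seconds))


if __name__ == "__main__":
    pool = reserved_port_pool()
    print("reserved ports available:", pool)
    print("max simultaneous jobs:", max_simultaneous_jobs(pool))
    for long_jobs in (0, 100, 200):
        rate = max_short_job_rate(pool, long_jobs)
        print("long jobs=%d -> about %.2f short jobs/second" % (long_jobs, rate))
```

Under these assumptions the model shows the interference the message describes: every long-running job removes two ports from the pool, so the sustainable rate of five-millisecond jobs falls as the number of long jobs rises, and it reaches zero once long jobs alone hold every reserved port.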
From owner-freebsd-fs@freebsd.org Fri Dec 11 23:06:30 2020 Return-Path: Delivered-To: freebsd-fs@mailman.nyi.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2610:1c1:1:606c::19:1]) by mailman.nyi.freebsd.org (Postfix) with ESMTP id 8F5334BE07B for ; Fri, 11 Dec 2020 23:06:30 +0000 (UTC) (envelope-from rmacklem@uoguelph.ca) Received: from CAN01-QB1-obe.outbound.protection.outlook.com (mail-qb1can01on0606.outbound.protection.outlook.com [IPv6:2a01:111:f400:fe5c::606]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (Client CN "mail.protection.outlook.com", Issuer "GlobalSign Organization Validation CA - SHA256 - G3" (verified OK)) by mx1.freebsd.org (Postfix) with ESMTPS id 4Ct5xx4ctSz3JJV for ; Fri, 11 Dec 2020 23:06:29 +0000 (UTC) (envelope-from rmacklem@uoguelph.ca) ARC-Seal: i=1; a=rsa-sha256; s=arcselector9901; d=microsoft.com; cv=none; b=Buk0jDA5lbS57e058uZyVCJIlf7tCVGBkAWIQFZWiybBFxeruncKvuHSq52ZtMManyvMukezGpccyfyA/nmnSa7ZJRKp1d4QtjvS9Lzj4IUNsrNrPdJH43bTYfQ5inRRAYB9zgdemeiX6srd0AfSami9DBWjWvnWEVVi6cO9VslkxIeblWfqHWVmtYSBjZW0uOitST+DIRyMHJkR2qPpG7ckhTrCKJK0qsNiVVwKJJYgcVDpBNK/C0zCfFlq+yZ3HbOEfJfitemOCMDOVyVx3Q93uINxMDl3I/FFBLC01Wfm3thBcf9H8PuXbY/oTMITJEEd1LZBn45beUBU0bF0ng== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=microsoft.com; s=arcselector9901; h=From:Date:Subject:Message-ID:Content-Type:MIME-Version:X-MS-Exchange-SenderADCheck; bh=3HYDyYvfTL3o1g8Us7knOBgmCvBUCQPeefcEtboPjgo=; b=TCx0C3/ffR3mUCjRSi9ZZBjs4LWx3CVpz4k15HaV1HoJxUYZBprQysHcctv7i7/pgRsT2YUK50Py4Hsvro5vF31kv2TPuKMOQMFbsQ0kDSN2mhVemXvnSLRTSCTIh64sPEm01kuQOSN5B0xUfHRi2x6iR7HIvaXPV34dsTuhmAxZ+gQM5TQz9fszzHqdXe5snJHAtn+tx+qFaxcesOMeb/FZSBiBnJN7yt/uNEyzdzpvJ3rB2EJkfIqbx8pQUP7U4EDjv5d2zthBKGhDVhs5RF2Ld1gmFNLmZ03oJHqOKbY4W3loW3KqCXg+8hT5hyuHIFHWiNazeVZysYrT10B8qw== ARC-Authentication-Results: i=1; mx.microsoft.com 1; spf=pass smtp.mailfrom=uoguelph.ca; dmarc=pass action=none header.from=uoguelph.ca; dkim=pass header.d=uoguelph.ca; arc=none 
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=uoguelph.ca; s=selector1; h=From:Date:Subject:Message-ID:Content-Type:MIME-Version:X-MS-Exchange-SenderADCheck; bh=3HYDyYvfTL3o1g8Us7knOBgmCvBUCQPeefcEtboPjgo=; b=Xee4OH8I+lSS0Ni+f5qhhq6u3wA8lQv2+893QIXLrg/i4w1jVkk4bm8DsiC0TAWp2UDtbW0UMBfBx7jlySq4ssY3fnoIZh21tqsqCx6ypmjsg4160E7erUqgfXHlJVtnm8i07/IdeWiH18ha+MYNXzM+MhqjT/dSTRrwnI7xMQVTSyxEcA17MyyBAwLSfIdqDiQFMIKQxLJj7qq+3xoxW6lUHMsrXK9LXP1QMiYGnsv0gWVhs3es7xY15IJiiauujerk+QzVrsUViQKgDOLOIlQCS5rYc1qiIC6KceZTUZU8jHlkFDGK+JzjX8gYySvTMvldevD5D0wry9vQpCITPQ== Received: from YQXPR0101MB0968.CANPRD01.PROD.OUTLOOK.COM (2603:10b6:c00:19::29) by YQBPR0101MB1698.CANPRD01.PROD.OUTLOOK.COM (2603:10b6:c00:b::11) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id 15.20.3632.18; Fri, 11 Dec 2020 23:06:27 +0000 Received: from YQXPR0101MB0968.CANPRD01.PROD.OUTLOOK.COM ([fe80::7d6b:aa68:78f4:5d94]) by YQXPR0101MB0968.CANPRD01.PROD.OUTLOOK.COM ([fe80::7d6b:aa68:78f4:5d94%6]) with mapi id 15.20.3632.027; Fri, 11 Dec 2020 23:06:27 +0000 From: Rick Macklem To: mike tancsa , J David , "freebsd-fs@freebsd.org" Subject: Re: forcing nfsv4 versions from the server? (was Re: Major issues with nfsv4 Thread-Topic: forcing nfsv4 versions from the server? 
(was Re: Major issues with nfsv4 Thread-Index: AQHWzw/HDat+dHoH9kKG5K3Xpd53kqnxDteQgAEJtwCAAGlzLw== Date: Fri, 11 Dec 2020 23:06:27 +0000 Message-ID: References: , <318cbeaf-ce39-6ed7-3c64-8dc0efc540ce@sentex.net> In-Reply-To: <318cbeaf-ce39-6ed7-3c64-8dc0efc540ce@sentex.net> Accept-Language: en-US Content-Language: en-US X-MS-Has-Attach: X-MS-TNEF-Correlator: x-ms-publictraffictype: Email x-ms-office365-filtering-correlation-id: 017bbbea-ce1d-40b1-c7dd-08d89e2963be x-ms-traffictypediagnostic: YQBPR0101MB1698: x-microsoft-antispam-prvs: x-ms-oob-tlc-oobclassifiers: OLM:2887; x-ms-exchange-senderadcheck: 1 x-microsoft-antispam: BCL:0; x-microsoft-antispam-message-info: PIKOogHK/bKHC3ziKiH8ICmmDRI+4sxMzA4YbesrA9ufTtOVx8flFPvl+gWwTHV0+xpif5vaWrE+vbA11CnPsZdVJgPpGSx6DiAjLd2MfDS4UVI3y2ZWXGx2FWrbdCrZuMlmdAPkm0q95ZMbKsyAQn/j2OKNRQhzkjBmhFwRRYYyr2rCt8BNCobpxB25lb8B5/mTf9Vm7zV2bRRMwRlD5M+JoMmGaEviF3KTZQBRCTnfmx8q3E48e7Byi1FHCZveRa+drYr17OWyaeEju727VWR2sUpAc06F1CNNnJXjwj4IIkmy0Von5nPlmA9ZNOxT x-forefront-antispam-report: CIP:255.255.255.255; CTRY:; LANG:en; SCL:1; SRV:; IPV:NLI; SFV:NSPM; H:YQXPR0101MB0968.CANPRD01.PROD.OUTLOOK.COM; PTR:; CAT:NONE; SFS:(376002)(366004)(136003)(346002)(91956017)(508600001)(2906002)(8936002)(9686003)(110136005)(66556008)(66946007)(52536014)(55016002)(66476007)(186003)(33656002)(5660300002)(8676002)(7696005)(6506007)(66446008)(71200400001)(76116006)(86362001)(83380400001)(786003)(64756008); DIR:OUT; SFP:1101; x-ms-exchange-antispam-messagedata: =?iso-8859-1?Q?Kka4td70ENQooc0hxEYaSO9fKzs6MITC2q1a14iyloOzu38T48IYmt2nDS?= =?iso-8859-1?Q?kiM0EQL4q5d8G4vgKI/JY7VIYU9gNJc2UxzXFGXedDzAfFWvo6iVrigN4h?= =?iso-8859-1?Q?ag740O/b9+D/ky47l3va/ai0IVU9Wj9jRik3efCJynah1lK8FgzigT+0//?= =?iso-8859-1?Q?g8PGjOPlnQtEBofqTJHiq9R6DODUobti+PyhPOz0SEkoO3ylW1supx5OP9?= =?iso-8859-1?Q?8Rhk804aFumKxzOaraTt2Ct7NP6JP7CrRysWOG5ERkwpgd86FrCyEo2VHy?= =?iso-8859-1?Q?g/RtyfkoAAqjqPjzK0D3fs+MO9EbLuH92MCJq4GV1RBq9QJxzX3kDaCP4/?= 
=?iso-8859-1?Q?zXXstWIEfRlbL3mcfw39Yec+vRsgTnjtfzGxb2SliCmddRfZIFkTAaHpqr?= =?iso-8859-1?Q?EH08WeszRuEtrAse6IMDQ986hcfu5UKRsREmFFe+meAOx/+KEVWGuuyCOv?= =?iso-8859-1?Q?D1h4jECwK/fKIkxxjWify6JeaNAkSLg2OyKoe/IM5CjXKzpVX2yVZvjmrC?= =?iso-8859-1?Q?fAOP2YNsI5jlNocokTa6RmZS5IzoULV8fJ+0HV7i/bpWangmscllV1AJne?= =?iso-8859-1?Q?1X6rQEv5GRrFTNkrKQqWh6bNh3KT+t49ZvAmLe+46bFImY0BeRUAqHGCCD?= =?iso-8859-1?Q?QkaPKBc16XJQsE4aavi/6yfTkMY+z6AbmBh+T3N0tFkwwZ8BaQY6xYGXar?= =?iso-8859-1?Q?T6v/j6jLIMxIH/2+s6hJ71Os2aWpmd0TITwZh7SGDaFEqjUP5onC5JnIDy?= =?iso-8859-1?Q?8D7FfGxlP6NTrX6/gzhA2s62DteMRBIso8AcJqE1K4TqnDiwKDhkEuP+ws?= =?iso-8859-1?Q?XtxF4U47/QQVjVTbxBX5rn2dpfMR1c/gJeLOk3CrECq4M86Dg9MVk9Cql9?= =?iso-8859-1?Q?L6uT6QaRCuzvxCE06cSe4nwk48vLp4XhAVMSksjWrlz279mZN8Rc7yqxjE?= =?iso-8859-1?Q?FgDKNaxUmRmEeHINw0fI6r96sac7cI+pmWGwL5X8Gob/xuznUDGWaD6eMl?= =?iso-8859-1?Q?y0shAo+Zhm8kdruE1tL5o6wc31UTL2NNhlW6BBB1w1d1ba95fLWf+hMvyh?= =?iso-8859-1?Q?8zxKHRbTXtLq/vTpVbvSpZg=3D?= x-ms-exchange-transport-forked: True Content-Type: text/plain; charset="iso-8859-1" Content-Transfer-Encoding: quoted-printable MIME-Version: 1.0 X-OriginatorOrg: uoguelph.ca X-MS-Exchange-CrossTenant-AuthAs: Internal X-MS-Exchange-CrossTenant-AuthSource: YQXPR0101MB0968.CANPRD01.PROD.OUTLOOK.COM X-MS-Exchange-CrossTenant-Network-Message-Id: 017bbbea-ce1d-40b1-c7dd-08d89e2963be X-MS-Exchange-CrossTenant-originalarrivaltime: 11 Dec 2020 23:06:27.6350 (UTC) X-MS-Exchange-CrossTenant-fromentityheader: Hosted X-MS-Exchange-CrossTenant-id: be62a12b-2cad-49a1-a5fa-85f4f3156a7d X-MS-Exchange-CrossTenant-mailboxtype: HOSTED X-MS-Exchange-CrossTenant-userprincipalname: 2xoESxbTWqphM3cc/OEEoeQW02GgVIA5xTF3UHv8A1acYhFzalqyf8apWCe7x/sqNcSoHVjm8nJL2/RpT4VG2w== X-MS-Exchange-Transport-CrossTenantHeadersStamped: YQBPR0101MB1698 X-Rspamd-Queue-Id: 4Ct5xx4ctSz3JJV X-Spamd-Bar: ---- Authentication-Results: mx1.freebsd.org; dkim=pass header.d=uoguelph.ca header.s=selector1 header.b=Xee4OH8I; arc=pass (microsoft.com:s=arcselector9901:i=1); 
dmarc=pass (policy=none) header.from=uoguelph.ca; spf=pass (mx1.freebsd.org: domain of rmacklem@uoguelph.ca designates 2a01:111:f400:fe5c::606 as permitted sender) smtp.mailfrom=rmacklem@uoguelph.ca X-Spamd-Result: default: False [-4.00 / 15.00]; TO_DN_EQ_ADDR_SOME(0.00)[]; TO_DN_SOME(0.00)[]; R_SPF_ALLOW(-0.20)[+ip6:2a01:111:f400::/48]; RCVD_COUNT_THREE(0.00)[3]; DKIM_TRACE(0.00)[uoguelph.ca:+]; DMARC_POLICY_ALLOW(-0.50)[uoguelph.ca,none]; NEURAL_HAM_SHORT(-1.00)[-1.000]; FREEMAIL_TO(0.00)[sentex.net,gmail.com,freebsd.org]; FROM_EQ_ENVFROM(0.00)[]; RCVD_TLS_LAST(0.00)[]; RBL_DBL_DONT_QUERY_IPS(0.00)[2a01:111:f400:fe5c::606:from]; ARC_ALLOW(-1.00)[microsoft.com:s=arcselector9901:i=1]; MIME_TRACE(0.00)[0:+]; SUBJECT_HAS_QUESTION(0.00)[]; ASN(0.00)[asn:8075, ipnet:2a01:111:f000::/36, country:US]; NEURAL_HAM_MEDIUM(-1.00)[-1.000]; R_DKIM_ALLOW(-0.20)[uoguelph.ca:s=selector1]; FREEFALL_USER(0.00)[rmacklem]; FROM_HAS_DN(0.00)[]; RCPT_COUNT_THREE(0.00)[3]; TAGGED_RCPT(0.00)[]; MIME_GOOD(-0.10)[text/plain]; SPAMHAUS_ZRD(0.00)[2a01:111:f400:fe5c::606:from:127.0.2.255]; DWL_DNSWL_LOW(-1.00)[uoguelph.ca:dkim]; TO_MATCH_ENVRCPT_SOME(0.00)[]; NEURAL_SPAM_LONG(1.00)[1.000]; MAILMAN_DEST(0.00)[freebsd-fs] X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.34 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Fri, 11 Dec 2020 23:06:30 -0000 mike tancsa wrote:=0A= [stuff snipped]=0A= >Hi Rick,=0A= >=0A= > I never knew there was such an important difference. Is there a way=0A= >on the server side to force only v4.1 connections from the client when=0A= >they try and v4.x mount ?=0A= You can set the sysctl:=0A= vfs.nfsd.server_min_minorversion4=3D1=0A= if your server has it. (I can't remember what versions of FreeBSD have it.)= =0A= =0A= For Linux clients, they will usually use the highest minor version the=0A= server supports. 
FreeBSD clients will use 0 unless the "minorversion=3D1"= =0A= option is on the mount command.=0A= =0A= To be honest, I have only heard of a couple of other sites having the=0A= NFSERR_BADSEQID (10026) error problem and it sounds like J David's=0A= problem is related to nullfs and jails.=0A= =0A= 4.0->4.1 was a minor revision in name only. RFC5661 (the NFSv4.1 one)=0A= is over 500pages. Not a trivial update. On the other hand, 4.1->4.2 is=0A= a minor update, made up of a bunch of additional optional features=0A= like SEEK_HOLE/SEEK_DATA support and local copy_file_range() support=0A= in the server.=0A= =0A= rick=0A= =0A= =0A= ---Mike=0A= =0A= From owner-freebsd-fs@freebsd.org Fri Dec 11 23:08:24 2020 Return-Path: Delivered-To: freebsd-fs@mailman.nyi.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2610:1c1:1:606c::19:1]) by mailman.nyi.freebsd.org (Postfix) with ESMTP id 8A1204BE1D6 for ; Fri, 11 Dec 2020 23:08:24 +0000 (UTC) (envelope-from asomers@gmail.com) Received: from mail-ot1-f54.google.com (mail-ot1-f54.google.com [209.85.210.54]) (using TLSv1.3 with cipher TLS_AES_128_GCM_SHA256 (128/128 bits) key-exchange X25519 server-signature RSA-PSS (4096 bits) server-digest SHA256 client-signature RSA-PSS (2048 bits) client-digest SHA256) (Client CN "smtp.gmail.com", Issuer "GTS CA 1O1" (verified OK)) by mx1.freebsd.org (Postfix) with ESMTPS id 4Ct6073Rqbz3JjS for ; Fri, 11 Dec 2020 23:08:23 +0000 (UTC) (envelope-from asomers@gmail.com) Received: by mail-ot1-f54.google.com with SMTP id b18so9808590ots.0 for ; Fri, 11 Dec 2020 15:08:23 -0800 (PST) X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:mime-version:references:in-reply-to:from:date :message-id:subject:to:cc; bh=k/5FB9tI5gPpuQu1jNK/U1eva7QDrF+Kb5HpHMiyCgs=; b=T4y5MfRY+qkYkEK9yRhd93KISFCzsbOvobRuka++7kXRhpnZ0/Y/HJvYbfcT/cB72L sFzDV9bDoP6y8RYl67GK5ppZfX6Ji29JNVH0sixNv49ZX2tTqv33XSxOqUnPbueBjCAo 
YRqLaz/J6AT4CQkSFMBS131fICA4FqYBo89XLVsUZ7nmvbaoAaNiBhNnYzwmqTeNZsP4 f1Fji0jVO7CqB1IFBt0Ywcp9eZVRfOtr3ZMV6PCxgV8QZywLwluRL9HVo4S2STP58lsj 18i0VQDvK1OgJHcEpGwNBauVmJLoE3jOSDyBsgglxwnYERtnuURVtsIn4sk+5mwCi424 3Nvw== X-Gm-Message-State: AOAM533+76hG5281KzidVdpJsRZ60D/iftbhHb9KHrYQzx01w1TsihVl KbLxux8buZ1zvuZBuwU5rpX4s7JvGymVXPiOdtE= X-Google-Smtp-Source: ABdhPJx9Bs8lYHFVeCAJcnYUfILGryYMppvADxYkNBudn8W3oSFDJG4Aexsb/I3IV0GURLpTFZYTdjOu5dJeMCJAX78= X-Received: by 2002:a9d:646:: with SMTP id 64mr11625976otn.18.1607728101418; Fri, 11 Dec 2020 15:08:21 -0800 (PST) MIME-Version: 1.0 References: In-Reply-To: From: Alan Somers Date: Fri, 11 Dec 2020 16:08:10 -0700 Message-ID: Subject: Re: Major issues with nfsv4 To: J David Cc: Rick Macklem , "freebsd-fs@freebsd.org" X-Rspamd-Queue-Id: 4Ct6073Rqbz3JjS X-Spamd-Bar: / Authentication-Results: mx1.freebsd.org; dkim=none; dmarc=none; spf=pass (mx1.freebsd.org: domain of asomers@gmail.com designates 209.85.210.54 as permitted sender) smtp.mailfrom=asomers@gmail.com X-Spamd-Result: default: False [-0.97 / 15.00]; TO_DN_EQ_ADDR_SOME(0.00)[]; TO_DN_SOME(0.00)[]; R_SPF_ALLOW(-0.20)[+ip4:209.85.128.0/17:c]; RWL_MAILSPIKE_GOOD(0.00)[209.85.210.54:from]; NEURAL_HAM_SHORT(-0.97)[-0.973]; FREEMAIL_TO(0.00)[gmail.com]; FORGED_SENDER(0.30)[asomers@freebsd.org,asomers@gmail.com]; MIME_TRACE(0.00)[0:+,1:+,2:~]; FREEMAIL_ENVFROM(0.00)[gmail.com]; RBL_DBL_DONT_QUERY_IPS(0.00)[209.85.210.54:from]; R_DKIM_NA(0.00)[]; FROM_NEQ_ENVFROM(0.00)[asomers@freebsd.org,asomers@gmail.com]; ASN(0.00)[asn:15169, ipnet:209.85.128.0/17, country:US]; ARC_NA(0.00)[]; NEURAL_HAM_MEDIUM(-1.00)[-1.000]; FREEFALL_USER(0.00)[asomers]; FROM_HAS_DN(0.00)[]; RCPT_COUNT_THREE(0.00)[3]; TAGGED_RCPT(0.00)[]; MIME_GOOD(-0.10)[multipart/alternative,text/plain]; PREVIOUSLY_DELIVERED(0.00)[freebsd-fs@freebsd.org]; DMARC_NA(0.00)[freebsd.org]; SPAMHAUS_ZRD(0.00)[209.85.210.54:from:127.0.2.255]; TO_MATCH_ENVRCPT_SOME(0.00)[]; NEURAL_SPAM_LONG(1.00)[1.000]; 
RCVD_IN_DNSWL_NONE(0.00)[209.85.210.54:from]; RCVD_COUNT_TWO(0.00)[2]; RCVD_TLS_ALL(0.00)[]; MAILMAN_DEST(0.00)[freebsd-fs] Content-Type: text/plain; charset="UTF-8" X-Content-Filtered-By: Mailman/MimeDel 2.1.34 X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.34 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Fri, 11 Dec 2020 23:08:24 -0000 On Fri, Dec 11, 2020 at 2:52 PM J David wrote: > Unfortunately, switching the FreeBSD NFS clients to NFSv4.1 did not > resolve our issue. But I've narrowed down the problem to a harmful > interaction between NFSv4 and nullfs. > > These FreeBSD NFS clients form a pool of application servers that run > jobs for the application. A given job needs read-write access to its > data and read-only access to the set of binaries it needs to run. > > The job data is horizontally partitioned across a set of directory > trees spread over one set of NFS servers. A separate set of NFS > servers store the read-only binary roots. > > The jobs are assigned to these machines by a scheduler. A job might > take five milliseconds or five days. > > Historically, we have mounted the job data trees and the various > binary roots on each application server over NFSv3. When a job > starts, its setup binds the needed data and binaries into a jail via > nullfs, then runs the job in the jail. This approach has worked > perfectly for 10+ years. > > After I switched a server to NFSv4.1 to test that recommendation, it > started having the same load problems as NFSv4. As a test, I altered > it to mount NFS directly in the jails for both the data and the > binaries. As "nullfs-NFS" jobs finished and "direct NFS" jobs > started, the load and CPU usage started to fall dramatically. > > The critical problem with this approach is that privileged TCP ports > are a finite resource. At two per job, this creates two issues. 
> > First, there's a hard limit on both simultaneous jobs per server > inconsistent with the hardware's capabilities. Second, due to > TIME_WAIT, it places a hard limit on job throughput. In practice, > these limits also interfere with each other; the more simultaneous > long jobs are running, the more impact TIME_WAIT has on short job > throughput. > > While it's certainly possible to configure NFS not to require reserved > ports, the slightest possibility of a non-root user establishing a > session to the NFS server kills that as an option. > > Turning down TIME_WAIT helps, though the ability to do that only on > the interface facing the NFS server would be more palatable than doing > it globally. > > Adjusting net.inet.ip.portrange.lowlast does not seem to help. The > code at sys/nfs/krpc_subr.c correctly uses ports between > IPPORT_RESERVED and IPPORT_RESERVED/2 instead of ipport_lowfirstauto > and ipport_lowlastauto. But is that the correct place to look for > NFSv4.1? > > How explosive would adding SO_REUSEADDR to the NFS client be? It's > not a full solution, but it would handle the TIME_WAIT side of the > issue. > > Even so, there may be no workaround for the simultaneous mount limit > as long as reserved ports are required. Solving the negative > interaction with nullfs seems like the only long-term fix. > > What would be a good next step there? > > Thanks! > That's some good information. However, it must not be the whole story. I've been nullfs mounting my NFS mounts for years. 
For example, right now on a FreeBSD 12.2-RC2 machine: > sudo nfsstat -m Password: 192.168.0.2:/home on /usr/home nfsv4,minorversion=1,tcp,resvport,soft,cto,sec=sys,acdirmin=3,acdirmax=60,acregmin=5,acregmax=60,nametimeo=60,negnametimeo=60,rsize=65536,wsize=65536,readdirsize=65536,readahead=1,wcommitsize=16777216,timeout=120,retrans=2147483647 > mount | grep home 192.168.0.2:/home on /usr/home (nfs, nfsv4acls) /usr/home on /iocage/jails/rustup2/root/usr/home (nullfs) Are you using any mount options with nullfs? It might be worth trying to make the read-only mount into read-write, to see if that helps. And what does "jls -n" show? -Alan From owner-freebsd-fs@freebsd.org Fri Dec 11 23:28:32 2020 Return-Path: Delivered-To: freebsd-fs@mailman.nyi.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2610:1c1:1:606c::19:1]) by mailman.nyi.freebsd.org (Postfix) with ESMTP id D27F24BE96D for ; Fri, 11 Dec 2020 23:28:32 +0000 (UTC) (envelope-from rmacklem@uoguelph.ca) Received: from CAN01-QB1-obe.outbound.protection.outlook.com (mail-qb1can01on0620.outbound.protection.outlook.com [IPv6:2a01:111:f400:fe5c::620]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (Client CN "mail.protection.outlook.com", Issuer "GlobalSign Organization Validation CA - SHA256 - G3" (verified OK)) by mx1.freebsd.org (Postfix) with ESMTPS id 4Ct6RN08KTz3KSh for ; Fri, 11 Dec 2020 23:28:31 +0000 (UTC) (envelope-from rmacklem@uoguelph.ca) ARC-Seal: i=1; a=rsa-sha256; s=arcselector9901; d=microsoft.com; cv=none; b=k9luQW4TfsQ4w4nxwJRv4Et/2t2tai7jmTHvDacnVr9rTbqRl/bcRUadgZifJME19+goIs0p11v5I8l2L4LkCXBAD7Dvt2CAI2UK9i3FShmMTodE0eBNfU+0/QUZ7Bbyd6PqmUHWaaIXrif0XvDx0V47+Y6RlN+AtwoERPOHn35/O83DdYkQg7bYd9m/pOYMCiN+IPhajjWb4/RXoofQ+YZzD5Q2q3Bsva/DpYW/F1pdCWBPPbIqgDcryQkhRzl9UXby9bO2dmwVvxU+DIoxOiEWcuuez5OVr8e91/OpWvWauHX2KJWE/pwl2XKmLYrME8kh0VfpW/cM/HNz9C7lvg== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=microsoft.com; s=arcselector9901; 
h=From:Date:Subject:Message-ID:Content-Type:MIME-Version:X-MS-Exchange-SenderADCheck; bh=mSPd6CYGzrUtF9v2TIXrbxpPyMu0uhXD8LkvYsfYkEw=; b=GTPa71Icv6306/MesXLlF4g0RuTeuqxbZt/XuOvVC+iFXiq6T+4ozJ63kz7c9TaWsdfrDsCuQwWG8iCKRWF7fRfXV7i3ivO/9my1C6+5Ybcv400cABPZaTjQ0P1RSEneUN7U0BAQY3/oS3A0rkwBMX9AkgqbyYDGRYRzhk+EpIdww3UWZuKcoJuih4s20uyWwflQ51UeLDRMO0QDMUQkaEzxXXjsOcOTGHm0nUI5k8fXCN8SLsVqiIGVZOYroVxG9uug3Z+d3iNnFJ3kuUejjMhmwq1pCB2w9p95yqg2rbDxWqheUnzx/EggbG3L9ZGxVKCfw5vhgKO+3rhFQ5r1yw== ARC-Authentication-Results: i=1; mx.microsoft.com 1; spf=pass smtp.mailfrom=uoguelph.ca; dmarc=pass action=none header.from=uoguelph.ca; dkim=pass header.d=uoguelph.ca; arc=none DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=uoguelph.ca; s=selector1; h=From:Date:Subject:Message-ID:Content-Type:MIME-Version:X-MS-Exchange-SenderADCheck; bh=mSPd6CYGzrUtF9v2TIXrbxpPyMu0uhXD8LkvYsfYkEw=; b=LUPNRKykN011au51Xaw5HlnOVWg5EJbk8D3MhN8BUQdfkTsRCCyMSDl7DiAFs+xUbS3bKXST7ttiLp8BDM76AG0wYqowkbcwTusy4qcXyeBr4Mv8qo5lLBP/a1tFPuA+5HV2nHAfNcgzVL5RN+HdpWujhZJn3D9agLUc/3/Vajq5JCXNeVjAeHvF8SYnfebbLChW6PbGQEexk0HONah/eRi8xpvqUyBppg/0XFJQNWXDnKkuqG61VJTj8+enADJOauFRVuSSKLeHlviiNEp7aBJI0zT8fyt/5uEizbts7s5FVH1l7icBisVqApqKBi7j+GPimGzxgTwki9T6yHrdpQ== Received: from YTOPR0101MB0970.CANPRD01.PROD.OUTLOOK.COM (2603:10b6:b00:20::29) by YTBPR01MB3040.CANPRD01.PROD.OUTLOOK.COM (2603:10b6:b01:1b::21) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id 15.20.3654.12; Fri, 11 Dec 2020 23:28:30 +0000 Received: from YTOPR0101MB0970.CANPRD01.PROD.OUTLOOK.COM ([fe80::b131:712c:ca2d:c7b7]) by YTOPR0101MB0970.CANPRD01.PROD.OUTLOOK.COM ([fe80::b131:712c:ca2d:c7b7%5]) with mapi id 15.20.3632.022; Fri, 11 Dec 2020 23:28:30 +0000 From: Rick Macklem To: J David CC: "freebsd-fs@freebsd.org" Subject: Re: Major issues with nfsv4 Thread-Topic: Major issues with nfsv4 Thread-Index: AQHWzw/HDat+dHoH9kKG5K3Xpd53kqnxDteQgAFi0QCAABTa8w== Date: Fri, 11 Dec 2020 23:28:30 +0000 Message-ID: References: 

J David wrote:
>Unfortunately, switching the FreeBSD NFS clients to NFSv4.1 did not
>resolve our issue. But I've narrowed down the problem to a harmful
>interaction between NFSv4 and nullfs.
I am afraid I know nothing about nullfs and jails. I suspect it will be
something related to when file descriptors in the NFS client mount
get closed.

The NFSv4 Open is a Windows Open lock and has nothing to do with
a POSIX open.
Since only one of these can exist for each
tuple, the NFSv4 Close must be delayed until
all POSIX Opens on the file have been closed, including open file
descriptors inherited by children processes.

Someone else recently reported problems using nullfs and vnet jails.

>These FreeBSD NFS clients form a pool of application servers that run
>jobs for the application. A given job needs read-write access to its
>data and read-only access to the set of binaries it needs to run.
>
>The job data is horizontally partitioned across a set of directory
>trees spread over one set of NFS servers. A separate set of NFS
>servers store the read-only binary roots.
>
>The jobs are assigned to these machines by a scheduler. A job might
>take five milliseconds or five days.
>
>Historically, we have mounted the job data trees and the various
>binary roots on each application server over NFSv3. When a job
>starts, its setup binds the needed data and binaries into a jail via
>nullfs, then runs the job in the jail. This approach has worked
>perfectly for 10+ years.
Well, NFSv3 is not going away any time soon, so if you don't need
any of the additional features it offers...

>After I switched a server to NFSv4.1 to test that recommendation, it
>started having the same load problems as NFSv4. As a test, I altered
>it to mount NFS directly in the jails for both the data and the
>binaries. As "nullfs-NFS" jobs finished and "direct NFS" jobs
>started, the load and CPU usage started to fall dramatically.
Good work isolating the problem. I may try playing with NFSv4/nullfs
someday soon and see if I can break it.

>The critical problem with this approach is that privileged TCP ports
>are a finite resource.
>At two per job, this creates two issues.
>
>First, there's a hard limit on both simultaneous jobs per server
>inconsistent with the hardware's capabilities. Second, due to
>TIME_WAIT, it places a hard limit on job throughput. In practice,
>these limits also interfere with each other; the more simultaneous
>long jobs are running, the more impact TIME_WAIT has on short job
>throughput.
>
>While it's certainly possible to configure NFS not to require reserved
>ports, the slightest possibility of a non-root user establishing a
>session to the NFS server kills that as an option.
Personally, I've never thought the reserved port# requirement provided
any real security for most situations. Unless you set "vfs.usermount=1"
only root can do the mount. For non-root to mount the NFS server
when "vfs.usermount=0", a user would have to run their own custom hacked
userland NFS client. Although doable, I have never heard of it being done.

rick

Turning down TIME_WAIT helps, though the ability to do that only on
the interface facing the NFS server would be more palatable than doing
it globally.

Adjusting net.inet.ip.portrange.lowlast does not seem to help. The
code at sys/nfs/krpc_subr.c correctly uses ports between
IPPORT_RESERVED and IPPORT_RESERVED/2 instead of ipport_lowfirstauto
and ipport_lowlastauto. But is that the correct place to look for
NFSv4.1?

How explosive would adding SO_REUSEADDR to the NFS client be? It's
not a full solution, but it would handle the TIME_WAIT side of the
issue.

Even so, there may be no workaround for the simultaneous mount limit
as long as reserved ports are required.
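
The two limits described in this message (simultaneous jobs and short-job throughput) can be put into rough numbers. The following is only a back-of-envelope sketch: it assumes the reserved pool is the range cited above (IPPORT_RESERVED/2 up to IPPORT_RESERVED - 1), two ports per job as stated, and a TIME_WAIT duration of 60 seconds. The 60-second figure and the function name are assumptions for illustration, not measurements from these servers.

```python
IPPORT_RESERVED = 1024  # value of the constant in <netinet/in.h>


def reserved_port_budget(ports_per_job=2, time_wait_s=60.0, long_jobs=0):
    """Rough limits implied by a fixed pool of reserved ports.

    Assumes the pool is IPPORT_RESERVED/2 .. IPPORT_RESERVED-1 and that a
    port stays unusable for time_wait_s seconds after its job ends.
    """
    pool = IPPORT_RESERVED - IPPORT_RESERVED // 2    # 512 ports
    max_jobs = pool // ports_per_job                 # simultaneous jobs
    free_ports = pool - long_jobs * ports_per_job    # left for short jobs
    # Each short job ties up its ports for at least time_wait_s, so the
    # sustainable start rate is bounded by free_ports / (ports * wait).
    short_jobs_per_s = max(free_ports, 0) / (ports_per_job * time_wait_s)
    return max_jobs, short_jobs_per_s


if __name__ == "__main__":
    for long_jobs in (0, 100, 200):
        max_jobs, rate = reserved_port_budget(long_jobs=long_jobs)
        print(f"long jobs={long_jobs:3d}  max simultaneous={max_jobs}  "
              f"short-job rate <= {rate:.2f}/s")
```

Under these assumptions the ceiling is 256 simultaneous jobs, and every long-running job lowers the bound on short-job starts per second, which is the interference between the two limits that the message describes.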
Solving the negative interaction with nullfs seems like the only
long-term fix.

What would be a good next step there?

Thanks!

From owner-freebsd-fs@freebsd.org Fri Dec 11 23:35:35 2020
From: Alan Somers
Date: Fri, 11 Dec 2020 16:35:22 -0700
Subject: Re: Major issues with nfsv4
To: Rick Macklem
Cc: J David, "freebsd-fs@freebsd.org"

On Fri, Dec 11, 2020 at 4:28 PM Rick Macklem wrote:

> J David wrote:
> >Unfortunately, switching the FreeBSD
> >NFS clients to NFSv4.1 did not
> >resolve our issue. But I've narrowed down the problem to a harmful
> >interaction between NFSv4 and nullfs.
> I am afraid I know nothing about nullfs and jails. I suspect it will be
> something related to when file descriptors in the NFS client mount
> get closed.
>
> The NFSv4 Open is a Windows Open lock and has nothing to do with
> a POSIX open. Since only one of these can exist for each
> tuple, the NFSv4 Close must be delayed until
> all POSIX Opens on the file have been closed, including open file
> descriptors inherited by children processes.
>

Does it make a difference whether the files are opened read-only or
read-write? My longstanding practice has been to never use NFS to store
object files while compiling. I do that for performance reasons, and I
didn't think that nullfs had anything to do with it (but maybe it does).

>
> Someone else recently reported problems using nullfs and vnet jails.
>
> >These FreeBSD NFS clients form a pool of application servers that run
> >jobs for the application. A given job needs read-write access to its
> >data and read-only access to the set of binaries it needs to run.
> >
> >The job data is horizontally partitioned across a set of directory
> >trees spread over one set of NFS servers. A separate set of NFS
> >servers store the read-only binary roots.
> >
> >The jobs are assigned to these machines by a scheduler. A job might
> >take five milliseconds or five days.
> >
> >Historically, we have mounted the job data trees and the various
> >binary roots on each application server over NFSv3. When a job
> >starts, its setup binds the needed data and binaries into a jail via
> >nullfs, then runs the job in the jail. This approach has worked
> >perfectly for 10+ years.
> Well, NFSv3 is not going away any time soon, so if you don't need
> any of the additional features it offers...
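
The deferred-Close rule quoted earlier in this message can be pictured as reference counting. What follows is a toy model only, not the FreeBSD client code: the class and method names are hypothetical, and the key that identifies the single NFSv4 Open is an assumption, since the quoted text does not spell out what the tuple contains.

```python
class Nfsv4OpenTracker:
    """Toy model: one NFSv4 Open per key, shared by many POSIX opens."""

    def __init__(self):
        self.posix_opens = {}   # key -> number of POSIX opens still held
        self.rpc_log = []       # NFSv4 operations that would go to the server

    def posix_open(self, key):
        if self.posix_opens.get(key, 0) == 0:
            self.rpc_log.append(("OPEN", key))    # first open: real NFSv4 Open
        self.posix_opens[key] = self.posix_opens.get(key, 0) + 1

    def inherit(self, key):
        # A child that inherits a descriptor holds one more reference,
        # without any new NFSv4 Open being sent.
        self.posix_opens[key] += 1

    def posix_close(self, key):
        self.posix_opens[key] -= 1
        if self.posix_opens[key] == 0:
            self.rpc_log.append(("CLOSE", key))   # last close: NFSv4 Close
            del self.posix_opens[key]


if __name__ == "__main__":
    t = Nfsv4OpenTracker()
    key = ("owner-1", "/data/job42/input")   # hypothetical key
    t.posix_open(key)      # parent opens the file
    t.inherit(key)         # child inherits the descriptor
    t.posix_close(key)     # parent closes: no NFSv4 Close yet
    print(t.rpc_log)
    t.posix_close(key)     # child closes: NFSv4 Close is finally sent
    print(t.rpc_log)
```

In this model, anything that keeps a reference longer than expected means the Close is simply never sent. Whether that is what happens when nullfs is stacked on the NFS mount is not established in this thread.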
>
> >After I switched a server to NFSv4.1 to test that recommendation, it
> >started having the same load problems as NFSv4. As a test, I altered
> >it to mount NFS directly in the jails for both the data and the
> >binaries. As "nullfs-NFS" jobs finished and "direct NFS" jobs
> >started, the load and CPU usage started to fall dramatically.
> Good work isolating the problem. I may try playing with NFSv4/nullfs
> someday soon and see if I can break it.
>
> >The critical problem with this approach is that privileged TCP ports
> >are a finite resource. At two per job, this creates two issues.
> >
> >First, there's a hard limit on both simultaneous jobs per server
> >inconsistent with the hardware's capabilities. Second, due to
> >TIME_WAIT, it places a hard limit on job throughput. In practice,
> >these limits also interfere with each other; the more simultaneous
> >long jobs are running, the more impact TIME_WAIT has on short job
> >throughput.
> >
> >While it's certainly possible to configure NFS not to require reserved
> >ports, the slightest possibility of a non-root user establishing a
> >session to the NFS server kills that as an option.
> Personally, I've never thought the reserved port# requirement provided
> any real security for most situations. Unless you set "vfs.usermount=1"
> only root can do the mount. For non-root to mount the NFS server
> when "vfs.usermount=0", a user would have to run their own custom hacked
> userland NFS client. Although doable, I have never heard of it being done.
>

There are a few out there. For example, https://github.com/sahlberg/libnfs .

>
> rick
>
> Turning down TIME_WAIT helps, though the ability to do that only on
> the interface facing the NFS server would be more palatable than doing
> it globally.
>
> Adjusting net.inet.ip.portrange.lowlast does not seem to help. The
> code at sys/nfs/krpc_subr.c correctly uses ports between
> IPPORT_RESERVED and IPPORT_RESERVED/2 instead of ipport_lowfirstauto
> and ipport_lowlastauto.
> But is that the correct place to look for
> NFSv4.1?
>
> How explosive would adding SO_REUSEADDR to the NFS client be? It's
> not a full solution, but it would handle the TIME_WAIT side of the
> issue.
>
> Even so, there may be no workaround for the simultaneous mount limit
> as long as reserved ports are required. Solving the negative
> interaction with nullfs seems like the only long-term fix.
>
> What would be a good next step there?
>
> Thanks!
> _______________________________________________
> freebsd-fs@freebsd.org mailing list
> https://lists.freebsd.org/mailman/listinfo/freebsd-fs
> To unsubscribe, send any mail to "freebsd-fs-unsubscribe@freebsd.org"
>

From owner-freebsd-fs@freebsd.org Fri Dec 11 23:39:20 2020
From: Rick Macklem
To: Alan Somers, J David
CC: "freebsd-fs@freebsd.org"
Subject: Re: Major issues with nfsv4
Date: Fri, 11 Dec 2020 23:39:17 +0000

Alan Somers wrote:
[stuff snipped]
>That's some good information. However, it must not be the whole story. I've
>been nullfs mounting my NFS mounts for years.
>For example, right now on a
>FreeBSD 12.2-RC2 machine:
If I recall, you were one of the two people that needed to switch to
"minorversion=1" to get rid of NFSERR_BADSEQID (10026) errors.
Is that correct?

>> sudo nfsstat -m
>Password:
>192.168.0.2:/home on /usr/home
>nfsv4,minorversion=1,tcp,resvport,soft,cto,sec=sys,acdirmin=3,acdirmax=60,acregmin=5,acregmax=60,nametimeo=60,negnametimeo=60,rsize=65536,wsize=65536,readdirsize=65536,readahead=1,wcommitsize=16777216,timeout=120,retrans=2147483647
Btw, using "soft" with NFSv4 mounts is a bad idea. (See the BUGS section of
"man mount_nfs".)

If you have a hung NFSv4 mount, you can use
# umount -N /usr/home
to dismount it. (It may take a couple of minutes.)

rick

> mount | grep home
192.168.0.2:/home on /usr/home (nfs, nfsv4acls)
/usr/home on /iocage/jails/rustup2/root/usr/home (nullfs)

Are you using any mount options with nullfs? It might be worth trying to
make the read-only mount into read-write, to see if that helps.
And what does "jls -n" show?
-Alan

From owner-freebsd-fs@freebsd.org Fri Dec 11 23:44:55 2020
From: Alan Somers
Date: Fri, 11 Dec 2020
 16:44:43 -0700
Subject: Re: Major issues with nfsv4
To: Rick Macklem
Cc: J David, "freebsd-fs@freebsd.org"

On Fri, Dec 11, 2020 at 4:39 PM Rick Macklem wrote:

> Alan Somers wrote:
> [stuff snipped]
> >That's some good information. However, it must not be the whole story. I've
> >been nullfs mounting my NFS mounts for years. For example, right now on a
> >FreeBSD 12.2-RC2 machine:
> If I recall, you were one of the two people that needed to switch to
> "minorversion=1" to get rid of NFSERR_BADSEQID (10026) errors.
> Is that correct?
>

In fact, yes. Though that case had nothing to do with nullfs or jails.

>
> >> sudo nfsstat -m
> >Password:
> >192.168.0.2:/home on /usr/home
> >nfsv4,minorversion=1,tcp,resvport,soft,cto,sec=sys,acdirmin=3,acdirmax=60,acregmin=5,acregmax=60,nametimeo=60,negnametimeo=60,rsize=65536,wsize=65536,readdirsize=65536,readahead=1,wcommitsize=16777216,timeout=120,retrans=2147483647
> Btw, using "soft" with NFSv4 mounts is a bad idea. (See the BUGS section of
> "man mount_nfs".)
>

Grahh. I forgot that was in there. I can't remember why I put that there.
These days I agree with you, and advise other people to use hard mounts,
too. Thanks for pointing it out.

>
> If you have a hung NFSv4 mount, you can use
> # umount -N /usr/home
> to dismount it. (It may take a couple of minutes.)
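
The advice about "soft" above can be checked mechanically against the option string that `nfsstat -m` prints. This is a minimal sketch, assuming the comma-separated format shown in this thread; the function name is hypothetical and the check covers only this one option.

```python
def risky_nfsv4_options(option_string):
    """Return warnings for an nfsstat -m style option string."""
    opts = set()
    for item in option_string.split(","):
        # Keep only the option name; values such as rsize=65536 are ignored.
        opts.add(item.strip().split("=", 1)[0])
    warnings = []
    if "nfsv4" in opts and "soft" in opts:
        warnings.append('"soft" on an NFSv4 mount (see BUGS in man mount_nfs)')
    return warnings


if __name__ == "__main__":
    flagged = "nfsv4,minorversion=1,tcp,resvport,soft,cto,sec=sys"
    clean = "nfsv4,minorversion=1,tcp,resvport,cto,sec=sys"
    print(risky_nfsv4_options(flagged))   # one warning
    print(risky_nfsv4_options(clean))     # empty list
```

A check like this could run from a periodic script over every NFSv4 mount on a client, so that a stray "soft" left over from an old configuration gets noticed.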
>
> rick
>
> > mount | grep home
> 192.168.0.2:/home on /usr/home (nfs, nfsv4acls)
> /usr/home on /iocage/jails/rustup2/root/usr/home (nullfs)
>
> Are you using any mount options with nullfs? It might be worth trying to
> make the read-only mount into read-write, to see if that helps. And what
> does "jls -n" show?
> -Alan
>

From owner-freebsd-fs@freebsd.org Sat Dec 12 00:02:29 2020
From: Rick Macklem
To: J David
CC: "freebsd-fs@freebsd.org"
Subject: Re: Major issues with nfsv4
Date: Sat, 12 Dec 2020 00:02:26 +0000
x-ms-office365-filtering-correlation-id: 1701fcff-6642-45a1-4d8e-08d89e3135cb x-ms-traffictypediagnostic: YTBPR01MB2605: x-microsoft-antispam-prvs: x-ms-oob-tlc-oobclassifiers: OLM:8882; x-ms-exchange-senderadcheck: 1 x-microsoft-antispam: BCL:0; x-microsoft-antispam-message-info: dlKKeG4jIbbI0+htzSCdWiFh59CbNoScmFUl3mmaeLeukC5D5vcVBoeCtjbUksZxQ/by2H0asKtFW8oh9EypeHE4zJRKBSup7RWnjJE65X9heT7DmxS/AgR83sbYCzHFY/p4Lf/18tf/yHsCast8Jy7NoBtvmfC9kNr+KzUhInH2oAiXBz3S8mRS4G8C8gtWXCQBXFSpgwHvaxgqlAZim7sY7588ibVEya2XBUmO+ZJmQCiLiCbFtKcdes6R9j3iSxf9B73AhRRISzG05v77CZ6E4FEd+kF4UMTM9RJE4y+5oMvzRimEnR//YmqbVa33WISShmZMupPqMD2C8XZ/Cw== x-forefront-antispam-report: CIP:255.255.255.255; CTRY:; LANG:en; SCL:1; SRV:; IPV:NLI; SFV:NSPM; H:YTOPR0101MB0970.CANPRD01.PROD.OUTLOOK.COM; PTR:; CAT:NONE; SFS:(346002)(376002)(136003)(366004)(8676002)(6506007)(2906002)(508600001)(786003)(8936002)(33656002)(7696005)(6916009)(55016002)(9686003)(83380400001)(4326008)(5660300002)(91956017)(66446008)(186003)(71200400001)(52536014)(4744005)(66476007)(66946007)(64756008)(66556008)(76116006)(86362001); DIR:OUT; SFP:1101; x-ms-exchange-antispam-messagedata: =?iso-8859-1?Q?3G1hSxWJagr4JAU/mVbjmtx84hWq3EYXTg1bYIaor0xyu5DYUNIGg7T52T?= =?iso-8859-1?Q?2YuKackkAhIDLmnZuSRu0F7JKY0E5nDvuGyOE4YvK91ur6ZDeYg9IhZFg7?= =?iso-8859-1?Q?CPnPGdkt9JF7JLdsC+xCN+KV3LCaril5vqLIMaU1Law8OyK84ps9UAdQpp?= =?iso-8859-1?Q?/gp/nqqvxDeUeRLpLZL+k/AqpvCFCb+Na1HVhPxscIvqzIxPp+qg8dzjja?= =?iso-8859-1?Q?ebu9W8V7IqOf1bZGucAxVntYrQIyYYtgbYkSIkvzUELKKokeuFcdSyggcB?= =?iso-8859-1?Q?Ch0YIvzINvp8rtNJ1P+Q2PnKNKIKYhe/2oJEVXUxDtT2tlgVa9XQ7yXAQV?= =?iso-8859-1?Q?fYxwgazZh/WkvbWy5CK8FR6HBSpciSVg/3X1Eg/HYgQ88xpd8d7HTpFDWA?= =?iso-8859-1?Q?VY6ZSIoYhqLUiXPZWCITSKZnsQ6lNbNJWWDeNy4cQY/PdZ/qOiUdONfQEO?= =?iso-8859-1?Q?TZTY/t1kX5Z37wXtys3jJyYtfKx5bhxM2g0FzSx5Gycixn6n0j5kuePRR1?= =?iso-8859-1?Q?On7fQCBUsgnN3HEb8FjCf4ZsucqvGJDmd4DFe0yDMv/tyc/h+hkNv0ELU2?= =?iso-8859-1?Q?bKF2CisnaUOsm+sP8OJOesHcq9npq568Pt+hrLOg79N4uP9qBkTkz6jCSz?= 
=?iso-8859-1?Q?qTWYVntq4kWop9psKudQgkg2OYZX/8+N9K4kRfMRz51fnOj6+BB7O6xZ2C?= =?iso-8859-1?Q?x36zXfGX+VfNQj3zFvhvLaCWuGNSIZtzOpyOtWHzeJcEh4fRaeu9bjlUKt?= =?iso-8859-1?Q?RpRMZTPzsPSQNE6SfGIC9l2opbSX9+DrElN/FPgWZxYdRIPFtglmDBr/Du?= =?iso-8859-1?Q?/VsOmPaJbMe4dWeN4EmiJp4aD5+Hxm+/0EX1W/jxych/+8l9+yRU2CKMjc?= =?iso-8859-1?Q?bEzCOGlb2N+3zAON1goFRdcyXskbJg3nRIdc/RVXmpLs9zvhXvUmSULwwt?= =?iso-8859-1?Q?adZfJ2Ora+bf9XiMOAINkd+edN8g0CBMPiuCZlQvtoiNpgJTd3jHBfM6Po?= =?iso-8859-1?Q?bQygLiz582+3dQQdWClnoCPX87/BvlJc7UwAs6p8xOU09C3tHNd/Uip0BV?= =?iso-8859-1?Q?jkE62rRA+AXWl5dRdfZgc4k=3D?= x-ms-exchange-transport-forked: True Content-Type: text/plain; charset="iso-8859-1" Content-Transfer-Encoding: quoted-printable MIME-Version: 1.0 X-OriginatorOrg: uoguelph.ca X-MS-Exchange-CrossTenant-AuthAs: Internal X-MS-Exchange-CrossTenant-AuthSource: YTOPR0101MB0970.CANPRD01.PROD.OUTLOOK.COM X-MS-Exchange-CrossTenant-Network-Message-Id: 1701fcff-6642-45a1-4d8e-08d89e3135cb X-MS-Exchange-CrossTenant-originalarrivaltime: 12 Dec 2020 00:02:26.5097 (UTC) X-MS-Exchange-CrossTenant-fromentityheader: Hosted X-MS-Exchange-CrossTenant-id: be62a12b-2cad-49a1-a5fa-85f4f3156a7d X-MS-Exchange-CrossTenant-mailboxtype: HOSTED X-MS-Exchange-CrossTenant-userprincipalname: x8CnrtENVK3LhHtHZF7o8E6snk2UrMfneJJRatEp6AgFcjELK6sxArPoVO11McSzZCOni568gjuNJH3ty2CpSQ== X-MS-Exchange-Transport-CrossTenantHeadersStamped: YTBPR01MB2605 X-Rspamd-Queue-Id: 4Ct7BX1X1Jz3MZr X-Spamd-Bar: ---- Authentication-Results: mx1.freebsd.org; dkim=pass header.d=uoguelph.ca header.s=selector1 header.b=LVd0UuQ+; arc=pass (microsoft.com:s=arcselector9901:i=1); dmarc=pass (policy=none) header.from=uoguelph.ca; spf=pass (mx1.freebsd.org: domain of rmacklem@uoguelph.ca designates 40.107.66.75 as permitted sender) smtp.mailfrom=rmacklem@uoguelph.ca X-Spamd-Result: default: False [-4.00 / 15.00]; TO_DN_EQ_ADDR_SOME(0.00)[]; TO_DN_SOME(0.00)[]; R_SPF_ALLOW(-0.20)[+ip4:40.107.0.0/16]; RCVD_COUNT_THREE(0.00)[3]; DKIM_TRACE(0.00)[uoguelph.ca:+]; 
RCPT_COUNT_TWO(0.00)[2]; DMARC_POLICY_ALLOW(-0.50)[uoguelph.ca,none]; NEURAL_HAM_SHORT(-1.00)[-1.000]; FREEMAIL_TO(0.00)[gmail.com]; FROM_EQ_ENVFROM(0.00)[]; RCVD_TLS_LAST(0.00)[]; RBL_DBL_DONT_QUERY_IPS(0.00)[40.107.66.75:from]; ARC_ALLOW(-1.00)[microsoft.com:s=arcselector9901:i=1]; MIME_TRACE(0.00)[0:+]; ASN(0.00)[asn:8075, ipnet:40.104.0.0/14, country:US]; NEURAL_HAM_MEDIUM(-1.00)[-1.000]; R_DKIM_ALLOW(-0.20)[uoguelph.ca:s=selector1]; FREEFALL_USER(0.00)[rmacklem]; FROM_HAS_DN(0.00)[]; TAGGED_RCPT(0.00)[]; MIME_GOOD(-0.10)[text/plain]; DWL_DNSWL_LOW(-1.00)[uoguelph.ca:dkim]; SPAMHAUS_ZRD(0.00)[40.107.66.75:from:127.0.2.255]; TO_MATCH_ENVRCPT_SOME(0.00)[]; NEURAL_SPAM_LONG(1.00)[1.000]; RCVD_IN_DNSWL_NONE(0.00)[40.107.66.75:from]; RWL_MAILSPIKE_POSSIBLE(0.00)[40.107.66.75:from]; MAILMAN_DEST(0.00)[freebsd-fs] X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.34 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Sat, 12 Dec 2020 00:02:29 -0000 J David wrote:=0A= [lots of stuff snipped]=0A= >Even so, there may be no workaround for the simultaneous mount limit=0A= >as long as reserved ports are required. 
>Solving the negative interaction with nullfs seems like the only
>long-term fix.
>
>What would be a good next step there?
Well, if you have a test system you can break, doing
# nfsstat -c -E
once it is constipated could be useful.

Look for the numbers under
OpenOwner Opens LockOwner ...
and see if any of them are getting very large.

rick

Thanks!

From owner-freebsd-fs@freebsd.org Sat Dec 12 01:09:00 2020
Date: Sat, 12 Dec 2020 03:08:53 +0200
From: Konstantin Belousov
To: J David
Cc: freebsd-fs@freebsd.org
Subject: Re: Major issues with nfsv4

On Fri, Dec 11, 2020 at 03:30:29PM -0500, J David wrote:
> On Thu, Dec 10, 2020 at 1:20 PM Konstantin Belousov wrote:
> > E means exiting process. Is it multithreaded ?
> > Show procstat -kk -p output for it.
>
> To answer this separately, procstat -kk of an exiting process
> generating huge volumes of getattr requests produces nothing but the
> headers:
>
> # ps Haxlww | fgrep DNE
> 0 21281 18549 1 20 0 11196 2560 piperd S+ 1 0:00.00 fgrep DNE
> 125428 9661 1 0 36 15 0 16 nfsreq DNE+J 3- 3:22.54 job_exec
> # procstat -kk 9661
> PID TID COMM TDNAME KSTACK
>
> This happened while retesting on NFSv4.1. Although I don't know if
> the process was originally multithreaded, it appears it wasn't even
> single-threaded by the time it got into this state.

Ok, do 'procstat -kk -a' instead. Exiting processes are not excluded from
the kstack sysctl, might be you just raced with termination.

Or, if you have serial console, enter ddb, then do 'bt '.
Or if you have kernel built with symbols,
# kgdb /boot/kernel/kernel /dev/mem
(gdb) proc
(gdb) bt
but this has a low chance of working for a running process.

procstat -kk -a output might be the most informative anyway.
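A minimal sketch of the 'procstat -kk -a' suggestion above, assuming the output is one header line (PID TID COMM TDNAME KSTACK, as quoted earlier) followed by one row per thread with the PID in the first column. The helper name kstack_for_pid is hypothetical; it only filters text on stdin, so it works equally well on saved output.

```sh
#!/bin/sh
# kstack_for_pid PID
# Reads `procstat -kk -a`-style text on stdin and prints the header line
# plus every thread row whose first column equals PID.
# Assumption: line 1 is the header and column 1 of each row is the PID.
kstack_for_pid() {
    awk -v pid="$1" 'NR == 1 || $1 == pid'
}

# Hypothetical usage on a live system, sampling a few times because the
# process may sit in "exiting" for minutes (9661 is the PID from the ps
# output quoted above):
#   for i in 1 2 3; do procstat -kk -a | kstack_for_pid 9661; sleep 5; done
```

If only the header line comes back on every sample, the process has no threads left to report, which matches what was seen with 'procstat -kk 9661'.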
From owner-freebsd-fs@freebsd.org Sat Dec 12 01:39:19 2020
From: J David
Date: Fri, 11 Dec 2020 20:39:05 -0500
Subject: Re: Major issues with nfsv4
To: Konstantin Belousov
Cc: FreeBSD FS

On Fri, Dec 11, 2020 at 8:09 PM Konstantin Belousov wrote:
> Ok, do 'procstat -kk -a' instead. Exiting processes are not excluded from
> the kstack sysctl, might be you just raced with termination.

No, it's not a race. When this is occurring, processes sit in
"exiting" for several minutes like that, doing (apparently) nothing.

What's weird is that I was able to unmount the nullfs mount, but not
the NFS mount, even though the process would have had to access the
NFS mount through the nullfs mount.

Thanks!

From owner-freebsd-fs@freebsd.org Sat Dec 12 01:47:04 2020
From: J David
Date: Fri, 11 Dec 2020 20:46:51 -0500
Subject: Re: Major issues with nfsv4
To: Rick Macklem
Cc: "freebsd-fs@freebsd.org"

On Fri, Dec 11, 2020 at 6:28 PM Rick Macklem wrote:
> I am afraid I know nothing about nullfs and jails. I suspect it will be
> something related to when file descriptors in the NFS client mount
> get closed.

What does NFSv4 do differently than NFSv3 that might upset a low-level
consumer like nullfs?

> Well, NFSv3 is not going away any time soon, so if you don't need
> any of the additional features it offers...

If we did not want the additional features, we definitely would not be
attempting this.

> a user would have to run their own custom hacked
> userland NFS client. Although doable, I have never heard of it being done.

Alex beat me to libnfs.

What about this as a stopgap measure?

> How explosive would adding SO_REUSEADDR to the NFS client be?
> It's not a full solution, but it would handle the TIME_WAIT side of the
> issue.

The kernel NFS networking code is confusing to me. I can't even
figure out where/how NFSv4 binds a client socket to know if it's
possible. (Pretty sure the code in sys/nfs/krpc_subr.c is not it.)

Thanks!

From owner-freebsd-fs@freebsd.org Sat Dec 12 02:01:31 2020
From: J David
Date: Fri, 11 Dec 2020 21:01:17 -0500
Subject: Re: Major issues with nfsv4
To: Alan Somers
Cc: Rick Macklem, "freebsd-fs@freebsd.org"

On Fri, Dec 11, 2020 at 6:08 PM Alan Somers wrote:
> That's some good information. However, it must not be the whole story.

Indeed not. If it were, this would happen instantly every time.
There must be some sort of trigger. But there are a lot of jobs that
run and I didn't write any of them. So the search space is large.

> Are you using any mount options with nullfs?

nosuid and, on half the mounts, ro.

> It might be worth trying to make the read-only mount into read-write, to see if that helps.

It won't; the read-only mounts are exported read-only on the
server side. And no one is going to sign off on changing that, not
even for a minute.

> And what does "jls -n" show?
Here is an example, newlines added for readability:

devfs_ruleset=0
nodying
enforce_statfs=2
host=new
ip4=disable
ip6=disable
jid=1020
linux=new
name=job-1020
osreldate=1202000
osrelease=12.2-RELEASE
parent=0
path=/job/roots/job-1020
persist
securelevel=-1
sysvmsg=inherit
sysvsem=inherit
sysvshm=inherit
vnet=inherit
allow.nochflags
allow.nomlock
allow.nomount
allow.mount.nodevfs
allow.mount.nofdescfs
allow.mount.nofusefs
allow.mount.nonullfs
allow.mount.noprocfs
allow.mount.notmpfs
allow.noquotas
allow.noraw_sockets
allow.noread_msgbuf
allow.reserved_ports
allow.set_hostname
allow.nosocket_af
allow.sysvipc
children.cur=0
children.max=0
cpuset.id=87
host.domainname=/""}""
host.hostid=0
host.hostname=job1020.local
host.hostuuid=00000000-0000-0000-0000-000000000000
ip4.addr=10.0.3.252
ip4.saddrsel
ip6.addr=2001:db8::1
ip6.saddrsel
linux.osname=Linux
linux.osrelease=3.2.0
linux.oss_version=198144

Seems like the next step is to find a reproduction that doesn't
involve people calling me asking angry questions about why things are
broken again.

Thanks!
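While hunting for a reproduction, the counts Rick suggested watching in "nfsstat -c -E" could be logged over time. This is a sketch only: it assumes the output contains a header line whose first two columns are OpenOwner and Opens, with the matching values on the line directly below it. The function name nfs_open_counts is made up for illustration.

```sh
#!/bin/sh
# nfs_open_counts
# Reads `nfsstat -c -E`-style text on stdin and prints the first two values
# from the line that follows the "OpenOwner Opens ..." header.
# Assumption: the values sit on the line immediately below that header,
# in the same column order.
nfs_open_counts() {
    awk '
        found { printf "OpenOwner=%s Opens=%s\n", $1, $2; exit }
        $1 == "OpenOwner" && $2 == "Opens" { found = 1 }
    '
}

# Hypothetical usage on an NFSv4 client, one sample per minute:
#   while :; do date; nfsstat -c -E | nfs_open_counts; sleep 60; done
```

A value that keeps growing while jobs come and go would be consistent with Opens accumulating; a flat value would point elsewhere.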
From owner-freebsd-fs@freebsd.org Sat Dec 12 03:41:06 2020 Return-Path: Delivered-To: freebsd-fs@mailman.nyi.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2610:1c1:1:606c::19:1]) by mailman.nyi.freebsd.org (Postfix) with ESMTP id EE71C47ECBF for ; Sat, 12 Dec 2020 03:41:06 +0000 (UTC) (envelope-from rmacklem@uoguelph.ca) Received: from CAN01-QB1-obe.outbound.protection.outlook.com (mail-qb1can01on0625.outbound.protection.outlook.com [IPv6:2a01:111:f400:fe5c::625]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (Client CN "mail.protection.outlook.com", Issuer "GlobalSign Organization Validation CA - SHA256 - G3" (verified OK)) by mx1.freebsd.org (Postfix) with ESMTPS id 4CtD2p1GkVz3rJn for ; Sat, 12 Dec 2020 03:41:05 +0000 (UTC) (envelope-from rmacklem@uoguelph.ca) ARC-Seal: i=1; a=rsa-sha256; s=arcselector9901; d=microsoft.com; cv=none; b=kS/Zwbx1rEZ/krxJyfLHP2VvQ/feEZQ8N9VMdBV5545s7p2QvwwjmR3n5lCoPFcg5vpELkAwQ4VOO+5Djw15sFDM9ER8PzUYSLoLYswgvEcItjefGzsz82EPvCFWxQn7f0fm7zxH+MNAh7W49kUvXw50nFM2LSa9HvalbuHXiXZP/zlzLcB3/+SSZoTtV9XuQ/xQXoJledvMBW9jHeAP8fzMik1mgEkMOzKJvhaTv6LoaWTovOUiRRGXHpe07E+f7bbgRXV/A8GqjcBu9+vlF4r+JD2jJpRrSKc33CcBcePWakcRJ99NvrgBOmAC8m7Ya/t76ITmUTtz/ZtHW+3HiA== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=microsoft.com; s=arcselector9901; h=From:Date:Subject:Message-ID:Content-Type:MIME-Version:X-MS-Exchange-SenderADCheck; bh=+Tkv9cZR2DiHV5HbVXdeCnWZ0r1r3zlSnawmbiNh7Nw=; b=VwW0J4C1Prg85H1H4ZO/i74l0GbMlTHgyVbjP937H2cW1izcQ2vl4fXUDi6BTRryvl0RnosHOY+r5QsRzTqEsKKRW6NBxwfEf8ctIU25f6wjMosvPXNk5RiJmK0s2qGpOotnTsE+JKRHiEw5T5jcRrjESP6w/Ab8lM/gzyd1cN/nfxjOLP39sY2DZLUVZzHbUQuzOMziU9LhqpSmdmzg7+kvW2r9iyJuo8OIEHbd6kXCvl+1SGxqdHqM9p137oKBseM0P2Gvh/s5jczc8CQlpdmWLeK9CPGjF/AruYessQmuzQkwwyW5xSCzBf1xsrIknCbGCKtdAMHqYOBUdppMjQ== ARC-Authentication-Results: i=1; mx.microsoft.com 1; spf=pass smtp.mailfrom=uoguelph.ca; dmarc=pass action=none header.from=uoguelph.ca; dkim=pass header.d=uoguelph.ca; arc=none 
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=uoguelph.ca; s=selector1; h=From:Date:Subject:Message-ID:Content-Type:MIME-Version:X-MS-Exchange-SenderADCheck; bh=+Tkv9cZR2DiHV5HbVXdeCnWZ0r1r3zlSnawmbiNh7Nw=; b=A8OaZz7eNTU8wSykXaoeNZbcw6TiV4mzazV9StjbERumlmh42q+NTyVM3N8uP3kNtld6T21XfhY7DuDRyfubSyBiuMXtZSKphRO6lr+s2+3UFpVDuzwjsyAJnssZUue283BdsBUbHMrKT3LFLF5BBg8ccmElFdM8Lc3aRUlseusDR2bZHrOC8E8xUy9bROfuWd/fiqwGJshq5Nb44JWZo41aBu5zgg3qLVHThoqTVaVp6zIymFWvlQ+a8HAuBG7xVR9F/ej1d14bmYZ7IbCbTc7uUZbL4OAdJCXNbfeR8ihP531dXJMdyK0YZcy9YK4yXDT1cx5PdXYvk7dXNpnLPg== Received: from YQXPR0101MB0968.CANPRD01.PROD.OUTLOOK.COM (2603:10b6:c00:19::29) by YQXPR01MB3431.CANPRD01.PROD.OUTLOOK.COM (2603:10b6:c00:50::17) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id 15.20.3654.14; Sat, 12 Dec 2020 03:41:04 +0000 Received: from YQXPR0101MB0968.CANPRD01.PROD.OUTLOOK.COM ([fe80::7d6b:aa68:78f4:5d94]) by YQXPR0101MB0968.CANPRD01.PROD.OUTLOOK.COM ([fe80::7d6b:aa68:78f4:5d94%7]) with mapi id 15.20.3654.019; Sat, 12 Dec 2020 03:40:55 +0000 From: Rick Macklem To: J David CC: "freebsd-fs@freebsd.org" Subject: Re: Major issues with nfsv4 Thread-Topic: Major issues with nfsv4 Thread-Index: AQHWzw/HDat+dHoH9kKG5K3Xpd53kqnxDteQgAFi0QCAABTa84AALLCAgAAVvck= Date: Sat, 12 Dec 2020 03:40:55 +0000 Message-ID: References: , In-Reply-To: Accept-Language: en-US Content-Language: en-US X-MS-Has-Attach: X-MS-TNEF-Correlator: x-ms-publictraffictype: Email x-ms-office365-filtering-correlation-id: c4187e36-ae86-4107-5e1b-08d89e4fbb13 x-ms-traffictypediagnostic: YQXPR01MB3431: x-microsoft-antispam-prvs: x-ms-oob-tlc-oobclassifiers: OLM:10000; x-ms-exchange-senderadcheck: 1 x-microsoft-antispam: BCL:0; x-microsoft-antispam-message-info: 
2Uy033T/7C3u55C/ApEXqJZz6kIXNePmvscKBTL3y2stwMt2JElXYxtedlkkV4tPdRwUrhZAOONeYMvI6GONJrmZO76cvXQRngE5begnJ1YluiLoUeQy3qiID7Jz7JWo3Nc/4aUxmzNzErUwiGCuC/uCmIRueL03Ej1rBZLsxryZqRT/oAe9kPlx8UtVRrRJc8z9GB5bJJrNcx2HcH64JNBR/tI7q/P1DfQA4TjZWtpG+xcEcShMclPdPz8469kjk4Suj/zpVFUotycLliH3Tre3qwqiPZc1IQNvKt8U0F5N7hzCCRL7HEdHWKc8mQHn3N1Ag8MXoHCZ3+P5uK7xYA== x-forefront-antispam-report: CIP:255.255.255.255; CTRY:; LANG:en; SCL:1; SRV:; IPV:NLI; SFV:NSPM; H:YQXPR0101MB0968.CANPRD01.PROD.OUTLOOK.COM; PTR:; CAT:NONE; SFS:(346002)(136003)(366004)(376002)(9686003)(33656002)(5660300002)(83380400001)(55016002)(6506007)(86362001)(71200400001)(8676002)(66556008)(186003)(8936002)(7696005)(786003)(2906002)(66476007)(64756008)(66446008)(66946007)(66574015)(52536014)(91956017)(76116006)(6916009)(4326008)(508600001); DIR:OUT; SFP:1101; x-ms-exchange-antispam-messagedata: =?iso-8859-1?Q?MMVVko2xCbeCaIKZrbNYrUdlwQBzXDNzAH9sqmnbRytg1VgjLY6XCZWCp3?= =?iso-8859-1?Q?G84dtTXrC+4fvsiwdo5BmfyYmbBTXZ18doiMzJEVc6yG8xlHuuvDjL5lhf?= =?iso-8859-1?Q?BvkW4rM24NlftXOx5q33HAwAz/UiTSwC4MCaxCyiaIFjI5mgLhewH+5axL?= =?iso-8859-1?Q?fR8OtktiQo7Q9ozsR/FR8kUeAcHswGv5YiK+WmgudCrBFqY1qivEXJWxCr?= =?iso-8859-1?Q?1fHTViYVgw7VZWg/VfrQms6usPLZnAxZvdXf+0Arx7BEfc8b2Uyu+Ru8pe?= =?iso-8859-1?Q?ynYsURIs5WTKOC+RU+03wTUmMjHUDxj8sAvns7c/ES3miz5VlwwjCANkFY?= =?iso-8859-1?Q?pUNhVoLtUIv0HSC37J/3V1gvABmPuRvRPBoaVXcevZqurM5IoELr/pk7lO?= =?iso-8859-1?Q?HyGE2vtsANYMOW1UdNvuL2PA8SL5MX4eEYA5qgjLsz1X1Qe9jYQdKvplFs?= =?iso-8859-1?Q?6TkhLqyIFhf7xcSbDod7cxQSDC872X0tHIwnKFz5SScVK/8kRlTq0eFFRs?= =?iso-8859-1?Q?BoX4TZ54KMQJsCLI4o34dTddmm6k7Ibl73H1h6kwtP+Hz6gEzm2QWJMw9n?= =?iso-8859-1?Q?tZcdpLDFO8F3SPe9H975Kdyre8nxVaspeUlrodZ9nKL52lgIJlU7mrZ0nW?= =?iso-8859-1?Q?fckyf+UDhXIdDAvsQ2rmr9mYQ83CftyhUE7TEK8BkECsGHFdpacPYvWVpw?= =?iso-8859-1?Q?/eGjr756byEvoRQ0z4WAKwPBArlDtolsgnm0FZAtfRorhSyGoToKROqwrd?= =?iso-8859-1?Q?KSTSiNMkWD0XDC7QkF0PClIt+CqwU/nuvdIS1/Hk3Pa8r2KFM4rk/C0ipy?= =?iso-8859-1?Q?34r9HQJBfwYYbuW6oT1nsRd7k/Foi1S0KG6zSkpUkphdW623PP7V/ATMLs?= 
=?iso-8859-1?Q?lMDgq0JP2700AP6s9Hn85NO3+wUrEToTnVKF5eWk+Z3g5UnyoHqbKa2hgi?= =?iso-8859-1?Q?QJUcph3ze8+K/AzyXfL+KbTE3ggwl/QJUPU0FfIWN5/dAjhmBQbWYONcbV?= =?iso-8859-1?Q?962vsd7BR+NCLNX34LOju/oZbfUNlwqDE1AfLihzpiUqGSfb+JamzgJmKd?= =?iso-8859-1?Q?uWSTyyeqkScXC+swSdDhBXM=3D?= x-ms-exchange-transport-forked: True Content-Type: text/plain; charset="iso-8859-1" Content-Transfer-Encoding: quoted-printable MIME-Version: 1.0 X-OriginatorOrg: uoguelph.ca X-MS-Exchange-CrossTenant-AuthAs: Internal X-MS-Exchange-CrossTenant-AuthSource: YQXPR0101MB0968.CANPRD01.PROD.OUTLOOK.COM X-MS-Exchange-CrossTenant-Network-Message-Id: c4187e36-ae86-4107-5e1b-08d89e4fbb13 X-MS-Exchange-CrossTenant-originalarrivaltime: 12 Dec 2020 03:40:55.0199 (UTC) X-MS-Exchange-CrossTenant-fromentityheader: Hosted X-MS-Exchange-CrossTenant-id: be62a12b-2cad-49a1-a5fa-85f4f3156a7d X-MS-Exchange-CrossTenant-mailboxtype: HOSTED X-MS-Exchange-CrossTenant-userprincipalname: bQvNHpxiDw+VDsK8hvlqSDdD/ahBZcU5gJXXMVpjtEAndTQer8lfRhE6slzaKAFbLJ8OfU3FtSTAUnhEGVL+1Q== X-MS-Exchange-Transport-CrossTenantHeadersStamped: YQXPR01MB3431 X-Rspamd-Queue-Id: 4CtD2p1GkVz3rJn X-Spamd-Bar: ---- Authentication-Results: mx1.freebsd.org; dkim=pass header.d=uoguelph.ca header.s=selector1 header.b=A8OaZz7e; arc=pass (microsoft.com:s=arcselector9901:i=1); dmarc=pass (policy=none) header.from=uoguelph.ca; spf=pass (mx1.freebsd.org: domain of rmacklem@uoguelph.ca designates 2a01:111:f400:fe5c::625 as permitted sender) smtp.mailfrom=rmacklem@uoguelph.ca X-Spamd-Result: default: False [-4.00 / 15.00]; TO_DN_EQ_ADDR_SOME(0.00)[]; TO_DN_SOME(0.00)[]; R_SPF_ALLOW(-0.20)[+ip6:2a01:111:f400::/48]; RCVD_COUNT_THREE(0.00)[3]; DKIM_TRACE(0.00)[uoguelph.ca:+]; RCPT_COUNT_TWO(0.00)[2]; DMARC_POLICY_ALLOW(-0.50)[uoguelph.ca,none]; NEURAL_HAM_SHORT(-1.00)[-1.000]; FREEMAIL_TO(0.00)[gmail.com]; FROM_EQ_ENVFROM(0.00)[]; RCVD_TLS_LAST(0.00)[]; RBL_DBL_DONT_QUERY_IPS(0.00)[2a01:111:f400:fe5c::625:from]; 
ARC_ALLOW(-1.00)[microsoft.com:s=arcselector9901:i=1]; MIME_TRACE(0.00)[0:+]; ASN(0.00)[asn:8075, ipnet:2a01:111:f000::/36, country:US]; NEURAL_HAM_MEDIUM(-1.00)[-1.000]; R_DKIM_ALLOW(-0.20)[uoguelph.ca:s=selector1]; FREEFALL_USER(0.00)[rmacklem]; FROM_HAS_DN(0.00)[]; TAGGED_RCPT(0.00)[]; MIME_GOOD(-0.10)[text/plain]; SPAMHAUS_ZRD(0.00)[2a01:111:f400:fe5c::625:from:127.0.2.255]; DWL_DNSWL_LOW(-1.00)[uoguelph.ca:dkim]; TO_MATCH_ENVRCPT_SOME(0.00)[]; NEURAL_SPAM_LONG(1.00)[1.000]; MAILMAN_DEST(0.00)[freebsd-fs] X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.34 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Sat, 12 Dec 2020 03:41:07 -0000

J David wrote:
>On Fri, Dec 11, 2020 at 6:28 PM Rick Macklem wrote:
>> I am afraid I know nothing about nullfs and jails. I suspect it will be
>> something related to when file descriptors in the NFS client mount
>> get closed.
>
>What does NFSv4 do differently than NFSv3 that might upset a low-level
>consumer like nullfs?
The opens for one. When a file is opened it finds its way to VOP_OPEN().
--> For NFSv3 all it does is some client side cache consistency checks.
--> For NFSv4, it must acquire or update a NFSv4 Open, which is a form
of lock that is acquired/updated by an Open operation in an RPC.
Then the client stores this locking info in a structure in a linked list
off of the mount point.
Once all file descriptors for the vnode are closed, then, and only
then can a Close operation be done against the server and the linked
list data structure be free'd.
--> Does having nullfs between the file descriptors and the NFS vnodes
for the same file affect when the v_usecount decrements to 0 on
the NFS vnode?
I don't know.
but if it delays it, then these linked list structures
will not be free'd as soon and might accumulate.
--> The more structures the longer the linked list and the more
overhead/cpu will be used processing them.
The fact that processes are spending a long time in exit() might
be a hint that there are a large # of these NFSv4 Opens to deal with
when files are being closed implicitly during exit.

As I mentioned, "nfsstat -c -E" will tell you how many Opens there
are under the "OpenOwners ..." line.

>> Well, NFSv3 is not going away any time soon, so if you don't need
>> any of the additional features it offers...
>
>If we did not want the additional features, we definitely would not be
>attempting this.
>
>> a user would have to run their own custom hacked
>> userland NFS client. Although doable, I have never heard of it being done.
>
>Alex beat me to libnfs.
And you have users that would want to maliciously access the NFS server
running jobs on this environment? (Other than reverting to NFSv3, allowing
clients to use non-reserved port#s is probably your other choice, from what
I can see. Whatever the interaction between nullfs and the NFSv4 mount
is, it probably won't be fixed quickly, if ever.)

>What about this as a stopgap measure?
>
>> How explosive would adding SO_REUSEADDR to the NFS client be? It's
>> not a full solution, but it would handle the TIME_WAIT side of the
>> issue.
>
>The kernel NFS networking code is confusing to me. I can't even
>figure out where/how NFSv4 binds a client socket to know if it's
>possible.
>(Pretty sure the code in sys/nfs/krpc_subr.c is not it.)
It's done in the kernel RPC code, found in the sys/rpc directory.
Mostly in clnt_rc.c and clnt_vc.c.
If there is a timeout for an RPC (slow server, network problem,...),
the code in clnt_rc.c will create a new TCP connection. The old
connection could easily still be around.
As such, I do not believe that SO_REUSEADDR or SO_REUSEPORT
is feasible.

rick

Thanks!

From owner-freebsd-fs@freebsd.org Sat Dec 12 03:46:40 2020 Return-Path: Delivered-To: freebsd-fs@mailman.nyi.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2610:1c1:1:606c::19:1]) by mailman.nyi.freebsd.org (Postfix) with ESMTP id 7978847EED7 for ; Sat, 12 Dec 2020 03:46:40 +0000 (UTC) (envelope-from kostikbel@gmail.com) Received: from kib.kiev.ua (kib.kiev.ua [IPv6:2001:470:d5e7:1::1]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature RSA-PSS (4096 bits) server-digest SHA256) (Client did not present a certificate) by mx1.freebsd.org (Postfix) with ESMTPS id 4CtD9C5pv3z3sBb for ; Sat, 12 Dec 2020 03:46:39 +0000 (UTC) (envelope-from kostikbel@gmail.com) Received: from tom.home (kib@localhost [127.0.0.1]) by kib.kiev.ua (8.16.1/8.16.1) with ESMTPS id 0BC3kWfC099334 (version=TLSv1.3 cipher=TLS_AES_256_GCM_SHA384 bits=256 verify=NO); Sat, 12 Dec 2020 05:46:35 +0200 (EET) (envelope-from kostikbel@gmail.com) DKIM-Filter: OpenDKIM Filter v2.10.3 kib.kiev.ua 0BC3kWfC099334 Received: (from kostik@localhost) by tom.home (8.16.1/8.16.1/Submit) id 0BC3kWAW099333; Sat, 12 Dec 2020 05:46:32 +0200 (EET) (envelope-from kostikbel@gmail.com) X-Authentication-Warning: tom.home: kostik set sender to kostikbel@gmail.com using -f Date: Sat, 12 Dec 2020 05:46:32 +0200 From: Konstantin Belousov To: Rick Macklem Cc: J David , "freebsd-fs@freebsd.org" Subject: Re: Major issues with nfsv4 Message-ID: References: MIME-Version: 1.0 Content-Type: text/plain;
charset=us-ascii Content-Disposition: inline In-Reply-To: X-Spam-Status: No, score=-1.0 required=5.0 tests=ALL_TRUSTED,BAYES_00, DKIM_ADSP_CUSTOM_MED,FORGED_GMAIL_RCVD,FREEMAIL_FROM, NML_ADSP_CUSTOM_MED autolearn=no autolearn_force=no version=3.4.4 X-Spam-Checker-Version: SpamAssassin 3.4.4 (2020-01-24) on tom.home X-Rspamd-Queue-Id: 4CtD9C5pv3z3sBb X-Spamd-Bar: / Authentication-Results: mx1.freebsd.org; dkim=none; dmarc=fail reason="No valid SPF, No valid DKIM" header.from=gmail.com (policy=none); spf=softfail (mx1.freebsd.org: 2001:470:d5e7:1::1 is neither permitted nor denied by domain of kostikbel@gmail.com) smtp.mailfrom=kostikbel@gmail.com X-Spamd-Result: default: False [0.41 / 15.00]; TO_DN_EQ_ADDR_SOME(0.00)[]; RCVD_COUNT_TWO(0.00)[2]; TO_DN_SOME(0.00)[]; FREEMAIL_FROM(0.00)[gmail.com]; HAS_XAW(0.00)[]; R_SPF_SOFTFAIL(0.00)[~all:c]; FROM_EQ_ENVFROM(0.00)[]; MIME_TRACE(0.00)[0:+]; FREEMAIL_ENVFROM(0.00)[gmail.com]; RBL_DBL_DONT_QUERY_IPS(0.00)[2001:470:d5e7:1::1:from]; R_DKIM_NA(0.00)[]; ASN(0.00)[asn:6939, ipnet:2001:470::/32, country:US]; ARC_NA(0.00)[]; NEURAL_HAM_MEDIUM(-1.00)[-0.997]; FROM_HAS_DN(0.00)[]; RCPT_COUNT_THREE(0.00)[3]; NEURAL_SPAM_SHORT(0.41)[0.411]; TAGGED_RCPT(0.00)[]; MIME_GOOD(-0.10)[text/plain]; SPAMHAUS_ZRD(0.00)[2001:470:d5e7:1::1:from:127.0.2.255]; TO_MATCH_ENVRCPT_SOME(0.00)[]; NEURAL_SPAM_LONG(1.00)[1.000]; FREEMAIL_CC(0.00)[gmail.com,freebsd.org]; RCVD_TLS_ALL(0.00)[]; MAILMAN_DEST(0.00)[freebsd-fs]; DMARC_POLICY_SOFTFAIL(0.10)[gmail.com : No valid SPF, No valid DKIM,none] X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.34 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Sat, 12 Dec 2020 03:46:40 -0000

On Sat, Dec 12, 2020 at 03:40:55AM +0000, Rick Macklem wrote:
> J David wrote:
> >On Fri, Dec 11, 2020 at 6:28 PM Rick Macklem wrote:
> >> I am afraid I know nothing about nullfs and jails.
> >> I suspect it will be
> >> something related to when file descriptors in the NFS client mount
> >> get closed.
> >
> >What does NFSv4 do differently than NFSv3 that might upset a low-level
> >consumer like nullfs?
> The opens for one. When a file is opened it finds its way to VOP_OPEN().
> --> For NFSv3 all it does is some client side cache consistency checks.
> --> For NFSv4, it must acquire or update a NFSv4 Open, which is a form
> of lock that is acquired/updated by an Open operation in an RPC.
> Then the client stores this locking info in a structure in a linked list
> off of the mount point.
> Once all file descriptors for the vnode are closed, then, and only
> then can a Close operation be done against the server and the linked
> list data structure be free'd.
> --> Does having nullfs between the file descriptors and the NFS vnodes
> for the same file affect when the v_usecount decrements to 0 on
> the NFS vnode?
> I don't know. but if it delays it, then these linked list structures
> will not be free'd as soon and might accumulate.
> --> The more structures the longer the linked list and the more
> overhead/cpu will be used processing them.
> The fact that processes are spending a long time in exit() might
> be a hint that there are a large # of these NFSv4 Opens to deal with
> when files are being closed implicitly during exit.
>
> As I mentioned, "nfsstat -c -E" will tell you how many Opens there
> are under the "OpenOwners ..." line.
Nullfs vnodes keep a reference on the lower vnode. When nullfs vnode
caching is enabled, nullfs vnodes survive after a vfs syscall is finished.

NFSv4 mount automatically sets flag MNTK_NULL_NOCACHE that disables nullfs
vnode cache.
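
The concern raised in this exchange (an open record that lives on a per-mount list until the lower vnode's use count reaches zero, while an upper layer may still hold a reference) can be illustrated with a small userland model. This is only a sketch under stated assumptions: struct lower_node, struct open_rec, struct mount_model and the model_* functions are hypothetical names invented for the illustration, not FreeBSD kernel structures or APIs, and the real NFSv4 client is considerably more involved.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-in for a lower (NFS-like) vnode. */
struct lower_node {
    int usecount;   /* active references, in the spirit of v_usecount */
    int has_open;   /* 1 while an open record exists for this node */
};

/* Hypothetical open record kept on a per-mount linked list. */
struct open_rec {
    struct lower_node *node;
    struct open_rec *next;
};

/* Hypothetical mount point: head of the list plus a counter. */
struct mount_model {
    struct open_rec *opens;
    int nopens;
};

/* Opening through a file descriptor: take a reference and make sure
 * an open record exists on the mount's list. */
static void model_open(struct mount_model *mp, struct lower_node *n)
{
    n->usecount++;
    if (!n->has_open) {
        struct open_rec *r = malloc(sizeof(*r));
        if (r == NULL)
            abort();
        r->node = n;
        r->next = mp->opens;
        mp->opens = r;
        mp->nopens++;
        n->has_open = 1;
    }
}

/* A stacked upper layer taking its own reference on the lower node. */
static void model_hold(struct lower_node *n)
{
    n->usecount++;
}

/* Dropping a reference: the open record is unlinked and freed only
 * when the last reference goes away. The list walk is linear, so a
 * long list makes every final release more expensive. */
static void model_release(struct mount_model *mp, struct lower_node *n)
{
    assert(n->usecount > 0);
    if (--n->usecount > 0)
        return;
    for (struct open_rec **pp = &mp->opens; *pp != NULL; pp = &(*pp)->next) {
        if ((*pp)->node == n) {
            struct open_rec *dead = *pp;
            *pp = dead->next;
            free(dead);
            mp->nopens--;
            n->has_open = 0;
            return;
        }
    }
}
```

In this model, calling model_open(), then model_hold(), then model_release() once leaves the open record on the list; only the second model_release() removes it. That is the accumulation scenario being asked about, not a claim that nullfs actually behaves this way.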
From owner-freebsd-fs@freebsd.org Sat Dec 12 17:33:18 2020 Return-Path: Delivered-To: freebsd-fs@mailman.nyi.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2610:1c1:1:606c::19:1]) by mailman.nyi.freebsd.org (Postfix) with ESMTP id 6B1D04BB520 for ; Sat, 12 Dec 2020 17:33:18 +0000 (UTC) (envelope-from rmacklem@uoguelph.ca) Received: from CAN01-QB1-obe.outbound.protection.outlook.com (mail-eopbgr660076.outbound.protection.outlook.com [40.107.66.76]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (Client CN "mail.protection.outlook.com", Issuer "GlobalSign Organization Validation CA - SHA256 - G3" (verified OK)) by mx1.freebsd.org (Postfix) with ESMTPS id 4CtZW126XKz3hTF for ; Sat, 12 Dec 2020 17:33:16 +0000 (UTC) (envelope-from rmacklem@uoguelph.ca) ARC-Seal: i=1; a=rsa-sha256; s=arcselector9901; d=microsoft.com; cv=none; b=oQjbnLOnSF2GPlitCetNJXWivT8DxfYlpIR5kAIFl8PMYOW8pXFgEGr0JpTCKbMp9QcXxc28gxF60cSwtD/ELyPfe+7DtJ4VB4B738LIR8LqbtNbY7iCvJkDJuPvkJKm6IUnMGEyj8qtXYZrZntJDAHTcNj5Slf/8FcgDKc4BA5xe5+pz/coqjJKA1koEtOePKoMRBdgKRpUKduHbV3bK2bEEfchPtvRu1lUJkdDXzjLSmTGIv0tIKwoqocs6em3WC7RL/EG3VHlj8YoMKrLfE23xmvGvEixFivYrT4mRjP74U1w6U3cgX0V2qnIQKlNKCIk0hCZDg9TDzahnfrMyQ== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=microsoft.com; s=arcselector9901; h=From:Date:Subject:Message-ID:Content-Type:MIME-Version:X-MS-Exchange-SenderADCheck; bh=dfM5XBztkhn4kj4oaFFOr2LZgHyoCPN2ViEZOO5bR/U=; b=XAvDO5Tc+I4oqMKw81CYNMnK8BCrIyeO/KVycctkicNclfBI2HPCNNFtogeVIBGWuxh4mW6iVNOsA5hP6iu53w5hf0tF7PuN2ywyApBHSC4peE9BKrEjfyl4LFgrZvLTMMNLT1Br6maDPjmQ4M76QEmubANEgZ+NfwT1hA2a6sDDmn/631Rd278jJCMCPmrb+tsPew0gOdcd2xibSHixsq56KBw1xx29SyqyAbMa92TefOymZiQ/bOvppyXXnqQRcvuqMIG4k+n7kWC6+RJUV7pXXoYudv7ZTfk4b9c6JDPe/EeiNFVcXeykQvUfxoFoTHzNr0m9CkryrnF7k2+X5Q== ARC-Authentication-Results: i=1; mx.microsoft.com 1; spf=pass smtp.mailfrom=uoguelph.ca; dmarc=pass action=none header.from=uoguelph.ca; dkim=pass header.d=uoguelph.ca; arc=none DKIM-Signature: v=1; 
a=rsa-sha256; c=relaxed/relaxed; d=uoguelph.ca; s=selector1; h=From:Date:Subject:Message-ID:Content-Type:MIME-Version:X-MS-Exchange-SenderADCheck; bh=dfM5XBztkhn4kj4oaFFOr2LZgHyoCPN2ViEZOO5bR/U=; b=iKJd7UAfADxDHChWNobPjoqMeqcnvaOXC4bwGL1m8EQ7Z94g1M7y3Kh7WtgAcvaZ53+KDksPA8CaUsMGwLMeL2OhvcY5kTMg6v5DVNiwwOtBDVGVgwlH6wbnbm6IbBeOjoKWqk4r3M2t56NmQkGJwMRVO9DfkHFgsxibDxYONj/XrI9vj6RWEjC+Y/ormx82Wm1LBHa0dqQ5T6uXxyYjuwL+dGKGTNYNFEHW4BeeBOafpXPgqIuvT25oJ8DzreAlhrG2gPDV2ABN3HJMenPtcMA/WdQi5NZGdVEW/t3Pq1MYT/d7rtsDWZOHsG+ONFeY+XiuLR+9FacAhkBjEOkQgw== Received: from YQXPR0101MB0968.CANPRD01.PROD.OUTLOOK.COM (2603:10b6:c00:19::29) by QB1PR01MB3985.CANPRD01.PROD.OUTLOOK.COM (2603:10b6:c00:3c::23) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id 15.20.3654.12; Sat, 12 Dec 2020 17:33:15 +0000 Received: from YQXPR0101MB0968.CANPRD01.PROD.OUTLOOK.COM ([fe80::7d6b:aa68:78f4:5d94]) by YQXPR0101MB0968.CANPRD01.PROD.OUTLOOK.COM ([fe80::7d6b:aa68:78f4:5d94%7]) with mapi id 15.20.3654.019; Sat, 12 Dec 2020 17:33:06 +0000 From: Rick Macklem To: Konstantin Belousov CC: J David , "freebsd-fs@freebsd.org" Subject: Re: Major issues with nfsv4 Thread-Topic: Major issues with nfsv4 Thread-Index: AQHWzw/HDat+dHoH9kKG5K3Xpd53kqnxDteQgAFi0QCAABTa84AALLCAgAAVvcmAAAu0AIAA4wiA Date: Sat, 12 Dec 2020 17:33:06 +0000 Message-ID: References: , In-Reply-To: Accept-Language: en-US Content-Language: en-US X-MS-Has-Attach: X-MS-TNEF-Correlator: x-ms-publictraffictype: Email x-ms-office365-filtering-correlation-id: 815e41b4-7d2d-4b50-c058-08d89ec3fc47 x-ms-traffictypediagnostic: QB1PR01MB3985: x-microsoft-antispam-prvs: x-ms-oob-tlc-oobclassifiers: OLM:9508; x-ms-exchange-senderadcheck: 1 x-microsoft-antispam: BCL:0; x-microsoft-antispam-message-info: 
ECTDa+9z6UiKOZWRa/gpKC5vBFIMEINHXt3p/Z5Y+HMIDzWaZx5JSaHgn0rg6RcdUuN0DrFi51TZAZD27W9ySqJz/QLJSvuVV/YkcWHXcDgbavFpf3Kg1+xRNJwHgQ9D5pd64WTJzHWq+QYa41rviZRwG5YVrFdtrl+cz5Xg4ZJw5p0fepoRM9nudZTw/EhF+FQ0G+rInR/T9cKG1zD6AMrIf5cqtQ6gWXIuuFc+CAfiOMJr9QTdun+s4EGK2CP5IbyxiEUSqk0wVZkvnNMBbQ8io9DaPr8C1ezk5QLMSBvsvJabdXuBM9S6pyL6KTfpf4AwBY196JZd4Lk96nLKWQ== x-forefront-antispam-report: CIP:255.255.255.255; CTRY:; LANG:en; SCL:1; SRV:; IPV:NLI; SFV:NSPM; H:YQXPR0101MB0968.CANPRD01.PROD.OUTLOOK.COM; PTR:; CAT:NONE; SFS:(376002)(136003)(39860400002)(346002)(366004)(396003)(4326008)(8676002)(64756008)(33656002)(66446008)(66476007)(66556008)(66946007)(76116006)(6506007)(8936002)(52536014)(91956017)(4744005)(2906002)(71200400001)(54906003)(7696005)(86362001)(186003)(478600001)(316002)(83380400001)(5660300002)(6916009)(9686003)(55016002)(786003); DIR:OUT; SFP:1101; x-ms-exchange-antispam-messagedata: =?iso-8859-1?Q?8L8hbcawwzS3o+y0spIvwSVlOH4/cpR8jusq7f2gkBJ/ccPYyGivsnBX9y?= =?iso-8859-1?Q?uxliLR+1dlQuy5mxvZNu4g2JZepsyT66syxKep9iKMdYG3HZrxdLmaci6y?= =?iso-8859-1?Q?D2oyUQRBvD9IIwyMyfRqt4Tqs6EHUsdjaUbPbzmwQZQ6nDy8Jb2F0sOTw6?= =?iso-8859-1?Q?T7OBpsIr3BIya24tadgg9IAOMNz6qMbg5gZF2NSLQHhitzFChyQvhV7rEM?= =?iso-8859-1?Q?c21f3DOPdBa7VsCSMd3RZrDObXiJCe8RV9hY8S7xR4BYhMliBeKnUVwMg6?= =?iso-8859-1?Q?NFC2UaYwzYNAoBeztsMcsOlTGMfgNWBxDLzte34CWzWBybhWxEsw5wIkB1?= =?iso-8859-1?Q?0o/jSmsx0D9SSpd07fLjDh3cU6YXLrQ/vbYA/tSfEMspH6io2ko3+141SK?= =?iso-8859-1?Q?M1FPTiwlRfvSI0+bAmbDfB8unhThVf5ni2UOCuqzw/6JXvWiw8B22/WMQj?= =?iso-8859-1?Q?hrkyouPkiWIUBzcj+N5eO0GFlN8rvTv93uwlTthFSzCaRGFshsl5lXwXI4?= =?iso-8859-1?Q?c9Ku0oP9+ZGUxkEF3F1ObLZ/1zQcljmF+jjHsjImX7vGu62UpxlxPVg0B5?= =?iso-8859-1?Q?IqAtZw5yBo0Ql4tWd9oFE9OKsR1T+xj0DHJwbup6EAgy0qWrtQYTd3T/uj?= =?iso-8859-1?Q?WjE91dDKcG2YlaQNUSYFItA5F8h84m/4/ASxzlthhccwcn8XKOcusGelqG?= =?iso-8859-1?Q?BIDOHzzvkdM/u/d2llJx70dTw1zVAjA1UmgMSwKwzXF3wDpGZ8D6Ahjy2w?= =?iso-8859-1?Q?z1OcXdiyirk1RC0k9UM4gyvoK8TepHCXQk+fRI4BV+LE1v/hTyLG7kpswi?= 
=?iso-8859-1?Q?L80r2XAJ/ER3s98XaYDZxbJyGYyq/WtkJ3kHJOWTgS3lEuqMyPpXRtfuVC?= =?iso-8859-1?Q?NvtXo3kmAmGZGEg0VsAizb57USIRE4dVogx1cfy5Hbp+FWu+1JvoSOORt/?= =?iso-8859-1?Q?8Uo3Vk5mTM1eFLJu2TJ2YYbNTrc+DFuhZUlEuI8NAjT36OcQs1KRjTpTyk?= =?iso-8859-1?Q?ThDETomTBO+XZ5UXvgs5GqN19I4/kiATrICdIaxY/v+XWLB2vLFcxvlYD4?= =?iso-8859-1?Q?77aBN+eMwPNH9YZ7GcBaW3Q=3D?= x-ms-exchange-transport-forked: True Content-Type: text/plain; charset="iso-8859-1" Content-Transfer-Encoding: quoted-printable MIME-Version: 1.0 X-OriginatorOrg: uoguelph.ca X-MS-Exchange-CrossTenant-AuthAs: Internal X-MS-Exchange-CrossTenant-AuthSource: YQXPR0101MB0968.CANPRD01.PROD.OUTLOOK.COM X-MS-Exchange-CrossTenant-Network-Message-Id: 815e41b4-7d2d-4b50-c058-08d89ec3fc47 X-MS-Exchange-CrossTenant-originalarrivaltime: 12 Dec 2020 17:33:06.0337 (UTC) X-MS-Exchange-CrossTenant-fromentityheader: Hosted X-MS-Exchange-CrossTenant-id: be62a12b-2cad-49a1-a5fa-85f4f3156a7d X-MS-Exchange-CrossTenant-mailboxtype: HOSTED X-MS-Exchange-CrossTenant-userprincipalname: 1IoQCdLb+pneTiElapy9n+PkmwxYFx2EaoH8inQEvUTeom4PP1lRSUAGNi9kLypLvHywFh0kTRsUvaOrAM5UUw== X-MS-Exchange-Transport-CrossTenantHeadersStamped: QB1PR01MB3985 X-Rspamd-Queue-Id: 4CtZW126XKz3hTF X-Spamd-Bar: ---- Authentication-Results: mx1.freebsd.org; dkim=pass header.d=uoguelph.ca header.s=selector1 header.b=iKJd7UAf; arc=pass (microsoft.com:s=arcselector9901:i=1); dmarc=pass (policy=none) header.from=uoguelph.ca; spf=pass (mx1.freebsd.org: domain of rmacklem@uoguelph.ca designates 40.107.66.76 as permitted sender) smtp.mailfrom=rmacklem@uoguelph.ca X-Spamd-Result: default: False [-4.00 / 15.00]; TO_DN_EQ_ADDR_SOME(0.00)[]; TO_DN_SOME(0.00)[]; R_SPF_ALLOW(-0.20)[+ip4:40.107.0.0/16]; RCVD_COUNT_THREE(0.00)[3]; DKIM_TRACE(0.00)[uoguelph.ca:+]; DMARC_POLICY_ALLOW(-0.50)[uoguelph.ca,none]; NEURAL_HAM_SHORT(-1.00)[-1.000]; FREEMAIL_TO(0.00)[gmail.com]; FROM_EQ_ENVFROM(0.00)[]; RCVD_TLS_LAST(0.00)[]; RBL_DBL_DONT_QUERY_IPS(0.00)[40.107.66.76:from]; 
ARC_ALLOW(-1.00)[microsoft.com:s=arcselector9901:i=1]; MIME_TRACE(0.00)[0:+]; ASN(0.00)[asn:8075, ipnet:40.104.0.0/14, country:US]; NEURAL_HAM_MEDIUM(-1.00)[-1.000]; R_DKIM_ALLOW(-0.20)[uoguelph.ca:s=selector1]; FREEFALL_USER(0.00)[rmacklem]; FROM_HAS_DN(0.00)[]; RCPT_COUNT_THREE(0.00)[3]; TAGGED_RCPT(0.00)[]; MIME_GOOD(-0.10)[text/plain]; DWL_DNSWL_LOW(-1.00)[uoguelph.ca:dkim]; SPAMHAUS_ZRD(0.00)[40.107.66.76:from:127.0.2.255]; TO_MATCH_ENVRCPT_SOME(0.00)[]; NEURAL_SPAM_LONG(1.00)[1.000]; RCVD_IN_DNSWL_NONE(0.00)[40.107.66.76:from]; RWL_MAILSPIKE_POSSIBLE(0.00)[40.107.66.76:from]; FREEMAIL_CC(0.00)[gmail.com,freebsd.org]; MAILMAN_DEST(0.00)[freebsd-fs] X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.34 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Sat, 12 Dec 2020 17:33:18 -0000

Konstantin Belousov wrote:
[stuff snipped]
>Nullfs vnodes keep a reference on the lower vnode. When nullfs vnode
>caching is enabled, nullfs vnodes survive after a vfs syscall is finished.
>
>NFSv4 mount automatically sets flag MNTK_NULL_NOCACHE that disables nullfs
>vnode cache.
Thanks Kostik, I see that. (And I recall discussions about disabling the nullfs caching.)

Now, if I understand it correctly, if vrele() is called with the vnode shared locked,
then VOP_INACTIVE() won't be called.
--> It is VOP_INACTIVE()/VOP_RECLAIM() in the NFSv4 client that does the closes.
Normally the NFS client calls vput() when the vnode is locked, but is there a case
where nullfs might cause vrele() to be called on the lower vp when it is shared
locked?
(Delaying closes until VOP_RECLAIM() would cause problems, I think?)

Note that, until we see the "nfsstat -c -E" we won't know if lots of opens
are an issue anyhow.

Thanks for the help, rick

From owner-freebsd-fs@freebsd.org Sat Dec 12 17:51:12 2020 Return-Path: Delivered-To: freebsd-fs@mailman.nyi.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2610:1c1:1:606c::19:1]) by mailman.nyi.freebsd.org (Postfix) with ESMTP id 68B714BBAAE for ; Sat, 12 Dec 2020 17:51:12 +0000 (UTC) (envelope-from kostikbel@gmail.com) Received: from kib.kiev.ua (kib.kiev.ua [IPv6:2001:470:d5e7:1::1]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature RSA-PSS (4096 bits) server-digest SHA256) (Client did not present a certificate) by mx1.freebsd.org (Postfix) with ESMTPS id 4CtZvg3zZ6z3jPv for ; Sat, 12 Dec 2020 17:51:11 +0000 (UTC) (envelope-from kostikbel@gmail.com) Received: from tom.home (kib@localhost [127.0.0.1]) by kib.kiev.ua (8.16.1/8.16.1) with ESMTPS id 0BCHowXE099686 (version=TLSv1.3 cipher=TLS_AES_256_GCM_SHA384 bits=256 verify=NO); Sat, 12 Dec 2020 19:51:01 +0200 (EET) (envelope-from kostikbel@gmail.com) DKIM-Filter: OpenDKIM Filter v2.10.3 kib.kiev.ua 0BCHowXE099686 Received: (from kostik@localhost) by tom.home (8.16.1/8.16.1/Submit) id 0BCHowtX099685; Sat, 12 Dec 2020 19:50:58 +0200 (EET) (envelope-from kostikbel@gmail.com) X-Authentication-Warning: tom.home: kostik set sender to kostikbel@gmail.com using -f Date: Sat, 12 Dec 2020 19:50:58 +0200 From: Konstantin Belousov To: Rick Macklem Cc: J David , "freebsd-fs@freebsd.org" Subject: Re: Major issues with nfsv4 Message-ID: References: MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: X-Spam-Status: No, score=-1.0 required=5.0 tests=ALL_TRUSTED,BAYES_00, DKIM_ADSP_CUSTOM_MED,FORGED_GMAIL_RCVD,FREEMAIL_FROM, NML_ADSP_CUSTOM_MED autolearn=no autolearn_force=no version=3.4.4
X-Spam-Checker-Version: SpamAssassin 3.4.4 (2020-01-24) on tom.home X-Rspamd-Queue-Id: 4CtZvg3zZ6z3jPv X-Spamd-Bar: / Authentication-Results: mx1.freebsd.org; dkim=none; dmarc=fail reason="No valid SPF, No valid DKIM" header.from=gmail.com (policy=none); spf=softfail (mx1.freebsd.org: 2001:470:d5e7:1::1 is neither permitted nor denied by domain of kostikbel@gmail.com) smtp.mailfrom=kostikbel@gmail.com X-Spamd-Result: default: False [0.85 / 15.00]; TO_DN_EQ_ADDR_SOME(0.00)[]; FREEMAIL_CC(0.00)[gmail.com,freebsd.org]; TO_DN_SOME(0.00)[]; FREEMAIL_FROM(0.00)[gmail.com]; HAS_XAW(0.00)[]; R_SPF_SOFTFAIL(0.00)[~all]; FROM_EQ_ENVFROM(0.00)[]; MIME_TRACE(0.00)[0:+]; RBL_DBL_DONT_QUERY_IPS(0.00)[2001:470:d5e7:1::1:from]; FREEMAIL_ENVFROM(0.00)[gmail.com]; ASN(0.00)[asn:6939, ipnet:2001:470::/32, country:US]; ARC_NA(0.00)[]; NEURAL_HAM_MEDIUM(-1.00)[-0.999]; FROM_HAS_DN(0.00)[]; RCPT_COUNT_THREE(0.00)[3]; NEURAL_SPAM_SHORT(0.85)[0.849]; TAGGED_RCPT(0.00)[]; R_DKIM_NA(0.00)[]; MIME_GOOD(-0.10)[text/plain]; SPAMHAUS_ZRD(0.00)[2001:470:d5e7:1::1:from:127.0.2.255]; TO_MATCH_ENVRCPT_SOME(0.00)[]; NEURAL_SPAM_LONG(1.00)[1.000]; RCVD_COUNT_TWO(0.00)[2]; RCVD_TLS_ALL(0.00)[]; MAILMAN_DEST(0.00)[freebsd-fs]; DMARC_POLICY_SOFTFAIL(0.10)[gmail.com : No valid SPF, No valid DKIM,none] X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.34 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Sat, 12 Dec 2020 17:51:12 -0000

On Sat, Dec 12, 2020 at 05:33:06PM +0000, Rick Macklem wrote:
> Konstantin Belousov wrote:
> [stuff snipped]
> >Nullfs vnodes keep a reference on the lower vnode. When nullfs vnode
> >caching is enabled, nullfs vnodes survive after a vfs syscall is finished.
> >
> >NFSv4 mount automatically sets flag MNTK_NULL_NOCACHE that disables nullfs
> >vnode cache.
> Thanks Kostik, I see that. (And I recall discussions about disabling the nullfs caching.)
>
> Now, if I understand it correctly, if vrele() is called with the vnode shared locked,
> then VOP_INACTIVE() won't be called.
> --> It is VOP_INACTIVE()/VOP_RECLAIM() in the NFSv4 client that does the closes.
> Normally the NFS client calls vput() when the vnode is locked, but is there a case
> where nullfs might cause vrele() to be called on the lower vp when it is shared
> locked? (Delaying closes until VOP_RECLAIM() would cause problems, I think?)
If vput() is called with the vnode shared-locked, it tries to upgrade to
exclusive with LK_NOWAIT. If the sleep-less upgrade fails, invalidation is
postponed.

In practice, the failure to upgrade the lock is not too common, because the
caller drops the last use reference on the vnode, but it still happens.

That said, I do not believe that this situation causes the problems the OP
described. I want to see what is going on in the exiting process.
>
> Note that, until we see the "nfsstat -c -E" we won't know if lots of opens
> are an issue anyhow.
>
> Thanks for the help, rick
>
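
The behaviour described in this reply (a shared-locked release attempts a non-sleeping upgrade to exclusive, and postpones the final processing when the upgrade fails) can be modelled in a few lines of userland C. This is a rough sketch, not the kernel's vput(): struct lock_model, struct vnode_model, try_upgrade_nowait() and vput_shared_model() are hypothetical names made up for the illustration, and the rule that the upgrade succeeds only when the caller is the sole shared holder is a simplifying assumption.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical reader/writer lock state. */
struct lock_model {
    int shared_holders;   /* number of current shared holders */
    bool exclusive;       /* true while held exclusively */
};

/* Hypothetical vnode: lock, use count, and inactivation bookkeeping. */
struct vnode_model {
    struct lock_model lock;
    int usecount;
    bool inactivated;     /* final processing ran immediately */
    bool inactive_owed;   /* final processing was postponed */
};

/* Non-blocking upgrade in the spirit of LK_NOWAIT: it never sleeps,
 * and (by assumption) succeeds only for a sole shared holder. */
static bool try_upgrade_nowait(struct lock_model *lk)
{
    if (lk->exclusive || lk->shared_holders != 1)
        return false;
    lk->shared_holders = 0;
    lk->exclusive = true;
    return true;
}

/* Release whichever kind of hold the caller currently has. */
static void unlock_model(struct lock_model *lk)
{
    if (lk->exclusive)
        lk->exclusive = false;
    else if (lk->shared_holders > 0)
        lk->shared_holders--;
}

/* Drop a use reference while holding the lock shared. On the last
 * reference, either run the final processing now (upgrade succeeded)
 * or record that it is still owed (upgrade failed). */
static void vput_shared_model(struct vnode_model *vp)
{
    assert(vp->usecount > 0 && vp->lock.shared_holders > 0);
    if (--vp->usecount == 0) {
        if (try_upgrade_nowait(&vp->lock))
            vp->inactivated = true;
        else
            vp->inactive_owed = true;
    }
    unlock_model(&vp->lock);
}
```

With a single shared holder, the last vput_shared_model() call sets inactivated; with a second shared holder present, it sets inactive_owed instead, which corresponds to the postponed case mentioned above.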