From owner-freebsd-fs@freebsd.org Thu Dec 10 16:16:04 2020 Return-Path: Delivered-To: freebsd-fs@mailman.nyi.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2610:1c1:1:606c::19:1]) by mailman.nyi.freebsd.org (Postfix) with ESMTP id 308194AFB34 for ; Thu, 10 Dec 2020 16:16:04 +0000 (UTC) (envelope-from jdavidlists@gmail.com) Received: from mail-lf1-x12b.google.com (mail-lf1-x12b.google.com [IPv6:2a00:1450:4864:20::12b]) (using TLSv1.3 with cipher TLS_AES_128_GCM_SHA256 (128/128 bits) key-exchange X25519 server-signature RSA-PSS (4096 bits) server-digest SHA256 client-signature RSA-PSS (2048 bits) client-digest SHA256) (Client CN "smtp.gmail.com", Issuer "GTS CA 1O1" (verified OK)) by mx1.freebsd.org (Postfix) with ESMTPS id 4CsJtq2hP4z4fVQ for ; Thu, 10 Dec 2020 16:16:03 +0000 (UTC) (envelope-from jdavidlists@gmail.com) Received: by mail-lf1-x12b.google.com with SMTP id m25so9003215lfc.11 for ; Thu, 10 Dec 2020 08:16:03 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20161025; h=mime-version:from:date:message-id:subject:to; bh=6bxuwoEdmnzuUpjFycLU11Xk0YdpiQ5K98+Ks3oKI7Y=; b=PxPjlLmBDblOUw4W1v8Aelnk7mFiZe+NhqchX1mF4liwWbc+qPCOFOcbdUuhzE6erX wg5VP5BtOo92eq/AxX5aQ6Ikf4MkIK3DS/04EB8+T7OAuNiel1iFfBAQXKmqb7sINwEU ztHz1+61YqAvhiZa8Afw4b1pqjuJrja4WZiumJNtMA9nVdreBXZYTiw3J85Y4+1JHWbg SQ9TPV7NES4RYI+JBHr4TNVaAOvAaVMCzPs31xfVF1ruq8y2Av13G98yOJXolZO19cms Nohmfa9aCtzJhlE2ruuVDMR6PTPJvZUISAEQ1yfDaDisr4dmFe9501mGmA84iubsCYTY lmCA== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:mime-version:from:date:message-id:subject:to; bh=6bxuwoEdmnzuUpjFycLU11Xk0YdpiQ5K98+Ks3oKI7Y=; b=d/4NMoUuKKBNclGQ51T6gk9IW/lJxqYU/xcyas0sTX5lg3W0V7AHXDg9gKwjXZS011 XdfLp33e89Ji7iexsw0t3fgThb6T9qfhXF/WK2grKnIoQaiC3Ey2nJ8xbQAP9zeX+gJB ZGzYz+HukF5hgaY7d8+ksWCj60AwuUP4WlfVvNsGzFims+J6nihjpPBHByJ/tnno3Hij u8IPyxY9Pehafz/CU7fstewp3A6l86370HbYu6YmYP8B10Yn/gbby1flzus68kzl4X1v x/vM5owZ+N52IDaVPTbzGrG+N+rfeIi1sC8xYFxotmORE4HLGiT3WKpB5LMnxv/Ef716 A9SA== X-Gm-Message-State: AOAM533pmbROYur9RrcF18cI/CYux2RVl59DGkRQSMJ5UZ9YR5+GmhHq y5Tlv4VTwX5lPZNy1ZgzyUt4R7MHWtmq3I3r6P9qkGCEv/g= X-Google-Smtp-Source: ABdhPJyAIGfcaf8HbAugAe9y69bGQRxFRCt+iYdcRxeIsCySLpXWgy0Nkgv3plVvFdxcE9+vz+DdDiKCRTSwIlSmFps= X-Received: by 2002:a05:6512:786:: with SMTP id x6mr2785991lfr.643.1607616961122; Thu, 10 Dec 2020 08:16:01 -0800 (PST) MIME-Version: 1.0 From: J David Date: Thu, 10 Dec 2020 11:15:50 -0500 Message-ID: Subject: Major issues with nfsv4 To: freebsd-fs@freebsd.org Content-Type: text/plain; charset="UTF-8" X-Rspamd-Queue-Id: 4CsJtq2hP4z4fVQ X-Spamd-Bar: -- Authentication-Results: mx1.freebsd.org; dkim=pass header.d=gmail.com header.s=20161025 header.b=PxPjlLmB; dmarc=pass (policy=none) header.from=gmail.com; spf=pass (mx1.freebsd.org: domain of jdavidlists@gmail.com designates 2a00:1450:4864:20::12b as permitted sender) smtp.mailfrom=jdavidlists@gmail.com X-Spamd-Result: default: False [-2.00 / 15.00]; R_SPF_ALLOW(-0.20)[+ip6:2a00:1450:4000::/36:c]; FREEMAIL_FROM(0.00)[gmail.com]; TO_DN_NONE(0.00)[]; DKIM_TRACE(0.00)[gmail.com:+]; DMARC_POLICY_ALLOW(-0.50)[gmail.com,none]; FROM_EQ_ENVFROM(0.00)[]; MIME_TRACE(0.00)[0:+]; RBL_DBL_DONT_QUERY_IPS(0.00)[2a00:1450:4864:20::12b:from]; FREEMAIL_ENVFROM(0.00)[gmail.com]; ASN(0.00)[asn:15169, ipnet:2a00:1450::/32, country:US]; TAGGED_FROM(0.00)[]; DWL_DNSWL_NONE(0.00)[gmail.com:dkim]; ARC_NA(0.00)[]; NEURAL_HAM_MEDIUM(-1.00)[-1.000]; R_DKIM_ALLOW(-0.20)[gmail.com:s=20161025]; FROM_HAS_DN(0.00)[]; TO_MATCH_ENVRCPT_ALL(0.00)[]; NEURAL_HAM_LONG(-1.00)[-1.000]; MIME_GOOD(-0.10)[text/plain]; PREVIOUSLY_DELIVERED(0.00)[freebsd-fs@freebsd.org]; RCPT_COUNT_ONE(0.00)[1]; SPAMHAUS_ZRD(0.00)[2a00:1450:4864:20::12b:from:127.0.2.255]; NEURAL_SPAM_SHORT(1.00)[1.000]; RCVD_IN_DNSWL_NONE(0.00)[2a00:1450:4864:20::12b:from]; RCVD_COUNT_TWO(0.00)[2]; RCVD_TLS_ALL(0.00)[]; MAILMAN_DEST(0.00)[freebsd-fs] X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.34 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 10 Dec 2020 16:16:04 -0000 Recently, we attempted to get with the 2000's and try switching from NFSv3 to NFSv4 on our 12.2 servers. This has not gone well. Any system we switch to NFSv4 mounts is functionally unusable, pegged at 100% system CPU usage, load average 70+, largely from nfscl threads and client processes using NFS. Dmesg shows NFS-related messages: $ dmesg | fgrep -i nfs | sort | uniq -c | sort -n 1 nfsv4 err=10010 4 nfsv4 client/server protocol prob err=10026 29 nfscl: never fnd open Nfsstat shows no client activity; "nfsstat -e -c 1" and "nfsstat -c 1" both report: GtAttr Lookup Rdlink Read Write Rename Access Rddir 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 Meanwhile, tcpdump on the client shows an endless stream of getattr requests at the exact same time nfsstat -c says nothing is happening: $ sudo tcpdump -n -i net1 -c 10 port 2049 and src 172.20.200.39 14:47:27.037974 IP 172.20.200.39.727 > 172.20.20.161.2049: Flags [.], ack 72561, win 545, options [nop,nop,TS val 234259249 ecr 4155804100], length 0 14:47:27.046282 IP 172.20.200.39.727 > 172.20.20.161.2049: Flags [P.], seq 139940:140092, ack 72561, win 545, options [nop,nop,TS val 234259259 ecr 4155804100], length 152: NFS request xid 1544756021 148 getattr fh 0,5/0 14:47:27.051260 IP 172.20.200.39.727 > 172.20.20.161.2049: Flags [P.], seq 140092:140248, ack 72641, win 545, options [nop,nop,TS val 234259269 ecr 4155804104], length 156: NFS request xid 1544756022 152 getattr fh 0,5/0 14:47:27.063372 IP 172.20.200.39.727 > 172.20.20.161.2049: Flags [P.], seq 140248:140404, ack 72721, win 545, options [nop,nop,TS val 234259279 ecr 4155804106], length 156: NFS request xid 1544756023 152 getattr fh 0,5/0 14:47:27.068646 IP 172.20.200.39.727 > 172.20.20.161.2049: Flags [P.], seq 140404:140556, ack 72801, win 545, options [nop,nop,TS val 234259279 ecr 4155804108], length 152: NFS request xid 1544756024 148 getattr fh 0,5/0 14:47:27.080627 IP 172.20.200.39.727 > 172.20.20.161.2049: Flags [P.], seq 140556:140712, ack 72881, win 545, options [nop,nop,TS val 234259299 ecr 4155804110], length 156: NFS request xid 1544756025 152 getattr fh 0,5/0 14:47:27.085224 IP 172.20.200.39.727 > 172.20.20.161.2049: Flags [P.], seq 140712:140868, ack 72961, win 545, options [nop,nop,TS val 234259299 ecr 4155804112], length 156: NFS request xid 1544756026 152 getattr fh 0,5/0 14:47:27.096802 IP 172.20.200.39.727 > 172.20.20.161.2049: Flags [P.], seq 140868:141024, ack 73041, win 545, options [nop,nop,TS val 234259309 ecr 4155804114], length 156: NFS request xid 1544756027 152 getattr fh 0,5/0 14:47:27.101849 IP 172.20.200.39.727 > 172.20.20.161.2049: Flags [P.], seq 141024:141180, ack 73121, win 545, options [nop,nop,TS val 234259319 ecr 4155804116], length 156: NFS request xid 1544756028 152 getattr fh 0,5/0 14:47:27.112905 IP 172.20.200.39.727 > 172.20.20.161.2049: Flags [P.], seq 141180:141336, ack 73201, win 545, options [nop,nop,TS val 234259329 ecr 4155804118], length 156: NFS request xid 1544756029 152 getattr fh 0,5/0 Only 10 shown here for brevity, but: $ sudo tcpdump -n -i net1 -c 10000 port 2049 and src 172.20.200.39 | fgrep getattr | wc -l tcpdump: verbose output suppressed, use -v or -vv for full protocol decode listening on net1, link-type EN10MB (Ethernet), capture size 262144 bytes 10000 packets captured 20060 packets received by filter 0 packets dropped by kernel 9759 There are no dropped packets or network problems: $ netstat -in -I net1 Name Mtu Network Address Ipkts Ierrs Idrop Opkts Oerrs Coll net1 1500 12:33:df:5f:79:d7 40988832 0 0 48760307 0 0 net1 - 172.20.0.0/16 172.20.200.39 40942065 - - 48756241 - - The mount flags in fstab are: ro,nfsv4,nosuid The mount flags as reported by "nfsstat -m" are: nfsv4,minorversion=0,tcp,resvport,hard,cto,sec=sys,acdirmin=3,acdirmax=60,acregmin=5,acregmax=60,nametimeo=60,negnametimeo=60,rsize=65536,wsize=65536,readdirsize=65536,readahead=1,wcommitsize=16777216,timeout=120,retrans=2147483647 Today, I managed to kill everything down to one user process that was exhibiting this behavior. After a kill -9 on that process, it went to "REsJ" but continued to burn the same amount of CPU (all system). Oddly the run state / wait channel was just "CPU1." Running "ktrace" did not produce any trace records. Probably that is predictable for a process in E state; if the process had crossed the user/kernel boundary in a way ktrace could detect, it would have exited. At that point, I started unmounting filesystems. Everything but the NFS filesystem used by that process unmounted cleanly. The umount for that filesystem went to D state for about a minute and then kicked back "Device busy." That's fair, if awfully slow. Meanwhile, that user process continued burning system CPU with the E flag set, not doing anything whatsoever in userspace, still producing 300+ "getattr fh 0,5/0" per second according to tcpdump and 0 according to nfsstat. Eventually, I rebooted with fstab set back to nfsv3. This feels like the user process is in a system call that is stuck in an endless loop repeating some operation that generates that getattr request. But that is a feeling, not a fact. This is fairly easy to reproduce; it seems pretty consistent within a few hours (a day at most) any time I switch the relevant mounts to nfsv4. Reverting to nfsv3 makes this issue completely disappear. What on earth could be going on here? What other information can I provide that would help track this down? Thanks for any advice!