From owner-freebsd-fs@freebsd.org Sun Dec 13 08:09:24 2020
From: J David
Date: Sun, 13 Dec 2020 03:09:10 -0500
Subject: Re: Major issues with nfsv4
To: Konstantin Belousov
Cc: Rick Macklem, "freebsd-fs@freebsd.org"

The output of "procstat -kk -a" is extremely large. So I found what
looked like four processes that were having problems, ran procstat -kk
in a loop for all four for a couple of minutes, appending the output
to a file, then did sort | uniq on the file to find common traces.

353 samples were collected. 11 "" samples had no kernel call stack
and were discarded.

Of the 342 stack samples, here's a frequency analysis of the entries
present in the trace:

 451 VOP_LOOKUP_APV
 310 kern_statat
 309 fast_syscall_common
 303 amd64_syscall
 234 namei
 233 lookup
 226 null_lookup
 225 freebsd11_stat
 220 nfs_lookup
 189 __mtx_lock_sleep
 156 nfscl_nodeleg
 155 mi_switch
 136 vputx
 134 VOP_INACTIVE_APV
 134 vinactive
  83 sys_fstatat
  68 nfsrpc_close
  68 ncl_inactive
  67 turnstile_wait
  66 vrecycle
  66 VOP_RECLAIM_APV
  66 vgonel
  66 null_reclaim
  66 null_inactive
  55 VOP_GETATTR_APV
  55 nfscl_request
  55 newnfs_request
  55 clnt_reconnect_call
  54 clnt_vc_call
  53 sleepq_timedwait
  53 _sleep
  41 nfsrpc_lookup
  30 nfs_getattr
  26 nfscl_mustflush
  26 ncl_getattrcache
  25 null_bypass
  24 vn_stat
  24 null_getattr
  19 ast
  14 VOP_LOCK1_APV
  13 VOP_ACCESS_APV
  13 vn_dir_check_exec
  13 nfsrpc_accessrpc
  13 nfs34_access_otw
  13 nfs_access
  13 critical_exit_preempt
  12 Xipi_intr_bitmap_handler
  12 ipi_bitmap_handler
  12 doreti_ast
  11 thread_lock_flags_
   9 sigdeferstop_impl
   8 nfscl_doclose
   6 _vn_lock
   5 null_lock
   3 sleepq_wait
   3 sleeplk
   3 nfscl_deleggetmodtime
   3 nfs_lock
   2 VOP_READLINK_APV
   2 VOP_ISLOCKED_APV
   2 sigallowstop_impl
   2 nfscl_getcl
   2 lockmgr_xlock_hard
   1 Xtimerint
   1 vtnet_txq_mq_start_locked
   1 vtnet_txq_mq_start
   1 vget
   1 tcp_usr_send
   1 tcp_output
   1 sosend_generic
   1 sosend
   1 null_nodeget
   1 nfsrpc_getattr
   1 nfscl_clientrelease
   1 ncl_bioread
   1 lockmgr_slock_hard
   1 ip_output
   1 ether_output_frame
   1 ether_output
   1 cache_lookup

(VOP_LOOKUP_APV appears twice on many lines.)

The most common trace (91 times, or over 1/4 of all observed) is:

__mtx_lock_sleep+0xf8 nfscl_nodeleg+0x207 nfs_lookup+0x314
VOP_LOOKUP_APV+0x75 null_lookup+0x98 VOP_LOOKUP_APV+0x75 lookup+0x451
namei+0x414 kern_statat+0x72 freebsd11_stat+0x30 amd64_syscall+0x387
fast_syscall_common+0xf8

This trace appears roughly uniformly (28/38/34) in 3 of the 4 processes.

The full set of traces (sorted by uniq -c) is here:

https://pastebin.com/HUqkeMri

(This message is already long enough!)

The characters stripped off the front of each line are consistently:

(pid) (uid) python3 -

It happens these were all Python-based jobs. Python seems predisposed
to trigger this, but non-Python jobs trigger it as well. Heavy use of
stat() does seem to be a common element regardless of job type.
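For anyone wanting to repeat the tally described above, here is a minimal sketch in Python. It assumes the input is one kernel stack per line with the leading pid/tid/command columns already stripped, as was done for the pastebin. The function names frame_names and tally are made up for illustration and are not part of procstat or any FreeBSD tool.

```python
import re
from collections import Counter

# Matches the "+0x75" style offset that follows each function name in a stack.
_OFFSET = re.compile(r"\+0x[0-9a-fA-F]+$")


def frame_names(stack):
    """Split one stack line into function names, dropping the +0x offsets."""
    return [_OFFSET.sub("", frame) for frame in stack.split()]


def tally(stacks):
    """Return (per-function counts, per-trace counts) for an iterable of stack lines."""
    functions = Counter()
    traces = Counter()
    for stack in stacks:
        names = frame_names(stack)
        if not names:
            continue  # sample with no kernel call stack: discard it
        # Every occurrence is counted, so a function that shows up twice in
        # one trace (as VOP_LOOKUP_APV does under nullfs) is counted twice.
        functions.update(names)
        traces[" ".join(names)] += 1
    return functions, traces


if __name__ == "__main__":
    sample = [
        "__mtx_lock_sleep+0xf8 nfscl_nodeleg+0x207 nfs_lookup+0x314 "
        "VOP_LOOKUP_APV+0x75 null_lookup+0x98 VOP_LOOKUP_APV+0x75 "
        "lookup+0x451 namei+0x414 kern_statat+0x72 freebsd11_stat+0x30 "
        "amd64_syscall+0x387 fast_syscall_common+0xf8",
        "",
    ]
    functions, traces = tally(sample)
    for name, count in functions.most_common(3):
        print(f"{count:4d} {name}")
```

Counting occurrences rather than "samples containing the function" is a deliberate choice here: it is the only way a single function can exceed the number of samples, which matches the 451 figure for VOP_LOOKUP_APV against 342 samples.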
Here's the output of "nfsstat -E -c" 60 seconds after running it with -z:

Client Info:
RPC Counts:
      Getattr      Setattr       Lookup     Readlink         Read        Write
         1667          111         6376            0           42          153
       Create       Remove       Rename         Link      Symlink        Mkdir
          111            7           14            0            0            0
        Rmdir      Readdir     RdirPlus       Access        Mknod       Fsstat
            0            0            0         2620            0          160
       FSinfo     pathConf       Commit      SetClId    SetClIdCf         Lock
            0            0          113            0            0           48
        LockT        LockU         Open      OpenCfr
            0           48          320            0
   OpenDownGr        Close
            0          402
    RelLckOwn  FreeStateID    PutRootFH     DelegRet       GetAcl       SetAcl
            0            3            0            0            0            0
   ExchangeId   CreateSess  DestroySess  DestroyClId    LayoutGet   GetDevInfo
            0            0            0            0            0            0
 LayoutCommit LayoutReturn ReclaimCompl    ReadDataS   WriteDataS  CommitDataS
            0            0            0            0            0            0
   OpenLayout CreateLayout
            0            0
    OpenOwner        Opens    LockOwner        Locks       Delegs     LocalOwn
        21175       130439           30            6            0            0
    LocalOpen    LocalLown    LocalLock
            0            0            0
Rpc Info:
     TimedOut      Invalid    X Replies      Retries     Requests
            0            0            0            0        12247
Cache Info:
    Attr Hits  Attr Misses    Lkup Hits  Lkup Misses
      1110054          858      1002829         6361
    BioR Hits  BioR Misses    BioW Hits  BioW Misses
         2000           54          292          153
   BioRL Hits BioRL Misses    BioD Hits  BioD Misses
         6911            0          208            0
    DirE Hits  DirE Misses
          104            0

This does reflect the whole machine, not just those four processes.
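As a rough aid to reading those counters, the sketch below derives a few ratios from them in Python. The numbers are copied from the nfsstat output above. Treating OpenOwner and Opens as current totals rather than per-interval counts is an assumption, and the names hit_rate and summarize are illustrative only.

```python
# Counters copied from the "nfsstat -E -c" output (60-second window after -z).
INTERVAL_SECONDS = 60
rpc = {"Open": 320, "Close": 402, "Requests": 12247}
# Assumption: these two are current state totals, not reset by -z.
state = {"OpenOwner": 21175, "Opens": 130439}
# (hits, misses) pairs from the "Cache Info" section.
cache = {"Attr": (1110054, 858), "Lkup": (1002829, 6361)}


def hit_rate(hits, misses):
    """Fraction of cache probes that were hits; 0.0 if there were no probes."""
    total = hits + misses
    return hits / total if total else 0.0


def summarize():
    """Derive a few ratios from the raw counters."""
    return {
        "requests_per_second": rpc["Requests"] / INTERVAL_SECONDS,
        "net_closes": rpc["Close"] - rpc["Open"],
        "opens_per_openowner": state["Opens"] / state["OpenOwner"],
        "attr_hit_rate": hit_rate(*cache["Attr"]),
        "lkup_hit_rate": hit_rate(*cache["Lkup"]),
    }


if __name__ == "__main__":
    for key, value in summarize().items():
        print(f"{key:22s} {value:.4f}")
```

The point of the exercise is only to separate two things the raw table mixes together: the RPC traffic during the 60-second window, which is modest, and the amount of open state held by the client, which is large.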
Finally, when I attempted to kill those four processes with ktrace
running on all of them, the system panicked:

Fatal trap 12: page fault while in kernel mode
cpuid = 2; apic id = 02
fault virtual address   = 0x10
fault code              = supervisor read data, page not present
instruction pointer     = 0x20:0xffffffff8272d28f
stack pointer           = 0x28:0xfffffe008a5f24c0
frame pointer           = 0x28:0xfffffe008a5f25f0
code segment            = base 0x0, limit 0xfffff, type 0x1b
                        = DPL 0, pres 1, long 1, def32 0, gran 1
processor eflags        = interrupt enabled, resume, IOPL = 0
current process         = 5627 (node)
trap number             = 12
panic: page fault
cpuid = 2
time = 1607845622
KDB: stack backtrace:
db_trace_self_wrapper() at db_trace_self_wrapper+0x2b/frame 0xfffffe008a5f2180
vpanic() at vpanic+0x17b/frame 0xfffffe008a5f21d0
panic() at panic+0x43/frame 0xfffffe008a5f2230
trap_fatal() at trap_fatal+0x391/frame 0xfffffe008a5f2290
trap_pfault() at trap_pfault+0x4f/frame 0xfffffe008a5f22e0
trap() at trap+0x286/frame 0xfffffe008a5f23f0
calltrap() at calltrap+0x8/frame 0xfffffe008a5f23f0
--- trap 0xc, rip = 0xffffffff8272d28f, rsp = 0xfffffe008a5f24c0, rbp = 0xfffffe008a5f25f0 ---
null_bypass() at null_bypass+0xaf/frame 0xfffffe008a5f25f0
VOP_ADVLOCK_APV() at VOP_ADVLOCK_APV+0x80/frame 0xfffffe008a5f2620
closef() at closef+0x8f/frame 0xfffffe008a5f26b0
fdescfree_fds() at fdescfree_fds+0x3c/frame 0xfffffe008a5f2700
fdescfree() at fdescfree+0x466/frame 0xfffffe008a5f27c0
exit1() at exit1+0x488/frame 0xfffffe008a5f2820
sigexit() at sigexit+0x159/frame 0xfffffe008a5f2b00
postsig() at postsig+0x2fd/frame 0xfffffe008a5f2bc0
ast() at ast+0x317/frame 0xfffffe008a5f2bf0
doreti_ast() at doreti_ast+0x1f/frame 0x7fffffff67a0

The fsck ate the ktrace.out file on reboot. This system does not have
kernel debug symbols. If there's a way to try to figure out what
null_bypass+0xaf corresponds to using the machine that built this
kernel, which has the kernel, objects, debug symbols, and source,
please let me know.
It's built from releng/12.2 r368515.

Thanks!

From owner-freebsd-fs@freebsd.org Sun Dec 13 17:09:00 2020
From: Rick Macklem
To: J David, Konstantin Belousov
CC: "freebsd-fs@freebsd.org"
Subject: Re: Major issues with nfsv4
Date: Sun, 13 Dec 2020 17:08:48 +0000
J David wrote:
[stuff snipped]
>The most common trace (91 times, or over 1/4 of all observed) is:
>
>__mtx_lock_sleep+0xf8 nfscl_nodeleg+0x207 nfs_lookup+0x314
>VOP_LOOKUP_APV+0x75 null_lookup+0x98 VOP_LOOKUP_APV+0x75 lookup+0x451
>namei+0x414 kern_statat+0x72 freebsd11_stat+0x30 amd64_syscall+0x387
>fast_syscall_common+0xf8
This is just waiting on the mutex that protects the open/lock/delegation
state.

Now, is this just caused by heavy load and 130000 opens, or is there
something in nullfs that results in more state handling than needed?
[more stuff snipped]
>It happens these were all Python-based jobs. Python seems predisposed
>to trigger this, but non-Python jobs trigger it as well. Heavy use of
>stat() does seem to be a common element regardless of job type.
>
>Here's the output of "nfsstat -E -c" 60 seconds after running it with -z:
>
>Client Info:
>RPC Counts:
>      Getattr      Setattr       Lookup     Readlink         Read        Write
>         1667          111         6376            0           42          153
>       Create       Remove       Rename         Link      Symlink        Mkdir
>          111            7           14            0            0            0
>        Rmdir      Readdir     RdirPlus       Access        Mknod       Fsstat
>            0            0            0         2620            0          160
>       FSinfo     pathConf       Commit      SetClId    SetClIdCf         Lock
>            0            0          113            0            0           48
>        LockT        LockU         Open      OpenCfr
>            0           48          320            0
>   OpenDownGr        Close
>            0          402
--> So, during this time interval, there were 320 opens done and 402 closes.
>    RelLckOwn  FreeStateID    PutRootFH     DelegRet       GetAcl       SetAcl
>            0            3            0            0            0            0
>   ExchangeId   CreateSess  DestroySess  DestroyClId    LayoutGet   GetDevInfo
>            0            0            0            0            0            0
> LayoutCommit LayoutReturn ReclaimCompl    ReadDataS   WriteDataS  CommitDataS
>            0            0            0            0            0            0
>   OpenLayout CreateLayout
>            0            0
>    OpenOwner        Opens    LockOwner        Locks       Delegs     LocalOwn
>        21175       130439           30            6            0            0
So it has accumulated 130439 opens over 21175 different processes.
(An openowner represents a process on the client.)
Are there typically 20000+ processes/jobs running concurrently on this setup?

That implies that any time a file is opened (and any time a vnode v_usecount
drops to 0) a linear traversal of a 130000 element linked list is done while
holding the mutex that the above common procstat entry is waiting on.
(The nfscl_nodeleg() call to check for a delegation happens whenever lookup
has a name cache hit.)
-->Also, every VOP_GETATTR() called from stat() will need to acquire the
    state mutex for a short period of time.

-->So I am not surprised that the mount gets "constipated".
    To be honest, the NFSv4 state handling in the client was never designed
    to handle opens at this scale.
    --> Unlike the NFS server code, which uses hash tables of linked lists,
          the client only uses a single linked list for all open structures.
          I have thought of changing the NFS client code to use hash tables
          of lists and may work on doing so during the coming months, but
          who knows when such a patch might be ready.
          So long as the lists are short, holding the mutex during list
          traversal seems to work ok, even on a heavily loaded NFS server,
          so doing so should allow the client code to scale to larger
          numbers of open structures.

What you are seeing may just be contention for the client state mutex
(called NFSLOCKCLSTATE() in the code), caused by repeated traversals
of the 130000 element linked list.

The other question is...
Is having 130000 opens normal or is nullfs somehow delaying the
VOP_INACTIVE() call on the underlying NFS vnode enough to cause
these opens to accumulate?
(Due to how these Windows-oplock-like opens are defined, they cannot
be closed/discarded by the NFS client until VOP_INACTIVE() is called on
the open file's vnode.)

Counting how many processes are in __mtx_lock_sleep in the procstat
output should give us a better idea if contention on the NFSv4 client
state mutex is causing the problem.

rick

>    LocalOpen    LocalLown    LocalLock
>            0            0            0
>Rpc Info:
>     TimedOut      Invalid    X Replies      Retries     Requests
>            0            0            0            0        12247
>Cache Info:
>    Attr Hits  Attr Misses    Lkup Hits  Lkup Misses
>      1110054          858      1002829         6361
>    BioR Hits  BioR Misses    BioW Hits  BioW Misses
>         2000           54          292          153
>   BioRL Hits BioRL Misses    BioD Hits  BioD Misses
>         6911            0          208            0
>    DirE Hits  DirE Misses
>          104            0
>
>This does reflect the whole machine, not just those four processes.
>
>Finally, when I attempted to kill those four processes with ktrace
>running on all of them, the system panicked:
>
>Fatal trap 12: page fault while in kernel mode
>cpuid = 2; apic id = 02
>fault virtual address   = 0x10
>fault code              = supervisor read data, page not present
>instruction pointer     = 0x20:0xffffffff8272d28f
>stack pointer           = 0x28:0xfffffe008a5f24c0
>frame pointer           = 0x28:0xfffffe008a5f25f0
>code segment            = base 0x0, limit 0xfffff, type 0x1b
>                        = DPL 0, pres 1, long 1, def32 0, gran 1
>processor eflags        = interrupt enabled, resume, IOPL = 0
>current process         = 5627 (node)
>trap number             = 12
>panic: page fault
>cpuid = 2
>time = 1607845622
>KDB: stack backtrace:
>db_trace_self_wrapper() at db_trace_self_wrapper+0x2b/frame 0xfffffe008a5f2180
>vpanic() at vpanic+0x17b/frame 0xfffffe008a5f21d0
>panic() at panic+0x43/frame 0xfffffe008a5f2230
>trap_fatal() at trap_fatal+0x391/frame 0xfffffe008a5f2290
>trap_pfault() at trap_pfault+0x4f/frame 0xfffffe008a5f22e0
>trap() at trap+0x286/frame 0xfffffe008a5f23f0
>calltrap() at calltrap+0x8/frame 0xfffffe008a5f23f0
>--- trap 0xc, rip = 0xffffffff8272d28f, rsp = 0xfffffe008a5f24c0, rbp = 0xfffffe008a5f25f0 ---
>null_bypass() at null_bypass+0xaf/frame 0xfffffe008a5f25f0
>VOP_ADVLOCK_APV() at VOP_ADVLOCK_APV+0x80/frame 0xfffffe008a5f2620
>closef() at closef+0x8f/frame 0xfffffe008a5f26b0
>fdescfree_fds() at fdescfree_fds+0x3c/frame 0xfffffe008a5f2700
>fdescfree() at fdescfree+0x466/frame 0xfffffe008a5f27c0
>exit1() at exit1+0x488/frame 0xfffffe008a5f2820
>sigexit() at sigexit+0x159/frame 0xfffffe008a5f2b00
>postsig() at postsig+0x2fd/frame 0xfffffe008a5f2bc0
>ast() at ast+0x317/frame 0xfffffe008a5f2bf0
>doreti_ast() at doreti_ast+0x1f/frame 0x7fffffff67a0
>
>The fsck ate the ktrace.out file on reboot. This system does not have
>kernel debug symbols. If there's a way to try to figure out what
>null_bypass+0xaf corresponds to using the machine that built this
>kernel, which has the kernel, objects, debug symbols, and source,
>please let me know.
>It's built from releng/12.2 r368515.
>
>Thanks!

From owner-freebsd-fs@freebsd.org Sun Dec 13 21:25:34 2020
Date: Sun, 13 Dec 2020 23:25:22 +0200
From: Konstantin Belousov
To: Rick Macklem
Cc: J David, "freebsd-fs@freebsd.org"
Subject: Re: Major issues with nfsv4

On Sun, Dec 13, 2020 at 05:08:48PM +0000, Rick Macklem wrote:
> J David wrote:
> [stuff snipped]
> >The most common trace (91 times, or over 1/4 of all observed) is:
> >
> >__mtx_lock_sleep+0xf8 nfscl_nodeleg+0x207 nfs_lookup+0x314
> >VOP_LOOKUP_APV+0x75 null_lookup+0x98 VOP_LOOKUP_APV+0x75 lookup+0x451
> >namei+0x414 kern_statat+0x72 freebsd11_stat+0x30 amd64_syscall+0x387
> >fast_syscall_common+0xf8
> This is just waiting on the mutex that protects the open/lock/delegation
> state.
>
> Now, is this just caused by heavy load and 130000 opens, or is there
> something in nullfs that results in more state handling than needed?
> [more stuff snipped]

Nullfs with -o nocache (default for NFS mounts) should not cache vnodes.
So it is more likely a local load that has 130k files open. Of course,
it is the OP who can answer the question.

From owner-freebsd-fs@freebsd.org Mon Dec 14 00:24:18 2020
From: Rick Macklem
To: J David, Konstantin Belousov
CC: "freebsd-fs@freebsd.org"
Subject: Re: Major issues with nfsv4
Date: Mon, 14 Dec 2020 00:24:14 +0000
Content-Type: multipart/mixed; boundary="_002_YQXPR0101MB096834DB16439960ADAFCE29DDC70YQXPR0101MB0968_"
permitted sender) smtp.mailfrom=rmacklem@uoguelph.ca X-Spamd-Result: default: False [-6.00 / 15.00]; TO_DN_EQ_ADDR_SOME(0.00)[]; TO_DN_SOME(0.00)[]; R_SPF_ALLOW(-0.20)[+ip6:2a01:111:f400::/48]; HAS_ATTACHMENT(0.00)[]; RCVD_COUNT_THREE(0.00)[3]; DKIM_TRACE(0.00)[uoguelph.ca:+]; DMARC_POLICY_ALLOW(-0.50)[uoguelph.ca,none]; NEURAL_HAM_SHORT(-1.00)[-1.000]; FREEMAIL_TO(0.00)[gmail.com]; FROM_EQ_ENVFROM(0.00)[]; RCVD_TLS_LAST(0.00)[]; RBL_DBL_DONT_QUERY_IPS(0.00)[2a01:111:f400:fe5c::601:from]; ARC_ALLOW(-1.00)[microsoft.com:s=arcselector9901:i=1]; MIME_TRACE(0.00)[0:+,1:+,2:~]; ASN(0.00)[asn:8075, ipnet:2a01:111:f000::/36, country:US]; NEURAL_HAM_MEDIUM(-1.00)[-1.000]; R_DKIM_ALLOW(-0.20)[uoguelph.ca:s=selector1]; FREEFALL_USER(0.00)[rmacklem]; FROM_HAS_DN(0.00)[]; RCPT_COUNT_THREE(0.00)[3]; NEURAL_HAM_LONG(-1.00)[-1.000]; TAGGED_RCPT(0.00)[]; MIME_GOOD(-0.10)[multipart/mixed,text/plain]; SPAMHAUS_ZRD(0.00)[2a01:111:f400:fe5c::601:from:127.0.2.255]; DWL_DNSWL_LOW(-1.00)[uoguelph.ca:dkim]; TO_MATCH_ENVRCPT_SOME(0.00)[]; MAILMAN_DEST(0.00)[freebsd-fs] X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.34 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Mon, 14 Dec 2020 00:24:18 -0000 --_002_YQXPR0101MB096834DB16439960ADAFCE29DDC70YQXPR0101MB0968_ Content-Type: text/plain; charset="iso-8859-1" Content-Transfer-Encoding: quoted-printable J David wrote:=0A= [stuff snipped]=0A= >The most common trace (91 times, or over 1/4 of all observed) is:=0A= >=0A= >__mtx_lock_sleep+0xf8 nfscl_nodeleg+0x207 nfs_lookup+0x314=0A= >VOP_LOOKUP_APV+0x75 null_lookup+0x98 VOP_LOOKUP_APV+0x75 >lookup+0x451=0A= >namei+0x414 kern_statat+0x72 freebsd11_stat+0x30 amd64_syscall+0x387=0A= >fast_syscall_common+0xf8=0A= >=0A= >This trace appears roughly uniformly (28/38/34) in 3 of the 4 processes.= =0A= nfscl_nodeleg() plus two others acquire the mutex to check for a delegation= =0A= for the file. 
Servers should only issue delegations when the callback=0A= path is working and some never issue delegations.=0A= =0A= So, if you are not running the nfscbd(8) daemon (there is no need=0A= unless you are using delegations or pNFS) a trivial check for the=0A= callback path being enabled should determine that no delegation=0A= exists if it is not enabled. This avoids acquiring the mutex for this=0A= common case.=0A= --> You have no "Delegs" according to your "nfsstat -c -E".=0A= =0A= So, if you set up a test system with the attached patch applied to the=0A= kernel and you do not run nfscbd(8) (ie no nfscbd_enable=3D"YES" in=0A= your /etc/rc.conf) this patch might help.=0A= --> It avoids acquiring the mutex for lookups and getattrs (stat calls).=0A= =0A= The only reason I haven't done this before is a concern w.r.t. a broken=0A= NFSv4 server that erroneously issues delegations when the callback=0A= path is not working. (I don't know of such a server and, since you do not= =0A= have delegations, it should not be an issue for your case.)=0A= =0A= Might be enough to alleviate your problem.=0A= =0A= I will work on changing the single open linked list to a hash table of=0A= linked lists, but that is probably weeks away from a test patch.=0A= =0A= Let me know if you can try the patch and if it helps, rick=0A= =0A= The full set of traces (sorted by uniq -c) is here:=0A= =0A= https://pastebin.com/HUqkeMri=0A= =0A= (This message is already long enough!)=0A= =0A= The characters stripped off the front of each line are consistently:=0A= =0A= (pid) (uid) python3 -=0A= =0A= It happens these were all Python-based jobs. Python seems predisposed=0A= to trigger this, but non-Python jobs trigger it as well. 
Heavy use of=0A= stat() does seem to be a common element regardless of job type.=0A= =0A= Here's the output of "nfsstat -E -c" 60 seconds after running it with -z:= =0A= =0A= Client Info:=0A= RPC Counts:=0A= Getattr Setattr Lookup Readlink Read Wr= ite=0A= 1667 111 6376 0 42 = 153=0A= Create Remove Rename Link Symlink Mk= dir=0A= 111 7 14 0 0 = 0=0A= Rmdir Readdir RdirPlus Access Mknod Fss= tat=0A= 0 0 0 2620 0 = 160=0A= FSinfo pathConf Commit SetClId SetClIdCf L= ock=0A= 0 0 113 0 0 = 48=0A= LockT LockU Open OpenCfr=0A= 0 48 320 0=0A= OpenDownGr Close=0A= 0 402=0A= RelLckOwn FreeStateID PutRootFH DelegRet GetAcl Set= Acl=0A= 0 3 0 0 0 = 0=0A= ExchangeId CreateSess DestroySess DestroyClId LayoutGet GetDevI= nfo=0A= 0 0 0 0 0 = 0=0A= LayoutCommit LayoutReturn ReclaimCompl ReadDataS WriteDataS CommitDa= taS=0A= 0 0 0 0 0 = 0=0A= OpenLayout CreateLayout=0A= 0 0=0A= OpenOwner Opens LockOwner Locks Delegs Local= Own=0A= 21175 130439 30 6 0 = 0=0A= LocalOpen LocalLown LocalLock=0A= 0 0 0=0A= Rpc Info:=0A= TimedOut Invalid X Replies Retries Requests=0A= 0 0 0 0 12247=0A= Cache Info:=0A= Attr Hits Attr Misses Lkup Hits Lkup Misses=0A= 1110054 858 1002829 6361=0A= BioR Hits BioR Misses BioW Hits BioW Misses=0A= 2000 54 292 153=0A= BioRL Hits BioRL Misses BioD Hits BioD Misses=0A= 6911 0 208 0=0A= DirE Hits DirE Misses=0A= 104 0=0A= =0A= This does reflect the whole machine, not just those four processes.=0A= =0A= Finally, when I attempted to kill those four processes with ktrace=0A= running on all of them, the system panicked:=0A= =0A= Fatal trap 12: page fault while in kernel mode=0A= cpuid =3D 2; apic id =3D 02=0A= fault virtual address =3D 0x10=0A= fault code =3D supervisor read data, page not present=0A= instruction pointer =3D 0x20:0xffffffff8272d28f=0A= stack pointer =3D 0x28:0xfffffe008a5f24c0=0A= frame pointer =3D 0x28:0xfffffe008a5f25f0=0A= code segment =3D base 0x0, limit 0xfffff, type 0x1b=0A= =3D DPL 0, pres 1, long 1, def32 0, gran 1=0A= processor eflags =3D 
interrupt enabled, resume, IOPL =3D 0=0A= current process =3D 5627 (node)=0A= trap number =3D 12=0A= panic: page fault=0A= cpuid =3D 2=0A= time =3D 1607845622=0A= KDB: stack backtrace:=0A= db_trace_self_wrapper() at db_trace_self_wrapper+0x2b/frame 0xfffffe008a5f2= 180=0A= vpanic() at vpanic+0x17b/frame 0xfffffe008a5f21d0=0A= panic() at panic+0x43/frame 0xfffffe008a5f2230=0A= trap_fatal() at trap_fatal+0x391/frame 0xfffffe008a5f2290=0A= trap_pfault() at trap_pfault+0x4f/frame 0xfffffe008a5f22e0=0A= trap() at trap+0x286/frame 0xfffffe008a5f23f0=0A= calltrap() at calltrap+0x8/frame 0xfffffe008a5f23f0=0A= --- trap 0xc, rip =3D 0xffffffff8272d28f, rsp =3D 0xfffffe008a5f24c0, rbp= =0A= =3D 0xfffffe008a5f25f0 ---=0A= null_bypass() at null_bypass+0xaf/frame 0xfffffe008a5f25f0=0A= VOP_ADVLOCK_APV() at VOP_ADVLOCK_APV+0x80/frame 0xfffffe008a5f2620=0A= closef() at closef+0x8f/frame 0xfffffe008a5f26b0=0A= fdescfree_fds() at fdescfree_fds+0x3c/frame 0xfffffe008a5f2700=0A= fdescfree() at fdescfree+0x466/frame 0xfffffe008a5f27c0=0A= exit1() at exit1+0x488/frame 0xfffffe008a5f2820=0A= sigexit() at sigexit+0x159/frame 0xfffffe008a5f2b00=0A= postsig() at postsig+0x2fd/frame 0xfffffe008a5f2bc0=0A= ast() at ast+0x317/frame 0xfffffe008a5f2bf0=0A= doreti_ast() at doreti_ast+0x1f/frame 0x7fffffff67a0=0A= =0A= The fsck ate the ktrace.out file on reboot. This system does not have=0A= kernel debug symbols. If there's a way to try to figure out what=0A= null_bypass+0xaf corresponds to using the machine that built this=0A= kernel, which has the kernel, objects, debug symbols, and source,=0A= please let me know. 
It's built from releng/12.2 r368515.=0A= =0A= Thanks!=0A= --_002_YQXPR0101MB096834DB16439960ADAFCE29DDC70YQXPR0101MB0968_ Content-Type: application/octet-stream; name="clstate.patch" Content-Description: clstate.patch Content-Disposition: attachment; filename="clstate.patch"; size=912; creation-date="Mon, 14 Dec 2020 00:24:08 GMT"; modification-date="Mon, 14 Dec 2020 00:24:08 GMT" Content-Transfer-Encoding: base64 LS0tIHN5cy9mcy9uZnNjbGllbnQvbmZzX2Nsc3RhdGUuYy5zYXYJMjAyMC0xMi0xMyAwNjoxNzoz MC42NDI3NTIwMDAgLTA4MDAKKysrIHN5cy9mcy9uZnNjbGllbnQvbmZzX2Nsc3RhdGUuYwkyMDIw LTEyLTEzIDA2OjM4OjEwLjgxOTkwOTAwMCAtMDgwMApAQCAtNDM3NCw3ICs0Mzc0LDcgQEAgbmZz Y2xfbm9kZWxlZyh2bm9kZV90IHZwLCBpbnQgd3JpdGVkZWxlZykKIAogCW5wID0gVlRPTkZTKHZw KTsKIAlubXAgPSBWRlNUT05GUyh2cC0+dl9tb3VudCk7Ci0JaWYgKCFORlNIQVNORlNWNChubXAp KQorCWlmICghTkZTSEFTTkZTVjQobm1wKSB8fCBuZnNjbF9lbmFibGVjYWxsYiA9PSAwKQogCQly ZXR1cm4gKDEpOwogCU5GU0xPQ0tDTFNUQVRFKCk7CiAJY2xwID0gbmZzY2xfZmluZGNsKG5tcCk7 CkBAIC00NzMwLDcgKzQ3MzAsNyBAQCBuZnNjbF9kZWxlZ21vZHRpbWUodm5vZGVfdCB2cCkKIAlz dHJ1Y3QgbmZzbW91bnQgKm5tcDsKIAogCW5tcCA9IFZGU1RPTkZTKHZwLT52X21vdW50KTsKLQlp ZiAoIU5GU0hBU05GU1Y0KG5tcCkpCisJaWYgKCFORlNIQVNORlNWNChubXApIHx8IG5mc2NsX2Vu YWJsZWNhbGxiID09IDApCiAJCXJldHVybjsKIAlORlNMT0NLQ0xTVEFURSgpOwogCWNscCA9IG5m c2NsX2ZpbmRjbChubXApOwpAQCAtNDc1OSw3ICs0NzU5LDcgQEAgbmZzY2xfZGVsZWdnZXRtb2R0 aW1lKHZub2RlX3QgdnAsIHN0cnVjdCB0aW1lc3BlYyAqbXRpCiAJc3RydWN0IG5mc21vdW50ICpu bXA7CiAKIAlubXAgPSBWRlNUT05GUyh2cC0+dl9tb3VudCk7Ci0JaWYgKCFORlNIQVNORlNWNChu bXApKQorCWlmICghTkZTSEFTTkZTVjQobm1wKSB8fCBuZnNjbF9lbmFibGVjYWxsYiA9PSAwKQog CQlyZXR1cm47CiAJTkZTTE9DS0NMU1RBVEUoKTsKIAljbHAgPSBuZnNjbF9maW5kY2wobm1wKTsK --_002_YQXPR0101MB096834DB16439960ADAFCE29DDC70YQXPR0101MB0968_-- From owner-freebsd-fs@freebsd.org Mon Dec 14 03:52:02 2020 Return-Path: Delivered-To: freebsd-fs@mailman.nyi.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2610:1c1:1:606c::19:1]) by mailman.nyi.freebsd.org (Postfix) with ESMTP id 840A94AD2A3 for ; Mon, 14 Dec 
2020 03:52:02 +0000 (UTC) (envelope-from jdavidlists@gmail.com) Received: from mail-lf1-x12e.google.com (mail-lf1-x12e.google.com [IPv6:2a00:1450:4864:20::12e]) (using TLSv1.3 with cipher TLS_AES_128_GCM_SHA256 (128/128 bits) key-exchange X25519 server-signature RSA-PSS (4096 bits) server-digest SHA256 client-signature RSA-PSS (2048 bits) client-digest SHA256) (Client CN "smtp.gmail.com", Issuer "GTS CA 1O1" (verified OK)) by mx1.freebsd.org (Postfix) with ESMTPS id 4CvSBT4nqmz3sZ3 for ; Mon, 14 Dec 2020 03:52:01 +0000 (UTC) (envelope-from jdavidlists@gmail.com) Received: by mail-lf1-x12e.google.com with SMTP id u18so26650926lfd.9 for ; Sun, 13 Dec 2020 19:52:01 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20161025; h=mime-version:references:in-reply-to:from:date:message-id:subject:to :cc; bh=pTnZFSVYWHSaPXUZ3mtbqMS1N8mqIZ1DM9QDI0fBLyA=; b=KaTTjlIZyqJ4XZdHGyE3XMvdqPmEYUGQGh1PohQEdqu4CjO4DHh4Ovt1g8VKMHHDnD wW1B8szPLtgqTnjnRbx9AZVaBMoKdNWBkMonv938Qle8BWP1qB7SGQGebD+fgAmiJKzA 7gn5sKZ29zJlr4QQTu11EhzuqVUrYCSw7uuKZHmiGHnALf9SNJNQvi7iZJ4f5Ydegs8x RzT5OyToCtLQ3+rIvyiaC0CASai06x3PvDiaqBHr6A/pJIunIJqNyN3vqcjZF9Rn4RZL PjNrhTJZLAPsvYWvG9TJ422iWNdUcWp7iF+PczvC+x+ZI5avzqyJeWuYoBZw2bupiV1S 3oXw== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:mime-version:references:in-reply-to:from:date :message-id:subject:to:cc; bh=pTnZFSVYWHSaPXUZ3mtbqMS1N8mqIZ1DM9QDI0fBLyA=; b=pTa9FXJfFlNxth+UJybK+u3AH2aLO6MC3qw0Ws52N+EANmx3+/HchsJb+K7YIl+bEf 0YQARpToHTgR8Jo42I8Un2FaNbvjNmS/DnKWOch12A61eHolNax0SSs10OQJnbiu50SN MYDvchovyefJloZ85VpGzz9aIsRQFJ+ElBY2ZVdaprf/dEZ1zfTXYfH2ZH/6rLA8uZQc V9VE2UZdi08BgSt/74xkve6fA7O6Tqr/eth2MlOUhVGpNWQYUWXQANUCZPwR06AXsDVB MWpryZvciWXx9dkTrwMjRGg20ExOG1hyT8+WRRC4fLcGbatxMIas66X4YzDe61L8lZhd 4G6g== X-Gm-Message-State: AOAM530nehjvrHXMBza6qjhEsCmZN16P3DQVVskXygUOk2X6RMbJaMbo 3EiSvqhYrKiAR9hh4Yh0poQbaSLWHm0XH9ikg0A= X-Google-Smtp-Source: 
ABdhPJyl3sDEB4DNsGsUiS0p8DZZjAcyY/hScMYeKqv8A+aIXQ5lbpzTylQc1qY3yC0mK7nXMYVE6g5JU84pZsQIu9E= X-Received: by 2002:a2e:b80c:: with SMTP id u12mr9858960ljo.490.1607917919528; Sun, 13 Dec 2020 19:51:59 -0800 (PST) MIME-Version: 1.0 References: In-Reply-To: From: J David Date: Sun, 13 Dec 2020 22:51:48 -0500 Message-ID: Subject: Re: Major issues with nfsv4 To: Konstantin Belousov Cc: Rick Macklem , "freebsd-fs@freebsd.org" Content-Type: text/plain; charset="UTF-8" X-Rspamd-Queue-Id: 4CvSBT4nqmz3sZ3 X-Spamd-Bar: --- Authentication-Results: mx1.freebsd.org; dkim=pass header.d=gmail.com header.s=20161025 header.b=KaTTjlIZ; dmarc=pass (policy=none) header.from=gmail.com; spf=pass (mx1.freebsd.org: domain of jdavidlists@gmail.com designates 2a00:1450:4864:20::12e as permitted sender) smtp.mailfrom=jdavidlists@gmail.com X-Spamd-Result: default: False [-3.98 / 15.00]; TO_DN_EQ_ADDR_SOME(0.00)[]; TO_DN_SOME(0.00)[]; FREEMAIL_FROM(0.00)[gmail.com]; R_SPF_ALLOW(-0.20)[+ip6:2a00:1450:4000::/36]; DKIM_TRACE(0.00)[gmail.com:+]; DMARC_POLICY_ALLOW(-0.50)[gmail.com,none]; NEURAL_HAM_SHORT(-0.98)[-0.979]; FREEMAIL_TO(0.00)[gmail.com]; FROM_EQ_ENVFROM(0.00)[]; MIME_TRACE(0.00)[0:+]; RBL_DBL_DONT_QUERY_IPS(0.00)[2a00:1450:4864:20::12e:from]; FREEMAIL_ENVFROM(0.00)[gmail.com]; ASN(0.00)[asn:15169, ipnet:2a00:1450::/32, country:US]; TAGGED_FROM(0.00)[]; DWL_DNSWL_NONE(0.00)[gmail.com:dkim]; ARC_NA(0.00)[]; NEURAL_HAM_MEDIUM(-1.00)[-1.000]; R_DKIM_ALLOW(-0.20)[gmail.com:s=20161025]; FROM_HAS_DN(0.00)[]; RCPT_COUNT_THREE(0.00)[3]; NEURAL_HAM_LONG(-1.00)[-1.000]; MIME_GOOD(-0.10)[text/plain]; PREVIOUSLY_DELIVERED(0.00)[freebsd-fs@freebsd.org]; SPAMHAUS_ZRD(0.00)[2a00:1450:4864:20::12e:from:127.0.2.255]; TO_MATCH_ENVRCPT_SOME(0.00)[]; RCVD_IN_DNSWL_NONE(0.00)[2a00:1450:4864:20::12e:from]; RCVD_COUNT_TWO(0.00)[2]; RCVD_TLS_ALL(0.00)[]; MAILMAN_DEST(0.00)[freebsd-fs] X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.34 Precedence: list List-Id: Filesystems List-Unsubscribe: , 
List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Mon, 14 Dec 2020 03:52:02 -0000 On Sun, Dec 13, 2020 at 4:25 PM Konstantin Belousov wrote: > Nullfs with -o nocache (default for NFS mounts) should not cache vnodes. > So it is more likely a local load that has 130k files open. Of course, > it is the OP who can answer the question. This I can rule out; there is no visible correlation between "Opens" and the number of files open on the system. Just finishing a test right now, and: $ sudo nfsstat -E -c | fgrep -A1 OpenOwner OpenOwner Opens LockOwner Locks Delegs LocalOwn 4678 36245 15 6 0 0 $ sudo fstat | wc -l 2698 $ ps Haxlww | wc -l 1012 The value of Opens increases consistently over time. Killing the processes causing this behavior *did not* reduce the number of OpenOwner or Opens. Unmounting the nullfs mounts (after the processes were gone) *did*: $ sudo nfsstat -E -c | fgrep -A1 OpenOwner OpenOwner Opens LockOwner Locks Delegs LocalOwn 130 41 0 0 0 0 Mutex contention was observed this time, but once it was apparent that "Opens" was increasing over time, I didn't let the test get to the point of disrupting activities. This test ended at Opens = 36589, which is well short of the previous 130,000+. It is possible that mutex contention becomes an issue once system CPU resources are exhausted. More about the results of the latest test after the data is analyzed. After that's done, I'll attempt Rick's patch. In the long run, we would definitely like to get delegation to work. Baby steps! Thanks! 
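
The comparison above (the "Opens" counter during the test versus after unmounting the nullfs mounts) could be repeated over time with a small script. What follows is only a sketch under stated assumptions: the function name parse_open_counters and the two sample variables are hypothetical, the sample text is copied from the message above, and the code only parses text that has already been captured from "nfsstat -E -c"; it does not run nfsstat itself.

```python
def parse_open_counters(nfsstat_output):
    """Return the OpenOwner/Opens/... row of `nfsstat -E -c` text as a dict.

    Assumes the layout shown in this thread: one line of column names
    containing "OpenOwner", followed by one line of integer values.
    """
    lines = nfsstat_output.splitlines()
    for header, values in zip(lines, lines[1:]):
        names = header.split()
        if "OpenOwner" in names:
            return dict(zip(names, map(int, values.split())))
    raise ValueError("no OpenOwner row found")


# Sample text taken from the message above (hypothetical variable names).
DURING_TEST = """\
    OpenOwner        Opens    LockOwner        Locks       Delegs     LocalOwn
         4678        36245           15            6            0            0
"""

AFTER_UNMOUNT = """\
    OpenOwner        Opens    LockOwner        Locks       Delegs     LocalOwn
          130           41            0            0            0            0
"""

during = parse_open_counters(DURING_TEST)
after = parse_open_counters(AFTER_UNMOUNT)

# Report how many opens were released by unmounting the nullfs mounts.
print("Opens during test:", during["Opens"])
print("Opens after unmount:", after["Opens"])
print("Released:", during["Opens"] - after["Opens"])
```

Sampling the same row at a fixed interval and comparing it against "fstat | wc -l" would show whether "Opens" keeps growing independently of the number of files actually open, which is the behavior described above.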

From owner-freebsd-fs@freebsd.org Mon Dec 14 03:52:34 2020
From: J David <jdavidlists@gmail.com>
Date: Sun, 13 Dec 2020 22:52:20 -0500
Subject: Re: Major issues with nfsv4
To: Konstantin Belousov
Cc: Rick Macklem, "freebsd-fs@freebsd.org"

Sorry, mutex contention was *not* observed this time.

Thanks!

From owner-freebsd-fs@freebsd.org Mon Dec 14 07:57:35 2020
Date: Mon, 14 Dec 2020 08:57:03 +0100
Message-ID: <20201214085703.Horde.gA1tADBpbqeZbvgO3plk1f-@webmail.leidinger.net>
From: Alexander Leidinger <Alexander@leidinger.net>
To: freebsd-fs@freebsd.org
Subject: Re: Major issues with nfsv4

Quoting Rick Macklem (from Fri, 11 Dec 2020 23:28:30 +0000):

>> While it's certainly possible to configure NFS not to require reserved
>> ports, the slightest possibility of a non-root user establishing a
>> session to the NFS server kills that as an option.
> Personally, I've never thought the reserved port# requirement provided
> any real security for most situations. Unless you set "vfs.usermount=1"
> only root can do the mount. For non-root to mount the NFS server
> when "vfs.usermount=0", a user would have to run their own custom hacked
> userland NFS client. Although doable, I have never heard of it being done.

22 years ago I wrote a userland NFS client (it triggered my first
contribution/bugfix to rpcgen in FreeBSD, which was MFCed to FreeBSD
2.2.8) as a university project (an experimental computer with PRAM
technology didn't have a network stack but a host-interface to a
controlling server, and people wanted to access network shares, so the
controlling host was an NFS proxy, and I did this with an NFS userland
client). IIRC it was NFSv3. I had a little test-tool with a CUI in
which I was able to interactively list directories and open files (I
used that for testing). As this more or less was my first software
project I realized alone, and it was scheduled to be something to be
realized with a few man-hours per week during half a year, I would say
it is easy to do for someone with interest / motivation.

Bye,
Alexander.

-- 
http://www.Leidinger.net Alexander@Leidinger.net: PGP 0x8F31830F9F2772BF
http://www.FreeBSD.org    netchild@FreeBSD.org  : PGP 0x8F31830F9F2772BF

From owner-freebsd-fs@freebsd.org Mon Dec 14 15:05:50 2020
From: Rick Macklem <rmacklem@uoguelph.ca>
To: "freebsd-fs@freebsd.org", Alexander Leidinger
Subject: Re: Major issues with nfsv4
Date: Mon, 14 Dec 2020 15:05:43 +0000
In-Reply-To: <20201214085703.Horde.gA1tADBpbqeZbvgO3plk1f-@webmail.leidinger.net>
Content-Type: text/plain; charset="iso-8859-1"
Content-Transfer-Encoding: quoted-printable
MIME-Version: 1.0
X-OriginatorOrg: uoguelph.ca
X-MS-Exchange-CrossTenant-mailboxtype: HOSTED X-MS-Exchange-CrossTenant-userprincipalname: 8/GXs1dD2KNOoSesjmITYPRyWbLnTIvaKxrxDXEurtBNiPvtWp5bFhtOJTPuYgAU12dw1HfkfsKK8JNN9wCpOg== X-MS-Exchange-Transport-CrossTenantHeadersStamped: QB1PR01MB2913 X-Rspamd-Queue-Id: 4Cvl7x4MRyz3DqZ X-Spamd-Bar: ------ Authentication-Results: mx1.freebsd.org; dkim=pass header.d=uoguelph.ca header.s=selector1 header.b=gebmKswu; arc=pass (microsoft.com:s=arcselector9901:i=1); dmarc=pass (policy=none) header.from=uoguelph.ca; spf=pass (mx1.freebsd.org: domain of rmacklem@uoguelph.ca designates 40.107.66.54 as permitted sender) smtp.mailfrom=rmacklem@uoguelph.ca X-Spamd-Result: default: False [-6.00 / 15.00]; TO_DN_EQ_ADDR_SOME(0.00)[]; TO_DN_SOME(0.00)[]; R_SPF_ALLOW(-0.20)[+ip4:40.107.0.0/16]; RCVD_COUNT_THREE(0.00)[3]; DKIM_TRACE(0.00)[uoguelph.ca:+]; RCPT_COUNT_TWO(0.00)[2]; DMARC_POLICY_ALLOW(-0.50)[uoguelph.ca,none]; NEURAL_HAM_SHORT(-1.00)[-1.000]; FROM_EQ_ENVFROM(0.00)[]; RCVD_TLS_LAST(0.00)[]; RBL_DBL_DONT_QUERY_IPS(0.00)[40.107.66.54:from]; ARC_ALLOW(-1.00)[microsoft.com:s=arcselector9901:i=1]; MIME_TRACE(0.00)[0:+]; ASN(0.00)[asn:8075, ipnet:40.104.0.0/14, country:US]; NEURAL_HAM_MEDIUM(-1.00)[-1.000]; R_DKIM_ALLOW(-0.20)[uoguelph.ca:s=selector1]; FREEFALL_USER(0.00)[rmacklem]; FROM_HAS_DN(0.00)[]; NEURAL_HAM_LONG(-1.00)[-1.000]; MIME_GOOD(-0.10)[text/plain]; SPAMHAUS_ZRD(0.00)[40.107.66.54:from:127.0.2.255]; DWL_DNSWL_LOW(-1.00)[uoguelph.ca:dkim]; TO_MATCH_ENVRCPT_SOME(0.00)[]; RCVD_IN_DNSWL_NONE(0.00)[40.107.66.54:from]; RWL_MAILSPIKE_POSSIBLE(0.00)[40.107.66.54:from]; MAILMAN_DEST(0.00)[freebsd-fs] X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.34 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Mon, 14 Dec 2020 15:05:50 -0000

Alexander Leidinger wrote:
>Quoting Rick Macklem
>>> While it's certainly possible to configure NFS not to require reserved
>>> ports, the
>>> slightest possibility of a non-root user establishing a
>>> session to the NFS server kills that as an option.
>> Personally, I've never thought the reserved port# requirement provided
>> any real security for most situations. Unless you set "vfs.usermount=1",
>> only root can do the mount. For non-root to mount the NFS server
>> when "vfs.usermount=0", a user would have to run their own custom hacked
>> userland NFS client. Although doable, I have never heard of it being done.
>
>22 years ago I wrote a userland NFS client (it triggered my first
>contribution/bugfix to rpcgen in FreeBSD which was MFCed to FreeBSD
>2.2.8) as a university project (an experimental computer with PRAM
>technology didn't have a network stack but a host-interface to a
>controlling server, and people wanted to access network shares, so the
>controlling host was an NFS proxy, and I did this with an NFS userland
>client). IIRC it was NFSv3. I had a little test-tool with a CUI in
>which I was able to interactively list directories and open files (I
>used that for testing). As this more or less was my first software
>project I realized alone, and it was scheduled to be something to be
>realized with a few man-hours per week during half a year, I would say
>it is easy to do for someone with interest / motivation.
It's a lot more work to do an NFSv4 one and if all your legitimate
NFS mounts are v4, you can probably disable NFSv3 support on the
NFS server (vfs.nfsd.server_min_nfsvers=4 on FreeBSD).

The NFS-over-TLS I now have in test mode for FreeBSD can help
w.r.t. this since it can be configured to require the client have an
X509 certificate for NFS to work.
If you are interested in more info
on this https://people.freebsd.org/~rmacklem/nfs-over-tls-setup.txt

rick

Bye,
Alexander.

--
http://www.Leidinger.net Alexander@Leidinger.net: PGP 0x8F31830F9F2772BF
http://www.FreeBSD.org netchild@FreeBSD.org : PGP 0x8F31830F9F2772BF

From owner-freebsd-fs@freebsd.org Mon Dec 14 15:21:43 2020 Return-Path: Delivered-To: freebsd-fs@mailman.nyi.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2610:1c1:1:606c::19:1]) by mailman.nyi.freebsd.org (Postfix) with ESMTP id A40934BCC46 for ; Mon, 14 Dec 2020 15:21:43 +0000 (UTC) (envelope-from jdavidlists@gmail.com) Received: from mail-lf1-x134.google.com (mail-lf1-x134.google.com [IPv6:2a00:1450:4864:20::134]) (using TLSv1.3 with cipher TLS_AES_128_GCM_SHA256 (128/128 bits) key-exchange X25519 server-signature RSA-PSS (4096 bits) server-digest SHA256 client-signature RSA-PSS (2048 bits) client-digest SHA256) (Client CN "smtp.gmail.com", Issuer "GTS CA 1O1" (verified OK)) by mx1.freebsd.org (Postfix) with ESMTPS id 4CvlVG1cNTz3FTg for ; Mon, 14 Dec 2020 15:21:42 +0000 (UTC) (envelope-from jdavidlists@gmail.com) Received: by mail-lf1-x134.google.com with SMTP id l11so31051987lfg.0 for ; Mon, 14 Dec 2020 07:21:41 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20161025; h=mime-version:references:in-reply-to:from:date:message-id:subject:to :cc; bh=o+H+NLT8BAXe2r81tkY63OrR0U2RqQWnhB3/NSO05F0=; b=u8A1lJ650oU61d1/TUIk/WYST2axV5d7vhiT0ahrOm2e8jOje9BQQVFKSiPfW57306 sbyt+yhCTYSwPlkXDXkikviqtA8XE2BrjUtJwPfy/cqdXIQlMDZSeAJ7a8gkdc7MQ5aR F8YUA2RRflO2fS8QZW+DBEF71sxCvuKixUIOqwOqhji2gKJWxufBkNPJ8LyQt6rI3Rgp J33DWfeaDbm+2jCaMnmW0zvANAL9RiAeC9m1APiTKhZ6NzhLrwgU/uj0C3UNzoEmf+oN IrDhOKl3J7mxTrU++XLvr+uhgtafVQsU2WzMAAwgdaRk2rFLT0Hp6P3t1JS01dS0Zw6m OA8Q== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:mime-version:references:in-reply-to:from:date
:message-id:subject:to:cc; bh=o+H+NLT8BAXe2r81tkY63OrR0U2RqQWnhB3/NSO05F0=; b=NIRJ3Mrl63zwZ96o30jgFDNRApc1D7Vra23aMQ6jNmlur1ZUXDGU9pd/ysRXdKaw2H fruVB/H5RAei7g+CnBL7aRBH2v5aDg52qgOYH3PngrnUhhQmx+zr+adubqfAcHpuAs2X pr9lcdTx8IQfbluegVT5RcZylPPd0ibmGFMILe9SWuCevu83FrM0jpcELqWLAxLfg9Wb AdQK8ugGTafKLld9WunVBWOZC9Vqgx1t3Q9R4R2X3hJ60vE0z6nwBipyVfyzUUBRqguK NhGtt/NWKg7+NqAReOrkIKGed57GGNOwkwfK1Cy795RcW0YzYkdvy6GiFC3RXdrCfcKP vRDA== X-Gm-Message-State: AOAM531GAxtg6Oz9fofiztSvL2E7Pe9ww4Lk+hxn6iXkiecqr8I3nzML NKKe3vIVxUU+KZZeQKoPHf1nzzOlJfhAs9nbDik= X-Google-Smtp-Source: ABdhPJzkEHWs6Tm5+BNnNhy1nkwVavYm8LADVeDdOdC2HtxMROlCaSE8O5I+tmpncF1p3OfKLzRUH1B5HrQN7cQKyvU= X-Received: by 2002:a19:4a13:: with SMTP id x19mr9344111lfa.648.1607959298848; Mon, 14 Dec 2020 07:21:38 -0800 (PST) MIME-Version: 1.0 References: In-Reply-To: From: J David Date: Mon, 14 Dec 2020 10:21:27 -0500 Message-ID: Subject: Re: Major issues with nfsv4 To: Konstantin Belousov Cc: Rick Macklem , "freebsd-fs@freebsd.org" X-Rspamd-Queue-Id: 4CvlVG1cNTz3FTg X-Spamd-Bar: --- Authentication-Results: mx1.freebsd.org; dkim=pass header.d=gmail.com header.s=20161025 header.b=u8A1lJ65; dmarc=pass (policy=none) header.from=gmail.com; spf=pass (mx1.freebsd.org: domain of jdavidlists@gmail.com designates 2a00:1450:4864:20::134 as permitted sender) smtp.mailfrom=jdavidlists@gmail.com X-Spamd-Result: default: False [-4.00 / 15.00]; TO_DN_EQ_ADDR_SOME(0.00)[]; TO_DN_SOME(0.00)[]; R_SPF_ALLOW(-0.20)[+ip6:2a00:1450:4000::/36:c]; FREEMAIL_FROM(0.00)[gmail.com]; HAS_ATTACHMENT(0.00)[]; DKIM_TRACE(0.00)[gmail.com:+]; DMARC_POLICY_ALLOW(-0.50)[gmail.com,none]; NEURAL_HAM_SHORT(-1.00)[-1.000]; FREEMAIL_TO(0.00)[gmail.com]; FROM_EQ_ENVFROM(0.00)[]; MIME_TRACE(0.00)[0:+,1:+,2:~]; FREEMAIL_ENVFROM(0.00)[gmail.com]; ASN(0.00)[asn:15169, ipnet:2a00:1450::/32, country:US]; TAGGED_FROM(0.00)[]; RBL_DBL_DONT_QUERY_IPS(0.00)[2a00:1450:4864:20::134:from]; DWL_DNSWL_NONE(0.00)[gmail.com:dkim]; ARC_NA(0.00)[]; NEURAL_HAM_MEDIUM(-1.00)[-1.000]; 
R_DKIM_ALLOW(-0.20)[gmail.com:s=20161025]; FROM_HAS_DN(0.00)[]; RCPT_COUNT_THREE(0.00)[3]; NEURAL_HAM_LONG(-1.00)[-1.000]; MIME_GOOD(-0.10)[multipart/mixed,text/plain]; PREVIOUSLY_DELIVERED(0.00)[freebsd-fs@freebsd.org]; SPAMHAUS_ZRD(0.00)[2a00:1450:4864:20::134:from:127.0.2.255]; TO_MATCH_ENVRCPT_SOME(0.00)[]; RCVD_IN_DNSWL_NONE(0.00)[2a00:1450:4864:20::134:from]; RCVD_COUNT_TWO(0.00)[2]; RCVD_TLS_ALL(0.00)[]; MAILMAN_DEST(0.00)[freebsd-fs] Content-Type: text/plain; charset="UTF-8" X-Content-Filtered-By: Mailman/MimeDel 2.1.34 X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.34 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Mon, 14 Dec 2020 15:21:43 -0000

TLDR: The values of OpenOwner and Opens have a statistically
significant correlation to the passage of time and are statistically
independent of the number of currently running jobs (jails),
processes, or threads.

3,173 samples were collected over approximately twelve hours,
containing the following values (five-number summary in parentheses:
min 1Q median 3Q max):

- nfsstat -E -c OpenOwner (137 1405 2380 3541 4693)
- nfsstat -E -c Opens (49 10479 18229 27732 36589)
- # of active Jobs (1 50 50 50 51)
- # of Job processes (1 117 117 117 121)
- # of Job threads (1 519 521 525 533)
- # of nfscl Threads (48 53 53 53 55)
- Total # of processes on system (149 260 261 264 280)
- Total # of threads on system (481 996 1001 1005 1023)

OpenOwner and Opens are the dependent variables. The remaining values
and the sample sequence number (N) are independent variables.

The following table shows the adjusted R-squared values of linear
regressions for each pairing of an independent and a dependent
variable. While R-squared is not always the best measure of goodness
of fit, it is easy to understand, and given the type of data and the
relationship sought, its use here is both accurate and illustrative.
              OpenOwner   Opens
N                0.9369  0.9310
NTestEnd*        0.9962  0.9979
Jobs             0.2461  0.0324
JobProcs         0.0225  0.0285
JobThreads       0.0921  0.1060
NfsclThreads     0.0072  0.0000
SysProcs         0.0325  0.0376
SysThreads       0.1003  0.1145

*Because the test ended at sample 3156, NTestEnd reflects the
regressions of OpenOwner and Opens vs. sample sequence number for
samples 1-3156 only.

The results strongly indicate that both OpenOwner and Opens are highly
correlated with time. No other regression demonstrates a
statistically significant correlation. Opens and OpenOwner are also
highly correlated to each other (adjusted R-squared = 0.9957).

The high correlation and strong linear relationship with time suggest
this is caused by something that is both roughly constant over time
and largely independent of system activity measures based on process
counts.

It may be worth re-doing this test, capturing the rest of the
"nfsstat -E -c" stats about operations as well as counts of open
files. Finding a strong correlation might help narrow down the causal
action, which would hopefully make it possible to independently
reproduce and/or fix this.

A couple of questions about that:

1) Is there a way to get the total number of currently-open files more
efficiently than enumerating them? (E.g., "fstat | wc -l" and "fstat
-m | wc -l" are slow and resource-intensive.)

2) If so, is there a way to do that on a per-process basis?

Thanks!
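For anyone who wants to reproduce figures like the ones in the table above, here is a minimal sketch of the adjusted R-squared for a one-predictor least-squares fit. It is not the script used for this analysis. The function name adjusted_r_squared and all sample values below are made up for illustration, and the formula assumes a simple linear regression with an intercept and a single independent variable.

```python
def adjusted_r_squared(xs, ys):
    """Fit y = a + b*x by ordinary least squares and return adjusted R-squared.

    Assumes exactly one predictor (x) plus an intercept.
    """
    n = len(xs)
    if n != len(ys) or n < 3:
        raise ValueError("need at least three paired samples")

    mean_x = sum(xs) / n
    mean_y = sum(ys) / n

    # Sums of squares and cross-products around the means.
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    if sxx == 0 or syy == 0:
        raise ValueError("constant series: regression is undefined")

    # For a single predictor, R-squared is the squared correlation.
    r2 = (sxy * sxy) / (sxx * syy)

    # Adjust for the one predictor: 1 - (1 - R^2) * (n - 1) / (n - p - 1), p = 1.
    return 1.0 - (1.0 - r2) * (n - 1) / (n - 2)


# Made-up samples (NOT the data from the test run described above).
seq = list(range(1, 9))                      # sample sequence number N
opens = [40, 52, 61, 75, 84, 96, 105, 118]   # hypothetical Opens counter
jobs = [50, 50, 51, 50, 50, 51, 50, 50]      # hypothetical active job count

print("Opens vs N:   ", round(adjusted_r_squared(seq, opens), 4))
print("Opens vs Jobs:", round(adjusted_r_squared(jobs, opens), 4))
```

Note that the adjusted value can come out slightly negative when the predictor explains essentially nothing, which is expected and not a bug.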
From owner-freebsd-fs@freebsd.org Mon Dec 14 15:31:52 2020 Return-Path: Delivered-To: freebsd-fs@mailman.nyi.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2610:1c1:1:606c::19:1]) by mailman.nyi.freebsd.org (Postfix) with ESMTP id C427E4BCCE6 for ; Mon, 14 Dec 2020 15:31:52 +0000 (UTC) (envelope-from rmacklem@uoguelph.ca) Received: from CAN01-QB1-obe.outbound.protection.outlook.com (mail-qb1can01on0614.outbound.protection.outlook.com [IPv6:2a01:111:f400:fe5c::614]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (Client CN "mail.protection.outlook.com", Issuer "GlobalSign Organization Validation CA - SHA256 - G3" (verified OK)) by mx1.freebsd.org (Postfix) with ESMTPS id 4Cvljz5vpVz3GG2 for ; Mon, 14 Dec 2020 15:31:51 +0000 (UTC) (envelope-from rmacklem@uoguelph.ca) ARC-Seal: i=1; a=rsa-sha256; s=arcselector9901; d=microsoft.com; cv=none; b=N6/2rTi0kBW3Iq1HPkDkzpGrB5Dx1Ju53Av/ltjRKl8fGtZk9lMcNA+JWkGCchudMRvz5hnTbR6NYK7FPJgxH+/2AhHycsanU0Urxei0sX0wkTEuNAU0loh7B0bWBaGHUzRv6KKsD2wky8ZjM4RhY+4fgk+lh0ny1kJKYEMbvvVLqoCaby13TR5TycniZxfLzucPcBTOB3bFtKyCNaqJyuOtUDwQuIRubC0YSH8ISobbroy+vY0gBynGhT1ZkvWShaXdELkqHaMrSK62xc/fYKWY3FNe2gKnGYyOPlHkcA+Xd6QgZFCGqgbfofaUh68cRHsPGBejMC0iYnQPbYjGUg== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=microsoft.com; s=arcselector9901; h=From:Date:Subject:Message-ID:Content-Type:MIME-Version:X-MS-Exchange-SenderADCheck; bh=yoLg9w5lThSmnX/kOAscMrAv/3ZJaWDiV8Usd/Iq1Lg=; b=CjsgbAW0SNqDE39x3fP7WhBqEu2PxNAJGVyV1fSRZRDqAT6Y6U0NZ2oKloRy3JI3NONQIMUVWau0oNuxgXX726Uiu/hl4FWEe+n0DxkYr1nRZU/UOnxDSPKwIuL71gkAbzL7difJeDU3ue7W9tFl9LWV5huin4qMXEyVE8VM+JH0gYJ+wKfQmBJK0fGe1ieM5dIg9/Nxiv7d3Y1AB3B9o+6uIhNco4e5MEQmlvrGVEXp1UH1j0bDabjI0hMqUmSIQX5OuLPhHr6xhz0HyCoEm1SQHe2xubSDp2gvls4Sp3g3UvSzXvWO2qOU8YppLKm5CI8idmisiKaBw5zalu2irQ== ARC-Authentication-Results: i=1; mx.microsoft.com 1; spf=pass smtp.mailfrom=uoguelph.ca; dmarc=pass action=none header.from=uoguelph.ca; dkim=pass header.d=uoguelph.ca; arc=none 
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=uoguelph.ca; s=selector1; h=From:Date:Subject:Message-ID:Content-Type:MIME-Version:X-MS-Exchange-SenderADCheck; bh=yoLg9w5lThSmnX/kOAscMrAv/3ZJaWDiV8Usd/Iq1Lg=; b=kgZz2/EdmpjGBlclGkWqLISDw98M6/fskw8iFcfrKl2UbMqwMT8s7vyqWNM4Zg5muj9jv0QxTuYL+NNGF7TcoQpAWnDAsvA6zeRgH+V7QQ5ic3Uwa66/Jxl5JSi8fl3vYJPkRnm1gT02r51G3kA+6rFBXAofp2HJBBFYk3St6OZ5ETuCKzhU513pB95t2pcuWvlFwDqyF9QlKn6ivXOY1JT9b7t3fql7rMdFgL/r0dXvRP7hSHE4KrUa8k8v7DBhcTb1vmEtPerxVUUhYYVwuyO98Fb7Owod8Q7LVJffM4QwWyJCKEWQ3WxZnK4rt0qNnuuh0KHie03gMa6LfGUYLA== Received: from YQXPR0101MB0968.CANPRD01.PROD.OUTLOOK.COM (2603:10b6:c00:19::29) by QB1PR01MB2724.CANPRD01.PROD.OUTLOOK.COM (2603:10b6:c00:3e::17) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id 15.20.3654.24; Mon, 14 Dec 2020 15:31:49 +0000 Received: from YQXPR0101MB0968.CANPRD01.PROD.OUTLOOK.COM ([fe80::7d6b:aa68:78f4:5d94]) by YQXPR0101MB0968.CANPRD01.PROD.OUTLOOK.COM ([fe80::7d6b:aa68:78f4:5d94%7]) with mapi id 15.20.3654.025; Mon, 14 Dec 2020 15:31:49 +0000 From: Rick Macklem To: J David , Konstantin Belousov CC: "freebsd-fs@freebsd.org" Subject: Re: Major issues with nfsv4 Thread-Topic: Major issues with nfsv4 Thread-Index: AQHWzw/HDat+dHoH9kKG5K3Xpd53kqnxDteQgAFi0QCAABTa84AALLCAgAAVvcmAAAu0AIAA4wiAgAAI5gCAAO/IAIAAiDrmgABWOgCAAGv4AIAAv9Gv Date: Mon, 14 Dec 2020 15:31:49 +0000 Message-ID: References: , In-Reply-To: Accept-Language: en-US Content-Language: en-US X-MS-Has-Attach: X-MS-TNEF-Correlator: x-ms-publictraffictype: Email x-ms-office365-filtering-correlation-id: 8d5a2ab8-4bc8-40ae-15de-08d8a0455fe6 x-ms-traffictypediagnostic: QB1PR01MB2724: x-microsoft-antispam-prvs: x-ms-oob-tlc-oobclassifiers: OLM:10000; x-ms-exchange-senderadcheck: 1 x-microsoft-antispam: BCL:0; x-microsoft-antispam-message-info: 
IH6BjkNkkKCe9uZ7upN9gXbk0xnT7Vy4zuUF/LXJFng8q/H9bydpE4269GA9hU8cWm34bHhAuMd1soEYzzJNiHW4+XNATDOPmckujGOy7cdxsJG9sZbSA/KTJydWCorkPVqmVIeAydAy1ejIj/77W1d3m+MT0n902a8HESCf/K+TRlXLuVs9Qp0rSznrcxbHB85qrLCK1xYXRg9jWBEUuGK4jd19fjbkWOhL7mJjqu3U7rLiVxJX48gSCwsdYqJ6lTkVWJP7tXHlBf7ZOH4U0VyE46Z+giwJK1g2UD2Dm3NDGpaILLURFfipklAysTkJNd+o711clerdPIjdKC+OPQ== x-forefront-antispam-report: CIP:255.255.255.255; CTRY:; LANG:en; SCL:1; SRV:; IPV:NLI; SFV:NSPM; H:YQXPR0101MB0968.CANPRD01.PROD.OUTLOOK.COM; PTR:; CAT:NONE; SFS:(376002)(396003)(136003)(39860400002)(366004)(346002)(9686003)(86362001)(4326008)(8936002)(52536014)(55016002)(6506007)(5660300002)(64756008)(110136005)(83380400001)(316002)(66946007)(66476007)(786003)(91956017)(66446008)(2906002)(71200400001)(76116006)(66556008)(33656002)(7696005)(478600001)(186003)(8676002); DIR:OUT; SFP:1101; x-ms-exchange-antispam-messagedata: =?iso-8859-1?Q?F/tWHhjXRZJbo71f+NI+xZtWQjETYVxpn7hl6YnJeSZVL1lOclUX4kKOwM?= =?iso-8859-1?Q?OVHrsXzEZnMpIOvBdLzdED0SvNyGhP2lCJ4iQhSLuPRNSmrWYkGS69ZlNd?= =?iso-8859-1?Q?dqvLoNyUBkhazqwVY3oG07P6iV3g1TRFjIO1JDylPxL37Z3UsDzNz0PsY7?= =?iso-8859-1?Q?nn9gneGBykPFKslhOqfjcASeIYI3pFN1LbhLtokLEa0jmTmytihH6bD60S?= =?iso-8859-1?Q?UgPWljuLzgRrF/uRi7ZE8r7F30ECAv7YuvD9+Lq8uTRIkns34/YftZSWnD?= =?iso-8859-1?Q?GNI/8672ZVwVFrvdpvXoNg6wrFBWgG0C1Pf2BgtJRsoI1Gk+Uam+x4NlKi?= =?iso-8859-1?Q?TAFVX1rpbyksidBCtbrzsVH1+mYraWymD10JR5Mf5Mbff8Bk3oQbzr3APS?= =?iso-8859-1?Q?dUSatEPtdb11LoxtQkKdEo8YzEwTpsRMFnM0FV4HSIGNQa3Wy7j3BwXg/+?= =?iso-8859-1?Q?55LnHCvcpP83OyEo5GSZJ/VlWvNmPGpZKMiH4ilflo+AUF/XIvnirXaAAl?= =?iso-8859-1?Q?bAwbVkBG7TISfL6rbaAlwN0s1BUxC10vscJIn25iL4Jf13TEiejtLGfqUm?= =?iso-8859-1?Q?JWhgLjOMLAHsOE1ObmG722WjD8O4XReFtMp3pwcgSeqoW6FTelzRDaCSUs?= =?iso-8859-1?Q?PVA00NvjEmHH5wcXB1EXdpKYeim4Z0SF17VDJtyEJhRIwEKgBu8H4u5uGR?= =?iso-8859-1?Q?sySZlO6FWlaEu2u2BKlDkNrKjiqWM5DxQkeALIfbiJk4EhD4MyV1/ISuBD?= =?iso-8859-1?Q?TQfci2o9NxbD2w4bdy+fY3YqNyo88vXAEjFFw9TcoaU8dq6LVthJLoUoVW?= 
=?iso-8859-1?Q?Z3qOnMHvwOURlVw+68IYwWoxIQL+vqOr1jAMa8niRt6XZSmqjXrIX267Ie?= =?iso-8859-1?Q?c1/49oc5JbMAX9WdXtqZv1uy5xqHiN5jRB084CTil6JgQZO+GhvER4MeTj?= =?iso-8859-1?Q?Yn72ILPDw+TAHAIretzJJReNOmweAlt6+EFULlxGlDiLj3k+ZhMyLPlDhW?= =?iso-8859-1?Q?WeCUVD6A7mJ8ZKj4sukjNEO7thTy47y4oyGZ1ukCOTy41QiKbjF9i18fvv?= =?iso-8859-1?Q?UmzlC2US8TwEYOFwAYafnzM=3D?= x-ms-exchange-transport-forked: True Content-Type: text/plain; charset="iso-8859-1" Content-Transfer-Encoding: quoted-printable MIME-Version: 1.0 X-OriginatorOrg: uoguelph.ca X-MS-Exchange-CrossTenant-AuthAs: Internal X-MS-Exchange-CrossTenant-AuthSource: YQXPR0101MB0968.CANPRD01.PROD.OUTLOOK.COM X-MS-Exchange-CrossTenant-Network-Message-Id: 8d5a2ab8-4bc8-40ae-15de-08d8a0455fe6 X-MS-Exchange-CrossTenant-originalarrivaltime: 14 Dec 2020 15:31:49.4677 (UTC) X-MS-Exchange-CrossTenant-fromentityheader: Hosted X-MS-Exchange-CrossTenant-id: be62a12b-2cad-49a1-a5fa-85f4f3156a7d X-MS-Exchange-CrossTenant-mailboxtype: HOSTED X-MS-Exchange-CrossTenant-userprincipalname: e5567rr48gSn5QyJbJP94dQlvXWXnKigfPDrlpJfoagZUMs2ZDkyjgGMLLIWmuug97ZDWUE3q3rMrm74WuPVDA== X-MS-Exchange-Transport-CrossTenantHeadersStamped: QB1PR01MB2724 X-Rspamd-Queue-Id: 4Cvljz5vpVz3GG2 X-Spamd-Bar: ------ Authentication-Results: mx1.freebsd.org; dkim=pass header.d=uoguelph.ca header.s=selector1 header.b=kgZz2/Ed; arc=pass (microsoft.com:s=arcselector9901:i=1); dmarc=pass (policy=none) header.from=uoguelph.ca; spf=pass (mx1.freebsd.org: domain of rmacklem@uoguelph.ca designates 2a01:111:f400:fe5c::614 as permitted sender) smtp.mailfrom=rmacklem@uoguelph.ca X-Spamd-Result: default: False [-6.00 / 15.00]; TO_DN_EQ_ADDR_SOME(0.00)[]; TO_DN_SOME(0.00)[]; R_SPF_ALLOW(-0.20)[+ip6:2a01:111:f400::/48]; RCVD_COUNT_THREE(0.00)[3]; DKIM_TRACE(0.00)[uoguelph.ca:+]; DMARC_POLICY_ALLOW(-0.50)[uoguelph.ca,none]; NEURAL_HAM_SHORT(-1.00)[-1.000]; FREEMAIL_TO(0.00)[gmail.com]; FROM_EQ_ENVFROM(0.00)[]; RCVD_TLS_LAST(0.00)[]; RBL_DBL_DONT_QUERY_IPS(0.00)[2a01:111:f400:fe5c::614:from]; 
ARC_ALLOW(-1.00)[microsoft.com:s=arcselector9901:i=1]; MIME_TRACE(0.00)[0:+]; ASN(0.00)[asn:8075, ipnet:2a01:111:f000::/36, country:US]; NEURAL_HAM_MEDIUM(-1.00)[-1.000]; R_DKIM_ALLOW(-0.20)[uoguelph.ca:s=selector1]; FREEFALL_USER(0.00)[rmacklem]; FROM_HAS_DN(0.00)[]; RCPT_COUNT_THREE(0.00)[3]; NEURAL_HAM_LONG(-1.00)[-1.000]; TAGGED_RCPT(0.00)[]; MIME_GOOD(-0.10)[text/plain]; DWL_DNSWL_LOW(-1.00)[uoguelph.ca:dkim]; SPAMHAUS_ZRD(0.00)[2a01:111:f400:fe5c::614:from:127.0.2.255]; TO_MATCH_ENVRCPT_SOME(0.00)[]; MAILMAN_DEST(0.00)[freebsd-fs] X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.34 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Mon, 14 Dec 2020 15:31:52 -0000

J David wrote:
>On Sun, Dec 13, 2020 at 4:25 PM Konstantin Belousov wrote:
>> Nullfs with -o nocache (default for NFS mounts) should not cache vnodes.
>> So it is more likely a local load that has 130k files open. Of course,
>> it is the OP who can answer the question.
>
>This I can rule out; there is no visible correlation between "Opens"
>and the number of files open on the system.
>
>Just finishing a test right now, and:
>
>$ sudo nfsstat -E -c | fgrep -A1 OpenOwner
>    OpenOwner        Opens    LockOwner        Locks       Delegs     LocalOwn
>         4678        36245           15            6            0            0
>$ sudo fstat | wc -l
>    2698
>$ ps Haxlww | wc -l
>    1012
>
>The value of Opens increases consistently over time.
>
>Killing the processes causing this behavior *did not* reduce the
>number of OpenOwner or Opens.
>
>Unmounting the nullfs mounts (after the processes were gone) *did*:
>
>$ sudo nfsstat -E -c | fgrep -A1 OpenOwner
>    OpenOwner        Opens    LockOwner        Locks       Delegs     LocalOwn
>          130           41            0            0            0            0
I will take a look at the nullfs code to see if I can spot anything that might
explain this.

>Mutex contention was observed this time, but
>once it was apparent that
>"Opens" was increasing over time, I didn't let the test get to the
>point of disrupting activities. This test ended at Opens = 36589,
>which is well short of the previous 130,000+. It is possible that
>mutex contention becomes an issue once system CPU resources are
>exhausted.
>
>More about the results of the latest test after the data is analyzed.
>
>After that's done, I'll attempt Rick's patch.
The patch should help w.r.t. lock contention, but if the Open
count is increasing, you'll "hit the wall" eventually.

> In the long run, we
>would definitely like to get delegation to work. Baby steps!
Well, all delegations do is allow a file to be repeatedly opened/closed
without talking to the server. But they are per-file and issued on the
first open against the server. As such, they are only useful if processes
are opening/closing the same file over and over again.
-->They also add complexity and, as such, they are often a loss and
not a help.
They are disabled on the FreeBSD NFS server by default
and disabled on the client side unless nfscbd(8) is running.

rick

Thanks!

From owner-freebsd-fs@freebsd.org Mon Dec 14 15:52:22 2020 Return-Path: Delivered-To: freebsd-fs@mailman.nyi.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2610:1c1:1:606c::19:1]) by mailman.nyi.freebsd.org (Postfix) with ESMTP id 246114BD5DD for ; Mon, 14 Dec 2020 15:52:22 +0000 (UTC) (envelope-from rmacklem@uoguelph.ca) Received: from CAN01-QB1-obe.outbound.protection.outlook.com (mail-eopbgr660081.outbound.protection.outlook.com [40.107.66.81]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (Client CN "mail.protection.outlook.com", Issuer "GlobalSign Organization Validation CA - SHA256 - G3" (verified OK)) by mx1.freebsd.org (Postfix) with ESMTPS id 4Cvm9d03Myz3GpN for ; Mon, 14 Dec 2020 15:52:20 +0000 (UTC) (envelope-from rmacklem@uoguelph.ca) ARC-Seal: i=1; a=rsa-sha256; s=arcselector9901; d=microsoft.com; cv=none; b=ORq548kK8tbK8Fj0MFLqzZ9Sf06YZFSXbOt5WQ8LvyjhuCo3s4erBrQvik/0UwQx3btJGBQC5a58iOVb0fBx4+QjihkQl7Hr8OtcwMJXlqPhoARA27BabXTVcUEjJqeF9lP7n2QF50wiqqowNxJT6jGvMSY0NYrs/UhTgMGP0eF1OltYD7quwnIuJ6KBugLi3DlQhF85TdOnT4/2lsHXt44Rq5WrrL44rhLgWb5ctFHRG5bk/YBpdMZ8iIuG0SPsx7zlAjYlDqfj5R3qoJMFYOh9up7+E8h1LfaL/ICpvO51+uF8V7/Dm3tqqeEm/d7fJ332j0BTR2h+HOx/EkGuaQ== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=microsoft.com; s=arcselector9901; h=From:Date:Subject:Message-ID:Content-Type:MIME-Version:X-MS-Exchange-SenderADCheck; bh=mONrFz7rwdrR3VYN8g5Vwayy2j1jN5YtasLcVTZRmcs=; b=Bs23OFCzwbIllipRFbsYao8Mk68R6e6hESTNce9YhybDywBSnzFubzU09trlPJMJlCyMvNwjg5iuhqXH12AsS27QhdeVFhwsy53XPpBLXU7cVnO/3pHdJLSp4BqQVm9o3sw/8nX/xeAHQP9kqQ76SATBbUGwN/t50dxCeuSMmU+cwWJxEIxlXKECkBEZ5dh8I+GWg4X+tU7WWnPlB6SWAey6Jexlmo/do02d3hRv31bozd2cjRbKjBo0hNSl6hQqdI84apgcbrpfbv72A13X71EbpBqpZUVjdU1I+Fvp+mvpsBJmk0n8zYhgkmv6OsB2MuTFZMToXC6vRDS45KMZ6Q== ARC-Authentication-Results: i=1;
mx.microsoft.com 1; spf=pass smtp.mailfrom=uoguelph.ca; dmarc=pass action=none header.from=uoguelph.ca; dkim=pass header.d=uoguelph.ca; arc=none DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=uoguelph.ca; s=selector1; h=From:Date:Subject:Message-ID:Content-Type:MIME-Version:X-MS-Exchange-SenderADCheck; bh=mONrFz7rwdrR3VYN8g5Vwayy2j1jN5YtasLcVTZRmcs=; b=bDWWLWWNXdCDwr7OHNqF7303IrWTNfLUvYYumVfRYpXvac7Fu6MgxcpU3scNENKyPcF0LyrIxrlCT0xj6f+yKFCiNH28ZLBemQOljCrl6L8dK5gKHN9qJU9tc1Bic0cbPHcNqIqBQvfJg+M6KBQPwlvI0qib9MbEDyIqgCERrpuYgP13Jrf7JCqpqfY+poEbKU0rwO5Ss6+DcFL/m+AbSPKYUvVOwdftYFfzR54tAsLGxQXqeI50YjrOITLGFvQIteIrFuCDRa09ec6Trdg35TB0r7giRcgQPvABQu7TdBrdcnigVPtN1Srpzu++szIr50Nkhy4gpxbaWFf5VZiYow== Received: from YQXPR0101MB0968.CANPRD01.PROD.OUTLOOK.COM (2603:10b6:c00:19::29) by YQXPR01MB2406.CANPRD01.PROD.OUTLOOK.COM (2603:10b6:c00:51::20) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id 15.20.3654.24; Mon, 14 Dec 2020 15:52:19 +0000 Received: from YQXPR0101MB0968.CANPRD01.PROD.OUTLOOK.COM ([fe80::7d6b:aa68:78f4:5d94]) by YQXPR0101MB0968.CANPRD01.PROD.OUTLOOK.COM ([fe80::7d6b:aa68:78f4:5d94%7]) with mapi id 15.20.3654.025; Mon, 14 Dec 2020 15:52:19 +0000 From: Rick Macklem To: J David , Konstantin Belousov CC: "freebsd-fs@freebsd.org" Subject: Re: Major issues with nfsv4 Thread-Topic: Major issues with nfsv4 Thread-Index: AQHWzw/HDat+dHoH9kKG5K3Xpd53kqnxDteQgAFi0QCAABTa84AALLCAgAAVvcmAAAu0AIAA4wiAgAAI5gCAAO/IAIAAiDrmgABWOgCAAGv4AIAAACYAgADAioCAAAV+kQ== Date: Mon, 14 Dec 2020 15:52:19 +0000 Message-ID: References: , In-Reply-To: Accept-Language: en-US Content-Language: en-US X-MS-Has-Attach: X-MS-TNEF-Correlator: x-ms-publictraffictype: Email x-ms-office365-filtering-correlation-id: 67a3d275-e659-40db-0593-08d8a0483d10 x-ms-traffictypediagnostic: YQXPR01MB2406: x-microsoft-antispam-prvs: x-ms-oob-tlc-oobclassifiers: OLM:7219; x-ms-exchange-senderadcheck: 1 x-microsoft-antispam: BCL:0; x-microsoft-antispam-message-info: 
QKCfTYw0DxZQIzPX+WF2CuHo/EQguIi5qyE0odtu+o12hJOWZS4ndNGB2lXmKEXjZLQodsVCZ74z5qXShIDiPLu9/wgq/nCX3gUUMYa1ypjI9TbCEH0Lp7lxnuI58Dc9JILQ7aEpVgP1cenUhNSWzKy5ZQyYhW6+T7SyPnEf/oLKdILbdfuceTugNBy7HAwl63opUhF5dwfN2xEvZuNnPphybU2ZzRP8+XNillHQQmZRDlXFWg8ziC4fvt99in+EF+fkt6YvNJQOqaW7LNecO9EkPTxHxHEeP+pe+e1XuOHQPAoHGhfeu0fcMCs8JF1MxR7qgmd51vPZDdl0YOtHVw== x-forefront-antispam-report: CIP:255.255.255.255; CTRY:; LANG:en; SCL:1; SRV:; IPV:NLI; SFV:NSPM; H:YQXPR0101MB0968.CANPRD01.PROD.OUTLOOK.COM; PTR:; CAT:NONE; SFS:(366004)(136003)(346002)(376002)(6506007)(66476007)(91956017)(786003)(110136005)(508600001)(8936002)(83380400001)(33656002)(4326008)(186003)(66556008)(64756008)(66946007)(9686003)(86362001)(66446008)(76116006)(5660300002)(7696005)(2906002)(55016002)(53546011)(71200400001)(8676002)(52536014); DIR:OUT; SFP:1101; x-ms-exchange-antispam-messagedata: =?us-ascii?Q?imJfp8VpJT3nlN7ZAjKzO2M7e1p3MwkPqEzujaT9MZUYRTra7Ql8l+2WXbub?= =?us-ascii?Q?AVIpBYTXFFnOJRlDyOSe16aOTW4g50L1jhGkDPNNGxo9DmReNaW1ZuhMlJ74?= =?us-ascii?Q?6/0yX7bEcjL3+3Yd7KexJIbmwUhEpgnSxgN6EgctXh9OQCahjWvVWDoe6sif?= =?us-ascii?Q?o6w3siq+f7nU2UmjuwpjZGhQDJholHFvtKaaDYb3AsQBvc3d4RbIXRRL6g7G?= =?us-ascii?Q?zRD4Sr9D+MpHogfhGBLyQuC3j74bpHS1Jga21tSx/y7Aq/lXO4VzDdAx8b0M?= =?us-ascii?Q?Y0JguIUEJ8VN8gjbTi9qSmsrRZpKIcmcgQwpXWSkdfXN+PTC6TCvDd8EEXd+?= =?us-ascii?Q?P5ZTJQtZO0ehUumSqYnJQ5+ROLyk6dlc08MY/m9I9FKkO2LyAO9xDXaerhKO?= =?us-ascii?Q?xLVBO9D6zK/qLkOwf4yiCadRm4kjuAM0k8hYNNj8LEKGQL1JbgMNstowyx6y?= =?us-ascii?Q?kWsfpxnAPW1A3kvwpxO7RLO6dbewA7eoYerWxHMu7Ig8l0C5Yhlj9+0frxJe?= =?us-ascii?Q?e80rg6sSvn9lfVz0fymaliIE/bIpKk18eyXi1jjY1CaZ/eni3deQnYfE6cSv?= =?us-ascii?Q?ulC5W9qLzFBBUQ9+XFOp6BiMdK6nq+5cNYNGkigKau4Z2kLm6HbLuSvRXtFr?= =?us-ascii?Q?zTvpQE5rEubCOm9iNFi3XZ7W/lMl0oN504O2U8d5zkYRZhs55nDknha+Y0BL?= =?us-ascii?Q?jsXzyyBt4xX/Q8b9+ugCLMJBXBhguqQsP/AoHh8D4w8BsWKwgjQZbVdvn9RS?= =?us-ascii?Q?oCVpuzd1oQZ7OU56RPM6YaG4pTxTaPo9ZOcAuWbaVJmtqPPg8gT+RuCu1/AI?= =?us-ascii?Q?6yYusQ1aaVMyDfVRwT5CnBzU+uN9ypj/CnsrGE1jqR5eYKqHzfDfbaLJFEOf?= 
=?us-ascii?Q?uAuqOIn9wrj+WQjh6komy8Ip3PcRu+uEW5djWzLVYySv/R7qqQ2tJ4OPxDUD?= =?us-ascii?Q?de6WLgIQnb/ouTW1U4BCaYQb65+JoNmUMfuF6oqsc5TNS0UKmte7IsL96mHb?= =?us-ascii?Q?GZb+xbvNRIwOJnZT9clkauxKG4SVCsaprVTkt8lVU+A7vPw=3D?= x-ms-exchange-transport-forked: True Content-Type: text/plain; charset="us-ascii" Content-Transfer-Encoding: quoted-printable MIME-Version: 1.0 X-OriginatorOrg: uoguelph.ca X-MS-Exchange-CrossTenant-AuthAs: Internal X-MS-Exchange-CrossTenant-AuthSource: YQXPR0101MB0968.CANPRD01.PROD.OUTLOOK.COM X-MS-Exchange-CrossTenant-Network-Message-Id: 67a3d275-e659-40db-0593-08d8a0483d10 X-MS-Exchange-CrossTenant-originalarrivaltime: 14 Dec 2020 15:52:19.4695 (UTC) X-MS-Exchange-CrossTenant-fromentityheader: Hosted X-MS-Exchange-CrossTenant-id: be62a12b-2cad-49a1-a5fa-85f4f3156a7d X-MS-Exchange-CrossTenant-mailboxtype: HOSTED X-MS-Exchange-CrossTenant-userprincipalname: sAfs0f+hHzHe7yRjUK8w/lD6825cpI8q3HUOxjmBUQtPHN20pDhgQM0c8WXgo+oq7NZY2gIF5R5a4mIH0Bzxeg== X-MS-Exchange-Transport-CrossTenantHeadersStamped: YQXPR01MB2406 X-Rspamd-Queue-Id: 4Cvm9d03Myz3GpN X-Spamd-Bar: ------ Authentication-Results: mx1.freebsd.org; dkim=pass header.d=uoguelph.ca header.s=selector1 header.b=bDWWLWWN; arc=pass (microsoft.com:s=arcselector9901:i=1); dmarc=pass (policy=none) header.from=uoguelph.ca; spf=pass (mx1.freebsd.org: domain of rmacklem@uoguelph.ca designates 40.107.66.81 as permitted sender) smtp.mailfrom=rmacklem@uoguelph.ca X-Spamd-Result: default: False [-6.00 / 15.00]; TO_DN_EQ_ADDR_SOME(0.00)[]; TO_DN_SOME(0.00)[]; R_SPF_ALLOW(-0.20)[+ip4:40.107.0.0/16]; RCVD_COUNT_THREE(0.00)[3]; DKIM_TRACE(0.00)[uoguelph.ca:+]; DMARC_POLICY_ALLOW(-0.50)[uoguelph.ca,none]; NEURAL_HAM_SHORT(-1.00)[-1.000]; FREEMAIL_TO(0.00)[gmail.com]; FROM_EQ_ENVFROM(0.00)[]; RCVD_TLS_LAST(0.00)[]; RBL_DBL_DONT_QUERY_IPS(0.00)[40.107.66.81:from]; ARC_ALLOW(-1.00)[microsoft.com:s=arcselector9901:i=1]; MIME_TRACE(0.00)[0:+]; ASN(0.00)[asn:8075, ipnet:40.104.0.0/14, country:US]; 
NEURAL_HAM_MEDIUM(-1.00)[-1.000]; R_DKIM_ALLOW(-0.20)[uoguelph.ca:s=selector1]; FREEFALL_USER(0.00)[rmacklem]; FROM_HAS_DN(0.00)[]; RCPT_COUNT_THREE(0.00)[3]; NEURAL_HAM_LONG(-1.00)[-1.000]; TAGGED_RCPT(0.00)[]; MIME_GOOD(-0.10)[text/plain]; DWL_DNSWL_LOW(-1.00)[uoguelph.ca:dkim]; SPAMHAUS_ZRD(0.00)[40.107.66.81:from:127.0.2.255]; TO_MATCH_ENVRCPT_SOME(0.00)[]; RCVD_IN_DNSWL_NONE(0.00)[40.107.66.81:from]; RWL_MAILSPIKE_POSSIBLE(0.00)[40.107.66.81:from]; MAILMAN_DEST(0.00)[freebsd-fs] X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.34 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Mon, 14 Dec 2020 15:52:22 -0000

Hope you don't mind a top post...

It's interesting that the leak of opens does not correlate well to load, but it doesn't give me any insight into what might be causing the leak.
--> Is there something that is always running that accesses files in the nullfs mount?

If you can set up a system with no jobs running...
--> Watch it to see if there is a leak happening.
--> Try the two cases of creating a bunch of files and opening a bunch of files that already exist, and then closing the files for both cases.
    Both file creation and opening existing files use the NFSv4 Open operation, but follow quite different code paths through the VFS.
--> If the leak occurs on one but not the other, it would narrow things down.

Good luck with it, rick

________________________________________
From: J David
Sent: Monday, December 14, 2020 10:21 AM
To: Konstantin Belousov
Cc: Rick Macklem; freebsd-fs@freebsd.org
Subject: Re: Major issues with nfsv4

CAUTION: This email originated from outside of the University of Guelph. Do not click links or open attachments unless you recognize the sender and know the content is safe.
If in doubt, forward suspicious emails to IThelp@uoguelph.ca

TLDR: The values of OpenOwner and Opens have a statistically significant correlation to the passage of time and are statistically independent of the number of currently running jobs (jails), processes, or threads.

3,173 samples were collected over approximately twelve hours, containing the following values (five-number summary in parentheses: min 1Q median 3Q max):

- nfsstat -E -c OpenOwner (137 1405 2380 3541 4693)
- nfsstat -E -c Opens (49 10479 18229 27732 36589)
- # of active Jobs (1 50 50 50 51)
- # of Job processes (1 117 117 117 121)
- # of Job threads (1 519 521 525 533)
- # of nfscl Threads (48 53 53 53 55)
- Total # of processes on system (149 260 261 264 280)
- Total # of threads on system (481 996 1001 1005 1023)

OpenOwner and Opens are the dependent variables. The remaining values and the sample sequence number (N) are independent variables.

The following table shows the adjusted R-squared values of linear regressions using each combination of the independent and dependent variables. While R-squared is not always the best measure of goodness of fit, it is easy to understand, and given the type of data and the relationship sought, its use here is both accurate and illustrative.

               OpenOwner   Opens
N                 0.9369  0.9310
NTestEnd*         0.9962  0.9979
Jobs              0.2461  0.0324
JobProcs          0.0225  0.0285
JobThreads        0.0921  0.1060
NfsclThreads      0.0072  0.0000
SysProcs          0.0325  0.0376
SysThreads        0.1003  0.1145

*Because the test ended at sample 3156, NTestEnd reflects the regressions of OpenOwner and Opens vs. sample sequence number for samples 1 through 3156 only.

The results strongly indicate that both OpenOwner and Opens are highly correlated with time. No other regression demonstrates a statistically significant correlation.

Opens and OpenOwner are also highly correlated with each other (adjusted R-squared = 0.9957).
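
For anyone who wants to run the same kind of check on their own samples, here is a minimal sketch of an adjusted R-squared calculation for a one-variable linear regression. It is an illustration only: the function name is hypothetical, and the numbers in the usage lines are made up, not the measurements from this test or the tooling used to analyze them.

```python
def adjusted_r_squared(x, y):
    """Adjusted R-squared of the least-squares line y = a + b*x.

    x and y are equal-length sequences of numbers, e.g. sample
    sequence numbers and the matching Opens counts.
    """
    n = len(x)
    if n != len(y) or n < 3:
        raise ValueError("need two equal-length series with at least 3 points")

    mean_x = sum(x) / n
    mean_y = sum(y) / n

    # Sums of squares and cross-products about the means.
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    syy = sum((yi - mean_y) ** 2 for yi in y)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    if sxx == 0 or syy == 0:
        raise ValueError("a constant series has no defined R-squared")

    # With a single predictor, R-squared is the squared correlation.
    r2 = (sxy * sxy) / (sxx * syy)

    # Penalize for the number of predictors (p = 1 here).
    p = 1
    return 1.0 - (1.0 - r2) * (n - 1) / (n - p - 1)


# Illustrative data only (not from the test described above).
perfect = adjusted_r_squared([1, 2, 3, 4], [2, 4, 6, 8])
noisy = adjusted_r_squared([1, 2, 3, 4, 5], [1, 3, 2, 5, 4])
```

A value near 1 for the sample-number regression and near 0 for the process-count regressions is the pattern the table above reports.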

The high correlation and strong linear relationship with time suggest this is caused by something that is both roughly constant over time and largely independent of system activity measures based on process counts.

It may be worth re-doing this test, capturing the rest of the "nfsstat -E -c" stats about operations as well as counts of open files. Finding a strong correlation might help narrow down the causal action, which would hopefully make it possible to independently reproduce and/or fix this.

A couple of questions around that:

1) Is there a way to get the total number of currently-open files more efficiently than enumerating them? (E.g., "fstat | wc -l" and "fstat -m | wc -l" are slow and resource-intensive.)
2) If so, is there a way to do that on a per-process basis?

Thanks!

From owner-freebsd-fs@freebsd.org Thu Dec 17 04:25:36 2020 Return-Path: Delivered-To: freebsd-fs@mailman.nyi.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2610:1c1:1:606c::19:1]) by mailman.nyi.freebsd.org (Postfix) with ESMTP id 9A2164A9A69 for ; Thu, 17 Dec 2020 04:25:36 +0000 (UTC) (envelope-from rmacklem@uoguelph.ca) Received: from CAN01-QB1-obe.outbound.protection.outlook.com (mail-eopbgr660053.outbound.protection.outlook.com [40.107.66.53]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (Client CN "mail.protection.outlook.com", Issuer "GlobalSign Organization Validation CA - SHA256 - G3" (verified OK)) by mx1.freebsd.org (Postfix) with ESMTPS id 4CxJnq50KSz4r2W for ; Thu, 17 Dec 2020 04:25:35 +0000 (UTC) (envelope-from rmacklem@uoguelph.ca) ARC-Seal: i=1; a=rsa-sha256; s=arcselector9901; d=microsoft.com; cv=none;
b=M68JrNtXDQaaijz2tp4d0RbivONQx7NYkT92KxNx6W4F0wWaQKEjOtasTIQMM1pdnAdAQIRcKrKl+iLleKMJ3sLgUCMC0uOX4V33Onp5QF2GQx/oLreQDJrYSeUtd3tPqh3H8bCR0rbcePqdj7a3U5mSW3J1pFUpcnnTzB6koxJiDcXVR/bOmE/SlfykQf80QI3X4iI3tQoidtJr5+VFA24q03LikmwPhd12TONqOTjMS22EIJH/UWjDXRNCRXErIenoqGzUJ7HppFB3D5UeOxxt3z9LCjqjAHcrMXbJWTqwskCXofjoTGlpxktyR5PoW3hpOYYOc40ZCnKul4MqZw== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=microsoft.com; s=arcselector9901; h=From:Date:Subject:Message-ID:Content-Type:MIME-Version:X-MS-Exchange-SenderADCheck; bh=jM2BjWWXNro0rmwnYT3nyo+z80BiyPMoW09x3G3KLFI=; b=cFXPfWEqNuV6MCu969w7gx4kFr89K1ZxjqCk3rsvDkP6yzcFfl94/ieB6yxwzkk9Pk39Iy3k/c2e+WYZ9bDEZAhbLSCkHsT5xGmKLqELwOuwwViDpN6yBARbq9X+x2MDeNHr4cNAiqgWs12WbFZVm2Oml1B6bUb0/5NjdxBMpbbvZ0bjwu0mPTq/SuVyZqVdUsK46xZ2pChCis6TWu3ioXO9DQ40KvzLyVfPBx0cUjG8giWH8jp40tuLVLnaGEIEIFof/eq2vPW1AYKEuNF+DaVEWFxeuKW5UiuzC9ptukx234th2pd0VLA0ZgpyqrOlr9PmjEfeIifFhn3z3YUz4A== ARC-Authentication-Results: i=1; mx.microsoft.com 1; spf=pass smtp.mailfrom=uoguelph.ca; dmarc=pass action=none header.from=uoguelph.ca; dkim=pass header.d=uoguelph.ca; arc=none DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=uoguelph.ca; s=selector1; h=From:Date:Subject:Message-ID:Content-Type:MIME-Version:X-MS-Exchange-SenderADCheck; bh=jM2BjWWXNro0rmwnYT3nyo+z80BiyPMoW09x3G3KLFI=; b=YBed2JwbvHOc3wC4ob9runx9uBM9GEF52rkrHES+N44ISolgqvT4vu1/MYzS7pYICto+qpc+kwp3539iEtBNNLsXiu4m5N+ANddzWG92fODlmb5ttc6QCktZMHpzktkDfWSIvKMut6Gn0ldUo4XmFeAg59UGJB6oYslsdSaK+IgOespf+lMMeaUGDxVNr8dZNOFkb9YrYhrqiZ68bmPfkIP7Fztmr2/SK3Mj5Zs7O4+pEdeuoY0lWofO2jE2Vmkajp3oGoojoz1Mj3RfwD9ptqWJwoukEQ+/6CMnE0ljsqQr8uJihR5D+gpJ8mfcrUu7DleWijw48Uoum4x7YKf8ww== Received: from YQXPR0101MB0968.CANPRD01.PROD.OUTLOOK.COM (2603:10b6:c00:19::29) by QB1PR01MB2435.CANPRD01.PROD.OUTLOOK.COM (2603:10b6:c00:38::23) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id 15.20.3654.17; Thu, 17 Dec 2020 04:25:33 +0000 Received: from 
YQXPR0101MB0968.CANPRD01.PROD.OUTLOOK.COM ([fe80::7d6b:aa68:78f4:5d94]) by YQXPR0101MB0968.CANPRD01.PROD.OUTLOOK.COM ([fe80::7d6b:aa68:78f4:5d94%7]) with mapi id 15.20.3654.025; Thu, 17 Dec 2020 04:25:33 +0000 From: Rick Macklem To: J David , Konstantin Belousov CC: "freebsd-fs@freebsd.org" Subject: Re: Major issues with nfsv4 Thread-Topic: Major issues with nfsv4 Thread-Index: AQHWzw/HDat+dHoH9kKG5K3Xpd53kqnxDteQgAFi0QCAABTa84AALLCAgAAVvcmAAAu0AIAA4wiAgAAI5gCAAO/IAIAAiDrmgABWOgCAAGv4AIAEv9nS Date: Thu, 17 Dec 2020 04:25:33 +0000 Message-ID: References: , In-Reply-To: Accept-Language: en-US Content-Language: en-US X-MS-Has-Attach: X-MS-TNEF-Correlator: x-ms-publictraffictype: Email x-ms-office365-filtering-correlation-id: eadf8492-690b-486a-25cc-08d8a243cbaf x-ms-traffictypediagnostic: QB1PR01MB2435: x-microsoft-antispam-prvs: x-ms-oob-tlc-oobclassifiers: OLM:10000; x-ms-exchange-senderadcheck: 1 x-microsoft-antispam: BCL:0; x-microsoft-antispam-message-info: g7S/GniSXeEOW3tRub3B0L/7SrQxBNsnPyEY+f9Dc8M4mNDsIn0PNi//8bIFHv4YMJidK2E0tgHdGX57xdBKoRsT3H2U9NgWIoTHRO82AjMour+z8ORQVxtWo+sp73xw+foYa9rrP9J9QrsaDe0fhfHVVCZY/SUkpP6e8Mt2HmoYCvOyajUEJf3EStgUt/XGD/+Zt9bSwmKcTgrayCeb7nzXipAzhk7+IbgOfh9F0CFlTN6+TgJU5Wpd3FN0KhBcP9OKwCRP2AYfXXZz7faqGkyrg9ZRsdrVmSmqtQvkF/YzUQSb6PZdKMUYQ9bCeW1egob7eQ3/TokC28v7hinXzA== x-forefront-antispam-report: CIP:255.255.255.255; CTRY:; LANG:en; SCL:1; SRV:; IPV:NLI; SFV:NSPM; H:YQXPR0101MB0968.CANPRD01.PROD.OUTLOOK.COM; PTR:; CAT:NONE; SFS:(346002)(396003)(136003)(366004)(39860400002)(376002)(66946007)(2906002)(76116006)(66556008)(64756008)(66446008)(316002)(66476007)(478600001)(6506007)(5660300002)(33656002)(786003)(110136005)(52536014)(91956017)(7696005)(86362001)(8936002)(71200400001)(83380400001)(4326008)(8676002)(55016002)(186003)(53546011)(9686003); DIR:OUT; SFP:1101; x-ms-exchange-antispam-messagedata: =?us-ascii?Q?KYxiVleSAgx1N0ROabroPrMm27CvOB+XRXyIb3OAWZi80CEU45nJx5stzh6C?= 
=?us-ascii?Q?FpJsVzt8IscK7eRE/xVk00soj2YrMs5kmCcyPIKnK3QCkatUCcYvMseILm6o?= =?us-ascii?Q?tILSbB9xWT/ixMhVANW2MeiHq7te/J5qxMlUTHiuC+TbiU4BW6V/MNSHC4F7?= =?us-ascii?Q?aiakRFhkKBFUYNxZp3dQ4SxMnx8cXJsZbPqS38t6LuZxnWEcP0AeGkLZ47Tz?= =?us-ascii?Q?tBVFGn/X2ZdlNcGjWSF3Eu9Mov3ma2axQ6gyetzt7S6UQR0c6nobJUEyunSQ?= =?us-ascii?Q?XLPqCBgdTTjBkxqpkVC6HAum5ZdnSaP/SkaVGBeWQLAK45MHF5l8ydBVVx+W?= =?us-ascii?Q?uv88wFCZeWUuGh4oMpV61xnSsDfVG30OfA5jWktS7azTdu9hQEaqSL29f+Qm?= =?us-ascii?Q?Ii6ijWPRr3dpzCoWjYk8ySUdwTVbD7rB8h3vLhQzfFfau6Ra/KSSbHOR1FFQ?= =?us-ascii?Q?d5gzgs47d4QSveFQTe8InlI4ed3nr3MX0xzj64JXAlCcvKF13BnwgqF0omq9?= =?us-ascii?Q?7optcCRyc7cuC+JJ3I+tmL9UJqs0DX6xeZpr3oQ5eHtygBnyZKVClgkFL5FI?= =?us-ascii?Q?89+YjjwuOlqCXoEOLo6MksnAOWwDPb2ZE1lq23+qpqP3kDg3g9FAstQ8qPFu?= =?us-ascii?Q?NZIhn7TsMjumoy1XIqT5snGjerGsLoIPsUauSKKSSi6Uyfdaeojc3iDzEr8v?= =?us-ascii?Q?KHylBbVESQHfv4hiGnSy92kvo97mELPG1BQhjgGH5cpPk6eUEagwcQdI3aBr?= =?us-ascii?Q?2ndvjtstnJN2BTMU2/mKCzFT21alG9RNLKWYzHg5xojitB7mKDlyOo4HoKVA?= =?us-ascii?Q?S9iAwl7ESXovU2IgvIA1rbWD1wR6rzQ9Jft45rDTh7d3aHVuZD9muRkvZkmS?= =?us-ascii?Q?kYuDMyzf9Znxu03IF4Quq8mz8vKEBizOWHEdhDdEDFpq7DsVDcj+RLp65IgZ?= =?us-ascii?Q?3H/C8QT6fufK1Wa72PDCwgFxkkX3XNgObY6YNR42e8Z5Mb5Tm78e6c/gdbV9?= =?us-ascii?Q?gh0Uw003EV/2rUTauqgONh1QYfQld/DmJSLmhZ5GBfha1+k=3D?= x-ms-exchange-transport-forked: True Content-Type: text/plain; charset="us-ascii" Content-Transfer-Encoding: quoted-printable MIME-Version: 1.0 X-OriginatorOrg: uoguelph.ca X-MS-Exchange-CrossTenant-AuthAs: Internal X-MS-Exchange-CrossTenant-AuthSource: YQXPR0101MB0968.CANPRD01.PROD.OUTLOOK.COM X-MS-Exchange-CrossTenant-Network-Message-Id: eadf8492-690b-486a-25cc-08d8a243cbaf X-MS-Exchange-CrossTenant-originalarrivaltime: 17 Dec 2020 04:25:33.5847 (UTC) X-MS-Exchange-CrossTenant-fromentityheader: Hosted X-MS-Exchange-CrossTenant-id: be62a12b-2cad-49a1-a5fa-85f4f3156a7d X-MS-Exchange-CrossTenant-mailboxtype: HOSTED X-MS-Exchange-CrossTenant-userprincipalname: 
3It8nvNOYIn5Xv9XEHpv5GY8A1HIGEmJMHSZJwoEbwngV062dR1BvYgyoCiaGagGLo0W7P8SqlIo8SMA955Yzg== X-MS-Exchange-Transport-CrossTenantHeadersStamped: QB1PR01MB2435 X-Rspamd-Queue-Id: 4CxJnq50KSz4r2W X-Spamd-Bar: ------ Authentication-Results: mx1.freebsd.org; dkim=pass header.d=uoguelph.ca header.s=selector1 header.b=YBed2Jwb; arc=pass (microsoft.com:s=arcselector9901:i=1); dmarc=pass (policy=none) header.from=uoguelph.ca; spf=pass (mx1.freebsd.org: domain of rmacklem@uoguelph.ca designates 40.107.66.53 as permitted sender) smtp.mailfrom=rmacklem@uoguelph.ca X-Spamd-Result: default: False [-6.00 / 15.00]; TO_DN_EQ_ADDR_SOME(0.00)[]; TO_DN_SOME(0.00)[]; R_SPF_ALLOW(-0.20)[+ip4:40.107.0.0/16]; RCVD_COUNT_THREE(0.00)[3]; DKIM_TRACE(0.00)[uoguelph.ca:+]; DMARC_POLICY_ALLOW(-0.50)[uoguelph.ca,none]; NEURAL_HAM_SHORT(-1.00)[-1.000]; FREEMAIL_TO(0.00)[gmail.com]; FROM_EQ_ENVFROM(0.00)[]; RCVD_TLS_LAST(0.00)[]; RBL_DBL_DONT_QUERY_IPS(0.00)[40.107.66.53:from]; ARC_ALLOW(-1.00)[microsoft.com:s=arcselector9901:i=1]; MIME_TRACE(0.00)[0:+]; ASN(0.00)[asn:8075, ipnet:40.104.0.0/14, country:US]; NEURAL_HAM_MEDIUM(-1.00)[-1.000]; R_DKIM_ALLOW(-0.20)[uoguelph.ca:s=selector1]; FREEFALL_USER(0.00)[rmacklem]; FROM_HAS_DN(0.00)[]; RCPT_COUNT_THREE(0.00)[3]; NEURAL_HAM_LONG(-1.00)[-1.000]; TAGGED_RCPT(0.00)[]; MIME_GOOD(-0.10)[text/plain]; SPAMHAUS_ZRD(0.00)[40.107.66.53:from:127.0.2.255]; DWL_DNSWL_LOW(-1.00)[uoguelph.ca:dkim]; TO_MATCH_ENVRCPT_SOME(0.00)[]; RCVD_IN_DNSWL_NONE(0.00)[40.107.66.53:from]; RWL_MAILSPIKE_POSSIBLE(0.00)[40.107.66.53:from]; MAILMAN_DEST(0.00)[freebsd-fs] X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.34 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 17 Dec 2020 04:25:36 -0000 If you can do so when the "Opens" count has gone fairly high, please "sysctl vfs.deferred_inact" and let us know what that returns. 
rick

________________________________________
From: J David
Sent: Sunday, December 13, 2020 10:51 PM
To: Konstantin Belousov
Cc: Rick Macklem; freebsd-fs@freebsd.org
Subject: Re: Major issues with nfsv4

CAUTION: This email originated from outside of the University of Guelph. Do not click links or open attachments unless you recognize the sender and know the content is safe. If in doubt, forward suspicious emails to IThelp@uoguelph.ca

On Sun, Dec 13, 2020 at 4:25 PM Konstantin Belousov wrote:
> Nullfs with -o nocache (default for NFS mounts) should not cache vnodes.
> So it is more likely a local load that has 130k files open. Of course,
> it is the OP who can answer the question.

This I can rule out; there is no visible correlation between "Opens" and the number of files open on the system.

Just finishing a test right now, and:

$ sudo nfsstat -E -c | fgrep -A1 OpenOwner
OpenOwner     Opens LockOwner     Locks    Delegs  LocalOwn
     4678     36245        15         6         0         0
$ sudo fstat | wc -l
2698
$ ps Haxlww | wc -l
1012

The value of Opens increases consistently over time.

Killing the processes causing this behavior *did not* reduce the number of OpenOwner or Opens.

Unmounting the nullfs mounts (after the processes were gone) *did*:

$ sudo nfsstat -E -c | fgrep -A1 OpenOwner
OpenOwner     Opens LockOwner     Locks    Delegs  LocalOwn
      130        41         0         0         0         0

Mutex contention was observed this time, but once it was apparent that "Opens" was increasing over time, I didn't let the test get to the point of disrupting activities. This test ended at Opens = 36589, which is well short of the previous 130,000+. It is possible that mutex contention becomes an issue once system CPU resources are exhausted.

More about the results of the latest test after the data is analyzed. After that's done, I'll attempt Rick's patch. In the long run, we would definitely like to get delegation to work. Baby steps!

Thanks!
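
The before/after comparison of OpenOwner and Opens shown above could be scripted so the counters can be sampled repeatedly. The following is a minimal sketch, assuming the two-line header/value layout that "nfsstat -E -c" prints in the output above; the function names are hypothetical helpers, not part of nfsstat, and the layout may differ between FreeBSD versions.

```python
import subprocess


def parse_open_counts(nfsstat_text):
    """Return {column name: int} for the row that follows the OpenOwner header.

    Assumes the header line contains the word "OpenOwner" and that the
    very next line holds one integer per column, as in the output above.
    """
    lines = nfsstat_text.splitlines()
    for i, line in enumerate(lines[:-1]):
        names = line.split()
        if "OpenOwner" in names:
            values = lines[i + 1].split()
            return {name: int(value) for name, value in zip(names, values)}
    raise ValueError("no OpenOwner header line found")


def read_open_counts():
    """Run nfsstat and parse it (requires a FreeBSD host, possibly root)."""
    result = subprocess.run(
        ["nfsstat", "-E", "-c"], capture_output=True, text=True, check=True
    )
    return parse_open_counts(result.stdout)


# Parsing the sample output quoted above; nfsstat itself is not run here.
sample = (
    "OpenOwner     Opens LockOwner     Locks    Delegs  LocalOwn\n"
    "     4678     36245        15         6         0         0\n"
)
counts = parse_open_counts(sample)
```

Calling read_open_counts() before and after a step (killing the jobs, unmounting the nullfs mounts) and comparing the "Opens" values would give the same comparison as the manual commands.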