From: J David
Date: Mon, 14 Dec 2020 10:21:27 -0500
Subject: Re: Major issues with nfsv4
To: Konstantin Belousov
Cc: Rick Macklem, "freebsd-fs@freebsd.org"

TLDR: The values of OpenOwner and Opens have a statistically significant correlation to the passage of time and are statistically independent of the number of currently running jobs (jails), processes, or threads.

3,173 samples were collected over approximately twelve hours, containing the following values (five-number summary in parentheses: min 1Q median 3Q max):

- nfsstat -E -c OpenOwner (137 1405 2380 3541 4693)
- nfsstat -E -c Opens (49 10479 18229 27732 36589)
- # of active Jobs (1 50 50 50 51)
- # of Job processes (1 117 117 117 121)
- # of Job threads (1 519 521 525 533)
- # of nfscl Threads (48 53 53 53 55)
- Total # of processes on system (149 260 261 264 280)
- Total # of threads on system (481 996 1001 1005 1023)

OpenOwner and Opens are the dependent variables. The remaining values and the sample sequence number (N) are independent variables.

The following table shows the adjusted R-squared values of linear regressions using each combination of the independent and dependent variables. While R-squared is not always the best measure of goodness of fit, it is easy to understand, and given the type of data and the relationship sought, its use here is both accurate and illustrative.

               OpenOwner   Opens
N              0.9369      0.9310
NTestEnd*      0.9962      0.9979
Jobs           0.2461      0.0324
JobProcs       0.0225      0.0285
JobThreads     0.0921      0.1060
NfsclThreads   0.0072      0.0000
SysProcs       0.0325      0.0376
SysThreads     0.1003      0.1145

*Because the test ended at sample 3156, NTestEnd reflects the regressions of OpenOwner and Opens vs.
sample sequence number for samples 1 - 3156 only.

The results strongly indicate that both OpenOwner and Opens are highly correlated with time. No other regression comes close: the next-highest adjusted R-squared is 0.2461 (Jobs vs. OpenOwner). Opens and OpenOwner are also highly correlated with each other (adjusted R-squared = 0.9957).

The high correlation and strong linear relationship with time suggest this is caused by something that is both roughly constant over time and largely independent of system activity measures based on process counts.

It may be worth re-doing this test, capturing the rest of the "nfsstat -E -c" stats about operations as well as counts of open files. Finding a strong correlation might help narrow down the causal action, which would hopefully make it possible to independently reproduce and/or fix this.

A couple of questions around that:

1) Is there a way to get the total number of currently open files more efficiently than enumerating them? (E.g., "fstat | wc -l" and "fstat -m | wc -l" are slow and resource-intensive.)

2) If so, is there a way to do that on a per-process basis?

Thanks!
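
P.S. For anyone who wants to reproduce the kind of regression summarized above, here is a minimal sketch of how an adjusted R-squared for a single-predictor linear fit can be computed. It assumes the samples are available as two equal-length lists of numbers. The function name adjusted_r_squared and the synthetic series in the usage lines are illustrative assumptions only; they are not the tooling or the data behind the numbers in this message.

```python
def adjusted_r_squared(x, y):
    """Adjusted R-squared of the least-squares line y = a + b*x (one predictor)."""
    n = len(x)
    if n != len(y) or n < 3:
        raise ValueError("need two equal-length series with at least 3 samples")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sums of squares and cross-products around the means.
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    syy = sum((yi - mean_y) ** 2 for yi in y)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    if sxx == 0 or syy == 0:
        # R-squared is undefined for a constant series; reported as 0.0 here.
        return 0.0
    # For simple linear regression, R-squared is the squared Pearson correlation.
    r2 = (sxy * sxy) / (sxx * syy)
    # Adjustment for p = 1 predictor: 1 - (1 - R^2) * (n - 1) / (n - p - 1).
    return 1.0 - (1.0 - r2) * (n - 1) / (n - 2)


# Synthetic illustration only (hypothetical values, not the samples from this thread).
seq = list(range(1, 101))                  # sample sequence number
opens = [10 * i + (i % 7) for i in seq]    # grows steadily with the sequence number
jobs = [50 + (i % 3) for i in seq]         # roughly constant over the run

print("seq  vs opens:", adjusted_r_squared(seq, opens))
print("jobs vs opens:", adjusted_r_squared(jobs, opens))
```

With series shaped like these, the fit against the sequence number comes out close to 1 while the fit against the near-constant job count comes out close to 0, which is the same pattern as in the table. Note that adjusted R-squared can be slightly negative when the predictor explains essentially nothing.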