Date: Mon, 14 Dec 2020 10:21:27 -0500
From: J David <j.david.lists@gmail.com>
To: Konstantin Belousov <kostikbel@gmail.com>
Cc: Rick Macklem <rmacklem@uoguelph.ca>, "freebsd-fs@freebsd.org" <freebsd-fs@freebsd.org>
Subject: Re: Major issues with nfsv4
Message-ID: <CABXB=RSpfiU3R1JuLU_DE60SARs0rkPVROPLewJFjBwMXRnbSw@mail.gmail.com>
In-Reply-To: <CABXB=RTn9NC3PE-QyNLmaKUvAWtYtdN_39Nks5i05_VxWpbhRw@mail.gmail.com>
References: <YQXPR0101MB096849ADF24051F7479E565CDDCA0@YQXPR0101MB0968.CANPRD01.PROD.OUTLOOK.COM>
 <CABXB=RSyN%2Bo2yXcpmYw8sCSUUDhN-w28Vu9v_cCWa-2=pLZmHg@mail.gmail.com>
 <YQXPR0101MB09680D155B6D685442B5E25EDDCA0@YQXPR0101MB0968.CANPRD01.PROD.OUTLOOK.COM>
 <CABXB=RSSE=yOwgOXsnbEYPqiWk5K5NfzLY=D%2BN9mXdVn%2B--qLQ@mail.gmail.com>
 <YQXPR0101MB0968B17010B3B36C8C41FDE1DDC90@YQXPR0101MB0968.CANPRD01.PROD.OUTLOOK.COM>
 <X9Q9GAhNHbXGbKy7@kib.kiev.ua>
 <YQXPR0101MB0968C7629D57CA21319E50C2DDC90@YQXPR0101MB0968.CANPRD01.PROD.OUTLOOK.COM>
 <X9UDArKjUqJVS035@kib.kiev.ua>
 <CABXB=RRNnW9nNhFCJS1evNUTEX9LNnzyf2gOmZHHGkzAoQxbPw@mail.gmail.com>
 <YQXPR0101MB0968B120A417AF69CEBB6A12DDC80@YQXPR0101MB0968.CANPRD01.PROD.OUTLOOK.COM>
 <X9aGwshgh7Cwiv8p@kib.kiev.ua>
 <CABXB=RTFSAEZvp%2BmoiF%2BrE9vpEjJVacLYa6G=yP641f9oHJ1zw@mail.gmail.com>
 <CABXB=RTn9NC3PE-QyNLmaKUvAWtYtdN_39Nks5i05_VxWpbhRw@mail.gmail.com>

TLDR: The values of OpenOwner and Opens have a statistically significant correlation with the passage of time and are statistically independent of the number of currently running jobs (jails), processes, or threads.

3,173 samples were collected over approximately twelve hours, containing the following values (five-number summary in parentheses: min 1Q median 3Q max):

- nfsstat -E -c OpenOwner (137 1405 2380 3541 4693)
- nfsstat -E -c Opens (49 10479 18229 27732 36589)
- # of active Jobs (1 50 50 50 51)
- # of Job processes (1 117 117 117 121)
- # of Job threads (1 519 521 525 533)
- # of nfscl Threads (48 53 53 53 55)
- Total # of processes on system (149 260 261 264 280)
- Total # of threads on system (481 996 1001 1005 1023)

OpenOwner and Opens are the dependent variables. The remaining values and the sample sequence number (N) are independent variables.

The following table shows the adjusted R-squared values of linear regressions using each combination of the independent and dependent variables. While R-squared is not always the best measure of goodness of fit, it is easy to understand, and given the type of data and the relationship sought, its use here is both accurate and illustrative.

                OpenOwner   Opens
N               0.9369      0.9310
NTestEnd*       0.9962      0.9979
Jobs            0.2461      0.0324
JobProcs        0.0225      0.0285
JobThreads      0.0921      0.1060
NfsclThreads    0.0072      0.0000
SysProcs        0.0325      0.0376
SysThreads      0.1003      0.1145

*Because the test ended at sample 3156, NTestEnd reflects the regressions of OpenOwner and Opens vs. sample sequence number for samples 1 through 3156 only.

The results strongly indicate that both OpenOwner and Opens are highly correlated with time. No other regression demonstrates a statistically significant correlation.

Opens and OpenOwner are also highly correlated with each other (adjusted R-squared = 0.9957).
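
For anyone wanting to reproduce figures of this kind, here is a minimal sketch of how an adjusted R-squared for a one-predictor linear regression can be computed. It is not the tool used for the numbers above; the function name and the small data set are hypothetical and only illustrate the arithmetic.

```python
def adjusted_r_squared(x, y, predictors=1):
    """Adjusted R-squared of a least-squares line y = a + b*x.

    Hypothetical helper for illustration; for a single predictor with an
    intercept, R-squared equals the squared Pearson correlation.
    """
    n = len(x)
    if n != len(y) or n <= predictors + 1:
        raise ValueError("need equal-length series with enough samples")

    mean_x = sum(x) / n
    mean_y = sum(y) / n

    # Sums of squares and cross-products about the means.
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    syy = sum((yi - mean_y) ** 2 for yi in y)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    if sxx == 0 or syy == 0:
        raise ValueError("a constant series has no defined correlation")

    r_squared = (sxy * sxy) / (sxx * syy)

    # Penalize for the number of predictors relative to the sample size.
    return 1 - (1 - r_squared) * (n - 1) / (n - predictors - 1)


if __name__ == "__main__":
    # Made-up example: sample sequence number vs. a counter that mostly grows.
    seq = [1, 2, 3, 4, 5]
    counter = [2, 1, 4, 3, 5]
    print(round(adjusted_r_squared(seq, counter), 4))
```

With thousands of samples and a single predictor, the adjustment is tiny, so the adjusted and unadjusted values are nearly identical.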

The high correlation and strong linear relationship with time suggest this is caused by something that is both roughly constant over time and largely independent of system activity measures based on process counts.

It may be worth re-doing this test, capturing the rest of the "nfsstat -E -c" stats about operations as well as counts of open files. Finding a strong correlation might help narrow down the causal action, which would hopefully make it possible to independently reproduce and/or fix this.

A couple of questions around that:

1) Is there a way to get the total number of currently-open files more efficiently than enumerating them? (E.g., "fstat | wc -l" and "fstat -m | wc -l" are slow and resource-intensive.)

2) If so, is there a way to do that on a per-process basis?

Thanks!
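
If the test is repeated with all of the counters captured, something like the following sketch could turn each snapshot into named values for later regression. It assumes the output consists of a row of counter names followed by a row of matching integer values; that layout is an assumption here, and the sample text in the example is synthetic apart from the two minimum values quoted above. The function name is hypothetical.

```python
def parse_paired_rows(text):
    """Map counter names to integers from name-row / value-row pairs.

    Assumes (not verified against any particular nfsstat version) that a
    line of names is immediately followed by a line holding the same
    number of non-negative integers. Other lines are ignored.
    """
    counters = {}
    rows = [line.split() for line in text.splitlines() if line.strip()]
    i = 0
    while i + 1 < len(rows):
        names, values = rows[i], rows[i + 1]
        looks_like_pair = (
            len(names) == len(values)
            and not any(tok.isdigit() for tok in names)
            and all(tok.isdigit() for tok in values)
        )
        if looks_like_pair:
            counters.update(zip(names, (int(v) for v in values)))
            i += 2  # consume both the name row and the value row
        else:
            i += 1  # section title or unrecognized line; skip it
    return counters


if __name__ == "__main__":
    # Synthetic snapshot for illustration only, not real command output.
    sample = """
    Client Info:
    RPC Counts:
      OpenOwner      Opens  LockOwner
            137         49          0
    """
    print(parse_paired_rows(sample))
```

Each parsed snapshot could then be appended to a per-counter series and regressed against OpenOwner and Opens in the same way as the table above.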