From: Rick Macklem
To: Marek Salwerowicz, "freebsd-fs@freebsd.org"
Subject: Re: ZFS - NFS server for VMware ESXi issues
Date: Fri, 21 Oct 2016 21:47:28 +0000
References: <930df17b-8db8-121a-a24b-b4909b8162dc@misal.pl>
In-Reply-To: <930df17b-8db8-121a-a24b-b4909b8162dc@misal.pl>

Marek Salwerowicz wrote:
Stuff snipped for brevity...
> Today, after two weeks of working, we experienced the same situation.
> The nfsd service was in following state:
>
>   PID USERNAME  THR PRI NICE   SIZE    RES STATE   C   TIME    WCPU COMMAND
>   984 root      128  20    0 12344K  4020K vq->vq  8 346:27   0.00% nfsd
>
> nfsd service didn't respond to service nfsd restart, but this time
> machine was able to reboot using "# reboot" command.

I am not sure how "top" got a STATE of "vq->vq", but I suspect that refers to
the vdev section of the ZFS code. (The only other place in the kernel where
"vq->vq" shows up is in virtio and I doubt you are using that?)
I'm not a ZFS guy so I can't help, but I'd guess that it's looping around in
the vdev code, possibly competing for the vq->vq_lock?

Hopefully someone with ZFS expertise can help out?

Btw, about the only area of the NFS server that might need tuning is the DRC
and this doesn't suggest that. If you "nfsstat -e -s" on the server and see
large #s for the last line under "Server Cache Stats:" there are tunables
that can be used.

I'd also suggest you capture the output of "ps axHl" on the server when it
happens again, which tells you what all the nfsd threads are up to.

Good luck with it, rick