From: Rick Macklem
To: Ruben, "freebsd-fs@freebsd.org"
CC: "rmacklem@FreeBSD.org"
Subject: Re: Linux NFSv4 clients: bad sequence-id errors.
Date: Tue, 27 Feb 2018 22:54:01 +0000

Ruben wrote:
>I'm experiencing a strange issue on a machine providing a couple of
>nfsv4 exports.
>A Linux client that generates a lot of traffic to and
>from the nfs server sometimes starts throwing "bad sequence-id errors":
>
>Feb 27 10:39:42 localhost kernel: [12481477.608103] NFS: v4 server
>returned a bad sequence-id error on an unconfirmed sequence 80f7d0d0!

The handling of sequence-ids in NFSv4.0 is complex, and I won't even try to
guess why this is happening.

I am surprised that your Linux mounts are using NFSv4.0 and not NFSv4.1.
(Usually Linux uses the most recent version supported by the server.)
I mention this because "sessions" replaced the sequence-id stuff in NFSv4.1
and, as such, NFSv4.1 mounts shouldn't have this issue.

>They typically occur after a couple of months of uptime on the nfsd
>machine. Every couple of seconds they are thrown by the client. The
>situation is "remedied" by restarting the nfsd on the server. Although
>functionality on the specific client does not appear to be affected
>(much?), it's a bit disturbing. I've done some digging and found :

The fact that this is fixed by restarting the nfsd suggests a client side
problem. Why?
Because restarting the nfsd does not reset any server state, so the
sequence-id situation would not be affected by doing this. (To get rid of
server side state, you must unload the nfsd.ko after killing off the nfsd
daemon.)
All restarting the nfsd daemon will do is force the client to establish a new
TCP connection. That is at a layer below the NFS state.

>https://lists.freebsd.org/pipermail/freebsd-fs/2015-July/021707.html
>
>and the patch attached by Rick ( nfsv41exch.patch :
>http://lists.freebsd.org/pipermail/freebsd-fs/attachments/20150729/586f776a/attachment.bin
>) .
>
>Since the issue started manifesting itself I have restarted the nfs
>daemon (grabbed a pcap and the corresponding error lines mentioning the
>sequences prior to doing that in case anyone is interested).

If you email me the pcap as an attachment, I can take a look at it in
wireshark.
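
One way to check which NFS version the Linux mounts actually negotiated is to
look at the vers= option the client reports in /proc/mounts. The following is
only a minimal Python sketch of that check, not something from the thread: the
function name, the sample mount line, the server address 192.168.9.1 and the
mount point /mnt/data are hypothetical, and it assumes the client reports either
"vers=4.0"/"vers=4.1" or the older "vers=4,minorversion=1" form.

```python
def nfs_mount_versions(mounts_text):
    """Return {mount_point: version} for NFS entries in /proc/mounts-style text."""
    versions = {}
    for line in mounts_text.splitlines():
        fields = line.split()
        # Expected fields: device, mount point, fs type, options, dump, pass.
        if len(fields) < 4 or fields[2] not in ("nfs", "nfs4"):
            continue
        # Turn "rw,vers=4.0,proto=tcp" into {"rw": "", "vers": "4.0", ...}.
        opts = dict(
            opt.split("=", 1) if "=" in opt else (opt, "")
            for opt in fields[3].split(",")
        )
        vers = opts.get("vers") or opts.get("nfsvers") or "unknown"
        # Assumption: some older kernels report the minor version separately.
        if vers == "4" and "minorversion" in opts:
            vers = "4." + opts["minorversion"]
        versions[fields[1]] = vers
    return versions


# Hypothetical sample; on a real client, pass the contents of /proc/mounts.
SAMPLE = (
    "192.168.9.1:/data/Sabnzb2015 /mnt/data nfs4 "
    "rw,relatime,vers=4.0,rsize=65536,wsize=65536,proto=tcp 0 0\n"
    "/dev/sda1 / ext4 rw,relatime 0 0\n"
)

for mount_point, vers in nfs_mount_versions(SAMPLE).items():
    print(mount_point, "uses NFS version", vers)
```

A mount that reports 4.0 here is using the sequence-id mechanism discussed
above; one that reports 4.1 is using sessions instead.
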
>The nfs server runs FreeBSD 11.1 :

I'm being lazy and not looking, but I am almost sure a 2015 patch will be in
11.1 and probably also in 10.2 and 11.0.

>freebsd-version -uk
>11.1-RELEASE-p1
>11.1-RELEASE-p1
>
>but I have seen it on 10.2 and 11.0 as well. The linux client is (/has
>been) running a version of Debian.
>
>The export lines in /etc/exports :
>
>V4: / -network=192.168.9.0 -mask=255.255.255.0
>
>/data/Sabnzb2015 -maproot=root: -alldirs -network=192.168.9.0
>-mask=255.255.255.0
>
>Uptime:
>
>8:27PM up 196 days, 22:10, 1 users, load averages: 0.21, 0.17, 0.17
>
>Traffic since uptime (guessing NFS / non-NFS ratio of 3 to 1)
>
> lagg0 in     400.901 KB/s     400.901 KB/s     10.425 TB
>       out     32.781 KB/s      32.781 KB/s     14.132 TB
>
>
>I'm wondering: can the 2015 patch provided by Rick still be "safely"
>applied or has the nfs code changed too much since then? I've witnessed
>this issue a couple of times now and would very much like to test the
>patch provided.

As above, I'd be surprised if the patch isn't already in your 11.1 kernel,
but you can take a look. If it isn't, let me know because that means it
slipped through the cracks and I need to get it committed, etc.

rick