From: Rick Macklem
To: Daniel Braniss
CC: Richard P Mackerras, "stable@freebsd.org"
Subject: Re: nfs lockd errors after NetApp software upgrade.
Date: Thu, 19 Dec 2019 14:09:37 +0000
List-Id: Production branch of FreeBSD source code

Daniel Braniss wrote:
[stuff snipped]
>all mounts are nfsv3/tcp
This doesn't affect what the NLM code (rpc.lockd) uses.
I honestly don't know when
the NLM uses tcp vs udp. I think rpc.statd still uses IP broadcast at times.

To me, it looks like a network configuration issue.
You could capture packets (maybe when a client first starts rpc.statd and rpc.lockd)
and then look at them in wireshark. I'd disable startup of rpc.lockd and rpc.statd
at boot for a test client and then run something like:
# tcpdump -s 0 -w out.pcap host
- and then start rpc.statd and rpc.lockd
Then I'd look at out.pcap in wireshark (much better at decoding this stuff than
tcpdump). I'd look for things like different reply IP addresses from the Netapp,
which might confuse this tired old NLM protocol Sun devised in the mid-1980s.

>the error is also appearing on freebsd-11.2-stable, I’m now checking if it’s also
>happening on 12.1
>btw, the NetApp version is 9.3P17
Yes. I wasn't the author of the NSM and NLM code (long ago I refused to even
try to implement it, because I knew the protocol was badly broken) and I avoid
fiddling with it.
As such, it won't have changed much since around FreeBSD 7.

rick

cheers,
danny

> rick
>
> Cheers
>
> Richard
> (NetApp admin)
>
> On Wed, 18 Dec 2019 at 15:46, Daniel Braniss wrote:
>
>
>> On 18 Dec 2019, at 16:55, Rick Macklem wrote:
>>
>> Daniel Braniss wrote:
>>
>>> Hi,
>>> The server with the problems is running FreeBSD 11.1 stable, it was working fine for several months,
>>> but after a software upgrade of our NetAPP server it’s reporting many lockd errors and becomes catatonic,
>>> ...
>>> Dec 18 13:11:02 moo-09 kernel: nfs server fr-06:/web/www: lockd not responding
>>> Dec 18 13:11:45 moo-09 last message repeated 7 times
>>> Dec 18 13:12:55 moo-09 last message repeated 8 times
>>> Dec 18 13:13:10 moo-09 kernel: nfs server fr-06:/web/www: lockd is alive again
>>> Dec 18 13:13:10 moo-09 last message repeated 8 times
>>> Dec 18 13:13:29 moo-09 kernel: sonewconn: pcb 0xfffff8004cc051d0: Listen queue overflow: 194 already in queue awaiting acceptance (1 occurrences)
>>> Dec 18 13:14:29 moo-09 kernel: sonewconn: pcb 0xfffff8004cc051d0: Listen queue overflow: 193 already in queue awaiting acceptance (3957 occurrences)
>>> Dec 18 13:15:29 moo-09 kernel: sonewconn: pcb 0xfffff8004cc051d0: Listen queue overflow: 193 already in queue awaiting acceptance …
>> Seems like their software upgrade didn't improve handling of NLM RPCs?
>> Appears to be handling RPCs slowly and/or intermittently. Note that no one
>> tests it with IPv6, so at least make sure you are still using IPv4 for the mounts and
>> try and make sure IP broadcast works between client and Netapp. I think the NLM
>> and NSM (rpc.statd) still use IP broadcast sometimes.
>>
> we are ipv4 - we have our own class c :-)
>> Maybe the network guys can suggest more w.r.t.
>> why, but as I've stated before,
>> the NLM is a fundamentally broken protocol which was never published by Sun,
>> so I suggest you avoid using it if at all possible.
> well, at the moment the ball is in NetApp’s court, and switching to NFSv4 at the moment is out of the question, it’s
> a production server used by several thousand students.
>
>>
>> - If the locks don't need to be seen by other clients, you can just use the "nolockd"
>> mount option.
>> or
>> - If locks need to be seen by other clients, try NFSv4 mounts. Netapp filers
>> should support NFSv4.1, which is a much better protocol than NFSv4.0.
>>
>> Good luck with it, rick
> thanks
> danny
>
>> …
>> any ideas?
>>
>> thanks,
>> danny
>>
>> _______________________________________________
>> freebsd-stable@freebsd.org mailing list
>> https://lists.freebsd.org/mailman/listinfo/freebsd-stable
>> To unsubscribe, send any mail to "freebsd-stable-unsubscribe@freebsd.org"
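
When comparing a test client against the production one (or 11.2 against 12.1), it can help to tally the lockd messages in the syslog rather than eyeball them. The following is a minimal sketch and was not posted in the thread: the name `summarize`, the regular expressions, and the choice to treat "last message repeated N times" as N extra occurrences of the preceding lockd message are all assumptions made for illustration. The sample lines are copied from the log excerpt quoted in this message.

```python
import re
from collections import Counter

# Sample lines copied from the syslog excerpt quoted in this thread.
SAMPLE = """\
Dec 18 13:11:02 moo-09 kernel: nfs server fr-06:/web/www: lockd not responding
Dec 18 13:11:45 moo-09 last message repeated 7 times
Dec 18 13:12:55 moo-09 last message repeated 8 times
Dec 18 13:13:10 moo-09 kernel: nfs server fr-06:/web/www: lockd is alive again
Dec 18 13:13:10 moo-09 last message repeated 8 times
Dec 18 13:13:29 moo-09 kernel: sonewconn: pcb 0xfffff8004cc051d0: Listen queue overflow: 194 already in queue awaiting acceptance (1 occurrences)
Dec 18 13:14:29 moo-09 kernel: sonewconn: pcb 0xfffff8004cc051d0: Listen queue overflow: 193 already in queue awaiting acceptance (3957 occurrences)
"""

# Hypothetical patterns, written to match the lines above and nothing more.
LOCKD = re.compile(r"nfs server (\S+): lockd (not responding|is alive again)")
REPEAT = re.compile(r"last message repeated (\d+) times")
OVERFLOW = re.compile(
    r"Listen queue overflow: \d+ already in queue awaiting acceptance "
    r"\((\d+) occurrences\)"
)


def summarize(text):
    """Count lockd state messages and listen queue overflows in syslog text."""
    counts = Counter()
    last = None  # key of the most recent lockd message, if any
    for line in text.splitlines():
        m = LOCKD.search(line)
        if m:
            last = (m.group(1), m.group(2))
            counts[last] += 1
            continue
        m = REPEAT.search(line)
        if m:
            # Assumption: a repeat line refers to the lockd message before it.
            if last is not None:
                counts[last] += int(m.group(1))
            continue
        m = OVERFLOW.search(line)
        if m:
            counts["listen queue overflow"] += int(m.group(1))
        last = None  # any other line ends the run of repeats
    return counts


if __name__ == "__main__":
    for key, n in summarize(SAMPLE).items():
        print(key, n)
```

To run it against a real log, read the file (for example /var/log/messages) into a string and pass that to `summarize` in place of `SAMPLE`. Other log formats would need different patterns.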