From: Rick Macklem
To: Benjamin Kaduk
CC: Benjamin Kaduk, Rick Macklem, src-committers, "svn-src-projects@freebsd.org"
Subject: Re: svn commit: r362798 - in projects/nfs-over-tls/sys/rpc: . rpcsec_tls
Date: Wed, 1 Jul 2020 00:43:41 +0000
References: <202006301449.05UEnq2x072917@repo.freebsd.org>, <20200630163340.GN58278@kduck.mit.edu>
In-Reply-To: <20200630163340.GN58278@kduck.mit.edu>

Benjamin Kaduk wrote:
>On Tue, Jun 30, 2020 at 04:20:45PM +0000, Rick Macklem wrote:
>> Benjamin Kaduk wrote:
>> >On Tue, Jun 30, 2020 at 7:49 AM Rick Macklem wrote:
>> >Author: rmacklem
>> >Date: Tue Jun 30 14:49:51 2020
>> >New Revision: 362798
>> >URL: https://svnweb.freebsd.org/changeset/base/362798
>> >
>> >Log:
>> >  Testing when a server does not respond to TLS handshake records exposed
>> >  a couple of problems, since the daemon would be in SSL_connect() for 6 minutes.
>> >
>> >  - When the upcall timed out and was retried, the RPCTLS_SYSC_CLSOCKET syscall
>> >    was broken and did not return an error upon a retry. It allocated a file
>> >    descriptor for a NULL socket.
>> >  - The socket structure in the kernel could be free'd while the daemon was
>> >    still using it in SSL_connect().
>> >  - Adjust the timeout a retry count so that upcalls are only attempted once
>> >    with a 10minute timeout.
>> >
>> >
>> >10 minutes seems really long! It sounds from the description like the upcall so that
>> >userspace can run SSL_connect() was taking 6 minutes, and you needed 10 minutes so
>> >as to be longer than the 6 minutes that is "out of your control"?
>> Well, I think a long timeout here is ok, since a timeout indicates a broken daemon.
>> (The upcalls to the local daemon should be reliable and cannot safely be redone.
>> In a perfect world, the upcall mechanism would be "exactly once" instead of
>> "at least once". I think an upcall might fail when the mbuf pool in the kernel
>> is exhausted, but that should be rare.)
>>
>> >I feel like there should be some sockopts available to get the SSL_connect() timeout
>> >down, so that the upcall timeout doesn't need to be so long, either.
>> Yes, 6 minutes does seem like a long time. I only discovered this yesterday when
>> I simulated a server that did not respond to handshake records.
>>
>> I haven't yet dug into the openssl code to see if there is a way to adjust this
>> timeout.
>> I also do not know what a good timeout value for SSL_connect() might be,
>> even if the daemon can override the default.
>>
>> In practice, this should only happen when trying to do an NFS mount on
>> a broken server which responds to the "STARTTLS" Null RPC, but does not
>> do the handshake.
>> Having the mount attempt stuck for 6minutes before failing is not that serious
>> a problem, imho.
>> (When systems boot after something like a power failure, delays getting NFS
>> mounts done, due to the NFS server/network needing to be up, is fairly
>> normal. The "-b" option to put the mount attempt in background has been
>> around for a long time for this.)
>>
>> If you happen to know how to set a timeout for SSL_connect() in the openssl
>> library, I would be interested in hearing that.
>
>As it happens, I took a look before I wrote the initial note, and there
>doesn't seem to be any intrinsic TLS (not DTLS) handshake timeouts in
>libssl itself; I expect this is actually just the (kernel's!) TCP timeout.
>So you'd be getting the socket fd (e.g., SSL_get_fd(), if you don't have a
>reference already) and using setsockopt() to set the timeout(s).
Interesting. The test case I simulated did not close the TCP socket used by
SSL_connect(). The server just replied to the STARTTLS Null RPC, but did not
call SSL_accept(), so the server side just isn't playing "handshake".
"netstat -a" showed the connection as ESTABLISHED.
During debugging, I also used the trick of putting:
    while (1)
        sleep(1);
right after the SSL_connect() call and, when watching it via "ps",
it would switch from "sbwait" to "nanoslp" after 6 minutes and
a syslog() call showed that SSL_connect() had returned -1.

So, if the TCP connection was "established", what caused the SSL_connect()
to return with an error (-1) after 6 minutes?

Now, there is a 6 minute idle timeout in the RPC code for TCP where it,
by default, closes the connection when there is 6 minutes without any
activity. (I have to look if waiting for a reply for the upcall implies "no activity" and if
this also happens for AF_LOCAL sockets, which is what the upcalls use.)

Now, if that happens, a SIGPIPE would be posted to the daemon, which
is SIG_IGN'd by the daemon. But maybe the SIGPIPE somehow causes
SSL_connect() to return -1 by making the syscall it is doing (read/recv on the
TCP socket sitting in sbwait) return EINTR, or something like that?

I can change this 6minute timeout to see if that affects it.

When you've got upcalls and library functions both talking to sockets it
can get interesting.

Thanks for the comments, rick

-Ben
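
For reference, a minimal sketch of the setsockopt() approach suggested in the quoted text, under stated assumptions. The helper names set_socket_timeouts() and recv_times_out() are hypothetical and do not come from the nfs-over-tls sources. An AF_LOCAL socket pair stands in for a server that keeps the connection open but never sends handshake records. OpenSSL is deliberately left out, so this only shows that a blocking recv() gives up once SO_RCVTIMEO expires. How SSL_connect() reports that condition is not verified here.

```c
#include <assert.h>
#include <errno.h>
#include <sys/socket.h>
#include <sys/time.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper: bound how long blocking reads and writes on fd
 * may wait.  In the daemon, fd would be the TCP socket, for example the
 * value returned by SSL_get_fd(ssl).
 * Returns 0 on success, -1 with errno set on failure.
 */
int
set_socket_timeouts(int fd, int seconds)
{
	struct timeval tv;

	assert(seconds > 0);
	tv.tv_sec = seconds;
	tv.tv_usec = 0;
	if (setsockopt(fd, SOL_SOCKET, SO_RCVTIMEO, &tv, sizeof(tv)) == -1)
		return (-1);
	if (setsockopt(fd, SOL_SOCKET, SO_SNDTIMEO, &tv, sizeof(tv)) == -1)
		return (-1);
	return (0);
}

/*
 * Demonstration only: one end of a socket pair plays a peer that stays
 * connected but silent.  Returns 1 if recv() failed with
 * EAGAIN/EWOULDBLOCK once the receive timeout expired, 0 otherwise.
 */
int
recv_times_out(int seconds)
{
	int sv[2], timed_out;
	char buf[16];
	ssize_t n;

	if (socketpair(AF_LOCAL, SOCK_STREAM, 0, sv) == -1)
		return (0);
	timed_out = 0;
	if (set_socket_timeouts(sv[0], seconds) == 0) {
		/* Nothing is ever written to sv[1], so this blocks. */
		n = recv(sv[0], buf, sizeof(buf), 0);
		if (n == -1 && (errno == EAGAIN || errno == EWOULDBLOCK))
			timed_out = 1;
	}
	close(sv[0]);
	close(sv[1]);
	return (timed_out);
}
```

In the daemon, the equivalent call would presumably be something like set_socket_timeouts(SSL_get_fd(ssl), N) made before SSL_connect(), with N chosen to be comfortably shorter than the upcall timeout. A suitable value for N is an open question in this thread, and the sketch does not explain why SSL_connect() returned after 6 minutes in the test described above.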