From: Rick Macklem
Date: Mon, 22 May 2017 19:34:37 +0000 (UTC)
To: src-committers@freebsd.org, svn-src-all@freebsd.org, svn-src-stable@freebsd.org, svn-src-stable-11@freebsd.org
Subject: svn commit: r318660 - stable/11/sys/rpc
Message-Id: <201705221934.v4MJYbqG072316@repo.freebsd.org>
X-SVN-Group: stable-11

Author: rmacklem
Date: Mon May 22 19:34:37 2017
New Revision: 318660
URL: https://svnweb.freebsd.org/changeset/base/318660

Log:
  MFC: r317906
  Fix the client side krpc from doing TCP reconnects for ERESTART
  from sosend().

  When sosend() returns ERESTART in the client side krpc, it indicates that
  the RPC message hasn't yet been sent and that the send queue is full or
  locked while a signal is posted for the process.
  Without this patch, this would result in an RPC_CANTSEND reply from
  clnt_vc_call(), which would cause clnt_reconnect_call() to create a new
  TCP transport connection. For most NFS servers, this wasn't a serious
  problem, although it did imply retries of outstanding RPCs, which could
  possibly have missed the DRC.
  For an NFSv4.1 mount to AmazonEFS, this caused a serious problem, since
  AmazonEFS often didn't retain the NFSv4.1 session and would reply with
  NFS4ERR_BAD_SESSION. This implies to the client a crash/reboot, which
  requires open/lock state recovery.

  Three options were considered to fix this:
  - Return the ERESTART all the way up to the system call boundary and then
    have the system call redone. This is fraught with risk, due to convoluted
    code paths, asynchronous I/O RPCs etc. cperciva@ worked on this, but it is
    still a work in progress and may not be feasible.
  - Set SB_NOINTR for the socket buffer. This fixes the problem, but makes
    the sosend() completely non-interruptible, which kib@ considered
    inappropriate. It also would break forced dismount when a thread was
    blocked in sosend().
  - Modify the retry loop in clnt_vc_call(), so that it loops for this case
    for up to 15sec. Testing showed that the sosend() usually succeeded by
    the 2nd retry. The extreme case observed was 111 loop iterations, or
    about 100msec of delay.

  This third alternative is what is implemented in this patch, since the
  change:
  - is localized
  - is straightforward
  - does not break forced dismount.

  This patch has been tested by cperciva@ extensively against AmazonEFS.

Modified:
  stable/11/sys/rpc/clnt_vc.c
Directory Properties:
  stable/11/   (props changed)

Modified: stable/11/sys/rpc/clnt_vc.c
==============================================================================
--- stable/11/sys/rpc/clnt_vc.c	Mon May 22 19:28:38 2017	(r318659)
+++ stable/11/sys/rpc/clnt_vc.c	Mon May 22 19:34:37 2017	(r318660)
@@ -57,6 +57,7 @@ __FBSDID("$FreeBSD$");
 
 #include
 #include
+#include
 #include
 #include
 #include
@@ -107,6 +108,8 @@ static struct clnt_ops clnt_vc_ops = {
 
 static void clnt_vc_upcallsdone(struct ct_data *);
 
+static int	fake_wchan;
+
 /*
  * Create a client handle for a connection.
  * Default options are set, which the user can change using clnt_control()'s.
@@ -298,7 +301,7 @@ clnt_vc_call(
 	uint32_t xid;
 	struct mbuf *mreq = NULL, *results;
 	struct ct_request *cr;
-	int error;
+	int error, trycnt;
 
 	cr = malloc(sizeof(struct ct_request), M_RPC, M_WAITOK);
 
@@ -328,8 +331,20 @@ clnt_vc_call(
 		timeout = ct->ct_wait;	/* use default timeout */
 	}
 
+	/*
+	 * After 15sec of looping, allow it to return RPC_CANTSEND, which will
+	 * cause the clnt_reconnect layer to create a new TCP connection.
+	 */
+	trycnt = 15 * hz;
 call_again:
 	mtx_assert(&ct->ct_lock, MA_OWNED);
+	if (ct->ct_closing || ct->ct_closed) {
+		ct->ct_threads--;
+		wakeup(ct);
+		mtx_unlock(&ct->ct_lock);
+		free(cr, M_RPC);
+		return (RPC_CANTSEND);
+	}
 
 	ct->ct_xid++;
 	xid = ct->ct_xid;
@@ -397,13 +412,16 @@ call_again:
 	 */
 	error = sosend(ct->ct_socket, NULL, NULL, mreq, NULL, 0, curthread);
 	mreq = NULL;
-	if (error == EMSGSIZE) {
+	if (error == EMSGSIZE || (error == ERESTART &&
+	    (ct->ct_waitflag & PCATCH) == 0 && trycnt-- > 0)) {
 		SOCKBUF_LOCK(&ct->ct_socket->so_snd);
 		sbwait(&ct->ct_socket->so_snd);
 		SOCKBUF_UNLOCK(&ct->ct_socket->so_snd);
 		AUTH_VALIDATE(auth, xid, NULL, NULL);
 		mtx_lock(&ct->ct_lock);
 		TAILQ_REMOVE(&ct->ct_pending, cr, cr_link);
+		/* Sleep for 1 clock tick before trying the sosend() again. */
+		msleep(&fake_wchan, &ct->ct_lock, 0, "rpclpsnd", 1);
 		goto call_again;
 	}
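
For readers who want to reason about the retry budget outside the kernel, the following is a minimal user-space sketch of the bounded retry logic described in the log. It is not the kernel code: every `sim_*` and `SIM_*` name is hypothetical, the socket is replaced by a counter of failing attempts, and the locking, `sbwait()` and the `ct_closing`/`ct_closed` check are left out. A tick rate of `hz = 1000` is an assumption used only to turn ticks into milliseconds.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-ins for the kernel's error and RPC status codes. */
enum sim_status { SIM_OK, SIM_ERESTART };
enum sim_result { SIM_SENT, SIM_CANTSEND };

/* A fake socket whose send fails a fixed number of times, then succeeds. */
struct sim_socket {
	int busy_attempts;
};

struct sim_outcome {
	enum sim_result result;
	int retries;	/* number of one-tick sleeps taken before the result */
};

/* Stand-in for sosend(): reports SIM_ERESTART while the socket is busy. */
static enum sim_status
sim_sosend(struct sim_socket *so)
{
	if (so->busy_attempts > 0) {
		so->busy_attempts--;
		return (SIM_ERESTART);
	}
	return (SIM_OK);
}

/* Convert a tick count to milliseconds for a given tick rate. */
static int
sim_ticks_to_msec(int ticks, int hz)
{
	return (ticks * 1000 / hz);
}

/*
 * Retry the send for at most 15 seconds' worth of ticks, one tick per retry.
 * When pcatch is true (the caller wants signals delivered) no retry is made,
 * mirroring the (ct_waitflag & PCATCH) == 0 test in the patch.
 */
static struct sim_outcome
sim_call(struct sim_socket *so, int hz, bool pcatch)
{
	struct sim_outcome out = { SIM_CANTSEND, 0 };
	int trycnt;

	assert(hz > 0);
	trycnt = 15 * hz;
	for (;;) {
		if (sim_sosend(so) == SIM_OK) {
			out.result = SIM_SENT;
			return (out);
		}
		/* Give up once interruptible or out of budget. */
		if (pcatch || trycnt-- <= 0)
			return (out);
		out.retries++;	/* the kernel code sleeps for one tick here */
	}
}
```

Calling `sim_call()` on a socket with `busy_attempts = 111` and `hz = 1000` reports a successful send after 111 retries, which `sim_ticks_to_msec()` converts to 111 msec, in line with the "about 100msec" figure in the log under that assumed tick rate. With more failures than the `15 * hz` budget allows, the sketch returns `SIM_CANTSEND`, the case in which the real code falls back to a reconnect.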