From owner-freebsd-net@FreeBSD.ORG Sun Oct 5 08:22:40 2014 Return-Path: Delivered-To: net@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [8.8.178.115]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by hub.freebsd.org (Postfix) with ESMTPS id 59D9593; Sun, 5 Oct 2014 08:22:40 +0000 (UTC) Received: from cyrus.watson.org (cyrus.watson.org [198.74.231.69]) by mx1.freebsd.org (Postfix) with ESMTP id 1288C381; Sun, 5 Oct 2014 08:22:40 +0000 (UTC) Received: from [10.0.1.12] (79-64-249-234.host.pobb.as13285.net [79.64.249.234]) by cyrus.watson.org (Postfix) with ESMTPSA id DADC846B0C; Sun, 5 Oct 2014 04:22:33 -0400 (EDT) Content-Type: text/plain; charset=us-ascii Mime-Version: 1.0 (Mac OS X Mail 7.3 \(1878.6\)) Subject: Re: [PATCH] Only lock send buffer in sopoll() if needed From: "Robert N. M. Watson" In-Reply-To: <201409301400.09999.jhb@freebsd.org> Date: Sun, 5 Oct 2014 08:08:17 +0100 Content-Transfer-Encoding: quoted-printable Message-Id: <9DED9207-89B7-47C3-B823-047A5FDFC511@FreeBSD.org> References: <201409301400.09999.jhb@freebsd.org> To: John Baldwin X-Mailer: Apple Mail (2.1878.6) Cc: net@freebsd.org X-BeenThere: freebsd-net@freebsd.org X-Mailman-Version: 2.1.18-1 Precedence: list List-Id: Networking and TCP/IP with FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Sun, 05 Oct 2014 08:22:40 -0000 I'm not convinced that the race with SBS_CANTSENDMORE is OK, even though = I really want to write that it is. The risk here is that we miss an = asynchronous disconnect event, and that the thread then blocks even = though an event is pending, which is a nasty turn of events. We might = want to dig about a bit and see whether this case 'matters' -- e.g., = that there are important cases where a test for writability might need = to catch SBS_CANTRCVMORE but SBS_CANTSENDMORE is not simultaneously set. Could you say a bit more about the performance benefits of this change = -- is it substantial? If so, we might need to add a new socket-buffer = flag along the lines of SB[RS]_DISCONNECTED that is 'broadcast' to both = socket buffers on certain events. Doing so might require rejiggering = elsewhere by causing additional locks to need to be held, possibly out = of order. Another thing that's worth pondering -- and it is a lot of work -- is = finally sorting out the conflation of SOCK_LOCK() and = SOCKBUF_LOCK(&so_rcv). We have a number of races in the socket state = machine stemming from this conflation (due to lockless reads along the = lines of the one you are proposing, only against so_state and friends). = This is a non-trivial change, and it's not 100% clear to me how to make = it, so would require quite a bit of analysis. It might have the effect = of needing three non-trivial adjustments: (1) instantiating two new = locks per socket, one sleepable, one a mutex; (2) rejiggering a set of = 'global' socket properties out from under the receive lock, such as (but = definitely not limited to) so_state; (3) doing some socket-layer vs. = protocol-layer rejiggering so that protocol state machine is the = 'primary' with event notifications to the socket layer to update its = state, as opposed to the current world order where some state = transitions are triggered by one layer, and some by the other. Although = not pretty, you can see an example of a resolution of a socket-protocol = race in the current behaviour of solisten, which used to drive the state = machine from the socket layer, which conflicted with protocol-layer = checks; now, the protocol uses the socket layer as a library to test = (and later set) properties. I don't think the issue you are looking at requires starting to dig into = that can of worms, but it is a long-standing problem in socket locking / = protocol-layer interaction that does need to get resolved in order to = make the behaviour of conditions like the one you are looking at 'more = natural'. Robert On 30 Sep 2014, at 19:00, John Baldwin wrote: > Right now sopoll() always locks both socket buffers. The receive = socket=20 > buffer lock is always needed, but the send socket buffer lock is only = needed=20 > while polling for writing (there is a potential test of = SBS_CANTSENDMORE=20 > without the lock, but I think this might be ok). What do folks think? >=20 > Index: uipc_socket.c > =3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D= =3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D= =3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D > --- uipc_socket.c (revision 272305) > +++ uipc_socket.c (working copy) > @@ -3003,7 +3003,12 @@ sopoll_generic(struct socket *so, int events, = stru > { > int revents =3D 0; >=20 > - SOCKBUF_LOCK(&so->so_snd); > + if (events & (POLLOUT | POLLWRNORM)) > + SOCKBUF_LOCK(&so->so_snd); > + /* > + * The so_rcv lock doubles as SOCK_LOCK so it it is needed for > + * all requests. > + */ > SOCKBUF_LOCK(&so->so_rcv); > if (events & (POLLIN | POLLRDNORM)) > if (soreadabledata(so)) > @@ -3038,7 +3043,8 @@ sopoll_generic(struct socket *so, int events, = stru > } >=20 > SOCKBUF_UNLOCK(&so->so_rcv); > - SOCKBUF_UNLOCK(&so->so_snd); > + if (events & (POLLOUT | POLLWRNORM)) > + SOCKBUF_UNLOCK(&so->so_snd); > return (revents); > } >=20 > --=20 > John Baldwin