From: Conrad Meyer
Reply-To: cem@freebsd.org
Date: Fri, 23 Jun 2017 21:48:24 -0700
Subject: Re: svn commit: r319722 - in head: sys/cam/ctl sys/dev/iscsi sys/kern sys/netgraph sys/netgraph/bluetooth/socket sys/netinet sys/ofed/drivers/infiniband/core sys/ofed/drivers/infiniband/ulp/sdp sys/rpc...
To: Gleb Smirnoff
Cc: src-committers, svn-src-all@freebsd.org, svn-src-head@freebsd.org, Allan Jude
In-Reply-To: <201706082130.v58LUY0j095589@repo.freebsd.org>
References: <201706082130.v58LUY0j095589@repo.freebsd.org>

Hi Gleb,

We suspect this revision has broken setsockopt(SO_SNDBUF), etc., on
listen sockets, as used by e.g. nginx.  Example backtrace:
http://imgur.com/a/fj5JQ

The proposed mechanism is that the snd/rcv sockbufs (and associated
locks) are destroyed as part of solisten_proto().

Best,
Conrad

On Thu, Jun 8, 2017 at 2:30 PM, Gleb Smirnoff wrote:
> Author: glebius
> Date: Thu Jun 8 21:30:34 2017
> New Revision: 319722
> URL: https://svnweb.freebsd.org/changeset/base/319722
>
> Log:
>   Listening sockets improvements.
>
>   o Separate fields of struct socket that belong to listening from
>     fields that belong to normal dataflow, and unionize them.  This
>     shrinks the structure a bit.
>     - Take out selinfo's from the socket buffers into the socket. The
>       first reason is to support braindamaged scenario when a socket is
>       added to kevent(2) and then listen(2) is cast on it. The second
>       reason is that there is future plan to make socket buffers pluggable,
>       so that for a dataflow socket a socket buffer can be changed, and
>       in this case we also want to keep same selinfos through the lifetime
>       of a socket.
>     - Remove struct so_accf. Since now listening stuff no longer
>       affects struct socket size, just move its fields into listening part
>       of the union.
>     - Provide sol_upcall field and enforce that so_upcall_set() may be called
>       only on a dataflow socket, which has buffers, and for listening sockets
>       provide solisten_upcall_set().
>
>   o Remove ACCEPT_LOCK() global.
>     - Add a mutex to socket, to be used instead of socket buffer lock to lock
>       fields of struct socket that don't belong to a socket buffer.
>     - Allow to acquire two socket locks, but the first one must belong to a
>       listening socket.
>     - Make soref()/sorele() to use atomic(9). This allows in some situations
>       to do soref() without owning socket lock. There is place for improvement
>       here, it is possible to make sorele() also to lock optionally.
>     - Most protocols aren't touched by this change, except UNIX local sockets.
>       See below for more information.
>
>   o Reduce copy-and-paste in kernel modules that accept connections from
>     listening sockets: provide function solisten_dequeue(), and use it in
>     the following modules: ctl(4), iscsi(4), ng_btsocket(4), ng_ksocket(4),
>     infiniband, rpc.
>
>   o UNIX local sockets.
>     - Removal of ACCEPT_LOCK() global uncovered several races in the UNIX
>       local sockets. Most races exist around spawning a new socket, when we
>       are connecting to a local listening socket. To cover them, we need to
>       hold locks on both PCBs when spawning a third one. This means holding
>       them across sonewconn(). This creates a LOR between pcb locks and
>       unp_list_lock.
>     - To fix the new LOR, abandon the global unp_list_lock in favor of global
>       unp_link_lock. Indeed, separating these two locks didn't provide us any
>       extra parallelism in the UNIX sockets.
>     - Now call into uipc_attach() may happen with unp_link_lock held if we
>       are accepting, or without unp_link_lock if we are just creating
>       a socket.
>     - Another problem in UNIX sockets is that uipc_close() basically did nothing
>       for a listening socket. The vnode remained opened for connections. This
>       is fixed by removing vnode in uipc_close(). Maybe the right way would be
>       to do it for all sockets (not only listening), simply move the vnode
>       teardown from uipc_detach() to uipc_close()?
>
>   Sponsored by: Netflix
>   Differential Revision: https://reviews.freebsd.org/D9770
>
> Modified:
>   head/sys/cam/ctl/ctl_ha.c
>   head/sys/dev/iscsi/icl_soft_proxy.c
>   head/sys/kern/sys_socket.c
>   head/sys/kern/uipc_accf.c
>   head/sys/kern/uipc_debug.c
>   head/sys/kern/uipc_sockbuf.c
>   head/sys/kern/uipc_socket.c
>   head/sys/kern/uipc_syscalls.c
>   head/sys/kern/uipc_usrreq.c
>   head/sys/netgraph/bluetooth/socket/ng_btsocket_l2cap.c
>   head/sys/netgraph/bluetooth/socket/ng_btsocket_rfcomm.c
>   head/sys/netgraph/bluetooth/socket/ng_btsocket_sco.c
>   head/sys/netgraph/ng_ksocket.c
>   head/sys/netinet/sctp_input.c
>   head/sys/netinet/sctp_syscalls.c
>   head/sys/netinet/sctp_sysctl.c
>   head/sys/netinet/sctp_usrreq.c
>   head/sys/netinet/tcp_subr.c
>   head/sys/netinet/tcp_syncache.c
>   head/sys/netinet/tcp_timewait.c
>   head/sys/ofed/drivers/infiniband/core/iwcm.c
>   head/sys/ofed/drivers/infiniband/ulp/sdp/sdp_main.c
>   head/sys/rpc/svc_vc.c
>   head/sys/sys/sockbuf.h
>   head/sys/sys/socket.h
>   head/sys/sys/socketvar.h
>   head/usr.bin/netstat/inet.c