From: "Mathur, Vedant" <Vedant.Mathur@netapp.com>
To: freebsd-net@freebsd.org
Subject: Performing a socket accept in listen sockets' upcall - Yes or No?
Date: Tue, 19 Jan 2016 23:20:15 +0000
List-Id: Networking and TCP/IP with FreeBSD

Hi all,

I am working with FreeBSD in an environment where the majority of my applications run in the kernel and rely heavily on the socket upcall mechanism. All of these kernel applications use socket upcalls for socket I/O.

These applications also use the upcall mechanism to accept incoming TCP connections. They do so by registering a receive upcall on the head (listen) socket and waiting for it to fire. When the stack makes the receive upcall on the listen socket, most of the applications defer the actual accept to a different thread through a queueing mechanism and return from the upcall.

However, some of my applications want to perform the accept inline instead of deferring it to a different thread. In other words, the application wants to accept the child socket in the context of the listen socket's upcall. My questions about this behavior are as follows:

A) First and foremost, does the FreeBSD socket upcall framework even support this kind of accept operation? Does it conceptually break FreeBSD socket accept and upcall semantics? Is the upcall mechanism meant to be used ONLY for socket I/O?
B) If the accept operation is supported in the context of the head socket's upcall, how do we handle the use-after-free cases in which the head socket's upcall leads the application to accept and then close the child socket, which the stack then accesses after the upcall returns? To be more specific, in FreeBSD 11.x the syncache_socket() code calls sonewconn() and then connects the child socket to the peer. Later it calls soisconnected(), which places the child socket on the head's so_comp queue and makes an upcall that can potentially free the child socket. Once syncache_socket() returns, the call stack unwinds to tcp_input() for TCP sockets. tcp_input() then calls tcp_do_segment(), which accesses the child socket along with its inpcb and tcpcb, any of which may by then be freed memory.

If A) is possible, then I can see that commit r261242 modified the code so that placing the child socket on the so_comp queue is deferred until the connect is complete. I believe this change is still susceptible to the use-after-free case described above. If we agree, what are the potential solutions?

Thanks!
Vedant Mathur
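
P.S. To make the lifetime concern in B) concrete, here is a minimal user-space sketch in C. It is only a model, not kernel code: struct model_sock, model_soref(), model_sorele(), upcall_accept_and_close() and model_input_path() are all hypothetical names invented for illustration (the reference helpers are loosely modeled on the idea of socket reference counting). It assumes one possible mitigation, namely that the input path takes its own reference on the child before the upcall runs, and shows that the child then survives an inline accept-and-close until the input path is done with it. Whether the real stack can do this safely under its locking rules is exactly the open question.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-in for a child socket; NOT the kernel's struct socket. */
struct model_sock {
	int refs;	/* outstanding references */
	int state;	/* placeholder for state touched after the upcall */
};

/* Number of model sockets freed so far, so callers can observe frees. */
static int model_frees;

/* Hypothetical helper: take one reference. */
static void
model_soref(struct model_sock *so)
{
	so->refs++;
}

/* Hypothetical helper: drop one reference, freeing on the last one. */
static void
model_sorele(struct model_sock *so)
{
	assert(so->refs > 0);
	if (--so->refs == 0) {
		model_frees++;
		free(so);
	}
}

/*
 * Models an application upcall that accepts the child inline and closes it
 * right away: accept takes over the queue's reference, close drops it.
 */
static void
upcall_accept_and_close(struct model_sock *child)
{
	model_sorele(child);
}

/*
 * Models the input path: create the child, run the upcall, then keep using
 * the child afterwards (as tcp_do_segment() does in the description above).
 * Returns the number of frees observed while the path still held its own
 * reference, or -1 if allocation failed.
 */
static int
model_input_path(void (*upcall)(struct model_sock *))
{
	struct model_sock *child;
	int frees_seen;

	child = calloc(1, sizeof(*child));
	if (child == NULL)
		return (-1);
	child->refs = 1;	/* reference owned by the completed queue */

	model_soref(child);	/* assumed mitigation: hold our own reference */
	upcall(child);		/* application may accept and close here */

	child->state = 1;	/* still valid: our reference keeps it alive */
	frees_seen = model_frees;

	model_sorele(child);	/* last reference gone, child is freed now */
	return (frees_seen);
}
```

Calling model_input_path(upcall_accept_and_close) in this model frees the child only when the input path drops its own reference. Without the model_soref() call, the write to child->state would touch freed memory, which is the pattern I am worried about.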