From: Sepherosa Ziehau
To: Adrian Chadd
Cc: Ermal Luçi, freebsd-net, Oleg Moskalenko, Tim Kientzle,
 freebsd-current@freebsd.org
Date: Fri, 6 Dec 2013 11:04:29 +0800
Subject: Re: [PATCH] SO_REUSEADDR and SO_REUSEPORT behaviour

On Tue, Dec 3, 2013 at 5:41 AM, Adrian Chadd wrote:
>
> On 2 December 2013 03:45, Sepherosa Ziehau wrote:
> >
> > On Mon, Dec 2, 2013 at 1:02 PM, Adrian Chadd wrote:
> >
> >> Ok, so given this, how do you guarantee the UTHREAD stays on the given
> >> CPU? You assume it stays on the CPU that the initial listen socket was
> >> created on, right? If it's migrated to another CPU core, does the
> >> listen queue still stay in the original hash group, which is in a
> >> netisr on a different CPU?
> >
> > As I wrote in the brief introduction above, Dfly currently relies on
> > the scheduler doing the proper thing (and the scheduler does do a very
> > good job in my tests). I need to export a certain kind of socket
> > option to make that information available to user-space programs.
> > Forcing UTHREAD binding in the kernel is not helpful, since things are
> > different in reverse-proxy applications. And even if that kind of
> > binding information were exported to user space, the user-space
> > program would still have to poll it periodically (in Dfly, at least),
> > since other programs binding to the same addr/port could come and go,
> > which causes the inp localgroup to be reorganized in the current Dfly
> > implementation.
>
> Right, I kinda gathered that. It's fine; I was conceptually thinking
> of doing some thread pinning into this anyway.
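For concreteness, the user-space model being discussed could look
roughly like the sketch below: one worker per CPU, each pinning itself
and binding its own SO_REUSEPORT listen socket, so the accepting
uthread stays on the CPU whose netisr owns the socket's hash group.
This is an illustration, not code from the patch; it assumes
FreeBSD-style cpuset(2)/pthread_np(3) affinity calls, and the port
number is arbitrary.

/*
 * Sketch only: one pinned worker per CPU, each with its own
 * SO_REUSEPORT listen socket.  Assumes FreeBSD-style cpuset(2)
 * and pthread_np(3); the port number is arbitrary.
 */
#include <sys/socket.h>
#include <sys/cpuset.h>
#include <netinet/in.h>
#include <arpa/inet.h>
#include <err.h>
#include <pthread.h>
#include <pthread_np.h>		/* pthread_setaffinity_np() */
#include <stdint.h>
#include <string.h>
#include <unistd.h>

static void *
worker(void *arg)
{
	int cpu = (int)(intptr_t)arg, s, on = 1, error;
	cpuset_t mask;
	struct sockaddr_in sin;

	/* Pin this uthread; the scheduler then keeps it next to the
	 * per-cpu netisr that owns its listen socket's hash group. */
	CPU_ZERO(&mask);
	CPU_SET(cpu, &mask);
	error = pthread_setaffinity_np(pthread_self(), sizeof(mask), &mask);
	if (error != 0)
		errc(1, error, "pthread_setaffinity_np");

	if ((s = socket(AF_INET, SOCK_STREAM, 0)) < 0)
		err(1, "socket");
	/* Each worker binds its own socket to the same addr/port. */
	if (setsockopt(s, SOL_SOCKET, SO_REUSEPORT, &on, sizeof(on)) < 0)
		err(1, "setsockopt(SO_REUSEPORT)");

	memset(&sin, 0, sizeof(sin));
	sin.sin_len = sizeof(sin);
	sin.sin_family = AF_INET;
	sin.sin_port = htons(8080);		/* arbitrary */
	sin.sin_addr.s_addr = htonl(INADDR_ANY);
	if (bind(s, (struct sockaddr *)&sin, sizeof(sin)) < 0)
		err(1, "bind");
	if (listen(s, 128) < 0)
		err(1, "listen");

	for (;;) {
		int c = accept(s, NULL, NULL);
		if (c >= 0)
			close(c);	/* real request handling goes here */
	}
	/* NOTREACHED */
}

int
main(void)
{
	long ncpu = sysconf(_SC_NPROCESSORS_ONLN);
	pthread_t tid;

	for (long i = 0; i < ncpu; i++)
		pthread_create(&tid, NULL, worker, (void *)(intptr_t)i);
	pause();			/* workers run forever */
	return (0);
}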
>
> How do you see this scaling on massively multi-core machines? Like 32,
> 48, 64, 128 cores? I had some vague hand-wavy notion of maybe limiting

We do have a 48-core box, but it is mainly used for package building and
other work, so I haven't run network stress tests on it. However, we did
address some message-passing problems on it that never show up on 8-CPU
boxes.

> the concept of pcbgroup hash / netisr threads to a subset of CPUs, or
> have them be able to float between sockets but only have 1 (or n,

Floating around may be good, but by pinning each netisr to a specific
CPU you get lockless per-cpu data.

> maybe) per socket. Or just have a fixed, smaller pool. The idea then

We used to have dedicated threads for UDP and TCP processing, but it
turned out that one netisr per CPU works best in Dfly. You probably need
to try and measure before deciding to move to one or N netisrs per CPU.

Best Regards,
sephe

> is the scheduler would need to be told that a given userland
> thread/process belongs to a given netisr thread, and to schedule them
> on the same CPU when possible.
>
> Anyway, thanks for doing this work. I only wish that you'd do it for
> FreeBSD. :-)
>
> -adrian

--
Tomorrow Will Never Die
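P.S. To make the socket-option idea above concrete, from user space it
could look roughly like the sketch below. SO_CPUHINT is an invented
name for illustration only (no such option exists in Dfly or FreeBSD
today); getsockopt(2) and cpuset_setaffinity(2) are the only real
interfaces used.

/*
 * Hypothetical sketch of the "export the binding CPU via a socket
 * option" idea.  SO_CPUHINT is an invented placeholder -- no such
 * option exists in DragonFly or FreeBSD.  getsockopt(2) and
 * cpuset_setaffinity(2) are the only real interfaces used here.
 */
#include <sys/param.h>
#include <sys/socket.h>
#include <sys/cpuset.h>
#include <err.h>

#ifndef SO_CPUHINT
#define	SO_CPUHINT	0x2000	/* placeholder value, not real */
#endif

static void
pin_to_socket_cpu(int s)
{
	int cpu;
	socklen_t len = sizeof(cpu);
	cpuset_t mask;

	/* Ask the kernel which CPU owns this socket's hash group
	 * (hypothetical option; it would need periodic re-polling,
	 * since the inp localgroup can be reorganized as other
	 * programs bind to and leave the same addr/port). */
	if (getsockopt(s, SOL_SOCKET, SO_CPUHINT, &cpu, &len) < 0)
		err(1, "getsockopt(SO_CPUHINT)");

	/* Pin the current thread to that CPU. */
	CPU_ZERO(&mask);
	CPU_SET(cpu, &mask);
	if (cpuset_setaffinity(CPU_LEVEL_WHICH, CPU_WHICH_TID, -1,
	    sizeof(mask), &mask) < 0)
		err(1, "cpuset_setaffinity");
}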