From owner-freebsd-hackers@FreeBSD.ORG Thu Jun 5 21:07:25 2014
Delivered-To: freebsd-hackers@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [8.8.178.115])
	(using TLSv1 with cipher ADH-AES256-SHA (256/256 bits))
	(No client certificate requested)
	by hub.freebsd.org (Postfix) with ESMTPS id 3F961F93;
	Thu, 5 Jun 2014 21:07:25 +0000 (UTC)
Received: from mail.ipfw.ru (mail.ipfw.ru [IPv6:2a01:4f8:120:6141::2])
	(using TLSv1 with cipher DHE-RSA-AES256-SHA (256/256 bits))
	(Client did not present a certificate)
	by mx1.freebsd.org (Postfix) with ESMTPS id CE92E2A5C;
	Thu, 5 Jun 2014 21:07:24 +0000 (UTC)
Received: from v6.mpls.in ([2a02:978:2::5] helo=ws.su29.net)
	by mail.ipfw.ru with esmtpsa (TLSv1:AES128-SHA:128)
	(Exim 4.76 (FreeBSD)) (envelope-from )
	id 1Wsaxi-000C2L-GI; Thu, 05 Jun 2014 20:56:22 +0400
Message-ID: <5390DB64.6010704@FreeBSD.org>
Date: Fri, 06 Jun 2014 01:04:36 +0400
From: "Alexander V. Chernikov"
User-Agent: Mozilla/5.0 (X11; FreeBSD amd64; rv:24.0) Gecko/20100101 Thunderbird/24.5.0
MIME-Version: 1.0
To: John Baldwin
Subject: Re: Permit init(8) use its own cpuset group.
References: <538C8F9A.4020301@FreeBSD.org> <201406051009.59432.jhb@freebsd.org>
	<5390C907.1070405@FreeBSD.org> <201406051559.11274.jhb@freebsd.org>
In-Reply-To: <201406051559.11274.jhb@freebsd.org>
X-Enigmail-Version: 1.6
Content-Type: text/plain; charset=ISO-8859-15
Content-Transfer-Encoding: 8bit
Cc: Konstantin Belousov, freebsd-hackers@freebsd.org, hackers@freebsd.org
X-BeenThere: freebsd-hackers@freebsd.org
X-Mailman-Version: 2.1.18
Precedence: list
List-Id: Technical Discussions relating to FreeBSD
X-List-Received-Date: Thu, 05 Jun 2014 21:07:25 -0000

On 05.06.2014 23:59, John Baldwin wrote:
> On Thursday, June 05, 2014 3:46:15 pm Alexander V. Chernikov wrote:
>> On 05.06.2014 18:09, John Baldwin wrote:
>>> On Wednesday, June 04, 2014 3:16:59 pm Alexander V. Chernikov wrote:
>>>> On 04.06.2014 19:06, John Baldwin wrote:
>>>>> On Monday, June 02, 2014 12:48:50 pm Konstantin Belousov wrote:
>>>>>> On Mon, Jun 02, 2014 at 06:52:10PM +0400, Alexander V. Chernikov wrote:
>>>>>>> Hello list!
>>>>>>>
>>>>>>> Currently init(8) uses group 1, which is the root group.
>>>>>>> Modifications of this group affect both kernel and userland threads.
>>>>>>> Additionally, such modifications are impossible, for example, in the
>>>>>>> presence of multi-queue NIC drivers (like igb or ixgbe) which bind
>>>>>>> their threads to particular CPUs.
>>>>>>>
>>>>>>> The proposed change ("init_cpuset" loader tunable) permits changing
>>>>>>> CPU masks for userland more easily. Preventing user processes from
>>>>>>> migrating to/from CPU cores used for network traffic processing is
>>>>>>> one such case.
>>>>>>>
>>>>>>> Phabricator: https://phabric.freebsd.org/D141 (the same version
>>>>>>> attached inline)
>>>>>>>
>>>>>>> If there are no objections, I'll commit this next week.
>>>>>> Why is the tunable needed?
>>>>> Because some people already depend on doing 'cpuset -l 0 -s 1'. It is
>>>>> also documented in our manpages that processes start in cpuset 1 by
>>>>> default so that you can use 'cpuset -l 0 -s 1' to move all processes,
>>>>> etc.
>>>>>
>>>>> For the stated problem (bound ithreads in drivers), I would actually
>>>>> like to fix ithreads that are bound to a specific CPU to create a
>>>>> different cpuset instead so they don't conflict with set 1.
>>>> Yes, this seems to be a much better approach.
>>>> Please take a look at the new patch (here or in Phabricator).
>>>> Comments:
>>>>
>>>> Use a different approach for the modifiable root set:
>>>>
>>>> * Make sets in 0..15 internal
>>>> * Define CPUSET_SET1 & CPUSET_ITHREAD in that range
>>>> * Add cpuset_lookup_builtin() to retrieve such cpu sets by id
>>>> * Create additional root set for ithreads
>>>> * Use this set in ithread_create()
>>>> * Change intr_setaffinity() to use cpuset_iroot (do we really need
>>>>   this?)
>>>>
>>>> We can probably do the same for kprocs, but I'm unsure if we really
>>>> need it.
>>>
>>> I imagined something a bit simpler. Just create a new set in
>>> intr_event_bind and bind the ithread to the new set. No need to have
>>> more magic set ids, etc.
>> Well, we also have userland which can modify given changes via `cpuset
>> -x', so we need to be able to add some more logic on set
>> allocation/keeping. Additionally, we can try to do the same via `cpuset
>> -t', so introducing something like cpuset_setIthread() and hooking into
>> intr_event_bind() probably won't be enough. At least I can't think of a
>> quick and easy way to do this.
>
> cpuset -x calls intr_event_bind(). If you just do it there you fix both
> places.

1:04 [0] ra# procstat -t 12 | grep irq275
   12 100121 intr             irq275: ix0:qu1    2  127 wait    -
1:04 [0] ra# cpuset -g -x 275
irq 275 mask: 2
1:04 [0] ra# cpuset -g -t 100121
tid 100121 mask: 2
1:04 [0] ra# cpuset -l 3 -t 100121
-------------------------^^^------
1:05 [0] ra# cpuset -g -t 100121
tid 100121 mask: 3

>
>>> That also means that an ithread that isn't bound to a specific CPU via
>>> either 'cpuset -x' or BUS_BIND_INTR() will honor 'cpuset -s 1' like
>>> other kernel processes. I think that's probably fine and sensible. The
>>> issue is
>> Well, it is questionable. Kernel threads are a bit different in terms of
>> TLB changes, memory working set and so on. (Personally I'd prefer to
>> separate user / kthreads / ithreads into different sets in HEAD, but
>> that's another story.)
>>
>> Anyway, we probably can (and should) MFC a slightly different version
>> which tries to change several sets at once if the user supplied set 1 as
>> the argument.
>
> No, I don't think we need umpteen special sets. I think we just need to
> fix this one specific case of bound ithreads and everything else will
> work fine. If someone wants to move kprocs out of set 1, they can already
> do that with the existing tools via cpuset -C, etc.
>