From owner-freebsd-hackers@FreeBSD.ORG Thu Jun 12 03:58:13 2014 Return-Path: Delivered-To: freebsd-hackers@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) (using TLSv1 with cipher ADH-AES256-SHA (256/256 bits)) (No client certificate requested) by hub.freebsd.org (Postfix) with ESMTPS id 70925A16; Thu, 12 Jun 2014 03:58:13 +0000 (UTC) Received: from vps1.elischer.org (vps1.elischer.org [204.109.63.16]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (Client CN "vps1.elischer.org", Issuer "CA Cert Signing Authority" (not verified)) by mx1.freebsd.org (Postfix) with ESMTPS id 3025F2A64; Thu, 12 Jun 2014 03:58:12 +0000 (UTC) Received: from jre-mbp.elischer.org (etroy.elischer.org [121.45.232.70]) (authenticated bits=0) by vps1.elischer.org (8.14.8/8.14.8) with ESMTP id s5C3vxmU046483 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES128-SHA bits=128 verify=NO); Wed, 11 Jun 2014 20:58:02 -0700 (PDT) (envelope-from julian@freebsd.org) Message-ID: <53992542.3080908@freebsd.org> Date: Thu, 12 Jun 2014 11:57:54 +0800 From: Julian Elischer User-Agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10.9; rv:24.0) Gecko/20100101 Thunderbird/24.6.0 MIME-Version: 1.0 To: "Alexander V. Chernikov" , John Baldwin Subject: Re: Permit init(8) use its own cpuset group. References: <538C8F9A.4020301@FreeBSD.org> <5390DB64.6010704@FreeBSD.org> <53923C09.6020302@FreeBSD.org> <201406091343.39584.jhb@freebsd.org> <53988933.5010902@FreeBSD.org> In-Reply-To: <53988933.5010902@FreeBSD.org> Content-Type: text/plain; charset=ISO-8859-1; format=flowed Content-Transfer-Encoding: 7bit X-Content-Filtered-By: Mailman/MimeDel 2.1.18 Cc: Konstantin Belousov , freebsd-hackers@freebsd.org, hackers@freebsd.org X-BeenThere: freebsd-hackers@freebsd.org X-Mailman-Version: 2.1.18 Precedence: list List-Id: Technical Discussions relating to FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 12 Jun 2014 03:58:13 -0000 On 6/12/14, 12:52 AM, Alexander V. Chernikov wrote: > On 09.06.2014 21:43, John Baldwin wrote: >> On Friday, June 06, 2014 6:09:13 pm Alexander V. Chernikov wrote: >>> On 06.06.2014 01:04, Alexander V. Chernikov wrote: >>>> On 05.06.2014 23:59, John Baldwin wrote: >>>>> On Thursday, June 05, 2014 3:46:15 pm Alexander V. Chernikov wrote: >>>>>> On 05.06.2014 18:09, John Baldwin wrote: >>>>>>> On Wednesday, June 04, 2014 3:16:59 pm Alexander V. Chernikov >>>>>>> wrote: >>>>>>>> On 04.06.2014 19:06, John Baldwin wrote: >>>>>>>>> On Monday, June 02, 2014 12:48:50 pm Konstantin Belousov wrote: >>>>>>>>>> On Mon, Jun 02, 2014 at 06:52:10PM +0400, Alexander V. >>>>>>>>>> Chernikov wrote: >>>>>>>>>>> Hello list! >>>>>>>>>>> >>>>>>>>>>> Currently init(8) uses group 1 which is root group. >>>>>>>>>>> Modifications of this group affects both kernel and >>>>>>>>>>> userland threads. >>>>>>>>>>> Additionally, such modifications are impossible, for >>>>>>>>>>> example, in >>>>>>> presence >>>>>>>>>>> of multi-queue NIC drivers (like igb or ixgbe) which binds >>>>>>>>>>> their >>>>> threads >>>>>>>>> to >>>>>>>>>>> particular cpus. >>>>>>>>>>> >>>>>>>>>>> Proposed change ("init_cpuset" loader tunable) permits >>>>>>>>>>> changing cpu >>>>>>>>>>> masks for >>>>>>>>>>> userland more easily. Restricting user processes to >>>>>>>>>>> migrate to/from >>>>> CPU >>>>>>>>>>> cores >>>>>>>>>>> used for network traffic processing is one of the cases. >>>>>>>>>>> >>>>>>>>>>> Phabricator: https://phabric.freebsd.org/D141 (the same >>>>>>>>>>> version >>>>> attached >>>>>>>>>>> inline) >>>>>>>>>>> >>>>>>>>>>> If there are no objections, I'll commit this next week. >>>>>>>>>> Why is the tunable needed ? >>>>>>>>> Because some people already depend on doing 'cpuset -l 0 -s >>>>>>>>> 1'. It is >>>>>>> also >>>>>>>>> documented in our manpages that processes start in cpuset 1 >>>>>>>>> by default >>>>> so >>>>>>>>> that you can use 'cpuset -l 0 -s 1' to move all processes, etc. >>>>>>>>> >>>>>>>>> For the stated problem (bound ithreads in drivers), I would >>>>>>>>> actually >>>>> like >>>>>>> to >>>>>>>>> fix ithreads that are bound to a specific CPU to create a >>>>>>>>> different >>>>> cpuset >>>>>>>>> instead so they don't conflict with set 1. >>>>>>>> Yes, this seem to be much better approach. >>>>>>>> Please take a look on the new patch (here or in the >>>>>>>> phabricator). >>>>>>>> Comments: >>>>>>>> >>>>>>>> Use different approach for modifyable root set: >>>>>>>> >>>>>>>> * Make sets in 0..15 internal >>>>>>>> * Define CPUSET_SET1 & CPUSET_ITHREAD in that range >>>>>>>> * Add cpuset_lookup_builtin() to retrieve such cpu sets by id >>>>>>>> * Create additional root set for ithreads >>>>>>>> * Use this set in ithread_create() >>>>>>>> * Change intr_setaffinity() to use cpuset_iroot (do we >>>>>>>> really need >>>>> this)? >>>>>>>> We can probably do the same for kprocs, but I'm unsure if we >>>>>>>> really need >>>>> it. >>>>>>> I imagined something a bit simpler. Just create a new set in >>>>> intr_event_bind >>>>>>> and bind the ithread to the new set. No need to have more >>>>>>> magic set ids, >>>>> etc. >>>>>> Well, we also have userland which can modify given changes via >>>>>> `cpuset >>>>>> -x', so we need to be able to add some more logic on set >>>>>> allocation/keeping. Additionally, we can try to do the same via >>>>>> `cpuset >>>>>> -t', so introducing something like cpuset_setIthread() and >>>>>> hooking into >>>>>> intr_event_bind() won't probably be enough. At least I can't >>>>>> think out a >>>>>> quick and easy way to do this. >>>>> cpuset -x calls intr_event_bind(). If you just do it there you >>>>> fix both >>>>> places. >>> Yup. I was wrong. This version seems to be simpler while doing the >>> right >>> thing. >> Hmm, I do think this patch is doing the right thing, but I'm >> worried you >> might have weird effects, e.g. I worry that because the 'intr' proc >> is in >> set 1 still, the thread masks will be (incorrectly) checked when >> changing > Well, as far as I understand we don't have any cpu sets bound to > program ("pid" search returns set of first thread). > Anyway, each original ithread mask is released via cpuset_rel() so > I'm a bit unsure about remaining thread masks. > Playing with cpuset and ixgbe driver, does not reveal given problem: > > 12:59 [0] test15# cpuset -g -s 1 > cpuset 1 mask: 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, > 16, 17, 18, 19, 20, 21, 22, 23 > 13:00 [0] test15# kldload ixgbe > 13:00 [0] test15# vmstat -i | grep ix > irq263: ix0:que 0 116 0 > irq264: ix0:que 1 108 0 > irq265: ix0:que 2 108 0 > irq266: ix0:que 3 108 0 > irq267: ix0:que 4 108 0 > irq268: ix0:que 5 108 0 > irq269: ix0:que 6 108 0 > irq270: ix0:que 7 108 0 > ... > 13:00 [0] test15# cpuset -g -x 263 > irq 263 mask: 0 > 13:00 [0] test15# cpuset -g -x 264 > irq 264 mask: 1 > 13:00 [0] test15# cpuset -l 21,22,23 -s 1 > 13:01 [0] test15# cpuset -g -s 1 > cpuset 1 mask: 21, 22, 23 > 13:01 [0] test15# cpuset -g -x 263 > irq 263 mask: 0 > 13:01 [0] test15# cpuset -g -x 264 > irq 264 mask: 1 > 13:01 [0] test15# cpuset -l 12 -x 264 > 13:01 [0] test15# cpuset -g -x 264 > irq 264 mask: 12 > 13:01 [0] test15# cpuset -l 22,23 -s 1 > 13:01 [0] test15# echo $? > 0 > 13:01 [0] test15# cpuset -g -x 264 > irq 264 mask: 12 > 13:01 [0] test15# cpuset -g -s 1 > cpuset 1 mask: 22, 23 > > >> set 1 and you still won't be able to 'cpuset -l 0 -s 1'. >> >> Also, strictly speaking, it would probably be best to return >> threads to >> set 1 when NOCPU is passed to intr_event_bind() to unbind an >> interrupt. > Yup. in a quick look I didn't see anywhere that checked if the old set and new set are the same set.. > > > _______________________________________________ > freebsd-hackers@freebsd.org mailing list > http://lists.freebsd.org/mailman/listinfo/freebsd-hackers > To unsubscribe, send any mail to "freebsd-hackers-unsubscribe@freebsd.org"