Date: Thu, 12 Jun 2014 11:57:54 +0800 From: Julian Elischer <julian@freebsd.org> To: "Alexander V. Chernikov" <melifaro@FreeBSD.org>, John Baldwin <jhb@freebsd.org> Cc: Konstantin Belousov <kostikbel@gmail.com>, freebsd-hackers@freebsd.org, hackers@freebsd.org Subject: Re: Permit init(8) use its own cpuset group. Message-ID: <53992542.3080908@freebsd.org> In-Reply-To: <53988933.5010902@FreeBSD.org> References: <538C8F9A.4020301@FreeBSD.org> <5390DB64.6010704@FreeBSD.org> <53923C09.6020302@FreeBSD.org> <201406091343.39584.jhb@freebsd.org> <53988933.5010902@FreeBSD.org>
next in thread | previous in thread | raw e-mail | index | archive | help
On 6/12/14, 12:52 AM, Alexander V. Chernikov wrote: > On 09.06.2014 21:43, John Baldwin wrote: >> On Friday, June 06, 2014 6:09:13 pm Alexander V. Chernikov wrote: >>> On 06.06.2014 01:04, Alexander V. Chernikov wrote: >>>> On 05.06.2014 23:59, John Baldwin wrote: >>>>> On Thursday, June 05, 2014 3:46:15 pm Alexander V. Chernikov wrote: >>>>>> On 05.06.2014 18:09, John Baldwin wrote: >>>>>>> On Wednesday, June 04, 2014 3:16:59 pm Alexander V. Chernikov >>>>>>> wrote: >>>>>>>> On 04.06.2014 19:06, John Baldwin wrote: >>>>>>>>> On Monday, June 02, 2014 12:48:50 pm Konstantin Belousov wrote: >>>>>>>>>> On Mon, Jun 02, 2014 at 06:52:10PM +0400, Alexander V. >>>>>>>>>> Chernikov wrote: >>>>>>>>>>> Hello list! >>>>>>>>>>> >>>>>>>>>>> Currently init(8) uses group 1 which is root group. >>>>>>>>>>> Modifications of this group affects both kernel and >>>>>>>>>>> userland threads. >>>>>>>>>>> Additionally, such modifications are impossible, for >>>>>>>>>>> example, in >>>>>>> presence >>>>>>>>>>> of multi-queue NIC drivers (like igb or ixgbe) which binds >>>>>>>>>>> their >>>>> threads >>>>>>>>> to >>>>>>>>>>> particular cpus. >>>>>>>>>>> >>>>>>>>>>> Proposed change ("init_cpuset" loader tunable) permits >>>>>>>>>>> changing cpu >>>>>>>>>>> masks for >>>>>>>>>>> userland more easily. Restricting user processes to >>>>>>>>>>> migrate to/from >>>>> CPU >>>>>>>>>>> cores >>>>>>>>>>> used for network traffic processing is one of the cases. >>>>>>>>>>> >>>>>>>>>>> Phabricator: https://phabric.freebsd.org/D141 (the same >>>>>>>>>>> version >>>>> attached >>>>>>>>>>> inline) >>>>>>>>>>> >>>>>>>>>>> If there are no objections, I'll commit this next week. >>>>>>>>>> Why is the tunable needed ? >>>>>>>>> Because some people already depend on doing 'cpuset -l 0 -s >>>>>>>>> 1'. It is >>>>>>> also >>>>>>>>> documented in our manpages that processes start in cpuset 1 >>>>>>>>> by default >>>>> so >>>>>>>>> that you can use 'cpuset -l 0 -s 1' to move all processes, etc. >>>>>>>>> >>>>>>>>> For the stated problem (bound ithreads in drivers), I would >>>>>>>>> actually >>>>> like >>>>>>> to >>>>>>>>> fix ithreads that are bound to a specific CPU to create a >>>>>>>>> different >>>>> cpuset >>>>>>>>> instead so they don't conflict with set 1. >>>>>>>> Yes, this seem to be much better approach. >>>>>>>> Please take a look on the new patch (here or in the >>>>>>>> phabricator). >>>>>>>> Comments: >>>>>>>> >>>>>>>> Use different approach for modifyable root set: >>>>>>>> >>>>>>>> * Make sets in 0..15 internal >>>>>>>> * Define CPUSET_SET1 & CPUSET_ITHREAD in that range >>>>>>>> * Add cpuset_lookup_builtin() to retrieve such cpu sets by id >>>>>>>> * Create additional root set for ithreads >>>>>>>> * Use this set in ithread_create() >>>>>>>> * Change intr_setaffinity() to use cpuset_iroot (do we >>>>>>>> really need >>>>> this)? >>>>>>>> We can probably do the same for kprocs, but I'm unsure if we >>>>>>>> really need >>>>> it. >>>>>>> I imagined something a bit simpler. Just create a new set in >>>>> intr_event_bind >>>>>>> and bind the ithread to the new set. No need to have more >>>>>>> magic set ids, >>>>> etc. >>>>>> Well, we also have userland which can modify given changes via >>>>>> `cpuset >>>>>> -x', so we need to be able to add some more logic on set >>>>>> allocation/keeping. Additionally, we can try to do the same via >>>>>> `cpuset >>>>>> -t', so introducing something like cpuset_setIthread() and >>>>>> hooking into >>>>>> intr_event_bind() won't probably be enough. At least I can't >>>>>> think out a >>>>>> quick and easy way to do this. >>>>> cpuset -x calls intr_event_bind(). If you just do it there you >>>>> fix both >>>>> places. >>> Yup. I was wrong. This version seems to be simpler while doing the >>> right >>> thing. >> Hmm, I do think this patch is doing the right thing, but I'm >> worried you >> might have weird effects, e.g. I worry that because the 'intr' proc >> is in >> set 1 still, the thread masks will be (incorrectly) checked when >> changing > Well, as far as I understand we don't have any cpu sets bound to > program ("pid" search returns set of first thread). > Anyway, each original ithread mask is released via cpuset_rel() so > I'm a bit unsure about remaining thread masks. > Playing with cpuset and ixgbe driver, does not reveal given problem: > > 12:59 [0] test15# cpuset -g -s 1 > cpuset 1 mask: 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, > 16, 17, 18, 19, 20, 21, 22, 23 > 13:00 [0] test15# kldload ixgbe > 13:00 [0] test15# vmstat -i | grep ix > irq263: ix0:que 0 116 0 > irq264: ix0:que 1 108 0 > irq265: ix0:que 2 108 0 > irq266: ix0:que 3 108 0 > irq267: ix0:que 4 108 0 > irq268: ix0:que 5 108 0 > irq269: ix0:que 6 108 0 > irq270: ix0:que 7 108 0 > ... > 13:00 [0] test15# cpuset -g -x 263 > irq 263 mask: 0 > 13:00 [0] test15# cpuset -g -x 264 > irq 264 mask: 1 > 13:00 [0] test15# cpuset -l 21,22,23 -s 1 > 13:01 [0] test15# cpuset -g -s 1 > cpuset 1 mask: 21, 22, 23 > 13:01 [0] test15# cpuset -g -x 263 > irq 263 mask: 0 > 13:01 [0] test15# cpuset -g -x 264 > irq 264 mask: 1 > 13:01 [0] test15# cpuset -l 12 -x 264 > 13:01 [0] test15# cpuset -g -x 264 > irq 264 mask: 12 > 13:01 [0] test15# cpuset -l 22,23 -s 1 > 13:01 [0] test15# echo $? > 0 > 13:01 [0] test15# cpuset -g -x 264 > irq 264 mask: 12 > 13:01 [0] test15# cpuset -g -s 1 > cpuset 1 mask: 22, 23 > > >> set 1 and you still won't be able to 'cpuset -l 0 -s 1'. >> >> Also, strictly speaking, it would probably be best to return >> threads to >> set 1 when NOCPU is passed to intr_event_bind() to unbind an >> interrupt. > Yup. in a quick look I didn't see anywhere that checked if the old set and new set are the same set.. > > > _______________________________________________ > freebsd-hackers@freebsd.org mailing list > http://lists.freebsd.org/mailman/listinfo/freebsd-hackers > To unsubscribe, send any mail to "freebsd-hackers-unsubscribe@freebsd.org"
Want to link to this message? Use this URL: <https://mail-archive.FreeBSD.org/cgi/mid.cgi?53992542.3080908>