Date: Sat, 07 Jun 2014 02:09:13 +0400 From: "Alexander V. Chernikov" <melifaro@FreeBSD.org> To: John Baldwin <jhb@freebsd.org> Cc: Konstantin Belousov <kostikbel@gmail.com>, freebsd-hackers@freebsd.org, hackers@freebsd.org Subject: Re: Permit init(8) use its own cpuset group. Message-ID: <53923C09.6020302@FreeBSD.org> In-Reply-To: <5390DB64.6010704@FreeBSD.org> References: <538C8F9A.4020301@FreeBSD.org> <201406051009.59432.jhb@freebsd.org> <5390C907.1070405@FreeBSD.org> <201406051559.11274.jhb@freebsd.org> <5390DB64.6010704@FreeBSD.org>
next in thread | previous in thread | raw e-mail | index | archive | help
[-- Attachment #1 --]
On 06.06.2014 01:04, Alexander V. Chernikov wrote:
> On 05.06.2014 23:59, John Baldwin wrote:
>> On Thursday, June 05, 2014 3:46:15 pm Alexander V. Chernikov wrote:
>>> On 05.06.2014 18:09, John Baldwin wrote:
>>>> On Wednesday, June 04, 2014 3:16:59 pm Alexander V. Chernikov wrote:
>>>>> On 04.06.2014 19:06, John Baldwin wrote:
>>>>>> On Monday, June 02, 2014 12:48:50 pm Konstantin Belousov wrote:
>>>>>>> On Mon, Jun 02, 2014 at 06:52:10PM +0400, Alexander V. Chernikov wrote:
>>>>>>>> Hello list!
>>>>>>>>
>>>>>>>> Currently init(8) uses group 1 which is root group.
>>>>>>>> Modifications of this group affects both kernel and userland threads.
>>>>>>>> Additionally, such modifications are impossible, for example, in
>>>> presence
>>>>>>>> of multi-queue NIC drivers (like igb or ixgbe) which binds their
>> threads
>>>>>> to
>>>>>>>> particular cpus.
>>>>>>>>
>>>>>>>> Proposed change ("init_cpuset" loader tunable) permits changing cpu
>>>>>>>> masks for
>>>>>>>> userland more easily. Restricting user processes to migrate to/from
>> CPU
>>>>>>>> cores
>>>>>>>> used for network traffic processing is one of the cases.
>>>>>>>>
>>>>>>>> Phabricator: https://phabric.freebsd.org/D141 (the same version
>> attached
>>>>>>>> inline)
>>>>>>>>
>>>>>>>> If there are no objections, I'll commit this next week.
>>>>>>> Why is the tunable needed ?
>>>>>> Because some people already depend on doing 'cpuset -l 0 -s 1'. It is
>>>> also
>>>>>> documented in our manpages that processes start in cpuset 1 by default
>> so
>>>>>> that you can use 'cpuset -l 0 -s 1' to move all processes, etc.
>>>>>>
>>>>>> For the stated problem (bound ithreads in drivers), I would actually
>> like
>>>> to
>>>>>> fix ithreads that are bound to a specific CPU to create a different
>> cpuset
>>>>>> instead so they don't conflict with set 1.
>>>>> Yes, this seem to be much better approach.
>>>>> Please take a look on the new patch (here or in the phabricator).
>>>>> Comments:
>>>>>
>>>>> Use different approach for modifyable root set:
>>>>>
>>>>> * Make sets in 0..15 internal
>>>>> * Define CPUSET_SET1 & CPUSET_ITHREAD in that range
>>>>> * Add cpuset_lookup_builtin() to retrieve such cpu sets by id
>>>>> * Create additional root set for ithreads
>>>>> * Use this set in ithread_create()
>>>>> * Change intr_setaffinity() to use cpuset_iroot (do we really need
>> this)?
>>>>>
>>>>> We can probably do the same for kprocs, but I'm unsure if we really need
>> it.
>>>>
>>>> I imagined something a bit simpler. Just create a new set in
>> intr_event_bind
>>>> and bind the ithread to the new set. No need to have more magic set ids,
>> etc.
>>> Well, we also have userland which can modify given changes via `cpuset
>>> -x', so we need to be able to add some more logic on set
>>> allocation/keeping. Additionally, we can try to do the same via `cpuset
>>> -t', so introducing something like cpuset_setIthread() and hooking into
>>> intr_event_bind() won't probably be enough. At least I can't think out a
>>> quick and easy way to do this.
>>
>> cpuset -x calls intr_event_bind(). If you just do it there you fix both
>> places.
Yup. I was wrong. This version seems to be simpler while doing the right
thing.
> 1:04 [0] ra# procstat -t 12 | grep irq275
> 12 100121 intr irq275: ix0:qu1 2 127 wait -
> 1:04 [0] ra# cpuset -g -x 275
> irq 275 mask: 2
> 1:04 [0] ra# cpuset -g -t 100121
> tid 100121 mask: 2
> 1:04 [0] ra# cpuset -l 3 -t 100121
> -------------------------^^^------
> 1:05 [0] ra# cpuset -g -t 100121
> tid 100121 mask: 3
>
>>
>>>> That also means that an ithread that isn't bound to a specific CPU via
>> either
>>>> 'cpuset -x' or BUS_BIND_INTR() will honor 'cpuset -s 1' like other
>>>> kernel processes. I think that's probably fine and sensible. The issue
>> is
>>> Well, it is questionable. Kernel threads are a bit different in terms of
>>> TLB changes, memory working set and so on. (Personally I'd prefer to
>>> separate user / kthreads / ithreads to different sets in HEAD but that's
>>> another story).
>>>
>>> Anyway, we probably can (and should) MFC a bit different version which
>>> tries to change several sets at once if user supplied set 1 as argument.
>>
>> No, I don't think we need umpteen special sets. I think we just need to fix
>> this one specific case of bound ithreads and everything else will work fine.
>> If someone wants to move kprocs out of set 1, they can already do that with
>> the existing tools via cpuset -C, etc.
>>
>
[-- Attachment #2 --]
Index: sys/kern/kern_cpuset.c
===================================================================
--- sys/kern/kern_cpuset.c
+++ sys/kern/kern_cpuset.c
@@ -716,6 +716,66 @@
}
/*
+ * Apply new cpumask to the ithread.
+ */
+int
+cpuset_setithread(lwpid_t id, cpuset_t *mask)
+{
+ struct cpuset *nset, *rset;
+ struct cpuset *parent, *old_set;
+ struct thread *td;
+ struct proc *p;
+ cpusetid_t cs_id;
+ int error;
+
+ nset = uma_zalloc(cpuset_zone, M_WAITOK);
+ rset = uma_zalloc(cpuset_zone, M_WAITOK);
+ error = cpuset_which(CPU_WHICH_TID, id, &p, &td, &old_set);
+ if (((cs_id = alloc_unr(cpuset_unr)) == CPUSET_INVALID) || error != 0)
+ goto out;
+ thread_lock(td);
+ old_set = td->td_cpuset;
+ if (old_set->cs_id == 1 || (old_set->cs_id == CPUSET_INVALID &&
+ old_set->cs_parent->cs_id == 1)) {
+ /* Default mask, we need to use new root set */
+ error = _cpuset_create(rset, cpuset_zero,
+ &cpuset_zero->cs_mask, cs_id);
+ if (error != 0) {
+ PROC_UNLOCK(p);
+ goto out;
+ }
+ rset->cs_flags |= CPU_SET_ROOT;
+ parent = rset;
+ rset = NULL;
+ cs_id = CPUSET_INVALID;
+ } else {
+ /* Assume existing set was already allocated by previous call */
+ parent = td->td_cpuset;
+ old_set = NULL;
+ }
+
+ error = cpuset_shadow(parent, nset, mask);
+ if (error == 0) {
+ td->td_cpuset = nset;
+ sched_affinity(td);
+ nset = NULL;
+ }
+ thread_unlock(td);
+ PROC_UNLOCK(p);
+ if (old_set != NULL)
+ cpuset_rel(old_set);
+out:
+ if (nset != NULL)
+ uma_zfree(cpuset_zone, nset);
+ if (rset != NULL)
+ uma_zfree(cpuset_zone, rset);
+ if (cs_id != CPUSET_INVALID)
+ free_unr(cpuset_unr, cs_id);
+ return (error);
+}
+
+
+/*
* Creates the cpuset for thread0. We make two sets:
*
* 0 - The root set which should represent all valid processors in the
Index: sys/kern/kern_intr.c
===================================================================
--- sys/kern/kern_intr.c
+++ sys/kern/kern_intr.c
@@ -323,7 +323,7 @@
CPU_SET(cpu, &mask);
id = ie->ie_thread->it_thread->td_tid;
mtx_unlock(&ie->ie_lock);
- error = cpuset_setthread(id, &mask);
+ error = cpuset_setithread(id, &mask);
if (error)
return (error);
} else
@@ -339,7 +339,7 @@
CPU_SET(ie->ie_cpu, &mask);
id = ie->ie_thread->it_thread->td_tid;
mtx_unlock(&ie->ie_lock);
- (void)cpuset_setthread(id, &mask);
+ (void)cpuset_setithread(id, &mask);
} else
mtx_unlock(&ie->ie_lock);
return (error);
Index: sys/sys/cpuset.h
===================================================================
--- sys/sys/cpuset.h
+++ sys/sys/cpuset.h
@@ -118,6 +118,7 @@
struct cpuset *cpuset_ref(struct cpuset *);
void cpuset_rel(struct cpuset *);
int cpuset_setthread(lwpid_t id, cpuset_t *);
+int cpuset_setithread(lwpid_t id, cpuset_t *);
int cpuset_create_root(struct prison *, struct cpuset **);
int cpuset_setproc_update_set(struct proc *, struct cpuset *);
char *cpusetobj_strprint(char *, const cpuset_t *);
Want to link to this message? Use this URL: <https://mail-archive.FreeBSD.org/cgi/mid.cgi?53923C09.6020302>
