Skip site navigation (1)Skip section navigation (2)
Date:      Sat, 07 Jun 2014 02:09:13 +0400
From:      "Alexander V. Chernikov" <melifaro@FreeBSD.org>
To:        John Baldwin <jhb@freebsd.org>
Cc:        Konstantin Belousov <kostikbel@gmail.com>, freebsd-hackers@freebsd.org, hackers@freebsd.org
Subject:   Re: Permit init(8) use its own cpuset group.
Message-ID:  <53923C09.6020302@FreeBSD.org>
In-Reply-To: <5390DB64.6010704@FreeBSD.org>
References:  <538C8F9A.4020301@FreeBSD.org> <201406051009.59432.jhb@freebsd.org> <5390C907.1070405@FreeBSD.org> <201406051559.11274.jhb@freebsd.org> <5390DB64.6010704@FreeBSD.org>

next in thread | previous in thread | raw e-mail | index | archive | help
This is a multi-part message in MIME format.
--------------000900090005030503080000
Content-Type: text/plain; charset=ISO-8859-15
Content-Transfer-Encoding: 7bit

On 06.06.2014 01:04, Alexander V. Chernikov wrote:
> On 05.06.2014 23:59, John Baldwin wrote:
>> On Thursday, June 05, 2014 3:46:15 pm Alexander V. Chernikov wrote:
>>> On 05.06.2014 18:09, John Baldwin wrote:
>>>> On Wednesday, June 04, 2014 3:16:59 pm Alexander V. Chernikov wrote:
>>>>> On 04.06.2014 19:06, John Baldwin wrote:
>>>>>> On Monday, June 02, 2014 12:48:50 pm Konstantin Belousov wrote:
>>>>>>> On Mon, Jun 02, 2014 at 06:52:10PM +0400, Alexander V. Chernikov wrote:
>>>>>>>> Hello list!
>>>>>>>>
>>>>>>>> Currently init(8) uses group 1 which is root group.
>>>>>>>> Modifications of this group affects both kernel and userland threads.
>>>>>>>> Additionally, such modifications are impossible, for example, in 
>>>> presence
>>>>>>>> of multi-queue NIC drivers (like igb or ixgbe) which binds their 
>> threads
>>>>>> to
>>>>>>>> particular cpus.
>>>>>>>>
>>>>>>>> Proposed change ("init_cpuset" loader tunable) permits changing cpu
>>>>>>>> masks for
>>>>>>>> userland more easily. Restricting user processes to migrate to/from 
>> CPU
>>>>>>>> cores
>>>>>>>> used for network traffic processing is one of the cases.
>>>>>>>>
>>>>>>>> Phabricator: https://phabric.freebsd.org/D141 (the same version 
>> attached
>>>>>>>> inline)
>>>>>>>>
>>>>>>>> If there are no objections, I'll commit this next week.
>>>>>>> Why is the tunable needed ?
>>>>>> Because some people already depend on doing 'cpuset -l 0 -s 1'.  It is 
>>>> also
>>>>>> documented in our manpages that processes start in cpuset 1 by default 
>> so
>>>>>> that you can use 'cpuset -l 0 -s 1' to move all processes, etc.
>>>>>>
>>>>>> For the stated problem (bound ithreads in drivers), I would actually 
>> like 
>>>> to
>>>>>> fix ithreads that are bound to a specific CPU to create a different 
>> cpuset
>>>>>> instead so they don't conflict with set 1.
>>>>> Yes, this seem to be much better approach.
>>>>> Please take a look on the new patch (here or in the phabricator).
>>>>> Comments:
>>>>>
>>>>> Use different approach for modifyable root set:
>>>>>
>>>>>   * Make sets in 0..15 internal
>>>>>   * Define CPUSET_SET1 & CPUSET_ITHREAD in that range
>>>>>   * Add cpuset_lookup_builtin() to retrieve such cpu sets by id
>>>>>   * Create additional root set for ithreads
>>>>>   * Use this set in ithread_create()
>>>>>   * Change intr_setaffinity() to use cpuset_iroot (do we really need 
>> this)?
>>>>>
>>>>> We can probably do the same for kprocs, but I'm unsure if we really need 
>> it.
>>>>
>>>> I imagined something a bit simpler.  Just create a new set in 
>> intr_event_bind
>>>> and bind the ithread to the new set.  No need to have more magic set ids, 
>> etc. 
>>> Well, we also have userland which can modify given changes via `cpuset
>>> -x', so we need to be able to add some more logic on set
>>> allocation/keeping. Additionally, we can try to do the same via `cpuset
>>> -t', so introducing something like cpuset_setIthread() and hooking into
>>> intr_event_bind() won't probably be enough. At least I can't think out a
>>> quick and easy way to do this.
>>
>> cpuset -x calls intr_event_bind().  If you just do it there you fix both 
>> places.
Yup. I was wrong. This version seems to be simpler while doing the right
thing.

> 1:04 [0] ra# procstat -t 12 | grep irq275
>    12 100121 intr             irq275: ix0:qu1    2  127 wait    -
> 1:04 [0] ra# cpuset -g -x 275
> irq 275 mask: 2
> 1:04 [0] ra# cpuset -g -t 100121
> tid 100121 mask: 2
> 1:04 [0] ra# cpuset -l 3 -t 100121
> -------------------------^^^------
> 1:05 [0] ra# cpuset -g -t 100121
> tid 100121 mask: 3
> 
>>
>>>> That also means that an ithread that isn't bound to a specific CPU via 
>> either 
>>>> 'cpuset -x' or BUS_BIND_INTR() will honor 'cpuset -s 1' like other
>>>> kernel processes.  I think that's probably fine and sensible.  The issue 
>> is
>>> Well, it is questionable. Kernel threads are a bit different in terms of
>>> TLB changes, memory working set and so on. (Personally I'd prefer to
>>> separate user / kthreads / ithreads to different sets in HEAD but that's
>>> another story).
>>>
>>> Anyway, we probably can (and should) MFC a bit different version which
>>> tries to change several sets at once if user supplied set 1 as argument.
>>
>> No, I don't think we need umpteen special sets.  I think we just need to fix
>> this one specific case of bound ithreads and everything else will work fine.  
>> If someone wants to move kprocs out of set 1, they can already do that with 
>> the existing tools via cpuset -C, etc.
>>
> 


--------------000900090005030503080000
Content-Type: text/x-patch;
 name="D141_6.diff"
Content-Transfer-Encoding: 7bit
Content-Disposition: attachment;
 filename="D141_6.diff"

Index: sys/kern/kern_cpuset.c
===================================================================
--- sys/kern/kern_cpuset.c
+++ sys/kern/kern_cpuset.c
@@ -716,6 +716,66 @@
 }
 
 /*
+ * Apply new cpumask to the ithread.
+ */
+int
+cpuset_setithread(lwpid_t id, cpuset_t *mask)
+{
+	struct cpuset *nset, *rset;
+	struct cpuset *parent, *old_set;
+	struct thread *td;
+	struct proc *p;
+	cpusetid_t cs_id;
+	int error;
+
+	nset = uma_zalloc(cpuset_zone, M_WAITOK);
+	rset = uma_zalloc(cpuset_zone, M_WAITOK);
+	error = cpuset_which(CPU_WHICH_TID, id, &p, &td, &old_set);
+	if (((cs_id = alloc_unr(cpuset_unr)) == CPUSET_INVALID) || error != 0)
+		goto out;
+	thread_lock(td);
+	old_set = td->td_cpuset;
+	if (old_set->cs_id == 1 || (old_set->cs_id == CPUSET_INVALID &&
+	    old_set->cs_parent->cs_id == 1)) {
+		/* Default mask, we need to use new root set */
+		error = _cpuset_create(rset, cpuset_zero,
+		    &cpuset_zero->cs_mask, cs_id);
+		if (error != 0) {
+			PROC_UNLOCK(p);
+			goto out;
+		}
+		rset->cs_flags |= CPU_SET_ROOT;
+		parent = rset;
+		rset = NULL;
+		cs_id = CPUSET_INVALID;
+	} else {
+		/* Assume existing set was already allocated by previous call */
+		parent = td->td_cpuset;
+		old_set = NULL;
+	}
+
+	error = cpuset_shadow(parent, nset, mask);
+	if (error == 0) {
+		td->td_cpuset = nset;
+		sched_affinity(td);
+		nset = NULL;
+	}
+	thread_unlock(td);
+	PROC_UNLOCK(p);
+	if (old_set != NULL)
+		cpuset_rel(old_set);
+out:
+	if (nset != NULL)
+		uma_zfree(cpuset_zone, nset);
+	if (rset != NULL)
+		uma_zfree(cpuset_zone, rset);
+	if (cs_id != CPUSET_INVALID)
+		free_unr(cpuset_unr, cs_id);
+	return (error);
+}
+
+
+/*
  * Creates the cpuset for thread0.  We make two sets:
  * 
  * 0 - The root set which should represent all valid processors in the
Index: sys/kern/kern_intr.c
===================================================================
--- sys/kern/kern_intr.c
+++ sys/kern/kern_intr.c
@@ -323,7 +323,7 @@
 			CPU_SET(cpu, &mask);
 		id = ie->ie_thread->it_thread->td_tid;
 		mtx_unlock(&ie->ie_lock);
-		error = cpuset_setthread(id, &mask);
+		error = cpuset_setithread(id, &mask);
 		if (error)
 			return (error);
 	} else
@@ -339,7 +339,7 @@
 				CPU_SET(ie->ie_cpu, &mask);
 			id = ie->ie_thread->it_thread->td_tid;
 			mtx_unlock(&ie->ie_lock);
-			(void)cpuset_setthread(id, &mask);
+			(void)cpuset_setithread(id, &mask);
 		} else
 			mtx_unlock(&ie->ie_lock);
 		return (error);
Index: sys/sys/cpuset.h
===================================================================
--- sys/sys/cpuset.h
+++ sys/sys/cpuset.h
@@ -118,6 +118,7 @@
 struct cpuset *cpuset_ref(struct cpuset *);
 void	cpuset_rel(struct cpuset *);
 int	cpuset_setthread(lwpid_t id, cpuset_t *);
+int	cpuset_setithread(lwpid_t id, cpuset_t *);
 int	cpuset_create_root(struct prison *, struct cpuset **);
 int	cpuset_setproc_update_set(struct proc *, struct cpuset *);
 char	*cpusetobj_strprint(char *, const cpuset_t *);

--------------000900090005030503080000--



Want to link to this message? Use this URL: <https://mail-archive.FreeBSD.org/cgi/mid.cgi?53923C09.6020302>