Date: Tue, 03 Jun 2014 22:13:33 +0400 From: "Alexander V. Chernikov" <melifaro@FreeBSD.org> To: Konstantin Belousov <kostikbel@gmail.com> Cc: hackers@freebsd.org Subject: Re: Permit init(8) use its own cpuset group. Message-ID: <538E104D.3000904@FreeBSD.org> In-Reply-To: <20140602164850.GS3991@kib.kiev.ua> References: <538C8F9A.4020301@FreeBSD.org> <20140602164850.GS3991@kib.kiev.ua>
next in thread | previous in thread | raw e-mail | index | archive | help
This is a multi-part message in MIME format.
--------------040501050600090800020807
Content-Type: text/plain; charset=ISO-8859-1; format=flowed
Content-Transfer-Encoding: 7bit
On 02.06.2014 20:48, Konstantin Belousov wrote:
> On Mon, Jun 02, 2014 at 06:52:10PM +0400, Alexander V. Chernikov wrote:
>> Hello list!
>>
>> Currently init(8) uses group 1 which is root group.
>> Modifications of this group affects both kernel and userland threads.
>> Additionally, such modifications are impossible, for example, in presence
>> of multi-queue NIC drivers (like igb or ixgbe) which binds their threads to
>> particular cpus.
>>
>> Proposed change ("init_cpuset" loader tunable) permits changing cpu
>> masks for
>> userland more easily. Restricting user processes to migrate to/from CPU
>> cores
>> used for network traffic processing is one of the cases.
>>
>> Phabricator: https://phabric.freebsd.org/D141 (the same version attached
>> inline)
>>
>> If there are no objections, I'll commit this next week.
> Why is the tunable needed ? IMO it is more reasonable to create a new
> cpuset always, at least I cannot explain why the existing behaviour
Initially I've planned to make this behavior to be default. However I
though someone
will complain on changing defaults. That's why I've introduced the
tunable for that.
> is useful. If creating new cpuset unconditionally, you could consider
> doing it in kernel in start_init(). You would also need to update the
> cpuset(2) man page, which states that cpuset 1 is set for all processes.
Please take a look on the new version doing this (also updated in
phabricator,
https://phabric.freebsd.org/D141 ).
>
> Still, see comments below for the patch.
>
>> Index: sbin/init/init.c
>> ===================================================================
>> --- sbin/init/init.c (revision 266306)
>> +++ sbin/init/init.c (working copy)
>> @@ -47,6 +47,8 @@ static const char rcsid[] =
>> #include <sys/param.h>
>> #include <sys/ioctl.h>
>> #include <sys/mount.h>
>> +#include <sys/param.h>
>> +#include <sys/cpuset.h>
>> #include <sys/sysctl.h>
>> #include <sys/wait.h>
>> #include <sys/stat.h>
>> @@ -320,6 +322,19 @@ invalid:
>> warning("Can't chroot to %s: %m", kenv_value);
>> }
>>
>> + if (kenv(KENV_GET, "init_cpuset", kenv_value, sizeof(kenv_value)) > 0) {
>> + if (getpid() == 1) {
> If you use the && operator for conditionals instead of two if()s, the nesting
> level of the block can be reduced.
>
>> + cpusetid_t setid;
> The variable must be declared at the function start.
>
>> +
>> + setid = -1;
> Why do you need to initialize the variable ?
>
>> + if (cpuset(&setid) != 0) {
>> + warning("cpu set alloc failed: %m");
>> + } else {
>> + if (cpuset_setid(CPU_WHICH_PID, 1, setid) != 0)
> This could be else if () again to reduce indentation.
>
>> + warning("cpuset_setsid failed: %m");
>> + }
>> + }
>> + }
>> /*
>> * Additional check if devfs needs to be mounted:
>> * If "/" and "/dev" have the same device number,
>> _______________________________________________
>> freebsd-hackers@freebsd.org mailing list
>> http://lists.freebsd.org/mailman/listinfo/freebsd-hackers
>> To unsubscribe, send any mail to "freebsd-hackers-unsubscribe@freebsd.org"
--------------040501050600090800020807
Content-Type: text/x-patch;
name="D141.diff"
Content-Transfer-Encoding: 7bit
Content-Disposition: attachment;
filename="D141.diff"
Index: lib/libc/sys/cpuset.2
===================================================================
--- lib/libc/sys/cpuset.2
+++ lib/libc/sys/cpuset.2
@@ -25,7 +25,7 @@
.\"
.\" $FreeBSD$
.\"
-.Dd February 14, 2014
+.Dd June 03, 2014
.Dt CPUSET 2
.Os
.Sh NAME
@@ -52,7 +52,9 @@
Processor sets contain lists of CPUs that members may run on and exist only
as long as some process is a member of the set.
All processes in the system have an assigned set.
-The default set for all processes in the system is the set numbered 1.
+Kernel and userland processes uses different ones.
+Kernel processes are started in set 1 while userland processes are started
+in set 2.
Threads belong to the same set as the process which contains them,
however, they may further restrict their set with the anonymous
per-thread mask.
Index: sys/kern/init_main.c
===================================================================
--- sys/kern/init_main.c
+++ sys/kern/init_main.c
@@ -702,6 +702,7 @@
GIANT_REQUIRED;
td = curthread;
+ td->td_cpuset = cpuset_pid1();
p = td->td_proc;
vfs_mountroot();
Index: sys/kern/kern_cpuset.c
===================================================================
--- sys/kern/kern_cpuset.c
+++ sys/kern/kern_cpuset.c
@@ -722,7 +722,7 @@
* system. It is initially created with a mask of all processors
* because we don't know what processors are valid until cpuset_init()
* runs. This set is immutable.
- * 1 - The default set which all processes are a member of until changed.
+ * 1 - The default set which all kernel processes are a member of until changed.
* This allows an administrator to move all threads off of given cpus to
* dedicate them to high priority tasks or save power etc.
*/
@@ -752,16 +752,38 @@
*/
set = uma_zalloc(cpuset_zone, M_WAITOK);
error = _cpuset_create(set, cpuset_zero, &cpuset_zero->cs_mask, 1);
- KASSERT(error == 0, ("Error creating default set: %d\n", error));
+ KASSERT(error == 0, ("Error creating default kernel cpuset: %d\n",
+ error));
/*
* Initialize the unit allocator. 0 and 1 are allocated above.
+ * Set 2 is reserved for init.
*/
- cpuset_unr = new_unrhdr(2, INT_MAX, NULL);
+ cpuset_unr = new_unrhdr(3, INT_MAX, NULL);
return (set);
}
/*
+ * Creates another cpuset for init process.
+ *
+ * 2 - The default set which all userland processes are a member of until
+ * changed. This allows an administrator to move all threads off of given
+ * cpus to dedicate them to high priority tasks or save power etc.
+ */
+struct cpuset *
+cpuset_pid1(void)
+{
+ struct cpuset *set;
+ int error;
+
+ set = uma_zalloc(cpuset_zone, M_WAITOK);
+ error = _cpuset_create(set, cpuset_zero, &cpuset_zero->cs_mask, 2);
+ KASSERT(error == 0, ("Error creating default userland cpuset: %d\n",
+ error));
+ return (set);
+}
+
+/*
* Create a cpuset, which would be cpuset_create() but
* mark the new 'set' as root.
*
Index: sys/sys/cpuset.h
===================================================================
--- sys/sys/cpuset.h
+++ sys/sys/cpuset.h
@@ -115,6 +115,7 @@
struct proc;
struct cpuset *cpuset_thread0(void);
+struct cpuset *cpuset_pid1(void);
struct cpuset *cpuset_ref(struct cpuset *);
void cpuset_rel(struct cpuset *);
int cpuset_setthread(lwpid_t id, cpuset_t *);
Index: usr.bin/cpuset/cpuset.1
===================================================================
--- usr.bin/cpuset/cpuset.1
+++ usr.bin/cpuset/cpuset.1
@@ -25,7 +25,7 @@
.\"
.\" $FreeBSD$
.\"
-.Dd January 14, 2011
+.Dd June 03, 2014
.Dt CPUSET 1
.Os
.Sh NAME
@@ -81,7 +81,9 @@
.Pp
There are two sets applicable to each process and one private mask per thread.
Every process in the system belongs to a cpuset.
-By default processes are started in set 1.
+Kernel and userland processes uses different default sets.
+Kernel processes are started in set 1.
+Userland processes are started in set 2.
The mask or id may be queried using
.Fl c .
Each thread also has a private mask of CPUs it is allowed to run
@@ -163,9 +165,9 @@
belongs to restricting it to CPUs 0 and 2:
.Dl cpuset -l 0,2 -c -p <sh pid>
.Pp
-Modify the cpuset all threads are in by default to contain only
+Modify the cpuset all userland threads are in by default to contain only
the first 4 CPUs, leaving the rest idle:
-.Dl cpuset -l 0-3 -s 1
+.Dl cpuset -l 0-3 -s 2
.Pp
Print the id of the cpuset
.Pa /bin/sh
--------------040501050600090800020807--
Want to link to this message? Use this URL: <https://mail-archive.FreeBSD.org/cgi/mid.cgi?538E104D.3000904>
