Date: Sun, 25 Mar 2018 08:07:27 +0200
From: "O. Hartmann" <ohartmann@walstatt.org>
To: Jeff Roberson <jeff@FreeBSD.org>
Cc: src-committers@freebsd.org, svn-src-all@freebsd.org, svn-src-head@freebsd.org
Subject: Re: svn commit: r331508 - in head: lib/libc/sys share/man/man9 usr.bin/cpuset
Message-ID: <20180325080754.4e169d9a@thor.intern.walstatt.dynvpn.de>
In-Reply-To: <201803242358.w2ONwiuu051354@repo.freebsd.org>
References: <201803242358.w2ONwiuu051354@repo.freebsd.org>
--Sig_/QZ9Mo8ZpX8oTgh.41WmBpJT
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: quoted-printable

On Sat, 24 Mar 2018 23:58:44 +0000 (UTC), Jeff Roberson <jeff@FreeBSD.org> wrote:

> Author: jeff
> Date: Sat Mar 24 23:58:44 2018
> New Revision: 331508
> URL: https://svnweb.freebsd.org/changeset/base/331508
>
> Log:
>   Document new NUMA related syscalls and utility options.
>
>   Sponsored by: Netflix, Dell/EMC Isilon
>
> Modified:
>   head/lib/libc/sys/Makefile.inc
>   head/lib/libc/sys/cpuset.2
>   head/lib/libc/sys/cpuset_getaffinity.2
>   head/share/man/man9/Makefile
>   head/share/man/man9/malloc.9
>   head/share/man/man9/zone.9
>   head/usr.bin/cpuset/cpuset.1
>
> Modified: head/lib/libc/sys/Makefile.inc
> ==============================================================================
> --- head/lib/libc/sys/Makefile.inc Sat Mar 24 23:26:54 2018 (r331507)
> +++ head/lib/libc/sys/Makefile.inc Sat Mar 24 23:58:44 2018 (r331508)
> @@ -174,6 +174,7 @@ MAN+= abort2.2 \
>  	connectat.2 \
>  	cpuset.2 \
>  	cpuset_getaffinity.2 \
> +	cpuset_getdomain.2 \
>  	dup.2 \
>  	execve.2 \
>  	_exit.2 \
> @@ -371,6 +372,7 @@ MLINKS+=nanosleep.2 clock_nanosleep.2
>  MLINKS+=cpuset.2 cpuset_getid.2 \
>  	cpuset.2 cpuset_setid.2
>  MLINKS+=cpuset_getaffinity.2 cpuset_setaffinity.2
> +MLINKS+=cpuset_getdomain.2 cpuset_setdomain.2
>  MLINKS+=dup.2 dup2.2
>  MLINKS+=execve.2 fexecve.2
>  MLINKS+=extattr_get_file.2 extattr.2 \
>
> Modified: head/lib/libc/sys/cpuset.2
> ==============================================================================
> --- head/lib/libc/sys/cpuset.2 Sat Mar 24 23:26:54 2018 (r331507)
> +++ head/lib/libc/sys/cpuset.2 Sat Mar 24 23:58:44 2018 (r331508)
> @@ -48,21 +48,21 @@
>  The
>  .Nm
>  family of system calls allow applications to control sets of processors and
> -assign processes and threads to these sets.
> -Processor sets contain lists of CPUs that members may run on and exist only
> -as long as some process is a member of the set.
> +memory domains and assign processes and threads to these sets.
> +Processor sets contain lists of CPUs and domains that members may run on
> +and exist only as long as some process is a member of the set.
>  All processes in the system have an assigned set.
>  The default set for all processes in the system is the set numbered 1.
>  Threads belong to the same set as the process which contains them,
>  however, they may further restrict their set with the anonymous
> -per-thread mask.
> +per-thread mask to bind to a specific CPU or subset of CPUs and memory domains.
>  .Pp
>  Sets are referenced by a number of type
>  .Ft cpuset_id_t .
>  Each thread has a root set, an assigned set, and an anonymous mask.
>  Only the root and assigned sets are numbered.
> -The root set is the set of all CPUs available in the system or in the
> -system partition the thread is running in.
> +The root set is the set of all CPUs and memory domains available in the system
> +or in the system partition the thread is running in.
>  The assigned set is a subset of the root set and is administratively
>  assignable on a per-process basis.
>  Many processes and threads may be members of a numbered set.
> @@ -72,7 +72,8 @@ set.
>  It is intended that administrators will manipulate numbered sets using
>  .Xr cpuset 1
>  while application developers will manipulate anonymous sets using
> -.Xr cpuset_setaffinity 2 .
> +.Xr cpuset_setaffinity 2 and
> +.Xr cpuset_setdomain 2 .
>  .Pp
>  To select the correct set a value of type
>  .Ft cpulevel_t
> @@ -175,9 +176,10 @@ with a process or thread is unsupported since
>  this references the unnumbered anonymous mask.
>  .Pp
>  The actual contents of the sets may be retrieved or manipulated using
> -.Xr cpuset_getaffinity 2
> -and
> -.Xr cpuset_setaffinity 2 .
> +.Xr cpuset_getaffinity 2 ,
> +.Xr cpuset_setaffinity 2 ,
> +.Xr cpuset_getdomain 2 , and
> +.Xr cpuset_setdomain 2 .
>  See those manual pages for more detail.
>  .Sh RETURN VALUES
>  .Rv -std
> @@ -220,6 +222,8 @@ for allocation.
>  .Xr cpuset 1 ,
>  .Xr cpuset_getaffinity 2 ,
>  .Xr cpuset_setaffinity 2 ,
> +.Xr cpuset_getdomain 2 ,
> +.Xr cpuset_setdomain 2 ,
>  .Xr pthread_affinity_np 3 ,
>  .Xr pthread_attr_affinity_np 3 ,
>  .Xr cpuset 9
>
> Modified: head/lib/libc/sys/cpuset_getaffinity.2
> ==============================================================================
> --- head/lib/libc/sys/cpuset_getaffinity.2 Sat Mar 24 23:26:54 2018 (r331507)
> +++ head/lib/libc/sys/cpuset_getaffinity.2 Sat Mar 24 23:58:44 2018 (r331508)
> @@ -160,6 +160,8 @@ See
>  .Xr cpuset 2 ,
>  .Xr cpuset_getid 2 ,
>  .Xr cpuset_setid 2 ,
> +.Xr cpuset_getdomain 2 ,
> +.Xr cpuset_setdomain 2 ,
>  .Xr pthread_affinity_np 3 ,
>  .Xr pthread_attr_affinity_np 3 ,
>  .Xr cpuset 9
>
> Modified: head/share/man/man9/Makefile
> ==============================================================================
> --- head/share/man/man9/Makefile Sat Mar 24 23:26:54 2018 (r331507)
> +++ head/share/man/man9/Makefile Sat Mar 24 23:58:44 2018 (r331508)
> @@ -1271,6 +1271,8 @@ MLINKS+=make_dev.9 destroy_dev.9 \
>  	make_dev.9 make_dev_p.9 \
>  	make_dev.9 make_dev_s.9
>  MLINKS+=malloc.9 free.9 \
> +	malloc.9 malloc_domain.9 \
> +	malloc.9 free_domain.9 \
>  	malloc.9 mallocarray.9 \
>  	malloc.9 MALLOC_DECLARE.9 \
>  	malloc.9 MALLOC_DEFINE.9 \
> @@ -2213,10 +2215,12 @@ MLINKS+=vslock.9 vsunlock.9
>  MLINKS+=zone.9 uma.9 \
>  	zone.9 uma_zalloc.9 \
>  	zone.9 uma_zalloc_arg.9 \
> +	zone.9 uma_zalloc_domain.9 \
>  	zone.9 uma_zcreate.9 \
>  	zone.9 uma_zdestroy.9 \
>  	zone.9 uma_zfree.9 \
>  	zone.9 uma_zfree_arg.9 \
> +	zone.9 uma_zfree_domain.9 \
>  	zone.9 uma_zone_get_cur.9 \
>  	zone.9 uma_zone_get_max.9 \
>  	zone.9 uma_zone_set_max.9 \
>
> Modified: head/share/man/man9/malloc.9
> ==============================================================================
> --- head/share/man/man9/malloc.9 Sat Mar 24 23:26:54 2018 (r331507)
> +++ head/share/man/man9/malloc.9 Sat Mar 24 23:58:44 2018 (r331508)
> @@ -46,9 +46,13 @@
>  .Ft void *
>  .Fn malloc "size_t size" "struct malloc_type *type" "int flags"
>  .Ft void *
> +.Fn malloc_domain "size_t size" "struct malloc_type *type" "int domain" "int flags"
> +.Ft void *
>  .Fn mallocarray "size_t nmemb" "size_t size" "struct malloc_type *type" "int flags"
>  .Ft void
>  .Fn free "void *addr" "struct malloc_type *type"
> +.Ft void
> +.Fn free_domain "void *addr" "struct malloc_type *type"
>  .Ft void *
>  .Fn realloc "void *addr" "size_t size" "struct malloc_type *type" "int flags"
>  .Ft void *
> @@ -64,6 +68,14 @@ The
>  function allocates uninitialized memory in kernel address space for an
>  object whose size is specified by
>  .Fa size .
> +.Pp
> +The
> +.Fn malloc_domain
> +variant allocates the object from the specified memory domain. Memory allocated
> +with this function should be returned with
> +.Fn free_domain .
> +See
> +.Xr numa 9 for more details.
>  .Pp
>  The
>  .Fn mallocarray
>
> Modified: head/share/man/man9/zone.9
> ==============================================================================
> --- head/share/man/man9/zone.9 Sat Mar 24 23:26:54 2018 (r331507)
> +++ head/share/man/man9/zone.9 Sat Mar 24 23:58:44 2018 (r331508)
> @@ -32,8 +32,10 @@
>  .Nm uma_zcreate ,
>  .Nm uma_zalloc ,
>  .Nm uma_zalloc_arg ,
> +.Nm uma_zalloc_domain ,
>  .Nm uma_zfree ,
>  .Nm uma_zfree_arg ,
> +.Nm uma_zfree_domain ,
>  .Nm uma_zdestroy ,
>  .Nm uma_zone_set_max ,
>  .Nm uma_zone_get_max ,
> @@ -55,11 +57,15 @@
>  .Fn uma_zalloc "uma_zone_t zone" "int flags"
>  .Ft "void *"
>  .Fn uma_zalloc_arg "uma_zone_t zone" "void *arg" "int flags"
> +.Ft "void *"
> +.Fn uma_zalloc_domain "uma_zone_t zone" "void *arg" "int domain" "int flags"
>  .Ft void
>  .Fn uma_zfree "uma_zone_t zone" "void *item"
>  .Ft void
>  .Fn uma_zfree_arg "uma_zone_t zone" "void *item" "void *arg"
>  .Ft void
> +.Fn uma_zfree_domain "uma_zone_t zone" "void *item" "void *arg"
> +.Ft void
>  .Fn uma_zdestroy "uma_zone_t zone"
>  .Ft int
>  .Fn uma_zone_set_max "uma_zone_t zone" "int nitems"
> @@ -78,10 +84,13 @@
>  .Fn SYSCTL_ADD_UMA_CUR ctx parent nbr name access zone descr
>  .Sh DESCRIPTION
>  The zone allocator provides an efficient interface for managing
> -dynamically-sized collections of items of similar size.
> +dynamically-sized collections of items of identical size.
>  The zone allocator can work with preallocated zones as well as with
>  runtime-allocated ones, and is therefore available much earlier in the
> -boot process than other memory management routines.
> +boot process than other memory management routines. The zone allocator
> +provides per-cpu allocation caches with linear scalability on SMP
> +systems as well as round-robin and first-touch policies for NUMA
> +systems.
>  .Pp
>  A zone is an extensible collection of items of identical size.
>  The zone allocator keeps track of which items are in use and which
> @@ -209,6 +218,11 @@ The zone is for the
>  subsystem.
>  .It Dv UMA_ZONE_VM
>  The zone is for the VM subsystem.
> +.It Dv UMA_ZONE_NUMA
> +The zone should use a first-touch NUMA policy rather than the round-robin
> +default. Callers that do not free memory on the same domain it is allocated
> +from will cause mixing in per-cpu caches. See
> +.Xr numa 9 for more details.
>  .El
>  .Pp
>  To allocate an item from a zone, simply call
> @@ -243,12 +257,21 @@ The variations
>  .Fn uma_zalloc_arg
>  and
>  .Fn uma_zfree_arg
> -allow to
> +allow callers to
>  specify an argument for the
>  .Dv ctor
>  and
>  .Dv dtor
>  functions, respectively.
> +The
> +.Fn uma_zalloc_domain
> +function allows callers to specify a fixed
> +.Xr numa 9 domain to allocate from. This uses a guaranteed but slow path in
> +the allocator which reduces concurrency. The
> +.Fn uma_zfree_domain
> +function should be used to return memory allocated in this fashion. This
> +function infers the domain from the pointer and does not require it as an
> +argument.
>  .Pp
>  Created zones,
>  which are empty,
>
> Modified: head/usr.bin/cpuset/cpuset.1
> ==============================================================================
> --- head/usr.bin/cpuset/cpuset.1 Sat Mar 24 23:26:54 2018 (r331507)
> +++ head/usr.bin/cpuset/cpuset.1 Sat Mar 24 23:58:44 2018 (r331508)
> @@ -34,20 +34,24 @@
>  .Sh SYNOPSIS
>  .Nm
>  .Op Fl l Ar cpu-list
> +.Op Fl n Ar policy:domain-list
>  .Op Fl s Ar setid
>  .Ar cmd ...
>  .Nm
>  .Op Fl l Ar cpu-list
> +.Op Fl n Ar policy:domain-list
>  .Op Fl s Ar setid
>  .Fl p Ar pid
>  .Nm
>  .Op Fl c
>  .Op Fl l Ar cpu-list
> +.Op Fl n Ar policy:domain-list
>  .Fl C
>  .Fl p Ar pid
>  .Nm
>  .Op Fl c
>  .Op Fl l Ar cpu-list
> +.Op Fl n Ar policy:domain-list
>  .Op Fl j Ar jailid | Fl p Ar pid | Fl t Ar tid | Fl s Ar setid | Fl x Ar irq
>  .Nm
>  .Fl g
> @@ -57,8 +61,9 @@
>  The
>  .Nm
>  command can be used to assign processor sets to processes, run commands
> -constrained to a given set or list of processors, and query information
> -about processor binding, sets, and available processors in the system.
> +constrained to a given set or list of processors and memory domains, and query
> +information about processor binding, memory binding and policy, sets, and
> +available processors and memory domains in the system.
>  .Pp
>  .Nm
>  requires a target to modify or query.
> @@ -92,6 +97,15 @@ This last set is the list of all possible CPUs in the
>  queried using
>  .Fl r .
>  .Pp
> +Most sets include NUMA memory domain and policy information. This can be
> +inspected with
> +.Fl g
> +and set with
> +.Fl n .
> +This will specify which NUMA domains are visible to the process and
> +affect where anonymous memory and file pages will be stored on first access.
> +Files accessed first by other processes may specify conflicting policy.
> +.Pp
>  When running a command it may join a set specified with
>  .Fl s
>  otherwise a new set is created.
> @@ -110,7 +124,8 @@ Create a new cpuset and assign the target process to t
>  The requested operation should reference the cpuset available via the
>  target specifier.
>  .It Fl d Ar domain
> -Specifies a NUMA domain id as the target of the operation.
> +Specifies a NUMA domain id as the target of the operation. This can only
> +be used to query the cpus visible in each numberd domain.
>  .It Fl g
>  Causes
>  .Nm
> @@ -130,6 +145,13 @@ numbers separated by '-' for ranges and commas separat
>  A special list of
>  .Dq all
>  may be specified in which case the list includes all CPUs from the root set.
> +.It Fl n Ar domain-list:policy
> +Specifies a list of domains and allocation policy to apply to a target. Ranges
> +may be specified as in
> +.Fl l .
> +Valid policies include first-touch, ft, round-robin, rr, and prefer. The prefer
> +policy accepts only a single domain in the set. The parent of the set is
> +consulted if the preferred domain is unavailable.
>  .It Fl p Ar pid
>  Specifies a pid as the target of the operation.
>  .It Fl s Ar setid
> _______________________________________________
> svn-src-head@freebsd.org mailing list
> https://lists.freebsd.org/mailman/listinfo/svn-src-head
> To unsubscribe, send any mail to "svn-src-head-unsubscribe@freebsd.org"

A buildkernel fails with:

[...]
--- all_subdir_lib/libc ---
make[4]: make[4]: don't know how to make cpuset_getdomain.2. Stop

make[4]: stopped in /usr/src/lib/libc

-- 
O. Hartmann

I object to the use or transfer of my data for advertising purposes or for
market or opinion research (§ 28 para. 4 BDSG).

--Sig_/QZ9Mo8ZpX8oTgh.41WmBpJT
Content-Type: application/pgp-signature
Content-Description: OpenPGP digital signature

-----BEGIN PGP SIGNATURE-----

iLUEARMKAB0WIQQZVZMzAtwC2T/86TrS528fyFhYlAUCWrc8ugAKCRDS528fyFhY
lN9DAgCnRCCpE6cj40tcvmnqoDFyfJD+DI7FkN+QCCG+Pr8umqnF6l7YifUDGN8n
/wm0goHPYAJSmn7YWNxfRijjW42EAgCWq/HLvXRaXmr5p46SVtZHibXNC7gO3guy
fQrK/by0WskDYFD4NMec7wsBi8R3uUOkKJZ5UDy9T92WfYEnamsL
=PD5m
-----END PGP SIGNATURE-----

--Sig_/QZ9Mo8ZpX8oTgh.41WmBpJT--