Skip site navigation (1)Skip section navigation (2)
Date:      Wed, 8 Jul 2015 18:36:31 -0700
From:      Adrian Chadd <adrian@freebsd.org>
To:        Garrett Cooper <yaneurabeya@gmail.com>
Cc:        "freebsd-arch@freebsd.org" <freebsd-arch@freebsd.org>
Subject:   Re: CFT/CFR: NUMA policy branch
Message-ID:  <CAJ-Vmo=bNZG=PJWGYMLBSjPoW5mmcYSDiC_8ecLgnoDHXY9kEA@mail.gmail.com>
In-Reply-To: <D60A4F17-D573-4C0D-AD8B-2C3710A67DA6@gmail.com>
References:  <CAJ-Vmo=SnqXTF5m65haKqrVf699zinyXs%2BQdvR6V88CW7vooCw@mail.gmail.com> <CAJ-VmonyTfSxj%2BD=FN3TUCO33w4vGqh1REQqx-8rd-JcArfqSA@mail.gmail.com> <CAJ-Vmo=ON9bEngDoMFK-kJR=qVvcX%2BEeQXx%2BUoEsh1npMHjESQ@mail.gmail.com> <D60A4F17-D573-4C0D-AD8B-2C3710A67DA6@gmail.com>

next in thread | previous in thread | raw e-mail | index | archive | help
On 7 July 2015 at 23:13, Garrett Cooper <yaneurabeya@gmail.com> wrote:
> On Jul 5, 2015, at 19:06, Adrian Chadd <adrian@freebsd.org> wrote:
>
>> Hi,
>>
>> I've done another update. kib@ has been beating me with the clue stick a=
 bit.
>>
>> https://github.com/freebsd/freebsd/compare/master...erikarn:local/adrian=
_numa_policy
>>
>> * (kib) (numactl.c) fix up sorting of include files
>> * (kib) (numactl.c) consistent use of values when calling err()
>> * (kib) (numactl.c) consistently wrap lines at 78 characters, don't
>> prematurely wrap lines
>> * (kib) don't use the old-style BSD licence mentioning "regents", use
>> the updated one
>> * (kib) (vm_domain.c) don't break out after iterating a few times and
>> have the API be unpredictable - so now the API will always succeed in
>> reading a vm_policy
>>
>> I've tested the policies (first-touch, fixed-domain, round-robin) and
>> they all still work as advertised, both on threads and processes.
>>
>> I'd appreciate more reviews and some further testing.
>
> Please create a dummy pull request or post the code up on Phabricator.
>
> - Please put some of the items like policy_to_str and str_to_policy in a =
library.

I'll sketch out a libnuma after this lands. Yes, this stuff will be part of=
 it.

> - Please use that code in the kernel as well for sysctl_vm_default_policy=
 to reduce duplication.

ok. I'll do it after it lands.

> - Please note reasoning for why `options MAXMEMDOM=3D16` in numa(4).

There's no reasoning.

> - Why are checking for `if (p)` before calling PROC_UNLOCK(p) in sys_numa=
_getaffinity, but not sys_numa_setaffinity?

I'll check.

> - sys_numa_setaffinity and sys_numa_getaffinity look similar. Could somet=
hing be implemented like sysctl(3) for handling getting/setting of affiniti=
es all in one shot?

Nope. Other things are the same (eg cpuset).

> - `if (p)` should be `if (p !=3D NULL)`, etc per style(9).
> - Is there a way that the affinity could be inherited/not inherited acros=
s threads, similar to what ktrace -i does? If so, how does one do that?

There isn't, much like how there isn't the same for that in cpuset.

> - In vm_domain_rr_selectdomain, should this use atomic(9), i.e. can multi=
ple threads access/mangle the value of td_dom_rr_idx in parallel?

No - it's curthread. only curthread is modifying that.

> - In vm_domain_policy_validate, couldn=E2=80=99t you remove all of the in=
termediary `return (-1)`=E2=80=99s as long as you put `break;`s in the swit=
ch/case statements?
> - Should vm_domain_policy_cleanup/vm_domain_policy_init be implemented? I=
f so, what should they have in there?

They're placeholders - primarily because I think at some point the
policy definition will grow to be bigger and we may want to malloc /
reference them in some instances. (Much like cpuset.)
This way it's clear that they should be called, even if they don't do anyth=
ing.

I like defining "object lifecycle", even if part of the create/destroy
lifecycle is null. Adding things like create/destroy functions later
on has proven to be more buggy.

> - Would it make sense for `struct vm_domain_iterator` to be a queue(9)-li=
ke type (just based on the name alone)? If not, what data structure do you =
anticipate it having, e.g. tree, queue, directed graph, etc?

No. It's an iterator - it's designed to abstract away the concept of
iterating over a domain policy. It won't have any kind of data
structure like you mentioned, as it's an iterator.

> - In vm_domain_iterator_run, vi->n is always decremented after the vi->n =
<=3D 0 run is done =E2=80=94 why not move that outside the switch-case stat=
ement?
> - Why did you hand roll `vm_domain_iterator_isdone` in `vm_domain_iterato=
r_run` up at the top?

vm_domain_iterator_isdone() is (currently) defined as a tail loop
condition, not a head loop condition. Ie, you can only call it after
you've gone through the loop at least once.

Now, I don't necessarily like how the loop constructs work, but it's
designed to be a minimal set of changes to the vm_page.c code that was
doing this kind of iteration over the round-robin set.

I think it's because I wrote iterator_isdone() first before I finished
fleshing out iterator_run().

> - The parameter names for functions/syscalls can be omitted in the declar=
ations.

ok

> - The change to vm_page.c seems like it could be committed separate from =
the NUMA changes.

It will be. It's in there because without it any significant imbalance
in page allocations between domains can end with exhaustion in one
domain and it all going bad.

> - `However without it'll kernel panic below - the code didn=E2=80=99t` ->=
 `However without the following check, the kernel will panic below; the cod=
e didn=E2=80=99t`

> - vm_domain_iterator_run seems like it could use one of the queue(9) data=
 structures.

again, no - there's no queue struct underlying it. It's an iterator.

> - Why is OPT_TID 1001?
> - `int error;` should come after `cpuset_t set;` per alignment and style(=
9).
> - `atoi` parsing is better handled via strtoll, etc.

Il'l go sort those out in numactl.c soon.

Thanks!


-a



Want to link to this message? Use this URL: <https://mail-archive.FreeBSD.org/cgi/mid.cgi?CAJ-Vmo=bNZG=PJWGYMLBSjPoW5mmcYSDiC_8ecLgnoDHXY9kEA>