Date: Fri, 4 Feb 2011 17:38:04 +0000 (GMT)
From: Robert Watson <rwatson@FreeBSD.org>
To: John Baldwin <jhb@freebsd.org>
Cc: svn-src-head@freebsd.org, Randall Stewart <rrs@freebsd.org>,
    svn-src-all@freebsd.org, src-committers@freebsd.org
Subject: Re: svn commit: r218232 - head/sys/netinet
Message-ID: <alpine.BSF.2.00.1102041731160.17623@fledge.watson.org>
In-Reply-To: <201102031529.25072.jhb@freebsd.org>
References: <201102031922.p13JML8i055697@svn.freebsd.org> <201102031529.25072.jhb@freebsd.org>
On Thu, 3 Feb 2011, John Baldwin wrote:

>> 1) Move per John Baldwin to mp_maxid
>> 2) Some signed/unsigned errors found by Mac OS compiler (from Michael)
>> 3) a couple of copyright updates on the affected files.
>
> Note that mp_maxid is the maximum valid ID, so you typically have to do
> things like:
>
> for (i = 0; i <= mp_maxid; i++) {
>         if (CPU_ABSENT(i))
>                 continue;
>         ...
> }
>
> There is a CPU_FOREACH() macro that does the above (but assumes you want
> to skip over non-existent CPUs).

I'm finding the network stack requires quite a bit more along these
lines, btw.  I'd love also to have:

  PACKAGE_FOREACH()
  CORE_FOREACH()
  HWTHREAD_FOREACH()

  CURPACKAGE()
  CURCORE()
  CURTHREAD()

available when putting together thread worker pools, distributing work,
identifying where to channel work, making dispatch decisions, and so on.
It seems likely that in some scenarios it will be desirable to have
worker thread topology linked to hardware topology -- for example, a
network stack worker per core, with distribution of work targeting the
closest worker (subject to ordering constraints)...

> Hmmm, this is more complicated.  Can sctp_queue_to_mcore() handle the
> fact that a cpu_to_use value might not be valid?  If not you might want
> to maintain a separate "dense" virtual CPU ID table numbered
> 0 .. mp_ncpus - 1 that maps to "present" FreeBSD CPU IDs.  I think
> Robert has done something similar to support RSS in TCP.  Does that
> make sense?

This proves somewhat complicated.  I basically have two models,
depending on whether RSS is involved (which adds an external factor).
Without RSS, I build a contiguous workstream number space, which is then
mapped via a table to the CPU ID space, allowing mappings and hashing to
be done easily -- however, these refer to ordered flow-processing
streams (i.e., "threads") rather than CPUs in the strict sense.  In the
future, with dynamic configuration, this becomes important because what
I rebalance is ordered processing streams rather than work across CPUs.
With RSS there has to be a link between work distribution and the CPU
identifiers shared by device drivers, hardware, etc., in which case RSS
identifies viable CPUs as it starts (probably not quite correctly; I'll
be looking for a review of that code shortly, as I'm cleaning it up
currently).

This issue came up at the BSDCan devsummit last year: as more and more
kernel subsystems need to exploit parallelism explicitly, the thread
programming model isn't bad, but it lacks a strong tie to hardware
topology to help manage work distribution.  One idea idly bandied around
was to do something along the lines of KSE/GCD for the kernel: provide a
layered "work" model with ordering constraints, rather than exploiting
threads directly, for work-oriented subsystems.  This is effectively
what netisr does, but in a network stack-specific way.  But with crypto
code, IPsec, storage stuff, etc., all looking to exploit parallelism,
perhaps a more general model is called for.

Robert
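
[For readers following the thread: below is a minimal sketch, not taken
from the original mail, of the "dense" virtual CPU ID table John
describes, mapping contiguous indices 0 .. mp_ncpus - 1 onto the possibly
sparse set of present FreeBSD CPU IDs.  CPU_FOREACH(), CPU_ABSENT(),
mp_ncpus, and MAXCPU are the real kernel interfaces mentioned above; the
wss_* names are hypothetical and only illustrate one way such a mapping
might look.]

/*
 * Illustrative sketch: build a dense table of present CPU IDs so that
 * work distribution code never hands out an ID for which CPU_ABSENT()
 * is true.  The wss_* names are hypothetical.
 */
#include <sys/param.h>
#include <sys/systm.h>
#include <sys/smp.h>

static u_int wss_cpu[MAXCPU];   /* dense index -> FreeBSD CPU ID */
static u_int wss_count;         /* number of present CPUs found */

static void
wss_init(void)
{
        u_int cpu;

        wss_count = 0;
        CPU_FOREACH(cpu)        /* iterates 0..mp_maxid, skipping absent CPUs */
                wss_cpu[wss_count++] = cpu;
}

/*
 * Map a flow hash to a present CPU via the dense table; callers see only
 * valid CPU IDs regardless of gaps in the ID space.
 */
static u_int
wss_hash_to_cpu(uint32_t hash)
{

        return (wss_cpu[hash % wss_count]);
}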