Skip site navigation (1)Skip section navigation (2)
Date:      Thu, 27 Mar 2014 03:02:49 -0600
From:      Chris Torek <torek@torek.net>
To:        freebsd-hackers@freebsd.org
Subject:   slight problem with r254457 (kthread_add fix)
Message-ID:  <201403270902.s2R92nPo064876@elf.torek.net>

next in thread | raw e-mail | index | archive | help
This fix:

 http://lists.freebsd.org/pipermail/svn-src-head/2013-August/050550.html

has introduced a bit of a problem with kernel threads that enable
the FPU, on amd64.

Before the fix, a new thread added to proc0 (when kthread_add() is
called with p == NULL) was essentially "forked from" thread0:

	if (p == NULL) {
		p = &proc0;
		oldtd = &thread0;
	}

Now, each such new thread "forks from" the *newest* thread in proc0:

	if (p == NULL)
		p = &proc0;
	...
	PROC_LOCK(p);
	oldtd = FIRST_THREAD_IN_PROC(p);

The problem is that "first thread in proc" is really the *last*
(i.e., most recent) thread in the proc, as thread_link() does a
TAILQ_INSERT_HEAD() on p->p_threads, and FIRST_THREAD_IN_PROC
takes the TAILQ_FIRST element.

What this means is that if something enables the FPU (e.g., a
device creates a taskq thread and enables the FPU in that thread,
so that the device can use XMM registers), every subsequent taskq
also has the FPU enabled.  (We found this by unconditionally
enabling the extensions in various threads, assuming that new ones
did not have FPU enabled for kernel use yet.  That worked until
we picked up this particular change.)

The fix itself is fine but "forking from" the most-recent kernel
thread, rather than thread0, for this case seems wrong.  I see
three obvious ways to fix *that*:

 - Restore the old behavior:

 	if (p == NULL) {
		p = &proc0;
		oldtd = &thread0;
	} else
		oldtd = NULL;
	...
	PROC_LOCK(p);
	if (oldtd == NULL)
		oldtd = FIRST_THREAD_IN_PROC(p);

   Conservative, will definitely work, but still seems a bit odd
   behavior-wise: a new "not in a proc" kthread clones from thread0,
   but a new "is in a proc" kthread clones from most-recent-thread
   in that kernel proc.  (But that's what this did before the fix,
   modulo the bug where it cloned from a dead thread...)

 - Pick the *last* (i.e., earliest still-alive) thread in the proc:

	PROC_LOCK(p);
 	oldtd = LAST_THREAD_IN_PROC(p);

   where LAST_THREAD_IN_PROC uses the tail entry on the tailq
   (pretty straightforward).

 - Get rid of FIRST_THREAD_IN_PROC entirely, as it's not clear
   what "first" means (at least it was not to me: I assumed at
   first that it meant the *oldest*, i.e., first-created, one):
   add new macro accessors NEWEST_THREAD_IN_PROC,
   OLDEST_THREAD_IN_PROC, and (for when you don't care about
   order) SOME_THREAD_IN_PROC [%], and start using those where
   appropriate.  The thread_link() routine then puts "newest" and
   "oldest" at whichever end of the tailq is best for kernel code
   space/speed, and only it and the macros need to agree.

   [% Needs better name.  Perhaps just THREAD_IN_PROC?]

I actually like the last option best, but it is the largest
overall change and it messes with all kinds of other kernel code
(there are 62 occurrences of FIRST_THREAD_IN_PROC in a count I
just ran).  However, it becomes clearer which thread is really
wanted.

(I'm going to do the first fix, for now at least, in our kernel.)

Chris



Want to link to this message? Use this URL: <https://mail-archive.FreeBSD.org/cgi/mid.cgi?201403270902.s2R92nPo064876>