Date: Mon, 16 May 2011 18:05:50 -0400 From: Max Laier <max@love2party.net> To: Attilio Rao <attilio@freebsd.org> Cc: Andriy Gapon <avg@freebsd.org>, FreeBSD current <freebsd-current@freebsd.org>, Peter Grehan <grehan@freebsd.org>, neel@freebsd.org Subject: Re: proposed smp_rendezvous change Message-ID: <201105161805.51188.max@love2party.net> In-Reply-To: <BANLkTik8CqtiP9OgvBpL08dqK6Aj%2BLQ3OA@mail.gmail.com> References: <4DCD357D.6000109@FreeBSD.org> <201105161738.53366.max@love2party.net> <BANLkTik8CqtiP9OgvBpL08dqK6Aj%2BLQ3OA@mail.gmail.com>
next in thread | previous in thread | raw e-mail | index | archive | help
On Monday 16 May 2011 17:54:54 Attilio Rao wrote: > 2011/5/16 Max Laier <max@love2party.net>: > > On Monday 16 May 2011 16:46:03 John Baldwin wrote: > >> On Monday, May 16, 2011 4:30:44 pm Max Laier wrote: > >> > On Monday 16 May 2011 14:21:27 John Baldwin wrote: > >> > > Yes, we need to fix that. Humm, it doesn't preempt when you do a > >> > > critical_exit() though? Or do you use a hand-rolled critical exit > >> > > that doesn't do a deferred preemption? > >> > > >> > Right now I just did a manual td_critnest++/--, but I guess ... > >> > >> Ah, ok, so you would "lose" a preemption. That's not really ideal. > >> > >> > > Actually, I'm curious how the spin unlock inside the IPI could yield > >> > > the CPU. Oh, is rmlock doing a wakeup inside the IPI handler? I > >> > > guess that is ok as long as the critical_exit() just defers the > >> > > preemption to the end of the IPI handler. > >> > > >> > ... the earliest point where it is safe to preempt is after doing the > >> > > >> > atomic_add_int(&smp_rv_waiters[2], 1); > >> > > >> > so that we can start other IPIs again. However, since we don't accept > >> > new IPIs until we signal EOI in the MD code (on amd64), this might > >> > still not be a good place to do the yield?!? > >> > >> Hmm, yeah, you would want to do the EOI before you yield. However, we > >> could actually move the EOI up before calling the MI code so long as we > >> leave interrupts disabled for the duration of the handler (which we do). > >> > >> > The spin unlock boils down to a critical_exit() and unless we did a > >> > critical_enter() at some point during the redenvouz setup, we will > >> > yield() if we owepreempt. I'm not quite sure how that can happen, but > >> > it seems like there is a path that allows the scheduler to set it from > >> > a foreign CPU. > >> > >> No, it is only set on curthread by curthread. This is actually my main > >> question. I've no idea how this could happen unless the rmlock code is > >> actually triggering a wakeup or sched_add() in its rendezvous handler. > >> > >> I don't see anything in rm_cleanIPI() that would do that however. > >> > >> I wonder if your original issue was really fixed just by the first > >> patch you had which fixed the race in smp_rendezvous()? > > > > I found the stack that lead me to this patch in the first place: > > > > #0 sched_switch (td=0xffffff011a970000, newtd=0xffffff006e6784b0, > > flags=4) at src/sys/kern/sched_ule.c:1939 > > #1 0xffffffff80285c7f in mi_switch (flags=6, newtd=0x0) at > > src/sys/kern/kern_synch.c:475 > > #2 0xffffffff802a2cb3 in critical_exit () at > > src/sys/kern/kern_switch.c:185 #3 0xffffffff80465807 in spinlock_exit > > () at > > src/sys/amd64/amd64/machdep.c:1458 > > #4 0xffffffff8027adea in rm_cleanIPI (arg=<value optimized out>) at > > src/sys/kern/kern_rmlock.c:180 > > #5 0xffffffff802b9887 in smp_rendezvous_action () at > > src/sys/kern/subr_smp.c:402 > > #6 0xffffffff8045e2a4 in Xrendezvous () at > > src/sys/amd64/amd64/apic_vector.S:235 > > #7 0xffffffff802a2c6e in critical_exit () at > > src/sys/kern/kern_switch.c:179 #8 0xffffffff804365ba in uma_zfree_arg > > (zone=0xffffff009ff4b5a0, item=0xffffff000f34cd40, > > udata=0xffffff000f34ce08) at > > src/sys/vm/uma_core.c:2370 > > . > > . > > . > > > > and now that I look at it again, it is clear that critical_exit() just > > isn't interrupt safe. I'm not sure how to fix that, yet ... but this: > > > > > > if (td->td_critnest == 1) { > > td->td_critnest = 0; > > if (td->td_owepreempt) { > > td->td_critnest = 1; > > > > clearly doesn't work. > > I'm sorry if I didn't reply to the whole rendezvous thread, but as > long as there is so many people taking care of it, I'll stay hidden in > my corner. > > I just wanted to tell that I think you are misunderstanding what > critical section is supposed to do. > > When an interrupt fires, it goes on the old "interrupt/kernel context" > which means it has not a context of his own. That is the reason why we > disable interrupts on spinlocks (or similar workaround for !x86 > architectures) and this is why spinlocks are the only protection > usable in code that runs in interrupt context. > > Preempting just means another thread will be scheduler in the middle > of another thread execution path. > > This code is perfectly fine if you consider curthread won't be > descheduled while it is executing. Well, no - it is not. With this you can end up with a curthread that has td_critnest=0 and td_owepreempt=1 in interrupt context. If you use a spinlock on such a thread, it will do the preemption at the point where you drop the spinlock, this is bad in some circumstances. One example is the smp_rendevous case we are discussing here. Max
Want to link to this message? Use this URL: <https://mail-archive.FreeBSD.org/cgi/mid.cgi?201105161805.51188.max>