Date: Mon, 16 May 2011 17:38:53 -0400
From: Max Laier <max@love2party.net>
To: John Baldwin <jhb@freebsd.org>
Cc: FreeBSD current <freebsd-current@freebsd.org>,
    Peter Grehan <grehan@freebsd.org>,
    Andriy Gapon <avg@freebsd.org>,
    neel@freebsd.org
Subject: Re: proposed smp_rendezvous change
Message-ID: <201105161738.53366.max@love2party.net>
In-Reply-To: <201105161646.03338.jhb@freebsd.org>
References: <4DCD357D.6000109@FreeBSD.org>
    <201105161630.44577.max@love2party.net>
    <201105161646.03338.jhb@freebsd.org>
On Monday 16 May 2011 16:46:03 John Baldwin wrote:
> On Monday, May 16, 2011 4:30:44 pm Max Laier wrote:
> > On Monday 16 May 2011 14:21:27 John Baldwin wrote:
> > > Yes, we need to fix that.  Humm, it doesn't preempt when you do a
> > > critical_exit() though?  Or do you use a hand-rolled critical exit
> > > that doesn't do a deferred preemption?
> >
> > Right now I just did a manual td_critnest++/--, but I guess ...
>
> Ah, ok, so you would "lose" a preemption.  That's not really ideal.
>
> > > Actually, I'm curious how the spin unlock inside the IPI could yield
> > > the CPU.  Oh, is rmlock doing a wakeup inside the IPI handler?  I
> > > guess that is ok as long as the critical_exit() just defers the
> > > preemption to the end of the IPI handler.
> >
> > ... the earliest point where it is safe to preempt is after doing the
> >
> >     atomic_add_int(&smp_rv_waiters[2], 1);
> >
> > so that we can start other IPIs again.  However, since we don't accept
> > new IPIs until we signal EOI in the MD code (on amd64), this might
> > still not be a good place to do the yield?!?
>
> Hmm, yeah, you would want to do the EOI before you yield.  However, we
> could actually move the EOI up before calling the MI code so long as we
> leave interrupts disabled for the duration of the handler (which we do).
>
> > The spin unlock boils down to a critical_exit() and unless we did a
> > critical_enter() at some point during the rendezvous setup, we will
> > yield() if we owepreempt.  I'm not quite sure how that can happen, but
> > it seems like there is a path that allows the scheduler to set it from
> > a foreign CPU.
>
> No, it is only set on curthread by curthread.  This is actually my main
> question.  I've no idea how this could happen unless the rmlock code is
> actually triggering a wakeup or sched_add() in its rendezvous handler.
>
> I don't see anything in rm_cleanIPI() that would do that however.
>
> I wonder if your original issue was really fixed just by the first
> patch you had which fixed the race in smp_rendezvous()?

I found the stack that led me to this patch in the first place:

#0  sched_switch (td=0xffffff011a970000, newtd=0xffffff006e6784b0,
    flags=4) at src/sys/kern/sched_ule.c:1939
#1  0xffffffff80285c7f in mi_switch (flags=6, newtd=0x0)
    at src/sys/kern/kern_synch.c:475
#2  0xffffffff802a2cb3 in critical_exit ()
    at src/sys/kern/kern_switch.c:185
#3  0xffffffff80465807 in spinlock_exit ()
    at src/sys/amd64/amd64/machdep.c:1458
#4  0xffffffff8027adea in rm_cleanIPI (arg=<value optimized out>)
    at src/sys/kern/kern_rmlock.c:180
#5  0xffffffff802b9887 in smp_rendezvous_action ()
    at src/sys/kern/subr_smp.c:402
#6  0xffffffff8045e2a4 in Xrendezvous ()
    at src/sys/amd64/amd64/apic_vector.S:235
#7  0xffffffff802a2c6e in critical_exit ()
    at src/sys/kern/kern_switch.c:179
#8  0xffffffff804365ba in uma_zfree_arg (zone=0xffffff009ff4b5a0,
    item=0xffffff000f34cd40, udata=0xffffff000f34ce08)
    at src/sys/vm/uma_core.c:2370
. . .

and now that I look at it again, it is clear that critical_exit() just
isn't interrupt safe.  I'm not sure how to fix that, yet ... but this:

	if (td->td_critnest == 1) {
		td->td_critnest = 0;
		if (td->td_owepreempt) {
			td->td_critnest = 1;

clearly doesn't work.
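To spell out where I think the window is, here is the rest of that path
as I read it in kern_switch.c, paraphrased (the exact flags and
assertions may differ), with a comment where the rendezvous IPI from the
backtrace above can land:

	void
	critical_exit(void)
	{
		struct thread *td;

		td = curthread;

		if (td->td_critnest == 1) {
			td->td_critnest = 0;
			/*
			 * Window: td_critnest is already 0, but
			 * td_owepreempt has not been handled yet.  If the
			 * rendezvous IPI fires here, rm_cleanIPI()'s spin
			 * unlock -> spinlock_exit() -> critical_exit()
			 * drops the handler's own nesting from 1 back to
			 * 0, sees td_owepreempt still set and calls
			 * mi_switch() from inside the IPI handler -- which
			 * is exactly the backtrace above.  The window
			 * stays open until td_critnest is set back to 1
			 * below.
			 */
			if (td->td_owepreempt) {
				td->td_critnest = 1;
				thread_lock(td);
				td->td_critnest--;
				/* flags simplified */
				mi_switch(SW_INVOL | SW_PREEMPT, NULL);
				thread_unlock(td);
			}
		} else
			td->td_critnest--;
	}

Max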