Date: Tue, 17 Aug 2004 00:14:52 -0500
From: Jon Noack <noackjr@alumni.rice.edu>
To: Julian Elischer <julian@elischer.org>
Cc: freebsd-current@freebsd.org
Subject: Re: Deadlocks with recent SMP current
Message-ID: <4121944C.5060802@alumni.rice.edu>
In-Reply-To: <411EF85A.30006@elischer.org>
References: <20040813121208.M31181@cvs.imp.ch> <20040813102922.E93695@carver.gumbysoft.com> <411D20DF.2000503@samsco.org> <411E9399.3050200@alumni.rice.edu> <411EF85A.30006@elischer.org>

On 08/15/04 00:44, Julian Elischer wrote:
> Jon Noack wrote:
>> On 08/13/04 15:13, Scott Long wrote:
>>> Can you try the patch below?  It's really only a band-aid, but might
>>> make things usable for now.  Also, are more lockups being seen under
>>> ULE or under 4BSD?  There was a recent change to ULE (rev 1.120 of
>>> sched_ule.c) that seems to have aggravated the scheduler problems on
>>> my test systems.
>>>
>>> Scott
>>>
>>> Index: kern_switch.c
>>> ===================================================================
>>> RCS file: /usr/ncvs/src/sys/kern/kern_switch.c,v
>>> retrieving revision 1.78
>>> diff -u -r1.78 kern_switch.c
>>> --- kern_switch.c 10 Aug 2004 00:26:25 -0000 1.78
>>> +++ kern_switch.c 13 Aug 2004 20:11:27 -0000
>>> @@ -345,6 +345,8 @@
>>>                  return;
>>>          }
>>>
>>> +        critical_enter();
>>> +
>>>          tda = kg->kg_last_assigned;
>>>          if ((ke = td->td_kse) == NULL) {
>>>                  if (kg->kg_idle_kses) {
>>> @@ -441,6 +443,7 @@
>>>                  CTR3(KTR_RUNQ, "setrunqueue: held: td%p kg%p pid%d",
>>>                      td, td->td_ksegrp, td->td_proc->p_pid);
>>>          }
>>> +        critical_exit();
>>>  }
>>>
>>>  /*
>>
>> Here's a data point:
>> My dual Pentium3 system has been up for 20+ hours with this patch.
>> Previously, it wouldn't survive for more than an hour or so
>> (regardless of load).
>
> try the following change instead:
> in maybe_preempt() in kern_switch.c
>
>         ctd = curthread;
> +       if ((ctd->td_kse == NULL) || (ctd->td_kse->ke_thread != ctd))
> +               return (0);
>         pri = td->td_priority;

With the previous patch I still had difficulties getting through a
buildworld in multi-user (while running apache, postfix+amavisd-new,
nfs, etc.).  With this patch I have not run into any issues (make -j4
buildworlds are stable on my dual p3 even after uncommenting
-DUSE_KQUEUE and rebuilding make).

If the last patch was a bandaid, this is one of those new-fangled
"sport" bandaids that are water- and sweat-resistant... ;-)

Jon
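
For anyone following the thread without the kernel source at hand, here is a
rough userland sketch of what the two added lines in maybe_preempt() test.
The struct layouts are cut down to the fields the guard touches, the function
name model_maybe_preempt() is made up for illustration, and the final priority
comparison is a simplification (the real function performs further checks).
The only parts taken from the patch are the two conditions: the current thread
must have a KSE, and that KSE must point back at the current thread.

```c
#include <stddef.h>

/*
 * Simplified stand-ins for the kernel structures.  Only the fields the
 * guard inspects are modeled; these are assumptions, not the real layouts.
 */
struct kse;

struct thread {
	struct kse *td_kse;	/* KSE assigned to this thread, or NULL */
	int td_priority;	/* lower value means higher priority */
};

struct kse {
	struct thread *ke_thread;	/* thread this KSE is bound to */
};

/*
 * Hypothetical model of the preemption decision with the guard applied.
 * Returns 1 if td should preempt ctd, 0 otherwise.
 */
static int
model_maybe_preempt(const struct thread *ctd, const struct thread *td)
{
	/* Guard from the patch: bail out unless ctd owns its KSE. */
	if ((ctd->td_kse == NULL) || (ctd->td_kse->ke_thread != ctd))
		return (0);

	/* Simplified stand-in for the remaining checks. */
	return (td->td_priority < ctd->td_priority);
}
```

In this model a higher-priority thread preempts only while the current
thread and its KSE agree about each other; if the KSE is missing or already
bound to another thread, the function declines to preempt.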