Date: Wed, 04 Aug 2004 17:36:37 -0600
From: Scott Long <scottl@freebsd.org>
To: Sven Willenberger <sven@dmv.com>
Cc: freebsd-current@freebsd.org
Subject: Re: Postgresql locks up server - no response at all
Message-ID: <41117305.2010301@freebsd.org>
In-Reply-To: <1091657001.29488.64.camel@lanshark.dmv.com>
References: <20040804204915.8337A5D08@ptavv.es.net> <1091653253.29492.57.camel@lanshark.dmv.com> <411154D8.1050001@freebsd.org> <1091657001.29488.64.camel@lanshark.dmv.com>

Sven Willenberger wrote:
> On Wed, 2004-08-04 at 15:27 -0600, Scott Long wrote:
>
>>Sven Willenberger wrote:
>>
>>>On Wed, 2004-08-04 at 13:49 -0700, Kevin Oberman wrote:
>>>
>>>>>Date: Wed, 4 Aug 2004 13:34:56 -0700
>>>>>From: Jeremy Chadwick <freebsd@jdc.parodius.com>
>>>>>Sender: owner-freebsd-current@freebsd.org
>>>>>
>>>>>I've seen this with our SuperMicro SuperServer 5013C-T, running mysqld.
>>>>>Please note that the server is "heavily loaded" (note the quotes); usually
>>>>>a load of around 0.50 to 1.00 at all times, with mysqld being the top
>>>>>process. Server runs all latest -CURRENT builds.
>>>>>
>>>>>Many people over in freebsd-threads mentioned this problem, and recommended
>>>>>all sorts of different workarounds. I tried every one available to me,
>>>>>except mucking with PREEMPTION (as I did not feel comfortable tinkering
>>>>>with a random .h file on the box; seemed to be a kernel-related thing,
>>>>>so I'd rather have just an "options" line for it -- I'm conditionally
>>>>>lazy).
>>>>
>>>>Please note that PREEMPTION is now NOT enabled in CURRENT. scottl
>>>>changed that a day or two ago because of all of these lock-ups. He and
>>>>Julian are listed as working to isolate the problem. Scott believes it's
>>>>in the scheduler. It's not specific to either ULE or 4BSD.
>>>>
>>>>So cvsup, rebuild the kernel and you should be fine. At least for a while.
>>>
>>>
>>>Based on this and Jeremy C.'s response it would appear that I should
>>>either try to upgrade my 5.2.1-P8 system to -CURRENT (which is scary
>>>because of the vinum array - root is not mounted on a vinum device, but
>>>the data directory is - will gvinum simply read this correctly? it is a
>>>stripe+mirror array of 4 drives) or start from scratch and go back to
>>>4.10 (STABLE) for a while. I am assuming that the lockups I am seeing
>>>were exacerbated by the PREEMPTION episodes of the past couple weeks? If
>>>I choose the upgrade to -CURRENT, are there any caveats or
>>>recommendations? (besides reading "/usr/src/UPDATING" which I do
>>>religiously anyway)
>>>
>>>_______________________________________________
>>>freebsd-current@freebsd.org mailing list
>>>http://lists.freebsd.org/mailman/listinfo/freebsd-current
>>>To unsubscribe, send any mail to "freebsd-current-unsubscribe@freebsd.org"
>>
>>I'm a bit nervous with asking you to upgrade to -current. PREEMPTION is
>>practically disabled in 5.2.1 so upgrading has a low chance of fixing
>>the problem except maybe by sheer luck. The best action would be to
>>get a crashdump. If your system has an NMI button, then there are some
>>trivial patches that will assist with this. If not, then you might want
>>to look at backporting the ichwd watchdog driver and letting that do a
>>chip-assisted NMI.
>>
>>In any case, finding out exactly what each CPU is doing at the time of
>>the lockup is going to be vital. The lockups that I've been able to
>>reproduce happen when a TAILQ in the scheduler gets corrupted,
>>resulting in one CPU spinning on the list forever with the scheduler
>>lock held. All other cpus then quickly grind to a halt while they wait
>>for the sched lock to become free, which it never does.
>>
>
>
> The case unfortunately does not have a button (although the mobo does
> have an NMI header/jumper). Backporting the watchdog driver sounds
> doable; other than downloading the sys/dev/ichwd directory from a
> repository and adding "options ichwd" to my kernel config file, what
> else would be needed? I am willing to try to get at least one crashdump
> before I have to go back to a -STABLE setup or try something so I can
> get some uptime on this box.
>

I believe that the ichwd driver depends on the watchdog infrastructure
driver that was added back in the early spring. I'm not 100% sure,
though.

Scott
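
[Illustration, not part of the original message.] The failure mode described
above (a corrupted queue link that makes one CPU walk the list forever while
holding the scheduler lock, so every other CPU stalls waiting for that lock)
can be sketched in user space roughly as follows. This is only a toy model
under stated assumptions: it is not FreeBSD kernel code, and the names Node,
build_queue, walk, sched_lock and other_cpu_can_schedule are all hypothetical.
The step limit exists only so the sketch terminates; a real spin would not.

```python
import threading


class Node:
    """One queue entry (hypothetical stand-in for a run-queue element)."""

    def __init__(self, name):
        self.name = name
        self.next = None


def build_queue(names):
    """Build a singly linked queue and return its head node."""
    head = None
    prev = None
    for name in names:
        node = Node(name)
        if prev is None:
            head = node
        else:
            prev.next = node
        prev = node
    return head


def walk(head, max_steps):
    """Walk the queue; return the node count, or None if the walk
    exceeds max_steps (i.e. it would never reach the end)."""
    steps = 0
    node = head
    while node is not None:
        steps += 1
        if steps > max_steps:
            return None
        node = node.next
    return steps


# Hypothetical stand-in for the single scheduler lock.
sched_lock = threading.Lock()


def other_cpu_can_schedule(timeout):
    """Model another CPU trying to take the scheduler lock."""
    got = sched_lock.acquire(timeout=timeout)
    if got:
        sched_lock.release()
    return got


head = build_queue(["td0", "td1", "td2"])
healthy = walk(head, max_steps=1000)

# Corrupt the queue: the last entry now points back at the first.
head.next.next.next = head

result = []
with sched_lock:
    # The "spinning CPU" holds the lock while walking the broken list.
    corrupted = walk(head, max_steps=1000)
    waiter = threading.Thread(
        target=lambda: result.append(other_cpu_can_schedule(0.05))
    )
    waiter.start()
    waiter.join()

blocked = not result[0]
print(healthy, corrupted, blocked)
```

In the sketch the waiter gives up after a timeout; in the situation described
above nothing times out, which is why a crashdump showing what each CPU is
doing is the useful evidence.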