Date: Wed, 25 May 2005 02:25:38 +0200
From: Palle Girgensohn <girgen@FreeBSD.org>
To: kwsn@earthlink.net, freebsd-amd64@freebsd.org
Cc: toby.murray@gmail.com
Subject: Re: Panic while running jdk15
Message-ID: <24CD85AD72E7F49E3A9AC091@rambutan.pingpong.net>
In-Reply-To: <1115965490.59966.18.camel@jonnyv.kwsn.lan>
References: <1115839640.59966.12.camel@jonnyv.kwsn.lan> <1115965490.59966.18.camel@jonnyv.kwsn.lan>
--On Thursday, May 12, 2005 23:24:50 -0700 Jon Kuster <kwsn@earthlink.net> wrote:

> On Wed, 2005-05-11 at 12:27 -0700, Jon Kuster wrote:
>> After we managed to get jdk15 built and then shipped our box to the
>> colo, it has started panicing. We haven't been able to reliably
>> reproduce this yet, but it always happens when our java program is doing
>> it's thing.
>>
>> kernel trap 12 with interrupts disabled
>>
>> Fatal trap 12: page fault while in kernel mode
>> cpuid = 0; apic id=00
>> fault virtual address = 0x1c0
>> fault code = supervisor write, page not present
>> instruction pointer = 0x8 :0xffffffff80382348
>> stack pointer = 0x10 :0xffffffff7935aa0
>> frame pointer = 0x10 :0xffffffff7935ae0
>> code segment = base 0x0, limit 0xfffff, type 0x1b
>> = DPL 0, pres 1, long 1, def32 0, gran 1
>> processor eflags = resume, IOPL = 0
>> current process = 6503 (sh)
>>
>> I haven't been able to get a dump yet, or even a trace in ddb - our
>> remote management card apparently emulates a usb keyboard which doesn't
>> seem to work when the box is paniced.
>>
>> nm -n /boot/kernel/kernel |grep ffffffff803823
>> ffffffff80382330 T cpu_throw
>> ffffffff80382380 T cpu_switch
>
> We've switched off Hyperthreading (we're running em64T xeons), and that
> seems to have worked around the problem. It's a little too early to say
> for sure, but we were seeing panics twice a day, and we haven't had a
> panic in about a day and a half.

Hi!

This looks very similar to our problem. Dell 2850 (i.e. em64T xeon, two CPUs). Turning off HTT made it live longer (long enough for me to believe it actually solved the problem), but after a week or so it crashed twice a day again.

We're *not* running java, though. Apache 1.3, php4, postgres 8.0.3, amavis (i.e. perl), postfix. Apache, postgres and php are heavily loaded; the machine has a load >= .8 most of the time (mostly due to sloppy code, but anyway).

5.4-release made it better for a few days, but then it started crashing again.
Today, I've built a non-SMP kernel, so we're effectively running a single CPU. It has not crashed so far (but it is slow).

It is always

Fatal trap 12: page fault while in kernel mode

It also hangs and does not reboot by itself. It seems to hang so hard that it never manages to save a core dump, and has to be restarted by hitting the big button.

I contacted Dell support, as I'm beginning to suspect the hardware. After a BIOS upgrade today, recommended by Dell, the machine hung at userland startup, when starting the various daemons. Five times in a row, at least. Then it decided to actually come up, and stayed up for eight hours. Then down again. sic...

If it works fine with one CPU, is it likely to be a hardware problem or software?

Jon, your report is a few weeks old; what happened? Does it live happily w/o HTT?

/Palle
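
For anyone comparing their own panic output with this thread: the quoted `nm -n ... | grep` step maps the faulting instruction pointer to the nearest symbol at or below it. Below is a minimal Python sketch of that lookup. It assumes only the two `nm` lines quoted above as input; the function name `resolve` and the constant `NM_OUTPUT` are hypothetical names chosen for illustration, not part of any FreeBSD tool.

```python
import bisect

# The two symbol lines quoted in this thread (from `nm -n /boot/kernel/kernel`).
NM_OUTPUT = """\
ffffffff80382330 T cpu_throw
ffffffff80382380 T cpu_switch
"""


def resolve(addr, nm_text):
    """Return (symbol, offset) for the nearest symbol at or below addr.

    Returns None if addr lies below every symbol in nm_text.
    """
    symbols = []
    for line in nm_text.splitlines():
        parts = line.split()
        if len(parts) != 3:
            continue  # undefined symbols have no address column; skip them
        symbols.append((int(parts[0], 16), parts[2]))
    symbols.sort()

    starts = [start for start, _ in symbols]
    i = bisect.bisect_right(starts, addr) - 1
    if i < 0:
        return None
    start, name = symbols[i]
    return name, addr - start


if __name__ == "__main__":
    # Instruction pointer from the quoted trap message.
    hit = resolve(0xFFFFFFFF80382348, NM_OUTPUT)
    if hit is not None:
        print("%s+0x%x" % hit)
```

With the quoted values the instruction pointer falls inside `cpu_throw`, 0x18 bytes past its start. That only says where the fault was taken, not why it happened.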