Date: Tue, 12 Jun 2007 11:38:19 -0400
From: "Worth Bishop" <wbishop@twosensemedia.com>
To: <freebsd-questions@freebsd.org>
Subject: Fw: FreeBSD 6.2 Repeating Crash - Sleeping thread; Fatal trap 12: page fault; warning: 'T2' might be used uninitialized
Message-ID: <004b01c7ad07$b3d474d0$0801000a@S0030153310>

Addendum: For what it's worth, the 250 GB Samsung drive was added when the
system was upgraded - it's only 3-4 months old.

----- Original Message -----
From: "Worth Bishop" <wbishop@twosensemedia.com>
To: <freebsd-questions@freebsd.org>
Sent: Tuesday, June 12, 2007 11:33 AM
Subject: FreeBSD 6.2 Repeating Crash - Sleeping thread; Fatal trap 12: page fault; warning: 'T2' might be used uninitialized

> Please help if you can...
>
> BACKGROUND
>
> This crash is occurring on a dual-AMD 1.6 GHz CPU white-box system with
> 1 GB RAM and 250 GB storage, running the GENERIC kernel. The system has
> been in production use as a web server for nearly five years.
>
> About 3 - 4 months ago, the system was upgraded from an earlier FreeBSD
> version to 6.1. At the same time, all supporting applications (Apache
> webserver, Perl, PostgreSQL, PHP, countless other applications &
> libraries) were upgraded to the current releases. The system was stable
> up until a couple of weeks ago.
>
> FIRST ERROR EVENT
>
> The system crashed during normal usage. The following message was
> displayed on the console, which was not responsive to keyboard input:
>
> Sleeping thread (tid 100122, pid 11099)
> owns a non-sleepable lock
>
> panic: sleeping thread
> cpuid=1
>
> The system was restarted, an fsck routine was completed (answering "yes"
> to all the "Do you want to salvage" type questions) and the server ran
> fine. For about a week. It then crashed again several times, at intervals
> varying from a few minutes of uptime to a few days.
>
> SECOND ERROR EVENT
>
> After some crashes, a message similar to that above was displayed.
> However, at other times a message similar to this was displayed:
>
> kernel trap 12 with interrupts disabled
>
> Fatal trap 12: page fault while in kernel mode
> cpuid=0; apic id=01
> fault virtual address = 0x100
> fault code            = supervisor read, page not present
> instruction pointer   = 0x20:0xc066c731
> stack pointer         = 0x28:0xe432ebf0
> frame pointer         = 0x28:0xe432ebfc
> code segment          = base 0x0, limit 0xfffff, type 0x1b
>                       = DPL 0, pres 1, def32 1, gran 1
> processor eflags      = resume, IOPL=0
> current process       = 36 (syncer)
> trap number           = 12
> panic: page fault
> cpuid=0
> uptime: 3d10h11m44s
> Dumping 1535 Mb (2 chunks) [NOTE: the system had 1.5 GB memory at that
> time. Memory was removed, reseated, swapped, etc., now 1 GB]
> chunk 0:1Mb (159 pages)
>
> CORRECTIONS ATTEMPTED
>
> Somewhere during this ordeal, a Google search revealed a number of other
> people experiencing the "Sleeping thread" problem. One of these was
> apparently experienced in a FreeBSD 6.x development version stress test.
> No definitive solution was identified in anything we saw, except a single
> reference to the problem being a kernel bug fixed in FreeBSD 6.2.
>
> Accordingly, we upgraded from 6.1 to 6.2 but have still experienced the
> problem.
>
> We reviewed the 'messages' file and found references to several things
> which led us to check the FreeBSD 6.2 ERRATA
> (http://www.freebsd.org/releases/6.2R/errata.html). This suggested that
> adding 'kern.ipc.nmbclusters="0"' to the /boot/loader.conf file might
> avoid a known issue. We tried this, but saw no relief.
>
> We also found a reference in the manual that suggested the issue might be
> a problem with the APIC in 6.x. This recommended adding
> 'hint.apic.0.disabled="1"' to loader.conf. Tried this; no help.
>
> In order to try to get more information about the system dumps we added
> dumpdev="AUTO" and dumpdir="/usr/crash" [to get more storage space than
> available in /var/] and have generated several vmcore.# files of ~1 GB
> each (all identical size).
>
> We attempted to use DDB to analyze the dumps (struggling now, unfamiliar
> with the kernel debugging process) with no success. Research suggested we
> needed to create a debug version of the kernel (i.e., KERNEL.DEBUG) with
> debugging options enabled.
>
> We duly copied GENERIC and edited it, noting that "options ddb" was
> already enabled. We added
> 'makeoptions DEBUG=-g # Build kernel with gdb(1) debug symbols'
> as suggested and tried to "make buildkernel", which errored out stating
> that KDB must be enabled to use DDB. We edited KERNEL.DEBUG to add
> 'options KDB # Enable kernel debugger' and attempted to
> "make buildkernel" again. This time, the process stopped again with the
> message:
>
> THIRD ERROR EVENT
>
> [snip]
> inline-unit-growth=100 --param
> rge-function-growth=1000 -mno-align-long-strings -mpreferred-stack-boundary=2
> -mno-mmx -mno-3dnow -mno-sse -mno-sse2 -ffreestanding -Werror
> /usr/src/sys/crypto/sha2/sha2.c
> /usr/src/sys/crypto/sha2/sha2.c: In function `SHA512_Transform':
> /usr/src/sys/crypto/sha2/sha2.c:753: warning: 'T2' might be used uninitialized in this function
> *** Error code 1
>
> Stop in /usr/obj/usr/src/sys/KERNEL.DEBUG.
> *** Error code 1
>
> Stop in /usr/src.
> *** Error code 1
>
> Stop in /usr/src.
> www:/usr/src#
>
> With this, we are stumped.
>
> HELP PLEASE!
>
> Can anyone:
>
> - lead us to a solution based on these error messages?
> - help us understand why the GENERIC kernel with only the debugging
>   options added failed to build?
> - help us understand what '/usr/src/sys/crypto/sha2/sha2.c' has to do
>   with anything?
> - help us understand what we need to do to extract useful information
>   from the vmcore.# files?
> - offer any other suggestions?
>
> Thanks in advance!
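
For reference, the settings mentioned in the message can be collected in one place for review. This is a minimal sketch, not the poster's actual files: it writes the quoted lines into a scratch directory and leaves the live system untouched. The scratch directory, the fragment file names, and the choice of /etc/rc.conf for the dump settings are assumptions; the message itself only names /boot/loader.conf and the KERNEL.DEBUG kernel configuration.

```sh
#!/bin/sh
# Sketch only: gather the settings quoted in the message into a scratch
# directory so they can be reviewed side by side. Nothing here modifies
# /boot/loader.conf, /etc/rc.conf, or the kernel configuration.
scratch=$(mktemp -d)   # hypothetical review location

# Lines the message says were added to /boot/loader.conf
cat > "$scratch/loader.conf" <<'EOF'
kern.ipc.nmbclusters="0"
hint.apic.0.disabled="1"
EOF

# Crash-dump settings; /etc/rc.conf is an assumption, the message
# does not say which file they were added to
cat > "$scratch/rc.conf" <<'EOF'
dumpdev="AUTO"
dumpdir="/usr/crash"
EOF

# Lines the message says were added to the copy of GENERIC (KERNEL.DEBUG)
cat > "$scratch/KERNEL.DEBUG.additions" <<'EOF'
makeoptions DEBUG=-g    # Build kernel with gdb(1) debug symbols
options     KDB         # Enable kernel debugger
EOF

# List the fragments that were written
ls "$scratch"
```

Keeping the three groups of changes apart makes it easier to back them out one at a time while narrowing down which, if any, affects the crashes or the failed build.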