Date: Mon, 9 Oct 2006 21:20:12 -0400
From: John Baldwin <jhb@freebsd.org>
To: freebsd-smp@freebsd.org
Cc: Charles Ulrich <charles@idealso.com>
Subject: Re: FreeBSD 6.1 Instability
Message-ID: <200610092120.12570.jhb@freebsd.org>
In-Reply-To: <200610051544.03861.charles@idealso.com>
References: <200610051544.03861.charles@idealso.com>

On Thursday 05 October 2006 15:44, Charles Ulrich wrote:
> Greetings,
>
> We have been running FreeBSD on our mail servers for about as long as I can
> remember. Recently, we decided to go SMP to handle increased mail load. After
> assembling the hardware, installing the OS and software, and restoring all of
> our data, we noticed in testing that our first machine began hanging
> semi-regularly when it began processing lots of mail. Disabling SMP
> eliminated the hangs completely. We tried it all again on completely
> different hardware with exactly the same result. Our conclusion: something's
> buggy in SMP.
>
> Here are the symptoms. The machine hangs, and becomes completely
> unresponsive. It looks like a deadlock. It will sometimes respond to the
> power button and shut down (without being able to first sync and unmount
> filesystems), and sometimes the power button event gets caught in the
> deadlock. Since it's not actually a crash, there is no core dump or other
> debugging information. In the most recent situation, it hung at different
> points every time I tried to compile ezm3, after successfully compiling other
> packages.
>
> We're system administrators, not kernel hackers, so this is a plea for help. I
> wouldn't know where to start, but I'm hoping someone can point me in the
> right direction. We're also willing to give a (trustworthy) FreeBSD developer
> root access to the test machine since it's just sitting idle right now. If
> you need to crash it, that's fine. We'll have people during normal business
> hours who know how to push a reset button.
>
> Thanks for your time.

Compile a debug kernel and include 'DDB' in the kernel. When it hangs, break
into the debugger and type 'panic' to have it panic the machine and write out a
crash dump.
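
For illustration, the kernel config additions for such a debug kernel might
look roughly like the fragment below. This is only a sketch: the option names
are assumed to be the stock FreeBSD 6.x ones and should be checked against
sys/conf/NOTES for the release in use, and FOO is just the placeholder config
name used in this message. A dump device also has to be configured before the
hang (see dumpon(8)), otherwise 'panic' has nowhere to write the dump.

```
# Sketch only: lines added to the kernel config file (FOO is a placeholder).
# Assumption: FreeBSD 6.x option names; verify against sys/conf/NOTES.
makeoptions	DEBUG=-g	# build kernel.debug with symbols for kgdb
options 	KDB		# kernel debugger framework
options 	DDB		# interactive debugger, where 'panic' is typed
```
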
Once you have the crash dump, download http://www.FreeBSD.org/~jhb/gdb/gdb6
and do this:

$ kgdb /usr/obj/usr/src/sys/FOO/kernel.debug /var/crash/vmcore.X

(where FOO is your kernel config file and X is the right vmcore file)

Then do this:

(gdb) source /path/to/gdb6
(gdb) ps
...

And reply with the output from the 'ps' command.

-- 
John Baldwin
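
Picking "the right vmcore file" can be done by hand, but a small helper may
save a step when several dumps have accumulated. The following is a minimal
sketch, assuming dumps are saved as vmcore.N with a numeric suffix and that
the newest dump has the highest number; latest_vmcore is a hypothetical name,
not a FreeBSD tool.

```shell
#!/bin/sh
# Hypothetical helper (not part of FreeBSD): print the highest-numbered
# vmcore.N file in a crash directory. Defaults to /var/crash.
latest_vmcore() {
    dir="${1:-/var/crash}"
    best=""
    best_n=-1
    for f in "$dir"/vmcore.*; do
        [ -e "$f" ] || continue          # glob matched nothing
        n="${f##*.}"                     # text after the last dot
        case "$n" in
            ''|*[!0-9]*) continue ;;     # skip non-numeric suffixes
        esac
        if [ "$n" -gt "$best_n" ]; then
            best_n="$n"
            best="$f"
        fi
    done
    # Exit status is non-zero when no numbered dump was found.
    [ -n "$best" ] && printf '%s\n' "$best"
}

# Example use (FOO is the placeholder kernel config name from above):
#   kgdb /usr/obj/usr/src/sys/FOO/kernel.debug "$(latest_vmcore)"
```

The comparison is numeric rather than lexical, so vmcore.10 is preferred over
vmcore.2.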
Want to link to this message? Use this URL: <https://mail-archive.FreeBSD.org/cgi/mid.cgi?200610092120.12570.jhb>