Date: Mon, 20 Dec 2004 09:23:18 +1030
From: Greg 'groggy' Lehey <grog@FreeBSD.org>
To: Daniel Johansson <donnex@gmail.com>
Cc: questions@freebsd.org
Subject: Re: My server gets kernel panic every 7th day
Message-ID: <20041219225318.GH84787@wantadilla.lemis.com>
In-Reply-To: <2a37e1ef04121914421fe84902@mail.gmail.com>
References: <2a37e1ef04121802575db1ba26@mail.gmail.com> <20041218195002.GC78603@xor.obsecurity.org> <20041219222919.GE84787@wantadilla.lemis.com> <2a37e1ef04121914352677c442@mail.gmail.com> <20041219223801.GG84787@wantadilla.lemis.com> <2a37e1ef04121914421fe84902@mail.gmail.com>

On Sunday, 19 December 2004 at 23:42:20 +0100, Daniel Johansson wrote:
> On Mon, 20 Dec 2004 09:08:01 +1030, Greg 'groggy' Lehey
> <grog@freebsd.org> wrote:
>> On Sunday, 19 December 2004 at 23:35:18 +0100, Daniel Johansson wrote:
>>> On Mon, 20 Dec 2004 08:59:19 +1030, Greg 'groggy' Lehey
>>> <grog@freebsd.org> wrote:
>>>> On Saturday, 18 December 2004 at 11:50:02 -0800, Kris Kennaway wrote:
>>>>> On Sat, Dec 18, 2004 at 11:57:35AM +0100, Daniel Johansson wrote:
>>>>>> Hi, I've had my server up for over a year now and it's been rock solid,
>>>>>> but for the last few weeks the server has rebooted every Saturday at
>>>>>> exactly 04:19:57 because of a find command. I have no idea why, and I've
>>>>>> checked the cron log and I don't think any crontab is run at that
>>>>>> time. Not as far as I can see from the cron log. Anyway, find makes the
>>>>>> server get a kernel panic and it reboots. This is the fourth week in a
>>>>>> row it happens, and I've checked the hardware, no problems at all.
>>>>>
>>>>> How did you "check the hardware"? Hardware failure is by far the
>>>>> most common cause of "strange panics under abnormal load [such as
>>>>> when the weekly cron job runs]".
>>>>
>>>> If this panic occurs repeatedly under certain circumstances, it's
>>>> probably not hardware. Anyway, there's not much point standing
>>>> outside and scratching our heads. We have a facility for analysing
>>>> this kind of problem: the processor dump and kernel debugger.
>>>
>>> Yeah, I want to say thank you for your help. I think I've been able to
>>> reproduce the kernel panic now, finally!
>>>
>>> On my server I run 3 jails, and every night at 04:15 when it runs
>>> periodic weekly it runs it in 3 jails + the host environment. This
>>> seems to cause the kernel panic, I don't really know why yet. I can
>>> run periodic weekly separately in every jail + the host without kernel
>>> panic, but when I run it at the same time in all places it kernel
>>> panics.
>>
>> What does the dump backtrace show?
>>
>>> It can still be the PSU, don't have any other atm to try with. I'll
>>> do some more testing and see if I can get any more info.
>>
>> There's no point looking at the hardware until you've looked at the
>> dump.

I'd appreciate it if you didn't require me to move the text of your
messages to where it fits.

> Okay, is this hard to do? I've no idea how to look at the dump or
> how to understand the dump. You don't have to be a kernel hacker to
> understand that?

It's described in the handbook. Basically:

- Build a kernel with debug symbols (you should be doing this
  anyway). You need the following line in your configuration file:

    makeoptions DEBUG=-g    # Build kernel with gdb(1) debug symbols

- Make sure that dumps are enabled. You should have something like
  this in your /etc/rc.conf:

    dumpdev=/dev/ad0s2b

  The device name should be the name of your swap partition, and it
  must be at least slightly larger than your main memory.

- Ensure you have a directory /var/crash, and that the file system in
  which it resides has enough space for the dump (a little larger than
  main memory).

- When you get a dump, it will be copied to /var/crash automatically
  on reboot. Go there and get a backtrace. You don't say which
  version of FreeBSD you're using, but in general this will do it:

    # cd /var/crash
    # gdb -k /usr/obj/src/sys/GENERIC/kernel.debug vmcore.0
    (gdb) bt

  The name of the kernel (kernel.debug) depends on how you built your
  kernel. If it's not called GENERIC, the name of the directory will
  change accordingly.

That's it in a nutshell. There's much more detail in chapter 6 of my
debug tutorial, which you can find at
http://www.lemis.com/grog/Papers/Debug-tutorial/tutorial.pdf .

Greg
--
When replying to this message, please copy the original recipients.
If you don't, I may ignore the reply or reply to the original recipients.
For more information, see http://www.lemis.com/questions.html
See complete headers for address and phone numbers.
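
The prerequisites listed in the message (debug symbols in the kernel configuration, a dumpdev entry in rc.conf, and an existing /var/crash) can be checked with a short script before waiting for the next panic. What follows is only a sketch: the function name check_crash_setup and its three arguments are hypothetical, not part of FreeBSD, and it does not verify that the swap partition or the file system holding /var/crash is larger than main memory.

```sh
#!/bin/sh
# Hypothetical helper (not a FreeBSD tool): check the crash dump
# prerequisites described in the message. All paths are passed as
# arguments, so nothing is assumed about the local layout.

check_crash_setup() {
    kernconf=$1   # kernel configuration file
    rcconf=$2     # normally /etc/rc.conf
    crashdir=$3   # normally /var/crash
    status=0

    # 1. The kernel configuration should contain: makeoptions DEBUG=-g
    if grep -Eq '^[[:space:]]*makeoptions[[:space:]]+DEBUG=-g' "$kernconf" 2>/dev/null; then
        echo "ok: debug symbols enabled in $kernconf"
    else
        echo "missing: 'makeoptions DEBUG=-g' in $kernconf"
        status=1
    fi

    # 2. rc.conf should name a dump device, e.g. dumpdev=/dev/ad0s2b
    if grep -Eq '^[[:space:]]*dumpdev=' "$rcconf" 2>/dev/null; then
        echo "ok: dumpdev is set in $rcconf"
    else
        echo "missing: dumpdev entry in $rcconf"
        status=1
    fi

    # 3. The directory that receives the dump on reboot must exist.
    if [ -d "$crashdir" ]; then
        echo "ok: $crashdir exists"
    else
        echo "missing: directory $crashdir"
        status=1
    fi

    return $status
}
```

A call would look like `check_crash_setup /path/to/your/kernel/config /etc/rc.conf /var/crash`, where the first argument is whatever configuration file the running kernel was built from. The function returns non-zero if any of the three checks fails.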