Date: Wed, 10 Sep 2003 09:31:27 -0700
From: Dragoncrest <dragoncrest@voyager.net>
To: Raphaël Marmier <raphael@computer-rental.ch>
Cc: freebsd-questions@FreeBSD.ORG
Subject: Re: Mail server reboot while upgrading ports
Message-ID: <5.2.0.9.2.20030910092343.00a17d10@pop.voyager.net>
In-Reply-To: <CA678A69-E31B-11D7-8B65-000393D67E4A@computer-rental.ch>
References: <5.2.0.9.2.20030910053700.01d83680@pop.voyager.net>

At 01:17 AM 9/10/03 +0200, Raphaël Marmier wrote:
>heat? Both could be overheating the same way if it is the same hardware in
>the same room under identical conditions. Try to move one to the fridge
>and see if it stops freezing ;)

Actually, it's rebooting randomly during installs or upgrades only. It
did it once because of SpamAssassin, but that was a long time ago. I did,
however, encounter something of interest that might give us some insight
into this. Even though I'd periodically have failed builds or installs,
or even random reboots, during all this, I always seemed to get a lot of
this one Ruby error. Here are two examples:

/usr/local/lib/ruby/site_ruby/1.6/pkginfo.rb:45: [BUG] rb_gc_mark(): unknown data type 0x7(0x8053e04) corrupted object
ruby 1.6.8 (2003-03-26) [i386-freebsd4]

/usr/local/lib/ruby/site_ruby/1.6/pkgdb.rb:336: [BUG] rb_gc_mark(): unknown data type 0x7(0x82024dc) corrupted object
ruby 1.6.8 (2003-03-26) [i386-freebsd4]

I've since done a pkgdb -F to see if that would help, and it did find one
small error in the database, so I'm not sure whether that fixed it or
not. I have, however, finished all the upgrades after 5 painful hours, so
that's at least good. That means I shouldn't have to touch this for a
while, at which point this problem may rear its ugly head again. I'll
keep an eye on it and report anything new I find. I don't expect this to
be easy to solve, but we'll keep looking for clues.

Speaking of clues, is there a way to log everything that goes on in a TTY
session? I've noticed that when this thing crashes, it prints something
to the screen about the crash and a reboot in 15 seconds (I can't see it
because I'm away at another desk when it happens), but I'm never fast
enough to get over there and catch it. I want to capture that in a file
if possible so I have a better idea of what's wrong with this thing. I'm
sure it has something to do with adding a switch to the logging system,
but I can't find where to add the switch or which one would do it. Any
input would be welcome. Thanks.