Date: Thu, 06 Jan 2005 13:33:41 -0600
From: "Joseph Koenig (jWeb)"
To: Joe Koenig, FreeBSD Mailing List
cc: Henry Miller
Subject: Re: Hardware or OS problem? System Crashing...

>> On 1/5/2005 at 09:14 Joseph Koenig (jWeb) wrote:
>>
>>> Hi,
>>>
>>> We have a system that is currently giving us some trouble. The system is
>>> FreeBSD 4.9. It's a 2 GHz system with 1MB RAM and (here's the kicker) 73GB
>>> RAID 1 ATA drives. The system serves as a web/database server dedicated to
>>> 1 site. Daily the system goes out and downloads real estate listings (via
>>> shell scripts and cURL) and processes them (via PHP into MySQL). Also,
>>> nightly the system downloads a zipped set of images (probably around
>>> 400-500) and processes them into thumbnails (PHP scripts calling
>>> ImageMagick). Over the last week or two, the system is crashing and
>>> rebooting into single user mode.
>>> It's not consistently during updates, or resizing of images, or anything
>>> like that. Yesterday, it crashed with 99% processor idle and load
>>> averages of 0.00 0.00 0.00 -- I was watching a 'top' when the machine
>>> died. When it boots into single user mode, an fsck must be run, which
>>> identified a few corrupt JPEG files -- however, the sysadmin who reboots
>>> it never tells me which files they are. The sysadmin is convinced it is a
>>> FreeBSD problem and says that Linux will not crash because of a corrupt
>>> file and if it does, will not boot into single user mode and he will be
>>> able to access it remotely to do the fsck. About 3-4 weeks ago, one of the
>>> drives in the mirror set crashed and had to be replaced. I'm not convinced
>>> that drives are not to blame for these issues. Is there any way to verify
>>> that? Is it possible a corrupt JPEG on the drive could cause the system to
>>> crash randomly? What can I do to correctly identify the problem so that we
>>> can fix it and not change the OS? Thanks,
>>
>> The sysadmin has no clue about either Linux or FreeBSD!
>>
>> A corrupt JPEG cannot cause a crash of the OS, for any real OS. (If it
>> does, it is a bug in the OS, but I doubt one exists.) Real OS includes
>> Windows XP, Linux, and FreeBSD.
>>
>> However, an OS crash can cause a corrupt JPEG!
>>
>> Either Linux or FreeBSD may boot into single user mode when the
>> filesystem is corrupt. What your sysadmin means is that with one of
>> the newer filesystems Linux uses journaling, which is much less likely
>> to enter this situation, but it still can happen. With soft updates
>> FreeBSD is in the same situation as Linux, but soft updates is
>> (generally, there are exceptions) better than journaling. There is
>> soft updates in FreeBSD 4.9, but I'm not sure how to enable it, or how
>> good it is. (In 5.3 it is awesome!)
>>
>> I suspect hardware.
>>
>> I'd burn memtest to a CD, and run that for a few hours to see if
>> something is identified. Memtest won't catch everything, but it does
>> a pretty good job.
>>
>> Also look at other factors. Does the HVAC kick in when this happens?
>> Is someone hitting the panic stop switch? Situations like that have
>> happened, and they can take a while to debug. They are not likely, but
>> don't rule them out.
>>
>> FreeBSD 4.9 is fairly old at this point. You should seriously
>> consider upgrading to 4.11 (due out in a few weeks), or 5.3 (my
>> recommendation, but a much more involved upgrade).
>>
>
> In addition to the original problem stated above, we are seeing a number of
> problems like "...in free(): warning: modified (page-) pointer" and "...in
> free(): warning: chunk is already free". I have the admin running a memtest
> today, but wanted to make sure these errors were not indicative of something
> else going on. Thanks,
>

Well, the sysadmin tells me that memtest passed. Does anyone have any
suggestions as to what could be causing the crashes? Thanks,

Joe