Date: Fri, 15 Sep 2000 10:58:52 +0200 (MEST)
From: Jean-Francois Dockes <jean-francois.dockes@wanadoo.fr>
To: dhesi@rahul.net (Rahul Dhesi)
Cc: freebsd-stable@FreeBSD.ORG
Subject: Re: SCSI retries without errors in /var/log/messages?
Message-ID: <14785.58572.126822.918071@localhost.dockes.com>
In-Reply-To: <20000914171405.655227C3D@yellow.rahul.net>
References: <freebsd-stable.14784.33648.251152.511680@localhost.dockes.com> <20000914171405.655227C3D@yellow.rahul.net>

Rahul Dhesi writes:
> SunOS deals with soft memory errors in a very nice way. After a certain
> number of soft memory errors have occurred, it syslog's a message saying
> essentially:
>
>    XXX corrected memory errors on memory chip YYY
>
> where XXX is how many times an error was corrected and YYY is
> the location of the chip on the motherboard.
>
> I think this is a very nice strategy. It avoids too many false warnings
> but still alerts the operator to consider replacing an unreliable memory
> chip. The same strategy could be used in any situation where
> correctable errors are occurring: Simply keep count of them and log a
> warning when a threshold is reached.

But it is important to have the block number for each retry (because the
conclusions are different if it's always the same or all over the disk).

You can't just keep a count of errors, you have to keep specific data about
each error, which makes things more difficult for the driver, because there
are many more disk blocks than memory chips.

Logging to a disk file is probably a lot easier. You can then use your
preferred extraction and report language to do stats.

To Unsubscribe: send mail to majordomo@FreeBSD.org
with "unsubscribe freebsd-stable" in the body of the message
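
To illustrate the "log to a file, then do stats" idea, here is a minimal sketch in Python. It assumes a hypothetical log format of one retry per line, "<device> <block>"; no real driver is known to emit this, and the function names and the threshold of 3 are made up for the example. It counts retries per block so you can tell a single bad spot from retries scattered all over the disk.

```python
from collections import Counter


def summarize_retries(lines):
    """Count retries per (device, block).

    Assumes a hypothetical log format, one retry per line:
    "<device> <block>". This is not the output of any real driver.
    """
    counts = Counter()
    for line in lines:
        fields = line.split()
        if len(fields) != 2 or not fields[1].isdigit():
            continue  # skip blank or malformed lines
        device, block = fields[0], int(fields[1])
        counts[(device, block)] += 1
    return counts


def repeated_blocks(counts, threshold=3):
    """Return the (device, block) pairs retried at least `threshold` times."""
    return sorted(key for key, n in counts.items() if n >= threshold)


# Made-up sample data: one block retried repeatedly, two isolated retries.
sample = [
    "da0 1024",
    "da0 1024",
    "da0 88213",
    "da0 1024",
    "da1 17",
]
counts = summarize_retries(sample)
print(repeated_blocks(counts))
```

With the sample above, only block 1024 on da0 reaches the threshold; the other two retries are isolated. A block that keeps showing up points at one bad spot, while many different blocks with one retry each would point somewhere else (cabling, termination, the drive as a whole).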
