Message-ID: <34BF4ECF.167EB0E7@whistle.com>
Date: Fri, 16 Jan 1998 04:13:03 -0800
From: Julian Elischer
Organization: Whistle Communications
To: hackers@FreeBSD.ORG
Subject: The 'dave rivers' memorial panic.

I've gone some considerable distance towards tracking down a crash that seems to resemble the problem Dave's been seeing.

Basically, a particular setup (running on several pieces of hardware) is incapable of doing a full news expire. We've managed to simplify the case down to a reproducible test that involves copying the contents of one disk to a newly newfs'd partition on another.

The exact symptom we see is that bits in the cylinder group bitmap get set by "something" after the cg has been queued for write. Adding a test confirms that everything is all right immediately before the write, but the next time the cg is accessed, there are some extra bits set. The changes are present on the disk.

It's not hardware.. we've changed everything, but it's reproducible with this particular setup..
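
For readers following along, a pre-write test of the kind mentioned above might look roughly like the sketch below. It is a userland toy, not the check actually used: `struct toy_cg` and every `toy_` function are hypothetical stand-ins (the real `struct cg` layout is different), and it assumes the test compares the number of set bits in the bitmap against a summary count kept alongside it.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified stand-in for a cylinder group.
 * This is NOT the real struct cg; field names are invented. */
struct toy_cg {
    uint32_t summary_count; /* summary: how many bits should be set */
    uint8_t  bitmap[64];    /* one bit per block */
};

/* Count the bits currently set in the bitmap. */
static uint32_t toy_count_set(const struct toy_cg *cg)
{
    uint32_t n = 0;
    for (size_t i = 0; i < sizeof(cg->bitmap); i++) {
        uint8_t b = cg->bitmap[i];
        while (b != 0) {
            n += b & 1u;
            b >>= 1;
        }
    }
    return n;
}

/* Return 1 if the bitmap agrees with the summary count, 0 otherwise. */
static int toy_cg_consistent(const struct toy_cg *cg)
{
    return toy_count_set(cg) == cg->summary_count;
}

/* Sanity check to call just before the cg is queued for write. */
static void toy_cg_check_before_write(const struct toy_cg *cg)
{
    assert(toy_cg_consistent(cg));
}
```

Running the same check again the next time the cg is read back would show whether extra bits appeared in between, which is the symptom described above.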
The more I write here, the more it sounds like flaky hardware.. <hmmmmm> but the patterns seen on disk do not act like hardware.. It looks like a reallocation: some file or, more likely, directory is extended, and the cg summary info is never updated, though the bitmaps are.

So the question is: does anyone know of any 'covert' paths where the cg structs (including bitmaps) are accessed other than in ffs_alloc.c?

I'd love to be able to mark the pages concerned 'read-only' when I queue them for write. That'd catch the other writer :) Anyone have any ideas on how I'd do that for a bdwrite(bp)?

julian