From owner-freebsd-hackers Thu Feb 27 21:50:38 1997
Date: Thu, 27 Feb 1997 22:34:33 -0500 (EST)
From: Thomas David Rivers
Message-Id: <199702280334.WAA03821@lakes.water.net>
To: ponds!root.com!dg, ponds!freefall.cdrom.com!freebsd-hackers
Subject: Another installment of the "dup alloc"/"bad dir" panic problems.

Recall that this is with 2.1.6.1, but it has also been reported with 2.2,
on varying processors and with both SCSI and IDE disks.

The new information is that my previous idea, that disksort() was being
re-entered and thus messing up the buffer chains, is likely not correct.
I think my kernel printf()s looked funny because I was using Ctrl-S on the
console (perhaps a syscons buffering issue). I'm pretty confident nothing
is wrong with disksort().

However, I've tentatively determined that adding these printf()s to
disksort() has affected the problem. [I took Jordan's advice and added
printf()s on entering and leaving disksort(), as well as one printing the
block number of the buffer element being added to the queue. Other than
that, it's a stock 2.1.6.1 kernel.]

If you recall, I could trash a particular inode, run newfs, and discover
that the inode was (sometimes) not properly zeroed out, although I had
verified that the write() for that particular block, with a buffer full of
zeros, had been issued.

It now appears that having the printf()s in disksort() affects the problem
in a positive manner; that is, I'm no longer able to demonstrate the
"non-writing" behaviour I had seen before, and the inode in question is
reliably filled with zeros.

I'm not sure what this means. Does it point to some critical timing
situation required to cause the problem? Does it point to a missing
splXXX() call? Would anyone care to comment?

    - Thanks -
    - Dave Rivers -
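
For readers following the splXXX() question above, here is a minimal
user-space sketch of the general kind of race that a missing spl call can
allow. It is not kernel code and makes no claim about where the actual bug
lies. Every name in it (model_splbio, model_splx, model_sorted_insert,
model_demo, and so on) is hypothetical, and the sorted insert is a plain
ascending insert, much simpler than the real disksort(). The sketch assumes
an interrupt handler that removes the head of a request queue, and it
injects that "interrupt" in the middle of a queue insertion.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical user-space model; none of these names are kernel APIs. */
enum { LEVEL_NONE = 0, LEVEL_BIO = 1 };

struct req {
    long blkno;              /* block number, used as the sort key */
    struct req *next;
};

static int cur_level = LEVEL_NONE;   /* models the interrupt priority level */
static int pending_intr = 0;         /* interrupts held back while masked */
static int handled_intr = 0;         /* interrupts that have actually run */
static struct req *queue_head = NULL;

/* Models a "request done" handler: it removes the head of the queue. */
static void model_intr_handler(void)
{
    if (queue_head != NULL)
        queue_head = queue_head->next;
    handled_intr++;
}

/* Models the device interrupting: deferred if masked, else run at once. */
static void model_raise_intr(void)
{
    if (cur_level >= LEVEL_BIO)
        pending_intr++;
    else
        model_intr_handler();
}

/* Models splbio(): raise the level and return the previous one. */
static int model_splbio(void)
{
    int old = cur_level;
    if (cur_level < LEVEL_BIO)
        cur_level = LEVEL_BIO;
    return old;
}

/* Models splx(): restore the level, then run any deferred interrupts. */
static void model_splx(int old)
{
    cur_level = old;
    while (cur_level < LEVEL_BIO && pending_intr > 0) {
        pending_intr--;
        model_intr_handler();
    }
}

/* Ascending sorted insert; 'inject' fires an interrupt mid-update. */
static void model_sorted_insert(struct req *r, int protect, int inject)
{
    int s = protect ? model_splbio() : cur_level;
    struct req **pp = &queue_head;

    while (*pp != NULL && (*pp)->blkno <= r->blkno)
        pp = &(*pp)->next;
    if (inject)
        model_raise_intr();  /* arrives after the search, before the link */
    r->next = *pp;
    *pp = r;
    if (protect)
        model_splx(s);
}

/* Queues blocks 30, 10, 20; returns how many requests remain reachable. */
int model_demo(int protect)
{
    static struct req a, b, c;
    struct req *p;
    int n = 0;

    cur_level = LEVEL_NONE;
    pending_intr = 0;
    handled_intr = 0;
    queue_head = NULL;

    a.blkno = 30; b.blkno = 10; c.blkno = 20;
    model_sorted_insert(&a, protect, 0);
    model_sorted_insert(&b, protect, 0);
    model_sorted_insert(&c, protect, 1);   /* interrupt during this insert */

    assert(cur_level == LEVEL_NONE);       /* level restored afterwards */
    assert(handled_intr == 1);             /* the interrupt ran exactly once */
    for (p = queue_head; p != NULL; p = p->next)
        n++;
    return n;
}
```

In this model, model_demo(1) leaves blocks 20 and 30 on the queue, because
the interrupt is held until the insert has finished. model_demo(0) leaves
only block 30: the handler unlinks block 10 while the insert still holds a
pointer into it, so block 20 is linked behind a request that is no longer
on the queue and is never seen again. Whether anything like this happens in
the 2.1.6.1 kernel is exactly the open question; the sketch only shows why
a change in timing (such as added printf()s) could hide a race of this
general shape.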