From: Jaye Mathisen
To: current@freebsd.org
Date: Tue, 20 Jan 2004 09:10:30 -0800
Subject: 5.2 SMP data corruption problems...
Message-ID: <20040120171030.GR50677@backmaster.cdsnet.net>

5.2-current as of 1/15. Mobo is a Tyan HESL-T, BIOS rev is 1.04, dual 1 GHz P3s. Two 3ware controllers, latest BIOS, 16 drives.

I was seeing data corruption on large copies to the 3ware drives, via FTP, Samba, or even just tar from disk to disk. Small files never seemed to get corrupted (I md5 checksummed everything regularly), but files over 4 GB seemed to always get corrupted somewhere, although not at the same spots.

Eventually the box panicked with a lock order reversal and would not let me fsck the large partition (900 GB); it kept panicking in pass 2 with another lock order reversal.

I supped to current as of 1/19 and tried again: same thing, file corruption and lots of panics.

Finally, in the midst of just messing with stuff, I built a new kernel without the SMP/APIC stuff, and it's working fine.
Disk-to-disk copies are fine: no corruption, nothing during uploads, no panics. And I can now fsck the partition that I couldn't before, and it works fine.

I do not have the kernel dump info. The debugging was being done remotely over the phone, and there was no way I was going to transcribe it that way.

Anyway, just a heads up for those who may be running 5.2 on ServerWorks chipsets: there's possibly something wrong. The corruption is silent; if I hadn't checked, there'd be no way to know.
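
For anyone who wants to run the same kind of check, the md5 comparison mentioned above can be sketched roughly as follows. This is only a minimal Python sketch, not the commands actually used on this box: the function names `file_md5` and `copy_is_intact` and the 1 MiB chunk size are assumptions made for illustration. It reads each file in chunks, so files over 4 GB never have to fit in memory.

```python
import hashlib


def file_md5(path, chunk_size=1024 * 1024):
    """Return the hex MD5 digest of a file, reading it chunk by chunk.

    chunk_size is an arbitrary choice (1 MiB), not a tuned value.
    """
    digest = hashlib.md5()
    with open(path, "rb") as f:
        # iter() with a sentinel stops when read() returns b"" at end of file.
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def copy_is_intact(src, dst):
    """Return True if the source file and its copy have the same MD5 digest."""
    return file_md5(src) == file_md5(dst)
```

Calling `copy_is_intact("/path/to/original", "/path/to/copy")` after each large copy (both paths are placeholders) would flag a mismatch, though it cannot say where in the file the corruption occurred.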