From owner-freebsd-current@FreeBSD.ORG  Thu May  8 04:15:30 2003
Return-Path: <owner-freebsd-current@FreeBSD.ORG>
Delivered-To: freebsd-current@freebsd.org
Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125])
	by hub.freebsd.org (Postfix) with ESMTP id CDF4537B401
	for <current@freebsd.org>; Thu,  8 May 2003 04:15:30 -0700 (PDT)
Received: from sauron.fto.de (p15106025.pureserver.info [217.160.140.13])
	by mx1.FreeBSD.org (Postfix) with ESMTP id 637DE43F93
	for <current@freebsd.org>; Thu,  8 May 2003 04:15:29 -0700 (PDT)
	(envelope-from hschaefer@fto.de)
Received: from localhost (localhost.fto.de [127.0.0.1])
	by sauron.fto.de (Postfix) with ESMTP id AB6A125C0F7
	for <current@freebsd.org>; Thu,  8 May 2003 13:15:27 +0200 (CEST)
Received: from sauron.fto.de ([127.0.0.1])
 by localhost (sauron [127.0.0.1]) (amavisd-new, port 10024) with ESMTP
 id 11680-08 for <current@freebsd.org>; Thu,  8 May 2003 13:15:26 +0200 (CEST)
Received: from giskard.foundation.hs (p5091A08C.dip.t-dialin.net
	[80.145.160.140])
	by sauron.fto.de (Postfix) with ESMTP id 1063925C0C2
	for <current@freebsd.org>; Thu,  8 May 2003 13:15:25 +0200 (CEST)
Received: from daneel.foundation.hs (daneel.foundation.hs [192.168.20.2])
	by giskard.foundation.hs (8.9.3/8.9.3) with ESMTP id NAA63112
	for <current@freebsd.org>; Thu, 8 May 2003 13:15:25 +0200 (CEST)
	(envelope-from hschaefer@fto.de)
Date: Thu, 8 May 2003 13:15:25 +0200 (CEST)
From: Heiko Schaefer <hschaefer@fto.de>
X-X-Sender: heiko@daneel.foundation.hs
To: current@freebsd.org
Message-ID: <20030508131508.M78057@daneel.foundation.hs>
MIME-Version: 1.0
Content-Type: TEXT/PLAIN; charset=US-ASCII
X-Virus-Scanned: by amavisd-new at fto.de
Subject: Re: data corruption with current (maybe sis chipset related?)
X-BeenThere: freebsd-current@freebsd.org
X-Mailman-Version: 2.1.1
Precedence: list
List-Id: Discussions about the use of FreeBSD-current
	<freebsd-current.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-current>,
	<mailto:freebsd-current-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-current>
List-Post: <mailto:freebsd-current@freebsd.org>
List-Help: <mailto:freebsd-current-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-current>,
	<mailto:freebsd-current-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Thu, 08 May 2003 11:15:31 -0000

Hi Poul,

> >sorry to be pushy, but have you found anything - or been able to reproduce
> >anything since then ? i'm still confused how exactly you determined that
> >some data on your disk was corrupt.
>
> No, I'm not any further.
>
> What I saw was the stdout/stderr from a "make universe" that suddenly
> had a bunch of zero bytes in the middle.

if that is related, it would rule out anything that has to do with the
hard disk subsystem, wouldn't it?

> The thing we need, more than anything else, is a way to reproduce
> this on demand.

well, i can reproduce it pretty reliably, but only as a relatively
time-consuming process (it takes a number of hours). if anyone has code to
debug this (which doesn't produce too much output :)), i can gladly run it,
and the corruption will very likely occur at some point during my test.

it would appear to me that if the null bytes in your stdout/stderr are
related to my data corruption while copying, the error must be somehow
connected to memory management.

if so, i imagine that code which moves data around in memory and checksums
it every once in a while should lead to such results: start with a large
chunk of data called a, then repeat { malloc b, copy a to b, free a,
checksum b, use b as the new a }, or something like that.
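
to make the malloc/copy/checksum loop described above concrete, here is a
minimal sketch in C. it is untested against the actual problem and only an
illustration: the function name copy_checksum_test, the FNV-1a checksum, the
fill pattern, the buffer size and the COPYTEST_MAIN guard are all my own
assumptions, not anything from this thread. it prints one line per detected
change, so the output stays small.

```c
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* 32-bit FNV-1a hash, used here as the checksum (an arbitrary choice). */
static uint32_t checksum(const unsigned char *buf, size_t len)
{
    uint32_t h = 2166136261u;
    for (size_t i = 0; i < len; i++) {
        h ^= buf[i];
        h *= 16777619u;
    }
    return h;
}

/*
 * Copy a buffer into a freshly malloc'ed one, free the old copy, and
 * verify the checksum, `rounds` times over.
 * Returns the number of rounds in which the checksum changed,
 * or -1 if an allocation failed.
 */
long copy_checksum_test(size_t size, unsigned long rounds)
{
    unsigned char *a = malloc(size);
    if (a == NULL)
        return -1;

    /* Fill with non-zero bytes so that stray zero bytes alter the data. */
    for (size_t i = 0; i < size; i++)
        a[i] = (unsigned char)((i % 255) + 1);

    uint32_t expected = checksum(a, size);
    long events = 0;

    for (unsigned long r = 0; r < rounds; r++) {
        unsigned char *b = malloc(size);
        if (b == NULL) {
            free(a);
            return -1;
        }
        memcpy(b, a, size);   /* copy a to b */
        free(a);              /* release the old copy */
        a = b;                /* use b as the new a */

        uint32_t sum = checksum(a, size);
        if (sum != expected) {
            events++;
            fprintf(stderr, "round %lu: checksum changed\n", r);
            expected = sum;   /* report each change only once */
        }
    }
    free(a);
    return events;
}

#ifdef COPYTEST_MAIN
int main(void)
{
    /* 64 MB buffer, 1000 rounds: placeholder values, tune as needed. */
    long events = copy_checksum_test((size_t)64 << 20, 1000);
    printf("corruption events: %ld\n", events);
    return events == 0 ? 0 : 1;
}
#endif
```

compiled with -DCOPYTEST_MAIN this gives a standalone program. on a healthy
machine it should report zero events; whether it would actually trigger the
corruption seen here is an open question.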

however, one thing i wonder about: shouldn't copying between ide disks
with udma more or less bypass memory management (and even the cpu)?

regards,

Heiko

-- 
Free Software. Why put up with inferior code and antisocial corporations?
http://www.gnu.org/philosophy/why-free.html