Date: Wed, 23 Apr 2008 13:53:47 +0900
From: Pyun YongHyeon
To: pluknet
Cc: stable@freebsd.org
Subject: Re: nfs-server silent data corruption
Message-ID: <20080423045347.GE54715@cdnetworks.co.kr>
References: <20080421094718.GY25623@hub.freebsd.org> <200804211537.m3LFbaZA086977@lava.sentex.ca> <200804221501.m3MF1guW092221@lava.sentex.ca> <200804221741.m3MHfYjO092795@lava.sentex.ca> <200804221807.m3MI73bN092981@lava.sentex.ca>
Reply-To: pyunyh@gmail.com
List-Id: Production branch of FreeBSD source code

On Wed, Apr 23, 2008 at 12:13:44AM +0400, pluknet wrote:
> On 22/04/2008, Mike Tancsa wrote:
> > At 02:00 PM 4/22/2008, Arno J. Klaassen wrote:
> > >
> > >
> > > > Are you using the latest RELENG_7, or at least the latest version of
> > > > nfe thats in RELENG_7 ?
> > >
> > >
> > > Think so :
> > >
> >
> > OK, and it is the latest RELENG_7 ? Or just the if_nfe.c file has been
> > manually updated ? Also, you are using ULE or the 4BSD scheduler ? I still
> > have 4BSD on the box I am testing on.
>
> Hi, I have the same problem with data corruption (with nfe on nfs server side),
> particularly when transferring large files.
> Maybe this is somehow associated with the topic.
>
> My simple test case:
> truncate -s 1000m bigfile
> ^^ here I get zero-filed file
> cp bigfile /nfs/mounted
> ^^ here I get not-at-all-zero-filed file, after uploading to nfs server
>
> I looked at the corrupted file. It contains a few ranges, filed with
> non-zero bytes:
> equal to zero?   real 4-byte value   offset
> ======================================
> not equal 1200355616 at pos=38797316
> ... <-- this range contains per-4bytes garbage, omit
> not equal 3879749905 at pos=38813696
>
> not equal 161160732 at pos=45613060
> ... <-- ditto
> not equal 575257183 at pos=45629440
>
> not equal 1943682165 at pos=59768836
> ... <-- ditto
> not equal 2843639625 at pos=59785216
>
> not equal 2653910121 at pos=60293124
> ... <-- ditto
> not equal 3462830780 at pos=60309504
>
> Some info:
>
> nfs server on 8-CURRENT as of Apr 17
> nfs client on 7.0-STABLE as of Apr 12
>
> dmesg | grep nfe
> nfe0: port 0xe000-0xe007 mem
> 0xe2001000-0xe2001fff irq 20 at device 4.0 on pci0
> miibus0: on nfe0
> nfe0: Ethernet address: 00:04:61:6c:76:b1
> nfe0: [FILTER]
> nfe0: tx v1 error 0x6001
> nfe0: tx v1 error 0x6001
> nfe0: tx v1 error 0x6001
> nfe0: tx v1 error 0x6001
> nfe0: tx v1 error 0x6001
> nfe0: tx v1 error 0x6001
> nfe0: tx v1 error 0x6001
> nfe0: tx v1 error 0x6001
> nfe0: tx v1 error 0x6001
> nfe0: tx v1 error 0x6001
> nfe0: tx v1 error 0x6001
> nfe0: tx v1 error 0x6001
> ^^^

I'm not sure this is related to the data corruption issue, but 0x6001
would mean a Tx underflow error. I recall these Tx errors being seen on
nfe(4) when the negotiated speed/duplex did not match between the link
partner and the MAC. Does the link partner agree with the speed/duplex
settings of nfe(4)? Which PHY driver does nfe(4) use?

> This appears while cp'ing file to server.
> (btw they do not appear with disabled polling, probably it's an another issue)
>
> vmstat -i | grep nfe
> irq20: nfe0 ohci0 1 0
>
> nfe0: flags=8843 metric 0 mtu 1500
> options=48
> ether 00:04:61:6c:76:b1
> inet 192.168.200.137 netmask 0xffffff00 broadcast 192.168.200.255
> media: Ethernet autoselect (100baseTX )
> status: active
> I can reproduce it regardless polling presence.
>
> nfe0@pci0:0:4:0: class=0x020000 card=0x10001695 chip=0x006610de
> rev=0xa1 hdr=0x00
>

-- 
Regards,
Pyun YongHyeon
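
For anyone who wants to repeat the zero-check from the quoted test case, here is a minimal sketch of one way to do it. It is not the tool used in the report above, which was not posted. The function names (nonzero_words, group_ranges, report) are made up for illustration, and reading each 4-byte value as little-endian unsigned is an assumption.

```python
import struct

WORD = 4  # compare 4-byte values, as in the quoted report


def nonzero_words(path, chunk_size=1 << 20):
    """Yield (offset, value) for every non-zero 4-byte word in the file.

    Assumption: words are decoded as little-endian unsigned integers.
    A trailing partial word (file size not a multiple of 4) is ignored.
    """
    assert chunk_size % WORD == 0
    offset = 0
    with open(path, "rb") as f:
        while True:
            buf = f.read(chunk_size)
            if not buf:
                break
            # Fast path: skip chunks that consist only of zero bytes.
            if buf.count(0) != len(buf):
                usable = len(buf) - len(buf) % WORD
                for i in range(0, usable, WORD):
                    (value,) = struct.unpack_from("<I", buf, i)
                    if value:
                        yield offset + i, value
            offset += len(buf)


def group_ranges(words):
    """Merge adjacent non-zero words into (first_offset, last_offset) pairs."""
    ranges = []
    for offset, _value in words:
        if ranges and offset == ranges[-1][1] + WORD:
            ranges[-1][1] = offset  # extends the current run
        else:
            ranges.append([offset, offset])
    return [tuple(r) for r in ranges]


def report(path):
    """Print one line per corrupted range."""
    for first, last in group_ranges(nonzero_words(path)):
        print(f"non-zero from pos={first} to pos={last} "
              f"({last - first + WORD} bytes)")
```

Calling report() on the copy that landed on the NFS server would list each damaged range with its length. In the quoted report, the first and last offsets of each of the four ranges are 16380 bytes apart (for example 38813696 - 38797316), which this grouping would make visible directly.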