From owner-freebsd-net@FreeBSD.ORG Wed Mar 30 17:33:01 2011
From: YongHyeon PYUN
Date: Wed, 30 Mar 2011 10:31:45 -0700
To: Yamagi Burmeister
Message-ID: <20110330173145.GB8601@michelle.cdnetworks.com>
Mime-Version: 1.0
Content-Type: text/plain;
 charset=us-ascii
Content-Disposition: inline
User-Agent: Mutt/1.4.2.3i
Cc: freebsd-net@freebsd.org, yongari@freebsd.org
Subject: Re: Kernel memory corruption(?) with age(4)
Reply-To: pyunyh@gmail.com
List-Id: Networking and TCP/IP with FreeBSD

On Wed, Mar 30, 2011 at 04:22:23PM +0200, Yamagi Burmeister wrote:
> Hi,
> I recently got four roughly two-year-old Asus M3A-H/HDMI mainboards with
> an integrated Attansic L2 ethernet controller. This NIC is supported by
> age(4) and recognized by FreeBSD:
>
> ----
>
> age0:
> mem 0xfeac0000-0xfeafffff irq 18 at device 0.0 on pci2
> age0: 1280 Tx FIFO, 2364 Rx FIFO
> age0: Using 1 MSI messages.
> age0: 4GB boundary crossed, switching to 32bit DMA addressing mode.
> miibus0: on age0
> atphy0: PHY 0 on miibus0
> atphy0: 10baseT, 10baseT-FDX, 100baseTX, 100baseTX-FDX, 1000baseT-FDX,
> 1000baseT-FDX-master, auto
> age0: Ethernet address: 00:23:54:31:a0:12
> age0: [FILTER]
>
> ----
>
> age0: flags=8843 metric 0 mtu 1500
> options=c319b WOL_MCAST,WOL_MAGIC,VLAN_HWTSO,LINKSTATE>
> ether 00:23:54:31:a0:12
> inet6 fe80::223:54ff:fe31:a012%age0 prefixlen 64 scopeid 0x1
> nd6 options=3
> media: Ethernet autoselect (none)
> status: no carrier
>
> ----
>
> All four boxes are unstable if the Attansic NIC is in use; none of them
> survived more than 60 minutes of ~20mb/s network traffic. I managed to
> get some coredumps and extracted the backtraces. Since every time one of
> the boxes panicked I got a different panic message and a different
> backtrace with a different subsystem involved, I suspected broken
> hardware. I plugged an em(4) NIC into the PCI slot and wasn't able to
> reproduce the problem; in fact the boxes ran rock solid for several days.
> Next I set up Windows 7, installed the Attansic vendor driver and did
> another run. All went smoothly, no crash for nearly 24 hours.
>
> My guess is kernel memory corruption by age(4), which would explain all
> the different backtraces and the different panic messages. This problem
> is reproducible in at least FreeBSD 7.4 and 8.2, with TSO4 both enabled
> and disabled. I'm willing to debug this, but I really don't know how. So
> any help or a pointer in the right direction would be appreciated.
>

AFAIK this is the first report of possible memory corruption triggered
by age(4). I'm still not sure whether it's caused by age(4), but you can
disable RX checksum offloading and see whether that makes any
difference. Since I no longer have access to the hardware, it would be
even better if you could tell me which traffic pattern triggered the
issue.
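
For reference, a minimal sketch of the suggested test. It assumes the interface is still named age0, as in the quoted output, and that the commands are run as root on the affected box; adjust the name if it differs.

```shell
# Assumption: the NIC is age0, as in the dmesg/ifconfig output quoted above.
# Show the currently enabled capabilities (look at the options= line).
ifconfig age0

# Turn off RX checksum offloading on the interface.
ifconfig age0 -rxcsum

# Confirm the capability is no longer listed before re-running the
# traffic test that previously caused the panics.
ifconfig age0
```

The change made by ifconfig this way does not survive a reboot, so it has to be reapplied after each panic before the next test run.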