From: YongHyeon PYUN
Date: Thu, 31 Mar 2011 11:30:54 -0700
To: Yamagi Burmeister
Message-ID: <20110331183054.GC11981@michelle.cdnetworks.com>
References:
 <20110330173145.GB8601@michelle.cdnetworks.com>
 <20110330202858.GC8601@michelle.cdnetworks.com>
 <20110331171302.GA11981@michelle.cdnetworks.com>
 <20110331181651.GB11981@michelle.cdnetworks.com>
Mime-Version: 1.0
Content-Type: multipart/mixed; boundary="JgQwtEuHJzHdouWu"
Content-Disposition: inline
In-Reply-To: <20110331181651.GB11981@michelle.cdnetworks.com>
User-Agent: Mutt/1.4.2.3i
Cc: freebsd-net@freebsd.org, yongari@freebsd.org
Subject: Re: Kernel memory corruption(?) with age(4)
Reply-To: pyunyh@gmail.com
List-Id: Networking and TCP/IP with FreeBSD

--JgQwtEuHJzHdouWu
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline

On Thu, Mar 31, 2011 at 11:16:52AM -0700, YongHyeon PYUN wrote:
> On Thu, Mar 31, 2011 at 08:07:17PM +0200, Yamagi Burmeister wrote:
> > On Thu, 31 Mar 2011, YongHyeon PYUN wrote:
> >
> > >>All boxes are quadcore machines with 8GB RAM, running FreeBSD/amd64.
> > >>After limiting the memory via hw.physmem to 3GB the problems are gone.
> > >>The box is running crashfree for more than 6 hours and has served over
> > >>300GB of data via age(4).
> > >>
> > >
> > >Thanks for testing. Remove the hw.physmem configuration and try
> > >attached patch and let me know how it goes.
> >
> > Thanks for your help, but the patch doesn't work. Another random panic -
> > this time "page fault in kernel mode" - with nothing age(4) or network
> > stack related stuff in the backtrace...
> >
> > Maybe it'll help to know about a bug fix in the linux atl1 driver, now
> > replaced by atlx. In git commit 5f08e46b621a769e52a9545a23ab1d5fb2aec1d4
> > 64 bit DMA was disabled:
> >
> > 64-bit DMA causes data corruption with atl1. We don't know why, and
> > Atheros is working on it. For now, just use 32-bit DMA.
> > This is a big hack that is probably wrong, but it stops the bleeding.
> >
> > There was no later follow up on it. I think that this can't be problem
> > on FreeBSD but maybe I'm reading the driver code wrong. The kernel.org
> > gitweb URL is:
> >
> > http://git.kernel.org/?p=linux/kernel/git/stable/linux-2.6.23.y.git;a=commitdiff;h=5f08e46b621a769e52a9545a23ab1d5fb2aec1d4
> >
>
> Thanks a lot! It seems the L1 controller has data corruption issue
> when 64bit DMA addressing is used. Try this one.

Oops, there was a bug in the previous patch. Try this instead.

--JgQwtEuHJzHdouWu
Content-Type: text/plain; charset=us-ascii
Content-Disposition: attachment; filename="age.dma.diff3"

Index: sys/dev/age/if_age.c
===================================================================
--- sys/dev/age/if_age.c	(revision 220116)
+++ sys/dev/age/if_age.c	(working copy)
@@ -1092,11 +1092,14 @@
 	 * Create Tx/Rx buffer parent tag.
 	 * L1 supports full 64bit DMA addressing in Tx/Rx buffers
 	 * so it needs separate parent DMA tag.
+	 * XXX
+	 * It seems enabling 64bit DMA causes data corruption. Limit
+	 * DMA address space to 32bit.
 	 */
 	error = bus_dma_tag_create(
 	    bus_get_dma_tag(sc->age_dev), /* parent */
 	    1, 0,			/* alignment, boundary */
-	    BUS_SPACE_MAXADDR,		/* lowaddr */
+	    BUS_SPACE_MAXADDR_32BIT,	/* lowaddr */
 	    BUS_SPACE_MAXADDR,		/* highaddr */
 	    NULL, NULL,			/* filter, filterarg */
 	    BUS_SPACE_MAXSIZE_32BIT,	/* maxsize */
@@ -2452,6 +2455,9 @@
 		/* Update the consumer index. */
 		sc->age_cdata.age_rr_cons = rr_cons;
 
+		bus_dmamap_sync(sc->age_cdata.age_rx_ring_tag,
+		    sc->age_cdata.age_rx_ring_map,
+		    BUS_DMASYNC_PREREAD | BUS_DMASYNC_PREWRITE);
 		/* Sync descriptors. */
 		bus_dmamap_sync(sc->age_cdata.age_rr_ring_tag,
 		    sc->age_cdata.age_rr_ring_map,
--JgQwtEuHJzHdouWu--
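
For readers following the lowaddr change in the diff above, here is a minimal userspace sketch in C of what that parameter expresses. It is not kernel code and not part of if_age.c. In bus_dma(9), physical addresses above a tag's lowaddr are treated as not directly reachable by the device, so buffers that land there have to be bounced through lower memory. Everything named sketch_* or SKETCH_* is a hypothetical stand-in invented for illustration, and the 32-bit limit is assumed to be 0xFFFFFFFF.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-ins for the two lowaddr values seen in the diff. */
#define SKETCH_MAXADDR_32BIT UINT64_C(0xFFFFFFFF)
#define SKETCH_MAXADDR       UINT64_MAX

/*
 * Return true when the buffer [paddr, paddr + len) lies entirely at or
 * below lowaddr, i.e. the device could reach it without bouncing.
 * The comparison is arranged so that paddr + len cannot overflow.
 */
bool
sketch_dma_reachable(uint64_t paddr, uint64_t len, uint64_t lowaddr)
{
	if (paddr > lowaddr)
		return (false);
	if (len == 0)
		return (true);
	return (len - 1 <= lowaddr - paddr);
}

/* A few illustrative cases (assumed addresses, not taken from the thread). */
void
sketch_selfcheck(void)
{
	/* A page just below 2GB is reachable under the 32-bit limit. */
	assert(sketch_dma_reachable(UINT64_C(0x7FFFF000), 4096,
	    SKETCH_MAXADDR_32BIT));
	/* A page starting at 4GB is not. */
	assert(!sketch_dma_reachable(UINT64_C(0x100000000), 4096,
	    SKETCH_MAXADDR_32BIT));
	/* Neither is a page that straddles the 4GB boundary. */
	assert(!sketch_dma_reachable(UINT64_C(0xFFFFF800), 4096,
	    SKETCH_MAXADDR_32BIT));
	/* With the unrestricted limit the same high page is reachable. */
	assert(sketch_dma_reachable(UINT64_C(0x100000000), 4096,
	    SKETCH_MAXADDR));
}
```

This reading would also be consistent with the hw.physmem observation quoted earlier: assuming all usable RAM then sits below 4GB, no Tx/Rx buffer can end up above the 32-bit limit, so the 64bit addressing path is never exercised.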