Date:      Thu, 31 Mar 2011 11:30:54 -0700
From:      YongHyeon PYUN <pyunyh@gmail.com>
To:        Yamagi Burmeister <lists@yamagi.org>
Cc:        freebsd-net@freebsd.org, yongari@freebsd.org
Subject:   Re: Kernel memory corruption(?) with age(4)
Message-ID:  <20110331183054.GC11981@michelle.cdnetworks.com>
In-Reply-To: <20110331181651.GB11981@michelle.cdnetworks.com>
References:  <alpine.BSF.2.00.1103301620110.17846@saya.home.yamagi.org> <20110330173145.GB8601@michelle.cdnetworks.com> <alpine.BSF.2.00.1103302137330.1646@maka.home.yamagi.org> <20110330202858.GC8601@michelle.cdnetworks.com> <alpine.BSF.2.00.1103310859310.3082@saya.home.yamagi.org> <20110331171302.GA11981@michelle.cdnetworks.com> <alpine.BSF.2.00.1103311951060.2217@maka.home.yamagi.org> <20110331181651.GB11981@michelle.cdnetworks.com>

--JgQwtEuHJzHdouWu
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline

On Thu, Mar 31, 2011 at 11:16:52AM -0700, YongHyeon PYUN wrote:
> On Thu, Mar 31, 2011 at 08:07:17PM +0200, Yamagi Burmeister wrote:
> > On Thu, 31 Mar 2011, YongHyeon PYUN wrote:
> > 
> > >>All boxes are quadcore machines with 8GB RAM, running FreeBSD/amd64.
> > >>After limiting the memory via hw.physmem to 3GB the problems are gone.
> > >>The box is running crashfree for more than 6 hours and has served over
> > >>300GB of data via age(4).
> > >>
> > >
> > >Thanks for testing. Remove the hw.physmem configuration and try
> > >attached patch and let me know how it goes.
> > 
> > Thanks for your help, but the patch doesn't work. Another random panic -
> > this time "page fault in kernel mode" - with nothing age(4) or network
> > stack related stuff in the backtrace...
> > 
> > Maybe it'll help to know about a bug fix in the linux atl1 driver, now
> > replaced by atlx. In git commit 5f08e46b621a769e52a9545a23ab1d5fb2aec1d4
> > 64 bit DMA was disabled:
> > 
> >   64-bit DMA causes data corruption with atl1.  We don't know why, and
> >   Atheros is working on it. For now, just use 32-bit DMA. This is a big
> >   hack that is probably wrong, but it stops the bleeding.
> > 
> > There was no later follow up on it. I think that this can't be problem
> > on FreeBSD but maybe I'm reading the driver code wrong. The kernel.org
> > gitweb URL is:
> > 
> > http://git.kernel.org/?p=linux/kernel/git/stable/linux-2.6.23.y.git;a=commitdiff;h=5f08e46b621a769e52a9545a23ab1d5fb2aec1d4
> > 
> 
> Thanks a lot! It seems the L1 controller has data corruption issue
> when 64bit DMA addressing is used. Try this one.

Oops, there was a bug in the previous patch.
Try this one instead.
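
For reference, here is a rough sketch of what lowering lowaddr is meant
to achieve, and why capping hw.physmem at 3GB probably hid the problem.
As I understand bus_dma(9), memory above lowaddr is treated as not
directly reachable by the device, so buffers there get bounced into low
memory. The helper and macro names below are made up for illustration
and are not part of if_age.c; the 0xFFFFFFFF limit is assumed to match
BUS_SPACE_MAXADDR_32BIT.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumed to mirror BUS_SPACE_MAXADDR_32BIT: top of the 32-bit space. */
#define	EXAMPLE_MAXADDR_32BIT	0xFFFFFFFFULL

/*
 * Hypothetical helper (not in the driver): return true when any byte
 * of the buffer [paddr, paddr + len) lies above lowaddr, i.e. when the
 * buffer could not be handed to a device limited to lowaddr as-is.
 */
static bool
example_above_lowaddr(uint64_t paddr, uint64_t len, uint64_t lowaddr)
{

	if (len == 0)
		return (false);
	if (paddr > lowaddr)
		return (true);
	/* Compare the last byte without overflowing paddr + len. */
	return (len - 1 > lowaddr - paddr);
}
```

With physical memory capped so that it all sits below 4GB, a check like
this never fires, which would be consistent with the box staying up
under the hw.physmem workaround.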

--JgQwtEuHJzHdouWu
Content-Type: text/plain; charset=us-ascii
Content-Disposition: attachment; filename="age.dma.diff3"

Index: sys/dev/age/if_age.c
===================================================================
--- sys/dev/age/if_age.c	(revision 220116)
+++ sys/dev/age/if_age.c	(working copy)
@@ -1092,11 +1092,14 @@
 	 * Create Tx/Rx buffer parent tag.
 	 * L1 supports full 64bit DMA addressing in Tx/Rx buffers
 	 * so it needs separate parent DMA tag.
+	 * XXX
+	 * It seems enabling 64bit DMA causes data corruption. Limit
+	 * DMA address space to 32bit.
 	 */
 	error = bus_dma_tag_create(
 	    bus_get_dma_tag(sc->age_dev), /* parent */
 	    1, 0,			/* alignment, boundary */
-	    BUS_SPACE_MAXADDR,		/* lowaddr */
+	    BUS_SPACE_MAXADDR_32BIT,	/* lowaddr */
 	    BUS_SPACE_MAXADDR,		/* highaddr */
 	    NULL, NULL,			/* filter, filterarg */
 	    BUS_SPACE_MAXSIZE_32BIT,	/* maxsize */
@@ -2452,6 +2455,9 @@
 		/* Update the consumer index. */
 		sc->age_cdata.age_rr_cons = rr_cons;
 
+		bus_dmamap_sync(sc->age_cdata.age_rx_ring_tag,
+		    sc->age_cdata.age_rx_ring_map,
+		    BUS_DMASYNC_PREREAD | BUS_DMASYNC_PREWRITE);
 		/* Sync descriptors. */
 		bus_dmamap_sync(sc->age_cdata.age_rr_ring_tag,
 		    sc->age_cdata.age_rr_ring_map,

--JgQwtEuHJzHdouWu--


