Date: Thu, 31 Mar 2011 11:16:52 -0700
From: YongHyeon PYUN <pyunyh@gmail.com>
To: Yamagi Burmeister <lists@yamagi.org>
Cc: freebsd-net@freebsd.org, yongari@freebsd.org
Subject: Re: Kernel memory corruption(?) with age(4)
Message-ID: <20110331181651.GB11981@michelle.cdnetworks.com>
In-Reply-To: <alpine.BSF.2.00.1103311951060.2217@maka.home.yamagi.org>
References: <alpine.BSF.2.00.1103301620110.17846@saya.home.yamagi.org>
 <20110330173145.GB8601@michelle.cdnetworks.com>
 <alpine.BSF.2.00.1103302137330.1646@maka.home.yamagi.org>
 <20110330202858.GC8601@michelle.cdnetworks.com>
 <alpine.BSF.2.00.1103310859310.3082@saya.home.yamagi.org>
 <20110331171302.GA11981@michelle.cdnetworks.com>
 <alpine.BSF.2.00.1103311951060.2217@maka.home.yamagi.org>
--xgyAXRrhYN0wYx8y
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline

On Thu, Mar 31, 2011 at 08:07:17PM +0200, Yamagi Burmeister wrote:
> On Thu, 31 Mar 2011, YongHyeon PYUN wrote:
>
> >>All boxes are quadcore machines with 8GB RAM, running FreeBSD/amd64.
> >>After limiting the memory via hw.physmem to 3GB the problems are gone.
> >>The box is running crashfree for more than 6 hours and has served over
> >>300GB of data via age(4).
> >>
> >
> >Thanks for testing. Remove the hw.physmem configuration and try
> >attached patch and let me know how it goes.
>
> Thanks for your help, but the patch doesn't work. Another random panic -
> this time "page fault in kernel mode" - with nothing age(4) or network
> stack related stuff in the backtrace...
>
> Maybe it'll help to know about a bug fix in the linux atl1 driver, now
> replaced by atlx. In git commit 5f08e46b621a769e52a9545a23ab1d5fb2aec1d4
> 64 bit DMA was disabled:
>
>   64-bit DMA causes data corruption with atl1. We don't know why, and
>   Atheros is working on it. For now, just use 32-bit DMA. This is a big
>   hack that is probably wrong, but it stops the bleeding.
>
> There was no later follow up on it. I think that this can't be problem
> on FreeBSD but maybe I'm reading the driver code wrong. The kernel.org
> gitweb URL is:
>
> http://git.kernel.org/?p=linux/kernel/git/stable/linux-2.6.23.y.git;a=commitdiff;h=5f08e46b621a769e52a9545a23ab1d5fb2aec1d4
>

Thanks a lot!
It seems the L1 controller has a data corruption issue when 64bit DMA
addressing is used. Try this one.

--xgyAXRrhYN0wYx8y
Content-Type: text/plain; charset=us-ascii
Content-Disposition: attachment; filename="age.dma.diff2"

Index: sys/dev/age/if_age.c
===================================================================
--- sys/dev/age/if_age.c    (revision 220116)
+++ sys/dev/age/if_age.c    (working copy)
@@ -1092,10 +1092,13 @@
      * Create Tx/Rx buffer parent tag.
      * L1 supports full 64bit DMA addressing in Tx/Rx buffers
      * so it needs separate parent DMA tag.
+     * XXX
+     * It seems enabling 64bit DMA causes data corruption. Limit
+     * DMA address space to 32bit.
      */
     error = bus_dma_tag_create(
         bus_get_dma_tag(sc->age_dev), /* parent */
         1, 0,                         /* alignment, boundary */
-        BUS_SPACE_MAXADDR,            /* lowaddr */
+        BUS_SPACE_MAXADDR_32BIT,      /* lowaddr */
         BUS_SPACE_MAXADDR,            /* highaddr */
         NULL, NULL,                   /* filter, filterarg */
@@ -2452,6 +2455,9 @@
         /* Update the consumer index. */
         sc->age_cdata.age_rr_cons = rr_cons;
 
+        bus_dmamap_sync(sc->age_cdata.age_rx_ring_tag,
+            sc->age_cdata.age_rx_ring_map,
+            BUS_DMASYNC_PREREAD | BUS_DMASYNC_PREWRITE);
         /* Sync descriptors. */
         bus_dmamap_sync(sc->age_cdata.age_rr_ring_tag,
             sc->age_cdata.age_rr_ring_map,

--xgyAXRrhYN0wYx8y--
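
For readers following the thread, here is a minimal user-space sketch of what
a 32bit lowaddr limit on the parent tag is meant to achieve. In bus_dma(9),
lowaddr and highaddr bound the window of addresses the device cannot reach
directly, so buffers above lowaddr have to be bounced. The names
SKETCH_MAXADDR_32BIT, SKETCH_MAXADDR and sketch_dma_reachable() are
hypothetical stand-ins invented for this illustration; they are not part of
the driver or of the kernel API.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-ins for this sketch only (not the kernel's macros). */
#define SKETCH_MAXADDR_32BIT 0xFFFFFFFFULL
#define SKETCH_MAXADDR       UINT64_MAX

/*
 * Return true if a buffer of len bytes starting at physical address
 * paddr lies entirely at or below lowaddr, i.e. the device could DMA
 * to it directly. Anything reaching above lowaddr would need a bounce
 * buffer instead.
 */
static bool
sketch_dma_reachable(uint64_t paddr, uint64_t len, uint64_t lowaddr)
{
    uint64_t last;

    assert(len > 0);
    last = paddr + (len - 1);
    if (last < paddr)           /* address arithmetic wrapped around */
        return false;
    return last <= lowaddr;
}
```

With the 32bit limit, a 2048 byte buffer at 2GB (0x80000000) is reachable,
while one at 6GB (0x180000000) is not. That would be consistent with the
hw.physmem=3GB test above, where presumably no buffer ever landed above 4GB.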