From: Adrian Chadd
To: John Hay
Cc: "freebsd-wireless@freebsd.org", "freebsd-embedded@freebsd.org", Ian Lepore
Date: Mon, 14 Jul 2014 12:55:19 -0700
Subject: Re: CAMBRIA and more than one atheros card
In-Reply-To: <20140714194932.GA51947@zibbi.meraka.csir.co.za>

Right. What's the output of sysctl hw.ath?


-a


On 14 July 2014 12:49, John Hay wrote:
> On Mon, Jul 14, 2014 at 12:23:47PM -0700, Adrian Chadd wrote:
>> Ah, you're going to need more than 64 bounce pages. That's just not
>> enough for all the mbufs the 11n support requires.
>>
>> It likely didn't show up in previous incarnations of this because
>> pre-11n support had a much smaller pool of buffers in use.
>>
>> For doing 11n, there's hm, 256 TX and 256 RX buffers for each NIC? Or
>> is it 512 and 512? It's something large like that. So assuming 512,
>> that's 1024 buffers * 4096 bytes each, so 4 MB of bounce buffers
>> needed per NIC.
>>
>> You could tweak ATH_TXBUF and ATH_RXBUF down to something like 128 TX
>> and 128 RX, but you can't go much lower than that per NIC as an A-MPDU
>> aggregate requires up to 64 buffers to transmit and/or receive and if
>> you're not careful you'll actually overrun the RX FIFO and/or starve
>> the TX code of mbufs to transmit with.
>
> I don't mind upping the bounce buffers, if that will help. I'll see how
> high I can go before it explodes. :-) I guess one cannot easily use
> the RAM above 64M for ram disks and code and then not bounce at
> all. :-/
>
> I just had an ok boot, i.e. all 3 aths came up without an error.
> So just after a login, the bounce stats looked like this:
>
> tst-11-arm:~ # sysctl hw.busdma
> hw.busdma.total_bpages: 64
> hw.busdma.zone0.total_bpages: 64
> hw.busdma.zone0.free_bpages: 57
> hw.busdma.zone0.reserved_bpages: 0
> hw.busdma.zone0.active_bpages: 7
> hw.busdma.zone0.total_bounced: 525
> hw.busdma.zone0.total_deferred: 0
> hw.busdma.zone0.lowaddr: 0x4000fff
> hw.busdma.zone0.alignment: 4096
>
> The usage seems to slowly climb though. After 30 minutes it looks like
> this:
>
> tst-11-arm:~ # sysctl hw.busdma
> hw.busdma.total_bpages: 64
> hw.busdma.zone0.total_bpages: 64
> hw.busdma.zone0.free_bpages: 43
> hw.busdma.zone0.reserved_bpages: 0
> hw.busdma.zone0.active_bpages: 21
> hw.busdma.zone0.total_bounced: 4713
> hw.busdma.zone0.total_deferred: 0
> hw.busdma.zone0.lowaddr: 0x4000fff
> hw.busdma.zone0.alignment: 4096
>
> The board is pretty idle, with no routing daemon or traffic except for
> the occasional broadcast and multicasts that arrive.
>
> John
>
>>
>>
>> -a
>>
>>
>> On 14 July 2014 12:14, John Hay wrote:
>> > On Mon, Jul 14, 2014 at 09:09:46PM +0200, John Hay wrote:
>> >> On Mon, Jul 14, 2014 at 12:06:34PM -0700, Adrian Chadd wrote:
>> >> > .. why's it need bounce buffers?
>> >>
>> >> I found this:
>> >> arm/xscale/ixp425/ixp425_pci.c: /* NB: PCI dma window is 64M so anything above must be bounced */
>> >
>> > I found a sysctl to show some bounce info. :-)
>> >
>> > On the boot that ath1 and ath2 failed during ifconfig and ath0 started
>> > to complain after a while:
>> >
>> > #####################
>> > tst-11-arm:~ # sysctl hw.busdma
>> > hw.busdma.total_bpages: 64
>> > hw.busdma.zone0.total_bpages: 64
>> > hw.busdma.zone0.free_bpages: 0
>> > hw.busdma.zone0.reserved_bpages: 0
>> > hw.busdma.zone0.active_bpages: 64
>> > hw.busdma.zone0.total_bounced: 1191
>> > hw.busdma.zone0.total_deferred: 0
>> > hw.busdma.zone0.lowaddr: 0x4000fff
>> > hw.busdma.zone0.alignment: 4096
>> > #####################
>> >
>> > I then thought to down wlan0 to see if the page count changed, but
>> > do not know. :-)
>> >
>> > #####################
>> > tst-11-arm:~ # ifconfig wlan0 down
>> > dev.ath.0.debug: 0 -> 8
>> > /sbin/ifconfig.bin wlan0 down
>> > ath0: ath_stop_locked: invalid 0 if_flags 0x8802
>> > Sleeping thread (tid 100019, pid 0) owns a non-sleepable lock
>> > panic: sleeping thread
>> > Uptime: 32m46s
>> > Automatic reboot in 15 seconds - press a key on the console to abort
>> > Rebooting...
>> > #####################
>> >
>> > John
>> >
>> >>
>> >> John
>> >> >
>> >> >
>> >> > -a
>> >> >
>> >> >
>> >> > On 14 July 2014 11:42, John Hay wrote:
>> >> > > On Mon, Jul 14, 2014 at 09:48:56AM -0700, Adrian Chadd wrote:
>> >> > >> Hm, it could be some bus related stupidity. It's allocating 2k or 4k
>> >> > >> mbufs for the receive path because wifi frames are bigger than
>> >> > >> ethernet frames. If you're not seeing failures in 4k mbuf allocations
>> >> > >> then I'm not sure what it could be.
>> >> > >>
>> >> > >> I'll see about firing it up locally and checking.
>> >> > >
>> >> > > I'm hunting a bit more and it looks like the fail is in the bounce pages.
>> >> > > It looks like the calls look like this:
>> >> > >
>> >> > > ath_legacy_rxbuf_init()
>> >> > > bus_dmamap_load_mbuf_sg()
>> >> > > _bus_dmamap_load_buffer()
>> >> > > _bus_dmamap_reserve_pages()
>> >> > > reserve_bounce_pages()
>> >> > >
>> >> > > I have added a printf at the start of alloc_bounce_pages() and it seems
>> >> > > that it is only called (twice) when ath0 is probed. Is bus_dma_tag_t dmat,
>> >> > > the first argument to alloc_bounce_pages(), common for the whole pci bus?
>> >> > >
>> >> > > The start of the boot, with my printf looks like this:
>> >> > >
>> >> > > ####################
>> >> > > FreeBSD ARM (Gateworks Cambria) boot2 v0.4
>> >> > > -
>> >> > > Default: /boot/kernel/kernel
>> >> > > boot:
>> >> > > Copyright (c) 1992-2014 The FreeBSD Project.
>> >> > > Copyright (c) 1979, 1980, 1983, 1986, 1988, 1989, 1991, 1992, 1993, 1994
>> >> > > The Regents of the University of California. All rights reserved.
>> >> > > FreeBSD is a registered trademark of The FreeBSD Foundation.
>> >> > > FreeBSD 11.0-CURRENT #9 r268502M: Mon Jul 14 15:30:15 SAST 2014
>> >> > > jhay@dolphin.meraka.csir.co.za:/usr/obj/arm.armeb/snaps/arm/11-tst/src/sys/SMALL-CAMBRIA arm
>> >> > > gcc version 4.2.1 20070831 patched [FreeBSD]
>> >> > > CPU: IXP435 rev 1 (ARMv5TE) (XScale core)
>> >> > > Big-endian DC enabled IC enabled WB enabled LABT branch prediction enabled
>> >> > > 32KB/32B 32-way instruction cache
>> >> > > 32KB/32B 32-way write-back-locking data cache
>> >> > > real memory = 134213632 (127 MB)
>> >> > > avail memory = 124440576 (118 MB)
>> >> > > random device not loaded; using insecure entropy
>> >> > > wlan: mac acl policy registered
>> >> > > random: initialized
>> >> > > ixp0:
>> >> > > ixp0: 37e7f
>> >> > > pcib0: on ixp0
>> >> > > pci0: on pcib0
>> >> > > ath0: irq 28 at device 1.0 on pci0
>> >> > > alloc_bounce_pages: numpages 63
>> >> > > [ath] enabling AN_TOP2_FIXUP
>> >> > > alloc_bounce_pages: numpages 1
>> >> > > ath0: AR9220 mac 128.2 RF5133 phy 13.0
>> >> > > ath0: 2GHz radio: 0x0000; 5GHz radio: 0x00c0
>> >> > > ath1: irq 27 at device 2.0 on pci0
>> >> > > [ath] enabling AN_TOP2_FIXUP
>> >> > > ath1: AR9220 mac 128.2 RF5133 phy 13.0
>> >> > > ath1: 2GHz radio: 0x0000; 5GHz radio: 0x00c0
>> >> > > ath2: irq 26 at device 3.0 on pci0
>> >> > > ath2: AR5413 mac 10.5 RF5413 phy 6.1
>> >> > > ath2: 2GHz radio: 0x0000; 5GHz radio: 0x0063
>> >> > > ixpclk0: on ixp0
>> >> > > ####################
>> >> > >
>> >> > > Interesting, with this boot, with the new kernel, the aths did not fail.
>> >> > > Could the printf have changed something or was it just coincidence? With
>> >> > > a reboot ath1 and ath2 failed again during configuration and a little
>> >> > > while later ath0 started to complain:
>> >> > >
>> >> > > ath0: ath_rx_proc: no mbuf!
>> >> > >
>> >> > > John
>> >> > >
>> >> > >>
>> >> > >>
>> >> > >>
>> >> > >> -a
>> >> > >>
>> >> > >> On 14 July 2014 01:07, John Hay wrote:
>> >> > >> > On Thu, Jul 10, 2014 at 06:02:27AM +0200, John Hay wrote:
>> >> > >> >> On Wed, Jul 09, 2014 at 06:46:12AM +0200, John Hay wrote:
>> >> > >> >> > Hi Guys,
>> >> > >> >> >
>> >> > >> >> > The problem is back / still there. I initially saw the problem during
>> >> > >> >> > boot, with the interface configs in rc.conf, but because it is so
>> >> > >> >> > mixed with the rest, I took it out and put it in the script, then
>> >> > >> >> > after multi-user boot was finished, I did a login and ran the script,
>> >> > >> >> > with the output I showed in the initial post.
>> >> > >> >> >
>> >> > >> >> > So I put the interface configs back into rc.conf and I'm seeing the
>> >> > >> >> > same problem, here is a cut during boot:
>> >> > >> >> >
>> >> > >> >> > ###############
>> >> > >> >> > Starting file system checks:
>> >> > >> >> > /dev/ad0s1a: FILE SYSTEM CLEAN; SKIPPING CHECKS
>> >> > >> >> > /dev/ad0s1a: clean, 93385 free (33 frags, 11669 blocks, 0.0% fragmentation)
>> >> > >> >> > Mounting local file systems:.
>> >> > >> >> > Writing entropy file:.
>> >> > >> >> > Setting hostname: tst-cambria-11.
>> >> > >> >> > wlan0: Ethernet address: 00:21:a4:35:70:42
>> >> > >> >> > wlan1: Ethernet address: 00:21:a4:35:6c:96
>> >> > >> >> > ath1: unable to start recv logic
>> >> > >> >> > wlan2: Ethernet address: 00:21:a4:32:38:c2
>> >> > >> >> > ath2: unable to start recv logic
>> >> > >> >> > ###############
>> >> > >> >> >
>> >> > >> >> > Looking at the vmstat -z output, the 256 Bucket fail count is much
>> >> > >> >> > higher than if I let it boot to multiuser and then configure the
>> >> > >> >> > interfaces. It was around 6000, while now it is much higher:
>> >> > >> >> >
>> >> > >> >> > Without wlan configs in rc.conf, but configured afterwards:
>> >> > >> >> > 16 Bucket: 64, 0, 15, 300, 3139, 16, 0
>> >> > >> >> > 256 Bucket: 1024, 0, 31, 1, 592,6062, 0
>> >> > >> >> > vmem btag: 28, 0, 4496, 256, 4496, 32, 0
>> >> > >> >> >
>> >> > >> >> > With wlan configs in rc.conf:
>> >> > >> >> > 16 Bucket: 64, 0, 16, 299, 8611, 16, 0
>> >> > >> >> > 256 Bucket: 1024, 0, 26, 6, 773,16928, 0
>> >> > >> >> > vmem btag: 28, 0, 4405, 59, 4405, 30, 0
>> >> > >> >> >
>> >> > >> >> > Both of them boot from a ro compact-flash with md /etc and /var, but
>> >> > >> >> > they are small, 4.3M and 2.1M. I have hacked in the use of tmpfs, but
>> >> > >> >> > that did not make a difference.
>> >> > >> >> >
>> >> > >> >
>> >> > >> > Friday I power cycled the board and it came up without an error. All 3
>> >> > >> > atheros cards configured in rc.conf. So I left it on for the weekend
>> >> > >> > and by this morning there were still no errors, so I rebooted it and
>> >> > >> > again saw the "ath1: unable to start recv logic" message for ath1 and
>> >> > >> > ath2. I power cycled it, but still get the error, so Friday was just
>> >> > >> > lucky with some timing thing, maybe if the card receives while it is
>> >> > >> > still configuring? Also if I leave the board booted in this state,
>> >> > >> > ath0 also starts to give problems:
>> >> > >> >
>> >> > >> > #################
>> >> > >> > ath0: ath_rx_proc: no mbuf!
>> >> > >> > ath0: ath_rx_proc: no mbuf!
>> >> > >> > ...
>> >> > >> > ath0: device timeout
>> >> > >> > ath0: ath_reset: unable to start recv logic
>> >> > >> > ...
>> >> > >> > #################
>> >> > >> >
>> >> > >> > In any case it seems to be a memory allocation problem. How do I figure
>> >> > >> > out where it is? "netstat -m" does not seem to point to an error. The
>> >> > >> > mbuf info in vmstat -z also looks ok.
>> >> > >> > It is only "256 Bucket" in vmstat -z
>> >> > >> > that points to an alloc failure. How do I figure out where that is and
>> >> > >> > how do I fix it or work around it?
>> >> > >> >
>> >> > >> > ###################
>> >> > >> > tst-11-arm:~ # netstat -m
>> >> > >> > 128/382/510 mbufs in use (current/cache/total)
>> >> > >> > 127/129/256/7654 mbuf clusters in use (current/cache/total/max)
>> >> > >> > 127/126 mbuf+clusters out of packet secondary zone in use (current/cache)
>> >> > >> > 0/0/0/3827 4k (page size) jumbo clusters in use (current/cache/total/max)
>> >> > >> > 0/0/0/1134 9k jumbo clusters in use (current/cache/total/max)
>> >> > >> > 0/0/0/637 16k jumbo clusters in use (current/cache/total/max)
>> >> > >> > 286K/353K/639K bytes allocated to network (current/cache/total)
>> >> > >> > 0/0/0 requests for mbufs denied (mbufs/clusters/mbuf+clusters)
>> >> > >> > 0/0/0 requests for mbufs delayed (mbufs/clusters/mbuf+clusters)
>> >> > >> > 0/0/0 requests for jumbo clusters delayed (4k/9k/16k)
>> >> > >> > 0/0/0 requests for jumbo clusters denied (4k/9k/16k)
>> >> > >> > 0/3/1488 sfbufs in use (current/peak/max)
>> >> > >> > 0 requests for sfbufs denied
>> >> > >> > 0 requests for sfbufs delayed
>> >> > >> > 0 requests for I/O initiated by sendfile
>> >> > >> > tst-11-arm:~ # vmstat -z | grep mbuf
>> >> > >> > mbuf_packet: 256, 48990, 127, 126, 2294, 0, 0
>> >> > >> > mbuf: 256, 48990, 1, 256, 920, 0, 0
>> >> > >> > mbuf_cluster: 2048, 7654, 253, 3, 253, 0, 0
>> >> > >> > mbuf_jumbo_page: 4096, 3827, 0, 0, 0, 0, 0
>> >> > >> > mbuf_jumbo_9k: 9216, 1134, 0, 0, 0, 0, 0
>> >> > >> > mbuf_jumbo_16k: 16384, 637, 0, 0, 0, 0, 0
>> >> > >> > mbuf_ext_refcnt: 4, 0, 0, 504, 1, 0, 0
>> >> > >> > tst-11-arm:~ # vmstat -z | grep Bucket
>> >> > >> > 4 Bucket: 16, 0, 7, 497, 1565, 0, 0
>> >> > >> > 6 Bucket: 24, 0, 0, 0, 0, 0, 0
>> >> > >> > 8 Bucket: 32, 0, 3, 375, 83, 0, 0
>> >> > >> > 12 Bucket: 48, 0, 2, 334, 5, 0, 0
>> >> > >> > 16 Bucket: 64, 0, 14, 301, 43687, 16, 0
>> >> > >> > 32 Bucket: 128, 0, 7, 148, 196, 0, 0
>> >> > >> > 64 Bucket: 256, 0, 19, 56, 77, 0, 0
>> >> > >> > 128 Bucket: 512, 0, 16, 16, 48, 0, 0
>> >> > >> > 256 Bucket: 1024, 0, 29, 3, 1521,86738, 0
>> >> > >> > tst-11-arm:~ #
>> >> > >> > ###################
>> >> > >> >
>> >> > >> > Regards
>> >> > >> >
>> >> > >> > John
>> >> > >> > --
>> >> > >> > John Hay -- jhay@meraka.csir.co.za / jhay@meraka.org.za / jhay@FreeBSD.org
>> >> _______________________________________________
>> >> freebsd-wireless@freebsd.org mailing list
>> >> http://lists.freebsd.org/mailman/listinfo/freebsd-wireless
>> >> To unsubscribe, send any mail to "freebsd-wireless-unsubscribe@freebsd.org"
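
The bounce-buffer sizing estimate quoted in this thread can be worked through as a quick calculation. This is only a rough sketch, not driver code. It assumes the worst case, where every TX and RX buffer has to be bounced and each one occupies a single 4096-byte bounce page. The function names are made up for illustration, and the buffer counts are the guesses from the thread (512/512, 256/256, 128/128), not values read from the ath driver.

```python
# Rough worst-case bounce page estimate for ath NICs behind a 64M PCI DMA window.
# Assumption: each TX/RX buffer needs exactly one bounce page of PAGE_SIZE bytes.

PAGE_SIZE = 4096        # matches hw.busdma.zone0.alignment in the sysctl output
CURRENT_BPAGES = 64     # hw.busdma.total_bpages reported in the thread
NUM_NICS = 3            # ath0, ath1 and ath2 on the Cambria board


def bounce_pages_per_nic(txbuf, rxbuf):
    """Bounce pages one NIC needs if all of its buffers are bounced."""
    return txbuf + rxbuf


def bounce_bytes_per_nic(txbuf, rxbuf, page_size=PAGE_SIZE):
    """Same estimate expressed in bytes."""
    return bounce_pages_per_nic(txbuf, rxbuf) * page_size


def total_bounce_pages(nics, txbuf, rxbuf):
    """Bounce pages needed for several identical NICs."""
    return nics * bounce_pages_per_nic(txbuf, rxbuf)


# Candidate (TX, RX) buffer counts mentioned in the thread.
for txbuf, rxbuf in ((512, 512), (256, 256), (128, 128)):
    per_nic_mib = bounce_bytes_per_nic(txbuf, rxbuf) / (1024 * 1024)
    pages = total_bounce_pages(NUM_NICS, txbuf, rxbuf)
    shortfall = max(0, pages - CURRENT_BPAGES)
    print(f"TX={txbuf} RX={rxbuf}: {per_nic_mib:.1f} MiB per NIC, "
          f"{pages} pages for {NUM_NICS} NICs, "
          f"{shortfall} more than the current {CURRENT_BPAGES}")
```

Under these assumptions, even the reduced 128 TX / 128 RX setting works out to 768 bounce pages (3 MiB) for three NICs, against the 64 pages currently allocated. Real usage may be lower, since only buffers that land above the 64M DMA window are bounced.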