From: Adrian Chadd
Sender: adrian.chadd@gmail.com
To: John Hay
Cc: "freebsd-wireless@freebsd.org", "freebsd-embedded@freebsd.org", Ian Lepore
Date: Mon, 14 Jul 2014 12:23:47 -0700
Subject: Re: CAMBRIA and more than one atheros card
In-Reply-To: <20140714191416.GB49930@zibbi.meraka.csir.co.za>
List-Id: Dedicated and Embedded Systems

Ah, you're going to need more than 64 bounce pages. That's just not
enough for all the mbufs the 11n support requires.

It likely didn't show up in previous incarnations of this because
pre-11n support had a much smaller pool of buffers in use.

For doing 11n, there's hm, 256 TX and 256 RX buffers for each NIC? Or
is it 512 and 512? It's something large like that. So assuming 512,
that's 1024 * 4096 bytes each, so 4mbyte per NIC needed of bounce
buffers.

You could tweak ATH_TXBUF and ATH_RXBUF down to something like 128 TX
and 128 RX, but you can't go much lower than that per NIC as an A-MPDU
aggregate requires up to 64 buffers to transmit and/or receive and if
you're not careful you'll actually overrun the RX FIFO and/or starve
the TX code from having mbufs available to transmit with.

-a

On 14 July 2014 12:14, John Hay wrote:
> On Mon, Jul 14, 2014 at 09:09:46PM +0200, John Hay wrote:
>> On Mon, Jul 14, 2014 at 12:06:34PM -0700, Adrian Chadd wrote:
>> > .. why's it need bounce buffers?
>>
>> I found this:
>> arm/xscale/ixp425/ixp425_pci.c: /* NB: PCI dma window is 64M so anything above must be bounced */
>
> I found a sysctl to show some bounce info.
> :-)
>
> On the boot that ath1 and ath2 failed during ifconfig and ath0 started
> to complain after a while:
>
> #####################
> tst-11-arm:~ # sysctl hw.busdma
> hw.busdma.total_bpages: 64
> hw.busdma.zone0.total_bpages: 64
> hw.busdma.zone0.free_bpages: 0
> hw.busdma.zone0.reserved_bpages: 0
> hw.busdma.zone0.active_bpages: 64
> hw.busdma.zone0.total_bounced: 1191
> hw.busdma.zone0.total_deferred: 0
> hw.busdma.zone0.lowaddr: 0x4000fff
> hw.busdma.zone0.alignment: 4096
> #####################
>
> I then thought to down wlan0 to see if the page count changed, but
> do not know. :-)
>
> #####################
> tst-11-arm:~ # ifconfig wlan0 down
> dev.ath.0.debug: 0 -> 8
> /sbin/ifconfig.bin wlan0 down
> ath0: ath_stop_locked: invalid 0 if_flags 0x8802
> Sleeping thread (tid 100019, pid 0) owns a non-sleepable lock
> panic: sleeping thread
> Uptime: 32m46s
> Automatic reboot in 15 seconds - press a key on the console to abort
> Rebooting...
> #####################
>
> John
>
>>
>> John
>> >
>> >
>> > -a
>> >
>> >
>> > On 14 July 2014 11:42, John Hay wrote:
>> > > On Mon, Jul 14, 2014 at 09:48:56AM -0700, Adrian Chadd wrote:
>> > >> Hm, it could be some bus related stupidity. It's allocating 2k or 4k
>> > >> mbufs for the receive path because wifi frames are bigger than
>> > >> ethernet frames. If you're not seeing failures in 4k mbuf allocations
>> > >> then I'm not sure what it could be.
>> > >>
>> > >> I'll see about firing it up locally and checking.
>> > >
>> > > I'm hunting a bit more and it looks like the fail is in the bounce pages.
>> > > It looks like the calls look like this:
>> > >
>> > > ath_legacy_rxbuf_init()
>> > > bus_dmamap_load_mbuf_sg()
>> > > _bus_dmamap_load_buffer()
>> > > _bus_dmamap_reserve_pages()
>> > > reserve_bounce_pages()
>> > >
>> > > I have added a printf at the start of alloc_bounce_pages() and it seems
>> > > that it is only called (twice) when ath0 is probed.
>> > > Is bus_dma_tag_t dmat,
>> > > the first argument to alloc_bounce_pages(), common for the whole pci bus?
>> > >
>> > > The start of the boot, with my printf looks like this:
>> > >
>> > > ####################
>> > > FreeBSD ARM (Gateworks Cambria) boot2 v0.4
>> > > -
>> > > Default: /boot/kernel/kernel
>> > > boot:
>> > > Copyright (c) 1992-2014 The FreeBSD Project.
>> > > Copyright (c) 1979, 1980, 1983, 1986, 1988, 1989, 1991, 1992, 1993, 1994
>> > > The Regents of the University of California. All rights reserved.
>> > > FreeBSD is a registered trademark of The FreeBSD Foundation.
>> > > FreeBSD 11.0-CURRENT #9 r268502M: Mon Jul 14 15:30:15 SAST 2014
>> > > jhay@dolphin.meraka.csir.co.za:/usr/obj/arm.armeb/snaps/arm/11-tst/src/sys/S
>> > > MALL-CAMBRIA arm
>> > > gcc version 4.2.1 20070831 patched [FreeBSD]
>> > > CPU: IXP435 rev 1 (ARMv5TE) (XScale core)
>> > > Big-endian DC enabled IC enabled WB enabled LABT branch prediction enabled
>> > > 32KB/32B 32-way instruction cache
>> > > 32KB/32B 32-way write-back-locking data cache
>> > > real memory = 134213632 (127 MB)
>> > > avail memory = 124440576 (118 MB)
>> > > random device not loaded; using insecure entropy
>> > > wlan: mac acl policy registered
>> > > random: initialized
>> > > ixp0:
>> > > ixp0: 37e7f
>> > > pcib0: on ixp0
>> > > pci0: on pcib0
>> > > ath0: irq 28 at device 1.0 on pci0
>> > > alloc_bounce_pages: numpages 63
>> > > [ath] enabling AN_TOP2_FIXUP
>> > > alloc_bounce_pages: numpages 1
>> > > ath0: AR9220 mac 128.2 RF5133 phy 13.0
>> > > ath0: 2GHz radio: 0x0000; 5GHz radio: 0x00c0
>> > > ath1: irq 27 at device 2.0 on pci0
>> > > [ath] enabling AN_TOP2_FIXUP
>> > > ath1: AR9220 mac 128.2 RF5133 phy 13.0
>> > > ath1: 2GHz radio: 0x0000; 5GHz radio: 0x00c0
>> > > ath2: irq 26 at device 3.0 on pci0
>> > > ath2: AR5413 mac 10.5 RF5413 phy 6.1
>> > > ath2: 2GHz radio: 0x0000; 5GHz radio: 0x0063
>> > > ixpclk0: on ixp0
>> > > ####################
>> > >
>> > > Interesting, with this boot, with the new
>> > > kernel, the aths did not fail.
>> > > Could the printf have changed something or was it just coincidence? With
>> > > a reboot ath1 and ath2 failed again during configuration and a little
>> > > while later ath0 started to complain:
>> > >
>> > > ath0: ath_rx_proc: no mbuf!
>> > >
>> > > John
>> > >
>> > >>
>> > >>
>> > >>
>> > >>
>> > >> -a
>> > >>
>> > >> On 14 July 2014 01:07, John Hay wrote:
>> > >> > On Thu, Jul 10, 2014 at 06:02:27AM +0200, John Hay wrote:
>> > >> >> On Wed, Jul 09, 2014 at 06:46:12AM +0200, John Hay wrote:
>> > >> >> > Hi Guys,
>> > >> >> >
>> > >> >> > The problem is back / still there. I initially saw the problem during
>> > >> >> > boot, with the interface configs in rc.conf, but because it is so
>> > >> >> > mixed with the rest, I took it out and put it in the script, then
>> > >> >> > after multi-user boot was finished, I did a login and ran the script,
>> > >> >> > with the output I showed in the initial post.
>> > >> >> >
>> > >> >> > So I put the interface configs back into rc.conf and I'm seeing the
>> > >> >> > same problem, here is a cut during boot:
>> > >> >> >
>> > >> >> > ###############
>> > >> >> > Starting file system checks:
>> > >> >> > /dev/ad0s1a: FILE SYSTEM CLEAN; SKIPPING CHECKS
>> > >> >> > /dev/ad0s1a: clean, 93385 free (33 frags, 11669 blocks, 0.0% fragmentation)
>> > >> >> > Mounting local file systems:.
>> > >> >> > Writing entropy file:.
>> > >> >> > Setting hostname: tst-cambria-11.
>> > >> >> > wlan0: Ethernet address: 00:21:a4:35:70:42
>> > >> >> > wlan1: Ethernet address: 00:21:a4:35:6c:96
>> > >> >> > ath1: unable to start recv logic
>> > >> >> > wlan2: Ethernet address: 00:21:a4:32:38:c2
>> > >> >> > ath2: unable to start recv logic
>> > >> >> > ###############
>> > >> >> >
>> > >> >> > Looking at the vmstat -z output the 256 Bucket fail is much higher than
>> > >> >> > if I let it boot to multiuser and then configured the interfaces.
>> > >> >> > It was in the 6000, while now it is much higher:
>> > >> >> >
>> > >> >> > Without wlan configs in rc.conf, but configured afterwards:
>> > >> >> > 16 Bucket: 64, 0, 15, 300, 3139, 16, 0
>> > >> >> > 256 Bucket: 1024, 0, 31, 1, 592,6062, 0
>> > >> >> > vmem btag: 28, 0, 4496, 256, 4496, 32, 0
>> > >> >> >
>> > >> >> > With wlan configs in rc.conf:
>> > >> >> > 16 Bucket: 64, 0, 16, 299, 8611, 16, 0
>> > >> >> > 256 Bucket: 1024, 0, 26, 6, 773,16928, 0
>> > >> >> > vmem btag: 28, 0, 4405, 59, 4405, 30, 0
>> > >> >> >
>> > >> >> > Both of them boot from a ro compact-flash with md /etc and /var, but
>> > >> >> > they are small 4.3M and 2.1M. I have hacked in the use of tmpfs, but
>> > >> >> > that did not make a difference.
>> > >> >> >
>> > >> >
>> > >> > Friday I power cycled the board and it came up without an error. All 3
>> > >> > atheros cards configured in rc.conf. So I left it on for the weekend
>> > >> > and by this morning there was still no errors, so I rebooted it and
>> > >> > again saw the "ath1: unable to start recv logic" message for ath1 and
>> > >> > ath2. I power cycled it, but still get the error, so Friday was just
>> > >> > lucky with some timing thing, maybe if the card receive while it is
>> > >> > still configuring? Also if I leave the board booted in this state,
>> > >> > ath0 also start to give problems:
>> > >> >
>> > >> > #################
>> > >> > ath0: ath_rx_proc: no mbuf!
>> > >> > ath0: ath_rx_proc: no mbuf!
>> > >> > ...
>> > >> > ath0: device timeout
>> > >> > ath0: ath_reset: unable to start recv logic
>> > >> > ...
>> > >> > #################
>> > >> >
>> > >> > In anycase it seems that memory allocation problem. How do I figure out
>> > >> > where it is? "netstat -m" does not seem to point to an error. The mbuf
>> > >> > info in vmstat -z also look ok. It is only "256 Bucket" in vmstat -z
>> > >> > that point to an alloc failure. How do I figure out where that is and
>> > >> > how do I fix it or work around it?
>> > >> >
>> > >> > ###################
>> > >> > tst-11-arm:~ # netstat -m
>> > >> > 128/382/510 mbufs in use (current/cache/total)
>> > >> > 127/129/256/7654 mbuf clusters in use (current/cache/total/max)
>> > >> > 127/126 mbuf+clusters out of packet secondary zone in use (current/cache)
>> > >> > 0/0/0/3827 4k (page size) jumbo clusters in use (current/cache/total/max)
>> > >> > 0/0/0/1134 9k jumbo clusters in use (current/cache/total/max)
>> > >> > 0/0/0/637 16k jumbo clusters in use (current/cache/total/max)
>> > >> > 286K/353K/639K bytes allocated to network (current/cache/total)
>> > >> > 0/0/0 requests for mbufs denied (mbufs/clusters/mbuf+clusters)
>> > >> > 0/0/0 requests for mbufs delayed (mbufs/clusters/mbuf+clusters)
>> > >> > 0/0/0 requests for jumbo clusters delayed (4k/9k/16k)
>> > >> > 0/0/0 requests for jumbo clusters denied (4k/9k/16k)
>> > >> > 0/3/1488 sfbufs in use (current/peak/max)
>> > >> > 0 requests for sfbufs denied
>> > >> > 0 requests for sfbufs delayed
>> > >> > 0 requests for I/O initiated by sendfile
>> > >> > tst-11-arm:~ # vmstat -z | grep mbuf
>> > >> > mbuf_packet: 256, 48990, 127, 126, 2294, 0, 0
>> > >> > mbuf: 256, 48990, 1, 256, 920, 0, 0
>> > >> > mbuf_cluster: 2048, 7654, 253, 3, 253, 0, 0
>> > >> > mbuf_jumbo_page: 4096, 3827, 0, 0, 0, 0, 0
>> > >> > mbuf_jumbo_9k: 9216, 1134, 0, 0, 0, 0, 0
>> > >> > mbuf_jumbo_16k: 16384, 637, 0, 0, 0, 0, 0
>> > >> > mbuf_ext_refcnt: 4, 0, 0, 504, 1, 0, 0
>> > >> > tst-11-arm:~ # vmstat -z | grep Bucket
>> > >> > 4 Bucket: 16, 0, 7, 497, 1565, 0, 0
>> > >> > 6 Bucket: 24, 0, 0, 0, 0, 0, 0
>> > >> > 8 Bucket: 32, 0, 3, 375, 83, 0, 0
>> > >> > 12 Bucket: 48, 0, 2, 334, 5, 0, 0
>> > >> > 16 Bucket: 64, 0, 14, 301, 43687, 16, 0
>> > >> > 32 Bucket: 128, 0, 7, 148, 196, 0, 0
>> > >> > 64 Bucket: 256, 0, 19, 56, 77, 0, 0
>> > >> > 128 Bucket: 512, 0, 16, 16, 48, 0, 0
>> > >> > 256 Bucket: 1024, 0, 29, 3, 1521,86738, 0
>> > >> > tst-11-arm:~ #
>> > >> > ###################
>> > >> >
>> > >> > Regards
>> > >> >
>> > >> > John
>> > >> > --
>> > >> > John Hay -- jhay@meraka.csir.co.za / jhay@meraka.org.za / jhay@FreeBSD.org
>> _______________________________________________
>> freebsd-wireless@freebsd.org mailing list
>> http://lists.freebsd.org/mailman/listinfo/freebsd-wireless
>> To unsubscribe, send any mail to "freebsd-wireless-unsubscribe@freebsd.org"
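
As a rough check of the bounce-page estimate in the reply at the top of this message, the arithmetic can be sketched in a few lines of Python. This is only an illustration, not code from the driver: the function name bounce_budget is hypothetical, the buffer counts are the guesses given in the reply (512/512, 256/256, or the tuned-down 128/128), and it assumes the worst case where every buffer needs one 4096-byte bounce page.

```python
# Rough bounce-page budget per NIC for the buffer counts discussed above.
# Assumptions (not taken from the driver source): every TX and RX buffer
# needs exactly one bounce page of 4096 bytes, as in the reply's estimate.

PAGE_SIZE = 4096      # bytes per bounce page, the figure used in the reply
TOTAL_BPAGES = 64     # hw.busdma.total_bpages reported on the board


def bounce_budget(tx_bufs, rx_bufs, page_size=PAGE_SIZE):
    """Return (pages, bytes) needed if every buffer is bounced."""
    pages = tx_bufs + rx_bufs
    return pages, pages * page_size


for label, tx, rx in [("512/512", 512, 512),
                      ("256/256", 256, 256),
                      ("128/128", 128, 128)]:
    pages, nbytes = bounce_budget(tx, rx)
    print(f"{label}: {pages} pages, {nbytes / 2**20:.1f} MiB per NIC "
          f"(pool has {TOTAL_BPAGES} pages)")
```

Under these assumptions even the 128/128 setting would want 256 pages per NIC, four times the 64 pages the sysctl output shows for the whole zone.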