From owner-freebsd-stable@FreeBSD.ORG Sun Sep 10 16:19:12 2006
Return-Path:
X-Original-To: freebsd-stable@freebsd.org
Delivered-To: freebsd-stable@freebsd.org
Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125])
    by hub.freebsd.org (Postfix) with ESMTP id 95DBE16A412
    for ; Sun, 10 Sep 2006 16:19:12 +0000 (UTC)
    (envelope-from sam@errno.com)
Received: from ebb.errno.com (ebb.errno.com [69.12.149.25])
    by mx1.FreeBSD.org (Postfix) with ESMTP id 956D343D55
    for ; Sun, 10 Sep 2006 16:19:11 +0000 (GMT)
    (envelope-from sam@errno.com)
Received: from [10.0.0.2] ([10.0.0.2]) (authenticated bits=0)
    by ebb.errno.com (8.13.6/8.12.6) with ESMTP id k8AGJA7K093014
    (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=NO);
    Sun, 10 Sep 2006 09:19:11 -0700 (PDT)
    (envelope-from sam@errno.com)
Message-ID: <45043AFA.4080001@errno.com>
Date: Sun, 10 Sep 2006 09:19:06 -0700
From: Sam Leffler
Organization: Errno Consulting
User-Agent: Thunderbird 1.5.0.5 (Macintosh/20060719)
MIME-Version: 1.0
To: dandee@volny.cz
References: <000001c6d340$004fdf40$6508280a@tocnet28.jspoj.czf>
In-Reply-To: <000001c6d340$004fdf40$6508280a@tocnet28.jspoj.czf>
X-Enigmail-Version: 0.94.0.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit
Cc: freebsd-stable@freebsd.org
Subject: Re: Where is the maximum of hw.ath.txbuf and rxbuf ? (former:
    atheros driver under high load, panics and even more freezes)
X-BeenThere: freebsd-stable@freebsd.org
X-Mailman-Version: 2.1.5
Precedence: list
List-Id: Production branch of FreeBSD source code
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
X-List-Received-Date: Sun, 10 Sep 2006 16:19:12 -0000

Daniel Dvořák wrote:
> Hi Sam,
>
> thank you for your answer. I think it is connected to this problem somehow, but not fully.
>
> I increased txbuf and rxbuf twice to 200 and 80, I saw some
> betterment in less of "no buffers space ...", but latency went up to 2000 ms.
>
> Now I ended at txbuf=800 and rxbuf=320 on both sides R1 and R2.
>
> But still, there is the same problem:
>
> It was tested after the rebooting R2 almost at once.
>
> --- R1 ping statistics ---
> 10000 packets transmitted, 8752 packets received, 12% packet loss
> round-trip min/avg/max/stddev = 1.324/920.480/6323.454/766.399 ms up to 6k ms
>
> R2# athstats -i ath0
> 11309 data frames received
> 11384 data frames transmit
> 12508 long on-chip tx retries
> 769 tx failed 'cuz too many retries
> 24M current transmit rate
> 2 tx management frames
> 6 tx frames discarded prior to association
> 31 tx stopped 'cuz no xmit buffer
> 38 tx frames with no ack marked
> 3 rx failed 'cuz of bad CRC
> 4464 rx failed 'cuz of PHY err
> 4464 OFDM timing
> 24 periodic calibrations
> 27 rssi of last ack
> 27 avg recv rssi
> -96 rx noise floor
> 1 switched default/rx antenna
> Antenna profile:
> [1] tx 10614 rx 11449
>
>
> Where is the maximum of txbuf and rxbuf ?

The max is derived from available memory. The h/w has no upper bounds.

>
> I would like to test it.
>
> Thank you for your attention.
>
> Daniel
>
>
>> -----Original Message-----
>> From: Sam Leffler [mailto:sam@errno.com]
>> Sent: Friday, September 08, 2006 6:30 AM
>> To: dandee@volny.cz
>> Cc: freebsd-stable@freebsd.org
>> Subject: Re: atheros driver under high load, panics and even
>> more freezes
>>
>> Daniel Dvořák wrote:
>>> Hi Sam and all,
>>>
>>> I am not sure if I understand your answer, but I try it.
>>>
>>> When I start my test, athstats shows this:
>>>
>>> athstats -i ath0
>>>
>>> 19308912 data frames received
>>> 15723536 data frames transmit
>>> 6536 tx frames with an alternate rate
>>> 2188280 long on-chip tx retries
>>> 62583 tx failed 'cuz too many retries
>>> 348 tx linearized to cluster
>>> 24M current transmit rate
>>> 6 tx management frames
>>> 6 tx frames discarded prior to association
>>> 27129 tx stopped 'cuz no xmit buffer
>>> 23057 tx frames with no ack marked
>>> 1182 rx failed 'cuz of bad CRC
>>> 761604 rx failed 'cuz of PHY err
>>> 761604 OFDM timing
>>> 4829 periodic calibrations
>>> 28 rssi of last ack
>>> 27 avg recv rssi
>>> -96 rx noise floor
>>> 1 switched default/rx antenna
>>> Antenna profile:
>>> [1] tx 15660942 rx 19451935
>>> [2] tx 2 rx 0
>>>
>>> ...
>>>
>>>
>>> I use this ping command from R2:
>>> ping -i .002 -c 10000 -s 1472 opposite side R1
>>>
>>> --- R1 ping statistics ---
>>> 10000 packets transmitted, 10000 packets received, 0% packet loss
>>> round-trip min/avg/max/stddev = 1.316/1.442/49.391/1.757 ms
>>>
>>> You can see a nice average latency of about 1,4 ms and not one
>>> packet was lost.
>>> athstats was almost unchanged.
>>>
>>> 19309465 data frames received
>>> 15724079 data frames transmit
>>> 6536 tx frames with an alternate rate
>>> 2188281 long on-chip tx retries
>>> 62583 tx failed 'cuz too many retries
>>> 348 tx linearized to cluster
>>> 24M current transmit rate
>>> 6 tx management frames
>>> 6 tx frames discarded prior to association
>>> 27129 tx stopped 'cuz no xmit buffer
>>> 23075 tx frames with no ack marked
>>> 1182 rx failed 'cuz of bad CRC
>>> 761605 rx failed 'cuz of PHY err
>>> 761605 OFDM timing
>>> 4834 periodic calibrations
>>> 29 rssi of last ack
>>> 27 avg recv rssi
>>> -96 rx noise floor
>>> 1 switched default/rx antenna
>>> Antenna profile:
>>> [1] tx 15661485 rx 19452488
>>> [2] tx 2 rx 0
>>>
>>> For comparison with a flood ping at once:
>>>
>>> --- R1 ping statistics ---
>>> 10000 packets transmitted, 10000 packets received, 0% packet loss
>>> round-trip min/avg/max/stddev = 1.319/1.516/5.594/0.120 ms
>>>
>>> Almost the same, yes max is even better.
>>>
>>> ----------------------------------------------------------------------------
>>>
>>> If I use an interval of 1/1000 s to send the echo request, the
>>> situation changes rapidly.
>>> ping -i .001 -c 10000 -s 1472 opposite side R1
>>>
>>> --- R1 ping statistics ---
>>> 10000 packets transmitted, 9681 packets received, 3% packet loss
>>> round-trip min/avg/max/stddev = 1.319/335.806/564.946/170.691 ms
>>>
>>> R2# ifconfig -v ath0
>>> ath0: flags=8c43 mtu 1500
>>> ------ ??????????? OACTIVE FLAG ?????????
----
>>> inet6 fe80::20b:6bff:fe2a:c78e%ath0 prefixlen 64 scopeid 0x2
>>> inet 10.XX.YY.ZZ netmask 0xfffffffc broadcast 10.40.64.19
>>> ether xxxxxxxxxxxxxxxx
>>> media: IEEE 802.11 Wireless Ethernet OFDM/24Mbps mode 11a
>>> (OFDM/24Mbps)
>>> status: associated
>>>
>>> 19350739 data frames received
>>> 15765446 data frames transmit
>>> 6536 tx frames with an alternate rate
>>> 2194842 long on-chip tx retries
>>> 62590 tx failed 'cuz too many retries
>>> 348 tx linearized to cluster
>>> 24M current transmit rate
>>> 6 tx management frames
>>> 6 tx frames discarded prior to association
>>> 29242 tx stopped 'cuz no xmit buffer
>>> 23155 tx frames with no ack marked
>>> 1182 rx failed 'cuz of bad CRC
>>> 764641 rx failed 'cuz of PHY err
>>> 764641 OFDM timing
>>> 4856 periodic calibrations
>>> 28 rssi of last ack
>>> 27 avg recv rssi
>>> -96 rx noise floor
>>> 1 switched default/rx antenna
>>> Antenna profile:
>>> [1] tx 15702845 rx 19493774
>>> [2] tx 2 rx 0
>>>
>>> I observe the flags of ath and when latency goes higher and
>>> higher, there is a new flag which I've never seen before, OACTIVE FLAG ?
>>> R2# man ifconfig | grep "OACTIVE"
>>>
>>> When ping ends the OACTIVE flag disappears.
>>>
>>> When the same ping test is done from linux box to fbsd, nice latency
>>> 1,2ms and no "no buffer".
>>>
>>> with -i 0.002 the throughput is about 0,5MB/s in and out of course
>>>
>>> with -i 0.001 until no buffer is about 0,85MB/s in and out.
>>>
>>> when no buffer and OACTIVE appears, the throughput is about 0,1MB/s or
>>> 128KB/s if you like or 1Mbit/s.
>>>
>>> I attached the progress of pinging ip address.
>> You ask why you're seeing OACTIVE when you lower the
>> inter-packet wait time to ping. This is because you're
>> flooding the tx queue of the ath driver and using up all the
>> tx buffers/descriptors. When ath is handed a frame to send
>> and it has no resources available it will mark the interface
>> "active" (OACTIVE) and drop the packet.
>> You can also see this in the athstats output ("tx stopped 'cuz no xmit
>> buffer"). Linux behaves differently because it blocks the
>> user process when this happens until such time as there are
>> resources to do the send. This behaviour likely also
>> explains the variability in the ping times; I think the rx
>> processing may be deferred while the driver deals with the tx flood.
>>
>> You can up the number of tx buffers available with the
>> ATH_TXBUF config option or by setting the hw.ath.txbuf
>> tunable from the loader. The default is 100 buffers which is
>> usually plenty for sta-mode operation--which is what the
>> driver is optimized for (it appears linux defaults to 200 tx
>> buffers which would also explain different behaviour).
>> Likewise there is a rx-side tunable for the number of rx buffers.
>>
>> Sam
>>
>
>
>
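
The tx-buffer exhaustion described in this thread can be illustrated with a
toy model. The sketch below is not ath(4) code: the function name, the fixed
per-frame drain time, and the intervals used are assumptions made purely for
illustration. It models only the behaviour stated above, namely that a frame
handed to the driver while every tx buffer is in use is dropped.

```python
# Toy model of a fixed pool of tx buffers (illustrative assumptions only;
# this does not reproduce the real ath(4) driver or its timing).

def simulate_tx_pool(txbuf, send_interval_us, drain_interval_us, packets):
    """Return (queued, dropped) for a sender emitting one frame every
    send_interval_us while the hardware completes one queued frame every
    drain_interval_us (an assumed, constant drain time)."""
    in_use = 0          # buffers currently holding an unsent frame
    dropped = 0         # frames refused because no buffer was free
    next_done_us = 0    # time at which the frame at the head completes
    for i in range(packets):
        now_us = i * send_interval_us
        # Release every buffer whose frame finished before this send.
        while in_use > 0 and next_done_us <= now_us:
            in_use -= 1
            next_done_us += drain_interval_us
        if in_use == 0:
            # Idle hardware: a new frame completes one drain time from now.
            next_done_us = now_us + drain_interval_us
        if in_use < txbuf:
            in_use += 1
        else:
            dropped += 1    # the "no xmit buffer" case
    return packets - dropped, dropped


if __name__ == "__main__":
    # Hypothetical numbers: frames offered every 1 ms, drained every 2 ms.
    for pool in (100, 200, 800):
        print(pool, simulate_tx_pool(pool, 1000, 2000, 10000))
```

In this model a larger pool only postpones the first drop; as long as frames
arrive faster than they drain, the pool still fills. A deeper queue may also
mean frames wait longer before being sent, which would be consistent with the
higher latency reported at txbuf=800, although the model does not measure
latency.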