Date:      Mon, 19 Mar 2012 00:10:14 GMT
From:      Adrian Chadd <adrian@freebsd.org>
To:        freebsd-wireless@FreeBSD.org
Subject:   Re: kern/166190: [ath] TX hangs and frames stuck in TX queue
Message-ID:  <201203190010.q2J0AElW089620@freefall.freebsd.org>

The following reply was made to PR kern/166190; it has been noted by GNATS.

From: Adrian Chadd <adrian@freebsd.org>
To: bug-followup@freebsd.org, Vincent Hoffman <vince@unsane.co.uk>
Cc: freebsd-wireless@freebsd.org
Subject: Re: kern/166190: [ath] TX hangs and frames stuck in TX queue
Date: Sun, 18 Mar 2012 17:10:04 -0700

 I think I understand what's going on here.
 
 It turns out that multiple instances of the TX code (via if_start())
 were running at the same time. These were processing frames from the
 input queue and assigning them sequence numbers.
 
 This seems to be occurring:
 
 * thread A would allocate sequence number 5
 * thread B would concurrently allocate sequence number 6
 * thread B would then "win" the race to add its frame to the BAW, since
 sequence numbers were allocated early but a frame wouldn't be added
 to the queue until much later
 * then thread A would try adding its frame to the BAW, but since the
 BAW left edge is now 6, 5 is now "out of window".
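
A minimal sketch of the window check behind that last step, assuming
12-bit 802.11 sequence numbers and a 64-frame window. The helper name
baw_within() is hypothetical and is not taken from the ath(4) source.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define SEQ_RANGE 4096u	/* 802.11 sequence numbers are 12 bits wide */

/*
 * Hypothetical helper: true if seqno lies in [left, left + wsize),
 * computed modulo the 12-bit sequence space so wraparound works.
 */
static bool
baw_within(uint16_t left, uint16_t wsize, uint16_t seqno)
{
	unsigned off;

	assert(wsize > 0 && wsize <= SEQ_RANGE / 2);
	off = (unsigned)(seqno - left) & (SEQ_RANGE - 1u);
	return (off < wsize);
}
```

With the left edge at 6, baw_within(6, 64, 6) is true, but
baw_within(6, 64, 5) is false: 5 sits 4095 slots "ahead" of 6 in
modulo arithmetic, far outside a 64-frame window.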
 
 I have a local patch here which I'm going to test tonight/tomorrow. It
 delays the sequence number allocation until _right before_ the frame
 may be added to the BAW. This is done inside the same lock, so there's
 no chance that it'll race with another concurrent thread.
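
The idea can be sketched roughly as follows. This is an illustration
only, not the patch itself: struct tid_state, struct frame and
tid_tx_frame() are hypothetical names, a pthread mutex stands in for
the driver lock, and the window is simplified so its left edge never
advances.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>
#include <stdint.h>

#define SEQ_MASK 4095u	/* 12-bit 802.11 sequence space */
#define BAW_SIZE 64u	/* assumed block-ack window size */

/* Hypothetical per-TID state; not the real ath(4) structures. */
struct tid_state {
	pthread_mutex_t lock;
	uint16_t next_seq;	/* next sequence number to hand out */
	uint16_t baw_left;	/* left edge of the block-ack window */
	bool baw[BAW_SIZE];	/* slots in use, indexed from baw_left */
};

struct frame {
	uint16_t seqno;
};

/*
 * Allocate the sequence number and add the frame to the BAW under
 * one lock, so allocation order and insertion order cannot diverge.
 * Returns false (frame left unnumbered) if the window is full.
 */
static bool
tid_tx_frame(struct tid_state *t, struct frame *f)
{
	unsigned off;
	bool ok = false;

	assert(t && f);
	pthread_mutex_lock(&t->lock);
	off = (unsigned)(t->next_seq - t->baw_left) & SEQ_MASK;
	if (off < BAW_SIZE) {
		f->seqno = t->next_seq;
		t->next_seq = (uint16_t)((t->next_seq + 1u) & SEQ_MASK);
		t->baw[off] = true;
		ok = true;
	}
	pthread_mutex_unlock(&t->lock);
	return (ok);
}
```

Because no frame holds a sequence number before it holds the lock,
whichever thread gets in first receives 5 and lands in the BAW first;
the other receives 6.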
 
 I won't commit it until I have committed some verification code to
 -HEAD to complain loudly when a frame _before_ the BAW is trying to be
 queued. Since that shouldn't happen in reality, I'm going to guess
 that it'll pop up in my testing and Vincent's use.
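
A check of that kind might look something like the sketch below. It
is not the verification code referred to above: baw_classify() and
baw_check_add() are hypothetical names, and treating the half of the
sequence space behind the left edge as "before the window" is an
assumed convention.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

#define SEQ_MASK 4095u	/* 12-bit 802.11 sequence space */
#define SEQ_HALF 2048u	/* offsets at or past this count as "behind" */

enum baw_pos { BAW_INSIDE, BAW_AHEAD, BAW_BEHIND };

/* Hypothetical: where does seqno sit relative to the window? */
static enum baw_pos
baw_classify(uint16_t left, uint16_t wsize, uint16_t seqno)
{
	unsigned off = (unsigned)(seqno - left) & SEQ_MASK;

	assert(wsize > 0 && wsize <= SEQ_HALF);
	if (off < wsize)
		return (BAW_INSIDE);
	return (off < SEQ_HALF ? BAW_AHEAD : BAW_BEHIND);
}

/* Complain loudly when a frame from before the left edge shows up. */
static enum baw_pos
baw_check_add(uint16_t left, uint16_t wsize, uint16_t seqno)
{
	enum baw_pos pos = baw_classify(left, wsize, seqno);

	if (pos == BAW_BEHIND)
		fprintf(stderr,
		    "BAW: seqno %u is before left edge %u (window %u)\n",
		    (unsigned)seqno, (unsigned)left, (unsigned)wsize);
	return (pos);
}
```

In the race described earlier, baw_check_add(6, 64, 5) would print
the warning and return BAW_BEHIND.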
 
 Once I've verified that (a) my sanity checking code is firing as I
 expect it to, (b) Vincent also sees the same, and (c) this is fixed by
 my patch, I'll look at committing it.
 
 Vincent - thanks so very much for persisting with this bug! I'd not
 have really found it at all if you hadn't pointed the odd behaviour
 out to me.
 
 Now - yes, the solution would also be "serialise the whole TX queue
 damnit." Yes, that'd solve it, but as I'm seeing 802.11ac around the
 corner, I'd like to actually debug, diagnose and document how a
 multi-threaded TX/RX path could work. Serialising the driver TX path
 isn't going to help me do that. :-)
 
 
 Adrian



Want to link to this message? Use this URL: <https://mail-archive.FreeBSD.org/cgi/mid.cgi?201203190010.q2J0AElW089620>