From owner-freebsd-net@FreeBSD.ORG Mon Mar 23 20:12:20 2009 Return-Path: Delivered-To: freebsd-net@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id CD3D9106567D for ; Mon, 23 Mar 2009 20:12:20 +0000 (UTC) (envelope-from bms@incunabulum.net) Received: from out2.smtp.messagingengine.com (out2.smtp.messagingengine.com [66.111.4.26]) by mx1.freebsd.org (Postfix) with ESMTP id 9E7EA8FC08 for ; Mon, 23 Mar 2009 20:12:20 +0000 (UTC) (envelope-from bms@incunabulum.net) Received: from compute1.internal (compute1.internal [10.202.2.41]) by out1.messagingengine.com (Postfix) with ESMTP id 504622FA989 for ; Mon, 23 Mar 2009 16:12:20 -0400 (EDT) Received: from heartbeat2.messagingengine.com ([10.202.2.161]) by compute1.internal (MEProxy); Mon, 23 Mar 2009 16:12:20 -0400 X-Sasl-enc: V4v0RDd4VX78K/trlUrzk7Pd6nIQfrtEHKRivuJag8lL 1237839139 Received: from empiric.lon.incunabulum.net (unknown [81.168.51.182]) by mail.messagingengine.com (Postfix) with ESMTPSA id BFFC72E8E6 for ; Mon, 23 Mar 2009 16:12:19 -0400 (EDT) Message-ID: <49C7ED1F.2060608@incunabulum.net> Date: Mon, 23 Mar 2009 20:12:15 +0000 From: Bruce M Simpson User-Agent: Thunderbird 2.0.0.19 (X11/20090126) MIME-Version: 1.0 To: freebsd-net@freebsd.org References: <20090317103650.GA6156@rebelion.Sisis.de> <49BF7DE4.4010804@incunabulum.net> <20090317104548.GB6182@rebelion.Sisis.de> <49BFF258.4020207@freebsd.org> <20090323182310.GA1825@rebelion.Sisis.de> <49C7D89A.6070502@incunabulum.net> <49C7DF8A.8070408@freebsd.org> In-Reply-To: <49C7DF8A.8070408@freebsd.org> X-Enigmail-Version: 0.95.6 Content-Type: text/plain; charset=ISO-8859-1; format=flowed Content-Transfer-Encoding: 7bit Subject: ath0 apparent silent disassociation X-BeenThere: freebsd-net@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Networking and TCP/IP with FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Mon, 23 Mar 2009 20:12:21 -0000 [Repost without attachment] OK. We've managed to reproduce this set of symptoms now in our work area. [If anyone needs to see a pcap, please Cc: me offlist.] Timebase: beginning of the pcap is in sync with a bringup from single-user mode; the tcpdump runs in the background from init whilst the system is brought up. OK, so I timed the apparent loss of connectivity as 6m 30s from that point I hit the stopwatch, to when I hit it again when the AP's Web GUI no longer shows the STA affected as being associated. Obviously such a timing is subject to human/visual jitter, and how often Netgear's firmware pulls the STA association list from the AP into the web GUI. What stands out in the pcap is that 302.291s in (almost 5m exactly), the STA (ath0) sends an IEEE 802.11 NULL frame to the AP with the PWR MGT bit set (I'm going to sleep!). This more or less coincides with a normal beacon from the Netgear AP. It does not advertise Auto Power Save Delivery (apsd), that bit is 0. This is puzzling as we don't enable power management by default. As I understand it, this may be an AP feature in some environments... I can try reproducing this with an explicit 'ifconfig ath0 -powersave' and see if it reoccurs. You'll see that after this NULL frame is sent, there is another Probe Request, and the Netgear AP does Probe Respond, but this makes no difference (I ended the capture around 150s after the NULL frame was sent). At this point we can't send traffic from the ath0, or rather, the AP is acting as though it never even heard the STA. The STA learns the AP's IP address/MAC mapping through passive ARP -- we still see broadcasts on the SSID -- but the AP has started to totally ignore the STA, and seemed to have ignored its ARP requests also. We are using MAC address ACL control with this AP, and the ath0 affected is definitely listed in its ACL table, configured up, rebooted etc. It is as though the STA is entering power saving mode when not explicitly told to, and the AP is not waking up the STA as it should. If any more information needed, or where to look, please let me know what's involved (I MFCed the change after all, so I'll help where I can until I'm on holiday this week...) My lab colleague is just working around this with 'ping ' for now, that keeps things up, as does OpenVPN... cheers BMS