Date: Mon, 27 Feb 2006 13:22:51 -0800
From: David Wolfskill
To: stable@freebsd.org
Subject: 6.1-PRERELEASE, wi(4), and dhclient
Message-ID: <20060227212251.GS13464@bunrab.catwhisker.org>

OK; I have been following the -stable list (among others), and realize
that there are some "outstanding issues" with respect to wi(4).  I'm
wondering if premature invocation of dhclient with a reason code of
"EXPIRE" is one of those "outstanding issues."

Here's some history behind the query; I'll try to be brief:

* Until about a week ago, I primarily used FreeBSD 4-STABLE on my
  succession of laptops (since Dec 2004, a Dell Inspiron 8200).  This
  had nothing to do with the quality of FreeBSD, but was because I
  didn't want my laptop environment to diverge too much from a couple
  of "production" machines I have, which have their software built on
  a separate "build machine."  Unfortunately, the build machine
  stopped working (hardware issues) in Feb 2005, and I have not yet
  had the resources to fix or replace it.

* However, I had been tracking 6-STABLE (on slice 3), as well.

* I had seen that dhclient sometimes failed to get a lease using the
  wi0 NIC under 6-STABLE (possibly 5-STABLE, as well, though I stopped
  tracking that about a month after 6.0 was released).  At the time,
  I figured it wasn't that big a problem for me, since I was only
  running 6.x "experimentally" -- and I wasn't really in a position to
  fix wi(4).  Further, I didn't have the problem under 4.x at all.

* Last week, the laptop's disk drive reported uncorrectable errors
  reading from slice 1 (the 4.x slice).  Given the available options,
  I booted from slice 3 (6.x), and then spent the next several days
  trying to re-assemble a native 6.x working environment.  (The
  process has got to a point where the laptop is generally usable now.
  Pending some port-building issue resolution, I'm reasonably
  comfortable with it, save for the issue that catalyzed this note.)

* Over the last couple of weeks (tracking 6-STABLE daily), I've not
  seen a recurrence of the "wi0 fails to get a lease" issue.  This
  seems encouraging.

* However, I did see that wi0 would sometimes lose its lease
  prematurely.

* At first, I took the "very large hammer" approach to "resolving"
  this issue:  Once I got a DHCP lease, I'd kill dhclient, thus
  preventing it from changing anything.  I realize that there is no
  respect in which this might possibly be optimal, but I was short on
  several resources...
  and it did have the desired effect of allowing the laptop to
  maintain connectivity.

* More recently, I thought I'd try to track just what is going on,
  while still trying to maintain connectivity.  To that end, I hacked
  up a dhclient-enter-hooks that would look for a reason of EXPIRE,
  and then set exit_status to a non-zero value, thus telling dhclient
  to ignore the invocation.  While this has many of the more
  unfortunate aspects of the earlier approach, it does permit some
  quantification of what is going on.

Thus, yesterday afternoon, I booted up the laptop with the above-cited
dhclient-enter-hooks in place at about 16:40 hrs.  The laptop got a
7-day (604800-second) lease, and scheduled renewal in 302400 seconds
(3.5 days).  So far, so good.

Here's an excerpt from the message log, showing when
dhclient-enter-hooks was invoked.  Please note that this was a rather
"quiet" network -- there was only one other DHCP client on it:

Feb 26 17:41:58 localhost dhclient: Ignoring claimed EXPIRE dhclient invocation
Feb 26 18:55:44 localhost dhclient: Ignoring claimed EXPIRE dhclient invocation
Feb 26 19:07:34 localhost dhclient: Ignoring claimed EXPIRE dhclient invocation
Feb 26 19:30:36 localhost dhclient: Ignoring claimed EXPIRE dhclient invocation
Feb 26 19:35:54 localhost dhclient: Ignoring claimed EXPIRE dhclient invocation
Feb 26 19:42:07 localhost dhclient: Ignoring claimed EXPIRE dhclient invocation
Feb 26 19:45:33 localhost dhclient: Ignoring claimed EXPIRE dhclient invocation
Feb 26 20:00:20 localhost dhclient: Ignoring claimed EXPIRE dhclient invocation

I'm rather at a loss to make sense of the timing of these invocations.
The first is after about an hour; I don't see much of a pattern as far
as the intervals for the others are concerned.

I'd rather not use this fairly heavy-handed approach to avoiding the
symptoms of the problem; I'd prefer to actually resolve it.

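For reference, a hook of the kind described above might look roughly
like the following.  This is a sketch only, not the script actually in
use (which was not posted): the function name ignore_expire and the
use of logger(1) with the tag "dhclient" are assumptions, the latter
chosen to match the log lines shown.  It relies on dhclient-script
sourcing /etc/dhclient-enter-hooks with $reason set and honoring a
non-zero exit_status.

```sh
# Sketch of /etc/dhclient-enter-hooks.  dhclient-script sources this
# file with $reason set, and abandons the current action if
# exit_status is non-zero afterwards.
# ignore_expire is a hypothetical name, not taken from the original.
ignore_expire() {
    if [ "${reason:-}" = "EXPIRE" ]; then
        # Assumed logging method: logger(1), tagged to match the
        # "dhclient:" prefix seen in the message log.
        logger -t dhclient "Ignoring claimed EXPIRE dhclient invocation" \
            2>/dev/null || true
        exit_status=1
    fi
}

ignore_expire
```

Note that a hook like this only suppresses the reaction to EXPIRE; it
does nothing about whatever leads dhclient to treat the lease as
expired in the first place.
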
If nothing else, I think it would be a lot nicer for other folks who
are also trying to use NICs that use the wi(4) driver.

So: other than the stock "use a different NIC" (the one I'm using is
built into the laptop), would someone please loan me a clue?

I do keep a local private mirror of the FreeBSD CVS repository, so
it's fairly easy for me to test patches -- it just takes time to
rebuild stuff.  And I'll see about assigning one of the slices to
-CURRENT soon.

Here's what I'm running at the moment (recall that I'm tracking
RELENG_6 daily):

localhost(6.1-P)[11] uname -a
FreeBSD localhost 6.1-PRERELEASE FreeBSD 6.1-PRERELEASE #5: Mon Feb 27 06:48:11 PST 2006 root@g1-18.catwhisker.org.:/common/S2/obj/usr/src/sys/LAPTOP_30W i386
localhost(6.1-P)[12]

Here's what dmesg has to say about wi0:

pcib2: pccard2 requested memory range 0xf4000000-0xfbffffff: good
pccard2: CIS version PC Card Standard 5.0
pccard2: CIS info: Dell, TrueMobile 1150 Series PC Card, Version 01.01,
pccard2: Manufacturer code 0x156, product 0x2
pccard2: function 0: network adapter, ccr addr 3e0 mask 1
pccard2: function 0, config table entry 1: I/O card; irq mask ffff; iomask 6, iospace 0-3f; io16 irqpulse irqlevel
pcib2: pccard2 requested I/O range 0xe000-0xffff: in range
pcib2: pccard2 requested memory range 0xf4000000-0xfbffffff: good
wi0: at port 0xe000-0xe03f irq 11 function 0 config 1 on pccard2
wi0: [MPSAFE]
wi0: using Lucent Embedded WaveLAN/IEEE
wi0: Lucent Firmware: Station (6.14.1)
wi0: bpf attached
wi0: Ethernet address: 00:02:2d:5b:2c:78
wi0: bpf attached
wi0: bpf attached
wi0: 11b rates: 1Mbps 2Mbps 5.5Mbps 11Mbps
fdc0: output ready timeout

Thanks!

Peace,
david
-- 
David H. Wolfskill				david@catwhisker.org
Mail filters, like sewers, need to be most restrictive at the point of entry.

See http://www.catwhisker.org/~david/publickey.gpg for my public key.