Date: Tue, 27 Dec 2011 13:26:51 +0400
From: Eygene Ryabinkin
To: Doug Barton
Cc: Pyun Yong-Hyeon, Brooks Davis, freebsd-rc@FreeBSD.ORG, Garrett Cooper,
    Gleb Smirnoff, Dag-Erling Smorgrav, d@delphij.net, Xin LI
Subject: Re: Annoying ERROR: 'wlan0' is not a DHCP-enabled interface
In-Reply-To: <4EF971E4.4050905@FreeBSD.org> <4EF96D7D.3030701@FreeBSD.org>
Sender: rea@codelabs.ru
List-Id: "Discussion related to /etc/rc.d design and implementation."

Mon, Dec 26, 2011 at 11:21:08PM -0800, Doug Barton wrote:
> On 12/26/2011 22:39, Eygene Ryabinkin wrote:
> > This solution will also do the work, but I am slightly concerned
> > that it will
> >
> >  - call all netif machinery for interfaces with static IPs:
>
> The machinery is not that big/complex.

That is not an argument.  It would be one if this addition brought
substantial value, so that the load put on the system by the extra
netif invocation were worth it.

> >    it will be useless for already-configured interfaces;
>
> It also won't harm anything.

It just ruined the connectivity of the workstation I am sitting in
front of.  I changed the devd rule to
{{{
notify 0 {
	match "system"		"IFNET";
	match "type"		"LINK_UP";
	media-type		"ethernet";
	action "/etc/rc.d/netif quietstart $subsystem";
	action "logger /etc/rc.d/netif quietstart $subsystem";
};
}}}
and started to experience infinite link flaps on my static interface:
{{{
Dec 27 11:51:59 rea: /etc/rc.d/netif quietstart msk0
Dec 27 11:52:02 rea: /etc/rc.d/netif quietstart msk0
Dec 27 11:52:02 kernel: msk0: link state changed to UP
Dec 27 11:52:02 kernel: msk0: link state changed to DOWN
Dec 27 11:52:06 rea: /etc/rc.d/netif quietstart msk0
Dec 27 11:52:06 kernel: msk0: link state changed to UP
Dec 27 11:52:06 kernel: msk0: link state changed to DOWN
Dec 27 11:52:09 rea: /etc/rc.d/netif quietstart msk0
Dec 27 11:52:09 kernel: msk0: link state changed to UP
Dec 27 11:52:09 kernel: msk0: link state changed to DOWN
Dec 27 11:52:13 rea: /etc/rc.d/netif quietstart msk0
Dec 27 11:52:13 kernel: msk0: link state changed to UP
Dec 27 11:52:13 kernel: msk0: link state changed to DOWN
Dec 27 11:52:18 rea: /etc/rc.d/netif quietstart msk0
Dec 27 11:52:18 kernel: msk0: link state changed to UP
Dec 27 11:52:18 kernel: msk0: link state changed to DOWN
Dec 27 11:52:21 rea: /etc/rc.d/netif quietstart msk0
Dec 27 11:52:21 kernel: msk0: link state changed to UP
Dec 27 11:52:21 kernel: msk0: link state changed to DOWN
Dec 27 11:52:25 rea: /etc/rc.d/netif quietstart msk0
Dec 27 11:52:25 kernel: msk0: link state changed to UP
Dec 27 11:52:25 kernel: msk0: link state changed to DOWN
Dec 27 11:52:28 rea: /etc/rc.d/netif quietstart msk0
Dec 27 11:52:28 kernel: msk0: link state changed to UP
Dec 27 11:52:28 kernel: msk0: link state changed to DOWN
Dec 27 11:52:32 rea: /etc/rc.d/netif quietstart msk0
Dec 27 11:52:32 kernel: msk0: link state changed to UP
Dec 27 11:52:32 kernel: msk0: link state changed to DOWN
}}}

This is with devd running.  With devd disabled, running
'service netif quietstart msk0' by hand revealed the reason: 'start'
for msk(4) brings the interface down and then back up (the logs are
from two invocations of netif, to assure myself that the problem is
repeatable):
{{{
Dec 27 12:31:27 kernel: msk0: link state changed to DOWN
Dec 27 12:31:31 kernel: msk0: link state changed to UP
Dec 27 12:31:35 kernel: msk0: link state changed to DOWN
Dec 27 12:31:38 kernel: msk0: link state changed to UP
}}}

So, in my case, the link-up event triggers devd and 'netif start',
which in turn triggers DOWN/UP, so we have a while(1)-type loop.
This isn't a "won't harm anything" type of change, is it?

> >  - in the case of vlan interfaces, ifconfig dance will be done twice
> >    for each of them: once from the netif for the parent interface and
> >    once for each vlan in turn.
>
> Are you certain that the devd.conf trigger will fire when a vlan is up'ed?

Doug, please, do everyone a favor: if you're unsure about something,
check it yourself first.  This will greatly reduce the number of such
questions.

It is simple: add the variable 'vlans_' to rc.conf and set its value
to some numbers, say '1 2'.
Then add 'ifconfig__1' and 'ifconfig__2' saying, for example, 'up'.
Run 'service netif start', unplug the cable and watch the logs for
UP/DOWN interface notifications.  Here are mine with 2 VLANs:
{{{
Dec 27 09:43:57 kernel: sk0: link state changed to DOWN
Dec 27 09:43:57 kernel: sk0.1: link state changed to DOWN
Dec 27 09:43:57 kernel: sk0.2: link state changed to DOWN
Dec 27 09:44:00 kernel: sk0: link state changed to UP
Dec 27 09:44:00 kernel: sk0.1: link state changed to UP
Dec 27 09:44:00 kernel: sk0.2: link state changed to UP
}}}

> > This will just do the work that is useless in all-static configuration.
>
> I'm not sure I agree that it's useless. I can actually see this as quite
> handy. Personally I try to be in the habit of adding the configuration
> to rc.conf first, and using netif to start the interface so that I know
> for sure what will happen when that host reboots.

No problem: do it by hand.  But we're talking about devd, which
will do this automatically upon a link flap event.  That's useless
and, as was demonstrated by me and Garrett, harmful.
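
For anyone who wants to repeat the VLAN check described above, the
rc.conf fragment could look roughly like the sketch below.  The
parent interface name sk0 is an assumption taken from the log
excerpt, and the exact variable names assume the usual vlans_ and
ifconfig_ naming pattern with the interface name filled in; adjust
both to your own system.

```shell
# Hypothetical rc.conf fragment for the VLAN test (rc.conf is plain sh).
# "sk0" is assumed from the log excerpt; substitute your own interface.
vlans_sk0="1 2"        # two VLAN children, expected to appear as sk0.1 and sk0.2
ifconfig_sk0_1="up"    # configuration for the first VLAN child
ifconfig_sk0_2="up"    # configuration for the second VLAN child
```

After 'service netif start', unplugging and replugging the cable
should produce one DOWN/UP pair per interface in the logs, as in the
excerpt above.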

> > Worse, this solution will ruin host's connectivity in the following
> > scenario:
> >
> >  - one runs his remote server with all static configuration and strict,
> >    default-to-deny firewall configuration (call this person "Eygene
> >    Ryabinkin");
> >
> >  - his upstream provider tells him: listen, we're rearranging our IP
> >    space and you should change IP1 to IP2;
> >
> >  - administrator is busy changing the configuration of his host; his
> >    plan is to substitute IP1 to IP2 everywhere and to reboot his
> >    machine to cleanly acquire IP2 and continue operations;
> >
> >  - he already substituted IP1 -> IP2 in rc.conf and starts poking
> >    the firewall configuration, but here comes the link down event
> >    due to the $PROVIDER who reconfigures his $CISCO or whatever;
> >
> >  - the system ends up in an unusable state, because link up event
> >    will change interface's IP, but firewall isn't ready for this
> >    and isn't allowing connections to IP2, but allows them only for
> >    IP1 that is already gone from the interface due to devd and netif
> >    script.
>
> First, I think what you're describing is a pretty small edge case.

Doug, I am sorry, but that's childish: no matter how small the
probability is, this event _will_ happen.  And it will make the
administrator in question lose connectivity to his server: that's
not just "I will lose a message in the log".  He will scratch his
head, because this is a very unnatural thing in an all-static
configuration.  Once he finds out what happened, he won't be
satisfied with the way FreeBSD works in this area, I promise.
That's not a feature, that's a bug.

Once again, you're trying to tell me: my solution is better, but yes,
it will horribly fail in a minority of cases, up to rendering the
remote system unusable in a very unnatural way that can't be
predicted from common sense, but requires the administrator to know
deeply the internals of how devd.conf is currently organized.
All I can answer is that such a solution (if I am correct and it does
fail in such a way) is a no-go, unless $SOMETHING is fixed to cure
the problems.

> > People may tell me that
> >
> >  - Eygene Ryabinkin should run firewall configuration whose knowledge
> >    of IP for the interface is based on the automagic like ipfw's "me"
> >    verb;
> >
> >  - Eygene Ryabinkin should not work with the remote host without access
> >    to its physical console via remote KVM or alike;
>
> Second, these are both valid points. :)

The first one actually can't be implemented in the general case,
when the interface runs many IPs and I require different firewall
rules for different IPs: ipfw's "me" catches _all_ IPs in the system,
and if there were some macro of the sort "addrs()", it would catch
all IPs of the given interface, at best.  pf's '()' works only at
ruleset load time, so it is not dynamic at all.

> > I am aware of these fine points, however my meat is that static IP
> > configuration is the _static_ one (cool assertion, isn't it?).  But it
> > has at least one consequence: people view their static IP
> > configurations as a really static ones and tend to think that only their
> > direct actions will change them.  So, any non-atomic changes in
> > configuration won't be regarded as a problem: only direct actions that
> > will initiate the reconfiguration of the network interfaces must
> > change the stuff and changes in configuration files that aren't
> > supplemented with such actions must not change anything.
>
> I agree that this change will require user education.

It shouldn't, because even if the solution were 100% harmless, it
brings no real gain apart from fixing the DHCP issue in a way that
differs from mine.  And it isn't harmless; that's the problem.

> However 'ifconfig down' and 'ifconfig up' are actually direct
> actions.

What are you trying to say by this?
When a firewall is involved, it is not just 'ifconfig down' and
'ifconfig up': it requires at least 'service ipfw/pf/ipf restart'
and 'service routing restart'.

> Users who don't want this can simply comment out the entry
> in rc.conf, or the entry in devd.conf.

Users who want netif in their devd.conf instead of dhclient can
change devd.conf themselves.  And, given that your change
 - makes some interfaces enter infinite up/down flapping;
 - makes the default route disappear;
it should be written in bold in devd.conf: "don't stick netif here,
unless you want these 'side effects'".

> > Your way to fix the problem adds the possibility of the
> > linkdown/linkup event combo to alter the configuration that is in the
> > process of being changed.  That's unexpected and one can't be ready
> > for it in all situations (though remote console will save some brain
> > cells): it depends on the external factor one can't fully control.
>
> In very rare edge cases, yes.

Damn, a rather big part of my work consists of these edge cases,
when two "unlikely" factors come into existence and systems behave
in an "improbable" way.  Doug, the OS and its infrastructure must be
reliable and work with POLA in mind.

> > Linkup/linkdown events aren't that rare and generally they are not
> > viewed as something unusual that will ruin people's connectivity:
> > as long as L3 layer and above will stay alive, link flaps on L2
> > shouldn't change its operations apart from outages for the flap
> > duration.
>
> But it's the combination of "unexpected L2 flap" AND "being in the
> process of making an rc.conf change" that will trigger the problem
> you describe. And once again, if the user doesn't want the change to
> take effect immediately they can comment it out. If no config exists
> for the interface, nothing bad will happen.

Have you heard about the Fukushima power plant?

> > So, my motto here is "Static is static, leave it alone and don't
> > make it to depend on the dynamic events; DHCP is the dynamic
> > protocol by its nature, so it can depend on the dynamic events".
>
> While I don't agree that the problems you're describing are enough
> of a possibility to be concerned about, the other alternative that I
> considered is for devd.conf to call a wrapper script that first
> determines whether or not it's a DHCP interface, and then calls
> rc.d/dhclient if it is. However, there are a couple of downsides to
> that. First, it's more work. :)  But seriously, one advantage of
> using netif is that it will also work with interfaces that are
> dynamically configured with IPv6. If we were to move to a wrapper
> script idea I'd like to see it support that as well as IPv4 DHCP.

This wrapper script will either duplicate the internals of dhclient
(and of whatever machinery handles IPv6) in order to determine
whether dynamic configuration applies to this interface, or it will
blindly call dhclient and the other scripts and check whether they
succeeded.

I would not be against the netif route, but it creates serious
problems with

 a) unnatural behaviour of the network stack in the "edge" cases;

 b) constant link flapping for at least msk(4) interfaces;

 c) default routes (essentially, all routes that go through the
    interface in question): they just disappear, at least for the
    msk(4), sk(4) and nfe(4) interfaces.  The demo will be provided
    in my reply to your message with ID 4EF96D7D.3030701@FreeBSD.org.

On the other hand, documenting the "quiet" semantics and using it
for dhclient to silence the error will
 - fix the issue;
 - allow the wrapper script you're talking about to be written in
   a simple way, without duplicating the machinery inside dhclient.
-- 
Eygene Ryabinkin                                        ,,,^..^,,,
[ Life's unfair - but root password helps!           | codelabs.ru ]
[ 82FE 06BC D497 C0DE 49EC  4FF0 16AF 9EAE 8152 ECFB | freebsd.org ]
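
As a rough illustration of the "simple wrapper" mentioned above, the
sketch below blindly calls each given script with 'quietstart' and
the interface name and only looks at the exit status.  It is not an
existing rc.d script: the function name try_dynamic is hypothetical,
and the sketch assumes the proposed "quiet" semantics, i.e. that a
script which does not apply to the interface fails silently with
a non-zero status.

```shell
#!/bin/sh
# Hypothetical helper, not part of rc.d.
# Usage: try_dynamic IFACE SCRIPT...
# Runs "SCRIPT quietstart IFACE" for every script given and returns 0 if
# at least one of them succeeded, 1 otherwise.
try_dynamic()
{
	_if=$1
	shift
	_rc=1
	for _script in "$@"; do
		# Assumed "quiet" semantics: a script that does not apply to
		# this interface prints nothing and exits non-zero.
		if "$_script" quietstart "$_if"; then
			_rc=0
		fi
	done
	return $_rc
}
```

A devd action could then run a small script that calls, for example,
'try_dynamic "$1" /etc/rc.d/dhclient' with $subsystem as its
argument; a script handling dynamic IPv6 configuration could be
appended to the list without the wrapper having to know how either
script decides whether the interface is dynamically configured.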