Skip site navigation (1)Skip section navigation (2)
Date:      Tue, 22 Nov 2011 19:43:12 +0400
From:      Gleb Smirnoff <glebius@FreeBSD.org>
To:        Qing Li <qingli@FreeBSD.org>
Cc:        svn-src-head@FreeBSD.org, svn-src-all@FreeBSD.org, src-committers@FreeBSD.org
Subject:   Re: svn commit: r227791 - head/sys/netinet
Message-ID:  <20111122154312.GL96616@FreeBSD.org>
In-Reply-To: <CAGnGRdL%2BTBG7O55FnfgTmLBE8=cdZPy%2BcOdJZNgbB9P=BCdMww@mail.gmail.com>
References:  <201111211410.pALEAD9B046139@svn.freebsd.org> <CAGnGRdLpWwTkfjirBYe7x-1TVOMtHiRJNX4dM-iQXwQgP3mCVQ@mail.gmail.com> <20111121195439.GE96616@FreeBSD.org> <CAGnGRdL%2BTBG7O55FnfgTmLBE8=cdZPy%2BcOdJZNgbB9P=BCdMww@mail.gmail.com>

next in thread | previous in thread | raw e-mail | index | archive | help
  Qing,

  first I'd like to notice that we are speaking about obsoleted interfaces.
This should be taken into account during all the discussion. We shouldn't
make code more complex in favor to make obsoleted interfaces more smart.

1) Scrubbing in in_ifinit() is done only in case of SIOCSIFADDR. The latter
isn't ever used in our userland code. And I suppose isn't used by any software
in ports. Because you can't set a CIDR prefix with SIOCSIFADDR, which makes
it useless in 2011.

Actually I'd be glad to cut that SIOCSIFADDR/SIOCSIFNETMASK at all.

2) The code that restores the oldaddr is activated only in two cases:
   2.1) SIOCSIFADDR
   2.2) SIOCAIFADDR with an address in the in_aliasreq that already exists
	on the interface. If it doesn't exist, address is _added_,
	not changed!

I'll skip on SIOCSIFADDR, said enough on it.

To trigger 2.2 we need something like this:

# ifconfig igb1 10.0.0.1/24
# ifconfig igb1 10.0.0.1/24 alias

I don't see any idea in smart error recovering for this strange case and even
can't imagine error.

Back to your comments:

On Mon, Nov 21, 2011 at 01:36:13PM -0800, Qing Li wrote:
Q> > From my point of view logically speaking, we should first remove route,
Q> > then remove address. Otherwise, for a short time we've got an invalid
Q> > route in table.
Q> 
Q> For a short time you have an invalid address, it is faster to remove the
Q> address from the list to prevent usage, then to flush the route, which would
Q> also trigger L2 table flushes as well.

I have made a test case that proves, that usage of deleted address isn't
prevented when it is removed, but loopback route still exists.

The test is the following run a race between this program:

        struct ifreq ifr;
        int s;

        bzero(&ifr, sizeof(struct ifreq));

        strncpy(ifr.ifr_name, "igb1", sizeof ifr.ifr_name);
        ifr.ifr_addr.sa_family = AF_INET;
        ifr.ifr_addr.sa_len = sizeof(struct sockaddr_in);
        ((struct sockaddr_in *)&ifr.ifr_addr)->sin_addr.s_addr = inet_addr("10.0.0.1");

        s = socket(AF_INET, SOCK_DGRAM, 0);

        for (;;)
		ioctl(s, SIOCSIFADDR, &ifr);

And this script:

	while (true); do nc -z 10.0.0.1 22 || echo Fail; done

The script would never fail, that proves that traffic flows even when
address is temporarily wiped. With new code script may fail.

One note: your system will panic w/o fix that I've mailed you before, but
this is a separate issue that should be discussed separately.

Q> > Q> Putting "in_scrubprefix()" at the top does not gain you anything at
Q> > Q> all, but can
Q> > Q> potentially be problematic if additional tasks are in fact performed
Q> > Q> in "if_ioctl()"
Q> > Q> that may in fact affect the logic in "in_ifinit()".
Q> > Q>
Q> > Q> Case in point, I had to perform additional routing related tasks so I
Q> > Q> re-introduced "nd6_rtrequest()" in r227460.
Q> >
Q> > Pardon, can you please elaborate on this? I don't see any problems that
Q> > I intoroduced, but if ther are any, we can either push in_scrubprefix()
Q> > down the function as it was before, of fix them some other way.
Q> >
Q> 
Q> The point being, perhaps the address related tasks to be performed in
Q> "if_ioctl()" today is limited, however, we may need to perform additional
Q> housekeeping tasks in "if_ioctl()", and if there is error, we would want to
Q> revert the process.
Q> 
Q> There are so many different L2 and pseudo interface functions that map
Q> to "if_ioctl()", it is Not safe to ignore its error and not reflect that error
Q> back to the consistency of the routing table and the interface address
Q> list.

Yep, most of modern if_ioctl's do not fail at all. But my code checks for
error, so that isn't a problem.

Q> Again, removing the associated route at the beginning of in_ifinit() does
Q> not gain much in terms of performance or code cleanness, nor does it
Q> improve the logic, so why limit what we could do in the future in "if_ioctl()"
Q> by making such a change ??
Q> 
Q> >
Q> > Q> You are not simplifying much logic by removing the error recovery code from
Q> > Q> the return of "if_ioctl()". So why removing the flexibility of what
Q> > Q> "if_ioctl()" is
Q> > Q> intended for as part of the address configuration logic ?
Q> >
Q> > Because in_ifinit() was inconsistent. It tried to recover in case
Q> > of (*ifp->if_ioctl) failure, but did not try to recover in case
Q> > of failure of:
Q> >
Q> > in_addprefix()
Q> > ifa_add_loopback_route()
Q> >
Q> 
Q> The "inconsistency" is due to the fact failures from these two functions
Q> are not fatal, and does not necessarily affect the actual operations.
Q> 
Q> In function "in_addprefix()", there are 2 main possible errors. EEXIST is
Q> non fatal, it just means the same prefix route is already covered by
Q> another address, and what is being inserted is just an alias.
Q> 
Q> The other error is due to RTM_ADD failure, however, since IFA_ROUTE is
Q> not set on the address when failure occurs, no route will ever be directed
Q> to either the interface or that address. Therefore no harm is done.
Q> 
Q> In the case of function "ifa_add_loopback_route()", again, a failure is just
Q> an indication of an EEXIST.
Q> 
Q> You may ask why not check for EEXIST before calling these functions, and
Q> I can tell you having worked those code regions for quite some time, the
Q> actual work amounts to the same efforts, but complicates the logic WRT
Q> the callers. It's actually better just let the failure occur.

Application writers don't know this in-kernel things. They usually write code
this way:

	if (ioctl(foo) < 0) {
		/* fatal error: can't configure address! */
	}

And this is correct way. Look into this from viewpoint of say quagga developer.
One doesn't need to know about this tricks in FreeBSD kernel.

-- 
Totus tuus, Glebius.



Want to link to this message? Use this URL: <https://mail-archive.FreeBSD.org/cgi/mid.cgi?20111122154312.GL96616>