Date:      Sun, 10 May 2009 03:17:47 +0100
From:      Bruce Simpson <bms@incunabulum.net>
To:        Warner Losh <imp@FreeBSD.org>
Cc:        svn-src-head@freebsd.org, svn-src-all@freebsd.org, src-committers@freebsd.org
Subject:   Re: svn commit: r191943 - head/sys/netinet
Message-ID:  <4A06394B.8050002@incunabulum.net>
In-Reply-To: <200905091850.n49Io1vX031388@svn.freebsd.org>
References:  <200905091850.n49Io1vX031388@svn.freebsd.org>

Warner Losh wrote:
> Author: imp
> Date: Sat May  9 18:50:01 2009
> New Revision: 191943
> URL: http://svn.freebsd.org/changeset/base/191943
>
> Log:
>   Remove bogus comment.
>   

Thanks for tackling the BURN_BRIDGES cleanup. Some more general thoughts:

Actually, in_multihead can probably get blown away now. It was used only 
by the IN_LOOKUP_MULTI() macro.

It is no longer used anywhere in HEAD (likewise for the IPv6
equivalent); the code now uses the if_multiaddrs TAILQ in ifnet. I just
didn't remove it right away, to avoid VIMAGE churn. I'm wondering if we
should even be putting network layer addresses in ifnet at all, given
the fun locking churn and LORs it can lead to.

We've got a couple of peculiarities with IPv4 multicast and
limited/network broadcast, which originated in 4.4BSD and have largely
been perpetuated by other implementations as TCP/IP has spread
everywhere.

With an SSM capable stack, in_multi becomes a bigger structure with more
state. What we might benefit from, as a wholly separate idea, is a 1:M
map from inpcb to in_multi (with bidirectional key lookup). It would
avoid the need to walk every inpcb in the system whenever a multicast
(or broadcast!) datagram arrives on an interface. On a system with tens
of thousands of sockets, that walk probably doesn't scale, and the
(since reclaimed) additional memory might be worth it.
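
To make the map idea concrete, here is a rough userland sketch in C. It is
not kernel code and not a proposed patch: struct pcb stands in for inpcb,
and every name in it (memb, memb_table, memb_join, memb_receivers,
memb_count_for_pcb, MEMB_BUCKETS) is made up for illustration. Each
membership edge sits on two chains, one keyed by (ifindex, group) and one
hanging off the pcb, so lookup works in both directions and delivery only
touches the members of the group rather than every socket. Locking and
teardown are left out.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

#define MEMB_BUCKETS 64          /* hypothetical table size */

struct pcb;                      /* stand-in for struct inpcb */

/* One membership edge: ties one pcb to one (interface, group) pair. */
struct memb {
    uint32_t     ifindex;
    uint32_t     group;          /* IPv4 group address, host byte order */
    struct pcb  *pcb;
    struct memb *next_in_bucket; /* chain keyed by (ifindex, group) */
    struct memb *next_in_pcb;    /* chain keyed by pcb */
};

struct pcb {
    int          id;
    struct memb *memberships;    /* reverse direction: pcb -> groups */
};

struct memb_table {
    struct memb *bucket[MEMB_BUCKETS];
};

static unsigned memb_hash(uint32_t ifindex, uint32_t group)
{
    return (unsigned)((group ^ (ifindex * 2654435761u)) % MEMB_BUCKETS);
}

/* Record that pcb p joined group on ifindex. Returns 0, or -1 on ENOMEM. */
int memb_join(struct memb_table *t, struct pcb *p,
              uint32_t ifindex, uint32_t group)
{
    assert(t != NULL && p != NULL);
    struct memb *m = malloc(sizeof(*m));
    if (m == NULL)
        return -1;
    unsigned h = memb_hash(ifindex, group);
    m->ifindex = ifindex;
    m->group = group;
    m->pcb = p;
    m->next_in_bucket = t->bucket[h];   /* link into the group chain */
    t->bucket[h] = m;
    m->next_in_pcb = p->memberships;    /* link into the pcb chain */
    p->memberships = m;
    return 0;
}

/* Input path: collect up to max pcbs that joined group on ifindex. */
size_t memb_receivers(const struct memb_table *t, uint32_t ifindex,
                      uint32_t group, struct pcb **out, size_t max)
{
    size_t n = 0;
    const struct memb *m = t->bucket[memb_hash(ifindex, group)];
    for (; m != NULL && n < max; m = m->next_in_bucket)
        if (m->ifindex == ifindex && m->group == group)
            out[n++] = m->pcb;
    return n;
}

/* Reverse lookup: how many memberships does this pcb hold? */
size_t memb_count_for_pcb(const struct pcb *p)
{
    size_t n = 0;
    for (const struct memb *m = p->memberships; m != NULL; m = m->next_in_pcb)
        n++;
    return n;
}
```

The cost is one extra allocation per membership, which is the memory
trade-off mentioned above.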

I believe Windows already does just this. In fact it's worth looking at 
their BSD sockets API in Vista/Longhorn -- there are a bunch of tricks 
in there to work around / fix some historical BSD lameness.

The irony here has to be seen. Clients I've worked with in the past are
rolling out their satellite multimedia systems using Windows as the
client, though not strictly for these reasons. Their timing problems
would go away if they had POSIX realtime APIs, instead of having to hack
around their absence the way the Geiss Winamp plugin guy did, or going
straight to a kernel-mode driver. But by using Windows, they don't even
run into issues like this in the network layer; those have already been
fixed.

(The real reason they stuck with Windows was probably more to do with 
the fact that commercial MPEG4 in RTP implementations were available 
there and then when they needed them in their business plan.)

Right now we have the kludgy situation that we're carrying around a 
BSD-specific socket option, SO_REUSEPORT. It exists solely because 
laddr-bound inpcb's will not match the in_multi lookup, and it isn't 
actually used in other implementations. This is counter-intuitive 
because multicast group membership is scoped to particular links, yet if 
you bind() to an address configured on that link, you don't receive 
multicast traffic. Same with broadcast.
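
For reference, the usual userland receiver pattern that follows from this
looks roughly like the C sketch below: bind to the wildcard address rather
than the link's address, set SO_REUSEPORT so several receivers can share
the port, then join the group. The helper names open_shared_udp and
join_group are invented for this example; only the socket calls and
options themselves are real, and error handling is minimal.

```c
#ifndef _DEFAULT_SOURCE
#define _DEFAULT_SOURCE  /* glibc: keep struct ip_mreq visible in strict modes */
#endif
#include <arpa/inet.h>
#include <assert.h>
#include <netinet/in.h>
#include <stdint.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

/*
 * Hypothetical helper: UDP socket bound to INADDR_ANY:port with
 * SO_REUSEPORT set, so that other sockets with the same option can
 * bind the same port. Returns the descriptor, or -1 on failure.
 */
int open_shared_udp(uint16_t port)
{
    int fd = socket(AF_INET, SOCK_DGRAM, 0);
    if (fd < 0)
        return -1;

    int on = 1;
    if (setsockopt(fd, SOL_SOCKET, SO_REUSEPORT, &on, sizeof(on)) < 0) {
        close(fd);
        return -1;
    }

    struct sockaddr_in sin;
    memset(&sin, 0, sizeof(sin));
    sin.sin_family = AF_INET;
    sin.sin_port = htons(port);
    sin.sin_addr.s_addr = htonl(INADDR_ANY);  /* wildcard, not the link address */
    if (bind(fd, (struct sockaddr *)&sin, sizeof(sin)) < 0) {
        close(fd);
        return -1;
    }
    return fd;
}

/*
 * Hypothetical helper: join an IPv4 group on the interface that owns
 * ifaddr ("0.0.0.0" lets the stack pick). Returns 0, or -1 on failure.
 */
int join_group(int fd, const char *group, const char *ifaddr)
{
    assert(group != NULL && ifaddr != NULL);

    struct ip_mreq mreq;
    memset(&mreq, 0, sizeof(mreq));
    if (inet_pton(AF_INET, group, &mreq.imr_multiaddr) != 1)
        return -1;
    if (inet_pton(AF_INET, ifaddr, &mreq.imr_interface) != 1)
        return -1;
    return setsockopt(fd, IPPROTO_IP, IP_ADD_MEMBERSHIP, &mreq, sizeof(mreq));
}
```

A receiver would call open_shared_udp() with its port and then
join_group() with the group address; whether the join succeeds depends on
the host having a multicast-capable interface.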

We do need to preserve the SO_REUSEPORT behaviour for idempotence and 
backwards compatibility.

Just something to chew on...

cheers,
BMS


