Date:      Sun, 1 Dec 2013 20:29:24 -0800
From:      Oleg Moskalenko <mom040267@gmail.com>
To:        Sepherosa Ziehau <sepherosa@gmail.com>
Cc:        Ermal Luçi <eri@freebsd.org>, freebsd-net <freebsd-net@freebsd.org>, Tim Kientzle <kientzle@freebsd.org>, "freebsd-current@freebsd.org" <freebsd-current@freebsd.org>
Subject:   Re: [PATCH] SO_REUSEADDR and SO_REUSEPORT behaviour
Message-ID:  <CALDtMrLgm-D30u8HWWF=sVda0h4QtYdyiGHpYPw1kfTWbMbJ6Q@mail.gmail.com>
In-Reply-To: <CAMOc5cwFGwk0dS5VT-YxfP3Yt38R8aO-KJTX6W832uOFEdavgA@mail.gmail.com>
References:  <CAPBZQG29BEJJ8BK=gn%2Bg_n5o7JSnPbsKQ-=3=6AkFOxzt%2B=wGQ@mail.gmail.com> <4053E074-EDC5-49AB-91A7-E50ABE36602E@freebsd.org> <CALDtMrKvwXW-ou8X7zsKx2ST=dKD7FqHvvnQtGo30znTWU%2BVQQ@mail.gmail.com> <CAPBZQG0=bcHyv7aZse=WKfjk5=6D2-%2B6EQHiAaDZqGtaodhMMA@mail.gmail.com> <CAMOc5cwFGwk0dS5VT-YxfP3Yt38R8aO-KJTX6W832uOFEdavgA@mail.gmail.com>

Sepherosa, while reading your description I noticed another long-standing
problem for UDP application developers: UDP sockets are always hashed by
2-tuple. But UDP sockets can also be "connected" to a remote address with
the connect(...) function. Unfortunately, with 2-tuple hashing that pattern
is useless for large-scale applications: if a large number of UDP sockets
on the same local port are "connected" to remote addresses, the kernel has
to walk the long list of UDP sockets that share the same hash value.
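
A minimal sketch of that pattern, assuming Python's socket module on a
system that exposes SO_REUSEPORT. The helper names, the loopback address
and the peer ports are made up for illustration only:

```python
import socket

def open_connected_udp(local_addr, remote_addr):
    """Bind a UDP socket to local_addr and connect() it to remote_addr."""
    s = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    # Set both options so that several sockets may share the local port.
    s.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    s.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEPORT, 1)
    s.bind(local_addr)
    # For UDP, connect() only records the peer; no packets are exchanged.
    s.connect(remote_addr)
    return s

def open_many(remotes, host="127.0.0.1"):
    """Open one connected UDP socket per remote, all on one local port."""
    first = open_connected_udp((host, 0), remotes[0])  # kernel picks the port
    port = first.getsockname()[1]
    rest = [open_connected_udp((host, port), r) for r in remotes[1:]]
    return [first] + rest
```

Every socket returned by open_many() has the same local address and port,
so a lookup keyed only on the local side cannot tell them apart.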

If connected UDP sockets used 4-tuples, it would be very helpful for the
new generation of UDP-based media applications. For example, servers that
use the DTLS protocol would become simpler and more efficient.
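
The difference can be illustrated with a toy model. This is not kernel
code: a Python dict stands in for the hash table, the addresses and the
port 3478 are hypothetical, and treating (local address, local port) as
the 2-tuple is an assumption made for the sketch:

```python
from collections import defaultdict

def chain_lengths(sockets, key):
    """Group sockets into buckets by key(sock); return the bucket sizes."""
    buckets = defaultdict(list)
    for sock in sockets:
        buckets[key(sock)].append(sock)
    return sorted(len(chain) for chain in buckets.values())

# Hypothetical data: 1000 sockets on local 192.0.2.1:3478, each connected
# to a different remote port on 198.51.100.7.
socks = [("192.0.2.1", 3478, "198.51.100.7", 10000 + i) for i in range(1000)]

def two_tuple(s):
    return (s[0], s[1])   # local address and port only (assumed 2-tuple)

def four_tuple(s):
    return s              # local and remote address and port

print(max(chain_lengths(socks, two_tuple)))   # 1000: one long chain
print(max(chain_lengths(socks, four_tuple)))  # 1: one socket per bucket
```

With the 2-tuple key every incoming datagram may require a scan of all
1000 entries; with the 4-tuple key the matching socket is found directly.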

Thanks
Oleg



On Sun, Dec 1, 2013 at 8:17 PM, Sepherosa Ziehau <sepherosa@gmail.com> wrote:

>
>
>
> On Sat, Nov 30, 2013 at 2:42 AM, Ermal Luçi <eri@freebsd.org> wrote:
>
>> Well seems Dragonfly has some version of it already from commit [1].
>>
>>
> The distribution algorithm was changed a little bit after initial commit
> to gain more idle time (bnx(4) output has already been maxed out):
>
> http://gitweb.dragonflybsd.org/dragonfly.git/commitdiff/c275f18d832361be28b150d3f4fd518914bdeba6
>
> Well, I also addressed a reasonable concern from nginx folks (I am not
> quite sure about Linux's position on it; Linux's original implementation of
> SO_REUSEPORT from Google had this drawback, which I mentioned in the commit
> message):
>
> http://gitweb.dragonflybsd.org/dragonfly.git/commitdiff/02ad2f0b874fb0a45eb69750219f79f5e8982272
>
> As for nginx, the SO_REUSEPORT patch for nginx (both 1.4.x and 1.5.x) is in
> dports; it should be easy to backport to FreeBSD's ports.  I failed to
> convince the nginx folks to merge it into mainline, and I am currently busy
> with other things; I will come back to it later.  If FreeBSD is going to
> implement Linux-style SO_REUSEPORT, pushing the patch to the nginx
> mainline will be easier.
>
> I also put up a brief description of SO_REUSEPORT in dfly; may be useful
> to you:
> http://leaf.dragonflybsd.org/~sephe/netisr_so_reuseport.txt
>
> Best Regards,
> sephe
>
>
>> In FreeBSD there is a framework for this, enabled by defining PCBGROUP.
>> There is also an explanation of it at [2] and [3].
>> It can achieve approximately the same features as Linux's SO_REUSEPORT.
>> The only things missing, I think, are the marketing behind it and better
>> RSS support.
>> Judging by the dates, the support was there before Linux's, so all of you
>> looking for it can experiment with it.
>>
>> What I was trying to accomplish was something other than a performance
>> improvement, perhaps with a sysctl behind it to make it more acceptable.
>>
>> [1] http://gitweb.dragonflybsd.org/dragonfly.git/commitdiff/740d1d9f7b7bf9c9c021abb8197718d7a2d441c9
>> [2] http://fxr.watson.org/fxr/source/netinet/in_pcbgroup.c?im=bigexcerpts#L51
>> [3] http://lists.freebsd.org/pipermail/svn-src-head/2011-June/028190.html
>>
>>
>> On Fri, Nov 29, 2013 at 7:03 PM, Oleg Moskalenko <mom040267@gmail.com> wrote:
>>
>> > Tim, you are wrong. Read the definition of "multicast", and read how UDP
>> > and TCP sockets work in Linux 3.9+ kernels.
>> >
>> > Oleg .
>> >
>> >
>> > On Fri, Nov 29, 2013 at 9:59 AM, Tim Kientzle <kientzle@freebsd.org> wrote:
>> >
>> >>
>> >> On Nov 29, 2013, at 4:04 AM, Ermal Luçi <eri@freebsd.org> wrote:
>> >>
>> >> > Hello,
>> >> >
>> >> > since SO_REUSEADDR and SO_REUSEPORT are supposed to allow two daemons to
>> >> > share the same port and possibly listening ip …
>> >>
>> >> These flags are used with TCP-based servers.
>> >>
>> >> I’ve used them to make software upgrades go more smoothly.
>> >> Without them, the following often happens:
>> >>
>> >> * Old server stops.  In the process, all of its TCP connections are
>> >> closed.
>> >>
>> >> * Connections to old server remain in the TCP connection table until the
>> >> remote end can acknowledge.
>> >>
>> >> * New server starts.
>> >>
>> >> * New server tries to open port but fails because that port is “still in
>> >> use” by connections in the TCP connection table.
>> >>
>> >> With these flags, the new server can open the port even though
>> >> it is “still in use” by existing connections.
>> >>
>> >>
>> >> > This is not the case today.
>> >> > Only multicast sockets seem to have the behaviour of broadcasting the data
>> >> > to all sockets sharing the same properties through these options!
>> >>
>> >> That is what multicast is for.
>> >>
>> >> If you want the same data sent to all listeners, then
>> >> that is multicast behavior and you should be using
>> >> a multicast socket.
>> >>
>> >> > The patch at [1] implements/corrects the behaviour for UDP sockets.
>> >>
>> >> You’re trying to turn all UDP sockets with those options
>> >> into multicast sockets.
>> >>
>> >> If you want a multicast socket, you should ask for one.
>> >>
>> >> Tim
>> >>
>> >> _______________________________________________
>> >> freebsd-net@freebsd.org mailing list
>> >> http://lists.freebsd.org/mailman/listinfo/freebsd-net
>> >> To unsubscribe, send any mail to "freebsd-net-unsubscribe@freebsd.org"
>> >>
>> >
>> >
>>
>>
>> --
>> Ermal
>> _______________________________________________
>> freebsd-current@freebsd.org mailing list
>> http://lists.freebsd.org/mailman/listinfo/freebsd-current
>> To unsubscribe, send any mail to "freebsd-current-unsubscribe@freebsd.org"
>>
>
>
>
> --
> Tomorrow Will Never Die
>


