From owner-freebsd-hackers@FreeBSD.ORG  Mon May 13 18:02:51 2013
In-Reply-To: <985C1C3F-3F70-47D2-8F43-F3D6CCA4482C@gmail.com>
References: <CCE4FFC4-F846-4F81-85EE-776B753C63C6@gmail.com>
 <CAJ-VmokBfBvdLa_Wf2EajF+vecVntLDaxdVeNvhAOiPp6HkjNA@mail.gmail.com>
 <84DCA050-99D4-4B22-A031-35E0928709E0@gmail.com>
 <CAJ-Vmo=CVeXpf9WNOegD3yG9Q0NwUWaLadVrv1RgeyAaHYADiQ@mail.gmail.com>
 <985C1C3F-3F70-47D2-8F43-F3D6CCA4482C@gmail.com>
Date: Mon, 13 May 2013 11:02:47 -0700
X-Google-Sender-Auth: IC3sRV8Uu0N5I0PYWpwdLnQuVpI
Message-ID: <CAJ-VmomV=3+ryHpxzFEe_Yb3WK1MThDQ_CZ8KUhQFRDn0TBd_w@mail.gmail.com>
Subject: Re: Managing userland data pointers in kqueue/kevent
From: Adrian Chadd <adrian@freebsd.org>
To: Eugen-Andrei Gavriloaie <shiretu@gmail.com>
Content-Type: text/plain; charset=windows-1252
Content-Transfer-Encoding: quoted-printable
Cc: freebsd-hackers@freebsd.org
List-Id: Technical Discussions relating to FreeBSD
 <freebsd-hackers.freebsd.org>

Hi,

The reason I tend to suggest this is for portability and debugging
reasons. (Before and even since libevent came into existence.)

If you do it right, you can stub / inline out all of the wrapper
functions in userland and translate them to straight system or library
calls.
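
A rough sketch of what I mean by a wrapper that can be stubbed out. This
is illustrative only: the FD_DEBUG switch is a hypothetical build flag
and the body of fd_close() is an assumption, not code from any real
project. With FD_DEBUG undefined, the wrapper is a static inline that
the compiler can reduce to a plain close(2) call.

```c
#include <assert.h>
#include <stdio.h>
#include <unistd.h>

/*
 * Hypothetical wrapper: the application calls fd_close() everywhere
 * instead of calling close(2) directly.
 */
static inline int
fd_close(int fd)
{
#ifdef FD_DEBUG
    /* Debug build (hypothetical flag): sanity-check and trace the call. */
    assert(fd >= 0);
    fprintf(stderr, "fd_close(%d)\n", fd);
#endif
    /* Without FD_DEBUG this line is all that remains of the wrapper. */
    return (close(fd));
}
```

The same pattern would apply to whatever else gets wrapped (open,
accept, event registration), so a port to another event API only has to
touch the wrappers.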

Anyway. I'm all for making kqueue better. I just worry that adding
little hacks here and there isn't the right way to do it. If you want
to guarantee specific behaviours with kqueue, you should likely define
how it should work in its entirety and see if it will cause
architectural difficulties down the track. Until that is done, I think
you have no excuse not to get your code working as needed.

Don't blame kqueue because what (iirc) is not defined behaviour isn't
defined in a way that makes you happy :)


Adrian

On 13 May 2013 09:36, Eugen-Andrei Gavriloaie <shiretu@gmail.com> wrote:
> Hi Adrian,
>
> All the tricks, workarounds, and paradigms suggested or implemented by
> us, the kq users, are greatly simplified by simply adding the thing that
> Paul is suggesting. What you are saying here is basically to do
> not-so-natural things to overcome a real problem which can be solved
> very easily and non-intrusively at lower levels. Seriously, if you truly
> believe that you can put an equals sign between the complexity of the
> user-space code and the wanted patch on the kqueue kernel side, then I
> will simply shut up.
>
> Besides, one of the important points in the kq philosophy is simplifying
> things. I underline the "one of". It is not the goal, of course. Complex
> things are complex things no matter how hard you try to simplify them.
> But this definitely does not (should not) fall into that category.
>
> ------
> Eugen-Andrei Gavriloaie
> Web: http://www.rtmpd.com
>
> On May 13, 2013, at 6:47 PM, Adrian Chadd <adrian@freebsd.org> wrote:
>
>> ... holy crap.
>>
>> On 13 May 2013 08:37, Eugen-Andrei Gavriloaie <shiretu@gmail.com> wrote:
>>> Hi,
>>>
>>> Well, Paul already asked this question like 3-4 times now. Even
>>> insisting on it. I will also ask it again:
>>> If user code is responsible for tracking down the data associated
>>> with the signalled entity, what is the point of having user data?
>>> It is rendered completely useless...
>>
>> .. why does everything have to have a well defined purpose that is
>> also suited for use in _all_ situations?
> That is called perfection. I know we can't achieve it, but I like to
> walk in that direction at least.
>
>>
>>> Not to mention that your suggestion with the FD index is a definite
>>> no-go. The FD values are re-used. Especially in MT environments.
>>> Imagine one kqueue call taking place in thread A and another one in
>>> thread B. Both threads waiting for events.
>>
>> .. so don't do that. I mean, you're already having to write your code
>> to _not_ touch FDs in other threads. I've done this before, it isn't
>> that hard and it doesn't hurt performance.
> Why not? This is how you achieve natural load balancing for multiple
> kevent() calls from multiple threads over the same kq fd. Otherwise,
> again, you have to write complex code to manually balance the threads.
> That brings locking again...
> Why do people always think that locking is cheap? Excessive locking
> hurts. A lot!
>
>>
>>> When A does its magic, because of internal business rules, it decides
>>> to close FD number 123. It closes it and it connects somewhere else by
>>> opening a new one. Surprise, we MAY get the value 123 again as a new
>>> socket, we put it on our index, etc. Now, thread B comes in and it has
>>> stale/old events for the old 123 FD. Something bad like EOF for the
>>> OLD version of FD number 123 (the one we just closed anyway). Guess
>>> what... thread B will deallocate the perfectly good thingy inside the
>>> index associated with 123.
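
The reuse hazard described above can be reproduced without kqueue at
all. What follows is a minimal single-threaded sketch, relying on the
POSIX rule that a new descriptor gets the lowest free number. The names
conn_slot, slot_open(), slot_close(), event_is_stale() and
stale_event_demo() are hypothetical, and tagging each slot with a
generation counter is an assumption of this sketch (one common userland
workaround), not something the thread settled on.

```c
#include <assert.h>
#include <stdbool.h>
#include <unistd.h>

#define MAX_FDS 1024

/* Hypothetical per-descriptor slot, indexed by fd number. */
struct conn_slot {
    bool     in_use;
    unsigned gen;   /* incremented each time the slot is (re)opened */
};

static struct conn_slot slots[MAX_FDS];

/* Register a newly opened fd; returns the generation to carry with
 * any event registered for it. */
static unsigned
slot_open(int fd)
{
    assert(fd >= 0 && fd < MAX_FDS);
    slots[fd].in_use = true;
    return (++slots[fd].gen);
}

/* Mark the slot free and close the descriptor. */
static int
slot_close(int fd)
{
    assert(fd >= 0 && fd < MAX_FDS);
    slots[fd].in_use = false;
    return (close(fd));
}

/* An event captured as (fd, gen) is stale once the slot has been
 * closed or reopened since the event was registered. */
static bool
event_is_stale(int fd, unsigned gen)
{
    assert(fd >= 0 && fd < MAX_FDS);
    return (!slots[fd].in_use || slots[fd].gen != gen);
}

/*
 * Single-threaded stand-in for the A/B race: "B" still holds an event
 * for the old descriptor while "A" closes it and opens a new one.
 * Returns 1 if the number was reused and the old event is detected as
 * stale, 0 if not, -1 on error.
 */
int
stale_event_demo(void)
{
    int p[2], old_fd, new_fd, result;
    unsigned old_gen;

    if (pipe(p) != 0)
        return (-1);
    old_fd = p[0];
    old_gen = slot_open(old_fd);    /* B's pending event: (old_fd, old_gen) */

    slot_close(old_fd);             /* A closes the descriptor ... */
    new_fd = dup(p[1]);             /* ... and opens another one */
    if (new_fd < 0) {
        close(p[1]);
        return (-1);
    }
    (void)slot_open(new_fd);

    /* Lowest-free-number rule: new_fd takes over the old number. */
    result = (new_fd == old_fd) && event_is_stale(old_fd, old_gen);

    slot_close(new_fd);
    close(p[1]);
    return (result);
}
```

With kqueue, the (fd, gen) pair, or a pointer to a record holding it,
would travel in the udata field of struct kevent. In a real
multi-threaded program the slot table itself would still need a lock or
a single owning thread.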
>>
>> So you just ensure that nothing at all calls a close(123); but calls
>> fd_close(123) which will in turn close(123) and free all the state
>> associated with it.
> Once threads A and B have returned from their kevent() calls, all bets
> are off. In between, you get the behaviour I just described from
> threads A and B racing towards FD 123 to either close it or create a
> new one. How is wrapping close() going to help? It's not like you have
> any control over what the socket() function is going to return. (That
> gave me another token idea btw... I will explain in another email,
> perhaps you care to comment)
> Mathematically speaking, the fd-to-data association is not bijective.
>
>
>>
>> You have fd_close() either grab a lock, or you ensure that only the
>> owning thread can call fd_close(123) and if any other thread calls it,
>> the behaviour is undefined.
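
A minimal sketch of the locked fd_close() variant described above. The
fd_state table, fd_attach() and fd_lookup() are hypothetical helpers
invented for this illustration, and the single global pthread mutex is
an assumption chosen for brevity rather than a recommendation.

```c
#include <assert.h>
#include <pthread.h>
#include <stdlib.h>
#include <unistd.h>

#define FD_TABLE_SIZE 1024

/* Hypothetical table of heap-allocated per-fd state, indexed by fd. */
static void *fd_state[FD_TABLE_SIZE];
static pthread_mutex_t fd_table_lock = PTHREAD_MUTEX_INITIALIZER;

/* Attach state to fd; the table takes ownership. Returns 0 or -1. */
static int
fd_attach(int fd, void *state)
{
    if (fd < 0 || fd >= FD_TABLE_SIZE)
        return (-1);
    pthread_mutex_lock(&fd_table_lock);
    free(fd_state[fd]);     /* drop anything left by an earlier user */
    fd_state[fd] = state;
    pthread_mutex_unlock(&fd_table_lock);
    return (0);
}

/* Return the state attached to fd, or NULL. The pointer is only safe
 * to use while the caller knows no other thread can fd_close(fd). */
static void *
fd_lookup(int fd)
{
    void *state;

    if (fd < 0 || fd >= FD_TABLE_SIZE)
        return (NULL);
    pthread_mutex_lock(&fd_table_lock);
    state = fd_state[fd];
    pthread_mutex_unlock(&fd_table_lock);
    return (state);
}

/* Free the state and close the descriptor as one step under the lock,
 * so the number cannot be re-attached while old state is still there. */
static int
fd_close(int fd)
{
    int rc;

    if (fd < 0 || fd >= FD_TABLE_SIZE)
        return (-1);
    pthread_mutex_lock(&fd_table_lock);
    free(fd_state[fd]);
    fd_state[fd] = NULL;
    rc = close(fd);
    pthread_mutex_unlock(&fd_table_lock);
    return (rc);
}
```

The lock is only taken on attach, lookup and close. The single-owner
alternative drops the mutex entirely, at the cost of making a call from
any other thread undefined behaviour.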
> As I said, that adds to the user-space code complexity. Just don't
> forget that Paul's suggestion solves all these problems in a
> ridiculously simple manner. All our ideas of keeping track of who owns
> whom, and of indexes, are going to be put to rest. kq will notify us
> when the udata is out of scope from kq's perspective. That is all we
> ask.
>
>>
>>> And regarding the "thread happiness", that is not happiness at all
>>> IMHO...
>>
>> Unless you're writing a high connection throughput web server, the
>> overhead of grabbing a lock in userland during the fd shutdown process
>> is trivial. Yes, I've written those. It doesn't hurt you that much.
> That "that much" is subjective. And a streaming server is a few orders
> of magnitude more complex than a web server. Remember, a web server is
> bound to the request/response paradigm, while a streaming server is a
> full-duplex (not request/response based) animal for most connections.
> I strongly believe that becomes a real problem. (I would love to be
> wrong on this one!)
>
>>
>> I'm confused as to why this is still an issue. Sure, fix the kqueue
>> semantics and do it in a way that doesn't break backwards
>> compatibility.
> Then, if someone has the time and inclination, it would be nice to have
> it. It is a neat solution. It is one thing to say, hey, we don't have
> time, do it yourself. It is another thing to try to offer "better"
> solutions by defending such an obvious caveat.
>
>> But please don't claim that it's stopping you from
>> getting real work done.
> I didn't and I won't. I promise!
>
>> I've written network apps with kqueue that
>> scale to 8+ cores and (back in the mid-2000s) gigabit+ of small HTTP
>> transactions.
> Good for you. How is this relevant to our discussion of simplifying
> things? Of course it is possible. But let's make things simpler and
> more efficient. It really pays off in the long run. Hell, this is how
> kq was born in the first place: getting rid of all the garbage that one
> was supposed to do to achieve what kq does with a few lines of code.
> Let's make that even better than it currently is.
>
>> This stuff isn't at all problematic.
>>
>>
>> Adrian
>