From owner-freebsd-hackers@FreeBSD.ORG  Wed Sep 21 22:43:46 2011
In-Reply-To: <CAHM0Q_NbOGj4rEpHWBJooyrzYi2rehbxd5LChTga1DzWW6P44g@mail.gmail.com>
References: <CACqU3MXQ6tD804fKymeFeKDnHndSXVvHJwepYztB4DsnNmtMiw@mail.gmail.com>
	<CACqU3MWwOw_otd0sJ-c4OXedeeJtchwiX9Xpx7V0zNW+cNZ7Yw@mail.gmail.com>
	<CAHM0Q_NfoSoa52rAAF8iUPQoqardbgSsq0PDnfh+mUFN993ZVA@mail.gmail.com>
	<CACqU3MWMeAMcrDZ2NF_OytYgiAFxmHvYRKcCVk=-=_VVYAcExQ@mail.gmail.com>
	<CAHM0Q_NbOGj4rEpHWBJooyrzYi2rehbxd5LChTga1DzWW6P44g@mail.gmail.com>
Date: Wed, 21 Sep 2011 18:43:45 -0400
Message-ID: <CACqU3MXJJeF0HnqQSQQAAANhR_cnB3hF9qF2xb3GnU=J5xiaVA@mail.gmail.com>
From: Arnaud Lacombe <lacombar@gmail.com>
To: "K. Macy" <kmacy@freebsd.org>
Content-Type: text/plain; charset=ISO-8859-1
Content-Transfer-Encoding: quoted-printable
Cc: FreeBSD Hackers <freebsd-hackers@freebsd.org>
Subject: Re: buf_ring(9) API precisions

Hi,

On Mon, Sep 19, 2011 at 8:46 AM, K. Macy <kmacy@freebsd.org> wrote:
> If the value lags next by one then it is ours. This rule applies to
> all callers so the rule holds consistently.
>
I think you misunderstand my point, which is that the following:

       while (br->br_prod_tail != prod_head)
               cpu_spinwait();
       br->br_prod_bufs++;
       br->br_prod_bytes += nbytes;
       br->br_prod_tail = prod_next;
       critical_exit();

can, at runtime, be observed memory-wise by other CPUs as:

       while (br->br_prod_tail != prod_head)
               cpu_spinwait();
       br->br_prod_tail = prod_next;
       br->br_prod_bufs++;
       br->br_prod_bytes += nbytes;
       critical_exit();

That is, there is no memory barrier to enforce completion of the
load/increment/store/load/load/addition/store operations before
updating the field other threads spin on. Yes, `br_prod_tail' is marked
`volatile', but that does not guarantee it will not be reordered
with respect to the non-volatile writes (to `br_prod_bufs' and
`br_prod_bytes').
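
For illustration, here is a minimal userland sketch of the ordering that
would rule out the second sequence. It assumes C11 <stdatomic.h> instead of
the kernel's atomic(9) primitives, and `struct mini_ring',
`mini_ring_init()' and `mini_ring_publish()' are hypothetical names, not
part of buf_ring(9). The tail is stored with release semantics and polled
with acquire semantics, so the statistics updates cannot become visible
after the tail update:

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

/* Hypothetical stand-in for the producer side of struct buf_ring. */
struct mini_ring {
	_Atomic uint32_t prod_tail;	/* what waiting producers spin on */
	uint64_t	 prod_bufs;	/* plain, non-atomic statistics */
	uint64_t	 prod_bytes;
};

static inline void
mini_ring_init(struct mini_ring *r)
{
	atomic_init(&r->prod_tail, 0);
	r->prod_bufs = 0;
	r->prod_bytes = 0;
}

/*
 * Complete an enqueue that reserved [prod_head, prod_next).
 * Assumption: the caller already reserved the slot, as the real
 * enqueue path does with a compare-and-swap on the producer head.
 */
static inline void
mini_ring_publish(struct mini_ring *r, uint32_t prod_head,
    uint32_t prod_next, uint64_t nbytes)
{
	/*
	 * Acquire load: pairs with the previous owner's release store,
	 * so its statistics updates are visible before ours start.
	 */
	while (atomic_load_explicit(&r->prod_tail,
	    memory_order_acquire) != prod_head)
		;	/* the kernel code calls cpu_spinwait() here */

	r->prod_bufs++;
	r->prod_bytes += nbytes;

	/*
	 * Release store: the two updates above cannot be reordered
	 * after this store, by the compiler or by the CPU.
	 */
	atomic_store_explicit(&r->prod_tail, prod_next,
	    memory_order_release);
}
```

In the kernel, the closest analogue would presumably be a release store
such as atomic_store_rel_32() on `br_prod_tail', paired with an acquire
load in the spin loop. That is an assumption about how a fix could look,
not a description of the current code.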

 - Arnaud

> On Mon, Sep 19, 2011 at 5:53 AM, Arnaud Lacombe <lacombar@gmail.com> wrote:
>> Hi,
>>
>> On Fri, Sep 16, 2011 at 10:41 AM, K. Macy <kmacy@freebsd.org> wrote:
>>> On Fri, Sep 16, 2011 at 3:02 AM, Arnaud Lacombe <lacombar@gmail.com> wrote:
>>>> Hi,
>>>>
>>>> On Wed, Sep 14, 2011 at 10:53 PM, Arnaud Lacombe <lacombar@gmail.com> wrote:
>>>>> Hi Kip,
>>>>>
>>>>> I've got a few questions about the buf_ring(9) API.
>>>>>
>>>>> 1) What does the 'drbr_' prefix mean? I can guess the last two
>>>>> letters, 'b' and 'r', stand for Buffer Ring, but what about 'd' and 'r'?
>>>>>
>>>>> 2) in `sys/sys/buf_ring.h', you defined 'struct buf_ring' as:
>>>>>
>>>>> struct buf_ring {
>>>>>        volatile uint32_t       br_prod_head;
>>>>>        volatile uint32_t       br_prod_tail;
>>>>>        int                     br_prod_size;
>>>>>        int                     br_prod_mask;
>>>>>        uint64_t                br_drops;
>>>>>        uint64_t                br_prod_bufs;
>>>>>        uint64_t                br_prod_bytes;
>>>> Shouldn't those 3 fields be updated atomically, especially on 32-bit
>>>> platforms? That might pose a problem as, AFAIK, FreeBSD does not have
>>>> MI 64-bit atomic operations...
>>>
>>> Between the point at which br_prod_tail == prod_head and when we
>>> update br_prod_tail to point to prod_next we are the exclusive owners
>>> of the fields in buf_ring. That is why we wait for any other
>>> enqueueing threads to update br_prod_tail to point to prod_head before
>>> continuing.
>>>
>> How do you enforce ordering? I do not see anything in particular
>> preventing the store to `br->br_prod_tail' from being committed first,
>> leading other threads to believe they have access to the statistics
>> while the first thread has not yet committed its changes.
>>
>> Thanks,
>>  - Arnaud
>>
>>> Cheers
>>>
>>>        /*
>>>         * If there are other enqueues in progress
>>>         * that preceded us, we need to wait for them
>>>         * to complete
>>>         */
>>>        while (br->br_prod_tail != prod_head)
>>>                cpu_spinwait();
>>>        br->br_prod_bufs++;
>>>        br->br_prod_bytes += nbytes;
>>>        br->br_prod_tail = prod_next;
>>>        critical_exit();
>>>
>>
>