From: "K. Macy"
To: Arnaud Lacombe
Cc: FreeBSD Hackers
Subject: Re: buf_ring(9) API precisions
Date: Sat, 24 Sep 2011 10:56:08 +0200

You're right. A write memory barrier is needed there.
Thanks

On Thu, Sep 22, 2011 at 12:43 AM, Arnaud Lacombe wrote:
> Hi,
>
> On Mon, Sep 19, 2011 at 8:46 AM, K. Macy wrote:
>> If the value lags next by one then it is ours. This rule applies to
>> all callers so the rule holds consistently.
>>
> I think you do not understand what I mean, which is that the following:
>
>       while (br->br_prod_tail != prod_head)
>               cpu_spinwait();
>       br->br_prod_bufs++;
>       br->br_prod_bytes += nbytes;
>       br->br_prod_tail = prod_next;
>       critical_exit();
>
> at runtime, can be seen, memory-wise, as:
>
>       while (br->br_prod_tail != prod_head)
>               cpu_spinwait();
>       br->br_prod_tail = prod_next;
>       br->br_prod_bufs++;
>       br->br_prod_bytes += nbytes;
>       critical_exit();
>
> That is, there is no memory barrier to enforce completion of the
> load/increment/store/load/load/addition/store operations before
> updating what other threads spin on. Yes, `br_prod_tail' is marked
> `volatile', but there is no guarantee that it will not be re-ordered
> wrt. the non-volatile writes (to `br_prod_bufs' and `br_prod_bytes').
>
>  - Arnaud
>
>> On Mon, Sep 19, 2011 at 5:53 AM, Arnaud Lacombe wrote:
>>> Hi,
>>>
>>> On Fri, Sep 16, 2011 at 10:41 AM, K. Macy wrote:
>>>> On Fri, Sep 16, 2011 at 3:02 AM, Arnaud Lacombe wrote:
>>>>> Hi,
>>>>>
>>>>> On Wed, Sep 14, 2011 at 10:53 PM, Arnaud Lacombe wrote:
>>>>>> Hi Kip,
>>>>>>
>>>>>> I've got a few questions about the buf_ring(9) API.
>>>>>>
>>>>>> 1) What does the 'drbr_' prefix mean? I can guess the last two letters, 'b'
>>>>>> and 'r', for Buffer Ring, but what about 'd' and 'r'?
>>>>>>
>>>>>> 2) in `sys/sys/buf_ring.h', you defined 'struct buf_ring' as:
>>>>>>
>>>>>> struct buf_ring {
>>>>>>        volatile uint32_t       br_prod_head;
>>>>>>        volatile uint32_t       br_prod_tail;
>>>>>>        int                     br_prod_size;
>>>>>>        int                     br_prod_mask;
>>>>>>        uint64_t                br_drops;
>>>>>>        uint64_t                br_prod_bufs;
>>>>>>        uint64_t                br_prod_bytes;
>>>>> Shouldn't those 3 fields be updated atomically, especially on 32-bit
>>>>> platforms? That might pose a problem as, AFAIK, FreeBSD does not have
>>>>> MI 64-bit atomic operations...
>>>>
>>>> Between the point at which br_prod_tail == prod_head and when we
>>>> update br_prod_tail to point to prod_next we are the exclusive owners
>>>> of the fields in buf_ring. That is why we wait for any other
>>>> enqueueing threads to update br_prod_tail to point to prod_head before
>>>> continuing.
>>>>
>>> How do you enforce ordering? I do not see anything in particular
>>> forbidding the store to `br->br_prod_tail' from being committed first,
>>> leading other threads to believe they have access to the statistics while
>>> the first thread has not yet committed its change.
>>>
>>> Thanks,
>>>  - Arnaud
>>>
>>>> Cheers
>>>>
>>>>        /*
>>>>         * If there are other enqueues in progress
>>>>         * that preceded us, we need to wait for them
>>>>         * to complete
>>>>         */
>>>>        while (br->br_prod_tail != prod_head)
>>>>                cpu_spinwait();
>>>>        br->br_prod_bufs++;
>>>>        br->br_prod_bytes += nbytes;
>>>>        br->br_prod_tail = prod_next;
>>>>        critical_exit();
>>>>
>>>
>>
>
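
For reference, the ordering problem raised in this thread can be sketched in portable C11. This is only a userspace analogue under stated assumptions, not the FreeBSD kernel code and not necessarily the fix that was committed: `struct ring_sketch', `ring_sketch_init()' and `ring_sketch_enqueue()' are hypothetical names, slot storage and the full-ring check are left out, and C11 release/acquire atomics stand in for whatever barrier primitive the kernel would use. The point it illustrates is that the statistics writes must be ordered before the store to the tail that other producers spin on.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

/* Hypothetical stand-in for the producer side of struct buf_ring. */
struct ring_sketch {
	_Atomic uint32_t prod_head;
	_Atomic uint32_t prod_tail;
	uint32_t         prod_mask;	/* size - 1; size is a power of two */
	uint64_t         prod_bufs;	/* plain, non-atomic statistics */
	uint64_t         prod_bytes;
};

static void
ring_sketch_init(struct ring_sketch *r, uint32_t size)
{
	assert(size != 0 && (size & (size - 1)) == 0);
	atomic_init(&r->prod_head, 0);
	atomic_init(&r->prod_tail, 0);
	r->prod_mask = size - 1;
	r->prod_bufs = 0;
	r->prod_bytes = 0;
}

/*
 * Claim a slot index, update the statistics, then publish the new tail.
 * Returns the claimed index. Assumption: the ring is never full.
 */
static uint32_t
ring_sketch_enqueue(struct ring_sketch *r, uint64_t nbytes)
{
	uint32_t head, next;

	/* Claim a slot by advancing prod_head with a CAS loop. */
	head = atomic_load_explicit(&r->prod_head, memory_order_relaxed);
	do {
		next = (head + 1) & r->prod_mask;
	} while (!atomic_compare_exchange_weak_explicit(&r->prod_head,
	    &head, next, memory_order_acquire, memory_order_relaxed));

	/*
	 * Wait for earlier producers. The acquire load pairs with their
	 * release store below, so their statistics updates are visible
	 * to us before we touch the same fields.
	 */
	while (atomic_load_explicit(&r->prod_tail,
	    memory_order_acquire) != head)
		;	/* the kernel code calls cpu_spinwait() here */

	/* We are the exclusive owner of the statistics at this point. */
	r->prod_bufs++;
	r->prod_bytes += nbytes;

	/*
	 * Release store: the two writes above cannot be reordered after
	 * this store, which is the guarantee that `volatile' alone does
	 * not give.
	 */
	atomic_store_explicit(&r->prod_tail, next, memory_order_release);
	return (head);
}
```

With a plain or relaxed store in place of the final release store, the compiler or CPU would be free to make the new tail visible before the counter updates, which is exactly the reordering described above.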