From: Arnaud Lacombe
To: "K. Macy"
Cc: FreeBSD Hackers
Date: Wed, 21 Sep 2011 18:43:45 -0400
Subject: Re: buf_ring(9) API precisions

Hi,

On Mon, Sep 19, 2011 at 8:46 AM, K. Macy wrote:
> If the value lags next by one then it is ours. This rule applies to
> all callers so the rule holds consistently.
>

I think you do not understand what I mean, which is that the following:

        while (br->br_prod_tail != prod_head)
                cpu_spinwait();
        br->br_prod_bufs++;
        br->br_prod_bytes += nbytes;
        br->br_prod_tail = prod_next;
        critical_exit();

at runtime, can be seen, memory-wise, as:

        while (br->br_prod_tail != prod_head)
                cpu_spinwait();
        br->br_prod_tail = prod_next;
        br->br_prod_bufs++;
        br->br_prod_bytes += nbytes;
        critical_exit();

That is, there is no memory barrier to enforce completion of the
load/increment/store/load/load/addition/store operations before
updating what other threads spin on. Yes, `br_prod_tail' is marked
`volatile', but there is no guarantee that it will not be re-ordered
wrt. non-volatile writes (to `br_prod_bufs' and `br_prod_bytes').

 - Arnaud

> On Mon, Sep 19, 2011 at 5:53 AM, Arnaud Lacombe wrote:
>> Hi,
>>
>> On Fri, Sep 16, 2011 at 10:41 AM, K. Macy wrote:
>>> On Fri, Sep 16, 2011 at 3:02 AM, Arnaud Lacombe wrote:
>>>> Hi,
>>>>
>>>> On Wed, Sep 14, 2011 at 10:53 PM, Arnaud Lacombe wrote:
>>>>> Hi Kip,
>>>>>
>>>>> I've got a few question about the buf_ring(9) API.
>>>>>
>>>>> 1) what means the 'drbr_' prefix. I can guess the two last letter, 'b'
>>>>> and 'r', for Buffer Ring, but what about 'd' and 'r' ?
>>>>>
>>>>> 2) in `sys/sys/buf_ring.h', you defined 'struct buf_ring' as:
>>>>>
>>>>> struct buf_ring {
>>>>>        volatile uint32_t       br_prod_head;
>>>>>        volatile uint32_t       br_prod_tail;
>>>>>        int                     br_prod_size;
>>>>>        int                     br_prod_mask;
>>>>>        uint64_t                br_drops;
>>>>>        uint64_t                br_prod_bufs;
>>>>>        uint64_t                br_prod_bytes;
>>>> shouldn't those 3 fields be updated atomically, especially on 32bits
>>>> platforms ?
>>>> That might pose a problem as, AFAIK, FreeBSD do not have
>>>> MI 64bits atomics operations...
>>>
>>> Between the point at which br_prod_tail == prod_head and when we
>>> update br_prod_tail to point to prod_next we are the exclusive owners
>>> of the fields in buf_ring. That is why we wait for any other
>>> enqueueing threads to update br_prod_tail to point to prod_head before
>>> continuing.
>>>
>> How do you enforce ordering ? I do not see anything particular
>> forbidding the `br->br_prod_tail' to be committed first, leading other
>> thread to believe they have access to the statistics, while the other
>> thread has not yet committed its change.
>>
>> Thanks,
>>  - Arnaud
>>
>>> Cheers
>>>
>>>        /*
>>>         * If there are other enqueues in progress
>>>         * that preceeded us, we need to wait for them
>>>         * to complete
>>>         */
>>>        while (br->br_prod_tail != prod_head)
>>>                cpu_spinwait();
>>>        br->br_prod_bufs++;
>>>        br->br_prod_bytes += nbytes;
>>>        br->br_prod_tail = prod_next;
>>>        critical_exit();
>>>
>>
>
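
For illustration only, the ordering concern discussed in this message can be sketched in portable C11. This is not the buf_ring(9) code: `struct ring_sketch', `ring_sketch_init' and `ring_sketch_publish' are hypothetical names, the structure is a simplified stand-in for the producer side of `struct buf_ring', and C11 <stdatomic.h> is assumed in place of the kernel's own primitives. The idea is that the statistics stay plain fields, while the tail is read with acquire semantics and written with release semantics, so the statistics stores cannot become visible after the tail update.

```c
#include <stdatomic.h>
#include <stdint.h>

/* Hypothetical, simplified stand-in for the producer side of struct buf_ring. */
struct ring_sketch {
	_Atomic uint32_t prod_tail;	/* what other producers spin on */
	uint64_t	 prod_bufs;	/* statistics: plain, non-atomic fields */
	uint64_t	 prod_bytes;
};

static void
ring_sketch_init(struct ring_sketch *r)
{
	atomic_init(&r->prod_tail, 0);
	r->prod_bufs = 0;
	r->prod_bytes = 0;
}

/*
 * Finish an enqueue whose reservation starts at prod_head and ends at
 * prod_next.  Reserving the slots (the prod_head update) is left out.
 */
static void
ring_sketch_publish(struct ring_sketch *r, uint32_t prod_head,
    uint32_t prod_next, uint64_t nbytes)
{
	/*
	 * Acquire load: once the tail equals our head, the previous
	 * owner's statistics updates are visible to us.
	 */
	while (atomic_load_explicit(&r->prod_tail,
	    memory_order_acquire) != prod_head)
		;	/* the kernel code calls cpu_spinwait() here */

	r->prod_bufs++;
	r->prod_bytes += nbytes;

	/*
	 * Release store: the two statistics stores above cannot be
	 * reordered after the store that hands the ring to the next
	 * producer.
	 */
	atomic_store_explicit(&r->prod_tail, prod_next, memory_order_release);
}
```

With this pairing, the 64-bit counters themselves would not need to be atomic for producers, because only the thread that currently owns the tail touches them; that argument is an assumption of the sketch and does not cover readers that sample the counters without owning the tail.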