From owner-freebsd-net@FreeBSD.ORG  Tue Jul  2 21:15:56 2013
Return-Path: <owner-freebsd-net@FreeBSD.ORG>
Delivered-To: freebsd-net@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [8.8.178.115])
 by hub.freebsd.org (Postfix) with ESMTP id 5CCB0600
 for <freebsd-net@freebsd.org>; Tue,  2 Jul 2013 21:15:56 +0000 (UTC)
 (envelope-from rizzo.unipi@gmail.com)
Received: from mail-lb0-x22a.google.com (mail-lb0-x22a.google.com
 [IPv6:2a00:1450:4010:c04::22a])
 by mx1.freebsd.org (Postfix) with ESMTP id BF54D175D
 for <freebsd-net@freebsd.org>; Tue,  2 Jul 2013 21:15:55 +0000 (UTC)
Received: by mail-lb0-f170.google.com with SMTP id t13so3821882lbd.15
 for <freebsd-net@freebsd.org>; Tue, 02 Jul 2013 14:15:54 -0700 (PDT)
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20120113;
 h=mime-version:sender:in-reply-to:references:date
 :x-google-sender-auth:message-id:subject:from:to:cc:content-type;
 bh=rO8IOQfm+5fYF4pfVHUZhKSf6/3nDEKXpUITxYc57ok=;
 b=HJxO8MCsZTdl3qa9xfCGykiFa++lVAlkYqRChOcqNqA/8k53kzq4nnRnM5hfQiwNbK
 9DP1WT1WbwDKz18Rzo7oNG1YpP3kC6f0UFZe5AwNiwvsalZp0RUtSTKn+v332dbMxXVH
 9ReBiY1ctf5+JHRFBo+TQgzbTyksKoIkqHHwxfW6ikuzF8vSi7KQkKQbZbBP5qSsRQml
 9q/Lw+/32Z2n3uOeogaZUyHgeR/ccR7KCYXq3CCrp6BGZz/V/C6brS7d6g9AwzhqpzjK
 rEca7aJjaW1+ch99ygLt1eq1EYfPb0aMm7ZPzoC1m5VEtOPpigCxVpQoO4VK1HtmgXKh
 RW6g==
MIME-Version: 1.0
X-Received: by 10.112.156.42 with SMTP id wb10mr14821709lbb.62.1372799754532; 
 Tue, 02 Jul 2013 14:15:54 -0700 (PDT)
Sender: rizzo.unipi@gmail.com
Received: by 10.114.200.15 with HTTP; Tue, 2 Jul 2013 14:15:54 -0700 (PDT)
In-Reply-To: <CALG4x-UYBFsMttpZx1-c_wtVf5MST8+_t1psY2HQskTiOZDFLA@mail.gmail.com>
References: <CALG4x-V-OLoqMXQarSNy5Lv3kNVu01AiN4A49Nv7t-Ysfr1DBg@mail.gmail.com>
 <CA+hQ2+gwW6FOQS79xmWVLSWWHrZMFnhaUM98Kp6aDVaUePNfTA@mail.gmail.com>
 <CALG4x-UYBFsMttpZx1-c_wtVf5MST8+_t1psY2HQskTiOZDFLA@mail.gmail.com>
Date: Tue, 2 Jul 2013 23:15:54 +0200
X-Google-Sender-Auth: a9OytJFYPwGh1t9rd6WP0wtuXd8
Message-ID: <CA+hQ2+i_qu7RouPW+ihfb5nL_1SQWMpFxTpoHfhaCvhtS8-EHQ@mail.gmail.com>
Subject: Re: Multiqueue support for bpf
From: Luigi Rizzo <rizzo@iet.unipi.it>
To: Takuya ASADA <syuu@dokukino.com>
Content-Type: text/plain; charset=ISO-8859-1
X-Content-Filtered-By: Mailman/MimeDel 2.1.14
Cc: FreeBSD Net <freebsd-net@freebsd.org>
X-BeenThere: freebsd-net@freebsd.org
X-Mailman-Version: 2.1.14
Precedence: list
List-Id: Networking and TCP/IP with FreeBSD <freebsd-net.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/options/freebsd-net>,
 <mailto:freebsd-net-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-net>
List-Post: <mailto:freebsd-net@freebsd.org>
List-Help: <mailto:freebsd-net-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-net>,
 <mailto:freebsd-net-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Tue, 02 Jul 2013 21:15:56 -0000

On Tue, Jul 2, 2013 at 5:56 PM, Takuya ASADA <syuu@dokukino.com> wrote:

> Hi,
>
>  Do you have an updated URL for the diffs ? The link below from your
>> original message
>> seems not working now (NXDOMAIN)
>>
>> http://www.dokukino.com/mq_bpf_20110813.diff
>>
>
> Changes with recent head is on my repository:
> http://svnweb.freebsd.org/base/user/syuu/mq_bpf/
> And I attached a diff file on this mail.
>
>
thanks for the diffs (the URL to the repo is useful too,
but a URL to generate diffs is more convenient for reviewing changes).

I believe it still needs a bit of work before being merged.

My comments (in order of the patch):

=== ifnet.9 (and related code in if.c, sockio.h) ===
 - if_get_rxqueue_len()/if_get_rxqueue_len() is not a good name,
  as to me at least it suggests that it returns the size of the
  individual queue, rather than the number of queues.

- cpu affinity (in userspace) is a bitmask, whereas in the BSD kernel
  we almost never use the term "affinity", and favour "couid" or "oncpu"
  (i.e. an individual CPU id).
  I think you should either rename if_get_txqueue_affinity(), or make
  the return type a cpuset (which seems more sensible as the return
  value is passed to userspace)

=== bpf.4 (and related code) ===

- the ioctl() to attach/detach/report queues attached to a specific
  bpf descriptor talk about "mask bit" but neither the internal nor
  the external implementation deal with bits.
  I'd rather document those ioctl as "attaching queue to file descriptor".

- the BPF ioctl names are generally inconsistent (using either S or SET
  and G or GET for the setter and getter methods).
  But you should pick one of the patterns and stick with it,
  not introduce a third variant (GT/ST).
  Given we are in 2013 we might go for the long form GET and SET
  so i suggest the following (spaces for clarity)

+#define BIOC ENA QMASK _IO('B', 133)
+#define BIOC DIS QMASK _IO('B', 134)
+#define BIOC SET RXQMASK _IOWR('B', 135, uint32_t)
+#define BIOC CLR RXQMASK _IOWR('B', 136, uint32_t)
+#define BIOC GET RXQMASK _IOR('B', 137, uint32_t)
+#define BIOC SET TXQMASK _IOWR('B', 138, uint32_t)
+#define BIOC CLR TXQMASK _IOWR('B', 139, uint32_t)
+#define BIOC GET TXQMASK _IOR('B', 140, uint32_t)
+#define BIOC SET OTHERMASK _IO('B', 141)
+#define BIOC CLR OTHERMASK _IO('B', 142)
+#define BIOC GET OTHERMASK _IOR('B', 143, uint32_t)

  Also related: the existing ioctls() use u_int as argumnts, rather
  than uint32_t. I personally prefer the uint32_t form, but you
  should at least add a comment to indicate that the choice is
  deliberate.

=== if.c ===


- you have a KASSERT to report if ifp->if_get_*xqueue_affinity() is not
  set, but i'd rather run the function only if is set, so you can
  have a multiqueue interface which does not bind queues to specific cores
  (which i am not sure is always a great idea; too many processes
  statically bound to the same queue mean you lose opportunity to
  parallelize work.)

=== mbuf.h ===

as mentioned earlier, the modification to struct mbuf should
be avoided if possible at all. It seems that you need just one
direction bit (which maybe is available already from the context)
and one queue identifier, which in the rx path, at least in your
implementation is always a replica of the 'flowid' field.
Can you see if perhaps the flowid field can be (ab)used on the
tx path as well ?


=== if.h ===

- in if.h, can you use individual variables instead of arrays
  for  ifr_queue_affinity_index and friends ?
  The macros to map the fields of ifr_ifru one
  level up are a necessary evil,
  but there is no point in using the arrays.

  - SIOCGIFTXQAFFINITY seems to use the receive function (copy&paste typo)
   talks about
  Also, this function is probably something that should be coordinated
  with work on generic multiqueue support


=== bpf.c ===

- in linux (and hopefully in FreeBSD at some point) the number of queues
  can be changed at runtime.
  So i suggest that you cache the current number of queues when
  you allocate the arrays (qm_*xq_qmask[] ) rather than invoking
  ifp->if_get_*xqueue_len() everytime you need to do a boundary check.
  This will save us from all sort of problems later.

- in terms of code, the six BIOC*XQMASK are very similar, you are probably
  better off having one single case in the switch

- can you some comments in the code for the chunk at @@ -2117,6 +2391,42 @@
  I do not completely understand why you are returning if the *queue tag
  in the mbuf is out of range (my impression is that you should
  just continue, or if you think the packet is incorrect it should
  be filtered out before entering the LIST_FOREACH() ).
  Secondly, you should use the cached value of *queue_len


cheers
luigi

-- 
-----------------------------------------+-------------------------------
 Prof. Luigi RIZZO, rizzo@iet.unipi.it  . Dip. di Ing. dell'Informazione
 http://www.iet.unipi.it/~luigi/        . Universita` di Pisa
 TEL      +39-050-2211611               . via Diotisalvi 2
 Mobile   +39-338-6809875               . 56122 PISA (Italy)

 -----------------------------------------+-------------------------------