Date: Sun, 29 Jun 2014 22:17:21 +0200
From: Luigi Rizzo <rizzo@iet.unipi.it>
To: Adrian Chadd <adrian@freebsd.org>
Cc: Wojciech Puchar <wojtek@wojtek.tensor.gdynia.pl>, "freebsd-hackers@freebsd.org" <freebsd-hackers@freebsd.org>
Subject: Re: ipfw pipe config bw tun0
Message-ID: <CA%2BhQ2%2BgtRK1tAGMiyr6KU0F8S=Sm5J4Djc-uTaVdinHAVp9xXw@mail.gmail.com>
On Sun, Jun 29, 2014 at 6:29 PM, Adrian Chadd <adrian@freebsd.org> wrote:
> We can start adding that. How should it behave for multi-queue devices?

Long reply, sorry about that:

If I remember well, this feature was implemented assuming that at most one packet was outstanding, so multiqueue was not really an issue: any time you get a completion interrupt from any queue, you push out the next packet from the pipe.

The goal was to provide weighted fair queueing using the actual NIC's bandwidth to clock packets out. Remember, this was done in '99 and on hardware that did not have queues or interrupt moderation.

These days, between deep NIC queues, interrupt moderation, multiqueue and very high bandwidths, the assumption of one outstanding packet is a bad one for performance. You'd also have the option to tie a pipe to an individual queue or to the entire NIC (the user API change to do this is trivial, e.g. you can append a :queue_number to the interface name as I did in netmap).

This said:

1. If you don't mind the fact that the interface has a deep queue, you could just push packets from a pipe to an interface until if_transmit returns an error (make sure the packet is not lost by adding a reference to the mbuf or something), and then any interrupt completion from any queue would be used to 'clock' packets out.

2. If the NIC's queue bothers you (it might, because it adds an equivalent error to the nice properties of the scheduler), then the pipe could try to track how many bytes are queued, stop after a given threshold, and then, when an interrupt completion is received, decrease the 'outstanding' counter by the actual number of bytes sent. Essentially, what ALTQ does, but with the classification flexibility of ipfw/dummynet.

Surely, keeping only one outstanding packet is too expensive and would kill throughput. But a modern interface with 256..1024 buffers of 1.5K each can hold up to 3..12 Mbit of queued data, which is way too much.
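As a rough illustration of option 2 above, the byte accounting could look something like the sketch below. This is not dummynet code: struct pipe_tx_state and the functions pipe_may_send, pipe_sent, pipe_tx_complete and ring_backlog_bits are hypothetical names made up for this example, and the hook that would deliver completion notifications from the driver is assumed rather than shown. The last helper just redoes the 256..1024 buffer arithmetic.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical per-pipe state; these names do not exist in dummynet. */
struct pipe_tx_state {
	size_t outstanding;	/* bytes handed to the NIC, not yet completed */
	size_t threshold;	/* stop feeding the NIC above this many bytes */
};

/* May a packet of pkt_len bytes be handed to the NIC right now? */
static bool
pipe_may_send(const struct pipe_tx_state *st, size_t pkt_len)
{
	/* Always allow one packet when idle, so an oversized packet cannot stall. */
	if (st->outstanding == 0)
		return (true);
	return (st->outstanding + pkt_len <= st->threshold);
}

/* Account for a packet that was successfully handed to the NIC. */
static void
pipe_sent(struct pipe_tx_state *st, size_t pkt_len)
{
	st->outstanding += pkt_len;
}

/* Assumed to be called from a completion interrupt on any queue. */
static void
pipe_tx_complete(struct pipe_tx_state *st, size_t bytes_done)
{
	assert(bytes_done <= st->outstanding);
	st->outstanding -= bytes_done;
}

/* Worst-case backlog, in bits, of a ring with nbufs full buffers. */
static unsigned long
ring_backlog_bits(unsigned long nbufs, unsigned long buf_bytes)
{
	return (nbufs * buf_bytes * 8UL);
}
```

With a threshold of 3000 bytes, two 1500-byte packets go out and the third waits until a completion brings the counter back down. For the ring sizes mentioned, ring_backlog_bits(256, 1500) is 3072000 and ring_backlog_bits(1024, 1500) is 12288000, i.e. roughly 3 and 12 Mbit.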
If we want to (re)implement this feature, we should first introduce some way to control the outstanding traffic on an interface. This can be done in dummynet as in #2 above, or within the NIC's driver if we eventually build something like ethtool/bql.

cheers
luigi