Date:      Tue, 10 Jun 2014 11:56:26 -0700
From:      John-Mark Gurney <jmg@funkthat.com>
To:        "Alexander V. Chernikov" <melifaro@freebsd.org>
Cc:        Bryan Venteicher <bryanv@daemoninthecloset.org>, current@freebsd.org, net@freebsd.org
Subject:   Re: dhclient sucks cpu usage...
Message-ID:  <20140610185626.GK31367@funkthat.com>
In-Reply-To: <5397415B.5070409@FreeBSD.org>
References:  <20140610000246.GW31367@funkthat.com> <100488220.4292.1402369436876.JavaMail.root@daemoninthecloset.org> <5396CD41.2080300@FreeBSD.org> <20140610162443.GD31367@funkthat.com> <5397415B.5070409@FreeBSD.org>

Alexander V. Chernikov wrote this message on Tue, Jun 10, 2014 at 21:33 +0400:
> On 10.06.2014 20:24, John-Mark Gurney wrote:
> >Alexander V. Chernikov wrote this message on Tue, Jun 10, 2014 at 13:17 +0400:
> >>On 10.06.2014 07:03, Bryan Venteicher wrote:
> >>>Hi,
> >>>
> >>>----- Original Message -----
> >>>>So, after finding out that nc has a stupidly small buffer size (2k
> >>>>even though there is space for 16k), I was still not getting as good
> >>>>as performance using nc between machines, so I decided to generate some
> >>>>flame graphs to try to identify issues...  (Thanks to who included a
> >>>>full set of modules, including dtraceall on memstick!)
> >>>>
> >>>>So, the first one is:
> >>>>https://www.funkthat.com/~jmg/em.stack.svg
> >>>>
> >>>>As I was browsing around, the em_handle_que was consuming quite a bit
> >>>>of cpu usage for only doing ~50MB/sec over gige..  Running top -SH shows
> >>>>me that the taskqueue for em was consuming about 50% cpu...  Also pretty
> >>>>high for only 50MB/sec...  Looking closer, you'll see that bpf_mtap is
> >>>>consuming ~3.18% (under ether_nh_input)..  I know I'm not running 
> >>>>tcpdump
> >>>>or anything, but I think dhclient uses bpf to be able to inject packets
> >>>>and listen in on them, so I kill off dhclient, and instantly, the
> >>>>taskqueue
> >>>>thread for em drops down to 40% CPU... (transfer rate only marginally
> >>>>improves, if it does)
> >>>>
> >>>>I decide to run another flame graph w/o dhclient running:
> >>>>https://www.funkthat.com/~jmg/em.stack.nodhclient.svg
> >>>>
> >>>>and now _rxeof drops from 17.22% to 11.94%, pretty significant...
> >>>>
> >>>>So, if you care about performance, don't run dhclient...
> >>>>
> >>>Yes, I've noticed the same issue. It can absolutely kill performance
> >>>in a VM guest. It is much more pronounced on only some of my systems,
> >>>and I hadn't tracked it down yet. I wonder if this is fallout from
> >>>the callout work, or if there was some bpf change.
> >>>
> >>>I've been using the kludgey workaround patch below.
> >>Hm, pretty interesting.
> >>dhclient should setup proper filter (and it looks like it does so:
> >>13:10 [0] m@ptichko s netstat -B
> >>   Pid  Netif   Flags      Recv      Drop     Match Sblen Hblen Command
> >>  1224    em0 -ifs--l  41225922         0        11     0     0 dhclient
> >>)
> >>see "match" count.
> >>And BPF itself adds the cost of a read rwlock (plus bpf_filter() calls for
> >>each consumer on the interface).
> >>It should not introduce significant performance penalties.
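
For reference, the read filter dhclient installs is roughly the classic
DHCP one sketched below.  This is from memory, assumes plain Ethernet +
IPv4, and hardcodes client port 68 purely for illustration, so treat it as
an approximation rather than the exact code in the tree.  It needs
<sys/types.h>, <sys/ioctl.h>, <net/bpf.h>, <net/ethernet.h>,
<netinet/in.h> and <err.h>; rfd stands in for the read descriptor:

	/* accept only non-fragmented UDP packets sent to the DHCP client
	 * port; everything else gets a match length of 0 */
	struct bpf_insn dhcp_read_filter[] = {
		/* Ethernet type must be IPv4 */
		BPF_STMT(BPF_LD + BPF_H + BPF_ABS, 12),
		BPF_JUMP(BPF_JMP + BPF_JEQ + BPF_K, ETHERTYPE_IP, 0, 8),
		/* IP protocol must be UDP */
		BPF_STMT(BPF_LD + BPF_B + BPF_ABS, 23),
		BPF_JUMP(BPF_JMP + BPF_JEQ + BPF_K, IPPROTO_UDP, 0, 6),
		/* drop fragments (non-zero fragment offset) */
		BPF_STMT(BPF_LD + BPF_H + BPF_ABS, 20),
		BPF_JUMP(BPF_JMP + BPF_JSET + BPF_K, 0x1fff, 4, 0),
		/* X = IP header length, then load the UDP destination port */
		BPF_STMT(BPF_LDX + BPF_B + BPF_MSH, 14),
		BPF_STMT(BPF_LD + BPF_H + BPF_IND, 16),
		BPF_JUMP(BPF_JMP + BPF_JEQ + BPF_K, 68, 0, 1),
		/* matched: pass the whole packet up */
		BPF_STMT(BPF_RET + BPF_K, (u_int)-1),
		/* no match: don't queue it on this descriptor */
		BPF_STMT(BPF_RET + BPF_K, 0),
	};
	struct bpf_program rp = {
		sizeof(dhcp_read_filter) / sizeof(dhcp_read_filter[0]),
		dhcp_read_filter
	};

	if (ioctl(rfd, BIOCSETF, &rp) == -1)
		err(1, "BIOCSETF");

That match-or-return-0 structure is why the "Match" column above can stay
at 11 even with ~41M packets received.
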
> >Don't forget that it has to process the returning ack's... So, you're
> Well, it can still be captured with a proper filter like "ip && udp &&
> port 67 or port 68".
> We're using tcpdump at high packet rates (>1M) and it does not
> influence the process _much_.
> We should probably convert its rwlock to an rmlock and use per-CPU counters
> for statistics, but that's a different story.
> >looking around 10k+ pps that you have to handle and pass through the
> >filter...  That's a lot of packets to process...
> >
> >Just for a bit more "double check", instead of using the HD as a
> >source, I used /dev/zero...   I ran a netstat -w 1 -I em0 when
> >running the test, and I was getting ~50.7MiB/s w/ dhclient running and
> >then I killed dhclient and it instantly jumped up to ~57.1MiB/s.. So I
> >launched dhclient again, and it dropped back to ~50MiB/s...
> dhclient uses different BPF sockets for reading and writing (and it
> moves the write socket to a privileged child process via fork()).
> The problem we're facing is that dhclient does not set
> _any_ read filter on the write socket:
> 21:27 [0] zfscurr0# netstat -B
>   Pid  Netif   Flags      Recv      Drop     Match Sblen Hblen Command
>  1529    em0 --fs--l     86774     86769     86784  4044  3180 dhclient
> --------------------------------------- ^^^^^ --------------------------
>  1526    em0 -ifs--l     86789         0         1     0     0 dhclient
> 
> so all traffic is pushed down, introducing contention on the BPF descriptor
> mutex.
> 
> (That's why I've asked for netstat -B output.)
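
I haven't dug into the attached patch itself, but conceptually the cheap
fix is just to install a match-nothing read filter on the write
descriptor, something like the sketch below (wfd is a placeholder for the
write-only BPF fd; needs <sys/ioctl.h>, <net/bpf.h> and <err.h>):

	/* one-instruction program: always return a match length of 0, so
	 * nothing seen on the wire is ever queued (or copied) on this
	 * descriptor */
	static struct bpf_insn no_match[] = {
		BPF_STMT(BPF_RET + BPF_K, 0),
	};
	struct bpf_program nf = { 1, no_match };

	if (ioctl(wfd, BIOCSETF, &nf) == -1)
		err(1, "BIOCSETF");

With something like that in place, the write socket's "Match" column
should stay near zero instead of counting every packet on the wire, which
is consistent with the post-patch netstat -B output further down.
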
> 
> Please try the attached patch to fix this. This is not the right way to
> fix it; we'd be better off changing BPF behavior so that write-only
> consumers are not attached to the interface's readers.
> This has been partially implemented as the net.bpf.optimize_writers hack,
> but it does not work for all direct BPF consumers (those that are not
> using the pcap(3) API).
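
For what it's worth, that knob can be checked programmatically as well;
a minimal sketch, assuming only the stock sysctl name:

	#include <sys/types.h>
	#include <sys/sysctl.h>
	#include <err.h>
	#include <stdio.h>

	int
	main(void)
	{
		int val;
		size_t len = sizeof(val);

		/* read the current value of the net.bpf.optimize_writers toggle */
		if (sysctlbyname("net.bpf.optimize_writers", &val, &len, NULL, 0) == -1)
			err(1, "sysctlbyname");
		printf("net.bpf.optimize_writers: %d\n", val);
		return (0);
	}

(Flipping it is just a sysctl(8) one-liner, but as noted above it doesn't
cover direct BPF consumers such as dhclient.)
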

Ok, looks like this patch helps the issue...

netstat -B; sleep 5; netstat -B:
  Pid  Netif   Flags      Recv      Drop     Match Sblen Hblen Command
  958    em0 --fs--l   3880000        14        35  3868  2236 dhclient
  976    em0 -ifs--l   3880014         0         1     0     0 dhclient
  Pid  Netif   Flags      Recv      Drop     Match Sblen Hblen Command
  958    em0 --fs--l   4178525        14        35  3868  2236 dhclient
  976    em0 -ifs--l   4178539         0         1     0     0 dhclient

and now the rate only drops from ~66MiB/s to ~63MiB/s when dhclient is
running...  Still a significant drop (5%), but better than before...

-- 
  John-Mark Gurney				Voice: +1 415 225 5579

     "All that I will do, has been done, All that I have, has not."
