Date: Tue, 10 Jun 2014 11:56:26 -0700
From: John-Mark Gurney <jmg@funkthat.com>
To: "Alexander V. Chernikov" <melifaro@freebsd.org>
Cc: Bryan Venteicher <bryanv@daemoninthecloset.org>, current@freebsd.org, net@freebsd.org
Subject: Re: dhclient sucks cpu usage...
Message-ID: <20140610185626.GK31367@funkthat.com>
In-Reply-To: <5397415B.5070409@FreeBSD.org>
References: <20140610000246.GW31367@funkthat.com>
 <100488220.4292.1402369436876.JavaMail.root@daemoninthecloset.org>
 <5396CD41.2080300@FreeBSD.org> <20140610162443.GD31367@funkthat.com>
 <5397415B.5070409@FreeBSD.org>
Alexander V. Chernikov wrote this message on Tue, Jun 10, 2014 at 21:33 +0400:
> On 10.06.2014 20:24, John-Mark Gurney wrote:
> >Alexander V. Chernikov wrote this message on Tue, Jun 10, 2014 at 13:17 +0400:
> >>On 10.06.2014 07:03, Bryan Venteicher wrote:
> >>>Hi,
> >>>
> >>>----- Original Message -----
> >>>>So, after finding out that nc has a stupidly small buffer size (2k
> >>>>even though there is space for 16k), I was still not getting as good
> >>>>performance using nc between machines, so I decided to generate some
> >>>>flame graphs to try to identify issues...  (Thanks to whoever included a
> >>>>full set of modules, including dtraceall, on the memstick!)
> >>>>
> >>>>So, the first one is:
> >>>>https://www.funkthat.com/~jmg/em.stack.svg
> >>>>
> >>>>As I was browsing around, em_handle_que was consuming quite a bit of
> >>>>cpu usage for only doing ~50MB/sec over gige...  Running top -SH shows
> >>>>me that the taskqueue for em was consuming about 50% cpu...  Also pretty
> >>>>high for only 50MB/sec...  Looking closer, you'll see that bpf_mtap is
> >>>>consuming ~3.18% (under ether_nh_input)...  I know I'm not running
> >>>>tcpdump or anything, but I think dhclient uses bpf to be able to inject
> >>>>packets and listen in on them, so I killed off dhclient, and instantly
> >>>>the taskqueue thread for em dropped down to 40% CPU...  (the transfer
> >>>>rate only marginally improves, if it does)
> >>>>
> >>>>I decided to run another flame graph w/o dhclient running:
> >>>>https://www.funkthat.com/~jmg/em.stack.nodhclient.svg
> >>>>
> >>>>and now _rxeof drops from 17.22% to 11.94%, pretty significant...
> >>>>
> >>>>So, if you care about performance, don't run dhclient...
> >>>>
> >>>Yes, I've noticed the same issue. It can absolutely kill performance
> >>>in a VM guest. It is much more pronounced on only some of my systems,
> >>>and I hadn't tracked it down yet. I wonder if this is fallout from
> >>>the callout work, or if there was some bpf change.
> >>>
> >>>I've been using the kludgey workaround patch below.
> >>Hm, pretty interesting.
> >>dhclient should set up a proper filter (and it looks like it does so:
> >>13:10 [0] m@ptichko s netstat -B
> >>  Pid  Netif   Flags      Recv      Drop     Match Sblen Hblen Command
> >> 1224    em0 -ifs--l  41225922         0        11     0     0 dhclient
> >>)
> >>see the "Match" count.
> >>And BPF itself adds the cost of a read rwlock (+ bpf_filter() calls for
> >>each consumer on the interface).
> >>It should not introduce significant performance penalties.
> >Don't forget that it has to process the returning ack's...  So, you're
> Well, it can still be captured with a proper filter like "ip && udp &&
> port 67 or port 68".
> We're using tcpdump at high packet rates (>1M) and it does not
> influence the process _much_.
> We should probably convert its rwlock to an rmlock and use per-cpu
> counters for statistics, but that's a different story.
> >looking at around 10k+ pps that you have to handle and pass through the
> >filter...  That's a lot of packets to process...
> >
> >Just for a bit more "double check", instead of using the HD as a
> >source, I used /dev/zero...  I ran netstat -w 1 -I em0 while
> >running the test, and I was getting ~50.7MiB/s w/ dhclient running;
> >then I killed dhclient and it instantly jumped up to ~57.1MiB/s...  So I
> >launched dhclient again, and it dropped back to ~50MiB/s...
> dhclient uses different BPF sockets for reading and writing (and it
> moves the write socket to a privileged child process via fork()).
> The problem we're facing is the fact that dhclient does not set
> _any_ read filter on the write socket:
> 21:27 [0] zfscurr0# netstat -B
>   Pid  Netif   Flags      Recv      Drop     Match Sblen Hblen Command
>  1529    em0 --fs--l     86774     86769     86784  4044  3180 dhclient
> -------------------------------------------- ^^^^^ --------------------
>  1526    em0 -ifs--l     86789         0         1     0     0 dhclient
>
> so all traffic is pushed down, introducing contention on the BPF
> descriptor mutex.
>
> (That's why I've asked for the netstat -B output.)
>
> Please try the attached patch to fix this. It is not the right way to
> fix this; we'd better change BPF behavior not to attach to interface
> readers for write-only consumers.
> This has been partially implemented as the net.bpf.optimize_writers
> hack, but it does not work for all direct BPF consumers (those not
> using the pcap(3) API).
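The attached patch itself is not reproduced in the archive, so here is a
minimal sketch, for illustration only, of the idea Alexander describes: a
write-only BPF consumer binds its descriptor to the interface and installs
a read filter that accepts zero bytes of every packet, so nothing is ever
matched or queued on that descriptor. The helper name and error handling
are invented for the example; this is neither dhclient's code nor the
patch from this thread.

/*
 * Illustrative sketch: a write-only BPF descriptor with a
 * reject-everything read filter, so received packets are never
 * matched or buffered for it.
 */
#include <sys/types.h>
#include <sys/ioctl.h>
#include <sys/socket.h>
#include <net/if.h>
#include <net/bpf.h>
#include <err.h>
#include <fcntl.h>
#include <string.h>

/* A single "return 0" instruction: capture zero bytes of every packet. */
static struct bpf_insn reject_all[] = {
    BPF_STMT(BPF_RET + BPF_K, 0)
};

int
open_write_only_bpf(const char *ifname)    /* name made up for the example */
{
    struct bpf_program prog;
    struct ifreq ifr;
    int fd;

    if ((fd = open("/dev/bpf", O_WRONLY)) == -1)
        err(1, "open(/dev/bpf)");

    /* Bind the descriptor to the interface we will inject on. */
    memset(&ifr, 0, sizeof(ifr));
    strlcpy(ifr.ifr_name, ifname, sizeof(ifr.ifr_name));
    if (ioctl(fd, BIOCSETIF, &ifr) == -1)
        err(1, "BIOCSETIF");

    /* Install the reject-everything read filter. */
    prog.bf_len = sizeof(reject_all) / sizeof(reject_all[0]);
    prog.bf_insns = reject_all;
    if (ioctl(fd, BIOCSETF, &prog) == -1)
        err(1, "BIOCSETF");

    return (fd);
}

Whether each consumer should do this itself, or bpf(4) should stop
attaching write-only descriptors to the interface's reader list as the
net.bpf.optimize_writers sysctl attempts, is the open question above.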
Ok, looks like this patch helps the issue...

netstat -B; sleep 5; netstat -B:
  Pid  Netif   Flags      Recv      Drop     Match Sblen Hblen Command
  958    em0 --fs--l   3880000        14        35  3868  2236 dhclient
  976    em0 -ifs--l   3880014         0         1     0     0 dhclient
  Pid  Netif   Flags      Recv      Drop     Match Sblen Hblen Command
  958    em0 --fs--l   4178525        14        35  3868  2236 dhclient
  976    em0 -ifs--l   4178539         0         1     0     0 dhclient

and now the rate only drops from ~66MiB/s to ~63MiB/s when dhclient is
running...  Still a significant drop (~5%), but better than before...
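As an aside, the DHCP-only read filter Alexander mentions above, set
through pcap(3), would look roughly like the sketch below. This is
illustration only: "em0" is just an example interface, and dhclient is a
direct BPF consumer that programs /dev/bpf itself rather than going
through pcap(3).

#include <pcap/pcap.h>
#include <stdio.h>

int
main(void)
{
    char errbuf[PCAP_ERRBUF_SIZE];
    struct bpf_program prog;
    pcap_t *p;

    /* "em0" is only an example interface name. */
    p = pcap_open_live("em0", 1518, 0, 100, errbuf);
    if (p == NULL) {
        fprintf(stderr, "pcap_open_live: %s\n", errbuf);
        return (1);
    }

    /* Same idea as the "ip && udp && port 67 or port 68" filter above. */
    if (pcap_compile(p, &prog, "ip and udp and (port 67 or port 68)",
        1, PCAP_NETMASK_UNKNOWN) == -1 ||
        pcap_setfilter(p, &prog) == -1) {
        fprintf(stderr, "filter: %s\n", pcap_geterr(p));
        return (1);
    }

    /* ... a pcap_dispatch()/pcap_next_ex() read loop would go here ... */

    pcap_freecode(&prog);
    pcap_close(p);
    return (0);
}

The same expression can be handed straight to tcpdump (tcpdump -i em0
'ip and udp and (port 67 or port 68)') for a quick look at what the read
socket should be matching.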
-- 
John-Mark Gurney                              Voice: +1 415 225 5579
     "All that I will do, has been done, All that I have, has not."