Date: Sun, 2 May 2021 00:18:04 +0300
From: Özkan KIRIK <ozkan.kirik@gmail.com>
To: Mark Johnston <markj@freebsd.org>
Cc: FreeBSD Net <freebsd-net@freebsd.org>
Subject: Re: IPsec performace - netisr hits %100
Message-ID: <CAAcX-AHTZiSvdXyxtBFySO%2BJKsNDJi%2BELvQ%2BZpctQ34QMaGRgw@mail.gmail.com>
In-Reply-To: <CAAcX-AGHV=eCH2atS3adsPCpxuqQ2Eeh1y2KDZtBTKtGsPYhEg@mail.gmail.com>
References: <CAAcX-AF=0s5tueCuanFKkoALNkRnWJ-8QrzfCqSu=ReoWvqMug@mail.gmail.com>
 <YIxpdL9b6v8%2BN%2BLg@nuc>
 <CAAcX-AHSk92gXQ3HXw4KYpXQ-jTVCjX0svStu5z49ykH-tk2QQ@mail.gmail.com>
 <CAAcX-AG2KyN-7yMm%2BMpKbCRDKivFQjq6BVR0r50t4P3HpDRx=Q@mail.gmail.com>
 <YIx6eHEH53B4g1iB@nuc>
 <CAAcX-AGHNzU%2BvWD0Dvr_BQYcb25V=RHqyLeT7n_XkQiVXSwN0g@mail.gmail.com>
 <CAAcX-AE_jRirL64tbL4ikRa4XDuvkeQgDObLqphJN7HtXyqwLg@mail.gmail.com>
 <CAAcX-AH=vGn_NJQGLx8dvxXdvfD=ZB1CG-C5ztZO5kwYa55KAw@mail.gmail.com>
 <YI1q6ByrN2akPt79@nuc>
 <CAAcX-AGHV=eCH2atS3adsPCpxuqQ2Eeh1y2KDZtBTKtGsPYhEg@mail.gmail.com>

The previous flamegraph was captured while the iperf client (jail) was
sending to the iperf server. The flamegraph attached to this mail was
captured with iperf configured in full-duplex mode.

Throughput is about up: 1.5 Gbps, down: 1.5 Gbps, total: 3 Gbps.

On Sat, May 1, 2021 at 11:57 PM Özkan KIRIK <ozkan.kirik@gmail.com> wrote:

> Hello,
> The flamegraph is attached.
>
> # netstat -s
> ...
> ipsec:
>         0 inbound packets violated process security policy
>         0 inbound packets failed due to insufficient memory
>         0 invalid inbound packets
>         0 outbound packets violated process security policy
>         0 outbound packets with no SA available
>         0 outbound packets failed due to insufficient memory
>         0 outbound packets with no route available
>         0 invalid outbound packets
>         0 outbound packets with bundled SAs
>         0 spd cache hits
>         0 spd cache misses
>         0 clusters copied during clone
>         0 mbufs inserted during makespace
> ah:
>         0 packets shorter than header shows
>         0 packets dropped; protocol family not supported
>         0 packets dropped; no TDB
>         0 packets dropped; bad KCR
>         0 packets dropped; queue full
>         0 packets dropped; no transform
>         0 replay counter wraps
>         0 packets dropped; bad authentication detected
>         0 packets dropped; bad authentication length
>         0 possible replay packets detected
>         0 packets in
>         0 packets out
>         0 packets dropped; invalid TDB
>         0 bytes in
>         0 bytes out
>         0 packets dropped; larger than IP_MAXPACKET
>         0 packets blocked due to policy
>         0 crypto processing failures
>         0 tunnel sanity check failures
>         AH output histogram:
>                 aes-gmac-128: 35517864
> esp:
>         0 packets shorter than header shows
>         0 packets dropped; protocol family not supported
>         0 packets dropped; no TDB
>         0 packets dropped; bad KCR
>         0 packets dropped; queue full
>         20 packets dropped; no transform
>         0 packets dropped; bad ilen
>         0 replay counter wraps
>         0 packets dropped; bad encryption detected
>         0 packets dropped; bad authentication detected
>         0 possible replay packets detected
>         23598941 packets in
>         11918943 packets out
>         0 packets dropped; invalid TDB
>         32247932688 bytes in
>         630318292 bytes out
>         0 packets dropped; larger than IP_MAXPACKET
>         0 packets blocked due to policy
>         0 crypto processing failures
>         0 tunnel sanity check failures
>         ESP output histogram:
>                 aes-gcm-16: 35517864
>
> dev.qat.1.stats.sym_alloc_failures: 0
> dev.qat.1.stats.ring_full: 1267
> dev.qat.1.stats.gcm_aad_updates: 0
> dev.qat.1.stats.gcm_aad_restarts: 0
> dev.qat.1.%domain: 0
> dev.qat.1.%parent: pci16
> dev.qat.1.%pnpinfo: vendor=0x8086 device=0x37c8 subvendor=0x8086
> subdevice=0x0000 class=0x0b4000
> dev.qat.1.%location: slot=0 function=0 dbsf=pci0:182:0:0
> dev.qat.1.%driver: qat
> dev.qat.1.%desc: Intel C620/Xeon D-2100 QuickAssist PF
> dev.qat.0.stats.sym_alloc_failures: 0
> dev.qat.0.stats.ring_full: 0
> dev.qat.0.stats.gcm_aad_updates: 0
> dev.qat.0.stats.gcm_aad_restarts: 0
> dev.qat.0.%domain: 0
> dev.qat.0.%parent: pci15
> dev.qat.0.%pnpinfo: vendor=0x8086 device=0x37c8 subvendor=0x8086
> subdevice=0x0000 class=0x0b4000
> dev.qat.0.%location: slot=0 function=0 dbsf=pci0:181:0:0
> dev.qat.0.%driver: qat
> dev.qat.0.%desc: Intel C620/Xeon D-2100 QuickAssist PF
> dev.qat.%parent:
>
>
> On Sat, May 1, 2021 at 5:51 PM Mark Johnston <markj@freebsd.org> wrote:
>
>> On Sat, May 01, 2021 at 04:30:59PM +0300, Özkan KIRIK wrote:
>> > This bug is related to CCR. @Navdeep Parhar <np@freebsd.org>, @John
>> > Baldwin <jhb@freebsd.org>, if you are interested in fixing this
>> > CCR-related bug, I can test if you provide patches. The test
>> > environment is explained in my first email on this thread.
>> >
>> > @Mark Johnston <markj@freebsd.org> Now again on stable/13,
>> > - with aesni, without netipsec/ipsec_input.c patch - 1.44Gbps - single
>> >   netisr thread eats 100% cpu
>> > - with qat, without netipsec/ipsec_input.c patch - 1.88Gbps - single
>> >   netisr thread eats 100% cpu
>> > - with aesni, with netipsec/ipsec_input.c patch - 1.33Gbps
>> > - with qat, with netipsec/ipsec_input.c patch - 2.85Gbps -
>> >
>> > stable/13 results are better than stable/12 but still not fast enough.
>> > Something is still a bottleneck for IPsec.
>>
>> So with these results it looks like we have 4 crypto threads running,
>> which is what I'd expect for two pairs of IP addresses. There is still
>> a single-threaded bottleneck. I would suggest generating a flame graph
>> using DTrace and https://github.com/brendangregg/FlameGraph to see where
>> we're spending CPU time. It would also be useful to know if we're
>> getting errors or drops anywhere. The QAT (sysctl dev.qat.*.stats) and
>> ESP/AH (netstat -s -p (esp|ah)) counters would be a useful start, in
>> addition to counters from cxgbe.
>>
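
For the "errors or drops" check suggested in the quoted reply, a minimal Python sketch can scan pasted `netstat -s` and `sysctl dev.qat` output for nonzero counters whose names look like errors. The keyword list `SUSPECT` and the function name `nonzero_suspects` are assumptions made for illustration; they are not part of any FreeBSD tool, and the sample text is a small excerpt of the counters quoted above.

```python
import re

# Substrings that mark a counter as a likely error/drop indicator.
# This list is an assumption chosen for this sketch, not an official one.
SUSPECT = ("dropped", "failed", "failures", "violated", "invalid",
           "ring_full", "no SA", "no route")


def nonzero_suspects(text):
    """Return (name, value) pairs for nonzero counters that look like errors."""
    hits = []
    for raw in text.splitlines():
        line = raw.strip()
        # netstat -s style: "<count> <description>"
        m = re.match(r"^(\d+)\s+(.+)$", line)
        if m:
            value, name = int(m.group(1)), m.group(2)
        else:
            # sysctl style: "<oid>: <count>"
            m = re.match(r"^([\w.%]+):\s*(\d+)$", line)
            if not m:
                continue  # section headers, histograms, blank lines
            name, value = m.group(1), int(m.group(2))
        if value and any(s in name for s in SUSPECT):
            hits.append((name, value))
    return hits


# Excerpt of the counters quoted in this message.
SAMPLE = """\
esp:
        0 packets dropped; queue full
        20 packets dropped; no transform
        23598941 packets in
dev.qat.1.stats.ring_full: 1267
dev.qat.0.stats.ring_full: 0
dev.qat.1.stats.sym_alloc_failures: 0
"""

if __name__ == "__main__":
    for name, value in nonzero_suspects(SAMPLE):
        print(f"{value:>10}  {name}")
```

On the excerpt, only the ESP "no transform" drops and the `ring_full` count on `dev.qat.1` are flagged; every other error-like counter in the sample is zero.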