From: Jim Thompson
Date: Wed, 17 Apr 2019 13:09:50 -0500
Subject: Re: openvpn and system overhead
To: Wojciech Puchar
Cc: Miroslav Lachman <000.fbsd@quip.cz>, Mark Millard via freebsd-hackers
Message-Id: <94EA4F3F-4D78-4E08-9AF8-441B957A4749@netgate.com>
References: <8648d069-2172-2c09-8e59-d66a8265a120@quip.cz>
List-Id: Technical Discussions relating to FreeBSD

> On Apr 17, 2019, at 10:54 AM, Wojciech Puchar wrote:
>
> On Wed, 17 Apr 2019, Miroslav Lachman wrote:
>
>> Wojciech Puchar wrote on 2019/04/17 17:08:
>>> i'm running openvpn server on Xeon E5 2620 server.
>>> when receiving 100Mbit/s traffic over VPN it uses 20% of single core.
>>> At least 75% of it is system time.
>>> Seems like 500Mbit/s is a max for a single openvpn process.
>>> can anything be done about that to improve performance?
>>
>> You can play with ciphers, AES-NI etc.
>> https://community.openvpn.net/openvpn/wiki/Gigabit_Networks_Linux
>>
>> Miroslav Lachman
>
> again. it's system time mostly not user time.

Yup.  I’ve looked at this a bunch over the years for pfSense.

The tun/tap device can be viewed as a simple point-to-point IP or
Ethernet device which, instead of receiving packets from a physical
medium, receives them from a user space program, and instead of sending
packets via a physical medium, sends them to the user space program.

Let's say that you configured IP on tap0; then whenever the kernel
sends an IP packet to tap0, it is passed to the application (OpenVPN,
for example).
OpenVPN encrypts, authenticates, and occasionally compresses this
packet, encapsulates it, and sends it to the other side over TCP or
(preferably) UDP.

The application on the other side receives the packet, decrypts and
decompresses the data received, and writes the packet to its TAP device;
the kernel on the other side handles the packet as if it came from a
real physical device.

Each time you copy data from user to kernel or kernel to user space, you
also incur a context switch with all the associated overheads.

Using a tun/tap device incurs an additional context switch in each
direction, as you’re basically running one program to send data
(say, ‘ping’ or ‘ssh’), and another program is used to encrypt and
encapsulate the packet before it leaves the machine.  The process is
roughly the same on the other side.  So you get twice the copies, and
twice the number of context switches.  Making things worse, the
“IP stack” inside OpenVPN is single-threaded and processes one packet
at a time, so all the overheads accrue to each packet, rather than
being amortized across several packets.

Net-net, OpenVPN won’t get close to 1Mpps.  There is a decent-enough
write-up of recent actual benchmarking in a master's thesis that
compares IPsec, OpenVPN and WireGuard on Linux here:
https://www.net.in.tum.de/fileadmin/bibtex/publications/theses/2018-pudelko-vpn-performance.pdf

Section 5.5 if you want to skip to the substance.
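
The per-packet path described above (read from the tun/tap descriptor,
transform in user space, send over UDP) can be sketched roughly as
follows.  This is only an illustration, not OpenVPN's actual code:
forward_one, MAX_PACKET and the transform callback are hypothetical
names, and the demo uses a pipe and a Unix datagram socket pair as
stand-ins for a real tun device and a real UDP socket.

```python
import os
import socket

MAX_PACKET = 65535  # assumed upper bound on one tunnelled packet


def forward_one(tun_fd, sock, transform):
    """Move exactly one packet from the tun/tap descriptor to the peer.

    Returns the number of system calls made; each one is a user/kernel
    crossing plus a copy of the packet.
    """
    packet = os.read(tun_fd, MAX_PACKET)  # crossing 1: kernel -> user copy
    payload = transform(packet)           # encrypt/encapsulate in user space
    sock.send(payload)                    # crossing 2: user -> kernel copy
    return 2


if __name__ == "__main__":
    # Stand-ins (assumptions): a pipe plays the tun device and a Unix
    # datagram socket pair plays the connected UDP socket.
    tun_r, tun_w = os.pipe()
    local, remote = socket.socketpair(socket.AF_UNIX, socket.SOCK_DGRAM)
    os.write(tun_w, b"ip-packet")
    crossings = forward_one(tun_r, local, lambda p: b"ENC:" + p)
    print(crossings, remote.recv(MAX_PACKET))
```

Because the loop handles one packet per read/send pair, the crossing
cost is paid in full for every packet, which is the "not amortized"
point made above.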
Basically, with *no* encryption overheads, OpenVPN still has a static
overhead of around 8500 cycles/packet on the setup they used
(Xeon E5-2620 v4), which seems quite similar to yours.  Given all this,
they show that OpenVPN enters an overload condition at around 120Kpps.

There is some hope if you really have to have a lower-overhead OpenVPN.
An OpenVPN session has two channels, multiplexed on the same connection.
One is a control channel, the other is a data channel.  The control
channel and associated configuration code in OpenVPN is … complex.
It has close to 10 trillion configuration options, and any re-write of
this code would be a huge, huge undertaking.  Nearly unthinkable,
really.  The data channel, otoh, is relatively straightforward,
especially if you don’t need all the crypto options provided, and,
instead, limit yourself to, say, AES-GCM or another AEAD
(ChaCha20 / Poly1305) transform.  (Here, if your CPU has AES-NI or
similar (e.g. ARMv8 has AES acceleration instructions), AES-GCM will
always be faster.)

But, if you’re willing to limit yourself to one, or a few, transforms,
in theory it’s possible to make a specialized tun / tap device such
that the data channel is kept in-kernel, with encryption/decryption and
encapsulation/decapsulation of data packets occurring in the kernel,
but control packets passed up and down to/from the associated user
space process.

A partial attempt at this idea (for Linux) can be found here:
https://github.com/marywangran/OpenVPN-Linux-kernel
It looks abandoned, so maybe it didn’t pan out, or maybe the work just
got asymptotic.

There is a bunch of work to get this right (keeping the OpenVPN user
process happy, counters up to date, etc.), but, at the end of the day,
it’s all software.  Netflix got enough of OpenSSL's AES-GCM
implementation into the kernel to run the transmit side.
They didn’t care about the receive side, and just let nginx deal with
the relatively light rx flows in their deployment, but it does show
that it’s possible with enough work.

Even with all that work, it will probably never be as fast as a decent
IPsec implementation.

Jim
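
For a rough sense of scale, the per-packet figures quoted earlier in
this message can be turned into throughput ceilings.  This is a
back-of-the-envelope sketch only: the 2.1 GHz clock (the nominal base
clock of a Xeon E5-2620 v4, ignoring turbo) and the 1500-byte packet
size are assumptions, and the function names are hypothetical.

```python
def max_pps(clock_hz, cycles_per_packet):
    """Packets per second one core can handle if every packet costs a
    fixed number of cycles and nothing else competes for the core."""
    return clock_hz / cycles_per_packet


def throughput_mbit(pps, packet_bytes):
    """Payload throughput in Mbit/s for a given packet rate and size."""
    return pps * packet_bytes * 8 / 1e6


if __name__ == "__main__":
    CLOCK_HZ = 2.1e9      # assumption: base clock, no turbo
    STATIC_CYCLES = 8500  # static per-packet overhead quoted above
    PACKET_BYTES = 1500   # assumption: full-size Ethernet payloads

    ceiling = max_pps(CLOCK_HZ, STATIC_CYCLES)
    print(f"static-overhead ceiling: {ceiling:,.0f} pps")
    print(f"at the quoted 120 Kpps overload point: "
          f"{throughput_mbit(120_000, PACKET_BYTES):,.0f} Mbit/s")
```

The ceiling counts only the static overhead, so it is an upper bound;
the overload point quoted above is lower, and smaller packets reduce
the achievable bit rate proportionally.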