Date: Sat, 4 Jul 2020 16:16:42 -0700
From: John-Mark Gurney <jmg@funkthat.com>
To: Alan Somers <asomers@freebsd.org>
Cc: Paweł Jakub Dawidek <pawel@dawidek.net>, freebsd-geom@freebsd.org
Subject: Re: Single-threaded bottleneck in geli
Message-ID: <20200704231642.GU4213@funkthat.com>
In-Reply-To: <CAOtMX2idj0pfpo4k+xOjaZ9Tk9tLNavNqAo9nGyYOs6OkG7r8w@mail.gmail.com>
References: <CAOtMX2hHaEzOT0jmc_QcukVZjRKUtCm55bTT9Q5=BNCcL9rf+g@mail.gmail.com> <80B62FE6-FCFB-42B8-A34C-B28E7DDBF45D@dawidek.net> <CAOtMX2idj0pfpo4k+xOjaZ9Tk9tLNavNqAo9nGyYOs6OkG7r8w@mail.gmail.com>
Alan Somers wrote this message on Sat, Jul 04, 2020 at 15:59 -0600:
> I might give this a shot.  What is the best way to tell if geli ought
> to use direct dispatch?  Is there a generic "are crypto instructions
> available" macro that would cover aesni as well as other
> platform-specific instructions?

Direct dispatch has the advantage of saving scheduling context
switches...

Geli has two modes: one is hardware-acceleration mode, where it does a
bit of work to put together the request and then hands it off to the
crypto framework (say, an accelerator card), and the other is the mode
where the work has to be done in software, where it dispatches to a set
of worker threads...

In both modes, it would make sense for geli to be able to do the work
of constructing those requests via direct dispatch...  This would
eliminate a context switch, which is always a good thing...

I haven't looked at the OpenCrypto code in a while, so I don't know
what the locking requirements are...  The key cache is already
protected by an mtx, but I believe it's a leaf lock, and so shouldn't
be an issue...

I'll add this to my list of things to look at...

Also, if you have that many geli devices, you might also want to set:
kern.geom.eli.threads=1

As it stands, geli fires up ncpu threads for EACH geli device, so you
likely have thousands of geli threads...

> On Sat, Jul 4, 2020, 2:55 PM Paweł Jakub Dawidek <pawel@dawidek.net> wrote:
>
> > Direct dispatch would be great for geli, especially since geli can
> > use its own (multiple) threads when necessary (e.g. when using
> > crypto cards).  With AES-NI you could go straight to the disk.
> >
> > > On Jul 3, 2020, at 13:22, Alan Somers <asomers@freebsd.org> wrote:
> > >
> > > I don't.  What I meant was that a single thread (geom) is limiting
> > > the performance of the system overall.  I'm certain, based on top,
> > > gstat, and zpool iostat, that geom is the limiting factor on this
> > > system.
> > > -Alan
> > >
> > >> On Fri, Jul 3, 2020 at 2:18 PM Paweł Jakub Dawidek
> > >> <pawel@dawidek.net> wrote:
> > >>
> > >> Hi Alan,
> > >>
> > >> why do you think it will hurt single-threaded performance?
> > >>
> > >>> On Jul 3, 2020, at 12:30, Alan Somers <asomers@freebsd.org> wrote:
> > >>>
> > >>> I'm using geli, gmultipath, and ZFS on a large system, with
> > >>> hundreds of drives.  What I'm seeing is that under at least some
> > >>> workloads, the overall performance is limited by the single geom
> > >>> kernel process.  procstat and kgdb aren't much help in telling
> > >>> exactly why this process is using so much CPU, but it certainly
> > >>> must be related to the fact that over 15,000 IOPs are going
> > >>> through that thread.  What can I do to improve this situation?
> > >>> Would it make sense to enable direct dispatch for geli?  That
> > >>> would hurt single-threaded performance, but probably improve
> > >>> performance for highly multithreaded workloads like mine.
> > >>>
> > >>> Example top output:
> > >>>   PID USERNAME  PRI NICE   SIZE    RES STATE   C   TIME    WCPU COMMAND
> > >>>    13 root       -8    -     0B    96K CPU46  46  82.7H  70.54% geom{g_down}
> > >>>    13 root       -8    -     0B    96K -       9  35.5H  25.32% geom{g_up}

-- 
John-Mark Gurney				Voice: +1 415 225 5579
     "All that I will do, has been done, All that I have, has not."
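For reference, the kern.geom.eli.threads tunable mentioned in the message above can be set like this (a sketch; the value 1 follows the suggestion in the message, and it only affects geli providers attached after the change):

```shell
# Cap geli at one worker thread per device instead of one per CPU.
# Takes effect for providers attached after this point:
sysctl kern.geom.eli.threads=1

# To apply it at boot as well, set it as a loader tunable:
echo 'kern.geom.eli.threads=1' >> /boot/loader.conf
```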
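The handoff cost that direct dispatch avoids can be illustrated with a toy model (ordinary userland Python, not GEOM code; the handler and iteration counts are made up): calling a handler directly versus round-tripping every request through a worker thread, roughly analogous to geli's software mode handing work to its worker threads.

```python
# Toy model of why direct dispatch helps: handing each request to a
# worker thread pays queue and context-switch overhead that a direct
# call avoids.  Illustrative only -- this is not GEOM code.
import queue
import threading
import time

def handler(x):
    # Stand-in for cheap per-request work (e.g. constructing a request).
    return x + 1

def direct(n):
    """Call the handler directly n times (direct dispatch)."""
    t0 = time.perf_counter()
    for i in range(n):
        handler(i)
    return time.perf_counter() - t0

def via_worker(n):
    """Round-trip each request through a worker thread."""
    requests, results = queue.Queue(), queue.Queue()

    def worker():
        while True:
            item = requests.get()
            if item is None:
                return
            results.put(handler(item))

    th = threading.Thread(target=worker)
    th.start()
    t0 = time.perf_counter()
    for i in range(n):
        requests.put(i)
        results.get()   # wait for completion, like a thread handoff
    elapsed = time.perf_counter() - t0
    requests.put(None)
    th.join()
    return elapsed

if __name__ == "__main__":
    n = 20000
    print(f"direct:     {direct(n):.4f}s")
    print(f"via worker: {via_worker(n):.4f}s")
```

On a typical machine the worker-thread variant is substantially slower per request; that per-request handoff overhead is what dispatching directly, instead of through g_up/g_down or per-device threads, would shave off at each layer.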