From owner-freebsd-hackers@freebsd.org Fri Jul 10 08:52:41 2020
Date: Fri, 10 Jul 2020 11:52:26 +0300
From: Konstantin Belousov
To: Alan Somers
Cc: FreeBSD Hackers
Subject: Re: Right-sizing the geli thread pool
Message-ID: <20200710085226.GC2866@kib.kiev.ua>

On Thu, Jul 09, 2020 at 03:26:41PM -0600, Alan Somers wrote:
> Currently, geli creates a separate thread pool for each provider, and by
> default each thread pool contains one thread per cpu. On a large server
> with many encrypted disks, that can balloon into a very large number of
> threads! I have a patch in progress that switches from per-provider thread
> pools to a single thread pool for the entire module. Happily, I see read
> IOPs increase by up to 60%. But to my surprise, write IOPs _decreases_ by
> up to 25%. dtrace suggests that the CPU usage is dominated by the
> vmem_free call in biodone, as in the below stack.
>
> kernel`lock_delay+0x32
> kernel`biodone+0x88
> kernel`g_io_deliver+0x214
> geom_eli.ko`g_eli_write_done+0xf6
> kernel`g_io_deliver+0x214
> kernel`md_kthread+0x275
> kernel`fork_exit+0x7e
> kernel`0xffffffff8104784e
>
> I only have one idea for how to improve things from here. The geli thread
> pool is still fed by a single global bio queue. That could cause cache
> thrashing, if bios get moved between cores too often. I think a superior
> design would be to use a separate bio queue for each geli thread, and use
> work-stealing to balance them. However,

Geli uses mapped I/O, and the fact that vmem_free() is called from
biodone() means that GEOM has to enable transient remapping to handle
unmapped requests coming to the geli provider. This path was never
supposed to be fast. Geli might need access to the bio's data, e.g. for
AES-NI processing; or rather, the crypto(9) aesni driver needs it.
But it might be very beneficial to declare geli as supporting unmapped
I/O and to do transient remapping only on a pinned thread, which avoids
global KVA allocations and TLB shootdowns.

Another possible huge optimization could be in the aesni crypto(9)
driver. I am not sure what the state of crypto(9) is WRT unmapped
requests; there was a lot of work improving the framework, so it might
support them. On amd64, aesni can work with unmapped requests through
the DMAP, which means that no remapping is needed.

>
> 1) That doesn't explain why this change benefits reads more than writes, and
> 2) work-stealing is hard to get right, and I can't find any examples in the
> kernel.
>
> Can anybody offer tips or code for implementing work stealing? Or any
> other suggestions about why my write performance is suffering? I would
> like to get this change committed, but not without resolving that issue.
>
> -Alan
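
On the quoted question about work stealing: below is a minimal userspace
sketch in C of per-worker queues with stealing. It is not FreeBSD kernel
code and not part of the geli patch. All names (struct wsq, wsq_push,
wsq_pop, wsq_steal, wsq_next, WSQ_CAP) are hypothetical, the queues are
fixed-size rings protected by a pthread mutex rather than anything
lock-free, and the items are opaque pointers standing in for bios.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>
#include <stddef.h>

#define WSQ_CAP 64	/* hypothetical fixed capacity per queue */

/* One queue per worker thread; all names here are illustrative. */
struct wsq {
	pthread_mutex_t	lock;
	void		*item[WSQ_CAP];
	size_t		head;	/* index of the oldest entry */
	size_t		count;	/* number of queued entries */
};

static void
wsq_init(struct wsq *q)
{
	assert(q != NULL);
	pthread_mutex_init(&q->lock, NULL);
	q->head = 0;
	q->count = 0;
}

/* Owner enqueues at the tail. Returns false if the queue is full. */
static bool
wsq_push(struct wsq *q, void *it)
{
	bool ok = false;

	pthread_mutex_lock(&q->lock);
	if (q->count < WSQ_CAP) {
		q->item[(q->head + q->count) % WSQ_CAP] = it;
		q->count++;
		ok = true;
	}
	pthread_mutex_unlock(&q->lock);
	return (ok);
}

/* Owner dequeues the newest entry (tail). Returns NULL if empty. */
static void *
wsq_pop(struct wsq *q)
{
	void *it = NULL;

	pthread_mutex_lock(&q->lock);
	if (q->count > 0) {
		q->count--;
		it = q->item[(q->head + q->count) % WSQ_CAP];
	}
	pthread_mutex_unlock(&q->lock);
	return (it);
}

/* A thief dequeues the oldest entry (head) from another worker's queue. */
static void *
wsq_steal(struct wsq *q)
{
	void *it = NULL;

	pthread_mutex_lock(&q->lock);
	if (q->count > 0) {
		it = q->item[q->head];
		q->head = (q->head + 1) % WSQ_CAP;
		q->count--;
	}
	pthread_mutex_unlock(&q->lock);
	return (it);
}

/* Worker `self` takes from its own queue first, then scans the others. */
static void *
wsq_next(struct wsq *qs, size_t nq, size_t self)
{
	void *it = wsq_pop(&qs[self]);

	for (size_t i = 1; it == NULL && i < nq; i++)
		it = wsq_steal(&qs[(self + i) % nq]);
	return (it);
}
```

The owner takes the newest entry and thieves take the oldest, which is
the usual work-stealing arrangement: the owner works on data that is
likely still in its cache, and a thief rarely contends for the same
entry. For I/O requests that ordering may be undesirable, so a real
implementation might have the owner dequeue in FIFO order instead. A
worker loop would call wsq_next() and sleep when it returns NULL.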