Date: Tue, 19 Nov 2019 16:01:02 -0800
From: John-Mark Gurney
To: Wojciech Puchar
Cc: freebsd-hackers@freebsd.org
Subject: Re: geom_ssdcache
Message-ID: <20191120000102.GI4552@funkthat.com>

Wojciech Puchar wrote this message on Tue, Nov 19, 2019 at 13:06 +0100:
> Today SSDs are really fast and quite cheap, but hard drives are still
> many times cheaper.
>
> Magnetic hard drives are OK at long reads anyway, just bad at seeks.
> While now it's trendy to use ZFS, I would stick to UFS anyway.
>
> I try to keep most data on HDDs but use SSDs for small files and high
> I/O needs.
>
> It works, but needs too much manual and semi-automated work.
>
> It would be better to just use HDDs for storage, some SSDs for cache,
> and others for temporary storage only.
>
> My idea is to make a geom layer for caching one geom provider (magnetic
> disk/partition or gmirror/graid5) using another geom provider (SSD
> partition).

Another thing you should decide is whether the cache will be shared or
per geom provider, and how this would interact w/ multiple separate geom
caches...  Likely w/ a shared cache (a single SSD covering multiple
providers), starting with an empty cache each time would be best.

> I have no experience in writing geom layer drivers, but I think geom_cache
> would be a fine starting point for me.  At first I would do
> read/write-through caching.  Write-back caching would come next - if at
> all; it doesn't seem like a good idea unless you are sure the SSD won't
> fail.

Re: the SSD failing, you can put a gmirror under the cache to address this...

> But my question is really about UFS.  I would like to know, at the geom
> layer, whether a read/write operation is an inode/directory/superblock
> write or a regular data write - so I would give the former higher
> priority.  Regular data would not be cached at all, or only when the
> read size is less than a defined value.

At the geom layer, I don't think that this information is available.

> Is it possible to modify the UFS code to somehow pass a flag/value when
> issuing a read/write request to the device layer?

Take a look at sys/ufs/ffs/ffs_vfsops.c; it looks like at least the
superblock writes are already separated out (see ffs_use_bwrite), but
you'd need to split things apart further.  Also, with snapshots, things
might be a little bit more difficult.

Most of the metadata can likely be cached in RAM already; unless you have
a large, LARGE UFS fs, and then why aren't you using ZFS?

I'd also suggest you profile the actual reads/writes to make sure you'd
be able to get the performance you need...

-- 
John-Mark Gurney				Voice: +1 415 225 5579
     "All that I will do, has been done, All that I have, has not."
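
P.S.  To make the "check a flag at the geom layer" part a little more
concrete: a start routine for a hypothetical geom_ssdcache class might
look roughly like the sketch below.  This is untested and purely
illustrative; G_SSDCACHE_METADATA is a made-up flag, and the real work
(having FFS mark metadata bufs and having g_vfs_strategy() carry that
mark over from the struct buf into the struct bio) is exactly the
plumbing that doesn't exist today.

/*
 * Rough, untested sketch only.  geom_ssdcache does not exist, and
 * G_SSDCACHE_METADATA is a made-up bio_flags bit (value is just a
 * placeholder).  Setting that bit is not shown: FFS would have to mark
 * its metadata bufs, and g_vfs_strategy() would have to copy the mark
 * into the bio it hands to GEOM.
 */
#include <sys/param.h>
#include <sys/systm.h>
#include <sys/errno.h>
#include <sys/bio.h>
#include <geom/geom.h>

#define	G_SSDCACHE_METADATA	0x01000000	/* hypothetical new flag */

static void
g_ssdcache_start(struct bio *bp)
{
	struct g_geom *gp;
	struct bio *cbp;

	gp = bp->bio_to->geom;

	switch (bp->bio_cmd) {
	case BIO_READ:
	case BIO_WRITE:
		if ((bp->bio_flags & G_SSDCACHE_METADATA) != 0) {
			/*
			 * Metadata: this is where the SSD lookup/insert
			 * would happen before (or instead of) touching
			 * the HDD provider.  Omitted here.
			 */
		}
		/* For now, pass everything straight down to the provider. */
		cbp = g_clone_bio(bp);
		if (cbp == NULL) {
			g_io_deliver(bp, ENOMEM);
			return;
		}
		cbp->bio_done = g_std_done;
		g_io_request(cbp, LIST_FIRST(&gp->consumer));
		return;
	default:
		g_io_deliver(bp, EOPNOTSUPP);
		return;
	}
}

The upside of an explicit flag is that the cache never has to guess what
a block is; the downside is that every filesystem that wants to benefit
has to be taught to set it.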