To: "George Neville-Neil"
cc: fs@freebsd.org
Subject: Re: Fwd: The Morning Paper: NOVA - A log-structured file system for hybrid volatile/non-volatile main memories
From: "Poul-Henning Kamp"
Date:
Sat, 07 May 2016 16:38:21 +0000
Message-ID: <59877.1462639101@critter.freebsd.dk>

That's a pretty obvious idea, all things considered, but not
necessarily a good idea and certainly not the best idea.

Hybrid "disk with SSD cache" is a transitional phenomenon. It's
probably not going to be relevant in five years, which means that
it is almost already too late to develop a new filesystem for it:
by the time the code is trustworthy, nobody will need it any more.

That is not to say that there are no relevant improvements to make.
Many years ago we removed the rotational optimizations in FFS in
response to zoned drives, and given the properties of SSDs it is
no longer evident that journaling, softupdates or even supergroups
are good ideas. The design choices made to keep metadata updates
single-sector modifications should be revisited as well.

While LFS seems an obvious storage strategy for SSDs, it's surplus
to requirements because SSD devices already contain an LFS. Only
they call files "a logical sector (extent)" and the LFS itself a
"Flash Adaptation Layer".

LFS also has some well-documented drawbacks, in particular WRT
cleaning, and the optimizations those drawbacks are the tradeoff
for are utterly pointless on media with effectively no access time.

It would probably be smarter to focus on reducing the number of,
and increasing the size of, media I/O transactions, with a side
order of general scalability, so that small files have small
metadata and bigger files trade metadata size for performance.

There is also an interesting space between per-partition and
per-inode keying of encryption which is ripe for study.
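
To make the "fewer, larger media I/O transactions" point above
concrete, here is a minimal sketch that merges queued writes which
are contiguous on the medium. Everything in it is an assumption made
for illustration: the (offset, payload) request shape, the name
coalesce_writes and the max_size cap are hypothetical and not taken
from FFS or any other filesystem.

```python
from typing import List, Tuple

Write = Tuple[int, bytes]  # (byte offset, payload); hypothetical request shape


def coalesce_writes(writes: List[Write], max_size: int = 1 << 20) -> List[Write]:
    """Merge writes that are contiguous on the medium into fewer, larger ones.

    Assumes the queued writes do not overlap; overlapping requests would
    need ordering rules that this sketch does not attempt.
    """
    merged: List[Write] = []
    for offset, data in sorted(writes, key=lambda w: w[0]):
        if merged:
            last_offset, last_data = merged[-1]
            contiguous = last_offset + len(last_data) == offset
            # Only merge when the result stays within the transaction cap.
            if contiguous and len(last_data) + len(data) <= max_size:
                merged[-1] = (last_offset, last_data + data)
                continue
        merged.append((offset, data))
    return merged


queued = [
    (8192, b"c" * 4096),
    (0, b"a" * 4096),
    (4096, b"b" * 4096),
    (65536, b"d" * 512),
]
batched = coalesce_writes(queued)
print(len(queued), "->", len(batched))  # 4 -> 2
```

The three adjacent 4 KiB writes become one 12 KiB transaction and the
isolated 512-byte write stays separate. A real filesystem would make
this decision at allocation time rather than in a queue, so treat the
sketch as an illustration of the goal, not of a mechanism.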

The double or even triple overlap of work between today's
filesystems and the FALs in SSDs could be avoided if a more
expressive set of verbs than Read/Write/Erase(=TRIM) were exported
upwards (expose the extents as "inodes"?), but given the patent
minefield and the heavy-duty NIH attitudes, that is probably not
going to happen.

Poul-Henning

-- 
Poul-Henning Kamp       | UNIX since Zilog Zeus 3.20
phk@FreeBSD.ORG         | TCP/IP since RFC 956
FreeBSD committer       | BSD since 4.3-tahoe
Never attribute to malice what can adequately be explained by incompetence.