From owner-freebsd-fs@freebsd.org Sun May 21 12:15:08 2017 Return-Path: Delivered-To: freebsd-fs@mailman.ysv.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by mailman.ysv.freebsd.org (Postfix) with ESMTP id 4AA20D6B1D4; Sun, 21 May 2017 12:15:08 +0000 (UTC) (envelope-from jilles@stack.nl) Received: from mailout.stack.nl (mailout05.stack.nl [IPv6:2001:610:1108:5010::202]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (Client CN "mailout.stack.nl", Issuer "CA Cert Signing Authority" (not verified)) by mx1.freebsd.org (Postfix) with ESMTPS id D9620398; Sun, 21 May 2017 12:15:07 +0000 (UTC) (envelope-from jilles@stack.nl) Received: from snail.stack.nl (snail.stack.nl [IPv6:2001:610:1108:5010::131]) by mailout.stack.nl (Postfix) with ESMTP id 0FFC879; Sun, 21 May 2017 14:14:57 +0200 (CEST) Received: by snail.stack.nl (Postfix, from userid 1677) id EF00E28497; Sun, 21 May 2017 14:14:56 +0200 (CEST) Date: Sun, 21 May 2017 14:14:56 +0200 From: Jilles Tjoelker To: Konstantin Belousov Cc: freebsd-current@freebsd.org, freebsd-fs@freebsd.org, freebsd-ports@freebsd.org, emaste@freebsd.org, Kirk McKusick Subject: Re: 64-bit inodes (ino64) Status Update and Call for Testing Message-ID: <20170521121456.GA21613@stack.nl> References: <20170420194314.GI1788@kib.kiev.ua> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <20170420194314.GI1788@kib.kiev.ua> User-Agent: Mutt/1.5.21 (2010-09-15) X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Sun, 21 May 2017 12:15:08 -0000 On Thu, Apr 20, 2017 at 10:43:14PM +0300, Konstantin Belousov wrote: > Inodes are data structures corresponding to objects in a file system, > such as files and directories. FreeBSD has historically used 32-bit > values to identify inodes, which limits file systems to somewhat under > 2^32 objects. Many modern file systems internally use 64-bit identifiers > and FreeBSD needs to follow suit to properly and fully support these > file systems. > The 64-bit inode project, also known as ino64, started life many years > ago as a project by Gleb Kurtsou (gleb@). After that time several > people have had a hand in updating it and addressing regressions, after > mckusick@ picked up and updated the patch, and acted as a flag-waver. > Sponsored by the FreeBSD Foundation I have spent a significant effort > on outstanding issues and integration -- fixing compat32 ABI, NFS and > ZFS, addressing ABI compat issues and investigating and fixing ports > failures. rmacklem@ provided feedback on NFS changes, emaste@ and > jhb@ provided feedback and review on the ABI transition support. pho@ > performed extensive testing and identified a number of issues that > have now been fixed. kris@ performed an initial ports investigation > followed by an exp-run by antoine@. emaste@ helped with organization > of the process. > This note explains how to perform useful testing of the ino64 branch, > beyond typical smoke tests. > 1. Overview. > The ino64 branch extends the basic system types ino_t and dev_t from > 32-bit to 64-bit, and nlink_t from 16-bit to 64-bit. The struct dirent > layout is modified due to the larger size of ino_t, and also gains a > d_off (directory offset) member. As ino64 implies an ABI change anyway > the struct statfs f_mntfromname[] and f_mntonname[] array length > MNAMELEN is increased from 88 to 1024, to allow for longer mount path > names. > ABI breakage is mitigated by providing compatibility using versioned > symbols, ingenious use of the existing padding in structures, and by > employing other tricks. Unfortunately, not everything can be fixed, > especially outside the base system. For instance, third-party APIs > which pass struct stat around are broken in backward and forward- > incompatible way. We have another type in this area which is too small in some situations: uint8_t for struct dirent.d_namlen. For filesystems that store filenames as upto 255 UTF-16 code units, the name to be stored in d_name may be upto 765 bytes long in UTF-8. This was reported in PR 204643. The code currently handles this by returning the short (8.3) name, but this name may not be present or usable, leaving the file inaccessible. Actually allowing longer names seems too complicated to add to the ino64 change, but changing d_namlen to uint16_t (using d_pad0 space) and skipping entries with d_namlen > 255 in libc may be helpful. Note that applications using the deprecated readdir_r() will not be able to read such long names, since the API does not allow specifying that a larger buffer has been provided. (This could be avoided by making struct dirent.d_name 766 bytes long instead of 256.) Unfortunately, the existence of readdir_r() also prevents changing struct dirent.d_name to the more correct flexible array. -- Jilles Tjoelker