From: Conrad Meyer
Date: Wed, 24 May 2017 20:49:04 -0700
Subject: Re: 64-bit inodes (ino64) Status Update and Call for Testing
To: Jilles Tjoelker
Cc: freebsd-current, freebsd-fs@freebsd.org
Reply-To: cem@freebsd.org
In-Reply-To: <20170521121456.GA21613@stack.nl>
References: <20170420194314.GI1788@kib.kiev.ua> <20170521121456.GA21613@stack.nl>

Hi Jilles,

Thanks for bringing this up.  And of course, thanks to kib@ for
including the d_namlen size bump and for his work in driving the rest
of this change through to completion.

On Sun, May 21, 2017 at 5:14 AM, Jilles Tjoelker wrote:
> We have another type in this area which is too small in some situations:
> uint8_t for struct dirent.d_namlen. For filesystems that store filenames
> as upto 255 UTF-16 code units, the name to be stored in d_name may be
> upto 765 bytes long in UTF-8. This was reported in PR 204643. The code
> currently handles this by returning the short (8.3) name, but this name
> may not be present or usable, leaving the file inaccessible.

We've been working to add such support to our FreeBSD-derivative
product.  A big piece of it is expanding d_namlen out to 16 bits.
We've also been trying to divorce system-wide constants like MAXNAMLEN /
NAME_MAX and MAXPATHLEN / PATH_MAX from filesystem-specific limitations
(UFS' limit of 255 bytes), and to push that work upstream when possible,
e.g., r313475, r316509.

Bumping d_namlen in FreeBSD reduces the amount of ABI breakage we have
to introduce in our product relative to FreeBSD, and leaves open the
possibility of supporting 255-Unicode-character filesystems natively in
FreeBSD down the road.

> Actually allowing longer names seems too complicated to add to the ino64
> change, but changing d_namlen to uint16_t (using d_pad0 space) and
> skipping entries with d_namlen > 255 in libc may be helpful.
>
> Note that applications using the deprecated readdir_r() will not be able
> to read such long names, since the API does not allow specifying that a
> larger buffer has been provided. (This could be avoided by making struct
> dirent.d_name 766 bytes long instead of 256.)

We're looking at 255 Unicode code points, which can be 4 bytes apiece in
UTF-8, or potentially 1020 bytes.

> Unfortunately, the existence of readdir_r() also prevents changing
> struct dirent.d_name to the more correct flexible array.

Best,
Conrad
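
As a quick sanity check on the byte counts mentioned in this thread (765
bytes for 255 UTF-16 code units, 1020 bytes for 255 Unicode code points),
here is a minimal Python sketch. It is not part of the ino64 change; the
helper names utf16_units and utf8_bytes and the sample characters are
made up for illustration, and the uint8_t / uint16_t limits are just the
ranges of the old and new d_namlen types.

```python
# Worst-case UTF-8 sizes for a 255-unit file name (illustrative sketch only).

UINT8_MAX = 0xFF      # range of the old uint8_t d_namlen
UINT16_MAX = 0xFFFF   # range of a uint16_t d_namlen


def utf16_units(name: str) -> int:
    """Number of UTF-16 code units needed to store name."""
    return len(name.encode("utf-16-le")) // 2


def utf8_bytes(name: str) -> int:
    """Number of bytes needed to store name in UTF-8."""
    return len(name.encode("utf-8"))


# Case 1: 255 UTF-16 code units. A BMP character such as U+20AC is one
# UTF-16 unit but three UTF-8 bytes. A surrogate pair costs two units for
# four bytes (two bytes per unit), so it is not the worst case here.
bmp_name = "\u20ac" * 255

# Case 2: 255 Unicode code points. A supplementary-plane character such
# as U+1F600 takes four UTF-8 bytes.
astral_name = "\U0001f600" * 255

for label, name in (("255 UTF-16 units", bmp_name),
                    ("255 code points", astral_name)):
    n = utf8_bytes(name)
    print(f"{label}: {n} UTF-8 bytes, "
          f"fits uint8_t: {n <= UINT8_MAX}, fits uint16_t: {n <= UINT16_MAX}")
```

Both worst cases overflow an 8-bit length field but fit comfortably in
16 bits, which is consistent with the d_namlen bump discussed above.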