Skip site navigation (1)Skip section navigation (2)
Date:      Wed, 13 Jun 2018 12:22:00 +0000 (UTC)
From:      Bruce Evans <bde@FreeBSD.org>
To:        src-committers@freebsd.org, svn-src-all@freebsd.org, svn-src-head@freebsd.org
Subject:   svn commit: r335053 - in head/sys: compat/freebsd32 compat/linux fs/nfsclient kern sys
Message-ID:  <201806131222.w5DCM00c001080@repo.freebsd.org>

next in thread | raw e-mail | index | archive | help
Author: bde
Date: Wed Jun 13 12:22:00 2018
New Revision: 335053
URL: https://svnweb.freebsd.org/changeset/base/335053

Log:
  Fix the encoding of major and minor numbers in 64-bit dev_t by restoring
  the old encodings for the lower 16 and 32 bits and only using the
  higher 32 bits for unusually large major and minor numbers.  This
  change breaks compatibility with the previous encoding (which was only
  used in -current).
  
  Fix truncation to (essentially) 16-bit dev_t in newnfs v3.
  
  Any encoding of device numbers gives an ABI, so it can't be changed
  without translations for compatibility.  Extra bits give the much
  larger complication that the translations need to compress into fewer
  bits.  Fortunately, more than 32 bits are rarely needed, so
  compression is rarely needed except for 16-bit linux dev_t where it
  was always needed but never done.
  
  The previous encoding moved the major number into the top 32 bits.
  Almost no translation code handled this, so the major number was blindly
  truncated away in most 32-bit encodings.  E.g., for ffs, mknod(8) with
  major = 1 and minor = 2 gave dev_t = 0x10000002; ffs cannot represent
  this and blindly truncated it to 2.  But if this mknod was run on any
  released version of FreeBSD, it gives dev_t = 0x102.  ffs can represent
  this, but in the previous encoding it was not decoded, giving major = 0,
  minor = 0x102.
  
  The presence of bugs was most obvious for exporting dev_t's from an
  old system to -current, since bugs in newnfs augment them.  I fixed
  oldnfs to support 32-bit dev_t in 1996 (r16634), but this regressed
  to 16-bit dev_t in newnfs, first to the old 16-bit encoding and then
  further in -current.  E.g., old ad0 with major = 234, minor = 0x10002
  had the correct (major, minor) number on the wire, but newnfs truncated
  this to (234, 2) and then the previous encoding shifted the major
  number into oblivion as seen by ffs or old applications.
  
  I first tried to fix this by translating on every ABI/API boundary, but
  there are too many boundaries and too many sloppy translations by blind
  truncation.  So use the old encoding for the low 32 bits so that sloppy
  translations work no worse than before provided the high 32 bits are
  not set.  Add some error checking for when bits are lost.  Keep not
  doing any error checking for translations for almost everything in
  compat/linux.
  
  compat/freebsd32/freebsd32_misc.c:
  Optionally check for losing bits after possibly-truncating assignments as
  before.
  
  compat/linux/linux_stats.c:
  Depend on the representation being compatible with Linux's (or just with
  itself for local use) and spell some of the translations as assignments in
  a macro that hides the details.
  
  fs/nfsclient/nfs_clcomsubs.c:
  Essentially the same fix as in 1996, except there is now no possible
  truncation in makedev() itself.  Also fix nearby style bugs.
  
  kern/vfs_syscalls.c:
  As for freebsd32.  Also update the sysctl description to include file
  numbers, and change it to describe device ids as device numbers.
  
  sys/types.h:
  Use inline functions (wrapped by macros) since the expressions are now
  a bit too complicated for plain macros.  Describe the encoding and
  some of the reasons for it.  16-bit compatibility didn't leave many
  reasonable choices for the 32-bit encoding, and 32-bit compatibility
  doesn't leave many reasonable choices for the 64-bit encoding.  My
  choice is to put the 8 new minor bits in the low 8 bits of the top 32
  bits.  This minimizes discontiguities.
  
  Reviewed by:	kib (except for rewrite of the comment in linux_stats.c)

Modified:
  head/sys/compat/freebsd32/freebsd32_misc.c
  head/sys/compat/linux/linux_stats.c
  head/sys/fs/nfsclient/nfs_clcomsubs.c
  head/sys/kern/vfs_syscalls.c
  head/sys/sys/types.h

Modified: head/sys/compat/freebsd32/freebsd32_misc.c
==============================================================================
--- head/sys/compat/freebsd32/freebsd32_misc.c	Wed Jun 13 12:17:11 2018	(r335052)
+++ head/sys/compat/freebsd32/freebsd32_misc.c	Wed Jun 13 12:22:00 2018	(r335053)
@@ -2127,11 +2127,27 @@ freebsd11_cvtstat32(struct stat *in, struct freebsd11_
 			break;
 		}
 	}
-	CP(*in, *out, st_dev);
+	out->st_dev = in->st_dev;
+	if (out->st_dev != in->st_dev) {
+		switch (ino64_trunc_error) {
+		default:
+			break;
+		case 1:
+			return (EOVERFLOW);
+		}
+	}
 	CP(*in, *out, st_mode);
 	CP(*in, *out, st_uid);
 	CP(*in, *out, st_gid);
-	CP(*in, *out, st_rdev);
+	out->st_rdev = in->st_rdev;
+	if (out->st_rdev != in->st_rdev) {
+		switch (ino64_trunc_error) {
+		default:
+			break;
+		case 1:
+			return (EOVERFLOW);
+		}
+	}
 	TS_CP(*in, *out, st_atim);
 	TS_CP(*in, *out, st_mtim);
 	TS_CP(*in, *out, st_ctim);

Modified: head/sys/compat/linux/linux_stats.c
==============================================================================
--- head/sys/compat/linux/linux_stats.c	Wed Jun 13 12:17:11 2018	(r335052)
+++ head/sys/compat/linux/linux_stats.c	Wed Jun 13 12:22:00 2018	(r335053)
@@ -128,13 +128,30 @@ translate_fd_major_minor(struct thread *td, int fd, st
 	fdrop(fp, td);
 }
 
+/*
+ * l_dev_t has the same encoding as dev_t in the latter's low 16 bits, so
+ * don't bother going through major() and minor().  Keep doing blind
+ * truncation, as for other fields.  The previous version didn't even do
+ * blind truncation after dev_t was expanded to 64 bits.  It failed to
+ * mask out bits 8-15 in minor().  These bits can only be nonzero in th
+ * 64-bit version.
+ *
+ * This is only used for st_dev.  st_dev is for the mounted-on device so
+ * it can't be a device that needs very special translation.  The translation
+ * of blind truncation is done here.  st_rdev is supposed to be specially
+ * translated in callers, with the blind truncation done there too and
+ * st_rdev in the native struct state abused to hold the linux st_rdev.
+ * Callers do the last step using an open-coded Linux makedev().
+ */
+#define	dev_to_ldev(d)	((uint16_t)(d))
+
 static int
 newstat_copyout(struct stat *buf, void *ubuf)
 {
 	struct l_newstat tbuf;
 
 	bzero(&tbuf, sizeof(tbuf));
-	tbuf.st_dev = minor(buf->st_dev) | (major(buf->st_dev) << 8);
+	tbuf.st_dev = dev_to_ldev(buf->st_dev);
 	tbuf.st_ino = buf->st_ino;
 	tbuf.st_mode = buf->st_mode;
 	tbuf.st_nlink = buf->st_nlink;
@@ -222,7 +239,7 @@ stat_copyout(struct stat *buf, void *ubuf)
 	struct l_stat lbuf;
 
 	bzero(&lbuf, sizeof(lbuf));
-	lbuf.st_dev = buf->st_dev;
+	lbuf.st_dev = dev_to_ldev(buf->st_dev);
 	lbuf.st_ino = buf->st_ino;
 	lbuf.st_mode = buf->st_mode;
 	lbuf.st_nlink = buf->st_nlink;
@@ -524,7 +541,7 @@ stat64_copyout(struct stat *buf, void *ubuf)
 	struct l_stat64 lbuf;
 
 	bzero(&lbuf, sizeof(lbuf));
-	lbuf.st_dev = minor(buf->st_dev) | (major(buf->st_dev) << 8);
+	lbuf.st_dev = dev_to_ldev(buf->st_dev);
 	lbuf.st_ino = buf->st_ino;
 	lbuf.st_mode = buf->st_mode;
 	lbuf.st_nlink = buf->st_nlink;

Modified: head/sys/fs/nfsclient/nfs_clcomsubs.c
==============================================================================
--- head/sys/fs/nfsclient/nfs_clcomsubs.c	Wed Jun 13 12:17:11 2018	(r335052)
+++ head/sys/fs/nfsclient/nfs_clcomsubs.c	Wed Jun 13 12:22:00 2018	(r335053)
@@ -430,8 +430,9 @@ nfsm_loadattr(struct nfsrv_descript *nd, struct nfsvat
 		NFSM_DISSECT(fp, struct nfs_fattr *, NFSX_V3FATTR);
 		nap->na_type = nfsv34tov_type(fp->fa_type);
 		nap->na_mode = fxdr_unsigned(u_short, fp->fa_mode);
-		nap->na_rdev = makedev(fxdr_unsigned(u_char, fp->fa3_rdev.specdata1),
-			fxdr_unsigned(u_char, fp->fa3_rdev.specdata2));
+		nap->na_rdev = NFSMAKEDEV(
+		    fxdr_unsigned(int, fp->fa3_rdev.specdata1),
+		    fxdr_unsigned(int, fp->fa3_rdev.specdata2));
 		nap->na_nlink = fxdr_unsigned(uint32_t, fp->fa_nlink);
 		nap->na_uid = fxdr_unsigned(uid_t, fp->fa_uid);
 		nap->na_gid = fxdr_unsigned(gid_t, fp->fa_gid);

Modified: head/sys/kern/vfs_syscalls.c
==============================================================================
--- head/sys/kern/vfs_syscalls.c	Wed Jun 13 12:17:11 2018	(r335052)
+++ head/sys/kern/vfs_syscalls.c	Wed Jun 13 12:22:00 2018	(r335053)
@@ -2094,12 +2094,25 @@ cvtstat(struct stat *st, struct ostat *ost)
 int ino64_trunc_error;
 SYSCTL_INT(_vfs, OID_AUTO, ino64_trunc_error, CTLFLAG_RW,
     &ino64_trunc_error, 0,
-    "Error on truncation of inode number, device id or link count");
+    "Error on truncation of device, file or inode number, or link count");
+
 int
 freebsd11_cvtstat(struct stat *st, struct freebsd11_stat *ost)
 {
 
 	ost->st_dev = st->st_dev;
+	if (ost->st_dev != st->st_dev) {
+		switch (ino64_trunc_error) {
+		default:
+			/*
+			 * Since dev_t is almost raw, don't clamp to the
+			 * maximum for case 2, but ignore the error.
+			 */
+			break;
+		case 1:
+			return (EOVERFLOW);
+		}
+	}
 	ost->st_ino = st->st_ino;
 	if (ost->st_ino != st->st_ino) {
 		switch (ino64_trunc_error) {
@@ -2130,6 +2143,14 @@ freebsd11_cvtstat(struct stat *st, struct freebsd11_st
 	ost->st_uid = st->st_uid;
 	ost->st_gid = st->st_gid;
 	ost->st_rdev = st->st_rdev;
+	if (ost->st_rdev != st->st_rdev) {
+		switch (ino64_trunc_error) {
+		default:
+			break;
+		case 1:
+			return (EOVERFLOW);
+		}
+	}
 	ost->st_atim = st->st_atim;
 	ost->st_mtim = st->st_mtim;
 	ost->st_ctim = st->st_ctim;

Modified: head/sys/sys/types.h
==============================================================================
--- head/sys/sys/types.h	Wed Jun 13 12:17:11 2018	(r335052)
+++ head/sys/sys/types.h	Wed Jun 13 12:22:00 2018	(r335053)
@@ -366,9 +366,36 @@ __bitcount64(__uint64_t _x)
 
 #include <sys/select.h>
 
-#define	major(x)	((int)((dev_t)(x) >> 32))
-#define	minor(x)	((int)(x))
-#define	makedev(x, y)	(((dev_t)(x) << 32) | (unsigned)(y))
+/*
+ * The major and minor numbers are encoded in dev_t as MMMmmmMm (where
+ * letters correspond to bytes).  The encoding of the lower 4 bytes is
+ * constrained by compatibility with 16-bit and 32-bit dev_t's.  The
+ * encoding of of the upper 4 bytes is the least unnatural one consistent
+ * with this and other constraints.  Also, the decoding of the m bytes by
+ * minor() is unnatural to maximize compatibility subject to not discarding
+ * bits.  The upper m byte is shifted into the position of the lower M byte
+ * instead of shifting 3 upper m bytes to close the gap.  Compatibility for
+ * minor() is achieved iff the upper m byte is 0.
+ */
+#define	major(d)	__major(d)
+static __inline int
+__major(dev_t _d)
+{
+	return (((_d >> 32) & 0xffffff00) | ((_d >> 8) & 0xff));
+}
+#define	minor(d)	__minor(d)
+static __inline int
+__minor(dev_t _d)
+{
+	return (((_d >> 24) & 0xff00) | (_d & 0xffff00ff));
+}
+#define	makedev(M, m)	__makedev((M), (m))
+static __inline dev_t
+__makedev(int _M, int _m)
+{
+	return (((dev_t)(_M & 0xffffff00) << 32) | ((_M & 0xff) << 8) |
+	    ((dev_t)(_m & 0xff00) << 24) | (_m & 0xffff00ff));
+}
 
 /*
  * These declarations belong elsewhere, but are repeated here and in



Want to link to this message? Use this URL: <https://mail-archive.FreeBSD.org/cgi/mid.cgi?201806131222.w5DCM00c001080>