Date: Sat, 12 Jul 2003 10:07:29 +1000 (EST)
From: Bruce Evans <bde@zeta.org.au>
To: Peter Wemm
cc: Dag-Erling Smørgrav, src-committers@FreeBSD.org, cvs-all@FreeBSD.org,
    cvs-src@FreeBSD.org
Subject: Re: cvs commit: src/sys/amd64/include pmap.h vmparam.h
In-Reply-To: <20030711215612.51C6F2A7EA@canning.wemm.org>
Message-ID: <20030712092215.S31542@gamplex.bde.org>

On Fri, 11 Jul 2003, Peter Wemm wrote:

> Dag-Erling Smørgrav wrote:
> > Peter Wemm writes:
> > > Increase user VM space from 1/2TB (512GB) to 128TB.
> >
> > What's our file size limit these days?
>
> Thats a very good question.  Last year, I was able to make Very Large sparse
> files on a ufs1 fs and mmap them to try and fill up the ia64 user VM.  The
> 32 bit block numbers were the limit though.

32-bit (ufs1) block numbers should permit file sizes up to 4TB with the
default fragment size of 2K.  The following patch is needed to fix overflow
at 1TB.
%%%
Index: fs.h
===================================================================
RCS file: /home/ncvs/src/sys/ufs/ffs/fs.h,v
retrieving revision 1.38
diff -u -2 -r1.38 fs.h
--- fs.h	10 Jan 2003 06:59:34 -0000	1.38
+++ fs.h	27 Apr 2003 17:40:14 -0000
@@ -490,5 +490,5 @@
  * This maps filesystem blocks to device size blocks.
  */
-#define	fsbtodb(fs, b)	((b) << (fs)->fs_fsbtodb)
+#define	fsbtodb(fs, b)	((daddr_t)(b) << (fs)->fs_fsbtodb)
 #define	dbtofsb(fs, b)	((b) >> (fs)->fs_fsbtodb)
%%%

This bug apparently doesn't affect ufs2, since there `b' is always already
64 bits and daddr_t "just happens" to have the same size as `b'.  So we are
back to the pre-4.4BSD multiplication and shift overflow possibilities for
32 * 32 -> 32-bit multiplications, except now with 64 * 64 -> 64 bits
(64-bit off_t's made overflow impossible in some cases, since we could do
32 * 32 -> 64 bits).

> ufs2 should have no such problem
> itself, but if its living in a disklabel'ed drive, the physical limit for
> the partition is 1TB.  If you use the raw disk without a label, or use
> gpt on the drive, that limit is gone too.

Modulo bugs, disklabel actually has a limit of (2TB - 512) with the normal
sector size of 512, since its sizes and offsets are unsigned.  It should be
able to handle sizes like (64TB - epsilon) using fake sector sizes like 16K
(which is a reasonable sector size, since it is the default ufs block size).
I wasn't able to test this because I didn't have any disks larger than 2TB
handy (md has a limit of 2TB - 512).

The following test program gives more details about the bugs and creates a
sparse file of size 22TB on a ufs1 file system.

%%%
#!/bin/sh

SOMEFILE=/c/tmp/zz

# Bugs:
# (1) md silently truncates sizes (in DEV_BSIZE'ed units) mod 2^32.
# (2) at least pre-GEOM versions of md get confused by this and cause a
# null pointer panic in devstat.
#
# Use the maximum size that works (2^32 - 1).  Unfortunately, this prevents
# testing of file systems with size 2 TB or larger.
dd if=/dev/zero of=$SOMEFILE oseek=0xFFFFFFFE count=1
mdconfig -a -t vnode -f ${SOMEFILE} -u 0

# The large values here are more to make newfs not take too long than to
# get a large maxfilesize.
newfs -O 1 -b 65536 -f 65536 -i 6533600 /dev/md0

# Note that this reports a very large maxfilesize (2282 TB).  This is the
# size associated with the triple indirect block limit, not the correct
# one.  I think the actual limit should be 64 TB (less epsilon).
dumpfs /dev/md0 | grep maxfilesize

mount /dev/md0 /mnt

# Bugs:
# (1) fsbtodb(nb) overflows when nb has type ufs1_daddr_t and the result
# should be larger than (2^31 - 1).
# (2) dscheck() used to detect garbage block numbers caused by (1) (if the
# garbage happened to be negative or too large).  Then it reported the error
# and invalidated the buffer.  GEOM doesn't detect any error.  It apparently
# passes on the garbage, so the error is eventually detected by ffs (since
# md0 is on an ffs vnode) (if the garbage is preposterous).  ffs_balloc_ufs1()
# eventually sees the error as an EFBIG returned by bwrite() and gives up.
# But the buffer stays in the buffer cache to cause problems later.
# (3) The result of bwrite() is sometimes ignored.
#
# Chop a couple of F's off the seek so that we don't get an EFBIG error.
# Unfortunately, this breaks testing for files of size near 2282 TB.
dd if=/dev/zero of=/mnt/zz oseek=0xFFFFFE count=1
ls -l /mnt/zz

# Bugs:
# (1) umount(2) returns the undocumented errno EFBIG for the unwriteable
# buffer.
# (2) umount -f and umount at reboot fail too (the latter leaving all file
# systems dirty).
#
# Removing the file clears the problem.
rm /mnt/zz
umount /mnt

# Since we couldn't demonstrate files larger than 2 TB on md0, demonstrate
# one near ${SOMEFILE}.
dumpfs /c | egrep '(^bsize|^fsize|maxfilesize)'
dd if=/dev/zero of="$SOMEFILE"-bigger oseek=0x3FFFFFFFE count=1
ls -l "$SOMEFILE"-bigger
rm "$SOMEFILE"-bigger

mdconfig -d -u 0
rm $SOMEFILE
%%%

Bruce