Date:      Fri, 24 Mar 2006 20:32:55 +1100 (EST)
From:      Bruce Evans <bde@zeta.org.au>
To:        Nicolas KOWALSKI <Nicolas.Kowalski@imag.fr>
Cc:        freebsd-fs@freebsd.org
Subject:   Re: quotas problem on 4.11/UFS
Message-ID:  <20060324194552.J6509@epsplex.bde.org>
In-Reply-To: <vqohd5pzezu.fsf@corbeau.imag.fr>
References:  <vqohd5pzezu.fsf@corbeau.imag.fr>

On Thu, 23 Mar 2006, Nicolas KOWALSKI wrote:

> Our FreeBSD 4.11 fileserver (NFS, Samba) seems to have some problems
> with quotas, apparently just after a lot of files are deleted. The
> clients are mostly Linux 2.6 workstations (75), and some Windows XP
> (10).
>
> For example, yesterday the quota files of one disk were these:
>
> -rw-r-----  1 root tty 2097120 Mar 21 23:14 quota.group
> -rw-r-----  1 root tty 2097120 Mar 21 23:14 quota.user
>
>
> But today, after a user deleted a lot of his files, the quotas files
> are:
>
> -rw-r-----  1 root tty    2097120 Mar 22 23:10 quota.group
> -rw-r-----  1 root tty 4294967264 Mar 22 23:10 quota.user

IIRC, quota maps are sparse, with uid N mapped to offset
N*sizeof(somestruct).  With 32-bit uids, N can be as large as
4294967295, so the file size wants to be (this+1)*sizeof(somestruct) =
some not very large multiple of 4GB.  The large file sizes for this
might even work, without wasting much disk space, since files can be
sparse too, but there may be overflow problems at 4G.

Your magic size of 4294967264 is 4G-32.  4G-2 is a common large uid;
it is produced by nfs's default mapping of the root uid.  nfs spells
this id bogusly as -2, but uids are unsigned so it becomes a very large
unsigned value ("very large" is only 65534 with 16-bit uids).  Your
magic file size would be explained by sizeof(somestruct) being 32 and
the magic uid N = (uint32_t)-2 being used.  Then (N+1)*sizeof(somestruct)
gives 4G-32 after overflow in the multiplication.
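
The arithmetic above can be checked with a small sketch.  It assumes
a record size of 32 bytes, which is only inferred here from the file
size and not taken from the quota headers.  The names
QUOTA_RECORD_SIZE, quota_size_32 and quota_size_64 are made up for
illustration and are not from the quota code.

```c
#include <stdint.h>

/* Assumed record size: inferred from the 4G-32 file size, not from headers. */
#define QUOTA_RECORD_SIZE 32u

/*
 * Size of a quota file that holds records for uids 0..uid, computed in
 * 32-bit arithmetic.  The product wraps modulo 2^32.
 */
static uint32_t quota_size_32(uint32_t uid)
{
    return (uid + 1u) * QUOTA_RECORD_SIZE;
}

/* The same size computed in 64-bit arithmetic, so nothing wraps. */
static uint64_t quota_size_64(uint32_t uid)
{
    return ((uint64_t)uid + 1u) * QUOTA_RECORD_SIZE;
}
```

With uid = (uint32_t)-2 = 4294967294, the 32-bit version gives
4294967264 (4G-32), while the 64-bit version gives 137438953440
(4G*32 - 32).  Under the same assumption, uid 65534 gives 2097120,
which matches the size of the quota files before the deletion.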

> The repquota command still works, but takes more time (30 secs instead
> of immediate).

The slowness is because even sparse files take a long time to process
if they are large.  repquota apparently has to search through almost
4GB of zeros before it gets to data for the big uid at the end.  Then
after wasting a lot of time getting to the nonzero data at the end,
this data will probably be misprocessed due to the multiplication
overflow.

> We will reboot the server today to rebuild the quota files.
>
> Does anyone know what is happenning here, and if possible how to
> prevent it ? Thanks.

It seems to be necessary to limit uids to a fairly small value to work
around the quota bugs.  Limiting them to something like 4G/32 would
avoid the overflow bugs (if any) or prevent file sizes of something
like 4G*32 if there are no overflow bugs.  But limiting them is not
so easy for remote access.  A remote system may have too many large
uids for remapping them all to be practical.  nfs has the maproot
directive in /etc/exports to support mapping the root uid to anything
(default -2) but it doesn't seem to support mapping other uids
individually (it has mapall to map them all to the same (?) uid).
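
As a rough sketch of the "something like 4G/32" limit, the check
below tests whether the record for a uid ends below 4G.  It again
assumes 32-byte records, and the names REC_BYTES, FOUR_G and
uid_fits_below_4g are hypothetical, not part of any quota tool.

```c
#include <stdbool.h>
#include <stdint.h>

#define REC_BYTES 32u                   /* assumed record size */
#define FOUR_G    ((uint64_t)1 << 32)   /* 4G, first size that needs 33 bits */

/*
 * True if a quota file holding records for uids 0..uid stays strictly
 * below 4G, so that its size still fits in 32 bits.
 */
static bool uid_fits_below_4g(uint32_t uid)
{
    uint64_t end = ((uint64_t)uid + 1u) * REC_BYTES;

    return end < FOUR_G;
}
```

Under these assumptions the largest uid that passes is 134217726
(4G/32 - 2).  uid 134217727 would make the file exactly 4G, and
(uint32_t)-2 is far outside the limit.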

There might be PRs with more details about this.  PR38156 just
reports that quota -2 doesn't work.  It gives the same magic number
4G-32 for the size of quota.user and the magic number 4G-2 in
error output from quotacheck, but other details are different:
quota.user was copied from a Sun system, and quotacheck is apparently
failing by trying to seek to offset (4G-2)*sizeof(somestruct) where
data for this uid should be found, so it seems that the Sun system
had multiplication overflow bugs but quotacheck doesn't.

Bruce


