Skip site navigation (1)Skip section navigation (2)
Date:      Sun, 26 Feb 2012 12:46:30 +1100 (EST)
From:      Bruce Evans <brde@optusnet.com.au>
To:        Rick Macklem <rmacklem@uoguelph.ca>
Cc:        Maxim Konovalov <maxim@freebsd.org>, svn-src-head@freebsd.org, svn-src-all@freebsd.org, src-committers@freebsd.org
Subject:   Re: svn commit: r232156 - head/sys/kern
Message-ID:  <20120226122354.D2293@besplex.bde.org>
In-Reply-To: <788625604.124085.1330217939168.JavaMail.root@erie.cs.uoguelph.ca>
References:  <788625604.124085.1330217939168.JavaMail.root@erie.cs.uoguelph.ca>

next in thread | previous in thread | raw e-mail | index | archive | help
On Sat, 25 Feb 2012, Rick Macklem wrote:

>> Log:
>> o Reduce chances for integer overflow.
>> o More verbose sysctl description added.
>>
>> MFC after: 2 weeks
>> Sponsored by: Nginx, Inc.
>>
>> Modified:
>> head/sys/kern/vfs_cache.c
>>
>> Modified: head/sys/kern/vfs_cache.c
>> ==============================================================================
>> --- head/sys/kern/vfs_cache.c Sat Feb 25 11:07:32 2012 (r232155)
>> +++ head/sys/kern/vfs_cache.c Sat Feb 25 12:06:40 2012 (r232156)
>> @@ -369,7 +369,7 @@ sysctl_debug_hashstat_nchash(SYSCTL_HAND
>> maxlength = count;
>> }
>> n_nchash = nchash + 1;
>> - pct = (used * 100 * 100) / n_nchash;
>> + pct = (used * 100) / (n_nchash / 100);
>
> You might want to consider a sanity check to make
> sure n_nchash is >= 100, to avoid a "divide by zero".
>
> There was an NFS PR# a while back, where "desiredvnodes" was
> set very small and resulted in a "divide by zero" in the NFS code.

Interesting.  I considered mentioning this possibility in my reply,
but decided that desiredvnodes is always initially at least a few
hundred, since even an unbootable machine with 4MB memory has
1024 4K-pages.  You can tune desiredvnodes down to a bad value,
but another old bug is that tuning desiredvnodes doesn't affect
the namecache, so you can't make the above divide by provided
n_nchash was initially not bad.

In nfs, the problem is larger and still exists in oldnfs:

% nfsclient/nfs_vfsops.c-		nmp->nm_wsize = NFS_WSIZE;
% nfsclient/nfs_vfsops.c-		nmp->nm_rsize = NFS_RSIZE;
% nfsclient/nfs_vfsops.c-	}
% nfsclient/nfs_vfsops.c:	nmp->nm_wcommitsize = hibufspace / (desiredvnodes / 1000);

Now the divisor is 1000, so problems occur with the initial value occurs
on machines with 10 times as much memory as ones which have a problem in
the namecache code, and a good initial value can be tuned down to ensure
division by zero.

% nfsclient/nfs_vfsops.c-	nmp->nm_readdirsize = NFS_READDIRSIZE;
% nfsclient/nfs_vfsops.c-	nmp->nm_numgrps = NFS_MAXGRPS;
% nfsclient/nfs_vfsops.c-	nmp->nm_readahead = NFS_DEFRAHEAD;
% --
% fs/nfsclient/nfs_clvfsops.c-	nmp->nm_timeo = NFS_TIMEO;
% fs/nfsclient/nfs_clvfsops.c-	nmp->nm_retry = NFS_RETRANS;
% fs/nfsclient/nfs_clvfsops.c-	nmp->nm_readahead = NFS_DEFRAHEAD;
% fs/nfsclient/nfs_clvfsops.c:	if (desiredvnodes >= 11000)
% fs/nfsclient/nfs_clvfsops.c:		nmp->nm_wcommitsize = hibufspace / (desiredvnodes / 1000);
% fs/nfsclient/nfs_clvfsops.c-	else
% fs/nfsclient/nfs_clvfsops.c-		nmp->nm_wcommitsize = hibufspace / 10;
% fs/nfsclient/nfs_clvfsops.c-

Has been fixed, but if you only want to avoid division by 0, then a simpler
fix is to split up the powers of 10, e.g.:

 		nmp->nm_wcommitsize = hibufspace * 100 / (desiredvnodes / 10);

No one would set desiredvnodes to < 10, but now there is possible overflow
for the multiplication, and this is even easier to arrange then division
by zero since it is easy to set highbufspace to a non-physical value
(2**63-1 on 64-bit machines).  So this is too fragile.  Except the
sysctls are privileged and we should rely on their users to not misuse
them.

Bruce



Want to link to this message? Use this URL: <https://mail-archive.FreeBSD.org/cgi/mid.cgi?20120226122354.D2293>