From owner-svn-src-head@FreeBSD.ORG Sun Feb 26 01:46:34 2012 Return-Path: Delivered-To: svn-src-head@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 99B4D106566B; Sun, 26 Feb 2012 01:46:34 +0000 (UTC) (envelope-from brde@optusnet.com.au) Received: from mail05.syd.optusnet.com.au (mail05.syd.optusnet.com.au [211.29.132.186]) by mx1.freebsd.org (Postfix) with ESMTP id 257048FC0A; Sun, 26 Feb 2012 01:46:33 +0000 (UTC) Received: from c211-30-171-136.carlnfd1.nsw.optusnet.com.au (c211-30-171-136.carlnfd1.nsw.optusnet.com.au [211.30.171.136]) by mail05.syd.optusnet.com.au (8.13.1/8.13.1) with ESMTP id q1Q1kULG001155 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=NO); Sun, 26 Feb 2012 12:46:31 +1100 Date: Sun, 26 Feb 2012 12:46:30 +1100 (EST) From: Bruce Evans X-X-Sender: bde@besplex.bde.org To: Rick Macklem In-Reply-To: <788625604.124085.1330217939168.JavaMail.root@erie.cs.uoguelph.ca> Message-ID: <20120226122354.D2293@besplex.bde.org> References: <788625604.124085.1330217939168.JavaMail.root@erie.cs.uoguelph.ca> MIME-Version: 1.0 Content-Type: TEXT/PLAIN; charset=US-ASCII; format=flowed Cc: Maxim Konovalov , svn-src-head@freebsd.org, svn-src-all@freebsd.org, src-committers@freebsd.org Subject: Re: svn commit: r232156 - head/sys/kern X-BeenThere: svn-src-head@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: SVN commit messages for the src tree for head/-current List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Sun, 26 Feb 2012 01:46:34 -0000 On Sat, 25 Feb 2012, Rick Macklem wrote: >> Log: >> o Reduce chances for integer overflow. >> o More verbose sysctl description added. >> >> MFC after: 2 weeks >> Sponsored by: Nginx, Inc. >> >> Modified: >> head/sys/kern/vfs_cache.c >> >> Modified: head/sys/kern/vfs_cache.c >> ============================================================================== >> --- head/sys/kern/vfs_cache.c Sat Feb 25 11:07:32 2012 (r232155) >> +++ head/sys/kern/vfs_cache.c Sat Feb 25 12:06:40 2012 (r232156) >> @@ -369,7 +369,7 @@ sysctl_debug_hashstat_nchash(SYSCTL_HAND >> maxlength = count; >> } >> n_nchash = nchash + 1; >> - pct = (used * 100 * 100) / n_nchash; >> + pct = (used * 100) / (n_nchash / 100); > > You might want to consider a sanity check to make > sure n_nchash is >= 100, to avoid a "divide by zero". > > There was an NFS PR# a while back, where "desiredvnodes" was > set very small and resulted in a "divide by zero" in the NFS code. Interesting. I considered mentioning this possibility in my reply, but decided that desiredvnodes is always initially at least a few hundred, since even an unbootable machine with 4MB memory has 1024 4K-pages. You can tune desiredvnodes down to a bad value, but another old bug is that tuning desiredvnodes doesn't affect the namecache, so you can't make the above divide by provided n_nchash was initially not bad. In nfs, the problem is larger and still exists in oldnfs: % nfsclient/nfs_vfsops.c- nmp->nm_wsize = NFS_WSIZE; % nfsclient/nfs_vfsops.c- nmp->nm_rsize = NFS_RSIZE; % nfsclient/nfs_vfsops.c- } % nfsclient/nfs_vfsops.c: nmp->nm_wcommitsize = hibufspace / (desiredvnodes / 1000); Now the divisor is 1000, so problems occur with the initial value occurs on machines with 10 times as much memory as ones which have a problem in the namecache code, and a good initial value can be tuned down to ensure division by zero. % nfsclient/nfs_vfsops.c- nmp->nm_readdirsize = NFS_READDIRSIZE; % nfsclient/nfs_vfsops.c- nmp->nm_numgrps = NFS_MAXGRPS; % nfsclient/nfs_vfsops.c- nmp->nm_readahead = NFS_DEFRAHEAD; % -- % fs/nfsclient/nfs_clvfsops.c- nmp->nm_timeo = NFS_TIMEO; % fs/nfsclient/nfs_clvfsops.c- nmp->nm_retry = NFS_RETRANS; % fs/nfsclient/nfs_clvfsops.c- nmp->nm_readahead = NFS_DEFRAHEAD; % fs/nfsclient/nfs_clvfsops.c: if (desiredvnodes >= 11000) % fs/nfsclient/nfs_clvfsops.c: nmp->nm_wcommitsize = hibufspace / (desiredvnodes / 1000); % fs/nfsclient/nfs_clvfsops.c- else % fs/nfsclient/nfs_clvfsops.c- nmp->nm_wcommitsize = hibufspace / 10; % fs/nfsclient/nfs_clvfsops.c- Has been fixed, but if you only want to avoid division by 0, then a simpler fix is to split up the powers of 10, e.g.: nmp->nm_wcommitsize = hibufspace * 100 / (desiredvnodes / 10); No one would set desiredvnodes to < 10, but now there is possible overflow for the multiplication, and this is even easier to arrange then division by zero since it is easy to set highbufspace to a non-physical value (2**63-1 on 64-bit machines). So this is too fragile. Except the sysctls are privileged and we should rely on their users to not misuse them. Bruce