Date: Sun, 26 Feb 2012 12:46:30 +1100 (EST) From: Bruce Evans <brde@optusnet.com.au> To: Rick Macklem <rmacklem@uoguelph.ca> Cc: Maxim Konovalov <maxim@freebsd.org>, svn-src-head@freebsd.org, svn-src-all@freebsd.org, src-committers@freebsd.org Subject: Re: svn commit: r232156 - head/sys/kern Message-ID: <20120226122354.D2293@besplex.bde.org> In-Reply-To: <788625604.124085.1330217939168.JavaMail.root@erie.cs.uoguelph.ca> References: <788625604.124085.1330217939168.JavaMail.root@erie.cs.uoguelph.ca>
next in thread | previous in thread | raw e-mail | index | archive | help
On Sat, 25 Feb 2012, Rick Macklem wrote: >> Log: >> o Reduce chances for integer overflow. >> o More verbose sysctl description added. >> >> MFC after: 2 weeks >> Sponsored by: Nginx, Inc. >> >> Modified: >> head/sys/kern/vfs_cache.c >> >> Modified: head/sys/kern/vfs_cache.c >> ============================================================================== >> --- head/sys/kern/vfs_cache.c Sat Feb 25 11:07:32 2012 (r232155) >> +++ head/sys/kern/vfs_cache.c Sat Feb 25 12:06:40 2012 (r232156) >> @@ -369,7 +369,7 @@ sysctl_debug_hashstat_nchash(SYSCTL_HAND >> maxlength = count; >> } >> n_nchash = nchash + 1; >> - pct = (used * 100 * 100) / n_nchash; >> + pct = (used * 100) / (n_nchash / 100); > > You might want to consider a sanity check to make > sure n_nchash is >= 100, to avoid a "divide by zero". > > There was an NFS PR# a while back, where "desiredvnodes" was > set very small and resulted in a "divide by zero" in the NFS code. Interesting. I considered mentioning this possibility in my reply, but decided that desiredvnodes is always initially at least a few hundred, since even an unbootable machine with 4MB memory has 1024 4K-pages. You can tune desiredvnodes down to a bad value, but another old bug is that tuning desiredvnodes doesn't affect the namecache, so you can't make the above divide by provided n_nchash was initially not bad. In nfs, the problem is larger and still exists in oldnfs: % nfsclient/nfs_vfsops.c- nmp->nm_wsize = NFS_WSIZE; % nfsclient/nfs_vfsops.c- nmp->nm_rsize = NFS_RSIZE; % nfsclient/nfs_vfsops.c- } % nfsclient/nfs_vfsops.c: nmp->nm_wcommitsize = hibufspace / (desiredvnodes / 1000); Now the divisor is 1000, so problems occur with the initial value occurs on machines with 10 times as much memory as ones which have a problem in the namecache code, and a good initial value can be tuned down to ensure division by zero. % nfsclient/nfs_vfsops.c- nmp->nm_readdirsize = NFS_READDIRSIZE; % nfsclient/nfs_vfsops.c- nmp->nm_numgrps = NFS_MAXGRPS; % nfsclient/nfs_vfsops.c- nmp->nm_readahead = NFS_DEFRAHEAD; % -- % fs/nfsclient/nfs_clvfsops.c- nmp->nm_timeo = NFS_TIMEO; % fs/nfsclient/nfs_clvfsops.c- nmp->nm_retry = NFS_RETRANS; % fs/nfsclient/nfs_clvfsops.c- nmp->nm_readahead = NFS_DEFRAHEAD; % fs/nfsclient/nfs_clvfsops.c: if (desiredvnodes >= 11000) % fs/nfsclient/nfs_clvfsops.c: nmp->nm_wcommitsize = hibufspace / (desiredvnodes / 1000); % fs/nfsclient/nfs_clvfsops.c- else % fs/nfsclient/nfs_clvfsops.c- nmp->nm_wcommitsize = hibufspace / 10; % fs/nfsclient/nfs_clvfsops.c- Has been fixed, but if you only want to avoid division by 0, then a simpler fix is to split up the powers of 10, e.g.: nmp->nm_wcommitsize = hibufspace * 100 / (desiredvnodes / 10); No one would set desiredvnodes to < 10, but now there is possible overflow for the multiplication, and this is even easier to arrange then division by zero since it is easy to set highbufspace to a non-physical value (2**63-1 on 64-bit machines). So this is too fragile. Except the sysctls are privileged and we should rely on their users to not misuse them. Bruce
Want to link to this message? Use this URL: <https://mail-archive.FreeBSD.org/cgi/mid.cgi?20120226122354.D2293>