From owner-svn-src-head@FreeBSD.ORG Mon Mar 9 19:35:22 2009
Delivered-To: svn-src-head@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34])
	by hub.freebsd.org (Postfix) with ESMTP id 102F01065673;
	Mon, 9 Mar 2009 19:35:22 +0000 (UTC)
	(envelope-from jhb@FreeBSD.org)
Received: from svn.freebsd.org (svn.freebsd.org [IPv6:2001:4f8:fff6::2c])
	by mx1.freebsd.org (Postfix) with ESMTP id F16518FC15;
	Mon, 9 Mar 2009 19:35:21 +0000 (UTC)
	(envelope-from jhb@FreeBSD.org)
Received: from svn.freebsd.org (localhost [127.0.0.1])
	by svn.freebsd.org (8.14.3/8.14.3) with ESMTP id n29JZLAU035583;
	Mon, 9 Mar 2009 19:35:21 GMT (envelope-from jhb@svn.freebsd.org)
Received: (from jhb@localhost)
	by svn.freebsd.org (8.14.3/8.14.3/Submit) id n29JZL3d035574;
	Mon, 9 Mar 2009 19:35:21 GMT (envelope-from jhb@svn.freebsd.org)
Message-Id: <200903091935.n29JZL3d035574@svn.freebsd.org>
From: John Baldwin
Date: Mon, 9 Mar 2009 19:35:20 +0000 (UTC)
To: src-committers@freebsd.org, svn-src-all@freebsd.org,
	svn-src-head@freebsd.org
X-SVN-Group: head
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit
Subject: svn commit: r189595 - in head/sys: kern sys ufs/ffs vm
X-BeenThere: svn-src-head@freebsd.org
X-Mailman-Version: 2.1.5
Precedence: list
List-Id: SVN commit messages for the src tree for head/-current
X-List-Received-Date: Mon, 09 Mar 2009 19:35:22 -0000

Author: jhb
Date: Mon Mar 9 19:35:20 2009
New Revision: 189595
URL: http://svn.freebsd.org/changeset/base/189595

Log:
  Adjust some variables (mostly related to the buffer cache) that hold
  address space sizes to be longs instead of ints. Specifically, the
  following values are now longs: runningbufspace, bufspace, maxbufspace,
  bufmallocspace, maxbufmallocspace, lobufspace, hibufspace, lorunningspace,
  hirunningspace, maxswzone, maxbcache, and maxpipekva.
  Previously, a relatively small number (~ 44000) of buffers set in
  kern.nbuf would result in integer overflows resulting either in hangs or
  bogus values of hidirtybuffers and lodirtybuffers. Now one has to
  overflow a long to see such problems. There was a check for a nbuf
  setting that would cause overflows in the auto-tuning of nbuf. I've
  changed it to always check and cap nbuf but warn if a user-supplied
  tunable would cause overflow.

  Note that this changes the ABI of several sysctls that are used by things
  like top(1), etc., so any MFC would probably require some gross shims
  to allow for that.

  MFC after: 1 month

Modified:
  head/sys/kern/subr_param.c
  head/sys/kern/sys_pipe.c
  head/sys/kern/vfs_bio.c
  head/sys/sys/buf.h
  head/sys/sys/pipe.h
  head/sys/ufs/ffs/ffs_snapshot.c
  head/sys/ufs/ffs/ffs_vfsops.c
  head/sys/vm/vm_init.c
  head/sys/vm/vnode_pager.c

Modified: head/sys/kern/subr_param.c
==============================================================================
--- head/sys/kern/subr_param.c	Mon Mar 9 19:22:45 2009	(r189594)
+++ head/sys/kern/subr_param.c	Mon Mar 9 19:35:20 2009	(r189595)
@@ -89,9 +89,9 @@ int maxfilesperproc; /* per-proc open f
 int	ncallout;			/* maximum # of timer events */
 int	nbuf;
 int	nswbuf;
-int	maxswzone;			/* max swmeta KVA storage */
-int	maxbcache;			/* max buffer cache KVA storage */
-int	maxpipekva;			/* Limit on pipe KVA */
+long	maxswzone;			/* max swmeta KVA storage */
+long	maxbcache;			/* max buffer cache KVA storage */
+u_long	maxpipekva;			/* Limit on pipe KVA */
 int	vm_guest;			/* Running as virtual machine guest? */
 u_long	maxtsiz;			/* max text size */
 u_long	dfldsiz;			/* initial data size limit */
@@ -203,11 +203,11 @@ init_param1(void)
 #ifdef VM_SWZONE_SIZE_MAX
 	maxswzone = VM_SWZONE_SIZE_MAX;
 #endif
-	TUNABLE_INT_FETCH("kern.maxswzone", &maxswzone);
+	TUNABLE_LONG_FETCH("kern.maxswzone", &maxswzone);
 #ifdef VM_BCACHE_SIZE_MAX
 	maxbcache = VM_BCACHE_SIZE_MAX;
 #endif
-	TUNABLE_INT_FETCH("kern.maxbcache", &maxbcache);
+	TUNABLE_LONG_FETCH("kern.maxbcache", &maxbcache);
 
 	maxtsiz = MAXTSIZ;
 	TUNABLE_ULONG_FETCH("kern.maxtsiz", &maxtsiz);
@@ -282,7 +282,7 @@ init_param3(long kmempages)
 	maxpipekva = (kmempages / 20) * PAGE_SIZE;
 	if (maxpipekva < 512 * 1024)
 		maxpipekva = 512 * 1024;
-	TUNABLE_INT_FETCH("kern.ipc.maxpipekva", &maxpipekva);
+	TUNABLE_ULONG_FETCH("kern.ipc.maxpipekva", &maxpipekva);
 }
 
 /*

Modified: head/sys/kern/sys_pipe.c
==============================================================================
--- head/sys/kern/sys_pipe.c	Mon Mar 9 19:22:45 2009	(r189594)
+++ head/sys/kern/sys_pipe.c	Mon Mar 9 19:35:20 2009	(r189595)
@@ -184,7 +184,7 @@ static int pipeallocfail;
 static int piperesizefail;
 static int piperesizeallowed = 1;
 
-SYSCTL_INT(_kern_ipc, OID_AUTO, maxpipekva, CTLFLAG_RDTUN,
+SYSCTL_ULONG(_kern_ipc, OID_AUTO, maxpipekva, CTLFLAG_RDTUN,
 	   &maxpipekva, 0, "Pipe KVA limit");
 SYSCTL_INT(_kern_ipc, OID_AUTO, pipekva, CTLFLAG_RD,
 	   &amountpipekva, 0, "Pipe KVA usage");

Modified: head/sys/kern/vfs_bio.c
==============================================================================
--- head/sys/kern/vfs_bio.c	Mon Mar 9 19:22:45 2009	(r189594)
+++ head/sys/kern/vfs_bio.c	Mon Mar 9 19:35:20 2009	(r189595)
@@ -112,26 +112,26 @@ static void bremfreel(struct buf *bp);
 int vmiodirenable = TRUE;
 SYSCTL_INT(_vfs, OID_AUTO, vmiodirenable, CTLFLAG_RW, &vmiodirenable, 0,
     "Use the VM system for directory writes");
-int runningbufspace;
-SYSCTL_INT(_vfs, OID_AUTO, runningbufspace, CTLFLAG_RD, &runningbufspace, 0,
+long runningbufspace;
+SYSCTL_LONG(_vfs, OID_AUTO, runningbufspace, CTLFLAG_RD, &runningbufspace, 0,
     "Amount of presently outstanding async buffer io");
-static int bufspace;
-SYSCTL_INT(_vfs, OID_AUTO, bufspace, CTLFLAG_RD, &bufspace, 0,
+static long bufspace;
+SYSCTL_LONG(_vfs, OID_AUTO, bufspace, CTLFLAG_RD, &bufspace, 0,
     "KVA memory used for bufs");
-static int maxbufspace;
-SYSCTL_INT(_vfs, OID_AUTO, maxbufspace, CTLFLAG_RD, &maxbufspace, 0,
+static long maxbufspace;
+SYSCTL_LONG(_vfs, OID_AUTO, maxbufspace, CTLFLAG_RD, &maxbufspace, 0,
     "Maximum allowed value of bufspace (including buf_daemon)");
-static int bufmallocspace;
-SYSCTL_INT(_vfs, OID_AUTO, bufmallocspace, CTLFLAG_RD, &bufmallocspace, 0,
+static long bufmallocspace;
+SYSCTL_LONG(_vfs, OID_AUTO, bufmallocspace, CTLFLAG_RD, &bufmallocspace, 0,
     "Amount of malloced memory for buffers");
-static int maxbufmallocspace;
-SYSCTL_INT(_vfs, OID_AUTO, maxmallocbufspace, CTLFLAG_RW, &maxbufmallocspace, 0,
+static long maxbufmallocspace;
+SYSCTL_LONG(_vfs, OID_AUTO, maxmallocbufspace, CTLFLAG_RW, &maxbufmallocspace, 0,
     "Maximum amount of malloced memory for buffers");
-static int lobufspace;
-SYSCTL_INT(_vfs, OID_AUTO, lobufspace, CTLFLAG_RD, &lobufspace, 0,
+static long lobufspace;
+SYSCTL_LONG(_vfs, OID_AUTO, lobufspace, CTLFLAG_RD, &lobufspace, 0,
     "Minimum amount of buffers we want to have");
-int hibufspace;
-SYSCTL_INT(_vfs, OID_AUTO, hibufspace, CTLFLAG_RD, &hibufspace, 0,
+long hibufspace;
+SYSCTL_LONG(_vfs, OID_AUTO, hibufspace, CTLFLAG_RD, &hibufspace, 0,
     "Maximum allowed value of bufspace (excluding buf_daemon)");
 static int bufreusecnt;
 SYSCTL_INT(_vfs, OID_AUTO, bufreusecnt, CTLFLAG_RW, &bufreusecnt, 0,
@@ -142,11 +142,11 @@ SYSCTL_INT(_vfs, OID_AUTO, buffreekvacnt
 static int bufdefragcnt;
 SYSCTL_INT(_vfs, OID_AUTO, bufdefragcnt, CTLFLAG_RW, &bufdefragcnt, 0,
     "Number of times we have had to repeat buffer allocation to defragment");
-static int lorunningspace;
-SYSCTL_INT(_vfs, OID_AUTO, lorunningspace, CTLFLAG_RW, &lorunningspace, 0,
+static long lorunningspace;
+SYSCTL_LONG(_vfs, OID_AUTO, lorunningspace, CTLFLAG_RW, &lorunningspace, 0,
     "Minimum preferred space used for in-progress I/O");
-static int hirunningspace;
-SYSCTL_INT(_vfs, OID_AUTO, hirunningspace, CTLFLAG_RW, &hirunningspace, 0,
+static long hirunningspace;
+SYSCTL_LONG(_vfs, OID_AUTO, hirunningspace, CTLFLAG_RW, &hirunningspace, 0,
     "Maximum amount of space to use for in-progress I/O");
 int dirtybufferflushes;
 SYSCTL_INT(_vfs, OID_AUTO, dirtybufferflushes, CTLFLAG_RW, &dirtybufferflushes,
@@ -324,7 +324,7 @@ runningbufwakeup(struct buf *bp)
 {
 
 	if (bp->b_runningbufspace) {
-		atomic_subtract_int(&runningbufspace, bp->b_runningbufspace);
+		atomic_subtract_long(&runningbufspace, bp->b_runningbufspace);
 		bp->b_runningbufspace = 0;
 		mtx_lock(&rbreqlock);
 		if (runningbufreq && runningbufspace <= lorunningspace) {
@@ -444,7 +444,8 @@ bd_speedup(void)
 caddr_t
 kern_vfs_bio_buffer_alloc(caddr_t v, long physmem_est)
 {
-	int maxbuf;
+	int tuned_nbuf;
+	long maxbuf;
 
 	/*
 	 * physmem_est is in pages. Convert it to kilobytes (assumes
@@ -474,11 +475,17 @@ kern_vfs_bio_buffer_alloc(caddr_t v, lon
 
 		if (maxbcache && nbuf > maxbcache / BKVASIZE)
 			nbuf = maxbcache / BKVASIZE;
+		tuned_nbuf = 1;
+	} else
+		tuned_nbuf = 0;
 
-		/* XXX Avoid integer overflows later on with maxbufspace. */
-		maxbuf = (INT_MAX / 3) / BKVASIZE;
-		if (nbuf > maxbuf)
-			nbuf = maxbuf;
+	/* XXX Avoid unsigned long overflows later on with maxbufspace. */
+	maxbuf = (LONG_MAX / 3) / BKVASIZE;
+	if (nbuf > maxbuf) {
+		if (!tuned_nbuf)
+			printf("Warning: nbufs lowered from %d to %ld\n", nbuf,
+			    maxbuf);
+		nbuf = maxbuf;
 	}
 
 	/*
@@ -548,8 +555,8 @@ bufinit(void)
 	 * this may result in KVM fragmentation which is not handled optimally
 	 * by the system.
 	 */
-	maxbufspace = nbuf * BKVASIZE;
-	hibufspace = imax(3 * maxbufspace / 4, maxbufspace - MAXBSIZE * 10);
+	maxbufspace = (long)nbuf * BKVASIZE;
+	hibufspace = lmax(3 * maxbufspace / 4, maxbufspace - MAXBSIZE * 10);
 	lobufspace = hibufspace - MAXBSIZE;
 
 	lorunningspace = 512 * 1024;
@@ -577,7 +584,7 @@ bufinit(void)
  * be met. We try to size hidirtybuffers to 3/4 our buffer space assuming
  * BKVASIZE'd (8K) buffers.
  */
-	while (hidirtybuffers * BKVASIZE > 3 * hibufspace / 4) {
+	while ((long)hidirtybuffers * BKVASIZE > 3 * hibufspace / 4) {
 		hidirtybuffers >>= 1;
 	}
 	lodirtybuffers = hidirtybuffers / 2;
@@ -613,7 +620,7 @@ bfreekva(struct buf *bp)
 
 	if (bp->b_kvasize) {
 		atomic_add_int(&buffreekvacnt, 1);
-		atomic_subtract_int(&bufspace, bp->b_kvasize);
+		atomic_subtract_long(&bufspace, bp->b_kvasize);
 		vm_map_remove(buffer_map, (vm_offset_t) bp->b_kvabase,
 		    (vm_offset_t) bp->b_kvabase + bp->b_kvasize);
 		bp->b_kvasize = 0;
@@ -837,7 +844,7 @@ bufwrite(struct buf *bp)
 	 * Normal bwrites pipeline writes
 	 */
 	bp->b_runningbufspace = bp->b_bufsize;
-	atomic_add_int(&runningbufspace, bp->b_runningbufspace);
+	atomic_add_long(&runningbufspace, bp->b_runningbufspace);
 
 	if (!TD_IS_IDLETHREAD(curthread))
 		curthread->td_ru.ru_oublock++;
@@ -1983,7 +1990,7 @@ restart:
 
 			bp->b_kvabase = (caddr_t) addr;
 			bp->b_kvasize = maxsize;
-			atomic_add_int(&bufspace, bp->b_kvasize);
+			atomic_add_long(&bufspace, bp->b_kvasize);
 			atomic_add_int(&bufreusecnt, 1);
 		}
 		vm_map_unlock(buffer_map);
@@ -2707,7 +2714,7 @@ allocbuf(struct buf *bp, int size)
 				} else {
 					free(bp->b_data, M_BIOBUF);
 					if (bp->b_bufsize) {
-						atomic_subtract_int(
+						atomic_subtract_long(
 						    &bufmallocspace,
 						    bp->b_bufsize);
 						bufspacewakeup();
@@ -2744,7 +2751,7 @@ allocbuf(struct buf *bp, int size)
 				bp->b_bufsize = mbsize;
 				bp->b_bcount = size;
 				bp->b_flags |= B_MALLOC;
-				atomic_add_int(&bufmallocspace, mbsize);
+				atomic_add_long(&bufmallocspace, mbsize);
 				return 1;
 			}
 			origbuf = NULL;
@@ -2758,7 +2765,7 @@ allocbuf(struct buf *bp, int size)
 				origbufsize = bp->b_bufsize;
 				bp->b_data = bp->b_kvabase;
 				if (bp->b_bufsize) {
-					atomic_subtract_int(&bufmallocspace,
+					atomic_subtract_long(&bufmallocspace,
 					    bp->b_bufsize);
 					bufspacewakeup();
 					bp->b_bufsize = 0;

Modified: head/sys/sys/buf.h
==============================================================================
--- head/sys/sys/buf.h	Mon Mar 9 19:22:45 2009	(r189594)
+++ head/sys/sys/buf.h	Mon Mar 9 19:35:20 2009	(r189595)
@@ -446,10 +446,10 @@ buf_countdeps(struct buf *bp, int i)
 
 #ifdef _KERNEL
 extern int	nbuf;			/* The number of buffer headers */
-extern int	maxswzone;		/* Max KVA for swap structures */
-extern int	maxbcache;		/* Max KVA for buffer cache */
-extern int	runningbufspace;
-extern int	hibufspace;
+extern long	maxswzone;		/* Max KVA for swap structures */
+extern long	maxbcache;		/* Max KVA for buffer cache */
+extern long	runningbufspace;
+extern long	hibufspace;
 extern int	dirtybufthresh;
 extern int	bdwriteskip;
 extern int	dirtybufferflushes;

Modified: head/sys/sys/pipe.h
==============================================================================
--- head/sys/sys/pipe.h	Mon Mar 9 19:22:45 2009	(r189594)
+++ head/sys/sys/pipe.h	Mon Mar 9 19:35:20 2009	(r189595)
@@ -56,7 +56,7 @@
 /*
  * See sys_pipe.c for info on what these limits mean.
  */
-extern int maxpipekva;
+extern u_long maxpipekva;
 
 /*
  * Pipe buffer information.

Modified: head/sys/ufs/ffs/ffs_snapshot.c
==============================================================================
--- head/sys/ufs/ffs/ffs_snapshot.c	Mon Mar 9 19:22:45 2009	(r189594)
+++ head/sys/ufs/ffs/ffs_snapshot.c	Mon Mar 9 19:35:20 2009	(r189595)
@@ -2229,7 +2229,7 @@ ffs_copyonwrite(devvp, bp)
 		VI_UNLOCK(devvp);
 		if (saved_runningbufspace != 0) {
 			bp->b_runningbufspace = saved_runningbufspace;
-			atomic_add_int(&runningbufspace,
+			atomic_add_long(&runningbufspace,
 				       bp->b_runningbufspace);
 		}
 		return (0);		/* Snapshot gone */
@@ -2354,7 +2354,7 @@ ffs_copyonwrite(devvp, bp)
 	 */
 	if (saved_runningbufspace != 0) {
 		bp->b_runningbufspace = saved_runningbufspace;
-		atomic_add_int(&runningbufspace, bp->b_runningbufspace);
+		atomic_add_long(&runningbufspace, bp->b_runningbufspace);
 	}
 	return (error);
 }

Modified: head/sys/ufs/ffs/ffs_vfsops.c
==============================================================================
--- head/sys/ufs/ffs/ffs_vfsops.c	Mon Mar 9 19:22:45 2009	(r189594)
+++ head/sys/ufs/ffs/ffs_vfsops.c	Mon Mar 9 19:35:20 2009	(r189595)
@@ -1915,7 +1915,7 @@ ffs_geom_strategy(struct bufobj *bo, str
 				}
 			}
 			bp->b_runningbufspace = bp->b_bufsize;
-			atomic_add_int(&runningbufspace,
+			atomic_add_long(&runningbufspace,
 				       bp->b_runningbufspace);
 		} else {
 			error = ffs_copyonwrite(vp, bp);

Modified: head/sys/vm/vm_init.c
==============================================================================
--- head/sys/vm/vm_init.c	Mon Mar 9 19:22:45 2009	(r189594)
+++ head/sys/vm/vm_init.c	Mon Mar 9 19:35:20 2009	(r189595)
@@ -186,12 +186,12 @@ again:
 		panic("startup: table size inconsistency");
 
 	clean_map = kmem_suballoc(kernel_map, &kmi->clean_sva, &kmi->clean_eva,
-			nbuf * BKVASIZE + nswbuf * MAXPHYS, FALSE);
+	    (long)nbuf * BKVASIZE + (long)nswbuf * MAXPHYS, FALSE);
 	buffer_map = kmem_suballoc(clean_map, &kmi->buffer_sva,
-			&kmi->buffer_eva, nbuf * BKVASIZE, FALSE);
+	    &kmi->buffer_eva, (long)nbuf * BKVASIZE, FALSE);
 	buffer_map->system_map = 1;
 	pager_map = kmem_suballoc(clean_map, &kmi->pager_sva, &kmi->pager_eva,
-				nswbuf * MAXPHYS, FALSE);
+	    (long)nswbuf * MAXPHYS, FALSE);
 	pager_map->system_map = 1;
 	exec_map = kmem_suballoc(kernel_map, &minaddr, &maxaddr,
 		exec_map_entries * (ARG_MAX + (PAGE_SIZE * 3)), FALSE);

Modified: head/sys/vm/vnode_pager.c
==============================================================================
--- head/sys/vm/vnode_pager.c	Mon Mar 9 19:22:45 2009	(r189594)
+++ head/sys/vm/vnode_pager.c	Mon Mar 9 19:35:20 2009	(r189595)
@@ -525,7 +525,7 @@ vnode_pager_input_smlfs(object, m)
 			bp->b_bcount = bsize;
 			bp->b_bufsize = bsize;
 			bp->b_runningbufspace = bp->b_bufsize;
-			atomic_add_int(&runningbufspace, bp->b_runningbufspace);
+			atomic_add_long(&runningbufspace, bp->b_runningbufspace);
 
 			/* do the input */
 			bp->b_iooffset = dbtob(bp->b_blkno);
@@ -905,7 +905,7 @@ vnode_pager_generic_getpages(vp, m, byte
 	bp->b_bcount = size;
 	bp->b_bufsize = size;
 	bp->b_runningbufspace = bp->b_bufsize;
-	atomic_add_int(&runningbufspace, bp->b_runningbufspace);
+	atomic_add_long(&runningbufspace, bp->b_runningbufspace);
 
 	PCPU_INC(cnt.v_vnodein);
 	PCPU_ADD(cnt.v_vnodepgsin, count);
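
The "~ 44000" figure in the log can be checked with a little arithmetic on the old cap, (INT_MAX / 3) / BKVASIZE. What follows is a minimal userland sketch, not kernel code. It assumes BKVASIZE is 16384 (the commit itself does not state the value) and a 32-bit int. The example_* names and EXAMPLE_BKVASIZE are hypothetical, made up for illustration.

```c
#include <assert.h>
#include <limits.h>
#include <stdint.h>

/* Assumption: stand-in for the kernel's BKVASIZE; not taken from the diff. */
#define	EXAMPLE_BKVASIZE	16384

/*
 * Hypothetical helper: the largest nbuf for which 3 * nbuf * bkvasize
 * still fits in a signed type whose maximum is type_max.  This mirrors
 * the (INT_MAX / 3) / BKVASIZE and (LONG_MAX / 3) / BKVASIZE caps.
 */
static int64_t
example_nbuf_cap(int64_t type_max, int64_t bkvasize)
{

	assert(bkvasize > 0);
	return ((type_max / 3) / bkvasize);
}

/*
 * Hypothetical helper: nonzero if 3 * nbuf * bkvasize would not fit in
 * an int.  The product is computed in 64 bits so that the check itself
 * does not overflow.
 */
static int
example_overflows_int(int nbuf, int64_t bkvasize)
{

	return (3 * (int64_t)nbuf * bkvasize > INT_MAX);
}
```

Under those assumptions, example_nbuf_cap(INT_MAX, EXAMPLE_BKVASIZE) comes out to 43690, which lines up with the "~ 44000" in the log: one buffer more and 3 * maxbufspace no longer fits in an int. Passing LONG_MAX on a platform with a 64-bit long gives a cap far beyond any realistic nbuf.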