From owner-freebsd-ia64@FreeBSD.ORG Wed Oct 12 18:54:52 2005
Date: Wed, 12 Oct 2005 14:54:43 -0400
From: Kris Kennaway
To: ia64@FreeBSD.org, ppc@FreeBSD.org
Message-ID: <20051012185443.GA59565@xor.obsecurity.org>
Mime-Version: 1.0
Content-Type: multipart/signed; micalg=pgp-sha1; protocol="application/pgp-signature"; boundary="2fHTh5uZTiUOsy+g"
Content-Disposition: inline
User-Agent: Mutt/1.4.2.1i
Subject: [kris@obsecurity.org: int/long confusion with maxbcache and maxswzone (fixes 6.0 on >12GB machines)]
List-Id: Porting FreeBSD to the IA-64

--2fHTh5uZTiUOsy+g
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline

FYI, you probably also want to add the VM_*_MAX defines on ia64 and
ppc.

Kris

----- Forwarded message from Kris Kennaway -----

Date: Tue, 11 Oct 2005 17:38:00 -0400
From: Kris Kennaway
To: current@FreeBSD.org, sparc64@FreeBSD.org
Cc: des@FreeBSD.org
Subject: int/long confusion with maxbcache and maxswzone (fixes 6.0 on >12GB machines)
User-Agent: Mutt/1.4.2.1i

A few weeks ago I reported that bufinit() on sparc64 machines with more
than 12GB of RAM goes into an infinite loop because a 32-bit integer
counter overflows.  On 5.x it was possible to work around this with
the kern.maxbcache tunable, but this didn't work on 6.0 or above.

It turns out the problem began here:

----
Revision 1.67, Mon Nov 8 18:20:02 2004 UTC (11 months ago) by des
Branch: MAIN
Changes since 1.66: +17 -17 lines

#include instead of (the former includes the latter, but also declares
variables which are defined in kern/subr_param.c).

Change some VM parameters from quad_t to unsigned long.  They refer to
quantities (size limits for text, heap and stack segments) which must
necessarily be smaller than the size of the address space, so long is
adequate on all platforms.

MFC after: 1 week
----

which contained:

-int maxswzone; /* max swmeta KVA storage */
-int maxbcache; /* max buffer cache KVA storage */
+long maxswzone; /* max swmeta KVA storage */
+long maxbcache; /* max buffer cache KVA storage */

However, des forgot to change the other declaration of maxbcache in :

extern int maxbcache; /* Max KVA for buffer cache */

In fact, it's a good thing he didn't.  On sparc64, making that
variable a long causes 32-bit integer overflows elsewhere, which lead
to severe filesystem damage on systems with >12GB RAM.  With the above
bug this is reduced to a hang at boot.

The hang occurs because maxbcache is not capped to a maximum value on
sparc64, and a loop termination condition never occurs because of a
32-bit integer overflow.

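To make the failure mode concrete, here is a minimal sketch in C of how a size that no longer fits in 32 bits can keep a halving loop from ever exiting.  It is not the bufinit() code: wrap32(), halving_steps(), and the sizes passed to them are hypothetical names and values chosen only for illustration.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Value a 64-bit quantity takes once it is squeezed into a 32-bit
 * two's-complement int.  Written so that it never relies on signed
 * overflow, which is undefined behaviour in C.
 */
static int32_t
wrap32(int64_t v)
{
	uint32_t u = (uint32_t)v;	/* reduction modulo 2^32 is well defined */

	if (u <= (uint32_t)INT32_MAX)
		return ((int32_t)u);
	return ((int32_t)(u - 0x80000000u) + INT32_MIN);
}

/*
 * Hypothetical halving loop: shrink n until it no longer exceeds limit.
 * Returns the number of halvings, or -1 if the loop is still running
 * after max_iter steps, i.e. it would otherwise spin forever.
 */
static int
halving_steps(int32_t n, int32_t limit, int max_iter)
{
	int steps = 0;

	assert(n >= 0 && max_iter > 0);
	while (n > limit) {
		if (steps == max_iter)
			return (-1);
		n >>= 1;
		steps++;
	}
	return (steps);
}
```

With these helpers, wrap32(14LL << 30) (a 14GB byte count, picked as an example) comes out as INT32_MIN.  Feeding that as the limit, halving_steps(1000, wrap32(14LL << 30), 64) gives up and returns -1, because n reaches 0 and 0 is still greater than a negative limit.  A sane limit such as halving_steps(1000, 100, 64) terminates after 4 halvings.  Capping the size well below 2^31, as a VM_BCACHE_SIZE_MAX of 400MB would, keeps the value in range so the wrap never happens.
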
On amd64 it's capped to

/*
 * Ceiling on size of buffer cache (really only effects write queueing,
 * the VM page cache is not effected), can be changed via
 * the kern.maxbcache /boot/loader.conf variable.
 */
#ifndef VM_BCACHE_SIZE_MAX
#define VM_BCACHE_SIZE_MAX (400 * 1024 * 1024)
#endif

so large-memory amd64 systems never see it.  ia64 and ppc would also
hang at boot with >12GB, I think.

On 5.x, the same hang exists, but you can work around it with the
tunable.  This tunable was broken by the long/int mismatch on 6.0, so
sparc64 systems with >12GB were unusable.

This patch reverts the above int->long change, and adds definitions
for VM_BCACHE_SIZE_MAX and VM_SWZONE_SIZE_MAX on sparc64 copied from
amd64.  Actually, they should probably be added on other
architectures too (ia64, ppc).

Can someone please review?

Kris

Index: kern/subr_param.c
===================================================================
RCS file: /home/ncvs/src/sys/kern/subr_param.c,v
retrieving revision 1.71
diff -u -r1.71 subr_param.c
--- kern/subr_param.c 16 Apr 2005 15:07:41 -0000 1.71
+++ kern/subr_param.c 11 Oct 2005 21:09:01 -0000
@@ -75,8 +75,8 @@
 int ncallout; /* maximum # of timer events */
 int nbuf;
 int nswbuf;
-long maxswzone; /* max swmeta KVA storage */
-long maxbcache; /* max buffer cache KVA storage */
+int maxswzone; /* max swmeta KVA storage */
+int maxbcache; /* max buffer cache KVA storage */
 int maxpipekva; /* Limit on pipe KVA */
 u_long maxtsiz; /* max text size */
 u_long dfldsiz; /* initial data size limit */
@@ -106,11 +106,11 @@
 #ifdef VM_SWZONE_SIZE_MAX
 	maxswzone = VM_SWZONE_SIZE_MAX;
 #endif
-	TUNABLE_LONG_FETCH("kern.maxswzone", &maxswzone);
+	TUNABLE_INT_FETCH("kern.maxswzone", &maxswzone);
 #ifdef VM_BCACHE_SIZE_MAX
 	maxbcache = VM_BCACHE_SIZE_MAX;
 #endif
-	TUNABLE_LONG_FETCH("kern.maxbcache", &maxbcache);
+	TUNABLE_INT_FETCH("kern.maxbcache", &maxbcache);
 
 	maxtsiz = MAXTSIZ;
 	TUNABLE_ULONG_FETCH("kern.maxtsiz", &maxtsiz);
Index: sparc64/include/param.h
===================================================================
RCS file: /home/ncvs/src/sys/sparc64/include/param.h,v
retrieving revision 1.19
diff -u -r1.19 param.h
--- sparc64/include/param.h 20 Nov 2004 02:29:50 -0000 1.19
+++ sparc64/include/param.h 11 Oct 2005 20:54:30 -0000
@@ -110,6 +110,22 @@
 #define KSTACK_GUARD_PAGES 1 /* pages of kstack guard; 0 disables */
 #define PCPU_PAGES 1
 
+/*
+ * Ceiling on amount of swblock kva space, can be changed via
+ * the kern.maxswzone /boot/loader.conf variable.
+ */
+#ifndef VM_SWZONE_SIZE_MAX
+#define VM_SWZONE_SIZE_MAX (32 * 1024 * 1024)
+#endif
+
+/*
+ * Ceiling on size of buffer cache (really only effects write queueing,
+ * the VM page cache is not effected), can be changed via
+ * the kern.maxbcache /boot/loader.conf variable.
+ */
+#ifndef VM_BCACHE_SIZE_MAX
+#define VM_BCACHE_SIZE_MAX (400 * 1024 * 1024)
+#endif
 
 /*
  * Mach derived conversion macros
Index: sparc64/include/vmparam.h
===================================================================
RCS file: /home/ncvs/src/sys/sparc64/include/vmparam.h,v
retrieving revision 1.14
diff -u -r1.14 vmparam.h
--- sparc64/include/vmparam.h 27 Dec 2002 19:31:26 -0000 1.14
+++ sparc64/include/vmparam.h 11 Oct 2005 20:54:30 -0000
@@ -171,6 +171,13 @@
 #endif
 
 /*
+ * Ceiling on amount of kmem_map kva space.
+ */
+#ifndef VM_KMEM_SIZE_MAX
+#define VM_KMEM_SIZE_MAX (400 * 1024 * 1024)
+#endif
+
+/*
  * Initial pagein size of beginning of executable file.
  */
 #ifndef VM_INITIAL_PAGEIN

----- End forwarded message -----

--2fHTh5uZTiUOsy+g
Content-Type: application/pgp-signature
Content-Disposition: inline

-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.2 (FreeBSD)

iD8DBQFDTVvyWry0BWjoQKURAiNdAJ4mnfmNDYH/WmcEsCVa4X18EG4nGgCfbwM2
Cd8y53QKwjoQPxhg7/Mh4BQ=
=OAmX
-----END PGP SIGNATURE-----

--2fHTh5uZTiUOsy+g--