Date: Sat, 20 Aug 2011 20:41:47 +0300
From: Kostik Belousov <kostikbel@gmail.com>
To: alc@freebsd.org
Cc: freebsd-stable@freebsd.org, perryh@pluto.rain.com, "Alexander V. Chernikov" <melifaro@ipfw.ru>, daniel@digsys.bg
Subject: Re: 32GB limit per swap device?
Message-ID: <20110820174147.GW17489@deviant.kiev.zoral.com.ua>
In-Reply-To: <CAJUyCcMc7m65c_XjHNFi0A4cHHySC1brLS7HdivstxeOi6uFQw@mail.gmail.com>
References: <4E4143A6.6030307@digsys.bg> <935F8EC2-88E0-45A3-BE8B-7210BE223BC5@mac.com> <4e42a0c0.e2t/9MF98O3HFjb1%perryh@pluto.rain.com> <4E4CCA6C.8020408@ipfw.ru> <CAJUyCcMc7m65c_XjHNFi0A4cHHySC1brLS7HdivstxeOi6uFQw@mail.gmail.com>

On Sat, Aug 20, 2011 at 12:33:29PM -0500, Alan Cox wrote:
> On Thu, Aug 18, 2011 at 3:16 AM, Alexander V. Chernikov <melifaro@ipfw.ru> wrote:
>
> > On 10.08.2011 19:16, perryh@pluto.rain.com wrote:
> >
> >> Chuck Swiger <cswiger@mac.com> wrote:
> >>
> >>> On Aug 9, 2011, at 7:26 AM, Daniel Kalchev wrote:
> >>>
> >>>> I am trying to set up 64GB partitions for swap for a system that
> >>>> has 64GB of RAM (with the idea to dump kernel core etc). But, on
> >>>> 8-stable as of today I get:
> >>>>
> >>>> WARNING: reducing size to maximum of 67108864 blocks per swap unit
> >>>>
> >>>> Is there a workaround for this limitation?
> >>>>
> >>>
> > Another interesting question:
> >
> > The swap pager operates in page blocks (PAGE_SIZE=4k on common arch).
> >
> > The block device size is passed to swaponsomething() as a number of
> > _disk_ blocks (e.g. in DEV_BSIZE=512). After that, the maximum object
> > count check of the kernel b-lists (on top of which the swap pager is
> > built) is enforced.
> >
> > The (possible) problem is that the real object count we will operate
> > on is not the value passed to swaponsomething(), since the check is
> > done in the wrong units.
> >
> > We should check the b-list limit against the (X * DEV_BSIZE / PAGE_SIZE)
> > value, which is roughly (X / 8), so we should be able to address
> > 32*8=256G.
> >
> > The code should look like this:
> >
> > Index: vm/swap_pager.c
> > ===================================================================
> > --- vm/swap_pager.c     (revision 223877)
> > +++ vm/swap_pager.c     (working copy)
> > @@ -2129,6 +2129,15 @@ swaponsomething(struct vnode *vp, void *id, u_long
> >         u_long mblocks;
> >
> >         /*
> > +        * nblks is in DEV_BSIZE'd chunks, convert to PAGE_SIZE'd chunks.
> > +        * First chop nblks off to page-align it, then convert.
> > +        *
> > +        * sw->sw_nblks is in page-sized chunks now too.
> > +        */
> > +       nblks &= ~(ctodb(1) - 1);
> > +       nblks = dbtoc(nblks);
> > +
> > +       /*
> >          * If we go beyond this, we get overflows in the radix
> >          * tree bitmap code.
> >          */
> > @@ -2138,14 +2147,6 @@ swaponsomething(struct vnode *vp, void *id, u_long
> >                     mblocks);
> >                 nblks = mblocks;
> >         }
> > -       /*
> > -        * nblks is in DEV_BSIZE'd chunks, convert to PAGE_SIZE'd chunks.
> > -        * First chop nblks off to page-align it, then convert.
> > -        *
> > -        * sw->sw_nblks is in page-sized chunks now too.
> > -        */
> > -       nblks &= ~(ctodb(1) - 1);
> > -       nblks = dbtoc(nblks);
> >
> >         sp = malloc(sizeof *sp, M_VMPGDATA, M_WAITOK | M_ZERO);
> >         sp->sw_vp = vp;
> >
> >
> > (move the page recalculation before the b-list check)
> >
> >
> > Can someone comment on this?
> >
> >
> I believe that you are correct.  Have you tried testing this change on a
> large swap device?

I probably agree too, but I am in the process of re-reading the swap
code, and I do not quite believe in the limit.

When the initial code was committed, our daddr_t was 32bit; I checked
the RELENG_4 sources. The current code uses int64_t for daddr_t. My
impression right now is that we only utilize the low 32 bits of daddr_t.

Especially interesting is the following typedef:
typedef uint32_t        u_daddr_t;      /* unsigned disk address */
which (correctly) means that the typical mask (u_daddr_t)-1 is 0xffffffff.

I wonder whether we could just use the full 64 bits and de facto remove
the limitation on the swap partition size.
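
For reference, the unit arithmetic discussed in this thread can be sketched in standalone C. This is only an illustration, not the kernel code: the EX_* constants and ex_* function names are made up for the example. DEV_BSIZE=512 and PAGE_SIZE=4k are the values assumed in the quoted message, and the limit is the number printed in the warning.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed values for illustration; the kernel takes these from its headers. */
#define EX_DEV_BSIZE   512u          /* disk block size in bytes */
#define EX_PAGE_SIZE   4096u         /* page size in bytes on common arch */
#define EX_MAX_BLOCKS  67108864ull   /* limit quoted in the warning message */

/*
 * Convert a count of DEV_BSIZE'd blocks to a count of pages: first round
 * down to a page boundary, then divide. This mirrors the intent of the
 * ctodb()/dbtoc() lines in the quoted diff.
 */
static uint64_t
ex_dev_blocks_to_pages(uint64_t nblks)
{
	uint64_t per_page = EX_PAGE_SIZE / EX_DEV_BSIZE;

	assert(EX_PAGE_SIZE % EX_DEV_BSIZE == 0);
	nblks -= nblks % per_page;
	return (nblks / per_page);
}

/* Largest swap size in bytes when the limit counts units of unit_size bytes. */
static uint64_t
ex_max_swap_bytes(uint64_t unit_size)
{
	return (EX_MAX_BLOCKS * unit_size);
}

/* Order before the patch: clamp the disk-block count, then convert to pages. */
static uint64_t
ex_pages_check_first(uint64_t nblks)
{
	if (nblks > EX_MAX_BLOCKS)
		nblks = EX_MAX_BLOCKS;
	return (ex_dev_blocks_to_pages(nblks));
}

/* Order proposed in the patch: convert to pages, then clamp the page count. */
static uint64_t
ex_pages_convert_first(uint64_t nblks)
{
	uint64_t pages = ex_dev_blocks_to_pages(nblks);

	if (pages > EX_MAX_BLOCKS)
		pages = EX_MAX_BLOCKS;
	return (pages);
}
```

Under these assumptions, a 64GB partition is 134217728 disk blocks. ex_pages_check_first() returns 8388608 pages (32GB), while ex_pages_convert_first() returns 16777216 pages (the full 64GB). ex_max_swap_bytes() gives 32GB for 512-byte units and 256GB for 4k units, which matches the 32*8=256G estimate above.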