From: Vladimir B. Grebenschikov
To: Matthew Dillon
Cc: Nate Lawson, arch@FreeBSD.ORG
Subject: Re: Database indexes and ram (was Re: using mem above 4Gb was: swapon some regular file)
Date: 09 Oct 2002 00:30:52 +0400
Message-Id: <1034109053.913.7.camel@vbook.express.ru>
In-Reply-To: <200210082015.g98KFFrq084625@apollo.backplane.com>

On Wed, 09.10.2002 at 00:15, Matthew Dillon wrote:
> :..
> :> It's often surprisingly effective to just access the index on disk and
> :> tune your VM cache instead.  You can lose performance by double-caching
> :> data.
> :
> :I don't want to cache disk data in extra memory - simply store the index
> :in RAM (no disk access at all) - I think it must be faster.
> :
> :> -Nate
> :
> :-- 
> :Vladimir B. Grebenschikov
> :vova@sw.ru, SWsoft, Inc.
>
>     If you have enough ram to hold the index, copying the index into
>     anonymous memory will be no slower or faster than mmap()ing it into ram.
>
>     If you do not have enough ram to hold the index then trying to store
>     it in ram won't work.

Matthew, please look at my initial posting.

My idea is to extend the RAM available for storing things such as an index
beyond the 4Gb (actually about 3Gb) limit when there is more physical RAM.
The current mmap (read: VM) implementation will map/cache only in memory
below 4Gb, regardless of the amount of physical RAM.

>     Database indexes, e.g. typically B+Trees or similar entities, are
>     highly cacheable and designed to reduce the number of seek/reads
>     required to do a lookup as much as possible.  This tends to result
>     in fairly good matching between our VM system and a fairly optimal
>     caching of the index.
>
>     For example, take a B+Tree with 64 elements per node and a database with
>     16 million records in it.  16 million records can be represented by
>     four levels in the B+Tree.  The first three levels
>     (64*64*64*sizeof(btreeelm)) = 262144 * sizeof(btreeelm), or, typically,
>     less than 16 MB of data which the VM system will cache at a high
>     priority due to the frequency of accesses.  The last B+Tree level in
>     this example represents the only seek/read that would have to occur on
>     the disk (if you didn't have enough memory to hold the entire index).
>
>     The only *PROBLEM* with using mmap() is that the database will not have
>     a very good idea about whether a particular mapped memory location is
>     resident or whether it will stall the process while doing a disk read,
>     which can seriously impact multi-threaded access to the database.
>     madvise() and mincore() can be used to some effect but that still means
>     making system calls that one would rather not have to make.  Still,
>     mmap() can be used to good effect and I usually find it easier to use
>     than having to write a userland shared memory disk cache manager.

Agree, but see above.

>     -Matt

-- 
Vladimir B. Grebenschikov
vova@sw.ru, SWsoft, Inc.
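
The B+Tree arithmetic quoted above can be checked with a short C sketch.
This is only an illustration of the numbers in the message, not code from
any database or from the FreeBSD tree. The function names btree_levels and
btree_level_elements are hypothetical, and the 32-byte element size used in
the note after the code is an assumption, since the message only says the
result is "typically less than 16 MB".

```c
#include <stdint.h>

/*
 * Number of levels needed so that fanout^levels >= records.
 * Assumes fanout >= 2. Hypothetical helper, named for this sketch only.
 */
unsigned btree_levels(uint64_t fanout, uint64_t records)
{
    unsigned levels = 0;
    uint64_t capacity = 1;

    while (capacity < records) {
        levels++;
        if (capacity > UINT64_MAX / fanout)
            break;          /* one more level already exceeds 2^64 - 1 entries */
        capacity *= fanout;
    }
    return levels;
}

/*
 * fanout^levels: the element count the quoted example uses for the
 * upper levels (64*64*64 for three levels at 64 elements per node).
 */
uint64_t btree_level_elements(uint64_t fanout, unsigned levels)
{
    uint64_t n = 1;

    while (levels-- > 0)
        n *= fanout;
    return n;
}
```

With the figures from the message, btree_levels(64, 16000000) gives 4 and
btree_level_elements(64, 3) gives 262144. At an assumed 32 bytes per element
that is 8388608 bytes (8 MB) for the upper levels, which is consistent with
the "less than 16 MB" estimate.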