From owner-freebsd-arch@FreeBSD.ORG Fri Feb 1 09:57:41 2013
Date: Fri, 1 Feb 2013 11:57:35 +0200
From: Konstantin Belousov <kostikbel@gmail.com>
To: Andriy Gapon
Cc: alc@FreeBSD.org, freebsd-arch@FreeBSD.org, Alan Cox
Subject: Re: kva size on amd64
Message-ID: <20130201095735.GM2522@kib.kiev.ua>
In-Reply-To: <510B8F2B.5070609@FreeBSD.org>
List-Id: Discussion related to FreeBSD architecture
On Fri, Feb 01, 2013 at 11:47:23AM +0200, Andriy Gapon wrote:
> on 31/01/2013 20:30 Alan Cox said the following:
> > Try developing a different allocation strategy for the kmem_map.
> > First-fit is clearly not working well for the ZFS ARC, because of
> > fragmentation.  For example, instead of further enlarging the kmem_map,
> > try splitting it into multiple submaps of the same total size,
> > kmem_map1, kmem_map2, etc.  Then, utilize these akin to the "old" and
> > "new" spaces of a copying garbage collector or storage segments in a
> > log-structured file system.  However, actual copying from an "old" space
> > to a "new" space may not be necessary.  By the time that the "new" space
> > from which you are currently allocating fills up or becomes sufficiently
> > fragmented that you can't satisfy an allocation, you've likely created
> > enough contiguous space in an "old" space.
> >
> > I'll hypothesize that just a couple of kmem_map submaps that are .625 of
> > the physical memory size would suffice.  The bottom line is that the
> > total virtual address space should be less than 2x physical memory.
> >
> > In fact, maybe the system starts off with just a single kmem_map, and
> > you only create additional kmem_maps on demand.  As someone who doesn't
> > use ZFS, that would actually save me physical memory that is currently
> > being wasted on unnecessary preallocated page table pages for my
> > kmem_map.  This begins to sound like option (1) that you propose above.
> >
> > This might also help to keep physical memory fragmentation in check.
>
> Alan,
>
> very interesting suggestions, thank you!
>
> Of course, this is quite a bit more work than just jacking up some limit :-)
> So, it could be a while before any code materializes.
>
> Actually, I have been obsessed for quite some time with the idea of
> confining ZFS to its own submap.  But ZFS does its allocations through
> malloc(9) and uma(9) (depending on configuration).  It seemed like a bit
> of work to provide support for per-zone or per-tag submaps in uma and
> malloc.
> What is your opinion of this approach?

Definitely not being Alan, I think that the rework of the ZFS memory
management should remove the use of uma or kmem_alloc() altogether.  From
what I heard, in part from you, there is no reason to keep the filesystem
caches mapped full time.

I hope to commit shortly a facility that would allow ZFS to easily manage
copying for i/o from the unmapped set of pages.  The checksumming you
mentioned would require some more work, but it does not look
insurmountable.

Having ZFS use raw vm_page_t for caching would also remove the pressure
on KVA.

>
> P.S.
> BTW, do I understand correctly that the reservation of kernel page tables
> happens through vm_map_insert -> pmap_growkernel ?

Yes.  E.g. kmem_suballoc->vm_map_find->vm_map_insert->pmap_growkernel.