Date:      Tue, 12 Mar 2013 18:32:33 -0600
From:      Ian Lepore <ian@FreeBSD.org>
To:        Konstantin Belousov <kostikbel@gmail.com>
Cc:        arch@FreeBSD.org
Subject:   Re: Unmapped buffers: to be merged in several days
Message-ID:  <1363134753.1291.287.camel@revolution.hippie.lan>
In-Reply-To: <20130311091852.GR3794@kib.kiev.ua>
References:  <20130311091852.GR3794@kib.kiev.ua>


[-- Attachment #1 --]
On Mon, 2013-03-11 at 11:18 +0200, Konstantin Belousov wrote:
> The latest version of the unmapped buffers patch is available at
> http://people.freebsd.org/~kib/misc/unmapped.17.patch
> The patch makes the user data buffers, as well as the page-ins, for
> UFS, the swap-in/out, clustering use unmapped buffers, removing the TLB
> shootdown overhead and buffer map contention and fragmentation.
> The ahci(4) and md(4) drivers are converted to accept unmapped BIO requests.
> 
> Other drivers and geom classes get the compat mapped BIOs, the
> transient mapping is established by the geom down thread. The KVA
> for the transient mapping is carved from the buffer map, up to 10%
> of which is repurposed to the transient bio KVA. The hope is that
> the rest of drivers and geom classes will be converted to accept
> unmapped i/o shortly, making the transient map unused.
> 
> The patch was tested by Peter Holm using the whole stress2 suite,
> on both i386 and amd64, on ahci(4) and ad(4) attached disks. ad(4)
> uses the transient remapping for unmapped requests, so the testing
> should cover both new and old i/o paths. The previous version of the
> patch is already used on some high-load machines by Scott Long, on
> ahci(4), isci(4) and mps(4). Brendan Fabeny did useful testing in his
> environment.
> 
> The biggest change compared to the previous mail is the prevention of
> the deadlocks due to the bugs in the bufspace limit code. In the HEAD,
> bufspace is equal to the size of the buffer map, which effectively
> makes the code which limits the total space allocated to buffers, by
> maxbufspace, a nop, due to the buffer map fragmentation.
> 
> In the patch, filesystem metadata is no longer subject to the maxbufspace
> limit. Since the metadata buffers are always mapped, the buffers
> still have to fit into the buffer map, which provides a reasonable
> (but practically unreachable) upper bound on it. The non-metadata buffer
> allocations, both mapped and unmapped, are accounted against maxbufspace,
> as before. Effectively, this means that the maxbufspace is forced on
> mapped and unmapped buffers separately.
> 
> I intend to commit the change as is, with the following modifications:
> - the pmap_copy_pages() will be a stub for all architectures where
>   it was not tested. The only tested arches are i386, amd64 and powerpc64.
> - For all architectures where pmap_copy_pages() is a stub, the GB_UNMAPPED
>   flag for the buffer allocators will be nop.
> 
> FYI.

I tested this for armv4 today, and it works.  I had a (bogus)
used-before-init warning from gcc, and I had to add a couple lines of
code to the pmap_copy_pages() to increment some variables; patch
attached.  I think the pmap-v6 routine needs the same change, but I
didn't get as far as testing v6 yet.  
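
For illustration, the role of those increments can be shown with a
user-space sketch of a chunked page-to-page copy loop.  This is not the
kernel routine: sketch_copy_pages(), min_sz() and SKETCH_PAGE_SIZE are
hypothetical names, pages are modeled as plain byte arrays, and memcpy()
stands in for the per-page copy function.  Only the loop structure and
the three updates at the bottom mirror the attached patch.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical page size for this sketch; the kernel uses PAGE_SIZE. */
#define SKETCH_PAGE_SIZE 4096u

static size_t
min_sz(size_t a, size_t b)
{
	return (a < b ? a : b);
}

/*
 * Copy xfersize bytes from byte offset a_offset within the page array
 * ma[] to byte offset b_offset within the page array mb[].  Each pass
 * copies the largest chunk that stays inside one source page and one
 * destination page.
 */
static void
sketch_copy_pages(unsigned char *ma[], size_t a_offset,
    unsigned char *mb[], size_t b_offset, size_t xfersize)
{
	assert(ma != NULL && mb != NULL);

	while (xfersize > 0) {
		size_t a_pg_offset = a_offset % SKETCH_PAGE_SIZE;
		size_t b_pg_offset = b_offset % SKETCH_PAGE_SIZE;
		size_t cnt;

		/* Clamp the chunk to the end of both current pages. */
		cnt = min_sz(xfersize, SKETCH_PAGE_SIZE - a_pg_offset);
		cnt = min_sz(cnt, SKETCH_PAGE_SIZE - b_pg_offset);

		memcpy(mb[b_offset / SKETCH_PAGE_SIZE] + b_pg_offset,
		    ma[a_offset / SKETCH_PAGE_SIZE] + a_pg_offset, cnt);

		/*
		 * Advance past the chunk just copied.  Without these
		 * three updates xfersize never reaches zero and the
		 * same chunk is copied forever.
		 */
		xfersize -= cnt;
		a_offset += cnt;
		b_offset += cnt;
	}
}
```

In this sketch cnt is always at least 1 while xfersize is positive, so
the loop terminates once the three updates are present.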

I tested with both the md and ahci drivers on armv4.  Performance seemed
to be about the same before and after based on some crude tests such as
"time tar -cf - /mnt >/dev/null" where I had the ahci drive (a fast ssd
with a few hundred MB of data on ufs) mounted on /mnt.

I don't have a v6 board with a SATA interface running yet, but I can
test with md; hopefully I'll get to it tomorrow.

-- Ian


[-- Attachment #2 --]
Minimal changes required to get unmapped.17 to build and run.

diff -r 179fcc6b2485 -r 2f1c61450df0 sys/arm/arm/pmap.c
--- a/sys/arm/arm/pmap.c	Tue Mar 12 13:41:10 2013 -0600
+++ b/sys/arm/arm/pmap.c	Tue Mar 12 13:45:34 2013 -0600
@@ -4458,6 +4458,9 @@ pmap_copy_pages(vm_page_t ma[], vm_offse
 		pmap_copy_page_offs_func(VM_PAGE_TO_PHYS(a_pg), a_pg_offset,
 		    VM_PAGE_TO_PHYS(b_pg), b_pg_offset, cnt);
 #endif
+		xfersize -= cnt;
+		a_offset += cnt;
+		b_offset += cnt;
 	}
 }
 
diff -r 179fcc6b2485 -r 2f1c61450df0 sys/dev/md/md.c
--- a/sys/dev/md/md.c	Tue Mar 12 13:41:10 2013 -0600
+++ b/sys/dev/md/md.c	Tue Mar 12 13:45:34 2013 -0600
@@ -753,9 +753,10 @@ mdstart_vnode(struct md_s *sc, struct bi
 
 	KASSERT(bp->bio_length <= MAXPHYS, ("bio_length %jd",
 	    (uintmax_t)bp->bio_length));
-	if ((bp->bio_flags & BIO_UNMAPPED) == 0)
+	if ((bp->bio_flags & BIO_UNMAPPED) == 0) {
+		pb = NULL;
 		aiov.iov_base = bp->bio_data;
-	else {
+	} else {
 		pb = getpbuf(&md_vnode_pbuf_freecnt);
 		pmap_qenter((vm_offset_t)pb->b_data, bp->bio_ma, bp->bio_ma_n);
 		aiov.iov_base = (void *)((vm_offset_t)pb->b_data +
