Date: Tue, 12 Mar 2013 18:32:33 -0600
From: Ian Lepore <ian@FreeBSD.org>
To: Konstantin Belousov <kostikbel@gmail.com>
Cc: arch@FreeBSD.org
Subject: Re: Unmapped buffers: to be merged in several days
Message-ID: <1363134753.1291.287.camel@revolution.hippie.lan>
In-Reply-To: <20130311091852.GR3794@kib.kiev.ua>
References: <20130311091852.GR3794@kib.kiev.ua>

[-- Attachment #1 --]
On Mon, 2013-03-11 at 11:18 +0200, Konstantin Belousov wrote:
> The latest version of the unmapped buffers patch is available at
> http://people.freebsd.org/~kib/misc/unmapped.17.patch
> The patch makes the user data buffers, as well as the page-ins, for
> UFS, the swap-in/out, clustering use unmapped buffers, removing the TLB
> shootdown overhead and buffer map contention and fragmentation.
> The ahci(4) and md(4) drivers are converted to accept unmapped BIO
> requests.
>
> Other drivers and geom classes get the compat mapped BIOs, the
> transient mapping is established by the geom down thread. The KVA
> for the transient mapping is carved from the buffer map, up to 10%
> of which is repurposed to the transient bio KVA. The hope is that
> the rest of drivers and geom classes will be converted to accept
> unmapped i/o shortly, making the transient map unused.
>
> The patch was tested by Peter Holm using the whole stress2 suite,
> on both i386 and amd64, on ahci(4) and ad(4) attached disks. ad(4)
> uses the transient remapping for unmapped requests, so the testing
> should cover both new and old i/o paths. The previous version of the
> patch is already used on some high-load machines by Scott Long, on
> ahci(4), isci(4) and mps(4). Brendan Fabeny did useful testing in his
> environment.
>
> The biggest change compared to the previous mail is the prevention of
> the deadlocks due to the bugs in the bufspace limit code. In HEAD,
> bufspace is equal to the size of the buffer map, which effectively
> makes the code which limits the total space allocated to buffers, by
> maxbufspace, a nop, due to the buffer map fragmentation.
>
> In the patch, filesystem metadata is no longer subject to the
> maxbufspace limit. Since the metadata buffers are always mapped, the
> buffers still have to fit into the buffer map, which provides a
> reasonable (but practically unreachable) upper bound on it.
> The non-metadata buffer allocations, both mapped and unmapped, are
> accounted against maxbufspace, as before. Effectively, this means that
> the maxbufspace is enforced on mapped and unmapped buffers separately.
>
> I intend to commit the change as is, with the following modifications:
> - the pmap_copy_pages() will be a stub for all architectures where
>   it was not tested. The only tested arches are i386, amd64 and
>   powerpc64.
> - For all architectures where pmap_copy_pages() is a stub, the
>   GB_UNMAPPED flag for the buffer allocators will be a nop.
>

FYI. I tested this for armv4 today, and it works. I had a (bogus)
used-before-init warning from gcc, and I had to add a couple lines of
code to pmap_copy_pages() to increment some variables; patch attached.
I think the pmap-v6 routine needs the same change, but I didn't get as
far as testing v6 yet.

I tested with both the md and ahci drivers on armv4. Performance seemed
to be about the same before and after, based on some crude tests such as
"time tar -cf - /mnt >/dev/null" where I had the ahci drive (a fast ssd
with a few hundred MB of data on ufs) mounted on /mnt.

I don't have a v6 board with a sata interface running yet, but I can
test with md; hopefully I'll get to it tomorrow.

-- Ian

[-- Attachment #2 --]
Minimal changes required to get unmapped.17 to build and run.

diff -r 179fcc6b2485 -r 2f1c61450df0 sys/arm/arm/pmap.c
--- a/sys/arm/arm/pmap.c	Tue Mar 12 13:41:10 2013 -0600
+++ b/sys/arm/arm/pmap.c	Tue Mar 12 13:45:34 2013 -0600
@@ -4458,6 +4458,9 @@ pmap_copy_pages(vm_page_t ma[], vm_offse
 		pmap_copy_page_offs_func(VM_PAGE_TO_PHYS(a_pg), a_pg_offset,
 		    VM_PAGE_TO_PHYS(b_pg), b_pg_offset, cnt);
 #endif
+		xfersize -= cnt;
+		a_offset += cnt;
+		b_offset += cnt;
 	}
 }

diff -r 179fcc6b2485 -r 2f1c61450df0 sys/dev/md/md.c
--- a/sys/dev/md/md.c	Tue Mar 12 13:41:10 2013 -0600
+++ b/sys/dev/md/md.c	Tue Mar 12 13:45:34 2013 -0600
@@ -753,9 +753,10 @@ mdstart_vnode(struct md_s *sc, struct bi
 	KASSERT(bp->bio_length <= MAXPHYS, ("bio_length %jd",
 	    (uintmax_t)bp->bio_length));
-	if ((bp->bio_flags & BIO_UNMAPPED) == 0)
+	if ((bp->bio_flags & BIO_UNMAPPED) == 0) {
+		pb = NULL;
 		aiov.iov_base = bp->bio_data;
-	else {
+	} else {
 		pb = getpbuf(&md_vnode_pbuf_freecnt);
 		pmap_qenter((vm_offset_t)pb->b_data, bp->bio_ma, bp->bio_ma_n);
 		aiov.iov_base = (void *)((vm_offset_t)pb->b_data +
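
For readers who want to see why the three added lines in the pmap.c hunk
matter, here is a rough userspace sketch of a page-chunked copy loop. It
is not the kernel code: sketch_copy_pages(), SKETCH_PAGE_SIZE and min_sz()
are made-up names, plain byte arrays stand in for vm_page_t, and memcpy()
stands in for pmap_copy_page_offs_func(). Only the loop bookkeeping
(xfersize, a_offset, b_offset, cnt) mirrors the names visible in the diff.
Without the three updates at the bottom of the loop, xfersize never
shrinks and the offsets never advance, so the loop would copy the same
chunk forever.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Tiny "page" size, chosen only so the example is easy to follow. */
#define SKETCH_PAGE_SIZE 16u

static size_t
min_sz(size_t a, size_t b)
{
	return (a < b ? a : b);
}

/*
 * Copy xfersize bytes from the page array a[], starting at byte offset
 * a_offset, to the page array b[], starting at byte offset b_offset.
 * Each pass copies the largest chunk that stays inside one source page
 * and one destination page.
 */
static void
sketch_copy_pages(unsigned char *a[], size_t a_offset,
    unsigned char *b[], size_t b_offset, size_t xfersize)
{
	while (xfersize > 0) {
		size_t a_pg_offset = a_offset % SKETCH_PAGE_SIZE;
		size_t b_pg_offset = b_offset % SKETCH_PAGE_SIZE;
		size_t cnt;

		cnt = min_sz(xfersize, SKETCH_PAGE_SIZE - a_pg_offset);
		cnt = min_sz(cnt, SKETCH_PAGE_SIZE - b_pg_offset);
		assert(cnt > 0);	/* every pass must make progress */

		memcpy(b[b_offset / SKETCH_PAGE_SIZE] + b_pg_offset,
		    a[a_offset / SKETCH_PAGE_SIZE] + a_pg_offset, cnt);

		/* The bookkeeping the patch adds: advance past the chunk. */
		xfersize -= cnt;
		a_offset += cnt;
		b_offset += cnt;
	}
}
```

Calling it with source and destination offsets that are not aligned with
each other (say 5 and 11) makes the loop cross page boundaries on both
sides at different times, which is the case where per-chunk bookkeeping
is easiest to get wrong.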
