From owner-svn-src-all@freebsd.org Sat Dec 12 14:08:30 2015 Return-Path: Delivered-To: svn-src-all@mailman.ysv.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by mailman.ysv.freebsd.org (Postfix) with ESMTP id 6542F9D6577; Sat, 12 Dec 2015 14:08:30 +0000 (UTC) (envelope-from kib@FreeBSD.org) Received: from repo.freebsd.org (repo.freebsd.org [IPv6:2610:1c1:1:6068::e6a:0]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (Client did not present a certificate) by mx1.freebsd.org (Postfix) with ESMTPS id 3C7311079; Sat, 12 Dec 2015 14:08:30 +0000 (UTC) (envelope-from kib@FreeBSD.org) Received: from repo.freebsd.org ([127.0.1.37]) by repo.freebsd.org (8.15.2/8.15.2) with ESMTP id tBCE8TlJ008559; Sat, 12 Dec 2015 14:08:29 GMT (envelope-from kib@FreeBSD.org) Received: (from kib@localhost) by repo.freebsd.org (8.15.2/8.15.2/Submit) id tBCE8TDS008558; Sat, 12 Dec 2015 14:08:29 GMT (envelope-from kib@FreeBSD.org) Message-Id: <201512121408.tBCE8TDS008558@repo.freebsd.org> X-Authentication-Warning: repo.freebsd.org: kib set sender to kib@FreeBSD.org using -f From: Konstantin Belousov Date: Sat, 12 Dec 2015 14:08:29 +0000 (UTC) To: src-committers@freebsd.org, svn-src-all@freebsd.org, svn-src-head@freebsd.org Subject: svn commit: r292128 - head/sys/dev/md X-SVN-Group: head MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit X-BeenThere: svn-src-all@freebsd.org X-Mailman-Version: 2.1.20 Precedence: list List-Id: "SVN commit messages for the entire src tree \(except for " user" and " projects" \)" List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Sat, 12 Dec 2015 14:08:30 -0000 Author: kib Date: Sat Dec 12 14:08:29 2015 New Revision: 292128 URL: https://svnweb.freebsd.org/changeset/base/292128 Log: In md(4) over vnode, correct handling of the unaligned unmapped io requests which page alignment + size is greater than MAXPHYS. Right now md(4) over vnode would use the physical buffer of the size MAXPHYS to map a data of size MAXPHYS + page offset of the user buffer. This typically corrupts next pbuf, or, if the pbuf used was the last pbuf in the map, the next page after the pbuf's map. Split request up to the size of io which fits into pbuf KVA with alignment, and retry if a part of the bio is left unprocessed. Reported by: Fabian Keil Tested by: Fabian Keil, pho Sponsored by: The FreeBSD Foundation MFC after: 2 weeks Modified: head/sys/dev/md/md.c Modified: head/sys/dev/md/md.c ============================================================================== --- head/sys/dev/md/md.c Fri Dec 11 23:52:08 2015 (r292127) +++ head/sys/dev/md/md.c Sat Dec 12 14:08:29 2015 (r292128) @@ -836,8 +836,8 @@ mdstart_vnode(struct md_s *sc, struct bi struct buf *pb; bus_dma_segment_t *vlist; struct thread *td; - off_t len, zerosize; - int ma_offs; + off_t iolen, len, zerosize; + int ma_offs, npages; switch (bp->bio_cmd) { case BIO_READ: @@ -858,6 +858,7 @@ mdstart_vnode(struct md_s *sc, struct bi pb = NULL; piov = NULL; ma_offs = bp->bio_ma_offset; + len = bp->bio_length; /* * VNODE I/O @@ -890,7 +891,6 @@ mdstart_vnode(struct md_s *sc, struct bi auio.uio_iovcnt = howmany(bp->bio_length, zerosize); piov = malloc(sizeof(*piov) * auio.uio_iovcnt, M_MD, M_WAITOK); auio.uio_iov = piov; - len = bp->bio_length; while (len > 0) { piov->iov_base = __DECONST(void *, zero_region); piov->iov_len = len; @@ -904,7 +904,6 @@ mdstart_vnode(struct md_s *sc, struct bi piov = malloc(sizeof(*piov) * bp->bio_ma_n, M_MD, M_WAITOK); auio.uio_iov = piov; vlist = (bus_dma_segment_t *)bp->bio_data; - len = bp->bio_length; while (len > 0) { piov->iov_base = (void *)(uintptr_t)(vlist->ds_addr + ma_offs); @@ -920,11 +919,20 @@ mdstart_vnode(struct md_s *sc, struct bi piov = auio.uio_iov; } else if ((bp->bio_flags & BIO_UNMAPPED) != 0) { pb = getpbuf(&md_vnode_pbuf_freecnt); - pmap_qenter((vm_offset_t)pb->b_data, bp->bio_ma, bp->bio_ma_n); - aiov.iov_base = (void *)((vm_offset_t)pb->b_data + ma_offs); - aiov.iov_len = bp->bio_length; + bp->bio_resid = len; +unmapped_step: + npages = atop(min(MAXPHYS, round_page(len + (ma_offs & + PAGE_MASK)))); + iolen = min(ptoa(npages) - (ma_offs & PAGE_MASK), len); + KASSERT(iolen > 0, ("zero iolen")); + pmap_qenter((vm_offset_t)pb->b_data, + &bp->bio_ma[atop(ma_offs)], npages); + aiov.iov_base = (void *)((vm_offset_t)pb->b_data + + (ma_offs & PAGE_MASK)); + aiov.iov_len = iolen; auio.uio_iov = &aiov; auio.uio_iovcnt = 1; + auio.uio_resid = iolen; } else { aiov.iov_base = bp->bio_data; aiov.iov_len = bp->bio_length; @@ -948,15 +956,21 @@ mdstart_vnode(struct md_s *sc, struct bi vn_finished_write(mp); } - if (pb) { - pmap_qremove((vm_offset_t)pb->b_data, bp->bio_ma_n); + if (pb != NULL) { + pmap_qremove((vm_offset_t)pb->b_data, npages); + if (error == 0) { + len -= iolen; + bp->bio_resid -= iolen; + ma_offs += iolen; + if (len > 0) + goto unmapped_step; + } relpbuf(pb, &md_vnode_pbuf_freecnt); } - if (piov != NULL) - free(piov, M_MD); - - bp->bio_resid = auio.uio_resid; + free(piov, M_MD); + if (pb == NULL) + bp->bio_resid = auio.uio_resid; return (error); }