From: Bruce Evans
Date: Fri, 14 Apr 2017 12:03:34 +0000 (UTC)
To: src-committers@freebsd.org, svn-src-all@freebsd.org, svn-src-head@freebsd.org
Subject: svn commit: r316827 - head/sys/dev/syscons
Message-Id: <201704141203.v3EC3YGn007153@repo.freebsd.org>

Author: bde
Date: Fri Apr 14 12:03:34 2017
New Revision: 316827
URL: https://svnweb.freebsd.org/changeset/base/316827

Log:
  Further unobfuscate the method of drawing the mouse cursor in vga planar
  mode.

  Don't manually unroll the 2 inner loops.
  On Haswell, doing so gave a speedup of about 0.5% (about 4 cycles per
  iteration out of 1400), but hard-coded a limit of width 9 and made
  better optimizations harder to see.

  gcc-4.2.1 -O does the unrolling anyway, unless tricked with a volatile
  hack.  gcc's unrolling is not very good and gives a speedup of about
  half as much (about 2 cycles per iteration).  (All timing on i386.)

  Manual unrolling was only feasible because the inner loop only iterates
  once or twice.  Usually twice, but a dynamic check is needed to decide,
  and was not moved from the second-innermost loop manually or by gcc.
  This commit basically adds another dynamic check in the inner loop.

  Cursor widths of 10-17 require 3 iterations in the inner loop and this
  is not so easy to unroll -- even gcc stops at 2.

Modified:
  head/sys/dev/syscons/scvgarndr.c

Modified: head/sys/dev/syscons/scvgarndr.c
==============================================================================
--- head/sys/dev/syscons/scvgarndr.c	Fri Apr 14 11:58:41 2017	(r316826)
+++ head/sys/dev/syscons/scvgarndr.c	Fri Apr 14 12:03:34 2017	(r316827)
@@ -1031,7 +1031,7 @@ draw_pxlmouse_planar(scr_stat *scp, int
 	int xoff, yoff;
 	int ymax;
 	u_short m;
-	int i, j;
+	int i, j, k;
 
 	line_width = scp->sc->adp->va_line_width;
 	xoff = (x - scp->xoff*8)%8;
@@ -1043,42 +1043,27 @@ draw_pxlmouse_planar(scr_stat *scp, int
 	outw(GDCIDX, 0xff08);		/* bit mask */
 	outw(GDCIDX, 0x0803);		/* data rotate/function select (and) */
 	p = scp->sc->adp->va_window + line_width*y + x/8;
-	if (x < scp->xpixel - 8) {
-		for (i = y, j = 0; i < ymax; ++i, ++j) {
-			m = ~((mouse_and_mask[j] & ~mouse_or_mask[j]) >> xoff);
-			readb(p);
-			writeb(p, m >> 8);
-			readb(p + 1);
-			writeb(p + 1, m);
-			p += line_width;
-		}
-	} else {
-		xoff += 8;
-		for (i = y, j = 0; i < ymax; ++i, ++j) {
-			m = ~((mouse_and_mask[j] & ~mouse_or_mask[j]) >> xoff);
-			readb(p);
-			writeb(p, m);
-			p += line_width;
+	for (i = y, j = 0; i < ymax; ++i, ++j) {
+		m = ~((mouse_and_mask[j] & ~mouse_or_mask[j]) >> xoff);
+		for (k = 0; k < 2; ++k) {
+			if (x + 8 * k < scp->xpixel) {
+				readb(p + k);
+				writeb(p + k, m >> (8 * (1 - k)));
+			}
 		}
+		p += line_width;
 	}
 	outw(GDCIDX, 0x1003);		/* data rotate/function select (or) */
 	p = scp->sc->adp->va_window + line_width*y + x/8;
-	if (x < scp->xpixel - 8) {
-		for (i = y, j = 0; i < ymax; ++i, ++j) {
-			m = mouse_or_mask[j] >> xoff;
-			readb(p);
-			writeb(p, m >> 8);
-			readb(p + 1);
-			writeb(p + 1, m);
-			p += line_width;
-		}
-	} else {
-		for (i = y, j = 0; i < ymax; ++i, ++j) {
-			m = mouse_or_mask[j] >> xoff;
-			readb(p);
-			writeb(p, m);
-			p += line_width;
+	for (i = y, j = 0; i < ymax; ++i, ++j) {
+		m = mouse_or_mask[j] >> xoff;
+		for (k = 0; k < 2; ++k) {
+			if (x + 8 * k < scp->xpixel) {
+				readb(p + k);
+				writeb(p + k, m >> (8 * (1 - k)));
+			}
 		}
+		p += line_width;
 	}
 	outw(GDCIDX, 0x0003);		/* data rotate/function select */
 }
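
Editorial sketch, not part of the commit: the following standalone C
fragment is one way to check that the rerolled inner loop stores the same
bytes as the hand-unrolled branches for the "or" pass.  It assumes plain
array stores in place of the readb()/writeb() pairs, so it says nothing
about VGA latch behaviour or timing.  The names store_unrolled,
store_rolled, shapes_agree and max_bytes_spanned are hypothetical and do
not appear in scvgarndr.c.  The last helper is only arithmetic implied by
the log message: with xoff at most 7, a cursor row of a given width can
touch up to (7 + width + 7) / 8 bytes.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Old shape: the branch is chosen once and the byte stores are unrolled. */
static void
store_unrolled(uint8_t *p, uint16_t mask, int xoff, int x, int xpixel)
{
	uint16_t m;

	assert(x >= 0 && x < xpixel);	/* cursor origin is on the screen */
	if (x < xpixel - 8) {
		m = mask >> xoff;
		p[0] = m >> 8;		/* high byte, left column */
		p[1] = m;		/* low byte, right column */
	} else {
		m = mask >> (xoff + 8);	/* same effect as "xoff += 8" */
		p[0] = m;
	}
}

/* New shape: one dynamic check per byte inside a 2-iteration loop. */
static void
store_rolled(uint8_t *p, uint16_t mask, int xoff, int x, int xpixel)
{
	uint16_t m = mask >> xoff;
	int k;

	assert(x >= 0 && x < xpixel);
	for (k = 0; k < 2; ++k) {
		if (x + 8 * k < xpixel)
			p[k] = m >> (8 * (1 - k));
	}
}

/* Returns 1 if both shapes store identical bytes for every x and xoff. */
static int
shapes_agree(int xpixel)
{
	uint8_t a[2], b[2];
	unsigned mask;
	int x, xoff;

	for (x = 0; x < xpixel; ++x) {
		for (xoff = 0; xoff < 8; ++xoff) {
			/* Sample 256 of the 65536 possible masks. */
			for (mask = 0; mask <= 0xffff; mask += 0x0101) {
				memset(a, 0, sizeof(a));
				memset(b, 0, sizeof(b));
				store_unrolled(a, (uint16_t)mask, xoff, x, xpixel);
				store_rolled(b, (uint16_t)mask, xoff, x, xpixel);
				if (memcmp(a, b, sizeof(a)) != 0)
					return (0);
			}
		}
	}
	return (1);
}

/* Worst-case number of bytes a cursor row of this width can touch. */
static int
max_bytes_spanned(int width)
{
	return ((7 + width + 7) / 8);
}
```

Calling shapes_agree(640) exercises every x on a 640-pixel row.  Under the
assumption above, max_bytes_spanned() gives 2 for width 9 and 3 for widths
10 through 17, which matches the iteration counts quoted in the log.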