From owner-svn-src-head@FreeBSD.ORG Sun May 11 01:58:57 2014 Return-Path: Delivered-To: svn-src-head@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [8.8.178.115]) (using TLSv1 with cipher ADH-AES256-SHA (256/256 bits)) (No client certificate requested) by hub.freebsd.org (Postfix) with ESMTPS id 4015930A; Sun, 11 May 2014 01:58:57 +0000 (UTC) Received: from svn.freebsd.org (svn.freebsd.org [IPv6:2001:1900:2254:2068::e6a:0]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (Client did not present a certificate) by mx1.freebsd.org (Postfix) with ESMTPS id 2D28E262E; Sun, 11 May 2014 01:58:57 +0000 (UTC) Received: from svn.freebsd.org ([127.0.1.70]) by svn.freebsd.org (8.14.8/8.14.8) with ESMTP id s4B1wvu6072382; Sun, 11 May 2014 01:58:57 GMT (envelope-from nwhitehorn@svn.freebsd.org) Received: (from nwhitehorn@localhost) by svn.freebsd.org (8.14.8/8.14.8/Submit) id s4B1wvFA072381; Sun, 11 May 2014 01:58:57 GMT (envelope-from nwhitehorn@svn.freebsd.org) Message-Id: <201405110158.s4B1wvFA072381@svn.freebsd.org> From: Nathan Whitehorn Date: Sun, 11 May 2014 01:58:57 +0000 (UTC) To: src-committers@freebsd.org, svn-src-all@freebsd.org, svn-src-head@freebsd.org Subject: svn commit: r265864 - head/sys/dev/vt/hw/ofwfb X-SVN-Group: head MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit X-BeenThere: svn-src-head@freebsd.org X-Mailman-Version: 2.1.18 Precedence: list List-Id: SVN commit messages for the src tree for head/-current List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Sun, 11 May 2014 01:58:57 -0000 Author: nwhitehorn Date: Sun May 11 01:58:56 2014 New Revision: 265864 URL: http://svnweb.freebsd.org/changeset/base/265864 Log: Make ofwfb not be painfully slow. This reduces the time for a verbose boot on my G4 iBook by more than half. Still 10% slower than syscons, but that's much better than a factor of 2. The slowness had to do with pathological write performance on 8-bit framebuffers, which are almost universally used on Open Firmware systems. Writing 1 byte at a time, potentially nonconsecutively, resulted in many extra PCI write cycles. This patch, in the common case where it's writing one or several characters in an 8x8 font, gangs the writes together into a set of 32-bit writes. This is a port of r143830 to vt(4). The EFI framebuffer is also extremely slow, probably for the same reason, and the same patch will likely help there. Modified: head/sys/dev/vt/hw/ofwfb/ofwfb.c Modified: head/sys/dev/vt/hw/ofwfb/ofwfb.c ============================================================================== --- head/sys/dev/vt/hw/ofwfb/ofwfb.c Sun May 11 01:44:11 2014 (r265863) +++ head/sys/dev/vt/hw/ofwfb/ofwfb.c Sun May 11 01:58:56 2014 (r265864) @@ -136,6 +136,10 @@ ofwfb_bitbltchr(struct vt_device *vd, co uint32_t fgc, bgc; int c; uint8_t b, m; + union { + uint32_t l; + uint8_t c[4]; + } ch1, ch2; fgc = sc->sc_colormap[fg]; bgc = sc->sc_colormap[bg]; @@ -147,36 +151,70 @@ ofwfb_bitbltchr(struct vt_device *vd, co return; line = (sc->sc_stride * top) + left * sc->sc_depth/8; - for (; height > 0; height--) { - for (c = 0; c < width; c++) { - if (c % 8 == 0) + if (mask == NULL && sc->sc_depth == 8 && (width % 8 == 0)) { + for (; height > 0; height--) { + for (c = 0; c < width; c += 8) { b = *src++; - else - b <<= 1; - if (mask != NULL) { + + /* + * Assume that there is more background than + * foreground in characters and init accordingly + */ + ch1.l = ch2.l = (bg << 24) | (bg << 16) | + (bg << 8) | bg; + + /* + * Calculate 2 x 4-chars at a time, and then + * write these out. + */ + if (b & 0x80) ch1.c[0] = fg; + if (b & 0x40) ch1.c[1] = fg; + if (b & 0x20) ch1.c[2] = fg; + if (b & 0x10) ch1.c[3] = fg; + + if (b & 0x08) ch2.c[0] = fg; + if (b & 0x04) ch2.c[1] = fg; + if (b & 0x02) ch2.c[2] = fg; + if (b & 0x01) ch2.c[3] = fg; + + *(uint32_t *)(sc->sc_addr + line + c) = ch1.l; + *(uint32_t *)(sc->sc_addr + line + c + 4) = + ch2.l; + } + line += sc->sc_stride; + } + } else { + for (; height > 0; height--) { + for (c = 0; c < width; c++) { if (c % 8 == 0) - m = *mask++; + b = *src++; else - m <<= 1; - /* Skip pixel write, if mask has no bit set. */ - if ((m & 0x80) == 0) - continue; - } - switch(sc->sc_depth) { - case 8: - *(uint8_t *)(sc->sc_addr + line + c) = - b & 0x80 ? fg : bg; - break; - case 32: - *(uint32_t *)(sc->sc_addr + line + 4*c) = - (b & 0x80) ? fgc : bgc; - break; - default: - /* panic? */ - break; + b <<= 1; + if (mask != NULL) { + if (c % 8 == 0) + m = *mask++; + else + m <<= 1; + /* Skip pixel write, if mask not set. */ + if ((m & 0x80) == 0) + continue; + } + switch(sc->sc_depth) { + case 8: + *(uint8_t *)(sc->sc_addr + line + c) = + b & 0x80 ? fg : bg; + break; + case 32: + *(uint32_t *)(sc->sc_addr + line + 4*c) + = (b & 0x80) ? fgc : bgc; + break; + default: + /* panic? */ + break; + } } + line += sc->sc_stride; } - line += sc->sc_stride; } }