From: Bruce Evans
Date: Thu, 6 Mar 2014 01:07:38 +1100 (EST)
To: Justin Hibbits
Cc: Poul-Henning Kamp, freebsd-arch@freebsd.org
Subject: Re: newcons fb driver
Message-ID: <20140305230458.L1053@besplex.bde.org>
References: <42130.1393829535@critter.freebsd.dk>
 <5314B2A2.3060100@pix.net> <60475.1393876211@critter.freebsd.dk>
 <20140304071445.Y3158@besplex.bde.org>

On Tue, 4 Mar 2014, Justin Hibbits wrote:

> On Mon, Mar 3, 2014 at 2:33 PM, Bruce Evans wrote:
>> Is newcons so much worse than syscons that it doesn't even have a
>> backing buffer? Backing buffers are a fundamental part of virtual
>> consoles. ...
>>
>> Why would newcons need to start supporting bytewise i/o now? Hardware
>> was rarely broken enough to need it even in FreeBSD-1, and syscons has
>> always been sloppy about it. ...
>
> Newcons does have a backing text buffer. I'm not even sure how we got
> on topic of a text buffer, when my question was regarding the frame
> buffer (sys/dev/vt/hw/fb/vt_fb.c), specifically vt_fb_bitbltchr(), and
> consists of only the following:

We got onto text buffers because it takes very large hardware
pessimizations to give slowness with a correctly implemented backing
text buffer. Newcons doesn't seem to have one of those, even for vga
(sys/dev/vt/hw/vga/vga.c) where it has a character output routine.

vga supports text mode using slow code, but I think it is only 2 times
slower than syscons in most cases. Bitmapped mode with 16x8 characters
and 256 colors takes 128 bytes/character (mostly wasted on coloring
individual pixels in characters), so it is inherently 64 times slower
than text mode, which takes 2 bytes/character. vt_fb_bitbltchr()
supports this. It doesn't seem to be especially pessimal, except for
its per-char interface. For simplicity it seems to use bytewise
accesses a bit too much, although it sometimes does 32-bit accesses.
This should only give a further slowness factor of 2, 4 or 8 unless the
frame buffer hardware is stupid.

vt_fb_setpixel() is an example of a really slow interface. I thought at
first that vt_fb_bitbltchr() was much faster, but now see that it is
not much more than a wrapper around a manually inlined vt_fb_setpixel().
The problems are easier to see in vt_fb_setpixel(). Setting pixels one
at a time is too slow. For 16x8 characters, that is at least 128 memory
accesses per character, and the interface prevents merging these
accesses. vt_fb_bitbltchr() could do some merging, but doesn't seem to
do any. Anyway, the support seems to be limited to modes with 2**8,
2**16 or 2**32 colors, so that there are 1, 2 or 4 bytes per pixel.
For modes with < 8 bits per pixel, vt_fb_setpixel() is even worse.
Maybe I misremember how colors are packed, but this is very hardware
dependent and I think vt only supports 1 simple type of packing.

A correctly implemented backing text buffer consists of characters (and
possibly attributes) stored in fast memory in a form for copying to a
frame buffer, with the frame buffer in text mode. A not so correctly
implemented backing text buffer is the same, except with the frame
buffer in pixel mode. The backing buffer is needed much more if the
frame buffer doesn't support text mode.

> * Does newcons support a background image, or is the mask there simply
> for drawing the text?
> * If it does support a background image, would it be good to
> double-buffer this, writing to the frame buffer only after the text is
> blitted?
> * If it doesn't support an image, would it be acceptable to prime the
> word to lay down with the background, and modify the pixels in the
> word accordingly?

I don't know what the mask does. How would you lay down the background
much faster than with blitting?

> Without some kind of optimization, newcons on powerpc is unacceptably slow.

How slow is that exactly? If the frame buffer speed is 50MB/sec, then
16x8 characters with 256 colors (128 bytes each) can be written at
about 390K characters/sec. This is acceptable. If the frame buffer is
much slower than that, then it is too slow, at least without hardware
scrolling.

Sample frame buffer speeds:

    my first PC (1982):  1MB/sec (slower through the CPU)
    ISA ET4000:          2.4MB/sec read,  5.9MB/sec write
    VLB ET4000/W32i:     6.8MB/sec read, 25.5MB/sec write
    PCI S3/868:          3.5MB/sec read, 23.1MB/sec write
    PCI S3/Virge:        4.1MB/sec read, 40.0MB/sec write
    PCI S3/Savage:       3.3MB/sec read, 25.8MB/sec write
    PCI Xpert:           5.3MB/sec read, 21.8MB/sec write
    PCI R9200SE:         5.8MB/sec read, 60.2MB/sec write
                         (but 120MB/sec through FPU)

I stopped measuring these speeds often about 20 years ago when they
became fast enough for simple i/o in text mode. The last 2 are newer.
Read speeds are still much lower for some reason (just the usual PCI
slowness?).
This would amplify any slowness from the hardware doing
read-modify-write accesses to convert to aligned 32/64/128/...-bit
accesses. I think bit blitting can't do any better than the hardware
by doing the read cycles itself. Everything needs to be buffered in
fast memory just to avoid reading slow memory.

Bruce
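
The point above about merging pixel writes can be illustrated with a
small C sketch. It is not the vt(4) code and makes several assumptions:
the function names, the fb_writes counter and the glyph layout (one byte
per row, most significant bit leftmost) are hypothetical; the frame
buffer is simulated with ordinary memory at 8 bits per pixel; and
memcpy() of 4 bytes stands in for a single 32-bit store. The merged
version builds each glyph row in fast memory first and never reads the
frame buffer.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define GLYPH_W 8	/* pixels per glyph row */
#define GLYPH_H 16	/* rows per glyph */

/* Hypothetical counter standing in for frame buffer write accesses. */
static unsigned long fb_writes;

/* Per-pixel path: one 8-bit store per pixel, so 128 stores per glyph. */
static void
draw_glyph_bytewise(uint8_t *fb, size_t stride, const uint8_t *glyph,
    uint8_t fg, uint8_t bg)
{
	assert(stride >= GLYPH_W);
	for (int y = 0; y < GLYPH_H; y++) {
		for (int x = 0; x < GLYPH_W; x++) {
			int on = (glyph[y] >> (7 - x)) & 1;

			fb[y * stride + x] = on ? fg : bg;
			fb_writes++;
		}
	}
}

/*
 * Merged path: expand a row into fast memory, then copy it out in
 * 4-byte units, so 2 stores per row and 32 stores per glyph.
 */
static void
draw_glyph_merged(uint8_t *fb, size_t stride, const uint8_t *glyph,
    uint8_t fg, uint8_t bg)
{
	uint8_t row[GLYPH_W];
	uint32_t word;

	assert(stride >= GLYPH_W);
	for (int y = 0; y < GLYPH_H; y++) {
		for (int x = 0; x < GLYPH_W; x++)
			row[x] = ((glyph[y] >> (7 - x)) & 1) ? fg : bg;
		for (int x = 0; x < GLYPH_W; x += 4) {
			/* memcpy() models one 32-bit store here. */
			memcpy(&word, &row[x], sizeof(word));
			memcpy(&fb[y * stride + x], &word, sizeof(word));
			fb_writes++;
		}
	}
}
```

Both functions produce the same pixels; only the number of simulated
frame buffer accesses differs (128 versus 32 per glyph), which is the
factor of 4 for byte versus 32-bit accesses mentioned earlier.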