Date: Wed, 16 May 2018 20:56:34 +1000 (EST) From: Bruce Evans <brde@optusnet.com.au> To: =?UTF-8?Q?Jean-S=c3=a9bastien_P=c3=a9dron?= <dumbbell@freebsd.org> Cc: src-committers@freebsd.org, svn-src-all@freebsd.org, svn-src-head@freebsd.org Subject: Re: svn commit: r333669 - in head/sys: dev/vt kern sys teken Message-ID: <20180516191245.T5082@besplex.bde.org> In-Reply-To: <201805160901.w4G912FD056132@repo.freebsd.org> References: <201805160901.w4G912FD056132@repo.freebsd.org>
next in thread | previous in thread | raw e-mail | index | archive | help
On Wed, 16 May 2018, [UTF-8] Jean-S=C3=A9bastien P=C3=A9dron wrote: > Author: dumbbell > Date: Wed May 16 09:01:02 2018 > New Revision: 333669 > URL: https://svnweb.freebsd.org/changeset/base/333669 > > Log: > teken, vt(4): New callbacks to lock the terminal once > > ... to process input, instead of inside each smaller operations such as > appending a character or moving the cursor forward. > .... > The goal is to improve input processing speed of vt(4). As a benchmark, > here is the time taken to write a text file of 360 000 lines (26 MiB) on > `ttyv0`: > > * vt(4), unmodified: 1500 ms > * vt(4), with this patch: 1200 ms > * syscons(4): 700 ms Syscons was pessimized by a factor of about 12 using related methods (excessive layering, aktough not so much locking). So the correct comparison is with unpessimized syscons taking about 60 ms. These times are just for writing to the history buffer and are very relevan= t for normal operation. Pessimizations by factors of 12 just annoy me. > This is on a Haswell laptop with a GENERIC-NODEBUG kernel. My times are on Haswell too, but on a 4.0GHz desktop with non-GENERIC non-debug kernels. My test does 65MB of (weird) text output consisting 650 lines of length 100000. Again, this is not very representative. The (trivial) benchmark sources have almost no changes since I wrote it to benchmark and optimize syscons the Minix console driver 25-30 years ago. (I didn't do much with syscons then, but made sure that it was only slightly slower.) Old tests did 80-column output and all versions do 1 write(2) per line and it was convenient to scale up the line length to avoid having too much of the time being for syscall overhead. The 65MB is scaled to take about 1 second on an AthlonXP 2.2GHz with the best version of syscons. Approximate times on Haswell: - 0.2 seconds with FreeBSD-5.2 syscons modified to recover some of my fixes from 1993. I micro-optimized the inner loop in 1993. The inner loop handles these 100000-character lines 80 characters at a time using as close as possible to *dst++ =3D *src++, with considerable overhead for attributes, checking for escape sequences, and for reducing to 80 columns. According to the comment, this took 26 cycles on i486's (probably DX2/66), but I optimized it to only 18. This optimized inner loop was turned into mostly nonsense before FreeBSD-5.2, using inlining in all the wrong places. The loop was moved into an inline function (sc_term_gen_print()), but it calls a non-inline function (sc_vtb_putchar()). This made it about 50% slower IIRC. - 50% slower in pre-teken versions in pre-release versions of FreeBSD-8. FreeBSD-8 changed the upper layers of the tty driver. The pessimization is to do quoting stuff per-char. This made the i/o even slower than the old way. The older way produced larger tinygrams by transfering between the layers only about 100 bytes per output call, and used inefficent cli= sts. - about 500% slower for teken, by calling from the sc layer to the teken layer for every char. IIRC, there are more than 5 but less than 10 function calls per char, so it is doing OK to be only 5 times slower. - thus the time on Haswell was about 2.4 seconds for 65MB. This is for text mode. Only slightly slower for graphics mode. Screen refresh shou= ld occur at most about 50 times in 2.4 seconds, so at most 100k of the outp= ut should actually reach the screen and that shouldn't take long on a 4GHz system! (On a 30 year ET4000, the frame buffer speed was 5.9MB/sec so 100k must take at least 1/30 seconds in text mode. Any slower than that is bad.) - I optimized this a little by avoiding 1 or 2 function calls (for attribut= e handling) per character, so syscoons onl takes about 2 seconds for 65MB in -current. This is consistent with your 700 ms (26/65 * 2000 ms =3D 800). - I have syscons mostly fixed in local patches: - Method A: restore scterm-sc.c (don't use teken). This was very easy, = and fixes many other bugs much more easily than in my committed and local patches. The excessive layering in syscons actually helps here -- the API for the layering of the terminal emulator has only small changes, so scterm-sc.c from FreeBSD-7 takes only about 10 lines of changes to drop back in. Also restore optimizations from 1993 as far as possible. They moved t= o scterm-sc.c and were lost with teken. This fixes everything except the slow upper layers, so the speed is 0.33 seconds for 65MB. seconds for - Method B: restore only sctermvar.h from FreeBSD-7. Use only sc_term_gen_print() and its infrastructure from this. This does essentially *dst++ =3D *src++ to the history buffer until it hits an escape sequence, and it must not be called while in an escape sequence. Subvert the teken layering so that this can be used. > At the same time, the locking is changed in the vt_flush() function > which is responsible to draw the text on screen. So instead of > (indirectly) using VTBUF_LOCK() just to read and reset the dirty area > of the internal buffer, the lock is held for about the entire function, > including the drawing part. > > The change is mostly visible while content is scrolling fast: before, > lines could appear garbled while scrolling because the internal buffer > was accessed without locks (once the scrolling was finished, the output > was correct). Now, the scrolling appears correct. > > In the end, the locking model is closer to what syscons(4) does. This is only very fast because the output never reaches the screen. Frame buffers aren't much faster than 25 years ago. Text mode tends to be limited to a few MB/sec. Old VGA graphics modes (not supported by vt) are much slower since they require using PIO registers and there is more to do. Direct bitmapped modes might be no slower than text mode, but again there is a lot of I/O to do. Old syscons updates so fast that no scrolling is visible when all lines are the same. This also requires magic buffer sizes. clists and the transfer size of 100 gave suitable magic (e.g., 100 divides 80x25). Now tty buffer sizes are normally powers of 2, so the magic lining up rarely occurs and this looks like artifacts in scrolling. Bruce From owner-svn-src-all@freebsd.org Wed May 16 11:19:05 2018 Return-Path: <owner-svn-src-all@freebsd.org> Delivered-To: svn-src-all@mailman.ysv.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2610:1c1:1:606c::19:1]) by mailman.ysv.freebsd.org (Postfix) with ESMTP id D673CEDA964; Wed, 16 May 2018 11:19:04 +0000 (UTC) (envelope-from emaste@FreeBSD.org) Received: from mxrelay.nyi.freebsd.org (mxrelay.nyi.freebsd.org [IPv6:2610:1c1:1:606c::19:3]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (Client CN "mxrelay.nyi.freebsd.org", Issuer "Let's Encrypt Authority X3" (verified OK)) by mx1.freebsd.org (Postfix) with ESMTPS id 8C18283D52; Wed, 16 May 2018 11:19:04 +0000 (UTC) (envelope-from emaste@FreeBSD.org) Received: from repo.freebsd.org (repo.freebsd.org [IPv6:2610:1c1:1:6068::e6a:0]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (Client did not present a certificate) by mxrelay.nyi.freebsd.org (Postfix) with ESMTPS id 6912F197A7; Wed, 16 May 2018 11:19:04 +0000 (UTC) (envelope-from emaste@FreeBSD.org) Received: from repo.freebsd.org ([127.0.1.37]) by repo.freebsd.org (8.15.2/8.15.2) with ESMTP id w4GBJ4lL026349; Wed, 16 May 2018 11:19:04 GMT (envelope-from emaste@FreeBSD.org) Received: (from emaste@localhost) by repo.freebsd.org (8.15.2/8.15.2/Submit) id w4GBJ3In026346; Wed, 16 May 2018 11:19:03 GMT (envelope-from emaste@FreeBSD.org) Message-Id: <201805161119.w4GBJ3In026346@repo.freebsd.org> X-Authentication-Warning: repo.freebsd.org: emaste set sender to emaste@FreeBSD.org using -f From: Ed Maste <emaste@FreeBSD.org> Date: Wed, 16 May 2018 11:19:03 +0000 (UTC) To: src-committers@freebsd.org, svn-src-all@freebsd.org, svn-src-head@freebsd.org Subject: svn commit: r333672 - in head/sys/dev/vt: . hw/fb hw/ofwfb X-SVN-Group: head X-SVN-Commit-Author: emaste X-SVN-Commit-Paths: in head/sys/dev/vt: . hw/fb hw/ofwfb X-SVN-Commit-Revision: 333672 X-SVN-Commit-Repository: base MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit X-BeenThere: svn-src-all@freebsd.org X-Mailman-Version: 2.1.26 Precedence: list List-Id: "SVN commit messages for the entire src tree \(except for " user" and " projects" \)" <svn-src-all.freebsd.org> List-Unsubscribe: <https://lists.freebsd.org/mailman/options/svn-src-all>, <mailto:svn-src-all-request@freebsd.org?subject=unsubscribe> List-Archive: <http://lists.freebsd.org/pipermail/svn-src-all/> List-Post: <mailto:svn-src-all@freebsd.org> List-Help: <mailto:svn-src-all-request@freebsd.org?subject=help> List-Subscribe: <https://lists.freebsd.org/mailman/listinfo/svn-src-all>, <mailto:svn-src-all-request@freebsd.org?subject=subscribe> X-List-Received-Date: Wed, 16 May 2018 11:19:05 -0000 Author: emaste Date: Wed May 16 11:19:03 2018 New Revision: 333672 URL: https://svnweb.freebsd.org/changeset/base/333672 Log: Clean up vt source whitespace issues Modified: head/sys/dev/vt/hw/fb/vt_early_fb.c head/sys/dev/vt/hw/ofwfb/ofwfb.c head/sys/dev/vt/vt.h head/sys/dev/vt/vt_core.c Modified: head/sys/dev/vt/hw/fb/vt_early_fb.c ============================================================================== --- head/sys/dev/vt/hw/fb/vt_early_fb.c Wed May 16 11:06:39 2018 (r333671) +++ head/sys/dev/vt/hw/fb/vt_early_fb.c Wed May 16 11:19:03 2018 (r333672) @@ -224,7 +224,7 @@ vt_efb_init(struct vt_device *vd) * remapped for us when relocation turns on. */ if (OF_getproplen(node, "address") == sizeof(info->fb_pbase)) { - /* XXX We assume #address-cells is 1 at this point. */ + /* XXX We assume #address-cells is 1 at this point. */ OF_getencprop(node, "address", &info->fb_pbase, sizeof(info->fb_pbase)); Modified: head/sys/dev/vt/hw/ofwfb/ofwfb.c ============================================================================== --- head/sys/dev/vt/hw/ofwfb/ofwfb.c Wed May 16 11:06:39 2018 (r333671) +++ head/sys/dev/vt/hw/ofwfb/ofwfb.c Wed May 16 11:19:03 2018 (r333672) @@ -317,7 +317,7 @@ ofwfb_initialize(struct vt_device *vd) } if (i != 16) sc->iso_palette = 1; - + break; case 32: @@ -417,7 +417,7 @@ ofwfb_init(struct vt_device *vd) * remapped for us when relocation turns on. */ if (OF_getproplen(node, "address") == sizeof(fb_phys)) { - /* XXX We assume #address-cells is 1 at this point. */ + /* XXX We assume #address-cells is 1 at this point. */ OF_getprop(node, "address", &fb_phys, sizeof(fb_phys)); #if defined(__powerpc__) Modified: head/sys/dev/vt/vt.h ============================================================================== --- head/sys/dev/vt/vt.h Wed May 16 11:06:39 2018 (r333671) +++ head/sys/dev/vt/vt.h Wed May 16 11:19:03 2018 (r333672) @@ -93,7 +93,7 @@ struct vt_driver; void vt_allocate(const struct vt_driver *, void *); void vt_deallocate(const struct vt_driver *, void *); -typedef unsigned int vt_axis_t; +typedef unsigned int vt_axis_t; /* * List of locks Modified: head/sys/dev/vt/vt_core.c ============================================================================== --- head/sys/dev/vt/vt_core.c Wed May 16 11:06:39 2018 (r333671) +++ head/sys/dev/vt/vt_core.c Wed May 16 11:19:03 2018 (r333672) @@ -539,14 +539,14 @@ vt_window_switch(struct vt_window *vw) return (0); if (!(vw->vw_flags & (VWF_OPENED|VWF_CONSOLE))) return (EINVAL); - + vd->vd_curwindow = vw; vd->vd_flags |= VDF_INVALID; if (vd->vd_driver->vd_postswitch) vd->vd_driver->vd_postswitch(vd); return (0); } - + VT_LOCK(vd); if (curvw == vw) { /* Nothing to do. */ @@ -2279,7 +2279,7 @@ skip_thunk: /* XXX */ *(int *)data = M_CG640x480; return (0); - case CONS_BELLTYPE: /* set bell type sound */ + case CONS_BELLTYPE: /* set bell type sound */ if ((*(int *)data) & CONS_QUIET_BELL) vd->vd_flags |= VDF_QUIET_BELL; else @@ -2383,7 +2383,7 @@ skip_thunk: break; } return (0); - case KDENABIO: /* allow io operations */ + case KDENABIO: /* allow io operations */ error = priv_check(td, PRIV_IO); if (error != 0) return (error); @@ -2396,20 +2396,20 @@ skip_thunk: td->td_frame->tf_rflags |= PSL_IOPL; #endif return (0); - case KDDISABIO: /* disallow io operations (default) */ + case KDDISABIO: /* disallow io operations (default) */ #if defined(__i386__) td->td_frame->tf_eflags &= ~PSL_IOPL; #elif defined(__amd64__) td->td_frame->tf_rflags &= ~PSL_IOPL; #endif return (0); - case KDMKTONE: /* sound the bell */ + case KDMKTONE: /* sound the bell */ vtterm_beep(tm, *(u_int *)data); return (0); - case KIOCSOUND: /* make tone (*data) hz */ + case KIOCSOUND: /* make tone (*data) hz */ /* TODO */ return (0); - case CONS_SETKBD: /* set the new keyboard */ + case CONS_SETKBD: /* set the new keyboard */ mtx_lock(&Giant); error = 0; if (vd->vd_keyboard != *(int *)data) { @@ -2437,7 +2437,7 @@ skip_thunk: } mtx_unlock(&Giant); return (error); - case CONS_RELKBD: /* release the current keyboard */ + case CONS_RELKBD: /* release the current keyboard */ mtx_lock(&Giant); error = 0; if (vd->vd_keyboard != -1) { @@ -2509,7 +2509,7 @@ skip_thunk: VT_UNLOCK(vd); return (error); } - case VT_SETMODE: { /* set screen switcher mode */ + case VT_SETMODE: { /* set screen switcher mode */ struct vt_mode *mode; struct proc *p1;
Want to link to this message? Use this URL: <https://mail-archive.FreeBSD.org/cgi/mid.cgi?20180516191245.T5082>