From owner-svn-src-head@FreeBSD.ORG Mon Jun 18 04:07:12 2012 Return-Path: Delivered-To: svn-src-head@FreeBSD.org Received: from mx1.freebsd.org (mx1.freebsd.org [69.147.83.52]) by hub.freebsd.org (Postfix) with ESMTP id 7957D1065673; Mon, 18 Jun 2012 04:07:12 +0000 (UTC) (envelope-from brde@optusnet.com.au) Received: from mail35.syd.optusnet.com.au (mail35.syd.optusnet.com.au [211.29.133.51]) by mx1.freebsd.org (Postfix) with ESMTP id CA6398FC0C; Mon, 18 Jun 2012 04:07:11 +0000 (UTC) Received: from c122-106-171-232.carlnfd1.nsw.optusnet.com.au (c122-106-171-232.carlnfd1.nsw.optusnet.com.au [122.106.171.232]) by mail35.syd.optusnet.com.au (8.13.1/8.13.1) with ESMTP id q5I470ib018093 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=NO); Mon, 18 Jun 2012 14:07:02 +1000 Date: Mon, 18 Jun 2012 14:07:00 +1000 (EST) From: Bruce Evans X-X-Sender: bde@besplex.bde.org To: Poul-Henning Kamp In-Reply-To: <201206172102.q5HL2mG9032399@svn.freebsd.org> Message-ID: <20120618130626.M952@besplex.bde.org> References: <201206172102.q5HL2mG9032399@svn.freebsd.org> MIME-Version: 1.0 Content-Type: TEXT/PLAIN; charset=US-ASCII; format=flowed Cc: svn-src-head@FreeBSD.org, svn-src-all@FreeBSD.org, src-committers@FreeBSD.org Subject: Re: svn commit: r237203 - head/sys/dev/fb X-BeenThere: svn-src-head@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: SVN commit messages for the src tree for head/-current List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Mon, 18 Jun 2012 04:07:12 -0000 On Sun, 17 Jun 2012, Poul-Henning Kamp wrote: > Log: > On certain newer Intel Atom based motherboards, for instance the > D2500CC which I have, syscons in text-mode fails to show the expected > contents due to write errors into video-memory. > > At least one of the causes is that we copy from syscons internal buffer > to the video memory with optimized bcopy(9) which uses >16bit operations. > > Until now, 32bit and wider operations have always worked on the video > memory, but since I cannot find a single source which says that this > SHALL work, and since these chipsets/bugs are now out there, this > commit changes syscons to always use 16bit copies on i386 & amd64. > > This may be relevevant for PR's: > 166262 > 166639 > and various other bug reports floating elsewhere on the net, but > I lack hardware to test those. There is no reason 16-bit copies should work either. IIRC, CRTC registers were once written 16 bits at a time, but some (rare) hardware doesn't like this. One would expect frame buffers to look more like memory than device ragisters, but it is apparently possible for this to be broken too. > > Modified: > head/sys/dev/fb/fbreg.h > > Modified: head/sys/dev/fb/fbreg.h > ============================================================================== > --- head/sys/dev/fb/fbreg.h Sun Jun 17 20:45:45 2012 (r237202) > +++ head/sys/dev/fb/fbreg.h Sun Jun 17 21:02:48 2012 (r237203) > @@ -35,9 +35,16 @@ > > /* some macros */ > #if defined(__amd64__) || defined(__i386__) > -#define bcopy_io(s, d, c) bcopy((void *)(s), (void *)(d), (c)) > -#define bcopy_toio(s, d, c) bcopy((void *)(s), (void *)(d), (c)) > -#define bcopy_fromio(s, d, c) bcopy((void *)(s), (void *)(d), (c)) Syscons used to use the bogus bcopyb() function to avoid this bug. This has been turned into nonsense -- bcopyb() still exists on i386, but is never used (it wasn't copied to amd64). The above was used instead. (bcopyb() is bogus because the interface is logically a bus-io one and has nothing to do with bcopy().) Syscons (fb) still uses the bogus fillw() function. fillw() is implemented bogusly. First, the corresponding API used by the main part of fb is named fillw_io(). That is not as bogus as bcopy*_io() since fillw() isn't a standard memory API like bcopy() is. But fillw() is bogus since it should have been named something like fillw_io() to begin with. Then: - on amd64 and i386, fillw() is a function "optimized" in support.[sS]. fillw_io() is #defined to this. - on other arches, fillw() is implemented using messy ifdefs: - on ia64, the ifdefs are used to obfuscate the logically correct function bus_space_set_region(2) as fillw_io() - on sparc64, a fillw() function is implemented in C, under a messy ifdef that seems to be completely broken -- fillw() is never used directly, and seems to be unattached to fillw_io() on sparc64 - on powerpc(), both fillw_io() and fillw() are #defined in terms of ofwfb_fillw(). Now fillw_io() exists, but fillw() seems to be unused garbage. - the #else clause if the messy ifdef is commented as being for !__i386__ && !__amd64__ && !__ia64__ && !__sparc64__ && !__powerpc__ (in that unsorted order. __amd64__ and __i386__ were not unsorted in the corresponding #if). I think this reduces to __arm__. So for arm, in this #else clause fillw_io() is #defined in terms of memset_io() (another bogus interface). The unused fillw() is defined in terms of memsetw(). This can be considered a non-bogus interface -- it can be like memset() and thus unsuitable for i/o, but take a "word" pattern instead of a u_char pattern. The full (?) set of bogus interfaces is easiest to see in this #else clause. It is bcopy_{io,toio,fromio}, bzero_io (another nonsense), and fill{_io,w,w_io}. > Modified: head/sys/dev/fb/fbreg.h > ============================================================================== > --- head/sys/dev/fb/fbreg.h Sun Jun 17 20:45:45 2012 (r237202) > +++ head/sys/dev/fb/fbreg.h Sun Jun 17 21:02:48 2012 (r237203) > @@ -35,9 +35,16 @@ > > /* some macros */ > #if defined(__amd64__) || defined(__i386__) > -#define bcopy_io(s, d, c) bcopy((void *)(s), (void *)(d), (c)) > -#define bcopy_toio(s, d, c) bcopy((void *)(s), (void *)(d), (c)) > -#define bcopy_fromio(s, d, c) bcopy((void *)(s), (void *)(d), (c)) > + > +static __inline void > +copyw(uint16_t *src, uint16_t *dst, size_t size) > +{ > + while (size--) > + *dst++ = *src++; > +} > +#define bcopy_io(s, d, c) copyw((void*)(s), (void*)(d), (c)) > +#define bcopy_toio(s, d, c) copyw((void*)(s), (void*)(d), (c)) > +#define bcopy_fromio(s, d, c) copyw((void*)(s), (void*)(d), (c)) This patch doesn't improve the above mess: - it is only for amd64 and i386 - it introduces yet another function and naming scheme that doesn't use bus space and doesn't have "io" in the basic function - it makes other old bugs more obvious: - space instead of tab after all #define's - bogus casts to (void *) for bcopy()'s args. These probably don't even hide bugs, since bcopy()'s prototype says to do the same conversions and the compiler probably doesn't warn about implicit conversions to void *. - bogus casts to (void *) for copyw()'s args. These may now hide bugs, since they are different from the ones that the prototype says to do, and the compiler would probably complain about implicit conversions fro a non-uint16_t * pointer to a uint16_t * one. > #define bzero_io(d, c) bzero((void *)(d), (c)) bzero() still does 32 or even 64 bit accesses. Hmm, I wonder if your hardware works with 32-bit accesses on i386 but not with 64-bit ones on amd64. > #define fill_io(p, d, c) fill((p), (void *)(d), (c)) > #define fillw_io(p, d, c) fillw((p), (void *)(d), (c)) I tried to get the powers that be to clean up this mess, but there was no interest. From old mail (the version that seems to have the most complete patch; perhaps not my best version of the patch): % On Wed, 17 Jun 2009, Bruce Evans wrote: % % ... % % Here are patches for RELENG_7 to remove the frame buffer grot from % i386 and partially clean it up in frame buffer code. This compiles % in a least 1 i386 kernel and is fast enough (22 MB/S for % % dd % % +#endif % % +/* XXX fix gratuitous MD spelling: */ % % +#ifdef __amd64__ % % +#define BUS_SPACE_MEM AMD64_BUS_SPACE_MEM % % +#endif % % +#ifdef __i386__ % % +#define BUS_SPACE_MEM I386_BUS_SPACE_MEM % % +#endif % % +#ifdef __ia64__ % % +#define BUS_SPACE_MEM IA64_BUS_SPACE_MEM % % +#endif % % +#if defined(__amd64__) || defined(__i386__) || defined(__ia64__) % % #define bcopy_fromio(s, d, c) \ % % - bus_space_read_region_1(IA64_BUS_SPACE_MEM, s, 0, (void*)(d), c) % % + bus_space_read_region_1(BUS_SPACE_MEM, s, 0, (void*)(d), c) % % #define bcopy_io(s, d, c) \ % % - bus_space_copy_region_1(IA64_BUS_SPACE_MEM, s, 0, d, 0, c) % % + bus_space_copy_region_1(BUS_SPACE_MEM, s, 0, d, 0, c) % % #define bcopy_toio(s, d, c) \ % % - bus_space_write_region_1(IA64_BUS_SPACE_MEM, d, 0, (void*)(s), c) % % + bus_space_write_region_1(BUS_SPACE_MEM, d, 0, (void*)(s), c) % % #define bzero_io(d, c) \ % % - bus_space_set_region_1(IA64_BUS_SPACE_MEM, (intptr_t)(d), 0, 0, c) % % + bus_space_set_region_1(BUS_SPACE_MEM, (intptr_t)(d), 0, 0, c) % % #define fill_io(p, d, c) \ % % - bus_space_set_region_1(IA64_BUS_SPACE_MEM, (intptr_t)(d), 0, p, c) % % + bus_space_set_region_1(BUS_SPACE_MEM, (intptr_t)(d), 0, p, c) % % #define fillw_io(p, d, c) \ % % - bus_space_set_region_2(IA64_BUS_SPACE_MEM, (intptr_t)(d), 0, p, c) % % -#define readb(a) bus_space_read_1(IA64_BUS_SPACE_MEM, a, 0) % % -#define readw(a) bus_space_read_2(IA64_BUS_SPACE_MEM, a, 0) % % -#define writeb(a, v) bus_space_write_1(IA64_BUS_SPACE_MEM, a, 0, v) % % -#define writew(a, v) bus_space_write_2(IA64_BUS_SPACE_MEM, a, 0, v) % % -#define writel(a, v) bus_space_write_4(IA64_BUS_SPACE_MEM, a, 0, v) % % -#endif /* __ia64__ */ % % + bus_space_set_region_2(BUS_SPACE_MEM, (intptr_t)(d), 0, p, c) % % +#define readb(a) bus_space_read_1(BUS_SPACE_MEM, a, 0) % % +#define readw(a) bus_space_read_2(BUS_SPACE_MEM, a, 0) % % +#define writeb(a, v) bus_space_write_1(BUS_SPACE_MEM, a, 0, v) % % +#define writew(a, v) bus_space_write_2(BUS_SPACE_MEM, a, 0, v) % % +#define writel(a, v) bus_space_write_4(BUS_SPACE_MEM, a, 0, v) % % +#endif /* __amd64__ || __i386__ || __ia64__ */ % % +/* fillw for __amd64__ || __i386__ || __ia64__ and || __sparc64__ too, ugh. */ % % static __inline void % % fillw(int val, uint16_t *buf, size_t size) % % @@ -102,5 +101,5 @@ % % #define fillw(p, d, c) memsetw((d), (p), (c)) % % #define fillw_io(p, d, c) memsetw_io((d), (p), (c)) % % -#endif /* !__i386__ */ % % +#endif % % % % /* video function table */ % % Index: dev/fb/vga.c % % =================================================================== % % RCS file: /home/ncvs/src/sys/dev/fb/vga.c,v % % retrieving revision 1.36 % % diff -u -2 -r1.36 vga.c % % --- dev/fb/vga.c 4 Dec 2005 02:12:41 -0000 1.36 % % +++ dev/fb/vga.c 17 Jun 2009 02:19:22 -0000 % % @@ -1331,15 +1331,4 @@ % % % % #ifndef VGA_NO_MODE_CHANGE % % -#if defined(__i386__) || defined(__amd64__) /* XXX */ % % -static void % % -fill(int val, void *d, size_t size) % % -{ % % - u_char *p = d; % % - % % - while (size-- > 0) % % - *p++ = val; % % -} % % -#endif /* __i386__ */ % % - % % static void % % filll_io(int val, vm_offset_t d, size_t size) Gak, yet another fill function to actually force 32-bit accesses. What does your hardware do with this and the bzero()s? The i586 optimized copying routines used 64-bit accesses even on i386, and of course for sp33d, bcopy should use SSE for 128-bit accesses. I wonder if there is only a problem with misaligned cases on your hardware. x86 bcopy() is intentionally sloppy with alignment since it expects callers to align if this is best for speed. But frame buffer accesses could naturally cause lots of cases with only 16-bit alignment. % % Index: i386/i386/support.s % % =================================================================== % % RCS file: /home/ncvs/src/sys/i386/i386/support.s,v % % retrieving revision 1.119.2.1 % % diff -u -2 -r1.119.2.1 support.s % % --- i386/i386/support.s 12 Jan 2009 15:48:22 -0000 1.119.2.1 % % +++ i386/i386/support.s 17 Jun 2009 01:43:27 -0000 % % @@ -429,49 +429,4 @@ % % END(i686_pagezero) % % % % -/* fillw(pat, base, cnt) */ % % -ENTRY(fillw) % % - pushl %edi % % - movl 8(%esp),%eax % % - movl 12(%esp),%edi % % - movl 16(%esp),%ecx % % - cld % % - rep % % - stosw % % - popl %edi % % - ret % % -END(fillw) % % - % % -ENTRY(bcopyb) % % - pushl %esi % % - pushl %edi % % - movl 12(%esp),%esi % % - movl 16(%esp),%edi % % - movl 20(%esp),%ecx % % - movl %edi,%eax % % - subl %esi,%eax % % - cmpl %ecx,%eax /* overlapping && src < dst? */ % % - jb 1f % % - cld /* nope, copy forwards */ % % - rep % % - movsb % % - popl %edi % % - popl %esi % % - ret % % - % % - ALIGN_TEXT % % -1: % % - addl %ecx,%edi /* copy backwards. */ % % - addl %ecx,%esi % % - decl %edi % % - decl %esi % % - std % % - rep % % - movsb % % - popl %edi % % - popl %esi % % - cld % % - ret % % -END(bcopyb) % % - % % ENTRY(bcopy) % % MEXITCOUNT % % Index: i386/include/cpufunc.h % % =================================================================== % % RCS file: /home/ncvs/src/sys/i386/include/cpufunc.h,v % % retrieving revision 1.145.2.1 % % diff -u -2 -r1.145.2.1 cpufunc.h % % --- i386/include/cpufunc.h 12 Jan 2009 15:48:22 -0000 1.145.2.1 % % +++ i386/include/cpufunc.h 17 Jun 2009 01:44:24 -0000 % % @@ -45,12 +45,4 @@ % % struct region_descriptor; % % % % -#define readb(va) (*(volatile u_int8_t *) (va)) % % -#define readw(va) (*(volatile u_int16_t *) (va)) % % -#define readl(va) (*(volatile u_int32_t *) (va)) % % - % % -#define writeb(va, d) (*(volatile u_int8_t *) (va) = (d)) % % -#define writew(va, d) (*(volatile u_int16_t *) (va) = (d)) % % -#define writel(va, d) (*(volatile u_int32_t *) (va) = (d)) % % - % % #if defined(__GNUCLIKE_ASM) && defined(__CC_SUPPORTS___INLINE) % % % % Index: i386/include/md_var.h % % =================================================================== % % RCS file: /home/ncvs/src/sys/i386/include/md_var.h,v % % retrieving revision 1.76.2.1 % % diff -u -2 -r1.76.2.1 md_var.h % % --- i386/include/md_var.h 21 Jan 2009 20:16:11 -0000 1.76.2.1 % % +++ i386/include/md_var.h 17 Jun 2009 01:44:01 -0000 % % @@ -80,5 +80,4 @@ % % struct dumperinfo; % % % % -void bcopyb(const void *from, void *to, size_t len); % % void busdma_swi(void); % % void cpu_setregs(void); % % @@ -95,5 +94,4 @@ % % void dump_drop_page(vm_paddr_t); % % void enable_sse(void); % % -void fillw(int /*u_short*/ pat, void *base, size_t cnt); % % void i486_bzero(void *buf, size_t len); % % void i586_bcopy(const void *from, void *to, size_t len); % % In fbreg.h, this just uses the ia64 macros for i386 (and for amd64, % untested) after s/IA64_//g. There are still messes for sparc64, % powerpc and default (arm only?), and new messes to get an MI name % for BUS_SPACE_MEM (the very idea of a frame buffer depends on having % a memory bus-space, so BUS_SPACE_MEM should be MI). fillw() doesn't % use bus space since it is for ordinary memory. It should probably % be implemented in vga.c. between fill() (removed above) and filll(). % fill() looks like it is for ordinary memory, but was only used for % bus space memory, only on amd64 and i386; thus it became unused. % It is also better spelled as memset(). fillw() and filll() would % probably be better spelled memset[wl]() like fillw() already is in % MD code for the default case. % % Bruce Bruce