Skip site navigation (1)Skip section navigation (2)
Date:      Mon, 18 Jun 2012 14:07:00 +1000 (EST)
From:      Bruce Evans <brde@optusnet.com.au>
To:        Poul-Henning Kamp <phk@FreeBSD.org>
Cc:        svn-src-head@FreeBSD.org, svn-src-all@FreeBSD.org, src-committers@FreeBSD.org
Subject:   Re: svn commit: r237203 - head/sys/dev/fb
Message-ID:  <20120618130626.M952@besplex.bde.org>
In-Reply-To: <201206172102.q5HL2mG9032399@svn.freebsd.org>
References:  <201206172102.q5HL2mG9032399@svn.freebsd.org>

next in thread | previous in thread | raw e-mail | index | archive | help
On Sun, 17 Jun 2012, Poul-Henning Kamp wrote:

> Log:
>  On certain newer Intel Atom based motherboards, for instance the
>  D2500CC which I have, syscons in text-mode fails to show the expected
>  contents due to write errors into video-memory.
>
>  At least one of the causes is that we copy from syscons internal buffer
>  to the video memory with optimized bcopy(9) which uses >16bit operations.
>
>  Until now, 32bit and wider operations have always worked on the video
>  memory, but since I cannot find a single source which says that this
>  SHALL work, and since these chipsets/bugs are now out there, this
>  commit changes syscons to always use 16bit copies on i386 & amd64.
>
>  This may be relevevant for PR's:
>  	166262
>  	166639
>  and various other bug reports floating elsewhere on the net, but
>  I lack hardware to test those.

There is no reason 16-bit copies should work either.  IIRC, CRTC registers
were once written 16 bits at a time, but some (rare) hardware doesn't
like this.  One would expect frame buffers to look more like memory than
device ragisters, but it is apparently possible for this to be broken too.

>
> Modified:
>  head/sys/dev/fb/fbreg.h
>
> Modified: head/sys/dev/fb/fbreg.h
> ==============================================================================
> --- head/sys/dev/fb/fbreg.h	Sun Jun 17 20:45:45 2012	(r237202)
> +++ head/sys/dev/fb/fbreg.h	Sun Jun 17 21:02:48 2012	(r237203)
> @@ -35,9 +35,16 @@
>
> /* some macros */
> #if defined(__amd64__) || defined(__i386__)
> -#define bcopy_io(s, d, c)	bcopy((void *)(s), (void *)(d), (c))
> -#define bcopy_toio(s, d, c)	bcopy((void *)(s), (void *)(d), (c))
> -#define bcopy_fromio(s, d, c)	bcopy((void *)(s), (void *)(d), (c))

Syscons used to use the bogus bcopyb() function to avoid this bug.  This
has been turned into nonsense -- bcopyb() still exists on i386, but is
never used (it wasn't copied to amd64).  The above was used instead.
(bcopyb() is bogus because the interface is logically a bus-io one and
has nothing to do with bcopy().)

Syscons (fb) still uses the bogus fillw() function.  fillw() is implemented
bogusly.  First, the corresponding API used by the main part of fb is named
fillw_io().  That is not as bogus as bcopy*_io() since fillw() isn't a
standard memory API like bcopy() is.  But fillw() is bogus since it should
have been named something like fillw_io() to begin with.  Then:
- on amd64 and i386, fillw() is a function "optimized" in support.[sS].
   fillw_io() is #defined to this.
- on other arches, fillw() is implemented using messy ifdefs:
   - on ia64, the ifdefs are used to obfuscate the logically correct function
     bus_space_set_region(2) as fillw_io()
   - on sparc64, a fillw() function is implemented in C, under a messy ifdef
     that seems to be completely broken -- fillw() is never used directly,
     and seems to be unattached to fillw_io() on sparc64
   - on powerpc(), both fillw_io() and fillw() are #defined in terms of
     ofwfb_fillw().  Now fillw_io() exists, but fillw() seems to be unused
     garbage.
   - the #else clause if the messy ifdef is commented as being for
     !__i386__ && !__amd64__ && !__ia64__ && !__sparc64__ && !__powerpc__
     (in that unsorted order.  __amd64__ and __i386__ were not unsorted
     in the corresponding #if).  I think this reduces to __arm__.  So
     for arm, in this #else clause fillw_io() is #defined in terms of
     memset_io() (another bogus interface).  The unused fillw() is defined
     in terms of memsetw().  This can be considered a non-bogus interface
     -- it can be like memset() and thus unsuitable for i/o, but take a
     "word" pattern instead of a u_char pattern.

     The full (?) set of bogus interfaces is easiest to see in this #else
     clause.  It is bcopy_{io,toio,fromio}, bzero_io (another nonsense),
     and fill{_io,w,w_io}.

> Modified: head/sys/dev/fb/fbreg.h
> ==============================================================================
> --- head/sys/dev/fb/fbreg.h	Sun Jun 17 20:45:45 2012	(r237202)
> +++ head/sys/dev/fb/fbreg.h	Sun Jun 17 21:02:48 2012	(r237203)
> @@ -35,9 +35,16 @@
>
> /* some macros */
> #if defined(__amd64__) || defined(__i386__)
> -#define bcopy_io(s, d, c)	bcopy((void *)(s), (void *)(d), (c))
> -#define bcopy_toio(s, d, c)	bcopy((void *)(s), (void *)(d), (c))
> -#define bcopy_fromio(s, d, c)	bcopy((void *)(s), (void *)(d), (c))
> +
> +static __inline void
> +copyw(uint16_t *src, uint16_t *dst, size_t size)
> +{
> +	while (size--)
> +		*dst++ = *src++;
> +}
> +#define bcopy_io(s, d, c)	copyw((void*)(s), (void*)(d), (c))
> +#define bcopy_toio(s, d, c)	copyw((void*)(s), (void*)(d), (c))
> +#define bcopy_fromio(s, d, c)	copyw((void*)(s), (void*)(d), (c))

This patch doesn't improve the above mess:
- it is only for amd64 and i386
- it introduces yet another function and naming scheme that doesn't use
   bus space and doesn't have "io" in the basic function
- it makes other old bugs more obvious:
   - space instead of tab after all #define's
   - bogus casts to (void *) for bcopy()'s args.  These probably don't even
     hide bugs, since bcopy()'s prototype says to do the same conversions
     and the compiler probably doesn't warn about implicit conversions to
     void *.
   - bogus casts to (void *) for copyw()'s args.  These may now hide bugs,
     since they are different from the ones that the prototype says to do,
     and the compiler would probably complain about implicit conversions
     fro a non-uint16_t * pointer to a uint16_t * one.

> #define bzero_io(d, c)		bzero((void *)(d), (c))

bzero() still does 32 or even 64 bit accesses.  Hmm, I wonder if your
hardware works with 32-bit accesses on i386 but not with 64-bit ones
on amd64.

> #define fill_io(p, d, c)	fill((p), (void *)(d), (c))
> #define fillw_io(p, d, c)	fillw((p), (void *)(d), (c))

I tried to get the powers that be to clean up this mess, but there was
no interest.  From old mail (the version that seems to have the most
complete patch; perhaps not my best version of the patch):

% On Wed, 17 Jun 2009, Bruce Evans wrote:
% 
% ...
% 
% Here are patches for RELENG_7 to remove the frame buffer grot from
% i386 and partially clean it up in frame buffer code.  This compiles
% in a least 1 i386 kernel and is fast enough (22 MB/S for
% 
%      dd </dev/zero bs=1000000 count=100 | tr '\000' 'p'

The frame buffer grot infected {amd64,i386}/cpufunc.h with bogus #defines
for readb(), etc., and the 17 Jun 2009 thread was originally about cleaning
up this and other grot in cpufunc.h.  There are some layering problems with
cpufunc.h vs bus space.  But in fb, everything seems to be easy to fix using
bus space.  I only tested this on i386, but exactly the same definitions
should work for all arches (assuming that you want to #define things to
bcopy*/fillw*/etc to avoid raw bus space calls), except for the bug that
BUS_SPACE_MEM has a gratuitous MD prefix.

% 
% % Index: dev/atkbdc/atkbd.c
% % ===================================================================
% % RCS file: /home/ncvs/src/sys/dev/atkbdc/atkbd.c,v
% % retrieving revision 1.52.2.1
% % diff -u -2 -r1.52.2.1 atkbd.c
% % --- dev/atkbdc/atkbd.c	20 Apr 2009 16:55:48 -0000	1.52.2.1
% % +++ dev/atkbdc/atkbd.c	17 Jun 2009 02:40:20 -0000
% % @@ -1104,5 +1104,7 @@
% %  		return ENODEV;
% %          p = BIOS_PADDRTOVADDR(((u_int32_t)vmf.vmf_es << 4) + vmf.vmf_bx);
% % -	if ((readb(p + 6) & 0x40) == 0)	/* int 16, function 0x09 supported? */
% % +	/* int 16, function 0x09 supported? */
% % +	/* XXX why not just try it? */
% % +	if ((*(u_int8_t *)(p + 6) & 0x40) == 0)
% %  		return ENODEV;
% %  	vmf.vmf_ax = 0x0900;
% % Index: dev/fb/fbreg.h
% % ===================================================================
% % RCS file: /home/ncvs/src/sys/dev/fb/fbreg.h,v
% % retrieving revision 1.21
% % diff -u -2 -r1.21 fbreg.h
% % --- dev/fb/fbreg.h	18 Jan 2007 13:08:08 -0000	1.21
% % +++ dev/fb/fbreg.h	17 Jun 2009 02:23:39 -0000
% % @@ -35,41 +35,40 @@
% % 
% %  /* some macros */
% % -#ifdef __i386__
% % -#define bcopy_io(s, d, c)	generic_bcopy((void *)(s), (void *)(d), (c))
% % -#define bcopy_toio(s, d, c)	generic_bcopy((void *)(s), (void *)(d), (c))
% % -#define bcopy_fromio(s, d, c)	generic_bcopy((void *)(s), (void *)(d), (c))
% % -#define bzero_io(d, c)		generic_bzero((void *)(d), (c))
% % -#define fill_io(p, d, c)	fill((p), (void *)(d), (c))
% % -#define fillw_io(p, d, c)	fillw((p), (void *)(d), (c))
% % -void generic_bcopy(const void *s, void *d, size_t c);
% % -void generic_bzero(void *d, size_t c);
% % -#elif defined(__amd64__)
% % -#define bcopy_io(s, d, c)	bcopy((void *)(s), (void *)(d), (c))
% % -#define bcopy_toio(s, d, c)	bcopy((void *)(s), (void *)(d), (c))
% % -#define bcopy_fromio(s, d, c)	bcopy((void *)(s), (void *)(d), (c))
% % -#define bzero_io(d, c)		bzero((void *)(d), (c))
% % -#define fill_io(p, d, c)	fill((p), (void *)(d), (c))
% % -#define fillw_io(p, d, c)	fillw((p), (void *)(d), (c))
% % -#elif defined(__ia64__) || defined(__sparc64__)
% % -#if defined(__ia64__)
% % +#if defined(__amd64__) || defined(__i386__) ||  defined(__ia64__) || \
% % +    defined(__sparc64__)
% % +/* XXX __sparc64__ doesn't seem to belong here. */
% % +#if defined(__amd64__) || defined(__i386__) ||  defined(__ia64__)
% %  #include <machine/bus.h>
% % +#endif
% % +/* XXX fix gratuitous MD spelling: */
% % +#ifdef __amd64__
% % +#define	BUS_SPACE_MEM	AMD64_BUS_SPACE_MEM
% % +#endif
% % +#ifdef __i386__
% % +#define	BUS_SPACE_MEM	I386_BUS_SPACE_MEM
% % +#endif
% % +#ifdef __ia64__
% % +#define	BUS_SPACE_MEM	IA64_BUS_SPACE_MEM
% % +#endif
% % +#if defined(__amd64__) || defined(__i386__) ||  defined(__ia64__)
% %  #define	bcopy_fromio(s, d, c)	\
% % -	bus_space_read_region_1(IA64_BUS_SPACE_MEM, s, 0, (void*)(d), c)
% % +	bus_space_read_region_1(BUS_SPACE_MEM, s, 0, (void*)(d), c)
% %  #define	bcopy_io(s, d, c)	\
% % -	bus_space_copy_region_1(IA64_BUS_SPACE_MEM, s, 0, d, 0, c)
% % +	bus_space_copy_region_1(BUS_SPACE_MEM, s, 0, d, 0, c)
% %  #define	bcopy_toio(s, d, c)	\
% % -	bus_space_write_region_1(IA64_BUS_SPACE_MEM, d, 0, (void*)(s), c)
% % +	bus_space_write_region_1(BUS_SPACE_MEM, d, 0, (void*)(s), c)
% %  #define	bzero_io(d, c)		\
% % -	bus_space_set_region_1(IA64_BUS_SPACE_MEM, (intptr_t)(d), 0, 0, c)
% % +	bus_space_set_region_1(BUS_SPACE_MEM, (intptr_t)(d), 0, 0, c)
% %  #define	fill_io(p, d, c)	\
% % -	bus_space_set_region_1(IA64_BUS_SPACE_MEM, (intptr_t)(d), 0, p, c)
% % +	bus_space_set_region_1(BUS_SPACE_MEM, (intptr_t)(d), 0, p, c)
% %  #define	fillw_io(p, d, c)	\
% % -	bus_space_set_region_2(IA64_BUS_SPACE_MEM, (intptr_t)(d), 0, p, c)
% % -#define	readb(a)		bus_space_read_1(IA64_BUS_SPACE_MEM, a, 0)
% % -#define	readw(a)		bus_space_read_2(IA64_BUS_SPACE_MEM, a, 0)
% % -#define	writeb(a, v)		bus_space_write_1(IA64_BUS_SPACE_MEM, a, 0, v)
% % -#define	writew(a, v)		bus_space_write_2(IA64_BUS_SPACE_MEM, a, 0, v)
% % -#define	writel(a, v)		bus_space_write_4(IA64_BUS_SPACE_MEM, a, 0, v)
% % -#endif /* __ia64__ */
% % +	bus_space_set_region_2(BUS_SPACE_MEM, (intptr_t)(d), 0, p, c)
% % +#define	readb(a)		bus_space_read_1(BUS_SPACE_MEM, a, 0)
% % +#define	readw(a)		bus_space_read_2(BUS_SPACE_MEM, a, 0)
% % +#define	writeb(a, v)		bus_space_write_1(BUS_SPACE_MEM, a, 0, v)
% % +#define	writew(a, v)		bus_space_write_2(BUS_SPACE_MEM, a, 0, v)
% % +#define	writel(a, v)		bus_space_write_4(BUS_SPACE_MEM, a, 0, v)
% % +#endif /* __amd64__ || __i386__ || __ia64__ */
% % +/* fillw for __amd64__ || __i386__ || __ia64__ and || __sparc64__ too, ugh. */
% %  static __inline void
% %  fillw(int val, uint16_t *buf, size_t size)
% % @@ -102,5 +101,5 @@
% %  #define fillw(p, d, c)		memsetw((d), (p), (c))
% %  #define fillw_io(p, d, c)	memsetw_io((d), (p), (c))
% % -#endif /* !__i386__ */
% % +#endif
% % 
% %  /* video function table */
% % Index: dev/fb/vga.c
% % ===================================================================
% % RCS file: /home/ncvs/src/sys/dev/fb/vga.c,v
% % retrieving revision 1.36
% % diff -u -2 -r1.36 vga.c
% % --- dev/fb/vga.c	4 Dec 2005 02:12:41 -0000	1.36
% % +++ dev/fb/vga.c	17 Jun 2009 02:19:22 -0000
% % @@ -1331,15 +1331,4 @@
% % 
% %  #ifndef VGA_NO_MODE_CHANGE
% % -#if defined(__i386__) || defined(__amd64__)	/* XXX */
% % -static void
% % -fill(int val, void *d, size_t size)
% % -{
% % -    u_char *p = d;
% % -
% % -    while (size-- > 0)
% % -	*p++ = val;
% % -}
% % -#endif /* __i386__ */
% % -
% %  static void
% %  filll_io(int val, vm_offset_t d, size_t size)


Gak, yet another fill function to actually force 32-bit accesses.
What does your hardware do with this and the bzero()s?

The i586 optimized copying routines used 64-bit accesses even on i386,
and of course for sp33d, bcopy should use SSE for 128-bit accesses.

I wonder if there is only a problem with misaligned cases on your
hardware.  x86 bcopy() is intentionally sloppy with alignment since it
expects callers to align if this is best for speed.  But frame buffer
accesses could naturally cause lots of cases with only 16-bit alignment.

% % Index: i386/i386/support.s
% % ===================================================================
% % RCS file: /home/ncvs/src/sys/i386/i386/support.s,v
% % retrieving revision 1.119.2.1
% % diff -u -2 -r1.119.2.1 support.s
% % --- i386/i386/support.s	12 Jan 2009 15:48:22 -0000	1.119.2.1
% % +++ i386/i386/support.s	17 Jun 2009 01:43:27 -0000
% % @@ -429,49 +429,4 @@
% %  END(i686_pagezero)
% % 
% % -/* fillw(pat, base, cnt) */
% % -ENTRY(fillw)
% % -	pushl	%edi
% % -	movl	8(%esp),%eax
% % -	movl	12(%esp),%edi
% % -	movl	16(%esp),%ecx
% % -	cld
% % -	rep
% % -	stosw
% % -	popl	%edi
% % -	ret
% % -END(fillw)
% % -
% % -ENTRY(bcopyb)
% % -	pushl	%esi
% % -	pushl	%edi
% % -	movl	12(%esp),%esi
% % -	movl	16(%esp),%edi
% % -	movl	20(%esp),%ecx
% % -	movl	%edi,%eax
% % -	subl	%esi,%eax
% % -	cmpl	%ecx,%eax			/* overlapping && src < dst? */
% % -	jb	1f
% % -	cld					/* nope, copy forwards */
% % -	rep
% % -	movsb
% % -	popl	%edi
% % -	popl	%esi
% % -	ret
% % -
% % -	ALIGN_TEXT
% % -1:
% % -	addl	%ecx,%edi			/* copy backwards. */
% % -	addl	%ecx,%esi
% % -	decl	%edi
% % -	decl	%esi
% % -	std
% % -	rep
% % -	movsb
% % -	popl	%edi
% % -	popl	%esi
% % -	cld
% % -	ret
% % -END(bcopyb)
% % -
% %  ENTRY(bcopy)
% %  	MEXITCOUNT
% % Index: i386/include/cpufunc.h
% % ===================================================================
% % RCS file: /home/ncvs/src/sys/i386/include/cpufunc.h,v
% % retrieving revision 1.145.2.1
% % diff -u -2 -r1.145.2.1 cpufunc.h
% % --- i386/include/cpufunc.h	12 Jan 2009 15:48:22 -0000	1.145.2.1
% % +++ i386/include/cpufunc.h	17 Jun 2009 01:44:24 -0000
% % @@ -45,12 +45,4 @@
% %  struct region_descriptor;
% % 
% % -#define readb(va)	(*(volatile u_int8_t *) (va))
% % -#define readw(va)	(*(volatile u_int16_t *) (va))
% % -#define readl(va)	(*(volatile u_int32_t *) (va))
% % -
% % -#define writeb(va, d)	(*(volatile u_int8_t *) (va) = (d))
% % -#define writew(va, d)	(*(volatile u_int16_t *) (va) = (d))
% % -#define writel(va, d)	(*(volatile u_int32_t *) (va) = (d))
% % -
% %  #if defined(__GNUCLIKE_ASM) && defined(__CC_SUPPORTS___INLINE)
% % 
% % Index: i386/include/md_var.h
% % ===================================================================
% % RCS file: /home/ncvs/src/sys/i386/include/md_var.h,v
% % retrieving revision 1.76.2.1
% % diff -u -2 -r1.76.2.1 md_var.h
% % --- i386/include/md_var.h	21 Jan 2009 20:16:11 -0000	1.76.2.1
% % +++ i386/include/md_var.h	17 Jun 2009 01:44:01 -0000
% % @@ -80,5 +80,4 @@
% %  struct	dumperinfo;
% % 
% % -void	bcopyb(const void *from, void *to, size_t len);
% %  void	busdma_swi(void);
% %  void	cpu_setregs(void);
% % @@ -95,5 +94,4 @@
% %  void	dump_drop_page(vm_paddr_t);
% %  void	enable_sse(void);
% % -void	fillw(int /*u_short*/ pat, void *base, size_t cnt);
% %  void	i486_bzero(void *buf, size_t len);
% %  void	i586_bcopy(const void *from, void *to, size_t len);
% 
% In fbreg.h, this just uses the ia64 macros for i386 (and for amd64,
% untested) after s/IA64_//g.  There are still messes for sparc64,
% powerpc and default (arm only?), and new messes to get an MI name
% for BUS_SPACE_MEM (the very idea of a frame buffer depends on having
% a memory bus-space, so BUS_SPACE_MEM should be MI).  fillw() doesn't
% use bus space since it is for ordinary memory.  It should probably
% be implemented in vga.c. between fill() (removed above) and filll().
% fill() looks like it is for ordinary memory, but was only used for
% bus space memory, only on amd64 and i386; thus it became unused.
% It is also better spelled as memset().  fillw() and filll() would
% probably be better spelled memset[wl]() like fillw() already is in
% MD code for the default case.
% 
% Bruce

Bruce



Want to link to this message? Use this URL: <https://mail-archive.FreeBSD.org/cgi/mid.cgi?20120618130626.M952>