Date:      Fri, 16 Nov 2001 02:10:01 EET
From:      Maxim Sobolev <sobomax@FreeBSD.org>
To:        eischen@pcnet1.pcnet.com
Cc:        marcus@marcuscom.com, freebsd-ports@FreeBSD.org, hackers@FreeBSD.org
Subject:   Re: Using bit 21 of EFLAGS in user-mode [was: Re: sigreturn: eflags creash (fixed!)]
Message-ID:  <200111160010.CAA15164@ipcard.iptcom.net>
In-Reply-To: <Pine.SUN.3.91.1011115173611.10851A-100000@pcnet1.pcnet.com>

On Thu, 15 Nov 2001 17:41:32 -0500 (EST), Daniel Eischen wrote:
> On Thu, 15 Nov 2001, Maxim Sobolev wrote:
> > On Thu, 15 Nov 2001 14:56:31 -0500 (EST), Joe Clarke wrote:
> > > 
> > > I learned about this by reading through some of the -hackers archives.
> > > One person complained of similar errors trying to get xine to work on
> > > FreeBSD.  Removing the MMX detection code fixed it.  I remembered libpng
> > > also used MMX, so I removed the pnggccrd.c source, and voila!
> > > 
> > > Based on core dumps, strace output, and a lot of code surfing, this makes
> > > sense to me.  Basically, any png-dependent app's thread that runs longer
> > > than what ITIMER_PROF is set to gets hit with a SIGPROF.  When that
> > > happens, things context switch.  eflags must have been corrupted by the
> > > MMX code, thus sigreturn() bombs out, and causes uthread_kern to die as
> > > well.  Here's what strace looks like when balsa tries to read a 33 MB
> > > mailbox:
> > > 
> > > 74202 sigreturn(0x81f2c64
> > > 
> > > When this happens, strace politely dies with a bus error.
> > > 
> > > Thanks for testing this, Maxim.  Hopefully someone can find the problem
> > > and fix it for good.
> > 
> > That explains... After a quick glance at png code I found that
> > the only place where EFLAGS is altered is CPUID code, where
> > the library flips bit 21 of EFLAGS in order to ensure that the
> > CPUID instruction is supported (otherwise it will get SIGILL
> > on older processors). Unfortunately, for some reason FreeBSD
> 
> Does it need to keep bit 21 of EFLAGS flipped, or can libpng
> set it back and keep knowledge that CPUID is supported?  Or
> does that bit need to remain set for CPUID to work?

No, it doesn't need to be in any specific state. The only
knowledge a program gains from bit 21 is that its state
can be changed, which means that the CPUID instruction is
supported. Unfortunately, the original libpng doesn't bother
to set the bit back, which exposed this problem.
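
To make the logic of the check concrete, here is a minimal sketch in
plain C. It only models the comparison of two EFLAGS snapshots, not
the pushfl/popfl sequence itself. The names EFLAGS_ID_BIT and
id_bit_toggled() are made up for illustration and are not taken from
libpng.

```c
#include <stdint.h>

/* Bit 21 of EFLAGS (the bit discussed in this thread). */
#define EFLAGS_ID_BIT (UINT32_C(1) << 21)

/*
 * Hypothetical helper: "before" is EFLAGS as originally read, "after"
 * is EFLAGS read back once the program has tried to flip bit 21.
 * If the bit really changed, CPUID is available. The direction of the
 * change does not matter, only that the bit could be changed.
 */
static int id_bit_toggled(uint32_t before, uint32_t after)
{
    return ((before ^ after) & EFLAGS_ID_BIT) != 0;
}
```

Once the comparison is done, the value that should be loaded back
into EFLAGS is simply "before", since the result of the test no
longer depends on the bit.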

> If at all possible, a fix should be committed that wouldn't
> necessitate a new kernel be built for -stable.

Yes, I was also thinking about that. I've committed a patch
which restores the state of bit 21 as soon as possible. There
is still a chance that the program will get a signal during
that window, but this chance is rather slim. The "unsafe" piece
of code in question looks like:

	popfl		<-load eflags with bit 21 flipped
	pushfl		<-save resulting eflags
+	popl %%eax	<-load resulting eflags into eax
+	pushl %%ecx	<-save original eflags
	popfl		<-restore original eflags

Of course, it is possible either to mask all signals during
the detection period, or to rip out the detection code based
on eflags and replace it with a SIGILL handler, but this would
eat into the speed improvement from the MMX optimisations
because of the additional overhead of the syscalls needed
to set up the signal handler or signal mask.
In any case, tomorrow I will test this workaround
extensively, and if it turns out not to be sufficient
to prevent `sigreturn: eflags...' errors, then I'll just
disable the MMX code in libpng.

-Maxim




