Date: Thu, 15 Nov 2001 19:09:40 -0500 (EST)
From: Daniel Eischen
To: John Baldwin
Cc: hackers@FreeBSD.org, freebsd-ports@FreeBSD.org, marcus@marcuscom.com, Maxim Sobolev
Subject: Re: Using bit 21 of EFLAGS in user-mode [was: Re: sigreturn: efl

On Thu, 15 Nov 2001, John Baldwin wrote:
> On 15-Nov-01 Daniel Eischen wrote:
> > On Thu, 15 Nov 2001, Maxim Sobolev wrote:
> >> On Thu, 15 Nov 2001 14:56:31 -0500 (EST), Joe Clarke wrote:
> >> >
> >> > I learned about this by reading through some of the -hackers archives.
> >> > One person complained of similar errors trying to get xine to work on
> >> > FreeBSD. Removing the MMX detection code fixed it. I remembered libpng
> >> > also used MMX, so I removed the pnggccrd.c source, and voila!
> >> >
> >> > Based on core dumps, strace output, and a lot of code surfing, this makes
> >> > sense to me. Basically, any png-dependent app's thread that runs longer
> >> > than what ITIMER_PROF is set to gets hit with a SIGPROF. When that
> >> > happens, things context switch. eflags must have been corrupted by the
> >> > MMX code, thus sigreturn() bombs out, and causes uthread_kern to die as
> >> > well. Here's what strace looks like when balsa tries to read a 33 MB
> >> > mailbox:
> >> >
> >> > 74202 sigreturn(0x81f2c64
> >> >
> >> > When this happens, strace politely dies with a bus error.
> >> >
> >> > Thanks for testing this, Maxim. Hopefully someone can find the problem
> >> > and fix it for good.
> >>
> >> That explains... After a quick glance at png code I found that
> >> the only place where EFLAGS is altered is CPUID code, where
> >> the library flips bit 21 of EFLAGS in order to ensure that the
> >> CPUID instruction is supported (otherwise it will get SIGILL
> >> on older processors). Unfortunately, for some reason FreeBSD
> >
> > Does it need to keep bit 21 of EFLAGS flipped, or can libpng
> > set it back and keep knowledge that CPUID is supported? Or
> > does that bit need to remain set for CPUID to work?
>
> It needs to be able to change it. If you can change the value of the bit (done
> by pushf ; pop %eax ; mov %eax,%ebx ; xor $PSL_ID,%eax ; push %eax ; popf ;
> pushf ; pop %eax ; compare bit PSL_ID of eax ebx to see if they match).
> The problem is if a signal comes in during the middle of that bit toggling due
> to a profiling timer. I think the problem may be that it uses a sequence that
> leaves the bit set, thus the kernel freaks out thinking that the user has
> changed a kernel only flag. The solution is Maxim's patch to make the kernel
> not care about the flag (which it shouldn't since cpuid is not a privileged
> instruction).

I just thought perhaps libpng could do something like:

    static int init_done = 0;
    static int cpuid_supported = 0;

    ...
    if (init_done == 0) {
        block_all_sigs();
        cpuid_supported = check_cpuid();
        init_done = 1;
        unblock_sigs();
    }

But if it always needs to change the bit, I guess the above doesn't help.

--
Dan Eischen