Date: Tue, 07 Feb 2006 10:06:32 -0800
From: Cy Schubert <Cy.Schubert@komquats.com>
To: Yar Tikhiy <yar@comp.chem.msu.su>
Cc: freebsd-current@freebsd.org, netchild@freebsd.org
Subject: Re: 7.0-CURRENT Hang
Message-ID: <200602071806.k17I6WQc007602@cwsys.cwsent.com>
In-Reply-To: Message from Yar Tikhiy <yar@comp.chem.msu.su> of "Tue, 07 Feb 2006 20:31:54 +0300." <20060207173154.GE19674@comp.chem.msu.su>

In message <20060207173154.GE19674@comp.chem.msu.su>, Yar Tikhiy writes:
> On Mon, Feb 06, 2006 at 08:29:35PM -0800, Cy Schubert wrote:
> >
> > On the Pentium P54C model (that's an old 120 MHz Pentium I use as a 4.x,
> > 5.x, and 7.x ports build testbed), the CPUID instruction, when called
> > with AL = 0x02, returns EAX = EBX = ECX = EDX = 0. The code fragment in
> > identcpu.c below results in "rounds" becoming 0xffffffff.
> >
> > 	do_cpuid(0x2, regs);
> > 	rounds = (regs[0] & 0xff) - 1;
> >
> > The subsequent loop will run virtually forever (it takes forever for
> > this machine to count down from 0xffffffff), performing a very great
> > many calls to get_INTEL_TLB and virtually hanging the machine in the
> > process.
> >
> > 	while (rounds > 0) {
> > 		[... code ...]
> > 		rounds--;
> > 	}
>
> FWIW, my presumably P54C machine (Family 5 Model 2 Stepping 6)
> doesn't indicate it has the CPUID 0x02 function. That is, CPUID
> 0x00 returns EAX = 0x01, which is the highest function supported.
> Could you try to run the misc/cpuid port on your Pentium and show
> its output? It might appear that the code around CPUID 0x02 shouldn't
> be reached at all in your case. Zero values from CPUID 0x02 are
> pretty indicative of that.

Mine is Family 5 Model 2 Stepping 12. All of my doc is for Pentium-Pro
and newer, so you are probably correct.

>
> Dealing with "rounds" equal to -1 can be a good idea anyway to catch
> brain dead CPUs instead of hanging the system on them.

Well, with rounds = -1 [actually (unsigned int)0xffffffff], the CPU will
"appear" to hang as it loops virtually forever. Counting back from
0xffffffff on a 120 MHz machine, and fetching the TLB info a number of
times each iteration, takes hours to get through just a few iterations.
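
The wraparound described above is easy to reproduce in user space. What follows is only a minimal sketch, not the patch and not the identcpu.c code: rounds_unguarded(), rounds_guarded() and the cpu_high parameter (standing for the highest function number reported by CPUID 0x00) are hypothetical names used for illustration, and the guard is one possible way to skip the loop when function 0x02 is absent or returns zeros.

```c
#include <stdint.h>

/*
 * Hypothetical helper: derive the loop count the same way the quoted
 * fragment does, from the EAX value returned by CPUID function 0x02.
 */
static uint32_t
rounds_unguarded(uint32_t eax)
{
	/* Unsigned arithmetic: with eax == 0 this wraps to 0xffffffff. */
	return ((eax & 0xff) - 1);
}

/*
 * Hypothetical guarded variant (an assumption, not the actual patch):
 * return 0 rounds when CPUID 0x00 says function 0x02 is unsupported,
 * or when the low byte of EAX is zero, so the loop is never entered.
 */
static uint32_t
rounds_guarded(uint32_t cpu_high, uint32_t eax)
{
	uint32_t count = eax & 0xff;

	if (cpu_high < 2 || count == 0)
		return (0);
	return (count - 1);
}
```

With an all-zero EAX, rounds_unguarded(0) yields 0xffffffff, which is the count the loop then tries to work through; rounds_guarded(1, 0) yields 0 and the "while (rounds > 0)" body is skipped entirely.
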
I've seen mine go through "rounds", decrementing it each time, for hours
at a time. Initially, before digging into it using DDB, it did appear
that the CPU was hung, but it was just starting a loop of 4,294,967,295
iterations. On older and slower machines, if it took hours to get
through a few iterations, my guess is that it would take days to loop
through this code.

My patch allows it to take the defaults and finally boot. If the CPU
doesn't support AL = 0x02, what's the point of looping? It appears to
run nicely with the patch. I have another machine just like this as a
firewall.

Cheers,
Cy Schubert <Cy.Schubert@komquats.com>
Web: http://www.komquats.com and http://www.bcbodybuilder.com
FreeBSD UNIX: <cy@FreeBSD.org> Web: http://www.FreeBSD.org
BC Government: <Cy.Schubert@gov.bc.ca>

"Lift long enough and I believe arrogance is replaced by
humility and fear by courage and selfishness by generosity
and rudeness by compassion and caring." -- Dave Draper