Date: Tue, 7 Feb 2006 21:27:55 +0300
From: Yar Tikhiy <yar@comp.chem.msu.su>
To: Cy Schubert <Cy.Schubert@spqr.komquats.com>
Cc: freebsd-current@freebsd.org, netchild@freebsd.org
Subject: Re: 7.0-CURRENT Hang
Message-ID: <20060207182755.GB32998@comp.chem.msu.su>
In-Reply-To: <200602071806.k17I6WQc007602@cwsys.cwsent.com>
References: <20060207173154.GE19674@comp.chem.msu.su> <200602071806.k17I6WQc007602@cwsys.cwsent.com>

On Tue, Feb 07, 2006 at 10:06:32AM -0800, Cy Schubert wrote:
> In message <20060207173154.GE19674@comp.chem.msu.su>, Yar Tikhiy writes:
> > On Mon, Feb 06, 2006 at 08:29:35PM -0800, Cy Schubert wrote:
> > >
> > > On the Pentium P54C model (that's an old 120 MHz Pentium I use as a
> > > 4.x, 5.x, and 7.x ports build testbed) the CPUID instruction, when
> > > called with AL = 0x02, returns EAX = EBX = ECX = EDX = 0. The code
> > > fragment in identcpu.c below results in "rounds" becoming 0xffffffff.
> > >
> > >	do_cpuid(0x2, regs);
> > >	rounds = (regs[0] & 0xff) - 1;
> > >
> > > The subsequent loop will run virtually forever: it takes forever for
> > > this machine to count down from 0xffffffff, performing a great many
> > > calls to get_INTEL_TLB and virtually hanging the machine in the
> > > process.
> > >
> > >	while (rounds > 0) {
> > >		[... code ...]
> > >		rounds--;
> > >	}
> >
> > FWIW, my presumably P54C machine (Family 5 Model 2 Stepping 6)
> > doesn't indicate it has the CPUID 0x02 function. That is, CPUID
> > 0x00 returns EAX = 0x01, which is the highest function supported.
> > Could you try to run the misc/cpuid port on your Pentium and show
> > its output? It might appear that the code around CPUID 0x02 shouldn't
> > be reached at all in your case. Zero values from CPUID 0x02 are
> > pretty indicative of that.
>
> Mine is Family 5 Model 2 Stepping 12. All of my doc is for Pentium-Pro and
> newer so you are probably correct.

Do you know what CPUID function 0x00 returns in EAX for your CPU?
Hint: just run misc/cpuid once and show its output here. I've just
fixed the port so that it has no bogus dependencies and is very
light-weight.

> > Dealing with "rounds" equal to -1 can be a good idea anyway to catch
> > brain-dead CPUs instead of hanging the system on them.
>
> Well, with rounds = -1 [actually (unsigned int)0xffffffff], the CPU will
> "appear" to hang as it loops virtually forever: counting back from
> 0xffffffff on a 120 MHz machine, and fetching the TLB info a number of
> times on each iteration, takes hours to get through just a few
> iterations. I've seen mine go through the loop, decrementing "rounds"
> each time, for hours at a time. Initially, before digging into it with
> DDB, it did appear that the CPU was hung, but it was just starting to
> loop 4,294,967,295 times. On older and slower machines, if it took hours
> to get through a few iterations, my guess is that it would take days to
> loop through this code. My patch allows it to take the defaults and
> finally boot. If the CPU doesn't support AL = 0x02, what's the point of
> looping? It appears to run nicely with the patch.

I do see that rounds = -1 is causing trouble. I just meant that
we should not call do_cpuid(0x02) at all if (cpu_high < 2) because
it can result in undefined behavior. Your patch still makes sense
because it deals with possible brain-dead CPUs. I'd implement it
in a slightly different way though -- stay tuned! :-)

-- 
Yar
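
For illustration, the two checks discussed above (skip CPUID 0x02 when
cpu_high < 2, and never let a zero count wrap around) might look roughly
like the sketch below. This is not the actual identcpu.c change: the
function names unguarded_rounds and cpuid2_rounds are made up for this
example, and the CPUID results are passed in as plain arguments so the
arithmetic can be checked without executing the real instruction.

```c
/*
 * Sketch only. cpu_high stands for EAX as returned by CPUID 0x00 and
 * eax for EAX as returned by CPUID 0x02; both names are assumptions
 * made for this example.
 */

/* The arithmetic from the quoted fragment: wraps when the low byte is 0. */
unsigned int
unguarded_rounds(unsigned int eax)
{
	return ((eax & 0xff) - 1);
}

/* Number of additional CPUID 0x02 rounds, with both guards applied. */
unsigned int
cpuid2_rounds(unsigned int cpu_high, unsigned int eax)
{
	unsigned int count;

	if (cpu_high < 2)
		return (0);	/* function 0x02 is not supported at all */
	count = eax & 0xff;
	if (count == 0)
		return (0);	/* bogus reply: don't wrap to 0xffffffff */
	return (count - 1);
}
```

With all-zero registers, unguarded_rounds(0) yields the largest unsigned
value, which is the runaway loop described above, while cpuid2_rounds()
returns 0 both for cpu_high = 1 and for a zero count, so the while loop
is skipped.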