Date: Thu, 26 Aug 2010 04:10:11 GMT
From: Garrett Cooper <gcooper@FreeBSD.org>
To: freebsd-bugs@FreeBSD.org
Subject: Re: kern/145385: [cpu] Logical processor cannot be disabled for some SMT-enabled Intel procs
Message-ID: <201008260410.o7Q4ABAl081282@freefall.freebsd.org>

The following reply was made to PR kern/145385; it has been noted by GNATS.

From: Garrett Cooper <gcooper@FreeBSD.org>
To: Garrett Cooper <gcooper@freebsd.org>
Cc: Jeff Roberson <jroberson@jroberson.net>, bug-followup@freebsd.org,
    jkim@freebsd.org, Attilio Rao <attilio@freebsd.org>, jeff@freebsd.org
Subject: Re: kern/145385: [cpu] Logical processor cannot be disabled for
    some SMT-enabled Intel procs
Date: Wed, 25 Aug 2010 21:09:31 -0700

On Wed, Aug 25, 2010 at 9:08 PM, Garrett Cooper <gcooper@freebsd.org> wrote:
> On Tue, Aug 24, 2010 at 9:53 PM, Jeff Roberson <jroberson@jroberson.net> wrote:
>> On Tue, 24 Aug 2010, Garrett Cooper wrote:
>>
>>> On Tue, Aug 24, 2010 at 3:45 PM, Garrett Cooper <gcooper@freebsd.org>
>>> wrote:
>>>>
>>>> On Tue, Aug 24, 2010 at 2:51 PM, Garrett Cooper <yanegomi@gmail.com>
>>>> wrote:
>>>>>
>>>>> On Aug 24, 2010, at 2:03 PM, Jeff Roberson wrote:
>>>>>
>>>>> On Tue, 24 Aug 2010, Garrett Cooper wrote:
>>>>>
>>>>> On Tue, Aug 24, 2010 at 12:22 PM, Jeff Roberson
>>>>> <jroberson@jroberson.net> wrote:
>>>>>
>>>>> On Tue, 24 Aug 2010, Garrett Cooper wrote:
>>>>>
>>>>> On Mon, Aug 23, 2010 at 6:33 AM, John Baldwin <jhb@freebsd.org> wrote:
>>>>>
>>>>> On Sunday, August 22, 2010 4:17:37 am Garrett Cooper wrote:
>>>>>
>>>>> The following trivial patch fixes the issue on my W3520 processor;
>>>>> AFAICS it's what should be done after reading several of the specs
>>>>> because the logical count that's tracked with ebx is exactly what
>>>>> is needed for logical_cpus (it's an absolute quantity). I need to
>>>>> verify it with a multi-cpu topology at work (the two r710s I was
>>>>> testing with E-series Xeons on aren't available remotely right now).
>>>>>
>>>>> Thanks!
>>>>>
>>>>> -Garrett
>>>>>
>>>>> Jung-uk Kim and Attilio Rao have both been looking at this code
>>>>> recently and are in a better position to review the patch in the PR.
>>>>>
>>>>> (Moving jhb@ to BCC, adding jeff@ for possible input on ULE)
>>>>>
>>>>> The patch works as expected (it now properly detects the SMT CPUs as
>>>>> logical CPUs), but setting machdep.hlt_logical_cpus=1 causes other
>>>>> problems with scheduling tasks because certain kernel threads get
>>>>> stuck at boot when netbooting (in particular I've seen problems with
>>>>> usbhub* and a few other bits), so in order for
>>>>> machdep.hlt_logical_cpus to be fixed on SMT processors, it might
>>>>> require some changes to the ULE scheduler to shuffle around the
>>>>> threads to available cores/processors?
>>>>>
>>>>> hlt_logical_cpus should be rewritten to use cpusets to change the
>>>>> default system set rather than specifically halting those cpus.
>>>>> There are a number of loops in the kernel that iterate over all cpus
>>>>> and attempt to bind and perform some task. I think there are a number
>>>>> of other reasons to prefer a less aggressive approach to avoiding the
>>>>> logical cpus as well. Simply preventing user thread schedule will
>>>>> achieve the intent of the sysctl in any event.
>>>>>
>>>>> Ok... in that event then the bug is ok, but maybe I should add
>>>>> some code to the patch to warn the user about functional issues
>>>>> associated with halting logical CPUs?
>>>>>
>>>>> I don't think the bug is ok. We probably shouldn't have sysctls which
>>>>> readily break the kernel. As I said we should instead have the sysctl
>>>>> backend to cpuset. It shouldn't take more than an hour to code and
>>>>> test.
>>>>
>>>> Ok.. I'll look at this once I have my other system back online so
>>>> I can actively break something until I get it to work.
>>>
>>> BTW... there's a lot of code in machdep.c that does the same thing
>>> to idle the CPU, for instance, cpu_idle_hlt, cpu_idle_acpi,
>>> cpu_idle_amdc1e (on amd64). What should be done about those cases
>>> (same thing, or different)?
>>
>> Those are the actual idle functions that the scheduler uses. Those are
>> safe.
>
> I'll look into running this on a Nehalem processor machine, but
> this appears to work as expected on my Penryn processor test machine with
> machdep.hlt_cpus = { 110, 101, 11, 0 } and with machdep.idle=acpi; I'm
> not sure if the loop is supposed to be there still, but it
> wouldn't make sense because the CPU would be spinning in the kernel.

Sorry.. forgot the patch :(.
-Garrett

Attachment: kern-145385.diff

Index: mp_machdep.c
===================================================================
--- mp_machdep.c	(revision 211794)
+++ mp_machdep.c	(working copy)
@@ -203,7 +203,7 @@
 				cnt++;
 		}
 		if (type == CPUID_TYPE_SMT)
-			cpu_logical = cnt;
+			cpu_logical = logical_cpus = cnt;
 		else if (type == CPUID_TYPE_CORE)
 			cpu_cores = cnt;
 	}
@@ -1536,20 +1536,13 @@
 #ifdef MP_WATCHDOG
 	u_int cpuid;
 #endif
-	int retval;
 
 	mask = PCPU_GET(cpumask);
 #ifdef MP_WATCHDOG
 	cpuid = PCPU_GET(cpuid);
 	ap_watchdog(cpuid);
 #endif
-
-	retval = 0;
-	while (mask & hlt_cpus_mask) {
-		retval = 1;
-		__asm __volatile("sti; hlt" : : : "memory");
-	}
-	return (retval);
+	return (mask & hlt_cpus_mask);
 }
 
 #ifdef COUNT_IPIS
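
Editorial illustration (not part of the original message): the second hunk of the attached patch replaces a loop that executed "sti; hlt" for as long as the CPU's bit was set in hlt_cpus_mask with a plain bit test. A minimal user-space sketch of that test follows. It assumes the machdep.hlt_cpus values quoted in the message (110, 101, 11, 0) are binary bit patterns. The sketch_* names and the unsigned int mask type are hypothetical stand-ins and are not the kernel's definitions. The sketch also normalizes the result to 0 or 1, whereas the patched code returns the raw masked value.

```c
#include <assert.h>
#include <limits.h>

/* Hypothetical stand-in for the kernel's CPU mask type. */
typedef unsigned int sketch_cpumask_t;

/* Build the single-bit mask for a CPU id (what PCPU_GET(cpumask) yields per CPU). */
static sketch_cpumask_t
sketch_cpu_bit(unsigned int cpuid)
{
	/* Guard against shifting past the width of the mask type. */
	assert(cpuid < sizeof(sketch_cpumask_t) * CHAR_BIT);
	return ((sketch_cpumask_t)1 << cpuid);
}

/*
 * Report whether a CPU is marked in the halt mask. Like the patched
 * code, this only tests the bit; it never spins or halts the caller.
 */
static int
sketch_cpu_marked_halted(unsigned int cpuid, sketch_cpumask_t hlt_cpus_mask)
{
	return ((sketch_cpu_bit(cpuid) & hlt_cpus_mask) != 0);
}
```

Under that binary reading, the mask 110 (0x6) marks CPUs 1 and 2 and leaves CPU 0 unmarked, 101 (0x5) marks CPUs 0 and 2, 11 (0x3) marks CPUs 0 and 1, and 0 marks none.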