From: Sergey Kandaurov
To: Ian Smith
Cc: Daniel Gerzo, freebsd-stable@freebsd.org, Alexander Motin
Date: Wed, 20 Apr 2011 22:29:50 +0400
Subject: Re: kern.smp.maxid error on i386 UP [was: powerd / cpufreq question]

On 20 April 2011 21:02, Ian Smith wrote:
> On Wed, 13 Apr 2011, Ian Smith wrote:
>  > On Tue, 12 Apr 2011, Daniel Gerzo wrote:
>  >  > On 11.4.2011 6:08, Ian Smith wrote:
> [..]
>  >  > > Are those kern.cp_times values as they came, or did you remove trailing
>  >  > > zeroes?  Reason I ask is that on my Thinkpad T23, single-core 1133/733
>  >  > > MHz, sysctl kern.cp_time shows the usual 5 values, but kern.cp_times has
>  >  > > the same 5 values for cpu0, but then 5 zeroes for each of cpu1 through
>  >  > > cpu31, on 8.2-PRE about early January.  I need to update the script to
>  >  > > remove surplus data for non-existing cpus, but wonder if the extra data
>  >  > > also appeared on your 12 core box?
>  >  >
>  >  > I haven't removed anything, it's a pure copy&paste.
>  >
>  > Thanks.  I'll check the single-cpu case again after updating to 8.2-R
>
> Ok, still a problem on at least my i386 single core Thinkpad T23 at
> 8.2-R, since 8.0 I think, certainly evident in a sysctl -a at 8.1-R
>
> FreeBSD t23.smithi.id.au 8.2-RELEASE FreeBSD 8.2-RELEASE #1: Thu Apr 14
> 21:45:47 EST 2011 root@t23.smithi.id.au:/usr/obj/usr/src/sys/GENERIC i386
>
> Verbose dmesg: http://smithi.id.au/t23_dmesg_boot-v.8.2-R.txt
> sysctl -a:     http://smithi.id.au/t23_sysctl-a_8.2-R.txt
>
> kern.ccpu: 0
>  0
> kern.smp.forward_signal_enabled: 1
> kern.smp.topology: 0
> kern.smp.cpus: 1
> kern.smp.disabled: 0
> kern.smp.active: 0
> kern.smp.maxcpus: 32
> kern.smp.maxid: 31      <<<<<<<
> hw.ncpu: 1
>
> kern.cp_times: 38548 1 120437 195677 9660939 0 0 0 0 0 0 0 0 0 0 0 0 0 0
> 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
> 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
> 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
> 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
>
> /usr/src/sys/kern/kern_clock.c:
> return SYSCTL_OUT(req, 0, sizeof(long) * CPUSTATES * (mp_maxid + 1));
>
> Consumers of kern.cp_times like powerd, top, dtrace? and others have to
> loop over 32 cpus, all but one non-existent, and there seem to be many
> places in the kernel doing eg: for (cpu = 0; cpu <= mp_maxid; cpu++) {
> and while CPU_FOREACH / CPU_ABSENT will skip over them, seems wasteful
> at best on machines least likely to have cycles to spare.
>
> eg: powerd parses kern.cp_times to count cpus, wasting cycles adding
> up the 31 'empty' cpus.  I haven't explored other userland consumers.
>
> Clearly kern.smp.maxid (ie mp_maxid) should be 0, not 31.  On i386,
> non-APIC i386 at least, mp_maxid is not set to (mp_ncpus - 1) as on some
> other archs .. after having being initialised to (MAXCPU - 1) in
> /sys/i386/i386/mp_machdep.c it's never updated for non-smp machines.
>
> I haven't chased all of these rabbits down all of their holes by any
> means, but it seems that making /sys/i386/i386/mp_machdep.c do what it
> says it's gonna do ('with an id of 0') should help.  Paste, tabs lost:
>
> int
> cpu_mp_probe(void)
> {
>        /*
>         * Always record BSP in CPU map so that the mbuf init code works
>         * correctly.
>         */
>        all_cpus = 1;
>        if (mp_ncpus == 0) {
>                /*
>                 * No CPUs were found, so this must be a UP system.  Setup
>                 * the variables to represent a system with a single CPU
>                 * with an id of 0.
>                 */
>                mp_ncpus = 1;
> +               mp_maxid = 0;
>                return (0);
>        }
>
>        /* At least one CPU was found. */
>        if (mp_ncpus == 1) {
>                /*
>                 * One CPU was found, so this must be a UP system with
>                 * an I/O APIC.
>                 */
> +               mp_maxid = 0;
>                return (0);
>        }
>
>        /* At least two CPUs were found. */
>        return (1);
> }
>
> Note that the second added line above already exists in
> /sys/amd64/amd64/mp_machdep.c, maybe to fix a similar problem, though
> that should only apply to 'a UP system with an I/O APIC'.  Maybe better
> could be to fix this in cpu_mp_probe's caller, /sys/kern/subr_smp.c:
>
> static void
> mp_start(void *dummy)
> {
>        mtx_init(&smp_ipi_mtx, "smp rendezvous", NULL, MTX_SPIN);
>
>        /* Probe for MP hardware. */
>        if (smp_disabled != 0 || cpu_mp_probe() == 0) {
>                mp_ncpus = 1;
> +               mp_maxid = 0;
>                all_cpus = PCPU_GET(cpumask);
>                return;
>        }
>
>        cpu_mp_start();
>        printf("FreeBSD/SMP: Multiprocessor System Detected: %d CPUs\n",
>            mp_ncpus);
>        cpu_mp_announce();
> }
>
> I'm probably a long way off base for a solution, but think I've located
> the problem.  Thoughts?  Is this a known issue?  Might any developers
> actually still have a single-cpu i386 system to check this on? :)
>
> Very happy to test any patches etc.
>

Ouch. Looks like that affects a system with 2 cores as well.
Intel Core2 E7200, 8.2-R i386 SMP:

kern.smp.forward_signal_enabled: 1
kern.smp.topology: 0
kern.smp.cpus: 2
kern.smp.disabled: 0
kern.smp.active: 1
kern.smp.maxcpus: 32
kern.smp.maxid: 31
kern.cp_times: 867360 171 429180 70114 170549535 1385294 306 176659 82618 170270900
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0

Your analysis looks promising.

-- 
wbr,
pluknet
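
P.S. Until mp_maxid is fixed, a userland consumer of kern.cp_times could
skip the all-zero padding entries itself. The following is only a rough
sketch of that idea, not code from powerd or anywhere else in the tree:
count_active_cpus is a made-up helper name, and CPUSTATES is defined as 5
here only because each cpu shows 5 values in the output above.

```c
#include <assert.h>
#include <stddef.h>

/* Assumption: five counters per CPU, as in the sysctl output above. */
#define CPUSTATES 5

/*
 * Hypothetical helper: count the CPUs in a kern.cp_times-style buffer
 * that have at least one non-zero counter.  'nlongs' is the number of
 * long values in 'cp_times'; a trailing partial entry is ignored.
 */
static int
count_active_cpus(const long *cp_times, size_t nlongs)
{
	size_t ncpus = nlongs / CPUSTATES;
	int active = 0;

	assert(cp_times != NULL || nlongs == 0);

	for (size_t cpu = 0; cpu < ncpus; cpu++) {
		int seen = 0;

		/* A CPU counts as present if any of its states has ticks. */
		for (int state = 0; state < CPUSTATES; state++) {
			if (cp_times[cpu * CPUSTATES + state] != 0)
				seen = 1;
		}
		active += seen;
	}
	return (active);
}
```

Fed the two 160-value outputs from this thread it would report 1 and 2
cpus respectively. A cpu whose counters are genuinely all zero would be
missed, so reading hw.ncpu is probably the safer way to get the count.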