Date: Fri, 16 Dec 2011 12:14:24 +0100
From: Attilio Rao <attilio@freebsd.org>
To: Steve Kargl <sgk@troutmask.apl.washington.edu>
Cc: Andrey Chernov <ache@nagual.pp.ru>, George Mitchell <george+freebsd@m5p.com>, Doug Barton <dougb@freebsd.org>, freebsd-stable@freebsd.org
Subject: Re: SCHED_ULE should not be the default
Message-ID: <CAJ-FndD0vFWUnRPxz6CTR5JBaEaY3gh9y7-Dy6Gds69_aRgfpg@mail.gmail.com>
In-Reply-To: <20111215215554.GA87606@troutmask.apl.washington.edu>
References: <4EE1EAFE.3070408@m5p.com> <CAJ-FndBSOS3hKYqmPnVkoMhPmowBBqy9-%2BeJJEMTdoVjdMTEdw@mail.gmail.com> <20111215215554.GA87606@troutmask.apl.washington.edu>

2011/12/15 Steve Kargl <sgk@troutmask.apl.washington.edu>:
> On Thu, Dec 15, 2011 at 05:25:51PM +0100, Attilio Rao wrote:
>>
>> I basically went through all the e-mails you just sent and identified 4
>> real reports that we could work on, and summarized them in the attached
>> Excel file.
>> I'd like George, Steve, Doug, Andrey and Mike to review the data
>> there and add more, if they want, or make further important
>> clarifications, in particular about whether Xorg is present (or not)
>> in their workload.
>
> Your summary of my observations appears correct.
>
> I have grabbed an up-to-date /usr/src, built and
> installed world, and built and installed a new
> kernel on one of the nodes in my cluster.  It
> has
>
> CPU: Dual Core AMD Opteron(tm) Processor 280 (2392.65-MHz K8-class CPU)
>   Origin = "AuthenticAMD"  Id = 0x20f12  Family = f  Model = 21  Stepping = 2
>   Features=0x178bfbff<FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,
>   MCA,CMOV,PAT,PSE36,CLFLUSH,MMX,FXSR,SSE,SSE2,HTT>
>   Features2=0x1<SSE3>
>   AMD Features=0xe2500800<SYSCALL,NX,MMX+,FFXSR,LM,3DNow!+,3DNow!>
>   AMD Features2=0x3<LAHF,CMP>
> real memory  = 17179869184 (16384 MB)
> avail memory = 16269832192 (15516 MB)
> FreeBSD/SMP: Multiprocessor System Detected: 4 CPUs
> FreeBSD/SMP: 2 package(s) x 2 core(s)
>
> I can perform new tests with both ULE and 4BSD, but you'll
> need to be precise in the information you want collected
> (and how to collect the data) due to the rather limited
> amount of time I currently have.

That seems like a perfect environment. Please just make sure you built a
debug-free userland (basically, with MALLOC_PRODUCTION set for jemalloc).

The first thing is: can you try reproducing your case?
As far as I understood, for you it was enough to run N + small_amount
CPU-bound threads to show the performance penalty, so I'd ask you to
start with dnetc, or just your preferred CPU-bound workload, and verify
that you can reproduce the issue.
While it is happening, please monitor thread bouncing and CPU
utilization via 'top' (you don't need to be 100% precise, just get an
idea, and keep an eye on things like excessive thread migration,
excessive thread binding, and low throughput on a CPU).

One note: if your workload needs to do I/O, please use a tmpfs or
memory-backed storage for it, in order to minimize I/O effects.
Also, verify this doesn't happen with the 4BSD scheduler, just in case.

Finally, if the problem is still there, please recompile your kernel
after adding:

options KTR
options KTR_ENTRIES=262144
options KTR_COMPILE=(KTR_SCHED)
options KTR_MASK=(KTR_SCHED)

and reproduce the issue.
When you are in the middle of the scheduling issue, run:

# ktrdump -ctf > ktr-ule-problem-YOURNAME.out

and send the output to the mailing list along with your dmesg and the
information on CPU utilization you gathered with top(1).

That should cover it all, but if you have further questions, please
just go ahead.

Thanks,
Attilio

-- 
Peace can only be achieved by understanding - A. Einstein
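
If no ready-made CPU-bound workload is at hand, the "N + small_amount of CPU-bound threads" reproduction could be approximated with plain sh loops. This is only a minimal sketch and not the dnetc test itself: the helper names (burn, worker_count, run_workers) are made up for illustration, the workers are separate shell processes rather than threads, and the iteration count is an arbitrary knob to tune.

```shell
#!/bin/sh
# Sketch only: hypothetical helpers, not part of any FreeBSD tool.

# Fixed-size busy loop: pure shell arithmetic, no I/O.
burn() {
    i=0
    while [ "$i" -lt "$1" ]; do
        i=$((i + 1))
    done
}

# Number of workers to start: CPU count ($1) plus a small surplus ($2).
worker_count() {
    echo $(( $1 + $2 ))
}

# Start $1 background workers, each running burn for $2 iterations and
# printing its own wall-clock time in whole seconds (via date +%s).
run_workers() {
    nworkers=$1
    iters=$2
    w=1
    while [ "$w" -le "$nworkers" ]; do
        (
            start=$(date +%s)
            burn "$iters"
            end=$(date +%s)
            echo "worker $w: $((end - start)) s"
        ) &
        w=$((w + 1))
    done
    wait    # block until every worker has finished
}
```

On the 4-CPU box above this could be started as run_workers "$(worker_count "$(sysctl -n hw.ncpu)" 1)" ITERATIONS, with ITERATIONS chosen so that each worker stays busy long enough to watch in top(1) from another terminal. Comparing the per-worker times under ULE and 4BSD gives a rough first indication before going to KTR.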