Date: Tue, 2 Oct 2007 15:26:43 +1000 (EST)
From: Bruce Evans <brde@optusnet.com.au>
To: Kevin Oberman <oberman@es.net>
Cc: cvs-all@freebsd.org, src-committers@freebsd.org, cvs-src@freebsd.org, Jeff Roberson <jeff@freebsd.org>, Garance A Drosehn <gad@freebsd.org>, Ben Kaduk <minimarmot@gmail.com>, Bruce Evans <brde@optusnet.com.au>, Jeff Roberson <jroberson@chesapeake.net>
Subject: Re: cvs commit: src/sys/kern sched_ule.c
Message-ID: <20071002133623.X40629@besplex.bde.org>
In-Reply-To: <20071001145257.EC9FC4500F@ptavv.es.net>
References: <20071001145257.EC9FC4500F@ptavv.es.net>
On Mon, 1 Oct 2007, Kevin Oberman wrote:

>> Date: Mon, 1 Oct 2007 21:26:39 +1000 (EST)
>> From: Bruce Evans <brde@optusnet.com.au>
>>
>> On Mon, 1 Oct 2007, Jeff Roberson wrote:
>>> Given the overwhelming amount of feedback by qualified people, I think
>>> it's fair to say that ULE gives a more responsive system under load.
>>
>> This is not my experience.  Maybe I don't run enough interactive bloatware
>> to have a large enough interactive load for the scheduler to make a
>> difference.
>
> That, or you don't run interactive on older systems with slow CPUs and
> limited memory.  (This does NOT imply that ULE is going to help when
> experiencing heavy swapfile activity.  I don't think anything helps
> that except more RAM.)

Not recently.  I used a P5/133 which was new in 1996 as an X client until
Y2K, since it was fast enough to be an X client, but I stopped running
builds on it in 1998.

> The place it seems most evident to me is X responsiveness when the system
> (1GHz x 256MB PIII) is busy with large builds.  Performance is terrible
> with 4BSD and only bad with ULE.  Note that I am running Gnome (speaking
> of bloatware).
>
> The difference when running ULE is pretty dramatic.

Again, this is not my experience.  I don't run gnome, but occasionally run
X, and often run kernel builds and network benchmarks.  A quick test now
showed good interactivity for light browsing and editing at a load average
of 32 generated by a pessimized makeworld (-j16) on both an A64 2.2GHz UP
and a Celeron 366MHz UP.  The light interactive use just doesn't need to
run long enough for its priority to become as low (numerically high) as
that of the build.  Maybe heavy X use with streaming video is what you
count as interactive.  I count that as not very hard realtime.

Further testing of my ~4BSD scheduler in ~5.2 indicates that when a process
wants less than about 1/loadavg of the CPU on average, it usually just gets
it, with no scheduling delays, since it usually has a higher priority than
all other user processes.  Otherwise, the worst-case scheduling delays
increase from ~10 msec to ~2 seconds.  It is easy to reduce the scheduling
quantum from its default of 100 msec by a factor of 100, but this doesn't
seem to work right.  So the behaviour is very dependent on the load and on
the amount of CPU wanted by the interactive process.

...

I now have more experience with ULE.  A version built today gave
dramatically worse interactivity, so much so that I think it must have
been broken recently.  A simple shell loop hangs the rest of the system in
some cases, and a background build has similar bad effects, probably
limited mainly by its useful loops not being endless.

First I tried an old regression test for nice[1-2]:

%%%
for i in 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20
do
	nice -$i sh -c "while :; do echo -n;done" &
done
top -o time
%%%

This hung after starting only about one of the shell processes.  After
cutting the list down to just one process with nice -20, it still hung.
Shells on other syscons terminals running at rtprio 0 could not compete
with the nice -20 process:
- they could not start top to look at what was happening
- an already-running top could not display anything new
- they could not start killall.
With the list cut down to about 6 processes, ps in ddb showed evidence of
all the processes starting, and I was able to kill them all using kill in
ddb.

The above was with HZ = 100.  After changing HZ to 1000, one nice -20
process could be started with no problems, but similar problems occurred
with a few more processes.
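(For reference, the ~10 msec and ~2 second worst-case delays above come
from measuring how late a process gets rescheduled.  The sketch below is
only an illustration written for this message, not the actual
non-interactive utility referred to further down: it sleeps in 1 ms steps
and records the worst-case overshoot, and the kern.sched.quantum value
read via sysctlbyname() is printed purely for comparison.  The file and
variable names are arbitrary.)

%%%
/*
 * Sketch only: estimate the worst-case scheduling delay by sleeping
 * in 1 ms steps and recording how much longer than requested each
 * sleep actually took.  Run it alongside a background load.
 */
#include <sys/types.h>
#include <sys/sysctl.h>

#include <stdio.h>
#include <time.h>

int
main(void)
{
	struct timespec req = { 0, 1000000 };	/* request 1 ms sleeps */
	struct timespec t0, t1;
	double delay, maxdelay = 0;
	int quantum, i;
	size_t len = sizeof(quantum);

	/* Print the current quantum for comparison with the results. */
	if (sysctlbyname("kern.sched.quantum", &quantum, &len, NULL, 0) == 0)
		printf("kern.sched.quantum: %d\n", quantum);

	for (i = 0; i < 10000; i++) {
		clock_gettime(CLOCK_MONOTONIC, &t0);
		nanosleep(&req, NULL);
		clock_gettime(CLOCK_MONOTONIC, &t1);
		delay = (t1.tv_sec - t0.tv_sec) +
		    (t1.tv_nsec - t0.tv_nsec) / 1e9 - 1e-3;
		if (delay > maxdelay)
			maxdelay = delay;
	}
	printf("max scheduling delay: ~%.3f ms\n", maxdelay * 1000);
	return (0);
}
%%%

(The measured overshoot includes timer granularity of a tick or two, so
only delays well above 1/HZ mean anything.)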
With a nice list of "for i in 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20",
one of the shells apparently runs for about a minute before its priority
is reduced at all.  During this time, the symptoms were the same as above.
The shell that uses extra time initially is not usually the first one in
the list.  After starting all the shells, the behaviour was normal,
including niceness having too little effect.  On a later run, all the
shells started in a couple of seconds (still slow) even with the full nice
list restored.

Running makeworld with just -j4 in the background gives similar symptoms.
When a new process is started, it sometimes gets too many cycles to begin
with, and apparently completely stops all processes in the makeworld (but
not the top displaying things) for several seconds.  After a while (I
guess when the interactivity score decreases), this behaviour changes to
giving the new process very few cycles even if it is semi-interactive (a
foreground process started from a shell).  In at least this phase, ^C to
kill processes doesn't work, but ^Z to suspend them and then kill from the
shell works normally, and interactivity in not-very-bloated mail programs
and editors is very bad.  A non-interactive utility to measure the
scheduling delay reports a max delay of about 2 seconds for most runs,
while with other schedulers and kernels it only reports 2 seconds
occasionally, even at much higher loads.

Other behaviour with 4BSD schedulers and various kernels:
- the max scheduling delay is almost independent of the CPU speed.
- the max scheduling delay is slightly worse for -current with 4BSD than
  with my ~5.2.
- -current has anomalous behaviour relative to ~5.2 for a background
  makeworld -j16: many fewer runnable processes, a much smaller max load
  average, and many more zombies visible when top looks.
- in ~5.2, removing the hack that puts threads back on the head of the
  queue instead of the tail significantly reduces the max scheduling
  delay.  (This is a non-hack with related changes in -current, but I just
  used s/TAIL/HEAD/.)  This hack reduced makeworld time significantly.  I
  think removing it improves interactivity only by accident.  Removing it
  restores the old bogus scheduler behaviour of rescheduling on every
  "slow" interrupt, which gives essentially round-robin scheduling under
  loads that generate lots of interrupts.  Interactivity is still poor
  because makeworld sometimes generates a few hundred processes per
  second, and cycling through that many takes a long time even with a tiny
  quantum.  (See the illustrative head/tail sketch at the end of this
  message.)
- reducing kern.sched.quantum never had much effect.  Same for increasing
  HZ in -current with 4BSD.

Bruce
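(The head-vs-tail hack in the list above comes down to whether a thread
being put back on its run queue is inserted with TAILQ_INSERT_HEAD, so it
runs again ahead of everything already waiting at that priority, or with
TAILQ_INSERT_TAIL, so it waits its turn behind them.  The following is
only a standalone illustration of that difference using sys/queue.h; it is
not the actual runq code, and all of the names in it are made up.)

%%%
/*
 * Illustration only -- not the kernel's run queue code.
 */
#include <sys/queue.h>

#include <stdio.h>

struct fake_thread {
	const char *name;
	TAILQ_ENTRY(fake_thread) link;
};

TAILQ_HEAD(fake_runq, fake_thread);

int
main(void)
{
	struct fake_runq rq = TAILQ_HEAD_INITIALIZER(rq);
	struct fake_thread builder = { "cc (makeworld)" };
	struct fake_thread editor = { "editor (interactive)" };
	struct fake_thread *td;

	/* The build process is already waiting on the queue. */
	TAILQ_INSERT_TAIL(&rq, &builder, link);

	/* Tail insertion: the editor waits behind the build process. */
	TAILQ_INSERT_TAIL(&rq, &editor, link);

	/*
	 * The "hack" instead re-inserts at the head:
	 *	TAILQ_INSERT_HEAD(&rq, &editor, link);
	 * which lets the re-queued thread run again before anything
	 * already waiting at the same priority.
	 */

	TAILQ_FOREACH(td, &rq, link)
		printf("next to run: %s\n", td->name);
	return (0);
}
%%%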