Date:      Mon, 13 Nov 95 22:34 WET
From:      uhclem%nemesis@fw.ast.com (Frank Durda IV)
To:        simonm@dcs.gla.ac.uk, davidg@root.com
Cc:        current@freebsd.org
Subject:   Re: Disk I/O that binds 
Message-ID:  <m0tFD4o-000J7hC@nemesis.lonestar.org>

[1]Just a guess, but could this be caused by the disk request sorting
[1]done by FreeBSD?  New requests get entered into the queue possibly
[1]before old ones, and it could be a long time before a request finally
[1]gets serviced.

[2]   Yes, and this does contribute to the problem. This was the topic of a
[2]recent discussion between myself and John Dyson. Our disksort could use
[2]a little work. One thing I'd like to do is make it possible for the driver
[2]to request that no sorting be done. This would allow controllers that can
[2]do tagged queuing to pass the I/O off to the drive unordered - allowing
[2]the drive to use superior sorting algorithms.
[2]-DG

[3]Frank writes:
[3]Are you saying that the disksort is not performing an elevator sort
[3]or some other type of sort designed for multi-tasking disk operations?
[3]Interesting.
[3]
[3]That would certainly explain the symptom I see.  We wrote an elevator sort
[3]years ago for Tandy 6000 XENIX because the one that came in the AT&T V7 and
[3]Microsoft XENIX port code would end up favoring the process doing a
[3]copy of large items from one place to another (such as article archiving
[3]in news), and other processes that were doing simple stats of files would
[3]hang waiting until the other process gulped.  The end result of the MS
[3]scheme was a sort that did pretty well in the sequential read/write
[3]benchmark, but killed multiuser operations.  We replaced it.
[3]It never occurred to me that this sort of problem would still be around.

I have just run a few tests and have found a way to get a bind to occur
in just a few tries.  All I/O was on SCSI drives (NO IDE).  Hard disk
was 2GB Seagate Barracuda and SCSI was 1540B Adaptec.  1104 stock kernel
(also done on a 2.0.5 with driver deletions kernel), 8MB RAM.

1.	Kill any processes that might be doing a lot of writes in
	background, such as tind, kick off UUCP, the users, etc.  It seems
	OK to leave update, sendmail and other intermittent items running,
	although it may fail faster with them eliminated too.

2.	cd /usr/spool/news (assuming you have news) on one multiscreen.
	Type  cp history /dev/null
	(or you can use some other extremely large file.  My copy of history
	was 29Meg.  History is usually a bit fragmented, although I don't
	know if that is a factor.)

3.	On a different multiscreen, do an ls -alR of *ANY* filesystem
	located on the same drive.  (It can be a different slice).

Now watch the ls progress.  It will probably run fast for 40 to 80 seconds
and then it will slow and stop.  Each time it pauses, start counting
and note where you are path-wise.  Then when it resumes, note how
many files were in the directory it took a long time on.

When you hit a directory that has fewer than ten files in it and it
takes 20 seconds or more to display it, you are seeing the problem.   If
you can't get it to happen right away, do a few "!!"s on the screen
with the cp so that it won't run out of things to do and give you false
results. 

You may note when the ls pauses, the hard disk seems to go quiet also
(less seeking), although the SCSI controller light remains on solid.

I did a ps -alx on a third screen while the ls was stuck.  In my case,
the directory it was stuck on had six files and one subdirectory, and took
27 seconds to resume and display.  (The subdirectory contained one file.)
It also paused on the next three or four directories for excessive
amounts of time vs the number of files present in the directories.
The ps shows that the cp was in "getblk D+" while the ls was in "biowai D+".

Note that there is no disk writing going on here in the test commands.
I was able to get it to fail with disk writing, such as changing the
cp history /dev/null to cp history xyzzy, but it seemed to take a lot longer
to fail.  I also didn't kill update or any of the basic services, so there
was someone doing a write once in a while, even during the first example.

This smells like a disksort implementation flaw that resets direction each
time an item is added to the queue, rather than completely exhausting the
queue in one direction before reversing direction.  Something like an
elevator sort should be done.
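
To make "exhaust the queue in one direction before reversing" concrete,
here is a minimal user-space sketch of an elevator (LOOK) ordering.  It is
not the FreeBSD disksort code and not the XENIX replacement mentioned
above; elevator_order() and cmp_long() are made-up names, requests are
reduced to bare block numbers, and the head is assumed to be sweeping
toward higher block numbers when the pass starts.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* qsort() comparison for block numbers, ascending. */
static int cmp_long(const void *a, const void *b)
{
    long x = *(const long *)a, y = *(const long *)b;
    return (x > y) - (x < y);
}

/*
 * Hypothetical helper: fill out[] with the n pending requests in req[]
 * in the order one elevator pass would service them, starting at block
 * `head` and moving upward first.  Returns 0 on success, -1 if the
 * scratch buffer cannot be allocated.
 */
int elevator_order(const long *req, size_t n, long head, long *out)
{
    long *sorted;
    size_t i, k = 0, first_up;

    if (n == 0)
        return 0;
    assert(req != NULL && out != NULL);

    sorted = malloc(n * sizeof *sorted);
    if (sorted == NULL)
        return -1;
    memcpy(sorted, req, n * sizeof *sorted);
    qsort(sorted, n, sizeof *sorted, cmp_long);

    /* Find the first request at or above the current head position. */
    for (first_up = 0; first_up < n && sorted[first_up] < head; first_up++)
        ;

    /* Upward sweep: every request at or beyond the head, nearest first. */
    for (i = first_up; i < n; i++)
        out[k++] = sorted[i];

    /* Only now reverse: requests below the head, nearest first. */
    for (i = first_up; i > 0; i--)
        out[k++] = sorted[i - 1];

    free(sorted);
    return 0;
}
```

With pending requests at blocks 98, 183, 37, 122, 14, 124, 65 and 67 and
the head at block 53, this gives 65, 67, 98, 122, 124, 183, 37, 14.  The
point is that a request sitting at block 37 waits for at most one full
sweep.  A real driver would also have to decide where requests arriving
in mid-sweep go; putting those behind the head onto the return pass is
what keeps a steady stream of new requests from starving the old ones.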

Frank Durda IV <uhclem@nemesis.lonestar.org>|"The Knights who say "LETNi"
or uhclem%nemesis@fw.ast.com (Fastest Route)| demand...  A SEGMENT REGISTER!!!"
...letni!rwsys!nemesis!uhclem               |"A what?"
...decvax!fw.ast.com!nemesis!uhclem         |"LETNi! LETNi! LETNi!"  - 1983



