Date: Mon, 13 Nov 95 22:34 WET
From: uhclem%nemesis@fw.ast.com (Frank Durda IV)
To: simonm@dcs.gla.ac.uk, davidg@root.com
Cc: current@freebsd.org
Subject: Re: Disk I/O that binds
Message-ID: <m0tFD4o-000J7hC@nemesis.lonestar.org>

[1]Just a guess, but could this be caused by the disk request sorting
[1]done by FreeBSD?  New requests get entered into the queue possibly
[1]before old ones, and it could be a long time before a request finally
[1]gets serviced.

[2]   Yes, and this does contribute to the problem. This was the topic of a
[2]recent discussion between myself and John Dyson. Our disksort could use
[2]a little work. One thing I'd like to do is make it possible for the driver
[2]to request that no sorting be done. This would allow controllers that can
[2]do tagged queuing to pass the I/O off to the drive unordered - allowing
[2]the drive to use superior sorting algorithms.
[2]-DG

[3]Frank writes:
[3]Are you saying that the disksort is not performing an elevator sort
[3]or some other type of sort designed for multi-tasking disk operations?
[3]Interesting.
[3]
[3]That would certainly explain the symptom I see.  We wrote an elevator sort
[3]years ago for Tandy 6000 XENIX because the one that came in the AT&T V7 and
[3]Microsoft XENIX port code would end up favoring the process doing a
[3]copy of large items from one place to another (such as article archiving
[3]in news), and other processes that were doing simple stats of files would
[3]hang waiting until the other process gulped.  The end result of the MS
[3]scheme was a sort that did pretty good in the sequential read/write
[3]benchmark, but killed multiuser operations.  We replaced it.
[3]It never occurred to me that this sort of problem would still be around.

I have just run a few tests and have found a way to get a bind to occur
in just a few tries.

All I/O was on SCSI drives (NO IDE).  The hard disk was a 2GB Seagate
Barracuda and the SCSI controller was an Adaptec 1540B.  1104 stock kernel
(also done on a 2.0.5 kernel with driver deletions), 8MB RAM.

1.  Kill any processes that might be doing a lot of writes in the
    background, such as tind; kick off UUCP, the users, etc.  It seems OK
    to leave update, sendmail and other intermittent items running,
    although it may fail faster with them eliminated too.

2.  cd /usr/spool/news (assuming you have news) on one multiscreen.
    Type cp history /dev/null (or you can use some other extremely
    large file.  My copy of history was 29Meg.  History is usually a bit
    fragmented, although I don't know if that is a factor.)

3.  On a different multiscreen, do a ls -alR of *ANY* filesystem located
    on the same drive.  (It can be a different slice.)

Now watch the ls progress.  It will probably run fast for 40 to 80 seconds
and then it will slow and stop.  Each time it pauses, start counting and
note where you are path-wise.  Then when it resumes, note how many files
were in the directory it took a long time on.  When you hit a directory
that has fewer than ten files in it and it takes 20 seconds or more to
display it, you are seeing the problem.

If you can't get it to happen right away, do a few "!!"s on the screen with
the cp so that it won't run out of things to do and give you false results.

You may note that when the ls pauses, the hard disk seems to go quiet also
(less seeking), although the SCSI controller light remains on solid.

I did a ps -alx on a third screen while the ls was stuck.  In my case, the
directory it was stuck on had six files and one subdirectory, and took 27
seconds to resume and display.  (The subdirectory contained one file.)
It also paused on the next three or four directories for excessive amounts
of time vs the number of files present in the directories.

The ps shows that the cp was in "getblk D+" while the ls was in "biowai D+".

Note that there is no disk writing going on here in the test commands.
I was able to get it to fail with disk writing, such as changing the
cp history /dev/null to cp history xyzzy, but it seemed to take a lot
longer to fail.  I also didn't kill update or any of the basic services,
so there was someone doing a write once in a while, even during the first
example.

This smells like a disksort implementation flaw that resets direction
each time an item is added to the queue, rather than completely exhausting
the queue in one direction before reversing direction.  Something like an
elevator sort should be done.

Frank Durda IV <uhclem@nemesis.lonestar.org>|"The Knights who say "LETNi"
or uhclem%nemesis@fw.ast.com (Fastest Route)| demand...  A SEGMENT REGISTER!!!"
...letni!rwsys!nemesis!uhclem               |"A what?"
...decvax!fw.ast.com!nemesis!uhclem         |"LETNi! LETNi! LETNi!"  - 1983
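
For illustration only, here is a minimal sketch of the elevator behaviour
suggested above: the sweep direction is changed only when nothing is left
ahead of the head, never when a request is added.  Everything in it
(struct elev_queue, elev_init, elev_add, elev_next, ELEV_MAX) is a
hypothetical name made up for this example; it is not FreeBSD's disksort
code and makes no claim about how the current disksort is written.  Block
numbers are assumed to be non-negative so that -1 can mean "queue empty".

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical request queue, for illustration only.
 * Not the kernel's disksort() data structures. */
#define ELEV_MAX 64

struct elev_queue {
    long   blk[ELEV_MAX];  /* pending block numbers, in no particular order */
    size_t count;          /* number of pending requests */
    long   head;           /* current head position (block number) */
    int    dir;            /* +1 = sweeping upward, -1 = sweeping downward */
};

void elev_init(struct elev_queue *q, long head)
{
    q->count = 0;
    q->head = head;
    q->dir = 1;
}

/* Queue a request.  Note that this never touches q->dir:
 * new arrivals cannot reset the sweep. Returns 0, or -1 if full. */
int elev_add(struct elev_queue *q, long blk)
{
    if (q->count >= ELEV_MAX)
        return -1;
    q->blk[q->count++] = blk;
    return 0;
}

/* Index of the nearest request at or ahead of the head in direction
 * dir, or -1 if nothing lies that way. */
static int elev_pick(const struct elev_queue *q, int dir)
{
    int best = -1;

    for (size_t i = 0; i < q->count; i++) {
        long d = (q->blk[i] - q->head) * dir;  /* distance along the sweep */
        if (d < 0)
            continue;                          /* behind the head */
        if (best < 0 || d < (q->blk[best] - q->head) * dir)
            best = (int)i;
    }
    return best;
}

/* Remove and return the next block to service, or -1 if the queue is
 * empty.  Direction reverses only when the current sweep is exhausted. */
long elev_next(struct elev_queue *q)
{
    if (q->count == 0)
        return -1;

    int i = elev_pick(q, q->dir);
    if (i < 0) {
        q->dir = -q->dir;          /* nothing left ahead: turn around */
        i = elev_pick(q, q->dir);
    }
    assert(i >= 0);                /* non-empty queue must yield a request */

    long blk = q->blk[i];
    q->blk[i] = q->blk[--q->count];  /* fill the hole with the last entry */
    q->head = blk;
    return blk;
}
```

With the head at block 50 and requests for 10, 60 and 90 queued, this
services 60 first.  If a request for 55 then arrives, it has to wait: the
sweep continues up to 90, turns around, and only then services 55 and 10.
A request behind the head waits at most one full sweep, however many new
requests are added ahead of it.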