From owner-freebsd-scsi Thu Oct 22 15:28:53 1998
Message-Id: <199810222231.PAA01402@dingo.cdrom.com>
To: Thomas F Keefe
cc: patton@sysnet.net, freebsd-scsi@FreeBSD.ORG
Subject: Re: Sequential Disk I/O
In-reply-to: Your message of "Thu, 22 Oct 1998 12:08:02 EDT." <199810221608.MAA05083@remulak.cse.psu.edu>
Date: Thu, 22 Oct 1998 15:31:39 -0700
From: Mike Smith
Sender: owner-freebsd-scsi@FreeBSD.ORG

> > if you study how the big boys like Oracle do it, they use block sizes in
> > excess of 8kb.  Depending on how big your disks are, 32kb blocks are
> > considered more reasonable.  What they do is fit multiple database "blocks"
> > or records into a logical block.  I suggest a vastly bigger blocksize than
> > 512 bytes if you're serious about performance.
>
> By using large blocks I can approximate the
> performance of sequential access.  I may have
> a seek and rotation penalty at the beginning
> of the transfer, but that one penalty is
> amortized over a large number of sectors.
> This may be my only choice if I cannot
> solve this problem.  At this point I have
> spent a lot of time and become very curious
> about this.

It looks like you're suffering from some major misconceptions about the
relative performance and behaviour of the I/O subsystem.
This makes it difficult to relate to your questions, because they're
deeply founded on these misconceptions.

To start with, you have to understand that I/O transactions move along
queues, that the behaviour of each of these queues is different, and
that many of these behaviours are dependent upon other factors.

> What I was hoping to get from this group, was
> something like:
> (1) It is impossible to avoid the rotational
> latency when issuing writes to adjacent
> sectors on a SCSI disk because the time
> required between the completion of one command
> and the start of the next is a significant
> fraction of the time it takes for a 5400RPM disk
> platter to rotate.  Thus, even with large
> strides, the next command comes too late.
> This makes only one access per revolution
> possible.

Issues of rotational latency are completely irrelevant.  The decoupling
between the system and the media on a SCSI disk is complete.

> or;
>
> (2) Modern SCSI drives (by default) access an
> entire track on both reads and writes.
> Thus, only one access is possible per revolution.
> This default behavior can be disabled through
> the mode page as follows ...

The second does not follow from the first, but again, this is
completely irrelevant.

> Any enlightenment you can offer on this topic will be
> appreciated.

Without meaning offense, the amount of "enlightenment" required to
correct your understandings on the topic is beyond the scope of any
single email, and will probably only be gathered over a number of years
of experience.

To answer what I recall as being your original question as succinctly
as possible: there is an effectively fixed overhead involved in any
given I/O transaction.  So in moving data, there are two costs:

 - tD, the time taken to move the data itself.
 - tO, the overhead.

tD is directly proportional to the amount of data involved, so for any
given amount of data, tD is constant.  tO is effectively fixed per
transaction.
Thus, if you move the data in small chunks, total tO is
proportionally larger than if you use large chunks.

When you come to compute throughput for a single consumer, there's
another cost:

 - tL, the latency time (time between making the request and
   receiving an answer which is not accounted for in tO).

tL varies significantly depending on load, but it also has a fixed
component per transaction.  Thus, again, smaller chunks mean a larger
total tL.

tO and tL are significant for many of the queues, and so the system
does what it can to coalesce as many read/write operations as possible,
as well as caching, etc., in order to reduce them where possible.
However, it is in your application that you can make the best
performance optimisations, because only your application can know its
access behaviour in advance.

If you're interested in pursuing this further, you might want to start
by studying the FreeBSD I/O subsystem.  Then you'll want to look at how
SCSI works, starting with a typical SCSI host adapter, the SCSI
standard and perhaps some high-level documentation for a SCSI disk.

Please don't think you can apply something like the early-70s vintage
disk theory from e.g. Tanenbaum to modern disk systems; this will only
confuse you.

-- 
\\  Sometimes you're ahead,       \\  Mike Smith
\\  sometimes you're behind.      \\  mike@smith.net.au
\\  The race is long, and in the  \\  msmith@freebsd.org
\\  end it's only with yourself.  \\  msmith@cdrom.com
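
The tD/tO/tL cost model in the message above can be sketched as a short
calculation.  This is only an illustration under stated assumptions: the
function names, the 10 MB/s transfer rate, and the 1 ms and 0.5 ms
per-transaction figures are hypothetical and do not come from the
message, and only the fixed per-transaction part of tL is modelled (the
load-dependent part is ignored).

```python
def total_time(total_bytes, chunk_bytes, bytes_per_sec, t_o, t_l):
    """Seconds for one consumer to move total_bytes in chunk_bytes pieces.

    t_o and t_l are the assumed fixed per-transaction overhead and
    latency, in seconds.  Both are hypothetical inputs, not measurements.
    """
    # Number of transactions needed (ceiling division).
    transactions = -(-total_bytes // chunk_bytes)
    # tD depends only on the amount of data, not on how it is chunked.
    t_d = total_bytes / bytes_per_sec
    # tO and tL are paid once per transaction.
    return t_d + transactions * (t_o + t_l)


def throughput(total_bytes, chunk_bytes, bytes_per_sec, t_o, t_l):
    """Effective bytes per second seen by the single consumer."""
    return total_bytes / total_time(
        total_bytes, chunk_bytes, bytes_per_sec, t_o, t_l
    )


if __name__ == "__main__":
    # Hypothetical figures chosen only to show the trend.
    size = 1024 * 1024   # 1 MB to move
    rate = 10e6          # assumed 10 MB/s raw transfer rate
    t_o = 0.001          # assumed 1 ms overhead per transaction
    t_l = 0.0005         # assumed 0.5 ms fixed latency per transaction
    for chunk in (512, 8192, 32768, 65536):
        kb_per_sec = throughput(size, chunk, rate, t_o, t_l) / 1024
        print(f"{chunk:6d}-byte chunks: {kb_per_sec:8.1f} KB/s")
```

Under these assumptions tD is identical for every chunk size, so any
difference in throughput comes entirely from how many times tO and tL
are paid.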