Date:      Tue, 27 Aug 1996 20:25:45 +0200 (MET DST)
From:      Stefan Esser <se@zpr.uni-koeln.de>
To:        Michael Smith <msmith@atrad.adelaide.edu.au>
Cc:        se@zpr.uni-koeln.de (Stefan Esser), hardware@freebsd.org
Subject:   Re: ASUS SC200 SCSI card?
Message-ID:  <199608271825.UAA04079@x14.mi.uni-koeln.de>
In-Reply-To: <199608240126.KAA24070@genesis.atrad.adelaide.edu.au>
References:  <199608232024.WAA22814@x14.mi.uni-koeln.de> <199608240126.KAA24070@genesis.atrad.adelaide.edu.au>

Michael Smith writes:
 > Stefan Esser stands accused of saying:

 > >  > Even with two 810's coming up for an opcode every us?  I'd have
 > >  > thought you'd want to allow for (max latency + one opcode fetch) < 1us
 > >  > so that the second one didn't starve...
 > > 
 > > This isn't how the latency timer works ...
 > > 
 > > The latency timer prevents a device with 
 > > a large internal buffer from sending long
 > > bursts, which else might cause overruns 
 > > in receive buffers of other devices.
 > 
 > ... but this is (as far as I can tell) exactly what I was saying; the

Well, you most probably are right. That was what you were saying.

But having the Latency Timer set to guarantee the NCR's instruction
fetches at a rate of a few million a second (it takes 12 clocks to 
execute an instruction, which is equivalent to a cycle time of 360ns)
seemed so strange an idea that I didn't think you really meant that :-)
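
As a quick check of the numbers above, here is a minimal Python sketch.
It assumes a 30ns clock period (33MHz), which is what 12 clocks = 360ns
implies; the function names are made up for illustration only.

```python
# Assumption: 30 ns clock period (33 MHz), implied by 12 clocks = 360 ns.
CLOCK_NS = 30
CLOCKS_PER_INSTRUCTION = 12  # figure quoted in the text


def instruction_time_ns(clocks=CLOCKS_PER_INSTRUCTION, clock_ns=CLOCK_NS):
    """Time to execute one NCR instruction, in nanoseconds."""
    return clocks * clock_ns


def fetches_per_second(clocks=CLOCKS_PER_INSTRUCTION, clock_ns=CLOCK_NS):
    """Upper bound on the instruction fetch rate if the chip never waited."""
    return 1e9 / instruction_time_ns(clocks, clock_ns)


if __name__ == "__main__":
    print(instruction_time_ns())                   # 360
    print(round(fetches_per_second() / 1e6, 2))    # 2.78 (million per second)
```

So "a few million a second" here means somewhat under 3 million fetches
per second, and only if the chip were executing back to back.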

 > latency timer defines how long another device can hog the bus.  If the
 > 810 wants the bus every us (it may not, I'm just using this as an
 > example), then the latency must be set to 1us or less so that a device
 > that starts a burst just before the 810 requests the bus will stop
 > before the 810 starves.

The problem is that the TOTAL latency of all devices would have to be
12 cycles. With 3 PCI bus-masters (say: the host bridge, the NCR and a
DEC 21040 based Ethernet card), the latency timer would have to be set
to 6 (two other devices) if no time were lost for arbitration, and it
would actually have to be 0 in order to allow for arbitration overhead :)
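
The budget argument can be sketched in a few lines of Python. This is
only an illustration: the function name and the per-handover overhead
parameter are hypothetical, and the message gives no actual figure for
the arbitration overhead.

```python
def latency_timer_per_master(budget_cycles, bus_masters, overhead_per_handover=0):
    """Largest latency timer value (in PCI cycles) each *other* bus-master
    may use so that their combined bus tenure stays within budget_cycles.

    overhead_per_handover is a hypothetical allowance for arbitration and
    turnaround cycles lost each time the bus changes owner.
    """
    others = bus_masters - 1
    if others <= 0:
        raise ValueError("need at least two bus-masters")
    usable = budget_cycles - overhead_per_handover * others
    return max(0, usable // others)


if __name__ == "__main__":
    # 12-cycle budget, 3 masters, no arbitration loss: 6 cycles each.
    print(latency_timer_per_master(12, 3))                            # 6
    # Any per-handover overhead eats into that quickly (value assumed).
    print(latency_timer_per_master(12, 3, overhead_per_handover=2))   # 4
    print(latency_timer_per_master(12, 3, overhead_per_handover=6))   # 0
```

The point is only that the allowed value shrinks toward 0 as soon as
any realistic overhead is charged against the 12-cycle budget.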

(The minimum PCI transaction seems to take 4 cycles with most current
chip sets. This is because the address is multiplexed over the same 
lines as the data, and idle cycles are required whenever the active
driver changes. This does affect multi-chip chip sets, which often
have one chip that drives the address information onto the PCI bus,
and a separate data buffer (with multiple FIFOs) which takes over the
address/data lines after the address has been accepted by the target
of the transaction.)

 > If you add another 810, and assume that it comes up for a fetch just
 > after the first 810, which is held off by a burst from a device that
 > runs the full time allowed (128 bytes, not too long).  Then the first 810
 > gets the bus and fills its pipeline; has more than 1us expired? is
 > the second 810 starved?  Does it actually care?

No, it most likely doesn't really care. The NCR executes a few hundred
instructions per SCSI command. This includes the initial selection,
the sending of the command, generally at least one disconnect, several
SCSI messages being sent, and of course the final status phase.

The data phase is the most important phase :) and usually accounts for
the largest fraction of the bytes transferred. But it requires only
2 or 4 instructions per 4KByte page (depending on the alignment of the
buffer), or about one instruction per 100 microseconds.
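
The "one instruction per 100 microseconds" figure follows from the page
size and the transfer rate. A small Python sketch, assuming the 10MB/s
device rate mentioned elsewhere in this message and taking 10MB/s as
10,000,000 bytes per second (names are illustrative only):

```python
PAGE_BYTES = 4096              # 4KByte page, as in the text
RATE_BYTES_PER_S = 10_000_000  # assumption: 10MB/s SCSI device


def page_transfer_us(rate=RATE_BYTES_PER_S, page_bytes=PAGE_BYTES):
    """Microseconds needed to move one page at the given transfer rate."""
    return page_bytes * 1e6 / rate


def us_per_instruction(instructions_per_page, rate=RATE_BYTES_PER_S):
    """Average spacing between data-phase instructions, in microseconds."""
    return page_transfer_us(rate) / instructions_per_page


if __name__ == "__main__":
    print(page_transfer_us())      # about 409.6 us per page
    print(us_per_instruction(4))   # about 102 us between instructions
    print(us_per_instruction(2))   # about 205 us between instructions
```

With 4 instructions per page this comes out at roughly one instruction
every 100 microseconds, and half that rate for an aligned buffer.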

 > These are the questions that would lead me to suggest backing the latency
 > timer down.  Practical experience (offered by RG and co.) suggest that I'm
 > wrong, but I guess I just don't understand why 8)

No, you are not wrong. But the effects of lowering the latency timer 
value are much more negative (because of the reduced burst lengths in
case of high demand for the bus and the lost startup cycles at the end
of each burst) than the longer latency that results from the NCR not 
being able to fetch the next instruction immediately when it's done
with the previous one. The instruction fetches occur when the NCR is 
in phases where it has to wait for the SCSI target to respond, which 
often takes tens of microseconds. And the data phase needs so few
instruction fetches that they do not hold up the actual data transfer
of a 10MB/s device.

Things are different with WIDE or Ultra (or Ultra-WIDE :) devices,
and that is why the "better" NCR chips offer instruction read-ahead
or even a local 4KB SRAM on the chip for instruction and parameter
storage. This makes the 53c825A and the 53c875 run for a complete 
SCSI command with no need to access host system RAM (except for the
data transferred to/from disk :)


Regards, STefan


