Date: Fri, 23 Aug 1996 22:07:17 +0200 (MET DST)
Message-Id: <199608232007.WAA22802@x14.mi.uni-koeln.de>
From: Stefan Esser
To: Peter Childs
Cc: se@zpr.uni-koeln.de (Stefan Esser), msmith@atrad.adelaide.edu.au,
    freebsd-hardware@FreeBSD.ORG
Subject: Re: ASUS SC200 SCSI card?
In-Reply-To: <199608221728.CAA00592@al.imforei.apana.org.au>
References: <199608212019.WAA07179@x14.mi.uni-koeln.de>
    <199608221728.CAA00592@al.imforei.apana.org.au>

Peter Childs writes:
>
> [ Discussion about hangs on 2.1.5-stable machine with dual ASUS SC200
>   NCR810 PCI scsi controllers follows... may be dangerous to
>   mental health ]
>
> Stefan Esser wrote...
> > > So this means that with multiple SC200 cards they can all be set on
> > > INTA ?? If so are there any pros/cons to doing this?
> >
> > In fact they all SHOULD be set to Int A !
>
> Ok.. the second NCR was set to INT B... i've put them both on A
> now.

Ok. The BIOS has set up a connection from Int A of the PCI slot you
put the card in to some interrupt line of the ISA interrupt
controller. It then put the IRQ chosen into some NCR register (in
PCI config space) for the driver init code. The driver used that IRQ
number to install an interrupt handler.

Now, if you had the card configured to Int B, then the BIOS would
still (correctly !) set up Int A, since the NCR chip has its request
for Int A hard coded into some other config space register. In your
previous setup the NCR was jumpered to the Int B line, while the BIOS
and the driver assumed it was wired to Int A, so Int B interrupts
would either be ignored or delivered as some other IRQ, depending on
implementation details of your mother board.

This might have led to strange effects (stray interrupts, for
example), but if Int B came out as some IRQ for which some other
device had registered a handler, then that device would receive
"spurious" interrupts, and most probably ignore them ...

But the second NCR was effectively limited to a low command rate
(it is polled 100 times a second until the driver sees the first
interrupt occur).

> > Please send me some details (from /var/log/messages). I need
> > at least the complete boot message log (preferably from a
> > boot with "-v" for more verbose probe output) and the error
> > message when the SCSI command was aborted.
>
> Nothing gets into /var/log/message when it dies... I've taken the following
> action (one crash after the next)
>
> 1. INT B -> INT A on the second card.
> 2. PCI latency was set to 80... Michael Smith suggested it be lower than
>    32 to i've moved it to 20.

PCI latency is a very nice concept, but like interrupts quite
different from what you'd expect in a PC compatible system ...

The latency timer has to be set to a value that permits long bursts
of data to be sent, taking advantage of page mode and cache snoop
optimizations (one snoop per cache line instead of per memory access).
But these bursts ought to be limited in such a way that no device's
input buffer overflows because a burst takes too long.

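To get a feel for the settings mentioned above (80, 32 and 20), here
is a rough sketch that converts a latency timer value into bus time
and into an upper bound on burst size. It is not from the original
message and rests on assumptions: that the BIOS setting is a count of
PCI clocks (as the PCI Latency Timer register is defined), that the
bus runs at 33 MHz, and that a 32-bit bus moves at most 4 bytes per
clock. The function names are made up for illustration.

```python
# Rough sketch: what a PCI latency timer setting means in time and data.
# Assumptions (not from the original message): the setting counts PCI
# clocks, the bus clock is 33 MHz, and the bus is 32 bits wide.

PCI_CLOCK_MHZ = 33.0   # assumed bus clock; some boards run slower
BYTES_PER_CLOCK = 4    # 32-bit PCI: at best one 4-byte data phase per clock


def latency_timer_us(clocks, clock_mhz=PCI_CLOCK_MHZ):
    """Return the time in microseconds covered by `clocks` PCI clocks."""
    return clocks / clock_mhz


def max_burst_bytes(clocks, bytes_per_clock=BYTES_PER_CLOCK):
    """Upper bound on bytes moved in `clocks` clocks.

    Ignores the address phase and any wait states, so real bursts
    move less than this.
    """
    return clocks * bytes_per_clock


if __name__ == "__main__":
    # 80 was the old setting, 32 the suggested ceiling, 20 the new one.
    for setting in (80, 32, 20):
        print(f"latency timer {setting:2d}: "
              f"{latency_timer_us(setting):.2f} us, "
              f"at most {max_burst_bytes(setting)} bytes per burst")
```

Under these assumptions all three settings correspond to time slices
of a few microseconds or less, and lowering the value from 80 to 20
cuts the largest guaranteed burst from at most 320 to at most 80
bytes.
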
While ISA bus-master devices (for example the Adaptec 1542) did short
bursts (4 WORD transfers, IIRC) and then released the ISA bus for a
few microseconds, PCI has a concept of an arbiter, which assigns the
bus to any bus-master in the system, generally in a round-robin
fashion.

If a device FIFO is 512 bytes and data arrives at 10MB/s (say a
100baseT Ethernet chip), then it can give up the PCI bus for some
50 microseconds. At a burst transfer rate of 80MB/s it would take
less than 10 microseconds to write the FIFO contents to memory,
while the same chip might only be able to get 20MB/s using small
(4 DWORD) transfers.

You want to guarantee that each device gets the PCI bus granted
before its buffer overflows. And the easiest way to achieve this
is to have a timer set to the maximum latency allowed divided by
the number of devices on the PCI bus.

If a device starts a burst, it is allowed to proceed, even if some
other device requests the PCI bus. But if there is a request from
some other device and the latency timer has expired, the first device
is asked to stop its burst ASAP, and the next device will become the
bus owner. This way a burst can be extended arbitrarily if there is
no other request for the bus, but there is a guarantee that after
#devices times max_latency each device has had access to the PCI bus.

PCI defines registers to contain information about the required
burst length and the maximum latency, and a PCI BIOS might be able
to calculate the optimum value of the latency timer from these
parameters of all PCI devices installed ...
(But I don't know of any PCI BIOS that actually does this.)

> 3. Grabbed a fresh 2.1.5-stable kernel (i follow -stable, but my kernel
>    tree had ipfilter stuff in it...)
> 4. Removed the
>    options OD_BOGUS_NOT_READY
>    line from my config.
>
> It feels fine when i'm not accessing the MO drive.. but i did a
> "make clean" on my -stable tree...
> which is on the second scsi bus,
> did a "bad144 -s /dev/rod0" on the MO disk (also second scsi bus),
> and started thrashing tin.. (newsspools on the first scsi - old
> disk)... this all ran fine for a good 10 minutes... then suddenly..
> bang.. locked solid.

Hmmm, it proceeds for 10 minutes, then hangs without any error
messages ?

> I'll include the "-v" boot here.. and hope i don't annoy to many
> people with its size :)

Well, they don't have to read beyond this point :)

> I think when i get back after the weekend i'll pull one of the SCSI
> controllers out, and see how i go thrashing all the devices. Its a
> real pain not being able to depend on this machine, esp. the MO
> disk (it always crashes before i can get a backup finished :)

Yes, I really understand that ...

> FreeBSD 2.1.5-STABLE #0: Fri Aug 23 11:56:26 CST 1996
> root@:/disk2/kernel/sys/compile/AL_1.8
> CPU: i486DX (486-class CPU)
> Origin = "AuthenticAMD" Id = 0x494
> real memory = 67108864 (65536K bytes)
> avail memory = 64106496 (62604K bytes)
> pcibus_setup(1): mode1res=0x80000000 (0x80000000), mode2res=0xff (0x0e)
> pcibus_setup(2): mode1res=0x80000000 (0x80000000)
> pcibus_check: device 0 1 2 3 4 5 is there (id=04961039)
> Probing for devices on PCI bus 0:
> configuration mode 1 allows 32 devices.
> chip0 rev 49 on pci0:5

Hmm, a SiS chip set ...

Did you try to disable PCI performance options like burst mode
or write buffers ? There are some PCI chip sets that don't work
reliably with competing bus-masters and those options enabled.

> ncr0 rev 17 int a irq 15 on pci0:11

Is this a NCR 53c810A ? I don't have a data book about that
particular chip, but according to a numbering convention used for
other NCR chips, the A devices get a rev. > 0x10.

> mapreg[10] type=1 addr=0000e800 size=0100.
> mapreg[14] type=0 addr=fbff0000 size=0100.
> reg20: virtual=0xf546f000 physical=0xfbff0000 size=0x100
> ncr0: restart (scsi reset).
> ncr0 scanning for targets 0..6 (V2 pl23 95/09/07)
> Choosing drivers for scbus configured at 0
> (ncr0:1:0): "QUANTUM FIREBALL1080S 1Q09" type 0 fixed SCSI 2
> sd is configured at 0
> sd0(ncr0:1:0): Direct-Access
> sd0(ncr0:1:0): FAST SCSI-2 100ns (10 Mb/sec) offset 8.
> 1042MB (2134305 512 byte sectors)
> sd0(ncr0:1:0): with 3835 cyls, 4 heads, and an average 139 sectors/track
> (ncr0:5:0): "MICROP 1684-07MB1057403 HSP4" type 0 fixed SCSI 1
> sd is configured at 3
> sd3(ncr0:5:0): Direct-Access 323MB (663476 512 byte sectors)
> sd3(ncr0:5:0): with 1780 cyls, 7 heads, and an average 53 sectors/track
> ncr1 rev 1 int a irq 14 on pci0:12

This one is the same revision as the chip I have.

> mapreg[10] type=1 addr=0000e400 size=0100.
> mapreg[14] type=0 addr=fbfe0000 size=0100.
> reg20: virtual=0xf5472000 physical=0xfbfe0000 size=0x100
> ncr1: restart (scsi reset).
> ncr1 scanning for targets 0..6 (V2 pl23 95/09/07)
> (ncr1:3:0): "SEAGATE ST51080N 0943" type 0 fixed SCSI 2
> sd is configured at 4
> sd4(ncr1:3:0): Direct-Access
> sd4(ncr1:3:0): FAST SCSI-2 100ns (10 Mb/sec) offset 8.
> 1030MB (2109840 512 byte sectors)
> sd4(ncr1:3:0): with 4826 cyls, 4 heads, and an average 109 sectors/track
> (ncr1:6:0): "FUJITSU M2512A 1507" type 7 removable SCSI 2
> od is configured at 0
> od0(ncr1:6:0): Optical
> od0(ncr1:6:0): 200ns (5 Mb/sec) offset 8.
> 217MB (446325 512 byte sectors)
> od0(ncr1:6:0): with approximate 217 cyls, 64 heads, and 32 sectors/track
> pci0: uses 512 bytes of memory from fbfe0000 upto fbff00ff.
> pci0: uses 512 bytes of I/O space from e400 upto e8ff.

[ probe of ISA devices removed ]

> Relevant(??) bits of my kernel config as follows...
>
> controller pci0
> controller ncr0
>
> controller scbus0 at ncr0
> controller scbus1
>
> disk sd0 at scbus0 target 1
> disk sd4 at scbus1 target 3
> disk sd3 at scbus0 target 5
> device od0 at scbus1 target 6
>
> #options OD_BOGUS_NOT_READY

Doesn't look wrong at all ...

So please try:

- with all devices connected to one NCR
- with PCI performance options disabled
- with the SCSI transfer rate reduced to 2MHz
  (-> ncrcontrol -s sync=2)

I guess you know about the limitation on the length of the SCSI
cable, the requirements for correct termination and