Date: Wed, 2 Mar 2005 05:02:41 -0800
From: "Ted Mittelstaedt" <tedm@toybox.placo.com>
To: <freebsd-questions@freebsd.org>
Subject: RE: Installation instructions for Firefox somewhere?
Message-ID: <LOBBIFDAGNMAMLGJJCKNGEKDFAAA.tedm@toybox.placo.com>
In-Reply-To: <9810408603.20050302055304@wanadoo.fr>

> -----Original Message-----
> From: owner-freebsd-questions@freebsd.org
> [mailto:owner-freebsd-questions@freebsd.org]On Behalf Of Anthony
> Atkielski
> Sent: Tuesday, March 01, 2005 8:53 PM
> To: freebsd-questions@freebsd.org
> Subject: Re: Installation instructions for Firefox somewhere?
>
>
> Ted Mittelstaedt writes:
>
> > It appears you have a narrow-SCSI max 10MB sync disk drive and a
> > ultra -3 20MB sync disk drive on the same adapter card.
> > Such a combination is iffy at best.
>
> The configuration was the one recommended by HP. I bought the second
> drive from HP directly. They both have the same type of SCSI
> interface, approved by HP.
>

HP didn't manufacture either of the drives or the SCSI controller, so
why would you think that they know what they are talking about?

HP does the same thing Compaq does (really the same now, since they are
the same company): they buy off-the-shelf parts from other manufacturers
and bundle them together into systems that they sell. Dell, Gateway and
all the rest of them do the same thing. For a very few of their products
(like the Vectra XU 6/200 that you have) they do design the
motherboards, but that's it. And of course they design the sheet metal.
But for the motherboards in most of their stuff they get OEMs to make
them. And despite all the testing, on occasion they screw up and release
patches that work around hardware problems.

> I'm tired of hearing why it's not FreeBSD's fault. When you can tell me
> exactly what these messages mean, instead of guessing, let me know.

Ok here goes:

Feb 26 20:09:23 contactdish kernel: (da0:ahc0:0:0:0): Retrying Command
Feb 26 20:09:23 contactdish kernel: (da0:ahc0:0:0:0): Request Requeued
Feb 26 20:09:23 contactdish kernel: (da0:ahc0:0:0:0): Retrying Command

This happens when the SCSI target (disk 0 here) stops answering commands
from the SCSI adapter.

Feb 26 20:09:26 contactdish kernel: (da1:ahc0:0:2:0): Retrying Command
Feb 26 20:09:26 contactdish kernel: (da1:ahc0:0:2:0): Queue Full
Feb 26 20:09:26 contactdish kernel: (da1:ahc0:0:2:0): Retrying Command

Same thing as above, just with the second disk.

The usual cause is bad termination. With bad termination, electrical
noise causes one or more targets on the bus to receive a garbage
command; that target shuts down and goes out of sync with the other
initiators and targets on the bus, and as soon as that happens all
targets shut down.

But it can also be caused by a device that isn't totally compliant with
the standard interfering with another device on the bus (although this
is rare).

And it can also be caused by the adapter card driver sending a target a
command that the target doesn't understand or does not process properly.
This can happen when, during the probe on boot, a target responds saying
it supports something and then really doesn't. IDE devices are infamous
for this, claiming to support UDMA, PIO mode 4, and such when they
really don't support them properly.

Sometimes, if the bus is left quiet, the devices can resync and things
go on. Usually, though, it leads to the next thing that you have here:

Feb 25 20:09:29 contactdish kernel: ahc0: Recovery Initiated
Feb 25 20:09:29 contactdish kernel: >>>>>>>>>>>>>>>>>> Dump Card State Begins

The driver for the SCSI adapter has finally given up trying to send
commands to the adapter card your disks are tied to and has decided to
just reset the card entirely, which resets the bus and all devices on
it, which reestablishes sync. All the rest of the data that follows is a
dump of the state of the card, the commands sent, and which queue
entries are trashed, so the operating system can pick up where it left
off if the card comes back online.
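
To see at a glance which targets are involved, messages like the ones
quoted above can be tallied per device. What follows is only a minimal
sketch in Python, assuming the syslog lines look like the ones in this
thread, where the device path is printed as
(peripheral:controller:bus:target:lun). The names tally_cam_events and
SAMPLE are made up for illustration; this is not a FreeBSD tool.

```python
import re
from collections import Counter, defaultdict

# Device path as printed in the log: (periph:controller:bus:target:lun).
CAM_LINE = re.compile(
    r"\((?P<periph>[a-z]+\d+):(?P<sim>[a-z]+\d+):"
    r"(?P<bus>\d+):(?P<target>\d+):(?P<lun>\d+)\):\s*(?P<msg>.+)$"
)


def tally_cam_events(lines):
    """Count each message per device, keyed by peripheral name (e.g. da0)."""
    events = defaultdict(Counter)
    for line in lines:
        match = CAM_LINE.search(line)
        if match:
            events[match["periph"]][match["msg"].strip()] += 1
    return events


# Sample input: the six lines quoted earlier in this message.
SAMPLE = [
    "Feb 26 20:09:23 contactdish kernel: (da0:ahc0:0:0:0): Retrying Command",
    "Feb 26 20:09:23 contactdish kernel: (da0:ahc0:0:0:0): Request Requeued",
    "Feb 26 20:09:23 contactdish kernel: (da0:ahc0:0:0:0): Retrying Command",
    "Feb 26 20:09:26 contactdish kernel: (da1:ahc0:0:2:0): Retrying Command",
    "Feb 26 20:09:26 contactdish kernel: (da1:ahc0:0:2:0): Queue Full",
    "Feb 26 20:09:26 contactdish kernel: (da1:ahc0:0:2:0): Retrying Command",
]

if __name__ == "__main__":
    for device, counts in sorted(tally_cam_events(SAMPLE).items()):
        for message, count in counts.most_common():
            print(f"{device}: {count} x {message}")
```

If every device on the bus shows up in the tally within a few seconds,
that points at the bus as a whole (termination, cabling, negotiation)
rather than at one drive.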

Feb 25 20:09:29 contactdish kernel: (da1:ahc0:0:2:0): SCB 0x49 - timed out
Feb 25 20:09:29 contactdish kernel: sg[0] - Addr 0x1309b000 : Length 2048
Feb 25 20:09:29 contactdish kernel: (da1:ahc0:0:2:0): Queuing a BDR SCB
Feb 25 20:09:29 contactdish kernel: ahc0: Timedout SCBs already complete. Interrupts may not be functioning.

This is a bit significant: after the bus reset, the second disk (the
Quantum) isn't answering. But it looks like it started responding later
on, since otherwise your system would probably have panicked.

None of this, though, is any help here. You know what the problem is;
you just don't know what is causing it. The idea that a SCSI command
sent to a disk by the adapter card is causing this is unlikely unless
either the Seagate or the Quantum model that you have is a known rogue
(and I didn't find that they are). It is much more likely a conflict on
the SCSI bus.

> I'm not going to plug and unplug hardware all day based on your
> speculations, particularly since I know this hardware configuration
> works, and has worked for eight years.
>

Well, first of all, I already told you to run your BIOS config and set
the adapter to limit sync negotiation on the Quantum to 10MB/s and see
if that fixed it. That would not involve you removing stuff.

Secondly, you don't know how NT set up the disks and such on your
system. It is quite possible that the NT driver saw the mismatch and
simply reprogrammed the SCSI adapter card to limit both disks to 10MB/s
transfers. Or possibly the NT driver decided not to send writes to both
disks at the same time. So comparisons like "it worked with NT so the
hardware must be good" are almost useless.

But the most important thing, and I think the reason you're having so
much trouble here, is that you are trying to approach this problem as
though you paid $9,000 for this server yesterday.
If your Vectra were a brand new prototype in an HP test lab, or even if
it were 10 days old from HP and you ran into this problem, you might
have engineers with SCSI analyzers from HP's server build department all
over you.

But it's not - this is a server that has a production life that is OVER.
I know you don't like Ebay and you probably think that everything on it
is junk, but people are selling HP servers on it right now that are more
powerful than yours and younger than it for under a hundred bucks - see:

http://cgi.ebay.com/ws/eBayISAPI.dll?ViewItem&category=0&item=5754427891&rd=1

The fact of the matter is that ANY life you can get out of this server
today is found money - it's a freebie. HP, 8 or 10 years ago when they
designed this server, would have told you THEN that they weren't
expecting it to be in service today.

Now, to be honest, I have a soft spot for older hardware. The gateway
router system that this very message is passing through just happens to
be a dual Pentium Pro with 128MB of RAM and a 4GB SCSI disk on an
Adaptec controller. Here's its dmesg:

$ dmesg
Copyright (c) 1992-2005 The FreeBSD Project.
Copyright (c) 1979, 1980, 1983, 1986, 1988, 1989, 1991, 1992, 1993, 1994
    The Regents of the University of California. All rights reserved.
FreeBSD 4.11-RELEASE #0: Mon Feb 21 04:13:14 PST 2005
    root@nat-rtr.freebsd-corp-net-guide.com:/usr/src/sys/compile/NATRTR
Timecounter "i8254" frequency 1193182 Hz
CPU: Pentium Pro (199.43-MHz 686-class CPU)
  Origin = "GenuineIntel" Id = 0x619 Stepping = 9
  Features=0xfbff<FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV>
real memory = 134217728 (131072K bytes)
avail memory = 126894080 (123920K bytes)
Programming 24 pins in IOAPIC #0
IOAPIC #0 intpin 2 -> irq 0
FreeBSD/SMP: Multiprocessor motherboard: 2 CPUs
 cpu0 (BSP): apic id: 0, version: 0x00040011, at 0xfee00000
 cpu1 (AP): apic id: 1, version: 0x00040011, at 0xfee00000
 io0 (APIC): apic id: 2, version: 0x00170011, at 0xfec00000
Preloaded elf kernel "kernel" at 0xc03a9000.
Pentium Pro MTRR support enabled
md0: Malloc disk
npx0: <math processor> on motherboard
npx0: INT 16 interface
pcib0: <Host to PCI bridge> on motherboard
IOAPIC #0 intpin 16 -> irq 2
IOAPIC #0 intpin 17 -> irq 16
IOAPIC #0 intpin 18 -> irq 17
IOAPIC #0 intpin 19 -> irq 18
pci0: <PCI bus> on pcib0
isab0: <Intel 82371SB PCI to ISA bridge> at device 7.0 on pci0
isa0: <ISA bus> on isab0
atapci0: <Intel PIIX3 ATA controller> port 0xf000-0xf00f at device 7.1 on pci0
ata0: at 0x1f0 irq 14 on atapci0
ata1: at 0x170 irq 15 on atapci0
pci0: <S3 Trio graphics accelerator> at 17.0 irq 2
de0: <Digital 21140A Fast Ethernet> port 0x9100-0x917f mem 0xe0800000-0xe080007f irq 16 at device 18.0 on pci0
de0: 21140A [10-100Mb/s] pass 2.2
de0: address 00:40:05:43:ce:5f
ed0: <NE2000 PCI Ethernet (RealTek 8029)> port 0x9200-0x921f irq 17 at device 19.0 on pci0
ed0: address 52:54:05:f2:ab:67, type NE2000 (16 bit)
ahc0: <Adaptec 2940 Ultra SCSI adapter> port 0x9300-0x93ff mem 0xe0801000-0xe0801fff irq 18 at device 20.0 on pci0
aic7880: Ultra Wide Channel A, SCSI Id=7, 16/253 SCBs
orm0: <Option ROMs> at iomem 0xc0000-0xc7fff,0xc8000-0xcefff on isa0
pmtimer0 on isa0
fdc0: <NEC 72065B or clone> at port 0x3f0-0x3f5,0x3f7 irq 6 drq 2 on isa0
fdc0: FIFO enabled, 8 bytes threshold
fd0: <1440-KB 3.5" drive> on fdc0 drive 0
atkbdc0: <Keyboard controller (i8042)> at port 0x60,0x64 on isa0
atkbd0: <AT Keyboard> flags 0x1 irq 1 on atkbdc0
kbd0 at atkbd0
psm0: <PS/2 Mouse> irq 12 on atkbdc0
psm0: model MouseMan+, device ID 0
vga0: <Generic ISA VGA> at port 0x3c0-0x3df iomem 0xa0000-0xbffff on isa0
sc0: <System console> at flags 0x100 on isa0
sc0: VGA <16 virtual consoles, flags=0x300>
sio0 at port 0x3f8-0x3ff irq 4 flags 0x10 on isa0
sio0: type 16550A
ppc0: <Parallel port> at port 0x378-0x37f irq 7 on isa0
ppc0: Generic chipset (ECP/PS2/NIBBLE) in COMPATIBLE mode
ppc0: FIFO with 16/16/16 bytes threshold
ppbus0: IEEE1284 device found /NIBBLE
Probing for PnP devices on ppbus0:
ppbus0: <EPSON Stylus C84> PRINTER ESCPL2,BDC,D4
plip0: <PLIP network interface> on ppbus0
lpt0: <Printer> on ppbus0
lpt0: Interrupt-driven port
ppi0: <Parallel I/O> on ppbus0
APIC_IO: Testing 8254 interrupt delivery
APIC_IO: routing 8254 via IOAPIC #0 intpin 2
IP packet filtering initialized, divert enabled, rule-based forwarding enabled, default to deny, logging disabled
SMP: AP CPU #1 Launched!
afd0: 239MB <IOMEGA ZIP 250 ATAPI> [239/64/32] at ata0-master PIO3
Waiting 15 seconds for SCSI devices to settle
de0: enabling Full Duplex 100baseTX port
Mounting root from ufs:/dev/da0s1a
da0 at ahc0 bus 0 target 0 lun 0
da0: <MICROP 4343WS X502> Fixed Direct Access SCSI-2 device
da0: 20.000MB/s transfers (10.000MHz, offset 8, 16bit), Tagged Queueing Enabled
da0: 4146MB (8491920 512 byte sectors: 255H 63S/T 528C)
cd1 at ahc0 bus 0 target 6 lun 0
cd1: <TOSHIBA CD-ROM XM-5701TA 0167> Removable CD-ROM SCSI-2 device
cd1: 10.000MB/s transfers (10.000MHz, offset 8)
cd1: Attempt to query device size failed: NOT READY, Medium not present
cd0 at ahc0 bus 0 target 3 lun 0
cd0: <SONY CD-R CDU926S 1.1f> Removable CD-ROM SCSI-2 device
cd0: 10.000MB/s transfers (10.000MHz, offset 15)
cd0: Attempt to query device size failed: NOT READY, Medium not present

This is a real live system I just built last week. It runs like a champ.
And it was COMPLETELY free to me. I assembled it literally from a pile
of parts that I got from a customer who was scrapping them. The pile
even included cases, plus a second dual PPro motherboard (sans CPUs).

It replaced an AMD K6 200MHz system that also ran fine and that I also
got for free. And these aren't the only ex-Windows, ex-Novell and
ex-other systems I've used over the years. Nobody could be more of a
proponent of rescuing old gear than I am.

But I have learned something in dealing with the older gear, and that is
that you must be extremely flexible with it. If I get 2 systems that are
flaky, I will swap parts between them in an effort to get 1 stable
system. More commonly, though, I have a half dozen or more older systems
in the pool at a time. Parts move around between them, newly acquired
systems in a state of disrepair come in, and systems that have become
stable go out to friends and others who need them.

And you do have to call it quits on some gear eventually.
Last year I finally got rid of the last of the 486 stuff that I had
sitting around, and I had some really nice 486 servers, EISA SCSI and
the rest. The Pentium stuff that's non-Pro and non-P2 is going this
year, as well as all the AT case style stuff.

> As it stands now, all I know for sure is that FreeBSD apparently cannot
> support what Windows can support, and nobody can tell me why.
>

You must understand, Anthony, that in the FreeBSD and Linux worlds (and
sometimes in the Microsoft and Solaris worlds) problem solving is
approached rather differently.

We are dealing with a lot of people here who have paid nothing for the
software and have quite often got the hardware for free (your Vectra,
despite your paying $9K for it once, has been fully depreciated to $0.00
by today), and because of that they aren't interested in paying for a
formalized analyze-it-to-death-before-doing-anything problem solving
approach.

Instead, problem solving is done scientifically: you make a hypothesis
about why something's broken, then test it. Granted, this method is much
less efficient, but it is cheaper. And it is not cost effective to use
the expensive analysis on a system that costs nothing.

If you insist on going that route, your only hope is interesting the
developer of the ahc driver in your problem. Start by filing a PR in the
correct manner (i.e. by following the instructions in the Handbook). If
that does not get a response from the developer (the PRs are mailed to
the developers of the drivers), then read the source code of the driver
to find out who it is, look up his e-mail address on the website, send
him an e-mail begging him to take a look at your PR, and stop wasting
our time here.

Ted