Date: Thu, 14 Aug 2003 10:41:09 +0200 (CEST)
From: "Hartmann, O." <ohartman@klima.physik.uni-mainz.de>
To: Charles Sprickman <spork@inch.com>
Cc: freebsd-smp@freebsd.org
Subject: Re: 5.1-R-p2 crashes on SMP with AMI RAID and Intel 1000/Pro
Message-ID: <20030814103255.C77242@klima.physik.uni-mainz.de>
In-Reply-To: <20030813220915.B9910@shell.inch.com>
References: <20030813103509.Q49991@mail.physik.uni-mainz.de> <20030813220915.B9910@shell.inch.com>

On Wed, 13 Aug 2003, Charles Sprickman wrote:

Hello.

'healthd' is not working on the TYAN Thunder 2500 (LSI Logic 896 LVD
version) because the ServerWorks HE chipset does not seem to be supported,
so healthd does not report anything. I tried this in the past.

I have had problems with this mainboard ever since we purchased it and
started running FreeBSD on it. Most of them were caused by IRQ problems,
I guess, and I solved them by fiddling around with which PCI card goes
in which PCI slot ...

Our next step is to upgrade the power supplies to 400 or 500 W per unit,
but in my opinion this is not the final solution.

While recovering from a disaster I was forced to put an additional
two-channel LVD2 controller into the board (LSI Logic 24060??, an LSI
1010-33 64-bit based controller). With this configuration the system was
totally unstable (I detached the tape unit from power and disabled the
built-in SCSI controller in the BIOS!).

:>Hi,
:>
:>Sadly, I can offer no help, but I do have a Thunder 2462 (SMP mobo), and I
:>currently have an Adaptec 2110S raid card in it with no problems.
:>However, it will NOT work when installed in the riser card...
:>
:>I'm wondering if you have healthd running on that board and if it reports
:>valid data? I'm getting nothing from mine, and I think our boards share
:>the same Winbond chip...
:>
:>Thanks,
:>
:>Charles
:>
:>--
:>Charles Sprickman
:>spork@inch.com
:>
:>
:>On Wed, 13 Aug 2003, Hartmann, O. wrote:
:>
:>> Dear Sirs.
:>>
:>> It seems to me a never ending story. We run a box with a TYAN Thunder
:>> 2500 Dual SMP mainboard, 2GB ECC Tyan certified memory, AMI Enterprise
:>> 1600 RAID adapter and additional Intel 1000/Pro server type (64 bit)
:>> GBit LAN NIC. With FreeBSD 4.8 this was stable, but to achieve this
:>> state was really hard! It is a story similar to what happened when
:>> we changed towards FreeBSD 5.1-RELEASE-p2 on this machine.
:>>
:>> It seems to depend heavily on which PCI slot each card is
:>> attached to, so I will report this here also.
:>>
:>> Phenomenon:
:>>
:>> After the machine has been running for a while, the SMP kernel reboots
:>> spontaneously. This happens when heavy IO is done, when compiling, or when
:>> in the morning our department gets up and our staff connects to the samba
:>> server.
:>>
:>> Depending on which devices are switched on or off in the BIOS, the kernel
:>> freezes at the stage when the amr0 RAID gets recognized. I can avoid this
:>> by enabling the built-in NIC (fxp0). I can force this by putting the em0
:>> NIC into another slot, for instance in the one remaining 64BIT/66MHz
:>> slot (which should be a separate bus).
:>>
:>> This 'game' was identical to the one I had with FreeBSD 4.X - 4.8, and I
:>> found out that putting an additional NIC into PCI slot No. 2 (counted
:>> from the AGP slot on) cleared things up, but using both NICs together
:>> (either additional fxp0 or the new em0) leaves the system completely
:>> unstable.
:>>
:>> In FreeBSD 5.1-RELEASE-p2 and especially in FreeBSD 5.1-CURRENT this
:>> 'gambling' seems to reach its climax. My kernel is built with
:>> SCHED_4BSD because SCHED_ULE and ADAPTIVE_MUTEXES crash immediately
:>> in the same way as described (running a while, then coredumping or freezing
:>> at the stage after the amr0 RAID showed up in the kernel boot messages,
:>> see the dmesg output below).
:>>
:>> I'm not a hardware expert, but all this weird stuff looks to me like
:>> an IRQ routing problem. I fiddled around with many hand-assigned IRQ configurations,
:>> but nothing helped. Either the Intel 1000/Pro or the AMI RAID is causing
:>> problems in the TYAN Thunder 2500 SMP environment.
:>>
:>> We also have an SMP machine with similar hardware, based on an ASUS CV4X-D,
:>> AMI Elite 1600 RAID controller and the same Intel em0 1GBit NIC. The OS is
:>> FreeBSD 4.8 and this system never had any problem!
:>>
:>> I feel a little bit helpless at the moment, because I think I tried every trick
:>> and something seems to be wrong with the combination of TYAN Thunder 2500 and FreeBSD
:>> 5.X SMP. It is also very curious that a kernel without SMP/IO_APIC freezes after
:>> booting at the same place (amr0 RAID recognition).
:>>
:>> Is there any help out there?
:>>
:>> I attach the kernel config file and the dmesg output. Please note: I disabled both
:>> serial ports, the parallel port, sound and USB to get additional IRQs. But I have to
:>> enable the built-in NIC to get a bootable, but unstable FreeBSD 5.1-R box.
:>>
:>> ====================================
:>> DMESG output
:>> ====================================
:>>
:>> Copyright (c) 1992-2003 The FreeBSD Project.
:>> Copyright (c) 1979, 1980, 1983, 1986, 1988, 1989, 1991, 1992, 1993, 1994
:>> The Regents of the University of California. All rights reserved.
:>> FreeBSD 5.1-RELEASE-p2 #14: Wed Aug 13 09:47:00 CEST 2003
:>> root@atmos.physik.uni-mainz.de:/usr/obj/usr/src/sys/ATMOS
:>> Preloaded elf kernel "/boot/kernel/kernel" at 0xc0458000.
:>> Timecounter "i8254" frequency 1193182 Hz
:>> Timecounter "TSC" frequency 868644793 Hz
:>> CPU: Intel Pentium III (868.64-MHz 686-class CPU)
:>> Origin = "GenuineIntel" Id = 0x683 Stepping = 3
:>> Features=0x387fbff<FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,PN,MMX,FXSR,SSE>
:>> real memory = 2147483648 (2048 MB)
:>> avail memory = 2085625856 (1989 MB)
:>> Programming 16 pins in IOAPIC #0
:>> IOAPIC #0 intpin 2 -> irq 0
:>> Programming 16 pins in IOAPIC #1
:>> FreeBSD/SMP: Multiprocessor System Detected: 2 CPUs
:>> cpu0 (BSP): apic id: 1, version: 0x00040011, at 0xfee00000
:>> cpu1 (AP): apic id: 0, version: 0x00040011, at 0xfee00000
:>> io0 (APIC): apic id: 2, version: 0x000f0011, at 0xfec00000
:>> io1 (APIC): apic id: 3, version: 0x000f0011, at 0xfec01000
:>> netsmb_dev: loaded
:>> Pentium Pro MTRR support enabled
:>> npx0: <math processor> on motherboard
:>> npx0: INT 16 interface
:>> pcibios: BIOS version 2.10
:>> Using $PIR table, 12 entries at 0xc00fdf00
:>> pcib0: <Host to PCI bridge> at pcibus 0 on motherboard
:>> pci0: <PCI bus> on pcib0
:>> IOAPIC #1 intpin 13 -> irq 2
:>> IOAPIC #1 intpin 12 -> irq 16
:>> IOAPIC #1 intpin 2 -> irq 17
:>> IOAPIC #1 intpin 7 -> irq 18
:>> pcib1: <PCI-PCI bridge> at device 0.1 on pci0
:>> pci1: <PCI bus> on pcib1
:>> IOAPIC #1 intpin 1 -> irq 19
:>> pci1: <display, VGA> at device 0.0 (no driver attached)
:>> sym0: <896> port 0xf800-0xf8ff mem 0xfeafe000-0xfeafffff,0xfeafac00-0xfeafafff irq 2 at device 1.0 on pci0
:>> sym0: Symbios NVRAM, ID 7, Fast-40, SE, parity checking
:>> sym0: open drain IRQ line driver, using on-chip SRAM
:>> sym0: using LOAD/STORE-based firmware.
:>> sym0: handling phase mismatch from SCRIPTS.
:>> sym1: <896> port 0xf400-0xf4ff mem 0xfeafc000-0xfeafdfff,0xfeafa800-0xfeafabff irq 16 at device 1.1 on pci0
:>> sym1: Symbios NVRAM, ID 7, Fast-40, LVD, parity checking
:>> sym1: open drain IRQ line driver, using on-chip SRAM
:>> sym1: using LOAD/STORE-based firmware.
:>> sym1: handling phase mismatch from SCRIPTS.
:>> em0: <Intel(R) PRO/1000 Network Connection, Version - 1.5.31> port 0xfcc0-0xfcff mem 0xfeac0000-0xfeadffff irq 17 at device 4.0 on pci0
:>> em0: Speed:1000 Mbps Duplex:Full
:>> fxp0: <Intel 82557/8/9 EtherExpress Pro/100(B) Ethernet> port 0xfc40-0xfc7f mem 0xfe900000-0xfe9fffff,0xfeaf9000-0xfeaf9fff irq 18 at device 7.0 on pci0
:>> fxp0: Ethernet address 00:e0:81:00:f0:d7
:>> miibus0: <MII bus> on fxp0
:>> inphy0: <i82555 10/100 media interface> on miibus0
:>> inphy0: 10baseT, 10baseT-FDX, 100baseTX, 100baseTX-FDX, auto
:>> isab0: <PCI-ISA bridge> port 0x500-0x50f at device 15.0 on pci0
:>> isa0: <ISA bus> on isab0
:>> pci0: <mass storage, ATA> at device 15.1 (no driver attached)
:>> pcib2: <ServerWorks host to PCI bridge> at pcibus 2 on motherboard
:>> pci2: <PCI bus> on pcib2
:>> pcib3: <PCI-PCI bridge> at device 2.0 on pci2
:>> pci3: <PCI bus> on pcib3
:>> IOAPIC #1 intpin 11 -> irq 20
:>> IOAPIC #1 intpin 8 -> irq 21
:>> pcib4: <PCI-PCI bridge> at device 0.0 on pci3
:>> pci4: <PCI bus> on pcib4
:>> IOAPIC #1 intpin 10 -> irq 22
:>> amr0: <LSILogic MegaRAID> mem 0xf0000000-0xf3ffffff irq 22 at device 0.0 on pci4
:>> amr0: <LSILogic MegaRAID Enterprise 1600> Firmware G170, BIOS F316, 64MB RAM
:>> pci3: <mass storage, SCSI> at device 1.0 (no driver attached)
:>> pci3: <mass storage, SCSI> at device 2.0 (no driver attached)
:>> orm0: <Option ROMs> at iomem 0xca000-0xcdfff,0xc0000-0xc9fff on isa0
:>> fdc0: <Enhanced floppy controller (i82077, NE72065 or clone)> at port 0x3f7,0x3f0-0x3f5 irq 6 drq 2 on isa0
:>> fdc0: FIFO enabled, 8 bytes threshold
:>> fd0: <1440-KB 3.5" drive> on fdc0 drive 0
:>> atkbdc0: <Keyboard controller (i8042)> at port 0x64,0x60 on isa0
:>> atkbd0: <AT Keyboard> flags 0x1 irq 1 on atkbdc0
:>> kbd0 at atkbd0
:>> psm0: <PS/2 Mouse> irq 12 on atkbdc0
:>> psm0: model IntelliMouse, device ID 3
:>> vga0: <Generic ISA VGA> at port 0x3c0-0x3df iomem 0xa0000-0xbffff on isa0
:>> sc0: <System console> at flags 0x100 on isa0
:>> sc0: VGA <8 virtual consoles, flags=0x300>
:>> sio0: configured irq 4 not in bitmap of probed irqs 0
:>> sio0: port may not be enabled
:>> sio0 at port 0x3f8-0x3ff irq 4 flags 0x10 on isa0
:>> sio0: type 8250 or not responding
:>> sio1: configured irq 3 not in bitmap of probed irqs 0
:>> sio1: port may not be enabled
:>> ppc0: parallel port not found.
:>> unknown: <PNP0303> can't assign resources (port)
:>> psmcpnp0: irq resource info is missing; assuming irq 12
:>> unknown: <PNP0700> can't assign resources (port)
:>> ppc1: parallel port not found.
:>> APIC_IO: Testing 8254 interrupt delivery
:>> APIC_IO: Broken MP table detected: 8254 is not connected to IOAPIC #0 intpin 2
:>> APIC_IO: routing 8254 via 8259 and IOAPIC #0 intpin 0
:>> Timecounters tick every 1.000 msec
:>> ipfw2 initialized, divert enabled, rule-based forwarding enabled, default to deny, logging unlimited
:>> DUMMYNET initialized (011031)
:>> Waiting 5 seconds for SCSI devices to settle
:>> (noperiph:sym0:0:-1:-1): SCSI BUS reset delivered.
:>> (noperiph:sym1:0:-1:-1): SCSI BUS reset delivered.
:>> amrd0: <LSILogic MegaRAID logical drive> on amr0
:>> amrd0: 245014MB (501788672 sectors) RAID 5 (optimal)
:>>
:>> ===> freezing here!
:>>
:>> sa0 at sym1 bus 0 target 5 lun 0
:>> sa0: <HP C5713A H910> Removable Sequential Access SCSI-2 device
:>> sa0: 40.000MB/s transfers (20.000MHz, offset 31, 16bit)
:>> ch0 at sym1 bus 0 target 5 lun 1
:>> ch0: <HP C5713A H910> Removable Changer SCSI-2 device
:>> ch0: 40.000MB/s transfers (20.000MHz, offset 31, 16bit)
:>> ch0: 6 slots, 1 drive, 0 pickers, 0 portals
:>> SMP: AP CPU #1 Launched!
:>> Mounting root from ufs:/dev/amrd0s1a
:>> cd0 at sym0 bus 0 target 3 lun 0
:>> cd0: <TEAC CD-ROM CD-532S 1.0A> Removable CD-ROM SCSI-2 device
:>> cd0: 20.000MB/s transfers (20.000MHz, offset 16)
:>> cd0: Attempt to query device size failed: NOT READY, Medium not present
:>>
:>> ========================
:>> KERNEL config file
:>> ========================
:>>
:>> machine i386
:>> cpu I686_CPU
:>> ident ATMOS
:>>
:>> options SMP # Symmetric MultiProcessor Kernel
:>> options APIC_IO # Symmetric (APIC) I/O
:>>
:>> maxusers 0
:>>
:>> hints "ATMOS.hints" #Default places to look for devices.
:>>
:>>
:>> #options COMPAT_FREEBSD4
:>> options SCHED_4BSD #4BSD scheduler
:>>
:>> #options SCHED_ULE
:>> #options ADAPTIVE_MUTEXES
:>>
:>> #options PQ_CACHESIZE=256
:>>
:>> options CPU_ENABLE_SSE
:>>
:>> options CLK_USE_TSC_CALIBRATION
:>> #options HZ=1000
:>>
:>> #makeoptions CONF_CFLAGS=-fno-builtin
:>> #options MAXDSIZ=(1024UL*1024*1024)
:>> #options MAXSSIZ=(128UL*1024*1024)
:>> #options DFLDSIZ=(1024UL*1024*1024)
:>>
:>> options GEOM_AES
:>> options GEOM_APPLE
:>> options GEOM_BDE
:>> options GEOM_BSD
:>> options GEOM_GPT
:>> options GEOM_MBR
:>> options GEOM_PC98
:>> options GEOM_SUNLABEL
:>> options GEOM_VOL
:>>
:>> options ROOTDEVNAME=\"ufs:amrd0s1a\"
:>>
:>> options INET #InterNETworking
:>> #options INET6 #IPv6 communications protocols
:>> options FFS #Berkeley Fast Filesystem
:>> options SOFTUPDATES #Enable FFS soft updates support
:>> options UFS_ACL #Support for access control lists
:>> options UFS_DIRHASH #Improve performance on big directories
:>> options NFSCLIENT #Network Filesystem Client
:>> options NFSSERVER #Network Filesystem Server
:>> options MSDOSFS #MSDOS Filesystem
:>> options CD9660 #ISO 9660 Filesystem
:>> options PROCFS #Process filesystem (requires PSEUDOFS)
:>> options PSEUDOFS #Pseudo-filesystem framework
:>> options COMPAT_43 #Compatible with BSD 4.3 [KEEP THIS!]
:>> options SCSI_DELAY=5000 #Delay (in ms) before probing SCSI
:>>
:>> options SYSVSHM #SYSV-style shared memory
:>> options SYSVMSG #SYSV-style message queues
:>> options SYSVSEM #SYSV-style semaphores
:>>
:>> options NETSMB
:>> options NETSMBCRYPTO
:>> options LIBMCHAIN
:>> options LIBICONV
:>>
:>> #options WATCHDOG
:>>
:>> options NETGRAPH
:>> #options NETGRAPH_ASYNC
:>> #options NETGRAPH_BPF
:>> #options NETGRAPH_BRIDGE
:>> #options NETGRAPH_CISCO
:>> #options NETGRAPH_ECHO
:>> #options NETGRAPH_ETHER
:>> #options NETGRAPH_FRAME_RELAY
:>> #options NETGRAPH_GIF
:>> #options NETGRAPH_GIF_DEMUX
:>> #options NETGRAPH_HOLE
:>> #options NETGRAPH_IFACE
:>> #options NETGRAPH_IP_INPUT
:>> #options NETGRAPH_KSOCKET
:>> #options NETGRAPH_L2TP
:>> #options NETGRAPH_LMI
:>> #options NETGRAPH_MPPC_ENCRYPTION
:>> #options NETGRAPH_ONE2MANY
:>> #options NETGRAPH_PPP
:>> #options NETGRAPH_PPPOE
:>> #options NETGRAPH_PPTPGRE
:>> #options NETGRAPH_RFC1490
:>> #options NETGRAPH_SOCKET
:>> #options NETGRAPH_SPLIT
:>> #options NETGRAPH_TEE
:>> #options NETGRAPH_TTY
:>> #options NETGRAPH_UI
:>> #options NETGRAPH_VJC
:>>
:>> options MROUTING
:>> options IPFIREWALL
:>> options IPFIREWALL_VERBOSE
:>> options IPFIREWALL_FORWARD
:>> #options IPFIREWALL_VERBOSE_LIMIT=100
:>> #options IPFIREWALL_DEFAULT_TO_ACCEPT
:>> #options IPV6FIREWALL
:>> #options IPV6FIREWALL_VERBOSE
:>> #options IPV6FIREWALL_VERBOSE_LIMIT=100
:>> #options IPV6FIREWALL_DEFAULT_TO_ACCEPT
:>> options IPDIVERT
:>> #options IPFILTER
:>> #options IPFILTER_LOG
:>> #options IPFILTER_DEFAULT_BLOCK
:>> options IPSTEALTH
:>>
:>> options RANDOM_IP_ID
:>>
:>> options ACCEPT_FILTER_DATA
:>> #options ACCEPT_FILTER_HTTP
:>>
:>> options TCP_DROP_SYNFIN
:>> options DUMMYNET
:>> #options BRIDGE
:>>
:>> options QUOTA
:>>
:>> options _KPOSIX_PRIORITY_SCHEDULING
:>> options P1003_1B_SEMAPHORES
:>>
:>> #options MAC
:>> #options MAC_BIBA
:>> #options MAC_BSDEXTENDED
:>> #options MAC_DEBUG
:>> #options MAC_IFOFF
:>> #options MAC_LOMAC
:>> #options MAC_MLS
:>> #options MAC_NONE
:>> #options MAC_PARTITION
:>> #options MAC_SEEOTHERUIDS
:>> #options MAC_TEST
:>>
:>> options KBD_INSTALL_CDEV # install a CDEV entry in /dev
:>>
:>> device isa
:>> #options AUTO_EOI_1
:>>
:>> device pci
:>>
:>> device agp
:>>
:>> # Floppy drives
:>> device fdc
:>>
:>> # SCSI Controllers
:>> device sym # NCR/Symbios Logic (newer chipsets + those of `ncr')
:>> #device ahc
:>>
:>> # SCSI peripherals
:>> device scbus # SCSI bus (required)
:>> device ch # SCSI media changers
:>> device da # Direct Access (disks)
:>> device sa # Sequential Access (tape etc)
:>> device cd # CD
:>> device pass # Passthrough device (direct SCSI access)
:>> device ses # SCSI Environmental Services (and SAF-TE)
:>>
:>>
:>> # RAID controllers
:>> device amr # AMI MegaRAID
:>>
:>>
:>> #options CHANGER_MIN_BUSY_SECONDS=2
:>> #options CHANGER_MAX_BUSY_SECONDS=10
:>>
:>> #options SA_IO_TIMEOUT=4
:>> #options SA_SPACE_TIMEOUT=60
:>> #options SA_REWIND_TIMEOUT=(2*60)
:>> #options SA_ERASE_TIMEOUT=(4*60)
:>> #options SA_1FM_AT_EOD
:>>
:>> #options SCSI_PT_DEFAULT_TIMEOUT=60
:>> options SES_ENABLE_PASSTHROUGH
:>>
:>>
:>> # atkbdc0 controls both the keyboard and the PS/2 mouse
:>> device atkbdc # AT keyboard controller
:>> device atkbd # AT keyboard
:>> options ATKBD_DFLT_KEYMAP
:>> makeoptions ATKBD_DFLT_KEYMAP=us.iso
:>>
:>> device psm # PS/2 mouse
:>>
:>> device vga # VGA video card driver
:>>
:>> device splash # Splash screen and screen saver support
:>>
:>> # syscons is the default console driver, resembling an SCO console
:>> device sc
:>> options MAXCONS=8
:>>
:>> #options SC_ALT_MOUSE_IMAGE
:>> options SC_DFLT_FONT
:>> makeoptions SC_DFLT_FONT=cp850
:>>
:>> options SC_DISABLE_DDBKEY
:>> options SC_DISABLE_REBOOT
:>> options SC_HISTORY_SIZE=512
:>> #options SC_MOUSE_CHAR=0x3
:>> options SC_PIXEL_MODE
:>> options SC_NORM_ATTR=(FG_GREEN|BG_BLACK)
:>> options SC_NORM_REV_ATTR=(FG_YELLOW|BG_GREEN)
:>> options SC_KERNEL_CONS_ATTR=(FG_RED|BG_BLACK)
:>> options SC_KERNEL_CONS_REV_ATTR=(FG_BLACK|BG_RED)
:>> #options SC_CUT_SPACES2TABS
:>> #options SC_CUT_SEPCHARS=\"x09\"
:>> #options SC_TWOBUTTON_MOUSE
:>> #options SC_NO_CUTPASTE
:>> #options SC_NO_FONT_LOADING
:>> #options SC_NO_HISTORY
:>> #options SC_NO_SYSMOUSE
:>> #options SC_NO_SUSPEND_VTYSWITCH
:>>
:>> device npx
:>>
:>> #device pmtimer
:>>
:>> #device sio # 8250, 16[45]50 based serial ports
:>>
:>> # Parallel port
:>> #device ppc
:>> #device ppbus # Parallel port bus (required)
:>> #device lpt # Printer
:>> #device plip # TCP/IP over parallel
:>> #device ppi # Parallel port interface device
:>> #device vpo # Requires scbus and da
:>>
:>>
:>> device miibus # MII bus support
:>> device em
:>> #device fxp # Intel EtherExpress PRO/100B (82557, 82558)
:>>
:>> device random # Entropy device
:>> device loop # Network loopback
:>> device ether # Ethernet support
:>> #device tun # Packet tunnel.
:>> device pty # Pseudo-ttys (telnet etc)
:>> #device gif # IPv6 and IPv4 tunneling
:>> #device faith # IPv6-to-IPv4 relaying (translation)
:>>
:>> device bpf # Berkeley packet filter
:>>
:>>
:>> ------------------
:>>
:>>
:>> Thanks a lot for your help,
:>>
:>> Oliver
:>> --
:>> Best regards
:>> O. Hartmann
:>>
:>> ohartman@mail.physik.uni-mainz.de
:>> ------------------------------------------------------------------
:>> Systemadministration des Institutes fuer Physik der Atmosphaere (IPA)
:>> ------------------------------------------------------------------
:>> Johannes Gutenberg Universitaet Mainz
:>> Becherweg 21
:>> 55099 Mainz
:>>
:>> Tel: +496131/3924662 (machine room)
:>> Tel: +496131/3924144 (office)
:>> FAX: +496131/3923532
:>> _______________________________________________
:>> freebsd-stable@freebsd.org mailing list
:>> http://lists.freebsd.org/mailman/listinfo/freebsd-stable
:>> To unsubscribe, send any mail to "freebsd-stable-unsubscribe@freebsd.org"
:>>
:>

--
Best regards
O. Hartmann

ohartman@mail.physik.uni-mainz.de
------------------------------------------------------------------
Systemadministration des Institutes fuer Physik der Atmosphaere (IPA)
------------------------------------------------------------------
Johannes Gutenberg Universitaet Mainz
Becherweg 21
55099 Mainz

Tel: +496131/3924662 (machine room)
Tel: +496131/3924144 (office)
FAX: +496131/3923532