From owner-freebsd-stable Fri Apr 20 10:16:11 2001
Delivered-To: freebsd-stable@freebsd.org
Received: from bjorn.goddamnbastard.org (c1283020-a.hrvy1.il.home.com [24.183.37.152])
	by hub.freebsd.org (Postfix) with SMTP id E9A3C37B43C
	for ; Fri, 20 Apr 2001 10:08:02 -0700 (PDT)
	(envelope-from ryanb@bjorn.goddamnbastard.org)
Received: (qmail 12468 invoked by uid 1000); 20 Apr 2001 17:08:01 -0000
Date: Fri, 20 Apr 2001 12:08:01 -0500
From: ryanb
To: freebsd-stable@freebsd.org
Subject: AMI MegaRAID (428 series; Enterprise 1200?) + 4-STABLE (2001.12.14) -> hard lock w/o ability to dump
Message-ID: <20010420120801.B9227@bjorn.goddamnbastard.org>
Mime-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
User-Agent: Mutt/1.2.4i
Sender: owner-freebsd-stable@FreeBSD.ORG
Precedence: bulk
X-Loop: FreeBSD.ORG

My, that Subject: line is quite long. Please excuse it. :)

Anyway, I was handed down a Dell PowerEdge (gak) with an AMI MegaRAID
Enterprise 1200 and four 9G disks. The setup within that machine left
all disks (80-pin) stuck to the Dell backplane, which then ran to said
RAID controller. With no free 3.5"/5.25" bays to mount a 68-pin disk
in, I was stuck with creating 2 RAID volumes under the controller: a
1-disk RAID 0 and a 3-disk RAID 5. (Get that smirk off your faces. ;)

Getting to the point, this machine likes to just totally lock up on
its own. It stays pingable, and everything that doesn't rely on the
filesystems/disks seems happy. For example, I received a page stating
that its SMTP services were timing out, but my systat over a serial
console was still responsive. After I exited systat, it just sat and
did nothing, never returning to a shell prompt. (Possibly from the
shell trying to stat() after exiting systat and getting no RAID
response? I dunno ...)

Rather than playing the power cycling game like we did before the
serial consoles were wired, I decided to panic the machine to see what
I could get out of it, if anything.
This eventually failed with the following error:

dumping to dev #amrd/0x20001, offset 1048592
dump failed, reason: device doesn't support a dump routine
Automatic reboot in 15 seconds - press a key on the console to abort

This is making me think the RAID controller is completely acting up if
we can't get a core dump (well, with _that_ error above). (Remember
what I said about only having disks on the RAID controller? (Stop
chuckling already.))

... sooo, I'll go ahead and post some diagnostic info below. I'm
basically looking for any hints or anything else to check out to
confirm a cause of the lockups. (See below -- what's this "no devsw"
business? Seems there was a post with similar diagnostic info to mine
in freebsd-users-jp, 'cept it was in Japanese. Duh.)

Thanks in advance!

- ryan

dmesg output:

Copyright (c) 1992-2000 The FreeBSD Project.
Copyright (c) 1979, 1980, 1983, 1986, 1988, 1989, 1991, 1992, 1993, 1994
	The Regents of the University of California. All rights reserved.
FreeBSD 4.2-STABLE #0: Thu Apr 19 18:41:42 CDT 2001
    root@backup.enteract.com:/usr/obj/usr/src/sys/SMTP2
Timecounter "i8254" frequency 1193182 Hz
CPU: Pentium III/Pentium III Xeon/Celeron (447.69-MHz 686-class CPU)
  Origin = "GenuineIntel"  Id = 0x672  Stepping = 2
  Features=0x383fbff
real memory  = 536862720 (524280K bytes)
avail memory = 519028736 (506864K bytes)
Changing APIC ID for IO APIC #0 from 0 to 2 on chip
Programming 24 pins in IOAPIC #0
IOAPIC #0 intpin 2 -> irq 0
FreeBSD/SMP: Multiprocessor motherboard
 cpu0 (BSP): apic id: 1, version: 0x00040011, at 0xfee00000
 cpu1 (AP):  apic id: 0, version: 0x00040011, at 0xfee00000
 io0 (APIC): apic id: 2, version: 0x00170011, at 0xfec00000
Preloaded elf kernel "kernel" at 0xc02d2000.
Pentium Pro MTRR support enabled
npx0: on motherboard
npx0: INT 16 interface
pcib0: on motherboard
pci0: on pcib0
pcib2: at device 1.0 on pci0
pci1: on pcib2
pci1: at 0.0
pcib3: at device 2.0 on pci0
pci2: on pcib3
ahc0: port 0xec00-0xecff mem 0xf9fff000-0xf9ffffff irq 16 at device 4.0 on pci2
aic7890/91: Wide Channel A, SCSI Id=7, 32/255 SCBs
ahc1: port 0xe800-0xe8ff mem 0xf9ffe000-0xf9ffefff irq 16 at device 6.0 on pci2
aic7860: Single Channel A, SCSI Id=7, 3/255 SCBs
amr0: port 0xe480-0xe4ff irq 18 at device 10.0 on pci2
amr0: Firmware Uc77, BIOS 1.47, 128MB RAM
isab0: at device 7.0 on pci0
isa0: on isab0
pci0: at 7.1
pci0: at 7.2 irq 0
Timecounter "PIIX" frequency 3579545 Hz
chip1: port 0x850-0x85f at device 7.3 on pci0
fxp0: port 0xdce0-0xdcff mem 0xfe000000-0xfe0fffff,0xf7000000-0xf7000fff irq 20 at device 8.0 on pci0
fxp0: Ethernet address 00:90:27:78:a7:1d
dc0: <82c169 PNIC 10/100BaseTX> port 0xd800-0xd8ff mem 0xfe100000-0xfe1000ff irq 21 at device 10.0 on pci0
dc0: Ethernet address: 00:a0:cc:3b:43:ba
miibus0: on dc0
ukphy0: on miibus0
ukphy0: 10baseT, 10baseT-FDX, 100baseTX, 100baseTX-FDX, auto
pcib1: on motherboard
pci3: on pcib1
atkbdc0: at port 0x60,0x64 on isa0
atkbd0: irq 1 on atkbdc0
vga0: at port 0x3c0-0x3df iomem 0xa0000-0xbffff on isa0
sc0: on isa0
sc0: VGA <4 virtual consoles, flags=0x0>
sio0 at port 0x3f8-0x3ff irq 4 flags 0x10 on isa0
sio0: type 16550A, console
fdc0: at port 0x3f0-0x3f5,0x3f7 irq 6 drq 2 on isa0
fdc0: FIFO enabled, 8 bytes threshold
fd0: <1440-KB 3.5" drive> on fdc0 drive 0
APIC_IO: Testing 8254 interrupt delivery
APIC_IO: routing 8254 via IOAPIC #0 intpin 2
Waiting 2 seconds for SCSI devices to settle
amrd0: on amr0
amrd0: 8568MB (17547264 sectors) RAID 0 (optimal)
amrd1: on amr0
amrd1: 17136MB (35094528 sectors) RAID 5 (optimal)
SMP: AP CPU #1 Launched!
pass0 at ahc1 bus 0 target 5 lun 0
pass0: Removable CD-ROM SCSI-2 device
pass0: 20.000MB/s transfers (20.000MHz, offset 15)
no devsw (majdev=0 bootdev=0xa0200000)
Mounting root from ufs:/dev/amrd0s1a
WARNING: / was not properly dismounted

serial console output:

telnet> send break
Stopped at siointr1+0xb1: jmp siointr1+0x1a0
db> panic
panic: from debugger
mp_lock = 00000001; cpuid = 0; lapic.id = 01000000
boot() called on cpu#0

syncing disks...

Fatal trap 12: page fault while in kernel mode
mp_lock = 00000002; cpuid = 0; lapic.id = 01000000
fault virtual address = 0x30
fault code = supervisor read, page not present
instruction pointer = 0x8:0xc01c9569
stack pointer = 0x10:0xff806d18
frame pointer = 0x10:0xff806d1c
code segment = base 0x0, limit 0xfffff, type 0x1b
             = DPL 0, pres 1, def32 1, gran 1
processor eflags = interrupt enabled, resume, IOPL = 0
current process = Idle
interrupt mask = bio <- SMP: XXX
kernel: type 12 trap, code=0
Stopped at siointr1+0xb1: jmp siointr1+0x1a0
db> trace
siointr1(c4068400,c02852a8,0,0,c01fe090) at siointr1+0xb1
siointr(c4068400) at siointr+0x17
Xfastintr4() at Xfastintr4+0x20
db> show all procs
  pid proc     addr       uid  ppid  pgrp flag   stat wmesg  wchan    cmd
43098 ddc58b60 ddcf0000     0   134   134 000004    3 inode  c40d0800 inetd
43097 de807ba0 de86e000     0   134   134 000004    3 inode  c40d0800 inetd
43096 de103260 de142000     0   134   134 000004    3 inode  c40d0800 inetd
43095 de529520 de59e000     0   139   139 000004    3 inode  c40d0800 sshd
43094 de40db60 de4c5000     0   139   139 000004    3 inode  c40d0800 sshd
43093 de5284e0 de5c1000     0   139   139 000004    3 inode  c41a1900 sshd
43090 de411100 de431000     0 43088 43090 004004    3 biord  ce93a468 sh
43088 de40dd00 de4c2000     0   136   136 000084    3 piperd ddbceb80 cron
43067 de9498a0 de99a000     0   358   358 004004    3 inode  c4220500 smtpd
43066 ddf8f560 ddfec000   333   358   358 004104    3 inode  c4376200 cleanup
43065 ddf8eee0 ddffa000   333   358   358 004184    3 select c027fc30 smtpd
43064 dde25860 dde9c000   333   358   358 004104    3 getblk ce8c83dc cleanup
43063 de94b2a0 de966000   333   358   358 004104    3 inode  c4220500 smtpd
43062 ddf8e1e0 de01a000   333   358   358 004184    3 select c027fc30 smtpd
43063 de94b2a0 de966000   333   358   358 004104    3 inode  c4220500 smtpd
43060 de94be00 de94d000   333   358   358 004184    3 select c027fc30 smtpd
43059 ddf8db60 de029000   333   358   358 004104    3 wdrain c026c48c cleanup
43058 de0453c0 de0ab000   333   358   358 004184    3 select c027fc30 smtpd
43057 de8aad80 de8ed000   333   358   358 004104    3 wdrain c026c48c cleanup
43056 dde28920 dde34000   333   358   358 004184    3 select c027fc30 smtpd
43055 de5d0a80 de5f9000   333   358   358 004104    3 wdrain c026c48c cleanup
43054 de94a260 de986000   333   358   358 004184    3 select c027fc30 smtpd
43053 de80a5e0 de81c000   333   358   358 004104    3 wdrain c026c48c cleanup
43052 de73a4e0 de2ff000   333   358   358 004104    3 getblk ce878140 cleanup
43051 ddc58680 ddcfa000   333   358   358 004184    3 select c027fc30 smtpd
43050 de8aaa40 de8f4000   333   358   358 004184    3 select c027fc30 smtpd
43049 de8a8b60 de92f000   333   358   358 004104    3 wdrain c026c48c cleanup
43048 de1bfa00 de23b000   333   358   358 004184    3 select c027fc30 smtpd
43047 de682080 de6f8000   333   358   358 004104    3 getblk ce90a768 cleanup
43046 dde263c0 dde83000   333   358   358 004184    3 select c027fc30 smtpd
43045 de5cd680 de671000   333   358   358 004104    3 wdrain c026c48c cleanup
43044 de279340 de332000   333   358   358 004104    3 getblk ce937b10 cleanup
43043 de8acac0 de8b5000   333   358   358 004104    3 wdrain c026c48c cleanup
43042 de52abe0 de570000   333   358   358 004184    3 select c027fc30 smtpd
43041 de52c440 de541000   333   358   358 004104    3 wdrain c026c48c cleanup
43040 ddf905a0 ddfc9000   333   358   358 004184    3 select c027fc30 smtpd
43039 ddedcc20 ddf05000   333   358   358 004184    3 select c027fc30 smtpd
43038 de40e040 de4bc000   333   358   358 004184    3 select c027fc30 smtpd
43037 de044520 de0cc000   333   358   358 004104    3 wdrain c026c48c cleanup
43036 de947680 de9e1000   333   358   358 004104    3 wdrain c026c48c cleanup
43035 de100680 de1ae000   333   358   358 004184    3 select c027fc30 smtpd
43034 db48ff60 ddbe6000   333   358   358 004184    3 select c027fc30 smtpd
43031 de33bd00 de3e9000   333   358   358 004104    3 wdrain c026c48c cleanup
43030 ddd0eac0 ddd1c000   333   358   358 004184    3 select c027fc30 smtpd
43029 ddd0df60 ddd32000   333   358   358 004104    3 ffsfsn de509704 cleanup
43028 de4105a0 de44e000   333   358   358 004184    3 select c027fc30 smtpd
43027 de33ec20 de36f000   333   358   358 004184    3 select c027fc30 smtpd
43026 ddc5b260 ddc9c000   333   358   358 004104    3 getblk ce88b5e8 cleanup
43025 db48e700 ddc1e000   333   358   358 004184    3 select c027fc30 smtpd
43024 de947820 de9de000   333   358   358 004104    3 getblk ce894490 cleanup
43023 de8a9a00 de913000   333   358   358 004104    3 ffsfsn db48b304 cleanup
43022 de52aa40 de573000   333   358   358 004184    3 select c027fc30 smtpd
43021 de27b560 de2d9000   333   358   358 004184    3 select c027fc30 smtpd
43020 de27ca80 de2aa000   333   358   358 004104    3 ffsfsn de356404 cleanup
43019 de8a81a0 de942000   333   358   358 004104    3 getblk ce8d24ac cleanup
43018 db4905e0 ddbc8000   333   358   358 004184    3 select c027fc30 smtpd
43017 ddf912a0 ddfae000   333   358   358 004104    3 ffsfsn ddd15404 cleanup
43016 de045d80 de092000   333   358   358 004184    3 select c027fc30 smtpd
43015 de52ba80 de554000   333   358   358 004104    3 ffsfsn dea28984 cleanup
43014 dde244e0 ddecc000   333   358   358 004184    3 select c027fc30 smtpd
43013 de94ac20 de973000   333   358   358 004104    3 ffsfsn de4e0684 cleanup
43012 de806ea0 de887000   333   358   358 004104    3 ffsfsn ddebb504 cleanup
43011 ddedbf20 ddf21000   333   358   358 004184    3 select c027fc30 smtpd
43010 de683dc0 de6bd000   333   358   358 004104    3 inode  c4220500 cleanup
43009 de1beea0 de254000   333   358   358 004184    3 select c027fc30 smtpd
43008 db48ed80 ddc0a000   333   358   358 004104    3 ffsfsn de4e7544 cleanup
43007 db48d6c0 ddc44000   333   358   358 004184    3 select c027fc30 smtpd
43006 de73cbe0 de794000   333   358   358 004104    3 ffsfsn de4fe284 cleanup
43005 de27d2a0 de29a000   333   358   358 004104    3 getblk ce865460 cleanup
42992 de5cfbe0 de618000   333   358   358 004104    3 biord  ce8ff594 cleanup
42991 de40e520 de4b1000   333   358   358 004184    3 select c027fc30 smtpd
42990 de043820 de0e8000   333   358   358 004184    3 select c027fc30 smtpd
42988 de8a8820 de936000   333   358   358 004104    3 ffsfsn ddd8f004 cleanup
42987 de8088a0 de855000   333   358   358 004184    3 select c027fc30 smtpd
42986 ddf8dd00 de025000   333   358   358 004104    3 ffsfsn de4f8904 cleanup
42984 de1bf860 de23e000   333   358   358 004184    3 select c027fc30 smtpd
42975 de6809c0 de725000   333   358   358 004104    3 ffsfsn de43de84 cleanup
42974 de947340 de9e7000   333   358   358 004184    3 select c027fc30 smtpd
42973 de73c700 de79d000   333   358   358 004184    3 select c027fc30 smtpd
42972 de949be0 de992000   333   358   358 004104    3 ffsfsn ddd12944 cleanup
42967 dde241a0 dded3000   333   358   358 004104    3 biord  ce928718 flush
42938 de73e920 de74b000   333   358   358 004104    3 getbuf ce8b1a06 bounce
42922 de680d00 de71e000   333   358   358 004104    3 ffsfsn dea11104 bounce
42921 dde28440 dde3f000   333   358   358 004104    3 getblk ce840798 bounce
42919 de103740 de138000   333   358   358 004104    3 ffsfsn de513e84 bounce
42905 ddd0e2a0 ddd2b000   333   358   358 004184    3 select c027fc30 smtp
42885 dde24680 ddec9000   333   358   358 004184    3 select c027fc30 smtp
42854 de8061a0 de8a3000   333   358   358 004184    3 select c027fc30 smtp
42834 de046400 de081000   333   358   358 004184    3 select c027fc30 smtp
42820 de103400 de13e000   333   358   358 004184    3 select c027fc30 smtpd
42688 de27d920 de28c000   333   358   358 004104    3 inode  c4220500 smtp
42662 de6804e0 de72f000   333   358   358 004184    3 select c027fc30 smtp
42658 de8071e0 de881000   333   358   358 004104    3 getblk ce8fd020 smtp
42535 de40d1a0 de522000   333   358   358 004104    3 ffsfsn de77d1c4 bounce
42487 de33edc0 de36c000   333   358   358 004104    3 inode  c4220500 smtp
42479 ddd0ab60 dde0b000   333   358   358 004184    3 select c027fc30 smtp
42459 de047100 de065000   333   358   358 004184    3 select c027fc30 smtp
42369 de5ce6c0 de652000   333   358   358 004104    3 ffsfsn ddd9dd04 bounce
42325 de33e400 de389000   333   358   358 004104    3 ffsfsn de383704 bounce
42298 dde27740 dde5a000   333   358   358 004104    3 wdrain c026c48c cleanup
42296 de684c60 de689000   333   358   358 004104    3 ffsfsn dea20e44 cleanup
42295 de045080 de0b3000   333   358   358 004104    3 getblk ce87bf80 cleanup
42291 de807380 de87e000   333   358   358 004104    3 ffsfsn ddd8b684 cleanup
42229 de948ba0 de9b5000   333   358   358 004104    3 ffsfsn de4d1a04 cleanup
41967 ddd0ec60 ddd18000   333   358   358 004184    3 select c027fc30 smtp
41812 dde24340 ddecf000   333   358   358 004184    3 select c027fc30 smtp
41737 de27a6c0 de303000   333   358   358 004184    3 select c027fc30 smtp
41609 de1bf520 de246000   333   358   358 004184    3 select c027fc30 smtpd
41608 de045560 de0a8000   333   358   358 004104    3 inode  c4220500 smtpd
41458 de5d00c0 de60f000   333   358   358 004184    3 select c027fc30 smtpd
41309 de411440 de429000   333   358   358 004104    3 wdrain c026c48c cleanup
41306 de808a40 de852000   333   358   358 004104    3 wdrain c026c48c cleanup
41305 de809a80 de832000   333   358   358 004104    3 wdrain c026c48c cleanup
41239 ddc59ba0 ddccd000   333   358   358 004184    3 select c027fc30 smtpd
41237 de1bf1e0 de24d000   333   358   358 004184    3 select c027fc30 smtpd
41230 ddd0bd40 ddd7e000   333   358   358 004184    3 select c027fc30 smtpd
41224 de8a9d40 de90d000   333   358   358 004184    3 select c027fc30 smtpd
41217 de8ab260 de8e4000   333   358   358 004184    3 select c027fc30 smtpd
41215 de40e380 de4b5000   333   358   358 004184    3 select c027fc30 smtpd
24519 ddd0ca40 ddd63000     0     1 24519 004006    3 inode  c40d0800 sh
24009 ddf915e0 ddfa6000 22787 20877 24009 004086    3 ttyin  c4611c30 bash
24006 de33b000 de40b000     0 20883 24006 004086    3 ttyin  c44fc030 systat
21033 de683740 de6cb000   333   358   358 004184    3 select c027fc30 trivial-rewrite
21016 db48f0c0 ddc03000   333   358   358 004104    3 inode  c4801900 qmgr
20883 ddf8e520 de00e000     0 20880 20883 004086    3 wait   ddf8e520 bash
20880 de33f780 de350000     0 20878 20880 004086    3 wait   de33f780 sh
20878 de5d1440 de5e6000 22787 20877 20878 004086    3 wait   de5d1440 bash
20877 de33bb60 de3ec000 22787     1 20877 000084    3 select c027fc30 screen-3.9.8
  386 ddd0e440 ddd28000     0     1   386 004086    3 ttyin  c43d7110 getty
  385 ddd0c080 ddd78000     0     1   385 004086    3 ttyin  c43d7f10 getty
  384 ddd0c700 ddd6a000     0     1   384 004086    3 ttyin  c43d4110 getty
  383 ddd0ddc0 ddd36000     0     1   383 004086    3 ttyin  c4063210 getty
  358 db48ef20 ddc06000     0     1   358 004104    3 inode  c4220500 master
  139 db48f260 ddc00000     0     1   139 000004    3 inode  c40d0800 sshd
  136 db48f400 ddbfc000     0     1   136 000004    3 inode  c40d0800 cron
  134 db48f5a0 ddbf8000     0     1   134 000084    3 select c027fc30 inetd
  117 db48f740 ddbf5000     1     1   117 000104    3 inode  c40d0800 rwhod
  111 db48f8e0 ddbf2000     0     1   106 000084    3 nfsidl c0281dac nfsiod
  110 db48fa80 ddbef000     0     1   106 000084    3 nfsidl c0281da8 nfsiod
  109 db48fc20 ddbec000     0     1   106 000084    3 nfsidl c0281da4 nfsiod
  108 db48fdc0 ddbe9000     0     1   106 000084    3 nfsidl c0281da0 nfsiod
  102 db490100 ddbd5000     0     1   102 000084    3 select c027fc30 ntpd
   95 db4902a0 ddbd2000     0     1    95 000084    3 select c027fc30 syslogd
    5 db490780 db49d000     0     0     0 000204    3 wdrain c026c48c syncer
    4 db490920 db49b000     0     0     0 100204    3 psleep c026c454 bufdaemon
    3 db490ac0 db499000     0     0     0 000204    3 psleep c0276fa0 vmdaemon
    2 db490c60 db497000     0     0     0 100204    3 psleep c025e5f8 pagedaemon
    1 db490e00 db495000     0     0     1 004284    3 wait   db490e00 init
    0 c027efa0 c02f2000     0     0     0 000204    3 sched  c027efa0 swapper
39739 de5d0400 de609000     0   358   358 006104    5                 pickup
db> step
panic: rslock: cpu: 0, addr: 0xc02852a8, lock: 0x00000001
mp_lock = 00000002; cpuid = 0; lapic.id = 01000000
boot() called on cpu#0
Uptime: 15h14m39s
amr0: flushing cache...failed

dumping to dev #amrd/0x20001, offset 1048592
dump failed, reason: device doesn't support a dump routine
Automatic reboot in 15 seconds - press a key on the console to abort
Rebooting...
cpu_reset called on cpu#0
cpu_reset: Stopping other CPUs

To Unsubscribe: send mail to majordomo@FreeBSD.org
with "unsubscribe freebsd-stable" in the body of the message
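
The "show all procs" listing is long, so it can help to tally the
processes by wait message (the wmesg column) instead of reading it row
by row. What follows is a minimal Python sketch, assuming the ddb
output has been saved as text with one process per line. The function
name tally_wmesg and the shortened sample are only illustrative and do
not come from any FreeBSD tool.

```python
from collections import Counter


def tally_wmesg(ddb_text):
    """Count processes per wait message in ddb 'show all procs' output."""
    counts = Counter()
    for line in ddb_text.splitlines():
        fields = line.split()
        # A sleeping process row has at least 11 fields:
        # pid proc addr uid ppid pgrp flag stat wmesg wchan cmd
        # The header, blank lines, and rows without wmesg/wchan
        # (such as the zombie at the end of the listing) are skipped.
        if len(fields) < 11 or not fields[0].isdigit():
            continue
        counts[fields[8]] += 1
    return counts


# A few rows copied from the listing, for illustration only.
sample = """\
pid proc addr uid ppid pgrp flag stat wmesg wchan cmd
43098 ddc58b60 ddcf0000 0 134 134 000004 3 inode c40d0800 inetd
43090 de411100 de431000 0 43088 43090 004004 3 biord ce93a468 sh
43064 dde25860 dde9c000 333 358 358 004104 3 getblk ce8c83dc cleanup
43059 ddf8db60 de029000 333 358 358 004104 3 wdrain c026c48c cleanup
43065 ddf8eee0 ddffa000 333 358 358 004184 3 select c027fc30 smtpd
39739 de5d0400 de609000 0 358 358 006104 5 pickup
"""

for wmesg, n in tally_wmesg(sample).most_common():
    print(wmesg, n)
```

Run over the full listing, this shows at a glance how many processes
are asleep on wait messages such as inode, getblk, wdrain, biord and
ffsfsn, as opposed to sitting idle in select. Those names look
filesystem- or buffer-related, which would fit the symptom that only
things touching the disks hang.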
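
The volume sizes that amrd reports in the dmesg output can be
cross-checked with a little arithmetic. A short Python sketch follows,
assuming 512-byte sectors (the dmesg output does not state the sector
size); the helper name sectors_to_mb is hypothetical.

```python
SECTOR_BYTES = 512  # assumption: not stated anywhere in the dmesg output


def sectors_to_mb(sectors, sector_bytes=SECTOR_BYTES):
    """Convert a sector count to whole megabytes (1 MB = 2**20 bytes)."""
    return sectors * sector_bytes // (1024 * 1024)


amrd0_sectors = 17547264  # the 1-disk RAID 0 volume
amrd1_sectors = 35094528  # the 3-disk RAID 5 volume

print(sectors_to_mb(amrd0_sectors))        # 8568, matching "8568MB"
print(sectors_to_mb(amrd1_sectors))        # 17136, matching "17136MB"
print(amrd1_sectors == 2 * amrd0_sectors)  # True
```

The RAID 5 volume comes out at exactly twice the single-disk volume,
which is what one would expect from three equal disks with one disk's
worth of space given over to parity.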