From owner-freebsd-stable Mon Jul 2 3: 7:41 2001
Delivered-To: freebsd-stable@freebsd.org
Received: from klima.physik.uni-mainz.de (klima.Physik.Uni-Mainz.DE [134.93.180.162])
        by hub.freebsd.org (Postfix) with ESMTP id 3242337B406;
        Mon, 2 Jul 2001 03:07:18 -0700 (PDT)
        (envelope-from ohartman@klima.physik.uni-mainz.de)
Received: from klima.Physik.Uni-Mainz.DE (Sturm@klima.Physik.Uni-Mainz.DE [134.93.180.162])
        by klima.physik.uni-mainz.de (8.11.4/8.11.4) with ESMTP id f62A7G800981;
        Mon, 2 Jul 2001 12:07:17 +0200 (CEST)
        (envelope-from ohartman@klima.physik.uni-mainz.de)
Date: Mon, 2 Jul 2001 12:07:16 +0200 (CEST)
From: "Hartmann, O."
To: Pete French
Subject: Re: HELP! Server crashes since last cvsupdate!
MIME-Version: 1.0
Content-Type: TEXT/PLAIN; charset=US-ASCII
Sender: owner-freebsd-stable@FreeBSD.ORG
Precedence: bulk
List-Archive: (Web Archive)
List-Help: (List Instructions)
X-Loop: FreeBSD.ORG

On Mon, 2 Jul 2001, Pete French wrote:

Another system, our development server, has been running FreeBSD 4.3-STABLE
with 15 days of uptime now (built from the sources as of that date). It seems
that some kind of bug has gone into the stable tree since then.

Here is the dmesg output of that machine:

Copyright (c) 1992-2001 The FreeBSD Project.
Copyright (c) 1979, 1980, 1983, 1986, 1988, 1989, 1991, 1992, 1993, 1994
        The Regents of the University of California. All rights reserved.
FreeBSD 4.3-STABLE #8: Fri Jun 15 22:38:25 CEST 2001
    root@edda.physik.uni-mainz.de:/usr/local/obj/usr/src/sys/EDDA
Timecounter "i8254" frequency 1193182 Hz
CPU: Pentium II/Pentium II Xeon/Celeron (349.06-MHz 686-class CPU)
  Origin = "GenuineIntel" Id = 0x652 Stepping = 2
  Features=0x183fbff
real memory = 268369920 (262080K bytes)
avail memory = 257118208 (251092K bytes)
Programming 24 pins in IOAPIC #0
IOAPIC #0 intpin 2 -> irq 0
FreeBSD/SMP: Multiprocessor motherboard
 cpu0 (BSP): apic id: 0, version: 0x00040011, at 0xfee00000
 cpu1 (AP): apic id: 1, version: 0x00040011, at 0xfee00000
 io0 (APIC): apic id: 2, version: 0x00170011, at 0xfec00000
Preloaded elf kernel "kernel" at 0xc03fb000.
ccd0-3: Concatenated disk drivers
netsmb_dev: loaded
Pentium Pro MTRR support enabled
npx0: on motherboard
npx0: INT 16 interface
pcib0: on motherboard
IOAPIC #0 intpin 16 -> irq 2
IOAPIC #0 intpin 17 -> irq 16
IOAPIC #0 intpin 18 -> irq 17
pci0: on pcib0
pcib1: at device 1.0 on pci0
pci1: on pcib1
pci1: at 0.0 irq 2
isab0: at device 7.0 on pci0
isa0: on isab0
pci0: at 7.1
pci0: at 7.2 irq 11
intpm0: port 0x5000-0x500f irq 9 at device 7.3 on pci0
intpm0: I/O mapped 5000
intpm0: intr IRQ 9 enabled revision 0
smbus0: on intsmb0
smb0: on smbus0
intpm0: PM I/O mapped 4000
sym0: <1010-33> port 0xe400-0xe4ff mem 0xec102000-0xec103fff,0xec105000-0xec1053ff irq 2 at device 8.0 on pci0
sym0: Symbios NVRAM, ID 7, Fast-80, LVD, parity checking
sym0: open drain IRQ line driver, using on-chip SRAM
sym0: using LOAD/STORE-based firmware.
sym0: handling phase mismatch from SCRIPTS.
sym1: <1010-33> port 0xe800-0xe8ff mem 0xec100000-0xec101fff,0xec106000-0xec1063ff irq 16 at device 8.1 on pci0
sym1: Symbios NVRAM, ID 7, Fast-80, SE, parity checking
sym1: open drain IRQ line driver, using on-chip SRAM
sym1: using LOAD/STORE-based firmware.
sym1: handling phase mismatch from SCRIPTS.
fxp0: port 0xec00-0xec3f mem 0xec000000-0xec0fffff,0xec104000-0xec104fff irq 17 at device 10.0 on pci0
fxp0: Ethernet address 00:02:b3:17:36:29
inphy0: on miibus0
inphy0: 10baseT, 10baseT-FDX, 100baseTX, 100baseTX-FDX, auto
atkbdc0: at port 0x60,0x64 on isa0
atkbd0: irq 1 on atkbdc0
kbd0 at atkbd0
psm0: irq 12 on atkbdc0
psm0: model IntelliMouse, device ID 3
vga0: at port 0x3c0-0x3df iomem 0xa0000-0xbffff on isa0
sc0: on isa0
sc0: VGA <8 virtual consoles, flags=0x200>
fdc0: at port 0x3f0-0x3f5,0x3f7 irq 6 drq 2 on isa0
fdc0: FIFO enabled, 8 bytes threshold
fd0: <1440-KB 3.5" drive> on fdc0 drive 0
sio0 at port 0x3f8-0x3ff irq 4 flags 0x10 on isa0
sio0: type 16550A
sio1 at port 0x2f8-0x2ff irq 3 flags 0x10 on isa0
sio1: type 16550A
ppc0: at port 0x378-0x37f irq 7 drq 1 flags 0x8 on isa0
ppc0: Generic chipset (ECP-only) in ECP mode
lpt0: on ppbus0
lpt0: Interrupt-driven port
APIC_IO: Testing 8254 interrupt delivery
APIC_IO: routing 8254 via IOAPIC #0 intpin 2
DUMMYNET initialized (010124)
IP packet filtering initialized, divert enabled, rule-based forwarding enabled, default to deny, unlimited logging
IPsec: Initialized Security Association Processing.
Waiting 4 seconds for SCSI devices to settle
(noperiph:sym1:0:-1:-1): SCSI BUS reset delivered.
SMP: AP CPU #1 Launched!
(noperiph:sym0:0:-1:-1): SCSI BUS reset detected.
Mounting root from ufs:/dev/da0s1a
cd0 at sym1 bus 0 target 3 lun 0
cd0: Removable CD-ROM SCSI-2 device
cd0: 20.000MB/s transfers (20.000MHz, offset 16)
cd0: Attempt to query device size failed: NOT READY, Medium not present
da0 at sym0 bus 0 target 0 lun 0
da0: Fixed Direct Access SCSI-2 device
da0: 80.000MB/s transfers (40.000MHz, offset 15, 16bit), Tagged Queueing Enabled
da0: 8715MB (17850000 512 byte sectors: 255H 63S/T 1111C)
link_elf: symbol splash_register undefined
arp: runt packet
arp: runt packet
arp: runt packet
arp: runt packet
arp: runt packet
arp: runt packet
arp: runt packet
arp: runt packet
arp: runt packet
arp: runt packet
arp: runt packet
arp: runt packet
arp: runt packet
linux: syscall fstat64 is obsoleted or not implemented (pid=19990)
arp: runt packet
arp: runt packet
arp: runt packet
arp: runt packet
arp: runt packet
arp: runt packet
nfs server klima:/usr/scratch: not responding
nfs server klima:/usr/scratch: not responding
arp: runt packet
nfs server atmos:/cdrom: not responding
nfs server atmos:/cdrom: is alive again

:>> Since our last update Friday, 29th June, both SMP machines run
:>> into a "stuck" condition after a while. This happened now two times
:>> and I do not know what happens.
:>
:>I've been seeing this effect since 4.3-RELEASE actually. With pretty much
:>identical symptoms to the ones you describe. Asking here earlier people
:>seemed to think that it was the disc controllers getting locked up as this
:>will lead to the effects described. Sometimes the machine will run
:>for weeks at a time, sometimes it will freeze after a few hours. The
:>easiest way I can make it lock up is to try and access a very large
:>file from two processes at once.
:>
:>I'm currently trying to find time to work out how to use the kernel
:>debugging stuff to connect over the network and see what sort of
:>state the kernel's in (which it is apparently possible to do). But
:>not really got anywhere with that yet. I'd be interested in knowing
:>what sort of machine you have and what the components are to see if
:>there's anything that both systems have in common (other than the SMP bits).
:>
:>cheers,
:>
:>-pcf.
:>

--
Best regards,
O. Hartmann

ohartman@klima.physik.uni-mainz.de
----------------------------------------------------------------
IT-Administration des Institut fuer Physik der Atmosphaere (IPA)
----------------------------------------------------------------
Johannes Gutenberg Universitaet Mainz
Becherweg 21
55099 Mainz

Tel: +496131/3924662 (machine room)
Tel: +496131/3924144
FAX: +496131/3923532


To Unsubscribe: send mail to majordomo@FreeBSD.org
with "unsubscribe freebsd-stable" in the body of the message