From owner-freebsd-stable@FreeBSD.ORG Wed Apr 14 00:00:32 2004
Return-Path:
Delivered-To: freebsd-stable@freebsd.org
Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125])
    by hub.freebsd.org (Postfix) with ESMTP id A735416A4CE;
    Wed, 14 Apr 2004 00:00:32 -0700 (PDT)
Received: from horsey.gshapiro.net (horsey.gshapiro.net [64.105.95.154])
    by mx1.FreeBSD.org (Postfix) with ESMTP id 2151043D5C;
    Wed, 14 Apr 2004 00:00:31 -0700 (PDT)
    (envelope-from gshapiro@gshapiro.net)
Received: from horsey.gshapiro.net (localhost [127.0.0.1])
    id i3E6xpfT052599
    (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=NO);
    Tue, 13 Apr 2004 23:59:51 -0700 (PDT)
Received: (from gshapiro@localhost)i3E6xpPU052598;
    Tue, 13 Apr 2004 23:59:51 -0700 (PDT)
Date: Tue, 13 Apr 2004 23:59:51 -0700
From: Gregory Neil Shapiro
To: Scott Long
Message-ID: <20040414065950.GM4586@horsey.gshapiro.net>
References: <407CA652.1090805@freebsd.org>
Mime-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
In-Reply-To: <407CA652.1090805@freebsd.org>
User-Agent: Mutt/1.5.6i
cc: Pete French
cc: stable@freebsd.org
Subject: Re: SMP/HTT problems with 4.10-BETA
X-BeenThere: freebsd-stable@freebsd.org
X-Mailman-Version: 2.1.1
Precedence: list
List-Id: Production branch of FreeBSD source code
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
X-List-Received-Date: Wed, 14 Apr 2004 07:00:32 -0000

> I almost wonder if you have a buggy set of CPUs. Have you tried
> updating the BIOS? Often times updated BIOSes have updated uCode
> patches for the CPU.

I tried it on my HTT system (Dell PowerEdge 400SC, up-to-date BIOS) as well
and had similar results. Setting machdep.hlt_logical_cpus to 0 on a
running system ended up with various processes crashing over the next 12
hours. The system has been running absolutely fine before doing this and
after rebooting to get back to the default setup.

It is quite possible that it is my own fault if machdep.hlt_logical_cpus
should only be set during boot as a loader tunable but that isn't the way I
read the UPDATING entry.

Here are the details:

Time line:

Apr 8 21:24 sysctl machdep.hlt_logical_cpus=0
Apr 9 00:22:44 horsey /kernel: pid 130 (named), uid 53: exited on signal 6
Apr 9 00:23:32 horsey /kernel: pid 30932 (perl), uid 103: exited on signal 11
Apr 9 00:34:41 horsey /kernel: pid 31223 (perl), uid 103: exited on signal 11
Apr 9 00:46:41 horsey /kernel: pid 31389 (perl), uid 103: exited on signal 11
Apr 9 01:02:57 horsey /kernel: pid 31645 (perl), uid 103: exited on signal 11
Apr 9 01:14:50 horsey /kernel: pid 31815 (perl), uid 103: exited on signal 11
Apr 9 02:10:40 horsey /kernel: pid 32879 (perl), uid 103: exited on signal 11
Apr 9 03:06:33 horsey /kernel: pid 34102 (cvsup), uid 63: exited on signal 6 (core dumped)
Apr 9 04:38:53 horsey /kernel: pid 36119 (perl), uid 103: exited on signal 11
Apr 9 04:58:21 horsey /kernel: pid 36420 (sendmail:8.13.0.), uid 0: exited on signal 12
Apr 9 05:16:14 horsey /kernel: pid 36952 (perl), uid 103: exited on signal 11
Apr 9 05:58:48 horsey /kernel: pid 37908 (perl), uid 103: exited on signal 11
Apr 9 06:59:59 horsey /kernel: pid 39287 (perl), uid 103: exited on signal 11
Apr 9 09:35:51 horsey /kernel: pid 43877 (perl), uid 103: exited on signal 11
Apr 9 10:42:27 horsey /kernel: pid 47579 (perl), uid 103: exited on signal 11
Apr 9 11:23:34 horsey /kernel: pid 41658 (named), uid 53: exited on signal 6
Apr 9 11:45:38 horsey /kernel: pid 49470 (sendmail:8.13.0.), uid 0: exited on signal 12
Apr 9 11:55:36 horsey /kernel: pid 49854 (perl), uid 103: exited on signal 11
Apr 9 12:09:00 horsey /kernel: pid 50248 (perl), uid 103: exited on signal 11
Apr 9 12:16:30 horsey /kernel: pid 50453 (perl), uid 103: exited on signal 11
Apr 9 12:23:43 horsey /kernel: pid 50632 (perl), uid 103: exited on signal 11
Apr 9 12:55:22 reboot

dmesg.boot:

Copyright (c) 1992-2003 The FreeBSD Project.
Copyright (c) 1979, 1980, 1983, 1986, 1988, 1989, 1991, 1992, 1993, 1994
    The Regents of the University of California. All rights reserved.
FreeBSD 4.10-BETA #17: Sat Apr 10 12:24:44 PDT 2004
    gshapiro@horsey.gshapiro.net:/src/FreeBSD/RELENG_4/obj/src/sys/HORSEY
Timecounter "i8254" frequency 1193182 Hz
CPU: Intel(R) Pentium(R) 4 CPU 2.40GHz (2394.01-MHz 686-class CPU)
  Origin = "GenuineIntel" Id = 0xf29 Stepping = 9
  Features=0xbfebfbff
  Hyperthreading: 2 logical CPUs
real memory = 1073168384 (1048016K bytes)
avail memory = 1040097280 (1015720K bytes)
Changing APIC ID for IO APIC #0 from 0 to 2 on chip
Programming 24 pins in IOAPIC #0
IOAPIC #0 intpin 2 -> irq 0
FreeBSD/SMP: Multiprocessor motherboard: 2 CPUs
 cpu0 (BSP): apic id: 0, version: 0x00050014, at 0xfee00000
 cpu1 (AP): apic id: 1, version: 0x00050014, at 0xfee00000
 io0 (APIC): apic id: 2, version: 0x00178020, at 0xfec00000
Pentium 4 TCC support enabled, current performance 100%
Preloaded elf kernel "kernel" at 0xc0461000.
Warning: Pentium 4 CPU: PSE disabled
Pentium Pro MTRR support enabled
Using $PIR table, 8 entries at 0xc00feae0
apm0: on motherboard
apm0: found APM BIOS v1.2, connected at v1.2
npx0: on motherboard
npx0: INT 16 interface
pcib0: on motherboard
IOAPIC #0 intpin 16 -> irq 2
IOAPIC #0 intpin 19 -> irq 13
IOAPIC #0 intpin 18 -> irq 16
IOAPIC #0 intpin 23 -> irq 17
IOAPIC #0 intpin 17 -> irq 18
pci0: on pcib0
pcib1: at device 1.0 on pci0
pci1: on pcib1
uhci0: port 0xff80-0xff9f irq 2 at device 29.0 on pci0
usb0: on uhci0
usb0: USB revision 1.0
uhub0: Intel UHCI root hub, class 9/0, rev 1.00/1.00, addr 1
uhub0: 2 ports with 2 removable, self powered
uhub1: Atmel UHB124 hub, class 9/0, rev 1.00/1.00, addr 2
uhub1: 4 ports with 4 removable, bus powered
umass0: SanDisk Corporation ImageMate CompactFlash USB, rev 1.10/0.09, addr 3
umass0: Get Max Lun not supported (STALLED)
umass1: Alcor Micro Mass Storage Device, rev 1.10/1.00, addr 4
umass1: Get Max Lun not supported (STALLED)
uhci1: port 0xff60-0xff7f irq 13 at device 29.1 on pci0
usb1: on uhci1
usb1: USB revision 1.0
uhub2: Intel UHCI root hub, class 9/0, rev 1.00/1.00, addr 1
uhub2: 2 ports with 2 removable, self powered
uhci2: port 0xff40-0xff5f irq 16 at device 29.2 on pci0
usb2: on uhci2
usb2: USB revision 1.0
uhub3: Intel UHCI root hub, class 9/0, rev 1.00/1.00, addr 1
uhub3: 2 ports with 2 removable, self powered
uhci3: port 0xff20-0xff3f irq 2 at device 29.3 on pci0
usb3: on uhci3
usb3: USB revision 1.0
uhub4: Intel UHCI root hub, class 9/0, rev 1.00/1.00, addr 1
uhub4: 2 ports with 2 removable, self powered
pci0: at 29.7 irq 17
pcib2: at device 30.0 on pci0
IOAPIC #0 intpin 21 -> irq 19
IOAPIC #0 intpin 22 -> irq 20
pci2: on pcib2
pci2: at 0.0 irq 19
fwohci0: mem 0xfe9df000-0xfe9dffff irq 20 at device 1.0 on pci2
fwohci0: OHCI version 1.0 (ROM=1)
fwohci0: No. of Isochronous channel is 8.
fwohci0: EUI64 00:30:dd:80:00:50:fb:f0
fwohci0: Phy 1394a available S400, 3 ports.
fwohci0: Link S400, max_rec 2048 bytes.
firewire0: on fwohci0
fwe0: on firewire0
if_fwe0: Fake Ethernet address: 02:30:dd:50:fb:f0
sbp0: on firewire0
sbp_targ0: on firewire0
fwohci0: Initiate bus reset
fwohci0: node_id=0xc800ffc0, gen=1, CYCLEMASTER mode
firewire0: 1 nodes, maxhop <= 0, cable IRM = 0 (me)
firewire0: bus manager 0 (me)
em0: port 0xddc0-0xddff mem 0xfe9e0000-0xfe9fffff irq 16 at device 12.0 on pci2
em0: Speed:N/A Duplex:N/A
isab0: at device 31.0 on pci0
isa0: on isab0
atapci0: port 0xffa0-0xffaf,0x374-0x377,0x170-0x177,0x3f4-0x3f7,0x1f0-0x1f7 mem 0xfebffc00-0xfebfffff irq 16 at device 31.1 on pci0
ata0: at 0x1f0 irq 14 on atapci0
ata1: at 0x170 irq 15 on atapci0
atapci1: port 0xfea0-0xfeaf,0xfe30-0xfe33,0xfe20-0xfe27,0xfe10-0xfe13,0xfe00-0xfe07 irq 16 at device 31.2 on pci0
ata2: at 0xfe00 on atapci1
ata3: at 0xfe20 on atapci1
ichsmb0: port 0xeda0-0xedbf irq 18 at device 31.3 on pci0
smbus0: on ichsmb0
smb0: on smbus0
pcm0: port 0xedc0-0xedff,0xee00-0xeeff mem 0xfebff900-0xfebff9ff,0xfebffa00-0xfebffbff irq 18 at device 31.5 on pci0
pcm0:
orm0:
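
For anyone who wants to summarize a time line like the one earlier in this message, here is a minimal Python sketch that counts "exited on signal" entries per process name and signal number. The function name tally_signals and the SAMPLE list are illustrative only and are not part of any FreeBSD tool; the sample holds just a few lines copied from the time line, so feed it the full log to get complete totals.

```python
import re
from collections import Counter

# Matches the kernel's "exited on signal" messages as they appear in the time line.
EXIT_RE = re.compile(r"pid (\d+) \(([^)]+)\), uid (\d+): exited on signal (\d+)")


def tally_signals(lines):
    """Count (process name, signal number) pairs found in syslog-style lines."""
    counts = Counter()
    for line in lines:
        match = EXIT_RE.search(line)
        if match:
            counts[(match.group(2), int(match.group(4)))] += 1
    return counts


# A few entries copied from the time line (hypothetical subset, not the full log).
SAMPLE = [
    "Apr 9 00:22:44 horsey /kernel: pid 130 (named), uid 53: exited on signal 6",
    "Apr 9 00:23:32 horsey /kernel: pid 30932 (perl), uid 103: exited on signal 11",
    "Apr 9 03:06:33 horsey /kernel: pid 34102 (cvsup), uid 63: exited on signal 6 (core dumped)",
    "Apr 9 04:58:21 horsey /kernel: pid 36420 (sendmail:8.13.0.), uid 0: exited on signal 12",
    "Apr 9 12:23:43 horsey /kernel: pid 50632 (perl), uid 103: exited on signal 11",
]

if __name__ == "__main__":
    for (name, sig), n in sorted(tally_signals(SAMPLE).items()):
        print(f"{name}: signal {sig} x{n}")
```

Lines that do not match the pattern, such as the sysctl and reboot entries, are simply skipped.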