From owner-freebsd-smp@FreeBSD.ORG Sun Jan 9 00:32:04 2005
From: "Stephane Raimbault"
To: smp@freebsd.org
Date: Sat, 08 Jan 2005 17:31:56 -0700
Subject: 5.3-RELEASE: SMP: system clock has died
X-BeenThere: freebsd-smp@freebsd.org
X-Mailman-Version: 2.1.1
Precedence: list
List-Id: FreeBSD SMP implementation group

I have an ASUS P2B-DS motherboard with dual P2 400MHz CPU's. I have
compiled the SMP kernel and noticed that something is not right. In "top"
the CPU values indicate 0% across the board, even idle!

last pid: 9462; load averages: 0.00, 0.00, 0.00 up 2+18:57:30 13:11:47
14 processes: 1 running, 13 sleeping
CPU states: 0.0% user, 0.0% nice, 0.0% system, 0.0% interrupt, 0.0% idle
Mem: 5232K Active, 115M Inact, 59M Wired, 60M Buf, 315M Free
Swap: 999M Total, 999M Free

Also, when I run systat and go to the vmstat page I get this error:

The alternate system clock has died!
Reverting to ``pigs'' display.

There seems to be no errors in /var/log/messages.

here is my /var/run/dmesg.boot file

sol# cat /var/run/dmesg.boot
Copyright (c) 1992-2004 The FreeBSD Project.
Copyright (c) 1979, 1980, 1983, 1986, 1988, 1989, 1991, 1992, 1993, 1994
The Regents of the University of California. All rights reserved.
FreeBSD 5.3-RELEASE #1: Mon Dec 27 17:45:44 MST 2004
root@sol.integer8.net:/usr/obj/usr/src/sys/SMP
MPTable:
Timecounter "i8254" frequency 1193182 Hz quality 0
CPU: Pentium II/Pentium II Xeon/Celeron (400.91-MHz 686-class CPU)
Origin = "GenuineIntel" Id = 0x652 Stepping = 2
Features=0x183fbff
real memory = 536858624 (511 MB)
avail memory = 515788800 (491 MB)
FreeBSD/SMP: Multiprocessor System Detected: 2 CPUs
cpu0 (BSP): APIC ID: 1
cpu1 (AP): APIC ID: 0
ioapic0: Assuming intbase of 0
ioapic0 irqs 0-23 on motherboard
ACPI disabled by blacklist. Contact your BIOS vendor.
npx0: [FAST]
npx0: on motherboard
npx0: INT 16 interface
pcib0: pcibus 0 on motherboard
pir0: on motherboard
pci0: on pcib0
agp0: mem 0xe4000000-0xe7ffffff at device 0.0 on pci0
pcib1: at device 1.0 on pci0
pci1: on pcib1
pci1: at device 0.0 (no driver attached)
isab0: at device 4.0 on pci0
isa0: on isab0
atapci0: port 0xb800-0xb80f,0x376,0x170-0x177,0x3f6,0x1f0-0x1f7 at device 4.1 on pci0
ata0: channel #0 on atapci0
ata1: channel #1 on atapci0
uhci0: port 0xb400-0xb41f irq 11 at device 4.2 on pci0
uhci0: [GIANT-LOCKED]
usb0: on uhci0
usb0: USB revision 1.0
uhub0: Intel UHCI root hub, class 9/0, rev 1.00/1.00, addr 1
uhub0: 2 ports with 2 removable, self powered
piix0: port 0xe800-0xe80f at device 4.3 on pci0
Timecounter "PIIX" frequency 3579545 Hz quality 0
ahc0: port 0xb000-0xb0ff mem 0xe1800000-0xe1800fff irq 11 at device 6.0 on pci0
ahc0: [GIANT-LOCKED]
aic7890/91: Ultra2 Wide Channel A, SCSI Id=7, 32/253 SCBs
pci0: at device 9.0 (no driver attached)
fxp0: port 0xa800-0xa83f mem 0xdf800000-0xdf8fffff,0xe0000000-0xe0000fff irq 10 at device 10.0 on pci0
miibus0: on fxp0
inphy0: on miibus0
inphy0: 10baseT, 10baseT-FDX, 100baseTX, 100baseTX-FDX, auto
fxp0: Ethernet address: 00:90:27:8c:08:17
cpu0 on motherboard
cpu1 on motherboard
orm0: at iomem 0xc8000-0xcd7ff,0xc0000-0xc7fff on isa0
pmtimer0 on isa0
atkbdc0: at port 0x64,0x60 on isa0
atkbd0: irq 1 on atkbdc0
kbd0 at atkbd0
atkbd0: [GIANT-LOCKED]
fdc0: at port 0x3f0-0x3f5 irq 6 drq 2 on isa0
fdc0: [FAST]
fd0: <1440-KB 3.5" drive> on fdc0 drive 0
ppc0: at port 0x378-0x37f irq 7 on isa0
ppc0: SMC-like chipset (ECP/EPP/PS2/NIBBLE) in COMPATIBLE mode
ppc0: FIFO with 16/16/9 bytes threshold
ppbus0: on ppc0
plip0: on ppbus0
lpt0: on ppbus0
lpt0: Interrupt-driven port
ppi0: on ppbus0
sc0: at flags 0x100 on isa0
sc0: VGA <16 virtual consoles, flags=0x100>
sio0 at port 0x3f8-0x3ff irq 4 flags 0x10 on isa0
sio0: type 16550A, console
sio1 at port 0x2f8-0x2ff irq 3 on isa0
sio1: type 16550A
vga0: at port 0x3c0-0x3df iomem 0xa0000-0xbffff on isa0
unknown: can't assign resources (port)
unknown: can't assign resources (port)
unknown: can't assign resources (port)
unknown: can't assign resources (port)
unknown: can't assign resources (port)
Timecounters tick every 10.000 msec
ata0-master: FAILURE - SETFEATURES SET TRANSFER MODE status=51 error=4
acd0: CDROM at ata0-master BIOSPIO
Waiting 15 seconds for SCSI devices to settle
da0 at ahc0 bus 0 target 0 lun 0
da0: Fixed Direct Access SCSI-2 device
da0: 10.000MB/s transfers (10.000MHz, offset 15), Tagged Queueing Enabled
da0: 8683MB (17783240 512 byte sectors: 255H 63S/T 1106C)
da1 at ahc0 bus 0 target 1 lun 0
da1: Fixed Direct Access SCSI-2 device
da1: 10.000MB/s transfers (10.000MHz, offset 15), Tagged Queueing Enabled
da1: 8683MB (17783240 512 byte sectors: 255H 63S/T 1106C)
da2 at ahc0 bus 0 target 2 lun 0
da2: Fixed Direct Access SCSI-2 device
da2: 80.000MB/s transfers (40.000MHz, offset 15, 16bit), Tagged Queueing Enabled
da2: 8683MB (17783240 512 byte sectors: 255H 63S/T 1106C)
SMP: AP CPU #1 Launched!
Mounting root from ufs:/dev/da0s1a
sol#

What is happening?
This is a test box I have here so I can do some testing as necessary.

Thank you,
Stephane.

From owner-freebsd-smp@FreeBSD.ORG Sun Jan 9 00:38:18 2005
Message-ID: <000e01c4f5e3$56296030$b3db87d4@multiplay.co.uk>
From: "Steven Hartland"
To: "Stephane Raimbault"
Date: Sun, 9 Jan 2005 00:36:58 -0000
Subject: Re: 5.3-RELEASE: SMP: system clock has died

At a guess the kernel is out of date with world.

Steve

----- Original Message -----
From: "Stephane Raimbault"

>I have an ASUS P2B-DS motherboard with dual P2 400MHz CPU's. I have
> compiled the SMP kernel and noticed that something is not right. In "top"
> the CPU values indicate 0% across the board, even idle!

From owner-freebsd-smp@FreeBSD.ORG Sun Jan 9 02:41:50 2005
Date: Sat, 8 Jan 2005 20:41:49 -0600
From: Dan Nelson
To: Stephane Raimbault
Message-ID: <20050109024149.GA84945@dan.emsphone.com>
cc: smp@freebsd.org
Subject: Re: 5.3-RELEASE: SMP: system clock has died

In the last episode (Jan 08), Stephane Raimbault said:
> I have an ASUS P2B-DS motherboard with dual P2 400MHz CPU's. I have
> compiled the SMP kernel and noticed that something is not right. In
> "top" the CPU values indicate 0% across the board, even idle!

I get this occasionally on one of my Dell servers after about a week of
uptime. Manually stepping the time using ntpdate -b (forcing the
kernel to reset the RTC in the process) fixes it for me. If your RTC
is nonfunctional from boot you may not have the same problem, though.
Also try installing a newer BIOS, since I see

> ACPI disabled by blacklist. Contact your BIOS vendor.

in your dmesg output.

-- 
Dan Nelson
dnelson@allantgroup.com

From owner-freebsd-smp@FreeBSD.ORG Sun Jan 9 03:33:21 2005
From: "UEMURA (fka. MAENAKA) Tetsuya"
To: freebsd-smp@freebsd.org
Message-Id: <20050109033319.129DD2072@towerrecords.minidns.net>
Date: Sun, 9 Jan 2005 12:33:19 +0900 (JST)
Subject: Re: 5.3-RELEASE: SMP: system clock has died

Posted on Sat, 08 Jan 2005 17:31:56 -0700
by author Stephane Raimbault

> I have an ASUS P2B-DS motherboard with dual P2 400MHz CPU's. I have
> compiled the SMP kernel and noticed that something is not right. In "top"
> the CPU values indicate 0% across the board, even idle!

I found 5 PRs regarding this symptom. On my 5.3-STABLE server, patch
attached with PR 17800 solved the problem.

http://www.freebsd.org/cgi/query-pr.cgi?pr=17800
http://www.freebsd.org/cgi/query-pr.cgi?pr=60385
http://www.freebsd.org/cgi/query-pr.cgi?pr=30310
http://www.freebsd.org/cgi/query-pr.cgi?pr=bin/30310
http://www.freebsd.org/cgi/query-pr.cgi?pr=73989

For information, Tyan S1867DLUAN Thunder 2500 dual Slot 1 motherboard
always shows correct CPU usage on FreeBSD 5.x since early 2003, its
Socket 370 alternative S2567U3AN Thunder HEsl shows incorrect on
5.3-BETA4 and recent 5.3-STABLE without patch.

-- 
UEMURA (fka. MAENAKA) Tetsuya

From owner-freebsd-smp@FreeBSD.ORG Mon Jan 10 22:18:30 2005
From: John Baldwin
To: freebsd-smp@FreeBSD.org
Date: Mon, 10 Jan 2005 14:48:30 -0500
Message-Id: <200501101448.30841.jhb@FreeBSD.org>
cc: Peter Trifonov
cc: kris@obsecurity.org
Subject: Re: Lost interrupts on SMP systems

On Saturday 08 January 2005 03:54 am, Peter Trifonov wrote:
> Hello John,
>
> > Ok, try this patch instead then, it should make the 'ignoring
> > global interrupt entry' messages go away:
> >
> > --- //depot/vendor/freebsd/src/sys/i386/i386/mptable.c 2004/09/24 18:45:28
> > +++ //depot/user/jhb/acpipci/i386/i386/mptable.c
> > +         if (mptable_nioapics == 1) {
> > +                 apic_id = 0;
> > +                 while (ioapics[apic_id] == NULL)
> > +                         apic_id++;
> > +         } else {
> > +                 printf(
> > +         "MPTable: Ignoring global interrupt entry for pin %d\n",
> > +                     intr->dst_apic_int);
> > +                 return;
> > +         }
> >   }
> >   if (intr->dst_apic_id >= NAPICID) {
> >           printf("MPTable: Ignoring interrupt entry for ioapic%d\n",
>
> After reverting your previous patch and recompiling the kernel with the new
> one, the "Ignoring global interrupt entry for pin" messages have changed to
> "Ignoring interrupt entry for ioapic255".
> It seems to me that you forgot to change
>         if (intr->dst_apic_id >= NAPICID)
> to
>         if (apic_id >= NAPICID)
>
> Changing this caused the "ignoring interrupt" messages to disappear.
> However, this does not seem to be related to the NIC timeout problem.
> Doing flood ping over both interfaces sharing IRQ 11 still causes them to
> say "xl*: watchdog timeout" and stop working.
> The only way to revive them is to bring both of them down & up.

Ok. What if you apply both patches, does that do better? (Your fix was
correct btw, so just apply the previous patch to what you have now.)

-- 
John Baldwin <>< http://www.FreeBSD.org/~jhb/
"Power Users Use the Power to Serve" = http://www.FreeBSD.org

From owner-freebsd-smp@FreeBSD.ORG Mon Jan 10 22:18:33 2005
From: John Baldwin
To: freebsd-smp@FreeBSD.org
Date: Mon, 10 Jan 2005 15:32:42 -0500
References: <20050109033319.129DD2072@towerrecords.minidns.net>
In-Reply-To: <20050109033319.129DD2072@towerrecords.minidns.net>
Message-Id: <200501101532.42149.jhb@FreeBSD.org>
cc: acpi@FreeBSD.org
cc: bde@FreeBSD.org
Subject: Re: 5.3-RELEASE: SMP: system clock has died

On Saturday 08 January 2005 10:33 pm, UEMURA (fka. MAENAKA) Tetsuya wrote:
> Posted on Sat, 08 Jan 2005 17:31:56 -0700
> by author Stephane Raimbault
>
> > I have an ASUS P2B-DS motherboard with dual P2 400MHz CPU's. I have
> > compiled the SMP kernel and noticed that something is not right. In
> > "top" the CPU values indicate 0% across the board, even idle!
>
> I found 5 PRs regarding this symptom. On my 5.3-STABLE server, patch
> attached with PR 17800 solved the problem.
>
> http://www.freebsd.org/cgi/query-pr.cgi?pr=17800
> http://www.freebsd.org/cgi/query-pr.cgi?pr=60385
> http://www.freebsd.org/cgi/query-pr.cgi?pr=30310
> http://www.freebsd.org/cgi/query-pr.cgi?pr=bin/30310
> http://www.freebsd.org/cgi/query-pr.cgi?pr=73989
>
> For information, Tyan S1867DLUAN Thunder 2500 dual Slot 1 motherboard
> always shows correct CPU usage on FreeBSD 5.x since early 2003, its
> Socket 370 alternative S2567U3AN Thunder HEsl shows incorrect on
> 5.3-BETA4 and recent 5.3-STABLE without patch.

Can you please try the patch below. It drains pending interrupts any time we
turn interrupts back on on the RTC including during resume:

Index: i386/isa/clock.c
===================================================================
RCS file: /usr/cvs/src/sys/i386/isa/clock.c,v
retrieving revision 1.213
diff -u -r1.213 clock.c
--- i386/isa/clock.c 11 Jul 2004 17:50:59 -0000 1.213
+++ i386/isa/clock.c 10 Jan 2005 19:58:51 -0000
@@ -712,6 +712,7 @@
  writertc(RTC_STATUSB, RTCSB_24HR);
  writertc(RTC_STATUSA, rtc_statusa);
  writertc(RTC_STATUSB, rtc_statusb);
+ rtcin(RTC_INTR);
 }

 /*
@@ -911,6 +912,7 @@

  /* Reenable RTC updates and interrupts. */
  writertc(RTC_STATUSB, rtc_statusb);
+ rtcin(RTC_INTR);
 }

@@ -957,6 +959,7 @@
      INTR_TYPE_CLK | INTR_FAST, NULL);

  writertc(RTC_STATUSB, rtc_statusb);
+ rtcin(RTC_INTR);
 }

 init_TSC_tc();

-- 
John Baldwin <>< http://www.FreeBSD.org/~jhb/
"Power Users Use the Power to Serve" = http://www.FreeBSD.org

From owner-freebsd-smp@FreeBSD.ORG Tue Jan 11 07:19:11 2005
From: "Peter Trifonov"
To: freebsd-smp@freebsd.org
Date: Tue, 11 Jan 2005 10:23:10 +0300
References: <200501101448.30841.jhb@FreeBSD.org>
In-Reply-To: <200501101448.30841.jhb@FreeBSD.org>
Message-Id: <200501111023.10921.pvtrifonov@mail.ru>
cc: John Baldwin
cc: kris@obsecurity.org
Subject: Re: Lost interrupts on SMP systems

Hello John,

On Monday 10 January 2005 22:48, John Baldwin wrote:
> Ok. What if you apply both patches, does that do better? (Your fix was
> correct btw, so just apply the previous patch to what you have now.)

Things became even worse with both patches.
Now doing normal (not flood) ping over EITHER xl1 OR xl2 (they share IRQ11)
causes the corresponding interface to say "watchdog timeout".
Essentially, xl1 and xl2 do not work at all now.

mptable still reports the same stuff as it did on the original (unpatched)
kernel
(http://lists.freebsd.org/pipermail/freebsd-smp/2005-January/000700.html)

-- 
With best regards,
P. Trifonov

From owner-freebsd-smp@FreeBSD.ORG Tue Jan 11 09:48:43 2005
Message-ID: <41E3A0FA.80006@esiee.fr>
Date: Tue, 11 Jan 2005 10:48:42 +0100
From: Frank Bonnet
To: freebsd-smp@freebsd.org
Subject: Compaq CPU ?

Hello

I'm new to managing SMP with FreeBSD ...

I have a Compaq Proliant DL380, is there a software way to test
whether there are one or two CPUs before opening the box ?

when I perform the "top" command it seems to have two, but ...

thanks

From owner-freebsd-smp@FreeBSD.ORG Tue Jan 11 10:50:09 2005
Message-ID: <41E3AF7C.1070601@bazmag.hu>
Date: Tue, 11 Jan 2005 11:50:36 +0100
From: Tamas MEZEI
To: Frank Bonnet
References: <41E3A0FA.80006@esiee.fr>
In-Reply-To: <41E3A0FA.80006@esiee.fr>
cc: freebsd-smp@freebsd.org
Subject: Re: Compaq CPU ?

> when I perform the "top" command it seems to have two, but ...

One way is to check the most recent dmesg.boot:

cat /var/run/dmesg.boot | grep cpu

but I guess if you have N CPUs and HT is turned on you'll see 2N CPUs as a
result, so if you don't want to check sysctl values just read the first few
lines of dmesg.boot. It should tell you whether hyperthreading is used or
not. If so, you'll see something like "Hyperthreading: %d logical CPUs" in
the CPU info section, and when you grep dmesg.boot for 'cpu', you'll get
[cpu0..cpu(N-1)], which is twice as many as the number of CPUs you have.
If you don't use HT (there's no HTT field in the Features list) and you grep
for 'cpu', you'll get the real number of CPUs you have.

Or, you could grab this whole stuff from sysctl, but maybe fussing with
finding the value which tells you the truth is way slower than just reading
some five lines of text.
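
The dmesg.boot check described above could be scripted roughly as follows. This is only a sketch: it assumes the boot log contains per-CPU lines of the form "cpu0 (BSP): APIC ID: 1" and "cpu1 (AP): APIC ID: 0", as in the dmesg output posted earlier in this thread, and count_cpus is a made-up helper name, not a system tool.

```shell
#!/bin/sh
# count_cpus FILE
# Hypothetical helper: count the per-CPU lines an SMP kernel prints at
# boot, assumed to look like "cpu0 (BSP): APIC ID: 1" / "cpu1 (AP): ...".
# Lines such as "cpu0 on motherboard" are deliberately not counted.
count_cpus() {
    grep -c -E '^ *cpu[0-9]+ \((BSP|AP)\): APIC ID' "$1"
}

# Default to the FreeBSD boot log; another file can be passed as $1.
log="${1:-/var/run/dmesg.boot}"
if [ -r "$log" ]; then
    echo "CPUs started by the kernel: $(count_cpus "$log")"
fi
```

With Hyperthreading enabled these lines presumably count logical CPUs, so the "Hyperthreading" line near the top of dmesg.boot still has to be read by hand. For comparison, "sysctl hw.ncpu" prints the number of CPUs the running kernel is using.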

Good luck,
Tamas

From owner-freebsd-smp@FreeBSD.ORG Tue Jan 11 13:46:49 2005
Message-ID: <41E3D8AE.2050903@uni-mainz.de>
Date: Tue, 11 Jan 2005 14:46:22 +0100
From: "O. Hartmann"
Organization: Institut für Geophysik
To: Dan Nelson
References: <20050109024149.GA84945@dan.emsphone.com>
In-Reply-To: <20050109024149.GA84945@dan.emsphone.com>
cc: Stephane Raimbault
cc: smp@freebsd.org
Subject: Re: 5.3-RELEASE: SMP: system clock has died

Dan Nelson wrote:

>In the last episode (Jan 08), Stephane Raimbault said:
>
>>I have an ASUS P2B-DS motherboard with dual P2 400MHz CPU's. I have
>>compiled the SMP kernel and noticed that something is not right. In
>>"top" the CPU values indicate 0% across the board, even idle!
>
>I get this occasionally on one of my Dell servers after about a week of
>uptime. Manually stepping the time using ntpdate -b (forcing the
>kernel to reset the RTC in the process) fixes it for me. If your RTC
>is nonfunctional from boot you may not have the same problem, though.
>Also try installing a newer BIOS, since I see
>
>>ACPI disabled by blacklist. Contact your BIOS vendor.
>
>in your dmesg output.

Hello.
I also get a similar message on an ASUS CUR-DLS based PIII/1000 MHz SMP
system, sometimes in UP, but very often in SMP mode.
When calling 'systat -vmstat 1' on the console or within a terminal window I
get a weird message: the alternate system clock has died.
RTC seems to be all right. Using a UP kernel hides this problem in most
cases. I use FreeBSD 5.3-STABLE. ACPI is enabled, but the problem still
remains when ACPI has been disabled. SMP is about to crash this system, UP
is ok.

From owner-freebsd-smp@FreeBSD.ORG Tue Jan 11 13:51:44 2005
Message-ID: <41E3D9F0.2010205@uni-mainz.de>
Date: Tue, 11 Jan 2005 14:51:44 +0100
From: "O. Hartmann"
Organization: Institut für Geophysik
To: Stephane Raimbault
cc: smp@freebsd.org
Subject: Re: 5.3-RELEASE: SMP: system clock has died

Stephane Raimbault wrote:

> I have an ASUS P2B-DS motherboard with dual P2 400MHz CPU's. I have
> compiled the SMP kernel and noticed that something is not right. In
> "top" the CPU values indicate 0% across the board, even idle!
>
> last pid: 9462; load averages: 0.00, 0.00, 0.00 up 2+18:57:30 13:11:47
> 14 processes: 1 running, 13 sleeping
> CPU states: 0.0% user, 0.0% nice, 0.0% system, 0.0% interrupt, 0.0% idle
> Mem: 5232K Active, 115M Inact, 59M Wired, 60M Buf, 315M Free
> Swap: 999M Total, 999M Free
>
> Also, when I run systat and go to the vmstat page I get this error:
>
> The alternate system clock has died!
> Reverting to ``pigs'' display.
>
> There seems to be no errors in /var/log/messages.
>
> here is my /var/run/dmesg.boot file
>
> sol# cat /var/run/dmesg.boot
> Copyright (c) 1992-2004 The FreeBSD Project.
> Copyright (c) 1979, 1980, 1983, 1986, 1988, 1989, 1991, 1992, 1993, 1994
> The Regents of the University of California. All rights reserved.
> FreeBSD 5.3-RELEASE #1: Mon Dec 27 17:45:44 MST 2004
> root@sol.integer8.net:/usr/obj/usr/src/sys/SMP
> MPTable:
> Timecounter "i8254" frequency 1193182 Hz quality 0
> CPU: Pentium II/Pentium II Xeon/Celeron (400.91-MHz 686-class CPU)
> Origin = "GenuineIntel" Id = 0x652 Stepping = 2
> Features=0x183fbff
> real memory = 536858624 (511 MB)
> avail memory = 515788800 (491 MB)
> FreeBSD/SMP: Multiprocessor System Detected: 2 CPUs
> cpu0 (BSP): APIC ID: 1
> cpu1 (AP): APIC ID: 0
> ioapic0: Assuming intbase of 0
> ioapic0 irqs 0-23 on motherboard
> ACPI disabled by blacklist. Contact your BIOS vendor.
> npx0: [FAST]
> npx0: on motherboard
> npx0: INT 16 interface
> pcib0: pcibus 0 on motherboard
> pir0: on motherboard
> pci0: on pcib0
> agp0: mem 0xe4000000-0xe7ffffff at device 0.0 on pci0
> pcib1: at device 1.0 on pci0
> pci1: on pcib1
> pci1: at device 0.0 (no driver attached)
> isab0: at device 4.0 on pci0
> isa0: on isab0
> atapci0: port 0xb800-0xb80f,0x376,0x170-0x177,0x3f6,0x1f0-0x1f7 at device 4.1 on pci0
> ata0: channel #0 on atapci0
> ata1: channel #1 on atapci0
> uhci0: port 0xb400-0xb41f irq 11 at device 4.2 on pci0
> uhci0: [GIANT-LOCKED]
> usb0: on uhci0
> usb0: USB revision 1.0
> uhub0: Intel UHCI root hub, class 9/0, rev 1.00/1.00, addr 1
> uhub0: 2 ports with 2 removable, self powered
> piix0: port 0xe800-0xe80f at device 4.3 on pci0
> Timecounter "PIIX" frequency 3579545 Hz quality 0
> ahc0: port 0xb000-0xb0ff mem 0xe1800000-0xe1800fff irq 11 at device 6.0 on pci0
> ahc0: [GIANT-LOCKED]
> aic7890/91: Ultra2 Wide Channel A, SCSI Id=7, 32/253 SCBs
> pci0: at device 9.0 (no driver attached)
> fxp0: port 0xa800-0xa83f mem 0xdf800000-0xdf8fffff,0xe0000000-0xe0000fff irq 10 at device 10.0 on pci0
> miibus0: on fxp0
> inphy0: on miibus0
> inphy0: 10baseT, 10baseT-FDX, 100baseTX, 100baseTX-FDX, auto
> fxp0: Ethernet address: 00:90:27:8c:08:17
> cpu0 on motherboard
> cpu1 on motherboard
> orm0: at iomem 0xc8000-0xcd7ff,0xc0000-0xc7fff on isa0
>
pmtimer0 on isa0 > atkbdc0: at port 0x64,0x60 on isa0 > atkbd0: irq 1 on atkbdc0 > kbd0 at atkbd0 > atkbd0: [GIANT-LOCKED] > fdc0: at port 0x3f0-0x3f5 irq 6 drq 2 on > isa0 > fdc0: [FAST] > fd0: <1440-KB 3.5" drive> on fdc0 drive 0 > ppc0: at port 0x378-0x37f irq 7 on isa0 > ppc0: SMC-like chipset (ECP/EPP/PS2/NIBBLE) in COMPATIBLE mode > ppc0: FIFO with 16/16/9 bytes threshold > ppbus0: on ppc0 > plip0: on ppbus0 > lpt0: on ppbus0 > lpt0: Interrupt-driven port > ppi0: on ppbus0 > sc0: at flags 0x100 on isa0 > sc0: VGA <16 virtual consoles, flags=0x100> > sio0 at port 0x3f8-0x3ff irq 4 flags 0x10 on isa0 > sio0: type 16550A, console > sio1 at port 0x2f8-0x2ff irq 3 on isa0 > sio1: type 16550A > vga0: at port 0x3c0-0x3df iomem 0xa0000-0xbffff on isa0 > unknown: can't assign resources (port) > unknown: can't assign resources (port) > unknown: can't assign resources (port) > unknown: can't assign resources (port) > unknown: can't assign resources (port) > Timecounters tick every 10.000 msec > ata0-master: FAILURE - SETFEATURES SET TRANSFER MODE > status=51 error=4 > acd0: CDROM at ata0-master BIOSPIO > Waiting 15 seconds for SCSI devices to settle > da0 at ahc0 bus 0 target 0 lun 0 > da0: Fixed Direct Access SCSI-2 device > da0: 10.000MB/s transfers (10.000MHz, offset 15), Tagged Queueing Enabled > da0: 8683MB (17783240 512 byte sectors: 255H 63S/T 1106C) > da1 at ahc0 bus 0 target 1 lun 0 > da1: Fixed Direct Access SCSI-2 device > da1: 10.000MB/s transfers (10.000MHz, offset 15), Tagged Queueing Enabled > da1: 8683MB (17783240 512 byte sectors: 255H 63S/T 1106C) > da2 at ahc0 bus 0 target 2 lun 0 > da2: Fixed Direct Access SCSI-2 device > da2: 80.000MB/s transfers (40.000MHz, offset 15, 16bit), Tagged > Queueing Enabled > da2: 8683MB (17783240 512 byte sectors: 255H 63S/T 1106C) > SMP: AP CPU #1 Launched! > Mounting root from ufs:/dev/da0s1a > sol# > > > > What is happening? This is a test box I have here so I can do some > testing as necessary. 
> > Thank you, > Stephane. > > _______________________________________________ > freebsd-smp@freebsd.org mailing list > http://lists.freebsd.org/mailman/listinfo/freebsd-smp > To unsubscribe, send any mail to "freebsd-smp-unsubscribe@freebsd.org" Sorry, my last response went to the wrong list. I also see this weird message on an ASUS CUR-DLS based SMP system with two PIII/1GHz CPUs. Very often when SMP is active (with or without ACPI), typing 'systat -vmstat 1' gives me the above-mentioned error message. In very rare cases this also occurs under UP, but under SMP it occurs in nearly 50% of all cases, with or without ACPI enabled. In the past I reported strange behaviour of FreeBSD 5.3 (I run FreeBSD 5.3-STABLE on this box): SMP is not working. I was told this is possibly due to a hardware error. 
From owner-freebsd-smp@FreeBSD.ORG Tue Jan 11 14:01:10 2005 Return-Path: Delivered-To: freebsd-smp@freebsd.org Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125]) by hub.freebsd.org (Postfix) with ESMTP id 218F916A4CE for ; Tue, 11 Jan 2005 14:01:10 +0000 (GMT) Received: from mailgate1.zdv.Uni-Mainz.DE (mailgate1.zdv.Uni-Mainz.DE [134.93.178.129]) by mx1.FreeBSD.org (Postfix) with ESMTP id 9EF8F43D1D for ; Tue, 11 Jan 2005 14:01:09 +0000 (GMT) (envelope-from ohartman@uni-mainz.de) Received: from [134.93.180.218] (edda.Physik.Uni-Mainz.DE [134.93.180.218]) (using TLSv1 with cipher DHE-RSA-AES256-SHA (256/256 bits)) (No client certificate requested) by mailgate1.zdv.Uni-Mainz.DE (Postfix) with ESMTP id 124CA3000BE9; Tue, 11 Jan 2005 15:00:46 +0100 (CET) Message-ID: <41E3DC10.4070408@uni-mainz.de> Date: Tue, 11 Jan 2005 15:00:48 +0100 From: "O. Hartmann" Organization: Institut =?ISO-8859-1?Q?f=FCr_Geophysik?= User-Agent: Mozilla/5.0 (X11; U; FreeBSD i386; de-AT; rv:1.7.5) Gecko/20050102 X-Accept-Language: de-de, en MIME-Version: 1.0 To: "UEMURA (fka. MAENAKA) Tetsuya" References: <20050109033319.129DD2072@towerrecords.minidns.net> In-Reply-To: <20050109033319.129DD2072@towerrecords.minidns.net> Content-Type: text/plain; charset=ISO-8859-1; format=flowed Content-Transfer-Encoding: 7bit X-Virus-Scanned: by amavisd-new at uni-mainz.de cc: freebsd-smp@freebsd.org Subject: Re: 5.3-RELEASE: SMP: system clock has died X-BeenThere: freebsd-smp@freebsd.org X-Mailman-Version: 2.1.1 Precedence: list List-Id: FreeBSD SMP implementation group List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Tue, 11 Jan 2005 14:01:10 -0000 UEMURA (fka. MAENAKA) Tetsuya schrieb: >Posted on Sat, 08 Jan 2005 17:31:56 -0700 >by author Stephane Raimbault > > >>I have an ASUS P2B-DS motherboard with dual P2 400MHz CPU's. I have >>compiled the SMP kernel and noticed that something is not right. 
In "top" >>the CPU values indicate 0% across the board, even idle! >> >> >I found 5 PRs regarding this symptom. On my 5.3-STABLE server, patch >attached with PR 17800 solved the problem. > >http://www.freebsd.org/cgi/query-pr.cgi?pr=17800 >http://www.freebsd.org/cgi/query-pr.cgi?pr=60385 >http://www.freebsd.org/cgi/query-pr.cgi?pr=30310 >http://www.freebsd.org/cgi/query-pr.cgi?pr=bin/30310 >http://www.freebsd.org/cgi/query-pr.cgi?pr=73989 > >For information, Tyan S1867DLUAN Thunder 2500 dual Slot 1 motherboard >always shows correct CPU usage on FreeBSD 5.x since early 2003, its >Socket 370 alternative S2567U3AN Thuder HEsl shows incorrect on >5.3-BETA4 and recent 5.3-STABLE without patch. > > > I also see that 'systat -vmstat 1' sometimes shows 'alternate system clock has died' on an ASUS CUR-DLS SMP system (dual PIII/1000). This mainboard uses the LE 3.0 RCC chipset (Socket 370 FCPGA). SMP is risky on all flavours of FreeBSD 5.3 from BETA4 on! I was told I have faulty hardware, but I suspect an IRQ routing problem, since the machine does weird things when I swap NICs or add-on cards between PCI slots or disable both serial ports (with both serial ports disabled in the BIOS, every flavour of kernel, UP or SMP, gets stuck after SCSI init and reset (U160 built-in SCSI)). 
From owner-freebsd-smp@FreeBSD.ORG Tue Jan 11 22:10:49 2005 Return-Path: Delivered-To: freebsd-smp@freebsd.org Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125]) by hub.freebsd.org (Postfix) with ESMTP id 1405916A57B for ; Tue, 11 Jan 2005 22:10:49 +0000 (GMT) Received: from mail1.speakeasy.net (mail1.speakeasy.net [216.254.0.201]) by mx1.FreeBSD.org (Postfix) with ESMTP id C2D2143D2D for ; Tue, 11 Jan 2005 22:10:48 +0000 (GMT) (envelope-from jhb@FreeBSD.org) Received: (qmail 6792 invoked from network); 11 Jan 2005 22:10:48 -0000 Received: from dsl027-160-063.atl1.dsl.speakeasy.net (HELO server.baldwin.cx) ([216.27.160.63]) (envelope-sender ) encrypted SMTP for ; 11 Jan 2005 22:10:48 -0000 Received: from [10.50.41.243] (gw1.twc.weather.com [216.133.140.1]) (authenticated bits=0) by server.baldwin.cx (8.12.11/8.12.11) with ESMTP id j0BMAfBj020663; Tue, 11 Jan 2005 17:10:44 -0500 (EST) (envelope-from jhb@FreeBSD.org) From: John Baldwin To: freebsd-smp@FreeBSD.org Date: Tue, 11 Jan 2005 16:21:48 -0500 User-Agent: KMail/1.6.2 References: <200501101448.30841.jhb@FreeBSD.org> <200501111023.10921.pvtrifonov@mail.ru> In-Reply-To: <200501111023.10921.pvtrifonov@mail.ru> MIME-Version: 1.0 Content-Disposition: inline Content-Type: text/plain; charset="koi8-r" Content-Transfer-Encoding: 7bit Message-Id: <200501111621.48893.jhb@FreeBSD.org> X-Spam-Checker-Version: SpamAssassin 2.63 (2004-01-11) on server.baldwin.cx cc: Peter Trifonov cc: kris@obsecurity.org Subject: Re: Lost interrupts on SMP systems X-BeenThere: freebsd-smp@freebsd.org X-Mailman-Version: 2.1.1 Precedence: list List-Id: FreeBSD SMP implementation group List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Tue, 11 Jan 2005 22:10:49 -0000 On Tuesday 11 January 2005 02:23 am, Peter Trifonov wrote: > Hello John, > > On Monday 10 January 2005 22:48, John Baldwin wrote: > > Ok. What if you apply both patches, does that do better? 
(Your fix was > > correct btw, so just apply the previous patch to what you have now.) > > Things became even worse with both patches. Now doing normal (not flood) > ping over EITHER xl1 OR xl2 (they share IRQ11) causes the corresponding > interface to say "watchdog timeout". Essentially, xl1 and xl2 do not work > at all now. > > mptable still reports the same stuff as it was on the original (unpatched) > kernel > (http://lists.freebsd.org/pipermail/freebsd-smp/2005-January/000700.html) Ok, can you get me the dmesg from a boot -v with both patches still? -- John Baldwin <>< http://www.FreeBSD.org/~jhb/ "Power Users Use the Power to Serve" = http://www.FreeBSD.org From owner-freebsd-smp@FreeBSD.ORG Tue Jan 11 23:10:50 2005 Return-Path: Delivered-To: freebsd-smp@freebsd.org Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125]) by hub.freebsd.org (Postfix) with ESMTP id A876416A4CE for ; Tue, 11 Jan 2005 23:10:50 +0000 (GMT) Received: from mail.foolishgames.com (mail.foolishgames.com [216.55.178.45]) by mx1.FreeBSD.org (Postfix) with ESMTP id 6CFFB43D39 for ; Tue, 11 Jan 2005 23:10:50 +0000 (GMT) (envelope-from luke@foolishgames.com) Received: from [192.168.0.49] (24.247.120.6.kzo.mi.chartermi.net [24.247.120.6]) (authenticated bits=0)j0C0EXF0093764 (version=TLSv1/SSLv3 cipher=RC4-SHA bits=128 verify=NO); Tue, 11 Jan 2005 16:14:34 -0800 (PST) (envelope-from luke@foolishgames.com) X-Authentication-Warning: mail.foolishgames.com: Host 24.247.120.6.kzo.mi.chartermi.net [24.247.120.6] claimed to be [192.168.0.49] Message-Id: X-Habeas-Swe-6: email in exchange for a license for this Habeas X-Habeas-Swe-3: like Habeas SWE (tm) Date: Tue, 11 Jan 2005 18:10:25 -0500 X-Habeas-Swe-8: Message (HCM) and not spam. Please report use of this From: Lucas Holt X-Habeas-Swe-5: Sender Warranted Email (SWE) (tm). 
The sender of this X-Habeas-Swe-2: brightly anticipated In-Reply-To: <41E3DC10.4070408@uni-mainz.de> References: <20050109033319.129DD2072@towerrecords.minidns.net> <41E3DC10.4070408@uni-mainz.de> To: "O. Hartmann" X-Habeas-Swe-7: warrant mark warrants that this is a Habeas Compliant Mime-Version: 1.0 (Apple Message framework v619) X-Habeas-Swe-4: Copyright 2002 Habeas (tm) Content-Type: text/plain; charset=US-ASCII; format=flowed X-Habeas-Swe-1: winter into spring Content-Transfer-Encoding: 7bit X-Habeas-Swe-9: mark in spam to . X-Mailer: Apple Mail (2.619) cc: freebsd-smp@freebsd.org Subject: Re: 5.3-RELEASE: SMP: system clock has died X-BeenThere: freebsd-smp@freebsd.org X-Mailman-Version: 2.1.1 Precedence: list List-Id: FreeBSD SMP implementation group List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Tue, 11 Jan 2005 23:10:50 -0000 Part of the problem is that the motherboards are made by ASUS. I love ASUS boards for Windows boxes, but they never implement a correct, complete BIOS for their boards. The result is odd behavior, and hacks are often needed to get them to work. For example, I have an ASUS board with an nForce2 chipset and an Athlon XP 2000+. During a patch-level update to 5.2.1 I lost the ability to reboot the box remotely; the system would hang. Eventually, an upgrade to 5.3-RELEASE fixed the problem. However, my dual Xeon Dell workstation is top notch on 5.3. I don't have a great deal of faith in Dell BIOSes, but they can do better than ASUS! 
:( Lucas Holt Luke@FoolishGames.com ________________________________________________________ FoolishGames.com (Jewel Fan Site) JustJournal.com (Free blogging) FoolishGames.net (Enemy Territory IoM site) From owner-freebsd-smp@FreeBSD.ORG Wed Jan 12 07:08:52 2005 Return-Path: Delivered-To: freebsd-smp@freebsd.org Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125]) by hub.freebsd.org (Postfix) with ESMTP id 36B0516A4CE; Wed, 12 Jan 2005 07:08:52 +0000 (GMT) Received: from mx2.mail.ru (mx2.mail.ru [194.67.23.122]) by mx1.FreeBSD.org (Postfix) with ESMTP id CB76D43D2D; Wed, 12 Jan 2005 07:08:51 +0000 (GMT) (envelope-from pvtrifonov@mail.ru) Received: from [195.209.229.106] (port=1174 helo=tank) by mx2.mail.ru with esmtp id 1Coccc-0003Jl-00; Wed, 12 Jan 2005 10:08:50 +0300 From: "Peter Trifonov" To: "'John Baldwin'" , Date: Wed, 12 Jan 2005 10:12:51 +0300 MIME-Version: 1.0 Content-Type: text/plain; charset="koi8-r" Content-Transfer-Encoding: 7bit X-Mailer: Microsoft Office Outlook, Build 11.0.5510 In-Reply-To: <200501111621.48893.jhb@FreeBSD.org> Thread-Index: AcT4KonzbCKCyEhVT/2Nznn6Nrli7gAS3qng X-MimeOLE: Produced By Microsoft MimeOLE V6.00.2900.2180 Message-Id: X-Spam: Not detected cc: kris@obsecurity.org Subject: RE: Lost interrupts on SMP systems X-BeenThere: freebsd-smp@freebsd.org X-Mailman-Version: 2.1.1 Precedence: list List-Id: FreeBSD SMP implementation group List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 12 Jan 2005 07:08:52 -0000 Hello John, > > Things became even worse with both patches. Now doing normal (not > > flood) ping over EITHER xl1 OR xl2 (they share IRQ11) causes the > > corresponding interface to say "watchdog timeout". Essentially, xl1 > > and xl2 do not work at all now. 
> > > > mptable still reports the same stuff as it was on the original > > (unpatched) kernel > > > (http://lists.freebsd.org/pipermail/freebsd-smp/2005-January/000700.html) > > Ok, can you get me the dmesg from a boot -v with both patches still? It can be found here: http://dcn.infos.ru/~bugman/bootlog.txt I have also put the output of mptable there. At first glance, there are many strange things (e.g. a lot of failures at various places) in this log file, but I don't know which are relevant to the problem :-). Also note two messages at the end of the file: Jan 12 09:42:04 firewall kernel: arp: 10.0.103.6 is on xl0 but got reply from 00:0e:0c:4d:47:d2 on xl1 Jan 12 09:42:04 firewall kernel: arp: 10.0.103.6 is on xl0 but got reply from 00:0e:0c:4d:47:d2 on xl2 This means that xl1 and xl2 can receive some data. However, with both patches installed they cannot send any data. If it would help, I can give you root access to this box. With best regards, P. Trifonov From owner-freebsd-smp@FreeBSD.ORG Wed Jan 12 15:21:08 2005 Return-Path: Delivered-To: freebsd-smp@freebsd.org Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125]) by hub.freebsd.org (Postfix) with ESMTP id 9B8F816A4CE; Wed, 12 Jan 2005 15:21:08 +0000 (GMT) Received: from smtp5.dti.ne.jp (smtp5.dti.ne.jp [202.216.228.40]) by mx1.FreeBSD.org (Postfix) with ESMTP id E311043D45; Wed, 12 Jan 2005 15:21:07 +0000 (GMT) (envelope-from maenaka@pluto.dti.ne.jp) Received: from towerrecords.minidns.net (PPPbb116.gifu-ip.dti.ne.jp [218.225.250.116]) by smtp5.dti.ne.jp (3.10s) with ESMTP id j0CFL7E5004216;Thu, 13 Jan 2005 00:21:07 +0900 (JST) Received: from [127.0.0.1] (destroy [192.168.0.1]) by towerrecords.minidns.net (Postfix) with ESMTP id E18FF2079; Thu, 13 Jan 2005 00:21:06 +0900 (JST) From: "UEMURA (fka. 
MAENAKA) Tetsuya" To: freebsd-smp@FreeBSD.org, acpi@FreeBSD.org, bde@FreeBSD.org In-Reply-To: <200501101532.42149.jhb@FreeBSD.org> References: <20050109033319.129DD2072@towerrecords.minidns.net> <200501101532.42149.jhb@FreeBSD.org> MIME-Version: 1.0 Content-Type: text/plain; charset="US-ASCII" Content-Transfer-Encoding: 7bit X-Mailer: Becky! ver. 2.12.01 [ja] Message-Id: <20050112152106.E18FF2079@towerrecords.minidns.net> Date: Thu, 13 Jan 2005 00:21:06 +0900 (JST) Subject: Re: Re: 5.3-RELEASE: SMP: system clock has died X-BeenThere: freebsd-smp@freebsd.org X-Mailman-Version: 2.1.1 Precedence: list List-Id: FreeBSD SMP implementation group List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 12 Jan 2005 15:21:08 -0000 I applied the attached patch against 5.3-STABLE as of 6 Jan., rebuilt the kernel and restarted; top now shows correct CPU usage, and so does vmstat. Because the machine is a server, suspend/resume has not been tested. The machine: two Pentium IIIs on a Tyan S2567U3AN Thunder HEsl with ACPI turned off, one amr and one em, both on the 64-bit 66MHz PCI bus, and one ATI RAGE XL on AGP. -- UEMURA (fka. MAENAKA) Tetsuya Posted on Mon, 10 Jan 2005 15:32:42 -0500 by author John Baldwin > On Saturday 08 January 2005 10:33 pm, UEMURA (fka. MAENAKA) Tetsuya wrote: > > Posted on Sat, 08 Jan 2005 17:31:56 -0700 > > by author Stephane Raimbault > > > > > I have an ASUS P2B-DS motherboard with dual P2 400MHz CPU's. I have > > > compiled the SMP kernel and noticed that something is not right. In > > > "top" the CPU values indicate 0% across the board, even idle! > > > > I found 5 PRs regarding this symptom. On my 5.3-STABLE server, patch > > attached with PR 17800 solved the problem. 
> > > > http://www.freebsd.org/cgi/query-pr.cgi?pr=17800 > > http://www.freebsd.org/cgi/query-pr.cgi?pr=60385 > > http://www.freebsd.org/cgi/query-pr.cgi?pr=30310 > > http://www.freebsd.org/cgi/query-pr.cgi?pr=bin/30310 > > http://www.freebsd.org/cgi/query-pr.cgi?pr=73989 > > > > For information, Tyan S1867DLUAN Thunder 2500 dual Slot 1 motherboard > > always shows correct CPU usage on FreeBSD 5.x since early 2003, its > > Socket 370 alternative S2567U3AN Thuder HEsl shows incorrect on > > 5.3-BETA4 and recent 5.3-STABLE without patch. > > Can you please try the patch below. It drains pending interrupts any time we > turn interrupts back on on the RTC including during resume: > > Index: i386/isa/clock.c > =================================================================== > RCS file: /usr/cvs/src/sys/i386/isa/clock.c,v > retrieving revision 1.213 > diff -u -r1.213 clock.c > --- i386/isa/clock.c 11 Jul 2004 17:50:59 -0000 1.213 > +++ i386/isa/clock.c 10 Jan 2005 19:58:51 -0000 > @@ -712,6 +712,7 @@ > writertc(RTC_STATUSB, RTCSB_24HR); > writertc(RTC_STATUSA, rtc_statusa); > writertc(RTC_STATUSB, rtc_statusb); > + rtcin(RTC_INTR); > } > > /* > @@ -911,6 +912,7 @@ > > /* Reenable RTC updates and interrupts. 
*/ > writertc(RTC_STATUSB, rtc_statusb); > + rtcin(RTC_INTR); > } > > > @@ -957,6 +959,7 @@ > INTR_TYPE_CLK | INTR_FAST, NULL); > > writertc(RTC_STATUSB, rtc_statusb); > + rtcin(RTC_INTR); > } > > init_TSC_tc(); > > -- > John Baldwin <>< http://www.FreeBSD.org/~jhb/ > "Power Users Use the Power to Serve" = http://www.FreeBSD.org From owner-freebsd-smp@FreeBSD.ORG Wed Jan 12 18:19:39 2005 Return-Path: Delivered-To: freebsd-smp@freebsd.org Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125]) by hub.freebsd.org (Postfix) with ESMTP id 432BB16A4CE; Wed, 12 Jan 2005 18:19:39 +0000 (GMT) Received: from mx3.mail.ru (mx3.mail.ru [194.67.23.23]) by mx1.FreeBSD.org (Postfix) with ESMTP id D890B43D5C; Wed, 12 Jan 2005 18:19:38 +0000 (GMT) (envelope-from pvtrifonov@mail.ru) Received: from [195.209.229.106] (port=28540 helo=tank) by mx3.mail.ru with esmtp id 1Con5l-0004dX-00; Wed, 12 Jan 2005 21:19:37 +0300 Received-SPF: softfail (mx3.mail.ru: transitioning domain of mail.ru does not designate 195.209.229.106 as permitted sender) client-ip=195.209.229.106; envelope-from=pvtrifonov@mail.ru; helo=tank; From: "Peter Trifonov" To: "'John Baldwin'" , Date: Wed, 12 Jan 2005 21:23:35 +0300 MIME-Version: 1.0 Content-Type: text/plain; charset="koi8-r" Content-Transfer-Encoding: 7bit X-Mailer: Microsoft Office Outlook, Build 11.0.5510 In-Reply-To: <200501111621.48893.jhb@FreeBSD.org> X-MimeOLE: Produced By Microsoft MimeOLE V6.00.2900.2180 Thread-Index: AcT4KonzbCKCyEhVT/2Nznn6Nrli7gApInCg Message-Id: X-Spam: Not detected cc: kris@obsecurity.org Subject: RE: Lost interrupts on SMP systems X-BeenThere: freebsd-smp@freebsd.org X-Mailman-Version: 2.1.1 Precedence: list List-Id: FreeBSD SMP implementation group List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 12 Jan 2005 18:19:39 -0000 Hello John, > Ok, can you get me the dmesg from a boot -v with both patches still? I have just noticed a few things: 1. 
The log file at http://dcn.infos.ru/~bugman/bootlog.txt appears to be truncated - the system ran out of dmesg buffer space before it was able to write it to /var/run/dmesg.boot However, I have noticed the following messages not listed there: SMAP type 01.... 02.... 01.... 02.... 01.... 02.... 02.... 02.... CLK_USEI854 _CALIBRATION not defined ioapic0: routing external i8259As ->intpin0 intpinI ->isa irq I (edge, high), where I=0..15 2. If the kernel is booted with -p and -v options, xl1 and xl2 work even with both patches installed, but break down as soon as ping -f is performed. If the system boots in normal way, xl1 and xl2 do not work at all. With best regards, P. Trifonov From owner-freebsd-smp@FreeBSD.ORG Wed Jan 12 19:31:22 2005 Return-Path: Delivered-To: freebsd-smp@freebsd.org Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125]) by hub.freebsd.org (Postfix) with ESMTP id 142A916A4CE for ; Wed, 12 Jan 2005 19:31:22 +0000 (GMT) Received: from wproxy.gmail.com (wproxy.gmail.com [64.233.184.202]) by mx1.FreeBSD.org (Postfix) with ESMTP id 7D3BE43D2F for ; Wed, 12 Jan 2005 19:31:21 +0000 (GMT) (envelope-from crash4o4@gmail.com) Received: by wproxy.gmail.com with SMTP id 58so513474wri for ; Wed, 12 Jan 2005 11:31:20 -0800 (PST) DomainKey-Signature: a=rsa-sha1; q=dns; c=nofws; s=beta; d=gmail.com; h=received:message-id:date:from:reply-to:to:subject:mime-version:content-type:content-transfer-encoding; b=YTj3wSchIbIt47aP0qw2iZ/Y7icAi2UI5Wh9micp6fy26GVgtPj3MkAXSGNUx8OjKEelCeABW9oEp2KF7YdCo8g4VWXFoqqL8pef+gZDzCqwF2veT61nDf4hRXldf+4Sjvy5T1LM0prohz/uAnXcPf45XLfCNyLRdX8+uQmfNdg= Received: by 10.54.3.2 with SMTP id 2mr245771wrc; Wed, 12 Jan 2005 11:31:20 -0800 (PST) Received: by 10.54.47.27 with HTTP; Wed, 12 Jan 2005 11:31:20 -0800 (PST) Message-ID: <484e07430501121131d0bb896@mail.gmail.com> Date: Wed, 12 Jan 2005 14:31:20 -0500 From: Frank Mancuso To: freebsd-smp@freebsd.org Mime-Version: 1.0 Content-Type: text/plain; charset=US-ASCII 
Content-Transfer-Encoding: 7bit Subject: X-BeenThere: freebsd-smp@freebsd.org X-Mailman-Version: 2.1.1 Precedence: list Reply-To: Frank Mancuso List-Id: FreeBSD SMP implementation group List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 12 Jan 2005 19:31:22 -0000 From owner-freebsd-smp@FreeBSD.ORG Wed Jan 12 19:31:56 2005 Return-Path: Delivered-To: freebsd-smp@freebsd.org Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125]) by hub.freebsd.org (Postfix) with ESMTP id 97E0A16A4CE for ; Wed, 12 Jan 2005 19:31:56 +0000 (GMT) Received: from mail5.speakeasy.net (mail5.speakeasy.net [216.254.0.205]) by mx1.FreeBSD.org (Postfix) with ESMTP id 60A6943D45 for ; Wed, 12 Jan 2005 19:31:56 +0000 (GMT) (envelope-from jhb@FreeBSD.org) Received: (qmail 14776 invoked from network); 12 Jan 2005 19:31:56 -0000 Received: from dsl027-160-063.atl1.dsl.speakeasy.net (HELO server.baldwin.cx) ([216.27.160.63]) (envelope-sender ) encrypted SMTP for ; 12 Jan 2005 19:31:55 -0000 Received: from [10.50.41.243] (gw1.twc.weather.com [216.133.140.1]) (authenticated bits=0) by server.baldwin.cx (8.12.11/8.12.11) with ESMTP id j0CJVl4j027231; Wed, 12 Jan 2005 14:31:51 -0500 (EST) (envelope-from jhb@FreeBSD.org) From: John Baldwin To: freebsd-smp@FreeBSD.org Date: Wed, 12 Jan 2005 11:13:50 -0500 User-Agent: KMail/1.6.2 References: In-Reply-To: MIME-Version: 1.0 Content-Disposition: inline Content-Type: text/plain; charset="koi8-r" Content-Transfer-Encoding: 7bit Message-Id: <200501121113.51064.jhb@FreeBSD.org> X-Spam-Checker-Version: SpamAssassin 2.63 (2004-01-11) on server.baldwin.cx cc: Peter Trifonov cc: kris@obsecurity.org Subject: Re: Lost interrupts on SMP systems X-BeenThere: freebsd-smp@freebsd.org X-Mailman-Version: 2.1.1 Precedence: list List-Id: FreeBSD SMP implementation group List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 12 Jan 2005 19:31:56 -0000 On Wednesday 12 January 
2005 02:12 am, Peter Trifonov wrote: > Hello John, > > > > Things became even worse with both patches. Now doing normal (not > > > flood) ping over EITHER xl1 OR xl2 (they share IRQ11) causes the > > > corresponding interface to say "watchdog timeout". Essentially, xl1 > > > and xl2 do not work at all now. > > > > > > mptable still reports the same stuff as it was on the original > > > (unpatched) kernel > > > > (http://lists.freebsd.org/pipermail/freebsd-smp/2005-January/000700.ht > > > > > ml) > > > > Ok, can you get me the dmesg from a boot -v with both patches still? > > It can be found here: > http://dcn.infos.ru/~bugman/bootlog.txt > I have also put there output of mptable. > At a first glance, there are many strange things (e.g. a lot of failures at > various places) in this log > file, but I don't know which are relevant to the problem :-). Unfortunately, it's missing the earliest messages. I'm especially curious if your machine claims to have an ELCR, which would be output to a serial console very early on. I'll commit the current workaround for your mptable and work up a patch to use the ELCR if it exists for ISA busses, not just EISA, maybe that will help. 
-- John Baldwin <>< http://www.FreeBSD.org/~jhb/ "Power Users Use the Power to Serve" = http://www.FreeBSD.org From owner-freebsd-smp@FreeBSD.ORG Thu Jan 13 02:41:40 2005 Return-Path: Delivered-To: freebsd-smp@freebsd.org Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125]) by hub.freebsd.org (Postfix) with ESMTP id 7BCF916A4CE for ; Thu, 13 Jan 2005 02:41:40 +0000 (GMT) Received: from wproxy.gmail.com (wproxy.gmail.com [64.233.184.206]) by mx1.FreeBSD.org (Postfix) with ESMTP id 003C743D1F for ; Thu, 13 Jan 2005 02:41:40 +0000 (GMT) (envelope-from crash4o4@gmail.com) Received: by wproxy.gmail.com with SMTP id 58so554992wri for ; Wed, 12 Jan 2005 18:41:39 -0800 (PST) DomainKey-Signature: a=rsa-sha1; q=dns; c=nofws; s=beta; d=gmail.com; h=received:message-id:date:from:reply-to:to:subject:mime-version:content-type:content-transfer-encoding; b=sq12SoV+giamJyjOMxjTg0B8TPtwKMrGhPfHm4Km3LR8mjDUFINvHWo34P9NNWgQD48g6F27JakKvUz8eu3xflmSdUvNRutuSGhdBLY6unKtn5FC22zB+KWROS0wYZzZEira2Phh+IYlqA6PtWM73/aL0pkuChoQtWN0KlqHpFw= Received: by 10.54.46.24 with SMTP id t24mr276041wrt; Wed, 12 Jan 2005 18:41:39 -0800 (PST) Received: by 10.54.47.27 with HTTP; Wed, 12 Jan 2005 18:41:39 -0800 (PST) Message-ID: <484e074305011218417b5c4704@mail.gmail.com> Date: Wed, 12 Jan 2005 21:41:39 -0500 From: Frank Mancuso To: freebsd-smp@freebsd.org Mime-Version: 1.0 Content-Type: text/plain; charset=US-ASCII Content-Transfer-Encoding: 7bit Subject: smp xeon question X-BeenThere: freebsd-smp@freebsd.org X-Mailman-Version: 2.1.1 Precedence: list Reply-To: Frank Mancuso List-Id: FreeBSD SMP implementation group List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 13 Jan 2005 02:41:40 -0000 Hello there, I have a dual xeon 2.8's and 2gb of ram. And was wondering which version of FreeBSD would be better to install. FreeBSD 4.10 or FreeBSD 5.3. On all my other systems which are P4 1.9's I've used FreeBSD 4.10, cause I was very comfortable with it. 
But as for SMP support, I'm not sure about the hardware support in either version. Thanks Frank Mancuso. From owner-freebsd-smp@FreeBSD.ORG Thu Jan 13 07:18:43 2005 Return-Path: Delivered-To: freebsd-smp@freebsd.org Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125]) by hub.freebsd.org (Postfix) with ESMTP id C89C916A4CE; Thu, 13 Jan 2005 07:18:43 +0000 (GMT) Received: from dcn.infos.ru (gw-9cor.infos.ru [195.209.229.106]) by mx1.FreeBSD.org (Postfix) with ESMTP id E6B6A43D1F; Thu, 13 Jan 2005 07:18:42 +0000 (GMT) (envelope-from pvtrifonov@mail.ru) Received: from dcn (localhost [127.0.0.1]) by dcn (Postfix) with SMTP id A3F532B791; Thu, 13 Jan 2005 10:18:41 +0300 (MSK) Received: by smtp.xj.dcn (Postfix, from userid 65534) id 8254E2B798; Thu, 13 Jan 2005 10:18:41 +0300 (MSK) Received: from tank-ls.xj.dcn (unknown [10.0.103.154]) by smtp.xj.dcn (Postfix) with ESMTP id D29BB2B726; Thu, 13 Jan 2005 10:18:38 +0300 (MSK) From: Peter Trifonov To: freebsd-smp@freebsd.org Date: Thu, 13 Jan 2005 10:22:49 +0300 User-Agent: KMail/1.6.2 References: <200501121113.51064.jhb@FreeBSD.org> In-Reply-To: <200501121113.51064.jhb@FreeBSD.org> MIME-Version: 1.0 Content-Disposition: inline Content-Type: text/plain; charset="koi8-r" Content-Transfer-Encoding: 7bit Message-Id: <200501131022.49845.pvtrifonov@mail.ru> X-Spam-Checker-Version: SpamAssassin 3.0.1 (2004-10-22) on dcn.xj.dcn X-Spam-Level: X-Spam-Status: No, score=-5.9 required=5.0 tests=ALL_TRUSTED,AWL,BAYES_00 autolearn=ham version=3.0.1 cc: John Baldwin cc: kris@obsecurity.org Subject: Re: Lost interrupts on SMP systems X-BeenThere: freebsd-smp@freebsd.org X-Mailman-Version: 2.1.1 Precedence: list List-Id: FreeBSD SMP implementation group List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 13 Jan 2005 07:18:43 -0000 Hello John, On Wednesday 12 January 2005 19:13, John Baldwin wrote: > > > Ok, can you get me the dmesg from a boot -v with both patches still? 
> > > > It can be found here: > > http://dcn.infos.ru/~bugman/bootlog.txt > > I have also put there output of mptable. > > At a first glance, there are many strange things (e.g. a lot of failures > > at various places) in this log > > file, but I don't know which are relevant to the problem :-). > > Unfortunately, it's missing the earliest messages. I'm especially curious > if your machine claims to have an ELCR, which would be output to a serial > console very early on. I'll commit the current workaround for your mptable > and work up a patch to use the ELCR if it exists for ISA busses, not just > EISA, maybe that will help. I have carefully inspected what the kernel says with boot -p -v. There is nothing there about ELCR. I have found two similar problem reports: http://www.freebsd.org/cgi/query-pr.cgi?pr=i386/40274 http://www.freebsd.org/cgi/query-pr.cgi?pr=i386/43852 Some other bug reports also mention "device timeout", but they seem to differ considerably from my case. If it may be helpful for you, I can even connect this box with the serial cable to another PC and give you remote access to that one. -- With best regards, P. 
Trifonov From owner-freebsd-smp@FreeBSD.ORG Thu Jan 13 07:34:41 2005 Return-Path: Delivered-To: freebsd-smp@freebsd.org Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125]) by hub.freebsd.org (Postfix) with ESMTP id A054516A4CE for ; Thu, 13 Jan 2005 07:34:41 +0000 (GMT) Received: from silver.he.iki.fi (helenius.fi [193.64.42.241]) by mx1.FreeBSD.org (Postfix) with ESMTP id 2A02D43D39 for ; Thu, 13 Jan 2005 07:34:40 +0000 (GMT) (envelope-from pete@he.iki.fi) Received: from [195.163.185.142] (i2-142.rommon.fi [195.163.185.142]) by silver.he.iki.fi (8.13.1/8.11.4) with ESMTP id j0D7YblL033711; Thu, 13 Jan 2005 09:34:38 +0200 (EET) (envelope-from pete@he.iki.fi) Message-ID: <41E6248F.3060802@he.iki.fi> Date: Thu, 13 Jan 2005 09:34:39 +0200 From: Petri Helenius User-Agent: Mozilla Thunderbird 1.0 (Windows/20041206) X-Accept-Language: en-us, en MIME-Version: 1.0 To: Frank Mancuso References: <484e074305011218417b5c4704@mail.gmail.com> In-Reply-To: <484e074305011218417b5c4704@mail.gmail.com> Content-Type: text/plain; charset=ISO-8859-1; format=flowed Content-Transfer-Encoding: 7bit cc: freebsd-smp@freebsd.org Subject: Re: smp xeon question X-BeenThere: freebsd-smp@freebsd.org X-Mailman-Version: 2.1.1 Precedence: list List-Id: FreeBSD SMP implementation group List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 13 Jan 2005 07:34:41 -0000 Frank Mancuso wrote: >Hello there, I have a dual xeon 2.8's and 2gb of ram. And was >wondering which version of FreeBSD would be better to install. FreeBSD >4.10 or FreeBSD 5.3. On all my other systems which are P4 1.9's I've >used FreeBSD 4.10, cause I was very comfortable with it. But as of SMP >support. I'm not sure on the hardware support on either versions. > > Both will work, you'll get more performance out of 5.3 but it's still fairly recent release so your mileage may vary more than with 4.x. 
Pete From owner-freebsd-smp@FreeBSD.ORG Thu Jan 13 17:28:55 2005 Return-Path: Delivered-To: freebsd-smp@freebsd.org Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125]) by hub.freebsd.org (Postfix) with ESMTP id 7AB8916A4CE for ; Thu, 13 Jan 2005 17:28:55 +0000 (GMT) Received: from dreadlock.phreakout.net (dreadlock.phreakout.net [12.45.16.51]) by mx1.FreeBSD.org (Postfix) with SMTP id 9335343D4C for ; Thu, 13 Jan 2005 17:28:54 +0000 (GMT) (envelope-from bob@phreakout.net) Received: (qmail 28463 invoked from network); 13 Jan 2005 17:32:53 -0000 Received: from 24-52-224-96.kntnny.adelphia.net (HELO ?192.168.102.103?) (24.52.224.96) by dreadlock.phreakout.net with SMTP; 13 Jan 2005 17:32:53 -0000 Message-ID: <41E6AFAA.6070207@phreakout.net> Date: Thu, 13 Jan 2005 12:28:10 -0500 From: Bob Ababurko User-Agent: Mozilla Thunderbird 0.9 (Windows/20041103) X-Accept-Language: en-us, en MIME-Version: 1.0 To: freebsd-smp@freebsd.org Content-Type: text/plain; charset=ISO-8859-1; format=flowed Content-Transfer-Encoding: 7bit Subject: process binding X-BeenThere: freebsd-smp@freebsd.org X-Mailman-Version: 2.1.1 Precedence: list List-Id: FreeBSD SMP implementation group List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 13 Jan 2005 17:28:55 -0000 Hello- I am looking for a way to bind processes to specific processors. I am first trying to find an opensource OS that will be able to do it. It seems that there is a glimmer of hope with the new release of FreeBSD...5.3. So will the function, sched_bind() be able to help me. I am not really to sure what this is...(someone brought it up to me on the openbsd smp list)...is it a command or function of C? I basically want to run a fairly simple perl script on each processor of a two processor machine. I am looking to this idea because running two of these scripts on a single proc machine yields very poor performance. 
These run for about an hour, but take like 10 times longer if more than one is running(single proc). I am hoping that I can find a way to achieve some good results by running one process per proc on a dual proc machine. I am getting ready to build a athlon MP 2800+ if I can figure out a way to make this work. If anyone has any insight into this problem as well as sched_bind, I would love to hear any feedback. thanks, Bob From owner-freebsd-smp@FreeBSD.ORG Thu Jan 13 19:36:09 2005 Return-Path: Delivered-To: freebsd-smp@freebsd.org Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125]) by hub.freebsd.org (Postfix) with ESMTP id 629FD16A4CE for ; Thu, 13 Jan 2005 19:36:09 +0000 (GMT) Received: from sccrmhc13.comcast.net (sccrmhc13.comcast.net [204.127.202.64]) by mx1.FreeBSD.org (Postfix) with ESMTP id 0140643D53 for ; Thu, 13 Jan 2005 19:36:08 +0000 (GMT) (envelope-from fatbasho@comcast.net) Received: from [192.168.0.2] (c-24-3-59-9.client.comcast.net[24.3.59.9]) by comcast.net (sccrmhc13) with SMTP id <20050113193608016006rm4ie>; Thu, 13 Jan 2005 19:36:08 +0000 Mime-Version: 1.0 (Apple Message framework v619) In-Reply-To: <41E6AFAA.6070207@phreakout.net> References: <41E6AFAA.6070207@phreakout.net> Content-Type: text/plain; charset=US-ASCII; format=flowed Message-Id: <5979AD64-659A-11D9-B6DA-000A95B2B6CC@comcast.net> Content-Transfer-Encoding: 7bit From: James Larkby-Lahet Date: Thu, 13 Jan 2005 14:35:58 -0500 To: freebsd-smp@freebsd.org X-Mailer: Apple Mail (2.619) Subject: Re: process binding X-BeenThere: freebsd-smp@freebsd.org X-Mailman-Version: 2.1.1 Precedence: list List-Id: FreeBSD SMP implementation group List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 13 Jan 2005 19:36:09 -0000 > I am looking for a way to bind processes to specific processors. I am > first trying to find an opensource OS that will be able to do it. It > seems that there why not leave it to the scheduler? 
It will manage things quite nicely (including the many other processes that are bound to run) and even tries to respect L(1/2/3) caching, w/ processor affinity. For the most part you will see your programs running on separate processors, unless another process must run, in which case one process will sleep for a bit. > I basically want to run a fairly simple perl script on each processor > of a two processor machine. I am looking to this idea because running > two of these scripts on a single proc machine yields very poor > performance. These run for about an hour, but take like 10 times > longer if more than one is running(single proc). I am hoping that I > can find a way to achieve some good results by running one process per > proc on a dual proc machine. I am getting ready to build a athlon MP > 2800+ if I can figure out a way to make this work. your problems are likely due to cache contention between the two processes, so 2 processes on a two processor box will be fine, but if you up it to 4 processes, you'll have the same problem. 
also, unless you are putting faster processors in the new MP box, you will not see a speed-up for a single process ;-) cheers, james From owner-freebsd-smp@FreeBSD.ORG Fri Jan 14 01:45:34 2005 Return-Path: Delivered-To: freebsd-smp@freebsd.org Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125]) by hub.freebsd.org (Postfix) with ESMTP id D935016A4CE for ; Fri, 14 Jan 2005 01:45:34 +0000 (GMT) Received: from rproxy.gmail.com (rproxy.gmail.com [64.233.170.203]) by mx1.FreeBSD.org (Postfix) with ESMTP id 510D343D66 for ; Fri, 14 Jan 2005 01:45:34 +0000 (GMT) (envelope-from joseph.koshy@gmail.com) Received: by rproxy.gmail.com with SMTP id y7so150798rne for ; Thu, 13 Jan 2005 17:45:32 -0800 (PST) DomainKey-Signature: a=rsa-sha1; q=dns; c=nofws; s=beta; d=gmail.com; h=received:message-id:date:from:reply-to:to:subject:cc:in-reply-to:mime-version:content-type:content-transfer-encoding:references; b=txPp7uCgFzHQAeUNN7rS+senfq4nLRxJihaQ4ZEaugjtgpqBh77I0KDUv3+MXqW3nrldHRc3Y13Fl1i/IRSgKPHH2ALt9/FY7PtNHrs1xhtY1X/Ushgtb4G/5O30GUthDGtNSGqb5jqBrlKPxh6ERZzWx0oml5w6aC1gl54p9P8= Received: by 10.39.2.28 with SMTP id e28mr286100rni; Thu, 13 Jan 2005 17:45:32 -0800 (PST) Received: by 10.38.209.12 with HTTP; Thu, 13 Jan 2005 17:45:31 -0800 (PST) Message-ID: <84dead720501131745174c6e8@mail.gmail.com> Date: Fri, 14 Jan 2005 01:45:31 +0000 From: Joseph Koshy To: Bob Ababurko In-Reply-To: <41E6AFAA.6070207@phreakout.net> Mime-Version: 1.0 Content-Type: text/plain; charset=US-ASCII Content-Transfer-Encoding: 7bit References: <41E6AFAA.6070207@phreakout.net> cc: freebsd-smp@freebsd.org Subject: Re: process binding X-BeenThere: freebsd-smp@freebsd.org X-Mailman-Version: 2.1.1 Precedence: list Reply-To: Joseph Koshy List-Id: FreeBSD SMP implementation group List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Fri, 14 Jan 2005 01:45:35 -0000 > seems that there is a glimmer of hope with the new release of > FreeBSD...5.3. 
So will the function, sched_bind() be able to help me. sched_bind() is a scheduler internal function. At this time there is no way to invoke sched_bind() from userland. From owner-freebsd-smp@FreeBSD.ORG Fri Jan 14 18:06:06 2005 Return-Path: Delivered-To: freebsd-smp@freebsd.org Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125]) by hub.freebsd.org (Postfix) with ESMTP id 4658516A4E4 for ; Fri, 14 Jan 2005 18:06:06 +0000 (GMT) Received: from mail26.sea5.speakeasy.net (mail26.sea5.speakeasy.net [69.17.117.28]) by mx1.FreeBSD.org (Postfix) with ESMTP id C836B43D6A for ; Fri, 14 Jan 2005 18:06:05 +0000 (GMT) (envelope-from jhb@FreeBSD.org) Received: (qmail 29604 invoked from network); 14 Jan 2005 18:06:05 -0000 Received: from dsl027-160-063.atl1.dsl.speakeasy.net (HELO server.baldwin.cx) ([216.27.160.63]) (envelope-sender ) AES256-SHA encrypted SMTP for ; 14 Jan 2005 18:06:05 -0000 Received: from [10.50.40.231] (gw1.twc.weather.com [216.133.140.1]) (authenticated bits=0) by server.baldwin.cx (8.12.11/8.12.11) with ESMTP id j0EI5ruL042325; Fri, 14 Jan 2005 13:06:00 -0500 (EST) (envelope-from jhb@FreeBSD.org) From: John Baldwin To: freebsd-smp@FreeBSD.org Date: Fri, 14 Jan 2005 12:51:50 -0500 User-Agent: KMail/1.6.2 References: <200501121113.51064.jhb@FreeBSD.org> <200501131022.49845.pvtrifonov@mail.ru> In-Reply-To: <200501131022.49845.pvtrifonov@mail.ru> MIME-Version: 1.0 Content-Disposition: inline Message-Id: <200501141251.50342.jhb@FreeBSD.org> Content-Type: text/plain; charset="koi8-r" Content-Transfer-Encoding: 7bit X-Spam-Checker-Version: SpamAssassin 2.63 (2004-01-11) on server.baldwin.cx cc: Peter Trifonov cc: kris@obsecurity.org Subject: Re: Lost interrupts on SMP systems X-BeenThere: freebsd-smp@freebsd.org X-Mailman-Version: 2.1.1 Precedence: list List-Id: FreeBSD SMP implementation group List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Fri, 14 Jan 2005 18:06:06 -0000 On Thursday 13 January 2005 
02:22 am, Peter Trifonov wrote: > Hello John, > > On Wednesday 12 January 2005 19:13, John Baldwin wrote: > > > > Ok, can you get me the dmesg from a boot -v with both patches still? > > > > > > It can be found here: > > > http://dcn.infos.ru/~bugman/bootlog.txt > > > I have also put there output of mptable. > > > At a first glance, there are many strange things (e.g. a lot of > > > failures at various places) in this log > > > file, but I don't know which are relevant to the problem :-). > > > > Unfortunately, it's missing the earliest messages. I'm especially > > curious if your machine claims to have an ELCR, which would be output to > > a serial console very early on. I'll commit the current workaround for > > your mptable and work up a patch to use the ELCR if it exists for ISA > > busses, not just EISA, maybe that will help. > > I have carefully inspected what the kernel says with boot -p -v. There is > nothing there about ELCR. > > I have found two similar problem reports: > http://www.freebsd.org/cgi/query-pr.cgi?pr=i386/40274 > http://www.freebsd.org/cgi/query-pr.cgi?pr=i386/43852 > Some other bug reports also mention "device timeout", but they seem to > differ considerably from my case. Those two bug reports tend to focus on fxp(4) though and you have xl(4) cards. I've gone ahead and committed the fix for the MPTable global entries btw. I don't think there is a routing or edge/level problem though because the devices do work until you do a ping flood. One thing we can try is that Linux has a workaround for an undocumented errata in at least some older I/O APICs where a level triggered interrupt can accidentally be delivered as edge triggered and end up not being properly acknowledged. However, you don't have any level triggered interrupts, so I'm not sure that is applicable. 
-- John Baldwin <>< http://www.FreeBSD.org/~jhb/ "Power Users Use the Power to Serve" = http://www.FreeBSD.org From owner-freebsd-smp@FreeBSD.ORG Fri Jan 14 18:13:34 2005 Return-Path: Delivered-To: freebsd-smp@freebsd.org Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125]) by hub.freebsd.org (Postfix) with ESMTP id 8BE0D16A4CE; Fri, 14 Jan 2005 18:13:34 +0000 (GMT) Received: from mx1.mail.ru (mx1.mail.ru [194.67.23.121]) by mx1.FreeBSD.org (Postfix) with ESMTP id 23B2843D48; Fri, 14 Jan 2005 18:13:34 +0000 (GMT) (envelope-from pvtrifonov@mail.ru) Received: from [195.209.229.106] (port=1198 helo=tank) by mx1.mail.ru with esmtp id 1CpVwy-0007cZ-00; Fri, 14 Jan 2005 21:13:32 +0300 From: "Peter Trifonov" To: "'John Baldwin'" , Date: Fri, 14 Jan 2005 21:17:34 +0300 MIME-Version: 1.0 Content-Type: text/plain; charset="koi8-r" Content-Transfer-Encoding: 7bit X-Mailer: Microsoft Office Outlook, Build 11.0.5510 X-MIMEOLE: Produced By Microsoft MimeOLE V6.00.2900.2180 In-Reply-To: <200501141251.50342.jhb@FreeBSD.org> Thread-Index: AcT6Y9JK9yf0dOWdTbGi4oMadsDpjQAAL1gw Message-Id: X-Spam: Not detected cc: kris@obsecurity.org Subject: RE: Lost interrupts on SMP systems X-BeenThere: freebsd-smp@freebsd.org X-Mailman-Version: 2.1.1 Precedence: list List-Id: FreeBSD SMP implementation group List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Fri, 14 Jan 2005 18:13:34 -0000 Hello John, > > I have found two similar problem reports: > > http://www.freebsd.org/cgi/query-pr.cgi?pr=i386/40274 > > http://www.freebsd.org/cgi/query-pr.cgi?pr=i386/43852 > > Some other bug reports also mention "device timeout", but > they seem to > > differ considerably from my case. > > Those two bug reports tend to focus on fxp(4) though and you > have xl(4) cards. I had the same problem with fxp's. Initially I though that it was fxp driver problem (because xl0 worked fine), so I have replaced Intel NICs with 3COM ones, but nothing has changed. 
From this I guess that the problem is not in the NIC drivers. > I've gone ahead and committed the fix for the MPTable global > entries btw. I don't think there is a routing or edge/level > problem though because the devices do work until you do a > ping flood. One thing we can try is that Linux has a IMPORTANT: I can do flood ping over either of them without any problems (at least, if the system is booted with -p -v, I don't know why). They break down ONLY if flood ping is SIMULTANEOUSLY performed over both of them. > workaround for an undocumented errata in at least some older > I/O APICs where a level triggered interrupt can accidentally > be delivered as edge triggered and end up not being properly > acknowledged. However, you don't have any level triggered > interrupts, so I'm not sure that is applicable. Please let me know how can I help you with this problem. With best regards, P. Trifonov From owner-freebsd-smp@FreeBSD.ORG Fri Jan 14 19:41:26 2005 Return-Path: Delivered-To: freebsd-smp@freebsd.org Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125]) by hub.freebsd.org (Postfix) with ESMTP id C735C16A4CE for ; Fri, 14 Jan 2005 19:41:26 +0000 (GMT) Received: from mail14.speakeasy.net (mail22.sea5.speakeasy.net [69.17.117.24]) by mx1.FreeBSD.org (Postfix) with ESMTP id 76C8A43D2D for ; Fri, 14 Jan 2005 19:41:26 +0000 (GMT) (envelope-from jhb@FreeBSD.org) Received: (qmail 11233 invoked from network); 14 Jan 2005 19:41:26 -0000 Received: from dsl027-160-063.atl1.dsl.speakeasy.net (HELO server.baldwin.cx) ([216.27.160.63]) (envelope-sender ) encrypted SMTP for ; 14 Jan 2005 19:41:25 -0000 Received: from [10.50.40.231] (gw1.twc.weather.com [216.133.140.1]) (authenticated bits=0) by server.baldwin.cx (8.12.11/8.12.11) with ESMTP id j0EJepMb042914; Fri, 14 Jan 2005 14:41:13 -0500 (EST) (envelope-from jhb@FreeBSD.org) From: John Baldwin To: freebsd-smp@FreeBSD.org Date: Fri, 14 Jan 2005 14:18:18 -0500 User-Agent: KMail/1.6.2 References: In-Reply-To: 
MIME-Version: 1.0 Content-Disposition: inline Content-Type: text/plain; charset="koi8-r" Content-Transfer-Encoding: 7bit Message-Id: <200501141418.18587.jhb@FreeBSD.org> X-Spam-Checker-Version: SpamAssassin 2.63 (2004-01-11) on server.baldwin.cx cc: Peter Trifonov cc: kris@obsecurity.org Subject: Re: Lost interrupts on SMP systems X-BeenThere: freebsd-smp@freebsd.org X-Mailman-Version: 2.1.1 Precedence: list List-Id: FreeBSD SMP implementation group List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Fri, 14 Jan 2005 19:41:27 -0000 On Friday 14 January 2005 01:17 pm, Peter Trifonov wrote: > Hello John, > > > > I have found two similar problem reports: > > > http://www.freebsd.org/cgi/query-pr.cgi?pr=i386/40274 > > > http://www.freebsd.org/cgi/query-pr.cgi?pr=i386/43852 > > > Some other bug reports also mention "device timeout", but > > > > they seem to > > > > > differ considerably from my case. > > > > Those two bug reports tend to focus on fxp(4) though and you > > have xl(4) cards. > > I had the same problem with fxp's. Initially I though that it was fxp > driver problem (because xl0 worked fine), > so I have replaced Intel NICs with 3COM ones, but nothing has changed. From > this I guess that the problem is not in the NIC drivers. Ok. > > I've gone ahead and committed the fix for the MPTable global > > entries btw. I don't think there is a routing or edge/level > > problem though because the devices do work until you do a > > ping flood. One thing we can try is that Linux has a > > IMPORTANT: I can do flood ping over either of them without any problems (at > least, if the system is booted with -p -v, I don't know why). > They break down ONLY if flood ping is SIMULTANEOUSLY performed over both > of them. More interrupt load that way, which would indicate maybe the bug Linux tries to work around except that your intpins are edge triggered. 
:(

> > workaround for an undocumented errata in at least some older
> > I/O APICs where a level triggered interrupt can accidentally
> > be delivered as edge triggered and end up not being properly
> > acknowledged. However, you don't have any level triggered
> > interrupts, so I'm not sure that is applicable.
>
> Please let me know how can I help you with this problem.

I've included a little test program below that you can run as root to do arbitrary port reads (inb). Please compile it and mail me the output of:

inb 0x4d0
inb 0x4d1

Thanks.

/* Headers required by the calls below (sys/types.h must precede machine/cpufunc.h). */
#include <sys/types.h>
#include <machine/cpufunc.h>	/* inb() */
#include <err.h>
#include <fcntl.h>
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>
#include <vis.h>

int
main(int ac, char **av)
{
	char repr[5];
	char *cp;
	int fd, port, value;

	if (ac != 2)
		errx(1, "A single argument is required.");
	/* Accept decimal, octal or hex (0x...) port numbers. */
	port = strtoul(av[1], &cp, 0);
	if (*cp != '\0' || port < 0 || port > 65535)
		errx(1, "Invalid port number %s.", av[1]);
	/* Holding /dev/io open gives this process access to I/O ports. */
	fd = open("/dev/io", O_RDONLY);
	if (fd < 0)
		err(1, "Failed to open /dev/io");
	value = inb(port);
	close(fd);
	/* Render the byte as a printable string, e.g. '^@' for 0. */
	vis(repr, value, VIS_NL | VIS_NOSLASH, 0);
	printf("inb(%s) = 0x%x = %dd = '%s'\n", av[1], value, value, repr);
	return (0);
}

-- 
John Baldwin <>< http://www.FreeBSD.org/~jhb/
"Power Users Use the Power to Serve" = http://www.FreeBSD.org

From owner-freebsd-smp@FreeBSD.ORG Sat Jan 15 07:59:34 2005 Return-Path: Delivered-To: freebsd-smp@freebsd.org Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125]) by hub.freebsd.org (Postfix) with ESMTP id 89B0516A4CE; Sat, 15 Jan 2005 07:59:34 +0000 (GMT) Received: from dcn.infos.ru (gw-9cor.infos.ru [195.209.229.106]) by mx1.FreeBSD.org (Postfix) with ESMTP id 825AA43D1F; Sat, 15 Jan 2005 07:59:33 +0000 (GMT) (envelope-from pvtrifonov@mail.ru) Received: from dcn (localhost [127.0.0.1]) by dcn (Postfix) with SMTP id 0CDBC2D424; Sat, 15 Jan 2005 10:59:32 +0300 (MSK) Received: by smtp.xj.dcn (Postfix, from userid 65534) id E04E22D425; Sat, 15 Jan 2005 10:59:31 +0300 (MSK) Received: from tank-ls.xj.dcn (unknown [10.0.103.154]) by smtp.xj.dcn (Postfix) with ESMTP
id EAD272CF97; Sat, 15 Jan 2005 10:59:28 +0300 (MSK) From: "Peter Trifonov" To: freebsd-smp@freebsd.org Date: Sat, 15 Jan 2005 11:03:30 +0300 User-Agent: KMail/1.6.2 References: <200501141418.18587.jhb@FreeBSD.org> In-Reply-To: <200501141418.18587.jhb@FreeBSD.org> MIME-Version: 1.0 Content-Disposition: inline Content-Type: text/plain; charset="koi8-r" Content-Transfer-Encoding: 7bit Message-Id: <200501151103.30642.pvtrifonov@mail.ru> X-Spam-Checker-Version: SpamAssassin 3.0.1 (2004-10-22) on dcn.xj.dcn X-Spam-Level: X-Spam-Status: No, score=-5.9 required=5.0 tests=ALL_TRUSTED,AWL,BAYES_00 autolearn=ham version=3.0.1 cc: John Baldwin cc: kris@obsecurity.org Subject: Re: Lost interrupts on SMP systems X-BeenThere: freebsd-smp@freebsd.org X-Mailman-Version: 2.1.1 Precedence: list List-Id: FreeBSD SMP implementation group List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Sat, 15 Jan 2005 07:59:34 -0000 Hello John, On Friday 14 January 2005 22:18, John Baldwin wrote: Among those bug reports the followup submitted by cguthrie@clubphoto.co (http://www.freebsd.org/cgi/query-pr.cgi?pr=i386/40274) looks like the most close one to my situation. > > > I've gone ahead and committed the fix for the MPTable global > > > entries btw. I don't think there is a routing or edge/level > > > problem though because the devices do work until you do a > > > ping flood. One thing we can try is that Linux has a > > > > IMPORTANT: I can do flood ping over either of them without any problems > > (at least, if the system is booted with -p -v, I don't know why). > > They break down ONLY if flood ping is SIMULTANEOUSLY performed over both > > of them. Another observation: doing simultaneous flood ping over xl0 AND xl1, xl0 AND xl2 also causes xl1 or xl2 respectively (but not both of them) to say "watchdog timeout". In both cases they can be fixed by doing ifconfig xl1 down ifconfig xl2 down ifconfig xl1 up ifconfig xl2 up i.e. 
even if flood ping has not been done over xl2, it still has to be brought down and up. xl0 works fine in all cases. Flood ping over just one interface (either of them) always works fine.

> More interrupt load that way, which would indicate maybe the bug Linux
> tries to work around except that your intpins are edge triggered. :(

Just a guess: maybe there is also some kind of race condition in the interrupt handling system, so that if too many interrupts are coming from different sources, some of them are not properly processed? However, this should be somehow related to IRQ sharing.

> I've included a little test program below that you can run as root to do
> arbitrary port reads (inb). Please compile it and mail me the output of:
>
> inb 0x4d0
> inb 0x4d1

Here is what it says:

# ./inb 0x4d0
inb(0x4d0) = 0x0 = 0d = '^@'
# ./inb 0x4d1
inb(0x4d1) = 0xe = 14d = '^N'

PS: I had to revert the first patch because it requires me to be near the console during the boot process in order to type boot -p -v. Moreover, during such a boot the system sometimes hangs. The mptable patch is still installed.

-- 
With best regards, P.
Trifonov From owner-freebsd-smp@FreeBSD.ORG Sat Jan 15 10:09:13 2005 Return-Path: Delivered-To: freebsd-smp@freebsd.org Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125]) by hub.freebsd.org (Postfix) with ESMTP id EFFA516A4CE; Sat, 15 Jan 2005 10:09:13 +0000 (GMT) Received: from dcn.infos.ru (gw-9cor.infos.ru [195.209.229.106]) by mx1.FreeBSD.org (Postfix) with ESMTP id 9F83343D31; Sat, 15 Jan 2005 10:09:13 +0000 (GMT) (envelope-from petert@dcn.infos.ru) Received: from dcn (localhost [127.0.0.1]) by dcn (Postfix) with SMTP id 666F7306E7; Sat, 15 Jan 2005 13:09:12 +0300 (MSK) Received: by smtp.xj.dcn (Postfix, from userid 65534) id 26B12306E8; Sat, 15 Jan 2005 13:09:12 +0300 (MSK) Received: from tank (unknown [10.0.103.154]) by smtp.xj.dcn (Postfix) with ESMTP id 76681306E2; Sat, 15 Jan 2005 13:09:04 +0300 (MSK) From: "Peter Trifonov" To: "'John Baldwin'" , Date: Sat, 15 Jan 2005 13:13:06 +0300 MIME-Version: 1.0 Content-Type: text/plain; charset="koi8-r" Content-Transfer-Encoding: 7bit X-Mailer: Microsoft Office Outlook, Build 11.0.5510 X-MIMEOLE: Produced By Microsoft MimeOLE V6.00.2900.2180 In-Reply-To: <200501141418.18587.jhb@FreeBSD.org> Thread-Index: AcT6cRjydPZzt0AKSoisCdw/H/pxDwAd8APg Message-Id: <20050115100904.76681306E2@smtp.xj.dcn> X-Spam-Checker-Version: SpamAssassin 3.0.1 (2004-10-22) on dcn.xj.dcn X-Spam-Level: X-Spam-Status: No, score=-5.5 required=5.0 tests=ALL_TRUSTED,AWL,BAYES_00, DNS_FROM_AHBL_RHSBL autolearn=ham version=3.0.1 cc: 'Peter Trifonov' cc: kris@obsecurity.org Subject: RE: Lost interrupts on SMP systems X-BeenThere: freebsd-smp@freebsd.org X-Mailman-Version: 2.1.1 Precedence: list List-Id: FreeBSD SMP implementation group List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Sat, 15 Jan 2005 10:09:14 -0000 Hello John, > > > workaround for an undocumented errata in at least some older I/O > > > APICs where a level triggered interrupt can accidentally be > > > delivered as edge 
triggered and end up not being properly > > > acknowledged. However, you don't have any level triggered > > > interrupts, so I'm not sure that is applicable. > More interrupt load that way, which would indicate maybe the > bug Linux tries to work around except that your intpins are > edge triggered. :( After some Google searching, I have found a number of somewhat similar problem reports/patches related to this family of HP boxes (Vectra XU 6/200). Maybe that is what you were writing about; sorry if so. http://www.sslug.dk/emailarkiv/teknik/2000_07/msg00306.html http://www.mail-archive.com/linux-kernel@vger.kernel.org/msg34195.html With best regards, P. Trifonov From owner-freebsd-smp@FreeBSD.ORG Sat Jan 15 10:11:59 2005 Return-Path: Delivered-To: freebsd-smp@freebsd.org Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125]) by hub.freebsd.org (Postfix) with ESMTP id 612B416A4CE; Sat, 15 Jan 2005 10:11:59 +0000 (GMT) Received: from mx2.mail.ru (mx2.mail.ru [194.67.23.122]) by mx1.FreeBSD.org (Postfix) with ESMTP id 15EBE43D55; Sat, 15 Jan 2005 10:11:59 +0000 (GMT) (envelope-from pvtrifonov@mail.ru) Received: from [195.209.229.106] (port=1706 helo=tank) by mx2.mail.ru with esmtp id 1CpkuT-000IcF-00; Sat, 15 Jan 2005 13:11:57 +0300 From: "Peter Trifonov" To: "'John Baldwin'" , Date: Sat, 15 Jan 2005 13:15:59 +0300 MIME-Version: 1.0 Content-Type: text/plain; charset="koi8-r" Content-Transfer-Encoding: 7bit X-Mailer: Microsoft Office Outlook, Build 11.0.5510 X-MIMEOLE: Produced By Microsoft MimeOLE V6.00.2900.2180 In-Reply-To: <200501141418.18587.jhb@FreeBSD.org> Thread-Index: AcT6cRjydPZzt0AKSoisCdw/H/pxDwAehJLA Message-Id: X-Spam: Not detected cc: kris@obsecurity.org Subject: RE: Lost interrupts on SMP systems X-BeenThere: freebsd-smp@freebsd.org X-Mailman-Version: 2.1.1 Precedence: list List-Id: FreeBSD SMP implementation group List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: ,
X-List-Received-Date: Sat, 15 Jan 2005 10:11:59 -0000 Hello John, > > > workaround for an undocumented errata in at least some older I/O > > > APICs where a level triggered interrupt can accidentally be > > > delivered as edge triggered and end up not being properly > > > acknowledged. However, you don't have any level triggered > > > interrupts, so I'm not sure that is applicable. > More interrupt load that way, which would indicate maybe the bug Linux > tries to work around except that your intpins are edge triggered. :( After some Google searching, I have found a number of somewhat similar problem reports/patches related to this family of HP boxes (Vectra XU 6/200). Maybe that is what you were writing about; sorry if so. http://www.sslug.dk/emailarkiv/teknik/2000_07/msg00306.html http://www.mail-archive.com/linux-kernel@vger.kernel.org/msg34195.html With best regards, P. Trifonov