From owner-freebsd-smp Sun Jan 26 09:36:10 1997 Return-Path: Received: (from root@localhost) by freefall.freebsd.org (8.8.5/8.8.5) id JAA16126 for smp-outgoing; Sun, 26 Jan 1997 09:36:10 -0800 (PST) Received: from clem.systemsix.com (clem.systemsix.com [198.99.86.131]) by freefall.freebsd.org (8.8.5/8.8.5) with SMTP id JAA16117 for ; Sun, 26 Jan 1997 09:36:05 -0800 (PST) Received: from localhost (localhost [127.0.0.1]) by clem.systemsix.com (8.6.12/8.6.12) with SMTP id KAA21506; Sun, 26 Jan 1997 10:34:38 -0700 Message-Id: <199701261734.KAA21506@clem.systemsix.com> X-Authentication-Warning: clem.systemsix.com: Host localhost didn't use HELO protocol X-Mailer: exmh version 1.6.5 12/11/95 From: Steve Passe To: "John D. Smerdon" cc: smp@freebsd.org, robin@intercore.com (Robin Cutshaw) Subject: Re: Tyan Tomcat II SMP video problems In-reply-to: Your message of "Sat, 25 Jan 1997 22:27:05 EST." <3.0.32.19970125222703.0095bb40@smerdon.livonia.mi.us> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Date: Sun, 26 Jan 1997 10:34:37 -0700 Sender: owner-smp@freebsd.org X-Loop: FreeBSD.org Precedence: bulk Hi, > (Unix/FreeBSD/SMP novice questions...) > > I have a Tyan Tomcat II with 2/P133's and a Matrox Millennium video card. > When the system first boots, these messages are displayed: > > >ipi_ihandler_attach: counting ipi irq24's as clk0 irqs > >ipi_ihandler_attach: counting ipi irq25's as clk0 irqs > >ipi_ihandler_attach: counting ipi irq26's as clk0 irqs > >ipi_ihandler_attach: counting ipi irq27's as clk0 irqs > > but are not logged to the /var/log/messages file. this is normal, they occur b4 the log daemon is running and so are 'lost'. --- > The booting continues, until the "SMP: All idle procs online." message is > displayed. The system appears to hang, but is really running and not > updating the video. I can blindly type and login or telnet in from another > system. 
Enabling the second processor `sysctl -w kern.smp_active=2` causes
> the screen to display other messages in the booting sequence, but then the
> video seems to hang again.

this is a new one, no idea except...

> I was running the 3.0-current as of Wednesday, then grabbed the SMP sys
> files and created a SMP kernel using the options suggested in the mptable
> output. dmesg would not run with the SMP kernel, I don't remember the
> message it displayed but I think it was complaining about kmem not being
> correct. Attempting to `cd /usr/src/sbin ; make` failed because the
> opt_smp.h files could not be found for some programs. `shutdown -r now`
> never restarts the system.

This is the second report of failure with the "3.0-current" sources in as many
days. Perhaps we are out of sync with 3.0 enough that we have problems
(this happens every so often, hopefully we will be able to merge soon!)

Try this:

 boot the non-SMP kernel
 change the link for /sys from /usr/src/sys to //src/sys
   ( eg. -> ln -s /usr/SMPsrc/sys /sys )
 build an SMP kernel from scratch: config; make depend; make; make install
 boot the new SMP kernel.

remember to change back the '/sys' link to /usr/src/sys (or whatever it was)
before attempting to build non-SMP stuff.

I have had this help in the past, it may or may not solve your problems.
-- Steve Passe | powered by smp@csn.net | FreeBSD From owner-freebsd-smp Mon Jan 27 00:05:17 1997 Return-Path: Received: (from root@localhost) by freefall.freebsd.org (8.8.5/8.8.5) id AAA25101 for smp-outgoing; Mon, 27 Jan 1997 00:05:17 -0800 (PST) Received: from parkplace.cet.co.jp (parkplace.cet.co.jp [202.32.64.1]) by freefall.freebsd.org (8.8.5/8.8.5) with ESMTP id AAA25096; Mon, 27 Jan 1997 00:05:14 -0800 (PST) Received: from localhost (michaelh@localhost) by parkplace.cet.co.jp (8.8.3/CET-v2.1) with SMTP id IAA28518; Mon, 27 Jan 1997 08:05:07 GMT Date: Mon, 27 Jan 1997 17:05:07 +0900 (JST) From: Michael Hancock Reply-To: Michael Hancock To: David Greenman cc: FreeBSD Hackers , freebsd-smp@FreeBSD.ORG Subject: Re: SLAB stuff, and applications to current net code (fwd) In-Reply-To: <199701262225.OAA08492@root.com> Message-ID: MIME-Version: 1.0 Content-Type: TEXT/PLAIN; charset=US-ASCII Sender: owner-smp@FreeBSD.ORG X-Loop: FreeBSD.org Precedence: bulk [Added cc to smp] On Sun, 26 Jan 1997, David Greenman wrote: > >Vahalia cites a paper: > > "Efficient Kernel Memory Allocation on Shared Memory > > Multiprocessors" > > McKenney, P.E. and Slingwine, J. > > Proceedings of USENIX, Winter, 1993 > > > >Which shows the sequent code to be faster than the McKusick-Karels > >algorithm by a factor of three to five on a UP, and a factor of > >one hundred to one thousand on a 25 processor system. > > I haven't read that paper, but I suspect that the numbers are wrong due > to spl* having high overhead back then. Since Bruce's "fast interrupt" code, > spl* is almost free, and this changes things greatly. The situation might > change slightly in MP systems, but the total time inside of malloc is so small > that I really doubt that synchronization/serialization will ever be a > significant problem. > If subsystems used TSM it would be faster still at the expense of being even less space efficient. 
Objects would never be free'd in the malloc/free sense, but put on a typed free list. The operations of moving from list to list is even cheaper than malloc/free and synchronization/serialization is easier, even more so in MP environments. In this case, the performance of malloc/free would be largely irrelevant. However, the underlying allocator's ability to distribute hot objects across the processor cache might become more important. TSM probably wouldn't go over with those that want to run FBSD on systems with 8MB of RAM. It'd probably be the way to go for SMP systems though. BTW, SLAB isn't type stable either. There's an operation kmem_cache_reap() that returns a SLAB whose objects are all free to the page level allocator. TSM on top of a SLAB allocator might be interesting. It would negate the extra management overhead of the SLABs and be able to take advantage of processor cache coloring at the same time. Regards, Mike Hancock From owner-freebsd-smp Mon Jan 27 00:41:07 1997 Return-Path: Received: (from root@localhost) by freefall.freebsd.org (8.8.5/8.8.5) id AAA26172 for smp-outgoing; Mon, 27 Jan 1997 00:41:07 -0800 (PST) Received: from sax.sax.de (sax.sax.de [193.175.26.33]) by freefall.freebsd.org (8.8.5/8.8.5) with SMTP id AAA26156; Mon, 27 Jan 1997 00:40:19 -0800 (PST) Received: (from uucp@localhost) by sax.sax.de (8.6.12/8.6.12-s1) with UUCP id JAA00301; Mon, 27 Jan 1997 09:40:08 +0100 Received: (from j@localhost) by uriah.heep.sax.de (8.8.4/8.6.9) id IAA10512; Mon, 27 Jan 1997 08:46:32 +0100 (MET) Message-ID: Date: Mon, 27 Jan 1997 08:46:32 +0100 From: j@uriah.heep.sax.de (J Wunsch) To: core@freebsd.org, smp@freebsd.org Subject: Re: Current SMP status inquiry References: <7252.854235598@time.cdrom.com> X-Mailer: Mutt 0.55-PL10 Mime-Version: 1.0 X-Phone: +49-351-2012 669 X-PGP-Fingerprint: DC 47 E6 E4 FF A6 E9 8F 93 21 E0 7D F9 12 D6 4E Reply-To: joerg_wunsch@uriah.heep.sax.de (Joerg Wunsch) In-Reply-To: <7252.854235598@time.cdrom.com>; from Jordan K. 
Hubbard on Jan 25, 1997 15:39:58 -0800 Sender: owner-smp@freebsd.org X-Loop: FreeBSD.org Precedence: bulk As Jordan K. Hubbard wrote: > > i think putting a FreeBSD/SMP there would make more sense, provided > > the system is basically usable and could serve as an experimental base > > for more than just a dozen developers. > > I agree. After some discussions about the GUUG meeting, i still don't have the slightest sign about the current status of the SMP work. Ain't there anybody around who could tell me? -- cheers, J"org joerg_wunsch@uriah.heep.sax.de -- http://www.sax.de/~joerg/ -- NIC: JW11-RIPE Never trust an operating system you don't have sources for. ;-) From owner-freebsd-smp Mon Jan 27 17:30:50 1997 Return-Path: Received: (from root@localhost) by freefall.freebsd.org (8.8.5/8.8.5) id RAA05169 for smp-outgoing; Mon, 27 Jan 1997 17:30:50 -0800 (PST) Received: from who.cdrom.com (who.cdrom.com [204.216.27.3]) by freefall.freebsd.org (8.8.5/8.8.5) with ESMTP id RAA04573; Mon, 27 Jan 1997 17:28:40 -0800 (PST) Received: from clem.systemsix.com (clem.systemsix.com [198.99.86.131]) by who.cdrom.com (8.7.5/8.6.11) with SMTP id NAA26826 ; Mon, 27 Jan 1997 13:12:15 -0800 (PST) Received: from localhost (localhost [127.0.0.1]) by clem.systemsix.com (8.6.12/8.6.12) with SMTP id OAA28766; Mon, 27 Jan 1997 14:09:48 -0700 Message-Id: <199701272109.OAA28766@clem.systemsix.com> X-Authentication-Warning: clem.systemsix.com: Host localhost didn't use HELO protocol X-Mailer: exmh version 1.6.5 12/11/95 From: Steve Passe To: joerg_wunsch@uriah.heep.sax.de (Joerg Wunsch) cc: core@freebsd.org, smp@freebsd.org Subject: Re: Current SMP status inquiry In-reply-to: Your message of "Mon, 27 Jan 1997 21:27:53 +0100." 
Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Date: Mon, 27 Jan 1997 14:09:47 -0700 Sender: owner-smp@freebsd.org X-Loop: FreeBSD.org Precedence: bulk

Hi,

> > SMP itself is fairly stable, runs on most MP machines it has been tried
> > on, and is considered ALPHA level. It is capable of running "make world"
> > on current without problem.
>
> That's great. Is it possible to get a diff against some version of
> -current that i could put up on that CD-ROM? I'm planning an either
> 2.2-POST-BETA or 3.0-current SNAPshot on that CD, but would provide
> the SMP diffs for the curious. Are they kernel only, or is the
> userland code also affected?

it consists of a separate src/sys tree, which replaces the one in current.

A word of caution here, every so often current gets far enough ahead of SMP
to cause subtle problems, and a few problem reports this weekend indicate
we may have reached that point again. My suggestion is that you get Peter
to do a merge at the point where you grab a 3.0-SNAP to guarantee a
working combination. The "getting started" page on the web site I
quoted previously will give the details for cvsupping the tree. On freefall
I believe you will find it in /home/smp.

> Any ``authoritative'' note about the status of the SMP work (known
> bugs, next steps) would also be appreciated, something i could leave
> on the CD as a README.SMP, for example.

one of the positions we need filled is a "librarian", ie someone who will
collect the various mail that has such info and collate into usable documents.
much of this info has crossed the transom, but there is no organization of it
at all. Peter did a nice mailing about what he saw as necessary to be done b4
we could put SMP into current (features, bugs to fix, etc.) but last time I
tried to find it I had no success.

there is a "to.do" file at the head of the tree.
-- Steve Passe | powered by smp@csn.net | FreeBSD -----BEGIN PGP PUBLIC KEY BLOCK----- Version: 2.6.2 mQCNAzHe7tEAAAEEAM274wAEEdP+grIrV6UtBt54FB5ufifFRA5ujzflrvlF8aoE 04it5BsUPFi3jJLfvOQeydbegexspPXL6kUejYt2OeptHuroIVW5+y2M2naTwqtX WVGeBP6s2q/fPPAS+g+sNZCpVBTbuinKa/C4Q6HJ++M9AyzIq5EuvO0a8Rr9AAUR tBlTdGV2ZSBQYXNzZSA8c21wQGNzbi5uZXQ+ =ds99 -----END PGP PUBLIC KEY BLOCK----- From owner-freebsd-smp Mon Jan 27 17:30:50 1997 Return-Path: Received: (from root@localhost) by freefall.freebsd.org (8.8.5/8.8.5) id RAA05168 for smp-outgoing; Mon, 27 Jan 1997 17:30:50 -0800 (PST) Received: from who.cdrom.com (who.cdrom.com [204.216.27.3]) by freefall.freebsd.org (8.8.5/8.8.5) with ESMTP id RAA04566; Mon, 27 Jan 1997 17:28:39 -0800 (PST) Received: from sax.sax.de (sax.sax.de [193.175.26.33]) by who.cdrom.com (8.7.5/8.6.11) with SMTP id MAA26679 ; Mon, 27 Jan 1997 12:52:58 -0800 (PST) Received: (from uucp@localhost) by sax.sax.de (8.6.12/8.6.12-s1) with UUCP id VAA14690; Mon, 27 Jan 1997 21:51:19 +0100 Received: (from j@localhost) by uriah.heep.sax.de (8.8.4/8.6.9) id VAA12497; Mon, 27 Jan 1997 21:27:54 +0100 (MET) Message-ID: Date: Mon, 27 Jan 1997 21:27:53 +0100 From: j@uriah.heep.sax.de (J Wunsch) To: smp@csn.net (Steve Passe) Cc: core@freebsd.org, smp@freebsd.org Subject: Re: Current SMP status inquiry References: <199701272019.NAA28537@clem.systemsix.com> X-Mailer: Mutt 0.55-PL10 Mime-Version: 1.0 X-Phone: +49-351-2012 669 X-PGP-Fingerprint: DC 47 E6 E4 FF A6 E9 8F 93 21 E0 7D F9 12 D6 4E Reply-To: joerg_wunsch@uriah.heep.sax.de (Joerg Wunsch) In-Reply-To: <199701272019.NAA28537@clem.systemsix.com>; from Steve Passe on Jan 27, 1997 13:19:16 -0700 Sender: owner-smp@freebsd.org X-Loop: FreeBSD.org Precedence: bulk As Steve Passe wrote: > My somewhat incomplete understanding of the status: > > mergs of SMP with 3.0 is waiting for the merge of lite-2 into current. Ah, ok. 
> SMP itself is fairly stable, runs on most MP machines it has been tried > on, and is considered ALPHA level. It is capable of running "make world" > on current without problem. That's great. Is it possible to get a diff against some version of -current that i could put up on that CD-ROM? I'm planning an either 2.2-POST-BETA or 3.0-current SNAPshot on that CD, but would provide the SMP diffs for the curious. Are they kernel only, or is the userland code also affected? Any ``authoritative'' note about the status of the SMP work (known bugs, next steps) would also be appreciated, something i could leave on the CD as a README.SMP, for example. -- cheers, J"org joerg_wunsch@uriah.heep.sax.de -- http://www.sax.de/~joerg/ -- NIC: JW11-RIPE Never trust an operating system you don't have sources for. ;-) From owner-freebsd-smp Mon Jan 27 17:29:56 1997 Return-Path: Received: (from root@localhost) by freefall.freebsd.org (8.8.5/8.8.5) id RAA04905 for smp-outgoing; Mon, 27 Jan 1997 17:29:56 -0800 (PST) Received: from who.cdrom.com (who.cdrom.com [204.216.27.3]) by freefall.freebsd.org (8.8.5/8.8.5) with ESMTP id RAA04270; Mon, 27 Jan 1997 17:27:40 -0800 (PST) Received: from sax.sax.de (sax.sax.de [193.175.26.33]) by who.cdrom.com (8.7.5/8.6.11) with SMTP id NAA27027 ; Mon, 27 Jan 1997 13:42:09 -0800 (PST) Received: (from uucp@localhost) by sax.sax.de (8.6.12/8.6.12-s1) with UUCP id WAA15944; Mon, 27 Jan 1997 22:40:40 +0100 Received: (from j@localhost) by uriah.heep.sax.de (8.8.4/8.6.9) id WAA12794; Mon, 27 Jan 1997 22:30:17 +0100 (MET) Message-ID: Date: Mon, 27 Jan 1997 22:30:17 +0100 From: j@uriah.heep.sax.de (J Wunsch) To: smp@csn.net (Steve Passe) Cc: core@freebsd.org, smp@freebsd.org Subject: Re: Current SMP status inquiry References: <199701272109.OAA28766@clem.systemsix.com> X-Mailer: Mutt 0.55-PL10 Mime-Version: 1.0 X-Phone: +49-351-2012 669 X-PGP-Fingerprint: DC 47 E6 E4 FF A6 E9 8F 93 21 E0 7D F9 12 D6 4E Reply-To: joerg_wunsch@uriah.heep.sax.de (Joerg Wunsch) 
In-Reply-To: <199701272109.OAA28766@clem.systemsix.com>; from Steve Passe on Jan 27, 1997 14:09:47 -0700 Sender: owner-smp@freebsd.org X-Loop: FreeBSD.org Precedence: bulk As Steve Passe wrote: > A word of caution here, every so often current gets far enough ahead of SMP > to cause subtle problems, and a few problem reports this weekend indicate > we may have reached that point again. My suggestion is that you get Peter > to do a merge at the point where you grab a 3.0-SNAP to guarantee a > working combination. Ok, i'm not sure about Peter's time resources these days, since i know he's just been moving. I would prefer this option. > The "getting started" page on the web site I > quotedpreviously will give the details for cvsupping the tree. On freefall > I believe you will find it in /home/smp. I don't have the internet bandwidth to suck the entire CVS repository just for this CD. However, Chuck Robey volunteered to prepare me a diff, so i'd rather pick this. I need it by mid/end of next week, so if Peter has an occasion to synchronize the trees by the weekend, this should suffice. > there is a "to.do" file at the head of the tree. Ah, ok. This might help. I will also drop a boldfaced pointer to the SMP mailinglist for anybody who's more serious about this work. Thanks for the help so far! -- cheers, J"org joerg_wunsch@uriah.heep.sax.de -- http://www.sax.de/~joerg/ -- NIC: JW11-RIPE Never trust an operating system you don't have sources for. 
;-)

From owner-freebsd-smp Mon Jan 27 17:32:17 1997 Return-Path: Received: (from root@localhost) by freefall.freebsd.org (8.8.5/8.8.5) id RAA05423 for smp-outgoing; Mon, 27 Jan 1997 17:32:17 -0800 (PST) Received: from who.cdrom.com (who.cdrom.com [204.216.27.3]) by freefall.freebsd.org (8.8.5/8.8.5) with ESMTP id RAA04849; Mon, 27 Jan 1997 17:29:44 -0800 (PST) Received: from clem.systemsix.com (clem.systemsix.com [198.99.86.131]) by who.cdrom.com (8.7.5/8.6.11) with SMTP id MAA26437 ; Mon, 27 Jan 1997 12:21:53 -0800 (PST) Received: from localhost (localhost [127.0.0.1]) by clem.systemsix.com (8.6.12/8.6.12) with SMTP id NAA28537; Mon, 27 Jan 1997 13:19:17 -0700 Message-Id: <199701272019.NAA28537@clem.systemsix.com> X-Authentication-Warning: clem.systemsix.com: Host localhost didn't use HELO protocol X-Mailer: exmh version 1.6.5 12/11/95 From: Steve Passe To: joerg_wunsch@uriah.heep.sax.de (Joerg Wunsch) cc: core@freebsd.org, smp@freebsd.org Subject: Re: Current SMP status inquiry In-reply-to: Your message of "Mon, 27 Jan 1997 08:46:32 +0100." Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Date: Mon, 27 Jan 1997 13:19:16 -0700 Sender: owner-smp@freebsd.org X-Loop: FreeBSD.org Precedence: bulk

Hi,

> As Jordan K. Hubbard wrote:
>
> > > i think putting a FreeBSD/SMP there would make more sense, provided
> > > the system is basically usable and could serve as an experimental base
> > > for more than just a dozen developers.
> >
> > I agree.
>
> After some discussions about the GUUG meeting, i still don't have the
> slightest sign about the current status of the SMP work. Ain't there
> anybody around who could tell me?

My somewhat incomplete understanding of the status:

merge of SMP with 3.0 is waiting for the merge of lite-2 into current.
once this is shaken out then SMP will be merged into current.

current is merged into SMP every month or so, the last time being December 31.

SMP itself is fairly stable, runs on most MP machines it has been tried on, and is considered ALPHA level. It is capable of running "make world" on current without problem. There is a web page with latest news, further info: http://www.freebsd.org/~fsmp/SMP/SMP.html -- Steve Passe | powered by smp@csn.net | FreeBSD From owner-freebsd-smp Mon Jan 27 18:08:39 1997 Return-Path: Received: (from root@localhost) by freefall.freebsd.org (8.8.5/8.8.5) id SAA07767 for smp-outgoing; Mon, 27 Jan 1997 18:08:39 -0800 (PST) Received: from mpress.com (qmailr@mpress.com [208.138.29.130]) by freefall.freebsd.org (8.8.5/8.8.5) with SMTP id SAA07762 for ; Mon, 27 Jan 1997 18:08:35 -0800 (PST) From: brian@mediacity.com Received: (qmail 12420 invoked by uid 100); 28 Jan 1997 02:08:30 -0000 Message-ID: <19970128020830.12419.qmail@mpress.com> Subject: Re: Current SMP status inquiry In-Reply-To: <199701272019.NAA28537@clem.systemsix.com> from Steve Passe at "Jan 27, 97 01:19:16 pm" To: smp@csn.net (Steve Passe) Date: Mon, 27 Jan 1997 18:08:30 -0800 (PST) Cc: joerg_wunsch@uriah.heep.sax.de, core@freebsd.org, smp@freebsd.org Reply-to: brian@mpress.com X-Mailer: ELM [version 2.4ME+ PL30 (25)] MIME-Version: 1.0 Content-Type: text/plain; charset=US-ASCII Content-Transfer-Encoding: 7bit Sender: owner-smp@freebsd.org X-Loop: FreeBSD.org Precedence: bulk > > After some discussions about the GUUG meeting, i still don't have > > the slightest sign about the current status of the SMP work. Ain't > > there anybody around who could tell me? Steve Passe writes: > My somewhat incomplete understanding of the status: SMP itself is > fairly stable, runs on most MP machines it has been tried on, and > is considered ALPHA level. It is capable of running "make world" on > current without problem. What about the evil FP exception/FPU support problem? The one that locks everything up solid? 
I'd be happy to write up a description and workaround, but it may be that others don't believe it exists, or that it is simply a problem with my and a few others' equipment.

-- Brian Litzinger brian@mediacity.com

From owner-freebsd-smp Mon Jan 27 18:38:09 1997 Return-Path: Received: (from root@localhost) by freefall.freebsd.org (8.8.5/8.8.5) id SAA09728 for smp-outgoing; Mon, 27 Jan 1997 18:38:09 -0800 (PST) Received: from clem.systemsix.com (clem.systemsix.com [198.99.86.131]) by freefall.freebsd.org (8.8.5/8.8.5) with SMTP id SAA09645; Mon, 27 Jan 1997 18:36:40 -0800 (PST) Received: from localhost (localhost [127.0.0.1]) by clem.systemsix.com (8.6.12/8.6.12) with SMTP id TAA00396; Mon, 27 Jan 1997 19:35:39 -0700 Message-Id: <199701280235.TAA00396@clem.systemsix.com> X-Authentication-Warning: clem.systemsix.com: Host localhost didn't use HELO protocol X-Mailer: exmh version 1.6.5 12/11/95 From: Steve Passe To: brian@mpress.com cc: joerg_wunsch@uriah.heep.sax.de, core@freebsd.org, smp@freebsd.org Subject: Re: Current SMP status inquiry In-reply-to: Your message of "Mon, 27 Jan 1997 18:08:30 PST." <19970128020830.12419.qmail@mpress.com> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Date: Mon, 27 Jan 1997 19:35:38 -0700 Sender: owner-smp@freebsd.org X-Loop: FreeBSD.org Precedence: bulk

Hi,

> What about the evil FP exception/FPU support problem? The one that
> locks everything up solid? I'd be happy to write up a description and
> workaround, but it may be that others don't believe it exists, or that
> it is simply a problem with my and a few others' equipment.

it definitely seems to affect some hardware more than others. I recently
ran 3 copies of ico on X11 for 3-4 hours without a burp. There has to
be a fair amount of FPU use going on during that.

This does need to be pursued, but I have no time (or even an SMP machine)
to give to it right now, a "real job" has arrived...

-- Steve Passe | powered by smp@csn.net | FreeBSD

From owner-freebsd-smp Tue Jan 28 00:01:51 1997 Return-Path: Received: (from root@localhost) by freefall.freebsd.org (8.8.5/8.8.5) id AAA28958 for smp-outgoing; Tue, 28 Jan 1997 00:01:51 -0800 (PST) Received: from kvikk.uit.no (kvikk.Uit.No [129.242.4.32]) by freefall.freebsd.org (8.8.5/8.8.5) with ESMTP id AAA28953 for ; Tue, 28 Jan 1997 00:01:47 -0800 (PST) Received: from sprint.cc.uit.no (sprint.Cc.Uit.No [129.242.5.198]) by kvikk.uit.no (8.8.5/8.8.5) with ESMTP id JAA12547; Tue, 28 Jan 1997 09:01:42 +0100 (MET) Received: from slibo.cc.uit.no (slibo.Cc.Uit.No [129.242.5.36]) by sprint.cc.uit.no (8.8.5/8.8.5) with ESMTP id JAA11216; Tue, 28 Jan 1997 09:01:41 +0100 (MET) Received: from localhost (terjem@localhost) by slibo.cc.uit.no (8.8.5/8.8.5) with ESMTP id JAA15200; Tue, 28 Jan 1997 09:01:40 +0100 (MET) Message-Id: <199701280801.JAA15200@slibo.cc.uit.no> X-Mailer: exmh version 1.6.9 8/22/96 To: Steve Passe cc: smp@freebsd.org Subject: Re: Current SMP status inquiry In-reply-to: Your message of "Mon, 27 Jan 1997 19:35:38 MET." <199701280235.TAA00396@clem.systemsix.com> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Date: Tue, 28 Jan 1997 09:01:39 +0100 From: Terje Normann Marthinussen Sender: owner-smp@freebsd.org X-Loop: FreeBSD.org Precedence: bulk

>Hi,
>
>> What about the evil FP exception/FPU support problem? The one that
>> locks everything up solid? I'd be happy to write up a description and
>> workaround, but it may be that others don't believe it exists, or that
>> it is simply a problem with my and a few others' equipment.
>
>it definitely seems to affect some hardware more than others. I recently
>ran 3 copies of ico on X11 for 3-4 hours without a burp. There has to
>be a fair amount of FPU use going on during that.
>
>This does need to be pursued, but I have no time (or even an SMP machine)
>to give to it right now, a "real job" has arrived...

I got our 4CPU pentium HP Netserver up on current early last week, and during the weekend we had it chewing through 1GB+ of gzip'ed log data. Two processes fed with each half of the data, they grew to around 50MB each (early memory inefficient version of the program and we "only" have 128MB on it so 2 was enough). While not FPU intensive, it most certainly must have used quite a bit of FPU during the ~15 hours the processes ran (well, actually two runs of ~15 hours, a bug was discovered ;)). Didn't notice any problems at all. Terje Marthinussen terjem@cc.uit.no From owner-freebsd-smp Tue Jan 28 09:15:56 1997 Return-Path: Received: (from root@localhost) by freefall.freebsd.org (8.8.5/8.8.5) id JAA24557 for smp-outgoing; Tue, 28 Jan 1997 09:15:56 -0800 (PST) Received: from housing1.stucen.gatech.edu (ken@housing1.stucen.gatech.edu [130.207.52.71]) by freefall.freebsd.org (8.8.5/8.8.5) with ESMTP id JAA24550 for ; Tue, 28 Jan 1997 09:15:52 -0800 (PST) Received: (from ken@localhost) by housing1.stucen.gatech.edu (8.8.5/8.8.5) id MAA22280; Tue, 28 Jan 1997 12:15:27 -0500 (EST) From: Kenneth Merry Message-Id: <199701281715.MAA22280@housing1.stucen.gatech.edu> Subject: Re: Current SMP status inquiry In-Reply-To: <199701280801.JAA15200@slibo.cc.uit.no> from Terje Normann Marthinussen at "Jan 28, 97 09:01:39 am" To: Terje.N.Marthinussen@cc.uit.no (Terje Normann Marthinussen) Date: Tue, 28 Jan 1997 12:15:26 -0500 (EST) Cc: smp@csn.net, smp@freebsd.org X-Mailer: ELM [version 2.4ME+ PL25 (25)] MIME-Version: 1.0 Content-Type: text/plain; charset=US-ASCII Content-Transfer-Encoding: 7bit Sender: owner-smp@freebsd.org X-Loop: FreeBSD.org Precedence: bulk Terje Normann Marthinussen wrote... > >Hi, > > > >> What about the evil FP exception/FPU support problem? The one that > >> locks everything up solid? 
I'd be happy to write up a description and
> >> workaround, but it may be that others don't believe it exists, or that
> >> it is simply a problem with my and a few others' equipment.
> >
> >it definitely seems to affect some hardware more than others. I recently
> >ran 3 copies of ico on X11 for 3-4 hours without a burp. There has to
> >be a fair amount of FPU use going on during that.
> >
> >This does need to be pursued, but I have no time (or even an SMP machine)
> >to give to it right now, a "real job" has arrived...
>
> I got our 4CPU pentium HP Netserver up on current early last week, and
> during the weekend we had it chewing through 1GB+ of gzip'ed log data.
> Two processes fed with each half of the data, they grew to around
> 50MB each (early memory inefficient version of the program and we "only"
> have 128MB on it so 2 was enough).
>
> While not FPU intensive, it most certainly must have used quite a bit of
> FPU during the ~15 hours the processes ran (well, actually two runs
> of ~15 hours, a bug was discovered ;)). Didn't notice any problems at all.

I agree, I can run FP intensive stuff, like ICO, OpenGL demos, and all
kinds of stuff without problems. I've been running X without trouble, ever
since I figured out I needed to get rid of the FP emulation stuff in the
kernel. (I accidentally left it in my config file.) The question,
though, is can you run this program successfully?

========================================================================
/* Header names were lost in the archive; <signal.h> and <stdlib.h> are
 * assumed here because they declare signal(), SIGFPE and exit(). */
#include <signal.h>
#include <stdlib.h>

/* SIGFPE handler: leave with a distinctive exit status. */
void blech(int sig)
{
    (void) sig;
    exit(3);
}

int main(void)
{
    int i32;
    double f;
    int result = 0;

    signal(SIGFPE, blech);
    f = (double) 0x7fffffff;
    f = 10 * f;                 /* now far outside the range of int */
    i32 = (int) f;              /* out-of-range conversion, may trap */
    if (i32 != (int) f)
        result |= 1;
    exit(result);
}
========================================================================

That program locks my machine up solid. (I got it from the perl5
configuration script, after several attempts to compile the port hung my
machine.)

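
[Note on the test above, not part of the original post: converting an
out-of-range double to int is undefined behaviour in C, and on an x86 FPU
with the invalid-operation exception unmasked it can surface as SIGFPE,
which is the path this program exercises. What follows is only a minimal
sketch of the same arithmetic with an explicit range check, so the overflow
can be detected without performing the trapping conversion. It assumes a
32-bit int; fits_in_int and product_overflows_int are hypothetical names
chosen for illustration.]

```c
#include <limits.h>

/* Hypothetical helper, not from the original post: returns 1 when f,
 * truncated toward zero, is representable as an int (32-bit int assumed).
 * NaN fails both comparisons and is reported as not fitting. */
int fits_in_int(double f)
{
    return f > (double)INT_MIN - 1.0 && f < (double)INT_MAX + 1.0;
}

/* Repeats the arithmetic of the test program, 10 * 0x7fffffff held in a
 * double, and returns 1 when converting the product to int would be out
 * of range. The (int) cast itself is never executed here. */
int product_overflows_int(void)
{
    double f = (double)0x7fffffff;

    f = 10 * f;
    return !fits_in_int(f);
}
```

[A caller could test fits_in_int(f) before the (int) cast and skip the
conversion when it fails, instead of relying on the SIGFPE handler. This
does not address the kernel-side lockup itself.]
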
Ken -- Kenneth Merry ken@ulc199.residence.gatech.edu Disclaimer: I don't speak for GTRI, GT, or Elvis. From owner-freebsd-smp Tue Jan 28 09:31:48 1997 Return-Path: Received: (from root@localhost) by freefall.freebsd.org (8.8.5/8.8.5) id JAA25687 for smp-outgoing; Tue, 28 Jan 1997 09:31:48 -0800 (PST) Received: from tfs.com (tfs.com [140.145.250.1]) by freefall.freebsd.org (8.8.5/8.8.5) with SMTP id JAA25682 for ; Tue, 28 Jan 1997 09:31:45 -0800 (PST) Received: from schizo.dk.tfs.com by tfs.com (smail3.1.28.1) with SMTP id m0vpHNK-0003y7C; Tue, 28 Jan 97 09:31 PST Received: from critter.dk.tfs.com (critter.dk.tfs.com [140.145.230.252]) by schizo.dk.tfs.com (8.8.2/8.7.3) with ESMTP id SAA12823 for ; Tue, 28 Jan 1997 18:31:12 +0100 (MET) Received: from critter.dk.tfs.com (localhost [127.0.0.1]) by critter.dk.tfs.com (8.8.2/8.8.2) with ESMTP id SAA12397 for ; Tue, 28 Jan 1997 18:33:01 +0100 (MET) To: smp@freebsd.org Subject: another success Reply-to: phk@freebsd.org Date: Tue, 28 Jan 1997 18:33:01 +0100 Message-ID: <12395.854472781@critter.dk.tfs.com> From: Poul-Henning Kamp Sender: owner-smp@freebsd.org X-Loop: FreeBSD.org Precedence: bulk Having heard no loud and agonizing screams from the smp list for some time I decided to try out the smp kernel on our server here. (this is the one I originally used to get the SMP code off the ground btw. :-) Well, what can I say ? It works. # cd /usr/src/lib/libc # make clean >& /dev/null # time make -j 8 >& /dev/null 598.4u 622.3s 11:45.44 173.0% 323+568k 2927+7872io 487pf+0w This is DEC Prioris with 2xP5/133 & Neptune. -- Poul-Henning Kamp | phk@FreeBSD.ORG FreeBSD Core-team. http://www.freebsd.org/~phk | phk@login.dknet.dk Private mailbox. whois: [PHK] | phk@tfs.com TRW Financial Systems, Inc. Future will arrive by its own means, progress not so. 
From owner-freebsd-smp Tue Jan 28 21:32:48 1997 Return-Path: Received: (from root@localhost) by freefall.freebsd.org (8.8.5/8.8.5) id VAA10155 for smp-outgoing; Tue, 28 Jan 1997 21:32:48 -0800 (PST) Received: from sendero.i-connect.net ([206.190.144.100]) by freefall.freebsd.org (8.8.5/8.8.5) with SMTP id VAA10141 for ; Tue, 28 Jan 1997 21:32:44 -0800 (PST) Received: (qmail 9327 invoked by uid 1000); 29 Jan 1997 01:05:18 -0000 Message-ID: X-Mailer: XFMail 1.1-alpha [p0] on FreeBSD Content-Type: text/plain; charset=iso-8859-8 Content-Transfer-Encoding: 8bit MIME-Version: 1.0 In-Reply-To: <199701280801.JAA15200@slibo.cc.uit.no> Date: Tue, 28 Jan 1997 15:13:30 -0800 (PST) Organization: iConnect Corp. From: Simon Shapiro To: smp@freebsd.org Subject: A Newcomer`s Trivial Question. Sender: owner-smp@freebsd.org X-Loop: FreeBSD.org Precedence: bulk So, FreeBSD has an SMP support? Lack of it was one before last reservation about switching to it on a large project. How? What, etc? I can find no mention of it anywhere... Simon From owner-freebsd-smp Tue Jan 28 21:59:05 1997 Return-Path: Received: (from root@localhost) by freefall.freebsd.org (8.8.5/8.8.5) id VAA11397 for smp-outgoing; Tue, 28 Jan 1997 21:59:05 -0800 (PST) Received: from clem.systemsix.com (clem.systemsix.com [198.99.86.131]) by freefall.freebsd.org (8.8.5/8.8.5) with SMTP id VAA11391 for ; Tue, 28 Jan 1997 21:59:02 -0800 (PST) Received: from localhost (localhost [127.0.0.1]) by clem.systemsix.com (8.6.12/8.6.12) with SMTP id WAA07770; Tue, 28 Jan 1997 22:58:44 -0700 Message-Id: <199701290558.WAA07770@clem.systemsix.com> X-Authentication-Warning: clem.systemsix.com: Host localhost didn't use HELO protocol X-Mailer: exmh version 1.6.5 12/11/95 From: Steve Passe To: Simon Shapiro cc: smp@freebsd.org Subject: Re: A Newcomer`s Trivial Question. In-reply-to: Your message of "Tue, 28 Jan 1997 15:13:30 PST." 
Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Date: Tue, 28 Jan 1997 22:58:43 -0700 Sender: owner-smp@freebsd.org X-Loop: FreeBSD.org Precedence: bulk Hi, > So, FreeBSD has an SMP support? Lack of it was one before last reservation > about switching to it on a large project. How? What, etc? > > I can find no mention of it anywhere... http://www.freebsd.org/~fsmp/SMP/SMP.html -- Steve Passe | powered by smp@csn.net | FreeBSD From owner-freebsd-smp Wed Jan 29 10:21:34 1997 Return-Path: Received: (from root@localhost) by freefall.freebsd.org (8.8.5/8.8.5) id KAA13607 for smp-outgoing; Wed, 29 Jan 1997 10:21:34 -0800 (PST) Received: from clem.systemsix.com (clem.systemsix.com [198.99.86.131]) by freefall.freebsd.org (8.8.5/8.8.5) with SMTP id KAA13602 for ; Wed, 29 Jan 1997 10:21:30 -0800 (PST) Received: from localhost (localhost [127.0.0.1]) by clem.systemsix.com (8.6.12/8.6.12) with SMTP id LAA11225; Wed, 29 Jan 1997 11:20:20 -0700 Message-Id: <199701291820.LAA11225@clem.systemsix.com> X-Authentication-Warning: clem.systemsix.com: Host localhost didn't use HELO protocol X-Mailer: exmh version 1.6.5 12/11/95 From: Steve Passe To: Kenneth Merry cc: Terje.N.Marthinussen@cc.uit.no (Terje Normann Marthinussen), smp@freebsd.org Subject: Re: Current SMP status inquiry In-reply-to: Your message of "Tue, 28 Jan 1997 12:15:26 EST." <199701281715.MAA22280@housing1.stucen.gatech.edu> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Date: Wed, 29 Jan 1997 11:20:20 -0700 Sender: owner-smp@freebsd.org X-Loop: FreeBSD.org Precedence: bulk Hi, > The question, > though, is can you run this program successfully? > ... > That program locks my machine up solid. (I got it from the perl5 > configuration script, after several attempts to compile the port hung my > machine.) I think the problem here is that we are not handling the SIGFPE properly, ie this program provokes a SIGFPE, which is what locks the machine. 
If anyone has time to pursue this I would suggest looking at i386/isa/npx.c: npxintr(). -- Steve Passe | powered by smp@csn.net | FreeBSD -----BEGIN PGP PUBLIC KEY BLOCK----- Version: 2.6.2 mQCNAzHe7tEAAAEEAM274wAEEdP+grIrV6UtBt54FB5ufifFRA5ujzflrvlF8aoE 04it5BsUPFi3jJLfvOQeydbegexspPXL6kUejYt2OeptHuroIVW5+y2M2naTwqtX WVGeBP6s2q/fPPAS+g+sNZCpVBTbuinKa/C4Q6HJ++M9AyzIq5EuvO0a8Rr9AAUR tBlTdGV2ZSBQYXNzZSA8c21wQGNzbi5uZXQ+ =ds99 -----END PGP PUBLIC KEY BLOCK----- From owner-freebsd-smp Wed Jan 29 19:01:54 1997 Return-Path: Received: (from root@localhost) by freefall.freebsd.org (8.8.5/8.8.5) id TAA13713 for smp-outgoing; Wed, 29 Jan 1997 19:01:54 -0800 (PST) Received: from who.cdrom.com (who.cdrom.com [204.216.27.3]) by freefall.freebsd.org (8.8.5/8.8.5) with ESMTP id TAA13703 for ; Wed, 29 Jan 1997 19:01:51 -0800 (PST) Received: from red.jnx.com (red.jnx.com [208.197.169.254]) by who.cdrom.com (8.7.5/8.6.11) with ESMTP id TAA02949 for ; Wed, 29 Jan 1997 19:01:50 -0800 (PST) Received: from chimp.jnx.com (chimp.jnx.com [208.197.169.246]) by red.jnx.com (8.8.3/8.8.3) with ESMTP id TAA29542; Wed, 29 Jan 1997 19:00:05 -0800 (PST) Received: (from tli@localhost) by chimp.jnx.com (8.7.6/8.7.3) id TAA04996; Wed, 29 Jan 1997 19:00:05 -0800 (PST) Date: Wed, 29 Jan 1997 19:00:05 -0800 (PST) Message-Id: <199701300300.TAA04996@chimp.jnx.com> From: Tony Li To: smp@freebsd.org cc: shermis@jnx.com Subject: AMI Goliath status? Sender: owner-smp@freebsd.org X-Loop: FreeBSD.org Precedence: bulk Hi, Can anyone update me on the status of SMP for the AMI Goliath? Anyone actually got it running? 
Thanks, Tony From owner-freebsd-smp Wed Jan 29 22:32:23 1997 Return-Path: Received: (from root@localhost) by freefall.freebsd.org (8.8.5/8.8.5) id WAA23455 for smp-outgoing; Wed, 29 Jan 1997 22:32:23 -0800 (PST) Received: from atlantis.nconnect.net (root@atlantis.nconnect.net [206.54.227.6]) by freefall.freebsd.org (8.8.5/8.8.5) with ESMTP id WAA23450 for ; Wed, 29 Jan 1997 22:32:20 -0800 (PST) Received: from arabian.astrolab.org (dial187.nconnect.net [206.54.227.187]) by atlantis.nconnect.net (8.8.4/8.7.3) with SMTP id AAA17471 for ; Thu, 30 Jan 1997 00:25:45 -0600 (CST) Message-ID: <32F04022.41C67EA6@nconnect.net> Date: Thu, 30 Jan 1997 00:30:58 -0600 From: Randy DuCharme Organization: Computer Specialists X-Mailer: Mozilla 3.01Gold (X11; I; FreeBSD 3.0-SMP i386) MIME-Version: 1.0 To: smp@freebsd.org Subject: Dumb cvsup question Content-Type: text/plain; charset=us-ascii Content-Transfer-Encoding: 7bit Sender: owner-smp@freebsd.org X-Loop: FreeBSD.org Precedence: bulk Greetings, I've been supping current almost nightly and I've been noticing a lot of kernel source updates in current. Should I be including these in the SMP sources ?? I've been keeping them separate thus far. Thanks Randy From owner-freebsd-smp Thu Jan 30 00:24:47 1997 Return-Path: Received: (from root@localhost) by freefall.freebsd.org (8.8.5/8.8.5) id AAA27829 for smp-outgoing; Thu, 30 Jan 1997 00:24:47 -0800 (PST) Received: from kremvax.demos.su (kremvax.demos.su [194.87.0.20]) by freefall.freebsd.org (8.8.5/8.8.5) with SMTP id AAA27821 for ; Thu, 30 Jan 1997 00:24:42 -0800 (PST) Received: by kremvax.demos.su (8.6.13/D) from 0@sinbin.demos.su [194.87.2.95] for with ESMTP id LAA03420; Thu, 30 Jan 1997 11:22:21 +0300 Received: by sinbin.demos.su id LAA20712; (8.6.12/D) Thu, 30 Jan 1997 11:22:00 +0300 From: bag@sinbin.demos.su (Alex G. 
Bulushev) Message-Id: <199701300822.LAA20712@sinbin.demos.su> Subject: troubles with smp kernel To: freebsd-smp@freebsd.org Date: Thu, 30 Jan 1997 11:22:00 +0300 (MSK) Reply-To: bag@bag.ru X-Mailer: ELM [version 2.4 PL24 ME7a] Content-Type: text Sender: owner-smp@freebsd.org X-Loop: FreeBSD.org Precedence: bulk dual-P6-200 on Asus P/I-P65UP5 with C-P6ND all correct when kern.smp_active=0 after sysctl -w kern.smp_active=2 bytebench looping on first test, probe 2 kill -9 not help ?? Alex. dmesg: FreeBSD 3.0-SMP #0: Tue Jan 28 17:12:58 MSK 1997 mishania@fyllefrossa.demos.su:/usr/src/sys/compile/FYLLEFROSSA FreeBSD/SMP: Multiprocessor motherboard cpu0 (BSP): apic id: 1, version: 0x00040011 cpu1 (AP): apic id: 0, version: 0x00040011 Warning: APIC I/O disabled Calibrating clock(s) relative to mc146818A clock ... i8254 clock: 1193157 Hz CPU: Pentium Pro (686-class CPU) Origin = "GenuineIntel" Id = 0x616 Stepping=6 Features=0xfbff,MTRR,PGE,MCA,CMOV> real memory = 33554432 (32768K bytes) Physical memory hole(s): avail memory = 30871552 (30148K bytes) Probing for devices on PCI bus 0: chip0 rev 2 on pci0:0 chip1 rev 1 on pci0:1:0 chip2 rev 0 on pci0:1:1 chip3 rev 2 on pci0:9 de0 rev 17 int a irq 12 on pci0:10 de0: 21041 [10Mb/s] pass 1.1 de0: address 00:00:c0:74:8c:dc chip4 rev 2 on pci0:12 Probing for devices on PCI bus 1: ahc0 rev 0 int a irq 9 on pci1:4 ahc0: aic7880 Wide Channel A, SCSI Id=7, 16 SCBs ahc0 waiting for scsi devices to settle (ahc0:0:0): "SEAGATE ST32550W 0016" type 0 fixed SCSI 2 sd0(ahc0:0:0): Direct-Access 2047MB (4194058 512 byte sectors) (ahc0:1:0): "SEAGATE ST19171W 0017" type 0 fixed SCSI 2 sd1(ahc0:1:0): Direct-Access 8683MB (17783112 512 byte sectors) (ahc0:2:0): "SEAGATE ST19171W 0017" type 0 fixed SCSI 2 sd2(ahc0:2:0): Direct-Access 8683MB (17783112 512 byte sectors) ahc1 rev 0 int a irq 11 on pci1:5 ahc1: aic7880 Wide Channel B, SCSI Id=7, 16 SCBs ahc1 waiting for scsi devices to settle ahc1: Someone reset channel A Probing for devices on PCI 
bus 2: ahc2 rev 0 int a irq 11 on pci2:4 ahc2: aic7880 Wide Channel A, SCSI Id=7, 16 SCBs ahc2 waiting for scsi devices to settle ahc2: Someone reset channel A ahc3 rev 0 int a irq 10 on pci2:5 ahc3: aic7880 Wide Channel B, SCSI Id=7, 16 SCBs ahc3 waiting for scsi devices to settle ahc3: Someone reset channel A Probing for devices on the ISA bus: sc0 at 0x60-0x6f irq 1 on motherboard sc0: VGA color <16 virtual consoles, flags=0x0> sio0 at 0x3f8-0x3ff irq 4 on isa sio0: type 16550A sio1 at 0x2f8-0x2ff irq 3 on isa sio1: type 16550A fdc0 at 0x3f0-0x3f7 irq 6 drq 2 on isa fdc0: NEC 72065B npx0 on motherboard npx0: INT 16 interface changing root device to sd0a de0: enabling 10baseT port SMP: All idle procs online. SMP: Starting 1st AP! SMP: AP CPU #1 LAUNCHED!! Starting Scheduling... SMP: TADA! CPU #1 made it into the scheduler!. SMP: All 2 CPU's are online! From owner-freebsd-smp Thu Jan 30 00:52:33 1997 Return-Path: Received: (from root@localhost) by freefall.freebsd.org (8.8.5/8.8.5) id AAA28984 for smp-outgoing; Thu, 30 Jan 1997 00:52:33 -0800 (PST) Received: from clem.systemsix.com (clem.systemsix.com [198.99.86.131]) by freefall.freebsd.org (8.8.5/8.8.5) with SMTP id AAA28979 for ; Thu, 30 Jan 1997 00:52:30 -0800 (PST) Received: from localhost (localhost [127.0.0.1]) by clem.systemsix.com (8.6.12/8.6.12) with SMTP id BAA15052; Thu, 30 Jan 1997 01:52:14 -0700 Message-Id: <199701300852.BAA15052@clem.systemsix.com> X-Authentication-Warning: clem.systemsix.com: Host localhost didn't use HELO protocol X-Mailer: exmh version 1.6.5 12/11/95 From: Steve Passe To: Randy DuCharme cc: smp@FreeBSD.org Subject: Re: Dumb cvsup question In-reply-to: Your message of "Thu, 30 Jan 1997 00:30:58 CST." 
<32F04022.41C67EA6@nconnect.net>
Mime-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Date: Thu, 30 Jan 1997 01:52:14 -0700
Sender: owner-smp@FreeBSD.org
X-Loop: FreeBSD.org
Precedence: bulk

Hi,

> Greetings,
>
> I've been supping current almost nightly and I've been noticing a lot of
> kernel source updates in current. Should I be including these in the
> SMP sources ?? I've been keeping them separate thus far.

I would strongly suggest you continue to keep them separate. In fact, by
supping -current you run the risk of your "world" getting out of sync with
your "sys" tree and introducing subtle problems.

--
Steve Passe | powered by
smp@csn.net | FreeBSD

From owner-freebsd-smp Thu Jan 30 01:00:31 1997
Return-Path:
Received: (from root@localhost) by freefall.freebsd.org (8.8.5/8.8.5) id BAA29298 for smp-outgoing; Thu, 30 Jan 1997 01:00:31 -0800 (PST)
Received: from clem.systemsix.com (clem.systemsix.com [198.99.86.131]) by freefall.freebsd.org (8.8.5/8.8.5) with SMTP id BAA29292 for ; Thu, 30 Jan 1997 01:00:27 -0800 (PST)
Received: from localhost (localhost [127.0.0.1]) by clem.systemsix.com (8.6.12/8.6.12) with SMTP id BAA15096; Thu, 30 Jan 1997 01:59:52 -0700
Message-Id: <199701300859.BAA15096@clem.systemsix.com>
X-Authentication-Warning: clem.systemsix.com: Host localhost didn't use HELO protocol
X-Mailer: exmh version 1.6.5 12/11/95
From: Steve Passe
To: bag@bag.ru
cc: freebsd-smp@freebsd.org
Subject: Re: troubles with smp kernel
In-reply-to: Your message of "Thu, 30 Jan 1997 11:22:00 +0300."
<199701300822.LAA20712@sinbin.demos.su>
Mime-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Date: Thu, 30 Jan 1997 01:59:52 -0700
Sender: owner-smp@freebsd.org
X-Loop: FreeBSD.org
Precedence: bulk

Hi,

>all correct when kern.smp_active=0
>after sysctl -w kern.smp_active=2 bytebench looping on first test, probe 2
>kill -9 not help

first, you should be using a kernel with options APIC_IO and options
SMP_INVLTBL, although I doubt that is the cause of your problem.

you need to be much more specific in describing your problem for us to help.
what exactly is bytebench doing at this point?

--
Steve Passe | powered by
smp@csn.net | FreeBSD

From owner-freebsd-smp Thu Jan 30 02:15:21 1997
Return-Path:
Received: (from root@localhost) by freefall.freebsd.org (8.8.5/8.8.5) id CAA02506 for smp-outgoing; Thu, 30 Jan 1997 02:15:21 -0800 (PST)
Received: from kremvax.demos.su (kremvax.demos.su [194.87.0.20]) by freefall.freebsd.org (8.8.5/8.8.5) with SMTP id CAA02498 for ; Thu, 30 Jan 1997 02:15:06 -0800 (PST)
Received: by kremvax.demos.su (8.6.13/D) from 0@sinbin.demos.su [194.87.2.95] with ESMTP id NAA13496; Thu, 30 Jan 1997 13:12:22 +0300
Received: by sinbin.demos.su id NAA08072; (8.6.12/D) Thu, 30 Jan 1997 13:11:47 +0300
From: bag@sinbin.demos.su (Alex G.
Bulushev)
Message-Id: <199701301011.NAA08072@sinbin.demos.su>
Subject: Re: troubles with smp kernel
To: smp@csn.net (Steve Passe)
Date: Thu, 30 Jan 1997 13:11:47 +0300 (MSK)
Cc: freebsd-smp@freebsd.org
In-Reply-To: <199701300859.BAA15096@clem.systemsix.com> from "Steve Passe" at Jan 30, 97 01:59:52 am
X-Mailer: ELM [version 2.4 PL24 ME7a]
Content-Type: text
Sender: owner-smp@freebsd.org
X-Loop: FreeBSD.org
Precedence: bulk

>
> Hi,
>
> >all correct when kern.smp_active=0
> >after sysctl -w kern.smp_active=2 bytebench looping on first test, probe 2
> >kill -9 not help
>
> first, you should be using a kernel with options APIC_IO and options

Asus about IOAPIC:

Current Pentium Pro CPU Cards only support PIIX3 SMI so leave JP5 on default
settings (don't switch to APIC SMI) until future update ...

??

> SMP_INVLTBL, although I doubt that is the cause of your problem.
>
> you need to be much more specific in describing your problem for us to help.
> what exactly is bytebench doing at this point?

bytebench running dhrystone2 without register var.
src: dhry_1.c and dhry_2.c with undef REG in ps: /usr/local/lib/bytebench/pgms/dhry2 10 This is mptable output: MPTable, version 2.0.4 -- Processors: APIC ID Version State Family Model Step Flags 1 0x11 BSP, usable 6 1 6 0xfbff 0 0x11 AP, usable 6 1 7 0xfbff I/O APICs: APIC ID Version State Address 2 0x11 usable 0xfec00000 -- -- Local Ints: Type Polarity Trigger Bus ID IRQ APIC ID INT# ExtINT active-hi edge 3 0 255 0 NMI active-hi edge 3 0 255 1 options SMP # Symmetric MultiProcessor Kernel #options APIC_IO # Symmetric (APIC) I/O options NCPU=2 # number of CPUs options NBUS=4 # number of busses options NAPIC=1 # number of IO APICs options NINTR=16 # number of INTs =============================================================================== > -- > Steve Passe | powered by > smp@csn.net | FreeBSD > > From owner-freebsd-smp Thu Jan 30 09:28:49 1997 Return-Path: Received: (from root@localhost) by freefall.freebsd.org (8.8.5/8.8.5) id JAA20680 for smp-outgoing; Thu, 30 Jan 1997 09:28:49 -0800 (PST) Received: from kremvax.demos.su (kremvax.demos.su [194.87.0.20]) by freefall.freebsd.org (8.8.5/8.8.5) with SMTP id JAA20674 for ; Thu, 30 Jan 1997 09:28:44 -0800 (PST) Received: by kremvax.demos.su (8.6.13/D) from 0@megillah.demos.su [194.87.0.21] with ESMTP id UAA06133; Thu, 30 Jan 1997 20:27:28 +0300 Received: by megillah.demos.su id UAA19227; (8.8.3/D) Thu, 30 Jan 1997 20:27:46 +0300 (MSK) Message-Id: <199701301727.UAA19227@megillah.demos.su> Subject: Re: troubles with smp kernel To: smp@csn.net (Steve Passe) Date: Thu, 30 Jan 1997 20:27:46 +0300 (MSK) Cc: bag@bag.ru, freebsd-smp@freebsd.org, mishania@demos.su In-Reply-To: <199701300859.BAA15096@clem.systemsix.com> from "Steve Passe" at Jan 30, 97 01:59:52 am From: "Mikhail A. Sokolov" X-Class: Fast Organization: Demos Company, Ltd. 
Reply-To: mishania@demos.su
X-Mailer: ELM [version 2.4 PL24 ME7a]
Content-Type: text
Sender: owner-smp@freebsd.org
X-Loop: FreeBSD.org
Precedence: bulk

> Hi,
> first, you should be using a kernel with options APIC_IO and options
> SMP_INVLTBL, although I doubt that is the cause of your problem.

Thanks for the hint. I recompiled the beast's kernel with
APIC_IO/SMP_INVLTBL, but I still wonder about the following: where can a
description of SMP_INVLTBL and APIC_IO be found? SMP_INVLTBL doesn't even
seem to be announced at www.freebsd.org/~fsmp/SMP..

Here we currently have an ASUS dual PPro motherboard with two PPro 200s.
The FM of the motherboard says that APIC_IO should NOT be turned on until
a 'future upgrade', as Alex already mentioned. Of course, we tried turning
it on, and it works only when it is ON ;-) But the machine still reboots
without giving any clue to syslogd. Thus I guess it is more an issue for
the hardware@ list, but you guys seem to be more experienced with MP
motherboards, right? ;-)

What we have is attached at the end of this letter; in short, it's the
above-mentioned motherboard with 2x3940TUW (Twin Ultra Wide) Adaptecs in
slots 4 and 5, sharing IRQs. The FM of the motherboard claims that sharing
should not be a problem at all, _when the OS supports sharing correctly_.
Seems it doesn't :-(.

I would also be interested in hints on the RAM parity check this monster
does: I get "RAM PARITY SEGMENT CHECK FAILED in segment 0x0000, F1 to
disable NMI, F2 to reboot". Since I have already tested many different
SIMMs, 1 GB of them ;-), I can assert it is not a physical RAM problem,
but what then? This problem arose _only_ after I plugged in a second
identical processor, stolen from an HP Vectra VA Series 4. I swapped
processors as well and tested four, so they are not the culprits.

> you need to be much more specific in describing your problem for us to help.
> what exactly is bytebench doing at this point?

Returning to bytebench and Dhrystone in particular: the machine reboots
(see the whining above) while several concurrent shell scripts are
running, and at the points Alex already described. Maybe it is a matter of
incorrect handling of shared IRQs?

Another little problem (?) here: interesting behaviour of the SMP kernel
on halt. It yells something like "Oi, I am working on CPU #1, switching to
#0! HALT!". Why is that?

The SMP tree is as of today, fetched from scratch; the build is based on
3.0-19970124-SNAP; the mother-fatherboard is an Asus P/I-P65UP5/C-P6ND with
the BIOS set to MP ver1.4.

Thank you in advance for any hints, and please excuse me for so many
questions.

> Steve Passe | powered by

-mishania

P.S. Btw, what's that 'freeing is not implemented'? See below.

P.P.S. dmesg:

Jan 30 19:42:18 fyllefrossa halt: halted by mishania
Jan 30 19:42:18 fyllefrossa syslogd: exiting on signal 15
Jan 30 19:47:53 fyllefrossa /kernel: Copyright (c) 1992-1996 FreeBSD Inc.
Jan 30 19:47:53 fyllefrossa /kernel: Copyright (c) 1982, 1986, 1989, 1991, 1993
Jan 30 19:47:53 fyllefrossa /kernel: The Regents of the University of California. All rights reserved.
Jan 30 19:47:53 fyllefrossa /kernel:
Jan 30 19:47:53 fyllefrossa /kernel: FreeBSD 3.0-SMP #0: Thu Jan 30 19:42:05 MSK 1997
Jan 30 19:47:53 fyllefrossa /kernel: mishania@fyllefrossa.demos.su:/arc1/src/sys-SMP/compile/FYLLEFROSSA
Jan 30 19:47:53 fyllefrossa /kernel: FreeBSD/SMP: Multiprocessor motherboard
Jan 30 19:47:53 fyllefrossa /kernel: cpu0 (BSP): apic id: 1, version: 0x00040011
Jan 30 19:47:53 fyllefrossa /kernel: cpu1 (AP): apic id: 0, version: 0x00040011
Jan 30 19:47:54 fyllefrossa /kernel: io0 (APIC): apic id: 2, version: 0x00170011
Jan 30 19:47:54 fyllefrossa /kernel: Calibrating clock(s) relative to mc146818A clock ... i8254 clock: 1193157 Hz
Jan 30 19:47:54 fyllefrossa /kernel: CPU: Pentium Pro (686-class CPU)
Jan 30 19:47:54 fyllefrossa /kernel: Origin = "GenuineIntel" Id = 0x616 Stepping=6
Jan 30 19:47:54 fyllefrossa /kernel: Features=0xfbff,MTRR,PGE,MCA,CMOV>
Jan 30 19:47:54 fyllefrossa /kernel: real memory = 268435456 (262144K bytes)
Jan 30 19:47:54 fyllefrossa /kernel: avail memory = 257298432 (251268K bytes)
Jan 30 19:47:54 fyllefrossa /kernel: Probing for devices on PCI bus 0:
Jan 30 19:47:54 fyllefrossa /kernel: chip0 rev 2 on pci0:0
Jan 30 19:47:54 fyllefrossa /kernel: chip1 rev 1 on pci0:1:0
Jan 30 19:47:54 fyllefrossa /kernel: chip2 rev 0 on pci0:1:1
Jan 30 19:47:54 fyllefrossa /kernel: chip3 rev 2 on pci0:9
Jan 30 19:47:54 fyllefrossa /kernel: de0 rev 17 int a irq 18 on pci0:10
Jan 30 19:47:54 fyllefrossa /kernel: Freeing (NOT implimented) irq 12 for ISA cards.
Jan 30 19:47:54 fyllefrossa /kernel: de0: 21041 [10Mb/s] pass 1.1
Jan 30 19:47:54 fyllefrossa /kernel: de0: address 00:00:c0:74:8c:dc
Jan 30 19:47:54 fyllefrossa /kernel: chip4 rev 2 on pci0:12
Jan 30 19:47:55 fyllefrossa /kernel: Freeing (NOT implimented) irq 12 for ISA cards.
Jan 30 19:47:55 fyllefrossa /kernel: Probing for devices on PCI bus 1:
Jan 30 19:47:55 fyllefrossa /kernel: ahc0 rev 0 int a irq 19 on pci1:4
Jan 30 19:47:55 fyllefrossa /kernel: Freeing (NOT implimented) irq 9 for ISA cards.
Jan 30 19:47:55 fyllefrossa /kernel: ahc0: aic7880 Wide Channel A, SCSI Id=7, 16 SCBs
Jan 30 19:47:55 fyllefrossa /kernel: ahc0 waiting for scsi devices to settle
Jan 30 19:47:55 fyllefrossa /kernel: (ahc0:0:0): "SEAGATE ST32550W 0016" type 0 fixed SCSI 2
Jan 30 19:47:55 fyllefrossa /kernel: sd0(ahc0:0:0): Direct-Access 2047MB (4194058 512 byte sectors)
Jan 30 19:47:55 fyllefrossa /kernel: (ahc0:1:0): "SEAGATE ST19171W 0017" type 0 fixed SCSI 2
Jan 30 19:47:55 fyllefrossa /kernel: sd1(ahc0:1:0): Direct-Access 8683MB (17783112 512 byte sectors)
Jan 30 19:47:55 fyllefrossa /kernel: (ahc0:2:0): "SEAGATE ST19171W 0017" type 0 fixed SCSI 2
Jan 30 19:47:55 fyllefrossa /kernel: sd2(ahc0:2:0): Direct-Access 8683MB (17783112 512 byte sectors)
Jan 30 19:47:55 fyllefrossa /kernel: ahc1 rev 0 int a irq 16 on pci1:5
Jan 30 19:47:55 fyllefrossa /kernel: Freeing (NOT implimented) irq 11 for ISA cards.
Jan 30 19:47:55 fyllefrossa /kernel: ahc1: aic7880 Wide Channel B, SCSI Id=7, 16 SCBs
Jan 30 19:47:55 fyllefrossa /kernel: ahc1 waiting for scsi devices to settle
Jan 30 19:47:55 fyllefrossa /kernel: ahc1: Someone reset channel A
Jan 30 19:47:56 fyllefrossa /kernel: Probing for devices on PCI bus 2:
Jan 30 19:47:56 fyllefrossa /kernel: ahc2 rev 0 int a irq 19 on pci2:4
Jan 30 19:47:56 fyllefrossa /kernel: Freeing (NOT implimented) irq 11 for ISA cards.
Jan 30 19:47:56 fyllefrossa /kernel: ahc2: aic7880 Wide Channel A, SCSI Id=7, 16 SCBs
Jan 30 19:47:56 fyllefrossa /kernel: ahc2 waiting for scsi devices to settle
Jan 30 19:47:56 fyllefrossa /kernel: ahc2: Someone reset channel A
Jan 30 19:47:56 fyllefrossa /kernel: ahc3 rev 0 int a irq 16 on pci2:5
Jan 30 19:47:56 fyllefrossa /kernel: Freeing (NOT implimented) irq 10 for ISA cards.
Jan 30 19:47:56 fyllefrossa /kernel: ahc3: aic7880 Wide Channel B, SCSI Id=7, 16 SCBs
Jan 30 19:47:56 fyllefrossa /kernel: ahc3 waiting for scsi devices to settle
Jan 30 19:47:56 fyllefrossa /kernel: ahc3: Someone reset channel A
Jan 30 19:47:56 fyllefrossa /kernel: Probing for devices on the ISA bus:
Jan 30 19:47:56 fyllefrossa /kernel: sc0 at 0x60-0x6f irq 1 on motherboard
Jan 30 19:47:56 fyllefrossa /kernel: sc0: VGA color <16 virtual consoles, flags=0x0>
Jan 30 19:47:56 fyllefrossa /kernel: fdc0 at 0x3f0-0x3f7 irq 6 drq 2 on isa
Jan 30 19:47:56 fyllefrossa /kernel: fdc0: NEC 72065B
Jan 30 19:47:56 fyllefrossa /kernel: npx0 on motherboard
Jan 30 19:47:57 fyllefrossa /kernel: npx0: INT 16 interface
Jan 30 19:47:57 fyllefrossa /kernel: changing root device to sd0a
Jan 30 19:47:57 fyllefrossa /kernel: Enabled INTs: 1, 2, 6, 8, 16, 18, 19, imen: 0x00f2feb9
Jan 30 19:47:57 fyllefrossa /kernel: de0: enabling 10baseT port
Jan 30 19:47:57 fyllefrossa /kernel: SMP: All idle procs online.
Jan 30 19:47:58 fyllefrossa lpd[115]: restarted
Jan 30 19:48:01 fyllefrossa /kernel: SMP: Starting 1st AP!
Jan 30 19:48:01 fyllefrossa /kernel: SMP: AP CPU #1 LAUNCHED!! Starting Scheduling...
Jan 30 19:48:01 fyllefrossa /kernel: SMP: TADA! CPU #1 made it into the scheduler!.
Jan 30 19:48:01 fyllefrossa /kernel: SMP: All 2 CPU's are online!
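
[Archive note] An aside on the "Enabled INTs: 1, 2, 6, 8, 16, 18, 19, imen: 0x00f2feb9" line in the boot log above: the list and the hex value appear to say the same thing twice. The sketch below is not taken from the kernel source. It assumes that imen is a mask in which a set bit means the INT is disabled, and that only the low 24 bits are meaningful (24 is a guess that happens to reproduce the logged list). The function name enabled_ints is made up for illustration.

```python
def enabled_ints(imen, npins=24):
    """Return the INT numbers whose bit is clear in the mask `imen`.

    Assumptions (inferred from the single log line, not from kernel source):
    a set bit masks (disables) an INT, and only the `npins` low bits count.
    """
    return [pin for pin in range(npins) if not (imen >> pin) & 1]


if __name__ == "__main__":
    # Value copied from the "Enabled INTs" line in the log above.
    print(enabled_ints(0x00F2FEB9))
```

Under those assumptions the call returns 1, 2, 6, 8, 16, 18, 19, which matches the list the kernel printed; the entries above 15 are the same numbers that show up as "irq 16", "irq 18" and "irq 19" on the PCI devices in this boot log.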
From owner-freebsd-smp Thu Jan 30 09:40:50 1997 Return-Path: Received: (from root@localhost) by freefall.freebsd.org (8.8.5/8.8.5) id JAA21522 for smp-outgoing; Thu, 30 Jan 1997 09:40:50 -0800 (PST) Received: from clem.systemsix.com (clem.systemsix.com [198.99.86.131]) by freefall.freebsd.org (8.8.5/8.8.5) with SMTP id JAA21510 for ; Thu, 30 Jan 1997 09:40:46 -0800 (PST) Received: from localhost (localhost [127.0.0.1]) by clem.systemsix.com (8.6.12/8.6.12) with SMTP id KAA17552; Thu, 30 Jan 1997 10:39:57 -0700 Message-Id: <199701301739.KAA17552@clem.systemsix.com> X-Authentication-Warning: clem.systemsix.com: Host localhost didn't use HELO protocol X-Mailer: exmh version 1.6.5 12/11/95 From: Steve Passe To: bag@sinbin.demos.su (Alex G. Bulushev) cc: "Mikhail A. Sokolov" , freebsd-smp@FreeBSD.ORG Subject: Re: troubles with smp kernel In-reply-to: Your message of "Thu, 30 Jan 1997 13:11:47 +0300." <199701301011.NAA08072@sinbin.demos.su> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Date: Thu, 30 Jan 1997 10:39:57 -0700 Sender: owner-smp@FreeBSD.ORG X-Loop: FreeBSD.org Precedence: bulk Hi, > Asus about IOAPIC: > > Current Pentium Pro CPU Cards only support PIIX3 SMI so leave JP5 on default > settings (don't swith to APIC SMI) until future update ... what is J5 described to do? we have several people using this board with APIC_IO enabled, so I know its possible. --- > bytebench runing dhrstone2 without register var. src: dhry_1.c and dhry_2.c > with undef REG I don't know anything about this program and don't have time right now to research it so I will defer to others on this question. --- > This is mptable output: > > MPTable, version 2.0.4 > > -- > Processors: APIC ID Version State Family Model Step Flags > 1 0x11 BSP, usable 6 1 6 0xfbff > 0 0x11 AP, usable 6 1 7 0xfbff > > I/O APICs: APIC ID Version State Address > 2 0x11 usable 0xfec00000 > > -- is this area really missing or did you truncate the output? 
there should be a long list of INTerrupt associations here!!!

> --
> Local Ints:   Type    Polarity    Trigger   Bus ID   IRQ   APIC ID   INT#
>               ExtINT  active-hi   edge           3     0       255      0
>               NMI     active-hi   edge           3     0       255      1
>
>
>
> options SMP # Symmetric MultiProcessor Kernel
> #options APIC_IO # Symmetric (APIC) I/O
> options NCPU=2 # number of CPUs
> options NBUS=4 # number of busses
> options NAPIC=1 # number of IO APICs
> options NINTR=16 # number of INTs

note that while answering this letter another mailing came in from the same
site with more detail on some of the above issues so I will continue
this with an answer to that mailing. I will be referring to the above
mptable line showing "NINTR=16" in it...

--
Steve Passe | powered by
smp@csn.net | FreeBSD

From owner-freebsd-smp Thu Jan 30 09:45:35 1997
Return-Path:
Received: (from root@localhost) by freefall.freebsd.org (8.8.5/8.8.5) id JAA21745 for smp-outgoing; Thu, 30 Jan 1997 09:45:35 -0800 (PST)
Received: from kremvax.demos.su (kremvax.demos.su [194.87.0.20]) by freefall.freebsd.org (8.8.5/8.8.5) with SMTP id JAA21737 for ; Thu, 30 Jan 1997 09:45:26 -0800 (PST)
Received: by kremvax.demos.su (8.6.13/D) from 0@sinbin.demos.su [194.87.2.95] with ESMTP id UAA10694; Thu, 30 Jan 1997 20:44:27 +0300
Received: by sinbin.demos.su id UAA15388; (8.6.12/D) Thu, 30 Jan 1997 20:43:43 +0300
From: bag@sinbin.demos.su (Alex G. Bulushev)
Message-Id: <199701301743.UAA15388@sinbin.demos.su>
Subject: Re: troubles with smp kernel
To: mishania@demos.su
Date: Thu, 30 Jan 1997 20:43:43 +0300 (MSK)
Cc: smp@csn.net, freebsd-smp@freebsd.org
In-Reply-To: <199701301727.UAA19227@megillah.demos.su> from "Mikhail A. Sokolov" at Jan 30, 97 08:27:46 pm
X-Mailer: ELM [version 2.4 PL24 ME7a]
Content-Type: text
Sender: owner-smp@freebsd.org
X-Loop: FreeBSD.org
Precedence: bulk

>
> > Hi,
> > first, you should be using a kernel with options APIC_IO and options
> > SMP_INVLTBL, although I doubt that is the cause of your problem.
            ^^^ TLB

It works!
but reboots once an hour :(

> I would be also interested in hints on RAM parity check this monster does: I
> get "RAM PARITY SEGMENT CHECK FAILED in segment 0x0000, F1 to disable NMI,
> F2 to reboot". Since I already tested many different SIMM's, 1 Gb of them ;-)

static electricity killed mishania and now there is no PARITY ERROR !!
why ?

Alex.

From owner-freebsd-smp Thu Jan 30 10:01:17 1997
Return-Path:
Received: (from root@localhost) by freefall.freebsd.org (8.8.5/8.8.5) id KAA22491 for smp-outgoing; Thu, 30 Jan 1997 10:01:17 -0800 (PST)
Received: from kremvax.demos.su (kremvax.demos.su [194.87.0.20]) by freefall.freebsd.org (8.8.5/8.8.5) with SMTP id KAA22486 for ; Thu, 30 Jan 1997 10:01:12 -0800 (PST)
Received: by kremvax.demos.su (8.6.13/D) from 0@megillah.demos.su [194.87.0.21] with ESMTP id UAA14946; Thu, 30 Jan 1997 20:59:28 +0300
Received: by megillah.demos.su id UAA25356; (8.8.3/D) Thu, 30 Jan 1997 20:59:39 +0300 (MSK)
Message-Id: <199701301759.UAA25356@megillah.demos.su>
Subject: Re: troubles with smp kernel
To: smp@csn.net (Steve Passe)
Date: Thu, 30 Jan 1997 20:59:39 +0300 (MSK)
Cc: bag@sinbin.demos.su, mishania@demos.su, freebsd-smp@FreeBSD.ORG
In-Reply-To: <199701301739.KAA17552@clem.systemsix.com> from "Steve Passe" at Jan 30, 97 10:39:57 am
From: "Mikhail A. Sokolov"
X-Class: Fast
Organization: Demos Company, Ltd.
Reply-To: mishania@demos.su
X-Mailer: ELM [version 2.4 PL24 ME7a]
Content-Type: text
Sender: owner-smp@FreeBSD.ORG
X-Loop: FreeBSD.org
Precedence: bulk

> Hi,
> > Current Pentium Pro CPU Cards only support PIIX3 SMI so leave JP5 on default
> > settings (don't switch to APIC SMI) until future update ...
> what is J5 described to do? we have several people using this board with
> APIC_IO enabled, so I know it's possible.

The Fine Manual says we should leave JP5 set to handle PIIX3 SMI and never
turn APIC ON. We turned it on, of course, and it has worked only since then.

> is this area really missing or did you truncate the output?
there should be > a long list of INTerrupt associations here!!! > note that while answering this letter another mailing came in from the same > site with more detail on some of the above issues so I will continue > this with an answer to that mailing. I will be refering to the above > mptable line showing "NINTR=16" in it... Seems like it was my letter, but I didn't include mptable output then, here we all have it. But, I see it lies, - I _have_ APIC_IO uncommented ... Since it was truncated, here comes nowadays variant: {fyllefrossa}/home/mishania> ./mptable =============================================================================== MPTable, version 2.0.4 ------------------------------------------------------------------------------- MP Floating Pointer Structure: location: BIOS physical address: 0x000f60b0 signature: '_MP_' length: 16 bytes version: 1.4 checksum: 0x8b mode: Virtual Wire ------------------------------------------------------------------------------- MP Config Table Header: physical address: 0x000f5caa signature: 'PCMP' base table length: 268 version: 1.4 checksum: 0xf9 OEM ID: 'OEM00000' Product ID: 'PROD00000000' OEM table pointer: 0x00000000 OEM table size: 0 entry count: 25 local APIC address: 0xfee00000 extended table length: 0 extended table checksum: 0 ------------------------------------------------------------------------------- MP Config Base Table Entries: -- Processors: APIC ID Version State Family Model Step Flags 1 0x11 BSP, usable 6 1 6 0xfbff 0 0x11 AP, usable 6 1 7 0xfbff -- Bus: Bus ID Type 0 PCI 1 PCI 2 PCI 3 ISA -- I/O APICs: APIC ID Version State Address 2 0x11 usable 0xfec00000 -- I/O Ints: Type Polarity Trigger Bus ID IRQ APIC ID INT# ExtINT conforms conforms 3 0 2 0 INT conforms conforms 3 1 2 1 INT conforms conforms 3 0 2 2 INT conforms conforms 3 3 2 3 INT conforms conforms 3 4 2 4 INT conforms conforms 3 5 2 5 INT conforms conforms 3 6 2 6 INT conforms conforms 3 7 2 7 INT conforms conforms 3 8 2 8 INT conforms 
conforms 3 14 2 14 INT conforms conforms 3 15 2 15 INT active-lo level 1 4:A 2 19 INT active-lo level 1 5:A 2 16 INT active-lo level 0 10:A 2 18 INT active-lo level 2 4:A 2 16 INT active-lo level 2 5:A 2 17 -- Local Ints: Type Polarity Trigger Bus ID IRQ APIC ID INT# ExtINT active-hi edge 3 0 255 0 NMI active-hi edge 3 0 255 1 ------------------------------------------------------------------------------- # SMP kernel config file options: options SMP # Symmetric MultiProcessor Kernel #options APIC_IO # Symmetric (APIC) I/O options NCPU=2 # number of CPUs options NBUS=4 # number of busses options NAPIC=1 # number of IO APICs options NINTR=16 # number of INTs > -- > Steve Passe | powered by > smp@csn.net | FreeBSD Thanks! -mishania P.S. kernel 'config': {fyllefrossa}/home/mishania> more /sys/i386/conf/FYLLEFROSSA # # GENERIC -- Generic machine with WD/AHx/NCR/BTx family disks # # For more information read the handbook part System Administration -> # Configuring the FreeBSD Kernel -> The Configuration File. # The handbook is available in /usr/share/doc/handbook or online as # latest version from the FreeBSD World Wide Web server # # # An exhaustive list of options and more detailed explanations of the # device lines is present in the ./LINT configuration file. If you are # in doubt as to the purpose or necessity of a line, check first in LINT. # # $FreeBSD$ machine "i386" cpu "I686_CPU" ident FYLLEFROSSA maxusers 32 options MATH_EMULATE #Support for x87 emulation options INET #InterNETworking options FFS #Berkeley Fast Filesystem options PROCFS #Process filesystem options "COMPAT_43" #Compatible with BSD 4.3 [KEEP THIS!] options SCSI_DELAY=15 #Be pessimistic about Joe SCSI device options BOUNCE_BUFFERS #include support for DMA bounce buffers options UCONSOLE #Allow users to grab the console options USERCONFIG #boot -c editor options VISUAL_USERCONFIG #visual boot -c editor config kernel root on wd0 controller isa0 controller pci0 controller fdc0 at isa? 
port "IO_FD1" bio irq 6 drq 2 vector fdintr disk fd0 at fdc0 drive 0 controller ahc0 controller scbus0 device sd0 # syscons is the default console driver, resembling an SCO console device sc0 at isa? port "IO_KBD" tty irq 1 vector scintr # Enable this and PCVT_FREEBSD for pcvt vt220 compatible console driver # Mandatory, don't remove device npx0 at isa? port "IO_NPX" irq 13 vector npxintr # Order is important here due to intrusive probes, do *not* alphabetize # this list of network interfaces until the probes have been fixed. # Right now it appears that the ie0 must be probed before ep0. See # revision 1.20 of this file. device de0 pseudo-device loop pseudo-device ether pseudo-device log pseudo-device tun 1 pseudo-device pty 16 pseudo-device gzip # Exec gzipped a.out's # KTRACE enables the system-call tracing facility ktrace(2). # This adds 4 KB bloat to your kernel, and slightly increases # the costs of each syscall. options KTRACE #kernel tracing options "MAXMEM=(1024*256)" options SMP #mishania options NCPU=2 # number of CPUs #mishania options NBUS=4 # number of busses #mishania options NAPIC=1 # number of IO APICs #mishania options NINTR=16 # number of INTsA #mishania options APIC_IO # Steven. options SMP_INVLTBL # Steven. From owner-freebsd-smp Thu Jan 30 10:03:50 1997 Return-Path: Received: (from root@localhost) by freefall.freebsd.org (8.8.5/8.8.5) id KAA22537 for smp-outgoing; Thu, 30 Jan 1997 10:03:50 -0800 (PST) Received: from kremvax.demos.su (kremvax.demos.su [194.87.0.20]) by freefall.freebsd.org (8.8.5/8.8.5) with SMTP id KAA22526 for ; Thu, 30 Jan 1997 10:03:40 -0800 (PST) Received: by kremvax.demos.su (8.6.13/D) from 0@sinbin.demos.su [194.87.2.95] with ESMTP id VAA15365; Thu, 30 Jan 1997 21:01:27 +0300 Received: by sinbin.demos.su id VAA18189; (8.6.12/D) Thu, 30 Jan 1997 21:00:48 +0300 From: bag@sinbin.demos.su (Alex G. 
Bulushev)
Message-Id: <199701301800.VAA18189@sinbin.demos.su>
Subject: Re: troubles with smp kernel
To: smp@csn.net (Steve Passe)
Date: Thu, 30 Jan 1997 21:00:48 +0300 (MSK)
Cc: freebsd-smp@freebsd.org
In-Reply-To: <199701301739.KAA17552@clem.systemsix.com> from "Steve Passe" at Jan 30, 97 10:39:57 am
X-Mailer: ELM [version 2.4 PL24 ME7a]
Content-Type: text
Sender: owner-smp@freebsd.org
X-Loop: FreeBSD.org
Precedence: bulk

Hi,

> > Asus about IOAPIC:
> >
> > Current Pentium Pro CPU Cards only support PIIX3 SMI so leave JP5 on default
> > settings (don't switch to APIC SMI) until future update ...

> what is JP5 described to do?  we have several people using this board with
> APIC_IO enabled, so I know it's possible.

Asus writes:

SMI Settings (JP5)
SMI is asserted by either PIIX3 chip (only one CPU can accept SMI) or
IOAPIC chip (two CPU's can accept SMI). Current Pentium Pro CPU Card
only support PIIX3 SMI so leave on default settings until future update

    SMI          JP5
    PIIX3 SMI    [1-2] (default)
    APIC SMI     [2-3]

> > Processors: APIC ID  Version  State        Family  Model  Step  Flags
> >             1        0x11     BSP, usable  6       1      6     0xfbff
> >             0        0x11     AP, usable   6       1      7     0xfbff
> >
> > I/O APICs:  APIC ID  Version  State        Address
> >             2        0x11     usable       0xfec00000
> >
> > --
>
> is this area really missing or did you truncate the output?  there should be
> a long list of INTerrupt associations here!!!
this is a real output with JP5 default settings (PIIX3 SMI)

now mptable output for JP5 in APIC SMI position:

===============================================================================

MPTable, version 2.0.4

-------------------------------------------------------------------------------

MP Floating Pointer Structure:

  location:                 BIOS
  physical address:         0x000f60b0
  signature:                '_MP_'
  length:                   16 bytes
  version:                  1.4
  checksum:                 0x8b
  mode:                     Virtual Wire

-------------------------------------------------------------------------------

MP Config Table Header:

  physical address:         0x000f5caa
  signature:                'PCMP'
  base table length:        268
  version:                  1.4
  checksum:                 0xf9
  OEM ID:                   'OEM00000'
  Product ID:               'PROD00000000'
  OEM table pointer:        0x00000000
  OEM table size:           0
  entry count:              25
  local APIC address:       0xfee00000
  extended table length:    0
  extended table checksum:  0

-------------------------------------------------------------------------------

MP Config Base Table Entries:

--
Processors:  APIC ID  Version  State        Family  Model  Step  Flags
             1        0x11     BSP, usable  6       1      6     0xfbff
             0        0x11     AP, usable   6       1      7     0xfbff
--
Bus:         Bus ID  Type
             0       PCI
             1       PCI
             2       PCI
             3       ISA
--
I/O APICs:   APIC ID  Version  State   Address
             2        0x11     usable  0xfec00000
--
I/O Ints:    Type    Polarity   Trigger   Bus ID  IRQ   APIC ID  INT#
             ExtINT  conforms   conforms  3       0     2        0
             INT     conforms   conforms  3       1     2        1
             INT     conforms   conforms  3       0     2        2
             INT     conforms   conforms  3       3     2        3
             INT     conforms   conforms  3       4     2        4
             INT     conforms   conforms  3       5     2        5
             INT     conforms   conforms  3       6     2        6
             INT     conforms   conforms  3       7     2        7
             INT     conforms   conforms  3       8     2        8
             INT     conforms   conforms  3       14    2        14
             INT     conforms   conforms  3       15    2        15
             INT     active-lo  level     1       4:A   2        19
             INT     active-lo  level     1       5:A   2        16
             INT     active-lo  level     0       10:A  2        18
             INT     active-lo  level     2       4:A   2        16
             INT     active-lo  level     2       5:A   2        17
--
Local Ints:  Type    Polarity   Trigger   Bus ID  IRQ   APIC ID  INT#
             ExtINT  active-hi  edge      3       0     255      0
             NMI     active-hi  edge      3       0     255      1

-------------------------------------------------------------------------------

# SMP kernel config file options:

options SMP #
Symmetric MultiProcessor Kernel #options APIC_IO # Symmetric (APIC) I/O options NCPU=2 # number of CPUs options NBUS=4 # number of busses options NAPIC=1 # number of IO APICs options NINTR=16 # number of INTs =============================================================================== From owner-freebsd-smp Thu Jan 30 10:09:59 1997 Return-Path: Received: (from root@localhost) by freefall.freebsd.org (8.8.5/8.8.5) id KAA22870 for smp-outgoing; Thu, 30 Jan 1997 10:09:59 -0800 (PST) Received: from clem.systemsix.com (clem.systemsix.com [198.99.86.131]) by freefall.freebsd.org (8.8.5/8.8.5) with SMTP id KAA22856 for ; Thu, 30 Jan 1997 10:09:33 -0800 (PST) Received: from localhost (localhost [127.0.0.1]) by clem.systemsix.com (8.6.12/8.6.12) with SMTP id LAA17738; Thu, 30 Jan 1997 11:08:30 -0700 Message-Id: <199701301808.LAA17738@clem.systemsix.com> X-Authentication-Warning: clem.systemsix.com: Host localhost didn't use HELO protocol X-Mailer: exmh version 1.6.5 12/11/95 From: Steve Passe To: mishania@demos.su cc: bag@bag.ru, freebsd-smp@freebsd.org Subject: Re: troubles with smp kernel In-reply-to: Your message of "Thu, 30 Jan 1997 20:27:46 +0300." <199701301727.UAA19227@megillah.demos.su> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Date: Thu, 30 Jan 1997 11:08:29 -0700 Sender: owner-smp@freebsd.org X-Loop: FreeBSD.org Precedence: bulk Hi, >> first, you should be using a kernel with options APIC_IO and options >> SMP_INVLTBL, although I doubt that is the cause of your problem. > >Thanks fot the hint, I recompiled the beasts kernel with APIC_IO/SMP_INVLTBL, >but I still wonder about the following: SMP_INVLTBL, APIC_IO, where can >description be found ? SMP_INVLTBL doesn't even seem to be announced at >www.freebsd.org/~fsmp/SMP.. it isn't, the problem is one of resources. as you know we're all volunteers, doing this in our "free time", of which there usually isn't much! So we can either write code or document code, you know what we will choose.... 
basically SMP_INVLTBL is code that ensures that other CPUs invalidate their
TLBs when virtual memory changes require it.  It's a recent addition
and thus not documented.  A search of the FreeBSD SMP mail archive
would pop up a lot of discussion of it.

---

>Here, we currently have ASUS dual ppro mother with two ppro200's,
>FM of the motherboard says, that APIC_IO should NOT be turned on,
>until 'future upgrade', as Alex already mentioned. Of course, we tried turning
>it on, and it works only with it's ON ;-) But the machine still reboots
>not giving any clue to syslogd. Thus I guess it is more issue for hardware@

I am missing something here, you say it reboots, but I see output in this
letter showing it running, specifically under what circumstances does it
reboot?

---

>list, you guys seem to be more experienced with MP motherboards, right? ;-)
>What we have is attached at the end of the letter of mine; to be short it's
>the above mentioned mother, 2x3940TUW (Twin Ultra Wide) Adaptecs in slots
>4 and 5, sharing irq's. The FM of the motherboard claims, that it might not
>be any problem at all to have it shared, _when an OS supports sharing
>correct_. Seems it doesn't :-(.

It does, and it does support the 3940 IF the motherboard knows how to handle
bridged PCI cards (the 3940 has a PCI bridge chip on it).  This motherboard
is known to properly support the 3940 if correctly set up.  Check your BIOS
for a setting that describes the MP spec level.  It will give you a choice
between version 1.1 and 1.4.  Set it to 1.4.  Running at version 1.1 will
cause the 3940s to fail miserably.  Again a search of the
SMP mail archive for 3940 should provide you with a lot of info
on what we did to ensure that they work (and work with shared INTs).

---

I would be also interested in hints on RAM parity check this monster does:
I get "RAM PARITY SEGMENT CHECK FAILED in segment 0x0000, F1 to disable NMI,
F2 to reboot".
Since I already tested many different SIMM's, 1 Gb of them ;-) I can assert
it not to be RAM physical problem, - but what then? This problem arose _only_
after I plugged second identical processor, stolen from HP Vectra's VA
Series 4. I changed processors also, tested four, so they are not culprits.

This is indeed puzzling, and if it wasn't associated with plugging in the
second CPU I would say it has NOTHING to do with this list, but it does so...
I would guess that either there is a hardware problem with the motherboard
or that something is misconfigured in the BIOS.  Either way I think it is
a question for ASUS support.

---

>> you need to be much more specific in describing your problem for us to help.
>> what exactly is bytebench doing at this point?
>
>Returning to bytebench and Dhrystones in particular, machine reboots (see
>whining above also) in process of several concurrent shell scripts and on points
>Alex already described also. Maybe that's the matter if incorrect IRQ's
>sharing handling?

good possibility that something with INTs is causing problems.  As I stated in
last response, there is something very wrong with the contents of the MP-table
you sent.  there is that missing section, and the line:

options         NINTR=16                # number of INTs
                      ^^

when there are 3940 cards in the system this number should be different,
the bridge causes additional INT sources.  also there was the missing INT
section of the table I mentioned previously.

in summary, make sure the BIOS is set for MP spec version 1.4, build a kernel
with APIC_IO & SMP_INVLTLB, boot it and run "mptable -dmesg -verbose",
send us the results.

---

>Another little problem (?) here also, - interesting behaviour of SMP kernel on
>my halt: it yells something like "Oi, I am working on CPU #1, switching to #0!
>HALT!". Why is that?

nothing to worry about, it's just saying that CPU #1 was the CPU that was
handed the job of shutting down the system (50-50 chance of this happening!)
and that it is stopping and letting CPU #0 do it. This is necessary to ensure an orderly shutdown, flushing of virtual memory, cache, etc. --- SMP tree is as af today, fetched it from scratch, build is done on basis of 3.0-19970124-SNAP, mother-fatherboard is Asus P/I-P65UP5/C-P6ND, BIOS set to MP ver1.4. OK, hadn't seen confirmation of ver1.4 b4, again give me the complete mptable -dmesg output as requested above. -- Steve Passe | powered by smp@csn.net | FreeBSD -----BEGIN PGP PUBLIC KEY BLOCK----- Version: 2.6.2 mQCNAzHe7tEAAAEEAM274wAEEdP+grIrV6UtBt54FB5ufifFRA5ujzflrvlF8aoE 04it5BsUPFi3jJLfvOQeydbegexspPXL6kUejYt2OeptHuroIVW5+y2M2naTwqtX WVGeBP6s2q/fPPAS+g+sNZCpVBTbuinKa/C4Q6HJ++M9AyzIq5EuvO0a8Rr9AAUR tBlTdGV2ZSBQYXNzZSA8c21wQGNzbi5uZXQ+ =ds99 -----END PGP PUBLIC KEY BLOCK----- From owner-freebsd-smp Thu Jan 30 10:23:24 1997 Return-Path: Received: (from root@localhost) by freefall.freebsd.org (8.8.5/8.8.5) id KAA23594 for smp-outgoing; Thu, 30 Jan 1997 10:23:24 -0800 (PST) Received: from kremvax.demos.su (kremvax.demos.su [194.87.0.20]) by freefall.freebsd.org (8.8.5/8.8.5) with SMTP id KAA23587 for ; Thu, 30 Jan 1997 10:23:19 -0800 (PST) Received: by kremvax.demos.su (8.6.13/D) from 0@megillah.demos.su [194.87.0.21] with ESMTP id VAA21409; Thu, 30 Jan 1997 21:22:27 +0300 Received: by megillah.demos.su id VAA09279; (8.8.3/D) Thu, 30 Jan 1997 21:22:54 +0300 (MSK) Message-Id: <199701301822.VAA09279@megillah.demos.su> Subject: Re: troubles with smp kernel To: smp@csn.net (Steve Passe) Date: Thu, 30 Jan 1997 21:22:53 +0300 (MSK) Cc: bag@demos.su (Alex G. Bulushev), smp@freebsd.org In-Reply-To: <199701301808.LAA17738@clem.systemsix.com> from "Steve Passe" at Jan 30, 97 11:08:29 am From: "Mikhail A. Sokolov" X-Class: Fast Organization: Demos Company, Ltd. 
Reply-To: mishania@demos.su
X-Mailer: ELM [version 2.4 PL24 ME7a]
Content-Type: text
Sender: owner-smp@freebsd.org
X-Loop: FreeBSD.org
Precedence: bulk

> Hi,

> basically SMP_INVLTBL is code that ensures that other CPUs invalidate their
> TLBs when virtual memory changes require it.  It's a recent addition
> and thus not documented.  A search of the FreeBSD SMP mail archive
> would pop up a lot of discussion of it.

Ok, sorry ;-)

> I am missing something here, you say it reboots, but I see output in this
> letter showing it running, specifically under what circumstances does it
> reboot?

Let's say it this way: "Hell knows" :
 9:13PM  up 27 mins, 2 users, load averages: 0.02, 0.02, 0.00

Neither anything on the console, nor in the syslog. I still think INT/IRQ
sharing system to be culprit, now from the hardware's side.

> SMP mail archive for 3940 should provide you with a lot of info
> on what we did to ensure that they work (and work with shared INTs).

I sent mptable, dmesg separately though, - it's 1.4 and it was always 1.4

> This is indeed puzzling, and if it wasn't associated with plugging in the
> second CPU I would say it has NOTHING to do with this list, but it does so...
> I would guess that either there is a hardware problem with the motherboard
> or that something is misconfigured in the BIOS.  Either way I think it is
> a question for ASUS support.

As Alex said, I was almost killed by static electricity of the machine half
an hour ago, - this might be the problem. After that, during all our debates
today last 1,5 hours, machine rebooted three times, but this RAM error is
gone. I pray it is, I mean.
Your help is much appreciated, -mishania > Steve Passe | powered by > smp@csn.net | FreeBSD From owner-freebsd-smp Thu Jan 30 10:46:01 1997 Return-Path: Received: (from root@localhost) by freefall.freebsd.org (8.8.5/8.8.5) id KAA24427 for smp-outgoing; Thu, 30 Jan 1997 10:46:01 -0800 (PST) Received: from housing1.stucen.gatech.edu (ken@housing1.stucen.gatech.edu [130.207.52.71]) by freefall.freebsd.org (8.8.5/8.8.5) with ESMTP id KAA24422 for ; Thu, 30 Jan 1997 10:45:57 -0800 (PST) Received: (from ken@localhost) by housing1.stucen.gatech.edu (8.8.5/8.8.5) id NAA22578; Thu, 30 Jan 1997 13:44:56 -0500 (EST) From: Kenneth Merry Message-Id: <199701301844.NAA22578@housing1.stucen.gatech.edu> Subject: Re: troubles with smp kernel In-Reply-To: <199701301759.UAA25356@megillah.demos.su> from "Mikhail A. Sokolov" at "Jan 30, 97 08:59:39 pm" To: mishania@demos.su Date: Thu, 30 Jan 1997 13:44:55 -0500 (EST) Cc: smp@csn.net, bag@sinbin.demos.su, mishania@demos.su, freebsd-smp@FreeBSD.ORG X-Mailer: ELM [version 2.4ME+ PL25 (25)] MIME-Version: 1.0 Content-Type: text/plain; charset=US-ASCII Content-Transfer-Encoding: 7bit Sender: owner-smp@FreeBSD.ORG X-Loop: FreeBSD.org Precedence: bulk Mikhail A. Sokolov wrote... [ ... deleted ... ] I don't have much time to write just now, but I figured I'd send a few suggestions since I've got a very similar setup to yours. (I've got an ASUS P/I-P65UP5 with 2 P6-200's as well, 128MB RAM, Adaptec 3940UW, a Matrox Millineum, 3 SMC-type network cards, and a gravis ultrasound in my machine) > P.S. kernel 'config': > {fyllefrossa}/home/mishania> more /sys/i386/conf/FYLLEFROSSA > # > # GENERIC -- Generic machine with WD/AHx/NCR/BTx family disks > # > # For more information read the handbook part System Administration -> > # Configuring the FreeBSD Kernel -> The Configuration File. 
> # The handbook is available in /usr/share/doc/handbook or online as > # latest version from the FreeBSD World Wide Web server > # > # > # An exhaustive list of options and more detailed explanations of the > # device lines is present in the ./LINT configuration file. If you are > # in doubt as to the purpose or necessity of a line, check first in LINT. > # > # $FreeBSD$ > > machine "i386" > cpu "I686_CPU" > ident FYLLEFROSSA > maxusers 32 > > options MATH_EMULATE #Support for x87 emulation You should probably take this *out* of the system, unless you're trying the patch that Brian Litzinger sent to force the system to use FP emulation. When I accidentally left MATH_EMULATE in my config file, X, or really any FP worth mentioning would lock up the machine. This *could* be the problem you're experiencing. > options INET #InterNETworking > options FFS #Berkeley Fast Filesystem > options PROCFS #Process filesystem > options "COMPAT_43" #Compatible with BSD 4.3 [KEEP THIS!] > options SCSI_DELAY=15 #Be pessimistic about Joe SCSI device > options BOUNCE_BUFFERS #include support for DMA bounce buffers You really don't need BOUNCE_BUFFERS unless you've got an Adaptec 1542 or some other ISA card that does 24-bit DMA. I didn't see one in your dmesg output.. > options UCONSOLE #Allow users to grab the console > options USERCONFIG #boot -c editor > options VISUAL_USERCONFIG #visual boot -c editor > > config kernel root on wd0 This should probably be 'root on sd0' if you've only got SCSI disks, but I'm not sure how much it matters...someone else could comment more on that. > controller isa0 > controller pci0 > > controller fdc0 at isa? port "IO_FD1" bio irq 6 drq 2 vector fdintr > disk fd0 at fdc0 drive 0 > > controller ahc0 > > controller scbus0 > > device sd0 > > # syscons is the default console driver, resembling an SCO console > device sc0 at isa? 
port "IO_KBD" tty irq 1 vector scintr > # Enable this and PCVT_FREEBSD for pcvt vt220 compatible console driver > # Mandatory, don't remove > device npx0 at isa? port "IO_NPX" irq 13 vector npxintr > > # Order is important here due to intrusive probes, do *not* alphabetize > # this list of network interfaces until the probes have been fixed. > # Right now it appears that the ie0 must be probed before ep0. See > # revision 1.20 of this file. > device de0 > > pseudo-device loop > pseudo-device ether > pseudo-device log > pseudo-device tun 1 > pseudo-device pty 16 > pseudo-device gzip # Exec gzipped a.out's > > # KTRACE enables the system-call tracing facility ktrace(2). > # This adds 4 KB bloat to your kernel, and slightly increases > # the costs of each syscall. > options KTRACE #kernel tracing > options "MAXMEM=(1024*256)" > options SMP #mishania > options NCPU=2 # number of CPUs #mishania > options NBUS=4 # number of busses #mishania > options NAPIC=1 # number of IO APICs #mishania > options NINTR=16 # number of INTsA #mishania Even though mptable tells you to use NINTR=16, I'd suggest bumping it to 24 anyway, in case you add any more cards.. > options APIC_IO # Steven. > options SMP_INVLTBL # Steven. I think someone else already mentioned it, but that should be: options SMP_INVLTLB I think that's TLB as in "Table Lookaside Buffer". Anyway, if you think it will help, I can mail you my kernel config file. Good luck, Ken -- Kenneth Merry ken@ulc199.residence.gatech.edu Disclaimer: I don't speak for GTRI, GT, or Elvis. 
From owner-freebsd-smp Thu Jan 30 11:35:08 1997
Return-Path:
Received: (from root@localhost) by freefall.freebsd.org (8.8.5/8.8.5) id LAA26633 for smp-outgoing; Thu, 30 Jan 1997 11:35:08 -0800 (PST)
Received: from clem.systemsix.com (clem.systemsix.com [198.99.86.131]) by freefall.freebsd.org (8.8.5/8.8.5) with SMTP id LAA26622 for ; Thu, 30 Jan 1997 11:35:01 -0800 (PST)
Received: from localhost (localhost [127.0.0.1]) by clem.systemsix.com (8.6.12/8.6.12) with SMTP id MAA18162; Thu, 30 Jan 1997 12:34:19 -0700
Message-Id: <199701301934.MAA18162@clem.systemsix.com>
X-Authentication-Warning: clem.systemsix.com: Host localhost didn't use HELO protocol
X-Mailer: exmh version 1.6.5 12/11/95
From: Steve Passe
To: bag@sinbin.demos.su (Alex G. Bulushev)
cc: mishania@demos.su, freebsd-smp@freebsd.org
Subject: Re: troubles with smp kernel
In-reply-to: Your message of "Thu, 30 Jan 1997 21:00:48 +0300." <199701301800.VAA18189@sinbin.demos.su>
Mime-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Date: Thu, 30 Jan 1997 12:34:19 -0700
Sender: owner-smp@freebsd.org
X-Loop: FreeBSD.org
Precedence: bulk

Hi,

( note that I am answering 3 consecutive mailings in this response.
  I think I have identified the problem, but don't get to that till the end,
  so read on... )

---

>> > first, you should be using a kernel with options APIC_IO and options
>> > SMP_INVLTBL, although I doubt that is the cause of your problem.
>            ^^^ TLB sorry...

>It works! but reboots once a hour :(

does it reboot EXACTLY once every hour, or just approximately that often?
What I am asking is if it might be associated with some job being run by
cron, etc.

---

>static electricity killed mishania and now there is no PARITY ERROR !!
>why ?

in america we have a saying: don't look a gift horse in the mouth.
I guess the translation is, it works now, so don't complain!

Seriously speaking, this all points to hardware problems of some sort.
Might be something as simple as a loose SIMM.
I would power down the machine, reseat the SIMMs, pull and re-insert all
cards, including the CPU card and CPUs.  Then do something about static
control, spray static control on the carpet, or whatever, static can destroy
a machine!!!  You might try another location, perhaps there's a bad
electrical socket there (bad ground leg, electrical noise, etc.)  Make sure
the box is on a surge/noise outlet strip or UPS.

---

>Fine Manual says we should leave JP5 to handle PIIX3 SMI, never turning APIC ON.
>We turned it on, of course, and it works only from then.

manuals for these things are often misleading or incorrect.  Unfortunately
you often have to "read between the lines" or even disbelieve everything they
say and just experiment.  It looks like you have found the right combination.

---

>Seems like it was my letter, but I didn't include mptable output then, here we
>all have it. But, I see it lies, - I _have_ APIC_IO uncommented ...

I'm not sure I understand, if you mean that you ran mptable with a kernel that
has APIC_IO enabled, but you got the mptable output that was missing the
INT section, this is explainable.  You need to understand that the information
provided by mptable is taken straight from what the BIOS provides, it has
nothing to do with which kernel is running.  You can run mptable from a non-SMP
kernel and get the same results.  What affects it is the position of
motherboard jumpers and BIOS settings.  Think of mptable as a tool for getting
all these things set up properly.

---

>MPTable, version 2.0.4
> ...
>Processors: APIC ID  Version  State        Family  Model  Step  Flags
>            1        0x11     BSP, usable  6       1      6     0xfbff
>            0        0x11     AP, usable   6       1      7     0xfbff
                                                           ^

you had said earlier that you added an "identical" processor from another
machine, but this shows that they are a different stepping.  This may or
may not be a problem (one being stepping 6, the other being stepping 7).
The safest thing would be to try to find 2 of the same stepping,
but don't worry too much if you can't....
the rest of the table looks good on first glance...

---

>options         SMP_INVLTBL             # Steven.
                          ^^^
this is my fault, proper spelling is:

options         SMP_INVLTLB

I would suggest you grab the latest mptable from the web page (2.0.6 I think)
it will have these newer options listed in its output.

---

>> is this area really missing or did you truncate the output?  there should be
>> a long list of INTerrupt associations here!!!
>
>this is a real output with JP5 default settings (PIIX3 SMI)
>
>now mptable output for JP5 in APIC SMI position:
> ...

obviously the manual is WRONG!

---

note that the following lines are grabbed from several of the previous
mailings, re-sorted to explain the issue:

>Bus:        Bus ID  Type
>            0       PCI
>            1       PCI
>            2       PCI
>            3       ISA

this shows the PCI bus on the motherboard (Bus 0) and the PCI busses
created by the PCI bridge chips on each of the 3940s (Bus 1 & Bus 2)
This is correctly done, by the way, and many SMP motherboards blow
it entirely.

>I/O Ints:   Type  Polarity   Trigger  Bus ID  IRQ   APIC ID  INT#
>            INT   active-lo  level    1       4:A   2        19
>            INT   active-lo  level    1       5:A   2        16
>            INT   active-lo  level    0       10:A  2        18
>            INT   active-lo  level    2       4:A   2        16
>            INT   active-lo  level    2       5:A   2        17

>ahc0 rev 0 int a irq 19 on pci1:4
>ahc1 rev 0 int a irq 16 on pci1:5
>ahc2 rev 0 int a irq 19 on pci2:4
>ahc3 rev 0 int a irq 16 on pci2:5
                      ^^
                      ||
here is your major problem, ahc2 and ahc3 are getting the wrong INTs
assigned to them.  ahc2 should get IRQ16, and ahc3 should get IRQ17

A little history to explain why the current code is failing:

The original MP spec 1.1 didn't take PCI bridge cards into account
and thus couldn't handle them.  Intel then added appendix D.2/3 to the
spec which attempted to clear this up, but many MBs didn't get
it right.  Beyond that it was unclear to me from the spec exactly
how the code should deal with it till I had a chance to work it through
with several people who actually had this type of hardware.
As a result the current code ignores the Bus ID when assigning these
INTs.
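
A minimal sketch of the mismatch described above, assuming a much simplified table; it is not the kernel's code. The names io_int, pci_ints, lookup_pin and the check_bus flag are made up for illustration. Only the five PCI rows (taken from the quoted mptable output) and the packing of the IRQ field (device number in bits 2-6, INT line in bits 0-1, as the SRCBUSDEVICE/SRCBUSLINE macros in mp_machdep.c decode it) come from this thread.

```c
#include <assert.h>
#include <stddef.h>

/* One "I/O Ints" row, reduced to the fields that matter here (hypothetical type). */
struct io_int {
    int src_bus_id;   /* "Bus ID" column */
    int src_bus_irq;  /* "IRQ" column, packed as (device << 2) | line */
    int apic_pin;     /* "INT#" column */
};

#define PCI_IRQ(dev, line)  (((dev) << 2) | (line))
#define SRC_DEVICE(e)       (((e)->src_bus_irq >> 2) & 0x1f)
#define SRC_LINE(e)         ((e)->src_bus_irq & 0x03)

/* The five PCI rows quoted above; line 0 stands for INT A. */
static const struct io_int pci_ints[] = {
    { 1, PCI_IRQ(4, 0),  19 },
    { 1, PCI_IRQ(5, 0),  16 },
    { 0, PCI_IRQ(10, 0), 18 },
    { 2, PCI_IRQ(4, 0),  16 },
    { 2, PCI_IRQ(5, 0),  17 },
};

/*
 * Return the APIC pin for a PCI device, or -1 if no row matches.
 * With check_bus == 0 the bus number is ignored and the first row with
 * a matching device and line wins, which is the failure described above.
 */
int
lookup_pin(int bus, int device, int line, int check_bus)
{
    size_t i;

    assert(line >= 0 && line < 4);
    for (i = 0; i < sizeof(pci_ints) / sizeof(pci_ints[0]); ++i) {
        const struct io_int *e = &pci_ints[i];

        if (check_bus && e->src_bus_id != bus)
            continue;                   /* row belongs to another bus */
        if (SRC_DEVICE(e) == device && SRC_LINE(e) == line)
            return e->apic_pin;
    }
    return -1;
}
```

With check_bus set to 0, lookup_pin(2, 4, 0, 0) stops at the bus 1 row and returns 19, which is what ahc2 was given; with check_bus set to 1 the same device resolves to 16, and device 5 on bus 2 resolves to 17.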
The simple solution here would be to run without the 2nd 3940.  The first one
is being properly assigned.  However, since your MB (ASUS) does the mp table
correctly I suggest the better alternative:

You could attempt to fix the code in sys/i386/i386/mp_machdep.c.  The following
patch hopefully will work, but I don't have an SMP machine right now so I
could not test it...  let me know if it works.

-------------------------------------- cut ---------------------------------
*** mp_machdep.c.old    Thu Dec 12 01:43:52 1996
--- mp_machdep.c        Thu Jan 30 12:07:38 1997
***************
*** 917,926 ****
  /*
   * determine which APIC pin a PCI INT is attached to.
   */
  #define SRCBUSDEVICE(I) ((ioApicINTs[(I)].srcBusIRQ >> 2) & 0x1f)
  #define SRCBUSLINE(I)   (ioApicINTs[(I)].srcBusIRQ & 0x03)
  int
! get_pci_apic_irq( int pciBus __attribute__ ((unused)), int pciDevice, int pciInt )
  {
  /**
--- 917,927 ----
  /*
   * determine which APIC pin a PCI INT is attached to.
   */
+ #define SRCBUSID(I)     (ioApicINTs[(I)].srcBusID)
  #define SRCBUSDEVICE(I) ((ioApicINTs[(I)].srcBusIRQ >> 2) & 0x1f)
  #define SRCBUSLINE(I)   (ioApicINTs[(I)].srcBusIRQ & 0x03)
  int
! get_pci_apic_irq( int pciBus, int pciDevice, int pciInt )
  {
  /**
***************
*** 932,937 ****
--- 933,939 ----
      for ( intr = 0; intr < nintrs; ++intr )    /* search each record */
          if ( (INTTYPE( intr ) == 0)
+           && (SRCBUSID( intr ) == pciBus)
            && (SRCBUSDEVICE( intr ) == pciDevice)
            && (SRCBUSLINE( intr ) == pciInt) )  /* a candidate IRQ */
              if ( apicIntIsBusType( intr, PCI ) )  /* check bus match */
***************
*** 941,946 ****
--- 943,949 ----
  }
  #undef SRCBUSLINE
  #undef SRCBUSDEVICE
+ #undef SRCBUSID
  #undef INTPIN
  #undef INTTYPE
-------------------------------------- cut ---------------------------------

I expect the above to make things much better, assuming you were using devices
on the 2nd 3940.  Note that the above patch will actually cause many
motherboards to STOP working because they don't do the mp table stuff
correctly!
This is why I haven't submitted such a change to the code. The real fix is going to involve analyzing the mp table, then making a CORRECTED in-core copy when the kernel boots. It ain't gonna be pretty, and it ain't gonna be easy to get right, so I have been avoiding it!!! -- Steve Passe | powered by smp@csn.net | FreeBSD From owner-freebsd-smp Thu Jan 30 11:42:41 1997 Return-Path: Received: (from root@localhost) by freefall.freebsd.org (8.8.5/8.8.5) id LAA27082 for smp-outgoing; Thu, 30 Jan 1997 11:42:41 -0800 (PST) Received: from clem.systemsix.com (clem.systemsix.com [198.99.86.131]) by freefall.freebsd.org (8.8.5/8.8.5) with SMTP id LAA27076 for ; Thu, 30 Jan 1997 11:42:35 -0800 (PST) Received: from localhost (localhost [127.0.0.1]) by clem.systemsix.com (8.6.12/8.6.12) with SMTP id MAA18244; Thu, 30 Jan 1997 12:41:32 -0700 Message-Id: <199701301941.MAA18244@clem.systemsix.com> X-Authentication-Warning: clem.systemsix.com: Host localhost didn't use HELO protocol X-Mailer: exmh version 1.6.5 12/11/95 From: Steve Passe To: Kenneth Merry cc: mishania@demos.su, bag@sinbin.demos.su, freebsd-smp@FreeBSD.ORG Subject: Re: troubles with smp kernel In-reply-to: Your message of "Thu, 30 Jan 1997 13:44:55 EST." <199701301844.NAA22578@housing1.stucen.gatech.edu> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Date: Thu, 30 Jan 1997 12:41:32 -0700 Sender: owner-smp@FreeBSD.ORG X-Loop: FreeBSD.org Precedence: bulk Hi, > Even though mptable tells you to use NINTR=16, I'd suggest bumping >it to 24 anyway, in case you add any more cards.. 
I have fixed the printout in mptable 2.0.6 to always say 24, long term I'm going to ignore this option and have kernel assign (NAPIC * 30) of them just to be sure... >options SMP_INVLTLB again, 2.0.6 will print this as well as all other new options in their suggested state, ie enabled/commented/default value. -- Steve Passe | powered by smp@csn.net | FreeBSD From owner-freebsd-smp Thu Jan 30 11:46:55 1997 Return-Path: Received: (from root@localhost) by freefall.freebsd.org (8.8.5/8.8.5) id LAA27269 for smp-outgoing; Thu, 30 Jan 1997 11:46:55 -0800 (PST) Received: from clem.systemsix.com (clem.systemsix.com [198.99.86.131]) by freefall.freebsd.org (8.8.5/8.8.5) with SMTP id LAA27264 for ; Thu, 30 Jan 1997 11:46:47 -0800 (PST) Received: from localhost (localhost [127.0.0.1]) by clem.systemsix.com (8.6.12/8.6.12) with SMTP id MAA18277; Thu, 30 Jan 1997 12:46:08 -0700 Message-Id: <199701301946.MAA18277@clem.systemsix.com> X-Authentication-Warning: clem.systemsix.com: Host localhost didn't use HELO protocol X-Mailer: exmh version 1.6.5 12/11/95 From: Steve Passe To: mishania@demos.su cc: smp@csn.net (Steve Passe), bag@demos.su (Alex G. Bulushev), smp@freebsd.org Subject: Re: troubles with smp kernel In-reply-to: Your message of "Thu, 30 Jan 1997 21:22:53 +0300." 
<199701301822.VAA09279@megillah.demos.su>
Mime-Version: 1.0
Content-Type: text/plain; charset=iso-8859-1
Date: Thu, 30 Jan 1997 12:46:07 -0700
Content-Transfer-Encoding: 8bit
X-MIME-Autoconverted: from quoted-printable to 8bit by freefall.freebsd.org id LAA27265
Sender: owner-smp@freebsd.org
X-Loop: FreeBSD.org
Precedence: bulk

Hi,

> > I am missing something here, you say it reboots, but I see output in this
> > letter showing it running, specifically under what circumstances does it
> > reboot?
>
> Let say that way: "Hell knows" :
>  9:13PM  up 27 mins, 2 users, load averages: 0.02, 0.02, 0.00
>
> Neither anything on the console, nor in the syslog. I still think INT/IRQ
> sharing system to be culprit, now from the hardware's side.

as my previous mail says, I think you're right, and hopefully I sent the fix.

---

> As Alex said, I was almost killed by static electricity of the machine half
> an hour ago, - this might be the problem. After that, during all our debates
> today last 1,5 hours, machine rebooted three times, but this RAM error is
> gone. I pray it is, I mean.

if the hardware now seems happy PARITY wise ignore my suggestion to reseat
everything, that's only asking for trouble if the problem is gone.  But if
static is still a problem you do need to deal with that or you will
soon own a nice boat anchor!
-- Steve Passe | powered by smp@csn.net | FreeBSD From owner-freebsd-smp Thu Jan 30 14:37:11 1997 Return-Path: Received: (from root@localhost) by freefall.freebsd.org (8.8.5/8.8.5) id OAA06276 for smp-outgoing; Thu, 30 Jan 1997 14:37:11 -0800 (PST) Received: from housing1.stucen.gatech.edu (ken@housing1.stucen.gatech.edu [130.207.52.71]) by freefall.freebsd.org (8.8.5/8.8.5) with ESMTP id OAA06270 for ; Thu, 30 Jan 1997 14:37:07 -0800 (PST) Received: (from ken@localhost) by housing1.stucen.gatech.edu (8.8.5/8.8.5) id RAA25173; Thu, 30 Jan 1997 17:36:39 -0500 (EST) From: Kenneth Merry Message-Id: <199701302236.RAA25173@housing1.stucen.gatech.edu> Subject: Re: troubles with smp kernel In-Reply-To: <199701301934.MAA18162@clem.systemsix.com> from Steve Passe at "Jan 30, 97 12:34:19 pm" To: smp@csn.net (Steve Passe) Date: Thu, 30 Jan 1997 17:36:38 -0500 (EST) Cc: bag@sinbin.demos.su, mishania@demos.su, freebsd-smp@freebsd.org X-Mailer: ELM [version 2.4ME+ PL25 (25)] MIME-Version: 1.0 Content-Type: text/plain; charset=US-ASCII Content-Transfer-Encoding: 7bit Sender: owner-smp@freebsd.org X-Loop: FreeBSD.org Precedence: bulk Steve Passe wrote... [ ... ] > note that the following lines are grabbed from several of the previous > mailings, resorted to explain the issue: > > >Bus: Bus ID Type > > 0 PCI > > 1 PCI > > 2 PCI > > 3 ISA > > this shows the PCI bus on the motherboard (Bus 0) and the PCI busses > created by the PCI bridge chips on each of the 3940s (Bus 1 & Bus 2) > This is correctly done, by the way, and many SMP motherboards blow > it entirely. 
>
> >I/O Ints:      Type    Polarity    Trigger   Bus ID   IRQ    APIC ID   INT#
> >               INT     active-lo   level     1        4:A    2         19
> >               INT     active-lo   level     1        5:A    2         16
> >               INT     active-lo   level     0        10:A   2         18
> >               INT     active-lo   level     2        4:A    2         16
> >               INT     active-lo   level     2        5:A    2         17
>
> >ahc0 rev 0 int a irq 19 on pci1:4
> >ahc1 rev 0 int a irq 16 on pci1:5
> >ahc2 rev 0 int a irq 19 on pci2:4
> >ahc3 rev 0 int a irq 16 on pci2:5
>                       ^^
>                       ||
> here is your major problem, ahc2 and ahc3 are getting the wrong INTs
> assigned to them. ahc2 should get IRQ16, and ahc3 should get IRQ17

Are you sure that they're getting the wrong INTs? In an earlier mail,
Mikhail Sokolov said that:

====
What we have is attached at the end of the letter of mine; to be short it's
the above mentioned mother, 2x3940TUW (Twin Ultra Wide) Adaptecs in slots 4
and 5, sharing irq's. The FM of the motherboard claims, that it might not be
any problem at all to have it shared, _when an OS supports sharing correct_.
Seems it doesn't :-(.
====

Slots 4 and 5 on this particular board share the same interrupt. My machine,
for instance, has a 3940UW in slot 4, and a Matrox Millennium in slot 5, and
three DE21040-based cards in slots 1-3. So the interrupts get assigned like
this:

slot 1:  de2  rev 18 int a irq 16 on pci0:12
slot 2:  de1  rev 32 int a irq 17 on pci0:11
slot 3:  de0  rev 17 int a irq 18 on pci0:10
slot 4:  ahc0 rev 0 int a irq 19 on pci1:4
         ahc1 rev 0 int a irq 16 on pci1:5
slot 5:  vga0 rev 1 int a irq 19 on pci0:13

Seems like it's behaving the same way, although that might not be a
'good' way..:) But my system does work when the 3940, Matrox and SMC cards
share interrupts.

> A little history to explain why the current code is failing:
>
> The original MP spec 1.1 didn't take PCI bridge cards into account
> and thus couldn't handle them. Intel then added appendix D.2/3 to the
> spec which attempted to clear this up, but many MBs didn't get
> it right.
> Beyond that it was unclear to me from the spec exactly
> how the code should deal with it till I had a chance to work it thru
> with several people who actual had this type of hardware.
> As a result the current code ignores the Bus ID when assigning these
> INTs.
>
> The simple solution here would be to run without the 2nd 3940. The first one
> is being properly assigned. However, since your MB (ASUS) does the mp table
> correctly I suggest the better alternative:

If for some reason the 3940's aren't working properly sharing interrupts
with each other, maybe one of them could share an interrupt with the network
card? To do that, you could do something like put the network card in either
slot 1 or slot 5, a 3940 in slot 2, and a 3940 in slot 4. If the network
card is in slot 1, it'll share an interrupt with the second channel of the
3940 in slot 4. If the network card is in slot 5, it will share an interrupt
with the first channel of the 3940 in slot 4. The 3940 in slot 2 will have
two interrupts to itself, one for each channel.

> You could attempt to fix the code in sys/i386/i386/mp_machdep.c. The following
> patch hopefully will work, but I don't have an SMP machine right now so I
> could not test it... let me know if it works.

[ ... ]

> I expect the above to make things much better, assumming you were using devices
> on the 2nd 3940. Note that the above patch will actually cause many
> motherboards to STOP working because they don't do the mp table stuff
> correctly! This is why I haven't submitted such a change to the code. The
> real fix is going to involve analyzing the mp table, then making a CORRECTED
> in-core copy when the kernel boots. It ain't gonna be pretty, and it ain't
> gonna be easy to get right, so I have been avoiding it!!!

Ahh, okay, so is the idea to basically reassign the interrupts that the
BIOS assigns, so that there are unique interrupts for each device in the
system? Would it be a good idea for me to try out the patch?
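
To make the Bus ID issue quoted above concrete, here is a minimal sketch in C
of the two lookup strategies. It is not the actual sys/i386/i386/mp_machdep.c
code and not Steve's patch: the struct io_int layout and the function names
lookup_ignoring_bus and lookup_with_bus are hypothetical, made up for
illustration. Only the five table rows are taken from the mptable output
posted in this thread.

```c
#include <stddef.h>

/* Hypothetical, simplified I/O interrupt entry; not the kernel's own struct. */
struct io_int {
    int  bus_id;    /* source bus ID from the MP table */
    int  device;    /* PCI device number on that bus */
    char pin;       /* PCI interrupt pin, 'A'..'D' */
    int  apic_int;  /* I/O APIC INT# the entry routes to */
};

/* The five PCI rows from the mptable output quoted in this thread. */
static const struct io_int mp_table[] = {
    { 1,  4, 'A', 19 },
    { 1,  5, 'A', 16 },
    { 0, 10, 'A', 18 },
    { 2,  4, 'A', 16 },
    { 2,  5, 'A', 17 },
};
#define MP_ENTRIES (sizeof(mp_table) / sizeof(mp_table[0]))

/* Match on device:pin only, first hit wins (models the reported behaviour). */
static int lookup_ignoring_bus(int device, char pin)
{
    for (size_t i = 0; i < MP_ENTRIES; i++)
        if (mp_table[i].device == device && mp_table[i].pin == pin)
            return mp_table[i].apic_int;
    return -1;  /* no matching entry */
}

/* Match on bus:device:pin, so identical device:pin pairs on different
 * buses resolve to their own entries. */
static int lookup_with_bus(int bus_id, int device, char pin)
{
    for (size_t i = 0; i < MP_ENTRIES; i++)
        if (mp_table[i].bus_id == bus_id &&
            mp_table[i].device == device && mp_table[i].pin == pin)
            return mp_table[i].apic_int;
    return -1;  /* no matching entry */
}
```

With this table, lookup_ignoring_bus(4, 'A') returns 19 for both ahc0
(pci1:4) and ahc2 (pci2:4), which matches the dmesg output in the report,
while lookup_with_bus(2, 4, 'A') returns 16 and lookup_with_bus(2, 5, 'A')
returns 17, the values Steve says ahc2 and ahc3 should get.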

Ken (who may catch on eventually :) )
--
Kenneth Merry
ken@ulc199.residence.gatech.edu
Disclaimer: I don't speak for GTRI, GT, or Elvis.

From owner-freebsd-smp Thu Jan 30 15:14:30 1997
Return-Path:
Received: (from root@localhost)
    by freefall.freebsd.org (8.8.5/8.8.5) id PAA08934
    for smp-outgoing; Thu, 30 Jan 1997 15:14:30 -0800 (PST)
Received: from clem.systemsix.com (clem.systemsix.com [198.99.86.131])
    by freefall.freebsd.org (8.8.5/8.8.5) with SMTP id PAA08925
    for ; Thu, 30 Jan 1997 15:14:24 -0800 (PST)
Received: from localhost (localhost [127.0.0.1])
    by clem.systemsix.com (8.6.12/8.6.12) with SMTP id QAA19294;
    Thu, 30 Jan 1997 16:13:08 -0700
Message-Id: <199701302313.QAA19294@clem.systemsix.com>
X-Authentication-Warning: clem.systemsix.com: Host localhost didn't use HELO protocol
X-Mailer: exmh version 1.6.5 12/11/95
From: Steve Passe
To: Kenneth Merry
cc: bag@sinbin.demos.su, mishania@demos.su, freebsd-smp@freebsd.org
Subject: Re: troubles with smp kernel
In-reply-to: Your message of "Thu, 30 Jan 1997 17:36:38 EST."
    <199701302236.RAA25173@housing1.stucen.gatech.edu>
Mime-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Date: Thu, 30 Jan 1997 16:13:07 -0700
Sender: owner-smp@freebsd.org
X-Loop: FreeBSD.org
Precedence: bulk

Hi,

these are the entries from the mp table for the PCI devices (de0, ahcX):

>>>I/O Ints:      Type    Polarity    Trigger   Bus ID   IRQ    APIC ID   INT#
>>>               INT     active-lo   level     1        4:A    2         19
>>>               INT     active-lo   level     1        5:A    2         16
>>>               INT     active-lo   level     0        10:A   2         18
>>>               INT     active-lo   level     2        4:A    2         16
>>>               INT     active-lo   level     2        5:A    2         17

this was what actually happened, ie the output of dmesg:

>>>ahc0 rev 0 int a irq 19 on pci1:4
>>>ahc1 rev 0 int a irq 16 on pci1:5
>>>ahc2 rev 0 int a irq 19 on pci2:4
>>>ahc3 rev 0 int a irq 16 on pci2:5
>>                      ^^
>>                      ||
>> here is your major problem, ahc2 and ahc3 are getting the wrong INTs
>> assigned to them. ahc2 should get IRQ16, and ahc3 should get IRQ17
>
> Are you sure that they're getting the wrong INTs?
> In an earlier mail, Mikhail Sokolov said that:
> ...
> Seems like it's behaving the same way, although that might not be a
> 'good' way..:) But my system does work when the 3940, Matrox and SMC cards
> share interrupts.

yes, I'm certain, look at the above lines again. since there are 2 3940s
involved we have a replication of device:int with each of them:

  ahc0: device 4, INT a <--> ahc2: device 4, INT a
  ahc1: device 5, INT a <--> ahc3: device 5, INT a

since the current code walks thru the table looking for a match to
device:INT, BUT ignores bus ID, ahc0 AND ahc2 both resolve to the 1:4:A
entry, ie IRQ19, and ahc1 AND ahc3 both resolve to the 1:5:A entry, ie IRQ16.
ahc2 SHOULD resolve to the 2:4:A entry, ie IRQ16 (shared because they are in
adjoining slots) and ahc3 SHOULD resolve to 2:5:A, IRQ17

---

> [ ... ]
> Ahh, okay, so is the idea to basically reassign the interrupts that
> the BIOS assigns, so that there are unique interrupts for each device in
> the system? Would it be a good idea for me to try out the patch?

no, that's not what I'm saying, hopefully the above explanation clears this
up. my scenario doesn't re-assign the INTs, it's just that the current code
fails to properly read the ASUS table, and my suggested patch should fix
this, ie properly read the table. The confusion comes from the fact that many
MBs build the table INCORRECTLY and thus doing "the correct thing" would
break them. I hope I'm clearly describing this, I know how difficult it can
be to grasp, I've only gained the knowledge by working with the code till I
get it right!

As to your trying the patch, I can't remember if yours is the system that has
the good mp table (ie another ASUS), or whether you have the one for which I
had to hard-code the true values inside the source of the kernel (ie a
misbehaving MP BIOS).
If you have the ASUS then the patch won't hurt (assuming it's correct) but it
also won't change anything unless you add a PCI card that uses the same
device:INT as another PCI card in the system. If you have the system with my
hard-coded patches for the 3940 INT sources, then it would probably break the
kernel!

--
Steve Passe     | powered by
smp@csn.net     |            FreeBSD

From owner-freebsd-smp Fri Jan 31 01:09:56 1997
Return-Path:
Received: (from root@localhost)
    by freefall.freebsd.org (8.8.5/8.8.5) id BAA13583
    for smp-outgoing; Fri, 31 Jan 1997 01:09:56 -0800 (PST)
Received: from kremvax.demos.su (kremvax.demos.su [194.87.0.20])
    by freefall.freebsd.org (8.8.5/8.8.5) with SMTP id BAA13578
    for ; Fri, 31 Jan 1997 01:09:47 -0800 (PST)
Received: by kremvax.demos.su (8.6.13/D) from 0@sinbin.demos.su [194.87.2.95]
    with ESMTP id MAA00842; Fri, 31 Jan 1997 12:08:39 +0300
Received: by sinbin.demos.su id MAA14659; (8.6.12/D) Fri, 31 Jan 1997 12:07:45 +0300
From: bag@sinbin.demos.su (Alex G. Bulushev)
Message-Id: <199701310907.MAA14659@sinbin.demos.su>
Subject: Re: troubles with smp kernel
To: smp@csn.net (Steve Passe)
Date: Fri, 31 Jan 1997 12:07:45 +0300 (MSK)
Cc: mishania@demos.su, freebsd-smp@freebsd.org
In-Reply-To: <199701301941.MAA18244@clem.systemsix.com> from "Steve Passe" at Jan 30, 97 12:41:32 pm
X-Mailer: ELM [version 2.4 PL24 ME7a]
Content-Type: text
Sender: owner-smp@freebsd.org
X-Loop: FreeBSD.org
Precedence: bulk

> >options SMP_INVLTLB
>
> again, 2.0.6 will print this as well as all other new options in their
> suggested state, ie enabled/commented/default value.
>

This is output of mptable 2.0.6, dmesg for kernel without mp_machdep.c patch,
but with all recommended options ... system very unstable ...
regularly after rebooting fsck stopped on drive sd2 checking with message:
cannot alloc 4177922 bytes for lncntp
OR
cannot alloc 2088961 bytes for typemap
and then write: RUN fsck MANUALLY
this is kmem allocation error i think ... not hardware ...

Alex.

===============================================================================

MPTable, version 2.0.6

 looking for EBDA pointer @ 0x040e, NOT found
 searching CMOS 'top of mem' @ 0x0009fc00 (639K)
 searching BIOS @ 0x000f0000

 MP FPS found in BIOS @ physical addr: 0x000f60b0

-------------------------------------------------------------------------------

MP Floating Pointer Structure:

  location:                     BIOS
  physical address:             0x000f60b0
  signature:                    '_MP_'
  length:                       16 bytes
  version:                      1.4
  checksum:                     0x8b
  mode:                         Virtual Wire

-------------------------------------------------------------------------------

MP Config Table Header:

  physical address:             0x000f5caa
  signature:                    'PCMP'
  base table length:            268
  version:                      1.4
  checksum:                     0xf9
  OEM ID:                       'OEM00000'
  Product ID:                   'PROD00000000'
  OEM table pointer:            0x00000000
  OEM table size:               0
  entry count:                  25
  local APIC address:           0xfee00000
  extended table length:        0
  extended table checksum:      0

-------------------------------------------------------------------------------

MP Config Base Table Entries:

--
Processors:     APIC ID   Version   State         Family   Model   Step   Flags
                1         0x11      BSP, usable   6        1       6      0xfbff
                0         0x11      AP, usable    6        1       7      0xfbff
--
Bus:            Bus ID    Type
                0         PCI
                1         PCI
                2         PCI
                3         ISA
--
I/O APICs:      APIC ID   Version   State         Address
                2         0x11      usable        0xfec00000
--
I/O Ints:       Type      Polarity    Trigger    Bus ID   IRQ    APIC ID   INT#
                ExtINT    conforms    conforms   3        0      2         0
                INT       conforms    conforms   3        1      2         1
                INT       conforms    conforms   3        0      2         2
                INT       conforms    conforms   3        3      2         3
                INT       conforms    conforms   3        4      2         4
                INT       conforms    conforms   3        5      2         5
                INT       conforms    conforms   3        6      2         6
                INT       conforms    conforms   3        7      2         7
                INT       conforms    conforms   3        8      2         8
                INT       conforms    conforms   3        14     2         14
                INT       conforms    conforms   3        15     2         15
                INT       active-lo   level      1        4:A    2         19
                INT       active-lo   level      1        5:A    2         16
                INT       active-lo   level      0        10:A   2         18
                INT       active-lo   level      2        4:A    2         16
                INT       active-lo   level      2        5:A    2         17
--
Local Ints:     Type      Polarity    Trigger    Bus ID   IRQ    APIC ID   INT#
                ExtINT    active-hi   edge       3        0      255       0
                NMI       active-hi   edge       3        0      255       1

-------------------------------------------------------------------------------

# SMP kernel config file options:

options         SMP                     # Symmetric MultiProcessor Kernel
options         APIC_IO                 # Symmetric (APIC) I/O
options         NCPU=2                  # number of CPUs
options         NBUS=4                  # number of busses
options         NAPIC=1                 # number of IO APICs
options         NINTR=24                # number of INTs
options         SMP_INVLTLB             #
#options        SMP_PRIVPAGES           # BROKEN, DO NOT use!
#options        SMP_AUTOSTART           # BROKEN, DO NOT use!
#options        SERIAL_DEBUG            # com port debug output

-------------------------------------------------------------------------------

dmesg output:

Copyright (c) 1992-1996 FreeBSD Inc.
Copyright (c) 1982, 1986, 1989, 1991, 1993
        The Regents of the University of California. All rights reserved.

FreeBSD 3.0-SMP #0: Fri Jan 31 01:47:53 MSK 1997
    mishania@fyllefrossa.demos.su:/arc1/src/sys-SMP/compile/FYLLEFROSSA
FreeBSD/SMP: Multiprocessor motherboard
 cpu0 (BSP): apic id: 1, version: 0x00040011
 cpu1 (AP):  apic id: 0, version: 0x00040011
 io0 (APIC): apic id: 2, version: 0x00170011
Calibrating clock(s) relative to mc146818A clock ...
i8254 clock: 1193158 Hz
CPU: Pentium Pro (686-class CPU)
  Origin = "GenuineIntel"  Id = 0x616  Stepping=6
  Features=0xfbff,MTRR,PGE,MCA,CMOV>
real memory  = 268435456 (262144K bytes)
avail memory = 257298432 (251268K bytes)
Probing for devices on PCI bus 0:
chip0 rev 2 on pci0:0
chip1 rev 1 on pci0:1:0
chip2 rev 0 on pci0:1:1
chip3 rev 2 on pci0:9
de0 rev 17 int a irq 18 on pci0:10
Freeing (NOT implimented) irq 12 for ISA cards.
de0: 21041 [10Mb/s] pass 1.1
de0: address 00:00:c0:74:8c:dc
chip4 rev 2 on pci0:12
Freeing (NOT implimented) irq 12 for ISA cards.
Probing for devices on PCI bus 1:
ahc0 rev 0 int a irq 19 on pci1:4
Freeing (NOT implimented) irq 9 for ISA cards.
ahc0: aic7880 Wide Channel A, SCSI Id=7, 16 SCBs
ahc0 waiting for scsi devices to settle
(ahc0:0:0): "SEAGATE ST32550W 0016" type 0 fixed SCSI 2
sd0(ahc0:0:0): Direct-Access 2047MB (4194058 512 byte sectors)
(ahc0:1:0): "SEAGATE ST19171W 0017" type 0 fixed SCSI 2
sd1(ahc0:1:0): Direct-Access 8683MB (17783112 512 byte sectors)
(ahc0:2:0): "SEAGATE ST19171W 0017" type 0 fixed SCSI 2
sd2(ahc0:2:0): Direct-Access 8683MB (17783112 512 byte sectors)
ahc1 rev 0 int a irq 16 on pci1:5
Freeing (NOT implimented) irq 11 for ISA cards.
ahc1: aic7880 Wide Channel B, SCSI Id=7, 16 SCBs
ahc1 waiting for scsi devices to settle
ahc1: Someone reset channel A
Probing for devices on PCI bus 2:
ahc2 rev 0 int a irq 19 on pci2:4
Freeing (NOT implimented) irq 11 for ISA cards.
ahc2: aic7880 Wide Channel A, SCSI Id=7, 16 SCBs
ahc2 waiting for scsi devices to settle
ahc2: Someone reset channel A
ahc3 rev 0 int a irq 16 on pci2:5
Freeing (NOT implimented) irq 10 for ISA cards.
ahc3: aic7880 Wide Channel B, SCSI Id=7, 16 SCBs
ahc3 waiting for scsi devices to settle
ahc3: Someone reset channel A
Probing for devices on the ISA bus:
sc0 at 0x60-0x6f irq 1 on motherboard
sc0: VGA color <16 virtual consoles, flags=0x0>
fdc0 at 0x3f0-0x3f7 irq 6 drq 2 on isa
fdc0: NEC 72065B
npx0 on motherboard
npx0: INT 16 interface
changing root device to sd0a
Enabled INTs: 1, 2, 6, 8, 16, 18, 19, imen: 0x00f2feb9
de0: enabling 10baseT port
WARNING: / was not properly dismounted.
SMP: All idle procs online.
SMP: Starting 1st AP!
SMP: AP CPU #1 LAUNCHED!! Starting Scheduling...
SMP: TADA! CPU #1 made it into the scheduler!.
SMP: All 2 CPU's are online!

===============================================================================

From owner-freebsd-smp Fri Jan 31 01:19:41 1997
Return-Path:
Received: (from root@localhost)
    by freefall.freebsd.org (8.8.5/8.8.5) id BAA13877
    for smp-outgoing; Fri, 31 Jan 1997 01:19:41 -0800 (PST)
Received: from clem.systemsix.com (clem.systemsix.com [198.99.86.131])
    by freefall.freebsd.org (8.8.5/8.8.5) with SMTP id BAA13872
    for ; Fri, 31 Jan 1997 01:19:37 -0800 (PST)
Received: from localhost (localhost [127.0.0.1])
    by clem.systemsix.com (8.6.12/8.6.12) with SMTP id CAA22013;
    Fri, 31 Jan 1997 02:18:09 -0700
Message-Id: <199701310918.CAA22013@clem.systemsix.com>
X-Authentication-Warning: clem.systemsix.com: Host localhost didn't use HELO protocol
X-Mailer: exmh version 1.6.5 12/11/95
From: Steve Passe
To: bag@sinbin.demos.su (Alex G. Bulushev)
cc: mishania@demos.su, freebsd-smp@freebsd.org
Subject: Re: troubles with smp kernel
In-reply-to: Your message of "Fri, 31 Jan 1997 12:07:45 +0300."
    <199701310907.MAA14659@sinbin.demos.su>
Mime-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Date: Fri, 31 Jan 1997 02:18:09 -0700
Sender: owner-smp@freebsd.org
X-Loop: FreeBSD.org
Precedence: bulk

Hi,

>This is output of mptable 2.0.6, dmesg for kernel without mp_machdep.c patch,
>but with all recommended options ...
>system very unstable ...
>regularly after rebooting fsck stopped on drive sd2 checking with message:

you absolutely *need* to apply that patch or remove the 2nd 3940. The disk
errors you are seeing are probably a result of running with the bug that
hopefully will be fixed by the patch.

--
Steve Passe     | powered by
smp@csn.net     |            FreeBSD

From owner-freebsd-smp Fri Jan 31 01:40:07 1997
Return-Path:
Received: (from root@localhost)
    by freefall.freebsd.org (8.8.5/8.8.5) id BAA14689
    for smp-outgoing; Fri, 31 Jan 1997 01:40:07 -0800 (PST)
Received: from kremvax.demos.su (kremvax.demos.su [194.87.0.20])
    by freefall.freebsd.org (8.8.5/8.8.5) with SMTP id BAA14660
    for ; Fri, 31 Jan 1997 01:40:02 -0800 (PST)
Received: by kremvax.demos.su (8.6.13/D) from 0@sinbin.demos.su [194.87.2.95]
    with ESMTP id MAA11683; Fri, 31 Jan 1997 12:38:40 +0300
Received: by sinbin.demos.su id MAA21221; (8.6.12/D) Fri, 31 Jan 1997 12:37:39 +0300
From: bag@sinbin.demos.su (Alex G. Bulushev)
Message-Id: <199701310937.MAA21221@sinbin.demos.su>
Subject: Re: troubles with smp kernel
To: smp@csn.net (Steve Passe)
Date: Fri, 31 Jan 1997 12:37:39 +0300 (MSK)
Cc: mishania@demos.su, freebsd-smp@freebsd.org
In-Reply-To: <199701301934.MAA18162@clem.systemsix.com> from "Steve Passe" at Jan 30, 97 12:34:19 pm
X-Mailer: ELM [version 2.4 PL24 ME7a]
Content-Type: text
Sender: owner-smp@freebsd.org
X-Loop: FreeBSD.org
Precedence: bulk

Hi,

> >static electricity killed mishania and now there is no PARITY ERROR !!
> >why ?
>
> in america we have a saying: don't look a gift horse in the mouth.
>
> I guess the translation is, it works now, so don't complain!
>

1. it works fine with one CPU, no PARITY ERROR
2. with two CPU's after all BIOS messages and before booting from disk
   it writes about PARITY ERROR
3. there is no PARITY ERROR message when we switch to ECC in BIOS
   but i think that ECC simply corrects this error ...

i think: this is hardware problem, and we need to change motherboard
(CPU's and SIMM's changed already)

> I'm not sure I understand, if you mean that you ran mptable with a kernel
> that has APIC_IO enabled, but you got the mptable output that was
> missing the INT section, this is explainable. You need to understand

missing INT section in mptable output with JP5 in PIIX3 position

> >ahc0 rev 0 int a irq 19 on pci1:4
> >ahc1 rev 0 int a irq 16 on pci1:5
> >ahc2 rev 0 int a irq 19 on pci2:4
> >ahc3 rev 0 int a irq 16 on pci2:5
>                       ^^
>                       ||
> here is your major problem, ahc2 and ahc3 are getting the wrong INTs
> assigned to them. ahc2 should get IRQ16, and ahc3 should get IRQ17

> You could attempt to fix the code in sys/i386/i386/mp_machdep.c. The following

ahc0 rev 0 int a irq 19 on pci1:4
ahc1 rev 0 int a irq 16 on pci1:5
ahc2 rev 0 int a irq 16 on pci2:4
ahc3 rev 0 int a irq 17 on pci2:5
                     ^^^ is it correct?
i recompile kernel with mp_machdep.c patch and got this result ^

This is dmesg output with mp_machdep.c patch:

Copyright (c) 1992-1996 FreeBSD Inc.
Copyright (c) 1982, 1986, 1989, 1991, 1993
        The Regents of the University of California. All rights reserved.

FreeBSD 3.0-SMP #0: Fri Jan 31 11:12:51 MSK 1997
    bag@fyllefrossa.demos.su:/arc1/src/sys-SMP/compile/FYLLEFROSSA
FreeBSD/SMP: Multiprocessor motherboard
 cpu0 (BSP): apic id: 1, version: 0x00040011
 cpu1 (AP):  apic id: 0, version: 0x00040011
 io0 (APIC): apic id: 2, version: 0x00170011
Calibrating clock(s) relative to mc146818A clock ...
i8254 clock: 1193157 Hz
CPU: Pentium Pro (686-class CPU)
  Origin = "GenuineIntel"  Id = 0x616  Stepping=6
  Features=0xfbff,MTRR,PGE,MCA,CMOV>
real memory  = 268435456 (262144K bytes)
avail memory = 257298432 (251268K bytes)
Probing for devices on PCI bus 0:
chip0 rev 2 on pci0:0
chip1 rev 1 on pci0:1:0
chip2 rev 0 on pci0:1:1
chip3 rev 2 on pci0:9
de0 rev 17 int a irq 18 on pci0:10
Freeing (NOT implimented) irq 12 for ISA cards.
de0: 21041 [10Mb/s] pass 1.1
de0: address 00:00:c0:74:8c:dc
chip4 rev 2 on pci0:12
Freeing (NOT implimented) irq 12 for ISA cards.
Probing for devices on PCI bus 1:
ahc0 rev 0 int a irq 19 on pci1:4
Freeing (NOT implimented) irq 9 for ISA cards.
ahc0: aic7880 Wide Channel A, SCSI Id=7, 16 SCBs
ahc0 waiting for scsi devices to settle
(ahc0:0:0): "SEAGATE ST32550W 0016" type 0 fixed SCSI 2
sd0(ahc0:0:0): Direct-Access 2047MB (4194058 512 byte sectors)
(ahc0:1:0): "SEAGATE ST19171W 0017" type 0 fixed SCSI 2
sd1(ahc0:1:0): Direct-Access 8683MB (17783112 512 byte sectors)
(ahc0:2:0): "SEAGATE ST19171W 0017" type 0 fixed SCSI 2
sd2(ahc0:2:0): Direct-Access 8683MB (17783112 512 byte sectors)
ahc1 rev 0 int a irq 16 on pci1:5
Freeing (NOT implimented) irq 11 for ISA cards.
ahc1: aic7880 Wide Channel B, SCSI Id=7, 16 SCBs
ahc1 waiting for scsi devices to settle
ahc1: Someone reset channel A
Probing for devices on PCI bus 2:
ahc2 rev 0 int a irq 16 on pci2:4
Freeing (NOT implimented) irq 11 for ISA cards.
ahc2: aic7880 Wide Channel A, SCSI Id=7, 16 SCBs
ahc2 waiting for scsi devices to settle
ahc2: Someone reset channel A
ahc3 rev 0 int a irq 17 on pci2:5
Freeing (NOT implimented) irq 10 for ISA cards.
ahc3: aic7880 Wide Channel B, SCSI Id=7, 16 SCBs
ahc3 waiting for scsi devices to settle
ahc3: Someone reset channel A
Probing for devices on the ISA bus:
sc0 at 0x60-0x6f irq 1 on motherboard
sc0: VGA color <16 virtual consoles, flags=0x0>
fdc0 at 0x3f0-0x3f7 irq 6 drq 2 on isa
fdc0: NEC 72065B
npx0 on motherboard
npx0: INT 16 interface
changing root device to sd0a
Enabled INTs: 1, 2, 6, 8, 16, 17, 18, 19, imen: 0x00f0feb9
de0: enabling 10baseT port
SMP: All idle procs online.
SMP: Starting 1st AP!
SMP: AP CPU #1 LAUNCHED!! Starting Scheduling...
SMP: TADA! CPU #1 made it into the scheduler!.
SMP: All 2 CPU's are online!

From owner-freebsd-smp Fri Jan 31 02:02:11 1997
Return-Path:
Received: (from root@localhost)
    by freefall.freebsd.org (8.8.5/8.8.5) id CAA15492
    for smp-outgoing; Fri, 31 Jan 1997 02:02:11 -0800 (PST)
Received: from clem.systemsix.com (clem.systemsix.com [198.99.86.131])
    by freefall.freebsd.org (8.8.5/8.8.5) with SMTP id CAA15486
    for ; Fri, 31 Jan 1997 02:02:05 -0800 (PST)
Received: from localhost (localhost [127.0.0.1])
    by clem.systemsix.com (8.6.12/8.6.12) with SMTP id DAA22229;
    Fri, 31 Jan 1997 03:00:53 -0700
Message-Id: <199701311000.DAA22229@clem.systemsix.com>
X-Authentication-Warning: clem.systemsix.com: Host localhost didn't use HELO protocol
X-Mailer: exmh version 1.6.5 12/11/95
From: Steve Passe
To: bag@sinbin.demos.su (Alex G. Bulushev)
cc: mishania@demos.su, freebsd-smp@freebsd.org
Subject: Re: troubles with smp kernel
In-reply-to: Your message of "Fri, 31 Jan 1997 12:37:39 +0300."
    <199701310937.MAA21221@sinbin.demos.su>
Mime-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Date: Fri, 31 Jan 1997 03:00:53 -0700
Sender: owner-smp@freebsd.org
X-Loop: FreeBSD.org
Precedence: bulk

Hi,

> >You could attempt to fix the code in sys/i386/i386/mp_machdep.c.
>
>ahc0 rev 0 int a irq 19 on pci1:4
>ahc1 rev 0 int a irq 16 on pci1:5
>ahc2 rev 0 int a irq 16 on pci2:4
>ahc3 rev 0 int a irq 17 on pci2:5
>                     ^^^ is it correct?

yes, this is exactly what we wanted to see, this corresponds to the entries
in the mp table:

I/O Ints:       Type    Polarity    Trigger   Bus ID   IRQ    APIC ID   INT#
                INT     active-lo   level     1        4:A    2         19
                INT     active-lo   level     1        5:A    2         16
                INT     active-lo   level     2        4:A    2         16
                INT     active-lo   level     2        5:A    2         17

you parse the dmesg lines as:

  int a irq 17 on pci2:5
  <- pci bus #2, pci device #5, pci int pin a on IRQ 17

you parse the mp table INT association line as:

  2  5:A  2  17
  <- pci bus #2, pci device #5, pci int A assoc with APIC IRQ 17

so for each of the 4 controllers (ahc0-3) we picked the correct APIC IRQs.
now the $64k question, is the system running any better this way?

--
Steve Passe     | powered by
smp@csn.net     |            FreeBSD

From owner-freebsd-smp Fri Jan 31 04:05:44 1997
Return-Path:
Received: (from root@localhost)
    by freefall.freebsd.org (8.8.5/8.8.5) id EAA19912
    for smp-outgoing; Fri, 31 Jan 1997 04:05:44 -0800 (PST)
Received: from kremvax.demos.su (kremvax.demos.su [194.87.0.20])
    by freefall.freebsd.org (8.8.5/8.8.5) with SMTP id EAA19906
    for ; Fri, 31 Jan 1997 04:05:37 -0800 (PST)
Received: by kremvax.demos.su (8.6.13/D) from 0@sinbin.demos.su [194.87.2.95]
    with ESMTP id PAA28052; Fri, 31 Jan 1997 15:04:41 +0300
Received: by sinbin.demos.su id PAA22033; (8.6.12/D) Fri, 31 Jan 1997 15:04:13 +0300
From: bag@sinbin.demos.su (Alex G.
    Bulushev)
Message-Id: <199701311204.PAA22033@sinbin.demos.su>
Subject: Re: troubles with smp kernel
To: smp@csn.net (Steve Passe)
Date: Fri, 31 Jan 1997 15:04:13 +0300 (MSK)
Cc: mishania@demos.su, freebsd-smp@freebsd.org
In-Reply-To: <199701311000.DAA22229@clem.systemsix.com> from "Steve Passe" at Jan 31, 97 03:00:53 am
X-Mailer: ELM [version 2.4 PL24 ME7a]
Content-Type: text
Sender: owner-smp@freebsd.org
X-Loop: FreeBSD.org
Precedence: bulk

>
> so for each of the 4 controllers (ahc0-3) we picked the correct APIC IRQs.
> now the $64k question, is the system running any better this way?
>

no :( it reboots ... max working time ~1h
when running with one CPU it works ...
i'm planning to test it with minimal needed number of cards

Alex.

From owner-freebsd-smp Fri Jan 31 06:36:09 1997
Return-Path:
Received: (from root@localhost)
    by freefall.freebsd.org (8.8.5/8.8.5) id GAA24787
    for smp-outgoing; Fri, 31 Jan 1997 06:36:09 -0800 (PST)
Received: from pdx1.world.net (pdx1.world.net [192.243.32.18])
    by freefall.freebsd.org (8.8.5/8.8.5) with ESMTP id GAA24759
    for ; Fri, 31 Jan 1997 06:36:02 -0800 (PST)
Received: from suburbia.net (suburbia.net [203.4.184.1])
    by pdx1.world.net (8.7.5/8.7.3) with SMTP id GAA10728
    for ; Fri, 31 Jan 1997 06:36:53 -0800 (PST)
Received: (qmail 2829 invoked by uid 110); 31 Jan 1997 14:35:10 -0000
MBOX-Line: From owner-netdev@roxanne.nuclecu.unam.mx Fri Jan 31 12:59:57 1997 remote from suburbia.net
Delivered-To: proff@suburbia.net
Received: (qmail 1389 invoked from network); 31 Jan 1997 12:59:12 -0000
Received: from roxanne.nuclecu.unam.mx (132.248.29.2)
    by suburbia.net with SMTP; 31 Jan 1997 12:59:12 -0000
Received: (from root@localhost)
    by roxanne.nuclecu.unam.mx (8.6.12/8.6.11) id GAA10602
    for netdev-outgoing; Fri, 31 Jan 1997 06:34:25 -0600
Received: from caipfs.rutgers.edu (caipfs.rutgers.edu [128.6.155.100])
    by roxanne.nuclecu.unam.mx (8.6.12/8.6.11) with ESMTP id GAA10595
    for ; Fri, 31 Jan 1997 06:34:20 -0600
Received: from jenolan.caipgeneral
    (jenolan.rutgers.edu [128.6.111.5])
    by caipfs.rutgers.edu (8.7.6/8.7.3) with SMTP id HAA22496;
    Fri, 31 Jan 1997 07:31:29 -0500 (EST)
Received: by jenolan.caipgeneral (SMI-8.6/SMI-SVR4) id HAA15221;
    Fri, 31 Jan 1997 07:31:12 -0500
Date: Fri, 31 Jan 1997 07:31:12 -0500
Message-Id: <199701311231.HAA15221@jenolan.caipgeneral>
From: "David S. Miller"
To: netdev@roxanne.nuclecu.unam.mx
CC: roque@di.fc.ul.pt
In-reply-to: <199701311117.LAA13030@oberon.di.fc.ul.pt> (message from Pedro Roque on Fri, 31 Jan 1997 11:17:04 GMT)
Subject: Re: SMP
Reply-To: netdev@roxanne.nuclecu.unam.mx, "David S. Miller"
Sender: owner-smp@FreeBSD.ORG
X-Loop: FreeBSD.org
Precedence: bulk

   Date: Fri, 31 Jan 1997 11:17:04 GMT
   From: Pedro Roque

   Ok dave how do you want us to do it :-) ?

Good question ;)

   Specific questions i have:
   - Is it ok to do locking on TCP by socket ? (i.e. is the granularity
     too low)
   - a lot of the current locking is done by using atomic_bh to avoid
     net_bh and timer to run while the critical section is executing...
     This is a nice scheme in non SMP... how do you want it replaced ?
   - How will timers and bottom handlers run ? do they have a lock
     already ? should they release the kernel lock and acquire a network
     one ?

[ NOTE: This mail should be forwarded by some random person within
  the next 12 hours or so to freebsd-hackers@freebsd.org and
  instantly everyone there will be "experts" on wait free algorithms
  and lockless data structures. This seems to be a new fad or
  something, almost everything I post about some new technique that
  is going to enter Linux gets forwarded there by someone, I find it
  extremely amusing the treatment my forwarded postings get. ;-) ]

If you want to attack the SMP problem the way that will bring results
myself and Linus are really striving for, do this:

step 1: Play stupid. "Locks, what are locks?"

This is completely serious. Pretend that you know nothing about
"traditional" SMP synchronization techniques.
Now you are ready to analyze the problem at hand.

As an example, self-locking or also termed "lockless" data structures
which are self synchronizing are what you will most likely get the
best results from. As an example, Eric Schenk already has told us that
self-locking algorithms for hash tables do exist, and that is why he was
stressing that we look at that solution for the socket demuxing
instead of something involving a radix tree.

There are (at least to my knowledge) self locking data structures and
algorithms for a decent set of the basic objects one would want to
manipulate. Alan told me at Usenix that ones exist for trees, although
he couldn't recall what kind of trees. Also I'm pretty sure there is
some sort of linked list implementation which is self locking as well.

The problem here is that this is new territory. I only have one
reference or two to this area of work. It would be nice if the other
hackers (Mr. Schenk, hint hint ;-) could cough up sources of
information for papers on existing self locking techniques.

Another related area is that of "wait free" algorithms, most of the
existing research work is done depending upon a DCAS() instruction on
the CPU, luckily DCAS (Double Compare And Set) can be done in software
with not much extra overhead than a real hardware implementation. If
you structure your code in the way their DCAS based wait free
algorithms are, you get the same effect and the same contention
avoidance characteristics.

step 2: Ok, no self locking technique exists for what you are
        tackling. Ok, plan B.

At this point, you are going to go after a traditional locking
technique. But make the most of it. The traditional way people think
about this stuff for performance is to try and make it such that less
people want to lock the same stuff. I tend to disagree with that
philosophy for certain situations, and instead I would encourage
someone to take the following approach first.
"Try to find a locking strategy that encourages code which holds the lock for _extremely_ short periods of time." That is, strive for short holds instead of less contention. As for interrupts and base handlers, perhaps not their "interface" but their implementation is indeed going to go through a major overhaul. At Usenix Linus and myself decided that we want the following from the interrupt architecture the kernel will have: 1) It must be implemented via software. 2) It will extend to the multiprocessor model. That is to say that when a cli() occurrs, dammit that is a cli() no matter what cpu you are on. This essentially requires (1) to be in any way portable. 3) It must be low overhead, at least not as ineffeicient as real uniprocessor hardware interrupt disabling is on supported architectures is. The interface as I mentioned will be the same, your good friends save_flags, cli, sti, and restore_flags will still be there and do what you expect them to. What is the motivation behind this? Think about it... If the above can be made to work, and efficiently, _every_ interrupt handler we currently have (this means all device driver interrupt servicing code) can run with no locks, no modifications, nothing at all. And whats more in the SMP case, a port can scatter the interrupts to various processors if they all come in at once. For example: do_IRQ() { if(intr_count[smp_processor_id()]) { give_irq_to_some_other_cpu(irq); return; } [ ... ] } or do_IRQ() { make_next_irq_goto_some_cpu(irq); [ ... ] } Again, everyone can still run lockless, completely. Think about what this does to interrupt response times. And also consider what this does for device driver writers, they need know nothing about ugly traditional SMP mutual exclusion techniques. They just block out "interrupts" when they need to protect their top and bottom half code from some data accesses. Sounds nice right? Anyways, these are the sorts of "clever" things you want to look for. 
There are some cases where I will say just do the traditional locking for SMP. Per-process signal state is a good example. This is a case where there are enough "inherent races" that the only thing you really need to protect from is multiple updates; reads are essentially safe in all cases. In fact I've coded the necessary parallelization for the signal stuff already; it was all straightforward and took around 45 minutes to implement and test. It was cool because it made 12 or 13 lock/unlock kernel sequences "disappear". ;-)

For the networking, I want to entertain an idea. What would people say if I told them that 75% or more of the locking in the networking could be done via bh atomics? Here is a case where my suggestion of keeping locking to a minimum does not hold, and it shouldn't. The bh's can be made to do the bulk of the work, and are assured single threaded semantics at all times. I could be (actually, probably am) entirely wrong here, but I wanted to at least float that sort of idea to people. And I suppose trying to make everything protected via bh's could suck for performance; essentially start_bh_atomic() means "single thread me". And that's not so swift for scalability at all.

Could the folks with references to lockless data structures and similar please share them? This would be appreciated.

---------------------------------------------//// Yow! 11.26 MB/s remote host TCP bandwidth & //// 199 usec remote TCP latency over 100Mb/s //// ethernet. Beat that! //// -----------------------------------------////__________ o David S.
Miller, davem@caip.rutgers.edu /_____________/ / // /_/ >< From owner-freebsd-smp Fri Jan 31 09:03:40 1997 Return-Path: Received: (from root@localhost) by freefall.freebsd.org (8.8.5/8.8.5) id JAA01181 for smp-outgoing; Fri, 31 Jan 1997 09:03:40 -0800 (PST) Received: from clem.systemsix.com (clem.systemsix.com [198.99.86.131]) by freefall.freebsd.org (8.8.5/8.8.5) with SMTP id JAA01171 for ; Fri, 31 Jan 1997 09:03:35 -0800 (PST) Received: from localhost (localhost [127.0.0.1]) by clem.systemsix.com (8.6.12/8.6.12) with SMTP id JAA24071; Fri, 31 Jan 1997 09:58:20 -0700 Message-Id: <199701311658.JAA24071@clem.systemsix.com> X-Authentication-Warning: clem.systemsix.com: Host localhost didn't use HELO protocol X-Mailer: exmh version 1.6.5 12/11/95 From: Steve Passe To: bag@sinbin.demos.su (Alex G. Bulushev) cc: mishania@demos.su, freebsd-smp@freebsd.org Subject: Re: troubles with smp kernel In-reply-to: Your message of "Fri, 31 Jan 1997 15:04:13 +0300." <199701311204.PAA22033@sinbin.demos.su> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Date: Fri, 31 Jan 1997 09:58:20 -0700 Sender: owner-smp@freebsd.org X-Loop: FreeBSD.org Precedence: bulk Hi, > > so for each of the 4 controllers (ahc0-3) we picked the correct APIC IRQs. > > now the $64k question, is the system running any better this way? > > > > no :( it reboots ... max working time ~1h > when runing with one CPU it works ... > i'l planing to test it with minimal needed number of cards bummer.... well we definately did find and fix a problem. My guess is that whatever causes the PARITY ERRORS when adding the 2nd CPU is what is causing the reboots. when you say "when runing with one CPU it works" do you mean with 1 physical CPU installed, or do you mean with 2 CPUs installed but without starting up the 2nd one via sysctl? 
-- Steve Passe | powered by smp@csn.net | FreeBSD -----BEGIN PGP PUBLIC KEY BLOCK----- Version: 2.6.2 mQCNAzHe7tEAAAEEAM274wAEEdP+grIrV6UtBt54FB5ufifFRA5ujzflrvlF8aoE 04it5BsUPFi3jJLfvOQeydbegexspPXL6kUejYt2OeptHuroIVW5+y2M2naTwqtX WVGeBP6s2q/fPPAS+g+sNZCpVBTbuinKa/C4Q6HJ++M9AyzIq5EuvO0a8Rr9AAUR tBlTdGV2ZSBQYXNzZSA8c21wQGNzbi5uZXQ+ =ds99 -----END PGP PUBLIC KEY BLOCK----- From owner-freebsd-smp Fri Jan 31 09:16:48 1997 Return-Path: Received: (from root@localhost) by freefall.freebsd.org (8.8.5/8.8.5) id JAA01800 for smp-outgoing; Fri, 31 Jan 1997 09:16:48 -0800 (PST) Received: from kremvax.demos.su (kremvax.demos.su [194.87.0.20]) by freefall.freebsd.org (8.8.5/8.8.5) with SMTP id JAA01791 for ; Fri, 31 Jan 1997 09:16:42 -0800 (PST) Received: by kremvax.demos.su (8.6.13/D) from 0@megillah.demos.su [194.87.0.21] with ESMTP id UAA09353; Fri, 31 Jan 1997 20:16:27 +0300 Received: by megillah.demos.su id UAA06479; (8.8.3/D) Fri, 31 Jan 1997 20:16:25 +0300 (MSK) Message-Id: <199701311716.UAA06479@megillah.demos.su> Subject: Re: troubles with smp kernel To: smp@csn.net (Steve Passe) Date: Fri, 31 Jan 1997 20:16:24 +0300 (MSK) Cc: bag@demos.su (Alex G. Bulushev), smp@freebsd.org In-Reply-To: <199701311658.JAA24071@clem.systemsix.com> from "Steve Passe" at Jan 31, 97 09:58:20 am From: "Mikhail A. Sokolov" X-Class: Fast Organization: Demos Company, Ltd. Reply-To: mishania@demos.su X-Mailer: ELM [version 2.4 PL24 ME7a] Content-Type: text Sender: owner-smp@freebsd.org X-Loop: FreeBSD.org Precedence: bulk > Hi, > > bummer.... well we definately did find and fix a problem. My guess is > that whatever causes the PARITY ERRORS when adding the 2nd CPU is what > is causing the reboots. when you say "when runing with one CPU it works" > do you mean with 1 physical CPU installed, or do you mean with 2 CPUs installed It now has 3 hours uptime without second adaptec, with SMP kernel and 2 physical CPU's, both on with sysctl. 
Though, it reboots on sysctl -w kern.smp_active=2 in maximum as 40 minutes, but right now we did sysctl -w kern.smp_active=1 after it has an uptime 2 hours and smp_active=0. Strange, but I see 'TADA''s on console about _both_ CPU's being in work on adding smp_active=1. 2 != 1, right ? It still survives anyhow. Let's see when it crashes. -mishania > but without starting up the 2nd one via sysctl? > -- > Steve Passe | powered by > smp@csn.net | FreeBSD From owner-freebsd-smp Fri Jan 31 09:46:57 1997 Return-Path: Received: (from root@localhost) by freefall.freebsd.org (8.8.5/8.8.5) id JAA03315 for smp-outgoing; Fri, 31 Jan 1997 09:46:57 -0800 (PST) Received: from kremvax.demos.su (kremvax.demos.su [194.87.0.20]) by freefall.freebsd.org (8.8.5/8.8.5) with SMTP id JAA03307 for ; Fri, 31 Jan 1997 09:46:45 -0800 (PST) Received: by kremvax.demos.su (8.6.13/D) from 0@sinbin.demos.su [194.87.2.95] with ESMTP id UAA18013; Fri, 31 Jan 1997 20:45:50 +0300 Received: by sinbin.demos.su id UAA01408; (8.6.12/D) Fri, 31 Jan 1997 20:45:05 +0300 From: bag@sinbin.demos.su (Alex G. Bulushev) Message-Id: <199701311745.UAA01408@sinbin.demos.su> Subject: Re: troubles with smp kernel To: smp@csn.net (Steve Passe) Date: Fri, 31 Jan 1997 20:45:05 +0300 (MSK) Cc: mishania@demos.su, freebsd-smp@freebsd.org In-Reply-To: <199701311658.JAA24071@clem.systemsix.com> from "Steve Passe" at Jan 31, 97 09:58:20 am X-Mailer: ELM [version 2.4 PL24 ME7a] Content-Type: text Sender: owner-smp@freebsd.org X-Loop: FreeBSD.org Precedence: bulk > bummer.... well we definately did find and fix a problem. My guess is > that whatever causes the PARITY ERRORS when adding the 2nd CPU is what may be > is causing the reboots. when you say "when runing with one CPU it works" > do you mean with 1 physical CPU installed, or do you mean with 2 CPUs installed > but without starting up the 2nd one via sysctl? > i mean "with 2 physical CPU installed, but with not SMP kernel" with single phisical CPU it stable too ... 
now i remove 2nd 3940 and run SMP kernel with kern.smp_active=0 after 2 hours (without reboots !) i write: sysctl -w kern.smp_active=1 ^ why? don't know. sysctl -a show that kern.smp_active=2 and message on console say that two CPU's runing ... now it works without reboots ... 3:13 ... 1 h with two CPU's Alex. From owner-freebsd-smp Fri Jan 31 10:19:01 1997 Return-Path: Received: (from root@localhost) by freefall.freebsd.org (8.8.5/8.8.5) id KAA04806 for smp-outgoing; Fri, 31 Jan 1997 10:19:01 -0800 (PST) Received: from clem.systemsix.com (clem.systemsix.com [198.99.86.131]) by freefall.freebsd.org (8.8.5/8.8.5) with SMTP id KAA04801 for ; Fri, 31 Jan 1997 10:18:57 -0800 (PST) Received: from localhost (localhost [127.0.0.1]) by clem.systemsix.com (8.6.12/8.6.12) with SMTP id LAA24508; Fri, 31 Jan 1997 11:15:59 -0700 Message-Id: <199701311815.LAA24508@clem.systemsix.com> X-Authentication-Warning: clem.systemsix.com: Host localhost didn't use HELO protocol X-Mailer: exmh version 1.6.5 12/11/95 From: Steve Passe To: mishania@demos.su cc: bag@demos.su (Alex G. Bulushev), smp@freebsd.org Subject: Re: troubles with smp kernel In-reply-to: Your message of "Fri, 31 Jan 1997 20:16:24 +0300." <199701311716.UAA06479@megillah.demos.su> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Date: Fri, 31 Jan 1997 11:15:59 -0700 Sender: owner-smp@freebsd.org X-Loop: FreeBSD.org Precedence: bulk Hi, > It now has 3 hours uptime without second adaptec, with SMP kernel and > 2 physical CPU's, both on with sysctl. Though, it reboots on sysctl -w > kern.smp_active=2 in maximum as 40 minutes, but right now we did > sysctl -w kern.smp_active=1 after it has an uptime 2 hours and smp_active=0. > Strange, but I see 'TADA''s on console about _both_ CPU's being in work on > adding smp_active=1. 2 != 1, right ? its a bug in my code, I'll elaborate in the answer to the next mailing... 
-- Steve Passe | powered by smp@csn.net | FreeBSD -----BEGIN PGP PUBLIC KEY BLOCK----- Version: 2.6.2 mQCNAzHe7tEAAAEEAM274wAEEdP+grIrV6UtBt54FB5ufifFRA5ujzflrvlF8aoE 04it5BsUPFi3jJLfvOQeydbegexspPXL6kUejYt2OeptHuroIVW5+y2M2naTwqtX WVGeBP6s2q/fPPAS+g+sNZCpVBTbuinKa/C4Q6HJ++M9AyzIq5EuvO0a8Rr9AAUR tBlTdGV2ZSBQYXNzZSA8c21wQGNzbi5uZXQ+ =ds99 -----END PGP PUBLIC KEY BLOCK----- From owner-freebsd-smp Fri Jan 31 10:40:31 1997 Return-Path: Received: (from root@localhost) by freefall.freebsd.org (8.8.5/8.8.5) id KAA05628 for smp-outgoing; Fri, 31 Jan 1997 10:40:31 -0800 (PST) Received: from clem.systemsix.com (clem.systemsix.com [198.99.86.131]) by freefall.freebsd.org (8.8.5/8.8.5) with SMTP id KAA05620 for ; Fri, 31 Jan 1997 10:40:16 -0800 (PST) Received: from localhost (localhost [127.0.0.1]) by clem.systemsix.com (8.6.12/8.6.12) with SMTP id LAA24610; Fri, 31 Jan 1997 11:35:51 -0700 Message-Id: <199701311835.LAA24610@clem.systemsix.com> X-Authentication-Warning: clem.systemsix.com: Host localhost didn't use HELO protocol X-Mailer: exmh version 1.6.5 12/11/95 From: Steve Passe To: bag@sinbin.demos.su (Alex G. Bulushev) cc: mishania@demos.su, freebsd-smp@freebsd.org Subject: Re: troubles with smp kernel In-reply-to: Your message of "Fri, 31 Jan 1997 20:45:05 +0300." <199701311745.UAA01408@sinbin.demos.su> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Date: Fri, 31 Jan 1997 11:35:50 -0700 Sender: owner-smp@freebsd.org X-Loop: FreeBSD.org Precedence: bulk Hi, > > is causing the reboots. when you say "when runing with one CPU it works" > > do you mean with 1 physical CPU installed, or do you mean with 2 CPUs installed > > but without starting up the 2nd one via sysctl? > > > > i mean "with 2 physical CPU installed, but with not SMP kernel" > with single phisical CPU it stable too ... > > now i remove 2nd 3940 and run SMP kernel with kern.smp_active=0 > after 2 hours (without reboots !) i write: > sysctl -w kern.smp_active=1 > ^ why? don't know. 
>
> sysctl -a show that kern.smp_active=2 and message on console
> say that two CPU's runing ...
>
> now it works without reboots ... 3:13 ... 1 h with two CPU's

you have found a minor bug in the way I start up the 2nd CPU. We have tried to make this completely automatic, but those attempts were never successful, so I threw in a quick hack to allow the manual startup of the 2nd CPU without affecting the autostart code any more than necessary. This hack basically just waits for smp_active to change from 0 to some non-0 value, then it sets smp_active = , so when you do:

 sysctl -w kern.smp_active=1

it appears to have EXACTLY the same effect as setting it to 2.

so, for tagging the problem in the mail archive:

----------------------------------------------------------------
SMP_PROBLEM:
 initial startup of APs via "sysctl kern.smp_active=x" ignores the actual value of 'x' and starts all the APs.

solution:
 not really a problem as much as it is just unexpected behaviour. should be a simple fix when I can find the time...
----------------------------------------------------------------

assuming you continue to run without trouble, this points to the 2nd 3940 (or possibly any other hardware you have also removed). I don't think it is INT sharing in itself: we now have the INTs properly assigned, and many users have systems with shared PCI INTs working without problem, including several with 1 3940 sharing INTs with network cards. You are the first to try 2 3940s that I know of, but that should not be a problem for any reason I can think of.

Perhaps it's the 2nd 3940 itself? After running without rebooting for a day or so I would suggest that you swap the 2 3940s, ie replace the working one with the currently removed one. Then run with it alone (ie leave the known good one out) and see if this 2nd card also runs by itself without problem.
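The behaviour described in this message can be modelled in a few lines. This is only a sketch of the logic as stated here, not the actual kernel code: the struct, the function, and the field names are all hypothetical. The value written back to smp_active is an assumption (the CPU count), inferred from the report that sysctl -a showed kern.smp_active=2 on a two-CPU machine after writing 1.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical model of the manual AP (application processor) startup hack. */
struct smp_state {
    int ncpus;        /* CPUs found at boot (assumed known) */
    int smp_active;   /* 0 until the APs have been started */
    int aps_started;  /* how many APs this model has started */
};

/* Model of "sysctl -w kern.smp_active=x". */
static void set_smp_active(struct smp_state *st, int x)
{
    assert(st != NULL);
    if (st->smp_active == 0 && x != 0) {
        /* Only the 0 -> non-0 transition is noticed; x itself is ignored. */
        st->aps_started = st->ncpus - 1;  /* every CPU except the boot CPU */
        st->smp_active = st->ncpus;       /* assumption: set to the CPU count */
    }
    /* Other transitions are deliberately left out of this sketch. */
}
```

Under these assumptions, writing 1 and writing 2 on a two-CPU machine leave the model in the same state, which matches what was observed on the console.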
-- Steve Passe | powered by smp@csn.net | FreeBSD -----BEGIN PGP PUBLIC KEY BLOCK----- Version: 2.6.2 mQCNAzHe7tEAAAEEAM274wAEEdP+grIrV6UtBt54FB5ufifFRA5ujzflrvlF8aoE 04it5BsUPFi3jJLfvOQeydbegexspPXL6kUejYt2OeptHuroIVW5+y2M2naTwqtX WVGeBP6s2q/fPPAS+g+sNZCpVBTbuinKa/C4Q6HJ++M9AyzIq5EuvO0a8Rr9AAUR tBlTdGV2ZSBQYXNzZSA8c21wQGNzbi5uZXQ+ =ds99 -----END PGP PUBLIC KEY BLOCK----- From owner-freebsd-smp Fri Jan 31 10:47:49 1997 Return-Path: Received: (from root@localhost) by freefall.freebsd.org (8.8.5/8.8.5) id KAA05906 for smp-outgoing; Fri, 31 Jan 1997 10:47:49 -0800 (PST) Received: from clem.systemsix.com (clem.systemsix.com [198.99.86.131]) by freefall.freebsd.org (8.8.5/8.8.5) with SMTP id KAA05870 for ; Fri, 31 Jan 1997 10:47:09 -0800 (PST) Received: from localhost (localhost [127.0.0.1]) by clem.systemsix.com (8.6.12/8.6.12) with SMTP id LAA24680 for ; Fri, 31 Jan 1997 11:45:56 -0700 Message-Id: <199701311845.LAA24680@clem.systemsix.com> X-Authentication-Warning: clem.systemsix.com: Host localhost didn't use HELO protocol X-Mailer: exmh version 1.6.5 12/11/95 From: Steve Passe To: freebsd-smp@FreeBSD.ORG Subject: bounced mail Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Date: Fri, 31 Jan 1997 11:45:56 -0700 Sender: owner-smp@FreeBSD.ORG X-Loop: FreeBSD.org Precedence: bulk Hi, I just got the 5-day notices for bounced mail on 2 different replys I made to . Hopefully you still get this and will know I didn't just ignore you: ----- The following addresses had delivery problems ----- (unrecoverable error) ----- Transcript of session follows ----- ... 
Deferred: Name server: intercore.com.: host name lookup failure Message could not be delivered for 5 days Message will be deleted from queue -- Steve Passe | powered by smp@csn.net | FreeBSD From owner-freebsd-smp Fri Jan 31 12:21:57 1997 Return-Path: Received: (from root@localhost) by freefall.freebsd.org (8.8.5/8.8.5) id MAA10122 for smp-outgoing; Fri, 31 Jan 1997 12:21:57 -0800 (PST) Received: from kremvax.demos.su (kremvax.demos.su [194.87.0.20]) by freefall.freebsd.org (8.8.5/8.8.5) with SMTP id MAA10117 for ; Fri, 31 Jan 1997 12:21:50 -0800 (PST) Received: by kremvax.demos.su (8.6.13/D) from 0@sinbin.demos.su [194.87.2.95] with ESMTP id XAA02066; Fri, 31 Jan 1997 23:20:49 +0300 Received: by sinbin.demos.su id XAA18178; (8.6.12/D) Fri, 31 Jan 1997 23:20:19 +0300 From: bag@sinbin.demos.su (Alex G. Bulushev) Message-Id: <199701312020.XAA18178@sinbin.demos.su> Subject: bytebench not correct for SMP kernel ? To: freebsd-smp@freebsd.org Date: Fri, 31 Jan 1997 23:20:19 +0300 (MSK) Cc: mishania@demos.su X-Mailer: ELM [version 2.4 PL24 ME7a] Content-Type: text Sender: owner-smp@freebsd.org X-Loop: FreeBSD.org Precedence: bulk

i ran bytebench for the nonSMP kernel, for the SMP kernel (smp_active=0) and for the SMP kernel (smp_active=2) on the same hardware.

fastest is the nonSMP kernel, then the SMP kernel (smp_active=0), and the slowest result is for the SMP kernel (smp_active=2).

some parameters decrease dramatically:

                                             nosmp       smp       smp
                                                      active=0  active=2
System Call Overhead Test           lps    68192.2   51738.2   38070.8
Pipe Throughput Test                lps    92324.9   68053.5   57780.3
Pipe-based Context Switching Test   lps    40542.8   20177.0    8785.4
Process Creation Test               lps     3256.4    2739.2    1568.9
Execl Throughput Test               lps     1437.4    1206.6    1032.5
File Read (10 seconds)              KBps  254626.0  190873.0  151645.0
File Read (30 seconds)              KBps  255236.0  191890.0  152978.0
Dc: sqrt(2) to 99 decimal places    lpm     9533.5    8497.6    7406.6

is it a bytebench bug? what bench tests SMP correctly?

Alex.
BYTE UNIX Benchmarks (Version 3.11)
Asus P/I-P65UP5 with C-P6ND dual P6-200, adaptec 3940W, 256 Mb RAM ECC

                                                    kernel/nosmp       kernel/smp       kernel/smp
                                                                     smp_active=0     smp_active=2
Dhrystone 2 without register variables (10s, 6sampl)    445551.2 lps     448127.5 lps     448622.3 lps
Dhrystone 2 using register variables   (10s, 6sampl)    449567.0 lps     451046.9 lps     451203.7 lps
Arithmetic Test (type = arithoh)       (10s, 6sampl)  13243729.3 lps   13264989.4 lps   13268774.7 lps
Arithmetic Test (type = register)      (10s, 6sampl)     54531.7 lps      54611.8 lps      54642.7 lps
Arithmetic Test (type = short)         (10s, 6sampl)     33325.9 lps      33383.2 lps      33395.7 lps
Arithmetic Test (type = int)           (10s, 6sampl)     54532.6 lps      54614.9 lps      54620.3 lps
Arithmetic Test (type = long)          (10s, 6sampl)     54533.5 lps      54583.9 lps      54660.0 lps
Arithmetic Test (type = float)         (10s, 6sampl)     60761.6 lps      60870.3 lps      60918.7 lps
Arithmetic Test (type = double)        (10s, 6sampl)     60779.7 lps      60865.2 lps      60931.6 lps
System Call Overhead Test              (10s, 6sampl)     68192.2 lps      51738.2 lps      38070.8 lps
Pipe Throughput Test                   (10s, 6sampl)     92324.9 lps      68053.5 lps      57780.3 lps
Pipe-based Context Switching Test      (10s, 6sampl)     40542.8 lps      20177.0 lps       8785.4 lps
Process Creation Test                  (10s, 6sampl)      3256.4 lps       2739.2 lps       1568.9 lps
Execl Throughput Test                  ( 9s, 6sampl)      1437.4 lps       1206.6 lps       1032.5 lps
File Read (10 seconds)                 (10s, 6sampl)    254626.0 KBps    190873.0 KBps    151645.0 KBps
File Write (10 seconds)                (10s, 6sampl)      3800.0 KBps      3800.0 KBps      3800.0 KBps
File Copy (10 seconds)                 (10s, 6sampl)      3286.0 KBps      3286.0 KBps      3286.0 KBps
File Read (30 seconds)                 (30s, 6sampl)    255236.0 KBps    191890.0 KBps    152978.0 KBps
File Write (30 seconds)                (30s, 6sampl)      3600.0 KBps      3600.0 KBps      3600.0 KBps
File Copy (30 seconds)                 (30s, 6sampl)         1.0 KBps      3177.0 KBps      3236.0 KBps
C Compiler Test                        (60s, 3sampl)       169.9 lpm        177.5 lpm        169.3 lpm
Shell scripts (1 concurrent)           (60s, 3sampl)       270.3 lpm        293.9 lpm        264.0 lpm
Shell scripts (2 concurrent)           (60s, 3sampl)       149.0 lpm        170.6 lpm        150.3 lpm
Shell scripts (4 concurrent)           (60s, 3sampl)        80.3 lpm         94.3 lpm         83.0 lpm
Shell scripts (8 concurrent)           (60s, 3sampl)        40.3 lpm         40.0 lpm         42.0 lpm
Dc: sqrt(2) to 99 decimal places       (60s, 6sampl)      9533.5 lpm       8497.6 lpm       7406.6 lpm
Recursion Test--Tower of Hanoi         (10s, 6sampl)      5574.9 lps       5600.4 lps       5602.9 lps

INDEX VALUES
TEST                                         INDEX     INDEX     INDEX

Arithmetic Test (type = double)               23.9      23.9      24.0
Dhrystone 2 without register variables        19.9      20.0      20.1
Execl Throughput Test                         87.1      73.1      62.6
File Copy (30 seconds)                         0.0      17.7      18.1
Pipe-based Context Switching Test             30.7      15.3       6.7
Shell scripts (8 concurrent)                  10.1      10.0      10.5
                                         ========= ========= =========
SUM of 6 items                               171.8     160.2     141.8
AVERAGE                                       28.6      26.7      23.6

From owner-freebsd-smp Fri Jan 31 12:23:40 1997 Return-Path: Received: (from root@localhost) by freefall.freebsd.org (8.8.5/8.8.5) id MAA10152 for smp-outgoing; Fri, 31 Jan 1997 12:23:40 -0800 (PST) Received: from kremvax.demos.su (kremvax.demos.su [194.87.0.20]) by freefall.freebsd.org (8.8.5/8.8.5) with SMTP id MAA10147 for ; Fri, 31 Jan 1997 12:23:35 -0800 (PST) Received: by kremvax.demos.su (8.6.13/D) from 0@sinbin.demos.su [194.87.2.95] with ESMTP id XAA02592; Fri, 31 Jan 1997 23:22:48 +0300 Received: by sinbin.demos.su id XAA25521; (8.6.12/D) Fri, 31 Jan 1997 23:22:51 +0300 From: bag@sinbin.demos.su (Alex G. Bulushev) Message-Id: <199701312022.XAA25521@sinbin.demos.su> Subject: Re: troubles with smp kernel To: smp@csn.net (Steve Passe) Date: Fri, 31 Jan 1997 23:22:50 +0300 (MSK) Cc: mishania@demos.su, freebsd-smp@freebsd.org In-Reply-To: <199701311835.LAA24610@clem.systemsix.com> from "Steve Passe" at Jan 31, 97 11:35:50 am X-Mailer: ELM [version 2.4 PL24 ME7a] Content-Type: text Sender: owner-smp@freebsd.org X-Loop: FreeBSD.org Precedence: bulk > Perhaps its the 2nd 3940 itself? After running without rebooting for a > day or so I would suggest that you swap the 2 3940s, ie replace the working > one with the currently removed one.
Then run with it alone (ie leave the > known good one out) and see if this 2nd card also runs by itself withoout > problem. Ok, i'l do this ... Alex. From owner-freebsd-smp Fri Jan 31 12:33:46 1997 Return-Path: Received: (from root@localhost) by freefall.freebsd.org (8.8.5/8.8.5) id MAA10628 for smp-outgoing; Fri, 31 Jan 1997 12:33:46 -0800 (PST) Received: from sendero.i-connect.net ([206.190.144.100]) by freefall.freebsd.org (8.8.5/8.8.5) with ESMTP id MAA10601 for ; Fri, 31 Jan 1997 12:33:40 -0800 (PST) Received: (from shimon@localhost) by sendero.i-connect.net (8.8.5/8.8.4) id NAA15041; Fri, 31 Jan 1997 13:32:16 -0800 (PST) Message-ID: X-Mailer: XFMail 1.1-alpha [p0] on FreeBSD Content-Type: text/plain; charset=iso-8859-8 Content-Transfer-Encoding: 8bit MIME-Version: 1.0 In-Reply-To: <199701310907.MAA14659@sinbin.demos.su> Date: Fri, 31 Jan 1997 12:58:41 -0800 (PST) Organization: iConnect Corp. From: Simon Shapiro To: (Alex G. Bulushev) Subject: Re: troubles with smp kernel Cc: freebsd-smp@freebsd.org, mishania@demos.su, (Steve Passe) Sender: owner-smp@freebsd.org X-Loop: FreeBSD.org Precedence: bulk Hi Alex G. Bulushev; On 31-Jan-97 you wrote: > > >options SMP_INVLTLB > > > > again, 2.0.6 will print this as well as all other new options in their > > suggested state, ie enabled/commented/default value. > > > > This is output of mptable 2.0.6, dmesg for kernel without mp_machdep.c patch, > but with all recomended options ... system very unstable ... > regulary after rebooting fsck stopped on drive sd2 cheking with message: > > cannot alloc 4177922 bytes for lncntp I get that one when doing fsck on a 4GB file system. Causes /etc/rc to fail and drop into user mode. On a 2.2-BETA, no SMP, etc. The amount of memory is only about 2MB, not 4. 
I had to disable the fsck in /etc/rc, so the system will boot and mount /Archives/FreeBSD :-) Simon From owner-freebsd-smp Fri Jan 31 13:09:40 1997 Return-Path: Received: (from root@localhost) by freefall.freebsd.org (8.8.5/8.8.5) id NAA12599 for smp-outgoing; Fri, 31 Jan 1997 13:09:40 -0800 (PST) Received: from clem.systemsix.com (clem.systemsix.com [198.99.86.131]) by freefall.freebsd.org (8.8.5/8.8.5) with SMTP id NAA12591 for ; Fri, 31 Jan 1997 13:09:35 -0800 (PST) Received: from localhost (localhost [127.0.0.1]) by clem.systemsix.com (8.6.12/8.6.12) with SMTP id OAA25415; Fri, 31 Jan 1997 14:04:11 -0700 Message-Id: <199701312104.OAA25415@clem.systemsix.com> X-Authentication-Warning: clem.systemsix.com: Host localhost didn't use HELO protocol X-Mailer: exmh version 1.6.5 12/11/95 From: Steve Passe To: Simon Shapiro cc: bag@sinbin.demos.su (Alex G. Bulushev), freebsd-smp@freebsd.org, mishania@demos.su Subject: Re: troubles with smp kernel In-reply-to: Your message of "Fri, 31 Jan 1997 12:58:41 PST." Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Date: Fri, 31 Jan 1997 14:04:10 -0700 Sender: owner-smp@freebsd.org X-Loop: FreeBSD.org Precedence: bulk Hi, > > but with all recomended options ... system very unstable ... > > regulary after rebooting fsck stopped on drive sd2 cheking with message: > > > > cannot alloc 4177922 bytes for lncntp > > I get that one when doing fsck on a 4GB file system. Causes /etc/rc to > fail and drop into user mode. On a 2.2-BETA, no SMP, etc. > The amount of memory is only about 2MB, not 4. I had to disable the fsck > in /etc/rc, so the system will boot and mount /Archives/FreeBSD :-) this is good to know, has this been reported to bugs via send-pr (or other means)? is there any keyword(s) for searching the mail lists for further info on this problem? 
-- Steve Passe | powered by smp@csn.net | FreeBSD -----BEGIN PGP PUBLIC KEY BLOCK----- Version: 2.6.2 mQCNAzHe7tEAAAEEAM274wAEEdP+grIrV6UtBt54FB5ufifFRA5ujzflrvlF8aoE 04it5BsUPFi3jJLfvOQeydbegexspPXL6kUejYt2OeptHuroIVW5+y2M2naTwqtX WVGeBP6s2q/fPPAS+g+sNZCpVBTbuinKa/C4Q6HJ++M9AyzIq5EuvO0a8Rr9AAUR tBlTdGV2ZSBQYXNzZSA8c21wQGNzbi5uZXQ+ =ds99 -----END PGP PUBLIC KEY BLOCK----- From owner-freebsd-smp Fri Jan 31 17:05:10 1997 Return-Path: Received: (from root@localhost) by freefall.freebsd.org (8.8.5/8.8.5) id RAA25979 for smp-outgoing; Fri, 31 Jan 1997 17:05:10 -0800 (PST) Received: from phaeton.artisoft.com (phaeton.Artisoft.COM [198.17.250.211]) by freefall.freebsd.org (8.8.5/8.8.5) with SMTP id RAA25974 for ; Fri, 31 Jan 1997 17:05:05 -0800 (PST) Received: (from terry@localhost) by phaeton.artisoft.com (8.6.11/8.6.9) id SAA03619; Fri, 31 Jan 1997 18:03:38 -0700 From: Terry Lambert Message-Id: <199702010103.SAA03619@phaeton.artisoft.com> Subject: Re: bytebench not correct for SMP kernel ? To: bag@sinbin.demos.su (Alex G. Bulushev) Date: Fri, 31 Jan 1997 18:03:38 +73700 (MST) Cc: freebsd-smp@freebsd.org, mishania@demos.su In-Reply-To: <199701312020.XAA18178@sinbin.demos.su> from "Alex G. Bulushev" at Jan 31, 97 11:20:19 pm X-Mailer: ELM [version 2.4 PL24] MIME-Version: 1.0 Content-Type: text/plain; charset=US-ASCII Content-Transfer-Encoding: 7bit Sender: owner-smp@freebsd.org X-Loop: FreeBSD.org Precedence: bulk

> System Call Overhead Test lps 68192.2 51738.2 38070.8

This is probably fair, but it's too high. The kernel is not processor reentrant in the system call gate at this time; any other process running on the system will detract from the call time available to any other: even though they are both runnable, one or the other will be stuck in the mutex grab in the trap call. This is expected to be corrected once we can propagate locks down the data flow through each subsystem (I personally want to work on doing the VFS for this).
> Pipe Throughput Test lps 92324.9 68053.5 57780.3

This is probably an artifact of the call mutex again.

> Pipe-based Context Switching Test lps 40542.8 20177.0 8785.4

This is because the benchmark is doing the wrong thing. The context switch testing does not expect the processes so switched to operate concurrently. To correctly model this would require modelling a scarce resource becoming available after a small percentage of the run time of a process being switched... that is, the second CPU will be capable of entering the shared resource on the second process. This is inherently serialized by the way they are using the pipes in the context switch... so it measures serial context switch overhead, not concurrency of resource access. The context switch overhead is less important than the resource access concurrency in any SMP or kernel multithreading case. The more kernel threads and/or processors that can reenter the resource, the worse this will be for accurate modelling.

> Process Creation Test lps 3256.4 2739.2 1568.9

This is an effect of the fork call gate, and of the flush when a process is started on a CPU other than the one on which the process that forked is running (in a traditional UP environment, a cache flush is not required on the processor when the child starts, and it will, in effect, be here). The problem derives from the child and the parent both being immediately placed on the ready-to-run queue. The proper method of fixing this is probably to establish a split scheduler queue model to enforce an initial processor affinity in the child for the processor the parent that forked was running on.

If we scale this with the initial call mutex reduction, we see that this is slightly worse than the test case. This amount of "slightly worse" is the cost to establish the child process mappings on the second CPU in the absence of usable cached data, as you would have in the UP case.
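The scaling argument can be checked against the figures in the original report. The following sketch divides each smp_active=2 result by the non-SMP result; the numbers are copied from the posted bytebench output, while the struct and function names are invented for this example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* One row of the posted results: loops per second on each kernel. */
struct result {
    const char *test;
    double nosmp;   /* non-SMP kernel */
    double smp2;    /* SMP kernel, smp_active=2 */
};

static const struct result results[] = {
    { "System Call Overhead",         68192.2, 38070.8 },
    { "Pipe-based Context Switching", 40542.8,  8785.4 },
    { "Process Creation",              3256.4,  1568.9 },
    { "Execl Throughput",              1437.4,  1032.5 },
};

/* Fraction of uniprocessor throughput retained with both CPUs active. */
static double retained(const struct result *r)
{
    assert(r->nosmp > 0.0);
    return r->smp2 / r->nosmp;
}

/* Print one line per test: name, then the retained fraction. */
static void print_ratios(void)
{
    for (size_t i = 0; i < sizeof results / sizeof results[0]; i++)
        printf("%-30s %.2f\n", results[i].test, retained(&results[i]));
}
```

With these figures the system call test keeps roughly 0.56 of its uniprocessor rate and process creation roughly 0.48, so process creation does come out somewhat worse than the plain call-gate case, as the paragraph above says.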
You would probably discard the initial affinity (if the user has not forced an affinity) after the first context switch of the child, allowing the process to migrate off the parent's CPU, which would effect an increase in concurrency, assuming neither CPU was bound with work.

> Execl Throughput Test lps 1437.4 1206.6 1032.5

This is a truer measure of just the call gate overhead, since an exec'ed process won't have usable cache. We can scale them by their relationship on the UP case to see that the effects I predicted for CPU switching have about the scale we decided they would have. Again, this would benefit from deserializing the ready-to-run queue.

> File Read (10 seconds) KBps 254626.0 190873.0 151645.0
> File Read (30 seconds) KBps 255236.0 191890.0 152978.0

These are, again, the processor affinity issue, since the processor you go into the kernel on is not necessarily the processor you come out of the kernel on. This is a scheduler problem unrelated to the actual existence of SMP, per se...

> Dc: sqrt(2) to 99 decimal places lpm 9533.5 8497.6 7406.6

I'm not sure on this... there must be some call gate effects for the controlling process, but they would be minimal in a CPU bound environment. More likely, this is related to poor FPU context handling. In the standard UP kernel, the FPU context is "lazy bound"... that is, if a process uses FPU, the FPU state will not be flushed until another, *different* process also decides to use FPU -- then it will need to (potentially) signal exception state for unprocessed exceptions (the FPU design is "except-behind", probably an error on Intel's part, if you want an SMP system). This implies that if we did FPU "right" and did not tie the lazy flush to processor affinity changes (if any), then we would expect a higher overhead on context switch out of an FPU using process.

> is it bytebench bug?

There are a couple of bugs...
there is also a lot of overemphasis on issues related solely to the scheduler, and less related to what the tests purport to benchmark. Big effects from things the benchmark designers felt would be "noise", generally, and aren't, in the SMP case.

> what bench tests SMP corectly?

One benchmark is as good as another, as long as you know what you are comparing, and compare only similar things. This particular benchmark doesn't compare very good things to show SMP vs. non-SMP, and instead shows up scheduler differences (which is good too, but really doesn't match with the labels they've said describe what they are trying to test).

Regards,
Terry Lambert
terry@lambert.org
---
Any opinions in this posting are my own and not those of my present or previous employers.

From owner-freebsd-smp Fri Jan 31 18:46:21 1997 Return-Path: Received: (from root@localhost) by freefall.freebsd.org (8.8.5/8.8.5) id SAA29902 for smp-outgoing; Fri, 31 Jan 1997 18:46:21 -0800 (PST) Received: from sendero.i-connect.net ([206.190.144.100]) by freefall.freebsd.org (8.8.5/8.8.5) with ESMTP id SAA29896 for ; Fri, 31 Jan 1997 18:46:17 -0800 (PST) Received: (from shimon@localhost) by sendero.i-connect.net (8.8.5/8.8.4) id TAA16842; Fri, 31 Jan 1997 19:43:44 -0800 (PST) Message-ID: X-Mailer: XFMail 1.1-alpha [p0] on FreeBSD Content-Type: text/plain; charset=iso-8859-8 Content-Transfer-Encoding: 8bit MIME-Version: 1.0 In-Reply-To: <199701312104.OAA25415@clem.systemsix.com> Date: Fri, 31 Jan 1997 18:55:03 -0800 (PST) Organization: iConnect Corp. From: Simon Shapiro To: Steve Passe Subject: Re: troubles with smp kernel Cc: mishania@demos.su, freebsd-smp@freebsd.org, (Alex G. Bulushev) Sender: owner-smp@freebsd.org X-Loop: FreeBSD.org Precedence: bulk Hi Steve Passe; On 31-Jan-97 you wrote: > Hi, > > > > but with all recomended options ... system very unstable ...
> > > regulary after rebooting fsck stopped on drive sd2 cheking with message: > > > > > > cannot alloc 4177922 bytes for lncntp > > > > I get that one when doing fsck on a 4GB file system. Causes /etc/rc to > > fail and drop into user mode. On a 2.2-BETA, no SMP, etc. > > The amount of memory is only about 2MB, not 4. I had to disable the fsck > > in /etc/rc, so the system will boot and mount /Archives/FreeBSD :-) > > this is good to know, has this been reported to bugs via send-pr (or > other means)? is there any keyword(s) for searching the mail lists > for further info on this problem? Now there is :-) lncntp 'fsck -p' 4GB Simon From owner-freebsd-smp Fri Jan 31 19:22:41 1997 Return-Path: Received: (from root@localhost) by freefall.freebsd.org (8.8.5/8.8.5) id TAA01267 for smp-outgoing; Fri, 31 Jan 1997 19:22:41 -0800 (PST) Received: from clem.systemsix.com (clem.systemsix.com [198.99.86.131]) by freefall.freebsd.org (8.8.5/8.8.5) with SMTP id TAA01262 for ; Fri, 31 Jan 1997 19:22:37 -0800 (PST) Received: from localhost (localhost [127.0.0.1]) by clem.systemsix.com (8.6.12/8.6.12) with SMTP id UAA27071; Fri, 31 Jan 1997 20:16:08 -0700 Message-Id: <199702010316.UAA27071@clem.systemsix.com> X-Authentication-Warning: clem.systemsix.com: Host localhost didn't use HELO protocol X-Mailer: exmh version 1.6.5 12/11/95 From: Steve Passe To: Terry Lambert cc: bag@sinbin.demos.su (Alex G. Bulushev), freebsd-smp@freebsd.org, mishania@demos.su Subject: Re: bytebench not correct for SMP kernel ? In-reply-to: Your message of "Fri, 31 Jan 1997 18:03:38." <199702010103.SAA03619@phaeton.artisoft.com> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Date: Fri, 31 Jan 1997 20:16:08 -0700 Sender: owner-smp@freebsd.org X-Loop: FreeBSD.org Precedence: bulk Hi, [ nice answer by terry, thanx! 
]

perhaps a better test is doing some real world things, try this

  cd /src/sys/compile/

running uni-processor kernel:

  make clean; make depend; time make
  make clean; make depend; time make -j4

running SMP kernel with both CPUs enabled:

  make clean; make depend; time make
  make clean; make depend; time make -j8

--
Steve Passe | powered by
smp@csn.net | FreeBSD

From owner-freebsd-smp Fri Jan 31 20:02:21 1997
Return-Path:
Received: (from root@localhost) by freefall.freebsd.org (8.8.5/8.8.5) id UAA02514 for smp-outgoing; Fri, 31 Jan 1997 20:02:21 -0800 (PST)
Received: from sendero.i-connect.net ([206.190.144.100]) by freefall.freebsd.org (8.8.5/8.8.5) with ESMTP id UAA02505; Fri, 31 Jan 1997 20:02:14 -0800 (PST)
Received: (from shimon@localhost) by sendero.i-connect.net (8.8.5/8.8.4) id VAA06103; Fri, 31 Jan 1997 21:01:47 -0800 (PST)
Message-ID:
X-Mailer: XFMail 1.1-alpha [p0] on FreeBSD
Content-Type: text/plain; charset=iso-8859-8
Content-Transfer-Encoding: 8bit
MIME-Version: 1.0
Date: Fri, 31 Jan 1997 20:28:36 -0800 (PST)
Organization: iConnect Corp.
From: Simon Shapiro
To: freebsd-questions@freebsd.org, freebsd-smp@freebsd.org
Subject: CVS TAGS, whaere are you?
Sender: owner-smp@freebsd.org
X-Loop: FreeBSD.org
Precedence: bulk

What are the proper tags, for CVS, to checkout the 3.0-SNAP source tree,
the 2.2-BETA tree, and the SMP tree. I already know that the -current tree
is ``.''.
Thanx, Simon From owner-freebsd-smp Fri Jan 31 20:23:48 1997 Return-Path: Received: (from root@localhost) by freefall.freebsd.org (8.8.5/8.8.5) id UAA03209 for smp-outgoing; Fri, 31 Jan 1997 20:23:48 -0800 (PST) Received: from po1.glue.umd.edu (root@po1.glue.umd.edu [129.2.128.44]) by freefall.freebsd.org (8.8.5/8.8.5) with ESMTP id UAA03189; Fri, 31 Jan 1997 20:23:41 -0800 (PST) Received: from skipper.eng.umd.edu (skipper.eng.umd.edu [129.2.103.24]) by po1.glue.umd.edu (8.8.5/8.7.3) with ESMTP id XAA26731; Fri, 31 Jan 1997 23:23:38 -0500 (EST) Received: from localhost (chuckr@localhost) by skipper.eng.umd.edu (8.8.5/8.7.3) with SMTP id XAA32132; Fri, 31 Jan 1997 23:23:37 -0500 (EST) X-Authentication-Warning: skipper.eng.umd.edu: chuckr owned process doing -bs Date: Fri, 31 Jan 1997 23:23:37 -0500 (EST) From: Chuck Robey X-Sender: chuckr@skipper.eng.umd.edu To: Simon Shapiro cc: freebsd-questions@freebsd.org, freebsd-smp@freebsd.org Subject: Re: CVS TAGS, whaere are you? In-Reply-To: Message-ID: MIME-Version: 1.0 Content-Type: TEXT/PLAIN; charset=US-ASCII Sender: owner-smp@freebsd.org X-Loop: FreeBSD.org Precedence: bulk On Fri, 31 Jan 1997, Simon Shapiro wrote: > What are the proper tags, for CVS, to checkout the 3.0-SNAP source tree, > the 2.2-BETA tree, and the SMP tree. I already know that the -current tree > is ``.''. SMP is separate from the main cvs tree, so no tag will get it. Someone correct me if I'm wrong here, but doesn't the CVSROOT/val-tags file list the available tags? Here's the file: RELENG_2_1_0 y HEAD y JULIAN_HACK y SCSI y ALLMAN y RELENG_2_1_5_RELEASE y scsi y CSRG y RELENG_2_1_0_RELEASE y v8_8_2 y RELENG_2_2_BP y RELENG_2_2 y RELENG_2_1_6_RELEASE y If I'm right about this file, the tags seem to be fairly obvious. ----------------------------+----------------------------------------------- Chuck Robey | Interests include any kind of voice or data chuckr@eng.umd.edu | communications topic, C programming, and Unix. 
9120 Edmonston Ct #302 | Greenbelt, MD 20770 | I run Journey2 and picnic, both FreeBSD (301) 220-2114 | version 3.0 current -- and great FUN! ----------------------------+----------------------------------------------- From owner-freebsd-smp Fri Jan 31 23:21:53 1997 Return-Path: Received: (from root@localhost) by freefall.freebsd.org (8.8.5/8.8.5) id XAA08748 for smp-outgoing; Fri, 31 Jan 1997 23:21:53 -0800 (PST) Received: from godzilla.zeta.org.au (godzilla.zeta.org.au [203.2.228.19]) by freefall.freebsd.org (8.8.5/8.8.5) with ESMTP id XAA08743; Fri, 31 Jan 1997 23:21:51 -0800 (PST) Received: (from bde@localhost) by godzilla.zeta.org.au (8.8.3/8.6.9) id SAA14427; Sat, 1 Feb 1997 18:19:30 +1100 Date: Sat, 1 Feb 1997 18:19:30 +1100 From: Bruce Evans Message-Id: <199702010719.SAA14427@godzilla.zeta.org.au> To: freebsd-questions@freebsd.org, freebsd-smp@freebsd.org, Shimon@i-Connect.Net Subject: Re: CVS TAGS, whaere are you? Sender: owner-smp@freebsd.org X-Loop: FreeBSD.org Precedence: bulk >What are the proper tags, for CVS, to checkout the 3.0-SNAP source tree, >the 2.2-BETA tree, and the SMP tree. I already know that the -current tree >is ``.''. There aren't any for SNAPs or 2.2-BETA :-(. Tags are too expensive to apply to SNAPs, but should be applied to BETAs. Tag RELENG_2_2 gives the head of the 2.2 branch. You probably want that anyway unless you are attempting to figure out which bugs are in an old version. 
Bruce From owner-freebsd-smp Sat Feb 1 08:58:28 1997 Return-Path: Received: (from root@localhost) by freefall.freebsd.org (8.8.5/8.8.5) id IAA29620 for smp-outgoing; Sat, 1 Feb 1997 08:58:28 -0800 (PST) Received: from smerdon.livonia.mi.us (root@pm145-23.dialip.mich.net [198.110.144.93]) by freefall.freebsd.org (8.8.5/8.8.5) with ESMTP id IAA29611 for ; Sat, 1 Feb 1997 08:58:16 -0800 (PST) Received: from p133 (e0.i386.smerdon.livonia.mi.us [199.33.147.37]) by smerdon.livonia.mi.us (8.7.5/8.6.9) with SMTP id LAA06030; Sat, 1 Feb 1997 11:57:51 -0500 (EST) Message-Id: <3.0.32.19970201115741.00ad4210@smerdon.livonia.mi.us> X-Sender: jds@smerdon.livonia.mi.us X-Mailer: Windows Eudora Pro Version 3.0 (32) Date: Sat, 01 Feb 1997 11:57:44 -0500 To: Steve Passe From: "John D. Smerdon" Subject: Re: Tyan Tomcat II SMP video problems Cc: smp@freebsd.org Mime-Version: 1.0 Content-Type: text/plain; charset="us-ascii" Sender: owner-smp@freebsd.org X-Loop: FreeBSD.org Precedence: bulk >> I have a Tyan Tomcat II with 2/P133's and a Matrox Millennium >> hanging during init. The system is still running, so you can >> telnet in. Shutdown never completes. I tried rebuilding kernels several times with no luck. I then compiled a kernel without APIC_IO and the system booted without any problems. Entering `sysctl -w smp_active=2` worked, but entering `ps aux` causes panic: <...> current process = 5 (cpuidle0) trapnumber = 29 panic (cpu#0) Unknown/Reserved Trap I also removed MATH_EMULATE and increased NINTR to 24, per other recent messages. Searching through old SMP mail archives, I saw a message from Hidetoshi Shimokawa (Sep 29, 1996) where he was having problems with another Tyan board where the boot CPU was not #0. He had patches to some initialization code and termination code that made sure the correct CPU is doing the init and termination. This is not in the current sources. Any chance this is related? mptable with and without APIC_IO is below. 
mptable with APIC_IO: >=========================================================================== ==== > >MPTable, version 2.0.6 > > looking for EBDA pointer @ 0x040e, NOT found > searching CMOS 'top of mem' @ 0x0009fc00 (639K) > searching BIOS @ 0x000f0000 > > MP FPS found in BIOS @ physical addr: 0x000f0c80 > >--------------------------------------------------------------------------- ---- > >MP Floating Pointer Structure: > > location: BIOS > physical address: 0x000f0c80 > signature: '_MP_' > length: 16 bytes > version: 1.1 > checksum: 0xf4 > mode: Virtual Wire > >--------------------------------------------------------------------------- ---- > >MP Config Table Header: > > physical address: 0x000f0c94 > signature: 'PCMP' > base table length: 292 > version: 1.1 > checksum: 0xa5 > OEM ID: 'OEM00000' > Product ID: 'PROD00000000' > OEM table pointer: 0x00000000 > OEM table size: 0 > entry count: 28 > local APIC address: 0xfee00000 > extended table length: 0 > extended table checksum: 0 > >--------------------------------------------------------------------------- ---- > >MP Config Base Table Entries: > >-- >Processors: APIC ID Version State Family Model Step Flags > 0 0x11 BSP, usable 5 2 1 0x07bf > 1 0x11 AP, usable 5 2 1 0x07bf >-- >Bus: Bus ID Type > 0 ISA > 1 PCI >-- >I/O APICs: APIC ID Version State Address > 2 0x11 usable 0xfec00000 >-- >I/O Ints: Type Polarity Trigger Bus ID IRQ APIC ID INT# > ExtINT conforms conforms 0 0 2 0 > INT conforms conforms 0 1 2 1 > INT conforms conforms 0 0 2 2 > INT conforms conforms 0 3 2 3 > INT conforms conforms 0 4 2 4 > INT conforms conforms 0 5 2 5 > INT conforms conforms 0 6 2 6 > INT conforms conforms 0 7 2 7 > INT conforms conforms 0 8 2 8 > INT conforms conforms 0 9 2 9 > INT conforms conforms 0 10 2 10 > INT conforms conforms 0 11 2 11 > INT conforms conforms 0 12 2 12 > INT conforms conforms 0 13 2 13 > INT conforms conforms 0 14 2 14 > INT conforms conforms 0 15 2 15 > INT active-lo level 1 20:A 2 16 > INT active-lo 
level 1 19:A 2 17 > INT active-lo level 1 18:A 2 18 > INT active-lo level 1 17:A 2 19 > SMI conforms conforms 0 0 2 23 >-- >Local Ints: Type Polarity Trigger Bus ID IRQ APIC ID INT# > ExtINT active-hi edge 0 0 255 0 > NMI active-hi edge 0 0 255 1 > >--------------------------------------------------------------------------- ---- > ># SMP kernel config file options: > >options SMP # Symmetric MultiProcessor Kernel >options APIC_IO # Symmetric (APIC) I/O >options NCPU=2 # number of CPUs >options NBUS=2 # number of busses >options NAPIC=1 # number of IO APICs >options NINTR=24 # number of INTs >options SMP_INVLTLB # >#options SMP_PRIVPAGES # BROKEN, DO NOT use! >#options SMP_AUTOSTART # BROKEN, DO NOT use! >#options SERIAL_DEBUG # com port debug output > >--------------------------------------------------------------------------- ---- > >dmesg output: > >Copyright (c) 1992-1996 FreeBSD Inc. >Copyright (c) 1982, 1986, 1989, 1991, 1993 > The Regents of the University of California. All rights reserved. > >FreeBSD 3.0-SMP #0: Sat Feb 1 10:08:27 EST 1997 > jds@p133.smerdon.livonia.mi.us:/usr/src/sys/compile/SMERDONSMPAPIC >FreeBSD/SMP: Multiprocessor motherboard > cpu0 (BSP): apic id: 0, version: 0x00030010 > cpu1 (AP): apic id: 1, version: 0x00030010 > io0 (APIC): apic id: 2, version: 0x00170011 >Calibrating clock(s) relative to mc146818A clock ... i8254 clock: 1193121 Hz >CPU: Pentium (586-class CPU) > Origin = "GenuineIntel" Id = 0x52c Stepping=12 > Features=0x3bf >real memory = 67108864 (65536K bytes) >avail memory = 63594496 (62104K bytes) >Probing for devices on PCI bus 0: >chip0 rev 2 on pci0:0 >chip1 rev 1 on pci0:7:0 >chip2 rev 0 on pci0:7:1 >vga0 rev 1 int a irq 19 on pci0:17 >Freeing (NOT implimented) irq 10 for ISA cards. >ahc0 rev 0 int a irq 17 on pci0:19 >Freeing (NOT implimented) irq 11 for ISA cards. 
>ahc0: aic7880 Wide Channel, SCSI Id=7, 16 SCBs >ahc0 waiting for scsi devices to settle >(ahc0:0:0): "Quantum XP34300W L912" type 0 fixed SCSI 2 >sd0(ahc0:0:0): Direct-Access 4101MB (8399520 512 byte sectors) >ahc0:A:5: refuses WIDE negotiation. Using 8bit transfers >(ahc0:5:0): "TOSHIBA CD-ROM XM-3701TA 0236" type 5 removable SCSI 2 >cd0(ahc0:5:0): CD-ROM can't get the size >Probing for devices on the ISA bus: >sc0 at 0x60-0x6f irq 1 on motherboard >sc0: VGA color <16 virtual consoles, flags=0x0> >sio0 at 0x3f8-0x3ff irq 4 on isa >sio0: type 16550A >sio1 at 0x2f8-0x2ff irq 3 on isa >sio1: type 16550A >sio2 at 0x3e8-0x3ef irq 9 on isa >sio2: type 16550A >sio3: disabled, not probed. >lpt0 at 0x378-0x37f irq 7 on isa >lpt0: Interrupt-driven port >lp0: TCP/IP capable interface >fdc0 at 0x3f0-0x3f7 irq 6 drq 2 on isa >fdc0: NEC 72065B >fd0: 1.44MB 3.5in >uha0 not found at 0x330 >aha0 not found at 0x330 >aic0 not found at 0x340 >scd0 not found at 0x230 >1 3C5x9 board(s) on ISA found at 0x300 >ep0 at 0x300-0x30f irq 15 on isa >ep0: aui/utp/bnc[*BNC*] address 00:a0:24:be:b8:c0 >npx0 on motherboard >npx0: INT 16 interface >apm0: disabled, not probed. >joy0 at 0x201 on isa >joy0: joystick >sb0 at 0x220 irq 5 drq 1 on isa >sb0: >sbxvi0 at 0x0 drq 5 on isa >sbxvi0: >sbmidi0 at 0x330 on isa > >changing root device to sd0a >Enabled INTs: 1, 2, 3, 4, 5, 6, 7, 8, 9, 15, 17, imen: 0x00fd7c01 >SMP: All idle procs online. 
> >=========================================================================== ==== > > mptable without APIC_IO: >=========================================================================== ==== > >MPTable, version 2.0.6 > > looking for EBDA pointer @ 0x040e, NOT found > searching CMOS 'top of mem' @ 0x0009fc00 (639K) > searching BIOS @ 0x000f0000 > > MP FPS found in BIOS @ physical addr: 0x000f0c80 > >--------------------------------------------------------------------------- ---- > >MP Floating Pointer Structure: > > location: BIOS > physical address: 0x000f0c80 > signature: '_MP_' > length: 16 bytes > version: 1.1 > checksum: 0xf4 > mode: Virtual Wire > >--------------------------------------------------------------------------- ---- > >MP Config Table Header: > > physical address: 0x000f0c94 > signature: 'PCMP' > base table length: 292 > version: 1.1 > checksum: 0xa5 > OEM ID: 'OEM00000' > Product ID: 'PROD00000000' > OEM table pointer: 0x00000000 > OEM table size: 0 > entry count: 28 > local APIC address: 0xfee00000 > extended table length: 0 > extended table checksum: 0 > >--------------------------------------------------------------------------- ---- > >MP Config Base Table Entries: > >-- >Processors: APIC ID Version State Family Model Step Flags > 0 0x11 BSP, usable 5 2 1 0x07bf > 1 0x11 AP, usable 5 2 1 0x07bf >-- >Bus: Bus ID Type > 0 ISA > 1 PCI >-- >I/O APICs: APIC ID Version State Address > 2 0x11 usable 0xfec00000 >-- >I/O Ints: Type Polarity Trigger Bus ID IRQ APIC ID INT# > ExtINT conforms conforms 0 0 2 0 > INT conforms conforms 0 1 2 1 > INT conforms conforms 0 0 2 2 > INT conforms conforms 0 3 2 3 > INT conforms conforms 0 4 2 4 > INT conforms conforms 0 5 2 5 > INT conforms conforms 0 6 2 6 > INT conforms conforms 0 7 2 7 > INT conforms conforms 0 8 2 8 > INT conforms conforms 0 9 2 9 > INT conforms conforms 0 10 2 10 > INT conforms conforms 0 11 2 11 > INT conforms conforms 0 12 2 12 > INT conforms conforms 0 13 2 13 > INT conforms conforms 0 
14 2 14 > INT conforms conforms 0 15 2 15 > INT active-lo level 1 20:A 2 16 > INT active-lo level 1 19:A 2 17 > INT active-lo level 1 18:A 2 18 > INT active-lo level 1 17:A 2 19 > SMI conforms conforms 0 0 2 23 >-- >Local Ints: Type Polarity Trigger Bus ID IRQ APIC ID INT# > ExtINT active-hi edge 0 0 255 0 > NMI active-hi edge 0 0 255 1 > >--------------------------------------------------------------------------- ---- > ># SMP kernel config file options: > >options SMP # Symmetric MultiProcessor Kernel >options APIC_IO # Symmetric (APIC) I/O >options NCPU=2 # number of CPUs >options NBUS=2 # number of busses >options NAPIC=1 # number of IO APICs >options NINTR=24 # number of INTs >options SMP_INVLTLB # >#options SMP_PRIVPAGES # BROKEN, DO NOT use! >#options SMP_AUTOSTART # BROKEN, DO NOT use! >#options SERIAL_DEBUG # com port debug output > >--------------------------------------------------------------------------- ---- > >dmesg output: > >Copyright (c) 1992-1996 FreeBSD Inc. >Copyright (c) 1982, 1986, 1989, 1991, 1993 > The Regents of the University of California. All rights reserved. > >FreeBSD 3.0-SMP #0: Sat Feb 1 11:29:34 EST 1997 > root@p133.smerdon.livonia.mi.us:/usr/src/sys/compile/SMERDONSMP >FreeBSD/SMP: Multiprocessor motherboard > cpu0 (BSP): apic id: 0, version: 0x00030010 > cpu1 (AP): apic id: 1, version: 0x00030010 > Warning: APIC I/O disabled >Calibrating clock(s) relative to mc146818A clock ... 
i8254 clock: 1193122 Hz >CPU: Pentium (586-class CPU) > Origin = "GenuineIntel" Id = 0x52c Stepping=12 > Features=0x3bf >real memory = 67108864 (65536K bytes) >avail memory = 63602688 (62112K bytes) >Probing for devices on PCI bus 0: >chip0 rev 2 on pci0:0 >chip1 rev 1 on pci0:7:0 >chip2 rev 0 on pci0:7:1 >vga0 rev 1 int a irq 10 on pci0:17 >ahc0 rev 0 int a irq 11 on pci0:19 >ahc0: aic7880 Wide Channel, SCSI Id=7, 16 SCBs >ahc0 waiting for scsi devices to settle >(ahc0:0:0): "Quantum XP34300W L912" type 0 fixed SCSI 2 >sd0(ahc0:0:0): Direct-Access 4101MB (8399520 512 byte sectors) >ahc0:A:5: refuses WIDE negotiation. Using 8bit transfers >(ahc0:5:0): "TOSHIBA CD-ROM XM-3701TA 0236" type 5 removable SCSI 2 >cd0(ahc0:5:0): CD-ROM can't get the size >Probing for devices on the ISA bus: >sc0 at 0x60-0x6f irq 1 on motherboard >sc0: VGA color <16 virtual consoles, flags=0x0> >sio0 at 0x3f8-0x3ff irq 4 on isa >sio0: type 16550A >sio1 at 0x2f8-0x2ff irq 3 on isa >sio1: type 16550A >sio2 at 0x3e8-0x3ef irq 9 on isa >sio2: type 16550A >sio3: disabled, not probed. >lpt0 at 0x378-0x37f irq 7 on isa >lpt0: Interrupt-driven port >lp0: TCP/IP capable interface >fdc0 at 0x3f0-0x3f7 irq 6 drq 2 on isa >fdc0: NEC 72065B >fd0: 1.44MB 3.5in >uha0 not found at 0x330 >aha0 not found at 0x330 >aic0 not found at 0x340 >scd0 not found at 0x230 >1 3C5x9 board(s) on ISA found at 0x300 >ep0 at 0x300-0x30f irq 15 on isa >ep0: aui/utp/bnc[*BNC*] address 00:a0:24:be:b8:c0 >npx0 on motherboard >npx0: INT 16 interface >apm0: disabled, not probed. >joy0 at 0x201 on isa >joy0: joystick >sb0 at 0x220 irq 5 drq 1 on isa >sb0: >sbxvi0 at 0x0 drq 5 on isa >sbxvi0: >sbmidi0 at 0x330 on isa > >changing root device to sd0a >SMP: All idle procs online. > >=========================================================================== ==== > > > -- John D. Smerdon; Livonia, Michigan, USA; Contents are my opinion. 
Home: jds@smerdon.livonia.mi.us From owner-freebsd-smp Sat Feb 1 09:55:26 1997 Return-Path: Received: (from root@localhost) by freefall.freebsd.org (8.8.5/8.8.5) id JAA02027 for smp-outgoing; Sat, 1 Feb 1997 09:55:26 -0800 (PST) Received: from clem.systemsix.com (clem.systemsix.com [198.99.86.131]) by freefall.freebsd.org (8.8.5/8.8.5) with SMTP id JAA02022 for ; Sat, 1 Feb 1997 09:55:22 -0800 (PST) Received: from localhost (localhost [127.0.0.1]) by clem.systemsix.com (8.6.12/8.6.12) with SMTP id KAA02973; Sat, 1 Feb 1997 10:52:13 -0700 Message-Id: <199702011752.KAA02973@clem.systemsix.com> X-Authentication-Warning: clem.systemsix.com: Host localhost didn't use HELO protocol X-Mailer: exmh version 1.6.5 12/11/95 From: Steve Passe To: "John D. Smerdon" cc: smp@freebsd.org Subject: Re: Tyan Tomcat II SMP video problems In-reply-to: Your message of "Sat, 01 Feb 1997 11:57:44 EST." <3.0.32.19970201115741.00ad4210@smerdon.livonia.mi.us> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Date: Sat, 01 Feb 1997 10:52:13 -0700 Sender: owner-smp@freebsd.org X-Loop: FreeBSD.org Precedence: bulk Hi, >I tried rebuilding kernels several times with no luck. I then compiled a >kernel without APIC_IO and the system booted without any problems. did you also have: options SMP_INVLTLB set when using APIC_IO? this is MANDATORY (but ONLY with APIC_IO). --- >Entering `sysctl -w smp_active=2` worked, but entering `ps aux` causes panic: > ><...> >current process = 5 (cpuidle0) >trapnumber = 29 >panic (cpu#0) Unknown/Reserved Trap this smells like hardware to me. I believe this board is the one that had a problem with its cache module. The solution (if your board is indeed affected) is to replace the cache module with a "good" one. whether that means "any new one" or a specific model made by tyan to fix the problem I have no idea. You will have to turn to the tyan mail list (or tyan) for that answer. 
you might try running with the external cache disabled in the BIOS if thats possible (just to test the theory, wouldn't be an acceptable long term solution). All this is just a theory, I could easily be barking up the wrong tree... --- >Searching through old SMP mail archives, I saw a message from Hidetoshi >Shimokawa (Sep 29, 1996) where he was having problems with another Tyan >board where the boot CPU was not #0. He had patches to some initialization >code and termination code that made sure the correct CPU is doing the init >and termination. This is not in the current sources. Any chance this is >related? no, we are past those issues in the current code. there is a table mapping physical CPU numbers to virtual numbers, insuring that the right one is selected. --- >mptable with and without APIC_IO is below. the only thing that jumps out is having ep0 on IRQ15, try to keep everything off of 14 & 15. -- Steve Passe | powered by smp@csn.net | FreeBSD From owner-freebsd-smp Sat Feb 1 13:02:22 1997 Return-Path: Received: (from root@localhost) by freefall.freebsd.org (8.8.5/8.8.5) id NAA09409 for smp-outgoing; Sat, 1 Feb 1997 13:02:22 -0800 (PST) Received: from pdx1.world.net (pdx1.world.net [192.243.32.18]) by freefall.freebsd.org (8.8.5/8.8.5) with ESMTP id NAA09397 for ; Sat, 1 Feb 1997 13:01:58 -0800 (PST) Received: from suburbia.net (suburbia.net [203.4.184.1]) by pdx1.world.net (8.7.5/8.7.3) with SMTP id NAA04921 for ; Sat, 1 Feb 1997 13:02:43 -0800 (PST) Received: (qmail 10804 invoked by uid 110); 1 Feb 1997 21:00:31 -0000 MBOX-Line: From owner-netdev@roxanne.nuclecu.unam.mx Sat Feb 01 20:27:14 1997 remote from suburbia.net Delivered-To: proff@suburbia.net Received: (qmail 9045 invoked from network); 1 Feb 1997 20:26:51 -0000 Received: from roxanne.nuclecu.unam.mx (132.248.29.2) by suburbia.net with SMTP; 1 Feb 1997 20:26:51 -0000 Received: (from root@localhost) by roxanne.nuclecu.unam.mx (8.6.12/8.6.11) id OAA22043 for netdev-outgoing; Sat, 1 Feb 1997 14:00:30 
-0600
Received: from prosun.first.gmd.de (prosun.first.gmd.de [194.95.168.2]) by roxanne.nuclecu.unam.mx (8.6.12/8.6.11) with SMTP id FAA19920 for ; Sat, 1 Feb 1997 05:20:16 -0600
Received: by prosun.first.gmd.de (4.1/SMI-4.1) id AA01923; Sat, 1 Feb 97 12:17:41 +0100
From: leo@prosun.first.gmd.de (Matthias L. Jugel)
Message-Id: <9702011117.AA01923@prosun.first.gmd.de>
Subject: lockfree datastructures
To: netdev@roxanne.nuclecu.unam.mx
Date: Sat, 1 Feb 1997 12:17:40 +0100 (MET)
X-Mailer: ELM [version 2.4 PL23]
Mime-Version: 1.0
Content-Type: text/plain; charset=US-ASCII
Content-Transfer-Encoding: 7bit
Sender: owner-smp@FreeBSD.ORG
X-Loop: FreeBSD.org
Precedence: bulk

Hi folks,

a friend of mine forwarded me your interest in lock free data structures.
Maybe the following papers will help you:

* General Aspects

  - "Axioms for Concurrent Objects", CMU-CS-86-154
  - "Reasoning About Concurrent Objects", CMU-CS-87-176
  - "Linearizability: A Correctness Condition for Concurrent Objects",
    CMU-CS-88-120
    Maurice P. Herlihy, Jeannette M. Wing
    Dept. of Computer Science, Carnegie Mellon Univ.

* Methods

  - "A Methodology for Implementing Highly Concurrent Data Objects",
    ACM Transactions on Programming Languages and Systems,
    Vol 15, No 5, Nov 1993, Pages 746-770
  - "A Method for Implementing Lock-Free Shared Data Structures",
    Proceedings of the 14th Annual ACM Symposium on Principles of
    Distributed Computing, Ottawa, Ont. Canada, pp 184-193
    G. Barnes, 1995
    also: MPI-I-94-120, Apr 1994, Max-Planck-Institut fuer Informatik
  - "Practical Considerations for nonblocking concurrent objects"
    Proceedings 13th IEEE International Conference on Distributed
    Systems, IEEE Computer Society Press, pp 264-273
    B.N. Bershad, 1993

* Operating Systems

  - "The Synergy Between Non-Blocking Synchronisation and Operating
    System Structure", CA 94305-9040
    Michael Greenwald, David Cheriton
    CS Dept, Stanford University
  - ... There is another paper I have to fetch ...
Ok, I hope the papers give you a clue about the problem. From experience and discussion with our own OS builders I get the impression that for most problems a CAS or CAS2 (Compare-And-Swap) is sufficient compared to the proposed methods explained in the papers above. I'd be very interested in any problems and solutions you have in mind, because I am working on the subject in my PhD studies. If someone could give me a more detailled description of the problems you want to solve using non-blocking techniques. Best wishes, Leo. -- Matthias L. Jugel, leo@first.gmd.de GMD - German National Research Centre for Information Technology FIRST - Research Institute for Computer Architecture and Software Technology From owner-freebsd-smp Sat Feb 1 14:40:36 1997 Return-Path: Received: (from root@localhost) by freefall.freebsd.org (8.8.5/8.8.5) id OAA12772 for smp-outgoing; Sat, 1 Feb 1997 14:40:36 -0800 (PST) Received: from phaeton.artisoft.com (phaeton.Artisoft.COM [198.17.250.211]) by freefall.freebsd.org (8.8.5/8.8.5) with SMTP id OAA12767 for ; Sat, 1 Feb 1997 14:40:30 -0800 (PST) Received: (from terry@localhost) by phaeton.artisoft.com (8.6.11/8.6.9) id PAA06784; Sat, 1 Feb 1997 15:39:07 -0700 From: Terry Lambert Message-Id: <199702012239.PAA06784@phaeton.artisoft.com> Subject: Re: lockfree datastructures To: leo@prosun.first.gmd.de (Matthias L. Jugel) Date: Sat, 1 Feb 1997 15:39:07 -0700 (MST) Cc: netdev@roxanne.nuclecu.unam.mx, smp@freebsd.org In-Reply-To: <9702011117.AA01923@prosun.first.gmd.de> from "Matthias L. Jugel" at Feb 1, 97 12:17:40 pm X-Mailer: ELM [version 2.4 PL24] MIME-Version: 1.0 Content-Type: text/plain; charset=US-ASCII Content-Transfer-Encoding: 7bit Sender: owner-smp@freebsd.org X-Loop: FreeBSD.org Precedence: bulk > Hi folks, > > a friend of mine forwarded me your interest in lock free data structures. [ ... ] > Ok, I hope the papers give you a clue about the problem. 
From experience
> and discussion with our own OS builders I get the impression that for
> most problems a CAS or CAS2 (Compare-And-Swap) is sufficient compared to
> the proposed methods explained in the papers above.
>
> I'd be very interested in any problems and solutions you have in mind,
> because I am working on the subject in my PhD studies. If someone could
> give me a more detailed description of the problems you want to solve
> using non-blocking techniques.

Realize that I'm probably only speaking for the people who I have discussed these issues with, and who happened to agree with me. 8-).

We are primarily interested in maximizing work concurrency in a symmetric multiprocessing environment.

One big issue that is not addressed by lock free data structures is the continuing need for interprocessor synchronization of these structures in the SMP environment. Effectively, any CAS or CAS2 system implementing interprocessor resource locking must block during IPI-based cache data invalidation to propagate the act of synchronization.

We haven't discussed this thoroughly yet, but it seems that implementing a lock tree hierarchy using intention modes would allow us to compute transitive closure over the graph swiftly, and only need to implement IPI-based cache data invalidation when accessing system wide resources.

Combining this technique with Dynix-style per-CPU resource pools which are filled/drained to global resource pools seems to yield the minimum amount of inter-CPU synchronization issues.

A difficulty in doing per-CPU pools with a SLAB or modified zone allocator is that of process migration. If a process allocates a resource on one processor, and subsequently migrates to another, then frees the resource, we have the cache invalidation issue all over again before the original CPU realizes the resource has been freed for reuse.
Further, we must be careful to avoid issues of cache line overlap, so that a modification of one object on one processor does not write non-current data into the adjacent object because the object adjacency boundary is interior to a single cache line that will be written back as a unit. This is made even more difficult because we want the resulting design to be able to be "tuned" for varying processor architectures -- not just Intel -- and we don't have direct knowledge of the cache interactions for object adjacency on all of the potential architectures we will want to run on.

Tentatively, we have discussed per-pool notification queues that can be written by the other processor to queue a "free" event at the time the object is freed. The processing of these events is then delayed until the number of free items in the pool hits the low watermark that would cause it to go back to the global pool to obtain more. At that time, we must undertake an IPI synchronization in any event, but then we could process the "free" event queues for us from each other processor for each resource. If we get enough resources back to satisfy the refill quota for the low watermark event, then we can return without obtaining objects from the global pool. Otherwise, we may have to obtain from the global pool anyway.

One bad consequence of this type of approach is that pools will tend to "run hot"... that is, they will tend to have a higher apparent usage than they actually have, and depending on the watermarking separation, the refill/drain amounts, and the average object persistence once allocated, we may in fact have up to 50% of the number of items between the post-refill amount and the midway point for the high/low watermark free but considered to be in use.

Any suggestions would, of course, be appreciated...

Regards,
Terry Lambert
terry@lambert.org
---
Any opinions in this posting are my own and not those of my present or previous employers.
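
[ The per-pool "free" event queue described in the message above could be
sketched roughly as follows. This is only an illustration under stated
assumptions, not code from the SMP tree: it uses C11 atomics on
cache-coherent hardware rather than the IPI-based synchronization discussed
in the thread, and every name in it (pool_obj, cpu_pool, pool_init,
pool_free_remote, pool_drain_remote, pool_alloc) is hypothetical. A
non-owning CPU queues a freed object with a compare-and-swap loop; the
owning CPU reclaims the queue only when its free count reaches the low
watermark. ]

```c
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical link header embedded at the start of each pooled object. */
struct pool_obj {
    struct pool_obj *next;
};

/* One pool per CPU. Only the owning CPU touches free_list and nfree;
 * any CPU may push onto remote_free. */
struct cpu_pool {
    struct pool_obj *free_list;             /* owner-only free list */
    size_t nfree;                           /* owner-only free count */
    size_t low_water;                       /* drain/refill threshold */
    _Atomic(struct pool_obj *) remote_free; /* queued "free" events */
};

static void pool_init(struct cpu_pool *p, size_t low_water)
{
    p->free_list = NULL;
    p->nfree = 0;
    p->low_water = low_water;
    atomic_init(&p->remote_free, (struct pool_obj *)NULL);
}

/* Non-owner CPU: queue a "free" event with a compare-and-swap loop. */
static void pool_free_remote(struct cpu_pool *p, struct pool_obj *o)
{
    struct pool_obj *head =
        atomic_load_explicit(&p->remote_free, memory_order_relaxed);
    do {
        o->next = head; /* head is refreshed whenever the CAS fails */
    } while (!atomic_compare_exchange_weak_explicit(
                 &p->remote_free, &head, o,
                 memory_order_release, memory_order_relaxed));
}

/* Owner CPU: detach the whole queue in one atomic exchange and move the
 * objects to the private free list. Returns the number reclaimed. */
static size_t pool_drain_remote(struct cpu_pool *p)
{
    struct pool_obj *o = atomic_exchange_explicit(
        &p->remote_free, (struct pool_obj *)NULL, memory_order_acquire);
    size_t n = 0;

    while (o != NULL) {
        struct pool_obj *next = o->next;
        o->next = p->free_list;
        p->free_list = o;
        p->nfree++;
        n++;
        o = next;
    }
    return n;
}

/* Owner CPU: allocate one object. "Free" events are processed only at
 * the low watermark. NULL means the caller must go to the global pool. */
static struct pool_obj *pool_alloc(struct cpu_pool *p)
{
    if (p->nfree <= p->low_water)
        pool_drain_remote(p);

    struct pool_obj *o = p->free_list;
    if (o == NULL)
        return NULL;
    p->free_list = o->next;
    p->nfree--;
    return o;
}
```

[ Because the owner takes the entire queue with a single exchange instead
of popping one element at a time, the sketch sidesteps the ABA problem that
a CAS-based pop would have to handle. It does not address the cache line
overlap or "running hot" concerns raised in the message. ]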
From owner-freebsd-smp Sat Feb 1 20:28:23 1997 Return-Path: Received: (from root@localhost) by freefall.freebsd.org (8.8.5/8.8.5) id UAA29509 for smp-outgoing; Sat, 1 Feb 1997 20:28:23 -0800 (PST) Received: from smerdon.livonia.mi.us (root@pm233-27.dialip.mich.net [198.110.144.128]) by freefall.freebsd.org (8.8.5/8.8.5) with ESMTP id UAA29504 for ; Sat, 1 Feb 1997 20:28:15 -0800 (PST) Received: from p133 (e0.i386.smerdon.livonia.mi.us [199.33.147.37]) by smerdon.livonia.mi.us (8.7.5/8.6.9) with SMTP id XAA12353; Sat, 1 Feb 1997 23:28:00 -0500 (EST) Message-Id: <3.0.32.19970201232751.00e31450@smerdon.livonia.mi.us> X-Sender: jds@smerdon.livonia.mi.us X-Mailer: Windows Eudora Pro Version 3.0 (32) Date: Sat, 01 Feb 1997 23:27:53 -0500 To: Steve Passe From: "John D. Smerdon" Subject: Re: Tyan Tomcat II SMP video problems Cc: smp@freebsd.org Mime-Version: 1.0 Content-Type: text/plain; charset="us-ascii" Sender: owner-smp@freebsd.org X-Loop: FreeBSD.org Precedence: bulk At 10:52 AM 2/1/97 -0700, Steve Passe wrote: >options SMP_INVLTLB > >set when using APIC_IO? this is MANDATORY (but ONLY with APIC_IO). I had it set with and without APIC_IO. A just compiled a kernel without APIC_IO and SMP_INVLTLB and it works! I just rebuilt most of the kernel with a `make -j 8` >this smells like hardware to me. I believe this board is the one that had >a problem with its cache module. The motherboard came with the "new and improved" cache module. I have been using Windows NT since October and getting FreeBSD installed and configured over the last couple months and have never had a single crash. So I think the hardware is good. >the only thing that jumps out is having ep0 on IRQ15, try to keep everything >off of 14 & 15. While trying to get a configuration that worked with NT, FreeBSD, and PNP, I stuck the ep0 on IRQ 15. I can try to move it somewhere else and see what happens. Thanks for your help! -- John D. Smerdon; Livonia, Michigan, USA; Contents are my opinion. 
Home: jds@smerdon.livonia.mi.us

From owner-freebsd-smp Sat Feb 1 21:01:28 1997
Return-Path:
Received: (from root@localhost) by freefall.freebsd.org (8.8.5/8.8.5) id VAA00534 for smp-outgoing; Sat, 1 Feb 1997 21:01:28 -0800 (PST)
Received: from clem.systemsix.com (clem.systemsix.com [198.99.86.131]) by freefall.freebsd.org (8.8.5/8.8.5) with SMTP id VAA00527 for ; Sat, 1 Feb 1997 21:01:17 -0800 (PST)
Received: from localhost (localhost [127.0.0.1]) by clem.systemsix.com (8.6.12/8.6.12) with SMTP id VAA05826; Sat, 1 Feb 1997 21:58:21 -0700
Message-Id: <199702020458.VAA05826@clem.systemsix.com>
X-Authentication-Warning: clem.systemsix.com: Host localhost didn't use HELO protocol
X-Mailer: exmh version 1.6.5 12/11/95
From: Steve Passe
To: "John D. Smerdon"
cc: smp@freebsd.org
Subject: Re: Tyan Tomcat II SMP video problems
In-reply-to: Your message of "Sat, 01 Feb 1997 23:27:53 EST." <3.0.32.19970201232751.00e31450@smerdon.livonia.mi.us>
Mime-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Date: Sat, 01 Feb 1997 21:58:21 -0700
Sender: owner-smp@freebsd.org
X-Loop: FreeBSD.org
Precedence: bulk

Hi,

> At 10:52 AM 2/1/97 -0700, Steve Passe wrote:
> >options SMP_INVLTLB
> >
> >set when using APIC_IO? this is MANDATORY (but ONLY with APIC_IO).
>
> I had it set with and without APIC_IO.

I seem to remember that setting SMP_INVLTLB *without* APIC_IO causes the one
error you reported previously:

>current process = 5 (cpuidle0)
>trapnumber = 29
>panic (cpu#0) Unknown/Reserved Trap

the Unknown trap is a result of trying to do an IPI via the APICs when they
are not programmed.

bottom line for everyone:

-------------------------------------------------------------------------------
SMP_PROBLEM: dependencies of APIC_IO and SMP_INVLTLB not properly documented.

you MUST either set BOTH APIC_IO and SMP_INVLTLB or set NEITHER.

-- solution: don't do it!

-- proper fix for code: add #ifdefs to code to abort build when improper
   combinations are attempted.
------------------------------------------------------------------------------- --- > A just compiled a kernel without APIC_IO and SMP_INVLTLB and it works! I > just rebuilt most of the kernel with a `make -j 8` so the question is why a kernel with APIC_IO *and* SMP_INVLTLB doesn't... can you verify that this is true, ie have you tried that combination? --- > > >this smells like hardware to me. I believe this board is the one that had > >a problem with its cache module. > > The motherboard came with the "new and improved" cache module. I have been could you document for us how one determines whether they have the "new and improved" module? serial #, model #, ??? --- > >the only thing that jumps out is having ep0 on IRQ15, try to keep everything > >off of 14 & 15. > > While trying to get a configuration that worked with NT, FreeBSD, and PNP, > I stuck the ep0 on IRQ 15. I can try to move it somewhere else and see > what happens. this was a shot in the dark. I seem to remember someone claiming that tyan had a problem with reassigning 14 & 15 as they are "reserved" for the 2 IDE channels. since the IDE circuits are in the same chip as the 8259 ICUs it is possible that there are some limitations, but I have no concrete facts either way. -- Steve Passe | powered by smp@csn.net | FreeBSD -----BEGIN PGP PUBLIC KEY BLOCK----- Version: 2.6.2 mQCNAzHe7tEAAAEEAM274wAEEdP+grIrV6UtBt54FB5ufifFRA5ujzflrvlF8aoE 04it5BsUPFi3jJLfvOQeydbegexspPXL6kUejYt2OeptHuroIVW5+y2M2naTwqtX WVGeBP6s2q/fPPAS+g+sNZCpVBTbuinKa/C4Q6HJ++M9AyzIq5EuvO0a8Rr9AAUR tBlTdGV2ZSBQYXNzZSA8c21wQGNzbi5uZXQ+ =ds99 -----END PGP PUBLIC KEY BLOCK-----
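
[ The "proper fix for code" proposed in the message above (abort the build
when APIC_IO and SMP_INVLTLB are not set together) might look something
like the sketch below. Only the two option names come from the thread; the
placement of the check, the error wording, and the SMP_HAVE_* helper macros
are assumptions made for illustration and are not taken from the SMP
sources. ]

```c
/*
 * Hypothetical consistency check for the SMP kernel options.
 * Assumption: this is compiled at a point where the kernel option
 * definitions (APIC_IO, SMP_INVLTLB) are already visible.
 */
#if defined(APIC_IO) && !defined(SMP_INVLTLB)
#error "APIC_IO requires SMP_INVLTLB: set BOTH options or NEITHER"
#endif

#if defined(SMP_INVLTLB) && !defined(APIC_IO)
#error "SMP_INVLTLB requires APIC_IO: set BOTH options or NEITHER"
#endif

/* Illustrative 0/1 mirrors of the two options (names are assumptions). */
#ifdef APIC_IO
#define SMP_HAVE_APIC_IO 1
#else
#define SMP_HAVE_APIC_IO 0
#endif

#ifdef SMP_INVLTLB
#define SMP_HAVE_INVLTLB 1
#else
#define SMP_HAVE_INVLTLB 0
#endif
```

[ With a check of this kind, a config that enables only one of the two
options would stop at compile time instead of producing a kernel that
panics with "Unknown/Reserved Trap" at run time. ]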