From owner-freebsd-smp@FreeBSD.ORG Sun May 4 13:09:03 2003
Return-Path:
Delivered-To: freebsd-smp@freebsd.org
Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125]) by hub.freebsd.org (Postfix) with ESMTP id 2FD8137B401 for ; Sun, 4 May 2003 13:09:03 -0700 (PDT)
Received: from mail.dannysplace.net (allxs.xs4all.nl [194.109.223.7]) by mx1.FreeBSD.org (Postfix) with ESMTP id B1CA043FE0 for ; Sun, 4 May 2003 13:09:01 -0700 (PDT) (envelope-from fbsd@dannysplace.net)
Received: from [192.168.100.228] (helo=llama) by mail.dannysplace.net with smtp (Exim 4.12) id 19CPn9-0009v5-00 for freebsd-smp@freebsd.org; Sun, 04 May 2003 22:08:59 +0200
Message-ID: <002d01c31279$006270a0$e464a8c0@llama>
From: "Danny Carroll"
To:
Date: Sun, 4 May 2003 22:09:00 +0200
MIME-Version: 1.0
X-Priority: 3
X-MSMail-Priority: Normal
X-Mailer: Microsoft Outlook Express 6.00.2800.1106
X-MimeOLE: Produced By Microsoft MimeOLE V6.00.2800.1106
X-Scanner: exiscan for exim4 (http://duncanthrax.net/exiscan/) *19CPn9-0009v5-00*L.yw3u7OhcE*
Content-Type: text/plain; charset="iso-8859-1"
Content-Transfer-Encoding: quoted-printable
X-Content-Filtered-By: Mailman/MimeDel 2.1.1
Subject: New to SMP, is it Ok to use?
X-BeenThere: freebsd-smp@freebsd.org
X-Mailman-Version: 2.1.1
Precedence: list
Reply-To: Danny Carroll
List-Id: FreeBSD SMP implementation group
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
X-List-Received-Date: Sun, 04 May 2003 20:09:03 -0000

Hiya.

I'm looking at purchasing a Compaq Proliant 6500 which has 4 Pentium Pro
200 processors in it.
It will be a low-load mail/web server for about 30 users (mail)

So basically, before I go out and pay for this thing, I was wondering if
I can rely on the SMP features of FreeBSD for this production
environment.

I use the word production loosely. The truth is it will be the primary
mail server for a private domain. A secondary will be available so I
have a way out if there are problems.
Should I use RELENG_4 or RELENG_5_0? What are the pros and cons of
each (only related to SMP).

Thanks for the help....

-D

p.s. If you want me to bugger off to -questions, I'll humbly scurry
away....

From owner-freebsd-smp@FreeBSD.ORG Tue May 6 11:48:41 2003
Return-Path:
Delivered-To: freebsd-smp@freebsd.org
Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125]) by hub.freebsd.org (Postfix) with ESMTP id 4F0E637B401 for ; Tue, 6 May 2003 11:48:41 -0700 (PDT)
Received: from gromit.codeconcepts.com (rrcs-sw-24-153-140-106.biz.rr.com [24.153.140.106]) by mx1.FreeBSD.org (Postfix) with ESMTP id 9EAA043F75 for ; Tue, 6 May 2003 11:48:40 -0700 (PDT) (envelope-from greg@codeconcepts.com)
Received: from gromit.codeconcepts.com (localhost [127.0.0.1]) h46Imjva068831 for ; Tue, 6 May 2003 13:48:45 -0500 (CDT) (envelope-from greg@gromit.codeconcepts.com)
Message-Id: <200305061848.h46Imjva068831@gromit.codeconcepts.com>
X-Mailer: exmh version 2.6.3 04/04/2003 with nmh-1.0.4
To: freebsd-smp@freebsd.org
Mime-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Date: Tue, 06 May 2003 13:48:45 -0500
From: Greg
Subject: spinlocks and cv_wait()
X-BeenThere: freebsd-smp@freebsd.org
X-Mailman-Version: 2.1.1
Precedence: list
List-Id: FreeBSD SMP implementation group
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
X-List-Received-Date: Tue, 06 May 2003 18:48:41 -0000

Hi, I am working on a driver that has data which is accessed by both
the top and bottom. It is a typical situation in which the top half
initiates I/O and then awaits notification of completion by the
bottom half.

The problem seems to be either that I am misunderstanding something
fundamental in the FreeBSD locking implementation, or perhaps that the
implementation of cv_wait() is a bit too restrictive.

Because the bottom half is a disk I/O interrupt handler, I presume the
mutex must be of the MTX_SPIN variety, and therein is the problem.
The top half acquires the mutex, checks the condition, and then calls
cv_wait() if the condition is not met. Unfortunately, cv_wait() checks
that the mutex is of the sleeping variety and trips an assert because
it isn't (at kern_condvar.c line 240).

As a data point, this code runs correctly on SMP Solaris, AIX, and
FreeBSD-4 (taking into account the different locking APIs, of course).

If anyone out there can shed some light on the problem it would be
greatly appreciated.

Thanks in advance!
Greg

P.S. My system is fairly current. My last cvs up didn't turn up any
changes to the locking code.

FreeBSD magenta.cc.codeconcepts.com 5.0-RELEASE-p7 FreeBSD 5.0-RELEASE-p7 #0: Sat Apr 12 14:04:42 CDT 2003 greg@magenta.cc.codeconcepts.com:/usr/obj/usr/src/sys/MAGENTA i386

--
Every man's work, whether it be literature or music or pictures or
architecture or anything else, is always a portrait of himself
- Samuel Butler (1835-1902)

From owner-freebsd-smp@FreeBSD.ORG Tue May 6 12:32:55 2003
Return-Path:
Delivered-To: freebsd-smp@freebsd.org
Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125]) by hub.freebsd.org (Postfix) with ESMTP id 2AFA737B404 for ; Tue, 6 May 2003 12:32:55 -0700 (PDT)
Received: from mail.speakeasy.net (mail13.speakeasy.net [216.254.0.213]) by mx1.FreeBSD.org (Postfix) with ESMTP id 9553343F3F for ; Tue, 6 May 2003 12:32:52 -0700 (PDT) (envelope-from jhb@FreeBSD.org)
Received: (qmail 25400 invoked from network); 6 May 2003 19:32:58 -0000
Received: from unknown (HELO server.baldwin.cx) ([216.27.160.63]) (envelope-sender )encrypted SMTP for ; 6 May 2003 19:32:58 -0000
Received: from laptop.baldwin.cx ([216.133.140.1]) by server.baldwin.cx (8.12.8/8.12.8) with ESMTP id h46JWnp0001213; Tue, 6 May 2003 15:32:49 -0400 (EDT) (envelope-from jhb@FreeBSD.org)
Message-ID:
X-Mailer: XFMail 1.5.4 on FreeBSD
X-Priority: 3 (Normal)
Content-Type: text/plain; charset=us-ascii
Content-Transfer-Encoding: 8bit
MIME-Version: 1.0
In-Reply-To:
 <200305061848.h46Imjva068831@gromit.codeconcepts.com>
Date: Tue, 06 May 2003 15:32:56 -0400 (EDT)
From: John Baldwin
To: Greg
cc: freebsd-smp@freebsd.org
Subject: RE: spinlocks and cv_wait()
X-BeenThere: freebsd-smp@freebsd.org
X-Mailman-Version: 2.1.1
Precedence: list
List-Id: FreeBSD SMP implementation group
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
X-List-Received-Date: Tue, 06 May 2003 19:32:55 -0000

On 06-May-2003 Greg wrote:
>
> Hi, I am working on a driver that has data which is accessed by both
> the top and bottom. It is a typical situation in which the top half
> initiates I/O and then awaits notification of completion by the
> bottom half.
>
> The problem seems to be either that I am misunderstanding something
> fundamental in the FreeBSD locking implementation, or perhaps that the
> implementation of cv_wait() is a bit too restrictive.
>
> Because the bottom half is a disk I/O interrupt handler, I presume the
> mutex must be of the MTX_SPIN variety, and therein is the problem. The
> top half acquires the mutex, checks the condition, and then calls cv_wait()
> if the condition is not met. Unfortunately, cv_wait() checks that the
> mutex is of the sleeping variety and trips an assert because it isn't
> (at kern_condvar.c line 240).

Your first assumption here is wrong. Interrupt handlers run in a (mostly)
top-half context and are allowed to use normal mutexes. MTX_SPIN mutexes
should be avoided when possible.

> As a data point, this code runs correctly on SMP Solaris, AIX, and
> FreeBSD-4 (taking into account the different locking APIs, of course).
>
> If anyone out there can shed some light on the problem it would be
> greatly appreciated.
>
> Thanks in advance!
> Greg
>

--
John Baldwin <>< http://www.FreeBSD.org/~jhb/
"Power Users Use the Power to Serve!"
- http://www.FreeBSD.org/

From owner-freebsd-smp@FreeBSD.ORG Wed May 7 07:44:13 2003
Return-Path:
Delivered-To: freebsd-smp@freebsd.org
Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125]) by hub.freebsd.org (Postfix) with ESMTP id 0B39C37B401; Wed, 7 May 2003 07:44:13 -0700 (PDT)
Received: from gromit.codeconcepts.com (rrcs-sw-24-153-140-106.biz.rr.com [24.153.140.106]) by mx1.FreeBSD.org (Postfix) with ESMTP id CE02C43FAF; Wed, 7 May 2003 07:44:09 -0700 (PDT) (envelope-from greg@codeconcepts.com)
Received: from gromit.codeconcepts.com (localhost [127.0.0.1]) h47EiCvZ096249; Wed, 7 May 2003 09:44:12 -0500 (CDT) (envelope-from greg@gromit.codeconcepts.com)
Message-Id: <200305071444.h47EiCvZ096249@gromit.codeconcepts.com>
X-Mailer: exmh version 2.6.3 04/04/2003 with nmh-1.0.4
To: John Baldwin
In-Reply-To: Message from John Baldwin
Mime-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Date: Wed, 07 May 2003 09:44:12 -0500
From: Greg
cc: freebsd-smp@FreeBSD.org
Subject: Re: spinlocks and cv_wait()
X-BeenThere: freebsd-smp@freebsd.org
X-Mailman-Version: 2.1.1
Precedence: list
List-Id: FreeBSD SMP implementation group
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
X-List-Received-Date: Wed, 07 May 2003 14:44:13 -0000

> > Because the bottom half is a disk I/O interrupt handler, I presume the
> > mutex must be of the MTX_SPIN variety, and therein is the problem. The
> > top half acquires the mutex, checks the condition, and then calls cv_wait()
> > if the condition is not met. Unfortunately, cv_wait() checks that the
> > mutex is of the sleeping variety and trips an assert because it isn't
> > (at kern_condvar.c line 240).
>
> Your first assumption here is wrong. Interrupt handlers run in a (mostly)
> top-half context and are allowed to use normal mutexes. MTX_SPIN mutexes
> should be avoided when possible.

Thanks for the clarification. The mtx_init man page said as much, but I
wasn't 100% certain.
Of course I had tried MTX_DEF, but then the machine crashed in
seemingly unrelated code, so I couldn't be certain which usage was
correct.

--
The comfort of a knowledge of the rise above the sky above could never
parallel the challenge of an acquisition in the here and now

From owner-freebsd-smp@FreeBSD.ORG Fri May 9 13:46:30 2003
Return-Path:
Delivered-To: freebsd-smp@freebsd.org
Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125]) by hub.freebsd.org (Postfix) with ESMTP id 3E2B137B401; Fri, 9 May 2003 13:46:30 -0700 (PDT)
Received: from dns1.vizion2000.net (dns1.vizion2000.net [64.58.171.82]) by mx1.FreeBSD.org (Postfix) with ESMTP id 7CC3E43F75; Fri, 9 May 2003 13:46:29 -0700 (PDT) (envelope-from vizion@ixpres.com)
Received: from vizion (vizion.vizion2000.net [64.58.171.92]) by dns1.vizion2000.net (8.11.6/8.11.6) with SMTP id h49LKKb33158; Fri, 9 May 2003 14:20:24 -0700 (PDT) (envelope-from vizion@ixpres.com)
Message-ID: <003701c31668$38371650$15b55042@vizion2000.net>
From: "vizion communication"
To: , "FreeBSD Stable"
Date: Fri, 9 May 2003 13:18:53 -0700
MIME-Version: 1.0
Content-Type: text/plain; charset="iso-8859-1"
Content-Transfer-Encoding: 7bit
X-Priority: 3
X-MSMail-Priority: Normal
X-Mailer: Microsoft Outlook Express 6.00.2800.1106
X-MimeOLE: Produced By Microsoft MimeOLE V6.00.2800.1106
Subject: What the???? Is there a Doctor in the house? Help
X-BeenThere: freebsd-smp@freebsd.org
X-Mailman-Version: 2.1.1
Precedence: list
List-Id: FreeBSD SMP implementation group
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
X-List-Received-Date: Fri, 09 May 2003 20:46:30 -0000

Ok do not know what is causing this problem and would
appreciate some help in tracking down the cause.

FreeBSD 4.7

System runs fine for maybe a week and then all networking
stops - without warning. I do not know any way of persuading
it to resume networking without rebooting.

Does anyone have some useful suggestions on how to diagnose
the cause?
David

From owner-freebsd-smp@FreeBSD.ORG Fri May 9 15:30:02 2003
Return-Path:
Delivered-To: freebsd-smp@freebsd.org
Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125]) by hub.freebsd.org (Postfix) with ESMTP id 54A8B37B401; Fri, 9 May 2003 15:30:02 -0700 (PDT)
Received: from lilzcluster.liwest.at (lilzclust01.liwest.at [212.33.55.11]) by mx1.FreeBSD.org (Postfix) with ESMTP id C376343F93; Fri, 9 May 2003 15:29:58 -0700 (PDT) (envelope-from dgw@liwest.at)
Received: from cm58-27.liwest.at by lilzcluster.liwest.at (8.10.2/1.1.2.11/08Jun01-1123AM) id h49MTHP0000998167; Sat, 10 May 2003 00:29:18 +0200 (MEST)
From: Daniela
To: "vizion communication" , , "FreeBSD Stable"
Date: Sat, 10 May 2003 00:30:09 +0000
User-Agent: KMail/1.5.1
References: <003701c31668$38371650$15b55042@vizion2000.net>
In-Reply-To: <003701c31668$38371650$15b55042@vizion2000.net>
MIME-Version: 1.0
Content-Type: text/plain; charset="iso-8859-1"
Content-Transfer-Encoding: 7bit
Content-Disposition: inline
Message-Id: <200305100030.09247.dgw@liwest.at>
Subject: Re: What the???? Is there a Doctor in the house? Help
X-BeenThere: freebsd-smp@freebsd.org
X-Mailman-Version: 2.1.1
Precedence: list
List-Id: FreeBSD SMP implementation group
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
X-List-Received-Date: Fri, 09 May 2003 22:30:02 -0000

On Friday 09 May 2003 20:18, vizion communication wrote:
> Ok do not know what is causing this problem and would
> appreciate some help in tracking down the cause.
>
> FreeBSD 4.7
>
> System runs fine for maybe a week and then all networking
> stops - without warning. I do not know any way of persuading
> it to resume networking without rebooting.
>
> Does anyone have some useful suggestions on how to diagnose
> the cause?

Had this problem some time ago. A process got killed.
Do a ps -ax right after rebooting and also when the network isn't
working again. Compare these two and look for processes that aren't
there any more.
Also look at the console. When I had the problem for the first time, I
got some error messages saying that there isn't enough buffer space or
so. This could be another possible cause.

Daniela

From owner-freebsd-smp@FreeBSD.ORG Fri May 9 22:39:44 2003
Return-Path:
Delivered-To: freebsd-smp@freebsd.org
Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125]) by hub.freebsd.org (Postfix) with ESMTP id 625AC37B408; Fri, 9 May 2003 22:39:44 -0700 (PDT)
Received: from heron.mail.pas.earthlink.net (heron.mail.pas.earthlink.net [207.217.120.189]) by mx1.FreeBSD.org (Postfix) with ESMTP id F2ED443FBF; Fri, 9 May 2003 22:39:41 -0700 (PDT) (envelope-from tlambert2@mindspring.com)
Received: from pool0008.cvx22-bradley.dialup.earthlink.net ([209.179.198.8] helo=mindspring.com) by heron.mail.pas.earthlink.net with asmtp (SSLv3:RC4-MD5:128) (Exim 3.33 #1) id 19EN4l-0001gU-00; Fri, 09 May 2003 22:39:15 -0700
Message-ID: <3EBC9039.3154C20D@mindspring.com>
Date: Fri, 09 May 2003 22:38:01 -0700
From: Terry Lambert
X-Mailer: Mozilla 4.79 [en] (Win98; U)
X-Accept-Language: en
MIME-Version: 1.0
To: vizion communication
References: <003701c31668$38371650$15b55042@vizion2000.net>
Content-Type: text/plain; charset=us-ascii
Content-Transfer-Encoding: 7bit
X-ELNK-Trace: b1a02af9316fbb217a47c185c03b154d40683398e744b8a46b9f7da6fb1c1b84e83c1e3c29f11df3350badd9bab72f9c350badd9bab72f9c350badd9bab72f9c
cc: FreeBSD Stable
cc: freebsd-smp@freebsd.org
Subject: Re: What the???? Is there a Doctor in the house? Help
X-BeenThere: freebsd-smp@freebsd.org
X-Mailman-Version: 2.1.1
Precedence: list
List-Id: FreeBSD SMP implementation group
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
X-List-Received-Date: Sat, 10 May 2003 05:39:44 -0000

vizion communication wrote:
> Ok do not know what is causing this problem and would
> appreciate some help in tracking down the cause.
>
> FreeBSD 4.7
>
> System runs fine for maybe a week and then all networking
> stops - without warning. I do not know any way of persuading
> it to resume networking without rebooting.
>
> Does anyone have some useful suggestions on how to diagnose
> the cause?

You said you had a XXXX networking card, so that's probably the
problem. Try this patch on if_XXXX.c:

	+XXXXXXXXXX
	-XXXXXXXXXX

(it's a unidiff).

8-)

Really, you need to provide us with at least the name of your
networking card.

Also, you should try to ifconfig down the interface, and then
ifconfig it back up, in case it's a lost interrupt or something
that needs a timer to correct (doing the ifconfig's will reset
the card).

-- Terry
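
[Archive note] The two diagnostic suggestions in this thread can be sketched as a small sh helper. This is only an illustration under stated assumptions: the function names extract_cmds and missing_procs, the snapshot paths, and the interface name fxp0 are hypothetical and do not come from the thread, and the parsing assumes the default ps -ax column layout (PID, TT, STAT, TIME, COMMAND).

```shell
#!/bin/sh
# Hypothetical helpers, not part of FreeBSD: compare two saved "ps -ax"
# listings and print the commands that were running in the first
# snapshot but are missing from the second.

# Print the COMMAND column of a saved "ps -ax" listing, sorted and
# de-duplicated. Assumes four leading columns: PID, TT, STAT, TIME.
extract_cmds() {
    awk 'NR > 1 { $1 = $2 = $3 = $4 = ""; sub(/^ +/, ""); print }' "$1" | sort -u
}

# Usage: missing_procs BASELINE_SNAPSHOT LATER_SNAPSHOT
missing_procs() {
    tmp_base=$(mktemp) || return 1
    tmp_now=$(mktemp) || { rm -f "$tmp_base"; return 1; }
    extract_cmds "$1" > "$tmp_base"
    extract_cmds "$2" > "$tmp_now"
    # comm -23 keeps lines that appear only in the first (baseline) file.
    comm -23 "$tmp_base" "$tmp_now"
    rm -f "$tmp_base" "$tmp_now"
}

# Example session (paths and interface name are placeholders):
#   ps -ax > /var/tmp/ps.boot      # right after rebooting
#   ps -ax > /var/tmp/ps.hung      # once networking has stopped
#   missing_procs /var/tmp/ps.boot /var/tmp/ps.hung
#   ifconfig fxp0 down && ifconfig fxp0 up   # reset the card, as suggested above
```

Comparing command lines rather than PIDs means a daemon that was restarted under a new PID is not reported as missing; only commands that are no longer running at all show up.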