From: Craig Boston
To: freebsd-stable@freebsd.org
Subject: SMP users: Please test (possible bug in select())
Date: 06 Dec 2002 14:43:26 -0600
Message-Id: <1039207405.306.15.camel@owen1492.it.oot>

If anyone has an SMP system that they don't mind shutting down ungracefully for a good cause, please compile the following program:

    #include <sys/types.h>
    #include <sys/time.h>
    #include <strings.h>
    #include <unistd.h>

    int main (int argc, char *argv[])
    {
        fd_set a, b;
        struct timeval tv;

        /* Watch stdin, stdout and stderr for readability. */
        FD_ZERO(&a);
        FD_SET(0, &a);
        FD_SET(1, &a);
        FD_SET(2, &a);

        while (1) {
            /* Zero timeout: select() polls and returns immediately. */
            bzero(&tv, sizeof(tv));
            /* select() modifies its fd_set, so pass a fresh copy each time. */
            b = a;
            select(3, &b, NULL, NULL, &tv);
        }
    }

Start off by running one copy of this program PER CPU (i.e. run it twice on a dual-CPU box), and let it run for 5 minutes or so. If your system is still up and running, run two more copies and wait another 5 minutes. Bonus points for running from within X.

I'm trying to track down the cause of my mysterious SMP-related freezes that happen every 6-8 days, and by chance found out that the above program causes the system to wedge MUCH faster, usually within 30 seconds or so. I run it as a normal user, not root.
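A minimal sketch of a launcher for the procedure above, in case starting the copies by hand is inconvenient. It is not part of the original test program: the names `select_loop` and `spawn_select_loops` and the `iterations` limit are hypothetical additions, and `<sys/select.h>` assumes a reasonably modern POSIX system. Passing `iterations <= 0` reproduces the original endless loop.

```c
#include <sys/types.h>
#include <sys/time.h>
#include <sys/select.h>
#include <sys/wait.h>
#include <string.h>
#include <unistd.h>

/* Poll fds 0-2 with a zero timeout, as the test program does.
 * iterations <= 0 means loop forever (the original behaviour);
 * the bounded mode is an assumption added for dry runs. */
void select_loop(long iterations)
{
    fd_set a, b;
    struct timeval tv;
    long i;

    FD_ZERO(&a);
    FD_SET(0, &a);
    FD_SET(1, &a);
    FD_SET(2, &a);

    for (i = 0; iterations <= 0 || i < iterations; i++) {
        memset(&tv, 0, sizeof(tv));   /* zero timeout: pure poll */
        b = a;                        /* select() overwrites its fd_set */
        select(3, &b, NULL, NULL, &tv);
    }
}

/* Fork up to `copies` children that each run select_loop(), then wait
 * for every child that was started. Returns the number of children
 * that exited normally with status 0. */
int spawn_select_loops(int copies, long iterations)
{
    int i, status, started = 0, ok = 0;
    pid_t pid;

    for (i = 0; i < copies; i++) {
        pid = fork();
        if (pid < 0)
            break;                    /* stop spawning, reap what we have */
        if (pid == 0) {
            select_loop(iterations);
            _exit(0);
        }
        started++;
    }
    for (i = 0; i < started; i++) {
        if (wait(&status) > 0 && WIFEXITED(status) && WEXITSTATUS(status) == 0)
            ok++;
    }
    return ok;
}
```

Calling `spawn_select_loops(2, 0)` from `main` on a dual-CPU box corresponds to the first step of the test and blocks until the children are killed (or the machine wedges). A small positive `iterations` value gives a harmless dry run. Where it is available, `sysconf(_SC_NPROCESSORS_ONLN)` is one way to obtain the CPU count, though that is an assumption about the platform rather than something the original post relies on.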

I strongly suspected hardware problems until I found that I was able to break into the debugger via the serial console. The results of that were detailed in the "Help needed debugging hard lock" thread. Single-stepping gives hints of a deadlock, but I haven't been able to figure out its cause.

I don't want to open a PR until I can verify whether or not this happens on any machine other than mine.

Thank you,
Craig