Date: Fri, 4 Jun 2004 16:08:21 -0400 (EDT)
From: Robert Watson
To: Ali Niknam
cc: freebsd-hackers@freebsd.org
Subject: Re: FreeBSD 5.2.1: Mutex/Spinlock starvation?
In-Reply-To: <00dd01c449b3$ca5a0f90$0400a8c0@redguy>

On Thu, 3 Jun 2004, Ali Niknam wrote:

> First of all: this is my first posting in this group so please be gentil
> :)

Welcome :-).

> Now i unfortunately do not know enough about the internals of BSD to do a
> very estimated guess, but i'll give a shot nevertheless: my estimate is that
> due to the tremendous amount of 'locked' processes the system simply starves
> of CPU to do anything. My guess is the Locking mechanism probably uses
> some kind of 'spin' to wait until the resource is unlocked (whichever
> resource it is, probably something network related, though).

Actually, by default, most mutexes in the system are sleep mutexes, so
they sleep on contention rather than spinning. In some cases, this
actually hurts more than spinning: if the holder releases the mutex
quickly, you pay for context switches, which cost more than spinning
for that short period of time. You might want to try adding "options
ADAPTIVE_MUTEXES" to your kernel configuration, which will cause mutexes
to spin briefly on SMP systems before sleeping, and has been observed to
improve performance quite a bit.

> I would be very interested to hear what this problem could be; perhaps i
> can test a little if someone has solutions (i cant test much
> unfortunately, it's a production system).

As you may or may not be aware, improving locking and parallelism in
FreeBSD 5.x is a big ongoing task, with a lot of activity. A moderate
quantity of locking work has occurred since the 5.2.1 release, so
depending on your tolerance for experimentation on this system, you
might wish to give 5-CURRENT a try. Be warned that 5-CURRENT, while
having a number of performance enhancements, also has some stability
regressions, more recent ACPI code, etc. I'm using older snapshots of
5-CURRENT in production today, but generally not newer than about April
or early May. If you do try -CURRENT, take a look at UPDATING, and make
sure to disable the many debugging features that are enabled by default
if you're interested specifically in performance.

If you have a lower tolerance for instability, there are a number of
minor performance tweaks that can be easily back-ported to 5.2.1, such
as the change to proc.h to make grabbing and releasing the proc lock
conditional on p_stops having events defined. This removes several
mutex operations from each system call, and I've observed a measurable
difference on micro-benchmarks. It's also pretty low risk. The change
is src/sys/sys/proc.h:1.366.
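
The idea behind the proc.h change mentioned above (take the lock only
when the event mask says there is something to do) can be sketched in
userland C. This is not the kernel code: struct fake_proc, its fields,
and every function name below are hypothetical, a pthread mutex stands
in for the proc lock, and a C11 atomic stands in for p_stops. The
lock_ops counter exists only so the saving is visible.

```c
#include <pthread.h>
#include <stdatomic.h>

/* Hypothetical stand-in for struct proc; names are illustrative only. */
struct fake_proc {
    pthread_mutex_t lock;      /* plays the role of the proc lock */
    atomic_uint     stops;     /* plays the role of p_stops (event bitmask) */
    unsigned        lock_ops;  /* number of lock acquisitions, for demonstration */
    unsigned        handled;   /* number of events actually handled */
};

static void fake_proc_init(struct fake_proc *p)
{
    pthread_mutex_init(&p->lock, NULL);
    atomic_init(&p->stops, 0);
    p->lock_ops = 0;
    p->handled = 0;
}

/* Unconditional shape: every call pays for a lock and an unlock. */
static void stop_event_always(struct fake_proc *p, unsigned ev)
{
    pthread_mutex_lock(&p->lock);
    p->lock_ops++;
    if (atomic_load(&p->stops) & ev)
        p->handled++;
    pthread_mutex_unlock(&p->lock);
}

/* Conditional shape: peek at the mask first, lock only if the event is set. */
static void stop_event_conditional(struct fake_proc *p, unsigned ev)
{
    /* Cheap unlocked check; the common case returns here. */
    if ((atomic_load_explicit(&p->stops, memory_order_relaxed) & ev) == 0)
        return;

    pthread_mutex_lock(&p->lock);
    p->lock_ops++;
    /* Re-check under the lock, since the mask may have changed. */
    if (atomic_load(&p->stops) & ev)
        p->handled++;
    pthread_mutex_unlock(&p->lock);
}
```

With no events set, stop_event_conditional() returns without touching
the mutex at all, which is where the per-system-call saving would come
from in this simplified model.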

There are some other related changes that can probably be dug up,
including changes to improve the performance of the scheduler in the
presence of threads, etc.

Robert N M Watson             FreeBSD Core Team, TrustedBSD Projects
robert@fledge.watson.org      Senior Research Scientist, McAfee Research
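
For readers unfamiliar with the spin-then-sleep behaviour that the
message attributes to "options ADAPTIVE_MUTEXES", here is a minimal
userland sketch using POSIX threads. It is an illustration under
assumptions, not the kernel implementation: adaptive_lock() and
SPIN_LIMIT are hypothetical names, and the fixed spin count is a
stand-in for whatever policy the kernel actually uses to decide how
long to spin.

```c
#include <pthread.h>
#include <stdbool.h>

/* Hypothetical tuning knob: how many times to retry before sleeping. */
#define SPIN_LIMIT 1000

/*
 * Try to take the mutex by spinning briefly; if it is still held after
 * SPIN_LIMIT attempts, fall back to a blocking acquire (which may sleep).
 * Returns true if the lock was obtained while spinning, false if the
 * caller had to block. Either way the mutex is held on return.
 */
static bool adaptive_lock(pthread_mutex_t *m)
{
    for (int i = 0; i < SPIN_LIMIT; i++) {
        if (pthread_mutex_trylock(m) == 0)
            return true;    /* holder released quickly: no context switch */
    }
    pthread_mutex_lock(m);  /* contention persisted: sleep until available */
    return false;
}
```

The trade-off is the one described in the message: when the holder
releases the mutex quickly, a short spin avoids the cost of a context
switch, while a long-held mutex still ends in a sleep rather than
burning CPU indefinitely.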