From owner-freebsd-current@FreeBSD.ORG Tue Jul 12 18:44:21 2011
From: "Hartmann, O." <ohartman@zedat.fu-berlin.de>
Date: Tue, 12 Jul 2011 20:44:19 +0200
To: Matt
Cc: mdf@freebsd.org, FreeBSD Current <freebsd-current@freebsd.org>, Arnaud Lacombe
Subject: Re: Heavy I/O blocks FreeBSD box for several seconds

On 07/12/11 20:10, Matt wrote:
>
>> Sic... If you allow me the comparison, FreeBSD development is as open
>> as are the US (and, to some extent, most western countries') borders
>> nowadays open to aliens, and believe me, this is not a compliment.
>>
>> - Arnaud
>>

I like the comment, although I disagree. In some cases, 'too open' is
worse. Look at Linux: there are distributions that are too open, and the
rate of system malfunctions is relatively high compared to the *BSDs.
That has been my experience over the last few years, especially with
RedHat ...

> This is getting offtopic fast. Can we just EGODWIN here? It doesn't
> fulfill the entire requirements, but it's getting close...
>
> Is it possible CPU is the wrong cause of the blocking? Is there
> perhaps memory contention between ZFS/UFS in OP's setup? Could
> filesystem/disk performance be the cause and not obscure
> technicalities of ULE scheduler? Dodgy hardware causing interrupt
> storms? Ethernet not in polling mode?
>
> Matt

Dodgy hardware: a Dell PowerEdge 1950 with 16 GB RAM, two 4-core XEONs
at 2.5 GHz, two Broadcom GBit NICs, a SAS controller (mpt) and two SATA
drives.

Another box: a self-built LGA775 system with 8 GB RAM, a Realtek NIC
with polling enabled (though I do not know whether it is actually used),
a Q6600 CPU and 5 SATA drives attached to an ICH10R.

Notebook: the dodgy hardware there is a Dell Latitude E6520.

More dodgy hardware is our Dell Blade system with 24 GB RAM, two LGA1366
sockets and two Westmere 6-core Intel XEONs (X56xx-something). That box's
dodgy hardware includes a SAS 2.0 controller with a 500 GB SAS 2.0 hard
drive and a 2 TB SATA drive.
That box now runs Linux because of its nVidia TESLA M2050 board; the
dodgy nVidia GPU card cannot be used with FreeBSD as the server OS. It
was especially on that machine, which runs headless, that I experienced
those 'locks': when starting the compiler, or when running some n-body
simulation code which is not parallelised, so I start an ensemble of 5,
6 or 12 instances.

I would like to quantify the problem if someone could give me some
advice on how to measure it. Within this thread I read that top isn't
well suited for that, so what is?

Well, just for fun, I compiled the 4BSD scheduler into the older 16-core
XEON box's kernel and tried to reproduce the problems. In most cases I
was not able to, except when doing massive disk I/O AND copying lots of
data over the network at the same time. That seems to bring down even a
simple SSH session. I'm confused ...
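
Just to make the measurement question concrete: what I have in mind is
something along the lines of the minimal user-space sketch below (plain
POSIX clock_gettime()/nanosleep(), nothing FreeBSD-specific; the 10 ms
period and 100 ms threshold are arbitrary numbers I picked, not measured
values). Run alongside the disk/network load, it would at least show
whether an otherwise idle process gets held off the CPU for hundreds of
milliseconds, which is what those 'locks' feel like:

/*
 * stalldetect.c -- rough user-space "stall detector" sketch.
 * Sleeps for a short, fixed interval in a loop and reports whenever the
 * actual wakeup is much later than requested; large overshoots while
 * the heavy I/O is running would suggest the process is being kept off
 * the CPU, rather than an artifact of top's sampling.
 *
 * Build: cc -O2 -o stalldetect stalldetect.c
 * (some systems may additionally need -lrt for clock_gettime)
 */
#include <stdio.h>
#include <time.h>

int
main(void)
{
	const long period_ns = 10 * 1000 * 1000;	/* request 10 ms sleeps */
	const double threshold_s = 0.100;		/* report >100 ms overshoot */
	struct timespec req = { 0, period_ns };
	struct timespec before, after;

	for (;;) {
		clock_gettime(CLOCK_MONOTONIC, &before);
		nanosleep(&req, NULL);
		clock_gettime(CLOCK_MONOTONIC, &after);

		double elapsed = (after.tv_sec - before.tv_sec) +
		    (after.tv_nsec - before.tv_nsec) / 1e9;

		if (elapsed - period_ns / 1e9 > threshold_s)
			printf("stall: slept %.3f s instead of %.3f s\n",
			    elapsed, period_ns / 1e9);
	}
	return (0);
}

If someone knows a proper tool for this (top obviously is not it), I
would still prefer that over a home-grown hack like the above.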