From: "Steven Hartland"
To: "Robert Watson"
Date: Sun, 9 Mar 2008 01:01:46 -0000
Cc: freebsd-performance@freebsd.org
Subject: Re: rrdtool / mtr causing stalling on 7.0

----- Original Message -----
From: "Robert Watson"

> It looks like the attachment got lost on the way through the mailing list.
>
> I think the first starting point is: what sort of stall is this?  Is it, for
> example, all network communication stalling, all disk I/O stalling, or the
> entire kernel and all processes stalling?  The usual diagnostics are:
>
> - Does the machine stop responding to pings while stalled, and/or possibly
>   "catch up" all at once when it recovers?
>
> - If you run the following loop on the machine without any network or console
>   I/O, do you see gaps in time stamps:
>
>     while (1) {
>         sleep 1
>         date >> date.log
>     }
>
> - If you write a short C program that looks a lot like the above loop, but
>   logs time stamps into an in-memory buffer, and have it look for gaps in the
>   sequence of >3 seconds, does it run across the stall?

Thanks for the ideas, Robert. The output from the shell script shows
significant gaps:

Sun Mar 9 00:20:33 GMT 2008
Sun Mar 9 00:20:34 GMT 2008 <== Stall
Sun Mar 9 00:21:09 GMT 2008
Sun Mar 9 00:21:10 GMT 2008
...
Sun Mar 9 00:25:23 GMT 2008
Sun Mar 9 00:25:24 GMT 2008
Sun Mar 9 00:25:25 GMT 2008
Sun Mar 9 00:25:27 GMT 2008 <== Stall
Sun Mar 9 00:25:53 GMT 2008
Sun Mar 9 00:25:59 GMT 2008
Sun Mar 9 00:26:00 GMT 2008

Running a ping alongside shows no missed responses.
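
For reference, the third diagnostic quoted above (time stamps kept in an in-memory buffer, then scanned for gaps of more than 3 seconds) could look roughly like the sketch below. This is only an illustration under stated assumptions: the names count_gaps, run_stall_monitor and GAP_THRESHOLD_SEC are made up for this example, and the use of CLOCK_MONOTONIC is a choice made here so that wall-clock adjustments are not mistaken for stalls; it is not something the thread prescribes.

```c
#define _POSIX_C_SOURCE 200112L
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <time.h>
#include <unistd.h>

/* Hypothetical threshold, taken from the ">3 seconds" in the quoted text. */
#define GAP_THRESHOLD_SEC 3

/*
 * Count adjacent samples that are more than `threshold` seconds apart.
 * If `report` is non-NULL, each gap found is also printed to it.
 */
size_t count_gaps(const time_t *samples, size_t n, time_t threshold,
                  FILE *report)
{
    size_t gaps = 0;

    for (size_t i = 1; i < n; i++) {
        long diff = (long)(samples[i] - samples[i - 1]);
        if (diff > (long)threshold) {
            gaps++;
            if (report != NULL)
                fprintf(report, "gap of %ld seconds before sample %zu\n",
                        diff, i);
        }
    }
    return gaps;
}

/*
 * Take n samples one second apart into a preallocated buffer. No file or
 * terminal I/O happens until sampling has finished, so a stall in the
 * I/O path cannot hide a stall in the process itself.
 */
int run_stall_monitor(size_t n)
{
    if (n == 0)
        return 0;

    time_t *samples = malloc(n * sizeof *samples);
    if (samples == NULL)
        return 1;

    /* Touch the buffer up front so page faults do not occur mid-run. */
    memset(samples, 0, n * sizeof *samples);

    for (size_t i = 0; i < n; i++) {
        struct timespec ts;
        clock_gettime(CLOCK_MONOTONIC, &ts);
        samples[i] = ts.tv_sec;
        sleep(1);
    }

    size_t gaps = count_gaps(samples, n, GAP_THRESHOLD_SEC, stdout);
    printf("%zu gap(s) over %zu samples\n", gaps, n);
    free(samples);
    return 0;
}

/*
 * Example driver (hypothetical), sampling for roughly ten minutes:
 *   int main(void) { return run_stall_monitor(600); }
 */
```

Because nothing is written until the end, a run that shows gaps here would point at the process not being scheduled at all, rather than at a blocked write to the log file or terminal.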

Enabling lock profiling for the period changes the behaviour somewhat,
producing more frequent but shorter stalls:

Sun Mar 9 00:30:31 GMT 2008
Sun Mar 9 00:30:32 GMT 2008
Sun Mar 9 00:30:34 GMT 2008
Sun Mar 9 00:30:35 GMT 2008
Sun Mar 9 00:30:36 GMT 2008
Sun Mar 9 00:30:37 GMT 2008
Sun Mar 9 00:30:38 GMT 2008
Sun Mar 9 00:30:41 GMT 2008
Sun Mar 9 00:30:42 GMT 2008 <== Stall
Sun Mar 9 00:30:44 GMT 2008
Sun Mar 9 00:30:45 GMT 2008 <== Stall
Sun Mar 9 00:30:47 GMT 2008 <== Stall
Sun Mar 9 00:30:49 GMT 2008
Sun Mar 9 00:30:50 GMT 2008 <== Stall
Sun Mar 9 00:30:52 GMT 2008 <== Stall
Sun Mar 9 00:30:54 GMT 2008
Sun Mar 9 00:30:55 GMT 2008
Sun Mar 9 00:30:56 GMT 2008
Sun Mar 9 00:30:57 GMT 2008 <== Stall
Sun Mar 9 00:31:03 GMT 2008 <== Stall
Sun Mar 9 00:31:05 GMT 2008
Sun Mar 9 00:31:06 GMT 2008 <== Stall
Sun Mar 9 00:31:08 GMT 2008
Sun Mar 9 00:31:09 GMT 2008
Sun Mar 9 00:31:10 GMT 2008
Sun Mar 9 00:31:11 GMT 2008 <== Stall
Sun Mar 9 00:31:14 GMT 2008
Sun Mar 9 00:31:15 GMT 2008
Sun Mar 9 00:31:16 GMT 2008 <== Stall
Sun Mar 9 00:31:20 GMT 2008
Sun Mar 9 00:31:21 GMT 2008
Sun Mar 9 00:31:22 GMT 2008

Using the following C code we also see stalls:

    #include <stdio.h>
    #include <stdlib.h>
    #include <time.h>
    #include <unistd.h>

    int main( int argc, char **argv )
    {
        time_t last = time( NULL );

        while ( 1 )
        {
            time_t now = time( NULL );
            time_t diff = now - last;

            /* 2+ seconds between 1-second sleeps counts as a stall */
            if ( diff >= 2 )
            {
                fprintf( stderr, "stalled for %ld seconds\n", (long)diff );
            }
            /* ctime() output ends in a newline; do not use it as a format */
            fputs( ctime( &now ), stderr );
            last = now;
            sleep( 1 );
        }

        exit( 0 ); /* not reached */
    }

[date.log]
Sun Mar 9 00:55:40 GMT 2008
Sun Mar 9 00:55:43 GMT 2008 <== Stall
Sun Mar 9 00:56:11 GMT 2008
Sun Mar 9 00:56:12 GMT 2008
Sun Mar 9 00:56:13 GMT 2008
Sun Mar 9 00:56:14 GMT 2008
Sun Mar 9 00:56:15 GMT 2008
[/date.log]

[timec output]
Sun Mar 9 00:55:40 2008
Sun Mar 9 00:55:41 2008
Sun Mar 9 00:55:42 2008
stalled for 2 seconds
Sun Mar 9 00:55:44 2008
stalled for 5 seconds
Sun Mar 9 00:55:49 2008
stalled for 2 seconds
Sun Mar 9 00:55:51 2008
stalled for 2 seconds
Sun Mar 9 00:55:53 2008
Sun Mar 9 00:55:54 2008
Sun Mar 9 00:55:55 2008
Sun Mar 9 00:55:56 2008
Sun Mar 9 00:55:57 2008
Sun Mar 9 00:55:58 2008
Sun Mar 9 00:55:59 2008
Sun Mar 9 00:56:00 2008
Sun Mar 9 00:56:01 2008
Sun Mar 9 00:56:02 2008
Sun Mar 9 00:56:03 2008
Sun Mar 9 00:56:04 2008
Sun Mar 9 00:56:05 2008
Sun Mar 9 00:56:06 2008
Sun Mar 9 00:56:07 2008
Sun Mar 9 00:56:08 2008
Sun Mar 9 00:56:09 2008
Sun Mar 9 00:56:10 2008
Sun Mar 9 00:56:11 2008
Sun Mar 9 00:56:12 2008
Sun Mar 9 00:56:13 2008
Sun Mar 9 00:56:14 2008
Sun Mar 9 00:56:15 2008
[/timec output]

As the list ate the attachment, the output from the lock profile can be
found here:

ftp://ftp1.multiplay.co.uk/pub/other/freebsd-7.0-rrdtool-stall.zip

Regards
Steve