Message-ID: <4719F786.80708@gwdg.de>
Date: Sat, 20 Oct 2007 14:41:42 +0200
From: Rainer Hurling
To: Oleg Derevenetz, eugen@kuzbass.ru
Cc: freebsd-stable@freebsd.org
References: <027d01c8125c$73d4db80$c8c55358@delloleg> <20071019220501.GL31826@elvis.mu.org> <20071020082724.GA87825@svzserv.kemerovo.su> <008d01c812f5$7aad62d0$eec55358@W2KOOOD>
In-Reply-To: <008d01c812f5$7aad62d0$eec55358@W2KOOOD>
Subject: Re: kern/104406: [ufs] Processes get stuck in "ufs" state under persistent CPU load

PR kern/104406 seems to describe exactly what I have been experiencing on
three of my systems over the last few weeks. They are running FreeBSD
8.0-CURRENT (known as 7.0-CURRENT not long ago ;-) ).

On these machines I often observe hangs, sometimes lasting only a few
seconds, at other times 20-30 seconds, before input/output comes back.
This seems to happen under heavier disk usage (portupgrade, buildworld,
browsing complicated websites, etc.). During a hang even xterm stops
responding, while other applications that do not touch the disk, like
xclock, keep running.

I get no panics, and only UFS (and MSDOSFS) file systems are mounted, no
NTFS. About two months ago none of my systems showed these hangs.

I know that this 'hanging' behaviour has been described several times
recently on the STABLE and CURRENT lists, but mostly in a different
context. In the discussions about these hangs, people seem to be looking
for misbehaviour in the scheduler (namely ULE), the Linux emulation, the
Java runtime environment, or Firefox. From my point of view it more
likely has to do with UFS locking under high CPU load, or something
related to it.

I have hardly any programming or debugging skills, but if there is any
activity on this topic in the background, what can we do to help?

Sincerely,
Rainer Hurling

Oleg Derevenetz wrote:
>>>> Can anyone take a look at PR kern/104406? I have a repeatable hang
>>>> situation, but I can't obtain a kernel dump to get the results of
>>>> all the show commands from here:
>>>>
>>>> http://www.freebsd.org/doc/en_US.ISO8859-1/books/developers-handbook/kerneldebug-deadlocks.html
>>>>
>>>> After I break into the debugger using the Ctrl+Alt+Esc sequence and
>>>> enter a "panic" command, the kernel does not write a kernel dump but
>>>> seems to hang. Can anyone describe how to obtain a kernel dump in
>>>> this situation, or at least say which show command output is needed
>>>> first to debug this? The output of all the suggested commands is
>>>> huge, and I am afraid of making a mistake when carrying this output
>>>> from the screen to a sheet of paper and back :-)
>>
>> This [ufs] uninterruptible deadlock is very easy to reproduce on both
>> RELENG_6 and RELENG_7. Look at this PR:
>>
>> http://www.freebsd.org/cgi/query-pr.cgi?pr=kern/107439
>>
>> The PR is closed but the problem is still here with 7.0-PRERELEASE
>> and, perhaps, CURRENT.
>
> This is probably another bug because:
>
> 1. I built a kernel with INVARIANTS as described on the "Debugging
>    Deadlocks" page of the FreeBSD Developers' Handbook and got no
>    panic, only a deadlock;
> 2. I have no NTFS filesystem at all and just copy file(s) from FTP to
>    local UFS using mc. In this PR the panic occurred when NTFS was
>    mounted r/w (and did NOT occur when the same NTFS was mounted r/o).
>
> --
> Oleg Derevenetz OOD3-RIPE
> Phone: +7 4732 539880
> Fax: +7 4732 531415 http://www.vsi.ru
> CenterTelecom Voronezh ISP http://isp.vsi.ru