From: Gavin Atkinson
Date: Thu, 29 Jul 2010 21:47:20 +0100 (BST)
To: Kostik Belousov
Cc: David Xu, FreeBSD Current
Subject: Re: firefox is stuck in getbuf()
In-Reply-To: <1279726551.25909.17.camel@buffy.york.ac.uk>
References: <4C4510B8.6090105@freebsd.org> <20100720132931.GI2381@deviant.kiev.zoral.com.ua> <1279726551.25909.17.camel@buffy.york.ac.uk>

On Wed, 21 Jul 2010, Gavin Atkinson wrote:
> On Tue, 2010-07-20 at 16:29 +0300, Kostik Belousov wrote:
> > On Tue, Jul 20, 2010 at 10:58:00AM +0800, David Xu wrote:
> > > With newest -HEAD code, firefox is stuck in getbuf().
> > >
> > > top
> > >
> > > last pid:  1814;  load averages:  0.00,  0.05,  0.07
> > >
> > > up 0+00:37:11  10:54:01
> > > 135 processes: 1 running, 134 sleeping
> > > CPU:  3.7% user,  0.0% nice,  0.6% system,  0.0% interrupt, 95.7% idle
> > > Mem: 259M Active, 393M Inact, 151M Wired, 1484K Cache, 111M Buf, 186M Free
> > > Swap: 2020M Total, 2020M Free
> > >
> > >   PID USERNAME  THR PRI NICE   SIZE    RES STATE   C   TIME   WCPU
> > > COMMAND
> > >  1427 davidxu     1  45    0   114M   101M select  0   1:24  0.29% Xorg
> > >  1588 davidxu    10  44    0   279M   145M getbuf  0   2:15  0.00%
> > > firefox-bin
> > >
> > >
> > > procstat -k 1588
> > >   PID    TID COMM             TDNAME           KSTACK
> > >
> > >  1588 100200 firefox-bin      initial thread   mi_switch sleepq_switch
> > > sleepq_wait _sleep getdirtybuf flush_deplist softdep_sync_metadata
> > > ffs_syncvnode ffs_fsync VOP_FSYNC_APV fsync syscallenter syscall
> > > Xint0x80_syscall
> > >  1588 100207 firefox-bin      -                mi_switch sleepq_switch
> > > sleepq_catch_signals sleepq_wait_sig _cv_wait_sig seltdwait poll
> > > syscallenter syscall Xint0x80_syscall
> > >  1588 100208 firefox-bin      -                mi_switch sleepq_switch
> > > sleepq_catch_signals sleepq_wait_sig _sleep __umtx_op_cv_wait _umtx_op
> > > syscallenter syscall Xint0x80_syscall
> > >  1588 100209 firefox-bin      -                mi_switch sleepq_switch
> > > sleepq_catch_signals sleepq_timedwait_sig _sleep __umtx_op_cv_wait
> > > _umtx_op syscallenter syscall Xint0x80_syscall
> > >  1588 100210 firefox-bin      -                mi_switch sleepq_switch
> > > sleepq_catch_signals sleepq_timedwait_sig _sleep __umtx_op_cv_wait
> > > _umtx_op syscallenter syscall Xint0x80_syscall
> > >  1588 100216 firefox-bin      -                mi_switch sleepq_switch
> > > sleepq_catch_signals sleepq_wait_sig _sleep __umtx_op_cv_wait _umtx_op
> > > syscallenter syscall Xint0x80_syscall
> > >  1588 100220 firefox-bin      -                mi_switch sleepq_switch
> > > sleepq_wait _sleep getdirtybuf flush_deplist softdep_sync_metadata
> > > ffs_syncvnode ffs_fsync VOP_FSYNC_APV fsync syscallenter syscall
> > > Xint0x80_syscall
> > >  1588 100238 firefox-bin      -                mi_switch sleepq_switch
> > > sleepq_catch_signals sleepq_wait_sig _sleep __umtx_op_cv_wait _umtx_op
> > > syscallenter syscall Xint0x80_syscall
> > >  1588 100239 firefox-bin      -                mi_switch sleepq_switch
> > > sleepq_catch_signals sleepq_wait_sig _sleep __umtx_op_cv_wait _umtx_op
> > > syscallenter syscall Xint0x80_syscall
> > >  1588 100240 firefox-bin      -                mi_switch sleepq_switch
> > > sleepq_catch_signals sleepq_wait_sig _sleep __umtx_op_cv_wait _umtx_op
> > > syscallenter syscall Xint0x80_syscall
> >
> > Can you, please, do the following:
> > show the backtraces for the system processes, in particular, syncer,
> > bufdaemon, softdepflush daemon, pagedaemon and vm ?
> > for the stuck firefox thread, find the address of the buffer
> > supplied as an argument to getdirtybuf, and print the *(struct buf *)addr ?
> > This can be done on the live/stuck system using kgdb on /dev/mem.
>
> I can relatively easily recreate this, see my thread on -current on the
> 17th July ("Filesystem wedge, SUJ-related?"), which (and the followup
> emails) contain additional info.  I'm currently trying to find the
> commit responsible for introducing this, and have established that a

OK, sorry for the delay.  I have the information requested.  Please see
http://people.freebsd.org/~gavin/rho-fs-hang.txt

I've started to try and narrow down where exactly the hangs started:

r208700 - June 1st - seems to work fine
r209425 - June 22nd - hangs occur

If you need any more info, let me know.

Thanks,

Gavin
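
[Illustrative sketch, not part of the original message.] The two data points above (r208700 works, r209425 hangs) bound a range that can be narrowed by bisection. The function below is a minimal sketch of that search. It assumes the behaviour flips exactly once inside the range, and `hangs` is a hypothetical callback standing in for "build that revision, boot it, and try to reproduce the wedge"; neither name comes from the thread.

```python
def first_bad_revision(good, bad, hangs):
    """Bisect for the lowest revision in (good, bad] where hangs(rev) is true.

    Assumptions (not from the original thread):
      - `good` is a revision known to work, `bad` one known to hang;
      - the behaviour changes exactly once between them;
      - `hangs` is a hypothetical callable that tests one revision
        and returns True if the hang reproduces.
    Returns (first_bad, number_of_revisions_tested).
    """
    tested = 0
    while bad - good > 1:
        mid = (good + bad) // 2   # revision halfway between the bounds
        tested += 1
        if hangs(mid):
            bad = mid             # hang already present: culprit is at or before mid
        else:
            good = mid            # still fine: culprit is after mid
    return bad, tested
```

Called as `first_bad_revision(208700, 209425, hangs)`, this needs at most 10 test builds, since the range spans 725 revision numbers and 2^10 = 1024. Not every revision number in that range necessarily touches the kernel, so some candidates may behave identically to their predecessor.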