Date:      Wed, 21 Jul 2010 16:35:51 +0100
From:      Gavin Atkinson <gavin@FreeBSD.org>
To:        Kostik Belousov <kostikbel@gmail.com>
Cc:        David Xu <davidxu@FreeBSD.org>, FreeBSD Current <current@FreeBSD.org>
Subject:   Re: firefox is stuck in getbuf()
Message-ID:  <1279726551.25909.17.camel@buffy.york.ac.uk>
In-Reply-To: <20100720132931.GI2381@deviant.kiev.zoral.com.ua>
References:  <4C4510B8.6090105@freebsd.org> <20100720132931.GI2381@deviant.kiev.zoral.com.ua>

On Tue, 2010-07-20 at 16:29 +0300, Kostik Belousov wrote:
> On Tue, Jul 20, 2010 at 10:58:00AM +0800, David Xu wrote:
> > With newest -HEAD code, firefox is stuck in getbuf().
> > 
> > top
> > 
> > last pid:  1814;  load averages:  0.00,  0.05,  0.07    up 0+00:37:11  10:54:01
> > 135 processes: 1 running, 134 sleeping
> > CPU:  3.7% user,  0.0% nice,  0.6% system,  0.0% interrupt, 95.7% idle
> > Mem: 259M Active, 393M Inact, 151M Wired, 1484K Cache, 111M Buf, 186M Free
> > Swap: 2020M Total, 2020M Free
> > 
> >   PID USERNAME    THR PRI NICE   SIZE    RES STATE   C   TIME   WCPU COMMAND
> >  1427 davidxu       1  45    0   114M   101M select  0   1:24  0.29% Xorg
> >  1588 davidxu      10  44    0   279M   145M getbuf  0   2:15  0.00% firefox-bin
> > 
> > 
> > procstat  -k 1588
> >   PID    TID COMM             TDNAME           KSTACK 
> > 
> >  1588 100200 firefox-bin      initial thread   mi_switch sleepq_switch 
> > sleepq_wait _sleep getdirtybuf flush_deplist softdep_sync_metadata 
> > ffs_syncvnode ffs_fsync VOP_FSYNC_APV fsync syscallenter syscall 
> > Xint0x80_syscall
> >  1588 100207 firefox-bin      -                mi_switch sleepq_switch 
> > sleepq_catch_signals sleepq_wait_sig _cv_wait_sig seltdwait poll 
> > syscallenter syscall Xint0x80_syscall
> >  1588 100208 firefox-bin      -                mi_switch sleepq_switch 
> > sleepq_catch_signals sleepq_wait_sig _sleep __umtx_op_cv_wait _umtx_op 
> > syscallenter syscall Xint0x80_syscall
> >  1588 100209 firefox-bin      -                mi_switch sleepq_switch 
> > sleepq_catch_signals sleepq_timedwait_sig _sleep __umtx_op_cv_wait 
> > _umtx_op syscallenter syscall Xint0x80_syscall
> >  1588 100210 firefox-bin      -                mi_switch sleepq_switch 
> > sleepq_catch_signals sleepq_timedwait_sig _sleep __umtx_op_cv_wait 
> > _umtx_op syscallenter syscall Xint0x80_syscall
> >  1588 100216 firefox-bin      -                mi_switch sleepq_switch 
> > sleepq_catch_signals sleepq_wait_sig _sleep __umtx_op_cv_wait _umtx_op 
> > syscallenter syscall Xint0x80_syscall
> >  1588 100220 firefox-bin      -                mi_switch sleepq_switch 
> > sleepq_wait _sleep getdirtybuf flush_deplist softdep_sync_metadata 
> > ffs_syncvnode ffs_fsync VOP_FSYNC_APV fsync syscallenter syscall 
> > Xint0x80_syscall
> >  1588 100238 firefox-bin      -                mi_switch sleepq_switch 
> > sleepq_catch_signals sleepq_wait_sig _sleep __umtx_op_cv_wait _umtx_op 
> > syscallenter syscall Xint0x80_syscall
> >  1588 100239 firefox-bin      -                mi_switch sleepq_switch 
> > sleepq_catch_signals sleepq_wait_sig _sleep __umtx_op_cv_wait _umtx_op 
> > syscallenter syscall Xint0x80_syscall
> >  1588 100240 firefox-bin      -                mi_switch sleepq_switch 
> > sleepq_catch_signals sleepq_wait_sig _sleep __umtx_op_cv_wait _umtx_op 
> > syscallenter syscall Xint0x80_syscall
> 
> Can you please do the following:
> - show the backtraces for the system processes, in particular syncer,
>   bufdaemon, the softdepflush daemon, pagedaemon and vm?
> - for the stuck firefox thread, find the address of the buffer
>   supplied as an argument to getdirtybuf, and print *(struct buf *)addr?
> This can be done on the live/stuck system using kgdb on /dev/mem.
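
To pick the threads of interest out of a `procstat -k` listing like the one
quoted above, a small script may help. What follows is only a sketch in
Python and is not part of the original report. It assumes the listing has
been saved as text, possibly still mail-quoted and line-wrapped, and the
names `parse_procstat_k` and `threads_blocked_in` are made up for
illustration.

```python
import re

# A record starts with two integers (PID, then TID); wrapped continuation
# lines of a kernel stack start with a symbol name instead.
RECORD_START = re.compile(r"^(\d+)\s+(\d+)\s+(.*)$")


def parse_procstat_k(text):
    """Return {tid: [tokens]} from line-wrapped `procstat -k` output.

    The token list holds everything after the TID (command, thread name
    and stack symbols); no attempt is made to split those columns apart.
    """
    records = {}
    current = None
    for raw in text.splitlines():
        # Drop any leading "> > " mail quoting, then surrounding blanks.
        line = re.sub(r"^(?:>\s?)+", "", raw).strip()
        if not line or line.startswith("PID"):
            continue  # blank line or column header
        match = RECORD_START.match(line)
        if match:
            current = int(match.group(2))
            records[current] = match.group(3).split()
        elif current is not None:
            records[current].extend(line.split())
    return records


def threads_blocked_in(text, symbol):
    """Return the sorted TIDs whose record mentions `symbol`."""
    return sorted(tid for tid, tokens in parse_procstat_k(text).items()
                  if symbol in tokens)


# Three of the records from the listing above, wrapped as in the mail.
SAMPLE = """
  PID    TID COMM             TDNAME           KSTACK
 1588 100200 firefox-bin      initial thread   mi_switch sleepq_switch
sleepq_wait _sleep getdirtybuf flush_deplist softdep_sync_metadata
ffs_syncvnode ffs_fsync VOP_FSYNC_APV fsync syscallenter syscall
Xint0x80_syscall
 1588 100207 firefox-bin      -                mi_switch sleepq_switch
sleepq_catch_signals sleepq_wait_sig _cv_wait_sig seltdwait poll
syscallenter syscall Xint0x80_syscall
 1588 100220 firefox-bin      -                mi_switch sleepq_switch
sleepq_wait _sleep getdirtybuf flush_deplist softdep_sync_metadata
ffs_syncvnode ffs_fsync VOP_FSYNC_APV fsync syscallenter syscall
Xint0x80_syscall
"""

if __name__ == "__main__":
    print(threads_blocked_in(SAMPLE, "getdirtybuf"))  # [100200, 100220]
```

The TIDs it reports are the ones whose getdirtybuf argument would then need
to be inspected in kgdb, as requested above.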

I can recreate this relatively easily; see my thread on -current from
17th July ("Filesystem wedge, SUJ-related?"), which, together with its
followup emails, contains additional info.  I'm currently trying to find
the commit that introduced this, and have established that a kernel from
1st June does not seem to exhibit the issue.

Tonight, I'll switch back to an up-to-date -current and try to get the
info you need.

Thanks,

Gavin

-- 
Gavin Atkinson
FreeBSD committer and bugmeister


