Date:      Wed, 21 Jul 2010 16:35:51 +0100
From:      Gavin Atkinson <gavin@FreeBSD.org>
To:        Kostik Belousov <kostikbel@gmail.com>
Cc:        David Xu <davidxu@FreeBSD.org>, FreeBSD Current <current@FreeBSD.org>
Subject:   Re: firefox is stuck in getbuf()
Message-ID:  <1279726551.25909.17.camel@buffy.york.ac.uk>
In-Reply-To: <20100720132931.GI2381@deviant.kiev.zoral.com.ua>
References:  <4C4510B8.6090105@freebsd.org> <20100720132931.GI2381@deviant.kiev.zoral.com.ua>

On Tue, 2010-07-20 at 16:29 +0300, Kostik Belousov wrote:
> On Tue, Jul 20, 2010 at 10:58:00AM +0800, David Xu wrote:
> > With newest -HEAD code, firefox is stuck in getbuf().
> >
> > top
> >
> > last pid:  1814;  load averages:  0.00,  0.05,  0.07    up 0+00:37:11  10:54:01
> > 135 processes: 1 running, 134 sleeping
> > CPU:  3.7% user,  0.0% nice,  0.6% system,  0.0% interrupt, 95.7% idle
> > Mem: 259M Active, 393M Inact, 151M Wired, 1484K Cache, 111M Buf, 186M Free
> > Swap: 2020M Total, 2020M Free
> >
> >   PID USERNAME    THR PRI NICE   SIZE    RES STATE   C   TIME   WCPU COMMAND
> >  1427 davidxu       1  45    0   114M   101M select  0   1:24  0.29% Xorg
> >  1588 davidxu      10  44    0   279M   145M getbuf  0   2:15  0.00% firefox-bin
> >
> >
> > procstat  -k 1588
> >   PID    TID COMM             TDNAME           KSTACK
> >
> >  1588 100200 firefox-bin      initial thread   mi_switch sleepq_switch
> > sleepq_wait _sleep getdirtybuf flush_deplist softdep_sync_metadata
> > ffs_syncvnode ffs_fsync VOP_FSYNC_APV fsync syscallenter syscall
> > Xint0x80_syscall
> >  1588 100207 firefox-bin      -                mi_switch sleepq_switch
> > sleepq_catch_signals sleepq_wait_sig _cv_wait_sig seltdwait poll
> > syscallenter syscall Xint0x80_syscall
> >  1588 100208 firefox-bin      -                mi_switch sleepq_switch
> > sleepq_catch_signals sleepq_wait_sig _sleep __umtx_op_cv_wait _umtx_op
> > syscallenter syscall Xint0x80_syscall
> >  1588 100209 firefox-bin      -                mi_switch sleepq_switch
> > sleepq_catch_signals sleepq_timedwait_sig _sleep __umtx_op_cv_wait
> > _umtx_op syscallenter syscall Xint0x80_syscall
> >  1588 100210 firefox-bin      -                mi_switch sleepq_switch
> > sleepq_catch_signals sleepq_timedwait_sig _sleep __umtx_op_cv_wait
> > _umtx_op syscallenter syscall Xint0x80_syscall
> >  1588 100216 firefox-bin      -                mi_switch sleepq_switch
> > sleepq_catch_signals sleepq_wait_sig _sleep __umtx_op_cv_wait _umtx_op
> > syscallenter syscall Xint0x80_syscall
> >  1588 100220 firefox-bin      -                mi_switch sleepq_switch
> > sleepq_wait _sleep getdirtybuf flush_deplist softdep_sync_metadata
> > ffs_syncvnode ffs_fsync VOP_FSYNC_APV fsync syscallenter syscall
> > Xint0x80_syscall
> >  1588 100238 firefox-bin      -                mi_switch sleepq_switch
> > sleepq_catch_signals sleepq_wait_sig _sleep __umtx_op_cv_wait _umtx_op
> > syscallenter syscall Xint0x80_syscall
> >  1588 100239 firefox-bin      -                mi_switch sleepq_switch
> > sleepq_catch_signals sleepq_wait_sig _sleep __umtx_op_cv_wait _umtx_op
> > syscallenter syscall Xint0x80_syscall
> >  1588 100240 firefox-bin      -                mi_switch sleepq_switch
> > sleepq_catch_signals sleepq_wait_sig _sleep __umtx_op_cv_wait _umtx_op
> > syscallenter syscall Xint0x80_syscall
>
> Can you, please, do the following:
> show the backtraces for the system processes, in particular, syncer,
> bufdaemon, softdepflush daemon, pagedaemon and vm ?
> for the stuck firefox thread, find the address of the buffer
> supplied as an argument to getdirtybuf, and print the *(struct buf *)addr ?
> This can be done on the live/stuck system using kgdb on /dev/mem.

I can recreate this relatively easily; see my thread on -current from
17th July ("Filesystem wedge, SUJ-related?"), which, together with its
follow-up emails, contains additional info.  I'm currently trying to find
the commit that introduced this, and have established that a kernel from
1st June does not seem to exhibit the same issue.

Tonight, I'll revert to a current -current and try to get the info you
need.
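
As a rough illustration of how the stuck threads can be picked out of
"procstat -k" output like the one quoted above, here is a minimal Python
sketch.  The helper name threads_in() and the SAMPLE constant are made up
for this example; the sample rows are copied from the quoted output with
the mail line wrapping undone, and the parsing assumes the column layout
shown there (PID, TID, COMM, TDNAME, KSTACK).

```python
# Sketch only: find threads whose kernel stack contains a given function.
# Assumes one thread per line, in the `procstat -k` layout quoted above.

SAMPLE = """\
  PID    TID COMM             TDNAME           KSTACK
 1588 100200 firefox-bin      initial thread   mi_switch sleepq_switch sleepq_wait _sleep getdirtybuf flush_deplist softdep_sync_metadata ffs_syncvnode ffs_fsync VOP_FSYNC_APV fsync syscallenter syscall Xint0x80_syscall
 1588 100207 firefox-bin      -                mi_switch sleepq_switch sleepq_catch_signals sleepq_wait_sig _cv_wait_sig seltdwait poll syscallenter syscall Xint0x80_syscall
 1588 100220 firefox-bin      -                mi_switch sleepq_switch sleepq_wait _sleep getdirtybuf flush_deplist softdep_sync_metadata ffs_syncvnode ffs_fsync VOP_FSYNC_APV fsync syscallenter syscall Xint0x80_syscall
"""


def threads_in(text, func):
    """Return (pid, tid) pairs for threads with `func` on their stack."""
    hits = []
    for line in text.splitlines():
        tokens = line.split()
        # Skip the header and blank lines: real rows start with PID and TID.
        if len(tokens) < 4 or not (tokens[0].isdigit() and tokens[1].isdigit()):
            continue
        # tokens[3:] also holds the TDNAME words ("-", "initial", "thread");
        # that is harmless for a simple membership test on a function name.
        if func in tokens[3:]:
            hits.append((int(tokens[0]), int(tokens[1])))
    return hits


if __name__ == "__main__":
    print(threads_in(SAMPLE, "getdirtybuf"))
    # -> [(1588, 100200), (1588, 100220)]
```

On the sample rows this reports TIDs 100200 and 100220, the two threads
sleeping in getdirtybuf underneath fsync.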

Thanks,

Gavin

-- 
Gavin Atkinson
FreeBSD committer and bugmeister


