Date: Tue, 12 May 2009 10:14:55 -0400
From: John Baldwin <jhb@freebsd.org>
To: pluknet <pluknet@gmail.com>
Cc: freebsd-stable@freebsd.org
Subject: Re: lock up in 6.2 (procs massively stuck in Giant)
Message-ID: <200905121014.55450.jhb@freebsd.org>
In-Reply-To: <a31046fc0905112312y2496e5cex334ddcaf57889909@mail.gmail.com>
References: <a31046fc0904292336w17aca317hefd32dad5bc28007@mail.gmail.com> <200905110949.31142.jhb@freebsd.org> <a31046fc0905112312y2496e5cex334ddcaf57889909@mail.gmail.com>

On Tuesday 12 May 2009 2:12:27 am pluknet wrote:
> 2009/5/11 John Baldwin <jhb@freebsd.org>:
> > On Monday 04 May 2009 11:41:35 pm pluknet wrote:
> >> 2009/5/1 John Baldwin <jhb@freebsd.org>:
> >> > On Thursday 30 April 2009 2:36:34 am pluknet wrote:
> >> >> Hi folks.
> >> >>
> >> >> Today I got a new locking issue.
> >> >> This is the first time I got it, and it's merely reproduced.
> >> >>
> >> >> The box has lost both remote connection and local access.
> >> >> No SIGINFO output on the local console even.
> >> >> Jumping in ddb> shows the next:
> >> >>
> >> >> 1) first, this is a 8-way web server. No processes on runqueue
> >> >> except one httpd (i.e. ps shows R in its state):
> >> >
> >> > You need to find who owns Giant and what that thread is doing.
> >> > You can try using 'show lock Giant' as well as
> >> > 'show lockchain 11568'.
> >> >
> >>
> >> Hi, John!
> >>
> >> Just reproduced now on another box.
> >> Hmm.. stack of the process owning Giant looks garbled.
> >>
> >> db> show lock Giant
> >> class: sleep mutex
> >> name: Giant
> >> flags: {DEF, RECURSE}
> >> state: {OWNED, CONTESTED}
> >> owner: 0xd0d79320 (tid 102754, pid 34594, "httpd")
> >>
> >> db> show lockchain 34594
> >> thread 102754 (pid 34594, httpd) running on CPU 7
> >> db> show lockchain 102754
> >> thread 102754 (pid 34594, httpd) running on CPU 7
> >
> > The thread is running, so we don't know what its top of stack is and
> > you can't get a good stack trace in that case.
> >
> > None of your CPUs are idle, so I don't think you have any sort of
> > deadlock. You might have a livelock.
> >
> > --
> > John Baldwin
> >
>
> I'm curious if it could be caused by heavy load.
> I don't know what it might be definitely,
> as it's non-trivial for me to determine the reason
> of a livelock, and to debug it.
>
> So I think it may have sense to try 7.x, as there
> has been done much locking work.

It may be worth trying 7.  Also, what is the state of the 'swi7: clock'
process?

--
John Baldwin
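
Editorial note, not part of the original message: when collecting ddb output from several lockups, it can help to pull the lock owner out of each 'show lock' dump mechanically. The following is only a rough sketch in Python, assuming the output has exactly the "key: value" shape quoted in this thread. The function name parse_show_lock and the sample constant are hypothetical helpers made up for illustration; they are not part of ddb or of any FreeBSD tool.

```python
import re

# Sample text copied from the ddb session quoted in this thread.
SHOW_LOCK_OUTPUT = """\
class: sleep mutex
name: Giant
flags: {DEF, RECURSE}
state: {OWNED, CONTESTED}
owner: 0xd0d79320 (tid 102754, pid 34594, "httpd")
"""

# Assumed shape of the owner line: address, then (tid N, pid N, "name").
OWNER_RE = re.compile(
    r'(?P<addr>0x[0-9a-fA-F]+) '
    r'\(tid (?P<tid>\d+), pid (?P<pid>\d+), "(?P<comm>[^"]*)"\)'
)


def parse_show_lock(text):
    """Parse 'key: value' lines of a 'show lock' dump into a dict (hypothetical helper)."""
    info = {}
    for line in text.splitlines():
        key, sep, value = line.partition(":")
        if not sep:
            continue  # skip lines that are not 'key: value'
        key, value = key.strip(), value.strip()
        if value.startswith("{") and value.endswith("}"):
            # Brace-delimited sets such as {OWNED, CONTESTED} become lists.
            info[key] = [v.strip() for v in value[1:-1].split(",") if v.strip()]
        elif key == "owner":
            m = OWNER_RE.fullmatch(value)
            if m:
                info[key] = {
                    "addr": m.group("addr"),
                    "tid": int(m.group("tid")),
                    "pid": int(m.group("pid")),
                    "comm": m.group("comm"),
                }
            else:
                info[key] = value  # unknown format: keep the raw string
        else:
            info[key] = value
    return info


if __name__ == "__main__":
    lock = parse_show_lock(SHOW_LOCK_OUTPUT)
    owner = lock["owner"]
    print(f'{lock["name"]} is held by tid {owner["tid"]} '
          f'(pid {owner["pid"]}, {owner["comm"]})')
```

The tid extracted this way is what was passed to 'show lockchain' in the session above. Whether that thread is deadlocked or livelocked still has to be judged from what it is doing, as discussed in the thread.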