Date: Mon, 26 Nov 2001 17:20:51 +0100 (CET)
From: Laurent Wacrenier <lwa@teaser.fr>
To: FreeBSD-gnats-submit@freebsd.org
Subject: bin/32295: pthread dont dequeue signals
Message-ID: <20011126162051.4459032601@victor.teaser.fr>
>Number: 32295
>Category: bin
>Synopsis: pthread dont dequeue signals
>Confidential: no
>Severity: serious
>Priority: medium
>Responsible: freebsd-bugs
>State: open
>Quarter:
>Keywords:
>Date-Required:
>Class: sw-bug
>Submitter-Id: current-users
>Arrival-Date: Mon Nov 26 08:30:00 PST 2001
>Closed-Date:
>Last-Modified:
>Originator: Laurent Wacrenier
>Release: FreeBSD 5.0-CURRENT i386
>Organization:
France Teaser
>Environment:
FreeBSD mysql.firstcampus.fr 4.4-STABLE FreeBSD 4.4-STABLE #0: Wed Nov 7 12:07:39 CET 2001 jcmichot@mysql.firstcampus.fr:/usr/src/sys/compile/SQLCAMPUS i386
FreeBSD math.teaser.net 4.2-STABLE FreeBSD 4.2-STABLE #2: Wed Jan 17 20:19:53 CET 2001 lwa@math.teaser.net:/usr/src/sys/compile/MATH i386
>Description:
Under heavy load, our MySQL servers sometimes consume about 100% CPU
while remaining functional. I traced the process; it loops on the
following system calls:
% strace -p 261
gettimeofday({1006788477, 519930}, NULL) = 0
poll([{fd=3, events=POLLRDNORM, revents=POLLRDNORM}, {fd=1715, events=POLLRDNORM}, {fd=1633, events=POLLRDNORM}, {fd=1748, events=POLLRDNORM}, {fd=145, events=POLLRDNORM}, {fd=988, events=POLLRDNORM}, {fd=1734, events=POLLRDNORM}, {fd=313, events=POLLRDNORM}, (...) , 292, 7838) = 1
File descriptor 3 is reported as readable, but nothing ever reads from it.
According to lsof:
mysqld 261 mysql 3u PIPE 0xd793ff20 16384 ->0xd793fe80, cnt=10798, in=10798
The pipe is used to deliver signals to the threads: descriptor 3 is
_thread_kern_pipe[0], connected to _thread_kern_pipe[1] == 4. The
number of queued bytes increases every few seconds; I have never seen
it decrease. The server has been in this state for 2 days.
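The busy loop follows from how poll() treats a pipe that still holds unread data. Here is a minimal sketch of that behaviour, not taken from libc_r: the helper names pipe_is_readable() and count_immediate_wakeups() are hypothetical, and POLLIN is used in place of the POLLRDNORM seen in the trace, on the assumption that both report ordinary readable data on a pipe.

```c
#include <assert.h>
#include <poll.h>
#include <unistd.h>

/* Returns 1 if rfd has unread data, 0 on timeout, -1 on error. */
int pipe_is_readable(int rfd, int timeout_ms)
{
    struct pollfd pfd;
    int n;

    pfd.fd = rfd;
    pfd.events = POLLIN;    /* the trace shows POLLRDNORM; POLLIN is assumed equivalent here */
    pfd.revents = 0;
    n = poll(&pfd, 1, timeout_ms);
    if (n < 0)
        return -1;
    return (n == 1 && (pfd.revents & POLLIN)) ? 1 : 0;
}

/*
 * Queues one byte in a fresh pipe, optionally reads it back, then polls
 * the read end `rounds` times. Returns how many polls reported data,
 * or -1 on error.
 */
int count_immediate_wakeups(int rounds, int drain_first)
{
    int fds[2];
    char byte = 14;         /* 0x0e, the byte shown as "\^N" in the ktrace output */
    int i, hits = 0;

    assert(rounds > 0);
    if (pipe(fds) != 0)
        return -1;
    if (write(fds[1], &byte, 1) != 1 ||
        (drain_first && read(fds[0], &byte, 1) != 1)) {
        close(fds[0]);
        close(fds[1]);
        return -1;
    }
    for (i = 0; i < rounds; i++) {
        if (pipe_is_readable(fds[0], 10) == 1)
            hits++;
    }
    close(fds[0]);
    close(fds[1]);
    return hits;
}
```

With drain_first == 0 every poll returns at once because the byte is never consumed; with drain_first == 1 every poll waits for its timeout instead.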
I ktraced the process between two increases of the pipe queue:
261 mysqld CALL poll(0x81eb000,0x154,0x2518)
261 mysqld PSIG SIGALRM caught handler=0x282a92fc mask=0x0 code=0x0
261 mysqld RET poll 1
261 mysqld CALL write(0x4,0x81e1e5f,0x1)
261 mysqld GIO fd 4 wrote 1 byte
"\^N"
261 mysqld RET write 1
261 mysqld CALL sigreturn(0x81e1e7c)
My heavily loaded servers are in production use, so I can't run them
under a debugger to gather more information.
>How-To-Repeat:
Not easy to repeat. Send a lot of queries to a MySQL server and wait
some weeks until it happens.
>Fix:
I'm not fluent in libc_r, but I suspect a bug in thread_kern_poll()
in libc_r/uthread/uthread_kern.c or in _thread_sig_handler() in
uthread_sig.c.
In uthread_kern.c, _thread_kern_pipe[0] is added to the poll set, but
signals are dequeued only when _sigq_check_reqd != 0. If no new signal
arrives, it can loop forever.
_sigq_check_reqd is set to 1 in _thread_sig_handler() only when the
signal is not blocked, but a byte is written to the pipe in every case.
A race condition may also be involved.
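The suspected sequence can be modelled in a few lines. This is only a sketch of the hypothesis stated above, not the libc_r source: struct sched_model and the model_* functions are hypothetical stand-ins, the "signal" is delivered by a plain function call rather than a real handler, and POLLIN is assumed in place of POLLRDNORM.

```c
#include <poll.h>
#include <unistd.h>

/* Hypothetical stand-ins for the libc_r state named in the report. */
struct sched_model {
    int pipe_fds[2];        /* plays the role of _thread_kern_pipe */
    int sigq_check_reqd;    /* plays the role of _sigq_check_reqd */
};

/* Handler as described: the byte is always written, the flag is set
 * only when the signal is not blocked. Returns 0 on success. */
int model_signal(struct sched_model *m, int signo, int blocked)
{
    char c = (char)signo;

    if (write(m->pipe_fds[1], &c, 1) != 1)
        return -1;
    if (!blocked)
        m->sigq_check_reqd = 1;
    return 0;
}

/* One scheduler pass: poll the pipe, then drain it only if the flag
 * is set. Returns 1 if poll woke up because of the pipe. */
int model_poll_once(struct sched_model *m, int timeout_ms)
{
    struct pollfd pfd;
    char buf[64];
    int woke;

    pfd.fd = m->pipe_fds[0];
    pfd.events = POLLIN;
    pfd.revents = 0;
    woke = (poll(&pfd, 1, timeout_ms) == 1);
    if (m->sigq_check_reqd) {
        /* Read only while poll confirms data, so read() never blocks. */
        while (poll(&pfd, 1, 0) == 1) {
            if (read(m->pipe_fds[0], buf, sizeof(buf)) <= 0)
                break;
        }
        m->sigq_check_reqd = 0;
    }
    return woke;
}

/* Delivers one signal, runs `rounds` passes, returns the number of
 * passes in which poll woke up on the pipe, or -1 on error. */
int model_spurious_wakeups(int blocked, int rounds)
{
    struct sched_model m;
    int i, wakeups = 0;

    m.sigq_check_reqd = 0;
    if (pipe(m.pipe_fds) != 0)
        return -1;
    if (model_signal(&m, 14, blocked) != 0)
        wakeups = -1;
    for (i = 0; wakeups >= 0 && i < rounds; i++)
        wakeups += model_poll_once(&m, 10);
    close(m.pipe_fds[0]);
    close(m.pipe_fds[1]);
    return wakeups;
}
```

In this model an unblocked signal causes a single wakeup, after which the pipe is drained. A blocked signal leaves its byte queued with the flag clear, so every later pass wakes up immediately, which matches the poll() loop shown in the description.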
>Release-Note:
>Audit-Trail:
>Unformatted:
