From owner-freebsd-bugs Sat Mar 31 9:40:11 2001
Delivered-To: freebsd-bugs@hub.freebsd.org
Received: from freefall.freebsd.org (freefall.freebsd.org [216.136.204.21])
    by hub.freebsd.org (Postfix) with ESMTP id 65C1B37B71E
    for ; Sat, 31 Mar 2001 09:40:01 -0800 (PST)
    (envelope-from gnats@FreeBSD.org)
Received: (from gnats@localhost)
    by freefall.freebsd.org (8.11.1/8.11.1) id f2VHe1437531;
    Sat, 31 Mar 2001 09:40:01 -0800 (PST)
    (envelope-from gnats)
Received: from freefall.freebsd.org (freefall.freebsd.org [216.136.204.21])
    by hub.freebsd.org (Postfix) with ESMTP id 80D6537B71B
    for ; Sat, 31 Mar 2001 09:34:37 -0800 (PST)
    (envelope-from nobody@FreeBSD.org)
Received: (from nobody@localhost)
    by freefall.freebsd.org (8.11.1/8.11.1) id f2VHYbq37245;
    Sat, 31 Mar 2001 09:34:37 -0800 (PST)
    (envelope-from nobody)
Message-Id: <200103311734.f2VHYbq37245@freefall.freebsd.org>
Date: Sat, 31 Mar 2001 09:34:37 -0800 (PST)
From: Heikki.Tuuri@innobase.inet.fi
To: freebsd-gnats-submit@FreeBSD.org
X-Send-Pr-Version: www-1.0
Subject: kern/26247: Does pthread_mutex_trylock really work on the latest FreeBSD release if the mutex is already reserved?
Sender: owner-freebsd-bugs@FreeBSD.ORG
Precedence: bulk
X-Loop: FreeBSD.org

>Number: 26247
>Category: kern
>Synopsis: Does pthread_mutex_trylock really work on the latest FreeBSD release if the mutex is already reserved?
>Confidential: no
>Severity: serious
>Priority: medium
>Responsible: freebsd-bugs
>State: open
>Quarter:
>Keywords:
>Date-Required:
>Class: sw-bug
>Submitter-Id: current-users
>Arrival-Date: Sat Mar 31 09:40:01 PST 2001
>Closed-Date:
>Last-Modified:
>Originator: Heikki Tuuri
>Release: The latest (5)
>Organization:
Innobase Oy
>Environment:
Do not know because Dan Nelson reported the problem to me
>Description:
Hi FreeBSD people!

Dan Nelson reported a hang in the MySQL/Innobase engine. It seems that
the FreeBSD code gets into an infinite loop when pthread_mutex_trylock
is tried on a mutex already locked.
My code works on Linux-Intel and Solaris-Sparc. Could the problem have
something to do with thread signal masks or something?

You can view Innobase source code at www.bitkeeper.com, Hosted
repositories, mysql, subdirectory innobase.

Regards,

Heikki Tuuri
Innobase Oy

...............

Thank you Dan!

I do not have access to a FreeBSD computer during this weekend but your
stack prints already tell the origin of the problem.

I have implemented my own mutexes in the purpose that I can use an
assembler instruction for the atomic test-and-set operation needed in a
mutex. But for now I have done the test-and-set with
pthread_mutex_trylock: it provides an atomic operation on an OS mutex
which I can use in place of test-and-set.

It seems that if a thread does not acquire a mutex with the first try,
then something goes wrong and the thread is left in a loop. The stack
prints show that FreeBSD uses a spin wait also in the case a trylock
fails. This may be associated with the problem. More logical would be
that FreeBSD would return with a failure code if trylock fails.

A fix would be to replace pthread_mutex_trylock with the XCHG
instruction to implement test-and-set. But that would not work on
non-Intel FreeBSD platforms.

I will dig into the FreeBSD documentation and try to find a solution
from there.

Best regards,

Heikki

At 03:52 PM 3/30/01 -0600, you wrote:
>How-To-Repeat:
Seems to hang always.
>Fix:
>Release-Note:
>Audit-Trail:
>Unformatted:
>In the last episode (Mar 30), Heikki Tuuri said:
>> The FreeBSD bug is known. I will run tests on our FreeBSD machine in
>> the next few days. Obviously there is something wrong with the
>> FreeBSD port. Was it so that it hung and used 100 % of CPU? That has
>> been reported also from Italy.
>
>I have a similar problem, on FreeBSD 5 (i.e. -current). I can insert
>records one at a time with no problem, but if I try to update more than
>~250 records at a time, it hangs, consuming 100% cpu.
gdb'ing a corefile of
>the process, it looks like a mutex/spinlock problem of some sort.
>Deleting records dies if I delete between 100 and 150 records in one
>go. Does innobase create a mutex for each record processed? Maybe
>there's a limit on 256 held mutices per thread on FreeBSD or something.
>
>-- mysqld hung on "insert into temp (value) select ip from iptable limit 300":
>(gdb) thread apply all bt
>
>Thread 1 (process 764):
>#0  0x28288163 in _get_curthread ()
>    at /usr/src/lib/libc_r/uthread/uthread_kern.c:1145
>#1  0x28280064 in _spinlock_debug (lck=0xbfaa9ecc,
>    fname=0x28280138 "\203─\020\205└\017\2055   \213\205ⁿ■  \213U\b\211B\004\213E\f\211B\b\213E\020\211B\f\215Ñ╪■  [^_\211∞]├$FreeBSD: src/lib/libc_r/arch/i386/_atomic_lock.S,v 1.3 1999/08/28 00:03:01 peter Exp $", lineno=1495515 36)
>    at /usr/src/lib/libc_r/uthread/uthread_spinlock.c:83
>#2  0x282854d6 in mutex_trylock_common (mutex=0xbfaa9ecc)
>    at /usr/src/lib/libc_r/uthread/uthread_mutex.c:311
>#3  0x28285712 in __pthread_mutex_trylock (mutex=0x8ea3090)
>    at /usr/src/lib/libc_r/uthread/uthread_mutex.c:441
>#4  0x8193d4b in mutex_spin_wait (mutex=0x8ea308c) at ../include/os0sync.ic:38
>#5  0x8126ead in srv_master_thread (arg=0x0) at ../include/sync0sync.ic:220
>#6  0x2827f18c in _thread_start ()
>    at /usr/src/lib/libc_r/uthread/uthread_create.c:326
>#7  0x0 in ?? ()
>(gdb)
>
>
>-- mysqld hung on "delete from temp limit 150":
>(gdb) info threads;
>* 1 process 26111 0x28361b54 in gettimeofday () from /usr/lib/libc.so.5
>(gdb) where
>#0  0x28361b54 in gettimeofday () from /usr/lib/libc.so.5
>#1  0x28280949 in _thread_sig_handler (sig=0, info=0x828a660, ucp=0x282808d1)
>    at /usr/src/lib/libc_r/uthread/uthread_sig.c:93
>#2  0xbfbfffac in ??
()
>#3  0x28287ffb in _thread_kern_sig_defer ()
>    at /usr/src/lib/libc_r/uthread/uthread_kern.c:1049
>#4  0x282854bf in mutex_trylock_common (mutex=0x0)
>    at /usr/src/lib/libc_r/uthread/uthread_mutex.c:308
>#5  0x28285712 in __pthread_mutex_trylock (mutex=0x8ea3210)
>    at /usr/src/lib/libc_r/uthread/uthread_mutex.c:441
>#6  0x8193d4b in mutex_spin_wait (mutex=0x8ea320c) at ../include/os0sync.ic:38
>#7  0x8165a84 in buf_page_get_gen (space=0, offset=6, rw_latch=2, guess=0x0,
>    mode=10, mtr=0xbfaa95d0) at ../include/sync0sync.ic:220
>#8  0x81576d9 in trx_purge_truncate_rseg_history (rseg=0x8ebd10c,
>    limit_trx_no={high = 0, low = 7946}, limit_undo_no={high = 0, low = 0})
>    at ../include/trx0rseg.ic:25
>#9  0x8157bdd in trx_purge_truncate_history () at trx0purge.c:545
>#10 0x81589c7 in trx_purge_fetch_next_rec (roll_ptr=0xbfaa9ee4,
>    cell=0x8ec016c, heap=0x8ebfe0c) at trx0purge.c:564
>#11 0x813a2b6 in row_purge (node=0x8ec0134, thr=0x8ec00d4) at row0purge.c:481
>#12 0x813a4fe in row_purge_step (thr=0x8ec00d4) at row0purge.c:548
>#13 0x8129aa2 in que_run_threads (thr=0x8ec00d4) at que0que.c:1223
>#14 0x8158f95 in trx_purge () at trx0purge.c:1050
>#15 0x8126fb5 in srv_master_thread (arg=0x0) at srv0srv.c:1901
>#16 0x2827f18c in _thread_start ()
>    at /usr/src/lib/libc_r/uthread/uthread_create.c:326
>#17 0x0 in ?? ()
>(gdb)
>
>
>--
>    Dan Nelson
>    dnelson@emsphone.com
>

To Unsubscribe: send mail to majordomo@FreeBSD.org
with "unsubscribe freebsd-bugs" in the body of the message