Date: Wed, 17 Nov 2004 18:36:10 -0800 (PST)
From: Doug White
To: Sean McNeil
cc: current@freebsd.org
Subject: Re: Why won't slapd shutdown (kill -0)?
Message-ID: <20041117183453.C29048@carver.gumbysoft.com>
In-Reply-To: <1100725008.21333.2.camel@server.mcneil.com>
References: <1100657472.74795.2.camel@server.mcneil.com> <1100725008.21333.2.camel@server.mcneil.com>

On Wed, 17 Nov 2004, Sean McNeil wrote:

> On Wed, 2004-11-17 at 10:28 -0800, Doug White wrote:
> > On Tue, 16 Nov 2004, Sean McNeil wrote:
> >
> > > This has been happening for a long time with current and hasn't been
> > > resolved. When I start up slapd, I cannot stop it without kill -9 ing
> > > it. It would appear stuck in kse and probably has something to do with
> > > kill -0:
> >
> > Mind expanding on this? The backtrace looks normal for a pthread process.
> > kill -0 just tests signal delivery; the process is completely unaware that
> > the probe occurred, though. The process may also be unkillable if it's
> > stuck in some sort of I/O wait.
> >
> > Is the server busy when you signal it?
>
> Oh, OK. I didn't look at /usr/local/etc/rc.subr too closely. I have
> additional information, though....
>
> It appears that all the threads are destroyed yet it is still in the
> thread processing loop. The process is no longer active at all. I just
> had a similar problem happen with vlc where I closed it yet it is
> hanging in the same place as slapd with all the threads gone.

Interesting... what scheduler are you using?

>
> Here is the one from vlc:
>
> (gdb) bt full
> #0  _thr_sched_switch_unlocked (curthread=0x955000) at pthread_md.h:226

I can't find a reference to this in that file. Can you run ldd against your
vlc binary? I'm curious what thread library it thinks it's running.

>         psf = {psf_valid = 0, psf_flags = 0, psf_cancelflags = 29952806,
>           psf_interrupted = 8, psf_timeout = 11279168, psf_signo = 0,
>           psf_state = 11279168, psf_wait_data = {mutex = 0x8, cond = 0x8, lock =
> 0x8,
>             sigwait = 0x8}, psf_wakeup_time = {tv_sec = 0, tv_nsec = 0},
> psf_sigset = {
>             __bits = {29950366, 8, 9860096, 0}}, psf_sigmask = {__bits =
> {9752576, 1,
>               9860096, 0}}, psf_seqno = 29995347}
>         curkse = (struct kse *) 0x952000
>         resume_once = 0
> #1  0x0000000801c925e0 in _thr_sched_switch (curthread=0x955000)
>     at /usr/src/lib/libpthread/thread/thr_kern.c:607
> No locals.
> #2  0x0000000801c85cb4 in _pthread_join (pthread=0x967400,
> thread_return=0x0)
>     at /usr/src/lib/libpthread/thread/thr_join.c:133
>         curthread = (struct pthread *) 0x955000
>         tmp = (void *) 0x0
>         crit = 0x0
>         ret = 0
> #3  0x0000000000431749 in __vlc_thread_join (p_this=0xad4800,
>     psz_file=0x6a283c "src/playlist/playlist.c", i_line=130)
>     at src/misc/threads.c:716
>         i_ret = 1
> #4  0x000000000040ee1a in playlist_Destroy (p_playlist=0xad4800)
> ---Type <return> to continue, or q <return> to quit---
>     at src/playlist/playlist.c:130
> No locals.
> #5  0x000000000040c400 in VLC_CleanUp (i_object=0) at src/libvlc.c:831
>         p_intf = (intf_thread_t *) 0xad4800
>         p_playlist = (playlist_t *) 0xad4800
>         p_vout = (vout_thread_t *) 0xad4800
>         p_aout = (aout_instance_t *) 0xad4800
>         p_announce = (announce_handler_t *) 0xad4800
>         p_vlc = (vlc_t *) 0x94d400
> #6  0x0000000000407415 in main (i_argc=1, ppsz_argv=0x7fffffffe940)
>     at src/vlc.c:108
>         i_ret = 0
>
> and here is a full trace of slapd:
>
> (gdb) bt full
> #0  0x000000080142e914 in kse_release () at kse_release.S:2
> No locals.
> #1  0x0000000801428e49 in kse_wait (kse=0x62a000, td_wait=0x0,
> sigseqno=0)
>     at /usr/src/lib/libpthread/thread/thr_kern.c:1843
>         ts = {tv_sec = 7647232, tv_nsec = 7647232}
>         ts_sleep = {tv_sec = 60, tv_nsec = 0}
>         saved_flags = 0
> #2  0x0000000801427078 in kse_sched_multi (kmbx=0x62efa0)
>     at /usr/src/lib/libpthread/thread/thr_kern.c:1039
>         curkse = (struct kse *) 0x62a000
>         curthread = (struct pthread *) 0x0
>         td_wait = (struct pthread *) 0x62a068
>         curframe = (struct pthread_sigframe *) 0x17f
>         ret = 383
> #3  0x000000080142afbf in _amd64_enter_uts ()
>     at /usr/src/lib/libpthread/arch/amd64/amd64/enter_uts.S:40
> No locals.
> #4  0x0000000000000000 in ?? ()
> No symbol table info available.
> #5  0x000000000062f000 in ?? ()
> No symbol table info available.
> #6  0x000000000062a000 in ?? ()
> No symbol table info available.
> ---Type <return> to continue, or q <return> to quit---
> #7  0x0000000000000000 in ?? ()
> No symbol table info available.
> #8  0x0000000000000000 in ?? ()
> No symbol table info available.
> #9  0x0000000000000000 in ?? ()
> No symbol table info available.
> #10 0x0000000000000000 in ?? ()
> No symbol table info available.
> #11 0x0000000000000000 in ?? ()
> No symbol table info available.
> #12 0x0000000000000001 in ?? ()
> No symbol table info available.
> #13 0x0000000801426dd0 in _thr_sched_switch_unlocked ()
>     at /usr/src/lib/libpthread/thread/thr_kern.c:904
>         free_kseq = {tqh_first = 0x0, tqh_last = 0x801534810}
>         gc_ksegq = {tqh_first = 0x0, tqh_last = 0x801534840}
>         next_uniqueid = 7
>         active_kse_groupq = {tqh_first = 0x62f100, tqh_last = 0x748020}
>         active_kse_count = 2
>         free_threadq = {tqh_first = 0x0, tqh_last = 0x801534890}
>         free_kse_count = 0
>         active_kseq = {tqh_first = 0x62a000, tqh_last = 0x6c9220}
>         free_kse_groupq = {tqh_first = 0x0, tqh_last = 0x801534820}
> ---Type <return> to continue, or q <return> to quit---
>         kse_lock = {l_head = 0x6291c0, l_tail = 0x6291c0,
>           l_type = LCK_ADAPTIVE, l_wait = 0x801426150 <_kse_lock_wait>,
>           l_wakeup = 0x8014261e0 <_kse_lock_wakeup>}
>         active_kseg_count = 2
>         inited = 1
>         free_thread_count = 0
>         free_kseg_count = 0
>         thr_hashtable = {{lh_first = 0x0} , {
>             lh_first = 0x6c3c00}, {lh_first = 0x0}, {lh_first = 0x0}, {
>             lh_first = 0x0}, {lh_first = 0x0}, {lh_first = 0x1874400}, {
>             lh_first = 0x0}, {lh_first = 0x0}, {lh_first = 0x0}, {lh_first =
> 0x0}, {
>             lh_first = 0x74b000}, {lh_first = 0x0} , {
>             lh_first = 0x74b800}, {lh_first = 0x0}, {lh_first = 0x0}, {
>             lh_first = 0x0}, {lh_first = 0x0}, {lh_first = 0x0}, {lh_first =
> 0x0}, {
>             lh_first = 0x0}, {lh_first = 0x0}, {lh_first = 0x632000}, {
>             lh_first = 0x0} , {lh_first = 0x29ab400}, {
>             lh_first = 0x0}, {lh_first = 0x0}, {lh_first = 0x0}, {lh_first =
> 0x0}, {
>             lh_first = 0x0}, {lh_first = 0x2983c00}, {
>             lh_first = 0x0} }
>         thread_lock = {l_head = 0x6291e0, l_tail = 0x6291e0,
>           l_type = LCK_ADAPTIVE, l_wait = 0x801426150 <_kse_lock_wait>,
>           l_wakeup = 0x8014261e0 <_kse_lock_wakeup>}
>         _tcb_mutex = 0x628380
> Previous frame inner to this frame (corrupt stack?)
>
> which looks like total garbage. Looking at each thread I see that there
> is only a thread 1,2, and 3:
>
> (gdb) thread 1
> [Switching to thread 1 (Thread 6 (LWP 100177))]#0 0x000000080142e914 in
> kse_release () at kse_release.S:2
> 2 RSYSCALL(kse_release)
> (gdb) bt
> #0  0x000000080142e914 in kse_release () at kse_release.S:2
> #1  0x000000080141d926 in sig_daemon (arg=0x7fffffefef70)
>     at /usr/src/lib/libpthread/thread/thr_sig.c:216
> #2  0x0000000801426db5 in kse_sched_single (kmbx=0x7fffffefef70)
>     at /usr/src/lib/libpthread/thread/thr_kern.c:902
>
> (gdb) thread 2
> [Switching to thread 2 (Thread 7 (sleeping))]#0
> _thr_sched_switch_unlocked (
>     curthread=0x632000) at pthread_md.h:226
> 226 if (ret == 0) {
> Current language: auto; currently c
> (gdb) bt
> #0  _thr_sched_switch_unlocked (curthread=0x632000) at pthread_md.h:226
> #1  0x00000008014265e0 in _thr_sched_switch (curthread=0x632000)
>     at /usr/src/lib/libpthread/thread/thr_kern.c:607
> #2  0x0000000801419cb4 in _pthread_join (pthread=0x74b000,
> thread_return=0x0)
>     at /usr/src/lib/libpthread/thread/thr_join.c:133
> #3  0x0000000800719d09 in ldap_pvt_thread_join (thread=0x800609070,
>     thread_return=0x62a068) at thr_posix.c:165
>
> (gdb) thread 3
> [Switching to thread 3 (LWP 100148)]#0 0x000000080142e914 in
> kse_release ()
>     at kse_release.S:2
> 2 RSYSCALL(kse_release)
> Current language: auto; currently asm
> (gdb) bt
> #0  0x000000080142e914 in kse_release () at kse_release.S:2
> #1  0x0000000801428e49 in kse_wait (kse=0x62a000, td_wait=0x0,
> sigseqno=0)
>     at /usr/src/lib/libpthread/thread/thr_kern.c:1843
> #2  0x0000000801427078 in kse_sched_multi (kmbx=0x62efa0)
>     at /usr/src/lib/libpthread/thread/thr_kern.c:1039
> #3  0x000000080142afbf in _amd64_enter_uts ()
>     at /usr/src/lib/libpthread/arch/amd64/amd64/enter_uts.S:40
>
>
> >
> > > (gdb) bt
> > > #0  0x000000080142e914 in kse_release () at kse_release.S:2
> > > #1  0x0000000801428e49 in kse_wait (kse=0x62a000, td_wait=0x0,
> > > sigseqno=0)
> > >     at /usr/src/lib/libpthread/thread/thr_kern.c:1843
> > > #2  0x0000000801427078 in kse_sched_multi (kmbx=0x62efa0)
> > >     at /usr/src/lib/libpthread/thread/thr_kern.c:1039
> > > #3  0x000000080142afbf in _amd64_enter_uts ()
> > >     at /usr/src/lib/libpthread/arch/amd64/amd64/enter_uts.S:40
> > > #4  0x0000000000000000 in ?? ()
> > > #5  0x000000000062f000 in ?? ()
> > > #6  0x000000000062a000 in ?? ()
> > > #7  0x0000000000000000 in ?? ()
> > > #8  0x0000000000000000 in ?? ()
> > > #9  0x0000000000000000 in ?? ()
> > > #10 0x0000000000000000 in ?? ()
> > > #11 0x0000000000000000 in ?? ()
> > > #12 0x0000000000000001 in ?? ()
> > > #13 0x0000000801426dd0 in _thr_sched_switch_unlocked ()
> > >     at /usr/src/lib/libpthread/thread/thr_kern.c:904
> > > Previous frame inner to this frame (corrupt stack?)
> > >
> > >
> >
>

-- 
Doug White                    |  FreeBSD: The Power to Serve
dwhite@gumbysoft.com          |  www.FreeBSD.org
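
A note on the kill -0 point made earlier in the thread: signal 0 is a probe, not a real signal. The kernel runs the usual existence and permission checks and then delivers nothing, so the target process never sees it. Below is a minimal Python sketch of the same check. The helper name probe_pid is hypothetical and is not taken from rc.subr, slapd, or vlc.

```python
import os


def probe_pid(pid: int) -> str:
    """Report whether a signal could be delivered to pid, without sending one."""
    try:
        # Signal 0 performs error checking only; nothing is delivered.
        os.kill(pid, 0)
    except ProcessLookupError:
        # ESRCH: no process with this pid exists.
        return "no such process"
    except PermissionError:
        # EPERM: the process exists but we may not signal it.
        return "exists, but not signalable by this user"
    return "exists and is signalable"


if __name__ == "__main__":
    # Probing our own pid should always succeed.
    print(probe_pid(os.getpid()))
```

A process parked in pthread_join, as in the slapd and vlc traces above, still exists, so a probe like this keeps succeeding for it. That would be consistent with a stop script that polls with kill -0 waiting indefinitely, although nothing in this thread confirms that this is what rc.subr does here.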