From owner-freebsd-bugs@FreeBSD.ORG Mon Aug 11 03:10:13 2003
Resent-Date: Mon, 11 Aug 2003 03:10:11 -0700 (PDT)
Resent-Message-Id: <200308111010.h7BAABCR004364@freefall.freebsd.org>
Resent-From: FreeBSD-gnats-submit@FreeBSD.org (GNATS Filer)
Resent-To: freebsd-bugs@FreeBSD.org
Resent-Reply-To: FreeBSD-gnats-submit@FreeBSD.org, Peter Edwards
Message-Id: <200308110956.h7B9uwTJ044582@rocklobster.openet-telecom.lan>
Date: Mon, 11 Aug 2003 10:56:58 +0100 (IST)
From: Peter Edwards
To: FreeBSD-gnats-submit@FreeBSD.org
X-Send-Pr-Version: 3.113
Subject: bin/55457: GDB gets confused debugging libc_r threaded processes.
Reply-To: Peter Edwards
List-Id: Bug reports
X-List-Received-Date: Mon, 11 Aug 2003 10:10:13 -0000

>Number:         55457
>Category:       bin
>Synopsis:       GDB gets confused debugging libc_r threaded processes.
>Confidential:   no
>Severity:       serious
>Priority:       medium
>Responsible:    freebsd-bugs
>State:          open
>Quarter:
>Keywords:
>Date-Required:
>Class:          sw-bug
>Submitter-Id:   current-users
>Arrival-Date:   Mon Aug 11 03:10:10 PDT 2003
>Closed-Date:
>Last-Modified:
>Originator:     Peter Edwards
>Release:        FreeBSD 5.1-CURRENT i386
>Organization:
>Environment:
System: FreeBSD rocklobster 5.1-CURRENT FreeBSD 5.1-CURRENT #8: Wed Jul 9 12:08:31 IST 2003 nfs@archie:/pub/FreeBSD/obj/pub/FreeBSD/current/src/sys/ROCKLOBSTER i386

>Description:
I originally pointed this out on -current in
Message-Id <1060291316.64739.58.camel@rocklobster.openet-telecom.lan>

No one seems interested in commenting on the problem, so I am filing it
here for posterity. Original text follows:

Hi.
This might be of interest to anyone who has tried debugging
multi-threaded programs (of the libc_r variety) with gdb. This has been
bugging me for months, and I finally got frustrated enough to find out
what was going on.

The symptom:
Once you call any function that puts a thread to sleep, the target
process crashes (simple program, 1.c, attached, and a log of gdb killing
it in crash.txt).

The problem:
I traced this to an interaction between gdb and the threads scheduler.
The initial crash comes from gdb adding internal breakpoints in the
"(_)?(sig)?longjmp" functions. This breakpoint gets hit when the thread
scheduler calls "_thread_kern_sched".

After handling the breakpoint, gdb needs to reset the instruction
pointer in the "current thread" to re-run the instruction the breakpoint
was at. However, at that point, gdb's freebsd_uthread_store_registers()
barfs, thinking that the thread in question is not "active", because
it's not in state PS_RUNNING (it's just about to go to sleep). As a
result, it mucks up the resetting of the instruction pointer, because it
thinks it just needs to twiddle the thread's saved context, rather than
the "live" registers. Once the process is resumed, it starts in the
middle of whatever instruction the breakpoint overwrote, generally
fscking things up.

The fix:
I added a couple of "nop"s to "___longjmp", and created a new entrypoint
below them called "___longjmp_raw". This provides a way for the libc_r
library to avoid hitting the gdb breakpoints at sensitive moments. All
other consumers still work the exact same way (modulo the time spent
executing a couple of nops).

The patch is attached, and makes gdb behave perfectly for me. Does
anyone have any comments on this, or ideas on how to improve on it? The
only penalty I can see is the two extra "nop" instructions for normal
longjmps, which I'll gladly trade for a usable debugger.

PS: before anyone suggests it, I initially tried changing
freebsd_uthread.c to check for the active thread more effectively, as is
done in freebsd_uthread_fetch_registers, by comparing it with
"_pthread_run" rather than checking the state. This improved things, but
gdb still got confused, and started stopping unexpectedly when it lost
its breakpoints, etc., so I figured the other approach was probably
going to be more stable.
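The rewind failure described above can be modelled in a few lines of C.
This is only a sketch under stated assumptions, not gdb's code: the
struct and function names (thread_model, bp_insert, cpu_trap, bp_rewind,
resume_offset) are hypothetical, and "text" is just a byte array
standing in for the first instruction of ___longjmp. The one hard fact
it relies on is that the i386 breakpoint instruction (int3, opcode 0xCC)
is one byte long, so after the trap the program counter sits one byte
past the breakpoint address and has to be moved back.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define INT3 0xCC /* one-byte i386 breakpoint opcode */

/* Hypothetical model: the registers the CPU resumes from, and the
 * copy stored in the thread's saved context (its jmp_buf). */
struct thread_model {
    size_t live_pc;
    size_t saved_pc;
};

struct breakpoint {
    size_t addr;        /* offset of the patched byte in text[] */
    uint8_t saved_byte; /* original byte that int3 replaced */
};

/* Debugger plants a breakpoint: remember the byte, write int3. */
void bp_insert(uint8_t *text, struct breakpoint *bp, size_t addr)
{
    assert(text != NULL && bp != NULL);
    bp->addr = addr;
    bp->saved_byte = text[addr];
    text[addr] = INT3;
}

/* CPU executes the int3: the live pc ends up one byte past it. */
void cpu_trap(struct thread_model *t, const struct breakpoint *bp)
{
    t->live_pc = bp->addr + 1;
}

/* Debugger restores the instruction and rewinds the pc. Which copy
 * of the pc it writes depends on whether it believes the thread is
 * the one currently running on the CPU. */
void bp_rewind(uint8_t *text, const struct breakpoint *bp,
               struct thread_model *t, int thinks_active)
{
    text[bp->addr] = bp->saved_byte;
    if (thinks_active)
        t->live_pc = bp->addr;  /* correct: fix the live registers */
    else
        t->saved_pc = bp->addr; /* the bug: live pc is left alone */
}

/* How far into the original instruction execution resumes
 * (0 means a clean restart of that instruction). */
size_t resume_offset(const struct thread_model *t,
                     const struct breakpoint *bp)
{
    return t->live_pc - bp->addr;
}
```

With text[] holding the four bytes of "movl 4(%esp),%edx" (8b 54 24 04),
calling bp_rewind() with thinks_active set to 0 leaves resume_offset()
at 1, i.e. the process restarts at the 0x54 byte in the middle of the
movl. With thinks_active set to 1 the offset is 0 and the instruction
is re-run from its first byte.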
>How-To-Repeat:
Compile this:

#include <pthread.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

/* Sleeping sends the thread through the libc_r scheduler. */
void *threadFunc(void *arg)
{
    sleep(1);
    return 0;
}

int main(int argc, char *argv[])
{
    pthread_t thread;
    void *result;
    int e;

    if ((e = pthread_create(&thread, 0, threadFunc, 0)) != 0) {
        fprintf(stderr, "pthread_create: %s\n", strerror(e));
        exit(-1);
    }
    pthread_join(thread, &result);
    return 0;
}

And do this in gdb:

petere@rocklobster$ gcc -o 1 -g -Wall -pthread 1.c
petere@rocklobster$ gdb ./1
GNU gdb 5.2.1 (FreeBSD)
Copyright 2002 Free Software Foundation, Inc.
GDB is free software, covered by the GNU General Public License, and you are
welcome to change it and/or distribute copies of it under certain conditions.
Type "show copying" to see the conditions.
There is absolutely no warranty for GDB.  Type "show warranty" for details.
This GDB was configured as "i386-undermydesk-freebsd"...
(gdb) b threadFunc
Breakpoint 1 at 0x804861e: file 1.c, line 10.
(gdb) run
Starting program: /local/petere/1

Breakpoint 1, threadFunc (arg=0x0) at 1.c:10
10	    sleep(1);
(gdb) n

Program received signal SIGSEGV, Segmentation fault.
0x280d0138 in _longjmp () from /usr/lib/libc.so.5
(gdb)

>Fix:
This is, of course, for Intel only. I'll create versions for other
architectures if someone cares enough to commit it and doesn't have a
better alternative patch.

A possibly less intrusive patch might be to produce a copy of "longjmp"
for use in the threads scheduler from the libc source, just mangling the
entrypoint name with the preprocessor.

I do figure that avoiding gdb-inserted breakpoints in the scheduler is a
better way of coping with the problems they cause.
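The "mangle the entrypoint name with the preprocessor" alternative
mentioned above could look roughly like the following. This is a
minimal sketch, not the libc source: DEFINE_JUMP_ENTRY, jump_public and
jump_sched are hypothetical names, and the body is a trivial stand-in
for the real register-restoring code. It only mirrors one real detail,
that longjmp turns a value of 0 into 1. The point is that a single
source body yields two symbols at different addresses, so a breakpoint
planted on the public one never fires when the scheduler uses the
private copy.

```c
#include <assert.h>
#include <stddef.h>

/* One body, emitted once per entry-point name. In the real thing the
 * body would be the assembly from _setjmp.S; here it is a stand-in. */
#define DEFINE_JUMP_ENTRY(name)                                  \
    int name(int *saved_slot, int val)                           \
    {                                                            \
        assert(saved_slot != NULL);                              \
        /* like longjmp: a requested value of 0 becomes 1 */     \
        *saved_slot = (val == 0) ? 1 : val;                      \
        return *saved_slot;                                      \
    }

DEFINE_JUMP_ENTRY(jump_public) /* what debuggers would break on */
DEFINE_JUMP_ENTRY(jump_sched)  /* private copy for the scheduler */
```

Compared with the nop-based patch, this duplicates the code instead of
sharing it, but leaves the public entry point byte-for-byte unchanged.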
Index: lib/libc/i386/gen/_setjmp.S
===================================================================
RCS file: /pub/FreeBSD/development/FreeBSD-CVS/src/lib/libc/i386/gen/_setjmp.S,v
retrieving revision 1.16
diff -u -r1.16 _setjmp.S
--- lib/libc/i386/gen/_setjmp.S	23 Mar 2002 02:05:17 -0000	1.16
+++ lib/libc/i386/gen/_setjmp.S	7 Aug 2003 20:42:08 -0000
@@ -66,6 +66,17 @@
 	.weak CNAME(_longjmp)
 	.set CNAME(_longjmp),CNAME(___longjmp)
 ENTRY(___longjmp)
+/*
+ * Debuggers tend to put breakpoints in longjmp, while
+ * threads libraries don't like to be interrupted.
+ * The extra nop for the exposed "_longjmp" stops
+ * ___longjmp getting mucked about with by the debugger
+ * The threads library can then call ___longjmp_raw
+ * with impunity.
+ */
+	nop
+	nop
+ENTRY(___longjmp_raw)
 	movl 4(%esp),%edx
 	movl 8(%esp),%eax
 	movl 0(%edx),%ecx
Index: lib/libc_r/uthread/uthread_kern.c
===================================================================
RCS file: /pub/FreeBSD/development/FreeBSD-CVS/src/lib/libc_r/uthread/uthread_kern.c,v
retrieving revision 1.45
diff -u -r1.45 uthread_kern.c
--- lib/libc_r/uthread/uthread_kern.c	5 Oct 2002 02:22:26 -0000	1.45
+++ lib/libc_r/uthread/uthread_kern.c	7 Aug 2003 20:39:44 -0000
@@ -95,7 +95,7 @@
 	curthread->check_pending = 1;
 
 	/* Switch to the thread scheduler: */
-	___longjmp(_thread_kern_sched_jb, 1);
+	___longjmp_raw(_thread_kern_sched_jb, 1);
 }
 
 
@@ -165,7 +165,7 @@
 		}
 	}
 	/* Switch to the thread scheduler: */
-	___longjmp(_thread_kern_sched_jb, 1);
+	___longjmp_raw(_thread_kern_sched_jb, 1);
 }
 
 void
@@ -582,7 +582,7 @@
 #if NOT_YET
 		_setcontext(&curthread->ctx.uc);
 #else
-		___longjmp(curthread->ctx.jb, 1);
+		___longjmp_raw(curthread->ctx.jb, 1);
 #endif
 		/* This point should not be reached. */
 		PANIC("Thread has returned from sigreturn or longjmp");
>Release-Note:
>Audit-Trail:
>Unformatted: