From owner-freebsd-bugs@FreeBSD.ORG Sat Apr 7 08:10:09 2007
Date: Sat, 7 Apr 2007 08:10:09 GMT
Message-Id: <200704070810.l378A9VU080794@freefall.freebsd.org>
To: freebsd-bugs@FreeBSD.org
From: Kris Kennaway
Subject: Re: kern/111260: FreeBSD kernel dead lock and a solution
Reply-To: Kris Kennaway
List-Id: Bug reports

The following reply was made to PR kern/111260; it has been noted by GNATS.

From: Kris Kennaway
To: Zhouyi Zhou
Cc: Kris Kennaway, freebsd-gnats-submit@FreeBSD.org
Subject: Re: kern/111260: FreeBSD kernel dead lock and a solution
Date: Sat, 7 Apr 2007 04:08:02 -0400

On Sat, Apr 07, 2007 at 03:36:40PM +0800, Zhouyi Zhou wrote:
> Dear Mr Kennaway
> It is sure to go into deadlock with simultaneous tests running after several days. I use FreeBSD's DEBUG_LOCKS option
> with lk_stack to infer where the thread got the lock, and when the thread is not swapped out, I use
> ((struct i386_frame *)(struct thread * (0xc*****))->td_pcb->pcb_ebp)->f_frame->f_frame-> ...... ->f_retaddr
> to infer what led the thread to sleep.
> Besides all of the above, to find the reason that leads to the deadlock, I modified
> sys/stack.h to:
> 32 #define STACK_MAX 50
> 33
> 34 struct sbuf;
> 35
> 36 struct stack {
> 37         int depth;
> 38         vm_offset_t pcs[STACK_MAX];
> 39         vm_offset_t arg0[STACK_MAX];
> 40 };
> and the function stack_save in file i386/i386/db_trace.c
> to save the first argument besides the return address.
> And in the case of tracing a swapped-out thread, I modified the thread struct in sys/proc.h and the msleep function
> in kern/kern_synch.c to save the calling stack when the thread is going to sleep:
> 241 struct thread {
> 242         struct proc *td_proc; /* (*) Associated process. */
> 243         struct ksegrp *td_ksegrp;
> .....
> 327         struct stack td_stack
> 328 }
>
> 118 int
> 119 msleep(ident, mtx, priority, wmesg, timo)
> 120         void *ident;
> 121         struct mtx *mtx;
> 122         int priority, timo;
> 123         const char *wmesg;
> 124 {
> 125         struct thread *td;
> 126         struct proc *p;
> 127         int catch, rval, flags;
> 128         WITNESS_SAVE_DECL(mtx);
> 129
> 130         td = curthread;
> 131         stack_save(td->td_stack);
> It is absolutely evident that it is the
> 462         if (p->p_sysent->sv_copyout_strings)
> 463                 stack_base = (*p->p_sysent->sv_copyout_strings)(imgp);
> in do_execve that leads to the deadlock.

These are your conclusions; I am asking for the stack traces that led you to them so that we can verify your observations.

Kris
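
For reference, the frame-pointer walk that the quoted message describes (following f_frame links and reading f_retaddr at each step) can be sketched in user-space C roughly as follows. This is only an illustration under stated assumptions: struct frame, struct trace, TRACE_MAX and trace_save are hypothetical names invented here, and the code is neither the kernel's stack_save nor the reporter's patch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical stand-in for an i386-style stack frame: the saved frame
 * pointer of the caller followed by the return address. The field names
 * mirror f_frame / f_retaddr from the quoted message; the rest is assumed.
 */
struct frame {
    struct frame *f_frame;   /* caller's frame, NULL at the outermost frame */
    uintptr_t     f_retaddr; /* address this function returns to */
};

#define TRACE_MAX 50 /* same bound as the STACK_MAX shown in the message */

/* Hypothetical trace buffer: one recorded return address per frame. */
struct trace {
    int       depth;
    uintptr_t pcs[TRACE_MAX];
};

/*
 * Follow the saved-frame-pointer chain starting at fp and record each
 * return address, stopping at the end of the chain or when the buffer
 * is full.
 */
static void
trace_save(struct trace *st, const struct frame *fp)
{
    assert(st != NULL);
    st->depth = 0;
    while (fp != NULL && st->depth < TRACE_MAX) {
        st->pcs[st->depth++] = fp->f_retaddr;
        fp = fp->f_frame;
    }
}
```

A trace captured this way, printed one address per line and resolved to symbols, is the kind of raw evidence that lets someone else check a conclusion about where a thread went to sleep.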