From owner-freebsd-stable@FreeBSD.ORG  Thu Dec  7 02:22:43 2006
Return-Path: <owner-freebsd-stable@FreeBSD.ORG>
X-Original-To: freebsd-stable@freebsd.org
Delivered-To: freebsd-stable@freebsd.org
Received: from localhost.my.domain (localhost [127.0.0.1])
	by hub.freebsd.org (Postfix) with ESMTP id 2D62516A412;
	Thu,  7 Dec 2006 02:22:43 +0000 (UTC)
	(envelope-from davidxu@freebsd.org)
From: David Xu <davidxu@freebsd.org>
To: freebsd-stable@freebsd.org
Date: Thu, 7 Dec 2006 10:22:37 +0800
User-Agent: KMail/1.8.2
References: <20061113084430.GE59604@dimma.mow.oilspace.com>
	<20061116102436.GN32700@FreeBSD.org>
	<20061116111525.GO32700@FreeBSD.org>
In-Reply-To: <20061116111525.GO32700@FreeBSD.org>
MIME-Version: 1.0
Content-Type: text/plain;
  charset="koi8-r"
Content-Transfer-Encoding: 7bit
Content-Disposition: inline
Message-Id: <200612071022.37930.davidxu@freebsd.org>
Cc: stable@freebsd.org, Gleb Smirnoff <glebius@freebsd.org>
Subject: Re: RELENG_6 panic under heavy load

On Thursday 16 November 2006 19:15, Gleb Smirnoff wrote:
> On Thu, Nov 16, 2006 at 01:24:36PM +0300, Gleb Smirnoff wrote:
> T>   I wonder why UMA was suspected to be the problem. Dima gave
> T> me access to the core. Here are more details from the trace:
>
> It looks like a race between two threads in one process. Look here:
>
> (kgdb) frame 12
> #12 0xd05f4fc1 in _mtx_lock_sleep (m=0xd5dd5498, tid=3583683968, opts=0,
>     file=0x12 <Address 0x12 out of bounds>, line=18)
>     at /usr/src/sys/kern/kern_mutex.c:579
> 579             turnstile_wait(&m->mtx_object, mtx_owner(m));
> (kgdb) p *m
> $10 = {mtx_object = {lo_class = 0xd084e224, lo_name = 0xd080508c "process lock",
>     lo_type = 0xd080508c "process lock", lo_flags = 4390912, lo_list = {
>       tqe_next = 0xd5dd56b0, tqe_prev = 0xd5dd5290}, lo_witness = 0xd088a100},
>   mtx_lock = 3611674882, mtx_recurse = 0}
> (kgdb) p ((struct thread *)tid)
> $15 = (struct thread *) 0xd59aad80
> (kgdb) p ((struct thread *)(m->mtx_lock & ~(0x1 | 0x2)))
> $17 = (struct thread *) 0xd745c900
> (kgdb) p ((struct thread *)(m->mtx_lock & ~(0x1 | 0x2)))->td_proc
> $18 = (struct proc *) 0xd5dd5430
> (kgdb) p ((struct thread *)tid)->td_proc
> $19 = (struct proc *) 0xd5dd5430
>
> So, we see that one thread blocks on the lock that is held by
> another thread of the same process. Here they are:
>
> * 134 Thread 100198 (PID=47872: nagios)  doadump () at pcpu.h:165
>   133 Thread 100147 (PID=47872: nagios)  sched_switch (td=0xd745c900,
>     newtd=0xd51f7a80, flags=2) at /usr/src/sys/kern/sched_4bsd.c:980
>
> Let's look at the second one:
>
> (kgdb) thread 133
> [Switching to thread 133 (Thread 100147)]#0  sched_switch (td=0xd745c900,
>     newtd=0xd51f7a80, flags=2) at /usr/src/sys/kern/sched_4bsd.c:980
> 980             sched_lock.mtx_lock = (uintptr_t)td;
> (kgdb) bt
> #0  sched_switch (td=0xd745c900, newtd=0xd51f7a80, flags=2)
>     at /usr/src/sys/kern/sched_4bsd.c:980
> #1  0xd0607f46 in mi_switch (flags=2, newtd=0x0)
>     at /usr/src/sys/kern/kern_synch.c:420
> #2  0xd0615ecf in maybe_preempt_in_ksegrp (td=0xd59aad80) at kern_switch.c:467
> #3  0xd06160c8 in setrunqueue (td=0xd59aad80, flags=0) at kern_switch.c:585
> #4  0xd06151e7 in sched_wakeup (td=0xd59aad80)
>     at /usr/src/sys/kern/sched_4bsd.c:996
> #5  0xd0608025 in setrunnable (td=0xd59aad80)
>     at /usr/src/sys/kern/kern_synch.c:483
> #6  0xd060d78e in thread_unsuspend_one (td=0xd59aad80)
>     at /usr/src/sys/kern/kern_thread.c:972
> #7  0xd060d584 in thread_suspend_check (return_instead=0)
>     at /usr/src/sys/kern/kern_thread.c:935
> #8  0xd0628a88 in userret (td=0xd745c900, frame=0xf5dd4d38, oticks=1)
>     at /usr/src/sys/kern/subr_trap.c:116
> #9  0xd07a6e16 in syscall (frame=
>       {tf_fs = 134938683, tf_es = 59, tf_ds = -809566149, tf_edi = 134997504,
>        tf_esi = 134998528, tf_ebp = -813707944, tf_isp = -170046108,
>        tf_ebx = 672261300, tf_edx = 0, tf_ecx = 134969072, tf_eax = 1,
>        tf_trapno = 0, tf_err = 2, tf_eip = 672832335, tf_cs = 51,
>        tf_eflags = 646, tf_esp = -813707972, tf_ss = 59})
>     at /usr/src/sys/i386/i386/trap.c:1034
> #10 0xd078f38f in Xint0x80_syscall ()
>     at /usr/src/sys/i386/i386/exception.s:200

maybe_preempt_in_ksegrp() is broken: it should perform the same kind of
checks that maybe_preempt() does, because some special cases should prevent
preemption. I believe this will not be a problem on -CURRENT after Julian's
ksegrp removal yesterday. The problem is not in the thread suspension code.
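
For anyone following the quoted kgdb session, here is a small sketch of how
the owner was recovered from the lock word with
m->mtx_lock & ~(0x1 | 0x2). It assumes only what the session shows: the two
low bits are flag bits and the rest is the owning thread's address. The
macro and function names are made up for illustration and are not the
kernel's own, and uint32_t is used because the dump is from i386.

```c
#include <stdint.h>

/*
 * Assumption: 0x1 and 0x2 are flag bits kept in the low bits of the lock
 * word, as implied by the mask used in the kgdb session. These names are
 * illustrative only, not the kernel's definitions.
 */
#define LOCK_FLAG_BIT0	UINT32_C(0x1)
#define LOCK_FLAG_BIT1	UINT32_C(0x2)
#define LOCK_FLAG_MASK	(LOCK_FLAG_BIT0 | LOCK_FLAG_BIT1)

/* Address of the owning thread: the lock word with the flag bits cleared. */
static inline uint32_t
lock_owner(uint32_t lock_word)
{
	return (lock_word & ~LOCK_FLAG_MASK);
}

/* Only the flag bits of the lock word. */
static inline uint32_t
lock_flags(uint32_t lock_word)
{
	return (lock_word & LOCK_FLAG_MASK);
}
```

With mtx_lock = 3611674882 (0xd745c902) from the dump, lock_owner() gives
0xd745c900, which matches $17 and the td argument of sched_switch() in
thread 133, while the blocked thread's tid 3583683968 is 0xd59aad80. The
remaining flag bit is 0x2.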

Regards,
David Xu