From owner-svn-src-all@FreeBSD.ORG Tue Sep 21 17:38:41 2010 Return-Path: Delivered-To: svn-src-all@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id CC38010656A5; Tue, 21 Sep 2010 17:38:41 +0000 (UTC) (envelope-from mdf356@gmail.com) Received: from mail-iw0-f182.google.com (mail-iw0-f182.google.com [209.85.214.182]) by mx1.freebsd.org (Postfix) with ESMTP id 58C5B8FC25; Tue, 21 Sep 2010 17:38:41 +0000 (UTC) Received: by iwn34 with SMTP id 34so6322645iwn.13 for ; Tue, 21 Sep 2010 10:38:40 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=gamma; h=domainkey-signature:mime-version:received:sender:received :in-reply-to:references:date:x-google-sender-auth:message-id:subject :from:to:cc:content-type:content-transfer-encoding; bh=U7S3vy8S9gEmALgq6XSFvWLqrgbz7/jwWLBHg1BElXo=; b=TWHszlaRkK7SpKr2ueMK3ImgnOga4BuoIPUbIrgRJZ3vNWUlhf+h72VlWnVpC2H2YD HHjJrkcbXS/Er3i76qq51KJ3JbEqN445Q7BxER/1G8ppoGJcKg+oj/0ufdzCTbtOLCqY pVh1UjVp1dh8cW+/KDBfS7v/3zftVqMmSHV3o= DomainKey-Signature: a=rsa-sha1; c=nofws; d=gmail.com; s=gamma; h=mime-version:sender:in-reply-to:references:date :x-google-sender-auth:message-id:subject:from:to:cc:content-type :content-transfer-encoding; b=L/x2ko9UaxFYOSv2jH0PLmD5jk8Z3EJMGP12PdPoy2WxjDB6pHLplg8+XJraMI4otG Eg4fhVeFQUQpRLiuzd1l6ibsTaecThHLJJ7cNLQOEpp1zWedtwCHXvN+dngfafUzyyU0 nrtrSJBo0Ck6BAIgwDi5L7u8wIKtVTOXmPFxQ= MIME-Version: 1.0 Received: by 10.231.184.156 with SMTP id ck28mr10375567ibb.168.1285090716913; Tue, 21 Sep 2010 10:38:36 -0700 (PDT) Sender: mdf356@gmail.com Received: by 10.231.187.71 with HTTP; Tue, 21 Sep 2010 10:38:36 -0700 (PDT) In-Reply-To: <201009211250.40704.jhb@freebsd.org> References: <201009211507.o8LF7iVv097676@svn.freebsd.org> <4C98D200.4040909@freebsd.org> <201009211250.40704.jhb@freebsd.org> Date: Tue, 21 Sep 2010 10:38:36 -0700 X-Google-Sender-Auth: faHs1tBdIilmf2hw9A8SvZ-h2AI Message-ID: From: mdf@FreeBSD.org To: John Baldwin Content-Type: text/plain; charset=ISO-8859-1 Content-Transfer-Encoding: quoted-printable Cc: svn-src-head@freebsd.org, svn-src-all@freebsd.org, src-committers@freebsd.org, Andriy Gapon Subject: Re: svn commit: r212964 - head/sys/kern X-BeenThere: svn-src-all@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: "SVN commit messages for the entire src tree \(except for " user" and " projects" \)" List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Tue, 21 Sep 2010 17:38:41 -0000 On Tue, Sep 21, 2010 at 9:50 AM, John Baldwin wrote: > On Tuesday, September 21, 2010 12:31:01 pm mdf@freebsd.org wrote: >> On Tue, Sep 21, 2010 at 8:40 AM, Andriy Gapon wrote: >> > on 21/09/2010 18:27 Andriy Gapon said the following: >> >> on 21/09/2010 18:17 mdf@FreeBSD.org said the following: >> >>> >> >>> I'd recommend using stack_print_ddb(), as that avoids any locking >> >>> which may hang depending on how the kernel panic'd. >> >> >> >> It seems that stack_print_ddb() depends on DDB? >> > >> > But the point about locking is very good. >> > How do you suggest we can deal with it? >> > >> > A dirty hack would be to check panicstr in linker_search_symbol_name a= nd avoid >> > locking, but I don't like that at all. >> > Perhaps, some code in subr_stack.c could be taken outside DDB ifdef? >> >> I keep forgetting, but actually _mtx_lock_sleep() will just return if >> panicstr is set. =A0_mtx_assert() is similarly guarded, so actually it >> should be mostly okay. > > Err, I don't think _mtx_lock_sleep() is guarded in that fashion? =A0I hav= e an > old patch to do that but have never committed it. =A0If we want that we s= hould > probably change rwlocks and sxlocks to have also not block when panicstr = is > set. D'oh! Once again I was looking at Isilon source but not CURRENT. We had patches for mtx (back on 6.1) and haven't updated them for sx/rw for 7. We're also running with local patches to stop CPUs in panic() that Mr Jacob has a copy of. Regarding the original issue, then, I think in the short term using a safe stack_print() would be the correct thing. Changing the locking and stop_cpus logic will not happen in the next week. :-) Thanks, matthew > > --- //depot/vendor/freebsd/src/sys/kern/kern_mutex.c =A0 =A02010-05-11 18= :31:25.000000000 0000 > +++ //depot/projects/smpng/sys/kern/kern_mutex.c =A0 =A0 =A0 =A02010-06-0= 1 20:12:02.000000000 0000 > @@ -348,6 +348,15 @@ > =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0return; > =A0 =A0 =A0 =A0} > > + =A0 =A0 =A0 /* > + =A0 =A0 =A0 =A0* If we have already panic'd and this is the thread that= called > + =A0 =A0 =A0 =A0* panic(), then don't block on any mutexes but silently = succeed. > + =A0 =A0 =A0 =A0* Otherwise, the kernel will deadlock since the schedule= r isn't > + =A0 =A0 =A0 =A0* going to run the thread that holds the lock we need. > + =A0 =A0 =A0 =A0*/ > + =A0 =A0 =A0 if (panicstr !=3D NULL && curthread->td_flags & TDF_INPANIC= ) > + =A0 =A0 =A0 =A0 =A0 =A0 =A0 return; > + > =A0 =A0 =A0 =A0lock_profile_obtain_lock_failed(&m->lock_object, > =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0&contested, &waittime); > =A0 =A0 =A0 =A0if (LOCK_LOG_TEST(&m->lock_object, opts)) > @@ -664,6 +673,15 @@ > =A0 =A0 =A0 =A0} > > =A0 =A0 =A0 =A0/* > + =A0 =A0 =A0 =A0* If we failed to unlock this lock and we are a thread t= hat has > + =A0 =A0 =A0 =A0* called panic(), it may be due to the bypass in _mtx_lo= ck_sleep() > + =A0 =A0 =A0 =A0* above. =A0In that case, just return and leave the lock= alone to > + =A0 =A0 =A0 =A0* avoid changing the state. > + =A0 =A0 =A0 =A0*/ > + =A0 =A0 =A0 if (panicstr !=3D NULL && curthread->td_flags & TDF_INPANIC= ) > + =A0 =A0 =A0 =A0 =A0 =A0 =A0 return; > + > + =A0 =A0 =A0 /* > =A0 =A0 =A0 =A0 * We have to lock the chain before the turnstile so this = turnstile > =A0 =A0 =A0 =A0 * can be removed from the hash list if it is empty. > =A0 =A0 =A0 =A0 */ > > -- > John Baldwin >