From: Attilio Rao
To: John Baldwin
Cc: freebsd-current@freebsd.org, Konstantin Belousov, Andriy Gapon
Date: Fri, 2 Dec 2011 21:20:47 +0100
Subject: Re: Stop scheduler on panic
In-Reply-To: <4ED91B8D.2080808@FreeBSD.org>

2011/12/2 John Baldwin:
> On 12/2/11 12:18 PM, Attilio Rao wrote:
>>
>> 2011/12/2 John Baldwin:
>>>
>>> On 12/2/11 5:05 AM, Andriy Gapon wrote:
>>>>
>>>> on 02/12/2011 06:36 John Baldwin said the following:
>>>>>
>>>>> Ah, ok (I had thought SCHEDULER_STOPPED was going to always be true
>>>>> when kdb was active).  But I think these two changes should cover
>>>>> critical_exit() ok.
>>>>
>>>> I attempted to start a discussion about this a few times already :-)
>>>> Should we treat kdb context the same as SCHEDULER_STOPPED context (in
>>>> the current definition)?  That is, skip all locks in the same fashion?
>>>> There are pros and cons.
>>>
>>> kdb should not block on locks, no.  Most debugger commands should not go
>>> near locks anyway unless they are intended to carefully modify the
>>> existing system in a safe manner (such as the 'kill' command which
>>> should only be using try locks and fail if it cannot safely post the
>>> signal).
>>
>> The biggest problem with treating KDB the same as panic is that doing a
>> proper 'continue' is impossible.
>> One of the features of the 'skip-locking' path is that it doesn't take
>> into account fast locking paths, where the lock sometimes succeeds and
>> sometimes fails and you don't know which. Also the restarted CPUs can
>> find corrupted data (as it can be arbitrarily updated); I'm sure it is
>> far too panic prone.
>
> Yes, my thought is that kdb commands, etc. should be using dedicated
> routines that do not use locks whenever possible.
> The problem of a user calling an arbitrary routine is not solvable (so I
> don't think we should try to solve that, you use 'call' at your own
> risk), but built-in commands should explicitly either 1) not use
> locking, or 2) only use try locks and fail out cleanly (including
> dropping any try locks acquired) if a try fails.  Now, that's an ideal
> view, I don't know how close we are to that in practice or if it is a
> realistically attainable goal.

So you are not in favor of giving KDB its own context?
There are some drawbacks (for example, bugs involving the scheduler or
the switching mechanism, though for those we could provide a facility
like KDB_LITE for debugging scheduler problems), but in general that
would avoid duplicating code just to avoid the locking.

If you don't want to give KDB its own context, we should work on a KPI
(or similar) that defines which routines may serve as KDB commands, tries
to keep things under control, etc.

Attilio

-- 
Peace can only be achieved by understanding - A. Einstein
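
For illustration, the "only use try locks and fail out cleanly (including
dropping any try locks acquired)" rule quoted in this message could look
roughly like the sketch below. It is a userspace analogy only: pthread
mutexes stand in for kernel locks, and every name in it (dbg_cmd_update,
dbg_counter, release, lock_a, lock_b, shared_counter) is hypothetical. It
is not taken from FreeBSD's kdb/ddb code and is not a proposal for the KPI
discussed here.

```c
/*
 * Userspace analogy of a debugger-style command that never blocks.
 * All identifiers are hypothetical; pthread mutexes stand in for
 * kernel locks.
 */
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>

static pthread_mutex_t lock_a = PTHREAD_MUTEX_INITIALIZER;
static pthread_mutex_t lock_b = PTHREAD_MUTEX_INITIALIZER;
static int shared_counter;

/* Release a mutex that the caller is known to hold. */
static void
release(pthread_mutex_t *m)
{
	int rc = pthread_mutex_unlock(m);

	assert(rc == 0);
	(void)rc;	/* keep the variable used when NDEBUG is defined */
}

/*
 * Apply 'delta' to the shared state only if both locks can be taken
 * without waiting.  Returns true on success.  On failure, every lock
 * acquired so far is dropped and the state is left untouched.
 */
bool
dbg_cmd_update(int delta)
{
	if (pthread_mutex_trylock(&lock_a) != 0)
		return (false);		/* nothing held yet, just fail */

	if (pthread_mutex_trylock(&lock_b) != 0) {
		release(&lock_a);	/* drop the try lock we did get */
		return (false);
	}

	shared_counter += delta;	/* both locks held: safe to modify */

	release(&lock_b);
	release(&lock_a);
	return (true);
}

/* Read-only accessor used to observe the effect of the command. */
int
dbg_counter(void)
{
	return (shared_counter);
}
```

A caller treats a false return as "could not do this safely right now"
and reports that to the user instead of retrying in a loop, which is the
"fail out cleanly" half of the rule.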