From owner-freebsd-current@FreeBSD.ORG Mon Jun 27 19:44:02 2011
Sender: mdf356@gmail.com
In-Reply-To: <4E08568E.4060309@FreeBSD.org>
References: <4E05F582.2010500@FreeBSD.org> <6C42CE07-9298-444A-8094-9C60384CA4F1@bsdimp.com> <4E08568E.4060309@FreeBSD.org>
Date: Mon, 27 Jun 2011 12:19:47 -0700
From: mdf@FreeBSD.org
To: Andriy Gapon
Content-Type: text/plain;
 charset=ISO-8859-1
Cc: freebsd-current@freebsd.org, freebsd-stable@freebsd.org, Warner Losh
Subject: Re: kern.sync_on_panic
List-Id: Discussions about the use of FreeBSD-current

On Mon, Jun 27, 2011 at 3:08 AM, Andriy Gapon wrote:
> on 26/06/2011 08:51 Warner Losh said the following:
>>
>> On Jun 25, 2011, at 8:49 AM, Andriy Gapon wrote:
>>> Does anybody actually use kern.sync_on_panic tunable/sysctl? If yes, then
>>> in what circumstances do you need it? That is, why doesn't any other
>>> alternative work for you? Like:
>>> 1. remounting filesystems R/O before panic if you knowingly provoke it
>>>    for testing
>>> 2. using netboot for your test system
>>> 3. using su+j, gjournal or a different filesystem altogether
>>> 4. using fsck after reboot
>>>
>>> It seems to me that syncing filesystems in panic context is an adventure.
>>> And it may become even more of an adventure if we introduce code that
>>> completely stops the scheduler in and after panic.
>>
>> I've used it in the past when I was developing a device driver that was in
>> the late stages of maturing. Since all the panics in the system were when
>> the driver dereferenced NULL in that driver, sync was safe because all the
>> data structures were sane except the aforementioned driver.
>>
>> (1) It was a production system, and everything that could be was already
>> mounted r/o. However, some small, but very critical, amount of data was
>> still r/w and it was very important to not lose this data. Production here
>> likely should be in quotes, because it was in the late stages of
>> testing/validation.
>> The problem was that without this, sometimes the saved state
>> of the GPS receiver and other hardware would wind up being zero, which meant
>> that we'd have to do a cold start which cost us a few hours of time. At the
>> time I was doing this, we saw zero files a couple times a day without this
>> turned on.
>> (2) netbooting wasn't an option since we were qualifying a
>> non-netbooting system.
>> (3) these weren't available at the time, but the goal
>> was to prevent data loss, not to necessarily have to avoid fsck on boot.
>> (4) Data loss without it.
>>
>> Now, I'll be the first to admit this has been a few years, and I haven't done
>> a fresh evaluation to see if things are still safe. I'll also be the first
>> to admit that this was a useful debugging setting late in development, and
>> not in production. I'm also the first to admit this isn't what I'd call a
>> very widespread case. But it did come in very handy when chasing a few bugs
>> to be able to do 10 panic/reboot cycles an hour rather than 2 a day.
>
> A fine enough use-case for me. I guess the problem ultimately boiled down to
> peculiarities of UFS behavior, but still...
> However, please be aware that sync_on_panic might get broken when/if we start
> stopping the scheduler in panic.

The entirety of the sync code should be a subroutine in vfs_bio.c so
the 'buf' variable is static to the file. At that point it would be
reasonable to explicitly call it at the beginning of panic(9) for the
sync-on-panic case, either before IPIing the other CPUs, or at least
before entering the critical section that prevents the scheduler from
running.

Cheers,
matthew
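
To make the ordering suggested above concrete, here is a rough userland model, not kernel code. Every name in it (model_sync_on_panic, model_panic, model_stop_other_cpus, model_stop_scheduler, dirty_bufs, the event log) is hypothetical and invented for illustration; none of them are FreeBSD APIs. It only shows the shape of the idea: the sync routine lives next to its file-local buffer state, and the panic path calls it first, before the other CPUs and the scheduler are stopped.

```c
/*
 * Userland model only. All identifiers here are made up for this sketch
 * and do not correspond to real FreeBSD kernel functions.
 */
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

#define MAX_EVENTS 8

/* Event log so the ordering of the modelled panic path can be inspected. */
const char *events[MAX_EVENTS];
size_t nevents;

static void
record(const char *what)
{
	assert(nevents < MAX_EVENTS);
	events[nevents++] = what;
}

/* ---- stands in for vfs_bio.c ---- */

/* File-local state, analogous to keeping 'buf' static to the file. */
static int dirty_bufs = 3;

/* Hypothetical helper: flush dirty buffers and return how many remain. */
int
model_sync_on_panic(void)
{
	record("sync");
	dirty_bufs = 0;		/* pretend every buffer was written out */
	return (dirty_bufs);
}

/* ---- stands in for the panic path ---- */

bool sync_on_panic = true;	/* models the kern.sync_on_panic knob */

static void
model_stop_other_cpus(void)
{
	record("stop_cpus");
}

static void
model_stop_scheduler(void)
{
	record("stop_scheduler");
}

void
model_panic(void)
{
	/* Sync first, while the rest of the system is still running. */
	if (sync_on_panic)
		(void)model_sync_on_panic();
	model_stop_other_cpus();
	model_stop_scheduler();
	record("halt");
}
```

Calling model_panic() once with the knob enabled records "sync" ahead of "stop_cpus" and "stop_scheduler", which is the ordering the paragraph above argues for. Whether a real sync is safe at that point still depends on the state of the system at panic time, which this model does not attempt to capture.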