From owner-freebsd-stable@FreeBSD.ORG Mon Mar 7 03:28:32 2005
Message-ID: <20050306212830.A877@denninger.net>
Date: Sun, 6 Mar 2005 21:28:30 -0600
From: Karl Denninger
To: stable@freebsd.org
Organization: Karl's Sushi and Packet Smashers
Subject: Caution - possible system instability on attempted fix for "WRITE_ERROR" problem (see enclosed)
List-Id: Production branch of FreeBSD source code

Hi folks;

This may be the wrong place, given what I did, but I wanted to give a
"heads up" here given the impending release of 5.4-RELEASE.

This refers to http://www.freebsd.org/cgi/query-pr.cgi?pr=77643

In an attempt to mitigate this, I saw the following commit in the CVS logs:

mdodd       2005-03-02 04:01:37 UTC

  FreeBSD src repository

  Modified files:
    sys/dev/ata          ata-queue.c
  Log:
  When resubmitting a timed out request, reset donecount.

  Submitted by:   Nate Lawson

  Revision  Changes    Path
  1.42      +1 -0      src/sys/dev/ata/ata-queue.c

Is this change supposed to be "safe" against a 5.4-PRERELEASE kernel from
today (CVSupped about 1700 CST)?

If it is supposed to be, it's NOT!

It DOES fix the failure to requeue timed out requests, but it also provokes
radical destabilization of the interrupt system in the kernel (e.g. receive
serial interrupts "disappear", etc.), leading eventually to a panic.

BTW, it appears to fix the requeue problem with disks, and with this in, a
disk that takes a timeout (but is actually working) does not disconnect
from a GEOM mirror; the retried write succeeds. However, for obvious
reasons the kernel instability that results from the retried write is not
acceptable :)

Don't know if this is germane to what is about to show up in 5.4-RELEASE,
but if it is, this urgently needs to be looked at.

Needless to say I've backed this one out. Will also put this against the
PR to dissuade others from trying the same thing...

--
--
Karl Denninger (karl@denninger.net) Internet Consultant & Kids Rights Activist
http://www.denninger.net My home on the net - links to everything I do!
http://scubaforum.org Your UNCENSORED place to talk about DIVING!
http://www.spamcuda.net SPAM FREE mailboxes - FREE FOR A LIMITED TIME!
http://genesis3.blogspot.com Musings Of A Sentient Mind