From: Alexander Motin
Date: Sun, 24 Jul 2011 01:08:52 +0300
To: freebsd-scsi@freebsd.org
Subject: No retries after periph invalidation?
Message-ID: <4E2B4674.8070605@FreeBSD.org>

Hi.

I've simulated a real-world device failure condition in which a SATA disk still reports its presence but doesn't respond to any command. I've found that, because of multiple command retries, each of which causes a 30s timeout, a bus reset and another retry/requeue, it may take ages to eventually drop the failed device. The odd thing is that those retries continue even after XPT has considered the device lost and invalidated it.

I've made a patch (http://people.freebsd.org/~mav/periph_noretry.patch) for cam_periph_error() to block any retries after the periph has been marked as invalid. With that patch all activity completes in 1-2 minutes, just after the several timeouts required to consider the device lost.

Can this approach be considered correct?

-- 
Alexander Motin
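
For illustration only, the rule described above (no further retries once the periph has been marked invalid) can be sketched as a small standalone C model. This is not the contents of the patch and does not use the real CAM structures: `model_periph`, `model_request`, `MODEL_PERIPH_INVALID` and `model_should_retry()` are hypothetical names invented for this sketch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for illustration; not the real CAM definitions. */
#define MODEL_PERIPH_INVALID 0x01u

struct model_periph {
	unsigned flags;		/* MODEL_PERIPH_INVALID once the device is considered lost */
};

struct model_request {
	int retries_left;	/* remaining retry budget for this command */
};

/*
 * Decide whether a failed command should be retried.
 * Returns true and consumes one retry if a retry is allowed;
 * returns false if the error should go back to the caller at once.
 */
static bool
model_should_retry(const struct model_periph *periph, struct model_request *req)
{
	assert(periph != NULL && req != NULL);

	/* The proposed rule: never retry on an invalidated periph. */
	if (periph->flags & MODEL_PERIPH_INVALID)
		return (false);

	/* Otherwise fall back to the ordinary retry budget. */
	if (req->retries_left <= 0)
		return (false);

	req->retries_left--;
	return (true);
}
```

In this model the invalid check comes before the retry counter is touched, so a command failing on a lost device returns its error immediately instead of waiting through another timeout and bus reset for every remaining retry.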