From owner-freebsd-bugs@FreeBSD.ORG Fri Oct 8 17:01:04 2004
Message-Id: <200410081650.i98Go6O5015987@harik.murex.com>
Date: Fri, 8 Oct 2004 12:55:36 -0400 (EDT)
From: Mikhail Teterin
To: FreeBSD-gnats-submit@FreeBSD.org
X-Send-Pr-Version: 3.113
X-GNATS-Notify: sos@FreeBSD.org
Subject: kern/72451: Continuing problems with Silicon Image SATA controllers
List-Id: Bug reports

>Number:         72451
>Category:       kern
>Synopsis:       Continuing problems with Silicon Image SATA controllers
>Confidential:   no
>Severity:       serious
>Priority:       medium
>Responsible:    freebsd-bugs
>State:          open
>Quarter:
>Keywords:
>Date-Required:
>Class:          sw-bug
>Submitter-Id:   current-users
>Arrival-Date:   Fri Oct 08 17:00:51 GMT 2004
>Closed-Date:
>Last-Modified:
>Originator:     Mikhail Teterin
>Release:        FreeBSD 5.3-BETA5 amd64
>Organization:
Virtual Estates, Inc.
>Environment:
System: FreeBSD pandora 5.3-BETA5 FreeBSD 5.3-BETA5 #4: Mon Sep 20 16:45:55 EDT 2004 mteterin@pandora:/backup/obj/usr/src/sys/DIOSCURI amd64

Relevant dmesg.boot entries:

    atapci0: port 0x9c00-0x9c0f,0xa000-0xa003,0xa400-0xa407,0xa800-0xa803,0xac00-0xac07 mem 0xff3ff400-0xff3ff7ff irq 17 at device 11.0 on pci3
    ad6: 190782MB [387621/16/63] at ata3-master SATA150

Ident information from the running kernel:

    $FreeBSD: src/sys/dev/ata/ata-all.c,v 1.227 2004/09/16 09:35:01 sos Exp $
    $FreeBSD: src/sys/dev/ata/ata-queue.c,v 1.34 2004/08/27 14:48:32 sos Exp $
    $FreeBSD: src/sys/dev/ata/ata-lowlevel.c,v 1.47 2004/09/03 12:10:44 sos Exp $
    $FreeBSD: src/sys/dev/ata/ata-isa.c,v 1.22 2004/04/30 16:21:34 sos Exp $
    $FreeBSD: src/sys/dev/ata/ata-pci.c,v 1.88 2004/08/20 06:19:25 sos Exp $
    $FreeBSD: src/sys/dev/ata/ata-chipset.c,v 1.88 2004/09/10 10:31:37 sos Exp $
    $FreeBSD: src/sys/dev/ata/ata-dma.c,v 1.131 2004/09/10 10:31:37 sos Exp $
    $FreeBSD: src/sys/dev/ata/ata-disk.c,v 1.177 2004/09/01 12:15:44 sos Exp $
    $FreeBSD: src/sys/dev/ata/atapi-cd.c,v 1.171 2004/08/24 10:39:00 sos Exp $
    $FreeBSD: src/sys/dev/ata/atapi-fd.c,v 1.97 2004/08/05 21:11:33 sos Exp $
>Description:
Under _combined_ disk and CPU load, the following errors start popping up:

    ad6: FAILURE - WRITE_DMA status=51 error=4 LBA=53404031
    ad6: FAILURE - WRITE_DMA status=51 error=4 LBA=54910687
    ad6: FAILURE - WRITE_DMA status=51 error=4 LBA=56806527
    ad6: FAILURE - WRITE_DMA status=51 error=4 LBA=61715903
    ad6: TIMEOUT - WRITE_DMA retrying (2 retries left) LBA=62103999
    ad6: FAILURE - WRITE_DMA status=51 error=4 LBA=176444927
    ad6: FAILURE - WRITE_DMA status=51 error=4 LBA=311594591
    ad6: FAILURE - WRITE_DMA status=51 error=4 LBA=196040671
    ad6: FAILURE - WRITE_DMA status=51 error=4 LBA=306623743

After a while, all disk I/O starts hanging and even a graceful reboot
becomes impossible -- the machine hangs after saying:
"some processes would not die..."

We have already replaced the disk and the cables twice.

Under disk load alone, the problem does not appear -- the box survives
a full run of `iozone -a' without a hitch, for example. But the errors
do appear when we, for example, dump databases onto the disk (over NFS)
and, at the same time, gzip the dump for archiving. They also appear
when a big file is being uploaded with scp over a fast link with ssh
compression enabled.

So it looks like something inside the ata driver is not attended to
fast enough...
>How-To-Repeat:
Run `iozone -a' on a disk, while gzip-ing a big file off of the same
drive.
>Fix:
>Release-Note:
>Audit-Trail:
>Unformatted:
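
For anyone triaging the FAILURE lines quoted in the Description, the status and error values can be read against the standard ATA task-file register bits. The following is only a minimal sketch, not part of the original report: it assumes the driver prints both values in hexadecimal (the report does not say so), the bit names follow common ATA/ATAPI usage (some bits are obsolete or command-dependent in later revisions of the standard), and `decode_bits` and `parse_failure` are hypothetical helper names, not FreeBSD tools.

```python
import re

# ATA status register bits, highest bit first.
STATUS_BITS = {
    0x80: "BSY",   # device busy
    0x40: "DRDY",  # device ready
    0x20: "DF",    # device fault
    0x10: "DSC",   # seek complete
    0x08: "DRQ",   # data request
    0x04: "CORR",  # corrected data
    0x02: "IDX",   # index
    0x01: "ERR",   # error; the error register is valid
}

# ATA error register bits (meaningful only when ERR is set in status).
ERROR_BITS = {
    0x80: "ICRC",  # interface CRC error
    0x40: "UNC",   # uncorrectable data
    0x20: "MC",    # media changed
    0x10: "IDNF",  # sector ID not found
    0x08: "MCR",   # media change request
    0x04: "ABRT",  # command aborted
    0x02: "NM",    # no media
    0x01: "AMNF",  # address mark not found
}

# Matches lines shaped like the FAILURE lines in this report.
# Assumption: status= and error= are hexadecimal, LBA= is decimal.
LINE_RE = re.compile(
    r"^(?P<dev>\w+): FAILURE - (?P<cmd>\w+) "
    r"status=(?P<status>[0-9a-fA-F]+) "
    r"error=(?P<error>[0-9a-fA-F]+) "
    r"LBA=(?P<lba>\d+)"
)


def decode_bits(value, table):
    """Return the names of all bits from `table` that are set in `value`."""
    return [name for mask, name in table.items() if value & mask]


def parse_failure(line):
    """Parse one FAILURE line; return a dict, or None if it does not match."""
    match = LINE_RE.match(line.strip())
    if match is None:
        return None
    status = int(match.group("status"), 16)
    error = int(match.group("error"), 16)
    return {
        "device": match.group("dev"),
        "command": match.group("cmd"),
        "status_flags": decode_bits(status, STATUS_BITS),
        "error_flags": decode_bits(error, ERROR_BITS),
        "lba": int(match.group("lba")),
    }


if __name__ == "__main__":
    sample = "ad6: FAILURE - WRITE_DMA status=51 error=4 LBA=53404031"
    print(parse_failure(sample))
```

Under the hexadecimal assumption, status=51 decodes to DRDY, DSC and ERR, and error=4 decodes to ABRT, i.e. the device reported the WRITE_DMA command as aborted. The TIMEOUT line has a different shape and is deliberately not matched by this sketch.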