From: Dmitry Morozovsky
To: freebsd-stable@FreeBSD.org
Date: Tue, 14 Apr 2015 00:32:40 +0300 (MSK)
Subject: [GEOM] Disk IO error when resyncing gmirror -> massive hang in D state

Dear colleagues,

unfortunately, the machine in question is in production, so I have no clear reproduction case. I do have console logs, however.

Prerequisites:
- fairly recent stable/10, amd64, SuperMicro MicroCloud 1150, X10SLD-F/HF
- su+j ufs2 on top of a gmirror of two SATA Toshiba drives
- one disk died some time ago, so the gmirror has been running in degraded state

Trouble:
- inserted a new drive, labelled it, started gmirror resync
- apparently the remaining drive also has read issues:

(ada0:ahcich1:0:0:0): READ_FPDMA_QUEUED. ACB: 60 00 10 b2 c3 40 01 00 00 01 00 00
(ada0:ahcich1:0:0:0): CAM status: ATA Status Error
(ada0:ahcich1:0:0:0): ATA status: 41 (DRDY ERR), error: 40 (UNC )
(ada0:ahcich1:0:0:0): RES: 41 40 04 b3 c3 40 01 00 00 00 01
(ada0:ahcich1:0:0:0): Error 5, Retries exhausted
GEOM_MIRROR: Request failed (error=5). ada0a[READ(offset=6566445056, length=131072)]
GEOM_MIRROR: Synchronization request failed (error=5). mirror/m0a[READ(offset=6566445056, length=131072)]

At this point, all disk I/O requests stall, and with them all cron jobs, syslogd, dhcpd, etc.

The situation reproduced itself at least twice; then, as an emergency measure, the new drive was labelled independently and the data rsynced over.

Any thoughts? Thanks in advance!

-- 
Sincerely,
D.Marck [DM5020, MCK-RIPE, DM3-RIPN]
[ FreeBSD committer: marck@FreeBSD.org ]
------------------------------------------------------------------------
*** Dmitry Morozovsky --- D.Marck --- Wild Woozle --- marck@rinet.ru ***
------------------------------------------------------------------------
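
For reference, the ACB and RES lines in the log above can be decoded into sector addresses, which helps to locate the unreadable sector on ada0. The following is a minimal sketch, not part of the original report. It assumes that the ACB bytes are printed in the order command, features, lba_low, lba_mid, lba_high, device, lba_low_exp, lba_mid_exp, lba_high_exp, features_exp, sector_count, sector_count_exp, that the RES bytes follow the same LBA layout after status and error, and that the drive uses 512-byte logical sectors. The function names are hypothetical.

```python
# Sketch: decode the ATA command (ACB) and result (RES) bytes from the log.
# The byte order and the 512-byte sector size are assumptions, see the text.

SECTOR_SIZE = 512  # assumption: 512-byte logical sectors


def parse_hex_bytes(text):
    """Turn '60 00 10 ...' into a list of integers."""
    return [int(token, 16) for token in text.split()]


def lba48(low, mid, high, low_exp, mid_exp, high_exp):
    """Assemble a 48-bit LBA from its six register bytes."""
    return (low | (mid << 8) | (high << 16)
            | (low_exp << 24) | (mid_exp << 32) | (high_exp << 40))


def decode_acb(text):
    """Return (start_lba, sector_count) for a READ_FPDMA_QUEUED ACB line."""
    b = parse_hex_bytes(text)
    start = lba48(b[2], b[3], b[4], b[6], b[7], b[8])
    # For FPDMA (NCQ) commands the sector count travels in the
    # features / features_exp registers, not in sector_count.
    # (A count of 0, which means 65536 sectors, is not handled here.)
    count = b[1] | (b[9] << 8)
    return start, count


def decode_res(text):
    """Return the LBA reported in a RES line (the failing sector)."""
    b = parse_hex_bytes(text)
    return lba48(b[2], b[3], b[4], b[6], b[7], b[8])


acb = "60 00 10 b2 c3 40 01 00 00 01 00 00"
res = "41 40 04 b3 c3 40 01 00 00 00 01"

start_lba, sectors = decode_acb(acb)
bad_lba = decode_res(res)

print("request start LBA:", start_lba)
print("request length   :", sectors * SECTOR_SIZE, "bytes")
print("failing LBA      :", bad_lba)
print("sectors into req :", bad_lba - start_lba)
```

Under these assumptions the request starts at LBA 29602320 and covers 256 sectors, i.e. 131072 bytes, which matches the length in the GEOM_MIRROR messages and so cross-checks the assumed byte order. The reported failing LBA is 29602564, 244 sectors into the request.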