From owner-freebsd-geom@FreeBSD.ORG Sun Dec 20 23:37:55 2009
Delivered-To: freebsd-geom@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34])
	by hub.freebsd.org (Postfix) with ESMTP id 3CDD81065676;
	Sun, 20 Dec 2009 23:37:55 +0000 (UTC)
	(envelope-from me@johnea.net)
Received: from mail.johnea.net (johnea.net [70.167.123.7])
	by mx1.freebsd.org (Postfix) with ESMTP id 225898FC13;
	Sun, 20 Dec 2009 23:37:54 +0000 (UTC)
Received: from localhost.localdomain (vhost.johnea.net [192.168.100.239])
	(using TLSv1 with cipher DHE-RSA-AES128-SHA (128/128 bits))
	(No client certificate requested)
	by mail.johnea.net (Postfix) with ESMTPSA id DAF7D73F1846;
	Sun, 20 Dec 2009 14:59:24 -0800 (PST)
Date: Sun, 20 Dec 2009 15:14:53 -0800
From: johnea
To: freebsd-geom@freebsd.org
Message-ID: <20091220151453.3dbe738e@johnea.net>
Organization: very little
X-Mailer: Claws Mail 3.7.3 (GTK+ 2.18.3; x86_64-unknown-linux-gnu)
Mime-Version: 1.0
Content-Type: text/plain; charset=US-ASCII
Content-Transfer-Encoding: 7bit
Subject: errors resynchronizing mirror with new disk
X-BeenThere: freebsd-geom@freebsd.org
X-Mailman-Version: 2.1.5
Precedence: list
List-Id: GEOM-specific discussions and implementations
X-List-Received-Date: Sun, 20 Dec 2009 23:37:55 -0000

Hello,

Recently a drive started to have READ_DMA failures and drop out of the gmirror.

While synching the replacement drive to the mirror, a WRITE error occurred and terminated the sync.

A second attempt to sync completed but left the drive generating READ errors.

The third time the drive was inserted in the mirror the operation completed and no further errors have occurred.

R'ingTFM and extensive scroogling have not yielded an explanation.

Is this expected behaviour? Should I suspect this drive?
(it's brand new, but the "old" one was only 9 months)

Additionally smartctl on the new drive continues to indicate no errors.
(it did indicate each of the read errors on the old drive)

Below are the relevant syslog excerpts.

Thank You!

johnea

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

#1 The 3rd OLD disk read failure:

Dec 17 03:03:23 atom kernel: ad4: FAILURE - READ_DMA status=51 error=40 LBA=9787039
Dec 17 03:03:23 atom kernel: GEOM_MIRROR: Request failed (error=5). ad4[READ(offset=5010963968, length=16384)]
Dec 17 03:03:23 atom kernel: GEOM_MIRROR: Device gm0: provider ad4 disconnected.
Dec 17 03:24:02 atom smartd[41810]: Device: /dev/ad4, 1 Currently unreadable (pending) sectors
Dec 17 03:24:02 atom smartd[41810]: Device: /dev/ad4, 1 Offline uncorrectable sectors
Dec 17 03:24:02 atom smartd[41810]: Device: /dev/ad4, ATA error count increased from 2 to 3
Dec 17 03:54:02 atom smartd[41810]: Device: /dev/ad4, 1 Currently unreadable (pending) sectors
Dec 17 03:54:02 atom smartd[41810]: Device: /dev/ad4, 1 Offline uncorrectable sectors
Dec 17 04:24:02 atom smartd[41810]: Device: /dev/ad4, 1 Currently unreadable (pending) sectors
Dec 17 04:24:02 atom smartd[41810]: Device: /dev/ad4, 1 Offline uncorrectable sectors

Last 2 lines repeating every 30 minutes...

#2 Rebooting after removing OLD disk and installing NEW disk:

Dec 17 18:00:14 atom kernel: GEOM_MIRROR: Force device gm0 start due to timeout.
Dec 17 18:00:14 atom kernel: Root mount waiting for: GMIRROR
Dec 17 18:00:14 atom kernel: GEOM_MIRROR: Device mirror/gm0 launched (1/2).
Dec 17 18:00:14 atom kernel: Trying to mount root from ufs:/dev/mirror/gm0s1a

#3 Write Error after issuing 'gmirror forget gm0; gmirror insert gm0 /dev/ad4' on NEW disk:

Dec 17 18:14:59 atom kernel: GEOM_MIRROR: Device gm0: rebuilding provider ad4.
...
Dec 17 19:41:33 atom kernel: ad4: WARNING - WRITE_DMA48 UDMA ICRC error (retrying request) LBA=955081984
Dec 17 19:41:39 atom kernel: ad4: TIMEOUT - WRITE_DMA48 retrying (0 retries left) LBA=955081984
Dec 17 19:41:45 atom kernel: ad4: FAILURE - WRITE_DMA48 timed out LBA=955081984
Dec 17 19:41:45 atom kernel: GEOM_MIRROR: Synchronization request failed (error=5). ad4[WRITE(offset=489001975808, length=131072)]
Dec 17 19:41:45 atom kernel: GEOM_MIRROR: Device gm0: provider ad4 disconnected.
Dec 17 19:41:45 atom kernel: GEOM_MIRROR: Device gm0: rebuilding provider ad4 stopped.

#4 Read Errors after 2nd 'gmirror forget gm0; gmirror insert gm0 /dev/ad4' on NEW disk:

Dec 17 19:56:54 atom kernel: GEOM_MIRROR: Device gm0: rebuilding provider ad4.
Dec 17 23:16:26 atom kernel: GEOM_MIRROR: Device gm0: rebuilding provider ad4 finished.
Dec 18 01:00:42 atom smartd[724]: Device: /dev/ad4, 4 Currently unreadable (pending) sectors
Dec 18 01:00:43 atom smartd[724]: Device: /dev/ad4, 4 Offline uncorrectable sectors
Dec 18 01:30:42 atom smartd[724]: Device: /dev/ad4, 4 Currently unreadable (pending) sectors
Dec 18 01:30:42 atom smartd[724]: Device: /dev/ad4, 4 Offline uncorrectable sectors
Dec 18 02:00:42 atom smartd[724]: Device: /dev/ad4, 8 Currently unreadable (pending) sectors
Dec 18 02:00:42 atom smartd[724]: Device: /dev/ad4, 8 Offline uncorrectable sectors
Dec 18 02:30:42 atom smartd[724]: Device: /dev/ad4, 8 Currently unreadable (pending) sectors
Dec 18 02:30:42 atom smartd[724]: Device: /dev/ad4, 8 Offline uncorrectable sectors

#5 Log entries around issuing 'gmirror remove gm0 ad4; gmirror insert gm0 ad4':

Dec 19 10:00:42 atom smartd[724]: Device: /dev/ad4, 8 Currently unreadable (pending) sectors
Dec 19 10:00:42 atom smartd[724]: Device: /dev/ad4, 8 Offline uncorrectable sectors
Dec 19 10:30:41 atom smartd[724]: Device: /dev/ad4, 8 Currently unreadable (pending) sectors
Dec 19 10:30:41 atom smartd[724]: Device: /dev/ad4, 8 Offline uncorrectable sectors
Dec 19 10:44:00 atom kernel: GEOM_MIRROR: Device gm0: provider ad4 destroyed.
Dec 19 10:44:17 atom kernel: GEOM_MIRROR: Device gm0: rebuilding provider ad4.
Dec 19 11:00:42 atom smartd[724]: Device: /dev/ad4, 8 Currently unreadable (pending) sectors
Dec 19 11:00:42 atom smartd[724]: Device: /dev/ad4, 8 Offline uncorrectable sectors
Dec 19 14:05:24 atom kernel: GEOM_MIRROR: Device gm0: rebuilding provider ad4 finished.

From owner-freebsd-geom@FreeBSD.ORG Mon Dec 21 11:06:56 2009
Delivered-To: freebsd-geom@FreeBSD.org
Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34])
	by hub.freebsd.org (Postfix) with ESMTP id 051E010656A3;
	Mon, 21 Dec 2009 11:06:56 +0000 (UTC)
	(envelope-from owner-bugmaster@FreeBSD.org)
Received: from freefall.freebsd.org (freefall.freebsd.org [IPv6:2001:4f8:fff6::28])
	by mx1.freebsd.org (Postfix) with ESMTP id DC6488FC26;
	Mon, 21 Dec 2009 11:06:55 +0000 (UTC)
Received: from freefall.freebsd.org (localhost [127.0.0.1])
	by freefall.freebsd.org (8.14.3/8.14.3) with ESMTP id nBLB6tBg004089;
	Mon, 21 Dec 2009 11:06:55 GMT
	(envelope-from owner-bugmaster@FreeBSD.org)
Received: (from gnats@localhost)
	by freefall.freebsd.org (8.14.3/8.14.3/Submit) id nBLB6tdp004087
	for freebsd-geom@FreeBSD.org; Mon, 21 Dec 2009 11:06:55 GMT
	(envelope-from owner-bugmaster@FreeBSD.org)
Date: Mon, 21 Dec 2009 11:06:55 GMT
Message-Id: <200912211106.nBLB6tdp004087@freefall.freebsd.org>
X-Authentication-Warning: freefall.freebsd.org: gnats set sender to
	owner-bugmaster@FreeBSD.org using -f
From: FreeBSD bugmaster
To: freebsd-geom@FreeBSD.org
Subject: Current problem reports assigned to freebsd-geom@FreeBSD.org
X-List-Received-Date: Mon, 21 Dec 2009 11:06:56 -0000

Note: to view an individual PR, use:
  http://www.freebsd.org/cgi/query-pr.cgi?pr=(number).

The following is a listing of current problems submitted by FreeBSD users.
These represent problem reports covering all versions including
experimental development code and obsolete releases.

S Tracker      Resp.      Description
--------------------------------------------------------------------------------
o kern/141740  geom       [geom] gjournal(8): g_journal_destroy concurrent error
o kern/141011  geom       [geli] Encrypted root, geli password at boot; enter ke
o kern/140352  geom       [geom] gjournal + glabel not working
o kern/139847  geom       [geom_mbr] load/unload causes system to hang
o kern/135898  geom       [geom] Severe filesystem corruption - large files or l
o kern/134922  geom       [gmirror] [panic] kernel panic when use fdisk on disk
o kern/134113  geom       [geli] Problem setting secondary GELI key
o kern/134044  geom       [geom] gmirror(8) overwrites fs with stale data from r
o kern/133931  geom       [geli] [request] intentionally wrong password to destr
o bin/132845   geom       [geom] [patch] ggated(8) does not close files opened a
o kern/132273  geom       glabel(8): [patch] failing on journaled partition
f kern/132242  geom       [gmirror] gmirror.ko fails to fully initialize
o kern/131353  geom       [geom] gjournal(8) kernel lock
p docs/130548  geom       [patch] gjournal(8) man page is missing sysctls
o kern/129674  geom       [geom] gjournal root did not mount on boot
o kern/129645  geom       gjournal(8): GEOM_JOURNAL causes system to fail to boo
o kern/129245  geom       [geom] gcache is more suitable for suffix based provid
f kern/128276  geom       [gmirror] machine lock up when gmirror module is used
f kern/126902  geom       [geom] geom_label: kernel panic during install boot
o kern/124973  geom       [gjournal] [patch] boot order affects geom_journal con
o kern/124969  geom       gvinum(8): gvinum raid5 plex does not detect missing s
f kern/124294  geom       [geom] gmirror(8) have inappropriate logic when workin
o kern/123962  geom       [panic] [gjournal] gjournal (455Gb data, 8Gb journal),
o kern/123122  geom       [geom] GEOM / gjournal kernel lock
o kern/122738  geom       [geom] gmirror list "losts consumers" after gmirror de
f kern/122415  geom       [geom] UFS labels are being constantly created and rem
o kern/122067  geom       [geom] [panic] Geom crashed during boot
o kern/121559  geom       [patch] [geom] geom label class allows to create inacc
o kern/121364  geom       [gmirror] Removing all providers create a "zombie" mir
o kern/120091  geom       [geom] [geli] [gjournal] geli does not prompt for pass
o kern/120021  geom       [geom] [panic] net-p2p/qbittorrent crashes system when
o kern/119743  geom       [geom] geom label for cds is keeped after dismount and
o kern/115856  geom       [geli] ZFS thought it was degraded when it should have
o kern/115547  geom       [geom] [patch] [request] let GEOM Eli get password fro
o kern/114532  geom       [geom] GEOM_MIRROR shows up in kldstat even if compile
o kern/113957  geom       [gmirror] gmirror is intermittently reporting a degrad
o kern/113837  geom       [geom] unable to access 1024 sector size storage
o kern/113419  geom       [geom] geom fox multipathing not failing back
p bin/110705   geom       gmirror(8) control utility does not exit with correct
o kern/107707  geom       [geom] [patch] [request] add new class geom_xbox360 to
o kern/104389  geom       [geom] [patch] sys/geom/geom_dump.c doesn't encode XML
o kern/98034   geom       [geom] dereference of NULL pointer in acd_geom_detach
o kern/94632   geom       [geom] Kernel output resets input while GELI asks for
o kern/90582   geom       [geom] [panic] Restore cause panic string (ffs_blkfree
o bin/90093    geom       fdisk(8) incapable of altering in-core geometry
a kern/89660   geom       [vinum] [patch] [panic] due to g_malloc returning null
o kern/89546   geom       [geom] GEOM error
o kern/88601   geom       [geli] geli cause kernel panic under heavy disk usage
o kern/87544   geom       [gbde] mmaping large files on a gbde filesystem deadlo
o kern/84556   geom       [geom] [panic] GBDE-encrypted swap causes panic at shu
o kern/79251   geom       [2TB] newfs fails on 2.6TB gbde device
o kern/79035   geom       [vinum] gvinum unable to create a striped set of mirro
o bin/78131    geom       gbde(8) "destroy" not working.
s kern/73177   geom       kldload geom_* causes panic due to memory exhaustion

54 problems total.

From owner-freebsd-geom@FreeBSD.ORG Mon Dec 21 11:46:27 2009
Delivered-To: freebsd-geom@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34])
	by hub.freebsd.org (Postfix) with ESMTP id D97ED1065670;
	Mon, 21 Dec 2009 11:46:27 +0000 (UTC)
	(envelope-from gcubfg-freebsd-geom@m.gmane.org)
Received: from lo.gmane.org (lo.gmane.org [80.91.229.12])
	by mx1.freebsd.org (Postfix) with ESMTP id 96BE58FC13;
	Mon, 21 Dec 2009 11:46:27 +0000 (UTC)
Received: from list by lo.gmane.org with local (Exim 4.50)
	id 1NMgiT-0001u6-ND
	for freebsd-geom@freebsd.org; Mon, 21 Dec 2009 12:46:21 +0100
Received: from lara.cc.fer.hr ([161.53.72.113])
	by main.gmane.org with esmtp (Gmexim 0.1 (Debian))
	id 1AlnuQ-0007hv-00; Mon, 21 Dec 2009 12:46:21 +0100
Received: from ivoras by lara.cc.fer.hr with local (Gmexim 0.1 (Debian))
	id 1AlnuQ-0007hv-00; Mon, 21 Dec 2009 12:46:21 +0100
X-Injected-Via-Gmane: http://gmane.org/
To: freebsd-geom@freebsd.org
From: Ivan Voras
Date: Mon, 21 Dec 2009 12:46:01 +0100
Lines: 20
References: <20091220151453.3dbe738e@johnea.net>
Mime-Version: 1.0
Content-Type: text/plain; charset=UTF-8; format=flowed
Content-Transfer-Encoding: 7bit
X-Complaints-To: usenet@ger.gmane.org
X-Gmane-NNTP-Posting-Host: lara.cc.fer.hr
User-Agent: Thunderbird 2.0.0.23 (X11/20091210)
In-Reply-To: <20091220151453.3dbe738e@johnea.net>
Sender: news
Subject: Re: errors resynchronizing mirror with new disk
X-List-Received-Date: Mon, 21 Dec 2009 11:46:27 -0000

johnea wrote:
> Hello,
>
> Recently a drive started to have READ_DMA failures and drop out of the gmirror.
>
> While synching the replacement drive to the mirror, a WRITE error occurred and terminated the sync.
>
> A second attempt to sync completed but left the drive generating READ errors.
>
> The third time the drive was inserted in the mirror the operation completed and no further errors have occurred.
>
> R'ingTFM and extensive scroogling have not yielded an explanation.
>
> Is this expected behaviour? Should I suspect this drive?
> (it's brand new, but the "old" one was only 9 months)

Yes. Probably either the drives or the power unit.

> Additionally smartctl on the new drive continues to indicate no errors.
> (it did indicate each of the read errors on the old drive)