From: Ron Dzierwa
To: freebsd-gnats-submit@FreeBSD.org
Date: Mon, 25 Jun 2012 13:56:39 GMT
Message-Id: <201206251356.q5PDudhm066372@red.freebsd.org>
X-Send-Pr-Version: www-3.1
Subject: misc/169398: Can't remove file with permanent error

>Number:         169398
>Category:       misc
>Synopsis:       Can't remove file with permanent error
>Confidential:   no
>Severity:       serious
>Priority:       medium
>Responsible:    freebsd-bugs
>State:          open
>Quarter:
>Keywords:
>Date-Required:
>Class:          sw-bug
>Submitter-Id:   current-users
>Arrival-Date:   Mon Jun 25 14:00:21 UTC 2012
>Closed-Date:
>Last-Modified:
>Originator:     Ron Dzierwa
>Release:        8.2-RELEASE-p6
>Organization:
Innovative Engineering, Inc.
>Environment:
FreeBSD phoenix.hsd1.md.comcast.net 8.2-RELEASE-p6 FreeBSD 8.2-RELEASE-p6 #0: Sat Mar 24 20:42:07 EDT 2012 root@phoenix.hsd1.md.comcast.net:/usr/src/sys/amd64/compile/PHOENIX amd64
>Description:
I am running ZFS filesystem version 4 and storage pool version 15 on a
FreeBSD 8.2-RELEASE amd64 kernel. I have a single 12TB pool on a 3ware
9650 controller with 8 Seagate ST2000DL003 drives in a RAID-5
configuration managed by the controller.

I recently had a connector problem on a disk in the array while running
a performance test that was writing a 1TB pattern file to the array.
When the RAID controller started reporting errors, I stopped the test
and re-seated the connector on the drive. After running a verify on the
RAID, I tried to read the partial pattern file, and ZFS produced copious
checksum error messages on the system console. So I rm'ed the file, and
got even more checksum errors interspersed with several I/O error 86
messages.

Since the rm, ls no longer shows the file, but I ran a scrub just to be
sure the bogus file was gone, and got tons of checksum and I/O error 86
messages.
At the end, zpool status shows:

phoenix# zpool status -v zfsPool
  pool: zfsPool
 state: ONLINE
status: One or more devices has experienced an error resulting in data
        corruption.  Applications may be affected.
action: Restore the file in question if possible.  Otherwise restore the
        entire pool from backup.
   see: http://www.sun.com/msg/ZFS-8000-8A
 scrub: scrub completed after 3h40m with 6353 errors on Fri Jun 22 08:36:36 2012
config:

        NAME        STATE     READ WRITE CKSUM
        zfsPool     ONLINE       0     0 6.20K
          da0       ONLINE       0     0 12.4K

errors: Permanent errors have been detected in the following files:

        zfsPool/raid:<0x9e241>

I have tried "zpool clear", reboot, and "zpool scrub" several times now,
and get a similar set of errors and results each time.

My question is: how do I get rid of this file? It is no longer linked to
a directory entry, and nobody should have it open, since I have rebooted
several times. Yet ZFS still tells me there is a broken file and that I
should replace it. It is most likely the pattern test file that I
deleted, so I don't need it and I don't want to recover it. I would just
like to get rid of it and get my filesystem clean again without
resorting to starting over.

Thanks,
Ron.
>How-To-Repeat:
Not sure. It occurred because of an untimely combination of high usage
and hardware failures.
>Fix:
It was suggested that I either back up or copy the array somewhere and
then copy it back, but the machine is in production and I don't have
enough capacity elsewhere to copy the entire contents. In any case, for
a serious filesystem it should be possible to clean up this file, even
if it has bad links and checksums, without starting over.
>Release-Note:
>Audit-Trail:
>Unformatted: