From owner-freebsd-fs@FreeBSD.ORG Sun Jan 9 11:49:30 2011
Message-ID: <4D29A0C7.8050002@fsn.hu>
Date: Sun, 09 Jan 2011 12:49:27 +0100
From: Attila Nagy
To: Martin Matuska
Cc: freebsd-fs@FreeBSD.org, freebsd-stable@FreeBSD.org
References: <4D0A09AF.3040005@FreeBSD.org> <4D297943.1040507@fsn.hu>
In-Reply-To: <4D297943.1040507@fsn.hu>
Subject: Re: New ZFSv28 patchset for 8-STABLE

On 01/09/2011 10:00 AM, Attila Nagy wrote:
> On 12/16/2010 01:44 PM, Martin Matuska wrote:
>> Hi everyone,
>>
>> following the announcement of Pawel Jakub Dawidek (pjd@FreeBSD.org) I am
>> providing a ZFSv28 testing patch for 8-STABLE.
>>
>> Link to the patch:
>>
>> http://people.freebsd.org/~mm/patches/zfs/v28/stable-8-zfsv28-20101215.patch.xz
>>
>>
> I've got an IO hang with dedup enabled (not sure it's related, I've
> started to rewrite all data on pool, which makes a heavy load):
>
> The processes are in various states:
> 65747 1001      1  54   10 28620K 24360K tx->tx  0   6:58  0.00% cvsup
> 80383 1001      1  54   10 40616K 30196K select  1   5:38  0.00% rsync
>  1501 www       1  44    0  7304K  2504K zio->i  0   2:09  0.00% nginx
>  1479 www       1  44    0  7304K  2416K zio->i  1   2:03  0.00% nginx
>  1477 www       1  44    0  7304K  2664K zio->i  0   2:02  0.00% nginx
>  1487 www       1  44    0  7304K  2376K zio->i  0   1:40  0.00% nginx
>  1490 www       1  44    0  7304K  1852K zfs     0   1:30  0.00% nginx
>  1486 www       1  44    0  7304K  2400K zfsvfs  1   1:05  0.00% nginx
>
> And everything which wants to touch the pool is/becomes dead.
>
> Procstat says about one process:
> # procstat -k 1497
>   PID    TID COMM             TDNAME           KSTACK
>  1497 100257 nginx            -                mi_switch sleepq_wait
> __lockmgr_args vop_stdlock VOP_LOCK1_APV null_lock VOP_LOCK1_APV
> _vn_lock nullfs_root lookup namei vn_open_cred kern_openat
> syscallenter syscall Xfast_syscall

No, it's not related. One of the disks in the RAIDZ2 pool went bad:

(da4:arcmsr0:0:4:0): READ(6). CDB: 8 0 2 10 10 0
(da4:arcmsr0:0:4:0): CAM status: SCSI Status Error
(da4:arcmsr0:0:4:0): SCSI status: Check Condition
(da4:arcmsr0:0:4:0): SCSI sense: MEDIUM ERROR asc:11,0 (Unrecovered read error)

and it seems it froze the whole zpool. Removing the disk by hand solved
the problem.
I've seen this previously on other machines with ciss.
I wonder why ZFS didn't throw it out of the pool.
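
For anyone who wants to spot a disk like this before the pool wedges, here is a rough sketch that tallies SCSI sense errors per device from kernel messages in the format shown above. The function name count_sense_errors and the regular expression are my own assumptions, written against the four CAM lines in this mail only; other drivers or sense formats may print differently.

```python
import re
from collections import Counter

# Assumed format, taken from the CAM lines quoted in this mail, e.g.:
# (da4:arcmsr0:0:4:0): SCSI sense: MEDIUM ERROR asc:11,0 (Unrecovered read error)
SENSE_RE = re.compile(
    r"^\((?P<dev>[a-z]+\d+):[^)]*\): SCSI sense: (?P<key>[A-Z ]+?) asc:"
)


def count_sense_errors(lines):
    """Return a Counter keyed by (device, sense key) for SCSI sense lines."""
    counts = Counter()
    for line in lines:
        m = SENSE_RE.match(line.strip())
        if m:
            counts[(m.group("dev"), m.group("key"))] += 1
    return counts


# The kernel messages from this report, used as sample input.
sample = [
    "(da4:arcmsr0:0:4:0): READ(6). CDB: 8 0 2 10 10 0",
    "(da4:arcmsr0:0:4:0): CAM status: SCSI Status Error",
    "(da4:arcmsr0:0:4:0): SCSI status: Check Condition",
    "(da4:arcmsr0:0:4:0): SCSI sense: MEDIUM ERROR asc:11,0 (Unrecovered read error)",
]

counts = count_sense_errors(sample)
for (dev, key), n in sorted(counts.items()):
    print(f"{dev}: {key} x{n}")
```

Feeding it the system log instead of the sample list would show which device keeps reporting medium errors; it only reads log text and does nothing to the pool itself.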