Subject: Re: ZFS L2ARC checksum errors after compression
To: lev@FreeBSD.org, freebsd-fs
References: <921575537.20161029143626@serebryakov.spb.ru>
From: Andriy Gapon
Message-ID: <3dae7691-fcd1-b3b9-445c-b81d6f0cdc52@FreeBSD.org>
Date: Sat, 29 Oct 2016 16:32:44 +0300
In-Reply-To: <921575537.20161029143626@serebryakov.spb.ru>

On 29/10/2016 14:36, Lev Serebryakov wrote:
> Hello freebsd-fs,
>
> System is FreeBSD 10.3-STABLE #0 r307523: Mon Oct 17 22:36:27 MSK 2016.
>
> I have a small L2ARC (185G) on SSD for my RAIDZ1 pool.
>
> When "ALLOC" on this L2ARC becomes greater than "SIZE" (it is compression
> works, am I right?), zfs-stats shows, that number of checkum errors start
> to raise. For example, I have this "zfs-stats -L" output now:
>
> L2 ARC Summary: (DEGRADED)
>         Passed Headroom:                        153.46k
>         Tried Lock Failures:                    9.65k
>         IO In Progress:                         4.33k
>         Low Memory Aborts:                      9
>         Free on Write:                          1.77k
>         Writes While Full:                      15.20k
>         R/W Clashes:                            0
>         Bad Checksums:                          104.95k
>         IO Errors:                              0
>         SPA Mismatch:                           4.10m
>
>
> And "Bad Checksums" goes up rather fast, it becomes 105.31k when I compose
> this message!
>
> Looks like here is some problems with L2ARC compression.
>

I think that a recent upstream change, compressed ARC support, reintroduced an
old problem that was fixed a while ago.
It would be great if you could test this patch:

Index: sys/cddl/contrib/opensolaris/uts/common/fs/zfs/arc.c
===================================================================
--- sys/cddl/contrib/opensolaris/uts/common/fs/zfs/arc.c	(revision 308050)
+++ sys/cddl/contrib/opensolaris/uts/common/fs/zfs/arc.c	(working copy)
@@ -7028,7 +7028,22 @@ l2arc_write_buffers(spa_t *spa, l2arc_dev_t *dev,
 				continue;
 			}
 
-			if ((write_asize + HDR_GET_LSIZE(hdr)) > target_sz) {
+			/*
+			 * We rely on the L1 portion of the header below, so
+			 * it's invalid for this header to have been evicted out
+			 * of the ghost cache, prior to being written out. The
+			 * ARC_FLAG_L2_WRITING bit ensures this won't happen.
+			 */
+			ASSERT(HDR_HAS_L1HDR(hdr));
+
+			ASSERT3U(HDR_GET_PSIZE(hdr), >, 0);
+			ASSERT3P(hdr->b_l1hdr.b_pdata, !=, NULL);
+			ASSERT3U(arc_hdr_size(hdr), >, 0);
+			uint64_t size = arc_hdr_size(hdr);
+			uint64_t asize = vdev_psize_to_asize(dev->l2ad_vdev,
+			    size);
+
+			if ((write_asize + asize) > target_sz) {
 				full = B_TRUE;
 				mutex_exit(hash_lock);
 				ARCSTAT_BUMP(arcstat_l2_write_full);
@@ -7063,21 +7078,6 @@ l2arc_write_buffers(spa_t *spa, l2arc_dev_t *dev,
 			list_insert_head(&dev->l2ad_buflist, hdr);
 			mutex_exit(&dev->l2ad_mtx);
 
-			/*
-			 * We rely on the L1 portion of the header below, so
-			 * it's invalid for this header to have been evicted out
-			 * of the ghost cache, prior to being written out. The
-			 * ARC_FLAG_L2_WRITING bit ensures this won't happen.
-			 */
-			ASSERT(HDR_HAS_L1HDR(hdr));
-
-			ASSERT3U(HDR_GET_PSIZE(hdr), >, 0);
-			ASSERT3P(hdr->b_l1hdr.b_pdata, !=, NULL);
-			ASSERT3U(arc_hdr_size(hdr), >, 0);
-			uint64_t size = arc_hdr_size(hdr);
-			uint64_t asize = vdev_psize_to_asize(dev->l2ad_vdev,
-			    size);
-
 			(void) refcount_add_many(&dev->l2ad_alloc, size, hdr);
 
 			/*

-- 
Andriy Gapon