From owner-svn-src-stable@freebsd.org Wed Oct 3 02:48:33 2018 Return-Path: Delivered-To: svn-src-stable@mailman.ysv.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2610:1c1:1:606c::19:1]) by mailman.ysv.freebsd.org (Postfix) with ESMTP id BFF2110B576A; Wed, 3 Oct 2018 02:48:32 +0000 (UTC) (envelope-from mav@FreeBSD.org) Received: from mxrelay.nyi.freebsd.org (mxrelay.nyi.freebsd.org [IPv6:2610:1c1:1:606c::19:3]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (Client CN "mxrelay.nyi.freebsd.org", Issuer "Let's Encrypt Authority X3" (verified OK)) by mx1.freebsd.org (Postfix) with ESMTPS id 6320288328; Wed, 3 Oct 2018 02:48:32 +0000 (UTC) (envelope-from mav@FreeBSD.org) Received: from repo.freebsd.org (repo.freebsd.org [IPv6:2610:1c1:1:6068::e6a:0]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (Client did not present a certificate) by mxrelay.nyi.freebsd.org (Postfix) with ESMTPS id 3F77810E7A; Wed, 3 Oct 2018 02:48:32 +0000 (UTC) (envelope-from mav@FreeBSD.org) Received: from repo.freebsd.org ([127.0.1.37]) by repo.freebsd.org (8.15.2/8.15.2) with ESMTP id w932mWR7061497; Wed, 3 Oct 2018 02:48:32 GMT (envelope-from mav@FreeBSD.org) Received: (from mav@localhost) by repo.freebsd.org (8.15.2/8.15.2/Submit) id w932mVoU061494; Wed, 3 Oct 2018 02:48:31 GMT (envelope-from mav@FreeBSD.org) Message-Id: <201810030248.w932mVoU061494@repo.freebsd.org> X-Authentication-Warning: repo.freebsd.org: mav set sender to mav@FreeBSD.org using -f From: Alexander Motin Date: Wed, 3 Oct 2018 02:48:31 +0000 (UTC) To: src-committers@freebsd.org, svn-src-all@freebsd.org, svn-src-stable@freebsd.org, svn-src-stable-11@freebsd.org Subject: svn commit: r339114 - stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs X-SVN-Group: stable-11 X-SVN-Commit-Author: mav X-SVN-Commit-Paths: stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs X-SVN-Commit-Revision: 339114 X-SVN-Commit-Repository: base MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit X-BeenThere: svn-src-stable@freebsd.org X-Mailman-Version: 2.1.27 Precedence: list List-Id: SVN commit messages for all the -stable branches of the src tree List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 03 Oct 2018 02:48:33 -0000 Author: mav Date: Wed Oct 3 02:48:31 2018 New Revision: 339114 URL: https://svnweb.freebsd.org/changeset/base/339114 Log: MFC r337025: MFV r337022: 9403 assertion failed in arc_buf_destroy() when concurrently reading block with checksum error This assertion (VERIFY) failure was reported when reading a block. Turns out the problem is that if we get an i/o error (ECKSUM in this case), and there are multiple concurrent ARC reads of the same block (from different clones), then the ARC will put multiple buf's on the same ANON hdr, which isn't supposed to happen, and then causes a panic when we try to arc_buf_destroy() the buf. illumos/illumos-gate@fa98e487a9619b7902f218663be219e787a57dad Reviewed by: George Wilson Reviewed by: Paul Dagnelie Reviewed by: Pavel Zakharov Approved by: Matt Ahrens Author: Matthew Ahrens Modified: stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/arc.c stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/dbuf.c stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/zio_compress.c Directory Properties: stable/11/ (props changed) Modified: stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/arc.c ============================================================================== --- stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/arc.c Wed Oct 3 02:19:17 2018 (r339113) +++ stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/arc.c Wed Oct 3 02:48:31 2018 (r339114) @@ -21,7 +21,7 @@ /* * Copyright (c) 2005, 2010, Oracle and/or its affiliates. All rights reserved. * Copyright (c) 2018, Joyent, Inc. - * Copyright (c) 2011, 2017 by Delphix. All rights reserved. + * Copyright (c) 2011, 2018 by Delphix. All rights reserved. * Copyright (c) 2014 by Saso Kiselkov. All rights reserved. * Copyright 2017 Nexenta Systems, Inc. All rights reserved. */ @@ -5177,12 +5177,13 @@ arc_getbuf_func(zio_t *zio, const zbookmark_phys_t *zb arc_buf_t *buf, void *arg) { arc_buf_t **bufp = arg; - if (buf == NULL) { + ASSERT(zio == NULL || zio->io_error != 0); *bufp = NULL; } else { + ASSERT(zio == NULL || zio->io_error == 0); *bufp = buf; - ASSERT(buf->b_data); + ASSERT(buf->b_data != NULL); } } @@ -5210,6 +5211,7 @@ arc_read_done(zio_t *zio) arc_callback_t *callback_list; arc_callback_t *acb; boolean_t freeable = B_FALSE; + boolean_t no_zio_error = (zio->io_error == 0); /* * The hdr was inserted into hash-table and removed from lists @@ -5235,7 +5237,7 @@ arc_read_done(zio_t *zio) ASSERT3P(hash_lock, !=, NULL); } - if (zio->io_error == 0) { + if (no_zio_error) { /* byteswap if necessary */ if (BP_SHOULD_BYTESWAP(zio->io_bp)) { if (BP_GET_LEVEL(zio->io_bp) > 0) { @@ -5256,8 +5258,7 @@ arc_read_done(zio_t *zio) callback_list = hdr->b_l1hdr.b_acb; ASSERT3P(callback_list, !=, NULL); - if (hash_lock && zio->io_error == 0 && - hdr->b_l1hdr.b_state == arc_anon) { + if (hash_lock && no_zio_error && hdr->b_l1hdr.b_state == arc_anon) { /* * Only call arc_access on anonymous buffers. This is because * if we've issued an I/O for an evicted buffer, we've already @@ -5280,20 +5281,38 @@ arc_read_done(zio_t *zio) callback_cnt++; - if (zio->io_error != 0) - continue; - - int error = arc_buf_alloc_impl(hdr, acb->acb_private, - acb->acb_compressed, - B_TRUE, &acb->acb_buf); - if (error != 0) { - arc_buf_destroy(acb->acb_buf, acb->acb_private); - acb->acb_buf = NULL; + if (no_zio_error) { + int error = arc_buf_alloc_impl(hdr, acb->acb_private, + acb->acb_compressed, zio->io_error == 0, + &acb->acb_buf); + if (error != 0) { + /* + * Decompression failed. Set io_error + * so that when we call acb_done (below), + * we will indicate that the read failed. + * Note that in the unusual case where one + * callback is compressed and another + * uncompressed, we will mark all of them + * as failed, even though the uncompressed + * one can't actually fail. In this case, + * the hdr will not be anonymous, because + * if there are multiple callbacks, it's + * because multiple threads found the same + * arc buf in the hash table. + */ + zio->io_error = error; + } } - - if (zio->io_error == 0) - zio->io_error = error; } + /* + * If there are multiple callbacks, we must have the hash lock, + * because the only way for multiple threads to find this hdr is + * in the hash table. This ensures that if there are multiple + * callbacks, the hdr is not anonymous. If it were anonymous, + * we couldn't use arc_buf_destroy() in the error case below. + */ + ASSERT(callback_cnt < 2 || hash_lock != NULL); + hdr->b_l1hdr.b_acb = NULL; arc_hdr_clear_flags(hdr, ARC_FLAG_IO_IN_PROGRESS); if (callback_cnt == 0) { @@ -5305,7 +5324,7 @@ arc_read_done(zio_t *zio) ASSERT(refcount_is_zero(&hdr->b_l1hdr.b_refcnt) || callback_list != NULL); - if (zio->io_error == 0) { + if (no_zio_error) { arc_hdr_verify(hdr, zio->io_bp); } else { arc_hdr_set_flags(hdr, ARC_FLAG_IO_ERROR); @@ -5338,7 +5357,16 @@ arc_read_done(zio_t *zio) /* execute each callback and free its structure */ while ((acb = callback_list) != NULL) { - if (acb->acb_done) { + if (acb->acb_done != NULL) { + if (zio->io_error != 0 && acb->acb_buf != NULL) { + /* + * If arc_buf_alloc_impl() fails during + * decompression, the buf will still be + * allocated, and needs to be freed here. + */ + arc_buf_destroy(acb->acb_buf, acb->acb_private); + acb->acb_buf = NULL; + } acb->acb_done(zio, &zio->io_bookmark, zio->io_bp, acb->acb_buf, acb->acb_private); } Modified: stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/dbuf.c ============================================================================== --- stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/dbuf.c Wed Oct 3 02:19:17 2018 (r339113) +++ stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/dbuf.c Wed Oct 3 02:48:31 2018 (r339114) @@ -21,7 +21,7 @@ /* * Copyright (c) 2005, 2010, Oracle and/or its affiliates. All rights reserved. * Copyright 2011 Nexenta Systems, Inc. All rights reserved. - * Copyright (c) 2012, 2017 by Delphix. All rights reserved. + * Copyright (c) 2012, 2018 by Delphix. All rights reserved. * Copyright (c) 2013 by Saso Kiselkov. All rights reserved. * Copyright (c) 2013, Joyent, Inc. All rights reserved. * Copyright (c) 2014 Spectra Logic Corporation, All rights reserved. @@ -1000,8 +1000,15 @@ dbuf_read_done(zio_t *zio, const zbookmark_phys_t *zb, ASSERT(refcount_count(&db->db_holds) > 0); ASSERT(db->db_buf == NULL); ASSERT(db->db.db_data == NULL); - if (db->db_level == 0 && db->db_freed_in_flight) { - /* we were freed in flight; disregard any error */ + if (buf == NULL) { + /* i/o error */ + ASSERT(zio == NULL || zio->io_error != 0); + ASSERT(db->db_blkid != DMU_BONUS_BLKID); + ASSERT3P(db->db_buf, ==, NULL); + db->db_state = DB_UNCACHED; + } else if (db->db_level == 0 && db->db_freed_in_flight) { + /* freed in flight */ + ASSERT(zio == NULL || zio->io_error == 0); if (buf == NULL) { buf = arc_alloc_buf(db->db_objset->os_spa, db, DBUF_GET_BUFC_TYPE(db), db->db.db_size); @@ -1012,13 +1019,11 @@ dbuf_read_done(zio_t *zio, const zbookmark_phys_t *zb, db->db_freed_in_flight = FALSE; dbuf_set_data(db, buf); db->db_state = DB_CACHED; - } else if (buf != NULL) { + } else { + /* success */ + ASSERT(zio == NULL || zio->io_error == 0); dbuf_set_data(db, buf); db->db_state = DB_CACHED; - } else { - ASSERT(db->db_blkid != DMU_BONUS_BLKID); - ASSERT3P(db->db_buf, ==, NULL); - db->db_state = DB_UNCACHED; } cv_broadcast(&db->db_changed); dbuf_rele_and_unlock(db, NULL); @@ -2431,6 +2436,13 @@ dbuf_prefetch_indirect_done(zio_t *zio, const zbookmar ASSERT3S(dpa->dpa_zb.zb_level, <, dpa->dpa_curlevel); ASSERT3S(dpa->dpa_curlevel, >, 0); + + if (abuf == NULL) { + ASSERT(zio == NULL || zio->io_error != 0); + kmem_free(dpa, sizeof (*dpa)); + return; + } + ASSERT(zio == NULL || zio->io_error == 0); /* * The dpa_dnode is only valid if we are called with a NULL Modified: stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/zio_compress.c ============================================================================== --- stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/zio_compress.c Wed Oct 3 02:19:17 2018 (r339113) +++ stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/zio_compress.c Wed Oct 3 02:48:31 2018 (r339114) @@ -25,7 +25,7 @@ */ /* * Copyright (c) 2013 by Saso Kiselkov. All rights reserved. - * Copyright (c) 2013, 2016 by Delphix. All rights reserved. + * Copyright (c) 2013, 2018 by Delphix. All rights reserved. */ #include @@ -56,6 +56,12 @@ static zcomp_stats_t zcomp_stats = { kstat_t *zcomp_ksp; /* + * If nonzero, every 1/X decompression attempts will fail, simulating + * an undetected memory error. + */ +uint64_t zio_decompress_fail_fraction = 0; + +/* * Compression vectors. */ zio_compress_info_t zio_compress_table[ZIO_COMPRESS_FUNCTIONS] = { @@ -171,6 +177,16 @@ zio_decompress_data(enum zio_compress c, abd_t *src, v void *tmp = abd_borrow_buf_copy(src, s_len); int ret = zio_decompress_data_buf(c, tmp, dst, s_len, d_len); abd_return_buf(src, tmp, s_len); + + /* + * Decompression shouldn't fail, because we've already verifyied + * the checksum. However, for extra protection (e.g. against bitflips + * in non-ECC RAM), we handle this error (and test it). + */ + ASSERT0(ret); + if (zio_decompress_fail_fraction != 0 && + spa_get_random(zio_decompress_fail_fraction) == 0) + ret = SET_ERROR(EINVAL); return (ret); }