From owner-svn-src-stable-11@freebsd.org Thu Dec 15 08:10:49 2016 Return-Path: Delivered-To: svn-src-stable-11@mailman.ysv.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by mailman.ysv.freebsd.org (Postfix) with ESMTP id 403CAC81A51; Thu, 15 Dec 2016 08:10:49 +0000 (UTC) (envelope-from mav@FreeBSD.org) Received: from repo.freebsd.org (repo.freebsd.org [IPv6:2610:1c1:1:6068::e6a:0]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (Client did not present a certificate) by mx1.freebsd.org (Postfix) with ESMTPS id E1E62623; Thu, 15 Dec 2016 08:10:48 +0000 (UTC) (envelope-from mav@FreeBSD.org) Received: from repo.freebsd.org ([127.0.1.37]) by repo.freebsd.org (8.15.2/8.15.2) with ESMTP id uBF8Am9q084896; Thu, 15 Dec 2016 08:10:48 GMT (envelope-from mav@FreeBSD.org) Received: (from mav@localhost) by repo.freebsd.org (8.15.2/8.15.2/Submit) id uBF8AmU0084895; Thu, 15 Dec 2016 08:10:48 GMT (envelope-from mav@FreeBSD.org) Message-Id: <201612150810.uBF8AmU0084895@repo.freebsd.org> X-Authentication-Warning: repo.freebsd.org: mav set sender to mav@FreeBSD.org using -f From: Alexander Motin Date: Thu, 15 Dec 2016 08:10:48 +0000 (UTC) To: src-committers@freebsd.org, svn-src-all@freebsd.org, svn-src-stable@freebsd.org, svn-src-stable-11@freebsd.org Subject: svn commit: r310105 - stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs X-SVN-Group: stable-11 MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit X-BeenThere: svn-src-stable-11@freebsd.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: SVN commit messages for only the 11-stable src tree List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 15 Dec 2016 08:10:49 -0000 Author: mav Date: Thu Dec 15 08:10:47 2016 New Revision: 310105 URL: https://svnweb.freebsd.org/changeset/base/310105 Log: MFC 309714: Fix spa_alloc_tree sorting by offset in r305331. Original commit "7090 zfs should improve allocation order" declares alloc queue sorted by time and offset. But in practice io_offset is always zero, so sorting happened only by time, while order of writes with equal time was completely random. On Illumos this did not affected much thanks to using high resolution timestamps. On FreeBSD due to using much faster but low resolution timestamps it caused bad data placement on disks, affecting further read performance. This change switches zio_timestamp_compare() from comparing uninitialized io_offset to really populated io_bookmark values. I haven't decided yet what to do with timestampts, but on simple tests this change gives the same peformance results by just making code to work as declared. Modified: stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/zio.c Directory Properties: stable/11/ (props changed) Modified: stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/zio.c ============================================================================== --- stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/zio.c Thu Dec 15 08:03:16 2016 (r310104) +++ stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/zio.c Thu Dec 15 08:10:47 2016 (r310105) @@ -563,9 +563,24 @@ zio_timestamp_compare(const void *x1, co if (z1->io_queued_timestamp > z2->io_queued_timestamp) return (1); - if (z1->io_offset < z2->io_offset) + if (z1->io_bookmark.zb_objset < z2->io_bookmark.zb_objset) return (-1); - if (z1->io_offset > z2->io_offset) + if (z1->io_bookmark.zb_objset > z2->io_bookmark.zb_objset) + return (1); + + if (z1->io_bookmark.zb_object < z2->io_bookmark.zb_object) + return (-1); + if (z1->io_bookmark.zb_object > z2->io_bookmark.zb_object) + return (1); + + if (z1->io_bookmark.zb_level < z2->io_bookmark.zb_level) + return (-1); + if (z1->io_bookmark.zb_level > z2->io_bookmark.zb_level) + return (1); + + if (z1->io_bookmark.zb_blkid < z2->io_bookmark.zb_blkid) + return (-1); + if (z1->io_bookmark.zb_blkid > z2->io_bookmark.zb_blkid) return (1); if (z1 < z2)