From owner-svn-src-stable-11@freebsd.org Sun Jul 23 18:00:05 2017 Return-Path: Delivered-To: svn-src-stable-11@mailman.ysv.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by mailman.ysv.freebsd.org (Postfix) with ESMTP id 2928EDA998C; Sun, 23 Jul 2017 18:00:05 +0000 (UTC) (envelope-from ngie@FreeBSD.org) Received: from repo.freebsd.org (repo.freebsd.org [IPv6:2610:1c1:1:6068::e6a:0]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (Client did not present a certificate) by mx1.freebsd.org (Postfix) with ESMTPS id E33E069638; Sun, 23 Jul 2017 18:00:04 +0000 (UTC) (envelope-from ngie@FreeBSD.org) Received: from repo.freebsd.org ([127.0.1.37]) by repo.freebsd.org (8.15.2/8.15.2) with ESMTP id v6NI03ou030714; Sun, 23 Jul 2017 18:00:03 GMT (envelope-from ngie@FreeBSD.org) Received: (from ngie@localhost) by repo.freebsd.org (8.15.2/8.15.2/Submit) id v6NI03OD030713; Sun, 23 Jul 2017 18:00:03 GMT (envelope-from ngie@FreeBSD.org) Message-Id: <201707231800.v6NI03OD030713@repo.freebsd.org> X-Authentication-Warning: repo.freebsd.org: ngie set sender to ngie@FreeBSD.org using -f From: Ngie Cooper Date: Sun, 23 Jul 2017 18:00:03 +0000 (UTC) To: src-committers@freebsd.org, svn-src-all@freebsd.org, svn-src-stable@freebsd.org, svn-src-stable-11@freebsd.org Subject: svn commit: r321388 - stable/11/usr.sbin/cron/cron X-SVN-Group: stable-11 X-SVN-Commit-Author: ngie X-SVN-Commit-Paths: stable/11/usr.sbin/cron/cron X-SVN-Commit-Revision: 321388 X-SVN-Commit-Repository: base MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit X-BeenThere: svn-src-stable-11@freebsd.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: SVN commit messages for only the 11-stable src tree List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Sun, 23 Jul 2017 18:00:05 -0000 Author: ngie Date: Sun Jul 23 18:00:03 2017 New Revision: 321388 URL: https://svnweb.freebsd.org/changeset/base/321388 Log: MFC r321240: cron(8) manpage updates - Document /etc/cron.d and /usr/local/etc/cron.d under FILES. - Reword documentation for -n: add appropriate soft-stop and remove contraction to appease igor. Modified: stable/11/usr.sbin/cron/cron/cron.8 Directory Properties: stable/11/ (props changed) Modified: stable/11/usr.sbin/cron/cron/cron.8 ============================================================================== --- stable/11/usr.sbin/cron/cron/cron.8 Sun Jul 23 17:57:00 2017 (r321387) +++ stable/11/usr.sbin/cron/cron/cron.8 Sun Jul 23 18:00:03 2017 (r321388) @@ -17,7 +17,7 @@ .\" .\" $FreeBSD$ .\" -.Dd October 31, 2016 +.Dd July 19, 2017 .Dt CRON 8 .Os .Sh NAME @@ -138,7 +138,7 @@ set to a null string, usually specified in a shell as or .Li \*q\*q . .It Fl n -Don't daemonize, run in foreground instead. +Do not daemonize; run in foreground instead. .It Fl s Enable special handling of situations when the GMT offset of the local timezone changes, such as the switches between the standard time and @@ -209,13 +209,17 @@ trace through the execution, but do not perform any ac .El .El .Sh FILES -.Bl -tag -width /etc/pam.d/cron -compact +.Bl -tag -width /usr/local/etc/cron.d -compact .It Pa /etc/crontab System crontab file +.It Pa /etc/cron.d +Directory for optional/modularized system crontab files. .It Pa /etc/pam.d/cron .Xr pam.conf 5 configuration file for .Nm +.It Pa /usr/local/etc/cron.d +Directory for third-party package provided crontab files. .It Pa /var/cron/tabs Directory for personal crontab files .El From owner-svn-src-stable-11@freebsd.org Sun Jul 23 18:13:20 2017 Return-Path: Delivered-To: svn-src-stable-11@mailman.ysv.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by mailman.ysv.freebsd.org (Postfix) with ESMTP id 94CD8DAA06E; Sun, 23 Jul 2017 18:13:20 +0000 (UTC) (envelope-from ngie@FreeBSD.org) Received: from repo.freebsd.org (repo.freebsd.org [IPv6:2610:1c1:1:6068::e6a:0]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (Client did not present a certificate) by mx1.freebsd.org (Postfix) with ESMTPS id 6DE046A164; Sun, 23 Jul 2017 18:13:20 +0000 (UTC) (envelope-from ngie@FreeBSD.org) Received: from repo.freebsd.org ([127.0.1.37]) by repo.freebsd.org (8.15.2/8.15.2) with ESMTP id v6NIDJj0038638; Sun, 23 Jul 2017 18:13:19 GMT (envelope-from ngie@FreeBSD.org) Received: (from ngie@localhost) by repo.freebsd.org (8.15.2/8.15.2/Submit) id v6NIDJjC038637; Sun, 23 Jul 2017 18:13:19 GMT (envelope-from ngie@FreeBSD.org) Message-Id: <201707231813.v6NIDJjC038637@repo.freebsd.org> X-Authentication-Warning: repo.freebsd.org: ngie set sender to ngie@FreeBSD.org using -f From: Ngie Cooper Date: Sun, 23 Jul 2017 18:13:19 +0000 (UTC) To: src-committers@freebsd.org, svn-src-all@freebsd.org, svn-src-stable@freebsd.org, svn-src-stable-11@freebsd.org Subject: svn commit: r321391 - stable/11/targets/pseudo/tests X-SVN-Group: stable-11 X-SVN-Commit-Author: ngie X-SVN-Commit-Paths: stable/11/targets/pseudo/tests X-SVN-Commit-Revision: 321391 X-SVN-Commit-Repository: base MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit X-BeenThere: svn-src-stable-11@freebsd.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: SVN commit messages for only the 11-stable src tree List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Sun, 23 Jul 2017 18:13:20 -0000 Author: ngie Date: Sun Jul 23 18:13:19 2017 New Revision: 321391 URL: https://svnweb.freebsd.org/changeset/base/321391 Log: MFC r316603,r321214: r316603: META_MODE: add additional reachover relative paths to DIRDEPS_BUILD These additional entries are being added, after their addition to the source tree. r321214: Update targets/pseudo/tests/Makefile.depend after recent additions/subtractions from the FreeBSD test suite. MFC with: r316603 Modified: stable/11/targets/pseudo/tests/Makefile.depend Directory Properties: stable/11/ (props changed) Modified: stable/11/targets/pseudo/tests/Makefile.depend ============================================================================== --- stable/11/targets/pseudo/tests/Makefile.depend Sun Jul 23 18:10:47 2017 (r321390) +++ stable/11/targets/pseudo/tests/Makefile.depend Sun Jul 23 18:13:19 2017 (r321391) @@ -7,13 +7,17 @@ # find . -name Makefile -exec grep -l '^\.include.*\.test.mk' {} + | grep -v '^\./contrib' | sed -e 's,/Makefile,,' -e 's,^\./,,' -e 's,^, ,' -e 's,$, \\,' | sort DIRDEPS= \ bin/cat/tests \ + bin/chmod/tests \ bin/date/tests \ bin/dd/tests \ + bin/echo/tests \ bin/expr/tests \ + bin/ln/tests \ bin/ls/tests \ bin/mv/tests \ bin/pax/tests \ bin/pkill/tests \ + bin/pwait/tests \ bin/sh/tests \ bin/sh/tests/builtins \ bin/sh/tests/errors \ @@ -113,6 +117,7 @@ DIRDEPS= \ cddl/usr.sbin/dtrace/tests/common/vars \ cddl/usr.sbin/dtrace/tests/common/version \ cddl/usr.sbin/tests \ + cddl/usr.sbin/zfsd/tests \ gnu/lib/tests \ gnu/tests \ gnu/usr.bin/diff/tests \ @@ -131,6 +136,7 @@ DIRDEPS= \ lib/libc/tests/gen/execve \ lib/libc/tests/gen/posix_spawn \ lib/libc/tests/hash \ + lib/libc/tests/iconv \ lib/libc/tests/inet \ lib/libc/tests/locale \ lib/libc/tests/net \ @@ -149,12 +155,16 @@ DIRDEPS= \ lib/libc/tests/time \ lib/libc/tests/tls \ lib/libc/tests/ttyio \ + lib/libcam/tests \ lib/libcrypt/tests \ + lib/libdevdctl/tests \ + lib/libkvm/tests \ lib/libmp/tests \ lib/libnv/tests \ lib/libpam/libpam/tests \ lib/libproc/tests \ lib/librt/tests \ + lib/libsbuf/tests \ lib/libthr/tests \ lib/libthr/tests/dlopen \ lib/libthr/tests/dlopen/dso \ @@ -166,8 +176,6 @@ DIRDEPS= \ libexec/atf/atf-sh/tests \ libexec/atf/tests \ libexec/rtld-elf/tests \ - libexec/rtld-elf/tests/libpythagoras \ - libexec/rtld-elf/tests/target \ libexec/tests \ sbin/devd/tests \ sbin/dhclient/tests \ @@ -193,6 +201,8 @@ DIRDEPS= \ tests/sys/aio \ tests/sys/fifo \ tests/sys/file \ + tests/sys/fs \ + tests/sys/fs/tmpfs \ tests/sys/geom \ tests/sys/geom/class \ tests/sys/geom/class/concat \ @@ -209,13 +219,13 @@ DIRDEPS= \ tests/sys/kern/execve \ tests/sys/kern/pipe \ tests/sys/kqueue \ + tests/sys/kqueue/libkqueue \ tests/sys/mac \ tests/sys/mac/bsdextended \ tests/sys/mac/portacl \ tests/sys/mqueue \ tests/sys/netinet \ tests/sys/opencrypto \ - tests/sys/pjdfstest/pjdfstest \ tests/sys/pjdfstest/tests \ tests/sys/pjdfstest/tests/chflags \ tests/sys/pjdfstest/tests/chmod \ @@ -292,10 +302,13 @@ DIRDEPS= \ usr.bin/cmp/tests \ usr.bin/col/tests \ usr.bin/comm/tests \ + usr.bin/compress/tests \ usr.bin/cpio/tests \ usr.bin/cut/tests \ usr.bin/dirname/tests \ + usr.bin/du/tests \ usr.bin/file2c/tests \ + usr.bin/getconf/tests \ usr.bin/grep/tests \ usr.bin/gzip/tests \ usr.bin/ident/tests \ @@ -306,17 +319,21 @@ DIRDEPS= \ usr.bin/m4/tests \ usr.bin/mkimg/tests \ usr.bin/ncal/tests \ + usr.bin/pr/tests \ usr.bin/printf/tests \ - usr.bin/pwait/tests \ + usr.bin/procstat/tests \ usr.bin/sdiff/tests \ usr.bin/sed/tests \ usr.bin/sed/tests/regress.multitest.out \ usr.bin/soelim/tests \ + usr.bin/stat/tests \ + usr.bin/tail/tests \ usr.bin/tar/tests \ usr.bin/tests \ usr.bin/timeout/tests \ usr.bin/tr/tests \ usr.bin/truncate/tests \ + usr.bin/uniq/tests \ usr.bin/units/tests \ usr.bin/uudecode/tests \ usr.bin/uuencode/tests \ @@ -341,6 +358,7 @@ DIRDEPS= \ DIRDEPS:= ${DIRDEPS:Ncddl/usr.sbin/dtrace/tests/common/nfs} DIRDEPS:= ${DIRDEPS:Ncddl/usr.sbin/dtrace/tests/common/sysevent} DIRDEPS:= ${DIRDEPS:Ncddl/usr.sbin/dtrace/tests/common/docsExamples} +DIRDEPS:= ${DIRDEPS:Ncddl/usr.sbin/zfsd/tests} DIRDEPS:= ${DIRDEPS:Nlib/libc/tests/net/getaddrinfo} .include From owner-svn-src-stable-11@freebsd.org Mon Jul 24 01:39:59 2017 Return-Path: Delivered-To: svn-src-stable-11@mailman.ysv.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by mailman.ysv.freebsd.org (Postfix) with ESMTP id 6F0D7DB317F; Mon, 24 Jul 2017 01:39:59 +0000 (UTC) (envelope-from sephe@FreeBSD.org) Received: from repo.freebsd.org (repo.freebsd.org [IPv6:2610:1c1:1:6068::e6a:0]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (Client did not present a certificate) by mx1.freebsd.org (Postfix) with ESMTPS id 37BDE76677; Mon, 24 Jul 2017 01:39:59 +0000 (UTC) (envelope-from sephe@FreeBSD.org) Received: from repo.freebsd.org ([127.0.1.37]) by repo.freebsd.org (8.15.2/8.15.2) with ESMTP id v6O1dwMp017761; Mon, 24 Jul 2017 01:39:58 GMT (envelope-from sephe@FreeBSD.org) Received: (from sephe@localhost) by repo.freebsd.org (8.15.2/8.15.2/Submit) id v6O1dwEJ017760; Mon, 24 Jul 2017 01:39:58 GMT (envelope-from sephe@FreeBSD.org) Message-Id: <201707240139.v6O1dwEJ017760@repo.freebsd.org> X-Authentication-Warning: repo.freebsd.org: sephe set sender to sephe@FreeBSD.org using -f From: Sepherosa Ziehau Date: Mon, 24 Jul 2017 01:39:58 +0000 (UTC) To: src-committers@freebsd.org, svn-src-all@freebsd.org, svn-src-stable@freebsd.org, svn-src-stable-11@freebsd.org Subject: svn commit: r321404 - stable/11/sys/dev/hyperv/storvsc X-SVN-Group: stable-11 X-SVN-Commit-Author: sephe X-SVN-Commit-Paths: stable/11/sys/dev/hyperv/storvsc X-SVN-Commit-Revision: 321404 X-SVN-Commit-Repository: base MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit X-BeenThere: svn-src-stable-11@freebsd.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: SVN commit messages for only the 11-stable src tree List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Mon, 24 Jul 2017 01:39:59 -0000 Author: sephe Date: Mon Jul 24 01:39:58 2017 New Revision: 321404 URL: https://svnweb.freebsd.org/changeset/base/321404 Log: MFC 321286 hyperv/storvsc: Force SPC3 for CDROM attached. This unbreaks the CDROM attaching on GEN2 VMs. On GEN1 VMs, CDROM is attached to emulated ATA controller. PR: 220790 Submitted by: Hongjiang Zhang Sponsored by: Microsoft Differential Revision: https://reviews.freebsd.org/D11634 Modified: stable/11/sys/dev/hyperv/storvsc/hv_storvsc_drv_freebsd.c Directory Properties: stable/11/ (props changed) Modified: stable/11/sys/dev/hyperv/storvsc/hv_storvsc_drv_freebsd.c ============================================================================== --- stable/11/sys/dev/hyperv/storvsc/hv_storvsc_drv_freebsd.c Mon Jul 24 00:53:43 2017 (r321403) +++ stable/11/sys/dev/hyperv/storvsc/hv_storvsc_drv_freebsd.c Mon Jul 24 01:39:58 2017 (r321404) @@ -2211,6 +2211,23 @@ storvsc_io_done(struct hv_storvsc_request *reqp) resp_buf[3], resp_buf[4]); } /* + * XXX: Hyper-V (since win2012r2) responses inquiry with + * unknown version (0) for GEN-2 DVD device. + * Manually set the version number to SPC3 in order to + * ask CAM to continue probing with "PROBE_REPORT_LUNS". + * see probedone() in scsi_xpt.c + */ + if (SID_TYPE(inq_data) == T_CDROM && + inq_data->version == 0 && + (vmstor_proto_version >= VMSTOR_PROTOCOL_VERSION_WIN8)) { + inq_data->version = SCSI_REV_SPC3; + if (bootverbose) { + xpt_print(ccb->ccb_h.path, + "set version from 0 to %d\n", + inq_data->version); + } + } + /* * XXX: Manually fix the wrong response returned from WS2012 */ if (!is_scsi_valid(inq_data) && @@ -2219,7 +2236,7 @@ storvsc_io_done(struct hv_storvsc_request *reqp) vmstor_proto_version == VMSTOR_PROTOCOL_VERSION_WIN7)) { if (data_len >= 4 && (resp_buf[2] == 0 || resp_buf[3] == 0)) { - resp_buf[2] = 5; // verion=5 means SPC-3 + resp_buf[2] = SCSI_REV_SPC3; resp_buf[3] = 2; // resp fmt must be 2 if (bootverbose) xpt_print(ccb->ccb_h.path, From owner-svn-src-stable-11@freebsd.org Mon Jul 24 05:52:12 2017 Return-Path: Delivered-To: svn-src-stable-11@mailman.ysv.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by mailman.ysv.freebsd.org (Postfix) with ESMTP id A6C9DDB7786; Mon, 24 Jul 2017 05:52:12 +0000 (UTC) (envelope-from mav@FreeBSD.org) Received: from repo.freebsd.org (repo.freebsd.org [IPv6:2610:1c1:1:6068::e6a:0]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (Client did not present a certificate) by mx1.freebsd.org (Postfix) with ESMTPS id 802528165D; Mon, 24 Jul 2017 05:52:12 +0000 (UTC) (envelope-from mav@FreeBSD.org) Received: from repo.freebsd.org ([127.0.1.37]) by repo.freebsd.org (8.15.2/8.15.2) with ESMTP id v6O5qBhF022791; Mon, 24 Jul 2017 05:52:11 GMT (envelope-from mav@FreeBSD.org) Received: (from mav@localhost) by repo.freebsd.org (8.15.2/8.15.2/Submit) id v6O5qAdA022781; Mon, 24 Jul 2017 05:52:10 GMT (envelope-from mav@FreeBSD.org) Message-Id: <201707240552.v6O5qAdA022781@repo.freebsd.org> X-Authentication-Warning: repo.freebsd.org: mav set sender to mav@FreeBSD.org using -f From: Alexander Motin Date: Mon, 24 Jul 2017 05:52:10 +0000 (UTC) To: src-committers@freebsd.org, svn-src-all@freebsd.org, svn-src-stable@freebsd.org, svn-src-stable-11@freebsd.org Subject: svn commit: r321411 - in stable/11: sys/fs/nfsclient sys/fs/smbfs sys/vm usr.sbin/bhyve X-SVN-Group: stable-11 X-SVN-Commit-Author: mav X-SVN-Commit-Paths: in stable/11: sys/fs/nfsclient sys/fs/smbfs sys/vm usr.sbin/bhyve X-SVN-Commit-Revision: 321411 X-SVN-Commit-Repository: base MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit X-BeenThere: svn-src-stable-11@freebsd.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: SVN commit messages for only the 11-stable src tree List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Mon, 24 Jul 2017 05:52:12 -0000 Author: mav Date: Mon Jul 24 05:52:10 2017 New Revision: 321411 URL: https://svnweb.freebsd.org/changeset/base/321411 Log: MFC r302850: Make PCI interupts allocation static when using bootrom (UEFI). This makes factual interrupt routing match one shipped with UEFI firmware. With old firmware this make legacy interrupts work reliable for functions 0 of PCI slots 3-6. Updated UEFI image fixes problem completely. Modified: stable/11/sys/fs/nfsclient/nfs_clbio.c stable/11/sys/fs/smbfs/smbfs_io.c stable/11/sys/vm/vnode_pager.c stable/11/sys/vm/vnode_pager.h stable/11/usr.sbin/bhyve/ioapic.c stable/11/usr.sbin/bhyve/ioapic.h stable/11/usr.sbin/bhyve/pci_emul.c stable/11/usr.sbin/bhyve/pci_irq.c stable/11/usr.sbin/bhyve/pci_irq.h Directory Properties: stable/11/ (props changed) Modified: stable/11/sys/fs/nfsclient/nfs_clbio.c ============================================================================== --- stable/11/sys/fs/nfsclient/nfs_clbio.c Mon Jul 24 04:38:05 2017 (r321410) +++ stable/11/sys/fs/nfsclient/nfs_clbio.c Mon Jul 24 05:52:10 2017 (r321411) @@ -337,8 +337,10 @@ ncl_putpages(struct vop_putpages_args *ap) cred); crfree(cred); - if (error == 0 || !nfs_keep_dirty_on_error) - vnode_pager_undirty_pages(pages, rtvals, count - uio.uio_resid); + if (error == 0 || !nfs_keep_dirty_on_error) { + vnode_pager_undirty_pages(pages, rtvals, count - uio.uio_resid, + np->n_size - offset, npages * PAGE_SIZE); + } return (rtvals[0]); } Modified: stable/11/sys/fs/smbfs/smbfs_io.c ============================================================================== --- stable/11/sys/fs/smbfs/smbfs_io.c Mon Jul 24 04:38:05 2017 (r321410) +++ stable/11/sys/fs/smbfs/smbfs_io.c Mon Jul 24 05:52:10 2017 (r321411) @@ -621,9 +621,11 @@ smbfs_putpages(ap) relpbuf(bp, &smbfs_pbuf_freecnt); - if (!error) - vnode_pager_undirty_pages(pages, rtvals, count - uio.uio_resid); - return rtvals[0]; + if (error == 0) { + vnode_pager_undirty_pages(pages, rtvals, count - uio.uio_resid, + npages * PAGE_SIZE, npages * PAGE_SIZE); + } + return (rtvals[0]); #endif /* SMBFS_RWGENERIC */ } Modified: stable/11/sys/vm/vnode_pager.c ============================================================================== --- stable/11/sys/vm/vnode_pager.c Mon Jul 24 04:38:05 2017 (r321410) +++ stable/11/sys/vm/vnode_pager.c Mon Jul 24 05:52:10 2017 (r321411) @@ -1277,12 +1277,13 @@ vnode_pager_putpages_ioflags(int pager_flags) } void -vnode_pager_undirty_pages(vm_page_t *ma, int *rtvals, int written) +vnode_pager_undirty_pages(vm_page_t *ma, int *rtvals, int written, int eof, + int lpos) { vm_object_t obj; - int i, pos; + int i, pos, pos_devb; - if (written == 0) + if (written == 0 && eof >= lpos) return; obj = ma[0]->object; VM_OBJECT_WLOCK(obj); @@ -1294,6 +1295,20 @@ vnode_pager_undirty_pages(vm_page_t *ma, int *rtvals, /* Partially written page. */ rtvals[i] = VM_PAGER_AGAIN; vm_page_clear_dirty(ma[i], 0, written & PAGE_MASK); + } + } + for (pos = eof, i = OFF_TO_IDX(trunc_page(pos)); pos < lpos; i++) { + if (pos != trunc_page(pos)) { + pos_devb = roundup2(pos, DEV_BSIZE) & PAGE_MASK; + vm_page_clear_dirty(ma[i], pos_devb, PAGE_SIZE - + pos_devb); + if (ma[i]->dirty == VM_PAGE_BITS_ALL) + rtvals[i] = VM_PAGER_OK; + pos = round_page(pos); + } else { + /* vm_pageout_flush() clears dirty */ + rtvals[i] = VM_PAGER_BAD; + pos += PAGE_SIZE; } } VM_OBJECT_WUNLOCK(obj); Modified: stable/11/sys/vm/vnode_pager.h ============================================================================== --- stable/11/sys/vm/vnode_pager.h Mon Jul 24 04:38:05 2017 (r321410) +++ stable/11/sys/vm/vnode_pager.h Mon Jul 24 05:52:10 2017 (r321411) @@ -50,7 +50,8 @@ int vnode_pager_local_getpages_async(struct vop_getpag int vnode_pager_putpages_ioflags(int pager_flags); void vnode_pager_release_writecount(vm_object_t object, vm_offset_t start, vm_offset_t end); -void vnode_pager_undirty_pages(vm_page_t *ma, int *rtvals, int written); +void vnode_pager_undirty_pages(vm_page_t *ma, int *rtvals, int written, + int eof, int lpos); void vnode_pager_update_writecount(vm_object_t object, vm_offset_t start, vm_offset_t end); Modified: stable/11/usr.sbin/bhyve/ioapic.c ============================================================================== --- stable/11/usr.sbin/bhyve/ioapic.c Mon Jul 24 04:38:05 2017 (r321410) +++ stable/11/usr.sbin/bhyve/ioapic.c Mon Jul 24 05:52:10 2017 (r321411) @@ -29,11 +29,14 @@ __FBSDID("$FreeBSD$"); #include +#include #include #include #include "ioapic.h" +#include "pci_emul.h" +#include "pci_lpc.h" /* * Assign PCI INTx interrupts to I/O APIC pins in a round-robin @@ -64,11 +67,15 @@ ioapic_init(struct vmctx *ctx) } int -ioapic_pci_alloc_irq(void) +ioapic_pci_alloc_irq(struct pci_devinst *pi) { static int last_pin; if (pci_pins == 0) return (-1); + if (lpc_bootrom()) { + /* For external bootrom use fixed mapping. */ + return (16 + (4 + pi->pi_slot + pi->pi_lintr.pin) % 8); + } return (16 + (last_pin++ % pci_pins)); } Modified: stable/11/usr.sbin/bhyve/ioapic.h ============================================================================== --- stable/11/usr.sbin/bhyve/ioapic.h Mon Jul 24 04:38:05 2017 (r321410) +++ stable/11/usr.sbin/bhyve/ioapic.h Mon Jul 24 05:52:10 2017 (r321411) @@ -30,10 +30,12 @@ #ifndef _IOAPIC_H_ #define _IOAPIC_H_ +struct pci_devinst; + /* * Allocate a PCI IRQ from the I/O APIC. */ void ioapic_init(struct vmctx *ctx); -int ioapic_pci_alloc_irq(void); +int ioapic_pci_alloc_irq(struct pci_devinst *pi); #endif Modified: stable/11/usr.sbin/bhyve/pci_emul.c ============================================================================== --- stable/11/usr.sbin/bhyve/pci_emul.c Mon Jul 24 04:38:05 2017 (r321410) +++ stable/11/usr.sbin/bhyve/pci_emul.c Mon Jul 24 05:52:10 2017 (r321411) @@ -1504,7 +1504,7 @@ pci_lintr_route(struct pci_devinst *pi) * is not yet assigned. */ if (ii->ii_ioapic_irq == 0) - ii->ii_ioapic_irq = ioapic_pci_alloc_irq(); + ii->ii_ioapic_irq = ioapic_pci_alloc_irq(pi); assert(ii->ii_ioapic_irq > 0); /* @@ -1512,7 +1512,7 @@ pci_lintr_route(struct pci_devinst *pi) * not yet assigned. */ if (ii->ii_pirq_pin == 0) - ii->ii_pirq_pin = pirq_alloc_pin(pi->pi_vmctx); + ii->ii_pirq_pin = pirq_alloc_pin(pi); assert(ii->ii_pirq_pin > 0); pi->pi_lintr.ioapic_irq = ii->ii_ioapic_irq; Modified: stable/11/usr.sbin/bhyve/pci_irq.c ============================================================================== --- stable/11/usr.sbin/bhyve/pci_irq.c Mon Jul 24 04:38:05 2017 (r321410) +++ stable/11/usr.sbin/bhyve/pci_irq.c Mon Jul 24 05:52:10 2017 (r321411) @@ -193,19 +193,25 @@ pci_irq_deassert(struct pci_devinst *pi) } int -pirq_alloc_pin(struct vmctx *ctx) +pirq_alloc_pin(struct pci_devinst *pi) { + struct vmctx *ctx = pi->pi_vmctx; int best_count, best_irq, best_pin, irq, pin; pirq_cold = 0; - /* First, find the least-used PIRQ pin. */ - best_pin = 0; - best_count = pirqs[0].use_count; - for (pin = 1; pin < nitems(pirqs); pin++) { - if (pirqs[pin].use_count < best_count) { - best_pin = pin; - best_count = pirqs[pin].use_count; + if (lpc_bootrom()) { + /* For external bootrom use fixed mapping. */ + best_pin = (4 + pi->pi_slot + pi->pi_lintr.pin) % 8; + } else { + /* Find the least-used PIRQ pin. */ + best_pin = 0; + best_count = pirqs[0].use_count; + for (pin = 1; pin < nitems(pirqs); pin++) { + if (pirqs[pin].use_count < best_count) { + best_pin = pin; + best_count = pirqs[pin].use_count; + } } } pirqs[best_pin].use_count++; Modified: stable/11/usr.sbin/bhyve/pci_irq.h ============================================================================== --- stable/11/usr.sbin/bhyve/pci_irq.h Mon Jul 24 04:38:05 2017 (r321410) +++ stable/11/usr.sbin/bhyve/pci_irq.h Mon Jul 24 05:52:10 2017 (r321411) @@ -37,7 +37,7 @@ void pci_irq_deassert(struct pci_devinst *pi); void pci_irq_init(struct vmctx *ctx); void pci_irq_reserve(int irq); void pci_irq_use(int irq); -int pirq_alloc_pin(struct vmctx *ctx); +int pirq_alloc_pin(struct pci_devinst *pi); int pirq_irq(int pin); uint8_t pirq_read(int pin); void pirq_write(struct vmctx *ctx, int pin, uint8_t val); From owner-svn-src-stable-11@freebsd.org Mon Jul 24 06:07:45 2017 Return-Path: Delivered-To: svn-src-stable-11@mailman.ysv.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by mailman.ysv.freebsd.org (Postfix) with ESMTP id DD890DB7A0A; Mon, 24 Jul 2017 06:07:45 +0000 (UTC) (envelope-from mav@FreeBSD.org) Received: from repo.freebsd.org (repo.freebsd.org [IPv6:2610:1c1:1:6068::e6a:0]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (Client did not present a certificate) by mx1.freebsd.org (Postfix) with ESMTPS id AEEB081B18; Mon, 24 Jul 2017 06:07:45 +0000 (UTC) (envelope-from mav@FreeBSD.org) Received: from repo.freebsd.org ([127.0.1.37]) by repo.freebsd.org (8.15.2/8.15.2) with ESMTP id v6O67ik3027388; Mon, 24 Jul 2017 06:07:44 GMT (envelope-from mav@FreeBSD.org) Received: (from mav@localhost) by repo.freebsd.org (8.15.2/8.15.2/Submit) id v6O67ic9027384; Mon, 24 Jul 2017 06:07:44 GMT (envelope-from mav@FreeBSD.org) Message-Id: <201707240607.v6O67ic9027384@repo.freebsd.org> X-Authentication-Warning: repo.freebsd.org: mav set sender to mav@FreeBSD.org using -f From: Alexander Motin Date: Mon, 24 Jul 2017 06:07:44 +0000 (UTC) To: src-committers@freebsd.org, svn-src-all@freebsd.org, svn-src-stable@freebsd.org, svn-src-stable-11@freebsd.org Subject: svn commit: r321412 - in stable/11/sys: fs/nfsclient fs/smbfs vm X-SVN-Group: stable-11 X-SVN-Commit-Author: mav X-SVN-Commit-Paths: in stable/11/sys: fs/nfsclient fs/smbfs vm X-SVN-Commit-Revision: 321412 X-SVN-Commit-Repository: base MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit X-BeenThere: svn-src-stable-11@freebsd.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: SVN commit messages for only the 11-stable src tree List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Mon, 24 Jul 2017 06:07:46 -0000 Author: mav Date: Mon Jul 24 06:07:44 2017 New Revision: 321412 URL: https://svnweb.freebsd.org/changeset/base/321412 Log: Revert unexpected changes leaked into r321411. Modified: stable/11/sys/fs/nfsclient/nfs_clbio.c stable/11/sys/fs/smbfs/smbfs_io.c stable/11/sys/vm/vnode_pager.c stable/11/sys/vm/vnode_pager.h Modified: stable/11/sys/fs/nfsclient/nfs_clbio.c ============================================================================== --- stable/11/sys/fs/nfsclient/nfs_clbio.c Mon Jul 24 05:52:10 2017 (r321411) +++ stable/11/sys/fs/nfsclient/nfs_clbio.c Mon Jul 24 06:07:44 2017 (r321412) @@ -337,10 +337,8 @@ ncl_putpages(struct vop_putpages_args *ap) cred); crfree(cred); - if (error == 0 || !nfs_keep_dirty_on_error) { - vnode_pager_undirty_pages(pages, rtvals, count - uio.uio_resid, - np->n_size - offset, npages * PAGE_SIZE); - } + if (error == 0 || !nfs_keep_dirty_on_error) + vnode_pager_undirty_pages(pages, rtvals, count - uio.uio_resid); return (rtvals[0]); } Modified: stable/11/sys/fs/smbfs/smbfs_io.c ============================================================================== --- stable/11/sys/fs/smbfs/smbfs_io.c Mon Jul 24 05:52:10 2017 (r321411) +++ stable/11/sys/fs/smbfs/smbfs_io.c Mon Jul 24 06:07:44 2017 (r321412) @@ -621,11 +621,9 @@ smbfs_putpages(ap) relpbuf(bp, &smbfs_pbuf_freecnt); - if (error == 0) { - vnode_pager_undirty_pages(pages, rtvals, count - uio.uio_resid, - npages * PAGE_SIZE, npages * PAGE_SIZE); - } - return (rtvals[0]); + if (!error) + vnode_pager_undirty_pages(pages, rtvals, count - uio.uio_resid); + return rtvals[0]; #endif /* SMBFS_RWGENERIC */ } Modified: stable/11/sys/vm/vnode_pager.c ============================================================================== --- stable/11/sys/vm/vnode_pager.c Mon Jul 24 05:52:10 2017 (r321411) +++ stable/11/sys/vm/vnode_pager.c Mon Jul 24 06:07:44 2017 (r321412) @@ -1277,13 +1277,12 @@ vnode_pager_putpages_ioflags(int pager_flags) } void -vnode_pager_undirty_pages(vm_page_t *ma, int *rtvals, int written, int eof, - int lpos) +vnode_pager_undirty_pages(vm_page_t *ma, int *rtvals, int written) { vm_object_t obj; - int i, pos, pos_devb; + int i, pos; - if (written == 0 && eof >= lpos) + if (written == 0) return; obj = ma[0]->object; VM_OBJECT_WLOCK(obj); @@ -1295,20 +1294,6 @@ vnode_pager_undirty_pages(vm_page_t *ma, int *rtvals, /* Partially written page. */ rtvals[i] = VM_PAGER_AGAIN; vm_page_clear_dirty(ma[i], 0, written & PAGE_MASK); - } - } - for (pos = eof, i = OFF_TO_IDX(trunc_page(pos)); pos < lpos; i++) { - if (pos != trunc_page(pos)) { - pos_devb = roundup2(pos, DEV_BSIZE) & PAGE_MASK; - vm_page_clear_dirty(ma[i], pos_devb, PAGE_SIZE - - pos_devb); - if (ma[i]->dirty == VM_PAGE_BITS_ALL) - rtvals[i] = VM_PAGER_OK; - pos = round_page(pos); - } else { - /* vm_pageout_flush() clears dirty */ - rtvals[i] = VM_PAGER_BAD; - pos += PAGE_SIZE; } } VM_OBJECT_WUNLOCK(obj); Modified: stable/11/sys/vm/vnode_pager.h ============================================================================== --- stable/11/sys/vm/vnode_pager.h Mon Jul 24 05:52:10 2017 (r321411) +++ stable/11/sys/vm/vnode_pager.h Mon Jul 24 06:07:44 2017 (r321412) @@ -50,8 +50,7 @@ int vnode_pager_local_getpages_async(struct vop_getpag int vnode_pager_putpages_ioflags(int pager_flags); void vnode_pager_release_writecount(vm_object_t object, vm_offset_t start, vm_offset_t end); -void vnode_pager_undirty_pages(vm_page_t *ma, int *rtvals, int written, - int eof, int lpos); +void vnode_pager_undirty_pages(vm_page_t *ma, int *rtvals, int written); void vnode_pager_update_writecount(vm_object_t object, vm_offset_t start, vm_offset_t end); From owner-svn-src-stable-11@freebsd.org Mon Jul 24 06:19:06 2017 Return-Path: Delivered-To: svn-src-stable-11@mailman.ysv.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by mailman.ysv.freebsd.org (Postfix) with ESMTP id 88DB5DB7C3C; Mon, 24 Jul 2017 06:19:06 +0000 (UTC) (envelope-from mav@FreeBSD.org) Received: from repo.freebsd.org (repo.freebsd.org [IPv6:2610:1c1:1:6068::e6a:0]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (Client did not present a certificate) by mx1.freebsd.org (Postfix) with ESMTPS id CA83481F24; Mon, 24 Jul 2017 06:19:05 +0000 (UTC) (envelope-from mav@FreeBSD.org) Received: from repo.freebsd.org ([127.0.1.37]) by repo.freebsd.org (8.15.2/8.15.2) with ESMTP id v6O6J5tG031350; Mon, 24 Jul 2017 06:19:05 GMT (envelope-from mav@FreeBSD.org) Received: (from mav@localhost) by repo.freebsd.org (8.15.2/8.15.2/Submit) id v6O6J4ZV031347; Mon, 24 Jul 2017 06:19:04 GMT (envelope-from mav@FreeBSD.org) Message-Id: <201707240619.v6O6J4ZV031347@repo.freebsd.org> X-Authentication-Warning: repo.freebsd.org: mav set sender to mav@FreeBSD.org using -f From: Alexander Motin Date: Mon, 24 Jul 2017 06:19:04 +0000 (UTC) To: src-committers@freebsd.org, svn-src-all@freebsd.org, svn-src-stable@freebsd.org, svn-src-stable-11@freebsd.org Subject: svn commit: r321413 - stable/11/usr.sbin/bhyve X-SVN-Group: stable-11 X-SVN-Commit-Author: mav X-SVN-Commit-Paths: stable/11/usr.sbin/bhyve X-SVN-Commit-Revision: 321413 X-SVN-Commit-Repository: base MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit X-BeenThere: svn-src-stable-11@freebsd.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: SVN commit messages for only the 11-stable src tree List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Mon, 24 Jul 2017 06:19:06 -0000 Author: mav Date: Mon Jul 24 06:19:04 2017 New Revision: 321413 URL: https://svnweb.freebsd.org/changeset/base/321413 Log: MFC r305898, r309120, r309121 (by jceel): Add virtio-console support to bhyve. Adds virtio-console device support to bhyve, allowing to create bidirectional character streams between host and guest. Syntax: -s ,virtio-console,port1=/path/to/port1.sock,anotherport=... Maximum of 16 ports per device can be created. Every port is named and corresponds to an Unix domain socket created by bhyve. bhyve accepts at most one connection per port at a time. Limitations: - due to lack of destructors of in bhyve, sockets on the filesystem must be cleaned up manually after bhyve exits - there's no way to use "console port" feature, nor the console port resize as of now - emergency write is advertised, but no-op as of now Added: stable/11/usr.sbin/bhyve/pci_virtio_console.c - copied, changed from r305898, head/usr.sbin/bhyve/pci_virtio_console.c Modified: stable/11/usr.sbin/bhyve/Makefile stable/11/usr.sbin/bhyve/virtio.h Directory Properties: stable/11/ (props changed) Modified: stable/11/usr.sbin/bhyve/Makefile ============================================================================== --- stable/11/usr.sbin/bhyve/Makefile Mon Jul 24 06:07:44 2017 (r321412) +++ stable/11/usr.sbin/bhyve/Makefile Mon Jul 24 06:19:04 2017 (r321413) @@ -38,6 +38,7 @@ SRCS= \ pci_lpc.c \ pci_passthru.c \ pci_virtio_block.c \ + pci_virtio_console.c \ pci_virtio_net.c \ pci_virtio_rnd.c \ pci_uart.c \ Copied and modified: stable/11/usr.sbin/bhyve/pci_virtio_console.c (from r305898, head/usr.sbin/bhyve/pci_virtio_console.c) ============================================================================== --- head/usr.sbin/bhyve/pci_virtio_console.c Sat Sep 17 13:48:01 2016 (r305898, copy source) +++ stable/11/usr.sbin/bhyve/pci_virtio_console.c Mon Jul 24 06:19:04 2017 (r321413) @@ -53,6 +53,7 @@ __FBSDID("$FreeBSD$"); #include "pci_emul.h" #include "virtio.h" #include "mevent.h" +#include "sockstream.h" #define VTCON_RINGSZ 64 #define VTCON_MAXPORTS 16 @@ -90,6 +91,7 @@ struct pci_vtcon_port { bool vsp_enabled; bool vsp_console; bool vsp_rx_ready; + bool vsp_open; int vsp_rxq; int vsp_txq; void * vsp_arg; @@ -116,6 +118,7 @@ struct pci_vtcon_softc { char * vsc_rootdir; int vsc_kq; int vsc_nports; + bool vsc_ready; struct pci_vtcon_port vsc_control_port; struct pci_vtcon_port vsc_ports[VTCON_MAXPORTS]; struct pci_vtcon_config *vsc_config; @@ -349,6 +352,7 @@ pci_vtcon_sock_accept(int fd __unused, enum ev_type t sock->vss_open = true; sock->vss_conn_fd = s; sock->vss_conn_evp = mevent_add(s, EVF_READ, pci_vtcon_sock_rx, sock); + pci_vtcon_open_port(sock->vss_port, true); } @@ -412,16 +416,21 @@ pci_vtcon_sock_tx(struct pci_vtcon_port *port, void *a int niov) { struct pci_vtcon_sock *sock; - int ret; + int i, ret; sock = (struct pci_vtcon_sock *)arg; if (sock->vss_conn_fd == -1) return; - ret = writev(sock->vss_conn_fd, iov, niov); + for (i = 0; i < niov; i++) { + ret = stream_write(sock->vss_conn_fd, iov[i].iov_base, + iov[i].iov_len); + if (ret <= 0) + break; + } - if (ret < 0 && errno != EWOULDBLOCK) { + if (ret <= 0) { mevent_delete_close(sock->vss_conn_evp); sock->vss_conn_fd = -1; sock->vss_open = false; @@ -444,11 +453,15 @@ pci_vtcon_control_tx(struct pci_vtcon_port *port, void switch (ctrl->event) { case VTCON_DEVICE_READY: + sc->vsc_ready = true; /* set port ready events for registered ports */ for (i = 0; i < VTCON_MAXPORTS; i++) { tmp = &sc->vsc_ports[i]; if (tmp->vsp_enabled) pci_vtcon_announce_port(tmp); + + if (tmp->vsp_open) + pci_vtcon_open_port(tmp, true); } break; @@ -489,6 +502,11 @@ static void pci_vtcon_open_port(struct pci_vtcon_port *port, bool open) { struct pci_vtcon_control event; + + if (!port->vsp_sc->vsc_ready) { + port->vsp_open = true; + return; + } event.id = port->vsp_id; event.event = VTCON_PORT_OPEN; Modified: stable/11/usr.sbin/bhyve/virtio.h ============================================================================== --- stable/11/usr.sbin/bhyve/virtio.h Mon Jul 24 06:07:44 2017 (r321412) +++ stable/11/usr.sbin/bhyve/virtio.h Mon Jul 24 06:19:04 2017 (r321413) @@ -210,6 +210,7 @@ struct vring_used { #define VIRTIO_DEV_NET 0x1000 #define VIRTIO_DEV_BLOCK 0x1001 #define VIRTIO_DEV_RANDOM 0x1005 +#define VIRTIO_DEV_CONSOLE 0x1003 /* * PCI config space constants. From owner-svn-src-stable-11@freebsd.org Mon Jul 24 06:49:59 2017 Return-Path: Delivered-To: svn-src-stable-11@mailman.ysv.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by mailman.ysv.freebsd.org (Postfix) with ESMTP id 3DA12DBC272; Mon, 24 Jul 2017 06:49:59 +0000 (UTC) (envelope-from mav@FreeBSD.org) Received: from repo.freebsd.org (repo.freebsd.org [IPv6:2610:1c1:1:6068::e6a:0]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (Client did not present a certificate) by mx1.freebsd.org (Postfix) with ESMTPS id 0CA9982A67; Mon, 24 Jul 2017 06:49:58 +0000 (UTC) (envelope-from mav@FreeBSD.org) Received: from repo.freebsd.org ([127.0.1.37]) by repo.freebsd.org (8.15.2/8.15.2) with ESMTP id v6O6nwm8043974; Mon, 24 Jul 2017 06:49:58 GMT (envelope-from mav@FreeBSD.org) Received: (from mav@localhost) by repo.freebsd.org (8.15.2/8.15.2/Submit) id v6O6nwef043972; Mon, 24 Jul 2017 06:49:58 GMT (envelope-from mav@FreeBSD.org) Message-Id: <201707240649.v6O6nwef043972@repo.freebsd.org> X-Authentication-Warning: repo.freebsd.org: mav set sender to mav@FreeBSD.org using -f From: Alexander Motin Date: Mon, 24 Jul 2017 06:49:58 +0000 (UTC) To: src-committers@freebsd.org, svn-src-all@freebsd.org, svn-src-stable@freebsd.org, svn-src-stable-11@freebsd.org Subject: svn commit: r321414 - stable/11/sys/amd64/vmm/io X-SVN-Group: stable-11 X-SVN-Commit-Author: mav X-SVN-Commit-Paths: stable/11/sys/amd64/vmm/io X-SVN-Commit-Revision: 321414 X-SVN-Commit-Repository: base MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit X-BeenThere: svn-src-stable-11@freebsd.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: SVN commit messages for only the 11-stable src tree List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Mon, 24 Jul 2017 06:49:59 -0000 Author: mav Date: Mon Jul 24 06:49:57 2017 New Revision: 321414 URL: https://svnweb.freebsd.org/changeset/base/321414 Log: MFC r302843: Increase number of I/O APIC pins from 24 to 32 to give PCI up to 16 IRQs. Move HPET to the top of the supported 0-31 range. Modified: stable/11/sys/amd64/vmm/io/vhpet.c stable/11/sys/amd64/vmm/io/vioapic.c Directory Properties: stable/11/ (props changed) Modified: stable/11/sys/amd64/vmm/io/vhpet.c ============================================================================== --- stable/11/sys/amd64/vmm/io/vhpet.c Mon Jul 24 06:19:04 2017 (r321413) +++ stable/11/sys/amd64/vmm/io/vhpet.c Mon Jul 24 06:49:57 2017 (r321414) @@ -715,8 +715,10 @@ vhpet_init(struct vm *vm) vhpet->freq_sbt = bttosbt(bt); pincount = vioapic_pincount(vm); - if (pincount >= 24) - allowed_irqs = 0x00f00000; /* irqs 20, 21, 22 and 23 */ + if (pincount >= 32) + allowed_irqs = 0xff000000; /* irqs 24-31 */ + else if (pincount >= 20) + allowed_irqs = 0xf << (pincount - 4); /* 4 upper irqs */ else allowed_irqs = 0; Modified: stable/11/sys/amd64/vmm/io/vioapic.c ============================================================================== --- stable/11/sys/amd64/vmm/io/vioapic.c Mon Jul 24 06:19:04 2017 (r321413) +++ stable/11/sys/amd64/vmm/io/vioapic.c Mon Jul 24 06:49:57 2017 (r321414) @@ -49,7 +49,7 @@ __FBSDID("$FreeBSD$"); #define IOREGSEL 0x00 #define IOWIN 0x10 -#define REDIR_ENTRIES 24 +#define REDIR_ENTRIES 32 #define RTBL_RO_BITS ((uint64_t)(IOART_REM_IRR | IOART_DELIVS)) struct vioapic { From owner-svn-src-stable-11@freebsd.org Mon Jul 24 14:42:45 2017 Return-Path: Delivered-To: svn-src-stable-11@mailman.ysv.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by mailman.ysv.freebsd.org (Postfix) with ESMTP id 139F8C7F966; Mon, 24 Jul 2017 14:42:45 +0000 (UTC) (envelope-from ken@FreeBSD.org) Received: from repo.freebsd.org (repo.freebsd.org [IPv6:2610:1c1:1:6068::e6a:0]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (Client did not present a certificate) by mx1.freebsd.org (Postfix) with ESMTPS id DE2936A7EB; Mon, 24 Jul 2017 14:42:44 +0000 (UTC) (envelope-from ken@FreeBSD.org) Received: from repo.freebsd.org ([127.0.1.37]) by repo.freebsd.org (8.15.2/8.15.2) with ESMTP id v6OEghJ2038535; Mon, 24 Jul 2017 14:42:43 GMT (envelope-from ken@FreeBSD.org) Received: (from ken@localhost) by repo.freebsd.org (8.15.2/8.15.2/Submit) id v6OEghb1038532; Mon, 24 Jul 2017 14:42:43 GMT (envelope-from ken@FreeBSD.org) Message-Id: <201707241442.v6OEghb1038532@repo.freebsd.org> X-Authentication-Warning: repo.freebsd.org: ken set sender to ken@FreeBSD.org using -f From: "Kenneth D. Merry" Date: Mon, 24 Jul 2017 14:42:43 +0000 (UTC) To: src-committers@freebsd.org, svn-src-all@freebsd.org, svn-src-stable@freebsd.org, svn-src-stable-11@freebsd.org Subject: svn commit: r321415 - in stable/11/sys/dev: mpr mps X-SVN-Group: stable-11 X-SVN-Commit-Author: ken X-SVN-Commit-Paths: in stable/11/sys/dev: mpr mps X-SVN-Commit-Revision: 321415 X-SVN-Commit-Repository: base MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit X-BeenThere: svn-src-stable-11@freebsd.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: SVN commit messages for only the 11-stable src tree List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Mon, 24 Jul 2017 14:42:45 -0000 Author: ken Date: Mon Jul 24 14:42:43 2017 New Revision: 321415 URL: https://svnweb.freebsd.org/changeset/base/321415 Log: MFC r321207: ------------------------------------------------------------------------ r321207 | ken | 2017-07-19 09:39:01 -0600 (Wed, 19 Jul 2017) | 14 lines Fix spurious timeouts on commands sent to mps(4) and mpr(4) controllers. mps_wait_command() and mpr_wait_command() were using getmicrotime() to determine elapsed time when checking for a timeout in polled mode. getmicrotime() isn't guaranteed to monotonically increase, and that caused spurious timeouts occasionally. Switch to using getmicrouptime(), which does increase monotonically. This fixes the spurious timeouts in my test case. ------------------------------------------------------------------------ Reviewed by: slm, scottl Sponsored by: Spectra Logic Modified: stable/11/sys/dev/mpr/mpr.c stable/11/sys/dev/mps/mps.c Directory Properties: stable/11/ (props changed) Modified: stable/11/sys/dev/mpr/mpr.c ============================================================================== --- stable/11/sys/dev/mpr/mpr.c Mon Jul 24 06:49:57 2017 (r321414) +++ stable/11/sys/dev/mpr/mpr.c Mon Jul 24 14:42:43 2017 (r321415) @@ -3286,9 +3286,17 @@ mpr_wait_command(struct mpr_softc *sc, struct mpr_comm if (curthread->td_pflags & TDP_NOSLEEPING) #endif //__FreeBSD_version >= 1000029 sleep_flag = NO_SLEEP; - getmicrotime(&start_time); + getmicrouptime(&start_time); if (mtx_owned(&sc->mpr_mtx) && sleep_flag == CAN_SLEEP) { error = msleep(cm, &sc->mpr_mtx, 0, "mprwait", timeout*hz); + if (error == EWOULDBLOCK) { + /* + * Record the actual elapsed time in the case of a + * timeout for the message below. + */ + getmicrouptime(&cur_time); + timevalsub(&cur_time, &start_time); + } } else { while ((cm->cm_flags & MPR_CM_FLAGS_COMPLETE) == 0) { mpr_intr_locked(sc); @@ -3297,8 +3305,9 @@ mpr_wait_command(struct mpr_softc *sc, struct mpr_comm else DELAY(50000); - getmicrotime(&cur_time); - if ((cur_time.tv_sec - start_time.tv_sec) > timeout) { + getmicrouptime(&cur_time); + timevalsub(&cur_time, &start_time); + if (cur_time.tv_sec > timeout) { error = EWOULDBLOCK; break; } @@ -3306,7 +3315,9 @@ mpr_wait_command(struct mpr_softc *sc, struct mpr_comm } if (error == EWOULDBLOCK) { - mpr_dprint(sc, MPR_FAULT, "Calling Reinit from %s\n", __func__); + mpr_dprint(sc, MPR_FAULT, "Calling Reinit from %s, timeout=%d," + " elapsed=%jd\n", __func__, timeout, + (intmax_t)cur_time.tv_sec); rc = mpr_reinit(sc); mpr_dprint(sc, MPR_FAULT, "Reinit %s\n", (rc == 0) ? "success" : "failed"); Modified: stable/11/sys/dev/mps/mps.c ============================================================================== --- stable/11/sys/dev/mps/mps.c Mon Jul 24 06:49:57 2017 (r321414) +++ stable/11/sys/dev/mps/mps.c Mon Jul 24 14:42:43 2017 (r321415) @@ -2551,10 +2551,18 @@ mps_wait_command(struct mps_softc *sc, struct mps_comm */ if (curthread->td_no_sleeping != 0) sleep_flag = NO_SLEEP; - getmicrotime(&start_time); + getmicrouptime(&start_time); if (mtx_owned(&sc->mps_mtx) && sleep_flag == CAN_SLEEP) { cm->cm_flags |= MPS_CM_FLAGS_WAKEUP; error = msleep(cm, &sc->mps_mtx, 0, "mpswait", timeout*hz); + if (error == EWOULDBLOCK) { + /* + * Record the actual elapsed time in the case of a + * timeout for the message below. + */ + getmicrouptime(&cur_time); + timevalsub(&cur_time, &start_time); + } } else { while ((cm->cm_flags & MPS_CM_FLAGS_COMPLETE) == 0) { mps_intr_locked(sc); @@ -2563,8 +2571,9 @@ mps_wait_command(struct mps_softc *sc, struct mps_comm else DELAY(50000); - getmicrotime(&cur_time); - if ((cur_time.tv_sec - start_time.tv_sec) > timeout) { + getmicrouptime(&cur_time); + timevalsub(&cur_time, &start_time); + if (cur_time.tv_sec > timeout) { error = EWOULDBLOCK; break; } @@ -2572,7 +2581,9 @@ mps_wait_command(struct mps_softc *sc, struct mps_comm } if (error == EWOULDBLOCK) { - mps_dprint(sc, MPS_FAULT, "Calling Reinit from %s\n", __func__); + mps_dprint(sc, MPS_FAULT, "Calling Reinit from %s, timeout=%d," + " elapsed=%jd\n", __func__, timeout, + (intmax_t)cur_time.tv_sec); rc = mps_reinit(sc); mps_dprint(sc, MPS_FAULT, "Reinit %s\n", (rc == 0) ? "success" : "failed"); From owner-svn-src-stable-11@freebsd.org Mon Jul 24 16:23:29 2017 Return-Path: Delivered-To: svn-src-stable-11@mailman.ysv.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by mailman.ysv.freebsd.org (Postfix) with ESMTP id ACBBACFDA33; Mon, 24 Jul 2017 16:23:29 +0000 (UTC) (envelope-from markj@FreeBSD.org) Received: from repo.freebsd.org (repo.freebsd.org [IPv6:2610:1c1:1:6068::e6a:0]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (Client did not present a certificate) by mx1.freebsd.org (Postfix) with ESMTPS id 7B5996DECC; Mon, 24 Jul 2017 16:23:29 +0000 (UTC) (envelope-from markj@FreeBSD.org) Received: from repo.freebsd.org ([127.0.1.37]) by repo.freebsd.org (8.15.2/8.15.2) with ESMTP id v6OGNSxx079925; Mon, 24 Jul 2017 16:23:28 GMT (envelope-from markj@FreeBSD.org) Received: (from markj@localhost) by repo.freebsd.org (8.15.2/8.15.2/Submit) id v6OGNSGc079924; Mon, 24 Jul 2017 16:23:28 GMT (envelope-from markj@FreeBSD.org) Message-Id: <201707241623.v6OGNSGc079924@repo.freebsd.org> X-Authentication-Warning: repo.freebsd.org: markj set sender to markj@FreeBSD.org using -f From: Mark Johnston Date: Mon, 24 Jul 2017 16:23:28 +0000 (UTC) To: src-committers@freebsd.org, svn-src-all@freebsd.org, svn-src-stable@freebsd.org, svn-src-stable-11@freebsd.org Subject: svn commit: r321418 - stable/11/sys/kern X-SVN-Group: stable-11 X-SVN-Commit-Author: markj X-SVN-Commit-Paths: stable/11/sys/kern X-SVN-Commit-Revision: 321418 X-SVN-Commit-Repository: base MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit X-BeenThere: svn-src-stable-11@freebsd.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: SVN commit messages for only the 11-stable src tree List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Mon, 24 Jul 2017 16:23:29 -0000 Author: markj Date: Mon Jul 24 16:23:28 2017 New Revision: 321418 URL: https://svnweb.freebsd.org/changeset/base/321418 Log: MFC r320918, r321035: Have mkdumpheader() handle version string truncation. Modified: stable/11/sys/kern/kern_shutdown.c Directory Properties: stable/11/ (props changed) Modified: stable/11/sys/kern/kern_shutdown.c ============================================================================== --- stable/11/sys/kern/kern_shutdown.c Mon Jul 24 15:39:09 2017 (r321417) +++ stable/11/sys/kern/kern_shutdown.c Mon Jul 24 16:23:28 2017 (r321418) @@ -914,6 +914,7 @@ void mkdumpheader(struct kerneldumpheader *kdh, char *magic, uint32_t archver, uint64_t dumplen, uint32_t blksz) { + size_t dstsize; bzero(kdh, sizeof(*kdh)); strlcpy(kdh->magic, magic, sizeof(kdh->magic)); @@ -924,7 +925,9 @@ mkdumpheader(struct kerneldumpheader *kdh, char *magic kdh->dumptime = htod64(time_second); kdh->blocksize = htod32(blksz); strlcpy(kdh->hostname, prison0.pr_hostname, sizeof(kdh->hostname)); - strlcpy(kdh->versionstring, version, sizeof(kdh->versionstring)); + dstsize = sizeof(kdh->versionstring); + if (strlcpy(kdh->versionstring, version, dstsize) >= dstsize) + kdh->versionstring[dstsize - 2] = '\n'; if (panicstr != NULL) strlcpy(kdh->panicstring, panicstr, sizeof(kdh->panicstring)); kdh->parity = kerneldump_parity(kdh); From owner-svn-src-stable-11@freebsd.org Mon Jul 24 16:24:23 2017 Return-Path: Delivered-To: svn-src-stable-11@mailman.ysv.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by mailman.ysv.freebsd.org (Postfix) with ESMTP id 89CE7CFDABC; Mon, 24 Jul 2017 16:24:23 +0000 (UTC) (envelope-from markj@FreeBSD.org) Received: from repo.freebsd.org (repo.freebsd.org [IPv6:2610:1c1:1:6068::e6a:0]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (Client did not present a certificate) by mx1.freebsd.org (Postfix) with ESMTPS id 589576E095; Mon, 24 Jul 2017 16:24:23 +0000 (UTC) (envelope-from markj@FreeBSD.org) Received: from repo.freebsd.org ([127.0.1.37]) by repo.freebsd.org (8.15.2/8.15.2) with ESMTP id v6OGOM6e080013; Mon, 24 Jul 2017 16:24:22 GMT (envelope-from markj@FreeBSD.org) Received: (from markj@localhost) by repo.freebsd.org (8.15.2/8.15.2/Submit) id v6OGOM1P080012; Mon, 24 Jul 2017 16:24:22 GMT (envelope-from markj@FreeBSD.org) Message-Id: <201707241624.v6OGOM1P080012@repo.freebsd.org> X-Authentication-Warning: repo.freebsd.org: markj set sender to markj@FreeBSD.org using -f From: Mark Johnston Date: Mon, 24 Jul 2017 16:24:22 +0000 (UTC) To: src-committers@freebsd.org, svn-src-all@freebsd.org, svn-src-stable@freebsd.org, svn-src-stable-11@freebsd.org Subject: svn commit: r321419 - stable/11/usr.bin/top X-SVN-Group: stable-11 X-SVN-Commit-Author: markj X-SVN-Commit-Paths: stable/11/usr.bin/top X-SVN-Commit-Revision: 321419 X-SVN-Commit-Repository: base MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit X-BeenThere: svn-src-stable-11@freebsd.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: SVN commit messages for only the 11-stable src tree List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Mon, 24 Jul 2017 16:24:23 -0000 Author: markj Date: Mon Jul 24 16:24:22 2017 New Revision: 321419 URL: https://svnweb.freebsd.org/changeset/base/321419 Log: MFC r321356: Fix top(1) output when zfs.ko is loaded but ZFS is not in use. Modified: stable/11/usr.bin/top/machine.c Directory Properties: stable/11/ (props changed) Modified: stable/11/usr.bin/top/machine.c ============================================================================== --- stable/11/usr.bin/top/machine.c Mon Jul 24 16:23:28 2017 (r321418) +++ stable/11/usr.bin/top/machine.c Mon Jul 24 16:24:22 2017 (r321419) @@ -328,14 +328,15 @@ machine_init(struct statics *statics, char do_unames) size != sizeof(smpmode)) smpmode = 0; - size = sizeof(carc_en); - if (sysctlbyname("vfs.zfs.compressed_arc_enabled", &carc_en, &size, - NULL, 0) == 0 && carc_en == 1) - carc_enabled = 1; size = sizeof(arc_size); if (sysctlbyname("kstat.zfs.misc.arcstats.size", &arc_size, &size, NULL, 0) == 0 && arc_size != 0) arc_enabled = 1; + size = sizeof(carc_en); + if (arc_enabled && + sysctlbyname("vfs.zfs.compressed_arc_enabled", &carc_en, &size, + NULL, 0) == 0 && carc_en == 1) + carc_enabled = 1; if (do_unames) { while ((pw = getpwent()) != NULL) { From owner-svn-src-stable-11@freebsd.org Tue Jul 25 00:28:30 2017 Return-Path: Delivered-To: svn-src-stable-11@mailman.ysv.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by mailman.ysv.freebsd.org (Postfix) with ESMTP id A6A28DAC68B; Tue, 25 Jul 2017 00:28:30 +0000 (UTC) (envelope-from markj@FreeBSD.org) Received: from repo.freebsd.org (repo.freebsd.org [IPv6:2610:1c1:1:6068::e6a:0]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (Client did not present a certificate) by mx1.freebsd.org (Postfix) with ESMTPS id 7401C82B9B; Tue, 25 Jul 2017 00:28:30 +0000 (UTC) (envelope-from markj@FreeBSD.org) Received: from repo.freebsd.org ([127.0.1.37]) by repo.freebsd.org (8.15.2/8.15.2) with ESMTP id v6P0STwU076078; Tue, 25 Jul 2017 00:28:29 GMT (envelope-from markj@FreeBSD.org) Received: (from markj@localhost) by repo.freebsd.org (8.15.2/8.15.2/Submit) id v6P0STIp076077; Tue, 25 Jul 2017 00:28:29 GMT (envelope-from markj@FreeBSD.org) Message-Id: <201707250028.v6P0STIp076077@repo.freebsd.org> X-Authentication-Warning: repo.freebsd.org: markj set sender to markj@FreeBSD.org using -f From: Mark Johnston Date: Tue, 25 Jul 2017 00:28:29 +0000 (UTC) To: src-committers@freebsd.org, svn-src-all@freebsd.org, svn-src-stable@freebsd.org, svn-src-stable-11@freebsd.org Subject: svn commit: r321447 - stable/11/sbin/savecore X-SVN-Group: stable-11 X-SVN-Commit-Author: markj X-SVN-Commit-Paths: stable/11/sbin/savecore X-SVN-Commit-Revision: 321447 X-SVN-Commit-Repository: base MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit X-BeenThere: svn-src-stable-11@freebsd.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: SVN commit messages for only the 11-stable src tree List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Tue, 25 Jul 2017 00:28:30 -0000 Author: markj Date: Tue Jul 25 00:28:29 2017 New Revision: 321447 URL: https://svnweb.freebsd.org/changeset/base/321447 Log: MFC r320896: Add a subroutine for comparing kerneldump identifiers. Modified: stable/11/sbin/savecore/savecore.c Directory Properties: stable/11/ (props changed) Modified: stable/11/sbin/savecore/savecore.c ============================================================================== --- stable/11/sbin/savecore/savecore.c Tue Jul 25 00:28:23 2017 (r321446) +++ stable/11/sbin/savecore/savecore.c Tue Jul 25 00:28:29 2017 (r321447) @@ -308,6 +308,13 @@ check_space(const char *savedir, off_t dumpsize, int b return (1); } +static bool +compare_magic(const struct kerneldumpheader *kdh, const char *magic) +{ + + return (strncmp(kdh->magic, magic, sizeof(kdh->magic)) == 0); +} + #define BLOCKSIZE (1<<12) #define BLOCKMASK (~(BLOCKSIZE-1)) @@ -534,7 +541,7 @@ DoFile(const char *savedir, const char *device) } memcpy(&kdhl, temp, sizeof(kdhl)); istextdump = 0; - if (strncmp(kdhl.magic, TEXTDUMPMAGIC, sizeof kdhl) == 0) { + if (compare_magic(&kdhl, TEXTDUMPMAGIC)) { if (verbose) printf("textdump magic on last dump header on %s\n", device); @@ -548,8 +555,7 @@ DoFile(const char *savedir, const char *device) if (force == 0) goto closefd; } - } else if (memcmp(kdhl.magic, KERNELDUMPMAGIC, sizeof kdhl.magic) == - 0) { + } else if (compare_magic(&kdhl, KERNELDUMPMAGIC)) { if (dtoh32(kdhl.version) != KERNELDUMPVERSION) { syslog(LOG_ERR, "unknown version (%d) in last dump header on %s", @@ -568,8 +574,7 @@ DoFile(const char *savedir, const char *device) if (force == 0) goto closefd; - if (memcmp(kdhl.magic, KERNELDUMPMAGIC_CLEARED, - sizeof kdhl.magic) == 0) { + if (compare_magic(&kdhl, KERNELDUMPMAGIC_CLEARED)) { if (verbose) printf("forcing magic on %s\n", device); memcpy(kdhl.magic, KERNELDUMPMAGIC, From owner-svn-src-stable-11@freebsd.org Tue Jul 25 00:30:27 2017 Return-Path: Delivered-To: svn-src-stable-11@mailman.ysv.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by mailman.ysv.freebsd.org (Postfix) with ESMTP id 0892FDAC761; Tue, 25 Jul 2017 00:30:27 +0000 (UTC) (envelope-from markj@FreeBSD.org) Received: from repo.freebsd.org (repo.freebsd.org [IPv6:2610:1c1:1:6068::e6a:0]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (Client did not present a certificate) by mx1.freebsd.org (Postfix) with ESMTPS id C5C8E82E07; Tue, 25 Jul 2017 00:30:26 +0000 (UTC) (envelope-from markj@FreeBSD.org) Received: from repo.freebsd.org ([127.0.1.37]) by repo.freebsd.org (8.15.2/8.15.2) with ESMTP id v6P0UPbl076203; Tue, 25 Jul 2017 00:30:25 GMT (envelope-from markj@FreeBSD.org) Received: (from markj@localhost) by repo.freebsd.org (8.15.2/8.15.2/Submit) id v6P0UPhY076202; Tue, 25 Jul 2017 00:30:25 GMT (envelope-from markj@FreeBSD.org) Message-Id: <201707250030.v6P0UPhY076202@repo.freebsd.org> X-Authentication-Warning: repo.freebsd.org: markj set sender to markj@FreeBSD.org using -f From: Mark Johnston Date: Tue, 25 Jul 2017 00:30:25 +0000 (UTC) To: src-committers@freebsd.org, svn-src-all@freebsd.org, svn-src-stable@freebsd.org, svn-src-stable-11@freebsd.org Subject: svn commit: r321448 - stable/11/sbin/savecore X-SVN-Group: stable-11 X-SVN-Commit-Author: markj X-SVN-Commit-Paths: stable/11/sbin/savecore X-SVN-Commit-Revision: 321448 X-SVN-Commit-Repository: base MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit X-BeenThere: svn-src-stable-11@freebsd.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: SVN commit messages for only the 11-stable src tree List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Tue, 25 Jul 2017 00:30:27 -0000 Author: markj Date: Tue Jul 25 00:30:25 2017 New Revision: 321448 URL: https://svnweb.freebsd.org/changeset/base/321448 Log: Include stdbool.h for r321447. This is a direct commit to stable/11. Modified: stable/11/sbin/savecore/savecore.c Modified: stable/11/sbin/savecore/savecore.c ============================================================================== --- stable/11/sbin/savecore/savecore.c Tue Jul 25 00:28:29 2017 (r321447) +++ stable/11/sbin/savecore/savecore.c Tue Jul 25 00:30:25 2017 (r321448) @@ -75,6 +75,7 @@ __FBSDID("$FreeBSD$"); #include #include #include +#include #include #include #include From owner-svn-src-stable-11@freebsd.org Tue Jul 25 00:31:59 2017 Return-Path: Delivered-To: svn-src-stable-11@mailman.ysv.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by mailman.ysv.freebsd.org (Postfix) with ESMTP id D5678DAC959; Tue, 25 Jul 2017 00:31:59 +0000 (UTC) (envelope-from yaneurabeya@gmail.com) Received: from mail-pg0-x241.google.com (mail-pg0-x241.google.com [IPv6:2607:f8b0:400e:c05::241]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (Client CN "smtp.gmail.com", Issuer "Google Internet Authority G2" (verified OK)) by mx1.freebsd.org (Postfix) with ESMTPS id A170B83101; Tue, 25 Jul 2017 00:31:59 +0000 (UTC) (envelope-from yaneurabeya@gmail.com) Received: by mail-pg0-x241.google.com with SMTP id g14so8391474pgu.4; Mon, 24 Jul 2017 17:31:59 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20161025; h=subject:mime-version:from:in-reply-to:date:cc:message-id:references :to; bh=dWfgSyEJjFyjoF9eQeVvrBfVyN2l/HYHfZ/6//en280=; b=mTzJP7yTL2gwielauXLPBfVFIac4zLm++tjH1EYtR4oM5aJM/0feOZvzuTeSL8jW4z DoN4UkuQvbrvVIKkrciyn5UDo7JLKo/9wzo+14ClSCWm4MNHWroTHZ06x0q/3rDjJh5J kxCooYuGX5rleO1lZETH56VT4hDOciJQZfFTFaXPzx10p9jO6KPfn5/aHWrZOer+BbTg yXFcKWR9JfD4K8lyXw6OIaNnRahqpG/M14YiLNYRGYambZbksLtdy2hv0iMwuZhINL3S OYMB2Mpt3Gdo6RJycECvjJTF7dyzkZPUjRv6wVlnoM9Ov/9NnTIed4jTBG8fBvy4neNI 3vfw== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:subject:mime-version:from:in-reply-to:date:cc :message-id:references:to; bh=dWfgSyEJjFyjoF9eQeVvrBfVyN2l/HYHfZ/6//en280=; b=fKgfN+LQcy+jGYVxV+cihGPCchJA/cq4OPiZ+rgU8OU0QXqin/YmZT7KT79lqMM3ko skAMsQ70gkyUOmkHCeDRAsXwdjQDzng1jjhlTncuaEQCp7veUR3kowObrMnt4kxmMrP9 e5x9XG5WJAhQrx5Irah/rd7O08dFqFKv8yq6NtG9fsr+Ts8qgoMEjQMVT9kg6Vm7WBt9 s3SpMmC/964iovJ/O1zR1UQKgh91Kq+mQzfTz04hbDyjHHw/AVxMJPNRETMgZWUVSa0r qH25klPNpbK6eIEvjNfR0NH1ZqIUhRXQV1ddRzXcY5uVYYWRrTEl/SvOkl6t+usoNmlX 6VGA== X-Gm-Message-State: AIVw110mDGY8XB8s2m1uQ+6AVG0A0KEsmcNFhSo343rJR3zTyS8OA5C2 3s6J6cft9DlikFYVupw= X-Received: by 10.84.191.164 with SMTP id a33mr19411263pld.119.1500942719070; Mon, 24 Jul 2017 17:31:59 -0700 (PDT) Received: from pinklady.local (c-73-19-52-228.hsd1.wa.comcast.net. [73.19.52.228]) by smtp.gmail.com with ESMTPSA id y12sm21334475pgs.91.2017.07.24.17.31.58 (version=TLS1 cipher=ECDHE-RSA-AES128-SHA bits=128/128); Mon, 24 Jul 2017 17:31:58 -0700 (PDT) Subject: Re: svn commit: r321448 - stable/11/sbin/savecore Mime-Version: 1.0 (Mac OS X Mail 9.3 \(3124\)) Content-Type: multipart/signed; boundary="Apple-Mail=_56CFB916-9546-4FDD-8072-257063ABC673"; protocol="application/pgp-signature"; micalg=pgp-sha512 X-Pgp-Agent: GPGMail From: "Ngie Cooper (yaneurabeya)" In-Reply-To: <201707250030.v6P0UPhY076202@repo.freebsd.org> Date: Mon, 24 Jul 2017 17:31:57 -0700 Cc: src-committers@freebsd.org, svn-src-all@freebsd.org, svn-src-stable@freebsd.org, svn-src-stable-11@freebsd.org Message-Id: <51FECB6B-0838-4A93-9584-50623946FB14@gmail.com> References: <201707250030.v6P0UPhY076202@repo.freebsd.org> To: Mark Johnston X-Mailer: Apple Mail (2.3124) X-BeenThere: svn-src-stable-11@freebsd.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: SVN commit messages for only the 11-stable src tree List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Tue, 25 Jul 2017 00:31:59 -0000 --Apple-Mail=_56CFB916-9546-4FDD-8072-257063ABC673 Content-Transfer-Encoding: quoted-printable Content-Type: text/plain; charset=utf-8 > On Jul 24, 2017, at 17:30, Mark Johnston wrote: >=20 > Author: markj > Date: Tue Jul 25 00:30:25 2017 > New Revision: 321448 > URL: https://svnweb.freebsd.org/changeset/base/321448 >=20 > Log: > Include stdbool.h for r321447. >=20 > This is a direct commit to stable/11. Shouldn=E2=80=99t the pollution be committed to ^/head and backported = instead of doing a direct commit? Thanks, -Ngie --Apple-Mail=_56CFB916-9546-4FDD-8072-257063ABC673 Content-Transfer-Encoding: 7bit Content-Disposition: attachment; filename=signature.asc Content-Type: application/pgp-signature; name=signature.asc Content-Description: Message signed with OpenPGP using GPGMail -----BEGIN PGP SIGNATURE----- Comment: GPGTools - https://gpgtools.org iQIcBAEBCgAGBQJZdpF9AAoJEPWDqSZpMIYV+yEP/1i9e7Umzzjve/Lmu42gERkE dZ7i/EtKrh4N72EcR3oLg6O/zVKhYwA2Pv9PYJ0EIazYNPlBcseaqzupaNhw9yKx pwpgixxUugHSQwFw2VLs6Qy8P1CU/AczYI6s+jE3ePGsGdCkLASbiThUcSeQYORx 60pwpcJL9lVyU/x+gZmsQZcJhcodI3nuyHbOZ+GZ60f6p9RQ9i2JXyKHs/t5vuBm pdGt2elxTrW57NibGGfujRnGOOllfvZ2A4tABiN6BwyR7UB69M3AJ4VxQZepTmd1 b1MJ3bu3KS5D1q3nUW66zyl+aoOnUTrwXA0nX/LmOn7UxEuSMi4pwPBAKZgyF+UQ hq3Y7g3s7du9AdtHsV0Su92REYjQpRrgbbcDhILD/ai40xEkuDPuJRaKtvKlFWGn cpGbfAngSFrHkTCtmivL5iwfe5zGM5xMhxNcIMZATmv9vpFI+VqqHINiK5wVir6P X4kPMgo6dhChOjujiIJHcoW78er6QbsS7qjgOST8SQZaRh0Ma/MKBfgDy9Ah1tTY TOU953cFimf+5RGWSVs41P7TYWTk58nKMttN2wq2AUqa1HTXN7QlUb99V8e9bohw 5QuBjAzKTXuLVsGOhlj6kWnfL7uDtxU+Ddb7Td6oB79apykfCXWtqKUlRLCEy7xj Xdba94CWFYwfTD5B2Fjd =PmU0 -----END PGP SIGNATURE----- --Apple-Mail=_56CFB916-9546-4FDD-8072-257063ABC673-- From owner-svn-src-stable-11@freebsd.org Tue Jul 25 00:37:59 2017 Return-Path: Delivered-To: svn-src-stable-11@mailman.ysv.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by mailman.ysv.freebsd.org (Postfix) with ESMTP id 3C6E7DACBE5; Tue, 25 Jul 2017 00:37:59 +0000 (UTC) (envelope-from markjdb@gmail.com) Received: from mail-qk0-x22c.google.com (mail-qk0-x22c.google.com [IPv6:2607:f8b0:400d:c09::22c]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (Client CN "smtp.gmail.com", Issuer "Google Internet Authority G2" (verified OK)) by mx1.freebsd.org (Postfix) with ESMTPS id E6DC183543; Tue, 25 Jul 2017 00:37:58 +0000 (UTC) (envelope-from markjdb@gmail.com) Received: by mail-qk0-x22c.google.com with SMTP id k2so22537376qkf.0; Mon, 24 Jul 2017 17:37:58 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20161025; h=sender:date:from:to:cc:subject:message-id:references:mime-version :content-disposition:content-transfer-encoding:in-reply-to :user-agent; bh=mPAptb4pngmO5t/41UGzm++nBu329vnCgx0GJ6Uqfak=; b=kO1msF8muSaw+tCCCiiuH9Od03N9WKDPoc9s41+J52i03ximET4RDorGb6fMGGs/Cu 68oDkqzYBinsXxBVs3Zcd0x0aAxk72r22NWOuaFxJX42iyK/DVX/uMYSvmgu0EhM6KTa TSxjNq8nERpMQDmej93SkAjg3skPl/m+q6cxXhaTsTCg3n5DS8c4dbD7yo2Mahxcgg2S lq6DszPeV2K3M1VTORDmWnfKSM4sybodp8pS0EpRzfx10n3Rhsx48RffdDF8lgttP9HW 8Th8M1A+kZ6UMjaYBZoGNIatEM+ir5LVzsNtDZid1XjtZ68fhOVFmDafgEOqFjmUhnje 4iug== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:sender:date:from:to:cc:subject:message-id :references:mime-version:content-disposition :content-transfer-encoding:in-reply-to:user-agent; bh=mPAptb4pngmO5t/41UGzm++nBu329vnCgx0GJ6Uqfak=; b=pe8xpmR4O48cf+z4KkUbGuGYQIdKz9vl4f92pyzYftKfoofdykNn0uTrQBdJ8Anra4 NUSuKP42Xlvdhiz85sC8x83TYNiS+eSXFvwBzIoso+60w6G79oehxF6pSK2yxpfVQY/9 owHLl43u4q8eZZyKYcZEeuPkfqMhDVNw3LxZA0NrJhZMu2h8QPHWld5Sq6ZLHrPo/5YS x9QA9rUkasqw8IhIXF/Ak/fDSUQgctZsFhMk8Z7kj1Hk74zX4lgP8I9P2HZkBtWOEVe2 xgj59vppiBVygfrsEFU5lWASm+MVNya6AinCvSjnKn7u7hxj+2dCABANudPFW4qkPtNS PVqw== X-Gm-Message-State: AIVw110d1HtVCr2+Owm3TF88htSOCE5j0F04rcpIE8wwPKjy898JkY8w jVfuHqE9PVjbWhKgQDE= X-Received: by 10.55.18.222 with SMTP id 91mr22904010qks.350.1500943077932; Mon, 24 Jul 2017 17:37:57 -0700 (PDT) Received: from wkstn-mjohnston.west.isilon.com (c-76-104-201-218.hsd1.wa.comcast.net. [76.104.201.218]) by smtp.gmail.com with ESMTPSA id 64sm9944607qky.78.2017.07.24.17.37.56 (version=TLS1_2 cipher=ECDHE-RSA-AES128-GCM-SHA256 bits=128/128); Mon, 24 Jul 2017 17:37:57 -0700 (PDT) Sender: Mark Johnston Date: Mon, 24 Jul 2017 17:38:37 -0700 From: Mark Johnston To: "Ngie Cooper (yaneurabeya)" Cc: src-committers@freebsd.org, svn-src-all@freebsd.org, svn-src-stable@freebsd.org, svn-src-stable-11@freebsd.org Subject: Re: svn commit: r321448 - stable/11/sbin/savecore Message-ID: <20170725003837.GA95607@wkstn-mjohnston.west.isilon.com> References: <201707250030.v6P0UPhY076202@repo.freebsd.org> <51FECB6B-0838-4A93-9584-50623946FB14@gmail.com> MIME-Version: 1.0 Content-Type: text/plain; charset=utf-8 Content-Disposition: inline Content-Transfer-Encoding: 8bit In-Reply-To: <51FECB6B-0838-4A93-9584-50623946FB14@gmail.com> User-Agent: Mutt/1.8.3 (2017-05-23) X-BeenThere: svn-src-stable-11@freebsd.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: SVN commit messages for only the 11-stable src tree List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Tue, 25 Jul 2017 00:37:59 -0000 On Mon, Jul 24, 2017 at 05:31:57PM -0700, Ngie Cooper (yaneurabeya) wrote: > > > On Jul 24, 2017, at 17:30, Mark Johnston wrote: > > > > Author: markj > > Date: Tue Jul 25 00:30:25 2017 > > New Revision: 321448 > > URL: https://svnweb.freebsd.org/changeset/base/321448 > > > > Log: > > Include stdbool.h for r321447. > > > > This is a direct commit to stable/11. > > Shouldn’t the pollution be committed to ^/head and backported instead of doing a direct commit? > Thanks, > -Ngie The stdbool.h include is already present in head. From owner-svn-src-stable-11@freebsd.org Tue Jul 25 03:43:01 2017 Return-Path: Delivered-To: svn-src-stable-11@mailman.ysv.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by mailman.ysv.freebsd.org (Postfix) with ESMTP id ABD67DB4089; Tue, 25 Jul 2017 03:43:01 +0000 (UTC) (envelope-from alc@FreeBSD.org) Received: from repo.freebsd.org (repo.freebsd.org [IPv6:2610:1c1:1:6068::e6a:0]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (Client did not present a certificate) by mx1.freebsd.org (Postfix) with ESMTPS id 79B7A657A1; Tue, 25 Jul 2017 03:43:01 +0000 (UTC) (envelope-from alc@FreeBSD.org) Received: from repo.freebsd.org ([127.0.1.37]) by repo.freebsd.org (8.15.2/8.15.2) with ESMTP id v6P3h0M1057869; Tue, 25 Jul 2017 03:43:00 GMT (envelope-from alc@FreeBSD.org) Received: (from alc@localhost) by repo.freebsd.org (8.15.2/8.15.2/Submit) id v6P3h0JF057867; Tue, 25 Jul 2017 03:43:00 GMT (envelope-from alc@FreeBSD.org) Message-Id: <201707250343.v6P3h0JF057867@repo.freebsd.org> X-Authentication-Warning: repo.freebsd.org: alc set sender to alc@FreeBSD.org using -f From: Alan Cox Date: Tue, 25 Jul 2017 03:43:00 +0000 (UTC) To: src-committers@freebsd.org, svn-src-all@freebsd.org, svn-src-stable@freebsd.org, svn-src-stable-11@freebsd.org Subject: svn commit: r321453 - in stable/11/sys: kern sys X-SVN-Group: stable-11 X-SVN-Commit-Author: alc X-SVN-Commit-Paths: in stable/11/sys: kern sys X-SVN-Commit-Revision: 321453 X-SVN-Commit-Repository: base MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit X-BeenThere: svn-src-stable-11@freebsd.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: SVN commit messages for only the 11-stable src tree List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Tue, 25 Jul 2017 03:43:01 -0000 Author: alc Date: Tue Jul 25 03:43:00 2017 New Revision: 321453 URL: https://svnweb.freebsd.org/changeset/base/321453 Log: MFC r320077 Change blist_alloc()'s allocation policy from first-fit to next-fit so that disk writes are more likely to be sequential. This change is beneficial on both the solid state and mechanical disks that I've tested. (A similar change in allocation policy was made by DragonFly BSD in 2013 to speed up Poudriere with "stressful memory parameters".) Increase the width of blst_meta_alloc()'s parameter "skip" and the local variables whose values are derived from it to 64 bits. (This matches the width of the field "skip" that is stored in the structure "blist" and passed to blst_meta_alloc().) Eliminate a pointless check for a NULL blist_t. Simplify blst_meta_alloc()'s handling of the ALL-FREE case. Address nearby style errors. MFC r320417 Address the remaining integer overflow issues with the "skip" parameters and "next_skip" variables. The "skip" value in struct blist has long been a 64-bit quantity but various functions have implicitly truncated this value to 32 bits. Now, all arithmetic involving the "skip" value is 64 bits wide. (This should allow us to relax the size limit on a swap device in the swap pager.) Maintain the ability to test this allocator as a user-space application by including . Remove an unused variable from blst_radix_print(). MFC r320527 Change blst_leaf_alloc() to handle a cursor argument, and to improve performance. To find in the leaf bitmap all ranges of sufficient length, use a doubling strategy with shift-and-and until each bit still set represents a bit sequence of length 'count', or until the bitmask is zero. In the latter case, update the hint based on the first bit sequence length not found to be available. For example, seeking an interval of length 12, the set bits of the bitmap would represent intervals of length 1, then 2, then 3, then 6, then 12. If no bits are set at the point when each bit represents an interval of length 6, then the hint can be updated to 5 and the search terminated. If long-enough intervals are found, discard those before the cursor. If any remain, use binary search to find the position of the first of them, and allocate that interval. Modified: stable/11/sys/kern/subr_blist.c stable/11/sys/sys/blist.h Directory Properties: stable/11/ (props changed) Modified: stable/11/sys/kern/subr_blist.c ============================================================================== --- stable/11/sys/kern/subr_blist.c Tue Jul 25 03:41:05 2017 (r321452) +++ stable/11/sys/kern/subr_blist.c Tue Jul 25 03:43:00 2017 (r321453) @@ -105,6 +105,7 @@ __FBSDID("$FreeBSD$"); #include #include #include +#include #define bitcount64(x) __bitcount64((uint64_t)(x)) #define malloc(a,b,c) calloc(a, 1) @@ -120,22 +121,23 @@ void panic(const char *ctl, ...); * static support functions */ -static daddr_t blst_leaf_alloc(blmeta_t *scan, daddr_t blk, int count); -static daddr_t blst_meta_alloc(blmeta_t *scan, daddr_t blk, - daddr_t count, daddr_t radix, int skip); +static daddr_t blst_leaf_alloc(blmeta_t *scan, daddr_t blk, int count, + daddr_t cursor); +static daddr_t blst_meta_alloc(blmeta_t *scan, daddr_t blk, daddr_t count, + daddr_t radix, daddr_t skip, daddr_t cursor); static void blst_leaf_free(blmeta_t *scan, daddr_t relblk, int count); static void blst_meta_free(blmeta_t *scan, daddr_t freeBlk, daddr_t count, - daddr_t radix, int skip, daddr_t blk); + daddr_t radix, daddr_t skip, daddr_t blk); static void blst_copy(blmeta_t *scan, daddr_t blk, daddr_t radix, daddr_t skip, blist_t dest, daddr_t count); static daddr_t blst_leaf_fill(blmeta_t *scan, daddr_t blk, int count); static daddr_t blst_meta_fill(blmeta_t *scan, daddr_t allocBlk, daddr_t count, - daddr_t radix, int skip, daddr_t blk); -static daddr_t blst_radix_init(blmeta_t *scan, daddr_t radix, - int skip, daddr_t count); + daddr_t radix, daddr_t skip, daddr_t blk); +static daddr_t blst_radix_init(blmeta_t *scan, daddr_t radix, daddr_t skip, + daddr_t count); #ifndef _KERNEL -static void blst_radix_print(blmeta_t *scan, daddr_t blk, - daddr_t radix, int skip, int tab); +static void blst_radix_print(blmeta_t *scan, daddr_t blk, daddr_t radix, + daddr_t skip, int tab); #endif #ifdef _KERNEL @@ -157,18 +159,18 @@ blist_t blist_create(daddr_t blocks, int flags) { blist_t bl; - daddr_t nodes, radix; - int skip = 0; + daddr_t nodes, radix, skip; /* * Calculate radix and skip field used for scanning. */ radix = BLIST_BMAP_RADIX; - + skip = 0; while (radix < blocks) { radix *= BLIST_META_RADIX; skip = (skip + 1) * BLIST_META_RADIX; } + nodes = 1 + blst_radix_init(NULL, radix, skip, blocks); bl = malloc(sizeof(struct blist), M_SWAP, flags); if (bl == NULL) @@ -177,13 +179,13 @@ blist_create(daddr_t blocks, int flags) bl->bl_blocks = blocks; bl->bl_radix = radix; bl->bl_skip = skip; - nodes = 1 + blst_radix_init(NULL, radix, bl->bl_skip, blocks); + bl->bl_cursor = 0; bl->bl_root = malloc(nodes * sizeof(blmeta_t), M_SWAP, flags); if (bl->bl_root == NULL) { free(bl, M_SWAP); return (NULL); } - blst_radix_init(bl->bl_root, radix, bl->bl_skip, blocks); + blst_radix_init(bl->bl_root, radix, skip, blocks); #if defined(BLIST_DEBUG) printf( @@ -218,13 +220,24 @@ blist_alloc(blist_t bl, daddr_t count) { daddr_t blk; - if (bl != NULL && count <= bl->bl_root->bm_bighint) { + /* + * This loop iterates at most twice. An allocation failure in the + * first iteration leads to a second iteration only if the cursor was + * non-zero. When the cursor is zero, an allocation failure will + * reduce the hint, stopping further iterations. + */ + while (count <= bl->bl_root->bm_bighint) { if (bl->bl_radix == BLIST_BMAP_RADIX) - blk = blst_leaf_alloc(bl->bl_root, 0, count); + blk = blst_leaf_alloc(bl->bl_root, 0, count, + bl->bl_cursor); else blk = blst_meta_alloc(bl->bl_root, 0, count, - bl->bl_radix, bl->bl_skip); - return (blk); + bl->bl_radix, bl->bl_skip, bl->bl_cursor); + if (blk != SWAPBLK_NONE) { + bl->bl_cursor = blk + count; + return (blk); + } else if (bl->bl_cursor != 0) + bl->bl_cursor = 0; } return (SWAPBLK_NONE); } @@ -341,77 +354,92 @@ blist_print(blist_t bl) /* * blist_leaf_alloc() - allocate at a leaf in the radix tree (a bitmap). * - * This is the core of the allocator and is optimized for the 1 block - * and the BLIST_BMAP_RADIX block allocation cases. Other cases are - * somewhat slower. The 1 block allocation case is log2 and extremely - * quick. + * This is the core of the allocator and is optimized for the + * BLIST_BMAP_RADIX block allocation case. Otherwise, execution + * time is proportional to log2(count) + log2(BLIST_BMAP_RADIX). */ static daddr_t -blst_leaf_alloc( - blmeta_t *scan, - daddr_t blk, - int count -) { - u_daddr_t orig = scan->u.bmu_bitmap; +blst_leaf_alloc(blmeta_t *scan, daddr_t blk, int count, daddr_t cursor) +{ + u_daddr_t mask; + int count1, hi, lo, mid, num_shifts, range1, range_ext; - if (orig == 0) { + if (count == BLIST_BMAP_RADIX) { /* - * Optimize bitmap all-allocated case. Also, count = 1 - * case assumes at least 1 bit is free in the bitmap, so - * we have to take care of this case here. + * Optimize allocation of BLIST_BMAP_RADIX bits. If this wasn't + * a special case, then forming the final value of 'mask' below + * would require special handling to avoid an invalid left shift + * when count equals the number of bits in mask. */ + if (~scan->u.bmu_bitmap != 0) { + scan->bm_bighint = BLIST_BMAP_RADIX - 1; + return (SWAPBLK_NONE); + } + if (cursor != blk) + return (SWAPBLK_NONE); + scan->u.bmu_bitmap = 0; scan->bm_bighint = 0; - return(SWAPBLK_NONE); + return (blk); } - if (count == 1) { + range1 = 0; + count1 = count - 1; + num_shifts = fls(count1); + mask = scan->u.bmu_bitmap; + while (mask != 0 && num_shifts > 0) { /* - * Optimized code to allocate one bit out of the bitmap + * If bit i is set in mask, then bits in [i, i+range1] are set + * in scan->u.bmu_bitmap. The value of range1 is equal to + * count1 >> num_shifts. Grow range and reduce num_shifts to 0, + * while preserving these invariants. The updates to mask leave + * fewer bits set, but each bit that remains set represents a + * longer string of consecutive bits set in scan->u.bmu_bitmap. */ - u_daddr_t mask; - int j = BLIST_BMAP_RADIX/2; - int r = 0; - - mask = (u_daddr_t)-1 >> (BLIST_BMAP_RADIX/2); - - while (j) { - if ((orig & mask) == 0) { - r += j; - orig >>= j; - } - j >>= 1; - mask >>= j; - } - scan->u.bmu_bitmap &= ~((u_daddr_t)1 << r); - return(blk + r); + num_shifts--; + range_ext = range1 + ((count1 >> num_shifts) & 1); + mask &= mask >> range_ext; + range1 += range_ext; } - if (count <= BLIST_BMAP_RADIX) { + if (mask == 0) { /* - * non-optimized code to allocate N bits out of the bitmap. - * The more bits, the faster the code runs. It will run - * the slowest allocating 2 bits, but since there aren't any - * memory ops in the core loop (or shouldn't be, anyway), - * you probably won't notice the difference. + * Update bighint. There is no allocation bigger than range1 + * available in this leaf. */ - int j; - int n = BLIST_BMAP_RADIX - count; - u_daddr_t mask; + scan->bm_bighint = range1; + return (SWAPBLK_NONE); + } - mask = (u_daddr_t)-1 >> n; + /* + * Discard any candidates that appear before the cursor. + */ + lo = cursor - blk; + mask &= ~(u_daddr_t)0 << lo; - for (j = 0; j <= n; ++j) { - if ((orig & mask) == mask) { - scan->u.bmu_bitmap &= ~mask; - return(blk + j); - } - mask = (mask << 1); - } + if (mask == 0) + return (SWAPBLK_NONE); + + /* + * The least significant set bit in mask marks the start of the first + * available range of sufficient size. Clear all the bits but that one, + * and then perform a binary search to find its position. + */ + mask &= -mask; + hi = BLIST_BMAP_RADIX - count1; + while (lo + 1 < hi) { + mid = (lo + hi) >> 1; + if ((mask >> mid) != 0) + lo = mid; + else + hi = mid; } + /* - * We couldn't allocate count in this subtree, update bighint. + * Set in mask exactly the bits being allocated, and clear them from + * the set of available bits. */ - scan->bm_bighint = count - 1; - return(SWAPBLK_NONE); + mask = (mask << count) - mask; + scan->u.bmu_bitmap &= ~mask; + return (blk + lo); } /* @@ -424,16 +452,12 @@ blst_leaf_alloc( */ static daddr_t -blst_meta_alloc( - blmeta_t *scan, - daddr_t blk, - daddr_t count, - daddr_t radix, - int skip -) { - daddr_t r; - int i; - int next_skip = ((u_int)skip / BLIST_META_RADIX); +blst_meta_alloc(blmeta_t *scan, daddr_t blk, daddr_t count, daddr_t radix, + daddr_t skip, daddr_t cursor) +{ + daddr_t i, next_skip, r; + int child; + bool scan_from_start; if (scan->u.bmu_avail < count) { /* @@ -444,6 +468,7 @@ blst_meta_alloc( scan->bm_bighint = scan->u.bmu_avail; return (SWAPBLK_NONE); } + next_skip = skip / BLIST_META_RADIX; /* * An ALL-FREE meta node requires special handling before allocating @@ -457,13 +482,11 @@ blst_meta_alloc( * meta node cannot have a terminator in any subtree. */ for (i = 1; i <= skip; i += next_skip) { - if (next_skip == 1) { + if (next_skip == 1) scan[i].u.bmu_bitmap = (u_daddr_t)-1; - scan[i].bm_bighint = BLIST_BMAP_RADIX; - } else { - scan[i].bm_bighint = radix; + else scan[i].u.bmu_avail = radix; - } + scan[i].bm_bighint = radix; } } else { radix /= BLIST_META_RADIX; @@ -476,16 +499,21 @@ blst_meta_alloc( */ panic("allocation too large"); } - for (i = 1; i <= skip; i += next_skip) { + scan_from_start = cursor == blk; + child = (cursor - blk) / radix; + blk += child * radix; + for (i = 1 + child * next_skip; i <= skip; i += next_skip) { if (count <= scan[i].bm_bighint) { /* * The allocation might fit in the i'th subtree. */ if (next_skip == 1) { - r = blst_leaf_alloc(&scan[i], blk, count); + r = blst_leaf_alloc(&scan[i], blk, count, + cursor > blk ? cursor : blk); } else { r = blst_meta_alloc(&scan[i], blk, count, - radix, next_skip - 1); + radix, next_skip - 1, cursor > blk ? + cursor : blk); } if (r != SWAPBLK_NONE) { scan->u.bmu_avail -= count; @@ -503,9 +531,10 @@ blst_meta_alloc( /* * We couldn't allocate count in this subtree, update bighint. */ - if (scan->bm_bighint >= count) + if (scan_from_start && scan->bm_bighint >= count) scan->bm_bighint = count - 1; - return(SWAPBLK_NONE); + + return (SWAPBLK_NONE); } /* @@ -558,16 +587,11 @@ blst_leaf_free( */ static void -blst_meta_free( - blmeta_t *scan, - daddr_t freeBlk, - daddr_t count, - daddr_t radix, - int skip, - daddr_t blk -) { - int i; - int next_skip = ((u_int)skip / BLIST_META_RADIX); +blst_meta_free(blmeta_t *scan, daddr_t freeBlk, daddr_t count, daddr_t radix, + daddr_t skip, daddr_t blk) +{ + daddr_t i, next_skip, v; + int child; #if 0 printf("free (%llx,%lld) FROM (%llx,%lld)\n", @@ -575,6 +599,7 @@ blst_meta_free( (long long)blk, (long long)radix ); #endif + next_skip = skip / BLIST_META_RADIX; if (scan->u.bmu_avail == 0) { /* @@ -619,13 +644,10 @@ blst_meta_free( radix /= BLIST_META_RADIX; - i = (freeBlk - blk) / radix; - blk += i * radix; - i = i * next_skip + 1; - + child = (freeBlk - blk) / radix; + blk += child * radix; + i = 1 + child * next_skip; while (i <= skip && blk < freeBlk + count) { - daddr_t v; - v = blk + radix - freeBlk; if (v > count) v = count; @@ -662,8 +684,7 @@ static void blst_copy( blist_t dest, daddr_t count ) { - int next_skip; - int i; + daddr_t i, next_skip; /* * Leaf node @@ -708,7 +729,7 @@ static void blst_copy( radix /= BLIST_META_RADIX; - next_skip = ((u_int)skip / BLIST_META_RADIX); + next_skip = skip / BLIST_META_RADIX; for (i = 1; count && i <= skip; i += next_skip) { if (scan[i].bm_bighint == (daddr_t)-1) @@ -775,17 +796,11 @@ blst_leaf_fill(blmeta_t *scan, daddr_t blk, int count) * number of blocks allocated by the call. */ static daddr_t -blst_meta_fill( - blmeta_t *scan, - daddr_t allocBlk, - daddr_t count, - daddr_t radix, - int skip, - daddr_t blk -) { - int i; - int next_skip = ((u_int)skip / BLIST_META_RADIX); - daddr_t nblks = 0; +blst_meta_fill(blmeta_t *scan, daddr_t allocBlk, daddr_t count, daddr_t radix, + daddr_t skip, daddr_t blk) +{ + daddr_t i, nblks, next_skip, v; + int child; if (count > radix) { /* @@ -803,6 +818,7 @@ blst_meta_fill( scan->bm_bighint = 0; return nblks; } + next_skip = skip / BLIST_META_RADIX; /* * An ALL-FREE meta node requires special handling before allocating @@ -828,13 +844,11 @@ blst_meta_fill( radix /= BLIST_META_RADIX; } - i = (allocBlk - blk) / radix; - blk += i * radix; - i = i * next_skip + 1; - + nblks = 0; + child = (allocBlk - blk) / radix; + blk += child * radix; + i = 1 + child * next_skip; while (i <= skip && blk < allocBlk + count) { - daddr_t v; - v = blk + radix - allocBlk; if (v > count) v = count; @@ -867,12 +881,12 @@ blst_meta_fill( */ static daddr_t -blst_radix_init(blmeta_t *scan, daddr_t radix, int skip, daddr_t count) +blst_radix_init(blmeta_t *scan, daddr_t radix, daddr_t skip, daddr_t count) { - int i; - int next_skip; - daddr_t memindex = 0; + daddr_t i, memindex, next_skip; + memindex = 0; + /* * Leaf node */ @@ -897,7 +911,7 @@ blst_radix_init(blmeta_t *scan, daddr_t radix, int ski } radix /= BLIST_META_RADIX; - next_skip = ((u_int)skip / BLIST_META_RADIX); + next_skip = skip / BLIST_META_RADIX; for (i = 1; i <= skip; i += next_skip) { if (count >= radix) { @@ -939,11 +953,10 @@ blst_radix_init(blmeta_t *scan, daddr_t radix, int ski #ifdef BLIST_DEBUG static void -blst_radix_print(blmeta_t *scan, daddr_t blk, daddr_t radix, int skip, int tab) +blst_radix_print(blmeta_t *scan, daddr_t blk, daddr_t radix, daddr_t skip, + int tab) { - int i; - int next_skip; - int lastState = 0; + daddr_t i, next_skip; if (radix == BLIST_BMAP_RADIX) { printf( @@ -985,7 +998,7 @@ blst_radix_print(blmeta_t *scan, daddr_t blk, daddr_t ); radix /= BLIST_META_RADIX; - next_skip = ((u_int)skip / BLIST_META_RADIX); + next_skip = skip / BLIST_META_RADIX; tab += 4; for (i = 1; i <= skip; i += next_skip) { @@ -995,7 +1008,6 @@ blst_radix_print(blmeta_t *scan, daddr_t blk, daddr_t tab, tab, "", (long long)blk, (long long)radix ); - lastState = 0; break; } blst_radix_print( Modified: stable/11/sys/sys/blist.h ============================================================================== --- stable/11/sys/sys/blist.h Tue Jul 25 03:41:05 2017 (r321452) +++ stable/11/sys/sys/blist.h Tue Jul 25 03:43:00 2017 (r321453) @@ -82,6 +82,7 @@ typedef struct blist { daddr_t bl_blocks; /* area of coverage */ daddr_t bl_radix; /* coverage radix */ daddr_t bl_skip; /* starting skip */ + daddr_t bl_cursor; /* next-fit search starts at */ blmeta_t *bl_root; /* root of radix tree */ } *blist_t; From owner-svn-src-stable-11@freebsd.org Tue Jul 25 13:43:16 2017 Return-Path: Delivered-To: svn-src-stable-11@mailman.ysv.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by mailman.ysv.freebsd.org (Postfix) with ESMTP id 275C6C7CDAD; Tue, 25 Jul 2017 13:43:16 +0000 (UTC) (envelope-from gjb@FreeBSD.org) Received: from repo.freebsd.org (repo.freebsd.org [IPv6:2610:1c1:1:6068::e6a:0]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (Client did not present a certificate) by mx1.freebsd.org (Postfix) with ESMTPS id E195A7C04C; Tue, 25 Jul 2017 13:43:15 +0000 (UTC) (envelope-from gjb@FreeBSD.org) Received: from repo.freebsd.org ([127.0.1.37]) by repo.freebsd.org (8.15.2/8.15.2) with ESMTP id v6PDhFVa005359; Tue, 25 Jul 2017 13:43:15 GMT (envelope-from gjb@FreeBSD.org) Received: (from gjb@localhost) by repo.freebsd.org (8.15.2/8.15.2/Submit) id v6PDhFnD005358; Tue, 25 Jul 2017 13:43:15 GMT (envelope-from gjb@FreeBSD.org) Message-Id: <201707251343.v6PDhFnD005358@repo.freebsd.org> X-Authentication-Warning: repo.freebsd.org: gjb set sender to gjb@FreeBSD.org using -f From: Glen Barber Date: Tue, 25 Jul 2017 13:43:15 +0000 (UTC) To: src-committers@freebsd.org, svn-src-all@freebsd.org, svn-src-stable@freebsd.org, svn-src-stable-11@freebsd.org Subject: svn commit: r321473 - stable/11/release/doc/en_US.ISO8859-1/errata X-SVN-Group: stable-11 X-SVN-Commit-Author: gjb X-SVN-Commit-Paths: stable/11/release/doc/en_US.ISO8859-1/errata X-SVN-Commit-Revision: 321473 X-SVN-Commit-Repository: base MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit X-BeenThere: svn-src-stable-11@freebsd.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: SVN commit messages for only the 11-stable src tree List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Tue, 25 Jul 2017 13:43:16 -0000 Author: gjb Date: Tue Jul 25 13:43:14 2017 New Revision: 321473 URL: https://svnweb.freebsd.org/changeset/base/321473 Log: Fix a typo. PR: 220917 (related) Submitted by: fbsdbugs4 _at_ sentry dot org Sponsored by: The FreeBSD Foundation Modified: stable/11/release/doc/en_US.ISO8859-1/errata/article.xml Modified: stable/11/release/doc/en_US.ISO8859-1/errata/article.xml ============================================================================== --- stable/11/release/doc/en_US.ISO8859-1/errata/article.xml Tue Jul 25 13:18:28 2017 (r321472) +++ stable/11/release/doc/en_US.ISO8859-1/errata/article.xml Tue Jul 25 13:43:14 2017 (r321473) @@ -105,8 +105,8 @@ without recompiling the kernel. To mitigate system crashes with such configurations, - chose Escape to loader prompt in the boot - menu and enter the following lines from &man.loader.8; + choose Escape to loader prompt in the + boot menu and enter the following lines from &man.loader.8; prompt, after an OK: set kern.kstack_pages=4 From owner-svn-src-stable-11@freebsd.org Tue Jul 25 13:43:52 2017 Return-Path: Delivered-To: svn-src-stable-11@mailman.ysv.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by mailman.ysv.freebsd.org (Postfix) with ESMTP id C5B07C7CE27; Tue, 25 Jul 2017 13:43:52 +0000 (UTC) (envelope-from gjb@FreeBSD.org) Received: from repo.freebsd.org (repo.freebsd.org [IPv6:2610:1c1:1:6068::e6a:0]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (Client did not present a certificate) by mx1.freebsd.org (Postfix) with ESMTPS id 9F6D97C17B; Tue, 25 Jul 2017 13:43:52 +0000 (UTC) (envelope-from gjb@FreeBSD.org) Received: from repo.freebsd.org ([127.0.1.37]) by repo.freebsd.org (8.15.2/8.15.2) with ESMTP id v6PDhpTT005433; Tue, 25 Jul 2017 13:43:51 GMT (envelope-from gjb@FreeBSD.org) Received: (from gjb@localhost) by repo.freebsd.org (8.15.2/8.15.2/Submit) id v6PDhpnQ005431; Tue, 25 Jul 2017 13:43:51 GMT (envelope-from gjb@FreeBSD.org) Message-Id: <201707251343.v6PDhpnQ005431@repo.freebsd.org> X-Authentication-Warning: repo.freebsd.org: gjb set sender to gjb@FreeBSD.org using -f From: Glen Barber Date: Tue, 25 Jul 2017 13:43:51 +0000 (UTC) To: src-committers@freebsd.org, svn-src-all@freebsd.org, svn-src-stable@freebsd.org, svn-src-stable-11@freebsd.org Subject: svn commit: r321474 - stable/11/release/doc/share/xml X-SVN-Group: stable-11 X-SVN-Commit-Author: gjb X-SVN-Commit-Paths: stable/11/release/doc/share/xml X-SVN-Commit-Revision: 321474 X-SVN-Commit-Repository: base MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit X-BeenThere: svn-src-stable-11@freebsd.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: SVN commit messages for only the 11-stable src tree List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Tue, 25 Jul 2017 13:43:52 -0000 Author: gjb Date: Tue Jul 25 13:43:51 2017 New Revision: 321474 URL: https://svnweb.freebsd.org/changeset/base/321474 Log: Trim stale entries from 11.0. Approved by: re (implicit) Sponsored by: The FreeBSD Foundation Modified: stable/11/release/doc/share/xml/errata.xml stable/11/release/doc/share/xml/security.xml Modified: stable/11/release/doc/share/xml/errata.xml ============================================================================== --- stable/11/release/doc/share/xml/errata.xml Tue Jul 25 13:43:14 2017 (r321473) +++ stable/11/release/doc/share/xml/errata.xml Tue Jul 25 13:43:51 2017 (r321474) @@ -19,73 +19,9 @@ - FreeBSD-EN-16:18.loader - 25 October 2016 - Loader may hang during boot - - - - FreeBSD-EN-16:19.tzcode - 6 December 2016 - Fix warnings about invalid timezone - abbreviations - - - - FreeBSD-EN-16:20.tzdata - 6 December 2016 - Update timezone database - information - - - - FreeBSD-EN-16:21.localedef - 6 December 2016 - Fix incorrectly defined unicode - characters - - - - FreeBSD-EN-17:01.pcie - 23 February 2017 - Fix system hang when booting when PCI-express - HotPlug is enabled - - - - FreeBSD-EN-17:02.yp - 23 February 2017 - Fix NIS master updates are not pushed to an NIS - slave - - - - FreeBSD-EN-17:03.hyperv - 23 February 2017 - Fix compatibility with Hyper-V/storage after - KB3172614 or KB3179574 - - - - FreeBSD-EN-17:04.mandoc - 23 February 2017 - Make &man.makewhatis.1; output - reproducible - - - - FreeBSD-EN-17:05.xen - 23 February 2017 - Xen migration enhancements + No notices +   +   Modified: stable/11/release/doc/share/xml/security.xml ============================================================================== --- stable/11/release/doc/share/xml/security.xml Tue Jul 25 13:43:14 2017 (r321473) +++ stable/11/release/doc/share/xml/security.xml Tue Jul 25 13:43:51 2017 (r321474) @@ -19,84 +19,9 @@ - FreeBSD-SA-16:32.bhyve - 25 October 2016 - Privilege escalation vulnerability - - - - FreeBSD-SA-16:33.openssh - 2 November 2016 - Remote Denial of Service - vulnerability - - - - FreeBSD-SA-16:36.telnetd - 6 December 2016 - Possible &man.login.1; argument - injection - - - - FreeBSD-SA-16:37.libc - 6 December 2016 - &man.link.ntoa.3; buffer overflow - - - - FreeBSD-SA-16:38.bhyve - 6 December 2016 - Possible escape from &man.bhyve.8; virtual - machine - - - - FreeBSD-SA-16:39.ntp - 22 December 2016 - Multiple vulnerabilities - - - - FreeBSD-SA-17:01.openssh - 10 January 2017 - Multiple vulnerabilities - - - - FreeBSD-SA-17:02.openssl - 23 February 2017 - Multiple vulnerabilities - - - - FreeBSD-SA-17:03.ntp - 12 April 2017 - Multiple vulnerabilities - - - - FreeBSD-SA-17:04.ipfilter - 27 April 2017 - Fix fragment handling panic - - - - FreeBSD-SA-17:05.heimdal - 12 July 2017 - Fix KDC-REP service name validation - vulnerability + No advisories +   +   From owner-svn-src-stable-11@freebsd.org Tue Jul 25 14:35:45 2017 Return-Path: Delivered-To: svn-src-stable-11@mailman.ysv.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by mailman.ysv.freebsd.org (Postfix) with ESMTP id D2C94C7D8D4; Tue, 25 Jul 2017 14:35:45 +0000 (UTC) (envelope-from gjb@FreeBSD.org) Received: from repo.freebsd.org (repo.freebsd.org [IPv6:2610:1c1:1:6068::e6a:0]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (Client did not present a certificate) by mx1.freebsd.org (Postfix) with ESMTPS id 9F6BA7D63F; Tue, 25 Jul 2017 14:35:45 +0000 (UTC) (envelope-from gjb@FreeBSD.org) Received: from repo.freebsd.org ([127.0.1.37]) by repo.freebsd.org (8.15.2/8.15.2) with ESMTP id v6PEZifK025987; Tue, 25 Jul 2017 14:35:44 GMT (envelope-from gjb@FreeBSD.org) Received: (from gjb@localhost) by repo.freebsd.org (8.15.2/8.15.2/Submit) id v6PEZisI025985; Tue, 25 Jul 2017 14:35:44 GMT (envelope-from gjb@FreeBSD.org) Message-Id: <201707251435.v6PEZisI025985@repo.freebsd.org> X-Authentication-Warning: repo.freebsd.org: gjb set sender to gjb@FreeBSD.org using -f From: Glen Barber Date: Tue, 25 Jul 2017 14:35:44 +0000 (UTC) To: src-committers@freebsd.org, svn-src-all@freebsd.org, svn-src-stable@freebsd.org, svn-src-stable-11@freebsd.org Subject: svn commit: r321475 - stable/11/release/doc/en_US.ISO8859-1/errata X-SVN-Group: stable-11 X-SVN-Commit-Author: gjb X-SVN-Commit-Paths: stable/11/release/doc/en_US.ISO8859-1/errata X-SVN-Commit-Revision: 321475 X-SVN-Commit-Repository: base MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit X-BeenThere: svn-src-stable-11@freebsd.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: SVN commit messages for only the 11-stable src tree List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Tue, 25 Jul 2017 14:35:45 -0000 Author: gjb Date: Tue Jul 25 14:35:44 2017 New Revision: 321475 URL: https://svnweb.freebsd.org/changeset/base/321475 Log: Mention arm64 lacking EFI RTC support, and a workaround. Submitted by: emaste Sponsored by: The FreeBSD Foundation Modified: stable/11/release/doc/en_US.ISO8859-1/errata/article.xml Modified: stable/11/release/doc/en_US.ISO8859-1/errata/article.xml ============================================================================== --- stable/11/release/doc/en_US.ISO8859-1/errata/article.xml Tue Jul 25 13:43:51 2017 (r321474) +++ stable/11/release/doc/en_US.ISO8859-1/errata/article.xml Tue Jul 25 14:35:44 2017 (r321475) @@ -131,6 +131,17 @@ boot upgrade from anywhere on earlier stable branches, so caution should be exercised. + + + [2017-07-25] &os;/&arch.arm64; currently lacks + EFI real-time clock + (RTC) support, which may cause the system + to boot with the wrong time set. + + As a workaround, either enable &man.ntpdate.8; or + include ntpd_sync_on_start="YES" in + &man.rc.conf.5;. + From owner-svn-src-stable-11@freebsd.org Tue Jul 25 14:46:15 2017 Return-Path: Delivered-To: svn-src-stable-11@mailman.ysv.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by mailman.ysv.freebsd.org (Postfix) with ESMTP id 1C009C7DFD0; Tue, 25 Jul 2017 14:46:15 +0000 (UTC) (envelope-from gjb@FreeBSD.org) Received: from repo.freebsd.org (repo.freebsd.org [IPv6:2610:1c1:1:6068::e6a:0]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (Client did not present a certificate) by mx1.freebsd.org (Postfix) with ESMTPS id D977C7E2EC; Tue, 25 Jul 2017 14:46:14 +0000 (UTC) (envelope-from gjb@FreeBSD.org) Received: from repo.freebsd.org ([127.0.1.37]) by repo.freebsd.org (8.15.2/8.15.2) with ESMTP id v6PEkEe7030729; Tue, 25 Jul 2017 14:46:14 GMT (envelope-from gjb@FreeBSD.org) Received: (from gjb@localhost) by repo.freebsd.org (8.15.2/8.15.2/Submit) id v6PEkEgZ030728; Tue, 25 Jul 2017 14:46:14 GMT (envelope-from gjb@FreeBSD.org) Message-Id: <201707251446.v6PEkEgZ030728@repo.freebsd.org> X-Authentication-Warning: repo.freebsd.org: gjb set sender to gjb@FreeBSD.org using -f From: Glen Barber Date: Tue, 25 Jul 2017 14:46:14 +0000 (UTC) To: src-committers@freebsd.org, svn-src-all@freebsd.org, svn-src-stable@freebsd.org, svn-src-stable-11@freebsd.org Subject: svn commit: r321478 - stable/11/release/doc/en_US.ISO8859-1/errata X-SVN-Group: stable-11 X-SVN-Commit-Author: gjb X-SVN-Commit-Paths: stable/11/release/doc/en_US.ISO8859-1/errata X-SVN-Commit-Revision: 321478 X-SVN-Commit-Repository: base MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit X-BeenThere: svn-src-stable-11@freebsd.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: SVN commit messages for only the 11-stable src tree List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Tue, 25 Jul 2017 14:46:15 -0000 Author: gjb Date: Tue Jul 25 14:46:13 2017 New Revision: 321478 URL: https://svnweb.freebsd.org/changeset/base/321478 Log: Document a late-discovered issue where 'root on ZFS' installations on arm64 fail to find the root pool. Sponsored by: The FreeBSD Foundation Modified: stable/11/release/doc/en_US.ISO8859-1/errata/article.xml Modified: stable/11/release/doc/en_US.ISO8859-1/errata/article.xml ============================================================================== --- stable/11/release/doc/en_US.ISO8859-1/errata/article.xml Tue Jul 25 14:41:50 2017 (r321477) +++ stable/11/release/doc/en_US.ISO8859-1/errata/article.xml Tue Jul 25 14:46:13 2017 (r321478) @@ -142,6 +142,15 @@ boot include ntpd_sync_on_start="YES" in &man.rc.conf.5;. + + + [2017-07-25] A late issue was discovered with + &os;/&arch.arm64; and "root on + ZFS" installations where the root + ZFS pool would fail to be located. + + There currently is no workaround. + From owner-svn-src-stable-11@freebsd.org Tue Jul 25 17:24:51 2017 Return-Path: Delivered-To: svn-src-stable-11@mailman.ysv.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by mailman.ysv.freebsd.org (Postfix) with ESMTP id A5E9CCFDA38; Tue, 25 Jul 2017 17:24:51 +0000 (UTC) (envelope-from emaste@FreeBSD.org) Received: from repo.freebsd.org (repo.freebsd.org [IPv6:2610:1c1:1:6068::e6a:0]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (Client did not present a certificate) by mx1.freebsd.org (Postfix) with ESMTPS id 818298461F; Tue, 25 Jul 2017 17:24:51 +0000 (UTC) (envelope-from emaste@FreeBSD.org) Received: from repo.freebsd.org ([127.0.1.37]) by repo.freebsd.org (8.15.2/8.15.2) with ESMTP id v6PHOoYr098867; Tue, 25 Jul 2017 17:24:50 GMT (envelope-from emaste@FreeBSD.org) Received: (from emaste@localhost) by repo.freebsd.org (8.15.2/8.15.2/Submit) id v6PHOoJ5098866; Tue, 25 Jul 2017 17:24:50 GMT (envelope-from emaste@FreeBSD.org) Message-Id: <201707251724.v6PHOoJ5098866@repo.freebsd.org> X-Authentication-Warning: repo.freebsd.org: emaste set sender to emaste@FreeBSD.org using -f From: Ed Maste Date: Tue, 25 Jul 2017 17:24:50 +0000 (UTC) To: src-committers@freebsd.org, svn-src-all@freebsd.org, svn-src-stable@freebsd.org, svn-src-stable-11@freebsd.org Subject: svn commit: r321488 - stable/11/contrib/elftoolchain/readelf X-SVN-Group: stable-11 X-SVN-Commit-Author: emaste X-SVN-Commit-Paths: stable/11/contrib/elftoolchain/readelf X-SVN-Commit-Revision: 321488 X-SVN-Commit-Repository: base MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit X-BeenThere: svn-src-stable-11@freebsd.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: SVN commit messages for only the 11-stable src tree List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Tue, 25 Jul 2017 17:24:51 -0000 Author: emaste Date: Tue Jul 25 17:24:50 2017 New Revision: 321488 URL: https://svnweb.freebsd.org/changeset/base/321488 Log: readelf: fix printing of DT_FILTER and some other DT_* values MFC r321045: readelf: fix printing of DT_FILTER and some other DT_* values Some non-processor-specific DT_* values overlap the range DT_LOPROC to DT_HIPROC. Handle common ones first, then the processor-specific ones. MFC r321046: readelf: correct printing of DT_FILTER and DT_AUXILIARY values Previously these were shown only for MIPS objects. Obtained from: ELF Tool Chain r3563, r3564 Sponsored by: The FreeBSD Foundation Modified: stable/11/contrib/elftoolchain/readelf/readelf.c Directory Properties: stable/11/ (props changed) Modified: stable/11/contrib/elftoolchain/readelf/readelf.c ============================================================================== --- stable/11/contrib/elftoolchain/readelf/readelf.c Tue Jul 25 17:04:35 2017 (r321487) +++ stable/11/contrib/elftoolchain/readelf/readelf.c Tue Jul 25 17:24:50 2017 (r321488) @@ -783,6 +783,80 @@ dt_type(unsigned int mach, unsigned int dtype) { static char s_dtype[32]; + switch (dtype) { + case DT_NULL: return "NULL"; + case DT_NEEDED: return "NEEDED"; + case DT_PLTRELSZ: return "PLTRELSZ"; + case DT_PLTGOT: return "PLTGOT"; + case DT_HASH: return "HASH"; + case DT_STRTAB: return "STRTAB"; + case DT_SYMTAB: return "SYMTAB"; + case DT_RELA: return "RELA"; + case DT_RELASZ: return "RELASZ"; + case DT_RELAENT: return "RELAENT"; + case DT_STRSZ: return "STRSZ"; + case DT_SYMENT: return "SYMENT"; + case DT_INIT: return "INIT"; + case DT_FINI: return "FINI"; + case DT_SONAME: return "SONAME"; + case DT_RPATH: return "RPATH"; + case DT_SYMBOLIC: return "SYMBOLIC"; + case DT_REL: return "REL"; + case DT_RELSZ: return "RELSZ"; + case DT_RELENT: return "RELENT"; + case DT_PLTREL: return "PLTREL"; + case DT_DEBUG: return "DEBUG"; + case DT_TEXTREL: return "TEXTREL"; + case DT_JMPREL: return "JMPREL"; + case DT_BIND_NOW: return "BIND_NOW"; + case DT_INIT_ARRAY: return "INIT_ARRAY"; + case DT_FINI_ARRAY: return "FINI_ARRAY"; + case DT_INIT_ARRAYSZ: return "INIT_ARRAYSZ"; + case DT_FINI_ARRAYSZ: return "FINI_ARRAYSZ"; + case DT_RUNPATH: return "RUNPATH"; + case DT_FLAGS: return "FLAGS"; + case DT_PREINIT_ARRAY: return "PREINIT_ARRAY"; + case DT_PREINIT_ARRAYSZ: return "PREINIT_ARRAYSZ"; + case DT_MAXPOSTAGS: return "MAXPOSTAGS"; + case DT_SUNW_AUXILIARY: return "SUNW_AUXILIARY"; + case DT_SUNW_RTLDINF: return "SUNW_RTLDINF"; + case DT_SUNW_FILTER: return "SUNW_FILTER"; + case DT_SUNW_CAP: return "SUNW_CAP"; + case DT_CHECKSUM: return "CHECKSUM"; + case DT_PLTPADSZ: return "PLTPADSZ"; + case DT_MOVEENT: return "MOVEENT"; + case DT_MOVESZ: return "MOVESZ"; + case DT_FEATURE: return "FEATURE"; + case DT_POSFLAG_1: return "POSFLAG_1"; + case DT_SYMINSZ: return "SYMINSZ"; + case DT_SYMINENT: return "SYMINENT"; + case DT_GNU_HASH: return "GNU_HASH"; + case DT_TLSDESC_PLT: return "DT_TLSDESC_PLT"; + case DT_TLSDESC_GOT: return "DT_TLSDESC_GOT"; + case DT_GNU_CONFLICT: return "GNU_CONFLICT"; + case DT_GNU_LIBLIST: return "GNU_LIBLIST"; + case DT_CONFIG: return "CONFIG"; + case DT_DEPAUDIT: return "DEPAUDIT"; + case DT_AUDIT: return "AUDIT"; + case DT_PLTPAD: return "PLTPAD"; + case DT_MOVETAB: return "MOVETAB"; + case DT_SYMINFO: return "SYMINFO"; + case DT_VERSYM: return "VERSYM"; + case DT_RELACOUNT: return "RELACOUNT"; + case DT_RELCOUNT: return "RELCOUNT"; + case DT_FLAGS_1: return "FLAGS_1"; + case DT_VERDEF: return "VERDEF"; + case DT_VERDEFNUM: return "VERDEFNUM"; + case DT_VERNEED: return "VERNEED"; + case DT_VERNEEDNUM: return "VERNEEDNUM"; + case DT_AUXILIARY: return "AUXILIARY"; + case DT_USED: return "USED"; + case DT_FILTER: return "FILTER"; + case DT_GNU_PRELINKED: return "GNU_PRELINKED"; + case DT_GNU_CONFLICTSZ: return "GNU_CONFLICTSZ"; + case DT_GNU_LIBLISTSZ: return "GNU_LIBLISTSZ"; + } + if (dtype >= DT_LOPROC && dtype <= DT_HIPROC) { switch (mach) { case EM_ARM: @@ -903,86 +977,10 @@ dt_type(unsigned int mach, unsigned int dtype) default: break; } - snprintf(s_dtype, sizeof(s_dtype), "", dtype); - return (s_dtype); } - switch (dtype) { - case DT_NULL: return "NULL"; - case DT_NEEDED: return "NEEDED"; - case DT_PLTRELSZ: return "PLTRELSZ"; - case DT_PLTGOT: return "PLTGOT"; - case DT_HASH: return "HASH"; - case DT_STRTAB: return "STRTAB"; - case DT_SYMTAB: return "SYMTAB"; - case DT_RELA: return "RELA"; - case DT_RELASZ: return "RELASZ"; - case DT_RELAENT: return "RELAENT"; - case DT_STRSZ: return "STRSZ"; - case DT_SYMENT: return "SYMENT"; - case DT_INIT: return "INIT"; - case DT_FINI: return "FINI"; - case DT_SONAME: return "SONAME"; - case DT_RPATH: return "RPATH"; - case DT_SYMBOLIC: return "SYMBOLIC"; - case DT_REL: return "REL"; - case DT_RELSZ: return "RELSZ"; - case DT_RELENT: return "RELENT"; - case DT_PLTREL: return "PLTREL"; - case DT_DEBUG: return "DEBUG"; - case DT_TEXTREL: return "TEXTREL"; - case DT_JMPREL: return "JMPREL"; - case DT_BIND_NOW: return "BIND_NOW"; - case DT_INIT_ARRAY: return "INIT_ARRAY"; - case DT_FINI_ARRAY: return "FINI_ARRAY"; - case DT_INIT_ARRAYSZ: return "INIT_ARRAYSZ"; - case DT_FINI_ARRAYSZ: return "FINI_ARRAYSZ"; - case DT_RUNPATH: return "RUNPATH"; - case DT_FLAGS: return "FLAGS"; - case DT_PREINIT_ARRAY: return "PREINIT_ARRAY"; - case DT_PREINIT_ARRAYSZ: return "PREINIT_ARRAYSZ"; - case DT_MAXPOSTAGS: return "MAXPOSTAGS"; - case DT_SUNW_AUXILIARY: return "SUNW_AUXILIARY"; - case DT_SUNW_RTLDINF: return "SUNW_RTLDINF"; - case DT_SUNW_FILTER: return "SUNW_FILTER"; - case DT_SUNW_CAP: return "SUNW_CAP"; - case DT_CHECKSUM: return "CHECKSUM"; - case DT_PLTPADSZ: return "PLTPADSZ"; - case DT_MOVEENT: return "MOVEENT"; - case DT_MOVESZ: return "MOVESZ"; - case DT_FEATURE: return "FEATURE"; - case DT_POSFLAG_1: return "POSFLAG_1"; - case DT_SYMINSZ: return "SYMINSZ"; - case DT_SYMINENT: return "SYMINENT"; - case DT_GNU_HASH: return "GNU_HASH"; - case DT_TLSDESC_PLT: return "DT_TLSDESC_PLT"; - case DT_TLSDESC_GOT: return "DT_TLSDESC_GOT"; - case DT_GNU_CONFLICT: return "GNU_CONFLICT"; - case DT_GNU_LIBLIST: return "GNU_LIBLIST"; - case DT_CONFIG: return "CONFIG"; - case DT_DEPAUDIT: return "DEPAUDIT"; - case DT_AUDIT: return "AUDIT"; - case DT_PLTPAD: return "PLTPAD"; - case DT_MOVETAB: return "MOVETAB"; - case DT_SYMINFO: return "SYMINFO"; - case DT_VERSYM: return "VERSYM"; - case DT_RELACOUNT: return "RELACOUNT"; - case DT_RELCOUNT: return "RELCOUNT"; - case DT_FLAGS_1: return "FLAGS_1"; - case DT_VERDEF: return "VERDEF"; - case DT_VERDEFNUM: return "VERDEFNUM"; - case DT_VERNEED: return "VERNEED"; - case DT_VERNEEDNUM: return "VERNEEDNUM"; - case DT_AUXILIARY: return "AUXILIARY"; - case DT_USED: return "USED"; - case DT_FILTER: return "FILTER"; - case DT_GNU_PRELINKED: return "GNU_PRELINKED"; - case DT_GNU_CONFLICTSZ: return "GNU_CONFLICTSZ"; - case DT_GNU_LIBLISTSZ: return "GNU_LIBLISTSZ"; - default: - snprintf(s_dtype, sizeof(s_dtype), "", dtype); - return (s_dtype); - } + snprintf(s_dtype, sizeof(s_dtype), "", dtype); + return (s_dtype); } static const char * @@ -2638,10 +2636,8 @@ dyn_str(struct readelf *re, uint32_t stab, uint64_t d_ } static void -dump_arch_dyn_val(struct readelf *re, GElf_Dyn *dyn, uint32_t stab) +dump_arch_dyn_val(struct readelf *re, GElf_Dyn *dyn) { - const char *name; - switch (re->ehdr.e_machine) { case EM_MIPS: case EM_MIPS_RS3_LE: @@ -2694,11 +2690,6 @@ dump_arch_dyn_val(struct readelf *re, GElf_Dyn *dyn, u break; case DT_MIPS_IVERSION: case DT_MIPS_PERF_SUFFIX: - case DT_AUXILIARY: - case DT_FILTER: - name = dyn_str(re, stab, dyn->d_un.d_val); - printf(" %s\n", name); - break; case DT_MIPS_TIME_STAMP: printf(" %s\n", timestamp(dyn->d_un.d_val)); break; @@ -2715,14 +2706,16 @@ dump_dyn_val(struct readelf *re, GElf_Dyn *dyn, uint32 { const char *name; - if (dyn->d_tag >= DT_LOPROC && dyn->d_tag <= DT_HIPROC) { - dump_arch_dyn_val(re, dyn, stab); + if (dyn->d_tag >= DT_LOPROC && dyn->d_tag <= DT_HIPROC && + dyn->d_tag != DT_AUXILIARY && dyn->d_tag != DT_FILTER) { + dump_arch_dyn_val(re, dyn); return; } /* These entry values are index into the string table. */ name = NULL; - if (dyn->d_tag == DT_NEEDED || dyn->d_tag == DT_SONAME || + if (dyn->d_tag == DT_AUXILIARY || dyn->d_tag == DT_FILTER || + dyn->d_tag == DT_NEEDED || dyn->d_tag == DT_SONAME || dyn->d_tag == DT_RPATH || dyn->d_tag == DT_RUNPATH) name = dyn_str(re, stab, dyn->d_un.d_val); @@ -2766,6 +2759,12 @@ dump_dyn_val(struct readelf *re, GElf_Dyn *dyn, uint32 case DT_VERDEFNUM: case DT_VERNEEDNUM: printf(" %ju\n", (uintmax_t) dyn->d_un.d_val); + break; + case DT_AUXILIARY: + printf(" Auxiliary library: [%s]\n", name); + break; + case DT_FILTER: + printf(" Filter library: [%s]\n", name); break; case DT_NEEDED: printf(" Shared library: [%s]\n", name); From owner-svn-src-stable-11@freebsd.org Wed Jul 26 01:12:29 2017 Return-Path: Delivered-To: svn-src-stable-11@mailman.ysv.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by mailman.ysv.freebsd.org (Postfix) with ESMTP id D7944DAD24E; Wed, 26 Jul 2017 01:12:29 +0000 (UTC) (envelope-from davidcs@FreeBSD.org) Received: from repo.freebsd.org (repo.freebsd.org [IPv6:2610:1c1:1:6068::e6a:0]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (Client did not present a certificate) by mx1.freebsd.org (Postfix) with ESMTPS id A3DB66EE0C; Wed, 26 Jul 2017 01:12:29 +0000 (UTC) (envelope-from davidcs@FreeBSD.org) Received: from repo.freebsd.org ([127.0.1.37]) by repo.freebsd.org (8.15.2/8.15.2) with ESMTP id v6Q1CS0G091865; Wed, 26 Jul 2017 01:12:28 GMT (envelope-from davidcs@FreeBSD.org) Received: (from davidcs@localhost) by repo.freebsd.org (8.15.2/8.15.2/Submit) id v6Q1CSDe091863; Wed, 26 Jul 2017 01:12:28 GMT (envelope-from davidcs@FreeBSD.org) Message-Id: <201707260112.v6Q1CSDe091863@repo.freebsd.org> X-Authentication-Warning: repo.freebsd.org: davidcs set sender to davidcs@FreeBSD.org using -f From: David C Somayajulu Date: Wed, 26 Jul 2017 01:12:28 +0000 (UTC) To: src-committers@freebsd.org, svn-src-all@freebsd.org, svn-src-stable@freebsd.org, svn-src-stable-11@freebsd.org Subject: svn commit: r321494 - stable/11/sys/dev/qlxgbe X-SVN-Group: stable-11 X-SVN-Commit-Author: davidcs X-SVN-Commit-Paths: stable/11/sys/dev/qlxgbe X-SVN-Commit-Revision: 321494 X-SVN-Commit-Repository: base MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit X-BeenThere: svn-src-stable-11@freebsd.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: SVN commit messages for only the 11-stable src tree List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 26 Jul 2017 01:12:30 -0000 Author: davidcs Date: Wed Jul 26 01:12:28 2017 New Revision: 321494 URL: https://svnweb.freebsd.org/changeset/base/321494 Log: MFC 320694 Allow MTU changes without ifconfig down/up Modified: stable/11/sys/dev/qlxgbe/ql_hw.c stable/11/sys/dev/qlxgbe/ql_os.c Directory Properties: stable/11/ (props changed) Modified: stable/11/sys/dev/qlxgbe/ql_hw.c ============================================================================== --- stable/11/sys/dev/qlxgbe/ql_hw.c Tue Jul 25 20:51:06 2017 (r321493) +++ stable/11/sys/dev/qlxgbe/ql_hw.c Wed Jul 26 01:12:28 2017 (r321494) @@ -2498,6 +2498,9 @@ ql_init_hw_if(qla_host_t *ha) if (qla_hw_add_all_mcast(ha)) return (-1); + if (ql_set_max_mtu(ha, ha->max_frame_size, ha->hw.rcv_cntxt_id)) + return (-1); + if (qla_config_rss(ha, ha->hw.rcv_cntxt_id)) return (-1); Modified: stable/11/sys/dev/qlxgbe/ql_os.c ============================================================================== --- stable/11/sys/dev/qlxgbe/ql_os.c Tue Jul 25 20:51:06 2017 (r321493) +++ stable/11/sys/dev/qlxgbe/ql_os.c Wed Jul 26 01:12:28 2017 (r321494) @@ -980,8 +980,7 @@ qla_ioctl(struct ifnet *ifp, u_long cmd, caddr_t data) ifp->if_mtu + ETHER_HDR_LEN + ETHER_CRC_LEN; if ((ifp->if_drv_flags & IFF_DRV_RUNNING)) { - ret = ql_set_max_mtu(ha, ha->max_frame_size, - ha->hw.rcv_cntxt_id); + qla_init_locked(ha); } if (ifp->if_mtu > ETHERMTU) @@ -1014,11 +1013,9 @@ qla_ioctl(struct ifnet *ifp, u_long cmd, caddr_t data) ret = ql_set_allmulti(ha); } } else { - qla_init_locked(ha); ha->max_frame_size = ifp->if_mtu + ETHER_HDR_LEN + ETHER_CRC_LEN; - ret = ql_set_max_mtu(ha, ha->max_frame_size, - ha->hw.rcv_cntxt_id); + qla_init_locked(ha); } } else { if (ifp->if_drv_flags & IFF_DRV_RUNNING) From owner-svn-src-stable-11@freebsd.org Wed Jul 26 01:15:32 2017 Return-Path: Delivered-To: svn-src-stable-11@mailman.ysv.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by mailman.ysv.freebsd.org (Postfix) with ESMTP id 5F8D8DAD321; Wed, 26 Jul 2017 01:15:32 +0000 (UTC) (envelope-from davidcs@FreeBSD.org) Received: from repo.freebsd.org (repo.freebsd.org [IPv6:2610:1c1:1:6068::e6a:0]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (Client did not present a certificate) by mx1.freebsd.org (Postfix) with ESMTPS id 2B13B6EF90; Wed, 26 Jul 2017 01:15:32 +0000 (UTC) (envelope-from davidcs@FreeBSD.org) Received: from repo.freebsd.org ([127.0.1.37]) by repo.freebsd.org (8.15.2/8.15.2) with ESMTP id v6Q1FVs7092026; Wed, 26 Jul 2017 01:15:31 GMT (envelope-from davidcs@FreeBSD.org) Received: (from davidcs@localhost) by repo.freebsd.org (8.15.2/8.15.2/Submit) id v6Q1FVY3092025; Wed, 26 Jul 2017 01:15:31 GMT (envelope-from davidcs@FreeBSD.org) Message-Id: <201707260115.v6Q1FVY3092025@repo.freebsd.org> X-Authentication-Warning: repo.freebsd.org: davidcs set sender to davidcs@FreeBSD.org using -f From: David C Somayajulu Date: Wed, 26 Jul 2017 01:15:31 +0000 (UTC) To: src-committers@freebsd.org, svn-src-all@freebsd.org, svn-src-stable@freebsd.org, svn-src-stable-11@freebsd.org Subject: svn commit: r321495 - stable/11/sys/dev/qlxgbe X-SVN-Group: stable-11 X-SVN-Commit-Author: davidcs X-SVN-Commit-Paths: stable/11/sys/dev/qlxgbe X-SVN-Commit-Revision: 321495 X-SVN-Commit-Repository: base MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit X-BeenThere: svn-src-stable-11@freebsd.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: SVN commit messages for only the 11-stable src tree List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 26 Jul 2017 01:15:32 -0000 Author: davidcs Date: Wed Jul 26 01:15:31 2017 New Revision: 321495 URL: https://svnweb.freebsd.org/changeset/base/321495 Log: MFC 320705 Release mtx hw_lock before calling pause() in qla_stop() and qla_error_recovery() Modified: stable/11/sys/dev/qlxgbe/ql_os.c Directory Properties: stable/11/ (props changed) Modified: stable/11/sys/dev/qlxgbe/ql_os.c ============================================================================== --- stable/11/sys/dev/qlxgbe/ql_os.c Wed Jul 26 01:12:28 2017 (r321494) +++ stable/11/sys/dev/qlxgbe/ql_os.c Wed Jul 26 01:15:31 2017 (r321495) @@ -1519,8 +1519,11 @@ qla_stop(qla_host_t *ha) ha->flags.qla_watchdog_pause = 1; - while (!ha->qla_watchdog_paused) + while (!ha->qla_watchdog_paused) { + QLA_UNLOCK(ha); qla_mdelay(__func__, 1); + QLA_LOCK(ha); + } ha->flags.qla_interface_up = 0; @@ -1915,7 +1918,10 @@ qla_error_recovery(void *context, int pending) if (ha->flags.qla_interface_up) { ha->hw.imd_compl = 1; + + QLA_UNLOCK(ha); qla_mdelay(__func__, 300); + QLA_LOCK(ha); ifp->if_drv_flags &= ~(IFF_DRV_OACTIVE | IFF_DRV_RUNNING); From owner-svn-src-stable-11@freebsd.org Wed Jul 26 01:19:51 2017 Return-Path: Delivered-To: svn-src-stable-11@mailman.ysv.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by mailman.ysv.freebsd.org (Postfix) with ESMTP id 348A7DAD597; Wed, 26 Jul 2017 01:19:51 +0000 (UTC) (envelope-from davidcs@FreeBSD.org) Received: from repo.freebsd.org (repo.freebsd.org [IPv6:2610:1c1:1:6068::e6a:0]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (Client did not present a certificate) by mx1.freebsd.org (Postfix) with ESMTPS id 10D766F278; Wed, 26 Jul 2017 01:19:50 +0000 (UTC) (envelope-from davidcs@FreeBSD.org) Received: from repo.freebsd.org ([127.0.1.37]) by repo.freebsd.org (8.15.2/8.15.2) with ESMTP id v6Q1JoeV092230; Wed, 26 Jul 2017 01:19:50 GMT (envelope-from davidcs@FreeBSD.org) Received: (from davidcs@localhost) by repo.freebsd.org (8.15.2/8.15.2/Submit) id v6Q1JnkZ092226; Wed, 26 Jul 2017 01:19:49 GMT (envelope-from davidcs@FreeBSD.org) Message-Id: <201707260119.v6Q1JnkZ092226@repo.freebsd.org> X-Authentication-Warning: repo.freebsd.org: davidcs set sender to davidcs@FreeBSD.org using -f From: David C Somayajulu Date: Wed, 26 Jul 2017 01:19:49 +0000 (UTC) To: src-committers@freebsd.org, svn-src-all@freebsd.org, svn-src-stable@freebsd.org, svn-src-stable-11@freebsd.org Subject: svn commit: r321496 - stable/11/sys/dev/qlxgbe X-SVN-Group: stable-11 X-SVN-Commit-Author: davidcs X-SVN-Commit-Paths: stable/11/sys/dev/qlxgbe X-SVN-Commit-Revision: 321496 X-SVN-Commit-Repository: base MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit X-BeenThere: svn-src-stable-11@freebsd.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: SVN commit messages for only the 11-stable src tree List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 26 Jul 2017 01:19:51 -0000 Author: davidcs Date: Wed Jul 26 01:19:49 2017 New Revision: 321496 URL: https://svnweb.freebsd.org/changeset/base/321496 Log: MFC 321233 Raise the watchdog timer interval to 2 ticks, there by guaranteeing that it fires between 1ms and 2ms. ` Treat two consecutive occurrences of Heartbeat failures as a legitimate Heartbeat failure Modified: stable/11/sys/dev/qlxgbe/ql_def.h stable/11/sys/dev/qlxgbe/ql_hw.c stable/11/sys/dev/qlxgbe/ql_hw.h stable/11/sys/dev/qlxgbe/ql_os.c Directory Properties: stable/11/ (props changed) Modified: stable/11/sys/dev/qlxgbe/ql_def.h ============================================================================== --- stable/11/sys/dev/qlxgbe/ql_def.h Wed Jul 26 01:15:31 2017 (r321495) +++ stable/11/sys/dev/qlxgbe/ql_def.h Wed Jul 26 01:19:49 2017 (r321496) @@ -105,7 +105,7 @@ struct qla_ivec { typedef struct qla_ivec qla_ivec_t; -#define QLA_WATCHDOG_CALLOUT_TICKS 1 +#define QLA_WATCHDOG_CALLOUT_TICKS 2 typedef struct _qla_tx_ring { qla_tx_buf_t tx_buf[NUM_TX_DESCRIPTORS]; Modified: stable/11/sys/dev/qlxgbe/ql_hw.c ============================================================================== --- stable/11/sys/dev/qlxgbe/ql_hw.c Wed Jul 26 01:15:31 2017 (r321495) +++ stable/11/sys/dev/qlxgbe/ql_hw.c Wed Jul 26 01:19:49 2017 (r321496) @@ -3366,7 +3366,7 @@ ql_hw_check_health(qla_host_t *ha) ha->hw.health_count++; - if (ha->hw.health_count < 1000) + if (ha->hw.health_count < 500) return 0; ha->hw.health_count = 0; @@ -3385,10 +3385,18 @@ ql_hw_check_health(qla_host_t *ha) if ((val != ha->hw.hbeat_value) && (!(QL_ERR_INJECT(ha, INJCT_HEARTBEAT_FAILURE)))) { ha->hw.hbeat_value = val; + ha->hw.hbeat_failure = 0; return 0; } - device_printf(ha->pci_dev, "%s: Heartbeat Failue [0x%08x]\n", - __func__, val); + + ha->hw.hbeat_failure++; + + if (ha->hw.hbeat_failure < 2) /* we ignore the first failure */ + return 0; + else + device_printf(ha->pci_dev, "%s: Heartbeat Failue [0x%08x]\n", + __func__, val); + return -1; } Modified: stable/11/sys/dev/qlxgbe/ql_hw.h ============================================================================== --- stable/11/sys/dev/qlxgbe/ql_hw.h Wed Jul 26 01:15:31 2017 (r321495) +++ stable/11/sys/dev/qlxgbe/ql_hw.h Wed Jul 26 01:19:49 2017 (r321496) @@ -1671,6 +1671,7 @@ typedef struct _qla_hw { /* heart beat register value */ uint32_t hbeat_value; uint32_t health_count; + uint32_t hbeat_failure; uint32_t max_tx_segs; uint32_t min_lro_pkt_size; Modified: stable/11/sys/dev/qlxgbe/ql_os.c ============================================================================== --- stable/11/sys/dev/qlxgbe/ql_os.c Wed Jul 26 01:15:31 2017 (r321495) +++ stable/11/sys/dev/qlxgbe/ql_os.c Wed Jul 26 01:19:49 2017 (r321496) @@ -276,7 +276,7 @@ qla_watchdog(void *arg) ha->qla_watchdog_paused = 1; } - ha->watchdog_ticks = ha->watchdog_ticks++ % 1000; + ha->watchdog_ticks = ha->watchdog_ticks++ % 500; callout_reset(&ha->tx_callout, QLA_WATCHDOG_CALLOUT_TICKS, qla_watchdog, ha); } From owner-svn-src-stable-11@freebsd.org Wed Jul 26 06:52:47 2017 Return-Path: Delivered-To: svn-src-stable-11@mailman.ysv.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by mailman.ysv.freebsd.org (Postfix) with ESMTP id A3B8EDB2F05; Wed, 26 Jul 2017 06:52:47 +0000 (UTC) (envelope-from kib@FreeBSD.org) Received: from repo.freebsd.org (repo.freebsd.org [IPv6:2610:1c1:1:6068::e6a:0]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (Client did not present a certificate) by mx1.freebsd.org (Postfix) with ESMTPS id 7587077C79; Wed, 26 Jul 2017 06:52:47 +0000 (UTC) (envelope-from kib@FreeBSD.org) Received: from repo.freebsd.org ([127.0.1.37]) by repo.freebsd.org (8.15.2/8.15.2) with ESMTP id v6Q6qkT2030526; Wed, 26 Jul 2017 06:52:46 GMT (envelope-from kib@FreeBSD.org) Received: (from kib@localhost) by repo.freebsd.org (8.15.2/8.15.2/Submit) id v6Q6qjDu030520; Wed, 26 Jul 2017 06:52:45 GMT (envelope-from kib@FreeBSD.org) Message-Id: <201707260652.v6Q6qjDu030520@repo.freebsd.org> X-Authentication-Warning: repo.freebsd.org: kib set sender to kib@FreeBSD.org using -f From: Konstantin Belousov Date: Wed, 26 Jul 2017 06:52:45 +0000 (UTC) To: src-committers@freebsd.org, svn-src-all@freebsd.org, svn-src-stable@freebsd.org, svn-src-stable-11@freebsd.org Subject: svn commit: r321513 - in stable/11/sys: sys vm X-SVN-Group: stable-11 X-SVN-Commit-Author: kib X-SVN-Commit-Paths: in stable/11/sys: sys vm X-SVN-Commit-Revision: 321513 X-SVN-Commit-Repository: base MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit X-BeenThere: svn-src-stable-11@freebsd.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: SVN commit messages for only the 11-stable src tree List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 26 Jul 2017 06:52:47 -0000 Author: kib Date: Wed Jul 26 06:52:45 2017 New Revision: 321513 URL: https://svnweb.freebsd.org/changeset/base/321513 Log: MFC r321247: Add pctrie_init() and vm_radix_init() to initialize generic pctrie and vm_radix trie. Modified: stable/11/sys/sys/_pctrie.h stable/11/sys/sys/pctrie.h stable/11/sys/vm/_vm_radix.h stable/11/sys/vm/vm_object.c stable/11/sys/vm/vm_radix.c stable/11/sys/vm/vm_radix.h Directory Properties: stable/11/ (props changed) Modified: stable/11/sys/sys/_pctrie.h ============================================================================== --- stable/11/sys/sys/_pctrie.h Wed Jul 26 06:42:01 2017 (r321512) +++ stable/11/sys/sys/_pctrie.h Wed Jul 26 06:52:45 2017 (r321513) @@ -38,14 +38,4 @@ struct pctrie { uintptr_t pt_root; }; -#ifdef _KERNEL - -static __inline boolean_t -pctrie_is_empty(struct pctrie *ptree) -{ - - return (ptree->pt_root == 0); -} - -#endif /* _KERNEL */ #endif /* !__SYS_PCTRIE_H_ */ Modified: stable/11/sys/sys/pctrie.h ============================================================================== --- stable/11/sys/sys/pctrie.h Wed Jul 26 06:42:01 2017 (r321512) +++ stable/11/sys/sys/pctrie.h Wed Jul 26 06:52:45 2017 (r321513) @@ -119,5 +119,19 @@ void pctrie_remove(struct pctrie *ptree, uint64_t key size_t pctrie_node_size(void); int pctrie_zone_init(void *mem, int size, int flags); +static __inline void +pctrie_init(struct pctrie *ptree) +{ + + ptree->pt_root = 0; +} + +static __inline boolean_t +pctrie_is_empty(struct pctrie *ptree) +{ + + return (ptree->pt_root == 0); +} + #endif /* _KERNEL */ #endif /* !_SYS_PCTRIE_H_ */ Modified: stable/11/sys/vm/_vm_radix.h ============================================================================== --- stable/11/sys/vm/_vm_radix.h Wed Jul 26 06:42:01 2017 (r321512) +++ stable/11/sys/vm/_vm_radix.h Wed Jul 26 06:52:45 2017 (r321513) @@ -38,14 +38,4 @@ struct vm_radix { uintptr_t rt_root; }; -#ifdef _KERNEL - -static __inline boolean_t -vm_radix_is_empty(struct vm_radix *rtree) -{ - - return (rtree->rt_root == 0); -} - -#endif /* _KERNEL */ #endif /* !__VM_RADIX_H_ */ Modified: stable/11/sys/vm/vm_object.c ============================================================================== --- stable/11/sys/vm/vm_object.c Wed Jul 26 06:42:01 2017 (r321512) +++ stable/11/sys/vm/vm_object.c Wed Jul 26 06:52:45 2017 (r321513) @@ -204,7 +204,7 @@ vm_object_zinit(void *mem, int size, int flags) /* These are true for any object that has been freed */ object->type = OBJT_DEAD; object->ref_count = 0; - object->rtree.rt_root = 0; + vm_radix_init(&object->rtree); object->paging_in_progress = 0; object->resident_page_count = 0; object->shadow_count = 0; @@ -301,7 +301,7 @@ vm_object_init(void) #endif vm_object_zinit, NULL, UMA_ALIGN_PTR, UMA_ZONE_NOFREE); - vm_radix_init(); + vm_radix_zinit(); } void Modified: stable/11/sys/vm/vm_radix.c ============================================================================== --- stable/11/sys/vm/vm_radix.c Wed Jul 26 06:42:01 2017 (r321512) +++ stable/11/sys/vm/vm_radix.c Wed Jul 26 06:52:45 2017 (r321513) @@ -310,7 +310,7 @@ SYSINIT(vm_radix_reserve_kva, SI_SUB_KMEM, SI_ORDER_TH * Initialize the UMA slab zone. */ void -vm_radix_init(void) +vm_radix_zinit(void) { vm_radix_node_zone = uma_zcreate("RADIX NODE", Modified: stable/11/sys/vm/vm_radix.h ============================================================================== --- stable/11/sys/vm/vm_radix.h Wed Jul 26 06:42:01 2017 (r321512) +++ stable/11/sys/vm/vm_radix.h Wed Jul 26 06:52:45 2017 (r321513) @@ -35,7 +35,6 @@ #ifdef _KERNEL -void vm_radix_init(void); int vm_radix_insert(struct vm_radix *rtree, vm_page_t page); boolean_t vm_radix_is_singleton(struct vm_radix *rtree); vm_page_t vm_radix_lookup(struct vm_radix *rtree, vm_pindex_t index); @@ -44,6 +43,21 @@ vm_page_t vm_radix_lookup_le(struct vm_radix *rtree, v void vm_radix_reclaim_allnodes(struct vm_radix *rtree); vm_page_t vm_radix_remove(struct vm_radix *rtree, vm_pindex_t index); vm_page_t vm_radix_replace(struct vm_radix *rtree, vm_page_t newpage); +void vm_radix_zinit(void); + +static __inline void +vm_radix_init(struct vm_radix *rtree) +{ + + rtree->rt_root = 0; +} + +static __inline boolean_t +vm_radix_is_empty(struct vm_radix *rtree) +{ + + return (rtree->rt_root == 0); +} #endif /* _KERNEL */ #endif /* !_VM_RADIX_H_ */ From owner-svn-src-stable-11@freebsd.org Wed Jul 26 07:00:29 2017 Return-Path: Delivered-To: svn-src-stable-11@mailman.ysv.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by mailman.ysv.freebsd.org (Postfix) with ESMTP id 145E5DB311F; Wed, 26 Jul 2017 07:00:29 +0000 (UTC) (envelope-from kib@FreeBSD.org) Received: from repo.freebsd.org (repo.freebsd.org [IPv6:2610:1c1:1:6068::e6a:0]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (Client did not present a certificate) by mx1.freebsd.org (Postfix) with ESMTPS id D5C797C157; Wed, 26 Jul 2017 07:00:28 +0000 (UTC) (envelope-from kib@FreeBSD.org) Received: from repo.freebsd.org ([127.0.1.37]) by repo.freebsd.org (8.15.2/8.15.2) with ESMTP id v6Q70SZE030942; Wed, 26 Jul 2017 07:00:28 GMT (envelope-from kib@FreeBSD.org) Received: (from kib@localhost) by repo.freebsd.org (8.15.2/8.15.2/Submit) id v6Q70RnG030940; Wed, 26 Jul 2017 07:00:27 GMT (envelope-from kib@FreeBSD.org) Message-Id: <201707260700.v6Q70RnG030940@repo.freebsd.org> X-Authentication-Warning: repo.freebsd.org: kib set sender to kib@FreeBSD.org using -f From: Konstantin Belousov Date: Wed, 26 Jul 2017 07:00:27 +0000 (UTC) To: src-committers@freebsd.org, svn-src-all@freebsd.org, svn-src-stable@freebsd.org, svn-src-stable-11@freebsd.org Subject: svn commit: r321515 - stable/11/sys/vm X-SVN-Group: stable-11 X-SVN-Commit-Author: kib X-SVN-Commit-Paths: stable/11/sys/vm X-SVN-Commit-Revision: 321515 X-SVN-Commit-Repository: base MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit X-BeenThere: svn-src-stable-11@freebsd.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: SVN commit messages for only the 11-stable src tree List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 26 Jul 2017 07:00:29 -0000 Author: kib Date: Wed Jul 26 07:00:27 2017 New Revision: 321515 URL: https://svnweb.freebsd.org/changeset/base/321515 Log: MFC r321217: Remove unused function swap_pager_isswapped(). Modified: stable/11/sys/vm/swap_pager.c stable/11/sys/vm/swap_pager.h Directory Properties: stable/11/ (props changed) Modified: stable/11/sys/vm/swap_pager.c ============================================================================== --- stable/11/sys/vm/swap_pager.c Wed Jul 26 06:57:15 2017 (r321514) +++ stable/11/sys/vm/swap_pager.c Wed Jul 26 07:00:27 2017 (r321515) @@ -1585,43 +1585,6 @@ swp_pager_async_iodone(struct buf *bp) } /* - * swap_pager_isswapped: - * - * Return 1 if at least one page in the given object is paged - * out to the given swap device. - * - * This routine may not sleep. - */ -int -swap_pager_isswapped(vm_object_t object, struct swdevt *sp) -{ - daddr_t index = 0; - int bcount; - int i; - - VM_OBJECT_ASSERT_WLOCKED(object); - if (object->type != OBJT_SWAP) - return (0); - - mtx_lock(&swhash_mtx); - for (bcount = 0; bcount < object->un_pager.swp.swp_bcount; bcount++) { - struct swblock *swap; - - if ((swap = *swp_pager_hash(object, index)) != NULL) { - for (i = 0; i < SWAP_META_PAGES; ++i) { - if (swp_pager_isondev(swap->swb_pages[i], sp)) { - mtx_unlock(&swhash_mtx); - return (1); - } - } - } - index += SWAP_META_PAGES; - } - mtx_unlock(&swhash_mtx); - return (0); -} - -/* * SWP_PAGER_FORCE_PAGEIN() - force a swap block to be paged in * * This routine dissociates the page at the given index within an object Modified: stable/11/sys/vm/swap_pager.h ============================================================================== --- stable/11/sys/vm/swap_pager.h Wed Jul 26 06:57:15 2017 (r321514) +++ stable/11/sys/vm/swap_pager.h Wed Jul 26 07:00:27 2017 (r321515) @@ -82,7 +82,6 @@ void swap_pager_copy(vm_object_t, vm_object_t, vm_pind vm_pindex_t swap_pager_find_least(vm_object_t object, vm_pindex_t pindex); void swap_pager_freespace(vm_object_t, vm_pindex_t, vm_size_t); void swap_pager_swap_init(void); -int swap_pager_isswapped(vm_object_t, struct swdevt *); int swap_pager_reserve(vm_object_t, vm_pindex_t, vm_size_t); void swap_pager_status(int *total, int *used); void swapoff_all(void); From owner-svn-src-stable-11@freebsd.org Wed Jul 26 11:04:31 2017 Return-Path: Delivered-To: svn-src-stable-11@mailman.ysv.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by mailman.ysv.freebsd.org (Postfix) with ESMTP id AB895DB777D; Wed, 26 Jul 2017 11:04:31 +0000 (UTC) (envelope-from ae@FreeBSD.org) Received: from repo.freebsd.org (repo.freebsd.org [IPv6:2610:1c1:1:6068::e6a:0]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (Client did not present a certificate) by mx1.freebsd.org (Postfix) with ESMTPS id 7A9758325D; Wed, 26 Jul 2017 11:04:31 +0000 (UTC) (envelope-from ae@FreeBSD.org) Received: from repo.freebsd.org ([127.0.1.37]) by repo.freebsd.org (8.15.2/8.15.2) with ESMTP id v6QB4Ume033253; Wed, 26 Jul 2017 11:04:30 GMT (envelope-from ae@FreeBSD.org) Received: (from ae@localhost) by repo.freebsd.org (8.15.2/8.15.2/Submit) id v6QB4UVL033251; Wed, 26 Jul 2017 11:04:30 GMT (envelope-from ae@FreeBSD.org) Message-Id: <201707261104.v6QB4UVL033251@repo.freebsd.org> X-Authentication-Warning: repo.freebsd.org: ae set sender to ae@FreeBSD.org using -f From: "Andrey V. Elsukov" Date: Wed, 26 Jul 2017 11:04:30 +0000 (UTC) To: src-committers@freebsd.org, svn-src-all@freebsd.org, svn-src-stable@freebsd.org, svn-src-stable-11@freebsd.org Subject: svn commit: r321517 - stable/11/sys/dev/bxe X-SVN-Group: stable-11 X-SVN-Commit-Author: ae X-SVN-Commit-Paths: stable/11/sys/dev/bxe X-SVN-Commit-Revision: 321517 X-SVN-Commit-Repository: base MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit X-BeenThere: svn-src-stable-11@freebsd.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: SVN commit messages for only the 11-stable src tree List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 26 Jul 2017 11:04:31 -0000 Author: ae Date: Wed Jul 26 11:04:30 2017 New Revision: 321517 URL: https://svnweb.freebsd.org/changeset/base/321517 Log: MFC r321203: Add HPE FlexFabric 10Gb 4-port 536FLR-T device id to the bxe(4) driver. Modified: stable/11/sys/dev/bxe/bxe.c stable/11/sys/dev/bxe/bxe.h Directory Properties: stable/11/ (props changed) Modified: stable/11/sys/dev/bxe/bxe.c ============================================================================== --- stable/11/sys/dev/bxe/bxe.c Wed Jul 26 11:03:13 2017 (r321516) +++ stable/11/sys/dev/bxe/bxe.c Wed Jul 26 11:04:30 2017 (r321517) @@ -167,6 +167,12 @@ static struct bxe_device_type bxe_devs[] = { "QLogic NetXtreme II BCM57840 4x10GbE" }, { + QLOGIC_VENDORID, + CHIP_NUM_57840_4_10, + PCI_ANY_ID, PCI_ANY_ID, + "QLogic NetXtreme II BCM57840 4x10GbE" + }, + { BRCM_VENDORID, CHIP_NUM_57840_2_20, PCI_ANY_ID, PCI_ANY_ID, Modified: stable/11/sys/dev/bxe/bxe.h ============================================================================== --- stable/11/sys/dev/bxe/bxe.h Wed Jul 26 11:03:13 2017 (r321516) +++ stable/11/sys/dev/bxe/bxe.h Wed Jul 26 11:04:30 2017 (r321517) @@ -168,6 +168,7 @@ int bxe_ilog2(int x) #include "ecore_sp.h" #define BRCM_VENDORID 0x14e4 +#define QLOGIC_VENDORID 0x1077 #define PCI_ANY_ID (uint16_t)(~0U) struct bxe_device_type From owner-svn-src-stable-11@freebsd.org Wed Jul 26 14:55:49 2017 Return-Path: Delivered-To: svn-src-stable-11@mailman.ysv.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by mailman.ysv.freebsd.org (Postfix) with ESMTP id 2C78FDBFCF7; Wed, 26 Jul 2017 14:55:49 +0000 (UTC) (envelope-from mav@FreeBSD.org) Received: from repo.freebsd.org (repo.freebsd.org [IPv6:2610:1c1:1:6068::e6a:0]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (Client did not present a certificate) by mx1.freebsd.org (Postfix) with ESMTPS id 078DC656B1; Wed, 26 Jul 2017 14:55:48 +0000 (UTC) (envelope-from mav@FreeBSD.org) Received: from repo.freebsd.org ([127.0.1.37]) by repo.freebsd.org (8.15.2/8.15.2) with ESMTP id v6QEtmgW030580; Wed, 26 Jul 2017 14:55:48 GMT (envelope-from mav@FreeBSD.org) Received: (from mav@localhost) by repo.freebsd.org (8.15.2/8.15.2/Submit) id v6QEtlq6030576; Wed, 26 Jul 2017 14:55:47 GMT (envelope-from mav@FreeBSD.org) Message-Id: <201707261455.v6QEtlq6030576@repo.freebsd.org> X-Authentication-Warning: repo.freebsd.org: mav set sender to mav@FreeBSD.org using -f From: Alexander Motin Date: Wed, 26 Jul 2017 14:55:47 +0000 (UTC) To: src-committers@freebsd.org, svn-src-all@freebsd.org, svn-src-stable@freebsd.org, svn-src-stable-11@freebsd.org Subject: svn commit: r321519 - in stable/11/sys: boot/zfs cddl/boot/zfs X-SVN-Group: stable-11 X-SVN-Commit-Author: mav X-SVN-Commit-Paths: in stable/11/sys: boot/zfs cddl/boot/zfs X-SVN-Commit-Revision: 321519 X-SVN-Commit-Repository: base MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit X-BeenThere: svn-src-stable-11@freebsd.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: SVN commit messages for only the 11-stable src tree List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 26 Jul 2017 14:55:49 -0000 Author: mav Date: Wed Jul 26 14:55:47 2017 New Revision: 321519 URL: https://svnweb.freebsd.org/changeset/base/321519 Log: MFC r303630 (by allanjude): Make boot code and loader check for unsupported ZFS feature flags OpenZFS uses feature flags instead of a zpool version number to track features since the split from Oracle. In addition to avoiding confusion on ZFS vs OpenZFS version numbers, this also allows features to be added to different operating systems that use OpenZFS in different order. The previous zfs boot code (gptzfsboot) and loader (zfsloader) blindly tries to read the pool, and if failed provided only a vague error message. With this change, both the boot code and loader check the MOS features list in the ZFS label and compare it against the list of features that the loader supports. If any unsupported feature is active, the pool is not considered as a candidate for booting, and a helpful diagnostic message is printed to the screen. Features that are merely enabled via zpool upgrade, but not in use, do not block booting from the pool. Submitted by: Toomas Soome Reviewed by: delphij, mav Relnotes: yes Differential Revision: https://reviews.freebsd.org/D6857 Modified: stable/11/sys/boot/zfs/libzfs.h stable/11/sys/boot/zfs/zfs.c stable/11/sys/boot/zfs/zfsimpl.c stable/11/sys/cddl/boot/zfs/zfsimpl.h Directory Properties: stable/11/ (props changed) Modified: stable/11/sys/boot/zfs/libzfs.h ============================================================================== --- stable/11/sys/boot/zfs/libzfs.h Wed Jul 26 12:07:46 2017 (r321518) +++ stable/11/sys/boot/zfs/libzfs.h Wed Jul 26 14:55:47 2017 (r321519) @@ -65,7 +65,7 @@ int zfs_probe_dev(const char *devname, uint64_t *pool_ int zfs_list(const char *name); void init_zfs_bootenv(char *currdev); int zfs_bootenv(const char *name); -int zfs_belist_add(const char *name); +int zfs_belist_add(const char *name, uint64_t __unused); int zfs_set_env(void); extern struct devsw zfs_dev; Modified: stable/11/sys/boot/zfs/zfs.c ============================================================================== --- stable/11/sys/boot/zfs/zfs.c Wed Jul 26 12:07:46 2017 (r321518) +++ stable/11/sys/boot/zfs/zfs.c Wed Jul 26 14:55:47 2017 (r321519) @@ -801,7 +801,7 @@ zfs_bootenv(const char *name) } int -zfs_belist_add(const char *name) +zfs_belist_add(const char *name, uint64_t value __unused) { /* Skip special datasets that start with a $ character */ Modified: stable/11/sys/boot/zfs/zfsimpl.c ============================================================================== --- stable/11/sys/boot/zfs/zfsimpl.c Wed Jul 26 12:07:46 2017 (r321518) +++ stable/11/sys/boot/zfs/zfsimpl.c Wed Jul 26 14:55:47 2017 (r321519) @@ -1472,12 +1472,12 @@ zap_lookup(const spa_t *spa, const dnode_phys_t *dnode * the directory contents. */ static int -mzap_list(const dnode_phys_t *dnode, int (*callback)(const char *)) +mzap_list(const dnode_phys_t *dnode, int (*callback)(const char *, uint64_t)) { const mzap_phys_t *mz; const mzap_ent_phys_t *mze; size_t size; - int chunks, i; + int chunks, i, rc; /* * Microzap objects use exactly one block. Read the whole @@ -1489,9 +1489,11 @@ mzap_list(const dnode_phys_t *dnode, int (*callback)(c for (i = 0; i < chunks; i++) { mze = &mz->mz_chunk[i]; - if (mze->mze_name[0]) - //printf("%-32s 0x%jx\n", mze->mze_name, (uintmax_t)mze->mze_value); - callback(mze->mze_name); + if (mze->mze_name[0]) { + rc = callback(mze->mze_name, mze->mze_value); + if (rc != 0) + return (rc); + } } return (0); @@ -1502,12 +1504,12 @@ mzap_list(const dnode_phys_t *dnode, int (*callback)(c * the directory header. */ static int -fzap_list(const spa_t *spa, const dnode_phys_t *dnode, int (*callback)(const char *)) +fzap_list(const spa_t *spa, const dnode_phys_t *dnode, int (*callback)(const char *, uint64_t)) { int bsize = dnode->dn_datablkszsec << SPA_MINBLOCKSHIFT; zap_phys_t zh = *(zap_phys_t *) zap_scratch; fat_zap_t z; - int i, j; + int i, j, rc; if (zh.zap_magic != ZAP_MAGIC) return (EIO); @@ -1565,14 +1567,16 @@ fzap_list(const spa_t *spa, const dnode_phys_t *dnode, value = fzap_leaf_value(&zl, zc); //printf("%s 0x%jx\n", name, (uintmax_t)value); - callback((const char *)name); + rc = callback((const char *)name, value); + if (rc != 0) + return (rc); } } return (0); } -static int zfs_printf(const char *name) +static int zfs_printf(const char *name, uint64_t value __unused) { printf("%s\n", name); @@ -1867,7 +1871,7 @@ zfs_list_dataset(const spa_t *spa, uint64_t objnum/*, } int -zfs_callback_dataset(const spa_t *spa, uint64_t objnum, int (*callback)(const char *name)) +zfs_callback_dataset(const spa_t *spa, uint64_t objnum, int (*callback)(const char *, uint64_t)) { uint64_t dir_obj, child_dir_zapobj, zap_type; dnode_phys_t child_dir_zap, dir, dataset; @@ -2007,9 +2011,67 @@ zfs_mount(const spa_t *spa, uint64_t rootobj, struct z return (0); } +/* + * callback function for feature name checks. + */ static int +check_feature(const char *name, uint64_t value) +{ + int i; + + if (value == 0) + return (0); + if (name[0] == '\0') + return (0); + + for (i = 0; features_for_read[i] != NULL; i++) { + if (strcmp(name, features_for_read[i]) == 0) + return (0); + } + printf("ZFS: unsupported feature: %s\n", name); + return (EIO); +} + +/* + * Checks whether the MOS features that are active are supported. + */ +static int +check_mos_features(const spa_t *spa) +{ + dnode_phys_t dir; + uint64_t objnum, zap_type; + size_t size; + int rc; + + if ((rc = objset_get_dnode(spa, &spa->spa_mos, DMU_OT_OBJECT_DIRECTORY, + &dir)) != 0) + return (rc); + if ((rc = zap_lookup(spa, &dir, DMU_POOL_FEATURES_FOR_READ, &objnum)) != 0) + return (rc); + + if ((rc = objset_get_dnode(spa, &spa->spa_mos, objnum, &dir)) != 0) + return (rc); + + if (dir.dn_type != DMU_OTN_ZAP_METADATA) + return (EIO); + + size = dir.dn_datablkszsec * 512; + if (dnode_read(spa, &dir, 0, zap_scratch, size)) + return (EIO); + + zap_type = *(uint64_t *) zap_scratch; + if (zap_type == ZBT_MICRO) + rc = mzap_list(&dir, check_feature); + else + rc = fzap_list(spa, &dir, check_feature); + + return (rc); +} + +static int zfs_spa_init(spa_t *spa) { + int rc; if (zio_read(spa, &spa->spa_uberblock.ub_rootbp, &spa->spa_mos)) { printf("ZFS: can't read MOS of pool %s\n", spa->spa_name); @@ -2019,7 +2081,13 @@ zfs_spa_init(spa_t *spa) printf("ZFS: corrupted MOS of pool %s\n", spa->spa_name); return (EIO); } - return (0); + + rc = check_mos_features(spa); + if (rc != 0) { + printf("ZFS: pool %s is not supported\n", spa->spa_name); + } + + return (rc); } static int Modified: stable/11/sys/cddl/boot/zfs/zfsimpl.h ============================================================================== --- stable/11/sys/cddl/boot/zfs/zfsimpl.h Wed Jul 26 12:07:46 2017 (r321518) +++ stable/11/sys/cddl/boot/zfs/zfsimpl.h Wed Jul 26 14:55:47 2017 (r321519) @@ -63,6 +63,8 @@ #define _NOTE(s) +typedef enum { B_FALSE, B_TRUE } boolean_t; + /* CRC64 table */ #define ZFS_CRC64_POLY 0xC96C5795D7870F42ULL /* ECMA-182, reflected form */ @@ -899,6 +901,41 @@ typedef struct dnode_phys { blkptr_t dn_spill; } dnode_phys_t; +typedef enum dmu_object_byteswap { + DMU_BSWAP_UINT8, + DMU_BSWAP_UINT16, + DMU_BSWAP_UINT32, + DMU_BSWAP_UINT64, + DMU_BSWAP_ZAP, + DMU_BSWAP_DNODE, + DMU_BSWAP_OBJSET, + DMU_BSWAP_ZNODE, + DMU_BSWAP_OLDACL, + DMU_BSWAP_ACL, + /* + * Allocating a new byteswap type number makes the on-disk format + * incompatible with any other format that uses the same number. + * + * Data can usually be structured to work with one of the + * DMU_BSWAP_UINT* or DMU_BSWAP_ZAP types. + */ + DMU_BSWAP_NUMFUNCS +} dmu_object_byteswap_t; + +#define DMU_OT_NEWTYPE 0x80 +#define DMU_OT_METADATA 0x40 +#define DMU_OT_BYTESWAP_MASK 0x3f + +/* + * Defines a uint8_t object type. Object types specify if the data + * in the object is metadata (boolean) and how to byteswap the data + * (dmu_object_byteswap_t). + */ +#define DMU_OT(byteswap, metadata) \ + (DMU_OT_NEWTYPE | \ + ((metadata) ? DMU_OT_METADATA : 0) | \ + ((byteswap) & DMU_OT_BYTESWAP_MASK)) + typedef enum dmu_object_type { DMU_OT_NONE, /* general: */ @@ -959,7 +996,21 @@ typedef enum dmu_object_type { DMU_OT_SA_ATTR_LAYOUTS, /* ZAP */ DMU_OT_SCAN_XLATE, /* ZAP */ DMU_OT_DEDUP, /* fake dedup BP from ddt_bp_create() */ - DMU_OT_NUMTYPES + DMU_OT_NUMTYPES, + + /* + * Names for valid types declared with DMU_OT(). + */ + DMU_OTN_UINT8_DATA = DMU_OT(DMU_BSWAP_UINT8, B_FALSE), + DMU_OTN_UINT8_METADATA = DMU_OT(DMU_BSWAP_UINT8, B_TRUE), + DMU_OTN_UINT16_DATA = DMU_OT(DMU_BSWAP_UINT16, B_FALSE), + DMU_OTN_UINT16_METADATA = DMU_OT(DMU_BSWAP_UINT16, B_TRUE), + DMU_OTN_UINT32_DATA = DMU_OT(DMU_BSWAP_UINT32, B_FALSE), + DMU_OTN_UINT32_METADATA = DMU_OT(DMU_BSWAP_UINT32, B_TRUE), + DMU_OTN_UINT64_DATA = DMU_OT(DMU_BSWAP_UINT64, B_FALSE), + DMU_OTN_UINT64_METADATA = DMU_OT(DMU_BSWAP_UINT64, B_TRUE), + DMU_OTN_ZAP_DATA = DMU_OT(DMU_BSWAP_ZAP, B_FALSE), + DMU_OTN_ZAP_METADATA = DMU_OT(DMU_BSWAP_ZAP, B_TRUE) } dmu_object_type_t; typedef enum dmu_objset_type { @@ -1097,6 +1148,7 @@ typedef struct dsl_dataset_phys { */ #define DMU_POOL_DIRECTORY_OBJECT 1 #define DMU_POOL_CONFIG "config" +#define DMU_POOL_FEATURES_FOR_READ "features_for_read" #define DMU_POOL_ROOT_DATASET "root_dataset" #define DMU_POOL_SYNC_BPLIST "sync_bplist" #define DMU_POOL_ERRLOG_SCRUB "errlog_scrub" From owner-svn-src-stable-11@freebsd.org Wed Jul 26 14:56:04 2017 Return-Path: Delivered-To: svn-src-stable-11@mailman.ysv.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by mailman.ysv.freebsd.org (Postfix) with ESMTP id C2B47DBFD35; Wed, 26 Jul 2017 14:56:04 +0000 (UTC) (envelope-from gjb@FreeBSD.org) Received: from repo.freebsd.org (repo.freebsd.org [IPv6:2610:1c1:1:6068::e6a:0]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (Client did not present a certificate) by mx1.freebsd.org (Postfix) with ESMTPS id 8E45665791; Wed, 26 Jul 2017 14:56:04 +0000 (UTC) (envelope-from gjb@FreeBSD.org) Received: from repo.freebsd.org ([127.0.1.37]) by repo.freebsd.org (8.15.2/8.15.2) with ESMTP id v6QEu3EO030637; Wed, 26 Jul 2017 14:56:03 GMT (envelope-from gjb@FreeBSD.org) Received: (from gjb@localhost) by repo.freebsd.org (8.15.2/8.15.2/Submit) id v6QEu3TS030636; Wed, 26 Jul 2017 14:56:03 GMT (envelope-from gjb@FreeBSD.org) Message-Id: <201707261456.v6QEu3TS030636@repo.freebsd.org> X-Authentication-Warning: repo.freebsd.org: gjb set sender to gjb@FreeBSD.org using -f From: Glen Barber Date: Wed, 26 Jul 2017 14:56:03 +0000 (UTC) To: src-committers@freebsd.org, svn-src-all@freebsd.org, svn-src-stable@freebsd.org, svn-src-stable-11@freebsd.org Subject: svn commit: r321520 - stable/11/release/doc/en_US.ISO8859-1/errata X-SVN-Group: stable-11 X-SVN-Commit-Author: gjb X-SVN-Commit-Paths: stable/11/release/doc/en_US.ISO8859-1/errata X-SVN-Commit-Revision: 321520 X-SVN-Commit-Repository: base MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit X-BeenThere: svn-src-stable-11@freebsd.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: SVN commit messages for only the 11-stable src tree List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 26 Jul 2017 14:56:04 -0000 Author: gjb Date: Wed Jul 26 14:56:03 2017 New Revision: 321520 URL: https://svnweb.freebsd.org/changeset/base/321520 Log: Add a note regarding VirtualBox vboxguest panics during 11.1-RC2. Sponsored by: The FreeBSD Foundation Modified: stable/11/release/doc/en_US.ISO8859-1/errata/article.xml Modified: stable/11/release/doc/en_US.ISO8859-1/errata/article.xml ============================================================================== --- stable/11/release/doc/en_US.ISO8859-1/errata/article.xml Wed Jul 26 14:55:47 2017 (r321519) +++ stable/11/release/doc/en_US.ISO8859-1/errata/article.xml Wed Jul 26 14:56:03 2017 (r321520) @@ -151,6 +151,34 @@ boot There currently is no workaround. + + + [2017-07-26] Note for those upgrading from 11.1-RC2 in + VirtualBox: + + If system panics were experienced when upgrading from + 11.1-RC1 to 11.1-RC2, and the emulators/virtualbox-ose-additions{,-nox11} + port was built locally as a resolution, the port will either + need to be rebuilt when upgrading from 11.1-RC2 to + 11.1-RELEASE, or reinstall the package from the pkg(8) + mirrors using either: + + &prompt.root; pkg install -f virtualbox-ose-additions + + or + + &prompt.root; pkg install -f virtualbox-ose-additions-nox11 + + To ensure the system does not panic after rebooting into + the updated kernel, it is recommended to disable the + vboxguest service in &man.rc.conf.5; + prior to rebooting the system if possible, or use + &man.pkg.8; to forcefully reinstall the package. + + Systems being upgraded from 11.1-RC1 and earlier and + 11.1-RC3 to 11.1-RELEASE should be unaffected. + From owner-svn-src-stable-11@freebsd.org Wed Jul 26 15:15:21 2017 Return-Path: Delivered-To: svn-src-stable-11@mailman.ysv.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by mailman.ysv.freebsd.org (Postfix) with ESMTP id 3CB72DC02FD; Wed, 26 Jul 2017 15:15:21 +0000 (UTC) (envelope-from mav@FreeBSD.org) Received: from repo.freebsd.org (repo.freebsd.org [IPv6:2610:1c1:1:6068::e6a:0]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (Client did not present a certificate) by mx1.freebsd.org (Postfix) with ESMTPS id 1757D66452; Wed, 26 Jul 2017 15:15:21 +0000 (UTC) (envelope-from mav@FreeBSD.org) Received: from repo.freebsd.org ([127.0.1.37]) by repo.freebsd.org (8.15.2/8.15.2) with ESMTP id v6QFFK1F038592; Wed, 26 Jul 2017 15:15:20 GMT (envelope-from mav@FreeBSD.org) Received: (from mav@localhost) by repo.freebsd.org (8.15.2/8.15.2/Submit) id v6QFFKmc038591; Wed, 26 Jul 2017 15:15:20 GMT (envelope-from mav@FreeBSD.org) Message-Id: <201707261515.v6QFFKmc038591@repo.freebsd.org> X-Authentication-Warning: repo.freebsd.org: mav set sender to mav@FreeBSD.org using -f From: Alexander Motin Date: Wed, 26 Jul 2017 15:15:20 +0000 (UTC) To: src-committers@freebsd.org, svn-src-all@freebsd.org, svn-src-stable@freebsd.org, svn-src-stable-11@freebsd.org Subject: svn commit: r321521 - stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs X-SVN-Group: stable-11 X-SVN-Commit-Author: mav X-SVN-Commit-Paths: stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs X-SVN-Commit-Revision: 321521 X-SVN-Commit-Repository: base MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit X-BeenThere: svn-src-stable-11@freebsd.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: SVN commit messages for only the 11-stable src tree List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 26 Jul 2017 15:15:21 -0000 Author: mav Date: Wed Jul 26 15:15:20 2017 New Revision: 321521 URL: https://svnweb.freebsd.org/changeset/base/321521 Log: MFC r305701 (by allanjude): MFV r268120: 4936 lz4 could theoretically overflow a pointer with a certain input illumos/illumos-gate@58d0718061c87e3d647c891ec5281b93c08dba4e Modified: stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/lz4.c Directory Properties: stable/11/ (props changed) Modified: stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/lz4.c ============================================================================== --- stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/lz4.c Wed Jul 26 14:56:03 2017 (r321520) +++ stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/lz4.c Wed Jul 26 15:15:20 2017 (r321521) @@ -187,21 +187,18 @@ lz4_decompress(void *s_start, void *d_start, size_t s_ defined(__amd64) || defined(__ppc64__) || defined(_WIN64) || \ defined(__LP64__) || defined(_LP64)) #define LZ4_ARCH64 1 -/* - * Illumos: On amd64 we have 20k of stack and 24k on sun4u and sun4v, so we - * can spend 16k on the algorithm - */ -/* FreeBSD: Use heap for all platforms for now */ -#define STACKLIMIT 0 #else #define LZ4_ARCH64 0 +#endif + /* - * Illumos: On i386 we only have 12k of stack, so in order to maintain the - * same COMPRESSIONLEVEL we have to use heap allocation. Performance will - * suck, but alas, it's ZFS on 32-bit we're talking about, so... + * Limits the amount of stack space that the algorithm may consume to hold + * the compression lookup table. The value `9' here means we'll never use + * more than 2k of stack (see above for a description of COMPRESSIONLEVEL). + * If more memory is needed, it is allocated from the heap. */ +/* FreeBSD: Use heap for all platforms for now */ #define STACKLIMIT 0 -#endif /* * Little Endian or Big Endian? @@ -870,7 +867,7 @@ real_LZ4_compress(const char *source, char *dest, int /* Decompression functions */ /* - * Note: The decoding functionLZ4_uncompress_unknownOutputSize() is safe + * Note: The decoding function LZ4_uncompress_unknownOutputSize() is safe * against "buffer overflow" attack type. They will never write nor * read outside of the provided output buffers. * LZ4_uncompress_unknownOutputSize() also insures that it will never @@ -913,6 +910,9 @@ LZ4_uncompress_unknownOutputSize(const char *source, c } /* copy literals */ cpy = op + length; + /* CORNER-CASE: cpy might overflow. */ + if (cpy < op) + goto _output_error; /* cpy was overflowed, bail! */ if ((cpy > oend - COPYLENGTH) || (ip + length > iend - COPYLENGTH)) { if (cpy > oend) From owner-svn-src-stable-11@freebsd.org Wed Jul 26 15:19:36 2017 Return-Path: Delivered-To: svn-src-stable-11@mailman.ysv.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by mailman.ysv.freebsd.org (Postfix) with ESMTP id 1B86CDC039A; Wed, 26 Jul 2017 15:19:36 +0000 (UTC) (envelope-from mav@FreeBSD.org) Received: from repo.freebsd.org (repo.freebsd.org [IPv6:2610:1c1:1:6068::e6a:0]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (Client did not present a certificate) by mx1.freebsd.org (Postfix) with ESMTPS id DB616665F1; Wed, 26 Jul 2017 15:19:35 +0000 (UTC) (envelope-from mav@FreeBSD.org) Received: from repo.freebsd.org ([127.0.1.37]) by repo.freebsd.org (8.15.2/8.15.2) with ESMTP id v6QFJYNd038804; Wed, 26 Jul 2017 15:19:34 GMT (envelope-from mav@FreeBSD.org) Received: (from mav@localhost) by repo.freebsd.org (8.15.2/8.15.2/Submit) id v6QFJY0F038803; Wed, 26 Jul 2017 15:19:34 GMT (envelope-from mav@FreeBSD.org) Message-Id: <201707261519.v6QFJY0F038803@repo.freebsd.org> X-Authentication-Warning: repo.freebsd.org: mav set sender to mav@FreeBSD.org using -f From: Alexander Motin Date: Wed, 26 Jul 2017 15:19:34 +0000 (UTC) To: src-committers@freebsd.org, svn-src-all@freebsd.org, svn-src-stable@freebsd.org, svn-src-stable-11@freebsd.org Subject: svn commit: r321522 - stable/11/cddl/contrib/opensolaris/lib/libzfs/common X-SVN-Group: stable-11 X-SVN-Commit-Author: mav X-SVN-Commit-Paths: stable/11/cddl/contrib/opensolaris/lib/libzfs/common X-SVN-Commit-Revision: 321522 X-SVN-Commit-Repository: base MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit X-BeenThere: svn-src-stable-11@freebsd.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: SVN commit messages for only the 11-stable src tree List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 26 Jul 2017 15:19:36 -0000 Author: mav Date: Wed Jul 26 15:19:34 2017 New Revision: 321522 URL: https://svnweb.freebsd.org/changeset/base/321522 Log: MFC r309096 (by avg): MFV r308989: 6428 set canmount=off on unmounted filesystem tries to unmount children This is a cosmetic and bookkeeping change as the actual change is already in FreeBSD. See r297521, r304520, r308985. Modified: stable/11/cddl/contrib/opensolaris/lib/libzfs/common/libzfs_dataset.c Directory Properties: stable/11/ (props changed) Modified: stable/11/cddl/contrib/opensolaris/lib/libzfs/common/libzfs_dataset.c ============================================================================== --- stable/11/cddl/contrib/opensolaris/lib/libzfs/common/libzfs_dataset.c Wed Jul 26 15:15:20 2017 (r321521) +++ stable/11/cddl/contrib/opensolaris/lib/libzfs/common/libzfs_dataset.c Wed Jul 26 15:19:34 2017 (r321522) @@ -1620,7 +1620,7 @@ zfs_prop_set_list(zfs_handle_t *zhp, nvlist_t *props) */ if (prop != ZFS_PROP_CANMOUNT || (fnvpair_value_uint64(elem) == ZFS_CANMOUNT_OFF && - zfs_is_mounted(zhp, NULL))) { + zfs_is_mounted(zhp, NULL))) { cls[cl_idx] = changelist_gather(zhp, prop, 0, 0); if (cls[cl_idx] == NULL) goto error; From owner-svn-src-stable-11@freebsd.org Wed Jul 26 15:23:02 2017 Return-Path: Delivered-To: svn-src-stable-11@mailman.ysv.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by mailman.ysv.freebsd.org (Postfix) with ESMTP id 81C6EDC057E; Wed, 26 Jul 2017 15:23:02 +0000 (UTC) (envelope-from mav@FreeBSD.org) Received: from repo.freebsd.org (repo.freebsd.org [IPv6:2610:1c1:1:6068::e6a:0]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (Client did not present a certificate) by mx1.freebsd.org (Postfix) with ESMTPS id 5DBC0669F6; Wed, 26 Jul 2017 15:23:02 +0000 (UTC) (envelope-from mav@FreeBSD.org) Received: from repo.freebsd.org ([127.0.1.37]) by repo.freebsd.org (8.15.2/8.15.2) with ESMTP id v6QFN1ow042745; Wed, 26 Jul 2017 15:23:01 GMT (envelope-from mav@FreeBSD.org) Received: (from mav@localhost) by repo.freebsd.org (8.15.2/8.15.2/Submit) id v6QFN1oa042742; Wed, 26 Jul 2017 15:23:01 GMT (envelope-from mav@FreeBSD.org) Message-Id: <201707261523.v6QFN1oa042742@repo.freebsd.org> X-Authentication-Warning: repo.freebsd.org: mav set sender to mav@FreeBSD.org using -f From: Alexander Motin Date: Wed, 26 Jul 2017 15:23:01 +0000 (UTC) To: src-committers@freebsd.org, svn-src-all@freebsd.org, svn-src-stable@freebsd.org, svn-src-stable-11@freebsd.org Subject: svn commit: r321523 - in stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs: . sys X-SVN-Group: stable-11 X-SVN-Commit-Author: mav X-SVN-Commit-Paths: in stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs: . sys X-SVN-Commit-Revision: 321523 X-SVN-Commit-Repository: base MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit X-BeenThere: svn-src-stable-11@freebsd.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: SVN commit messages for only the 11-stable src tree List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 26 Jul 2017 15:23:02 -0000 Author: mav Date: Wed Jul 26 15:23:01 2017 New Revision: 321523 URL: https://svnweb.freebsd.org/changeset/base/321523 Log: MFC r312535: MFV 312436 6569 large file delete can starve out write ops illumos/illumos-gate@ff5177ee8bf9a355131ce2cc61ae2da6a5a6fdd6 https://github.com/illumos/illumos-gate/commit/ff5177ee8bf9a355131ce2cc61ae2da 6a5a6fdd6 https://www.illumos.org/issues/6569 The core issue I've found is that there is no throttle for how many deletes get assigned to one TXG. As a results when deleting large files we end up filling consecutive TXGs with deletes/frees, then write throttling other (more important) ops. There is an easy test case for this problem. Try deleting several large files (at least 1/2 TB) while you do write ops on the same pool. What we've seen is performance of these write ops (let's call it sideload I/O) would drop to zero. More specifically the problem is that dmu_free_long_range_impl() can/will fill up all of the dirty data in the pool "instantly", before many of the sideload ops can get in. So sideload performance will be impacted until all the files are freed. The solution we have tested at Nexenta (with positive results) creates a relatively simple throttle for how many "free" ops we let into one TXG. However this solution exposes other problems that should also be addressed. If we are to slow down freeing of data that means one has to wait even longer (assuming vnode ref count of 1) to get shell back after an rm or for NFS thread to finish the free-ing op. To avoid this the proposed solution is to call zfs_inactive() async for "large" files. Async freeing then begs for the reclaimed space to be accounted for in the zpool's "freeing" prop. The other issue with having a longer delete is the inability to export/unmount for a longer period of time. The proposed solution is to interrupt freeing of blocks when a fs is unmounted. Author: Alek Pinchuk Reviewed by: Matt Ahrens Reviewed by: Sanjay Nadkarni Reviewed by: Pavel Zakharov Approved by: Dan McDonald Modified: stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/dmu.c stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/dsl_pool.c stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/sys/dsl_pool.h Directory Properties: stable/11/ (props changed) Modified: stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/dmu.c ============================================================================== --- stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/dmu.c Wed Jul 26 15:19:34 2017 (r321522) +++ stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/dmu.c Wed Jul 26 15:23:01 2017 (r321523) @@ -60,6 +60,16 @@ SYSCTL_DECL(_vfs_zfs); SYSCTL_INT(_vfs_zfs, OID_AUTO, nopwrite_enabled, CTLFLAG_RDTUN, &zfs_nopwrite_enabled, 0, "Enable nopwrite feature"); +/* + * Tunable to control percentage of dirtied blocks from frees in one TXG. + * After this threshold is crossed, additional dirty blocks from frees + * wait until the next TXG. + * A value of zero will disable this throttle. + */ +uint32_t zfs_per_txg_dirty_frees_percent = 30; +SYSCTL_INT(_vfs_zfs, OID_AUTO, per_txg_dirty_frees_percent, CTLFLAG_RWTUN, + &zfs_per_txg_dirty_frees_percent, 0, "Percentage of dirtied blocks from frees in one txg"); + const dmu_object_type_info_t dmu_ot[DMU_OT_NUMTYPES] = { { DMU_BSWAP_UINT8, TRUE, "unallocated" }, { DMU_BSWAP_ZAP, TRUE, "object directory" }, @@ -718,15 +728,25 @@ dmu_free_long_range_impl(objset_t *os, dnode_t *dn, ui { uint64_t object_size = (dn->dn_maxblkid + 1) * dn->dn_datablksz; int err; + uint64_t dirty_frees_threshold; + dsl_pool_t *dp = dmu_objset_pool(os); if (offset >= object_size) return (0); + if (zfs_per_txg_dirty_frees_percent <= 100) + dirty_frees_threshold = + zfs_per_txg_dirty_frees_percent * zfs_dirty_data_max / 100; + else + dirty_frees_threshold = zfs_dirty_data_max / 4; + if (length == DMU_OBJECT_END || offset + length > object_size) length = object_size - offset; while (length != 0) { - uint64_t chunk_end, chunk_begin; + uint64_t chunk_end, chunk_begin, chunk_len; + uint64_t long_free_dirty_all_txgs = 0; + dmu_tx_t *tx; chunk_end = chunk_begin = offset + length; @@ -737,11 +757,30 @@ dmu_free_long_range_impl(objset_t *os, dnode_t *dn, ui ASSERT3U(chunk_begin, >=, offset); ASSERT3U(chunk_begin, <=, chunk_end); - dmu_tx_t *tx = dmu_tx_create(os); - dmu_tx_hold_free(tx, dn->dn_object, - chunk_begin, chunk_end - chunk_begin); + chunk_len = chunk_end - chunk_begin; + mutex_enter(&dp->dp_lock); + for (int t = 0; t < TXG_SIZE; t++) { + long_free_dirty_all_txgs += + dp->dp_long_free_dirty_pertxg[t]; + } + mutex_exit(&dp->dp_lock); + /* + * To avoid filling up a TXG with just frees wait for + * the next TXG to open before freeing more chunks if + * we have reached the threshold of frees + */ + if (dirty_frees_threshold != 0 && + long_free_dirty_all_txgs >= dirty_frees_threshold) { + txg_wait_open(dp, 0); + continue; + } + + tx = dmu_tx_create(os); + dmu_tx_hold_free(tx, dn->dn_object, chunk_begin, chunk_len); + + /* * Mark this transaction as typically resulting in a net * reduction in space used. */ @@ -751,10 +790,18 @@ dmu_free_long_range_impl(objset_t *os, dnode_t *dn, ui dmu_tx_abort(tx); return (err); } - dnode_free_range(dn, chunk_begin, chunk_end - chunk_begin, tx); + + mutex_enter(&dp->dp_lock); + dp->dp_long_free_dirty_pertxg[dmu_tx_get_txg(tx) & TXG_MASK] += + chunk_len; + mutex_exit(&dp->dp_lock); + DTRACE_PROBE3(free__long__range, + uint64_t, long_free_dirty_all_txgs, uint64_t, chunk_len, + uint64_t, dmu_tx_get_txg(tx)); + dnode_free_range(dn, chunk_begin, chunk_len, tx); dmu_tx_commit(tx); - length -= chunk_end - chunk_begin; + length -= chunk_len; } return (0); } Modified: stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/dsl_pool.c ============================================================================== --- stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/dsl_pool.c Wed Jul 26 15:19:34 2017 (r321522) +++ stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/dsl_pool.c Wed Jul 26 15:23:01 2017 (r321523) @@ -24,6 +24,7 @@ * Copyright (c) 2013 Steven Hartland. All rights reserved. * Copyright (c) 2014 Spectra Logic Corporation, All rights reserved. * Copyright (c) 2014 Integros [integros.com] + * Copyright 2016 Nexenta Systems, Inc. All rights reserved. */ #include @@ -588,6 +589,16 @@ dsl_pool_sync(dsl_pool_t *dp, uint64_t txg) * Shore up the accounting of any dirtied space now. */ dsl_pool_undirty_space(dp, dp->dp_dirty_pertxg[txg & TXG_MASK], txg); + + /* + * Update the long range free counter after + * we're done syncing user data + */ + mutex_enter(&dp->dp_lock); + ASSERT(spa_sync_pass(dp->dp_spa) == 1 || + dp->dp_long_free_dirty_pertxg[txg & TXG_MASK] == 0); + dp->dp_long_free_dirty_pertxg[txg & TXG_MASK] = 0; + mutex_exit(&dp->dp_lock); /* * After the data blocks have been written (ensured by the zio_wait() Modified: stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/sys/dsl_pool.h ============================================================================== --- stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/sys/dsl_pool.h Wed Jul 26 15:19:34 2017 (r321522) +++ stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/sys/dsl_pool.h Wed Jul 26 15:23:01 2017 (r321523) @@ -21,6 +21,7 @@ /* * Copyright (c) 2005, 2010, Oracle and/or its affiliates. All rights reserved. * Copyright (c) 2013 by Delphix. All rights reserved. + * Copyright 2016 Nexenta Systems, Inc. All rights reserved. */ #ifndef _SYS_DSL_POOL_H @@ -103,6 +104,7 @@ typedef struct dsl_pool { kcondvar_t dp_spaceavail_cv; uint64_t dp_dirty_pertxg[TXG_SIZE]; uint64_t dp_dirty_total; + uint64_t dp_long_free_dirty_pertxg[TXG_SIZE]; uint64_t dp_mos_used_delta; uint64_t dp_mos_compressed_delta; uint64_t dp_mos_uncompressed_delta; From owner-svn-src-stable-11@freebsd.org Wed Jul 26 15:24:18 2017 Return-Path: Delivered-To: svn-src-stable-11@mailman.ysv.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by mailman.ysv.freebsd.org (Postfix) with ESMTP id 8F1A3DC0681; Wed, 26 Jul 2017 15:24:18 +0000 (UTC) (envelope-from mav@FreeBSD.org) Received: from repo.freebsd.org (repo.freebsd.org [IPv6:2610:1c1:1:6068::e6a:0]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (Client did not present a certificate) by mx1.freebsd.org (Postfix) with ESMTPS id 690C166CE2; Wed, 26 Jul 2017 15:24:18 +0000 (UTC) (envelope-from mav@FreeBSD.org) Received: from repo.freebsd.org ([127.0.1.37]) by repo.freebsd.org (8.15.2/8.15.2) with ESMTP id v6QFOHpk042848; Wed, 26 Jul 2017 15:24:17 GMT (envelope-from mav@FreeBSD.org) Received: (from mav@localhost) by repo.freebsd.org (8.15.2/8.15.2/Submit) id v6QFOHC2042845; Wed, 26 Jul 2017 15:24:17 GMT (envelope-from mav@FreeBSD.org) Message-Id: <201707261524.v6QFOHC2042845@repo.freebsd.org> X-Authentication-Warning: repo.freebsd.org: mav set sender to mav@FreeBSD.org using -f From: Alexander Motin Date: Wed, 26 Jul 2017 15:24:17 +0000 (UTC) To: src-committers@freebsd.org, svn-src-all@freebsd.org, svn-src-stable@freebsd.org, svn-src-stable-11@freebsd.org Subject: svn commit: r321524 - in stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs: . sys X-SVN-Group: stable-11 X-SVN-Commit-Author: mav X-SVN-Commit-Paths: in stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs: . sys X-SVN-Commit-Revision: 321524 X-SVN-Commit-Repository: base MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit X-BeenThere: svn-src-stable-11@freebsd.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: SVN commit messages for only the 11-stable src tree List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 26 Jul 2017 15:24:18 -0000 Author: mav Date: Wed Jul 26 15:24:17 2017 New Revision: 321524 URL: https://svnweb.freebsd.org/changeset/base/321524 Log: MFC r313813: MFV 313786 7500 Simplify dbuf_free_range by removing dn_unlisted_l0_blkid illumos/illumos-gate@653af1b809998570c7e89fe7a0d3f90992bf0216 https://github.com/illumos/illumos-gate/commit/653af1b809998570c7e89fe7a0d3f90992bf0216 https://www.illumos.org/issues/7500 With the integration of: commit 0f6d88aded0d165f5954688a9b13bac76c38da84 Author: Alex Reece Date: Sat Jul 26 13:40:04 2014 -0800 4873 zvol unmap calls can take a very long time for larger datasets the dnode's dn_bufs field was changed from a list to a tree. As a result, the dn_unlisted_l0_blkid field is no longer necessary. Author: Stephen Blinick Reviewed by: Matthew Ahrens Reviewed by: Dan Kimmel Approved by: Gordon Ross Modified: stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/dbuf.c stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/dnode.c stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/sys/dnode.h Directory Properties: stable/11/ (props changed) Modified: stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/dbuf.c ============================================================================== --- stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/dbuf.c Wed Jul 26 15:23:01 2017 (r321523) +++ stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/dbuf.c Wed Jul 26 15:24:17 2017 (r321524) @@ -49,12 +49,6 @@ uint_t zfs_dbuf_evict_key; -/* - * Number of times that zfs_free_range() took the slow path while doing - * a zfs receive. A nonzero value indicates a potential performance problem. - */ -uint64_t zfs_free_range_recv_miss; - static boolean_t dbuf_undirty(dmu_buf_impl_t *db, dmu_tx_t *tx); static void dbuf_write(dbuf_dirty_record_t *dr, arc_buf_t *data, dmu_tx_t *tx); @@ -1220,9 +1214,6 @@ dbuf_unoverride(dbuf_dirty_record_t *dr) * Evict (if its unreferenced) or clear (if its referenced) any level-0 * data blocks in the free range, so that any future readers will find * empty blocks. - * - * This is a no-op if the dataset is in the middle of an incremental - * receive; see comment below for details. */ void dbuf_free_range(dnode_t *dn, uint64_t start_blkid, uint64_t end_blkid, @@ -1232,10 +1223,9 @@ dbuf_free_range(dnode_t *dn, uint64_t start_blkid, uin dmu_buf_impl_t *db, *db_next; uint64_t txg = tx->tx_txg; avl_index_t where; - boolean_t freespill = - (start_blkid == DMU_SPILL_BLKID || end_blkid == DMU_SPILL_BLKID); - if (end_blkid > dn->dn_maxblkid && !freespill) + if (end_blkid > dn->dn_maxblkid && + !(start_blkid == DMU_SPILL_BLKID || end_blkid == DMU_SPILL_BLKID)) end_blkid = dn->dn_maxblkid; dprintf_dnode(dn, "start=%llu end=%llu\n", start_blkid, end_blkid); @@ -1244,29 +1234,9 @@ dbuf_free_range(dnode_t *dn, uint64_t start_blkid, uin db_search.db_state = DB_SEARCH; mutex_enter(&dn->dn_dbufs_mtx); - if (start_blkid >= dn->dn_unlisted_l0_blkid && !freespill) { - /* There can't be any dbufs in this range; no need to search. */ -#ifdef DEBUG - db = avl_find(&dn->dn_dbufs, &db_search, &where); - ASSERT3P(db, ==, NULL); - db = avl_nearest(&dn->dn_dbufs, where, AVL_AFTER); - ASSERT(db == NULL || db->db_level > 0); -#endif - mutex_exit(&dn->dn_dbufs_mtx); - return; - } else if (dmu_objset_is_receiving(dn->dn_objset)) { - /* - * If we are receiving, we expect there to be no dbufs in - * the range to be freed, because receive modifies each - * block at most once, and in offset order. If this is - * not the case, it can lead to performance problems, - * so note that we unexpectedly took the slow path. - */ - atomic_inc_64(&zfs_free_range_recv_miss); - } - db = avl_find(&dn->dn_dbufs, &db_search, &where); ASSERT3P(db, ==, NULL); + db = avl_nearest(&dn->dn_dbufs, where, AVL_AFTER); for (; db != NULL; db = db_next) { @@ -2283,9 +2253,7 @@ dbuf_create(dnode_t *dn, uint8_t level, uint64_t blkid return (odb); } avl_add(&dn->dn_dbufs, db); - if (db->db_level == 0 && db->db_blkid >= - dn->dn_unlisted_l0_blkid) - dn->dn_unlisted_l0_blkid = db->db_blkid + 1; + db->db_state = DB_UNCACHED; mutex_exit(&dn->dn_dbufs_mtx); arc_space_consume(sizeof (dmu_buf_impl_t), ARC_SPACE_OTHER); Modified: stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/dnode.c ============================================================================== --- stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/dnode.c Wed Jul 26 15:23:01 2017 (r321523) +++ stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/dnode.c Wed Jul 26 15:24:17 2017 (r321524) @@ -20,7 +20,7 @@ */ /* * Copyright (c) 2005, 2010, Oracle and/or its affiliates. All rights reserved. - * Copyright (c) 2012, 2015 by Delphix. All rights reserved. + * Copyright (c) 2012, 2016 by Delphix. All rights reserved. * Copyright (c) 2014 Spectra Logic Corporation, All rights reserved. * Copyright (c) 2014 Integros [integros.com] */ @@ -153,7 +153,6 @@ dnode_cons(void *arg, void *unused, int kmflag) dn->dn_id_flags = 0; dn->dn_dbufs_count = 0; - dn->dn_unlisted_l0_blkid = 0; avl_create(&dn->dn_dbufs, dbuf_compare, sizeof (dmu_buf_impl_t), offsetof(dmu_buf_impl_t, db_link)); @@ -207,7 +206,6 @@ dnode_dest(void *arg, void *unused) ASSERT0(dn->dn_id_flags); ASSERT0(dn->dn_dbufs_count); - ASSERT0(dn->dn_unlisted_l0_blkid); avl_destroy(&dn->dn_dbufs); } @@ -525,7 +523,6 @@ dnode_destroy(dnode_t *dn) dn->dn_newuid = 0; dn->dn_newgid = 0; dn->dn_id_flags = 0; - dn->dn_unlisted_l0_blkid = 0; dmu_zfetch_fini(&dn->dn_zfetch); kmem_cache_free(dnode_cache, dn); @@ -761,7 +758,6 @@ dnode_move_impl(dnode_t *odn, dnode_t *ndn) ASSERT(avl_is_empty(&ndn->dn_dbufs)); avl_swap(&ndn->dn_dbufs, &odn->dn_dbufs); ndn->dn_dbufs_count = odn->dn_dbufs_count; - ndn->dn_unlisted_l0_blkid = odn->dn_unlisted_l0_blkid; ndn->dn_bonus = odn->dn_bonus; ndn->dn_have_spill = odn->dn_have_spill; ndn->dn_zio = odn->dn_zio; @@ -794,7 +790,6 @@ dnode_move_impl(dnode_t *odn, dnode_t *ndn) avl_create(&odn->dn_dbufs, dbuf_compare, sizeof (dmu_buf_impl_t), offsetof(dmu_buf_impl_t, db_link)); odn->dn_dbufs_count = 0; - odn->dn_unlisted_l0_blkid = 0; odn->dn_bonus = NULL; odn->dn_zfetch.zf_dnode = NULL; Modified: stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/sys/dnode.h ============================================================================== --- stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/sys/dnode.h Wed Jul 26 15:23:01 2017 (r321523) +++ stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/sys/dnode.h Wed Jul 26 15:24:17 2017 (r321524) @@ -195,8 +195,6 @@ struct dnode { /* protected by dn_dbufs_mtx; declared here to fill 32-bit hole */ uint32_t dn_dbufs_count; /* count of dn_dbufs */ - /* There are no level-0 blocks of this blkid or higher in dn_dbufs */ - uint64_t dn_unlisted_l0_blkid; /* protected by os_lock: */ list_node_t dn_dirty_link[TXG_SIZE]; /* next on dataset's dirty */ From owner-svn-src-stable-11@freebsd.org Wed Jul 26 15:29:58 2017 Return-Path: Delivered-To: svn-src-stable-11@mailman.ysv.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by mailman.ysv.freebsd.org (Postfix) with ESMTP id 17703DC07A4; Wed, 26 Jul 2017 15:29:58 +0000 (UTC) (envelope-from mav@FreeBSD.org) Received: from repo.freebsd.org (repo.freebsd.org [IPv6:2610:1c1:1:6068::e6a:0]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (Client did not present a certificate) by mx1.freebsd.org (Postfix) with ESMTPS id E4B8366EC2; Wed, 26 Jul 2017 15:29:57 +0000 (UTC) (envelope-from mav@FreeBSD.org) Received: from repo.freebsd.org ([127.0.1.37]) by repo.freebsd.org (8.15.2/8.15.2) with ESMTP id v6QFTv8N043107; Wed, 26 Jul 2017 15:29:57 GMT (envelope-from mav@FreeBSD.org) Received: (from mav@localhost) by repo.freebsd.org (8.15.2/8.15.2/Submit) id v6QFTv4U043105; Wed, 26 Jul 2017 15:29:57 GMT (envelope-from mav@FreeBSD.org) Message-Id: <201707261529.v6QFTv4U043105@repo.freebsd.org> X-Authentication-Warning: repo.freebsd.org: mav set sender to mav@FreeBSD.org using -f From: Alexander Motin Date: Wed, 26 Jul 2017 15:29:56 +0000 (UTC) To: src-committers@freebsd.org, svn-src-all@freebsd.org, svn-src-stable@freebsd.org, svn-src-stable-11@freebsd.org Subject: svn commit: r321525 - in stable/11/sys: boot/zfs cddl/boot/zfs X-SVN-Group: stable-11 X-SVN-Commit-Author: mav X-SVN-Commit-Paths: in stable/11/sys: boot/zfs cddl/boot/zfs X-SVN-Commit-Revision: 321525 X-SVN-Commit-Repository: base MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit X-BeenThere: svn-src-stable-11@freebsd.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: SVN commit messages for only the 11-stable src tree List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 26 Jul 2017 15:29:58 -0000 Author: mav Date: Wed Jul 26 15:29:56 2017 New Revision: 321525 URL: https://svnweb.freebsd.org/changeset/base/321525 Log: MFC r314112 (by tsoome): loader: update symlink support in zfs reader As the current zfs file system is providing symlink via system attributes, need to update the code accordingly. Note, as the zfsboot code does not free the memory at this time, the object list will put some stress on the boot2 heap, eventually we should address the issue. Differential Revision: https://reviews.freebsd.org/D9706 Modified: stable/11/sys/boot/zfs/zfsimpl.c stable/11/sys/cddl/boot/zfs/zfsimpl.h Directory Properties: stable/11/ (props changed) Modified: stable/11/sys/boot/zfs/zfsimpl.c ============================================================================== --- stable/11/sys/boot/zfs/zfsimpl.c Wed Jul 26 15:24:17 2017 (r321524) +++ stable/11/sys/boot/zfs/zfsimpl.c Wed Jul 26 15:29:56 2017 (r321525) @@ -2142,6 +2142,61 @@ zfs_dnode_stat(const spa_t *spa, dnode_phys_t *dn, str return (0); } +static int +zfs_dnode_readlink(const spa_t *spa, dnode_phys_t *dn, char *path, size_t psize) +{ + int rc = 0; + + if (dn->dn_bonustype == DMU_OT_SA) { + sa_hdr_phys_t *sahdrp = NULL; + size_t size = 0; + void *buf = NULL; + int hdrsize; + char *p; + + if (dn->dn_bonuslen != 0) + sahdrp = (sa_hdr_phys_t *)DN_BONUS(dn); + else { + blkptr_t *bp; + + if ((dn->dn_flags & DNODE_FLAG_SPILL_BLKPTR) == 0) + return (EIO); + bp = &dn->dn_spill; + + size = BP_GET_LSIZE(bp); + buf = zfs_alloc(size); + rc = zio_read(spa, bp, buf); + if (rc != 0) { + zfs_free(buf, size); + return (rc); + } + sahdrp = buf; + } + hdrsize = SA_HDR_SIZE(sahdrp); + p = (char *)((uintptr_t)sahdrp + hdrsize + SA_SYMLINK_OFFSET); + memcpy(path, p, psize); + if (buf != NULL) + zfs_free(buf, size); + return (0); + } + /* + * Second test is purely to silence bogus compiler + * warning about accessing past the end of dn_bonus. + */ + if (psize + sizeof(znode_phys_t) <= dn->dn_bonuslen && + sizeof(znode_phys_t) <= sizeof(dn->dn_bonus)) { + memcpy(path, &dn->dn_bonus[sizeof(znode_phys_t)], psize); + } else { + rc = dnode_read(spa, dn, 0, path, psize); + } + return (rc); +} + +struct obj_list { + uint64_t objnum; + STAILQ_ENTRY(obj_list) entry; +}; + /* * Lookup a file and return its dnode. */ @@ -2149,7 +2204,7 @@ static int zfs_lookup(const struct zfsmount *mount, const char *upath, dnode_phys_t *dnode) { int rc; - uint64_t objnum, rootnum, parentnum; + uint64_t objnum; const spa_t *spa; dnode_phys_t dn; const char *p, *q; @@ -2157,6 +2212,8 @@ zfs_lookup(const struct zfsmount *mount, const char *u char path[1024]; int symlinks_followed = 0; struct stat sb; + struct obj_list *entry; + STAILQ_HEAD(, obj_list) on_cache = STAILQ_HEAD_INITIALIZER(on_cache); spa = mount->spa; if (mount->objset.os_type != DMU_OST_ZFS) { @@ -2165,102 +2222,145 @@ zfs_lookup(const struct zfsmount *mount, const char *u return (EIO); } + if ((entry = malloc(sizeof(struct obj_list))) == NULL) + return (ENOMEM); + /* * Get the root directory dnode. */ rc = objset_get_dnode(spa, &mount->objset, MASTER_NODE_OBJ, &dn); - if (rc) + if (rc) { + free(entry); return (rc); + } rc = zap_lookup(spa, &dn, ZFS_ROOT_OBJ, &rootnum); - if (rc) + if (rc) { + free(entry); return (rc); + } + entry->objnum = objnum; + STAILQ_INSERT_HEAD(&on_cache, entry, entry); - rc = objset_get_dnode(spa, &mount->objset, rootnum, &dn); - if (rc) - return (rc); + rc = objset_get_dnode(spa, &mount->objset, objnum, &dn); + if (rc != 0) + goto done; - objnum = rootnum; p = upath; while (p && *p) { + rc = objset_get_dnode(spa, &mount->objset, objnum, &dn); + if (rc != 0) + goto done; + while (*p == '/') p++; - if (!*p) + if (*p == '\0') break; - q = strchr(p, '/'); - if (q) { - memcpy(element, p, q - p); - element[q - p] = 0; - p = q; - } else { - strcpy(element, p); - p = NULL; + q = p; + while (*q != '\0' && *q != '/') + q++; + + /* skip dot */ + if (p + 1 == q && p[0] == '.') { + p++; + continue; } + /* double dot */ + if (p + 2 == q && p[0] == '.' && p[1] == '.') { + p += 2; + if (STAILQ_FIRST(&on_cache) == + STAILQ_LAST(&on_cache, obj_list, entry)) { + rc = ENOENT; + goto done; + } + entry = STAILQ_FIRST(&on_cache); + STAILQ_REMOVE_HEAD(&on_cache, entry); + free(entry); + objnum = (STAILQ_FIRST(&on_cache))->objnum; + continue; + } + if (q - p + 1 > sizeof(element)) { + rc = ENAMETOOLONG; + goto done; + } + memcpy(element, p, q - p); + element[q - p] = 0; + p = q; - rc = zfs_dnode_stat(spa, &dn, &sb); - if (rc) - return (rc); - if (!S_ISDIR(sb.st_mode)) - return (ENOTDIR); + if ((rc = zfs_dnode_stat(spa, &dn, &sb)) != 0) + goto done; + if (!S_ISDIR(sb.st_mode)) { + rc = ENOTDIR; + goto done; + } - parentnum = objnum; rc = zap_lookup(spa, &dn, element, &objnum); if (rc) - return (rc); + goto done; objnum = ZFS_DIRENT_OBJ(objnum); + if ((entry = malloc(sizeof(struct obj_list))) == NULL) { + rc = ENOMEM; + goto done; + } + entry->objnum = objnum; + STAILQ_INSERT_HEAD(&on_cache, entry, entry); rc = objset_get_dnode(spa, &mount->objset, objnum, &dn); if (rc) - return (rc); + goto done; /* * Check for symlink. */ rc = zfs_dnode_stat(spa, &dn, &sb); if (rc) - return (rc); + goto done; if (S_ISLNK(sb.st_mode)) { - if (symlinks_followed > 10) - return (EMLINK); + if (symlinks_followed > 10) { + rc = EMLINK; + goto done; + } symlinks_followed++; /* * Read the link value and copy the tail of our * current path onto the end. */ - if (p) - strcpy(&path[sb.st_size], p); - else - path[sb.st_size] = 0; - /* - * Second test is purely to silence bogus compiler - * warning about accessing past the end of dn_bonus. - */ - if (sb.st_size + sizeof(znode_phys_t) <= - dn.dn_bonuslen && sizeof(znode_phys_t) <= - sizeof(dn.dn_bonus)) { - memcpy(path, &dn.dn_bonus[sizeof(znode_phys_t)], - sb.st_size); - } else { - rc = dnode_read(spa, &dn, 0, path, sb.st_size); - if (rc) - return (rc); + if (sb.st_size + strlen(p) + 1 > sizeof(path)) { + rc = ENAMETOOLONG; + goto done; } + strcpy(&path[sb.st_size], p); + rc = zfs_dnode_readlink(spa, &dn, path, sb.st_size); + if (rc != 0) + goto done; + /* * Restart with the new path, starting either at * the root or at the parent depending whether or * not the link is relative. */ p = path; - if (*p == '/') - objnum = rootnum; - else - objnum = parentnum; - objset_get_dnode(spa, &mount->objset, objnum, &dn); + if (*p == '/') { + while (STAILQ_FIRST(&on_cache) != + STAILQ_LAST(&on_cache, obj_list, entry)) { + entry = STAILQ_FIRST(&on_cache); + STAILQ_REMOVE_HEAD(&on_cache, entry); + free(entry); + } + } else { + entry = STAILQ_FIRST(&on_cache); + STAILQ_REMOVE_HEAD(&on_cache, entry); + free(entry); + } + objnum = (STAILQ_FIRST(&on_cache))->objnum; } } *dnode = dn; - return (0); +done: + STAILQ_FOREACH(entry, &on_cache, entry) + free(entry); + return (rc); } Modified: stable/11/sys/cddl/boot/zfs/zfsimpl.h ============================================================================== --- stable/11/sys/cddl/boot/zfs/zfsimpl.h Wed Jul 26 15:24:17 2017 (r321524) +++ stable/11/sys/cddl/boot/zfs/zfsimpl.h Wed Jul 26 15:29:56 2017 (r321525) @@ -1070,6 +1070,7 @@ typedef struct sa_hdr_phys { #define SA_UID_OFFSET 24 #define SA_GID_OFFSET 32 #define SA_PARENT_OFFSET 40 +#define SA_SYMLINK_OFFSET 160 /* * Intent log header - this on disk structure holds fields to manage From owner-svn-src-stable-11@freebsd.org Wed Jul 26 15:52:52 2017 Return-Path: Delivered-To: svn-src-stable-11@mailman.ysv.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by mailman.ysv.freebsd.org (Postfix) with ESMTP id ED7F3DC0E6F; Wed, 26 Jul 2017 15:52:52 +0000 (UTC) (envelope-from mav@FreeBSD.org) Received: from repo.freebsd.org (repo.freebsd.org [IPv6:2610:1c1:1:6068::e6a:0]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (Client did not present a certificate) by mx1.freebsd.org (Postfix) with ESMTPS id B3D4C67A91; Wed, 26 Jul 2017 15:52:52 +0000 (UTC) (envelope-from mav@FreeBSD.org) Received: from repo.freebsd.org ([127.0.1.37]) by repo.freebsd.org (8.15.2/8.15.2) with ESMTP id v6QFqpB0055116; Wed, 26 Jul 2017 15:52:51 GMT (envelope-from mav@FreeBSD.org) Received: (from mav@localhost) by repo.freebsd.org (8.15.2/8.15.2/Submit) id v6QFqpMh055115; Wed, 26 Jul 2017 15:52:51 GMT (envelope-from mav@FreeBSD.org) Message-Id: <201707261552.v6QFqpMh055115@repo.freebsd.org> X-Authentication-Warning: repo.freebsd.org: mav set sender to mav@FreeBSD.org using -f From: Alexander Motin Date: Wed, 26 Jul 2017 15:52:51 +0000 (UTC) To: src-committers@freebsd.org, svn-src-all@freebsd.org, svn-src-stable@freebsd.org, svn-src-stable-11@freebsd.org Subject: svn commit: r321526 - stable/11/sys/boot/zfs X-SVN-Group: stable-11 X-SVN-Commit-Author: mav X-SVN-Commit-Paths: stable/11/sys/boot/zfs X-SVN-Commit-Revision: 321526 X-SVN-Commit-Repository: base MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit X-BeenThere: svn-src-stable-11@freebsd.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: SVN commit messages for only the 11-stable src tree List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 26 Jul 2017 15:52:53 -0000 Author: mav Date: Wed Jul 26 15:52:51 2017 New Revision: 321526 URL: https://svnweb.freebsd.org/changeset/base/321526 Log: Fix mismerge in r321525. Modified: stable/11/sys/boot/zfs/zfsimpl.c Modified: stable/11/sys/boot/zfs/zfsimpl.c ============================================================================== --- stable/11/sys/boot/zfs/zfsimpl.c Wed Jul 26 15:29:56 2017 (r321525) +++ stable/11/sys/boot/zfs/zfsimpl.c Wed Jul 26 15:52:51 2017 (r321526) @@ -2234,7 +2234,7 @@ zfs_lookup(const struct zfsmount *mount, const char *u return (rc); } - rc = zap_lookup(spa, &dn, ZFS_ROOT_OBJ, &rootnum); + rc = zap_lookup(spa, &dn, ZFS_ROOT_OBJ, &objnum); if (rc) { free(entry); return (rc); From owner-svn-src-stable-11@freebsd.org Wed Jul 26 16:11:10 2017 Return-Path: Delivered-To: svn-src-stable-11@mailman.ysv.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by mailman.ysv.freebsd.org (Postfix) with ESMTP id 5AFA3D98249; Wed, 26 Jul 2017 16:11:10 +0000 (UTC) (envelope-from mav@FreeBSD.org) Received: from repo.freebsd.org (repo.freebsd.org [IPv6:2610:1c1:1:6068::e6a:0]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (Client did not present a certificate) by mx1.freebsd.org (Postfix) with ESMTPS id 2381F68294; Wed, 26 Jul 2017 16:11:10 +0000 (UTC) (envelope-from mav@FreeBSD.org) Received: from repo.freebsd.org ([127.0.1.37]) by repo.freebsd.org (8.15.2/8.15.2) with ESMTP id v6QGB9da061775; Wed, 26 Jul 2017 16:11:09 GMT (envelope-from mav@FreeBSD.org) Received: (from mav@localhost) by repo.freebsd.org (8.15.2/8.15.2/Submit) id v6QGB8sE061766; Wed, 26 Jul 2017 16:11:08 GMT (envelope-from mav@FreeBSD.org) Message-Id: <201707261611.v6QGB8sE061766@repo.freebsd.org> X-Authentication-Warning: repo.freebsd.org: mav set sender to mav@FreeBSD.org using -f From: Alexander Motin Date: Wed, 26 Jul 2017 16:11:08 +0000 (UTC) To: src-committers@freebsd.org, svn-src-all@freebsd.org, svn-src-stable@freebsd.org, svn-src-stable-11@freebsd.org Subject: svn commit: r321527 - in stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs: . sys X-SVN-Group: stable-11 X-SVN-Commit-Author: mav X-SVN-Commit-Paths: in stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs: . sys X-SVN-Commit-Revision: 321527 X-SVN-Commit-Repository: base MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit X-BeenThere: svn-src-stable-11@freebsd.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: SVN commit messages for only the 11-stable src tree List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 26 Jul 2017 16:11:10 -0000 Author: mav Date: Wed Jul 26 16:11:08 2017 New Revision: 321527 URL: https://svnweb.freebsd.org/changeset/base/321527 Log: MFC r314267: MFV 314243 6676 Race between unique_insert() and unique_remove() causes ZFS fsid change illumos/illumos-gate@40510e8eba18690b9a9843b26393725eeb0f1dac https://github.com/illumos/illumos-gate/commit/40510e8eba18690b9a9843b26393725ee b0f1dac https://www.illumos.org/issues/6676 The fsid of zfs filesystems might change after reboot or remount. The problem seems to be caused by a race between unique_insert() and unique_remove(). The unique_re move() is called from dsl_dataset_evict() which is now an asynchronous thread. In a c ase the dsl_dataset_evict() thread is very slow and calls unique_remove() too late we will end up with changed fsid on zfs mount. This problem is very likely caused by #5056. Steps to Reproduce Note: I'm able to reproduce this always on a single core (virtual) machine. On multicore machines it is not so easy to reproduce. # uname -a SunOS openindiana 5.11 illumos-633aa80 i86pc i386 i86pc Solaris # zfs create rpool/TEST # FS=$(echo ::fsinfo | mdb -k | grep TEST | awk '{print $1}') # echo $FS::print vfs_t vfs_fsid | mdb -k vfs_fsid = { vfs_fsid.val = [ 0x54d7028a, 0x70311508 ] } # zfs umount rpool/TEST # zfs mount rpool/TEST # FS=$(echo ::fsinfo | mdb -k | grep TEST | awk '{print $1}') # echo $FS::print vfs_t vfs_fsid | mdb -k vfs_fsid = { vfs_fsid.val = [ 0xd9454e49, 0x6b36d08 ] } # Impact The persistent fsid (filesystem id) is essential for proper NFS functionality. If the fsid of a filesystem changes on remount (or after reboot) the NFS clients might not be able to automatically recover from such event and the manual remount of the NFS filesystems on every NFS client might be needed. Author: Josef 'Jeff' Sipek Reviewed by: Saso Kiselkov Reviewed by: Sanjay Nadkarni Reviewed by: Dan Vatca Reviewed by: Matthew Ahrens Reviewed by: George Wilson Reviewed by: Sebastien Roy Approved by: Robert Mustacchi Modified: stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/dbuf.c stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/dnode.c stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/dsl_dataset.c stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/dsl_dir.c stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/sa.c stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/sys/dmu.h stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/sys/zap_impl.h stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/zap.c stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/zap_micro.c Directory Properties: stable/11/ (props changed) Modified: stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/dbuf.c ============================================================================== --- stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/dbuf.c Wed Jul 26 15:52:51 2017 (r321526) +++ stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/dbuf.c Wed Jul 26 16:11:08 2017 (r321527) @@ -54,7 +54,9 @@ static void dbuf_write(dbuf_dirty_record_t *dr, arc_bu #ifndef __lint extern inline void dmu_buf_init_user(dmu_buf_user_t *dbu, - dmu_buf_evict_func_t *evict_func, dmu_buf_t **clear_on_evict_dbufp); + dmu_buf_evict_func_t *evict_func_sync, + dmu_buf_evict_func_t *evict_func_async, + dmu_buf_t **clear_on_evict_dbufp); #endif /* ! __lint */ /* @@ -361,11 +363,24 @@ dbuf_evict_user(dmu_buf_impl_t *db) #endif /* - * Invoke the callback from a taskq to avoid lock order reversals - * and limit stack depth. + * There are two eviction callbacks - one that we call synchronously + * and one that we invoke via a taskq. The async one is useful for + * avoiding lock order reversals and limiting stack depth. + * + * Note that if we have a sync callback but no async callback, + * it's likely that the sync callback will free the structure + * containing the dbu. In that case we need to take care to not + * dereference dbu after calling the sync evict func. */ - taskq_dispatch_ent(dbu_evict_taskq, dbu->dbu_evict_func, dbu, 0, - &dbu->dbu_tqent); + boolean_t has_async = (dbu->dbu_evict_func_async != NULL); + + if (dbu->dbu_evict_func_sync != NULL) + dbu->dbu_evict_func_sync(dbu); + + if (has_async) { + taskq_dispatch_ent(dbu_evict_taskq, dbu->dbu_evict_func_async, + dbu, 0, &dbu->dbu_tqent); + } } boolean_t Modified: stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/dnode.c ============================================================================== --- stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/dnode.c Wed Jul 26 15:52:51 2017 (r321526) +++ stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/dnode.c Wed Jul 26 16:11:08 2017 (r321527) @@ -1014,7 +1014,7 @@ dnode_special_open(objset_t *os, dnode_phys_t *dnp, ui } static void -dnode_buf_pageout(void *dbu) +dnode_buf_evict_async(void *dbu) { dnode_children_t *children_dnodes = dbu; int i; @@ -1140,8 +1140,8 @@ dnode_hold_impl(objset_t *os, uint64_t object, int fla for (i = 0; i < epb; i++) { zrl_init(&dnh[i].dnh_zrlock); } - dmu_buf_init_user(&children_dnodes->dnc_dbu, - dnode_buf_pageout, NULL); + dmu_buf_init_user(&children_dnodes->dnc_dbu, NULL, + dnode_buf_evict_async, NULL); winner = dmu_buf_set_user(&db->db, &children_dnodes->dnc_dbu); if (winner != NULL) { Modified: stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/dsl_dataset.c ============================================================================== --- stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/dsl_dataset.c Wed Jul 26 15:52:51 2017 (r321526) +++ stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/dsl_dataset.c Wed Jul 26 16:11:08 2017 (r321527) @@ -21,7 +21,7 @@ /* * Copyright (c) 2005, 2010, Oracle and/or its affiliates. All rights reserved. * Portions Copyright (c) 2011 Martin Matuska - * Copyright (c) 2011, 2015 by Delphix. All rights reserved. + * Copyright (c) 2011, 2016 by Delphix. All rights reserved. * Copyright (c) 2014, Joyent, Inc. All rights reserved. * Copyright (c) 2014 RackTop Systems. * Copyright (c) 2014 Spectra Logic Corporation, All rights reserved. @@ -279,17 +279,31 @@ dsl_dataset_block_freeable(dsl_dataset_t *ds, const bl return (B_TRUE); } +/* + * We have to release the fsid syncronously or we risk that a subsequent + * mount of the same dataset will fail to unique_insert the fsid. This + * failure would manifest itself as the fsid of this dataset changing + * between mounts which makes NFS clients quite unhappy. + */ static void -dsl_dataset_evict(void *dbu) +dsl_dataset_evict_sync(void *dbu) { dsl_dataset_t *ds = dbu; ASSERT(ds->ds_owner == NULL); - ds->ds_dbuf = NULL; - unique_remove(ds->ds_fsid_guid); +} +static void +dsl_dataset_evict_async(void *dbu) +{ + dsl_dataset_t *ds = dbu; + + ASSERT(ds->ds_owner == NULL); + + ds->ds_dbuf = NULL; + if (ds->ds_objset != NULL) dmu_objset_evict(ds->ds_objset); @@ -528,7 +542,8 @@ dsl_dataset_hold_obj(dsl_pool_t *dp, uint64_t dsobj, v ds->ds_reserved = ds->ds_quota = 0; } - dmu_buf_init_user(&ds->ds_dbu, dsl_dataset_evict, &ds->ds_dbuf); + dmu_buf_init_user(&ds->ds_dbu, dsl_dataset_evict_sync, + dsl_dataset_evict_async, &ds->ds_dbuf); if (err == 0) winner = dmu_buf_set_user_ie(dbuf, &ds->ds_dbu); @@ -551,6 +566,16 @@ dsl_dataset_hold_obj(dsl_pool_t *dp, uint64_t dsobj, v } else { ds->ds_fsid_guid = unique_insert(dsl_dataset_phys(ds)->ds_fsid_guid); + if (ds->ds_fsid_guid != + dsl_dataset_phys(ds)->ds_fsid_guid) { + zfs_dbgmsg("ds_fsid_guid changed from " + "%llx to %llx for pool %s dataset id %llu", + (long long) + dsl_dataset_phys(ds)->ds_fsid_guid, + (long long)ds->ds_fsid_guid, + spa_name(dp->dp_spa), + dsobj); + } } } ASSERT3P(ds->ds_dbuf, ==, dbuf); Modified: stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/dsl_dir.c ============================================================================== --- stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/dsl_dir.c Wed Jul 26 15:52:51 2017 (r321526) +++ stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/dsl_dir.c Wed Jul 26 16:11:08 2017 (r321527) @@ -22,7 +22,7 @@ * Copyright (c) 2005, 2010, Oracle and/or its affiliates. All rights reserved. * Copyright (c) 2011 Pawel Jakub Dawidek . * All rights reserved. - * Copyright (c) 2012, 2014 by Delphix. All rights reserved. + * Copyright (c) 2012, 2016 by Delphix. All rights reserved. * Copyright (c) 2014 Joyent, Inc. All rights reserved. * Copyright (c) 2014 Spectra Logic Corporation, All rights reserved. * Copyright 2015 Nexenta Systems, Inc. All rights reserved. @@ -133,7 +133,7 @@ extern inline dsl_dir_phys_t *dsl_dir_phys(dsl_dir_t * static uint64_t dsl_dir_space_towrite(dsl_dir_t *dd); static void -dsl_dir_evict(void *dbu) +dsl_dir_evict_async(void *dbu) { dsl_dir_t *dd = dbu; dsl_pool_t *dp = dd->dd_pool; @@ -240,7 +240,8 @@ dsl_dir_hold_obj(dsl_pool_t *dp, uint64_t ddobj, dmu_buf_rele(origin_bonus, FTAG); } - dmu_buf_init_user(&dd->dd_dbu, dsl_dir_evict, &dd->dd_dbuf); + dmu_buf_init_user(&dd->dd_dbu, NULL, dsl_dir_evict_async, + &dd->dd_dbuf); winner = dmu_buf_set_user_ie(dbuf, &dd->dd_dbu); if (winner != NULL) { if (dd->dd_parent) Modified: stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/sa.c ============================================================================== --- stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/sa.c Wed Jul 26 15:52:51 2017 (r321526) +++ stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/sa.c Wed Jul 26 16:11:08 2017 (r321527) @@ -22,7 +22,7 @@ /* * Copyright (c) 2010, Oracle and/or its affiliates. All rights reserved. * Portions Copyright 2011 iXsystems, Inc - * Copyright (c) 2013 by Delphix. All rights reserved. + * Copyright (c) 2013, 2016 by Delphix. All rights reserved. * Copyright (c) 2014 Spectra Logic Corporation, All rights reserved. * Copyright (c) 2014 Integros [integros.com] */ @@ -1299,7 +1299,7 @@ sa_build_index(sa_handle_t *hdl, sa_buf_type_t buftype /*ARGSUSED*/ static void -sa_evict(void *dbu) +sa_evict_sync(void *dbu) { panic("evicting sa dbuf\n"); } @@ -1383,7 +1383,8 @@ sa_handle_get_from_db(objset_t *os, dmu_buf_t *db, voi sa_handle_t *winner = NULL; handle = kmem_cache_alloc(sa_cache, KM_SLEEP); - handle->sa_dbu.dbu_evict_func = NULL; + handle->sa_dbu.dbu_evict_func_sync = NULL; + handle->sa_dbu.dbu_evict_func_async = NULL; handle->sa_userp = userp; handle->sa_bonus = db; handle->sa_os = os; @@ -1394,7 +1395,8 @@ sa_handle_get_from_db(objset_t *os, dmu_buf_t *db, voi error = sa_build_index(handle, SA_BONUS); if (hdl_type == SA_HDL_SHARED) { - dmu_buf_init_user(&handle->sa_dbu, sa_evict, NULL); + dmu_buf_init_user(&handle->sa_dbu, sa_evict_sync, NULL, + NULL); winner = dmu_buf_set_user_ie(db, &handle->sa_dbu); } Modified: stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/sys/dmu.h ============================================================================== --- stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/sys/dmu.h Wed Jul 26 15:52:51 2017 (r321526) +++ stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/sys/dmu.h Wed Jul 26 16:11:08 2017 (r321527) @@ -539,8 +539,14 @@ typedef struct dmu_buf_user { */ taskq_ent_t dbu_tqent; - /* This instance's eviction function pointer. */ - dmu_buf_evict_func_t *dbu_evict_func; + /* + * This instance's eviction function pointers. + * + * dbu_evict_func_sync is called synchronously and then + * dbu_evict_func_async is executed asynchronously on a taskq. + */ + dmu_buf_evict_func_t *dbu_evict_func_sync; + dmu_buf_evict_func_t *dbu_evict_func_async; #ifdef ZFS_DEBUG /* * Pointer to user's dbuf pointer. NULL for clients that do @@ -564,15 +570,19 @@ typedef struct dmu_buf_user { /* Very ugly, but it beats issuing suppression directives in many Makefiles. */ extern void dmu_buf_init_user(dmu_buf_user_t *dbu, dmu_buf_evict_func_t *evict_func, - dmu_buf_t **clear_on_evict_dbufp); + dmu_buf_evict_func_t *evict_func_async, dmu_buf_t **clear_on_evict_dbufp); #else /* __lint */ inline void -dmu_buf_init_user(dmu_buf_user_t *dbu, dmu_buf_evict_func_t *evict_func, - dmu_buf_t **clear_on_evict_dbufp) +dmu_buf_init_user(dmu_buf_user_t *dbu, dmu_buf_evict_func_t *evict_func_sync, + dmu_buf_evict_func_t *evict_func_async, dmu_buf_t **clear_on_evict_dbufp) { - ASSERT(dbu->dbu_evict_func == NULL); - ASSERT(evict_func != NULL); - dbu->dbu_evict_func = evict_func; + ASSERT(dbu->dbu_evict_func_sync == NULL); + ASSERT(dbu->dbu_evict_func_async == NULL); + + /* must have at least one evict func */ + IMPLY(evict_func_sync == NULL, evict_func_async != NULL); + dbu->dbu_evict_func_sync = evict_func_sync; + dbu->dbu_evict_func_async = evict_func_async; #ifdef ZFS_DEBUG dbu->dbu_clear_on_evict_dbufp = clear_on_evict_dbufp; #endif Modified: stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/sys/zap_impl.h ============================================================================== --- stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/sys/zap_impl.h Wed Jul 26 15:52:51 2017 (r321526) +++ stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/sys/zap_impl.h Wed Jul 26 16:11:08 2017 (r321527) @@ -199,7 +199,7 @@ boolean_t zap_match(zap_name_t *zn, const char *matchn int zap_lockdir(objset_t *os, uint64_t obj, dmu_tx_t *tx, krw_t lti, boolean_t fatreader, boolean_t adding, void *tag, zap_t **zapp); void zap_unlockdir(zap_t *zap, void *tag); -void zap_evict(void *dbu); +void zap_evict_sync(void *dbu); zap_name_t *zap_name_alloc(zap_t *zap, const char *key, matchtype_t mt); void zap_name_free(zap_name_t *zn); int zap_hashbits(zap_t *zap); Modified: stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/zap.c ============================================================================== --- stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/zap.c Wed Jul 26 15:52:51 2017 (r321526) +++ stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/zap.c Wed Jul 26 16:11:08 2017 (r321527) @@ -81,7 +81,8 @@ fzap_upgrade(zap_t *zap, dmu_tx_t *tx, zap_flags_t fla ASSERT(RW_WRITE_HELD(&zap->zap_rwlock)); zap->zap_ismicro = FALSE; - zap->zap_dbu.dbu_evict_func = zap_evict; + zap->zap_dbu.dbu_evict_func_sync = zap_evict_sync; + zap->zap_dbu.dbu_evict_func_async = NULL; mutex_init(&zap->zap_f.zap_num_entries_mtx, 0, 0, 0); zap->zap_f.zap_block_shift = highbit64(zap->zap_dbuf->db_size) - 1; @@ -399,7 +400,7 @@ zap_allocate_blocks(zap_t *zap, int nblocks) } static void -zap_leaf_pageout(void *dbu) +zap_leaf_evict_sync(void *dbu) { zap_leaf_t *l = dbu; @@ -423,7 +424,7 @@ zap_create_leaf(zap_t *zap, dmu_tx_t *tx) VERIFY(0 == dmu_buf_hold(zap->zap_objset, zap->zap_object, l->l_blkid << FZAP_BLOCK_SHIFT(zap), NULL, &l->l_dbuf, DMU_READ_NO_PREFETCH)); - dmu_buf_init_user(&l->l_dbu, zap_leaf_pageout, &l->l_dbuf); + dmu_buf_init_user(&l->l_dbu, zap_leaf_evict_sync, NULL, &l->l_dbuf); winner = dmu_buf_set_user(l->l_dbuf, &l->l_dbu); ASSERT(winner == NULL); dmu_buf_will_dirty(l->l_dbuf, tx); @@ -470,13 +471,13 @@ zap_open_leaf(uint64_t blkid, dmu_buf_t *db) l->l_bs = highbit64(db->db_size) - 1; l->l_dbuf = db; - dmu_buf_init_user(&l->l_dbu, zap_leaf_pageout, &l->l_dbuf); + dmu_buf_init_user(&l->l_dbu, zap_leaf_evict_sync, NULL, &l->l_dbuf); winner = dmu_buf_set_user(db, &l->l_dbu); rw_exit(&l->l_rwlock); if (winner != NULL) { /* someone else set it first */ - zap_leaf_pageout(&l->l_dbu); + zap_leaf_evict_sync(&l->l_dbu); l = winner; } Modified: stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/zap_micro.c ============================================================================== --- stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/zap_micro.c Wed Jul 26 15:52:51 2017 (r321526) +++ stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/zap_micro.c Wed Jul 26 16:11:08 2017 (r321527) @@ -403,7 +403,7 @@ mzap_open(objset_t *os, uint64_t obj, dmu_buf_t *db) * it, because zap_lockdir() checks zap_ismicro without the lock * held. */ - dmu_buf_init_user(&zap->zap_dbu, zap_evict, &zap->zap_dbuf); + dmu_buf_init_user(&zap->zap_dbu, zap_evict_sync, NULL, &zap->zap_dbuf); winner = dmu_buf_set_user(db, &zap->zap_dbu); if (winner != NULL) @@ -747,7 +747,7 @@ zap_destroy(objset_t *os, uint64_t zapobj, dmu_tx_t *t } void -zap_evict(void *dbu) +zap_evict_sync(void *dbu) { zap_t *zap = dbu; From owner-svn-src-stable-11@freebsd.org Wed Jul 26 16:12:21 2017 Return-Path: Delivered-To: svn-src-stable-11@mailman.ysv.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by mailman.ysv.freebsd.org (Postfix) with ESMTP id B1129D9840D; Wed, 26 Jul 2017 16:12:21 +0000 (UTC) (envelope-from mav@FreeBSD.org) Received: from repo.freebsd.org (repo.freebsd.org [IPv6:2610:1c1:1:6068::e6a:0]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (Client did not present a certificate) by mx1.freebsd.org (Postfix) with ESMTPS id 8B91A68596; Wed, 26 Jul 2017 16:12:21 +0000 (UTC) (envelope-from mav@FreeBSD.org) Received: from repo.freebsd.org ([127.0.1.37]) by repo.freebsd.org (8.15.2/8.15.2) with ESMTP id v6QGCKPn063259; Wed, 26 Jul 2017 16:12:20 GMT (envelope-from mav@FreeBSD.org) Received: (from mav@localhost) by repo.freebsd.org (8.15.2/8.15.2/Submit) id v6QGCKgu063258; Wed, 26 Jul 2017 16:12:20 GMT (envelope-from mav@FreeBSD.org) Message-Id: <201707261612.v6QGCKgu063258@repo.freebsd.org> X-Authentication-Warning: repo.freebsd.org: mav set sender to mav@FreeBSD.org using -f From: Alexander Motin Date: Wed, 26 Jul 2017 16:12:20 +0000 (UTC) To: src-committers@freebsd.org, svn-src-all@freebsd.org, svn-src-stable@freebsd.org, svn-src-stable-11@freebsd.org Subject: svn commit: r321528 - stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs X-SVN-Group: stable-11 X-SVN-Commit-Author: mav X-SVN-Commit-Paths: stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs X-SVN-Commit-Revision: 321528 X-SVN-Commit-Repository: base MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit X-BeenThere: svn-src-stable-11@freebsd.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: SVN commit messages for only the 11-stable src tree List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 26 Jul 2017 16:12:21 -0000 Author: mav Date: Wed Jul 26 16:12:20 2017 New Revision: 321528 URL: https://svnweb.freebsd.org/changeset/base/321528 Log: MFC r314280: MFV 314276 7570 tunable to allow zvol SCSI unmap to return on commit of txn to ZIL illumos/illumos-gate@1c9272b861cd640a8342f4407da026ed98615517 https://github.com/illumos/illumos-gate/commit/1c9272b861cd640a8342f4407da026ed98615517 https://www.illumos.org/issues/7570 Based on the discovery that every unmap waits for the commit of the txn to the ZIL, introducing a very high latency to unmap commands, this behavior was made into a tunable zvol_unmap_sync_enabled and set to false. The net impact of this change is that by default SCSI unmap commands will result in space being freed within the zvol (today they are ignored and returned with good status). However, unlike the code today, instead of 18+ms per unmap, they take about 30us. With the testing done on NTFS against a Win2k12 target, the new behavior should work seamlessly. Files on the zvol that have already been set with the zfree application will continue to write 0's when deleted, and any new files created since zvol creation will send unmap commands when deleted. This behavior exists today, but with this change the unmap commands will be processed and result in reclaim of space. Author: Stephen Blinick Reviewed by: Dan Kimmel Reviewed by: Matt Ahrens Reviewed by: Steve Gonczi Reviewed by: Pavel Zakharov Reviewed by: Saso Kiselkov Reviewed by: Yuri Pankov Approved by: Robert Mustacchi Modified: stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/zvol.c Directory Properties: stable/11/ (props changed) Modified: stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/zvol.c ============================================================================== --- stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/zvol.c Wed Jul 26 16:11:08 2017 (r321527) +++ stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/zvol.c Wed Jul 26 16:12:20 2017 (r321528) @@ -27,7 +27,7 @@ * Portions Copyright 2010 Robert Milkowski * * Copyright 2011 Nexenta Systems, Inc. All rights reserved. - * Copyright (c) 2012, 2014 by Delphix. All rights reserved. + * Copyright (c) 2012, 2016 by Delphix. All rights reserved. * Copyright (c) 2013, Joyent, Inc. All rights reserved. * Copyright (c) 2014 Integros [integros.com] */ @@ -202,11 +202,22 @@ int zvol_maxphys = DMU_MAX_ACCESS/2; * Toggle unmap functionality. */ boolean_t zvol_unmap_enabled = B_TRUE; + +/* + * If true, unmaps requested as synchronous are executed synchronously, + * otherwise all unmaps are asynchronous. + */ +boolean_t zvol_unmap_sync_enabled = B_FALSE; + #ifndef illumos SYSCTL_INT(_vfs_zfs_vol, OID_AUTO, unmap_enabled, CTLFLAG_RWTUN, &zvol_unmap_enabled, 0, "Enable UNMAP functionality"); +SYSCTL_INT(_vfs_zfs_vol, OID_AUTO, unmap_sync_enabled, CTLFLAG_RWTUN, + &zvol_unmap_sync_enabled, 0, + "UNMAPs requested as sync are executed synchronously"); + static d_open_t zvol_d_open; static d_close_t zvol_d_close; static d_read_t zvol_read; @@ -2277,26 +2288,21 @@ zvol_ioctl(dev_t dev, int cmd, intptr_t arg, int flag, zfs_range_unlock(rl); - if (error == 0) { - /* - * If the write-cache is disabled or 'sync' property - * is set to 'always' then treat this as a synchronous - * operation (i.e. commit to zil). - */ - if (!(zv->zv_flags & ZVOL_WCE) || - (zv->zv_objset->os_sync == ZFS_SYNC_ALWAYS)) - zil_commit(zv->zv_zilog, ZVOL_OBJ); - - /* - * If the caller really wants synchronous writes, and - * can't wait for them, don't return until the write - * is done. - */ - if (df.df_flags & DF_WAIT_SYNC) { - txg_wait_synced( - dmu_objset_pool(zv->zv_objset), 0); - } + /* + * If the write-cache is disabled, 'sync' property + * is set to 'always', or if the caller is asking for + * a synchronous free, commit this operation to the zil. + * This will sync any previous uncommitted writes to the + * zvol object. + * Can be overridden by the zvol_unmap_sync_enabled tunable. + */ + if ((error == 0) && zvol_unmap_sync_enabled && + (!(zv->zv_flags & ZVOL_WCE) || + (zv->zv_objset->os_sync == ZFS_SYNC_ALWAYS) || + (df.df_flags & DF_WAIT_SYNC))) { + zil_commit(zv->zv_zilog, ZVOL_OBJ); } + return (error); } From owner-svn-src-stable-11@freebsd.org Wed Jul 26 16:14:07 2017 Return-Path: Delivered-To: svn-src-stable-11@mailman.ysv.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by mailman.ysv.freebsd.org (Postfix) with ESMTP id 34D9CD984E7; Wed, 26 Jul 2017 16:14:07 +0000 (UTC) (envelope-from mav@FreeBSD.org) Received: from repo.freebsd.org (repo.freebsd.org [IPv6:2610:1c1:1:6068::e6a:0]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (Client did not present a certificate) by mx1.freebsd.org (Postfix) with ESMTPS id C50C4686FF; Wed, 26 Jul 2017 16:14:06 +0000 (UTC) (envelope-from mav@FreeBSD.org) Received: from repo.freebsd.org ([127.0.1.37]) by repo.freebsd.org (8.15.2/8.15.2) with ESMTP id v6QGE6nQ063374; Wed, 26 Jul 2017 16:14:06 GMT (envelope-from mav@FreeBSD.org) Received: (from mav@localhost) by repo.freebsd.org (8.15.2/8.15.2/Submit) id v6QGE5r3063367; Wed, 26 Jul 2017 16:14:05 GMT (envelope-from mav@FreeBSD.org) Message-Id: <201707261614.v6QGE5r3063367@repo.freebsd.org> X-Authentication-Warning: repo.freebsd.org: mav set sender to mav@FreeBSD.org using -f From: Alexander Motin Date: Wed, 26 Jul 2017 16:14:05 +0000 (UTC) To: src-committers@freebsd.org, svn-src-all@freebsd.org, svn-src-stable@freebsd.org, svn-src-stable-11@freebsd.org Subject: svn commit: r321529 - in stable/11: cddl/contrib/opensolaris/cmd/zdb cddl/contrib/opensolaris/cmd/ztest cddl/contrib/opensolaris/lib/libzpool/common sys/cddl/compat/opensolaris/kern sys/cddl/compat... X-SVN-Group: stable-11 X-SVN-Commit-Author: mav X-SVN-Commit-Paths: in stable/11: cddl/contrib/opensolaris/cmd/zdb cddl/contrib/opensolaris/cmd/ztest cddl/contrib/opensolaris/lib/libzpool/common sys/cddl/compat/opensolaris/kern sys/cddl/compat/opensolaris/sys sys/cddl... X-SVN-Commit-Revision: 321529 X-SVN-Commit-Repository: base MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit X-BeenThere: svn-src-stable-11@freebsd.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: SVN commit messages for only the 11-stable src tree List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 26 Jul 2017 16:14:07 -0000 Author: mav Date: Wed Jul 26 16:14:05 2017 New Revision: 321529 URL: https://svnweb.freebsd.org/changeset/base/321529 Log: MFC r315896: MFV r315290, r315291: 7303 dynamic metaslab selection illumos/illumos-gate@8363e80ae72609660f6090766ca8c2c18aa53f0c https://github.com/illumos/illumos-gate/commit/8363e80ae72609660f6090766ca8c2c18 https://www.illumos.org/issues/7303 This change introduces a new weighting algorithm to improve metaslab selection . The new weighting algorithm relies on the SPACEMAP_HISTOGRAM feature. As a res ult, the metaslab weight now encodes the type of weighting algorithm used (size-based vs segment-based). This also introduce a new allocation tracing facility and two new dcmds to hel p debug allocation problems. Each zio now contains a zio_alloc_list_t structure that is populated as the zio goes through the allocations stage. Here's an example of how to use the tracing facility: > c5ec000::print zio_t io_alloc_list | ::walk list | ::metaslab_trace MSID DVA ASIZE WEIGHT RESULT VDEV - 0 400 0 NOT_ALLOCATABLE ztest.0a - 0 400 0 NOT_ALLOCATABLE ztest.0a - 0 400 0 ENOSPC ztest.0a - 0 200 0 NOT_ALLOCATABLE ztest.0a - 0 200 0 NOT_ALLOCATABLE ztest.0a - 0 200 0 ENOSPC ztest.0a 1 0 400 1 x 8M 17b1a00 ztest.0a > 1ff2400::print zio_t io_alloc_list | ::walk list | ::metaslab_trace MSID DVA ASIZE WEIGHT RESULT VDEV - 0 200 0 NOT_ALLOCATABLE mirror-2 - 0 200 0 NOT_ALLOCATABLE mirror-0 1 0 200 1 x 4M 112ae00 mirror-1 - 1 200 0 NOT_ALLOCATABLE mirror-2 - 1 200 0 NOT_ALLOCATABLE mirror-0 1 1 200 1 x 4M 112b000 mirror-1 - 2 200 0 NOT_ALLOCATABLE mirror-2 If the metaslab is using segment-based weighting then the WEIGHT column will display the number of segments available in the bucket where the allocation attempt was made. Author: George Wilson Reviewed by: Alex Reece Reviewed by: Chris Siden Reviewed by: Dan Kimmel Reviewed by: Matthew Ahrens Reviewed by: Paul Dagnelie Reviewed by: Pavel Zakharov Reviewed by: Prakash Surya Reviewed by: Don Brady Approved by: Richard Lowe Modified: stable/11/cddl/contrib/opensolaris/cmd/zdb/zdb.c stable/11/cddl/contrib/opensolaris/cmd/ztest/ztest.c stable/11/cddl/contrib/opensolaris/lib/libzpool/common/kernel.c stable/11/sys/cddl/compat/opensolaris/kern/opensolaris_kstat.c stable/11/sys/cddl/compat/opensolaris/sys/kstat.h stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/metaslab.c stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/spa.c stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/spa_misc.c stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/space_map.c stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/sys/metaslab.h stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/sys/metaslab_impl.h stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/sys/zfs_debug.h stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/sys/zio.h stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/zio.c Directory Properties: stable/11/ (props changed) Modified: stable/11/cddl/contrib/opensolaris/cmd/zdb/zdb.c ============================================================================== --- stable/11/cddl/contrib/opensolaris/cmd/zdb/zdb.c Wed Jul 26 16:12:20 2017 (r321528) +++ stable/11/cddl/contrib/opensolaris/cmd/zdb/zdb.c Wed Jul 26 16:14:05 2017 (r321529) @@ -2585,10 +2585,21 @@ zdb_leak_init(spa_t *spa, zdb_cb_t *zcb) if (!dump_opt['L']) { vdev_t *rvd = spa->spa_root_vdev; + + /* + * We are going to be changing the meaning of the metaslab's + * ms_tree. Ensure that the allocator doesn't try to + * use the tree. + */ + spa->spa_normal_class->mc_ops = &zdb_metaslab_ops; + spa->spa_log_class->mc_ops = &zdb_metaslab_ops; + for (uint64_t c = 0; c < rvd->vdev_children; c++) { vdev_t *vd = rvd->vdev_child[c]; + metaslab_group_t *mg = vd->vdev_mg; for (uint64_t m = 0; m < vd->vdev_ms_count; m++) { metaslab_t *msp = vd->vdev_ms[m]; + ASSERT3P(msp->ms_group, ==, mg); mutex_enter(&msp->ms_lock); metaslab_unload(msp); @@ -2609,8 +2620,6 @@ zdb_leak_init(spa_t *spa, zdb_cb_t *zcb) (longlong_t)m, (longlong_t)vd->vdev_ms_count); - msp->ms_ops = &zdb_metaslab_ops; - /* * We don't want to spend the CPU * manipulating the size-ordered @@ -2620,7 +2629,10 @@ zdb_leak_init(spa_t *spa, zdb_cb_t *zcb) msp->ms_tree->rt_ops = NULL; VERIFY0(space_map_load(msp->ms_sm, msp->ms_tree, SM_ALLOC)); - msp->ms_loaded = B_TRUE; + + if (!msp->ms_loaded) { + msp->ms_loaded = B_TRUE; + } } mutex_exit(&msp->ms_lock); } @@ -2642,8 +2654,10 @@ zdb_leak_fini(spa_t *spa) vdev_t *rvd = spa->spa_root_vdev; for (int c = 0; c < rvd->vdev_children; c++) { vdev_t *vd = rvd->vdev_child[c]; + metaslab_group_t *mg = vd->vdev_mg; for (int m = 0; m < vd->vdev_ms_count; m++) { metaslab_t *msp = vd->vdev_ms[m]; + ASSERT3P(mg, ==, msp->ms_group); mutex_enter(&msp->ms_lock); /* @@ -2657,7 +2671,10 @@ zdb_leak_fini(spa_t *spa) * from the ms_tree. */ range_tree_vacate(msp->ms_tree, zdb_leak, vd); - msp->ms_loaded = B_FALSE; + + if (msp->ms_loaded) { + msp->ms_loaded = B_FALSE; + } mutex_exit(&msp->ms_lock); } Modified: stable/11/cddl/contrib/opensolaris/cmd/ztest/ztest.c ============================================================================== --- stable/11/cddl/contrib/opensolaris/cmd/ztest/ztest.c Wed Jul 26 16:12:20 2017 (r321528) +++ stable/11/cddl/contrib/opensolaris/cmd/ztest/ztest.c Wed Jul 26 16:14:05 2017 (r321529) @@ -173,7 +173,7 @@ static const ztest_shared_opts_t ztest_opts_defaults = .zo_mirrors = 2, .zo_raidz = 4, .zo_raidz_parity = 1, - .zo_vdev_size = SPA_MINDEVSIZE * 2, + .zo_vdev_size = SPA_MINDEVSIZE * 4, /* 256m default size */ .zo_datasets = 7, .zo_threads = 23, .zo_passtime = 60, /* 60 seconds */ Modified: stable/11/cddl/contrib/opensolaris/lib/libzpool/common/kernel.c ============================================================================== --- stable/11/cddl/contrib/opensolaris/lib/libzpool/common/kernel.c Wed Jul 26 16:12:20 2017 (r321528) +++ stable/11/cddl/contrib/opensolaris/lib/libzpool/common/kernel.c Wed Jul 26 16:14:05 2017 (r321529) @@ -97,6 +97,11 @@ kstat_create(char *module, int instance, char *name, c /*ARGSUSED*/ void +kstat_named_init(kstat_named_t *knp, const char *name, uchar_t type) +{} + +/*ARGSUSED*/ +void kstat_install(kstat_t *ksp) {} Modified: stable/11/sys/cddl/compat/opensolaris/kern/opensolaris_kstat.c ============================================================================== --- stable/11/sys/cddl/compat/opensolaris/kern/opensolaris_kstat.c Wed Jul 26 16:12:20 2017 (r321528) +++ stable/11/sys/cddl/compat/opensolaris/kern/opensolaris_kstat.c Wed Jul 26 16:14:05 2017 (r321529) @@ -129,3 +129,19 @@ kstat_delete(kstat_t *ksp) sysctl_ctx_free(&ksp->ks_sysctl_ctx); free(ksp, M_KSTAT); } + +void +kstat_set_string(char *dst, const char *src) +{ + + bzero(dst, KSTAT_STRLEN); + (void) strncpy(dst, src, KSTAT_STRLEN - 1); +} + +void +kstat_named_init(kstat_named_t *knp, const char *name, uchar_t data_type) +{ + + kstat_set_string(knp->name, name); + knp->data_type = data_type; +} Modified: stable/11/sys/cddl/compat/opensolaris/sys/kstat.h ============================================================================== --- stable/11/sys/cddl/compat/opensolaris/sys/kstat.h Wed Jul 26 16:12:20 2017 (r321528) +++ stable/11/sys/cddl/compat/opensolaris/sys/kstat.h Wed Jul 26 16:14:05 2017 (r321529) @@ -69,5 +69,7 @@ kstat_t *kstat_create(char *module, int instance, char uchar_t type, ulong_t ndata, uchar_t flags); void kstat_install(kstat_t *ksp); void kstat_delete(kstat_t *ksp); +void kstat_set_string(char *, const char *); +void kstat_named_init(kstat_named_t *, const char *, uchar_t); #endif /* _OPENSOLARIS_SYS_KSTAT_H_ */ Modified: stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/metaslab.c ============================================================================== --- stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/metaslab.c Wed Jul 26 16:12:20 2017 (r321528) +++ stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/metaslab.c Wed Jul 26 16:14:05 2017 (r321529) @@ -41,11 +41,6 @@ SYSCTL_NODE(_vfs_zfs, OID_AUTO, metaslab, CTLFLAG_RW, #define GANG_ALLOCATION(flags) \ ((flags) & (METASLAB_GANG_CHILD | METASLAB_GANG_HEADER)) -#define METASLAB_WEIGHT_PRIMARY (1ULL << 63) -#define METASLAB_WEIGHT_SECONDARY (1ULL << 62) -#define METASLAB_ACTIVE_MASK \ - (METASLAB_WEIGHT_PRIMARY | METASLAB_WEIGHT_SECONDARY) - uint64_t metaslab_aliquot = 512ULL << 10; uint64_t metaslab_gang_bang = SPA_MAXBLOCKSIZE + 1; /* force gang blocks */ SYSCTL_QUAD(_vfs_zfs_metaslab, OID_AUTO, gang_bang, CTLFLAG_RWTUN, @@ -55,7 +50,7 @@ SYSCTL_QUAD(_vfs_zfs_metaslab, OID_AUTO, gang_bang, CT /* * The in-core space map representation is more compact than its on-disk form. * The zfs_condense_pct determines how much more compact the in-core - * space_map representation must be before we compact it on-disk. + * space map representation must be before we compact it on-disk. * Values should be greater than or equal to 100. */ int zfs_condense_pct = 200; @@ -153,7 +148,7 @@ SYSCTL_QUAD(_vfs_zfs_metaslab, OID_AUTO, df_alloc_thre /* * The minimum free space, in percent, which must be available * in a space map to continue allocations in a first-fit fashion. - * Once the space_map's free space drops below this level we dynamically + * Once the space map's free space drops below this level we dynamically * switch to using best-fit allocations. */ int metaslab_df_free_pct = 4; @@ -230,9 +225,40 @@ SYSCTL_INT(_vfs_zfs_metaslab, OID_AUTO, bias_enabled, &metaslab_bias_enabled, 0, "Enable metaslab group biasing"); -static uint64_t metaslab_fragmentation(metaslab_t *); +/* + * Enable/disable segment-based metaslab selection. + */ +boolean_t zfs_metaslab_segment_weight_enabled = B_TRUE; /* + * When using segment-based metaslab selection, we will continue + * allocating from the active metaslab until we have exhausted + * zfs_metaslab_switch_threshold of its buckets. + */ +int zfs_metaslab_switch_threshold = 2; + +/* + * Internal switch to enable/disable the metaslab allocation tracing + * facility. + */ +boolean_t metaslab_trace_enabled = B_TRUE; + +/* + * Maximum entries that the metaslab allocation tracing facility will keep + * in a given list when running in non-debug mode. We limit the number + * of entries in non-debug mode to prevent us from using up too much memory. + * The limit should be sufficiently large that we don't expect any allocation + * to every exceed this value. In debug mode, the system will panic if this + * limit is ever reached allowing for further investigation. + */ +uint64_t metaslab_trace_max_entries = 5000; + +static uint64_t metaslab_weight(metaslab_t *); +static void metaslab_set_fragmentation(metaslab_t *); + +kmem_cache_t *metaslab_alloc_trace_cache; + +/* * ========================================================================== * Metaslab classes * ========================================================================== @@ -475,11 +501,6 @@ metaslab_class_expandable_space(metaslab_class_t *mc) return (space); } -/* - * ========================================================================== - * Metaslab groups - * ========================================================================== - */ static int metaslab_compare(const void *x1, const void *x2) { @@ -505,6 +526,57 @@ metaslab_compare(const void *x1, const void *x2) } /* + * Verify that the space accounting on disk matches the in-core range_trees. + */ +void +metaslab_verify_space(metaslab_t *msp, uint64_t txg) +{ + spa_t *spa = msp->ms_group->mg_vd->vdev_spa; + uint64_t allocated = 0; + uint64_t freed = 0; + uint64_t sm_free_space, msp_free_space; + + ASSERT(MUTEX_HELD(&msp->ms_lock)); + + if ((zfs_flags & ZFS_DEBUG_METASLAB_VERIFY) == 0) + return; + + /* + * We can only verify the metaslab space when we're called + * from syncing context with a loaded metaslab that has an allocated + * space map. Calling this in non-syncing context does not + * provide a consistent view of the metaslab since we're performing + * allocations in the future. + */ + if (txg != spa_syncing_txg(spa) || msp->ms_sm == NULL || + !msp->ms_loaded) + return; + + sm_free_space = msp->ms_size - space_map_allocated(msp->ms_sm) - + space_map_alloc_delta(msp->ms_sm); + + /* + * Account for future allocations since we would have already + * deducted that space from the ms_freetree. + */ + for (int t = 0; t < TXG_CONCURRENT_STATES; t++) { + allocated += + range_tree_space(msp->ms_alloctree[(txg + t) & TXG_MASK]); + } + freed = range_tree_space(msp->ms_freetree[TXG_CLEAN(txg) & TXG_MASK]); + + msp_free_space = range_tree_space(msp->ms_tree) + allocated + + msp->ms_deferspace + freed; + + VERIFY3U(sm_free_space, ==, msp_free_space); +} + +/* + * ========================================================================== + * Metaslab groups + * ========================================================================== + */ +/* * Update the allocatable flag and the metaslab group's capacity. * The allocatable flag is set to true if the capacity is below * the zfs_mg_noalloc_threshold or has a fragmentation value that is @@ -1078,7 +1150,7 @@ static range_tree_ops_t metaslab_rt_ops = { /* * ========================================================================== - * Metaslab block operations + * Common allocator routines * ========================================================================== */ @@ -1097,33 +1169,24 @@ metaslab_block_maxsize(metaslab_t *msp) return (rs->rs_end - rs->rs_start); } -uint64_t -metaslab_block_alloc(metaslab_t *msp, uint64_t size) +static range_seg_t * +metaslab_block_find(avl_tree_t *t, uint64_t start, uint64_t size) { - uint64_t start; - range_tree_t *rt = msp->ms_tree; + range_seg_t *rs, rsearch; + avl_index_t where; - VERIFY(!msp->ms_condensing); + rsearch.rs_start = start; + rsearch.rs_end = start + size; - start = msp->ms_ops->msop_alloc(msp, size); - if (start != -1ULL) { - vdev_t *vd = msp->ms_group->mg_vd; - - VERIFY0(P2PHASE(start, 1ULL << vd->vdev_ashift)); - VERIFY0(P2PHASE(size, 1ULL << vd->vdev_ashift)); - VERIFY3U(range_tree_space(rt) - size, <=, msp->ms_size); - range_tree_remove(rt, start, size); + rs = avl_find(t, &rsearch, &where); + if (rs == NULL) { + rs = avl_nearest(t, where, AVL_AFTER); } - return (start); + + return (rs); } /* - * ========================================================================== - * Common allocator routines - * ========================================================================== - */ - -/* * This is a helper function that can be used by the allocator to find * a suitable block to allocate. This will search the specified AVL * tree looking for a block that matches the specified criteria. @@ -1132,16 +1195,8 @@ static uint64_t metaslab_block_picker(avl_tree_t *t, uint64_t *cursor, uint64_t size, uint64_t align) { - range_seg_t *rs, rsearch; - avl_index_t where; + range_seg_t *rs = metaslab_block_find(t, *cursor, size); - rsearch.rs_start = *cursor; - rsearch.rs_end = *cursor + size; - - rs = avl_find(t, &rsearch, &where); - if (rs == NULL) - rs = avl_nearest(t, where, AVL_AFTER); - while (rs != NULL) { uint64_t offset = P2ROUNDUP(rs->rs_start, align); @@ -1365,6 +1420,7 @@ int metaslab_load(metaslab_t *msp) { int error = 0; + boolean_t success = B_FALSE; ASSERT(MUTEX_HELD(&msp->ms_lock)); ASSERT(!msp->ms_loaded); @@ -1382,14 +1438,18 @@ metaslab_load(metaslab_t *msp) else range_tree_add(msp->ms_tree, msp->ms_start, msp->ms_size); - msp->ms_loaded = (error == 0); + success = (error == 0); msp->ms_loading = B_FALSE; - if (msp->ms_loaded) { + if (success) { + ASSERT3P(msp->ms_group, !=, NULL); + msp->ms_loaded = B_TRUE; + for (int t = 0; t < TXG_DEFER_SIZE; t++) { range_tree_walk(msp->ms_defertree[t], range_tree_remove, msp->ms_tree); } + msp->ms_max_size = metaslab_block_maxsize(msp); } cv_broadcast(&msp->ms_load_cv); return (error); @@ -1402,6 +1462,7 @@ metaslab_unload(metaslab_t *msp) range_tree_vacate(msp->ms_tree, NULL, NULL); msp->ms_loaded = B_FALSE; msp->ms_weight &= ~METASLAB_ACTIVE_MASK; + msp->ms_max_size = 0; } int @@ -1446,21 +1507,23 @@ metaslab_init(metaslab_group_t *mg, uint64_t id, uint6 ms->ms_tree = range_tree_create(&metaslab_rt_ops, ms, &ms->ms_lock); metaslab_group_add(mg, ms); - ms->ms_fragmentation = metaslab_fragmentation(ms); - ms->ms_ops = mg->mg_class->mc_ops; + metaslab_set_fragmentation(ms); /* * If we're opening an existing pool (txg == 0) or creating * a new one (txg == TXG_INITIAL), all space is available now. * If we're adding space to an existing pool, the new space * does not become available until after this txg has synced. + * The metaslab's weight will also be initialized when we sync + * out this txg. This ensures that we don't attempt to allocate + * from it before we have initialized it completely. */ if (txg <= TXG_INITIAL) metaslab_sync_done(ms, 0); /* * If metaslab_debug_load is set and we're initializing a metaslab - * that has an allocated space_map object then load the its space + * that has an allocated space map object then load the its space * map so that can verify frees. */ if (metaslab_debug_load && ms->ms_sm != NULL) { @@ -1487,7 +1550,6 @@ metaslab_fini(metaslab_t *msp) metaslab_group_remove(mg, msp); mutex_enter(&msp->ms_lock); - VERIFY(msp->ms_group == NULL); vdev_space_update(mg->mg_vd, -space_map_allocated(msp->ms_sm), 0, -msp->ms_size); @@ -1560,8 +1622,8 @@ int zfs_frag_table[FRAGMENTATION_TABLE_SIZE] = { * not support this metric. Otherwise, the return value should be in the * range [0, 100]. */ -static uint64_t -metaslab_fragmentation(metaslab_t *msp) +static void +metaslab_set_fragmentation(metaslab_t *msp) { spa_t *spa = msp->ms_group->mg_vd->vdev_spa; uint64_t fragmentation = 0; @@ -1569,18 +1631,22 @@ metaslab_fragmentation(metaslab_t *msp) boolean_t feature_enabled = spa_feature_is_enabled(spa, SPA_FEATURE_SPACEMAP_HISTOGRAM); - if (!feature_enabled) - return (ZFS_FRAG_INVALID); + if (!feature_enabled) { + msp->ms_fragmentation = ZFS_FRAG_INVALID; + return; + } /* * A null space map means that the entire metaslab is free * and thus is not fragmented. */ - if (msp->ms_sm == NULL) - return (0); + if (msp->ms_sm == NULL) { + msp->ms_fragmentation = 0; + return; + } /* - * If this metaslab's space_map has not been upgraded, flag it + * If this metaslab's space map has not been upgraded, flag it * so that we upgrade next time we encounter it. */ if (msp->ms_sm->sm_dbuf->db_size != sizeof (space_map_phys_t)) { @@ -1593,12 +1659,14 @@ metaslab_fragmentation(metaslab_t *msp) spa_dbgmsg(spa, "txg %llu, requesting force condense: " "msp %p, vd %p", txg, msp, vd); } - return (ZFS_FRAG_INVALID); + msp->ms_fragmentation = ZFS_FRAG_INVALID; + return; } for (int i = 0; i < SPACE_MAP_HISTOGRAM_SIZE; i++) { uint64_t space = 0; uint8_t shift = msp->ms_sm->sm_shift; + int idx = MIN(shift - SPA_MINBLOCKSHIFT + i, FRAGMENTATION_TABLE_SIZE - 1); @@ -1615,7 +1683,8 @@ metaslab_fragmentation(metaslab_t *msp) if (total > 0) fragmentation /= total; ASSERT3U(fragmentation, <=, 100); - return (fragmentation); + + msp->ms_fragmentation = fragmentation; } /* @@ -1624,30 +1693,20 @@ metaslab_fragmentation(metaslab_t *msp) * the LBA range, and whether the metaslab is loaded. */ static uint64_t -metaslab_weight(metaslab_t *msp) +metaslab_space_weight(metaslab_t *msp) { metaslab_group_t *mg = msp->ms_group; vdev_t *vd = mg->mg_vd; uint64_t weight, space; ASSERT(MUTEX_HELD(&msp->ms_lock)); + ASSERT(!vd->vdev_removing); /* - * This vdev is in the process of being removed so there is nothing - * for us to do here. - */ - if (vd->vdev_removing) { - ASSERT0(space_map_allocated(msp->ms_sm)); - ASSERT0(vd->vdev_ms_shift); - return (0); - } - - /* * The baseline weight is the metaslab's free space. */ space = msp->ms_size - space_map_allocated(msp->ms_sm); - msp->ms_fragmentation = metaslab_fragmentation(msp); if (metaslab_fragmentation_factor_enabled && msp->ms_fragmentation != ZFS_FRAG_INVALID) { /* @@ -1696,9 +1755,213 @@ metaslab_weight(metaslab_t *msp) weight |= (msp->ms_weight & METASLAB_ACTIVE_MASK); } + WEIGHT_SET_SPACEBASED(weight); return (weight); } +/* + * Return the weight of the specified metaslab, according to the segment-based + * weighting algorithm. The metaslab must be loaded. This function can + * be called within a sync pass since it relies only on the metaslab's + * range tree which is always accurate when the metaslab is loaded. + */ +static uint64_t +metaslab_weight_from_range_tree(metaslab_t *msp) +{ + uint64_t weight = 0; + uint32_t segments = 0; + + ASSERT(msp->ms_loaded); + + for (int i = RANGE_TREE_HISTOGRAM_SIZE - 1; i >= SPA_MINBLOCKSHIFT; + i--) { + uint8_t shift = msp->ms_group->mg_vd->vdev_ashift; + int max_idx = SPACE_MAP_HISTOGRAM_SIZE + shift - 1; + + segments <<= 1; + segments += msp->ms_tree->rt_histogram[i]; + + /* + * The range tree provides more precision than the space map + * and must be downgraded so that all values fit within the + * space map's histogram. This allows us to compare loaded + * vs. unloaded metaslabs to determine which metaslab is + * considered "best". + */ + if (i > max_idx) + continue; + + if (segments != 0) { + WEIGHT_SET_COUNT(weight, segments); + WEIGHT_SET_INDEX(weight, i); + WEIGHT_SET_ACTIVE(weight, 0); + break; + } + } + return (weight); +} + +/* + * Calculate the weight based on the on-disk histogram. This should only + * be called after a sync pass has completely finished since the on-disk + * information is updated in metaslab_sync(). + */ +static uint64_t +metaslab_weight_from_spacemap(metaslab_t *msp) +{ + uint64_t weight = 0; + + for (int i = SPACE_MAP_HISTOGRAM_SIZE - 1; i >= 0; i--) { + if (msp->ms_sm->sm_phys->smp_histogram[i] != 0) { + WEIGHT_SET_COUNT(weight, + msp->ms_sm->sm_phys->smp_histogram[i]); + WEIGHT_SET_INDEX(weight, i + + msp->ms_sm->sm_shift); + WEIGHT_SET_ACTIVE(weight, 0); + break; + } + } + return (weight); +} + +/* + * Compute a segment-based weight for the specified metaslab. The weight + * is determined by highest bucket in the histogram. The information + * for the highest bucket is encoded into the weight value. + */ +static uint64_t +metaslab_segment_weight(metaslab_t *msp) +{ + metaslab_group_t *mg = msp->ms_group; + uint64_t weight = 0; + uint8_t shift = mg->mg_vd->vdev_ashift; + + ASSERT(MUTEX_HELD(&msp->ms_lock)); + + /* + * The metaslab is completely free. + */ + if (space_map_allocated(msp->ms_sm) == 0) { + int idx = highbit64(msp->ms_size) - 1; + int max_idx = SPACE_MAP_HISTOGRAM_SIZE + shift - 1; + + if (idx < max_idx) { + WEIGHT_SET_COUNT(weight, 1ULL); + WEIGHT_SET_INDEX(weight, idx); + } else { + WEIGHT_SET_COUNT(weight, 1ULL << (idx - max_idx)); + WEIGHT_SET_INDEX(weight, max_idx); + } + WEIGHT_SET_ACTIVE(weight, 0); + ASSERT(!WEIGHT_IS_SPACEBASED(weight)); + + return (weight); + } + + ASSERT3U(msp->ms_sm->sm_dbuf->db_size, ==, sizeof (space_map_phys_t)); + + /* + * If the metaslab is fully allocated then just make the weight 0. + */ + if (space_map_allocated(msp->ms_sm) == msp->ms_size) + return (0); + /* + * If the metaslab is already loaded, then use the range tree to + * determine the weight. Otherwise, we rely on the space map information + * to generate the weight. + */ + if (msp->ms_loaded) { + weight = metaslab_weight_from_range_tree(msp); + } else { + weight = metaslab_weight_from_spacemap(msp); + } + + /* + * If the metaslab was active the last time we calculated its weight + * then keep it active. We want to consume the entire region that + * is associated with this weight. + */ + if (msp->ms_activation_weight != 0 && weight != 0) + WEIGHT_SET_ACTIVE(weight, WEIGHT_GET_ACTIVE(msp->ms_weight)); + return (weight); +} + +/* + * Determine if we should attempt to allocate from this metaslab. If the + * metaslab has a maximum size then we can quickly determine if the desired + * allocation size can be satisfied. Otherwise, if we're using segment-based + * weighting then we can determine the maximum allocation that this metaslab + * can accommodate based on the index encoded in the weight. If we're using + * space-based weights then rely on the entire weight (excluding the weight + * type bit). + */ +boolean_t +metaslab_should_allocate(metaslab_t *msp, uint64_t asize) +{ + boolean_t should_allocate; + + if (msp->ms_max_size != 0) + return (msp->ms_max_size >= asize); + + if (!WEIGHT_IS_SPACEBASED(msp->ms_weight)) { + /* + * The metaslab segment weight indicates segments in the + * range [2^i, 2^(i+1)), where i is the index in the weight. + * Since the asize might be in the middle of the range, we + * should attempt the allocation if asize < 2^(i+1). + */ + should_allocate = (asize < + 1ULL << (WEIGHT_GET_INDEX(msp->ms_weight) + 1)); + } else { + should_allocate = (asize <= + (msp->ms_weight & ~METASLAB_WEIGHT_TYPE)); + } + return (should_allocate); +} + +static uint64_t +metaslab_weight(metaslab_t *msp) +{ + vdev_t *vd = msp->ms_group->mg_vd; + spa_t *spa = vd->vdev_spa; + uint64_t weight; + + ASSERT(MUTEX_HELD(&msp->ms_lock)); + + /* + * This vdev is in the process of being removed so there is nothing + * for us to do here. + */ + if (vd->vdev_removing) { + ASSERT0(space_map_allocated(msp->ms_sm)); + ASSERT0(vd->vdev_ms_shift); + return (0); + } + + metaslab_set_fragmentation(msp); + + /* + * Update the maximum size if the metaslab is loaded. This will + * ensure that we get an accurate maximum size if newly freed space + * has been added back into the free tree. + */ + if (msp->ms_loaded) + msp->ms_max_size = metaslab_block_maxsize(msp); + + /* + * Segment-based weighting requires space map histogram support. + */ + if (zfs_metaslab_segment_weight_enabled && + spa_feature_is_enabled(spa, SPA_FEATURE_SPACEMAP_HISTOGRAM) && + (msp->ms_sm == NULL || msp->ms_sm->sm_dbuf->db_size == + sizeof (space_map_phys_t))) { + weight = metaslab_segment_weight(msp); + } else { + weight = metaslab_space_weight(msp); + } + return (weight); +} + static int metaslab_activate(metaslab_t *msp, uint64_t activation_weight) { @@ -1714,6 +1977,7 @@ metaslab_activate(metaslab_t *msp, uint64_t activation } } + msp->ms_activation_weight = msp->ms_weight; metaslab_group_sort(msp->ms_group, msp, msp->ms_weight | activation_weight); } @@ -1724,18 +1988,56 @@ metaslab_activate(metaslab_t *msp, uint64_t activation } static void -metaslab_passivate(metaslab_t *msp, uint64_t size) +metaslab_passivate(metaslab_t *msp, uint64_t weight) { + uint64_t size = weight & ~METASLAB_WEIGHT_TYPE; + /* * If size < SPA_MINBLOCKSIZE, then we will not allocate from * this metaslab again. In that case, it had better be empty, * or we would be leaving space on the table. */ - ASSERT(size >= SPA_MINBLOCKSIZE || range_tree_space(msp->ms_tree) == 0); - metaslab_group_sort(msp->ms_group, msp, MIN(msp->ms_weight, size)); + ASSERT(size >= SPA_MINBLOCKSIZE || + range_tree_space(msp->ms_tree) == 0); + ASSERT0(weight & METASLAB_ACTIVE_MASK); + + msp->ms_activation_weight = 0; + metaslab_group_sort(msp->ms_group, msp, weight); ASSERT((msp->ms_weight & METASLAB_ACTIVE_MASK) == 0); } +/* + * Segment-based metaslabs are activated once and remain active until + * we either fail an allocation attempt (similar to space-based metaslabs) + * or have exhausted the free space in zfs_metaslab_switch_threshold + * buckets since the metaslab was activated. This function checks to see + * if we've exhaused the zfs_metaslab_switch_threshold buckets in the + * metaslab and passivates it proactively. This will allow us to select a + * metaslabs with larger contiguous region if any remaining within this + * metaslab group. If we're in sync pass > 1, then we continue using this + * metaslab so that we don't dirty more block and cause more sync passes. + */ +void +metaslab_segment_may_passivate(metaslab_t *msp) +{ + spa_t *spa = msp->ms_group->mg_vd->vdev_spa; + + if (WEIGHT_IS_SPACEBASED(msp->ms_weight) || spa_sync_pass(spa) > 1) + return; + + /* + * Since we are in the middle of a sync pass, the most accurate + * information that is accessible to us is the in-core range tree + * histogram; calculate the new weight based on that information. + */ + uint64_t weight = metaslab_weight_from_range_tree(msp); + int activation_idx = WEIGHT_GET_INDEX(msp->ms_activation_weight); + int current_idx = WEIGHT_GET_INDEX(weight); + + if (current_idx <= activation_idx - zfs_metaslab_switch_threshold) + metaslab_passivate(msp, weight); +} + static void metaslab_preload(void *arg) { @@ -1748,11 +2050,7 @@ metaslab_preload(void *arg) metaslab_load_wait(msp); if (!msp->ms_loaded) (void) metaslab_load(msp); - - /* - * Set the ms_access_txg value so that we don't unload it right away. - */ - msp->ms_access_txg = spa_syncing_txg(spa) + metaslab_unload_delay + 1; + msp->ms_selected_txg = spa_syncing_txg(spa); mutex_exit(&msp->ms_lock); } @@ -1773,10 +2071,7 @@ metaslab_group_preload(metaslab_group_t *mg) /* * Load the next potential metaslabs */ - msp = avl_first(t); - while (msp != NULL) { - metaslab_t *msp_next = AVL_NEXT(t, msp); - + for (msp = avl_first(t); msp != NULL; msp = AVL_NEXT(t, msp)) { /* * We preload only the maximum number of metaslabs specified * by metaslab_preload_limit. If a metaslab is being forced @@ -1784,27 +2079,11 @@ metaslab_group_preload(metaslab_group_t *mg) * that force condensing happens in the next txg. */ if (++m > metaslab_preload_limit && !msp->ms_condense_wanted) { - msp = msp_next; continue; } - /* - * We must drop the metaslab group lock here to preserve - * lock ordering with the ms_lock (when grabbing both - * the mg_lock and the ms_lock, the ms_lock must be taken - * first). As a result, it is possible that the ordering - * of the metaslabs within the avl tree may change before - * we reacquire the lock. The metaslab cannot be removed from - * the tree while we're in syncing context so it is safe to - * drop the mg_lock here. If the metaslabs are reordered - * nothing will break -- we just may end up loading a - * less than optimal one. - */ - mutex_exit(&mg->mg_lock); VERIFY(taskq_dispatch(mg->mg_taskq, metaslab_preload, msp, TQ_SLEEP) != 0); - mutex_enter(&mg->mg_lock); - msp = msp_next; } mutex_exit(&mg->mg_lock); } @@ -1953,7 +2232,7 @@ metaslab_condense(metaslab_t *msp, uint64_t txg, dmu_t mutex_enter(&msp->ms_lock); /* - * While we would ideally like to create a space_map representation + * While we would ideally like to create a space map representation * that consists only of allocation records, doing so can be * prohibitively expensive because the in-core free tree can be * large, and therefore computationally expensive to subtract @@ -2016,7 +2295,7 @@ metaslab_sync(metaslab_t *msp, uint64_t txg) * metaslab_sync() is the metaslab's ms_tree. No other thread can * be modifying this txg's alloctree, freetree, freed_tree, or * space_map_phys_t. Therefore, we only hold ms_lock to satify - * space_map ASSERTs. We drop it whenever we call into the DMU, + * space map ASSERTs. We drop it whenever we call into the DMU, * because the DMU can call down to us (e.g. via zio_free()) at * any time. */ @@ -2038,7 +2317,7 @@ metaslab_sync(metaslab_t *msp, uint64_t txg) mutex_enter(&msp->ms_lock); /* - * Note: metaslab_condense() clears the space_map's histogram. + * Note: metaslab_condense() clears the space map's histogram. * Therefore we must verify and remove this histogram before * condensing. */ @@ -2063,16 +2342,38 @@ metaslab_sync(metaslab_t *msp, uint64_t txg) */ space_map_histogram_clear(msp->ms_sm); space_map_histogram_add(msp->ms_sm, msp->ms_tree, tx); - } else { + /* - * Since the space map is not loaded we simply update the - * exisiting histogram with what was freed in this txg. This - * means that the on-disk histogram may not have an accurate - * view of the free space but it's close enough to allow - * us to make allocation decisions. + * Since we've cleared the histogram we need to add back + * any free space that has already been processed, plus + * any deferred space. This allows the on-disk histogram + * to accurately reflect all free space even if some space + * is not yet available for allocation (i.e. deferred). */ - space_map_histogram_add(msp->ms_sm, *freetree, tx); + space_map_histogram_add(msp->ms_sm, *freed_tree, tx); + + /* + * Add back any deferred free space that has not been + * added back into the in-core free tree yet. This will + * ensure that we don't end up with a space map histogram + * that is completely empty unless the metaslab is fully + * allocated. + */ + for (int t = 0; t < TXG_DEFER_SIZE; t++) { + space_map_histogram_add(msp->ms_sm, + msp->ms_defertree[t], tx); + } } + + /* + * Always add the free space from this sync pass to the space + * map histogram. We want to make sure that the on-disk histogram + * accounts for all free space. If the space map is not loaded, + * then we will lose some accuracy but will correct it the next + * time we load the space map. + */ + space_map_histogram_add(msp->ms_sm, *freetree, tx); + metaslab_group_histogram_add(mg, msp); metaslab_group_histogram_verify(mg); metaslab_class_histogram_verify(mg->mg_class); @@ -2091,6 +2392,7 @@ metaslab_sync(metaslab_t *msp, uint64_t txg) range_tree_vacate(alloctree, NULL, NULL); ASSERT0(range_tree_space(msp->ms_alloctree[txg & TXG_MASK])); + ASSERT0(range_tree_space(msp->ms_alloctree[TXG_CLEAN(txg) & TXG_MASK])); ASSERT0(range_tree_space(msp->ms_freetree[txg & TXG_MASK])); mutex_exit(&msp->ms_lock); @@ -2112,9 +2414,11 @@ metaslab_sync_done(metaslab_t *msp, uint64_t txg) { metaslab_group_t *mg = msp->ms_group; vdev_t *vd = mg->mg_vd; + spa_t *spa = vd->vdev_spa; range_tree_t **freed_tree; range_tree_t **defer_tree; int64_t alloc_delta, defer_delta; + boolean_t defer_allowed = B_TRUE; ASSERT(!vd->vdev_ishole); @@ -2149,9 +2453,20 @@ metaslab_sync_done(metaslab_t *msp, uint64_t txg) freed_tree = &msp->ms_freetree[TXG_CLEAN(txg) & TXG_MASK]; defer_tree = &msp->ms_defertree[txg % TXG_DEFER_SIZE]; + uint64_t free_space = metaslab_class_get_space(spa_normal_class(spa)) - + metaslab_class_get_alloc(spa_normal_class(spa)); + if (free_space <= spa_get_slop_space(spa)) { + defer_allowed = B_FALSE; + } + + defer_delta = 0; alloc_delta = space_map_alloc_delta(msp->ms_sm); - defer_delta = range_tree_space(*freed_tree) - - range_tree_space(*defer_tree); + if (defer_allowed) { + defer_delta = range_tree_space(*freed_tree) - + range_tree_space(*defer_tree); + } else { + defer_delta -= range_tree_space(*defer_tree); + } vdev_space_update(vd, alloc_delta + defer_delta, defer_delta, 0); @@ -2172,7 +2487,12 @@ metaslab_sync_done(metaslab_t *msp, uint64_t txg) */ range_tree_vacate(*defer_tree, msp->ms_loaded ? range_tree_add : NULL, msp->ms_tree); - range_tree_swap(freed_tree, defer_tree); + if (defer_allowed) { + range_tree_swap(freed_tree, defer_tree); + } else { + range_tree_vacate(*freed_tree, + msp->ms_loaded ? range_tree_add : NULL, msp->ms_tree); + } space_map_update(msp->ms_sm); @@ -2187,7 +2507,18 @@ metaslab_sync_done(metaslab_t *msp, uint64_t txg) vdev_dirty(vd, VDD_METASLAB, msp, txg + 1); } - if (msp->ms_loaded && msp->ms_access_txg < txg) { + /* + * Calculate the new weights before unloading any metaslabs. + * This will give us the most accurate weighting. + */ + metaslab_group_sort(mg, msp, metaslab_weight(msp)); + + /* + * If the metaslab is loaded and we've not tried to load or allocate + * from it in 'metaslab_unload_delay' txgs, then unload it. + */ + if (msp->ms_loaded && + msp->ms_selected_txg + metaslab_unload_delay < txg) { for (int t = 1; t < TXG_CONCURRENT_STATES; t++) { VERIFY0(range_tree_space( msp->ms_alloctree[(txg + t) & TXG_MASK])); @@ -2197,7 +2528,6 @@ metaslab_sync_done(metaslab_t *msp, uint64_t txg) metaslab_unload(msp); } *** DIFF OUTPUT TRUNCATED AT 1000 LINES *** From owner-svn-src-stable-11@freebsd.org Wed Jul 26 16:14:59 2017 Return-Path: Delivered-To: svn-src-stable-11@mailman.ysv.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by mailman.ysv.freebsd.org (Postfix) with ESMTP id 33955D98597; Wed, 26 Jul 2017 16:14:59 +0000 (UTC) (envelope-from mav@FreeBSD.org) Received: from repo.freebsd.org (repo.freebsd.org [IPv6:2610:1c1:1:6068::e6a:0]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (Client did not present a certificate) by mx1.freebsd.org (Postfix) with ESMTPS id 0142A68861; Wed, 26 Jul 2017 16:14:58 +0000 (UTC) (envelope-from mav@FreeBSD.org) Received: from repo.freebsd.org ([127.0.1.37]) by repo.freebsd.org (8.15.2/8.15.2) with ESMTP id v6QGEwiu063455; Wed, 26 Jul 2017 16:14:58 GMT (envelope-from mav@FreeBSD.org) Received: (from mav@localhost) by repo.freebsd.org (8.15.2/8.15.2/Submit) id v6QGEwEu063453; Wed, 26 Jul 2017 16:14:58 GMT (envelope-from mav@FreeBSD.org) Message-Id: <201707261614.v6QGEwEu063453@repo.freebsd.org> X-Authentication-Warning: repo.freebsd.org: mav set sender to mav@FreeBSD.org using -f From: Alexander Motin Date: Wed, 26 Jul 2017 16:14:58 +0000 (UTC) To: src-committers@freebsd.org, svn-src-all@freebsd.org, svn-src-stable@freebsd.org, svn-src-stable-11@freebsd.org Subject: svn commit: r321530 - in stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs: . sys X-SVN-Group: stable-11 X-SVN-Commit-Author: mav X-SVN-Commit-Paths: in stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs: . sys X-SVN-Commit-Revision: 321530 X-SVN-Commit-Repository: base MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit X-BeenThere: svn-src-stable-11@freebsd.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: SVN commit messages for only the 11-stable src tree List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 26 Jul 2017 16:14:59 -0000 Author: mav Date: Wed Jul 26 16:14:57 2017 New Revision: 321530 URL: https://svnweb.freebsd.org/changeset/base/321530 Log: MFC r316037: MFV: 315989 7603 xuio_stat_wbuf_* should be declared (void) illumos/illumos-gate@99aa8b55058e512798eafbd71f72f916bdc10181 https://github.com/illumos/illumos-gate/commit/99aa8b55058e512798eafbd71f72f916bdc10181 https://www.illumos.org/issues/7603 The funcs are declared k&r style, where the args are not specified: void xuio_stat_wbuf_copied(); They should be declared to take no arguments: void xuio_stat_wbuf_copied(void); Need to change both .c and .h. Author: Prashanth Sreenivasa Reviewed by: Matthew Ahrens Reviewed by: Paul Dagnelie Reviewed by: Robert Mustacchi Approved by: Richard Lowe Modified: stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/dmu.c stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/sys/dmu.h Directory Properties: stable/11/ (props changed) Modified: stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/dmu.c ============================================================================== --- stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/dmu.c Wed Jul 26 16:14:05 2017 (r321529) +++ stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/dmu.c Wed Jul 26 16:14:57 2017 (r321530) @@ -1124,13 +1124,13 @@ xuio_stat_fini(void) } void -xuio_stat_wbuf_copied() +xuio_stat_wbuf_copied(void) { XUIOSTAT_BUMP(xuiostat_wbuf_copied); } void -xuio_stat_wbuf_nocopy() +xuio_stat_wbuf_nocopy(void) { XUIOSTAT_BUMP(xuiostat_wbuf_nocopy); } Modified: stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/sys/dmu.h ============================================================================== --- stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/sys/dmu.h Wed Jul 26 16:14:05 2017 (r321529) +++ stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/sys/dmu.h Wed Jul 26 16:14:57 2017 (r321530) @@ -762,8 +762,8 @@ int dmu_xuio_add(struct xuio *uio, struct arc_buf *abu int dmu_xuio_cnt(struct xuio *uio); struct arc_buf *dmu_xuio_arcbuf(struct xuio *uio, int i); void dmu_xuio_clear(struct xuio *uio, int i); -void xuio_stat_wbuf_copied(); -void xuio_stat_wbuf_nocopy(); +void xuio_stat_wbuf_copied(void); +void xuio_stat_wbuf_nocopy(void); extern boolean_t zfs_prefetch_disable; extern int zfs_max_recordsize; From owner-svn-src-stable-11@freebsd.org Wed Jul 26 16:16:44 2017 Return-Path: Delivered-To: svn-src-stable-11@mailman.ysv.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by mailman.ysv.freebsd.org (Postfix) with ESMTP id E2D14D98664; Wed, 26 Jul 2017 16:16:44 +0000 (UTC) (envelope-from mav@FreeBSD.org) Received: from repo.freebsd.org (repo.freebsd.org [IPv6:2610:1c1:1:6068::e6a:0]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (Client did not present a certificate) by mx1.freebsd.org (Postfix) with ESMTPS id BCB7C689B3; Wed, 26 Jul 2017 16:16:44 +0000 (UTC) (envelope-from mav@FreeBSD.org) Received: from repo.freebsd.org ([127.0.1.37]) by repo.freebsd.org (8.15.2/8.15.2) with ESMTP id v6QGGhPr063581; Wed, 26 Jul 2017 16:16:43 GMT (envelope-from mav@FreeBSD.org) Received: (from mav@localhost) by repo.freebsd.org (8.15.2/8.15.2/Submit) id v6QGGhhr063577; Wed, 26 Jul 2017 16:16:43 GMT (envelope-from mav@FreeBSD.org) Message-Id: <201707261616.v6QGGhhr063577@repo.freebsd.org> X-Authentication-Warning: repo.freebsd.org: mav set sender to mav@FreeBSD.org using -f From: Alexander Motin Date: Wed, 26 Jul 2017 16:16:43 +0000 (UTC) To: src-committers@freebsd.org, svn-src-all@freebsd.org, svn-src-stable@freebsd.org, svn-src-stable-11@freebsd.org Subject: svn commit: r321531 - in stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs: . sys X-SVN-Group: stable-11 X-SVN-Commit-Author: mav X-SVN-Commit-Paths: in stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs: . sys X-SVN-Commit-Revision: 321531 X-SVN-Commit-Repository: base MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit X-BeenThere: svn-src-stable-11@freebsd.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: SVN commit messages for only the 11-stable src tree List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 26 Jul 2017 16:16:45 -0000 Author: mav Date: Wed Jul 26 16:16:43 2017 New Revision: 321531 URL: https://svnweb.freebsd.org/changeset/base/321531 Log: MFC r317235: MFV 316868 7430 Backfill metadnode more intelligently illumos/illumos-gate@af346df58864e8fe897b1ff1a3a4c12f9294391b https://github.com/illumos/illumos-gate/commit/af346df58864e8fe897b1ff1a3a4c12f9 294391b https://www.illumos.org/issues/7430 Description and patch from brought over from the following ZoL commit: https:/ / github.com/zfsonlinux/zfs/commit/68cbd56e182ab949f58d004778d463aeb3f595c6 Only attempt to backfill lower metadnode object numbers if at least 4096 objects have been freed since the last rescan, and at most once per transaction group. This avoids a pathology in dmu_object_alloc() that caused O(N^2) behavior for create-heavy workloads and substantially improves object creation rates. As summarized by @mahrens in #4636: "Normally, the object allocator simply checks to see if the next object is available. The slow calls happened when dmu_object_alloc() checks to see if it can backfill lower object numbers. This happens every time we move on to a new L1 indirect block (i.e. every 32 * 128 = 4096 objects). When re-checking lower object numbers, we use the on-disk fill count (blkptr_t:blk_fill) to quickly skip over indirect blocks that don?t have enough free dnodes (defined as an L2 with at least 393,216 of 524,288 dnodes free). Therefore, we may find that a block of dnodes has a low (or zero) fill count, and yet we can?t allocate any of its dnodes, because they've been allocated in memory but not yet written to disk. In this case we have to hold each of the dnodes and then notice that it has been allocated in memory. The end result is that allocating N objects in the same TXG can require CPU usage proportional to N^2." Add a tunable dmu_rescan_dnode_threshold to define the number of objects that must be freed before a rescan is performed. Don't bother to export this as a module option because testing doesn't show a compelling reason to change it. The vast majority of the performance gain comes from limit the rescan to at most once per TXG. Reviewed by: Alek Pinchuk Reviewed by: Brian Behlendorf Reviewed by: Matthew Ahrens Approved by: Gordon Ross Author: Ned Bass Modified: stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/dmu_object.c stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/dmu_objset.c stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/dnode_sync.c stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/sys/dmu_objset.h Directory Properties: stable/11/ (props changed) Modified: stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/dmu_object.c ============================================================================== --- stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/dmu_object.c Wed Jul 26 16:14:57 2017 (r321530) +++ stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/dmu_object.c Wed Jul 26 16:16:43 2017 (r321531) @@ -36,20 +36,22 @@ dmu_object_alloc(objset_t *os, dmu_object_type_t ot, i dmu_object_type_t bonustype, int bonuslen, dmu_tx_t *tx) { uint64_t object; - uint64_t L2_dnode_count = DNODES_PER_BLOCK << + uint64_t L1_dnode_count = DNODES_PER_BLOCK << (DMU_META_DNODE(os)->dn_indblkshift - SPA_BLKPTRSHIFT); dnode_t *dn = NULL; - int restarted = B_FALSE; mutex_enter(&os->os_obj_lock); for (;;) { object = os->os_obj_next; /* - * Each time we polish off an L2 bp worth of dnodes - * (2^13 objects), move to another L2 bp that's still - * reasonably sparse (at most 1/4 full). Look from the - * beginning once, but after that keep looking from here. - * If we can't find one, just keep going from here. + * Each time we polish off a L1 bp worth of dnodes (2^12 + * objects), move to another L1 bp that's still reasonably + * sparse (at most 1/4 full). Look from the beginning at most + * once per txg, but after that keep looking from here. + * os_scan_dnodes is set during txg sync if enough objects + * have been freed since the previous rescan to justify + * backfilling again. If we can't find a suitable block, just + * keep going from here. * * Note that dmu_traverse depends on the behavior that we use * multiple blocks of the dnode object before going back to @@ -57,12 +59,19 @@ dmu_object_alloc(objset_t *os, dmu_object_type_t ot, i * that property or find another solution to the issues * described in traverse_visitbp. */ - if (P2PHASE(object, L2_dnode_count) == 0) { - uint64_t offset = restarted ? object << DNODE_SHIFT : 0; - int error = dnode_next_offset(DMU_META_DNODE(os), + + if (P2PHASE(object, L1_dnode_count) == 0) { + uint64_t offset; + int error; + if (os->os_rescan_dnodes) { + offset = 0; + os->os_rescan_dnodes = B_FALSE; + } else { + offset = object << DNODE_SHIFT; + } + error = dnode_next_offset(DMU_META_DNODE(os), DNODE_FIND_HOLE, &offset, 2, DNODES_PER_BLOCK >> 2, 0); - restarted = B_TRUE; if (error == 0) object = offset >> DNODE_SHIFT; } Modified: stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/dmu_objset.c ============================================================================== --- stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/dmu_objset.c Wed Jul 26 16:14:57 2017 (r321530) +++ stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/dmu_objset.c Wed Jul 26 16:16:43 2017 (r321531) @@ -67,6 +67,13 @@ krwlock_t os_lock; */ int dmu_find_threads = 0; +/* + * Backfill lower metadnode objects after this many have been freed. + * Backfilling negatively impacts object creation rates, so only do it + * if there are enough holes to fill. + */ +int dmu_rescan_dnode_threshold = 131072; + static void dmu_objset_find_dp_cb(void *arg); void @@ -1176,6 +1183,13 @@ dmu_objset_sync(objset_t *os, zio_t *pio, dmu_tx_t *tx if (dr->dr_zio) zio_nowait(dr->dr_zio); } + + /* Enable dnode backfill if enough objects have been freed. */ + if (os->os_freed_dnodes >= dmu_rescan_dnode_threshold) { + os->os_rescan_dnodes = B_TRUE; + os->os_freed_dnodes = 0; + } + /* * Free intent log blocks up to this tx. */ Modified: stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/dnode_sync.c ============================================================================== --- stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/dnode_sync.c Wed Jul 26 16:14:57 2017 (r321530) +++ stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/dnode_sync.c Wed Jul 26 16:16:43 2017 (r321531) @@ -672,6 +672,7 @@ dnode_sync(dnode_t *dn, dmu_tx_t *tx) } if (freeing_dnode) { + dn->dn_objset->os_freed_dnodes++; dnode_sync_free(dn, tx); return; } Modified: stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/sys/dmu_objset.h ============================================================================== --- stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/sys/dmu_objset.h Wed Jul 26 16:14:57 2017 (r321530) +++ stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/sys/dmu_objset.h Wed Jul 26 16:16:43 2017 (r321531) @@ -112,6 +112,8 @@ struct objset { zil_header_t os_zil_header; list_t os_synced_dnodes; uint64_t os_flags; + uint64_t os_freed_dnodes; + boolean_t os_rescan_dnodes; /* Protected by os_obj_lock */ kmutex_t os_obj_lock; From owner-svn-src-stable-11@freebsd.org Wed Jul 26 16:18:20 2017 Return-Path: Delivered-To: svn-src-stable-11@mailman.ysv.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by mailman.ysv.freebsd.org (Postfix) with ESMTP id 72FDFD986F0; Wed, 26 Jul 2017 16:18:20 +0000 (UTC) (envelope-from mav@FreeBSD.org) Received: from repo.freebsd.org (repo.freebsd.org [IPv6:2610:1c1:1:6068::e6a:0]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (Client did not present a certificate) by mx1.freebsd.org (Postfix) with ESMTPS id 4BFD368AE4; Wed, 26 Jul 2017 16:18:20 +0000 (UTC) (envelope-from mav@FreeBSD.org) Received: from repo.freebsd.org ([127.0.1.37]) by repo.freebsd.org (8.15.2/8.15.2) with ESMTP id v6QGIJDU063688; Wed, 26 Jul 2017 16:18:19 GMT (envelope-from mav@FreeBSD.org) Received: (from mav@localhost) by repo.freebsd.org (8.15.2/8.15.2/Submit) id v6QGIJV8063686; Wed, 26 Jul 2017 16:18:19 GMT (envelope-from mav@FreeBSD.org) Message-Id: <201707261618.v6QGIJV8063686@repo.freebsd.org> X-Authentication-Warning: repo.freebsd.org: mav set sender to mav@FreeBSD.org using -f From: Alexander Motin Date: Wed, 26 Jul 2017 16:18:19 +0000 (UTC) To: src-committers@freebsd.org, svn-src-all@freebsd.org, svn-src-stable@freebsd.org, svn-src-stable-11@freebsd.org Subject: svn commit: r321532 - stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs X-SVN-Group: stable-11 X-SVN-Commit-Author: mav X-SVN-Commit-Paths: stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs X-SVN-Commit-Revision: 321532 X-SVN-Commit-Repository: base MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit X-BeenThere: svn-src-stable-11@freebsd.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: SVN commit messages for only the 11-stable src tree List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 26 Jul 2017 16:18:20 -0000 Author: mav Date: Wed Jul 26 16:18:19 2017 New Revision: 321532 URL: https://svnweb.freebsd.org/changeset/base/321532 Log: MFC r317237: MFV 316870 7448 ZFS doesn't notice when disk vdevs have no write cache illumos/illumos-gate@295438ba3230419314faaa889a2616f561658bd5 https://github.com/illumos/illumos-gate/commit/295438ba3230419314faaa889a2616f56 1658bd5 https://www.illumos.org/issues/7448 I built a SmartOS image with all the NVMe commits including 7372 (support NVMe volatile write cache) and repeated my dd testing: > #!/bin/bash > for i in `seq 1 1000`; do > dd if=/dev/zero of=file00 bs=1M count=102400 oflag=sync & > dd if=/dev/zero of=file01 bs=1M count=102400 oflag=sync & > wait > rm file00 file01 > done > Previously each dd command took ~145 seconds to finish, now it takes ~400 seconds. Eventually I figured out it is 7372 that causes unnecessary nvme_bd_sync() executions which wasted CPU cycles. If a NVMe device doesn't support a write cache, the nvme_bd_sync function will return ENOTSUP to indicate this to upper layers. It seems this returned value is ignored by ZFS, and as such this bug is not really specific to NVMe. In vdev_disk_io_start() ZFS sends the flush to the disk driver (blkdev) with a callback to vdev_disk_ioctl_done(). As nvme filled in the bd_sync_cache function pointer, blkdev will not return ENOTSUP, as the nvme driver in general does support cache flush. Instead it will issue an asynchronous flush to nvme and immediately return 0, and hence ZFS will not se t vdev_nowritecache here. The nvme driver will at some point process the cache flush command, and if there is no write cache on the device it will return ENOTSUP, which will be delivered to the vdev_disk_ioctl_done() callback. This function will not check the error code and not set nowritecache. The right place to check the error code from the cache flush is in zio_vdev_io_assess(). This would catch both cases, synchronous and asynchronous cache flushes. This would also be independent of the implementation detail that some drivers can return ENOTSUP immediately. Reviewed by: Dan Fields Reviewed by: Alek Pinchuk Reviewed by: George Wilson Approved by: Dan McDonald Author: Hans Rosenfeld Modified: stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/vdev_disk.c stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/zio.c Directory Properties: stable/11/ (props changed) Modified: stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/vdev_disk.c ============================================================================== --- stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/vdev_disk.c Wed Jul 26 16:16:43 2017 (r321531) +++ stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/vdev_disk.c Wed Jul 26 16:18:19 2017 (r321532) @@ -21,7 +21,7 @@ /* * Copyright (c) 2005, 2010, Oracle and/or its affiliates. All rights reserved. * Copyright (c) 2012, 2015 by Delphix. All rights reserved. - * Copyright 2013 Nexenta Systems, Inc. All rights reserved. + * Copyright 2016 Nexenta Systems, Inc. All rights reserved. * Copyright (c) 2013 Joyent, Inc. All rights reserved. */ @@ -743,16 +743,6 @@ vdev_disk_io_start(zio_t *zio) return; } - if (error == ENOTSUP || error == ENOTTY) { - /* - * If we get ENOTSUP or ENOTTY, we know that - * no future attempts will ever succeed. - * In this case we set a persistent bit so - * that we don't bother with the ioctl in the - * future. - */ - vd->vdev_nowritecache = B_TRUE; - } zio->io_error = error; break; Modified: stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/zio.c ============================================================================== --- stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/zio.c Wed Jul 26 16:16:43 2017 (r321531) +++ stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/zio.c Wed Jul 26 16:18:19 2017 (r321532) @@ -3302,6 +3302,16 @@ zio_vdev_io_assess(zio_t *zio) vd->vdev_cant_write = B_TRUE; } + /* + * If a cache flush returns ENOTSUP or ENOTTY, we know that no future + * attempts will ever succeed. In this case we set a persistent bit so + * that we don't bother with it in the future. + */ + if ((zio->io_error == ENOTSUP || zio->io_error == ENOTTY) && + zio->io_type == ZIO_TYPE_IOCTL && + zio->io_cmd == DKIOCFLUSHWRITECACHE && vd != NULL) + vd->vdev_nowritecache = B_TRUE; + if (zio->io_error) zio->io_pipeline = ZIO_INTERLOCK_PIPELINE; From owner-svn-src-stable-11@freebsd.org Wed Jul 26 16:19:05 2017 Return-Path: Delivered-To: svn-src-stable-11@mailman.ysv.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by mailman.ysv.freebsd.org (Postfix) with ESMTP id 64AC8D987CD; Wed, 26 Jul 2017 16:19:05 +0000 (UTC) (envelope-from mav@FreeBSD.org) Received: from repo.freebsd.org (repo.freebsd.org [IPv6:2610:1c1:1:6068::e6a:0]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (Client did not present a certificate) by mx1.freebsd.org (Postfix) with ESMTPS id 3198968C55; Wed, 26 Jul 2017 16:19:05 +0000 (UTC) (envelope-from mav@FreeBSD.org) Received: from repo.freebsd.org ([127.0.1.37]) by repo.freebsd.org (8.15.2/8.15.2) with ESMTP id v6QGJ4l3063775; Wed, 26 Jul 2017 16:19:04 GMT (envelope-from mav@FreeBSD.org) Received: (from mav@localhost) by repo.freebsd.org (8.15.2/8.15.2/Submit) id v6QGJ4T7063774; Wed, 26 Jul 2017 16:19:04 GMT (envelope-from mav@FreeBSD.org) Message-Id: <201707261619.v6QGJ4T7063774@repo.freebsd.org> X-Authentication-Warning: repo.freebsd.org: mav set sender to mav@FreeBSD.org using -f From: Alexander Motin Date: Wed, 26 Jul 2017 16:19:04 +0000 (UTC) To: src-committers@freebsd.org, svn-src-all@freebsd.org, svn-src-stable@freebsd.org, svn-src-stable-11@freebsd.org Subject: svn commit: r321533 - stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs X-SVN-Group: stable-11 X-SVN-Commit-Author: mav X-SVN-Commit-Paths: stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs X-SVN-Commit-Revision: 321533 X-SVN-Commit-Repository: base MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit X-BeenThere: svn-src-stable-11@freebsd.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: SVN commit messages for only the 11-stable src tree List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 26 Jul 2017 16:19:05 -0000 Author: mav Date: Wed Jul 26 16:19:04 2017 New Revision: 321533 URL: https://svnweb.freebsd.org/changeset/base/321533 Log: MFC r317238: MFV 316871 7490 real checksum errors are silenced when zinject is on illumos/illumos-gate@6cedfc397d92d64e442f0aae4445ac507beaf58f https://github.com/illumos/illumos-gate/commit/6cedfc397d92d64e442f0aae4445ac507beaf58f https://www.illumos.org/issues/7490 When zinject is on, error codes from zfs_checksum_error() can be overwritten due to an incorrect and overly-complex if condition. Reviewed by: George Wilson Reviewed by: Paul Dagnelie Reviewed by: Dan Kimmel Reviewed by: Matthew Ahrens Approved by: Robert Mustacchi Author: Pavel Zakharov Modified: stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/zio_checksum.c Directory Properties: stable/11/ (props changed) Modified: stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/zio_checksum.c ============================================================================== --- stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/zio_checksum.c Wed Jul 26 16:18:19 2017 (r321532) +++ stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/zio_checksum.c Wed Jul 26 16:19:04 2017 (r321533) @@ -20,7 +20,7 @@ */ /* * Copyright (c) 2005, 2010, Oracle and/or its affiliates. All rights reserved. - * Copyright (c) 2013, 2015 by Delphix. All rights reserved. + * Copyright (c) 2013, 2016 by Delphix. All rights reserved. * Copyright (c) 2013, Joyent, Inc. All rights reserved. * Copyright 2013 Saso Kiselkov. All rights reserved. */ @@ -392,12 +392,13 @@ zio_checksum_error(zio_t *zio, zio_bad_cksum_t *info) error = zio_checksum_error_impl(spa, bp, checksum, data, size, offset, info); - if (error != 0 && zio_injection_enabled && !zio->io_error && - (error = zio_handle_fault_injection(zio, ECKSUM)) != 0) { - info->zbc_injected = 1; - return (error); + if (zio_injection_enabled && error == 0 && zio->io_error == 0) { + error = zio_handle_fault_injection(zio, ECKSUM); + if (error != 0) + info->zbc_injected = 1; } + return (error); } From owner-svn-src-stable-11@freebsd.org Wed Jul 26 16:20:21 2017 Return-Path: Delivered-To: svn-src-stable-11@mailman.ysv.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by mailman.ysv.freebsd.org (Postfix) with ESMTP id 67E45D98898; Wed, 26 Jul 2017 16:20:21 +0000 (UTC) (envelope-from mav@FreeBSD.org) Received: from repo.freebsd.org (repo.freebsd.org [IPv6:2610:1c1:1:6068::e6a:0]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (Client did not present a certificate) by mx1.freebsd.org (Postfix) with ESMTPS id 323B668DAC; Wed, 26 Jul 2017 16:20:21 +0000 (UTC) (envelope-from mav@FreeBSD.org) Received: from repo.freebsd.org ([127.0.1.37]) by repo.freebsd.org (8.15.2/8.15.2) with ESMTP id v6QGKK1j063897; Wed, 26 Jul 2017 16:20:20 GMT (envelope-from mav@FreeBSD.org) Received: (from mav@localhost) by repo.freebsd.org (8.15.2/8.15.2/Submit) id v6QGKJtJ063890; Wed, 26 Jul 2017 16:20:19 GMT (envelope-from mav@FreeBSD.org) Message-Id: <201707261620.v6QGKJtJ063890@repo.freebsd.org> X-Authentication-Warning: repo.freebsd.org: mav set sender to mav@FreeBSD.org using -f From: Alexander Motin Date: Wed, 26 Jul 2017 16:20:19 +0000 (UTC) To: src-committers@freebsd.org, svn-src-all@freebsd.org, svn-src-stable@freebsd.org, svn-src-stable-11@freebsd.org Subject: svn commit: r321534 - in stable/11: cddl/contrib/opensolaris/cmd/zfs cddl/contrib/opensolaris/lib/libzfs/common sys/cddl/contrib/opensolaris/common/zfs X-SVN-Group: stable-11 X-SVN-Commit-Author: mav X-SVN-Commit-Paths: in stable/11: cddl/contrib/opensolaris/cmd/zfs cddl/contrib/opensolaris/lib/libzfs/common sys/cddl/contrib/opensolaris/common/zfs X-SVN-Commit-Revision: 321534 X-SVN-Commit-Repository: base MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit X-BeenThere: svn-src-stable-11@freebsd.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: SVN commit messages for only the 11-stable src tree List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 26 Jul 2017 16:20:21 -0000 Author: mav Date: Wed Jul 26 16:20:19 2017 New Revision: 321534 URL: https://svnweb.freebsd.org/changeset/base/321534 Log: MFC r317267: MFV 316891 7386 zfs get does not work properly with bookmarks illumos/illumos-gate@edb901aab9c738b5eb15aa55933e82b0f2f9d9a2 https://github.com/illumos/illumos-gate/commit/edb901aab9c738b5eb15aa55933e82b0f2f9d9a2 https://www.illumos.org/issues/7386 The zfs get command does not work with the bookmark parameter while it works properly with both filesystem and snapshot: # zfs get -t all -r creation rpool/test NAME PROPERTY VALUE SOURCE rpool/test creation Fri Sep 16 15:00 2016 - rpool/test@snap creation Fri Sep 16 15:00 2016 - rpool/test#bkmark creation Fri Sep 16 15:00 2016 - # zfs get -t all -r creation rpool/test@snap NAME PROPERTY VALUE SOURCE rpool/test@snap creation Fri Sep 16 15:00 2016 - # zfs get -t all -r creation rpool/test#bkmark cannot open 'rpool/test#bkmark': invalid dataset name # The zfs get command should be modified to work properly with bookmarks too. Reviewed by: Simon Klinkert Reviewed by: Paul Dagnelie Approved by: Matthew Ahrens Author: Marcel Telka Modified: stable/11/cddl/contrib/opensolaris/cmd/zfs/zfs.8 stable/11/cddl/contrib/opensolaris/cmd/zfs/zfs_main.c stable/11/cddl/contrib/opensolaris/lib/libzfs/common/libzfs_dataset.c stable/11/cddl/contrib/opensolaris/lib/libzfs/common/libzfs_pool.c stable/11/cddl/contrib/opensolaris/lib/libzfs/common/libzfs_util.c stable/11/sys/cddl/contrib/opensolaris/common/zfs/zfs_namecheck.c stable/11/sys/cddl/contrib/opensolaris/common/zfs/zfs_namecheck.h Directory Properties: stable/11/ (props changed) Modified: stable/11/cddl/contrib/opensolaris/cmd/zfs/zfs.8 ============================================================================== --- stable/11/cddl/contrib/opensolaris/cmd/zfs/zfs.8 Wed Jul 26 16:19:04 2017 (r321533) +++ stable/11/cddl/contrib/opensolaris/cmd/zfs/zfs.8 Wed Jul 26 16:20:19 2017 (r321534) @@ -25,13 +25,13 @@ .\" Copyright (c) 2013 by Saso Kiselkov. All rights reserved. .\" Copyright (c) 2014, Joyent, Inc. All rights reserved. .\" Copyright (c) 2013, Steven Hartland -.\" Copyright (c) 2014 Nexenta Systems, Inc. All Rights Reserved. +.\" Copyright (c) 2016 Nexenta Systems, Inc. All Rights Reserved. .\" Copyright (c) 2014, Xin LI .\" Copyright (c) 2014-2015, The FreeBSD Foundation, All Rights Reserved. .\" .\" $FreeBSD$ .\" -.Dd May 31, 2016 +.Dd September 16, 2016 .Dt ZFS 8 .Os .Sh NAME @@ -114,7 +114,7 @@ .Op Fl t Ar type Ns Oo , Ns type Ns Oc Ns ... .Oo Fl s Ar property Oc Ns ... .Oo Fl S Ar property Oc Ns ... -.Ar filesystem Ns | Ns Ar volume Ns | Ns Ar snapshot +.Ar filesystem Ns | Ns Ar volume Ns | Ns Ar snapshot | Ns Ar bookmark Ns ... .Nm .Cm set .Ar property Ns = Ns Ar value Oo Ar property Ns = Ns Ar value Oc Ns ... @@ -2156,7 +2156,7 @@ section. .Op Fl t Ar type Ns Oo , Ns Ar type Oc Ns ... .Op Fl s Ar source Ns Oo , Ns Ar source Oc Ns ... .Ar all | property Ns Oo , Ns Ar property Oc Ns ... -.Ar filesystem Ns | Ns Ar volume Ns | Ns Ar snapshot Ns ... +.Ar filesystem Ns | Ns Ar volume Ns | Ns Ar snapshot Ns | Ns Ar bookmark Ns ... .Xc .Pp Displays properties for the given datasets. If no datasets are specified, then Modified: stable/11/cddl/contrib/opensolaris/cmd/zfs/zfs_main.c ============================================================================== --- stable/11/cddl/contrib/opensolaris/cmd/zfs/zfs_main.c Wed Jul 26 16:19:04 2017 (r321533) +++ stable/11/cddl/contrib/opensolaris/cmd/zfs/zfs_main.c Wed Jul 26 16:20:19 2017 (r321534) @@ -243,7 +243,7 @@ get_usage(zfs_help_t idx) "[-o \"all\" | field[,...]]\n" "\t [-t type[,...]] [-s source[,...]]\n" "\t <\"all\" | property[,...]> " - "[filesystem|volume|snapshot] ...\n")); + "[filesystem|volume|snapshot|bookmark] ...\n")); case HELP_INHERIT: return (gettext("\tinherit [-rS] " " ...\n")); @@ -1622,7 +1622,7 @@ zfs_do_get(int argc, char **argv) { zprop_get_cbdata_t cb = { 0 }; int i, c, flags = ZFS_ITER_ARGS_CAN_BE_PATHS; - int types = ZFS_TYPE_DATASET; + int types = ZFS_TYPE_DATASET | ZFS_TYPE_BOOKMARK; char *value, *fields; int ret = 0; int limit = 0; Modified: stable/11/cddl/contrib/opensolaris/lib/libzfs/common/libzfs_dataset.c ============================================================================== --- stable/11/cddl/contrib/opensolaris/lib/libzfs/common/libzfs_dataset.c Wed Jul 26 16:19:04 2017 (r321533) +++ stable/11/cddl/contrib/opensolaris/lib/libzfs/common/libzfs_dataset.c Wed Jul 26 16:20:19 2017 (r321534) @@ -103,7 +103,7 @@ zfs_validate_name(libzfs_handle_t *hdl, const char *pa char what; (void) zfs_prop_get_table(); - if (dataset_namecheck(path, &why, &what) != 0) { + if (entity_namecheck(path, &why, &what) != 0) { if (hdl != NULL) { switch (why) { case NAME_ERR_TOOLONG: @@ -132,9 +132,10 @@ zfs_validate_name(libzfs_handle_t *hdl, const char *pa "'%c' in name"), what); break; - case NAME_ERR_MULTIPLE_AT: + case NAME_ERR_MULTIPLE_DELIMITERS: zfs_error_aux(hdl, dgettext(TEXT_DOMAIN, - "multiple '@' delimiters in name")); + "multiple '@' and/or '#' delimiters in " + "name")); break; case NAME_ERR_NOLETTER: @@ -165,7 +166,7 @@ zfs_validate_name(libzfs_handle_t *hdl, const char *pa if (!(type & ZFS_TYPE_SNAPSHOT) && strchr(path, '@') != NULL) { if (hdl != NULL) zfs_error_aux(hdl, dgettext(TEXT_DOMAIN, - "snapshot delimiter '@' in filesystem name")); + "snapshot delimiter '@' is not expected here")); return (0); } @@ -176,6 +177,20 @@ zfs_validate_name(libzfs_handle_t *hdl, const char *pa return (0); } + if (!(type & ZFS_TYPE_BOOKMARK) && strchr(path, '#') != NULL) { + if (hdl != NULL) + zfs_error_aux(hdl, dgettext(TEXT_DOMAIN, + "bookmark delimiter '#' is not expected here")); + return (0); + } + + if (type == ZFS_TYPE_BOOKMARK && strchr(path, '#') == NULL) { + if (hdl != NULL) + zfs_error_aux(hdl, dgettext(TEXT_DOMAIN, + "missing '#' delimiter in bookmark name")); + return (0); + } + if (modifying && strchr(path, '%') != NULL) { if (hdl != NULL) zfs_error_aux(hdl, dgettext(TEXT_DOMAIN, @@ -613,8 +628,36 @@ make_bookmark_handle(zfs_handle_t *parent, const char return (zhp); } +struct zfs_open_bookmarks_cb_data { + const char *path; + zfs_handle_t *zhp; +}; + +static int +zfs_open_bookmarks_cb(zfs_handle_t *zhp, void *data) +{ + struct zfs_open_bookmarks_cb_data *dp = data; + + /* + * Is it the one we are looking for? + */ + if (strcmp(dp->path, zfs_get_name(zhp)) == 0) { + /* + * We found it. Save it and let the caller know we are done. + */ + dp->zhp = zhp; + return (EEXIST); + } + + /* + * Not found. Close the handle and ask for another one. + */ + zfs_close(zhp); + return (0); +} + /* - * Opens the given snapshot, filesystem, or volume. The 'types' + * Opens the given snapshot, bookmark, filesystem, or volume. The 'types' * argument is a mask of acceptable types. The function will print an * appropriate error message and return NULL if it can't be opened. */ @@ -623,6 +666,7 @@ zfs_open(libzfs_handle_t *hdl, const char *path, int t { zfs_handle_t *zhp; char errbuf[1024]; + char *bookp; (void) snprintf(errbuf, sizeof (errbuf), dgettext(TEXT_DOMAIN, "cannot open '%s'"), path); @@ -630,20 +674,68 @@ zfs_open(libzfs_handle_t *hdl, const char *path, int t /* * Validate the name before we even try to open it. */ - if (!zfs_validate_name(hdl, path, ZFS_TYPE_DATASET, B_FALSE)) { - zfs_error_aux(hdl, dgettext(TEXT_DOMAIN, - "invalid dataset name")); + if (!zfs_validate_name(hdl, path, types, B_FALSE)) { (void) zfs_error(hdl, EZFS_INVALIDNAME, errbuf); return (NULL); } /* - * Try to get stats for the dataset, which will tell us if it exists. + * Bookmarks needs to be handled separately. */ - errno = 0; - if ((zhp = make_dataset_handle(hdl, path)) == NULL) { - (void) zfs_standard_error(hdl, errno, errbuf); - return (NULL); + bookp = strchr(path, '#'); + if (bookp == NULL) { + /* + * Try to get stats for the dataset, which will tell us if it + * exists. + */ + errno = 0; + if ((zhp = make_dataset_handle(hdl, path)) == NULL) { + (void) zfs_standard_error(hdl, errno, errbuf); + return (NULL); + } + } else { + char dsname[ZFS_MAX_DATASET_NAME_LEN]; + zfs_handle_t *pzhp; + struct zfs_open_bookmarks_cb_data cb_data = {path, NULL}; + + /* + * We need to cut out '#' and everything after '#' + * to get the parent dataset name only. + */ + assert(bookp - path < sizeof (dsname)); + (void) strncpy(dsname, path, bookp - path); + dsname[bookp - path] = '\0'; + + /* + * Create handle for the parent dataset. + */ + errno = 0; + if ((pzhp = make_dataset_handle(hdl, dsname)) == NULL) { + (void) zfs_standard_error(hdl, errno, errbuf); + return (NULL); + } + + /* + * Iterate bookmarks to find the right one. + */ + errno = 0; + if ((zfs_iter_bookmarks(pzhp, zfs_open_bookmarks_cb, + &cb_data) == 0) && (cb_data.zhp == NULL)) { + (void) zfs_error(hdl, EZFS_NOENT, errbuf); + zfs_close(pzhp); + return (NULL); + } + if (cb_data.zhp == NULL) { + (void) zfs_standard_error(hdl, errno, errbuf); + zfs_close(pzhp); + return (NULL); + } + zhp = cb_data.zhp; + + /* + * Cleanup. + */ + zfs_close(pzhp); } if (zhp == NULL) { Modified: stable/11/cddl/contrib/opensolaris/lib/libzfs/common/libzfs_pool.c ============================================================================== --- stable/11/cddl/contrib/opensolaris/lib/libzfs/common/libzfs_pool.c Wed Jul 26 16:19:04 2017 (r321533) +++ stable/11/cddl/contrib/opensolaris/lib/libzfs/common/libzfs_pool.c Wed Jul 26 16:20:19 2017 (r321534) @@ -947,9 +947,10 @@ zpool_name_valid(libzfs_handle_t *hdl, boolean_t isope "trailing slash in name")); break; - case NAME_ERR_MULTIPLE_AT: + case NAME_ERR_MULTIPLE_DELIMITERS: zfs_error_aux(hdl, dgettext(TEXT_DOMAIN, - "multiple '@' delimiters in name")); + "multiple '@' and/or '#' delimiters in " + "name")); break; default: Modified: stable/11/cddl/contrib/opensolaris/lib/libzfs/common/libzfs_util.c ============================================================================== --- stable/11/cddl/contrib/opensolaris/lib/libzfs/common/libzfs_util.c Wed Jul 26 16:19:04 2017 (r321533) +++ stable/11/cddl/contrib/opensolaris/lib/libzfs/common/libzfs_util.c Wed Jul 26 16:20:19 2017 (r321534) @@ -718,7 +718,7 @@ zfs_get_pool_handle(const zfs_handle_t *zhp) * Given a name, determine whether or not it's a valid path * (starts with '/' or "./"). If so, walk the mnttab trying * to match the device number. If not, treat the path as an - * fs/vol/snap name. + * fs/vol/snap/bkmark name. */ zfs_handle_t * zfs_path_to_zhandle(libzfs_handle_t *hdl, char *path, zfs_type_t argtype) Modified: stable/11/sys/cddl/contrib/opensolaris/common/zfs/zfs_namecheck.c ============================================================================== --- stable/11/sys/cddl/contrib/opensolaris/common/zfs/zfs_namecheck.c Wed Jul 26 16:19:04 2017 (r321533) +++ stable/11/sys/cddl/contrib/opensolaris/common/zfs/zfs_namecheck.c Wed Jul 26 16:20:19 2017 (r321534) @@ -120,9 +120,9 @@ permset_namecheck(const char *path, namecheck_err_t *w } /* - * Dataset names must be of the following form: + * Entity names must be of the following form: * - * [component][/]*[component][@component] + * [component/]*[component][(@|#)component]? * * Where each component is made up of alphanumeric characters plus the following * characters: @@ -133,10 +133,10 @@ permset_namecheck(const char *path, namecheck_err_t *w * names for temporary clones (for online recv). */ int -dataset_namecheck(const char *path, namecheck_err_t *why, char *what) +entity_namecheck(const char *path, namecheck_err_t *why, char *what) { - const char *loc, *end; - int found_snapshot; + const char *start, *end; + int found_delim; /* * Make sure the name is not too long. @@ -161,12 +161,13 @@ dataset_namecheck(const char *path, namecheck_err_t *w return (-1); } - loc = path; - found_snapshot = 0; + start = path; + found_delim = 0; for (;;) { /* Find the end of this component */ - end = loc; - while (*end != '/' && *end != '@' && *end != '\0') + end = start; + while (*end != '/' && *end != '@' && *end != '#' && + *end != '\0') end++; if (*end == '\0' && end[-1] == '/') { @@ -176,25 +177,8 @@ dataset_namecheck(const char *path, namecheck_err_t *w return (-1); } - /* Zero-length components are not allowed */ - if (loc == end) { - if (why) { - /* - * Make sure this is really a zero-length - * component and not a '@@'. - */ - if (*end == '@' && found_snapshot) { - *why = NAME_ERR_MULTIPLE_AT; - } else { - *why = NAME_ERR_EMPTY_COMPONENT; - } - } - - return (-1); - } - /* Validate the contents of this component */ - while (loc != end) { + for (const char *loc = start; loc != end; loc++) { if (!valid_char(*loc) && *loc != '%') { if (why) { *why = NAME_ERR_INVALCHAR; @@ -202,43 +186,64 @@ dataset_namecheck(const char *path, namecheck_err_t *w } return (-1); } - loc++; } - /* If we've reached the end of the string, we're OK */ - if (*end == '\0') - return (0); - - if (*end == '@') { - /* - * If we've found an @ symbol, indicate that we're in - * the snapshot component, and report a second '@' - * character as an error. - */ - if (found_snapshot) { + /* Snapshot or bookmark delimiter found */ + if (*end == '@' || *end == '#') { + /* Multiple delimiters are not allowed */ + if (found_delim != 0) { if (why) - *why = NAME_ERR_MULTIPLE_AT; + *why = NAME_ERR_MULTIPLE_DELIMITERS; return (-1); } - found_snapshot = 1; + found_delim = 1; } + /* Zero-length components are not allowed */ + if (start == end) { + if (why) + *why = NAME_ERR_EMPTY_COMPONENT; + return (-1); + } + + /* If we've reached the end of the string, we're OK */ + if (*end == '\0') + return (0); + /* - * If there is a '/' in a snapshot name + * If there is a '/' in a snapshot or bookmark name * then report an error */ - if (*end == '/' && found_snapshot) { + if (*end == '/' && found_delim != 0) { if (why) *why = NAME_ERR_TRAILING_SLASH; return (-1); } /* Update to the next component */ - loc = end + 1; + start = end + 1; } } +/* + * Dataset is any entity, except bookmark + */ +int +dataset_namecheck(const char *path, namecheck_err_t *why, char *what) +{ + int ret = entity_namecheck(path, why, what); + + if (ret == 0 && strchr(path, '#') != NULL) { + if (why != NULL) { + *why = NAME_ERR_INVALCHAR; + *what = '#'; + } + return (-1); + } + + return (ret); +} /* * mountpoint names must be of the following form: Modified: stable/11/sys/cddl/contrib/opensolaris/common/zfs/zfs_namecheck.h ============================================================================== --- stable/11/sys/cddl/contrib/opensolaris/common/zfs/zfs_namecheck.h Wed Jul 26 16:19:04 2017 (r321533) +++ stable/11/sys/cddl/contrib/opensolaris/common/zfs/zfs_namecheck.h Wed Jul 26 16:20:19 2017 (r321534) @@ -38,7 +38,7 @@ typedef enum { NAME_ERR_EMPTY_COMPONENT, /* name contains an empty component */ NAME_ERR_TRAILING_SLASH, /* name ends with a slash */ NAME_ERR_INVALCHAR, /* invalid character found */ - NAME_ERR_MULTIPLE_AT, /* multiple '@' characters found */ + NAME_ERR_MULTIPLE_DELIMITERS, /* multiple '@'/'#' delimiters found */ NAME_ERR_NOLETTER, /* pool doesn't begin with a letter */ NAME_ERR_RESERVED, /* entire name is reserved */ NAME_ERR_DISKLIKE, /* reserved disk name (c[0-9].*) */ @@ -49,6 +49,7 @@ typedef enum { #define ZFS_PERMSET_MAXLEN 64 int pool_namecheck(const char *, namecheck_err_t *, char *); +int entity_namecheck(const char *, namecheck_err_t *, char *); int dataset_namecheck(const char *, namecheck_err_t *, char *); int mountpoint_namecheck(const char *, namecheck_err_t *); int zfs_component_namecheck(const char *, namecheck_err_t *, char *); From owner-svn-src-stable-11@freebsd.org Wed Jul 26 16:21:34 2017 Return-Path: Delivered-To: svn-src-stable-11@mailman.ysv.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by mailman.ysv.freebsd.org (Postfix) with ESMTP id C34F9D98A5D; Wed, 26 Jul 2017 16:21:34 +0000 (UTC) (envelope-from mav@FreeBSD.org) Received: from repo.freebsd.org (repo.freebsd.org [IPv6:2610:1c1:1:6068::e6a:0]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (Client did not present a certificate) by mx1.freebsd.org (Postfix) with ESMTPS id 4647469150; Wed, 26 Jul 2017 16:21:34 +0000 (UTC) (envelope-from mav@FreeBSD.org) Received: from repo.freebsd.org ([127.0.1.37]) by repo.freebsd.org (8.15.2/8.15.2) with ESMTP id v6QGLXR2064740; Wed, 26 Jul 2017 16:21:33 GMT (envelope-from mav@FreeBSD.org) Received: (from mav@localhost) by repo.freebsd.org (8.15.2/8.15.2/Submit) id v6QGLWWM064731; Wed, 26 Jul 2017 16:21:32 GMT (envelope-from mav@FreeBSD.org) Message-Id: <201707261621.v6QGLWWM064731@repo.freebsd.org> X-Authentication-Warning: repo.freebsd.org: mav set sender to mav@FreeBSD.org using -f From: Alexander Motin Date: Wed, 26 Jul 2017 16:21:32 +0000 (UTC) To: src-committers@freebsd.org, svn-src-all@freebsd.org, svn-src-stable@freebsd.org, svn-src-stable-11@freebsd.org Subject: svn commit: r321535 - in stable/11: cddl/contrib/opensolaris/cmd/zfs cddl/contrib/opensolaris/cmd/zstreamdump cddl/contrib/opensolaris/lib/libzfs/common cddl/contrib/opensolaris/lib/libzfs_core/com... X-SVN-Group: stable-11 X-SVN-Commit-Author: mav X-SVN-Commit-Paths: in stable/11: cddl/contrib/opensolaris/cmd/zfs cddl/contrib/opensolaris/cmd/zstreamdump cddl/contrib/opensolaris/lib/libzfs/common cddl/contrib/opensolaris/lib/libzfs_core/common sys/cddl/contrib/open... X-SVN-Commit-Revision: 321535 X-SVN-Commit-Repository: base MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit X-BeenThere: svn-src-stable-11@freebsd.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: SVN commit messages for only the 11-stable src tree List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 26 Jul 2017 16:21:34 -0000 Author: mav Date: Wed Jul 26 16:21:32 2017 New Revision: 321535 URL: https://svnweb.freebsd.org/changeset/base/321535 Log: MFC r317414: MFV 316894 7252 7628 compressed zfs send / receive illumos/illumos-gate@5602294fda888d923d57a78bafdaf48ae6223dea https://github.com/illumos/illumos-gate/commit/5602294fda888d923d57a78bafdaf48ae6223dea https://www.illumos.org/issues/7252 This feature includes code to allow a system with compressed ARC enabled to send data in its compressed form straight out of the ARC, and receive data in its compressed form directly into the ARC. https://www.illumos.org/issues/7628 We should have longer, more readable versions of the ZFS send / recv options. 7628 create long versions of ZFS send / receive options Reviewed by: George Wilson Reviewed by: John Kennedy Reviewed by: Matthew Ahrens Reviewed by: Paul Dagnelie Reviewed by: Pavel Zakharov Reviewed by: Sebastien Roy Reviewed by: David Quigley Reviewed by: Thomas Caputi Approved by: Dan McDonald Author: Dan Kimmel Modified: stable/11/cddl/contrib/opensolaris/cmd/zfs/zfs.8 stable/11/cddl/contrib/opensolaris/cmd/zfs/zfs_main.c stable/11/cddl/contrib/opensolaris/cmd/zstreamdump/zstreamdump.c stable/11/cddl/contrib/opensolaris/lib/libzfs/common/libzfs.h stable/11/cddl/contrib/opensolaris/lib/libzfs/common/libzfs_sendrecv.c stable/11/cddl/contrib/opensolaris/lib/libzfs_core/common/libzfs_core.c stable/11/cddl/contrib/opensolaris/lib/libzfs_core/common/libzfs_core.h stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/arc.c stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/dbuf.c stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/dmu.c stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/dmu_objset.c stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/dmu_send.c stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/dsl_dataset.c stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/lz4.c stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/sys/arc.h stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/sys/dmu.h stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/sys/dmu_send.h stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/sys/dsl_dataset.h stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/sys/refcount.h stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/sys/zfs_ioctl.h stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/sys/zio.h stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/sys/zio_compress.h stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/zfs_ioctl.c stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/zio.c Directory Properties: stable/11/ (props changed) Modified: stable/11/cddl/contrib/opensolaris/cmd/zfs/zfs.8 ============================================================================== --- stable/11/cddl/contrib/opensolaris/cmd/zfs/zfs.8 Wed Jul 26 16:20:19 2017 (r321534) +++ stable/11/cddl/contrib/opensolaris/cmd/zfs/zfs.8 Wed Jul 26 16:21:32 2017 (r321535) @@ -180,12 +180,12 @@ .Ar bookmark .Nm .Cm send -.Op Fl DnPpRveL +.Op Fl DLPRcenpv .Op Fl i Ar snapshot | Fl I Ar snapshot .Ar snapshot .Nm .Cm send -.Op Fl eL +.Op Fl Lce .Op Fl i Ar snapshot Ns | Ns bookmark .Ar filesystem Ns | Ns Ar volume Ns | Ns Ar snapshot .Nm @@ -2535,7 +2535,7 @@ feature. .It Xo .Nm .Cm send -.Op Fl DnPpRveL +.Op Fl DLPRcenpv .Op Fl i Ar snapshot | Fl I Ar snapshot .Ar snapshot .Xc @@ -2580,7 +2580,7 @@ The incremental source may be specified as with the .Fl i option. -.It Fl R +.It Fl R, -replicate Generate a replication stream package, which will replicate the specified filesystem, and all descendent file systems, up to the named snapshot. When received, all properties, snapshots, descendent file systems, and clones are @@ -2598,7 +2598,7 @@ is received. If the .Fl F flag is specified when this stream is received, snapshots and file systems that do not exist on the sending side are destroyed. -.It Fl D +.It Fl D, -dedup Generate a deduplicated stream. Blocks which would have been sent multiple times in the send stream will only be sent once. The receiving system must also support this feature to receive a deduplicated stream. This flag can @@ -2607,7 +2607,7 @@ be used regardless of the dataset's property, but performance will be much better if the filesystem uses a dedup-capable checksum (eg. .Sy sha256 ) . -.It Fl L +.It Fl L, -large-block Generate a stream which may contain blocks larger than 128KB. This flag has no effect if the @@ -2623,7 +2623,7 @@ See for details on ZFS feature flags and the .Sy large_blocks feature. -.It Fl e +.It Fl e, -embed Generate a more compact stream by using WRITE_EMBEDDED records for blocks which are stored more compactly on disk by the .Sy embedded_data @@ -2646,11 +2646,25 @@ See for details on ZFS feature flags and the .Sy embedded_data feature. -.It Fl p +.It Fl c, -compressed +Generate a more compact stream by using compressed WRITE records for blocks +which are compressed on disk and in memory (see the +.Sy compression property for details). If the +.Sy lz4_compress +feature is active on the sending system, then the receiving system must have that +feature enabled as well. If the +.Sy large_blocks +feature is enabled on the sending system but the +.Fl L +option is not supplied in conjunction with +.Fl c +then the data will be decompressed before sending so it can be split +into smaller block sizes. +.It Fl p, -props Include the dataset's properties in the stream. This flag is implicit when .Fl R is specified. The receiving system must also support this feature. -.It Fl n +.It Fl n, -dryrun Do a dry-run ("No-op") send. Do not generate any actual send data. This is useful in conjunction with the .Fl v @@ -2660,9 +2674,9 @@ flags to determine what data will be sent. In this case, the verbose output will be written to standard output (contrast with a non-dry-run, where the stream is written to standard output and the verbose output goes to standard error). -.It Fl P +.It Fl P, -parsable Print machine-parsable verbose information about the stream package generated. -.It Fl v +.It Fl v, -verbose Print verbose information about the stream package generated. This information includes a per-second report of how much data has been sent. .El @@ -2673,7 +2687,7 @@ on future versions of .It Xo .Nm .Cm send -.Op Fl eL +.Op Fl Lce .Op Fl i Ar snapshot Ns | Ns Ar bookmark .Ar filesystem Ns | Ns Ar volume Ns | Ns Ar snapshot .Xc @@ -2699,7 +2713,7 @@ specified as the last component of the name If the incremental target is a clone, the incremental source can be the origin snapshot, or an earlier snapshot in the origin's filesystem, or the origin's origin, etc. -.It Fl L +.It Fl L, -large-block Generate a stream which may contain blocks larger than 128KB. This flag has no effect if the @@ -2715,7 +2729,22 @@ See for details on ZFS feature flags and the .Sy large_blocks feature. -.It Fl e +.It Fl c, -compressed +Generate a more compact stream by using compressed WRITE records for blocks +which are compressed on disk and in memory (see the +.Sy compression +property for details). If the +.Sy lz4_compress +feature is active on the sending system, then the receiving system must have +that feature enabled as well. If the +.Sy large_blocks +feature is enabled on the sending system but the +.Fl L +option is not supplied in conjunction with +.Fl c +then the data will be decompressed before sending so it can be split +into smaller block sizes. +.It Fl e, -embed Generate a more compact stream by using WRITE_EMBEDDED records for blocks which are stored more compactly on disk by the .Sy embedded_data Modified: stable/11/cddl/contrib/opensolaris/cmd/zfs/zfs_main.c ============================================================================== --- stable/11/cddl/contrib/opensolaris/cmd/zfs/zfs_main.c Wed Jul 26 16:20:19 2017 (r321534) +++ stable/11/cddl/contrib/opensolaris/cmd/zfs/zfs_main.c Wed Jul 26 16:21:32 2017 (r321535) @@ -35,6 +35,7 @@ #include #include #include +#include #include #include #include @@ -278,7 +279,7 @@ get_usage(zfs_help_t idx) case HELP_ROLLBACK: return (gettext("\trollback [-rRf] \n")); case HELP_SEND: - return (gettext("\tsend [-DnPpRvLe] [-[iI] snapshot] " + return (gettext("\tsend [-DnPpRvLec] [-[iI] snapshot] " "\n" "\tsend [-Le] [-i snapshot|bookmark] " "\n" @@ -3771,8 +3772,23 @@ zfs_do_send(int argc, char **argv) nvlist_t *dbgnv = NULL; boolean_t extraverbose = B_FALSE; + struct option long_options[] = { + {"replicate", no_argument, NULL, 'R'}, + {"props", no_argument, NULL, 'p'}, + {"parsable", no_argument, NULL, 'P'}, + {"dedup", no_argument, NULL, 'D'}, + {"verbose", no_argument, NULL, 'v'}, + {"dryrun", no_argument, NULL, 'n'}, + {"large-block", no_argument, NULL, 'L'}, + {"embed", no_argument, NULL, 'e'}, + {"resume", required_argument, NULL, 't'}, + {"compressed", no_argument, NULL, 'c'}, + {0, 0, 0, 0} + }; + /* check options */ - while ((c = getopt(argc, argv, ":i:I:RDpvnPLet:")) != -1) { + while ((c = getopt_long(argc, argv, ":i:I:RbDpvnPLet:c", long_options, + NULL)) != -1) { switch (c) { case 'i': if (fromname) @@ -3816,12 +3832,17 @@ zfs_do_send(int argc, char **argv) case 't': resume_token = optarg; break; + case 'c': + flags.compress = B_TRUE; + break; case ':': (void) fprintf(stderr, gettext("missing argument for " "'%c' option\n"), optopt); usage(B_FALSE); break; case '?': + /*FALLTHROUGH*/ + default: (void) fprintf(stderr, gettext("invalid option '%c'\n"), optopt); usage(B_FALSE); @@ -3892,6 +3913,8 @@ zfs_do_send(int argc, char **argv) lzc_flags |= LZC_SEND_FLAG_LARGE_BLOCK; if (flags.embed_data) lzc_flags |= LZC_SEND_FLAG_EMBED_DATA; + if (flags.compress) + lzc_flags |= LZC_SEND_FLAG_COMPRESS; if (fromname != NULL && (fromname[0] == '#' || fromname[0] == '@')) { Modified: stable/11/cddl/contrib/opensolaris/cmd/zstreamdump/zstreamdump.c ============================================================================== --- stable/11/cddl/contrib/opensolaris/cmd/zstreamdump/zstreamdump.c Wed Jul 26 16:20:19 2017 (r321534) +++ stable/11/cddl/contrib/opensolaris/cmd/zstreamdump/zstreamdump.c Wed Jul 26 16:21:32 2017 (r321535) @@ -25,8 +25,8 @@ */ /* - * Copyright (c) 2013, 2014 by Delphix. All rights reserved. * Copyright (c) 2014 Integros [integros.com] + * Copyright (c) 2013, 2015 by Delphix. All rights reserved. */ #include @@ -39,6 +39,7 @@ #include #include +#include #include /* @@ -251,6 +252,7 @@ main(int argc, char *argv[]) (void) fprintf(stderr, "invalid option '%c'\n", optopt); usage(); + break; } } @@ -453,38 +455,50 @@ main(int argc, char *argv[]) drrw->drr_object = BSWAP_64(drrw->drr_object); drrw->drr_type = BSWAP_32(drrw->drr_type); drrw->drr_offset = BSWAP_64(drrw->drr_offset); - drrw->drr_length = BSWAP_64(drrw->drr_length); + drrw->drr_logical_size = + BSWAP_64(drrw->drr_logical_size); drrw->drr_toguid = BSWAP_64(drrw->drr_toguid); drrw->drr_key.ddk_prop = BSWAP_64(drrw->drr_key.ddk_prop); + drrw->drr_compressed_size = + BSWAP_64(drrw->drr_compressed_size); } + + uint64_t payload_size = DRR_WRITE_PAYLOAD_SIZE(drrw); + /* * If this is verbose and/or dump output, * print info on the modified block */ if (verbose) { (void) printf("WRITE object = %llu type = %u " - "checksum type = %u\n" - " offset = %llu length = %llu " + "checksum type = %u compression type = %u\n" + " offset = %llu logical_size = %llu " + "compressed_size = %llu " + "payload_size = %llu " "props = %llx\n", (u_longlong_t)drrw->drr_object, drrw->drr_type, drrw->drr_checksumtype, + drrw->drr_compressiontype, (u_longlong_t)drrw->drr_offset, - (u_longlong_t)drrw->drr_length, + (u_longlong_t)drrw->drr_logical_size, + (u_longlong_t)drrw->drr_compressed_size, + (u_longlong_t)payload_size, (u_longlong_t)drrw->drr_key.ddk_prop); } + /* * Read the contents of the block in from STDIN to buf */ - (void) ssread(buf, drrw->drr_length, &zc); + (void) ssread(buf, payload_size, &zc); /* * If in dump mode */ if (dump) { - print_block(buf, drrw->drr_length); + print_block(buf, payload_size); } - total_write_size += drrw->drr_length; + total_write_size += payload_size; break; case DRR_WRITE_BYREF: Modified: stable/11/cddl/contrib/opensolaris/lib/libzfs/common/libzfs.h ============================================================================== --- stable/11/cddl/contrib/opensolaris/lib/libzfs/common/libzfs.h Wed Jul 26 16:20:19 2017 (r321534) +++ stable/11/cddl/contrib/opensolaris/lib/libzfs/common/libzfs.h Wed Jul 26 16:21:32 2017 (r321535) @@ -616,6 +616,9 @@ typedef struct sendflags { /* WRITE_EMBEDDED records of type DATA are permitted */ boolean_t embed_data; + + /* compressed WRITE records are permitted */ + boolean_t compress; } sendflags_t; typedef boolean_t (snapfilter_cb_t)(zfs_handle_t *, void *); Modified: stable/11/cddl/contrib/opensolaris/lib/libzfs/common/libzfs_sendrecv.c ============================================================================== --- stable/11/cddl/contrib/opensolaris/lib/libzfs/common/libzfs_sendrecv.c Wed Jul 26 16:20:19 2017 (r321534) +++ stable/11/cddl/contrib/opensolaris/lib/libzfs/common/libzfs_sendrecv.c Wed Jul 26 16:21:32 2017 (r321535) @@ -354,8 +354,10 @@ cksummer(void *arg) { struct drr_write *drrw = &drr->drr_u.drr_write; dataref_t dataref; + uint64_t payload_size; - (void) ssread(buf, drrw->drr_length, ofp); + payload_size = DRR_WRITE_PAYLOAD_SIZE(drrw); + (void) ssread(buf, payload_size, ofp); /* * Use the existing checksum if it's dedup-capable, @@ -369,7 +371,7 @@ cksummer(void *arg) zio_cksum_t tmpsha256; SHA256Init(&ctx); - SHA256Update(&ctx, buf, drrw->drr_length); + SHA256Update(&ctx, buf, payload_size); SHA256Final(&tmpsha256, &ctx); drrw->drr_key.ddk_cksum.zc_word[0] = BE_64(tmpsha256.zc_word[0]); @@ -399,7 +401,7 @@ cksummer(void *arg) wbr_drrr->drr_object = drrw->drr_object; wbr_drrr->drr_offset = drrw->drr_offset; - wbr_drrr->drr_length = drrw->drr_length; + wbr_drrr->drr_length = drrw->drr_logical_size; wbr_drrr->drr_toguid = drrw->drr_toguid; wbr_drrr->drr_refguid = dataref.ref_guid; wbr_drrr->drr_refobject = @@ -421,7 +423,7 @@ cksummer(void *arg) goto out; } else { /* block not previously seen */ - if (dump_record(drr, buf, drrw->drr_length, + if (dump_record(drr, buf, payload_size, &stream_cksum, outfd) != 0) goto out; } @@ -924,7 +926,7 @@ typedef struct send_dump_data { uint64_t prevsnap_obj; boolean_t seenfrom, seento, replicate, doall, fromorigin; boolean_t verbose, dryrun, parsable, progress, embed_data, std_out; - boolean_t large_block; + boolean_t large_block, compress; int outfd; boolean_t err; nvlist_t *fss; @@ -940,7 +942,7 @@ typedef struct send_dump_data { static int estimate_ioctl(zfs_handle_t *zhp, uint64_t fromsnap_obj, - boolean_t fromorigin, uint64_t *sizep) + boolean_t fromorigin, enum lzc_send_flags flags, uint64_t *sizep) { zfs_cmd_t zc = { 0 }; libzfs_handle_t *hdl = zhp->zfs_hdl; @@ -953,6 +955,7 @@ estimate_ioctl(zfs_handle_t *zhp, uint64_t fromsnap_ob zc.zc_sendobj = zfs_prop_get_int(zhp, ZFS_PROP_OBJSETID); zc.zc_fromobj = fromsnap_obj; zc.zc_guid = 1; /* estimate flag */ + zc.zc_flags = flags; if (zfs_ioctl(zhp->zfs_hdl, ZFS_IOC_SEND, &zc) != 0) { char errbuf[1024]; @@ -1192,6 +1195,7 @@ dump_snapshot(zfs_handle_t *zhp, void *arg) progress_arg_t pa = { 0 }; pthread_t tid; char *thissnap; + enum lzc_send_flags flags = 0; int err; boolean_t isfromsnap, istosnap, fromorigin; boolean_t exclude = B_FALSE; @@ -1220,6 +1224,13 @@ dump_snapshot(zfs_handle_t *zhp, void *arg) if (istosnap) sdd->seento = B_TRUE; + if (sdd->large_block) + flags |= LZC_SEND_FLAG_LARGE_BLOCK; + if (sdd->embed_data) + flags |= LZC_SEND_FLAG_EMBED_DATA; + if (sdd->compress) + flags |= LZC_SEND_FLAG_COMPRESS; + if (!sdd->doall && !isfromsnap && !istosnap) { if (sdd->replicate) { char *snapname; @@ -1266,7 +1277,7 @@ dump_snapshot(zfs_handle_t *zhp, void *arg) if (sdd->verbose) { uint64_t size = 0; (void) estimate_ioctl(zhp, sdd->prevsnap_obj, - fromorigin, &size); + fromorigin, flags, &size); send_print_verbose(fout, zhp->zfs_name, sdd->prevsnap[0] ? sdd->prevsnap : NULL, @@ -1291,12 +1302,6 @@ dump_snapshot(zfs_handle_t *zhp, void *arg) } } - enum lzc_send_flags flags = 0; - if (sdd->large_block) - flags |= LZC_SEND_FLAG_LARGE_BLOCK; - if (sdd->embed_data) - flags |= LZC_SEND_FLAG_EMBED_DATA; - err = dump_ioctl(zhp, sdd->prevsnap, sdd->prevsnap_obj, fromorigin, sdd->outfd, flags, sdd->debugnv); @@ -1602,8 +1607,12 @@ zfs_send_resume(libzfs_handle_t *hdl, sendflags_t *fla fromguid = 0; (void) nvlist_lookup_uint64(resume_nvl, "fromguid", &fromguid); + if (flags->largeblock || nvlist_exists(resume_nvl, "largeblockok")) + lzc_flags |= LZC_SEND_FLAG_LARGE_BLOCK; if (flags->embed_data || nvlist_exists(resume_nvl, "embedok")) lzc_flags |= LZC_SEND_FLAG_EMBED_DATA; + if (flags->compress || nvlist_exists(resume_nvl, "compressok")) + lzc_flags |= LZC_SEND_FLAG_COMPRESS; if (guid_to_name(hdl, toname, toguid, B_FALSE, name) != 0) { if (zfs_dataset_exists(hdl, toname, ZFS_TYPE_DATASET)) { @@ -1636,7 +1645,8 @@ zfs_send_resume(libzfs_handle_t *hdl, sendflags_t *fla if (flags->verbose) { uint64_t size = 0; - error = lzc_send_space(zhp->zfs_name, fromname, &size); + error = lzc_send_space(zhp->zfs_name, fromname, + lzc_flags, &size); if (error == 0) size = MAX(0, (int64_t)(size - bytes)); send_print_verbose(stderr, zhp->zfs_name, fromname, @@ -1866,6 +1876,7 @@ zfs_send(zfs_handle_t *zhp, const char *fromsnap, cons sdd.dryrun = flags->dryrun; sdd.large_block = flags->largeblock; sdd.embed_data = flags->embed_data; + sdd.compress = flags->compress; sdd.filter_cb = filter_func; sdd.filter_cb_arg = cb_arg; if (debugnvp) @@ -2960,11 +2971,17 @@ recv_skip(libzfs_handle_t *hdl, int fd, boolean_t byte case DRR_WRITE: if (byteswap) { - drr->drr_u.drr_write.drr_length = - BSWAP_64(drr->drr_u.drr_write.drr_length); + drr->drr_u.drr_write.drr_logical_size = + BSWAP_64( + drr->drr_u.drr_write.drr_logical_size); + drr->drr_u.drr_write.drr_compressed_size = + BSWAP_64( + drr->drr_u.drr_write.drr_compressed_size); } + uint64_t payload_size = + DRR_WRITE_PAYLOAD_SIZE(&drr->drr_u.drr_write); (void) recv_read(hdl, fd, buf, - drr->drr_u.drr_write.drr_length, B_FALSE, NULL); + payload_size, B_FALSE, NULL); break; case DRR_SPILL: if (byteswap) { Modified: stable/11/cddl/contrib/opensolaris/lib/libzfs_core/common/libzfs_core.c ============================================================================== --- stable/11/cddl/contrib/opensolaris/lib/libzfs_core/common/libzfs_core.c Wed Jul 26 16:20:19 2017 (r321534) +++ stable/11/cddl/contrib/opensolaris/lib/libzfs_core/common/libzfs_core.c Wed Jul 26 16:21:32 2017 (r321535) @@ -534,6 +534,8 @@ lzc_send_resume(const char *snapname, const char *from fnvlist_add_boolean(args, "largeblockok"); if (flags & LZC_SEND_FLAG_EMBED_DATA) fnvlist_add_boolean(args, "embedok"); + if (flags & LZC_SEND_FLAG_COMPRESS) + fnvlist_add_boolean(args, "compressok"); if (resumeobj != 0 || resumeoff != 0) { fnvlist_add_uint64(args, "resume_object", resumeobj); fnvlist_add_uint64(args, "resume_offset", resumeoff); @@ -559,7 +561,8 @@ lzc_send_resume(const char *snapname, const char *from * an equivalent snapshot. */ int -lzc_send_space(const char *snapname, const char *from, uint64_t *spacep) +lzc_send_space(const char *snapname, const char *from, + enum lzc_send_flags flags, uint64_t *spacep) { nvlist_t *args; nvlist_t *result; @@ -568,6 +571,12 @@ lzc_send_space(const char *snapname, const char *from, args = fnvlist_alloc(); if (from != NULL) fnvlist_add_string(args, "from", from); + if (flags & LZC_SEND_FLAG_LARGE_BLOCK) + fnvlist_add_boolean(args, "largeblockok"); + if (flags & LZC_SEND_FLAG_EMBED_DATA) + fnvlist_add_boolean(args, "embedok"); + if (flags & LZC_SEND_FLAG_COMPRESS) + fnvlist_add_boolean(args, "compressok"); err = lzc_ioctl(ZFS_IOC_SEND_SPACE, snapname, args, &result); nvlist_free(args); if (err == 0) Modified: stable/11/cddl/contrib/opensolaris/lib/libzfs_core/common/libzfs_core.h ============================================================================== --- stable/11/cddl/contrib/opensolaris/lib/libzfs_core/common/libzfs_core.h Wed Jul 26 16:20:19 2017 (r321534) +++ stable/11/cddl/contrib/opensolaris/lib/libzfs_core/common/libzfs_core.h Wed Jul 26 16:21:32 2017 (r321535) @@ -62,13 +62,14 @@ int lzc_get_holds(const char *, nvlist_t **); enum lzc_send_flags { LZC_SEND_FLAG_EMBED_DATA = 1 << 0, - LZC_SEND_FLAG_LARGE_BLOCK = 1 << 1 + LZC_SEND_FLAG_LARGE_BLOCK = 1 << 1, + LZC_SEND_FLAG_COMPRESS = 1 << 2 }; int lzc_send(const char *, const char *, int, enum lzc_send_flags); int lzc_send_resume(const char *, const char *, int, enum lzc_send_flags, uint64_t, uint64_t); -int lzc_send_space(const char *, const char *, uint64_t *); +int lzc_send_space(const char *, const char *, enum lzc_send_flags, uint64_t *); struct dmu_replay_record; Modified: stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/arc.c ============================================================================== --- stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/arc.c Wed Jul 26 16:20:19 2017 (r321534) +++ stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/arc.c Wed Jul 26 16:21:32 2017 (r321535) @@ -77,10 +77,10 @@ * A new reference to a cache buffer can be obtained in two * ways: 1) via a hash table lookup using the DVA as a key, * or 2) via one of the ARC lists. The arc_read() interface - * uses method 1, while the internal arc algorithms for + * uses method 1, while the internal ARC algorithms for * adjusting the cache use method 2. We therefore provide two * types of locks: 1) the hash table lock array, and 2) the - * arc list locks. + * ARC list locks. * * Buffers do not have their own mutexes, rather they rely on the * hash table mutexes for the bulk of their protection (i.e. most @@ -93,21 +93,12 @@ * buf_hash_remove() expects the appropriate hash mutex to be * already held before it is invoked. * - * Each arc state also has a mutex which is used to protect the + * Each ARC state also has a mutex which is used to protect the * buffer list associated with the state. When attempting to - * obtain a hash table lock while holding an arc list lock you + * obtain a hash table lock while holding an ARC list lock you * must use: mutex_tryenter() to avoid deadlock. Also note that * the active state mutex must be held before the ghost state mutex. * - * Arc buffers may have an associated eviction callback function. - * This function will be invoked prior to removing the buffer (e.g. - * in arc_do_user_evicts()). Note however that the data associated - * with the buffer may be evicted prior to the callback. The callback - * must be made with *no locks held* (to prevent deadlock). Additionally, - * the users of callbacks must ensure that their private data is - * protected from simultaneous callbacks from arc_clear_callback() - * and arc_do_user_evicts(). - * * Note that the majority of the performance stats are manipulated * with atomic operations. * @@ -136,67 +127,81 @@ * are cached in the L1ARC. The L1ARC (l1arc_buf_hdr_t) is a structure within * the arc_buf_hdr_t that will point to the data block in memory. A block can * only be read by a consumer if it has an l1arc_buf_hdr_t. The L1ARC - * caches data in two ways -- in a list of arc buffers (arc_buf_t) and + * caches data in two ways -- in a list of ARC buffers (arc_buf_t) and * also in the arc_buf_hdr_t's private physical data block pointer (b_pdata). - * Each arc buffer (arc_buf_t) is being actively accessed by a specific ARC - * consumer, and always contains uncompressed data. The ARC will provide - * references to this data and will keep it cached until it is no longer in - * use. Typically, the arc will try to cache only the L1ARC's physical data - * block and will aggressively evict any arc_buf_t that is no longer referenced. - * The amount of memory consumed by the arc_buf_t's can be seen via the + * + * The L1ARC's data pointer may or may not be uncompressed. The ARC has the + * ability to store the physical data (b_pdata) associated with the DVA of the + * arc_buf_hdr_t. Since the b_pdata is a copy of the on-disk physical block, + * it will match its on-disk compression characteristics. This behavior can be + * disabled by setting 'zfs_compressed_arc_enabled' to B_FALSE. When the + * compressed ARC functionality is disabled, the b_pdata will point to an + * uncompressed version of the on-disk data. + * + * Data in the L1ARC is not accessed by consumers of the ARC directly. Each + * arc_buf_hdr_t can have multiple ARC buffers (arc_buf_t) which reference it. + * Each ARC buffer (arc_buf_t) is being actively accessed by a specific ARC + * consumer. The ARC will provide references to this data and will keep it + * cached until it is no longer in use. The ARC caches only the L1ARC's physical + * data block and will evict any arc_buf_t that is no longer referenced. The + * amount of memory consumed by the arc_buf_ts' data buffers can be seen via the * "overhead_size" kstat. * + * Depending on the consumer, an arc_buf_t can be requested in uncompressed or + * compressed form. The typical case is that consumers will want uncompressed + * data, and when that happens a new data buffer is allocated where the data is + * decompressed for them to use. Currently the only consumer who wants + * compressed arc_buf_t's is "zfs send", when it streams data exactly as it + * exists on disk. When this happens, the arc_buf_t's data buffer is shared + * with the arc_buf_hdr_t. * - * arc_buf_hdr_t - * +-----------+ - * | | - * | | - * | | - * +-----------+ - * l2arc_buf_hdr_t| | - * | | - * +-----------+ - * l1arc_buf_hdr_t| | - * | | arc_buf_t - * | b_buf +------------>+---------+ arc_buf_t - * | | |b_next +---->+---------+ - * | b_pdata +-+ |---------| |b_next +-->NULL - * +-----------+ | | | +---------+ - * | |b_data +-+ | | - * | +---------+ | |b_data +-+ - * +->+------+ | +---------+ | - * (potentially) | | | | - * compressed | | | | - * data +------+ | v - * +->+------+ +------+ - * uncompressed | | | | - * data | | | | - * +------+ +------+ + * Here is a diagram showing an arc_buf_hdr_t referenced by two arc_buf_t's. The + * first one is owned by a compressed send consumer (and therefore references + * the same compressed data buffer as the arc_buf_hdr_t) and the second could be + * used by any other consumer (and has its own uncompressed copy of the data + * buffer). * - * The L1ARC's data pointer, however, may or may not be uncompressed. The - * ARC has the ability to store the physical data (b_pdata) associated with - * the DVA of the arc_buf_hdr_t. Since the b_pdata is a copy of the on-disk - * physical block, it will match its on-disk compression characteristics. - * If the block on-disk is compressed, then the physical data block - * in the cache will also be compressed and vice-versa. This behavior - * can be disabled by setting 'zfs_compressed_arc_enabled' to B_FALSE. When the - * compressed ARC functionality is disabled, the b_pdata will point to an - * uncompressed version of the on-disk data. + * arc_buf_hdr_t + * +-----------+ + * | fields | + * | common to | + * | L1- and | + * | L2ARC | + * +-----------+ + * | l2arc_buf_hdr_t + * | | + * +-----------+ + * | l1arc_buf_hdr_t + * | | arc_buf_t + * | b_buf +------------>+-----------+ arc_buf_t + * | b_pdata +-+ |b_next +---->+-----------+ + * +-----------+ | |-----------| |b_next +-->NULL + * | |b_comp = T | +-----------+ + * | |b_data +-+ |b_comp = F | + * | +-----------+ | |b_data +-+ + * +->+------+ | +-----------+ | + * compressed | | | | + * data | |<--------------+ | uncompressed + * +------+ compressed, | data + * shared +-->+------+ + * data | | + * | | + * +------+ * * When a consumer reads a block, the ARC must first look to see if the - * arc_buf_hdr_t is cached. If the hdr is cached and already has an arc_buf_t, - * then an additional arc_buf_t is allocated and the uncompressed data is - * bcopied from the existing arc_buf_t. If the hdr is cached but does not - * have an arc_buf_t, then the ARC allocates a new arc_buf_t and decompresses - * the b_pdata contents into the arc_buf_t's b_data. If the arc_buf_hdr_t's - * b_pdata is not compressed, then the block is shared with the newly - * allocated arc_buf_t. This block sharing only occurs with one arc_buf_t - * in the arc buffer chain. Sharing the block reduces the memory overhead - * required when the hdr is caching uncompressed blocks or the compressed - * arc functionality has been disabled via 'zfs_compressed_arc_enabled'. + * arc_buf_hdr_t is cached. If the hdr is cached then the ARC allocates a new + * arc_buf_t and either copies uncompressed data into a new data buffer from an + * existing uncompressed arc_buf_t, decompresses the hdr's b_pdata buffer into a + * new data buffer, or shares the hdr's b_pdata buffer, depending on whether the + * hdr is compressed and the desired compression characteristics of the + * arc_buf_t consumer. If the arc_buf_t ends up sharing data with the + * arc_buf_hdr_t and both of them are uncompressed then the arc_buf_t must be + * the last buffer in the hdr's b_buf list, however a shared compressed buf can + * be anywhere in the hdr's list. * * The diagram below shows an example of an uncompressed ARC hdr that is - * sharing its data with an arc_buf_t: + * sharing its data with an arc_buf_t (note that the shared uncompressed buf is + * the last element in the buf list): * * arc_buf_hdr_t * +-----------+ @@ -225,20 +230,24 @@ * | +------+ | * +---------------------------------+ * - * Writing to the arc requires that the ARC first discard the b_pdata + * Writing to the ARC requires that the ARC first discard the hdr's b_pdata * since the physical block is about to be rewritten. The new data contents - * will be contained in the arc_buf_t (uncompressed). As the I/O pipeline - * performs the write, it may compress the data before writing it to disk. - * The ARC will be called with the transformed data and will bcopy the - * transformed on-disk block into a newly allocated b_pdata. + * will be contained in the arc_buf_t. As the I/O pipeline performs the write, + * it may compress the data before writing it to disk. The ARC will be called + * with the transformed data and will bcopy the transformed on-disk block into + * a newly allocated b_pdata. Writes are always done into buffers which have + * either been loaned (and hence are new and don't have other readers) or + * buffers which have been released (and hence have their own hdr, if there + * were originally other readers of the buf's original hdr). This ensures that + * the ARC only needs to update a single buf and its hdr after a write occurs. * * When the L2ARC is in use, it will also take advantage of the b_pdata. The * L2ARC will always write the contents of b_pdata to the L2ARC. This means - * that when compressed arc is enabled that the L2ARC blocks are identical + * that when compressed ARC is enabled that the L2ARC blocks are identical * to the on-disk block in the main data pool. This provides a significant * advantage since the ARC can leverage the bp's checksum when reading from the * L2ARC to determine if the contents are valid. However, if the compressed - * arc is disabled, then the L2ARC's block must be transformed to look + * ARC is disabled, then the L2ARC's block must be transformed to look * like the physical block in the main data pool before comparing the * checksum and determining its validity. */ @@ -910,6 +919,7 @@ struct arc_callback { void *acb_private; arc_done_func_t *acb_done; arc_buf_t *acb_buf; + boolean_t acb_compressed; zio_t *acb_zio_dummy; arc_callback_t *acb_next; }; @@ -961,7 +971,7 @@ typedef struct l1arc_buf_hdr { zio_cksum_t *b_freeze_cksum; #ifdef ZFS_DEBUG /* - * used for debugging wtih kmem_flags - by allocating and freeing + * Used for debugging with kmem_flags - by allocating and freeing * b_thawed when the buffer is thawed, we get a record of the stack * trace that thawed it. */ @@ -1172,6 +1182,8 @@ sysctl_vfs_zfs_arc_min(SYSCTL_HANDLER_ARGS) HDR_COMPRESS_OFFSET, SPA_COMPRESSBITS, (cmp)); #define ARC_BUF_LAST(buf) ((buf)->b_next == NULL) +#define ARC_BUF_SHARED(buf) ((buf)->b_flags & ARC_BUF_FLAG_SHARED) +#define ARC_BUF_COMPRESSED(buf) ((buf)->b_flags & ARC_BUF_FLAG_COMPRESSED) /* * Other sizes @@ -1332,7 +1344,7 @@ static kmutex_t l2arc_free_on_write_mtx; /* mutex for static uint64_t l2arc_ndev; /* number of devices */ typedef struct l2arc_read_callback { - arc_buf_hdr_t *l2rcb_hdr; /* read buffer */ + arc_buf_hdr_t *l2rcb_hdr; /* read header */ blkptr_t l2rcb_bp; /* original blkptr */ zbookmark_phys_t l2rcb_zb; /* original bookmark */ int l2rcb_flags; /* original flags */ @@ -1682,6 +1694,31 @@ retry: } } +/* + * This is the size that the buf occupies in memory. If the buf is compressed, + * it will correspond to the compressed size. You should use this method of + * getting the buf size unless you explicitly need the logical size. + */ +int32_t +arc_buf_size(arc_buf_t *buf) +{ + return (ARC_BUF_COMPRESSED(buf) ? + HDR_GET_PSIZE(buf->b_hdr) : HDR_GET_LSIZE(buf->b_hdr)); +} + +int32_t +arc_buf_lsize(arc_buf_t *buf) +{ + return (HDR_GET_LSIZE(buf->b_hdr)); +} + +enum zio_compress +arc_get_compression(arc_buf_t *buf) +{ + return (ARC_BUF_COMPRESSED(buf) ? + HDR_GET_COMPRESS(buf->b_hdr) : ZIO_COMPRESS_OFF); +} + #define ARC_MINTIME (hz>>4) /* 62 ms */ static inline boolean_t @@ -1690,9 +1727,21 @@ arc_buf_is_shared(arc_buf_t *buf) boolean_t shared = (buf->b_data != NULL && buf->b_data == buf->b_hdr->b_l1hdr.b_pdata); IMPLY(shared, HDR_SHARED_DATA(buf->b_hdr)); + IMPLY(shared, ARC_BUF_SHARED(buf)); + IMPLY(shared, ARC_BUF_COMPRESSED(buf) || ARC_BUF_LAST(buf)); + + /* + * It would be nice to assert arc_can_share() too, but the "hdr isn't + * already being shared" requirement prevents us from doing that. + */ + return (shared); } +/* + * Free the checksum associated with this header. If there is no checksum, this + * is a no-op. + */ static inline void arc_cksum_free(arc_buf_hdr_t *hdr) { @@ -1705,6 +1754,25 @@ arc_cksum_free(arc_buf_hdr_t *hdr) mutex_exit(&hdr->b_l1hdr.b_freeze_lock); } +/* + * Return true iff at least one of the bufs on hdr is not compressed. + */ +static boolean_t +arc_hdr_has_uncompressed_buf(arc_buf_hdr_t *hdr) +{ + for (arc_buf_t *b = hdr->b_l1hdr.b_buf; b != NULL; b = b->b_next) { + if (!ARC_BUF_COMPRESSED(b)) { + return (B_TRUE); + } + } + return (B_FALSE); +} + +/* + * If we've turned on the ZFS_DEBUG_MODIFY flag, verify that the buf's data + * matches the checksum that is stored in the hdr. If there is no checksum, + * or if the buf is compressed, this is a no-op. + */ static void arc_cksum_verify(arc_buf_t *buf) { @@ -1714,6 +1782,12 @@ arc_cksum_verify(arc_buf_t *buf) if (!(zfs_flags & ZFS_DEBUG_MODIFY)) return; + if (ARC_BUF_COMPRESSED(buf)) { + ASSERT(hdr->b_l1hdr.b_freeze_cksum == NULL || + arc_hdr_has_uncompressed_buf(hdr)); + return; + } + ASSERT(HDR_HAS_L1HDR(hdr)); mutex_enter(&hdr->b_l1hdr.b_freeze_lock); @@ -1721,7 +1795,8 @@ arc_cksum_verify(arc_buf_t *buf) mutex_exit(&hdr->b_l1hdr.b_freeze_lock); return; } - fletcher_2_native(buf->b_data, HDR_GET_LSIZE(hdr), NULL, &zc); + + fletcher_2_native(buf->b_data, arc_buf_size(buf), NULL, &zc); if (!ZIO_CHECKSUM_EQUAL(*hdr->b_l1hdr.b_freeze_cksum, zc)) panic("buffer modified while frozen!"); mutex_exit(&hdr->b_l1hdr.b_freeze_lock); @@ -1795,6 +1870,12 @@ arc_cksum_is_equal(arc_buf_hdr_t *hdr, zio_t *zio) return (valid_cksum); } +/* + * Given a buf full of data, if ZFS_DEBUG_MODIFY is enabled this computes a + * checksum and attaches it to the buf's hdr so that we can ensure that the buf + * isn't modified later on. If buf is compressed or there is already a checksum + * on the hdr, this is a no-op (we only checksum uncompressed bufs). + */ static void arc_cksum_compute(arc_buf_t *buf) { @@ -1804,14 +1885,21 @@ arc_cksum_compute(arc_buf_t *buf) return; ASSERT(HDR_HAS_L1HDR(hdr)); + mutex_enter(&buf->b_hdr->b_l1hdr.b_freeze_lock); if (hdr->b_l1hdr.b_freeze_cksum != NULL) { + ASSERT(arc_hdr_has_uncompressed_buf(hdr)); mutex_exit(&hdr->b_l1hdr.b_freeze_lock); return; + } else if (ARC_BUF_COMPRESSED(buf)) { + mutex_exit(&hdr->b_l1hdr.b_freeze_lock); + return; } + + ASSERT(!ARC_BUF_COMPRESSED(buf)); hdr->b_l1hdr.b_freeze_cksum = kmem_alloc(sizeof (zio_cksum_t), KM_SLEEP); - fletcher_2_native(buf->b_data, HDR_GET_LSIZE(hdr), NULL, + fletcher_2_native(buf->b_data, arc_buf_size(buf), NULL, hdr->b_l1hdr.b_freeze_cksum); mutex_exit(&hdr->b_l1hdr.b_freeze_lock); #ifdef illumos @@ -1855,7 +1943,7 @@ arc_buf_watch(arc_buf_t *buf) procctl_t ctl; ctl.cmd = PCWATCH; ctl.prwatch.pr_vaddr = (uintptr_t)buf->b_data; - ctl.prwatch.pr_size = HDR_GET_LSIZE(buf->b_hdr); + ctl.prwatch.pr_size = arc_buf_size(buf); ctl.prwatch.pr_wflags = WA_WRITE; result = write(arc_procfd, &ctl, sizeof (ctl)); ASSERT3U(result, ==, sizeof (ctl)); @@ -1877,6 +1965,12 @@ arc_buf_type(arc_buf_hdr_t *hdr) return (type); } +boolean_t +arc_is_metadata(arc_buf_t *buf) +{ + return (HDR_ISTYPE_METADATA(buf->b_hdr) != 0); +} + static uint32_t arc_bufc_to_flags(arc_buf_contents_t type) { @@ -1898,12 +1992,19 @@ arc_buf_thaw(arc_buf_t *buf) { arc_buf_hdr_t *hdr = buf->b_hdr; - if (zfs_flags & ZFS_DEBUG_MODIFY) { - if (hdr->b_l1hdr.b_state != arc_anon) - panic("modifying non-anon buffer!"); - if (HDR_IO_IN_PROGRESS(hdr)) - panic("modifying buffer while i/o in progress!"); - arc_cksum_verify(buf); + ASSERT3P(hdr->b_l1hdr.b_state, ==, arc_anon); + ASSERT(!HDR_IO_IN_PROGRESS(hdr)); + + arc_cksum_verify(buf); + + /* + * Compressed buffers do not manipulate the b_freeze_cksum or + * allocate b_thawed. + */ + if (ARC_BUF_COMPRESSED(buf)) { + ASSERT(hdr->b_l1hdr.b_freeze_cksum == NULL || + arc_hdr_has_uncompressed_buf(hdr)); + return; } ASSERT(HDR_HAS_L1HDR(hdr)); @@ -1934,6 +2035,12 @@ arc_buf_freeze(arc_buf_t *buf) if (!(zfs_flags & ZFS_DEBUG_MODIFY)) return; + if (ARC_BUF_COMPRESSED(buf)) { + ASSERT(hdr->b_l1hdr.b_freeze_cksum == NULL || + arc_hdr_has_uncompressed_buf(hdr)); + return; + } + hash_lock = HDR_LOCK(hdr); mutex_enter(hash_lock); @@ -1942,7 +2049,6 @@ arc_buf_freeze(arc_buf_t *buf) hdr->b_l1hdr.b_state == arc_anon); arc_cksum_compute(buf); mutex_exit(hash_lock); - } /* @@ -1999,47 +2105,157 @@ arc_hdr_set_compress(arc_buf_hdr_t *hdr, enum zio_comp } } +/* + * Looks for another buf on the same hdr which has the data decompressed, copies + * from it, and returns true. If no such buf exists, returns false. + */ +static boolean_t +arc_buf_try_copy_decompressed_data(arc_buf_t *buf) +{ + arc_buf_hdr_t *hdr = buf->b_hdr; + boolean_t copied = B_FALSE; + + ASSERT(HDR_HAS_L1HDR(hdr)); + ASSERT3P(buf->b_data, !=, NULL); + ASSERT(!ARC_BUF_COMPRESSED(buf)); + + for (arc_buf_t *from = hdr->b_l1hdr.b_buf; from != NULL; + from = from->b_next) { + /* can't use our own data buffer */ + if (from == buf) { + continue; *** DIFF OUTPUT TRUNCATED AT 1000 LINES *** From owner-svn-src-stable-11@freebsd.org Wed Jul 26 16:23:32 2017 Return-Path: Delivered-To: svn-src-stable-11@mailman.ysv.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by mailman.ysv.freebsd.org (Postfix) with ESMTP id 35571D98B59; Wed, 26 Jul 2017 16:23:32 +0000 (UTC) (envelope-from mav@FreeBSD.org) Received: from repo.freebsd.org (repo.freebsd.org [IPv6:2610:1c1:1:6068::e6a:0]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (Client did not present a certificate) by mx1.freebsd.org (Postfix) with ESMTPS id 10E7369384; Wed, 26 Jul 2017 16:23:31 +0000 (UTC) (envelope-from mav@FreeBSD.org) Received: from repo.freebsd.org ([127.0.1.37]) by repo.freebsd.org (8.15.2/8.15.2) with ESMTP id v6QGNVNL067866; Wed, 26 Jul 2017 16:23:31 GMT (envelope-from mav@FreeBSD.org) Received: (from mav@localhost) by repo.freebsd.org (8.15.2/8.15.2/Submit) id v6QGNV9G067865; Wed, 26 Jul 2017 16:23:31 GMT (envelope-from mav@FreeBSD.org) Message-Id: <201707261623.v6QGNV9G067865@repo.freebsd.org> X-Authentication-Warning: repo.freebsd.org: mav set sender to mav@FreeBSD.org using -f From: Alexander Motin Date: Wed, 26 Jul 2017 16:23:31 +0000 (UTC) To: src-committers@freebsd.org, svn-src-all@freebsd.org, svn-src-stable@freebsd.org, svn-src-stable-11@freebsd.org Subject: svn commit: r321536 - stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs X-SVN-Group: stable-11 X-SVN-Commit-Author: mav X-SVN-Commit-Paths: stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs X-SVN-Commit-Revision: 321536 X-SVN-Commit-Repository: base MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit X-BeenThere: svn-src-stable-11@freebsd.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: SVN commit messages for only the 11-stable src tree List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 26 Jul 2017 16:23:32 -0000 Author: mav Date: Wed Jul 26 16:23:30 2017 New Revision: 321536 URL: https://svnweb.freebsd.org/changeset/base/321536 Log: MFC r317507: MFV 316895 7606 dmu_objset_find_dp() takes a long time while importing pool illumos/illumos-gate@7588687e6ba67c47bf7c9805086dec4a97fcac7b https://github.com/illumos/illumos-gate/commit/7588687e6ba67c47bf7c9805086dec4a97fcac7b https://www.illumos.org/issues/7606 When importing a pool with a large number of filesystems within the same parent filesystem, we see that dmu_objset_find_dp() takes a long time. It is called from 3 places: spa_check_logs(), spa_ld_claim_log_blocks(), and spa_load_verify(). There are several ways to improve performance here: 1. We don't really need to do spa_check_logs() or spa_ld_claim_log_blocks() if the pool was closed cleanly. 2. spa_load_verify() uses dmu_objset_find_dp() to check that no datasets have too long of names. 3. dmu_objset_find_dp() is slow because it's doing zap_value_search() (which is O(N sibling datasets)) to determine the name of each dsl_dir when it's opened. In this case we actually know the name when we are opening it, so we can provide it and avoid the lookup. This change implements fix #3 from the above list; i.e. make dmu_objset_find_dp() provide the name of the dataset so that we don't have to search for it. Reviewed by: Steve Gonczi Reviewed by: George Wilson Reviewed by: Prashanth Sreenivasa Approved by: Gordon Ross Author: Matthew Ahrens Modified: stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/dmu_objset.c Directory Properties: stable/11/ (props changed) Modified: stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/dmu_objset.c ============================================================================== --- stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/dmu_objset.c Wed Jul 26 16:21:32 2017 (r321535) +++ stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/dmu_objset.c Wed Jul 26 16:23:30 2017 (r321536) @@ -1705,6 +1705,7 @@ typedef struct dmu_objset_find_ctx { taskq_t *dc_tq; dsl_pool_t *dc_dp; uint64_t dc_ddobj; + char *dc_ddname; /* last component of ddobj's name */ int (*dc_func)(dsl_pool_t *, dsl_dataset_t *, void *); void *dc_arg; int dc_flags; @@ -1716,7 +1717,6 @@ static void dmu_objset_find_dp_impl(dmu_objset_find_ctx_t *dcp) { dsl_pool_t *dp = dcp->dc_dp; - dmu_objset_find_ctx_t *child_dcp; dsl_dir_t *dd; dsl_dataset_t *ds; zap_cursor_t zc; @@ -1728,7 +1728,12 @@ dmu_objset_find_dp_impl(dmu_objset_find_ctx_t *dcp) if (*dcp->dc_error != 0) goto out; - err = dsl_dir_hold_obj(dp, dcp->dc_ddobj, NULL, FTAG, &dd); + /* + * Note: passing the name (dc_ddname) here is optional, but it + * improves performance because we don't need to call + * zap_value_search() to determine the name. + */ + err = dsl_dir_hold_obj(dp, dcp->dc_ddobj, dcp->dc_ddname, FTAG, &dd); if (err != 0) goto out; @@ -1753,9 +1758,11 @@ dmu_objset_find_dp_impl(dmu_objset_find_ctx_t *dcp) sizeof (uint64_t)); ASSERT3U(attr->za_num_integers, ==, 1); - child_dcp = kmem_alloc(sizeof (*child_dcp), KM_SLEEP); + dmu_objset_find_ctx_t *child_dcp = + kmem_alloc(sizeof (*child_dcp), KM_SLEEP); *child_dcp = *dcp; child_dcp->dc_ddobj = attr->za_first_integer; + child_dcp->dc_ddname = spa_strdup(attr->za_name); if (dcp->dc_tq != NULL) (void) taskq_dispatch(dcp->dc_tq, dmu_objset_find_dp_cb, child_dcp, TQ_SLEEP); @@ -1798,16 +1805,25 @@ dmu_objset_find_dp_impl(dmu_objset_find_ctx_t *dcp) } } - dsl_dir_rele(dd, FTAG); kmem_free(attr, sizeof (zap_attribute_t)); - if (err != 0) + if (err != 0) { + dsl_dir_rele(dd, FTAG); goto out; + } /* * Apply to self. */ err = dsl_dataset_hold_obj(dp, thisobj, FTAG, &ds); + + /* + * Note: we hold the dir while calling dsl_dataset_hold_obj() so + * that the dir will remain cached, and we won't have to re-instantiate + * it (which could be expensive due to finding its name via + * zap_value_search()). + */ + dsl_dir_rele(dd, FTAG); if (err != 0) goto out; err = dcp->dc_func(dp, ds, dcp->dc_arg); @@ -1822,6 +1838,8 @@ out: mutex_exit(dcp->dc_error_lock); } + if (dcp->dc_ddname != NULL) + spa_strfree(dcp->dc_ddname); kmem_free(dcp, sizeof (*dcp)); } @@ -1866,6 +1884,7 @@ dmu_objset_find_dp(dsl_pool_t *dp, uint64_t ddobj, dcp->dc_tq = NULL; dcp->dc_dp = dp; dcp->dc_ddobj = ddobj; + dcp->dc_ddname = NULL; dcp->dc_func = func; dcp->dc_arg = arg; dcp->dc_flags = flags; From owner-svn-src-stable-11@freebsd.org Wed Jul 26 16:24:12 2017 Return-Path: Delivered-To: svn-src-stable-11@mailman.ysv.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by mailman.ysv.freebsd.org (Postfix) with ESMTP id 9B292D98BD6; Wed, 26 Jul 2017 16:24:12 +0000 (UTC) (envelope-from mav@FreeBSD.org) Received: from repo.freebsd.org (repo.freebsd.org [IPv6:2610:1c1:1:6068::e6a:0]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (Client did not present a certificate) by mx1.freebsd.org (Postfix) with ESMTPS id 7770E694CC; Wed, 26 Jul 2017 16:24:12 +0000 (UTC) (envelope-from mav@FreeBSD.org) Received: from repo.freebsd.org ([127.0.1.37]) by repo.freebsd.org (8.15.2/8.15.2) with ESMTP id v6QGOBtQ067946; Wed, 26 Jul 2017 16:24:11 GMT (envelope-from mav@FreeBSD.org) Received: (from mav@localhost) by repo.freebsd.org (8.15.2/8.15.2/Submit) id v6QGOBu4067944; Wed, 26 Jul 2017 16:24:11 GMT (envelope-from mav@FreeBSD.org) Message-Id: <201707261624.v6QGOBu4067944@repo.freebsd.org> X-Authentication-Warning: repo.freebsd.org: mav set sender to mav@FreeBSD.org using -f From: Alexander Motin Date: Wed, 26 Jul 2017 16:24:11 +0000 (UTC) To: src-committers@freebsd.org, svn-src-all@freebsd.org, svn-src-stable@freebsd.org, svn-src-stable-11@freebsd.org Subject: svn commit: r321537 - stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs X-SVN-Group: stable-11 X-SVN-Commit-Author: mav X-SVN-Commit-Paths: stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs X-SVN-Commit-Revision: 321537 X-SVN-Commit-Repository: base MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit X-BeenThere: svn-src-stable-11@freebsd.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: SVN commit messages for only the 11-stable src tree List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 26 Jul 2017 16:24:12 -0000 Author: mav Date: Wed Jul 26 16:24:11 2017 New Revision: 321537 URL: https://svnweb.freebsd.org/changeset/base/321537 Log: MFC r317511: MFV 316896 7580 ztest failure in dbuf_read_impl illumos/illumos-gate@1a01181fdc809f40c64d5c6881ae3e4521a9d9c7 https://github.com/illumos/illumos-gate/commit/1a01181fdc809f40c64d5c6881ae3e4521a9d9c7 https://www.illumos.org/issues/7580 We need to prevent any reader whenever we're about the zero out all the blkptrs. To do this we need to grab the dn_struct_rwlock as writer in dbuf_write_children_ready and free_children just prior to calling bzero. Reviewed by: Pavel Zakharov Reviewed by: Steve Gonczi Reviewed by: Matthew Ahrens Approved by: Dan McDonald Author: George Wilson Modified: stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/dbuf.c stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/dnode_sync.c Directory Properties: stable/11/ (props changed) Modified: stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/dbuf.c ============================================================================== --- stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/dbuf.c Wed Jul 26 16:23:30 2017 (r321536) +++ stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/dbuf.c Wed Jul 26 16:24:11 2017 (r321537) @@ -3317,13 +3317,13 @@ dbuf_write_children_ready(zio_t *zio, arc_buf_t *buf, dmu_buf_impl_t *db = vdb; dnode_t *dn; blkptr_t *bp; - uint64_t i; - int epbs; + unsigned int epbs, i; ASSERT3U(db->db_level, >, 0); DB_DNODE_ENTER(db); dn = DB_DNODE(db); epbs = dn->dn_phys->dn_indblkshift - SPA_BLKPTRSHIFT; + ASSERT3U(epbs, <, 31); /* Determine if all our children are holes */ for (i = 0, bp = db->db.db_data; i < 1 << epbs; i++, bp++) { @@ -3336,8 +3336,14 @@ dbuf_write_children_ready(zio_t *zio, arc_buf_t *buf, * we may get compressed away. */ if (i == 1 << epbs) { - /* didn't find any non-holes */ + /* + * We only found holes. Grab the rwlock to prevent + * anybody from reading the blocks we're about to + * zero out. + */ + rw_enter(&dn->dn_struct_rwlock, RW_WRITER); bzero(db->db.db_data, db->db.db_size); + rw_exit(&dn->dn_struct_rwlock); } DB_DNODE_EXIT(db); } Modified: stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/dnode_sync.c ============================================================================== --- stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/dnode_sync.c Wed Jul 26 16:23:30 2017 (r321536) +++ stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/dnode_sync.c Wed Jul 26 16:24:11 2017 (r321537) @@ -236,8 +236,8 @@ free_children(dmu_buf_impl_t *db, uint64_t blkid, uint dnode_t *dn; blkptr_t *bp; dmu_buf_impl_t *subdb; - uint64_t start, end, dbstart, dbend, i; - int epbs, shift; + uint64_t start, end, dbstart, dbend; + unsigned int epbs, shift, i; /* * There is a small possibility that this block will not be cached: @@ -254,6 +254,7 @@ free_children(dmu_buf_impl_t *db, uint64_t blkid, uint DB_DNODE_ENTER(db); dn = DB_DNODE(db); epbs = dn->dn_phys->dn_indblkshift - SPA_BLKPTRSHIFT; + ASSERT3U(epbs, <, 31); shift = (db->db_level - 1) * epbs; dbstart = db->db_blkid << epbs; start = blkid >> shift; @@ -273,12 +274,12 @@ free_children(dmu_buf_impl_t *db, uint64_t blkid, uint FREE_VERIFY(db, start, end, tx); free_blocks(dn, bp, end-start+1, tx); } else { - for (i = start; i <= end; i++, bp++) { + for (uint64_t id = start; id <= end; id++, bp++) { if (BP_IS_HOLE(bp)) continue; rw_enter(&dn->dn_struct_rwlock, RW_READER); VERIFY0(dbuf_hold_impl(dn, db->db_level - 1, - i, TRUE, FALSE, FTAG, &subdb)); + id, TRUE, FALSE, FTAG, &subdb)); rw_exit(&dn->dn_struct_rwlock); ASSERT3P(bp, ==, subdb->db_blkptr); @@ -293,8 +294,14 @@ free_children(dmu_buf_impl_t *db, uint64_t blkid, uint break; } if (i == 1 << epbs) { - /* didn't find any non-holes */ + /* + * We only found holes. Grab the rwlock to prevent + * anybody from reading the blocks we're about to + * zero out. + */ + rw_enter(&dn->dn_struct_rwlock, RW_WRITER); bzero(db->db.db_data, db->db.db_size); + rw_exit(&dn->dn_struct_rwlock); free_blocks(dn, db->db_blkptr, 1, tx); } else { /* From owner-svn-src-stable-11@freebsd.org Wed Jul 26 16:24:55 2017 Return-Path: Delivered-To: svn-src-stable-11@mailman.ysv.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by mailman.ysv.freebsd.org (Postfix) with ESMTP id 137D3D98C5B; Wed, 26 Jul 2017 16:24:55 +0000 (UTC) (envelope-from mav@FreeBSD.org) Received: from repo.freebsd.org (repo.freebsd.org [IPv6:2610:1c1:1:6068::e6a:0]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (Client did not present a certificate) by mx1.freebsd.org (Postfix) with ESMTPS id E150369602; Wed, 26 Jul 2017 16:24:54 +0000 (UTC) (envelope-from mav@FreeBSD.org) Received: from repo.freebsd.org ([127.0.1.37]) by repo.freebsd.org (8.15.2/8.15.2) with ESMTP id v6QGOs4A068023; Wed, 26 Jul 2017 16:24:54 GMT (envelope-from mav@FreeBSD.org) Received: (from mav@localhost) by repo.freebsd.org (8.15.2/8.15.2/Submit) id v6QGOrBF068021; Wed, 26 Jul 2017 16:24:53 GMT (envelope-from mav@FreeBSD.org) Message-Id: <201707261624.v6QGOrBF068021@repo.freebsd.org> X-Authentication-Warning: repo.freebsd.org: mav set sender to mav@FreeBSD.org using -f From: Alexander Motin Date: Wed, 26 Jul 2017 16:24:53 +0000 (UTC) To: src-committers@freebsd.org, svn-src-all@freebsd.org, svn-src-stable@freebsd.org, svn-src-stable-11@freebsd.org Subject: svn commit: r321538 - stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/sys X-SVN-Group: stable-11 X-SVN-Commit-Author: mav X-SVN-Commit-Paths: stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/sys X-SVN-Commit-Revision: 321538 X-SVN-Commit-Repository: base MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit X-BeenThere: svn-src-stable-11@freebsd.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: SVN commit messages for only the 11-stable src tree List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 26 Jul 2017 16:24:55 -0000 Author: mav Date: Wed Jul 26 16:24:53 2017 New Revision: 321538 URL: https://svnweb.freebsd.org/changeset/base/321538 Log: MFC r317522: MFV 316897 7586 remove #ifdef __lint hack from dmu.h illumos/illumos-gate@4ba5b9616327ef64e8abc737d29b3faabc6ae68c https://github.com/illumos/illumos-gate/commit/4ba5b9616327ef64e8abc737d29b3faabc6ae68c https://www.illumos.org/issues/7586 The #ifdef __lint in dmu.h is ugly, and it would be nice not to duplicate it if we add other inline functions into header files in ZFS, especially since it is difficult to make any other solution work across all compilation targets. We should switch to disabling the lint flags that are failing instead. Reviewed by: Matthew Ahrens Reviewed by: Pavel Zakharov Approved by: Dan McDonald Author: Dan Kimmel Modified: stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/sys/dmu.h stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/sys/spa.h Directory Properties: stable/11/ (props changed) Modified: stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/sys/dmu.h ============================================================================== --- stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/sys/dmu.h Wed Jul 26 16:24:11 2017 (r321537) +++ stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/sys/dmu.h Wed Jul 26 16:24:53 2017 (r321538) @@ -567,12 +567,7 @@ typedef struct dmu_buf_user { * NOTE: This function should only be called once on a given dmu_buf_user_t. * To allow enforcement of this, dbu must already be zeroed on entry. */ -#ifdef __lint -/* Very ugly, but it beats issuing suppression directives in many Makefiles. */ -extern void -dmu_buf_init_user(dmu_buf_user_t *dbu, dmu_buf_evict_func_t *evict_func, - dmu_buf_evict_func_t *evict_func_async, dmu_buf_t **clear_on_evict_dbufp); -#else /* __lint */ +/*ARGSUSED*/ inline void dmu_buf_init_user(dmu_buf_user_t *dbu, dmu_buf_evict_func_t *evict_func_sync, dmu_buf_evict_func_t *evict_func_async, dmu_buf_t **clear_on_evict_dbufp) @@ -588,7 +583,6 @@ dmu_buf_init_user(dmu_buf_user_t *dbu, dmu_buf_evict_f dbu->dbu_clear_on_evict_dbufp = clear_on_evict_dbufp; #endif } -#endif /* __lint */ /* * Attach user data to a dbuf and mark it for normal (when the dbuf's Modified: stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/sys/spa.h ============================================================================== --- stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/sys/spa.h Wed Jul 26 16:24:11 2017 (r321537) +++ stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/sys/spa.h Wed Jul 26 16:24:53 2017 (r321538) @@ -20,7 +20,7 @@ */ /* * Copyright (c) 2005, 2010, Oracle and/or its affiliates. All rights reserved. - * Copyright (c) 2011, 2014 by Delphix. All rights reserved. + * Copyright (c) 2011, 2016 by Delphix. All rights reserved. * Copyright 2011 Nexenta Systems, Inc. All rights reserved. * Copyright (c) 2014 Spectra Logic Corporation, All rights reserved. * Copyright 2013 Saso Kiselkov. All rights reserved. @@ -36,6 +36,7 @@ #include #include #include +#include #ifdef __cplusplus extern "C" { @@ -610,8 +611,6 @@ _NOTE(CONSTCOND) } while (0) } \ ASSERT(len < size); \ } - -#include #define BP_GET_BUFC_TYPE(bp) \ (((BP_GET_LEVEL(bp) > 0) || (DMU_OT_IS_METADATA(BP_GET_TYPE(bp)))) ? \ From owner-svn-src-stable-11@freebsd.org Wed Jul 26 16:25:48 2017 Return-Path: Delivered-To: svn-src-stable-11@mailman.ysv.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by mailman.ysv.freebsd.org (Postfix) with ESMTP id 39D75D98CDF; Wed, 26 Jul 2017 16:25:48 +0000 (UTC) (envelope-from mav@FreeBSD.org) Received: from repo.freebsd.org (repo.freebsd.org [IPv6:2610:1c1:1:6068::e6a:0]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (Client did not present a certificate) by mx1.freebsd.org (Postfix) with ESMTPS id 0335869767; Wed, 26 Jul 2017 16:25:47 +0000 (UTC) (envelope-from mav@FreeBSD.org) Received: from repo.freebsd.org ([127.0.1.37]) by repo.freebsd.org (8.15.2/8.15.2) with ESMTP id v6QGPlKe068114; Wed, 26 Jul 2017 16:25:47 GMT (envelope-from mav@FreeBSD.org) Received: (from mav@localhost) by repo.freebsd.org (8.15.2/8.15.2/Submit) id v6QGPlUB068112; Wed, 26 Jul 2017 16:25:47 GMT (envelope-from mav@FreeBSD.org) Message-Id: <201707261625.v6QGPlUB068112@repo.freebsd.org> X-Authentication-Warning: repo.freebsd.org: mav set sender to mav@FreeBSD.org using -f From: Alexander Motin Date: Wed, 26 Jul 2017 16:25:47 +0000 (UTC) To: src-committers@freebsd.org, svn-src-all@freebsd.org, svn-src-stable@freebsd.org, svn-src-stable-11@freebsd.org Subject: svn commit: r321539 - in stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs: . sys X-SVN-Group: stable-11 X-SVN-Commit-Author: mav X-SVN-Commit-Paths: in stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs: . sys X-SVN-Commit-Revision: 321539 X-SVN-Commit-Repository: base MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit X-BeenThere: svn-src-stable-11@freebsd.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: SVN commit messages for only the 11-stable src tree List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 26 Jul 2017 16:25:48 -0000 Author: mav Date: Wed Jul 26 16:25:46 2017 New Revision: 321539 URL: https://svnweb.freebsd.org/changeset/base/321539 Log: MFC r317527: MFV 316898 7613 ms_freetree[4] is only used in syncing context illumos/illumos-gate@5f145778012b555e084eacc858ead9e1e42bd149 https://github.com/illumos/illumos-gate/commit/5f145778012b555e084eacc858ead9e1e42bd149 https://www.illumos.org/issues/7613 metaslab_t:ms_freetree[TXG_SIZE] is only used in syncing context. We should replace it with two trees: the freeing tree (ranges that we are freeing this syncing txg) and the freed tree (ranges which have been freed this txg). Reviewed by: George Wilson Reviewed by: Alex Reece Approved by: Dan McDonald Author: Matthew Ahrens Modified: stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/metaslab.c stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/sys/metaslab_impl.h Directory Properties: stable/11/ (props changed) Modified: stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/metaslab.c ============================================================================== --- stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/metaslab.c Wed Jul 26 16:24:53 2017 (r321538) +++ stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/metaslab.c Wed Jul 26 16:25:46 2017 (r321539) @@ -533,7 +533,6 @@ metaslab_verify_space(metaslab_t *msp, uint64_t txg) { spa_t *spa = msp->ms_group->mg_vd->vdev_spa; uint64_t allocated = 0; - uint64_t freed = 0; uint64_t sm_free_space, msp_free_space; ASSERT(MUTEX_HELD(&msp->ms_lock)); @@ -563,10 +562,9 @@ metaslab_verify_space(metaslab_t *msp, uint64_t txg) allocated += range_tree_space(msp->ms_alloctree[(txg + t) & TXG_MASK]); } - freed = range_tree_space(msp->ms_freetree[TXG_CLEAN(txg) & TXG_MASK]); msp_free_space = range_tree_space(msp->ms_tree) + allocated + - msp->ms_deferspace + freed; + msp->ms_deferspace + range_tree_space(msp->ms_freedtree); VERIFY3U(sm_free_space, ==, msp_free_space); } @@ -1499,7 +1497,7 @@ metaslab_init(metaslab_group_t *mg, uint64_t id, uint6 /* * We create the main range tree here, but we don't create the - * alloctree and freetree until metaslab_sync_done(). This serves + * other range trees until metaslab_sync_done(). This serves * two purposes: it allows metaslab_sync_done() to detect the * addition of new space; and for debugging, it ensures that we'd * data fault on any attempt to use this metaslab before it's ready. @@ -1557,10 +1555,11 @@ metaslab_fini(metaslab_t *msp) metaslab_unload(msp); range_tree_destroy(msp->ms_tree); + range_tree_destroy(msp->ms_freeingtree); + range_tree_destroy(msp->ms_freedtree); for (int t = 0; t < TXG_SIZE; t++) { range_tree_destroy(msp->ms_alloctree[t]); - range_tree_destroy(msp->ms_freetree[t]); } for (int t = 0; t < TXG_DEFER_SIZE; t++) { @@ -2171,7 +2170,6 @@ static void metaslab_condense(metaslab_t *msp, uint64_t txg, dmu_tx_t *tx) { spa_t *spa = msp->ms_group->mg_vd->vdev_spa; - range_tree_t *freetree = msp->ms_freetree[txg & TXG_MASK]; range_tree_t *condense_tree; space_map_t *sm = msp->ms_sm; @@ -2202,9 +2200,9 @@ metaslab_condense(metaslab_t *msp, uint64_t txg, dmu_t /* * Remove what's been freed in this txg from the condense_tree. * Since we're in sync_pass 1, we know that all the frees from - * this txg are in the freetree. + * this txg are in the freeingtree. */ - range_tree_walk(freetree, range_tree_remove, condense_tree); + range_tree_walk(msp->ms_freeingtree, range_tree_remove, condense_tree); for (int t = 0; t < TXG_DEFER_SIZE; t++) { range_tree_walk(msp->ms_defertree[t], @@ -2260,9 +2258,6 @@ metaslab_sync(metaslab_t *msp, uint64_t txg) spa_t *spa = vd->vdev_spa; objset_t *mos = spa_meta_objset(spa); range_tree_t *alloctree = msp->ms_alloctree[txg & TXG_MASK]; - range_tree_t **freetree = &msp->ms_freetree[txg & TXG_MASK]; - range_tree_t **freed_tree = - &msp->ms_freetree[TXG_CLEAN(txg) & TXG_MASK]; dmu_tx_t *tx; uint64_t object = space_map_object(msp->ms_sm); @@ -2271,14 +2266,14 @@ metaslab_sync(metaslab_t *msp, uint64_t txg) /* * This metaslab has just been added so there's no work to do now. */ - if (*freetree == NULL) { + if (msp->ms_freeingtree == NULL) { ASSERT3P(alloctree, ==, NULL); return; } ASSERT3P(alloctree, !=, NULL); - ASSERT3P(*freetree, !=, NULL); - ASSERT3P(*freed_tree, !=, NULL); + ASSERT3P(msp->ms_freeingtree, !=, NULL); + ASSERT3P(msp->ms_freedtree, !=, NULL); /* * Normally, we don't want to process a metaslab if there @@ -2286,14 +2281,14 @@ metaslab_sync(metaslab_t *msp, uint64_t txg) * is being forced to condense we need to let it through. */ if (range_tree_space(alloctree) == 0 && - range_tree_space(*freetree) == 0 && + range_tree_space(msp->ms_freeingtree) == 0 && !msp->ms_condense_wanted) return; /* * The only state that can actually be changing concurrently with * metaslab_sync() is the metaslab's ms_tree. No other thread can - * be modifying this txg's alloctree, freetree, freed_tree, or + * be modifying this txg's alloctree, freeingtree, freedtree, or * space_map_phys_t. Therefore, we only hold ms_lock to satify * space map ASSERTs. We drop it whenever we call into the DMU, * because the DMU can call down to us (e.g. via zio_free()) at @@ -2330,7 +2325,7 @@ metaslab_sync(metaslab_t *msp, uint64_t txg) metaslab_condense(msp, txg, tx); } else { space_map_write(msp->ms_sm, alloctree, SM_ALLOC, tx); - space_map_write(msp->ms_sm, *freetree, SM_FREE, tx); + space_map_write(msp->ms_sm, msp->ms_freeingtree, SM_FREE, tx); } if (msp->ms_loaded) { @@ -2350,7 +2345,7 @@ metaslab_sync(metaslab_t *msp, uint64_t txg) * to accurately reflect all free space even if some space * is not yet available for allocation (i.e. deferred). */ - space_map_histogram_add(msp->ms_sm, *freed_tree, tx); + space_map_histogram_add(msp->ms_sm, msp->ms_freedtree, tx); /* * Add back any deferred free space that has not been @@ -2372,7 +2367,7 @@ metaslab_sync(metaslab_t *msp, uint64_t txg) * then we will lose some accuracy but will correct it the next * time we load the space map. */ - space_map_histogram_add(msp->ms_sm, *freetree, tx); + space_map_histogram_add(msp->ms_sm, msp->ms_freeingtree, tx); metaslab_group_histogram_add(mg, msp); metaslab_group_histogram_verify(mg); @@ -2380,20 +2375,21 @@ metaslab_sync(metaslab_t *msp, uint64_t txg) /* * For sync pass 1, we avoid traversing this txg's free range tree - * and instead will just swap the pointers for freetree and - * freed_tree. We can safely do this since the freed_tree is + * and instead will just swap the pointers for freeingtree and + * freedtree. We can safely do this since the freed_tree is * guaranteed to be empty on the initial pass. */ if (spa_sync_pass(spa) == 1) { - range_tree_swap(freetree, freed_tree); + range_tree_swap(&msp->ms_freeingtree, &msp->ms_freedtree); } else { - range_tree_vacate(*freetree, range_tree_add, *freed_tree); + range_tree_vacate(msp->ms_freeingtree, + range_tree_add, msp->ms_freedtree); } range_tree_vacate(alloctree, NULL, NULL); ASSERT0(range_tree_space(msp->ms_alloctree[txg & TXG_MASK])); ASSERT0(range_tree_space(msp->ms_alloctree[TXG_CLEAN(txg) & TXG_MASK])); - ASSERT0(range_tree_space(msp->ms_freetree[txg & TXG_MASK])); + ASSERT0(range_tree_space(msp->ms_freeingtree)); mutex_exit(&msp->ms_lock); @@ -2415,7 +2411,6 @@ metaslab_sync_done(metaslab_t *msp, uint64_t txg) metaslab_group_t *mg = msp->ms_group; vdev_t *vd = mg->mg_vd; spa_t *spa = vd->vdev_spa; - range_tree_t **freed_tree; range_tree_t **defer_tree; int64_t alloc_delta, defer_delta; boolean_t defer_allowed = B_TRUE; @@ -2426,20 +2421,24 @@ metaslab_sync_done(metaslab_t *msp, uint64_t txg) /* * If this metaslab is just becoming available, initialize its - * alloctrees, freetrees, and defertree and add its capacity to - * the vdev. + * range trees and add its capacity to the vdev. */ - if (msp->ms_freetree[TXG_CLEAN(txg) & TXG_MASK] == NULL) { + if (msp->ms_freedtree == NULL) { for (int t = 0; t < TXG_SIZE; t++) { ASSERT(msp->ms_alloctree[t] == NULL); - ASSERT(msp->ms_freetree[t] == NULL); msp->ms_alloctree[t] = range_tree_create(NULL, msp, &msp->ms_lock); - msp->ms_freetree[t] = range_tree_create(NULL, msp, - &msp->ms_lock); } + ASSERT3P(msp->ms_freeingtree, ==, NULL); + msp->ms_freeingtree = range_tree_create(NULL, msp, + &msp->ms_lock); + + ASSERT3P(msp->ms_freedtree, ==, NULL); + msp->ms_freedtree = range_tree_create(NULL, msp, + &msp->ms_lock); + for (int t = 0; t < TXG_DEFER_SIZE; t++) { ASSERT(msp->ms_defertree[t] == NULL); @@ -2450,7 +2449,6 @@ metaslab_sync_done(metaslab_t *msp, uint64_t txg) vdev_space_update(vd, 0, 0, msp->ms_size); } - freed_tree = &msp->ms_freetree[TXG_CLEAN(txg) & TXG_MASK]; defer_tree = &msp->ms_defertree[txg % TXG_DEFER_SIZE]; uint64_t free_space = metaslab_class_get_space(spa_normal_class(spa)) - @@ -2462,7 +2460,7 @@ metaslab_sync_done(metaslab_t *msp, uint64_t txg) defer_delta = 0; alloc_delta = space_map_alloc_delta(msp->ms_sm); if (defer_allowed) { - defer_delta = range_tree_space(*freed_tree) - + defer_delta = range_tree_space(msp->ms_freedtree) - range_tree_space(*defer_tree); } else { defer_delta -= range_tree_space(*defer_tree); @@ -2470,9 +2468,6 @@ metaslab_sync_done(metaslab_t *msp, uint64_t txg) vdev_space_update(vd, alloc_delta + defer_delta, defer_delta, 0); - ASSERT0(range_tree_space(msp->ms_alloctree[txg & TXG_MASK])); - ASSERT0(range_tree_space(msp->ms_freetree[txg & TXG_MASK])); - /* * If there's a metaslab_load() in progress, wait for it to complete * so that we have a consistent view of the in-core space map. @@ -2488,9 +2483,9 @@ metaslab_sync_done(metaslab_t *msp, uint64_t txg) range_tree_vacate(*defer_tree, msp->ms_loaded ? range_tree_add : NULL, msp->ms_tree); if (defer_allowed) { - range_tree_swap(freed_tree, defer_tree); + range_tree_swap(&msp->ms_freedtree, defer_tree); } else { - range_tree_vacate(*freed_tree, + range_tree_vacate(msp->ms_freedtree, msp->ms_loaded ? range_tree_add : NULL, msp->ms_tree); } @@ -3250,10 +3245,10 @@ metaslab_free_dva(spa_t *spa, const dva_t *dva, uint64 range_tree_add(msp->ms_tree, offset, size); msp->ms_max_size = metaslab_block_maxsize(msp); } else { - if (range_tree_space(msp->ms_freetree[txg & TXG_MASK]) == 0) + VERIFY3U(txg, ==, spa->spa_syncing_txg); + if (range_tree_space(msp->ms_freeingtree) == 0) vdev_dirty(vd, VDD_METASLAB, msp, txg); - range_tree_add(msp->ms_freetree[txg & TXG_MASK], - offset, size); + range_tree_add(msp->ms_freeingtree, offset, size); } mutex_exit(&msp->ms_lock); @@ -3485,8 +3480,8 @@ metaslab_check_free(spa_t *spa, const blkptr_t *bp) if (msp->ms_loaded) range_tree_verify(msp->ms_tree, offset, size); - for (int j = 0; j < TXG_SIZE; j++) - range_tree_verify(msp->ms_freetree[j], offset, size); + range_tree_verify(msp->ms_freeingtree, offset, size); + range_tree_verify(msp->ms_freedtree, offset, size); for (int j = 0; j < TXG_DEFER_SIZE; j++) range_tree_verify(msp->ms_defertree[j], offset, size); } Modified: stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/sys/metaslab_impl.h ============================================================================== --- stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/sys/metaslab_impl.h Wed Jul 26 16:24:53 2017 (r321538) +++ stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/sys/metaslab_impl.h Wed Jul 26 16:25:46 2017 (r321539) @@ -255,21 +255,24 @@ struct metaslab_group { #define MAX_LBAS 64 /* - * Each metaslab maintains a set of in-core trees to track metaslab operations. - * The in-core free tree (ms_tree) contains the current list of free segments. - * As blocks are allocated, the allocated segment are removed from the ms_tree - * and added to a per txg allocation tree (ms_alloctree). As blocks are freed, - * they are added to the per txg free tree (ms_freetree). These per txg - * trees allow us to process all allocations and frees in syncing context - * where it is safe to update the on-disk space maps. One additional in-core - * tree is maintained to track deferred frees (ms_defertree). Once a block - * is freed it will move from the ms_freetree to the ms_defertree. A deferred - * free means that a block has been freed but cannot be used by the pool - * until TXG_DEFER_SIZE transactions groups later. For example, a block - * that is freed in txg 50 will not be available for reallocation until - * txg 52 (50 + TXG_DEFER_SIZE). This provides a safety net for uberblock - * rollback. A pool could be safely rolled back TXG_DEFERS_SIZE - * transactions groups and ensure that no block has been reallocated. + * Each metaslab maintains a set of in-core trees to track metaslab + * operations. The in-core free tree (ms_tree) contains the list of + * free segments which are eligible for allocation. As blocks are + * allocated, the allocated segments are removed from the ms_tree and + * added to a per txg allocation tree (ms_alloctree). This allows us to + * process all allocations in syncing context where it is safe to update + * the on-disk space maps. Frees are also processed in syncing context. + * Most frees are generated from syncing context, and those that are not + * are held in the spa_free_bplist for processing in syncing context. + * An additional set of in-core trees is maintained to track deferred + * frees (ms_defertree). Once a block is freed it will move from the + * ms_freedtree to the ms_defertree. A deferred free means that a block + * has been freed but cannot be used by the pool until TXG_DEFER_SIZE + * transactions groups later. For example, a block that is freed in txg + * 50 will not be available for reallocation until txg 52 (50 + + * TXG_DEFER_SIZE). This provides a safety net for uberblock rollback. + * A pool could be safely rolled back TXG_DEFERS_SIZE transactions + * groups and ensure that no block has been reallocated. * * The simplified transition diagram looks like this: * @@ -277,33 +280,34 @@ struct metaslab_group { * ALLOCATE * | * V - * free segment (ms_tree) --------> ms_alloctree ----> (write to space map) + * free segment (ms_tree) -----> ms_alloctree[4] ----> (write to space map) * ^ - * | - * | ms_freetree <--- FREE + * | ms_freeingtree <--- FREE * | | + * | v + * | ms_freedtree * | | - * | | - * +----------- ms_defertree <-------+---------> (write to space map) + * +-------- ms_defertree[2] <-------+---------> (write to space map) * * * Each metaslab's space is tracked in a single space map in the MOS, - * which is only updated in syncing context. Each time we sync a txg, - * we append the allocs and frees from that txg to the space map. - * The pool space is only updated once all metaslabs have finished syncing. + * which is only updated in syncing context. Each time we sync a txg, + * we append the allocs and frees from that txg to the space map. The + * pool space is only updated once all metaslabs have finished syncing. * - * To load the in-core free tree we read the space map from disk. - * This object contains a series of alloc and free records that are - * combined to make up the list of all free segments in this metaslab. These + * To load the in-core free tree we read the space map from disk. This + * object contains a series of alloc and free records that are combined + * to make up the list of all free segments in this metaslab. These * segments are represented in-core by the ms_tree and are stored in an * AVL tree. * * As the space map grows (as a result of the appends) it will - * eventually become space-inefficient. When the metaslab's in-core free tree - * is zfs_condense_pct/100 times the size of the minimal on-disk - * representation, we rewrite it in its minimized form. If a metaslab - * needs to condense then we must set the ms_condensing flag to ensure - * that allocations are not performed on the metaslab that is being written. + * eventually become space-inefficient. When the metaslab's in-core + * free tree is zfs_condense_pct/100 times the size of the minimal + * on-disk representation, we rewrite it in its minimized form. If a + * metaslab needs to condense then we must set the ms_condensing flag to + * ensure that allocations are not performed on the metaslab that is + * being written. */ struct metaslab { kmutex_t ms_lock; @@ -315,9 +319,16 @@ struct metaslab { uint64_t ms_fragmentation; range_tree_t *ms_alloctree[TXG_SIZE]; - range_tree_t *ms_freetree[TXG_SIZE]; - range_tree_t *ms_defertree[TXG_DEFER_SIZE]; range_tree_t *ms_tree; + + /* + * The following range trees are accessed only from syncing context. + * ms_free*tree only have entries while syncing, and are empty + * between syncs. + */ + range_tree_t *ms_freeingtree; /* to free this syncing txg */ + range_tree_t *ms_freedtree; /* already freed this syncing txg */ + range_tree_t *ms_defertree[TXG_DEFER_SIZE]; boolean_t ms_condensing; /* condensing? */ boolean_t ms_condense_wanted; From owner-svn-src-stable-11@freebsd.org Wed Jul 26 16:26:35 2017 Return-Path: Delivered-To: svn-src-stable-11@mailman.ysv.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by mailman.ysv.freebsd.org (Postfix) with ESMTP id BF920D98D8A; Wed, 26 Jul 2017 16:26:35 +0000 (UTC) (envelope-from mav@FreeBSD.org) Received: from repo.freebsd.org (repo.freebsd.org [IPv6:2610:1c1:1:6068::e6a:0]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (Client did not present a certificate) by mx1.freebsd.org (Postfix) with ESMTPS id 9902569924; Wed, 26 Jul 2017 16:26:35 +0000 (UTC) (envelope-from mav@FreeBSD.org) Received: from repo.freebsd.org ([127.0.1.37]) by repo.freebsd.org (8.15.2/8.15.2) with ESMTP id v6QGQYg4068199; Wed, 26 Jul 2017 16:26:34 GMT (envelope-from mav@FreeBSD.org) Received: (from mav@localhost) by repo.freebsd.org (8.15.2/8.15.2/Submit) id v6QGQYou068197; Wed, 26 Jul 2017 16:26:34 GMT (envelope-from mav@FreeBSD.org) Message-Id: <201707261626.v6QGQYou068197@repo.freebsd.org> X-Authentication-Warning: repo.freebsd.org: mav set sender to mav@FreeBSD.org using -f From: Alexander Motin Date: Wed, 26 Jul 2017 16:26:34 +0000 (UTC) To: src-committers@freebsd.org, svn-src-all@freebsd.org, svn-src-stable@freebsd.org, svn-src-stable-11@freebsd.org Subject: svn commit: r321540 - in stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs: . sys X-SVN-Group: stable-11 X-SVN-Commit-Author: mav X-SVN-Commit-Paths: in stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs: . sys X-SVN-Commit-Revision: 321540 X-SVN-Commit-Repository: base MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit X-BeenThere: svn-src-stable-11@freebsd.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: SVN commit messages for only the 11-stable src tree List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 26 Jul 2017 16:26:35 -0000 Author: mav Date: Wed Jul 26 16:26:34 2017 New Revision: 321540 URL: https://svnweb.freebsd.org/changeset/base/321540 Log: MFC r317533: MFV 316900 7743 per-vdev-zaps have no initialize path on upgrade illumos/illumos-gate@555da5111b0f2552c42d057b211aba89c9c79f6c https://github.com/illumos/illumos-gate/commit/555da5111b0f2552c42d057b211aba89c9c79f6c https://www.illumos.org/issues/7743 When loading a pool that had been created before the existance of per-vdev zaps, on a system that knows about per-vdev zaps, the per-vdev zaps will not be allocated and initialized. This appears to be because the logic that would have done so, in spa_sync_config_object(), is not reached under normal operation. It is only reached if spa_config_dirty_list is non-empty. The fix is to add another `AVZ_ACTION_` enum that will allow this code to be reached when we detect that we're loading an old pool, even when there are no dirty configs. Reviewed by: Matt Ahrens Reviewed by: Pavel Zakharov Reviewed by: George Wilson Reviewed by: Don Brady Approved by: Robert Mustacchi Author: Paul Dagnelie Modified: stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/spa.c stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/sys/spa_impl.h Directory Properties: stable/11/ (props changed) Modified: stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/spa.c ============================================================================== --- stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/spa.c Wed Jul 26 16:25:46 2017 (r321539) +++ stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/spa.c Wed Jul 26 16:26:34 2017 (r321540) @@ -2736,10 +2736,14 @@ spa_load_impl(spa_t *spa, uint64_t pool_guid, nvlist_t error = spa_dir_prop(spa, DMU_POOL_VDEV_ZAP_MAP, &spa->spa_all_vdev_zaps); - if (error != ENOENT && error != 0) { + if (error == ENOENT) { + VERIFY(!nvlist_exists(mos_config, + ZPOOL_CONFIG_HAS_PER_VDEV_ZAPS)); + spa->spa_avz_action = AVZ_ACTION_INITIALIZE; + ASSERT0(vdev_count_verify_zaps(spa->spa_root_vdev)); + } else if (error != 0) { return (spa_vdev_err(rvd, VDEV_AUX_CORRUPT_DATA, EIO)); - } else if (error == 0 && !nvlist_exists(mos_config, - ZPOOL_CONFIG_HAS_PER_VDEV_ZAPS)) { + } else if (!nvlist_exists(mos_config, ZPOOL_CONFIG_HAS_PER_VDEV_ZAPS)) { /* * An older version of ZFS overwrote the sentinel value, so * we have orphaned per-vdev ZAPs in the MOS. Defer their @@ -6503,6 +6507,7 @@ spa_sync_config_object(spa_t *spa, dmu_tx_t *tx) spa_config_enter(spa, SCL_STATE, FTAG, RW_READER); ASSERT(spa->spa_avz_action == AVZ_ACTION_NONE || + spa->spa_avz_action == AVZ_ACTION_INITIALIZE || spa->spa_all_vdev_zaps != 0); if (spa->spa_avz_action == AVZ_ACTION_REBUILD) { Modified: stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/sys/spa_impl.h ============================================================================== --- stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/sys/spa_impl.h Wed Jul 26 16:25:46 2017 (r321539) +++ stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/sys/spa_impl.h Wed Jul 26 16:26:34 2017 (r321540) @@ -120,7 +120,8 @@ typedef struct spa_taskqs { typedef enum spa_all_vdev_zap_action { AVZ_ACTION_NONE = 0, AVZ_ACTION_DESTROY, /* Destroy all per-vdev ZAPs and the AVZ. */ - AVZ_ACTION_REBUILD /* Populate the new AVZ, see spa_avz_rebuild */ + AVZ_ACTION_REBUILD, /* Populate the new AVZ, see spa_avz_rebuild */ + AVZ_ACTION_INITIALIZE } spa_avz_action_t; struct spa { From owner-svn-src-stable-11@freebsd.org Wed Jul 26 16:27:21 2017 Return-Path: Delivered-To: svn-src-stable-11@mailman.ysv.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by mailman.ysv.freebsd.org (Postfix) with ESMTP id D2DCCD98DFE; Wed, 26 Jul 2017 16:27:21 +0000 (UTC) (envelope-from mav@FreeBSD.org) Received: from repo.freebsd.org (repo.freebsd.org [IPv6:2610:1c1:1:6068::e6a:0]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (Client did not present a certificate) by mx1.freebsd.org (Postfix) with ESMTPS id AEC4469A54; Wed, 26 Jul 2017 16:27:21 +0000 (UTC) (envelope-from mav@FreeBSD.org) Received: from repo.freebsd.org ([127.0.1.37]) by repo.freebsd.org (8.15.2/8.15.2) with ESMTP id v6QGRKtX068282; Wed, 26 Jul 2017 16:27:20 GMT (envelope-from mav@FreeBSD.org) Received: (from mav@localhost) by repo.freebsd.org (8.15.2/8.15.2/Submit) id v6QGRKxT068280; Wed, 26 Jul 2017 16:27:20 GMT (envelope-from mav@FreeBSD.org) Message-Id: <201707261627.v6QGRKxT068280@repo.freebsd.org> X-Authentication-Warning: repo.freebsd.org: mav set sender to mav@FreeBSD.org using -f From: Alexander Motin Date: Wed, 26 Jul 2017 16:27:20 +0000 (UTC) To: src-committers@freebsd.org, svn-src-all@freebsd.org, svn-src-stable@freebsd.org, svn-src-stable-11@freebsd.org Subject: svn commit: r321541 - in stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs: . sys X-SVN-Group: stable-11 X-SVN-Commit-Author: mav X-SVN-Commit-Paths: in stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs: . sys X-SVN-Commit-Revision: 321541 X-SVN-Commit-Repository: base MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit X-BeenThere: svn-src-stable-11@freebsd.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: SVN commit messages for only the 11-stable src tree List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 26 Jul 2017 16:27:21 -0000 Author: mav Date: Wed Jul 26 16:27:20 2017 New Revision: 321541 URL: https://svnweb.freebsd.org/changeset/base/321541 Log: MFC r317541: MFV 316905 7740 fix for 6513 only works in hole punching case, not truncation illumos/illumos-gate@7de35a3ed0c2e6d4256bd2fb05b48b3798aaf553 https://github.com/illumos/illumos-gate/commit/7de35a3ed0c2e6d4256bd2fb05b48b3798aaf553 https://www.illumos.org/issues/7740 The problem is that dbuf_findbp will return ENOENT if the block it's trying to find is beyond the end of the file. If that happens, we assume there is no birth time, and so we lose that information when we write out new blkptrs. We should teach dbuf_findbp to look for things that are beyond the current end, but not beyond the absolute end of the file. To verify, create a large file, truncate it to a short length, and then write beyond the end. Check with zdb to make sure that there are no holes with birth time zero (will appear as gaps). Reviewed by: Steve Gonczi Reviewed by: Matthew Ahrens Approved by: Dan McDonald Author: Paul Dagnelie Modified: stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/dbuf.c stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/sys/dnode.h Directory Properties: stable/11/ (props changed) Modified: stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/dbuf.c ============================================================================== --- stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/dbuf.c Wed Jul 26 16:26:34 2017 (r321540) +++ stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/dbuf.c Wed Jul 26 16:27:20 2017 (r321541) @@ -2161,8 +2161,6 @@ static int dbuf_findbp(dnode_t *dn, int level, uint64_t blkid, int fail_sparse, dmu_buf_impl_t **parentp, blkptr_t **bpp) { - int nlevels, epbs; - *parentp = NULL; *bpp = NULL; @@ -2181,17 +2179,35 @@ dbuf_findbp(dnode_t *dn, int level, uint64_t blkid, in return (0); } - if (dn->dn_phys->dn_nlevels == 0) - nlevels = 1; - else - nlevels = dn->dn_phys->dn_nlevels; + int nlevels = + (dn->dn_phys->dn_nlevels == 0) ? 1 : dn->dn_phys->dn_nlevels; + int epbs = dn->dn_indblkshift - SPA_BLKPTRSHIFT; - epbs = dn->dn_indblkshift - SPA_BLKPTRSHIFT; - ASSERT3U(level * epbs, <, 64); ASSERT(RW_LOCK_HELD(&dn->dn_struct_rwlock)); + /* + * This assertion shouldn't trip as long as the max indirect block size + * is less than 1M. The reason for this is that up to that point, + * the number of levels required to address an entire object with blocks + * of size SPA_MINBLOCKSIZE satisfies nlevels * epbs + 1 <= 64. In + * other words, if N * epbs + 1 > 64, then if (N-1) * epbs + 1 > 55 + * (i.e. we can address the entire object), objects will all use at most + * N-1 levels and the assertion won't overflow. However, once epbs is + * 13, 4 * 13 + 1 = 53, but 5 * 13 + 1 = 66. Then, 4 levels will not be + * enough to address an entire object, so objects will have 5 levels, + * but then this assertion will overflow. + * + * All this is to say that if we ever increase DN_MAX_INDBLKSHIFT, we + * need to redo this logic to handle overflows. + */ + ASSERT(level >= nlevels || + ((nlevels - level - 1) * epbs) + + highbit64(dn->dn_phys->dn_nblkptr) <= 64); if (level >= nlevels || - (blkid > (dn->dn_phys->dn_maxblkid >> (level * epbs)))) { + blkid >= ((uint64_t)dn->dn_phys->dn_nblkptr << + ((nlevels - level - 1) * epbs)) || + (fail_sparse && + blkid > (dn->dn_phys->dn_maxblkid >> (level * epbs)))) { /* the buffer has no parent yet */ return (SET_ERROR(ENOENT)); } else if (level < nlevels-1) { @@ -2209,6 +2225,8 @@ dbuf_findbp(dnode_t *dn, int level, uint64_t blkid, in } *bpp = ((blkptr_t *)(*parentp)->db.db_data) + (blkid & ((1ULL << epbs) - 1)); + if (blkid > (dn->dn_phys->dn_maxblkid >> (level * epbs))) + ASSERT(BP_IS_HOLE(*bpp)); return (0); } else { /* the block is referenced from the dnode */ Modified: stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/sys/dnode.h ============================================================================== --- stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/sys/dnode.h Wed Jul 26 16:26:34 2017 (r321540) +++ stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/sys/dnode.h Wed Jul 26 16:27:20 2017 (r321541) @@ -58,6 +58,12 @@ extern "C" { */ #define DNODE_SHIFT 9 /* 512 bytes */ #define DN_MIN_INDBLKSHIFT 12 /* 4k */ +/* + * If we ever increase this value beyond 20, we need to revisit all logic that + * does x << level * ebps to handle overflow. With a 1M indirect block size, + * 4 levels of indirect blocks would not be able to guarantee addressing an + * entire object, so 5 levels will be used, but 5 * (20 - 7) = 65. + */ #define DN_MAX_INDBLKSHIFT 17 /* 128k */ #define DNODE_BLOCK_SHIFT 14 /* 16k */ #define DNODE_CORE_SIZE 64 /* 64 bytes for dnode sans blkptrs */ From owner-svn-src-stable-11@freebsd.org Wed Jul 26 16:28:06 2017 Return-Path: Delivered-To: svn-src-stable-11@mailman.ysv.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by mailman.ysv.freebsd.org (Postfix) with ESMTP id 9C7B4D98E8C; Wed, 26 Jul 2017 16:28:06 +0000 (UTC) (envelope-from mav@FreeBSD.org) Received: from repo.freebsd.org (repo.freebsd.org [IPv6:2610:1c1:1:6068::e6a:0]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (Client did not present a certificate) by mx1.freebsd.org (Postfix) with ESMTPS id 689F569B8F; Wed, 26 Jul 2017 16:28:06 +0000 (UTC) (envelope-from mav@FreeBSD.org) Received: from repo.freebsd.org ([127.0.1.37]) by repo.freebsd.org (8.15.2/8.15.2) with ESMTP id v6QGS5OH068370; Wed, 26 Jul 2017 16:28:05 GMT (envelope-from mav@FreeBSD.org) Received: (from mav@localhost) by repo.freebsd.org (8.15.2/8.15.2/Submit) id v6QGS5NU068369; Wed, 26 Jul 2017 16:28:05 GMT (envelope-from mav@FreeBSD.org) Message-Id: <201707261628.v6QGS5NU068369@repo.freebsd.org> X-Authentication-Warning: repo.freebsd.org: mav set sender to mav@FreeBSD.org using -f From: Alexander Motin Date: Wed, 26 Jul 2017 16:28:05 +0000 (UTC) To: src-committers@freebsd.org, svn-src-all@freebsd.org, svn-src-stable@freebsd.org, svn-src-stable-11@freebsd.org Subject: svn commit: r321542 - stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs X-SVN-Group: stable-11 X-SVN-Commit-Author: mav X-SVN-Commit-Paths: stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs X-SVN-Commit-Revision: 321542 X-SVN-Commit-Repository: base MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit X-BeenThere: svn-src-stable-11@freebsd.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: SVN commit messages for only the 11-stable src tree List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 26 Jul 2017 16:28:06 -0000 Author: mav Date: Wed Jul 26 16:28:05 2017 New Revision: 321542 URL: https://svnweb.freebsd.org/changeset/base/321542 Log: MFC r317648: Fix misport of compressed ZFS send/recv from 317414 Reported by: Michael Jung Modified: stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/zio.c Directory Properties: stable/11/ (props changed) Modified: stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/zio.c ============================================================================== --- stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/zio.c Wed Jul 26 16:27:20 2017 (r321541) +++ stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/zio.c Wed Jul 26 16:28:05 2017 (r321542) @@ -962,7 +962,7 @@ zio_free_sync(zio_t *pio, spa_t *spa, uint64_t txg, co flags |= ZIO_FLAG_DONT_QUEUE; zio = zio_create(pio, spa, txg, bp, NULL, size, - BP_GET_PSIZE(bp), NULL, NULL, ZIO_TYPE_FREE, ZIO_PRIORITY_NOW, + size, NULL, NULL, ZIO_TYPE_FREE, ZIO_PRIORITY_NOW, flags, NULL, 0, NULL, ZIO_STAGE_OPEN, stage); return (zio); From owner-svn-src-stable-11@freebsd.org Wed Jul 26 16:30:10 2017 Return-Path: Delivered-To: svn-src-stable-11@mailman.ysv.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by mailman.ysv.freebsd.org (Postfix) with ESMTP id D339BD98F9E; Wed, 26 Jul 2017 16:30:10 +0000 (UTC) (envelope-from mav@FreeBSD.org) Received: from repo.freebsd.org (repo.freebsd.org [IPv6:2610:1c1:1:6068::e6a:0]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (Client did not present a certificate) by mx1.freebsd.org (Postfix) with ESMTPS id 9C52B69D33; Wed, 26 Jul 2017 16:30:10 +0000 (UTC) (envelope-from mav@FreeBSD.org) Received: from repo.freebsd.org ([127.0.1.37]) by repo.freebsd.org (8.15.2/8.15.2) with ESMTP id v6QGU90K068517; Wed, 26 Jul 2017 16:30:09 GMT (envelope-from mav@FreeBSD.org) Received: (from mav@localhost) by repo.freebsd.org (8.15.2/8.15.2/Submit) id v6QGU9TN068516; Wed, 26 Jul 2017 16:30:09 GMT (envelope-from mav@FreeBSD.org) Message-Id: <201707261630.v6QGU9TN068516@repo.freebsd.org> X-Authentication-Warning: repo.freebsd.org: mav set sender to mav@FreeBSD.org using -f From: Alexander Motin Date: Wed, 26 Jul 2017 16:30:09 +0000 (UTC) To: src-committers@freebsd.org, svn-src-all@freebsd.org, svn-src-stable@freebsd.org, svn-src-stable-11@freebsd.org Subject: svn commit: r321543 - stable/11/cddl/contrib/opensolaris/cmd/zdb X-SVN-Group: stable-11 X-SVN-Commit-Author: mav X-SVN-Commit-Paths: stable/11/cddl/contrib/opensolaris/cmd/zdb X-SVN-Commit-Revision: 321543 X-SVN-Commit-Repository: base MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit X-BeenThere: svn-src-stable-11@freebsd.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: SVN commit messages for only the 11-stable src tree List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 26 Jul 2017 16:30:10 -0000 Author: mav Date: Wed Jul 26 16:30:09 2017 New Revision: 321543 URL: https://svnweb.freebsd.org/changeset/base/321543 Log: MFC r318812: MFV r316860: 7545 zdb should disable reference tracking illumos/illumos-gate@4dd77f9e38ef05b39db128ff7608d926fd3218c6 https://github.com/illumos/illumos-gate/commit/4dd77f9e38ef05b39db128ff7608d926fd3218c6 https://www.illumos.org/issues/7545 When evicting from the ARC, we manipulate some refcount_t's, e.g. arcs_size. When using zdb to examine a large amount of data (e.g. zdb -bb on a large pool with small blocks), the ARC may have a large number of entries. If reference tracking is enabled, there will be ~1 reference for each block in the ARC. When evicting, we decrement the refcount and have to search all the references to find the one that we are removing, which is very slow. Since zdb is typically used to find problems with the on-disk format, and not with the code it is running, we should disable reference tracking in zdb. Reviewed by: Dan Kimmel Reviewed by: Steve Gonczi Reviewed by: George Wilson Approved by: Robert Mustacchi Author: Matthew Ahrens Modified: stable/11/cddl/contrib/opensolaris/cmd/zdb/zdb.c Directory Properties: stable/11/ (props changed) Modified: stable/11/cddl/contrib/opensolaris/cmd/zdb/zdb.c ============================================================================== --- stable/11/cddl/contrib/opensolaris/cmd/zdb/zdb.c Wed Jul 26 16:28:05 2017 (r321542) +++ stable/11/cddl/contrib/opensolaris/cmd/zdb/zdb.c Wed Jul 26 16:30:09 2017 (r321543) @@ -75,10 +75,12 @@ DMU_OT_ZAP_OTHER : DMU_OT_NUMTYPES)) #ifndef lint +extern int reference_tracking_enable; extern boolean_t zfs_recover; extern uint64_t zfs_arc_max, zfs_arc_meta_limit; extern int zfs_vdev_async_read_max_active; #else +int reference_tracking_enable; boolean_t zfs_recover; uint64_t zfs_arc_max, zfs_arc_meta_limit; int zfs_vdev_async_read_max_active; @@ -3695,6 +3697,11 @@ main(int argc, char **argv) * For good performance, let several of them be active at once. */ zfs_vdev_async_read_max_active = 10; + + /* + * Disable reference tracking for better performance. + */ + reference_tracking_enable = B_FALSE; kernel_init(FREAD); g_zfs = libzfs_init(); From owner-svn-src-stable-11@freebsd.org Wed Jul 26 16:30:59 2017 Return-Path: Delivered-To: svn-src-stable-11@mailman.ysv.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by mailman.ysv.freebsd.org (Postfix) with ESMTP id 1DD17DA9033; Wed, 26 Jul 2017 16:30:59 +0000 (UTC) (envelope-from mav@FreeBSD.org) Received: from repo.freebsd.org (repo.freebsd.org [IPv6:2610:1c1:1:6068::e6a:0]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (Client did not present a certificate) by mx1.freebsd.org (Postfix) with ESMTPS id DE39A69EF6; Wed, 26 Jul 2017 16:30:58 +0000 (UTC) (envelope-from mav@FreeBSD.org) Received: from repo.freebsd.org ([127.0.1.37]) by repo.freebsd.org (8.15.2/8.15.2) with ESMTP id v6QGUwge068595; Wed, 26 Jul 2017 16:30:58 GMT (envelope-from mav@FreeBSD.org) Received: (from mav@localhost) by repo.freebsd.org (8.15.2/8.15.2/Submit) id v6QGUwuV068594; Wed, 26 Jul 2017 16:30:58 GMT (envelope-from mav@FreeBSD.org) Message-Id: <201707261630.v6QGUwuV068594@repo.freebsd.org> X-Authentication-Warning: repo.freebsd.org: mav set sender to mav@FreeBSD.org using -f From: Alexander Motin Date: Wed, 26 Jul 2017 16:30:58 +0000 (UTC) To: src-committers@freebsd.org, svn-src-all@freebsd.org, svn-src-stable@freebsd.org, svn-src-stable-11@freebsd.org Subject: svn commit: r321544 - stable/11/cddl/contrib/opensolaris/lib/libzfs_core/common X-SVN-Group: stable-11 X-SVN-Commit-Author: mav X-SVN-Commit-Paths: stable/11/cddl/contrib/opensolaris/lib/libzfs_core/common X-SVN-Commit-Revision: 321544 X-SVN-Commit-Repository: base MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit X-BeenThere: svn-src-stable-11@freebsd.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: SVN commit messages for only the 11-stable src tree List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 26 Jul 2017 16:30:59 -0000 Author: mav Date: Wed Jul 26 16:30:57 2017 New Revision: 321544 URL: https://svnweb.freebsd.org/changeset/base/321544 Log: MFC r318814: MFC r316904: 7729 libzfs_core`lzc_rollback() leaks result nvl illumos/illumos-gate@ac428481f96be89add7a1edf43ae47dd71038553 https://github.com/illumos/illumos-gate/commit/ac428481f96be89add7a1edf43ae47dd71038553 https://www.illumos.org/issues/7729 libzfs_core`lzc_rollback() doesn't free the result nvl after lzc_ioctl() call. Reviewed by: Matthew Ahrens Reviewed by: Prakash Surya Approved by: Dan McDonald Author: Yuri Pankov Modified: stable/11/cddl/contrib/opensolaris/lib/libzfs_core/common/libzfs_core.c Directory Properties: stable/11/ (props changed) Modified: stable/11/cddl/contrib/opensolaris/lib/libzfs_core/common/libzfs_core.c ============================================================================== --- stable/11/cddl/contrib/opensolaris/lib/libzfs_core/common/libzfs_core.c Wed Jul 26 16:30:09 2017 (r321543) +++ stable/11/cddl/contrib/opensolaris/lib/libzfs_core/common/libzfs_core.c Wed Jul 26 16:30:57 2017 (r321544) @@ -759,6 +759,8 @@ lzc_rollback(const char *fsname, char *snapnamebuf, in const char *snapname = fnvlist_lookup_string(result, "target"); (void) strlcpy(snapnamebuf, snapname, snapnamelen); } + nvlist_free(result); + return (err); } From owner-svn-src-stable-11@freebsd.org Wed Jul 26 16:32:19 2017 Return-Path: Delivered-To: svn-src-stable-11@mailman.ysv.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by mailman.ysv.freebsd.org (Postfix) with ESMTP id 6C8BEDA9227; Wed, 26 Jul 2017 16:32:19 +0000 (UTC) (envelope-from mav@FreeBSD.org) Received: from repo.freebsd.org (repo.freebsd.org [IPv6:2610:1c1:1:6068::e6a:0]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (Client did not present a certificate) by mx1.freebsd.org (Postfix) with ESMTPS id 2113C6A20A; Wed, 26 Jul 2017 16:32:19 +0000 (UTC) (envelope-from mav@FreeBSD.org) Received: from repo.freebsd.org ([127.0.1.37]) by repo.freebsd.org (8.15.2/8.15.2) with ESMTP id v6QGWIlu070202; Wed, 26 Jul 2017 16:32:18 GMT (envelope-from mav@FreeBSD.org) Received: (from mav@localhost) by repo.freebsd.org (8.15.2/8.15.2/Submit) id v6QGWHM9070193; Wed, 26 Jul 2017 16:32:17 GMT (envelope-from mav@FreeBSD.org) Message-Id: <201707261632.v6QGWHM9070193@repo.freebsd.org> X-Authentication-Warning: repo.freebsd.org: mav set sender to mav@FreeBSD.org using -f From: Alexander Motin Date: Wed, 26 Jul 2017 16:32:17 +0000 (UTC) To: src-committers@freebsd.org, svn-src-all@freebsd.org, svn-src-stable@freebsd.org, svn-src-stable-11@freebsd.org Subject: svn commit: r321545 - in stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs: . sys X-SVN-Group: stable-11 X-SVN-Commit-Author: mav X-SVN-Commit-Paths: in stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs: . sys X-SVN-Commit-Revision: 321545 X-SVN-Commit-Repository: base MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit X-BeenThere: svn-src-stable-11@freebsd.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: SVN commit messages for only the 11-stable src tree List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 26 Jul 2017 16:32:19 -0000 Author: mav Date: Wed Jul 26 16:32:17 2017 New Revision: 321545 URL: https://svnweb.freebsd.org/changeset/base/321545 Log: MFC r318818: MFV r316907: 1300 filename normalization doesn't work for removes illumos/illumos-gate@1c17160ac558f98048951327f4e9248d8f46acc0 https://github.com/illumos/illumos-gate/commit/1c17160ac558f98048951327f4e9248d8f46acc0 https://www.illumos.org/issues/1300 FreeBSD note: recent FreeBSD was not affected by the issue fixed as the name cache is completely bypassed when normalization is enabled. The change is imported for the sake of ZAP infrastructure modifications. Reviewed by: Yuri Pankov Reviewed by: Pavel Zakharov Reviewed by: Matt Ahrens Approved by: Dan McDonald Author: Kevin Crowe Modified: stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/dmu_objset.c stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/dsl_bookmark.c stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/dsl_dataset.c stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/sys/zap.h stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/sys/zap_impl.h stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/zap_leaf.c stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/zap_micro.c stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/zfs_dir.c stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/zfs_vnops.c Directory Properties: stable/11/ (props changed) Modified: stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/dmu_objset.c ============================================================================== --- stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/dmu_objset.c Wed Jul 26 16:30:57 2017 (r321544) +++ stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/dmu_objset.c Wed Jul 26 16:32:17 2017 (r321545) @@ -18,15 +18,16 @@ * * CDDL HEADER END */ + /* * Copyright (c) 2005, 2010, Oracle and/or its affiliates. All rights reserved. * Copyright (c) 2012, 2016 by Delphix. All rights reserved. * Copyright (c) 2013 by Saso Kiselkov. All rights reserved. * Copyright (c) 2013, Joyent, Inc. All rights reserved. * Copyright (c) 2014 Spectra Logic Corporation, All rights reserved. - * Copyright 2015 Nexenta Systems, Inc. All rights reserved. * Copyright (c) 2015, STRATO AG, Inc. All rights reserved. * Copyright (c) 2014 Integros [integros.com] + * Copyright 2017 Nexenta Systems, Inc. */ /* Portions Copyright 2010 Robert Milkowski */ @@ -1622,7 +1623,7 @@ dmu_snapshot_realname(objset_t *os, char *name, char * return (zap_lookup_norm(ds->ds_dir->dd_pool->dp_meta_objset, dsl_dataset_phys(ds)->ds_snapnames_zapobj, name, 8, 1, &ignored, - MT_FIRST, real, maxlen, conflict)); + MT_NORMALIZE, real, maxlen, conflict)); } int Modified: stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/dsl_bookmark.c ============================================================================== --- stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/dsl_bookmark.c Wed Jul 26 16:30:57 2017 (r321544) +++ stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/dsl_bookmark.c Wed Jul 26 16:32:17 2017 (r321545) @@ -12,8 +12,10 @@ * * CDDL HEADER END */ + /* * Copyright (c) 2013, 2014 by Delphix. All rights reserved. + * Copyright 2017 Nexenta Systems, Inc. */ #include @@ -59,16 +61,14 @@ dsl_dataset_bmark_lookup(dsl_dataset_t *ds, const char { objset_t *mos = ds->ds_dir->dd_pool->dp_meta_objset; uint64_t bmark_zapobj = ds->ds_bookmarks; - matchtype_t mt; + matchtype_t mt = 0; int err; if (bmark_zapobj == 0) return (SET_ERROR(ESRCH)); if (dsl_dataset_phys(ds)->ds_flags & DS_FLAG_CI_DATASET) - mt = MT_FIRST; - else - mt = MT_EXACT; + mt = MT_NORMALIZE; err = zap_lookup_norm(mos, bmark_zapobj, shortname, sizeof (uint64_t), sizeof (*bmark_phys) / sizeof (uint64_t), bmark_phys, mt, @@ -339,12 +339,10 @@ dsl_dataset_bookmark_remove(dsl_dataset_t *ds, const c { objset_t *mos = ds->ds_dir->dd_pool->dp_meta_objset; uint64_t bmark_zapobj = ds->ds_bookmarks; - matchtype_t mt; + matchtype_t mt = 0; if (dsl_dataset_phys(ds)->ds_flags & DS_FLAG_CI_DATASET) - mt = MT_FIRST; - else - mt = MT_EXACT; + mt = MT_NORMALIZE; return (zap_remove_norm(mos, bmark_zapobj, name, mt, tx)); } Modified: stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/dsl_dataset.c ============================================================================== --- stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/dsl_dataset.c Wed Jul 26 16:30:57 2017 (r321544) +++ stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/dsl_dataset.c Wed Jul 26 16:32:17 2017 (r321545) @@ -18,6 +18,7 @@ * * CDDL HEADER END */ + /* * Copyright (c) 2005, 2010, Oracle and/or its affiliates. All rights reserved. * Portions Copyright (c) 2011 Martin Matuska @@ -27,6 +28,7 @@ * Copyright (c) 2014 Spectra Logic Corporation, All rights reserved. * Copyright (c) 2014 Integros [integros.com] * Copyright 2016, OmniTI Computer Consulting, Inc. All rights reserved. + * Copyright 2017 Nexenta Systems, Inc. */ #include @@ -364,17 +366,15 @@ dsl_dataset_snap_lookup(dsl_dataset_t *ds, const char { objset_t *mos = ds->ds_dir->dd_pool->dp_meta_objset; uint64_t snapobj = dsl_dataset_phys(ds)->ds_snapnames_zapobj; - matchtype_t mt; + matchtype_t mt = 0; int err; if (dsl_dataset_phys(ds)->ds_flags & DS_FLAG_CI_DATASET) - mt = MT_FIRST; - else - mt = MT_EXACT; + mt = MT_NORMALIZE; err = zap_lookup_norm(mos, snapobj, name, 8, 1, value, mt, NULL, 0, NULL); - if (err == ENOTSUP && mt == MT_FIRST) + if (err == ENOTSUP && (mt & MT_NORMALIZE)) err = zap_lookup(mos, snapobj, name, 8, 1, value); return (err); } @@ -385,18 +385,16 @@ dsl_dataset_snap_remove(dsl_dataset_t *ds, const char { objset_t *mos = ds->ds_dir->dd_pool->dp_meta_objset; uint64_t snapobj = dsl_dataset_phys(ds)->ds_snapnames_zapobj; - matchtype_t mt; + matchtype_t mt = 0; int err; dsl_dir_snap_cmtime_update(ds->ds_dir); if (dsl_dataset_phys(ds)->ds_flags & DS_FLAG_CI_DATASET) - mt = MT_FIRST; - else - mt = MT_EXACT; + mt = MT_NORMALIZE; err = zap_remove_norm(mos, snapobj, name, mt, tx); - if (err == ENOTSUP && mt == MT_FIRST) + if (err == ENOTSUP && (mt & MT_NORMALIZE)) err = zap_remove(mos, snapobj, name, tx); if (err == 0 && adj_cnt) Modified: stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/sys/zap.h ============================================================================== --- stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/sys/zap.h Wed Jul 26 16:30:57 2017 (r321544) +++ stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/sys/zap.h Wed Jul 26 16:32:17 2017 (r321545) @@ -18,9 +18,11 @@ * * CDDL HEADER END */ + /* * Copyright (c) 2005, 2010, Oracle and/or its affiliates. All rights reserved. * Copyright (c) 2012, 2016 by Delphix. All rights reserved. + * Copyright 2017 Nexenta Systems, Inc. */ #ifndef _SYS_ZAP_H @@ -88,22 +90,15 @@ extern "C" { /* * Specifies matching criteria for ZAP lookups. + * MT_NORMALIZE Use ZAP normalization flags, which can include both + * unicode normalization and case-insensitivity. + * MT_MATCH_CASE Do case-sensitive lookups even if MT_NORMALIZE is + * specified and ZAP normalization flags include + * U8_TEXTPREP_TOUPPER. */ -typedef enum matchtype -{ - /* Only find an exact match (non-normalized) */ - MT_EXACT, - /* - * If there is an exact match, find that, otherwise find the - * first normalized match. - */ - MT_BEST, - /* - * Find the "first" normalized (case and Unicode form) match; - * the designated "first" match will not change as long as the - * set of entries with this normalization doesn't change. - */ - MT_FIRST +typedef enum matchtype { + MT_NORMALIZE = 1 << 0, + MT_MATCH_CASE = 1 << 1, } matchtype_t; typedef enum zap_flags { @@ -120,16 +115,6 @@ typedef enum zap_flags { /* * Create a new zapobj with no attributes and return its object number. - * MT_EXACT will cause the zap object to only support MT_EXACT lookups, - * otherwise any matchtype can be used for lookups. - * - * normflags specifies what normalization will be done. values are: - * 0: no normalization (legacy on-disk format, supports MT_EXACT matching - * only) - * U8_TEXTPREP_TOLOWER: case normalization will be performed. - * MT_FIRST/MT_BEST matching will find entries that match without - * regard to case (eg. looking for "foo" can find an entry "Foo"). - * Eventually, other flags will permit unicode normalization as well. */ uint64_t zap_create(objset_t *ds, dmu_object_type_t ot, dmu_object_type_t bonustype, int bonuslen, dmu_tx_t *tx); Modified: stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/sys/zap_impl.h ============================================================================== --- stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/sys/zap_impl.h Wed Jul 26 16:30:57 2017 (r321544) +++ stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/sys/zap_impl.h Wed Jul 26 16:32:17 2017 (r321545) @@ -18,11 +18,13 @@ * * CDDL HEADER END */ + /* * Copyright (c) 2005, 2010, Oracle and/or its affiliates. All rights reserved. * Copyright (c) 2013, 2016 by Delphix. All rights reserved. * Copyright (c) 2014 Spectra Logic Corporation, All rights reserved. * Copyright (c) 2014 Integros [integros.com] + * Copyright 2017 Nexenta Systems, Inc. */ #ifndef _SYS_ZAP_IMPL_H @@ -189,6 +191,7 @@ typedef struct zap_name { int zn_key_norm_numints; uint64_t zn_hash; matchtype_t zn_matchtype; + int zn_normflags; char zn_normbuf[ZAP_MAXNAMELEN]; } zap_name_t; Modified: stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/zap_leaf.c ============================================================================== --- stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/zap_leaf.c Wed Jul 26 16:30:57 2017 (r321544) +++ stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/zap_leaf.c Wed Jul 26 16:32:17 2017 (r321545) @@ -18,9 +18,11 @@ * * CDDL HEADER END */ + /* * Copyright (c) 2005, 2010, Oracle and/or its affiliates. All rights reserved. * Copyright (c) 2013, 2015 by Delphix. All rights reserved. + * Copyright 2017 Nexenta Systems, Inc. */ /* @@ -361,7 +363,7 @@ zap_leaf_array_match(zap_leaf_t *l, zap_name_t *zn, } ASSERT(zn->zn_key_intlen == 1); - if (zn->zn_matchtype == MT_FIRST) { + if (zn->zn_matchtype & MT_NORMALIZE) { char *thisname = kmem_alloc(array_numints, KM_SLEEP); boolean_t match; @@ -403,7 +405,6 @@ zap_leaf_lookup(zap_leaf_t *l, zap_name_t *zn, zap_ent ASSERT3U(zap_leaf_phys(l)->l_hdr.lh_magic, ==, ZAP_LEAF_MAGIC); -again: for (chunkp = LEAF_HASH_ENTPTR(l, zn->zn_hash); *chunkp != CHAIN_END; chunkp = &le->le_next) { uint16_t chunk = *chunkp; @@ -418,9 +419,9 @@ again: /* * NB: the entry chain is always sorted by cd on * normalized zap objects, so this will find the - * lowest-cd match for MT_FIRST. + * lowest-cd match for MT_NORMALIZE. */ - ASSERT(zn->zn_matchtype == MT_EXACT || + ASSERT((zn->zn_matchtype == 0) || (zap_leaf_phys(l)->l_hdr.lh_flags & ZLF_ENTRIES_CDSORTED)); if (zap_leaf_array_match(l, zn, le->le_name_chunk, le->le_name_numints)) { @@ -434,15 +435,6 @@ again: } } - /* - * NB: we could of course do this in one pass, but that would be - * a pain. We'll see if MT_BEST is even used much. - */ - if (zn->zn_matchtype == MT_BEST) { - zn->zn_matchtype = MT_FIRST; - goto again; - } - return (SET_ERROR(ENOENT)); } @@ -697,7 +689,7 @@ zap_entry_normalization_conflict(zap_entry_handle_t *z continue; if (zn == NULL) { - zn = zap_name_alloc(zap, name, MT_FIRST); + zn = zap_name_alloc(zap, name, MT_NORMALIZE); allocdzn = B_TRUE; } if (zap_leaf_array_match(zeh->zeh_leaf, zn, Modified: stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/zap_micro.c ============================================================================== --- stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/zap_micro.c Wed Jul 26 16:30:57 2017 (r321544) +++ stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/zap_micro.c Wed Jul 26 16:32:17 2017 (r321545) @@ -18,11 +18,13 @@ * * CDDL HEADER END */ + /* * Copyright (c) 2005, 2010, Oracle and/or its affiliates. All rights reserved. * Copyright (c) 2011, 2016 by Delphix. All rights reserved. * Copyright (c) 2014 Spectra Logic Corporation, All rights reserved. * Copyright (c) 2014 Integros [integros.com] + * Copyright 2017 Nexenta Systems, Inc. */ #include @@ -133,7 +135,7 @@ zap_hash(zap_name_t *zn) } static int -zap_normalize(zap_t *zap, const char *name, char *namenorm) +zap_normalize(zap_t *zap, const char *name, char *namenorm, int normflags) { size_t inlen, outlen; int err; @@ -145,8 +147,8 @@ zap_normalize(zap_t *zap, const char *name, char *name err = 0; (void) u8_textprep_str((char *)name, &inlen, namenorm, &outlen, - zap->zap_normflags | U8_TEXTPREP_IGNORE_NULL | - U8_TEXTPREP_IGNORE_INVALID, U8_UNICODE_LATEST, &err); + normflags | U8_TEXTPREP_IGNORE_NULL | U8_TEXTPREP_IGNORE_INVALID, + U8_UNICODE_LATEST, &err); return (err); } @@ -156,15 +158,15 @@ zap_match(zap_name_t *zn, const char *matchname) { ASSERT(!(zap_getflags(zn->zn_zap) & ZAP_FLAG_UINT64_KEY)); - if (zn->zn_matchtype == MT_FIRST) { + if (zn->zn_matchtype & MT_NORMALIZE) { char norm[ZAP_MAXNAMELEN]; - if (zap_normalize(zn->zn_zap, matchname, norm) != 0) + if (zap_normalize(zn->zn_zap, matchname, norm, + zn->zn_normflags) != 0) return (B_FALSE); return (strcmp(zn->zn_key_norm, norm) == 0); } else { - /* MT_BEST or MT_EXACT */ return (strcmp(zn->zn_key_orig, matchname) == 0); } } @@ -185,15 +187,30 @@ zap_name_alloc(zap_t *zap, const char *key, matchtype_ zn->zn_key_orig = key; zn->zn_key_orig_numints = strlen(zn->zn_key_orig) + 1; zn->zn_matchtype = mt; + zn->zn_normflags = zap->zap_normflags; + + /* + * If we're dealing with a case sensitive lookup on a mixed or + * insensitive fs, remove U8_TEXTPREP_TOUPPER or the lookup + * will fold case to all caps overriding the lookup request. + */ + if (mt & MT_MATCH_CASE) + zn->zn_normflags &= ~U8_TEXTPREP_TOUPPER; + if (zap->zap_normflags) { - if (zap_normalize(zap, key, zn->zn_normbuf) != 0) { + /* + * We *must* use zap_normflags because this normalization is + * what the hash is computed from. + */ + if (zap_normalize(zap, key, zn->zn_normbuf, + zap->zap_normflags) != 0) { zap_name_free(zn); return (NULL); } zn->zn_key_norm = zn->zn_normbuf; zn->zn_key_norm_numints = strlen(zn->zn_key_norm) + 1; } else { - if (mt != MT_EXACT) { + if (mt != 0) { zap_name_free(zn); return (NULL); } @@ -202,6 +219,20 @@ zap_name_alloc(zap_t *zap, const char *key, matchtype_ } zn->zn_hash = zap_hash(zn); + + if (zap->zap_normflags != zn->zn_normflags) { + /* + * We *must* use zn_normflags because this normalization is + * what the matching is based on. (Not the hash!) + */ + if (zap_normalize(zap, key, zn->zn_normbuf, + zn->zn_normflags) != 0) { + zap_name_free(zn); + return (NULL); + } + zn->zn_key_norm_numints = strlen(zn->zn_key_norm) + 1; + } + return (zn); } @@ -215,7 +246,7 @@ zap_name_alloc_uint64(zap_t *zap, const uint64_t *key, zn->zn_key_intlen = sizeof (*key); zn->zn_key_orig = zn->zn_key_norm = key; zn->zn_key_orig_numints = zn->zn_key_norm_numints = numints; - zn->zn_matchtype = MT_EXACT; + zn->zn_matchtype = 0; zn->zn_hash = zap_hash(zn); return (zn); @@ -305,7 +336,6 @@ mze_find(zap_name_t *zn) mze_tofind.mze_hash = zn->zn_hash; mze_tofind.mze_cd = 0; -again: mze = avl_find(avl, &mze_tofind, &idx); if (mze == NULL) mze = avl_nearest(avl, idx, AVL_AFTER); @@ -314,10 +344,7 @@ again: if (zap_match(zn, MZE_PHYS(zn->zn_zap, mze)->mze_name)) return (mze); } - if (zn->zn_matchtype == MT_BEST) { - zn->zn_matchtype = MT_FIRST; - goto again; - } + return (NULL); } @@ -422,8 +449,7 @@ mzap_open(objset_t *os, uint64_t obj, dmu_buf_t *db) if (mze->mze_name[0]) { zap_name_t *zn; - zn = zap_name_alloc(zap, mze->mze_name, - MT_EXACT); + zn = zap_name_alloc(zap, mze->mze_name, 0); if (mze_insert(zap, i, zn->zn_hash) == 0) zap->zap_m.zap_num_entries++; else { @@ -629,7 +655,7 @@ mzap_upgrade(zap_t **zapp, void *tag, dmu_tx_t *tx, za continue; dprintf("adding %s=%llu\n", mze->mze_name, mze->mze_value); - zn = zap_name_alloc(zap, mze->mze_name, MT_EXACT); + zn = zap_name_alloc(zap, mze->mze_name, 0); err = fzap_add_cd(zn, 8, 1, &mze->mze_value, mze->mze_cd, tag, tx); zap = zn->zn_zap; /* fzap_add_cd() may change zap */ @@ -642,6 +668,23 @@ mzap_upgrade(zap_t **zapp, void *tag, dmu_tx_t *tx, za return (err); } +/* + * The "normflags" determine the behavior of the matchtype_t which is + * passed to zap_lookup_norm(). Names which have the same normalized + * version will be stored with the same hash value, and therefore we can + * perform normalization-insensitive lookups. We can be Unicode form- + * insensitive and/or case-insensitive. The following flags are valid for + * "normflags": + * + * U8_TEXTPREP_NFC + * U8_TEXTPREP_NFD + * U8_TEXTPREP_NFKC + * U8_TEXTPREP_NFKD + * U8_TEXTPREP_TOUPPER + * + * The *_NF* (Normalization Form) flags are mutually exclusive; at most one + * of them may be supplied. + */ void mzap_create_impl(objset_t *os, uint64_t obj, int normflags, zap_flags_t flags, dmu_tx_t *tx) @@ -800,7 +843,7 @@ again: if (zn == NULL) { zn = zap_name_alloc(zap, MZE_PHYS(zap, mze)->mze_name, - MT_FIRST); + MT_NORMALIZE); allocdzn = B_TRUE; } if (zap_match(zn, MZE_PHYS(zap, other)->mze_name)) { @@ -829,7 +872,7 @@ zap_lookup(objset_t *os, uint64_t zapobj, const char * uint64_t integer_size, uint64_t num_integers, void *buf) { return (zap_lookup_norm(os, zapobj, name, integer_size, - num_integers, buf, MT_EXACT, NULL, 0, NULL)); + num_integers, buf, 0, NULL, 0, NULL)); } static int @@ -897,7 +940,7 @@ zap_lookup_by_dnode(dnode_t *dn, const char *name, uint64_t integer_size, uint64_t num_integers, void *buf) { return (zap_lookup_norm_by_dnode(dn, name, integer_size, - num_integers, buf, MT_EXACT, NULL, 0, NULL)); + num_integers, buf, 0, NULL, 0, NULL)); } int @@ -970,7 +1013,7 @@ int zap_contains(objset_t *os, uint64_t zapobj, const char *name) { int err = zap_lookup_norm(os, zapobj, name, 0, - 0, NULL, MT_EXACT, NULL, 0, NULL); + 0, NULL, 0, NULL, 0, NULL); if (err == EOVERFLOW || err == EINVAL) err = 0; /* found, but skipped reading the value */ return (err); @@ -988,7 +1031,7 @@ zap_length(objset_t *os, uint64_t zapobj, const char * err = zap_lockdir(os, zapobj, NULL, RW_READER, TRUE, FALSE, FTAG, &zap); if (err) return (err); - zn = zap_name_alloc(zap, name, MT_EXACT); + zn = zap_name_alloc(zap, name, 0); if (zn == NULL) { zap_unlockdir(zap, FTAG); return (SET_ERROR(ENOTSUP)); @@ -1091,7 +1134,7 @@ zap_add(objset_t *os, uint64_t zapobj, const char *key err = zap_lockdir(os, zapobj, tx, RW_WRITER, TRUE, TRUE, FTAG, &zap); if (err) return (err); - zn = zap_name_alloc(zap, key, MT_EXACT); + zn = zap_name_alloc(zap, key, 0); if (zn == NULL) { zap_unlockdir(zap, FTAG); return (SET_ERROR(ENOTSUP)); @@ -1170,7 +1213,7 @@ zap_update(objset_t *os, uint64_t zapobj, const char * err = zap_lockdir(os, zapobj, tx, RW_WRITER, TRUE, TRUE, FTAG, &zap); if (err) return (err); - zn = zap_name_alloc(zap, name, MT_EXACT); + zn = zap_name_alloc(zap, name, 0); if (zn == NULL) { zap_unlockdir(zap, FTAG); return (SET_ERROR(ENOTSUP)); @@ -1233,7 +1276,7 @@ zap_update_uint64(objset_t *os, uint64_t zapobj, const int zap_remove(objset_t *os, uint64_t zapobj, const char *name, dmu_tx_t *tx) { - return (zap_remove_norm(os, zapobj, name, MT_EXACT, tx)); + return (zap_remove_norm(os, zapobj, name, 0, tx)); } int @@ -1525,7 +1568,7 @@ zap_count_write_by_dnode(dnode_t *dn, const char *name return (err); if (!zap->zap_ismicro) { - zap_name_t *zn = zap_name_alloc(zap, name, MT_EXACT); + zap_name_t *zn = zap_name_alloc(zap, name, 0); if (zn) { err = fzap_count_write(zn, add, towrite, tooverwrite); Modified: stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/zfs_dir.c ============================================================================== --- stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/zfs_dir.c Wed Jul 26 16:30:57 2017 (r321544) +++ stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/zfs_dir.c Wed Jul 26 16:32:17 2017 (r321545) @@ -18,9 +18,11 @@ * * CDDL HEADER END */ + /* * Copyright (c) 2005, 2010, Oracle and/or its affiliates. All rights reserved. * Copyright (c) 2013, 2015 by Delphix. All rights reserved. + * Copyright 2017 Nexenta Systems, Inc. */ #include @@ -63,12 +65,11 @@ */ static int zfs_match_find(zfsvfs_t *zfsvfs, znode_t *dzp, const char *name, - boolean_t exact, uint64_t *zoid) + matchtype_t mt, uint64_t *zoid) { int error; if (zfsvfs->z_norm) { - matchtype_t mt = exact? MT_EXACT : MT_FIRST; /* * In the non-mixed case we only expect there would ever @@ -108,7 +109,7 @@ int zfs_dirent_lookup(znode_t *dzp, const char *name, znode_t **zpp, int flag) { zfsvfs_t *zfsvfs = dzp->z_zfsvfs; - boolean_t exact; + matchtype_t mt = 0; uint64_t zoid; vnode_t *vp = NULL; int error = 0; @@ -131,15 +132,39 @@ zfs_dirent_lookup(znode_t *dzp, const char *name, znod * zfsvfs->z_case and zfsvfs->z_norm fields. These choices * affect how we perform zap lookups. * - * Decide if exact matches should be requested when performing - * a zap lookup on file systems supporting case-insensitive - * access. + * When matching we may need to normalize & change case according to + * FS settings. * + * Note that a normalized match is necessary for a case insensitive + * filesystem when the lookup request is not exact because normalization + * can fold case independent of normalizing code point sequences. + * + * See the table above zfs_dropname(). + */ + if (zfsvfs->z_norm != 0) { + mt = MT_NORMALIZE; + + /* + * Determine if the match needs to honor the case specified in + * lookup, and if so keep track of that so that during + * normalization we don't fold case. + */ + if (zfsvfs->z_case == ZFS_CASE_MIXED) { + mt |= MT_MATCH_CASE; + } + } + + /* + * Only look in or update the DNLC if we are looking for the + * name on a file system that does not require normalization + * or case folding. We can also look there if we happen to be + * on a non-normalizing, mixed sensitivity file system IF we + * are looking for the exact name. + * * NB: we do not need to worry about this flag for ZFS_CASE_SENSITIVE * because in that case MT_EXACT and MT_FIRST should produce exactly * the same result. */ - exact = zfsvfs->z_case == ZFS_CASE_MIXED; if (dzp->z_unlinked && !(flag & ZXATTR)) return (ENOENT); @@ -149,7 +174,7 @@ zfs_dirent_lookup(znode_t *dzp, const char *name, znod if (error == 0) error = (zoid == 0 ? ENOENT : 0); } else { - error = zfs_match_find(zfsvfs, dzp, name, exact, &zoid); + error = zfs_match_find(zfsvfs, dzp, name, mt, &zoid); } if (error) { if (error != ENOENT || (flag & ZEXISTS)) { @@ -566,6 +591,28 @@ zfs_link_create(znode_t *dzp, const char *name, znode_ return (0); } +/* + * The match type in the code for this function should conform to: + * + * ------------------------------------------------------------------------ + * fs type | z_norm | lookup type | match type + * ---------|-------------|-------------|---------------------------------- + * CS !norm | 0 | 0 | 0 (exact) + * CS norm | formX | 0 | MT_NORMALIZE + * CI !norm | upper | !ZCIEXACT | MT_NORMALIZE + * CI !norm | upper | ZCIEXACT | MT_NORMALIZE | MT_MATCH_CASE + * CI norm | upper|formX | !ZCIEXACT | MT_NORMALIZE + * CI norm | upper|formX | ZCIEXACT | MT_NORMALIZE | MT_MATCH_CASE + * CM !norm | upper | !ZCILOOK | MT_NORMALIZE | MT_MATCH_CASE + * CM !norm | upper | ZCILOOK | MT_NORMALIZE + * CM norm | upper|formX | !ZCILOOK | MT_NORMALIZE | MT_MATCH_CASE + * CM norm | upper|formX | ZCILOOK | MT_NORMALIZE + * + * Abbreviations: + * CS = Case Sensitive, CI = Case Insensitive, CM = Case Mixed + * upper = case folding set by fs type on creation (U8_TEXTPREP_TOUPPER) + * formX = unicode normalization form set on fs creation + */ static int zfs_dropname(znode_t *dzp, const char *name, znode_t *zp, dmu_tx_t *tx, int flag) @@ -573,15 +620,16 @@ zfs_dropname(znode_t *dzp, const char *name, znode_t * int error; if (zp->z_zfsvfs->z_norm) { - if (zp->z_zfsvfs->z_case == ZFS_CASE_MIXED) - error = zap_remove_norm(zp->z_zfsvfs->z_os, - dzp->z_id, name, MT_EXACT, tx); - else - error = zap_remove_norm(zp->z_zfsvfs->z_os, - dzp->z_id, name, MT_FIRST, tx); + matchtype_t mt = MT_NORMALIZE; + + if (zp->z_zfsvfs->z_case == ZFS_CASE_MIXED) { + mt |= MT_MATCH_CASE; + } + + error = zap_remove_norm(zp->z_zfsvfs->z_os, dzp->z_id, + name, mt, tx); } else { - error = zap_remove(zp->z_zfsvfs->z_os, - dzp->z_id, name, tx); + error = zap_remove(zp->z_zfsvfs->z_os, dzp->z_id, name, tx); } return (error); Modified: stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/zfs_vnops.c ============================================================================== --- stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/zfs_vnops.c Wed Jul 26 16:30:57 2017 (r321544) +++ stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/zfs_vnops.c Wed Jul 26 16:32:17 2017 (r321545) @@ -18,11 +18,12 @@ * * CDDL HEADER END */ + /* * Copyright (c) 2005, 2010, Oracle and/or its affiliates. All rights reserved. * Copyright (c) 2012, 2015 by Delphix. All rights reserved. - * Copyright 2014 Nexenta Systems, Inc. All rights reserved. * Copyright (c) 2014 Integros [integros.com] + * Copyright 2017 Nexenta Systems, Inc. */ /* Portions Copyright 2007 Jeremy Teo */ @@ -1534,7 +1535,15 @@ zfs_lookup(vnode_t *dvp, char *nm, vnode_t **vpp, stru zfsvfs_t *zfsvfs = zdp->z_zfsvfs; int error = 0; - /* fast path (should be redundant with vfs namecache) */ + /* + * Fast path lookup, however we must skip DNLC lookup + * for case folding or normalizing lookups because the + * DNLC code only stores the passed in name. This means + * creating 'a' and removing 'A' on a case insensitive + * file system would work, but DNLC still thinks 'a' + * exists and won't let you create it again on the next + * pass through fast path. + */ if (!(flags & LOOKUP_XATTR)) { if (dvp->v_type != VDIR) { return (SET_ERROR(ENOTDIR)); From owner-svn-src-stable-11@freebsd.org Wed Jul 26 16:33:59 2017 Return-Path: Delivered-To: svn-src-stable-11@mailman.ysv.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by mailman.ysv.freebsd.org (Postfix) with ESMTP id 609EADA92AE; Wed, 26 Jul 2017 16:33:59 +0000 (UTC) (envelope-from mav@FreeBSD.org) Received: from repo.freebsd.org (repo.freebsd.org [IPv6:2610:1c1:1:6068::e6a:0]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (Client did not present a certificate) by mx1.freebsd.org (Postfix) with ESMTPS id 3AF526A371; Wed, 26 Jul 2017 16:33:59 +0000 (UTC) (envelope-from mav@FreeBSD.org) Received: from repo.freebsd.org ([127.0.1.37]) by repo.freebsd.org (8.15.2/8.15.2) with ESMTP id v6QGXwXc072437; Wed, 26 Jul 2017 16:33:58 GMT (envelope-from mav@FreeBSD.org) Received: (from mav@localhost) by repo.freebsd.org (8.15.2/8.15.2/Submit) id v6QGXwpi072435; Wed, 26 Jul 2017 16:33:58 GMT (envelope-from mav@FreeBSD.org) Message-Id: <201707261633.v6QGXwpi072435@repo.freebsd.org> X-Authentication-Warning: repo.freebsd.org: mav set sender to mav@FreeBSD.org using -f From: Alexander Motin Date: Wed, 26 Jul 2017 16:33:58 +0000 (UTC) To: src-committers@freebsd.org, svn-src-all@freebsd.org, svn-src-stable@freebsd.org, svn-src-stable-11@freebsd.org Subject: svn commit: r321546 - stable/11/cddl/contrib/opensolaris/lib/libzfs/common X-SVN-Group: stable-11 X-SVN-Commit-Author: mav X-SVN-Commit-Paths: stable/11/cddl/contrib/opensolaris/lib/libzfs/common X-SVN-Commit-Revision: 321546 X-SVN-Commit-Repository: base MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit X-BeenThere: svn-src-stable-11@freebsd.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: SVN commit messages for only the 11-stable src tree List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 26 Jul 2017 16:33:59 -0000 Author: mav Date: Wed Jul 26 16:33:58 2017 New Revision: 321546 URL: https://svnweb.freebsd.org/changeset/base/321546 Log: MFC r318819: MFV r316908: 7541 zpool import/tryimport ioctl returns ENOMEM because provided buffer is too small for config illumos/illumos-gate@8b65a70b763232c90a91f31eb2010314c02ed338 https://github.com/illumos/illumos-gate/commit/8b65a70b763232c90a91f31eb2010314c02ed338 https://www.illumos.org/issues/7541 When calling zpool import, zpool does a few ioctls to ZFS. zpool allocates a buffer in userland and passes it to the kernel so that ZFS can copy info into it. ZFS will use it to put the nvlist that describes the pool configuration. If the allocated buffer is too small, ZFS will return ENOMEM and the call will have to be redone. This wastes CPU time and slows down the import process. This happens very often for the ZFS_IOC_POOL_TRYIMPORT call. Reviewed by: Matthew Ahrens Reviewed by: Dan Kimmel Approved by: Dan McDonald Author: Pavel Zakharov Modified: stable/11/cddl/contrib/opensolaris/lib/libzfs/common/libzfs_impl.h stable/11/cddl/contrib/opensolaris/lib/libzfs/common/libzfs_import.c Directory Properties: stable/11/ (props changed) Modified: stable/11/cddl/contrib/opensolaris/lib/libzfs/common/libzfs_impl.h ============================================================================== --- stable/11/cddl/contrib/opensolaris/lib/libzfs/common/libzfs_impl.h Wed Jul 26 16:32:17 2017 (r321545) +++ stable/11/cddl/contrib/opensolaris/lib/libzfs/common/libzfs_impl.h Wed Jul 26 16:33:58 2017 (r321546) @@ -132,6 +132,8 @@ typedef enum { SHARED_SMB = 0x4 } zfs_share_type_t; +#define CONFIG_BUF_MINSIZE 65536 + int zfs_error(libzfs_handle_t *, int, const char *); int zfs_error_fmt(libzfs_handle_t *, int, const char *, ...); void zfs_error_aux(libzfs_handle_t *, const char *, ...); Modified: stable/11/cddl/contrib/opensolaris/lib/libzfs/common/libzfs_import.c ============================================================================== --- stable/11/cddl/contrib/opensolaris/lib/libzfs/common/libzfs_import.c Wed Jul 26 16:32:17 2017 (r321545) +++ stable/11/cddl/contrib/opensolaris/lib/libzfs/common/libzfs_import.c Wed Jul 26 16:33:58 2017 (r321546) @@ -21,7 +21,7 @@ /* * Copyright (c) 2005, 2010, Oracle and/or its affiliates. All rights reserved. - * Copyright (c) 2012, 2015 by Delphix. All rights reserved. + * Copyright (c) 2012, 2016 by Delphix. All rights reserved. * Copyright 2015 RackTop Systems. * Copyright 2016 Nexenta Systems, Inc. */ @@ -379,13 +379,14 @@ refresh_config(libzfs_handle_t *hdl, nvlist_t *config) { nvlist_t *nvl; zfs_cmd_t zc = { 0 }; - int err; + int err, dstbuf_size; if (zcmd_write_conf_nvlist(hdl, &zc, config) != 0) return (NULL); - if (zcmd_alloc_dst_nvlist(hdl, &zc, - zc.zc_nvlist_conf_size * 2) != 0) { + dstbuf_size = MAX(CONFIG_BUF_MINSIZE, zc.zc_nvlist_conf_size * 4); + + if (zcmd_alloc_dst_nvlist(hdl, &zc, dstbuf_size) != 0) { zcmd_free_nvlists(&zc); return (NULL); } From owner-svn-src-stable-11@freebsd.org Wed Jul 26 16:35:19 2017 Return-Path: Delivered-To: svn-src-stable-11@mailman.ysv.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by mailman.ysv.freebsd.org (Postfix) with ESMTP id 3B6BDDA9330; Wed, 26 Jul 2017 16:35:19 +0000 (UTC) (envelope-from mav@FreeBSD.org) Received: from repo.freebsd.org (repo.freebsd.org [IPv6:2610:1c1:1:6068::e6a:0]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (Client did not present a certificate) by mx1.freebsd.org (Postfix) with ESMTPS id DEF286A4C1; Wed, 26 Jul 2017 16:35:18 +0000 (UTC) (envelope-from mav@FreeBSD.org) Received: from repo.freebsd.org ([127.0.1.37]) by repo.freebsd.org (8.15.2/8.15.2) with ESMTP id v6QGZIPZ072551; Wed, 26 Jul 2017 16:35:18 GMT (envelope-from mav@FreeBSD.org) Received: (from mav@localhost) by repo.freebsd.org (8.15.2/8.15.2/Submit) id v6QGZHOg072547; Wed, 26 Jul 2017 16:35:17 GMT (envelope-from mav@FreeBSD.org) Message-Id: <201707261635.v6QGZHOg072547@repo.freebsd.org> X-Authentication-Warning: repo.freebsd.org: mav set sender to mav@FreeBSD.org using -f From: Alexander Motin Date: Wed, 26 Jul 2017 16:35:17 +0000 (UTC) To: src-committers@freebsd.org, svn-src-all@freebsd.org, svn-src-stable@freebsd.org, svn-src-stable-11@freebsd.org Subject: svn commit: r321547 - in stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs: . sys X-SVN-Group: stable-11 X-SVN-Commit-Author: mav X-SVN-Commit-Paths: in stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs: . sys X-SVN-Commit-Revision: 321547 X-SVN-Commit-Repository: base MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit X-BeenThere: svn-src-stable-11@freebsd.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: SVN commit messages for only the 11-stable src tree List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 26 Jul 2017 16:35:19 -0000 Author: mav Date: Wed Jul 26 16:35:17 2017 New Revision: 321547 URL: https://svnweb.freebsd.org/changeset/base/321547 Log: MFC r318821: MFV r316912: 7793 ztest fails assertion in dmu_tx_willuse_space illumos/illumos-gate@61e255ce7267b52208af9daf434b77d37fb75622 https://github.com/illumos/illumos-gate/commit/61e255ce7267b52208af9daf434b77d37 fb75622 https://www.illumos.org/issues/7793 Background information: This assertion about tx_space_* verifies that we are not dirtying more stuff than we thought we would. We “need” to know how much we will dirty so that we can check if we should fail this transaction with ENOSPC/EDQUOT, in dmu_tx_assign(). While the transaction is open (i.e. between dmu_tx_assign() and dmu_tx_commit() — typically less than a millisecond), we call dbuf_dirty() on the exact blocks that will be modified. Once this happens, the temporary accounting in tx_space_* is unnecessary, because we know exactly what blocks are newly dirtied; we call dnode_willuse_space() to track this more exact accounting. The fundamental problem causing this bug is that dmu_tx_hold_*() relies on the current state in the DMU (e.g. dn_nlevels) to predict how much will be dirtied by this transaction, but this state can change before we actually perform the transaction (i.e. call dbuf_dirty()). This bug will be fixed by removing the assertion that the tx_space_* accounting is perfectly accurate (i.e. we never dirty more than was predicted by dmu_tx_hold_*()). By removing the requirement that this accounting be perfectly accurate, we can also vastly simplify it, e.g. removing most of the logic in dmu_tx_count_*(). The new tx space accounting will be very approximate, and may be more or less than what is actually dirtied. It will still be used to determine if this transaction will put us over quota. Transactions that are marked by dmu_tx_mark_netfree() will be excepted from this check. We won’t make an attempt to determine how much space will be freed by the transaction — this was rarely accurate enough to determine if a transaction should be permitted when we are over quota, which is why dmu_tx_mark_netfree() was introduced in 2014. We also won’t attempt to give “credit” when overwriting existing blocks, if those blocks may be freed. This allows us to remove the do_free_accounting logic in dbuf_dirty(), and associated routines. This Reviewed by: Steve Gonczi Reviewed by: George Wilson Reviewed by: Pavel Zakharov Reviewed by: Brian Behlendorf Approved by: Robert Mustacchi Author: Matthew Ahrens Modified: stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/dbuf.c stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/dmu_objset.c stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/dmu_tx.c stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/dnode.c stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/dsl_dataset.c stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/dsl_dir.c stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/spa_misc.c stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/sys/dmu.h stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/sys/dmu_impl.h stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/sys/dmu_objset.h stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/sys/dmu_tx.h stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/sys/dnode.h stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/sys/dsl_dataset.h stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/sys/dsl_dir.h stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/sys/spa.h stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/sys/zap_impl.h stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/zap.c stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/zap_micro.c Directory Properties: stable/11/ (props changed) Modified: stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/dbuf.c ============================================================================== --- stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/dbuf.c Wed Jul 26 16:33:58 2017 (r321546) +++ stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/dbuf.c Wed Jul 26 16:35:17 2017 (r321547) @@ -1346,41 +1346,6 @@ dbuf_free_range(dnode_t *dn, uint64_t start_blkid, uin mutex_exit(&dn->dn_dbufs_mtx); } -static int -dbuf_block_freeable(dmu_buf_impl_t *db) -{ - dsl_dataset_t *ds = db->db_objset->os_dsl_dataset; - uint64_t birth_txg = 0; - - /* - * We don't need any locking to protect db_blkptr: - * If it's syncing, then db_last_dirty will be set - * so we'll ignore db_blkptr. - * - * This logic ensures that only block births for - * filled blocks are considered. - */ - ASSERT(MUTEX_HELD(&db->db_mtx)); - if (db->db_last_dirty && (db->db_blkptr == NULL || - !BP_IS_HOLE(db->db_blkptr))) { - birth_txg = db->db_last_dirty->dr_txg; - } else if (db->db_blkptr != NULL && !BP_IS_HOLE(db->db_blkptr)) { - birth_txg = db->db_blkptr->blk_birth; - } - - /* - * If this block don't exist or is in a snapshot, it can't be freed. - * Don't pass the bp to dsl_dataset_block_freeable() since we - * are holding the db_mtx lock and might deadlock if we are - * prefetching a dedup-ed block. - */ - if (birth_txg != 0) - return (ds == NULL || - dsl_dataset_block_freeable(ds, NULL, birth_txg)); - else - return (B_FALSE); -} - void dbuf_new_size(dmu_buf_impl_t *db, int size, dmu_tx_t *tx) { @@ -1430,7 +1395,7 @@ dbuf_new_size(dmu_buf_impl_t *db, int size, dmu_tx_t * } mutex_exit(&db->db_mtx); - dnode_willuse_space(dn, size-osize, tx); + dmu_objset_willuse_space(dn->dn_objset, size - osize, tx); DB_DNODE_EXIT(db); } @@ -1480,7 +1445,6 @@ dbuf_dirty(dmu_buf_impl_t *db, dmu_tx_t *tx) objset_t *os; dbuf_dirty_record_t **drp, *dr; int drop_struct_lock = FALSE; - boolean_t do_free_accounting = B_FALSE; int txgoff = tx->tx_txg & TXG_MASK; ASSERT(tx->tx_txg != 0); @@ -1602,15 +1566,7 @@ dbuf_dirty(dmu_buf_impl_t *db, dmu_tx_t *tx) dprintf_dbuf(db, "size=%llx\n", (u_longlong_t)db->db.db_size); if (db->db_blkid != DMU_BONUS_BLKID) { - /* - * Update the accounting. - * Note: we delay "free accounting" until after we drop - * the db_mtx. This keeps us from grabbing other locks - * (and possibly deadlocking) in bp_get_dsize() while - * also holding the db_mtx. - */ - dnode_willuse_space(dn, db->db.db_size, tx); - do_free_accounting = dbuf_block_freeable(db); + dmu_objset_willuse_space(os, db->db.db_size, tx); } /* @@ -1703,21 +1659,13 @@ dbuf_dirty(dmu_buf_impl_t *db, dmu_tx_t *tx) drop_struct_lock = TRUE; } - if (do_free_accounting) { - blkptr_t *bp = db->db_blkptr; - int64_t willfree = (bp && !BP_IS_HOLE(bp)) ? - bp_get_dsize(os->os_spa, bp) : db->db.db_size; - /* - * This is only a guess -- if the dbuf is dirty - * in a previous txg, we don't know how much - * space it will use on disk yet. We should - * really have the struct_rwlock to access - * db_blkptr, but since this is just a guess, - * it's OK if we get an odd answer. - */ - ddt_prefetch(os->os_spa, bp); - dnode_willuse_space(dn, -willfree, tx); - } + /* + * If we are overwriting a dedup BP, then unless it is snapshotted, + * when we get to syncing context we will need to decrement its + * refcount in the DDT. Prefetch the relevant DDT block so that + * syncing context won't have to wait for the i/o. + */ + ddt_prefetch(os->os_spa, db->db_blkptr); if (db->db_level == 0) { dnode_new_blkid(dn, db->db_blkid, tx, drop_struct_lock); @@ -2925,19 +2873,6 @@ void dmu_buf_user_evict_wait() { taskq_wait(dbu_evict_taskq); -} - -boolean_t -dmu_buf_freeable(dmu_buf_t *dbuf) -{ - boolean_t res = B_FALSE; - dmu_buf_impl_t *db = (dmu_buf_impl_t *)dbuf; - - if (db->db_blkptr) - res = dsl_dataset_block_freeable(db->db_objset->os_dsl_dataset, - db->db_blkptr, db->db_blkptr->blk_birth); - - return (res); } blkptr_t * Modified: stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/dmu_objset.c ============================================================================== --- stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/dmu_objset.c Wed Jul 26 16:33:58 2017 (r321546) +++ stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/dmu_objset.c Wed Jul 26 16:35:17 2017 (r321547) @@ -2103,3 +2103,20 @@ dmu_fsname(const char *snapname, char *buf) (void) strlcpy(buf, snapname, atp - snapname + 1); return (0); } + +/* + * Call when we think we're going to write/free space in open context to track + * the amount of dirty data in the open txg, which is also the amount + * of memory that can not be evicted until this txg syncs. + */ +void +dmu_objset_willuse_space(objset_t *os, int64_t space, dmu_tx_t *tx) +{ + dsl_dataset_t *ds = os->os_dsl_dataset; + int64_t aspace = spa_get_worst_case_asize(os->os_spa, space); + + if (ds != NULL) { + dsl_dir_willuse_space(ds->ds_dir, aspace, tx); + dsl_pool_dirty_space(dmu_tx_pool(tx), space, tx); + } +} Modified: stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/dmu_tx.c ============================================================================== --- stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/dmu_tx.c Wed Jul 26 16:33:58 2017 (r321546) +++ stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/dmu_tx.c Wed Jul 26 16:35:17 2017 (r321547) @@ -30,10 +30,10 @@ #include #include #include -#include /* for dsl_dataset_block_freeable() */ -#include /* for dsl_dir_tempreserve_*() */ +#include +#include #include -#include /* for fzap_default_block_shift */ +#include #include #include #include @@ -56,10 +56,6 @@ dmu_tx_create_dd(dsl_dir_t *dd) list_create(&tx->tx_callbacks, sizeof (dmu_tx_callback_t), offsetof(dmu_tx_callback_t, dcb_node)); tx->tx_start = gethrtime(); -#ifdef ZFS_DEBUG - refcount_create(&tx->tx_space_written); - refcount_create(&tx->tx_space_freed); -#endif return (tx); } @@ -68,7 +64,6 @@ dmu_tx_create(objset_t *os) { dmu_tx_t *tx = dmu_tx_create_dd(os->os_dsl_dataset->ds_dir); tx->tx_objset = os; - tx->tx_lastsnap_txg = dsl_dataset_prev_snap_txg(os->os_dsl_dataset); return (tx); } @@ -130,16 +125,10 @@ dmu_tx_hold_object_impl(dmu_tx_t *tx, objset_t *os, ui txh->txh_tx = tx; txh->txh_dnode = dn; refcount_create(&txh->txh_space_towrite); - refcount_create(&txh->txh_space_tofree); - refcount_create(&txh->txh_space_tooverwrite); - refcount_create(&txh->txh_space_tounref); refcount_create(&txh->txh_memory_tohold); - refcount_create(&txh->txh_fudge); -#ifdef ZFS_DEBUG txh->txh_type = type; txh->txh_arg1 = arg1; txh->txh_arg2 = arg2; -#endif list_insert_tail(&tx->tx_holds, txh); return (txh); @@ -158,6 +147,34 @@ dmu_tx_add_new_object(dmu_tx_t *tx, objset_t *os, uint } } +/* + * This function reads specified data from disk. The specified data will + * be needed to perform the transaction -- i.e, it will be read after + * we do dmu_tx_assign(). There are two reasons that we read the data now + * (before dmu_tx_assign()): + * + * 1. Reading it now has potentially better performance. The transaction + * has not yet been assigned, so the TXG is not held open, and also the + * caller typically has less locks held when calling dmu_tx_hold_*() than + * after the transaction has been assigned. This reduces the lock (and txg) + * hold times, thus reducing lock contention. + * + * 2. It is easier for callers (primarily the ZPL) to handle i/o errors + * that are detected before they start making changes to the DMU state + * (i.e. now). Once the transaction has been assigned, and some DMU + * state has been changed, it can be difficult to recover from an i/o + * error (e.g. to undo the changes already made in memory at the DMU + * layer). Typically code to do so does not exist in the caller -- it + * assumes that the data has already been cached and thus i/o errors are + * not possible. + * + * It has been observed that the i/o initiated here can be a performance + * problem, and it appears to be optional, because we don't look at the + * data which is read. However, removing this read would only serve to + * move the work elsewhere (after the dmu_tx_assign()), where it may + * have a greater impact on performance (in addition to the impact on + * fault tolerance noted above). + */ static int dmu_tx_check_ioerr(zio_t *zio, dnode_t *dn, int level, uint64_t blkid) { @@ -174,259 +191,84 @@ dmu_tx_check_ioerr(zio_t *zio, dnode_t *dn, int level, return (err); } -static void -dmu_tx_count_twig(dmu_tx_hold_t *txh, dnode_t *dn, dmu_buf_impl_t *db, - int level, uint64_t blkid, boolean_t freeable, uint64_t *history) -{ - objset_t *os = dn->dn_objset; - dsl_dataset_t *ds = os->os_dsl_dataset; - int epbs = dn->dn_indblkshift - SPA_BLKPTRSHIFT; - dmu_buf_impl_t *parent = NULL; - blkptr_t *bp = NULL; - uint64_t space; - - if (level >= dn->dn_nlevels || history[level] == blkid) - return; - - history[level] = blkid; - - space = (level == 0) ? dn->dn_datablksz : (1ULL << dn->dn_indblkshift); - - if (db == NULL || db == dn->dn_dbuf) { - ASSERT(level != 0); - db = NULL; - } else { - ASSERT(DB_DNODE(db) == dn); - ASSERT(db->db_level == level); - ASSERT(db->db.db_size == space); - ASSERT(db->db_blkid == blkid); - bp = db->db_blkptr; - parent = db->db_parent; - } - - freeable = (bp && (freeable || - dsl_dataset_block_freeable(ds, bp, bp->blk_birth))); - - if (freeable) { - (void) refcount_add_many(&txh->txh_space_tooverwrite, - space, FTAG); - } else { - (void) refcount_add_many(&txh->txh_space_towrite, - space, FTAG); - } - - if (bp) { - (void) refcount_add_many(&txh->txh_space_tounref, - bp_get_dsize(os->os_spa, bp), FTAG); - } - - dmu_tx_count_twig(txh, dn, parent, level + 1, - blkid >> epbs, freeable, history); -} - /* ARGSUSED */ static void dmu_tx_count_write(dmu_tx_hold_t *txh, uint64_t off, uint64_t len) { dnode_t *dn = txh->txh_dnode; - uint64_t start, end, i; - int min_bs, max_bs, min_ibs, max_ibs, epbs, bits; int err = 0; if (len == 0) return; - min_bs = SPA_MINBLOCKSHIFT; - max_bs = highbit64(txh->txh_tx->tx_objset->os_recordsize) - 1; - min_ibs = DN_MIN_INDBLKSHIFT; - max_ibs = DN_MAX_INDBLKSHIFT; + (void) refcount_add_many(&txh->txh_space_towrite, len, FTAG); - if (dn) { - uint64_t history[DN_MAX_LEVELS]; - int nlvls = dn->dn_nlevels; - int delta; + if (refcount_count(&txh->txh_space_towrite) > 2 * DMU_MAX_ACCESS) + err = SET_ERROR(EFBIG); - /* - * For i/o error checking, read the first and last level-0 - * blocks (if they are not aligned), and all the level-1 blocks. - */ - if (dn->dn_maxblkid == 0) { - delta = dn->dn_datablksz; - start = (off < dn->dn_datablksz) ? 0 : 1; - end = (off+len <= dn->dn_datablksz) ? 0 : 1; - if (start == 0 && (off > 0 || len < dn->dn_datablksz)) { - err = dmu_tx_check_ioerr(NULL, dn, 0, 0); - if (err) - goto out; - delta -= off; - } - } else { - zio_t *zio = zio_root(dn->dn_objset->os_spa, - NULL, NULL, ZIO_FLAG_CANFAIL); + if (dn == NULL) + return; - /* first level-0 block */ - start = off >> dn->dn_datablkshift; - if (P2PHASE(off, dn->dn_datablksz) || - len < dn->dn_datablksz) { - err = dmu_tx_check_ioerr(zio, dn, 0, start); - if (err) - goto out; + /* + * For i/o error checking, read the blocks that will be needed + * to perform the write: the first and last level-0 blocks (if + * they are not aligned, i.e. if they are partial-block writes), + * and all the level-1 blocks. + */ + if (dn->dn_maxblkid == 0) { + if (off < dn->dn_datablksz && + (off > 0 || len < dn->dn_datablksz)) { + err = dmu_tx_check_ioerr(NULL, dn, 0, 0); + if (err != 0) { + txh->txh_tx->tx_err = err; } + } + } else { + zio_t *zio = zio_root(dn->dn_objset->os_spa, + NULL, NULL, ZIO_FLAG_CANFAIL); - /* last level-0 block */ - end = (off+len-1) >> dn->dn_datablkshift; - if (end != start && end <= dn->dn_maxblkid && - P2PHASE(off+len, dn->dn_datablksz)) { - err = dmu_tx_check_ioerr(zio, dn, 0, end); - if (err) - goto out; + /* first level-0 block */ + uint64_t start = off >> dn->dn_datablkshift; + if (P2PHASE(off, dn->dn_datablksz) || len < dn->dn_datablksz) { + err = dmu_tx_check_ioerr(zio, dn, 0, start); + if (err != 0) { + txh->txh_tx->tx_err = err; } - - /* level-1 blocks */ - if (nlvls > 1) { - int shft = dn->dn_indblkshift - SPA_BLKPTRSHIFT; - for (i = (start>>shft)+1; i < end>>shft; i++) { - err = dmu_tx_check_ioerr(zio, dn, 1, i); - if (err) - goto out; - } - } - - err = zio_wait(zio); - if (err) - goto out; - delta = P2NPHASE(off, dn->dn_datablksz); } - min_ibs = max_ibs = dn->dn_indblkshift; - if (dn->dn_maxblkid > 0) { - /* - * The blocksize can't change, - * so we can make a more precise estimate. - */ - ASSERT(dn->dn_datablkshift != 0); - min_bs = max_bs = dn->dn_datablkshift; - } else { - /* - * The blocksize can increase up to the recordsize, - * or if it is already more than the recordsize, - * up to the next power of 2. - */ - min_bs = highbit64(dn->dn_datablksz - 1); - max_bs = MAX(max_bs, highbit64(dn->dn_datablksz - 1)); - } - - /* - * If this write is not off the end of the file - * we need to account for overwrites/unref. - */ - if (start <= dn->dn_maxblkid) { - for (int l = 0; l < DN_MAX_LEVELS; l++) - history[l] = -1ULL; - } - while (start <= dn->dn_maxblkid) { - dmu_buf_impl_t *db; - - rw_enter(&dn->dn_struct_rwlock, RW_READER); - err = dbuf_hold_impl(dn, 0, start, - FALSE, FALSE, FTAG, &db); - rw_exit(&dn->dn_struct_rwlock); - - if (err) { + /* last level-0 block */ + uint64_t end = (off + len - 1) >> dn->dn_datablkshift; + if (end != start && end <= dn->dn_maxblkid && + P2PHASE(off + len, dn->dn_datablksz)) { + err = dmu_tx_check_ioerr(zio, dn, 0, end); + if (err != 0) { txh->txh_tx->tx_err = err; - return; } + } - dmu_tx_count_twig(txh, dn, db, 0, start, B_FALSE, - history); - dbuf_rele(db, FTAG); - if (++start > end) { - /* - * Account for new indirects appearing - * before this IO gets assigned into a txg. - */ - bits = 64 - min_bs; - epbs = min_ibs - SPA_BLKPTRSHIFT; - for (bits -= epbs * (nlvls - 1); - bits >= 0; bits -= epbs) { - (void) refcount_add_many( - &txh->txh_fudge, - 1ULL << max_ibs, FTAG); + /* level-1 blocks */ + if (dn->dn_nlevels > 1) { + int shft = dn->dn_indblkshift - SPA_BLKPTRSHIFT; + for (uint64_t i = (start >> shft) + 1; + i < end >> shft; i++) { + err = dmu_tx_check_ioerr(zio, dn, 1, i); + if (err != 0) { + txh->txh_tx->tx_err = err; } - goto out; } - off += delta; - if (len >= delta) - len -= delta; - delta = dn->dn_datablksz; } - } - /* - * 'end' is the last thing we will access, not one past. - * This way we won't overflow when accessing the last byte. - */ - start = P2ALIGN(off, 1ULL << max_bs); - end = P2ROUNDUP(off + len, 1ULL << max_bs) - 1; - (void) refcount_add_many(&txh->txh_space_towrite, - end - start + 1, FTAG); - - start >>= min_bs; - end >>= min_bs; - - epbs = min_ibs - SPA_BLKPTRSHIFT; - - /* - * The object contains at most 2^(64 - min_bs) blocks, - * and each indirect level maps 2^epbs. - */ - for (bits = 64 - min_bs; bits >= 0; bits -= epbs) { - start >>= epbs; - end >>= epbs; - ASSERT3U(end, >=, start); - (void) refcount_add_many(&txh->txh_space_towrite, - (end - start + 1) << max_ibs, FTAG); - if (start != 0) { - /* - * We also need a new blkid=0 indirect block - * to reference any existing file data. - */ - (void) refcount_add_many(&txh->txh_space_towrite, - 1ULL << max_ibs, FTAG); + err = zio_wait(zio); + if (err != 0) { + txh->txh_tx->tx_err = err; } } - -out: - if (refcount_count(&txh->txh_space_towrite) + - refcount_count(&txh->txh_space_tooverwrite) > - 2 * DMU_MAX_ACCESS) - err = SET_ERROR(EFBIG); - - if (err) - txh->txh_tx->tx_err = err; } static void dmu_tx_count_dnode(dmu_tx_hold_t *txh) { - dnode_t *dn = txh->txh_dnode; - dnode_t *mdn = DMU_META_DNODE(txh->txh_tx->tx_objset); - uint64_t space = mdn->dn_datablksz + - ((mdn->dn_nlevels-1) << mdn->dn_indblkshift); - - if (dn && dn->dn_dbuf->db_blkptr && - dsl_dataset_block_freeable(dn->dn_objset->os_dsl_dataset, - dn->dn_dbuf->db_blkptr, dn->dn_dbuf->db_blkptr->blk_birth)) { - (void) refcount_add_many(&txh->txh_space_tooverwrite, - space, FTAG); - (void) refcount_add_many(&txh->txh_space_tounref, space, FTAG); - } else { - (void) refcount_add_many(&txh->txh_space_towrite, space, FTAG); - if (dn && dn->dn_dbuf->db_blkptr) { - (void) refcount_add_many(&txh->txh_space_tounref, - space, FTAG); - } - } + (void) refcount_add_many(&txh->txh_space_towrite, DNODE_SIZE, FTAG); } void @@ -434,8 +276,8 @@ dmu_tx_hold_write(dmu_tx_t *tx, uint64_t object, uint6 { dmu_tx_hold_t *txh; - ASSERT(tx->tx_txg == 0); - ASSERT(len < DMU_MAX_ACCESS); + ASSERT0(tx->tx_txg); + ASSERT3U(len, <=, DMU_MAX_ACCESS); ASSERT(len == 0 || UINT64_MAX - off >= len - 1); txh = dmu_tx_hold_object_impl(tx, tx->tx_objset, @@ -447,179 +289,6 @@ dmu_tx_hold_write(dmu_tx_t *tx, uint64_t object, uint6 dmu_tx_count_dnode(txh); } -static void -dmu_tx_count_free(dmu_tx_hold_t *txh, uint64_t off, uint64_t len) -{ - uint64_t blkid, nblks, lastblk; - uint64_t space = 0, unref = 0, skipped = 0; - dnode_t *dn = txh->txh_dnode; - dsl_dataset_t *ds = dn->dn_objset->os_dsl_dataset; - spa_t *spa = txh->txh_tx->tx_pool->dp_spa; - int epbs; - uint64_t l0span = 0, nl1blks = 0; - - if (dn->dn_nlevels == 0) - return; - - /* - * The struct_rwlock protects us against dn_nlevels - * changing, in case (against all odds) we manage to dirty & - * sync out the changes after we check for being dirty. - * Also, dbuf_hold_impl() wants us to have the struct_rwlock. - */ - rw_enter(&dn->dn_struct_rwlock, RW_READER); - epbs = dn->dn_indblkshift - SPA_BLKPTRSHIFT; - if (dn->dn_maxblkid == 0) { - if (off == 0 && len >= dn->dn_datablksz) { - blkid = 0; - nblks = 1; - } else { - rw_exit(&dn->dn_struct_rwlock); - return; - } - } else { - blkid = off >> dn->dn_datablkshift; - nblks = (len + dn->dn_datablksz - 1) >> dn->dn_datablkshift; - - if (blkid > dn->dn_maxblkid) { - rw_exit(&dn->dn_struct_rwlock); - return; - } - if (blkid + nblks > dn->dn_maxblkid) - nblks = dn->dn_maxblkid - blkid + 1; - - } - l0span = nblks; /* save for later use to calc level > 1 overhead */ - if (dn->dn_nlevels == 1) { - int i; - for (i = 0; i < nblks; i++) { - blkptr_t *bp = dn->dn_phys->dn_blkptr; - ASSERT3U(blkid + i, <, dn->dn_nblkptr); - bp += blkid + i; - if (dsl_dataset_block_freeable(ds, bp, bp->blk_birth)) { - dprintf_bp(bp, "can free old%s", ""); - space += bp_get_dsize(spa, bp); - } - unref += BP_GET_ASIZE(bp); - } - nl1blks = 1; - nblks = 0; - } - - lastblk = blkid + nblks - 1; - while (nblks) { - dmu_buf_impl_t *dbuf; - uint64_t ibyte, new_blkid; - int epb = 1 << epbs; - int err, i, blkoff, tochk; - blkptr_t *bp; - - ibyte = blkid << dn->dn_datablkshift; - err = dnode_next_offset(dn, - DNODE_FIND_HAVELOCK, &ibyte, 2, 1, 0); - new_blkid = ibyte >> dn->dn_datablkshift; - if (err == ESRCH) { - skipped += (lastblk >> epbs) - (blkid >> epbs) + 1; - break; - } - if (err) { - txh->txh_tx->tx_err = err; - break; - } - if (new_blkid > lastblk) { - skipped += (lastblk >> epbs) - (blkid >> epbs) + 1; - break; - } - - if (new_blkid > blkid) { - ASSERT((new_blkid >> epbs) > (blkid >> epbs)); - skipped += (new_blkid >> epbs) - (blkid >> epbs) - 1; - nblks -= new_blkid - blkid; - blkid = new_blkid; - } - blkoff = P2PHASE(blkid, epb); - tochk = MIN(epb - blkoff, nblks); - - err = dbuf_hold_impl(dn, 1, blkid >> epbs, - FALSE, FALSE, FTAG, &dbuf); - if (err) { - txh->txh_tx->tx_err = err; - break; - } - - (void) refcount_add_many(&txh->txh_memory_tohold, - dbuf->db.db_size, FTAG); - - /* - * We don't check memory_tohold against DMU_MAX_ACCESS because - * memory_tohold is an over-estimation (especially the >L1 - * indirect blocks), so it could fail. Callers should have - * already verified that they will not be holding too much - * memory. - */ - - err = dbuf_read(dbuf, NULL, DB_RF_HAVESTRUCT | DB_RF_CANFAIL); - if (err != 0) { - txh->txh_tx->tx_err = err; - dbuf_rele(dbuf, FTAG); - break; - } - - bp = dbuf->db.db_data; - bp += blkoff; - - for (i = 0; i < tochk; i++) { - if (dsl_dataset_block_freeable(ds, &bp[i], - bp[i].blk_birth)) { - dprintf_bp(&bp[i], "can free old%s", ""); - space += bp_get_dsize(spa, &bp[i]); - } - unref += BP_GET_ASIZE(bp); - } - dbuf_rele(dbuf, FTAG); - - ++nl1blks; - blkid += tochk; - nblks -= tochk; - } - rw_exit(&dn->dn_struct_rwlock); - - /* - * Add in memory requirements of higher-level indirects. - * This assumes a worst-possible scenario for dn_nlevels and a - * worst-possible distribution of l1-blocks over the region to free. - */ - { - uint64_t blkcnt = 1 + ((l0span >> epbs) >> epbs); - int level = 2; - /* - * Here we don't use DN_MAX_LEVEL, but calculate it with the - * given datablkshift and indblkshift. This makes the - * difference between 19 and 8 on large files. - */ - int maxlevel = 2 + (DN_MAX_OFFSET_SHIFT - dn->dn_datablkshift) / - (dn->dn_indblkshift - SPA_BLKPTRSHIFT); - - while (level++ < maxlevel) { - (void) refcount_add_many(&txh->txh_memory_tohold, - MAX(MIN(blkcnt, nl1blks), 1) << dn->dn_indblkshift, - FTAG); - blkcnt = 1 + (blkcnt >> epbs); - } - } - - /* account for new level 1 indirect blocks that might show up */ - if (skipped > 0) { - (void) refcount_add_many(&txh->txh_fudge, - skipped << dn->dn_indblkshift, FTAG); - skipped = MIN(skipped, DMU_MAX_DELETEBLKCNT >> epbs); - (void) refcount_add_many(&txh->txh_memory_tohold, - skipped << dn->dn_indblkshift, FTAG); - } - (void) refcount_add_many(&txh->txh_space_tofree, space, FTAG); - (void) refcount_add_many(&txh->txh_space_tounref, unref, FTAG); -} - /* * This function marks the transaction as being a "net free". The end * result is that refquotas will be disabled for this transaction, and @@ -631,45 +300,27 @@ dmu_tx_count_free(dmu_tx_hold_t *txh, uint64_t off, ui void dmu_tx_mark_netfree(dmu_tx_t *tx) { - dmu_tx_hold_t *txh; - - txh = dmu_tx_hold_object_impl(tx, tx->tx_objset, - DMU_NEW_OBJECT, THT_FREE, 0, 0); - - /* - * Pretend that this operation will free 1GB of space. This - * should be large enough to cancel out the largest write. - * We don't want to use something like UINT64_MAX, because that would - * cause overflows when doing math with these values (e.g. in - * dmu_tx_try_assign()). - */ - (void) refcount_add_many(&txh->txh_space_tofree, - 1024 * 1024 * 1024, FTAG); - (void) refcount_add_many(&txh->txh_space_tounref, - 1024 * 1024 * 1024, FTAG); + tx->tx_netfree = B_TRUE; } void dmu_tx_hold_free(dmu_tx_t *tx, uint64_t object, uint64_t off, uint64_t len) { - dmu_tx_hold_t *txh; - dnode_t *dn; int err; - zio_t *zio; ASSERT(tx->tx_txg == 0); - txh = dmu_tx_hold_object_impl(tx, tx->tx_objset, + dmu_tx_hold_t *txh = dmu_tx_hold_object_impl(tx, tx->tx_objset, object, THT_FREE, off, len); if (txh == NULL) return; - dn = txh->txh_dnode; + dnode_t *dn = txh->txh_dnode; dmu_tx_count_dnode(txh); - if (off >= (dn->dn_maxblkid+1) * dn->dn_datablksz) + if (off >= (dn->dn_maxblkid + 1) * dn->dn_datablksz) return; if (len == DMU_OBJECT_END) - len = (dn->dn_maxblkid+1) * dn->dn_datablksz - off; + len = (dn->dn_maxblkid + 1) * dn->dn_datablksz - off; /* @@ -690,7 +341,7 @@ dmu_tx_hold_free(dmu_tx_t *tx, uint64_t object, uint64 dmu_tx_count_write(txh, off, 1); /* last block will be modified if it is not aligned */ if (!IS_P2ALIGNED(off + len, 1 << dn->dn_datablkshift)) - dmu_tx_count_write(txh, off+len, 1); + dmu_tx_count_write(txh, off + len, 1); } /* @@ -712,7 +363,7 @@ dmu_tx_hold_free(dmu_tx_t *tx, uint64_t object, uint64 if (dn->dn_datablkshift == 0) start = end = 0; - zio = zio_root(tx->tx_pool->dp_spa, + zio_t *zio = zio_root(tx->tx_pool->dp_spa, NULL, NULL, ZIO_FLAG_CANFAIL); for (uint64_t i = start; i <= end; i++) { uint64_t ibyte = i << shift; @@ -720,129 +371,82 @@ dmu_tx_hold_free(dmu_tx_t *tx, uint64_t object, uint64 i = ibyte >> shift; if (err == ESRCH || i > end) break; - if (err) { + if (err != 0) { tx->tx_err = err; + (void) zio_wait(zio); return; } + (void) refcount_add_many(&txh->txh_memory_tohold, + 1 << dn->dn_indblkshift, FTAG); + err = dmu_tx_check_ioerr(zio, dn, 1, i); - if (err) { + if (err != 0) { tx->tx_err = err; + (void) zio_wait(zio); return; } } err = zio_wait(zio); - if (err) { + if (err != 0) { tx->tx_err = err; return; } } - - dmu_tx_count_free(txh, off, len); } void dmu_tx_hold_zap(dmu_tx_t *tx, uint64_t object, int add, const char *name) { - dmu_tx_hold_t *txh; - dnode_t *dn; int err; ASSERT(tx->tx_txg == 0); - txh = dmu_tx_hold_object_impl(tx, tx->tx_objset, + dmu_tx_hold_t *txh = dmu_tx_hold_object_impl(tx, tx->tx_objset, object, THT_ZAP, add, (uintptr_t)name); if (txh == NULL) return; - dn = txh->txh_dnode; + dnode_t *dn = txh->txh_dnode; dmu_tx_count_dnode(txh); - if (dn == NULL) { - /* - * We will be able to fit a new object's entries into one leaf - * block. So there will be at most 2 blocks total, - * including the header block. - */ - dmu_tx_count_write(txh, 0, 2 << fzap_default_block_shift); + /* + * Modifying a almost-full microzap is around the worst case (128KB) + * + * If it is a fat zap, the worst case would be 7*16KB=112KB: + * - 3 blocks overwritten: target leaf, ptrtbl block, header block + * - 4 new blocks written if adding: + * - 2 blocks for possibly split leaves, + * - 2 grown ptrtbl blocks + */ + (void) refcount_add_many(&txh->txh_space_towrite, + MZAP_MAX_BLKSZ, FTAG); + + if (dn == NULL) return; - } ASSERT3P(DMU_OT_BYTESWAP(dn->dn_type), ==, DMU_BSWAP_ZAP); - if (dn->dn_maxblkid == 0 && !add) { - blkptr_t *bp; - + if (dn->dn_maxblkid == 0 || name == NULL) { /* - * If there is only one block (i.e. this is a micro-zap) - * and we are not adding anything, the accounting is simple. + * This is a microzap (only one block), or we don't know + * the name. Check the first block for i/o errors. */ err = dmu_tx_check_ioerr(NULL, dn, 0, 0); - if (err) { + if (err != 0) { tx->tx_err = err; - return; } - + } else { /* - * Use max block size here, since we don't know how much - * the size will change between now and the dbuf dirty call. + * Access the name so that we'll check for i/o errors to + * the leaf blocks, etc. We ignore ENOENT, as this name + * may not yet exist. */ - bp = &dn->dn_phys->dn_blkptr[0]; - if (dsl_dataset_block_freeable(dn->dn_objset->os_dsl_dataset, - bp, bp->blk_birth)) { - (void) refcount_add_many(&txh->txh_space_tooverwrite, - MZAP_MAX_BLKSZ, FTAG); - } else { - (void) refcount_add_many(&txh->txh_space_towrite, - MZAP_MAX_BLKSZ, FTAG); - } - if (!BP_IS_HOLE(bp)) { - (void) refcount_add_many(&txh->txh_space_tounref, - MZAP_MAX_BLKSZ, FTAG); - } - return; - } - - if (dn->dn_maxblkid > 0 && name) { - /* - * access the name in this fat-zap so that we'll check - * for i/o errors to the leaf blocks, etc. - */ err = zap_lookup_by_dnode(dn, name, 8, 0, NULL); - if (err == EIO) { + if (err == EIO || err == ECKSUM || err == ENXIO) { tx->tx_err = err; - return; } } - - err = zap_count_write_by_dnode(dn, name, add, - &txh->txh_space_towrite, &txh->txh_space_tooverwrite); - - /* - * If the modified blocks are scattered to the four winds, - * we'll have to modify an indirect twig for each. We can make - * modifications at up to 3 locations: - * - header block at the beginning of the object - * - target leaf block - * - end of the object, where we might need to write: - * - a new leaf block if the target block needs to be split - * - the new pointer table, if it is growing - * - the new cookie table, if it is growing - */ - int epbs = dn->dn_indblkshift - SPA_BLKPTRSHIFT; - dsl_dataset_phys_t *ds_phys = - dsl_dataset_phys(dn->dn_objset->os_dsl_dataset); - for (int lvl = 1; lvl < dn->dn_nlevels; lvl++) { - uint64_t num_indirects = 1 + (dn->dn_maxblkid >> (epbs * lvl)); - uint64_t spc = MIN(3, num_indirects) << dn->dn_indblkshift; - if (ds_phys->ds_prev_snap_obj != 0) { - (void) refcount_add_many(&txh->txh_space_towrite, - spc, FTAG); - } else { - (void) refcount_add_many(&txh->txh_space_tooverwrite, - spc, FTAG); - } - } } void @@ -870,42 +474,15 @@ dmu_tx_hold_space(dmu_tx_t *tx, uint64_t space) (void) refcount_add_many(&txh->txh_space_towrite, space, FTAG); } -int -dmu_tx_holds(dmu_tx_t *tx, uint64_t object) -{ - dmu_tx_hold_t *txh; - int holds = 0; - - /* - * By asserting that the tx is assigned, we're counting the - * number of dn_tx_holds, which is the same as the number of - * dn_holds. Otherwise, we'd be counting dn_holds, but - * dn_tx_holds could be 0. - */ - ASSERT(tx->tx_txg != 0); - - /* if (tx->tx_anyobj == TRUE) */ - /* return (0); */ - - for (txh = list_head(&tx->tx_holds); txh; - txh = list_next(&tx->tx_holds, txh)) { - if (txh->txh_dnode && txh->txh_dnode->dn_object == object) - holds++; - } - - return (holds); -} - #ifdef ZFS_DEBUG void dmu_tx_dirty_buf(dmu_tx_t *tx, dmu_buf_impl_t *db) { - dmu_tx_hold_t *txh; *** DIFF OUTPUT TRUNCATED AT 1000 LINES *** From owner-svn-src-stable-11@freebsd.org Wed Jul 26 16:42:33 2017 Return-Path: Delivered-To: svn-src-stable-11@mailman.ysv.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by mailman.ysv.freebsd.org (Postfix) with ESMTP id 8A97EDA956E; Wed, 26 Jul 2017 16:42:33 +0000 (UTC) (envelope-from mav@FreeBSD.org) Received: from repo.freebsd.org (repo.freebsd.org [IPv6:2610:1c1:1:6068::e6a:0]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (Client did not present a certificate) by mx1.freebsd.org (Postfix) with ESMTPS id 60B556A91A; Wed, 26 Jul 2017 16:42:33 +0000 (UTC) (envelope-from mav@FreeBSD.org) Received: from repo.freebsd.org ([127.0.1.37]) by repo.freebsd.org (8.15.2/8.15.2) with ESMTP id v6QGgWwS076542; Wed, 26 Jul 2017 16:42:32 GMT (envelope-from mav@FreeBSD.org) Received: (from mav@localhost) by repo.freebsd.org (8.15.2/8.15.2/Submit) id v6QGgWg3076540; Wed, 26 Jul 2017 16:42:32 GMT (envelope-from mav@FreeBSD.org) Message-Id: <201707261642.v6QGgWg3076540@repo.freebsd.org> X-Authentication-Warning: repo.freebsd.org: mav set sender to mav@FreeBSD.org using -f From: Alexander Motin Date: Wed, 26 Jul 2017 16:42:32 +0000 (UTC) To: src-committers@freebsd.org, svn-src-all@freebsd.org, svn-src-stable@freebsd.org, svn-src-stable-11@freebsd.org Subject: svn commit: r321548 - stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs X-SVN-Group: stable-11 X-SVN-Commit-Author: mav X-SVN-Commit-Paths: stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs X-SVN-Commit-Revision: 321548 X-SVN-Commit-Repository: base MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit X-BeenThere: svn-src-stable-11@freebsd.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: SVN commit messages for only the 11-stable src tree List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 26 Jul 2017 16:42:33 -0000 Author: mav Date: Wed Jul 26 16:42:32 2017 New Revision: 321548 URL: https://svnweb.freebsd.org/changeset/base/321548 Log: MFC r318822: MFC r316913: 7869 panic in bpobj_space(): null pointer dereference illumos/illumos-gate@a3905a45920de250d181b66ac0b6b71bd200d9ef https://github.com/illumos/illumos-gate/commit/a3905a45920de250d181b66ac0b6b71bd200d9ef https://www.illumos.org/issues/7869 The issue fixed by this patch is a race condition in the deadlist code. A thread executing an administrative command that uses `dsl_deadlist_space_range()` holds the lock of the whole `deadlist_t` to protect the access of all its entries that the deadlist contains in an avl tree. Sync threads trying to insert a new entry in the deadlist (through `dsl_deadlist_insert()` -> `dle_enqueue()`) do not hold the deadlist lock at that moment. If the `dle_bpobj` is the empty bpobj (our sentinel value), we close and reopen it. Between these two operations, it is possible for the `dsl_deadlist_space_range()` thread to dereference that bpobj which is `NULL` during that window. Threads should hold the a deadlist's `dl_lock` when they manipulate its internal data so scenarios like the one above are avoided. In addition, threads should also hold the bpobj lock whenever they are allocating the subobj list of a bpobj, and not just when they actually insert the subobj to the list. This way we can avoid potential memory leaks. Reviewed by: Matt Ahrens Reviewed by: Dan Kimmel Reviewed by: Steve Gonczi Reviewed by: John Kennedy Reviewed by: George Melikov Reviewed by: Brian Behlendorf Approved by: Dan McDonald Author: Serapheim Dimitropoulos Modified: stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/bpobj.c stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/dsl_deadlist.c Directory Properties: stable/11/ (props changed) Modified: stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/bpobj.c ============================================================================== --- stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/bpobj.c Wed Jul 26 16:35:17 2017 (r321547) +++ stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/bpobj.c Wed Jul 26 16:42:32 2017 (r321548) @@ -20,7 +20,7 @@ */ /* * Copyright (c) 2005, 2010, Oracle and/or its affiliates. All rights reserved. - * Copyright (c) 2011, 2014 by Delphix. All rights reserved. + * Copyright (c) 2011, 2016 by Delphix. All rights reserved. * Copyright (c) 2014 Integros [integros.com] */ @@ -395,6 +395,7 @@ bpobj_enqueue_subobj(bpobj_t *bpo, uint64_t subobj, dm return; } + mutex_enter(&bpo->bpo_lock); dmu_buf_will_dirty(bpo->bpo_dbuf, tx); if (bpo->bpo_phys->bpo_subobjs == 0) { bpo->bpo_phys->bpo_subobjs = dmu_object_alloc(bpo->bpo_os, @@ -406,7 +407,6 @@ bpobj_enqueue_subobj(bpobj_t *bpo, uint64_t subobj, dm ASSERT0(dmu_object_info(bpo->bpo_os, bpo->bpo_phys->bpo_subobjs, &doi)); ASSERT3U(doi.doi_type, ==, DMU_OT_BPOBJ_SUBOBJ); - mutex_enter(&bpo->bpo_lock); dmu_write(bpo->bpo_os, bpo->bpo_phys->bpo_subobjs, bpo->bpo_phys->bpo_num_subobjs * sizeof (subobj), sizeof (subobj), &subobj, tx); Modified: stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/dsl_deadlist.c ============================================================================== --- stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/dsl_deadlist.c Wed Jul 26 16:35:17 2017 (r321547) +++ stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/dsl_deadlist.c Wed Jul 26 16:42:32 2017 (r321548) @@ -72,6 +72,8 @@ dsl_deadlist_load_tree(dsl_deadlist_t *dl) zap_cursor_t zc; zap_attribute_t za; + ASSERT(MUTEX_HELD(&dl->dl_lock)); + ASSERT(!dl->dl_oldfmt); if (dl->dl_havetree) return; @@ -182,6 +184,7 @@ static void dle_enqueue(dsl_deadlist_t *dl, dsl_deadlist_entry_t *dle, const blkptr_t *bp, dmu_tx_t *tx) { + ASSERT(MUTEX_HELD(&dl->dl_lock)); if (dle->dle_bpobj.bpo_object == dmu_objset_pool(dl->dl_os)->dp_empty_bpobj) { uint64_t obj = bpobj_alloc(dl->dl_os, SPA_OLD_MAXBLOCKSIZE, tx); @@ -198,6 +201,7 @@ static void dle_enqueue_subobj(dsl_deadlist_t *dl, dsl_deadlist_entry_t *dle, uint64_t obj, dmu_tx_t *tx) { + ASSERT(MUTEX_HELD(&dl->dl_lock)); if (dle->dle_bpobj.bpo_object != dmu_objset_pool(dl->dl_os)->dp_empty_bpobj) { bpobj_enqueue_subobj(&dle->dle_bpobj, obj, tx); @@ -222,15 +226,14 @@ dsl_deadlist_insert(dsl_deadlist_t *dl, const blkptr_t return; } + mutex_enter(&dl->dl_lock); dsl_deadlist_load_tree(dl); dmu_buf_will_dirty(dl->dl_dbuf, tx); - mutex_enter(&dl->dl_lock); dl->dl_phys->dl_used += bp_get_dsize_sync(dmu_objset_spa(dl->dl_os), bp); dl->dl_phys->dl_comp += BP_GET_PSIZE(bp); dl->dl_phys->dl_uncomp += BP_GET_UCSIZE(bp); - mutex_exit(&dl->dl_lock); dle_tofind.dle_mintxg = bp->blk_birth; dle = avl_find(&dl->dl_tree, &dle_tofind, &where); @@ -239,6 +242,7 @@ dsl_deadlist_insert(dsl_deadlist_t *dl, const blkptr_t else dle = AVL_PREV(&dl->dl_tree, dle); dle_enqueue(dl, dle, bp, tx); + mutex_exit(&dl->dl_lock); } /* @@ -254,16 +258,19 @@ dsl_deadlist_add_key(dsl_deadlist_t *dl, uint64_t mint if (dl->dl_oldfmt) return; - dsl_deadlist_load_tree(dl); - dle = kmem_alloc(sizeof (*dle), KM_SLEEP); dle->dle_mintxg = mintxg; + + mutex_enter(&dl->dl_lock); + dsl_deadlist_load_tree(dl); + obj = bpobj_alloc_empty(dl->dl_os, SPA_OLD_MAXBLOCKSIZE, tx); VERIFY3U(0, ==, bpobj_open(&dle->dle_bpobj, dl->dl_os, obj)); avl_add(&dl->dl_tree, dle); VERIFY3U(0, ==, zap_add_int_key(dl->dl_os, dl->dl_object, mintxg, obj, tx)); + mutex_exit(&dl->dl_lock); } /* @@ -278,6 +285,7 @@ dsl_deadlist_remove_key(dsl_deadlist_t *dl, uint64_t m if (dl->dl_oldfmt) return; + mutex_enter(&dl->dl_lock); dsl_deadlist_load_tree(dl); dle_tofind.dle_mintxg = mintxg; @@ -291,6 +299,7 @@ dsl_deadlist_remove_key(dsl_deadlist_t *dl, uint64_t m kmem_free(dle, sizeof (*dle)); VERIFY3U(0, ==, zap_remove_int(dl->dl_os, dl->dl_object, mintxg, tx)); + mutex_exit(&dl->dl_lock); } /* @@ -334,6 +343,7 @@ dsl_deadlist_clone(dsl_deadlist_t *dl, uint64_t maxtxg return (newobj); } + mutex_enter(&dl->dl_lock); dsl_deadlist_load_tree(dl); for (dle = avl_first(&dl->dl_tree); dle; @@ -347,6 +357,7 @@ dsl_deadlist_clone(dsl_deadlist_t *dl, uint64_t maxtxg VERIFY3U(0, ==, zap_add_int_key(dl->dl_os, newobj, dle->dle_mintxg, obj, tx)); } + mutex_exit(&dl->dl_lock); return (newobj); } @@ -424,6 +435,8 @@ dsl_deadlist_insert_bpobj(dsl_deadlist_t *dl, uint64_t uint64_t used, comp, uncomp; bpobj_t bpo; + ASSERT(MUTEX_HELD(&dl->dl_lock)); + VERIFY3U(0, ==, bpobj_open(&bpo, dl->dl_os, obj)); VERIFY3U(0, ==, bpobj_space(&bpo, &used, &comp, &uncomp)); bpobj_close(&bpo); @@ -431,11 +444,9 @@ dsl_deadlist_insert_bpobj(dsl_deadlist_t *dl, uint64_t dsl_deadlist_load_tree(dl); dmu_buf_will_dirty(dl->dl_dbuf, tx); - mutex_enter(&dl->dl_lock); dl->dl_phys->dl_used += used; dl->dl_phys->dl_comp += comp; dl->dl_phys->dl_uncomp += uncomp; - mutex_exit(&dl->dl_lock); dle_tofind.dle_mintxg = birth; dle = avl_find(&dl->dl_tree, &dle_tofind, &where); @@ -475,6 +486,7 @@ dsl_deadlist_merge(dsl_deadlist_t *dl, uint64_t obj, d return; } + mutex_enter(&dl->dl_lock); for (zap_cursor_init(&zc, dl->dl_os, obj); zap_cursor_retrieve(&zc, &za) == 0; zap_cursor_advance(&zc)) { @@ -489,6 +501,7 @@ dsl_deadlist_merge(dsl_deadlist_t *dl, uint64_t obj, d dmu_buf_will_dirty(bonus, tx); bzero(dlp, sizeof (*dlp)); dmu_buf_rele(bonus, FTAG); + mutex_exit(&dl->dl_lock); } /* @@ -503,6 +516,8 @@ dsl_deadlist_move_bpobj(dsl_deadlist_t *dl, bpobj_t *b avl_index_t where; ASSERT(!dl->dl_oldfmt); + + mutex_enter(&dl->dl_lock); dmu_buf_will_dirty(dl->dl_dbuf, tx); dsl_deadlist_load_tree(dl); @@ -518,14 +533,12 @@ dsl_deadlist_move_bpobj(dsl_deadlist_t *dl, bpobj_t *b VERIFY3U(0, ==, bpobj_space(&dle->dle_bpobj, &used, &comp, &uncomp)); - mutex_enter(&dl->dl_lock); ASSERT3U(dl->dl_phys->dl_used, >=, used); ASSERT3U(dl->dl_phys->dl_comp, >=, comp); ASSERT3U(dl->dl_phys->dl_uncomp, >=, uncomp); dl->dl_phys->dl_used -= used; dl->dl_phys->dl_comp -= comp; dl->dl_phys->dl_uncomp -= uncomp; - mutex_exit(&dl->dl_lock); VERIFY3U(0, ==, zap_remove_int(dl->dl_os, dl->dl_object, dle->dle_mintxg, tx)); @@ -536,4 +549,5 @@ dsl_deadlist_move_bpobj(dsl_deadlist_t *dl, bpobj_t *b kmem_free(dle, sizeof (*dle)); dle = dle_next; } + mutex_exit(&dl->dl_lock); } From owner-svn-src-stable-11@freebsd.org Wed Jul 26 16:43:22 2017 Return-Path: Delivered-To: svn-src-stable-11@mailman.ysv.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by mailman.ysv.freebsd.org (Postfix) with ESMTP id B20E3DA962B; Wed, 26 Jul 2017 16:43:22 +0000 (UTC) (envelope-from mav@FreeBSD.org) Received: from repo.freebsd.org (repo.freebsd.org [IPv6:2610:1c1:1:6068::e6a:0]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (Client did not present a certificate) by mx1.freebsd.org (Postfix) with ESMTPS id 7AAED6AA5C; Wed, 26 Jul 2017 16:43:22 +0000 (UTC) (envelope-from mav@FreeBSD.org) Received: from repo.freebsd.org ([127.0.1.37]) by repo.freebsd.org (8.15.2/8.15.2) with ESMTP id v6QGhLtl076633; Wed, 26 Jul 2017 16:43:21 GMT (envelope-from mav@FreeBSD.org) Received: (from mav@localhost) by repo.freebsd.org (8.15.2/8.15.2/Submit) id v6QGhL3M076625; Wed, 26 Jul 2017 16:43:21 GMT (envelope-from mav@FreeBSD.org) Message-Id: <201707261643.v6QGhL3M076625@repo.freebsd.org> X-Authentication-Warning: repo.freebsd.org: mav set sender to mav@FreeBSD.org using -f From: Alexander Motin Date: Wed, 26 Jul 2017 16:43:21 +0000 (UTC) To: src-committers@freebsd.org, svn-src-all@freebsd.org, svn-src-stable@freebsd.org, svn-src-stable-11@freebsd.org Subject: svn commit: r321549 - in stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs: . sys X-SVN-Group: stable-11 X-SVN-Commit-Author: mav X-SVN-Commit-Paths: in stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs: . sys X-SVN-Commit-Revision: 321549 X-SVN-Commit-Repository: base MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit X-BeenThere: svn-src-stable-11@freebsd.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: SVN commit messages for only the 11-stable src tree List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 26 Jul 2017 16:43:22 -0000 Author: mav Date: Wed Jul 26 16:43:20 2017 New Revision: 321549 URL: https://svnweb.freebsd.org/changeset/base/321549 Log: MFC r318823: MFC r316914: 7801 add more by-dnode routines illumos/illumos-gate@b0c42cd4706ba01ce158bd2bb1004f7e59eca5fe https://github.com/illumos/illumos-gate/commit/b0c42cd4706ba01ce158bd2bb1004f7e59eca5fe https://www.illumos.org/issues/7801 Add *_by_dnode() routines for accessing objects given their dnode_t *, this is more efficient than accessing the object by (objset_t *, uint64_t object). This change converts some but not all of the existing consumers. As performance-sensitive code paths are discovered they should be converted to use these routines. Ported from: https://github.com/zfsonlinux/zfs/commit/0eef1bde31d67091d3deed23fe2394f5a8bf2276 Reviewed by: Matthew Ahrens Reviewed by: Brian Behlendorf Reviewed by: Pavel Zakharov Approved by: Robert Mustacchi Author: bzzz77 Modified: stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/dmu.c stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/dmu_object.c stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/dmu_tx.c stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/sys/dmu.h stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/sys/dmu_tx.h stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/sys/zap.h stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/zap_micro.c Directory Properties: stable/11/ (props changed) Modified: stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/dmu.c ============================================================================== --- stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/dmu.c Wed Jul 26 16:42:32 2017 (r321548) +++ stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/dmu.c Wed Jul 26 16:43:20 2017 (r321549) @@ -871,18 +871,13 @@ dmu_free_range(objset_t *os, uint64_t object, uint64_t return (0); } -int -dmu_read(objset_t *os, uint64_t object, uint64_t offset, uint64_t size, +static int +dmu_read_impl(dnode_t *dn, uint64_t offset, uint64_t size, void *buf, uint32_t flags) { - dnode_t *dn; dmu_buf_t **dbp; - int numbufs, err; + int numbufs, err = 0; - err = dnode_hold(os, object, FTAG, &dn); - if (err) - return (err); - /* * Deal with odd block sizes, where there can't be data past the first * block. If we ever do the tail block optimization, we will need to @@ -926,23 +921,38 @@ dmu_read(objset_t *os, uint64_t object, uint64_t offse } dmu_buf_rele_array(dbp, numbufs, FTAG); } - dnode_rele(dn, FTAG); return (err); } -void -dmu_write(objset_t *os, uint64_t object, uint64_t offset, uint64_t size, - const void *buf, dmu_tx_t *tx) +int +dmu_read(objset_t *os, uint64_t object, uint64_t offset, uint64_t size, + void *buf, uint32_t flags) { - dmu_buf_t **dbp; - int numbufs, i; + dnode_t *dn; + int err; - if (size == 0) - return; + err = dnode_hold(os, object, FTAG, &dn); + if (err != 0) + return (err); - VERIFY(0 == dmu_buf_hold_array(os, object, offset, size, - FALSE, FTAG, &numbufs, &dbp)); + err = dmu_read_impl(dn, offset, size, buf, flags); + dnode_rele(dn, FTAG); + return (err); +} +int +dmu_read_by_dnode(dnode_t *dn, uint64_t offset, uint64_t size, void *buf, + uint32_t flags) +{ + return (dmu_read_impl(dn, offset, size, buf, flags)); +} + +static void +dmu_write_impl(dmu_buf_t **dbp, int numbufs, uint64_t offset, uint64_t size, + const void *buf, dmu_tx_t *tx) +{ + int i; + for (i = 0; i < numbufs; i++) { int tocpy; int bufoff; @@ -969,6 +979,37 @@ dmu_write(objset_t *os, uint64_t object, uint64_t offs size -= tocpy; buf = (char *)buf + tocpy; } +} + +void +dmu_write(objset_t *os, uint64_t object, uint64_t offset, uint64_t size, + const void *buf, dmu_tx_t *tx) +{ + dmu_buf_t **dbp; + int numbufs; + + if (size == 0) + return; + + VERIFY0(dmu_buf_hold_array(os, object, offset, size, + FALSE, FTAG, &numbufs, &dbp)); + dmu_write_impl(dbp, numbufs, offset, size, buf, tx); + dmu_buf_rele_array(dbp, numbufs, FTAG); +} + +void +dmu_write_by_dnode(dnode_t *dn, uint64_t offset, uint64_t size, + const void *buf, dmu_tx_t *tx) +{ + dmu_buf_t **dbp; + int numbufs; + + if (size == 0) + return; + + VERIFY0(dmu_buf_hold_array_by_dnode(dn, offset, size, + FALSE, FTAG, &numbufs, &dbp, DMU_READ_PREFETCH)); + dmu_write_impl(dbp, numbufs, offset, size, buf, tx); dmu_buf_rele_array(dbp, numbufs, FTAG); } Modified: stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/dmu_object.c ============================================================================== --- stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/dmu_object.c Wed Jul 26 16:42:32 2017 (r321548) +++ stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/dmu_object.c Wed Jul 26 16:43:20 2017 (r321549) @@ -93,11 +93,11 @@ dmu_object_alloc(objset_t *os, dmu_object_type_t ot, i } dnode_allocate(dn, ot, blocksize, 0, bonustype, bonuslen, tx); - dnode_rele(dn, FTAG); - mutex_exit(&os->os_obj_lock); - dmu_tx_add_new_object(tx, os, object); + dmu_tx_add_new_object(tx, dn); + dnode_rele(dn, FTAG); + return (object); } @@ -115,9 +115,10 @@ dmu_object_claim(objset_t *os, uint64_t object, dmu_ob if (err) return (err); dnode_allocate(dn, ot, blocksize, 0, bonustype, bonuslen, tx); + dmu_tx_add_new_object(tx, dn); + dnode_rele(dn, FTAG); - dmu_tx_add_new_object(tx, os, object); return (0); } Modified: stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/dmu_tx.c ============================================================================== --- stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/dmu_tx.c Wed Jul 26 16:42:32 2017 (r321548) +++ stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/dmu_tx.c Wed Jul 26 16:43:20 2017 (r321549) @@ -93,21 +93,14 @@ dmu_tx_private_ok(dmu_tx_t *tx) } static dmu_tx_hold_t * -dmu_tx_hold_object_impl(dmu_tx_t *tx, objset_t *os, uint64_t object, - enum dmu_tx_hold_type type, uint64_t arg1, uint64_t arg2) +dmu_tx_hold_dnode_impl(dmu_tx_t *tx, dnode_t *dn, enum dmu_tx_hold_type type, + uint64_t arg1, uint64_t arg2) { dmu_tx_hold_t *txh; - dnode_t *dn = NULL; - int err; - if (object != DMU_NEW_OBJECT) { - err = dnode_hold(os, object, tx, &dn); - if (err) { - tx->tx_err = err; - return (NULL); - } - - if (err == 0 && tx->tx_txg != 0) { + if (dn != NULL) { + (void) refcount_add(&dn->dn_holds, tx); + if (tx->tx_txg != 0) { mutex_enter(&dn->dn_mtx); /* * dn->dn_assigned_txg == tx->tx_txg doesn't pose a @@ -134,17 +127,36 @@ dmu_tx_hold_object_impl(dmu_tx_t *tx, objset_t *os, ui return (txh); } +static dmu_tx_hold_t * +dmu_tx_hold_object_impl(dmu_tx_t *tx, objset_t *os, uint64_t object, + enum dmu_tx_hold_type type, uint64_t arg1, uint64_t arg2) +{ + dnode_t *dn = NULL; + dmu_tx_hold_t *txh; + int err; + + if (object != DMU_NEW_OBJECT) { + err = dnode_hold(os, object, FTAG, &dn); + if (err != 0) { + tx->tx_err = err; + return (NULL); + } + } + txh = dmu_tx_hold_dnode_impl(tx, dn, type, arg1, arg2); + if (dn != NULL) + dnode_rele(dn, FTAG); + return (txh); +} + void -dmu_tx_add_new_object(dmu_tx_t *tx, objset_t *os, uint64_t object) +dmu_tx_add_new_object(dmu_tx_t *tx, dnode_t *dn) { /* * If we're syncing, they can manipulate any object anyhow, and * the hold on the dnode_t can cause problems. */ - if (!dmu_tx_is_syncing(tx)) { - (void) dmu_tx_hold_object_impl(tx, os, - object, THT_NEWOBJECT, 0, 0); - } + if (!dmu_tx_is_syncing(tx)) + (void) dmu_tx_hold_dnode_impl(tx, dn, THT_NEWOBJECT, 0, 0); } /* @@ -282,11 +294,26 @@ dmu_tx_hold_write(dmu_tx_t *tx, uint64_t object, uint6 txh = dmu_tx_hold_object_impl(tx, tx->tx_objset, object, THT_WRITE, off, len); - if (txh == NULL) - return; + if (txh != NULL) { + dmu_tx_count_write(txh, off, len); + dmu_tx_count_dnode(txh); + } +} - dmu_tx_count_write(txh, off, len); - dmu_tx_count_dnode(txh); +void +dmu_tx_hold_write_by_dnode(dmu_tx_t *tx, dnode_t *dn, uint64_t off, int len) +{ + dmu_tx_hold_t *txh; + + ASSERT0(tx->tx_txg); + ASSERT3U(len, <=, DMU_MAX_ACCESS); + ASSERT(len == 0 || UINT64_MAX - off >= len - 1); + + txh = dmu_tx_hold_dnode_impl(tx, dn, THT_WRITE, off, len); + if (txh != NULL) { + dmu_tx_count_write(txh, off, len); + dmu_tx_count_dnode(txh); + } } /* @@ -303,18 +330,18 @@ dmu_tx_mark_netfree(dmu_tx_t *tx) tx->tx_netfree = B_TRUE; } -void -dmu_tx_hold_free(dmu_tx_t *tx, uint64_t object, uint64_t off, uint64_t len) +static void +dmu_tx_hold_free_impl(dmu_tx_hold_t *txh, uint64_t off, uint64_t len) { + dmu_tx_t *tx; + dnode_t *dn; int err; + zio_t *zio; + tx = txh->txh_tx; ASSERT(tx->tx_txg == 0); - dmu_tx_hold_t *txh = dmu_tx_hold_object_impl(tx, tx->tx_objset, - object, THT_FREE, off, len); - if (txh == NULL) - return; - dnode_t *dn = txh->txh_dnode; + dn = txh->txh_dnode; dmu_tx_count_dnode(txh); if (off >= (dn->dn_maxblkid + 1) * dn->dn_datablksz) @@ -396,17 +423,36 @@ dmu_tx_hold_free(dmu_tx_t *tx, uint64_t object, uint64 } void -dmu_tx_hold_zap(dmu_tx_t *tx, uint64_t object, int add, const char *name) +dmu_tx_hold_free(dmu_tx_t *tx, uint64_t object, uint64_t off, uint64_t len) { + dmu_tx_hold_t *txh; + + txh = dmu_tx_hold_object_impl(tx, tx->tx_objset, + object, THT_FREE, off, len); + if (txh != NULL) + (void) dmu_tx_hold_free_impl(txh, off, len); +} + +void +dmu_tx_hold_free_by_dnode(dmu_tx_t *tx, dnode_t *dn, uint64_t off, uint64_t len) +{ + dmu_tx_hold_t *txh; + + txh = dmu_tx_hold_dnode_impl(tx, dn, THT_FREE, off, len); + if (txh != NULL) + (void) dmu_tx_hold_free_impl(txh, off, len); +} + +static void +dmu_tx_hold_zap_impl(dmu_tx_hold_t *txh, int add, const char *name) +{ + dmu_tx_t *tx = txh->txh_tx; + dnode_t *dn; int err; ASSERT(tx->tx_txg == 0); - dmu_tx_hold_t *txh = dmu_tx_hold_object_impl(tx, tx->tx_objset, - object, THT_ZAP, add, (uintptr_t)name); - if (txh == NULL) - return; - dnode_t *dn = txh->txh_dnode; + dn = txh->txh_dnode; dmu_tx_count_dnode(txh); @@ -450,6 +496,32 @@ dmu_tx_hold_zap(dmu_tx_t *tx, uint64_t object, int add } void +dmu_tx_hold_zap(dmu_tx_t *tx, uint64_t object, int add, const char *name) +{ + dmu_tx_hold_t *txh; + + ASSERT0(tx->tx_txg); + + txh = dmu_tx_hold_object_impl(tx, tx->tx_objset, + object, THT_ZAP, add, (uintptr_t)name); + if (txh != NULL) + dmu_tx_hold_zap_impl(txh, add, name); +} + +void +dmu_tx_hold_zap_by_dnode(dmu_tx_t *tx, dnode_t *dn, int add, const char *name) +{ + dmu_tx_hold_t *txh; + + ASSERT0(tx->tx_txg); + ASSERT(dn != NULL); + + txh = dmu_tx_hold_dnode_impl(tx, dn, THT_ZAP, add, (uintptr_t)name); + if (txh != NULL) + dmu_tx_hold_zap_impl(txh, add, name); +} + +void dmu_tx_hold_bonus(dmu_tx_t *tx, uint64_t object) { dmu_tx_hold_t *txh; @@ -458,6 +530,18 @@ dmu_tx_hold_bonus(dmu_tx_t *tx, uint64_t object) txh = dmu_tx_hold_object_impl(tx, tx->tx_objset, object, THT_BONUS, 0, 0); + if (txh) + dmu_tx_count_dnode(txh); +} + +void +dmu_tx_hold_bonus_by_dnode(dmu_tx_t *tx, dnode_t *dn) +{ + dmu_tx_hold_t *txh; + + ASSERT0(tx->tx_txg); + + txh = dmu_tx_hold_dnode_impl(tx, dn, THT_BONUS, 0, 0); if (txh) dmu_tx_count_dnode(txh); } Modified: stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/sys/dmu.h ============================================================================== --- stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/sys/dmu.h Wed Jul 26 16:42:32 2017 (r321548) +++ stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/sys/dmu.h Wed Jul 26 16:43:20 2017 (r321549) @@ -669,10 +669,17 @@ void dmu_buf_will_dirty(dmu_buf_t *db, dmu_tx_t *tx); dmu_tx_t *dmu_tx_create(objset_t *os); void dmu_tx_hold_write(dmu_tx_t *tx, uint64_t object, uint64_t off, int len); +void dmu_tx_hold_write_by_dnode(dmu_tx_t *tx, dnode_t *dn, uint64_t off, + int len); void dmu_tx_hold_free(dmu_tx_t *tx, uint64_t object, uint64_t off, uint64_t len); +void dmu_tx_hold_free_by_dnode(dmu_tx_t *tx, dnode_t *dn, uint64_t off, + uint64_t len); void dmu_tx_hold_zap(dmu_tx_t *tx, uint64_t object, int add, const char *name); +void dmu_tx_hold_zap_by_dnode(dmu_tx_t *tx, dnode_t *dn, int add, + const char *name); void dmu_tx_hold_bonus(dmu_tx_t *tx, uint64_t object); +void dmu_tx_hold_bonus_by_dnode(dmu_tx_t *tx, dnode_t *dn); void dmu_tx_hold_spill(dmu_tx_t *tx, uint64_t object); void dmu_tx_hold_sa(dmu_tx_t *tx, struct sa_handle *hdl, boolean_t may_grow); void dmu_tx_hold_sa_create(dmu_tx_t *tx, int total_size); @@ -722,8 +729,12 @@ int dmu_free_long_object(objset_t *os, uint64_t object #define DMU_READ_NO_PREFETCH 1 /* don't prefetch */ int dmu_read(objset_t *os, uint64_t object, uint64_t offset, uint64_t size, void *buf, uint32_t flags); +int dmu_read_by_dnode(dnode_t *dn, uint64_t offset, uint64_t size, void *buf, + uint32_t flags); void dmu_write(objset_t *os, uint64_t object, uint64_t offset, uint64_t size, const void *buf, dmu_tx_t *tx); +void dmu_write_by_dnode(dnode_t *dn, uint64_t offset, uint64_t size, + const void *buf, dmu_tx_t *tx); void dmu_prealloc(objset_t *os, uint64_t object, uint64_t offset, uint64_t size, dmu_tx_t *tx); int dmu_read_uio(objset_t *os, uint64_t object, struct uio *uio, uint64_t size); Modified: stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/sys/dmu_tx.h ============================================================================== --- stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/sys/dmu_tx.h Wed Jul 26 16:42:32 2017 (r321548) +++ stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/sys/dmu_tx.h Wed Jul 26 16:43:20 2017 (r321549) @@ -135,7 +135,7 @@ extern dmu_tx_t *dmu_tx_create_assigned(struct dsl_poo dmu_tx_t *dmu_tx_create_dd(dsl_dir_t *dd); int dmu_tx_is_syncing(dmu_tx_t *tx); int dmu_tx_private_ok(dmu_tx_t *tx); -void dmu_tx_add_new_object(dmu_tx_t *tx, objset_t *os, uint64_t object); +void dmu_tx_add_new_object(dmu_tx_t *tx, dnode_t *dn); void dmu_tx_dirty_buf(dmu_tx_t *tx, struct dmu_buf_impl *db); void dmu_tx_hold_space(dmu_tx_t *tx, uint64_t space); Modified: stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/sys/zap.h ============================================================================== --- stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/sys/zap.h Wed Jul 26 16:42:32 2017 (r321548) +++ stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/sys/zap.h Wed Jul 26 16:43:20 2017 (r321549) @@ -220,6 +220,9 @@ int zap_count_write_by_dnode(dnode_t *dn, const char * int zap_add(objset_t *ds, uint64_t zapobj, const char *key, int integer_size, uint64_t num_integers, const void *val, dmu_tx_t *tx); +int zap_add_by_dnode(dnode_t *dn, const char *key, + int integer_size, uint64_t num_integers, + const void *val, dmu_tx_t *tx); int zap_add_uint64(objset_t *ds, uint64_t zapobj, const uint64_t *key, int key_numints, int integer_size, uint64_t num_integers, const void *val, dmu_tx_t *tx); @@ -259,6 +262,7 @@ int zap_length_uint64(objset_t *os, uint64_t zapobj, c int zap_remove(objset_t *ds, uint64_t zapobj, const char *name, dmu_tx_t *tx); int zap_remove_norm(objset_t *ds, uint64_t zapobj, const char *name, matchtype_t mt, dmu_tx_t *tx); +int zap_remove_by_dnode(dnode_t *dn, const char *name, dmu_tx_t *tx); int zap_remove_uint64(objset_t *os, uint64_t zapobj, const uint64_t *key, int key_numints, dmu_tx_t *tx); Modified: stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/zap_micro.c ============================================================================== --- stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/zap_micro.c Wed Jul 26 16:42:32 2017 (r321548) +++ stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/zap_micro.c Wed Jul 26 16:43:20 2017 (r321549) @@ -1120,34 +1120,30 @@ again: ASSERT(!"out of entries!"); } -int -zap_add(objset_t *os, uint64_t zapobj, const char *key, +static int +zap_add_impl(zap_t *zap, const char *key, int integer_size, uint64_t num_integers, - const void *val, dmu_tx_t *tx) + const void *val, dmu_tx_t *tx, void *tag) { - zap_t *zap; - int err; + int err = 0; mzap_ent_t *mze; const uint64_t *intval = val; zap_name_t *zn; - err = zap_lockdir(os, zapobj, tx, RW_WRITER, TRUE, TRUE, FTAG, &zap); - if (err) - return (err); zn = zap_name_alloc(zap, key, 0); if (zn == NULL) { - zap_unlockdir(zap, FTAG); + zap_unlockdir(zap, tag); return (SET_ERROR(ENOTSUP)); } if (!zap->zap_ismicro) { - err = fzap_add(zn, integer_size, num_integers, val, FTAG, tx); + err = fzap_add(zn, integer_size, num_integers, val, tag, tx); zap = zn->zn_zap; /* fzap_add() may change zap */ } else if (integer_size != 8 || num_integers != 1 || strlen(key) >= MZAP_NAME_LEN) { - err = mzap_upgrade(&zn->zn_zap, FTAG, tx, 0); + err = mzap_upgrade(&zn->zn_zap, tag, tx, 0); if (err == 0) { err = fzap_add(zn, integer_size, num_integers, val, - FTAG, tx); + tag, tx); } zap = zn->zn_zap; /* fzap_add() may change zap */ } else { @@ -1161,11 +1157,43 @@ zap_add(objset_t *os, uint64_t zapobj, const char *key ASSERT(zap == zn->zn_zap); zap_name_free(zn); if (zap != NULL) /* may be NULL if fzap_add() failed */ - zap_unlockdir(zap, FTAG); + zap_unlockdir(zap, tag); return (err); } int +zap_add(objset_t *os, uint64_t zapobj, const char *key, + int integer_size, uint64_t num_integers, + const void *val, dmu_tx_t *tx) +{ + zap_t *zap; + int err; + + err = zap_lockdir(os, zapobj, tx, RW_WRITER, TRUE, TRUE, FTAG, &zap); + if (err != 0) + return (err); + err = zap_add_impl(zap, key, integer_size, num_integers, val, tx, FTAG); + /* zap_add_impl() calls zap_unlockdir() */ + return (err); +} + +int +zap_add_by_dnode(dnode_t *dn, const char *key, + int integer_size, uint64_t num_integers, + const void *val, dmu_tx_t *tx) +{ + zap_t *zap; + int err; + + err = zap_lockdir_by_dnode(dn, tx, RW_WRITER, TRUE, TRUE, FTAG, &zap); + if (err != 0) + return (err); + err = zap_add_impl(zap, key, integer_size, num_integers, val, tx, FTAG); + /* zap_add_impl() calls zap_unlockdir() */ + return (err); +} + +int zap_add_uint64(objset_t *os, uint64_t zapobj, const uint64_t *key, int key_numints, int integer_size, uint64_t num_integers, const void *val, dmu_tx_t *tx) @@ -1279,23 +1307,17 @@ zap_remove(objset_t *os, uint64_t zapobj, const char * return (zap_remove_norm(os, zapobj, name, 0, tx)); } -int -zap_remove_norm(objset_t *os, uint64_t zapobj, const char *name, +static int +zap_remove_impl(zap_t *zap, const char *name, matchtype_t mt, dmu_tx_t *tx) { - zap_t *zap; - int err; mzap_ent_t *mze; zap_name_t *zn; + int err = 0; - err = zap_lockdir(os, zapobj, tx, RW_WRITER, TRUE, FALSE, FTAG, &zap); - if (err) - return (err); zn = zap_name_alloc(zap, name, mt); - if (zn == NULL) { - zap_unlockdir(zap, FTAG); + if (zn == NULL) return (SET_ERROR(ENOTSUP)); - } if (!zap->zap_ismicro) { err = fzap_remove(zn, tx); } else { @@ -1310,6 +1332,34 @@ zap_remove_norm(objset_t *os, uint64_t zapobj, const c } } zap_name_free(zn); + return (err); +} + +int +zap_remove_norm(objset_t *os, uint64_t zapobj, const char *name, + matchtype_t mt, dmu_tx_t *tx) +{ + zap_t *zap; + int err; + + err = zap_lockdir(os, zapobj, tx, RW_WRITER, TRUE, FALSE, FTAG, &zap); + if (err) + return (err); + err = zap_remove_impl(zap, name, mt, tx); + zap_unlockdir(zap, FTAG); + return (err); +} + +int +zap_remove_by_dnode(dnode_t *dn, const char *name, dmu_tx_t *tx) +{ + zap_t *zap; + int err; + + err = zap_lockdir_by_dnode(dn, tx, RW_WRITER, TRUE, FALSE, FTAG, &zap); + if (err) + return (err); + err = zap_remove_impl(zap, name, 0, tx); zap_unlockdir(zap, FTAG); return (err); } From owner-svn-src-stable-11@freebsd.org Wed Jul 26 16:44:09 2017 Return-Path: Delivered-To: svn-src-stable-11@mailman.ysv.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by mailman.ysv.freebsd.org (Postfix) with ESMTP id 67A90DA96EB; Wed, 26 Jul 2017 16:44:09 +0000 (UTC) (envelope-from mav@FreeBSD.org) Received: from repo.freebsd.org (repo.freebsd.org [IPv6:2610:1c1:1:6068::e6a:0]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (Client did not present a certificate) by mx1.freebsd.org (Postfix) with ESMTPS id 342216ABB4; Wed, 26 Jul 2017 16:44:09 +0000 (UTC) (envelope-from mav@FreeBSD.org) Received: from repo.freebsd.org ([127.0.1.37]) by repo.freebsd.org (8.15.2/8.15.2) with ESMTP id v6QGi8Ux076733; Wed, 26 Jul 2017 16:44:08 GMT (envelope-from mav@FreeBSD.org) Received: (from mav@localhost) by repo.freebsd.org (8.15.2/8.15.2/Submit) id v6QGi8Nq076732; Wed, 26 Jul 2017 16:44:08 GMT (envelope-from mav@FreeBSD.org) Message-Id: <201707261644.v6QGi8Nq076732@repo.freebsd.org> X-Authentication-Warning: repo.freebsd.org: mav set sender to mav@FreeBSD.org using -f From: Alexander Motin Date: Wed, 26 Jul 2017 16:44:08 +0000 (UTC) To: src-committers@freebsd.org, svn-src-all@freebsd.org, svn-src-stable@freebsd.org, svn-src-stable-11@freebsd.org Subject: svn commit: r321550 - stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs X-SVN-Group: stable-11 X-SVN-Commit-Author: mav X-SVN-Commit-Paths: stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs X-SVN-Commit-Revision: 321550 X-SVN-Commit-Repository: base MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit X-BeenThere: svn-src-stable-11@freebsd.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: SVN commit messages for only the 11-stable src tree List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 26 Jul 2017 16:44:09 -0000 Author: mav Date: Wed Jul 26 16:44:08 2017 New Revision: 321550 URL: https://svnweb.freebsd.org/changeset/base/321550 Log: MFC r318824: MFV r316915: 7801 add more by-dnode routines (lint) illumos/illumos-gate@411be58a6e030a3b606f1afcc7f2e2459ffda844 https://github.com/illumos/illumos-gate/commit/411be58a6e030a3b606f1afcc7f2e2459ffda844 Modified: stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/dmu_tx.c Directory Properties: stable/11/ (props changed) Modified: stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/dmu_tx.c ============================================================================== --- stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/dmu_tx.c Wed Jul 26 16:43:20 2017 (r321549) +++ stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/dmu_tx.c Wed Jul 26 16:44:08 2017 (r321550) @@ -336,7 +336,6 @@ dmu_tx_hold_free_impl(dmu_tx_hold_t *txh, uint64_t off dmu_tx_t *tx; dnode_t *dn; int err; - zio_t *zio; tx = txh->txh_tx; ASSERT(tx->tx_txg == 0); @@ -444,7 +443,7 @@ dmu_tx_hold_free_by_dnode(dmu_tx_t *tx, dnode_t *dn, u } static void -dmu_tx_hold_zap_impl(dmu_tx_hold_t *txh, int add, const char *name) +dmu_tx_hold_zap_impl(dmu_tx_hold_t *txh, const char *name) { dmu_tx_t *tx = txh->txh_tx; dnode_t *dn; @@ -505,7 +504,7 @@ dmu_tx_hold_zap(dmu_tx_t *tx, uint64_t object, int add txh = dmu_tx_hold_object_impl(tx, tx->tx_objset, object, THT_ZAP, add, (uintptr_t)name); if (txh != NULL) - dmu_tx_hold_zap_impl(txh, add, name); + dmu_tx_hold_zap_impl(txh, name); } void @@ -518,7 +517,7 @@ dmu_tx_hold_zap_by_dnode(dmu_tx_t *tx, dnode_t *dn, in txh = dmu_tx_hold_dnode_impl(tx, dn, THT_ZAP, add, (uintptr_t)name); if (txh != NULL) - dmu_tx_hold_zap_impl(txh, add, name); + dmu_tx_hold_zap_impl(txh, name); } void From owner-svn-src-stable-11@freebsd.org Wed Jul 26 16:44:54 2017 Return-Path: Delivered-To: svn-src-stable-11@mailman.ysv.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by mailman.ysv.freebsd.org (Postfix) with ESMTP id 2B112DA9882; Wed, 26 Jul 2017 16:44:54 +0000 (UTC) (envelope-from mav@FreeBSD.org) Received: from repo.freebsd.org (repo.freebsd.org [IPv6:2610:1c1:1:6068::e6a:0]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (Client did not present a certificate) by mx1.freebsd.org (Postfix) with ESMTPS id F249B6AEF2; Wed, 26 Jul 2017 16:44:53 +0000 (UTC) (envelope-from mav@FreeBSD.org) Received: from repo.freebsd.org ([127.0.1.37]) by repo.freebsd.org (8.15.2/8.15.2) with ESMTP id v6QGirPq077055; Wed, 26 Jul 2017 16:44:53 GMT (envelope-from mav@FreeBSD.org) Received: (from mav@localhost) by repo.freebsd.org (8.15.2/8.15.2/Submit) id v6QGiqtG077048; Wed, 26 Jul 2017 16:44:52 GMT (envelope-from mav@FreeBSD.org) Message-Id: <201707261644.v6QGiqtG077048@repo.freebsd.org> X-Authentication-Warning: repo.freebsd.org: mav set sender to mav@FreeBSD.org using -f From: Alexander Motin Date: Wed, 26 Jul 2017 16:44:52 +0000 (UTC) To: src-committers@freebsd.org, svn-src-all@freebsd.org, svn-src-stable@freebsd.org, svn-src-stable-11@freebsd.org Subject: svn commit: r321552 - in stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs: . sys X-SVN-Group: stable-11 X-SVN-Commit-Author: mav X-SVN-Commit-Paths: in stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs: . sys X-SVN-Commit-Revision: 321552 X-SVN-Commit-Repository: base MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit X-BeenThere: svn-src-stable-11@freebsd.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: SVN commit messages for only the 11-stable src tree List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 26 Jul 2017 16:44:54 -0000 Author: mav Date: Wed Jul 26 16:44:52 2017 New Revision: 321552 URL: https://svnweb.freebsd.org/changeset/base/321552 Log: MFC r318827: MFV r316916: 7970 zfs_arc_num_sublists_per_state should be common to all multilists illumos/illumos-gate@10fbdecb05f411234920f8d3c92c148d39106d7e https://github.com/illumos/illumos-gate/commit/10fbdecb05f411234920f8d3c92c148d39106d7e https://www.illumos.org/issues/7970 The global tunable zfs_arc_num_sublists_per_state is used by the ARC and the dbuf cache, and other users are planned. We should change this tunable to be common to all multilists. Reviewed by: Pavel Zakharov Reviewed by: Brad Lewis Reviewed by: Saso Kiselkov Reviewed by: Brian Behlendorf Approved by: Dan McDonald Author: Matthew Ahrens Modified: stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/arc.c stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/dbuf.c stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/multilist.c stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/sys/arc.h stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/sys/multilist.h Directory Properties: stable/11/ (props changed) Modified: stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/arc.c ============================================================================== --- stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/arc.c Wed Jul 26 16:44:24 2017 (r321551) +++ stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/arc.c Wed Jul 26 16:44:52 2017 (r321552) @@ -21,7 +21,7 @@ /* * Copyright (c) 2005, 2010, Oracle and/or its affiliates. All rights reserved. * Copyright (c) 2012, Joyent, Inc. All rights reserved. - * Copyright (c) 2011, 2016 by Delphix. All rights reserved. + * Copyright (c) 2011, 2017 by Delphix. All rights reserved. * Copyright (c) 2014 by Saso Kiselkov. All rights reserved. * Copyright 2015 Nexenta Systems, Inc. All rights reserved. */ @@ -304,13 +304,6 @@ uint_t arc_reduce_dnlc_percent = 3; */ int zfs_arc_evict_batch_limit = 10; -/* - * The number of sublists used for each of the arc state lists. If this - * is not set to a suitable value by the user, it will be configured to - * the number of CPUs on the system in arc_init(). - */ -int zfs_arc_num_sublists_per_state = 0; - /* number of seconds before growing cache again */ static int arc_grow_retry = 60; @@ -6219,43 +6212,43 @@ arc_state_init(void) multilist_create(&arc_mru->arcs_list[ARC_BUFC_METADATA], sizeof (arc_buf_hdr_t), offsetof(arc_buf_hdr_t, b_l1hdr.b_arc_node), - zfs_arc_num_sublists_per_state, arc_state_multilist_index_func); + arc_state_multilist_index_func); multilist_create(&arc_mru->arcs_list[ARC_BUFC_DATA], sizeof (arc_buf_hdr_t), offsetof(arc_buf_hdr_t, b_l1hdr.b_arc_node), - zfs_arc_num_sublists_per_state, arc_state_multilist_index_func); + arc_state_multilist_index_func); multilist_create(&arc_mru_ghost->arcs_list[ARC_BUFC_METADATA], sizeof (arc_buf_hdr_t), offsetof(arc_buf_hdr_t, b_l1hdr.b_arc_node), - zfs_arc_num_sublists_per_state, arc_state_multilist_index_func); + arc_state_multilist_index_func); multilist_create(&arc_mru_ghost->arcs_list[ARC_BUFC_DATA], sizeof (arc_buf_hdr_t), offsetof(arc_buf_hdr_t, b_l1hdr.b_arc_node), - zfs_arc_num_sublists_per_state, arc_state_multilist_index_func); + arc_state_multilist_index_func); multilist_create(&arc_mfu->arcs_list[ARC_BUFC_METADATA], sizeof (arc_buf_hdr_t), offsetof(arc_buf_hdr_t, b_l1hdr.b_arc_node), - zfs_arc_num_sublists_per_state, arc_state_multilist_index_func); + arc_state_multilist_index_func); multilist_create(&arc_mfu->arcs_list[ARC_BUFC_DATA], sizeof (arc_buf_hdr_t), offsetof(arc_buf_hdr_t, b_l1hdr.b_arc_node), - zfs_arc_num_sublists_per_state, arc_state_multilist_index_func); + arc_state_multilist_index_func); multilist_create(&arc_mfu_ghost->arcs_list[ARC_BUFC_METADATA], sizeof (arc_buf_hdr_t), offsetof(arc_buf_hdr_t, b_l1hdr.b_arc_node), - zfs_arc_num_sublists_per_state, arc_state_multilist_index_func); + arc_state_multilist_index_func); multilist_create(&arc_mfu_ghost->arcs_list[ARC_BUFC_DATA], sizeof (arc_buf_hdr_t), offsetof(arc_buf_hdr_t, b_l1hdr.b_arc_node), - zfs_arc_num_sublists_per_state, arc_state_multilist_index_func); + arc_state_multilist_index_func); multilist_create(&arc_l2c_only->arcs_list[ARC_BUFC_METADATA], sizeof (arc_buf_hdr_t), offsetof(arc_buf_hdr_t, b_l1hdr.b_arc_node), - zfs_arc_num_sublists_per_state, arc_state_multilist_index_func); + arc_state_multilist_index_func); multilist_create(&arc_l2c_only->arcs_list[ARC_BUFC_DATA], sizeof (arc_buf_hdr_t), offsetof(arc_buf_hdr_t, b_l1hdr.b_arc_node), - zfs_arc_num_sublists_per_state, arc_state_multilist_index_func); + arc_state_multilist_index_func); refcount_create(&arc_anon->arcs_esize[ARC_BUFC_METADATA]); refcount_create(&arc_anon->arcs_esize[ARC_BUFC_DATA]); @@ -6411,9 +6404,6 @@ arc_init(void) if (zfs_arc_p_min_shift > 0) arc_p_min_shift = zfs_arc_p_min_shift; - - if (zfs_arc_num_sublists_per_state < 1) - zfs_arc_num_sublists_per_state = MAX(max_ncpus, 1); /* if kmem_flags are set, lets try to use less memory */ if (kmem_debugging()) Modified: stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/dbuf.c ============================================================================== --- stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/dbuf.c Wed Jul 26 16:44:24 2017 (r321551) +++ stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/dbuf.c Wed Jul 26 16:44:52 2017 (r321552) @@ -623,7 +623,6 @@ retry: multilist_create(&dbuf_cache, sizeof (dmu_buf_impl_t), offsetof(dmu_buf_impl_t, db_cache_link), - zfs_arc_num_sublists_per_state, dbuf_cache_multilist_index_func); refcount_create(&dbuf_cache_size); Modified: stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/multilist.c ============================================================================== --- stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/multilist.c Wed Jul 26 16:44:24 2017 (r321551) +++ stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/multilist.c Wed Jul 26 16:44:52 2017 (r321552) @@ -13,7 +13,7 @@ * CDDL HEADER END */ /* - * Copyright (c) 2013, 2014 by Delphix. All rights reserved. + * Copyright (c) 2013, 2017 by Delphix. All rights reserved. */ #include @@ -23,6 +23,12 @@ #include /* + * This overrides the number of sublists in each multilist_t, which defaults + * to the number of CPUs in the system (see multilist_create()). + */ +int zfs_multilist_num_sublists = 0; + +/* * Given the object contained on the list, return a pointer to the * object's multilist_node_t structure it contains. */ @@ -59,9 +65,9 @@ multilist_d2l(multilist_t *ml, void *obj) * requirement, but a general rule of thumb in order to garner the * best multi-threaded performance out of the data structure. */ -void -multilist_create(multilist_t *ml, size_t size, size_t offset, unsigned int num, - multilist_sublist_index_func_t *index_func) +static void +multilist_create_impl(multilist_t *ml, size_t size, size_t offset, + unsigned int num, multilist_sublist_index_func_t *index_func) { ASSERT3P(ml, !=, NULL); ASSERT3U(size, >, 0); @@ -83,6 +89,26 @@ multilist_create(multilist_t *ml, size_t size, size_t mutex_init(&mls->mls_lock, NULL, MUTEX_DEFAULT, NULL); list_create(&mls->mls_list, size, offset); } +} + +/* + * Initialize a new sublist, using the default number of sublists + * (the number of CPUs, or at least 4, or the tunable + * zfs_multilist_num_sublists). + */ +void +multilist_create(multilist_t *ml, size_t size, size_t offset, + multilist_sublist_index_func_t *index_func) +{ + int num_sublists; + + if (zfs_multilist_num_sublists > 0) { + num_sublists = zfs_multilist_num_sublists; + } else { + num_sublists = MAX(max_ncpus, 4); + } + + multilist_create_impl(ml, size, offset, num_sublists, index_func); } /* Modified: stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/sys/arc.h ============================================================================== --- stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/sys/arc.h Wed Jul 26 16:44:24 2017 (r321551) +++ stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/sys/arc.h Wed Jul 26 16:44:52 2017 (r321552) @@ -20,7 +20,7 @@ */ /* * Copyright (c) 2005, 2010, Oracle and/or its affiliates. All rights reserved. - * Copyright (c) 2012, 2016 by Delphix. All rights reserved. + * Copyright (c) 2012, 2017 by Delphix. All rights reserved. * Copyright (c) 2013 by Saso Kiselkov. All rights reserved. */ @@ -63,8 +63,6 @@ typedef void arc_done_func_t(zio_t *zio, arc_buf_t *bu /* generic arc_done_func_t's which you can use */ arc_done_func_t arc_bcopy_func; arc_done_func_t arc_getbuf_func; - -extern int zfs_arc_num_sublists_per_state; typedef enum arc_flags { Modified: stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/sys/multilist.h ============================================================================== --- stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/sys/multilist.h Wed Jul 26 16:44:24 2017 (r321551) +++ stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/sys/multilist.h Wed Jul 26 16:44:52 2017 (r321552) @@ -13,7 +13,7 @@ * CDDL HEADER END */ /* - * Copyright (c) 2013, 2014 by Delphix. All rights reserved. + * Copyright (c) 2013, 2017 by Delphix. All rights reserved. */ #ifndef _SYS_MULTILIST_H @@ -73,7 +73,7 @@ struct multilist { }; void multilist_destroy(multilist_t *); -void multilist_create(multilist_t *, size_t, size_t, unsigned int, +void multilist_create(multilist_t *, size_t, size_t, multilist_sublist_index_func_t *); void multilist_insert(multilist_t *, void *); From owner-svn-src-stable-11@freebsd.org Wed Jul 26 16:45:42 2017 Return-Path: Delivered-To: svn-src-stable-11@mailman.ysv.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by mailman.ysv.freebsd.org (Postfix) with ESMTP id 5A4DBDA9975; Wed, 26 Jul 2017 16:45:42 +0000 (UTC) (envelope-from mav@FreeBSD.org) Received: from repo.freebsd.org (repo.freebsd.org [IPv6:2610:1c1:1:6068::e6a:0]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (Client did not present a certificate) by mx1.freebsd.org (Postfix) with ESMTPS id 036D56B109; Wed, 26 Jul 2017 16:45:41 +0000 (UTC) (envelope-from mav@FreeBSD.org) Received: from repo.freebsd.org ([127.0.1.37]) by repo.freebsd.org (8.15.2/8.15.2) with ESMTP id v6QGjfwL077161; Wed, 26 Jul 2017 16:45:41 GMT (envelope-from mav@FreeBSD.org) Received: (from mav@localhost) by repo.freebsd.org (8.15.2/8.15.2/Submit) id v6QGjdjl077144; Wed, 26 Jul 2017 16:45:39 GMT (envelope-from mav@FreeBSD.org) Message-Id: <201707261645.v6QGjdjl077144@repo.freebsd.org> X-Authentication-Warning: repo.freebsd.org: mav set sender to mav@FreeBSD.org using -f From: Alexander Motin Date: Wed, 26 Jul 2017 16:45:39 +0000 (UTC) To: src-committers@freebsd.org, svn-src-all@freebsd.org, svn-src-stable@freebsd.org, svn-src-stable-11@freebsd.org Subject: svn commit: r321553 - in stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs: . sys X-SVN-Group: stable-11 X-SVN-Commit-Author: mav X-SVN-Commit-Paths: in stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs: . sys X-SVN-Commit-Revision: 321553 X-SVN-Commit-Repository: base MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit X-BeenThere: svn-src-stable-11@freebsd.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: SVN commit messages for only the 11-stable src tree List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 26 Jul 2017 16:45:42 -0000 Author: mav Date: Wed Jul 26 16:45:39 2017 New Revision: 321553 URL: https://svnweb.freebsd.org/changeset/base/321553 Log: MFC r318828: MFV r316917: 7968 multi-threaded spa_sync() illumos/illumos-gate@94c2d0eb22e9624151ee84a7edbf7178e1bf4087 https://github.com/illumos/illumos-gate/commit/94c2d0eb22e9624151ee84a7edbf7178e1bf4087 https://www.illumos.org/issues/7968 spa_sync() iterates over all the dirty dnodes and processes each of them by calling dnode_sync(). If there are many dirty dnodes (e.g. because we created or removed a lot of files), the single thread of spa_sync() calling dnode_sync() can become a bottleneck. Additionally, if many dnodes are dirtied concurrently in open context (e.g. due to concurrent file creation), the os_lock will experience lock contention via dnode_setdirty(). The solution is to track dirty dnodes on a multilist_t, and for spa_sync() to use separate threads to process each of the sublists in the multilist. On the concurrent file creation microbenchmark, the performance improvement from dnode_setdirty() is up to 7%. Additionally, the wall clock time spent in spa_sync() is reduced to 15%-40% of the single-threaded case. In terms of cost/ reward, once the other bottlenecks are addressed, fixing this bug will provide a medium-large performance gain and require a medium amount of effort to implement. Reviewed by: Pavel Zakharov Reviewed by: Brad Lewis Reviewed by: Saso Kiselkov Reviewed by: Brian Behlendorf Approved by: Dan McDonald Author: Matthew Ahrens Modified: stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/arc.c stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/dbuf.c stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/dmu_objset.c stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/dnode.c stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/dnode_sync.c stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/dsl_dataset.c stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/dsl_pool.c stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/multilist.c stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/spa_misc.c stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/sys/dmu_objset.h stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/sys/dnode.h stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/sys/dsl_pool.h stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/sys/multilist.h stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/sys/zio.h stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/zio.c Directory Properties: stable/11/ (props changed) Modified: stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/arc.c ============================================================================== --- stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/arc.c Wed Jul 26 16:44:52 2017 (r321552) +++ stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/arc.c Wed Jul 26 16:45:39 2017 (r321553) @@ -473,7 +473,7 @@ typedef struct arc_state { /* * list of evictable buffers */ - multilist_t arcs_list[ARC_BUFC_NUMTYPES]; + multilist_t *arcs_list[ARC_BUFC_NUMTYPES]; /* * total amount of evictable data in this state */ @@ -2359,7 +2359,7 @@ add_reference(arc_buf_hdr_t *hdr, void *tag) (state != arc_anon)) { /* We don't use the L2-only state list. */ if (state != arc_l2c_only) { - multilist_remove(&state->arcs_list[arc_buf_type(hdr)], + multilist_remove(state->arcs_list[arc_buf_type(hdr)], hdr); arc_evictable_space_decrement(hdr, state); } @@ -2389,7 +2389,7 @@ remove_reference(arc_buf_hdr_t *hdr, kmutex_t *hash_lo */ if (((cnt = refcount_remove(&hdr->b_l1hdr.b_refcnt, tag)) == 0) && (state != arc_anon)) { - multilist_insert(&state->arcs_list[arc_buf_type(hdr)], hdr); + multilist_insert(state->arcs_list[arc_buf_type(hdr)], hdr); ASSERT3U(hdr->b_l1hdr.b_bufcnt, >, 0); arc_evictable_space_increment(hdr, state); } @@ -2442,7 +2442,7 @@ arc_change_state(arc_state_t *new_state, arc_buf_hdr_t if (refcnt == 0) { if (old_state != arc_anon && old_state != arc_l2c_only) { ASSERT(HDR_HAS_L1HDR(hdr)); - multilist_remove(&old_state->arcs_list[buftype], hdr); + multilist_remove(old_state->arcs_list[buftype], hdr); if (GHOST_STATE(old_state)) { ASSERT0(bufcnt); @@ -2460,7 +2460,7 @@ arc_change_state(arc_state_t *new_state, arc_buf_hdr_t * beforehand. */ ASSERT(HDR_HAS_L1HDR(hdr)); - multilist_insert(&new_state->arcs_list[buftype], hdr); + multilist_insert(new_state->arcs_list[buftype], hdr); if (GHOST_STATE(new_state)) { ASSERT0(bufcnt); @@ -2586,8 +2586,8 @@ arc_change_state(arc_state_t *new_state, arc_buf_hdr_t * L2 headers should never be on the L2 state list since they don't * have L1 headers allocated. */ - ASSERT(multilist_is_empty(&arc_l2c_only->arcs_list[ARC_BUFC_DATA]) && - multilist_is_empty(&arc_l2c_only->arcs_list[ARC_BUFC_METADATA])); + ASSERT(multilist_is_empty(arc_l2c_only->arcs_list[ARC_BUFC_DATA]) && + multilist_is_empty(arc_l2c_only->arcs_list[ARC_BUFC_METADATA])); } void @@ -3671,7 +3671,7 @@ arc_evict_state(arc_state_t *state, uint64_t spa, int6 arc_buf_contents_t type) { uint64_t total_evicted = 0; - multilist_t *ml = &state->arcs_list[type]; + multilist_t *ml = state->arcs_list[type]; int num_sublists; arc_buf_hdr_t **markers; @@ -3875,8 +3875,8 @@ arc_adjust_meta(void) static arc_buf_contents_t arc_adjust_type(arc_state_t *state) { - multilist_t *data_ml = &state->arcs_list[ARC_BUFC_DATA]; - multilist_t *meta_ml = &state->arcs_list[ARC_BUFC_METADATA]; + multilist_t *data_ml = state->arcs_list[ARC_BUFC_DATA]; + multilist_t *meta_ml = state->arcs_list[ARC_BUFC_METADATA]; int data_idx = multilist_get_random_index(data_ml); int meta_idx = multilist_get_random_index(meta_ml); multilist_sublist_t *data_mls; @@ -6209,44 +6209,44 @@ arc_state_init(void) arc_mfu_ghost = &ARC_mfu_ghost; arc_l2c_only = &ARC_l2c_only; - multilist_create(&arc_mru->arcs_list[ARC_BUFC_METADATA], - sizeof (arc_buf_hdr_t), + arc_mru->arcs_list[ARC_BUFC_METADATA] = + multilist_create(sizeof (arc_buf_hdr_t), offsetof(arc_buf_hdr_t, b_l1hdr.b_arc_node), arc_state_multilist_index_func); - multilist_create(&arc_mru->arcs_list[ARC_BUFC_DATA], - sizeof (arc_buf_hdr_t), + arc_mru->arcs_list[ARC_BUFC_DATA] = + multilist_create(sizeof (arc_buf_hdr_t), offsetof(arc_buf_hdr_t, b_l1hdr.b_arc_node), arc_state_multilist_index_func); - multilist_create(&arc_mru_ghost->arcs_list[ARC_BUFC_METADATA], - sizeof (arc_buf_hdr_t), + arc_mru_ghost->arcs_list[ARC_BUFC_METADATA] = + multilist_create(sizeof (arc_buf_hdr_t), offsetof(arc_buf_hdr_t, b_l1hdr.b_arc_node), arc_state_multilist_index_func); - multilist_create(&arc_mru_ghost->arcs_list[ARC_BUFC_DATA], - sizeof (arc_buf_hdr_t), + arc_mru_ghost->arcs_list[ARC_BUFC_DATA] = + multilist_create(sizeof (arc_buf_hdr_t), offsetof(arc_buf_hdr_t, b_l1hdr.b_arc_node), arc_state_multilist_index_func); - multilist_create(&arc_mfu->arcs_list[ARC_BUFC_METADATA], - sizeof (arc_buf_hdr_t), + arc_mfu->arcs_list[ARC_BUFC_METADATA] = + multilist_create(sizeof (arc_buf_hdr_t), offsetof(arc_buf_hdr_t, b_l1hdr.b_arc_node), arc_state_multilist_index_func); - multilist_create(&arc_mfu->arcs_list[ARC_BUFC_DATA], - sizeof (arc_buf_hdr_t), + arc_mfu->arcs_list[ARC_BUFC_DATA] = + multilist_create(sizeof (arc_buf_hdr_t), offsetof(arc_buf_hdr_t, b_l1hdr.b_arc_node), arc_state_multilist_index_func); - multilist_create(&arc_mfu_ghost->arcs_list[ARC_BUFC_METADATA], - sizeof (arc_buf_hdr_t), + arc_mfu_ghost->arcs_list[ARC_BUFC_METADATA] = + multilist_create(sizeof (arc_buf_hdr_t), offsetof(arc_buf_hdr_t, b_l1hdr.b_arc_node), arc_state_multilist_index_func); - multilist_create(&arc_mfu_ghost->arcs_list[ARC_BUFC_DATA], - sizeof (arc_buf_hdr_t), + arc_mfu_ghost->arcs_list[ARC_BUFC_DATA] = + multilist_create(sizeof (arc_buf_hdr_t), offsetof(arc_buf_hdr_t, b_l1hdr.b_arc_node), arc_state_multilist_index_func); - multilist_create(&arc_l2c_only->arcs_list[ARC_BUFC_METADATA], - sizeof (arc_buf_hdr_t), + arc_l2c_only->arcs_list[ARC_BUFC_METADATA] = + multilist_create(sizeof (arc_buf_hdr_t), offsetof(arc_buf_hdr_t, b_l1hdr.b_arc_node), arc_state_multilist_index_func); - multilist_create(&arc_l2c_only->arcs_list[ARC_BUFC_DATA], - sizeof (arc_buf_hdr_t), + arc_l2c_only->arcs_list[ARC_BUFC_DATA] = + multilist_create(sizeof (arc_buf_hdr_t), offsetof(arc_buf_hdr_t, b_l1hdr.b_arc_node), arc_state_multilist_index_func); @@ -6294,14 +6294,14 @@ arc_state_fini(void) refcount_destroy(&arc_mfu_ghost->arcs_size); refcount_destroy(&arc_l2c_only->arcs_size); - multilist_destroy(&arc_mru->arcs_list[ARC_BUFC_METADATA]); - multilist_destroy(&arc_mru_ghost->arcs_list[ARC_BUFC_METADATA]); - multilist_destroy(&arc_mfu->arcs_list[ARC_BUFC_METADATA]); - multilist_destroy(&arc_mfu_ghost->arcs_list[ARC_BUFC_METADATA]); - multilist_destroy(&arc_mru->arcs_list[ARC_BUFC_DATA]); - multilist_destroy(&arc_mru_ghost->arcs_list[ARC_BUFC_DATA]); - multilist_destroy(&arc_mfu->arcs_list[ARC_BUFC_DATA]); - multilist_destroy(&arc_mfu_ghost->arcs_list[ARC_BUFC_DATA]); + multilist_destroy(arc_mru->arcs_list[ARC_BUFC_METADATA]); + multilist_destroy(arc_mru_ghost->arcs_list[ARC_BUFC_METADATA]); + multilist_destroy(arc_mfu->arcs_list[ARC_BUFC_METADATA]); + multilist_destroy(arc_mfu_ghost->arcs_list[ARC_BUFC_METADATA]); + multilist_destroy(arc_mru->arcs_list[ARC_BUFC_DATA]); + multilist_destroy(arc_mru_ghost->arcs_list[ARC_BUFC_DATA]); + multilist_destroy(arc_mfu->arcs_list[ARC_BUFC_DATA]); + multilist_destroy(arc_mfu_ghost->arcs_list[ARC_BUFC_DATA]); } uint64_t @@ -7098,16 +7098,16 @@ l2arc_sublist_lock(int list_num) switch (list_num) { case 0: - ml = &arc_mfu->arcs_list[ARC_BUFC_METADATA]; + ml = arc_mfu->arcs_list[ARC_BUFC_METADATA]; break; case 1: - ml = &arc_mru->arcs_list[ARC_BUFC_METADATA]; + ml = arc_mru->arcs_list[ARC_BUFC_METADATA]; break; case 2: - ml = &arc_mfu->arcs_list[ARC_BUFC_DATA]; + ml = arc_mfu->arcs_list[ARC_BUFC_DATA]; break; case 3: - ml = &arc_mru->arcs_list[ARC_BUFC_DATA]; + ml = arc_mru->arcs_list[ARC_BUFC_DATA]; break; } Modified: stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/dbuf.c ============================================================================== --- stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/dbuf.c Wed Jul 26 16:44:52 2017 (r321552) +++ stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/dbuf.c Wed Jul 26 16:45:39 2017 (r321553) @@ -21,7 +21,7 @@ /* * Copyright (c) 2005, 2010, Oracle and/or its affiliates. All rights reserved. * Copyright 2011 Nexenta Systems, Inc. All rights reserved. - * Copyright (c) 2012, 2016 by Delphix. All rights reserved. + * Copyright (c) 2012, 2017 by Delphix. All rights reserved. * Copyright (c) 2013 by Saso Kiselkov. All rights reserved. * Copyright (c) 2013, Joyent, Inc. All rights reserved. * Copyright (c) 2014 Spectra Logic Corporation, All rights reserved. @@ -80,7 +80,7 @@ static boolean_t dbuf_evict_thread_exit; * Dbufs that are aged out of the cache will be immediately destroyed and * become eligible for arc eviction. */ -static multilist_t dbuf_cache; +static multilist_t *dbuf_cache; static refcount_t dbuf_cache_size; uint64_t dbuf_cache_max_bytes = 100 * 1024 * 1024; @@ -454,8 +454,8 @@ dbuf_cache_above_lowater(void) static void dbuf_evict_one(void) { - int idx = multilist_get_random_index(&dbuf_cache); - multilist_sublist_t *mls = multilist_sublist_lock(&dbuf_cache, idx); + int idx = multilist_get_random_index(dbuf_cache); + multilist_sublist_t *mls = multilist_sublist_lock(dbuf_cache, idx); ASSERT(!MUTEX_HELD(&dbuf_evict_lock)); @@ -621,7 +621,7 @@ retry: */ dbu_evict_taskq = taskq_create("dbu_evict", 1, minclsyspri, 0, 0, 0); - multilist_create(&dbuf_cache, sizeof (dmu_buf_impl_t), + dbuf_cache = multilist_create(sizeof (dmu_buf_impl_t), offsetof(dmu_buf_impl_t, db_cache_link), dbuf_cache_multilist_index_func); refcount_create(&dbuf_cache_size); @@ -659,7 +659,7 @@ dbuf_fini(void) cv_destroy(&dbuf_evict_cv); refcount_destroy(&dbuf_cache_size); - multilist_destroy(&dbuf_cache); + multilist_destroy(dbuf_cache); } /* @@ -2029,7 +2029,7 @@ dbuf_destroy(dmu_buf_impl_t *db) dbuf_clear_data(db); if (multilist_link_active(&db->db_cache_link)) { - multilist_remove(&dbuf_cache, db); + multilist_remove(dbuf_cache, db); (void) refcount_remove_many(&dbuf_cache_size, db->db.db_size, db); } @@ -2577,7 +2577,7 @@ top: if (multilist_link_active(&db->db_cache_link)) { ASSERT(refcount_is_zero(&db->db_holds)); - multilist_remove(&dbuf_cache, db); + multilist_remove(dbuf_cache, db); (void) refcount_remove_many(&dbuf_cache_size, db->db.db_size, db); } @@ -2796,7 +2796,7 @@ dbuf_rele_and_unlock(dmu_buf_impl_t *db, void *tag) db->db_pending_evict) { dbuf_destroy(db); } else if (!multilist_link_active(&db->db_cache_link)) { - multilist_insert(&dbuf_cache, db); + multilist_insert(dbuf_cache, db); (void) refcount_add_many(&dbuf_cache_size, db->db.db_size, db); mutex_exit(&db->db_mtx); Modified: stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/dmu_objset.c ============================================================================== --- stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/dmu_objset.c Wed Jul 26 16:44:52 2017 (r321552) +++ stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/dmu_objset.c Wed Jul 26 16:45:39 2017 (r321553) @@ -21,7 +21,7 @@ /* * Copyright (c) 2005, 2010, Oracle and/or its affiliates. All rights reserved. - * Copyright (c) 2012, 2016 by Delphix. All rights reserved. + * Copyright (c) 2012, 2017 by Delphix. All rights reserved. * Copyright (c) 2013 by Saso Kiselkov. All rights reserved. * Copyright (c) 2013, Joyent, Inc. All rights reserved. * Copyright (c) 2014 Spectra Logic Corporation, All rights reserved. @@ -303,6 +303,42 @@ dmu_objset_byteswap(void *buf, size_t size) } } +/* + * The hash is a CRC-based hash of the objset_t pointer and the object number. + */ +static uint64_t +dnode_hash(const objset_t *os, uint64_t obj) +{ + uintptr_t osv = (uintptr_t)os; + uint64_t crc = -1ULL; + + ASSERT(zfs_crc64_table[128] == ZFS_CRC64_POLY); + /* + * The low 6 bits of the pointer don't have much entropy, because + * the objset_t is larger than 2^6 bytes long. + */ + crc = (crc >> 8) ^ zfs_crc64_table[(crc ^ (osv >> 6)) & 0xFF]; + crc = (crc >> 8) ^ zfs_crc64_table[(crc ^ (obj >> 0)) & 0xFF]; + crc = (crc >> 8) ^ zfs_crc64_table[(crc ^ (obj >> 8)) & 0xFF]; + crc = (crc >> 8) ^ zfs_crc64_table[(crc ^ (obj >> 16)) & 0xFF]; + + crc ^= (osv>>14) ^ (obj>>24); + + return (crc); +} + +unsigned int +dnode_multilist_index_func(multilist_t *ml, void *obj) +{ + dnode_t *dn = obj; + return (dnode_hash(dn->dn_objset, dn->dn_object) % + multilist_get_num_sublists(ml)); +} + +/* + * Instantiates the objset_t in-memory structure corresponding to the + * objset_phys_t that's pointed to by the specified blkptr_t. + */ int dmu_objset_open_impl(spa_t *spa, dsl_dataset_t *ds, blkptr_t *bp, objset_t **osp) @@ -454,10 +490,9 @@ dmu_objset_open_impl(spa_t *spa, dsl_dataset_t *ds, bl os->os_zil = zil_alloc(os, &os->os_zil_header); for (i = 0; i < TXG_SIZE; i++) { - list_create(&os->os_dirty_dnodes[i], sizeof (dnode_t), - offsetof(dnode_t, dn_dirty_link[i])); - list_create(&os->os_free_dnodes[i], sizeof (dnode_t), - offsetof(dnode_t, dn_dirty_link[i])); + os->os_dirty_dnodes[i] = multilist_create(sizeof (dnode_t), + offsetof(dnode_t, dn_dirty_link[i]), + dnode_multilist_index_func); } list_create(&os->os_dnodes, sizeof (dnode_t), offsetof(dnode_t, dn_link)); @@ -465,6 +500,7 @@ dmu_objset_open_impl(spa_t *spa, dsl_dataset_t *ds, bl offsetof(dmu_buf_impl_t, db_link)); mutex_init(&os->os_lock, NULL, MUTEX_DEFAULT, NULL); + mutex_init(&os->os_userused_lock, NULL, MUTEX_DEFAULT, NULL); mutex_init(&os->os_obj_lock, NULL, MUTEX_DEFAULT, NULL); mutex_init(&os->os_user_ptr_lock, NULL, MUTEX_DEFAULT, NULL); @@ -748,8 +784,12 @@ dmu_objset_evict_done(objset_t *os) rw_exit(&os_lock); mutex_destroy(&os->os_lock); + mutex_destroy(&os->os_userused_lock); mutex_destroy(&os->os_obj_lock); mutex_destroy(&os->os_user_ptr_lock); + for (int i = 0; i < TXG_SIZE; i++) { + multilist_destroy(os->os_dirty_dnodes[i]); + } spa_evicting_os_deregister(os->os_spa, os); kmem_free(os, sizeof (objset_t)); } @@ -1027,11 +1067,11 @@ dmu_objset_snapshot_one(const char *fsname, const char } static void -dmu_objset_sync_dnodes(list_t *list, list_t *newlist, dmu_tx_t *tx) +dmu_objset_sync_dnodes(multilist_sublist_t *list, dmu_tx_t *tx) { dnode_t *dn; - while (dn = list_head(list)) { + while ((dn = multilist_sublist_head(list)) != NULL) { ASSERT(dn->dn_object != DMU_META_DNODE_OBJECT); ASSERT(dn->dn_dbuf->db_data_pending); /* @@ -1042,11 +1082,12 @@ dmu_objset_sync_dnodes(list_t *list, list_t *newlist, ASSERT(dn->dn_zio); ASSERT3U(dn->dn_nlevels, <=, DN_MAX_LEVELS); - list_remove(list, dn); + multilist_sublist_remove(list, dn); - if (newlist) { + multilist_t *newlist = dn->dn_objset->os_synced_dnodes; + if (newlist != NULL) { (void) dnode_add_ref(dn, newlist); - list_insert_tail(newlist, dn); + multilist_insert(newlist, dn); } dnode_sync(dn, tx); @@ -1101,6 +1142,29 @@ dmu_objset_write_done(zio_t *zio, arc_buf_t *abuf, voi kmem_free(bp, sizeof (*bp)); } +typedef struct sync_dnodes_arg { + multilist_t *sda_list; + int sda_sublist_idx; + multilist_t *sda_newlist; + dmu_tx_t *sda_tx; +} sync_dnodes_arg_t; + +static void +sync_dnodes_task(void *arg) +{ + sync_dnodes_arg_t *sda = arg; + + multilist_sublist_t *ms = + multilist_sublist_lock(sda->sda_list, sda->sda_sublist_idx); + + dmu_objset_sync_dnodes(ms, sda->sda_tx); + + multilist_sublist_unlock(ms); + + kmem_free(sda, sizeof (*sda)); +} + + /* called from dsl */ void dmu_objset_sync(objset_t *os, zio_t *pio, dmu_tx_t *tx) @@ -1110,7 +1174,6 @@ dmu_objset_sync(objset_t *os, zio_t *pio, dmu_tx_t *tx zio_prop_t zp; zio_t *zio; list_t *list; - list_t *newlist = NULL; dbuf_dirty_record_t *dr; blkptr_t *blkptr_copy = kmem_alloc(sizeof (*os->os_rootbp), KM_SLEEP); *blkptr_copy = *os->os_rootbp; @@ -1164,20 +1227,36 @@ dmu_objset_sync(objset_t *os, zio_t *pio, dmu_tx_t *tx txgoff = tx->tx_txg & TXG_MASK; if (dmu_objset_userused_enabled(os)) { - newlist = &os->os_synced_dnodes; /* * We must create the list here because it uses the - * dn_dirty_link[] of this txg. + * dn_dirty_link[] of this txg. But it may already + * exist because we call dsl_dataset_sync() twice per txg. */ - list_create(newlist, sizeof (dnode_t), - offsetof(dnode_t, dn_dirty_link[txgoff])); + if (os->os_synced_dnodes == NULL) { + os->os_synced_dnodes = + multilist_create(sizeof (dnode_t), + offsetof(dnode_t, dn_dirty_link[txgoff]), + dnode_multilist_index_func); + } else { + ASSERT3U(os->os_synced_dnodes->ml_offset, ==, + offsetof(dnode_t, dn_dirty_link[txgoff])); + } } - dmu_objset_sync_dnodes(&os->os_free_dnodes[txgoff], newlist, tx); - dmu_objset_sync_dnodes(&os->os_dirty_dnodes[txgoff], newlist, tx); + for (int i = 0; + i < multilist_get_num_sublists(os->os_dirty_dnodes[txgoff]); i++) { + sync_dnodes_arg_t *sda = kmem_alloc(sizeof (*sda), KM_SLEEP); + sda->sda_list = os->os_dirty_dnodes[txgoff]; + sda->sda_sublist_idx = i; + sda->sda_tx = tx; + (void) taskq_dispatch(dmu_objset_pool(os)->dp_sync_taskq, + sync_dnodes_task, sda, 0); + /* callback frees sda */ + } + taskq_wait(dmu_objset_pool(os)->dp_sync_taskq); list = &DMU_META_DNODE(os)->dn_dirty_records[txgoff]; - while (dr = list_head(list)) { + while ((dr = list_head(list)) != NULL) { ASSERT0(dr->dr_dbuf->db_level); list_remove(list, dr); if (dr->dr_zio) @@ -1201,8 +1280,7 @@ dmu_objset_sync(objset_t *os, zio_t *pio, dmu_tx_t *tx boolean_t dmu_objset_is_dirty(objset_t *os, uint64_t txg) { - return (!list_is_empty(&os->os_dirty_dnodes[txg & TXG_MASK]) || - !list_is_empty(&os->os_free_dnodes[txg & TXG_MASK])); + return (!multilist_is_empty(os->os_dirty_dnodes[txg & TXG_MASK])); } static objset_used_cb_t *used_cbs[DMU_OST_NUMTYPES]; @@ -1256,8 +1334,15 @@ do_userquota_cacheflush(objset_t *os, userquota_cache_ cookie = NULL; while ((uqn = avl_destroy_nodes(&cache->uqc_user_deltas, &cookie)) != NULL) { + /* + * os_userused_lock protects against concurrent calls to + * zap_increment_int(). It's needed because zap_increment_int() + * is not thread-safe (i.e. not atomic). + */ + mutex_enter(&os->os_userused_lock); VERIFY0(zap_increment_int(os, DMU_USERUSED_OBJECT, uqn->uqn_id, uqn->uqn_delta, tx)); + mutex_exit(&os->os_userused_lock); kmem_free(uqn, sizeof (*uqn)); } avl_destroy(&cache->uqc_user_deltas); @@ -1265,8 +1350,10 @@ do_userquota_cacheflush(objset_t *os, userquota_cache_ cookie = NULL; while ((uqn = avl_destroy_nodes(&cache->uqc_group_deltas, &cookie)) != NULL) { + mutex_enter(&os->os_userused_lock); VERIFY0(zap_increment_int(os, DMU_GROUPUSED_OBJECT, uqn->uqn_id, uqn->uqn_delta, tx)); + mutex_exit(&os->os_userused_lock); kmem_free(uqn, sizeof (*uqn)); } avl_destroy(&cache->uqc_group_deltas); @@ -1301,37 +1388,38 @@ do_userquota_update(userquota_cache_t *cache, uint64_t } } -void -dmu_objset_do_userquota_updates(objset_t *os, dmu_tx_t *tx) +typedef struct userquota_updates_arg { + objset_t *uua_os; + int uua_sublist_idx; + dmu_tx_t *uua_tx; +} userquota_updates_arg_t; + +static void +userquota_updates_task(void *arg) { + userquota_updates_arg_t *uua = arg; + objset_t *os = uua->uua_os; + dmu_tx_t *tx = uua->uua_tx; dnode_t *dn; - list_t *list = &os->os_synced_dnodes; userquota_cache_t cache = { 0 }; - ASSERT(list_head(list) == NULL || dmu_objset_userused_enabled(os)); + multilist_sublist_t *list = + multilist_sublist_lock(os->os_synced_dnodes, uua->uua_sublist_idx); + ASSERT(multilist_sublist_head(list) == NULL || + dmu_objset_userused_enabled(os)); avl_create(&cache.uqc_user_deltas, userquota_compare, sizeof (userquota_node_t), offsetof(userquota_node_t, uqn_node)); avl_create(&cache.uqc_group_deltas, userquota_compare, sizeof (userquota_node_t), offsetof(userquota_node_t, uqn_node)); - while (dn = list_head(list)) { + while ((dn = multilist_sublist_head(list)) != NULL) { int flags; ASSERT(!DMU_OBJECT_IS_SPECIAL(dn->dn_object)); ASSERT(dn->dn_phys->dn_type == DMU_OT_NONE || dn->dn_phys->dn_flags & DNODE_FLAG_USERUSED_ACCOUNTED); - /* Allocate the user/groupused objects if necessary. */ - if (DMU_USERUSED_DNODE(os)->dn_type == DMU_OT_NONE) { - VERIFY0(zap_create_claim(os, - DMU_USERUSED_OBJECT, - DMU_OT_USERGROUP_USED, DMU_OT_NONE, 0, tx)); - VERIFY0(zap_create_claim(os, - DMU_GROUPUSED_OBJECT, - DMU_OT_USERGROUP_USED, DMU_OT_NONE, 0, tx)); - } - flags = dn->dn_id_flags; ASSERT(flags); if (flags & DN_ID_OLD_EXIST) { @@ -1361,10 +1449,42 @@ dmu_objset_do_userquota_updates(objset_t *os, dmu_tx_t dn->dn_id_flags &= ~(DN_ID_NEW_EXIST); mutex_exit(&dn->dn_mtx); - list_remove(list, dn); - dnode_rele(dn, list); + multilist_sublist_remove(list, dn); + dnode_rele(dn, os->os_synced_dnodes); } do_userquota_cacheflush(os, &cache, tx); + multilist_sublist_unlock(list); + kmem_free(uua, sizeof (*uua)); +} + +void +dmu_objset_do_userquota_updates(objset_t *os, dmu_tx_t *tx) +{ + if (!dmu_objset_userused_enabled(os)) + return; + + /* Allocate the user/groupused objects if necessary. */ + if (DMU_USERUSED_DNODE(os)->dn_type == DMU_OT_NONE) { + VERIFY0(zap_create_claim(os, + DMU_USERUSED_OBJECT, + DMU_OT_USERGROUP_USED, DMU_OT_NONE, 0, tx)); + VERIFY0(zap_create_claim(os, + DMU_GROUPUSED_OBJECT, + DMU_OT_USERGROUP_USED, DMU_OT_NONE, 0, tx)); + } + + for (int i = 0; + i < multilist_get_num_sublists(os->os_synced_dnodes); i++) { + userquota_updates_arg_t *uua = + kmem_alloc(sizeof (*uua), KM_SLEEP); + uua->uua_os = os; + uua->uua_sublist_idx = i; + uua->uua_tx = tx; + /* note: caller does taskq_wait() */ + (void) taskq_dispatch(dmu_objset_pool(os)->dp_sync_taskq, + userquota_updates_task, uua, 0); + /* callback frees uua */ + } } /* Modified: stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/dnode.c ============================================================================== --- stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/dnode.c Wed Jul 26 16:44:52 2017 (r321552) +++ stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/dnode.c Wed Jul 26 16:45:39 2017 (r321553) @@ -20,7 +20,7 @@ */ /* * Copyright (c) 2005, 2010, Oracle and/or its affiliates. All rights reserved. - * Copyright (c) 2012, 2016 by Delphix. All rights reserved. + * Copyright (c) 2012, 2017 by Delphix. All rights reserved. * Copyright (c) 2014 Spectra Logic Corporation, All rights reserved. * Copyright (c) 2014 Integros [integros.com] */ @@ -1287,13 +1287,14 @@ dnode_setdirty(dnode_t *dn, dmu_tx_t *tx) */ dmu_objset_userquota_get_ids(dn, B_TRUE, tx); - mutex_enter(&os->os_lock); + multilist_t *dirtylist = os->os_dirty_dnodes[txg & TXG_MASK]; + multilist_sublist_t *mls = multilist_sublist_lock_obj(dirtylist, dn); /* * If we are already marked dirty, we're done. */ if (list_link_active(&dn->dn_dirty_link[txg & TXG_MASK])) { - mutex_exit(&os->os_lock); + multilist_sublist_unlock(mls); return; } @@ -1307,13 +1308,9 @@ dnode_setdirty(dnode_t *dn, dmu_tx_t *tx) dprintf_ds(os->os_dsl_dataset, "obj=%llu txg=%llu\n", dn->dn_object, txg); - if (dn->dn_free_txg > 0 && dn->dn_free_txg <= txg) { - list_insert_tail(&os->os_free_dnodes[txg&TXG_MASK], dn); - } else { - list_insert_tail(&os->os_dirty_dnodes[txg&TXG_MASK], dn); - } + multilist_sublist_insert_head(mls, dn); - mutex_exit(&os->os_lock); + multilist_sublist_unlock(mls); /* * The dnode maintains a hold on its containing dbuf as @@ -1334,13 +1331,6 @@ dnode_setdirty(dnode_t *dn, dmu_tx_t *tx) void dnode_free(dnode_t *dn, dmu_tx_t *tx) { - int txgoff = tx->tx_txg & TXG_MASK; - - dprintf("dn=%p txg=%llu\n", dn, tx->tx_txg); - - /* we should be the only holder... hopefully */ - /* ASSERT3U(refcount_count(&dn->dn_holds), ==, 1); */ - mutex_enter(&dn->dn_mtx); if (dn->dn_type == DMU_OT_NONE || dn->dn_free_txg) { mutex_exit(&dn->dn_mtx); @@ -1349,19 +1339,7 @@ dnode_free(dnode_t *dn, dmu_tx_t *tx) dn->dn_free_txg = tx->tx_txg; mutex_exit(&dn->dn_mtx); - /* - * If the dnode is already dirty, it needs to be moved from - * the dirty list to the free list. - */ - mutex_enter(&dn->dn_objset->os_lock); - if (list_link_active(&dn->dn_dirty_link[txgoff])) { - list_remove(&dn->dn_objset->os_dirty_dnodes[txgoff], dn); - list_insert_tail(&dn->dn_objset->os_free_dnodes[txgoff], dn); - mutex_exit(&dn->dn_objset->os_lock); - } else { - mutex_exit(&dn->dn_objset->os_lock); - dnode_setdirty(dn, tx); - } + dnode_setdirty(dn, tx); } /* Modified: stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/dnode_sync.c ============================================================================== --- stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/dnode_sync.c Wed Jul 26 16:44:52 2017 (r321552) +++ stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/dnode_sync.c Wed Jul 26 16:45:39 2017 (r321553) @@ -21,7 +21,7 @@ /* * Copyright (c) 2005, 2010, Oracle and/or its affiliates. All rights reserved. - * Copyright (c) 2012, 2016 by Delphix. All rights reserved. + * Copyright (c) 2012, 2017 by Delphix. All rights reserved. * Copyright (c) 2014 Spectra Logic Corporation, All rights reserved. */ Modified: stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/dsl_dataset.c ============================================================================== --- stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/dsl_dataset.c Wed Jul 26 16:44:52 2017 (r321552) +++ stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/dsl_dataset.c Wed Jul 26 16:45:39 2017 (r321553) @@ -1740,6 +1740,11 @@ dsl_dataset_sync_done(dsl_dataset_t *ds, dmu_tx_t *tx) bplist_iterate(&ds->ds_pending_deadlist, deadlist_enqueue_cb, &ds->ds_deadlist, tx); + if (os->os_synced_dnodes != NULL) { + multilist_destroy(os->os_synced_dnodes); + os->os_synced_dnodes = NULL; + } + ASSERT(!dmu_objset_is_dirty(os, dmu_tx_get_txg(tx))); dmu_buf_rele(ds->ds_dbuf, ds); Modified: stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/dsl_pool.c ============================================================================== --- stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/dsl_pool.c Wed Jul 26 16:44:52 2017 (r321552) +++ stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/dsl_pool.c Wed Jul 26 16:45:39 2017 (r321553) @@ -20,7 +20,7 @@ */ /* * Copyright (c) 2005, 2010, Oracle and/or its affiliates. All rights reserved. - * Copyright (c) 2011, 2016 by Delphix. All rights reserved. + * Copyright (c) 2011, 2017 by Delphix. All rights reserved. * Copyright (c) 2013 Steven Hartland. All rights reserved. * Copyright (c) 2014 Spectra Logic Corporation, All rights reserved. * Copyright (c) 2014 Integros [integros.com] @@ -132,6 +132,10 @@ int zfs_delay_min_dirty_percent = 60; */ uint64_t zfs_delay_scale = 1000 * 1000 * 1000 / 2000; +/* + * This determines the number of threads used by the dp_sync_taskq. + */ +int zfs_sync_taskq_batch_pct = 75; #if defined(__FreeBSD__) && defined(_KERNEL) @@ -264,6 +268,10 @@ dsl_pool_open_impl(spa_t *spa, uint64_t txg) txg_list_create(&dp->dp_sync_tasks, offsetof(dsl_sync_task_t, dst_node)); + dp->dp_sync_taskq = taskq_create("dp_sync_taskq", + zfs_sync_taskq_batch_pct, minclsyspri, 1, INT_MAX, + TASKQ_THREADS_CPU_PCT); + mutex_init(&dp->dp_lock, NULL, MUTEX_DEFAULT, NULL); cv_init(&dp->dp_spaceavail_cv, NULL, CV_DEFAULT, NULL); @@ -414,6 +422,8 @@ dsl_pool_close(dsl_pool_t *dp) txg_list_destroy(&dp->dp_sync_tasks); txg_list_destroy(&dp->dp_dirty_dirs); + taskq_destroy(dp->dp_sync_taskq); + /* * We can't set retry to TRUE since we're explicitly specifying * a spa to flush. This is good enough; any missed buffers for @@ -602,12 +612,15 @@ dsl_pool_sync(dsl_pool_t *dp, uint64_t txg) /* * After the data blocks have been written (ensured by the zio_wait() - * above), update the user/group space accounting. + * above), update the user/group space accounting. This happens + * in tasks dispatched to dp_sync_taskq, so wait for them before + * continuing. */ for (ds = list_head(&synced_datasets); ds != NULL; ds = list_next(&synced_datasets, ds)) { dmu_objset_do_userquota_updates(ds->ds_objset, tx); } + taskq_wait(dp->dp_sync_taskq); /* * Sync the datasets again to push out the changes due to @@ -654,8 +667,7 @@ dsl_pool_sync(dsl_pool_t *dp, uint64_t txg) dp->dp_mos_uncompressed_delta = 0; } - if (list_head(&mos->os_dirty_dnodes[txg & TXG_MASK]) != NULL || - list_head(&mos->os_free_dnodes[txg & TXG_MASK]) != NULL) { + if (!multilist_is_empty(mos->os_dirty_dnodes[txg & TXG_MASK])) { dsl_pool_sync_mos(dp, tx); } @@ -713,7 +725,8 @@ int dsl_pool_sync_context(dsl_pool_t *dp) { return (curthread == dp->dp_tx.tx_sync_thread || - spa_is_initializing(dp->dp_spa)); + spa_is_initializing(dp->dp_spa) || + taskq_member(dp->dp_sync_taskq, curthread)); } uint64_t Modified: stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/multilist.c ============================================================================== --- stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/multilist.c Wed Jul 26 16:44:52 2017 (r321552) +++ stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/multilist.c Wed Jul 26 16:45:39 2017 (r321553) @@ -65,16 +65,16 @@ multilist_d2l(multilist_t *ml, void *obj) * requirement, but a general rule of thumb in order to garner the * best multi-threaded performance out of the data structure. */ -static void -multilist_create_impl(multilist_t *ml, size_t size, size_t offset, +static multilist_t * +multilist_create_impl(size_t size, size_t offset, unsigned int num, multilist_sublist_index_func_t *index_func) { - ASSERT3P(ml, !=, NULL); ASSERT3U(size, >, 0); ASSERT3U(size, >=, offset + sizeof (multilist_node_t)); ASSERT3U(num, >, 0); ASSERT3P(index_func, !=, NULL); + multilist_t *ml = kmem_alloc(sizeof (*ml), KM_SLEEP); ml->ml_offset = offset; ml->ml_num_sublists = num; ml->ml_index_func = index_func; @@ -89,15 +89,16 @@ multilist_create_impl(multilist_t *ml, size_t size, si mutex_init(&mls->mls_lock, NULL, MUTEX_DEFAULT, NULL); list_create(&mls->mls_list, size, offset); } + return (ml); } /* - * Initialize a new sublist, using the default number of sublists + * Allocate a new multilist, using the default number of sublists * (the number of CPUs, or at least 4, or the tunable * zfs_multilist_num_sublists). */ -void -multilist_create(multilist_t *ml, size_t size, size_t offset, +multilist_t * +multilist_create(size_t size, size_t offset, multilist_sublist_index_func_t *index_func) { int num_sublists; @@ -108,7 +109,7 @@ multilist_create(multilist_t *ml, size_t size, size_t num_sublists = MAX(max_ncpus, 4); } - multilist_create_impl(ml, size, offset, num_sublists, index_func); + return (multilist_create_impl(size, offset, num_sublists, index_func)); } /* @@ -134,6 +135,7 @@ multilist_destroy(multilist_t *ml) ml->ml_num_sublists = 0; ml->ml_offset = 0; + kmem_free(ml, sizeof (multilist_t)); } /* @@ -283,6 +285,13 @@ multilist_sublist_lock(multilist_t *ml, unsigned int s mutex_enter(&mls->mls_lock); return (mls); +} + +/* Lock and return the sublist that would be used to store the specified obj */ +multilist_sublist_t * +multilist_sublist_lock_obj(multilist_t *ml, void *obj) +{ + return (multilist_sublist_lock(ml, ml->ml_index_func(ml, obj))); } void Modified: stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/spa_misc.c ============================================================================== --- stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/spa_misc.c Wed Jul 26 16:44:52 2017 (r321552) +++ stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/spa_misc.c Wed Jul 26 16:45:39 2017 (r321553) @@ -20,7 +20,7 @@ */ /* * Copyright (c) 2005, 2010, Oracle and/or its affiliates. All rights reserved. - * Copyright (c) 2011, 2015 by Delphix. All rights reserved. + * Copyright (c) 2011, 2017 by Delphix. All rights reserved. * Copyright 2015 Nexenta Systems, Inc. All rights reserved. * Copyright 2013 Martin Matuska . All rights reserved. * Copyright (c) 2014 Spectra Logic Corporation, All rights reserved. @@ -744,7 +744,7 @@ spa_add(const char *name, nvlist_t *config, const char spa_active_count++; } - avl_create(&spa->spa_alloc_tree, zio_timestamp_compare, + avl_create(&spa->spa_alloc_tree, zio_bookmark_compare, sizeof (zio_t), offsetof(zio_t, io_alloc_node)); /* Modified: stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/sys/dmu_objset.h ============================================================================== --- stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/sys/dmu_objset.h Wed Jul 26 16:44:52 2017 (r321552) +++ stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/sys/dmu_objset.h Wed Jul 26 16:45:39 2017 (r321553) @@ -20,7 +20,7 @@ */ /* * Copyright (c) 2005, 2010, Oracle and/or its affiliates. All rights reserved. - * Copyright (c) 2012, 2016 by Delphix. All rights reserved. + * Copyright (c) 2012, 2017 by Delphix. All rights reserved. * Copyright (c) 2013 by Saso Kiselkov. All rights reserved. * Copyright (c) 2014 Spectra Logic Corporation, All rights reserved. * Copyright (c) 2014 Integros [integros.com] @@ -110,7 +110,7 @@ struct objset { /* no lock needed: */ struct dmu_tx *os_synctx; /* XXX sketchy */ zil_header_t os_zil_header; - list_t os_synced_dnodes; + multilist_t *os_synced_dnodes; uint64_t os_flags; uint64_t os_freed_dnodes; boolean_t os_rescan_dnodes; @@ -121,10 +121,12 @@ struct objset { /* Protected by os_lock */ kmutex_t os_lock; - list_t os_dirty_dnodes[TXG_SIZE]; - list_t os_free_dnodes[TXG_SIZE]; + multilist_t *os_dirty_dnodes[TXG_SIZE]; list_t os_dnodes; list_t os_downgraded_dbufs; + + /* Protects changes to DMU_{USER,GROUP}USED_OBJECT */ + kmutex_t os_userused_lock; /* stuff we store for the user */ kmutex_t os_user_ptr_lock; Modified: stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/sys/dnode.h ============================================================================== --- stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/sys/dnode.h Wed Jul 26 16:44:52 2017 (r321552) +++ stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/sys/dnode.h Wed Jul 26 16:45:39 2017 (r321553) @@ -20,7 +20,7 @@ */ /* * Copyright (c) 2005, 2010, Oracle and/or its affiliates. All rights reserved. - * Copyright (c) 2012, 2016 by Delphix. All rights reserved. + * Copyright (c) 2012, 2017 by Delphix. All rights reserved. * Copyright (c) 2014 Spectra Logic Corporation, All rights reserved. */ @@ -35,6 +35,7 @@ #include #include #include +#include #ifdef __cplusplus extern "C" { @@ -203,7 +204,7 @@ struct dnode { uint32_t dn_dbufs_count; /* count of dn_dbufs */ /* protected by os_lock: */ - list_node_t dn_dirty_link[TXG_SIZE]; /* next on dataset's dirty */ + multilist_node_t dn_dirty_link[TXG_SIZE]; /* next on dataset's dirty */ /* protected by dn_mtx: */ kmutex_t dn_mtx; Modified: stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/sys/dsl_pool.h ============================================================================== --- stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/sys/dsl_pool.h Wed Jul 26 16:44:52 2017 (r321552) +++ stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/sys/dsl_pool.h Wed Jul 26 16:45:39 2017 (r321553) @@ -20,7 +20,7 @@ */ /* * Copyright (c) 2005, 2010, Oracle and/or its affiliates. All rights reserved. - * Copyright (c) 2013 by Delphix. All rights reserved. + * Copyright (c) 2013, 2017 by Delphix. All rights reserved. * Copyright 2016 Nexenta Systems, Inc. All rights reserved. */ @@ -121,6 +121,7 @@ typedef struct dsl_pool { txg_list_t dp_dirty_zilogs; txg_list_t dp_dirty_dirs; txg_list_t dp_sync_tasks; + taskq_t *dp_sync_taskq; /* * Protects administrative changes (properties, namespace) Modified: stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/sys/multilist.h ============================================================================== --- stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/sys/multilist.h Wed Jul 26 16:44:52 2017 (r321552) +++ stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/sys/multilist.h Wed Jul 26 16:45:39 2017 (r321553) @@ -73,8 +73,7 @@ struct multilist { }; void multilist_destroy(multilist_t *); -void multilist_create(multilist_t *, size_t, size_t, - multilist_sublist_index_func_t *); +multilist_t *multilist_create(size_t, size_t, multilist_sublist_index_func_t *); void multilist_insert(multilist_t *, void *); void multilist_remove(multilist_t *, void *); @@ -84,6 +83,7 @@ unsigned int multilist_get_num_sublists(multilist_t *) unsigned int multilist_get_random_index(multilist_t *); multilist_sublist_t *multilist_sublist_lock(multilist_t *, unsigned int); +multilist_sublist_t *multilist_sublist_lock_obj(multilist_t *, void *); void multilist_sublist_unlock(multilist_sublist_t *); void multilist_sublist_insert_head(multilist_sublist_t *, void *); Modified: stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/sys/zio.h ============================================================================== --- stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/sys/zio.h Wed Jul 26 16:44:52 2017 (r321552) +++ stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/sys/zio.h Wed Jul 26 16:45:39 2017 (r321553) @@ -22,7 +22,7 @@ /* * Copyright (c) 2005, 2010, Oracle and/or its affiliates. All rights reserved. * Copyright 2011 Nexenta Systems, Inc. All rights reserved. - * Copyright (c) 2012, 2016 by Delphix. All rights reserved. + * Copyright (c) 2012, 2017 by Delphix. All rights reserved. * Copyright (c) 2013 by Saso Kiselkov. All rights reserved. */ @@ -484,7 +484,7 @@ struct zio { list_node_t io_trim_link; }; -extern int zio_timestamp_compare(const void *, const void *); +extern int zio_bookmark_compare(const void *, const void *); extern zio_t *zio_null(zio_t *pio, spa_t *spa, vdev_t *vd, zio_done_func_t *done, void *priv, enum zio_flag flags); Modified: stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/zio.c ============================================================================== --- stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/zio.c Wed Jul 26 16:44:52 2017 (r321552) +++ stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/zio.c Wed Jul 26 16:45:39 2017 (r321553) *** DIFF OUTPUT TRUNCATED AT 1000 LINES *** From owner-svn-src-stable-11@freebsd.org Wed Jul 26 16:46:40 2017 Return-Path: Delivered-To: svn-src-stable-11@mailman.ysv.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by mailman.ysv.freebsd.org (Postfix) with ESMTP id 19F6EDA9A3B; Wed, 26 Jul 2017 16:46:40 +0000 (UTC) (envelope-from mav@FreeBSD.org) Received: from repo.freebsd.org (repo.freebsd.org [IPv6:2610:1c1:1:6068::e6a:0]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (Client did not present a certificate) by mx1.freebsd.org (Postfix) with ESMTPS id CEA356B28A; Wed, 26 Jul 2017 16:46:39 +0000 (UTC) (envelope-from mav@FreeBSD.org) Received: from repo.freebsd.org ([127.0.1.37]) by repo.freebsd.org (8.15.2/8.15.2) with ESMTP id v6QGkdI8077252; Wed, 26 Jul 2017 16:46:39 GMT (envelope-from mav@FreeBSD.org) Received: (from mav@localhost) by repo.freebsd.org (8.15.2/8.15.2/Submit) id v6QGkckN077247; Wed, 26 Jul 2017 16:46:38 GMT (envelope-from mav@FreeBSD.org) Message-Id: <201707261646.v6QGkckN077247@repo.freebsd.org> X-Authentication-Warning: repo.freebsd.org: mav set sender to mav@FreeBSD.org using -f From: Alexander Motin Date: Wed, 26 Jul 2017 16:46:38 +0000 (UTC) To: src-committers@freebsd.org, svn-src-all@freebsd.org, svn-src-stable@freebsd.org, svn-src-stable-11@freebsd.org Subject: svn commit: r321554 - in stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs: . sys X-SVN-Group: stable-11 X-SVN-Commit-Author: mav X-SVN-Commit-Paths: in stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs: . sys X-SVN-Commit-Revision: 321554 X-SVN-Commit-Repository: base MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit X-BeenThere: svn-src-stable-11@freebsd.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: SVN commit messages for only the 11-stable src tree List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 26 Jul 2017 16:46:40 -0000 Author: mav Date: Wed Jul 26 16:46:38 2017 New Revision: 321554 URL: https://svnweb.freebsd.org/changeset/base/321554 Log: MFC r318829: MFV r316920: 8023 Panic destroying a metaslab deferred range tree illumos/illumos-gate@3991b535a8e990c0369be677746a87c259b13e9f https://github.com/illumos/illumos-gate/commit/3991b535a8e990c0369be677746a87c259b13e9f https://www.illumos.org/issues/8023 $C ffffff0011bc0970 vpanic() ffffff0011bc0a00 strlog() ffffff0011bc0a30 range_tree_destroy+0x72(ffffff043769ad00) ffffff0011bc0a70 metaslab_fini+0xd5(ffffff0449acf380) ffffff0011bc0ab0 vdev_metaslab_fini+0x56(ffffff0462bae800) ffffff0011bc0af0 spa_unload+0x9b(ffffff03e3dac000) ffffff0011bc0b70 spa_export_common+0x115(ffffff047f4b4000, 2, 0, 0, 0) ffffff0011bc0b90 spa_destroy+0x1d(ffffff047f4b4000) ffffff0011bc0bd0 zfs_ioc_pool_destroy+0x20(ffffff047f4b4000) ffffff0011bc0c80 zfsdev_ioctl+0x4d7(11400000000, 5a01, 8040190, 100003, ffffff03e1956b10, ffffff0011bc0e68) ffffff0011bc0cc0 cdev_ioctl+0x39(11400000000, 5a01, 8040190, 100003, ffffff03e1956b10, ffffff0011bc0e68) ffffff0011bc0d10 spec_ioctl+0x60(ffffff03d9153b00, 5a01, 8040190, 100003, ffffff03e1956b10, ffffff0011bc0e68, 0) ffffff0011bc0da0 fop_ioctl+0x55(ffffff03d9153b00, 5a01, 8040190, 100003, ffffff03e1956b10, ffffff0011bc0e68, 0) ffffff0011bc0ec0 ioctl+0x9b(3, 5a01, 8040190) ffffff0011bc0f10 _sys_sysenter_post_swapgs+0x149() Reviewed by: Brad Lewis Reviewed by: Matt Ahrens Reviewed by: Dan Kimmel Reviewed by: Saso Kiselkov Approved by: Dan McDonald Author: George Wilson Modified: stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/dbuf.c stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/metaslab.c stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/spa_misc.c stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/space_map.c stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/sys/spa.h Directory Properties: stable/11/ (props changed) Modified: stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/dbuf.c ============================================================================== --- stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/dbuf.c Wed Jul 26 16:45:39 2017 (r321553) +++ stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/dbuf.c Wed Jul 26 16:46:38 2017 (r321554) @@ -1552,6 +1552,7 @@ dbuf_dirty(dmu_buf_impl_t *db, dmu_tx_t *tx) * this assertion only if we're not already dirty. */ os = dn->dn_objset; + VERIFY3U(tx->tx_txg, <=, spa_final_dirty_txg(os->os_spa)); #ifdef DEBUG if (dn->dn_objset->os_dsl_dataset != NULL) rrw_enter(&os->os_dsl_dataset->ds_bp_rwlock, RW_READER, FTAG); Modified: stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/metaslab.c ============================================================================== --- stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/metaslab.c Wed Jul 26 16:45:39 2017 (r321553) +++ stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/metaslab.c Wed Jul 26 16:46:38 2017 (r321554) @@ -1652,11 +1652,19 @@ metaslab_set_fragmentation(metaslab_t *msp) uint64_t txg = spa_syncing_txg(spa); vdev_t *vd = msp->ms_group->mg_vd; - if (spa_writeable(spa)) { + /* + * If we've reached the final dirty txg, then we must + * be shutting down the pool. We don't want to dirty + * any data past this point so skip setting the condense + * flag. We can retry this action the next time the pool + * is imported. + */ + if (spa_writeable(spa) && txg < spa_final_dirty_txg(spa)) { msp->ms_condense_wanted = B_TRUE; vdev_dirty(vd, VDD_METASLAB, msp, txg + 1); spa_dbgmsg(spa, "txg %llu, requesting force condense: " - "msp %p, vd %p", txg, msp, vd); + "ms_id %llu, vdev_id %llu", txg, msp->ms_id, + vd->vdev_id); } msp->ms_fragmentation = ZFS_FRAG_INVALID; return; @@ -2278,12 +2286,16 @@ metaslab_sync(metaslab_t *msp, uint64_t txg) /* * Normally, we don't want to process a metaslab if there * are no allocations or frees to perform. However, if the metaslab - * is being forced to condense we need to let it through. + * is being forced to condense and it's loaded, we need to let it + * through. */ if (range_tree_space(alloctree) == 0 && range_tree_space(msp->ms_freeingtree) == 0 && - !msp->ms_condense_wanted) + !(msp->ms_loaded && msp->ms_condense_wanted)) return; + + + VERIFY(txg <= spa_final_dirty_txg(spa)); /* * The only state that can actually be changing concurrently with Modified: stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/spa_misc.c ============================================================================== --- stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/spa_misc.c Wed Jul 26 16:45:39 2017 (r321553) +++ stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/spa_misc.c Wed Jul 26 16:46:38 2017 (r321554) @@ -1741,6 +1741,16 @@ spa_syncing_txg(spa_t *spa) return (spa->spa_syncing_txg); } +/* + * Return the last txg where data can be dirtied. The final txgs + * will be used to just clear out any deferred frees that remain. + */ +uint64_t +spa_final_dirty_txg(spa_t *spa) +{ + return (spa->spa_final_txg - TXG_DEFER_SIZE); +} + pool_state_t spa_state(spa_t *spa) { Modified: stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/space_map.c ============================================================================== --- stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/space_map.c Wed Jul 26 16:45:39 2017 (r321553) +++ stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/space_map.c Wed Jul 26 16:46:38 2017 (r321554) @@ -23,7 +23,7 @@ * Use is subject to license terms. */ /* - * Copyright (c) 2012, 2014 by Delphix. All rights reserved. + * Copyright (c) 2012, 2016 by Delphix. All rights reserved. */ #include @@ -407,6 +407,7 @@ space_map_truncate(space_map_t *sm, dmu_tx_t *tx) ASSERT(dsl_pool_sync_context(dmu_objset_pool(os))); ASSERT(dmu_tx_is_syncing(tx)); + VERIFY3U(dmu_tx_get_txg(tx), <=, spa_final_dirty_txg(spa)); dmu_object_info_from_db(sm->sm_dbuf, &doi); @@ -421,9 +422,10 @@ space_map_truncate(space_map_t *sm, dmu_tx_t *tx) if ((spa_feature_is_enabled(spa, SPA_FEATURE_SPACEMAP_HISTOGRAM) && doi.doi_bonus_size != sizeof (space_map_phys_t)) || doi.doi_data_block_size != space_map_blksz) { - zfs_dbgmsg("txg %llu, spa %s, reallocating: " - "old bonus %u, old blocksz %u", dmu_tx_get_txg(tx), - spa_name(spa), doi.doi_bonus_size, doi.doi_data_block_size); + zfs_dbgmsg("txg %llu, spa %s, sm %p, reallocating " + "object[%llu]: old bonus %u, old blocksz %u", + dmu_tx_get_txg(tx), spa_name(spa), sm, sm->sm_object, + doi.doi_bonus_size, doi.doi_data_block_size); space_map_free(sm, tx); dmu_buf_rele(sm->sm_dbuf, sm); Modified: stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/sys/spa.h ============================================================================== --- stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/sys/spa.h Wed Jul 26 16:45:39 2017 (r321553) +++ stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/sys/spa.h Wed Jul 26 16:46:38 2017 (r321554) @@ -791,6 +791,7 @@ extern uint64_t spa_load_guid(spa_t *spa); extern uint64_t spa_last_synced_txg(spa_t *spa); extern uint64_t spa_first_txg(spa_t *spa); extern uint64_t spa_syncing_txg(spa_t *spa); +extern uint64_t spa_final_dirty_txg(spa_t *spa); extern uint64_t spa_version(spa_t *spa); extern pool_state_t spa_state(spa_t *spa); extern spa_load_state_t spa_load_state(spa_t *spa); From owner-svn-src-stable-11@freebsd.org Wed Jul 26 16:47:34 2017 Return-Path: Delivered-To: svn-src-stable-11@mailman.ysv.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by mailman.ysv.freebsd.org (Postfix) with ESMTP id D4182DA9AC9; Wed, 26 Jul 2017 16:47:34 +0000 (UTC) (envelope-from mav@FreeBSD.org) Received: from repo.freebsd.org (repo.freebsd.org [IPv6:2610:1c1:1:6068::e6a:0]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (Client did not present a certificate) by mx1.freebsd.org (Postfix) with ESMTPS id A0F8E6B3E3; Wed, 26 Jul 2017 16:47:34 +0000 (UTC) (envelope-from mav@FreeBSD.org) Received: from repo.freebsd.org ([127.0.1.37]) by repo.freebsd.org (8.15.2/8.15.2) with ESMTP id v6QGlXbX077342; Wed, 26 Jul 2017 16:47:33 GMT (envelope-from mav@FreeBSD.org) Received: (from mav@localhost) by repo.freebsd.org (8.15.2/8.15.2/Submit) id v6QGlXCC077341; Wed, 26 Jul 2017 16:47:33 GMT (envelope-from mav@FreeBSD.org) Message-Id: <201707261647.v6QGlXCC077341@repo.freebsd.org> X-Authentication-Warning: repo.freebsd.org: mav set sender to mav@FreeBSD.org using -f From: Alexander Motin Date: Wed, 26 Jul 2017 16:47:33 +0000 (UTC) To: src-committers@freebsd.org, svn-src-all@freebsd.org, svn-src-stable@freebsd.org, svn-src-stable-11@freebsd.org Subject: svn commit: r321555 - stable/11/cddl/contrib/opensolaris/lib/libzfs/common X-SVN-Group: stable-11 X-SVN-Commit-Author: mav X-SVN-Commit-Paths: stable/11/cddl/contrib/opensolaris/lib/libzfs/common X-SVN-Commit-Revision: 321555 X-SVN-Commit-Repository: base MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit X-BeenThere: svn-src-stable-11@freebsd.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: SVN commit messages for only the 11-stable src tree List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 26 Jul 2017 16:47:34 -0000 Author: mav Date: Wed Jul 26 16:47:33 2017 New Revision: 321555 URL: https://svnweb.freebsd.org/changeset/base/321555 Log: MFC r318831: MFV r316922: 5380 receive of a send -p stream doesn't need to try renaming snapshots illumos/illumos-gate@471a88e499c660844f4590487ce7c4d5a7090294 https://github.com/illumos/illumos-gate/commit/471a88e499c660844f4590487ce7c4d5a7090294 https://www.illumos.org/issues/5380 A stream created with zfs send -p -I contains properties of all snapshots of a given dataset as opposed to only properties of snapshots in a given range. Not only this is suboptimal but the receive code also does not filter properties by the range. So, properties of earlier snapshots would be updated even though the snapshots themselves are not in the stream (just their properties). Given that modifying the snapshot properties requires a TXG sync and that the snapshots are updated one by one the described behavior may lead to a sever performance penalty. Reviewed by: Paul Dagnelie Reviewed by: Matt Ahrens Approved by: Dan McDonald Author: Andriy Gapon Modified: stable/11/cddl/contrib/opensolaris/lib/libzfs/common/libzfs_sendrecv.c Directory Properties: stable/11/ (props changed) Modified: stable/11/cddl/contrib/opensolaris/lib/libzfs/common/libzfs_sendrecv.c ============================================================================== --- stable/11/cddl/contrib/opensolaris/lib/libzfs/common/libzfs_sendrecv.c Wed Jul 26 16:46:38 2017 (r321554) +++ stable/11/cddl/contrib/opensolaris/lib/libzfs/common/libzfs_sendrecv.c Wed Jul 26 16:47:33 2017 (r321555) @@ -2800,7 +2800,7 @@ zfs_receive_package(libzfs_handle_t *hdl, int fd, cons goto out; } - if (fromsnap != NULL) { + if (fromsnap != NULL && recursive) { nvlist_t *renamed = NULL; nvpair_t *pair = NULL; @@ -2827,7 +2827,7 @@ zfs_receive_package(libzfs_handle_t *hdl, int fd, cons *strchr(tofs, '@') = '\0'; } - if (recursive && !flags->dryrun && !flags->nomount) { + if (!flags->dryrun && !flags->nomount) { VERIFY(0 == nvlist_alloc(&renamed, NV_UNIQUE_NAME, 0)); } @@ -2896,7 +2896,7 @@ zfs_receive_package(libzfs_handle_t *hdl, int fd, cons anyerr |= error; } while (error == 0); - if (drr->drr_payloadlen != 0 && fromsnap != NULL) { + if (drr->drr_payloadlen != 0 && recursive && fromsnap != NULL) { /* * Now that we have the fs's they sent us, try the * renames again. From owner-svn-src-stable-11@freebsd.org Wed Jul 26 16:48:35 2017 Return-Path: Delivered-To: svn-src-stable-11@mailman.ysv.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by mailman.ysv.freebsd.org (Postfix) with ESMTP id BDBF6DA9B77; Wed, 26 Jul 2017 16:48:35 +0000 (UTC) (envelope-from mav@FreeBSD.org) Received: from repo.freebsd.org (repo.freebsd.org [IPv6:2610:1c1:1:6068::e6a:0]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (Client did not present a certificate) by mx1.freebsd.org (Postfix) with ESMTPS id 99D766B520; Wed, 26 Jul 2017 16:48:35 +0000 (UTC) (envelope-from mav@FreeBSD.org) Received: from repo.freebsd.org ([127.0.1.37]) by repo.freebsd.org (8.15.2/8.15.2) with ESMTP id v6QGmYXd077430; Wed, 26 Jul 2017 16:48:34 GMT (envelope-from mav@FreeBSD.org) Received: (from mav@localhost) by repo.freebsd.org (8.15.2/8.15.2/Submit) id v6QGmY6I077428; Wed, 26 Jul 2017 16:48:34 GMT (envelope-from mav@FreeBSD.org) Message-Id: <201707261648.v6QGmY6I077428@repo.freebsd.org> X-Authentication-Warning: repo.freebsd.org: mav set sender to mav@FreeBSD.org using -f From: Alexander Motin Date: Wed, 26 Jul 2017 16:48:34 +0000 (UTC) To: src-committers@freebsd.org, svn-src-all@freebsd.org, svn-src-stable@freebsd.org, svn-src-stable-11@freebsd.org Subject: svn commit: r321556 - stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs X-SVN-Group: stable-11 X-SVN-Commit-Author: mav X-SVN-Commit-Paths: stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs X-SVN-Commit-Revision: 321556 X-SVN-Commit-Repository: base MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit X-BeenThere: svn-src-stable-11@freebsd.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: SVN commit messages for only the 11-stable src tree List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 26 Jul 2017 16:48:35 -0000 Author: mav Date: Wed Jul 26 16:48:34 2017 New Revision: 321556 URL: https://svnweb.freebsd.org/changeset/base/321556 Log: MFC r318833: MFV r316925: 6101 attempt to lzc_create() a filesystem under a volume results in a panic illumos/illumos-gate@b127fe3c059af7adf772735498680b4f2e1405ef https://github.com/illumos/illumos-gate/commit/b127fe3c059af7adf772735498680b4f2e1405ef https://www.illumos.org/issues/6101 lzc_create(), or more correctly, zfs_ioc_create() does not reject an attempt to create a filesystem as a child of a volume, instead it proceeds to a crash. A crash stack obtained on FreeBSD: page fault while in kernel mode zap_leaf_lookup() fzap_lookup() zap_lookup_norm() zap_lookup() zfs_get_zplprop() zfs_fill_zplprops_impl() zfs_ioc_create() zfsdev_ioctl() devfs_ioctl_f() kern_ioctl() sys_ioctl() This crash happened with a kernel without debugging assertions. The immediate cause of crash appears to an attempt to interpret a zvol object as a zap object. For filesystems: #define MASTER_NODE_OBJ 1 For zvols: #define ZVOL_OBJ 1ULL #define ZVOL_ZAP_OBJ 2ULL So, I see two problems here: 1. an attempt to create a filesystem under a zvol should be rejected as early as possible, maybe in zfs_fill_zplprops() 2. maybe zap_lookup / zap_lockdir should reject objects that are not of one of the zap object types Reviewed by: Matthew Ahrens Approved by: Dan McDonald Author: Andriy Gapon Modified: stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/zfs_ioctl.c stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/zfs_vfsops.c Directory Properties: stable/11/ (props changed) Modified: stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/zfs_ioctl.c ============================================================================== --- stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/zfs_ioctl.c Wed Jul 26 16:47:33 2017 (r321555) +++ stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/zfs_ioctl.c Wed Jul 26 16:48:34 2017 (r321556) @@ -3092,6 +3092,9 @@ zfs_fill_zplprops_impl(objset_t *os, uint64_t zplver, ASSERT(zplprops != NULL); + if (os != NULL && os->os_phys->os_type != DMU_OST_ZFS) + return (SET_ERROR(EINVAL)); + /* * Pull out creator prop choices, if any. */ Modified: stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/zfs_vfsops.c ============================================================================== --- stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/zfs_vfsops.c Wed Jul 26 16:47:33 2017 (r321555) +++ stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/zfs_vfsops.c Wed Jul 26 16:48:34 2017 (r321556) @@ -2459,8 +2459,10 @@ zfs_get_zplprop(objset_t *os, zfs_prop_t prop, uint64_ else pname = zfs_prop_to_name(prop); - if (os != NULL) + if (os != NULL) { + ASSERT3U(os->os_phys->os_type, ==, DMU_OST_ZFS); error = zap_lookup(os, MASTER_NODE_OBJ, pname, 8, 1, value); + } if (error == ENOENT) { /* No value set, use the default value */ From owner-svn-src-stable-11@freebsd.org Wed Jul 26 16:49:28 2017 Return-Path: Delivered-To: svn-src-stable-11@mailman.ysv.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by mailman.ysv.freebsd.org (Postfix) with ESMTP id C26A9DA9BE4; Wed, 26 Jul 2017 16:49:28 +0000 (UTC) (envelope-from mav@FreeBSD.org) Received: from repo.freebsd.org (repo.freebsd.org [IPv6:2610:1c1:1:6068::e6a:0]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (Client did not present a certificate) by mx1.freebsd.org (Postfix) with ESMTPS id 8FF706B65A; Wed, 26 Jul 2017 16:49:28 +0000 (UTC) (envelope-from mav@FreeBSD.org) Received: from repo.freebsd.org ([127.0.1.37]) by repo.freebsd.org (8.15.2/8.15.2) with ESMTP id v6QGnR34077512; Wed, 26 Jul 2017 16:49:27 GMT (envelope-from mav@FreeBSD.org) Received: (from mav@localhost) by repo.freebsd.org (8.15.2/8.15.2/Submit) id v6QGnRF2077511; Wed, 26 Jul 2017 16:49:27 GMT (envelope-from mav@FreeBSD.org) Message-Id: <201707261649.v6QGnRF2077511@repo.freebsd.org> X-Authentication-Warning: repo.freebsd.org: mav set sender to mav@FreeBSD.org using -f From: Alexander Motin Date: Wed, 26 Jul 2017 16:49:27 +0000 (UTC) To: src-committers@freebsd.org, svn-src-all@freebsd.org, svn-src-stable@freebsd.org, svn-src-stable-11@freebsd.org Subject: svn commit: r321557 - stable/11 X-SVN-Group: stable-11 X-SVN-Commit-Author: mav X-SVN-Commit-Paths: stable/11 X-SVN-Commit-Revision: 321557 X-SVN-Commit-Repository: base MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit X-BeenThere: svn-src-stable-11@freebsd.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: SVN commit messages for only the 11-stable src tree List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 26 Jul 2017 16:49:28 -0000 Author: mav Date: Wed Jul 26 16:49:27 2017 New Revision: 321557 URL: https://svnweb.freebsd.org/changeset/base/321557 Log: MFC r318834: MFV r316930: 5814 bpobj_iterate_impl(): Close a refcount leak iterating on a sublist. illumos/illumos-gate@b67dde11a73a9455d641403cbbb65ec2add41b41 https://github.com/illumos/illumos-gate/commit/b67dde11a73a9455d641403cbbb65ec2add41b41 https://www.illumos.org/issues/5814 Lets pull in this patch from freebsd: http://svnweb.freebsd.org/base?view=revision&revision=271781 bpobj_iterate_impl(): Close a refcount leak iterating on a sublist. If bpobj_space() returned non-zero here, the sublist would have been left open, along with the bonus buffer hold it requires. This call does not invoke any calls to bpobj_close() itself. This bug doesn't have any known vector, but was found on inspection. MFC after: 1 week Sponsored by: Spectra Logic Affects: All ZFS versions starting 21 May 2010 (illumos cde58dbc) MFSpectraBSD: r1050998 on 2014/03/26 Fix bpobj_iterate_impl() to properly call bpobj_close() if bpobj_space() returns an error. Reviewed by: Prakash Surya Reviewed by: Matthew Ahrens Reviewed by: Paul Dagnelie Reviewed by: Simon Klinkert Approved by: Gordon Ross Author: Will Andrews FreeBSD note: this is a "record-only" commit as the actual change was directly committed to FreeBSD by (or on behalf of) the author. Modified: Directory Properties: stable/11/ (props changed) From owner-svn-src-stable-11@freebsd.org Wed Jul 26 16:50:16 2017 Return-Path: Delivered-To: svn-src-stable-11@mailman.ysv.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by mailman.ysv.freebsd.org (Postfix) with ESMTP id D9433DA9D35; Wed, 26 Jul 2017 16:50:16 +0000 (UTC) (envelope-from mav@FreeBSD.org) Received: from repo.freebsd.org (repo.freebsd.org [IPv6:2610:1c1:1:6068::e6a:0]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (Client did not present a certificate) by mx1.freebsd.org (Postfix) with ESMTPS id A54126B85C; Wed, 26 Jul 2017 16:50:16 +0000 (UTC) (envelope-from mav@FreeBSD.org) Received: from repo.freebsd.org ([127.0.1.37]) by repo.freebsd.org (8.15.2/8.15.2) with ESMTP id v6QGoF3Q077620; Wed, 26 Jul 2017 16:50:15 GMT (envelope-from mav@FreeBSD.org) Received: (from mav@localhost) by repo.freebsd.org (8.15.2/8.15.2/Submit) id v6QGoFUN077619; Wed, 26 Jul 2017 16:50:15 GMT (envelope-from mav@FreeBSD.org) Message-Id: <201707261650.v6QGoFUN077619@repo.freebsd.org> X-Authentication-Warning: repo.freebsd.org: mav set sender to mav@FreeBSD.org using -f From: Alexander Motin Date: Wed, 26 Jul 2017 16:50:15 +0000 (UTC) To: src-committers@freebsd.org, svn-src-all@freebsd.org, svn-src-stable@freebsd.org, svn-src-stable-11@freebsd.org Subject: svn commit: r321558 - stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs X-SVN-Group: stable-11 X-SVN-Commit-Author: mav X-SVN-Commit-Paths: stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs X-SVN-Commit-Revision: 321558 X-SVN-Commit-Repository: base MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit X-BeenThere: svn-src-stable-11@freebsd.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: SVN commit messages for only the 11-stable src tree List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 26 Jul 2017 16:50:17 -0000 Author: mav Date: Wed Jul 26 16:50:15 2017 New Revision: 321558 URL: https://svnweb.freebsd.org/changeset/base/321558 Log: MFC r318920: MFV r316924: 8061 sa_find_idx_tab can be declared more type-safely illumos/illumos-gate@7f0bdb4257bb4f1f76390b72665961e411da24c6 https://github.com/illumos/illumos-gate/commit/7f0bdb4257bb4f1f76390b72665961e411da24c6 https://www.illumos.org/issues/8061 sa_find_idx_tab() is declared as taking and returning "void *" parameters. These can be declared to be the specific types. Reviewed by: George Wilson Reviewed by: Chris Williamson Approved by: Dan McDonald Author: Matthew Ahrens Modified: stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/sa.c Directory Properties: stable/11/ (props changed) Modified: stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/sa.c ============================================================================== --- stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/sa.c Wed Jul 26 16:49:27 2017 (r321557) +++ stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/sa.c Wed Jul 26 16:50:15 2017 (r321558) @@ -22,7 +22,7 @@ /* * Copyright (c) 2010, Oracle and/or its affiliates. All rights reserved. * Portions Copyright 2011 iXsystems, Inc - * Copyright (c) 2013, 2016 by Delphix. All rights reserved. + * Copyright (c) 2013, 2017 by Delphix. All rights reserved. * Copyright (c) 2014 Spectra Logic Corporation, All rights reserved. * Copyright (c) 2014 Integros [integros.com] */ @@ -131,8 +131,8 @@ typedef void (sa_iterfunc_t)(void *hdr, void *addr, sa static int sa_build_index(sa_handle_t *hdl, sa_buf_type_t buftype); static void sa_idx_tab_hold(objset_t *os, sa_idx_tab_t *idx_tab); -static void *sa_find_idx_tab(objset_t *os, dmu_object_type_t bonustype, - void *data); +static sa_idx_tab_t *sa_find_idx_tab(objset_t *os, dmu_object_type_t bonustype, + sa_hdr_phys_t *hdr); static void sa_idx_tab_rele(objset_t *os, void *arg); static void sa_copy_data(sa_data_locator_t *func, void *start, void *target, int buflen); @@ -1486,11 +1486,10 @@ sa_lookup_uio(sa_handle_t *hdl, sa_attr_type_t attr, u } #endif -void * -sa_find_idx_tab(objset_t *os, dmu_object_type_t bonustype, void *data) +static sa_idx_tab_t * +sa_find_idx_tab(objset_t *os, dmu_object_type_t bonustype, sa_hdr_phys_t *hdr) { sa_idx_tab_t *idx_tab; - sa_hdr_phys_t *hdr = (sa_hdr_phys_t *)data; sa_os_t *sa = os->os_sa; sa_lot_t *tb, search; avl_index_t loc; From owner-svn-src-stable-11@freebsd.org Wed Jul 26 16:50:56 2017 Return-Path: Delivered-To: svn-src-stable-11@mailman.ysv.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by mailman.ysv.freebsd.org (Postfix) with ESMTP id 4D112DA9DAD; Wed, 26 Jul 2017 16:50:56 +0000 (UTC) (envelope-from mav@FreeBSD.org) Received: from repo.freebsd.org (repo.freebsd.org [IPv6:2610:1c1:1:6068::e6a:0]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (Client did not present a certificate) by mx1.freebsd.org (Postfix) with ESMTPS id 195FF6B9CB; Wed, 26 Jul 2017 16:50:56 +0000 (UTC) (envelope-from mav@FreeBSD.org) Received: from repo.freebsd.org ([127.0.1.37]) by repo.freebsd.org (8.15.2/8.15.2) with ESMTP id v6QGotX2078356; Wed, 26 Jul 2017 16:50:55 GMT (envelope-from mav@FreeBSD.org) Received: (from mav@localhost) by repo.freebsd.org (8.15.2/8.15.2/Submit) id v6QGotMx078355; Wed, 26 Jul 2017 16:50:55 GMT (envelope-from mav@FreeBSD.org) Message-Id: <201707261650.v6QGotMx078355@repo.freebsd.org> X-Authentication-Warning: repo.freebsd.org: mav set sender to mav@FreeBSD.org using -f From: Alexander Motin Date: Wed, 26 Jul 2017 16:50:55 +0000 (UTC) To: src-committers@freebsd.org, svn-src-all@freebsd.org, svn-src-stable@freebsd.org, svn-src-stable-11@freebsd.org Subject: svn commit: r321559 - stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs X-SVN-Group: stable-11 X-SVN-Commit-Author: mav X-SVN-Commit-Paths: stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs X-SVN-Commit-Revision: 321559 X-SVN-Commit-Repository: base MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit X-BeenThere: svn-src-stable-11@freebsd.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: SVN commit messages for only the 11-stable src tree List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 26 Jul 2017 16:50:56 -0000 Author: mav Date: Wed Jul 26 16:50:55 2017 New Revision: 321559 URL: https://svnweb.freebsd.org/changeset/base/321559 Log: MFC r318921: MFV r316928: 7256 low probability race in zfs_get_data illumos/illumos-gate@0c94e1af6784c69a1dea25e0e35dd13b2b91e2e5 https://github.com/illumos/illumos-gate/commit/0c94e1af6784c69a1dea25e0e35dd13b2b91e2e5 https://www.illumos.org/issues/7256 error = dmu_sync(zio, lr->lr_common.lrc_txg, zfs_get_done, zgd); ASSERT(error || lr->lr_length <= zp->z_blksz); It's possible, although extremely rare, that the zfs_get_done() callback is executed before dmu_sync() returns. In that case the znode's range lock is dropped and the znode is unreferenced. Thus, the assertion can access some invalid or wrong data via the zp pointer. size variable caches the correct value of z_blksz and can be safely used here. Reviewed by: Matt Ahrens Reviewed by: Pavel Zakharov Approved by: Dan McDonald Author: Andriy Gapon Modified: stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/zfs_vnops.c Directory Properties: stable/11/ (props changed) Modified: stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/zfs_vnops.c ============================================================================== --- stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/zfs_vnops.c Wed Jul 26 16:50:15 2017 (r321558) +++ stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/zfs_vnops.c Wed Jul 26 16:50:55 2017 (r321559) @@ -1385,7 +1385,7 @@ zfs_get_data(void *arg, lr_write_t *lr, char *buf, zio error = dmu_sync(zio, lr->lr_common.lrc_txg, zfs_get_done, zgd); - ASSERT(error || lr->lr_length <= zp->z_blksz); + ASSERT(error || lr->lr_length <= size); /* * On success, we need to wait for the write I/O From owner-svn-src-stable-11@freebsd.org Wed Jul 26 16:52:25 2017 Return-Path: Delivered-To: svn-src-stable-11@mailman.ysv.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by mailman.ysv.freebsd.org (Postfix) with ESMTP id 66518DA9FC6; Wed, 26 Jul 2017 16:52:25 +0000 (UTC) (envelope-from mav@FreeBSD.org) Received: from repo.freebsd.org (repo.freebsd.org [IPv6:2610:1c1:1:6068::e6a:0]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (Client did not present a certificate) by mx1.freebsd.org (Postfix) with ESMTPS id 35C9B6BD66; Wed, 26 Jul 2017 16:52:25 +0000 (UTC) (envelope-from mav@FreeBSD.org) Received: from repo.freebsd.org ([127.0.1.37]) by repo.freebsd.org (8.15.2/8.15.2) with ESMTP id v6QGqOp3080580; Wed, 26 Jul 2017 16:52:24 GMT (envelope-from mav@FreeBSD.org) Received: (from mav@localhost) by repo.freebsd.org (8.15.2/8.15.2/Submit) id v6QGqOSe080579; Wed, 26 Jul 2017 16:52:24 GMT (envelope-from mav@FreeBSD.org) Message-Id: <201707261652.v6QGqOSe080579@repo.freebsd.org> X-Authentication-Warning: repo.freebsd.org: mav set sender to mav@FreeBSD.org using -f From: Alexander Motin Date: Wed, 26 Jul 2017 16:52:24 +0000 (UTC) To: src-committers@freebsd.org, svn-src-all@freebsd.org, svn-src-stable@freebsd.org, svn-src-stable-11@freebsd.org Subject: svn commit: r321560 - stable/11 X-SVN-Group: stable-11 X-SVN-Commit-Author: mav X-SVN-Commit-Paths: stable/11 X-SVN-Commit-Revision: 321560 X-SVN-Commit-Repository: base MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit X-BeenThere: svn-src-stable-11@freebsd.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: SVN commit messages for only the 11-stable src tree List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 26 Jul 2017 16:52:25 -0000 Author: mav Date: Wed Jul 26 16:52:24 2017 New Revision: 321560 URL: https://svnweb.freebsd.org/changeset/base/321560 Log: MFC r318922: MFV r316927: 5379 modifying a mmap()-ed file does not update its timestamps FreeBSD note: this is a record-only merge as the FreeBSD putpages code is quite different from the upstream. illumos/illumos-gate@80e10fd0d22bbf0d18bfdae035e06f44c68ae8e6 https://github.com/illumos/illumos-gate/commit/80e10fd0d22bbf0d18bfdae035e06f44c68ae8e6 https://www.illumos.org/issues/5379 The following is based on a review of the illumos code and on a similar problem reported for FreeBSD where the relevant code is different. Looking at this block of code http://src.illumos.org/source/xref/illumos-gate/ usr/src/uts/common/fs/zfs/zfs_vnops.c#4187 I see code to set up an sa_bulk_attr_t object, I see code to set up mtime and ctime values, but I do not see code to actually apply the attributes... I would expect there to be a call to sa_bulk_update(), there is such a call in zfs_write() for instance. mmap_write.c [Magnifier] - demo (1.42 KB) Andriy Gapon, 2015-11-11 01:53 PM Reviewed by: Matthew Ahrens Reviewed by: Prashanth Sreenivasa Reviewed by: Dan McDonald Approved by: Gordon Ross Author: Andriy Gapon Modified: Directory Properties: stable/11/ (props changed) From owner-svn-src-stable-11@freebsd.org Wed Jul 26 16:53:41 2017 Return-Path: Delivered-To: svn-src-stable-11@mailman.ysv.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by mailman.ysv.freebsd.org (Postfix) with ESMTP id 4F43DDAA0F5; Wed, 26 Jul 2017 16:53:41 +0000 (UTC) (envelope-from mav@FreeBSD.org) Received: from repo.freebsd.org (repo.freebsd.org [IPv6:2610:1c1:1:6068::e6a:0]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (Client did not present a certificate) by mx1.freebsd.org (Postfix) with ESMTPS id 1B8836BFB0; Wed, 26 Jul 2017 16:53:41 +0000 (UTC) (envelope-from mav@FreeBSD.org) Received: from repo.freebsd.org ([127.0.1.37]) by repo.freebsd.org (8.15.2/8.15.2) with ESMTP id v6QGre04081399; Wed, 26 Jul 2017 16:53:40 GMT (envelope-from mav@FreeBSD.org) Received: (from mav@localhost) by repo.freebsd.org (8.15.2/8.15.2/Submit) id v6QGrepS081398; Wed, 26 Jul 2017 16:53:40 GMT (envelope-from mav@FreeBSD.org) Message-Id: <201707261653.v6QGrepS081398@repo.freebsd.org> X-Authentication-Warning: repo.freebsd.org: mav set sender to mav@FreeBSD.org using -f From: Alexander Motin Date: Wed, 26 Jul 2017 16:53:40 +0000 (UTC) To: src-committers@freebsd.org, svn-src-all@freebsd.org, svn-src-stable@freebsd.org, svn-src-stable-11@freebsd.org Subject: svn commit: r321561 - stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs X-SVN-Group: stable-11 X-SVN-Commit-Author: mav X-SVN-Commit-Paths: stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs X-SVN-Commit-Revision: 321561 X-SVN-Commit-Repository: base MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit X-BeenThere: svn-src-stable-11@freebsd.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: SVN commit messages for only the 11-stable src tree List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 26 Jul 2017 16:53:41 -0000 Author: mav Date: Wed Jul 26 16:53:39 2017 New Revision: 321561 URL: https://svnweb.freebsd.org/changeset/base/321561 Log: MFC r318923: zfs_putpages: assert that sa_bulk_update() must succeed Same as the upstream does in r316927. Modified: stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/zfs_vnops.c Directory Properties: stable/11/ (props changed) Modified: stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/zfs_vnops.c ============================================================================== --- stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/zfs_vnops.c Wed Jul 26 16:52:24 2017 (r321560) +++ stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/zfs_vnops.c Wed Jul 26 16:53:39 2017 (r321561) @@ -4757,7 +4757,8 @@ zfs_putpages(struct vnode *vp, vm_page_t *ma, size_t l &zp->z_pflags, 8); zfs_tstamp_update_setup(zp, CONTENT_MODIFIED, mtime, ctime, B_TRUE); - (void)sa_bulk_update(zp->z_sa_hdl, bulk, count, tx); + err = sa_bulk_update(zp->z_sa_hdl, bulk, count, tx); + ASSERT0(err); zfs_log_write(zfsvfs->z_log, tx, TX_WRITE, zp, off, len, 0); zfs_vmobject_wlock(object); From owner-svn-src-stable-11@freebsd.org Wed Jul 26 16:55:08 2017 Return-Path: Delivered-To: svn-src-stable-11@mailman.ysv.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by mailman.ysv.freebsd.org (Postfix) with ESMTP id 70346DAA236; Wed, 26 Jul 2017 16:55:08 +0000 (UTC) (envelope-from mav@FreeBSD.org) Received: from repo.freebsd.org (repo.freebsd.org [IPv6:2610:1c1:1:6068::e6a:0]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (Client did not present a certificate) by mx1.freebsd.org (Postfix) with ESMTPS id 4B1706C114; Wed, 26 Jul 2017 16:55:08 +0000 (UTC) (envelope-from mav@FreeBSD.org) Received: from repo.freebsd.org ([127.0.1.37]) by repo.freebsd.org (8.15.2/8.15.2) with ESMTP id v6QGt7ds081523; Wed, 26 Jul 2017 16:55:07 GMT (envelope-from mav@FreeBSD.org) Received: (from mav@localhost) by repo.freebsd.org (8.15.2/8.15.2/Submit) id v6QGt7ZZ081522; Wed, 26 Jul 2017 16:55:07 GMT (envelope-from mav@FreeBSD.org) Message-Id: <201707261655.v6QGt7ZZ081522@repo.freebsd.org> X-Authentication-Warning: repo.freebsd.org: mav set sender to mav@FreeBSD.org using -f From: Alexander Motin Date: Wed, 26 Jul 2017 16:55:07 +0000 (UTC) To: src-committers@freebsd.org, svn-src-all@freebsd.org, svn-src-stable@freebsd.org, svn-src-stable-11@freebsd.org Subject: svn commit: r321562 - stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs X-SVN-Group: stable-11 X-SVN-Commit-Author: mav X-SVN-Commit-Paths: stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs X-SVN-Commit-Revision: 321562 X-SVN-Commit-Repository: base MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit X-BeenThere: svn-src-stable-11@freebsd.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: SVN commit messages for only the 11-stable src tree List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 26 Jul 2017 16:55:08 -0000 Author: mav Date: Wed Jul 26 16:55:07 2017 New Revision: 321562 URL: https://svnweb.freebsd.org/changeset/base/321562 Log: MFC r318924 (by avg): arc_init: make code closer to upstream by introducing 'allmem' variable All the differences in calculations are kept. A comment about arc_max being 1/2 of all memory is fixed to reflect the actual code that uses 5/8 as a factor. Modified: stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/arc.c Directory Properties: stable/11/ (props changed) Modified: stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/arc.c ============================================================================== --- stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/arc.c Wed Jul 26 16:53:39 2017 (r321561) +++ stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/arc.c Wed Jul 26 16:55:07 2017 (r321562) @@ -6315,6 +6315,20 @@ arc_init(void) { int i, prefetch_tunable_set = 0; + /* + * allmem is "all memory that we could possibly use". + */ +#ifdef illumos +#ifdef _KERNEL + uint64_t allmem = ptob(physmem - swapfs_minfree); +#else + uint64_t allmem = (physmem * PAGESIZE) / 2; +#endif +#else + uint64_t allmem = kmem_size(); +#endif + + mutex_init(&arc_reclaim_lock, NULL, MUTEX_DEFAULT, NULL); cv_init(&arc_reclaim_thread_cv, NULL, CV_DEFAULT, NULL); cv_init(&arc_reclaim_waiters_cv, NULL, CV_DEFAULT, NULL); @@ -6326,7 +6340,7 @@ arc_init(void) arc_min_prefetch_lifespan = 1 * hz; /* Start out with 1/8 of all memory */ - arc_c = kmem_size() / 8; + arc_c = allmem / 8; #ifdef illumos #ifdef _KERNEL @@ -6339,13 +6353,13 @@ arc_init(void) #endif #endif /* illumos */ /* set min cache to 1/32 of all memory, or arc_abs_min, whichever is more */ - arc_c_min = MAX(arc_c / 4, arc_abs_min); - /* set max to 1/2 of all memory, or all but 1GB, whichever is more */ - if (arc_c * 8 >= 1 << 30) - arc_c_max = (arc_c * 8) - (1 << 30); + arc_c_min = MAX(allmem / 32, arc_abs_min); + /* set max to 5/8 of all memory, or all but 1GB, whichever is more */ + if (allmem >= 1 << 30) + arc_c_max = allmem - (1 << 30); else arc_c_max = arc_c_min; - arc_c_max = MAX(arc_c * 5, arc_c_max); + arc_c_max = MAX(allmem * 5 / 8, arc_c_max); /* * In userland, there's only the memory pressure that we artificially @@ -6362,7 +6376,7 @@ arc_init(void) * Allow the tunables to override our calculations if they are * reasonable. */ - if (zfs_arc_max > arc_abs_min && zfs_arc_max < kmem_size()) { + if (zfs_arc_max > arc_abs_min && zfs_arc_max < allmem) { arc_c_max = zfs_arc_max; arc_c_min = MIN(arc_c_min, arc_c_max); } @@ -6485,7 +6499,7 @@ arc_init(void) printf("ZFS WARNING: Recommended minimum RAM size is 512MB; " "expect unstable behavior.\n"); } - if (kmem_size() < 512 * (1 << 20)) { + if (allmem < 512 * (1 << 20)) { printf("ZFS WARNING: Recommended minimum kmem_size is 512MB; " "expect unstable behavior.\n"); printf(" Consider tuning vm.kmem_size and " From owner-svn-src-stable-11@freebsd.org Wed Jul 26 16:57:19 2017 Return-Path: Delivered-To: svn-src-stable-11@mailman.ysv.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by mailman.ysv.freebsd.org (Postfix) with ESMTP id 14013DAA2FC; Wed, 26 Jul 2017 16:57:19 +0000 (UTC) (envelope-from mav@FreeBSD.org) Received: from repo.freebsd.org (repo.freebsd.org [IPv6:2610:1c1:1:6068::e6a:0]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (Client did not present a certificate) by mx1.freebsd.org (Postfix) with ESMTPS id 898376C293; Wed, 26 Jul 2017 16:57:18 +0000 (UTC) (envelope-from mav@FreeBSD.org) Received: from repo.freebsd.org ([127.0.1.37]) by repo.freebsd.org (8.15.2/8.15.2) with ESMTP id v6QGvH0H081644; Wed, 26 Jul 2017 16:57:17 GMT (envelope-from mav@FreeBSD.org) Received: (from mav@localhost) by repo.freebsd.org (8.15.2/8.15.2/Submit) id v6QGvHiK081643; Wed, 26 Jul 2017 16:57:17 GMT (envelope-from mav@FreeBSD.org) Message-Id: <201707261657.v6QGvHiK081643@repo.freebsd.org> X-Authentication-Warning: repo.freebsd.org: mav set sender to mav@FreeBSD.org using -f From: Alexander Motin Date: Wed, 26 Jul 2017 16:57:17 +0000 (UTC) To: src-committers@freebsd.org, svn-src-all@freebsd.org, svn-src-stable@freebsd.org, svn-src-stable-11@freebsd.org Subject: svn commit: r321563 - stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs X-SVN-Group: stable-11 X-SVN-Commit-Author: mav X-SVN-Commit-Paths: stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs X-SVN-Commit-Revision: 321563 X-SVN-Commit-Repository: base MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit X-BeenThere: svn-src-stable-11@freebsd.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: SVN commit messages for only the 11-stable src tree List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 26 Jul 2017 16:57:19 -0000 Author: mav Date: Wed Jul 26 16:57:17 2017 New Revision: 321563 URL: https://svnweb.freebsd.org/changeset/base/321563 Log: MFC r318925: MFV r316929: 6914 kernel virtual memory fragmentation leads to hang illumos/illumos-gate@af868f46a5b794687741d5424de9e3a2d684a84a https://github.com/illumos/illumos-gate/commit/af868f46a5b794687741d5424de9e3a2d684a84a https://www.illumos.org/issues/6914 FreeBSD note: only a ZFS part of the change is merged, changes to the VM subsystem are not ported (obviously). Also, now that FreeBSD has vmem(9) we don't have to ifdef-out the code that uses it. Modified: stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/arc.c Directory Properties: stable/11/ (props changed) Modified: stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/arc.c ============================================================================== --- stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/arc.c Wed Jul 26 16:55:07 2017 (r321562) +++ stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/arc.c Wed Jul 26 16:57:17 2017 (r321563) @@ -6339,19 +6339,6 @@ arc_init(void) /* Convert seconds to clock ticks */ arc_min_prefetch_lifespan = 1 * hz; - /* Start out with 1/8 of all memory */ - arc_c = allmem / 8; - -#ifdef illumos -#ifdef _KERNEL - /* - * On architectures where the physical memory can be larger - * than the addressable space (intel in 32-bit mode), we may - * need to limit the cache to 1/8 of VM size. - */ - arc_c = MIN(arc_c, vmem_size(heap_arena, VMEM_ALLOC | VMEM_FREE) / 8); -#endif -#endif /* illumos */ /* set min cache to 1/32 of all memory, or arc_abs_min, whichever is more */ arc_c_min = MAX(allmem / 32, arc_abs_min); /* set max to 5/8 of all memory, or all but 1GB, whichever is more */ @@ -6390,6 +6377,15 @@ arc_init(void) /* limit meta-data to 1/4 of the arc capacity */ arc_meta_limit = arc_c_max / 4; + +#ifdef _KERNEL + /* + * Metadata is stored in the kernel's heap. Don't let us + * use more than half the heap for the ARC. + */ + arc_meta_limit = MIN(arc_meta_limit, + vmem_size(heap_arena, VMEM_ALLOC | VMEM_FREE) / 2); +#endif /* Allow the tunable to override if it is reasonable */ if (zfs_arc_meta_limit > 0 && zfs_arc_meta_limit <= arc_c_max) From owner-svn-src-stable-11@freebsd.org Wed Jul 26 16:58:02 2017 Return-Path: Delivered-To: svn-src-stable-11@mailman.ysv.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by mailman.ysv.freebsd.org (Postfix) with ESMTP id 4E054DAA368; Wed, 26 Jul 2017 16:58:02 +0000 (UTC) (envelope-from mav@FreeBSD.org) Received: from repo.freebsd.org (repo.freebsd.org [IPv6:2610:1c1:1:6068::e6a:0]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (Client did not present a certificate) by mx1.freebsd.org (Postfix) with ESMTPS id 1D3596C3BC; Wed, 26 Jul 2017 16:58:02 +0000 (UTC) (envelope-from mav@FreeBSD.org) Received: from repo.freebsd.org ([127.0.1.37]) by repo.freebsd.org (8.15.2/8.15.2) with ESMTP id v6QGw1fM081725; Wed, 26 Jul 2017 16:58:01 GMT (envelope-from mav@FreeBSD.org) Received: (from mav@localhost) by repo.freebsd.org (8.15.2/8.15.2/Submit) id v6QGw1G1081724; Wed, 26 Jul 2017 16:58:01 GMT (envelope-from mav@FreeBSD.org) Message-Id: <201707261658.v6QGw1G1081724@repo.freebsd.org> X-Authentication-Warning: repo.freebsd.org: mav set sender to mav@FreeBSD.org using -f From: Alexander Motin Date: Wed, 26 Jul 2017 16:58:01 +0000 (UTC) To: src-committers@freebsd.org, svn-src-all@freebsd.org, svn-src-stable@freebsd.org, svn-src-stable-11@freebsd.org Subject: svn commit: r321564 - stable/11 X-SVN-Group: stable-11 X-SVN-Commit-Author: mav X-SVN-Commit-Paths: stable/11 X-SVN-Commit-Revision: 321564 X-SVN-Commit-Repository: base MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit X-BeenThere: svn-src-stable-11@freebsd.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: SVN commit messages for only the 11-stable src tree List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 26 Jul 2017 16:58:02 -0000 Author: mav Date: Wed Jul 26 16:58:01 2017 New Revision: 321564 URL: https://svnweb.freebsd.org/changeset/base/321564 Log: MFC r318926: MFV r316919: 7885 zpool list can report 16.0e for expandsz FreeBSD note: this is a record-only change, the actual change was directly committed by smh. illumos/illumos-gate@c040c10cdd1e4eab0fc88203758367dd81e057b7 https://github.com/illumos/illumos-gate/commit/c040c10cdd1e4eab0fc88203758367dd81e057b7 https://www.illumos.org/issues/7885 When a member of a RAIDZ has been replaced with a device smaller than the original, then the top level vdev can report its expand size as 16.0E. The reduced child asize causes the RAIDZ to have a vdev_asize lower than its vdev_max_asize which then results in an underflow during the calculation of the parents expand size. Also for RAIDZ vdevs the sum of their child vdev_min_asize could be smaller than the parents vdev_min_size. Fixed by: https://github.com/openzfs/openzfs/pull/296 Reviewed by: Matthew Ahrens Reviewed by: George Wilson Approved by: Gordon Ross Author: Steven Hartland Modified: Directory Properties: stable/11/ (props changed) From owner-svn-src-stable-11@freebsd.org Wed Jul 26 17:12:54 2017 Return-Path: Delivered-To: svn-src-stable-11@mailman.ysv.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by mailman.ysv.freebsd.org (Postfix) with ESMTP id B2DDEDAAD6A; Wed, 26 Jul 2017 17:12:54 +0000 (UTC) (envelope-from mav@FreeBSD.org) Received: from repo.freebsd.org (repo.freebsd.org [IPv6:2610:1c1:1:6068::e6a:0]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (Client did not present a certificate) by mx1.freebsd.org (Postfix) with ESMTPS id 8C5626D1E4; Wed, 26 Jul 2017 17:12:54 +0000 (UTC) (envelope-from mav@FreeBSD.org) Received: from repo.freebsd.org ([127.0.1.37]) by repo.freebsd.org (8.15.2/8.15.2) with ESMTP id v6QHCrJ3090467; Wed, 26 Jul 2017 17:12:53 GMT (envelope-from mav@FreeBSD.org) Received: (from mav@localhost) by repo.freebsd.org (8.15.2/8.15.2/Submit) id v6QHCrkg090466; Wed, 26 Jul 2017 17:12:53 GMT (envelope-from mav@FreeBSD.org) Message-Id: <201707261712.v6QHCrkg090466@repo.freebsd.org> X-Authentication-Warning: repo.freebsd.org: mav set sender to mav@FreeBSD.org using -f From: Alexander Motin Date: Wed, 26 Jul 2017 17:12:53 +0000 (UTC) To: src-committers@freebsd.org, svn-src-all@freebsd.org, svn-src-stable@freebsd.org, svn-src-stable-11@freebsd.org Subject: svn commit: r321565 - stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs X-SVN-Group: stable-11 X-SVN-Commit-Author: mav X-SVN-Commit-Paths: stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs X-SVN-Commit-Revision: 321565 X-SVN-Commit-Repository: base MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit X-BeenThere: svn-src-stable-11@freebsd.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: SVN commit messages for only the 11-stable src tree List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 26 Jul 2017 17:12:54 -0000 Author: mav Date: Wed Jul 26 17:12:53 2017 New Revision: 321565 URL: https://svnweb.freebsd.org/changeset/base/321565 Log: MFC r318928: MFV r318927: 8025 dbuf_read() creates unnecessary zio_root() for bonus buf illumos/illumos-gate@def4fac5882b4ca67bd0f4a53509b6d1fa8ae14e https://github.com/illumos/illumos-gate/commit/def4fac5882b4ca67bd0f4a53509b6d1fa8ae14e https://www.illumos.org/issues/8025 dbuf_read() creates a zio_root() to track and wait for all the zio's that may happen as part of this call. However, if the blkptr_t for this buffer is NULL or a hole, we will not create any more zio's, so this zio_root() is unnecessary. This is always the case when calling dbuf_read() on a bonus buffer, because it has no blkptr (it's part of the containing dnode). For workloads that read a lot of bonus buffers (e.g. file creation and removal), creating and destroying these unnecessary zio's can decrease performance by around 3%. Reviewed by: Dan Kimmel Reviewed by: Pavel Zakharov Reviewed by: Prashanth Sreenivasa Approved by: Robert Mustacchi Author: Matthew Ahrens Modified: stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/dbuf.c Directory Properties: stable/11/ (props changed) Modified: stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/dbuf.c ============================================================================== --- stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/dbuf.c Wed Jul 26 16:58:01 2017 (r321564) +++ stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/dbuf.c Wed Jul 26 17:12:53 2017 (r321565) @@ -1088,7 +1088,6 @@ int dbuf_read(dmu_buf_impl_t *db, zio_t *zio, uint32_t flags) { int err = 0; - boolean_t havepzio = (zio != NULL); boolean_t prefetch; dnode_t *dn; @@ -1132,9 +1131,13 @@ dbuf_read(dmu_buf_impl_t *db, zio_t *zio, uint32_t fla DB_DNODE_EXIT(db); } else if (db->db_state == DB_UNCACHED) { spa_t *spa = dn->dn_objset->os_spa; + boolean_t need_wait = B_FALSE; - if (zio == NULL) + if (zio == NULL && + db->db_blkptr != NULL && !BP_IS_HOLE(db->db_blkptr)) { zio = zio_root(spa, NULL, NULL, ZIO_FLAG_CANFAIL); + need_wait = B_TRUE; + } dbuf_read_impl(db, zio, flags); /* dbuf_read_impl has dropped db_mtx for us */ @@ -1146,7 +1149,7 @@ dbuf_read(dmu_buf_impl_t *db, zio_t *zio, uint32_t fla rw_exit(&dn->dn_struct_rwlock); DB_DNODE_EXIT(db); - if (!havepzio) + if (need_wait) err = zio_wait(zio); } else { /* @@ -1181,7 +1184,6 @@ dbuf_read(dmu_buf_impl_t *db, zio_t *zio, uint32_t fla mutex_exit(&db->db_mtx); } - ASSERT(err || havepzio || db->db_state == DB_CACHED); return (err); } From owner-svn-src-stable-11@freebsd.org Wed Jul 26 17:35:05 2017 Return-Path: Delivered-To: svn-src-stable-11@mailman.ysv.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by mailman.ysv.freebsd.org (Postfix) with ESMTP id 14AA9DAB48D; Wed, 26 Jul 2017 17:35:05 +0000 (UTC) (envelope-from mav@FreeBSD.org) Received: from repo.freebsd.org (repo.freebsd.org [IPv6:2610:1c1:1:6068::e6a:0]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (Client did not present a certificate) by mx1.freebsd.org (Postfix) with ESMTPS id E26066DF07; Wed, 26 Jul 2017 17:35:04 +0000 (UTC) (envelope-from mav@FreeBSD.org) Received: from repo.freebsd.org ([127.0.1.37]) by repo.freebsd.org (8.15.2/8.15.2) with ESMTP id v6QHZ4Q9098726; Wed, 26 Jul 2017 17:35:04 GMT (envelope-from mav@FreeBSD.org) Received: (from mav@localhost) by repo.freebsd.org (8.15.2/8.15.2/Submit) id v6QHZ4YX098725; Wed, 26 Jul 2017 17:35:04 GMT (envelope-from mav@FreeBSD.org) Message-Id: <201707261735.v6QHZ4YX098725@repo.freebsd.org> X-Authentication-Warning: repo.freebsd.org: mav set sender to mav@FreeBSD.org using -f From: Alexander Motin Date: Wed, 26 Jul 2017 17:35:04 +0000 (UTC) To: src-committers@freebsd.org, svn-src-all@freebsd.org, svn-src-stable@freebsd.org, svn-src-stable-11@freebsd.org Subject: svn commit: r321566 - stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs X-SVN-Group: stable-11 X-SVN-Commit-Author: mav X-SVN-Commit-Paths: stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs X-SVN-Commit-Revision: 321566 X-SVN-Commit-Repository: base MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit X-BeenThere: svn-src-stable-11@freebsd.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: SVN commit messages for only the 11-stable src tree List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 26 Jul 2017 17:35:05 -0000 Author: mav Date: Wed Jul 26 17:35:03 2017 New Revision: 321566 URL: https://svnweb.freebsd.org/changeset/base/321566 Log: MFC r318930: MFV r318929: 7786 zfs`vdev_online() needs better notification about state changes illumos/illumos-gate@5f368aef86387d6ef4eda84030ae9b402313ee4c https://github.com/illumos/illumos-gate/commit/5f368aef86387d6ef4eda84030ae9b402313ee4c https://www.illumos.org/issues/7786 Currently, vdev_online() will only post sysevent if previous state was "offline". It should also post the event when the state changes from "removed" or "faulted" to "healthy" or "degraded". This will fix the following scenario: - pull disk from slot A - check that hotspare has taken its place (if available) - insert disk into slot B - check that hotspare moved back to "avail" state (if spare was used) The problem here is that we don't get any ESC_ZFS_VDEV_* notification and fail to update the vdev FRU. Reviewed by: Matthew Ahrens mahrens@delphix.com Reviewed by: George Wilson george.wilson@delphix.com Approved by: Albert Lee Author: Yuri Pankov Modified: stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/vdev.c Directory Properties: stable/11/ (props changed) Modified: stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/vdev.c ============================================================================== --- stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/vdev.c Wed Jul 26 17:12:53 2017 (r321565) +++ stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/vdev.c Wed Jul 26 17:35:03 2017 (r321566) @@ -22,7 +22,7 @@ /* * Copyright (c) 2005, 2010, Oracle and/or its affiliates. All rights reserved. * Copyright (c) 2011, 2015 by Delphix. All rights reserved. - * Copyright 2015 Nexenta Systems, Inc. All rights reserved. + * Copyright 2017 Nexenta Systems, Inc. * Copyright 2013 Martin Matuska . All rights reserved. * Copyright (c) 2014 Integros [integros.com] */ @@ -2597,7 +2597,8 @@ int vdev_online(spa_t *spa, uint64_t guid, uint64_t flags, vdev_state_t *newstate) { vdev_t *vd, *tvd, *pvd, *rvd = spa->spa_root_vdev; - boolean_t postevent = B_FALSE; + boolean_t wasoffline; + vdev_state_t oldstate; spa_vdev_state_enter(spa, SCL_NONE); @@ -2607,9 +2608,8 @@ vdev_online(spa_t *spa, uint64_t guid, uint64_t flags, if (!vd->vdev_ops->vdev_op_leaf) return (spa_vdev_state_exit(spa, NULL, ENOTSUP)); - postevent = - (vd->vdev_offline == B_TRUE || vd->vdev_tmpoffline == B_TRUE) ? - B_TRUE : B_FALSE; + wasoffline = (vd->vdev_offline || vd->vdev_tmpoffline); + oldstate = vd->vdev_state; tvd = vd->vdev_top; vd->vdev_offline = B_FALSE; @@ -2647,7 +2647,9 @@ vdev_online(spa_t *spa, uint64_t guid, uint64_t flags, spa_async_request(spa, SPA_ASYNC_CONFIG_UPDATE); } - if (postevent) + if (wasoffline || + (oldstate < VDEV_STATE_DEGRADED && + vd->vdev_state >= VDEV_STATE_DEGRADED)) spa_event_notify(spa, vd, ESC_ZFS_VDEV_ONLINE); return (spa_vdev_state_exit(spa, vd, 0)); From owner-svn-src-stable-11@freebsd.org Wed Jul 26 17:35:53 2017 Return-Path: Delivered-To: svn-src-stable-11@mailman.ysv.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by mailman.ysv.freebsd.org (Postfix) with ESMTP id EE314DAB52F; Wed, 26 Jul 2017 17:35:53 +0000 (UTC) (envelope-from mav@FreeBSD.org) Received: from repo.freebsd.org (repo.freebsd.org [IPv6:2610:1c1:1:6068::e6a:0]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (Client did not present a certificate) by mx1.freebsd.org (Postfix) with ESMTPS id B94B36E0D9; Wed, 26 Jul 2017 17:35:53 +0000 (UTC) (envelope-from mav@FreeBSD.org) Received: from repo.freebsd.org ([127.0.1.37]) by repo.freebsd.org (8.15.2/8.15.2) with ESMTP id v6QHZqC9098814; Wed, 26 Jul 2017 17:35:52 GMT (envelope-from mav@FreeBSD.org) Received: (from mav@localhost) by repo.freebsd.org (8.15.2/8.15.2/Submit) id v6QHZqWx098806; Wed, 26 Jul 2017 17:35:52 GMT (envelope-from mav@FreeBSD.org) Message-Id: <201707261735.v6QHZqWx098806@repo.freebsd.org> X-Authentication-Warning: repo.freebsd.org: mav set sender to mav@FreeBSD.org using -f From: Alexander Motin Date: Wed, 26 Jul 2017 17:35:52 +0000 (UTC) To: src-committers@freebsd.org, svn-src-all@freebsd.org, svn-src-stable@freebsd.org, svn-src-stable-11@freebsd.org Subject: svn commit: r321567 - in stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs: . sys X-SVN-Group: stable-11 X-SVN-Commit-Author: mav X-SVN-Commit-Paths: in stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs: . sys X-SVN-Commit-Revision: 321567 X-SVN-Commit-Repository: base MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit X-BeenThere: svn-src-stable-11@freebsd.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: SVN commit messages for only the 11-stable src tree List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 26 Jul 2017 17:35:54 -0000 Author: mav Date: Wed Jul 26 17:35:52 2017 New Revision: 321567 URL: https://svnweb.freebsd.org/changeset/base/321567 Log: MFC r318932: MFV r318931: 8063 verify that we do not attempt to access inactive txg illumos/illumos-gate@b7b2590dd9f11b12a0b4878db3886068cce176af https://github.com/illumos/illumos-gate/commit/b7b2590dd9f11b12a0b4878db3886068cce176af https://www.illumos.org/issues/8063 A standard practice in ZFS is to keep track of "per-txg" state. Any of the 3 active TXG's (open, quiescing, syncing) can have different values for this state. We should assert that we do not attempt to modify other (inactive) TXG's. Reviewed by: Serapheim Dimitropoulos Reviewed by: Pavel Zakharov Approved by: Robert Mustacchi Author: Matthew Ahrens Modified: stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/dmu_tx.c stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/dsl_pool.c stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/spa.c stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/sys/txg.h stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/sys/zil.h stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/txg.c stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/vdev.c stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/zil.c Directory Properties: stable/11/ (props changed) Modified: stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/dmu_tx.c ============================================================================== --- stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/dmu_tx.c Wed Jul 26 17:35:03 2017 (r321566) +++ stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/dmu_tx.c Wed Jul 26 17:35:52 2017 (r321567) @@ -21,7 +21,7 @@ /* * Copyright (c) 2005, 2010, Oracle and/or its affiliates. All rights reserved. * Copyright 2011 Nexenta Systems, Inc. All rights reserved. - * Copyright (c) 2012, 2016 by Delphix. All rights reserved. + * Copyright (c) 2012, 2017 by Delphix. All rights reserved. * Copyright (c) 2014 Integros [integros.com] */ @@ -72,7 +72,7 @@ dmu_tx_create_assigned(struct dsl_pool *dp, uint64_t t { dmu_tx_t *tx = dmu_tx_create_dd(NULL); - ASSERT3U(txg, <=, dp->dp_tx.tx_open_txg); + txg_verify(dp->dp_spa, txg); tx->tx_pool = dp; tx->tx_txg = txg; tx->tx_anyobj = TRUE; Modified: stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/dsl_pool.c ============================================================================== --- stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/dsl_pool.c Wed Jul 26 17:35:03 2017 (r321566) +++ stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/dsl_pool.c Wed Jul 26 17:35:52 2017 (r321567) @@ -259,13 +259,13 @@ dsl_pool_open_impl(spa_t *spa, uint64_t txg) rrw_init(&dp->dp_config_rwlock, B_TRUE); txg_init(dp, txg); - txg_list_create(&dp->dp_dirty_datasets, + txg_list_create(&dp->dp_dirty_datasets, spa, offsetof(dsl_dataset_t, ds_dirty_link)); - txg_list_create(&dp->dp_dirty_zilogs, + txg_list_create(&dp->dp_dirty_zilogs, spa, offsetof(zilog_t, zl_dirty_link)); - txg_list_create(&dp->dp_dirty_dirs, + txg_list_create(&dp->dp_dirty_dirs, spa, offsetof(dsl_dir_t, dd_dirty_link)); - txg_list_create(&dp->dp_sync_tasks, + txg_list_create(&dp->dp_sync_tasks, spa, offsetof(dsl_sync_task_t, dst_node)); dp->dp_sync_taskq = taskq_create("dp_sync_taskq", Modified: stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/spa.c ============================================================================== --- stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/spa.c Wed Jul 26 17:35:03 2017 (r321566) +++ stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/spa.c Wed Jul 26 17:35:52 2017 (r321567) @@ -21,7 +21,7 @@ /* * Copyright (c) 2005, 2010, Oracle and/or its affiliates. All rights reserved. - * Copyright (c) 2011, 2014 by Delphix. All rights reserved. + * Copyright (c) 2011, 2017 by Delphix. All rights reserved. * Copyright (c) 2015, Nexenta Systems, Inc. All rights reserved. * Copyright (c) 2013 Martin Matuska . All rights reserved. * Copyright (c) 2014 Spectra Logic Corporation, All rights reserved. @@ -1141,7 +1141,7 @@ spa_activate(spa_t *spa, int mode) list_create(&spa->spa_state_dirty_list, sizeof (vdev_t), offsetof(vdev_t, vdev_state_dirty_node)); - txg_list_create(&spa->spa_vdev_txg_list, + txg_list_create(&spa->spa_vdev_txg_list, spa, offsetof(struct vdev, vdev_txg_node)); avl_create(&spa->spa_errlist_scrub, Modified: stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/sys/txg.h ============================================================================== --- stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/sys/txg.h Wed Jul 26 17:35:03 2017 (r321566) +++ stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/sys/txg.h Wed Jul 26 17:35:52 2017 (r321567) @@ -23,7 +23,7 @@ * Use is subject to license terms. */ /* - * Copyright (c) 2012, 2014 by Delphix. All rights reserved. + * Copyright (c) 2012, 2017 by Delphix. All rights reserved. */ #ifndef _SYS_TXG_H @@ -60,6 +60,7 @@ typedef struct txg_node { typedef struct txg_list { kmutex_t tl_lock; size_t tl_offset; + spa_t *tl_spa; txg_node_t *tl_head[TXG_SIZE]; } txg_list_t; @@ -103,13 +104,15 @@ extern boolean_t txg_stalled(struct dsl_pool *dp); /* returns TRUE if someone is waiting for the next txg to sync */ extern boolean_t txg_sync_waiting(struct dsl_pool *dp); +extern void txg_verify(spa_t *spa, uint64_t txg); + /* * Per-txg object lists. */ #define TXG_CLEAN(txg) ((txg) - 1) -extern void txg_list_create(txg_list_t *tl, size_t offset); +extern void txg_list_create(txg_list_t *tl, spa_t *spa, size_t offset); extern void txg_list_destroy(txg_list_t *tl); extern boolean_t txg_list_empty(txg_list_t *tl, uint64_t txg); extern boolean_t txg_all_lists_empty(txg_list_t *tl); Modified: stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/sys/zil.h ============================================================================== --- stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/sys/zil.h Wed Jul 26 17:35:03 2017 (r321566) +++ stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/sys/zil.h Wed Jul 26 17:35:52 2017 (r321567) @@ -20,7 +20,7 @@ */ /* * Copyright (c) 2005, 2010, Oracle and/or its affiliates. All rights reserved. - * Copyright (c) 2012 by Delphix. All rights reserved. + * Copyright (c) 2012, 2017 by Delphix. All rights reserved. * Copyright (c) 2014 Integros [integros.com] */ @@ -94,6 +94,15 @@ typedef struct zil_chain { } zil_chain_t; #define ZIL_MIN_BLKSZ 4096ULL + +/* + * ziltest is by and large an ugly hack, but very useful in + * checking replay without tedious work. + * When running ziltest we want to keep all itx's and so maintain + * a single list in the zl_itxg[] that uses a high txg: ZILTEST_TXG + * We subtract TXG_CONCURRENT_STATES to allow for common code. + */ +#define ZILTEST_TXG (UINT64_MAX - TXG_CONCURRENT_STATES) /* * The words of a log block checksum. Modified: stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/txg.c ============================================================================== --- stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/txg.c Wed Jul 26 17:35:03 2017 (r321566) +++ stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/txg.c Wed Jul 26 17:35:52 2017 (r321567) @@ -21,7 +21,7 @@ /* * Copyright (c) 2005, 2010, Oracle and/or its affiliates. All rights reserved. * Portions Copyright 2011 Martin Matuska - * Copyright (c) 2012, 2014 by Delphix. All rights reserved. + * Copyright (c) 2012, 2017 by Delphix. All rights reserved. */ #include @@ -30,6 +30,7 @@ #include #include #include +#include #include /* @@ -693,16 +694,32 @@ txg_sync_waiting(dsl_pool_t *dp) } /* + * Verify that this txg is active (open, quiescing, syncing). Non-active + * txg's should not be manipulated. + */ +void +txg_verify(spa_t *spa, uint64_t txg) +{ + dsl_pool_t *dp = spa_get_dsl(spa); + if (txg <= TXG_INITIAL || txg == ZILTEST_TXG) + return; + ASSERT3U(txg, <=, dp->dp_tx.tx_open_txg); + ASSERT3U(txg, >=, dp->dp_tx.tx_synced_txg); + ASSERT3U(txg, >=, dp->dp_tx.tx_open_txg - TXG_CONCURRENT_STATES); +} + +/* * Per-txg object lists. */ void -txg_list_create(txg_list_t *tl, size_t offset) +txg_list_create(txg_list_t *tl, spa_t *spa, size_t offset) { int t; mutex_init(&tl->tl_lock, NULL, MUTEX_DEFAULT, NULL); tl->tl_offset = offset; + tl->tl_spa = spa; for (t = 0; t < TXG_SIZE; t++) tl->tl_head[t] = NULL; @@ -722,15 +739,16 @@ txg_list_destroy(txg_list_t *tl) boolean_t txg_list_empty(txg_list_t *tl, uint64_t txg) { + txg_verify(tl->tl_spa, txg); return (tl->tl_head[txg & TXG_MASK] == NULL); } /* * Returns true if all txg lists are empty. * - * Warning: this is inherently racy (an item could be added immediately after this - * function returns). We don't bother with the lock because it wouldn't change the - * semantics. + * Warning: this is inherently racy (an item could be added immediately + * after this function returns). We don't bother with the lock because + * it wouldn't change the semantics. */ boolean_t txg_all_lists_empty(txg_list_t *tl) @@ -754,6 +772,7 @@ txg_list_add(txg_list_t *tl, void *p, uint64_t txg) txg_node_t *tn = (txg_node_t *)((char *)p + tl->tl_offset); boolean_t add; + txg_verify(tl->tl_spa, txg); mutex_enter(&tl->tl_lock); add = (tn->tn_member[t] == 0); if (add) { @@ -778,6 +797,7 @@ txg_list_add_tail(txg_list_t *tl, void *p, uint64_t tx txg_node_t *tn = (txg_node_t *)((char *)p + tl->tl_offset); boolean_t add; + txg_verify(tl->tl_spa, txg); mutex_enter(&tl->tl_lock); add = (tn->tn_member[t] == 0); if (add) { @@ -805,6 +825,7 @@ txg_list_remove(txg_list_t *tl, uint64_t txg) txg_node_t *tn; void *p = NULL; + txg_verify(tl->tl_spa, txg); mutex_enter(&tl->tl_lock); if ((tn = tl->tl_head[t]) != NULL) { p = (char *)tn - tl->tl_offset; @@ -826,6 +847,7 @@ txg_list_remove_this(txg_list_t *tl, void *p, uint64_t int t = txg & TXG_MASK; txg_node_t *tn, **tp; + txg_verify(tl->tl_spa, txg); mutex_enter(&tl->tl_lock); for (tp = &tl->tl_head[t]; (tn = *tp) != NULL; tp = &tn->tn_next[t]) { @@ -849,6 +871,7 @@ txg_list_member(txg_list_t *tl, void *p, uint64_t txg) int t = txg & TXG_MASK; txg_node_t *tn = (txg_node_t *)((char *)p + tl->tl_offset); + txg_verify(tl->tl_spa, txg); return (tn->tn_member[t] != 0); } @@ -861,6 +884,7 @@ txg_list_head(txg_list_t *tl, uint64_t txg) int t = txg & TXG_MASK; txg_node_t *tn = tl->tl_head[t]; + txg_verify(tl->tl_spa, txg); return (tn == NULL ? NULL : (char *)tn - tl->tl_offset); } @@ -870,6 +894,7 @@ txg_list_next(txg_list_t *tl, void *p, uint64_t txg) int t = txg & TXG_MASK; txg_node_t *tn = (txg_node_t *)((char *)p + tl->tl_offset); + txg_verify(tl->tl_spa, txg); tn = tn->tn_next[t]; return (tn == NULL ? NULL : (char *)tn - tl->tl_offset); Modified: stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/vdev.c ============================================================================== --- stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/vdev.c Wed Jul 26 17:35:03 2017 (r321566) +++ stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/vdev.c Wed Jul 26 17:35:52 2017 (r321567) @@ -447,9 +447,9 @@ vdev_alloc_common(spa_t *spa, uint_t id, uint64_t guid vd->vdev_dtl[t] = range_tree_create(NULL, NULL, &vd->vdev_dtl_lock); } - txg_list_create(&vd->vdev_ms_list, + txg_list_create(&vd->vdev_ms_list, spa, offsetof(struct metaslab, ms_txg_node)); - txg_list_create(&vd->vdev_dtl_list, + txg_list_create(&vd->vdev_dtl_list, spa, offsetof(struct vdev, vdev_dtl_node)); vd->vdev_stat.vs_timestamp = gethrtime(); vdev_queue_init(vd); Modified: stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/zil.c ============================================================================== --- stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/zil.c Wed Jul 26 17:35:03 2017 (r321566) +++ stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/zil.c Wed Jul 26 17:35:52 2017 (r321567) @@ -20,7 +20,7 @@ */ /* * Copyright (c) 2005, 2010, Oracle and/or its affiliates. All rights reserved. - * Copyright (c) 2011, 2016 by Delphix. All rights reserved. + * Copyright (c) 2011, 2017 by Delphix. All rights reserved. * Copyright (c) 2014 Integros [integros.com] */ @@ -100,16 +100,6 @@ static kmem_cache_t *zil_lwb_cache; #define LWB_EMPTY(lwb) ((BP_GET_LSIZE(&lwb->lwb_blk) - \ sizeof (zil_chain_t)) == (lwb->lwb_sz - lwb->lwb_nused)) - - -/* - * ziltest is by and large an ugly hack, but very useful in - * checking replay without tedious work. - * When running ziltest we want to keep all itx's and so maintain - * a single list in the zl_itxg[] that uses a high txg: ZILTEST_TXG - * We subtract TXG_CONCURRENT_STATES to allow for common code. - */ -#define ZILTEST_TXG (UINT64_MAX - TXG_CONCURRENT_STATES) static int zil_bp_compare(const void *x1, const void *x2) From owner-svn-src-stable-11@freebsd.org Wed Jul 26 17:36:32 2017 Return-Path: Delivered-To: svn-src-stable-11@mailman.ysv.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by mailman.ysv.freebsd.org (Postfix) with ESMTP id 32246DAB5CD; Wed, 26 Jul 2017 17:36:32 +0000 (UTC) (envelope-from mav@FreeBSD.org) Received: from repo.freebsd.org (repo.freebsd.org [IPv6:2610:1c1:1:6068::e6a:0]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (Client did not present a certificate) by mx1.freebsd.org (Postfix) with ESMTPS id 0C5376E241; Wed, 26 Jul 2017 17:36:31 +0000 (UTC) (envelope-from mav@FreeBSD.org) Received: from repo.freebsd.org ([127.0.1.37]) by repo.freebsd.org (8.15.2/8.15.2) with ESMTP id v6QHaVN7098910; Wed, 26 Jul 2017 17:36:31 GMT (envelope-from mav@FreeBSD.org) Received: (from mav@localhost) by repo.freebsd.org (8.15.2/8.15.2/Submit) id v6QHaViP098908; Wed, 26 Jul 2017 17:36:31 GMT (envelope-from mav@FreeBSD.org) Message-Id: <201707261736.v6QHaViP098908@repo.freebsd.org> X-Authentication-Warning: repo.freebsd.org: mav set sender to mav@FreeBSD.org using -f From: Alexander Motin Date: Wed, 26 Jul 2017 17:36:31 +0000 (UTC) To: src-committers@freebsd.org, svn-src-all@freebsd.org, svn-src-stable@freebsd.org, svn-src-stable-11@freebsd.org Subject: svn commit: r321568 - stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs X-SVN-Group: stable-11 X-SVN-Commit-Author: mav X-SVN-Commit-Paths: stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs X-SVN-Commit-Revision: 321568 X-SVN-Commit-Repository: base MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit X-BeenThere: svn-src-stable-11@freebsd.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: SVN commit messages for only the 11-stable src tree List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 26 Jul 2017 17:36:32 -0000 Author: mav Date: Wed Jul 26 17:36:30 2017 New Revision: 321568 URL: https://svnweb.freebsd.org/changeset/base/321568 Log: MFC r318935: MFV r318934: 8070 Add some ZFS comments illumos/illumos-gate@40713f2b249d289022c715107b3951055a63aef0 https://github.com/illumos/illumos-gate/commit/40713f2b249d289022c715107b3951055a63aef0 https://www.illumos.org/issues/8070 Add some ZFS comments left by various developers at different times Reviewed by: Yuri Pankov Reviewed by: Matthew Ahrens Approved by: Robert Mustacchi Author: Alan Somers Modified: stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/dbuf.c stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/dsl_scan.c Directory Properties: stable/11/ (props changed) Modified: stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/dbuf.c ============================================================================== --- stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/dbuf.c Wed Jul 26 17:35:52 2017 (r321567) +++ stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/dbuf.c Wed Jul 26 17:36:30 2017 (r321568) @@ -1219,6 +1219,11 @@ dbuf_unoverride(dbuf_dirty_record_t *dr) uint64_t txg = dr->dr_txg; ASSERT(MUTEX_HELD(&db->db_mtx)); + /* + * This assert is valid because dmu_sync() expects to be called by + * a zilog's get_data while holding a range lock. This call only + * comes from dbuf_dirty() callers who must also hold a range lock. + */ ASSERT(dr->dt.dl.dr_override_state != DR_IN_DMU_SYNC); ASSERT(db->db_level == 0); Modified: stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/dsl_scan.c ============================================================================== --- stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/dsl_scan.c Wed Jul 26 17:35:52 2017 (r321567) +++ stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/dsl_scan.c Wed Jul 26 17:36:30 2017 (r321568) @@ -807,7 +807,7 @@ dsl_scan_visitbp(blkptr_t *bp, const zbookmark_phys_t return; /* - * If dsl_scan_ddt() has aready visited this block, it will have + * If dsl_scan_ddt() has already visited this block, it will have * already done any translations or scrubbing, so don't call the * callback again. */ @@ -1474,6 +1474,7 @@ dsl_scan_active(dsl_scan_t *scn) return (used != 0); } +/* Called whenever a txg syncs. */ void dsl_scan_sync(dsl_pool_t *dp, dmu_tx_t *tx) { @@ -1892,6 +1893,7 @@ dsl_scan_scrub_cb(dsl_pool_t *dp, return (0); } +/* Called by the ZFS_IOC_POOL_SCAN ioctl to start a scrub or resilver */ int dsl_scan(dsl_pool_t *dp, pool_scan_func_t func) { From owner-svn-src-stable-11@freebsd.org Wed Jul 26 17:36:59 2017 Return-Path: Delivered-To: svn-src-stable-11@mailman.ysv.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by mailman.ysv.freebsd.org (Postfix) with ESMTP id 876F2DAB641; Wed, 26 Jul 2017 17:36:59 +0000 (UTC) (envelope-from markj@FreeBSD.org) Received: from repo.freebsd.org (repo.freebsd.org [IPv6:2610:1c1:1:6068::e6a:0]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (Client did not present a certificate) by mx1.freebsd.org (Postfix) with ESMTPS id 5405B6E377; Wed, 26 Jul 2017 17:36:59 +0000 (UTC) (envelope-from markj@FreeBSD.org) Received: from repo.freebsd.org ([127.0.1.37]) by repo.freebsd.org (8.15.2/8.15.2) with ESMTP id v6QHawHp098993; Wed, 26 Jul 2017 17:36:58 GMT (envelope-from markj@FreeBSD.org) Received: (from markj@localhost) by repo.freebsd.org (8.15.2/8.15.2/Submit) id v6QHawfv098992; Wed, 26 Jul 2017 17:36:58 GMT (envelope-from markj@FreeBSD.org) Message-Id: <201707261736.v6QHawfv098992@repo.freebsd.org> X-Authentication-Warning: repo.freebsd.org: markj set sender to markj@FreeBSD.org using -f From: Mark Johnston Date: Wed, 26 Jul 2017 17:36:58 +0000 (UTC) To: src-committers@freebsd.org, svn-src-all@freebsd.org, svn-src-stable@freebsd.org, svn-src-stable-11@freebsd.org Subject: svn commit: r321569 - stable/11/usr.sbin/crashinfo X-SVN-Group: stable-11 X-SVN-Commit-Author: markj X-SVN-Commit-Paths: stable/11/usr.sbin/crashinfo X-SVN-Commit-Revision: 321569 X-SVN-Commit-Repository: base MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit X-BeenThere: svn-src-stable-11@freebsd.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: SVN commit messages for only the 11-stable src tree List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 26 Jul 2017 17:36:59 -0000 Author: markj Date: Wed Jul 26 17:36:58 2017 New Revision: 321569 URL: https://svnweb.freebsd.org/changeset/base/321569 Log: MFC r321228: Allow matches of truncated version strings. Modified: stable/11/usr.sbin/crashinfo/crashinfo.sh Directory Properties: stable/11/ (props changed) Modified: stable/11/usr.sbin/crashinfo/crashinfo.sh ============================================================================== --- stable/11/usr.sbin/crashinfo/crashinfo.sh Wed Jul 26 17:36:30 2017 (r321568) +++ stable/11/usr.sbin/crashinfo/crashinfo.sh Wed Jul 26 17:36:58 2017 (r321569) @@ -69,10 +69,12 @@ find_kernel() } }' $INFO) - # Look for a matching kernel version. + # Look for a matching kernel version, handling possible truncation + # of the version string recovered from the dump. for k in `sysctl -n kern.bootfile` $(ls -t /boot/*/kernel); do - kvers=$(gdb_command $k 'printf " Version String: %s", version' \ - 2>/dev/null) + kvers=$(gdb_command $k 'printf " Version String: %s", version' | \ + awk "{line=line\$0\"\n\"} END{print substr(line,1,${#ivers})}" \ + 2>/dev/null) if [ "$ivers" = "$kvers" ]; then KERNEL=$k break From owner-svn-src-stable-11@freebsd.org Wed Jul 26 17:38:30 2017 Return-Path: Delivered-To: svn-src-stable-11@mailman.ysv.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by mailman.ysv.freebsd.org (Postfix) with ESMTP id 60C22DAB736; Wed, 26 Jul 2017 17:38:30 +0000 (UTC) (envelope-from mav@FreeBSD.org) Received: from repo.freebsd.org (repo.freebsd.org [IPv6:2610:1c1:1:6068::e6a:0]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (Client did not present a certificate) by mx1.freebsd.org (Postfix) with ESMTPS id 2E9E56E520; Wed, 26 Jul 2017 17:38:30 +0000 (UTC) (envelope-from mav@FreeBSD.org) Received: from repo.freebsd.org ([127.0.1.37]) by repo.freebsd.org (8.15.2/8.15.2) with ESMTP id v6QHcT4m099099; Wed, 26 Jul 2017 17:38:29 GMT (envelope-from mav@FreeBSD.org) Received: (from mav@localhost) by repo.freebsd.org (8.15.2/8.15.2/Submit) id v6QHcT23099098; Wed, 26 Jul 2017 17:38:29 GMT (envelope-from mav@FreeBSD.org) Message-Id: <201707261738.v6QHcT23099098@repo.freebsd.org> X-Authentication-Warning: repo.freebsd.org: mav set sender to mav@FreeBSD.org using -f From: Alexander Motin Date: Wed, 26 Jul 2017 17:38:29 +0000 (UTC) To: src-committers@freebsd.org, svn-src-all@freebsd.org, svn-src-stable@freebsd.org, svn-src-stable-11@freebsd.org Subject: svn commit: r321570 - stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/sys X-SVN-Group: stable-11 X-SVN-Commit-Author: mav X-SVN-Commit-Paths: stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/sys X-SVN-Commit-Revision: 321570 X-SVN-Commit-Repository: base MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit X-BeenThere: svn-src-stable-11@freebsd.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: SVN commit messages for only the 11-stable src tree List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 26 Jul 2017 17:38:30 -0000 Author: mav Date: Wed Jul 26 17:38:29 2017 New Revision: 321570 URL: https://svnweb.freebsd.org/changeset/base/321570 Log: MFC r318945: MFV r318944: 8265 Reserve send stream flag for large dnode feature illumos/illumos-gate@bc83969fdbd1cb0d97ba00218c0a3de5c89fba92 https://github.com/illumos/illumos-gate/commit/bc83969fdbd1cb0d97ba00218c0a3de5c 89fba92 https://www.illumos.org/issues/8265 Reserve bit 23 in the zfs send stream flags for the large dnode feature which has been implemented for Linux. Reviewed by: Matthew Ahrens Approved by: Robert Mustacchi Author: Brian Behlendorf Modified: stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/sys/zfs_ioctl.h Directory Properties: stable/11/ (props changed) Modified: stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/sys/zfs_ioctl.h ============================================================================== --- stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/sys/zfs_ioctl.h Wed Jul 26 17:36:58 2017 (r321569) +++ stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/sys/zfs_ioctl.h Wed Jul 26 17:38:29 2017 (r321570) @@ -93,6 +93,7 @@ typedef enum drr_headertype { #define DMU_BACKUP_FEATURE_RESUMING (1 << 20) /* flag #21 is reserved for a Delphix feature */ #define DMU_BACKUP_FEATURE_COMPRESSED (1 << 22) +/* flag #23 is reserved for the large dnode feature */ /* * Mask of all supported backup features From owner-svn-src-stable-11@freebsd.org Wed Jul 26 17:40:15 2017 Return-Path: Delivered-To: svn-src-stable-11@mailman.ysv.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by mailman.ysv.freebsd.org (Postfix) with ESMTP id 4877EDAB877; Wed, 26 Jul 2017 17:40:15 +0000 (UTC) (envelope-from mav@FreeBSD.org) Received: from repo.freebsd.org (repo.freebsd.org [IPv6:2610:1c1:1:6068::e6a:0]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (Client did not present a certificate) by mx1.freebsd.org (Postfix) with ESMTPS id 160876E7F1; Wed, 26 Jul 2017 17:40:15 +0000 (UTC) (envelope-from mav@FreeBSD.org) Received: from repo.freebsd.org ([127.0.1.37]) by repo.freebsd.org (8.15.2/8.15.2) with ESMTP id v6QHeEGC099294; Wed, 26 Jul 2017 17:40:14 GMT (envelope-from mav@FreeBSD.org) Received: (from mav@localhost) by repo.freebsd.org (8.15.2/8.15.2/Submit) id v6QHeErd099293; Wed, 26 Jul 2017 17:40:14 GMT (envelope-from mav@FreeBSD.org) Message-Id: <201707261740.v6QHeErd099293@repo.freebsd.org> X-Authentication-Warning: repo.freebsd.org: mav set sender to mav@FreeBSD.org using -f From: Alexander Motin Date: Wed, 26 Jul 2017 17:40:14 +0000 (UTC) To: src-committers@freebsd.org, svn-src-all@freebsd.org, svn-src-stable@freebsd.org, svn-src-stable-11@freebsd.org Subject: svn commit: r321572 - stable/11/cddl/contrib/opensolaris/cmd/zpool X-SVN-Group: stable-11 X-SVN-Commit-Author: mav X-SVN-Commit-Paths: stable/11/cddl/contrib/opensolaris/cmd/zpool X-SVN-Commit-Revision: 321572 X-SVN-Commit-Repository: base MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit X-BeenThere: svn-src-stable-11@freebsd.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: SVN commit messages for only the 11-stable src tree List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 26 Jul 2017 17:40:15 -0000 Author: mav Date: Wed Jul 26 17:40:13 2017 New Revision: 321572 URL: https://svnweb.freebsd.org/changeset/base/321572 Log: MFC r319672 (by allanjude): New sentences start on new lines, fix two violations Modified: stable/11/cddl/contrib/opensolaris/cmd/zpool/zpool-features.7 Directory Properties: stable/11/ (props changed) Modified: stable/11/cddl/contrib/opensolaris/cmd/zpool/zpool-features.7 ============================================================================== --- stable/11/cddl/contrib/opensolaris/cmd/zpool/zpool-features.7 Wed Jul 26 17:39:10 2017 (r321571) +++ stable/11/cddl/contrib/opensolaris/cmd/zpool/zpool-features.7 Wed Jul 26 17:40:13 2017 (r321572) @@ -48,13 +48,15 @@ to send file systems between pools. Since most features can be enabled independently of each other the on\-disk format of the pool is specified by the set of all features marked as .Sy active -on the pool. If the pool was created by another software version this set may +on the pool. +If the pool was created by another software version this set may include unsupported features. .Ss Identifying features Every feature has a guid of the form .Sy com.example:feature_name . The reverse DNS name ensures that the feature's guid is unique across all ZFS -implementations. When unsupported features are encountered on a pool they will +implementations. +When unsupported features are encountered on a pool they will be identified by their guids. Refer to the documentation for the ZFS implementation that created the pool for information about those features. @@ -283,7 +285,8 @@ configuration. .El .Pp This features allows ZFS to maintain more information about how free space -is organized within the pool. If this feature is +is organized within the pool. +If this feature is .Sy enabled , ZFS will set this feature to @@ -337,7 +340,8 @@ All bookmarks in the pool can be listed by running .El .Pp Once this feature is enabled ZFS records the transaction group number -in which new features are enabled. This has no user-visible impact, +in which new features are enabled. +This has no user-visible impact, but other features may depend on this feature. .Pp This feature becomes From owner-svn-src-stable-11@freebsd.org Wed Jul 26 17:42:45 2017 Return-Path: Delivered-To: svn-src-stable-11@mailman.ysv.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by mailman.ysv.freebsd.org (Postfix) with ESMTP id 7C6AEDABAB5; Wed, 26 Jul 2017 17:42:45 +0000 (UTC) (envelope-from mav@FreeBSD.org) Received: from repo.freebsd.org (repo.freebsd.org [IPv6:2610:1c1:1:6068::e6a:0]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (Client did not present a certificate) by mx1.freebsd.org (Postfix) with ESMTPS id 5631E6EBCF; Wed, 26 Jul 2017 17:42:45 +0000 (UTC) (envelope-from mav@FreeBSD.org) Received: from repo.freebsd.org ([127.0.1.37]) by repo.freebsd.org (8.15.2/8.15.2) with ESMTP id v6QHgiwH003066; Wed, 26 Jul 2017 17:42:44 GMT (envelope-from mav@FreeBSD.org) Received: (from mav@localhost) by repo.freebsd.org (8.15.2/8.15.2/Submit) id v6QHgiKU003061; Wed, 26 Jul 2017 17:42:44 GMT (envelope-from mav@FreeBSD.org) Message-Id: <201707261742.v6QHgiKU003061@repo.freebsd.org> X-Authentication-Warning: repo.freebsd.org: mav set sender to mav@FreeBSD.org using -f From: Alexander Motin Date: Wed, 26 Jul 2017 17:42:44 +0000 (UTC) To: src-committers@freebsd.org, svn-src-all@freebsd.org, svn-src-stable@freebsd.org, svn-src-stable-11@freebsd.org Subject: svn commit: r321573 - in stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs: . sys X-SVN-Group: stable-11 X-SVN-Commit-Author: mav X-SVN-Commit-Paths: in stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs: . sys X-SVN-Commit-Revision: 321573 X-SVN-Commit-Repository: base MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit X-BeenThere: svn-src-stable-11@freebsd.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: SVN commit messages for only the 11-stable src tree List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 26 Jul 2017 17:42:45 -0000 Author: mav Date: Wed Jul 26 17:42:43 2017 New Revision: 321573 URL: https://svnweb.freebsd.org/changeset/base/321573 Log: MFC r319748: MFV r319738: 8155 simplify dmu_write_policy handling of pre-compressed buffers illumos/illumos-gate@adaec86ad212d9fd756bee322934fa54d1258605 https://github.com/illumos/illumos-gate/commit/adaec86ad212d9fd756bee322934fa54d1258605 https://www.illumos.org/issues/8155 When writing pre-compressed buffers, arc_write() requires that the compression algorithm used to compress the buffer matches the compression algorithm requested by the zio_prop_t, which is set by dmu_write_policy(). This makes dmu_write_policy() and its callers a bit more complicated. We can simplify this by making arc_write() trust the caller to supply the type of pre-compressed buffer that it wants to write, and override the compression setting in the zio_prop_t. Reviewed by: Dan Kimmel Reviewed by: George Wilson Approved by: Robert Mustacchi Author: Matthew Ahrens Modified: stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/arc.c stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/dbuf.c stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/dmu.c stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/dmu_objset.c stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/sys/dmu.h Directory Properties: stable/11/ (props changed) Modified: stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/arc.c ============================================================================== --- stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/arc.c Wed Jul 26 17:40:13 2017 (r321572) +++ stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/arc.c Wed Jul 26 17:42:43 2017 (r321573) @@ -5937,6 +5937,7 @@ arc_write(zio_t *pio, spa_t *spa, uint64_t txg, blkptr arc_buf_hdr_t *hdr = buf->b_hdr; arc_write_callback_t *callback; zio_t *zio; + zio_prop_t localprop = *zp; ASSERT3P(ready, !=, NULL); ASSERT3P(done, !=, NULL); @@ -5947,7 +5948,13 @@ arc_write(zio_t *pio, spa_t *spa, uint64_t txg, blkptr if (l2arc) arc_hdr_set_flags(hdr, ARC_FLAG_L2CACHE); if (ARC_BUF_COMPRESSED(buf)) { - ASSERT3U(zp->zp_compress, !=, ZIO_COMPRESS_OFF); + /* + * We're writing a pre-compressed buffer. Make the + * compression algorithm requested by the zio_prop_t match + * the pre-compressed buffer's compression algorithm. + */ + localprop.zp_compress = HDR_GET_COMPRESS(hdr); + ASSERT3U(HDR_GET_LSIZE(hdr), !=, arc_buf_size(buf)); zio_flags |= ZIO_FLAG_RAW; } @@ -5982,7 +5989,7 @@ arc_write(zio_t *pio, spa_t *spa, uint64_t txg, blkptr ASSERT3P(hdr->b_l1hdr.b_pdata, ==, NULL); zio = zio_write(pio, spa, txg, bp, buf->b_data, - HDR_GET_LSIZE(hdr), arc_buf_size(buf), zp, arc_write_ready, + HDR_GET_LSIZE(hdr), arc_buf_size(buf), &localprop, arc_write_ready, (children_ready != NULL) ? arc_write_children_ready : NULL, arc_write_physdone, arc_write_done, callback, priority, zio_flags, zb); Modified: stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/dbuf.c ============================================================================== --- stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/dbuf.c Wed Jul 26 17:40:13 2017 (r321572) +++ stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/dbuf.c Wed Jul 26 17:42:43 2017 (r321573) @@ -3536,9 +3536,7 @@ dbuf_write(dbuf_dirty_record_t *dr, arc_buf_t *data, d wp_flag = WP_SPILL; wp_flag |= (db->db_state == DB_NOFILL) ? WP_NOFILL : 0; - dmu_write_policy(os, dn, db->db_level, wp_flag, - (data != NULL && arc_get_compression(data) != ZIO_COMPRESS_OFF) ? - arc_get_compression(data) : ZIO_COMPRESS_INHERIT, &zp); + dmu_write_policy(os, dn, db->db_level, wp_flag, &zp); DB_DNODE_EXIT(db); /* Modified: stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/dmu.c ============================================================================== --- stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/dmu.c Wed Jul 26 17:40:13 2017 (r321572) +++ stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/dmu.c Wed Jul 26 17:42:43 2017 (r321573) @@ -20,7 +20,7 @@ */ /* * Copyright (c) 2005, 2010, Oracle and/or its affiliates. All rights reserved. - * Copyright (c) 2011, 2016 by Delphix. All rights reserved. + * Copyright (c) 2011, 2017 by Delphix. All rights reserved. */ /* Copyright (c) 2013 by Saso Kiselkov. All rights reserved. */ /* Copyright (c) 2013, Joyent, Inc. All rights reserved. */ @@ -1794,8 +1794,7 @@ dmu_sync(zio_t *pio, uint64_t txg, dmu_sync_cb_t *done DB_DNODE_ENTER(db); dn = DB_DNODE(db); - dmu_write_policy(os, dn, db->db_level, WP_DMU_SYNC, - ZIO_COMPRESS_INHERIT, &zp); + dmu_write_policy(os, dn, db->db_level, WP_DMU_SYNC, &zp); DB_DNODE_EXIT(db); /* @@ -1967,8 +1966,7 @@ SYSCTL_INT(_vfs_zfs, OID_AUTO, mdcomp_disable, CTLFLAG int zfs_redundant_metadata_most_ditto_level = 2; void -dmu_write_policy(objset_t *os, dnode_t *dn, int level, int wp, - enum zio_compress override_compress, zio_prop_t *zp) +dmu_write_policy(objset_t *os, dnode_t *dn, int level, int wp, zio_prop_t *zp) { dmu_object_type_t type = dn ? dn->dn_type : DMU_OT_OBJSET; boolean_t ismd = (level > 0 || DMU_OT_IS_METADATA(type) || @@ -1980,11 +1978,7 @@ dmu_write_policy(objset_t *os, dnode_t *dn, int level, boolean_t nopwrite = B_FALSE; boolean_t dedup_verify = os->os_dedup_verify; int copies = os->os_copies; - boolean_t lz4_ac = spa_feature_is_active(os->os_spa, - SPA_FEATURE_LZ4_COMPRESS); - IMPLY(override_compress == ZIO_COMPRESS_LZ4, lz4_ac); - /* * We maintain different write policies for each of the following * types of data: @@ -2071,14 +2065,7 @@ dmu_write_policy(objset_t *os, dnode_t *dn, int level, } zp->zp_checksum = checksum; - - /* - * If we're writing a pre-compressed buffer, the compression type we use - * must match the data. If it hasn't been compressed yet, then we should - * use the value dictated by the policies above. - */ - zp->zp_compress = override_compress != ZIO_COMPRESS_INHERIT - ? override_compress : compress; + zp->zp_compress = compress; ASSERT3U(zp->zp_compress, !=, ZIO_COMPRESS_INHERIT); zp->zp_type = (wp & WP_SPILL) ? dn->dn_bonustype : type; Modified: stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/dmu_objset.c ============================================================================== --- stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/dmu_objset.c Wed Jul 26 17:40:13 2017 (r321572) +++ stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/dmu_objset.c Wed Jul 26 17:42:43 2017 (r321573) @@ -1201,7 +1201,7 @@ dmu_objset_sync(objset_t *os, zio_t *pio, dmu_tx_t *tx ZB_ROOT_OBJECT, ZB_ROOT_LEVEL, ZB_ROOT_BLKID); arc_release(os->os_phys_buf, &os->os_phys_buf); - dmu_write_policy(os, NULL, 0, 0, ZIO_COMPRESS_INHERIT, &zp); + dmu_write_policy(os, NULL, 0, 0, &zp); zio = arc_write(pio, os->os_spa, tx->tx_txg, blkptr_copy, os->os_phys_buf, DMU_OS_IS_L2CACHEABLE(os), Modified: stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/sys/dmu.h ============================================================================== --- stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/sys/dmu.h Wed Jul 26 17:40:13 2017 (r321572) +++ stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/sys/dmu.h Wed Jul 26 17:42:43 2017 (r321573) @@ -21,7 +21,7 @@ /* * Copyright (c) 2005, 2010, Oracle and/or its affiliates. All rights reserved. - * Copyright (c) 2011, 2016 by Delphix. All rights reserved. + * Copyright (c) 2011, 2017 by Delphix. All rights reserved. * Copyright 2011 Nexenta Systems, Inc. All rights reserved. * Copyright (c) 2012, Joyent, Inc. All rights reserved. * Copyright 2013 DEY Storage Systems, Inc. @@ -422,7 +422,7 @@ dmu_write_embedded(objset_t *os, uint64_t object, uint #define WP_SPILL 0x4 void dmu_write_policy(objset_t *os, dnode_t *dn, int level, int wp, - enum zio_compress compress_override, struct zio_prop *zp); + struct zio_prop *zp); /* * The bonus data is accessed more or less like a regular buffer. * You must dmu_bonus_hold() to get the buffer, which will give you a From owner-svn-src-stable-11@freebsd.org Wed Jul 26 17:44:23 2017 Return-Path: Delivered-To: svn-src-stable-11@mailman.ysv.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by mailman.ysv.freebsd.org (Postfix) with ESMTP id 6CB47DABB57; Wed, 26 Jul 2017 17:44:23 +0000 (UTC) (envelope-from mav@FreeBSD.org) Received: from repo.freebsd.org (repo.freebsd.org [IPv6:2610:1c1:1:6068::e6a:0]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (Client did not present a certificate) by mx1.freebsd.org (Postfix) with ESMTPS id 499496ED65; Wed, 26 Jul 2017 17:44:23 +0000 (UTC) (envelope-from mav@FreeBSD.org) Received: from repo.freebsd.org ([127.0.1.37]) by repo.freebsd.org (8.15.2/8.15.2) with ESMTP id v6QHiMZi003191; Wed, 26 Jul 2017 17:44:22 GMT (envelope-from mav@FreeBSD.org) Received: (from mav@localhost) by repo.freebsd.org (8.15.2/8.15.2/Submit) id v6QHiMVE003190; Wed, 26 Jul 2017 17:44:22 GMT (envelope-from mav@FreeBSD.org) Message-Id: <201707261744.v6QHiMVE003190@repo.freebsd.org> X-Authentication-Warning: repo.freebsd.org: mav set sender to mav@FreeBSD.org using -f From: Alexander Motin Date: Wed, 26 Jul 2017 17:44:22 +0000 (UTC) To: src-committers@freebsd.org, svn-src-all@freebsd.org, svn-src-stable@freebsd.org, svn-src-stable-11@freebsd.org Subject: svn commit: r321574 - stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs X-SVN-Group: stable-11 X-SVN-Commit-Author: mav X-SVN-Commit-Paths: stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs X-SVN-Commit-Revision: 321574 X-SVN-Commit-Repository: base MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit X-BeenThere: svn-src-stable-11@freebsd.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: SVN commit messages for only the 11-stable src tree List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 26 Jul 2017 17:44:23 -0000 Author: mav Date: Wed Jul 26 17:44:22 2017 New Revision: 321574 URL: https://svnweb.freebsd.org/changeset/base/321574 Log: MFC r319749: MFV r319739: 8005 poor performance of 1MB writes on certain RAID-Z configuration s illumos/illumos-gate@5b062782532a1d5961c4a4b655906e1238c7c908 https://github.com/illumos/illumos-gate/commit/5b062782532a1d5961c4a4b655906e123 8c7c908 https://www.illumos.org/issues/8005 RAID-Z requires that space be allocated in multiples of P+1 sectors, because this is the minimum size block that can have the required amount of parity. Thus blocks on RAIDZ1 must be allocated in a multiple of 2 sectors; on RAIDZ2 multiple of 3; and on RAIDZ3 multiple of 4. A sector is a unit of 2^ashift bytes, typically 512B or 4KB. To satisfy this constraint, the allocation size is rounded up to the proper multiple, resulting in up to 3 "pad sectors" at the end of some blocks. The contents of these pad sectors are not used, so we do not need to read or write these sectors. However, some storage hardware performs much worse (around 1/2 as fast) on mostly-contiguous writes when there are small gaps of non-overwritten data between the writes. Therefore, ZFS creates "optional" zio's when writing RAID-Z blocks that include pad sectors. If writing a pad sector will fill the gap between two (required) writes, we will issue the optional zio, thus doubling performance. The gap-filling performance improvement was introduced in July 2009. Writing the optional zio is done by the io aggregation code in vdev_queue.c. The problem is that it is also subject to the limit on the size of aggregate writes, zfs_vdev_aggregation_limit, which is by default 128KB. For a given block, if the amount of data plus padding written to a leaf device exceeds zfs_vdev_aggregation_limit, the optional zio will not be written, resulting in a ~2x performance degradation. The problem occurs only for certain values of ashift, compressed block size, and RAID-Z configuration (number of parity and data disks). It cannot occur with the default recordsize=128KB. If compression is enabled, all configurations with recordsize=1MB or larger will be impacted to some degree. The problem notably occurs with recordsize=1MB, compression=off, with 10 disks in a RAIDZ2 or RAIDZ3 group (with 512B or 4KB sectors). Therefore Reviewed by: Saso Kiselkov Reviewed by: George Wilson Reviewed by: Pavel Zakharov Approved by: Robert Mustacchi Author: Matthew Ahrens Modified: stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/vdev_queue.c Directory Properties: stable/11/ (props changed) Modified: stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/vdev_queue.c ============================================================================== --- stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/vdev_queue.c Wed Jul 26 17:42:43 2017 (r321573) +++ stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/vdev_queue.c Wed Jul 26 17:44:22 2017 (r321574) @@ -24,7 +24,7 @@ */ /* - * Copyright (c) 2012, 2014 by Delphix. All rights reserved. + * Copyright (c) 2012, 2017 by Delphix. All rights reserved. * Copyright (c) 2014 Integros [integros.com] */ @@ -681,7 +681,7 @@ vdev_queue_aggregate(vdev_queue_t *vq, zio_t *zio) /* * Walk backwards through sufficiently contiguous I/Os - * recording the last non-option I/O. + * recording the last non-optional I/O. */ flags = zio->io_flags & ZIO_FLAG_AGG_INHERIT; t = vdev_queue_type_tree(vq, zio->io_type); @@ -704,10 +704,14 @@ vdev_queue_aggregate(vdev_queue_t *vq, zio_t *zio) /* * Walk forward through sufficiently contiguous I/Os. + * The aggregation limit does not apply to optional i/os, so that + * we can issue contiguous writes even if they are larger than the + * aggregation limit. */ while ((dio = AVL_NEXT(t, last)) != NULL && (dio->io_flags & ZIO_FLAG_AGG_INHERIT) == flags && - IO_SPAN(first, dio) <= zfs_vdev_aggregation_limit && + (IO_SPAN(first, dio) <= zfs_vdev_aggregation_limit || + (dio->io_flags & ZIO_FLAG_OPTIONAL)) && IO_GAP(last, dio) <= maxgap) { last = dio; if (!(last->io_flags & ZIO_FLAG_OPTIONAL)) @@ -739,10 +743,16 @@ vdev_queue_aggregate(vdev_queue_t *vq, zio_t *zio) } if (stretch) { - /* This may be a no-op. */ + /* + * We are going to include an optional io in our aggregated + * span, thus closing the write gap. Only mandatory i/os can + * start aggregated spans, so make sure that the next i/o + * after our span is mandatory. + */ dio = AVL_NEXT(t, last); dio->io_flags &= ~ZIO_FLAG_OPTIONAL; } else { + /* do not include the optional i/o */ while (last != mandatory && last != first) { ASSERT(last->io_flags & ZIO_FLAG_OPTIONAL); last = AVL_PREV(t, last); @@ -754,7 +764,7 @@ vdev_queue_aggregate(vdev_queue_t *vq, zio_t *zio) return (NULL); size = IO_SPAN(first, last); - ASSERT3U(size, <=, zfs_vdev_aggregation_limit); + ASSERT3U(size, <=, SPA_MAXBLOCKSIZE); abuf = zio_buf_alloc_nowait(size); if (abuf == NULL) From owner-svn-src-stable-11@freebsd.org Wed Jul 26 17:45:10 2017 Return-Path: Delivered-To: svn-src-stable-11@mailman.ysv.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by mailman.ysv.freebsd.org (Postfix) with ESMTP id D4623DABBF8; Wed, 26 Jul 2017 17:45:10 +0000 (UTC) (envelope-from mav@FreeBSD.org) Received: from repo.freebsd.org (repo.freebsd.org [IPv6:2610:1c1:1:6068::e6a:0]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (Client did not present a certificate) by mx1.freebsd.org (Postfix) with ESMTPS id A066D6EEB1; Wed, 26 Jul 2017 17:45:10 +0000 (UTC) (envelope-from mav@FreeBSD.org) Received: from repo.freebsd.org ([127.0.1.37]) by repo.freebsd.org (8.15.2/8.15.2) with ESMTP id v6QHj9Br003282; Wed, 26 Jul 2017 17:45:09 GMT (envelope-from mav@FreeBSD.org) Received: (from mav@localhost) by repo.freebsd.org (8.15.2/8.15.2/Submit) id v6QHj9sA003281; Wed, 26 Jul 2017 17:45:09 GMT (envelope-from mav@FreeBSD.org) Message-Id: <201707261745.v6QHj9sA003281@repo.freebsd.org> X-Authentication-Warning: repo.freebsd.org: mav set sender to mav@FreeBSD.org using -f From: Alexander Motin Date: Wed, 26 Jul 2017 17:45:09 +0000 (UTC) To: src-committers@freebsd.org, svn-src-all@freebsd.org, svn-src-stable@freebsd.org, svn-src-stable-11@freebsd.org Subject: svn commit: r321575 - stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs X-SVN-Group: stable-11 X-SVN-Commit-Author: mav X-SVN-Commit-Paths: stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs X-SVN-Commit-Revision: 321575 X-SVN-Commit-Repository: base MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit X-BeenThere: svn-src-stable-11@freebsd.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: SVN commit messages for only the 11-stable src tree List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 26 Jul 2017 17:45:10 -0000 Author: mav Date: Wed Jul 26 17:45:09 2017 New Revision: 321575 URL: https://svnweb.freebsd.org/changeset/base/321575 Log: MFC r319750: MFV r319741: 8156 dbuf_evict_notify() does not need dbuf_evict_lock illumos/illumos-gate@dbfd9f930004c390a2ce2cf850c71b4f880eef9c https://github.com/illumos/illumos-gate/commit/dbfd9f930004c390a2ce2cf850c71b4f880eef9c https://www.illumos.org/issues/8156 dbuf_evict_notify() holds the dbuf_evict_lock while checking if it should do the eviction itself (because the evict thread is not able to keep up). This can result in massive lock contention. It isn't necessary to hold the lock, because if we make the wrong choice occasionally, nothing bad will happen. Reviewed by: Dan Kimmel Reviewed by: Paul Dagnelie Approved by: Robert Mustacchi Author: Matthew Ahrens Modified: stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/dbuf.c Directory Properties: stable/11/ (props changed) Modified: stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/dbuf.c ============================================================================== --- stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/dbuf.c Wed Jul 26 17:44:22 2017 (r321574) +++ stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/dbuf.c Wed Jul 26 17:45:09 2017 (r321575) @@ -560,19 +560,15 @@ dbuf_evict_notify(void) if (tsd_get(zfs_dbuf_evict_key) != NULL) return; + /* + * We check if we should evict without holding the dbuf_evict_lock, + * because it's OK to occasionally make the wrong decision here, + * and grabbing the lock results in massive lock contention. + */ if (refcount_count(&dbuf_cache_size) > dbuf_cache_max_bytes) { - boolean_t evict_now = B_FALSE; - - mutex_enter(&dbuf_evict_lock); - if (refcount_count(&dbuf_cache_size) > dbuf_cache_max_bytes) { - evict_now = dbuf_cache_above_hiwater(); - cv_signal(&dbuf_evict_cv); - } - mutex_exit(&dbuf_evict_lock); - - if (evict_now) { + if (dbuf_cache_above_hiwater()) dbuf_evict_one(); - } + cv_signal(&dbuf_evict_cv); } } From owner-svn-src-stable-11@freebsd.org Wed Jul 26 17:46:07 2017 Return-Path: Delivered-To: svn-src-stable-11@mailman.ysv.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by mailman.ysv.freebsd.org (Postfix) with ESMTP id 6F8D3DABC70; Wed, 26 Jul 2017 17:46:07 +0000 (UTC) (envelope-from mav@FreeBSD.org) Received: from repo.freebsd.org (repo.freebsd.org [IPv6:2610:1c1:1:6068::e6a:0]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (Client did not present a certificate) by mx1.freebsd.org (Postfix) with ESMTPS id 467F36EFF3; Wed, 26 Jul 2017 17:46:07 +0000 (UTC) (envelope-from mav@FreeBSD.org) Received: from repo.freebsd.org ([127.0.1.37]) by repo.freebsd.org (8.15.2/8.15.2) with ESMTP id v6QHk61h003373; Wed, 26 Jul 2017 17:46:06 GMT (envelope-from mav@FreeBSD.org) Received: (from mav@localhost) by repo.freebsd.org (8.15.2/8.15.2/Submit) id v6QHk6vQ003372; Wed, 26 Jul 2017 17:46:06 GMT (envelope-from mav@FreeBSD.org) Message-Id: <201707261746.v6QHk6vQ003372@repo.freebsd.org> X-Authentication-Warning: repo.freebsd.org: mav set sender to mav@FreeBSD.org using -f From: Alexander Motin Date: Wed, 26 Jul 2017 17:46:06 +0000 (UTC) To: src-committers@freebsd.org, svn-src-all@freebsd.org, svn-src-stable@freebsd.org, svn-src-stable-11@freebsd.org Subject: svn commit: r321576 - stable/11/cddl/contrib/opensolaris/lib/libzfs/common X-SVN-Group: stable-11 X-SVN-Commit-Author: mav X-SVN-Commit-Paths: stable/11/cddl/contrib/opensolaris/lib/libzfs/common X-SVN-Commit-Revision: 321576 X-SVN-Commit-Repository: base MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit X-BeenThere: svn-src-stable-11@freebsd.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: SVN commit messages for only the 11-stable src tree List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 26 Jul 2017 17:46:07 -0000 Author: mav Date: Wed Jul 26 17:46:06 2017 New Revision: 321576 URL: https://svnweb.freebsd.org/changeset/base/321576 Log: MFC r319751: MFV r319740: 8168 NULL pointer dereference in zfs_create() illumos/illumos-gate@690031d326342fa4ea28b5e80f1ad6a16281519d https://github.com/illumos/illumos-gate/commit/690031d326342fa4ea28b5e80f1ad6a16281519d https://www.illumos.org/issues/8168 If we manage to export the pool on which we are creating a dataset (filesystem or zvol) between entering libzfs`zfs_create() and libzfs`zpool_open() call (for which we never check the return value) we end up dereferencing a NULL pointer in libzfs`zpool_close(). This was discovered on ZFS on Linux. The same issue can be reproduced on Illumos running in parallel: while :; do zpool import -d /tmp testpool ; zpool export testpool ; done while :; do zfs create testpool/fs; zfs destroy testpool/fs ; done Eventually this will result in several core dumps like this one: [root@52-54-00-d3-7a-01 /cores]# mdb core.zfs.4244 Loading modules: [ libumem.so.1 libc.so.1 libtopo.so.1 libavl.so.1 libnvpair.so.1 ld.so.1 ] > ::stack libzfs.so.1`zpool_close+0x17(0, 0, 0, 8047450) libzfs.so.1`zfs_create+0x1bb(8090548, 8047e6f, 1, 808cba8) zfs_do_create+0x545(2, 8047d74, 80778a0, 801, 0, 3) main+0x22c(8047d2c, fef5c6e8, 8047d64, 8055a17, 3, 8047d70) _start+0x83(3, 8047e64, 8047e68, 8047e6f, 0, 8047e7b) > Fix and reproducer (systemtap): https://github.com/zfsonlinux/zfs/pull/6096 Reviewed by: Matt Ahrens Approved by: Robert Mustacchi Author: loli10K Modified: stable/11/cddl/contrib/opensolaris/lib/libzfs/common/libzfs_dataset.c Directory Properties: stable/11/ (props changed) Modified: stable/11/cddl/contrib/opensolaris/lib/libzfs/common/libzfs_dataset.c ============================================================================== --- stable/11/cddl/contrib/opensolaris/lib/libzfs/common/libzfs_dataset.c Wed Jul 26 17:45:09 2017 (r321575) +++ stable/11/cddl/contrib/opensolaris/lib/libzfs/common/libzfs_dataset.c Wed Jul 26 17:46:06 2017 (r321576) @@ -3316,6 +3316,7 @@ zfs_create(libzfs_handle_t *hdl, const char *path, zfs char errbuf[1024]; uint64_t zoned; enum lzc_dataset_type ost; + zpool_handle_t *zpool_handle; (void) snprintf(errbuf, sizeof (errbuf), dgettext(TEXT_DOMAIN, "cannot create '%s'"), path); @@ -3355,7 +3356,8 @@ zfs_create(libzfs_handle_t *hdl, const char *path, zfs if (p != NULL) *p = '\0'; - zpool_handle_t *zpool_handle = zpool_open(hdl, pool_path); + if ((zpool_handle = zpool_open(hdl, pool_path)) == NULL) + return (-1); if (props && (props = zfs_valid_proplist(hdl, type, props, zoned, NULL, zpool_handle, errbuf)) == 0) { From owner-svn-src-stable-11@freebsd.org Wed Jul 26 17:47:33 2017 Return-Path: Delivered-To: svn-src-stable-11@mailman.ysv.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by mailman.ysv.freebsd.org (Postfix) with ESMTP id B7163DABD40; Wed, 26 Jul 2017 17:47:33 +0000 (UTC) (envelope-from mav@FreeBSD.org) Received: from repo.freebsd.org (repo.freebsd.org [IPv6:2610:1c1:1:6068::e6a:0]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (Client did not present a certificate) by mx1.freebsd.org (Postfix) with ESMTPS id 90F0E6F21C; Wed, 26 Jul 2017 17:47:33 +0000 (UTC) (envelope-from mav@FreeBSD.org) Received: from repo.freebsd.org ([127.0.1.37]) by repo.freebsd.org (8.15.2/8.15.2) with ESMTP id v6QHlW8k003476; Wed, 26 Jul 2017 17:47:32 GMT (envelope-from mav@FreeBSD.org) Received: (from mav@localhost) by repo.freebsd.org (8.15.2/8.15.2/Submit) id v6QHlWXg003472; Wed, 26 Jul 2017 17:47:32 GMT (envelope-from mav@FreeBSD.org) Message-Id: <201707261747.v6QHlWXg003472@repo.freebsd.org> X-Authentication-Warning: repo.freebsd.org: mav set sender to mav@FreeBSD.org using -f From: Alexander Motin Date: Wed, 26 Jul 2017 17:47:32 +0000 (UTC) To: src-committers@freebsd.org, svn-src-all@freebsd.org, svn-src-stable@freebsd.org, svn-src-stable-11@freebsd.org Subject: svn commit: r321577 - in stable/11: cddl/contrib/opensolaris/lib/libzfs/common cddl/contrib/opensolaris/lib/libzfs_core/common sys/cddl/contrib/opensolaris/uts/common/fs/zfs X-SVN-Group: stable-11 X-SVN-Commit-Author: mav X-SVN-Commit-Paths: in stable/11: cddl/contrib/opensolaris/lib/libzfs/common cddl/contrib/opensolaris/lib/libzfs_core/common sys/cddl/contrib/opensolaris/uts/common/fs/zfs X-SVN-Commit-Revision: 321577 X-SVN-Commit-Repository: base MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit X-BeenThere: svn-src-stable-11@freebsd.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: SVN commit messages for only the 11-stable src tree List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 26 Jul 2017 17:47:33 -0000 Author: mav Date: Wed Jul 26 17:47:32 2017 New Revision: 321577 URL: https://svnweb.freebsd.org/changeset/base/321577 Log: MFC r319947: MFV r319945,r319946: 8264 want support for promoting datasets in libzfs_core illumos/illumos-gate@a4b8c9aa65a0a735aba318024a424a90d7b06c37 https://github.com/illumos/illumos-gate/commit/a4b8c9aa65a0a735aba318024a424a90d7b06c37 https://www.illumos.org/issues/8264 Oddly there is a lzc_clone function, but no lzc_promote function. Reviewed by: Andriy Gapon Reviewed by: Matthew Ahrens Reviewed by: Dan McDonald Approved by: Dan McDonald Author: Andrew Stormont Modified: stable/11/cddl/contrib/opensolaris/lib/libzfs/common/libzfs_dataset.c stable/11/cddl/contrib/opensolaris/lib/libzfs_core/common/libzfs_core.c stable/11/cddl/contrib/opensolaris/lib/libzfs_core/common/libzfs_core.h stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/zfs_ioctl.c Directory Properties: stable/11/ (props changed) Modified: stable/11/cddl/contrib/opensolaris/lib/libzfs/common/libzfs_dataset.c ============================================================================== --- stable/11/cddl/contrib/opensolaris/lib/libzfs/common/libzfs_dataset.c Wed Jul 26 17:46:06 2017 (r321576) +++ stable/11/cddl/contrib/opensolaris/lib/libzfs/common/libzfs_dataset.c Wed Jul 26 17:47:32 2017 (r321577) @@ -30,6 +30,7 @@ * Copyright (c) 2014 Integros [integros.com] * Copyright 2016 Igor Kozhukhov * Copyright 2016 Nexenta Systems, Inc. + * Copyright 2017 RackTop Systems. */ #include @@ -3675,8 +3676,7 @@ int zfs_promote(zfs_handle_t *zhp) { libzfs_handle_t *hdl = zhp->zfs_hdl; - zfs_cmd_t zc = { 0 }; - char parent[MAXPATHLEN]; + char snapname[ZFS_MAX_DATASET_NAME_LEN]; int ret; char errbuf[1024]; @@ -3689,31 +3689,25 @@ zfs_promote(zfs_handle_t *zhp) return (zfs_error(hdl, EZFS_BADTYPE, errbuf)); } - (void) strlcpy(parent, zhp->zfs_dmustats.dds_origin, sizeof (parent)); - if (parent[0] == '\0') { + if (zhp->zfs_dmustats.dds_origin[0] == '\0') { zfs_error_aux(hdl, dgettext(TEXT_DOMAIN, "not a cloned filesystem")); return (zfs_error(hdl, EZFS_BADTYPE, errbuf)); } - (void) strlcpy(zc.zc_value, zhp->zfs_dmustats.dds_origin, - sizeof (zc.zc_value)); - (void) strlcpy(zc.zc_name, zhp->zfs_name, sizeof (zc.zc_name)); - ret = zfs_ioctl(hdl, ZFS_IOC_PROMOTE, &zc); + ret = lzc_promote(zhp->zfs_name, snapname, sizeof (snapname)); if (ret != 0) { - int save_errno = errno; - - switch (save_errno) { + switch (ret) { case EEXIST: /* There is a conflicting snapshot name. */ zfs_error_aux(hdl, dgettext(TEXT_DOMAIN, "conflicting snapshot '%s' from parent '%s'"), - zc.zc_string, parent); + snapname, zhp->zfs_dmustats.dds_origin); return (zfs_error(hdl, EZFS_EXISTS, errbuf)); default: - return (zfs_standard_error(hdl, save_errno, errbuf)); + return (zfs_standard_error(hdl, ret, errbuf)); } } return (ret); Modified: stable/11/cddl/contrib/opensolaris/lib/libzfs_core/common/libzfs_core.c ============================================================================== --- stable/11/cddl/contrib/opensolaris/lib/libzfs_core/common/libzfs_core.c Wed Jul 26 17:46:06 2017 (r321576) +++ stable/11/cddl/contrib/opensolaris/lib/libzfs_core/common/libzfs_core.c Wed Jul 26 17:47:32 2017 (r321577) @@ -23,6 +23,7 @@ * Copyright (c) 2012, 2014 by Delphix. All rights reserved. * Copyright (c) 2013 Steven Hartland. All rights reserved. * Copyright (c) 2014 Integros [integros.com] + * Copyright 2017 RackTop Systems. */ /* @@ -244,6 +245,28 @@ lzc_clone(const char *fsname, const char *origin, return (error); } +int +lzc_promote(const char *fsname, char *snapnamebuf, int snapnamelen) +{ + /* + * The promote ioctl is still legacy, so we need to construct our + * own zfs_cmd_t rather than using lzc_ioctl(). + */ + zfs_cmd_t zc = { 0 }; + + ASSERT3S(g_refcount, >, 0); + VERIFY3S(g_fd, !=, -1); + + (void) strlcpy(zc.zc_name, fsname, sizeof (zc.zc_name)); + if (ioctl(g_fd, ZFS_IOC_PROMOTE, &zc) != 0) { + int error = errno; + if (error == EEXIST && snapnamebuf != NULL) + (void) strlcpy(snapnamebuf, zc.zc_string, snapnamelen); + return (error); + } + return (0); +} + /* * Creates snapshots. * @@ -371,7 +394,7 @@ lzc_exists(const char *dataset) { /* * The objset_stats ioctl is still legacy, so we need to construct our - * own zfs_cmd_t rather than using zfsc_ioctl(). + * own zfs_cmd_t rather than using lzc_ioctl(). */ zfs_cmd_t zc = { 0 }; Modified: stable/11/cddl/contrib/opensolaris/lib/libzfs_core/common/libzfs_core.h ============================================================================== --- stable/11/cddl/contrib/opensolaris/lib/libzfs_core/common/libzfs_core.h Wed Jul 26 17:46:06 2017 (r321576) +++ stable/11/cddl/contrib/opensolaris/lib/libzfs_core/common/libzfs_core.h Wed Jul 26 17:47:32 2017 (r321577) @@ -22,6 +22,7 @@ /* * Copyright (c) 2012, 2014 by Delphix. All rights reserved. * Copyright (c) 2013 by Martin Matuska . All rights reserved. + * Copyright 2017 RackTop Systems. */ #ifndef _LIBZFS_CORE_H @@ -49,6 +50,7 @@ enum lzc_dataset_type { int lzc_snapshot(nvlist_t *, nvlist_t *, nvlist_t **); int lzc_create(const char *, enum lzc_dataset_type, nvlist_t *); int lzc_clone(const char *, const char *, nvlist_t *); +int lzc_promote(const char *, char *, int); int lzc_destroy_snaps(nvlist_t *, boolean_t, nvlist_t **); int lzc_bookmark(nvlist_t *, nvlist_t **); int lzc_get_bookmarks(const char *, nvlist_t *, nvlist_t **); Modified: stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/zfs_ioctl.c ============================================================================== --- stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/zfs_ioctl.c Wed Jul 26 17:46:06 2017 (r321576) +++ stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/zfs_ioctl.c Wed Jul 26 17:47:32 2017 (r321577) @@ -31,6 +31,7 @@ * Copyright (c) 2013 by Saso Kiselkov. All rights reserved. * Copyright (c) 2013 Steven Hartland. All rights reserved. * Copyright (c) 2014 Integros [integros.com] + * Copyright 2017 RackTop Systems. */ /* @@ -4893,7 +4894,6 @@ zfs_ioc_pool_reopen(zfs_cmd_t *zc) /* * inputs: * zc_name name of filesystem - * zc_value name of origin snapshot * * outputs: * zc_string name of conflicting snapshot, if there is one @@ -4901,16 +4901,49 @@ zfs_ioc_pool_reopen(zfs_cmd_t *zc) static int zfs_ioc_promote(zfs_cmd_t *zc) { + dsl_pool_t *dp; + dsl_dataset_t *ds, *ods; + char origin[ZFS_MAX_DATASET_NAME_LEN]; char *cp; + int error; + error = dsl_pool_hold(zc->zc_name, FTAG, &dp); + if (error != 0) + return (error); + + error = dsl_dataset_hold(dp, zc->zc_name, FTAG, &ds); + if (error != 0) { + dsl_pool_rele(dp, FTAG); + return (error); + } + + if (!dsl_dir_is_clone(ds->ds_dir)) { + dsl_dataset_rele(ds, FTAG); + dsl_pool_rele(dp, FTAG); + return (SET_ERROR(EINVAL)); + } + + error = dsl_dataset_hold_obj(dp, + dsl_dir_phys(ds->ds_dir)->dd_origin_obj, FTAG, &ods); + if (error != 0) { + dsl_dataset_rele(ds, FTAG); + dsl_pool_rele(dp, FTAG); + return (error); + } + + dsl_dataset_name(ods, origin); + dsl_dataset_rele(ods, FTAG); + dsl_dataset_rele(ds, FTAG); + dsl_pool_rele(dp, FTAG); + /* * We don't need to unmount *all* the origin fs's snapshots, but * it's easier. */ - cp = strchr(zc->zc_value, '@'); + cp = strchr(origin, '@'); if (cp) *cp = '\0'; - (void) dmu_objset_find(zc->zc_value, + (void) dmu_objset_find(origin, zfs_unmount_snap_cb, NULL, DS_FIND_SNAPSHOTS); return (dsl_dataset_promote(zc->zc_name, zc->zc_string)); } From owner-svn-src-stable-11@freebsd.org Wed Jul 26 17:48:38 2017 Return-Path: Delivered-To: svn-src-stable-11@mailman.ysv.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by mailman.ysv.freebsd.org (Postfix) with ESMTP id D9896DABDD2; Wed, 26 Jul 2017 17:48:38 +0000 (UTC) (envelope-from mav@FreeBSD.org) Received: from repo.freebsd.org (repo.freebsd.org [IPv6:2610:1c1:1:6068::e6a:0]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (Client did not present a certificate) by mx1.freebsd.org (Postfix) with ESMTPS id B36976F35F; Wed, 26 Jul 2017 17:48:38 +0000 (UTC) (envelope-from mav@FreeBSD.org) Received: from repo.freebsd.org ([127.0.1.37]) by repo.freebsd.org (8.15.2/8.15.2) with ESMTP id v6QHmbXj003575; Wed, 26 Jul 2017 17:48:37 GMT (envelope-from mav@FreeBSD.org) Received: (from mav@localhost) by repo.freebsd.org (8.15.2/8.15.2/Submit) id v6QHmbk0003569; Wed, 26 Jul 2017 17:48:37 GMT (envelope-from mav@FreeBSD.org) Message-Id: <201707261748.v6QHmbk0003569@repo.freebsd.org> X-Authentication-Warning: repo.freebsd.org: mav set sender to mav@FreeBSD.org using -f From: Alexander Motin Date: Wed, 26 Jul 2017 17:48:37 +0000 (UTC) To: src-committers@freebsd.org, svn-src-all@freebsd.org, svn-src-stable@freebsd.org, svn-src-stable-11@freebsd.org Subject: svn commit: r321578 - in stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs: . sys X-SVN-Group: stable-11 X-SVN-Commit-Author: mav X-SVN-Commit-Paths: in stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs: . sys X-SVN-Commit-Revision: 321578 X-SVN-Commit-Repository: base MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit X-BeenThere: svn-src-stable-11@freebsd.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: SVN commit messages for only the 11-stable src tree List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 26 Jul 2017 17:48:39 -0000 Author: mav Date: Wed Jul 26 17:48:37 2017 New Revision: 321578 URL: https://svnweb.freebsd.org/changeset/base/321578 Log: MFC r319949: MFV r319948: 5428 provide fts(), reallocarray(), and strtonum() illumos/illumos-gate@4585130b259133a26efae68275dbe56b08366deb https://github.com/illumos/illumos-gate/commit/4585130b259133a26efae68275dbe56b08366deb https://www.illumos.org/issues/5428 Most of the upstream change is not applicable to FreeBSD. Only the renaming of strtonum to zfs_strtonum is relevant to us. And we already had it partially done. Reviewed by: Robert Mustacchi Approved by: Joshua M. Clulow Author: Yuri Pankov Modified: stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/dsl_deadlist.c stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/dsl_scan.c stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/dsl_userhold.c stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/spa_errlog.c stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/sys/spa.h stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/zfs_vfsops.c Directory Properties: stable/11/ (props changed) Modified: stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/dsl_deadlist.c ============================================================================== --- stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/dsl_deadlist.c Wed Jul 26 17:47:32 2017 (r321577) +++ stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/dsl_deadlist.c Wed Jul 26 17:48:37 2017 (r321578) @@ -85,7 +85,7 @@ dsl_deadlist_load_tree(dsl_deadlist_t *dl) zap_cursor_retrieve(&zc, &za) == 0; zap_cursor_advance(&zc)) { dsl_deadlist_entry_t *dle = kmem_alloc(sizeof (*dle), KM_SLEEP); - dle->dle_mintxg = strtonum(za.za_name, NULL); + dle->dle_mintxg = zfs_strtonum(za.za_name, NULL); VERIFY3U(0, ==, bpobj_open(&dle->dle_bpobj, dl->dl_os, za.za_first_integer)); avl_add(&dl->dl_tree, dle); @@ -490,7 +490,7 @@ dsl_deadlist_merge(dsl_deadlist_t *dl, uint64_t obj, d for (zap_cursor_init(&zc, dl->dl_os, obj); zap_cursor_retrieve(&zc, &za) == 0; zap_cursor_advance(&zc)) { - uint64_t mintxg = strtonum(za.za_name, NULL); + uint64_t mintxg = zfs_strtonum(za.za_name, NULL); dsl_deadlist_insert_bpobj(dl, za.za_first_integer, mintxg, tx); VERIFY3U(0, ==, zap_remove_int(dl->dl_os, obj, mintxg, tx)); } Modified: stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/dsl_scan.c ============================================================================== --- stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/dsl_scan.c Wed Jul 26 17:47:32 2017 (r321577) +++ stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/dsl_scan.c Wed Jul 26 17:48:37 2017 (r321578) @@ -1389,7 +1389,7 @@ dsl_scan_visit(dsl_scan_t *scn, dmu_tx_t *tx) dsl_dataset_t *ds; uint64_t dsobj; - dsobj = strtonum(za.za_name, NULL); + dsobj = zfs_strtonum(za.za_name, NULL); VERIFY3U(0, ==, zap_remove_int(dp->dp_meta_objset, scn->scn_phys.scn_queue_obj, dsobj, tx)); Modified: stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/dsl_userhold.c ============================================================================== --- stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/dsl_userhold.c Wed Jul 26 17:47:32 2017 (r321577) +++ stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/dsl_userhold.c Wed Jul 26 17:48:37 2017 (r321578) @@ -340,7 +340,7 @@ static int dsl_dataset_hold_obj_string(dsl_pool_t *dp, const char *dsobj, void *tag, dsl_dataset_t **dsp) { - return (dsl_dataset_hold_obj(dp, strtonum(dsobj, NULL), tag, dsp)); + return (dsl_dataset_hold_obj(dp, zfs_strtonum(dsobj, NULL), tag, dsp)); } static int Modified: stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/spa_errlog.c ============================================================================== --- stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/spa_errlog.c Wed Jul 26 17:47:32 2017 (r321577) +++ stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/spa_errlog.c Wed Jul 26 17:48:37 2017 (r321578) @@ -73,13 +73,13 @@ bookmark_to_name(zbookmark_phys_t *zb, char *buf, size static void name_to_bookmark(char *buf, zbookmark_phys_t *zb) { - zb->zb_objset = strtonum(buf, &buf); + zb->zb_objset = zfs_strtonum(buf, &buf); ASSERT(*buf == ':'); - zb->zb_object = strtonum(buf + 1, &buf); + zb->zb_object = zfs_strtonum(buf + 1, &buf); ASSERT(*buf == ':'); - zb->zb_level = (int)strtonum(buf + 1, &buf); + zb->zb_level = (int)zfs_strtonum(buf + 1, &buf); ASSERT(*buf == ':'); - zb->zb_blkid = strtonum(buf + 1, &buf); + zb->zb_blkid = zfs_strtonum(buf + 1, &buf); ASSERT(*buf == '\0'); } #endif Modified: stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/sys/spa.h ============================================================================== --- stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/sys/spa.h Wed Jul 26 17:47:32 2017 (r321577) +++ stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/sys/spa.h Wed Jul 26 17:48:37 2017 (r321578) @@ -848,7 +848,6 @@ extern void zfs_blkptr_verify(spa_t *spa, const blkptr extern int spa_mode(spa_t *spa); extern uint64_t zfs_strtonum(const char *str, char **nptr); -#define strtonum(str, nptr) zfs_strtonum((str), (nptr)) extern char *spa_his_ievent_table[]; Modified: stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/zfs_vfsops.c ============================================================================== --- stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/zfs_vfsops.c Wed Jul 26 17:47:32 2017 (r321577) +++ stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/zfs_vfsops.c Wed Jul 26 17:48:37 2017 (r321578) @@ -630,7 +630,7 @@ fuidstr_to_sid(zfsvfs_t *zfsvfs, const char *fuidstr, uint64_t fuid; const char *domain; - fuid = strtonum(fuidstr, NULL); + fuid = zfs_strtonum(fuidstr, NULL); domain = zfs_fuid_find_by_idx(zfsvfs, FUID_INDEX(fuid)); if (domain) From owner-svn-src-stable-11@freebsd.org Wed Jul 26 19:01:16 2017 Return-Path: Delivered-To: svn-src-stable-11@mailman.ysv.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by mailman.ysv.freebsd.org (Postfix) with ESMTP id 8B17FDADE1E; Wed, 26 Jul 2017 19:01:16 +0000 (UTC) (envelope-from mav@FreeBSD.org) Received: from repo.freebsd.org (repo.freebsd.org [IPv6:2610:1c1:1:6068::e6a:0]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (Client did not present a certificate) by mx1.freebsd.org (Postfix) with ESMTPS id 6522072E4D; Wed, 26 Jul 2017 19:01:16 +0000 (UTC) (envelope-from mav@FreeBSD.org) Received: from repo.freebsd.org ([127.0.1.37]) by repo.freebsd.org (8.15.2/8.15.2) with ESMTP id v6QJ1Fn4035599; Wed, 26 Jul 2017 19:01:15 GMT (envelope-from mav@FreeBSD.org) Received: (from mav@localhost) by repo.freebsd.org (8.15.2/8.15.2/Submit) id v6QJ1FcN035597; Wed, 26 Jul 2017 19:01:15 GMT (envelope-from mav@FreeBSD.org) Message-Id: <201707261901.v6QJ1FcN035597@repo.freebsd.org> X-Authentication-Warning: repo.freebsd.org: mav set sender to mav@FreeBSD.org using -f From: Alexander Motin Date: Wed, 26 Jul 2017 19:01:15 +0000 (UTC) To: src-committers@freebsd.org, svn-src-all@freebsd.org, svn-src-stable@freebsd.org, svn-src-stable-11@freebsd.org Subject: svn commit: r321579 - stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs X-SVN-Group: stable-11 X-SVN-Commit-Author: mav X-SVN-Commit-Paths: stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs X-SVN-Commit-Revision: 321579 X-SVN-Commit-Repository: base MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit X-BeenThere: svn-src-stable-11@freebsd.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: SVN commit messages for only the 11-stable src tree List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 26 Jul 2017 19:01:16 -0000 Author: mav Date: Wed Jul 26 19:01:15 2017 New Revision: 321579 URL: https://svnweb.freebsd.org/changeset/base/321579 Log: MFC r319953: MFV r319951: 8311 ZFS_READONLY is a little too strict illumos/illumos-gate@2889ec41c05e9ffe1890b529b3111354da325aeb https://github.com/illumos/illumos-gate/commit/2889ec41c05e9ffe1890b529b3111354d a325aeb https://www.illumos.org/issues/8311 Description: There was a misunderstanding about the enforcement details of the "Read-only" flag introduced for SMB/CIFS compatibility, way back in 2007 in the Sun PSARC 2007/315 case. The original authors thought enforcement of the READONLY flag should work similarly as the IMMUTABLE flag. Unfortunately, that enforcement is incompatible with the expectations of Windows applications using this feature through the SMB service. Applications assume (and the MS File System Algorithm s MS-FSA confirms they should) that an SMB client can: (a) Open an SMB handle on a file with read/write access, (b) Set the DOS attributes to include the READONLY flag, (c) continue to have write access via that handle. This access model is essentially the same as a Unix/POSIX application that creates a file (with read/write access), uses fchmod() to change the file mode to something not granting write access (i.e. 0444), and then continues to writ e that file using the open handle it got before the mode change. Currently, the SMB server works-around this problem in a way that will become difficult to maintain as we implement support for SMB3 persistent handles, so SMB depends on this fix. I've written a test program that can be used to demonstrate this problem, and added it to zfs-tests (tests/functional/acl/cifs/cifs_attr_004_pos). It currently fails, but will pass when this problem fixed. Steps to Reproduce: Run the test program on a ZFS file system. Expected Results: Pass Actual Results: Fail. Reviewed by: Sanjay Nadkarni Reviewed by: Yuri Pankov Reviewed by: Andrew Stormont Reviewed by: Matt Ahrens Reviewed by: John Kennedy Approved by: Prakash Surya Author: Gordon Ross Modified: stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/zfs_acl.c stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/zfs_vnops.c Directory Properties: stable/11/ (props changed) Modified: stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/zfs_acl.c ============================================================================== --- stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/zfs_acl.c Wed Jul 26 17:48:37 2017 (r321578) +++ stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/zfs_acl.c Wed Jul 26 19:01:15 2017 (r321579) @@ -20,8 +20,8 @@ */ /* * Copyright (c) 2005, 2010, Oracle and/or its affiliates. All rights reserved. - * Copyright 2011 Nexenta Systems, Inc. All rights reserved. * Copyright (c) 2013 by Delphix. All rights reserved. + * Copyright 2017 Nexenta Systems, Inc. All rights reserved. */ #include @@ -2017,13 +2017,11 @@ zfs_zaccess_dataset_check(znode_t *zp, uint32_t v4_mod } /* - * Only check for READONLY on non-directories. + * Intentionally allow ZFS_READONLY through here. + * See zfs_zaccess_common(). */ if ((v4_mode & WRITE_MASK_DATA) && - (((ZTOV(zp)->v_type != VDIR) && - (zp->z_pflags & (ZFS_READONLY | ZFS_IMMUTABLE))) || - (ZTOV(zp)->v_type == VDIR && - (zp->z_pflags & ZFS_IMMUTABLE)))) { + (zp->z_pflags & ZFS_IMMUTABLE)) { return (SET_ERROR(EPERM)); } @@ -2246,6 +2244,24 @@ zfs_zaccess_common(znode_t *zp, uint32_t v4_mode, uint if (skipaclchk) { *working_mode = 0; return (0); + } + + /* + * Note: ZFS_READONLY represents the "DOS R/O" attribute. + * When that flag is set, we should behave as if write access + * were not granted by anything in the ACL. In particular: + * We _must_ allow writes after opening the file r/w, then + * setting the DOS R/O attribute, and writing some more. + * (Similar to how you can write after fchmod(fd, 0444).) + * + * Therefore ZFS_READONLY is ignored in the dataset check + * above, and checked here as if part of the ACL check. + * Also note: DOS R/O is ignored for directories. + */ + if ((v4_mode & WRITE_MASK_DATA) && + (ZTOV(zp)->v_type != VDIR) && + (zp->z_pflags & ZFS_READONLY)) { + return (SET_ERROR(EPERM)); } return (zfs_zaccess_aces_check(zp, working_mode, B_FALSE, cr)); Modified: stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/zfs_vnops.c ============================================================================== --- stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/zfs_vnops.c Wed Jul 26 17:48:37 2017 (r321578) +++ stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/zfs_vnops.c Wed Jul 26 19:01:15 2017 (r321579) @@ -904,9 +904,11 @@ zfs_write(vnode_t *vp, uio_t *uio, int ioflag, cred_t } /* - * If immutable or not appending then return EPERM + * If immutable or not appending then return EPERM. + * Intentionally allow ZFS_READONLY through here. + * See zfs_zaccess_common() */ - if ((zp->z_pflags & (ZFS_IMMUTABLE | ZFS_READONLY)) || + if ((zp->z_pflags & ZFS_IMMUTABLE) || ((zp->z_pflags & ZFS_APPENDONLY) && !(ioflag & FAPPEND) && (uio->uio_loffset < zp->z_size))) { ZFS_EXIT(zfsvfs); @@ -2945,10 +2947,9 @@ zfs_setattr(vnode_t *vp, vattr_t *vap, int flags, cred return (SET_ERROR(EPERM)); } - if ((mask & AT_SIZE) && (zp->z_pflags & ZFS_READONLY)) { - ZFS_EXIT(zfsvfs); - return (SET_ERROR(EPERM)); - } + /* + * Note: ZFS_READONLY is handled in zfs_zaccess_common. + */ /* * Verify timestamps doesn't overflow 32 bits. From owner-svn-src-stable-11@freebsd.org Wed Jul 26 23:03:23 2017 Return-Path: Delivered-To: svn-src-stable-11@mailman.ysv.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by mailman.ysv.freebsd.org (Postfix) with ESMTP id 1A85ADB2CB0; Wed, 26 Jul 2017 23:03:23 +0000 (UTC) (envelope-from emaste@FreeBSD.org) Received: from repo.freebsd.org (repo.freebsd.org [IPv6:2610:1c1:1:6068::e6a:0]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (Client did not present a certificate) by mx1.freebsd.org (Postfix) with ESMTPS id DCC247F136; Wed, 26 Jul 2017 23:03:22 +0000 (UTC) (envelope-from emaste@FreeBSD.org) Received: from repo.freebsd.org ([127.0.1.37]) by repo.freebsd.org (8.15.2/8.15.2) with ESMTP id v6QN3MEr036660; Wed, 26 Jul 2017 23:03:22 GMT (envelope-from emaste@FreeBSD.org) Received: (from emaste@localhost) by repo.freebsd.org (8.15.2/8.15.2/Submit) id v6QN3Mr7036659; Wed, 26 Jul 2017 23:03:22 GMT (envelope-from emaste@FreeBSD.org) Message-Id: <201707262303.v6QN3Mr7036659@repo.freebsd.org> X-Authentication-Warning: repo.freebsd.org: emaste set sender to emaste@FreeBSD.org using -f From: Ed Maste Date: Wed, 26 Jul 2017 23:03:22 +0000 (UTC) To: src-committers@freebsd.org, svn-src-all@freebsd.org, svn-src-stable@freebsd.org, svn-src-stable-11@freebsd.org Subject: svn commit: r321591 - stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs X-SVN-Group: stable-11 X-SVN-Commit-Author: emaste X-SVN-Commit-Paths: stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs X-SVN-Commit-Revision: 321591 X-SVN-Commit-Repository: base MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit X-BeenThere: svn-src-stable-11@freebsd.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: SVN commit messages for only the 11-stable src tree List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 26 Jul 2017 23:03:23 -0000 Author: emaste Date: Wed Jul 26 23:03:21 2017 New Revision: 321591 URL: https://svnweb.freebsd.org/changeset/base/321591 Log: MFC r321218: zfs: Fix a typo in the delay_min_dirty_percent sysctl description The description is FreeBSD-specific and was added in r266497 to fix PR189865. PR: 220825 Submitted by: Fabian Keil Obtained from: ElectroBSD Modified: stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/dsl_pool.c Directory Properties: stable/11/ (props changed) Modified: stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/dsl_pool.c ============================================================================== --- stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/dsl_pool.c Wed Jul 26 22:32:57 2017 (r321590) +++ stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/dsl_pool.c Wed Jul 26 23:03:21 2017 (r321591) @@ -167,7 +167,7 @@ static int sysctl_zfs_delay_min_dirty_percent(SYSCTL_H SYSCTL_PROC(_vfs_zfs, OID_AUTO, delay_min_dirty_percent, CTLTYPE_INT | CTLFLAG_MPSAFE | CTLFLAG_RW, 0, sizeof(int), sysctl_zfs_delay_min_dirty_percent, "I", - "The limit of outstanding dirty data before transations are delayed"); + "The limit of outstanding dirty data before transactions are delayed"); static int sysctl_zfs_delay_scale(SYSCTL_HANDLER_ARGS); /* No zfs_delay_scale tunable due to limit requirements */ From owner-svn-src-stable-11@freebsd.org Wed Jul 26 23:14:23 2017 Return-Path: Delivered-To: svn-src-stable-11@mailman.ysv.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by mailman.ysv.freebsd.org (Postfix) with ESMTP id 112A1DB3192; Wed, 26 Jul 2017 23:14:23 +0000 (UTC) (envelope-from emaste@FreeBSD.org) Received: from repo.freebsd.org (repo.freebsd.org [IPv6:2610:1c1:1:6068::e6a:0]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (Client did not present a certificate) by mx1.freebsd.org (Postfix) with ESMTPS id D118E7FB1E; Wed, 26 Jul 2017 23:14:22 +0000 (UTC) (envelope-from emaste@FreeBSD.org) Received: from repo.freebsd.org ([127.0.1.37]) by repo.freebsd.org (8.15.2/8.15.2) with ESMTP id v6QNEMiw041503; Wed, 26 Jul 2017 23:14:22 GMT (envelope-from emaste@FreeBSD.org) Received: (from emaste@localhost) by repo.freebsd.org (8.15.2/8.15.2/Submit) id v6QNEMmV041502; Wed, 26 Jul 2017 23:14:22 GMT (envelope-from emaste@FreeBSD.org) Message-Id: <201707262314.v6QNEMmV041502@repo.freebsd.org> X-Authentication-Warning: repo.freebsd.org: emaste set sender to emaste@FreeBSD.org using -f From: Ed Maste Date: Wed, 26 Jul 2017 23:14:22 +0000 (UTC) To: src-committers@freebsd.org, svn-src-all@freebsd.org, svn-src-stable@freebsd.org, svn-src-stable-11@freebsd.org Subject: svn commit: r321593 - stable/11/sys/libkern/arm64 X-SVN-Group: stable-11 X-SVN-Commit-Author: emaste X-SVN-Commit-Paths: stable/11/sys/libkern/arm64 X-SVN-Commit-Revision: 321593 X-SVN-Commit-Repository: base MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit X-BeenThere: svn-src-stable-11@freebsd.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: SVN commit messages for only the 11-stable src tree List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 26 Jul 2017 23:14:23 -0000 Author: emaste Date: Wed Jul 26 23:14:21 2017 New Revision: 321593 URL: https://svnweb.freebsd.org/changeset/base/321593 Log: MFC r319718: arm64: add ".arch armv8-a+crc" to allow use of crc instructions With Clang 5.0 the .arch directive is required, otherwise Clang complains "error: instruction requires: crc". This was reported in D10499 but not added initially, because clang 3.8 available on a ref machine reported unknown directive. Clang 4.0 allows but does not require the directive. Sponsored by: The FreeBSD Foundation Modified: stable/11/sys/libkern/arm64/crc32c_armv8.S Directory Properties: stable/11/ (props changed) Modified: stable/11/sys/libkern/arm64/crc32c_armv8.S ============================================================================== --- stable/11/sys/libkern/arm64/crc32c_armv8.S Wed Jul 26 23:05:25 2017 (r321592) +++ stable/11/sys/libkern/arm64/crc32c_armv8.S Wed Jul 26 23:14:21 2017 (r321593) @@ -27,6 +27,7 @@ #include __FBSDID("$FreeBSD$"); +.arch armv8-a+crc /* * uint32_t From owner-svn-src-stable-11@freebsd.org Wed Jul 26 23:18:15 2017 Return-Path: Delivered-To: svn-src-stable-11@mailman.ysv.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by mailman.ysv.freebsd.org (Postfix) with ESMTP id ABD77DB32BD; Wed, 26 Jul 2017 23:18:15 +0000 (UTC) (envelope-from emaste@FreeBSD.org) Received: from repo.freebsd.org (repo.freebsd.org [IPv6:2610:1c1:1:6068::e6a:0]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (Client did not present a certificate) by mx1.freebsd.org (Postfix) with ESMTPS id 72A727FD39; Wed, 26 Jul 2017 23:18:15 +0000 (UTC) (envelope-from emaste@FreeBSD.org) Received: from repo.freebsd.org ([127.0.1.37]) by repo.freebsd.org (8.15.2/8.15.2) with ESMTP id v6QNIEsJ041685; Wed, 26 Jul 2017 23:18:14 GMT (envelope-from emaste@FreeBSD.org) Received: (from emaste@localhost) by repo.freebsd.org (8.15.2/8.15.2/Submit) id v6QNIEJ5041684; Wed, 26 Jul 2017 23:18:14 GMT (envelope-from emaste@FreeBSD.org) Message-Id: <201707262318.v6QNIEJ5041684@repo.freebsd.org> X-Authentication-Warning: repo.freebsd.org: emaste set sender to emaste@FreeBSD.org using -f From: Ed Maste Date: Wed, 26 Jul 2017 23:18:14 +0000 (UTC) To: src-committers@freebsd.org, svn-src-all@freebsd.org, svn-src-stable@freebsd.org, svn-src-stable-11@freebsd.org Subject: svn commit: r321594 - stable/11 X-SVN-Group: stable-11 X-SVN-Commit-Author: emaste X-SVN-Commit-Paths: stable/11 X-SVN-Commit-Revision: 321594 X-SVN-Commit-Repository: base MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit X-BeenThere: svn-src-stable-11@freebsd.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: SVN commit messages for only the 11-stable src tree List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 26 Jul 2017 23:18:15 -0000 Author: emaste Date: Wed Jul 26 23:18:14 2017 New Revision: 321594 URL: https://svnweb.freebsd.org/changeset/base/321594 Log: MFC r312857: Use cross-NM (XNM) in compat32 build An attempt to build mips64 using external toolchain failed as it tried to use the host amd64 nm. Modified: stable/11/Makefile.libcompat Directory Properties: stable/11/ (props changed) Modified: stable/11/Makefile.libcompat ============================================================================== --- stable/11/Makefile.libcompat Wed Jul 26 23:14:21 2017 (r321593) +++ stable/11/Makefile.libcompat Wed Jul 26 23:18:14 2017 (r321594) @@ -34,6 +34,8 @@ LIB32WMAKEFLAGS= \ OBJCOPY="${XOBJCOPY}" .endif +LIB32WMAKEFLAGS+= NM="${XNM}" + LIB32CFLAGS= -m32 -DCOMPAT_32BIT LIB32DTRACE= ${DTRACE} -32 From owner-svn-src-stable-11@freebsd.org Wed Jul 26 23:21:30 2017 Return-Path: Delivered-To: svn-src-stable-11@mailman.ysv.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by mailman.ysv.freebsd.org (Postfix) with ESMTP id 3BA8FDB346B; Wed, 26 Jul 2017 23:21:30 +0000 (UTC) (envelope-from emaste@FreeBSD.org) Received: from repo.freebsd.org (repo.freebsd.org [IPv6:2610:1c1:1:6068::e6a:0]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (Client did not present a certificate) by mx1.freebsd.org (Postfix) with ESMTPS id 0A70E80048; Wed, 26 Jul 2017 23:21:29 +0000 (UTC) (envelope-from emaste@FreeBSD.org) Received: from repo.freebsd.org ([127.0.1.37]) by repo.freebsd.org (8.15.2/8.15.2) with ESMTP id v6QNLT2D043369; Wed, 26 Jul 2017 23:21:29 GMT (envelope-from emaste@FreeBSD.org) Received: (from emaste@localhost) by repo.freebsd.org (8.15.2/8.15.2/Submit) id v6QNLTRO043354; Wed, 26 Jul 2017 23:21:29 GMT (envelope-from emaste@FreeBSD.org) Message-Id: <201707262321.v6QNLTRO043354@repo.freebsd.org> X-Authentication-Warning: repo.freebsd.org: emaste set sender to emaste@FreeBSD.org using -f From: Ed Maste Date: Wed, 26 Jul 2017 23:21:29 +0000 (UTC) To: src-committers@freebsd.org, svn-src-all@freebsd.org, svn-src-stable@freebsd.org, svn-src-stable-11@freebsd.org Subject: svn commit: r321595 - stable/11/usr.sbin/makefs X-SVN-Group: stable-11 X-SVN-Commit-Author: emaste X-SVN-Commit-Paths: stable/11/usr.sbin/makefs X-SVN-Commit-Revision: 321595 X-SVN-Commit-Repository: base MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit X-BeenThere: svn-src-stable-11@freebsd.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: SVN commit messages for only the 11-stable src tree List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 26 Jul 2017 23:21:30 -0000 Author: emaste Date: Wed Jul 26 23:21:28 2017 New Revision: 321595 URL: https://svnweb.freebsd.org/changeset/base/321595 Log: MFC r316055: makefs: sort roundup with the other off_t members in fsinfo_t Modified: stable/11/usr.sbin/makefs/makefs.h Directory Properties: stable/11/ (props changed) Modified: stable/11/usr.sbin/makefs/makefs.h ============================================================================== --- stable/11/usr.sbin/makefs/makefs.h Wed Jul 26 23:18:14 2017 (r321594) +++ stable/11/usr.sbin/makefs/makefs.h Wed Jul 26 23:21:28 2017 (r321595) @@ -124,13 +124,13 @@ typedef struct { off_t minsize; /* minimum size image should be */ off_t maxsize; /* maximum size image can be */ off_t freefiles; /* free file entries to leave */ - int freefilepc; /* free file % */ off_t freeblocks; /* free blocks to leave */ + off_t roundup; /* round image size up to this value */ + int freefilepc; /* free file % */ int freeblockpc; /* free block % */ int needswap; /* non-zero if byte swapping needed */ int sectorsize; /* sector size */ int sparse; /* sparse image, don't fill it with zeros */ - off_t roundup; /* round image size up to this value */ void *fs_specific; /* File system specific additions. */ } fsinfo_t; From owner-svn-src-stable-11@freebsd.org Wed Jul 26 23:23:35 2017 Return-Path: Delivered-To: svn-src-stable-11@mailman.ysv.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by mailman.ysv.freebsd.org (Postfix) with ESMTP id 22720DB350F; Wed, 26 Jul 2017 23:23:35 +0000 (UTC) (envelope-from emaste@FreeBSD.org) Received: from repo.freebsd.org (repo.freebsd.org [IPv6:2610:1c1:1:6068::e6a:0]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (Client did not present a certificate) by mx1.freebsd.org (Postfix) with ESMTPS id E4A51802AA; Wed, 26 Jul 2017 23:23:34 +0000 (UTC) (envelope-from emaste@FreeBSD.org) Received: from repo.freebsd.org ([127.0.1.37]) by repo.freebsd.org (8.15.2/8.15.2) with ESMTP id v6QNNXjh045503; Wed, 26 Jul 2017 23:23:33 GMT (envelope-from emaste@FreeBSD.org) Received: (from emaste@localhost) by repo.freebsd.org (8.15.2/8.15.2/Submit) id v6QNNXix045501; Wed, 26 Jul 2017 23:23:33 GMT (envelope-from emaste@FreeBSD.org) Message-Id: <201707262323.v6QNNXix045501@repo.freebsd.org> X-Authentication-Warning: repo.freebsd.org: emaste set sender to emaste@FreeBSD.org using -f From: Ed Maste Date: Wed, 26 Jul 2017 23:23:33 +0000 (UTC) To: src-committers@freebsd.org, svn-src-all@freebsd.org, svn-src-stable@freebsd.org, svn-src-stable-11@freebsd.org Subject: svn commit: r321596 - stable/11/sys/conf X-SVN-Group: stable-11 X-SVN-Commit-Author: emaste X-SVN-Commit-Paths: stable/11/sys/conf X-SVN-Commit-Revision: 321596 X-SVN-Commit-Repository: base MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit X-BeenThere: svn-src-stable-11@freebsd.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: SVN commit messages for only the 11-stable src tree List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 26 Jul 2017 23:23:35 -0000 Author: emaste Date: Wed Jul 26 23:23:33 2017 New Revision: 321596 URL: https://svnweb.freebsd.org/changeset/base/321596 Log: MFC r319513: linux vdso: pass -fPIC to the assembler, not linker -fPIC has no effect on linking although it seems to be ignored by GNU ld.bfd. However, it causes ld.lld to terminate with an invalid argument error. This is equivalent to r296057 but for the kernel (not modules) case. Sponsored by: The FreeBSD Foundation Modified: stable/11/sys/conf/files.amd64 stable/11/sys/conf/files.i386 Directory Properties: stable/11/ (props changed) Modified: stable/11/sys/conf/files.amd64 ============================================================================== --- stable/11/sys/conf/files.amd64 Wed Jul 26 23:21:28 2017 (r321595) +++ stable/11/sys/conf/files.amd64 Wed Jul 26 23:23:33 2017 (r321596) @@ -46,7 +46,7 @@ linux32_assym.h optional compat_linux32 \ # linux32_locore.o optional compat_linux32 \ dependency "linux32_assym.h $S/amd64/linux32/linux32_locore.s" \ - compile-with "${CC} -x assembler-with-cpp -DLOCORE -m32 -shared -s -pipe -I. -I$S -Werror -Wall -fno-common -nostdinc -nostdlib -Wl,-T$S/amd64/linux32/linux32_vdso.lds.s -Wl,-soname=linux32_vdso.so,--eh-frame-hdr,-fPIC,-warn-common ${.IMPSRC} -o ${.TARGET}" \ + compile-with "${CC} -x assembler-with-cpp -DLOCORE -m32 -shared -s -pipe -I. -I$S -Werror -Wall -fPIC -fno-common -nostdinc -nostdlib -Wl,-T$S/amd64/linux32/linux32_vdso.lds.s -Wl,-soname=linux32_vdso.so,--eh-frame-hdr,-warn-common ${.IMPSRC} -o ${.TARGET}" \ no-obj no-implicit-rule \ clean "linux32_locore.o" # Modified: stable/11/sys/conf/files.i386 ============================================================================== --- stable/11/sys/conf/files.i386 Wed Jul 26 23:21:28 2017 (r321595) +++ stable/11/sys/conf/files.i386 Wed Jul 26 23:23:33 2017 (r321596) @@ -33,7 +33,7 @@ linux_assym.h optional compat_linux \ # linux_locore.o optional compat_linux \ dependency "linux_assym.h $S/i386/linux/linux_locore.s" \ - compile-with "${CC} -x assembler-with-cpp -DLOCORE -shared -s -pipe -I. -I$S -Werror -Wall -fno-common -nostdinc -nostdlib -Wl,-T$S/i386/linux/linux_vdso.lds.s -Wl,-soname=linux_vdso.so,--eh-frame-hdr,-fPIC,-warn-common ${.IMPSRC} -o ${.TARGET}" \ + compile-with "${CC} -x assembler-with-cpp -DLOCORE -shared -s -pipe -I. -I$S -Werror -Wall -fPIC -fno-common -nostdinc -nostdlib -Wl,-T$S/i386/linux/linux_vdso.lds.s -Wl,-soname=linux_vdso.so,--eh-frame-hdr,-warn-common ${.IMPSRC} -o ${.TARGET}" \ no-obj no-implicit-rule \ clean "linux_locore.o" # From owner-svn-src-stable-11@freebsd.org Thu Jul 27 00:14:09 2017 Return-Path: Delivered-To: svn-src-stable-11@mailman.ysv.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by mailman.ysv.freebsd.org (Postfix) with ESMTP id AE208DB4736; Thu, 27 Jul 2017 00:14:09 +0000 (UTC) (envelope-from rmacklem@FreeBSD.org) Received: from repo.freebsd.org (repo.freebsd.org [IPv6:2610:1c1:1:6068::e6a:0]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (Client did not present a certificate) by mx1.freebsd.org (Postfix) with ESMTPS id 8A56381AB0; Thu, 27 Jul 2017 00:14:09 +0000 (UTC) (envelope-from rmacklem@FreeBSD.org) Received: from repo.freebsd.org ([127.0.1.37]) by repo.freebsd.org (8.15.2/8.15.2) with ESMTP id v6R0E86t065519; Thu, 27 Jul 2017 00:14:08 GMT (envelope-from rmacklem@FreeBSD.org) Received: (from rmacklem@localhost) by repo.freebsd.org (8.15.2/8.15.2/Submit) id v6R0E8DP065518; Thu, 27 Jul 2017 00:14:08 GMT (envelope-from rmacklem@FreeBSD.org) Message-Id: <201707270014.v6R0E8DP065518@repo.freebsd.org> X-Authentication-Warning: repo.freebsd.org: rmacklem set sender to rmacklem@FreeBSD.org using -f From: Rick Macklem Date: Thu, 27 Jul 2017 00:14:08 +0000 (UTC) To: src-committers@freebsd.org, svn-src-all@freebsd.org, svn-src-stable@freebsd.org, svn-src-stable-11@freebsd.org Subject: svn commit: r321598 - stable/11/usr.sbin/nfsd X-SVN-Group: stable-11 X-SVN-Commit-Author: rmacklem X-SVN-Commit-Paths: stable/11/usr.sbin/nfsd X-SVN-Commit-Revision: 321598 X-SVN-Commit-Repository: base MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit X-BeenThere: svn-src-stable-11@freebsd.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: SVN commit messages for only the 11-stable src tree List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 27 Jul 2017 00:14:09 -0000 Author: rmacklem Date: Thu Jul 27 00:14:08 2017 New Revision: 321598 URL: https://svnweb.freebsd.org/changeset/base/321598 Log: MFC: r321248 Update the nfsv4 man page to reflect recent changes to support the newer RFCs (5661 and 7530). The main man changes are for the case of "numbers in strings" for user/groups that RFC7530 allows and avoids use of nfsuserd(8). This is a content change. Modified: stable/11/usr.sbin/nfsd/nfsv4.4 Directory Properties: stable/11/ (props changed) Modified: stable/11/usr.sbin/nfsd/nfsv4.4 ============================================================================== --- stable/11/usr.sbin/nfsd/nfsv4.4 Thu Jul 27 00:04:09 2017 (r321597) +++ stable/11/usr.sbin/nfsd/nfsv4.4 Thu Jul 27 00:14:08 2017 (r321598) @@ -24,7 +24,7 @@ .\" .\" $FreeBSD$ .\" -.Dd July 1, 2013 +.Dd July 19, 2017 .Dt NFSV4 4 .Os .Sh NAME @@ -34,7 +34,8 @@ The NFS client and server provides support for the .Tn NFSv4 specification; see -.%T "Network File System (NFS) Version 4 Protocol RFC 3530" . +.%T "Network File System (NFS) Version 4 Protocol RFC 7530" and +.%T "Network File System (NFS) Version 4 Minor Version 1 Protocol RFC 5661" . The protocol is somewhat similar to NFS Version 3, but differs in significant ways. It uses a single compound RPC that concatenates operations to-gether. @@ -74,6 +75,7 @@ It provides several optional features not present in N - Referrals, which redirect subtrees to other servers (not yet implemented) - Delegations, which allow a client to operate on a file locally +- pNFS, where I/O operations are separated from Metadata operations .Ed .Pp The @@ -115,8 +117,8 @@ multiple server file systems, although not all clients this. .Pp .Nm -uses names for users and groups instead of numbers. -On the wire, they +uses strings for users and groups instead of numbers. +On the wire, these strings can either have the numbers in the string or take the form: .sp .Bd -literal -offset indent -compact @@ -136,15 +138,37 @@ Under FreeBSD, the mapping daemon is called .Xr nfsuserd 8 and has a command line option that overrides the domain component of the machine's hostname. -For use of +For use of this form of string on .Nm , either client or server, this daemon must be running. -If this ``'' is not set correctly or the daemon is not running, ``ls -l'' will typically +.Pp +The form where the numbers are in the strings can only be used for AUTH_SYS. +To configure your systems this way, the +.Xr nfsuserd 8 +daemon does not need to be running on the server, but the following sysctls need to be +set to 1 on the server. +.sp +.Bd -literal -offset indent -compact +vfs.nfs.enable_uidtostring +vfs.nfsd.enable_stringtouid +.Ed +.sp +On the client, the sysctl +.sp +.Bd -literal -offset indent -compact +vfs.nfs.enable_uidtostring +.Ed +.sp +must be set to 1 and the +.Xr nfsuserd 8 +daemon does not need to be running. +.Pp +If these strings are not configured correctly, ``ls -l'' will typically report a lot of ``nobody'' and ``nogroup'' ownerships. .Pp Although uid/gid numbers are no longer used in the .Nm -protocol, they will still be in the RPC authentication fields when +protocol except optionally in the above strings, they will still be in the RPC authentication fields when using AUTH_SYS (sec=sys), which is the default. As such, in this case both the user/group name and number spaces must be consistent between the client and server. @@ -156,24 +180,24 @@ will go on the wire. .Sh SERVER SETUP To set up the NFS server that supports .Nm , -you will need to either set the variables in +you will need to set the variables in .Xr rc.conf 5 as follows: .sp .Bd -literal -offset indent -compact nfs_server_enable="YES" nfsv4_server_enable="YES" +.Ed +.sp +plus +.sp +.Bd -literal -offset indent -compact nfsuserd_enable="YES" .Ed .sp -or start -.Xr mountd 8 -and -.Xr nfsd 8 -without the ``-o'' option, which would force use of the old server. -The -.Xr nfsuserd 8 -daemon must also be running. +if the server is using the ``@'' form of user/group strings or +is using the ``-manage-gids'' option for +.Xr nfsuserd 8 . .Pp You will also need to add at least one ``V4:'' line to the .Xr exports 5 @@ -232,7 +256,7 @@ plus set ``tcp'' and .Pp The .Xr nfsuserd 8 -must be running, as above. +must be running if name<->uid/gid mapping is being used, as above. Also, since an .Nm mount uses the host uuid to identify the client uniquely to the server, @@ -255,7 +279,7 @@ daemon to handle client side callbacks. This will occur if .sp .Bd -literal -offset indent -compact -nfsuserd_enable="YES" +nfsuserd_enable="YES" <-- If name<->uid/gid mapping is being used. nfscbd_enable="YES" .Ed .sp @@ -265,7 +289,7 @@ are set in Without a functioning callback path, a server will never issue Delegations to a client. .sp -By default, the callback address will be set to the IP address acquired via +For NFSv4.0, by default, the callback address will be set to the IP address acquired via rtalloc() in the kernel and port# 7745. To override the default port#, a command line option for .Xr nfscbd 8 @@ -281,6 +305,10 @@ N.N.N.N.N.N .sp where the first 4 Ns are the host IP address and the last two are the port# in network byte order (all decimal #s in the range 0-255). +.Pp +For NFSv4.1, the callback path (called a backchannel) uses the same TCP connection as the mount, +so none of the above applies and should work through gateways without +any issues. .Pp To build a kernel with the client that supports .Nm From owner-svn-src-stable-11@freebsd.org Thu Jul 27 00:25:53 2017 Return-Path: Delivered-To: svn-src-stable-11@mailman.ysv.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by mailman.ysv.freebsd.org (Postfix) with ESMTP id C4526DB4A8C; Thu, 27 Jul 2017 00:25:53 +0000 (UTC) (envelope-from emaste@FreeBSD.org) Received: from repo.freebsd.org (repo.freebsd.org [IPv6:2610:1c1:1:6068::e6a:0]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (Client did not present a certificate) by mx1.freebsd.org (Postfix) with ESMTPS id 93314820DA; Thu, 27 Jul 2017 00:25:53 +0000 (UTC) (envelope-from emaste@FreeBSD.org) Received: from repo.freebsd.org ([127.0.1.37]) by repo.freebsd.org (8.15.2/8.15.2) with ESMTP id v6R0PqM7069989; Thu, 27 Jul 2017 00:25:52 GMT (envelope-from emaste@FreeBSD.org) Received: (from emaste@localhost) by repo.freebsd.org (8.15.2/8.15.2/Submit) id v6R0Pqp5069988; Thu, 27 Jul 2017 00:25:52 GMT (envelope-from emaste@FreeBSD.org) Message-Id: <201707270025.v6R0Pqp5069988@repo.freebsd.org> X-Authentication-Warning: repo.freebsd.org: emaste set sender to emaste@FreeBSD.org using -f From: Ed Maste Date: Thu, 27 Jul 2017 00:25:52 +0000 (UTC) To: src-committers@freebsd.org, svn-src-all@freebsd.org, svn-src-stable@freebsd.org, svn-src-stable-11@freebsd.org Subject: svn commit: r321599 - stable/11/sys/conf X-SVN-Group: stable-11 X-SVN-Commit-Author: emaste X-SVN-Commit-Paths: stable/11/sys/conf X-SVN-Commit-Revision: 321599 X-SVN-Commit-Repository: base MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit X-BeenThere: svn-src-stable-11@freebsd.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: SVN commit messages for only the 11-stable src tree List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 27 Jul 2017 00:25:53 -0000 Author: emaste Date: Thu Jul 27 00:25:52 2017 New Revision: 321599 URL: https://svnweb.freebsd.org/changeset/base/321599 Log: MFC r321302: add arm64 objcopy output target for embedfs PR: 220877 Submitted by: David NewHamlet Modified: stable/11/sys/conf/kern.pre.mk Directory Properties: stable/11/ (props changed) Modified: stable/11/sys/conf/kern.pre.mk ============================================================================== --- stable/11/sys/conf/kern.pre.mk Thu Jul 27 00:14:08 2017 (r321598) +++ stable/11/sys/conf/kern.pre.mk Thu Jul 27 00:25:52 2017 (r321599) @@ -237,6 +237,7 @@ EMBEDFS_ARCH.${MACHINE_ARCH}!= sed -n '/OUTPUT_ARCH/s/ EMBEDFS_FORMAT.arm?= elf32-littlearm EMBEDFS_FORMAT.armv6?= elf32-littlearm +EMBEDFS_FORMAT.aarch64?= elf64-littleaarch64 EMBEDFS_FORMAT.mips?= elf32-tradbigmips EMBEDFS_FORMAT.mipsel?= elf32-tradlittlemips EMBEDFS_FORMAT.mips64?= elf64-tradbigmips From owner-svn-src-stable-11@freebsd.org Thu Jul 27 00:29:04 2017 Return-Path: Delivered-To: svn-src-stable-11@mailman.ysv.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by mailman.ysv.freebsd.org (Postfix) with ESMTP id E2C9BDB4B8E; Thu, 27 Jul 2017 00:29:04 +0000 (UTC) (envelope-from emaste@FreeBSD.org) Received: from repo.freebsd.org (repo.freebsd.org [IPv6:2610:1c1:1:6068::e6a:0]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (Client did not present a certificate) by mx1.freebsd.org (Postfix) with ESMTPS id BE2CC8225B; Thu, 27 Jul 2017 00:29:04 +0000 (UTC) (envelope-from emaste@FreeBSD.org) Received: from repo.freebsd.org ([127.0.1.37]) by repo.freebsd.org (8.15.2/8.15.2) with ESMTP id v6R0T3T2070157; Thu, 27 Jul 2017 00:29:03 GMT (envelope-from emaste@FreeBSD.org) Received: (from emaste@localhost) by repo.freebsd.org (8.15.2/8.15.2/Submit) id v6R0T32S070156; Thu, 27 Jul 2017 00:29:03 GMT (envelope-from emaste@FreeBSD.org) Message-Id: <201707270029.v6R0T32S070156@repo.freebsd.org> X-Authentication-Warning: repo.freebsd.org: emaste set sender to emaste@FreeBSD.org using -f From: Ed Maste Date: Thu, 27 Jul 2017 00:29:03 +0000 (UTC) To: src-committers@freebsd.org, svn-src-all@freebsd.org, svn-src-stable@freebsd.org, svn-src-stable-11@freebsd.org Subject: svn commit: r321600 - stable/11/usr.sbin/acpi/acpidump X-SVN-Group: stable-11 X-SVN-Commit-Author: emaste X-SVN-Commit-Paths: stable/11/usr.sbin/acpi/acpidump X-SVN-Commit-Revision: 321600 X-SVN-Commit-Repository: base MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit X-BeenThere: svn-src-stable-11@freebsd.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: SVN commit messages for only the 11-stable src tree List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 27 Jul 2017 00:29:05 -0000 Author: emaste Date: Thu Jul 27 00:29:03 2017 New Revision: 321600 URL: https://svnweb.freebsd.org/changeset/base/321600 Log: MFC r321294: acpidump: use C99 designated initializers Submitted by: Guangyuan Yang Sponsored by: The FreeBSD Foundation Modified: stable/11/usr.sbin/acpi/acpidump/acpi.c Directory Properties: stable/11/ (props changed) Modified: stable/11/usr.sbin/acpi/acpidump/acpi.c ============================================================================== --- stable/11/usr.sbin/acpi/acpidump/acpi.c Thu Jul 27 00:25:52 2017 (r321599) +++ stable/11/usr.sbin/acpi/acpidump/acpi.c Thu Jul 27 00:29:03 2017 (r321600) @@ -388,16 +388,25 @@ acpi_print_local_nmi(u_int lint, uint16_t mps_flags) acpi_print_mps_flags(mps_flags); } -static const char *apic_types[] = { "Local APIC", "IO APIC", "INT Override", - "NMI", "Local APIC NMI", - "Local APIC Override", "IO SAPIC", - "Local SAPIC", "Platform Interrupt", - "Local X2APIC", "Local X2APIC NMI", - "GIC CPU Interface Structure", - "GIC Distributor Structure", - "GICv2m MSI Frame", - "GIC Redistributor Structure", - "GIC ITS Structure" }; +static const char *apic_types[] = { + [ACPI_MADT_TYPE_LOCAL_APIC] = "Local APIC", + [ACPI_MADT_TYPE_IO_APIC] = "IO APIC", + [ACPI_MADT_TYPE_INTERRUPT_OVERRIDE] = "INT Override", + [ACPI_MADT_TYPE_NMI_SOURCE] = "NMI", + [ACPI_MADT_TYPE_LOCAL_APIC_NMI] = "Local APIC NMI", + [ACPI_MADT_TYPE_LOCAL_APIC_OVERRIDE] = "Local APIC Override", + [ACPI_MADT_TYPE_IO_SAPIC] = "IO SAPIC", + [ACPI_MADT_TYPE_LOCAL_SAPIC] = "Local SAPIC", + [ACPI_MADT_TYPE_INTERRUPT_SOURCE] = "Platform Interrupt", + [ACPI_MADT_TYPE_LOCAL_X2APIC] = "Local X2APIC", + [ACPI_MADT_TYPE_LOCAL_X2APIC_NMI] = "Local X2APIC NMI", + [ACPI_MADT_TYPE_GENERIC_INTERRUPT] = "GIC CPU Interface Structure", + [ACPI_MADT_TYPE_GENERIC_DISTRIBUTOR] = "GIC Distributor Structure", + [ACPI_MADT_TYPE_GENERIC_MSI_FRAME] = "GICv2m MSI Frame", + [ACPI_MADT_TYPE_GENERIC_REDISTRIBUTOR] = "GIC Redistributor Structure", + [ACPI_MADT_TYPE_GENERIC_TRANSLATOR] = "GIC ITS Structure" +}; + static const char *platform_int_types[] = { "0 (unknown)", "PMI", "INIT", "Corrected Platform Error" }; @@ -1072,7 +1081,12 @@ acpi_print_srat_memory(ACPI_SRAT_MEM_AFFINITY *mp) printf("\tProximity Domain=%d\n", mp->ProximityDomain); } -static const char *srat_types[] = { "CPU", "Memory", "X2APIC", "GICC" }; +static const char *srat_types[] = { + [ACPI_SRAT_TYPE_CPU_AFFINITY] = "CPU", + [ACPI_SRAT_TYPE_MEMORY_AFFINITY] = "Memory", + [ACPI_SRAT_TYPE_X2APIC_CPU_AFFINITY] = "X2APIC", + [ACPI_SRAT_TYPE_GICC_AFFINITY] = "GICC" +}; static void acpi_print_srat(ACPI_SUBTABLE_HEADER *srat) From owner-svn-src-stable-11@freebsd.org Thu Jul 27 00:30:14 2017 Return-Path: Delivered-To: svn-src-stable-11@mailman.ysv.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by mailman.ysv.freebsd.org (Postfix) with ESMTP id 93B6DDB4C16; Thu, 27 Jul 2017 00:30:14 +0000 (UTC) (envelope-from emaste@FreeBSD.org) Received: from repo.freebsd.org (repo.freebsd.org [IPv6:2610:1c1:1:6068::e6a:0]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (Client did not present a certificate) by mx1.freebsd.org (Postfix) with ESMTPS id 62C4C82390; Thu, 27 Jul 2017 00:30:14 +0000 (UTC) (envelope-from emaste@FreeBSD.org) Received: from repo.freebsd.org ([127.0.1.37]) by repo.freebsd.org (8.15.2/8.15.2) with ESMTP id v6R0UD1I070272; Thu, 27 Jul 2017 00:30:13 GMT (envelope-from emaste@FreeBSD.org) Received: (from emaste@localhost) by repo.freebsd.org (8.15.2/8.15.2/Submit) id v6R0UDhZ070271; Thu, 27 Jul 2017 00:30:13 GMT (envelope-from emaste@FreeBSD.org) Message-Id: <201707270030.v6R0UDhZ070271@repo.freebsd.org> X-Authentication-Warning: repo.freebsd.org: emaste set sender to emaste@FreeBSD.org using -f From: Ed Maste Date: Thu, 27 Jul 2017 00:30:13 +0000 (UTC) To: src-committers@freebsd.org, svn-src-all@freebsd.org, svn-src-stable@freebsd.org, svn-src-stable-11@freebsd.org Subject: svn commit: r321601 - stable/11/usr.sbin/acpi/acpidump X-SVN-Group: stable-11 X-SVN-Commit-Author: emaste X-SVN-Commit-Paths: stable/11/usr.sbin/acpi/acpidump X-SVN-Commit-Revision: 321601 X-SVN-Commit-Repository: base MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit X-BeenThere: svn-src-stable-11@freebsd.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: SVN commit messages for only the 11-stable src tree List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 27 Jul 2017 00:30:14 -0000 Author: emaste Date: Thu Jul 27 00:30:13 2017 New Revision: 321601 URL: https://svnweb.freebsd.org/changeset/base/321601 Log: MFC r321299: acpidump: add GIC ITS srat type From ACPI 6.2, 5.2.16.5 Sponsored by: The FreeBSD Foundation Modified: stable/11/usr.sbin/acpi/acpidump/acpi.c Directory Properties: stable/11/ (props changed) Modified: stable/11/usr.sbin/acpi/acpidump/acpi.c ============================================================================== --- stable/11/usr.sbin/acpi/acpidump/acpi.c Thu Jul 27 00:29:03 2017 (r321600) +++ stable/11/usr.sbin/acpi/acpidump/acpi.c Thu Jul 27 00:30:13 2017 (r321601) @@ -1085,7 +1085,8 @@ static const char *srat_types[] = { [ACPI_SRAT_TYPE_CPU_AFFINITY] = "CPU", [ACPI_SRAT_TYPE_MEMORY_AFFINITY] = "Memory", [ACPI_SRAT_TYPE_X2APIC_CPU_AFFINITY] = "X2APIC", - [ACPI_SRAT_TYPE_GICC_AFFINITY] = "GICC" + [ACPI_SRAT_TYPE_GICC_AFFINITY] = "GICC", + [ACPI_SRAT_TYPE_GIC_ITS_AFFINITY] = "GIC ITS", }; static void From owner-svn-src-stable-11@freebsd.org Thu Jul 27 10:19:15 2017 Return-Path: Delivered-To: svn-src-stable-11@mailman.ysv.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by mailman.ysv.freebsd.org (Postfix) with ESMTP id 6BFC1DC6D75; Thu, 27 Jul 2017 10:19:15 +0000 (UTC) (envelope-from mav@FreeBSD.org) Received: from repo.freebsd.org (repo.freebsd.org [IPv6:2610:1c1:1:6068::e6a:0]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (Client did not present a certificate) by mx1.freebsd.org (Postfix) with ESMTPS id 4709E6FC13; Thu, 27 Jul 2017 10:19:15 +0000 (UTC) (envelope-from mav@FreeBSD.org) Received: from repo.freebsd.org ([127.0.1.37]) by repo.freebsd.org (8.15.2/8.15.2) with ESMTP id v6RAJEJI011611; Thu, 27 Jul 2017 10:19:14 GMT (envelope-from mav@FreeBSD.org) Received: (from mav@localhost) by repo.freebsd.org (8.15.2/8.15.2/Submit) id v6RAJE2M011608; Thu, 27 Jul 2017 10:19:14 GMT (envelope-from mav@FreeBSD.org) Message-Id: <201707271019.v6RAJE2M011608@repo.freebsd.org> X-Authentication-Warning: repo.freebsd.org: mav set sender to mav@FreeBSD.org using -f From: Alexander Motin Date: Thu, 27 Jul 2017 10:19:14 +0000 (UTC) To: src-committers@freebsd.org, svn-src-all@freebsd.org, svn-src-stable@freebsd.org, svn-src-stable-11@freebsd.org Subject: svn commit: r321609 - in stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs: . sys X-SVN-Group: stable-11 X-SVN-Commit-Author: mav X-SVN-Commit-Paths: in stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs: . sys X-SVN-Commit-Revision: 321609 X-SVN-Commit-Repository: base MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit X-BeenThere: svn-src-stable-11@freebsd.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: SVN commit messages for only the 11-stable src tree List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 27 Jul 2017 10:19:15 -0000 Author: mav Date: Thu Jul 27 10:19:13 2017 New Revision: 321609 URL: https://svnweb.freebsd.org/changeset/base/321609 Log: MFC r320153: revert r315852 which introduced zio_buf_alloc_nowait for use in vdev_queue_aggregate I think that the change is still good, but reconciling it with a planned merge of the ARC buf data scatter-ization is a bit more tedious than I can handle. Modified: stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/sys/zio.h stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/vdev_queue.c stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/zio.c Directory Properties: stable/11/ (props changed) Modified: stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/sys/zio.h ============================================================================== --- stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/sys/zio.h Thu Jul 27 08:37:07 2017 (r321608) +++ stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/sys/zio.h Thu Jul 27 10:19:13 2017 (r321609) @@ -554,7 +554,6 @@ extern zio_t *zio_unique_parent(zio_t *cio); extern void zio_add_child(zio_t *pio, zio_t *cio); extern void *zio_buf_alloc(size_t size); -extern void *zio_buf_alloc_nowait(size_t size); extern void zio_buf_free(void *buf, size_t size); extern void *zio_data_buf_alloc(size_t size); extern void zio_data_buf_free(void *buf, size_t size); Modified: stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/vdev_queue.c ============================================================================== --- stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/vdev_queue.c Thu Jul 27 08:37:07 2017 (r321608) +++ stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/vdev_queue.c Thu Jul 27 10:19:13 2017 (r321609) @@ -647,7 +647,6 @@ static zio_t * vdev_queue_aggregate(vdev_queue_t *vq, zio_t *zio) { zio_t *first, *last, *aio, *dio, *mandatory, *nio; - void *abuf; uint64_t maxgap = 0; uint64_t size; boolean_t stretch; @@ -766,12 +765,8 @@ vdev_queue_aggregate(vdev_queue_t *vq, zio_t *zio) size = IO_SPAN(first, last); ASSERT3U(size, <=, SPA_MAXBLOCKSIZE); - abuf = zio_buf_alloc_nowait(size); - if (abuf == NULL) - return (NULL); - aio = zio_vdev_delegated_io(first->io_vd, first->io_offset, - abuf, size, first->io_type, zio->io_priority, + zio_buf_alloc(size), size, first->io_type, zio->io_priority, flags | ZIO_FLAG_DONT_CACHE | ZIO_FLAG_DONT_QUEUE, vdev_queue_agg_io_done, NULL); aio->io_timestamp = first->io_timestamp; Modified: stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/zio.c ============================================================================== --- stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/zio.c Thu Jul 27 08:37:07 2017 (r321608) +++ stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/zio.c Thu Jul 27 10:19:13 2017 (r321609) @@ -272,33 +272,18 @@ zio_fini(void) * useful to inspect ZFS metadata, but if possible, we should avoid keeping * excess / transient data in-core during a crashdump. */ -static void * -zio_buf_alloc_impl(size_t size, boolean_t canwait) +void * +zio_buf_alloc(size_t size) { size_t c = (size - 1) >> SPA_MINBLOCKSHIFT; int flags = zio_exclude_metadata ? KM_NODEBUG : 0; VERIFY3U(c, <, SPA_MAXBLOCKSIZE >> SPA_MINBLOCKSHIFT); - if (zio_use_uma) { - return (kmem_cache_alloc(zio_buf_cache[c], - canwait ? KM_PUSHPAGE : KM_NOSLEEP)); - } else { - return (kmem_alloc(size, - (canwait ? KM_SLEEP : KM_NOSLEEP) | flags)); - } -} - -void * -zio_buf_alloc(size_t size) -{ - return (zio_buf_alloc_impl(size, B_TRUE)); -} - -void * -zio_buf_alloc_nowait(size_t size) -{ - return (zio_buf_alloc_impl(size, B_FALSE)); + if (zio_use_uma) + return (kmem_cache_alloc(zio_buf_cache[c], KM_PUSHPAGE)); + else + return (kmem_alloc(size, KM_SLEEP|flags)); } /* From owner-svn-src-stable-11@freebsd.org Thu Jul 27 10:25:20 2017 Return-Path: Delivered-To: svn-src-stable-11@mailman.ysv.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by mailman.ysv.freebsd.org (Postfix) with ESMTP id 2FF3CDC7135; Thu, 27 Jul 2017 10:25:20 +0000 (UTC) (envelope-from mav@FreeBSD.org) Received: from repo.freebsd.org (repo.freebsd.org [IPv6:2610:1c1:1:6068::e6a:0]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (Client did not present a certificate) by mx1.freebsd.org (Postfix) with ESMTPS id DB3B870173; Thu, 27 Jul 2017 10:25:19 +0000 (UTC) (envelope-from mav@FreeBSD.org) Received: from repo.freebsd.org ([127.0.1.37]) by repo.freebsd.org (8.15.2/8.15.2) with ESMTP id v6RAPJ5T015428; Thu, 27 Jul 2017 10:25:19 GMT (envelope-from mav@FreeBSD.org) Received: (from mav@localhost) by repo.freebsd.org (8.15.2/8.15.2/Submit) id v6RAPIAj015418; Thu, 27 Jul 2017 10:25:18 GMT (envelope-from mav@FreeBSD.org) Message-Id: <201707271025.v6RAPIAj015418@repo.freebsd.org> X-Authentication-Warning: repo.freebsd.org: mav set sender to mav@FreeBSD.org using -f From: Alexander Motin Date: Thu, 27 Jul 2017 10:25:18 +0000 (UTC) To: src-committers@freebsd.org, svn-src-all@freebsd.org, svn-src-stable@freebsd.org, svn-src-stable-11@freebsd.org Subject: svn commit: r321610 - in stable/11: cddl/contrib/opensolaris/cmd/zdb cddl/contrib/opensolaris/cmd/ztest cddl/contrib/opensolaris/lib/libzfs/common sys/cddl/contrib/opensolaris/common/zfs sys/cddl/c... X-SVN-Group: stable-11 X-SVN-Commit-Author: mav X-SVN-Commit-Paths: in stable/11: cddl/contrib/opensolaris/cmd/zdb cddl/contrib/opensolaris/cmd/ztest cddl/contrib/opensolaris/lib/libzfs/common sys/cddl/contrib/opensolaris/common/zfs sys/cddl/contrib/opensolaris/uts/co... X-SVN-Commit-Revision: 321610 X-SVN-Commit-Repository: base MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit X-BeenThere: svn-src-stable-11@freebsd.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: SVN commit messages for only the 11-stable src tree List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 27 Jul 2017 10:25:20 -0000 Author: mav Date: Thu Jul 27 10:25:18 2017 New Revision: 321610 URL: https://svnweb.freebsd.org/changeset/base/321610 Log: MFC r320156, r320185, r320186, r320262, r320452, r321111: MFV r318946: 8021 ARC buf data scatter-ization illumos/illumos-gate@770499e185d15678ccb0be57ebc626ad18d93383 https://github.com/illumos/illumos-gate/commit/770499e185d15678ccb0be57ebc626ad1 8d93383 https://www.illumos.org/issues/8021 The ARC buf data project (known simply as "ABD" since its genesis in the ZoL community) changes the way the ARC allocates `b_pdata` memory from using linea r `void *` buffers to using scatter/gather lists of fixed-size 1KB chunks. This improves ZFS's performance by helping to defragment the address space occupied by the ARC, in particular for cases where compressed ARC is enabled. It could also ease future work to allocate pages directly from `segkpm` for minimal- overhead memory allocations, bypassing the `kmem` subsystem. This is essentially the same change as the one which recently landed in ZFS on Linux, although they made some platform-specific changes while adapting this work to their codebase: 1. Implemented the equivalent of the `segkpm` suggestion for future work mentioned above to bypass issues that they've had with the Linux kernel memory allocator. 2. Changed the internal representation of the ABD's scatter/gather list so it could be used to pass I/O directly into Linux block device drivers. (This feature is not available in the illumos block device interface yet.) FreeBSD notes: - the actual (default) chunk size is 4KB (despite the text above saying 1KB) - we can try to reimplement ABDs, so that they are not permanently mapped into the KVA unless explicitly requested, especially on platforms with scarce KVA - we can try to use unmapped I/O and avoid intermediate allocation of a linear, virtual memory mapped buffer - we can try to avoid extra data copying by referring to chunks / pages in the original ABD Reviewed by: Matthew Ahrens Reviewed by: George Wilson Reviewed by: Paul Dagnelie Reviewed by: John Kennedy Reviewed by: Prakash Surya Reviewed by: Prashanth Sreenivasa Reviewed by: Pavel Zakharov Reviewed by: Chris Williamson Approved by: Richard Lowe Author: Dan Kimmel Added: stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/abd.c - copied unchanged from r320156, head/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/abd.c stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/sys/abd.h - copied unchanged from r320156, head/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/sys/abd.h Modified: stable/11/cddl/contrib/opensolaris/cmd/zdb/zdb.c stable/11/cddl/contrib/opensolaris/cmd/zdb/zdb_il.c stable/11/cddl/contrib/opensolaris/cmd/ztest/ztest.c stable/11/cddl/contrib/opensolaris/lib/libzfs/common/libzfs_sendrecv.c stable/11/sys/cddl/contrib/opensolaris/common/zfs/zfs_fletcher.c stable/11/sys/cddl/contrib/opensolaris/common/zfs/zfs_fletcher.h stable/11/sys/cddl/contrib/opensolaris/uts/common/Makefile.files stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/arc.c stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/blkptr.c stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/dbuf.c stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/ddt.c stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/dmu.c stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/dmu_send.c stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/dsl_scan.c stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/edonr_zfs.c stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/lz4.c stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/sha256.c stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/skein_zfs.c stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/spa.c stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/sys/ddt.h stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/sys/spa.h stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/sys/vdev_impl.h stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/sys/zio.h stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/sys/zio_checksum.h stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/sys/zio_compress.h stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/vdev.c stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/vdev_cache.c stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/vdev_disk.c stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/vdev_file.c stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/vdev_geom.c stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/vdev_label.c stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/vdev_mirror.c stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/vdev_queue.c stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/vdev_raidz.c stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/zil.c stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/zio.c stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/zio_checksum.c stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/zio_compress.c stable/11/sys/conf/files Directory Properties: stable/11/ (props changed) Modified: stable/11/cddl/contrib/opensolaris/cmd/zdb/zdb.c ============================================================================== --- stable/11/cddl/contrib/opensolaris/cmd/zdb/zdb.c Thu Jul 27 10:19:13 2017 (r321609) +++ stable/11/cddl/contrib/opensolaris/cmd/zdb/zdb.c Thu Jul 27 10:25:18 2017 (r321610) @@ -59,6 +59,7 @@ #include #include #include +#include #include #undef verify #include @@ -2410,7 +2411,7 @@ zdb_blkptr_done(zio_t *zio) zdb_cb_t *zcb = zio->io_private; zbookmark_phys_t *zb = &zio->io_bookmark; - zio_data_buf_free(zio->io_data, zio->io_size); + abd_free(zio->io_abd); mutex_enter(&spa->spa_scrub_lock); spa->spa_scrub_inflight--; @@ -2477,7 +2478,7 @@ zdb_blkptr_cb(spa_t *spa, zilog_t *zilog, const blkptr if (!BP_IS_EMBEDDED(bp) && (dump_opt['c'] > 1 || (dump_opt['c'] && is_metadata))) { size_t size = BP_GET_PSIZE(bp); - void *data = zio_data_buf_alloc(size); + abd_t *abd = abd_alloc(size, B_FALSE); int flags = ZIO_FLAG_CANFAIL | ZIO_FLAG_SCRUB | ZIO_FLAG_RAW; /* If it's an intent log block, failure is expected. */ @@ -2490,7 +2491,7 @@ zdb_blkptr_cb(spa_t *spa, zilog_t *zilog, const blkptr spa->spa_scrub_inflight++; mutex_exit(&spa->spa_scrub_lock); - zio_nowait(zio_read(NULL, spa, bp, data, size, + zio_nowait(zio_read(NULL, spa, bp, abd, size, zdb_blkptr_done, zcb, ZIO_PRIORITY_ASYNC_READ, flags, zb)); } @@ -3270,6 +3271,13 @@ name: return (NULL); } +/* ARGSUSED */ +static int +random_get_pseudo_bytes_cb(void *buf, size_t len, void *unused) +{ + return (random_get_pseudo_bytes(buf, len)); +} + /* * Read a block from a pool and print it out. The syntax of the * block descriptor is: @@ -3301,7 +3309,8 @@ zdb_read_block(char *thing, spa_t *spa) uint64_t offset = 0, size = 0, psize = 0, lsize = 0, blkptr_offset = 0; zio_t *zio; vdev_t *vd; - void *pbuf, *lbuf, *buf; + abd_t *pabd; + void *lbuf, *buf; char *s, *p, *dup, *vdev, *flagstr; int i, error; @@ -3373,7 +3382,7 @@ zdb_read_block(char *thing, spa_t *spa) psize = size; lsize = size; - pbuf = umem_alloc(SPA_MAXBLOCKSIZE, UMEM_NOFAIL); + pabd = abd_alloc_linear(SPA_MAXBLOCKSIZE, B_FALSE); lbuf = umem_alloc(SPA_MAXBLOCKSIZE, UMEM_NOFAIL); BP_ZERO(bp); @@ -3401,15 +3410,15 @@ zdb_read_block(char *thing, spa_t *spa) /* * Treat this as a normal block read. */ - zio_nowait(zio_read(zio, spa, bp, pbuf, psize, NULL, NULL, + zio_nowait(zio_read(zio, spa, bp, pabd, psize, NULL, NULL, ZIO_PRIORITY_SYNC_READ, ZIO_FLAG_CANFAIL | ZIO_FLAG_RAW, NULL)); } else { /* * Treat this as a vdev child I/O. */ - zio_nowait(zio_vdev_child_io(zio, bp, vd, offset, pbuf, psize, - ZIO_TYPE_READ, ZIO_PRIORITY_SYNC_READ, + zio_nowait(zio_vdev_child_io(zio, bp, vd, offset, pabd, + psize, ZIO_TYPE_READ, ZIO_PRIORITY_SYNC_READ, ZIO_FLAG_DONT_CACHE | ZIO_FLAG_DONT_QUEUE | ZIO_FLAG_DONT_PROPAGATE | ZIO_FLAG_DONT_RETRY | ZIO_FLAG_CANFAIL | ZIO_FLAG_RAW, NULL, NULL)); @@ -3432,21 +3441,21 @@ zdb_read_block(char *thing, spa_t *spa) void *pbuf2 = umem_alloc(SPA_MAXBLOCKSIZE, UMEM_NOFAIL); void *lbuf2 = umem_alloc(SPA_MAXBLOCKSIZE, UMEM_NOFAIL); - bcopy(pbuf, pbuf2, psize); + abd_copy_to_buf(pbuf2, pabd, psize); - VERIFY(random_get_pseudo_bytes((uint8_t *)pbuf + psize, - SPA_MAXBLOCKSIZE - psize) == 0); + VERIFY0(abd_iterate_func(pabd, psize, SPA_MAXBLOCKSIZE - psize, + random_get_pseudo_bytes_cb, NULL)); - VERIFY(random_get_pseudo_bytes((uint8_t *)pbuf2 + psize, - SPA_MAXBLOCKSIZE - psize) == 0); + VERIFY0(random_get_pseudo_bytes((uint8_t *)pbuf2 + psize, + SPA_MAXBLOCKSIZE - psize)); for (lsize = SPA_MAXBLOCKSIZE; lsize > psize; lsize -= SPA_MINBLOCKSIZE) { for (c = 0; c < ZIO_COMPRESS_FUNCTIONS; c++) { - if (zio_decompress_data(c, pbuf, lbuf, - psize, lsize) == 0 && - zio_decompress_data(c, pbuf2, lbuf2, - psize, lsize) == 0 && + if (zio_decompress_data(c, pabd, + lbuf, psize, lsize) == 0 && + zio_decompress_data_buf(c, pbuf2, + lbuf2, psize, lsize) == 0 && bcmp(lbuf, lbuf2, lsize) == 0) break; } @@ -3465,7 +3474,7 @@ zdb_read_block(char *thing, spa_t *spa) buf = lbuf; size = lsize; } else { - buf = pbuf; + buf = abd_to_buf(pabd); size = psize; } @@ -3483,7 +3492,7 @@ zdb_read_block(char *thing, spa_t *spa) zdb_dump_block(thing, buf, size, flags); out: - umem_free(pbuf, SPA_MAXBLOCKSIZE); + abd_free(pabd); umem_free(lbuf, SPA_MAXBLOCKSIZE); free(dup); } Modified: stable/11/cddl/contrib/opensolaris/cmd/zdb/zdb_il.c ============================================================================== --- stable/11/cddl/contrib/opensolaris/cmd/zdb/zdb_il.c Thu Jul 27 10:19:13 2017 (r321609) +++ stable/11/cddl/contrib/opensolaris/cmd/zdb/zdb_il.c Thu Jul 27 10:25:18 2017 (r321610) @@ -24,7 +24,7 @@ */ /* - * Copyright (c) 2013, 2014 by Delphix. All rights reserved. + * Copyright (c) 2013, 2016 by Delphix. All rights reserved. */ /* @@ -41,6 +41,7 @@ #include #include #include +#include extern uint8_t dump_opt[256]; @@ -117,13 +118,27 @@ zil_prt_rec_rename(zilog_t *zilog, int txtype, lr_rena } /* ARGSUSED */ +static int +zil_prt_rec_write_cb(void *data, size_t len, void *unused) +{ + char *cdata = data; + for (int i = 0; i < len; i++) { + if (isprint(*cdata)) + (void) printf("%c ", *cdata); + else + (void) printf("%2X", *cdata); + cdata++; + } + return (0); +} + +/* ARGSUSED */ static void zil_prt_rec_write(zilog_t *zilog, int txtype, lr_write_t *lr) { - char *data, *dlimit; + abd_t *data; blkptr_t *bp = &lr->lr_blkptr; zbookmark_phys_t zb; - char buf[SPA_MAXBLOCKSIZE]; int verbose = MAX(dump_opt['d'], dump_opt['i']); int error; @@ -144,7 +159,6 @@ zil_prt_rec_write(zilog_t *zilog, int txtype, lr_write if (BP_IS_HOLE(bp)) { (void) printf("\t\t\tLSIZE 0x%llx\n", (u_longlong_t)BP_GET_LSIZE(bp)); - bzero(buf, sizeof (buf)); (void) printf("%s\n", prefix); return; } @@ -157,28 +171,26 @@ zil_prt_rec_write(zilog_t *zilog, int txtype, lr_write lr->lr_foid, ZB_ZIL_LEVEL, lr->lr_offset / BP_GET_LSIZE(bp)); + data = abd_alloc(BP_GET_LSIZE(bp), B_FALSE); error = zio_wait(zio_read(NULL, zilog->zl_spa, - bp, buf, BP_GET_LSIZE(bp), NULL, NULL, + bp, data, BP_GET_LSIZE(bp), NULL, NULL, ZIO_PRIORITY_SYNC_READ, ZIO_FLAG_CANFAIL, &zb)); if (error) - return; - data = buf; + goto out; } else { - data = (char *)(lr + 1); + /* data is stored after the end of the lr_write record */ + data = abd_alloc(lr->lr_length, B_FALSE); + abd_copy_from_buf(data, lr + 1, lr->lr_length); } - dlimit = data + MIN(lr->lr_length, - (verbose < 6 ? 20 : SPA_MAXBLOCKSIZE)); - (void) printf("%s", prefix); - while (data < dlimit) { - if (isprint(*data)) - (void) printf("%c ", *data); - else - (void) printf("%2X", *data); - data++; - } + (void) abd_iterate_func(data, + 0, MIN(lr->lr_length, (verbose < 6 ? 20 : SPA_MAXBLOCKSIZE)), + zil_prt_rec_write_cb, NULL); (void) printf("\n"); + +out: + abd_free(data); } /* ARGSUSED */ Modified: stable/11/cddl/contrib/opensolaris/cmd/ztest/ztest.c ============================================================================== --- stable/11/cddl/contrib/opensolaris/cmd/ztest/ztest.c Thu Jul 27 10:19:13 2017 (r321609) +++ stable/11/cddl/contrib/opensolaris/cmd/ztest/ztest.c Thu Jul 27 10:25:18 2017 (r321610) @@ -112,6 +112,7 @@ #include #include #include +#include #include #include #include @@ -190,6 +191,7 @@ extern uint64_t metaslab_df_alloc_threshold; extern uint64_t zfs_deadman_synctime_ms; extern int metaslab_preload_limit; extern boolean_t zfs_compressed_arc_enabled; +extern boolean_t zfs_abd_scatter_enabled; static ztest_shared_opts_t *ztest_shared_opts; static ztest_shared_opts_t ztest_opts; @@ -5042,7 +5044,7 @@ ztest_ddt_repair(ztest_ds_t *zd, uint64_t id) enum zio_checksum checksum = spa_dedup_checksum(spa); dmu_buf_t *db; dmu_tx_t *tx; - void *buf; + abd_t *abd; blkptr_t blk; int copies = 2 * ZIO_DEDUPDITTO_MIN; @@ -5122,14 +5124,14 @@ ztest_ddt_repair(ztest_ds_t *zd, uint64_t id) * Damage the block. Dedup-ditto will save us when we read it later. */ psize = BP_GET_PSIZE(&blk); - buf = zio_buf_alloc(psize); - ztest_pattern_set(buf, psize, ~pattern); + abd = abd_alloc_linear(psize, B_TRUE); + ztest_pattern_set(abd_to_buf(abd), psize, ~pattern); (void) zio_wait(zio_rewrite(NULL, spa, 0, &blk, - buf, psize, NULL, NULL, ZIO_PRIORITY_SYNC_WRITE, + abd, psize, NULL, NULL, ZIO_PRIORITY_SYNC_WRITE, ZIO_FLAG_CANFAIL | ZIO_FLAG_INDUCE_DAMAGE, NULL)); - zio_buf_free(buf, psize); + abd_free(abd); (void) rw_unlock(&ztest_name_lock); } @@ -5413,6 +5415,12 @@ ztest_resume_thread(void *arg) */ if (ztest_random(10) == 0) zfs_compressed_arc_enabled = ztest_random(2); + + /* + * Periodically change the zfs_abd_scatter_enabled setting. + */ + if (ztest_random(10) == 0) + zfs_abd_scatter_enabled = ztest_random(2); } return (NULL); } Modified: stable/11/cddl/contrib/opensolaris/lib/libzfs/common/libzfs_sendrecv.c ============================================================================== --- stable/11/cddl/contrib/opensolaris/lib/libzfs/common/libzfs_sendrecv.c Thu Jul 27 10:19:13 2017 (r321609) +++ stable/11/cddl/contrib/opensolaris/lib/libzfs/common/libzfs_sendrecv.c Thu Jul 27 10:25:18 2017 (r321610) @@ -199,19 +199,19 @@ dump_record(dmu_replay_record_t *drr, void *payload, i { ASSERT3U(offsetof(dmu_replay_record_t, drr_u.drr_checksum.drr_checksum), ==, sizeof (dmu_replay_record_t) - sizeof (zio_cksum_t)); - fletcher_4_incremental_native(drr, + (void) fletcher_4_incremental_native(drr, offsetof(dmu_replay_record_t, drr_u.drr_checksum.drr_checksum), zc); if (drr->drr_type != DRR_BEGIN) { ASSERT(ZIO_CHECKSUM_IS_ZERO(&drr->drr_u. drr_checksum.drr_checksum)); drr->drr_u.drr_checksum.drr_checksum = *zc; } - fletcher_4_incremental_native(&drr->drr_u.drr_checksum.drr_checksum, - sizeof (zio_cksum_t), zc); + (void) fletcher_4_incremental_native( + &drr->drr_u.drr_checksum.drr_checksum, sizeof (zio_cksum_t), zc); if (write(outfd, drr, sizeof (*drr)) == -1) return (errno); if (payload_len != 0) { - fletcher_4_incremental_native(payload, payload_len, zc); + (void) fletcher_4_incremental_native(payload, payload_len, zc); if (write(outfd, payload, payload_len) == -1) return (errno); } @@ -2096,9 +2096,9 @@ recv_read(libzfs_handle_t *hdl, int fd, void *buf, int if (zc) { if (byteswap) - fletcher_4_incremental_byteswap(buf, ilen, zc); + (void) fletcher_4_incremental_byteswap(buf, ilen, zc); else - fletcher_4_incremental_native(buf, ilen, zc); + (void) fletcher_4_incremental_native(buf, ilen, zc); } return (0); } @@ -3688,7 +3688,8 @@ zfs_receive_impl(libzfs_handle_t *hdl, const char *tos * recv_read() above; do it again correctly. */ bzero(&zcksum, sizeof (zio_cksum_t)); - fletcher_4_incremental_byteswap(&drr, sizeof (drr), &zcksum); + (void) fletcher_4_incremental_byteswap(&drr, + sizeof (drr), &zcksum); flags->byteswap = B_TRUE; drr.drr_type = BSWAP_32(drr.drr_type); Modified: stable/11/sys/cddl/contrib/opensolaris/common/zfs/zfs_fletcher.c ============================================================================== --- stable/11/sys/cddl/contrib/opensolaris/common/zfs/zfs_fletcher.c Thu Jul 27 10:19:13 2017 (r321609) +++ stable/11/sys/cddl/contrib/opensolaris/common/zfs/zfs_fletcher.c Thu Jul 27 10:25:18 2017 (r321610) @@ -24,6 +24,7 @@ */ /* * Copyright 2013 Saso Kiselkov. All rights reserved. + * Copyright (c) 2016 by Delphix. All rights reserved. */ /* @@ -133,17 +134,29 @@ #include #include #include +#include -/*ARGSUSED*/ void -fletcher_2_native(const void *buf, uint64_t size, - const void *ctx_template, zio_cksum_t *zcp) +fletcher_init(zio_cksum_t *zcp) { + ZIO_SET_CHECKSUM(zcp, 0, 0, 0, 0); +} + +int +fletcher_2_incremental_native(void *buf, size_t size, void *data) +{ + zio_cksum_t *zcp = data; + const uint64_t *ip = buf; const uint64_t *ipend = ip + (size / sizeof (uint64_t)); uint64_t a0, b0, a1, b1; - for (a0 = b0 = a1 = b1 = 0; ip < ipend; ip += 2) { + a0 = zcp->zc_word[0]; + a1 = zcp->zc_word[1]; + b0 = zcp->zc_word[2]; + b1 = zcp->zc_word[3]; + + for (; ip < ipend; ip += 2) { a0 += ip[0]; a1 += ip[1]; b0 += a0; @@ -151,18 +164,33 @@ fletcher_2_native(const void *buf, uint64_t size, } ZIO_SET_CHECKSUM(zcp, a0, a1, b0, b1); + return (0); } /*ARGSUSED*/ void -fletcher_2_byteswap(const void *buf, uint64_t size, +fletcher_2_native(const void *buf, size_t size, const void *ctx_template, zio_cksum_t *zcp) { + fletcher_init(zcp); + (void) fletcher_2_incremental_native((void *) buf, size, zcp); +} + +int +fletcher_2_incremental_byteswap(void *buf, size_t size, void *data) +{ + zio_cksum_t *zcp = data; + const uint64_t *ip = buf; const uint64_t *ipend = ip + (size / sizeof (uint64_t)); uint64_t a0, b0, a1, b1; - for (a0 = b0 = a1 = b1 = 0; ip < ipend; ip += 2) { + a0 = zcp->zc_word[0]; + a1 = zcp->zc_word[1]; + b0 = zcp->zc_word[2]; + b1 = zcp->zc_word[3]; + + for (; ip < ipend; ip += 2) { a0 += BSWAP_64(ip[0]); a1 += BSWAP_64(ip[1]); b0 += a0; @@ -170,50 +198,23 @@ fletcher_2_byteswap(const void *buf, uint64_t size, } ZIO_SET_CHECKSUM(zcp, a0, a1, b0, b1); + return (0); } /*ARGSUSED*/ void -fletcher_4_native(const void *buf, uint64_t size, +fletcher_2_byteswap(const void *buf, size_t size, const void *ctx_template, zio_cksum_t *zcp) { - const uint32_t *ip = buf; - const uint32_t *ipend = ip + (size / sizeof (uint32_t)); - uint64_t a, b, c, d; - - for (a = b = c = d = 0; ip < ipend; ip++) { - a += ip[0]; - b += a; - c += b; - d += c; - } - - ZIO_SET_CHECKSUM(zcp, a, b, c, d); + fletcher_init(zcp); + (void) fletcher_2_incremental_byteswap((void *) buf, size, zcp); } -/*ARGSUSED*/ -void -fletcher_4_byteswap(const void *buf, uint64_t size, - const void *ctx_template, zio_cksum_t *zcp) +int +fletcher_4_incremental_native(void *buf, size_t size, void *data) { - const uint32_t *ip = buf; - const uint32_t *ipend = ip + (size / sizeof (uint32_t)); - uint64_t a, b, c, d; + zio_cksum_t *zcp = data; - for (a = b = c = d = 0; ip < ipend; ip++) { - a += BSWAP_32(ip[0]); - b += a; - c += b; - d += c; - } - - ZIO_SET_CHECKSUM(zcp, a, b, c, d); -} - -void -fletcher_4_incremental_native(const void *buf, uint64_t size, - zio_cksum_t *zcp) -{ const uint32_t *ip = buf; const uint32_t *ipend = ip + (size / sizeof (uint32_t)); uint64_t a, b, c, d; @@ -231,12 +232,23 @@ fletcher_4_incremental_native(const void *buf, uint64_ } ZIO_SET_CHECKSUM(zcp, a, b, c, d); + return (0); } +/*ARGSUSED*/ void -fletcher_4_incremental_byteswap(const void *buf, uint64_t size, - zio_cksum_t *zcp) +fletcher_4_native(const void *buf, size_t size, + const void *ctx_template, zio_cksum_t *zcp) { + fletcher_init(zcp); + (void) fletcher_4_incremental_native((void *) buf, size, zcp); +} + +int +fletcher_4_incremental_byteswap(void *buf, size_t size, void *data) +{ + zio_cksum_t *zcp = data; + const uint32_t *ip = buf; const uint32_t *ipend = ip + (size / sizeof (uint32_t)); uint64_t a, b, c, d; @@ -254,4 +266,14 @@ fletcher_4_incremental_byteswap(const void *buf, uint6 } ZIO_SET_CHECKSUM(zcp, a, b, c, d); + return (0); +} + +/*ARGSUSED*/ +void +fletcher_4_byteswap(const void *buf, size_t size, + const void *ctx_template, zio_cksum_t *zcp) +{ + fletcher_init(zcp); + (void) fletcher_4_incremental_byteswap((void *) buf, size, zcp); } Modified: stable/11/sys/cddl/contrib/opensolaris/common/zfs/zfs_fletcher.h ============================================================================== --- stable/11/sys/cddl/contrib/opensolaris/common/zfs/zfs_fletcher.h Thu Jul 27 10:19:13 2017 (r321609) +++ stable/11/sys/cddl/contrib/opensolaris/common/zfs/zfs_fletcher.h Thu Jul 27 10:25:18 2017 (r321610) @@ -24,6 +24,7 @@ */ /* * Copyright 2013 Saso Kiselkov. All rights reserved. + * Copyright (c) 2016 by Delphix. All rights reserved. */ #ifndef _ZFS_FLETCHER_H @@ -40,12 +41,15 @@ extern "C" { * fletcher checksum functions */ -void fletcher_2_native(const void *, uint64_t, const void *, zio_cksum_t *); -void fletcher_2_byteswap(const void *, uint64_t, const void *, zio_cksum_t *); -void fletcher_4_native(const void *, uint64_t, const void *, zio_cksum_t *); -void fletcher_4_byteswap(const void *, uint64_t, const void *, zio_cksum_t *); -void fletcher_4_incremental_native(const void *, uint64_t, zio_cksum_t *); -void fletcher_4_incremental_byteswap(const void *, uint64_t, zio_cksum_t *); +void fletcher_init(zio_cksum_t *); +void fletcher_2_native(const void *, size_t, const void *, zio_cksum_t *); +void fletcher_2_byteswap(const void *, size_t, const void *, zio_cksum_t *); +int fletcher_2_incremental_native(void *, size_t, void *); +int fletcher_2_incremental_byteswap(void *, size_t, void *); +void fletcher_4_native(const void *, size_t, const void *, zio_cksum_t *); +void fletcher_4_byteswap(const void *, size_t, const void *, zio_cksum_t *); +int fletcher_4_incremental_native(void *, size_t, void *); +int fletcher_4_incremental_byteswap(void *, size_t, void *); #ifdef __cplusplus } Modified: stable/11/sys/cddl/contrib/opensolaris/uts/common/Makefile.files ============================================================================== --- stable/11/sys/cddl/contrib/opensolaris/uts/common/Makefile.files Thu Jul 27 10:19:13 2017 (r321609) +++ stable/11/sys/cddl/contrib/opensolaris/uts/common/Makefile.files Thu Jul 27 10:25:18 2017 (r321610) @@ -33,6 +33,7 @@ # common to all SunOS systems. ZFS_COMMON_OBJS += \ + abd.o \ arc.o \ bplist.o \ blkptr.o \ Copied: stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/abd.c (from r320156, head/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/abd.c) ============================================================================== --- /dev/null 00:00:00 1970 (empty, because file is newly added) +++ stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/abd.c Thu Jul 27 10:25:18 2017 (r321610, copy of r320156, head/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/abd.c) @@ -0,0 +1,944 @@ +/* + * This file and its contents are supplied under the terms of the + * Common Development and Distribution License ("CDDL"), version 1.0. + * You may only use this file in accordance with the terms of version + * 1.0 of the CDDL. + * + * A full copy of the text of the CDDL should have accompanied this + * source. A copy of the CDDL is also available via the Internet at + * http://www.illumos.org/license/CDDL. + */ + +/* + * Copyright (c) 2014 by Chunwei Chen. All rights reserved. + * Copyright (c) 2016 by Delphix. All rights reserved. + */ + +/* + * ARC buffer data (ABD). + * + * ABDs are an abstract data structure for the ARC which can use two + * different ways of storing the underlying data: + * + * (a) Linear buffer. In this case, all the data in the ABD is stored in one + * contiguous buffer in memory (from a zio_[data_]buf_* kmem cache). + * + * +-------------------+ + * | ABD (linear) | + * | abd_flags = ... | + * | abd_size = ... | +--------------------------------+ + * | abd_buf ------------->| raw buffer of size abd_size | + * +-------------------+ +--------------------------------+ + * no abd_chunks + * + * (b) Scattered buffer. In this case, the data in the ABD is split into + * equal-sized chunks (from the abd_chunk_cache kmem_cache), with pointers + * to the chunks recorded in an array at the end of the ABD structure. + * + * +-------------------+ + * | ABD (scattered) | + * | abd_flags = ... | + * | abd_size = ... | + * | abd_offset = 0 | +-----------+ + * | abd_chunks[0] ----------------------------->| chunk 0 | + * | abd_chunks[1] ---------------------+ +-----------+ + * | ... | | +-----------+ + * | abd_chunks[N-1] ---------+ +------->| chunk 1 | + * +-------------------+ | +-----------+ + * | ... + * | +-----------+ + * +----------------->| chunk N-1 | + * +-----------+ + * + * Using a large proportion of scattered ABDs decreases ARC fragmentation since + * when we are at the limit of allocatable space, using equal-size chunks will + * allow us to quickly reclaim enough space for a new large allocation (assuming + * it is also scattered). + * + * In addition to directly allocating a linear or scattered ABD, it is also + * possible to create an ABD by requesting the "sub-ABD" starting at an offset + * within an existing ABD. In linear buffers this is simple (set abd_buf of + * the new ABD to the starting point within the original raw buffer), but + * scattered ABDs are a little more complex. The new ABD makes a copy of the + * relevant abd_chunks pointers (but not the underlying data). However, to + * provide arbitrary rather than only chunk-aligned starting offsets, it also + * tracks an abd_offset field which represents the starting point of the data + * within the first chunk in abd_chunks. For both linear and scattered ABDs, + * creating an offset ABD marks the original ABD as the offset's parent, and the + * original ABD's abd_children refcount is incremented. This data allows us to + * ensure the root ABD isn't deleted before its children. + * + * Most consumers should never need to know what type of ABD they're using -- + * the ABD public API ensures that it's possible to transparently switch from + * using a linear ABD to a scattered one when doing so would be beneficial. + * + * If you need to use the data within an ABD directly, if you know it's linear + * (because you allocated it) you can use abd_to_buf() to access the underlying + * raw buffer. Otherwise, you should use one of the abd_borrow_buf* functions + * which will allocate a raw buffer if necessary. Use the abd_return_buf* + * functions to return any raw buffers that are no longer necessary when you're + * done using them. + * + * There are a variety of ABD APIs that implement basic buffer operations: + * compare, copy, read, write, and fill with zeroes. If you need a custom + * function which progressively accesses the whole ABD, use the abd_iterate_* + * functions. + */ + +#include +#include +#include +#include +#include + +typedef struct abd_stats { + kstat_named_t abdstat_struct_size; + kstat_named_t abdstat_scatter_cnt; + kstat_named_t abdstat_scatter_data_size; + kstat_named_t abdstat_scatter_chunk_waste; + kstat_named_t abdstat_linear_cnt; + kstat_named_t abdstat_linear_data_size; +} abd_stats_t; + +static abd_stats_t abd_stats = { + /* Amount of memory occupied by all of the abd_t struct allocations */ + { "struct_size", KSTAT_DATA_UINT64 }, + /* + * The number of scatter ABDs which are currently allocated, excluding + * ABDs which don't own their data (for instance the ones which were + * allocated through abd_get_offset()). + */ + { "scatter_cnt", KSTAT_DATA_UINT64 }, + /* Amount of data stored in all scatter ABDs tracked by scatter_cnt */ + { "scatter_data_size", KSTAT_DATA_UINT64 }, + /* + * The amount of space wasted at the end of the last chunk across all + * scatter ABDs tracked by scatter_cnt. + */ + { "scatter_chunk_waste", KSTAT_DATA_UINT64 }, + /* + * The number of linear ABDs which are currently allocated, excluding + * ABDs which don't own their data (for instance the ones which were + * allocated through abd_get_offset() and abd_get_from_buf()). If an + * ABD takes ownership of its buf then it will become tracked. + */ + { "linear_cnt", KSTAT_DATA_UINT64 }, + /* Amount of data stored in all linear ABDs tracked by linear_cnt */ + { "linear_data_size", KSTAT_DATA_UINT64 }, +}; + +#define ABDSTAT(stat) (abd_stats.stat.value.ui64) +#define ABDSTAT_INCR(stat, val) \ + atomic_add_64(&abd_stats.stat.value.ui64, (val)) +#define ABDSTAT_BUMP(stat) ABDSTAT_INCR(stat, 1) +#define ABDSTAT_BUMPDOWN(stat) ABDSTAT_INCR(stat, -1) + +/* + * It is possible to make all future ABDs be linear by setting this to B_FALSE. + * Otherwise, ABDs are allocated scattered by default unless the caller uses + * abd_alloc_linear(). + */ +boolean_t zfs_abd_scatter_enabled = B_TRUE; + +/* + * The size of the chunks ABD allocates. Because the sizes allocated from the + * kmem_cache can't change, this tunable can only be modified at boot. Changing + * it at runtime would cause ABD iteration to work incorrectly for ABDs which + * were allocated with the old size, so a safeguard has been put in place which + * will cause the machine to panic if you change it and try to access the data + * within a scattered ABD. + */ +size_t zfs_abd_chunk_size = 4096; + +#ifdef _KERNEL +extern vmem_t *zio_alloc_arena; +#endif + +kmem_cache_t *abd_chunk_cache; +static kstat_t *abd_ksp; + +static void * +abd_alloc_chunk() +{ + void *c = kmem_cache_alloc(abd_chunk_cache, KM_PUSHPAGE); + ASSERT3P(c, !=, NULL); + return (c); +} + +static void +abd_free_chunk(void *c) +{ + kmem_cache_free(abd_chunk_cache, c); +} + +void +abd_init(void) +{ +#ifdef illumos + vmem_t *data_alloc_arena = NULL; + +#ifdef _KERNEL + data_alloc_arena = zio_alloc_arena; +#endif + + /* + * Since ABD chunks do not appear in crash dumps, we pass KMC_NOTOUCH + * so that no allocator metadata is stored with the buffers. + */ + abd_chunk_cache = kmem_cache_create("abd_chunk", zfs_abd_chunk_size, 0, + NULL, NULL, NULL, NULL, data_alloc_arena, KMC_NOTOUCH); +#else + abd_chunk_cache = kmem_cache_create("abd_chunk", zfs_abd_chunk_size, 0, + NULL, NULL, NULL, NULL, 0, KMC_NOTOUCH | KMC_NODEBUG); +#endif + abd_ksp = kstat_create("zfs", 0, "abdstats", "misc", KSTAT_TYPE_NAMED, + sizeof (abd_stats) / sizeof (kstat_named_t), KSTAT_FLAG_VIRTUAL); + if (abd_ksp != NULL) { + abd_ksp->ks_data = &abd_stats; + kstat_install(abd_ksp); + } +} + +void +abd_fini(void) +{ + if (abd_ksp != NULL) { + kstat_delete(abd_ksp); + abd_ksp = NULL; + } + + kmem_cache_destroy(abd_chunk_cache); + abd_chunk_cache = NULL; +} + +static inline size_t +abd_chunkcnt_for_bytes(size_t size) +{ + return (P2ROUNDUP(size, zfs_abd_chunk_size) / zfs_abd_chunk_size); +} + +static inline size_t +abd_scatter_chunkcnt(abd_t *abd) +{ + ASSERT(!abd_is_linear(abd)); + return (abd_chunkcnt_for_bytes( + abd->abd_u.abd_scatter.abd_offset + abd->abd_size)); +} + +static inline void +abd_verify(abd_t *abd) +{ + ASSERT3U(abd->abd_size, >, 0); + ASSERT3U(abd->abd_size, <=, SPA_MAXBLOCKSIZE); + ASSERT3U(abd->abd_flags, ==, abd->abd_flags & (ABD_FLAG_LINEAR | + ABD_FLAG_OWNER | ABD_FLAG_META)); + IMPLY(abd->abd_parent != NULL, !(abd->abd_flags & ABD_FLAG_OWNER)); + IMPLY(abd->abd_flags & ABD_FLAG_META, abd->abd_flags & ABD_FLAG_OWNER); + if (abd_is_linear(abd)) { + ASSERT3P(abd->abd_u.abd_linear.abd_buf, !=, NULL); + } else { + ASSERT3U(abd->abd_u.abd_scatter.abd_offset, <, + zfs_abd_chunk_size); + size_t n = abd_scatter_chunkcnt(abd); + for (int i = 0; i < n; i++) { + ASSERT3P( + abd->abd_u.abd_scatter.abd_chunks[i], !=, NULL); + } + } +} + +static inline abd_t * +abd_alloc_struct(size_t chunkcnt) +{ + size_t size = offsetof(abd_t, abd_u.abd_scatter.abd_chunks[chunkcnt]); + abd_t *abd = kmem_alloc(size, KM_PUSHPAGE); + ASSERT3P(abd, !=, NULL); + ABDSTAT_INCR(abdstat_struct_size, size); + + return (abd); +} + +static inline void +abd_free_struct(abd_t *abd) +{ + size_t chunkcnt = abd_is_linear(abd) ? 0 : abd_scatter_chunkcnt(abd); + int size = offsetof(abd_t, abd_u.abd_scatter.abd_chunks[chunkcnt]); + kmem_free(abd, size); + ABDSTAT_INCR(abdstat_struct_size, -size); +} + +/* + * Allocate an ABD, along with its own underlying data buffers. Use this if you + * don't care whether the ABD is linear or not. + */ +abd_t * +abd_alloc(size_t size, boolean_t is_metadata) +{ + if (!zfs_abd_scatter_enabled) + return (abd_alloc_linear(size, is_metadata)); + + VERIFY3U(size, <=, SPA_MAXBLOCKSIZE); + + size_t n = abd_chunkcnt_for_bytes(size); + abd_t *abd = abd_alloc_struct(n); + + abd->abd_flags = ABD_FLAG_OWNER; + if (is_metadata) { + abd->abd_flags |= ABD_FLAG_META; + } + abd->abd_size = size; + abd->abd_parent = NULL; + refcount_create(&abd->abd_children); + + abd->abd_u.abd_scatter.abd_offset = 0; + abd->abd_u.abd_scatter.abd_chunk_size = zfs_abd_chunk_size; + + for (int i = 0; i < n; i++) { + void *c = abd_alloc_chunk(); + ASSERT3P(c, !=, NULL); + abd->abd_u.abd_scatter.abd_chunks[i] = c; + } + + ABDSTAT_BUMP(abdstat_scatter_cnt); + ABDSTAT_INCR(abdstat_scatter_data_size, size); + ABDSTAT_INCR(abdstat_scatter_chunk_waste, + n * zfs_abd_chunk_size - size); + + return (abd); +} + +static void +abd_free_scatter(abd_t *abd) +{ + size_t n = abd_scatter_chunkcnt(abd); + for (int i = 0; i < n; i++) { + abd_free_chunk(abd->abd_u.abd_scatter.abd_chunks[i]); + } + + refcount_destroy(&abd->abd_children); + ABDSTAT_BUMPDOWN(abdstat_scatter_cnt); + ABDSTAT_INCR(abdstat_scatter_data_size, -(int)abd->abd_size); + ABDSTAT_INCR(abdstat_scatter_chunk_waste, + abd->abd_size - n * zfs_abd_chunk_size); + + abd_free_struct(abd); +} + +/* + * Allocate an ABD that must be linear, along with its own underlying data + * buffer. Only use this when it would be very annoying to write your ABD + * consumer with a scattered ABD. + */ +abd_t * +abd_alloc_linear(size_t size, boolean_t is_metadata) +{ + abd_t *abd = abd_alloc_struct(0); + + VERIFY3U(size, <=, SPA_MAXBLOCKSIZE); + + abd->abd_flags = ABD_FLAG_LINEAR | ABD_FLAG_OWNER; + if (is_metadata) { + abd->abd_flags |= ABD_FLAG_META; + } + abd->abd_size = size; + abd->abd_parent = NULL; + refcount_create(&abd->abd_children); + + if (is_metadata) { + abd->abd_u.abd_linear.abd_buf = zio_buf_alloc(size); + } else { + abd->abd_u.abd_linear.abd_buf = zio_data_buf_alloc(size); + } + + ABDSTAT_BUMP(abdstat_linear_cnt); + ABDSTAT_INCR(abdstat_linear_data_size, size); + + return (abd); +} + +static void +abd_free_linear(abd_t *abd) +{ + if (abd->abd_flags & ABD_FLAG_META) { + zio_buf_free(abd->abd_u.abd_linear.abd_buf, abd->abd_size); + } else { + zio_data_buf_free(abd->abd_u.abd_linear.abd_buf, abd->abd_size); + } + + refcount_destroy(&abd->abd_children); + ABDSTAT_BUMPDOWN(abdstat_linear_cnt); + ABDSTAT_INCR(abdstat_linear_data_size, -(int)abd->abd_size); + + abd_free_struct(abd); +} + +/* + * Free an ABD. Only use this on ABDs allocated with abd_alloc() or + * abd_alloc_linear(). + */ +void +abd_free(abd_t *abd) +{ + abd_verify(abd); + ASSERT3P(abd->abd_parent, ==, NULL); + ASSERT(abd->abd_flags & ABD_FLAG_OWNER); + if (abd_is_linear(abd)) + abd_free_linear(abd); + else + abd_free_scatter(abd); +} + +/* + * Allocate an ABD of the same format (same metadata flag, same scatterize + * setting) as another ABD. + */ +abd_t * +abd_alloc_sametype(abd_t *sabd, size_t size) +{ + boolean_t is_metadata = (sabd->abd_flags & ABD_FLAG_META) != 0; + if (abd_is_linear(sabd)) { + return (abd_alloc_linear(size, is_metadata)); + } else { + return (abd_alloc(size, is_metadata)); + } +} + +/* + * If we're going to use this ABD for doing I/O using the block layer, the + * consumer of the ABD data doesn't care if it's scattered or not, and we don't + * plan to store this ABD in memory for a long period of time, we should + * allocate the ABD type that requires the least data copying to do the I/O. + * + * Currently this is linear ABDs, however if ldi_strategy() can ever issue I/Os + * using a scatter/gather list we should switch to that and replace this call + * with vanilla abd_alloc(). + */ +abd_t * +abd_alloc_for_io(size_t size, boolean_t is_metadata) +{ + return (abd_alloc_linear(size, is_metadata)); +} + +/* + * Allocate a new ABD to point to offset off of sabd. It shares the underlying + * buffer data with sabd. Use abd_put() to free. sabd must not be freed while + * any derived ABDs exist. + */ +abd_t * +abd_get_offset(abd_t *sabd, size_t off) +{ + abd_t *abd; + + abd_verify(sabd); + ASSERT3U(off, <=, sabd->abd_size); *** DIFF OUTPUT TRUNCATED AT 1000 LINES *** From owner-svn-src-stable-11@freebsd.org Thu Jul 27 10:26:59 2017 Return-Path: Delivered-To: svn-src-stable-11@mailman.ysv.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by mailman.ysv.freebsd.org (Postfix) with ESMTP id 69906DC71C6; Thu, 27 Jul 2017 10:26:59 +0000 (UTC) (envelope-from mav@FreeBSD.org) Received: from repo.freebsd.org (repo.freebsd.org [IPv6:2610:1c1:1:6068::e6a:0]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (Client did not present a certificate) by mx1.freebsd.org (Postfix) with ESMTPS id 45F44702B0; Thu, 27 Jul 2017 10:26:59 +0000 (UTC) (envelope-from mav@FreeBSD.org) Received: from repo.freebsd.org ([127.0.1.37]) by repo.freebsd.org (8.15.2/8.15.2) with ESMTP id v6RAQwhq015528; Thu, 27 Jul 2017 10:26:58 GMT (envelope-from mav@FreeBSD.org) Received: (from mav@localhost) by repo.freebsd.org (8.15.2/8.15.2/Submit) id v6RAQwjB015526; Thu, 27 Jul 2017 10:26:58 GMT (envelope-from mav@FreeBSD.org) Message-Id: <201707271026.v6RAQwjB015526@repo.freebsd.org> X-Authentication-Warning: repo.freebsd.org: mav set sender to mav@FreeBSD.org using -f From: Alexander Motin Date: Thu, 27 Jul 2017 10:26:58 +0000 (UTC) To: src-committers@freebsd.org, svn-src-all@freebsd.org, svn-src-stable@freebsd.org, svn-src-stable-11@freebsd.org Subject: svn commit: r321611 - in stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs: . sys X-SVN-Group: stable-11 X-SVN-Commit-Author: mav X-SVN-Commit-Paths: in stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs: . sys X-SVN-Commit-Revision: 321611 X-SVN-Commit-Repository: base MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit X-BeenThere: svn-src-stable-11@freebsd.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: SVN commit messages for only the 11-stable src tree List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 27 Jul 2017 10:26:59 -0000 Author: mav Date: Thu Jul 27 10:26:58 2017 New Revision: 321611 URL: https://svnweb.freebsd.org/changeset/base/321611 Log: MFC r320237: MFV r318947: 7578 Fix/improve some aspects of ZIL writing. FreeBSD note: this commit removes small differences between what mav committed to FreeBSD in r308782 and what ended up committed to illumos after addressing all review comments. illumos/illumos-gate@c5ee46810f82e8a53d2cc5a487568a573f449039 https://github.com/illumos/illumos-gate/commit/c5ee46810f82e8a53d2cc5a487568a573f449039 https://www.illumos.org/issues/7578 After some ZIL changes 6 years ago zil_slog_limit got partially broken due to zl_itx_list_sz not updated when async itx'es upgraded to sync. Actually because of other changes about that time zl_itx_list_sz is not really required to implement the functionality, so this patch removes some unneeded broken code and variables. Original idea of zil_slog_limit was to reduce chance of SLOG abuse by single heavy logger, that increased latency for other (more latency critical) loggers, by pushing heavy log out into the main pool instead of SLOG. Beside huge latency increase for heavy writers, this implementation caused double write of all data, since the log records were explicitly prepared for SLOG. Since we now have I/O scheduler, I've found it can be much more efficient to reduce priority of heavy logger SLOG writes from ZIO_PRIORITY_SYNC_WRITE to ZIO_PRIORITY_ASYNC_WRITE, while still leave them on SLOG. Existing ZIL implementation had problem with space efficiency when it has to write large chunks of data into log blocks of limited size. In some cases efficiency stopped to almost as low as 50%. In case of ZIL stored on spinning rust, that also reduced log write speed in half, since head had to uselessly fly over allocated but not written areas. This change improves the situation by offloading problematic operations from z*_log_write() to zil_lwb_commit(), which knows real situation of log blocks allocation and can split large requests into pieces much more efficiently. Also as side effect it removes one of two data copy operations done by ZIL code WR_COPIED case. While there, untangle and unify code of z*_log_write() functions. Also zfs_log_write() alike to zvol_log_write() can now handle writes crossing block boundary, that may also improve efficiency if ZPL is made to do that. Reviewed by: Matthew Ahrens Reviewed by: Prakash Surya Reviewed by: Andriy Gapon Reviewed by: Steven Hartland Reviewed by: Brad Lewis Reviewed by: Richard Elling Approved by: Robert Mustacchi Author: Alexander Motin Modified: stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/sys/zil_impl.h stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/zil.c Directory Properties: stable/11/ (props changed) Modified: stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/sys/zil_impl.h ============================================================================== --- stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/sys/zil_impl.h Thu Jul 27 10:25:18 2017 (r321610) +++ stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/sys/zil_impl.h Thu Jul 27 10:26:58 2017 (r321611) @@ -139,10 +139,27 @@ typedef struct zil_bp_node { avl_node_t zn_node; } zil_bp_node_t; +/* + * Maximum amount of write data that can be put into single log block. + */ #define ZIL_MAX_LOG_DATA (SPA_OLD_MAXBLOCKSIZE - sizeof (zil_chain_t) - \ sizeof (lr_write_t)) #define ZIL_MAX_COPIED_DATA \ ((SPA_OLD_MAXBLOCKSIZE - sizeof (zil_chain_t)) / 2 - sizeof (lr_write_t)) + +/* + * Maximum amount of log space we agree to waste to reduce number of + * WR_NEED_COPY chunks to reduce zl_get_data() overhead (~12%). + */ +#define ZIL_MAX_WASTE_SPACE (ZIL_MAX_LOG_DATA / 8) + +/* + * Maximum amount of write data for WR_COPIED. Fall back to WR_NEED_COPY + * as more space efficient if we can't fit at least two log records into + * maximum sized log block. + */ +#define ZIL_MAX_COPIED_DATA ((SPA_OLD_MAXBLOCKSIZE - \ + sizeof (zil_chain_t)) / 2 - sizeof (lr_write_t)) #ifdef __cplusplus } Modified: stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/zil.c ============================================================================== --- stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/zil.c Thu Jul 27 10:25:18 2017 (r321610) +++ stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/zil.c Thu Jul 27 10:26:58 2017 (r321611) @@ -90,12 +90,12 @@ SYSCTL_INT(_vfs_zfs_trim, OID_AUTO, enabled, CTLFLAG_R /* * Limit SLOG write size per commit executed with synchronous priority. - * Any writes above that executed with lower (asynchronous) priority to - * limit potential SLOG device abuse by single active ZIL writer. + * Any writes above that will be executed with lower (asynchronous) priority + * to limit potential SLOG device abuse by single active ZIL writer. */ -uint64_t zil_slog_limit = 768 * 1024; -SYSCTL_QUAD(_vfs_zfs, OID_AUTO, zil_slog_limit, CTLFLAG_RWTUN, - &zil_slog_limit, 0, "Maximal SLOG commit size with sync priority"); +uint64_t zil_slog_bulk = 768 * 1024; +SYSCTL_QUAD(_vfs_zfs, OID_AUTO, zil_slog_bulk, CTLFLAG_RWTUN, + &zil_slog_bulk, 0, "Maximal SLOG commit size with sync priority"); static kmem_cache_t *zil_lwb_cache; @@ -923,7 +923,7 @@ zil_lwb_write_init(zilog_t *zilog, lwb_t *lwb) if (lwb->lwb_zio == NULL) { abd_t *lwb_abd = abd_get_from_buf(lwb->lwb_buf, BP_GET_LSIZE(&lwb->lwb_blk)); - if (zilog->zl_cur_used <= zil_slog_limit || !lwb->lwb_slog) + if (!lwb->lwb_slog || zilog->zl_cur_used <= zil_slog_bulk) prio = ZIO_PRIORITY_SYNC_WRITE; else prio = ZIO_PRIORITY_ASYNC_WRITE; @@ -1068,36 +1068,38 @@ zil_lwb_write_start(zilog_t *zilog, lwb_t *lwb, boolea static lwb_t * zil_lwb_commit(zilog_t *zilog, itx_t *itx, lwb_t *lwb) { - lr_t *lrcb, *lrc = &itx->itx_lr; /* common log record */ - lr_write_t *lrwb, *lrw = (lr_write_t *)lrc; + lr_t *lrcb, *lrc; + lr_write_t *lrwb, *lrw; char *lr_buf; - uint64_t txg = lrc->lrc_txg; - uint64_t reclen = lrc->lrc_reclen; - uint64_t dlen = 0; - uint64_t dnow, lwb_sp; + uint64_t dlen, dnow, lwb_sp, reclen, txg; if (lwb == NULL) return (NULL); ASSERT(lwb->lwb_buf != NULL); - if (lrc->lrc_txtype == TX_WRITE && itx->itx_wr_state == WR_NEED_COPY) + lrc = &itx->itx_lr; /* Common log record inside itx. */ + lrw = (lr_write_t *)lrc; /* Write log record inside itx. */ + if (lrc->lrc_txtype == TX_WRITE && itx->itx_wr_state == WR_NEED_COPY) { dlen = P2ROUNDUP_TYPED( lrw->lr_length, sizeof (uint64_t), uint64_t); - + } else { + dlen = 0; + } + reclen = lrc->lrc_reclen; zilog->zl_cur_used += (reclen + dlen); + txg = lrc->lrc_txg; zil_lwb_write_init(zilog, lwb); cont: /* * If this record won't fit in the current log block, start a new one. - * For WR_NEED_COPY optimize layout for minimal number of chunks, but - * try to keep wasted space withing reasonable range (12%). + * For WR_NEED_COPY optimize layout for minimal number of chunks. */ lwb_sp = lwb->lwb_sz - lwb->lwb_nused; if (reclen > lwb_sp || (reclen + dlen > lwb_sp && - lwb_sp < ZIL_MAX_LOG_DATA / 8 && (dlen % ZIL_MAX_LOG_DATA == 0 || + lwb_sp < ZIL_MAX_WASTE_SPACE && (dlen % ZIL_MAX_LOG_DATA == 0 || lwb_sp < reclen + dlen % ZIL_MAX_LOG_DATA))) { lwb = zil_lwb_write_start(zilog, lwb, B_FALSE); if (lwb == NULL) @@ -1105,14 +1107,14 @@ cont: zil_lwb_write_init(zilog, lwb); ASSERT(LWB_EMPTY(lwb)); lwb_sp = lwb->lwb_sz - lwb->lwb_nused; - ASSERT3U(reclen + MIN(dlen, sizeof(uint64_t)), <=, lwb_sp); + ASSERT3U(reclen + MIN(dlen, sizeof (uint64_t)), <=, lwb_sp); } dnow = MIN(dlen, lwb_sp - reclen); lr_buf = lwb->lwb_buf + lwb->lwb_nused; bcopy(lrc, lr_buf, reclen); - lrcb = (lr_t *)lr_buf; - lrwb = (lr_write_t *)lrcb; + lrcb = (lr_t *)lr_buf; /* Like lrc, but inside lwb. */ + lrwb = (lr_write_t *)lrcb; /* Like lrw, but inside lwb. */ /* * If it's a write, fetch the data or get its blkptr as appropriate. @@ -1328,6 +1330,8 @@ zil_itx_assign(zilog_t *zilog, itx_t *itx, dmu_tx_t *t * this itxg. Save the itxs for release below. * This should be rare. */ + zfs_dbgmsg("zil_itx_assign: missed itx cleanup for " + "txg %llu", itxg->itxg_txg); clean = itxg->itxg_itxs; } itxg->itxg_txg = txg; From owner-svn-src-stable-11@freebsd.org Thu Jul 27 10:28:09 2017 Return-Path: Delivered-To: svn-src-stable-11@mailman.ysv.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by mailman.ysv.freebsd.org (Postfix) with ESMTP id 42F2FDC7252; Thu, 27 Jul 2017 10:28:09 +0000 (UTC) (envelope-from mav@FreeBSD.org) Received: from repo.freebsd.org (repo.freebsd.org [IPv6:2610:1c1:1:6068::e6a:0]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (Client did not present a certificate) by mx1.freebsd.org (Postfix) with ESMTPS id 06F2370405; Thu, 27 Jul 2017 10:28:08 +0000 (UTC) (envelope-from mav@FreeBSD.org) Received: from repo.freebsd.org ([127.0.1.37]) by repo.freebsd.org (8.15.2/8.15.2) with ESMTP id v6RAS8eP015626; Thu, 27 Jul 2017 10:28:08 GMT (envelope-from mav@FreeBSD.org) Received: (from mav@localhost) by repo.freebsd.org (8.15.2/8.15.2/Submit) id v6RAS8Zb015625; Thu, 27 Jul 2017 10:28:08 GMT (envelope-from mav@FreeBSD.org) Message-Id: <201707271028.v6RAS8Zb015625@repo.freebsd.org> X-Authentication-Warning: repo.freebsd.org: mav set sender to mav@FreeBSD.org using -f From: Alexander Motin Date: Thu, 27 Jul 2017 10:28:08 +0000 (UTC) To: src-committers@freebsd.org, svn-src-all@freebsd.org, svn-src-stable@freebsd.org, svn-src-stable-11@freebsd.org Subject: svn commit: r321612 - stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs X-SVN-Group: stable-11 X-SVN-Commit-Author: mav X-SVN-Commit-Paths: stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs X-SVN-Commit-Revision: 321612 X-SVN-Commit-Repository: base MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit X-BeenThere: svn-src-stable-11@freebsd.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: SVN commit messages for only the 11-stable src tree List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 27 Jul 2017 10:28:09 -0000 Author: mav Date: Thu Jul 27 10:28:07 2017 New Revision: 321612 URL: https://svnweb.freebsd.org/changeset/base/321612 Log: MFC r320238: MFV r319742: 8056 zfs send size estimate is inaccurate for some zvols illumos/illumos-gate@0255edcc85fc0cd1dda0e49bcd52eb66c06a1b16 https://github.com/illumos/illumos-gate/commit/0255edcc85fc0cd1dda0e49bcd52eb66c06a1b16 https://www.illumos.org/issues/8056 The send size estimate for a zvol can be too low, if the size of the record headers (dmu_replay_record_t's) is a significant portion of the size. This is typically the case when the data is highly compressible, especially with embedded blocks. The problem is that dmu_adjust_send_estimate_for_indirects() assumes that blocks are the size of the "recordsize" property (128KB). However, for zvols, the blocks are the size of the "volblocksize" property (8KB). Therefore, we estimate that there will be 16x less record headers than there really will be. Reviewed by: Matthew Ahrens Reviewed by: Pavel Zakharov Approved by: Robert Mustacchi Author: Paul Dagnelie Modified: stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/dmu_send.c Directory Properties: stable/11/ (props changed) Modified: stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/dmu_send.c ============================================================================== --- stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/dmu_send.c Thu Jul 27 10:26:58 2017 (r321611) +++ stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/dmu_send.c Thu Jul 27 10:28:07 2017 (r321612) @@ -1140,10 +1140,17 @@ dmu_adjust_send_estimate_for_indirects(dsl_dataset_t * */ uint64_t recordsize; uint64_t record_count; + objset_t *os; + VERIFY0(dmu_objset_from_ds(ds, &os)); /* Assume all (uncompressed) blocks are recordsize. */ - err = dsl_prop_get_int_ds(ds, zfs_prop_to_name(ZFS_PROP_RECORDSIZE), - &recordsize); + if (os->os_phys->os_type == DMU_OST_ZVOL) { + err = dsl_prop_get_int_ds(ds, + zfs_prop_to_name(ZFS_PROP_VOLBLOCKSIZE), &recordsize); + } else { + err = dsl_prop_get_int_ds(ds, + zfs_prop_to_name(ZFS_PROP_RECORDSIZE), &recordsize); + } if (err != 0) return (err); record_count = uncompressed / recordsize; @@ -1212,6 +1219,10 @@ dmu_send_estimate(dsl_dataset_t *ds, dsl_dataset_t *fr err = dmu_adjust_send_estimate_for_indirects(ds, uncomp, comp, stream_compressed, sizep); + /* + * Add the size of the BEGIN and END records to the estimate. + */ + *sizep += 2 * sizeof (dmu_replay_record_t); return (err); } From owner-svn-src-stable-11@freebsd.org Thu Jul 27 10:29:30 2017 Return-Path: Delivered-To: svn-src-stable-11@mailman.ysv.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by mailman.ysv.freebsd.org (Postfix) with ESMTP id D96DEDC7306; Thu, 27 Jul 2017 10:29:30 +0000 (UTC) (envelope-from mav@FreeBSD.org) Received: from repo.freebsd.org (repo.freebsd.org [IPv6:2610:1c1:1:6068::e6a:0]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (Client did not present a certificate) by mx1.freebsd.org (Postfix) with ESMTPS id 95A0170567; Thu, 27 Jul 2017 10:29:30 +0000 (UTC) (envelope-from mav@FreeBSD.org) Received: from repo.freebsd.org ([127.0.1.37]) by repo.freebsd.org (8.15.2/8.15.2) with ESMTP id v6RATTK1015720; Thu, 27 Jul 2017 10:29:29 GMT (envelope-from mav@FreeBSD.org) Received: (from mav@localhost) by repo.freebsd.org (8.15.2/8.15.2/Submit) id v6RATTf5015719; Thu, 27 Jul 2017 10:29:29 GMT (envelope-from mav@FreeBSD.org) Message-Id: <201707271029.v6RATTf5015719@repo.freebsd.org> X-Authentication-Warning: repo.freebsd.org: mav set sender to mav@FreeBSD.org using -f From: Alexander Motin Date: Thu, 27 Jul 2017 10:29:29 +0000 (UTC) To: src-committers@freebsd.org, svn-src-all@freebsd.org, svn-src-stable@freebsd.org, svn-src-stable-11@freebsd.org Subject: svn commit: r321613 - stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs X-SVN-Group: stable-11 X-SVN-Commit-Author: mav X-SVN-Commit-Paths: stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs X-SVN-Commit-Revision: 321613 X-SVN-Commit-Repository: base MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit X-BeenThere: svn-src-stable-11@freebsd.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: SVN commit messages for only the 11-stable src tree List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 27 Jul 2017 10:29:31 -0000 Author: mav Date: Thu Jul 27 10:29:29 2017 New Revision: 321613 URL: https://svnweb.freebsd.org/changeset/base/321613 Log: MFC r320239: MFV r319950: 5220 L2ARC does not support devices that do not provide 512B access FreeBSD note: the actual change has been in FreeBSD since r297848. This commit accounts for integration of that change with subsequent changes, especially r320156 (MFV of r318946) and r314274. illumos/illumos-gate@403a8da73c64ff9dfb6230ba045c765a242213fb https://github.com/illumos/illumos-gate/commit/403a8da73c64ff9dfb6230ba045c765a242213fb https://www.illumos.org/issues/5220 There are disk devices that have logical sector size larger than 512B, for example 4KB. That is, their physical sector size is larger than 512B and they do not provide emulation for 512B sector sizes. For such devices both a data offset and a data size must be properly aligned. L2ARC should arrange that because it uses physical I/O. zio_vdev_io_start() performs a necessary transformation if io_size is not aligned to vdev_ashift, but that is done only for logical I/O. Something similar should be done in L2ARC code. * a temporary write buffer should be allocated if the original buffer is not going to be compressed and its size is not aligned * size of a temporary compression buffer should be ashift aligned * for the reads, if a size of a target buffer is not sufficiently large and it is not aligned then a temporary read buffer should be allocated Reviewed by: George Wilson Reviewed by: Dan Kimmel Reviewed by: Saso Kiselkov Approved by: Dan McDonald Author: Andriy Gapon Modified: stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/arc.c Directory Properties: stable/11/ (props changed) Modified: stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/arc.c ============================================================================== --- stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/arc.c Thu Jul 27 10:28:07 2017 (r321612) +++ stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/arc.c Thu Jul 27 10:29:29 2017 (r321613) @@ -1343,7 +1343,7 @@ typedef struct l2arc_read_callback { blkptr_t l2rcb_bp; /* original blkptr */ zbookmark_phys_t l2rcb_zb; /* original bookmark */ int l2rcb_flags; /* original flags */ - void *l2rcb_abd; /* temporary buffer */ + abd_t *l2rcb_abd; /* temporary buffer */ } l2arc_read_callback_t; typedef struct l2arc_write_callback { @@ -7485,8 +7485,13 @@ l2arc_write_buffers(spa_t *spa, l2arc_dev_t *dev, uint * Normally the L2ARC can use the hdr's data, but if * we're sharing data between the hdr and one of its * bufs, L2ARC needs its own copy of the data so that - * the ZIO below can't race with the buf consumer. To - * ensure that this copy will be available for the + * the ZIO below can't race with the buf consumer. + * Another case where we need to create a copy of the + * data is when the buffer size is not device-aligned + * and we need to pad the block to make it such. + * That also keeps the clock hand suitably aligned. + * + * To ensure that the copy will be available for the * lifetime of the ZIO and be cleaned up afterwards, we * add it to the l2arc_free_on_write queue. */ From owner-svn-src-stable-11@freebsd.org Thu Jul 27 10:30:56 2017 Return-Path: Delivered-To: svn-src-stable-11@mailman.ysv.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by mailman.ysv.freebsd.org (Postfix) with ESMTP id B23C7DC7389; Thu, 27 Jul 2017 10:30:56 +0000 (UTC) (envelope-from mav@FreeBSD.org) Received: from repo.freebsd.org (repo.freebsd.org [IPv6:2610:1c1:1:6068::e6a:0]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (Client did not present a certificate) by mx1.freebsd.org (Postfix) with ESMTPS id 8C123706F6; Thu, 27 Jul 2017 10:30:56 +0000 (UTC) (envelope-from mav@FreeBSD.org) Received: from repo.freebsd.org ([127.0.1.37]) by repo.freebsd.org (8.15.2/8.15.2) with ESMTP id v6RAUt5f015842; Thu, 27 Jul 2017 10:30:55 GMT (envelope-from mav@FreeBSD.org) Received: (from mav@localhost) by repo.freebsd.org (8.15.2/8.15.2/Submit) id v6RAUtAA015839; Thu, 27 Jul 2017 10:30:55 GMT (envelope-from mav@FreeBSD.org) Message-Id: <201707271030.v6RAUtAA015839@repo.freebsd.org> X-Authentication-Warning: repo.freebsd.org: mav set sender to mav@FreeBSD.org using -f From: Alexander Motin Date: Thu, 27 Jul 2017 10:30:55 +0000 (UTC) To: src-committers@freebsd.org, svn-src-all@freebsd.org, svn-src-stable@freebsd.org, svn-src-stable-11@freebsd.org Subject: svn commit: r321614 - in stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs: . sys X-SVN-Group: stable-11 X-SVN-Commit-Author: mav X-SVN-Commit-Paths: in stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs: . sys X-SVN-Commit-Revision: 321614 X-SVN-Commit-Repository: base MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit X-BeenThere: svn-src-stable-11@freebsd.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: SVN commit messages for only the 11-stable src tree List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 27 Jul 2017 10:30:56 -0000 Author: mav Date: Thu Jul 27 10:30:55 2017 New Revision: 321614 URL: https://svnweb.freebsd.org/changeset/base/321614 Log: MFC r320352: zfs: port vdev_file part of illumos change 3306 3306 zdb should be able to issue reads in parallel illumos/illumos-gate/31d7e8fa33fae995f558673adb22641b5aa8b6e1 https://www.illumos.org/issues/3306 The upstream change was made before we started to import upstream commits individually. It was imported into the illumos vendor area as r242733. That commit was MFV-ed in r260138, but as the commit message says vdev_file.c was left intact. This commit actually implements the parallel I/O for vdev_file using a taskqueue with multiple thread. This implementation does not depend on the illumos or FreeBSD bio interface at all, but uses zio_t to pass around all the relevent data. So, the code looks a bit different from the upstream. This commit also incorporates ZoL commit zfsonlinux/zfs/bc25c9325b0e5ced897b9820dad239539d561ec9 that fixed https://github.com/zfsonlinux/zfs/issues/2270 We need to use a dedicated taskqueue for exactly the same reason as ZoL as we do not implement TASKQ_DYNAMIC. Obtained from: illumos, ZFS on Linux Modified: stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/spa_misc.c stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/sys/vdev_file.h stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/vdev_file.c Directory Properties: stable/11/ (props changed) Modified: stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/spa_misc.c ============================================================================== --- stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/spa_misc.c Thu Jul 27 10:29:29 2017 (r321613) +++ stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/spa_misc.c Thu Jul 27 10:30:55 2017 (r321614) @@ -39,6 +39,7 @@ #include #include #include +#include #include #include #include @@ -2024,6 +2025,7 @@ spa_init(int mode) dmu_init(); zil_init(); vdev_cache_stat_init(); + vdev_file_init(); zfs_prop_init(); zpool_prop_init(); zpool_feature_init(); @@ -2043,6 +2045,7 @@ spa_fini(void) spa_evict_all(); + vdev_file_fini(); vdev_cache_stat_fini(); zil_fini(); dmu_fini(); Modified: stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/sys/vdev_file.h ============================================================================== --- stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/sys/vdev_file.h Thu Jul 27 10:29:29 2017 (r321613) +++ stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/sys/vdev_file.h Thu Jul 27 10:30:55 2017 (r321614) @@ -39,6 +39,9 @@ typedef struct vdev_file { vnode_t *vf_vnode; } vdev_file_t; +extern void vdev_file_init(void); +extern void vdev_file_fini(void); + #ifdef __cplusplus } #endif Modified: stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/vdev_file.c ============================================================================== --- stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/vdev_file.c Thu Jul 27 10:29:29 2017 (r321613) +++ stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/vdev_file.c Thu Jul 27 10:30:55 2017 (r321614) @@ -36,6 +36,21 @@ * Virtual device vector for files. */ +static taskq_t *vdev_file_taskq; + +void +vdev_file_init(void) +{ + vdev_file_taskq = taskq_create("z_vdev_file", MAX(max_ncpus, 16), + minclsyspri, max_ncpus, INT_MAX, 0); +} + +void +vdev_file_fini(void) +{ + taskq_destroy(vdev_file_taskq); +} + static void vdev_file_hold(vdev_t *vd) { @@ -157,41 +172,32 @@ vdev_file_close(vdev_t *vd) vd->vdev_tsd = NULL; } +/* + * Implements the interrupt side for file vdev types. This routine will be + * called when the I/O completes allowing us to transfer the I/O to the + * interrupt taskqs. For consistency, the code structure mimics disk vdev + * types. + */ static void -vdev_file_io_start(zio_t *zio) +vdev_file_io_intr(zio_t *zio) { + zio_delay_interrupt(zio); +} + +static void +vdev_file_io_strategy(void *arg) +{ + zio_t *zio = arg; vdev_t *vd = zio->io_vd; vdev_file_t *vf; vnode_t *vp; void *addr; ssize_t resid; - if (!vdev_readable(vd)) { - zio->io_error = SET_ERROR(ENXIO); - zio_interrupt(zio); - return; - } - vf = vd->vdev_tsd; vp = vf->vf_vnode; - if (zio->io_type == ZIO_TYPE_IOCTL) { - switch (zio->io_cmd) { - case DKIOCFLUSHWRITECACHE: - zio->io_error = VOP_FSYNC(vp, FSYNC | FDSYNC, - kcred, NULL); - break; - default: - zio->io_error = SET_ERROR(ENOTSUP); - } - - zio_execute(zio); - return; - } - ASSERT(zio->io_type == ZIO_TYPE_READ || zio->io_type == ZIO_TYPE_WRITE); - zio->io_target_timestamp = zio_handle_io_delay(zio); - if (zio->io_type == ZIO_TYPE_READ) { addr = abd_borrow_buf(zio->io_abd, zio->io_size); } else { @@ -211,12 +217,41 @@ vdev_file_io_start(zio_t *zio) if (resid != 0 && zio->io_error == 0) zio->io_error = ENOSPC; - zio_delay_interrupt(zio); + vdev_file_io_intr(zio); +} -#ifdef illumos - VERIFY3U(taskq_dispatch(system_taskq, vdev_file_io_strategy, bp, +static void +vdev_file_io_start(zio_t *zio) +{ + vdev_t *vd = zio->io_vd; + vdev_file_t *vf = vd->vdev_tsd; + + if (zio->io_type == ZIO_TYPE_IOCTL) { + /* XXPOLICY */ + if (!vdev_readable(vd)) { + zio->io_error = SET_ERROR(ENXIO); + zio_interrupt(zio); + return; + } + + switch (zio->io_cmd) { + case DKIOCFLUSHWRITECACHE: + zio->io_error = VOP_FSYNC(vf->vf_vnode, FSYNC | FDSYNC, + kcred, NULL); + break; + default: + zio->io_error = SET_ERROR(ENOTSUP); + } + + zio_execute(zio); + return; + } + + ASSERT(zio->io_type == ZIO_TYPE_READ || zio->io_type == ZIO_TYPE_WRITE); + zio->io_target_timestamp = zio_handle_io_delay(zio); + + VERIFY3U(taskq_dispatch(vdev_file_taskq, vdev_file_io_strategy, zio, TQ_SLEEP), !=, 0); -#endif } /* ARGSUSED */ From owner-svn-src-stable-11@freebsd.org Thu Jul 27 10:52:38 2017 Return-Path: Delivered-To: svn-src-stable-11@mailman.ysv.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by mailman.ysv.freebsd.org (Postfix) with ESMTP id 44CF2DC7B85; Thu, 27 Jul 2017 10:52:38 +0000 (UTC) (envelope-from mav@FreeBSD.org) Received: from repo.freebsd.org (repo.freebsd.org [IPv6:2610:1c1:1:6068::e6a:0]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (Client did not present a certificate) by mx1.freebsd.org (Postfix) with ESMTPS id 0D80D713B3; Thu, 27 Jul 2017 10:52:37 +0000 (UTC) (envelope-from mav@FreeBSD.org) Received: from repo.freebsd.org ([127.0.1.37]) by repo.freebsd.org (8.15.2/8.15.2) with ESMTP id v6RAqbIj027792; Thu, 27 Jul 2017 10:52:37 GMT (envelope-from mav@FreeBSD.org) Received: (from mav@localhost) by repo.freebsd.org (8.15.2/8.15.2/Submit) id v6RAqbZj027791; Thu, 27 Jul 2017 10:52:37 GMT (envelope-from mav@FreeBSD.org) Message-Id: <201707271052.v6RAqbZj027791@repo.freebsd.org> X-Authentication-Warning: repo.freebsd.org: mav set sender to mav@FreeBSD.org using -f From: Alexander Motin Date: Thu, 27 Jul 2017 10:52:37 +0000 (UTC) To: src-committers@freebsd.org, svn-src-all@freebsd.org, svn-src-stable@freebsd.org, svn-src-stable-11@freebsd.org Subject: svn commit: r321615 - stable/11/usr.sbin/fstyp X-SVN-Group: stable-11 X-SVN-Commit-Author: mav X-SVN-Commit-Paths: stable/11/usr.sbin/fstyp X-SVN-Commit-Revision: 321615 X-SVN-Commit-Repository: base MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit X-BeenThere: svn-src-stable-11@freebsd.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: SVN commit messages for only the 11-stable src tree List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 27 Jul 2017 10:52:38 -0000 Author: mav Date: Thu Jul 27 10:52:36 2017 New Revision: 321615 URL: https://svnweb.freebsd.org/changeset/base/321615 Log: MFC r320152 (by avg): fstyp: move sys/ include path after zfs include paths The reason is that FreeBSD refcount.h shadows ZFS refcount.h and that will lead to a build error after a planned import of the ARC buf data scatter-ization. It's possible that some day we will have an opposite problem where a ZFS header would shadow an essential FreeBSD header. So, we need to think about a better long term solution. Modified: stable/11/usr.sbin/fstyp/Makefile Directory Properties: stable/11/ (props changed) Modified: stable/11/usr.sbin/fstyp/Makefile ============================================================================== --- stable/11/usr.sbin/fstyp/Makefile Thu Jul 27 10:30:55 2017 (r321614) +++ stable/11/usr.sbin/fstyp/Makefile Thu Jul 27 10:52:36 2017 (r321615) @@ -18,8 +18,6 @@ WARNS?= 2 SUBDIR+= tests .endif -CFLAGS+=-I${SRCTOP}/sys - .if ${MK_ZFS} != "no" IGNORE_PRAGMA= YES @@ -34,6 +32,8 @@ CFLAGS+= -I${SRCTOP}/sys/cddl/contrib/opensolaris/uts/ CFLAGS+= -I${SRCTOP}/sys/cddl/contrib/opensolaris/uts/common/sys CFLAGS+= -I${SRCTOP}/cddl/contrib/opensolaris/head .endif + +CFLAGS+=-I${SRCTOP}/sys LIBADD= geom md From owner-svn-src-stable-11@freebsd.org Thu Jul 27 12:37:19 2017 Return-Path: Delivered-To: svn-src-stable-11@mailman.ysv.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by mailman.ysv.freebsd.org (Postfix) with ESMTP id EE354DCA2E3; Thu, 27 Jul 2017 12:37:19 +0000 (UTC) (envelope-from emaste@FreeBSD.org) Received: from repo.freebsd.org (repo.freebsd.org [IPv6:2610:1c1:1:6068::e6a:0]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (Client did not present a certificate) by mx1.freebsd.org (Postfix) with ESMTPS id BACB9749F6; Thu, 27 Jul 2017 12:37:19 +0000 (UTC) (envelope-from emaste@FreeBSD.org) Received: from repo.freebsd.org ([127.0.1.37]) by repo.freebsd.org (8.15.2/8.15.2) with ESMTP id v6RCbIpF068869; Thu, 27 Jul 2017 12:37:18 GMT (envelope-from emaste@FreeBSD.org) Received: (from emaste@localhost) by repo.freebsd.org (8.15.2/8.15.2/Submit) id v6RCbI0L068868; Thu, 27 Jul 2017 12:37:18 GMT (envelope-from emaste@FreeBSD.org) Message-Id: <201707271237.v6RCbI0L068868@repo.freebsd.org> X-Authentication-Warning: repo.freebsd.org: emaste set sender to emaste@FreeBSD.org using -f From: Ed Maste Date: Thu, 27 Jul 2017 12:37:18 +0000 (UTC) To: src-committers@freebsd.org, svn-src-all@freebsd.org, svn-src-stable@freebsd.org, svn-src-stable-11@freebsd.org Subject: svn commit: r321617 - stable/11/usr.sbin/acpi/acpidump X-SVN-Group: stable-11 X-SVN-Commit-Author: emaste X-SVN-Commit-Paths: stable/11/usr.sbin/acpi/acpidump X-SVN-Commit-Revision: 321617 X-SVN-Commit-Repository: base MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit X-BeenThere: svn-src-stable-11@freebsd.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: SVN commit messages for only the 11-stable src tree List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 27 Jul 2017 12:37:20 -0000 Author: emaste Date: Thu Jul 27 12:37:18 2017 New Revision: 321617 URL: https://svnweb.freebsd.org/changeset/base/321617 Log: revert r321601, it depends on an ACPICA update not yet merged Modified: stable/11/usr.sbin/acpi/acpidump/acpi.c Directory Properties: stable/11/ (props changed) Modified: stable/11/usr.sbin/acpi/acpidump/acpi.c ============================================================================== --- stable/11/usr.sbin/acpi/acpidump/acpi.c Thu Jul 27 12:29:31 2017 (r321616) +++ stable/11/usr.sbin/acpi/acpidump/acpi.c Thu Jul 27 12:37:18 2017 (r321617) @@ -1085,8 +1085,7 @@ static const char *srat_types[] = { [ACPI_SRAT_TYPE_CPU_AFFINITY] = "CPU", [ACPI_SRAT_TYPE_MEMORY_AFFINITY] = "Memory", [ACPI_SRAT_TYPE_X2APIC_CPU_AFFINITY] = "X2APIC", - [ACPI_SRAT_TYPE_GICC_AFFINITY] = "GICC", - [ACPI_SRAT_TYPE_GIC_ITS_AFFINITY] = "GIC ITS", + [ACPI_SRAT_TYPE_GICC_AFFINITY] = "GICC" }; static void From owner-svn-src-stable-11@freebsd.org Thu Jul 27 15:59:38 2017 Return-Path: Delivered-To: svn-src-stable-11@mailman.ysv.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by mailman.ysv.freebsd.org (Postfix) with ESMTP id 03652DCE00D; Thu, 27 Jul 2017 15:59:38 +0000 (UTC) (envelope-from gjb@FreeBSD.org) Received: from repo.freebsd.org (repo.freebsd.org [IPv6:2610:1c1:1:6068::e6a:0]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (Client did not present a certificate) by mx1.freebsd.org (Postfix) with ESMTPS id C5D7D7FB8B; Thu, 27 Jul 2017 15:59:37 +0000 (UTC) (envelope-from gjb@FreeBSD.org) Received: from repo.freebsd.org ([127.0.1.37]) by repo.freebsd.org (8.15.2/8.15.2) with ESMTP id v6RFxb12052682; Thu, 27 Jul 2017 15:59:37 GMT (envelope-from gjb@FreeBSD.org) Received: (from gjb@localhost) by repo.freebsd.org (8.15.2/8.15.2/Submit) id v6RFxbet052681; Thu, 27 Jul 2017 15:59:37 GMT (envelope-from gjb@FreeBSD.org) Message-Id: <201707271559.v6RFxbet052681@repo.freebsd.org> X-Authentication-Warning: repo.freebsd.org: gjb set sender to gjb@FreeBSD.org using -f From: Glen Barber Date: Thu, 27 Jul 2017 15:59:37 +0000 (UTC) To: src-committers@freebsd.org, svn-src-all@freebsd.org, svn-src-stable@freebsd.org, svn-src-stable-11@freebsd.org Subject: svn commit: r321624 - stable/11/release/doc/en_US.ISO8859-1/errata X-SVN-Group: stable-11 X-SVN-Commit-Author: gjb X-SVN-Commit-Paths: stable/11/release/doc/en_US.ISO8859-1/errata X-SVN-Commit-Revision: 321624 X-SVN-Commit-Repository: base MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit X-BeenThere: svn-src-stable-11@freebsd.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: SVN commit messages for only the 11-stable src tree List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 27 Jul 2017 15:59:38 -0000 Author: gjb Date: Thu Jul 27 15:59:36 2017 New Revision: 321624 URL: https://svnweb.freebsd.org/changeset/base/321624 Log: Add an errata entry to reflect an incorrect attribution for r315330. Reported by: Jim Thompson Sponsored by: The FreeBSD Foundation Modified: stable/11/release/doc/en_US.ISO8859-1/errata/article.xml Modified: stable/11/release/doc/en_US.ISO8859-1/errata/article.xml ============================================================================== --- stable/11/release/doc/en_US.ISO8859-1/errata/article.xml Thu Jul 27 15:51:56 2017 (r321623) +++ stable/11/release/doc/en_US.ISO8859-1/errata/article.xml Thu Jul 27 15:59:36 2017 (r321624) @@ -179,6 +179,13 @@ boot Systems being upgraded from 11.1-RC1 and earlier and 11.1-RC3 to 11.1-RELEASE should be unaffected. + + + [2017-07-27] The release notes erroneously state + revision r315330 was sponsored by Rubicon + Communications, LLC (Netgate), when in fact this work was + done by Hiroki Mori independently. + From owner-svn-src-stable-11@freebsd.org Thu Jul 27 16:08:36 2017 Return-Path: Delivered-To: svn-src-stable-11@mailman.ysv.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by mailman.ysv.freebsd.org (Postfix) with ESMTP id 8678BDCE3B8; Thu, 27 Jul 2017 16:08:36 +0000 (UTC) (envelope-from jhb@freebsd.org) Received: from mail.baldwin.cx (bigwig.baldwin.cx [96.47.65.170]) (using TLSv1 with cipher DHE-RSA-CAMELLIA256-SHA (256/256 bits)) (Client did not present a certificate) by mx1.freebsd.org (Postfix) with ESMTPS id 61EE880298; Thu, 27 Jul 2017 16:08:35 +0000 (UTC) (envelope-from jhb@freebsd.org) Received: from ralph.baldwin.cx (c-73-231-226-104.hsd1.ca.comcast.net [73.231.226.104]) by mail.baldwin.cx (Postfix) with ESMTPSA id 0916710AF12; Thu, 27 Jul 2017 12:08:28 -0400 (EDT) From: John Baldwin To: Ed Maste Cc: src-committers@freebsd.org, svn-src-all@freebsd.org, svn-src-stable@freebsd.org, svn-src-stable-11@freebsd.org Subject: Re: svn commit: r321594 - stable/11 Date: Thu, 27 Jul 2017 08:47:30 -0700 Message-ID: <3534925.c9nRS26nSD@ralph.baldwin.cx> User-Agent: KMail/4.14.10 (FreeBSD/11.1-STABLE; KDE/4.14.30; amd64; ; ) In-Reply-To: <201707262318.v6QNIEJ5041684@repo.freebsd.org> References: <201707262318.v6QNIEJ5041684@repo.freebsd.org> MIME-Version: 1.0 Content-Transfer-Encoding: 7Bit Content-Type: text/plain; charset="us-ascii" X-Greylist: Sender succeeded SMTP AUTH, not delayed by milter-greylist-4.4.3 (mail.baldwin.cx); Thu, 27 Jul 2017 12:08:28 -0400 (EDT) X-Virus-Scanned: clamav-milter 0.99.2 at mail.baldwin.cx X-Virus-Status: Clean X-BeenThere: svn-src-stable-11@freebsd.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: SVN commit messages for only the 11-stable src tree List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 27 Jul 2017 16:08:36 -0000 On Wednesday, July 26, 2017 11:18:14 PM Ed Maste wrote: > Author: emaste > Date: Wed Jul 26 23:18:14 2017 > New Revision: 321594 > URL: https://svnweb.freebsd.org/changeset/base/321594 > > Log: > MFC r312857: Use cross-NM (XNM) in compat32 build > > An attempt to build mips64 using external toolchain failed as it tried > to use the host amd64 nm. Note that none of the mips bits for lib32 have been merged to 11. -- John Baldwin From owner-svn-src-stable-11@freebsd.org Thu Jul 27 23:09:13 2017 Return-Path: Delivered-To: svn-src-stable-11@mailman.ysv.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by mailman.ysv.freebsd.org (Postfix) with ESMTP id 68F41DB2F58; Thu, 27 Jul 2017 23:09:13 +0000 (UTC) (envelope-from rmacklem@FreeBSD.org) Received: from repo.freebsd.org (repo.freebsd.org [IPv6:2610:1c1:1:6068::e6a:0]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (Client did not present a certificate) by mx1.freebsd.org (Postfix) with ESMTPS id 34EF268A64; Thu, 27 Jul 2017 23:09:13 +0000 (UTC) (envelope-from rmacklem@FreeBSD.org) Received: from repo.freebsd.org ([127.0.1.37]) by repo.freebsd.org (8.15.2/8.15.2) with ESMTP id v6RN9C7L029718; Thu, 27 Jul 2017 23:09:12 GMT (envelope-from rmacklem@FreeBSD.org) Received: (from rmacklem@localhost) by repo.freebsd.org (8.15.2/8.15.2/Submit) id v6RN9CUs029717; Thu, 27 Jul 2017 23:09:12 GMT (envelope-from rmacklem@FreeBSD.org) Message-Id: <201707272309.v6RN9CUs029717@repo.freebsd.org> X-Authentication-Warning: repo.freebsd.org: rmacklem set sender to rmacklem@FreeBSD.org using -f From: Rick Macklem Date: Thu, 27 Jul 2017 23:09:12 +0000 (UTC) To: src-committers@freebsd.org, svn-src-all@freebsd.org, svn-src-stable@freebsd.org, svn-src-stable-11@freebsd.org Subject: svn commit: r321632 - stable/11/sys/fs/nfsclient X-SVN-Group: stable-11 X-SVN-Commit-Author: rmacklem X-SVN-Commit-Paths: stable/11/sys/fs/nfsclient X-SVN-Commit-Revision: 321632 X-SVN-Commit-Repository: base MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit X-BeenThere: svn-src-stable-11@freebsd.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: SVN commit messages for only the 11-stable src tree List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 27 Jul 2017 23:09:13 -0000 Author: rmacklem Date: Thu Jul 27 23:09:12 2017 New Revision: 321632 URL: https://svnweb.freebsd.org/changeset/base/321632 Log: MFC: r321314 r320062 introduced a bug when doing NFSv4.1 mounts against some non-FreeBSD servers. r320062 used nm_rsize, nm_wsize to set the maximum request/response sizes for the NFSv4.1 session. If rsize,wsize are not specified as options, the value of nm_rsize, nm_wsize is 0 at session creation, resulting in values for request/response that are too small. This patch fixes the problem. A workaround is to specify rsize=N,wsize=N mount options explicitly, so they are set before session creation. This bug only affects NFSv4.1 mounts against some non-FreeBSD servers. Modified: stable/11/sys/fs/nfsclient/nfs_clrpcops.c Directory Properties: stable/11/ (props changed) Modified: stable/11/sys/fs/nfsclient/nfs_clrpcops.c ============================================================================== --- stable/11/sys/fs/nfsclient/nfs_clrpcops.c Thu Jul 27 23:01:07 2017 (r321631) +++ stable/11/sys/fs/nfsclient/nfs_clrpcops.c Thu Jul 27 23:09:12 2017 (r321632) @@ -4664,6 +4664,11 @@ nfsrpc_createsession(struct nfsmount *nmp, struct nfsc struct nfsrv_descript *nd = &nfsd; int error, irdcnt; + /* Make sure nm_rsize, nm_wsize is set. */ + if (nmp->nm_rsize > NFS_MAXBSIZE || nmp->nm_rsize == 0) + nmp->nm_rsize = NFS_MAXBSIZE; + if (nmp->nm_wsize > NFS_MAXBSIZE || nmp->nm_wsize == 0) + nmp->nm_wsize = NFS_MAXBSIZE; nfscl_reqstart(nd, NFSPROC_CREATESESSION, nmp, NULL, 0, NULL, NULL); NFSM_BUILD(tl, uint32_t *, 4 * NFSX_UNSIGNED); *tl++ = sep->nfsess_clientid.lval[0]; From owner-svn-src-stable-11@freebsd.org Fri Jul 28 03:25:02 2017 Return-Path: Delivered-To: svn-src-stable-11@mailman.ysv.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by mailman.ysv.freebsd.org (Postfix) with ESMTP id 57450DBC528; Fri, 28 Jul 2017 03:25:02 +0000 (UTC) (envelope-from ngie@FreeBSD.org) Received: from repo.freebsd.org (repo.freebsd.org [IPv6:2610:1c1:1:6068::e6a:0]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (Client did not present a certificate) by mx1.freebsd.org (Postfix) with ESMTPS id 17661710EE; Fri, 28 Jul 2017 03:25:02 +0000 (UTC) (envelope-from ngie@FreeBSD.org) Received: from repo.freebsd.org ([127.0.1.37]) by repo.freebsd.org (8.15.2/8.15.2) with ESMTP id v6S3P1dJ037885; Fri, 28 Jul 2017 03:25:01 GMT (envelope-from ngie@FreeBSD.org) Received: (from ngie@localhost) by repo.freebsd.org (8.15.2/8.15.2/Submit) id v6S3P1mT037884; Fri, 28 Jul 2017 03:25:01 GMT (envelope-from ngie@FreeBSD.org) Message-Id: <201707280325.v6S3P1mT037884@repo.freebsd.org> X-Authentication-Warning: repo.freebsd.org: ngie set sender to ngie@FreeBSD.org using -f From: Ngie Cooper Date: Fri, 28 Jul 2017 03:25:01 +0000 (UTC) To: src-committers@freebsd.org, svn-src-all@freebsd.org, svn-src-stable@freebsd.org, svn-src-stable-11@freebsd.org Subject: svn commit: r321642 - stable/11/tests/sys/vfs X-SVN-Group: stable-11 X-SVN-Commit-Author: ngie X-SVN-Commit-Paths: stable/11/tests/sys/vfs X-SVN-Commit-Revision: 321642 X-SVN-Commit-Repository: base MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit X-BeenThere: svn-src-stable-11@freebsd.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: SVN commit messages for only the 11-stable src tree List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Fri, 28 Jul 2017 03:25:02 -0000 Author: ngie Date: Fri Jul 28 03:25:01 2017 New Revision: 321642 URL: https://svnweb.freebsd.org/changeset/base/321642 Log: MFC r320445: Don't hardcode path to file in /tmp; this violates the kyua sandbox Modified: stable/11/tests/sys/vfs/trailing_slash.sh Directory Properties: stable/11/ (props changed) Modified: stable/11/tests/sys/vfs/trailing_slash.sh ============================================================================== --- stable/11/tests/sys/vfs/trailing_slash.sh Fri Jul 28 03:24:57 2017 (r321641) +++ stable/11/tests/sys/vfs/trailing_slash.sh Fri Jul 28 03:25:01 2017 (r321642) @@ -6,8 +6,9 @@ # point to files. See kern/21768 for details. Fixed in r193028. # -testfile="/tmp/testfile-$$" -testlink="/tmp/testlink-$$" +: ${TMPDIR=/tmp} +testfile="$TMPDIR/testfile-$$" +testlink="$TMPDIR/testlink-$$" tests=" $testfile:$testlink:$testfile:0 From owner-svn-src-stable-11@freebsd.org Fri Jul 28 03:25:59 2017 Return-Path: Delivered-To: svn-src-stable-11@mailman.ysv.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by mailman.ysv.freebsd.org (Postfix) with ESMTP id 97CBFDBC5E1; Fri, 28 Jul 2017 03:25:59 +0000 (UTC) (envelope-from ngie@FreeBSD.org) Received: from repo.freebsd.org (repo.freebsd.org [IPv6:2610:1c1:1:6068::e6a:0]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (Client did not present a certificate) by mx1.freebsd.org (Postfix) with ESMTPS id 66B827134D; Fri, 28 Jul 2017 03:25:59 +0000 (UTC) (envelope-from ngie@FreeBSD.org) Received: from repo.freebsd.org ([127.0.1.37]) by repo.freebsd.org (8.15.2/8.15.2) with ESMTP id v6S3PwIZ037996; Fri, 28 Jul 2017 03:25:58 GMT (envelope-from ngie@FreeBSD.org) Received: (from ngie@localhost) by repo.freebsd.org (8.15.2/8.15.2/Submit) id v6S3PwS0037995; Fri, 28 Jul 2017 03:25:58 GMT (envelope-from ngie@FreeBSD.org) Message-Id: <201707280325.v6S3PwS0037995@repo.freebsd.org> X-Authentication-Warning: repo.freebsd.org: ngie set sender to ngie@FreeBSD.org using -f From: Ngie Cooper Date: Fri, 28 Jul 2017 03:25:58 +0000 (UTC) To: src-committers@freebsd.org, svn-src-all@freebsd.org, svn-src-stable@freebsd.org, svn-src-stable-11@freebsd.org Subject: svn commit: r321643 - stable/11/tests/sys/vfs X-SVN-Group: stable-11 X-SVN-Commit-Author: ngie X-SVN-Commit-Paths: stable/11/tests/sys/vfs X-SVN-Commit-Revision: 321643 X-SVN-Commit-Repository: base MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit X-BeenThere: svn-src-stable-11@freebsd.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: SVN commit messages for only the 11-stable src tree List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Fri, 28 Jul 2017 03:25:59 -0000 Author: ngie Date: Fri Jul 28 03:25:58 2017 New Revision: 321643 URL: https://svnweb.freebsd.org/changeset/base/321643 Log: MFC r320446: trailing_slash is a TAP-compliant testcase; mark it as such, instead of calling is a plain testcase. Modified: stable/11/tests/sys/vfs/Makefile Directory Properties: stable/11/ (props changed) Modified: stable/11/tests/sys/vfs/Makefile ============================================================================== --- stable/11/tests/sys/vfs/Makefile Fri Jul 28 03:25:01 2017 (r321642) +++ stable/11/tests/sys/vfs/Makefile Fri Jul 28 03:25:58 2017 (r321643) @@ -4,6 +4,6 @@ PACKAGE= tests TESTSDIR= ${TESTSBASE}/sys/vfs -PLAIN_TESTS_SH+= trailing_slash +TAP_TESTS_SH+= trailing_slash .include From owner-svn-src-stable-11@freebsd.org Fri Jul 28 03:27:42 2017 Return-Path: Delivered-To: svn-src-stable-11@mailman.ysv.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by mailman.ysv.freebsd.org (Postfix) with ESMTP id A4B92DBC6D3; Fri, 28 Jul 2017 03:27:42 +0000 (UTC) (envelope-from ngie@FreeBSD.org) Received: from repo.freebsd.org (repo.freebsd.org [IPv6:2610:1c1:1:6068::e6a:0]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (Client did not present a certificate) by mx1.freebsd.org (Postfix) with ESMTPS id 7241B715A8; Fri, 28 Jul 2017 03:27:42 +0000 (UTC) (envelope-from ngie@FreeBSD.org) Received: from repo.freebsd.org ([127.0.1.37]) by repo.freebsd.org (8.15.2/8.15.2) with ESMTP id v6S3Rf2T038206; Fri, 28 Jul 2017 03:27:41 GMT (envelope-from ngie@FreeBSD.org) Received: (from ngie@localhost) by repo.freebsd.org (8.15.2/8.15.2/Submit) id v6S3Rf92038204; Fri, 28 Jul 2017 03:27:41 GMT (envelope-from ngie@FreeBSD.org) Message-Id: <201707280327.v6S3Rf92038204@repo.freebsd.org> X-Authentication-Warning: repo.freebsd.org: ngie set sender to ngie@FreeBSD.org using -f From: Ngie Cooper Date: Fri, 28 Jul 2017 03:27:41 +0000 (UTC) To: src-committers@freebsd.org, svn-src-all@freebsd.org, svn-src-stable@freebsd.org, svn-src-stable-11@freebsd.org Subject: svn commit: r321645 - in stable/11: etc/mtree share/examples/tests/tests share/examples/tests/tests/tap X-SVN-Group: stable-11 X-SVN-Commit-Author: ngie X-SVN-Commit-Paths: in stable/11: etc/mtree share/examples/tests/tests share/examples/tests/tests/tap X-SVN-Commit-Revision: 321645 X-SVN-Commit-Repository: base MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit X-BeenThere: svn-src-stable-11@freebsd.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: SVN commit messages for only the 11-stable src tree List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Fri, 28 Jul 2017 03:27:42 -0000 Author: ngie Date: Fri Jul 28 03:27:41 2017 New Revision: 321645 URL: https://svnweb.freebsd.org/changeset/base/321645 Log: MFC r320443,r320444: r320443: Add kyua TAP test integration examples The examples are patterned loosely after the ATF examples, similar to the plain test examples. r320444: Commit the corresponding mtree file change for the TAP test examples MFC with: r320443 Added: stable/11/share/examples/tests/tests/tap/ - copied from r320443, head/share/examples/tests/tests/tap/ Modified: stable/11/etc/mtree/BSD.tests.dist stable/11/share/examples/tests/tests/Makefile Directory Properties: stable/11/ (props changed) Modified: stable/11/etc/mtree/BSD.tests.dist ============================================================================== --- stable/11/etc/mtree/BSD.tests.dist Fri Jul 28 03:26:05 2017 (r321644) +++ stable/11/etc/mtree/BSD.tests.dist Fri Jul 28 03:27:41 2017 (r321645) @@ -384,6 +384,8 @@ .. plain .. + tap + .. .. .. .. Modified: stable/11/share/examples/tests/tests/Makefile ============================================================================== --- stable/11/share/examples/tests/tests/Makefile Fri Jul 28 03:26:05 2017 (r321644) +++ stable/11/share/examples/tests/tests/Makefile Fri Jul 28 03:27:41 2017 (r321645) @@ -17,6 +17,7 @@ TESTSDIR= ${TESTSBASE}/share/examples/tests # of the system. We use TESTS_SUBDIRS instead of SUBDIR because we want # the auto-generated Kyuafile to recurse into these directories. TESTS_SUBDIRS= atf plain +TESTS_SUBDIRS+= tap # We leave KYUAFILE unset so that bsd.test.mk auto-generates a Kyuafile # for us based on the contents of the TESTS_SUBDIRS line above. The From owner-svn-src-stable-11@freebsd.org Fri Jul 28 03:32:37 2017 Return-Path: Delivered-To: svn-src-stable-11@mailman.ysv.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by mailman.ysv.freebsd.org (Postfix) with ESMTP id D276FDBC8EA; Fri, 28 Jul 2017 03:32:37 +0000 (UTC) (envelope-from ngie@FreeBSD.org) Received: from repo.freebsd.org (repo.freebsd.org [IPv6:2610:1c1:1:6068::e6a:0]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (Client did not present a certificate) by mx1.freebsd.org (Postfix) with ESMTPS id AA12571A62; Fri, 28 Jul 2017 03:32:37 +0000 (UTC) (envelope-from ngie@FreeBSD.org) Received: from repo.freebsd.org ([127.0.1.37]) by repo.freebsd.org (8.15.2/8.15.2) with ESMTP id v6S3WaSJ042250; Fri, 28 Jul 2017 03:32:36 GMT (envelope-from ngie@FreeBSD.org) Received: (from ngie@localhost) by repo.freebsd.org (8.15.2/8.15.2/Submit) id v6S3WaDW042247; Fri, 28 Jul 2017 03:32:36 GMT (envelope-from ngie@FreeBSD.org) Message-Id: <201707280332.v6S3WaDW042247@repo.freebsd.org> X-Authentication-Warning: repo.freebsd.org: ngie set sender to ngie@FreeBSD.org using -f From: Ngie Cooper Date: Fri, 28 Jul 2017 03:32:36 +0000 (UTC) To: src-committers@freebsd.org, svn-src-all@freebsd.org, svn-src-stable@freebsd.org, svn-src-stable-11@freebsd.org Subject: svn commit: r321647 - in stable/11/share/examples/tests/tests: . atf plain X-SVN-Group: stable-11 X-SVN-Commit-Author: ngie X-SVN-Commit-Paths: in stable/11/share/examples/tests/tests: . atf plain X-SVN-Commit-Revision: 321647 X-SVN-Commit-Repository: base MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit X-BeenThere: svn-src-stable-11@freebsd.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: SVN commit messages for only the 11-stable src tree List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Fri, 28 Jul 2017 03:32:37 -0000 Author: ngie Date: Fri Jul 28 03:32:36 2017 New Revision: 321647 URL: https://svnweb.freebsd.org/changeset/base/321647 Log: MFC r320441,r320442: r320441: share/examples/tests/Makefile: clean up example snippets/documentation - TESTSDIR doesn't need to be specified after r289158. - Including bsd.own.mk isn't required since no MK_ knobs are being manipulated. - TESTS_SUBDIRS should be written out in an append format, one entry per line, to provide a better, more conflict resistant example. r320442: share/examples/tests/{atf,plain}/Makefile: tweak example Makefile snippets - Including bsd.own.mk isn't required since no MK_ knobs are being manipulated. - Update documentation to note that ${FILES} is installed via bsd.progs.mk, not bsd.prog.mk. Modified: stable/11/share/examples/tests/tests/Makefile stable/11/share/examples/tests/tests/atf/Makefile stable/11/share/examples/tests/tests/plain/Makefile Directory Properties: stable/11/ (props changed) Modified: stable/11/share/examples/tests/tests/Makefile ============================================================================== --- stable/11/share/examples/tests/tests/Makefile Fri Jul 28 03:30:46 2017 (r321646) +++ stable/11/share/examples/tests/tests/Makefile Fri Jul 28 03:32:36 2017 (r321647) @@ -1,7 +1,5 @@ # $FreeBSD$ -.include - # Directory into which the Kyuafile provided by this directory will be # installed. # @@ -11,12 +9,16 @@ # # For example: if this Makefile were in src/bin/cp/tests/, its TESTSDIR # would point at ${TESTSBASE}/bin/cp/. -TESTSDIR= ${TESTSBASE}/share/examples/tests +# +# The default path specified by bsd.test.mk is `${TESTSBASE}/${RELDIR:H}`, +# which happens to be the same as `${TESTSBASE}/share/examples/tests`. +#TESTSDIR= ${TESTSBASE}/share/examples/tests # List of subdirectories into which we want to recurse during the build # of the system. We use TESTS_SUBDIRS instead of SUBDIR because we want # the auto-generated Kyuafile to recurse into these directories. -TESTS_SUBDIRS= atf plain +TESTS_SUBDIRS+= atf +TESTS_SUBDIRS+= plain TESTS_SUBDIRS+= tap # We leave KYUAFILE unset so that bsd.test.mk auto-generates a Kyuafile Modified: stable/11/share/examples/tests/tests/atf/Makefile ============================================================================== --- stable/11/share/examples/tests/tests/atf/Makefile Fri Jul 28 03:30:46 2017 (r321646) +++ stable/11/share/examples/tests/tests/atf/Makefile Fri Jul 28 03:32:36 2017 (r321647) @@ -1,7 +1,5 @@ # $FreeBSD$ -.include - # The release package to use for the tests contained within the directory # # This applies to components which rely on ^/projects/release-pkg support @@ -33,10 +31,10 @@ ATF_TESTS_SH= cp_test # definitions from above. KYUAFILE= yes -# Install file1 and file2 as files via bsd.prog.mk. Please note the intentional +# Install file1 and file2 as files via bsd.progs.mk. Please note the intentional # ${PACKAGE} namespace of files. # -# The basic semantics of this are the same as FILES in bsd.prog.mk, e.g. the +# The basic semantics of this are the same as FILES in bsd.progs.mk, e.g. the # installation of the files can be manipulated via ${PACKAGE}FILESDIR, # ${PACKAGE}FILESMODE, etc. # Modified: stable/11/share/examples/tests/tests/plain/Makefile ============================================================================== --- stable/11/share/examples/tests/tests/plain/Makefile Fri Jul 28 03:30:46 2017 (r321646) +++ stable/11/share/examples/tests/tests/plain/Makefile Fri Jul 28 03:32:36 2017 (r321647) @@ -1,7 +1,5 @@ # $FreeBSD$ -.include - # The release package to use for the tests contained within the directory # # This applies to components which rely on ^/projects/release-pkg support @@ -33,10 +31,10 @@ PLAIN_TESTS_SH= cp_test # definitions from above. KYUAFILE= yes -# Install file1 and file2 as files via bsd.prog.mk. Please note the intentional +# Install file1 and file2 as files via bsd.progs.mk. Please note the intentional # ${PACKAGE} namespace of files. # -# The basic semantics of this are the same as FILES in bsd.prog.mk, e.g. the +# The basic semantics of this are the same as FILES in bsd.progs.mk, e.g. the # installation of the files can be manipulated via ${PACKAGE}FILESDIR, # ${PACKAGE}FILESMODE, etc. # From owner-svn-src-stable-11@freebsd.org Fri Jul 28 10:31:00 2017 Return-Path: Delivered-To: svn-src-stable-11@mailman.ysv.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by mailman.ysv.freebsd.org (Postfix) with ESMTP id 8351CDC28F6; Fri, 28 Jul 2017 10:31:00 +0000 (UTC) (envelope-from kib@FreeBSD.org) Received: from repo.freebsd.org (repo.freebsd.org [IPv6:2610:1c1:1:6068::e6a:0]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (Client did not present a certificate) by mx1.freebsd.org (Postfix) with ESMTPS id 5F41B7FDD4; Fri, 28 Jul 2017 10:31:00 +0000 (UTC) (envelope-from kib@FreeBSD.org) Received: from repo.freebsd.org ([127.0.1.37]) by repo.freebsd.org (8.15.2/8.15.2) with ESMTP id v6SAUx86009432; Fri, 28 Jul 2017 10:30:59 GMT (envelope-from kib@FreeBSD.org) Received: (from kib@localhost) by repo.freebsd.org (8.15.2/8.15.2/Submit) id v6SAUx01009431; Fri, 28 Jul 2017 10:30:59 GMT (envelope-from kib@FreeBSD.org) Message-Id: <201707281030.v6SAUx01009431@repo.freebsd.org> X-Authentication-Warning: repo.freebsd.org: kib set sender to kib@FreeBSD.org using -f From: Konstantin Belousov Date: Fri, 28 Jul 2017 10:30:59 +0000 (UTC) To: src-committers@freebsd.org, svn-src-all@freebsd.org, svn-src-stable@freebsd.org, svn-src-stable-11@freebsd.org Subject: svn commit: r321650 - stable/11/lib/libc/x86/sys X-SVN-Group: stable-11 X-SVN-Commit-Author: kib X-SVN-Commit-Paths: stable/11/lib/libc/x86/sys X-SVN-Commit-Revision: 321650 X-SVN-Commit-Repository: base MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit X-BeenThere: svn-src-stable-11@freebsd.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: SVN commit messages for only the 11-stable src tree List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Fri, 28 Jul 2017 10:31:00 -0000 Author: kib Date: Fri Jul 28 10:30:59 2017 New Revision: 321650 URL: https://svnweb.freebsd.org/changeset/base/321650 Log: MFC r314319 (by oshogbo): Don't try to open devices in the gettc() function which will always fail in the Capability mode. Instead silently fallback to the syscall method, which is done for example in the gettimeofday(2) function. MFC r314320 (by oshogbo): Remove unneeded variable initialization from r314319. MFC r321461: Fix indent. Modified: stable/11/lib/libc/x86/sys/__vdso_gettc.c Directory Properties: stable/11/ (props changed) Modified: stable/11/lib/libc/x86/sys/__vdso_gettc.c ============================================================================== --- stable/11/lib/libc/x86/sys/__vdso_gettc.c Fri Jul 28 04:41:57 2017 (r321649) +++ stable/11/lib/libc/x86/sys/__vdso_gettc.c Fri Jul 28 10:30:59 2017 (r321650) @@ -33,6 +33,7 @@ __FBSDID("$FreeBSD$"); #include #include "namespace.h" +#include #include #include #include @@ -124,6 +125,7 @@ __vdso_init_hpet(uint32_t u) static const char devprefix[] = "/dev/hpet"; char devname[64], *c, *c1, t; volatile char *new_map, *old_map; + unsigned int mode; uint32_t u1; int fd; @@ -144,18 +146,25 @@ __vdso_init_hpet(uint32_t u) if (old_map != NULL) return; + if (cap_getmode(&mode) == 0 && mode != 0) + goto fail; + fd = _open(devname, O_RDONLY); - if (fd == -1) { - atomic_cmpset_rel_ptr((volatile uintptr_t *)&hpet_dev_map[u], - (uintptr_t)old_map, (uintptr_t)MAP_FAILED); - return; - } + if (fd == -1) + goto fail; + new_map = mmap(NULL, PAGE_SIZE, PROT_READ, MAP_SHARED, fd, 0); _close(fd); if (atomic_cmpset_rel_ptr((volatile uintptr_t *)&hpet_dev_map[u], (uintptr_t)old_map, (uintptr_t)new_map) == 0 && new_map != MAP_FAILED) munmap((void *)new_map, PAGE_SIZE); + + return; +fail: + /* Prevent the caller from re-entering. */ + atomic_cmpset_rel_ptr((volatile uintptr_t *)&hpet_dev_map[u], + (uintptr_t)old_map, (uintptr_t)MAP_FAILED); } #ifdef WANT_HYPERV @@ -174,16 +183,22 @@ static void __vdso_init_hyperv_tsc(void) { int fd; + unsigned int mode; + if (cap_getmode(&mode) == 0 && mode != 0) + goto fail; + fd = _open(HYPERV_REFTSC_DEVPATH, O_RDONLY); - if (fd < 0) { - /* Prevent the caller from re-entering. */ - hyperv_ref_tsc = MAP_FAILED; - return; - } + if (fd < 0) + goto fail; hyperv_ref_tsc = mmap(NULL, sizeof(*hyperv_ref_tsc), PROT_READ, MAP_SHARED, fd, 0); _close(fd); + + return; +fail: + /* Prevent the caller from re-entering. */ + hyperv_ref_tsc = MAP_FAILED; } static int From owner-svn-src-stable-11@freebsd.org Fri Jul 28 15:43:39 2017 Return-Path: Delivered-To: svn-src-stable-11@mailman.ysv.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by mailman.ysv.freebsd.org (Postfix) with ESMTP id 6EDE1DC8853; Fri, 28 Jul 2017 15:43:39 +0000 (UTC) (envelope-from sjg@FreeBSD.org) Received: from repo.freebsd.org (repo.freebsd.org [IPv6:2610:1c1:1:6068::e6a:0]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (Client did not present a certificate) by mx1.freebsd.org (Postfix) with ESMTPS id 371996578E; Fri, 28 Jul 2017 15:43:39 +0000 (UTC) (envelope-from sjg@FreeBSD.org) Received: from repo.freebsd.org ([127.0.1.37]) by repo.freebsd.org (8.15.2/8.15.2) with ESMTP id v6SFhcep039811; Fri, 28 Jul 2017 15:43:38 GMT (envelope-from sjg@FreeBSD.org) Received: (from sjg@localhost) by repo.freebsd.org (8.15.2/8.15.2/Submit) id v6SFhaBR039795; Fri, 28 Jul 2017 15:43:36 GMT (envelope-from sjg@FreeBSD.org) Message-Id: <201707281543.v6SFhaBR039795@repo.freebsd.org> X-Authentication-Warning: repo.freebsd.org: sjg set sender to sjg@FreeBSD.org using -f From: "Simon J. Gerraty" Date: Fri, 28 Jul 2017 15:43:36 +0000 (UTC) To: src-committers@freebsd.org, svn-src-all@freebsd.org, svn-src-stable@freebsd.org, svn-src-stable-11@freebsd.org Subject: svn commit: r321653 - in stable/11: contrib/bmake contrib/bmake/mk usr.bin/bmake X-SVN-Group: stable-11 X-SVN-Commit-Author: sjg X-SVN-Commit-Paths: in stable/11: contrib/bmake contrib/bmake/mk usr.bin/bmake X-SVN-Commit-Revision: 321653 X-SVN-Commit-Repository: base MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit X-BeenThere: svn-src-stable-11@freebsd.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: SVN commit messages for only the 11-stable src tree List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Fri, 28 Jul 2017 15:43:39 -0000 Author: sjg Date: Fri Jul 28 15:43:36 2017 New Revision: 321653 URL: https://svnweb.freebsd.org/changeset/base/321653 Log: MFC bmake-20170720 PR: 221023 Modified: stable/11/contrib/bmake/ChangeLog stable/11/contrib/bmake/Makefile stable/11/contrib/bmake/bmake.1 stable/11/contrib/bmake/bmake.cat1 stable/11/contrib/bmake/buf.h stable/11/contrib/bmake/compat.c stable/11/contrib/bmake/dir.h stable/11/contrib/bmake/hash.h stable/11/contrib/bmake/job.c stable/11/contrib/bmake/main.c stable/11/contrib/bmake/make.1 stable/11/contrib/bmake/make.h stable/11/contrib/bmake/meta.c stable/11/contrib/bmake/mk/ChangeLog stable/11/contrib/bmake/mk/dirdeps.mk stable/11/contrib/bmake/mk/install-mk stable/11/contrib/bmake/mk/lib.mk stable/11/contrib/bmake/mk/meta.stage.mk stable/11/contrib/bmake/mk/meta.sys.mk stable/11/contrib/bmake/mk/meta2deps.py stable/11/contrib/bmake/mk/own.mk stable/11/contrib/bmake/nonints.h stable/11/contrib/bmake/sprite.h stable/11/usr.bin/bmake/Makefile stable/11/usr.bin/bmake/Makefile.inc Directory Properties: stable/11/ (props changed) Modified: stable/11/contrib/bmake/ChangeLog ============================================================================== --- stable/11/contrib/bmake/ChangeLog Fri Jul 28 12:22:32 2017 (r321652) +++ stable/11/contrib/bmake/ChangeLog Fri Jul 28 15:43:36 2017 (r321653) @@ -1,3 +1,23 @@ +2017-07-20 Simon J. Gerraty + + * Makefile (_MAKE_VERSION): 20170720 + Merge with NetBSD make, pick up + o compat.c: pass SIGINT etc onto child and wait for it to exit + before we self-terminate. + +2017-07-11 Simon J. Gerraty + + * Makefile (_MAKE_VERSION): 20170711 + forgot to update after merge on 20170708 ;-) + o main.c: refactor to reduce size of main function. + add -v option to always fully expand values. + o meta.c: ensure command output in meta file has ending newline + even when filemon not being used. + When matching ${.MAKE.META.IGNORE_PATTERNS} do not use + pathname via ':L' since any ':' in pathname breaks that. + Instead set a '${.p.}' to pathname in the target context and + use that. + 2017-05-10 Simon J. Gerraty * Makefile (_MAKE_VERSION): 20170510 Modified: stable/11/contrib/bmake/Makefile ============================================================================== --- stable/11/contrib/bmake/Makefile Fri Jul 28 12:22:32 2017 (r321652) +++ stable/11/contrib/bmake/Makefile Fri Jul 28 15:43:36 2017 (r321653) @@ -1,7 +1,7 @@ -# $Id: Makefile,v 1.92 2017/05/10 22:29:04 sjg Exp $ +# $Id: Makefile,v 1.95 2017/07/20 19:36:13 sjg Exp $ # Base version on src date -_MAKE_VERSION= 20170510 +_MAKE_VERSION= 20170720 PROG= bmake Modified: stable/11/contrib/bmake/bmake.1 ============================================================================== --- stable/11/contrib/bmake/bmake.1 Fri Jul 28 12:22:32 2017 (r321652) +++ stable/11/contrib/bmake/bmake.1 Fri Jul 28 15:43:36 2017 (r321653) @@ -1,4 +1,4 @@ -.\" $NetBSD: make.1,v 1.266 2017/02/01 18:39:27 sjg Exp $ +.\" $NetBSD: make.1,v 1.271 2017/07/03 21:34:20 wiz Exp $ .\" .\" Copyright (c) 1990, 1993 .\" The Regents of the University of California. All rights reserved. @@ -29,7 +29,7 @@ .\" .\" from: @(#)make.1 8.4 (Berkeley) 3/19/94 .\" -.Dd February 1, 2017 +.Dd June 22, 2017 .Dt BMAKE 1 .Os .Sh NAME @@ -48,6 +48,7 @@ .Op Fl m Ar directory .Op Fl T Ar file .Op Fl V Ar variable +.Op Fl v Ar variable .Op Ar variable=value .Op Ar target ... .Sh DESCRIPTION @@ -206,7 +207,9 @@ Print debugging information about target list maintena .It Ar V Force the .Fl V -option to print raw values of variables. +option to print raw values of variables, overriding the default behavior +set via +.Va .MAKE.EXPAND_VARIABLES . .It Ar v Print debugging information about variable assignment. .It Ar x @@ -334,20 +337,39 @@ for each job started and completed. Rather than re-building a target as specified in the makefile, create it or update its modification time to make it appear up-to-date. .It Fl V Ar variable -Print -.Nm Ns 's -idea of the value of -.Ar variable , -in the global context. +Print the value of +.Ar variable . Do not build any targets. Multiple instances of this option may be specified; the variables will be printed one per line, with a blank line for each null or undefined variable. +The value printed is extracted from the global context after all +makefiles have been read. +By default, the raw variable contents (which may +include additional unexpanded variable references) are shown. If .Ar variable contains a .Ql \&$ -then the value will be expanded before printing. +then the value will be recursively expanded to its complete resultant +text before printing. +The expanded value will also be printed if +.Va .MAKE.EXPAND_VARIABLES +is set to true and +the +.Fl dV +option has not been used to override it. +Note that loop-local and target-local variables, as well as values +taken temporarily by global variables during makefile processing, are +not accessible via this option. +The +.Fl dv +debug mode can be used to see these at the cost of generating +substantial extraneous output. +.It Fl v Ar variable +Like +.Fl V +but the variable is always expanded to its complete value. .It Fl W Treat any warnings during makefile parsing as errors. .It Fl w @@ -657,7 +679,7 @@ The seven local variables are as follows: .Bl -tag -width ".ARCHIVE" -offset indent .It Va .ALLSRC The list of all sources for this target; also known as -.Ql Va \&\*[Gt] . +.Ql Va \&> . .It Va .ARCHIVE The name of the archive file; also known as .Ql Va \&! . @@ -666,7 +688,7 @@ In suffix-transformation rules, the name/path of the s target is to be transformed (the .Dq implied source); also known as -.Ql Va \&\*[Lt] . +.Ql Va \&< . It is not defined in explicit rules. .It Va .MEMBER The name of the archive member; also known as @@ -691,9 +713,9 @@ in archive member rules. .El .Pp The shorter forms -.Ql ( Va \*[Gt] , +.Ql ( Va > , .Ql Va \&! , -.Ql Va \*[Lt] , +.Ql Va < , .Ql Va % , .Ql Va \&? , .Ql Va * , @@ -776,6 +798,10 @@ from which generated dependencies are read. A boolean that controls the default behavior of the .Fl V option. +If true, variable values printed with +.Fl V +are fully expanded; if false, the raw variable contents (which may +include additional unexpanded variable references) are shown. .It Va .MAKE.EXPORTED The list of variables exported by .Nm . @@ -1287,7 +1313,7 @@ it is anchored at the end of each word. Inside .Ar new_string , an ampersand -.Pq Ql \*[Am] +.Pq Ql & is replaced by .Ar old_string (without any @@ -1751,7 +1777,7 @@ may be any one of the following: .Bl -tag -width "Cm XX" .It Cm \&|\&| Logical OR. -.It Cm \&\*[Am]\*[Am] +.It Cm \&&& Logical .Tn AND ; of higher precedence than @@ -1768,7 +1794,7 @@ The boolean operator may be used to logically negate an entire conditional. It is of higher precedence than -.Ql Ic \&\*[Am]\*[Am] . +.Ql Ic \&&& . .Pp The value of .Ar expression Modified: stable/11/contrib/bmake/bmake.cat1 ============================================================================== --- stable/11/contrib/bmake/bmake.cat1 Fri Jul 28 12:22:32 2017 (r321652) +++ stable/11/contrib/bmake/bmake.cat1 Fri Jul 28 15:43:36 2017 (r321653) @@ -6,8 +6,8 @@ NNAAMMEE SSYYNNOOPPSSIISS bbmmaakkee [--BBeeiikkNNnnqqrrssttWWwwXX] [--CC _d_i_r_e_c_t_o_r_y] [--DD _v_a_r_i_a_b_l_e] [--dd _f_l_a_g_s] [--ff _m_a_k_e_f_i_l_e] [--II _d_i_r_e_c_t_o_r_y] [--JJ _p_r_i_v_a_t_e] [--jj _m_a_x___j_o_b_s] - [--mm _d_i_r_e_c_t_o_r_y] [--TT _f_i_l_e] [--VV _v_a_r_i_a_b_l_e] [_v_a_r_i_a_b_l_e_=_v_a_l_u_e] - [_t_a_r_g_e_t _._._.] + [--mm _d_i_r_e_c_t_o_r_y] [--TT _f_i_l_e] [--VV _v_a_r_i_a_b_l_e] [--vv _v_a_r_i_a_b_l_e] + [_v_a_r_i_a_b_l_e_=_v_a_l_u_e] [_t_a_r_g_e_t _._._.] DDEESSCCRRIIPPTTIIOONN bbmmaakkee is a program designed to simplify the maintenance of other pro- @@ -118,7 +118,9 @@ DDEESSCCRRIIPPTTIIOONN _t Print debugging information about target list mainte- nance. - _V Force the --VV option to print raw values of variables. + _V Force the --VV option to print raw values of variables, + overriding the default behavior set via + _._M_A_K_E_._E_X_P_A_N_D___V_A_R_I_A_B_L_E_S. _v Print debugging information about variable assignment. @@ -209,13 +211,26 @@ DDEESSCCRRIIPPTTIIOONN to-date. --VV _v_a_r_i_a_b_l_e - Print bbmmaakkee's idea of the value of _v_a_r_i_a_b_l_e, in the global con- - text. Do not build any targets. Multiple instances of this - option may be specified; the variables will be printed one per - line, with a blank line for each null or undefined variable. If - _v_a_r_i_a_b_l_e contains a `$' then the value will be expanded before - printing. + Print the value of _v_a_r_i_a_b_l_e. Do not build any targets. Multiple + instances of this option may be specified; the variables will be + printed one per line, with a blank line for each null or unde- + fined variable. The value printed is extracted from the global + context after all makefiles have been read. By default, the raw + variable contents (which may include additional unexpanded vari- + able references) are shown. If _v_a_r_i_a_b_l_e contains a `$' then the + value will be recursively expanded to its complete resultant text + before printing. The expanded value will also be printed if + _._M_A_K_E_._E_X_P_A_N_D___V_A_R_I_A_B_L_E_S is set to true and the --ddVV option has not + been used to override it. Note that loop-local and target-local + variables, as well as values taken temporarily by global vari- + ables during makefile processing, are not accessible via this + option. The --ddvv debug mode can be used to see these at the cost + of generating substantial extraneous output. + --vv _v_a_r_i_a_b_l_e + Like --VV but the variable is always expanded to its complete + value. + --WW Treat any warnings during makefile parsing as errors. --ww Print entering and leaving directory messages, pre and post pro- @@ -488,7 +503,10 @@ VVAARRIIAABBLLEE AASSSSIIGGNNMMEENNT _._M_A_K_E_._E_X_P_A_N_D___V_A_R_I_A_B_L_E_S A boolean that controls the default behavior of the --VV - option. + option. If true, variable values printed with --VV are + fully expanded; if false, the raw variable contents + (which may include additional unexpanded variable refer- + ences) are shown. _._M_A_K_E_._E_X_P_O_R_T_E_D The list of variables exported by bbmmaakkee. @@ -1523,4 +1541,4 @@ BBUUGGSS There is no way of escaping a space character in a filename. -NetBSD 7.1_RC1 February 1, 2017 NetBSD 7.1_RC1 +NetBSD 7.1_RC1 June 22, 2017 NetBSD 7.1_RC1 Modified: stable/11/contrib/bmake/buf.h ============================================================================== --- stable/11/contrib/bmake/buf.h Fri Jul 28 12:22:32 2017 (r321652) +++ stable/11/contrib/bmake/buf.h Fri Jul 28 15:43:36 2017 (r321653) @@ -1,4 +1,4 @@ -/* $NetBSD: buf.h,v 1.17 2012/04/24 20:26:58 sjg Exp $ */ +/* $NetBSD: buf.h,v 1.19 2017/05/31 22:02:06 maya Exp $ */ /* * Copyright (c) 1988, 1989, 1990 The Regents of the University of California. @@ -77,8 +77,8 @@ * Header for users of the buf library. */ -#ifndef _BUF_H -#define _BUF_H +#ifndef MAKE_BUF_H +#define MAKE_BUF_H typedef char Byte; @@ -116,4 +116,4 @@ void Buf_Init(Buffer *, int); Byte *Buf_Destroy(Buffer *, Boolean); Byte *Buf_DestroyCompact(Buffer *); -#endif /* _BUF_H */ +#endif /* MAKE_BUF_H */ Modified: stable/11/contrib/bmake/compat.c ============================================================================== --- stable/11/contrib/bmake/compat.c Fri Jul 28 12:22:32 2017 (r321652) +++ stable/11/contrib/bmake/compat.c Fri Jul 28 15:43:36 2017 (r321653) @@ -1,4 +1,4 @@ -/* $NetBSD: compat.c,v 1.106 2016/08/26 23:28:39 dholland Exp $ */ +/* $NetBSD: compat.c,v 1.107 2017/07/20 19:29:54 sjg Exp $ */ /* * Copyright (c) 1988, 1989, 1990 The Regents of the University of California. @@ -70,14 +70,14 @@ */ #ifndef MAKE_NATIVE -static char rcsid[] = "$NetBSD: compat.c,v 1.106 2016/08/26 23:28:39 dholland Exp $"; +static char rcsid[] = "$NetBSD: compat.c,v 1.107 2017/07/20 19:29:54 sjg Exp $"; #else #include #ifndef lint #if 0 static char sccsid[] = "@(#)compat.c 8.2 (Berkeley) 3/19/94"; #else -__RCSID("$NetBSD: compat.c,v 1.106 2016/08/26 23:28:39 dholland Exp $"); +__RCSID("$NetBSD: compat.c,v 1.107 2017/07/20 19:29:54 sjg Exp $"); #endif #endif /* not lint */ #endif @@ -118,6 +118,8 @@ __RCSID("$NetBSD: compat.c,v 1.106 2016/08/26 23:28:39 static GNode *curTarg = NULL; static GNode *ENDNode; static void CompatInterrupt(int); +static pid_t compatChild; +static int compatSigno; /* * CompatDeleteTarget -- delete a failed, interrupted, or otherwise @@ -176,8 +178,17 @@ CompatInterrupt(int signo) } if (signo == SIGQUIT) _exit(signo); - bmake_signal(signo, SIG_DFL); - kill(myPid, signo); + /* + * If there is a child running, pass the signal on + * we will exist after it has exited. + */ + compatSigno = signo; + if (compatChild > 0) { + KILLPG(compatChild, signo); + } else { + bmake_signal(signo, SIG_DFL); + kill(myPid, signo); + } } /*- @@ -370,7 +381,7 @@ again: /* * Fork and execute the single command. If the fork fails, we abort. */ - cpid = vFork(); + compatChild = cpid = vFork(); if (cpid < 0) { Fatal("Could not fork"); } @@ -483,7 +494,12 @@ again: } } free(cmdStart); - + compatChild = 0; + if (compatSigno) { + bmake_signal(compatSigno, SIG_DFL); + kill(myPid, compatSigno); + } + return (status); } Modified: stable/11/contrib/bmake/dir.h ============================================================================== --- stable/11/contrib/bmake/dir.h Fri Jul 28 12:22:32 2017 (r321652) +++ stable/11/contrib/bmake/dir.h Fri Jul 28 15:43:36 2017 (r321653) @@ -1,4 +1,4 @@ -/* $NetBSD: dir.h,v 1.15 2012/04/07 18:29:08 christos Exp $ */ +/* $NetBSD: dir.h,v 1.18 2017/05/31 22:02:06 maya Exp $ */ /* * Copyright (c) 1988, 1989, 1990 The Regents of the University of California. @@ -75,8 +75,8 @@ /* dir.h -- */ -#ifndef _DIR -#define _DIR +#ifndef MAKE_DIR_H +#define MAKE_DIR_H typedef struct Path { char *name; /* Name of directory */ @@ -105,4 +105,4 @@ void Dir_PrintPath(Lst); void Dir_Destroy(void *); void * Dir_CopyDir(void *); -#endif /* _DIR */ +#endif /* MAKE_DIR_H */ Modified: stable/11/contrib/bmake/hash.h ============================================================================== --- stable/11/contrib/bmake/hash.h Fri Jul 28 12:22:32 2017 (r321652) +++ stable/11/contrib/bmake/hash.h Fri Jul 28 15:43:36 2017 (r321653) @@ -1,4 +1,4 @@ -/* $NetBSD: hash.h,v 1.11 2016/06/07 00:40:00 sjg Exp $ */ +/* $NetBSD: hash.h,v 1.12 2017/05/31 21:07:03 maya Exp $ */ /* * Copyright (c) 1988, 1989, 1990 The Regents of the University of California. @@ -78,8 +78,8 @@ * which maintains hash tables. */ -#ifndef _HASH -#define _HASH +#ifndef _HASH_H +#define _HASH_H /* * The following defines one entry in the hash table. @@ -146,4 +146,4 @@ void Hash_DeleteEntry(Hash_Table *, Hash_Entry *); Hash_Entry *Hash_EnumFirst(Hash_Table *, Hash_Search *); Hash_Entry *Hash_EnumNext(Hash_Search *); -#endif /* _HASH */ +#endif /* _HASH_H */ Modified: stable/11/contrib/bmake/job.c ============================================================================== --- stable/11/contrib/bmake/job.c Fri Jul 28 12:22:32 2017 (r321652) +++ stable/11/contrib/bmake/job.c Fri Jul 28 15:43:36 2017 (r321653) @@ -1,4 +1,4 @@ -/* $NetBSD: job.c,v 1.190 2017/04/16 21:23:43 riastradh Exp $ */ +/* $NetBSD: job.c,v 1.191 2017/07/20 19:29:54 sjg Exp $ */ /* * Copyright (c) 1988, 1989, 1990 The Regents of the University of California. @@ -70,14 +70,14 @@ */ #ifndef MAKE_NATIVE -static char rcsid[] = "$NetBSD: job.c,v 1.190 2017/04/16 21:23:43 riastradh Exp $"; +static char rcsid[] = "$NetBSD: job.c,v 1.191 2017/07/20 19:29:54 sjg Exp $"; #else #include #ifndef lint #if 0 static char sccsid[] = "@(#)job.c 8.2 (Berkeley) 3/19/94"; #else -__RCSID("$NetBSD: job.c,v 1.190 2017/04/16 21:23:43 riastradh Exp $"); +__RCSID("$NetBSD: job.c,v 1.191 2017/07/20 19:29:54 sjg Exp $"); #endif #endif /* not lint */ #endif @@ -364,11 +364,6 @@ static Job childExitJob; /* child exit pseudo-job */ (void)fprintf(fp, TARG_FMT, targPrefix, gn->name) static sigset_t caught_signals; /* Set of signals we handle */ -#if defined(SYSV) -#define KILLPG(pid, sig) kill(-(pid), (sig)) -#else -#define KILLPG(pid, sig) killpg((pid), (sig)) -#endif static void JobChildSig(int); static void JobContinueSig(int); Modified: stable/11/contrib/bmake/main.c ============================================================================== --- stable/11/contrib/bmake/main.c Fri Jul 28 12:22:32 2017 (r321652) +++ stable/11/contrib/bmake/main.c Fri Jul 28 15:43:36 2017 (r321653) @@ -1,4 +1,4 @@ -/* $NetBSD: main.c,v 1.265 2017/05/10 22:26:14 sjg Exp $ */ +/* $NetBSD: main.c,v 1.272 2017/06/19 19:58:24 christos Exp $ */ /* * Copyright (c) 1988, 1989, 1990, 1993 @@ -69,7 +69,7 @@ */ #ifndef MAKE_NATIVE -static char rcsid[] = "$NetBSD: main.c,v 1.265 2017/05/10 22:26:14 sjg Exp $"; +static char rcsid[] = "$NetBSD: main.c,v 1.272 2017/06/19 19:58:24 christos Exp $"; #else #include #ifndef lint @@ -81,7 +81,7 @@ __COPYRIGHT("@(#) Copyright (c) 1988, 1989, 1990, 1993 #if 0 static char sccsid[] = "@(#)main.c 8.3 (Berkeley) 3/19/94"; #else -__RCSID("$NetBSD: main.c,v 1.265 2017/05/10 22:26:14 sjg Exp $"); +__RCSID("$NetBSD: main.c,v 1.272 2017/06/19 19:58:24 christos Exp $"); #endif #endif /* not lint */ #endif @@ -159,7 +159,9 @@ Boolean deleteOnError; /* .DELETE_ON_ERROR: set */ static Boolean noBuiltins; /* -r flag */ static Lst makefiles; /* ordered list of makefiles to read */ -static Boolean printVars; /* print value of one or more vars */ +static int printVars; /* -[vV] argument */ +#define COMPAT_VARS 1 +#define EXPAND_VARS 2 static Lst variables; /* list of variables to print */ int maxJobs; /* -j argument */ static int maxJobTokens; /* -j argument */ @@ -421,7 +423,7 @@ MainParseArgs(int argc, char **argv) Boolean inOption, dashDash = FALSE; char found_path[MAXPATHLEN + 1]; /* for searching for sys.mk */ -#define OPTFLAGS "BC:D:I:J:NST:V:WXd:ef:ij:km:nqrstw" +#define OPTFLAGS "BC:D:I:J:NST:V:WXd:ef:ij:km:nqrstv:w" /* Can't actually use getopt(3) because rescanning is not portable */ getopt_def = OPTFLAGS; @@ -546,8 +548,9 @@ rearg: Var_Append(MAKEFLAGS, argvalue, VAR_GLOBAL); break; case 'V': + case 'v': if (argvalue == NULL) goto noarg; - printVars = TRUE; + printVars = c == 'v' ? EXPAND_VARS : COMPAT_VARS; (void)Lst_AtEnd(variables, argvalue); Var_Append(MAKEFLAGS, "-V", VAR_GLOBAL); Var_Append(MAKEFLAGS, argvalue, VAR_GLOBAL); @@ -877,6 +880,89 @@ MakeMode(const char *mode) free(mp); } +static void +doPrintVars(void) +{ + LstNode ln; + Boolean expandVars; + + if (printVars == EXPAND_VARS) + expandVars = TRUE; + else if (debugVflag) + expandVars = FALSE; + else + expandVars = getBoolean(".MAKE.EXPAND_VARIABLES", FALSE); + + for (ln = Lst_First(variables); ln != NULL; + ln = Lst_Succ(ln)) { + char *var = (char *)Lst_Datum(ln); + char *value; + char *p1; + + if (strchr(var, '$')) { + value = p1 = Var_Subst(NULL, var, VAR_GLOBAL, + VARF_WANTRES); + } else if (expandVars) { + char tmp[128]; + int len = snprintf(tmp, sizeof(tmp), "${%s}", var); + + if (len >= (int)sizeof(tmp)) + Fatal("%s: variable name too big: %s", + progname, var); + value = p1 = Var_Subst(NULL, tmp, VAR_GLOBAL, + VARF_WANTRES); + } else { + value = Var_Value(var, VAR_GLOBAL, &p1); + } + printf("%s\n", value ? value : ""); + free(p1); + } +} + +static Boolean +runTargets(void) +{ + Lst targs; /* target nodes to create -- passed to Make_Init */ + Boolean outOfDate; /* FALSE if all targets up to date */ + + /* + * Have now read the entire graph and need to make a list of + * targets to create. If none was given on the command line, + * we consult the parsing module to find the main target(s) + * to create. + */ + if (Lst_IsEmpty(create)) + targs = Parse_MainName(); + else + targs = Targ_FindList(create, TARG_CREATE); + + if (!compatMake) { + /* + * Initialize job module before traversing the graph + * now that any .BEGIN and .END targets have been read. + * This is done only if the -q flag wasn't given + * (to prevent the .BEGIN from being executed should + * it exist). + */ + if (!queryFlag) { + Job_Init(); + jobsRunning = TRUE; + } + + /* Traverse the graph, checking on all the targets */ + outOfDate = Make_Run(targs); + } else { + /* + * Compat_Init will take care of creating all the + * targets as well as initializing the module. + */ + Compat_Run(targs); + outOfDate = FALSE; + } + Lst_Destroy(targs, NULL); + return outOfDate; +} + /*- * main -- * The main function, for obvious reasons. Initializes variables @@ -897,8 +983,7 @@ MakeMode(const char *mode) int main(int argc, char **argv) { - Lst targs; /* target nodes to create -- passed to Make_Init */ - Boolean outOfDate = FALSE; /* FALSE if all targets up to date */ + Boolean outOfDate; /* FALSE if all targets up to date */ struct stat sb, sa; char *p1, *path; char mdpath[MAXPATHLEN]; @@ -1027,7 +1112,7 @@ main(int argc, char **argv) create = Lst_Init(FALSE); makefiles = Lst_Init(FALSE); - printVars = FALSE; + printVars = 0; debugVflag = FALSE; variables = Lst_Init(FALSE); beSilent = FALSE; /* Print commands as executed */ @@ -1406,73 +1491,13 @@ main(int argc, char **argv) /* print the values of any variables requested by the user */ if (printVars) { - LstNode ln; - Boolean expandVars; - - if (debugVflag) - expandVars = FALSE; - else - expandVars = getBoolean(".MAKE.EXPAND_VARIABLES", FALSE); - for (ln = Lst_First(variables); ln != NULL; - ln = Lst_Succ(ln)) { - char *var = (char *)Lst_Datum(ln); - char *value; - - if (strchr(var, '$')) { - value = p1 = Var_Subst(NULL, var, VAR_GLOBAL, - VARF_WANTRES); - } else if (expandVars) { - char tmp[128]; - - if (snprintf(tmp, sizeof(tmp), "${%s}", var) >= (int)(sizeof(tmp))) - Fatal("%s: variable name too big: %s", - progname, var); - value = p1 = Var_Subst(NULL, tmp, VAR_GLOBAL, - VARF_WANTRES); - } else { - value = Var_Value(var, VAR_GLOBAL, &p1); - } - printf("%s\n", value ? value : ""); - free(p1); - } + doPrintVars(); + outOfDate = FALSE; } else { - /* - * Have now read the entire graph and need to make a list of - * targets to create. If none was given on the command line, - * we consult the parsing module to find the main target(s) - * to create. - */ - if (Lst_IsEmpty(create)) - targs = Parse_MainName(); - else - targs = Targ_FindList(create, TARG_CREATE); - - if (!compatMake) { - /* - * Initialize job module before traversing the graph - * now that any .BEGIN and .END targets have been read. - * This is done only if the -q flag wasn't given - * (to prevent the .BEGIN from being executed should - * it exist). - */ - if (!queryFlag) { - Job_Init(); - jobsRunning = TRUE; - } - - /* Traverse the graph, checking on all the targets */ - outOfDate = Make_Run(targs); - } else { - /* - * Compat_Init will take care of creating all the - * targets as well as initializing the module. - */ - Compat_Run(targs); - } + outOfDate = runTargets(); } #ifdef CLEANUP - Lst_Destroy(targs, NULL); Lst_Destroy(variables, NULL); Lst_Destroy(makefiles, NULL); Lst_Destroy(create, (FreeProc *)free); @@ -1931,7 +1956,8 @@ usage(void) "usage: %s [-BeikNnqrstWwX] \n\ [-C directory] [-D variable] [-d flags] [-f makefile]\n\ [-I directory] [-J private] [-j max_jobs] [-m directory] [-T file]\n\ - [-V variable] [variable=value] [target ...]\n", progname); + [-V variable] [-v variable] [variable=value] [target ...]\n", + progname); exit(2); } Modified: stable/11/contrib/bmake/make.1 ============================================================================== --- stable/11/contrib/bmake/make.1 Fri Jul 28 12:22:32 2017 (r321652) +++ stable/11/contrib/bmake/make.1 Fri Jul 28 15:43:36 2017 (r321653) @@ -1,4 +1,4 @@ -.\" $NetBSD: make.1,v 1.266 2017/02/01 18:39:27 sjg Exp $ +.\" $NetBSD: make.1,v 1.271 2017/07/03 21:34:20 wiz Exp $ .\" .\" Copyright (c) 1990, 1993 .\" The Regents of the University of California. All rights reserved. @@ -29,7 +29,7 @@ .\" .\" from: @(#)make.1 8.4 (Berkeley) 3/19/94 .\" -.Dd February 1, 2017 +.Dd June 22, 2017 .Dt MAKE 1 .Os .Sh NAME @@ -48,6 +48,7 @@ .Op Fl m Ar directory .Op Fl T Ar file .Op Fl V Ar variable +.Op Fl v Ar variable .Op Ar variable=value .Op Ar target ... .Sh DESCRIPTION @@ -206,7 +207,9 @@ Print debugging information about target list maintena .It Ar V Force the .Fl V -option to print raw values of variables. +option to print raw values of variables, overriding the default behavior +set via +.Va .MAKE.EXPAND_VARIABLES . .It Ar v Print debugging information about variable assignment. .It Ar x @@ -334,20 +337,39 @@ for each job started and completed. Rather than re-building a target as specified in the makefile, create it or update its modification time to make it appear up-to-date. .It Fl V Ar variable -Print -.Nm Ns 's -idea of the value of -.Ar variable , -in the global context. +Print the value of +.Ar variable . Do not build any targets. Multiple instances of this option may be specified; the variables will be printed one per line, with a blank line for each null or undefined variable. +The value printed is extracted from the global context after all +makefiles have been read. +By default, the raw variable contents (which may +include additional unexpanded variable references) are shown. If .Ar variable contains a .Ql \&$ -then the value will be expanded before printing. +then the value will be recursively expanded to its complete resultant +text before printing. +The expanded value will also be printed if +.Va .MAKE.EXPAND_VARIABLES +is set to true and +the +.Fl dV +option has not been used to override it. +Note that loop-local and target-local variables, as well as values +taken temporarily by global variables during makefile processing, are +not accessible via this option. +The +.Fl dv +debug mode can be used to see these at the cost of generating +substantial extraneous output. +.It Fl v Ar variable +Like +.Fl V +but the variable is always expanded to its complete value. .It Fl W Treat any warnings during makefile parsing as errors. .It Fl w @@ -657,7 +679,7 @@ The seven local variables are as follows: .Bl -tag -width ".ARCHIVE" -offset indent .It Va .ALLSRC The list of all sources for this target; also known as -.Ql Va \&\*[Gt] . +.Ql Va \&> . .It Va .ARCHIVE The name of the archive file; also known as .Ql Va \&! . @@ -666,7 +688,7 @@ In suffix-transformation rules, the name/path of the s target is to be transformed (the .Dq implied source); also known as -.Ql Va \&\*[Lt] . +.Ql Va \&< . It is not defined in explicit rules. .It Va .MEMBER The name of the archive member; also known as @@ -691,9 +713,9 @@ in archive member rules. .El .Pp The shorter forms -.Ql ( Va \*[Gt] , +.Ql ( Va > , .Ql Va \&! , -.Ql Va \*[Lt] , +.Ql Va < , .Ql Va % , .Ql Va \&? , .Ql Va * , @@ -787,6 +809,10 @@ from which generated dependencies are read. A boolean that controls the default behavior of the .Fl V option. +If true, variable values printed with +.Fl V +are fully expanded; if false, the raw variable contents (which may +include additional unexpanded variable references) are shown. .It Va .MAKE.EXPORTED The list of variables exported by .Nm . @@ -1298,7 +1324,7 @@ it is anchored at the end of each word. Inside .Ar new_string , an ampersand -.Pq Ql \*[Am] +.Pq Ql & is replaced by .Ar old_string (without any @@ -1762,7 +1788,7 @@ may be any one of the following: .Bl -tag -width "Cm XX" .It Cm \&|\&| Logical OR. -.It Cm \&\*[Am]\*[Am] +.It Cm \&&& Logical .Tn AND ; of higher precedence than @@ -1779,7 +1805,7 @@ The boolean operator may be used to logically negate an entire conditional. It is of higher precedence than -.Ql Ic \&\*[Am]\*[Am] . +.Ql Ic \&&& . .Pp The value of .Ar expression Modified: stable/11/contrib/bmake/make.h ============================================================================== --- stable/11/contrib/bmake/make.h Fri Jul 28 12:22:32 2017 (r321652) +++ stable/11/contrib/bmake/make.h Fri Jul 28 15:43:36 2017 (r321653) @@ -1,4 +1,4 @@ -/* $NetBSD: make.h,v 1.102 2016/12/07 15:00:46 christos Exp $ */ +/* $NetBSD: make.h,v 1.103 2017/07/20 19:29:54 sjg Exp $ */ /* * Copyright (c) 1988, 1989, 1990, 1993 @@ -541,6 +541,12 @@ int cached_stat(const char *, void *); #endif #ifndef PATH_MAX #define PATH_MAX MAXPATHLEN +#endif + +#if defined(SYSV) +#define KILLPG(pid, sig) kill(-(pid), (sig)) +#else +#define KILLPG(pid, sig) killpg((pid), (sig)) #endif #endif /* _MAKE_H_ */ Modified: stable/11/contrib/bmake/meta.c ============================================================================== --- stable/11/contrib/bmake/meta.c Fri Jul 28 12:22:32 2017 (r321652) +++ stable/11/contrib/bmake/meta.c Fri Jul 28 15:43:36 2017 (r321653) @@ -1,4 +1,4 @@ -/* $NetBSD: meta.c,v 1.67 2016/08/17 15:52:42 sjg Exp $ */ +/* $NetBSD: meta.c,v 1.68 2017/07/09 04:54:00 sjg Exp $ */ /* * Implement 'meta' mode. @@ -727,7 +727,7 @@ meta_job_error(Job *job, GNode *gn, int flags, int sta pbm = &Mybm; } if (pbm->mfp != NULL) { - fprintf(pbm->mfp, "*** Error code %d%s\n", + fprintf(pbm->mfp, "\n*** Error code %d%s\n", status, (flags & JOB_IGNERR) ? "(ignored)" : ""); @@ -782,13 +782,13 @@ int meta_cmd_finish(void *pbmp) { int error = 0; -#ifdef USE_FILEMON BuildMon *pbm = pbmp; int x; if (!pbm) pbm = &Mybm; +#ifdef USE_FILEMON if (pbm->filemon_fd >= 0) { if (close(pbm->filemon_fd) < 0) error = errno; @@ -796,8 +796,9 @@ meta_cmd_finish(void *pbmp) if (error == 0 && x != 0) error = x; pbm->filemon_fd = pbm->mon_fd = -1; - } + } else #endif + fprintf(pbm->mfp, "\n"); /* ensure end with newline */ return error; } @@ -861,6 +862,8 @@ fgetLine(char **bufp, size_t *szp, int o, FILE *fp) newsz = ROUNDUP((fs.st_size / 2), BUFSIZ); if (newsz <= bufsz) newsz = ROUNDUP(fs.st_size, BUFSIZ); + if (newsz <= bufsz) + return x; /* truncated */ if (DEBUG(META)) fprintf(debug_file, "growing buffer %u -> %u\n", (unsigned)bufsz, (unsigned)newsz); @@ -948,10 +951,10 @@ meta_ignore(GNode *gn, const char *p) if (metaIgnorePatterns) { char *pm; - snprintf(fname, sizeof(fname), - "${%s:@m@${%s:L:M$m}@}", - MAKE_META_IGNORE_PATTERNS, p); - pm = Var_Subst(NULL, fname, gn, VARF_WANTRES); + Var_Set(".p.", p, gn, 0); + pm = Var_Subst(NULL, + "${" MAKE_META_IGNORE_PATTERNS ":@m@${.p.:M$m}@}", + gn, VARF_WANTRES); if (*pm) { #ifdef DEBUG_META_MODE if (DEBUG(META)) Modified: stable/11/contrib/bmake/mk/ChangeLog ============================================================================== --- stable/11/contrib/bmake/mk/ChangeLog Fri Jul 28 12:22:32 2017 (r321652) +++ stable/11/contrib/bmake/mk/ChangeLog Fri Jul 28 15:43:36 2017 (r321653) @@ -1,3 +1,22 @@ +2017-06-30 Simon J. Gerraty + + * install-mk (MK_VERSION): 20170630 + + * meta.stage.mk: avoid triggering stage_* targets with nothing to do. + +2017-05-23 Simon J. Gerraty + + * meta2deps.py: take special care of '..' + +2017-05-15 Simon J. Gerraty + + * install-mk (MK_VERSION): 20170515 + + * dirdeps.mk (DEP_EXPORT_VARS): on rare occasions it is + useful/necessary for a Makefile.depend file to export some knobs. + This is complicated when we are doing DIRDEPS_CACHE, so we will + handle export of any variables listed in DEP_EXPORT_VARS. + 2017-05-08 Simon J. Gerraty * install-mk (MK_VERSION): 20170505 Modified: stable/11/contrib/bmake/mk/dirdeps.mk ============================================================================== --- stable/11/contrib/bmake/mk/dirdeps.mk Fri Jul 28 12:22:32 2017 (r321652) +++ stable/11/contrib/bmake/mk/dirdeps.mk Fri Jul 28 15:43:36 2017 (r321653) @@ -1,4 +1,4 @@ -# $Id: dirdeps.mk,v 1.88 2017/04/24 20:34:59 sjg Exp $ +# $Id: dirdeps.mk,v 1.89 2017/05/17 17:41:47 sjg Exp $ # Copyright (c) 2010-2013, Juniper Networks, Inc. # All rights reserved. @@ -636,6 +636,11 @@ _build_all_dirs := ${_build_all_dirs:O:u} x!= { echo; echo '\# ${DEP_RELDIR}.${DEP_TARGET_SPEC}'; \ echo 'dirdeps: ${_build_all_dirs:${M_oneperline}}'; echo; } >&3; echo x!= { ${_build_all_dirs:@x@${target($x):?:echo '$x: _DIRDEP_USE';}@} echo; } >&3; echo +.if !empty(DEP_EXPORT_VARS) +# Discouraged, but there are always exceptions. +# Handle it here rather than explain how. +x!= { echo; ${DEP_EXPORT_VARS:@v@echo '$v=${$v}';@} echo '.export ${DEP_EXPORT_VARS}'; echo; } >&3; echo +.endif .else # this makes it all happen dirdeps: ${_build_all_dirs} @@ -644,6 +649,11 @@ ${_build_all_dirs}: _DIRDEP_USE .if ${_debug_reldir} .info ${DEP_RELDIR}.${DEP_TARGET_SPEC}: needs: ${_build_dirs} +.endif + +.if !empty(DEP_EXPORT_VARS) +.export ${DEP_EXPORT_VARS} +DEP_EXPORT_VARS= .endif # this builds the dependency graph *** DIFF OUTPUT TRUNCATED AT 1000 LINES *** From owner-svn-src-stable-11@freebsd.org Fri Jul 28 18:09:42 2017 Return-Path: Delivered-To: svn-src-stable-11@mailman.ysv.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by mailman.ysv.freebsd.org (Postfix) with ESMTP id 69731DCB2E7; Fri, 28 Jul 2017 18:09:42 +0000 (UTC) (envelope-from jhb@FreeBSD.org) Received: from repo.freebsd.org (repo.freebsd.org [IPv6:2610:1c1:1:6068::e6a:0]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (Client did not present a certificate) by mx1.freebsd.org (Postfix) with ESMTPS id 3845F6A656; Fri, 28 Jul 2017 18:09:42 +0000 (UTC) (envelope-from jhb@FreeBSD.org) Received: from repo.freebsd.org ([127.0.1.37]) by repo.freebsd.org (8.15.2/8.15.2) with ESMTP id v6SI9f29097175; Fri, 28 Jul 2017 18:09:41 GMT (envelope-from jhb@FreeBSD.org) Received: (from jhb@localhost) by repo.freebsd.org (8.15.2/8.15.2/Submit) id v6SI9fYo097174; Fri, 28 Jul 2017 18:09:41 GMT (envelope-from jhb@FreeBSD.org) Message-Id: <201707281809.v6SI9fYo097174@repo.freebsd.org> X-Authentication-Warning: repo.freebsd.org: jhb set sender to jhb@FreeBSD.org using -f From: John Baldwin Date: Fri, 28 Jul 2017 18:09:41 +0000 (UTC) To: src-committers@freebsd.org, svn-src-all@freebsd.org, svn-src-stable@freebsd.org, svn-src-stable-11@freebsd.org Subject: svn commit: r321657 - stable/11/sys/kern X-SVN-Group: stable-11 X-SVN-Commit-Author: jhb X-SVN-Commit-Paths: stable/11/sys/kern X-SVN-Commit-Revision: 321657 X-SVN-Commit-Repository: base MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit X-BeenThere: svn-src-stable-11@freebsd.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: SVN commit messages for only the 11-stable src tree List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Fri, 28 Jul 2017 18:09:42 -0000 Author: jhb Date: Fri Jul 28 18:09:41 2017 New Revision: 321657 URL: https://svnweb.freebsd.org/changeset/base/321657 Log: MFC 321075: Set the current vnet pointer in the socket buffer AIO handler. This fixes panics when using AIO under VIMAGE. Sponsored by: Chelsio Communications Modified: stable/11/sys/kern/sys_socket.c Directory Properties: stable/11/ (props changed) Modified: stable/11/sys/kern/sys_socket.c ============================================================================== --- stable/11/sys/kern/sys_socket.c Fri Jul 28 17:45:20 2017 (r321656) +++ stable/11/sys/kern/sys_socket.c Fri Jul 28 18:09:41 2017 (r321657) @@ -675,6 +675,7 @@ soaio_process_sb(struct socket *so, struct sockbuf *sb { struct kaiocb *job; + CURVNET_SET(so->so_vnet); SOCKBUF_LOCK(sb); while (!TAILQ_EMPTY(&sb->sb_aiojobq) && soaio_ready(so, sb)) { job = TAILQ_FIRST(&sb->sb_aiojobq); @@ -698,6 +699,7 @@ soaio_process_sb(struct socket *so, struct sockbuf *sb ACCEPT_LOCK(); SOCK_LOCK(so); sorele(so); + CURVNET_RESTORE(); } void From owner-svn-src-stable-11@freebsd.org Fri Jul 28 18:35:30 2017 Return-Path: Delivered-To: svn-src-stable-11@mailman.ysv.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by mailman.ysv.freebsd.org (Postfix) with ESMTP id 90D2BDCBEF8; Fri, 28 Jul 2017 18:35:30 +0000 (UTC) (envelope-from dim@FreeBSD.org) Received: from repo.freebsd.org (repo.freebsd.org [IPv6:2610:1c1:1:6068::e6a:0]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (Client did not present a certificate) by mx1.freebsd.org (Postfix) with ESMTPS id 5F0656BAF4; Fri, 28 Jul 2017 18:35:30 +0000 (UTC) (envelope-from dim@FreeBSD.org) Received: from repo.freebsd.org ([127.0.1.37]) by repo.freebsd.org (8.15.2/8.15.2) with ESMTP id v6SIZTAd010131; Fri, 28 Jul 2017 18:35:29 GMT (envelope-from dim@FreeBSD.org) Received: (from dim@localhost) by repo.freebsd.org (8.15.2/8.15.2/Submit) id v6SIZTVS010130; Fri, 28 Jul 2017 18:35:29 GMT (envelope-from dim@FreeBSD.org) Message-Id: <201707281835.v6SIZTVS010130@repo.freebsd.org> X-Authentication-Warning: repo.freebsd.org: dim set sender to dim@FreeBSD.org using -f From: Dimitry Andric Date: Fri, 28 Jul 2017 18:35:29 +0000 (UTC) To: src-committers@freebsd.org, svn-src-all@freebsd.org, svn-src-stable@freebsd.org, svn-src-stable-11@freebsd.org Subject: svn commit: r321660 - in stable: 10/sys/boot/efi/boot1 11/sys/boot/efi/boot1 X-SVN-Group: stable-11 X-SVN-Commit-Author: dim X-SVN-Commit-Paths: in stable: 10/sys/boot/efi/boot1 11/sys/boot/efi/boot1 X-SVN-Commit-Revision: 321660 X-SVN-Commit-Repository: base MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit X-BeenThere: svn-src-stable-11@freebsd.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: SVN commit messages for only the 11-stable src tree List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Fri, 28 Jul 2017 18:35:30 -0000 Author: dim Date: Fri Jul 28 18:35:29 2017 New Revision: 321660 URL: https://svnweb.freebsd.org/changeset/base/321660 Log: MFC r321305: Fix printf format warning in zfs_module.c Clang 5.0.0 got better warnings about print format strings using %zd, and this leads to the following -Werror warning on e.g. arm: sys/boot/efi/boot1/zfs_module.c:186:18: error: format specifies type 'ssize_t' (aka 'int') but the argument has type 'off_t' (aka 'long long') [-Werror,-Wformat] "(%lu)\n", st.st_size, spa->spa_name, filepath, EFI_ERROR_CODE(status)); ^~~~~~~~~~ Fix this by casting off_t arguments to intmax_t, and using %jd instead. Reviewed by: tsoome Differential Revision: https://reviews.freebsd.org/D11678 Modified: stable/11/sys/boot/efi/boot1/zfs_module.c Directory Properties: stable/11/ (props changed) Changes in other areas also in this revision: Modified: stable/10/sys/boot/efi/boot1/zfs_module.c Directory Properties: stable/10/ (props changed) Modified: stable/11/sys/boot/efi/boot1/zfs_module.c ============================================================================== --- stable/11/sys/boot/efi/boot1/zfs_module.c Fri Jul 28 18:27:30 2017 (r321659) +++ stable/11/sys/boot/efi/boot1/zfs_module.c Fri Jul 28 18:35:29 2017 (r321660) @@ -173,8 +173,8 @@ load(const char *filepath, dev_info_t *devinfo, void * if ((status = bs->AllocatePool(EfiLoaderData, (UINTN)st.st_size, &buf)) != EFI_SUCCESS) { - printf("Failed to allocate load buffer %zd for pool '%s' for '%s' " - "(%lu)\n", st.st_size, spa->spa_name, filepath, EFI_ERROR_CODE(status)); + printf("Failed to allocate load buffer %jd for pool '%s' for '%s' " + "(%lu)\n", (intmax_t)st.st_size, spa->spa_name, filepath, EFI_ERROR_CODE(status)); return (EFI_INVALID_PARAMETER); } From owner-svn-src-stable-11@freebsd.org Fri Jul 28 18:47:05 2017 Return-Path: Delivered-To: svn-src-stable-11@mailman.ysv.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by mailman.ysv.freebsd.org (Postfix) with ESMTP id 707C5DCC24C; Fri, 28 Jul 2017 18:47:05 +0000 (UTC) (envelope-from dim@FreeBSD.org) Received: from repo.freebsd.org (repo.freebsd.org [IPv6:2610:1c1:1:6068::e6a:0]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (Client did not present a certificate) by mx1.freebsd.org (Postfix) with ESMTPS id 3EB9E6C37E; Fri, 28 Jul 2017 18:47:05 +0000 (UTC) (envelope-from dim@FreeBSD.org) Received: from repo.freebsd.org ([127.0.1.37]) by repo.freebsd.org (8.15.2/8.15.2) with ESMTP id v6SIl48m014496; Fri, 28 Jul 2017 18:47:04 GMT (envelope-from dim@FreeBSD.org) Received: (from dim@localhost) by repo.freebsd.org (8.15.2/8.15.2/Submit) id v6SIl42I014495; Fri, 28 Jul 2017 18:47:04 GMT (envelope-from dim@FreeBSD.org) Message-Id: <201707281847.v6SIl42I014495@repo.freebsd.org> X-Authentication-Warning: repo.freebsd.org: dim set sender to dim@FreeBSD.org using -f From: Dimitry Andric Date: Fri, 28 Jul 2017 18:47:04 +0000 (UTC) To: src-committers@freebsd.org, svn-src-all@freebsd.org, svn-src-stable@freebsd.org, svn-src-stable-11@freebsd.org Subject: svn commit: r321662 - stable/11/sys/net X-SVN-Group: stable-11 X-SVN-Commit-Author: dim X-SVN-Commit-Paths: stable/11/sys/net X-SVN-Commit-Revision: 321662 X-SVN-Commit-Repository: base MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit X-BeenThere: svn-src-stable-11@freebsd.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: SVN commit messages for only the 11-stable src tree List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Fri, 28 Jul 2017 18:47:05 -0000 Author: dim Date: Fri Jul 28 18:47:04 2017 New Revision: 321662 URL: https://svnweb.freebsd.org/changeset/base/321662 Log: MFC r321306: Fix printf format warning in iflib.c Clang 5.0.0 got better warnings about printf format strings using %zd, and this leads to the following -Werror warning on e.g. arm: sys/net/iflib.c:1517:8: error: format specifies type 'ssize_t' (aka 'int') but the argument has type 'bus_size_t' (aka 'unsigned long') [-Werror,-Wformat] sctx->isc_tx_maxsize, nsegments, sctx->isc_tx_maxsegsize); ^~~~~~~~~~~~~~~~~~~~ sys/net/iflib.c:1517:41: error: format specifies type 'ssize_t' (aka 'int') but the argument has type 'bus_size_t' (aka 'unsigned long') [-Werror,-Wformat] sctx->isc_tx_maxsize, nsegments, sctx->isc_tx_maxsegsize); ^~~~~~~~~~~~~~~~~~~~~~~ Fix this by casting bus_size_t arguments to uintmax_t, and using %ju instead. Reviewed by: emaste Differential Revision: https://reviews.freebsd.org/D11679 Modified: stable/11/sys/net/iflib.c Directory Properties: stable/11/ (props changed) Modified: stable/11/sys/net/iflib.c ============================================================================== --- stable/11/sys/net/iflib.c Fri Jul 28 18:46:02 2017 (r321661) +++ stable/11/sys/net/iflib.c Fri Jul 28 18:47:04 2017 (r321662) @@ -1262,8 +1262,8 @@ iflib_txsd_alloc(iflib_txq_t txq) NULL, /* lockfuncarg */ &txq->ift_desc_tag))) { device_printf(dev,"Unable to allocate TX DMA tag: %d\n", err); - device_printf(dev,"maxsize: %zd nsegments: %d maxsegsize: %zd\n", - sctx->isc_tx_maxsize, nsegments, sctx->isc_tx_maxsegsize); + device_printf(dev,"maxsize: %ju nsegments: %d maxsegsize: %ju\n", + (uintmax_t)sctx->isc_tx_maxsize, nsegments, (uintmax_t)sctx->isc_tx_maxsegsize); goto fail; } #ifdef IFLIB_DIAGNOSTICS From owner-svn-src-stable-11@freebsd.org Fri Jul 28 19:10:35 2017 Return-Path: Delivered-To: svn-src-stable-11@mailman.ysv.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by mailman.ysv.freebsd.org (Postfix) with ESMTP id 6EF1CDCC7C6; Fri, 28 Jul 2017 19:10:35 +0000 (UTC) (envelope-from dim@FreeBSD.org) Received: from repo.freebsd.org (repo.freebsd.org [IPv6:2610:1c1:1:6068::e6a:0]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (Client did not present a certificate) by mx1.freebsd.org (Postfix) with ESMTPS id 47DED6CD71; Fri, 28 Jul 2017 19:10:35 +0000 (UTC) (envelope-from dim@FreeBSD.org) Received: from repo.freebsd.org ([127.0.1.37]) by repo.freebsd.org (8.15.2/8.15.2) with ESMTP id v6SJAYMd022944; Fri, 28 Jul 2017 19:10:34 GMT (envelope-from dim@FreeBSD.org) Received: (from dim@localhost) by repo.freebsd.org (8.15.2/8.15.2/Submit) id v6SJAYZ4022943; Fri, 28 Jul 2017 19:10:34 GMT (envelope-from dim@FreeBSD.org) Message-Id: <201707281910.v6SJAYZ4022943@repo.freebsd.org> X-Authentication-Warning: repo.freebsd.org: dim set sender to dim@FreeBSD.org using -f From: Dimitry Andric Date: Fri, 28 Jul 2017 19:10:34 +0000 (UTC) To: src-committers@freebsd.org, svn-src-all@freebsd.org, svn-src-stable@freebsd.org, svn-src-stable-11@freebsd.org Subject: svn commit: r321663 - stable/11/contrib/llvm/tools/clang/lib/AST X-SVN-Group: stable-11 X-SVN-Commit-Author: dim X-SVN-Commit-Paths: stable/11/contrib/llvm/tools/clang/lib/AST X-SVN-Commit-Revision: 321663 X-SVN-Commit-Repository: base MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit X-BeenThere: svn-src-stable-11@freebsd.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: SVN commit messages for only the 11-stable src tree List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Fri, 28 Jul 2017 19:10:35 -0000 Author: dim Date: Fri Jul 28 19:10:34 2017 New Revision: 321663 URL: https://svnweb.freebsd.org/changeset/base/321663 Log: MFC r321342: Pull in r295886 from upstream clang trunk (by Richard Smith): PR32034: Evaluate _Atomic(T) in-place when T is a class or array type. This is necessary in order for the evaluation of an _Atomic initializer for those types to have an associated object, which an initializer for class or array type needs. This fixes an assertion when building recent versions of LinuxCNC. Reported by: trasz PR: 220883 Modified: stable/11/contrib/llvm/tools/clang/lib/AST/ExprConstant.cpp Directory Properties: stable/11/ (props changed) Modified: stable/11/contrib/llvm/tools/clang/lib/AST/ExprConstant.cpp ============================================================================== --- stable/11/contrib/llvm/tools/clang/lib/AST/ExprConstant.cpp Fri Jul 28 18:47:04 2017 (r321662) +++ stable/11/contrib/llvm/tools/clang/lib/AST/ExprConstant.cpp Fri Jul 28 19:10:34 2017 (r321663) @@ -1404,7 +1404,8 @@ static bool EvaluateIntegerOrLValue(const Expr *E, APV EvalInfo &Info); static bool EvaluateFloat(const Expr *E, APFloat &Result, EvalInfo &Info); static bool EvaluateComplex(const Expr *E, ComplexValue &Res, EvalInfo &Info); -static bool EvaluateAtomic(const Expr *E, APValue &Result, EvalInfo &Info); +static bool EvaluateAtomic(const Expr *E, const LValue *This, APValue &Result, + EvalInfo &Info); static bool EvaluateAsRValue(EvalInfo &Info, const Expr *E, APValue &Result); //===----------------------------------------------------------------------===// @@ -4691,7 +4692,10 @@ class ExprEvaluatorBase (public) case CK_AtomicToNonAtomic: { APValue AtomicVal; - if (!EvaluateAtomic(E->getSubExpr(), AtomicVal, Info)) + // This does not need to be done in place even for class/array types: + // atomic-to-non-atomic conversion implies copying the object + // representation. + if (!Evaluate(AtomicVal, Info, E->getSubExpr())) return false; return DerivedSuccess(AtomicVal, E); } @@ -9565,10 +9569,11 @@ bool ComplexExprEvaluator::VisitInitListExpr(const Ini namespace { class AtomicExprEvaluator : public ExprEvaluatorBase { + const LValue *This; APValue &Result; public: - AtomicExprEvaluator(EvalInfo &Info, APValue &Result) - : ExprEvaluatorBaseTy(Info), Result(Result) {} + AtomicExprEvaluator(EvalInfo &Info, const LValue *This, APValue &Result) + : ExprEvaluatorBaseTy(Info), This(This), Result(Result) {} bool Success(const APValue &V, const Expr *E) { Result = V; @@ -9578,7 +9583,10 @@ class AtomicExprEvaluator : (public) bool ZeroInitialization(const Expr *E) { ImplicitValueInitExpr VIE( E->getType()->castAs()->getValueType()); - return Evaluate(Result, Info, &VIE); + // For atomic-qualified class (and array) types in C++, initialize the + // _Atomic-wrapped subobject directly, in-place. + return This ? EvaluateInPlace(Result, Info, *This, &VIE) + : Evaluate(Result, Info, &VIE); } bool VisitCastExpr(const CastExpr *E) { @@ -9586,15 +9594,17 @@ class AtomicExprEvaluator : (public) default: return ExprEvaluatorBaseTy::VisitCastExpr(E); case CK_NonAtomicToAtomic: - return Evaluate(Result, Info, E->getSubExpr()); + return This ? EvaluateInPlace(Result, Info, *This, E->getSubExpr()) + : Evaluate(Result, Info, E->getSubExpr()); } } }; } // end anonymous namespace -static bool EvaluateAtomic(const Expr *E, APValue &Result, EvalInfo &Info) { +static bool EvaluateAtomic(const Expr *E, const LValue *This, APValue &Result, + EvalInfo &Info) { assert(E->isRValue() && E->getType()->isAtomicType()); - return AtomicExprEvaluator(Info, Result).Visit(E); + return AtomicExprEvaluator(Info, This, Result).Visit(E); } //===----------------------------------------------------------------------===// @@ -9699,8 +9709,17 @@ static bool Evaluate(APValue &Result, EvalInfo &Info, if (!EvaluateVoid(E, Info)) return false; } else if (T->isAtomicType()) { - if (!EvaluateAtomic(E, Result, Info)) - return false; + QualType Unqual = T.getAtomicUnqualifiedType(); + if (Unqual->isArrayType() || Unqual->isRecordType()) { + LValue LV; + LV.set(E, Info.CurrentCall->Index); + APValue &Value = Info.CurrentCall->createTemporary(E, false); + if (!EvaluateAtomic(E, &LV, Value, Info)) + return false; + } else { + if (!EvaluateAtomic(E, nullptr, Result, Info)) + return false; + } } else if (Info.getLangOpts().CPlusPlus11) { Info.FFDiag(E, diag::note_constexpr_nonliteral) << E->getType(); return false; @@ -9725,10 +9744,16 @@ static bool EvaluateInPlace(APValue &Result, EvalInfo if (E->isRValue()) { // Evaluate arrays and record types in-place, so that later initializers can // refer to earlier-initialized members of the object. - if (E->getType()->isArrayType()) + QualType T = E->getType(); + if (T->isArrayType()) return EvaluateArray(E, This, Result, Info); - else if (E->getType()->isRecordType()) + else if (T->isRecordType()) return EvaluateRecord(E, This, Result, Info); + else if (T->isAtomicType()) { + QualType Unqual = T.getAtomicUnqualifiedType(); + if (Unqual->isArrayType() || Unqual->isRecordType()) + return EvaluateAtomic(E, &This, Result, Info); + } } // For any other type, in-place evaluation is unimportant. From owner-svn-src-stable-11@freebsd.org Sat Jul 29 07:59:29 2017 Return-Path: Delivered-To: svn-src-stable-11@mailman.ysv.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by mailman.ysv.freebsd.org (Postfix) with ESMTP id 4E13CDBD761; Sat, 29 Jul 2017 07:59:29 +0000 (UTC) (envelope-from kib@FreeBSD.org) Received: from repo.freebsd.org (repo.freebsd.org [IPv6:2610:1c1:1:6068::e6a:0]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (Client did not present a certificate) by mx1.freebsd.org (Postfix) with ESMTPS id 1F1193DA6; Sat, 29 Jul 2017 07:59:29 +0000 (UTC) (envelope-from kib@FreeBSD.org) Received: from repo.freebsd.org ([127.0.1.37]) by repo.freebsd.org (8.15.2/8.15.2) with ESMTP id v6T7xSZ7037462; Sat, 29 Jul 2017 07:59:28 GMT (envelope-from kib@FreeBSD.org) Received: (from kib@localhost) by repo.freebsd.org (8.15.2/8.15.2/Submit) id v6T7xSNQ037461; Sat, 29 Jul 2017 07:59:28 GMT (envelope-from kib@FreeBSD.org) Message-Id: <201707290759.v6T7xSNQ037461@repo.freebsd.org> X-Authentication-Warning: repo.freebsd.org: kib set sender to kib@FreeBSD.org using -f From: Konstantin Belousov Date: Sat, 29 Jul 2017 07:59:28 +0000 (UTC) To: src-committers@freebsd.org, svn-src-all@freebsd.org, svn-src-stable@freebsd.org, svn-src-stable-11@freebsd.org Subject: svn commit: r321676 - stable/11/sys/vm X-SVN-Group: stable-11 X-SVN-Commit-Author: kib X-SVN-Commit-Paths: stable/11/sys/vm X-SVN-Commit-Revision: 321676 X-SVN-Commit-Repository: base MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit X-BeenThere: svn-src-stable-11@freebsd.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: SVN commit messages for only the 11-stable src tree List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Sat, 29 Jul 2017 07:59:29 -0000 Author: kib Date: Sat Jul 29 07:59:28 2017 New Revision: 321676 URL: https://svnweb.freebsd.org/changeset/base/321676 Log: MFC r321371: Do not allocate struct kinfo_vmobject on stack. Modified: stable/11/sys/vm/vm_object.c Directory Properties: stable/11/ (props changed) Modified: stable/11/sys/vm/vm_object.c ============================================================================== --- stable/11/sys/vm/vm_object.c Sat Jul 29 02:25:49 2017 (r321675) +++ stable/11/sys/vm/vm_object.c Sat Jul 29 07:59:28 2017 (r321676) @@ -2282,7 +2282,7 @@ vm_object_vnode(vm_object_t object) static int sysctl_vm_object_list(SYSCTL_HANDLER_ARGS) { - struct kinfo_vmobject kvo; + struct kinfo_vmobject *kvo; char *fullpath, *freepath; struct vnode *vp; struct vattr va; @@ -2307,6 +2307,7 @@ sysctl_vm_object_list(SYSCTL_HANDLER_ARGS) count * 11 / 10)); } + kvo = malloc(sizeof(*kvo), M_TEMP, M_WAITOK); error = 0; /* @@ -2324,13 +2325,13 @@ sysctl_vm_object_list(SYSCTL_HANDLER_ARGS) continue; } mtx_unlock(&vm_object_list_mtx); - kvo.kvo_size = ptoa(obj->size); - kvo.kvo_resident = obj->resident_page_count; - kvo.kvo_ref_count = obj->ref_count; - kvo.kvo_shadow_count = obj->shadow_count; - kvo.kvo_memattr = obj->memattr; - kvo.kvo_active = 0; - kvo.kvo_inactive = 0; + kvo->kvo_size = ptoa(obj->size); + kvo->kvo_resident = obj->resident_page_count; + kvo->kvo_ref_count = obj->ref_count; + kvo->kvo_shadow_count = obj->shadow_count; + kvo->kvo_memattr = obj->memattr; + kvo->kvo_active = 0; + kvo->kvo_inactive = 0; TAILQ_FOREACH(m, &obj->memq, listq) { /* * A page may belong to the object but be @@ -2342,45 +2343,45 @@ sysctl_vm_object_list(SYSCTL_HANDLER_ARGS) * approximation of the system anyway. */ if (vm_page_active(m)) - kvo.kvo_active++; + kvo->kvo_active++; else if (vm_page_inactive(m)) - kvo.kvo_inactive++; + kvo->kvo_inactive++; } - kvo.kvo_vn_fileid = 0; - kvo.kvo_vn_fsid = 0; + kvo->kvo_vn_fileid = 0; + kvo->kvo_vn_fsid = 0; freepath = NULL; fullpath = ""; vp = NULL; switch (obj->type) { case OBJT_DEFAULT: - kvo.kvo_type = KVME_TYPE_DEFAULT; + kvo->kvo_type = KVME_TYPE_DEFAULT; break; case OBJT_VNODE: - kvo.kvo_type = KVME_TYPE_VNODE; + kvo->kvo_type = KVME_TYPE_VNODE; vp = obj->handle; vref(vp); break; case OBJT_SWAP: - kvo.kvo_type = KVME_TYPE_SWAP; + kvo->kvo_type = KVME_TYPE_SWAP; break; case OBJT_DEVICE: - kvo.kvo_type = KVME_TYPE_DEVICE; + kvo->kvo_type = KVME_TYPE_DEVICE; break; case OBJT_PHYS: - kvo.kvo_type = KVME_TYPE_PHYS; + kvo->kvo_type = KVME_TYPE_PHYS; break; case OBJT_DEAD: - kvo.kvo_type = KVME_TYPE_DEAD; + kvo->kvo_type = KVME_TYPE_DEAD; break; case OBJT_SG: - kvo.kvo_type = KVME_TYPE_SG; + kvo->kvo_type = KVME_TYPE_SG; break; case OBJT_MGTDEVICE: - kvo.kvo_type = KVME_TYPE_MGTDEVICE; + kvo->kvo_type = KVME_TYPE_MGTDEVICE; break; default: - kvo.kvo_type = KVME_TYPE_UNKNOWN; + kvo->kvo_type = KVME_TYPE_UNKNOWN; break; } VM_OBJECT_RUNLOCK(obj); @@ -2388,27 +2389,28 @@ sysctl_vm_object_list(SYSCTL_HANDLER_ARGS) vn_fullpath(curthread, vp, &fullpath, &freepath); vn_lock(vp, LK_SHARED | LK_RETRY); if (VOP_GETATTR(vp, &va, curthread->td_ucred) == 0) { - kvo.kvo_vn_fileid = va.va_fileid; - kvo.kvo_vn_fsid = va.va_fsid; + kvo->kvo_vn_fileid = va.va_fileid; + kvo->kvo_vn_fsid = va.va_fsid; } vput(vp); } - strlcpy(kvo.kvo_path, fullpath, sizeof(kvo.kvo_path)); + strlcpy(kvo->kvo_path, fullpath, sizeof(kvo->kvo_path)); if (freepath != NULL) free(freepath, M_TEMP); /* Pack record size down */ - kvo.kvo_structsize = offsetof(struct kinfo_vmobject, kvo_path) + - strlen(kvo.kvo_path) + 1; - kvo.kvo_structsize = roundup(kvo.kvo_structsize, + kvo->kvo_structsize = offsetof(struct kinfo_vmobject, kvo_path) + + strlen(kvo->kvo_path) + 1; + kvo->kvo_structsize = roundup(kvo->kvo_structsize, sizeof(uint64_t)); - error = SYSCTL_OUT(req, &kvo, kvo.kvo_structsize); + error = SYSCTL_OUT(req, kvo, kvo->kvo_structsize); mtx_lock(&vm_object_list_mtx); if (error) break; } mtx_unlock(&vm_object_list_mtx); + free(kvo, M_TEMP); return (error); } SYSCTL_PROC(_vm, OID_AUTO, objects, CTLTYPE_STRUCT | CTLFLAG_RW | CTLFLAG_SKIP | From owner-svn-src-stable-11@freebsd.org Sat Jul 29 09:56:09 2017 Return-Path: Delivered-To: svn-src-stable-11@mailman.ysv.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by mailman.ysv.freebsd.org (Postfix) with ESMTP id 40D67DBFF87; Sat, 29 Jul 2017 09:56:09 +0000 (UTC) (envelope-from mav@FreeBSD.org) Received: from repo.freebsd.org (repo.freebsd.org [IPv6:2610:1c1:1:6068::e6a:0]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (Client did not present a certificate) by mx1.freebsd.org (Postfix) with ESMTPS id 0087E6666A; Sat, 29 Jul 2017 09:56:08 +0000 (UTC) (envelope-from mav@FreeBSD.org) Received: from repo.freebsd.org ([127.0.1.37]) by repo.freebsd.org (8.15.2/8.15.2) with ESMTP id v6T9u84S086186; Sat, 29 Jul 2017 09:56:08 GMT (envelope-from mav@FreeBSD.org) Received: (from mav@localhost) by repo.freebsd.org (8.15.2/8.15.2/Submit) id v6T9u85G086185; Sat, 29 Jul 2017 09:56:08 GMT (envelope-from mav@FreeBSD.org) Message-Id: <201707290956.v6T9u85G086185@repo.freebsd.org> X-Authentication-Warning: repo.freebsd.org: mav set sender to mav@FreeBSD.org using -f From: Alexander Motin Date: Sat, 29 Jul 2017 09:56:08 +0000 (UTC) To: src-committers@freebsd.org, svn-src-all@freebsd.org, svn-src-stable@freebsd.org, svn-src-stable-11@freebsd.org Subject: svn commit: r321680 - stable/11/sys/boot/zfs X-SVN-Group: stable-11 X-SVN-Commit-Author: mav X-SVN-Commit-Paths: stable/11/sys/boot/zfs X-SVN-Commit-Revision: 321680 X-SVN-Commit-Repository: base MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit X-BeenThere: svn-src-stable-11@freebsd.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: SVN commit messages for only the 11-stable src tree List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Sat, 29 Jul 2017 09:56:09 -0000 Author: mav Date: Sat Jul 29 09:56:07 2017 New Revision: 321680 URL: https://svnweb.freebsd.org/changeset/base/321680 Log: MFC r307865 (by tsoome): loader should boot pre-feature flags pools. The feature flags chek is missing the corner case where we have valid pool version, but feature flags are not enabled - as for example plain v28 pool. This update does fix the boot support for such pools. Differential Revision: https://reviews.freebsd.org/D8331 PR: 221084 Modified: stable/11/sys/boot/zfs/zfsimpl.c Directory Properties: stable/11/ (props changed) Modified: stable/11/sys/boot/zfs/zfsimpl.c ============================================================================== --- stable/11/sys/boot/zfs/zfsimpl.c Sat Jul 29 09:22:48 2017 (r321679) +++ stable/11/sys/boot/zfs/zfsimpl.c Sat Jul 29 09:56:07 2017 (r321680) @@ -2046,8 +2046,13 @@ check_mos_features(const spa_t *spa) if ((rc = objset_get_dnode(spa, &spa->spa_mos, DMU_OT_OBJECT_DIRECTORY, &dir)) != 0) return (rc); - if ((rc = zap_lookup(spa, &dir, DMU_POOL_FEATURES_FOR_READ, &objnum)) != 0) - return (rc); + if ((rc = zap_lookup(spa, &dir, DMU_POOL_FEATURES_FOR_READ, &objnum)) != 0) { + /* + * It is older pool without features. As we have already + * tested the label, just return without raising the error. + */ + return (0); + } if ((rc = objset_get_dnode(spa, &spa->spa_mos, objnum, &dir)) != 0) return (rc); From owner-svn-src-stable-11@freebsd.org Sat Jul 29 10:30:14 2017 Return-Path: Delivered-To: svn-src-stable-11@mailman.ysv.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by mailman.ysv.freebsd.org (Postfix) with ESMTP id 8DFA0DC0CD7; Sat, 29 Jul 2017 10:30:14 +0000 (UTC) (envelope-from dchagin@FreeBSD.org) Received: from repo.freebsd.org (repo.freebsd.org [IPv6:2610:1c1:1:6068::e6a:0]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (Client did not present a certificate) by mx1.freebsd.org (Postfix) with ESMTPS id 64B8C677A0; Sat, 29 Jul 2017 10:30:14 +0000 (UTC) (envelope-from dchagin@FreeBSD.org) Received: from repo.freebsd.org ([127.0.1.37]) by repo.freebsd.org (8.15.2/8.15.2) with ESMTP id v6TAUDMS098602; Sat, 29 Jul 2017 10:30:13 GMT (envelope-from dchagin@FreeBSD.org) Received: (from dchagin@localhost) by repo.freebsd.org (8.15.2/8.15.2/Submit) id v6TAUDcn098600; Sat, 29 Jul 2017 10:30:13 GMT (envelope-from dchagin@FreeBSD.org) Message-Id: <201707291030.v6TAUDcn098600@repo.freebsd.org> X-Authentication-Warning: repo.freebsd.org: dchagin set sender to dchagin@FreeBSD.org using -f From: Dmitry Chagin Date: Sat, 29 Jul 2017 10:30:13 +0000 (UTC) To: src-committers@freebsd.org, svn-src-all@freebsd.org, svn-src-stable@freebsd.org, svn-src-stable-11@freebsd.org Subject: svn commit: r321681 - stable/11/sys/fs/fdescfs X-SVN-Group: stable-11 X-SVN-Commit-Author: dchagin X-SVN-Commit-Paths: stable/11/sys/fs/fdescfs X-SVN-Commit-Revision: 321681 X-SVN-Commit-Repository: base MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit X-BeenThere: svn-src-stable-11@freebsd.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: SVN commit messages for only the 11-stable src tree List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Sat, 29 Jul 2017 10:30:14 -0000 Author: dchagin Date: Sat Jul 29 10:30:13 2017 New Revision: 321681 URL: https://svnweb.freebsd.org/changeset/base/321681 Log: MFC r320814: Style(9). Add blank line aftr {. MFC r320815: Remove init from declaration. MFC r320816: Remove init from declaration, collapse two int vars declarations into single. MFC r320817: Don't take a lock around atomic operation. MFC r320818: Eliminate the bogus cast. MFC r320819: Eliminate the bogus cast. MFC r320820: Don't initialize error in declaration. Modified: stable/11/sys/fs/fdescfs/fdesc_vfsops.c stable/11/sys/fs/fdescfs/fdesc_vnops.c Directory Properties: stable/11/ (props changed) Modified: stable/11/sys/fs/fdescfs/fdesc_vfsops.c ============================================================================== --- stable/11/sys/fs/fdescfs/fdesc_vfsops.c Sat Jul 29 09:56:07 2017 (r321680) +++ stable/11/sys/fs/fdescfs/fdesc_vfsops.c Sat Jul 29 10:30:13 2017 (r321681) @@ -68,6 +68,7 @@ static vfs_root_t fdesc_root; int fdesc_cmount(struct mntarg *ma, void *data, uint64_t flags) { + return kernel_mount(ma, flags); } @@ -77,10 +78,10 @@ fdesc_cmount(struct mntarg *ma, void *data, uint64_t f static int fdesc_mount(struct mount *mp) { - int error = 0; struct fdescmount *fmp; struct thread *td = curthread; struct vnode *rvp; + int error; if (!prison_allow(td->td_ucred, PR_ALLOW_MOUNT_FDESCFS)) return (EPERM); @@ -98,7 +99,7 @@ fdesc_mount(struct mount *mp) * We need to initialize a few bits of our local mount point struct to * avoid confusion in allocvp. */ - mp->mnt_data = (qaddr_t) fmp; + mp->mnt_data = fmp; fmp->flags = 0; error = fdesc_allocvp(Froot, -1, FD_ROOT, mp, &rvp); if (error) { @@ -122,11 +123,10 @@ static int fdesc_unmount(struct mount *mp, int mntflags) { struct fdescmount *fmp; - caddr_t data; - int error; - int flags = 0; + int error, flags; - fmp = (struct fdescmount *)mp->mnt_data; + flags = 0; + fmp = mp->mnt_data; if (mntflags & MNT_FORCE) { /* The hash mutex protects the private mount flags. */ mtx_lock(&fdesc_hashmtx); @@ -147,15 +147,10 @@ fdesc_unmount(struct mount *mp, int mntflags) return (error); /* - * Finally, throw away the fdescmount structure. Hold the hashmtx to - * protect the fdescmount structure. + * Finally, throw away the fdescmount structure. */ - mtx_lock(&fdesc_hashmtx); - data = mp->mnt_data; mp->mnt_data = NULL; - mtx_unlock(&fdesc_hashmtx); - free(data, M_FDESCMNT); /* XXX */ - + free(fmp, M_FDESCMNT); return (0); } Modified: stable/11/sys/fs/fdescfs/fdesc_vnops.c ============================================================================== --- stable/11/sys/fs/fdescfs/fdesc_vnops.c Sat Jul 29 09:56:07 2017 (r321680) +++ stable/11/sys/fs/fdescfs/fdesc_vnops.c Sat Jul 29 10:30:13 2017 (r321681) @@ -152,7 +152,7 @@ fdesc_allocvp(fdntype ftype, unsigned fd_fd, int ix, s struct fdescnode *fd, *fd2; struct vnode *vp, *vp2; struct thread *td; - int error = 0; + int error; td = curthread; fc = FD_NHASH(ix); From owner-svn-src-stable-11@freebsd.org Sat Jul 29 10:31:58 2017 Return-Path: Delivered-To: svn-src-stable-11@mailman.ysv.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by mailman.ysv.freebsd.org (Postfix) with ESMTP id 5DF2DDC0D50; Sat, 29 Jul 2017 10:31:58 +0000 (UTC) (envelope-from dchagin@FreeBSD.org) Received: from repo.freebsd.org (repo.freebsd.org [IPv6:2610:1c1:1:6068::e6a:0]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (Client did not present a certificate) by mx1.freebsd.org (Postfix) with ESMTPS id 2B1DE679A6; Sat, 29 Jul 2017 10:31:58 +0000 (UTC) (envelope-from dchagin@FreeBSD.org) Received: from repo.freebsd.org ([127.0.1.37]) by repo.freebsd.org (8.15.2/8.15.2) with ESMTP id v6TAVvRj002188; Sat, 29 Jul 2017 10:31:57 GMT (envelope-from dchagin@FreeBSD.org) Received: (from dchagin@localhost) by repo.freebsd.org (8.15.2/8.15.2/Submit) id v6TAVvxa002182; Sat, 29 Jul 2017 10:31:57 GMT (envelope-from dchagin@FreeBSD.org) Message-Id: <201707291031.v6TAVvxa002182@repo.freebsd.org> X-Authentication-Warning: repo.freebsd.org: dchagin set sender to dchagin@FreeBSD.org using -f From: Dmitry Chagin Date: Sat, 29 Jul 2017 10:31:57 +0000 (UTC) To: src-committers@freebsd.org, svn-src-all@freebsd.org, svn-src-stable@freebsd.org, svn-src-stable-11@freebsd.org Subject: svn commit: r321682 - stable/11/sys/compat/linux X-SVN-Group: stable-11 X-SVN-Commit-Author: dchagin X-SVN-Commit-Paths: stable/11/sys/compat/linux X-SVN-Commit-Revision: 321682 X-SVN-Commit-Repository: base MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit X-BeenThere: svn-src-stable-11@freebsd.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: SVN commit messages for only the 11-stable src tree List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Sat, 29 Jul 2017 10:31:58 -0000 Author: dchagin Date: Sat Jul 29 10:31:57 2017 New Revision: 321682 URL: https://svnweb.freebsd.org/changeset/base/321682 Log: MFC r321366: Style(9) whitespace fix. Modified: stable/11/sys/compat/linux/linux_ioctl.h Directory Properties: stable/11/ (props changed) Modified: stable/11/sys/compat/linux/linux_ioctl.h ============================================================================== --- stable/11/sys/compat/linux/linux_ioctl.h Sat Jul 29 10:30:13 2017 (r321681) +++ stable/11/sys/compat/linux/linux_ioctl.h Sat Jul 29 10:31:57 2017 (r321682) @@ -198,7 +198,7 @@ #define LINUX_VT_SETMODE 0x5602 #define LINUX_VT_GETSTATE 0x5603 #define LINUX_VT_RELDISP 0x5605 -#define LINUX_VT_ACTIVATE 0x5606 +#define LINUX_VT_ACTIVATE 0x5606 #define LINUX_VT_WAITACTIVE 0x5607 #define LINUX_IOCTL_CONSOLE_MIN LINUX_KIOCSOUND From owner-svn-src-stable-11@freebsd.org Sat Jul 29 11:27:56 2017 Return-Path: Delivered-To: svn-src-stable-11@mailman.ysv.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by mailman.ysv.freebsd.org (Postfix) with ESMTP id 223C1DC1C5D; Sat, 29 Jul 2017 11:27:56 +0000 (UTC) (envelope-from mav@FreeBSD.org) Received: from repo.freebsd.org (repo.freebsd.org [IPv6:2610:1c1:1:6068::e6a:0]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (Client did not present a certificate) by mx1.freebsd.org (Postfix) with ESMTPS id E362768FD5; Sat, 29 Jul 2017 11:27:55 +0000 (UTC) (envelope-from mav@FreeBSD.org) Received: from repo.freebsd.org ([127.0.1.37]) by repo.freebsd.org (8.15.2/8.15.2) with ESMTP id v6TBRtAa022709; Sat, 29 Jul 2017 11:27:55 GMT (envelope-from mav@FreeBSD.org) Received: (from mav@localhost) by repo.freebsd.org (8.15.2/8.15.2/Submit) id v6TBRtD7022708; Sat, 29 Jul 2017 11:27:55 GMT (envelope-from mav@FreeBSD.org) Message-Id: <201707291127.v6TBRtD7022708@repo.freebsd.org> X-Authentication-Warning: repo.freebsd.org: mav set sender to mav@FreeBSD.org using -f From: Alexander Motin Date: Sat, 29 Jul 2017 11:27:55 +0000 (UTC) To: src-committers@freebsd.org, svn-src-all@freebsd.org, svn-src-stable@freebsd.org, svn-src-stable-11@freebsd.org Subject: svn commit: r321683 - stable/11/sys/boot/zfs X-SVN-Group: stable-11 X-SVN-Commit-Author: mav X-SVN-Commit-Paths: stable/11/sys/boot/zfs X-SVN-Commit-Revision: 321683 X-SVN-Commit-Repository: base MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit X-BeenThere: svn-src-stable-11@freebsd.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: SVN commit messages for only the 11-stable src tree List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Sat, 29 Jul 2017 11:27:56 -0000 Author: mav Date: Sat Jul 29 11:27:54 2017 New Revision: 321683 URL: https://svnweb.freebsd.org/changeset/base/321683 Log: MFC r314504 (by tsoome): loader: r314112 did introduce dereference freed pointer entry CID: 1371675 Reported by: Coverity Differential Revision: https://reviews.freebsd.org/D9846 Modified: stable/11/sys/boot/zfs/zfsimpl.c Directory Properties: stable/11/ (props changed) Modified: stable/11/sys/boot/zfs/zfsimpl.c ============================================================================== --- stable/11/sys/boot/zfs/zfsimpl.c Sat Jul 29 10:31:57 2017 (r321682) +++ stable/11/sys/boot/zfs/zfsimpl.c Sat Jul 29 11:27:54 2017 (r321683) @@ -2217,7 +2217,7 @@ zfs_lookup(const struct zfsmount *mount, const char *u char path[1024]; int symlinks_followed = 0; struct stat sb; - struct obj_list *entry; + struct obj_list *entry, *tentry; STAILQ_HEAD(, obj_list) on_cache = STAILQ_HEAD_INITIALIZER(on_cache); spa = mount->spa; @@ -2365,7 +2365,7 @@ zfs_lookup(const struct zfsmount *mount, const char *u *dnode = dn; done: - STAILQ_FOREACH(entry, &on_cache, entry) + STAILQ_FOREACH_SAFE(entry, &on_cache, entry, tentry) free(entry); return (rc); } From owner-svn-src-stable-11@freebsd.org Sat Jul 29 17:30:26 2017 Return-Path: Delivered-To: svn-src-stable-11@mailman.ysv.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by mailman.ysv.freebsd.org (Postfix) with ESMTP id D7E69DC7857; Sat, 29 Jul 2017 17:30:26 +0000 (UTC) (envelope-from kp@FreeBSD.org) Received: from repo.freebsd.org (repo.freebsd.org [IPv6:2610:1c1:1:6068::e6a:0]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (Client did not present a certificate) by mx1.freebsd.org (Postfix) with ESMTPS id A4A8B71EE8; Sat, 29 Jul 2017 17:30:26 +0000 (UTC) (envelope-from kp@FreeBSD.org) Received: from repo.freebsd.org ([127.0.1.37]) by repo.freebsd.org (8.15.2/8.15.2) with ESMTP id v6THUPtn068698; Sat, 29 Jul 2017 17:30:25 GMT (envelope-from kp@FreeBSD.org) Received: (from kp@localhost) by repo.freebsd.org (8.15.2/8.15.2/Submit) id v6THUP1C068696; Sat, 29 Jul 2017 17:30:25 GMT (envelope-from kp@FreeBSD.org) Message-Id: <201707291730.v6THUP1C068696@repo.freebsd.org> X-Authentication-Warning: repo.freebsd.org: kp set sender to kp@FreeBSD.org using -f From: Kristof Provost Date: Sat, 29 Jul 2017 17:30:25 +0000 (UTC) To: src-committers@freebsd.org, svn-src-all@freebsd.org, svn-src-stable@freebsd.org, svn-src-stable-11@freebsd.org Subject: svn commit: r321687 - stable/11/lib/libsysdecode X-SVN-Group: stable-11 X-SVN-Commit-Author: kp X-SVN-Commit-Paths: stable/11/lib/libsysdecode X-SVN-Commit-Revision: 321687 X-SVN-Commit-Repository: base MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit X-BeenThere: svn-src-stable-11@freebsd.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: SVN commit messages for only the 11-stable src tree List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Sat, 29 Jul 2017 17:30:27 -0000 Author: kp Date: Sat Jul 29 17:30:25 2017 New Revision: 321687 URL: https://svnweb.freebsd.org/changeset/base/321687 Log: MFC r321370 Handle WITH/WITHOUT_PF in libsysdecode Only filter out the PF ioctls if we're building without pf support. Until now those were always filtered out, so truss did not show symbolic names for pf ioctls. Differential Revision: https://reviews.freebsd.org/D11629 Modified: stable/11/lib/libsysdecode/Makefile stable/11/lib/libsysdecode/mkioctls Directory Properties: stable/11/ (props changed) Modified: stable/11/lib/libsysdecode/Makefile ============================================================================== --- stable/11/lib/libsysdecode/Makefile Sat Jul 29 17:00:23 2017 (r321686) +++ stable/11/lib/libsysdecode/Makefile Sat Jul 29 17:30:25 2017 (r321687) @@ -121,7 +121,7 @@ tables.h: mktables ioctl.c: .PHONY .endif ioctl.c: mkioctls .META - env MACHINE=${MACHINE} CPP="${CPP}" \ + env MACHINE=${MACHINE} CPP="${CPP}" MK_PF="${MK_PF}" \ /bin/sh ${.CURDIR}/mkioctls ${DESTDIR}${INCLUDEDIR} > ${.TARGET} beforedepend: ioctl.c tables.h Modified: stable/11/lib/libsysdecode/mkioctls ============================================================================== --- stable/11/lib/libsysdecode/mkioctls Sat Jul 29 17:00:23 2017 (r321686) +++ stable/11/lib/libsysdecode/mkioctls Sat Jul 29 17:30:25 2017 (r321687) @@ -17,8 +17,14 @@ LC_ALL=C; export LC_ALL # XXX should we use an ANSI cpp? ioctl_includes=$( cd $includedir + + filter='(.*disk.*)\.h' + if [ "${MK_PF}" == "no" ]; then + filter="(${filter})|((net/pfvar|net/if_pfsync)\.h)" + fi + find -H -s * -name '*.h' | \ - egrep -v '(.*disk.*|net/pfvar|net/if_pfsync)\.h' | \ + egrep -v ${filter} | \ xargs egrep -l \ '^#[ ]*define[ ]+[A-Za-z_][A-Za-z0-9_]*[ ]+_IO[^a-z0-9_]' | awk '{printf("#include <%s>\\n", $1)}'