From owner-freebsd-fs@freebsd.org Sun Aug 16 14:26:34 2015 Return-Path: Delivered-To: freebsd-fs@mailman.ysv.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by mailman.ysv.freebsd.org (Postfix) with ESMTP id 855E39BABE0; Sun, 16 Aug 2015 14:26:34 +0000 (UTC) (envelope-from julian@freebsd.org) Received: from vps1.elischer.org (vps1.elischer.org [204.109.63.16]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (Client CN "vps1.elischer.org", Issuer "CA Cert Signing Authority" (not verified)) by mx1.freebsd.org (Postfix) with ESMTPS id 5F82B1DF2; Sun, 16 Aug 2015 14:26:33 +0000 (UTC) (envelope-from julian@freebsd.org) Received: from Julian-MBP3.local (ppp121-45-240-35.lns20.per4.internode.on.net [121.45.240.35]) (authenticated bits=0) by vps1.elischer.org (8.15.2/8.15.2) with ESMTPSA id t7GEQR6d023531 (version=TLSv1.2 cipher=DHE-RSA-AES128-SHA bits=128 verify=NO); Sun, 16 Aug 2015 07:26:31 -0700 (PDT) (envelope-from julian@freebsd.org) Subject: Re: futimens and utimensat vs birthtime To: John Baldwin , freebsd-current@freebsd.org References: <55CDFF32.7050601@freebsd.org> <2405496.WdPSxGzEuT@ralph.baldwin.cx> Cc: "freebsd-fs@freebsd.org" , "'Jilles Tjoelker'" From: Julian Elischer Message-ID: <55D09D8D.7010206@freebsd.org> Date: Sun, 16 Aug 2015 22:26:21 +0800 User-Agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10.10; rv:38.0) Gecko/20100101 Thunderbird/38.1.0 MIME-Version: 1.0 In-Reply-To: <2405496.WdPSxGzEuT@ralph.baldwin.cx> Content-Type: text/plain; charset=windows-1252; format=flowed Content-Transfer-Encoding: 7bit X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.20 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Sun, 16 Aug 2015 14:26:34 -0000 On 8/15/15 1:39 AM, John Baldwin wrote: > On Friday, August 14, 2015 10:46:10 PM Julian Elischer wrote: >> I would like to implement this call. 
but would like input as to its
>> nature.
>> The code inside the system would already appear to support handling
>> three elements, though it needs some scrutiny,
>> so all that is needed is a system call with the ability to set the
>> birthtime directly.
>>
>> Whether it should take the form of the existing calls but expecting
>> three items is up for discussion.
>> Maybe the addition of a flags argument to specify which items are
>> present and which to set.
>>
>> ideas?
> I believe these should be new calls.  Only utimensat() provides a flag
> argument, but it is reserved for AT_* flags.

I wasn't suggesting we keep the old ones and silently make them take 3
args :-)
I was thinking of supplementing them with new syscalls, and the obvious
names are those you suggested.
However, I do wonder if there will ever be a need for a 4th...

> I would be fine with
> something like futimens3() and utimensat3() (where 3 means "three
> timespecs").  Jilles implemented futimens() and utimensat(), so he
> might have ideas as well.  I would probably stick the birth time in
> the third (final) timespec slot to make it easier to update new code
> (you can use an #ifdef just around ts[2] without having to #ifdef the
> entire block).
> From owner-freebsd-fs@freebsd.org Sun Aug 16 19:05:08 2015 Return-Path: Delivered-To: freebsd-fs@mailman.ysv.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by mailman.ysv.freebsd.org (Postfix) with ESMTP id 98DC49BB131; Sun, 16 Aug 2015 19:05:08 +0000 (UTC) (envelope-from adrian.chadd@gmail.com) Received: from mail-io0-x231.google.com (mail-io0-x231.google.com [IPv6:2607:f8b0:4001:c06::231]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (Client CN "smtp.gmail.com", Issuer "Google Internet Authority G2" (verified OK)) by mx1.freebsd.org (Postfix) with ESMTPS id 654D3122D; Sun, 16 Aug 2015 19:05:08 +0000 (UTC) (envelope-from adrian.chadd@gmail.com) Received: by iods203 with SMTP id s203so130498011iod.0; Sun, 16 Aug 2015 12:05:07 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20120113; h=mime-version:in-reply-to:references:date:message-id:subject:from:to :cc:content-type; bh=88ayhWhIF0YZT9e4+DNF98kbR3zI2uORDVc52QQPVkE=; b=ruU4zGelYjY+yJ+dqkamWeHWsZFpIZIxDXXA8zpbwlCJyfF+NGRahnEml17sjpDQZT JeOEJ5fCZUGcUygcxmmiGgvxi8d9OJ97iNjSzQ218GgIpBkVer0/LxFC7qwDb0a+WW7g DTeRUS9JN0XXo8M4NQgAZPGvngffxHlwG/KGDITXN7NL5m6fkS4eWoEYrcUgWXmokMBf xh0qZ1SGM9G9fibyJ1sprjf/8pUm7lo7mVHRHt2ERJM/UkFtYPKwwX30Ghw9a41bJAd7 VqDIGXlxRk1FLT2OnAGKfMe66s14E0jafqLkOymwFrF2XE8EZ8bRa5bB8wm9z4jcn1nW wvrQ== MIME-Version: 1.0 X-Received: by 10.107.131.196 with SMTP id n65mr15007669ioi.75.1439751907638; Sun, 16 Aug 2015 12:05:07 -0700 (PDT) Received: by 10.36.38.133 with HTTP; Sun, 16 Aug 2015 12:05:07 -0700 (PDT) In-Reply-To: <55D09D8D.7010206@freebsd.org> References: <55CDFF32.7050601@freebsd.org> <2405496.WdPSxGzEuT@ralph.baldwin.cx> <55D09D8D.7010206@freebsd.org> Date: Sun, 16 Aug 2015 12:05:07 -0700 Message-ID: Subject: Re: futimens and utimensat vs birthtime From: Adrian Chadd To: Julian Elischer Cc: John Baldwin , freebsd-current , "freebsd-fs@freebsd.org" , Jilles Tjoelker Content-Type: text/plain; 
charset=UTF-8 X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.20 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Sun, 16 Aug 2015 19:05:08 -0000

.. then make it take a struct and a type flag. :P

Then you can extend it however you'd like.

-adrian

On 16 August 2015 at 07:26, Julian Elischer wrote:
> On 8/15/15 1:39 AM, John Baldwin wrote:
>> On Friday, August 14, 2015 10:46:10 PM Julian Elischer wrote:
>>> I would like to implement this call, but would like input as to its
>>> nature.
>>> The code inside the system would already appear to support handling
>>> three elements, though it needs some scrutiny,
>>> so all that is needed is a system call with the ability to set the
>>> birthtime directly.
>>>
>>> Whether it should take the form of the existing calls but expecting
>>> three items is up for discussion.
>>> Maybe the addition of a flags argument to specify which items are
>>> present and which to set.
>>>
>>> ideas?
>>
>> I believe these should be new calls.  Only utimensat() provides a flag
>> argument, but it is reserved for AT_* flags.
>
> I wasn't suggesting we keep the old ones and silently make them take 3 args
> :-)
> I was thinking of supplementing them with new syscalls, and the obvious names
> are those you suggested.
> however I do wonder if there will ever be a need for a 4th...
>> > > _______________________________________________ > freebsd-current@freebsd.org mailing list > https://lists.freebsd.org/mailman/listinfo/freebsd-current > To unsubscribe, send any mail to "freebsd-current-unsubscribe@freebsd.org" From owner-freebsd-fs@freebsd.org Sun Aug 16 21:00:34 2015 Return-Path: Delivered-To: freebsd-fs@mailman.ysv.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by mailman.ysv.freebsd.org (Postfix) with ESMTP id 6E18E9BB28F for ; Sun, 16 Aug 2015 21:00:34 +0000 (UTC) (envelope-from bugzilla-noreply@FreeBSD.org) Received: from kenobi.freebsd.org (kenobi.freebsd.org [IPv6:2001:1900:2254:206a::16:76]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (Client did not present a certificate) by mx1.freebsd.org (Postfix) with ESMTPS id 490161CBD for ; Sun, 16 Aug 2015 21:00:34 +0000 (UTC) (envelope-from bugzilla-noreply@FreeBSD.org) Received: from bugs.freebsd.org ([127.0.1.118]) by kenobi.freebsd.org (8.15.2/8.15.2) with ESMTP id t7GL0Yvl088730 for ; Sun, 16 Aug 2015 21:00:34 GMT (envelope-from bugzilla-noreply@FreeBSD.org) Message-Id: <201508162100.t7GL0Yvl088730@kenobi.freebsd.org> From: bugzilla-noreply@FreeBSD.org To: freebsd-fs@FreeBSD.org Subject: Problem reports for freebsd-fs@FreeBSD.org that need special attention X-Bugzilla-URL: https://bugs.freebsd.org/bugzilla/ Auto-Submitted: auto-generated MIME-Version: 1.0 Date: Sun, 16 Aug 2015 21:00:34 +0000 Content-Type: text/plain; charset="UTF-8" X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.20 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Sun, 16 Aug 2015 21:00:34 -0000 To view an individual PR, use: https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=(Bug Id). The following is a listing of current problems submitted by FreeBSD users, which need special attention. 
These represent problem reports covering all versions including experimental development code and obsolete releases. Status | Bug Id | Description ------------+-----------+--------------------------------------------------- Open | 136470 | [nfs] Cannot mount / in read-only, over NFS Open | 139651 | [nfs] mount(8): read-only remount of NFS volume d Open | 144447 | [zfs] sharenfs fsunshare() & fsshare_main() non f 3 problems total for which you should take action. From owner-freebsd-fs@freebsd.org Sun Aug 16 21:46:19 2015 Return-Path: Delivered-To: freebsd-fs@mailman.ysv.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by mailman.ysv.freebsd.org (Postfix) with ESMTP id E473C9BBB0E; Sun, 16 Aug 2015 21:46:19 +0000 (UTC) (envelope-from jilles@stack.nl) Received: from mx1.stack.nl (relay04.stack.nl [IPv6:2001:610:1108:5010::107]) (using TLSv1 with cipher DHE-RSA-CAMELLIA256-SHA (256/256 bits)) (Client CN "mailhost.stack.nl", Issuer "CA Cert Signing Authority" (not verified)) by mx1.freebsd.org (Postfix) with ESMTPS id A3D8D1181; Sun, 16 Aug 2015 21:46:19 +0000 (UTC) (envelope-from jilles@stack.nl) Received: from snail.stack.nl (snail.stack.nl [IPv6:2001:610:1108:5010::131]) by mx1.stack.nl (Postfix) with ESMTP id BECE4B8074; Sun, 16 Aug 2015 23:46:16 +0200 (CEST) Received: by snail.stack.nl (Postfix, from userid 1677) id AF7DC28494; Sun, 16 Aug 2015 23:46:16 +0200 (CEST) Date: Sun, 16 Aug 2015 23:46:16 +0200 From: Jilles Tjoelker To: Julian Elischer Cc: John Baldwin , freebsd-current@freebsd.org, "freebsd-fs@freebsd.org" Subject: Re: futimens and utimensat vs birthtime Message-ID: <20150816214616.GA38422@stack.nl> References: <55CDFF32.7050601@freebsd.org> <2405496.WdPSxGzEuT@ralph.baldwin.cx> <55D09D8D.7010206@freebsd.org> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <55D09D8D.7010206@freebsd.org> User-Agent: Mutt/1.5.21 (2010-09-15) X-BeenThere: freebsd-fs@freebsd.org 
X-Mailman-Version: 2.1.20 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Sun, 16 Aug 2015 21:46:20 -0000

On Sun, Aug 16, 2015 at 10:26:21PM +0800, Julian Elischer wrote:
> On 8/15/15 1:39 AM, John Baldwin wrote:
> > On Friday, August 14, 2015 10:46:10 PM Julian Elischer wrote:
> >> I would like to implement this call, but would like input as to its
> >> nature.
> >> The code inside the system would already appear to support handling
> >> three elements, though it needs some scrutiny,
> >> so all that is needed is a system call with the ability to set the
> >> birthtime directly.
> >> Whether it should take the form of the existing calls but expecting
> >> three items is up for discussion.
> >> Maybe the addition of a flags argument to specify which items are
> >> present and which to set.
> >> ideas?
> > I believe these should be new calls.  Only utimensat() provides a flag
> > argument, but it is reserved for AT_* flags.

Using AT_* flags for things unrelated to pathnames is not without
precedent: AT_REMOVEDIR for unlinkat() and AT_EACCESS for faccessat().
This isn't suitable for a large number of flags, though.

> I wasn't suggesting we keep the old ones and silently make them take 3
> args :-)
> I was thinking of supplementing them with new syscalls, and the obvious
> names are those you suggested.
> however I do wonder if there will ever be a need for a 4th...

This could be indicated by yet another flag.

I'm a bit disappointed that setting birthtimes apparently wasn't needed
when I added futimens and utimensat. However, they are not part of any
release yet, so it may be possible to remove them at some point.
-- Jilles Tjoelker From owner-freebsd-fs@freebsd.org Mon Aug 17 05:26:01 2015 Return-Path: Delivered-To: freebsd-fs@mailman.ysv.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by mailman.ysv.freebsd.org (Postfix) with ESMTP id 954539B7F53; Mon, 17 Aug 2015 05:26:01 +0000 (UTC) (envelope-from julian@freebsd.org) Received: from vps1.elischer.org (vps1.elischer.org [204.109.63.16]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (Client CN "vps1.elischer.org", Issuer "CA Cert Signing Authority" (not verified)) by mx1.freebsd.org (Postfix) with ESMTPS id 53BDF1AC4; Mon, 17 Aug 2015 05:26:00 +0000 (UTC) (envelope-from julian@freebsd.org) Received: from Julian-MBP3.local (ppp121-45-240-35.lns20.per4.internode.on.net [121.45.240.35]) (authenticated bits=0) by vps1.elischer.org (8.15.2/8.15.2) with ESMTPSA id t7H5Pri8025838 (version=TLSv1.2 cipher=DHE-RSA-AES128-SHA bits=128 verify=NO); Sun, 16 Aug 2015 22:25:56 -0700 (PDT) (envelope-from julian@freebsd.org) Subject: Re: futimens and utimensat vs birthtime To: Jilles Tjoelker References: <55CDFF32.7050601@freebsd.org> <2405496.WdPSxGzEuT@ralph.baldwin.cx> <55D09D8D.7010206@freebsd.org> <20150816214616.GA38422@stack.nl> Cc: John Baldwin , freebsd-current@freebsd.org, "freebsd-fs@freebsd.org" From: Julian Elischer Message-ID: <55D1705C.7070305@freebsd.org> Date: Mon, 17 Aug 2015 13:25:48 +0800 User-Agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10.10; rv:38.0) Gecko/20100101 Thunderbird/38.1.0 MIME-Version: 1.0 In-Reply-To: <20150816214616.GA38422@stack.nl> Content-Type: text/plain; charset=windows-1252; format=flowed Content-Transfer-Encoding: 7bit X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.20 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Mon, 17 Aug 2015 05:26:01 -0000 On 8/17/15 5:46 AM, Jilles Tjoelker wrote: > On Sun, Aug 16, 2015 at 10:26:21PM +0800, 
Julian Elischer wrote:
>> On 8/15/15 1:39 AM, John Baldwin wrote:
>>> On Friday, August 14, 2015 10:46:10 PM Julian Elischer wrote:
>>>> I would like to implement this call, but would like input as to its
>>>> nature.
>>>> The code inside the system would already appear to support handling
>>>> three elements, though it needs some scrutiny,
>>>> so all that is needed is a system call with the ability to set the
>>>> birthtime directly.
>>>> Whether it should take the form of the existing calls but expecting
>>>> three items is up for discussion.
>>>> Maybe the addition of a flags argument to specify which items are
>>>> present and which to set.
>>>> ideas?
>>> I believe these should be new calls.  Only utimensat() provides a flag
>>> argument, but it is reserved for AT_* flags.
> Using AT_* flags for things unrelated to pathnames is not without
> precedent: AT_REMOVEDIR for unlinkat() and AT_EACCESS for faccessat().
> This isn't suitable for a large number of flags, though.
>
>> I wasn't suggesting we keep the old ones and silently make them take 3
>> args :-)
>> I was thinking of supplementing them with new syscalls, and the obvious
>> names are those you suggested.
>> however I do wonder if there will ever be a need for a 4th...
> This could be indicated by yet another flag.

not in futimens, futimes or any of the other varieties that have no
flags..

> I'm a bit disappointed that setting birthtimes apparently wasn't needed
> when I added futimens and utimensat. However, they are not part of any
> release yet, so it may be possible to remove them at some point.

Well, they are defined in POSIX as only taking the two args, right? I'm
not sure I am parsing your statement correctly. It reads a bit like you
are disappointed at yourself, which doesn't seem right.

I think we should keep them, as they are part of POSIX and part of
Linux. We actually independently added them at $JOB (for Linux compat),
but now we have a need for better birthtime control.
In the meantime UFS2 and ZFS both have birthtime and its use is getting more widespread. (not sure if NFSv4 supports it but Samba needs it for full emulation) We'd rather add a new call in conjunction with FreeBSD rather than add our own and see FreeBSD add a slightly different one. It is possible we could do a syscall capable of doing 3, and just have a library entry that is compatible with the standard as a front end. suggestions welcome. > From owner-freebsd-fs@freebsd.org Mon Aug 17 22:29:55 2015 Return-Path: Delivered-To: freebsd-fs@mailman.ysv.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by mailman.ysv.freebsd.org (Postfix) with ESMTP id 137A29BBFD1; Mon, 17 Aug 2015 22:29:55 +0000 (UTC) (envelope-from jhb@freebsd.org) Received: from bigwig.baldwin.cx (bigwig.baldwin.cx [IPv6:2001:470:1f11:75::1]) (using TLSv1 with cipher DHE-RSA-CAMELLIA256-SHA (256/256 bits)) (Client did not present a certificate) by mx1.freebsd.org (Postfix) with ESMTPS id E202511DB; Mon, 17 Aug 2015 22:29:54 +0000 (UTC) (envelope-from jhb@freebsd.org) Received: from ralph.baldwin.cx (75-48-78-19.lightspeed.cncrca.sbcglobal.net [75.48.78.19]) by bigwig.baldwin.cx (Postfix) with ESMTPSA id 284D0B939; Mon, 17 Aug 2015 18:29:53 -0400 (EDT) From: John Baldwin To: Adrian Chadd Cc: Julian Elischer , freebsd-current , "freebsd-fs@freebsd.org" , Jilles Tjoelker Subject: Re: futimens and utimensat vs birthtime Date: Mon, 17 Aug 2015 15:28:45 -0700 Message-ID: <6270978.RcR1JVbHrR@ralph.baldwin.cx> User-Agent: KMail/4.14.3 (FreeBSD/10.2-PRERELEASE; KDE/4.14.3; amd64; ; ) In-Reply-To: References: <55CDFF32.7050601@freebsd.org> <55D09D8D.7010206@freebsd.org> MIME-Version: 1.0 Content-Transfer-Encoding: 7Bit Content-Type: text/plain; charset="us-ascii" X-Greylist: Sender succeeded SMTP AUTH, not delayed by milter-greylist-4.2.7 (bigwig.baldwin.cx); Mon, 17 Aug 2015 18:29:53 -0400 (EDT) X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.20 
Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Mon, 17 Aug 2015 22:29:55 -0000

On Sunday, August 16, 2015 12:05:07 PM Adrian Chadd wrote:
> .. then make it take a struct and a type flag. :P
>
> Then you can extend it however you'd like.

I think that might be a bit much (making it a struct), but one option
could be to add an argument that says how many timespecs are in the
array.  Any "missing" timespecs could be treated as if they were set
to UTIME_OMIT.  This would in theory mean you could support additional
timestamps in the future without needing new calls.  I'm just not sure
if there are any conceivable timestamps such that this flexibility is
warranted?

> -adrian
>
> On 16 August 2015 at 07:26, Julian Elischer wrote:
> > On 8/15/15 1:39 AM, John Baldwin wrote:
> >> On Friday, August 14, 2015 10:46:10 PM Julian Elischer wrote:
> >>> I would like to implement this call, but would like input as to its
> >>> nature.
> >>> The code inside the system would already appear to support handling
> >>> three elements, though it needs some scrutiny,
> >>> so all that is needed is a system call with the ability to set the
> >>> birthtime directly.
> >>>
> >>> Whether it should take the form of the existing calls but expecting
> >>> three items is up for discussion.
> >>> Maybe the addition of a flags argument to specify which items are
> >>> present and which to set.
> >>>
> >>> ideas?
> >>
> >> I believe these should be new calls.  Only utimensat() provides a flag
> >> argument, but it is reserved for AT_* flags.
> >
> > I wasn't suggesting we keep the old ones and silently make them take 3 args
> > :-)
> > I was thinking of supplementing them with new syscalls, and the obvious names
> > are those you suggested.
> > however I do wonder if there will ever be a need for a 4th...
> > > >> I would be fine with > >> something like futimens3() and utimensat3() (where 3 means "three > >> timespecs"). Jilles implemented futimens() and utimensat(), so he > >> might have ideas as well. I would probably stick the birth time in > >> the third (final) timespec slot to make it easier to update new code > >> (you can use an #ifdef just around ts[2] without having to #ifdef the > >> entire block). > >> > > > > _______________________________________________ > > freebsd-current@freebsd.org mailing list > > https://lists.freebsd.org/mailman/listinfo/freebsd-current > > To unsubscribe, send any mail to "freebsd-current-unsubscribe@freebsd.org" -- John Baldwin From owner-freebsd-fs@freebsd.org Mon Aug 17 23:19:37 2015 Return-Path: Delivered-To: freebsd-fs@mailman.ysv.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by mailman.ysv.freebsd.org (Postfix) with ESMTP id 9E98A9BC73B for ; Mon, 17 Aug 2015 23:19:37 +0000 (UTC) (envelope-from mckusick@mckusick.com) Received: from chez.mckusick.com (chez.mckusick.com [IPv6:2001:5a8:4:7e72:d250:99ff:fe57:4030]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (Client did not present a certificate) by mx1.freebsd.org (Postfix) with ESMTPS id 7BDAC17DC; Mon, 17 Aug 2015 23:19:37 +0000 (UTC) (envelope-from mckusick@mckusick.com) Received: from chez.mckusick.com (localhost [IPv6:::1]) by chez.mckusick.com (8.14.9/8.14.9) with ESMTP id t7HNJYKN018032; Mon, 17 Aug 2015 16:19:35 -0700 (PDT) (envelope-from mckusick@chez.mckusick.com) Message-Id: <201508172319.t7HNJYKN018032@chez.mckusick.com> From: Kirk McKusick To: John Baldwin Subject: Re: futimens and utimensat vs birthtime cc: Adrian Chadd , "freebsd-fs@freebsd.org" , Jilles Tjoelker In-reply-to: <6270978.RcR1JVbHrR@ralph.baldwin.cx> MIME-Version: 1.0 Content-Type: text/plain; charset="us-ascii" Content-ID: <18030.1439853574.1@chez.mckusick.com> Date: Mon, 17 Aug 2015 16:19:34 -0700 X-Spam-Status: No, 
score=0.1 required=5.0 tests=MISSING_MID, UNPARSEABLE_RELAY autolearn=no autolearn_force=no version=3.4.1 X-Spam-Checker-Version: SpamAssassin 3.4.1 (2015-04-28) on chez.mckusick.com X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.20 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Mon, 17 Aug 2015 23:19:37 -0000 > From: John Baldwin > To: Adrian Chadd > Subject: Re: futimens and utimensat vs birthtime > Date: Mon, 17 Aug 2015 15:28:45 -0700 > > On Sunday, August 16, 2015 12:05:07 PM Adrian Chadd wrote: >> .. then make it take a struct and a type flag. :P >> >> Then you can extend it however you'd like. > > I think that might be a bit much (making it a struct), but one option > could be to add an argument that says how many timespecs are in the > array. Any "missing" timespecs could be treated as if they were set > to UTIME_OMIT. This would in theory mean you could support additional > timestamps in the future without needing new calls. I'm just not sure > if there are any conceivable timestamps such that this flexibility is > warranted? I agree that it is unlikely that you would ever need another timestamp. But just to stretch my imagination to think of a conceivable one, how about a "snapshot" timestamp that indicates the time that the snapshot of the file was taken. If it is a log file, that would let you know when events stopped being able to be logged to it. 
~Kirk From owner-freebsd-fs@freebsd.org Tue Aug 18 17:25:49 2015 Return-Path: Delivered-To: freebsd-fs@mailman.ysv.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by mailman.ysv.freebsd.org (Postfix) with ESMTP id A09DE9BC892 for ; Tue, 18 Aug 2015 17:25:49 +0000 (UTC) (envelope-from mwlucas@mail.michaelwlucas.com) Received: from mailman.ysv.freebsd.org (mailman.ysv.freebsd.org [IPv6:2001:1900:2254:206a::50:5]) by mx1.freebsd.org (Postfix) with ESMTP id 867841931 for ; Tue, 18 Aug 2015 17:25:49 +0000 (UTC) (envelope-from mwlucas@mail.michaelwlucas.com) Received: by mailman.ysv.freebsd.org (Postfix) id 8350B9BC890; Tue, 18 Aug 2015 17:25:49 +0000 (UTC) Delivered-To: fs@mailman.ysv.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by mailman.ysv.freebsd.org (Postfix) with ESMTP id 82E229BC88F for ; Tue, 18 Aug 2015 17:25:49 +0000 (UTC) (envelope-from mwlucas@mail.michaelwlucas.com) Received: from mail.michaelwlucas.com (mail.michaelwlucas.com [104.236.197.233]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (Client did not present a certificate) by mx1.freebsd.org (Postfix) with ESMTPS id 3D1861930 for ; Tue, 18 Aug 2015 17:25:48 +0000 (UTC) (envelope-from mwlucas@mail.michaelwlucas.com) Received: from mail.michaelwlucas.com (localhost [127.0.0.1]) by mail.michaelwlucas.com (8.14.9/8.14.7) with ESMTP id t7IHOV7d098039 for ; Tue, 18 Aug 2015 13:24:32 -0400 (EDT) (envelope-from mwlucas@mail.michaelwlucas.com) Received: (from mwlucas@localhost) by mail.michaelwlucas.com (8.14.9/8.14.7/Submit) id t7IHOVfX098038 for fs@freebsd.org; Tue, 18 Aug 2015 13:24:31 -0400 (EDT) (envelope-from mwlucas) Date: Tue, 18 Aug 2015 13:24:31 -0400 From: "Michael W. 
Lucas" To: fs@freebsd.org Subject: dtrace script for io latency/throughput Message-ID: <20150818172431.GA97967@mail.michaelwlucas.com> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline User-Agent: Mutt/1.5.23 (2014-03-12) X-Spam-Status: No, score=0.0 required=5.0 tests=UNPARSEABLE_RELAY, URIBL_BLOCKED autolearn=ham autolearn_force=no version=3.4.1 X-Spam-Checker-Version: SpamAssassin 3.4.1 (2015-04-28) on mail.michaelwlucas.com X-Greylist: Sender IP whitelisted, not delayed by milter-greylist-4.4.3 (mail.michaelwlucas.com [127.0.0.1]); Tue, 18 Aug 2015 13:24:32 -0400 (EDT) X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.20 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Tue, 18 Aug 2015 17:25:49 -0000

Hi,

I'm working on the performance part of allanjude@'s and my next ZFS
book. There's an incredibly useful latency/throughput script at
http://dtrace.org/blogs/ahl/2014/08/31/openzfs-tuning/, but it doesn't
work on FreeBSD. I try to run this script on FreeBSD and get:

# dtrace -s rw.d -c 'sleep 60'
dtrace: failed to compile script rw.d: line 10: b_edev is not a member of struct bio

Which seems pretty clear: FreeBSD is not Solarisy. Is there a similar
(or simpler) way to map latency vs throughput on FreeBSD?

Thanks,
==ml

PS: The script is:

#pragma D option quiet

BEGIN
{
        start = timestamp;
}

io:::start
{
        ts[args[0]->b_edev, args[0]->b_lblkno] = timestamp;
}

io:::done
/ts[args[0]->b_edev, args[0]->b_lblkno]/
{
        this->delta = (timestamp - ts[args[0]->b_edev, args[0]->b_lblkno]) / 1000;
        this->name = (args[0]->b_flags & (B_READ | B_WRITE)) == B_READ ?
            "read " : "write ";
        @q[this->name] = quantize(this->delta);
        @a[this->name] = avg(this->delta);
        @v[this->name] = stddev(this->delta);
        @i[this->name] = count();
        @b[this->name] = sum(args[0]->b_bcount);
        ts[args[0]->b_edev, args[0]->b_lblkno] = 0;
}

END
{
        printa(@q);
        normalize(@i, (timestamp - start) / 1000000000);
        normalize(@b, (timestamp - start) / 1000000000 * 1024);
        printf("%-30s %11s %11s %11s %11s\n", "", "avg latency", "stddev", "iops", "throughput");
        printa("%-30s %@9uus %@9uus %@9u/s %@8uk/s\n", @a, @v, @i, @b);
}

-- 
Michael W. Lucas - mwlucas@michaelwlucas.com, Twitter @mwlauthor
http://www.MichaelWLucas.com/, http://blather.MichaelWLucas.com/

From owner-freebsd-fs@freebsd.org Tue Aug 18 22:07:45 2015 Return-Path: Delivered-To: freebsd-fs@mailman.ysv.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by mailman.ysv.freebsd.org (Postfix) with ESMTP id 456139BBD7D for ; Tue, 18 Aug 2015 22:07:45 +0000 (UTC) (envelope-from rysto32@gmail.com) Received: from mailman.ysv.freebsd.org (mailman.ysv.freebsd.org [IPv6:2001:1900:2254:206a::50:5]) by mx1.freebsd.org (Postfix) with ESMTP id 2708F97C for ; Tue, 18 Aug 2015 22:07:45 +0000 (UTC) (envelope-from rysto32@gmail.com) Received: by mailman.ysv.freebsd.org (Postfix) id 261DE9BBD7C; Tue, 18 Aug 2015 22:07:45 +0000 (UTC) Delivered-To: fs@mailman.ysv.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by mailman.ysv.freebsd.org (Postfix) with ESMTP id 25B139BBD7B for ; Tue, 18 Aug 2015 22:07:45 +0000 (UTC) (envelope-from rysto32@gmail.com) Received: from mail-ig0-x22c.google.com (mail-ig0-x22c.google.com [IPv6:2607:f8b0:4001:c05::22c]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (Client CN "smtp.gmail.com", Issuer "Google Internet Authority G2" (verified OK)) by mx1.freebsd.org (Postfix) with ESMTPS id E7CE797B for ; Tue, 18 Aug 2015 22:07:44 +0000 (UTC) (envelope-from rysto32@gmail.com) Received: by igxp17
with SMTP id p17so92068123igx.1 for ; Tue, 18 Aug 2015 15:07:44 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20120113; h=mime-version:in-reply-to:references:date:message-id:subject:from:to :cc:content-type; bh=94hNLq0KAH4o/EciKylr6hy2PdRell2UjVyPJRf0MMo=; b=gfP3M11kOBLjTVSplZVAsam1HLGm6KCvlCNPOPZ2g15BPQeFpMQ+acn0ja3T3NULc0 +7r/cwQQfrok/37Lp6SAd9Q8YZTZjzVQ1iS3AxTh2XuurOx6l+uYGXPM1GhCR1kpnE7b cHKWbgCLuLGC1LsaOJXDYczMujsvAF47HDBSnJqtwepDwIKN/FtOAqQVqcEV32m4yOUN ruuE0pX1vT6Ritegyljcn+v6cL0tx3/monZxDT2LeLyU6qKNLs76gEaxbUD3nNPcw8qB RocymZlXeis3qDLOhZ0ugJIJomTdZn+umqc52sIjBM3Yp+VmbCyGKsEAnnZtnIFRy/0Q eTMA== MIME-Version: 1.0 X-Received: by 10.50.93.99 with SMTP id ct3mr25100274igb.83.1439935664225; Tue, 18 Aug 2015 15:07:44 -0700 (PDT) Received: by 10.107.169.94 with HTTP; Tue, 18 Aug 2015 15:07:44 -0700 (PDT) In-Reply-To: <20150818172431.GA97967@mail.michaelwlucas.com> References: <20150818172431.GA97967@mail.michaelwlucas.com> Date: Tue, 18 Aug 2015 18:07:44 -0400 Message-ID: Subject: Re: dtrace script for io latency/throughput From: Ryan Stone To: "Michael W. Lucas" Cc: "freebsd-fs@freebsd.org" Content-Type: text/plain; charset=UTF-8 X-Content-Filtered-By: Mailman/MimeDel 2.1.20 X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.20 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Tue, 18 Aug 2015 22:07:45 -0000 Try this: https://people.freebsd.org/~rstone/rw.d I've run it on a recent-ish head and it appears to give reasonable data. 
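[Editor's note: the compile error in the original message comes from the io
provider's argument type: on FreeBSD io:::start/io:::done hand the script a
struct bio, not Solaris's bufinfo_t, so the b_* field names do not exist. An
untested sketch of the adaptation is below; Ryan's rw.d above is the working
version. The field names (bio_cmd, bio_bcount) are from FreeBSD's sys/bio.h,
and the start timestamp is keyed on the bio pointer itself rather than the
Solaris (b_edev, b_lblkno) pair.]

```d
#pragma D option quiet

io:::start
{
        /* Key on the bio pointer; there is no b_edev/b_lblkno here. */
        ts[arg0] = timestamp;
}

io:::done
/ts[arg0]/
{
        this->delta = (timestamp - ts[arg0]) / 1000;
        /* 0x01 == BIO_READ, 0x02 == BIO_WRITE in sys/bio.h */
        this->name = args[0]->bio_cmd == 0x01 ? "read " : "write ";
        @q[this->name] = quantize(this->delta);
        @b[this->name] = sum(args[0]->bio_bcount);
        ts[arg0] = 0;
}
```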
From owner-freebsd-fs@freebsd.org Wed Aug 19 00:28:29 2015 Return-Path: Delivered-To: freebsd-fs@mailman.ysv.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by mailman.ysv.freebsd.org (Postfix) with ESMTP id F35929BC979 for ; Wed, 19 Aug 2015 00:28:28 +0000 (UTC) (envelope-from javocado@gmail.com) Received: from mail-lb0-x236.google.com (mail-lb0-x236.google.com [IPv6:2a00:1450:4010:c04::236]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (Client CN "smtp.gmail.com", Issuer "Google Internet Authority G2" (verified OK)) by mx1.freebsd.org (Postfix) with ESMTPS id 67562E1E for ; Wed, 19 Aug 2015 00:28:28 +0000 (UTC) (envelope-from javocado@gmail.com) Received: by lbbpu9 with SMTP id pu9so113827502lbb.3 for ; Tue, 18 Aug 2015 17:28:26 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20120113; h=mime-version:date:message-id:subject:from:to:content-type; bh=D4kCSF5dWes/xPj6WIJ5uh+rJolfDD2uXOI8SyDRKVU=; b=lIgNVORcHKSZLntfPTAlbMWf4qnhsAxv6eIKBNwD2GzGAgqqCv4TPPfe11HCiWomjJ c4T/c20ko3ioNcKb1a/gDGGyLirRjXtj89ZdYG2ayDhU0HtKs499bZtOlGlggk2Xna3F KZ0gCTflxB7BCoe3YR9Mhq1I5oxsZkmQpDX3x5842j4z70DlSYuuZpm08X+qjYBFGXQ9 49Wo1sNqz/nVT1ss3TEl4mosWCQ+HqPQWs/CFaHIbsU2oR/wOhw/FDP26pgR8IlGvN+R 8sXsvMb9gv5SxfR/uA8DpMFYc7DviL2ZpcwAEmdQfCyrRBy0Cj5TzKPAQOQefsZqJbWw kyxw== MIME-Version: 1.0 X-Received: by 10.152.43.77 with SMTP id u13mr8567106lal.96.1439944105793; Tue, 18 Aug 2015 17:28:25 -0700 (PDT) Received: by 10.114.96.136 with HTTP; Tue, 18 Aug 2015 17:28:25 -0700 (PDT) Date: Tue, 18 Aug 2015 17:28:25 -0700 Message-ID: Subject: Optimizing performance with SLOG/L2ARC From: javocado To: FreeBSD Filesystems Content-Type: text/plain; charset=UTF-8 X-Content-Filtered-By: Mailman/MimeDel 2.1.20 X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.20 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 19 
Aug 2015 00:28:29 -0000 Hi, I've been trying to optimize and enhance my ZFS filesystem performance (running FreeBSD 8.3 amd64) which has been sluggish at times. Thus far I have added RAM (256GB) and I've added an SLOG (SSD mirror). The RAM seems to have helped a bit, but not sure if the SLOG was of much help. My vdev is decently busy, with writes and reads averaging 100 per second with spikes as high as 500. Here's what arc_statistics is showing me: ARC Size: 70.28% 173.89 GiB Target Size: (Adaptive) 71.84% 177.77 GiB Min Size (Hard Limit): 12.50% 30.93 GiB Max Size (High Water): 8:1 247.44 GiB ARC Efficiency: 2.25b Cache Hit Ratio: 95.76% 2.16b Cache Miss Ratio: 4.24% 95.55m Actual Hit Ratio: 64.95% 1.46b Data Demand Efficiency: 94.83% 330.99m Data Prefetch Efficiency: 26.36% 64.23m CACHE HITS BY CACHE LIST: Anonymously Used: 30.87% 665.74m Most Recently Used: 7.54% 162.67m Most Frequently Used: 60.29% 1.30b Most Recently Used Ghost: 0.18% 3.97m Most Frequently Used Ghost: 1.11% 23.89m CACHE HITS BY DATA TYPE: Demand Data: 14.56% 313.89m Prefetch Data: 0.79% 16.93m Demand Metadata: 53.28% 1.15b Prefetch Metadata: 31.38% 676.68m CACHE MISSES BY DATA TYPE: Demand Data: 17.90% 17.10m Prefetch Data: 49.50% 47.30m Demand Metadata: 24.46% 23.37m Prefetch Metadata: 8.14% 7.78m 1. Based on the output above, I believe a larger ARC may not necessarily benefit me at this point. True? 2. Is more (L2)ARC always better? 3. I know it's a good idea to mirror the SLOG (and I have). Do I understand correctly that I do not need to mirror the L2ARC since it's just a read cache, nothing to lose if the SSD goes down? 4. Is there a better way than looking at zpool iostat -v to determine the SLOG utilization and usefulness? I'd like to test-see if adding L2ARC yields any performance boost. Since SLOG isn't doing much for me, I'm thinking I could easily repurpose my SLOG into an L2ARC. Questions: 5.
In testing, it seemed fine to remove the SLOG from a live/running system (zpool remove pool mirror-3). Is this in fact a safe thing to do to a live/running system? ZFS knows that it should flush the ZIL, then remove the device? Is it better or necessary to shut down the system and remove the SLOG in "read only" mode? 6. Am I missing something about how the SLOG and L2ARC play together that I would miss by running my proposed test. i.e. if I take down the SLOG and repurpose as an L2ARC might I be shooting myself in the foot cause the SLOG and L2ARC combo is much more powerful than the L2ARC alone (or SLOG alone)? My hope here is to see if the L2ARC improves performance, after which I will proceed with buying the SSD(s) for both the SLOG and L2ARC. Thanks From owner-freebsd-fs@freebsd.org Wed Aug 19 01:50:30 2015 Return-Path: Delivered-To: freebsd-fs@mailman.ysv.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by mailman.ysv.freebsd.org (Postfix) with ESMTP id D81499BBB3A for ; Wed, 19 Aug 2015 01:50:30 +0000 (UTC) (envelope-from mwlucas@mail.michaelwlucas.com) Received: from mailman.ysv.freebsd.org (mailman.ysv.freebsd.org [IPv6:2001:1900:2254:206a::50:5]) by mx1.freebsd.org (Postfix) with ESMTP id BB6A6ABF for ; Wed, 19 Aug 2015 01:50:30 +0000 (UTC) (envelope-from mwlucas@mail.michaelwlucas.com) Received: by mailman.ysv.freebsd.org (Postfix) id B87A69BBB39; Wed, 19 Aug 2015 01:50:30 +0000 (UTC) Delivered-To: fs@mailman.ysv.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by mailman.ysv.freebsd.org (Postfix) with ESMTP id B80DB9BBB37 for ; Wed, 19 Aug 2015 01:50:30 +0000 (UTC) (envelope-from mwlucas@mail.michaelwlucas.com) Received: from mail.michaelwlucas.com (mail.michaelwlucas.com [104.236.197.233]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (Client did not present a certificate) by mx1.freebsd.org (Postfix) with ESMTPS id 678F4ABD for ; Wed, 19 
Aug 2015 01:50:30 +0000 (UTC) (envelope-from mwlucas@mail.michaelwlucas.com) Received: from mail.michaelwlucas.com (localhost [127.0.0.1]) by mail.michaelwlucas.com (8.14.9/8.14.7) with ESMTP id t7J1nCQA001737; Tue, 18 Aug 2015 21:49:12 -0400 (EDT) (envelope-from mwlucas@mail.michaelwlucas.com) Received: (from mwlucas@localhost) by mail.michaelwlucas.com (8.14.9/8.14.7/Submit) id t7J1nCDc001736; Tue, 18 Aug 2015 21:49:12 -0400 (EDT) (envelope-from mwlucas) Date: Tue, 18 Aug 2015 21:49:12 -0400 From: "Michael W. Lucas" To: Ryan Stone Cc: "freebsd-fs@freebsd.org" Subject: Re: dtrace script for io latency/throughput Message-ID: <20150819014912.GA1707@mail.michaelwlucas.com> References: <20150818172431.GA97967@mail.michaelwlucas.com> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: User-Agent: Mutt/1.5.23 (2014-03-12) X-Spam-Status: No, score=0.0 required=5.0 tests=UNPARSEABLE_RELAY, URIBL_BLOCKED autolearn=ham autolearn_force=no version=3.4.1 X-Spam-Checker-Version: SpamAssassin 3.4.1 (2015-04-28) on mail.michaelwlucas.com X-Greylist: Sender IP whitelisted, not delayed by milter-greylist-4.4.3 (mail.michaelwlucas.com [127.0.0.1]); Tue, 18 Aug 2015 21:49:13 -0400 (EDT) X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.20 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 19 Aug 2015 01:50:30 -0000 On Tue, Aug 18, 2015 at 06:07:44PM -0400, Ryan Stone wrote: > Try this: > [1]https://people.freebsd.org/~rstone/rw.d > I've run it on a recent-ish head and it appears to give reasonable > data. Brilliant, thank you! For the archives: if you want to see both reads and writes, you need to do something to generate both types of I/O. ==ml -- Michael W. 
Lucas - mwlucas@michaelwlucas.com, Twitter @mwlauthor http://www.MichaelWLucas.com/, http://blather.MichaelWLucas.com/ From owner-freebsd-fs@freebsd.org Wed Aug 19 02:28:53 2015 Return-Path: Delivered-To: freebsd-fs@mailman.ysv.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by mailman.ysv.freebsd.org (Postfix) with ESMTP id 1D4D39BD432 for ; Wed, 19 Aug 2015 02:28:53 +0000 (UTC) (envelope-from bfriesen@simple.dallas.tx.us) Received: from blade.simplesystems.org (blade.simplesystems.org [65.66.246.74]) (using TLSv1 with cipher DHE-RSA-AES256-SHA (256/256 bits)) (Client did not present a certificate) by mx1.freebsd.org (Postfix) with ESMTPS id BD017FD6 for ; Wed, 19 Aug 2015 02:28:52 +0000 (UTC) (envelope-from bfriesen@simple.dallas.tx.us) Received: from freddy.simplesystems.org (freddy.simplesystems.org [65.66.246.65]) by blade.simplesystems.org (8.14.4+Sun/8.14.4) with ESMTP id t7J2GAQV009049; Tue, 18 Aug 2015 21:16:10 -0500 (CDT) Date: Tue, 18 Aug 2015 21:16:10 -0500 (CDT) From: Bob Friesenhahn X-X-Sender: bfriesen@freddy.simplesystems.org To: javocado cc: FreeBSD Filesystems Subject: Re: Optimizing performance with SLOG/L2ARC In-Reply-To: Message-ID: References: User-Agent: Alpine 2.01 (GSO 1266 2009-07-14) MIME-Version: 1.0 Content-Type: TEXT/PLAIN; charset=US-ASCII; format=flowed X-Greylist: Sender IP whitelisted, not delayed by milter-greylist-4.2.2 (blade.simplesystems.org [65.66.246.90]); Tue, 18 Aug 2015 21:16:11 -0500 (CDT) X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.20 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 19 Aug 2015 02:28:53 -0000 On Tue, 18 Aug 2015, javocado wrote: > I've been trying to optimize and enhance my ZFS filesystem performance > (running FreeBSD 8.3amd) which has been sluggish at times. Thus far I have > added RAM (256GB) and I've added an SLOG (SSD mirror). 
The RAM seems to > have helped a bit, but not sure if the SLOG was of much help. My vdev is > decently busy, with writes and reads averaging at 100 per second with > spikes as high as 500. Lots of interesting questions. You have not told us the use case for your system, or the zpool layout. Your comment about 'vdev is decently busy' causes me to think that perhaps you have just one and that more vdevs will (at least) linearly improve over-all performance. > 1. based on the output above, I believe a larger ARC may not necessarily > benefit me at this point. True? It looks like your ARC is doing well. > 2. Is more (L2)ARC always better? No. If (L2)ARC ends up empty, then it is wasted. > 3. I know it's a good idea to mirror the SLOG (and I have). Do I understand > correctly that I do not need to mirror the L2ARC since it's just a read > cache, nothing to lose if the SSD goes down? That is my understanding. Everything in the (L2)ARC is also on the pool disks. Look into re-architecting your pool. It is not clear what type of reads/writes are taking place but if these are random access to pool disks and you are using a raidzN organization, then you may be bottlenecked on disk write I/Os. This is the important thing to determine. If you were bottlenecked on async disk write I/Os and your slog is relatively idle, then you may benefit most from more vdevs and possibly more disks. Decreasing read I/Os will help, but can only go so far. Use mirrors if you can afford it. 
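Bob's point that the pool's random-write budget scales with the number of vdevs (each vdev delivering roughly one drive's worth of write IOPS) can be sketched with a back-of-the-envelope estimate. The 150 IOPS-per-disk figure below is an assumed round number for a 7200 rpm drive, not a measurement from this system:

```shell
#!/bin/sh
# Rough pool write-IOPS budget: for mirrors/raidz it scales with the
# vdev count, not the raw disk count. 150 IOPS/disk is an assumed
# figure for a 7200 rpm drive -- substitute your own measured numbers.
DISK_IOPS=150
for VDEVS in 1 3 6; do
    echo "${VDEVS} vdev(s): ~$((VDEVS * DISK_IOPS)) random write IOPS"
done
```

Comparing such an estimate against what `gstat` or `iostat -x` reports during a busy period gives a first hint of whether the pool is write-IOPS bound.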
Bob -- Bob Friesenhahn bfriesen@simple.dallas.tx.us, http://www.simplesystems.org/users/bfriesen/ GraphicsMagick Maintainer, http://www.GraphicsMagick.org/ From owner-freebsd-fs@freebsd.org Wed Aug 19 08:58:36 2015 Return-Path: Delivered-To: freebsd-fs@mailman.ysv.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by mailman.ysv.freebsd.org (Postfix) with ESMTP id 74C8B9BD952 for ; Wed, 19 Aug 2015 08:58:36 +0000 (UTC) (envelope-from joh.hendriks@gmail.com) Received: from mail-wi0-x234.google.com (mail-wi0-x234.google.com [IPv6:2a00:1450:400c:c05::234]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (Client CN "smtp.gmail.com", Issuer "Google Internet Authority G2" (verified OK)) by mx1.freebsd.org (Postfix) with ESMTPS id 0CB5DC02 for ; Wed, 19 Aug 2015 08:58:36 +0000 (UTC) (envelope-from joh.hendriks@gmail.com) Received: by wicja10 with SMTP id ja10so1637940wic.1 for ; Wed, 19 Aug 2015 01:58:34 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20120113; h=subject:to:references:from:cc:message-id:date:user-agent :mime-version:in-reply-to:content-type:content-transfer-encoding; bh=RhrInTTZZHU7sOov59QUH70WPae19sOtWvr18M+UEiQ=; b=wXNvvn8ReROgUwodWMaWoWy4JjqtDe0sYR1NyXuNKh8Gl48cexrQ9eYPqZELHwDszm 6HGVjZGgEOcOKoamJnS1VnA5d62RvOsMaA7ZWy8waRtxwELJsM6T6d2UKRzOJmFuy5tu WEjdGu/I9CyQH/oJFXtjbjmCgLuT+skHDaQ+lpbocoMY2b8/FeV3WfBWyhRXpeNF7EMY WLctEqlof46x5EFQMzY/h5yHCrm+qBIAB58q/btGQi96sVMUhpzV+K2q4MrxW+HlhPw/ vUYhkJgIsIORlj5G3fVLVo7JW3SpdlBr5I5M2fHdlEu3850n3lWJa3NnGK9VN0kynO3B +Iqg== X-Received: by 10.194.112.3 with SMTP id im3mr21687619wjb.54.1439974714492; Wed, 19 Aug 2015 01:58:34 -0700 (PDT) Received: from Johans-MacBook-Air.local (92-70-102-130.glasvezel.netexpo.nl. 
[92.70.102.130]) by smtp.googlemail.com with ESMTPSA id bd9sm581224wib.18.2015.08.19.01.58.33 (version=TLSv1.2 cipher=ECDHE-RSA-AES128-GCM-SHA256 bits=128/128); Wed, 19 Aug 2015 01:58:33 -0700 (PDT) Subject: Re: Optimizing performance with SLOG/L2ARC To: javocado References: From: Johan Hendriks X-Enigmail-Draft-Status: N1110 Cc: freebsd-fs@freebsd.org Message-ID: <55D4453C.7040203@gmail.com> Date: Wed, 19 Aug 2015 10:58:36 +0200 User-Agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10.9; rv:38.0) Gecko/20100101 Thunderbird/38.2.0 MIME-Version: 1.0 In-Reply-To: Content-Type: text/plain; charset=windows-1252 Content-Transfer-Encoding: 7bit X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.20 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 19 Aug 2015 08:58:36 -0000 Op 19/08/15 om 02:28 schreef javocado: > Hi, > > I've been trying to optimize and enhance my ZFS filesystem performance > (running FreeBSD 8.3amd) which has been sluggish at times. Thus far I have > added RAM (256GB) and I've added an SLOG (SSD mirror). The RAM seems to > have helped a bit, but not sure if the SLOG was of much help. My vdev is > decently busy, with writes and reads averaging at 100 per second with > spikes as high as 500. 
> > Here's what arc_statistics is showing me: > > ARC Size: 70.28% 173.89 GiB > Target Size: (Adaptive) 71.84% 177.77 GiB > Min Size (Hard Limit): 12.50% 30.93 GiB > Max Size (High Water): 8:1 247.44 GiB > > ARC Efficiency: 2.25b > Cache Hit Ratio: 95.76% 2.16b > Cache Miss Ratio: 4.24% 95.55m > Actual Hit Ratio: 64.95% 1.46b > > Data Demand Efficiency: 94.83% 330.99m > Data Prefetch Efficiency: 26.36% 64.23m > > CACHE HITS BY CACHE LIST: > Anonymously Used: 30.87% 665.74m > Most Recently Used: 7.54% 162.67m > Most Frequently Used: 60.29% 1.30b > Most Recently Used Ghost: 0.18% 3.97m > Most Frequently Used Ghost: 1.11% 23.89m > > CACHE HITS BY DATA TYPE: > Demand Data: 14.56% 313.89m > Prefetch Data: 0.79% 16.93m > Demand Metadata: 53.28% 1.15b > Prefetch Metadata: 31.38% 676.68m > > CACHE MISSES BY DATA TYPE: > Demand Data: 17.90% 17.10m > Prefetch Data: 49.50% 47.30m > Demand Metadata: 24.46% 23.37m > Prefetch Metadata: 8.14% 7.78m > > > 1. based on the output above, I believe a larger ARC may not necessarily > benefit me at this point. True? > > 2. Is more (L2)ARC always better? One thing to remember is that an L2ARC requires memory! So for your hardware you need to find the sweet spot for which L2ARC size performs best. http://www.mail-archive.com/zfs-discuss@opensolaris.org/msg34674.html > > 3. I know it's a good idea to mirror the SLOG (and I have). Do I understand > correctly that I do not need to mirror the L2ARC since it's just a read > cache, nothing to lose if the SSD goes down? You could potentially lose data if the ZIL/SLOG is lost, so always use a mirrored vdev as ZIL/SLOG. You do not need a large vdev for ZIL/SLOG; 8 to 10 GB is large enough. https://pthree.org/2013/04/19/zfs-administration-appendix-a-visualizing-the-zfs-intent-log/ For L2ARC you do not need to mirror the disks. It just copies data to the device for caching; if data is not in the cache it will use the spinning disks.
If for whatever reason the cache vdev dies it will use the spinning disks again. > > 4. Is there a better way than looking at zpool iostat -v to determine the > SLOG utilization and usefulness? > > I'd like to test-see if adding L2ARC yields any performance boost. Since > SLOG isn't doing much for me, I'm thinking I could easily repurpose my SLOG > into an L2ARC. > > Questions: > > 5. In testing, it seemed fine to remove the SLOG from a live/running system > (zpool remove pool mirror-3). Is this in fact a safe thing to do to a > live/running system? ZFS knows that it should flush the ZIL, then remove > the device? Is it better or necessary to shut down the system and remove > the SLOG in "read only" mode? You can without problem remove the ZIL/SLOG on a running system. ZFS will fall back to the spinning disks for ZIL/SLOG > > 6. Am I missing something about how the SLOG and L2ARC play together that I > would miss by running my proposed test. i.e. if I take down the SLOG and > repurpose as an L2ARC might I be shooting myself in the foot cause the SLOG > and L2ARC combo is much more powerful than the L2ARC alone (or SLOG alone)? > My hope here is to see if the L2ARC improves performance, after which I > will proceed with buying the SSD(s) for both the SLOG and L2ARC. 
> > Thanks > _______________________________________________ > freebsd-fs@freebsd.org mailing list > https://lists.freebsd.org/mailman/listinfo/freebsd-fs > To unsubscribe, send any mail to "freebsd-fs-unsubscribe@freebsd.org" From owner-freebsd-fs@freebsd.org Wed Aug 19 12:18:44 2015 Return-Path: Delivered-To: freebsd-fs@mailman.ysv.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by mailman.ysv.freebsd.org (Postfix) with ESMTP id 6D7E89BC1F5 for ; Wed, 19 Aug 2015 12:18:44 +0000 (UTC) (envelope-from daniel@digsys.bg) Received: from smtp-sofia.digsys.bg (smtp-sofia.digsys.bg [193.68.21.123]) (using TLSv1 with cipher DHE-RSA-CAMELLIA256-SHA (256/256 bits)) (Client CN "smtp-sofia.digsys.bg", Issuer "Digital Systems Operational CA" (not verified)) by mx1.freebsd.org (Postfix) with ESMTPS id E74D8DAC for ; Wed, 19 Aug 2015 12:18:43 +0000 (UTC) (envelope-from daniel@digsys.bg) Received: from [193.68.6.110] ([193.68.6.110]) (authenticated bits=0) by smtp-sofia.digsys.bg (8.14.9/8.14.9) with ESMTP id t7JCBnnl022634 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=NO); Wed, 19 Aug 2015 15:11:49 +0300 (EEST) (envelope-from daniel@digsys.bg) Content-Type: text/plain; charset=utf-8 Mime-Version: 1.0 (Mac OS X Mail 8.2 \(2104\)) Subject: Re: Optimizing performance with SLOG/L2ARC From: Daniel Kalchev In-Reply-To: Date: Wed, 19 Aug 2015 15:11:49 +0300 Cc: FreeBSD Filesystems Content-Transfer-Encoding: quoted-printable Message-Id: <4BF27882-7BE1-480E-9903-434BDD202BB3@digsys.bg> References: To: javocado X-Mailer: Apple Mail (2.2104) X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.20 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 19 Aug 2015 12:18:44 -0000 > On 19.08.2015, at 3:28, javocado wrote: > > > 3. I know it's a good idea to mirror the SLOG (and I have). Do I understand > correctly that I do not need to mirror the L2ARC since it's just a read > cache, nothing to lose if the SSD goes down? > There is a little known and grossly underestimated benefit of using a SLOG, even if it is not on an SSD: less fragmentation in the pool. This is because, without a SLOG, ZFS will allocate the intent log records from blocks in the pool, then free those blocks. These are all small writes and leave behind a lot of holes. A SLOG should be on its own drive anyway; normally it is write-only, and latency aside, a normal HDD should do as well. Of course, everything needs to be tested against the expected workload. Daniel From owner-freebsd-fs@freebsd.org Wed Aug 19 15:06:19 2015 Return-Path: Delivered-To: freebsd-fs@mailman.ysv.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by mailman.ysv.freebsd.org (Postfix) with ESMTP id 4464A9BC7BA for ; Wed, 19 Aug 2015 15:06:19 +0000 (UTC) (envelope-from bfriesen@simple.dallas.tx.us) Received: from blade.simplesystems.org (blade.simplesystems.org [65.66.246.74]) (using TLSv1 with cipher DHE-RSA-AES256-SHA (256/256 bits)) (Client did not present a certificate) by mx1.freebsd.org (Postfix) with ESMTPS id 0D6DC15E2 for ; Wed, 19 Aug 2015 15:06:18 +0000 (UTC) (envelope-from bfriesen@simple.dallas.tx.us) Received: from freddy.simplesystems.org (freddy.simplesystems.org [65.66.246.65]) by blade.simplesystems.org (8.14.4+Sun/8.14.4) with ESMTP id t7JF6G6S024736; Wed, 19 Aug 2015 10:06:16 -0500 (CDT) Date: Wed, 19 Aug 2015 10:06:16 -0500 (CDT) From: Bob Friesenhahn X-X-Sender: bfriesen@freddy.simplesystems.org To: kpneal@pobox.com cc: javocado , FreeBSD Filesystems Subject: Re: Optimizing performance with SLOG/L2ARC In-Reply-To: <20150819111243.GA44407@neutralgood.org> Message-ID: References: <20150819111243.GA44407@neutralgood.org> User-Agent: Alpine 2.01 (GSO 1266 2009-07-14) MIME-Version: 1.0 Content-Type: TEXT/PLAIN;
charset=US-ASCII; format=flowed X-Greylist: Sender IP whitelisted, not delayed by milter-greylist-4.2.2 (blade.simplesystems.org [65.66.246.90]); Wed, 19 Aug 2015 10:06:17 -0500 (CDT) X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.20 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 19 Aug 2015 15:06:19 -0000 On Wed, 19 Aug 2015, kpneal@pobox.com wrote: >> Use mirrors if you can afford it. > > Mirrors are less safe than raidz* unless you have enough drives in the > mirror. Look up the failure prediction calculations that were done on one > of the zfs lists last year. (The relevant keyword might be "MTTDL".) > > Mirrors also have about the same write performance as raidz*. But reads > from mirrors scale excellently with the number of drives in the mirror. With traditional hard drives, mirrors offer more write performance than raidz because they consume fewer precious drive IOPS (e.g. 5X, 8X less) and because the pool will have more vdevs (supporting more simultaneous activities). With single large sequential writes, raidz* likely matches performance with mirrors. Performance with large sequential writes is not usually the problem when someone says that a server feels "sluggish". With today's large disk sector sizes (4k, 8k), the amount of space lost to mirroring (compared with raidz*) is not as much as it used to be. It is necessary to start at raidz2 in order to exceed the MTTDL with simple mirrors. 
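To put a number on the space trade-off Bob mentions, here is an illustrative usable-capacity comparison for a hypothetical 12-disk shelf (the disk count, disk size, and the two layouts are assumptions made up for the arithmetic; raidz2 allocation padding from large sector sizes is ignored):

```shell
#!/bin/sh
# Hypothetical 12 x 4 TB disks: 6 x 2-way mirrors vs 2 x 6-disk raidz2.
DISKS=12; SIZE_TB=4
MIRROR_TB=$(( DISKS / 2 * SIZE_TB ))        # half the disks hold copies
RAIDZ2_TB=$(( (DISKS - 2 * 2) * SIZE_TB ))  # two parity disks per raidz2 vdev
echo "mirrors: ${MIRROR_TB} TB usable (6 vdevs)"   # prints: mirrors: 24 TB usable (6 vdevs)
echo "raidz2:  ${RAIDZ2_TB} TB usable (2 vdevs)"   # prints: raidz2:  32 TB usable (2 vdevs)
```

The mirror layout gives up 8 TB of capacity but triples the vdev count, which is the write-IOPS trade-off being discussed.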
Bob -- Bob Friesenhahn bfriesen@simple.dallas.tx.us, http://www.simplesystems.org/users/bfriesen/ GraphicsMagick Maintainer, http://www.GraphicsMagick.org/ From owner-freebsd-fs@freebsd.org Wed Aug 19 15:29:50 2015 Return-Path: Delivered-To: freebsd-fs@mailman.ysv.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by mailman.ysv.freebsd.org (Postfix) with ESMTP id 84FA49BCFD6 for ; Wed, 19 Aug 2015 15:29:50 +0000 (UTC) (envelope-from paul@pk1048.com) Received: from cpanel61.fastdnsservers.com (server61.fastdnsservers.com [216.51.232.61]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (Client did not present a certificate) by mx1.freebsd.org (Postfix) with ESMTPS id 63B51B94 for ; Wed, 19 Aug 2015 15:29:49 +0000 (UTC) (envelope-from paul@pk1048.com) Received: from mail.thecreativeadvantage.com ([96.236.20.34]:62541 helo=mbp-1.thecreativeadvantage.com) by cpanel61.fastdnsservers.com with esmtpsa (TLSv1:AES128-SHA:128) (Exim 4.85) (envelope-from ) id 1ZS5J9-001Wxh-CJ; Wed, 19 Aug 2015 10:29:43 -0500 Content-Type: text/plain; charset=windows-1252 Mime-Version: 1.0 (Mac OS X Mail 7.3 \(1878.6\)) Subject: Re: Optimizing performance with SLOG/L2ARC From: PK1048 In-Reply-To: Date: Wed, 19 Aug 2015 11:29:44 -0400 Content-Transfer-Encoding: quoted-printable Message-Id: <023F881D-CCC5-4FCA-B09D-EB92C3BFBC03@pk1048.com> References: To: javocado , FreeBSD Filesystems X-Mailer: Apple Mail (2.1878.6) X-AntiAbuse: This header was added to track abuse, please include it with any abuse report X-AntiAbuse: Primary Hostname - cpanel61.fastdnsservers.com X-AntiAbuse: Original Domain - freebsd.org X-AntiAbuse: Originator/Caller UID/GID - [47 12] / [47 12] X-AntiAbuse: Sender Address Domain - pk1048.com X-Get-Message-Sender-Via: cpanel61.fastdnsservers.com: authenticated_id: info@pk1048.com X-Source: X-Source-Args: X-Source-Dir: X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.20 Precedence: list List-Id: 
Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 19 Aug 2015 15:29:50 -0000 Lots of good comments already posted, more inline below... On Aug 18, 2015, at 20:28, javocado wrote: > I've been trying to optimize and enhance my ZFS filesystem performance > (running FreeBSD 8.3amd) which has been sluggish at times. Thus far I have > added RAM (256GB) and I've added an SLOG (SSD mirror). The RAM seems to > have helped a bit, but not sure if the SLOG was of much help. My vdev is > decently busy, with writes and reads averaging at 100 per second with > spikes as high as 500. What is the system doing _besides_ being a file server? From the arcstats, the ARC is not getting all of the 256 GB RAM, what else is running? What is the pool layout (zpool status)… From a strictly performance standpoint, each vdev gives you ONE drive's worth of WRITE performance. It does not much matter if the vdev is a mirror or RAIDz. So a pool consisting of one vdev which is a 12 drive RAIDz2 will have much slower write performance than a pool consisting of 3 vdevs each of which is a 4 drive RAIDz2… Yes, you are balancing off capacity vs. performance vs. reliability. > 1. based on the output above, I believe a larger ARC may not necessarily > benefit me at this point. True? The arcstats look good. Are you having issues with reads or writes? If reads, then you need to look at your disk layout. If writes, then are they sync or async? If sync, then a ZIL can help; if async, then more ARC can help and you need to look at your disk layout. > 2. Is more (L2)ARC always better? ARC is _always_ better than L2ARC. Adding L2ARC also consumes ARC space with L2ARC metadata, and that is ARC space that cannot now be used to cache file metadata and data. So adding L2ARC (which is slower than RAM based ARC) _may_ actually make a system slower. > 3. I know it's a good idea to mirror the SLOG (and I have). Do I understand > correctly that I do not need to mirror the L2ARC since it's just a read > cache, nothing to lose if the SSD goes down? L2ARC is just read cache, so losing it causes no data loss. You _may_ lose performance _if_ the L2ARC was increasing your performance. > 4. Is there a better way than looking at zpool iostat -v to determine the > SLOG utilization and usefulness? The ZIL/SLOG is used to cache SYNC write operations (if you test with iozone, use the -o option to force sync writes). Using iostat -x you can see the activity to the SSDs, both write and read. This can tell you how much you are using them. I am not aware of any way to track how FULL your ZIL/SLOG device is. Please note that, depending on your workload, an SSD may _not_ be any faster than a HDD. I am in the process of rebuilding a file server that exhibited poor NFS SYNC write performance. Yet it had a mirrored pair of SSDs. Unfortunately, those SSDs had _worse_ write performance than an HDD for small (4 KB) writes. Based on recommendations from the OpenZFS list I have a pair of Intel 3710 SSDs coming in to try; they are supposed to have much better write performance (at all block sizes) and much better reliability long term (10x full disk writes per day for 5 years). I'll know more once they arrive and I can test with them. Someone commented on the size of the ZIL/SLOG… it needs to hold all of the write data that arrives between TXG commits, which happen at least every 5 seconds (it used to be 30 seconds, but that scared too many people :-). When a sync write arrives it _must_ be committed to permanent storage, so ZFS writes it to the ZFS Intent Log (ZIL), which may or may not be a separate device (vdev). When the TXG that contains that data is committed to the pool itself, the data can be flushed from the ZIL. If your source of sync writes is network shares, and you have a 1 Gbps link, then your maximum ZIL will be 5 seconds x 1 Gbps, or 5 Gigabits. > I'd like to test-see if adding L2ARC yields any performance boost. Since > SLOG isn't doing much for me, I'm thinking I could easily repurpose my SLOG > into an L2ARC. If a ZIL/SLOG device is not adding any performance, then either your workload is NOT sync write bound -or- your ZIL/SLOG device is no faster than your native pool performance. If the ZIL/SLOG device is not very fast, then reusing it as an L2ARC -may- actually make the system slower. > > Questions: > > 5. In testing, it seemed fine to remove the SLOG from a live/running system > (zpool remove pool mirror-3). Is this in fact a safe thing to do to a > live/running system? ZFS knows that it should flush the ZIL, then remove > the device? Is it better or necessary to shut down the system and remove > the SLOG in "read only" mode? There was a time when you _could_not_ remove a LOG device once added to a pool. A specific revision of the zpool version added the ability to remove a LOG device, as long as your zpool is a late enough version. zpool upgrade -v will tell you what version / features you have. It looks like version 19 added log removal, while 7 added the log itself :-) So if your system and zpool are version 19 or later (and I think all FreeBSD ZFS code is) then it is completely safe to remove a log device and repurpose it. > 6. Am I missing something about how the SLOG and L2ARC play together that I > would miss by running my proposed test. i.e. if I take down the SLOG and > repurpose as an L2ARC might I be shooting myself in the foot cause the SLOG > and L2ARC combo is much more powerful than the L2ARC alone (or SLOG alone)? > My hope here is to see if the L2ARC improves performance, after which I > will proceed with buying the SSD(s) for both the SLOG and L2ARC. It all depends on your workload… Sync writes are helped by the ZIL/SLOG; it does nothing for reads. ARC helps both async writes and all reads. L2ARC helps all reads. From owner-freebsd-fs@freebsd.org Wed Aug 19 15:34:41 2015 Return-Path: Delivered-To: freebsd-fs@mailman.ysv.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by mailman.ysv.freebsd.org (Postfix) with ESMTP id 89E989BE26D for ; Wed, 19 Aug 2015 15:34:41 +0000 (UTC) (envelope-from bugzilla-noreply@freebsd.org) Received: from kenobi.freebsd.org (kenobi.freebsd.org [IPv6:2001:1900:2254:206a::16:76]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (Client did not present a certificate) by mx1.freebsd.org (Postfix) with ESMTPS id 76F0D1214 for ; Wed, 19 Aug 2015 15:34:41 +0000 (UTC) (envelope-from bugzilla-noreply@freebsd.org) Received: from bugs.freebsd.org ([127.0.1.118]) by kenobi.freebsd.org (8.15.2/8.15.2) with ESMTP id t7JFYfhW041658 for ; Wed, 19 Aug 2015 15:34:41 GMT (envelope-from bugzilla-noreply@freebsd.org) From: bugzilla-noreply@freebsd.org To: freebsd-fs@FreeBSD.org Subject: [Bug 202358] [patch] [zfs] fix possible assert fail in sa_handle_get_from_db() Date: Wed, 19 Aug 2015 15:34:41 +0000 X-Bugzilla-Reason: AssignedTo X-Bugzilla-Type: changed X-Bugzilla-Watch-Reason: None X-Bugzilla-Product: Base System X-Bugzilla-Component: kern X-Bugzilla-Version: 11.0-CURRENT X-Bugzilla-Keywords: patch X-Bugzilla-Severity: Affects Only Me X-Bugzilla-Who: linimon@FreeBSD.org X-Bugzilla-Status: New X-Bugzilla-Priority: --- X-Bugzilla-Assigned-To: freebsd-fs@FreeBSD.org X-Bugzilla-Target-Milestone: --- X-Bugzilla-Flags: X-Bugzilla-Changed-Fields: assigned_to Message-ID: In-Reply-To: References: Content-Type: text/plain; charset="UTF-8" Content-Transfer-Encoding: 7bit X-Bugzilla-URL: https://bugs.freebsd.org/bugzilla/ Auto-Submitted: auto-generated MIME-Version: 1.0 X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.20 Precedence:
list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 19 Aug 2015 15:34:41 -0000 https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=202358 Mark Linimon changed: What |Removed |Added ---------------------------------------------------------------------------- Assignee|freebsd-bugs@FreeBSD.org |freebsd-fs@FreeBSD.org -- You are receiving this mail because: You are the assignee for the bug. From owner-freebsd-fs@freebsd.org Wed Aug 19 16:03:09 2015 Return-Path: Delivered-To: freebsd-fs@mailman.ysv.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by mailman.ysv.freebsd.org (Postfix) with ESMTP id C0FA09BE84D for ; Wed, 19 Aug 2015 16:03:09 +0000 (UTC) (envelope-from chip@innovates.com) Received: from mail-pd0-f175.google.com (mail-pd0-f175.google.com [209.85.192.175]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (Client CN "smtp.gmail.com", Issuer "Google Internet Authority G2" (verified OK)) by mx1.freebsd.org (Postfix) with ESMTPS id 9BD897B3 for ; Wed, 19 Aug 2015 16:03:09 +0000 (UTC) (envelope-from chip@innovates.com) Received: by pdbfa8 with SMTP id fa8so3075696pdb.1 for ; Wed, 19 Aug 2015 09:03:03 -0700 (PDT) X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20130820; h=x-gm-message-state:mime-version:in-reply-to:references:from:date :message-id:subject:to:cc:content-type; bh=F8M8yu6EO+ymv8Z1h0z19+RyPV+Ybl+ixrJWKts5/6k=; b=Jhnm1oHjESRkwz2ld131ckL7YAgN0s5PA+gQQDfpuUlbscfZrikoWJXaYxpA/CGsc2 pRLRJ398IjVQ1HBAqRplHqMaDN8+CUKGwl0eNX5ciI0mIoaX5jr95lx+V9GUOEWe9GC4 Ti1hAyMZY5Fa9gtV+v0whOGL0LwdpHsF/RKu0YU1ITn0/FnJiMHbkGC9wznsF7Nvdq2H 3B3REp3VmaxyqmYT+IlRiy3QtOG9qAcSM+XPjz+wNmRMT3t4aTRg0mbcyYDVBsNANbOB ofweAK+56pBHtyYsafZvx3WqAw7MRBoaFrptL3bUXCk9mbj2Ybgeb9bqy6BGTgeH3gDV 8Nsg== X-Gm-Message-State: ALoCoQmfA6sdTYvLd7tWsMQfJDTXd/1ZgL9X7pmlOAgAO3W2VY8E3BYmESiMPlSN357FzerzlpEF X-Received: by 10.70.137.37 with 
SMTP id qf5mr10797912pdb.12.1440000183679; Wed, 19 Aug 2015 09:03:03 -0700 (PDT) MIME-Version: 1.0 Received: by 10.66.162.196 with HTTP; Wed, 19 Aug 2015 09:02:44 -0700 (PDT) X-Originating-IP: [128.252.11.235] In-Reply-To: <023F881D-CCC5-4FCA-B09D-EB92C3BFBC03@pk1048.com> References: <023F881D-CCC5-4FCA-B09D-EB92C3BFBC03@pk1048.com> From: "Schweiss, Chip" Date: Wed, 19 Aug 2015 11:02:44 -0500 Message-ID: Subject: Re: Optimizing performance with SLOG/L2ARC To: PK1048 Cc: javocado , FreeBSD Filesystems Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: quoted-printable X-Content-Filtered-By: Mailman/MimeDel 2.1.20 X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.20 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 19 Aug 2015 16:03:09 -0000 On Wed, Aug 19, 2015 at 10:29 AM, PK1048 wrote: > > Please note that, depending on your workload, an SSD may _not_ be any > faster than a HDD. I am in the process of rebuilding a file server that > exhibited poor NFS SYNC write performance. Yet it had a mirrored pair of > SSDs. Unfortunately, those SSDs had _worse_ write performance than an HDD > for small (4 KB) writes. Based on recommendations from the OpenZFS list I > have a pair of Intel 3710 SSDs coming in to try; they are supposed to have > much better write performance (at all block sizes) and much better > reliability long term (10x full disk writes per day for 5 years). I'll know > more once they arrive and I can test with them. > Pure SSD pools still need a log device. ZFS doesn't play well with the ZIL kept on the pool itself, even on SSDs. Adding a log device, even an SSD of the same type as the pool devices, will fix the latency and throughput problems. It seems counter-intuitive, but it is a very real problem; there is a long thread about this on the Illumos ZFS list. If you don't believe it, turn off sync on your SSD pool and performance will skyrocket. 
-Chip From owner-freebsd-fs@freebsd.org Wed Aug 19 16:06:12 2015 Return-Path: Delivered-To: freebsd-fs@mailman.ysv.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by mailman.ysv.freebsd.org (Postfix) with ESMTP id F39F29BE8B2 for ; Wed, 19 Aug 2015 16:06:12 +0000 (UTC) (envelope-from paul@pk1048.com) Received: from cpanel61.fastdnsservers.com (server61.fastdnsservers.com [216.51.232.61]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (Client did not present a certificate) by mx1.freebsd.org (Postfix) with ESMTPS id D308386D for ; Wed, 19 Aug 2015 16:06:12 +0000 (UTC) (envelope-from paul@pk1048.com) Received: from mail.thecreativeadvantage.com ([96.236.20.34]:63422 helo=mbp-1.thecreativeadvantage.com) by cpanel61.fastdnsservers.com with esmtpsa (TLSv1:AES128-SHA:128) (Exim 4.85) (envelope-from ) id 1ZS5sR-001bvZ-T3; Wed, 19 Aug 2015 11:06:11 -0500 Content-Type: text/plain; charset=windows-1252 Mime-Version: 1.0 (Mac OS X Mail 7.3 \(1878.6\)) Subject: Re: Optimizing performance with SLOG/L2ARC From: PK1048 In-Reply-To: <20150819154650.GA78333@neutralgood.org> Date: Wed, 19 Aug 2015 12:06:13 -0400 Content-Transfer-Encoding: quoted-printable Message-Id: <3C6E3C03-9A64-4CEC-8238-2A73F4EE26D1@pk1048.com> References: <023F881D-CCC5-4FCA-B09D-EB92C3BFBC03@pk1048.com> <20150819154650.GA78333@neutralgood.org> To: FreeBSD Filesystems X-Mailer: Apple Mail (2.1878.6) X-AntiAbuse: This header was added to track abuse, please include it with any abuse report X-AntiAbuse: Primary Hostname - cpanel61.fastdnsservers.com X-AntiAbuse: Original Domain - freebsd.org X-AntiAbuse: Originator/Caller UID/GID - [47 12] / [47 12] X-AntiAbuse: Sender Address Domain - pk1048.com X-Get-Message-Sender-Via: cpanel61.fastdnsservers.com: authenticated_id: info@pk1048.com X-Source: X-Source-Args: X-Source-Dir: X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.20 Precedence: list List-Id: Filesystems List-Unsubscribe: , 
List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 19 Aug 2015 16:06:13 -0000 On Aug 19, 2015, at 11:46, kpneal@pobox.com wrote: > On Wed, Aug 19, 2015 at 11:29:44AM -0400, PK1048 wrote: >> Someone commented on the size of the ZIL/SLOG… it needs to hold all of the write data that arrives between TXG commits, which happen at least every 5 seconds (it used to be 30 seconds, but that scared too many people :-). So a sync write arrives and it _must_ be committed to permanent storage, so ZFS writes it to the ZFS Intent Log (ZIL), which may or may not be a separate device (vdev). When the TXG that contains that data is committed to the pool itself, the data can be flushed from the ZIL. If your source of sync writes is network shares, and you have a 1 Gbps link, then your maximum ZIL will be 5 seconds x 1 Gbps, or 5 Gigabits. > That was me. By your figures 5 Gigabits is small compared to the size of > SSDs these days. If the SLOG ends up being that important to performance > then it may make sense to buy a small, excellent-quality SSD. Exactly, unless your sync write data is coming from a _local_ application or you have multiple 10 G Ethernet connections :-) When I asked about SSD recommendations and sizing over on the OpenZFS list, the consensus was that a 32 GB log device was probably big enough for any rational load. I ordered 200 GB 3710's… sometimes performance also scales with capacity, so be careful buying the smallest SSD that fits the strict size needs. I will partition them and use some for LOG and some for L2ARC (if I need it). I know this is not recommended, but if it works, why not (and I do understand the limitations of such a configuration). I only have 24 GB RAM in this box and can probably benefit from some L2ARC. > I believe they even make PCIe battery-backed-up RAM for this use. I have > no idea about the price, though. Probably a lot. 
But maybe "a lot" isn't > much depending on the value of the service provided by the machine. I have seen the RAM-based battery-backed-up drives, and some are not even ludicrous in terms of price, but I did not see any with FreeBSD driver support. Since they are generally PCIe based (for speed), drivers are needed. From owner-freebsd-fs@freebsd.org Wed Aug 19 16:10:39 2015 Return-Path: Delivered-To: freebsd-fs@mailman.ysv.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by mailman.ysv.freebsd.org (Postfix) with ESMTP id 514499BEA9B for ; Wed, 19 Aug 2015 16:10:39 +0000 (UTC) (envelope-from paul@pk1048.com) Received: from cpanel61.fastdnsservers.com (server61.fastdnsservers.com [216.51.232.61]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (Client did not present a certificate) by mx1.freebsd.org (Postfix) with ESMTPS id 2F55CC2E for ; Wed, 19 Aug 2015 16:10:38 +0000 (UTC) (envelope-from paul@pk1048.com) Received: from mail.thecreativeadvantage.com ([96.236.20.34]:63449 helo=mbp-1.thecreativeadvantage.com) by cpanel61.fastdnsservers.com with esmtpsa (TLSv1:AES128-SHA:128) (Exim 4.85) (envelope-from ) id 1ZS5wj-001cVZ-Le; Wed, 19 Aug 2015 11:10:37 -0500 Subject: Re: Optimizing performance with SLOG/L2ARC Mime-Version: 1.0 (Mac OS X Mail 7.3 \(1878.6\)) Content-Type: text/plain; charset=windows-1252 From: PK1048 In-Reply-To: Date: Wed, 19 Aug 2015 12:10:39 -0400 Cc: javocado , "Schweiss, Chip" Content-Transfer-Encoding: quoted-printable Message-Id: <3FE10173-656C-4744-AB2D-32148A34CB46@pk1048.com> References: <023F881D-CCC5-4FCA-B09D-EB92C3BFBC03@pk1048.com> To: FreeBSD Filesystems X-Mailer: Apple Mail (2.1878.6) X-AntiAbuse: This header was added to track abuse, please include it with any abuse report X-AntiAbuse: Primary Hostname - cpanel61.fastdnsservers.com X-AntiAbuse: Original Domain - freebsd.org X-AntiAbuse: Originator/Caller UID/GID - [47 12] / [47 12] X-AntiAbuse: Sender Address 
Domain - pk1048.com X-Get-Message-Sender-Via: cpanel61.fastdnsservers.com: authenticated_id: info@pk1048.com X-Source: X-Source-Args: X-Source-Dir: X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.20 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 19 Aug 2015 16:10:39 -0000 On Aug 19, 2015, at 12:02, Schweiss, Chip wrote: > On Wed, Aug 19, 2015 at 10:29 AM, PK1048 wrote: >> Please note that, depending on your workload, an SSD may _not_ be any >> faster than a HDD. I am in the process of rebuilding a file server that >> exhibited poor NFS SYNC write performance. Yet it had a mirrored pair of >> SSDs. Unfortunately, those SSDs had _worse_ write performance than an HDD >> for small (4 KB) writes. Based on recommendations from the OpenZFS list I >> have a pair of Intel 3710 SSDs coming in to try, they are supposed to have >> much better write performance (at all block sizes) and much better >> reliability long term (10x full disk writes per day for 5 years). I'll know >> more once they arrive and I can test with them. > Pure SSD pools still need a log device. Sorry I was unclear, I was NOT suggesting a pure SSD pool. > ZFS doesn't play well with the > ZIL on the pool with SSDs. Even an SSD of the same type as the pool > devices as the log device will fix the latency problem and throughput > problems. If your load is sync writes then you decidedly want a LOG device, even if it is the same type as the devices in the pool, for the reasons others have posted. > It seems counter-intuitive but a very real problem, there is a long thread > about this on the Illumos ZFS list. If you don't believe it, turn off sync > on your SSD pool and performance will skyrocket. But remember to turn it back on after you test so that you don't break POSIX sync behavior and raise the possibility of losing writes in flight. 
I am horrified at the number of posts on the Internet that tell = you to simply disable sync to fix sync performance issues (VM images = accessed via NFS being a very common one). From owner-freebsd-fs@freebsd.org Wed Aug 19 16:14:27 2015 Return-Path: Delivered-To: freebsd-fs@mailman.ysv.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by mailman.ysv.freebsd.org (Postfix) with ESMTP id 075829BEB0F for ; Wed, 19 Aug 2015 16:14:27 +0000 (UTC) (envelope-from bfriesen@simple.dallas.tx.us) Received: from blade.simplesystems.org (blade.simplesystems.org [65.66.246.74]) (using TLSv1 with cipher DHE-RSA-AES256-SHA (256/256 bits)) (Client did not present a certificate) by mx1.freebsd.org (Postfix) with ESMTPS id C1A41E2E for ; Wed, 19 Aug 2015 16:14:26 +0000 (UTC) (envelope-from bfriesen@simple.dallas.tx.us) Received: from freddy.simplesystems.org (freddy.simplesystems.org [65.66.246.65]) by blade.simplesystems.org (8.14.4+Sun/8.14.4) with ESMTP id t7JGEObY020050; Wed, 19 Aug 2015 11:14:24 -0500 (CDT) Date: Wed, 19 Aug 2015 11:14:24 -0500 (CDT) From: Bob Friesenhahn X-X-Sender: bfriesen@freddy.simplesystems.org To: PK1048 cc: FreeBSD Filesystems Subject: Re: Optimizing performance with SLOG/L2ARC In-Reply-To: <3C6E3C03-9A64-4CEC-8238-2A73F4EE26D1@pk1048.com> Message-ID: References: <023F881D-CCC5-4FCA-B09D-EB92C3BFBC03@pk1048.com> <20150819154650.GA78333@neutralgood.org> <3C6E3C03-9A64-4CEC-8238-2A73F4EE26D1@pk1048.com> User-Agent: Alpine 2.01 (GSO 1266 2009-07-14) MIME-Version: 1.0 Content-Type: TEXT/PLAIN; charset=US-ASCII; format=flowed X-Greylist: Sender IP whitelisted, not delayed by milter-greylist-4.2.2 (blade.simplesystems.org [65.66.246.90]); Wed, 19 Aug 2015 11:14:25 -0500 (CDT) X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.20 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 19 Aug 2015 16:14:27 -0000 On Wed, 19 Aug 
2015, PK1048 wrote: > > I have seen the RAM based battery-backed-up drives, and some are not > even ludicrous in terms of price, but I did not see any with FreeBSD > driver support. Since they are generally PCIe based (for speed), > drives are needed. RAID-oriented HBAs often offer battery-backed SRAM support. One would think that using one of these with a single attached SSD should serve as a lower-latency zil than the SSD itself. Driver support is then readily available. Bob -- Bob Friesenhahn bfriesen@simple.dallas.tx.us, http://www.simplesystems.org/users/bfriesen/ GraphicsMagick Maintainer, http://www.GraphicsMagick.org/ From owner-freebsd-fs@freebsd.org Wed Aug 19 16:25:04 2015 Return-Path: Delivered-To: freebsd-fs@mailman.ysv.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by mailman.ysv.freebsd.org (Postfix) with ESMTP id 3F1C69BECE3 for ; Wed, 19 Aug 2015 16:25:04 +0000 (UTC) (envelope-from gpalmer@freebsd.org) Received: from mail.in-addr.com (mail.in-addr.com [IPv6:2a01:4f8:191:61e8::2525:2525]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (Client did not present a certificate) by mx1.freebsd.org (Postfix) with ESMTPS id 0220412E5 for ; Wed, 19 Aug 2015 16:25:04 +0000 (UTC) (envelope-from gpalmer@freebsd.org) Received: from gjp by mail.in-addr.com with local (Exim 4.86 (FreeBSD)) (envelope-from ) id 1ZS6Ae-000B4D-JR; Wed, 19 Aug 2015 17:25:00 +0100 Date: Wed, 19 Aug 2015 17:25:00 +0100 From: Gary Palmer To: PK1048 Cc: FreeBSD Filesystems , javocado Subject: Re: Optimizing performance with SLOG/L2ARC Message-ID: <20150819162500.GC13503@in-addr.com> References: <023F881D-CCC5-4FCA-B09D-EB92C3BFBC03@pk1048.com> <3FE10173-656C-4744-AB2D-32148A34CB46@pk1048.com> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <3FE10173-656C-4744-AB2D-32148A34CB46@pk1048.com> X-SA-Exim-Connect-IP: X-SA-Exim-Mail-From: gpalmer@freebsd.org 
X-SA-Exim-Scanned: No (on mail.in-addr.com); SAEximRunCond expanded to false X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.20 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 19 Aug 2015 16:25:04 -0000 On Wed, Aug 19, 2015 at 12:10:39PM -0400, PK1048 wrote: > On Aug 19, 2015, at 12:02, Schweiss, Chip wrote: > > ZFS doesn't play well with the > > ZIL on the pool with SSDs. Even an SSD of the same type as the pool > > devices as the log device will fix the latency problem and throughput > > problems. > > If your load is sync writes then you decidedly want a LOG device, even if it is the same type as the devices in the pool. For the reasons others have posted. > One thing I am curious about: A lot of posters in the past (not blaming anyone in this thread) have said that the best way to find out if a SLOG device will help your application is to try. While ZFS gathers quite extensive statistics about read/write performance & volumes, ARC/L2ARC stats, etc, it doesn't seem to have any data about the number of writes to the pool that are sync vs async. It has to know, else it couldn't handle them properly. So why are there no stats about percentage of writes that are sync? 
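Until ZFS exports such a counter, one rough way to estimate the split (an assumption on my part, not something ZFS provides) is to sample the sync path yourself, e.g. by counting `zil_commit()` entries with DTrace's fbt provider alongside total write ops over an interval, and then compute the share from the sampled counts. A minimal sketch of that bookkeeping in Python, with made-up sample numbers:

```python
def sync_write_share(sync_ops: int, total_ops: int) -> float:
    """Fraction of sampled write operations that went down the sync path."""
    if total_ops == 0:
        return 0.0
    return sync_ops / total_ops

# Hypothetical counts sampled over one interval, e.g. zil_commit() entries
# vs. all write ops; the numbers here are illustrative only.
print(f"{sync_write_share(1200, 4800):.0%}")  # 25%
```

A high share suggests a SLOG is worth testing; a near-zero share suggests it will mostly sit idle.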
Thanks, Gary From owner-freebsd-fs@freebsd.org Wed Aug 19 16:29:59 2015 Return-Path: Delivered-To: freebsd-fs@mailman.ysv.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by mailman.ysv.freebsd.org (Postfix) with ESMTP id A0B019BEE0D for ; Wed, 19 Aug 2015 16:29:59 +0000 (UTC) (envelope-from daniel@digsys.bg) Received: from smtp-sofia.digsys.bg (smtp-sofia.digsys.bg [193.68.21.123]) (using TLSv1 with cipher DHE-RSA-CAMELLIA256-SHA (256/256 bits)) (Client CN "smtp-sofia.digsys.bg", Issuer "Digital Systems Operational CA" (not verified)) by mx1.freebsd.org (Postfix) with ESMTPS id 36B541695 for ; Wed, 19 Aug 2015 16:29:58 +0000 (UTC) (envelope-from daniel@digsys.bg) Received: from [193.68.6.110] ([193.68.6.110]) (authenticated bits=0) by smtp-sofia.digsys.bg (8.14.9/8.14.9) with ESMTP id t7JGTsGB001101 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=NO); Wed, 19 Aug 2015 19:29:55 +0300 (EEST) (envelope-from daniel@digsys.bg) Content-Type: text/plain; charset=utf-8 Mime-Version: 1.0 (Mac OS X Mail 8.2 \(2104\)) Subject: Re: Optimizing performance with SLOG/L2ARC From: Daniel Kalchev In-Reply-To: <3FE10173-656C-4744-AB2D-32148A34CB46@pk1048.com> Date: Wed, 19 Aug 2015 19:29:54 +0300 Cc: FreeBSD Filesystems , javocado Content-Transfer-Encoding: quoted-printable Message-Id: <6EAEF15E-03B0-4C75-B252-FE56DEA38DA2@digsys.bg> References: <023F881D-CCC5-4FCA-B09D-EB92C3BFBC03@pk1048.com> <3FE10173-656C-4744-AB2D-32148A34CB46@pk1048.com> To: PK1048 X-Mailer: Apple Mail (2.2104) X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.20 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 19 Aug 2015 16:29:59 -0000 > On 19.08.2015 =D0=B3., at 19:10, PK1048 wrote: >=20 > On Aug 19, 2015, at 12:02, Schweiss, Chip wrote: >=20 >> ZFS doesn't play well with the >> ZIL on the pool with SSDs. 
Even an SSD of the same type as the pool >> devices as the log device will fix the latency problem and throughput >> problems. > If your load is sync writes then you decidedly want a LOG device, even if it is the same type as the devices in the pool, for the reasons others have posted. This is because of the reason I mentioned earlier. When you don't have a separate SLOG, ZFS will allocate the ZIL record from the pool blocks, then, when the blocks are at their intended location, delete that ZIL record from the pool. This plays badly with SSDs. You also get a lot of fragmentation. When you have a SLOG, ZFS will write the ZIL record to the SLOG, then (batch) write the blocks to their intended place, then forget about the SLOG (it's not freed, just overwritten). This plays much better with SSDs. You should get much better performance, with a pure-SSD pool, if you allocate a small portion of the same SSD for SLOG. I know it sounds counterintuitive. :) Of course, you get much better performance with a separate SSD for the SLOG. 
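The SLOG sizing rule of thumb quoted earlier in the thread (the log only needs to hold what can arrive between TXG commits) is easy to turn into arithmetic. A minimal sketch in Python, assuming a network-bound sync-write source and the 5-second TXG commit interval mentioned above; local writers or multiple links can exceed this bound:

```python
def max_zil_bytes(link_bits_per_sec: float, txg_interval_sec: float) -> float:
    """Upper bound on ZIL/SLOG occupancy: everything that can arrive
    between two TXG commits, converted from bits to bytes."""
    return link_bits_per_sec * txg_interval_sec / 8

# 1 Gbps link, 5-second TXG commits -> 5 Gigabits, i.e. 625 MB
print(max_zil_bytes(1e9, 5))    # 625000000.0 bytes
# Even 10 GbE needs only ~6.25 GB, comfortably under a 32 GB log device
print(max_zil_bytes(10e9, 5))   # 6250000000.0 bytes
```

This is why the consensus figure of a 32 GB log device covers almost any rational load: the bound scales with ingest rate, not pool size.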
Daniel From owner-freebsd-fs@freebsd.org Wed Aug 19 23:17:38 2015 Return-Path: Delivered-To: freebsd-fs@mailman.ysv.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by mailman.ysv.freebsd.org (Postfix) with ESMTP id 310389BD949 for ; Wed, 19 Aug 2015 23:17:38 +0000 (UTC) (envelope-from wiml@omnigroup.com) Received: from omnigroup.com (omnigroup.com [198.151.161.1]) (using TLSv1 with cipher DHE-RSA-AES256-SHA (256/256 bits)) (Client CN "omnigroup.com", Issuer "The Omni Group CA" (not verified)) by mx1.freebsd.org (Postfix) with ESMTPS id 1C88A145A for ; Wed, 19 Aug 2015 23:17:37 +0000 (UTC) (envelope-from wiml@omnigroup.com) Received: from machamp.omnigroup.com (machamp.omnigroup.com [198.151.161.135]) by omnigroup.com (Postfix) with ESMTP id 560C126353B1 for ; Wed, 19 Aug 2015 16:08:48 -0700 (PDT) Received: from [10.4.3.73] (pfsense.omnigroup.com [198.151.161.131]) by machamp.omnigroup.com (Postfix) with ESMTPSA id 490A512F9E0D for ; Wed, 19 Aug 2015 16:08:30 -0700 (PDT) From: Wim Lewis Content-Type: text/plain; charset=us-ascii Content-Transfer-Encoding: quoted-printable Subject: ZFS L2ARC statistics interpretation Message-Id: <0CEC2752-7787-4C6D-99E2-E7D7BF238449@omnigroup.com> Date: Wed, 19 Aug 2015 16:08:47 -0700 To: FreeBSD Filesystems Mime-Version: 1.0 (Mac OS X Mail 8.2 \(2102\)) X-Mailer: Apple Mail (2.2102) X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.20 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 19 Aug 2015 23:17:38 -0000 I'm trying to understand some problems we've been having with our ZFS = systems, in particular their L2ARC performance. Before I make too many = guesses about what's going on, I'm hoping someone can clarify what some = of the ZFS statistics actually mean, or point me to documentation if any = exists. 
In particular, I'm hoping someone can tell me the interpretation of: Errors: kstat.zfs.misc.arcstats.l2_cksum_bad kstat.zfs.misc.arcstats.l2_io_error Other than problems with the underlying disk (or controller or cable or...), are there reasons for these counters to be nonzero? On some of our systems, they increase fairly rapidly (20000/day). Is this considered normal, or does it indicate a problem? If a problem, what should I be looking at? Size: kstat.zfs.misc.arcstats.l2_size kstat.zfs.misc.arcstats.l2_asize What does l2_size/l2_asize measure? Compressed or uncompressed size? It sometimes tops out at roughly the size of my L2ARC device, and sometimes just continually grows (e.g., one of my systems has an l2_size of about 1.3T but a 190G L2ARC; I doubt I'm getting nearly 7:1 compression on my dataset! But maybe I am? How can I tell?) There are reports over the last few years [1,2,3,4] that suggest that there's a ZFS bug that attempts to use space past the end of the L2ARC, resulting both in l2_size being larger than is possible and also in io_errors and bad cksums (when the nonexistent sectors are read back). But given that this behavior has been reported off and on for several years now, and many of the threads devolve into supposition and folklore, I'm hoping to get an informed answer about what these statistics mean, whether the numbers I'm seeing indicate a problem or not, and be able to make a judgment about whether a given fix in FreeBSD might solve the problem. 
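One practical sanity check follows from the usual reading of the two counters (l2_size as uncompressed data bytes tracked by the headers, l2_asize as bytes actually allocated on the cache device); that reading is taken from this thread's discussion and should be verified against the arc.c source, not treated as authoritative. A sketch in Python, assuming `sysctl` output lines of the form `name: value` and a known cache-device capacity:

```python
def parse_kstats(text: str) -> dict:
    """Parse `sysctl kstat.zfs.misc.arcstats.*`-style output into {name: int}."""
    stats = {}
    for line in text.strip().splitlines():
        name, _, value = line.partition(":")
        stats[name.strip()] = int(value)
    return stats

# Illustrative numbers resembling the 1.3T-vs-190G case described above.
sample = """
kstat.zfs.misc.arcstats.l2_size: 1400000000000
kstat.zfs.misc.arcstats.l2_asize: 200000000000
"""
s = parse_kstats(sample)
ratio = s["kstat.zfs.misc.arcstats.l2_size"] / s["kstat.zfs.misc.arcstats.l2_asize"]
print(f"apparent compression ratio: {ratio:.1f}:1")
# If l2_asize exceeds the cache device's capacity, the accounting is off --
# the symptom the threads above attribute to the past-end-of-device bug.
l2arc_device_bytes = 190_000_000_000   # hypothetical 190 GB cache device
print("suspicious:", s["kstat.zfs.misc.arcstats.l2_asize"] > l2arc_device_bytes)
```

An implausible ratio, or an l2_asize larger than the device, points at broken accounting rather than genuine 7:1 compression.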
FWIW, I'm seeing these problems on FreeBSD 10.0 and 10.1; I'm not seeing them on 9.2. [1] https://lists.freebsd.org/pipermail/freebsd-current/2013-October/045088.html [2] https://forums.freebsd.org/threads/l2arc-degraded.47540/ [3] https://lists.freebsd.org/pipermail/freebsd-fs/2014-October/020256.html [4] https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=198242 Thanks Wim Lewis / wiml@omnigroup.com From owner-freebsd-fs@freebsd.org Thu Aug 20 00:29:51 2015 Return-Path: Delivered-To: freebsd-fs@mailman.ysv.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by mailman.ysv.freebsd.org (Postfix) with ESMTP id 1E1B69BE54D for ; Thu, 20 Aug 2015 00:29:51 +0000 (UTC) (envelope-from gpalmer@freebsd.org) Received: from mail.in-addr.com (mail.in-addr.com [IPv6:2a01:4f8:191:61e8::2525:2525]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (Client did not present a certificate) by mx1.freebsd.org (Postfix) with ESMTPS id D354BF7A for ; Thu, 20 Aug 2015 00:29:50 +0000 (UTC) (envelope-from gpalmer@freebsd.org) Received: from gjp by mail.in-addr.com with local (Exim 4.86 (FreeBSD)) (envelope-from ) id 1ZSDjm-000MLT-Ko; Thu, 20 Aug 2015 01:29:46 +0100 Date: Thu, 20 Aug 2015 01:29:46 +0100 From: Gary Palmer To: Wim Lewis Cc: FreeBSD Filesystems Subject: Re: ZFS L2ARC statistics interpretation Message-ID: <20150820002946.GD13503@in-addr.com> References: <0CEC2752-7787-4C6D-99E2-E7D7BF238449@omnigroup.com> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <0CEC2752-7787-4C6D-99E2-E7D7BF238449@omnigroup.com> X-SA-Exim-Connect-IP: X-SA-Exim-Mail-From: gpalmer@freebsd.org 
00:29:51 -0000 On Wed, Aug 19, 2015 at 04:08:47PM -0700, Wim Lewis wrote: > I'm trying to understand some problems we've been having with our ZFS systems, in particular their L2ARC performance. Before I make too many guesses about what's going on, I'm hoping someone can clarify what some of the ZFS statistics actually mean, or point me to documentation if any exists. > > In particular, I'm hoping someone can tell me the interpretation of: > > Errors: > kstat.zfs.misc.arcstats.l2_cksum_bad > kstat.zfs.misc.arcstats.l2_io_error > > Other than problems with the underlying disk (or controller or cable or...), are there reasons for these counters to be nonzero? On some of our systems, they increase fairly rapidly (20000/day). Is this considered normal, or does it indicate a problem? If a problem, what should I be looking at? > > Size: > kstat.zfs.misc.arcstats.l2_size > kstat.zfs.misc.arcstats.l2_asize > > What does l2_size/l2_asize measure? Compressed or uncompressed size? It sometimes tops out at roughly the size of my L2ARC device, and sometimes just continually grows (e.g., one of my systems has an l2_size of about 1.3T but a 190G L2ARC; I doubt I'm getting nearly 7:1 compression on my dataset! But maybe I am? How can I tell?) > > There are reports over the last few years [1,2,3,4] that suggest that there's a ZFS bug that attempts to use space past the end of the L2ARC, resulting both in l2_size being larger than is possible and also in io_errors and bad cksums (when the nonexistent sectors are read back). But given that this behavior has been reported off and on for several years now, and many of the threads devolve into supposition and folklore, I'm hoping to get an informed answer about what these statistics mean, whether the numbers I'm seeing indicate a problem or not, and be able to make a judgment about whether a given fix in FreeBSD might solve the problem. > > FWIW, I'm seeing these problems on FreeBSD 10.0 and 10.1; I'm not seeing them on 9.2. 
> > > [1] https://lists.freebsd.org/pipermail/freebsd-current/2013-October/045088.html > [2] https://forums.freebsd.org/threads/l2arc-degraded.47540/ > [3] https://lists.freebsd.org/pipermail/freebsd-fs/2014-October/020256.html > [4] https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=198242 I think the checksum/IO problems as well as the huge reported size of your L2ARC are both a result of a problem described at the following url https://reviews.freebsd.org/D2764 Not sure if a fix is in 10.2 or not yet. Regards, Gary From owner-freebsd-fs@freebsd.org Thu Aug 20 07:21:28 2015 Return-Path: Delivered-To: freebsd-fs@mailman.ysv.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by mailman.ysv.freebsd.org (Postfix) with ESMTP id C94829BE87B for ; Thu, 20 Aug 2015 07:21:28 +0000 (UTC) (envelope-from bugzilla-noreply@freebsd.org) Received: from kenobi.freebsd.org (kenobi.freebsd.org [IPv6:2001:1900:2254:206a::16:76]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (Client did not present a certificate) by mx1.freebsd.org (Postfix) with ESMTPS id 9D0A310C4 for ; Thu, 20 Aug 2015 07:21:28 +0000 (UTC) (envelope-from bugzilla-noreply@freebsd.org) Received: from bugs.freebsd.org ([127.0.1.118]) by kenobi.freebsd.org (8.15.2/8.15.2) with ESMTP id t7K7LSwK045803 for ; Thu, 20 Aug 2015 07:21:28 GMT (envelope-from bugzilla-noreply@freebsd.org) From: bugzilla-noreply@freebsd.org To: freebsd-fs@FreeBSD.org Subject: [Bug 202358] [patch] [zfs] fix possible assert fail in sa_handle_get_from_db() Date: Thu, 20 Aug 2015 07:21:28 +0000 X-Bugzilla-Reason: AssignedTo X-Bugzilla-Type: changed X-Bugzilla-Watch-Reason: None X-Bugzilla-Product: Base System X-Bugzilla-Component: kern X-Bugzilla-Version: 11.0-CURRENT X-Bugzilla-Keywords: patch X-Bugzilla-Severity: Affects Only Me X-Bugzilla-Who: avg@FreeBSD.org X-Bugzilla-Status: New X-Bugzilla-Priority: --- X-Bugzilla-Assigned-To: freebsd-fs@FreeBSD.org 
X-Bugzilla-Target-Milestone: --- X-Bugzilla-Flags: X-Bugzilla-Changed-Fields: cc Message-ID: In-Reply-To: References: Content-Type: text/plain; charset="UTF-8" Content-Transfer-Encoding: 7bit X-Bugzilla-URL: https://bugs.freebsd.org/bugzilla/ Auto-Submitted: auto-generated MIME-Version: 1.0 X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.20 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 20 Aug 2015 07:21:28 -0000 https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=202358 Andriy Gapon changed: What |Removed |Added ---------------------------------------------------------------------------- CC| |gibbs@FreeBSD.org, | |mav@FreeBSD.org --- Comment #1 from Andriy Gapon --- (In reply to luke.tw from comment #0) I think that you correctly identified the problem. And what you describe seems to be only part of the problem. It seems that the illumos kmem cache API is used incorrectly for sa_cache. Its usage resembles how FreeBSD uma(9) could be used, and that almost works with FreeBSD's emulation of kmem cache. The difference is that kmem_cache_create() is not as flexible as uma_zcreate(): whereas the latter supports two ways of initializing an object - via init/fini and constructor/destructor - the former has only constructor/destructor support: http://illumos.org/man/9F/kmem_cache_create But the kmem cache's constructor and destructor work similarly to how the init and fini work in uma(9). And, apparently, that is a source of the confusion here. I am surprised that there are no bug reports about this API misuse from illumos users yet. It seems that the problem was introduced as a part of bigger changes in base r286575, which is an import of illumos/illumos-gate@bc9014e6a81272073b9854d9f65dd59e18d18c35 -- You are receiving this mail because: You are the assignee for the bug. 
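The lifecycle distinction described above can be illustrated with a toy object cache: under kmem-style constructors (and uma(9)'s init/fini), the constructor runs only when a fresh backing object is created, so a recycled object comes back carrying whatever state it had when it was freed. This is only a sketch of the semantics, not the actual kernel API:

```python
class KmemCache:
    """Toy model of an object cache where the constructor runs only when a
    fresh object is created, not on every allocation. Objects returned to
    the cache keep their state, as with kmem ctor/dtor or uma init/fini."""
    def __init__(self, ctor):
        self.ctor = ctor
        self.free_list = []
        self.constructions = 0

    def alloc(self):
        if self.free_list:
            return self.free_list.pop()   # recycled: ctor is NOT rerun
        self.constructions += 1
        obj = {}
        self.ctor(obj)
        return obj

    def free(self, obj):
        self.free_list.append(obj)        # state deliberately preserved

cache = KmemCache(lambda o: o.update(handle=None))
a = cache.alloc()
a["handle"] = "stale"                     # caller mutates, then frees
cache.free(a)
b = cache.alloc()                         # same object back, ctor skipped
print(cache.constructions)                # 1
print(b["handle"])                        # 'stale'
```

Code written against uma-style ctor/dtor semantics (constructor on every allocation) would expect `b["handle"]` to be reset here; asserting on such freshly-constructed state in an object that was merely recycled is the kind of mismatch that can trip the assert in sa_handle_get_from_db().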
From owner-freebsd-fs@freebsd.org Thu Aug 20 07:35:47 2015 Return-Path: Delivered-To: freebsd-fs@mailman.ysv.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by mailman.ysv.freebsd.org (Postfix) with ESMTP id 9264D9BEC43 for ; Thu, 20 Aug 2015 07:35:47 +0000 (UTC) (envelope-from avg@FreeBSD.org) Received: from citadel.icyb.net.ua (citadel.icyb.net.ua [212.40.38.140]) by mx1.freebsd.org (Postfix) with ESMTP id 782CC17DF; Thu, 20 Aug 2015 07:35:46 +0000 (UTC) (envelope-from avg@FreeBSD.org) Received: from porto.starpoint.kiev.ua (porto-e.starpoint.kiev.ua [212.40.38.100]) by citadel.icyb.net.ua (8.8.8p3/ICyb-2.3exp) with ESMTP id KAA01670; Thu, 20 Aug 2015 10:35:43 +0300 (EEST) (envelope-from avg@FreeBSD.org) Received: from localhost ([127.0.0.1]) by porto.starpoint.kiev.ua with esmtp (Exim 4.34 (FreeBSD)) id 1ZSKNz-000C2I-2O; Thu, 20 Aug 2015 10:35:43 +0300 Subject: Re: ZFS L2ARC statistics interpretation To: Gary Palmer , Wim Lewis References: <0CEC2752-7787-4C6D-99E2-E7D7BF238449@omnigroup.com> <20150820002946.GD13503@in-addr.com> Cc: FreeBSD Filesystems Newsgroups: gmane.os.freebsd.devel.file-systems From: Andriy Gapon Message-ID: <55D582F9.6020207@FreeBSD.org> Date: Thu, 20 Aug 2015 10:34:17 +0300 User-Agent: Mozilla/5.0 (Windows NT 6.3; WOW64; rv:38.0) Gecko/20100101 Thunderbird/38.1.0 MIME-Version: 1.0 In-Reply-To: <20150820002946.GD13503@in-addr.com> Content-Type: text/plain; charset=windows-1252 Content-Transfer-Encoding: 7bit X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.20 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 20 Aug 2015 07:35:47 -0000 On 20/08/2015 03:29, Gary Palmer wrote: > On Wed, Aug 19, 2015 at 04:08:47PM -0700, Wim Lewis wrote: >> I'm trying to understand some problems we've been having with our ZFS systems, in particular their L2ARC performance. 
Before I make too many guesses about what's going on, I'm hoping someone can clarify what some of the ZFS statistics actually mean, or point me to documentation if any exists. >> >> In particular, I'm hoping someone can tell me the interpretation of: >> >> Errors: >> kstat.zfs.misc.arcstats.l2_cksum_bad >> kstat.zfs.misc.arcstats.l2_io_error >> >> Other than problems with the underlying disk (or controller or cable or...), are there reasons for these counters to be nonzero? On some of our systems, they increase fairly rapidly (20000/day). Is this considered normal, or does it indicate a problem? If a problem, what should I be looking at? >> >> Size: >> kstat.zfs.misc.arcstats.l2_size >> kstat.zfs.misc.arcstats.l2_asize >> >> What does l2_size/l2_asize measure? Compressed or uncompressed size? It sometimes tops out at roughly the size of my L2ARC device, and sometimes just continually grows (e.g., one of my systems has an l2_size of about 1.3T but a 190G L2ARC; I doubt I'm getting nearly 7:1 compression on my dataset! But maybe I am? How can I tell?) >> >> There are reports over the last few years [1,2,3,4] that suggest that there's a ZFS bug that attempts to use space past the end of the L2ARC, resulting both in l2_size being larger than is possible and also in io_errors and bad cksums (when the nonexistent sectors are read back). But given that this behavior has been reported off and on for several years now, and many of the threads devolve into supposition and folklore, I'm hoping to get an informed answer about what these statistics mean, whether the numbers I'm seeing indicate a problem or not, and be able to make a judgment about whether a given fix in FreeBSD might solve the problem. >> >> FWIW, I'm seeing these problems on FreeBSD 10.0 and 10.1; I'm not seeing them on 9.2. 
>> >> >> [1] https://lists.freebsd.org/pipermail/freebsd-current/2013-October/045088.html >> [2] https://forums.freebsd.org/threads/l2arc-degraded.47540/ >> [3] https://lists.freebsd.org/pipermail/freebsd-fs/2014-October/020256.html >> [4] https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=198242 > > > I think the checksum/IO problems as well as the huge reported size > of your L2ARC are both a result of a problem described at the following > url > > https://reviews.freebsd.org/D2764 > > Not sure if a fix is in 10.2 or not yet. The fix is not in head yet. And the patch needs to be rebased after the recent large imports of the upstream code. -- Andriy Gapon From owner-freebsd-fs@freebsd.org Thu Aug 20 08:08:43 2015 Return-Path: Delivered-To: freebsd-fs@mailman.ysv.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by mailman.ysv.freebsd.org (Postfix) with ESMTP id 64D539BE7CA for ; Thu, 20 Aug 2015 08:08:43 +0000 (UTC) (envelope-from bugzilla-noreply@freebsd.org) Received: from kenobi.freebsd.org (kenobi.freebsd.org [IPv6:2001:1900:2254:206a::16:76]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (Client did not present a certificate) by mx1.freebsd.org (Postfix) with ESMTPS id 4820A1696 for ; Thu, 20 Aug 2015 08:08:43 +0000 (UTC) (envelope-from bugzilla-noreply@freebsd.org) Received: from bugs.freebsd.org ([127.0.1.118]) by kenobi.freebsd.org (8.15.2/8.15.2) with ESMTP id t7K88hNJ069946 for ; Thu, 20 Aug 2015 08:08:43 GMT (envelope-from bugzilla-noreply@freebsd.org) From: bugzilla-noreply@freebsd.org To: freebsd-fs@FreeBSD.org Subject: [Bug 202358] [patch] [zfs] fix possible assert fail in sa_handle_get_from_db() Date: Thu, 20 Aug 2015 08:08:42 +0000 X-Bugzilla-Reason: AssignedTo X-Bugzilla-Type: changed X-Bugzilla-Watch-Reason: None X-Bugzilla-Product: Base System X-Bugzilla-Component: kern X-Bugzilla-Version: 11.0-CURRENT X-Bugzilla-Keywords: patch X-Bugzilla-Severity: Affects Only Me 
X-Bugzilla-Who: avg@FreeBSD.org X-Bugzilla-Status: Open X-Bugzilla-Priority: --- X-Bugzilla-Assigned-To: avg@FreeBSD.org X-Bugzilla-Target-Milestone: --- X-Bugzilla-Flags: X-Bugzilla-Changed-Fields: attachments.isobsolete bug_status assigned_to attachments.created Message-ID: In-Reply-To: References: Content-Type: text/plain; charset="UTF-8" Content-Transfer-Encoding: 7bit X-Bugzilla-URL: https://bugs.freebsd.org/bugzilla/ Auto-Submitted: auto-generated MIME-Version: 1.0 X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.20 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 20 Aug 2015 08:08:43 -0000 https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=202358

Andriy Gapon changed:

               What       |Removed                |Added
----------------------------------------------------------------------------
         Attachment #159915|0                      |1
                is obsolete|                       |
                     Status|New                    |Open
                   Assignee|freebsd-fs@FreeBSD.org |avg@FreeBSD.org

--- Comment #2 from Andriy Gapon --- Created attachment 160132 --> https://bugs.freebsd.org/bugzilla/attachment.cgi?id=160132&action=edit proposed patch -- You are receiving this mail because: You are the assignee for the bug.
From owner-freebsd-fs@freebsd.org Thu Aug 20 17:47:24 2015 Return-Path: Delivered-To: freebsd-fs@mailman.ysv.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by mailman.ysv.freebsd.org (Postfix) with ESMTP id 735529BF75C for ; Thu, 20 Aug 2015 17:47:24 +0000 (UTC) (envelope-from mwlucas@mail.michaelwlucas.com) Received: from mailman.ysv.freebsd.org (mailman.ysv.freebsd.org [IPv6:2001:1900:2254:206a::50:5]) by mx1.freebsd.org (Postfix) with ESMTP id 58A0EA2B for ; Thu, 20 Aug 2015 17:47:24 +0000 (UTC) (envelope-from mwlucas@mail.michaelwlucas.com) Received: by mailman.ysv.freebsd.org (Postfix) id 564D59BF75B; Thu, 20 Aug 2015 17:47:24 +0000 (UTC) Delivered-To: fs@mailman.ysv.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by mailman.ysv.freebsd.org (Postfix) with ESMTP id 54F2F9BF75A for ; Thu, 20 Aug 2015 17:47:24 +0000 (UTC) (envelope-from mwlucas@mail.michaelwlucas.com) Received: from mail.michaelwlucas.com (mail.michaelwlucas.com [104.236.197.233]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (Client did not present a certificate) by mx1.freebsd.org (Postfix) with ESMTPS id 0AC85A2A for ; Thu, 20 Aug 2015 17:47:23 +0000 (UTC) (envelope-from mwlucas@mail.michaelwlucas.com) Received: from mail.michaelwlucas.com (localhost [127.0.0.1]) by mail.michaelwlucas.com (8.14.9/8.14.7) with ESMTP id t7KHjxiH028362 for ; Thu, 20 Aug 2015 13:46:00 -0400 (EDT) (envelope-from mwlucas@mail.michaelwlucas.com) Received: (from mwlucas@localhost) by mail.michaelwlucas.com (8.14.9/8.14.7/Submit) id t7KHjxYh028361 for fs@freebsd.org; Thu, 20 Aug 2015 13:45:59 -0400 (EDT) (envelope-from mwlucas) Date: Thu, 20 Aug 2015 13:45:59 -0400 From: "Michael W. 
Lucas" To: fs@freebsd.org Subject: DTrace to measure ZFS operations & latency Message-ID: <20150820174559.GA28318@mail.michaelwlucas.com> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline User-Agent: Mutt/1.5.23 (2014-03-12) X-Spam-Status: No, score=0.0 required=5.0 tests=UNPARSEABLE_RELAY, URIBL_BLOCKED autolearn=ham autolearn_force=no version=3.4.1 X-Spam-Checker-Version: SpamAssassin 3.4.1 (2015-04-28) on mail.michaelwlucas.com X-Greylist: Sender IP whitelisted, not delayed by milter-greylist-4.4.3 (mail.michaelwlucas.com [127.0.0.1]); Thu, 20 Aug 2015 13:46:01 -0400 (EDT) X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.20 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 20 Aug 2015 17:47:24 -0000

Hi,

I'm working on measuring the number & latency of async operations in ZFS. (Yes, this is still for the ZFS book.) There's a nice script at http://dtrace.org/blogs/ahl/2014/08/31/openzfs-tuning/, but it's illumos-specific. I try to run the script on last week's -current and get:

# dtrace -s q.d zroot
dtrace: failed to compile script q.d: line 4: probe description fbt::vdev_queue_max_async_writes:entry does not match any probes

Any chance someone could help me out here?

Thanks,
==ml

PS: The script is:

#pragma D option aggpack
#pragma D option quiet

fbt::vdev_queue_max_async_writes:entry
{
        self->spa = args[0];
}
fbt::vdev_queue_max_async_writes:return
/self->spa && self->spa->spa_name == $$1/
{
        @ = lquantize(args[1], 0, 30, 1);
}

tick-1s
{
        printa(@);
        clear(@);
}

-- Michael W.
Lucas - mwlucas@michaelwlucas.com, Twitter @mwlauthor http://www.MichaelWLucas.com/, http://blather.MichaelWLucas.com/ From owner-freebsd-fs@freebsd.org Thu Aug 20 18:02:14 2015 Return-Path: Delivered-To: freebsd-fs@mailman.ysv.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by mailman.ysv.freebsd.org (Postfix) with ESMTP id 705469BFA49 for ; Thu, 20 Aug 2015 18:02:14 +0000 (UTC) (envelope-from lacey.leanne@gmail.com) Received: from mailman.ysv.freebsd.org (mailman.ysv.freebsd.org [IPv6:2001:1900:2254:206a::50:5]) by mx1.freebsd.org (Postfix) with ESMTP id 4F7DF1256 for ; Thu, 20 Aug 2015 18:02:14 +0000 (UTC) (envelope-from lacey.leanne@gmail.com) Received: by mailman.ysv.freebsd.org (Postfix) id 4C7EB9BFA48; Thu, 20 Aug 2015 18:02:14 +0000 (UTC) Delivered-To: fs@mailman.ysv.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by mailman.ysv.freebsd.org (Postfix) with ESMTP id 321109BFA47 for ; Thu, 20 Aug 2015 18:02:14 +0000 (UTC) (envelope-from lacey.leanne@gmail.com) Received: from mail-la0-x22d.google.com (mail-la0-x22d.google.com [IPv6:2a00:1450:4010:c03::22d]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (Client CN "smtp.gmail.com", Issuer "Google Internet Authority G2" (verified OK)) by mx1.freebsd.org (Postfix) with ESMTPS id AEE171255 for ; Thu, 20 Aug 2015 18:02:13 +0000 (UTC) (envelope-from lacey.leanne@gmail.com) Received: by laba3 with SMTP id a3so27486259lab.1 for ; Thu, 20 Aug 2015 11:02:11 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20120113; h=mime-version:in-reply-to:references:date:message-id:subject:from:to :cc:content-type; bh=5r+ZLodUNvy1wQimBFzb0vm+1O1E+j1tEQ3ox994gzQ=; b=Z8T0IjpFahe+dlg8EawEd77ep/0P738n0eArFYCu4mU7ugBoQifqXTx9FEJVXGhcEZ 7Wu2UEtb6XbHCEcezJ5UrZR3frkOuLCoGgHHFpMQ0t3kMmdDo+TkJ25QoAlR+s0sevf1 gJ6C23I2RbQCu+EhzrvoFfs/sXt9Zcfak5aS1PjtCjO2ASYmi/v7cMoVLrLb7OlfUsNj 
UVoLOz1ZRwEAnFIWX4EkRSFugSbFrWwPStzw3/ukU13v27tzp4LoJ2TqlWTQ/0ctJCDz 37q6UxCeksDre2cLzTxW/2CuLvGf2dY0NPtIYinIavr6L290xsoAHcUck85wTT1ivCn5 QO1A== MIME-Version: 1.0 X-Received: by 10.112.201.36 with SMTP id jx4mr4078947lbc.9.1440093731759; Thu, 20 Aug 2015 11:02:11 -0700 (PDT) Received: by 10.25.88.80 with HTTP; Thu, 20 Aug 2015 11:02:11 -0700 (PDT) Received: by 10.25.88.80 with HTTP; Thu, 20 Aug 2015 11:02:11 -0700 (PDT) In-Reply-To: <20150820174559.GA28318@mail.michaelwlucas.com> References: <20150820174559.GA28318@mail.michaelwlucas.com> Date: Thu, 20 Aug 2015 11:02:11 -0700 Message-ID: Subject: Re: DTrace to measure ZFS operations & latency From: Lacey Powers To: "Michael W. Lucas" Cc: fs@freebsd.org Content-Type: text/plain; charset=UTF-8 X-Content-Filtered-By: Mailman/MimeDel 2.1.20 X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.20 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 20 Aug 2015 18:02:14 -0000 I put in a bug for this because the function is inlined and thus hidden from dtrace. The patch hasn't been applied for several months. https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=200316 You could always custom compile to expose this like I did to try and debug it for my own use with ZFS and PostgreSQL. Hope that helps. Regards, Lacey On Aug 20, 2015 10:48 AM, "Michael W. Lucas" wrote: > Hi, > > I'm working on measuring the number & latency of async operations in > ZFS. (Yes, this is still for the ZFS book.) There's a nice script at > http://dtrace.org/blogs/ahl/2014/08/31/openzfs-tuning/, but it's > illumos-specific. I try to run the script on last week's -current and > get: > > # dtrace -s q.d zroot > dtrace: failed to compile script q.d: line 4: probe description > fbt::vdev_queue_max_async_writes:entry does not match any probes > > Any chance someone could help me out here? 
> > Thanks, > ==ml > > PS: The script is: > > #pragma D option aggpack > #pragma D option quiet > > fbt::vdev_queue_max_async_writes:entry > { > self->spa = args[0]; > } > fbt::vdev_queue_max_async_writes:return > /self->spa && self->spa->spa_name == $$1/ > { > @ = lquantize(args[1], 0, 30, 1); > } > > tick-1s > { > printa(@); > clear(@); > } > > > -- > Michael W. Lucas - mwlucas@michaelwlucas.com, Twitter @mwlauthor > http://www.MichaelWLucas.com/, http://blather.MichaelWLucas.com/ > _______________________________________________ > freebsd-fs@freebsd.org mailing list > https://lists.freebsd.org/mailman/listinfo/freebsd-fs > To unsubscribe, send any mail to "freebsd-fs-unsubscribe@freebsd.org" > From owner-freebsd-fs@freebsd.org Thu Aug 20 21:21:10 2015 Return-Path: Delivered-To: freebsd-fs@mailman.ysv.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by mailman.ysv.freebsd.org (Postfix) with ESMTP id 08ADE9BFB2B for ; Thu, 20 Aug 2015 21:21:10 +0000 (UTC) (envelope-from lkateley@kateley.com) Received: from mail-ig0-f178.google.com (mail-ig0-f178.google.com [209.85.213.178]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (Client CN "smtp.gmail.com", Issuer "Google Internet Authority G2" (verified OK)) by mx1.freebsd.org (Postfix) with ESMTPS id CFAB3AF9 for ; Thu, 20 Aug 2015 21:21:09 +0000 (UTC) (envelope-from lkateley@kateley.com) Received: by igbjg10 with SMTP id jg10so881931igb.0 for ; Thu, 20 Aug 2015 14:21:08 -0700 (PDT) X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20130820; h=x-gm-message-state:subject:to:references:from:message-id:date :user-agent:mime-version:in-reply-to:content-type :content-transfer-encoding; bh=kny94TsaTLi6VTr6Vg5sqhaNLgz7JO2dlXNUkkjsuR8=; b=My7y1fFfD12ZZ6d37kIXPgfreiP7Uv1vc+7IqGwRGU2SXxDxa2rkdacITlH4sf2d/R 0RG2jYliIedoYpKU8U0mvgVZdlb5ucOGvUEIIFvxLywvG9elshnDm1vy8zfWlKTenxuU 
NY8R0okoNAtkmCMd6gh2MXxagfML3vU4sXIxK7Qzkv8Ink/CzOVgvMscKOqqAV9MxS8x sGaniCHUMA0+hBu+TD03ypAGrZuoBUuTsO0VGzYYNvvc7+J4DdfhUjtzbXQq+5V3MKgA z1Vi7+QGcsyDezVOEwcNjvyJV3pZvR1q2ByVAXz2RSAX1waSaKUar9t4r9JmjsfEScMD yc5g== X-Gm-Message-State: ALoCoQmmfmVX0QWNx2qJlbNJi8JGaH24yycNvmGM2Nkn4OyW/DOC4LRnkvTz/8OFDqm6IFT+Q+uj X-Received: by 10.50.43.169 with SMTP id x9mr192378igl.12.1440105668200; Thu, 20 Aug 2015 14:21:08 -0700 (PDT) Received: from kateleycoimac.local ([63.231.252.189]) by smtp.googlemail.com with ESMTPSA id a1sm95434iga.4.2015.08.20.14.21.06 for (version=TLSv1.2 cipher=ECDHE-RSA-AES128-GCM-SHA256 bits=128/128); Thu, 20 Aug 2015 14:21:07 -0700 (PDT) Subject: Re: DTrace to measure ZFS operations & latency To: freebsd-fs@freebsd.org References: <20150820174559.GA28318@mail.michaelwlucas.com> From: Linda Kateley Message-ID: <55D644C2.8060701@kateley.com> Date: Thu, 20 Aug 2015 16:21:06 -0500 User-Agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10.10; rv:38.0) Gecko/20100101 Thunderbird/38.2.0 MIME-Version: 1.0 In-Reply-To: <20150820174559.GA28318@mail.michaelwlucas.com> Content-Type: text/plain; charset=windows-1252; format=flowed Content-Transfer-Encoding: 7bit X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.20 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 20 Aug 2015 21:21:10 -0000 Usually what I will do with those is to try to find a similar probe in FreeBSD, with something like: #dtrace -l | grep async to find a similar function to trace. This one looks promising: spa_async_dispatch_vd. Not sure I will be able to quantize on the args, though, without more time... Hopefully that will give you a start. On 8/20/15 12:45 PM, Michael W. Lucas wrote: > Hi, > > I'm working on measuring the number & latency of async operations in > ZFS. (Yes, this is still for the ZFS book.)
There's a nice script at > http://dtrace.org/blogs/ahl/2014/08/31/openzfs-tuning/, but it's > illumos-specific. I try to run the script on last week's -current and > get: > > # dtrace -s q.d zroot > dtrace: failed to compile script q.d: line 4: probe description fbt::vdev_queue_max_async_writes:entry does not match any probes > > Any chance someone could help me out here? > > Thanks, > ==ml > > PS: The script is: > > #pragma D option aggpack > #pragma D option quiet > > fbt::vdev_queue_max_async_writes:entry > { > self->spa = args[0]; > } > fbt::vdev_queue_max_async_writes:return > /self->spa && self->spa->spa_name == $$1/ > { > @ = lquantize(args[1], 0, 30, 1); > } > > tick-1s > { > printa(@); > clear(@); > } > > -- Linda Kateley Kateley Company Skype ID-kateleyco http://kateleyco.com From owner-freebsd-fs@freebsd.org Fri Aug 21 08:34:53 2015 Return-Path: Delivered-To: freebsd-fs@mailman.ysv.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by mailman.ysv.freebsd.org (Postfix) with ESMTP id 592DB9BF8D2 for ; Fri, 21 Aug 2015 08:34:53 +0000 (UTC) (envelope-from truckman@FreeBSD.org) Received: from gw.catspoiler.org (cl-1657.chi-02.us.sixxs.net [IPv6:2001:4978:f:678::2]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (Client CN "gw.catspoiler.org", Issuer "gw.catspoiler.org" (not verified)) by mx1.freebsd.org (Postfix) with ESMTPS id D9839F16 for ; Fri, 21 Aug 2015 08:34:52 +0000 (UTC) (envelope-from truckman@FreeBSD.org) Received: from FreeBSD.org (mousie.catspoiler.org [192.168.101.2]) by gw.catspoiler.org (8.15.2/8.15.2) with ESMTP id t7L8Yigk093994 for ; Fri, 21 Aug 2015 01:34:48 -0700 (PDT) (envelope-from truckman@FreeBSD.org) Message-Id: <201508210834.t7L8Yigk093994@gw.catspoiler.org> Date: Fri, 21 Aug 2015 01:34:44 -0700 (PDT) From: Don Lewis Subject: solaris assert: avl_is_empty(&dn -> dn_dbufs) panic To: freebsd-fs@FreeBSD.org MIME-Version: 1.0 Content-Type: TEXT/plain; charset=us-ascii 
X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.20 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Fri, 21 Aug 2015 08:34:53 -0000

I just started getting this panic:

solaris assert: avl_is_empty(&dn -> dn_dbufs), file: /usr/src/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/dnode_sync.c, line 495

System info:
FreeBSD zipper.catspoiler.org 11.0-CURRENT FreeBSD 11.0-CURRENT #25 r286923: Wed Aug 19 09:28:53 PDT 2015 dl@zipper.catspoiler.org:/usr/obj/usr/src/sys/GENERIC amd64

My zfs pool has one mirrored vdev. Scrub doesn't find any problems.

%zpool status
  pool: zroot
 state: ONLINE
  scan: scrub repaired 0 in 2h58m with 0 errors on Fri Aug 21 00:44:52 2015
config:

	NAME        STATE     READ WRITE CKSUM
	zroot       ONLINE       0     0     0
	  mirror-0  ONLINE       0     0     0
	    ada0p3  ONLINE       0     0     0
	    ada1p3  ONLINE       0     0     0

This panic is reproducible and happens every time I use poudriere to build ports using my 9.3-RELEASE amd64 jail and occurs at the end of the poudriere run when it is unmounting filesystems.
[00:10:43] ====>> Stopping 4 builders
93amd64-default-job-01: removed
93amd64-default-job-01-n: removed
93amd64-default-job-02: removed
93amd64-default-job-02-n: removed
93amd64-default-job-03: removed
93amd64-default-job-03-n: removed
93amd64-default-job-04: removed
93amd64-default-job-04-n: removed
[00:10:46] ====>> Creating pkgng repository
Creating repository in /tmp/packages: 100%
Packing files for repository: 100%
[00:10:55] ====>> Committing packages to repository
[00:10:55] ====>> Removing old packages
[00:10:55] ====>> Built ports: devel/py-pymtbl net/sie-nmsg net/p5-Net-Nmsg net/axa
[93amd64-default] [2015-08-21_00h47m41s] [committing:] Queued: 4 Built: 4 Failed: 0 Skipped: 0 Ignored: 0 Tobuild: 0 Time: 00:10:53
[00:10:55] ====>> Logs: /var/poudriere/data/logs/bulk/93amd64-default/2015-08-21_00h47m41s
[00:10:55] ====>> Cleaning up
93amd64-default: removed
93amd64-default-n: removed
[00:10:55] ====>> Umounting file systems
Write failed: Broken pipe

Prior to that, I ran poudriere a number of times with a 10.2-STABLE amd64 jail without incident. I've kicked off a bunch of poudriere runs for other jails and will check on it in the morning.
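To capture a panic like this for later analysis, a kernel crash dump helps; a minimal setup using the standard FreeBSD rc.conf knobs (illustrative, not part of the original report) is:

```
# /etc/rc.conf -- reserve a dump device so the kernel can write a crash
# dump on panic; savecore(8) then extracts it into dumpdir on the next boot
dumpdev="AUTO"
dumpdir="/var/crash"
```

With the resulting vmcore, kgdb /boot/kernel/kernel /var/crash/vmcore.0 followed by "bt" shows the frames leading into dnode_sync().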
From owner-freebsd-fs@freebsd.org Fri Aug 21 12:33:24 2015 Return-Path: Delivered-To: freebsd-fs@mailman.ysv.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by mailman.ysv.freebsd.org (Postfix) with ESMTP id A925F9BEECB for ; Fri, 21 Aug 2015 12:33:24 +0000 (UTC) (envelope-from avg@FreeBSD.org) Received: from citadel.icyb.net.ua (citadel.icyb.net.ua [212.40.38.140]) by mx1.freebsd.org (Postfix) with ESMTP id 97ABA90B; Fri, 21 Aug 2015 12:33:23 +0000 (UTC) (envelope-from avg@FreeBSD.org) Received: from porto.starpoint.kiev.ua (porto-e.starpoint.kiev.ua [212.40.38.100]) by citadel.icyb.net.ua (8.8.8p3/ICyb-2.3exp) with ESMTP id PAA24576; Fri, 21 Aug 2015 15:33:20 +0300 (EEST) (envelope-from avg@FreeBSD.org) Received: from localhost ([127.0.0.1]) by porto.starpoint.kiev.ua with esmtp (Exim 4.34 (FreeBSD)) id 1ZSlVY-000DoM-6v; Fri, 21 Aug 2015 15:33:20 +0300 Subject: Re: ZFS L2ARC statistics interpretation To: Gary Palmer , Wim Lewis References: <0CEC2752-7787-4C6D-99E2-E7D7BF238449@omnigroup.com> <20150820002946.GD13503@in-addr.com> <55D582F9.6020207@FreeBSD.org> Cc: FreeBSD Filesystems From: Andriy Gapon Message-ID: <55D71A56.2080300@FreeBSD.org> Date: Fri, 21 Aug 2015 15:32:22 +0300 User-Agent: Mozilla/5.0 (Windows NT 6.3; WOW64; rv:38.0) Gecko/20100101 Thunderbird/38.2.0 MIME-Version: 1.0 In-Reply-To: <55D582F9.6020207@FreeBSD.org> Content-Type: text/plain; charset=windows-1252 Content-Transfer-Encoding: 7bit X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.20 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Fri, 21 Aug 2015 12:33:24 -0000 On 20/08/2015 10:34, Andriy Gapon wrote: > On 20/08/2015 03:29, Gary Palmer wrote: >> On Wed, Aug 19, 2015 at 04:08:47PM -0700, Wim Lewis wrote: >>> I'm trying to understand some problems we've been having with our ZFS systems, in particular their L2ARC performance. 
Before I make too many guesses about what's going on, I'm hoping someone can clarify what some of the ZFS statistics actually mean, or point me to documentation if any exists. >>> >>> In particular, I'm hoping someone can tell me the interpretation of: >>> >>> Errors: >>> kstat.zfs.misc.arcstats.l2_cksum_bad >>> kstat.zfs.misc.arcstats.l2_io_error >>> >>> Other than problems with the underlying disk (or controller or cable or...), are there reasons for these counters to be nonzero? On some of our systems, they increase fairly rapidly (20000/day). Is this considered normal, or does it indicate a problem? If a problem, what should I be looking at? >>> >>> Size: >>> kstat.zfs.misc.arcstats.l2_size >>> kstat.zfs.misc.arcstats.l2_asize >>> >>> What does l2_size/l2_asize measure? Compressed or uncompressed size? It sometimes tops out at roughly the size of my L2ARC device, and sometimes just continually grows (e.g., one of my systems has an l2_size of about 1.3T but a 190G L2ARC; I doubt I'm getting nearly 7:1 compression on my dataset! But maybe I am? How can I tell?) >>> >>> There are reports over the last few years [1,2,3,4] that suggest that there's a ZFS bug that attempts to use space past the end of the L2ARC, resulting both in l2_size being larger than is possible and also in io_errors and bad cksums (when the nonexistent sectors are read back). But given that this behavior has been reported off and on for several years now, and many of the threads devolve into supposition and folklore, I'm hoping to get an informed answer about what these statistics mean, whether the numbers I'm seeing indicate a problem or not, and be able to make a judgment about whether a given fix in FreeBSD might solve the problem. >>> >>> FWIW, I'm seeing these problems on FreeBSD 10.0 and 10.1; I'm not seeing them on 9.2. 
>>> >>> >>> [1] https://lists.freebsd.org/pipermail/freebsd-current/2013-October/045088.html >>> [2] https://forums.freebsd.org/threads/l2arc-degraded.47540/ >>> [3] https://lists.freebsd.org/pipermail/freebsd-fs/2014-October/020256.html >>> [4] https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=198242 >> >> >> I think the checksum/IO problems as well as the huge reported size >> of your L2ARC are both a result of a problem described at the following >> url >> >> https://reviews.freebsd.org/D2764 >> >> Not sure if a fix is in 10.2 or not yet. > > The fix is not in head yet. > And the patch needs to be rebased after the recent large imports of the > upstream code. An updated patch for head is here https://reviews.freebsd.org/D2764?download=true Testers are welcome! -- Andriy Gapon From owner-freebsd-fs@freebsd.org Fri Aug 21 13:20:57 2015 Return-Path: Delivered-To: freebsd-fs@mailman.ysv.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by mailman.ysv.freebsd.org (Postfix) with ESMTP id B2D639BD88D for ; Fri, 21 Aug 2015 13:20:57 +0000 (UTC) (envelope-from sodynet1@gmail.com) Received: from mail-lb0-x229.google.com (mail-lb0-x229.google.com [IPv6:2a00:1450:4010:c04::229]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (Client CN "smtp.gmail.com", Issuer "Google Internet Authority G2" (verified OK)) by mx1.freebsd.org (Postfix) with ESMTPS id 3BA701E34; Fri, 21 Aug 2015 13:20:57 +0000 (UTC) (envelope-from sodynet1@gmail.com) Received: by lbbtg9 with SMTP id tg9so43681769lbb.1; Fri, 21 Aug 2015 06:20:55 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20120113; h=mime-version:in-reply-to:references:date:message-id:subject:from:to :cc:content-type; bh=Hjp2mIxaEpprC04j6qk7G4coem6Ar2RCj7odrp9POrA=; b=BS2zVzA12uUyFz37SKRnFuryYVOSKaeDK7tlsJr07A45YACwweh+bM4blexbiJ3Nt/ M5OzVt1VgyBiadxxSTMg2VjYgQLJqUZDSzV0IfVPCDt3p3Mubo6do8bfzDVRAxjXqeh+ 
kZSoUpadehV3R7ScZCZmnhnkHWAQUIkGPsd+f0QqtvHSSTv//d5nOtHoGKXF0Dq1yEnE SsHKSBZumlY2jxhjd65EKPOskG7zv5v6wBMRqC03i7B/2PidVAsTnx2fJRchX0fDB5Ne NT6w62PKKkrDIJTa/hsnoTakgVlLNcGkRU4BrALeLZr0nlmD3mTywzoKfVlbTeQuZQRP hbYw== MIME-Version: 1.0 X-Received: by 10.152.28.105 with SMTP id a9mr8039256lah.9.1440163254907; Fri, 21 Aug 2015 06:20:54 -0700 (PDT) Received: by 10.112.154.33 with HTTP; Fri, 21 Aug 2015 06:20:54 -0700 (PDT) Received: by 10.112.154.33 with HTTP; Fri, 21 Aug 2015 06:20:54 -0700 (PDT) In-Reply-To: <55D71A56.2080300@FreeBSD.org> References: <0CEC2752-7787-4C6D-99E2-E7D7BF238449@omnigroup.com> <20150820002946.GD13503@in-addr.com> <55D582F9.6020207@FreeBSD.org> <55D71A56.2080300@FreeBSD.org> Date: Fri, 21 Aug 2015 16:20:54 +0300 Message-ID: Subject: Re: ZFS L2ARC statistics interpretation From: Sami Halabi To: Andriy Gapon Cc: freebsd-fs@freebsd.org, Wim Lewis , Gary Palmer Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: quoted-printable X-Content-Filtered-By: Mailman/MimeDel 2.1.20 X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.20 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Fri, 21 Aug 2015 13:20:57 -0000 Will there be a patch for 10.2? On 21 Aug 2015 15:33, "Andriy Gapon" wrote: > On 20/08/2015 10:34, Andriy Gapon wrote: > > On 20/08/2015 03:29, Gary Palmer wrote: > >> On Wed, Aug 19, 2015 at 04:08:47PM -0700, Wim Lewis wrote: > >>> I'm trying to understand some problems we've been having with our ZFS > systems, in particular their L2ARC performance. Before I make too many > guesses about what's going on, I'm hoping someone can clarify what some of > the ZFS statistics actually mean, or point me to documentation if any > exists.
> >>> > >>> In particular, I'm hoping someone can tell me the interpretation of: > >>> > >>> Errors: > >>> kstat.zfs.misc.arcstats.l2_cksum_bad > >>> kstat.zfs.misc.arcstats.l2_io_error > >>> > >>> Other than problems with the underlying disk (or controller or cable > or...), are there reasons for these counters to be nonzero? On some of our > systems, they increase fairly rapidly (20000/day). Is this considered > normal, or does it indicate a problem? If a problem, what should I be > looking at? > >>> > >>> Size: > >>> kstat.zfs.misc.arcstats.l2_size > >>> kstat.zfs.misc.arcstats.l2_asize > >>> > >>> What does l2_size/l2_asize measure? Compressed or uncompressed size? > It sometimes tops out at roughly the size of my L2ARC device, and sometimes > just continually grows (e.g., one of my systems has an l2_size of about > 1.3T but a 190G L2ARC; I doubt I'm getting nearly 7:1 compression on my > dataset! But maybe I am? How can I tell?) > >>> > >>> There are reports over the last few years [1,2,3,4] that suggest that > there's a ZFS bug that attempts to use space past the end of the L2ARC, > resulting both in l2_size being larger than is possible and also in > io_errors and bad cksums (when the nonexistent sectors are read back). But > given that this behavior has been reported off and on for several years > now, and many of the threads devolve into supposition and folklore, I'm > hoping to get an informed answer about what these statistics mean, whether > the numbers I'm seeing indicate a problem or not, and be able to make a > judgment about whether a given fix in FreeBSD might solve the problem. > >>> > >>> FWIW, I'm seeing these problems on FreeBSD 10.0 and 10.1; I'm not > seeing them on 9.2.
> >>> > >>> > >>> [1] > https://lists.freebsd.org/pipermail/freebsd-current/2013-October/045088.html > >>> [2] https://forums.freebsd.org/threads/l2arc-degraded.47540/ > >>> [3] > https://lists.freebsd.org/pipermail/freebsd-fs/2014-October/020256.html > >>> [4] https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=198242 > >> > >> > >> I think the checksum/IO problems as well as the huge reported size > >> of your L2ARC are both a result of a problem described at the following > >> url > >> > >> https://reviews.freebsd.org/D2764 > >> > >> Not sure if a fix is in 10.2 or not yet. > > > > The fix is not in head yet. > > And the patch needs to be rebased after the recent large imports of the > > upstream code. > > An updated patch for head is here > https://reviews.freebsd.org/D2764?download=true > Testers are welcome! > > > -- > Andriy Gapon > _______________________________________________ > freebsd-fs@freebsd.org mailing list > https://lists.freebsd.org/mailman/listinfo/freebsd-fs > To unsubscribe, send any mail to "freebsd-fs-unsubscribe@freebsd.org" > From owner-freebsd-fs@freebsd.org Fri Aug 21 13:28:33 2015 Return-Path: Delivered-To: freebsd-fs@mailman.ysv.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by mailman.ysv.freebsd.org (Postfix) with ESMTP id 780AA9BD911 for ; Fri, 21 Aug 2015 13:28:33 +0000 (UTC) (envelope-from avg@FreeBSD.org) Received: from citadel.icyb.net.ua (citadel.icyb.net.ua [212.40.38.140]) by mx1.freebsd.org (Postfix) with ESMTP id 8663767; Fri, 21 Aug 2015 13:28:32 +0000 (UTC) (envelope-from avg@FreeBSD.org) Received: from porto.starpoint.kiev.ua (porto-e.starpoint.kiev.ua [212.40.38.100]) by citadel.icyb.net.ua (8.8.8p3/ICyb-2.3exp) with ESMTP id QAA25292; Fri, 21 Aug 2015 16:28:30 +0300 (EEST) (envelope-from avg@FreeBSD.org) Received: from localhost ([127.0.0.1]) by porto.starpoint.kiev.ua with esmtp (Exim 4.34 (FreeBSD)) id 1ZSmMv-000Drh-Py; Fri, 21 Aug 2015 16:28:30 +0300 Subject: Re: 
ZFS L2ARC statistics interpretation To: Sami Halabi References: <0CEC2752-7787-4C6D-99E2-E7D7BF238449@omnigroup.com> <20150820002946.GD13503@in-addr.com> <55D582F9.6020207@FreeBSD.org> <55D71A56.2080300@FreeBSD.org> Cc: freebsd-fs@FreeBSD.org, Wim Lewis , Gary Palmer From: Andriy Gapon Message-ID: <55D7272F.6020106@FreeBSD.org> Date: Fri, 21 Aug 2015 16:27:11 +0300 User-Agent: Mozilla/5.0 (Windows NT 6.3; WOW64; rv:38.0) Gecko/20100101 Thunderbird/38.2.0 MIME-Version: 1.0 In-Reply-To: Content-Type: text/plain; charset=utf-8 Content-Transfer-Encoding: 8bit X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.20 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Fri, 21 Aug 2015 13:28:33 -0000 On 21/08/2015 16:20, Sami Halabi wrote: > Will there be a patch for 10.2 ? There was a patch against an older version of head that should have been applicable to 10.2. It can still be accessed via https://reviews.freebsd.org/D2764?vs=on&id=6055&whitespace=ignore-most&download=true > On 21 Aug 2015 15:33, "Andriy Gapon" > wrote: > On 20/08/2015 10:34, Andriy Gapon wrote: > > On 20/08/2015 03:29, Gary Palmer wrote: > >> On Wed, Aug 19, 2015 at 04:08:47PM -0700, Wim Lewis wrote: > >>> I'm trying to understand some problems we've been having with > our ZFS systems, in particular their L2ARC performance. Before I > make too many guesses about what's going on, I'm hoping someone can > clarify what some of the ZFS statistics actually mean, or point me > to documentation if any > exists. > >>> > >>> In particular, I'm hoping someone can tell me the interpretation of: > >>> > >>> Errors: > >>> kstat.zfs.misc.arcstats.l2_cksum_bad > >>> kstat.zfs.misc.arcstats.l2_io_error > >>> > >>> Other than problems with the underlying disk (or controller or > cable or...), are there reasons for these counters to be nonzero? On > some of our systems, they increase fairly rapidly (20000/day). 
Is > this considered normal, or does it indicate a problem? If a problem, > what should I be looking at? > >>> > >>> Size: > >>> kstat.zfs.misc.arcstats.l2_size > >>> kstat.zfs.misc.arcstats.l2_asize > >>> > >>> What does l2_size/l2_asize measure? Compressed or uncompressed > size? It sometimes tops out at roughly the size of my L2ARC device, > and sometimes just continually grows (e.g., one of my systems has an > l2_size of about 1.3T but a 190G L2ARC; I doubt I'm getting nearly > 7:1 compression on my dataset! But maybe I am? How can I tell?) > >>> > >>> There are reports over the last few years [1,2,3,4] that suggest > that there's a ZFS bug that attempts to use space past the end of > the L2ARC, resulting both in l2_size being larger than is possible > and also in io_errors and bad cksums (when the nonexistent sectors > are read back). But given that this behavior has been reported off > and on for several years now, and many of the threads devolve into > supposition and folklore, I'm hoping to get an informed answer about > what these statistics mean, whether the numbers I'm seeing indicate > a problem or not, and be able to make a judgment about whether a > given fix in FreeBSD might solve the problem. > >>> > >>> FWIW, I'm seeing these problems on FreeBSD 10.0 and 10.1; I'm > not seeing them on 9.2. > >>> > >>> > >>> [1] > https://lists.freebsd.org/pipermail/freebsd-current/2013-October/045088.html > >>> [2] https://forums.freebsd.org/threads/l2arc-degraded.47540/ > >>> [3] > https://lists.freebsd.org/pipermail/freebsd-fs/2014-October/020256.html > >>> [4] https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=198242 > >> > >> > >> I think the checksum/IO problems as well as the huge reported size > >> of your L2ARC are both a result of a problem described at the > following > >> url > >> > >> https://reviews.freebsd.org/D2764 > >> > >> Not sure if a fix is in 10.2 or not yet. > > > > The fix is not in head yet. 
> > And the patch needs to be rebased after the recent large imports > of the > > upstream code. > > An updated patch for head is here > https://reviews.freebsd.org/D2764?download=true > Testers are welcome! > > > -- > Andriy Gapon > _______________________________________________ > freebsd-fs@freebsd.org mailing list > https://lists.freebsd.org/mailman/listinfo/freebsd-fs > To unsubscribe, send any mail to "freebsd-fs-unsubscribe@freebsd.org > " > -- Andriy Gapon From owner-freebsd-fs@freebsd.org Fri Aug 21 17:34:35 2015 Return-Path: Delivered-To: freebsd-fs@mailman.ysv.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by mailman.ysv.freebsd.org (Postfix) with ESMTP id E8E089BF920 for ; Fri, 21 Aug 2015 17:34:35 +0000 (UTC) (envelope-from truckman@FreeBSD.org) Received: from gw.catspoiler.org (cl-1657.chi-02.us.sixxs.net [IPv6:2001:4978:f:678::2]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (Client CN "gw.catspoiler.org", Issuer "gw.catspoiler.org" (not verified)) by mx1.freebsd.org (Postfix) with ESMTPS id 7DD1E34C for ; Fri, 21 Aug 2015 17:34:35 +0000 (UTC) (envelope-from truckman@FreeBSD.org) Received: from FreeBSD.org (mousie.catspoiler.org [192.168.101.2]) by gw.catspoiler.org (8.15.2/8.15.2) with ESMTP id t7LHYQ33096035 for ; Fri, 21 Aug 2015 10:34:30 -0700 (PDT) (envelope-from truckman@FreeBSD.org) Message-Id: <201508211734.t7LHYQ33096035@gw.catspoiler.org> Date: Fri, 21 Aug 2015 10:34:26 -0700 (PDT) From: Don Lewis Subject: Re: solaris assert: avl_is_empty(&dn -> dn_dbufs) panic To: freebsd-fs@FreeBSD.org In-Reply-To: <201508210834.t7L8Yigk093994@gw.catspoiler.org> MIME-Version: 1.0 Content-Type: TEXT/plain; charset=us-ascii X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.20 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Fri, 21 Aug 2015 17:34:36 -0000 On 21 Aug, Don Lewis wrote: > I just 
started getting this panic: > > solaris assert: avl_is_empty(&dn -> dn_dbufs), file: > /usr/src/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/dnode_sync.c, > line 495 > > System info: > FreeBSD zipper.catspoiler.org 11.0-CURRENT FreeBSD 11.0-CURRENT #25 r286923: Wed Aug 19 09:28:53 PDT 2015 dl@zipper.catspoiler.org:/usr/obj/usr/src/sys/GENERIC amd64 > > My zfs pool has one mirrored vdev. Scrub doesn't find any problems. > > %zpool status > pool: zroot > state: ONLINE > scan: scrub repaired 0 in 2h58m with 0 errors on Fri Aug 21 00:44:52 2015 > config: > > NAME STATE READ WRITE CKSUM > zroot ONLINE 0 0 0 > mirror-0 ONLINE 0 0 0 > ada0p3 ONLINE 0 0 0 > ada1p3 ONLINE 0 0 0 > > This panic is reproducible and happens every time I use poudriere to > build ports using my 9.3-RELEASE amd64 jail and occurs at the end of the > poudriere run when it is unmounting filesystems. > > [00:10:43] ====>> Stopping 4 builders > 93amd64-default-job-01: removed > 93amd64-default-job-01-n: removed > 93amd64-default-job-02: removed > 93amd64-default-job-02-n: removed > 93amd64-default-job-03: removed > 93amd64-default-job-03-n: removed > 93amd64-default-job-04: removed > 93amd64-default-job-04-n: removed > [00:10:46] ====>> Creating pkgng repository > Creating repository in /tmp/packages: 100% > Packing files for repository: 100% > [00:10:55] ====>> Committing packages to repository > [00:10:55] ====>> Removing old packages > [00:10:55] ====>> Built ports: devel/py-pymtbl net/sie-nmsg net/p5-Net-Nmsg net/axa > [93amd64-default] [2015-08-21_00h47m41s] [committing:] Queued: 4 Built: 4 Failed: 0 Skipped: 0 Ignored: 0 Tobuild: 0 Time: 00:10:53 > [00:10:55] ====>> Logs: /var/poudriere/data/logs/bulk/93amd64-default/2015-08-21_00h47m41s > [00:10:55] ====>> Cleaning up > 93amd64-default: removed > 93amd64-default-n: removed > [00:10:55] ====>> Umounting file systems > Write failed: Broken pipe > > Prior to that, I ran poudriere a number of times with a 10.2-STABLE > amd64 jail without
incident. > > I've kicked off a bunch of poudriere runs for other jails and > will check on it in the morning. Died the same way after building ports on the first jail, 10.1-RELEASE amd64. Since there have been some zfs commits since r286923, I upgraded to r286998 this morning and tried again with no better luck. I got the same panic again. This machine has mirrored swap, and even though I've done what gmirror(8) says to do in order to capture crash dumps, I've had no luck with that. The dump is getting written, but savecore is unable to find it. From owner-freebsd-fs@freebsd.org Fri Aug 21 17:48:59 2015 Return-Path: Delivered-To: freebsd-fs@mailman.ysv.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by mailman.ysv.freebsd.org (Postfix) with ESMTP id EC08F9BFC99 for ; Fri, 21 Aug 2015 17:48:58 +0000 (UTC) (envelope-from truckman@FreeBSD.org) Received: from gw.catspoiler.org (cl-1657.chi-02.us.sixxs.net [IPv6:2001:4978:f:678::2]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (Client CN "gw.catspoiler.org", Issuer "gw.catspoiler.org" (not verified)) by mx1.freebsd.org (Postfix) with ESMTPS id 7D9F7E01 for ; Fri, 21 Aug 2015 17:48:58 +0000 (UTC) (envelope-from truckman@FreeBSD.org) Received: from FreeBSD.org (mousie.catspoiler.org [192.168.101.2]) by gw.catspoiler.org (8.15.2/8.15.2) with ESMTP id t7LHmo96096088 for ; Fri, 21 Aug 2015 10:48:54 -0700 (PDT) (envelope-from truckman@FreeBSD.org) Message-Id: <201508211748.t7LHmo96096088@gw.catspoiler.org> Date: Fri, 21 Aug 2015 10:48:50 -0700 (PDT) From: Don Lewis Subject: Re: solaris assert: avl_is_empty(&dn -> dn_dbufs) panic To: freebsd-fs@FreeBSD.org In-Reply-To: <201508211734.t7LHYQ33096035@gw.catspoiler.org> MIME-Version: 1.0 Content-Type: TEXT/plain; charset=us-ascii X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.20 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , 
X-List-Received-Date: Fri, 21 Aug 2015 17:48:59 -0000 On 21 Aug, Don Lewis wrote: > [...] Not sure what is happening with savecore during boot, but I was able to run it manually and collect the crash dump.
Unread portion of the kernel message buffer: panic: solaris assert: avl_is_empty(&dn->dn_dbufs), file: /usr/src/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/dnode_sync.c, line: 495 cpuid = 1 KDB: stack backtrace: db_trace_self_wrapper() at db_trace_self_wrapper+0x2b/frame 0xfffffe0859e4e4e0 vpanic() at vpanic+0x189/frame 0xfffffe0859e4e560 panic() at panic+0x43/frame 0xfffffe0859e4e5c0 assfail() at assfail+0x1a/frame 0xfffffe0859e4e5d0 dnode_sync() at dnode_sync+0x6c8/frame 0xfffffe0859e4e6b0 dmu_objset_sync_dnodes() at dmu_objset_sync_dnodes+0x2b/frame 0xfffffe0859e4e6e0 dmu_objset_sync() at dmu_objset_sync+0x29e/frame 0xfffffe0859e4e7b0 dsl_pool_sync() at dsl_pool_sync+0x348/frame 0xfffffe0859e4e820 spa_sync() at spa_sync+0x442/frame 0xfffffe0859e4e910 txg_sync_thread() at txg_sync_thread+0x23d/frame 0xfffffe0859e4e9f0 fork_exit() at fork_exit+0x84/frame 0xfffffe0859e4ea30 fork_trampoline() at fork_trampoline+0xe/frame 0xfffffe0859e4ea30 --- trap 0, rip = 0, rsp = 0, rbp = 0 --- KDB: enter: panic (kgdb) bt #0 doadump (textdump=0) at pcpu.h:221 #1 0xffffffff8037bb86 in db_fncall (dummy1=, dummy2=, dummy3=, dummy4=) at /usr/src/sys/ddb/db_command.c:568 #2 0xffffffff8037b941 in db_command (cmd_table=0x0) at /usr/src/sys/ddb/db_command.c:440 #3 0xffffffff8037b5d4 in db_command_loop () at /usr/src/sys/ddb/db_command.c:493 #4 0xffffffff8037e18b in db_trap (type=, code=0) at /usr/src/sys/ddb/db_main.c:251 #5 0xffffffff80a5b294 in kdb_trap (type=3, code=0, tf=) at /usr/src/sys/kern/subr_kdb.c:654 #6 0xffffffff80e6a4b1 in trap (frame=0xfffffe0859e4e410) at /usr/src/sys/amd64/amd64/trap.c:540 #7 0xffffffff80e49f22 in calltrap () at /usr/src/sys/amd64/amd64/exception.S:235 #8 0xffffffff80a5a96e in kdb_enter (why=0xffffffff81379010 "panic", msg=0xffffffff80a60b60 "UH\211\ufffdAWAVATSH\203\ufffdPI\211\ufffdA\211\ufffdH\213\004%Py\ufffd\201H\211E\ufffd\201<%\ufffd\210\ufffd\201") at cpufunc.h:63 #9 0xffffffff80a1e2c9 in vpanic (fmt=, ap=) at 
/usr/src/sys/kern/kern_shutdown.c:619 #10 0xffffffff80a1e333 in panic (fmt=0xffffffff81aafa90 "\004") at /usr/src/sys/kern/kern_shutdown.c:557 ---Type to continue, or q to quit--- #11 0xffffffff8240922a in assfail (a=, f=, l=) at /usr/src/sys/cddl/compat/opensolaris/kern/opensolaris_cmn_err.c:81 #12 0xffffffff820d4f78 in dnode_sync (dn=0xfffff8040b72d3d0, tx=0xfffff8001598ec00) at /usr/src/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/dnode_sync.c:495 #13 0xffffffff820c922b in dmu_objset_sync_dnodes (list=0xfffff80007712b90, newlist=, tx=) at /usr/src/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/dmu_objset.c:1045 #14 0xffffffff820c8ede in dmu_objset_sync (os=0xfffff80007712800, pio=, tx=0xfffff8001598ec00) at /usr/src/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/dmu_objset.c:1163 #15 0xffffffff820e8e78 in dsl_pool_sync (dp=0xfffff80015676000, txg=2660975) at /usr/src/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/dsl_pool.c:536 #16 0xffffffff8210dca2 in spa_sync (spa=0xfffffe00089c6000, txg=2660975) at /usr/src/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/spa.c:6641 #17 0xffffffff8211843d in txg_sync_thread (arg=0xfffff80015676000) at /usr/src/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/txg.c:517 #18 0xffffffff809e47c4 in fork_exit ( callout=0xffffffff82118200 , arg=0xfffff80015676000, frame=0xfffffe0859e4ea40) at /usr/src/sys/kern/kern_fork.c:1006 ---Type to continue, or q to quit--- #19 0xffffffff80e4a45e in fork_trampoline () at /usr/src/sys/amd64/amd64/exception.S:610 #20 0x0000000000000000 in ?? 
() Current language: auto; currently minimal From owner-freebsd-fs@freebsd.org Fri Aug 21 18:38:15 2015 Return-Path: Delivered-To: freebsd-fs@mailman.ysv.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by mailman.ysv.freebsd.org (Postfix) with ESMTP id 95F6F9BF4B0 for ; Fri, 21 Aug 2015 18:38:15 +0000 (UTC) (envelope-from delphij@delphij.net) Received: from anubis.delphij.net (anubis.delphij.net [64.62.153.212]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (Client CN "anubis.delphij.net", Issuer "StartCom Class 1 Primary Intermediate Server CA" (not verified)) by mx1.freebsd.org (Postfix) with ESMTPS id 708CD307; Fri, 21 Aug 2015 18:38:15 +0000 (UTC) (envelope-from delphij@delphij.net) Received: from zeta.ixsystems.com (unknown [12.229.62.2]) (using TLSv1 with cipher ECDHE-RSA-AES256-SHA (256/256 bits)) (Client did not present a certificate) by anubis.delphij.net (Postfix) with ESMTPSA id 057521D322; Fri, 21 Aug 2015 11:38:07 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=delphij.net; s=anubis; t=1440182288; x=1440196688; bh=oTW3ttZTPdJB7vnVwJpBRj9OoOlr6z0YR0t4JIRS+vc=; h=Reply-To:Subject:References:To:Cc:From:Date:In-Reply-To; b=OR8zq+yRcvGz2+B2rOGoKx7JB+9YysrUl3DrQTHS628oaM0ih1szplrSbZX3REsga POrgLxe9e8o93+u/N1njLWaDdwsbhheeX4HMNfocaqN5xmnLYZ/NMQhy0ZomLfw62b 586Gq1bKwYtktmW+0H4GqOyKnUIm6agFdQidW+5Q= Reply-To: d@delphij.net Subject: Re: solaris assert: avl_is_empty(&dn -> dn_dbufs) panic References: <201508211748.t7LHmo96096088@gw.catspoiler.org> To: Don Lewis , freebsd-fs@FreeBSD.org Cc: "Justin T. 
Gibbs" , George Wilson From: Xin Li X-Enigmail-Draft-Status: N1110 Organization: The FreeBSD Project Message-ID: <55D7700B.3080207@delphij.net> Date: Fri, 21 Aug 2015 11:38:03 -0700 MIME-Version: 1.0 In-Reply-To: <201508211748.t7LHmo96096088@gw.catspoiler.org> Content-Type: multipart/signed; micalg=pgp-sha512; protocol="application/pgp-signature"; boundary="0gaAdQp5Winb1VPmqwelCaUlbW8WLnfDR" X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.20 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Fri, 21 Aug 2015 18:38:15 -0000 This is an OpenPGP/MIME signed message (RFC 4880 and 3156) --0gaAdQp5Winb1VPmqwelCaUlbW8WLnfDR Content-Type: text/plain; charset=windows-1252 Content-Transfer-Encoding: quoted-printable Hi, A quick glance at the changes suggests that Justin's changeset may be related. The reasoning is here: https://reviews.csiden.org/r/131/ Related Illumos ticket: https://www.illumos.org/issues/5056 In dnode_evict_dbufs(), remove multiple passes over dn->dn_dbufs. This is possible now that objset eviction is asynchronously completed in a different context once dbuf eviction completes. In the case of objset eviction, any dbufs held by children will be evicted via dbuf_rele_and_unlock() once their refcounts go to zero. Even when objset eviction is not active, the ordering of the avl tree guarantees that children will be released before parents, allowing the parent's refcounts to naturally drop to zero before they are inspected in this single loop. ==== So, upon return from dnode_evict_dbufs(), there could be some DB_EVICTING buffers on the AVL pending release, which breaks the invariant. Should we restore the loop where we yield briefly with the lock released, then reacquire and recheck?
Cheers, On 08/21/15 10:48, Don Lewis wrote: > [...] -- Xin LI https://www.delphij.net/ FreeBSD - The Power to Serve! Live free or die --0gaAdQp5Winb1VPmqwelCaUlbW8WLnfDR-- From owner-freebsd-fs@freebsd.org Fri Aug 21 20:22:32 2015 Return-Path: Delivered-To: freebsd-fs@mailman.ysv.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by mailman.ysv.freebsd.org (Postfix) with ESMTP id C54159BF5E3 for ; Fri, 21 Aug 2015 20:22:32 +0000 (UTC) (envelope-from truckman@FreeBSD.org) Received: from gw.catspoiler.org (cl-1657.chi-02.us.sixxs.net [IPv6:2001:4978:f:678::2]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (Client CN "gw.catspoiler.org", Issuer "gw.catspoiler.org"
(not verified)) by mx1.freebsd.org (Postfix) with ESMTPS id 710F6143F; Fri, 21 Aug 2015 20:22:32 +0000 (UTC) (envelope-from truckman@FreeBSD.org) Received: from FreeBSD.org (mousie.catspoiler.org [192.168.101.2]) by gw.catspoiler.org (8.15.2/8.15.2) with ESMTP id t7LKM5Vg096499; Fri, 21 Aug 2015 13:22:10 -0700 (PDT) (envelope-from truckman@FreeBSD.org) Message-Id: <201508212022.t7LKM5Vg096499@gw.catspoiler.org> Date: Fri, 21 Aug 2015 13:22:05 -0700 (PDT) From: Don Lewis Subject: Re: solaris assert: avl_is_empty(&dn -> dn_dbufs) panic To: d@delphij.net cc: freebsd-fs@FreeBSD.org, gibbs@freebsd.org, george.wilson@delphix.com In-Reply-To: <55D7700B.3080207@delphij.net> MIME-Version: 1.0 Content-Type: TEXT/plain; charset=us-ascii X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.20 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Fri, 21 Aug 2015 20:22:32 -0000 On 21 Aug, Xin Li wrote: > [...] Thanks. Changing USE_TMPFS in poudriere.conf from "data localbase" to "all" has me temporarily up and running. That may not work so well for my larger package build runs, which will now be a lot more RAM intensive.
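Don's workaround corresponds to a one-line poudriere.conf change. As a config fragment (only the relevant line is shown, and this sidesteps the panic by avoiding ZFS dataset churn rather than fixing the bug):

```sh
# /usr/local/etc/poudriere.conf -- workaround, not a fix.
# Previously: USE_TMPFS="data localbase"
# Build jails entirely on tmpfs so the ZFS dataset creation/destruction
# that triggers the dnode_sync() assertion is avoided, at the cost of
# much higher RAM use during large builds.
USE_TMPFS="all"
```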