From owner-freebsd-fs@FreeBSD.ORG Sun Oct 19 15:30:34 2014 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [8.8.178.115]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by hub.freebsd.org (Postfix) with ESMTPS id 9DADC6BD; Sun, 19 Oct 2014 15:30:34 +0000 (UTC) Received: from mail.jrv.org (adsl-70-243-84-11.dsl.austtx.swbell.net [70.243.84.11]) by mx1.freebsd.org (Postfix) with ESMTP id 5128D13E; Sun, 19 Oct 2014 15:30:33 +0000 (UTC) Received: from localhost (localhost.localdomain [127.0.0.1]) by mail.jrv.org (Postfix) with ESMTP id A8C7C1A86E3; Sun, 19 Oct 2014 10:30:26 -0500 (CDT) Received: from mail.jrv.org ([127.0.0.1]) by localhost (zimbra64.housenet.jrv [127.0.0.1]) (amavisd-new, port 10032) with ESMTP id 6984uc3Y5_5t; Sun, 19 Oct 2014 10:30:16 -0500 (CDT) Received: from localhost (localhost.localdomain [127.0.0.1]) by mail.jrv.org (Postfix) with ESMTP id C8D1B1A86CF; Sun, 19 Oct 2014 10:30:16 -0500 (CDT) X-Virus-Scanned: amavisd-new at zimbra64.housenet.jrv Received: from mail.jrv.org ([127.0.0.1]) by localhost (zimbra64.housenet.jrv [127.0.0.1]) (amavisd-new, port 10026) with ESMTP id gizZiRC4ska5; Sun, 19 Oct 2014 10:30:16 -0500 (CDT) Received: from [192.168.138.128] (BMX.housenet.jrv [192.168.3.140]) by mail.jrv.org (Postfix) with ESMTPSA id A059C1A86CC; Sun, 19 Oct 2014 10:30:16 -0500 (CDT) Message-ID: <5443D918.9090307@jrv.org> Date: Sun, 19 Oct 2014 10:30:32 -0500 From: "James R. 
Van Artsdalen"
User-Agent: Mozilla/5.0 (Windows NT 5.0; rv:12.0) Gecko/20120428 Thunderbird/12.0.1
MIME-Version: 1.0
Subject: Re: zfs recv hangs in kmem arena
References: <54250AE9.6070609@jrv.org> <543FAB3C.4090503@jrv.org> <543FEE6F.5050007@delphij.net> <54409050.4070401@jrv.org> <544096B3.20306@delphij.net> <54409CFE.8070905@jrv.org>
In-Reply-To: <54409CFE.8070905@jrv.org>
Content-Type: text/plain; charset=windows-1252
Content-Transfer-Encoding: 7bit
Cc: freebsd-fs@freebsd.org, d@delphij.net, current@freebsd.org
X-BeenThere: freebsd-fs@freebsd.org
X-Mailman-Version: 2.1.18-1
Precedence: list
List-Id: Filesystems
X-List-Received-Date: Sun, 19 Oct 2014 15:30:34 -0000

With kern.maxfiles removed from loader.conf, zfs recv still hangs in
"kmem arena".

I tried a memstick image of -CURRENT made from the release/ process,
and this also hangs in "kmem arena".

An uninvolved server of mine hung Friday night in state "kmem arena"
during periodic's "zpool history". After a reboot it did not hang
Saturday night.

On 10/16/2014 11:37 PM, James R. Van Artsdalen wrote:
> On 10/16/2014 11:10 PM, Xin Li wrote:
>> On 10/16/14 8:43 PM, James R. Van Artsdalen wrote:
>>> On 10/16/2014 11:12 AM, Xin Li wrote:
>>>>> On 9/26/2014 1:42 AM, James R. Van Artsdalen wrote:
>>>>>> FreeBSD BLACKIE.housenet.jrv 10.1-BETA2 FreeBSD 10.1-BETA2
>>>>>> #2 r272070M: Wed Sep 24 17:36:56 CDT 2014
>>>>>> james@BLACKIE.housenet.jrv:/usr/obj/usr/src/sys/GENERIC
>>>>>> amd64
>>>>>>
>>>>>> With current STABLE10 I am unable to replicate a ZFS pool
>>>>>> using zfs send/recv without zfs hanging in state "kmem
>>>>>> arena", within the first 4TB or so (of a 23TB Pool).
>>>> What does procstat -kk 1176 (or the PID of your 'zfs' process
>>>> that stuck in that state) say?
>>>>
>>>> Cheers,
>>>>
>>> SUPERTEX:/root# ps -lp 866
>>> UID PID PPID CPU PRI NI   VSZ   RSS MWCHAN   STAT TT     TIME COMMAND
>>>   0 866  863   0  52  0 66800 29716 kmem are D+    1 57:40.82 zfs recv -duvF BIGTOX
>>> SUPERTEX:/root# procstat -kk 866
>>>   PID    TID COMM TDNAME KSTACK
>>>   866 101573 zfs  -      mi_switch+0xe1
>>> sleepq_wait+0x3a _cv_wait+0x16d vmem_xalloc+0x568 vmem_alloc+0x3d
>>> kmem_malloc+0x33 keg_alloc_slab+0xcd keg_fetch_slab+0x151
>>> zone_fetch_slab+0x7e zone_import+0x40 uma_zalloc_arg+0x34e
>>> arc_get_data_buf+0x31a arc_buf_alloc+0xaa dmu_buf_will_fill+0x169
>>> dmu_write+0xfc dmu_recv_stream+0xd40 zfs_ioc_recv+0x94e
>>> zfsdev_ioctl+0x5ca
>> Do you have any special tuning in your /boot/loader.conf?
>>
>> Cheers,
>>
> Below. I had forgotten some of this was there.
>
> After sending the previous message I ran kgdb to see if I could get a
> backtrace with function args. I didn't see how to do it for this proc,
> but during all this the process un-blocked and started running again.
>
> The process blocked again in kmem arena after a few minutes.
>
>
> SUPERTEX:/root# cat /boot/loader.conf
> zfs_load="YES" # ZFS
> vfs.root.mountfrom="zfs:SUPERTEX/UNIX" # Specify root partition in a way the
> # kernel understands
> kern.maxfiles="32K" # Set the sys.
wide open files limit
> kern.ktrace.request_pool="512"
> #vfs.zfs.debug=1
> vfs.zfs.check_hostid=0
>
> loader_logo="beastie" # Desired logo: fbsdbw, beastiebw, beastie, none
> boot_verbose="YES" # -v: Causes extra debugging information to be printed
> geom_mirror_load="YES" # RAID1 disk driver (see gmirror(8))
> geom_label_load="YES" # File system labels (see glabel(8))
> ahci_load="YES"
> siis_load="YES"
> mvs_load="YES"
> coretemp_load="YES" # Intel Core CPU temperature monitor
> #console="comconsole"
> kern.msgbufsize="131072" # Set size of kernel message buffer
>
> kern.geom.label.gpt.enable=0
> kern.geom.label.gptid.enable=0
> kern.geom.label.disk_ident.enable=0
> SUPERTEX:/root#
>

From owner-freebsd-fs@FreeBSD.ORG Sun Oct 19 21:00:09 2014
Return-Path:
Delivered-To: freebsd-fs@FreeBSD.org
Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by hub.freebsd.org (Postfix) with ESMTPS id 9CD7A689 for ; Sun, 19 Oct 2014 21:00:09 +0000 (UTC)
Received: from kenobi.freebsd.org (kenobi.freebsd.org [IPv6:2001:1900:2254:206a::16:76]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (Client did not present a certificate) by mx1.freebsd.org (Postfix) with ESMTPS id 73A446E9 for ; Sun, 19 Oct 2014 21:00:09 +0000 (UTC)
Received: from bugs.freebsd.org ([127.0.1.118]) by kenobi.freebsd.org (8.14.9/8.14.9) with ESMTP id s9JL09Cf000904 for ; Sun, 19 Oct 2014 21:00:09 GMT (envelope-from bugzilla-noreply@FreeBSD.org)
Message-Id: <201410192100.s9JL09Cf000904@kenobi.freebsd.org>
From: bugzilla-noreply@FreeBSD.org
To: freebsd-fs@FreeBSD.org
Subject: Problem reports for freebsd-fs@FreeBSD.org that need special attention
X-Bugzilla-URL: https://bugs.freebsd.org/bugzilla/
Auto-Submitted: auto-generated
MIME-Version: 1.0
Date: Sun, 19 Oct 2014 21:00:09 +0000
Content-Type: text/plain; charset="UTF-8"
X-BeenThere: freebsd-fs@freebsd.org
X-Mailman-Version: 2.1.18-1
Precedence: list
List-Id: Filesystems
X-List-Received-Date: Sun, 19 Oct 2014 21:00:09 -0000

To view an individual PR, use:
  https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=(Bug Id).

The following is a listing of current problems submitted by FreeBSD users,
which need special attention. These represent problem reports covering
all versions including experimental development code and obsolete releases.

Status          |    Bug Id | Description
----------------+-----------+-------------------------------------------------
Needs MFC       |    136470 | [nfs] Cannot mount / in read-only, over NFS
Needs MFC       |    139651 | [nfs] mount(8): read-only remount of NFS volume
Needs MFC       |    144447 | [zfs] sharenfs fsunshare() & fsshare_main() non

3 problems total for which you should take action.

From owner-freebsd-fs@FreeBSD.ORG Mon Oct 20 08:00:09 2014
Return-Path:
Delivered-To: freebsd-fs@FreeBSD.org
Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by hub.freebsd.org (Postfix) with ESMTPS id 64929459 for ; Mon, 20 Oct 2014 08:00:09 +0000 (UTC)
Received: from kenobi.freebsd.org (kenobi.freebsd.org [IPv6:2001:1900:2254:206a::16:76]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (Client did not present a certificate) by mx1.freebsd.org (Postfix) with ESMTPS id 51C0AB0B for ; Mon, 20 Oct 2014 08:00:09 +0000 (UTC)
Received: from bugs.freebsd.org ([127.0.1.118]) by kenobi.freebsd.org (8.14.9/8.14.9) with ESMTP id s9K809v5038257 for ; Mon, 20 Oct 2014 08:00:09 GMT (envelope-from bugzilla-noreply@freebsd.org)
Message-Id: <201410200800.s9K809v5038257@kenobi.freebsd.org>
From: bugzilla-noreply@freebsd.org
To: freebsd-fs@FreeBSD.org
Subject: [FreeBSD Bugzilla] Commit Needs MFC
MIME-Version: 1.0
X-Bugzilla-Type: whine
X-Bugzilla-URL:
https://bugs.freebsd.org/bugzilla/
Auto-Submitted: auto-generated
Date: Mon, 20 Oct 2014 08:00:09 +0000
Content-Type: text/plain
X-Content-Filtered-By: Mailman/MimeDel 2.1.18-1
X-BeenThere: freebsd-fs@freebsd.org
X-Mailman-Version: 2.1.18-1
Precedence: list
List-Id: Filesystems
X-List-Received-Date: Mon, 20 Oct 2014 08:00:09 -0000

Hi,

You have a bug in the "Needs MFC" state which has not been touched in 7 or
more days. This email serves as a reminder that you may want to MFC this
bug or mark it as completed.

In the event you have a longer MFC timeout you may update this bug with a
comment and I won't remind you again for 7 days.

This reminder is only sent on Mondays. Please file a bug about concerns
you may have.

This search was scheduled by eadler@FreeBSD.org.

(3 bugs)

Bug 136470:
  https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=136470
  Severity: Affects Only Me
  Priority: Normal
  Hardware: Any
  Assignee: freebsd-fs@FreeBSD.org
  Status: Needs MFC
  Resolution:
  Summary: [nfs] Cannot mount / in read-only, over NFS

Bug 139651:
  https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=139651
  Severity: Affects Only Me
  Priority: Normal
  Hardware: Any
  Assignee: freebsd-fs@FreeBSD.org
  Status: Needs MFC
  Resolution:
  Summary: [nfs] mount(8): read-only remount of NFS volume does not work

Bug 144447:
  https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=144447
  Severity: Affects Only Me
  Priority: Normal
  Hardware: Any
  Assignee: freebsd-fs@FreeBSD.org
  Status: Needs MFC
  Resolution:
  Summary: [zfs] sharenfs fsunshare() & fsshare_main() non functional

From owner-freebsd-fs@FreeBSD.ORG Mon Oct 20 08:53:22 2014
Return-Path:
Delivered-To: freebsd-fs@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [8.8.178.115]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by hub.freebsd.org (Postfix) with ESMTPS id 54022AD7 for ; Mon, 20 Oct 2014 08:53:22 +0000 (UTC)
Received: from
umail.aei.mpg.de (umail.aei.mpg.de [194.94.224.6]) (using TLSv1 with cipher DHE-RSA-AES256-SHA (256/256 bits)) (Client did not present a certificate) by mx1.freebsd.org (Postfix) with ESMTPS id E93B9111 for ; Mon, 20 Oct 2014 08:53:21 +0000 (UTC) Received: from localhost (localhost [127.0.0.1]) by umail.aei.mpg.de (Postfix) with ESMTP id AEDD2200CA0; Mon, 20 Oct 2014 10:44:58 +0200 (CEST) X-Virus-Scanned: by amavisd-new-2.6.4 (20090625) (Debian) at aei.mpg.de Received: from umail.aei.mpg.de ([127.0.0.1]) by localhost (umail.aei.mpg.de [127.0.0.1]) (amavisd-new, port 10024) with ESMTP id ZRIyCxMjwEUJ; Mon, 20 Oct 2014 10:44:58 +0200 (CEST) Received: from mailgate.aei.mpg.de (mailgate.aei.mpg.de [194.94.224.5]) by umail.aei.mpg.de (Postfix) with ESMTP id 94D7E200C9F; Mon, 20 Oct 2014 10:44:58 +0200 (CEST) Received: from mailgate.aei.mpg.de (localhost [127.0.0.1]) by localhost (Postfix) with SMTP id 82B5440588A; Mon, 20 Oct 2014 10:44:58 +0200 (CEST) Received: from intranet.aei.uni-hannover.de (ahin1.aei.uni-hannover.de [130.75.117.40]) by mailgate.aei.mpg.de (Postfix) with ESMTP id 5D688406AF1; Mon, 20 Oct 2014 10:44:58 +0200 (CEST) Received: from arc.aei.uni-hannover.de ([10.117.15.110]) by intranet.aei.uni-hannover.de (Lotus Domino Release 8.5.3FP6) with ESMTP id 2014102010444761-987 ; Mon, 20 Oct 2014 10:44:47 +0200 Date: Mon, 20 Oct 2014 10:44:48 +0200 From: Gerrit =?ISO-8859-1?Q?K=FChn?= To: Gerrit =?ISO-8859-1?Q?K=FChn?= Subject: Re: ZFS snapshot renames failing after upgrade to 9.2 Message-Id: <20141020104448.1a71b93a279b67d1b62254ba@aei.mpg.de> In-Reply-To: <7975_1390380918_52DF8776_7975_154_1_20140122094658.ea83cda2.gerrit.kuehn@aei.mpg.de> References: <0C9FD4E1-0549-4849-BFC5-D8C5D4A34D64@msqr.us> <54D3B3C002184A52BEC9B1543854B87F@multiplay.co.uk> <333D57C6A4544067880D9CFC04F02312@multiplay.co.uk> <26053_1387447492_52B2C4C4_26053_331_1_20131219105503.3a8d1df3.gerrit.kuehn@aei.mpg.de> <20131219165549.9f2ca709.gerrit.kuehn@aei.mpg.de> 
<20131219174054.91ac617a.gerrit.kuehn@aei.mpg.de> <20131220100522.382a39ac.gerrit.kuehn@aei.mpg.de> <7975_1390380918_52DF8776_7975_154_1_20140122094658.ea83cda2.gerrit.kuehn@aei.mpg.de>
Organization: Max Planck Gesellschaft
X-Mailer: Sylpheed 3.4.2 (GTK+ 2.24.22; amd64-portbld-freebsd10.0)
Mime-Version: 1.0
X-MIMETrack: Itemize by SMTP Server on intranet/aei-hannover(Release 8.5.3FP6|November 21, 2013) at 20.10.2014 10:44:47, Serialize by Router on intranet/aei-hannover(Release 8.5.3FP6|November 21, 2013) at 20.10.2014 10:44:57, Serialize complete at 20.10.2014 10:44:57
Content-Transfer-Encoding: quoted-printable
Content-Type: text/plain; charset=ISO-8859-1
Cc: freebsd-fs@freebsd.org
X-BeenThere: freebsd-fs@freebsd.org
X-Mailman-Version: 2.1.18-1
Precedence: list
List-Id: Filesystems
X-List-Received-Date: Mon, 20 Oct 2014 08:53:22 -0000

On Wed, 22 Jan 2014 09:46:58 +0100 Gerrit Kühn wrote about
Re: ZFS snapshot renames failing after upgrade to 9.2:

KH> I will look if I can check or fix it.

Any news on this?
I just upgraded a system to 9-stable and still see the same issue.
cu Gerrit From owner-freebsd-fs@FreeBSD.ORG Mon Oct 20 15:36:20 2014 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [8.8.178.115]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by hub.freebsd.org (Postfix) with ESMTPS id BCA7F2DA; Mon, 20 Oct 2014 15:36:20 +0000 (UTC) Received: from mx6-phx2.redhat.com (mx6-phx2.redhat.com [209.132.183.39]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (Client CN "mx1.redhat.com", Issuer "DigiCert SHA2 Extended Validation Server CA" (verified OK)) by mx1.freebsd.org (Postfix) with ESMTPS id 945C7CB6; Mon, 20 Oct 2014 15:36:20 +0000 (UTC) Received: from zmail12.collab.prod.int.phx2.redhat.com (zmail12.collab.prod.int.phx2.redhat.com [10.5.83.14]) by mx6-phx2.redhat.com (8.14.4/8.14.4) with ESMTP id s9KFaIug022018; Mon, 20 Oct 2014 11:36:19 -0400 Date: Mon, 20 Oct 2014 11:36:18 -0400 (EDT) From: Justin Clift To: Jordan Hubbard Message-ID: <287933729.7365215.1413819378546.JavaMail.zimbra@redhat.com> In-Reply-To: References: <0F20AEEC-6244-42BC-815C-1440BBBDE664@mail.turbofuzz.com> <0ABAE2AC-BF1B-4125-ACA9-C6177D013E25@mail.turbofuzz.com> <20140706230910.GA8523@ivaldir.etoilebsd.net> <2F416D06-0A98-4E66-902C-ED0690A4B1C0@ixsystems.com> <20140825211459.GB65120@ivaldir.etoilebsd.net> Subject: Re: FreeBSD support being added to GlusterFS MIME-Version: 1.0 Content-Type: text/plain; charset=utf-8 Content-Transfer-Encoding: 7bit X-Originating-IP: [10.5.82.7] X-Mailer: Zimbra 8.0.6_GA_5922 (ZimbraWebClient - FF31 (Linux)/8.0.6_GA_5922) Thread-Topic: FreeBSD support being added to GlusterFS Thread-Index: 8F1yB9gTBm45ui9fVrTji9D1UfdFDA== Cc: freebsd-fs@freebsd.org X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.18-1 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Mon, 20 Oct 2014 15:36:20 -0000 Does anyone have time to review: 
  https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=192701

It's a FUSE patch that's been hanging around a few months,
written by Harsha.

Harsha, is GlusterFS dependent on this?

+ Justin

--
GlusterFS - http://www.gluster.org

An open source, distributed file system scaling to several
petabytes, and handling thousands of clients.

My personal twitter: twitter.com/realjustinclift

From owner-freebsd-fs@FreeBSD.ORG Mon Oct 20 17:32:45 2014
Return-Path:
Delivered-To: freebsd-fs@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [8.8.178.115]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by hub.freebsd.org (Postfix) with ESMTPS id 22044FF1 for ; Mon, 20 Oct 2014 17:32:45 +0000 (UTC)
Received: from relay02.ioffe.ru (relay02.ioffe.ru [194.85.224.38]) (using TLSv1 with cipher DHE-RSA-AES256-SHA (256/256 bits)) (Client did not present a certificate) by mx1.freebsd.org (Postfix) with ESMTPS id 8900DE20 for ; Mon, 20 Oct 2014 17:32:43 +0000 (UTC)
Received: from mail.ioffe.ru (mail [194.85.224.39]) by relay02.ioffe.ru (8.14.5/8.12.6) with ESMTP id s9KHWfg4042025 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=NO) for ; Mon, 20 Oct 2014 21:32:41 +0400 (MSK) (envelope-from alex@putnichek.ru)
Received: from neptunmac.local ([194.85.230.94]) (authenticated bits=0) by mail.ioffe.ru (8.14.5/8.14.4) with ESMTP id s9KHWfn4004580 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES128-SHA bits=128 verify=NO) for ; Mon, 20 Oct 2014 21:32:41 +0400 (MSK) (envelope-from alex@putnichek.ru)
Message-ID: <54454739.1070900@putnichek.ru>
Date: Mon, 20 Oct 2014 21:32:41 +0400
From: Alex
User-Agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10.8; rv:24.0) Gecko/20100101 Thunderbird/24.6.0
MIME-Version: 1.0
To: freebsd-fs@freebsd.org
Subject: l2_io_error and l2_cksum_bad are not null and growing
Content-Type: text/plain; charset=UTF-8; format=flowed
Content-Transfer-Encoding: 7bit
X-BeenThere: freebsd-fs@freebsd.org
X-Mailman-Version: 2.1.18-1
Precedence: list
List-Id: Filesystems
X-List-Received-Date: Mon, 20 Oct 2014 17:32:45 -0000

Hello.

We seem to have a problem with L2ARC on a ZFS server running FreeBSD
10.1-BETA1 (10.1-BETA1 #1 r271710). The server has an L2ARC cache device
on an Intel 480G SSD. The problem is that the l2_io_error and l2_cksum_bad
values are constantly growing, although there are no signs of any hardware
malfunction. At the moment the values are
kstat.zfs.misc.arcstats.l2_io_error: 1501
kstat.zfs.misc.arcstats.l2_cksum_bad: 19480

Here is the output of zpool status:
  pool: zpool
 state: ONLINE
  scan: none requested
config:

        NAME                               STATE     READ WRITE CKSUM
        zpool                              ONLINE       0     0     0
          mirror-0                         ONLINE       0     0     0
            diskid/DISK-WD-WMC1P0DFSF47p2  ONLINE       0     0     0
            diskid/DISK-WD-WMC1P0DEFERYp2  ONLINE       0     0     0
        logs
          gpt/zil0                         ONLINE       0     0     0
        cache
          gpt/cache0                       ONLINE       0     0     0

errors: No known data errors

Here is the output of zfs-stats -L:
------------------------------------------------------------------------
ZFS Subsystem Report                            Mon Oct 20 19:29:38 2014
------------------------------------------------------------------------

L2 ARC Summary: (DEGRADED)
        Passed Headroom:                        72.42m
        Tried Lock Failures:                    360.55m
        IO In Progress:                         65
        Low Memory Aborts:                      101
        Free on Write:                          7.09k
        Writes While Full:                      16.90k
        R/W Clashes:                            11
        Bad Checksums:                          19.48k
        IO Errors:                              1.50k
        SPA Mismatch:                           1.18m

L2 ARC Size: (Adaptive)                         555.88 GiB
        Header Size:                    0.21%   1.17 GiB

L2 ARC Evicts:
        Lock Retries:                           10
        Upon Reading:                           0

L2 ARC Breakdown:                               272.76m
        Hit Ratio:                      0.35%   949.30k
        Miss Ratio:                     99.65%  271.81m
        Feeds:                                  2.19m

L2 ARC Buffer:
        Bytes Scanned:                          20.45 PiB
        Buffer Iterations:                      2.19m
        List Iterations:                        139.97m
        NULL List Iterations:                   5.58k

L2 ARC Writes:
        Writes Sent:                    100.00% 567.37k

Any help is welcome.
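
To confirm that the two counters are still climbing, two samples of the
sysctl output can be compared. The following is only a minimal sketch in
Python, assuming the "name: value" format shown above. The helper names
parse_counters and counter_growth are made up for this illustration, and
the second sample uses invented numbers, not measurements.

```python
def parse_counters(sysctl_text):
    """Turn 'name: value' lines (sysctl output format) into {name: int}."""
    counters = {}
    for line in sysctl_text.splitlines():
        name, sep, value = line.partition(":")
        value = value.strip()
        # Skip anything that is not a plain non-negative integer counter.
        if sep and value.isdigit():
            counters[name.strip()] = int(value)
    return counters


def counter_growth(before, after):
    """Return how much each counter present in both samples has grown."""
    return {name: after[name] - before[name] for name in after if name in before}


# First sample: the values quoted in this message.
first = parse_counters(
    "kstat.zfs.misc.arcstats.l2_io_error: 1501\n"
    "kstat.zfs.misc.arcstats.l2_cksum_bad: 19480\n"
)
# Second sample: hypothetical numbers, for illustration only.
second = parse_counters(
    "kstat.zfs.misc.arcstats.l2_io_error: 1510\n"
    "kstat.zfs.misc.arcstats.l2_cksum_bad: 19500\n"
)
print(counter_growth(first, second))
```

On a live system the two text samples would come from running
"sysctl kstat.zfs.misc.arcstats" some time apart; a non-zero difference
means new errors occurred during that interval.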
Best regards -- Alex From owner-freebsd-fs@FreeBSD.ORG Mon Oct 20 18:27:28 2014 Return-Path: Delivered-To: freebsd-fs@FreeBSD.org Received: from mx1.freebsd.org (mx1.freebsd.org [8.8.178.115]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by hub.freebsd.org (Postfix) with ESMTPS id 44BAD420 for ; Mon, 20 Oct 2014 18:27:28 +0000 (UTC) Received: from citadel.icyb.net.ua (citadel.icyb.net.ua [212.40.38.140]) by mx1.freebsd.org (Postfix) with ESMTP id 9C6966A0 for ; Mon, 20 Oct 2014 18:27:22 +0000 (UTC) Received: from porto.starpoint.kiev.ua (porto-e.starpoint.kiev.ua [212.40.38.100]) by citadel.icyb.net.ua (8.8.8p3/ICyb-2.3exp) with ESMTP id VAA25161; Mon, 20 Oct 2014 21:27:14 +0300 (EEST) (envelope-from avg@FreeBSD.org) Received: from localhost ([127.0.0.1]) by porto.starpoint.kiev.ua with esmtp (Exim 4.34 (FreeBSD)) id 1XgHfm-0000pH-De; Mon, 20 Oct 2014 21:27:14 +0300 Message-ID: <544553CA.1060406@FreeBSD.org> Date: Mon, 20 Oct 2014 21:26:18 +0300 From: Andriy Gapon User-Agent: Mozilla/5.0 (X11; FreeBSD amd64; rv:31.0) Gecko/20100101 Thunderbird/31.1.2 MIME-Version: 1.0 To: Alex , freebsd-fs@FreeBSD.org Subject: Re: l2_io_error and l2_cksum_bad are not null and growing References: <54454739.1070900@putnichek.ru> In-Reply-To: <54454739.1070900@putnichek.ru> Content-Type: text/plain; charset=windows-1252 Content-Transfer-Encoding: 7bit X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.18-1 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Mon, 20 Oct 2014 18:27:28 -0000 On 20/10/2014 20:32, Alex wrote: > Hello. > > We seem to have a problem with l2arc on zfs system on 10.1-BETA1 FreeBSD > 10.1-BETA1 #1 r271710 server. The server has a l2arc cache configured from Intel > SSD 480G disk. The problem is that l2_io_error and l2_cksum_bad values are > constantly growing, however there are no traces of any hardware malfunctioning. 
> As for now, the values are
> kstat.zfs.misc.arcstats.l2_io_error: 1501
> kstat.zfs.misc.arcstats.l2_cksum_bad: 19480

Please see if the following patch might help
https://github.com/avg-I/freebsd/compare/review/l2arc-write-target-size.diff

> Here is the output of zpool status:
>   pool: zpool
>  state: ONLINE
>   scan: none requested
> config:
>
>         NAME                               STATE     READ WRITE CKSUM
>         zpool                              ONLINE       0     0     0
>           mirror-0                         ONLINE       0     0     0
>             diskid/DISK-WD-WMC1P0DFSF47p2  ONLINE       0     0     0
>             diskid/DISK-WD-WMC1P0DEFERYp2  ONLINE       0     0     0
>         logs
>           gpt/zil0                         ONLINE       0     0     0
>         cache
>           gpt/cache0                       ONLINE       0     0     0
>
> errors: No known data errors
>
> Here is the output of zfs-stats -L:
> ------------------------------------------------------------------------
> ZFS Subsystem Report                            Mon Oct 20 19:29:38 2014
> ------------------------------------------------------------------------
>
> L2 ARC Summary: (DEGRADED)
>         Passed Headroom:                        72.42m
>         Tried Lock Failures:                    360.55m
>         IO In Progress:                         65
>         Low Memory Aborts:                      101
>         Free on Write:                          7.09k
>         Writes While Full:                      16.90k
>         R/W Clashes:                            11
>         Bad Checksums:                          19.48k
>         IO Errors:                              1.50k
>         SPA Mismatch:                           1.18m
>
> L2 ARC Size: (Adaptive)                         555.88 GiB
>         Header Size:                    0.21%   1.17 GiB
>
> L2 ARC Evicts:
>         Lock Retries:                           10
>         Upon Reading:                           0
>
> L2 ARC Breakdown:                               272.76m
>         Hit Ratio:                      0.35%   949.30k
>         Miss Ratio:                     99.65%  271.81m
>         Feeds:                                  2.19m
>
> L2 ARC Buffer:
>         Bytes Scanned:                          20.45 PiB
>         Buffer Iterations:                      2.19m
>         List Iterations:                        139.97m
>         NULL List Iterations:                   5.58k
>
> L2 ARC Writes:
>         Writes Sent:                    100.00% 567.37k
>
> Any help is welcome.
Best regards > -- Andriy Gapon From owner-freebsd-fs@FreeBSD.ORG Mon Oct 20 22:17:57 2014 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [8.8.178.115]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by hub.freebsd.org (Postfix) with ESMTPS id D7531E3C for ; Mon, 20 Oct 2014 22:17:57 +0000 (UTC) Received: from mail-qc0-f175.google.com (mail-qc0-f175.google.com [209.85.216.175]) (using TLSv1 with cipher ECDHE-RSA-RC4-SHA (128/128 bits)) (Client CN "smtp.gmail.com", Issuer "Google Internet Authority G2" (verified OK)) by mx1.freebsd.org (Postfix) with ESMTPS id 9270B276 for ; Mon, 20 Oct 2014 22:17:56 +0000 (UTC) Received: by mail-qc0-f175.google.com with SMTP id b13so4918201qcw.20 for ; Mon, 20 Oct 2014 15:17:49 -0700 (PDT) X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20130820; h=x-gm-message-state:mime-version:in-reply-to:references:date :message-id:subject:from:to:cc:content-type; bh=Ozfr3MiOU6334wLh9NWoh+ZtDUgU+zJgLVrXM3M8nXE=; b=GdbdgM7ggwbSVAT7H51T1p1s36rtyeWst/0Psyu9g6Ij/wtLOHKEd3KjesRMqpSu9K dIqdCzsxUfUmCpZqWk8/+DYrTf8KlDHbN4og+9CkLIifpOimf9GCFdBSx70OJZ463Zte bWHnqIhT2UxBaFLZ7gZ8JjXvMdvtpP2M6lqzmYhQ5KxiUtom+2P9tqkQa3diYd3NW6p4 Q6BXAF9jFrWB9NfA8brzLeAgpL6TPPiBSyYfMTwSGmpPIgXNPBsZoIe7jfK3lnMZAixd OXI+no9MgG1ti8w2my8JAhsoUMCyOG+xqRFBHO6pqD9yyzE/LzXmC6HClNmk1qa7iWWy hTLA== X-Gm-Message-State: ALoCoQmJSH/j+kKmwXLDJDHk1NKb4kxT6ns5CHYOnoD+ufXV6DiXPFZOKzmkeH+tR4J39nwlryIo MIME-Version: 1.0 X-Received: by 10.224.46.66 with SMTP id i2mr39437865qaf.72.1413843469386; Mon, 20 Oct 2014 15:17:49 -0700 (PDT) Received: by 10.229.133.205 with HTTP; Mon, 20 Oct 2014 15:17:49 -0700 (PDT) X-Originating-IP: [2601:9:100:79e:65a7:64bc:95e1:a6f3] In-Reply-To: <287933729.7365215.1413819378546.JavaMail.zimbra@redhat.com> References: <0F20AEEC-6244-42BC-815C-1440BBBDE664@mail.turbofuzz.com> <0ABAE2AC-BF1B-4125-ACA9-C6177D013E25@mail.turbofuzz.com> 
<20140706230910.GA8523@ivaldir.etoilebsd.net> <2F416D06-0A98-4E66-902C-ED0690A4B1C0@ixsystems.com> <20140825211459.GB65120@ivaldir.etoilebsd.net> <287933729.7365215.1413819378546.JavaMail.zimbra@redhat.com>
Date: Mon, 20 Oct 2014 15:17:49 -0700
Message-ID:
Subject: Re: FreeBSD support being added to GlusterFS
From: Harshavardhana
To: Justin Clift
Content-Type: text/plain; charset=UTF-8
Cc: freebsd-fs@freebsd.org, Jordan Hubbard
X-BeenThere: freebsd-fs@freebsd.org
X-Mailman-Version: 2.1.18-1
Precedence: list
List-Id: Filesystems
X-List-Received-Date: Mon, 20 Oct 2014 22:17:57 -0000

On Mon, Oct 20, 2014 at 8:36 AM, Justin Clift wrote:
> Does anyone have time to review:
>
> https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=192701
>
> It's a FUSE patch that's been hanging around a few months,
> written by Harsha.
>
> Harsha, is GlusterFS dependent on this?
>

Not dependent, but it's an important part that needs to be exposed for
certain tests to pass under regression testing.
-- Religious confuse piety with mere ritual, the virtuous confuse regulation with outcomes From owner-freebsd-fs@FreeBSD.ORG Tue Oct 21 06:03:52 2014 Return-Path: Delivered-To: freebsd-fs@FreeBSD.org Received: from mx1.freebsd.org (mx1.freebsd.org [8.8.178.115]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by hub.freebsd.org (Postfix) with ESMTPS id 88ECCF30 for ; Tue, 21 Oct 2014 06:03:52 +0000 (UTC) Received: from smtprelay05.ispgateway.de (smtprelay05.ispgateway.de [80.67.31.100]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (Client did not present a certificate) by mx1.freebsd.org (Postfix) with ESMTPS id 4A2766CB for ; Tue, 21 Oct 2014 06:03:52 +0000 (UTC) Received: from [89.182.173.25] (helo=localhost) by smtprelay05.ispgateway.de with esmtpsa (TLSv1.2:DHE-RSA-AES256-GCM-SHA384:256) (Exim 4.84) (envelope-from ) id 1XgSXu-0005w1-2F for freebsd-fs@FreeBSD.org; Tue, 21 Oct 2014 08:03:50 +0200 Date: Tue, 21 Oct 2014 08:03:49 +0200 From: Marcus von Appen To: freebsd-fs@FreeBSD.org Subject: Convert your bugs from Needs MFC Message-ID: <20141021060349.GO1065@medusa.sysfault.org> Reply-To: Marcus von Appen MIME-Version: 1.0 Content-Type: multipart/signed; micalg=pgp-sha1; protocol="application/pgp-signature"; boundary="wi8hZpkE3Wxgr1M3" Content-Disposition: inline User-Agent: Mutt/1.5.23 (2014-03-12) X-Df-Sender: MTEyNTc0Mg== X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.18-1 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Tue, 21 Oct 2014 06:03:52 -0000 --wi8hZpkE3Wxgr1M3 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline Dear freebsd-fs@FreeBSD.org, the "Needs MFC" status is subject to be removed in a few weeks and will be replaced by the newly added flags "mfc-stable8", "mfc-stable9" and "mfc-stable10". 
We would like you to convert the bugs with the status "Needs MFC", that
are currently assigned to you, to those new flags and to set the bug to
"In Discussion".

Please set only those flags to "?", for which a MFC is still required.
If a MFC already took place for a stable branch, set the flag to "+".
If a MFC will not be done for a stable branch, set the flag to "-".

All bugs, which are not converted within the next 14 days (until the
5th of November), will be converted by bugmeister, requesting a MFC for
all branches, unknowingly, if that may be correct or not.

Thanks for your help
Marcus on behalf of bugmeister

--wi8hZpkE3Wxgr1M3
Content-Type: application/pgp-signature

-----BEGIN PGP SIGNATURE-----
Version: GnuPG v2

iEYEARECAAYFAlRF90UACgkQi68/ErJnpkcrkACgsOEnIb8CJU/5oIHp7tg8muJR
2tAAoJtIqi51oJXUwA5L7cFTIHW7bErR
=egAA
-----END PGP SIGNATURE-----

--wi8hZpkE3Wxgr1M3--

From owner-freebsd-fs@FreeBSD.ORG Tue Oct 21 11:50:40 2014
Return-Path:
Delivered-To: freebsd-fs@FreeBSD.org
Received: from mx1.freebsd.org (mx1.freebsd.org [8.8.178.115]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by hub.freebsd.org (Postfix) with ESMTPS id 5371FB8D; Tue, 21 Oct 2014 11:50:40 +0000 (UTC)
Received: from dbh.germany.ru (dbh.germany.ru [188.40.73.20]) (using TLSv1 with cipher DHE-RSA-AES256-SHA (256/256 bits)) (Client did not present a certificate) by mx1.freebsd.org (Postfix) with ESMTPS id D0559F8D; Tue, 21 Oct 2014 11:50:39 +0000 (UTC)
Received: from Alex-Varshs-iMac.local (client090-177-185-93.cnt.tvoe.tv [93.185.177.90] (may be forged)) (authenticated bits=0) by dbh.germany.ru (8.14.3/8.14.3) with ESMTP id s9LBoFtJ022535 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES128-SHA bits=128 verify=NO); Tue, 21 Oct 2014 13:50:26 +0200 (CEST)
Message-ID: <544648B4.9030106@putnichek.ru>
Date: Tue, 21 Oct 2014 15:51:16 +0400
From: Alex
User-Agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10.6; rv:24.0) Gecko/20100101 Thunderbird/24.6.0
MIME-Version:
1.0
To: Andriy Gapon , freebsd-fs@FreeBSD.org
Subject: Re: l2_io_error and l2_cksum_bad are not null and growing
References: <54454739.1070900@putnichek.ru> <544553CA.1060406@FreeBSD.org>
In-Reply-To: <544553CA.1060406@FreeBSD.org>
Content-Type: text/plain; charset=UTF-8; format=flowed
Content-Transfer-Encoding: 8bit
X-BeenThere: freebsd-fs@freebsd.org
X-Mailman-Version: 2.1.18-1
Precedence: list
List-Id: Filesystems
X-List-Received-Date: Tue, 21 Oct 2014 11:50:40 -0000

Hello.

As far as I can see, this patch is pretty old. How can it be that it
hasn't been included in the stable branch since then? The problem is
that I can't afford to experiment on a production server. Is there an
official solution to the issue?

Alex

10/20/14 10:26 PM, Andriy Gapon wrote:
> On 20/10/2014 20:32, Alex wrote:
>> Hello.
>>
>> We seem to have a problem with l2arc on zfs system on 10.1-BETA1 FreeBSD
>> 10.1-BETA1 #1 r271710 server. The server has a l2arc cache configured from Intel
>> SSD 480G disk. The problem is that l2_io_error and l2_cksum_bad values are
>> constantly growing, however there are no traces of any hardware malfunctioning.
>> As for now, the values are
>> kstat.zfs.misc.arcstats.l2_io_error: 1501
>> kstat.zfs.misc.arcstats.l2_cksum_bad: 19480
> Please see if the following patch might help
> https://github.com/avg-I/freebsd/compare/review/l2arc-write-target-size.diff
>
>> Here is the output of zpool status:
>>   pool: zpool
>>  state: ONLINE
>>   scan: none requested
>> config:
>>
>>         NAME                               STATE     READ WRITE CKSUM
>>         zpool                              ONLINE       0     0     0
>>           mirror-0                         ONLINE       0     0     0
>>             diskid/DISK-WD-WMC1P0DFSF47p2  ONLINE       0     0     0
>>             diskid/DISK-WD-WMC1P0DEFERYp2  ONLINE       0     0     0
>>         logs
>>           gpt/zil0                         ONLINE       0     0     0
>>         cache
>>           gpt/cache0                       ONLINE       0     0     0
>>
>> errors: No known data errors
>>
>> Here is the output of zfs-stats -L:
>> ------------------------------------------------------------------------
>> ZFS Subsystem Report                            Mon Oct 20 19:29:38 2014
>> ------------------------------------------------------------------------
>>
>> L2 ARC Summary: (DEGRADED)
>>         Passed Headroom:                        72.42m
>>         Tried Lock Failures:                    360.55m
>>         IO In Progress:                         65
>>         Low Memory Aborts:                      101
>>         Free on Write:                          7.09k
>>         Writes While Full:                      16.90k
>>         R/W Clashes:                            11
>>         Bad Checksums:                          19.48k
>>         IO Errors:                              1.50k
>>         SPA Mismatch:                           1.18m
>>
>> L2 ARC Size: (Adaptive)                         555.88 GiB
>>         Header Size:                    0.21%   1.17 GiB
>>
>> L2 ARC Evicts:
>>         Lock Retries:                           10
>>         Upon Reading:                           0
>>
>> L2 ARC Breakdown:                               272.76m
>>         Hit Ratio:                      0.35%   949.30k
>>         Miss Ratio:                     99.65%  271.81m
>>         Feeds:                                  2.19m
>>
>> L2 ARC Buffer:
>>         Bytes Scanned:                          20.45 PiB
>>         Buffer Iterations:                      2.19m
>>         List Iterations:                        139.97m
>>         NULL List Iterations:                   5.58k
>>
>> L2 ARC Writes:
>>         Writes Sent:                    100.00% 567.37k
>>
>> Any help is welcome.
Best regards >> > From owner-freebsd-fs@FreeBSD.ORG Tue Oct 21 15:17:15 2014 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [8.8.178.115]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by hub.freebsd.org (Postfix) with ESMTPS id 779AEDB8 for ; Tue, 21 Oct 2014 15:17:15 +0000 (UTC) Received: from vps1.elischer.org (vps1.elischer.org [204.109.63.16]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (Client CN "vps1.elischer.org", Issuer "CA Cert Signing Authority" (not verified)) by mx1.freebsd.org (Postfix) with ESMTPS id 451FAB84 for ; Tue, 21 Oct 2014 15:17:14 +0000 (UTC) Received: from Julian-MBP3.local (50-196-156-133-static.hfc.comcastbusiness.net [50.196.156.133]) (authenticated bits=0) by vps1.elischer.org (8.14.9/8.14.9) with ESMTP id s9LFH0Nu061494 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES128-SHA bits=128 verify=NO); Tue, 21 Oct 2014 08:17:03 -0700 (PDT) (envelope-from julian@freebsd.org) Message-ID: <544678E6.3040400@freebsd.org> Date: Tue, 21 Oct 2014 23:16:54 +0800 From: Julian Elischer User-Agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10.9; rv:24.0) Gecko/20100101 Thunderbird/24.6.0 MIME-Version: 1.0 To: Harshavardhana , Justin Clift Subject: Re: FreeBSD support being added to GlusterFS References: <0F20AEEC-6244-42BC-815C-1440BBBDE664@mail.turbofuzz.com> <0ABAE2AC-BF1B-4125-ACA9-C6177D013E25@mail.turbofuzz.com> <20140706230910.GA8523@ivaldir.etoilebsd.net> <2F416D06-0A98-4E66-902C-ED0690A4B1C0@ixsystems.com> <20140825211459.GB65120@ivaldir.etoilebsd.net> <287933729.7365215.1413819378546.JavaMail.zimbra@redhat.com> In-Reply-To: Content-Type: text/plain; charset=ISO-8859-1; format=flowed Content-Transfer-Encoding: 7bit Cc: freebsd-fs@freebsd.org, Jordan Hubbard X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.18-1 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , 
X-List-Received-Date: Tue, 21 Oct 2014 15:17:15 -0000 On 10/21/14, 6:17 AM, Harshavardhana wrote: > On Mon, Oct 20, 2014 at 8:36 AM, Justin Clift wrote: >> Does anyone have time to review: >> >> https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=192701 >> >> It's a FUSE patch that's been hanging around a few months, >> written by Harsha. >> >> Harsha, is GlusterFS dependant on this? >> > Not dependent, but its an important part that needs to be exposed for > certain tests to pass under > regression testing. > I've been hitting direct-io issues in fuse.. if I can get a minute sideways from work I will try to look. From owner-freebsd-fs@FreeBSD.ORG Tue Oct 21 16:26:49 2014 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by hub.freebsd.org (Postfix) with ESMTPS id 51AECE03; Tue, 21 Oct 2014 16:26:49 +0000 (UTC) Received: from mx1.redhat.com (mx1.redhat.com [209.132.183.28]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (Client CN "mx1.redhat.com", Issuer "DigiCert SHA2 Extended Validation Server CA" (verified OK)) by mx1.freebsd.org (Postfix) with ESMTPS id 0CDFC351; Tue, 21 Oct 2014 16:26:46 +0000 (UTC) Received: from int-mx11.intmail.prod.int.phx2.redhat.com (int-mx11.intmail.prod.int.phx2.redhat.com [10.5.11.24]) by mx1.redhat.com (8.14.4/8.14.4) with ESMTP id s9LFXnLF008242 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-GCM-SHA384 bits=256 verify=FAIL); Tue, 21 Oct 2014 11:33:49 -0400 Received: from f19laptop.uk.gluster.org (vpn1-50-189.bne.redhat.com [10.64.50.189]) by int-mx11.intmail.prod.int.phx2.redhat.com (8.14.4/8.14.4) with SMTP id s9LFXhlZ010489; Tue, 21 Oct 2014 11:33:45 -0400 Date: Tue, 21 Oct 2014 17:33:41 +0100 From: Justin Clift To: Julian Elischer Subject: Re: FreeBSD support being added to GlusterFS Message-Id: 
<20141021173341.dbcf2050dc7179ba47c9ca8c@gluster.org> In-Reply-To: <544678E6.3040400@freebsd.org> References: <0F20AEEC-6244-42BC-815C-1440BBBDE664@mail.turbofuzz.com> <0ABAE2AC-BF1B-4125-ACA9-C6177D013E25@mail.turbofuzz.com> <20140706230910.GA8523@ivaldir.etoilebsd.net> <2F416D06-0A98-4E66-902C-ED0690A4B1C0@ixsystems.com> <20140825211459.GB65120@ivaldir.etoilebsd.net> <287933729.7365215.1413819378546.JavaMail.zimbra@redhat.com> <544678E6.3040400@freebsd.org> Mime-Version: 1.0 Content-Type: text/plain; charset=US-ASCII Content-Transfer-Encoding: 7bit X-Scanned-By: MIMEDefang 2.68 on 10.5.11.24 Cc: freebsd-fs@freebsd.org, Jordan Hubbard X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.18-1 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Tue, 21 Oct 2014 16:26:49 -0000 On Tue, 21 Oct 2014 23:16:54 +0800 Julian Elischer wrote: > On 10/21/14, 6:17 AM, Harshavardhana wrote: > > On Mon, Oct 20, 2014 at 8:36 AM, Justin Clift wrote: > >> Does anyone have time to review: > >> > >> https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=192701 > >> > >> It's a FUSE patch that's been hanging around a few months, > >> written by Harsha. > >> > >> Harsha, is GlusterFS dependant on this? > >> > > Not dependent, but its an important part that needs to be exposed for > > certain tests to pass under > > regression testing. > > > I've been hitting direct-io issues in fuse.. if I can get a minute > sideways from work I will try to look. Thanks, that'd be cool. :) + Justin -- GlusterFS - http://www.gluster.org An open source, distributed file system scaling to several petabytes, and handling thousands of clients. 
My personal twitter: twitter.com/realjustinclift From owner-freebsd-fs@FreeBSD.ORG Tue Oct 21 22:03:55 2014 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [8.8.178.115]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by hub.freebsd.org (Postfix) with ESMTPS id 88E032B9; Tue, 21 Oct 2014 22:03:55 +0000 (UTC) Received: from mail-ig0-x22d.google.com (mail-ig0-x22d.google.com [IPv6:2607:f8b0:4001:c05::22d]) (using TLSv1 with cipher ECDHE-RSA-RC4-SHA (128/128 bits)) (Client CN "smtp.gmail.com", Issuer "Google Internet Authority G2" (verified OK)) by mx1.freebsd.org (Postfix) with ESMTPS id 4AFF066B; Tue, 21 Oct 2014 22:03:55 +0000 (UTC) Received: by mail-ig0-f173.google.com with SMTP id h18so2301511igc.0 for ; Tue, 21 Oct 2014 15:03:54 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20120113; h=mime-version:reply-to:in-reply-to:references:date:message-id :subject:from:to:cc:content-type; bh=zvuSdhbzaqGZA1JaJvX4FxznbHJ2o8hDP8/MkrxrTiM=; b=bR9yg5MpIx7fXAP1SXXyWfrduNst2GJY9lV/zujyHr62xJLeYENwsjMAIE0FHRZQku i1fUqejYECbrjVLWmtLSPcM6s0sQHV0xaDEY62RW2Qo/tWiSGtQ5ig4bkjS4QhSWV336 GHoJce/HpNJXWt38YHwt8EINPG8nfaK5e5LhPNRqanv+4dFQ0uvwx2BOoRpamrw1QmvG ZZNxQ08soC+3FpUPeMpizQqbqxvlHs7PO6ZlOzUPuhvAi+rg//ZIB6Y33G0gZSSh3AE7 J0iOGERd6IwYw7S2PHhnNhSG4SU5l/w+TPjSSa3r8S3toqg93j1Fh0lM1m89CVwl74VH +4ZA== MIME-Version: 1.0 X-Received: by 10.42.91.75 with SMTP id o11mr786241icm.89.1413929034665; Tue, 21 Oct 2014 15:03:54 -0700 (PDT) Received: by 10.42.223.69 with HTTP; Tue, 21 Oct 2014 15:03:54 -0700 (PDT) Reply-To: alc@freebsd.org In-Reply-To: <5443D918.9090307@jrv.org> References: <54250AE9.6070609@jrv.org> <543FAB3C.4090503@jrv.org> <543FEE6F.5050007@delphij.net> <54409050.4070401@jrv.org> <544096B3.20306@delphij.net> <54409CFE.8070905@jrv.org> <5443D918.9090307@jrv.org> Date: Tue, 21 Oct 2014 17:03:54 -0500 Message-ID: Subject: Re: zfs recv hangs in kmem arena 
From: Alan Cox To: "James R. Van Artsdalen" Content-Type: text/plain; charset=UTF-8 X-Content-Filtered-By: Mailman/MimeDel 2.1.18-1 Cc: freebsd-fs@freebsd.org, d@delphij.net, "current@freebsd.org" X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.18-1 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Tue, 21 Oct 2014 22:03:55 -0000 On Sun, Oct 19, 2014 at 10:30 AM, James R. Van Artsdalen < james-freebsd-fs2@jrv.org> wrote: > Removing kern.maxfiles from loader.conf still hangs in "kmem arena". > > I tried using a memstick image of -CURRENT made from the release/ > process and this also hangs in "kmem arena" > > An uninvolved server of mine hung Friday night in state"kmem arena" > during periodic's "zpool history". After a reboot it did not hang > Saturday night. > > How up to date is your source tree? r2720221 is relevant. Without that change, there are circumstances in which the code that is supposed to free space from the kmem arena doesn't get called. > On 10/16/2014 11:37 PM, James R. Van Artsdalen wrote: > > On 10/16/2014 11:10 PM, Xin Li wrote: > >> On 10/16/14 8:43 PM, James R. Van Artsdalen wrote: > >>> On 10/16/2014 11:12 AM, Xin Li wrote: > >>>>> On 9/26/2014 1:42 AM, James R. Van Artsdalen wrote: > >>>>>> FreeBSD BLACKIE.housenet.jrv 10.1-BETA2 FreeBSD 10.1-BETA2 > >>>>>> #2 r272070M: Wed Sep 24 17:36:56 CDT 2014 > >>>>>> james@BLACKIE.housenet.jrv:/usr/obj/usr/src/sys/GENERIC > >>>>>> amd64 > >>>>>> > >>>>>> With current STABLE10 I am unable to replicate a ZFS pool > >>>>>> using zfs send/recv without zfs hanging in state "kmem > >>>>>> arena", within the first 4TB or so (of a 23TB Pool). > >>>> What does procstat -kk 1176 (or the PID of your 'zfs' process > >>>> that stuck in that state) say? 
> >>>> > >>>> Cheers, > >>>> > >>> SUPERTEX:/root# ps -lp 866 UID PID PPID CPU PRI NI VSZ RSS > >>> MWCHAN STAT TT TIME COMMAND 0 866 863 0 52 0 66800 > >>> 29716 kmem are D+ 1 57:40.82 zfs recv -duvF BIGTOX > >>> SUPERTEX:/root# procstat -kk 866 PID TID COMM TDNAME > >>> KSTACK 866 101573 zfs - mi_switch+0xe1 > >>> sleepq_wait+0x3a _cv_wait+0x16d vmem_xalloc+0x568 vmem_alloc+0x3d > >>> kmem_malloc+0x33 keg_alloc_slab+0xcd keg_fetch_slab+0x151 > >>> zone_fetch_slab+0x7e zone_import+0x40 uma_zalloc_arg+0x34e > >>> arc_get_data_buf+0x31a arc_buf_alloc+0xaa dmu_buf_will_fill+0x169 > >>> dmu_write+0xfc dmu_recv_stream+0xd40 zfs_ioc_recv+0x94e > >>> zfsdev_ioctl+0x5ca > >> Do you have any special tuning in your /boot/loader.conf? > >> > >> Cheers, > >> > > Below. I had forgotten some of this was there. > > > > After sending the previous message I ran kgdb to see if I could get a > > backtrace with function args. I didn't see how to do it for this proc, > > but during all this the process un-blocked and started running again. > > > > The process blocked again in kmem arena after a few minutes. > > > > > > SUPERTEX:/root# cat /boot/loader.conf > > zfs_load="YES" # ZFS > > vfs.root.mountfrom="zfs:SUPERTEX/UNIX" # Specify root partition > > in a way the > > # kernel understands > > kern.maxfiles="32K" # Set the sys. 
wide open files limit > > kern.ktrace.request_pool="512" > > #vfs.zfs.debug=1 > > vfs.zfs.check_hostid=0 > > > > loader_logo="beastie" # Desired logo: fbsdbw, beastiebw, beastie, > > none > > boot_verbose="YES" # -v: Causes extra debugging information to be > > printed > > geom_mirror_load="YES" # RAID1 disk driver (see gmirror(8)) > > geom_label_load="YES" # File system labels (see glabel(8)) > > ahci_load="YES" > > siis_load="YES" > > mvs_load="YES" > > coretemp_load="YES" # Intel Core CPU temperature monitor > > #console="comconsole" > > kern.msgbufsize="131072" # Set size of kernel message buffer > > > > kern.geom.label.gpt.enable=0 > > kern.geom.label.gptid.enable=0 > > kern.geom.label.disk_ident.enable=0 > > SUPERTEX:/root# > > > > _______________________________________________ > freebsd-current@freebsd.org mailing list > http://lists.freebsd.org/mailman/listinfo/freebsd-current > To unsubscribe, send any mail to "freebsd-current-unsubscribe@freebsd.org" > From owner-freebsd-fs@FreeBSD.ORG Tue Oct 21 22:05:33 2014 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [8.8.178.115]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by hub.freebsd.org (Postfix) with ESMTPS id 5D176454; Tue, 21 Oct 2014 22:05:33 +0000 (UTC) Received: from mail-ig0-x22a.google.com (mail-ig0-x22a.google.com [IPv6:2607:f8b0:4001:c05::22a]) (using TLSv1 with cipher ECDHE-RSA-RC4-SHA (128/128 bits)) (Client CN "smtp.gmail.com", Issuer "Google Internet Authority G2" (verified OK)) by mx1.freebsd.org (Postfix) with ESMTPS id 1E40B696; Tue, 21 Oct 2014 22:05:33 +0000 (UTC) Received: by mail-ig0-f170.google.com with SMTP id hn18so352950igb.5 for ; Tue, 21 Oct 2014 15:05:32 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20120113; h=mime-version:reply-to:in-reply-to:references:date:message-id :subject:from:to:cc:content-type; 
bh=HJdcIZnAuMAErzbYQdm/LSR4bxYIZDbj3joerIZoUsg=; b=O6v0OEiHc3g5dnvy9vFyiPSJXkb/emj3AO7ZDVgFU8LyZAbmD38w6lopvmjYOHEhbB d0AVoSeU9JFdZ7eA9rqEtbX3Anh2gzcKFRFyPSqDLBkiD5LXPqB8hcvCFmnnUnBurY/c 0Ld7D2c2ok4ZEfyWlQkp1CMuGiwoSIF2Wf2TAbo9F+HrTDX6xnzHGTXB3yi3aiGCyd0a zYBmdSoXDIeZThtQFWZ3ticwWYK6Rx4OZnq+8d1x+JZBKsJxZT2OZBRoSOIdt1cAIOLy JOipnJZLwAN2RpkWmWP6C9sQneEzd13AXf5CuCwivI+NwsSk5jufIWzDv8YT9xfG1iD8 Etrg== MIME-Version: 1.0 X-Received: by 10.107.165.76 with SMTP id o73mr40511143ioe.1.1413929132428; Tue, 21 Oct 2014 15:05:32 -0700 (PDT) Received: by 10.42.223.69 with HTTP; Tue, 21 Oct 2014 15:05:32 -0700 (PDT) Reply-To: alc@freebsd.org In-Reply-To: References: <54250AE9.6070609@jrv.org> <543FAB3C.4090503@jrv.org> <543FEE6F.5050007@delphij.net> <54409050.4070401@jrv.org> <544096B3.20306@delphij.net> <54409CFE.8070905@jrv.org> <5443D918.9090307@jrv.org> Date: Tue, 21 Oct 2014 17:05:32 -0500 Message-ID: Subject: Re: zfs recv hangs in kmem arena From: Alan Cox To: "James R. Van Artsdalen" Content-Type: text/plain; charset=UTF-8 X-Content-Filtered-By: Mailman/MimeDel 2.1.18-1 Cc: freebsd-fs@freebsd.org, d@delphij.net, "current@freebsd.org" X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.18-1 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Tue, 21 Oct 2014 22:05:33 -0000 On Tue, Oct 21, 2014 at 5:03 PM, Alan Cox wrote: > On Sun, Oct 19, 2014 at 10:30 AM, James R. Van Artsdalen < > james-freebsd-fs2@jrv.org> wrote: > >> Removing kern.maxfiles from loader.conf still hangs in "kmem arena". >> >> I tried using a memstick image of -CURRENT made from the release/ >> process and this also hangs in "kmem arena" >> >> An uninvolved server of mine hung Friday night in state"kmem arena" >> during periodic's "zpool history". After a reboot it did not hang >> Saturday night. >> >> > > > How up to date is your source tree? r2720221 is relevant. 
Without that > change, there are circumstances in which the code that is supposed to free > space from the kmem arena doesn't get called. > > That should be r272071. > > > >> On 10/16/2014 11:37 PM, James R. Van Artsdalen wrote: >> > On 10/16/2014 11:10 PM, Xin Li wrote: >> >> On 10/16/14 8:43 PM, James R. Van Artsdalen wrote: >> >>> On 10/16/2014 11:12 AM, Xin Li wrote: >> >>>>> On 9/26/2014 1:42 AM, James R. Van Artsdalen wrote: >> >>>>>> FreeBSD BLACKIE.housenet.jrv 10.1-BETA2 FreeBSD 10.1-BETA2 >> >>>>>> #2 r272070M: Wed Sep 24 17:36:56 CDT 2014 >> >>>>>> james@BLACKIE.housenet.jrv:/usr/obj/usr/src/sys/GENERIC >> >>>>>> amd64 >> >>>>>> >> >>>>>> With current STABLE10 I am unable to replicate a ZFS pool >> >>>>>> using zfs send/recv without zfs hanging in state "kmem >> >>>>>> arena", within the first 4TB or so (of a 23TB Pool). >> >>>> What does procstat -kk 1176 (or the PID of your 'zfs' process >> >>>> that stuck in that state) say? >> >>>> >> >>>> Cheers, >> >>>> >> >>> SUPERTEX:/root# ps -lp 866 UID PID PPID CPU PRI NI VSZ RSS >> >>> MWCHAN STAT TT TIME COMMAND 0 866 863 0 52 0 66800 >> >>> 29716 kmem are D+ 1 57:40.82 zfs recv -duvF BIGTOX >> >>> SUPERTEX:/root# procstat -kk 866 PID TID COMM TDNAME >> >>> KSTACK 866 101573 zfs - mi_switch+0xe1 >> >>> sleepq_wait+0x3a _cv_wait+0x16d vmem_xalloc+0x568 vmem_alloc+0x3d >> >>> kmem_malloc+0x33 keg_alloc_slab+0xcd keg_fetch_slab+0x151 >> >>> zone_fetch_slab+0x7e zone_import+0x40 uma_zalloc_arg+0x34e >> >>> arc_get_data_buf+0x31a arc_buf_alloc+0xaa dmu_buf_will_fill+0x169 >> >>> dmu_write+0xfc dmu_recv_stream+0xd40 zfs_ioc_recv+0x94e >> >>> zfsdev_ioctl+0x5ca >> >> Do you have any special tuning in your /boot/loader.conf? >> >> >> >> Cheers, >> >> >> > Below. I had forgotten some of this was there. >> > >> > After sending the previous message I ran kgdb to see if I could get a >> > backtrace with function args. 
I didn't see how to do it for this proc, >> > but during all this the process un-blocked and started running again. >> > >> > The process blocked again in kmem arena after a few minutes. >> > >> > >> > SUPERTEX:/root# cat /boot/loader.conf >> > zfs_load="YES" # ZFS >> > vfs.root.mountfrom="zfs:SUPERTEX/UNIX" # Specify root partition >> > in a way the >> > # kernel understands >> > kern.maxfiles="32K" # Set the sys. wide open files limit >> > kern.ktrace.request_pool="512" >> > #vfs.zfs.debug=1 >> > vfs.zfs.check_hostid=0 >> > >> > loader_logo="beastie" # Desired logo: fbsdbw, beastiebw, beastie, >> > none >> > boot_verbose="YES" # -v: Causes extra debugging information to be >> > printed >> > geom_mirror_load="YES" # RAID1 disk driver (see gmirror(8)) >> > geom_label_load="YES" # File system labels (see glabel(8)) >> > ahci_load="YES" >> > siis_load="YES" >> > mvs_load="YES" >> > coretemp_load="YES" # Intel Core CPU temperature monitor >> > #console="comconsole" >> > kern.msgbufsize="131072" # Set size of kernel message buffer >> > >> > kern.geom.label.gpt.enable=0 >> > kern.geom.label.gptid.enable=0 >> > kern.geom.label.disk_ident.enable=0 >> > SUPERTEX:/root# >> > >> >> _______________________________________________ >> freebsd-current@freebsd.org mailing list >> http://lists.freebsd.org/mailman/listinfo/freebsd-current >> To unsubscribe, send any mail to "freebsd-current-unsubscribe@freebsd.org >> " >> > > From owner-freebsd-fs@FreeBSD.ORG Tue Oct 21 23:19:57 2014 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [8.8.178.115]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by hub.freebsd.org (Postfix) with ESMTPS id D144C576; Tue, 21 Oct 2014 23:19:57 +0000 (UTC) Received: from mail.jrv.org (adsl-70-243-84-11.dsl.austtx.swbell.net [70.243.84.11]) by mx1.freebsd.org (Postfix) with ESMTP id 9DE36D9B; Tue, 21 Oct 2014 23:19:56 +0000 (UTC) Received: from localhost 
(localhost.localdomain [127.0.0.1]) by mail.jrv.org (Postfix) with ESMTP id DF9F61AD525; Tue, 21 Oct 2014 18:19:49 -0500 (CDT) Received: from mail.jrv.org ([127.0.0.1]) by localhost (zimbra64.housenet.jrv [127.0.0.1]) (amavisd-new, port 10032) with ESMTP id ywn1zvHtDekR; Tue, 21 Oct 2014 18:19:40 -0500 (CDT) Received: from localhost (localhost.localdomain [127.0.0.1]) by mail.jrv.org (Postfix) with ESMTP id EF56B1AD520; Tue, 21 Oct 2014 18:19:39 -0500 (CDT) X-Virus-Scanned: amavisd-new at zimbra64.housenet.jrv Received: from mail.jrv.org ([127.0.0.1]) by localhost (zimbra64.housenet.jrv [127.0.0.1]) (amavisd-new, port 10026) with ESMTP id 2Dt9zWXbzoHl; Tue, 21 Oct 2014 18:19:39 -0500 (CDT) Received: from [192.168.138.128] (BMX.housenet.jrv [192.168.3.140]) by mail.jrv.org (Postfix) with ESMTPSA id CBE581AD51D; Tue, 21 Oct 2014 18:19:39 -0500 (CDT) Message-ID: <5446EA1C.8070305@jrv.org> Date: Tue, 21 Oct 2014 18:19:56 -0500 From: "James R. Van Artsdalen" User-Agent: Mozilla/5.0 (Windows NT 5.0; rv:12.0) Gecko/20120428 Thunderbird/12.0.1 MIME-Version: 1.0 To: alc@freebsd.org Subject: Re: zfs recv hangs in kmem arena References: <54250AE9.6070609@jrv.org> <543FAB3C.4090503@jrv.org> <543FEE6F.5050007@delphij.net> <54409050.4070401@jrv.org> <544096B3.20306@delphij.net> <54409CFE.8070905@jrv.org> <5443D918.9090307@jrv.org> In-Reply-To: Content-Type: text/plain; charset=ISO-8859-1 Content-Transfer-Encoding: 7bit Cc: freebsd-fs@freebsd.org, Alan Cox , d@delphij.net, "current@freebsd.org" X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.18-1 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Tue, 21 Oct 2014 23:19:57 -0000 On 10/21/2014 5:03 PM, Alan Cox wrote: > How up to date is your source tree? r2720221 is relevant. Without that > change, there are circumstances in which the code that is supposed to > free space from the kmem arena doesn't get called. 
I've tried HEAD/CURRENT at r272749 On 10-STABLE through r273364 - I do a nightly build & test. From owner-freebsd-fs@FreeBSD.ORG Thu Oct 23 02:50:23 2014 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by hub.freebsd.org (Postfix) with ESMTPS id 104ACAF0; Thu, 23 Oct 2014 02:50:23 +0000 (UTC) Received: from hergotha.csail.mit.edu (wollman-1-pt.tunnel.tserv4.nyc4.ipv6.he.net [IPv6:2001:470:1f06:ccb::2]) (using TLSv1 with cipher DHE-RSA-AES256-SHA (256/256 bits)) (Client did not present a certificate) by mx1.freebsd.org (Postfix) with ESMTPS id C277297B; Thu, 23 Oct 2014 02:50:22 +0000 (UTC) Received: from hergotha.csail.mit.edu (localhost [127.0.0.1]) by hergotha.csail.mit.edu (8.14.7/8.14.7) with ESMTP id s9N2oKCL036069; Wed, 22 Oct 2014 22:50:20 -0400 (EDT) (envelope-from wollman@hergotha.csail.mit.edu) Received: (from wollman@localhost) by hergotha.csail.mit.edu (8.14.7/8.14.4/Submit) id s9N2oKUB036066; Wed, 22 Oct 2014 22:50:20 -0400 (EDT) (envelope-from wollman) MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Transfer-Encoding: 7bit Message-ID: <21576.27884.76574.977691@hergotha.csail.mit.edu> Date: Wed, 22 Oct 2014 22:50:20 -0400 From: Garrett Wollman To: freebsd-stable@freebsd.org, freebsd-fs@freebsd.org Subject: Some 9.3 NFS testing X-Mailer: VM 7.17 under 21.4 (patch 22) "Instant Classic" XEmacs Lucid X-Greylist: Sender IP whitelisted, not delayed by milter-greylist-4.4.3 (hergotha.csail.mit.edu [127.0.0.1]); Wed, 22 Oct 2014 22:50:20 -0400 (EDT) X-Spam-Status: No, score=-1.0 required=5.0 tests=ALL_TRUSTED, HEADER_FROM_DIFFERENT_DOMAINS autolearn=disabled version=3.4.0 X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on hergotha.csail.mit.edu X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.18-1 Precedence: list List-Id: Filesystems 
List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 23 Oct 2014 02:50:23 -0000 Just thought I'd share this... I've been doing some acceptance testing on 9.3 prior to upgrading my production NFS servers. My most recent test is running bonnie++ on 192 Ubuntu VMs in parallel, to independent directories in the same server filesystem. It hasn't fallen over yet (will probably take another day or so to complete), and peaked at about 220k ops/s (but this was NFSv4 so there's no FHA and it takes at least two ops for every v3 RPC[1]). bonnie++ is running with -D (O_DIRECT), but I'm actually just using it as a load generator -- I don't care about the output. I have this system configured for a maximum of 64 nfsd threads, and the test load has had it pegged for the past eight hours. Right now all of the load generators are doing the "small file" part of bonnie++, so there's not a lot of activity but there are a lot of synchronous operations; it's been doing 60k ops/s for the past five hours. Load average maxed out at about 24 early on in the test, and has settled around 16-20 for this part of the test. 
Here's what nfsstat -se has to say (note: not reset for this round of
testing):

Server Info:
    Getattr    Setattr     Lookup   Readlink       Read      Write     Create     Remove
 1566655064  230074779  162549702          0  471311053 1466525587  149235773  115496945
     Rename       Link    Symlink      Mkdir      Rmdir    Readdir   RdirPlus     Access
        125          0          0        245        116    2032193   27485368  223929240
      Mknod     Fsstat     Fsinfo   PathConf     Commit    LookupP    SetClId  SetClIdCf
          0         53        268        131   15999631          0        386        386
       Open   OpenAttr  OpenDwnGr   OpenCfrm  DelePurge    DeleRet      GetFH       Lock
   80924092          0          0        194          0          0   81110394          0
      LockT      LockU      Close     Verify    NVerify      PutFH   PutPubFH  PutRootFH
          0          0   80578106          0          0 1203868156          0        193
      Renew  RestoreFH     SaveFH    Secinfo  RelLckOwn   V4Create
       1271          0         14        384          0        570
Server:
  Retfailed     Faults    Clients
          0          0        191
  OpenOwner      Opens  LockOwner      Locks     Delegs
        192        154          0          0          0
Server Cache Stats:
     Inprog       Idem   Non-idem     Misses  CacheSize    TCPPeak
          0          0          0 -167156883       1651     115531

I'd love to mix in some FreeBSD-generated loads but as discussed a
week or so ago, our NFS client can't handle reading directories from
which files are being deleted.

FWIW, I just ran a quick "pmcstat -T" and noted the following:

PMC: [unhalted-core-cycles] Samples: 775371 (100.0%) , 3264 unresolved
Key: q => exiting...

%SAMP IMAGE      FUNCTION             CALLERS
 24.0 kernel     _mtx_lock_sleep      _vm_map_lock:22.4 ...
  4.7 kernel     Xinvlrng
  4.7 kernel     _mtx_lock_spin       pmclog_reserve
  4.2 kernel     _sx_xlock_hard       _sx_xlock
  3.8 pmcstat    _init
  2.5 kernel     bcopy                vdev_queue_io_done
  1.7 kernel     _sx_xlock
  1.6 zfs.ko     lzjb_compress        zio_compress_data
  1.4 zfs.ko     lzjb_decompress      zio_decompress
  1.2 kernel     _sx_xunlock
  1.2 kernel     ipfw_chk             ipfw_check_hook
  1.1 libc.so.7  bsearch
  1.0 zfs.ko     fletcher_4_native    zio_checksum_compute
  1.0 kernel     vm_page_splay        vm_page_find_least
  1.0 kernel     cpu_idle_mwait       sched_idletd
  1.0 kernel     free
  0.9 kernel     bzero
  0.9 kernel     cpu_search_lowest    cpu_search_lowest
  0.8 kernel     vm_map_entry_splay   vm_map_lookup_entry
  0.8 kernel     cpu_search_highest   cpu_search_highest

I doubt that this is news to anybody.
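A miss count cannot really be negative, so the "Misses" figure in the Server Cache Stats above is presumably a counter that has overflowed. The following is a minimal sketch of how such a reading could be reinterpreted, assuming (nothing in this thread confirms it) that the value is a 32-bit counter printed as a signed integer. The helper name `as_unsigned32` is hypothetical and is not part of nfsstat or any FreeBSD tool.

```python
# Assumption: the counter is 32 bits wide and was printed as a signed int.
INT32_MODULUS = 1 << 32


def as_unsigned32(reading: int) -> int:
    """Map a value printed as a signed 32-bit int onto the range 0..2**32-1."""
    return reading % INT32_MODULUS


# "Misses" column copied from the nfsstat -se output quoted above.
misses_printed = -167156883
misses_unsigned = as_unsigned32(misses_printed)

print(misses_unsigned)
```

Even under that assumption the result is only a lower bound: if the counter has wrapped completely more than once, the true count is larger by some multiple of 2**32, and the output alone cannot say how many.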
Once I get the production servers upgraded to 9.3, I'll be ready to start testing 10.1 on this same setup. -GAWollman [1] I did previous testing, with smaller numbers of clients, using v3 as that is what we currently require our clients to use. I switched to v4 to try out the worst case -- after finding an OpenStack bug that was preventing me from starting more than 16 load generators at a time. From owner-freebsd-fs@FreeBSD.ORG Thu Oct 23 13:30:39 2014 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by hub.freebsd.org (Postfix) with ESMTPS id 8D9D69BE for ; Thu, 23 Oct 2014 13:30:39 +0000 (UTC) Received: from mail.slu.se (tmgext2-1.slu.se [77.235.224.51]) (using TLSv1 with cipher AES128-SHA (128/128 bits)) (Client CN "webmail.slu.se", Issuer "TERENA SSL CA" (verified OK)) by mx1.freebsd.org (Postfix) with ESMTPS id EC3FC67A for ; Thu, 23 Oct 2014 13:30:38 +0000 (UTC) Received: from Exchange2-3.slu.se (130.238.96.156) by Tmg2-1.slu.se (130.238.96.151) with Microsoft SMTP Server (TLS) id 14.3.210.2; Thu, 23 Oct 2014 15:29:27 +0200 Received: from Exchange2-1.slu.se ([130.238.96.154]) by exchange2-3 ([130.238.96.156]) with mapi id 14.03.0210.002; Thu, 23 Oct 2014 15:29:26 +0200 From: =?utf-8?B?S2FybGkgU2rDtmJlcmc=?= To: "freebsd-fs@freebsd.org" Subject: How big can ZFS L2ARC grow? Thread-Topic: How big can ZFS L2ARC grow? 
Thread-Index: Ac/uxV0RpYJU8EBtST6ftbX7oE7zkA==
Date: Thu, 23 Oct 2014 13:29:25 +0000
Message-ID: <5F9E965F5A80BC468BE5F40576769F099DF6E88A@exchange2-1>
Accept-Language: sv-SE, en-US
Content-Language: en-US
X-MS-Has-Attach:
X-MS-TNEF-Correlator:
x-originating-ip: [77.235.228.32]
Content-Type: text/plain; charset="utf-8"
Content-Transfer-Encoding: 8bit
MIME-Version: 1.0
X-BeenThere: freebsd-fs@freebsd.org
X-Mailman-Version: 2.1.18-1
Precedence: list
List-Id: Filesystems
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
X-List-Received-Date: Thu, 23 Oct 2014 13:30:39 -0000

Hey!

As the topic states, I'm wondering about the size of L2ARC and if there
is a limit to how big it is able to grow.

The reason I'm asking is that I've always thought that if you add a cache
device to the pool, the maximum size of L2ARC would be the size of the
disk you've added, but recently I've come to know that's not the case.

Here's a 9.3-RELEASE system that has 64 GB RAM and two 256 GB
SSDs added as cache, which I would have thought could only have grown to
about 512 GB:
# sysctl -n kstat.zfs.misc.arcstats.l2_size
7696626233344

Another system running 9.2-RELEASE with 32 GB RAM + 240 GB SSD:
# sysctl -n kstat.zfs.misc.arcstats.l2_size
1400038980608

The servers are running graphing software, so I have seen that the
size numbers can go up and down over time, but they clearly go over the
size of the SSDs that have been added.

We have two more systems configured with cache devices and yet another
two systems configured without.

The problem we have is that the four systems with cache devices (our
primary storage systems) go completely unresponsive after different
periods of time, depending on how much RAM they have and the load they've
been under, I guess. The less RAM, the more frequently they stall, and
I'm starting to wonder if what's common between them is L2ARC, because
the other two systems without cache devices don't have those issues,
although they aren't under the same kind of load either; one is a disaster
recovery system receiving zfs snapshots and the other one is our syslog
server, but still...

What do you think, are the size numbers for L2ARC unusual, and could it
be related to the stalls we've been experiencing? And if the size
numbers really are unusual, is there a way to handle it, like limiting how
large the L2ARC is able to grow somehow?

-- 

Best regards

-------------------------------------------------------------------------------
Karli Sjöberg
Swedish University of Agricultural Sciences Box 7079 (Visiting Address
Kronåsvägen 8)
S-750 07 Uppsala, Sweden
Phone:  +46-(0)18-67 15 66
karli.sjoberg@slu.se

From owner-freebsd-fs@FreeBSD.ORG Thu Oct 23 22:34:18 2014
Return-Path:
Delivered-To: freebsd-fs@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [8.8.178.115]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by hub.freebsd.org (Postfix) with ESMTPS id AF135829 for ; Thu, 23 Oct 2014 22:34:18 +0000 (UTC)
Received: from sender1.zohomail.com (sender1.zohomail.com [74.201.84.155]) (using TLSv1 with cipher RC4-SHA (128/128 bits)) (Client did not present a certificate) by mx1.freebsd.org
(Postfix) with ESMTPS id 7FFCCC20 for ; Thu, 23 Oct 2014 22:34:18 +0000 (UTC)
DomainKey-Signature: a=rsa-sha1; q=dns; c=nofws; s=bsdjunk; d=bsdjunk.com; h=date:from:to:subject:message-id:mime-version:content-type:user-agent; b=Tz98xYyngsNdfyPAbVso4wUlzXAT9zIuaIBDe0oCCbCasvO3nExRtEmADwiHV5bwBEQu0pr2WIZJ 54ALPuZNxNwwZeVRhrSL+zkUrX/wlBpLOQvLVMYfsXKt4iaIANv5djuH2UNfHGjxpm172Q7cHWWa qkUN3Xdn2MhWrVgxcDI=
Received: from bsdjunk.com (bsdjunk.com [199.48.132.237]) by mx.zohomail.com with SMTPS id 1414103654646707.2552681327763; Thu, 23 Oct 2014 15:34:14 -0700 (PDT)
Date: Thu, 23 Oct 2014 22:34:57 +0000
From: Christopher Petrik
To: freebsd-fs@freebsd.org
Subject: Hammer fs
Message-ID: <20141023223457.GA29206@bsdjunk.com>
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
User-Agent: Mutt/1.5.23 (2014-03-12)
X-ZohoMailClient: External
X-Zoho-Virus-Status: 2
X-BeenThere: freebsd-fs@freebsd.org
X-Mailman-Version: 2.1.18-1
Precedence: list
List-Id: Filesystems
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
X-List-Received-Date: Thu, 23 Oct 2014 22:34:18 -0000

One way to improve FreeBSD is to look at the ideas page and act upon it. I've used DragonFly and decided it is time to improve my C skills by porting HAMMER to FreeBSD. This will be a very time-consuming process. However, it would be nice to have this option during install, since it brings in some nice features.
-- Mutt Version: 1.5.23 OS Version: 10.0-RELEASE-p7 Hostname: bsdjunk10.bsdjunk.com
From owner-freebsd-fs@FreeBSD.ORG Thu Oct 23 23:17:48 2014 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by hub.freebsd.org (Postfix) with ESMTPS id 55CC5650 for ; Thu, 23 Oct 2014 23:17:48 +0000 (UTC) Received: from webmail2.jnielsen.net (webmail2.jnielsen.net [50.114.224.20]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (Client CN "webmail2.jnielsen.net", Issuer "freebsdsolutions.net" (not verified)) by mx1.freebsd.org (Postfix) with ESMTPS id 34D90FF5 for ; Thu, 23 Oct 2014 23:17:47 +0000 (UTC) Received: from [10.10.1.196] (office.betterlinux.com [199.58.199.60]) (authenticated bits=0) by webmail2.jnielsen.net (8.14.9/8.14.9) with ESMTP id s9NNHXmZ019204 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=NO); Thu, 23 Oct 2014 17:17:37 -0600 (MDT) (envelope-from lists@jnielsen.net) X-Authentication-Warning: webmail2.jnielsen.net: Host office.betterlinux.com [199.58.199.60] claimed to be [10.10.1.196] Content-Type: text/plain; charset=us-ascii Mime-Version: 1.0 (Mac OS X Mail 8.0 \(1990.1\)) Subject: Re: Hammer fs From: John Nielsen In-Reply-To: <20141023223457.GA29206@bsdjunk.com> Date: Thu, 23 Oct 2014 17:17:32 -0600 Content-Transfer-Encoding: quoted-printable Message-Id: References: <20141023223457.GA29206@bsdjunk.com> To: Christopher Petrik X-Mailer: Apple Mail (2.1990.1) Cc: freebsd-fs@freebsd.org X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.18-1 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 23 Oct 2014 23:17:48 -0000
On Oct 23, 2014, at 4:34 PM, Christopher Petrik wrote:
>
> One way to improve FreeBSD is looking at the ideas page and act upon it, ive
used dragonfly and decided it is time to improve my c skills by porting hammer to FreeBSD. This will be a very time consumiung process which will take time. However it would be nice to have this option during install since it brings in some nice features.
Best of luck. Were you looking for any specific feedback or just trying to raise awareness?
Out of curiosity, were you planning to bring in the original HAMMER or the not-quite-yet-fully-done HAMMER2?
JN
From owner-freebsd-fs@FreeBSD.ORG Fri Oct 24 05:37:03 2014 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [8.8.178.115]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by hub.freebsd.org (Postfix) with ESMTPS id 99A4EF0F; Fri, 24 Oct 2014 05:37:03 +0000 (UTC) Received: from mail-yh0-x236.google.com (mail-yh0-x236.google.com [IPv6:2607:f8b0:4002:c01::236]) (using TLSv1 with cipher ECDHE-RSA-RC4-SHA (128/128 bits)) (Client CN "smtp.gmail.com", Issuer "Google Internet Authority G2" (verified OK)) by mx1.freebsd.org (Postfix) with ESMTPS id 51CA693D; Fri, 24 Oct 2014 05:37:03 +0000 (UTC) Received: by mail-yh0-f54.google.com with SMTP id 29so3062252yhl.27 for ; Thu, 23 Oct 2014 22:37:02 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20120113; h=mime-version:date:message-id:subject:from:to:content-type; bh=c6ZkRAUbIRWwGH3uHVrRkYWQynjytiqagvAo0EHCMk8=; b=caoPG358yUXk1XmE4nHevt5MDsb+Ok3Z9sL+gc8RsTBFqgSavjZgnYoaucMNNaat2L 830WK0a2gswGjPF5qvU6YjNDWwB3dbQ0MmToxV6TzOpPP7RTgO2gJczfEmbnWTlvAAHF t2/sudRUcLTdZwJ42grn9kt8amjqbTYPrxtR7IpPOuPGzBaWqRsYzVeKsKchh4ZhH5g3 d4pQdTzCz9U93k4pRI59noCm7mCMP3Or4hOwXVnzX97eRDOaDCgl6jt6DIGN8BEcMeum XQXUQt4KL+3dK8EM0zOjDl/uQvjur/bAaaUjWtNjqEadufvQkOIX0hdhbEe5fVyRtPNO DVAQ== MIME-Version: 1.0 X-Received: by 10.170.199.138 with SMTP id q132mr2981290yke.17.1414129022593; Thu, 23 Oct 2014 22:37:02 -0700 (PDT) Received: by 10.220.238.14 with HTTP; Thu, 23 Oct
2014 22:37:02 -0700 (PDT) Date: Fri, 24 Oct 2014 01:37:02 -0400 Message-ID: Subject: ZFS errors on the array but not the disk. From: Zaphod Beeblebrox To: FreeBSD Hackers , freebsd-fs Content-Type: text/plain; charset=UTF-8 X-Content-Filtered-By: Mailman/MimeDel 2.1.18-1 X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.18-1 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Fri, 24 Oct 2014 05:37:03 -0000
What does it mean when checksum errors appear on the array (and the vdev) but not on any of the disks? See the paste below. One would think that there isn't some ephemeral data stored somewhere that is not one of the disks, yet "cksum" errors show only on the vdev and the array lines. Help?

[2:17:316]root@virtual:/vr2/torrent/in> zpool status
  pool: vr2
 state: ONLINE
status: One or more devices is currently being resilvered.  The pool will
        continue to function, possibly in a degraded state.
action: Wait for the resilver to complete.
  scan: resilver in progress since Thu Oct 23 23:11:29 2014
        1.53T scanned out of 22.6T at 62.4M/s, 98h23m to go
        119G resilvered, 6.79% done
config:

        NAME               STATE     READ WRITE CKSUM
        vr2                ONLINE       0     0    36
          raidz1-0         ONLINE       0     0    72
            label/vr2-d0   ONLINE       0     0     0
            label/vr2-d1   ONLINE       0     0     0
            gpt/vr2-d2c    ONLINE       0     0     0  block size: 512B configured, 4096B native  (resilvering)
            gpt/vr2-d3b    ONLINE       0     0     0  block size: 512B configured, 4096B native
            gpt/vr2-d4a    ONLINE       0     0     0  block size: 512B configured, 4096B native
            ada14          ONLINE       0     0     0
            label/vr2-d6   ONLINE       0     0     0
            label/vr2-d7c  ONLINE       0     0     0
            label/vr2-d8   ONLINE       0     0     0
          raidz1-1         ONLINE       0     0     0
            gpt/vr2-e0     ONLINE       0     0     0  block size: 512B configured, 4096B native
            gpt/vr2-e1     ONLINE       0     0     0  block size: 512B configured, 4096B native
            gpt/vr2-e2     ONLINE       0     0     0  block size: 512B configured, 4096B native
            gpt/vr2-e3     ONLINE       0     0     0
            gpt/vr2-e4     ONLINE       0     0     0  block size: 512B configured, 4096B native
            gpt/vr2-e5     ONLINE       0     0     0  block size: 512B configured, 4096B native
            gpt/vr2-e6     ONLINE       0     0     0  block size: 512B configured, 4096B native
            gpt/vr2-e7     ONLINE       0     0     0  block size: 512B configured, 4096B native

errors: 43 data errors, use '-v' for a list
From owner-freebsd-fs@FreeBSD.ORG Fri Oct 24 06:52:36 2014 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [8.8.178.115]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by hub.freebsd.org (Postfix) with ESMTPS id 1591DE89 for ; Fri, 24 Oct 2014 06:52:36 +0000 (UTC) Received: from mail-yh0-x236.google.com (mail-yh0-x236.google.com [IPv6:2607:f8b0:4002:c01::236]) (using TLSv1 with cipher ECDHE-RSA-RC4-SHA (128/128 bits)) (Client CN "smtp.gmail.com", Issuer "Google Internet Authority G2" (verified OK)) by mx1.freebsd.org (Postfix) with ESMTPS id CBD18FEB for ; Fri, 24 Oct 2014 06:52:35 +0000 (UTC) Received: by mail-yh0-f54.google.com with SMTP id 29so3358461yhl.41 for ; Thu, 23 Oct 2014 23:52:35 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed;
d=gmail.com; s=20120113; h=mime-version:in-reply-to:references:date:message-id:subject:from:to :cc:content-type; bh=u+5o/vu+M7yjWTtzVKA4rbhniN8EohuphovD6OXDF5Y=; b=ObyVR9OUdyiOaxODAsWSmZ7nP3WjBn3z4hmg/UyGaSWotJdtwtc7RYGjMW4O5hxG+M loBuDOvFxxXH+vbj0cggFbKAjstyJFWuaYW3GLKqNLwqCCwPmI8PknrE079Tgvofsz7J Z7Ea3mPPzlJIYaGW0DugVJSJZ3l+7/6gdfo/2a04ufQcfm14iY874Oypn61Y3kTaJCj9 pMjydGNx2MGrC9JN492PfvxZRXtoa15weo2fP4M+jJHLdTawu8gPG95PYca52wkuMvQn gZzhlfNeID5MRbt5VxThvVVzZpXk2mPuX4JD+IGY4nBK/2aqk8kdiDCOVPAJutFKzJMM 1wdA== MIME-Version: 1.0 X-Received: by 10.170.197.150 with SMTP id o144mr3302802yke.103.1414133554051; Thu, 23 Oct 2014 23:52:34 -0700 (PDT) Received: by 10.170.156.139 with HTTP; Thu, 23 Oct 2014 23:52:34 -0700 (PDT) In-Reply-To: <5F9E965F5A80BC468BE5F40576769F099DF6E88A@exchange2-1> References: <5F9E965F5A80BC468BE5F40576769F099DF6E88A@exchange2-1> Date: Fri, 24 Oct 2014 07:52:34 +0100 Message-ID: Subject: Re: How big can ZFS L2ARC grow? From: krad To: Karli Sjöberg Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: quoted-printable X-Content-Filtered-By: Mailman/MimeDel 2.1.18-1 Cc: "freebsd-fs@freebsd.org" X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.18-1 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Fri, 24 Oct 2014 06:52:36 -0000
Doesn't L2ARC have compression these days?

On 23 October 2014 14:29, Karli Sjöberg wrote:
> Hey!
>
> As the topic states, I´m wondering about the size of L2ARC and if there
> is a limit to how big it is able to grow.
>
> Why I´m asking is because I´ve always thought that if you add a cache
> device to the pool, the maximum size of L2ARC would be the size of the
> disk you´ve added, but recently I´ve come to know that´s not the case.
>
> Here´s a 9.3-RELEASE system that has 64 GB RAM and two 256 GB large
> SSD's added as cache, that I would´ve thought only could´ve grown to
> about 512 GB:
> # sysctl -n kstat.zfs.misc.arcstats.l2_size
> 7696626233344
>
> Another system running 9.2-RELEASE with 32 GB RAM + 240 GB SSD:
> # sysctl -n kstat.zfs.misc.arcstats.l2_size
> 1400038980608
>
> The servers are running a software for graphing so I have seen that the
> size numbers can go up and down over time, but clearly goes over the
> size of the SSD's that have been added.
>
> We have two more systems configured with cache devices and yet another
> two systems configured without.
>
> The problem we have is that the four systems with cache devices (our
> primary storage systems) goes completely unresponsive after different
> periods of time, depending on how much RAM they have and the load they
> ´ve been under, I guess. The less RAM, the more frequent they stall, and
> I´m starting to wonder if what´s common between them is L2ARC, because
> the other two systems without cache devices doesn´t have those issues,
> although they aren´t under the same kind of load either, it´s a disaster
> recovery system receiving zfs snapshots and the other one is our syslog
> server, but still...
>
> What do you think, are the size numbers for L2ARC unusual, and could it
> be related to the stalls we´ve been experiencing? And if the size
> numbers really are unusual, is there a way to handle it, like limit how
> large the L2ARC is able to grow somehow?
>
>
>
> --
>
> Kind regards
>
>
> -------------------------------------------------------------------------------
> Karli Sjöberg
> Swedish University of Agricultural Sciences Box 7079 (Visiting Address
> Kronåsvägen 8)
> S-750 07 Uppsala, Sweden
> Phone: +46-(0)18-67 15 66
> karli.sjoberg@slu.se
> _______________________________________________
> freebsd-fs@freebsd.org mailing list
> http://lists.freebsd.org/mailman/listinfo/freebsd-fs
> To unsubscribe, send any mail to "freebsd-fs-unsubscribe@freebsd.org"
From owner-freebsd-fs@FreeBSD.ORG Fri Oct 24 13:27:13 2014 Return-Path: Delivered-To: freebsd-fs@FreeBSD.org Received: from mx1.freebsd.org (mx1.freebsd.org [8.8.178.115]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by hub.freebsd.org (Postfix) with ESMTPS id 3FB28DAC; Fri, 24 Oct 2014 13:27:13 +0000 (UTC) Received: from mail.madpilot.net (grunt.madpilot.net [78.47.145.38]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (Client did not present a certificate) by mx1.freebsd.org (Postfix) with ESMTPS id F3EE426E; Fri, 24 Oct 2014 13:27:12 +0000 (UTC) Received: from mail (mail [192.168.254.3]) by mail.madpilot.net (Postfix) with ESMTP id 3jPR785M82zb3H; Fri, 24 Oct 2014 15:27:00 +0200 (CEST) Received: from mail.madpilot.net ([192.168.254.3]) by mail (mail.madpilot.net [192.168.254.3]) (amavisd-new, port 10024) with ESMTP id gP5xM9MWieP9; Fri, 24 Oct 2014 15:26:45 +0200 (CEST) Received: from marvin.madpilot.net (micro.madpilot.net [88.149.173.206]) by mail.madpilot.net (Postfix) with ESMTPSA; Fri, 24 Oct 2014 15:26:40 +0200 (CEST) Message-ID: <544A538F.6060202@FreeBSD.org> Date: Fri, 24 Oct 2014 15:26:39 +0200 From: Guido Falsi User-Agent: Mozilla/5.0 (X11; FreeBSD amd64; rv:31.0) Gecko/20100101 Thunderbird/31.2.0 MIME-Version: 1.0 To: FreeBSD FS Subject: panic: detach with active requests on 10.1-RC3 Content-Type: text/plain; charset=utf-8
Content-Transfer-Encoding: 7bit Cc: Glen Barber X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.18-1 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Fri, 24 Oct 2014 13:27:13 -0000
Hi,
I'm running some experiments with 10.1-RC3 on ALIX boards using NanoBSD. When mounting and unmounting UFS filesystems, I have seen umount consistently hang hard in a deadlock. I have tested on two boards with two distinct CompactFlash disks, with the same results. This was not happening with 10.0-RELEASE.
I built a 10.1-RC3 kernel with full debugging and reproduced the problem; I got this:

root@qtest:~ [0]# umount /cfg
panic: detach with active requests
KDB: stack backtrace:
db_trace_self_wrapper(c0968053,c08ea7f0,c2d48800,c23d6bc8,c0536a16,...) at db_trace_self_wrapper+0x2d/frame 0xc23d6b98
kdb_backtrace(c09639e1,c09fa7e8,c095761d,c23d6c54,c095761d,...) at kdb_backtrace+0x30/frame 0xc23d6c00
vpanic(c09fa682,100,c095761d,c23d6c54,c23d6c54,...) at vpanic+0x80/frame 0xc23d6c24
kassert_panic(c095761d,c09575b3,c2d7acc0,4c7,c2d7acc0,...) at kassert_panic+0xe9/frame 0xc23d6c48
g_detach(c2d7acc0,4,c095725c,1c2,c09c8d5c,...) at g_detach+0x1d3/frame 0xc23d6c64
g_wither_washer(c09f7df4,0,c0956544,124,0,...) at g_wither_washer+0x109/frame 0xc23d6c90
g_run_events(0,c23d6d08,c095d42a,3dc,0,...) at g_run_events+0x40/frame 0xc23d6ccc
fork_exit(c05c4e60,0,c23d6d08) at fork_exit+0x7f/frame 0xc23d6cf4
fork_trampoline() at fork_trampoline+0x8/frame 0xc23d6cf4
--- trap 0, eip = 0, esp = 0xc23d6d40, ebp = 0 ---
KDB: enter: panic
[ thread pid 12 tid 100006 ]
Stopped at kdb_enter+0x3d: movl $0,kdb_why
db>

The machine is sitting there and I am connected via serial console. Is anyone willing to help me debug this further? I really know very little about kernel debugging. If necessary I can also make myself available via IRC or Jabber.
It looks like this has some similarities with what was reported here: https://lists.freebsd.org/pipermail/freebsd-fs/2014-September/020035.html I also tested with head (including r272130) and it does deadlock the same. Maybe the slower media is exposing some problem which does not show up with faster disks? Thanks in advance! -- Guido Falsi From owner-freebsd-fs@FreeBSD.ORG Fri Oct 24 15:00:56 2014 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by hub.freebsd.org (Postfix) with ESMTPS id C307CB09; Fri, 24 Oct 2014 15:00:56 +0000 (UTC) Received: from mx1.scaleengine.net (beauharnois2.bhs1.scaleengine.net [142.4.218.15]) by mx1.freebsd.org (Postfix) with ESMTP id 80ED0E82; Fri, 24 Oct 2014 15:00:56 +0000 (UTC) Received: from [192.168.1.2] (Seawolf.HML3.ScaleEngine.net [209.51.186.28]) (Authenticated sender: allanjude.freebsd@scaleengine.com) by mx1.scaleengine.net (Postfix) with ESMTPSA id AEE4E64AE8; Fri, 24 Oct 2014 15:00:55 +0000 (UTC) Message-ID: <544A69B8.4020402@freebsd.org> Date: Fri, 24 Oct 2014 11:01:12 -0400 From: Allan Jude User-Agent: Mozilla/5.0 (Windows NT 6.1; WOW64; rv:31.0) Gecko/20100101 Thunderbird/31.2.0 MIME-Version: 1.0 To: Zaphod Beeblebrox , FreeBSD Hackers , freebsd-fs Subject: Re: ZFS errors on the array but not the disk. 
References: In-Reply-To: Content-Type: multipart/signed; micalg=pgp-sha1; protocol="application/pgp-signature"; boundary="wkWRSt1dVbA0301swoKKkbEhTwaA6GPvs" X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.18-1 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Fri, 24 Oct 2014 15:00:56 -0000
This is an OpenPGP/MIME signed message (RFC 4880 and 3156)
--wkWRSt1dVbA0301swoKKkbEhTwaA6GPvs
Content-Type: text/plain; charset=windows-1252 Content-Transfer-Encoding: quoted-printable
On 2014-10-24 01:37, Zaphod Beeblebrox wrote:
> What does it mean when checksum errors appear on the array (and the vdev)
> but not on any of the disks? See the paste below. One would think that
> there isn't some ephemeral data stored somewhere that is not one of the
> disks, yet "cksum" errors show only on the vdev and the array lines. Help?
>
> [2:17:316]root@virtual:/vr2/torrent/in> zpool status
>   pool: vr2
>  state: ONLINE
> status: One or more devices is currently being resilvered.  The pool will
>         continue to function, possibly in a degraded state.
> action: Wait for the resilver to complete.
>   scan: resilver in progress since Thu Oct 23 23:11:29 2014
>         1.53T scanned out of 22.6T at 62.4M/s, 98h23m to go
>         119G resilvered, 6.79% done
> config:
>
>         NAME               STATE     READ WRITE CKSUM
>         vr2                ONLINE       0     0    36
>           raidz1-0         ONLINE       0     0    72
>             label/vr2-d0   ONLINE       0     0     0
>             label/vr2-d1   ONLINE       0     0     0
>             gpt/vr2-d2c    ONLINE       0     0     0  block size: 512B configured, 4096B native  (resilvering)
>             gpt/vr2-d3b    ONLINE       0     0     0  block size: 512B configured, 4096B native
>             gpt/vr2-d4a    ONLINE       0     0     0  block size: 512B configured, 4096B native
>             ada14          ONLINE       0     0     0
>             label/vr2-d6   ONLINE       0     0     0
>             label/vr2-d7c  ONLINE       0     0     0
>             label/vr2-d8   ONLINE       0     0     0
>           raidz1-1         ONLINE       0     0     0
>             gpt/vr2-e0     ONLINE       0     0     0  block size: 512B configured, 4096B native
>             gpt/vr2-e1     ONLINE       0     0     0  block size: 512B configured, 4096B native
>             gpt/vr2-e2     ONLINE       0     0     0  block size: 512B configured, 4096B native
>             gpt/vr2-e3     ONLINE       0     0     0
>             gpt/vr2-e4     ONLINE       0     0     0  block size: 512B configured, 4096B native
>             gpt/vr2-e5     ONLINE       0     0     0  block size: 512B configured, 4096B native
>             gpt/vr2-e6     ONLINE       0     0     0  block size: 512B configured, 4096B native
>             gpt/vr2-e7     ONLINE       0     0     0  block size: 512B configured, 4096B native
>
> errors: 43 data errors, use '-v' for a list
> _______________________________________________
> freebsd-hackers@freebsd.org mailing list
> http://lists.freebsd.org/mailman/listinfo/freebsd-hackers
> To unsubscribe, send any mail to "freebsd-hackers-unsubscribe@freebsd.org"
>

I am guessing they were on the disk that is now resilvering. I am not sure if resilvering causes the device's error count to get reset, but it would make sense if it is considered a 'replaced' disk.
--=20 Allan Jude --wkWRSt1dVbA0301swoKKkbEhTwaA6GPvs Content-Type: application/pgp-signature; name="signature.asc" Content-Description: OpenPGP digital signature Content-Disposition: attachment; filename="signature.asc" -----BEGIN PGP SIGNATURE----- Version: GnuPG v2.0.22 (MingW32) iQIcBAEBAgAGBQJUSmm6AAoJEJrBFpNRJZKflAkP/Re9F3VJOXgQQSJbG8jV0fO/ n6EzHLVUezMJUndq26DX6zzsuXl/7uSAXoxMoY52i5M2nf3DmK1n0Nn1cx+/k9Mx 4vDKWSIhrZsWfoHPR8aJHjIUYd2SPdEcmmrXb5N21YVencPtzqfHjyvCWf71kx8A ZJsqQQ1CBAh0jfcPmGJXXLxbj/8I7LTW7JCAdsZX942SYx9D6qi45uD9mObw/DfJ QWFrFS/JrjBMHjnIUaSMmElpoHuXYekpdS7bpAE2B6jsXNh26aVmb2QEA/w4gmK9 2JE9Q/NBYx0E31m5kYfooQVXczvDo5fV/dNBqF8tNIOFtqLOi0b3I34mwH1uu+uy 28JMQf2OxW8PWkpGbe4X9D6CBqV0vZbM74zRsdkNneCUlbGL5ETbSNDbt3w4Z/Gc c6Mdx4uOulYGIqdp0Gb9/2SJgFrmQq/vNkzNPFkg3urzm/y6R/VlcqUbLDpG/JOH SKRoDcKyWSyAMBE7bN7Tdg6Atf2kNSJGTFkgLrJ4S+tLpjHFHg6vuJs2NEDlVZ+0 3p5RvsjKfz+H/3WLyrIaqPXVYS1PCugMCGTDC3TCdtbMBHzl6yrCmj2vJYsM4H0c Cdvfy0wUTFnAXxb7BLFakrIMybhxEHhfgFVx60jN4jmWAmFtd6FYf5cdL4KkG2AL msJ7+3WYt7ZvRCPi10ip =2rG4 -----END PGP SIGNATURE----- --wkWRSt1dVbA0301swoKKkbEhTwaA6GPvs-- From owner-freebsd-fs@FreeBSD.ORG Fri Oct 24 15:33:25 2014 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [8.8.178.115]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by hub.freebsd.org (Postfix) with ESMTPS id 10F94563; Fri, 24 Oct 2014 15:33:25 +0000 (UTC) Received: from mail-wg0-x234.google.com (mail-wg0-x234.google.com [IPv6:2a00:1450:400c:c00::234]) (using TLSv1 with cipher ECDHE-RSA-RC4-SHA (128/128 bits)) (Client CN "smtp.gmail.com", Issuer "Google Internet Authority G2" (verified OK)) by mx1.freebsd.org (Postfix) with ESMTPS id 78FFC33F; Fri, 24 Oct 2014 15:33:24 +0000 (UTC) Received: by mail-wg0-f52.google.com with SMTP id a1so1328591wgh.11 for ; Fri, 24 Oct 2014 08:33:22 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20120113; 
h=mime-version:sender:in-reply-to:references:date:message-id:subject :from:to:cc:content-type; bh=wgXrmU+8kn1JH5xWfoyDqY0K1SagVeWCg58rIyA14Ro=; b=u5U1J+Gn1f5euOJ+sLti7ZS7B+VkHIHDVMUw7f8rTpEQl+rmAONsGO3ScdB0YV4pHM 02WKeiXkLKdFPe5mq1OH8rQ+p+cbFK0g7u2GXKulmdOgRQcplG0QFPshE+nLufqbMqXJ hh6d0tmOm1UrKqVkhF4mFHt4336ioLHXkcqX4yFzVDWyT4CkOr7WO7cAX90qBGwLOz4B S7OstHMV0q9LVX48Ef7+SLwr+QfaFdWQZlS4WuFnX/4Jxetx3762AVzmZ475+XSabpgH lwIpF+PL0G6/fvrEX96uqEr4LppaD6/ecB01lcfw3+iW0LdlJ7kFWl3gmfjyS5Bxmukg IR6g== MIME-Version: 1.0 X-Received: by 10.180.109.99 with SMTP id hr3mr4907955wib.82.1414164802558; Fri, 24 Oct 2014 08:33:22 -0700 (PDT) Sender: asomers@gmail.com Received: by 10.194.220.227 with HTTP; Fri, 24 Oct 2014 08:33:22 -0700 (PDT) In-Reply-To: References: Date: Fri, 24 Oct 2014 09:33:22 -0600 X-Google-Sender-Auth: 8SCGdQ-LlO1ZTBX4V5xzL6yIcYw Message-ID: Subject: Re: ZFS errors on the array but not the disk. From: Alan Somers To: Zaphod Beeblebrox Content-Type: text/plain; charset=UTF-8 Cc: freebsd-fs , FreeBSD Hackers X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.18-1 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Fri, 24 Oct 2014 15:33:25 -0000 On Thu, Oct 23, 2014 at 11:37 PM, Zaphod Beeblebrox wrote: > What does it mean when checksum errors appear on the array (and the vdev) > but not on any of the disks? See the paste below. One would think that > there isn't some ephemeral data stored somewhere that is not one of the > disks, yet "cksum" errors show only on the vdev and the array lines. Help? > > [2:17:316]root@virtual:/vr2/torrent/in> zpool status > pool: vr2 > state: ONLINE > status: One or more devices is currently being resilvered. The pool will > continue to function, possibly in a degraded state. > action: Wait for the resilver to complete. 
> scan: resilver in progress since Thu Oct 23 23:11:29 2014 > 1.53T scanned out of 22.6T at 62.4M/s, 98h23m to go > 119G resilvered, 6.79% done > config: > > NAME STATE READ WRITE CKSUM > vr2 ONLINE 0 0 36 > raidz1-0 ONLINE 0 0 72 > label/vr2-d0 ONLINE 0 0 0 > label/vr2-d1 ONLINE 0 0 0 > gpt/vr2-d2c ONLINE 0 0 0 block size: 512B > configured, 4096B native (resilvering) > gpt/vr2-d3b ONLINE 0 0 0 block size: 512B > configured, 4096B native > gpt/vr2-d4a ONLINE 0 0 0 block size: 512B > configured, 4096B native > ada14 ONLINE 0 0 0 > label/vr2-d6 ONLINE 0 0 0 > label/vr2-d7c ONLINE 0 0 0 > label/vr2-d8 ONLINE 0 0 0 > raidz1-1 ONLINE 0 0 0 > gpt/vr2-e0 ONLINE 0 0 0 block size: 512B > configured, 4096B native > gpt/vr2-e1 ONLINE 0 0 0 block size: 512B > configured, 4096B native > gpt/vr2-e2 ONLINE 0 0 0 block size: 512B > configured, 4096B native > gpt/vr2-e3 ONLINE 0 0 0 > gpt/vr2-e4 ONLINE 0 0 0 block size: 512B > configured, 4096B native > gpt/vr2-e5 ONLINE 0 0 0 block size: 512B > configured, 4096B native > gpt/vr2-e6 ONLINE 0 0 0 block size: 512B > configured, 4096B native > gpt/vr2-e7 ONLINE 0 0 0 block size: 512B > configured, 4096B native > > errors: 43 data errors, use '-v' for a list The checksum errors will appear on the raidz vdev instead of a leaf if vdev_raidz.c can't determine which leaf vdev was responsible. This could happen if two or more leaf vdevs return bad data for the same block, which would also lead to unrecoverable data errors. I see that you have some unrecoverable data errors, so maybe that's what happened to you. Subtle design bugs in ZFS can also lead to vdev_raidz.c being unable to determine which child was responsible for a checksum error. However, I've only seen that happen when a raidz vdev has a mirror child. That can only happen if the child is a spare or replacing vdev. Did you activate any spares, or did you manually replace a vdev? 
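To make the attribution rule described above concrete, here is a toy Python sketch of single-parity reconstruction. It is an illustration only, not the logic in vdev_raidz.c: the names xor_blocks, checksum and attribute_error are made up for this example, SHA-256 stands in for the real block checksum, and parity is plain XOR over the data columns. The idea is to assume each data column in turn is the bad one, rebuild it from the others plus parity, and see whether the block checksum then matches. If a rebuild verifies, that leaf can be blamed; if no single-column rebuild verifies (for instance because two columns are bad), nobody can be singled out and the error can only be counted against the raidz vdev.

```python
import hashlib
from functools import reduce


def xor_blocks(blocks):
    """XOR equal-length byte strings together (toy single-parity arithmetic)."""
    return bytes(reduce(lambda a, b: a ^ b, col) for col in zip(*blocks))


def checksum(columns):
    """Stand-in for the block checksum stored with the parent block pointer."""
    return hashlib.sha256(b"".join(columns)).hexdigest()


def attribute_error(data_cols, parity, expected):
    """Return ('ok', None), ('leaf', index) or ('vdev', None)."""
    if checksum(data_cols) == expected:
        return ("ok", None)
    # Assume exactly one data column is bad and rebuild it from the rest.
    for i in range(len(data_cols)):
        others = data_cols[:i] + data_cols[i + 1:]
        rebuilt = xor_blocks(others + [parity])
        candidate = data_cols[:i] + [rebuilt] + data_cols[i + 1:]
        if checksum(candidate) == expected:
            return ("leaf", i)  # the rebuild verified, so child i is to blame
    return ("vdev", None)       # no single child explains the bad checksum


# Hypothetical three-column stripe, written while all leaves were healthy.
good = [b"AAAA", b"BBBB", b"CCCC"]
parity = xor_blocks(good)
expected = checksum(good)

one_bad = [b"AAAA", b"BxBB", b"CCCC"]  # one leaf returns bad data
two_bad = [b"AzAA", b"BxBB", b"CCCC"]  # two leaves return bad data

print(attribute_error(good, parity, expected))     # ('ok', None)
print(attribute_error(one_bad, parity, expected))  # ('leaf', 1)
print(attribute_error(two_bad, parity, expected))  # ('vdev', None)
```

With one damaged column the sketch blames leaf 1; with two damaged columns it falls through to the vdev-level result, which matches the pattern of zero CKSUM counts on every disk and non-zero counts on the raidz1-0 line. The sketch does not model a damaged parity column or a child that is itself a replacing or spare vdev.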
-Alan
From owner-freebsd-fs@FreeBSD.ORG Fri Oct 24 16:46:04 2014 Return-Path: Delivered-To: fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by hub.freebsd.org (Postfix) with ESMTPS id 9ACFE684 for ; Fri, 24 Oct 2014 16:46:04 +0000 (UTC) Received: from vps1.elischer.org (vps1.elischer.org [204.109.63.16]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (Client CN "vps1.elischer.org", Issuer "CA Cert Signing Authority" (not verified)) by mx1.freebsd.org (Postfix) with ESMTPS id 58175D87 for ; Fri, 24 Oct 2014 16:46:04 +0000 (UTC) Received: from jre-mbp.elischer.org (ppp121-45-234-114.lns20.per1.internode.on.net [121.45.234.114]) (authenticated bits=0) by vps1.elischer.org (8.14.9/8.14.9) with ESMTP id s9OGjqud076089 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES128-SHA bits=128 verify=NO) for ; Fri, 24 Oct 2014 09:45:56 -0700 (PDT) (envelope-from julian@freebsd.org) Message-ID: <544A823A.1080304@freebsd.org> Date: Sat, 25 Oct 2014 00:45:46 +0800 From: Julian Elischer User-Agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10.9; rv:24.0) Gecko/20100101 Thunderbird/24.6.0 MIME-Version: 1.0 To: fs@freebsd.org Subject: change in VFS layer API? Content-Type: text/plain; charset=ISO-8859-1; format=flowed Content-Transfer-Encoding: 7bit X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.18-1 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Fri, 24 Oct 2014 16:46:04 -0000
Can anyone point me at a VFS API contract change that occurred over the last 5 years where a filesystem written to the old contract would end up with extra references to all its vnodes/objects? Specifically, a proprietary filesystem that ran on 8.0 still compiles, but it ends up with extra references on its vnodes and cannot free them.
thanks, Julian From owner-freebsd-fs@FreeBSD.ORG Fri Oct 24 17:18:37 2014 Return-Path: Delivered-To: fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [8.8.178.115]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by hub.freebsd.org (Postfix) with ESMTPS id F33772EB; Fri, 24 Oct 2014 17:18:36 +0000 (UTC) Received: from mail-yh0-x22c.google.com (mail-yh0-x22c.google.com [IPv6:2607:f8b0:4002:c01::22c]) (using TLSv1 with cipher ECDHE-RSA-RC4-SHA (128/128 bits)) (Client CN "smtp.gmail.com", Issuer "Google Internet Authority G2" (verified OK)) by mx1.freebsd.org (Postfix) with ESMTPS id AAE67116; Fri, 24 Oct 2014 17:18:36 +0000 (UTC) Received: by mail-yh0-f44.google.com with SMTP id i57so1251073yha.31 for ; Fri, 24 Oct 2014 10:18:35 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20120113; h=mime-version:sender:in-reply-to:references:date:message-id:subject :from:to:cc:content-type; bh=spCtYFLteT7/MZ8p9zNUkxjdEJZwgWsLWuHD2PQcU8s=; b=PE8Uhs0CBxUqDQiKekxlsiUc6P0ocshyJWtXWn9LnQwOK/MJ/hkTbtFrb5kldw7tdM 6Oattxc6Nz9SqeMeTBwjy82irX9jjk7THw91TVQJ02h+t/02QbjQDsAeVWpXc2QbjGcO V9bEs0u7QloU32bTFuL/h1ejQADxrg+o9pjFN6VEaX8rxcgxH163OFzaU+Ls6QtMiiVX OR2x5Y/X8R4QBiDascZbPDk6cJug+Rb7g5tfXZ6y8mU9YIvpWSk0/NQrfq1DlEm3k7En noASR6VCsrXytgUeJSSmxLPvgn8hccHB8JkAJkgBELs5+CfPnb1BY+xmujfNoCPNVAlQ GBcQ== MIME-Version: 1.0 X-Received: by 10.236.39.5 with SMTP id c5mr6732510yhb.92.1414171115752; Fri, 24 Oct 2014 10:18:35 -0700 (PDT) Sender: kmacybsd@gmail.com Received: by 10.170.82.197 with HTTP; Fri, 24 Oct 2014 10:18:35 -0700 (PDT) In-Reply-To: <544A823A.1080304@freebsd.org> References: <544A823A.1080304@freebsd.org> Date: Fri, 24 Oct 2014 10:18:35 -0700 X-Google-Sender-Auth: T6xZJP3_WUi-4DDbYnCjLowhAu8 Message-ID: Subject: Re: change in VFS layer API? From: "K. 
Macy" To: Julian Elischer Content-Type: text/plain; charset=UTF-8 Cc: "freebsd-fs@FreeBSD.org" X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.18-1 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Fri, 24 Oct 2014 17:18:37 -0000
On Fri, Oct 24, 2014 at 9:45 AM, Julian Elischer wrote:
> Can anyone point me at a VFS API contract change that occurred over the last
> 5 years where a filesystem written to teh old contract would end up with
> extra references to all its vnodes/objects? Specifically a proprietary
> filesystem that ran on 8.0 now can be compiled but ends up with extra
> references on its vnodes and can not free them.
>

I think the contract for some functions has become unclear. I've found that in the OpenSolaris compatibility layer, traverse()'s vput of the initial vnode passed in triggers negative reference count panics. It is clear that some callers of lookup expect the reference to be maintained on error, so the unconditional vput was (well, is; this patch isn't in base) wrong, but in the case of success it isn't clear. Doing the vput on success will still eventually (as in, within a few seconds of this torture test script) cause a negative reference count panic.
I think there needs to be an audit of VFS function contract compliance, preferably by someone who knows what the contracts are. I can only infer them from cumulative context.
Thanks.
-K From owner-freebsd-fs@FreeBSD.ORG Fri Oct 24 18:42:35 2014 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [8.8.178.115]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by hub.freebsd.org (Postfix) with ESMTPS id 2B689AB7; Fri, 24 Oct 2014 18:42:35 +0000 (UTC) Received: from mail-vc0-x22f.google.com (mail-vc0-x22f.google.com [IPv6:2607:f8b0:400c:c03::22f]) (using TLSv1 with cipher ECDHE-RSA-RC4-SHA (128/128 bits)) (Client CN "smtp.gmail.com", Issuer "Google Internet Authority G2" (verified OK)) by mx1.freebsd.org (Postfix) with ESMTPS id B9236C6B; Fri, 24 Oct 2014 18:42:34 +0000 (UTC) Received: by mail-vc0-f175.google.com with SMTP id id10so577411vcb.6 for ; Fri, 24 Oct 2014 11:42:33 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20120113; h=mime-version:in-reply-to:references:date:message-id:subject:from:to :cc:content-type; bh=yacKNvcBnyT3CZOWf0Z7ax56esPKFstuCx7hDfawYfI=; b=bjkpgzh4wtQg5hKzfMigHqNXmQWQSwdfsPV/pAXVkUf8O4iUCfDm4/EgTjbc68MRZW EJ+VyqNNMB5qXTUKrnK5FZX0LZQ+3pmV8phmqWUT5Ug0eyyYZhSZJxqnnRpFT4woz+1k le4NoRlCt3Qf6sdRNIOZy/xAcEEcNgPNv1uU+z499h0Ps6RawnT9Y9odu/XgotAygWpE XF0+Ws6L+PXw3j4jdGA5ZkXl2QW4YzB3PX93RFRcK75YCCvAxAjpMFRBO1imZ1pW244M Xi/Vrw3iKykBGgVceTMOhfnXX8rn2NvoaiyfiYWUAxYWq/gyuLBzz4PPqYWNYBlnrwqX mucg== MIME-Version: 1.0 X-Received: by 10.221.46.4 with SMTP id um4mr4068777vcb.23.1414176153132; Fri, 24 Oct 2014 11:42:33 -0700 (PDT) Received: by 10.220.238.14 with HTTP; Fri, 24 Oct 2014 11:42:32 -0700 (PDT) In-Reply-To: References: Date: Fri, 24 Oct 2014 14:42:32 -0400 Message-ID: Subject: Re: ZFS errors on the array but not the disk. 
From: Zaphod Beeblebrox To: Alan Somers Content-Type: text/plain; charset=UTF-8 X-Content-Filtered-By: Mailman/MimeDel 2.1.18-1 Cc: freebsd-fs , FreeBSD Hackers X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.18-1 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Fri, 24 Oct 2014 18:42:35 -0000
I manually replaced a disk... and the array was scrubbed recently. Interestingly, I seem to be hitting the "endless loop of resilvering" problem. I can't find much on it, but the resilver will complete and I can then run another scrub, which completes too. Then rebooting causes another resilver.
Another odd data point: it seems as if the things that show up as "errors" change from resilver to resilver.
One bug, it would seem, is that once ZFS has detected an error, another scrub can reset it, but no attempt is made to read through the error if you access the object directly.

On Fri, Oct 24, 2014 at 11:33 AM, Alan Somers wrote:
> On Thu, Oct 23, 2014 at 11:37 PM, Zaphod Beeblebrox
> wrote:
> > What does it mean when checksum errors appear on the array (and the vdev)
> > but not on any of the disks? See the paste below. One would think that
> > there isn't some ephemeral data stored somewhere that is not one of the
> > disks, yet "cksum" errors show only on the vdev and the array lines.
> Help?
> >
> > [2:17:316]root@virtual:/vr2/torrent/in> zpool status
> >   pool: vr2
> >  state: ONLINE
> > status: One or more devices is currently being resilvered.  The pool will
> >         continue to function, possibly in a degraded state.
> > action: Wait for the resilver to complete.
> > scan: resilver in progress since Thu Oct 23 23:11:29 2014 > > 1.53T scanned out of 22.6T at 62.4M/s, 98h23m to go > > 119G resilvered, 6.79% done > > config: > > > > NAME STATE READ WRITE CKSUM > > vr2 ONLINE 0 0 36 > > raidz1-0 ONLINE 0 0 72 > > label/vr2-d0 ONLINE 0 0 0 > > label/vr2-d1 ONLINE 0 0 0 > > gpt/vr2-d2c ONLINE 0 0 0 block size: 512B > > configured, 4096B native (resilvering) > > gpt/vr2-d3b ONLINE 0 0 0 block size: 512B > > configured, 4096B native > > gpt/vr2-d4a ONLINE 0 0 0 block size: 512B > > configured, 4096B native > > ada14 ONLINE 0 0 0 > > label/vr2-d6 ONLINE 0 0 0 > > label/vr2-d7c ONLINE 0 0 0 > > label/vr2-d8 ONLINE 0 0 0 > > raidz1-1 ONLINE 0 0 0 > > gpt/vr2-e0 ONLINE 0 0 0 block size: 512B > > configured, 4096B native > > gpt/vr2-e1 ONLINE 0 0 0 block size: 512B > > configured, 4096B native > > gpt/vr2-e2 ONLINE 0 0 0 block size: 512B > > configured, 4096B native > > gpt/vr2-e3 ONLINE 0 0 0 > > gpt/vr2-e4 ONLINE 0 0 0 block size: 512B > > configured, 4096B native > > gpt/vr2-e5 ONLINE 0 0 0 block size: 512B > > configured, 4096B native > > gpt/vr2-e6 ONLINE 0 0 0 block size: 512B > > configured, 4096B native > > gpt/vr2-e7 ONLINE 0 0 0 block size: 512B > > configured, 4096B native > > > > errors: 43 data errors, use '-v' for a list > > The checksum errors will appear on the raidz vdev instead of a leaf if > vdev_raidz.c can't determine which leaf vdev was responsible. This > could happen if two or more leaf vdevs return bad data for the same > block, which would also lead to unrecoverable data errors. I see that > you have some unrecoverable data errors, so maybe that's what happened > to you. > > Subtle design bugs in ZFS can also lead to vdev_raidz.c being unable > to determine which child was responsible for a checksum error. > However, I've only seen that happen when a raidz vdev has a mirror > child. That can only happen if the child is a spare or replacing > vdev. 
Did you activate any spares, or did you manually replace a > vdev? > > -Alan > From owner-freebsd-fs@FreeBSD.ORG Sat Oct 25 02:47:10 2014 Return-Path: Delivered-To: fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by hub.freebsd.org (Postfix) with ESMTPS id A2DA09BB; Sat, 25 Oct 2014 02:47:10 +0000 (UTC) Received: from vps1.elischer.org (vps1.elischer.org [204.109.63.16]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (Client CN "vps1.elischer.org", Issuer "CA Cert Signing Authority" (not verified)) by mx1.freebsd.org (Postfix) with ESMTPS id 77B6DF2; Sat, 25 Oct 2014 02:47:10 +0000 (UTC) Received: from jre-mbp.elischer.org (ppp121-45-234-114.lns20.per1.internode.on.net [121.45.234.114]) (authenticated bits=0) by vps1.elischer.org (8.14.9/8.14.9) with ESMTP id s9P2l5eD077558 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES128-SHA bits=128 verify=NO); Fri, 24 Oct 2014 19:47:08 -0700 (PDT) (envelope-from julian@freebsd.org) Message-ID: <544B0F24.4060500@freebsd.org> Date: Sat, 25 Oct 2014 10:47:00 +0800 From: Julian Elischer User-Agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10.9; rv:24.0) Gecko/20100101 Thunderbird/24.6.0 MIME-Version: 1.0 To: "K. Macy" Subject: Re: change in VFS layer API? References: <544A823A.1080304@freebsd.org> In-Reply-To: Content-Type: text/plain; charset=UTF-8; format=flowed Content-Transfer-Encoding: 7bit Cc: "freebsd-fs@FreeBSD.org" X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.18-1 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Sat, 25 Oct 2014 02:47:10 -0000 On 10/25/14, 1:18 AM, K. 
Macy wrote:
> On Fri, Oct 24, 2014 at 9:45 AM, Julian Elischer wrote:
>> Can anyone point me at a VFS API contract change that occurred over the last
>> 5 years where a filesystem written to the old contract would end up with
>> extra references to all its vnodes/objects? Specifically a proprietary
>> filesystem that ran on 8.0 now can be compiled but ends up with extra
>> references on its vnodes and can not free them.
>>
> I think the contract for some functions has become unclear. I've found
> that the opensolaris' compatibility layer traverse' vput of the
> initial vnode passed in triggers negative reference count panics. It
> is clear that some callers of lookup expect the reference to be
> maintained on error so the unconditional vput was (well is - this
> patch isn't in base) wrong, but in the case of success it isn't clear.
> Doing the vput on success will still eventually (as in a few seconds
> of this torture test script) cause a negative reference count panic. I
> think there needs to be an audit of VFS function contract compliance.
> Preferably by someone who knows what they are. I can only infer from
> cumulative context.

I have evidence that the API has actually changed somehow. The old API would have extra calls to remove references to nodes. We don't seem to be seeing those extra calls any more. I'll have more info later.

> Thanks.
> > -K > >

From owner-freebsd-fs@FreeBSD.ORG Sat Oct 25 03:00:15 2014 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [8.8.178.115]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by hub.freebsd.org (Postfix) with ESMTPS id 76865ADE for ; Sat, 25 Oct 2014 03:00:15 +0000 (UTC) Received: from smtp1.multiplay.co.uk (smtp1.multiplay.co.uk [85.236.96.35]) by mx1.freebsd.org (Postfix) with ESMTP id 0FD3D1CE for ; Sat, 25 Oct 2014 03:00:14 +0000 (UTC) Received: by smtp1.multiplay.co.uk (Postfix, from userid 65534) id CF78320E7088D; Sat, 25 Oct 2014 03:00:06 +0000 (UTC) Received: from [10.10.1.68] (82-69-141-170.dsl.in-addr.zen.co.uk [82.69.141.170]) by smtp1.multiplay.co.uk (Postfix) with ESMTP id BF31A20E7088A for ; Sat, 25 Oct 2014 03:00:06 +0000 (UTC) Message-ID: <544B12B8.8060302@freebsd.org> Date: Sat, 25 Oct 2014 04:02:16 +0100 From: Steven Hartland User-Agent: Mozilla/5.0 (Windows NT 5.1; rv:31.0) Gecko/20100101 Thunderbird/31.2.0 MIME-Version: 1.0 To: freebsd-fs@freebsd.org Subject: Re: ZFS errors on the array but not the disk. References: In-Reply-To: Content-Type: text/plain; charset=windows-1252; format=flowed Content-Transfer-Encoding: 7bit X-Content-Filtered-By: Mailman/MimeDel 2.1.18-1 X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.18-1 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Sat, 25 Oct 2014 03:00:15 -0000

There was an issue which would cause resilver restarts, fixed by *265253*, which was MFC'ed to stable/10 by *271683*, so you'll want to make sure you're later than that.

On 24/10/2014 19:42, Zaphod Beeblebrox wrote:
> I manually replaced a disk... and the array was scrubbed recently.
> Interestingly, I seem to be in the "endless loop" of resilvering problem.
> Not much I can find on it. but resilvering will complete and I can then
> run another scrub.
It will complete, too. Then rebooting causes another
> resilvering.
>
> Another odd data point: it seems as if the things that show up as "errors"
> change from resilvering to resilvering.
>
> One bug, it would seem, is that once ZFS has detected an error... another
> scrub can reset it, but no attempt is made to read-through the error if you
> access the object directly.
>
> On Fri, Oct 24, 2014 at 11:33 AM, Alan Somers wrote:
>
>> On Thu, Oct 23, 2014 at 11:37 PM, Zaphod Beeblebrox wrote:
>>> What does it mean when checksum errors appear on the array (and the vdev)
>>> but not on any of the disks? See the paste below. One would think that
>>> there isn't some ephemeral data stored somewhere that is not one of the
>>> disks, yet "cksum" errors show only on the vdev and the array lines.
>>> Help?
>>> [2:17:316]root@virtual:/vr2/torrent/in> zpool status
>>>   pool: vr2
>>>  state: ONLINE
>>> status: One or more devices is currently being resilvered.  The pool will
>>>         continue to function, possibly in a degraded state.
>>> action: Wait for the resilver to complete.
>>>   scan: resilver in progress since Thu Oct 23 23:11:29 2014
>>>         1.53T scanned out of 22.6T at 62.4M/s, 98h23m to go
>>>         119G resilvered, 6.79% done
>>> config:
>>>
>>>         NAME               STATE     READ WRITE CKSUM
>>>         vr2                ONLINE       0     0    36
>>>           raidz1-0         ONLINE       0     0    72
>>>             label/vr2-d0   ONLINE       0     0     0
>>>             label/vr2-d1   ONLINE       0     0     0
>>>             gpt/vr2-d2c    ONLINE       0     0     0  block size: 512B configured, 4096B native  (resilvering)
>>>             gpt/vr2-d3b    ONLINE       0     0     0  block size: 512B configured, 4096B native
>>>             gpt/vr2-d4a    ONLINE       0     0     0  block size: 512B configured, 4096B native
>>>             ada14          ONLINE       0     0     0
>>>             label/vr2-d6   ONLINE       0     0     0
>>>             label/vr2-d7c  ONLINE       0     0     0
>>>             label/vr2-d8   ONLINE       0     0     0
>>>           raidz1-1         ONLINE       0     0     0
>>>             gpt/vr2-e0     ONLINE       0     0     0  block size: 512B configured, 4096B native
>>>             gpt/vr2-e1     ONLINE       0     0     0  block size: 512B configured, 4096B native
>>>             gpt/vr2-e2     ONLINE       0     0     0  block size: 512B configured, 4096B native
>>>             gpt/vr2-e3     ONLINE       0     0     0
>>>             gpt/vr2-e4     ONLINE       0     0     0  block size: 512B configured, 4096B native
>>>             gpt/vr2-e5     ONLINE       0     0     0  block size: 512B configured, 4096B native
>>>             gpt/vr2-e6     ONLINE       0     0     0  block size: 512B configured, 4096B native
>>>             gpt/vr2-e7     ONLINE       0     0     0  block size: 512B configured, 4096B native
>>>
>>> errors: 43 data errors, use '-v' for a list
>> The checksum errors will appear on the raidz vdev instead of a leaf if
>> vdev_raidz.c can't determine which leaf vdev was responsible. This
>> could happen if two or more leaf vdevs return bad data for the same
>> block, which would also lead to unrecoverable data errors. I see that
>> you have some unrecoverable data errors, so maybe that's what happened
>> to you.
>>
>> Subtle design bugs in ZFS can also lead to vdev_raidz.c being unable
>> to determine which child was responsible for a checksum error.
>> However, I've only seen that happen when a raidz vdev has a mirror
>> child. That can only happen if the child is a spare or replacing
>> vdev.
Did you activate any spares, or did you manually replace a >> vdev? >> >> -Alan >> > _______________________________________________ > freebsd-fs@freebsd.org mailing list > http://lists.freebsd.org/mailman/listinfo/freebsd-fs > To unsubscribe, send any mail to "freebsd-fs-unsubscribe@freebsd.org" > > From owner-freebsd-fs@FreeBSD.ORG Sat Oct 25 03:47:44 2014 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [8.8.178.115]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by hub.freebsd.org (Postfix) with ESMTPS id BBA643B7; Sat, 25 Oct 2014 03:47:44 +0000 (UTC) Received: from mail-vc0-x22f.google.com (mail-vc0-x22f.google.com [IPv6:2607:f8b0:400c:c03::22f]) (using TLSv1 with cipher ECDHE-RSA-RC4-SHA (128/128 bits)) (Client CN "smtp.gmail.com", Issuer "Google Internet Authority G2" (verified OK)) by mx1.freebsd.org (Postfix) with ESMTPS id 65A5D920; Sat, 25 Oct 2014 03:47:44 +0000 (UTC) Received: by mail-vc0-f175.google.com with SMTP id id10so921732vcb.34 for ; Fri, 24 Oct 2014 20:47:43 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20120113; h=mime-version:in-reply-to:references:date:message-id:subject:from:to :cc:content-type; bh=0BnmbHrX/Kv7dQq2OxzsMW3F+oLauUCmo0wzo8PXHmc=; b=SIuMNvb31uqKWYArdO0v9AwPrx44etjqwYgmfZc6SNZRRAUHtU754uXK8cGyOOrShm YjDv9gJAYjTkhRE5/1jFVritYQ0Pn8eH06H6TeLwnzfW/De/J2W9iLzj96ZmzR3rJAlY GYG68eeex6LQlixTmO87OFpqvoX56PzmPwkgPiZvNTFT1AGN1OmLwr+2Nv+9SwCNy8p9 Mzj8ZcCSP97+KgjcizR5eyrVFgJhC2RSdOhUUvdU9Oui8LDQ/5BCDYNuqSvjcWdU/blD 0RP6PIcdsE7mXqkCKr5lkLY+7wZXQ4cgdnKZ5buzHjShPSLJvOaQPWj8XTWnWRaxQQE3 y6Xg== MIME-Version: 1.0 X-Received: by 10.52.121.73 with SMTP id li9mr3236351vdb.34.1414208863229; Fri, 24 Oct 2014 20:47:43 -0700 (PDT) Received: by 10.220.238.14 with HTTP; Fri, 24 Oct 2014 20:47:43 -0700 (PDT) In-Reply-To: <544B12B8.8060302@freebsd.org> References: <544B12B8.8060302@freebsd.org> Date: Fri, 24 Oct 2014 23:47:43 
-0400 Message-ID: Subject: Re: ZFS errors on the array but not the disk. From: Zaphod Beeblebrox To: Steven Hartland Content-Type: text/plain; charset=UTF-8 X-Content-Filtered-By: Mailman/MimeDel 2.1.18-1 Cc: freebsd-fs X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.18-1 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Sat, 25 Oct 2014 03:47:44 -0000 Thanks for the heads up. I'm following releng/10.1 and 271683 seems to be part of that, but a good catch/guess. On Fri, Oct 24, 2014 at 11:02 PM, Steven Hartland wrote: > There was an issue which would cause resilver restarts fixed by *265253* < > https://svnweb.freebsd.org/base?view=revision&revision=265253> which was > MFC'ed to stable/10 by *271683* base?view=revision&revision=271683>so you'll want to make sure your > latter than that. > > > On 24/10/2014 19:42, Zaphod Beeblebrox wrote: > >> I manually replaced a disk... and the array was scrubbed recently. >> Interestingly, I seem to be in the "endless loop" of resilvering problem. >> Not much I can find on it. but resilvering will complete and I can then >> run another scrub. It will complete, too. Then rebooting causes another >> resilvering. >> >> Another odd data point: it seems as if the things that show up as "errors" >> change from resilvering to resilvering. >> >> One bug, it would seem, is that once ZFS has detected an error... another >> scrub can reset it, but no attempt is made to read-through the error if >> you >> access the object directly. >> >> On Fri, Oct 24, 2014 at 11:33 AM, Alan Somers >> wrote: >> >> On Thu, Oct 23, 2014 at 11:37 PM, Zaphod Beeblebrox >>> wrote: >>> >>>> What does it mean when checksum errors appear on the array (and the >>>> vdev) >>>> but not on any of the disks? See the paste below. 
One would think that
>>>> there isn't some ephemeral data stored somewhere that is not one of the
>>>> disks, yet "cksum" errors show only on the vdev and the array lines.
>>>> Help?
>>>>
>>>> [2:17:316]root@virtual:/vr2/torrent/in> zpool status
>>>>   pool: vr2
>>>>  state: ONLINE
>>>> status: One or more devices is currently being resilvered.  The pool will
>>>>         continue to function, possibly in a degraded state.
>>>> action: Wait for the resilver to complete.
>>>>   scan: resilver in progress since Thu Oct 23 23:11:29 2014
>>>>         1.53T scanned out of 22.6T at 62.4M/s, 98h23m to go
>>>>         119G resilvered, 6.79% done
>>>> config:
>>>>
>>>>         NAME               STATE     READ WRITE CKSUM
>>>>         vr2                ONLINE       0     0    36
>>>>           raidz1-0         ONLINE       0     0    72
>>>>             label/vr2-d0   ONLINE       0     0     0
>>>>             label/vr2-d1   ONLINE       0     0     0
>>>>             gpt/vr2-d2c    ONLINE       0     0     0  block size: 512B configured, 4096B native  (resilvering)
>>>>             gpt/vr2-d3b    ONLINE       0     0     0  block size: 512B configured, 4096B native
>>>>             gpt/vr2-d4a    ONLINE       0     0     0  block size: 512B configured, 4096B native
>>>>             ada14          ONLINE       0     0     0
>>>>             label/vr2-d6   ONLINE       0     0     0
>>>>             label/vr2-d7c  ONLINE       0     0     0
>>>>             label/vr2-d8   ONLINE       0     0     0
>>>>           raidz1-1         ONLINE       0     0     0
>>>>             gpt/vr2-e0     ONLINE       0     0     0  block size: 512B configured, 4096B native
>>>>             gpt/vr2-e1     ONLINE       0     0     0  block size: 512B configured, 4096B native
>>>>             gpt/vr2-e2     ONLINE       0     0     0  block size: 512B configured, 4096B native
>>>>             gpt/vr2-e3     ONLINE       0     0     0
>>>>             gpt/vr2-e4     ONLINE       0     0     0  block size: 512B configured, 4096B native
>>>>             gpt/vr2-e5     ONLINE       0     0     0  block size: 512B configured, 4096B native
>>>>             gpt/vr2-e6     ONLINE       0     0     0  block size: 512B configured, 4096B native
>>>>             gpt/vr2-e7     ONLINE       0     0     0  block size: 512B configured, 4096B native
>>>>
>>>> errors: 43 data errors, use '-v' for a list
>>>>
>>> The checksum errors will appear on the raidz vdev instead of a leaf if
>>> vdev_raidz.c can't determine which leaf vdev was responsible.
This >>> could happen if two or more leaf vdevs return bad data for the same >>> block, which would also lead to unrecoverable data errors. I see that >>> you have some unrecoverable data errors, so maybe that's what happened >>> to you. >>> >>> Subtle design bugs in ZFS can also lead to vdev_raidz.c being unable >>> to determine which child was responsible for a checksum error. >>> However, I've only seen that happen when a raidz vdev has a mirror >>> child. That can only happen if the child is a spare or replacing >>> vdev. Did you activate any spares, or did you manually replace a >>> vdev? >>> >>> -Alan >>> >>> _______________________________________________ >> freebsd-fs@freebsd.org mailing list >> http://lists.freebsd.org/mailman/listinfo/freebsd-fs >> To unsubscribe, send any mail to "freebsd-fs-unsubscribe@freebsd.org" >> >> >> > _______________________________________________ > freebsd-fs@freebsd.org mailing list > http://lists.freebsd.org/mailman/listinfo/freebsd-fs > To unsubscribe, send any mail to "freebsd-fs-unsubscribe@freebsd.org" > From owner-freebsd-fs@FreeBSD.ORG Sat Oct 25 05:53:57 2014 Return-Path: Delivered-To: freebsd-fs@FreeBSD.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by hub.freebsd.org (Postfix) with ESMTPS id 3F265B5D for ; Sat, 25 Oct 2014 05:53:57 +0000 (UTC) Received: from kenobi.freebsd.org (kenobi.freebsd.org [IPv6:2001:1900:2254:206a::16:76]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (Client did not present a certificate) by mx1.freebsd.org (Postfix) with ESMTPS id 26B456C0 for ; Sat, 25 Oct 2014 05:53:57 +0000 (UTC) Received: from bugs.freebsd.org ([127.0.1.118]) by kenobi.freebsd.org (8.14.9/8.14.9) with ESMTP id s9P5rv5M005223 for ; Sat, 25 Oct 2014 05:53:57 GMT (envelope-from bugzilla-noreply@freebsd.org) From: bugzilla-noreply@freebsd.org To: 
freebsd-fs@FreeBSD.org Subject: [Bug 194586] [zfs] kernel panic when running zpool/add/option-f_size_mismatch.t Date: Sat, 25 Oct 2014 05:53:57 +0000 X-Bugzilla-Reason: AssignedTo X-Bugzilla-Type: changed X-Bugzilla-Watch-Reason: None X-Bugzilla-Product: Base System X-Bugzilla-Component: kern X-Bugzilla-Version: 11.0-CURRENT X-Bugzilla-Keywords: X-Bugzilla-Severity: Affects Some People X-Bugzilla-Who: ngie@FreeBSD.org X-Bugzilla-Status: Needs Triage X-Bugzilla-Priority: --- X-Bugzilla-Assigned-To: freebsd-fs@FreeBSD.org X-Bugzilla-Target-Milestone: --- X-Bugzilla-Flags: X-Bugzilla-Changed-Fields: assigned_to Message-ID: In-Reply-To: References: Content-Type: text/plain; charset="UTF-8" Content-Transfer-Encoding: 7bit X-Bugzilla-URL: https://bugs.freebsd.org/bugzilla/ Auto-Submitted: auto-generated MIME-Version: 1.0 X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.18-1 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Sat, 25 Oct 2014 05:53:57 -0000 https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=194586 Garrett Cooper changed: What |Removed |Added ---------------------------------------------------------------------------- Assignee|freebsd-bugs@FreeBSD.org |freebsd-fs@FreeBSD.org -- You are receiving this mail because: You are the assignee for the bug. 
From owner-freebsd-fs@FreeBSD.ORG Sat Oct 25 05:59:14 2014 Return-Path: Delivered-To: freebsd-fs@FreeBSD.org Received: from mx1.freebsd.org (mx1.freebsd.org [8.8.178.115]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by hub.freebsd.org (Postfix) with ESMTPS id 389F5DE2 for ; Sat, 25 Oct 2014 05:59:14 +0000 (UTC) Received: from kenobi.freebsd.org (kenobi.freebsd.org [IPv6:2001:1900:2254:206a::16:76]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (Client did not present a certificate) by mx1.freebsd.org (Postfix) with ESMTPS id 2067A6F1 for ; Sat, 25 Oct 2014 05:59:14 +0000 (UTC) Received: from bugs.freebsd.org ([127.0.1.118]) by kenobi.freebsd.org (8.14.9/8.14.9) with ESMTP id s9P5xEZa009476 for ; Sat, 25 Oct 2014 05:59:14 GMT (envelope-from bugzilla-noreply@freebsd.org) From: bugzilla-noreply@freebsd.org To: freebsd-fs@FreeBSD.org Subject: [Bug 194587] [zfs] kernel panic when running zpool/add/open-f_type_mismatch.t Date: Sat, 25 Oct 2014 05:59:14 +0000 X-Bugzilla-Reason: AssignedTo X-Bugzilla-Type: changed X-Bugzilla-Watch-Reason: None X-Bugzilla-Product: Base System X-Bugzilla-Component: kern X-Bugzilla-Version: 11.0-CURRENT X-Bugzilla-Keywords: X-Bugzilla-Severity: Affects Some People X-Bugzilla-Who: ngie@FreeBSD.org X-Bugzilla-Status: Needs Triage X-Bugzilla-Priority: --- X-Bugzilla-Assigned-To: freebsd-fs@FreeBSD.org X-Bugzilla-Target-Milestone: --- X-Bugzilla-Flags: X-Bugzilla-Changed-Fields: assigned_to Message-ID: In-Reply-To: References: Content-Type: text/plain; charset="UTF-8" Content-Transfer-Encoding: 7bit X-Bugzilla-URL: https://bugs.freebsd.org/bugzilla/ Auto-Submitted: auto-generated MIME-Version: 1.0 X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.18-1 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Sat, 25 Oct 2014 05:59:14 -0000 
https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=194587 Garrett Cooper changed: What |Removed |Added ---------------------------------------------------------------------------- Assignee|freebsd-bugs@FreeBSD.org |freebsd-fs@FreeBSD.org -- You are receiving this mail because: You are the assignee for the bug. From owner-freebsd-fs@FreeBSD.ORG Sat Oct 25 06:07:06 2014 Return-Path: Delivered-To: freebsd-fs@FreeBSD.org Received: from mx1.freebsd.org (mx1.freebsd.org [8.8.178.115]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by hub.freebsd.org (Postfix) with ESMTPS id A6462EF9 for ; Sat, 25 Oct 2014 06:07:06 +0000 (UTC) Received: from kenobi.freebsd.org (kenobi.freebsd.org [IPv6:2001:1900:2254:206a::16:76]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (Client did not present a certificate) by mx1.freebsd.org (Postfix) with ESMTPS id 8DE3380E for ; Sat, 25 Oct 2014 06:07:06 +0000 (UTC) Received: from bugs.freebsd.org ([127.0.1.118]) by kenobi.freebsd.org (8.14.9/8.14.9) with ESMTP id s9P676Wk011560 for ; Sat, 25 Oct 2014 06:07:06 GMT (envelope-from bugzilla-noreply@freebsd.org) From: bugzilla-noreply@freebsd.org To: freebsd-fs@FreeBSD.org Subject: [Bug 194586] [zfs] kernel panic when running zpool/add/option-f_size_mismatch.t Date: Sat, 25 Oct 2014 06:07:06 +0000 X-Bugzilla-Reason: AssignedTo X-Bugzilla-Type: changed X-Bugzilla-Watch-Reason: None X-Bugzilla-Product: Base System X-Bugzilla-Component: kern X-Bugzilla-Version: 11.0-CURRENT X-Bugzilla-Keywords: X-Bugzilla-Severity: Affects Some People X-Bugzilla-Who: ngie@FreeBSD.org X-Bugzilla-Status: Needs Triage X-Bugzilla-Priority: --- X-Bugzilla-Assigned-To: freebsd-fs@FreeBSD.org X-Bugzilla-Target-Milestone: --- X-Bugzilla-Flags: X-Bugzilla-Changed-Fields: see_also Message-ID: In-Reply-To: References: Content-Type: text/plain; charset="UTF-8" Content-Transfer-Encoding: 7bit X-Bugzilla-URL: https://bugs.freebsd.org/bugzilla/ Auto-Submitted: 
auto-generated MIME-Version: 1.0 X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.18-1 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Sat, 25 Oct 2014 06:07:06 -0000 https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=194586 Garrett Cooper changed: What |Removed |Added ---------------------------------------------------------------------------- See Also| |https://bugs.freebsd.org/bu | |gzilla/show_bug.cgi?id=1915 | |74 -- You are receiving this mail because: You are the assignee for the bug. From owner-freebsd-fs@FreeBSD.ORG Sat Oct 25 06:07:06 2014 Return-Path: Delivered-To: freebsd-fs@FreeBSD.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by hub.freebsd.org (Postfix) with ESMTPS id 5F68EEF6 for ; Sat, 25 Oct 2014 06:07:06 +0000 (UTC) Received: from kenobi.freebsd.org (kenobi.freebsd.org [IPv6:2001:1900:2254:206a::16:76]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (Client did not present a certificate) by mx1.freebsd.org (Postfix) with ESMTPS id 475E280C for ; Sat, 25 Oct 2014 06:07:06 +0000 (UTC) Received: from bugs.freebsd.org ([127.0.1.118]) by kenobi.freebsd.org (8.14.9/8.14.9) with ESMTP id s9P676RJ011480 for ; Sat, 25 Oct 2014 06:07:06 GMT (envelope-from bugzilla-noreply@freebsd.org) From: bugzilla-noreply@freebsd.org To: freebsd-fs@FreeBSD.org Subject: [Bug 191573] [zfs] kernel panic when running zpool/add/files.t Date: Sat, 25 Oct 2014 06:07:06 +0000 X-Bugzilla-Reason: AssignedTo X-Bugzilla-Type: changed X-Bugzilla-Watch-Reason: None X-Bugzilla-Product: Base System X-Bugzilla-Component: kern X-Bugzilla-Version: 11.0-CURRENT X-Bugzilla-Keywords: X-Bugzilla-Severity: Affects Some People X-Bugzilla-Who: ngie@FreeBSD.org X-Bugzilla-Status: In Discussion X-Bugzilla-Priority: --- X-Bugzilla-Assigned-To: 
freebsd-fs@FreeBSD.org X-Bugzilla-Target-Milestone: --- X-Bugzilla-Flags: X-Bugzilla-Changed-Fields: see_also Message-ID: In-Reply-To: References: Content-Type: text/plain; charset="UTF-8" Content-Transfer-Encoding: 7bit X-Bugzilla-URL: https://bugs.freebsd.org/bugzilla/ Auto-Submitted: auto-generated MIME-Version: 1.0 X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.18-1 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Sat, 25 Oct 2014 06:07:06 -0000 https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=191573 Garrett Cooper changed: What |Removed |Added ---------------------------------------------------------------------------- See Also| |https://bugs.freebsd.org/bu | |gzilla/show_bug.cgi?id=1915 | |74 -- You are receiving this mail because: You are the assignee for the bug. From owner-freebsd-fs@FreeBSD.ORG Sat Oct 25 06:07:06 2014 Return-Path: Delivered-To: freebsd-fs@FreeBSD.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by hub.freebsd.org (Postfix) with ESMTPS id C9B7BEFA for ; Sat, 25 Oct 2014 06:07:06 +0000 (UTC) Received: from kenobi.freebsd.org (kenobi.freebsd.org [IPv6:2001:1900:2254:206a::16:76]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (Client did not present a certificate) by mx1.freebsd.org (Postfix) with ESMTPS id B16B280F for ; Sat, 25 Oct 2014 06:07:06 +0000 (UTC) Received: from bugs.freebsd.org ([127.0.1.118]) by kenobi.freebsd.org (8.14.9/8.14.9) with ESMTP id s9P676MB011620 for ; Sat, 25 Oct 2014 06:07:06 GMT (envelope-from bugzilla-noreply@freebsd.org) From: bugzilla-noreply@freebsd.org To: freebsd-fs@FreeBSD.org Subject: [Bug 194587] [zfs] kernel panic when running zpool/add/open-f_type_mismatch.t Date: Sat, 25 Oct 2014 06:07:06 +0000 X-Bugzilla-Reason: AssignedTo X-Bugzilla-Type: changed 
X-Bugzilla-Watch-Reason: None X-Bugzilla-Product: Base System X-Bugzilla-Component: kern X-Bugzilla-Version: 11.0-CURRENT X-Bugzilla-Keywords: X-Bugzilla-Severity: Affects Some People X-Bugzilla-Who: ngie@FreeBSD.org X-Bugzilla-Status: Needs Triage X-Bugzilla-Priority: --- X-Bugzilla-Assigned-To: freebsd-fs@FreeBSD.org X-Bugzilla-Target-Milestone: --- X-Bugzilla-Flags: X-Bugzilla-Changed-Fields: see_also Message-ID: In-Reply-To: References: Content-Type: text/plain; charset="UTF-8" Content-Transfer-Encoding: 7bit X-Bugzilla-URL: https://bugs.freebsd.org/bugzilla/ Auto-Submitted: auto-generated MIME-Version: 1.0 X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.18-1 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Sat, 25 Oct 2014 06:07:06 -0000 https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=194587 Garrett Cooper changed: What |Removed |Added ---------------------------------------------------------------------------- See Also| |https://bugs.freebsd.org/bu | |gzilla/show_bug.cgi?id=1915 | |74 -- You are receiving this mail because: You are the assignee for the bug. 
From owner-freebsd-fs@FreeBSD.ORG Sat Oct 25 06:10:14 2014 Return-Path: Delivered-To: freebsd-fs@FreeBSD.org Received: from mx1.freebsd.org (mx1.freebsd.org [8.8.178.115]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by hub.freebsd.org (Postfix) with ESMTPS id 0A32625E for ; Sat, 25 Oct 2014 06:10:14 +0000 (UTC) Received: from kenobi.freebsd.org (kenobi.freebsd.org [IPv6:2001:1900:2254:206a::16:76]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (Client did not present a certificate) by mx1.freebsd.org (Postfix) with ESMTPS id E58668A2 for ; Sat, 25 Oct 2014 06:10:13 +0000 (UTC) Received: from bugs.freebsd.org ([127.0.1.118]) by kenobi.freebsd.org (8.14.9/8.14.9) with ESMTP id s9P6ADnO037109 for ; Sat, 25 Oct 2014 06:10:13 GMT (envelope-from bugzilla-noreply@freebsd.org) From: bugzilla-noreply@freebsd.org To: freebsd-fs@FreeBSD.org Subject: [Bug 191573] [zfs] kernel panic when running zpool/add/files.t Date: Sat, 25 Oct 2014 06:10:14 +0000 X-Bugzilla-Reason: AssignedTo X-Bugzilla-Type: changed X-Bugzilla-Watch-Reason: None X-Bugzilla-Product: Base System X-Bugzilla-Component: kern X-Bugzilla-Version: 11.0-CURRENT X-Bugzilla-Keywords: X-Bugzilla-Severity: Affects Some People X-Bugzilla-Who: commit-hook@freebsd.org X-Bugzilla-Status: In Discussion X-Bugzilla-Priority: --- X-Bugzilla-Assigned-To: freebsd-fs@FreeBSD.org X-Bugzilla-Target-Milestone: --- X-Bugzilla-Flags: X-Bugzilla-Changed-Fields: Message-ID: In-Reply-To: References: Content-Type: text/plain; charset="UTF-8" Content-Transfer-Encoding: 7bit X-Bugzilla-URL: https://bugs.freebsd.org/bugzilla/ Auto-Submitted: auto-generated MIME-Version: 1.0 X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.18-1 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Sat, 25 Oct 2014 06:10:14 -0000 https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=191573 --- Comment 
#17 from commit-hook@freebsd.org --- A commit references this bug: Author: ngie Date: Sat Oct 25 06:10:02 UTC 2014 New revision: 273630 URL: https://svnweb.freebsd.org/changeset/base/273630 Log: Bail out of the script on FreeBSD due to deterministic panic issue PR: 191573 Sponsored by: EMC / Isilon Storage Division Changes: head/tools/regression/zfs/zpool/add/files.t -- You are receiving this mail because: You are the assignee for the bug. From owner-freebsd-fs@FreeBSD.ORG Sat Oct 25 06:29:16 2014 Return-Path: Delivered-To: freebsd-fs@FreeBSD.org Received: from mx1.freebsd.org (mx1.freebsd.org [8.8.178.115]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by hub.freebsd.org (Postfix) with ESMTPS id 6C3154DE for ; Sat, 25 Oct 2014 06:29:16 +0000 (UTC) Received: from kenobi.freebsd.org (kenobi.freebsd.org [IPv6:2001:1900:2254:206a::16:76]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (Client did not present a certificate) by mx1.freebsd.org (Postfix) with ESMTPS id 5354B99C for ; Sat, 25 Oct 2014 06:29:16 +0000 (UTC) Received: from bugs.freebsd.org ([127.0.1.118]) by kenobi.freebsd.org (8.14.9/8.14.9) with ESMTP id s9P6TG42024693 for ; Sat, 25 Oct 2014 06:29:16 GMT (envelope-from bugzilla-noreply@freebsd.org) From: bugzilla-noreply@freebsd.org To: freebsd-fs@FreeBSD.org Subject: [Bug 194586] [zfs] kernel panic when running zpool/add/option-f_size_mismatch.t Date: Sat, 25 Oct 2014 06:29:16 +0000 X-Bugzilla-Reason: AssignedTo X-Bugzilla-Type: changed X-Bugzilla-Watch-Reason: None X-Bugzilla-Product: Base System X-Bugzilla-Component: kern X-Bugzilla-Version: 11.0-CURRENT X-Bugzilla-Keywords: X-Bugzilla-Severity: Affects Some People X-Bugzilla-Who: commit-hook@freebsd.org X-Bugzilla-Status: Needs Triage X-Bugzilla-Priority: --- X-Bugzilla-Assigned-To: freebsd-fs@FreeBSD.org X-Bugzilla-Target-Milestone: --- X-Bugzilla-Flags: X-Bugzilla-Changed-Fields: Message-ID: In-Reply-To: References: Content-Type: 
text/plain; charset="UTF-8" Content-Transfer-Encoding: 7bit X-Bugzilla-URL: https://bugs.freebsd.org/bugzilla/ Auto-Submitted: auto-generated MIME-Version: 1.0 X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.18-1 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Sat, 25 Oct 2014 06:29:16 -0000 https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=194586 --- Comment #1 from commit-hook@freebsd.org --- A commit references this bug: Author: ngie Date: Sat Oct 25 06:28:49 UTC 2014 New revision: 273631 URL: https://svnweb.freebsd.org/changeset/base/273631 Log: Bail out of the script on FreeBSD due to deterministic panic issue PR: 194586 Sponsored by: EMC / Isilon Storage Division Changes: head/tools/regression/zfs/zpool/add/option-f_size_mismatch.t -- You are receiving this mail because: You are the assignee for the bug. From owner-freebsd-fs@FreeBSD.ORG Sat Oct 25 06:33:17 2014 Return-Path: Delivered-To: freebsd-fs@FreeBSD.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by hub.freebsd.org (Postfix) with ESMTPS id D2D6F6E6 for ; Sat, 25 Oct 2014 06:33:17 +0000 (UTC) Received: from kenobi.freebsd.org (kenobi.freebsd.org [IPv6:2001:1900:2254:206a::16:76]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (Client did not present a certificate) by mx1.freebsd.org (Postfix) with ESMTPS id BA41DA48 for ; Sat, 25 Oct 2014 06:33:17 +0000 (UTC) Received: from bugs.freebsd.org ([127.0.1.118]) by kenobi.freebsd.org (8.14.9/8.14.9) with ESMTP id s9P6XHMe061236 for ; Sat, 25 Oct 2014 06:33:17 GMT (envelope-from bugzilla-noreply@freebsd.org) From: bugzilla-noreply@freebsd.org To: freebsd-fs@FreeBSD.org Subject: [Bug 194587] [zfs] kernel panic when running zpool/add/open-f_type_mismatch.t Date: Sat, 25 Oct 2014 06:33:17 +0000 X-Bugzilla-Reason: 
AssignedTo X-Bugzilla-Type: changed X-Bugzilla-Watch-Reason: None X-Bugzilla-Product: Base System X-Bugzilla-Component: kern X-Bugzilla-Version: 11.0-CURRENT X-Bugzilla-Keywords: X-Bugzilla-Severity: Affects Some People X-Bugzilla-Who: commit-hook@freebsd.org X-Bugzilla-Status: Needs Triage X-Bugzilla-Priority: --- X-Bugzilla-Assigned-To: freebsd-fs@FreeBSD.org X-Bugzilla-Target-Milestone: --- X-Bugzilla-Flags: X-Bugzilla-Changed-Fields: Message-ID: In-Reply-To: References: Content-Type: text/plain; charset="UTF-8" Content-Transfer-Encoding: 7bit X-Bugzilla-URL: https://bugs.freebsd.org/bugzilla/ Auto-Submitted: auto-generated MIME-Version: 1.0 X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.18-1 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Sat, 25 Oct 2014 06:33:17 -0000 https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=194587 --- Comment #1 from commit-hook@freebsd.org --- A commit references this bug: Author: ngie Date: Sat Oct 25 06:33:01 UTC 2014 New revision: 273632 URL: https://svnweb.freebsd.org/changeset/base/273632 Log: Bail out of the script on FreeBSD due to deterministic panic issue PR: 194587 Sponsored by: EMC / Isilon Storage Division Changes: head/tools/regression/zfs/zpool/add/option-f_type_mismatch.t -- You are receiving this mail because: You are the assignee for the bug. 
From owner-freebsd-fs@FreeBSD.ORG Sat Oct 25 07:17:32 2014 Return-Path: Delivered-To: freebsd-fs@FreeBSD.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by hub.freebsd.org (Postfix) with ESMTPS id 06ABDFCA for ; Sat, 25 Oct 2014 07:17:32 +0000 (UTC) Received: from kenobi.freebsd.org (kenobi.freebsd.org [IPv6:2001:1900:2254:206a::16:76]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (Client did not present a certificate) by mx1.freebsd.org (Postfix) with ESMTPS id D4B3FD90 for ; Sat, 25 Oct 2014 07:17:31 +0000 (UTC) Received: from bugs.freebsd.org ([127.0.1.118]) by kenobi.freebsd.org (8.14.9/8.14.9) with ESMTP id s9P7HVgx028828 for ; Sat, 25 Oct 2014 07:17:31 GMT (envelope-from bugzilla-noreply@freebsd.org) From: bugzilla-noreply@freebsd.org To: freebsd-fs@FreeBSD.org Subject: [Bug 194589] [zfs] kernel panic when running zpool/create/files.t Date: Sat, 25 Oct 2014 07:17:31 +0000 X-Bugzilla-Reason: AssignedTo X-Bugzilla-Type: changed X-Bugzilla-Watch-Reason: None X-Bugzilla-Product: Base System X-Bugzilla-Component: kern X-Bugzilla-Version: 11.0-CURRENT X-Bugzilla-Keywords: X-Bugzilla-Severity: Affects Some People X-Bugzilla-Who: ngie@FreeBSD.org X-Bugzilla-Status: Needs Triage X-Bugzilla-Priority: --- X-Bugzilla-Assigned-To: freebsd-fs@FreeBSD.org X-Bugzilla-Target-Milestone: --- X-Bugzilla-Flags: X-Bugzilla-Changed-Fields: assigned_to Message-ID: In-Reply-To: References: Content-Type: text/plain; charset="UTF-8" Content-Transfer-Encoding: 7bit X-Bugzilla-URL: https://bugs.freebsd.org/bugzilla/ Auto-Submitted: auto-generated MIME-Version: 1.0 X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.18-1 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Sat, 25 Oct 2014 07:17:32 -0000 
https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=194589 Garrett Cooper changed: What |Removed |Added ---------------------------------------------------------------------------- Assignee|freebsd-bugs@FreeBSD.org |freebsd-fs@FreeBSD.org -- You are receiving this mail because: You are the assignee for the bug. From owner-freebsd-fs@FreeBSD.ORG Sat Oct 25 07:17:45 2014 Return-Path: Delivered-To: freebsd-fs@FreeBSD.org Received: from mx1.freebsd.org (mx1.freebsd.org [8.8.178.115]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by hub.freebsd.org (Postfix) with ESMTPS id 00F0B10E for ; Sat, 25 Oct 2014 07:17:44 +0000 (UTC) Received: from kenobi.freebsd.org (kenobi.freebsd.org [IPv6:2001:1900:2254:206a::16:76]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (Client did not present a certificate) by mx1.freebsd.org (Postfix) with ESMTPS id DCCAAD97 for ; Sat, 25 Oct 2014 07:17:44 +0000 (UTC) Received: from bugs.freebsd.org ([127.0.1.118]) by kenobi.freebsd.org (8.14.9/8.14.9) with ESMTP id s9P7HiOL029161 for ; Sat, 25 Oct 2014 07:17:44 GMT (envelope-from bugzilla-noreply@freebsd.org) From: bugzilla-noreply@freebsd.org To: freebsd-fs@FreeBSD.org Subject: [Bug 194588] [zfs] kernel panic when running zpool/remove/spare.t Date: Sat, 25 Oct 2014 07:17:45 +0000 X-Bugzilla-Reason: AssignedTo X-Bugzilla-Type: changed X-Bugzilla-Watch-Reason: None X-Bugzilla-Product: Base System X-Bugzilla-Component: kern X-Bugzilla-Version: 11.0-CURRENT X-Bugzilla-Keywords: X-Bugzilla-Severity: Affects Some People X-Bugzilla-Who: ngie@FreeBSD.org X-Bugzilla-Status: Needs Triage X-Bugzilla-Priority: --- X-Bugzilla-Assigned-To: freebsd-fs@FreeBSD.org X-Bugzilla-Target-Milestone: --- X-Bugzilla-Flags: X-Bugzilla-Changed-Fields: assigned_to Message-ID: In-Reply-To: References: Content-Type: text/plain; charset="UTF-8" Content-Transfer-Encoding: 7bit X-Bugzilla-URL: https://bugs.freebsd.org/bugzilla/ Auto-Submitted: auto-generated 
MIME-Version: 1.0 X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.18-1 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Sat, 25 Oct 2014 07:17:45 -0000 https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=194588 Garrett Cooper changed: What |Removed |Added ---------------------------------------------------------------------------- Assignee|freebsd-bugs@FreeBSD.org |freebsd-fs@FreeBSD.org -- You are receiving this mail because: You are the assignee for the bug. From owner-freebsd-fs@FreeBSD.ORG Sat Oct 25 07:21:23 2014 Return-Path: Delivered-To: freebsd-fs@FreeBSD.org Received: from mx1.freebsd.org (mx1.freebsd.org [8.8.178.115]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by hub.freebsd.org (Postfix) with ESMTPS id 5A10B33B for ; Sat, 25 Oct 2014 07:21:23 +0000 (UTC) Received: from kenobi.freebsd.org (kenobi.freebsd.org [IPv6:2001:1900:2254:206a::16:76]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (Client did not present a certificate) by mx1.freebsd.org (Postfix) with ESMTPS id 41C0DE46 for ; Sat, 25 Oct 2014 07:21:23 +0000 (UTC) Received: from bugs.freebsd.org ([127.0.1.118]) by kenobi.freebsd.org (8.14.9/8.14.9) with ESMTP id s9P7LNQP036884 for ; Sat, 25 Oct 2014 07:21:23 GMT (envelope-from bugzilla-noreply@freebsd.org) From: bugzilla-noreply@freebsd.org To: freebsd-fs@FreeBSD.org Subject: [Bug 194589] [zfs] kernel panic when running zpool/create/files.t Date: Sat, 25 Oct 2014 07:21:23 +0000 X-Bugzilla-Reason: AssignedTo X-Bugzilla-Type: changed X-Bugzilla-Watch-Reason: None X-Bugzilla-Product: Base System X-Bugzilla-Component: kern X-Bugzilla-Version: 11.0-CURRENT X-Bugzilla-Keywords: X-Bugzilla-Severity: Affects Some People X-Bugzilla-Who: commit-hook@freebsd.org X-Bugzilla-Status: Needs Triage X-Bugzilla-Priority: --- X-Bugzilla-Assigned-To: freebsd-fs@FreeBSD.org 
X-Bugzilla-Target-Milestone: --- X-Bugzilla-Flags: X-Bugzilla-Changed-Fields: Message-ID: In-Reply-To: References: Content-Type: text/plain; charset="UTF-8" Content-Transfer-Encoding: 7bit X-Bugzilla-URL: https://bugs.freebsd.org/bugzilla/ Auto-Submitted: auto-generated MIME-Version: 1.0 X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.18-1 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Sat, 25 Oct 2014 07:21:23 -0000 https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=194589 --- Comment #1 from commit-hook@freebsd.org --- A commit references this bug: Author: ngie Date: Sat Oct 25 07:20:47 UTC 2014 New revision: 273633 URL: https://svnweb.freebsd.org/changeset/base/273633 Log: Bail out of the script on FreeBSD due to deterministic panic issue PR: 194589 Sponsored by: EMC / Isilon Storage Division Changes: head/tools/regression/zfs/zpool/create/files.t -- You are receiving this mail because: You are the assignee for the bug. 
From owner-freebsd-fs@FreeBSD.ORG Sat Oct 25 09:51:12 2014 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [8.8.178.115]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by hub.freebsd.org (Postfix) with ESMTPS id 22B4A991; Sat, 25 Oct 2014 09:51:12 +0000 (UTC) Received: from smarthost1.greenhost.nl (smarthost1.greenhost.nl [195.190.28.81]) (using TLSv1 with cipher AES256-SHA (256/256 bits)) (Client did not present a certificate) by mx1.freebsd.org (Postfix) with ESMTPS id DBCC5DDA; Sat, 25 Oct 2014 09:51:11 +0000 (UTC) Received: from smtp.greenhost.nl ([213.108.104.138]) by smarthost1.greenhost.nl with esmtps (TLS1.0:RSA_AES_256_CBC_SHA1:32) (Exim 4.72) (envelope-from ) id 1Xhxzu-0006JY-0I; Sat, 25 Oct 2014 11:51:03 +0200 Content-Type: text/plain; charset=us-ascii; format=flowed; delsp=yes To: freebsd-arm@freebsd.org, "freebsd-fs@freebsd.org" Date: Sat, 25 Oct 2014 11:50:57 +0200 Subject: panic in nfs on arm MIME-Version: 1.0 Content-Transfer-Encoding: 7bit From: "Ronald Klop" Message-ID: User-Agent: Opera Mail/12.16 (FreeBSD) X-Authenticated-As-Hash: bdb49c4ff80bd276e321aade33e76e02752072e2 X-Virus-Scanned: by clamav at smarthost1.samage.net X-Spam-Level: / X-Spam-Score: -0.2 X-Spam-Status: No, score=-0.2 required=5.0 tests=ALL_TRUSTED, BAYES_50 autolearn=disabled version=3.3.1 X-Scan-Signature: 118046538f1968b7a7bc35ac7f8c9032 X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.18-1 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Sat, 25 Oct 2014 09:51:12 -0000 Hi, I got a panic on my arm computer while building a port with /usr/ports mounted from my FreeBSD-10-STABLE/amd64 machine. 
This is the machine which panicked: FreeBSD 11.0-CURRENT #1 r272028M: Tue Sep 23 17:11:45 CEST 2014 root@sjakie.klop.ws:/usr/obj-arm/arm.arm/usr/src-arm/sys/SHEEVAPLUG arm Tracing pid 90295 tid 100119 td 0xc5f8c960 db_trace_self() at db_trace_self pc = 0xc0bb12c8 lr = 0xc0bb1354 (db_trace_thread+0x50) sp = 0xdf29e5d0 fp = 0xc3e07120 db_trace_thread() at db_trace_thread+0x50 pc = 0xc0bb1354 lr = 0xc0936314 (db_command_init+0x5a4) sp = 0xdf29e630 fp = 0xc3e07120 db_command_init() at db_command_init+0x5a4 pc = 0xc0936314 lr = 0xc0935ad0 (db_skip_to_eol+0x484) sp = 0xdf29e648 fp = 0xc3e07120 r4 = 0xc0c8d350 r5 = 0x00000000 db_skip_to_eol() at db_skip_to_eol+0x484 pc = 0xc0935ad0 lr = 0xc0935c38 (db_command_loop+0x5c) sp = 0xdf29e6e8 fp = 0xc3e07120 r4 = 0xdf29e6fc r5 = 0xc0c8d64c r6 = 0x3cd90e75 r7 = 0x00000000 r8 = 0x00000001 r10 = 0x600000d3 db_command_loop() at db_command_loop+0x5c pc = 0xc0935c38 lr = 0xc0937f80 (X_db_sym_numargs+0xec) sp = 0xdf29e6f0 fp = 0xc3e07120 X_db_sym_numargs() at X_db_sym_numargs+0xec pc = 0xc0937f80 lr = 0xc0a6f0c0 (kdb_trap+0x94) sp = 0xdf29e808 fp = 0xc3e07120 r4 = 0xdf29e8f8 kdb_trap() at kdb_trap+0x94 pc = 0xc0a6f0c0 lr = 0xc0bc1d60 (badaddr_read+0x274) sp = 0xdf29e828 fp = 0xc3e07120 r4 = 0xdf29e8f8 r5 = 0x00000001 r6 = 0x3cd90e75 r7 = 0xc5f8c960 r8 = 0xdf29e8f8 r10 = 0xdf2a1eb0 badaddr_read() at badaddr_read+0x274 pc = 0xc0bc1d60 lr = 0xc0bc1e98 (badaddr_read+0x3ac) sp = 0xdf29e840 fp = 0xc3e07120 r4 = 0xc5f8c960 r5 = 0xdf29e8f8 r6 = 0x3cd90e05 badaddr_read() at badaddr_read+0x3ac pc = 0xc0bc1e98 lr = 0xc0bc2278 (data_abort_handler+0x10c) sp = 0xdf29e858 fp = 0xc3e07120 r4 = 0xc0cd8af8 r5 = 0xffff1004 data_abort_handler() at data_abort_handler+0x10c pc = 0xc0bc2278 lr = 0xc0bb2f40 (exception_exit) sp = 0xdf29e8f8 fp = 0xc3e07120 r4 = 0xffffffff r5 = 0xffff1004 r6 = 0x3cd90e05 r7 = 0xc0e0ea48 r8 = 0x0000000f r9 = 0x00000101 r10 = 0x0000001d exception_exit() at exception_exit pc = 0xc0bb2f40 lr = 0xc0b8daf8 (uma_reclaim+0x1f8) sp = 
0xdf29e948 fp = 0xc3e07120 r0 = 0xba9b9127 r1 = 0x8b3de5fb r2 = 0xc61c1fc8 r3 = 0xba9b9126 r4 = 0x00000000 r5 = 0xc61c1fc8 r6 = 0x3cd90e05 r7 = 0xc0e0ea48 r8 = 0x0000000f r9 = 0x00000101 r10 = 0x0000001d r12 = 0x00000000 uma_reclaim() at uma_reclaim+0x24c pc = 0xc0b8db4c lr = 0xc0b8c800 (uma_zalloc_arg+0x2f0) sp = 0xdf29e978 fp = 0xdf29ec10 r4 = 0xc3e071d8 r5 = 0xc0e0ea00 r6 = 0xc3e07120 r7 = 0x00000000 r8 = 0x00000102 r9 = 0xdf29ecf8 r10 = 0xc61c0760 uma_zalloc_arg() at uma_zalloc_arg+0x2f0 pc = 0xc0b8c800 lr = 0xc09e1df0 (nfscl_nget+0x308) sp = 0xdf29e990 fp = 0xdf29ec10 r4 = 0x9bb9fa43 r5 = 0x00000000 r6 = 0xc550dce8 r7 = 0xc3edaa00 r8 = 0xc3ebbac0 nfscl_nget() at nfscl_nget+0x308 pc = 0xc09e1df0 lr = 0xc09da69c (ncl_readlinkrpc+0xf60) sp = 0xdf29e9d8 fp = 0xdf29ea10 r4 = 0xc550dce8 r5 = 0x00000000 r6 = 0xc550dcf8 r7 = 0xdf29ecf8 r8 = 0xdf29ec6c r9 = 0x00000000 r10 = 0xdf29ed28 ncl_readlinkrpc() at ncl_readlinkrpc+0xf60 pc = 0xc09da69c lr = 0xc0bdae44 (VOP_MKDIR_APV+0x94) sp = 0xdf29ec40 fp = 0xbffff620 r4 = 0xc0c95c68 r5 = 0xdf29ec6c r6 = 0x00000001 r7 = 0x00020284 r8 = 0xffffff9c r9 = 0x00200800 r10 = 0xc5f8c960 VOP_MKDIR_APV() at VOP_MKDIR_APV+0x94 pc = 0xc0bdae44 lr = 0xc0aca614 (kern_mkdirat+0x18c) sp = 0xdf29ec50 fp = 0xbffff620 r4 = 0xdf29ed28 r5 = 0xdf29ec90 r6 = 0x00000000 kern_mkdirat() at kern_mkdirat+0x18c pc = 0xc0aca614 lr = 0xc0aca684 (kern_mkdir+0x24) sp = 0xdf29ede0 fp = 0xbffff620 r4 = 0x00020290 r5 = 0xc5f8c960 r6 = 0x00000000 r7 = 0xc5f7f000 r8 = 0x00000000 r10 = 0x00013640 kern_mkdir() at kern_mkdir+0x24 pc = 0xc0aca684 lr = 0xc0aca6a8 (sys_mkdir+0x1c) sp = 0xdf29edf0 fp = 0xbffff620 sys_mkdir() at sys_mkdir+0x1c pc = 0xc0aca6a8 lr = 0xc0bc2884 (swi_handler+0x254) sp = 0xdf29edf8 fp = 0xbffff620 swi_handler() at swi_handler+0x254 pc = 0xc0bc2884 lr = 0xc0bb2ed0 (swi_exit) sp = 0xdf29ee60 fp = 0xbffff620 r4 = 0x00020290 r5 = 0x2085e8e0 r6 = 0x00020284 r7 = 0x00000088 r8 = 0x00000001 swi_exit() at swi_exit pc = 0xc0bb2ed0 lr = 0xc0bb2ed0 
(swi_exit) sp = 0xdf29ee60 fp = 0xbffff620 Unable to unwind further Unfortunately dumping the kernel core also panicked. db> dump Physical memory: 507 MB Dumping 74 MB: 71 67 63 vm_fault(0xc4147000, 0, 1, 0) -> 0 Fatal kernel mode data abort: 'Translation Fault (P)' trapframe: 0xdf29e0b8 FSR=00000017, FAR=00000014, spsr=a00000d3 r0 =c0cd0f40, r1 =00000000, r2 =c5f8c960, r3 =00000004 r4 =00000000, r5 =00000000, r6 =00000000, r7 =c3ead01c r8 =c3ead000, r9 =c3e9e88c, r10=00000000, r11=0000000a r12=600000d3, ssp=df29e108, slr=c0bb4e24, pc =c0a7d060 panic: Fatal abort Uptime: 3d18h30m32s Sleeping thread (tid 100119, pid 90295) owns a non-sleepable lock From owner-freebsd-fs@FreeBSD.ORG Sat Oct 25 14:20:15 2014 Return-Path: Delivered-To: fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by hub.freebsd.org (Postfix) with ESMTPS id D2119B24; Sat, 25 Oct 2014 14:20:15 +0000 (UTC) Received: from vps1.elischer.org (vps1.elischer.org [204.109.63.16]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (Client CN "vps1.elischer.org", Issuer "CA Cert Signing Authority" (not verified)) by mx1.freebsd.org (Postfix) with ESMTPS id AD2A2C32; Sat, 25 Oct 2014 14:20:15 +0000 (UTC) Received: from jre-mbp.elischer.org (ppp121-45-234-114.lns20.per1.internode.on.net [121.45.234.114]) (authenticated bits=0) by vps1.elischer.org (8.14.9/8.14.9) with ESMTP id s9PEKAG8079602 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES128-SHA bits=128 verify=NO); Sat, 25 Oct 2014 07:20:12 -0700 (PDT) (envelope-from julian@freebsd.org) Message-ID: <544BB194.3010705@freebsd.org> Date: Sat, 25 Oct 2014 22:20:04 +0800 From: Julian Elischer User-Agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10.9; rv:24.0) Gecko/20100101 Thunderbird/24.6.0 MIME-Version: 1.0 To: "K. Macy" Subject: Re: change in VFS layer API? 
References: <544A823A.1080304@freebsd.org> <544B0F24.4060500@freebsd.org> In-Reply-To: <544B0F24.4060500@freebsd.org> Content-Type: text/plain; charset=ISO-8859-1; format=flowed Content-Transfer-Encoding: 7bit X-Content-Filtered-By: Mailman/MimeDel 2.1.18-1 Cc: "freebsd-fs@FreeBSD.org" X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.18-1 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Sat, 25 Oct 2014 14:20:16 -0000 On 10/25/14, 10:47 AM, Julian Elischer wrote: > On 10/25/14, 1:18 AM, K. Macy wrote: >> On Fri, Oct 24, 2014 at 9:45 AM, Julian Elischer >> wrote: >>> Can anyone point me at a VFS API contract change that occurred >>> over the last >>> 5 years where a filesystem written to the old contract would end >>> up with >>> extra references to all its vnodes/objects? Specifically a >>> proprietary >>> filesystem that ran on 8.0 now can be compiled but ends up with extra >>> references on its vnodes and can not free them. >>> >> I think the contract for some functions has become unclear. I've found >> that the opensolaris' compatibility layer traverse' vput of the >> initial vnode passed in triggers negative reference count panics. It >> is clear that some callers of lookup expect the reference to be >> maintained on error so the unconditional vput was (well is - this >> patch isn't in base) wrong, but in the case of success it isn't clear. >> Doing the vput on success will still eventually (as in a few seconds >> of this torture test script) cause a negative reference count panic. I >> think there needs to be an audit of VFS function contract compliance. >> Preferably by someone who knows what they are. I can only infer from >> cumulative context. > > I have evidence that the API has actually changed somehow. > > The old API would have extra calls to remove references to nodes. > We don't seem to be seeing those extra calls any more. > I'll have more info later. 
My colleague who works on the filesystem in question suspects a change in the interplay between _inactive and _reclaim of a znode/vnode, resulting in extra references on the nodes. Doesn't ring any bells with anyone? >> Thanks. >> >> -K >> >> > > _______________________________________________ > freebsd-fs@freebsd.org mailing list > http://lists.freebsd.org/mailman/listinfo/freebsd-fs > To unsubscribe, send any mail to "freebsd-fs-unsubscribe@freebsd.org" > > From owner-freebsd-fs@FreeBSD.ORG Sat Oct 25 15:02:53 2014 Return-Path: Delivered-To: freebsd-fs@FreeBSD.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by hub.freebsd.org (Postfix) with ESMTPS id 93FD12CB; Sat, 25 Oct 2014 15:02:53 +0000 (UTC) Received: from mail.madpilot.net (grunt.madpilot.net [78.47.145.38]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (Client did not present a certificate) by mx1.freebsd.org (Postfix) with ESMTPS id 39C47F3; Sat, 25 Oct 2014 15:02:52 +0000 (UTC) Received: from mail (mail [192.168.254.3]) by mail.madpilot.net (Postfix) with ESMTP id 3jQ5Bw4hRSzb3H; Sat, 25 Oct 2014 17:02:32 +0200 (CEST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=madpilot.net; h= content-transfer-encoding:content-type:content-type:in-reply-to :references:subject:subject:mime-version:user-agent:from:from :date:date:message-id:received:received; s=mail; t=1414249350; x=1416063751; bh=jOdZQts3DDqVZVbV67r+Q6l7Fu5JyaGlJTvk93/BPow=; b= ZLiJ06sHMeE3nSEbWoUcQ88PDpVX7/I150LBmedPGyAkh0Rfje3qPR1XblGbG9mn ubeRY92hwF0G1E6fB3ODR5GUMSH4yfD6Ex1O8+54a4L9tMg0EDI7voKs7qor/cN0 EU9SEoKGWdi2qJf3Mcaxby87A/opshHyfYKO1Ul65+A= Received: from mail.madpilot.net ([192.168.254.3]) by mail (mail.madpilot.net [192.168.254.3]) (amavisd-new, port 10024) with ESMTP id vHBL_i1R6wKW; Sat, 25 Oct 2014 17:02:30 +0200 (CEST) Received: from tommy.madpilot.net (micro.madpilot.net 
[88.149.173.206]) by mail.madpilot.net (Postfix) with ESMTPSA; Sat, 25 Oct 2014 17:02:30 +0200 (CEST) Message-ID: <544BBB85.2020909@madpilot.net> Date: Sat, 25 Oct 2014 17:02:29 +0200 From: Guido Falsi User-Agent: Mozilla/5.0 (X11; FreeBSD amd64; rv:31.0) Gecko/20100101 Thunderbird/31.2.0 MIME-Version: 1.0 To: FreeBSD FS Subject: Re: panic: detach with active requests on 10.1-RC3 References: <544A538F.6060202@FreeBSD.org> In-Reply-To: <544A538F.6060202@FreeBSD.org> Content-Type: text/plain; charset=windows-1252 Content-Transfer-Encoding: 7bit Cc: Glen Barber , freebsd-stable@FreeBSD.org X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.18-1 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Sat, 25 Oct 2014 15:02:53 -0000 On 10/24/14 15:26, Guido Falsi wrote: > Hi, > > I'm making some experiments with 10.1-RC3 on alix boards as hardware > using NanoBSD. > > By mounting and umounting UFS filesystems I have seen umount constantly > hanging hard in a deadlock. I have tested on two boards with two > distinct compactflash disks with same results. This was not happening > with 10.0-RELEASE. > > I have build a 10.1-RC3 kernel with full debugging and caused the > problem to happen, I got this: > > root@qtest:~ [0]# umount /cfg > panic: detach with active requests > KDB: stack backtrace: > db_trace_self_wrapper(c0968053,c08ea7f0,c2d48800,c23d6bc8,c0536a16,...) > at db_trace_self_wrapper+0x2d/frame 0xc23d6b98 > kdb_backtrace(c09639e1,c09fa7e8,c095761d,c23d6c54,c095761d,...) at > kdb_backtrace+0x30/frame 0xc23d6c00 > vpanic(c09fa682,100,c095761d,c23d6c54,c23d6c54,...) at vpanic+0x80/frame > 0xc23d6c24 > kassert_panic(c095761d,c09575b3,c2d7acc0,4c7,c2d7acc0,...) at > kassert_panic+0xe9/frame 0xc23d6c48 > g_detach(c2d7acc0,4,c095725c,1c2,c09c8d5c,...) at g_detach+0x1d3/frame > 0xc23d6c64 > g_wither_washer(c09f7df4,0,c0956544,124,0,...) 
at > g_wither_washer+0x109/frame 0xc23d6c90 > g_run_events(0,c23d6d08,c095d42a,3dc,0,...) at g_run_events+0x40/frame > 0xc23d6ccc > fork_exit(c05c4e60,0,c23d6d08) at fork_exit+0x7f/frame 0xc23d6cf4 > fork_trampoline() at fork_trampoline+0x8/frame 0xc23d6cf4 > --- trap 0, eip = 0, esp = 0xc23d6d40, ebp = 0 --- > KDB: enter: panic > [ thread pid 12 tid 100006 ] > Stopped at kdb_enter+0x3d: movl $0,kdb_why > db> > I tried to investigate some more by myself. Maybe what I found is obvious to anyone with decent VFS knowledge, anyway: After some fumbling around I did: db> show geom 0xc2e98b40 consumer: 0xc2e98b40 class: VFS (0xc09c8d5c) geom: ffs.ada0s3 (0xc3293600) provider: ada0s3 (0xc2e7e200) access: r0w0e0 flags: 0x0030 nstart: 19 nend: 18 Which shows nstart != nend, while g_detach asserts them to be the same. Going up the chain of providers I find that its providers also have nstart - nend == 1: db> show geom 0xc2e9b7c0 consumer: 0xc2e9b7c0 class: PART (0xc09c96b0) geom: ada0 (0xc2e7e780) provider: ada0 (0xc2e7e500) access: r2w0e0 flags: 0x0030 nstart: 1430 nend: 1429 db> show geom 0xc2e7e500 provider: ada0 (0xc2e7e500) class: DISK (0xc09c8890) geom: ada0 (0xc2e7e580) mediasize: 4017807360 sectorsize: 512 stripesize: 0 stripeoffset: 0 access: r2w0e0 flags: (0x0030) error: 0 nstart: 2085 nend: 2084 consumer: 0xc2e9a700 (ada0), access=r0w0e0, flags=0x0030 consumer: 0xc2e9b480 (ada0), access=r0w0e0, flags=0x0030 consumer: 0xc2e9b7c0 (ada0), access=r2w0e0, flags=0x0030 Looking at the code these values are touched only in g_io_request() and g_io_deliver() respectively. So this one now looks like a geom problem. In fact the only commit which touched those functions between 10.0 and 10.1 branches is r260385, which merged quite a few things. I've tried reverting it to test without that, but "svn merge -c -260385 ." generated a few conflicts I'm unable to resolve. So I need some guidance even to perform this simple test. 
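The counter relationship described above can be sketched in user space. This is only a minimal model under stated assumptions, not the kernel code: the struct and every `model_*` name are hypothetical, and the only facts taken from the message are that nstart is touched in g_io_request(), nend in g_io_deliver(), and that g_detach asserts the two are equal.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for a GEOM consumer; only the two counters
 * discussed in the message are modelled. */
struct model_consumer {
	unsigned nstart;	/* requests issued (bumped on the request path) */
	unsigned nend;		/* requests completed (bumped on the deliver path) */
};

/* Models the request path: one more I/O is now outstanding. */
void
model_io_request(struct model_consumer *cp)
{
	cp->nstart++;
}

/* Models the completion path: a completion must match an earlier request. */
void
model_io_deliver(struct model_consumer *cp)
{
	assert(cp->nend < cp->nstart);
	cp->nend++;
}

/* Number of requests issued but not yet completed. */
unsigned
model_in_flight(const struct model_consumer *cp)
{
	return (cp->nstart - cp->nend);
}

/* The condition the panic complains about: detaching is only safe
 * when every issued request has been delivered. */
bool
model_can_detach(const struct model_consumer *cp)
{
	return (cp->nstart == cp->nend);
}
```

With the values shown for the ffs.ada0s3 consumer (nstart 19, nend 18) this model reports one request in flight, so the detach check fails; the same holds for the PART consumer (1430/1429) and the DISK provider (2085/2084).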
> > The machine is sitting there, I am connected with serial console, anyone > willing to help me debug this further? I really know very little about > kernel debugging. If necessary I can also make myself available via IRC > or Jabber. > > It looks like this has some similarities with what was reported here: > > https://lists.freebsd.org/pipermail/freebsd-fs/2014-September/020035.html > > I also tested with head (including r272130) and it does deadlock the same. > After the analysis above I think that there really is no similarity with the problem reported by bdrewery. -- Guido Falsi From owner-freebsd-fs@FreeBSD.ORG Sat Oct 25 16:02:38 2014 Return-Path: Delivered-To: freebsd-fs@FreeBSD.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by hub.freebsd.org (Postfix) with ESMTPS id CCCC4397; Sat, 25 Oct 2014 16:02:38 +0000 (UTC) Received: from mail.madpilot.net (grunt.madpilot.net [78.47.145.38]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (Client did not present a certificate) by mx1.freebsd.org (Postfix) with ESMTPS id 57E868F8; Sat, 25 Oct 2014 16:02:38 +0000 (UTC) Received: from mail (mail [192.168.254.3]) by mail.madpilot.net (Postfix) with ESMTP id 3jQ6X23q2czb3G; Sat, 25 Oct 2014 18:02:26 +0200 (CEST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=madpilot.net; h= content-transfer-encoding:content-type:content-type:in-reply-to :references:subject:subject:mime-version:user-agent:from:from :date:date:message-id:received:received; s=mail; t=1414252944; x=1416067345; bh=+i1X43VeKO7bXtQEVE9+Ku3mMcwriLVb+P2MoHp/1no=; b= CYzE/rcEydjxWcDIACQpcjFglrIRxIwoyLMYAAVl6pCvPTw9A3covUNc6UXMIKaG SKJQ9HKW4mh6MOadMsrhc/U7Df7dc2o07XbuoZMb1a8u/z6FG+o6wK1/BoIUogw5 ow8T6W77IsR93vpbO2shLGDfd41hXkFdVns0tfheoaI= Received: from mail.madpilot.net ([192.168.254.3]) by mail (mail.madpilot.net [192.168.254.3]) (amavisd-new, port 
10024) with ESMTP id GritJvivHpbs; Sat, 25 Oct 2014 18:02:24 +0200 (CEST) Received: from tommy.madpilot.net (micro.madpilot.net [88.149.173.206]) by mail.madpilot.net (Postfix) with ESMTPSA; Sat, 25 Oct 2014 18:02:24 +0200 (CEST) Message-ID: <544BC990.4030700@madpilot.net> Date: Sat, 25 Oct 2014 18:02:24 +0200 From: Guido Falsi User-Agent: Mozilla/5.0 (X11; FreeBSD amd64; rv:31.0) Gecko/20100101 Thunderbird/31.2.0 MIME-Version: 1.0 To: FreeBSD FS Subject: Re: panic: detach with active requests on 10.1-RC3 References: <544A538F.6060202@FreeBSD.org> <544BBB85.2020909@madpilot.net> In-Reply-To: <544BBB85.2020909@madpilot.net> Content-Type: text/plain; charset=windows-1252 Content-Transfer-Encoding: 7bit Cc: Glen Barber , freebsd-stable@FreeBSD.org X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.18-1 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Sat, 25 Oct 2014 16:02:38 -0000 On 10/25/14 17:02, Guido Falsi wrote: > On 10/24/14 15:26, Guido Falsi wrote: >> Hi, >> >> I'm making some experiments with 10.1-RC3 on alix boards as hardware >> using NanoBSD. >> >> By mounting and umounting UFS filesystems I have seen umount constantly >> hanging hard in a deadlock. I have tested on two boards with two >> distinct compactflash disks with same results. This was not happening >> with 10.0-RELEASE. >> >> I have build a 10.1-RC3 kernel with full debugging and caused the >> problem to happen, I got this: >> >> root@qtest:~ [0]# umount /cfg >> panic: detach with active requests >> KDB: stack backtrace: >> db_trace_self_wrapper(c0968053,c08ea7f0,c2d48800,c23d6bc8,c0536a16,...) >> at db_trace_self_wrapper+0x2d/frame 0xc23d6b98 >> kdb_backtrace(c09639e1,c09fa7e8,c095761d,c23d6c54,c095761d,...) at >> kdb_backtrace+0x30/frame 0xc23d6c00 >> vpanic(c09fa682,100,c095761d,c23d6c54,c23d6c54,...) at vpanic+0x80/frame >> 0xc23d6c24 >> kassert_panic(c095761d,c09575b3,c2d7acc0,4c7,c2d7acc0,...) 
at >> kassert_panic+0xe9/frame 0xc23d6c48 >> g_detach(c2d7acc0,4,c095725c,1c2,c09c8d5c,...) at g_detach+0x1d3/frame >> 0xc23d6c64 >> g_wither_washer(c09f7df4,0,c0956544,124,0,...) at >> g_wither_washer+0x109/frame 0xc23d6c90 >> g_run_events(0,c23d6d08,c095d42a,3dc,0,...) at g_run_events+0x40/frame >> 0xc23d6ccc >> fork_exit(c05c4e60,0,c23d6d08) at fork_exit+0x7f/frame 0xc23d6cf4 >> fork_trampoline() at fork_trampoline+0x8/frame 0xc23d6cf4 >> --- trap 0, eip = 0, esp = 0xc23d6d40, ebp = 0 --- >> KDB: enter: panic >> [ thread pid 12 tid 100006 ] >> Stopped at kdb_enter+0x3d: movl $0,kdb_why >> db> >> > > I tried to investigate some more by myself. Maybe what I found is > obvious to anyone with decent VFS knowledge, anyway: > > After some fumbling around I did: > > db> show geom 0xc2e98b40 > consumer: 0xc2e98b40 > class: VFS (0xc09c8d5c) > geom: ffs.ada0s3 (0xc3293600) > provider: ada0s3 (0xc2e7e200) > access: r0w0e0 > flags: 0x0030 > nstart: 19 > nend: 18 > > Which shows nstart != nend, while g_detach asserts them to be the same. > > Going up the chain of providers I find also it's providers have nstart - > nend == 1: > > db> show geom 0xc2e9b7c0 > consumer: 0xc2e9b7c0 > class: PART (0xc09c96b0) > geom: ada0 (0xc2e7e780) > provider: ada0 (0xc2e7e500) > access: r2w0e0 > flags: 0x0030 > nstart: 1430 > nend: 1429 > db> show geom 0xc2e7e500 > provider: ada0 (0xc2e7e500) > class: DISK (0xc09c8890) > geom: ada0 (0xc2e7e580) > mediasize: 4017807360 > sectorsize: 512 > stripesize: 0 > stripeoffset: 0 > access: r2w0e0 > flags: (0x0030) > error: 0 > nstart: 2085 > nend: 2084 > consumer: 0xc2e9a700 (ada0), access=r0w0e0, flags=0x0030 > consumer: 0xc2e9b480 (ada0), access=r0w0e0, flags=0x0030 > consumer: 0xc2e9b7c0 (ada0), access=r2w0e0, flags=0x0030 > > Looking at the code these values are touched only in g_io_request() and > g_io_deliver() respectively. > > So this one now looks like a geom problem. 
> > In fact the only commit which touched those functions between 10.0 and > 10.1 branches is r260385, which merged quite a few things. > > I've tried reverting it to test without that, but "svn merge -c -260385 > ." generated a few conflicts I'm unable to resolve. So I need some > guidance even to perform this simple test. > I finally succeeded in merging it well enough to compile and boot, and got the same panic, so even this commit looks unrelated. I must admit I am out of ideas. -- Guido Falsi From owner-freebsd-fs@FreeBSD.ORG Sat Oct 25 21:22:14 2014 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by hub.freebsd.org (Postfix) with ESMTPS id 7299990 for ; Sat, 25 Oct 2014 21:22:14 +0000 (UTC) Received: from keltia.net (cl-90.mrs-01.fr.sixxs.net [IPv6:2a01:240:fe00:59::2]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (Client did not present a certificate) by mx1.freebsd.org (Postfix) with ESMTPS id 3723ABFC for ; Sat, 25 Oct 2014 21:22:13 +0000 (UTC) Received: from lonrach.local (foret.keltia.net [78.232.116.160]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) (Authenticated sender: roberto) by keltia.net (Postfix) with ESMTPSA id 96108529E for ; Sat, 25 Oct 2014 23:22:01 +0200 (CEST) Date: Sat, 25 Oct 2014 23:21:51 +0200 From: Ollivier Robert To: freebsd-fs@freebsd.org Subject: Re: Hammer fs Message-ID: <20141025212150.GA23731@lonrach.local> References: <20141023223457.GA29206@bsdjunk.com> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <20141023223457.GA29206@bsdjunk.com> X-Operating-System: MacOS X / MBP 4,1 - FreeBSD 8.0 / T3500-E5520 Nehalem User-Agent: Mutt/1.5.23 (2014-03-12) X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.18-1 
Precedence: list
List-Id: Filesystems
X-List-Received-Date: Sat, 25 Oct 2014 21:22:14 -0000

According to Christopher Petrik:
> One way to improve FreeBSD is to look at the ideas page and act upon it.
> I've used dragonfly and decided it is time to improve my C skills by
> porting hammer to FreeBSD. This will be a very time-consuming process
> which will take time. However it would be nice to have this option
> during install since it brings in some nice features.

Be aware that the DFly VFS layer is very different from the FreeBSD one,
which means that either will need big changes to support the other.

-- 
Ollivier ROBERT -=- FreeBSD: The Power to Serve! -=- roberto@keltia.freenix.fr
In memoriam to Ondine : http://ondine.keltia.net/

From owner-freebsd-fs@FreeBSD.ORG Sat Oct 25 22:17:06 2014
Return-Path:
Delivered-To: fs@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by hub.freebsd.org (Postfix) with ESMTPS id C41FEC3B; Sat, 25 Oct 2014 22:17:06 +0000 (UTC)
Received: from esa-jnhn.mail.uoguelph.ca (esa-jnhn.mail.uoguelph.ca [131.104.91.44]) by mx1.freebsd.org (Postfix) with ESMTP id 67995124; Sat, 25 Oct 2014 22:17:05 +0000 (UTC)
X-IronPort-Anti-Spam-Filtered: true
X-IronPort-Anti-Spam-Result: ArMEAHUgTFSDaFve/2dsb2JhbABcg2JYBIMCykYKhnlUAoEbAX2EAwEBBAEBASArIAsbGAICDRkCKQEJJgYIBwQBHASIIA2zc5Q1AQEBAQEBBAEBAQEBAQEbgSyPCwEBGzQHgneBVAWWT4QOhHGUQYQUIS8HgQg5gQMBAQE
X-IronPort-AV: E=Sophos;i="5.04,788,1406606400"; d="scan'208";a="162255228"
Received: from muskoka.cs.uoguelph.ca (HELO zcs3.mail.uoguelph.ca) ([131.104.91.222]) by esa-jnhn.mail.uoguelph.ca with ESMTP; 25 Oct 2014 18:17:05 -0400
Received: from zcs3.mail.uoguelph.ca (localhost.localdomain [127.0.0.1]) by zcs3.mail.uoguelph.ca (Postfix) with ESMTP id 1BCA3B3F4B; Sat, 25 Oct 2014 18:17:05 -0400 (EDT)
Date: Sat, 25
Oct 2014 18:17:05 -0400 (EDT)
From: Rick Macklem
To: Julian Elischer
Message-ID: <1831388944.7492104.1414275425108.JavaMail.root@uoguelph.ca>
In-Reply-To: <544BB194.3010705@freebsd.org>
Subject: Re: change in VFS layer API?
MIME-Version: 1.0
Content-Type: text/plain; charset=utf-8
Content-Transfer-Encoding: 7bit
X-Originating-IP: [172.17.91.209]
X-Mailer: Zimbra 7.2.6_GA_2926 (ZimbraWebClient - FF3.0 (Win)/7.2.6_GA_2926)
Cc: "freebsd-fs@FreeBSD.org"
X-BeenThere: freebsd-fs@freebsd.org
X-Mailman-Version: 2.1.18-1
Precedence: list
List-Id: Filesystems
X-List-Received-Date: Sat, 25 Oct 2014 22:17:07 -0000

Julian Elischer wrote:
> On 10/25/14, 10:47 AM, Julian Elischer wrote:
> > On 10/25/14, 1:18 AM, K. Macy wrote:
> >> On Fri, Oct 24, 2014 at 9:45 AM, Julian Elischer wrote:
> >>> Can anyone point me at a VFS API contract change that occurred over
> >>> the last 5 years where a filesystem written to the old contract would
> >>> end up with extra references to all its vnodes/objects? Specifically a
> >>> proprietary filesystem that ran on 8.0 now can be compiled but ends up
> >>> with extra references on its vnodes and can not free them.
> >>>
> >> I think the contract for some functions has become unclear. I've found
> >> that the vput in the opensolaris compatibility layer's traverse() of
> >> the initial vnode passed in triggers negative reference count panics.
> >> It is clear that some callers of lookup expect the reference to be
> >> maintained on error, so the unconditional vput was (well is - this
> >> patch isn't in base) wrong, but in the case of success it isn't clear.
> >> Doing the vput on success will still eventually (as in a few seconds
> >> of this torture test script) cause a negative reference count panic.
> >> I think there needs to be an audit of VFS function contract compliance.
> >> Preferably by someone who knows what they are. I can only infer from
> >> cumulative context.
> >
> > I have evidence that the API has actually changed somehow.
> >
> > The old API would have extra calls to remove references to nodes.
> > We don't seem to be seeing those extra calls any more.
> > I'll have more info later.
> My colleague who works on the filesystem in question suspects a change
> in the interplay between _inactive and _reclaim of a znode/vnode,
> resulting in extra references on the nodes.
>
> Doesn't ring any bells with anyone?
>
Well, neither X_inactive() nor X_reclaim() is called until the ref count
goes to 0. One quirk is that X_inactive() isn't guaranteed to be called,
so anything you do in X_inactive() needs to be tested for and done again
in X_reclaim(). (I have no idea if this explains why you have non-zero
ref counted vnodes hanging about.)

rick

>
> >> Thanks.
> >>
> >> -K
> >>

From owner-freebsd-fs@FreeBSD.ORG Sat Oct 25 23:21:20 2014
Return-Path:
Delivered-To: freebsd-fs@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by hub.freebsd.org (Postfix) with ESMTPS id 3ACB5A48; Sat, 25 Oct 2014 23:21:20 +0000 (UTC)
Received: from esa-annu.net.uoguelph.ca (esa-annu.mail.uoguelph.ca [131.104.91.36]) by mx1.freebsd.org (Postfix) with ESMTP id BBDB4914; Sat, 25 Oct 2014 23:21:19 +0000 (UTC)
X-IronPort-Anti-Spam-Filtered: true
X-IronPort-Anti-Spam-Result: AqoEAPAvTFSDaFve/2dsb2JhbABcg2JYBIMCykYKhnlUAoEbAX2EAgEBAQMBAQEBIAQnIAsFFhgCAg0ZAikBCSYGCAcEARwEiBcJDbNmlB8BAQEBAQEEAQEBAQEBARuBLI8LAQEbNAeCd4FUBZZPhA6EcZRBhBQhLweBCDmBAwEBAQ
X-IronPort-AV: E=Sophos;i="5.04,788,1406606400"; d="scan'208";a="163519011"
Received: from muskoka.cs.uoguelph.ca (HELO zcs3.mail.uoguelph.ca) ([131.104.91.222]) by esa-annu.net.uoguelph.ca with ESMTP; 25 Oct 2014 19:21:13 -0400
Received: from zcs3.mail.uoguelph.ca (localhost.localdomain [127.0.0.1]) by zcs3.mail.uoguelph.ca (Postfix) with ESMTP id 2803CB403E; Sat, 25 Oct 2014 19:21:13 -0400 (EDT)
Date: Sat, 25 Oct 2014 19:21:13 -0400 (EDT)
From: Rick Macklem
To: Ronald Klop
Message-ID: <1388627434.7506173.1414279273153.JavaMail.root@uoguelph.ca>
In-Reply-To:
Subject: Re: panic in nfs on arm
MIME-Version: 1.0
Content-Type: text/plain; charset=utf-8
Content-Transfer-Encoding: 7bit
X-Originating-IP: [172.17.91.209]
X-Mailer: Zimbra 7.2.6_GA_2926 (ZimbraWebClient - FF3.0 (Win)/7.2.6_GA_2926)
Cc: freebsd-fs@freebsd.org, freebsd-arm@freebsd.org
X-BeenThere: freebsd-fs@freebsd.org
X-Mailman-Version: 2.1.18-1
Precedence: list
List-Id: Filesystems
X-List-Received-Date: Sat, 25 Oct 2014 23:21:20 -0000

Ronald Klop wrote:
> Hi,
>
> I got a panic on my arm computer while building a port with /usr/ports
> mounted from my FreeBSD-10-STABLE/amd64 machine.
>
> This is the machine which panicked:
> FreeBSD 11.0-CURRENT #1 r272028M: Tue Sep 23 17:11:45 CEST 2014
> root@sjakie.klop.ws:/usr/obj-arm/arm.arm/usr/src-arm/sys/SHEEVAPLUG
> arm
>
>
> Tracing pid 90295 tid 100119 td 0xc5f8c960
> db_trace_self() at db_trace_self
>     pc = 0xc0bb12c8 lr = 0xc0bb1354 (db_trace_thread+0x50)
>     sp = 0xdf29e5d0 fp = 0xc3e07120
> db_trace_thread() at db_trace_thread+0x50
>     pc = 0xc0bb1354 lr = 0xc0936314 (db_command_init+0x5a4)
>     sp = 0xdf29e630 fp = 0xc3e07120
> db_command_init() at db_command_init+0x5a4
>     pc = 0xc0936314 lr = 0xc0935ad0 (db_skip_to_eol+0x484)
>     sp = 0xdf29e648 fp = 0xc3e07120
>     r4 = 0xc0c8d350 r5 = 0x00000000
> db_skip_to_eol() at db_skip_to_eol+0x484
>     pc = 0xc0935ad0 lr = 0xc0935c38 (db_command_loop+0x5c)
>     sp = 0xdf29e6e8 fp = 0xc3e07120
>     r4 = 0xdf29e6fc r5 = 0xc0c8d64c
>     r6 = 0x3cd90e75 r7 = 0x00000000
>     r8 = 0x00000001 r10 = 0x600000d3
> db_command_loop() at db_command_loop+0x5c
>     pc = 0xc0935c38 lr = 0xc0937f80 (X_db_sym_numargs+0xec)
>     sp = 0xdf29e6f0 fp = 0xc3e07120
> X_db_sym_numargs() at X_db_sym_numargs+0xec
>     pc = 0xc0937f80 lr = 0xc0a6f0c0 (kdb_trap+0x94)
>     sp = 0xdf29e808 fp = 0xc3e07120
>     r4 = 0xdf29e8f8
> kdb_trap() at kdb_trap+0x94
>     pc = 0xc0a6f0c0 lr = 0xc0bc1d60 (badaddr_read+0x274)
>     sp = 0xdf29e828 fp = 0xc3e07120
>     r4 = 0xdf29e8f8 r5 = 0x00000001
>     r6 = 0x3cd90e75 r7 = 0xc5f8c960
>     r8 = 0xdf29e8f8 r10 = 0xdf2a1eb0
> badaddr_read() at badaddr_read+0x274
>     pc = 0xc0bc1d60 lr = 0xc0bc1e98 (badaddr_read+0x3ac)
>     sp = 0xdf29e840 fp = 0xc3e07120
>     r4 = 0xc5f8c960 r5 = 0xdf29e8f8
>     r6 = 0x3cd90e05
> badaddr_read() at badaddr_read+0x3ac
>     pc = 0xc0bc1e98 lr = 0xc0bc2278 (data_abort_handler+0x10c)
>     sp = 0xdf29e858 fp = 0xc3e07120
>     r4 = 0xc0cd8af8 r5 = 0xffff1004
> data_abort_handler() at data_abort_handler+0x10c
>     pc = 0xc0bc2278 lr = 0xc0bb2f40 (exception_exit)
>     sp = 0xdf29e8f8 fp = 0xc3e07120
>     r4 = 0xffffffff r5 = 0xffff1004
>     r6 = 0x3cd90e05 r7 = 0xc0e0ea48
>     r8 = 0x0000000f r9 = 0x00000101
>     r10 =
0x0000001d
> exception_exit() at exception_exit
>     pc = 0xc0bb2f40 lr = 0xc0b8daf8 (uma_reclaim+0x1f8)
>     sp = 0xdf29e948 fp = 0xc3e07120
>     r0 = 0xba9b9127 r1 = 0x8b3de5fb
>     r2 = 0xc61c1fc8 r3 = 0xba9b9126
>     r4 = 0x00000000 r5 = 0xc61c1fc8
>     r6 = 0x3cd90e05 r7 = 0xc0e0ea48
>     r8 = 0x0000000f r9 = 0x00000101
>     r10 = 0x0000001d r12 = 0x00000000
> uma_reclaim() at uma_reclaim+0x24c

This looks to me like a crash in uma_reclaim(), and I find UMA way too
obscure to understand. I have no idea if it might be related, but alc@
put a fix for low memory situations in r272071 (or maybe it's r272221?).
It might be worth trying a slightly newer kernel to see if the problem
still occurs.

And hopefully someone more conversant with UMA (or this stack trace) can
help more.

rick

>     pc = 0xc0b8db4c lr = 0xc0b8c800 (uma_zalloc_arg+0x2f0)
>     sp = 0xdf29e978 fp = 0xdf29ec10
>     r4 = 0xc3e071d8 r5 = 0xc0e0ea00
>     r6 = 0xc3e07120 r7 = 0x00000000
>     r8 = 0x00000102 r9 = 0xdf29ecf8
>     r10 = 0xc61c0760
> uma_zalloc_arg() at uma_zalloc_arg+0x2f0
>     pc = 0xc0b8c800 lr = 0xc09e1df0 (nfscl_nget+0x308)
>     sp = 0xdf29e990 fp = 0xdf29ec10
>     r4 = 0x9bb9fa43 r5 = 0x00000000
>     r6 = 0xc550dce8 r7 = 0xc3edaa00
>     r8 = 0xc3ebbac0
> nfscl_nget() at nfscl_nget+0x308
>     pc = 0xc09e1df0 lr = 0xc09da69c (ncl_readlinkrpc+0xf60)
>     sp = 0xdf29e9d8 fp = 0xdf29ea10
>     r4 = 0xc550dce8 r5 = 0x00000000
>     r6 = 0xc550dcf8 r7 = 0xdf29ecf8
>     r8 = 0xdf29ec6c r9 = 0x00000000
>     r10 = 0xdf29ed28
> ncl_readlinkrpc() at ncl_readlinkrpc+0xf60
>     pc = 0xc09da69c lr = 0xc0bdae44 (VOP_MKDIR_APV+0x94)
>     sp = 0xdf29ec40 fp = 0xbffff620
>     r4 = 0xc0c95c68 r5 = 0xdf29ec6c
>     r6 = 0x00000001 r7 = 0x00020284
>     r8 = 0xffffff9c r9 = 0x00200800
>     r10 = 0xc5f8c960
> VOP_MKDIR_APV() at VOP_MKDIR_APV+0x94
>     pc = 0xc0bdae44 lr = 0xc0aca614 (kern_mkdirat+0x18c)
>     sp = 0xdf29ec50 fp = 0xbffff620
>     r4 = 0xdf29ed28 r5 = 0xdf29ec90
>     r6 = 0x00000000
> kern_mkdirat() at kern_mkdirat+0x18c
>     pc = 0xc0aca614 lr = 0xc0aca684 (kern_mkdir+0x24)
>     sp = 0xdf29ede0 fp =
0xbffff620
>     r4 = 0x00020290 r5 = 0xc5f8c960
>     r6 = 0x00000000 r7 = 0xc5f7f000
>     r8 = 0x00000000 r10 = 0x00013640
> kern_mkdir() at kern_mkdir+0x24
>     pc = 0xc0aca684 lr = 0xc0aca6a8 (sys_mkdir+0x1c)
>     sp = 0xdf29edf0 fp = 0xbffff620
> sys_mkdir() at sys_mkdir+0x1c
>     pc = 0xc0aca6a8 lr = 0xc0bc2884 (swi_handler+0x254)
>     sp = 0xdf29edf8 fp = 0xbffff620
> swi_handler() at swi_handler+0x254
>     pc = 0xc0bc2884 lr = 0xc0bb2ed0 (swi_exit)
>     sp = 0xdf29ee60 fp = 0xbffff620
>     r4 = 0x00020290 r5 = 0x2085e8e0
>     r6 = 0x00020284 r7 = 0x00000088
>     r8 = 0x00000001
> swi_exit() at swi_exit
>     pc = 0xc0bb2ed0 lr = 0xc0bb2ed0 (swi_exit)
>     sp = 0xdf29ee60 fp = 0xbffff620
> Unable to unwind further
>
>
> Unfortunately dumping the kernel core also panicked.
> db> dump
> Physical memory: 507 MB
> Dumping 74 MB: 71 67 63
> vm_fault(0xc4147000, 0, 1, 0) -> 0
> Fatal kernel mode data abort: 'Translation Fault (P)'
> trapframe: 0xdf29e0b8
> FSR=00000017, FAR=00000014, spsr=a00000d3
> r0 =c0cd0f40, r1 =00000000, r2 =c5f8c960, r3 =00000004
> r4 =00000000, r5 =00000000, r6 =00000000, r7 =c3ead01c
> r8 =c3ead000, r9 =c3e9e88c, r10=00000000, r11=0000000a
> r12=600000d3, ssp=df29e108, slr=c0bb4e24, pc =c0a7d060
>
> panic: Fatal abort
> Uptime: 3d18h30m32s
> Sleeping thread (tid 100119, pid 90295) owns a non-sleepable lock