From owner-freebsd-fs@FreeBSD.ORG Sun Mar 6 04:26:32 2011 Return-Path: Delivered-To: freebsd-fs@FreeBSD.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 9EC2A106564A; Sun, 6 Mar 2011 04:26:32 +0000 (UTC) (envelope-from swills@FreeBSD.org) Received: from mouf.net (mouf.net [204.109.58.86]) by mx1.freebsd.org (Postfix) with ESMTP id 439318FC12; Sun, 6 Mar 2011 04:26:31 +0000 (UTC) Received: from meatwad.mouf.net (cpe-065-190-178-041.nc.res.rr.com [65.190.178.41]) (authenticated bits=0) by mouf.net (8.14.4/8.14.4) with ESMTP id p264Bxf5092753 (version=TLSv1/SSLv3 cipher=DHE-RSA-CAMELLIA256-SHA bits=256 verify=NOT); Sat, 5 Mar 2011 23:12:00 -0500 (EST) (envelope-from swills@FreeBSD.org) Message-ID: <4D73098F.3000807@FreeBSD.org> Date: Sat, 05 Mar 2011 23:11:59 -0500 From: Steve Wills User-Agent: Mozilla/5.0 (X11; U; FreeBSD amd64; en-US; rv:1.9.1.16) Gecko/20110130 Thunderbird/3.0.11 MIME-Version: 1.0 To: Pawel Jakub Dawidek References: <20110227202957.GD1992@garage.freebsd.pl> In-Reply-To: <20110227202957.GD1992@garage.freebsd.pl> X-Enigmail-Version: 1.0.1 Content-Type: text/plain; charset=ISO-8859-1 Content-Transfer-Encoding: 7bit X-Greylist: Sender succeeded SMTP AUTH, not delayed by milter-greylist-4.2.6 (mouf.net [204.109.58.86]); Sat, 05 Mar 2011 23:12:00 -0500 (EST) X-Virus-Scanned: clamav-milter 0.96.2 at mouf.net X-Virus-Status: Clean Cc: freebsd-fs@FreeBSD.org, freebsd-current@FreeBSD.org Subject: Re: HEADS UP: ZFSv28 is in! X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Sun, 06 Mar 2011 04:26:32 -0000 -----BEGIN PGP SIGNED MESSAGE----- Hash: SHA1 Hi Pawel, On 02/27/11 15:29, Pawel Jakub Dawidek wrote: > Hi. > > I just committed ZFSv28 to HEAD. > > New major features: > > - Data deduplication. > - Triple parity RAIDZ (RAIDZ3). > - zfs diff. 
> - zpool split. > - Snapshot holds. > - zpool import -F. Allows to rewind corrupted pool to earlier > transaction group. > - Possibility to import pool in read-only mode. > > PS. If you like my work, you help me to promote yomoli.com:) > > http://yomoli.com > http://www.facebook.com/pages/Yomolicom/178311095544155 > Thanks for your work on this, I'm very happy to have ZFS v28. I just updated my -CURRENT system from a snapshot from about a month ago to code from today. I have 3 pools and one of them is for ports tinderbox. I only upgraded that pool. When I try to build something using tinderbox, I get this error: cp: failed to set acl entries for /usr/local/tinderbox/9-CURRENT-amd64-FreeBSD/buildscript: Operation not supported If I delete the /usr/local/tinderbox/9-CURRENT-amd64-FreeBSD/ directory then try to build, I get no errors. Could this be a bug in tinderbox or something else? It was working fine before the update as far as I know. Thanks, Steve -----BEGIN PGP SIGNATURE----- Version: GnuPG v2.0.16 (FreeBSD) iQEcBAEBAgAGBQJNcwmPAAoJEPXPYrMgexuhoXUH/jelhA/5sebWUjp2mEQ3GWYj GBIxM0H3v4kzRQ3CxQ3ACC/piXcDtF+j33KJl1032DgrijaWLs9kj1vdQd1ye5xc A9qN4Ek++/w3+JoLWkyyzIyg2/glIy/VaVdzXClEjR5GC02M3QG62OwVYyKEHicC 7FzeFHVRw29Rs6Rael3vkGospXfo7ha8uhc8Dv+kqLnmeBEaTYllpjtyzd9DbM38 01DhMc6Yg0EWbOF4h1wL6dwQDGDc0aBlLV8IWft90wVtewZWAhVGhrCBWLAflPWn X6lSg74PLryANaV7Vmk9MvR+9McCwCFstrVVCvAnAwlUYJ5Umo8h0uRWY5bt9mA= =EMs5 -----END PGP SIGNATURE----- From owner-freebsd-fs@FreeBSD.ORG Sun Mar 6 08:25:21 2011 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id E1613106564A for ; Sun, 6 Mar 2011 08:25:21 +0000 (UTC) (envelope-from boydjd@jbip.net) Received: from mail-fx0-f54.google.com (mail-fx0-f54.google.com [209.85.161.54]) by mx1.freebsd.org (Postfix) with ESMTP id 73F0D8FC16 for ; Sun, 6 Mar 2011 08:25:21 +0000 (UTC) Received: by fxm19 with SMTP id 19so3730587fxm.13 for 
; Sun, 06 Mar 2011 00:25:20 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=jbip.net; s=google; h=domainkey-signature:mime-version:in-reply-to:references:from:date :message-id:subject:to:cc:content-type:content-transfer-encoding; bh=7wsutrzOk06D15DwykGWLJpC+T94JD3mDAIo8PTVuXI=; b=ep1eLbR2jxdeWvx9+sYtsIRjw9cmFPdINhX/0P9ZD6IXHwlS4XdBbNgyba5GHmMtJ4 h3BI4gSl5gMRYFlQzwuefmwUouBDFkc5sx4JwhvpYqQSK6QpiPxao8TxAB+Mmy8da9Jx PyX3Ahhn4CUj9tX7qUCaMDD0hD12VVDtWZXZw= DomainKey-Signature: a=rsa-sha1; c=nofws; d=jbip.net; s=google; h=mime-version:in-reply-to:references:from:date:message-id:subject:to :cc:content-type:content-transfer-encoding; b=At7WJy5bLaYluFO4MJUFdNT8edIzde2I8Mms8rRZroB5vu2qFOUYhUxjhrFMhMVu3p UsVU4JSHNuFsNIyYzyRshRLrcuEC9+VTi0HU4bo3H8CuMnFQrhlnE/2HQ5UMGGZ6OQBv IVheg03d5qwrRLirC2rgnVLY9K/fNxcrKYBME= Received: by 10.223.15.152 with SMTP id k24mr643652faa.96.1299398403092; Sun, 06 Mar 2011 00:00:03 -0800 (PST) MIME-Version: 1.0 Received: by 10.223.144.137 with HTTP; Sat, 5 Mar 2011 23:59:43 -0800 (PST) In-Reply-To: <20110304105608.GA23887@icarus.home.lan> References: <1299232133.18671.3.camel@pc286.embl.fr> <20110304100517.GA23249@icarus.home.lan> <20110304105608.GA23887@icarus.home.lan> From: Joshua Boyd Date: Sun, 6 Mar 2011 02:59:43 -0500 Message-ID: To: Jeremy Chadwick Content-Type: text/plain; charset=ISO-8859-1 Content-Transfer-Encoding: quoted-printable Cc: freebsd-fs@freebsd.org Subject: Re: kmem_map too small with ZFS and 8.2-RELEASE X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Sun, 06 Mar 2011 08:25:22 -0000 On Fri, Mar 4, 2011 at 5:56 AM, Jeremy Chadwick wrote: > If you get better performance -- really, truly, honestly -- with > prefetch enabled on your system, then I strongly recommend you keep it > enabled. 
> However, for what it's worth (probably not much), this is the
> first I've ever heard of a FreeBSD system performing better with
> prefetch enabled.

I just recently turned it on after having it turned off for a long
time ... my speeds went from ~300MB/s to 600+MB/s in bonnie++. This is
a dual core AM3 system with 8GB of ram, and 15 disks in a striped
raidz configuration (3 sets striped).

--
Joshua Boyd
JBipNet

E-mail: boydjd@jbip.net
http://www.jbip.net

From owner-freebsd-fs@FreeBSD.ORG Sun Mar 6 08:54:54 2011
Return-Path:
Delivered-To: freebsd-fs@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 169D3106566B for ; Sun, 6 Mar 2011 08:54:54 +0000 (UTC) (envelope-from pawel@dawidek.net)
Received: from mail.garage.freebsd.pl (60.wheelsystems.com [83.12.187.60]) by mx1.freebsd.org (Postfix) with ESMTP id B2DBF8FC12 for ; Sun, 6 Mar 2011 08:54:53 +0000 (UTC)
Received: by mail.garage.freebsd.pl (Postfix, from userid 65534) id 5E0B345DD8; Sun, 6 Mar 2011 09:54:51 +0100 (CET)
Received: from localhost (89-73-195-149.dynamic.chello.pl [89.73.195.149]) (using TLSv1 with cipher DHE-RSA-AES256-SHA (256/256 bits)) (No client certificate requested) by mail.garage.freebsd.pl (Postfix) with ESMTP id B3FF845CA6; Sun, 6 Mar 2011 09:54:45 +0100 (CET)
Date: Sun, 6 Mar 2011 09:54:45 +0100
From: Pawel Jakub Dawidek
To: Attila Nagy
Message-ID: <20110306084217.GA9791@garage.freebsd.pl>
References: <4D710154.90409@fsn.hu>
Mime-Version: 1.0
Content-Type: multipart/signed; micalg=pgp-sha1; protocol="application/pgp-signature"; boundary="jL2BoiuKMElzg3CS"
Content-Disposition: inline
In-Reply-To: <4D710154.90409@fsn.hu>
User-Agent: Mutt/1.4.2.3i
X-PGP-Key-URL: http://people.freebsd.org/~pjd/pjd.asc
X-OS: FreeBSD 9.0-CURRENT amd64
X-Spam-Checker-Version: SpamAssassin 3.0.4 (2005-06-05) on mail.garage.freebsd.pl
X-Spam-Level:
X-Spam-Status: No, score=-0.6 required=4.5 tests=BAYES_00,RCVD_IN_SORBS_DUL autolearn=no
 version=3.0.4
Cc: freebsd-fs@freebsd.org
Subject: Re: Punching holes into (sparse) files - porting Solaris fcntl(F_FREESP) to FreeBSD?
X-BeenThere: freebsd-fs@freebsd.org
X-Mailman-Version: 2.1.5
Precedence: list
List-Id: Filesystems
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
X-List-Received-Date: Sun, 06 Mar 2011 08:54:54 -0000

--jL2BoiuKMElzg3CS
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
Content-Transfer-Encoding: quoted-printable

On Fri, Mar 04, 2011 at 04:12:20PM +0100, Attila Nagy wrote:
> Hi,
>
> Is it possible to make regions of files, with already written data
> sparse? (I'm interested to do this on ZFS)
>
> All I could find in this topic is:
> http://www.mail-archive.com/zfs-discuss@opensolaris.org/msg29047.html
>
> grepping through the source gives a match for VOP_SPACE in
> cddl/contrib/opensolaris/uts/common/fs/zfs/zfs_replay.c:
> zfs_replay_truncate(zfsvfs_t *zfsvfs, lr_truncate_t *lr, boolean_t byteswap)
> {
> #ifdef sun
> [...]
>     error = VOP_SPACE(ZTOV(zp), F_FREESP, &fl, FWRITE | FOFFMAX,
>         lr->lr_offset, kcred, NULL);
>
> And the relevant section from fcntl(2) in Solaris:
> F_FREESP
>
> Free storage space associated with a section of the
> ordinary file fildes. The section is specified by a
> variable of data type struct flock pointed to by arg.
> The data type struct flock is defined in the
> header (see fcntl.h(3HEAD)) and is described below. Note
> that all file systems might not support all possible
> variations of F_FREESP arguments. In particular, many
> file systems allow space to be freed only at the end of
> a file.
>
> F_FREESP seems to be my friend, and it's implemented in Solaris's ZFS.
> How hard would it be to complete the port and make it accessible from
> FreeBSD?
> I guess it was left out with a reason...

Well, adding a new VOP is an important decision. We could eventually
implement this via ioctl(2), I think... This is a nice feature after all.
I don't know why do you need this, but note that when compression is enabled on a ZFS file system, all-zeros blocks are turned into holes, so if you do have compression enabled and you write all zeros in the place you want to punch a hole, the pool space should be reclaimed. --=20 Pawel Jakub Dawidek http://www.wheelsystems.com FreeBSD committer http://www.FreeBSD.org Am I Evil? Yes, I Am! http://yomoli.com --jL2BoiuKMElzg3CS Content-Type: application/pgp-signature Content-Disposition: inline -----BEGIN PGP SIGNATURE----- Version: GnuPG v2.0.14 (FreeBSD) iEUEARECAAYFAk1zS9QACgkQForvXbEpPzQE5QCYo51Ht4WXKSPPGLPXZj8lFyHn UACfTowvfEMJD7/s4NEYIR11T6zTH5Y= =q7p7 -----END PGP SIGNATURE----- --jL2BoiuKMElzg3CS-- From owner-freebsd-fs@FreeBSD.ORG Sun Mar 6 09:04:58 2011 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 2ADF6106566C for ; Sun, 6 Mar 2011 09:04:58 +0000 (UTC) (envelope-from jdc@koitsu.dyndns.org) Received: from qmta06.emeryville.ca.mail.comcast.net (qmta06.emeryville.ca.mail.comcast.net [76.96.30.56]) by mx1.freebsd.org (Postfix) with ESMTP id 0F1338FC0C for ; Sun, 6 Mar 2011 09:04:57 +0000 (UTC) Received: from omta02.emeryville.ca.mail.comcast.net ([76.96.30.19]) by qmta06.emeryville.ca.mail.comcast.net with comcast id Fl3j1g0020QkzPwA6l4xq8; Sun, 06 Mar 2011 09:04:57 +0000 Received: from koitsu.dyndns.org ([98.248.33.18]) by omta02.emeryville.ca.mail.comcast.net with comcast id Fl4w1g0010PUQVN8Nl4wbh; Sun, 06 Mar 2011 09:04:56 +0000 Received: by icarus.home.lan (Postfix, from userid 1000) id DEE459B422; Sun, 6 Mar 2011 01:04:55 -0800 (PST) Date: Sun, 6 Mar 2011 01:04:55 -0800 From: Jeremy Chadwick To: Joshua Boyd Message-ID: <20110306090455.GA87055@icarus.home.lan> References: <1299232133.18671.3.camel@pc286.embl.fr> <20110304100517.GA23249@icarus.home.lan> <20110304105608.GA23887@icarus.home.lan> MIME-Version: 1.0 Content-Type: text/plain; 
 charset=us-ascii
Content-Disposition: inline
In-Reply-To:
User-Agent: Mutt/1.5.21 (2010-09-15)
Cc: freebsd-fs@freebsd.org
Subject: Re: kmem_map too small with ZFS and 8.2-RELEASE
X-BeenThere: freebsd-fs@freebsd.org
X-Mailman-Version: 2.1.5
Precedence: list
List-Id: Filesystems
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
X-List-Received-Date: Sun, 06 Mar 2011 09:04:58 -0000

On Sun, Mar 06, 2011 at 02:59:43AM -0500, Joshua Boyd wrote:
> On Fri, Mar 4, 2011 at 5:56 AM, Jeremy Chadwick
> wrote:
> > If you get better performance -- really, truly, honestly -- with
> > prefetch enabled on your system, then I strongly recommend you keep it
> > enabled. However, for what it's worth (probably not much), this is the
> > first I've ever heard of a FreeBSD system performing better with
> > prefetch enabled.
>
> I just recently turned it on after having it turned off for a long
> time ... my speeds went from ~300MB/s to 600+MB/s in bonnie++. This is
> a dual core AM3 system with 8GB of ram, and 15 disks in a striped
> raidz configuration (3 sets striped).

Here are some numbers for you. This is from a 8.2-STABLE (RELENG_8)
system built Thu Feb 24 22:06:45 PST 2011, type amd64.

/boot/loader.conf
======================
ahci_load="yes"
vm.kmem_size="8192M"
vfs.zfs.arc_max="6144M"
vfs.zfs.prefetch_disable="1"
vfs.zfs.txg.timeout="5"

/etc/sysctl.conf
======================
kern.maxvnodes=250000
vfs.zfs.txg.write_limit_override=1073741824

ZFS details
======================
# zpool list
NAME   SIZE   USED  AVAIL    CAP  HEALTH  ALTROOT
data  2.72T   780G  1.96T    28%  ONLINE  -

# zfs list
NAME           USED  AVAIL  REFER  MOUNTPOINT
data           519G  1.28T  28.0K  none
data/backups   519G  1.28T   519G  /backups
data/home     7.19M  1.28T  7.19M  /home

# zpool status data
  pool: data
 state: ONLINE
 scrub: none requested
config:

        NAME        STATE     READ WRITE CKSUM
        data        ONLINE       0     0     0
          raidz1    ONLINE       0     0     0
            ada1    ONLINE       0     0     0
            ada2    ONLINE       0     0     0
            ada3    ONLINE       0     0     0

Pool and filesystem attributes are all defaults (ZFS v15), except for
mountpoint.

Disk details
======================
ada1: 953869MB (1953525168 512 byte sectors: 16H 63S/T 16383C)
ada2 at ahcich2 bus 0 scbus2 target 0 lun 0
ada2: ATA-8 SATA 2.x device
ada2: 300.000MB/s transfers (SATA 2.x, UDMA6, PIO 8192bytes)
ada2: Command Queueing enabled
ada2: 953869MB (1953525168 512 byte sectors: 16H 63S/T 16383C)
ada3 at ahcich3 bus 0 scbus3 target 0 lun 0
ada3: ATA-8 SATA 2.x device
ada3: 300.000MB/s transfers (SATA 2.x, UDMA6, PIO 8192bytes)
ada3: Command Queueing enabled
ada3: 953869MB (1953525168 512 byte sectors: 16H 63S/T 16383C)

Benchmark commands
======================
shutdown -r now
cd /backups && bonnie++ -s 16g -u root

Benchmark results #1
======================
Version  1.96       ------Sequential Output------ --Sequential Input- --Random-
Concurrency   1     -Per Chr- --Block-- -Rewrite- -Per Chr- --Block-- --Seeks--
Machine        Size K/sec %CP K/sec %CP K/sec %CP K/sec %CP K/sec %CP  /sec %CP
XXXXX           16G    92  99 127026  42 62634  18   244  99 177522  22 207.0   5
Latency               144ms    1277ms    1616ms   47845us     111ms     639ms
Version  1.96       ------Sequential Create------ --------Random Create--------
XXXXX               -Create-- --Read--- -Delete-- -Create-- --Read--- -Delete--
              files  /sec %CP  /sec %CP  /sec %CP  /sec %CP  /sec %CP  /sec %CP
                 16 18945  96 +++++ +++ 17552  98 18975  94 +++++ +++ 18286  98
Latency             18018us      85us     131us   17862us      38us      80us

Benchmark results #2
======================
Results with only one change made to /boot/loader.conf:

vfs.zfs.prefetch_disable="0"

Version  1.96       ------Sequential Output------ --Sequential Input- --Random-
Concurrency   1     -Per Chr- --Block-- -Rewrite- -Per Chr- --Block-- --Seeks--
Machine        Size K/sec %CP K/sec %CP K/sec %CP K/sec %CP K/sec %CP  /sec %CP
XXXXX           16G    95  99 126656  40 74678  24   229  97 187918  26 219.8   5
Latency               159ms    1813ms    2506ms     330ms     308ms     657ms
Version  1.96       ------Sequential Create------ --------Random Create--------
XXXXX               -Create-- --Read--- -Delete-- -Create-- --Read--- -Delete--
              files  /sec %CP  /sec %CP  /sec %CP  /sec %CP  /sec %CP  /sec %CP
                 16 20140  96 +++++ +++ 16814  96 10431  54 +++++ +++ 14846  95
Latency             21010us     158us    1242us     146ms     721us   11662us

--
| Jeremy Chadwick                                   jdc@parodius.com |
| Parodius Networking                       http://www.parodius.com/ |
| UNIX Systems Administrator                  Mountain View, CA, USA |
| Making life hard for others since 1977.
PGP 4BD6C0CB | From owner-freebsd-fs@FreeBSD.ORG Sun Mar 6 09:51:03 2011 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 1B103106566C; Sun, 6 Mar 2011 09:51:03 +0000 (UTC) (envelope-from etnapierala@googlemail.com) Received: from mail-bw0-f54.google.com (mail-bw0-f54.google.com [209.85.214.54]) by mx1.freebsd.org (Postfix) with ESMTP id 41FAF8FC17; Sun, 6 Mar 2011 09:51:01 +0000 (UTC) Received: by bwz12 with SMTP id 12so3362800bwz.13 for ; Sun, 06 Mar 2011 01:51:01 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=googlemail.com; s=gamma; h=domainkey-signature:sender:subject:mime-version:content-type:from :in-reply-to:date:cc:content-transfer-encoding:message-id:references :to:x-mailer; bh=qv+dZhHvdDNOHSZpZKXFgwvquWcd5Hh+xwpZ7AEnv8w=; b=dZrisIKj1KHwhINYvy+Bti+x6xY8+VL+ggEtM1+yTqAZC95tx5dHaxO8MYuvSLM7r5 a21Q5bKoozeeW+Awm03Msz1Q3mcxCx4QehP49eV9R0st8Wd5l9ZTuXkAwaLuwXE8sJCp oUQWGaHArZULQRUtoVWlgraU/CHLBn9m0JtXw= DomainKey-Signature: a=rsa-sha1; c=nofws; d=googlemail.com; s=gamma; h=sender:subject:mime-version:content-type:from:in-reply-to:date:cc :content-transfer-encoding:message-id:references:to:x-mailer; b=k5sBwkc/vH55Diwr2etAI5ytN5I8r6MhRBlpfj8Lj81mbDWYZxOBHXzxRsckXkFrvF dGu2Q0GxeLdAc4c5vqS2Tf52PC0ldgndxmbF7Fe0ed705k2hMtYWpdZAUeR/0ceVu8WK E69oBPoYjnsnd8l+fv3ffOoGadBttPOHqc7iQ= Received: by 10.204.47.201 with SMTP id o9mr846751bkf.15.1299403358913; Sun, 06 Mar 2011 01:22:38 -0800 (PST) Received: from [192.168.1.102] (45.81.datacomsa.pl [195.34.81.45]) by mx.google.com with ESMTPS id 12sm857330bki.19.2011.03.06.01.22.36 (version=TLSv1/SSLv3 cipher=OTHER); Sun, 06 Mar 2011 01:22:37 -0800 (PST) Sender: =?UTF-8?Q?Edward_Tomasz_Napiera=C5=82a?= Mime-Version: 1.0 (Apple Message framework v1082) Content-Type: text/plain; charset=iso-8859-2 From: =?iso-8859-2?Q?Edward_Tomasz_Napiera=B3a?= In-Reply-To: <4D73098F.3000807@FreeBSD.org> Date: 
 Sun, 6 Mar 2011 10:22:34 +0100
Content-Transfer-Encoding: quoted-printable
Message-Id: <59D664AA-76C6-45C7-94CE-5AA63080368C@FreeBSD.org>
References: <20110227202957.GD1992@garage.freebsd.pl> <4D73098F.3000807@FreeBSD.org>
To: Steve Wills
X-Mailer: Apple Mail (2.1082)
Cc: freebsd-fs@FreeBSD.org, freebsd-current@FreeBSD.org, Pawel Jakub Dawidek
Subject: Re: HEADS UP: ZFSv28 is in!
X-BeenThere: freebsd-fs@freebsd.org
X-Mailman-Version: 2.1.5
Precedence: list
List-Id: Filesystems
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
X-List-Received-Date: Sun, 06 Mar 2011 09:51:03 -0000

Message written by Steve Wills on 2011-03-06 at 05:11:

[..]

> Thanks for your work on this, I'm very happy to have ZFS v28. I just
> updated my -CURRENT system from a snapshot from about a month ago to
> code from today. I have 3 pools and one of them is for ports tinderbox.
> I only upgraded that pool. When I try to build something using
> tinderbox, I get this error:
>
> cp: failed to set acl entries for
> /usr/local/tinderbox/9-CURRENT-amd64-FreeBSD/buildscript: Operation not
> supported

What does "mount" show?

--
If you cut off my head, what would I say? Me and my head, or me and my body?
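
[Editor's sketch, not part of the thread.] The question above is about which filesystem actually backs the failing path. As a rough illustration only, the Python below walks up from a path until it reaches a mount point. The function name find_mount_point is made up for this example, and the sketch does not report the filesystem type, so mount(8) or df(1) output is still needed for that.

```python
import os


def find_mount_point(path):
    """Return the closest enclosing mount point of `path`.

    Hypothetical helper for illustration: it resolves symlinks, then
    climbs parent directories until os.path.ismount() says yes.
    """
    current = os.path.realpath(path)
    while not os.path.ismount(current):
        parent = os.path.dirname(current)
        if parent == current:
            # Reached the top without a hit; give back what we have.
            break
        current = parent
    return current


if __name__ == "__main__":
    import sys

    # Print the mount point for each path given, or for "." by default.
    for p in sys.argv[1:] or ["."]:
        print(p, "->", find_mount_point(p))
```

Running it against the directory cp was writing to would show whether a separate filesystem is mounted on top of the pool's dataset at that location.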
From owner-freebsd-fs@FreeBSD.ORG Sun Mar 6 13:35:47 2011 Return-Path: Delivered-To: freebsd-fs@FreeBSD.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 6909F106564A; Sun, 6 Mar 2011 13:35:47 +0000 (UTC) (envelope-from swills@FreeBSD.org) Received: from mouf.net (mouf.net [204.109.58.86]) by mx1.freebsd.org (Postfix) with ESMTP id E34BC8FC08; Sun, 6 Mar 2011 13:35:46 +0000 (UTC) Received: from meatwad.mouf.net (cpe-065-190-178-041.nc.res.rr.com [65.190.178.41]) (authenticated bits=0) by mouf.net (8.14.4/8.14.4) with ESMTP id p26DZiuJ096395 (version=TLSv1/SSLv3 cipher=DHE-RSA-CAMELLIA256-SHA bits=256 verify=NOT); Sun, 6 Mar 2011 08:35:45 -0500 (EST) (envelope-from swills@FreeBSD.org) Message-ID: <4D738DB0.1090603@FreeBSD.org> Date: Sun, 06 Mar 2011 08:35:44 -0500 From: Steve Wills User-Agent: Mozilla/5.0 (X11; U; FreeBSD amd64; en-US; rv:1.9.1.16) Gecko/20110130 Thunderbird/3.0.11 MIME-Version: 1.0 To: =?ISO-8859-2?Q?Edward_Tomasz_Napiera=B3a?= References: <20110227202957.GD1992@garage.freebsd.pl> <4D73098F.3000807@FreeBSD.org> <59D664AA-76C6-45C7-94CE-5AA63080368C@FreeBSD.org> In-Reply-To: <59D664AA-76C6-45C7-94CE-5AA63080368C@FreeBSD.org> X-Enigmail-Version: 1.0.1 Content-Type: text/plain; charset=ISO-8859-2 Content-Transfer-Encoding: 8bit X-Greylist: Sender succeeded SMTP AUTH, not delayed by milter-greylist-4.2.6 (mouf.net [204.109.58.86]); Sun, 06 Mar 2011 08:35:45 -0500 (EST) X-Virus-Scanned: clamav-milter 0.96.2 at mouf.net X-Virus-Status: Clean Cc: freebsd-fs@FreeBSD.org, freebsd-current@FreeBSD.org, Pawel Jakub Dawidek Subject: Re: HEADS UP: ZFSv28 is in! 
X-BeenThere: freebsd-fs@freebsd.org
X-Mailman-Version: 2.1.5
Precedence: list
List-Id: Filesystems
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
X-List-Received-Date: Sun, 06 Mar 2011 13:35:47 -0000

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

On 03/06/11 04:22, Edward Tomasz Napierała wrote:
> Message written by Steve Wills on 2011-03-06 at 05:11:
>
> [..]
>
>> Thanks for your work on this, I'm very happy to have ZFS v28. I just
>> updated my -CURRENT system from a snapshot from about a month ago to
>> code from today. I have 3 pools and one of them is for ports tinderbox.
>> I only upgraded that pool. When I try to build something using
>> tinderbox, I get this error:
>>
>> cp: failed to set acl entries for
>> /usr/local/tinderbox/9-CURRENT-amd64-FreeBSD/buildscript: Operation not
>> supported
>
> What does "mount" show?

/dev/md4 12186190 332724 11853466 3%
/usr/local/tinderbox/9-CURRENT-amd64-FreeBSD

Sorry, I forgot about the mdmfs hacks I had in my local tinderd. Without
them, it works fine. So the problem seems to be in mfs rather than zfs.
Steve -----BEGIN PGP SIGNATURE----- Version: GnuPG v2.0.16 (FreeBSD) iQEcBAEBAgAGBQJNc42wAAoJEPXPYrMgexuhYiwH/0k+HYFiWHgDlpbEZL5xEYHS +ZlOZ19kW58A648iVbzuBgGWcQIEkylflJc23pigue+Qm5gvUR9PYZmr20hleCow 96pOxlQOyu1yJ/w90yJsfOTnVXfwdgEZWrHwiW+dDQp1YCGjnXqiocHT6gjukCNV HSm6hEMI9YD5y75tVZVep/en6VmSwywsLlHEL5T8M+x1bsUUkawCltNUcbYLfJWS NMuyG1ZudwscTj1bZHKyz+1bu5sToBj/w8aU7vASn5wvIjnKhMo/DDiCICyykbQ1 7Qa3i+BfG7ugvq6WxV7OiiCTYzx8f8R2D9QOtz6wmAPLkQbItRfdX7i0lxG+nig= =h6fT -----END PGP SIGNATURE----- From owner-freebsd-fs@FreeBSD.ORG Sun Mar 6 14:43:37 2011 Return-Path: Delivered-To: freebsd-fs@FreeBSD.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 3A4771065672; Sun, 6 Mar 2011 14:43:37 +0000 (UTC) (envelope-from swills@FreeBSD.org) Received: from mouf.net (mouf.net [204.109.58.86]) by mx1.freebsd.org (Postfix) with ESMTP id D36928FC12; Sun, 6 Mar 2011 14:43:36 +0000 (UTC) Received: from meatwad.mouf.net (cpe-065-190-178-041.nc.res.rr.com [65.190.178.41]) (authenticated bits=0) by mouf.net (8.14.4/8.14.4) with ESMTP id p26EhYwf096635 (version=TLSv1/SSLv3 cipher=DHE-RSA-CAMELLIA256-SHA bits=256 verify=NOT); Sun, 6 Mar 2011 09:43:35 -0500 (EST) (envelope-from swills@FreeBSD.org) Message-ID: <4D739D96.5090705@FreeBSD.org> Date: Sun, 06 Mar 2011 09:43:34 -0500 From: Steve Wills User-Agent: Mozilla/5.0 (X11; U; FreeBSD amd64; en-US; rv:1.9.1.16) Gecko/20110130 Thunderbird/3.0.11 MIME-Version: 1.0 To: =?ISO-8859-2?Q?Edward_Tomasz_Napiera=B3a?= References: <20110227202957.GD1992@garage.freebsd.pl> <4D73098F.3000807@FreeBSD.org> <59D664AA-76C6-45C7-94CE-5AA63080368C@FreeBSD.org> <4D738DB0.1090603@FreeBSD.org> In-Reply-To: <4D738DB0.1090603@FreeBSD.org> X-Enigmail-Version: 1.0.1 Content-Type: text/plain; charset=ISO-8859-2 Content-Transfer-Encoding: 7bit X-Greylist: Sender succeeded SMTP AUTH, not delayed by milter-greylist-4.2.6 (mouf.net [204.109.58.86]); Sun, 06 Mar 2011 09:43:35 -0500 (EST) X-Virus-Scanned: 
 clamav-milter 0.96.2 at mouf.net
X-Virus-Status: Clean
Cc: freebsd-fs@FreeBSD.org, freebsd-current@FreeBSD.org
Subject: Re: ACL issue (Was Re: HEADS UP: ZFSv28 is in!)
X-BeenThere: freebsd-fs@freebsd.org
X-Mailman-Version: 2.1.5
Precedence: list
List-Id: Filesystems
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
X-List-Received-Date: Sun, 06 Mar 2011 14:43:37 -0000

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

On 03/06/11 08:35, Steve Wills wrote:
> On 03/06/11 04:22, Edward Tomasz Napierała wrote:
>> Message written by Steve Wills on 2011-03-06 at 05:11:
>
>> [..]
>
>>> Thanks for your work on this, I'm very happy to have ZFS v28. I just
>>> updated my -CURRENT system from a snapshot from about a month ago to
>>> code from today. I have 3 pools and one of them is for ports tinderbox.
>>> I only upgraded that pool. When I try to build something using
>>> tinderbox, I get this error:
>>>
>>> cp: failed to set acl entries for
>>> /usr/local/tinderbox/9-CURRENT-amd64-FreeBSD/buildscript: Operation not
>>> supported
>
>> What does "mount" show?
>
> /dev/md4 12186190 332724 11853466 3%
> /usr/local/tinderbox/9-CURRENT-amd64-FreeBSD
>
> Sorry, I forgot about the mdmfs hacks I had in my local tinderd. Without
> them, it works fine. So the problem seems to be in mfs rather than zfs.

I should have said mdmfs, but all that's doing is running mdconfig and
newfs for me. I've reproduced the issue without mdmfs:

% mdconfig -a -t swap -s 12G -u 4
% newfs -m 0 -o time /dev/md4
[...]
% mount /dev/md4 /tmp/foobar
% cp -p /usr/local/tinderbox/scripts/lib/buildscript /tmp/foobar
cp: failed to set acl entries for /tmp/foobar/buildscript: Operation not
supported

Without -p it works fine.
FWIW: % getfacl /usr/local/tinderbox/scripts/lib/buildscript # file: /usr/local/tinderbox/scripts/lib/buildscript # owner: root # group: wheel owner@:--------------:------:deny owner@:rwxp---A-W-Co-:------:allow group@:-w-p----------:------:deny group@:r-x-----------:------:allow everyone@:-w-p---A-W-Co-:------:deny everyone@:r-x---a-R-c--s:------:allow Any suggestions on where the problem could be? Thanks, Steve -----BEGIN PGP SIGNATURE----- Version: GnuPG v2.0.16 (FreeBSD) iQEcBAEBAgAGBQJNc52WAAoJEPXPYrMgexuhayoH/ROak0Vj2Ezh3t0ViqeZ8n/v Pa60x/MDvHcoqtEUM6CQulvf88pAjat07JCwoZKf2qlNgZgrcoK5gPjSeDsN+9jW LJxuFIyTOAmNxVC3FJgRuynTv06nAXDJu9f8psYVQS8EW56UQ9gmvKWNA3v80w2F bre2qzHneA42+5ZvVLnK6sSMJ2IBoyk9F1FXamUsP74TKygDL3iijatWWROJ+lQ+ HdY+TnmKEkZcXbl5qhya4etpPOxKcuTCD/VqYvUJXqkseIny9SE60xVhGyQWlDkU xEtjHQL8oRkc5CTHpCVJQMFiVGNFpBKutZq56wAaG0xgcDuWhvHJ3hcv8m93VYg= =c86J -----END PGP SIGNATURE----- From owner-freebsd-fs@FreeBSD.ORG Sun Mar 6 15:37:49 2011 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 1E1021065676 for ; Sun, 6 Mar 2011 15:37:49 +0000 (UTC) (envelope-from jdc@koitsu.dyndns.org) Received: from qmta15.emeryville.ca.mail.comcast.net (qmta15.emeryville.ca.mail.comcast.net [76.96.27.228]) by mx1.freebsd.org (Postfix) with ESMTP id 0138E8FC0C for ; Sun, 6 Mar 2011 15:37:48 +0000 (UTC) Received: from omta17.emeryville.ca.mail.comcast.net ([76.96.30.73]) by qmta15.emeryville.ca.mail.comcast.net with comcast id Frdo1g0031afHeLAFrdovM; Sun, 06 Mar 2011 15:37:48 +0000 Received: from koitsu.dyndns.org ([98.248.33.18]) by omta17.emeryville.ca.mail.comcast.net with comcast id Frdl1g00C0PUQVN8drdmup; Sun, 06 Mar 2011 15:37:46 +0000 Received: by icarus.home.lan (Postfix, from userid 1000) id 6B8F69B422; Sun, 6 Mar 2011 07:37:45 -0800 (PST) Date: Sun, 6 Mar 2011 07:37:45 -0800 From: Jeremy Chadwick To: Steve Wills Message-ID: 
 <20110306153745.GA93530@icarus.home.lan>
References: <20110227202957.GD1992@garage.freebsd.pl> <4D73098F.3000807@FreeBSD.org> <59D664AA-76C6-45C7-94CE-5AA63080368C@FreeBSD.org> <4D738DB0.1090603@FreeBSD.org> <4D739D96.5090705@FreeBSD.org>
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
Content-Transfer-Encoding: quoted-printable
In-Reply-To: <4D739D96.5090705@FreeBSD.org>
User-Agent: Mutt/1.5.21 (2010-09-15)
Cc: freebsd-fs@FreeBSD.org, freebsd-current@FreeBSD.org, Edward Tomasz Napierała
Subject: Re: ACL issue (Was Re: HEADS UP: ZFSv28 is in!)
X-BeenThere: freebsd-fs@freebsd.org
X-Mailman-Version: 2.1.5
Precedence: list
List-Id: Filesystems
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
X-List-Received-Date: Sun, 06 Mar 2011 15:37:49 -0000

On Sun, Mar 06, 2011 at 09:43:34AM -0500, Steve Wills wrote:
> -----BEGIN PGP SIGNED MESSAGE-----
> Hash: SHA1
>
> On 03/06/11 08:35, Steve Wills wrote:
> > On 03/06/11 04:22, Edward Tomasz Napierała wrote:
> >> Message written by Steve Wills on 2011-03-06 at 05:11:
> >
> >> [..]
> >
> >>> Thanks for your work on this, I'm very happy to have ZFS v28. I just
> >>> updated my -CURRENT system from a snapshot from about a month ago to
> >>> code from today. I have 3 pools and one of them is for ports tinderbox.
> >>> I only upgraded that pool. When I try to build something using
> >>> tinderbox, I get this error:
> >>>
> >>> cp: failed to set acl entries for
> >>> /usr/local/tinderbox/9-CURRENT-amd64-FreeBSD/buildscript: Operation not
> >>> supported
> >
> >> What does "mount" show?
> >
> > /dev/md4 12186190 332724 11853466 3%
> > /usr/local/tinderbox/9-CURRENT-amd64-FreeBSD
> >
> > Sorry, I forgot about the mdmfs hacks I had in my local tinderd. Without
> > them, it works fine. So the problem seems to be in mfs rather than zfs.
>
> I should have said mdmfs, but all that's doing is running mdconfig and
> newfs for me.
> I've reproduced the issue without mdmfs:
>
> % mdconfig -a -t swap -s 12G -u 4
> % newfs -m 0 -o time /dev/md4
> [...]
> % mount /dev/md4 /tmp/foobar
> % cp -p /usr/local/tinderbox/scripts/lib/buildscript /tmp/foobar
> cp: failed to set acl entries for /tmp/foobar/buildscript: Operation not
> supported
>
> Without -p it works fine. FWIW:
>
> % getfacl /usr/local/tinderbox/scripts/lib/buildscript
> # file: /usr/local/tinderbox/scripts/lib/buildscript
> # owner: root
> # group: wheel
> owner@:--------------:------:deny
> owner@:rwxp---A-W-Co-:------:allow
> group@:-w-p----------:------:deny
> group@:r-x-----------:------:allow
> everyone@:-w-p---A-W-Co-:------:deny
> everyone@:r-x---a-R-c--s:------:allow
>
> Any suggestions on where the problem could be?

At first glance it looks like acl_set_fd_np(3) isn't working on an
md-backed filesystem; specifically, it's returning EOPNOTSUPP. You
should be able to reproduce the problem by doing a setfacl on something
in /tmp/foobar.

Looking through src/bin/cp/utils.c, this is the code:

   420          if (acl_set_fd_np(dest_fd, acl, acl_type) < 0) {
   421                  warn("failed to set acl entries for %s", to.p_path);
   422                  acl_free(acl);
   423                  return (1);
   424          }

EOPNOTSUPP for acl_set_fd_np(3) is defined as:

     [EOPNOTSUPP]       The file system does not support ACL retrieval.

This would be referring to the destination filesystem.

Looking through the md(4) source for references to EOPNOTSUPP, we do
find some references:

$ egrep -n -r "EOPNOTSUPP|ENOTSUP" /usr/src/sys/dev/md
/usr/src/sys/dev/md/md.c:423: return (EOPNOTSUPP);
/usr/src/sys/dev/md/md.c:475: error = EOPNOTSUPP;
/usr/src/sys/dev/md/md.c:523: return (EOPNOTSUPP);
/usr/src/sys/dev/md/md.c:601: return (EOPNOTSUPP);
/usr/src/sys/dev/md/md.c:731: error = EOPNOTSUPP;

Line 423 is within mdstart_malloc(), and it returns EOPNOTSUPP on any
BIO operation other than READ/WRITE/DELETE. Line 475 is a continuation
of that.

Line 508 is within mdstart_vnode(), behaving effectively the same as
line 423.
Line 601 is within mdstart_swap(), behaving effectively the
same as line 423.

Line 731 is within md_kthread(), and indicates only BIO operation
BIO_GETATTR is supported. This would not be an "ACL attribute" thing,
but rather getting attributes of the backing device itself. The code
hints at that:

   722          if (bp->bio_cmd == BIO_GETATTR) {
   723                  if ((sc->fwsectors && sc->fwheads &&
   724                      (g_handleattr_int(bp, "GEOM::fwsectors",
   725                      sc->fwsectors) ||
   726                      g_handleattr_int(bp, "GEOM::fwheads",
   727                      sc->fwheads))) ||
   728                      g_handleattr_int(bp, "GEOM::candelete", 1))
   729                          error = -1;
   730                  else
   731                          error = EOPNOTSUPP;
   732          } else {

This leaves me with some ideas; just tossing them out here...

1. Maybe/somehow this is caused by swap being used as the backing
type/store for md(4)? Try using "mdconfig -t malloc -o reserve"
instead, temporarily anyway.

2. Are you absolutely 100% sure the kernel you're using was built
with "options UFS_ACL" defined in it? Doing a "strings -a
/boot/kernel/kernel | grep UFS_ACL" should suffice.

--
| Jeremy Chadwick                                   jdc@parodius.com |
| Parodius Networking                       http://www.parodius.com/ |
| UNIX Systems Administrator                  Mountain View, CA, USA |
| Making life hard for others since 1977.
PGP 4BD6C0CB | From owner-freebsd-fs@FreeBSD.ORG Sun Mar 6 16:06:13 2011 Return-Path: Delivered-To: freebsd-fs@FreeBSD.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 296E7106566B; Sun, 6 Mar 2011 16:06:13 +0000 (UTC) (envelope-from swills@FreeBSD.org) Received: from mouf.net (mouf.net [204.109.58.86]) by mx1.freebsd.org (Postfix) with ESMTP id 970968FC18; Sun, 6 Mar 2011 16:06:12 +0000 (UTC) Received: from meatwad.mouf.net (cpe-065-190-178-041.nc.res.rr.com [65.190.178.41]) (authenticated bits=0) by mouf.net (8.14.4/8.14.4) with ESMTP id p26G699w097069 (version=TLSv1/SSLv3 cipher=DHE-RSA-CAMELLIA256-SHA bits=256 verify=NOT); Sun, 6 Mar 2011 11:06:11 -0500 (EST) (envelope-from swills@FreeBSD.org) Message-ID: <4D73B0F1.1040304@FreeBSD.org> Date: Sun, 06 Mar 2011 11:06:09 -0500 From: Steve Wills User-Agent: Mozilla/5.0 (X11; U; FreeBSD amd64; en-US; rv:1.9.1.16) Gecko/20110130 Thunderbird/3.0.11 MIME-Version: 1.0 To: Jeremy Chadwick References: <20110227202957.GD1992@garage.freebsd.pl> <4D73098F.3000807@FreeBSD.org> <59D664AA-76C6-45C7-94CE-5AA63080368C@FreeBSD.org> <4D738DB0.1090603@FreeBSD.org> <4D739D96.5090705@FreeBSD.org> <20110306153745.GA93530@icarus.home.lan> In-Reply-To: <20110306153745.GA93530@icarus.home.lan> X-Enigmail-Version: 1.0.1 Content-Type: text/plain; charset=ISO-8859-1 Content-Transfer-Encoding: 7bit X-Greylist: Sender succeeded SMTP AUTH, not delayed by milter-greylist-4.2.6 (mouf.net [204.109.58.86]); Sun, 06 Mar 2011 11:06:11 -0500 (EST) X-Virus-Scanned: clamav-milter 0.96.2 at mouf.net X-Virus-Status: Clean Cc: freebsd-fs@FreeBSD.org, freebsd-current@FreeBSD.org, Edward Tomasz Napiera?a Subject: Re: ACL issue (Was Re: HEADS UP: ZFSv28 is in!) 
X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Sun, 06 Mar 2011 16:06:13 -0000 -----BEGIN PGP SIGNED MESSAGE----- Hash: SHA1 On 03/06/11 10:37, Jeremy Chadwick wrote: > > At first glance it looks like acl_set_fd_np(3) isn't working on an > md-backed filesystem; specifically, it's returning EOPNOTSUPP. You > should be able to reproduce the problem by doing a setfacl on something > in /tmp/foobar. > > Looking through src/bin/cp/utils.c, this is the code: > > 420 if (acl_set_fd_np(dest_fd, acl, acl_type) < 0) { > 421 warn("failed to set acl entries for %s", to.p_path); > 422 acl_free(acl); > 423 return (1); > 424 } > > EOPNOTSUPP for acl_set_fd_np(3) is defined as: > > [EOPNOTSUPP] The file system does not support ACL retrieval. > > This would be referring to the destination filesystem. > > Looking through the md(4) source for references to EOPNOTSUPP, we do > find some references: > > $ egrep -n -r "EOPNOTSUPP|ENOTSUP" /usr/src/sys/dev/md > /usr/src/sys/dev/md/md.c:423: return (EOPNOTSUPP); > /usr/src/sys/dev/md/md.c:475: error = EOPNOTSUPP; > /usr/src/sys/dev/md/md.c:523: return (EOPNOTSUPP); > /usr/src/sys/dev/md/md.c:601: return (EOPNOTSUPP); > /usr/src/sys/dev/md/md.c:731: error = EOPNOTSUPP; > > Line 423 is within mdstart_malloc(), and it returns EOPNOTSUPP on any > BIO operation other than READ/WRITE/DELETE. Line 475 is a continuation > of that. > > Line 508 is within mdstart_vnode(), behaving effectively the same as > line 423. Line 601 is within mdstart_swap(), behaving effectively the > same as line 423. > > Line 731 is within md_kthread(), and indicates only BIO operation > BIO_GETATTR is supported. This would not be an "ACL attribute" thing, > but rather getting attributes of the backing device itself. 
The code > hints at that: > > 722 if (bp->bio_cmd == BIO_GETATTR) { > 723 if ((sc->fwsectors && sc->fwheads && > 724 (g_handleattr_int(bp, "GEOM::fwsectors", > 725 sc->fwsectors) || > 726 g_handleattr_int(bp, "GEOM::fwheads", > 727 sc->fwheads))) || > 728 g_handleattr_int(bp, "GEOM::candelete", 1)) > 729 error = -1; > 730 else > 731 error = EOPNOTSUPP; > 732 } else { Thanks for the investigation! So this seems to be a bug in md? That's too bad, I was enjoying using it to make my tinderbox builds faster. > This leaves me with some ideas; just tossing them out here... > > 1. Maybe/somehow this is caused by swap being used as the backing > type/store for md(4)? Try using "mdconfig -t malloc -o reserve" > instead, temporarily anyway. Seems to be the same. > 2. Are you absolutely 100% sure the kernel you're using was built > with "options UFS_ACL" defined in it? Doing a "strings -a > /boot/kernel/kernel | grep UFS_ACL" should suffice. > Yep, it does: % strings -a /boot/kernel/kernel | grep UFS_ACL options UFS_ACL (My kernel config is just "include GENERIC" then a bunch of "nooptions" for KDB, DDB, GDB, INVARIANTS, WITNESS, etc.) 
Steve -----BEGIN PGP SIGNATURE----- Version: GnuPG v2.0.16 (FreeBSD) iQEcBAEBAgAGBQJNc7DxAAoJEPXPYrMgexuh3gsH/0L474FitZMdLLrTLiDiU7jR D+9syg0boUYcWbv6pA1j1r8LvXMrw0rIxvZOPB4BauY/u8nL5n0YgDgv7tjb69+D n/m7ce6r1tm6JtBSl/d+MIYfmcnj1E9B8ibgeGwPApKnhe4lmmyLpFHW98tcU1EL Be+koxDiaKloryyfHrlcIfmSmXMUZ8lP7MFHfFeS39KbE+sf7xXHHLjFE7bcPSi4 qKyBFDcw/ykRjsrM3+YDIanhLUHg8ZjKhlrzbPUgMpzlXXe2QbmLkQELa9SmhVzH juYywb7JOe5uHuefFQxnTLkSWuDjTlxLW6M+FuNEDejfA91sGIil7m+1nMcdCFg= =nsSt -----END PGP SIGNATURE----- From owner-freebsd-fs@FreeBSD.ORG Sun Mar 6 16:23:47 2011 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 5ECFF1065674 for ; Sun, 6 Mar 2011 16:23:47 +0000 (UTC) (envelope-from jdc@koitsu.dyndns.org) Received: from qmta08.emeryville.ca.mail.comcast.net (qmta08.emeryville.ca.mail.comcast.net [76.96.30.80]) by mx1.freebsd.org (Postfix) with ESMTP id 41B678FC17 for ; Sun, 6 Mar 2011 16:23:46 +0000 (UTC) Received: from omta19.emeryville.ca.mail.comcast.net ([76.96.30.76]) by qmta08.emeryville.ca.mail.comcast.net with comcast id FsDw1g0011eYJf8A8sPm93; Sun, 06 Mar 2011 16:23:46 +0000 Received: from koitsu.dyndns.org ([98.248.33.18]) by omta19.emeryville.ca.mail.comcast.net with comcast id FsPj1g00F0PUQVN01sPjEG; Sun, 06 Mar 2011 16:23:44 +0000 Received: by icarus.home.lan (Postfix, from userid 1000) id 014CA9B422; Sun, 6 Mar 2011 08:23:42 -0800 (PST) Date: Sun, 6 Mar 2011 08:23:42 -0800 From: Jeremy Chadwick To: Steve Wills Message-ID: <20110306162342.GA94700@icarus.home.lan> References: <20110227202957.GD1992@garage.freebsd.pl> <4D73098F.3000807@FreeBSD.org> <59D664AA-76C6-45C7-94CE-5AA63080368C@FreeBSD.org> <4D738DB0.1090603@FreeBSD.org> <4D739D96.5090705@FreeBSD.org> <20110306153745.GA93530@icarus.home.lan> <4D73B0F1.1040304@FreeBSD.org> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: 
<4D73B0F1.1040304@FreeBSD.org> User-Agent: Mutt/1.5.21 (2010-09-15) Cc: freebsd-fs@FreeBSD.org, freebsd-current@FreeBSD.org, Edward Tomasz Napiera?a Subject: Re: ACL issue (Was Re: HEADS UP: ZFSv28 is in!) X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Sun, 06 Mar 2011 16:23:47 -0000 On Sun, Mar 06, 2011 at 11:06:09AM -0500, Steve Wills wrote: > On 03/06/11 10:37, Jeremy Chadwick wrote: > > > > At first glance it looks like acl_set_fd_np(3) isn't working on an > > md-backed filesystem; specifically, it's returning EOPNOTSUPP. You > > should be able to reproduce the problem by doing a setfacl on something > > in /tmp/foobar. > > > > Looking through src/bin/cp/utils.c, this is the code: > > > > 420 if (acl_set_fd_np(dest_fd, acl, acl_type) < 0) { > > 421 warn("failed to set acl entries for %s", to.p_path); > > 422 acl_free(acl); > > 423 return (1); > > 424 } > > > > EOPNOTSUPP for acl_set_fd_np(3) is defined as: > > > > [EOPNOTSUPP] The file system does not support ACL retrieval. > > > > This would be referring to the destination filesystem. > > > > Looking through the md(4) source for references to EOPNOTSUPP, we do > > find some references: > > > > $ egrep -n -r "EOPNOTSUPP|ENOTSUP" /usr/src/sys/dev/md > > /usr/src/sys/dev/md/md.c:423: return (EOPNOTSUPP); > > /usr/src/sys/dev/md/md.c:475: error = EOPNOTSUPP; > > /usr/src/sys/dev/md/md.c:523: return (EOPNOTSUPP); > > /usr/src/sys/dev/md/md.c:601: return (EOPNOTSUPP); > > /usr/src/sys/dev/md/md.c:731: error = EOPNOTSUPP; > > > > Line 423 is within mdstart_malloc(), and it returns EOPNOTSUPP on any > > BIO operation other than READ/WRITE/DELETE. Line 475 is a continuation > > of that. > > > > Line 508 is within mdstart_vnode(), behaving effectively the same as > > line 423. Line 601 is within mdstart_swap(), behaving effectively the > > same as line 423. 
> > > > Line 731 is within md_kthread(), and indicates only BIO operation > > BIO_GETATTR is supported. This would not be an "ACL attribute" thing, > > but rather getting attributes of the backing device itself. The code > > hints at that: > > > > 722 if (bp->bio_cmd == BIO_GETATTR) { > > 723 if ((sc->fwsectors && sc->fwheads && > > 724 (g_handleattr_int(bp, "GEOM::fwsectors", > > 725 sc->fwsectors) || > > 726 g_handleattr_int(bp, "GEOM::fwheads", > > 727 sc->fwheads))) || > > 728 g_handleattr_int(bp, "GEOM::candelete", 1)) > > 729 error = -1; > > 730 else > > 731 error = EOPNOTSUPP; > > 732 } else { > > Thanks for the investigation! So this seems to be a bug in md? That's > too bad, I was enjoying using it to make my tinderbox builds faster. Sorry, I should have been more clear -- my investigation wasn't to determine if the issue you're reporting was a bug or not, but more along the lines of "hmm, where is userland getting EOPNOTSUPP from in the kernel in this situation?" It could be that some piece hasn't been implemented somewhere yet (more an "incomplete" than a bug :-) ). I tend to trace source the way I did above in hopes that someone (kernel dev, etc.) will chime in and go "Oh, yes, THAT... let me tell you about that!" It's also for educational purposes; I figure sharing the innards along with some simple descriptions might help people feel more comfortable (vs. thinking everything is a black box; don't let the magic smoke out!). Sometimes digging through the code helps. > > This leaves me with some ideas; just tossing them out here... > > > > 1. Maybe/somehow this is caused by swap being used as the backing > > type/store for md(4)? Try using "mdconfig -t malloc -o reserve" > > instead, temporarily anyway. > > Seems to be the same. I'm not too surprised, but at least that rules out swap vs. non-block-device stuff being somehow responsible. I'm not a user of ACLs myself, but Robert Watson might know what's up with this, or where to go looking. 
I've CC'd him here. > > 2. Are you absolutely 100% sure the kernel you're using was built > > with "options UFS_ACL" defined in it? Doing a "strings -a > > /boot/kernel/kernel | grep UFS_ACL" should suffice. > > > > Yep, it does: > > % strings -a /boot/kernel/kernel | grep UFS_ACL > options UFS_ACL > > (My kernel config is just "include GENERIC" then a bunch of "nooptions" > for KDB, DDB, GDB, INVARIANTS, WITNESS, etc.) Cool, good to rule out the obvious. Thanks. The only other thing I can think of off the top of my head would be to "ktrace -t+ -i" the cp -p, then provide output of kdump -s -t+ after. I wouldn't say go about this quite yet (it may not even help determine what's going on); maybe wait for Robert to take a look first. -- | Jeremy Chadwick jdc@parodius.com | | Parodius Networking http://www.parodius.com/ | | UNIX Systems Administrator Mountain View, CA, USA | | Making life hard for others since 1977. PGP 4BD6C0CB | From owner-freebsd-fs@FreeBSD.ORG Sun Mar 6 16:30:16 2011 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 4E093106566C for ; Sun, 6 Mar 2011 16:30:16 +0000 (UTC) (envelope-from jdc@koitsu.dyndns.org) Received: from qmta04.emeryville.ca.mail.comcast.net (qmta04.emeryville.ca.mail.comcast.net [76.96.30.40]) by mx1.freebsd.org (Postfix) with ESMTP id 2F33D8FC18 for ; Sun, 6 Mar 2011 16:30:15 +0000 (UTC) Received: from omta16.emeryville.ca.mail.comcast.net ([76.96.30.72]) by qmta04.emeryville.ca.mail.comcast.net with comcast id FsCF1g0011ZMdJ4A4sWFBu; Sun, 06 Mar 2011 16:30:15 +0000 Received: from koitsu.dyndns.org ([98.248.33.18]) by omta16.emeryville.ca.mail.comcast.net with comcast id FsWC1g0010PUQVN8csWCx8; Sun, 06 Mar 2011 16:30:14 +0000 Received: by icarus.home.lan (Postfix, from userid 1000) id DA2E69B422; Sun, 6 Mar 2011 08:30:11 -0800 (PST) Date: Sun, 6 Mar 2011 08:30:11 -0800 From: Jeremy Chadwick To: Steve Wills 
Message-ID: <20110306163011.GA95053@icarus.home.lan> References: <20110227202957.GD1992@garage.freebsd.pl> <4D73098F.3000807@FreeBSD.org> <59D664AA-76C6-45C7-94CE-5AA63080368C@FreeBSD.org> <4D738DB0.1090603@FreeBSD.org> <4D739D96.5090705@FreeBSD.org> <20110306153745.GA93530@icarus.home.lan> <4D73B0F1.1040304@FreeBSD.org> <20110306162342.GA94700@icarus.home.lan> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <20110306162342.GA94700@icarus.home.lan> User-Agent: Mutt/1.5.21 (2010-09-15) Cc: freebsd-fs@FreeBSD.org, Robert Watson , freebsd-current@FreeBSD.org, Edward Tomasz Napiera?a Subject: Re: ACL issue (Was Re: HEADS UP: ZFSv28 is in!) X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Sun, 06 Mar 2011 16:30:16 -0000 On Sun, Mar 06, 2011 at 08:23:42AM -0800, Jeremy Chadwick wrote: > On Sun, Mar 06, 2011 at 11:06:09AM -0500, Steve Wills wrote: > > On 03/06/11 10:37, Jeremy Chadwick wrote: > > > > > > At first glance it looks like acl_set_fd_np(3) isn't working on an > > > md-backed filesystem; specifically, it's returning EOPNOTSUPP. You > > > should be able to reproduce the problem by doing a setfacl on something > > > in /tmp/foobar. > > > > > > Looking through src/bin/cp/utils.c, this is the code: > > > > > > 420 if (acl_set_fd_np(dest_fd, acl, acl_type) < 0) { > > > 421 warn("failed to set acl entries for %s", to.p_path); > > > 422 acl_free(acl); > > > 423 return (1); > > > 424 } > > > > > > EOPNOTSUPP for acl_set_fd_np(3) is defined as: > > > > > > [EOPNOTSUPP] The file system does not support ACL retrieval. > > > > > > This would be referring to the destination filesystem. 
> > > > > > Looking through the md(4) source for references to EOPNOTSUPP, we do > > > find some references: > > > > > > $ egrep -n -r "EOPNOTSUPP|ENOTSUP" /usr/src/sys/dev/md > > > /usr/src/sys/dev/md/md.c:423: return (EOPNOTSUPP); > > > /usr/src/sys/dev/md/md.c:475: error = EOPNOTSUPP; > > > /usr/src/sys/dev/md/md.c:523: return (EOPNOTSUPP); > > > /usr/src/sys/dev/md/md.c:601: return (EOPNOTSUPP); > > > /usr/src/sys/dev/md/md.c:731: error = EOPNOTSUPP; > > > > > > Line 423 is within mdstart_malloc(), and it returns EOPNOTSUPP on any > > > BIO operation other than READ/WRITE/DELETE. Line 475 is a continuation > > > of that. > > > > > > Line 508 is within mdstart_vnode(), behaving effectively the same as > > > line 423. Line 601 is within mdstart_swap(), behaving effectively the > > > same as line 423. > > > > > > Line 731 is within md_kthread(), and indicates only BIO operation > > > BIO_GETATTR is supported. This would not be an "ACL attribute" thing, > > > but rather getting attributes of the backing device itself. The code > > > hints at that: > > > > > > 722 if (bp->bio_cmd == BIO_GETATTR) { > > > 723 if ((sc->fwsectors && sc->fwheads && > > > 724 (g_handleattr_int(bp, "GEOM::fwsectors", > > > 725 sc->fwsectors) || > > > 726 g_handleattr_int(bp, "GEOM::fwheads", > > > 727 sc->fwheads))) || > > > 728 g_handleattr_int(bp, "GEOM::candelete", 1)) > > > 729 error = -1; > > > 730 else > > > 731 error = EOPNOTSUPP; > > > 732 } else { > > > > Thanks for the investigation! So this seems to be a bug in md? That's > > too bad, I was enjoying using it to make my tinderbox builds faster. > > Sorry, I should have been more clear -- my investigation wasn't to > determine if the issue you're reporting was a bug or not, but more along > the lines of "hmm, where is userland getting EOPNOTSUPP from in the > kernel in this situation?" It could be that some piece hasn't been > implemented somewhere yet (more an "incomplete" than a bug :-) ). 
> > I tend to trace source the way I did above in hopes that someone (kernel > dev, etc.) will chime in and go "Oh, yes, THAT... let me tell you about > that!" It's also for educational purposes; I figure sharing the innards > along with some simple descriptions might help people feel more > comfortable (vs. thinking everything is a black box; don't let the magic > smoke out!). Sometimes digging through the code helps. > > > > This leaves me with some ideas; just tossing them out here... > > > > > > 1. Maybe/somehow this is caused by swap being used as the backing > > > type/store for md(4)? Try using "mdconfig -t malloc -o reserve" > > > instead, temporarily anyway. > > > > Seems to be the same. > > I'm not too surprised, but at least that rules out swap vs. > non-block-device stuff being somehow responsible. > > I'm not a user of ACLs myself, but Robert Watson might know what's up > with this, or where to go looking. I've CC'd him here. > > > > 2. Are you absolutely 100% sure the kernel you're using was built > > > with "options UFS_ACL" defined in it? Doing a "strings -a > > > /boot/kernel/kernel | grep UFS_ACL" should suffice. > > > > > > > Yep, it does: > > > > % strings -a /boot/kernel/kernel | grep UFS_ACL > > options UFS_ACL > > > > (My kernel config is just "include GENERIC" then a bunch of "nooptions" > > for KDB, DDB, GDB, INVARIANTS, WITNESS, etc.) > > Cool, good to rule out the obvious. Thanks. > > The only other thing I can think of off the top of my head would be to > "ktrace -t+ -i" the cp -p, then provide output of kdump -s -t+ after. > I wouldn't say go about this quite yet (it may not even help determine > what's going on); maybe wait for Robert to take a look first. It would help if I actually added Robert to the CC list, wouldn't it? :-) -- | Jeremy Chadwick jdc@parodius.com | | Parodius Networking http://www.parodius.com/ | | UNIX Systems Administrator Mountain View, CA, USA | | Making life hard for others since 1977. 
PGP 4BD6C0CB | From owner-freebsd-fs@FreeBSD.ORG Sun Mar 6 16:33:59 2011 Return-Path: Delivered-To: freebsd-fs@FreeBSD.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 34073106566B; Sun, 6 Mar 2011 16:33:59 +0000 (UTC) (envelope-from rwatson@FreeBSD.org) Received: from cyrus.watson.org (cyrus.watson.org [65.122.17.42]) by mx1.freebsd.org (Postfix) with ESMTP id 099878FC08; Sun, 6 Mar 2011 16:33:59 +0000 (UTC) Received: from [192.168.2.112] (host109-157-101-183.range109-157.btcentralplus.com [109.157.101.183]) by cyrus.watson.org (Postfix) with ESMTPSA id E19C546B2E; Sun, 6 Mar 2011 11:33:57 -0500 (EST) Mime-Version: 1.0 (Apple Message framework v1082) Content-Type: text/plain; charset=us-ascii From: "Robert N. M. Watson" In-Reply-To: <20110306163011.GA95053@icarus.home.lan> Date: Sun, 6 Mar 2011 16:33:56 +0000 Content-Transfer-Encoding: quoted-printable Message-Id: <995309EC-397D-4DC5-A49C-B881DAA519AF@FreeBSD.org> References: <20110227202957.GD1992@garage.freebsd.pl> <4D73098F.3000807@FreeBSD.org> <59D664AA-76C6-45C7-94CE-5AA63080368C@FreeBSD.org> <4D738DB0.1090603@FreeBSD.org> <4D739D96.5090705@FreeBSD.org> <20110306153745.GA93530@icarus.home.lan> <4D73B0F1.1040304@FreeBSD.org> <20110306162342.GA94700@icarus.home.lan> <20110306163011.GA95053@icarus.home.lan> To: Jeremy Chadwick X-Mailer: Apple Mail (2.1082) Cc: freebsd-fs@FreeBSD.org, Steve Wills , freebsd-current@FreeBSD.org, Edward Tomasz Napiera?a Subject: Re: ACL issue (Was Re: HEADS UP: ZFSv28 is in!) X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Sun, 06 Mar 2011 16:33:59 -0000 On 6 Mar 2011, at 16:30, Jeremy Chadwick wrote: >>>> 2. Are you absolutely 100% sure the kernel you're using was built >>>> with "options UFS_ACL" defined in it? 
Doing a "strings -a
>>>> /boot/kernel/kernel | grep UFS_ACL" should suffice.
>>>>
>>>
>>> Yep, it does:
>>>
>>> % strings -a /boot/kernel/kernel | grep UFS_ACL
>>> options UFS_ACL
>>>
>>> (My kernel config is just "include GENERIC" then a bunch of "nooptions"
>>> for KDB, DDB, GDB, INVARIANTS, WITNESS, etc.)
>>
>> Cool, good to rule out the obvious. Thanks.
>>
>> The only other thing I can think of off the top of my head would be to
>> "ktrace -t+ -i" the cp -p, then provide output of kdump -s -t+ after.
>> I wouldn't say go about this quite yet (it may not even help determine
>> what's going on); maybe wait for Robert to take a look first.
>
> It would help if I actually added Robert to the CC list, wouldn't it?
> :-)

There's a lot of information in that post; perhaps it would be useful
for someone to clarify what's going on exactly. If you're using ACLs on
UFS, have you turned them on using tunefs? What flavour of ACLs are you
using -- POSIX.1e or NFSv4?
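
[Editor's sketch, not part of Robert's message.] The two checks asked about above can be roughed out in plain sh and awk. The helper names (acl_flavour, tunefs_flag, enable_acl_plan) are made up for illustration. The tunefs flags are taken from tunefs(8) (-a for POSIX.1e, -N for NFSv4), not from this thread, and the device and mount point are the ones from the reproduction earlier. The plan is only printed, never executed, and the flavour test is a heuristic on getfacl(1) output:

```shell
#!/bin/sh
# Hypothetical helpers; names are illustrative only.

# Guess the ACL flavour from getfacl(1) output on stdin.
acl_flavour() {
    awk '
        /^#/ || /^[ \t]*$/ { next }                 # skip comment headers and blank lines
        /:(allow|deny)[ \t]*$/ { nfs4++; next }     # NFSv4 entries end in allow or deny
        /^[ \t]*(user|group|other|mask):/ { posix++; next }
        END {
            if (nfs4 > 0 && posix == 0) print "nfsv4"
            else if (posix > 0 && nfs4 == 0) print "posix1e"
            else print "unknown"
        }
    '
}

# Map a flavour to the tunefs(8) flag that enables it on UFS.
tunefs_flag() {
    case "$1" in
        nfsv4)   printf '%s\n' "-N" ;;   # NFSv4 ACLs
        posix1e) printf '%s\n' "-a" ;;   # POSIX.1e ACLs
        *)       return 1 ;;
    esac
}

# Print (do not run) the steps to enable a flavour on a UFS device.
# Usage: enable_acl_plan FLAVOUR DEVICE MOUNTPOINT
# tunefs wants the file system unmounted or mounted read-only.
enable_acl_plan() {
    flag=$(tunefs_flag "$1") || return 1
    echo "umount $3"
    echo "tunefs $flag enable $2"
    echo "mount $2 $3"
}
```

For example, "getfacl /usr/local/tinderbox/scripts/lib/buildscript | acl_flavour" should report the flavour of the source file's ACL, and "enable_acl_plan nfsv4 /dev/md4 /tmp/foobar" prints the matching umount/tunefs/mount sequence for review before anyone runs it by hand.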
Robert

From owner-freebsd-fs@FreeBSD.ORG Sun Mar 6 16:43:13 2011 Return-Path: Delivered-To: freebsd-fs@FreeBSD.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 58A46106566B; Sun, 6 Mar 2011 16:43:13 +0000 (UTC) (envelope-from swills@FreeBSD.org) Received: from mouf.net (mouf.net [204.109.58.86]) by mx1.freebsd.org (Postfix) with ESMTP id EC0C88FC16; Sun, 6 Mar 2011 16:43:12 +0000 (UTC) Received: from meatwad.mouf.net (cpe-065-190-178-041.nc.res.rr.com [65.190.178.41]) (authenticated bits=0) by mouf.net (8.14.4/8.14.4) with ESMTP id p26Gh9FL097237 (version=TLSv1/SSLv3 cipher=DHE-RSA-CAMELLIA256-SHA bits=256 verify=NOT); Sun, 6 Mar 2011 11:43:10 -0500 (EST) (envelope-from swills@FreeBSD.org) Message-ID: <4D73B99D.1000901@FreeBSD.org> Date: Sun, 06 Mar 2011 11:43:09 -0500 From: Steve Wills User-Agent: Mozilla/5.0 (X11; U; FreeBSD amd64; en-US; rv:1.9.1.16) Gecko/20110130 Thunderbird/3.0.11 MIME-Version: 1.0 To: Jeremy Chadwick References: <20110227202957.GD1992@garage.freebsd.pl> <4D73098F.3000807@FreeBSD.org> <59D664AA-76C6-45C7-94CE-5AA63080368C@FreeBSD.org> <4D738DB0.1090603@FreeBSD.org> <4D739D96.5090705@FreeBSD.org> <20110306153745.GA93530@icarus.home.lan> <4D73B0F1.1040304@FreeBSD.org> <20110306162342.GA94700@icarus.home.lan> <20110306163011.GA95053@icarus.home.lan> In-Reply-To: <20110306163011.GA95053@icarus.home.lan> X-Enigmail-Version: 1.0.1 Content-Type: text/plain; charset=ISO-8859-1 Content-Transfer-Encoding: 7bit X-Greylist: Sender succeeded SMTP AUTH, not delayed by milter-greylist-4.2.6 (mouf.net [204.109.58.86]); Sun, 06 Mar 2011 11:43:11 -0500 (EST) X-Virus-Scanned: clamav-milter 0.96.2 at mouf.net X-Virus-Status: Clean Cc: freebsd-fs@FreeBSD.org, Robert Watson , freebsd-current@FreeBSD.org, Edward Tomasz Napiera?a Subject: Re: ACL issue (Was Re: HEADS UP: ZFSv28 is in!)
X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Sun, 06 Mar 2011 16:43:13 -0000 -----BEGIN PGP SIGNED MESSAGE----- Hash: SHA1 On 03/06/11 11:30, Jeremy Chadwick wrote: > On Sun, Mar 06, 2011 at 08:23:42AM -0800, Jeremy Chadwick wrote: >> On Sun, Mar 06, 2011 at 11:06:09AM -0500, Steve Wills wrote: >> >> Sorry, I should have been more clear -- my investigation wasn't to >> determine if the issue you're reporting was a bug or not, but more along >> the lines of "hmm, where is userland getting EOPNOTSUPP from in the >> kernel in this situation?" It could be that some piece hasn't been >> implemented somewhere yet (more an "incomplete" than a bug :-) ). >> >> I tend to trace source the way I did above in hopes that someone (kernel >> dev, etc.) will chime in and go "Oh, yes, THAT... let me tell you about >> that!" It's also for educational purposes; I figure sharing the innards >> along with some simple descriptions might help people feel more >> comfortable (vs. thinking everything is a black box; don't let the magic >> smoke out!). Sometimes digging through the code helps. Definitely. I had started looking at cp(1) source, but got a bit lost. >>>> This leaves me with some ideas; just tossing them out here... >>>> >>>> 1. Maybe/somehow this is caused by swap being used as the backing >>>> type/store for md(4)? Try using "mdconfig -t malloc -o reserve" >>>> instead, temporarily anyway. >>> >>> Seems to be the same. >> >> I'm not too surprised, but at least that rules out swap vs. >> non-block-device stuff being somehow responsible. >> >> I'm not a user of ACLs myself, but Robert Watson might know what's up >> with this, or where to go looking. I've CC'd him here. >> >>>> 2. Are you absolutely 100% sure the kernel you're using was built >>>> with "options UFS_ACL" defined in it? 
Doing a "strings -a >>>> /boot/kernel/kernel | grep UFS_ACL" should suffice. >>>> >>> >>> Yep, it does: >>> >>> % strings -a /boot/kernel/kernel | grep UFS_ACL >>> options UFS_ACL >>> >>> (My kernel config is just "include GENERIC" then a bunch of "nooptions" >>> for KDB, DDB, GDB, INVARIANTS, WITNESS, etc.) >> >> Cool, good to rule out the obvious. Thanks. >> >> The only other thing I can think of off the top of my head would be to >> "ktrace -t+ -i" the cp -p, then provide output of kdump -s -t+ after. >> I wouldn't say go about this quite yet (it may not even help determine >> what's going on); maybe wait for Robert to take a look first. > > It would help if I actually added Robert to the CC list, wouldn't it? > :-) > That's OK, kib@ enlightened me (via IRC) that the issue is that I failed to enable NFSv4 ACLs on the FS. I had tried this, but somehow got an error, and then when I tried again I had the wrong ACL type (POSIX.1e). Steve -----BEGIN PGP SIGNATURE----- Version: GnuPG v2.0.16 (FreeBSD) iQEcBAEBAgAGBQJNc7mdAAoJEPXPYrMgexuhhZUIAId0nmh4YTJbjzv3NDmxXVt3 16ZIx+wOQON9Sln0vrpKIDJGk95KzvuLnbVBPg7Oxhaa11llkEeYFFqMEVWn6Esa hqwDe5yYJYWWyF7ulCmHDbAE2gEF5q2rVy0KrV+aI9x5DLeB607dpmZqVV6TeQky mQb1zOcw165galYhI3S4juPK6z5nq5pnTc+l05590CcAkWtxOFwQjlDZiQtrxdg2 YhFhtrMeGubRdKtJyG0r17kJzlGCBwIYBg7SgnmORVB64W0N0zkVcC+ZrIhioR6Z FoucxqelZ4VDt6IlmxZ3DzTNUGKWulCeCrus8+lDBPL1M92AfFgMF89i5n0Ot8Y= =302p -----END PGP SIGNATURE----- From owner-freebsd-fs@FreeBSD.ORG Sun Mar 6 17:22:10 2011 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id AD94E106566C for ; Sun, 6 Mar 2011 17:22:10 +0000 (UTC) (envelope-from mlmichael70@gmail.com) Received: from mail-ww0-f50.google.com (mail-ww0-f50.google.com [74.125.82.50]) by mx1.freebsd.org (Postfix) with ESMTP id 3C4348FC14 for ; Sun, 6 Mar 2011 17:22:09 +0000 (UTC) Received: by wwb31 with SMTP id 31so4598676wwb.31 for ; Sun, 06 
Mar 2011 09:22:08 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=gamma; h=domainkey-signature:message-id:date:from:user-agent:mime-version:to :cc:subject:references:in-reply-to:content-type :content-transfer-encoding; bh=+HsxlBScW7Yp7J/liZXVTJzTJNrIy+GjT1Pgo2XVTcc=; b=KKE6sWFPm/QtB+cPQYS031Lhw0zXlpFlJiGhspb8F/cM0hjtG8PwXM+Og+2MNXAdKN Xe7kH56DUEl6N1qar4LR0pXEcLeE5KWbCa5giySP8p1dqxE6ED3z6YoKX6Fyq9D21y6Z TRGktxvIa3/frbcgS3A8ztPLOa/zwt/DP5pOI= DomainKey-Signature: a=rsa-sha1; c=nofws; d=gmail.com; s=gamma; h=message-id:date:from:user-agent:mime-version:to:cc:subject :references:in-reply-to:content-type:content-transfer-encoding; b=t84iYDU9aeR3pJZtVWsGhdvagWaP5SCcrhTtqKsasRqMfqdheQpt6wWNy5+rxQe2rX gpfcqXArYw3ae/Zu7sQj6wY9z+vmOAqQAjDOwI8mYocc0Ak2wshh1AGCXHTm8CuTmPW5 ELMd+u/DwDOfurOl78UDPJekxr2TqiCcakLJs= Received: by 10.227.165.194 with SMTP id j2mr2618825wby.178.1299432128525; Sun, 06 Mar 2011 09:22:08 -0800 (PST) Received: from prime.nonspace ([82.132.211.72]) by mx.google.com with ESMTPS id w25sm1405463wbd.5.2011.03.06.09.22.06 (version=SSLv3 cipher=OTHER); Sun, 06 Mar 2011 09:22:07 -0800 (PST) Message-ID: <4D73C2C4.6010809@gmail.com> Date: Sun, 06 Mar 2011 17:22:12 +0000 From: Michael User-Agent: Mozilla/5.0 (X11; U; FreeBSD amd64; en-US; rv:1.9.2.13) Gecko/20101215 Thunderbird/3.1.7 MIME-Version: 1.0 To: Jeremy Chadwick References: <4D504D96.2040305@gmail.com> <4D510A7F.3070708@my.gd> <20110208101249.GA8057@icarus.home.lan> In-Reply-To: <20110208101249.GA8057@icarus.home.lan> Content-Type: text/plain; charset=ISO-8859-1; format=flowed Content-Transfer-Encoding: 7bit Cc: freebsd-fs@freebsd.org Subject: Re: chflags (uappnd) on ZFS X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Sun, 06 Mar 2011 17:22:10 -0000 On 08/02/2011 10:12, Jeremy Chadwick wrote: > On Tue, Feb 08, 2011 at 10:18:55AM 
+0100, Damien Fleuriot wrote:
>> Getting the very same error on 8.2-RC3 amd64
>>
>> FreeBSD mybsd 8.2-RC3 FreeBSD 8.2-RC3 #1: Thu Feb 3 11:03:48 CET 2011
>> root@mybsd:/usr/obj/usr/src/sys/DAM amd64
>>
>> mybsd# zpool get version data
>> NAME PROPERTY VALUE SOURCE
>> data version 15 default
>>
>> mybsd# zfs get version
>> NAME PROPERTY VALUE SOURCE
>> data version 4
>>
>>
>> On 2/7/11 8:52 PM, Michael wrote:
>>> Hello,
>>>
>>> Is uappnd flag supported on ZFS? I'm using 8.1-R and when I try to:
>>> chflags uappnd file.txt
>>> then I get:
>>> chflags: file.txt: Operation not supported
>
> It looks like the implemented/translated chflags(2) bits are:
>
> SF_IMMUTABLE ("chflags schg")   <--> ZFS_IMMUTABLE
> SF_APPEND    ("chflags sappnd") <--> ZFS_APPENDONLY
> SF_NOUNLINK  ("chflags sunlnk") <--> ZFS_NOUNLINK
> UF_NODUMP    ("chflags nodump") <--> ZFS_NODUMP
>

Any idea what the reason for that is? I mean, is it a temporary situation
(not implemented yet), or have the other flags been dropped/deprecated for
some reason? Are the user flags (UF_*) coming to ZFS or not?
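
[Editor's sketch, not part of Michael's message.] The list quoted above can be turned into a small guard for scripts that need to run on both UFS and ZFS. This only encodes the four keywords Jeremy listed for the releases discussed here; the function name zfs_honours_flag is hypothetical, and it does not query the file system:

```shell
#!/bin/sh
# Hypothetical helper: succeed only for the chflags(1) keywords that the
# list quoted above says ZFS translates (schg, sappnd, sunlnk, nodump).
zfs_honours_flag() {
    case "$1" in
        schg|sappnd|sunlnk|nodump) return 0 ;;
        *) return 1 ;;   # e.g. uappnd, which failed with "Operation not supported"
    esac
}
```

A caller could write "zfs_honours_flag uappnd || echo 'uappnd: skipping on ZFS'" rather than letting chflags fail partway through a script.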
Michael From owner-freebsd-fs@FreeBSD.ORG Sun Mar 6 17:49:12 2011 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 63682106564A; Sun, 6 Mar 2011 17:49:12 +0000 (UTC) (envelope-from etnapierala@googlemail.com) Received: from mail-bw0-f54.google.com (mail-bw0-f54.google.com [209.85.214.54]) by mx1.freebsd.org (Postfix) with ESMTP id 8C0D18FC15; Sun, 6 Mar 2011 17:49:11 +0000 (UTC) Received: by bwz12 with SMTP id 12so3546800bwz.13 for ; Sun, 06 Mar 2011 09:49:10 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=googlemail.com; s=gamma; h=domainkey-signature:sender:subject:mime-version:content-type:from :in-reply-to:date:cc:content-transfer-encoding:message-id:references :to:x-mailer; bh=WWev/scJ0luoDZlyNMO1RlXjY6KyQ3cpe5UZ4ixoCqk=; b=MQT09A2YjszpR8fduJfIIsDce/vy4m7q95reg2jvgsRga5Nf+Ll2ShwjVRf/bnM0rY E411K3zVWNRHTTKNl5h/tXgOYNa6wSeYxKgeWy0WGGOipLd0wGOllimyuexkIJPE8ogb FIYqmfaWy+atrk7ugltFwCI92znM0DWmQ6kPU= DomainKey-Signature: a=rsa-sha1; c=nofws; d=googlemail.com; s=gamma; h=sender:subject:mime-version:content-type:from:in-reply-to:date:cc :content-transfer-encoding:message-id:references:to:x-mailer; b=Q8DPjvffrU9Fre0L1dKT5ybikJlPLeC5MjhIMmIeSGVw1MhyMFUjHQ4nAEcwqNqSsf 9jrRGzud4USthZ0AbzrlXB0XjZ9okANRyIc5MuwNE8pSW2/TQUikBZaAoYYiGevFMcFf 6GGYUe5CjyLMnewvuXGHKnJu9VEaEpkQ+H3B4= Received: by 10.204.126.131 with SMTP id c3mr2060030bks.104.1299433750434; Sun, 06 Mar 2011 09:49:10 -0800 (PST) Received: from [192.168.1.102] (45.81.datacomsa.pl [195.34.81.45]) by mx.google.com with ESMTPS id l1sm1160554bkl.1.2011.03.06.09.49.08 (version=TLSv1/SSLv3 cipher=OTHER); Sun, 06 Mar 2011 09:49:09 -0800 (PST) Sender: =?UTF-8?Q?Edward_Tomasz_Napiera=C5=82a?= Mime-Version: 1.0 (Apple Message framework v1082) Content-Type: text/plain; charset=iso-8859-2 From: =?iso-8859-2?Q?Edward_Tomasz_Napiera=B3a?= In-Reply-To: <4D739D96.5090705@FreeBSD.org> Date: Sun, 
6 Mar 2011 18:49:06 +0100 Content-Transfer-Encoding: quoted-printable Message-Id: References: <20110227202957.GD1992@garage.freebsd.pl> <4D73098F.3000807@FreeBSD.org> <59D664AA-76C6-45C7-94CE-5AA63080368C@FreeBSD.org> <4D738DB0.1090603@FreeBSD.org> <4D739D96.5090705@FreeBSD.org> To: Steve Wills X-Mailer: Apple Mail (2.1082) Cc: freebsd-fs@FreeBSD.org, freebsd-current@FreeBSD.org Subject: Re: ACL issue (Was Re: HEADS UP: ZFSv28 is in!) X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Sun, 06 Mar 2011 17:49:12 -0000

Message written by Steve Wills on 2011-03-06, at 15:43:

> On 03/06/11 08:35, Steve Wills wrote:
>> On 03/06/11 04:22, Edward Tomasz Napierała wrote:
>>> Message written by Steve Wills on 2011-03-06, at 05:11:
>>
>>> [..]
>>
>>>> Thanks for your work on this, I'm very happy to have ZFS v28. I just
>>>> updated my -CURRENT system from a snapshot from about a month ago to
>>>> code from today. I have 3 pools and one of them is for ports tinderbox.
>>>> I only upgraded that pool. When I try to build something using
>>>> tinderbox, I get this error:
>>>>
>>>> cp: failed to set acl entries for
>>>> /usr/local/tinderbox/9-CURRENT-amd64-FreeBSD/buildscript: Operation not
>>>> supported
>>
>>> What does "mount" show?
>>
>> /dev/md4 12186190 332724 11853466 3%
>> /usr/local/tinderbox/9-CURRENT-amd64-FreeBSD
>>
>> Sorry, I forgot about the mdmfs hacks I had in my local tinderd. Without
>> them, it works fine. So the problem seems to be in mfs rather than zfs.
>
> I should have said mdmfs, but all that's doing is running mdconfig and
> newfs for me. I've reproduced the issue without mdmfs:
>
> % mdconfig -a -t swap -s 12G -u 4
> % newfs -m 0 -o time /dev/md4
> [...]
> % mount /dev/md4 /tmp/foobar
> % cp -p /usr/local/tinderbox/scripts/lib/buildscript /tmp/foobar
> cp: failed to set acl entries for /tmp/foobar/buildscript: Operation not
> supported
>
> Without -p it works fine. FWIW:
>
> % getfacl /usr/local/tinderbox/scripts/lib/buildscript
> # file: /usr/local/tinderbox/scripts/lib/buildscript
> # owner: root
> # group: wheel
> owner@:--------------:------:deny
> owner@:rwxp---A-W-Co-:------:allow
> group@:-w-p----------:------:deny
> group@:r-x-----------:------:allow
> everyone@:-w-p---A-W-Co-:------:deny
> everyone@:r-x---a-R-c--s:------:allow
>
> Any suggestions on where the problem could be?

The above looks like old-style, "canonical six" trivial ACL. Now,
cp(1) shouldn't even try to copy the ACL in this case, since there
is nothing to copy. So, for some reason, something failed between
cp(1), acl_is_trivial_np(3) and the kernel.

What does "ls -al /usr/local/tinderbox/scripts/lib/buildscript" show?

-- 
If you cut off my head, what would I say? Me and my head, or me and my body?
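For anyone following along, the "canonical six" shape mentioned above can be roughly recognised from getfacl(1) output alone. The following is only a text heuristic, assuming the entry layout shown in the listing (principal:permissions:flags:type). The function name is_canonical_six is made up for illustration, and it does not compare the permissions against the file mode; the real decision is made by acl_is_trivial_np(3) in libc, not by anything like this.

```shell
#!/bin/sh
# Hypothetical helper, not part of FreeBSD: reads getfacl(1) output on stdin
# and succeeds only if it has the "canonical six" shape, i.e. owner@, group@
# and everyone@, each as a deny entry followed by an allow entry, with no
# inheritance flags set. Permissions are not checked against the mode bits.
is_canonical_six() {
    awk -F: '
        /^[ \t]*#/ || /^[ \t]*$/ { next }      # skip comment and blank lines
        {
            gsub(/[ \t]/, "", $0)              # strip whitespace, re-split fields
            n++
            who[n] = $1; flags[n] = $3; kind[n] = $4
        }
        END {
            split("owner@ owner@ group@ group@ everyone@ everyone@", w, " ")
            split("deny allow deny allow deny allow", k, " ")
            if (n != 6) exit 1
            for (i = 1; i <= 6; i++) {
                if (who[i] != w[i] || kind[i] != k[i]) exit 1
                if (flags[i] !~ /^-*$/) exit 1  # a flag is set, so not this shape
            }
            exit 0
        }'
}
```

Possible use: `getfacl /usr/local/tinderbox/scripts/lib/buildscript | is_canonical_six && echo "canonical six"`. If this succeeds while cp -p still fails as described, that would point at the triviality check rather than at the ACL itself.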
From owner-freebsd-fs@FreeBSD.ORG Sun Mar 6 18:27:27 2011 Return-Path: Delivered-To: freebsd-fs@FreeBSD.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id DF71C106566C; Sun, 6 Mar 2011 18:27:27 +0000 (UTC) (envelope-from swills@FreeBSD.org) Received: from mouf.net (mouf.net [204.109.58.86]) by mx1.freebsd.org (Postfix) with ESMTP id 8F95A8FC15; Sun, 6 Mar 2011 18:27:27 +0000 (UTC) Received: from meatwad.mouf.net (cpe-065-190-178-041.nc.res.rr.com [65.190.178.41]) (authenticated bits=0) by mouf.net (8.14.4/8.14.4) with ESMTP id p26IRP9D097784 (version=TLSv1/SSLv3 cipher=DHE-RSA-CAMELLIA256-SHA bits=256 verify=NOT); Sun, 6 Mar 2011 13:27:26 -0500 (EST) (envelope-from swills@FreeBSD.org) Message-ID: <4D73D20C.2010706@FreeBSD.org> Date: Sun, 06 Mar 2011 13:27:24 -0500 From: Steve Wills User-Agent: Mozilla/5.0 (X11; U; FreeBSD amd64; en-US; rv:1.9.1.16) Gecko/20110130 Thunderbird/3.0.11 MIME-Version: 1.0 To: =?ISO-8859-2?Q?Edward_Tomasz_Napiera=B3a?= References: <20110227202957.GD1992@garage.freebsd.pl> <4D73098F.3000807@FreeBSD.org> <59D664AA-76C6-45C7-94CE-5AA63080368C@FreeBSD.org> <4D738DB0.1090603@FreeBSD.org> <4D739D96.5090705@FreeBSD.org> In-Reply-To: X-Enigmail-Version: 1.0.1 Content-Type: text/plain; charset=ISO-8859-2 Content-Transfer-Encoding: 8bit X-Greylist: Sender succeeded SMTP AUTH, not delayed by milter-greylist-4.2.6 (mouf.net [204.109.58.86]); Sun, 06 Mar 2011 13:27:26 -0500 (EST) X-Virus-Scanned: clamav-milter 0.96.2 at mouf.net X-Virus-Status: Clean Cc: freebsd-fs@FreeBSD.org, freebsd-current@FreeBSD.org Subject: Re: ACL issue (Was Re: HEADS UP: ZFSv28 is in!) 
X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Sun, 06 Mar 2011 18:27:28 -0000 -----BEGIN PGP SIGNED MESSAGE----- Hash: SHA1 On 03/06/11 12:49, Edward Tomasz Napierała wrote: > > The above looks like old-style, "canonical six" trivial ACL. Now, > cp(1) shouldn't even try to copy the ACL in this case, since there > is nothing to copy. So, for some reason, something failed between > cp(1), acl_is_trivial_np(3) and the kernel. > > What does "ls -al /usr/local/tinderbox/scripts/lib/buildscript" show? It looks like: - -rwxr-xr-x+ 1 root wheel 12547 Feb 1 21:21 /usr/local/tinderbox/scripts/lib/buildscript Steve -----BEGIN PGP SIGNATURE----- Version: GnuPG v2.0.16 (FreeBSD) iQEcBAEBAgAGBQJNc9IMAAoJEPXPYrMgexuhMmQH/jU7Zsay7fDmYP4UY60P63TH Zbq3jyitlhgh3BNyibbbJ3O0OsEUWEJ+xO5jz04g32Kvv/NFWqQD9tPygkABeBb2 v50K/uOS8VskcMoJaxzkOIDz2Y/0PNKHo+/Cft7/hMbW1W5h7ebe7peKn7FA/F6R MzFY6RMe6sY4x00gxpo/f3DQAB5VR6MqQl5SbDMUE8dP7ut5gUe9f+QvJPc2OgMA thCLqxEfjKohWtpmuctr1c8Ap3UKvAAzwUVT6qs+CNidaxb3qzXLDyA9Z614GVAy 1WQxTsEtfiByMm6N1qUqIkNZNFmFSO0cEuRyK8Z4FJ0ZA5X4smk8gicATp/wAPQ= =Y/S+ -----END PGP SIGNATURE----- From owner-freebsd-fs@FreeBSD.ORG Sun Mar 6 18:48:19 2011 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id BF9351065672 for ; Sun, 6 Mar 2011 18:48:19 +0000 (UTC) (envelope-from boydjd@jbip.net) Received: from mail-fx0-f54.google.com (mail-fx0-f54.google.com [209.85.161.54]) by mx1.freebsd.org (Postfix) with ESMTP id 4A2A08FC1B for ; Sun, 6 Mar 2011 18:48:18 +0000 (UTC) Received: by fxm19 with SMTP id 19so3969675fxm.13 for ; Sun, 06 Mar 2011 10:48:18 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=jbip.net; s=google; h=domainkey-signature:mime-version:in-reply-to:references:from:date 
:message-id:subject:to:cc:content-type:content-transfer-encoding; bh=+AcuZp0Kb35R/Ocu/vMmOgXnRQsWV8yvHk8ozJxqFyI=; b=HXoF42t8my9jFhqoejCZryoKeJ1hgtdVzWmgLav2Pp1OZ/UrPXIVZ+t1VZw+cT1C83 RzI47ykTKKYTJAgB3TfY3tmMPH3PJsom/+HSoP2jjHozQLlEcIxTddxvxth6gsxVgT8D QcC5v8oUMqluwdWLHceMLtu1Xfq0OIpwWx9gM= DomainKey-Signature: a=rsa-sha1; c=nofws; d=jbip.net; s=google; h=mime-version:in-reply-to:references:from:date:message-id:subject:to :cc:content-type:content-transfer-encoding; b=WdYMshgo35cd5avjolB+nAo4p7F06iuxPmEzBdGFdth9SS+0UMQeOkODKFgo1SROUi X3pXn+KQjJVr+w97HSmfetKhrh1fJT21el62RE46EitWIsjlyQ97gqtlATcEj7zQtq41 L68IIxpGSHiciOfTDqnRaO+PrR1YZmQK2mPXE= Received: by 10.223.151.14 with SMTP id a14mr1527634faw.134.1299437298067; Sun, 06 Mar 2011 10:48:18 -0800 (PST) MIME-Version: 1.0 Received: by 10.223.144.137 with HTTP; Sun, 6 Mar 2011 10:47:58 -0800 (PST) In-Reply-To: <20110306090455.GA87055@icarus.home.lan> References: <1299232133.18671.3.camel@pc286.embl.fr> <20110304100517.GA23249@icarus.home.lan> <20110304105608.GA23887@icarus.home.lan> <20110306090455.GA87055@icarus.home.lan> From: Joshua Boyd Date: Sun, 6 Mar 2011 13:47:58 -0500 Message-ID: To: Jeremy Chadwick Content-Type: text/plain; charset=ISO-8859-1 Content-Transfer-Encoding: quoted-printable Cc: freebsd-fs@freebsd.org Subject: Re: kmem_map too small with ZFS and 8.2-RELEASE X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Sun, 06 Mar 2011 18:48:19 -0000

On Sun, Mar 6, 2011 at 4:04 AM, Jeremy Chadwick wrote:
> On Sun, Mar 06, 2011 at 02:59:43AM -0500, Joshua Boyd wrote:
>> On Fri, Mar 4, 2011 at 5:56 AM, Jeremy Chadwick
>> wrote:
>> > If you get better performance -- really, truly, honestly -- with
>> > prefetch enabled on your system, then I strongly recommend you keep it
>> > enabled.  However, for what it's worth (probably not much), this is the
>> > first I've ever heard of a FreeBSD system performing better with
>> > prefetch enabled.
>>
>> I just recently turned it on after having it turned off for a long
>> time ... my speeds went from ~300MB/s to 600+MB/s in bonnie++. This is
>> a dual core AM3 system with 8GB of ram, and 15 disks in a striped
>> raidz configuration (3 sets striped).
>
> Here are some numbers for you.  This is from a 8.2-STABLE (RELENG_8)
> system built Thu Feb 24 22:06:45 PST 2011, type amd64.

Interesting results. My kernel currently has a build date 2 days
earlier than yours. Here are my results, showing the huge increase in
speed. The only major configuration difference appears to be that I've
disabled the ZIL and you have yours enabled. That shouldn't make any
difference for read speeds though.

FreeBSD foghornleghorn.res.openband.net 8.2-PRERELEASE FreeBSD
8.2-PRERELEASE #13: Tue Feb 22 17:39:03 EST 2011
root@foghornleghorn.res.openband.net:/usr/obj/usr/src/sys/FOGHORNLEGHORN
amd64

/boot/loader.conf
======================
vfs.zfs.zil_disable="1"
vfs.zfs.vdev.min_pending="1"
vfs.zfs.vdev.max_pending="1"
vm.kmem_size="8192M"
vfs.zfs.arc_max=6144M
vfs.zfs.prefetch_disable="0"
vfs.zfs.txg.timeout="5"

/etc/sysctl.conf
======================
kern.maxfiles=65536
kern.maxfilesperproc=32768
vfs.read_max=32
vfs.ufs.dirhash_maxmem=16777216
kern.maxvnodes=250000
vfs.zfs.txg.write_limit_override=1073741824

ZFS details
======================
# zpool list
NAME   SIZE   USED  AVAIL    CAP  HEALTH  ALTROOT
tank  18.2T  7.73T  10.4T    42%  ONLINE  -

# zfs list
NAME                          USED  AVAIL  REFER  MOUNTPOINT
tank                         6.17T  8.10T  36.7K  /tank
tank/downloads               4.89T  8.10T  2.30T  /tank/downloads
tank/downloads/movies        2.59T  8.10T  2.59T  /tank/downloads/movies
tank/usr                     1.29T  8.10T  32.0K  /tank/usr
tank/usr/home                1.29T  8.10T  69.5K  /usr/home
tank/usr/home/josh           1.29T  8.10T  13.4G  /usr/home/josh
tank/usr/home/josh/hellanzb  32.0K  8.10T  32.0K  /usr/home/josh/hellanzb
tank/usr/home/josh/rtorrent  1.27T  8.10T  1.27T  /usr/home/josh/rtorrent
tank/usr/home/josh/watch     8.00M  8.10T  8.00M  /usr/home/josh/watch

# zpool status tank
  pool: tank
 state: ONLINE
 scrub: scrub completed after 7h43m with 0 errors on Sun Mar 6 07:43:56 2011
config:

        NAME        STATE     READ WRITE CKSUM
        tank        ONLINE       0     0     0
          raidz1    ONLINE       0     0     0
            da8     ONLINE       0     0     0
            da18    ONLINE       0     0     0
            da19    ONLINE       0     0     0
            da6     ONLINE       0     0     0
            da7     ONLINE       0     0     0
          raidz1    ONLINE       0     0     0
            da11    ONLINE       0     0     0
            da10    ONLINE       0     0     0
            da17    ONLINE       0     0     0
            da9     ONLINE       0     0     0
            da5     ONLINE       0     0     0
          raidz1    ONLINE       0     0     0
            da0     ONLINE       0     0     0
            da1     ONLINE       0     0     0
            da3     ONLINE       0     0     0
            da2     ONLINE       0     0     0
            da4     ONLINE       0     0     0

errors: No known data errors

Controller details
======================
mpt0: port 0x6000-0x60ff mem 0xf75fc000-0xf75fffff,0xf75e0000-0xf75effff irq 18 at device 0.0 on pci1
mpt0: [ITHREAD]
mpt0: MPI Version=1.5.20.0
mpt1: port 0x7000-0x70ff mem 0xf78fc000-0xf78fffff,0xf78e0000-0xf78effff irq 19 at device 0.0 on pci2
mpt1: [ITHREAD]
mpt1: MPI Version=1.5.20.0
mpt2: port 0xd000-0xd0ff mem 0xf7ffc000-0xf7ffffff,0xf7fe0000-0xf7feffff irq 19 at device 0.0 on pci6
mpt2: [ITHREAD]
mpt2: MPI Version=1.5.19.0

Disk details
======================
da8 at mpt0 bus 0 scbus0 target 0 lun 0
da8: Fixed Direct Access SCSI-5 device
da8: 300.000MB/s transfers
da8: Command Queueing enabled
da8: 953869MB (1953525168 512 byte sectors: 255H 63S/T 121601C)
da9 at mpt0 bus 0 scbus0 target 1 lun 0
da9: Fixed Direct Access SCSI-5 device
da9: 300.000MB/s transfers
da9: Command Queueing enabled
da9: 953869MB (1953525168 512 byte sectors: 255H 63S/T 121601C)
da0 at mpt1 bus 0 scbus1 target 0 lun 0
da0: Fixed Direct Access SCSI-5 device
da0: 300.000MB/s transfers
da0: Command Queueing enabled
da0: 1907729MB (3907029168 512 byte sectors: 255H 63S/T 243201C)
da10 at mpt0 bus 0 scbus0 target 2 lun 0
da10: Fixed Direct Access SCSI-5 device
da10: 300.000MB/s transfers
da10: Command Queueing enabled
da10: 953869MB (1953525168 512 byte sectors: 255H 63S/T 121601C)
da11 at mpt0 bus 0 scbus0 target 3 lun 0
da11: Fixed Direct Access SCSI-5 device
da11: 300.000MB/s transfers
da11: Command Queueing enabled
da11: 953869MB (1953525168 512 byte sectors: 255H 63S/T 121601C)
da1 at mpt1 bus 0 scbus1 target 1 lun 0
da1: Fixed Direct Access SCSI-5 device
da1: 300.000MB/s transfers
da1: Command Queueing enabled
da1: 1907729MB (3907029168 512 byte sectors: 255H 63S/T 243201C)
da2 at mpt1 bus 0 scbus1 target 2 lun 0
da2: Fixed Direct Access SCSI-5 device
da2: 300.000MB/s transfers
da2: Command Queueing enabled
da2: 1907729MB (3907029168 512 byte sectors: 255H 63S/T 243201C)
da3 at mpt1 bus 0 scbus1 target 3 lun 0
da3: Fixed Direct Access SCSI-5 device
da3: 300.000MB/s transfers
da3: Command Queueing enabled
da3: 1907729MB (3907029168 512 byte sectors: 255H 63S/T 243201C)
da4 at mpt1 bus 0 scbus1 target 4 lun 0
da4: Fixed Direct Access SCSI-5 device
da4: 300.000MB/s transfers
da4: Command Queueing enabled
da4: 1907729MB (3907029168 512 byte sectors: 255H 63S/T 243201C)
da5 at mpt1 bus 0 scbus1 target 5 lun 0
da5: Fixed Direct Access SCSI-5 device
da5: 300.000MB/s transfers
da5: Command Queueing enabled
da5: 953869MB (1953525168 512 byte sectors: 255H 63S/T 121601C)
da6 at mpt1 bus 0 scbus1 target 6 lun 0
da6: Fixed Direct Access SCSI-5 device
da6: 300.000MB/s transfers
da6: Command Queueing enabled
da6: 953869MB (1953525168 512 byte sectors: 255H 63S/T 121601C)
da7 at mpt1 bus 0 scbus1 target 7 lun 0
da7: Fixed Direct Access SCSI-5 device
da7: 300.000MB/s transfers
da7: Command Queueing enabled
da7: 953869MB (1953525168 512 byte sectors: 255H 63S/T 121601C)
da16 at mpt2 bus 0 scbus2 target 82 lun 0
da16: Fixed Direct Access SCSI-5 device
da16: 300.000MB/s transfers
da16: Command Queueing enabled
da16: 953869MB (1953525168 512 byte sectors: 255H 63S/T 121601C)
da17 at mpt2 bus 0 scbus2 target 83 lun 0
da17: Fixed Direct Access SCSI-5 device
da17: 300.000MB/s transfers
da17: Command Queueing enabled
da17: 953869MB (1953525168 512 byte sectors: 255H 63S/T 121601C)
da18 at mpt2 bus 0 scbus2 target 84 lun 0
da18: Fixed Direct Access SCSI-5 device
da18: 300.000MB/s transfers
da18: Command Queueing enabled
da18: 953869MB (1953525168 512 byte sectors: 255H 63S/T 121601C)
da19 at mpt2 bus 0 scbus2 target 85 lun 0
da19: Fixed Direct Access SCSI-5 device
da19: 300.000MB/s transfers
da19: Command Queueing enabled
da19: 953869MB (1953525168 512 byte sectors: 255H 63S/T 121601C)

Benchmark results #1
======================
Version  1.96       ------Sequential Output------ --Sequential Input- --Random-
Concurrency   1     -Per Chr- --Block-- -Rewrite- -Per Chr- --Block-- --Seeks--
Machine        Size K/sec %CP K/sec %CP K/sec %CP K/sec %CP K/sec %CP  /sec %CP
foghornleghorn. 16G   213  99 266782  53 90296  19   480  95 218719  24 229.7   6
Latency             43348us   37929us     242ms     102ms   68306us     462ms
Version  1.96       ------Sequential Create------ --------Random Create--------
foghornleghorn.res. -Create-- --Read--- -Delete-- -Create-- --Read--- -Delete--
              files  /sec %CP  /sec %CP  /sec %CP  /sec %CP  /sec %CP  /sec %CP
                 16 15220  44 +++++ +++ 19214  56 22371  58 +++++ +++ 22133  66
Latency             10658us      60us      82us    6540us      39us    1677us

Benchmark results #2
======================
Version  1.96       ------Sequential Output------ --Sequential Input- --Random-
Concurrency   1     -Per Chr- --Block-- -Rewrite- -Per Chr- --Block-- --Seeks--
Machine        Size K/sec %CP K/sec %CP K/sec %CP K/sec %CP K/sec %CP  /sec %CP
foghornleghorn. 16G   201  99 276506  56 198428  38   459  97 627451  73 252.0   5
Latency             45695us   35953us     265ms   69630us   42440us     389ms
Version  1.96       ------Sequential Create------ --------Random Create--------
foghornleghorn.res. -Create-- --Read--- -Delete-- -Create-- --Read--- -Delete--
              files  /sec %CP  /sec %CP  /sec %CP  /sec %CP  /sec %CP  /sec %CP
                 16 14988  50 +++++ +++ 18693  59 18535  51 +++++ +++ 20827  67
Latency             13309us      93us     116us    8165us      36us    1046us

-- 
Joshua Boyd
JBipNet
E-mail: boydjd@jbip.net
http://www.jbip.net

From owner-freebsd-fs@FreeBSD.ORG Sun Mar 6 20:30:35 2011 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 3B5AD1065677; Sun, 6 Mar 2011 20:30:35 +0000 (UTC) (envelope-from etnapierala@googlemail.com) Received: from mail-bw0-f54.google.com (mail-bw0-f54.google.com [209.85.214.54]) by mx1.freebsd.org (Postfix) with ESMTP id 5FC188FC26; Sun, 6 Mar 2011 20:30:34 +0000 (UTC) Received: by bwz12 with SMTP id 12so3618140bwz.13 for ; Sun, 06 Mar 2011 12:30:33 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=googlemail.com; s=gamma; h=domainkey-signature:sender:subject:mime-version:content-type:from :in-reply-to:date:cc:content-transfer-encoding:message-id:references :to:x-mailer; bh=Lm8fQp9ZCrrSAqvkMdMofW6+99+PijaSzKx4a9BUIWE=; b=MZSg5RB9GM8dVxk8k+6nFjpoael34hJWgSULVmISMILK5oLRxzMWEJV6UAGpgZ08EL 5LYwKjqE1Tcg7FWMlF2CM665NJcwJ7M3B+mDQyclGm0ciXKR3XbDMWYNxdToRZ8RJgBk ywdLMF+zmRQnfY0VRjubQoO+uNsT3twZGbWKU= DomainKey-Signature: a=rsa-sha1; c=nofws; d=googlemail.com; s=gamma; h=sender:subject:mime-version:content-type:from:in-reply-to:date:cc :content-transfer-encoding:message-id:references:to:x-mailer; b=PpCmye6nNAnp37imR9nrhV7U7SFijw4JEIubmE+xP9sfdFBNPCfOW492YGg9xbSGUL O2mvvvS53YC5KGKwOFMuRRvIXDr+6VqYeQ+L4t1EHh803pnwDvxiF2V7GBgQBGIFA+Yx MpDQKWhgIqaG8SLTEaHtH42QC41G5YFdEPKkI= Received: by 10.204.20.136 with SMTP id f8mr2697226bkb.174.1299443433360; Sun, 06 Mar 2011 12:30:33 -0800 (PST) Received: from [192.168.1.102] (45.81.datacomsa.pl [195.34.81.45]) by mx.google.com with ESMTPS id w3sm1270088bkt.5.2011.03.06.12.30.31 (version=TLSv1/SSLv3
cipher=OTHER); Sun, 06 Mar 2011 12:30:32 -0800 (PST) Sender: =?UTF-8?Q?Edward_Tomasz_Napiera=C5=82a?= Mime-Version: 1.0 (Apple Message framework v1082) Content-Type: text/plain; charset=iso-8859-2 From: =?iso-8859-2?Q?Edward_Tomasz_Napiera=B3a?= In-Reply-To: <4D73D20C.2010706@FreeBSD.org> Date: Sun, 6 Mar 2011 21:30:28 +0100 Content-Transfer-Encoding: quoted-printable Message-Id: References: <20110227202957.GD1992@garage.freebsd.pl> <4D73098F.3000807@FreeBSD.org> <59D664AA-76C6-45C7-94CE-5AA63080368C@FreeBSD.org> <4D738DB0.1090603@FreeBSD.org> <4D739D96.5090705@FreeBSD.org> <4D73D20C.2010706@FreeBSD.org> To: Steve Wills X-Mailer: Apple Mail (2.1082) Cc: freebsd-fs@FreeBSD.org, freebsd-current@FreeBSD.org Subject: Re: ACL issue (Was Re: HEADS UP: ZFSv28 is in!) X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Sun, 06 Mar 2011 20:30:35 -0000

Message written by Steve Wills on 2011-03-06, at 19:27:

> On 03/06/11 12:49, Edward Tomasz Napierała wrote:
>>
>> The above looks like old-style, "canonical six" trivial ACL. Now,
>> cp(1) shouldn't even try to copy the ACL in this case, since there
>> is nothing to copy. So, for some reason, something failed between
>> cp(1), acl_is_trivial_np(3) and the kernel.
>>
>> What does "ls -al /usr/local/tinderbox/scripts/lib/buildscript" show?
>
> It looks like:
>
> - -rwxr-xr-x+ 1 root wheel 12547 Feb 1 21:21
> /usr/local/tinderbox/scripts/lib/buildscript

r219272 introduced an error which made libc treat the "canonical six"
ACLs as nontrivial. I backed it out; you need to rebuild libc. Sorry.

-- 
If you cut off my head, what would I say? Me and my head, or me and my body?
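One way to check the result after rebuilding libc, sketched under the assumption that ls(1) appends "+" to the mode string only when it considers the file's ACL non-trivial (which is what the listing quoted above suggests): the buildscript entry should go from -rwxr-xr-x+ back to -rwxr-xr-x. The helper name has_acl_marker is invented for this example and is not an existing tool.

```shell
#!/bin/sh
# Hypothetical helper: succeeds if the first field of an "ls -l" line
# (the mode string) ends in "+", i.e. ls reports a non-trivial ACL.
has_acl_marker() {
    mode=${1%% *}          # keep everything before the first space
    case $mode in
        *+) return 0 ;;
        *)  return 1 ;;
    esac
}

# Example using the line from the report quoted above.
line='-rwxr-xr-x+ 1 root wheel 12547 Feb 1 21:21 /usr/local/tinderbox/scripts/lib/buildscript'
if has_acl_marker "$line"; then
    echo "ls still flags an ACL; libc may not have been rebuilt yet"
else
    echo "no ACL marker; the ACL is being treated as trivial"
fi
```

On a live system the argument would be something like "$(ls -l /usr/local/tinderbox/scripts/lib/buildscript)". Repeating the cp -p test from earlier in the thread is the more direct check.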
From owner-freebsd-fs@FreeBSD.ORG Mon Mar 7 09:49:30 2011 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 715E2106566B for ; Mon, 7 Mar 2011 09:49:30 +0000 (UTC) (envelope-from mm@FreeBSD.org) Received: from mail.vx.sk (mail.vx.sk [IPv6:2a01:4f8:100:1043::3]) by mx1.freebsd.org (Postfix) with ESMTP id 05B5B8FC0C for ; Mon, 7 Mar 2011 09:49:30 +0000 (UTC) Received: from core.vx.sk (localhost [127.0.0.1]) by mail.vx.sk (Postfix) with ESMTP id 23DCE13CFF6; Mon, 7 Mar 2011 10:49:29 +0100 (CET) X-Virus-Scanned: amavisd-new at mail.vx.sk Received: from mail.vx.sk ([127.0.0.1]) by core.vx.sk (mail.vx.sk [127.0.0.1]) (amavisd-new, port 10024) with LMTP id THAuO76QpApO; Mon, 7 Mar 2011 10:49:27 +0100 (CET) Received: from [10.0.3.3] (188-167-50-235.dynamic.chello.sk [188.167.50.235]) by mail.vx.sk (Postfix) with ESMTPSA id D002F13CFEB; Mon, 7 Mar 2011 10:49:26 +0100 (CET) Message-ID: <4D74AA26.10606@FreeBSD.org> Date: Mon, 07 Mar 2011 10:49:26 +0100 From: Martin Matuska User-Agent: Mozilla/5.0 (X11; U; Linux x86_64; en-US; rv:1.9.2.14) Gecko/20110223 Thunderbird/3.1.8 MIME-Version: 1.0 To: Jeremy Chadwick References: <1299232133.18671.3.camel@pc286.embl.fr> <20110304100517.GA23249@icarus.home.lan> In-Reply-To: <20110304100517.GA23249@icarus.home.lan> Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit Cc: freebsd-fs@freebsd.org, =?UTF-8?B?TWlja2HDq2wgQ2Fuw6l2ZXQ=?= Subject: Re: kmem_map too small with ZFS and 8.2-RELEASE X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Mon, 07 Mar 2011 09:49:30 -0000 In 8-STABLE (amd64), starting with SVN revision 214620 vm.kmem_size_scale defaults to 1. 
This means that in 8.2-RELEASE vm.kmem_size is automatically set to the
amount of your system RAM, so vm.kmem_size="16G" is effectively the
default on a system with 16GB RAM.

On 04.03.2011 11:05, Jeremy Chadwick wrote:
> On Fri, Mar 04, 2011 at 10:48:53AM +0100, Mickaël Canévet wrote:
>>> I'd use vm.kmem_size="32G" (i.e. twice your RAM) and that's it.
>> Should I also increase vfs.zfs.arc_max ?
> You should adjust vm.kmem_size, but not vm.kmem_size_max.
>
> You can adjust vfs.zfs.arc_max to basically ensure system stability.
> This thread is acting as evidence that there are probably edge cases
> where the kmem too small panic can still happen despite the limited ARC
> maximum defaults.
>
> For a 16GB system, I'd probably use these settings:
>
> vm.kmem_size="16384M"
> vfs.zfs.arc_max="13312M"
>
> I would also use these two settings:
>
> # Disable ZFS prefetching
> # http://southbrain.com/south/2008/04/the-nightmare-comes-slowly-zfs.html
> # Increases overall speed of ZFS, but when disk flushing/writes occur,
> # system is less responsive (due to extreme disk I/O).
> # NOTE: Systems with 8GB of RAM or more have prefetch enabled by
> # default.
> vfs.zfs.prefetch_disable="1"
>
> # Decrease ZFS txg timeout value from 30 (default) to 5 seconds. This
> # should increase throughput and decrease the "bursty" stalls that
> # happen during immense I/O with ZFS.
> # http://lists.freebsd.org/pipermail/freebsd-fs/2009-December/007343.html
> # http://lists.freebsd.org/pipermail/freebsd-fs/2009-December/007355.html
> vfs.zfs.txg.timeout="5"
>
> The advice in the Wiki is outdated, especially for 8.2-RELEASE. Best
> not to follow it as of this writing.
>
>> Do you have any idea why the kernel panicked at only 8GB allocated ?
> I do not. A kernel developer will have to comment on that.
>
> Please attempt to reproduce the problem. If you can reproduce it
> reliably, this will greatly help kernel developers tracking down the
> source of the problem.
>
From owner-freebsd-fs@FreeBSD.ORG Mon Mar 7 10:37:19 2011 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 18B50106564A for ; Mon, 7 Mar 2011 10:37:19 +0000 (UTC) (envelope-from gallasch@free.de) Received: from smtp.free.de (smtp.free.de [91.204.6.103]) by mx1.freebsd.org (Postfix) with ESMTP id 70C658FC0A for ; Mon, 7 Mar 2011 10:37:18 +0000 (UTC) Received: (qmail 73452 invoked by uid 98); 7 Mar 2011 11:37:16 +0100 Received: from 91.204.4.103 by smtp.free.de (envelope-from , uid 82) with qmail-scanner-1.25 (clamdscan: 0.96.5/12494. Clear:RC:1(91.204.4.103):. Processed in 0.035237 secs); 07 Mar 2011 10:37:16 -0000 X-Qmail-Scanner-Mail-From: gallasch@free.de via smtp.free.de X-Qmail-Scanner: 1.25 (Clear:RC:1(91.204.4.103):. Processed in 0.035237 secs) Received: from smtp.free.de (HELO [192.168.1.119]) (gallasch@free.de@[91.204.4.103]) (envelope-sender ) by smtp.free.de (qmail-ldap-1.03) with AES128-SHA encrypted SMTP for ; 7 Mar 2011 11:37:16 +0100 References: <1299226985.3391.18.camel@pc286.embl.fr> In-Reply-To: <1299226985.3391.18.camel@pc286.embl.fr> Mime-Version: 1.0 (Apple Message framework v1082) Content-Type: text/plain; charset=iso-8859-1 Message-Id: <3E910C24-FA7E-461B-9677-ED551D69FBF2@free.de> Content-Transfer-Encoding: quoted-printable From: Kai Gallasch Date: Mon, 7 Mar 2011 11:37:16 +0100 To: =?iso-8859-1?Q?Micka=EBl_Can=E9vet?= X-Mailer: Apple Mail (2.1082) Cc: freebsd-fs@freebsd.org Subject: Re: kmem_map too small with ZFS and 8.2-RELEASE X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Mon, 07 Mar 2011 10:37:19 -0000

On 04.03.2011 at 09:23, Mickaël Canévet wrote:

> Hello,
>
> I know there is a lot of threads about "kmem_map too small" problems on
> former versions of FreeBSD with ZFS, but on the wiki
> (http://wiki.freebsd.org/ZFSTuningGuide) it is said that "FreeBSD 7.2+
> has improved kernel memory allocation strategy and no tuning may be
> necessary on systems with more than 2 GB of RAM."
>
> I have a 64bits machine with 16GB of RAM with FreeBSD 8.2-RELEASE and no
> tuning:
>
> # sysctl -a | grep -e "vm.kmem_size_max:" -e "vm.kmem_size:" -e
> "vfs.zfs.arc_max:"
> vm.kmem_size_max: 329853485875
> vm.kmem_size: 16624558080
> vfs.zfs.arc_max: 15550816256
>
> This morning this server crashed with:
>
> panic: kmem_malloc(1048576): kmem_map too small: 8658309120 total
> allocated

Hi, Mickaël.

If you want to "get a picture" of how setting ZFS tunables in
loader.conf affects the different cache sizes and cache hit ratios, I
can recommend installing the FreeBSD port sysutils/munin-node together
with sysutils/zfs-stats and the following munin ZFS plugins:

http://exchange.munin-monitoring.org/plugins/search?keyword=FreeBSD

Regards,
Kai.

From owner-freebsd-fs@FreeBSD.ORG Mon Mar 7 11:06:59 2011 Return-Path: Delivered-To: freebsd-fs@FreeBSD.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 09D44106566B for ; Mon, 7 Mar 2011 11:06:59 +0000 (UTC) (envelope-from owner-bugmaster@FreeBSD.org) Received: from freefall.freebsd.org (freefall.freebsd.org [IPv6:2001:4f8:fff6::28]) by mx1.freebsd.org (Postfix) with ESMTP id D0C678FC1B for ; Mon, 7 Mar 2011 11:06:58 +0000 (UTC) Received: from freefall.freebsd.org (localhost [127.0.0.1]) by freefall.freebsd.org (8.14.4/8.14.4) with ESMTP id p27B6w5Y096934 for ; Mon, 7 Mar 2011 11:06:58 GMT (envelope-from owner-bugmaster@FreeBSD.org) Received: (from gnats@localhost) by freefall.freebsd.org (8.14.4/8.14.4/Submit) id p27B6wmc096931 for freebsd-fs@FreeBSD.org; Mon, 7 Mar 2011 11:06:58 GMT (envelope-from owner-bugmaster@FreeBSD.org) Date: Mon, 7 Mar 2011 11:06:58 GMT Message-Id:
<201103071106.p27B6wmc096931@freefall.freebsd.org> X-Authentication-Warning: freefall.freebsd.org: gnats set sender to owner-bugmaster@FreeBSD.org using -f From: FreeBSD bugmaster To: freebsd-fs@FreeBSD.org Cc: Subject: Current problem reports assigned to freebsd-fs@FreeBSD.org X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Mon, 07 Mar 2011 11:06:59 -0000

Note: to view an individual PR, use:
  http://www.freebsd.org/cgi/query-pr.cgi?pr=(number).

The following is a listing of current problems submitted by FreeBSD users.
These represent problem reports covering all versions including
experimental development code and obsolete releases.

S Tracker      Resp.      Description
--------------------------------------------------------------------------------
o kern/155199  fs         [ext2fs] ext3fs mounted as ext2fs gives I/O errors
o bin/155104   fs         [zfs][patch] use /dev prefix by default when importing
o kern/154930  fs         [zfs] cannot delete/unlink file from full volume -> EN
o kern/154828  fs         [msdosfs] Unable to create directories on external USB
o kern/154681  fs         [zfs] [panic] panic with FreeBSD-8 STABLE
o kern/154491  fs         [smbfs] smb_co_lock: recursive lock for object 1
o kern/154447  fs         [zfs] [panic] Occasional panics - solaris assert somew
f kern/154228  fs         [md] md getting stuck in wdrain state
o kern/153996  fs         [zfs] zfs root mount error while kernel is not located
o kern/153847  fs         [nfs] [panic] Kernel panic from incorrect m_free in nf
o kern/153753  fs         [zfs] ZFS v15 - grammatical error when attempting to u
o kern/153716  fs         [zfs] zpool scrub time remaining is incorrect
o kern/153695  fs         [patch] [zfs] Booting from zpool created on 4k-sector
o kern/153680  fs         [xfs] 8.1 failing to mount XFS partitions
o kern/153552  fs         [zfs] zfsboot from 8.2-RC1 freeze at boot time
o kern/153520  fs         [zfs] Boot from GPT ZFS root on HP BL460c G1 unstable
o kern/153418  fs         [zfs] [panic] Kernel Panic occurred writing to zfs vol
o kern/153351  fs         [zfs] locking directories/files in ZFS
o bin/153258   fs         [patch][zfs] creating ZVOLs requires `refreservation'
s kern/153173  fs         [zfs] booting from a gzip-compressed dataset doesn't w
o kern/153126  fs         [zfs] vdev failure, zpool=peegel type=vdev.too_small
p kern/152488  fs         [tmpfs] [patch] mtime of file updated when only inode
o kern/152079  fs         [msdosfs] [patch] Small cleanups from the other NetBSD
o kern/152022  fs         [nfs] nfs service hangs with linux client [regression]
o kern/151942  fs         [zfs] panic during ls(1) zfs snapshot directory
o kern/151905  fs         [zfs] page fault under load in /sbin/zfs
o kern/151845  fs         [smbfs] [patch] smbfs should be upgraded to support Un
o bin/151713   fs         [patch] Bug in growfs(8) with respect to 32-bit overfl
o kern/151648  fs         [zfs] disk wait bug
o kern/151629  fs         [fs] [patch] Skip empty directory entries during name
o kern/151330  fs         [zfs] will unshare all zfs filesystem after execute a
o kern/151326  fs         [nfs] nfs exports fail if netgroups contain duplicate
o kern/151251  fs         [ufs] Can not create files on filesystem with heavy us
o kern/151226  fs         [zfs] can't delete zfs snapshot
o kern/151111  fs         [zfs] vnodes leakage during zfs unmount
o kern/150503  fs         [zfs] ZFS disks are UNAVAIL and corrupted after reboot
o kern/150501  fs         [zfs] ZFS vdev failure vdev.bad_label on amd64
o kern/150390  fs         [zfs] zfs deadlock when arcmsr reports drive faulted
o kern/150336  fs         [nfs] mountd/nfsd became confused; refused to reload n
o kern/150207  fs         zpool(1): zpool import -d /dev tries to open weird dev
o kern/149208  fs         mksnap_ffs(8) hang/deadlock
o kern/149173  fs         [patch] [zfs] make OpenSolaris installa
f kern/149022  fs         [hang] File system operations hangs with suspfs state
o kern/149015  fs         [zfs] [patch] misc fixes for ZFS code to build on Glib
o kern/149014  fs         [zfs] [patch] declarations in ZFS libraries/utilities
o kern/149013  fs         [zfs] [patch] make ZFS makefiles use the libraries fro
o kern/148504  fs         [zfs] ZFS' zpool does not allow replacing drives to be
o kern/148490  fs         [zfs]: zpool attach - resilver bidirectionally, and re
o kern/148368  fs         [zfs] ZFS hanging forever on 8.1-PRERELEASE
o bin/148296   fs         [zfs] [loader] [patch] Very slow probe in /usr/src/sys
o kern/148204  fs         [nfs] UDP NFS causes overload
o kern/148138  fs         [zfs] zfs raidz pool commands freeze
o kern/147903  fs         [zfs] [panic] Kernel panics on faulty zfs device
o kern/147881  fs         [zfs] [patch] ZFS "sharenfs" doesn't allow different "
o kern/147790  fs         [zfs] zfs set acl(mode|inherit) fails on existing zfs
o kern/147560  fs         [zfs] [boot] Booting 8.1-PRERELEASE raidz system take
o kern/147420  fs         [ufs] [panic] ufs_dirbad, nullfs, jail panic (corrupt
o kern/146941  fs         [zfs] [panic] Kernel Double Fault - Happens constantly
o kern/146786  fs         [zfs] zpool import hangs with checksum errors
o kern/146708  fs         [ufs] [panic] Kernel panic in softdep_disk_write_compl
o kern/146528  fs         [zfs] Severe memory leak in ZFS on i386
o kern/146502  fs         [nfs] FreeBSD 8 NFS Client Connection to Server
s kern/145712  fs         [zfs] cannot offline two drives in a raidz2 configurat
o kern/145411  fs         [xfs] [panic] Kernel panics shortly after mounting an
o bin/145309   fs         bsdlabel: Editing disk label invalidates the whole dev
o kern/145272  fs         [zfs] [panic] Panic during boot when accessing zfs on
o kern/145246  fs         [ufs] dirhash in 7.3 gratuitously frees hashes when it
o kern/145238  fs         [zfs] [panic] kernel panic on zpool clear tank
o kern/145229  fs         [zfs] Vast differences in ZFS ARC behavior between 8.0
o kern/145189  fs         [nfs] nfsd performs abysmally under load
o kern/144929  fs         [ufs] [lor] vfs_bio.c + ufs_dirhash.c
p kern/144447  fs         [zfs] sharenfs fsunshare() & fsshare_main() non functi
o kern/144416  fs         [panic] Kernel panic on online filesystem optimization
s kern/144415  fs         [zfs] [panic] kernel panics on boot after zfs crash
o kern/144234  fs         [zfs] Cannot boot machine with recent gptzfsboot code
o kern/143825  fs         [nfs] [panic] Kernel panic on NFS client
o bin/143572   fs         [zfs] zpool(1): [patch] The verbose output from iostat
o kern/143212  fs         [nfs] NFSv4 client strange work ...
o kern/143184  fs         [zfs] [lor] zfs/bufwait LOR
o kern/142914  fs         [zfs] ZFS performance degradation over time
o kern/142878  fs         [zfs] [vfs] lock order reversal
o kern/142597  fs         [ext2fs] ext2fs does not work on filesystems with real
o kern/142489  fs         [zfs] [lor] allproc/zfs LOR
o kern/142466  fs         Update 7.2 -> 8.0 on Raid 1 ends with screwed raid [re
o kern/142401  fs         [ntfs] [patch] Minor updates to NTFS from NetBSD
o kern/142306  fs         [zfs] [panic] ZFS drive (from OSX Leopard) causes two
o kern/142068  fs         [ufs] BSD labels are got deleted spontaneously
o kern/141897  fs         [msdosfs] [panic] Kernel panic. msdofs: file name leng
o kern/141463  fs         [nfs] [panic] Frequent kernel panics after upgrade fro
o kern/141305  fs         [zfs] FreeBSD ZFS+sendfile severe performance issues (
o kern/141091  fs         [patch] [nullfs] fix panics with DIAGNOSTIC enabled
o kern/141086  fs         [nfs] [panic] panic("nfs: bioread, not dir") on FreeBS
o kern/141010  fs         [zfs] "zfs scrub" fails when backed by files in UFS2
o kern/140888  fs         [zfs] boot fail from zfs root while the pool resilveri
o kern/140661  fs         [zfs] [patch] /boot/loader fails to work on a GPT/ZFS-
o kern/140640  fs         [zfs] snapshot crash
o kern/140134  fs         [msdosfs] write and fsck destroy filesystem integrity
o kern/140068  fs         [smbfs] [patch] smbfs does not allow semicolon in file
o kern/139725  fs         [zfs] zdb(1) dumps core on i386 when examining zpool c
o kern/139715  fs         [zfs] vfs.numvnodes leak on busy zfs
p bin/139651   fs         [nfs] mount(8): read-only remount of NFS volume does n
o kern/139597  fs         [patch] [tmpfs] tmpfs initializes va_gen but doesn't u
o kern/139564  fs         [zfs] [panic] 8.0-RC1 - Fatal trap 12 at end of shutdo
o kern/139407  fs         [smbfs] [panic] smb mount causes system crash if remot
o kern/138662  fs         [panic] ffs_blkfree: freeing free block
o kern/138421  fs         [ufs] [patch] remove UFS label limitations
o kern/138202  fs         mount_msdosfs(1) see only 2Gb
o kern/136968  fs         [ufs]
[lor] ufs/bufwait/ufs (open) o kern/136945 fs [ufs] [lor] filedesc structure/ufs (poll) o kern/136944 fs [ffs] [lor] bufwait/snaplk (fsync) o kern/136873 fs [ntfs] Missing directories/files on NTFS volume o kern/136865 fs [nfs] [patch] NFS exports atomic and on-the-fly atomic p kern/136470 fs [nfs] Cannot mount / in read-only, over NFS o kern/135546 fs [zfs] zfs.ko module doesn't ignore zpool.cache filenam o kern/135469 fs [ufs] [panic] kernel crash on md operation in ufs_dirb o kern/135050 fs [zfs] ZFS clears/hides disk errors on reboot o kern/134491 fs [zfs] Hot spares are rather cold... o kern/133676 fs [smbfs] [panic] umount -f'ing a vnode-based memory dis o kern/133174 fs [msdosfs] [patch] msdosfs must support utf-encoded int o kern/132960 fs [ufs] [panic] panic:ffs_blkfree: freeing free frag o kern/132397 fs reboot causes filesystem corruption (failure to sync b o kern/132331 fs [ufs] [lor] LOR ufs and syncer o kern/132237 fs [msdosfs] msdosfs has problems to read MSDOS Floppy o kern/132145 fs [panic] File System Hard Crashes o kern/131441 fs [unionfs] [nullfs] unionfs and/or nullfs not combineab o kern/131360 fs [nfs] poor scaling behavior of the NFS server under lo o kern/131342 fs [nfs] mounting/unmounting of disks causes NFS to fail o bin/131341 fs makefs: error "Bad file descriptor" on the mount poin o kern/130920 fs [msdosfs] cp(1) takes 100% CPU time while copying file o kern/130210 fs [nullfs] Error by check nullfs o kern/129760 fs [nfs] after 'umount -f' of a stale NFS share FreeBSD l o kern/129488 fs [smbfs] Kernel "bug" when using smbfs in smbfs_smb.c: o kern/129231 fs [ufs] [patch] New UFS mount (norandom) option - mostly o kern/129152 fs [panic] non-userfriendly panic when trying to mount(8) o kern/127787 fs [lor] [ufs] Three LORs: vfslock/devfs/vfslock, ufs/vfs o bin/127270 fs fsck_msdosfs(8) may crash if BytesPerSec is zero o kern/127029 fs [panic] mount(8): trying to mount a write protected zi o kern/126287 fs [ufs] [panic] Kernel panics while 
mounting an UFS file o kern/125895 fs [ffs] [panic] kernel: panic: ffs_blkfree: freeing free s kern/125738 fs [zfs] [request] SHA256 acceleration in ZFS o kern/123939 fs [msdosfs] corrupts new files o kern/122380 fs [ffs] ffs_valloc:dup alloc (Soekris 4801/7.0/USB Flash o bin/122172 fs [fs]: amd(8) automount daemon dies on 6.3-STABLE i386, o bin/121898 fs [nullfs] pwd(1)/getcwd(2) fails with Permission denied o bin/121366 fs [zfs] [patch] Automatic disk scrubbing from periodic(8 o bin/121072 fs [smbfs] mount_smbfs(8) cannot normally convert the cha f kern/120991 fs [panic] [ffs] [snapshot] System crashes when manipulat o kern/120483 fs [ntfs] [patch] NTFS filesystem locking changes o kern/120482 fs [ntfs] [patch] Sync style changes between NetBSD and F o kern/118912 fs [2tb] disk sizing/geometry problem with large array o kern/118713 fs [minidump] [patch] Display media size required for a k o bin/118249 fs [ufs] mv(1): moving a directory changes its mtime o kern/118107 fs [ntfs] [panic] Kernel panic when accessing a file at N o kern/117954 fs [ufs] dirhash on very large directories blocks the mac o bin/117315 fs [smbfs] mount_smbfs(8) and related options can't mount o kern/117314 fs [ntfs] Long-filename only NTFS fs'es cause kernel pani o kern/117158 fs [zfs] zpool scrub causes panic if geli vdevs detach on o bin/116980 fs [msdosfs] [patch] mount_msdosfs(8) resets some flags f o conf/116931 fs lack of fsck_cd9660 prevents mounting iso images with o kern/116583 fs [ffs] [hang] System freezes for short time when using o kern/116170 fs [panic] Kernel panic when mounting /tmp o bin/115361 fs [zfs] mount(8) gets into a state where it won't set/un o kern/114955 fs [cd9660] [patch] [request] support for mask,dirmask,ui o kern/114847 fs [ntfs] [patch] [request] dirmask support for NTFS ala o kern/114676 fs [ufs] snapshot creation panics: snapacct_ufs2: bad blo o bin/114468 fs [patch] [request] add -d option to umount(8) to detach o kern/113852 fs [smbfs] smbfs does not 
properly implement DFS referral o bin/113838 fs [patch] [request] mount(8): add support for relative p o bin/113049 fs [patch] [request] make quot(8) use getopt(3) and show o kern/112658 fs [smbfs] [patch] smbfs and caching problems (resolves b o kern/111843 fs [msdosfs] Long Names of files are incorrectly created o kern/111782 fs [ufs] dump(8) fails horribly for large filesystems s bin/111146 fs [2tb] fsck(8) fails on 6T filesystem o kern/109024 fs [msdosfs] [iconv] mount_msdosfs: msdosfs_iconv: Operat o kern/109010 fs [msdosfs] can't mv directory within fat32 file system o bin/107829 fs [2TB] fdisk(8): invalid boundary checking in fdisk / w o kern/106107 fs [ufs] left-over fsck_snapshot after unfinished backgro o kern/106030 fs [ufs] [panic] panic in ufs from geom when a dead disk o kern/104406 fs [ufs] Processes get stuck in "ufs" state under persist o kern/104133 fs [ext2fs] EXT2FS module corrupts EXT2/3 filesystems o kern/103035 fs [ntfs] Directories in NTFS mounted disc images appear o kern/101324 fs [smbfs] smbfs sometimes not case sensitive when it's s o kern/99290 fs [ntfs] mount_ntfs ignorant of cluster sizes s bin/97498 fs [request] newfs(8) has no option to clear the first 12 o kern/97377 fs [ntfs] [patch] syntax cleanup for ntfs_ihash.c o kern/95222 fs [cd9660] File sections on ISO9660 level 3 CDs ignored o kern/94849 fs [ufs] rename on UFS filesystem is not atomic o bin/94810 fs fsck(8) incorrectly reports 'file system marked clean' o kern/94769 fs [ufs] Multiple file deletions on multi-snapshotted fil o kern/94733 fs [smbfs] smbfs may cause double unlock o kern/93942 fs [vfs] [patch] panic: ufs_dirbad: bad dir (patch from D o kern/92272 fs [ffs] [hang] Filling a filesystem while creating a sna o kern/91134 fs [smbfs] [patch] Preserve access and modification time a kern/90815 fs [smbfs] [patch] SMBFS with character conversions somet o kern/88657 fs [smbfs] windows client hang when browsing a samba shar o kern/88555 fs [panic] ffs_blkfree: freeing free 
frag on AMD 64 o kern/88266 fs [smbfs] smbfs does not implement UIO_NOCOPY and sendfi o bin/87966 fs [patch] newfs(8): introduce -A flag for newfs to enabl o kern/87859 fs [smbfs] System reboot while umount smbfs. o kern/86587 fs [msdosfs] rm -r /PATH fails with lots of small files o bin/85494 fs fsck_ffs: unchecked use of cg_inosused macro etc. o kern/80088 fs [smbfs] Incorrect file time setting on NTFS mounted vi o bin/74779 fs Background-fsck checks one filesystem twice and omits o kern/73484 fs [ntfs] Kernel panic when doing `ls` from the client si o bin/73019 fs [ufs] fsck_ufs(8) cannot alloc 607016868 bytes for ino o kern/71774 fs [ntfs] NTFS cannot "see" files on a WinXP filesystem o bin/70600 fs fsck(8) throws files away when it can't grow lost+foun o kern/68978 fs [panic] [ufs] crashes with failing hard disk, loose po o kern/65920 fs [nwfs] Mounted Netware filesystem behaves strange o kern/65901 fs [smbfs] [patch] smbfs fails fsx write/truncate-down/tr o kern/61503 fs [smbfs] mount_smbfs does not work as non-root o kern/55617 fs [smbfs] Accessing an nsmb-mounted drive via a smb expo o kern/51685 fs [hang] Unbounded inode allocation causes kernel to loc o kern/51583 fs [nullfs] [patch] allow to work with devices and socket o kern/36566 fs [smbfs] System reboot with dead smb mount and umount o kern/33464 fs [ufs] soft update inconsistencies after system crash o bin/27687 fs fsck(8) wrapper is not properly passing options to fsc o kern/18874 fs [2TB] 32bit NFS servers export wrong negative values t 218 problems total. 
From owner-freebsd-fs@FreeBSD.ORG Mon Mar 7 11:42:02 2011 Return-Path: Delivered-To: freebsd-fs@FreeBSD.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 67A84106566C; Mon, 7 Mar 2011 11:42:02 +0000 (UTC) (envelope-from freebsd-listen@fabiankeil.de) Received: from smtprelay02.ispgateway.de (smtprelay02.ispgateway.de [80.67.31.29]) by mx1.freebsd.org (Postfix) with ESMTP id E32938FC1F; Mon, 7 Mar 2011 11:42:01 +0000 (UTC) Received: from [78.34.189.182] (helo=r500.local) by smtprelay02.ispgateway.de with esmtpsa (TLSv1:AES128-SHA:128) (Exim 4.68) (envelope-from ) id 1PwYp5-0001yG-S0; Mon, 07 Mar 2011 12:42:00 +0100 Date: Mon, 7 Mar 2011 12:41:43 +0100 From: Fabian Keil To: Pawel Jakub Dawidek Message-ID: <20110307124143.3fc12416@r500.local> In-Reply-To: <20110227202957.GD1992@garage.freebsd.pl> References: <20110227202957.GD1992@garage.freebsd.pl> X-Mailer: Claws Mail 3.7.8 (GTK+ 2.22.1; amd64-portbld-freebsd9.0) X-PGP-KEY-URL: http://www.fabiankeil.de/gpg-keys/freebsd-listen-2008-08-18.asc Mime-Version: 1.0 Content-Type: multipart/signed; micalg=PGP-SHA1; boundary="Sig_/+XMIYqnla0D=4qZ/m1dA.p1"; protocol="application/pgp-signature" X-Df-Sender: 775067 Cc: freebsd-fs@FreeBSD.org Subject: Re: HEADS UP: ZFSv28 is in! X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Mon, 07 Mar 2011 11:42:02 -0000
--Sig_/+XMIYqnla0D=4qZ/m1dA.p1
Content-Type: text/plain; charset=US-ASCII
Content-Transfer-Encoding: quoted-printable

Pawel Jakub Dawidek wrote:

> I just committed ZFSv28 to HEAD.
>
> New major features:
>
> - Data deduplication.
> - Triple parity RAIDZ (RAIDZ3).
> - zfs diff.
> - zpool split.
> - Snapshot holds.
> - zpool import -F. Allows to rewind corrupted pool to earlier
>   transaction group.
> - Possibility to import pool in read-only mode.

Are the committed man pages the correct ones for ZFSv28? They seem to be from 2009, zfs(1M) doesn't mention "zfs diff" and zpool(1M) doesn't mention "zpool split" or "zpool import -F". Fabian --Sig_/+XMIYqnla0D=4qZ/m1dA.p1 Content-Type: application/pgp-signature; name=signature.asc Content-Disposition: attachment; filename=signature.asc -----BEGIN PGP SIGNATURE----- Version: GnuPG v2.0.17 (FreeBSD) iEYEARECAAYFAk10xIkACgkQBYqIVf93VJ1kMwCfUH9G/C1SWTE/NsyqmCcwiyD8 2oAAnjVJUAeisUeiCt1qQWVbr2WQ2m1a =Nj0+ -----END PGP SIGNATURE----- --Sig_/+XMIYqnla0D=4qZ/m1dA.p1-- From owner-freebsd-fs@FreeBSD.ORG Mon Mar 7 15:43:06 2011 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id EDB841065674 for ; Mon, 7 Mar 2011 15:43:06 +0000 (UTC) (envelope-from cheeky.m@live.com) Received: from bay0-omc4-s4.bay0.hotmail.com (bay0-omc4-s4.bay0.hotmail.com [65.54.190.206]) by mx1.freebsd.org (Postfix) with ESMTP id D7F528FC13 for ; Mon, 7 Mar 2011 15:43:06 +0000 (UTC) Received: from BAY147-W47 ([65.54.190.199]) by bay0-omc4-s4.bay0.hotmail.com with Microsoft SMTPSVC(6.0.3790.4675); Mon, 7 Mar 2011 07:43:06 -0800 Message-ID: X-Originating-IP: [209.6.82.6] From: Roger Hammerstein To: Date: Mon, 7 Mar 2011 10:43:05 -0500 Importance: Normal In-Reply-To: References: , <20110302200310.GA86404@alchemy.franken.de>, <20110306152245.GA25023@alchemy.franken.de>, MIME-Version: 1.0 X-OriginalArrivalTime: 07 Mar 2011 15:43:06.0393 (UTC) FILETIME=[5A2CD890:01CBDCDE] Content-Type: text/plain; charset="iso-8859-1" Content-Transfer-Encoding: quoted-printable X-Content-Filtered-By: Mailman/MimeDel 2.1.5 Subject: FW: sparc64 hang with zfs v28 X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Mon, 07 Mar 2011 15:43:07 -0000 forwarding over here. 
Anyone using zfs on sparc64?
Anyone have guesses why zfs/zpool eat 99% of the cpu?

Thanks!

From: cheeky.m@live.com
CC: freebsd-sparc64@freebsd.org
Subject: RE: sparc64 hang with zfs v28
Date: Sun, 6 Mar 2011 23:27:42 -0500

> FYI, kernel modules generally should work again with r219340, I haven't
> tested ZFS though.

Thanks!
I cvsupped and rebuilt the kernel.

falcon# uname -a
FreeBSD falcon 9.0-CURRENT FreeBSD 9.0-CURRENT #3: Sun Mar 6 18:55:14 EST 2011 root@falcon:/usr/obj/usr/src/sys/GENERIC sparc64
falcon#

I did a kldload zfs and it loaded ok.

falcon# kldstat
Id Refs Address    Size     Name
 1    9 0xc0000000 e42878   kernel
 2    1 0xc14a2000 32e000   zfs.ko
 3    1 0xc17d0000 104000   opensolaris.ko
falcon#

But a 'zpool status' or 'zfs list' will cause a zfs or zpool process to
eat 99% of a cpu and essentially hang the shell I ran zfs/zpool in.

falcon# zfs list
ZFS NOTICE: Prefetch is disabled by default if less than 4GB of RAM is present;
            to enable, add "vfs.zfs.prefetch_disable=0" to /boot/loader.conf.
ZFS filesystem version 5
ZFS storage pool version 28
[Hang here]

last pid:  1012;  load averages:  0.79,  0.30,  0.16    up 0+00:13:58  20:58:43
23 processes:  2 running, 21 sleeping
CPU:  0.0% user,  0.0% nice, 52.5% system,  0.0% interrupt, 47.5% idle
Mem: 16M Active, 11M Inact, 46M Wired, 64K Cache, 12M Buf, 1915M Free
Swap: 4055M Total, 4055M Free

  PID USERNAME    THR PRI NICE   SIZE    RES STATE   C   TIME   WCPU COMMAND
 1006 root          1  53    0 21672K  2904K CPU1    1   0:05 99.47% zfs
  998 root          1  40    0 41776K  6376K select  0   0:01  0.00% sshd
  994 root          1  16    0 11880K  3536K pause   0   0:01  0.00% csh
  795 root          1  40    0 16720K  3968K select  0   0:00  0.00% ntpd
 1001 root          1  16    0 11880K  3464K pause   0   0:00  0.00% csh
  975 root          1   8    0 25168K  2672K wait    1   0:00  0.00% login

It stays at 99%.
truss -p 1006 doesn't "attach", it just hangs.

ctrl-t on the zfs list shell:
load: 0.95  cmd: zfs 1006 [running] 182.26r 0.00u 4.66s 99% 2872k
load: 0.95  cmd: zfs 1006 [running] 183.30r 0.00u 4.66s 99% 2872k
load: 0.95  cmd: zfs 1006 [running] 183.76r 0.00u 4.66s 99% 2872k
load: 0.95  cmd: zfs 1006 [running] 184.08r 0.00u 4.66s 99% 2872k
load: 0.95  cmd: zfs 1006 [running] 184.36r 0.00u 4.66s 99% 2872k

A second time with zpool status:

last pid:  1224;  load averages:  0.98,  0.55,  0.24    up 0+02:07:39  23:12:33
26 processes:  2 running, 24 sleeping
CPU:  0.0% user,  0.0% nice, 50.2% system,  0.4% interrupt, 49.4% idle
Mem: 18M Active, 13M Inact, 46M Wired, 64K Cache, 12M Buf, 1911M Free
Swap: 4055M Total, 4055M Free

  PID USERNAME    THR PRI NICE   SIZE    RES STATE   C   TIME   WCPU COMMAND
 1200 root          1  62    0 22704K  2920K CPU1    1   0:00 99.02% zpool
  793 root          1  40    0 16720K  3968K select  0   0:02  0.00% ntpd
 1180 root          1  16    0 11880K  3536K pause   1   0:01  0.00% csh
 1184 root          1  40    0 41776K  6376K select  0   0:01  0.00% sshd
 1201 root          1  40    0 41776K  6376K select  0   0:01  0.00% sshd

falcon# truss -p 1200
truss: can not attach to target process: Device busy
falcon# truss -p 1200
truss: can not attach to target process: Device busy
falcon#

ctrl-t on the zpool status command:
load: 0.62  cmd: zpool 1200 [running] 54.30r 0.00u 0.07s 83% 2888k
load: 0.99  cmd: zpool 1200 [running] 271.73r 0.00u 0.07s 99% 2888k
load: 0.99  cmd: zpool 1200 [running] 272.37r 0.00u 0.07s 99% 2888k
load: 0.99  cmd: zpool 1200 [running] 272.75r 0.00u 0.07s 99% 2888k
load: 0.99  cmd: zpool 1200 [running] 273.38r 0.00u 0.07s 99% 2888k

truss -f zpool status:

 1014: sigprocmask(SIG_SETMASK,0x0,0x0) = 0 (0x0)
 1014: sigprocmask(SIG_BLOCK,SIGHUP|SIGINT|SIGQUIT|SIGKILL|SIGPIPE|SIGALRM|SIGTERM|SIGURG|SIGSTOP|SIGTSTP|SIGCONT|SIGCHLD|SIGTTIN|SIGTTOU|SIGIO|SIGXCPU|SIGXFSZ|SIGVTALRM|SIGPROF|SIGWINCH|SIGINFO|SIGUSR1|SIGUSR2,0x0) = 0 (0x0)
 1014: sigprocmask(SIG_SETMASK,0x0,0x0) = 0 (0x0)
 1014: sigprocmask(SIG_BLOCK,SIGHUP|SIGINT|SIGQUIT|SIGKILL|SIGPIPE|SIGALRM|SIGTERM|SIGURG|SIGSTOP|SIGTSTP|SIGCONT|SIGCHLD|SIGTTIN|SIGTTOU|SIGIO|SIGXCPU|SIGXFSZ|SIGVTALRM|SIGPROF|SIGWINCH|SIGINFO|SIGUSR1|SIGUSR2,0x0) = 0 (0x0)
 1014: sigprocmask(SIG_SETMASK,0x0,0x0) = 0 (0x0)
 1014: modfind(0x40d3f140,0x9a0,0xc78,0x10a,0x1027e8,0x7fdffffe8d0) = 303 (0x12f)
 1014: open("/dev/zfs",O_RDWR,06170) = 3 (0x3)
 1014: open("/dev/zero",O_RDONLY,0666) = 4 (0x4)
 1014: open("/etc/zfs/exports",O_RDONLY,0666) ERR#2 'No such file or directory'
 1014: __sysctl(0x7fdffff8de8,0x2,0x7fdffff8eb0,0x7fdffff8f18,0x40d3f118,0x13) = 0 (0x0)
 1014: __sysctl(0x7fdffff8eb0,0x4,0x40e4d084,0x7fdffff8fe0,0x0,0x0) = 0 (0x0)
[hang]

ctrl-t
load: 0.31  cmd: zpool 1014 [running] 12.47r 0.00u 0.07s 44% 2912k

 1014 root          1  54    0 22704K  2944K CPU0    0   0:00 98.47% zpool

falcon# truss -p 1014
truss: can not attach to target process: Device busy

iostat -x 1 shows no reads and no writes to any disks.

There's a 2-disk zfs mirror attached to this ultra60 from a freebsd-8 install,
but I don't know why that would cause a problem with the latest zfs v28.
I can successfully read the labels on those two mirror disks with zdb -l /dev/da[36]

From owner-freebsd-fs@FreeBSD.ORG Mon Mar 7 18:41:41 2011 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 584191065675 for ; Mon, 7 Mar 2011 18:41:41 +0000 (UTC) (envelope-from mgamsjager@gmail.com) Received: from mail-gx0-f182.google.com (mail-gx0-f182.google.com [209.85.161.182]) by mx1.freebsd.org (Postfix) with ESMTP id 0E2EA8FC12 for ; Mon, 7 Mar 2011 18:41:40 +0000 (UTC) Received: by gxk7 with SMTP id 7so1967999gxk.13 for ; Mon, 07 Mar 2011 10:41:40 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=gamma; h=domainkey-signature:mime-version:in-reply-to:references:from:date :message-id:subject:to:cc:content-type; bh=oG6eqDq6RXeRn1p12/2vy1cDa6/ajA9flvq5E6NuAHE=; b=dmA9vtwMvQ2xXIplHP8lFAaTEaKt7HJvD8Q0Vfgg3iyNkSbZ3eiMkYRy+rCTEdhTvk EypE4JTgoQtTND0grsADSdgsgOqShTazEpSOggZzxVf7GLw8uoEDEhhjFzHA1cufpjGe 72OI0uJKw36c3OPSDOxbnEXA+Iq+IFPw4GO0M= DomainKey-Signature: a=rsa-sha1; c=nofws; d=gmail.com; s=gamma; h=mime-version:in-reply-to:references:from:date:message-id:subject:to :cc:content-type; b=NfcMJWieZZaPZZOU7HEGneXaIgc2j6Mcd1W2rbZnIpmwfseORulvO166LhOY2vQleo j/h2KB1EuaIBaHDiB2lLGuYTEWYbIrRv4r1/WWvEMOYOtLD3cQYg6nmkroif+9/PcdxG FOHKR2+HE9yWk3stN7J7DsNvq/1+1RR7FDWng= Received: by 10.100.36.8 with SMTP id j8mr1507275anj.54.1299523300095; Mon, 07 Mar 2011 10:41:40 -0800 (PST) MIME-Version: 1.0 Received: by 10.100.205.10 with HTTP; Mon, 7 Mar 2011 10:41:10 -0800 (PST) In-Reply-To: References: <1299232133.18671.3.camel@pc286.embl.fr> <20110304100517.GA23249@icarus.home.lan> <20110304105608.GA23887@icarus.home.lan> <20110306090455.GA87055@icarus.home.lan> From: Matthias Gamsjager Date: Mon, 7 Mar 2011 19:41:10 +0100 Message-ID: To: Joshua Boyd Content-Type: text/plain; charset=ISO-8859-1 X-Content-Filtered-By: Mailman/MimeDel 2.1.5 Cc:
freebsd-fs@freebsd.org Subject: Re: kmem_map too small with ZFS and 8.2-RELEASE X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Mon, 07 Mar 2011 18:41:41 -0000

Let me also back up my claim with data:

AMD Dual core
4G ram
4x 1TB Samsung drives
OS installed on separate ufs disk

FreeBSD fb 8.2-STABLE FreeBSD 8.2-STABLE #0 r219265: Fri Mar 4 16:47:35 CET 2011

loader.conf:
vm.kmem_size="6G"
vfs.zfs.txg.timeout="5"
vfs.zfs.vdev.min_pending=1 #default = 4
vfs.zfs.vdev.max_pending=4 #default= 35

sysctl.conf:
vfs.zfs.txg.write_limit_override=805306368
kern.sched.preempt_thresh=220

Zpool:
        NAME        STATE     READ WRITE CKSUM
        storage     ONLINE       0     0     0
          mirror    ONLINE       0     0     0
            ad6     ONLINE       0     0     0
            ad10    ONLINE       0     0     0
          mirror    ONLINE       0     0     0
            ad4     ONLINE       0     0     0
            ad8     ONLINE       0     0     0

NAME      SIZE   USED  AVAIL    CAP  HEALTH  ALTROOT
storage  1.81T  1.57T   245G    86%  ONLINE  -

Prefetch disable = 1

Version  1.96       ------Sequential Output------ --Sequential Input- --Random-
Concurrency   1     -Per Chr- --Block-- -Rewrite- -Per Chr- --Block-- --Seeks--
Machine        Size K/sec %CP K/sec %CP K/sec %CP K/sec %CP K/sec %CP  /sec %CP
fb           10000M    54  74 99180  42 35955  14   140  73 68174  11 180.6   4
Latency               295ms    1581ms    1064ms     428ms   58640us     755ms
Version  1.96       ------Sequential Create------ --------Random Create--------
fb                  -Create-- --Read--- -Delete-- -Create-- --Read--- -Delete--
              files  /sec %CP  /sec %CP  /sec %CP  /sec %CP  /sec %CP  /sec %CP
                 16  6697  39 +++++ +++ 11798  74 10060  61 +++++ +++ 11104  72
Latency               213ms     134us     257us   32866us    2672us     174us

Prefetch disable = 0

Version  1.96       ------Sequential Output------ --Sequential Input- --Random-
Concurrency   1     -Per Chr- --Block-- -Rewrite- -Per Chr- --Block-- --Seeks--
Machine        Size K/sec %CP K/sec %CP K/sec %CP K/sec %CP K/sec %CP  /sec %CP
fb           10000M    52  74 107602  46 65443  29   135  74 243760  42 186.5   4
Latency               214ms     865ms    1525ms   79771us     254ms     924ms
Version  1.96       ------Sequential Create------ --------Random Create--------
fb                  -Create-- --Read--- -Delete-- -Create-- --Read--- -Delete--
              files  /sec %CP  /sec %CP  /sec %CP  /sec %CP  /sec %CP  /sec %CP
                 16  8152  56 +++++ +++  4534  36 10966  69 32607  74  9692  71
Latency               112ms   21108us     169ms   30018us    4097us     318us

Read performance 68MB/s vs 243MB/s.

Maybe the kind of workload you have does not work well with prefetch, I
don't know, but for a sequential load like the one on my NAS, which is used
as a media tank, it does boost performance quite a bit.

From owner-freebsd-fs@FreeBSD.ORG Mon Mar 7 18:50:52 2011 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id D9791106566B; Mon, 7 Mar 2011 18:50:52 +0000 (UTC) (envelope-from bra@fsn.hu) Received: from people.fsn.hu (people.fsn.hu [195.228.252.137]) by mx1.freebsd.org (Postfix) with ESMTP id 660458FC17; Mon, 7 Mar 2011 18:50:51 +0000 (UTC) Received: by people.fsn.hu (Postfix, from userid 1001) id D1835756E04; Mon, 7 Mar 2011 19:50:49 +0100 (CET) X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.2 X-CRM114-Version: 20100106-BlameMichelson ( TRE 0.8.0 (BSD) ) MF-ACE0E1EA [pR: 21.5977] X-CRM114-CacheID: sfid-20110307_19504_BDAD6C73 X-CRM114-Status: Good ( pR: 21.5977 ) X-DSPAM-Result: Whitelisted X-DSPAM-Processed: Mon Mar 7 19:50:49 2011 X-DSPAM-Confidence: 0.9961 X-DSPAM-Probability: 0.0000 X-DSPAM-Signature: 4d752909698301593421374 X-DSPAM-Factors: 27, From*Attila Nagy , 0.00010, >+I, 0.00104, >+I, 0.00104, >+On, 0.00120, wrote+>, 0.00208, struct, 0.00212, struct, 0.00212, >>+>>, 0.00239, >>+>>, 0.00239, >+>, 0.00262, wrote+>>, 0.00323, is+>, 0.00343, NULL), 0.00366, error+=, 0.00422, wrote, 0.00442, wrote, 0.00442, files+with, 0.00499, enabled, 0.00548, enabled, 0.00548, h>, 0.00548, all+>, 0.00548, I+guess, 0.00609, I+guess, 0.00609, #ifdef, 0.00685, files, 0.00740, files, 0.00740, X-Spambayes-Classification: ham; 0.00 Message-ID:
<4D752909.3050503@fsn.hu> Date: Mon, 07 Mar 2011 19:50:49 +0100 From: Attila Nagy User-Agent: Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.8.1.23) Gecko/20090817 Thunderbird/2.0.0.23 Mnenhy/0.7.6.0 MIME-Version: 1.0 To: Pawel Jakub Dawidek References: <4D710154.90409@fsn.hu> <20110306084217.GA9791@garage.freebsd.pl> In-Reply-To: <20110306084217.GA9791@garage.freebsd.pl> Content-Type: text/plain; charset=ISO-8859-1; format=flowed Content-Transfer-Encoding: 7bit Cc: freebsd-fs@freebsd.org Subject: Re: Punching holes into (sparse) files - porting Solaris fcntl(F_FREESP) to FreeBSD? X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Mon, 07 Mar 2011 18:50:53 -0000

On 03/06/2011 09:54 AM, Pawel Jakub Dawidek wrote:
> On Fri, Mar 04, 2011 at 04:12:20PM +0100, Attila Nagy wrote:
>> Hi,
>>
>> Is it possible to make regions of files, with already written data
>> sparse? (I'm interested to do this on ZFS)
>>
>> All I could find in this topic is:
>> http://www.mail-archive.com/zfs-discuss@opensolaris.org/msg29047.html
>>
>> grepping through the source gives a match for VOP_SPACE in
>> cddl/contrib/opensolaris/uts/common/fs/zfs/zfs_replay.c:
>> zfs_replay_truncate(zfsvfs_t *zfsvfs, lr_truncate_t *lr, boolean_t byteswap)
>> {
>> #ifdef sun
>> [...]
>> error = VOP_SPACE(ZTOV(zp), F_FREESP,&fl, FWRITE | FOFFMAX,
>> lr->lr_offset, kcred, NULL);
>>
>> And the relevant section from fcntl(2) in Solaris:
>> F_FREESP
>>
>> Free storage space associated with a section of the
>> ordinary file fildes. The section is specified by a
>> variable of data type struct flock pointed to by arg.
>> The data type struct flock is defined in the <fcntl.h>
>> header (see fcntl.h(3HEAD)) and is described below. Note
>> that all file systems might not support all possible
>> variations of F_FREESP arguments. In particular, many
>> file systems allow space to be freed only at the end of
>> a file.
>>
>> F_FREESP seems to be my friend, and it's implemented in Solaris's ZFS.
>> How hard would it be to complete the port and make it accessible from
>> FreeBSD?
>> I guess it was left out with a reason...
> Well, adding new VOP is important decision. We could eventually
> implement this via ioctl(2), I think... This is a nice feature after all.
>
> I don't know why do you need this, but note that when compression is
> enabled on a ZFS file system, all-zeros blocks are turned into holes, so
> if you do have compression enabled and you write all zeros in the place
> you want to punch a hole, the pool space should be reclaimed.
>
I would like to use it for integer-indexed, fixed-size storage, where a
given block is located by multiplying the block size by the index number.
A sparse file would make it possible to reclaim the space of freed blocks.
But with SEEK_HOLE and SEEK_DATA, and the promised efficiency of sparse
files on ZFS, I guess there are a lot more use cases for sparse files
than before.

Thanks for the info about compression, I didn't know that. Should I
assume that using compression and writing a block-size run of zeroes is
as efficient as F_FREESP?

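A minimal sketch of the layout described above, not taken from the thread: it assumes a hypothetical fixed `BLOCK_SIZE` of 4096 bytes and hypothetical helper names (`block_offset`, `block_write`, `block_free`). Block i lives at byte offset i * BLOCK_SIZE, and a block is "freed" by overwriting it with zeroes. Going by the quoted reply, ZFS should turn such a block into a hole only when compression is enabled on the dataset, and presumably only when a whole ZFS record ends up all-zero; on other file systems the zeroes simply stay allocated, so this is not a substitute for F_FREESP.

```c
#include <stdint.h>
#include <string.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical block size for illustration; not taken from the thread. */
#define BLOCK_SIZE 4096

/* Byte offset of block `index` in the backing file. */
static off_t
block_offset(uint64_t index)
{
	return ((off_t)index * BLOCK_SIZE);
}

/* Write one full block into its slot; returns 0 on success, -1 on error. */
static int
block_write(int fd, uint64_t index, const void *buf)
{
	ssize_t n = pwrite(fd, buf, BLOCK_SIZE, block_offset(index));

	return (n == BLOCK_SIZE ? 0 : -1);
}

/*
 * "Free" a block by overwriting it with zeroes. Whether the space is
 * actually reclaimed depends on the file system (assumption: ZFS with
 * compression enabled, as described in the quoted reply).
 */
static int
block_free(int fd, uint64_t index)
{
	static const char zeroes[BLOCK_SIZE];

	return (block_write(fd, index, zeroes));
}
```

With an open descriptor `fd`, `block_write(fd, 3, buf)` stores a block at offset 12288 and `block_free(fd, 3)` zeroes it again; comparing `du` output before and after would be one way to check whether the space really came back.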
Thanks,

From owner-freebsd-fs@FreeBSD.ORG Mon Mar 7 19:13:07 2011 Return-Path: Delivered-To: freebsd-fs@FreeBSD.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 1797C1065673; Mon, 7 Mar 2011 19:13:07 +0000 (UTC) (envelope-from freebsd-listen@fabiankeil.de) Received: from smtprelay06.ispgateway.de (smtprelay06.ispgateway.de [80.67.31.102]) by mx1.freebsd.org (Postfix) with ESMTP id 8094E8FC17; Mon, 7 Mar 2011 19:13:06 +0000 (UTC) Received: from [78.34.189.182] (helo=r500.local) by smtprelay06.ispgateway.de with esmtpsa (TLSv1:AES128-SHA:128) (Exim 4.68) (envelope-from ) id 1Pwfrc-0004bi-Hv; Mon, 07 Mar 2011 20:13:04 +0100 Date: Mon, 7 Mar 2011 20:06:34 +0100 From: Fabian Keil To: freebsd-fs@FreeBSD.org Message-ID: <20110307200634.3c0f92df@r500.local> In-Reply-To: <20110228192129.119cac0c@r500.local> References: <20110227202957.GD1992@garage.freebsd.pl> <20110228192129.119cac0c@r500.local> X-Mailer: Claws Mail 3.7.8 (GTK+ 2.22.1; amd64-portbld-freebsd9.0) X-PGP-KEY-URL: http://www.fabiankeil.de/gpg-keys/freebsd-listen-2008-08-18.asc Mime-Version: 1.0 Content-Type: multipart/signed; micalg=PGP-SHA1; boundary="Sig_/oOazOkkVEQVl61drUilc=as"; protocol="application/pgp-signature" X-Df-Sender: 775067 Cc: Pawel Jakub Dawidek Subject: ZFSv28: log_sysevent: type 19 is not implemented (was: HEADS UP: ZFSv28 is in!) X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Mon, 07 Mar 2011 19:13:07 -0000
--Sig_/oOazOkkVEQVl61drUilc=as
Content-Type: multipart/mixed; boundary="MP_/L/L31YZZ9emswjliN6_sjlJ"

--MP_/L/L31YZZ9emswjliN6_sjlJ
Content-Type: text/plain; charset=US-ASCII
Content-Transfer-Encoding: quoted-printable
Content-Disposition: inline

Fabian Keil wrote:

> Pawel Jakub Dawidek wrote:
>
> > I just committed ZFSv28 to HEAD.

> Anyway, the things I tested so far (zfs/zpool upgrade,
> delegation, send, receive, snapshot) worked fine.

After unintentionally unplugging a USB disk with this zpool on it:

| fk@r500 ~ $zpool status toshiba
|   pool: toshiba
|  state: ONLINE
|   scan: none requested
| config:
|
|         NAME                 STATE     READ WRITE CKSUM
|         toshiba              ONLINE       0     0     0
|           label/toshiba.eli  ONLINE       0     0     0
|
| errors: No known data errors

the system became sluggish and /var/log/messages got spammed
with error messages:

Mar 6 21:33:10 r500 kernel: ugen7.2: at usbus7 (disconnected)
Mar 6 21:33:10 r500 kernel: umass1: at uhub7, port 1, addr 2 (disconnected)
Mar 6 21:33:10 r500 kernel: (pass3:umass-sim1:1:0:0): lost device
Mar 6 21:33:10 r500 kernel: (pass3:umass-sim1:1:0:0): removing device entry
Mar 6 21:33:10 r500 kernel: (da1:umass-sim1:1:0:0): lost device
Mar 6 21:33:10 r500 kernel: (da1:umass-sim1:1:0:0): Synchronize cache failed, status == 0xa, scsi status == 0x0
Mar 6 21:33:10 r500 kernel: (da1:umass-sim1:1:0:0): removing device entry
Mar 6 21:33:10 r500 kernel: log_sysevent: type 19 is not implemented
Mar 6 21:33:10 r500 last message repeated 50 times
Mar 6 21:33:10 r500 kernel: log_sysevent: type 19 is not implementedlog_sysevent: type 19 is not im
Mar 6 21:33:10 r500 kernel: plemented
Mar 6 21:33:10 r500 kernel: log_sysevent: type 19 is not implemented
Mar 6 21:33:10 r500 last message repeated 238 times
Mar 6 21:33:10 r500 kernel: log_sysevent: type 19 is not implementedlog_sysevent: type 19 is
Mar 6 21:33:10 r500 kernel: not implemented
Mar 6 21:33:10 r500 kernel: log_sysevent: type 19 is not implementedlog_sysevent: type 19 is not impleme
Mar 6 21:33:10 r500 kernel: nted
Mar 6 21:33:10 r500 kernel: log_sysevent: type 19 is not implemented
Mar 6 21:33:10 r500 last message repeated 87 times
Mar 6 21:33:10 r500 kernel: log_sysevent: type 19 is not implementedlog_sysevent: type 19 is not implemented
Mar 6 21:33:10 r500 kernel:
Mar 6 21:33:10 r500 kernel: log_sysevent: type 19 is not implemented
Mar 6 21:33:10 r500 last message repeated 47 times
Mar 6 21:33:11 r500 kernel: type 19 is not implemented
Mar 6 21:33:11 r500 kernel: log_sysevent: type 19 is not implemented
Mar 6 21:33:11 r500 last message repeated 1527 times
Mar 6 21:33:11 r500 kernel: log_sysevent: type 19 is not implementedlog_sysevent: type 19 is not implemented
Mar 6 21:33:11 r500 kernel:
Mar 6 21:33:11 r500 kernel: log_sysevent: type 19 is not implemented

fk@r500 ~ $zcat /var/log/messages.0.bz2 | grep -c "type 19 is not implemented"
34101

At the time of the unplugging, a video was being read from the pool.
When I tried to export the pool, zpool export hung.

After rebooting the system, the message was shown two more
times between the creation of the provider for the main zpool
and the swap device:

Mar 6 21:43:20 r500 kernel: GEOM_ELI: Device ada0s1d.eli created.
Mar 6 21:43:20 r500 kernel: GEOM_ELI: Encryption: AES-CBC 128
Mar 6 21:43:20 r500 kernel: GEOM_ELI: Crypto: software
Mar 6 21:43:20 r500 kernel: Trying to mount root from ufs:/dev/ada0s1a [rw]...
Mar 6 21:43:20 r500 kernel: WARNING: / was not properly dismounted
Mar 6 21:43:20 r500 kernel: start_init: trying /sbin/init
Mar 6 21:43:20 r500 kernel: (cd0:ahcich1:0:0:0): SCSI status error
Mar 6 21:43:20 r500 kernel: (cd0:ahcich1:0:0:0): Requesting SCSI sense data
Mar 6 21:43:20 r500 kernel: (cd0:ahcich1:0:0:0): SCSI status error
Mar 6 21:43:20 r500 kernel: (cd0:ahcich1:0:0:0): READ CAPACITY. CDB: 25 0 0 0 0 0 0 0 0 0
Mar 6 21:43:20 r500 kernel: (cd0:ahcich1:0:0:0): CAM status: SCSI Status Error
Mar 6 21:43:20 r500 kernel: (cd0:ahcich1:0:0:0): SCSI status: Check Condition
Mar 6 21:43:20 r500 kernel: (cd0:ahcich1:0:0:0): SCSI sense: NOT READY asc:3a,1 (Medium not present - tray closed)
Mar 6 21:43:20 r500 kernel: (cd0:ahcich1:0:0:0): Error 6, Unretryable error
Mar 6 21:43:20 r500 kernel: log_sysevent: type 19 is not implemented
Mar 6 21:43:20 r500 kernel: log_sysevent: type 19 is not implemented
Mar 6 21:43:20 r500 kernel: GEOM_ELI: Device ada0s1b.eli created.
Mar 6 21:43:20 r500 kernel: GEOM_ELI: Encryption: AES-XTS 256
Mar 6 21:43:20 r500 kernel: GEOM_ELI: Crypto: software
Mar 6 21:43:20 r500 kernel: (cd0:ahcich1:0:0:0): SCSI status error
Mar 6 21:43:20 r500 kernel: (cd0:ahcich1:0:0:0): Requesting SCSI sense data
Mar 6 21:43:20 r500 kernel: (cd0:ahcich1:0:0:0): SCSI status error
Mar 6 21:43:20 r500 kernel: (cd0:ahcich1:0:0:0): READ CAPACITY. CDB: 25 0 0 0 0 0 0 0 0 0
Mar 6 21:43:20 r500 kernel: (cd0:ahcich1:0:0:0): CAM status: SCSI Status Error
Mar 6 21:43:20 r500 kernel: (cd0:ahcich1:0:0:0): SCSI status: Check Condition
Mar 6 21:43:20 r500 kernel: (cd0:ahcich1:0:0:0): SCSI sense: NOT READY asc:3a,1 (Medium not present - tray closed)
Mar 6 21:43:20 r500 kernel: (cd0:ahcich1:0:0:0): Error 6, Unretryable error
Mar 6 21:43:20 r500 kernel: (cd0:ahcich1:0:0:0): SCSI status error
Mar 6 21:43:20 r500 kernel: (cd0:ahcich1:0:0:0): Requesting SCSI sense data
Mar 6 21:43:20 r500 kernel: (cd0:ahcich1:0:0:0): SCSI status error
Mar 6 21:43:20 r500 kernel: (cd0:ahcich1:0:0:0): READ CAPACITY.
CDB: 25 0 = 0 0 0 0 0 0 0 0=20 Mar 6 21:43:20 r500 kernel: (cd0:ahcich1:0:0:0): CAM status: SCSI Status E= rror Mar 6 21:43:20 r500 kernel: (cd0:ahcich1:0:0:0): SCSI status: Check Condit= ion Mar 6 21:43:20 r500 kernel: (cd0:ahcich1:0:0:0): SCSI sense: NOT READY asc= :3a,1 (Medium not present - tray closed) Mar 6 21:43:20 r500 kernel: (cd0:ahcich1:0:0:0): Error 6, Unretryable error Mar 6 21:43:20 r500 kernel: lo1: bpf attached Mar 6 21:43:20 r500 kernel: wlan0: bpf attached Mar 6 21:43:20 r500 kernel: wlan0: bpf attached Mar 6 21:43:20 r500 kernel: wlan0: Ethernet address: 00:[...] Mar 6 21:43:20 r500 kernel: firmware: 'iwn5000fw' version 0: 353240 bytes = loaded at 0xffffffff814120b0 Mar 6 21:43:20 r500 kernel: firmware: 'iwn5000fw' version 0: 353240 bytes = loaded at 0xffffffff814120b0 Mar 6 21:43:20 r500 kernel: bge0: Disabling fastboot Mar 6 21:43:20 r500 kernel: bge0: Disabling fastboot Mar 6 21:43:20 r500 savecore: /dev/ada0s1b: Operation not permitted Mar 6 21:43:21 r500 named[2219]: starting BIND 9.6.3 -t /var/named -u bind Mar 6 21:43:21 r500 named[2219]: built with '--prefix=3D/usr' '--infodir= =3D/usr/share/info' '--mandir=3D/usr/share/man' '--enable-threads' '--enabl= e-getifaddrs' '--disable-linux-caps' '--with-openssl=3D/usr' '--with-random= dev=3D/dev/random' '--without-idn' '--without-libxml2' Mar 6 21:43:21 r500 named[2219]: command channel listening on 127.0.0.1#953 Mar 6 21:43:21 r500 named[2219]: command channel listening on ::1#953 Mar 6 21:43:21 r500 named[2219]: the working directory is not writable Mar 6 21:43:21 r500 named[2219]: running Mar 6 21:43:53 r500 wpa_supplicant[512]: WPA: Group rekeying completed wit= h 00:[...] [GTK=3DTKIP] Mar 6 21:44:14 r500 syslogd: exiting on signal 15 As the boot process got stuck with no additional messages printed, I rebooted into single-user mode, exported the faulted pool and finished the boot process. 
The system came back normally and the pool could be imported without
issues.

fk@r500 ~ $grep zfs /boot/loader.conf | grep -v "^ *#"
zfs_load="YES"

I used the attached patch to stop the log spam, but the
main issue seems to be reproducible. The top output after
detaching the pool:

last pid: 4985; load averages: 10.47, 3.96, 2.02 up 0+02:01:49 19:20:49
552 processes: 12 running, 518 sleeping, 22 waiting
CPU: 1.2% user, 0.0% nice, 98.8% system, 0.0% interrupt, 0.0% idle
Mem: 267M Active, 95M Inact, 859M Wired, 2032K Cache, 7872K Buf, 692M Free
Swap: 2048M Total, 2048M Free

  PID USERNAME  THR PRI NICE   SIZE    RES STATE   C   TIME   WCPU COMMAND
    2 root        1  -8    -     0K    16K CPU1    1   1:26 96.97% g_event
    0 root      387  -8    0     0K  6192K -       0   2:11 80.66% kernel
 1702 root        1  76    0  6276K   796K RUN     0   0:15 11.96% devd
   11 root        2 155 ki31     0K    32K RUN     0  26:46  0.00% idle
 3395 fk          1  22    0   465M   329M select  0   7:27  0.00% Xorg
 3398 fk          1  20    0 96120K  9704K RUN     0   0:49  0.00% e16
   12 root       22 -84    -     0K   352K WAIT    1   0:37  0.00% intr
   26 root        1  20    -     0K    16K geli:w  0   0:31  0.00% g_eli[0] ada0s1d
   27 root        1  22    -     0K    16K geli:w  1   0:29  0.00% g_eli[1] ada0s1d

The stuck zpool's stack:

fk@r500 ~ $sudo procstat -kk $(pgrep zpool)
  PID    TID COMM             TDNAME           KSTACK
 5087 100490 zpool            initial thread   mi_switch+0x174 sleepq_wait+0x42 __lockmgr_args+0x7a3 vop_stdlock+0x39 VOP_LOCK1_APV+0x52 _vn_lock+0x47 vflush+0x125 zfs_umount+0x9f dounmount+0x31e unmount+0x38b syscallenter+0x331 syscall+0x4b Xfast_syscall+0xdd

When I re-attached the disk, g_event kept eating CPU.

I'm using a script to attach and import pools on USB devices and as
it currently doesn't handle faulted pools, it tried to import the
already faulted pool, which resulted in a zpool core dump.

Mar 7 19:27:25 r500 sudo: fk : TTY=ttyv0 ; PWD=/home/fk ; USER=root ; COMMAND=/sbin/geli attach -j - -k /home/fk/geli-keys/toshiba.key /dev/label/toshiba
Mar 7 19:27:33 r500 sudo: fk : TTY=ttyv0 ; PWD=/home/fk ; USER=root ; COMMAND=/sbin/zpool import toshiba
Mar 7 19:27:33 r500 kernel: GEOM_ELI: Device label/toshiba.eli created.
Mar 7 19:27:33 r500 kernel: GEOM_ELI: Encryption: AES-CBC 128
Mar 7 19:27:33 r500 kernel: GEOM_ELI: Crypto: software
Mar 7 19:27:33 r500 kernel: pid 5206 (zpool), uid 0: exited on signal 11 (core dumped)

fk@r500 ~ $gdb /sbin/zpool zpool.core
GNU gdb 6.1.1 [FreeBSD]
Copyright 2004 Free Software Foundation, Inc.
[...]
Loaded symbols for /libexec/ld-elf.so.1
#0 0x000000080085ec7d in avl_insert () from /lib/libavl.so.2
[New Thread 802807400 (LWP 100524/initial thread)]
(gdb) where
#0 0x000000080085ec7d in avl_insert () from /lib/libavl.so.2
#1 0x000000080085ed5e in avl_add () from /lib/libavl.so.2
#2 0x0000000801ae9105 in zpool_find_import_cached () from /lib/libzfs.so.2
#3 0x0000000000409deb in zpool_do_import ()
#4 0x0000000000406fa9 in main ()

This time rebooting into single-user mode to export the faulted
pool wasn't necessary, but I doubt that the patch had anything
to do with it.

The second time I was using today's CURRENT instead of yesterday's,
but I don't think there were any ZFS-related commits.

Fabian

--MP_/L/L31YZZ9emswjliN6_sjlJ
Content-Type: text/x-patch
Content-Transfer-Encoding: quoted-printable
Content-Disposition: attachment; filename=0001-In-log_sysevent-only-warn-about-the-unsupported-type.patch

From 9689a1342d30bb0408afd694b4146cf0d84b61c3 Mon Sep 17 00:00:00 2001
From: Fabian Keil
Date: Mon, 7 Mar 2011 12:59:03 +0100
Subject: [PATCH] In log_sysevent(), only warn about the unsupported type if
 the type is different than the last unsupported one.

---
 .../compat/opensolaris/kern/opensolaris_sysevent.c | 13 +++++++++++--
 1 files changed, 11 insertions(+), 2 deletions(-)

diff --git a/sys/cddl/compat/opensolaris/kern/opensolaris_sysevent.c b/sys/cddl/compat/opensolaris/kern/opensolaris_sysevent.c
index ad9ba00..bc697ff 100644
--- a/sys/cddl/compat/opensolaris/kern/opensolaris_sysevent.c
+++ b/sys/cddl/compat/opensolaris/kern/opensolaris_sysevent.c
@@ -286,9 +286,18 @@ log_sysevent(sysevent_t *evp, int flag, sysevent_id_t *eid)
 			break;
 		}
 		default:
-			printf("%s: type %d is not implemented\n", __func__,
-			    nvpair_type(elem));
+			{
+			static int last_unsupported_type;
+			int unsupported_type = nvpair_type(elem);
+
+			if (last_unsupported_type != unsupported_type)
+			{
+				printf("%s: type %d is not implemented\n",
+				    __func__, unsupported_type);
+				last_unsupported_type = unsupported_type;
+			}
 			break;
+			}
 		}
 	}
 
-- 
1.7.4.1

--MP_/L/L31YZZ9emswjliN6_sjlJ--

--Sig_/oOazOkkVEQVl61drUilc=as
Content-Type: application/pgp-signature; name=signature.asc
Content-Disposition: attachment; filename=signature.asc

-----BEGIN PGP SIGNATURE-----
Version: GnuPG v2.0.17 (FreeBSD)

iEYEARECAAYFAk11LL0ACgkQBYqIVf93VJ0ZWwCfd33ylh/Wa1UEAEzq1DLAeVV3
chEAn1EbtLniRqrV0A/UijEnl6EAZ9TL
=vRYt
-----END PGP SIGNATURE-----

--Sig_/oOazOkkVEQVl61drUilc=as--

From owner-freebsd-fs@FreeBSD.ORG Mon Mar 7 19:26:27 2011 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 06C751065670 for ; Mon, 7 Mar 2011 19:26:27 +0000 (UTC) (envelope-from freebsd-listen@fabiankeil.de) Received: from smtprelay06.ispgateway.de (smtprelay06.ispgateway.de [80.67.31.101]) by mx1.freebsd.org (Postfix) with ESMTP id 421208FC12 for ; Mon, 7 Mar 2011 19:26:26 +0000 (UTC) Received: from [78.34.189.182] (helo=r500.local) by smtprelay06.ispgateway.de with esmtpsa (TLSv1:AES128-SHA:128) (Exim 4.68) (envelope-from ) id 1Pwg4W-0007Mp-Rm for freebsd-fs@freebsd.org; Mon, 07
Mar 2011 20:26:25 +0100 Date: Mon, 7 Mar 2011 20:25:31 +0100 From: Fabian Keil To: freebsd-fs@freebsd.org Message-ID: <20110307202531.2c90ff5a@r500.local> In-Reply-To: <20110307200634.3c0f92df@r500.local> References: <20110227202957.GD1992@garage.freebsd.pl> <20110228192129.119cac0c@r500.local> <20110307200634.3c0f92df@r500.local> X-Mailer: Claws Mail 3.7.8 (GTK+ 2.22.1; amd64-portbld-freebsd9.0) X-PGP-KEY-URL: http://www.fabiankeil.de/gpg-keys/freebsd-listen-2008-08-18.asc Mime-Version: 1.0 Content-Type: multipart/signed; micalg=PGP-SHA1; boundary="Sig_/CgFsehXtW9zopU6fiNqtLkR"; protocol="application/pgp-signature" X-Df-Sender: 775067 Subject: Re: ZFSv28: log_sysevent: type 19 is not implemented X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Mon, 07 Mar 2011 19:26:27 -0000 --Sig_/CgFsehXtW9zopU6fiNqtLkR Content-Type: text/plain; charset=US-ASCII Content-Transfer-Encoding: quoted-printable Fabian Keil wrote: > Fabian Keil wrote: >=20 > > Pawel Jakub Dawidek wrote: > >=20 > > > I just committed ZFSv28 to HEAD. >=20 > > Anyway, the things I tested so far (zfs/zpool upgrade, > > delegation, send, receive, snapshot) worked fine. 
>=20 > After unintentionally unplugging an USB disk with this > zpool on it: >=20 > | fk@r500 ~ $zpool status toshiba > | pool: toshiba > | state: ONLINE > | scan: none requested > | config: > | > | NAME STATE READ WRITE CKSUM > | toshiba ONLINE 0 0 0 > | label/toshiba.eli ONLINE 0 0 0 > | > | errors: No known data errors >=20 > the system became sluggish and /var/log/messages got spammed > with error messages: >=20 > Mar 6 21:33:10 r500 kernel: ugen7.2: at usbus7 (disconnected) > Mar 6 21:33:10 r500 kernel: umass1: at uhub7, port 1, addr 2 (disconnect= ed) > Mar 6 21:33:10 r500 kernel: (pass3:umass-sim1:1:0:0): lost device > Mar 6 21:33:10 r500 kernel: (pass3:umass-sim1:1:0:0): removing device en= try > Mar 6 21:33:10 r500 kernel: (da1:umass-sim1:1:0:0): lost device > Mar 6 21:33:10 r500 kernel: (da1:umass-sim1:1:0:0): Synchronize cache fa= iled, status =3D=3D 0xa, scsi status =3D=3D 0x0 > Mar 6 21:33:10 r500 kernel: (da1:umass-sim1:1:0:0): removing device entry > Mar 6 21:33:10 r500 kernel: log_sysevent: type 19 is not implemented > Mar 6 21:33:10 r500 last message repeated 50 times > Mar 6 21:33:10 r500 kernel: log_sysevent: type 19 is not implementedlog_= sysevent: type 19 is not im > Mar 6 21:33:10 r500 kernel: plemented > Mar 6 21:33:10 r500 kernel: log_sysevent: type 19 is not implemented > Mar 6 21:33:10 r500 last message repeated 238 times > Mar 6 21:33:10 r500 kernel: log_sysevent: type 19 is not implementedlog_= sysevent: type 19 is > Mar 6 21:33:10 r500 kernel: not implemented > Mar 6 21:33:10 r500 kernel: log_sysevent: type 19 is not implementedlog_= sysevent: type 19 is not impleme > Mar 6 21:33:10 r500 kernel: nted > Mar 6 21:33:10 r500 kernel: log_sysevent: type 19 is not implemented > Mar 6 21:33:10 r500 last message repeated 87 times > Mar 6 21:33:10 r500 kernel: log_sysevent: type 19 is not implementedlog_= sysevent: type 19 is not implemented > Mar 6 21:33:10 r500 kernel:=20 > Mar 6 21:33:10 r500 kernel: log_sysevent: type 19 is not implemented 
> Mar 6 21:33:10 r500 last message repeated 47 times > Mar 6 21:33:11 r500 kernel: type 19 is not implemented > Mar 6 21:33:11 r500 kernel: log_sysevent: type 19 is not implemented > Mar 6 21:33:11 r500 last message repeated 1527 times > Mar 6 21:33:11 r500 kernel: log_sysevent: type 19 is not implementedlog_= sysevent: type 19 is not implemented > Mar 6 21:33:11 r500 kernel:=20 > Mar 6 21:33:11 r500 kernel: log_sysevent: type 19 is not implemented >=20 > fk@r500 ~ $zcat /var/log/messages.0.bz2 | grep -c "type 19 is not impleme= nted" > 34101 >=20 > At the time of the unplugging, a video was read from the pool. >=20 > When trying to export the pool, zpool export hung. >=20 > After rebooting the system, the message was shown two > more times between the creation of the provider for the > main zpool and the swap device: >=20 > Mar 6 21:43:20 r500 kernel: GEOM_ELI: Device ada0s1d.eli created. > Mar 6 21:43:20 r500 kernel: GEOM_ELI: Encryption: AES-CBC 128 > Mar 6 21:43:20 r500 kernel: GEOM_ELI: Crypto: software > Mar 6 21:43:20 r500 kernel: Trying to mount root from ufs:/dev/ada0s1a [= rw]... > Mar 6 21:43:20 r500 kernel: WARNING: / was not properly dismounted > Mar 6 21:43:20 r500 kernel: start_init: trying /sbin/init > Mar 6 21:43:20 r500 kernel: (cd0:ahcich1:0:0:0): SCSI status error > Mar 6 21:43:20 r500 kernel: (cd0:ahcich1:0:0:0): Requesting SCSI sense d= ata > Mar 6 21:43:20 r500 kernel: (cd0:ahcich1:0:0:0): SCSI status error > Mar 6 21:43:20 r500 kernel: (cd0:ahcich1:0:0:0): READ CAPACITY. 
CDB: 25 = 0 0 0 0 0 0 0 0 0=20 > Mar 6 21:43:20 r500 kernel: (cd0:ahcich1:0:0:0): CAM status: SCSI Status= Error > Mar 6 21:43:20 r500 kernel: (cd0:ahcich1:0:0:0): SCSI status: Check Cond= ition > Mar 6 21:43:20 r500 kernel: (cd0:ahcich1:0:0:0): SCSI sense: NOT READY a= sc:3a,1 (Medium not present - tray closed) > Mar 6 21:43:20 r500 kernel: (cd0:ahcich1:0:0:0): Error 6, Unretryable er= ror > Mar 6 21:43:20 r500 kernel: log_sysevent: type 19 is not implemented > Mar 6 21:43:20 r500 kernel: log_sysevent: type 19 is not implemented > Mar 6 21:43:20 r500 kernel: GEOM_ELI: Device ada0s1b.eli created. > Mar 6 21:43:20 r500 kernel: GEOM_ELI: Encryption: AES-XTS 256 > Mar 6 21:43:20 r500 kernel: GEOM_ELI: Crypto: software > Mar 6 21:43:20 r500 kernel: (cd0:ahcich1:0:0:0): SCSI status error > Mar 6 21:43:20 r500 kernel: (cd0:ahcich1:0:0:0): Requesting SCSI sense d= ata > Mar 6 21:43:20 r500 kernel: (cd0:ahcich1:0:0:0): SCSI status error > Mar 6 21:43:20 r500 kernel: (cd0:ahcich1:0:0:0): READ CAPACITY. CDB: 25 = 0 0 0 0 0 0 0 0 0=20 > Mar 6 21:43:20 r500 kernel: (cd0:ahcich1:0:0:0): CAM status: SCSI Status= Error > Mar 6 21:43:20 r500 kernel: (cd0:ahcich1:0:0:0): SCSI status: Check Cond= ition > Mar 6 21:43:20 r500 kernel: (cd0:ahcich1:0:0:0): SCSI sense: NOT READY a= sc:3a,1 (Medium not present - tray closed) > Mar 6 21:43:20 r500 kernel: (cd0:ahcich1:0:0:0): Error 6, Unretryable er= ror > Mar 6 21:43:20 r500 kernel: (cd0:ahcich1:0:0:0): SCSI status error > Mar 6 21:43:20 r500 kernel: (cd0:ahcich1:0:0:0): Requesting SCSI sense d= ata > Mar 6 21:43:20 r500 kernel: (cd0:ahcich1:0:0:0): SCSI status error > Mar 6 21:43:20 r500 kernel: (cd0:ahcich1:0:0:0): READ CAPACITY. 
CDB: 25 = 0 0 0 0 0 0 0 0 0=20 > Mar 6 21:43:20 r500 kernel: (cd0:ahcich1:0:0:0): CAM status: SCSI Status= Error > Mar 6 21:43:20 r500 kernel: (cd0:ahcich1:0:0:0): SCSI status: Check Cond= ition > Mar 6 21:43:20 r500 kernel: (cd0:ahcich1:0:0:0): SCSI sense: NOT READY a= sc:3a,1 (Medium not present - tray closed) > Mar 6 21:43:20 r500 kernel: (cd0:ahcich1:0:0:0): Error 6, Unretryable er= ror > Mar 6 21:43:20 r500 kernel: lo1: bpf attached > Mar 6 21:43:20 r500 kernel: wlan0: bpf attached > Mar 6 21:43:20 r500 kernel: wlan0: bpf attached > Mar 6 21:43:20 r500 kernel: wlan0: Ethernet address: 00:[...] > Mar 6 21:43:20 r500 kernel: firmware: 'iwn5000fw' version 0: 353240 byte= s loaded at 0xffffffff814120b0 > Mar 6 21:43:20 r500 kernel: firmware: 'iwn5000fw' version 0: 353240 byte= s loaded at 0xffffffff814120b0 > Mar 6 21:43:20 r500 kernel: bge0: Disabling fastboot > Mar 6 21:43:20 r500 kernel: bge0: Disabling fastboot > Mar 6 21:43:20 r500 savecore: /dev/ada0s1b: Operation not permitted > Mar 6 21:43:21 r500 named[2219]: starting BIND 9.6.3 -t /var/named -u bi= nd > Mar 6 21:43:21 r500 named[2219]: built with '--prefix=3D/usr' '--infodir= =3D/usr/share/info' '--mandir=3D/usr/share/man' '--enable-threads' '--enabl= e-getifaddrs' '--disable-linux-caps' '--with-openssl=3D/usr' '--with-random= dev=3D/dev/random' '--without-idn' '--without-libxml2' > Mar 6 21:43:21 r500 named[2219]: command channel listening on 127.0.0.1#= 953 > Mar 6 21:43:21 r500 named[2219]: command channel listening on ::1#953 > Mar 6 21:43:21 r500 named[2219]: the working directory is not writable > Mar 6 21:43:21 r500 named[2219]: running > Mar 6 21:43:53 r500 wpa_supplicant[512]: WPA: Group rekeying completed w= ith 00:[...] [GTK=3DTKIP] > Mar 6 21:44:14 r500 syslogd: exiting on signal 15 >=20 > As the boot process got stuck with no additional messages > printed, I rebooted into single-user mode, exported the > faulted pool and finished the boot process. 
The system > came back normally and the pool could be imported without > issues >=20 > fk@r500 ~ $grep zfs /boot/loader.conf | grep -v "^ *#" > zfs_load=3D"YES" >=20 > I used the attached patch to stop the log spam, but the > main issue seems to be reproducible. The top output after > detaching the pool: >=20 > last pid: 4985; load averages: 10.47, 3.96, 2.02 up 0+02:01:49 19:= 20:49 > 552 processes: 12 running, 518 sleeping, 22 waiting > CPU: 1.2% user, 0.0% nice, 98.8% system, 0.0% interrupt, 0.0% idle > Mem: 267M Active, 95M Inact, 859M Wired, 2032K Cache, 7872K Buf, 692M Free > Swap: 2048M Total, 2048M Free >=20 > PID USERNAME THR PRI NICE SIZE RES STATE C TIME WCPU COMM= AND = =20 > 2 root 1 -8 - 0K 16K CPU1 1 1:26 96.97% g_ev= ent > 0 root 387 -8 0 0K 6192K - 0 2:11 80.66% kern= el > 1702 root 1 76 0 6276K 796K RUN 0 0:15 11.96% devd > 11 root 2 155 ki31 0K 32K RUN 0 26:46 0.00% idle > 3395 fk 1 22 0 465M 329M select 0 7:27 0.00% Xorg > 3398 fk 1 20 0 96120K 9704K RUN 0 0:49 0.00% e16 > 12 root 22 -84 - 0K 352K WAIT 1 0:37 0.00% intr > 26 root 1 20 - 0K 16K geli:w 0 0:31 0.00% g_el= i[0] ada0s1d > 27 root 1 22 - 0K 16K geli:w 1 0:29 0.00% g_el= i[1] ada0s1d >=20 > The stuck zpool's stack: >=20 > fk@r500 ~ $sudo procstat -kk $(pgrep zpool) > PID TID COMM TDNAME KSTACK = =20 > 5087 100490 zpool initial thread mi_switch+0x174 sleepq_wai= t+0x42 __lockmgr_args+0x7a3 vop_stdlock+0x39 VOP_LOCK1_APV+0x52 _vn_lock+0x= 47 vflush+0x125 zfs_umount+0x9f dounmount+0x31e unmount+0x38b syscallenter+= 0x331 syscall+0x4b Xfast_syscall+0xdd=20 >=20 > When I re-attached the disk, g_event kept eating cpu. >=20 > I'm using a script to attach and import pools on USB devices and as > it currently doesn't handle faulted pools, it tried to import the > already faulted pool, which resulted in a zpool core dump. 
>=20 > Mar 7 19:27:25 r500 sudo: fk : TTY=3Dttyv0 ; PWD=3D/home/fk ; USER= =3Droot ; COMMAND=3D/sbin/geli attach -j - -k /home/fk/geli-keys/toshiba.ke= y /dev/label/toshiba > Mar 7 19:27:33 r500 sudo: fk : TTY=3Dttyv0 ; PWD=3D/home/fk ; USER= =3Droot ; COMMAND=3D/sbin/zpool import toshiba > Mar 7 19:27:33 r500 kernel: GEOM_ELI: Device label/toshiba.eli created. > Mar 7 19:27:33 r500 kernel: GEOM_ELI: Encryption: AES-CBC 128 > Mar 7 19:27:33 r500 kernel: GEOM_ELI: Crypto: software > Mar 7 19:27:33 r500 kernel: pid 5206 (zpool), uid 0: exited on signal 11= (core dumped) >=20 > fk@r500 ~ $gdb /sbin/zpool zpool.core=20 > GNU gdb 6.1.1 [FreeBSD] > Copyright 2004 Free Software Foundation, Inc. > [...] > Loaded symbols for /libexec/ld-elf.so.1 > #0 0x000000080085ec7d in avl_insert () from /lib/libavl.so.2 > [New Thread 802807400 (LWP 100524/initial thread)] > (gdb) where > #0 0x000000080085ec7d in avl_insert () from /lib/libavl.so.2 > #1 0x000000080085ed5e in avl_add () from /lib/libavl.so.2 > #2 0x0000000801ae9105 in zpool_find_import_cached () from /lib/libzfs.so= .2 > #3 0x0000000000409deb in zpool_do_import () > #4 0x0000000000406fa9 in main () >=20 > This time rebooting into single-user mode to export the faulted > pool wasn't necessary, but I doubt that the patch had anything > to do with it. >=20 > The second time I was using today's CURRENT instead of yesterday's, > but I don't think there were any ZFS-related commits. I just tried it a third time, this time with the pool imported but not in active use. 
Again I couldn't export the pool afterwards:

fk@r500 ~ $sudo zpool list
NAME      SIZE  ALLOC   FREE    CAP  DEDUP  HEALTH   ALTROOT
tank      228G   166G  62.1G    72%  1.00x  ONLINE   -
toshiba  1.36T  1.31T  47.9G    96%  1.00x  UNAVAIL  -
fk@r500 ~ $sudo zpool export toshiba
load: 1.05 cmd: zpool 4117 [tx->tx_sync_done_cv)] 249.93r 0.00u 0.06s 0% 2528k
load: 1.05 cmd: zpool 4117 [tx->tx_sync_done_cv)] 250.11r 0.00u 0.06s 0% 2528k
load: 1.05 cmd: zpool 4117 [tx->tx_sync_done_cv)] 250.25r 0.00u 0.06s 0% 2528k

fk@r500 ~ $sudo procstat -kk $(pgrep zpool)
  PID    TID COMM             TDNAME           KSTACK
 4117 102251 zpool            initial thread   mi_switch+0x174 sleepq_wait+0x42 _cv_wait+0x129 txg_wait_synced+0x85 dmu_tx_assign+0x170 spa_history_log+0x43 zfs_log_history+0x82 zfs_ioc_pool_export+0x2a zfsdev_ioctl+0xe6 devfs_ioctl_f+0x7b kern_ioctl+0x102 ioctl+0xfd syscallenter+0x331 syscall+0x4b Xfast_syscall+0xdd

And importing the pool again wasn't possible either:

fk@r500 ~ $zpool list
NAME      SIZE  ALLOC   FREE    CAP  DEDUP  HEALTH   ALTROOT
tank      228G   166G  62.1G    72%  1.00x  ONLINE   -
toshiba  1.36T  1.31T  47.9G    96%  1.00x  UNAVAIL  -
fk@r500 ~ $sudo zpool import toshiba
cannot import 'toshiba': a pool with that name is already created/imported,
and no additional pools with that name were found

The system stayed responsive, though, and other pools could
still be imported and exported. It also didn't result in a
"log_sysevent: type 19 is not implemented" message.
Fabian --Sig_/CgFsehXtW9zopU6fiNqtLkR Content-Type: application/pgp-signature; name=signature.asc Content-Disposition: attachment; filename=signature.asc -----BEGIN PGP SIGNATURE----- Version: GnuPG v2.0.17 (FreeBSD) iEYEARECAAYFAk11MS4ACgkQBYqIVf93VJ2AagCgj3v2JqFg5PX/mIIwqiPwRWbV LUcAoJExtK7u/bRsdssIvV/F9PzFFhNq =A/dc -----END PGP SIGNATURE----- --Sig_/CgFsehXtW9zopU6fiNqtLkR-- From owner-freebsd-fs@FreeBSD.ORG Tue Mar 8 00:28:23 2011 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id A3CC51065672 for ; Tue, 8 Mar 2011 00:28:23 +0000 (UTC) (envelope-from jdc@koitsu.dyndns.org) Received: from qmta10.emeryville.ca.mail.comcast.net (qmta10.emeryville.ca.mail.comcast.net [76.96.30.17]) by mx1.freebsd.org (Postfix) with ESMTP id 885368FC15 for ; Tue, 8 Mar 2011 00:28:23 +0000 (UTC) Received: from omta04.emeryville.ca.mail.comcast.net ([76.96.30.35]) by qmta10.emeryville.ca.mail.comcast.net with comcast id GQTN1g00A0lTkoCAAQUN9c; Tue, 08 Mar 2011 00:28:23 +0000 Received: from koitsu.dyndns.org ([98.248.33.18]) by omta04.emeryville.ca.mail.comcast.net with comcast id GQUM1g0020PUQVN8QQUMty; Tue, 08 Mar 2011 00:28:22 +0000 Received: by icarus.home.lan (Postfix, from userid 1000) id 92F089B422; Mon, 7 Mar 2011 16:28:20 -0800 (PST) Date: Mon, 7 Mar 2011 16:28:20 -0800 From: Jeremy Chadwick To: Martin Matuska Message-ID: <20110308002820.GA26744@icarus.home.lan> References: <1299232133.18671.3.camel@pc286.embl.fr> <20110304100517.GA23249@icarus.home.lan> <4D74AA26.10606@FreeBSD.org> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <4D74AA26.10606@FreeBSD.org> User-Agent: Mutt/1.5.21 (2010-09-15) Cc: freebsd-fs@freebsd.org, Micka??l Can??vet Subject: Re: kmem_map too small with ZFS and 8.2-RELEASE X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Filesystems 
List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Tue, 08 Mar 2011 00:28:23 -0000

On Mon, Mar 07, 2011 at 10:49:26AM +0100, Martin Matuska wrote:
> In 8-STABLE (amd64), starting with SVN revision 214620
> vm.kmem_size_scale defaults to 1.
>
> This means in 8.2-RELEASE, vm.kmem_size is automatically set to the
> amount of your system RAM,
> so vm.kmem_size="16G" is automatically set on a system with 16GB RAM.
>
> Dňa 04.03.2011 11:05, Jeremy Chadwick wrote / napísal(a):
> > On Fri, Mar 04, 2011 at 10:48:53AM +0100, Mickaël Canévet wrote:
> >>> I'd use vm.kmem_size="32G" (i.e. twice your RAM) and that's it.
> >> Should I also increase vfs.zfs.arc_max ?
> > You should adjust vm.kmem_size, but not vm.kmem_size_max.
> >
> > You can adjust vfs.zfs.arc_max to basically ensure system stability.
> > This thread is acting as evidence that there are probably edge cases
> > where the kmem too small panic can still happen despite the limited ARC
> > maximum defaults.
> >
> > For a 16GB system, I'd probably use these settings:
> >
> > vm.kmem_size="16384M"
> > vfs.zfs.arc_max="13312M"
> >
> > I would also use these two settings:
> >
> > # Disable ZFS prefetching
> > # http://southbrain.com/south/2008/04/the-nightmare-comes-slowly-zfs.html
> > # Increases overall speed of ZFS, but when disk flushing/writes occur,
> > # system is less responsive (due to extreme disk I/O).
> > # NOTE: Systems with 8GB of RAM or more have prefetch enabled by
> > # default.
> > vfs.zfs.prefetch_disable="1"
> >
> > # Decrease ZFS txg timeout value from 30 (default) to 5 seconds. This
> > # should increase throughput and decrease the "bursty" stalls that
> > # happen during immense I/O with ZFS.
> > # http://lists.freebsd.org/pipermail/freebsd-fs/2009-December/007343.html
> > # http://lists.freebsd.org/pipermail/freebsd-fs/2009-December/007355.html
> > vfs.zfs.txg.timeout="5"
> >
> > The advice in the Wiki is outdated, especially for 8.2-RELEASE. Best
> > not to follow it as of this writing.
> >
> >> Do you have any idea why the kernel panicked at only 8GB allocated ?
> > I do not. A kernel developer will have to comment on that.
> >
> > Please attempt to reproduce the problem. If you can reproduce it
> > reliably, this will greatly help kernel developers tracking down the
> > source of the problem.

Thanks -- you're absolutely right. I had forgotten all about
vm.kmem_size_scale. :-)

So yeah, with 8.2-RELEASE onward (rather than get into individual SVN
revisions I'm using 8.2-RELEASE as "a point in time"), vm.kmem_size will
default to the amount of usable memory (usually slightly less than
hw.physmem). Validation:

$ sysctl hw.realmem hw.physmem hw.usermem
hw.realmem: 9395240960
hw.physmem: 8579981312
hw.usermem: 1086521344
$ sysctl vm.kmem_size
vm.kmem_size: 8303894528

As such, one only needs to tune vfs.zfs.arc_max if desired.

--
| Jeremy Chadwick jdc@parodius.com |
| Parodius Networking http://www.parodius.com/ |
| UNIX Systems Administrator Mountain View, CA, USA |
| Making life hard for others since 1977.
PGP 4BD6C0CB | From owner-freebsd-fs@FreeBSD.ORG Tue Mar 8 14:41:12 2011 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id D76A7106564A; Tue, 8 Mar 2011 14:41:12 +0000 (UTC) (envelope-from smckay@internode.on.net) Received: from ipmail06.adl6.internode.on.net (ipmail06.adl6.internode.on.net [150.101.137.145]) by mx1.freebsd.org (Postfix) with ESMTP id 0C9298FC08; Tue, 8 Mar 2011 14:41:11 +0000 (UTC) X-IronPort-Anti-Spam-Filtered: true X-IronPort-Anti-Spam-Result: AhgFAFbKdU120Fhq/2dsb2JhbACYUAGNbnW/E4VjBA Received: from ppp118-208-88-106.lns20.bne4.internode.on.net (HELO dungeon.home) ([118.208.88.106]) by ipmail06.adl6.internode.on.net with ESMTP; 09 Mar 2011 00:55:52 +1030 Received: from dungeon.home (localhost [127.0.0.1]) by dungeon.home (8.14.3/8.14.3) with ESMTP id p28EPQtM002115; Wed, 9 Mar 2011 00:25:26 +1000 (EST) (envelope-from mckay) Message-Id: <201103081425.p28EPQtM002115@dungeon.home> From: Stephen McKay To: freebsd-fs@freebsd.org Date: Wed, 09 Mar 2011 00:25:26 +1000 Sender: smckay@internode.on.net Cc: Stephen McKay Subject: Constant minor ZFS corruption X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Tue, 08 Mar 2011 14:41:13 -0000 Hi! At work I've built a few ZFS based FreeBSD boxes, culminating in a decent sized rack mount server with 12 2TB disks. Unfortunately, I can't make this server stable. Over the last week or so I've repeated a cycle of: 1) copy 1TB of data from an NFS mount into a ZFS filesystem 2) scrub So far, every one of these has exhibited checksum errors in one or both stages. Using smartmontools, I know that none of the disks have reported any errors. No errors are reported by the disk drivers either (ahci and mps). No ECC (MCA) errors are reported. 
The problem occurs with 8.2.0 and with 9-current (note: I kept the
zfsv15 pool). I've swapped the memory with a different brand and I'm
now running it at low speed (800MHz). I disabled hyperthreading and
all the other funky CPU related things I could find in the BIOS. I've
tried the normal and "new experimental" NFS client (mount -t newnfs ...)
Nothing so far has had any effect.

At all times I can build "world" with no errors, even if I put in
stupidly high parallel "-j" values and cause severe swapping. I tried
both with the source on ufs and with it on zfs. No problems. So the
hardware seems generally stable.

I wrote a program to generate streams of pseudorandom trash (using
srandom() and random()). I generated a TB of this onto the ZFS pool
and read it back. No problems. I even did a few hundred GB to two
files in parallel. Again, no problems. So ZFS itself seems generally
sound.

However, copying 1TB of data from NFS to ZFS always corrupts just a
few blocks, as reported by ZFS during the copy, or in the subsequent
scrub. These corrupted blocks may be on any disk or disks, and are
not limited to just one controller or a subset of disks or to one
vdev. ZFS has always successfully reconstructed the data, but I'm
hoping to use that redundancy to guard against failing disks, not
against whatever gremlin is scrambling my data on the way in.

The hardware is:

Asus P7F-E (includes 6 3Gb/s SATA ports)
PIKE2008 (8 port SAS card based on LSI2008 chip, supports 6Gb/s)
Xeon X3440 (2.53GHz 4core with hyperthreading)
Chenbro CSPC-41416AB rackmount case
2x 2GB 1333MHz ECC DDR3 RAM (Corsair)
    (currently using 1x 2GB Kingston ECC RAM)
2x Seagate ST3500418AS 500GB normal disks, for OS booting
12x Seagate ST2000DL003 2TB "green" disks (yes, with 4kB sectors)
    (4 disks on the onboard Intel SATA controller using ahci driver,
    8 disks on the PIKE using the mps driver)

What experiments do you think I should try?
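
[Editor's sketch] For anyone wanting to repeat the pseudorandom
write-and-read-back test mentioned above, something along these lines
might do. This is not the program that was actually used; the names
fill_stream() and verify_stream(), and the choice of one byte per
random() call, are assumptions made purely for illustration. The only
idea taken from the message is that srandom() with the same seed makes
random() replay the same sequence, so the checker can regenerate the
expected bytes instead of keeping a second copy.

```c
#define _XOPEN_SOURCE 600	/* expose srandom()/random() on strict compilers */
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Fill buf with the len-byte stream produced after srandom(seed). */
static void
fill_stream(unsigned char *buf, size_t len, unsigned int seed)
{
	size_t i;

	assert(buf != NULL);
	srandom(seed);
	for (i = 0; i < len; i++)
		buf[i] = (unsigned char)(random() & 0xff);
}

/*
 * Regenerate the same stream and compare it with what was read back.
 * Returns the offset of the first mismatch, or len if every byte matches.
 */
static size_t
verify_stream(const unsigned char *buf, size_t len, unsigned int seed)
{
	size_t i;

	assert(buf != NULL);
	srandom(seed);
	for (i = 0; i < len; i++)
		if (buf[i] != (unsigned char)(random() & 0xff))
			return (i);
	return (len);
}
```

Calling both functions with the same seed on a buffer that has been
written to the pool and read back should return len; any smaller value
is the offset of the first byte that changed in that buffer.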
I note that during the large copies from NFS to ZFS, the "inactive" page list takes all the spare memory, starving the ARC, which drops to its minimum size. During make world and my junk creation tests the ARC remained full size. Could there be a bug in the ARC shrinking code? I also note that -current spits out: kernel: log_sysevent: type 19 is not implemented instead of what 8.2.0 produces: root: ZFS: checksum mismatch, zpool=dread path=/dev/gpt/bay14 offset=766747611136 size=4096 I have added some code to cddl/compat/opensolaris/kern/opensolaris_sysevent.c to print NVLIST elements (type 19) and hope to see the results at the end of the next run. BTW, does /etc/devd.conf need tweaking now? If ZFSv28 produces different format error messages they may not be logged. Indeed, I have added a printf in log_sysevent() because I can't (yet) make devd do what I want. Also, -current produces many scary lock order reversals. Are we still ignoring these? Here's the pool layout: # zpool status pool: dread state: ONLINE status: One or more devices has experienced an unrecoverable error. An attempt was made to correct the error. Applications are unaffected. action: Determine if the device needs to be replaced, and clear the errors using 'zpool clear' or replace the device with 'zpool replace'. see: http://www.sun.com/msg/ZFS-8000-9P scan: scrub in progress since Tue Mar 8 15:14:51 2011 5.66T scanned out of 8.49T at 402M/s, 2h2m to go 92K repaired, 66.71% done config: NAME STATE READ WRITE CKSUM dread ONLINE 0 0 0 raidz2-0 ONLINE 0 0 0 gpt/bay3 ONLINE 0 0 0 gpt/bay4 ONLINE 0 0 6 (repairing) gpt/bay5 ONLINE 0 0 0 gpt/bay6 ONLINE 0 0 0 gpt/bay7 ONLINE 0 0 0 gpt/bay8 ONLINE 0 0 0 raidz2-1 ONLINE 0 0 0 gpt/bay9 ONLINE 0 0 1 (repairing) gpt/bay10 ONLINE 0 0 6 (repairing) gpt/bay11 ONLINE 0 0 2 (repairing) gpt/bay12 ONLINE 0 0 0 gpt/bay13 ONLINE 0 0 8 (repairing) gpt/bay14 ONLINE 0 0 0 errors: No known data errors Bay3 through 6 are on the onboard controller. 
Bay7 through 14 are on the PIKE card. Each disk is partitioned alike: # gpart show ada2 => 34 3907029101 ada2 GPT (1.8T) 34 94 - free - (47K) 128 128 1 freebsd-boot (64K) 256 3906994176 2 freebsd-zfs (1.8T) 3906994432 34703 - free - (17M) I used well known tricks to fool ZFS into using ashift=12 to align for lying 4kB sector drives. The next run will take NFS out of the equation (substituting SSH as a transport). Any ideas on what I could try after that? Stephen McKay. PS Anybody got a mirror of http://www.sun.com/msg/ZFS-8000-9P and similar pages? Oracle has hidden them all, so it's a bit silly to refer to them in our ZFS implementation. From owner-freebsd-fs@FreeBSD.ORG Tue Mar 8 15:47:53 2011 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 07E661065670 for ; Tue, 8 Mar 2011 15:47:53 +0000 (UTC) (envelope-from feld@feld.me) Received: from mail-iy0-f182.google.com (mail-iy0-f182.google.com [209.85.210.182]) by mx1.freebsd.org (Postfix) with ESMTP id CA5948FC16 for ; Tue, 8 Mar 2011 15:47:52 +0000 (UTC) Received: by iyj12 with SMTP id 12so5802822iyj.13 for ; Tue, 08 Mar 2011 07:47:52 -0800 (PST) Received: by 10.231.10.139 with SMTP id p11mr561052ibp.147.1299599272136; Tue, 08 Mar 2011 07:47:52 -0800 (PST) Received: from tech304 (supranet-tech.secure-on.net [66.170.8.18]) by mx.google.com with ESMTPS id d10sm745021ibb.6.2011.03.08.07.47.49 (version=TLSv1/SSLv3 cipher=OTHER); Tue, 08 Mar 2011 07:47:50 -0800 (PST) Content-Type: text/plain; charset=utf-8; format=flowed; delsp=yes To: freebsd-fs@freebsd.org References: <201103081425.p28EPQtM002115@dungeon.home> MIME-Version: 1.0 Content-Transfer-Encoding: 7bit From: "Mark Felder" Date: Tue, 08 Mar 2011 09:47:48 -0600 Message-ID: In-Reply-To: <201103081425.p28EPQtM002115@dungeon.home> User-Agent: Opera Mail/11.01 (FreeBSD) Subject: Re: Constant minor ZFS corruption X-BeenThere: freebsd-fs@freebsd.org 
X-Mailman-Version: 2.1.5 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Tue, 08 Mar 2011 15:47:53 -0000 On Tue, 08 Mar 2011 08:25:26 -0600, Stephen McKay wrote: > At work I've built a few ZFS based FreeBSD boxes, culminating in a > decent sized rack mount server with 12 2TB disks. Unfortunately, > I can't make this server stable. Highly interested in what FreeBSD version and what ZFS version and zpool version you're running. Regards, Mark From owner-freebsd-fs@FreeBSD.ORG Tue Mar 8 22:40:03 2011 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 321841065670 for ; Tue, 8 Mar 2011 22:40:03 +0000 (UTC) (envelope-from cforgeron@acsi.ca) Received: from mta03.eastlink.ca (mta03.eastlink.ca [24.224.136.9]) by mx1.freebsd.org (Postfix) with ESMTP id E3BD58FC0C for ; Tue, 8 Mar 2011 22:40:02 +0000 (UTC) MIME-version: 1.0 Content-transfer-encoding: 7BIT Content-type: text/plain; CHARSET=US-ASCII Received: from ip05.eastlink.ca ([unknown] [24.222.39.68]) by mta03.eastlink.ca (Sun Java(tm) System Messaging Server 7.3-11.01 64bit (built Sep 1 2009)) with ESMTP id <0LHR00EBQGAP1D02@mta03.eastlink.ca>; Tue, 08 Mar 2011 18:40:01 -0400 (AST) X-CMAE-Score: 0 X-CMAE-Analysis: v=1.1 cv=8reSTVRqS4Rq5Xx4Jai9N41eZpHz3D5gSX5rA0od4mg= c=1 sm=1 a=kj9zAlcOel0A:10 a=6I5d2MoRAAAA:8 a=cexIBkohAAAA:8 a=AfxWm_Faw7MPB4GGfNYA:9 a=UzHS8wTdMiIv_bm3hP0A:7 a=ZYc0BmPx8R0ewweO_n0I5hkVIqQA:4 a=CjuIK1q_8ugA:10 a=SV7veod9ZcQA:10 a=ya8imeLlRIUHCs7r:21 a=1gGZNCYUkpo8JCAv:21 a=/bLbuBD0lrv91xL1PDQKaA==:117 Received: from blk-222-10-85.eastlink.ca (HELO server7.acsi.ca) ([24.222.10.85]) by ip05.eastlink.ca with ESMTP; Tue, 08 Mar 2011 18:40:01 -0400 Received: from server7.acsi.ca ([192.168.9.7]) by server7.acsi.ca ([192.168.9.7]) with mapi; Tue, 08 Mar 2011 18:40:01 -0400 From: Chris Forgeron To: Stephen McKay , 
"freebsd-fs@freebsd.org" Date: Tue, 08 Mar 2011 18:40:00 -0400 Thread-topic: Constant minor ZFS corruption Thread-index: AcvdnunN3bpgHnUMRLWtJ4RE48s6qQAQhEfA Message-id: References: <201103081425.p28EPQtM002115@dungeon.home> In-reply-to: <201103081425.p28EPQtM002115@dungeon.home> Accept-Language: en-US Content-language: en-US X-MS-Has-Attach: X-MS-TNEF-Correlator: acceptlanguage: en-US Cc: Subject: RE: Constant minor ZFS corruption X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Tue, 08 Mar 2011 22:40:03 -0000 Don't rule out disk corruption here. I've applied close to 50 Seagate 1.5 TB drives through ZFS now, and I've had ZFS reject 3 of them without ever being able to find a problem with these drives with any other diagnostic program (Seatool, all disk checks, etc), including some of our data recovery software that gives us very detailed information about errors. It was always these three drives that acted up, and when I pulled them and replaced with fresh ones, my checksum errors stopped. This was under 9-CURRENT, probably the Dec 12-12 build that I experimented with for a while. It shouldn't matter which NFS client you use, if you're seeing ZFS checksum errors in zpool status, that won't be from whatever program is writing it. Have you make sure it's not always the same drives with the checksum errors? It make take a few days to know for sure.. ..oh, and don't forget about fun with Expanders. I assume you're using one? I've got 2 LSI2008 based controllers in my 9-Current machine without any fuss. That's running a 24 disk Mirror right now. -----Original Message----- From: owner-freebsd-fs@freebsd.org [mailto:owner-freebsd-fs@freebsd.org] On Behalf Of Stephen McKay Sent: Tuesday, March 08, 2011 10:25 AM To: freebsd-fs@freebsd.org Cc: Stephen McKay Subject: Constant minor ZFS corruption Hi! 
_______________________________________________ freebsd-fs@freebsd.org mailing list http://lists.freebsd.org/mailman/listinfo/freebsd-fs To unsubscribe, send any mail to "freebsd-fs-unsubscribe@freebsd.org" From owner-freebsd-fs@FreeBSD.ORG Tue Mar 8 23:01:37 2011 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 9B101106568A for ; Tue, 8 Mar 2011 23:01:37 +0000 (UTC) (envelope-from feld@feld.me) Received: from mail-iy0-f182.google.com (mail-iy0-f182.google.com [209.85.210.182]) by mx1.freebsd.org (Postfix) with ESMTP id 6DC388FC1B for ; Tue, 8 Mar 2011 23:01:37 +0000 (UTC) Received: by iyj12 with SMTP id 12so6237165iyj.13 for ; Tue, 08 Mar 2011 15:01:36 -0800 (PST) Received: by 10.43.71.13 with SMTP id yi13mr7080207icb.432.1299625296677; Tue, 08 Mar 2011 15:01:36 -0800 (PST) Received: from tech304 (supranet-tech.secure-on.net [66.170.8.18]) by mx.google.com with ESMTPS id u9sm974292ibe.20.2011.03.08.15.01.35 (version=TLSv1/SSLv3 cipher=OTHER); Tue, 08 Mar 2011 15:01:35 -0800 (PST) Content-Type: text/plain; charset=utf-8; format=flowed; delsp=yes To: freebsd-fs@freebsd.org References: <201103081425.p28EPQtM002115@dungeon.home> Date: Tue, 08 Mar 2011 17:01:34 -0600 MIME-Version: 1.0 Content-Transfer-Encoding: 7bit From: "Mark Felder" Message-ID: In-Reply-To: <201103081425.p28EPQtM002115@dungeon.home> User-Agent: Opera Mail/11.01 (FreeBSD) Subject: Re: Constant minor ZFS corruption X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Tue, 08 Mar 2011 23:01:37 -0000 On Tue, 08 Mar 2011 08:25:26 -0600, Stephen McKay wrote: > PS Anybody got a mirror of http://www.sun.com/msg/ZFS-8000-9P and similar > pages? Oracle has hidden them all, so it's a bit silly to refer to them > in our ZFS implementation. 
http://webcache.googleusercontent.com/search?q=cache:UopFp4sB0hMJ:www.sun.com/bigadmin/features/articles/zfs_part1.scalable.jsp+http://www.sun.com/msg/ZFS-8000-9P&cd=1&hl=en&ct=clnk&gl=us&client=opera&source=www.google.com From owner-freebsd-fs@FreeBSD.ORG Tue Mar 8 23:42:51 2011 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 0F4F81065673 for ; Tue, 8 Mar 2011 23:42:51 +0000 (UTC) (envelope-from artemb@gmail.com) Received: from mail-qw0-f54.google.com (mail-qw0-f54.google.com [209.85.216.54]) by mx1.freebsd.org (Postfix) with ESMTP id 7DF618FC26 for ; Tue, 8 Mar 2011 23:42:50 +0000 (UTC) Received: by qwj8 with SMTP id 8so4769865qwj.13 for ; Tue, 08 Mar 2011 15:42:49 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=gamma; h=domainkey-signature:mime-version:sender:in-reply-to:references:date :x-google-sender-auth:message-id:subject:from:to:cc:content-type :content-transfer-encoding; bh=3l1JOWFxDHs/aYAGmWgvGcNStcJqCZvahAp2hCcYF9I=; b=U19zCu+/fuQIJiessCJYL0uQnPpwYujCmtQs/X2vOEyMF5c71ey7nHZMg3vQPcq8qC mJn4+SmSTHHgy/sZWqZ46WzYLPSL17o8rlCdj6Qi/3yo7RohYLibwHn9d93vrkXvRGQa hrg7XdgaJSMlQuuWDwuYnS3IIy2Sz91wcD5eE= DomainKey-Signature: a=rsa-sha1; c=nofws; d=gmail.com; s=gamma; h=mime-version:sender:in-reply-to:references:date :x-google-sender-auth:message-id:subject:from:to:cc:content-type :content-transfer-encoding; b=ouvnun1cShGw4Tfwt2k8KOQTpGCqJgOVIaK/OK6+ePj2Z+l2nXDNzTht2tkMh6WK5W InCnrUNgFIQzrrRMdmP4jc38i0VAFBLZrQgyjXspxvVRmkeXKC+oA3t0t/mLwPCWhwcs herMRz7wfwKbrIb8JV3YmKFKKJeSQNlxQCpd4= MIME-Version: 1.0 Received: by 10.229.45.3 with SMTP id c3mr4459295qcf.249.1299627769848; Tue, 08 Mar 2011 15:42:49 -0800 (PST) Sender: artemb@gmail.com Received: by 10.229.28.131 with HTTP; Tue, 8 Mar 2011 15:42:49 -0800 (PST) In-Reply-To: References: <201103081425.p28EPQtM002115@dungeon.home> Date: Tue, 8 Mar 2011 15:42:49 -0800 
X-Google-Sender-Auth: QPwkVR6HCc7_4MAe6Gt10_ppkuM Message-ID: From: Artem Belevich To: Mark Felder Content-Type: text/plain; charset=ISO-8859-1 Content-Transfer-Encoding: quoted-printable Cc: freebsd-fs@freebsd.org Subject: Re: Constant minor ZFS corruption X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Tue, 08 Mar 2011 23:42:51 -0000 > On Tue, 08 Mar 2011 08:25:26 -0600, Stephen McKay wrote: > >> PS Anybody got a mirror of http://www.sun.com/msg/ZFS-8000-9P and similar >> pages? Oracle has hidden them all, so it's a bit silly to refer to them >> in our ZFS implementation. Wayback machine does wonders: http://web.archive.org/http://www.sun.com/msg/ZFS-8000-9P --Artem From owner-freebsd-fs@FreeBSD.ORG Wed Mar 9 12:57:11 2011 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id E151B106566B; Wed, 9 Mar 2011 12:57:10 +0000 (UTC) (envelope-from smckay@internode.on.net) Received: from ipmail06.adl2.internode.on.net (ipmail06.adl2.internode.on.net [150.101.137.129]) by mx1.freebsd.org (Postfix) with ESMTP id 460C28FC0C; Wed, 9 Mar 2011 12:57:10 +0000 (UTC) X-IronPort-Anti-Spam-Filtered: true X-IronPort-Anti-Spam-Result: AvsEAHYDd0120Fhq/2dsb2JhbACmbHTDGIVlBJAc Received: from unknown (HELO dungeon.home) ([118.208.88.106]) by ipmail06.adl2.internode.on.net with ESMTP; 09 Mar 2011 23:11:57 +1030 Received: from dungeon.home (localhost [127.0.0.1]) by dungeon.home (8.14.3/8.14.3) with ESMTP id p29CfUM1003302; Wed, 9 Mar 2011 22:41:30 +1000 (EST) (envelope-from mckay) Message-Id: <201103091241.p29CfUM1003302@dungeon.home> From: Stephen McKay To: Chris Forgeron , Mark Felder References: <201103081425.p28EPQtM002115@dungeon.home> In-Reply-To: from Chris Forgeron at "Tue, 08 Mar 2011 18:40:00 -0400" Date: Wed, 09
Mar 2011 22:41:30 +1000 Sender: smckay@internode.on.net Cc: freebsd-fs@freebsd.org, Stephen McKay Subject: Re: Constant minor ZFS corruption X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 09 Mar 2011 12:57:11 -0000 On Tuesday, 8th March 2011, Chris Forgeron wrote: >Have you make sure it's not always the same drives with the checksum >errors? It make take a few days to know for sure.. Of the 12 disks, only 1 has been error-free. I've been doing this for about 10 days now and there is no pattern that I can see in the errors. >It shouldn't matter which NFS client you use, if you're seeing ZFS >checksum errors in zpool status, that won't be from whatever program >is writing it. I'm mounting another server using NFS to get my 1TB of test data, so the problem box is the NFS client, not the server. Sorry if there was any confusion. I had a theory that there was a race in the NFS client or perhaps in the code that steals memory from ZFS's ARC when NFS needs it. However, that seems less likely now as I have done the same 1TB test copy again today but this time using ssh as the transport. I saw the same ZFS checksum errors as before. :-( >..oh, and don't forget about fun with Expanders. I assume you're using one? No. This board has 14 ports made up of 6 native and 8 from the LSI2008 chip on the PIKE card. Each is cabled directly to a drive. >I've got 2 LSI2008 based controllers in my 9-Current machine without >any fuss. That's running a 24 disk Mirror right now. That's encouraging news. Maybe I can win eventually. On Tuesday, 8th March 2011, Mark Felder wrote: >Highly interested in what FreeBSD version and what ZFS version and zpool >version you're running. I was using 8.2-release plus the mps driver from 8.2-stable. Hence the filesystem version is 4 and pool version is 15.
But I installed -current a few days ago while keeping the same pool and found that the errors still occurred. The v28 code has extra locking in interesting places but it made no difference to the checksum errors. As of today, I've destroyed the pool and built a version 28 pool (fs version 5) on a subset of disks (those attached to the onboard controller). I'll know by tomorrow how that went. BTW, with my code in place to decipher "type 19" entries and a kernel printf that bypasses the need to get devd.conf right, I see something like this for each checksum error: log_sysevent: ESC_dev_dle: class=ereport.fs.zfs.checksum ena=4822220020083854337 detector={ version=0 scheme=zfs pool=44947180927799912 vdev=6194846651369573567} pool=dread pool_guid=44947180927799912 pool_context=0 pool_failmode=wait vdev_guid=6194846651369573567 vdev_type=disk vdev_path=/dev/gpt/bay7 parent_guid=18008078209829074821 parent_type=raidz zio_err=0 zio_offset=194516353024 zio_size=4096 zio_objset=276 zio_object=0 zio_level=0 zio_blkid=132419 bad_ranges=0000000000001000 bad_ranges_min_gap=8 bad_range_sets=00000445 bad_range_clears=00002924 bad_set_histogram=001b001a001e002b0021001600120018001a001600210018001500150016001c001c0019001200190022001b0019001b0017000f0014000e0013001a001c001f000c000c000c0007000b000d0010001f00060009000800080007000c0010000f00070007000500070008000600080008000a0002000100060004000300070004 bad_cleared_histogram=00820089009700ac00a700b900b2009000730084009500af00a300ad00a900ac0082009300ad00c200ac00d200a8008f0078008b008e00b700bf00b9009f00a60083009500a400c100c200b700cd009900780090009b00be00af00c100a700980083008a00a200c900bc00d400b200a3007e0089009400c400c700d400b8009b That's a hideous blob of awful, and I don't really know what to do with it. Cheers, Stephen.
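One way to make a report like the one above more readable is to split it into fields and decode the histogram strings. The following is only a sketch under stated assumptions, not part of the FreeBSD or ZFS code: the function names parse_ereport and decode_histogram are hypothetical, and it assumes each histogram is printed as 64 counters of four hex digits (16 bits) each, presumably one counter per bit position of a 64-bit word. The thread does not confirm that layout.

```python
def parse_ereport(text):
    """Split a flat 'key=value key=value ...' report into a dict.

    Assumes values contain no spaces. Braces around the detector={...}
    group are dropped and its keys are kept flat, so a later key with the
    same name (such as pool) overwrites an earlier one.
    """
    fields = {}
    for token in text.split():
        key, sep, value = token.partition("=")
        value = value.strip("{}")
        if sep and value:
            fields[key] = value
    return fields


def decode_histogram(hex_string, width=4):
    """Split a hex string into integers of `width` hex digits each."""
    if len(hex_string) % width:
        raise ValueError("length is not a multiple of the counter width")
    return [int(hex_string[i:i + width], 16)
            for i in range(0, len(hex_string), width)]


# bad_set_histogram copied from the report above, split over four lines.
SAMPLE_SET_HISTOGRAM = (
    "001b001a001e002b0021001600120018001a001600210018001500150016001c"
    "001c0019001200190022001b0019001b0017000f0014000e0013001a001c001f"
    "000c000c000c0007000b000d0010001f00060009000800080007000c0010000f"
    "00070007000500070008000600080008000a0002000100060004000300070004"
)

counts = decode_histogram(SAMPLE_SET_HISTOGRAM)
print(len(counts), sum(counts))
```

For the sample above this gives 64 counters whose sum is 1093, the same value as bad_range_sets (0x445), which is at least consistent with the assumed layout. The bad_cleared_histogram string can be decoded the same way.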
From owner-freebsd-fs@FreeBSD.ORG Wed Mar 9 14:04:12 2011 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id C632B106564A; Wed, 9 Mar 2011 14:04:12 +0000 (UTC) (envelope-from mike@sentex.net) Received: from smarthost1.sentex.ca (smarthost1-6.sentex.ca [IPv6:2607:f3e0:0:1::12]) by mx1.freebsd.org (Postfix) with ESMTP id 7D0EA8FC12; Wed, 9 Mar 2011 14:04:12 +0000 (UTC) Received: from [IPv6:2607:f3e0:0:4:4433:c074:8d7b:b33d] ([IPv6:2607:f3e0:0:4:4433:c074:8d7b:b33d]) by smarthost1.sentex.ca (8.14.4/8.14.4) with ESMTP id p29E4Alk016380 (version=TLSv1/SSLv3 cipher=DHE-RSA-CAMELLIA256-SHA bits=256 verify=NO); Wed, 9 Mar 2011 09:04:10 -0500 (EST) (envelope-from mike@sentex.net) Message-ID: <4D7788D9.50808@sentex.net> Date: Wed, 09 Mar 2011 09:04:09 -0500 From: Mike Tancsa Organization: Sentex Communications User-Agent: Mozilla/5.0 (Windows; U; Windows NT 6.1; en-US; rv:1.9.2.13) Gecko/20101207 Thunderbird/3.1.7 MIME-Version: 1.0 To: Stephen McKay References: <201103081425.p28EPQtM002115@dungeon.home> <201103091241.p29CfUM1003302@dungeon.home> In-Reply-To: <201103091241.p29CfUM1003302@dungeon.home> X-Enigmail-Version: 1.1.1 Content-Type: text/plain; charset=ISO-8859-1 Content-Transfer-Encoding: 7bit X-Scanned-By: MIMEDefang 2.67 on IPv6:2607:f3e0:0:1::12 Cc: freebsd-fs@freebsd.org Subject: Re: Constant minor ZFS corruption X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 09 Mar 2011 14:04:12 -0000 On 3/9/2011 7:41 AM, Stephen McKay wrote: > On Tuesday, 8th March 2011, Chris Forgeron wrote: > >> Have you make sure it's not always the same drives with the checksum >> errors? It make take a few days to know for sure.. > > Of the 12 disks, only 1 has been error-free. 
I've been doing this for > about 10 days now and there is no pattern that I can see in the errors. > We sort of went through something similar to this on our offsite/DR backup server just last week. I don't have as many disks as you, but 0(offsite)# zpool status pool: tank1 state: ONLINE scrub: none requested config: NAME STATE READ WRITE CKSUM tank1 ONLINE 0 0 0 raidz1 ONLINE 0 0 0 ad0 ONLINE 0 0 0 ada4 ONLINE 0 0 0 ad4 ONLINE 0 0 0 ad6 ONLINE 0 0 0 raidz1 ONLINE 0 0 0 ada0 ONLINE 0 0 0 ada1 ONLINE 0 0 0 ada2 ONLINE 0 0 0 ada3 ONLINE 0 0 0 raidz1 ONLINE 0 0 0 ada5 ONLINE 0 0 0 ada8 ONLINE 0 0 0 ada7 ONLINE 0 0 0 ada6 ONLINE 0 0 0 errors: No known data errors 0(offsite)# After adding a larger case for future expansion, we found the next day we were seeing all sorts of random errors, like Mar 3 05:34:47 offsite kernel: ad1: FAILURE - WRITE_DMA48 status=51 error=10 LBA=2281852580 Mar 3 06:11:59 offsite kernel: ad1: TIMEOUT - WRITE_DMA48 retrying (1 retry left) LBA=2292675553 Mar 3 06:11:59 offsite kernel: ad1: FAILURE - WRITE_DMA48 status=51 error=10 LBA=2292675553 Mar 3 06:23:54 offsite kernel: ad1: TIMEOUT - WRITE_DMA48 retrying (1 retry left) LBA=2292734035 Mar 3 06:23:54 offsite kernel: ad1: FAILURE - WRITE_DMA48 status=51 error=10 LBA=2292734035 and Mar 4 08:56:15 offsite kernel: siisch1: siis_timeout is 00040000 ss 04000000 rs 04000000 es 00000000 sts 801e2000 serr 00000000 Mar 4 09:18:33 offsite kernel: siisch1: Timeout on slot 26 Mar 4 09:18:33 offsite kernel: siisch1: siis_timeout is 00040000 ss 04000000 rs 04000000 es 00000000 sts 801b2000 serr 00000000 Mar 4 09:21:09 offsite kernel: siisch1: Timeout on slot 26 Mar 4 09:21:09 offsite kernel: siisch1: siis_timeout is 00040000 ss 04000000 rs 04000000 es 00000000 sts 801d2000 serr 00000000 Mar 4 09:22:44 offsite kernel: siisch1: Timeout on slot 26 Mar 4 09:22:44 offsite kernel: siisch1: siis_timeout is 00040000 ss 04000000 rs 04000000 es 00000000 sts 801d2000 serr 00000000 Mar 4 09:23:16 offsite kernel: siisch1:
Timeout on slot 30 Mar 4 09:23:16 offsite kernel: siisch1: siis_timeout is 00040000 ss 40000000 rs 40000000 es 00000000 sts 801a2000 serr 00000000 on multiple disks and on multiple controllers... I have disks off the MB and off 2 PMPs on an sil3124 controller. We narrowed it down to 2 problems. Failing / Marginal power supply and bad SATA cables. After changing the power supply, we still had a few disk errors. smartctl said none of the disks had errors... Changed the SATA cables, and those too were fixed. After almost 5 days of uptime, no problems at all now. Not one error. ---Mike ------------------- Mike Tancsa, tel +1 519 651 3400 Sentex Communications, mike@sentex.net Providing Internet services since 1994 www.sentex.net Cambridge, Ontario Canada http://www.tancsa.com/ From owner-freebsd-fs@FreeBSD.ORG Wed Mar 9 19:59:01 2011 Return-Path: Delivered-To: freebsd-fs@hub.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 4B4531065670; Wed, 9 Mar 2011 19:59:01 +0000 (UTC) (envelope-from linimon@FreeBSD.org) Received: from freefall.freebsd.org (freefall.freebsd.org [IPv6:2001:4f8:fff6::28]) by mx1.freebsd.org (Postfix) with ESMTP id 216948FC12; Wed, 9 Mar 2011 19:59:01 +0000 (UTC) Received: from freefall.freebsd.org (localhost [127.0.0.1]) by freefall.freebsd.org (8.14.4/8.14.4) with ESMTP id p29Jx1TJ021799; Wed, 9 Mar 2011 19:59:01 GMT (envelope-from linimon@freefall.freebsd.org) Received: (from linimon@localhost) by freefall.freebsd.org (8.14.4/8.14.4/Submit) id p29Jx0d5021795; Wed, 9 Mar 2011 19:59:01 GMT (envelope-from linimon) Date: Wed, 9 Mar 2011 19:59:01 GMT Message-Id: <201103091959.p29Jx0d5021795@freefall.freebsd.org> To: linimon@FreeBSD.org, freebsd-bugs@FreeBSD.org, freebsd-fs@FreeBSD.org From: linimon@FreeBSD.org Cc: Subject: Re: kern/155411: [regression] [8.2-release] [tmpfs]: mount: tmpfs : No space left on device X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5
Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 09 Mar 2011 19:59:01 -0000 Synopsis: [regression] [8.2-release] [tmpfs]: mount: tmpfs : No space left on device Responsible-Changed-From-To: freebsd-bugs->freebsd-fs Responsible-Changed-By: linimon Responsible-Changed-When: Wed Mar 9 19:58:54 UTC 2011 Responsible-Changed-Why: Over to maintainer(s). http://www.freebsd.org/cgi/query-pr.cgi?pr=155411 From owner-freebsd-fs@FreeBSD.ORG Thu Mar 10 13:20:13 2011 Return-Path: Delivered-To: freebsd-fs@hub.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 055B3106566C for ; Thu, 10 Mar 2011 13:20:13 +0000 (UTC) (envelope-from gnats@FreeBSD.org) Received: from freefall.freebsd.org (freefall.freebsd.org [IPv6:2001:4f8:fff6::28]) by mx1.freebsd.org (Postfix) with ESMTP id E6CC18FC08 for ; Thu, 10 Mar 2011 13:20:12 +0000 (UTC) Received: from freefall.freebsd.org (localhost [127.0.0.1]) by freefall.freebsd.org (8.14.4/8.14.4) with ESMTP id p2ADKCKO048985 for ; Thu, 10 Mar 2011 13:20:12 GMT (envelope-from gnats@freefall.freebsd.org) Received: (from gnats@localhost) by freefall.freebsd.org (8.14.4/8.14.4/Submit) id p2ADKCVr048984; Thu, 10 Mar 2011 13:20:12 GMT (envelope-from gnats) Date: Thu, 10 Mar 2011 13:20:12 GMT Message-Id: <201103101320.p2ADKCVr048984@freefall.freebsd.org> To: freebsd-fs@FreeBSD.org From: Gleb Kurtsou Cc: Subject: Re: kern/155411: [regression] [8.2-release] [tmpfs]: mount: tmpfs : No space left on device X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list Reply-To: Gleb Kurtsou List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 10 Mar 2011 13:20:13 -0000 The following reply was made to PR kern/155411; it has been noted by GNATS. 
From: Gleb Kurtsou To: bug-followup@FreeBSD.org, pgollucci@FreeBSD.org Cc: Subject: Re: kern/155411: [regression] [8.2-release] [tmpfs]: mount: tmpfs : No space left on device Date: Thu, 10 Mar 2011 14:53:54 +0200 Could you test the patch. It changes the way tmpfs grows, sets filesystem default size to half of RAM by default. Tmpfs no longer depends on inactive/wired memory stats, but checks if swap is nearly full. I've added vfs.tmpfs.swap_reserved sysctl to limit tmpfs growth. Bottom line is, that you should specify meaningful filesystem size to prevent resource exhaustion. In my tests system didn't panic nor invoked OOM killer while consuming nearly all available ram and swap. Patch: http://marc.info/?l=freebsd-fs&m=129735686129438&w=2 It also handles test case described in OpenSolaris bug report for me: http://marc.info/?l=freebsd-fs&m=129747362722933&w=2 Thanks, Gleb. From owner-freebsd-fs@FreeBSD.ORG Thu Mar 10 15:30:18 2011 Return-Path: Delivered-To: freebsd-fs@hub.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 64C321065670 for ; Thu, 10 Mar 2011 15:30:18 +0000 (UTC) (envelope-from gnats@FreeBSD.org) Received: from freefall.freebsd.org (freefall.freebsd.org [IPv6:2001:4f8:fff6::28]) by mx1.freebsd.org (Postfix) with ESMTP id 519C08FC15 for ; Thu, 10 Mar 2011 15:30:18 +0000 (UTC) Received: from freefall.freebsd.org (localhost [127.0.0.1]) by freefall.freebsd.org (8.14.4/8.14.4) with ESMTP id p2AFUILa070647 for ; Thu, 10 Mar 2011 15:30:18 GMT (envelope-from gnats@freefall.freebsd.org) Received: (from gnats@localhost) by freefall.freebsd.org (8.14.4/8.14.4/Submit) id p2AFUIn1070644; Thu, 10 Mar 2011 15:30:18 GMT (envelope-from gnats) Date: Thu, 10 Mar 2011 15:30:18 GMT Message-Id: <201103101530.p2AFUIn1070644@freefall.freebsd.org> To: freebsd-fs@FreeBSD.org From: "Philip M. 
Gollucci" Cc: Subject: Re: kern/155411: [regression] [8.2-release] [tmpfs]: mount: tmpfs : No space left on device X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list Reply-To: "Philip M. Gollucci" List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 10 Mar 2011 15:30:18 -0000 The following reply was made to PR kern/155411; it has been noted by GNATS. From: "Philip M. Gollucci" To: Gleb Kurtsou Cc: , Subject: Re: kern/155411: [regression] [8.2-release] [tmpfs]: mount: tmpfs : No space left on device Date: Thu, 10 Mar 2011 15:28:26 +0000 --------------enig2D2A45DAF19A5900C29D42FA Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: quoted-printable I/we would be glad to test it; however, we'll have to schedule since the computers in question are both front facing ASF services. On 03/10/11 12:53, Gleb Kurtsou wrote: > Could you test the patch. It changes the way tmpfs grows, sets > filesystem default size to half of RAM by default. Tmpfs no longer > depends on inactive/wired memory stats, but checks if swap is nearly > full. I've added vfs.tmpfs.swap_reserved sysctl to limit tmpfs growth. > > Bottom line is, that you should specify meaningful filesystem size to > prevent resource exhaustion. > > In my tests system didn't panic nor invoked OOM killer while consuming > nearly all available ram and swap. > > Patch: > http://marc.info/?l=freebsd-fs&m=129735686129438&w=2 > > It also handles test case described in OpenSolaris bug report for me: > http://marc.info/?l=freebsd-fs&m=129747362722933&w=2 > > Thanks, > Gleb. > -- ------------------------------------------------------------------------ 1024D/DB9B8C1C B90B FBC3 A3A1 C71A 8E70 3F8C 75B8 8FFB DB9B 8C1C Philip M. Gollucci (pgollucci@p6m7g8.com) c: 703.336.9354 VP Apache Infrastructure; Member, Apache Software Foundation Committer, FreeBSD Foundation Consultant, P6M7G8 Inc. Sr.
System Admin, Ridecharge Inc. Work like you don't need the money, love like you'll never get hurt, and dance like nobody's watching. --------------enig2D2A45DAF19A5900C29D42FA Content-Type: application/pgp-signature; name="signature.asc" Content-Description: OpenPGP digital signature Content-Disposition: attachment; filename="signature.asc" -----BEGIN PGP SIGNATURE----- Version: GnuPG v2.0.16 (FreeBSD) iD8DBQFNeO4kdbiP+9ubjBwRAvsaAJ9mCFT6ih/2g0aB2lgnTToajxB5dACfdGep 2Iw1F55WtSV162sdhrgOSYs= =hNNY -----END PGP SIGNATURE----- --------------enig2D2A45DAF19A5900C29D42FA-- From owner-freebsd-fs@FreeBSD.ORG Thu Mar 10 16:04:32 2011 Return-Path: Delivered-To: freebsd-fs@FreeBSD.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 005EA106564A for ; Thu, 10 Mar 2011 16:04:32 +0000 (UTC) (envelope-from i.radomskyi@gmail.com) Received: from mail-pv0-f182.google.com (mail-pv0-f182.google.com [74.125.83.182]) by mx1.freebsd.org (Postfix) with ESMTP id CE0728FC12 for ; Thu, 10 Mar 2011 16:04:31 +0000 (UTC) Received: by pvg11 with SMTP id 11so385319pvg.13 for ; Thu, 10 Mar 2011 08:04:31 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=gamma; h=domainkey-signature:mime-version:date:message-id:subject:from:to :content-type; bh=ewKul52ZxbwclXiX2HzPEKD/h/kEmAWifxSVDdhcywg=; b=lwwwR93zJCTrFvRQzQ1rw7kECNBNOhN6lstY9eubMDQ7Zg6CnRbmjcY5UzE8lXoSrw XgyA0r0FI54zxMoI111RN7OwORQ0vUaoDrBzH3pLqm/DZ4xPBVU5iKH069PdoCrsfcAk cYQiDJl+tvXgBD8u5yv1eFHQ5yYkqIt18vcvw= DomainKey-Signature: a=rsa-sha1; c=nofws; d=gmail.com; s=gamma; h=mime-version:date:message-id:subject:from:to:content-type; b=R0zPcGhWTq08i+257wc9NAVNepGNh0qbXtPzEdmRuItYakF5nI/pygdI2YQaPVJtvk LA3XDo6Orti5H3+J2GcjMC7uPskFwHBcNmm1HyOdvoxuYaB5rssP/G2GmcwIwNYVFLtI tZ5FGVTtF5COH1LhkEUHV7EcOZemIgYTp4D/E= MIME-Version: 1.0 Received: by 10.142.248.41 with SMTP id v41mr6644141wfh.323.1299771485131; Thu, 10 Mar 2011 07:38:05 -0800 (PST) Received: by 
10.143.7.13 with HTTP; Thu, 10 Mar 2011 07:38:05 -0800 (PST) Date: Thu, 10 Mar 2011 17:38:05 +0200 Message-ID: From: Iurii Radomskyi To: freebsd-fs@FreeBSD.org Content-Type: text/plain; charset=ISO-8859-1 Cc: Subject: HAST wiki outdated X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 10 Mar 2011 16:04:32 -0000 Hello. In this wiki article: http://wiki.freebsd.org/HAST there is info on "Replication modes". It is outdated compared to the hast.conf man page of the 8.2 release in the following parts: 1. last sentence of memsync 2. last sentence of fullsync Maybe something else too, not sure. If you reply, please cc me, as I am not a member of the list. From owner-freebsd-fs@FreeBSD.ORG Thu Mar 10 20:43:46 2011 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 1268A1065676 for ; Thu, 10 Mar 2011 20:43:46 +0000 (UTC) (envelope-from cforgeron@acsi.ca) Received: from mta04.eastlink.ca (mta04.eastlink.ca [24.224.136.10]) by mx1.freebsd.org (Postfix) with ESMTP id CDA338FC18 for ; Thu, 10 Mar 2011 20:43:45 +0000 (UTC) MIME-version: 1.0 Content-transfer-encoding: 7BIT Content-type: text/plain; CHARSET=US-ASCII Received: from ip01.eastlink.ca ([unknown] [24.222.39.10]) by mta04.eastlink.ca (Sun Java(tm) System Messaging Server 7.3-11.01 64bit (built Sep 1 2009)) with ESMTP id <0LHV00GH908WI0S1@mta04.eastlink.ca>; Thu, 10 Mar 2011 16:43:44 -0400 (AST) X-CMAE-Score: 0 X-CMAE-Analysis: v=1.1 cv=mORQtGzMSGJSBwuMSvVfB0MKjPGmXehAuj88Uvu04o4= c=1 sm=1 a=kj9zAlcOel0A:10 a=Npn9PEg5AAAA:8 a=6I5d2MoRAAAA:8 a=cD9dlkaaAFRq7P8gXd4A:9 a=j3nuLlU1tdTdNHtoYNYA:7 a=krB3caHU28uymd_bvpOVvgeGbZ0A:4 a=CjuIK1q_8ugA:10 a=SV7veod9ZcQA:10 a=23TYa8vstklZDaDr:21 a=K_Jb1XJ4RimbJMEe:21 a=E/PVjAe7IbPkHCM0BPV0xg==:117 Received: from blk-222-10-85.eastlink.ca
(HELO server7.acsi.ca) ([24.222.10.85]) by ip01.eastlink.ca with ESMTP; Thu, 10 Mar 2011 16:43:44 -0400 Received: from server7.acsi.ca ([192.168.9.7]) by server7.acsi.ca ([192.168.9.7]) with mapi; Thu, 10 Mar 2011 16:43:44 -0400 From: Chris Forgeron To: Stephen McKay , Mark Felder Date: Thu, 10 Mar 2011 16:43:43 -0400 Thread-topic: Constant minor ZFS corruption Thread-index: AcveXkZE+SpnKX74Sv+/Ft28q00SywBBCZYw Message-id: References: <201103081425.p28EPQtM002115@dungeon.home> <201103091241.p29CfUM1003302@dungeon.home> In-reply-to: <201103091241.p29CfUM1003302@dungeon.home> Accept-Language: en-US Content-language: en-US X-MS-Has-Attach: X-MS-TNEF-Correlator: acceptlanguage: en-US Cc: "freebsd-fs@freebsd.org" Subject: RE: Constant minor ZFS corruption X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 10 Mar 2011 20:43:46 -0000 You know, I've had better luck with v28 and FreeBSD-9-CURRENT. Make a very minimal compile, test it well, and you should be fine. I just upgraded my last 8.2 v14 ZFS FreeBSD system earlier this week, so I'm now 9-Current with v28 across the board. The only issue I've found so far is a small oddity with displaying files across ZFS, but pjd has already patched that in r219404. (I'm about to test it now) Oh - and you're AMD64, correct, not i386? I think we (royal we) should remove support for i386 in ZFS, it has never been stable for me, and I see a lot of grief about it on the boards. I also think you need 8 GB of RAM to play seriously. I've had reasonable success with 4GB and a light load, but any serious file traffic needs 8GB of breathing room as ZFS gobbles up the RAM in a very aggressive manner. Lastly, check what Mike Tancsa said about his hardware - All of my gear is quality, 1000W dual redundant power supplies, LSI SAS controllers, ECC registered ram, no overclocking, etc, etc.
You may have a software issue, but it's more likely that ZFS is just exposing some instability in your system. Has your RAM checked out with a Memtest run overnight? We're talking small, intermittent errors here, not big red flags that will be obvious to spot. -----Original Message----- From: smckay@internode.on.net [mailto:smckay@internode.on.net] On Behalf Of Stephen McKay Sent: Wednesday, March 09, 2011 8:42 AM To: Chris Forgeron; Mark Felder Cc: Stephen McKay; freebsd-fs@freebsd.org Subject: Re: Constant minor ZFS corruption On Tuesday, 8th March 2011, Chris Forgeron wrote: >Have you made sure it's not always the same drives with the checksum >errors? It may take a few days to know for sure. Of the 12 disks, only 1 has been error-free. I've been doing this for about 10 days now and there is no pattern that I can see in the errors. >It shouldn't matter which NFS client you use, if you're seeing ZFS >checksum errors in zpool status, that won't be from whatever program is >writing it. I'm mounting another server using NFS to get my 1TB of test data, so the problem box is the NFS client, not the server. Sorry if there was any confusion. I had a theory that there was a race in the NFS client or perhaps the code that steals memory from ZFS's ARC when NFS needs it. However, that seems less likely now as I have done the same 1TB test copy again today but this time using ssh as the transport. I saw the same ZFS checksum errors as before. :-( >..oh, and don't forget about fun with Expanders. I assume you're using one? No. This board has 14 ports made up of 6 native and 8 from the LSI2008 chip on the PIKE card. Each is cabled directly to a drive. >I've got 2 LSI2008 based controllers in my 9-Current machine without >any fuss. That's running a 24 disk Mirror right now. That's encouraging news. Maybe I can win eventually. On Tuesday, 8th March 2011, Mark Felder wrote: >Highly interested in what FreeBSD version and what ZFS version and >zpool version you're running.
I was using 8.2-release plus the mps driver from 8.2-stable. Hence the filesystem version is 4 and pool version is 15. But I installed -current a few days ago while keeping the same pool and found that the errors still occurred. The v28 code has extra locking in interesting places but it made no difference to the checksum errors. As of today, I've destroyed the pool and built a version 28 pool (fs version 5) on a subset of disks (those attached to the onboard controller). I'll know by tomorrow how that went. BTW, with my code in place to decipher "type 19" entries and a kernel printf that bypasses the need to get devd.conf right, I see something like this for each checksum error: log_sysevent: ESC_dev_dle: class=ereport.fs.zfs.checksum ena=4822220020083854337 detector={ version=0 scheme=zfs pool=44947180927799912 vdev=6194846651369573567} pool=dread pool_guid=44947180927799912 pool_context=0 pool_failmode=wait vdev_guid=6194846651369573567 vdev_type=disk vdev_path=/dev/gpt/bay7 parent_guid=18008078209829074821 parent_type=raidz zio_err=0 zio_offset=194516353024 zio_size=4096 zio_objset=276 zio_object=0 zio_level=0 zio_blkid=132419 bad_ranges=0000000000001000 bad_ranges_min_gap=8 bad_range_sets=00000445 bad_range_clears=00002924 bad_set_histogram=001b001a001e002b0021001600120018001a001600210018001500150016001c001c0019001200190022001b0019001b0017000f0014000e0013001a001c001f000c000c000c0007000b000d0010001f00060009000800080007000c0010000f00070007000500070008000600080008000a0002000100060004000300070004 bad_cleared_histogram=00820089009700ac00a700b900b2009000730084009500af00a300ad00a900ac0082009300ad00c200ac00d200a8008f0078008b008e00b700bf00b9009f00a60083! 009500a400c100c200b700cd009900780090009b00be00af00c100a700980083008a00a200c900bc00d400b200a3007e0089009400c400c700d400b8009b That's a hideous blob of awful, and I don't really know what to do with it. Cheers, Stephen. 
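[Editorial sketch, not part of the original message.] The two histogram fields in the ereport above are long hex strings that are hard to read by eye. A minimal way to make them readable is to split them into fixed-width counters. The sketch below assumes that every four hex digits form one 16-bit counter; that fits the look of the data but is not confirmed anywhere in this thread. The function names are hypothetical. Non-hex characters are dropped first, because mail transport can break over-long lines with a "!" and a space, as seen in bad_cleared_histogram.

```python
import re


def decode_histogram(hex_text, width=4):
    """Split a hex string into fixed-width counters (assumed 16-bit each)."""
    # Remove anything that is not a hex digit, e.g. "!" and spaces
    # introduced when a mail system wrapped an over-long line.
    digits = re.sub(r"[^0-9a-fA-F]", "", hex_text)
    if len(digits) % width:
        raise ValueError("hex string length is not a multiple of the counter width")
    return [int(digits[i:i + width], 16) for i in range(0, len(digits), width)]


def summarize(hex_text):
    """Return a few summary numbers for one histogram field."""
    counts = decode_histogram(hex_text)
    return {
        "buckets": len(counts),
        "total": sum(counts),
        "min": min(counts),
        "max": max(counts),
    }


if __name__ == "__main__":
    # First four counters of bad_set_histogram from the message above.
    sample = "001b001a001e002b"
    print(decode_histogram(sample))
    print(summarize(sample))
```

If the 16-bit assumption holds, the totals for the full strings could be compared against bad_range_sets (0x445 = 1093) and bad_range_clears (0x2924 = 10532) as a sanity check. A roughly flat spread across buckets would suggest damage that is not tied to particular bit positions, but that interpretation is also an assumption.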
From owner-freebsd-fs@FreeBSD.ORG Thu Mar 10 20:54:02 2011 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx2.freebsd.org (mx2.freebsd.org [IPv6:2001:4f8:fff6::35]) by hub.freebsd.org (Postfix) with ESMTP id DD1EF1065675; Thu, 10 Mar 2011 20:54:02 +0000 (UTC) (envelope-from dougb@FreeBSD.org) Received: from doug-optiplex.ka9q.net (hub.freebsd.org [IPv6:2001:4f8:fff6::36]) by mx2.freebsd.org (Postfix) with ESMTP id DF77615797B; Thu, 10 Mar 2011 20:53:26 +0000 (UTC) Message-ID: <4D793A46.9090603@FreeBSD.org> Date: Thu, 10 Mar 2011 12:53:26 -0800 From: Doug Barton Organization: http://SupersetSolutions.com/ User-Agent: Mozilla/5.0 (X11; U; FreeBSD amd64; en-US; rv:1.9.2.15) Gecko/20110304 Thunderbird/3.1.9 MIME-Version: 1.0 To: Stephen McKay References: <201103081425.p28EPQtM002115@dungeon.home> <201103091241.p29CfUM1003302@dungeon.home> In-Reply-To: <201103091241.p29CfUM1003302@dungeon.home> X-Enigmail-Version: 1.1.2 OpenPGP: id=1A1ABC84 Content-Type: text/plain; charset=ISO-8859-1; format=flowed Content-Transfer-Encoding: 7bit Cc: freebsd-fs@freebsd.org Subject: Re: Constant minor ZFS corruption X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 10 Mar 2011 20:54:02 -0000 On 03/09/2011 04:41, Stephen McKay wrote: > Of the 12 disks, only 1 has been error-free. I've been doing this for > about 10 days now and there is no pattern that I can see in the errors. Are all the disks from the same batch? If so, you likely got a bad batch. Doug -- Nothin' ever doesn't change, but nothin' changes much. -- OK Go Breadth of IT experience, and depth of knowledge in the DNS. Yours for the right price. 
:) http://SupersetSolutions.com/ From owner-freebsd-fs@FreeBSD.ORG Thu Mar 10 23:03:35 2011 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 235BA106564A; Thu, 10 Mar 2011 23:03:35 +0000 (UTC) (envelope-from smckay@internode.on.net) Received: from ipmail06.adl6.internode.on.net (ipmail06.adl6.internode.on.net [150.101.137.145]) by mx1.freebsd.org (Postfix) with ESMTP id 68C848FC08; Thu, 10 Mar 2011 23:03:33 +0000 (UTC) X-IronPort-Anti-Spam-Filtered: true X-IronPort-Anti-Spam-Result: AvsEAOrkeE120DXi/2dsb2JhbACmMnjBPIViBA Received: from ppp118-208-53-226.lns20.bne1.internode.on.net (HELO dungeon.home) ([118.208.53.226]) by ipmail06.adl6.internode.on.net with ESMTP; 11 Mar 2011 09:33:30 +1030 Received: from dungeon.home (localhost [127.0.0.1]) by dungeon.home (8.14.3/8.14.3) with ESMTP id p2AN2hNB002016; Fri, 11 Mar 2011 09:02:43 +1000 (EST) (envelope-from mckay) Message-Id: <201103102302.p2AN2hNB002016@dungeon.home> From: Stephen McKay To: Mike Tancsa References: <201103081425.p28EPQtM002115@dungeon.home> <201103091241.p29CfUM1003302@dungeon.home> <4D7788D9.50808@sentex.net> In-Reply-To: <4D7788D9.50808@sentex.net> from Mike Tancsa at "Wed, 09 Mar 2011 09:04:09 -0500" Date: Fri, 11 Mar 2011 09:02:43 +1000 Sender: smckay@internode.on.net Cc: freebsd-fs@freebsd.org, Stephen McKay Subject: Re: Constant minor ZFS corruption X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 10 Mar 2011 23:03:35 -0000 On Wednesday, 9th March 2011, Mike Tancsa wrote: >On 3/9/2011 7:41 AM, Stephen McKay wrote: >> Of the 12 disks, only 1 has been error-free. I've been doing this for >> about 10 days now and there is no pattern that I can see in the errors. 
>After adding a larger case for future expansion, we found the next day >we were seeing all sorts of random errors > >Like > >Mar 3 05:34:47 offsite kernel: ad1: FAILURE - WRITE_DMA48 >status=51 error=10 LBA=2281852580 > >and > >Mar 4 08:56:15 offsite kernel: siisch1: siis_timeout is 00040000 ss >04000000 rs 04000000 es 00000000 sts 801e2000 serr 00000000 Our system does not report any driver errors or disk errors. We see checksum errors from ZFS (mostly in scrubs). It's like there's an invisible pixie sprinkling bad data on our disks while we sleep. >We narrowed it down to 2 problems. Failing / Marginal power supply and >bad SATA cables. After changing the power supply, we still had a few >disks errors. If either of these were the cause of our problem, we'd see errors logged, right? Not just invisible corruption? We will probably swap the power supply and cables anyway soon, just to see what happens, but on other machines where cables or power was the problem I saw errors (just like yours) in the logs. >After almost 5 days of uptime, no problems at all now. Not one error. Well, we've got something to aim for, eh? :-) Cheers, Stephen. 
From owner-freebsd-fs@FreeBSD.ORG Thu Mar 10 23:19:56 2011 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id B75BB106566B; Thu, 10 Mar 2011 23:19:56 +0000 (UTC) (envelope-from smckay@internode.on.net) Received: from ipmail06.adl6.internode.on.net (ipmail06.adl6.internode.on.net [150.101.137.145]) by mx1.freebsd.org (Postfix) with ESMTP id ABF128FC1C; Thu, 10 Mar 2011 23:19:54 +0000 (UTC) X-IronPort-Anti-Spam-Filtered: true X-IronPort-Anti-Spam-Result: AvsEAGLoeE120DXi/2dsb2JhbACmMnjBOYMDgl8E Received: from ppp118-208-53-226.lns20.bne1.internode.on.net (HELO dungeon.home) ([118.208.53.226]) by ipmail06.adl6.internode.on.net with ESMTP; 11 Mar 2011 09:49:53 +1030 Received: from dungeon.home (localhost [127.0.0.1]) by dungeon.home (8.14.3/8.14.3) with ESMTP id p2ANJWxN002125; Fri, 11 Mar 2011 09:19:32 +1000 (EST) (envelope-from mckay) Message-Id: <201103102319.p2ANJWxN002125@dungeon.home> From: Stephen McKay To: Chris Forgeron References: <201103081425.p28EPQtM002115@dungeon.home> <201103091241.p29CfUM1003302@dungeon.home> In-Reply-To: from Chris Forgeron at "Thu, 10 Mar 2011 16:43:43 -0400" Date: Fri, 11 Mar 2011 09:19:32 +1000 Sender: smckay@internode.on.net Cc: freebsd-fs@freebsd.org, Stephen McKay Subject: Re: Constant minor ZFS corruption X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 10 Mar 2011 23:19:56 -0000 On Thursday, 10th March 2011, Chris Forgeron wrote: >You know, I've had better luck with v28 and FreeBSD-9-CURRENT. Make >a very minimal compile, test it well, and you should be fine. I just >upgraded my last 8.2 v14 ZFS FreeBSD system earlier this week, so I'm >now 9-Current with v28 across the board. 
The only issue I've found so >far is a small oddity with displaying files across ZFS, but pjd has >already patched that in r219404. (I'm about to test it now) We are OK using -current if we really have to, but would prefer to stick with an official release (maybe with one or two hand-rolled patches if they are important enough). We've already noticed the -current "upgrade treadmill", having to build a new kernel every day of our testing because important bug fixes are arriving. And in the end, we saw no difference in behaviour, so -current doesn't fix our problems. It's important to test -current, but not in production. :-) >Oh - and you're AMD64, correct, not i386? I think we (royal we) should >remove support for i386 in ZFS, it has never been stable for me, and >I see a lot of grief about it on the boards. I also think you need 8 >GB of RAM to play seriously. I've had reasonable success with 4GB and >a light load, but any serious file traffic needs 8GB of breathing room >as ZFS gobbles up the RAM in a very aggressive manner. Yes, we are running the amd64 kernel. Currently we're low on memory (2GB) because I swapped out the RAM, but that, again, didn't affect our failures. >Lastly, check what Mike Tancsa said about his hardware - All of my >gear is quality, 1000W dual redundant power supplies, LSI SAS >controllers, ECC registered ram, no overclocking, etc, etc. You may >have a software issue, but it's more likely that ZFS is just exposing >some instability in your system. Has your RAM checked out with a Memtest >run overnight? We're talking small, intermittent errors here, not big >red flags that will be obvious to spot. The ASUS PIKE2008 card is LSI based. Our RAM is ECC. We're not overclocking (in fact I disabled turbo-boost). We haven't run memtest but we have done a few "make buildworld" runs. All of these completed without error. And with ECC RAM, we should see log messages if anything is wrong there anyway. We have tried to buy quality hardware.
At least, we didn't deliberately skimp (except to build our own box vs buy a big name brand pre-built zfs server). We're starting to get suspicious of the PIKE card though. Is there anyone here who is using an ASUS PIKE2008 (as opposed to other LSI SAS 2008 cards)? We're kinda wishing we'd gotten an older PIKE 1068E instead... Cheers, Stephen. From owner-freebsd-fs@FreeBSD.ORG Thu Mar 10 23:29:08 2011 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 9F78E1065676; Thu, 10 Mar 2011 23:29:08 +0000 (UTC) (envelope-from smckay@internode.on.net) Received: from ipmail06.adl6.internode.on.net (ipmail06.adl6.internode.on.net [150.101.137.145]) by mx1.freebsd.org (Postfix) with ESMTP id AE27B8FC2A; Thu, 10 Mar 2011 23:29:07 +0000 (UTC) X-IronPort-Anti-Spam-Filtered: true X-IronPort-Anti-Spam-Result: AvsEAOfreE120DXi/2dsb2JhbACmMnjBRIViBA Received: from ppp118-208-53-226.lns20.bne1.internode.on.net (HELO dungeon.home) ([118.208.53.226]) by ipmail06.adl6.internode.on.net with ESMTP; 11 Mar 2011 09:59:05 +1030 Received: from dungeon.home (localhost [127.0.0.1]) by dungeon.home (8.14.3/8.14.3) with ESMTP id p2ANSmnn002180; Fri, 11 Mar 2011 09:28:48 +1000 (EST) (envelope-from mckay) Message-Id: <201103102328.p2ANSmnn002180@dungeon.home> From: Stephen McKay To: Doug Barton References: <201103081425.p28EPQtM002115@dungeon.home> <201103091241.p29CfUM1003302@dungeon.home> <4D793A46.9090603@FreeBSD.org> In-Reply-To: <4D793A46.9090603@FreeBSD.org> from Doug Barton at "Thu, 10 Mar 2011 12:53:26 -0800" Date: Fri, 11 Mar 2011 09:28:48 +1000 Sender: smckay@internode.on.net Cc: freebsd-fs@freebsd.org, Stephen McKay Subject: Re: Constant minor ZFS corruption X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 10 Mar 2011 23:29:08 
-0000 On Thursday, 10th March 2011, Doug Barton wrote: >On 03/09/2011 04:41, Stephen McKay wrote: >> Of the 12 disks, only 1 has been error-free. I've been doing this for >> about 10 days now and there is no pattern that I can see in the errors. > >Are all the disks from the same batch? If so, you likely got a bad batch. They are all from the same batch, and this "bad batch" idea has gone through my mind too, but we've taken 6 of the disks and built a raidz2 array in a different machine, and this time they seem to be error free. These are the same disks that have already had corruption, according to ZFS, when used in the original machine. Apart from the obvious (that everything changed), the major change in my eyes is that the other box (a Dell PowerEdge 840) runs the disks at SATA1 speed, not SATA3. There's no way to set these drives to run at 1.5Gb/s via jumpers or I'd put them straight back in the main box and test them. Is there a way to tell FreeBSD to run disks at a slower transfer rate? I've not seen any switches or tunables that apply with the mps driver. Cheers, Stephen.
From owner-freebsd-fs@FreeBSD.ORG Thu Mar 10 23:41:57 2011 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id C9E5D1065670 for ; Thu, 10 Mar 2011 23:41:57 +0000 (UTC) (envelope-from jdc@koitsu.dyndns.org) Received: from qmta04.emeryville.ca.mail.comcast.net (qmta04.emeryville.ca.mail.comcast.net [76.96.30.40]) by mx1.freebsd.org (Postfix) with ESMTP id AD2038FC0A for ; Thu, 10 Mar 2011 23:41:57 +0000 (UTC) Received: from omta15.emeryville.ca.mail.comcast.net ([76.96.30.71]) by qmta04.emeryville.ca.mail.comcast.net with comcast id Hb2G1g0071Y3wxoA4bhxFu; Thu, 10 Mar 2011 23:41:57 +0000 Received: from koitsu.dyndns.org ([98.248.33.18]) by omta15.emeryville.ca.mail.comcast.net with comcast id Hbhj1g01P0PUQVN8bbhojy; Thu, 10 Mar 2011 23:41:54 +0000 Received: by icarus.home.lan (Postfix, from userid 1000) id B1EEC9B422; Thu, 10 Mar 2011 15:41:43 -0800 (PST) Date: Thu, 10 Mar 2011 15:41:43 -0800 From: Jeremy Chadwick To: Stephen McKay Message-ID: <20110310234143.GA9136@icarus.home.lan> References: <201103081425.p28EPQtM002115@dungeon.home> <201103091241.p29CfUM1003302@dungeon.home> <4D7788D9.50808@sentex.net> <201103102302.p2AN2hNB002016@dungeon.home> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <201103102302.p2AN2hNB002016@dungeon.home> User-Agent: Mutt/1.5.21 (2010-09-15) Cc: freebsd-fs@freebsd.org Subject: Re: Constant minor ZFS corruption X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 10 Mar 2011 23:41:57 -0000 On Fri, Mar 11, 2011 at 09:02:43AM +1000, Stephen McKay wrote: > On Wednesday, 9th March 2011, Mike Tancsa wrote: > > >On 3/9/2011 7:41 AM, Stephen McKay wrote: > >> Of the 12 disks, only 1 has been error-free. 
I've been doing this for > >> about 10 days now and there is no pattern that I can see in the errors. > > >After adding a larger case for future expansion, we found the next day > >we were seeing all sorts of random errors > > > >Like > > > >Mar 3 05:34:47 offsite kernel: ad1: FAILURE - WRITE_DMA48 > >status=51 error=10 LBA=2281852580 > > > >and > > > >Mar 4 08:56:15 offsite kernel: siisch1: siis_timeout is 00040000 ss > >04000000 rs 04000000 es 00000000 sts 801e2000 serr 00000000 Speaking strictly to Mike here: I spent some time a while ago trying to figure out the NID_NOT_FOUND error. Something I wrote back when I was contributing on the Wiki; see section "SATA disk troubleshooting": http://wiki.freebsd.org/JeremyChadwick/ATA_issues_and_troubleshooting So, it could be that the LBA being accessed isn't within the permitted valid range. I could be completely off my rocker though; I'd need someone much more familiar with the ATA-7 specification to state up front what this bit actually defines. Anyway, despite that, the controller is also reporting timeouts. What you haven't shown is what exact model of Silicon Image controller you're using. It matters. There are certain models of SI chipsets that have very bad, nasty bugs. Other models of chips do not have these issues: http://en.wikipedia.org/wiki/Silicon_Image#Product_alerts > Our system does not report any driver errors or disk errors. We see > checksum errors from ZFS (mostly in scrubs). It's like there's an > invisible pixie sprinkling bad data on our disks while we sleep. Speaking to Stephen: With disk bit rot, your "system" (motherboard) won't report any errors. The controller you're using won't report any errors. The disks also won't report any errors. ZFS, however, *will* report checksum errors. What if there's a bug in the FreeBSD driver you're using? What if on some rare occasion it only writes 4095 bytes of the 4096 it needs to write? 
What if there's an off-by-one bug in the FreeBSD driver where it's randomly corrupting a piece of data it intends to write to the disk? And what about the firmware, which controls all the disk interaction? There's also the possibility that there's some wonkiness going on with the memory controller on your mainboard; maybe it's randomly corrupting something. ECC RAM wouldn't necessarily detect this either. FreeBSD kmem/KVA, as I understand it, is dedicated solely to the kernel and not to userland (so a userland app might not sig11, for example). However I would expect the kernel to be freaking out randomly in other ways (e.g. I would expect the system to be behaving oddly and not just limited to ZFS or disk I/O). You get the idea. The problem could be anywhere. Welcome to OS, system, and hardware troubleshooting in 2011, glad to have you on the team. ;-) You're going to need to spend a lot of time debugging this, and some of it will absolutely involve downtime unless you can afford to build a complete 100% identical replica system that can reproduce the problem. If you can reproduce the problem on that system, awesome. My advice would be to start (on the replica system) by replacing the controller entirely. Use some on-board SATA controller, or invest in an Areca or "something else". This will narrow down the problem to either the controller, the controller firmware, or the FreeBSD driver. That should help. > >We narrowed it down to 2 problems. Failing / Marginal power supply and > >bad SATA cables. After changing the power supply, we still had a few > >disks errors. > > If either of these were the cause of our problem, we'd see errors > logged, right? Not just invisible corruption? Simple answer: no. Long answer: I can't provide one because I'm not an EE guy, so you'll just have to trust me: problems caused by dirty power or "bad power" are absolutely crazy.
Given how complex hardware is these days (numerous ASICs, circuitry components, etc.), absolutely bizarre and weird things happen when a device doesn't get what it expects. That's about all I know, and there's lots of evidence on the net to validate this fact. I just wish I could put more absolute faith into it, but since I don't understand EE/power "stuff", it'll always be a mystery to me. I could give you an example of a power-related problem I'm dealing with at home that would probably blow your mind. Contact me off-list if you want the story (every person I've given it to so far has gone "...what? That makes absolutely no sense. Did you try...?" "Yes" "What about..." "Yep" "...wow"). > We will probably swap the power supply and cables anyway soon, just to > see what happens, but on other machines where cables or power was the > problem I saw errors (just like yours) in the logs. I imagine your controller has some kind of multi-lane break-out cable that's used. It's possible that thing is bad. God I hate to bring this up, because it's really going out on a limb, but there's always the remote possibility of interference/EMI causing "weird things" to happen with data flowing across the cable. However, I STRONGLY doubt this; SMART attribute 199 (UDMA CRC Error Count) would absolutely be incrementing whenever this occurred. If you want to provide me with SMART stats (smartctl -a /dev/disk) for each of your disks (please be sure to label them and re-provide me "zpool status" output so I can correlate the checksum errors with the disks), I will be happy to review them for you. > >After almost 5 days of uptime, no problems at all now. Not one error. > > Well, we've got something to aim for, eh? :-) I sure hope so. :-) Like you, I hate problems of this nature. And problems of this nature are exactly why I started spending a *lot* of time, both at my job and outside of work, studying disks, ATA/SATA, and storage a bit better. 
I honestly don't mean to sound like a braggart (despite how direct/pompous I am, I honestly have a very small ego), but I've more or less become the main guy at my workplace when it comes to disk/storage problems. I just got done dealing with two separate cases of desktop-grade SATA disks in our Citrix Netscaler products (which use FreeBSD) spewing DMA errors right in the middle of OS upgrades (worst time for it to happen). I was able to work around the problem using a combo of a sh script, dd, and smartmontools, allowing upgrades to complete + get production traffic working again. We did RMAs on the disks/units later, since the turnaround time for replacements was way outside of the permitted maintenance window. Networking owes me a case of beer. -- | Jeremy Chadwick jdc@parodius.com | | Parodius Networking http://www.parodius.com/ | | UNIX Systems Administrator Mountain View, CA, USA | | Making life hard for others since 1977. PGP 4BD6C0CB | From owner-freebsd-fs@FreeBSD.ORG Fri Mar 11 00:06:30 2011 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 02E86106566B; Fri, 11 Mar 2011 00:06:30 +0000 (UTC) (envelope-from mike@sentex.net) Received: from smarthost1.sentex.ca (smarthost1-6.sentex.ca [IPv6:2607:f3e0:0:1::12]) by mx1.freebsd.org (Postfix) with ESMTP id AE7838FC16; Fri, 11 Mar 2011 00:06:29 +0000 (UTC) Received: from [IPv6:2607:f3e0:0:4:4433:c074:8d7b:b33d] ([IPv6:2607:f3e0:0:4:4433:c074:8d7b:b33d]) by smarthost1.sentex.ca (8.14.4/8.14.4) with ESMTP id p2B06RN9020450 (version=TLSv1/SSLv3 cipher=DHE-RSA-CAMELLIA256-SHA bits=256 verify=NO); Thu, 10 Mar 2011 19:06:27 -0500 (EST) (envelope-from mike@sentex.net) Message-ID: <4D79677F.2080908@sentex.net> Date: Thu, 10 Mar 2011 19:06:23 -0500 From: Mike Tancsa Organization: Sentex Communications User-Agent: Mozilla/5.0 (Windows; U; Windows NT 6.1; en-US; rv:1.9.2.13) Gecko/20101207 Thunderbird/3.1.7 
MIME-Version: 1.0 To: Stephen McKay References: <201103081425.p28EPQtM002115@dungeon.home> <201103091241.p29CfUM1003302@dungeon.home> <4D7788D9.50808@sentex.net> <201103102302.p2AN2hNB002016@dungeon.home> In-Reply-To: <201103102302.p2AN2hNB002016@dungeon.home> X-Enigmail-Version: 1.1.1 Content-Type: text/plain; charset=ISO-8859-1 Content-Transfer-Encoding: 7bit X-Scanned-By: MIMEDefang 2.67 on IPv6:2607:f3e0:0:1::12 Cc: freebsd-fs@freebsd.org Subject: Re: Constant minor ZFS corruption X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Fri, 11 Mar 2011 00:06:30 -0000 On 3/10/2011 6:02 PM, Stephen McKay wrote: > > If either of these were the cause of our problem, we'd see errors > logged, right? Not just invisible corruption? You would think.... Controller or controller driver issue ? Do you have any other systems with the same parts ? If you swap out 2 or 3 of the drives and then take those disks into another system, are you able to trigger the issue or an issue ? > > We will probably swap the power supply and cables anyway soon, just to > see what happens, but on other machines where cables or power was the > problem I saw errors (just like yours) in the logs. > >> After almost 5 days of uptime, no problems at all now. Not one error. > > Well, we've got something to aim for, eh? :-) Considering I was seeing dozens per hr, 5 days (now 6!) without an error is great :) ---Mike > > Cheers, > > Stephen. 
> > -- ------------------- Mike Tancsa, tel +1 519 651 3400 Sentex Communications, mike@sentex.net Providing Internet services since 1994 www.sentex.net Cambridge, Ontario Canada http://www.tancsa.com/ From owner-freebsd-fs@FreeBSD.ORG Fri Mar 11 00:18:00 2011 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 1FE1A106566C for ; Fri, 11 Mar 2011 00:18:00 +0000 (UTC) (envelope-from jdc@koitsu.dyndns.org) Received: from qmta08.westchester.pa.mail.comcast.net (qmta08.westchester.pa.mail.comcast.net [76.96.62.80]) by mx1.freebsd.org (Postfix) with ESMTP id CEA3F8FC0A for ; Fri, 11 Mar 2011 00:17:59 +0000 (UTC) Received: from omta20.westchester.pa.mail.comcast.net ([76.96.62.71]) by qmta08.westchester.pa.mail.comcast.net with comcast id HcBo1g0081YDfWL58cJ09P; Fri, 11 Mar 2011 00:18:00 +0000 Received: from koitsu.dyndns.org ([98.248.33.18]) by omta20.westchester.pa.mail.comcast.net with comcast id HcHw1g00l0PUQVN3gcHxDm; Fri, 11 Mar 2011 00:17:59 +0000 Received: by icarus.home.lan (Postfix, from userid 1000) id 53D829B422; Thu, 10 Mar 2011 16:17:55 -0800 (PST) Date: Thu, 10 Mar 2011 16:17:55 -0800 From: Jeremy Chadwick To: Stephen McKay Message-ID: <20110311001755.GB9136@icarus.home.lan> References: <201103081425.p28EPQtM002115@dungeon.home> <201103091241.p29CfUM1003302@dungeon.home> <201103102319.p2ANJWxN002125@dungeon.home> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <201103102319.p2ANJWxN002125@dungeon.home> User-Agent: Mutt/1.5.21 (2010-09-15) Cc: freebsd-fs@freebsd.org Subject: Re: Constant minor ZFS corruption X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Fri, 11 Mar 2011 00:18:00 -0000 On Fri, Mar 11, 2011 at 09:19:32AM +1000, Stephen McKay 
wrote: > On Thursday, 10th March 2011, Chris Forgeron wrote: > >Lastly, check what Mike Tancsa said about his hardware - All of my > >gear is quality, 1000W dual redundant power supplies, LSI SAS > >controllers, ECC registered ram, no overclocking, etc, etc. You may > >have a software issue, but it's more likely that ZFS is just exposing > >some instability in your system. Has your RAM checked out with a Memtest > >run overnight? We're talking small, intermittent errors here, not big > >red flags that will be obvious to spot. > > The ASUS PIKE2008 card is LSI based. Our RAM is ECC. We're not > overclocking (in fact I disabled turbo-boost). We haven't run memtest > but we have done a few "make buildworld" runs. All of these completed > without error. And with ECC RAM, we should see log messages if anything > is wrong there anyway. Specifically with regards to your last sentence: you're making blind assumptions here. Let me talk a bit about how ECC RAM errors are reported to the motherboard and how all of that works. (Also -- calling John Baldwin to come in here and correct me if I'm wrong, because over the years I've had to piece all of this together myself, and I could obviously have parts wrong. :-) ) When there are uncorrectable-bit or correctable-bit errors (of either single-bit or multi-bit types) witnessed on ECC RAM, the memory controller can (doesn't have to!) throw, on the PCI bus, what's called a PERR or SERR signal. The BIOS controls this capability, and what PERR/SERR can get turned into. Some BIOSes permit you to tie these signals to an interrupt (usually some form of NMI). The operating system's kernel has to be written to understand this NMI and handle it appropriately.
So you have the following pieces that are required for the OS to report an ECC error: 1) Use of ECC RAM, 2) A memory controller on your motherboard (or possibly the MCH is within the CPU, such as on newer Core iX CPUs or some Xeons) that supports throwing PERR# and SERR# signals, 3) A BIOS that can set up an NMI generation on PERR or SERR, 4) An operating system that knows how to handle that NMI. There are a LOT of motherboards out there which "support ECC", but what they mean to say is "our board works with ECC RAM, but if there's uncorrected bit errors we didn't implement any mechanisms to tell the underlying OS, lolz". Lots of consumer-grade boards that claim to work with either ECC or non-ECC RAM do this. You won't find the BIOS tweaks in there, and Technical Support will just tell you "yes board X works with ECC". Lovely situation. Does FreeBSD support the above? I have absolutely no idea. The only systems I've used which can generate an NMI on PERR or SERR are Tyan boards (we use them at work), and all those systems run Solaris. Solaris also has really good MCA support -- more on that next. Now, there's also another possibility/mechanism, which is MCA. MCA is something that's generated by the actual processor and covers quite a vast number of hardware events of all ranges (some minor, some major). MCA will generate an MCE when there's any sort of memory error and so on. The OS has to have support for handling MCA, and also has to provide decent details of the MCE. Decoding MCEs is tricky, especially on FreeBSD. John Baldwin has made some patches for getting Linux's mcelog working -- well, the log parsing part -- on FreeBSD (but they're slightly out of date; I can provide more recent patches if need be). Don't expect direct DMI to work on FreeBSD with mcelog, for example. So with this situation we now have: 1) CPU has to support MCA, 2) OS has to support MCA and know how to decode MCEs properly, 3) Utilities to decode MCEs correctly. 
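As a rough illustration of the MCA chain listed above, here is a sketch of how one might check whether a FreeBSD box has logged any machine-check records. It assumes the kernel prefixes those log lines with "MCA:" and that the hw.mca.enabled and hw.mca.count sysctls are present on the running kernel; both are assumptions to verify locally, and the helper name mca_lines is made up for this example.

```shell
# Filter kernel messages on stdin down to machine-check records.
# Assumes the MCA code logs lines beginning with "MCA:".
mca_lines() {
    grep '^MCA:'
}

# Hypothetical usage on the machine being debugged:
# sysctl hw.mca.enabled hw.mca.count
# dmesg | mca_lines
```

An empty result only says that nothing was reported; as described above, it does not prove the memory is error-free unless every link in the reporting chain is known to work.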
FreeBSD 8.x does support MCA (it's enabled by default), and if you skim the -stable list you'll find people occasionally trying to figure out why their system is spewing these mysterious MCEs and what they mean. MCA is only available, however, if your CPU supports it, and my gut feeling says that parts of the system (motherboard) have to have parts integrated as well. So circling back to your very first post, you said you were using: Asus P7F-E (includes 6 3Gb/s SATA ports) Oh dear, Asus. What kind of mission-critical environment uses this hardware? :-) Let's see what the user manual has in it. Section 4.4.2 has options related to the Northbridge (which I'm not sure what it is in this case; the board supports Core iX CPUs which have on-die MCH, so I'm not sure what this controls). All of the items in this section of the manual are horribly documented, but ones that catch my eye are: * DRAM Margin Ranks (Enabled/Disabled) * MRC Serial Debug Message Level (Disabled/Min/Max/Test) * Memory ECC Function (Enabled/Disabled) * Page Policy (Closed/Open) * Adaptive Page (Disabled/Enabled) * Data Scramble (Disabled/Enabled) * Memory Thermal Throttling (Disabled/CLTT/OLTT) I know what the 3rd and last items do, but not the rest. There's also something on the Southbridge part of the manual which is strange: something called "Energy Lake Feature". It defaults to Disabled, with a comment "We do not recommend you enable this feature". This is all I could find: * Energy Lake technology introduces two main end-user features: the "Consumer Electronics" (CE)-like device power behavior, and maintaining system state and data integrity during power loss events. * Allow you to configure Intel's Energy Lake power management technology. If you are running a Media Center you can install the Intel VIIV software to get the correct driver; otherwise disable the Energy Lake feature in BIOS (it relates purely to Intel's Quick Resume feature, which is generally useless). 
Otherwise, I see no mention of MCA, PERR/SERR, or anything else that's considered useful (by my standards). I see lots of server-esque options like BIOS-level serial console, but the rest of the board is extremely desktop-oriented, which is what Asus is known for. > We have tried to buy quality hardware. At least, we didn't deliberately > skimp (except to build our own box vs buy a big name brand pre-built zfs > server). No offence intended -- honestly -- but I question anyone who would buy an Asus motherboard for a server. If I was sitting in a meeting room with infrastructure engineers discussing what to buy and someone said "We're considering Asus", I would say "This is a joke, right?" (Note that for my home Windows workstations, I do use Asus motherboards) Sure, the motherboard might not even be the problem. But I'm just saying, who knows what's going on here, I have to question everything. You followed up with "we're starting to question the PIKE card", which should in turn make you question exactly why you bought this hardware to begin with. My recommendation, while not wanting to spend zillions of bucks on HP/Compaq or Dell hardware? Supermicro. I can't talk about their storage HBAs, but many other people here can -- the results have been hit-or-miss. I tend to stick with solely Intel ICHxx or ESBx on-board controllers, which FreeBSD works wonderfully with. -- | Jeremy Chadwick jdc@parodius.com | | Parodius Networking http://www.parodius.com/ | | UNIX Systems Administrator Mountain View, CA, USA | | Making life hard for others since 1977. 
PGP 4BD6C0CB | From owner-freebsd-fs@FreeBSD.ORG Fri Mar 11 00:27:48 2011 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 22A521065675; Fri, 11 Mar 2011 00:27:48 +0000 (UTC) (envelope-from mike@sentex.net) Received: from smarthost1.sentex.ca (smarthost1-6.sentex.ca [IPv6:2607:f3e0:0:1::12]) by mx1.freebsd.org (Postfix) with ESMTP id B41598FC14; Fri, 11 Mar 2011 00:27:47 +0000 (UTC) Received: from [IPv6:2607:f3e0:0:4:4433:c074:8d7b:b33d] ([IPv6:2607:f3e0:0:4:4433:c074:8d7b:b33d]) by smarthost1.sentex.ca (8.14.4/8.14.4) with ESMTP id p2B0Rjbm021808 (version=TLSv1/SSLv3 cipher=DHE-RSA-CAMELLIA256-SHA bits=256 verify=NO); Thu, 10 Mar 2011 19:27:46 -0500 (EST) (envelope-from mike@sentex.net) Message-ID: <4D796C7D.6080208@sentex.net> Date: Thu, 10 Mar 2011 19:27:41 -0500 From: Mike Tancsa Organization: Sentex Communications User-Agent: Mozilla/5.0 (Windows; U; Windows NT 6.1; en-US; rv:1.9.2.13) Gecko/20101207 Thunderbird/3.1.7 MIME-Version: 1.0 To: Jeremy Chadwick References: <201103081425.p28EPQtM002115@dungeon.home> <201103091241.p29CfUM1003302@dungeon.home> <4D7788D9.50808@sentex.net> <201103102302.p2AN2hNB002016@dungeon.home> <20110310234143.GA9136@icarus.home.lan> In-Reply-To: <20110310234143.GA9136@icarus.home.lan> X-Enigmail-Version: 1.1.1 Content-Type: text/plain; charset=ISO-8859-1 Content-Transfer-Encoding: 7bit X-Scanned-By: MIMEDefang 2.67 on IPv6:2607:f3e0:0:1::12 Cc: freebsd-fs@freebsd.org, Stephen McKay Subject: Re: Constant minor ZFS corruption X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Fri, 11 Mar 2011 00:27:48 -0000 On 3/10/2011 6:41 PM, Jeremy Chadwick wrote: >>> >>> Mar 3 05:34:47 offsite kernel: ad1: FAILURE - WRITE_DMA48 >>> status=51 error=10 LBA=2281852580 >>> >>> and >>> >>> Mar 
4 08:56:15 offsite kernel: siisch1: siis_timeout is 00040000 ss >>> 04000000 rs 04000000 es 00000000 sts 801e2000 serr 00000000 > > Speaking strictly to Mike here: > > I spent some time a while ago trying to figure out the NID_NOT_FOUND > error. Something I wrote back when I was contributing on the Wiki; see > section "SATA disk troubleshooting": > > http://wiki.freebsd.org/JeremyChadwick/ATA_issues_and_troubleshooting > > So, it could be that the LBA being accessed isn't within the permitted > valid range. I could be completely off my rocker though; I'd need > someone much more familiar with the ATA-7 specification to state up > front what this bit actually defines. Well, not sure about the NID not found errors. If the cable is bad or the power supply is marginal, who knows what the disk thinks it's getting in terms of requests? New PS and new cables took away the errors. The only other place I have seen the NID not found error consistently is on large SANDISK CFs on Alix and Soekris boxes. Haven't found a workaround for that, unfortunately. > > Anyway, despite that, the controller is also reporting timeouts. What > you haven't shown is what exact model of Silicon Image controller you're > using. It matters. There are certain models of SI chipsets that have > very bad, nasty bugs. Other models of chips do not have these issues: 3124 http://www.addonics.com/products/host_controller/adsa3gpx8-4e.asp. They work quite well. mav@freebsd.org wrote the drivers using this card and they have been rock solid for us so far on two heavily used nfs/smb servers.
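For the "status=51 error=10" pair in the log quoted above, a small sketch of decoding the error register bit by bit. The bit names follow the commonly documented ATA error register layout and should be checked against the specification; the function name ata_error_bits is hypothetical.

```shell
# Decode an ATA error register value (hex digits, as printed after
# "error=") into bit names. The bit layout is an assumption based on
# the commonly documented ATA error register; verify before relying on it.
ata_error_bits() {
    val=$(( 0x$1 ))
    out=""
    if [ $(( val & 0x80 )) -ne 0 ]; then out="$out ICRC"; fi   # interface CRC error
    if [ $(( val & 0x40 )) -ne 0 ]; then out="$out UNC"; fi    # uncorrectable data
    if [ $(( val & 0x20 )) -ne 0 ]; then out="$out MC"; fi     # media changed
    if [ $(( val & 0x10 )) -ne 0 ]; then out="$out IDNF"; fi   # ID not found (NID_NOT_FOUND)
    if [ $(( val & 0x08 )) -ne 0 ]; then out="$out MCR"; fi    # media change request
    if [ $(( val & 0x04 )) -ne 0 ]; then out="$out ABRT"; fi   # command aborted
    if [ $(( val & 0x02 )) -ne 0 ]; then out="$out NM"; fi     # no media
    echo "${out# }"
}

ata_error_bits 10    # the error=10 from the WRITE_DMA48 line; prints IDNF
```

Only bit 4 is set in 0x10, which matches the NID_NOT_FOUND reading discussed in this thread; what the drive actually meant by it is still an open question.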
---Mike -- ------------------- Mike Tancsa, tel +1 519 651 3400 Sentex Communications, mike@sentex.net Providing Internet services since 1994 www.sentex.net Cambridge, Ontario Canada http://www.tancsa.com/ From owner-freebsd-fs@FreeBSD.ORG Fri Mar 11 14:01:01 2011 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 5E75E1065680 for ; Fri, 11 Mar 2011 14:01:01 +0000 (UTC) (envelope-from alexander@leidinger.net) Received: from mail.ebusiness-leidinger.de (mail.ebusiness-leidinger.de [217.11.53.44]) by mx1.freebsd.org (Postfix) with ESMTP id F19E58FC26 for ; Fri, 11 Mar 2011 14:01:00 +0000 (UTC) Received: from outgoing.leidinger.net (p5B15535C.dip.t-dialin.net [91.21.83.92]) by mail.ebusiness-leidinger.de (Postfix) with ESMTPSA id 85A0C844015; Fri, 11 Mar 2011 15:00:54 +0100 (CET) Received: from webmail.leidinger.net (unknown [IPv6:fd73:10c7:2053:1::2:102]) by outgoing.leidinger.net (Postfix) with ESMTP id E7A9428FE; Fri, 11 Mar 2011 15:00:48 +0100 (CET) Received: (from www@localhost) by webmail.leidinger.net (8.14.4/8.13.8/Submit) id p2BE0R1X065995; Fri, 11 Mar 2011 15:00:27 +0100 (CET) (envelope-from Alexander@Leidinger.net) Received: from pslux.ec.europa.eu (pslux.ec.europa.eu [158.169.9.14]) by webmail.leidinger.net (Horde Framework) with HTTP; Fri, 11 Mar 2011 15:00:27 +0100 Message-ID: <20110311150027.153506yognqhzx18@webmail.leidinger.net> Date: Fri, 11 Mar 2011 15:00:27 +0100 From: Alexander Leidinger To: Chris Forgeron References: <201103081425.p28EPQtM002115@dungeon.home> <201103091241.p29CfUM1003302@dungeon.home> In-Reply-To: MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8; DelSp="Yes"; format="flowed" Content-Disposition: inline Content-Transfer-Encoding: 7bit User-Agent: Dynamic Internet Messaging Program (DIMP) H3 (1.1.4) X-EBL-MailScanner-Information: Please contact the ISP for more information X-EBL-MailScanner-ID: 
85A0C844015.A43D8 X-EBL-MailScanner: Found to be clean X-EBL-MailScanner-SpamCheck: not spam, spamhaus-ZEN, SpamAssassin (not cached, score=1.274, required 6, autolearn=disabled, RDNS_NONE 1.27) X-EBL-MailScanner-SpamScore: s X-EBL-MailScanner-From: alexander@leidinger.net X-EBL-MailScanner-Watermark: 1300456855.92446@aAroBSI0ru/Uf97RCdjijw X-EBL-Spam-Status: No Cc: "freebsd-fs@freebsd.org" , Stephen McKay Subject: RE: Constant minor ZFS corruption X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Fri, 11 Mar 2011 14:01:01 -0000 Quoting Chris Forgeron (from Thu, 10 Mar 2011 16:43:43 -0400): > Oh - and you're AMD64, correct, not i386? I think we (royal we) > should remove support for i386 in ZFS, it has never been stable for > me, and I see a lot of grief about it on the boards. I also think > you need 8 GB of RAM to play seriously. I've had reasonable success > with 4GB and a light load, but any serious file traffic needs 8GB of > breathing room as ZFS gobbles up the RAM in a very aggressive manner. Veto! I have two x86 machines, one with "only" 768 MB RAM. Both of them run with ZFS without problems. The scenario I use them in may not be the scenario you need to provide a machine for, but there are scenarios where ZFS on x86 works. Bye, Alexander.
-- BOFH excuse #113: Root nameservers are out of sync http://www.Leidinger.net Alexander @ Leidinger.net: PGP ID = B0063FE7 http://www.FreeBSD.org netchild @ FreeBSD.org : PGP ID = 72077137 From owner-freebsd-fs@FreeBSD.ORG Fri Mar 11 19:30:14 2011 Return-Path: Delivered-To: freebsd-fs@hub.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id EC07F1065673 for ; Fri, 11 Mar 2011 19:30:14 +0000 (UTC) (envelope-from gnats@FreeBSD.org) Received: from freefall.freebsd.org (freefall.freebsd.org [IPv6:2001:4f8:fff6::28]) by mx1.freebsd.org (Postfix) with ESMTP id DAC8C8FC08 for ; Fri, 11 Mar 2011 19:30:14 +0000 (UTC) Received: from freefall.freebsd.org (localhost [127.0.0.1]) by freefall.freebsd.org (8.14.4/8.14.4) with ESMTP id p2BJUETP005157 for ; Fri, 11 Mar 2011 19:30:14 GMT (envelope-from gnats@freefall.freebsd.org) Received: (from gnats@localhost) by freefall.freebsd.org (8.14.4/8.14.4/Submit) id p2BJUEdK005154; Fri, 11 Mar 2011 19:30:14 GMT (envelope-from gnats) Date: Fri, 11 Mar 2011 19:30:14 GMT Message-Id: <201103111930.p2BJUEdK005154@freefall.freebsd.org> To: freebsd-fs@FreeBSD.org From: dfilter@FreeBSD.ORG (dfilter service) Cc: Subject: Re: kern/154681: commit references a PR X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list Reply-To: dfilter service List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Fri, 11 Mar 2011 19:30:15 -0000 The following reply was made to PR kern/154681; it has been noted by GNATS. 
From: dfilter@FreeBSD.ORG (dfilter service) To: bug-followup@FreeBSD.org Cc: Subject: Re: kern/154681: commit references a PR Date: Fri, 11 Mar 2011 19:27:50 +0000 (UTC) Author: avg Date: Fri Mar 11 19:27:31 2011 New Revision: 219526 URL: http://svn.freebsd.org/changeset/base/219526 Log: use even larger stack size for ZFS txg_sync_thread While the stack size was larger than the default stack size on i386, it was smaller than the default stack size on amd64 and apparently that wasn't enough. So, bump the size to 4 pages. Upcoming ZFSv28 code uses 8 pages for this stack size. This is a direct commit to stable/8. PR: kern/154681 Discussed with: pjd Modified: stable/8/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/txg.c Modified: stable/8/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/txg.c ============================================================================== --- stable/8/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/txg.c Fri Mar 11 19:21:42 2011 (r219525) +++ stable/8/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/txg.c Fri Mar 11 19:27:31 2011 (r219526) @@ -146,7 +146,7 @@ txg_sync_start(dsl_pool_t *dp) * 32-bit x86. This is due in part to nested pools and * scrub_visitbp() recursion. 
*/ - tx->tx_sync_thread = thread_create(NULL, 12<<10, txg_sync_thread, + tx->tx_sync_thread = thread_create(NULL, 16<<10, txg_sync_thread, dp, 0, &p0, TS_RUN, minclsyspri); mutex_exit(&tx->tx_sync_lock); _______________________________________________ svn-src-all@freebsd.org mailing list http://lists.freebsd.org/mailman/listinfo/svn-src-all To unsubscribe, send any mail to "svn-src-all-unsubscribe@freebsd.org" From owner-freebsd-fs@FreeBSD.ORG Sat Mar 12 03:56:05 2011 Return-Path: Delivered-To: freebsd-fs@hub.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id C9E6E1065679; Sat, 12 Mar 2011 03:56:05 +0000 (UTC) (envelope-from linimon@FreeBSD.org) Received: from freefall.freebsd.org (freefall.freebsd.org [IPv6:2001:4f8:fff6::28]) by mx1.freebsd.org (Postfix) with ESMTP id A19468FC16; Sat, 12 Mar 2011 03:56:05 +0000 (UTC) Received: from freefall.freebsd.org (localhost [127.0.0.1]) by freefall.freebsd.org (8.14.4/8.14.4) with ESMTP id p2C3u5YF087637; Sat, 12 Mar 2011 03:56:05 GMT (envelope-from linimon@freefall.freebsd.org) Received: (from linimon@localhost) by freefall.freebsd.org (8.14.4/8.14.4/Submit) id p2C3u5YC087633; Sat, 12 Mar 2011 03:56:05 GMT (envelope-from linimon) Date: Sat, 12 Mar 2011 03:56:05 GMT Message-Id: <201103120356.p2C3u5YC087633@freefall.freebsd.org> To: linimon@FreeBSD.org, freebsd-bugs@FreeBSD.org, freebsd-fs@FreeBSD.org From: linimon@FreeBSD.org Cc: Subject: Re: kern/155484: [ufs] GPT + UFS boot don't work well together X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Sat, 12 Mar 2011 03:56:05 -0000 Old Synopsis: GPT + UFS boot New Synopsis: [ufs] GPT + UFS boot don't work well together Responsible-Changed-From-To: freebsd-bugs->freebsd-fs Responsible-Changed-By: linimon Responsible-Changed-When: Sat Mar 12 03:54:09 UTC 
2011 Responsible-Changed-Why: Reclassify and assign. http://www.freebsd.org/cgi/query-pr.cgi?pr=155484 From owner-freebsd-fs@FreeBSD.ORG Sat Mar 12 07:24:28 2011 Return-Path: Delivered-To: freebsd-fs@hub.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id D76CD1065675; Sat, 12 Mar 2011 07:24:28 +0000 (UTC) (envelope-from avg@FreeBSD.org) Received: from freefall.freebsd.org (freefall.freebsd.org [IPv6:2001:4f8:fff6::28]) by mx1.freebsd.org (Postfix) with ESMTP id AE1CD8FC12; Sat, 12 Mar 2011 07:24:28 +0000 (UTC) Received: from freefall.freebsd.org (localhost [127.0.0.1]) by freefall.freebsd.org (8.14.4/8.14.4) with ESMTP id p2C7OSmU098214; Sat, 12 Mar 2011 07:24:28 GMT (envelope-from avg@freefall.freebsd.org) Received: (from avg@localhost) by freefall.freebsd.org (8.14.4/8.14.4/Submit) id p2C7OSRm098210; Sat, 12 Mar 2011 07:24:28 GMT (envelope-from avg) Date: Sat, 12 Mar 2011 07:24:28 GMT Message-Id: <201103120724.p2C7OSRm098210@freefall.freebsd.org> To: rs@bytecamp.net, avg@FreeBSD.org, freebsd-fs@FreeBSD.org, avg@FreeBSD.org From: avg@FreeBSD.org Cc: Subject: Re: kern/154681: [zfs] [panic] panic with FreeBSD-8 STABLE X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Sat, 12 Mar 2011 07:24:28 -0000 Synopsis: [zfs] [panic] panic with FreeBSD-8 STABLE State-Changed-From-To: open->closed State-Changed-By: avg State-Changed-When: Sat Mar 12 07:23:06 UTC 2011 State-Changed-Why: A fix is committed. Responsible-Changed-From-To: freebsd-fs->avg Responsible-Changed-By: avg Responsible-Changed-When: Sat Mar 12 07:23:06 UTC 2011 Responsible-Changed-Why: I am handling this issue.
http://www.freebsd.org/cgi/query-pr.cgi?pr=154681 From owner-freebsd-fs@FreeBSD.ORG Sat Mar 12 07:50:13 2011 Return-Path: Delivered-To: freebsd-fs@hub.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 894C1106564A for ; Sat, 12 Mar 2011 07:50:13 +0000 (UTC) (envelope-from gnats@FreeBSD.org) Received: from freefall.freebsd.org (freefall.freebsd.org [IPv6:2001:4f8:fff6::28]) by mx1.freebsd.org (Postfix) with ESMTP id 5EE1F8FC22 for ; Sat, 12 Mar 2011 07:50:13 +0000 (UTC) Received: from freefall.freebsd.org (localhost [127.0.0.1]) by freefall.freebsd.org (8.14.4/8.14.4) with ESMTP id p2C7oD3B037948 for ; Sat, 12 Mar 2011 07:50:13 GMT (envelope-from gnats@freefall.freebsd.org) Received: (from gnats@localhost) by freefall.freebsd.org (8.14.4/8.14.4/Submit) id p2C7oDI7037947; Sat, 12 Mar 2011 07:50:13 GMT (envelope-from gnats) Date: Sat, 12 Mar 2011 07:50:13 GMT Message-Id: <201103120750.p2C7oDI7037947@freefall.freebsd.org> To: freebsd-fs@FreeBSD.org From: Andriy Gapon Cc: Subject: Re: kern/155484: [ufs] GPT + UFS boot don't work well together X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list Reply-To: Andriy Gapon List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Sat, 12 Mar 2011 07:50:13 -0000 The following reply was made to PR kern/155484; it has been noted by GNATS. From: Andriy Gapon To: bug-followup@freebsd.org, rarehawk@gmail.com Cc: Subject: Re: kern/155484: [ufs] GPT + UFS boot don't work well together Date: Sat, 12 Mar 2011 09:47:43 +0200 gpart(8) /bootme Does this help? 
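A hedged sketch of what the bootme hint from gpart(8) could look like in practice. The helper only prints the command so it can be reviewed before being run as root; the disk name mfid0 and partition index 3 are assumptions based on the layout described in the PR (boot, swap, then freebsd-ufs), and it presumes the installed gptboot honors the attribute. The name bootme_cmd is made up for this example.

```shell
# Dry-run helper: print (do not run) the gpart command that would set
# the GPT "bootme" attribute on partition index $2 of disk $1.
bootme_cmd() {
    printf 'gpart set -a bootme -i %s %s\n' "$2" "$1"
}

# mfid0 and index 3 are assumptions taken from the PR's partition
# layout, where the freebsd-ufs partition is the third one.
bootme_cmd mfid0 3
```

Printing first rather than executing is a deliberate choice here, since setting attributes on the wrong index is easy to do on a freshly partitioned disk.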
-- Andriy Gapon From owner-freebsd-fs@FreeBSD.ORG Sat Mar 12 08:30:18 2011 Return-Path: Delivered-To: freebsd-fs@hub.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 04F26106564A for ; Sat, 12 Mar 2011 08:30:18 +0000 (UTC) (envelope-from gnats@FreeBSD.org) Received: from freefall.freebsd.org (freefall.freebsd.org [IPv6:2001:4f8:fff6::28]) by mx1.freebsd.org (Postfix) with ESMTP id CE2B08FC08 for ; Sat, 12 Mar 2011 08:30:17 +0000 (UTC) Received: from freefall.freebsd.org (localhost [127.0.0.1]) by freefall.freebsd.org (8.14.4/8.14.4) with ESMTP id p2C8UHvs080277 for ; Sat, 12 Mar 2011 08:30:17 GMT (envelope-from gnats@freefall.freebsd.org) Received: (from gnats@localhost) by freefall.freebsd.org (8.14.4/8.14.4/Submit) id p2C8UH04080272; Sat, 12 Mar 2011 08:30:17 GMT (envelope-from gnats) Date: Sat, 12 Mar 2011 08:30:17 GMT Message-Id: <201103120830.p2C8UH04080272@freefall.freebsd.org> To: freebsd-fs@FreeBSD.org From: Alexander Best Cc: Subject: Re: kern/155484: GPT + UFS boot X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list Reply-To: Alexander Best List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Sat, 12 Mar 2011 08:30:18 -0000 The following reply was made to PR kern/155484; it has been noted by GNATS. 
From: Alexander Best To: Andrey Vladimirov Cc: freebsd-gnats-submit@FreeBSD.org Subject: Re: kern/155484: GPT + UFS boot Date: Sat, 12 Mar 2011 08:22:30 +0000 On Fri Mar 11 11, Andrey Vladimirov wrote: > > >Number: 155484 > >Category: kern > >Synopsis: GPT + UFS boot > >Confidential: no > >Severity: serious > >Priority: medium > >Responsible: freebsd-bugs > >State: open > >Quarter: > >Keywords: > >Date-Required: > >Class: sw-bug > >Submitter-Id: current-users > >Arrival-Date: Fri Mar 11 22:50:10 UTC 2011 > >Closed-Date: > >Last-Modified: > >Originator: Andrey Vladimirov > >Release: Freebsd 8.2 RELEASE > >Organization: > >Environment: > FreeBSD 8.2-STABLE FreeBSD 8.2-STABLE #0: Wed Mar 9 20:11:11 UTC 2011 andrey@:/usr/obj/usr/src/sys/x3650m2 amd64 > >Description: > I'm trying to setup a system with a large RAID array (total ~4TB) > I do next step: > 1.Create the boot, swap and UFS partitions: > Fixit# gpart add -s 64K -t freebsd-boot mfid0 > Fixit# gpart add -s 8G -t freebsd-swap -l swap0 mfid0 > Fixit# gpart add -t freebsd-ufs -l disk0 mfid0 > 2. Install the Protected MBR (pmbr) and gptboot loader > Fixit# gpart bootcode -b /mnt2/boot/pmbr -p /mnt2/boot/gptzfsboot -i 1 ad0 > Then install FreeBSD and add this: > echo 'vfs.root.mountfrom="ufs:/dev/mfid0p2"' >> /boot/loader.conf isn't this line pointing to your swap partition? try echo 'vfs.root.mountfrom="ufs:/dev/mfid0p3"' >> /boot/loader.conf ...also shouldn't this be: gpart bootcode -b /mnt2/boot/pmbr -p /mnt2/boot/gptboot -i 1 mfid0 ? also i'm not sure -b and -p can be used together in one command. at least the gpart(8) manual uses two commands for it. > After rebooting (system not booting) i see message: > is unable to find loader at /boot/loader or can it load /boot/kernel/kernel > > I load from DVD and go to Fixit > Copying /boot/loader to /loader allows me to enter /loader at the "boot:" prompt and the loader will load, however, its unable to load the kernel. 
> > If I do an "ls" at the loader prompt I can see boot listed as a directory (with a "d" before it) > Trying to do "ls boot" inexplicably it says "boot: not a directory" > > > > > > >How-To-Repeat: > do next step: > 1.Create the boot, swap and UFS partitions: > Fixit# gpart add -s 64K -t freebsd-boot mfid0 > Fixit# gpart add -s 8G -t freebsd-swap -l swap0 mfid0 > Fixit# gpart add -t freebsd-ufs -l disk0 mfid0 > 2. Install the Protected MBR (pmbr) and gptboot loader > Fixit# gpart bootcode -b /mnt2/boot/pmbr -p /mnt2/boot/gptzfsboot -i 1 ad0 > echo 'vfs.root.mountfrom="ufs:/dev/mfid0p2"' >> /boot/loader.conf > > >Fix: > If do next step: > Fixit# gpart add -s 64K -t freebsd-boot mfid0 > Fixit# gpart add -s 3800G -t freebsd-ufs -l disk0 mfid0 > Fixit# gpart add -s 8G -t freebsd-swap -l swap0 mfid0 > or > Fixit# gpart add -s 64K -t freebsd-boot mfid0 > next partition must be root(/) > Fixit# gpart add -s 3G -t freebsd-ufs -l disk0 mfid0 > Fixit# gpart add -s 8G -t freebsd-swap -l swap0 mfid0 > > No problem with boot on GPT. 
> > >Release-Note: > >Audit-Trail: > >Unformatted: -- a13x From owner-freebsd-fs@FreeBSD.ORG Sat Mar 12 09:00:27 2011 Return-Path: Delivered-To: freebsd-fs@hub.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 60E641065793 for ; Sat, 12 Mar 2011 09:00:27 +0000 (UTC) (envelope-from gnats@FreeBSD.org) Received: from freefall.freebsd.org (freefall.freebsd.org [IPv6:2001:4f8:fff6::28]) by mx1.freebsd.org (Postfix) with ESMTP id 358AE8FC0C for ; Sat, 12 Mar 2011 09:00:27 +0000 (UTC) Received: from freefall.freebsd.org (localhost [127.0.0.1]) by freefall.freebsd.org (8.14.4/8.14.4) with ESMTP id p2C90QO2008457 for ; Sat, 12 Mar 2011 09:00:26 GMT (envelope-from gnats@freefall.freebsd.org) Received: (from gnats@localhost) by freefall.freebsd.org (8.14.4/8.14.4/Submit) id p2C90QGA008433; Sat, 12 Mar 2011 09:00:26 GMT (envelope-from gnats) Date: Sat, 12 Mar 2011 09:00:26 GMT Message-Id: <201103120900.p2C90QGA008433@freefall.freebsd.org> To: freebsd-fs@FreeBSD.org From: "Andrey V. Elsukov" Cc: Subject: Re: kern/155484: GPT + UFS boot X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list Reply-To: "Andrey V. Elsukov" List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Sat, 12 Mar 2011 09:00:27 -0000 The following reply was made to PR kern/155484; it has been noted by GNATS. From: "Andrey V. Elsukov" To: Andrey Vladimirov Cc: freebsd-gnats-submit@FreeBSD.org Subject: Re: kern/155484: GPT + UFS boot Date: Sat, 12 Mar 2011 11:36:10 +0300 On 12.03.2011 01:42, Andrey Vladimirov wrote: > 2. Install the Protected MBR (pmbr) and gptboot loader > Fixit# gpart bootcode -b /mnt2/boot/pmbr -p /mnt2/boot/gptzfsboot -i 1 ad0 Did you try to use gptboot instead gptzfsboot? -- WBR, Andrey V. 
Elsukov From owner-freebsd-fs@FreeBSD.ORG Sat Mar 12 14:10:15 2011 Return-Path: Delivered-To: freebsd-fs@hub.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 368941065670 for ; Sat, 12 Mar 2011 14:10:15 +0000 (UTC) (envelope-from gnats@FreeBSD.org) Received: from freefall.freebsd.org (freefall.freebsd.org [IPv6:2001:4f8:fff6::28]) by mx1.freebsd.org (Postfix) with ESMTP id 015658FC12 for ; Sat, 12 Mar 2011 14:10:15 +0000 (UTC) Received: from freefall.freebsd.org (localhost [127.0.0.1]) by freefall.freebsd.org (8.14.4/8.14.4) with ESMTP id p2CEAEx0009163 for ; Sat, 12 Mar 2011 14:10:14 GMT (envelope-from gnats@freefall.freebsd.org) Received: (from gnats@localhost) by freefall.freebsd.org (8.14.4/8.14.4/Submit) id p2CEAEHC009162; Sat, 12 Mar 2011 14:10:14 GMT (envelope-from gnats) Date: Sat, 12 Mar 2011 14:10:14 GMT Message-Id: <201103121410.p2CEAEHC009162@freefall.freebsd.org> To: freebsd-fs@FreeBSD.org From: Andrey Vladimirov Cc: Subject: Re: kern/155484: GPT + UFS boot X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list Reply-To: Andrey Vladimirov List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Sat, 12 Mar 2011 14:10:15 -0000 The following reply was made to PR kern/155484; it has been noted by GNATS. 
From: Andrey Vladimirov To: Alexander Best Cc: freebsd-gnats-submit@freebsd.org Subject: Re: kern/155484: GPT + UFS boot Date: Sat, 12 Mar 2011 16:09:53 +0200 --0015174be5b0f2f393049e49a1ea Content-Type: text/plain; charset=ISO-8859-1 2011/3/12 Alexander Best > On Fri Mar 11 11, Andrey Vladimirov wrote: > > > > >Number: 155484 > > >Category: kern > > >Synopsis: GPT + UFS boot > > >Confidential: no > > >Severity: serious > > >Priority: medium > > >Responsible: freebsd-bugs > > >State: open > > >Quarter: > > >Keywords: > > >Date-Required: > > >Class: sw-bug > > >Submitter-Id: current-users > > >Arrival-Date: Fri Mar 11 22:50:10 UTC 2011 > > >Closed-Date: > > >Last-Modified: > > >Originator: Andrey Vladimirov > > >Release: Freebsd 8.2 RELEASE > > >Organization: > > >Environment: > > FreeBSD 8.2-STABLE FreeBSD 8.2-STABLE #0: Wed Mar 9 20:11:11 UTC 2011 > andrey@:/usr/obj/usr/src/sys/x3650m2 amd64 > > >Description: > > I'm trying to setup a system with a large RAID array (total ~4TB) > > I do next step: > > 1.Create the boot, swap and UFS partitions: > > Fixit# gpart add -s 64K -t freebsd-boot mfid0 > > Fixit# gpart add -s 8G -t freebsd-swap -l swap0 mfid0 > > Fixit# gpart add -t freebsd-ufs -l disk0 mfid0 > > 2. Install the Protected MBR (pmbr) and gptboot loader > > Fixit# gpart bootcode -b /mnt2/boot/pmbr -p /mnt2/boot/gptzfsboot -i 1 > ad0 > > Then install FreeBSD and add this: > > echo 'vfs.root.mountfrom="ufs:/dev/mfid0p2"' >> /boot/loader.conf > > isn't this line pointing to your swap partition? > > try echo 'vfs.root.mountfrom="ufs:/dev/mfid0p3"' >> /boot/loader.conf > > ...also shouldn't this be: > > gpart bootcode -b /mnt2/boot/pmbr -p /mnt2/boot/gptboot -i 1 mfid0 ? > > also i'm not sure -b and -p can be used together in one command. at least > the > gpart(8) manual uses two commands for it. 
> Yes i this command(sorry it is my misprint): gpart bootcode -b /mnt2/boot/pmbr -p /mnt2/boot/gptboot -i 1 mfid0 About this "vfs.root.mountfrom=" - it does not matter, because if create swap partition after (freebsd-boot) loader and kernel not booting. If I create root partition after freebsd-boot - all work!!!! > > After rebooting (system not booting) i see message: > > is unable to find loader at /boot/loader or can it load > /boot/kernel/kernel > > > > I load from DVD and go to Fixit > > Copying /boot/loader to /loader allows me to enter /loader at the "boot:" > prompt and the loader will load, however, its unable to load the kernel. > > > > If I do an "ls" at the loader prompt I can see boot listed as a directory > (with a "d" before it) > > Trying to do "ls boot" inexplicably it says "boot: not a directory" > > > > > > > > > > > > >How-To-Repeat: > > do next step: > > 1.Create the boot, swap and UFS partitions: > > Fixit# gpart add -s 64K -t freebsd-boot mfid0 > > Fixit# gpart add -s 8G -t freebsd-swap -l swap0 mfid0 > > Fixit# gpart add -t freebsd-ufs -l disk0 mfid0 > > 2. Install the Protected MBR (pmbr) and gptboot loader > > Fixit# gpart bootcode -b /mnt2/boot/pmbr -p /mnt2/boot/gptzfsboot -i 1 > ad0 > > echo 'vfs.root.mountfrom="ufs:/dev/mfid0p2"' >> /boot/loader.conf > > > > >Fix: > > If do next step: > > Fixit# gpart add -s 64K -t freebsd-boot mfid0 > > Fixit# gpart add -s 3800G -t freebsd-ufs -l disk0 mfid0 > > Fixit# gpart add -s 8G -t freebsd-swap -l swap0 mfid0 > > or > > Fixit# gpart add -s 64K -t freebsd-boot mfid0 > > next partition must be root(/) > > Fixit# gpart add -s 3G -t freebsd-ufs -l disk0 mfid0 > > Fixit# gpart add -s 8G -t freebsd-swap -l swap0 mfid0 > > > > No problem with boot on GPT. > > > > >Release-Note: > > >Audit-Trail: > > >Unformatted: > > -- > a13x > -- Best regards, Andrey Vladimirov --0015174be5b0f2f393049e49a1ea Content-Type: text/html; charset=KOI8-R Content-Transfer-Encoding: quoted-printable

--0015174be5b0f2f393049e49a1ea-- From owner-freebsd-fs@FreeBSD.ORG Sat Mar 12 14:30:15 2011 Return-Path: Delivered-To: freebsd-fs@hub.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 8FC9F1065672 for ; Sat, 12 Mar 2011 14:30:15 +0000 (UTC) (envelope-from gnats@FreeBSD.org) Received: from freefall.freebsd.org (freefall.freebsd.org [IPv6:2001:4f8:fff6::28]) by mx1.freebsd.org (Postfix) with ESMTP id 665B48FC15 for ; Sat, 12 Mar 2011 14:30:15 +0000 (UTC) Received: from freefall.freebsd.org (localhost [127.0.0.1]) by freefall.freebsd.org (8.14.4/8.14.4) with ESMTP id p2CEUFDJ027373 for ; Sat, 12 Mar 2011 14:30:15 GMT (envelope-from gnats@freefall.freebsd.org) Received: (from gnats@localhost) by freefall.freebsd.org (8.14.4/8.14.4/Submit) id p2CEUFdp027369; Sat, 12 Mar 2011 14:30:15 GMT (envelope-from gnats) Date: Sat, 12 Mar 2011 14:30:15 GMT Message-Id: <201103121430.p2CEUFdp027369@freefall.freebsd.org> To: freebsd-fs@FreeBSD.org From: Andrey Vladimirov Cc: Subject: Re: kern/155484: GPT + UFS boot X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list Reply-To: Andrey Vladimirov List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Sat, 12 Mar 2011 14:30:15 -0000 The following reply was made to PR kern/155484; it has been noted by GNATS. From: Andrey Vladimirov To: "Andrey V. Elsukov" Cc: freebsd-gnats-submit@freebsd.org Subject: Re: kern/155484: GPT + UFS boot Date: Sat, 12 Mar 2011 15:58:35 +0200 --0015174c45a285ca63049e4979f1 Content-Type: text/plain; charset=KOI8-R Content-Transfer-Encoding: quoted-printable

Sorry, that was a typo. I used this:

Fixit# gpart bootcode -b /mnt2/boot/pmbr -p /mnt2/boot/gptboot -i 1 mfid0

On 12 March 2011 at 10:36, Andrey V. Elsukov wrote:

> On 12.03.2011 01:42, Andrey Vladimirov wrote:
> > 2. Install the Protected MBR (pmbr) and gptboot loader
> > Fixit# gpart bootcode -b /mnt2/boot/pmbr -p /mnt2/boot/gptzfsboot -i 1 ad0
>
> Did you try to use gptboot instead gptzfsboot?
>
> --
> WBR, Andrey V. Elsukov
>

--
Best regards,
Andrey Vladimirov

--0015174c45a285ca63049e4979f1-- From owner-freebsd-fs@FreeBSD.ORG Sat Mar 12 17:03:34 2011 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 65DC81065670 for ; Sat, 12 Mar 2011 17:03:34 +0000 (UTC) (envelope-from kostikbel@gmail.com) Received: from mail.zoral.com.ua (mx0.zoral.com.ua [91.193.166.200]) by mx1.freebsd.org (Postfix) with ESMTP id D49EB8FC18 for ; Sat, 12 Mar 2011 17:03:33 +0000 (UTC) Received: from deviant.kiev.zoral.com.ua (root@deviant.kiev.zoral.com.ua [10.1.1.148]) by mail.zoral.com.ua (8.14.2/8.14.2) with ESMTP id p2CH1NlE075490 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=NO); Sat, 12 Mar 2011 19:01:23 +0200 (EET) (envelope-from kostikbel@gmail.com) Received: from deviant.kiev.zoral.com.ua (kostik@localhost [127.0.0.1]) by deviant.kiev.zoral.com.ua (8.14.4/8.14.4) with ESMTP id p2CH1Nkv044625; Sat, 12 Mar 2011 19:01:23 +0200 (EET) (envelope-from kostikbel@gmail.com) Received: (from kostik@localhost) by deviant.kiev.zoral.com.ua (8.14.4/8.14.4/Submit) id p2CH1Ni0044624; Sat, 12 Mar 2011 19:01:23 +0200 (EET) (envelope-from kostikbel@gmail.com) X-Authentication-Warning: deviant.kiev.zoral.com.ua: kostik set sender to kostikbel@gmail.com using -f Date: Sat, 12 Mar 2011 19:01:23 +0200 From: Kostik Belousov To: freebsd-fs@freebsd.org, freebsd-standards@freebsd.org Message-ID: <20110312170123.GT78089@deviant.kiev.zoral.com.ua> Mime-Version: 1.0 Content-Type: multipart/signed; micalg=pgp-sha1; protocol="application/pgp-signature"; boundary="oT9u3ind7B9FXzeC" Content-Disposition: inline User-Agent: Mutt/1.4.2.3i X-Spam-Status: No, score=-3.4 required=5.0 tests=ALL_TRUSTED,AWL,BAYES_00, DNS_FROM_OPENWHOIS autolearn=no version=3.2.5 X-Spam-Checker-Version: SpamAssassin 3.2.5 (2008-06-10) on skuns.kiev.zoral.com.ua Cc: Subject: open(O_NOFOLLOW) error when encountered symlink X-BeenThere: freebsd-fs@freebsd.org 
X-Mailman-Version: 2.1.5 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Sat, 12 Mar 2011 17:03:34 -0000 --oT9u3ind7B9FXzeC Content-Type: text/plain; charset=us-ascii Content-Disposition: inline Hello, I noted the following discussion and commits in the gnu tar repository: http://lists.gnu.org/archive/html/bug-tar/2010-11/msg00080.html http://git.savannah.gnu.org/cgit/tar.git/commit/?id=1584b72ff271e7f826dd64d7a1c7cd2f66504acb http://git.savannah.gnu.org/cgit/tar.git/commit/?id=649b747913d2b289e904b5f1d222af886acd209c The issue is that in case of open(path, O_NOFOLLOW), when path is naming a symlink, FreeBSD returns EMLINK error. On the other hand, the POSIX requirement is absolutely clear that it shall be ELOOP. I found FreeBSD commit r35088 that specifically changed the error code from the required ELOOP to EMLINK. I doubt that somebody can remember a reason for the change done more then 12 years ago. Anybody have strong objections against the patch below ? diff --git a/lib/libc/sys/open.2 b/lib/libc/sys/open.2 index deca8bc..20877b5 100644 --- a/lib/libc/sys/open.2 +++ b/lib/libc/sys/open.2 @@ -318,7 +318,7 @@ is specified and the named file would reside on a read-only file system. The process has already reached its limit for open file descriptors. .It Bq Er ENFILE The system file table is full. -.It Bq Er EMLINK +.It Bq Er ELOOP .Dv O_NOFOLLOW was specified and the target is a symbolic link. 
.It Bq Er ENXIO diff --git a/sys/kern/vfs_vnops.c b/sys/kern/vfs_vnops.c index 7b5cad1..c7985ef 100644 --- a/sys/kern/vfs_vnops.c +++ b/sys/kern/vfs_vnops.c @@ -194,7 +194,7 @@ restart: vp = ndp->ni_vp; } if (vp->v_type == VLNK) { - error = EMLINK; + error = ELOOP; goto bad; } if (vp->v_type == VSOCK) { --oT9u3ind7B9FXzeC Content-Type: application/pgp-signature Content-Disposition: inline -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.4.11 (FreeBSD) iEYEARECAAYFAk17puMACgkQC3+MBN1Mb4ig0ACcDtap15B/FoQHS7dsaErvo3NU WbgAoO8kSt4R5lAnVBPQBFu5ZFFLvl9B =JrMX -----END PGP SIGNATURE----- --oT9u3ind7B9FXzeC-- From owner-freebsd-fs@FreeBSD.ORG Sat Mar 12 17:30:14 2011 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 0BC03106566B for ; Sat, 12 Mar 2011 17:30:14 +0000 (UTC) (envelope-from sirsquat@gmail.com) Received: from mail-bw0-f54.google.com (mail-bw0-f54.google.com [209.85.214.54]) by mx1.freebsd.org (Postfix) with ESMTP id 935DA8FC19 for ; Sat, 12 Mar 2011 17:30:12 +0000 (UTC) Received: by bwz12 with SMTP id 12so3894862bwz.13 for ; Sat, 12 Mar 2011 09:30:12 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=gamma; h=domainkey-signature:mime-version:date:message-id:subject:from:to :content-type; bh=RhvE6nMLz+tLZTrLt+GXlpwOOlC8GSNAT8PoxavTMr0=; b=q+PQiofAZtKgjNQKmvve+eKWS7mo8SBWdLZXh8s/HhDQxxF28Glwfhx+9GeJl5Yvdp ZU9KfTgpI7sjhGtGyYX2JXExF286+ypX5uf52psCXevdnXH1N1Sc/X1rHmbT4lIaeG/z xaJ+p0xVHZZPvV4BI7yEMWgIEGcPM5z9tVE38= DomainKey-Signature: a=rsa-sha1; c=nofws; d=gmail.com; s=gamma; h=mime-version:date:message-id:subject:from:to:content-type; b=G3+I9OQVyKDHlU4Vp0qg3VHsYnlmxD8hQ5ehG3UzYiNB0Y9k2R1Tx8QD14KybwO9pE LEwARtOwdSccvaMpaCWsKfvxiV3dHLIbf6EXSqjt4mnW4H0/ZxJlEeqrWryTfR2i+J4x s1vR3BwMS0lysVilh+8cGAjbhsFRPREv//jzU= MIME-Version: 1.0 Received: by 10.204.19.14 with SMTP id y14mr2425332bka.187.1299949289018; Sat, 12 Mar 2011 
09:01:29 -0800 (PST) Received: by 10.204.9.215 with HTTP; Sat, 12 Mar 2011 09:01:28 -0800 (PST) Date: Sat, 12 Mar 2011 18:01:28 +0100 Message-ID: From: Squat To: freebsd-fs@freebsd.org Content-Type: text/plain; charset=ISO-8859-1 Subject: ZFS: Booting problem with zfsboot and MBR X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Sat, 12 Mar 2011 17:30:14 -0000 After upgrading my ZFS rootpool from v14 to v15 (and ZFS from v1 to v4) I've been unable to boot my system. At first the booting process of cause complained that the pool was v15 while it expected v14. (Plan was to look into the gpart equivalent for MBR/slices later, sometime). So I booted up from a USB memory stick, imported the pool and dd-ed over /boot/zfsboot from my rootpool as described in: http://wiki.freebsd.org/RootOnZFS/ZFSBootPartition (my system was installed pretty much as described in this guide, as 8.0-RC1, but with ad0 and ad4 in a mirror). (dd if=/mnt/boot/zfsboot of=/dev/ad0s1 count=1 dd if=/mnt/boot/zfsboot of=/dev/ad0s1a skip=1 seek=1024) exported my rootpool and rebooted. No my system doesn't get to complain about zpool version, the booting process just hang on "-" (hyphen). Tried again with the /mnt2/boot/zfsboot from the 8.2-RELEASE memory stick with the same result. Kind of look like the problem described in: http://www.freebsd.org/cgi/query-pr.cgi?pr=153552 even though my zfsboot was from 8.2-RELEASE. So I've tried with zfsboot from 9.0-CURRENT from mid january and 2011-03-10, but the result is the same (expect that I now can reboot from "-" with Ctrl+Alt+Delete). 
Then I went over to that maybe the problem is that I was stupid enough to import my rootpool and use /boot/zfsboot from there (as is the same on the .img I booted from), but as I understand: http://lists.freebsd.org/pipermail/freebsd-fs/2011-February/010813.html the /boot/zfsboot from 9.0-CURRENT 2011-03-10 should be able to boot an exported zpool? I've also tried to replace the boot0 with: gpart bootcode -b /mnt/boot/boot0 ad0 from 8.2-RELEASE. (Still just "-" after I push F1). How can I get up my system again? Or where should I focus my troubleshooting when the booting process just gets to "-" and halts with that? (I'm sorry if I posted this message on the wrong list/channel or anything like that, it's my first.) And thanks to everyone who've contributed to FreeBSD, especially those who've worked with ZFS. -- Tor Halvard Furulund From owner-freebsd-fs@FreeBSD.ORG Sat Mar 12 19:31:36 2011 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id BDD8A106566C; Sat, 12 Mar 2011 19:31:36 +0000 (UTC) (envelope-from jilles@stack.nl) Received: from mx1.stack.nl (relay02.stack.nl [IPv6:2001:610:1108:5010::104]) by mx1.freebsd.org (Postfix) with ESMTP id 5A9DD8FC14; Sat, 12 Mar 2011 19:31:36 +0000 (UTC) Received: from turtle.stack.nl (turtle.stack.nl [IPv6:2001:610:1108:5010::132]) by mx1.stack.nl (Postfix) with ESMTP id 41D96359329; Sat, 12 Mar 2011 20:31:32 +0100 (CET) Received: by turtle.stack.nl (Postfix, from userid 1677) id 2C919173A3; Sat, 12 Mar 2011 20:31:32 +0100 (CET) Date: Sat, 12 Mar 2011 20:31:32 +0100 From: Jilles Tjoelker To: Kostik Belousov Message-ID: <20110312193131.GA97300@stack.nl> References: <20110312170123.GT78089@deviant.kiev.zoral.com.ua> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <20110312170123.GT78089@deviant.kiev.zoral.com.ua> User-Agent: Mutt/1.5.21 (2010-09-15) Cc: 
freebsd-fs@freebsd.org, freebsd-standards@freebsd.org Subject: Re: open(O_NOFOLLOW) error when encountered symlink X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Sat, 12 Mar 2011 19:31:36 -0000 On Sat, Mar 12, 2011 at 07:01:23PM +0200, Kostik Belousov wrote: > Hello, > I noted the following discussion and commits in the gnu tar repository: > http://lists.gnu.org/archive/html/bug-tar/2010-11/msg00080.html > > http://git.savannah.gnu.org/cgit/tar.git/commit/?id=1584b72ff271e7f826dd64d7a1c7cd2f66504acb > http://git.savannah.gnu.org/cgit/tar.git/commit/?id=649b747913d2b289e904b5f1d222af886acd209c > The issue is that in case of open(path, O_NOFOLLOW), when path is naming > a symlink, FreeBSD returns EMLINK error. On the other hand, the POSIX > requirement is absolutely clear that it shall be ELOOP. > I found FreeBSD commit r35088 that specifically changed the error code > from the required ELOOP to EMLINK. I doubt that somebody can remember > a reason for the change done more then 12 years ago. In fact that change was done hours after the new ELOOP error. > Anybody have strong objections against the patch below ? Although it loses information (ELOOP may also be caused by the directory prefix), I think we should make the change. Please move the error condition in open.2 below the other [ELOOP] error. usr.bin/cmp relies on the EMLINK error for the -h option and needs some adjustment. If ELOOP is returned and O_NOFOLLOW is in use, it needs to check using lstat() if the file is a symlink. > diff --git a/lib/libc/sys/open.2 b/lib/libc/sys/open.2 > index deca8bc..20877b5 100644 > --- a/lib/libc/sys/open.2 > +++ b/lib/libc/sys/open.2 > @@ -318,7 +318,7 @@ is specified and the named file would reside on a read-only file system. > The process has already reached its limit for open file descriptors. 
> .It Bq Er ENFILE > The system file table is full. > -.It Bq Er EMLINK > +.It Bq Er ELOOP > .Dv O_NOFOLLOW > was specified and the target is a symbolic link. > .It Bq Er ENXIO > diff --git a/sys/kern/vfs_vnops.c b/sys/kern/vfs_vnops.c > index 7b5cad1..c7985ef 100644 > --- a/sys/kern/vfs_vnops.c > +++ b/sys/kern/vfs_vnops.c > @@ -194,7 +194,7 @@ restart: > vp = ndp->ni_vp; > } > if (vp->v_type == VLNK) { > - error = EMLINK; > + error = ELOOP; > goto bad; > } > if (vp->v_type == VSOCK) { -- Jilles Tjoelker From owner-freebsd-fs@FreeBSD.ORG Sat Mar 12 20:05:38 2011 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 4E216106564A; Sat, 12 Mar 2011 20:05:38 +0000 (UTC) (envelope-from kostikbel@gmail.com) Received: from mail.zoral.com.ua (mx0.zoral.com.ua [91.193.166.200]) by mx1.freebsd.org (Postfix) with ESMTP id BEE568FC08; Sat, 12 Mar 2011 20:05:36 +0000 (UTC) Received: from deviant.kiev.zoral.com.ua (root@deviant.kiev.zoral.com.ua [10.1.1.148]) by mail.zoral.com.ua (8.14.2/8.14.2) with ESMTP id p2CK5Uh8087596 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=NO); Sat, 12 Mar 2011 22:05:30 +0200 (EET) (envelope-from kostikbel@gmail.com) Received: from deviant.kiev.zoral.com.ua (kostik@localhost [127.0.0.1]) by deviant.kiev.zoral.com.ua (8.14.4/8.14.4) with ESMTP id p2CK5UPg045543; Sat, 12 Mar 2011 22:05:30 +0200 (EET) (envelope-from kostikbel@gmail.com) Received: (from kostik@localhost) by deviant.kiev.zoral.com.ua (8.14.4/8.14.4/Submit) id p2CK5Ues045542; Sat, 12 Mar 2011 22:05:30 +0200 (EET) (envelope-from kostikbel@gmail.com) X-Authentication-Warning: deviant.kiev.zoral.com.ua: kostik set sender to kostikbel@gmail.com using -f Date: Sat, 12 Mar 2011 22:05:30 +0200 From: Kostik Belousov To: Jilles Tjoelker Message-ID: <20110312200530.GA78089@deviant.kiev.zoral.com.ua> References: <20110312170123.GT78089@deviant.kiev.zoral.com.ua> 
<20110312193131.GA97300@stack.nl> Mime-Version: 1.0 Content-Type: multipart/signed; micalg=pgp-sha1; protocol="application/pgp-signature"; boundary="FPVPwrFAzg68gQ6L" Content-Disposition: inline In-Reply-To: <20110312193131.GA97300@stack.nl> User-Agent: Mutt/1.4.2.3i X-Spam-Status: No, score=-3.4 required=5.0 tests=ALL_TRUSTED,AWL,BAYES_00, DNS_FROM_OPENWHOIS autolearn=no version=3.2.5 X-Spam-Checker-Version: SpamAssassin 3.2.5 (2008-06-10) on skuns.kiev.zoral.com.ua Cc: freebsd-fs@freebsd.org, freebsd-standards@freebsd.org Subject: Re: open(O_NOFOLLOW) error when encountered symlink X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Sat, 12 Mar 2011 20:05:38 -0000 --FPVPwrFAzg68gQ6L Content-Type: text/plain; charset=us-ascii Content-Disposition: inline Content-Transfer-Encoding: quoted-printable

On Sat, Mar 12, 2011 at 08:31:32PM +0100, Jilles Tjoelker wrote:
> On Sat, Mar 12, 2011 at 07:01:23PM +0200, Kostik Belousov wrote:
> > Hello,
> > I noted the following discussion and commits in the gnu tar repository:
>
> > http://lists.gnu.org/archive/html/bug-tar/2010-11/msg00080.html
> >
> > http://git.savannah.gnu.org/cgit/tar.git/commit/?id=1584b72ff271e7f826dd64d7a1c7cd2f66504acb
> > http://git.savannah.gnu.org/cgit/tar.git/commit/?id=649b747913d2b289e904b5f1d222af886acd209c
>
> > The issue is that in case of open(path, O_NOFOLLOW), when path is naming
> > a symlink, FreeBSD returns EMLINK error. On the other hand, the POSIX
> > requirement is absolutely clear that it shall be ELOOP.
>
> > I found FreeBSD commit r35088 that specifically changed the error code
> > from the required ELOOP to EMLINK. I doubt that somebody can remember
> > a reason for the change done more then 12 years ago.
>
> In fact that change was done hours after the new ELOOP error.
Do you mean r35083, r35084 and r35085?

>
> > Anybody have strong objections against the patch below ?
>
> Although it loses information (ELOOP may also be caused by the directory
> prefix), I think we should make the change.
>
> Please move the error condition in open.2 below the other [ELOOP] error.
>
> usr.bin/cmp relies on the EMLINK error for the -h option and needs some
> adjustment. If ELOOP is returned and O_NOFOLLOW is in use, it needs to
> check using lstat() if the file is a symlink.
This is quite a serious argument against the change, IMO. Also, after
your comment, I found similar code in contrib/xz, and a somewhat
doubtful resolve_symlink() in the rcs sources.

I am withdrawing my proposal.

Just for the record, below is the updated patch:

diff --git a/lib/libc/sys/open.2 b/lib/libc/sys/open.2
index deca8bc..1c9095d 100644
--- a/lib/libc/sys/open.2
+++ b/lib/libc/sys/open.2
@@ -304,6 +304,9 @@ is specified or
 .Dv O_APPEND
 is not specified.
 .It Bq Er ELOOP
+.Dv O_NOFOLLOW
+was specified and the target is a symbolic link.
+.It Bq Er ELOOP
 Too many symbolic links were encountered in translating the pathname.
 .It Bq Er EISDIR
 The named file is a directory, and the arguments specify
@@ -318,9 +321,6 @@ is specified and the named file would reside on a read-only file system.
 The process has already reached its limit for open file descriptors.
 .It Bq Er ENFILE
 The system file table is full.
-.It Bq Er EMLINK
-.Dv O_NOFOLLOW
-was specified and the target is a symbolic link.
 .It Bq Er ENXIO
 The named file is a character special or block special file,
 and the device associated with this special file
diff --git a/sys/kern/vfs_vnops.c b/sys/kern/vfs_vnops.c
index 7b5cad1..c7985ef 100644
--- a/sys/kern/vfs_vnops.c
+++ b/sys/kern/vfs_vnops.c
@@ -194,7 +194,7 @@ restart:
 			vp = ndp->ni_vp;
 		}
 		if (vp->v_type == VLNK) {
-			error = EMLINK;
+			error = ELOOP;
 			goto bad;
 		}
 		if (vp->v_type == VSOCK) {
diff --git a/usr.bin/cmp/cmp.c b/usr.bin/cmp/cmp.c
index f3ac717..84514d0 100644
--- a/usr.bin/cmp/cmp.c
+++ b/usr.bin/cmp/cmp.c
@@ -107,11 +107,14 @@ main(int argc, char *argv[])
 		fd1 = 0;
 		file1 = "stdin";
 	}
-	else if ((fd1 = open(file1, oflag, 0)) < 0 && errno != EMLINK) {
-		if (!sflag)
-			err(ERR_EXIT, "%s", file1);
-		else
-			exit(ERR_EXIT);
+	else if ((fd1 = open(file1, oflag, 0)) < 0) {
+		if (errno != ELOOP || stat(file1, &sb1) == -1 ||
+		    !S_ISLNK(sb1.st_mode)) {
+			if (!sflag)
+				err(ERR_EXIT, "%s", file1);
+			else
+				exit(ERR_EXIT);
+		}
 	}
 	if (strcmp(file2 = argv[1], "-") == 0) {
 		if (special)
@@ -121,11 +124,14 @@ main(int argc, char *argv[])
 		fd2 = 0;
 		file2 = "stdin";
 	}
-	else if ((fd2 = open(file2, oflag, 0)) < 0 && errno != EMLINK) {
-		if (!sflag)
-			err(ERR_EXIT, "%s", file2);
-		else
-			exit(ERR_EXIT);
+	else if ((fd2 = open(file2, oflag, 0)) < 0) {
+		if (errno != ELOOP || stat(file2, &sb2) == -1 ||
+		    !S_ISLNK(sb2.st_mode)) {
+			if (!sflag)
+				err(ERR_EXIT, "%s", file2);
+			else
+				exit(ERR_EXIT);
+		}
 	}
 
 	skip1 = argc > 2 ?
strtol(argv[2], NULL, 0) : 0; --FPVPwrFAzg68gQ6L Content-Type: application/pgp-signature Content-Disposition: inline -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.4.11 (FreeBSD) iEYEARECAAYFAk170goACgkQC3+MBN1Mb4gdCQCgn2kTliOfDaprReESRbgbMJYS aLUAn0YcEQTDRloHKRzF4s6SjlCH38V+ =+Epr -----END PGP SIGNATURE----- --FPVPwrFAzg68gQ6L-- From owner-freebsd-fs@FreeBSD.ORG Sat Mar 12 22:42:57 2011 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id AA556106564A for ; Sat, 12 Mar 2011 22:42:57 +0000 (UTC) (envelope-from peppe.maniscalco@gmail.com) Received: from mail-ey0-f182.google.com (mail-ey0-f182.google.com [209.85.215.182]) by mx1.freebsd.org (Postfix) with ESMTP id 419408FC16 for ; Sat, 12 Mar 2011 22:42:56 +0000 (UTC) Received: by eyg7 with SMTP id 7so1279931eyg.13 for ; Sat, 12 Mar 2011 14:42:56 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=gamma; h=domainkey-signature:mime-version:date:message-id:subject:from:to :content-type; bh=3rAM7VGM4byhJ7xVSJ4so8cn4SBSpVXZcOsAZTi7wfo=; b=cFNtpkX7dm/PyfVokmjKgpW5F13cBd/2XVDPbl1uiMF3tFIQzzK3RRB77rHTyCqHPq uqcYwdG9y2Te+kfMduBm6BLNar/m6g+6VblE5KtlU82BYm5SD4dyI3BCLKYq1SfuI4RQ r9N4LohKLkc9A28qO/SOiTRR1ih6Q3+zUHMEE= DomainKey-Signature: a=rsa-sha1; c=nofws; d=gmail.com; s=gamma; h=mime-version:date:message-id:subject:from:to:content-type; b=eCLC2F8QbYiUQ8YsLV5Bs8xAZuhauZTBdHHU9aV8BpJJq/kr5nO1ejNg15aXs+Tz1R oAmi53GxTJwaEsQy1hNq+DkAWili7T8F1gZa4gdbE2FtVFflm2VPGdb+1WIEU8YmT/J9 fTEhybzZlfSrkV68nWFHAhxFVjX/pvarkmAao= MIME-Version: 1.0 Received: by 10.14.126.205 with SMTP id b53mr3689318eei.41.1299968282781; Sat, 12 Mar 2011 14:18:02 -0800 (PST) Received: by 10.14.122.202 with HTTP; Sat, 12 Mar 2011 14:18:02 -0800 (PST) Date: Sat, 12 Mar 2011 23:18:02 +0100 Message-ID: From: Giuseppe Maniscalco To: freebsd-fs Content-Type: text/plain; charset=ISO-8859-1 Subject: FreeBSD7, pf, carp... 
X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Sat, 12 Mar 2011 22:42:57 -0000 Hi List! I need your help!!! I've two firewalls configured in parallel (connected with a crossover cable) and I use pfsync+carp to failover. So one firewall (A) handles all traffic as MASTER and, if it dies or if some NIC interface go down, the second firewall (B) takes over automatically. Well... As usually everything works properly, but since a few days ago "B" takes control and "A" become backup. But "A" cannot return to be master until rebooting! After reboot, "A" is the master for a while, then I've the same problem... I identified a problem here: fwA# sysctl -a | grep arp net.inet.ip.same_prefix_carp_only: 0 net.inet.carp.allow: 1 net.inet.carp.preempt: 1 net.inet.carp.log: 1 net.inet.carp.arpbalance: 0 net.inet.carp.suppress_preempt: 1 >From man carp: net.inet.carp.suppress_preempt: A read only value showing the status of preemption suppression. Preemption can be suppressed if link on an interface is down or when pfsync(4) interface is not synchronized. Value of 0 means that preemption is not suppressed, since no problems are detected. Every problem increments suppression counter. All my interfaces are UP... now I don't know how to check if pfsync is synched or not... Meanwhile, in B node: fwB# sysctl -a | grep arp net.inet.ip.same_prefix_carp_only: 0 net.inet.carp.allow: 1 net.inet.carp.preempt: 1 net.inet.carp.log: 1 net.inet.carp.arpbalance: 0 net.inet.carp.suppress_preempt: 0 I tried with a tcpdump on the interfaces, but I see just the change of condition (master/backup) with the advskew modification... This is the only strange thing on DMZ interface... 
17:01:32.397429 01:80:c2:00:00:01 (oui Unknown) > 01:80:c2:00:00:01 (oui Unknown), ethertype Unknown (0x8808), length 60:
  0x0000: 0001 ffff 0000 0000 0000 0000 0000 0000 ................
  0x0010: 0000 0000 0000 0000 0000 0000 0000 0000 ................
  0x0020: 0000 0000 0000 0000 0000 0000 0000 ..............

I also tried replacing the NIC, but nothing changed: "A" still loses control after 30 to 45 minutes.

I read somewhere that "pfctl -ss" should give the same result on both nodes:

fwA# pfctl -ss | wc -l
5833
fwB# pfctl -ss | wc -l
5507

Could this be important?

Some additional information:

fwA# more /etc/rc.conf
ifconfig_em0="inet a.a.a.12 netmask 255.255.255.0 polling" ### DMZ ###
ifconfig_em1="inet b.b.b.2 netmask 255.255.0.0 polling" ### CROSSOVER ###
ifconfig_em2="inet c.c.c.189 netmask 255.255.255.224 polling" ### ISP1 ###
ifconfig_em3="inet d.d.d.249 netmask 255.255.255.0 polling" ### ISP2 ###
defaultrouter="a.a.a.1"
#Firewall
pf_enable="YES"
pf_rules="/etc/pf.conf"
pf_flags=""
pflog_enable="YES"
pflog_logfile="/var/log/pflog"
#Failover
pfsync_enable="YES"
pfsync_syncdev="em1"
cloned_interfaces="carp0 carp1 carp2"
ifconfig_carp0="a.a.a.1/24 vhid 1 pass foo"
ifconfig_carp0_alias0="a.a.a.11/24 vhid 1 pass foo"
ifconfig_carp1="d.d.d.14/24 vhid 2 pass bar"
ifconfig_carp1_alias0="d.d.d.2/24 vhid 2 pass bar"
ifconfig_carp2="c.c.c.188/27 vhid 3 pass jack"
ifconfig_carp2_alias0="c.c.c.165/27 vhid 3 pass jack"

fwB# more /etc/rc.conf
ifconfig_ste0="inet a.a.a.13 netmask 255.255.255.0 polling"
ifconfig_ste1="inet b.b.b.3 netmask 255.255.0.0 polling"
ifconfig_em0="inet c.c.c.190 netmask 255.255.255.224 polling"
ifconfig_em1="inet d.d.d.250 netmask 255.255.255.0 polling"
defaultrouter="c.c.c.1"
#Firewall
pf_enable="YES"
pf_rules="/etc/pf.conf"
pf_flags=""
pflog_enable="YES"
pflog_logfile="/var/log/pflog"
#Failover
pfsync_enable="YES"
pfsync_syncdev="ste1"
cloned_interfaces="carp0 carp1 carp2"
ifconfig_carp0="a.a.a.1/24 vhid 1 advskew 128 pass foo"
ifconfig_carp0_alias0="a.a.a.11/24 vhid 1 advskew 128 pass foo"
ifconfig_carp1="d.d.d.14/24 vhid 2 advskew 64 pass bar"
ifconfig_carp1_alias0="d.d.d.2/24 vhid 2 advskew 64 pass bar"
ifconfig_carp2="c.c.c.188/27 vhid 3 advskew 100 pass jack"
ifconfig_carp2_alias0="c.c.c.165/27 vhid 3 advskew 100 pass jack"

In each node's pf.conf I added:

fwA# more pf.conf | grep failover
pass quick on { em1 } proto pfsync # failover
pass on { em0 em2 em3 } proto carp # failover

fwB# more pf.conf | grep failover
pass quick on { ste1 } proto pfsync # failover
pass on { em0 ste0 em1 } proto carp # failover

I hope someone can give me a solution, or maybe just an idea, because I'm going crazy! Please ask if you need further information.

Thank you all!
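
The two checks described in this message (a non-zero net.inet.carp.suppress_preempt counter, and a different "pfctl -ss | wc -l" count on each node) could be scripted roughly as follows. This is only a sketch: it works on text already captured from the commands shown above, the function names (parse_sysctl, preempt_suppressed, state_gap) are made up for illustration, and it does not decide how large a state-count gap is acceptable, since nothing in this message establishes that.

```python
import re

# Sample input copied from the sysctl output quoted in this message.
FWA_SYSCTL = """\
net.inet.ip.same_prefix_carp_only: 0
net.inet.carp.allow: 1
net.inet.carp.preempt: 1
net.inet.carp.log: 1
net.inet.carp.arpbalance: 0
net.inet.carp.suppress_preempt: 1
"""
# fwB's output differs only in the suppress_preempt value.
FWB_SYSCTL = FWA_SYSCTL.replace("suppress_preempt: 1", "suppress_preempt: 0")


def parse_sysctl(text):
    """Parse `name: integer` lines from sysctl output into a dict."""
    values = {}
    for line in text.splitlines():
        match = re.match(r"^\s*([\w.]+):\s*(-?\d+)\s*$", line)
        if match:
            values[match.group(1)] = int(match.group(2))
    return values


def preempt_suppressed(values):
    """True when the suppression counter is non-zero (hypothetical helper)."""
    return values.get("net.inet.carp.suppress_preempt", 0) != 0


def state_gap(count_a, count_b):
    """Difference between two state counts as a fraction of the larger one."""
    larger = max(count_a, count_b)
    return abs(count_a - count_b) / larger if larger else 0.0


if __name__ == "__main__":
    for name, text in (("fwA", FWA_SYSCTL), ("fwB", FWB_SYSCTL)):
        print(name, "preemption suppressed:", preempt_suppressed(parse_sysctl(text)))
    # State counts reported by `pfctl -ss | wc -l` on fwA and fwB.
    print("state table gap: {:.1%}".format(state_gap(5833, 5507)))
```

On a live node the sample strings would be replaced by the real command output, for example captured with subprocess.run(["sysctl", "net.inet.carp"], capture_output=True, text=True).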