From owner-freebsd-fs@FreeBSD.ORG Sun Oct 19 13:16:26 2008
Return-Path:
Delivered-To: freebsd-fs@hub.freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34])
    by hub.freebsd.org (Postfix) with ESMTP id AE0B6106568C;
    Sun, 19 Oct 2008 13:16:26 +0000 (UTC)
    (envelope-from linimon@FreeBSD.org)
Received: from freefall.freebsd.org (freefall.freebsd.org [IPv6:2001:4f8:fff6::28])
    by mx1.freebsd.org (Postfix) with ESMTP id 83F398FC19;
    Sun, 19 Oct 2008 13:16:26 +0000 (UTC)
    (envelope-from linimon@FreeBSD.org)
Received: from freefall.freebsd.org (linimon@localhost [127.0.0.1])
    by freefall.freebsd.org (8.14.3/8.14.3) with ESMTP id m9JDGQha030234;
    Sun, 19 Oct 2008 13:16:26 GMT
    (envelope-from linimon@freefall.freebsd.org)
Received: (from linimon@localhost)
    by freefall.freebsd.org (8.14.3/8.14.3/Submit) id m9JDGP37030230;
    Sun, 19 Oct 2008 13:16:25 GMT
    (envelope-from linimon)
Date: Sun, 19 Oct 2008 13:16:25 GMT
Message-Id: <200810191316.m9JDGP37030230@freefall.freebsd.org>
To: johan@giantfoo.org, linimon@FreeBSD.org, freebsd-bugs@FreeBSD.org,
    freebsd-fs@FreeBSD.org
From: linimon@FreeBSD.org
Cc:
Subject: Re: kern/119868: [zfs] [patch] 7.0 kernel panic during boot with ZFS and WD1600JS
X-BeenThere: freebsd-fs@freebsd.org
X-Mailman-Version: 2.1.5
Precedence: list
List-Id: Filesystems
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
X-List-Received-Date: Sun, 19 Oct 2008 13:16:26 -0000

Old Synopsis: [zfs] 7.0 kernel panic during boot with ZFS and WD1600JS
New Synopsis: [zfs] [patch] 7.0 kernel panic during boot with ZFS and WD1600JS

State-Changed-From-To: open->analyzed
State-Changed-By: linimon
State-Changed-When: Sun Oct 19 13:15:19 UTC 2008
State-Changed-Why:
Patch has been submitted and has been confirmed as fixing the problem.

Responsible-Changed-From-To: freebsd-bugs->freebsd-fs
Responsible-Changed-By: linimon
Responsible-Changed-When: Sun Oct 19 13:15:19 UTC 2008
Responsible-Changed-Why:

http://www.freebsd.org/cgi/query-pr.cgi?pr=119868

From owner-freebsd-fs@FreeBSD.ORG Mon Oct 20 11:06:51 2008
Date: Mon, 20 Oct 2008 11:06:50 GMT
Message-Id: <200810201106.m9KB6o7P082647@freefall.freebsd.org>
From: FreeBSD bugmaster
To: freebsd-fs@FreeBSD.org
Subject: Current problem reports assigned to freebsd-fs@FreeBSD.org

Note: to view an individual PR, use:
  http://www.freebsd.org/cgi/query-pr.cgi?pr=(number).

The following is a listing of current problems submitted by FreeBSD users.
These represent problem reports covering all versions including
experimental development code and obsolete releases.

S Tracker      Resp.  Description
--------------------------------------------------------------------------------
o kern/128173  fs     [ext2fs] ls gives "Input/output error" on mounted ext3
o kern/127420  fs     [gjournal] [panic] Journal overflow on gmirrored gjour
o kern/127213  fs     [tmpfs] sendfile on tmpfs data corruption
o kern/127029  fs     [panic] mount(8): trying to mount a write protected zi
o kern/126287  fs     [ufs] [panic] Kernel panics while mounting an UFS file
o kern/125536  fs     [ext2fs] ext 2 mounts cleanly but fails on commands li
o kern/125149  fs     [nfs][panic] changing into .zfs dir from nfs client ca
o kern/124621  fs     [ext3] Cannot mount ext2fs partition
o kern/122888  fs     [zfs] zfs hang w/ prefetch on, zil off while running t
o bin/122172   fs     [fs]: amd(8) automount daemon dies on 6.3-STABLE i386,
o bin/121072   fs     [smbfs] mount_smbfs(8) cannot normally convert the cha
a kern/119868  fs     [zfs] [patch] 7.0 kernel panic during boot with ZFS an
o bin/118249   fs     mv(1): moving a directory changes its mtime
o kern/116170  fs     [panic] Kernel panic when mounting /tmp
o kern/114955  fs     [cd9660] [patch] [request] support for mask,dirmask,ui
o kern/114847  fs     [ntfs] [patch] [request] dirmask support for NTFS ala
o kern/114676  fs     [ufs] snapshot creation panics: snapacct_ufs2: bad blo
o bin/114468   fs     [patch] [request] add -d option to umount(8) to detach
o bin/113838   fs     [patch] [request] mount(8): add support for relative p
o bin/113049   fs     [patch] [request] make quot(8) use getopt(3) and show
o kern/112658  fs     [smbfs] [patch] smbfs and caching problems (resolves b
o kern/93942   fs     [vfs] [patch] panic: ufs_dirbad: bad dir (patch from D

22 problems total.

From owner-freebsd-fs@FreeBSD.ORG Tue Oct 21 08:36:18 2008
Date: Tue, 21 Oct 2008 10:36:14 +0200
From: Ulf Lilleengen
To: andys
Cc: freebsd-fs@freebsd.org
Message-ID: <20081021083415.GA1571@carrot.studby.ntnu.no>
Subject: Re: bsdlabel partiton c error message on new install

On Fri, Oct 17, 2008 at 04:16:14pm +0100, andys wrote:
> Hi,
>
> on a newly installed FreeBSD 7.0 system on a Dell 1950 server I see the
> following error from bsdlabel. Are there any known issues with this or is the
> only reasonable explanation that I have managed to mess it up without even
> knowing? :P And should I manually change the partition c to fix the prob? Is
> this safe to do?
>
> bsdlabel -A /dev/da0s1
> # /dev/da0s1:
> type: SCSI
> disk: da0s1
> label:
> flags:
> bytes/sector: 512
> sectors/track: 63
> tracks/cylinder: 255
> sectors/cylinder: 16065
> cylinders: 17750
> sectors/unit: 285155328
> rpm: 3600
> interleave: 1
> trackskew: 0
> cylinderskew: 0
> headswitch: 0           # milliseconds
> track-to-track seek: 0  # milliseconds
> drivedata: 0
>
> 8 partitions:
> #        size    offset    fstype   [fsize bsize bps/cpg]
>   a:  20971520         0    4.2BSD     2048 16384 28552
>   b:  20971520  75497472      swap
>   c: 285153687         0    unused        0     0         # "raw" part, don't 285155328
> edit
>   d:  20971520  20971520    4.2BSD     2048 16384 28552
>   e:  20971520  41943040    4.2BSD     2048 16384 28552
>   f:  12582912  62914560    4.2BSD     2048 16384 28552
> bsdlabel: partition c doesn't cover the whole unit!
> bsdlabel: An incorrect partition c may cause problems for standard system
> utilities
>
>
> thanks for any advice, I'm not really confident with the FreeBSD disk
> management as I haven't used it much,

Hello,

This is completely ok. The reason you might get warnings like this is that
fdisk tries to put the sector number on a cylinder boundary. If that means
that the partition is larger than the actual disklabel size, that is ok.
What would have been a problem is if the disklabel extended past the
partition size! (I think the installer makes sure this does not happen.)

You do waste a few sectors because of this, but unless you are really
interested in getting them back, I would not bother with it. One way to
"fix" it is to do a bsdlabel -e and change
  c: 285153687 0 unused 0 0
to
  c: 285155328 0 unused 0 0

But again, not many sectors are currently wasted.

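For a sense of scale, the gap between the slice size and partition c can be worked out directly. This is only a quick sketch in Python; the variable names are illustrative, and the numbers are the `sectors/unit`, `bytes/sector` and partition c values from the bsdlabel output quoted in this message.

```python
# Values copied from the bsdlabel output quoted in this thread.
BYTES_PER_SECTOR = 512
sectors_per_unit = 285155328   # "sectors/unit" reported for da0s1
c_partition_size = 285153687   # size of partition c in the label

# Sectors of the slice that partition c does not cover.
uncovered_sectors = sectors_per_unit - c_partition_size
uncovered_bytes = uncovered_sectors * BYTES_PER_SECTOR

print(f"{uncovered_sectors} sectors, {uncovered_bytes} bytes")
```

That comes to well under one megabyte on a disk of roughly 146 GB, which is why reclaiming it is hardly worth the effort.
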
--
Ulf Lilleengen

From owner-freebsd-fs@FreeBSD.ORG Tue Oct 21 09:42:08 2008
From: "andys"
To: Ulf Lilleengen
Cc: freebsd-fs@freebsd.org
Date: Tue, 21 Oct 2008 10:42:06 +0100
References: <20081021083415.GA1571@carrot.studby.ntnu.no>
In-Reply-To: <20081021083415.GA1571@carrot.studby.ntnu.no>
Subject: Re: bsdlabel partiton c error message on new install

Hi Ulf,

thanks a lot for your answer. Previously I'd asked this question on the
freebsd-questions list and someone suggested asking it here as they didn't
know the answer. However, I did get pretty much 2 responses telling me to
reinstall the OS!! :S

For example I had this answer:

http://lists.freebsd.org/pipermail/freebsd-questions/2008-October/184617.html

So I assume you would disagree with this and the other person who advised me
this was a serious error? And if this actually isn't a problem, does bsdlabel
need to be updated (and the man page) to reflect the fact this can be seen on
a healthy system?

thanks a lot!
Andy.

>
> This is completely ok.
> The reason you might get warnings like this is that fdisk tries to put the
> sector number on a cylinder boundary. If that means that the partition is
> larger than the actual disklabel size, that is ok. What would have been a
> problem is if the disklabel extended past the partition size! (I think the
> installer makes sure this does not happen.)
>
> You do waste a few sectors because of this, but unless you are really
> interested in getting them back, I would not bother with it. One way to
> "fix" it is to do a bsdlabel -e and change
>   c: 285153687 0 unused 0 0
> to
>   c: 285155328 0 unused 0 0
>
> But again, not many sectors are currently wasted.
>
> --
> Ulf Lilleengen

From owner-freebsd-fs@FreeBSD.ORG Tue Oct 21 10:15:15 2008
Date: Tue, 21 Oct 2008 02:59:13 -0700
From: Jeremy Chadwick
To: andys
Cc: freebsd-fs@freebsd.org
Message-ID: <20081021095913.GA26955@icarus.home.lan>
References: <20081021083415.GA1571@carrot.studby.ntnu.no>
Subject: Re: bsdlabel partiton c error message on new install

On Tue, Oct 21, 2008 at 10:42:06AM +0100, andys wrote:
> Hi Ulf,
>
> thanks a lot for your answer. Previously I'd asked this question on the
> freebsd-questions list and someone suggested asking it here as they didn't
> know the answer. However, I did get pretty much 2 responses telling me to
> reinstall the OS!! :S
>
> For example I had this answer:
>
> http://lists.freebsd.org/pipermail/freebsd-questions/2008-October/184617.html
>
> So I assume you would disagree with this and the other person who advised
> me this was a serious error? And if this actually isn't a problem, does
> bsdlabel need to be updated (and the man page) to reflect the fact this
> can be seen on a healthy system?

Part of the problem is that you're "tinkering with bsdlabel" when most
users simply create slices and partitions and don't bother to look at
the results -- they build it all, install, and don't worry about it.

I'm sure if I ran bsdlabel and saw what you did, I'd be concerned too, so
you did the right thing by asking. All the systems I maintain have the
c slice offset at zero, but Ulf's explanation makes perfect sense. (I
believe even Windows does something similar to this, except it leaves
the leftovers at the end of the partition table for alignment.)

Comparatively, there's the silly "cylinder geometry" warning that
sysinstall spits out prior to launching into slice manipulation.
It's silly in the majority of cases, but apparently it's legitimate when
it comes to older/smaller disks, particularly SCSI. That said, you
should see the look on Linux users' faces when they see it -- a look of
fear, followed by someone saying "You can ignore that", followed by
"...then what the hell is the point of printing it?!!" :-)

--
| Jeremy Chadwick                                jdc at parodius.com |
| Parodius Networking                       http://www.parodius.com/ |
| UNIX Systems Administrator                  Mountain View, CA, USA |
| Making life hard for others since 1977.              PGP: 4BD6C0CB |

From owner-freebsd-fs@FreeBSD.ORG Tue Oct 21 12:14:12 2008
Date: Tue, 21 Oct 2008 14:13:33 +0200
From: Ulf Lilleengen
To: andys
Cc: freebsd-fs@freebsd.org
Message-ID: <20081021121332.GA2280@carrot.studby.ntnu.no>
References: <20081021083415.GA1571@carrot.studby.ntnu.no>
Subject: Re: bsdlabel partiton c error message on new install

On Tue, Oct 21, 2008 at 10:42:06am +0100, andys wrote:
> Hi Ulf,
>
> thanks a lot for your answer. Previously I'd asked this question on the
> freebsd-questions list and someone suggested asking it here as they didn't
> know the answer. However, I did get pretty much 2 responses telling me to
> reinstall the OS!! :S
>
> For example I had this answer:
>
> http://lists.freebsd.org/pipermail/freebsd-questions/2008-October/184617.html
>
> So I assume you would disagree with this and the other person who advised me
> this was a serious error? And if this actually isn't a problem, does bsdlabel
> need to be updated (and the man page) to reflect the fact this can be seen
> on a healthy system?
>

Well, this really depends on whether you did this on purpose or whether it
was created this way by fdisk. There are many factors which can influence
this, since it is not necessarily something done by the system itself (I
really doubt that the label size will change without the user being notified
if the user did not make any such request, but then again, perhaps it should
be checked in the utilities).

If the disklabels were shortened after file system creation and the
filesystem really expects a larger label, you might be in trouble (when you
require the filesystem to do something involving the provider size, which in
this case might be a slice which has changed size, you might have trouble).

You should perhaps do some testing with fsck and see if it complains. Or
perhaps find out the real size that UFS expects (I am not really sure how).
If it is the same as the current label size, you are safe.

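A related check that needs no knowledge of UFS internals is whether any partition entry in the label runs past the end of partition c. The Python sketch below does only that, using the sizes and offsets from the bsdlabel output earlier in this thread. It does not answer the question of what size the filesystem itself expects, and the helper name `overruns` is made up for this illustration.

```python
# Partition entries (size, offset) in sectors, copied from the bsdlabel
# output earlier in this thread; C_SIZE is the size of partition c.
C_SIZE = 285153687
partitions = {
    "a": (20971520, 0),
    "b": (20971520, 75497472),
    "d": (20971520, 20971520),
    "e": (20971520, 41943040),
    "f": (12582912, 62914560),
}


def overruns(parts, limit):
    """Return the names of partitions that end beyond sector count `limit`."""
    return sorted(name for name, (size, offset) in parts.items()
                  if offset + size > limit)


bad = overruns(partitions, C_SIZE)
print("partitions past the end of c:", bad or "none")
```

With the values from this label, every entry ends well inside partition c, so the label is at least internally consistent.
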
--
Ulf Lilleengen

From owner-freebsd-fs@FreeBSD.ORG Tue Oct 21 14:18:35 2008
From: "andys"
To: freebsd-fs@freebsd.org
Date: Tue, 21 Oct 2008 15:18:33 +0100
References: <20081021083415.GA1571@carrot.studby.ntnu.no>
    <20081021121332.GA2280@carrot.studby.ntnu.no>
In-Reply-To: <20081021121332.GA2280@carrot.studby.ntnu.no>
Subject: Re: bsdlabel partiton c error message on new install

Hi,

ok, so I have attempted to proceed with my original task which was to create
a new UFS2 partition (using sysinstall).
Having chosen "c" and then "w" from the label section, I receive the
following error:

Error mounting /dev/da0s1g on /export : No such file or directory

After exiting sysinstall, I can see from bsdlabel:

8 partitions:
#        size    offset    fstype   [fsize bsize bps/cpg]
  a:  20971520         0    4.2BSD        0     0     0
  b:  20971520  75497472      swap
  c: 285153687         0    unused        0     0         # "raw" part, don't edit
  d:  20971520  20971520    4.2BSD        0     0     0
  e:  20971520  41943040    4.2BSD        0     0     0
  f:  12582912  62914560    4.2BSD        0     0     0
  g: 146800640  96468992    4.2BSD        0     0     0
bsdlabel: partition c doesn't cover the whole unit!

"g" is my new partition. Under /dev however I don't see the device file:

ls /dev/da0*
/dev/da0    /dev/da0s1a  /dev/da0s1c  /dev/da0s1e
/dev/da0s1  /dev/da0s1b  /dev/da0s1d  /dev/da0s1f

Can anyone help :(

thanks a lot,
Andy.

From owner-freebsd-fs@FreeBSD.ORG Wed Oct 22 11:23:34 2008
Date: Wed, 22 Oct 2008 03:56:51 -0700 (PDT)
From: Simun Mikecin
Reply-To: numisemis@yahoo.com
To: Anthony Chavez
Cc: freebsd-fs@freebsd.org, freebsd-questions@freebsd.org
Message-ID: <207982.25549.qm@web36608.mail.mud.yahoo.com>
Subject: (no subject)

> I will soon be installing an Areca ARC-1110 and 3x 1.5TB Seagate
> Barracuda SATAs into a 3.2GHz Northwood P4 with 1GB of RAM, and I'm
> wondering which would be the most stable filesystem to use.
> I've read the bigdisk page [1] and the various information about ZFS on
> the FreeBSD Wiki [2]. I'm aware of the tuning requirements that ZFS
> requires, and upgrading to 4GB of RAM would be quite possible as it was
> understood beforehand that ZFS requires a large quantity of it.
> My questions are as follows.
> 1. I'm aware of the fact that ZFS works better on 64-bit platforms, and
> that alone has me thinking that it's not a good fit for this particular
> machine. But apart from that, it seems that ZFS is not yet stable
> enough for my environment (only about 25 users but in production
> nonetheless). To me, [3] paints all sorts of ugly pictures, which can
> be summarized as "count on ZFS-related panics and deadlocks happening
> fairly regularly" and "disabling ZIL in the interest of stability will
> put your data at risk." Comments about live systems using ZFS (on
> 7.0-RELEASE or 7-STABLE) would be appreciated.

I'm using 7.0-RELEASE/amd64 with ZFS on several machines without any
stability problems.
Here are the configs (prefetch is disabled for performance reasons):

- with 1GB RAM (with just 1GB of RAM the system would probably be faster
  using UFS2 instead of ZFS):
    vfs.zfs.prefetch_disable=1
    vm.kmem_size="512M"
    vfs.zfs.arc_max="150M"
  (at first it was 200M, but lots of swapping made me reduce it)
- with 2GB RAM:
    vfs.zfs.prefetch_disable=1
    vm.kmem_size="950M"
  (could be higher, even 1536M, but then there is not much RAM left for
  your apps)
- with 8GB RAM:
    vfs.zfs.prefetch_disable=1
    vm.kmem_size="1536M"
  (this could probably be higher: up to 2047M, but I haven't tried it)

The general rule for making it stable is to make the difference between
vm.kmem_size and vfs.zfs.arc_max larger. vfs.zfs.arc_max is by default 3/4
of vm.kmem_size. You can achieve this by making vm.kmem_size bigger (but
this leaves less memory for your applications) or by reducing
vfs.zfs.arc_max (but this reduces performance, since less memory will be
available for caching).

The problem with stability comes when kmem usage is at its peak. arc_max is
just a threshold above which some of the cache gets deleted. But in some
cases (high I/O activity) the cache will grow faster than the thread that
shrinks it (to a size less than arc_max) can delete.

> 2. [1] appears to be a bit dated. Nevertheless, I'm inclined to think
> that the status described there (as well as in various man pages) still
> applies to UFS2 on 7.0-RELEASE. Please correct me if I'm wrong or let
> me know if the state of affairs has improved significantly in 7-STABLE.
> 2a. Does the information contained in [1] apply to ZFS as well?

[1] is outdated. GEOM, GPT, UFS2 and ZFS are safe to use for many hundreds
of terabytes. What is limited is the MBR partitioning used by fdisk (2TB
limit).

> 3. As the array will be for data only and not be booted, will it be
> possible to use fdisk to slice it up, or will I need to use gpt?

fdisk can be used for slicing disks that are up to 2TB. But I would
recommend using GPT (which doesn't have this limit) instead. There is no
reason not to.

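The 2TB figure follows from the MBR layout: each slice entry stores its sector count in a 32-bit field. The Python sketch below works that out, assuming 512-byte sectors; the function name `fits_in_mbr_slice` is invented for this example, and the disk sizes are the nominal 1.5 TB drives from the question.

```python
# Assumptions: 512-byte sectors and the classic MBR layout, in which each
# slice entry stores its sector count in a 32-bit field.
SECTOR_BYTES = 512
MBR_MAX_SECTORS = 2**32 - 1

# Largest slice a single MBR entry can describe: just under 2 TiB.
mbr_limit_bytes = MBR_MAX_SECTORS * SECTOR_BYTES


def fits_in_mbr_slice(size_bytes):
    """True if a slice of this size can be described by one MBR entry."""
    return size_bytes <= mbr_limit_bytes


TB = 10**12  # drive vendors quote decimal terabytes
for label, size in [("one 1.5 TB disk", 15 * TB // 10),
                    ("three 1.5 TB disks combined", 45 * TB // 10)]:
    print(label, "fits in an MBR slice" if fits_in_mbr_slice(size) else "needs GPT")
```

A single 1.5 TB disk still fits in one slice, but any volume built from two or more of them does not, which is the practical argument for GPT here.
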
> 4. My planned course of action will be to attempt to newfs the device
> itself (da0, all 3TB of it) or 1 full-disk slice (da0s1). Failing that,
> I will attempt to gconcat da0s1 and da0s2 (1.5TB each), although I
> suspect that may not work since for one thing, growfs is not yet 64-bit
> clean. In either case, I'm very interested in using gbde/geli to
> encrypt the fs. If either of these paths are not possible or
> recommended, are there any suggestions for alternate means of creating a
> 3TB fs?

If you go the ZFS route instead of UFS2, don't make one logical array (da0)
from your disks in your RAID controller. Instead, export each physical disk
as its own logical array (so you have da0, da1 and da2), so that you can use
the ZFS RAID functionality.

If you are going to use UFS2, you must use gjournal for disks of that size
(or you will have background fscks that can last for ages and die a horrible
death saying not enough memory).

From owner-freebsd-fs@FreeBSD.ORG Thu Oct 23 11:27:47 2008
From: "andys"
To: "andys"
Cc: freebsd-fs@freebsd.org, freebsd-questions@freebsd.org
Date: Thu, 23 Oct 2008 13:27:45 +0200
Subject: Re: bsdlabel partiton c error message on new install

Hi,

the below was resolved by rebooting the server. After a reboot the device
file /dev/da0s1g has been created. However, this doesn't seem completely
normal, as sysinstall obviously expected to see the new device file
immediately. Perhaps there is a prob with my system, or is there just a
problem with the expectations of sysinstall?? :S

cheers Andy.

andys writes:

> Hi,
>
> ok, so I have attempted to proceed with my original task which was to
> create a new UFS2 partition (using sysinstall). Having chosen "c" and then
> "w" from the label section, I receive the following error:
>
> Error mounting /dev/da0s1g on /export : No such file or directory
>
> After exiting sysinstall, I can see from bsdlabel:
>
> 8 partitions:
> #        size    offset    fstype   [fsize bsize bps/cpg]
>   a:  20971520         0    4.2BSD        0     0     0
>   b:  20971520  75497472      swap
>   c: 285153687         0    unused        0     0         # "raw" part, don't
> edit
>   d:  20971520  20971520    4.2BSD        0     0     0
>   e:  20971520  41943040    4.2BSD        0     0     0
>   f:  12582912  62914560    4.2BSD        0     0     0
>   g: 146800640  96468992    4.2BSD        0     0     0
> bsdlabel: partition c doesn't cover the whole unit!
>
> "g" is my new partition. Under /dev however I don't see the device file:
>
> ls /dev/da0*
> /dev/da0    /dev/da0s1a  /dev/da0s1c  /dev/da0s1e
> /dev/da0s1  /dev/da0s1b  /dev/da0s1d  /dev/da0s1f
>
> Can anyone help :(
>
> thanks a lot,
> Andy.

From owner-freebsd-fs@FreeBSD.ORG Fri Oct 24 16:36:59 2008
From: Thierry Herbelot
Reply-To: thierry@herbelot.com
To: hackers@freebsd.org
Cc: freebsd-fs@freebsd.org
Date: Fri, 24 Oct 2008 18:18:36 +0200
Message-Id: <200810241818.37262.thierry@herbelot.com>
Subject: question about sb->st_blksize in src/sys/kern/vfs_vnops.c

Hello,

the [SUBJ] file contains the following extract (around line 705):

         * Default to PAGE_SIZE after much discussion.
         * XXX: min(PAGE_SIZE, vp->v_bufobj.bo_bsize) may be more correct.
         */

        sb->st_blksize = PAGE_SIZE;

which arrived around four years ago, with revision 1.211 (see
http://www.freebsd.org/cgi/cvsweb.cgi/src/sys/kern/vfs_vnops.c.diff?r1=1.210;r2=1.211;f=h)

The net effect of this change is to decrease the block buffer size used in
libc/stdio from 16 kbytes (derived from the underlying ufs partition) to
PAGE_SIZE == 4 kbytes (fixed value), and consequently the I/O bandwidth is
lowered (this is on a slow Flash).

I have patched the kernel with a larger, fixed value (simply 4*PAGE_SIZE, to
revert to the block size previously used), and the kernel and world seem to
be running fine.

Seeing the XXX comment above, I'm a bit worried about keeping this new
st_blksize value.

Are there any drawbacks to running with this bigger buffer size value?

TfH

From owner-freebsd-fs@FreeBSD.ORG Sat Oct 25 08:35:41 2008
Date: Sat, 25 Oct 2008 10:19:46 +0200
From: Goran Lowkrantz
To: freebsd-fs@freebsd.org
Message-ID: <2B8727C356B2422602B9425C@[10.255.253.2]>
Subject: Whatever happened to autofs?

Found this:

Anyone know what happened with autofs?

/glz

---
Never attribute to malice what can adequately be explained by incompetence.

From owner-freebsd-fs@FreeBSD.ORG Sat Oct 25 15:05:45 2008 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 4D9C9106567F; Sat, 25 Oct 2008 15:05:45 +0000 (UTC) (envelope-from thierry.herbelot@laposte.net) Received: from postfix1-g20.free.fr (postfix1-g20.free.fr [212.27.60.42]) by mx1.freebsd.org (Postfix) with ESMTP id D207A8FC0A; Sat, 25 Oct 2008 15:05:44 +0000 (UTC) (envelope-from thierry.herbelot@laposte.net) Received: from smtp6-g19.free.fr (smtp6-g19.free.fr [212.27.42.36]) by postfix1-g20.free.fr (Postfix) with ESMTP id 34AF22D0A870; Sat, 25 Oct 2008 16:39:38 +0200 (CEST) Received: from smtp6-g19.free.fr (localhost.localdomain [127.0.0.1]) by smtp6-g19.free.fr (Postfix) with ESMTP id BF37E19799; Sat, 25 Oct 2008 16:39:36 +0200 (CEST) Received: from mail.herbelot.nom (bne75-4-82-227-159-103.fbx.proxad.net [82.227.159.103]) by smtp6-g19.free.fr (Postfix) with ESMTP id 58A6D1977D; Sat, 25 Oct 2008 16:39:35 +0200 (CEST) Received: from diversion.herbelot.nom (diversion.herbelot.nom [192.168.2.6]) by mail.herbelot.nom (8.14.1/8.14.1) with ESMTP id m9PEdNh2028982; Sat, 25 Oct 2008 16:39:25 +0200 (CEST) From: Thierry Herbelot To: Bruce Evans Date: Sat, 25 Oct 2008 16:39:17 +0200 User-Agent: KMail/1.9.10 References: <200810241818.37262.thierry@herbelot.com> <20081025203549.C76165@delplex.bde.org> In-Reply-To: <20081025203549.C76165@delplex.bde.org> X-Warning: Windows can lose your files X-Op-Sys: Le FriBi de la mort qui tue X-Org: TfH&Co X-MailScanner: Found to be clean MIME-Version: 1.0 Content-Type: text/plain; charset="iso-8859-1" Content-Transfer-Encoding: 8bit Content-Disposition: inline Message-Id: <200810251639.17586.thierry.herbelot@laposte.net> Cc: freebsd-fs@freebsd.org, hackers@freebsd.org Subject: Re: question about sb->st_blksize in src/sys/kern/vfs_vnops.c X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list 
List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Sat, 25 Oct 2008 15:05:45 -0000
On Saturday 25 October 2008, Bruce Evans wrote:
> On Fri, 24 Oct 2008, Thierry Herbelot wrote:
> > the [SUBJ] file contains the following extract (around line 705):
> >
> >	 * Default to PAGE_SIZE after much discussion.
> >	 * XXX: min(PAGE_SIZE, vp->v_bufobj.bo_bsize) may be more correct.
> >	 */
> >
> >	sb->st_blksize = PAGE_SIZE;
> >
> > which arrived around four years ago, with revision 1.211 (see
> > http://www.freebsd.org/cgi/cvsweb.cgi/src/sys/kern/vfs_vnops.c.diff?r1=1.210;r2=1.211;f=h)
>
> Indeed, this was completely broken long ago (in 1.211). Before then, and
> after 1.128, some cases worked as intended if not perfectly:
> - regular files: file systems still set va_blksize to their idea of the
>   best i/o size (normally to the file system block size, which is
>   normally larger than PAGE_SIZE and probably better in all cases) and
>   this was used here. However, for regular files, the fs block size
>   and the application's i/o size are almost irrelevant in most cases
>   due to vfs clustering. Most large i/o's are done physically with
>   the cluster size (which due to a related bug suite ends up being
>   hard-coded to MAXPHYS (128K) at a minor cost when this is different
>   from the best size).
> - disk files: non-broken device drivers set si_iosize_best to their idea
>   of the best i/o size (normally to the max i/o size, which is normally
>   better than PAGE_SIZE) and this was used here. The bogus default
>   of BLKDEV_IOSIZE was used for broken drivers (this is bogus because it
>   was for the buffer cache implementation for block devices which no
>   longer exist and was too small for them anyway).
> - non-disk character-special files: the default of PAGE_SIZE was used.
>   The comment about defaulting to PAGE_SIZE was added in 1.128 and is
>   mainly for this case. Now the comment is nonsense since the value is
>   fixed, not a default.
> - other file types (fifos, pipes, sockets, ...): these got the default of
>   PAGE_SIZE too.
>
> In rev.1.1, st_blksize was set to va_blksize in all cases. So file systems
> were supposed to set va_blksize reasonably in all cases, but this is not
> easy and they did nothing good except for regular files.

Agreed. Anyway, the comment by phk about using ioctl(DIOCGSECTORSIZE) applies.

>
> Versions between 1.2 and 1.127 did weird things like defaulting to DFLTPHYS
> (64K) for most cdevs but using a small size like BLKDEV_IOSIZE (2K) for
> disks. This gave nonsense like 64K buffers for slow tty devices (keyboards)
> and 2K buffers for fast disks. At least for programs that trust st_blksize
> to be reasonable. Fortunately, st_blksize is rarely used...
>
> > the net effect of this change is to decrease the block buffer size used
> > in libc/stdio from 16 kbytes (derived from the underlying UFS partition)
> > to PAGE_SIZE == 4 kbytes (fixed value), and consequently the I/O bandwidth
> > is lowered (this is on a slow Flash).
>
> ... except it is used by stdio. (Another mess here is that stdio mostly
> doesn't use its own BUFSIZ. It trusts st_blksize if fstat() to determine

This is indeed what I saw, meandering between libc and the vfs part of the
kernel. In fact, I was essentially wondering whether st_blksize was used
*elsewhere*, and whether bumping the value could break some memory
allocation ...

> st_blksize works. Of course, the existence of BUFSIZ is a related
> historical mistake -- no fixed size can work best for all cases. But
> when BUFSIZ is used, it is an even worse default than PAGE_SIZE.)

(as it is even smaller?)

>
> It's interesting that you can see the difference. Clustering is especially
> good for hiding slowness on slow devices. Maybe you are using a
> configuration that makes clustering ineffective. Mounting the file system
> with -o sync or equivalently, doing a sync after every (too-small) write
> would do it. Otherwise, writes are normally delayed until the next cluster
> boundary.

My use case is small (buffered) writes to a file of between 4 kbytes and
16 kbytes. For example, writing a 16-kbyte file with an st_blksize of 4k is
twice as slow as with 16k (220 ms compared to 110 ms). The penalty is smaller
for an 8-kbyte file (105 ms vs 66 ms).

>
> > I have patched the kernel with a larger, fixed value (simply 4*PAGE_SIZE,
> > to revert to the block size previously used), and the kernel and world
> > seem to be running fine.
> >
> > Seeing the XXX comment above, I'm a bit worried about keeping this new
> > st_blksize value.
> >
> > Are there any drawbacks to running with this bigger buffer size value?
>
> Mostly it doesn't matter, since buffering (clustering) hides the
> differences.

(as seen before, mostly)

> Without clustering, 16K is a much better default for disks
> than 4K, though not as good as the non-default va_blksize for regular
> files. Newer disks might prefer 32K or 64K, but then the fs block size
> should also be increased from 16K. Otherwise, increasing the block size
> usually reduces performance, by thrashing caches or increasing latencies.
> With modern cache sizes and disk speeds, you won't see these effects for a
> block size of 64K, so defaulting to 64K would be reasonable for disks. It
> would be silly for keyboards, but with modern memory sizes you would notice
> this even less than when old versions used that size.

OK, thanks for the answer: I will subject the change to more stress tests and
hope to shake out any problems before putting it into production.
	TfH
>
> Bruce
From owner-freebsd-fs@FreeBSD.ORG Sat Oct 25 19:46:24 2008 Return-Path: Delivered-To: freebsd-fs@FreeBSD.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id C018C1065671; Sat, 25 Oct 2008 19:46:24 +0000 (UTC) (envelope-from brde@optusnet.com.au) Received: from fallbackmx10.syd.optusnet.com.au (fallbackmx10.syd.optusnet.com.au [211.29.132.251]) by mx1.freebsd.org (Postfix) with ESMTP id 66B8A8FC22; Sat, 25 Oct 2008 19:46:23 +0000 (UTC) (envelope-from brde@optusnet.com.au) Received: from mail01.syd.optusnet.com.au (mail01.syd.optusnet.com.au [211.29.132.182]) by fallbackmx10.syd.optusnet.com.au (8.13.1/8.13.1) with ESMTP id m9PB9NuK027440; Sat, 25 Oct 2008 22:09:23 +1100 Received: from c122-106-151-199.carlnfd1.nsw.optusnet.com.au (c122-106-151-199.carlnfd1.nsw.optusnet.com.au [122.106.151.199]) by mail01.syd.optusnet.com.au (8.13.1/8.13.1) with ESMTP id m9PB9HtJ029625 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=NO); Sat, 25 Oct 2008 22:09:20 +1100 Date: Sat, 25 Oct 2008 22:09:17 +1100 (EST) From: Bruce Evans X-X-Sender: bde@delplex.bde.org To: Thierry Herbelot In-Reply-To: <200810241818.37262.thierry@herbelot.com> Message-ID: <20081025203549.C76165@delplex.bde.org> References: <200810241818.37262.thierry@herbelot.com> MIME-Version: 1.0 Content-Type: TEXT/PLAIN; charset=US-ASCII; format=flowed Cc: freebsd-fs@FreeBSD.org, hackers@FreeBSD.org Subject: Re: question about sb->st_blksize in src/sys/kern/vfs_vnops.c X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Sat, 25 Oct 2008 19:46:24 -0000
On Fri, 24 Oct 2008, Thierry Herbelot wrote:

> the [SUBJ] file contains the following extract (around line 705):
>
>	 * Default to PAGE_SIZE after much discussion.
>	 * XXX: min(PAGE_SIZE, vp->v_bufobj.bo_bsize) may be more correct.
>	 */
>
>	sb->st_blksize = PAGE_SIZE;
>
> which arrived around four years ago, with revision 1.211 (see
> http://www.freebsd.org/cgi/cvsweb.cgi/src/sys/kern/vfs_vnops.c.diff?r1=1.210;r2=1.211;f=h)

Indeed, this was completely broken long ago (in 1.211). Before then, and
after 1.128, some cases worked as intended if not perfectly:
- regular files: file systems still set va_blksize to their idea of the
  best i/o size (normally to the file system block size, which is
  normally larger than PAGE_SIZE and probably better in all cases) and
  this was used here. However, for regular files, the fs block size
  and the application's i/o size are almost irrelevant in most cases
  due to vfs clustering. Most large i/o's are done physically with
  the cluster size (which due to a related bug suite ends up being
  hard-coded to MAXPHYS (128K) at a minor cost when this is different
  from the best size).
- disk files: non-broken device drivers set si_iosize_best to their idea
  of the best i/o size (normally to the max i/o size, which is normally
  better than PAGE_SIZE) and this was used here. The bogus default
  of BLKDEV_IOSIZE was used for broken drivers (this is bogus because it
  was for the buffer cache implementation for block devices which no
  longer exist and was too small for them anyway).
- non-disk character-special files: the default of PAGE_SIZE was used.
  The comment about defaulting to PAGE_SIZE was added in 1.128 and is
  mainly for this case. Now the comment is nonsense since the value is
  fixed, not a default.
- other file types (fifos, pipes, sockets, ...): these got the default of
  PAGE_SIZE too.

In rev.1.1, st_blksize was set to va_blksize in all cases. So file systems
were supposed to set va_blksize reasonably in all cases, but this is not
easy and they did nothing good except for regular files.

Versions between 1.2 and 1.127 did weird things like defaulting to DFLTPHYS
(64K) for most cdevs but using a small size like BLKDEV_IOSIZE (2K) for
disks. This gave nonsense like 64K buffers for slow tty devices (keyboards)
and 2K buffers for fast disks. At least for programs that trust st_blksize
to be reasonable. Fortunately, st_blksize is rarely used...

> the net effect of this change is to decrease the block buffer size used in
> libc/stdio from 16 kbytes (derived from the underlying UFS partition) to
> PAGE_SIZE == 4 kbytes (fixed value), and consequently the I/O bandwidth is
> lowered (this is on a slow Flash).

... except it is used by stdio. (Another mess here is that stdio mostly
doesn't use its own BUFSIZ. It trusts st_blksize if fstat() to determine
st_blksize works. Of course, the existence of BUFSIZ is a related
historical mistake -- no fixed size can work best for all cases. But
when BUFSIZ is used, it is an even worse default than PAGE_SIZE.)

It's interesting that you can see the difference. Clustering is especially
good for hiding slowness on slow devices. Maybe you are using a
configuration that makes clustering ineffective. Mounting the file system
with -o sync or equivalently, doing a sync after every (too-small) write
would do it. Otherwise, writes are normally delayed until the next cluster
boundary.

> I have patched the kernel with a larger, fixed value (simply 4*PAGE_SIZE, to
> revert to the block size previously used), and the kernel and world seem to
> be running fine.
>
> Seeing the XXX comment above, I'm a bit worried about keeping this new
> st_blksize value.
>
> Are there any drawbacks to running with this bigger buffer size value?

Mostly it doesn't matter, since buffering (clustering) hides the
differences. Without clustering, 16K is a much better default for disks
than 4K, though not as good as the non-default va_blksize for regular
files. Newer disks might prefer 32K or 64K, but then the fs block size
should also be increased from 16K. Otherwise, increasing the block size
usually reduces performance, by thrashing caches or increasing latencies.
With modern cache sizes and disk speeds, you won't see these effects for a
block size of 64K, so defaulting to 64K would be reasonable for disks. It
would be silly for keyboards, but with modern memory sizes you would notice
this even less than when old versions used that size.

Bruce
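
[Editor's note, not part of the thread.] Since the discussion turns on stdio sizing its buffer from st_blksize, an application that cares can sidestep the kernel default without patching anything. The following is only a minimal sketch, assuming a POSIX system: it reads the st_blksize that fstat() reports for a stream and, separately, installs a caller-chosen buffer with setvbuf(). The helper names preferred_blksize() and use_buffer_size() are made up for illustration; fstat(), fileno() and setvbuf() are the standard calls.

```c
#define _POSIX_C_SOURCE 200809L	/* for fileno() */
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <sys/stat.h>

/*
 * Hypothetical helper (not from the thread): return the st_blksize that
 * fstat() reports for an open stream, or -1 if fstat() fails.
 */
static long
preferred_blksize(FILE *fp)
{
	struct stat sb;

	assert(fp != NULL);
	if (fstat(fileno(fp), &sb) != 0)
		return (-1);
	return ((long)sb.st_blksize);
}

/*
 * Hypothetical helper: make the stream fully buffered with a buffer of
 * `size` bytes allocated here.  Must be called before any other I/O on
 * the stream.  Returns the buffer (the caller frees it after fclose()),
 * or NULL on failure.
 */
static char *
use_buffer_size(FILE *fp, size_t size)
{
	char *buf;

	assert(fp != NULL);
	buf = malloc(size);
	if (buf == NULL)
		return (NULL);
	if (setvbuf(fp, buf, _IOFBF, size) != 0) {
		free(buf);
		return (NULL);
	}
	return (buf);
}
```

A typical use would be to open the file, call use_buffer_size(fp, 16 * 1024) before the first write, and free() the returned buffer only after fclose(fp). The buffer is passed explicitly because how setvbuf() treats the size argument when given a NULL buffer differs between libc implementations.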