From owner-freebsd-geom@FreeBSD.ORG Sun Dec 5 14:29:14 2010 Return-Path: Delivered-To: freebsd-geom@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 1C1EF106566C for ; Sun, 5 Dec 2010 14:29:14 +0000 (UTC) (envelope-from gcubfg-freebsd-geom@m.gmane.org) Received: from lo.gmane.org (lo.gmane.org [80.91.229.12]) by mx1.freebsd.org (Postfix) with ESMTP id 996C88FC1D for ; Sun, 5 Dec 2010 14:29:13 +0000 (UTC) Received: from list by lo.gmane.org with local (Exim 4.69) (envelope-from ) id 1PPFaR-0005wb-Nd for freebsd-geom@freebsd.org; Sun, 05 Dec 2010 15:29:11 +0100 Received: from cpe-188-129-83-203.dynamic.amis.hr ([188.129.83.203]) by main.gmane.org with esmtp (Gmexim 0.1 (Debian)) id 1AlnuQ-0007hv-00 for ; Sun, 05 Dec 2010 15:29:11 +0100 Received: from ivoras by cpe-188-129-83-203.dynamic.amis.hr with local (Gmexim 0.1 (Debian)) id 1AlnuQ-0007hv-00 for ; Sun, 05 Dec 2010 15:29:11 +0100 X-Injected-Via-Gmane: http://gmane.org/ To: freebsd-geom@freebsd.org From: Ivan Voras Date: Sun, 05 Dec 2010 15:28:56 +0100 Lines: 63 Message-ID: References: <4CE4E2B2.7070702@delphij.net> Mime-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 7bit X-Complaints-To: usenet@dough.gmane.org X-Gmane-NNTP-Posting-Host: cpe-188-129-83-203.dynamic.amis.hr User-Agent: Mozilla/5.0 (X11; U; FreeBSD amd64; en-US; rv:1.9.2.12) Gecko/20101102 Thunderbird/3.1.6 In-Reply-To: <4CE4E2B2.7070702@delphij.net> Cc: freebsd-fs@freebsd.org Subject: Re: ZFS stripesize patch (in the context of 4k sector drives) X-BeenThere: freebsd-geom@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: GEOM-specific discussions and implementations List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Sun, 05 Dec 2010 14:29:14 -0000 On 11/18/10 09:24, Xin LI wrote: > On 11/12/10 10:09, Ivan Voras wrote: >> On 11/12/10 16:00, Ivan Voras wrote: >>> Hello, >>> >>> Any 
objections to me committing the following patch?
>>>
>>> The intention is to use stripesize info from GEOM in creating vdevs, in
>>> the hope that the 4 KiB sector magic will work.
>
>> Or maybe not. I've grepped and other tools use stripesize in the way its
>> name suggests - as RAID stripe size, not as logical sector size.
>
>> New idea on the menu: make the logical sector size a separate concept
>> and a separate variable from stripe size. Would that be a better approach?
>
> Have you tested this booting from existing ZFS file system?

No, but it will probably work because ashift is stored in ZFS metadata
and compliant implementations should read it.

I did tests with ZFS, combining ashift and sector sizes with gnop, and
here is what is possible and what isn't:

* Pools created with an ashift of 512 and imported while sectorsize in
  GEOM is 512 bytes will work.
* Pools created with an ashift of 512 and imported while sectorsize in
  GEOM is 4096 bytes will NOT work.
* Pools created with an ashift of 4096 and imported while sectorsize in
  GEOM is 512 bytes will work.
* Pools created with an ashift of 4096 and imported while sectorsize in
  GEOM is 4096 bytes will work.

Basically, only increasing sectorsize (i.e. the minimum IO alignment)
causes a problem: drives which had formerly been formatted with the old
(512 byte) sector size will not work.

Personally, I'd still do it sooner rather than later to reduce the
number of users who have problems with it, but after discussing it with
mav I also understand the conservative side.

Also from this discussion came the idea of capping ashift to some upper
value. SPA_MAXBLOCKSIZE (128 KiB) looks reasonable for this, so here's an
updated patch. As the goal is to deal with current 4 KiB sector drives,
the whole thing may need to be revisited in the future if there are
other devices which fill in stripesize (probably by introducing a
"physsectorsize" field).

Comments? Ideas?
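As a minimal, self-contained sketch of the selection rule discussed above (not the committed code), the following userland C mirrors the logic: prefer stripesize when it is set and larger than sectorsize, cap it at SPA_MAXBLOCKSIZE, and otherwise fall back to sectorsize with a floor of SPA_MINBLOCKSIZE. The helper names pick_ashift() and pool_importable() are hypothetical, highbit() is reimplemented here on the assumption that it returns the 1-based index of the highest set bit, and the 512 byte and 128 KiB constants are assumed values. pool_importable() only summarizes the four gnop observations listed above; it is not a ZFS interface.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed values: 512 bytes and 128 KiB. */
#define SPA_MINBLOCKSIZE ((uint64_t)1 << 9)
#define SPA_MAXBLOCKSIZE ((uint64_t)1 << 17)

/* 1-based index of the highest set bit; 0 when no bit is set. */
int
highbit(uint64_t v)
{
	int h = 0;

	while (v != 0) {
		h++;
		v >>= 1;
	}
	return (h);
}

/* Hypothetical helper: choose ashift from a provider's sector and stripe sizes. */
int
pick_ashift(uint64_t sectorsize, uint64_t stripesize)
{
	uint64_t size;

	if (stripesize != 0 && stripesize > sectorsize) {
		/* Use the larger stripesize, but never more than the cap. */
		size = stripesize < SPA_MAXBLOCKSIZE ? stripesize : SPA_MAXBLOCKSIZE;
	} else {
		/* Fall back to sectorsize, but never less than the floor. */
		size = sectorsize > SPA_MINBLOCKSIZE ? sectorsize : SPA_MINBLOCKSIZE;
	}
	return (highbit(size) - 1);
}

/*
 * Hypothetical summary of the test results: a pool imports as long as
 * its block size (2^ashift) is not smaller than the GEOM sectorsize.
 */
int
pool_importable(int pool_ashift, uint64_t sectorsize)
{
	assert(pool_ashift >= 0 && pool_ashift < 64);
	return (((uint64_t)1 << pool_ashift) >= sectorsize);
}
```

With sectorsize 512 and stripesize 4096 the sketch yields ashift 12; with stripesize 0 it yields 9, and a 1 MiB stripesize is capped to ashift 17.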
--- vdev_geom.c.ori	2010-12-05 15:08:09.000000000 +0100
+++ vdev_geom.c	2010-12-05 15:10:50.000000000 +0100
@@ -496,7 +496,10 @@
 	/*
 	 * Determine the device's minimum transfer size.
 	 */
-	*ashift = highbit(MAX(pp->sectorsize, SPA_MINBLOCKSIZE)) - 1;
+	if (pp->stripesize != 0 && pp->stripesize > pp->sectorsize)
+		*ashift = highbit(MIN(pp->stripesize, SPA_MAXBLOCKSIZE)) - 1;
+	else
+		*ashift = highbit(MAX(pp->sectorsize, SPA_MINBLOCKSIZE)) - 1;
 
 	/*
 	 * Clear the nowritecache bit, so that on a vdev_reopen() we will

From owner-freebsd-geom@FreeBSD.ORG Mon Dec 6 11:06:58 2010
Return-Path:
Delivered-To: freebsd-geom@FreeBSD.org
Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id CC82410656C7 for ; Mon, 6 Dec 2010 11:06:58 +0000 (UTC) (envelope-from owner-bugmaster@FreeBSD.org)
Received: from freefall.freebsd.org (freefall.freebsd.org [IPv6:2001:4f8:fff6::28]) by mx1.freebsd.org (Postfix) with ESMTP id B98D78FC0A for ; Mon, 6 Dec 2010 11:06:58 +0000 (UTC)
Received: from freefall.freebsd.org (localhost [127.0.0.1]) by freefall.freebsd.org (8.14.4/8.14.4) with ESMTP id oB6B6wOX068238 for ; Mon, 6 Dec 2010 11:06:58 GMT (envelope-from owner-bugmaster@FreeBSD.org)
Received: (from gnats@localhost) by freefall.freebsd.org (8.14.4/8.14.4/Submit) id oB6B6wlc068236 for freebsd-geom@FreeBSD.org; Mon, 6 Dec 2010 11:06:58 GMT (envelope-from owner-bugmaster@FreeBSD.org)
Date: Mon, 6 Dec 2010 11:06:58 GMT
Message-Id: <201012061106.oB6B6wlc068236@freefall.freebsd.org>
X-Authentication-Warning: freefall.freebsd.org: gnats set sender to owner-bugmaster@FreeBSD.org using -f
From: FreeBSD bugmaster
To: freebsd-geom@FreeBSD.org
Cc:
Subject: Current problem reports assigned to freebsd-geom@FreeBSD.org
X-BeenThere: freebsd-geom@freebsd.org
X-Mailman-Version: 2.1.5
Precedence: list
List-Id: GEOM-specific discussions and implementations
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
X-List-Received-Date: Mon, 06
 Dec 2010 11:06:58 -0000

Note: to view an individual PR, use:
  http://www.freebsd.org/cgi/query-pr.cgi?pr=(number).

The following is a listing of current problems submitted by FreeBSD users.
These represent problem reports covering all versions including
experimental development code and obsolete releases.

S Tracker      Resp.      Description
--------------------------------------------------------------------------------
o kern/152609  geom       [geli] geli onetime on gzero panics
o kern/150858  geom       [geom] [geom_label] [patch] glabel(8) is not compatibl
o kern/150626  geom       [geom] [gjournal] gjournal(8) destroys label
o kern/150555  geom       [geom] gjournal unusable on GPT partitions
o kern/150334  geom       [geom] [udf] [patch] geom label does not support UDF
o kern/149762  geom       volume labels with rogue characters
o bin/149215   geom       [panic] [geom_part] gpart(8): Delete linux's slice via
o kern/147852  geom       [geom] [panic] graid3 panic: wrong offset 16384 for se
o kern/147851  geom       [geom] [panic] graid3 panic: g_read_data: invalid leng
o kern/147667  geom       [gmirror] Booting with one component of a gmirror, the
o kern/147664  geom       [geom] [patch] Add the ability to create linux and fat
o kern/145818  geom       [geom] geom_stat_open showing cached information for n
o kern/145042  geom       [geom] System stops booting after printing message "GE
o kern/144962  geom       [geom] panic when accessing GPT disk with a large numb
o kern/144905  geom       [geom][geom_part] panic in gpart_ctlreq when unpluggin
o kern/143455  geom       gstripe(8) in RELENG_8 (31st Jan 2010) broken
o kern/142563  geom       [geom] [hang] ioctl freeze in zpool
o kern/141740  geom       [geom] gjournal(8): g_journal_destroy concurrent error
s kern/141235  geom       [geom_part] 8.0 no longer provides /dev entries for al
o kern/140352  geom       [geom] gjournal + glabel not working
o kern/135898  geom       [geom] Severe filesystem corruption - large files or l
o kern/134922  geom       [gmirror] [panic] kernel panic when use fdisk on disk
o kern/134113  geom       [geli] Problem setting secondary GELI key
o kern/133931  geom       [geli] [request] intentionally wrong password to destr
o bin/132845   geom       [geom] [patch] ggated(8) does not close files opened a
o kern/132273  geom       glabel(8): [patch] failing on journaled partition
f kern/132242  geom       [gmirror] gmirror.ko fails to fully initialize
o kern/131353  geom       [geom] gjournal(8) kernel lock
p docs/130548  geom       [patch] gjournal(8) man page is missing sysctls
o kern/129674  geom       [geom] gjournal root did not mount on boot
o kern/129645  geom       gjournal(8): GEOM_JOURNAL causes system to fail to boo
o kern/129245  geom       [geom] gcache is more suitable for suffix based provid
f kern/128276  geom       [gmirror] machine lock up when gmirror module is used
o kern/127420  geom       [geom] [gjournal] [panic] Journal overflow on gmirrore
o kern/124973  geom       [gjournal] [patch] boot order affects geom_journal con
o kern/124969  geom       gvinum(8): gvinum raid5 plex does not detect missing s
o kern/123962  geom       [panic] [gjournal] gjournal (455Gb data, 8Gb journal),
o kern/123122  geom       [geom] GEOM / gjournal kernel lock
o kern/122738  geom       [geom] gmirror list "losts consumers" after gmirror de
o kern/122067  geom       [geom] [panic] Geom crashed during boot
o kern/121364  geom       [gmirror] Removing all providers create a "zombie" mir
o bin/120990   geom       [patch] support "BIOS Boot" partition type in gpt(8)
o kern/120091  geom       [geom] [geli] [gjournal] geli does not prompt for pass
o kern/115856  geom       [geli] ZFS thought it was degraded when it should have
o kern/115547  geom       [geom] [patch] [request] let GEOM Eli get password fro
o kern/114532  geom       [geom] GEOM_MIRROR shows up in kldstat even if compile
f kern/113957  geom       [gmirror] gmirror is intermittently reporting a degrad
o kern/113837  geom       [geom] unable to access 1024 sector size storage
o kern/113419  geom       [geom] geom fox multipathing not failing back
o kern/107707  geom       [geom] [patch] [request] add new class geom_xbox360 to
o kern/94632   geom       [geom] Kernel output resets input while GELI asks for
o kern/90582   geom       [geom] [panic] Restore cause panic string (ffs_blkfree
o bin/90093    geom       fdisk(8) incapable of altering in-core geometry
o kern/88601   geom       [geli] geli cause kernel panic under heavy disk usage
o kern/87544   geom       [gbde] mmaping large files on a gbde filesystem deadlo
o kern/84556   geom       [geom] [panic] GBDE-encrypted swap causes panic at shu
o kern/79251   geom       [2TB] newfs fails on 2.6TB gbde device
o kern/79035   geom       [vinum] gvinum unable to create a striped set of mirro
o bin/78131    geom       gbde(8) "destroy" not working.

59 problems total.

From owner-freebsd-geom@FreeBSD.ORG Thu Dec 9 11:05:42 2010
Return-Path:
Delivered-To: freebsd-geom@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id C9D4C1065672 for ; Thu, 9 Dec 2010 11:05:42 +0000 (UTC) (envelope-from nonsolosoft@diff.org)
Received: from smtpi3.ngi.it (smtpi3.ngi.it [88.149.128.33]) by mx1.freebsd.org (Postfix) with ESMTP id 9017F8FC15 for ; Thu, 9 Dec 2010 11:05:42 +0000 (UTC)
Received: from lap.diff.org (vola.diff.org [81.174.26.135]) by smtpi3.ngi.it (Postfix) with ESMTP id EDCF5318DB3 for ; Thu, 9 Dec 2010 11:46:04 +0100 (CET)
Message-ID: <4D00B36C.5070509@diff.org>
Date: Thu, 09 Dec 2010 11:46:04 +0100
From: Ferruccio Zamuner
Organization: NonSoLoSoft
User-Agent: Mozilla/5.0 (X11; U; DragonFly i386; en-US; rv:1.9.2.9) Gecko/20101023 Lightning/1.0b3pre Lanikai/3.1.3
MIME-Version: 1.0
To: freebsd-geom@freebsd.org
Content-Type: text/plain; charset=ISO-8859-1; format=flowed
Content-Transfer-Encoding: 7bit
Subject: gmirror insert
X-BeenThere: freebsd-geom@freebsd.org
X-Mailman-Version: 2.1.5
Precedence: list
List-Id: GEOM-specific discussions and implementations
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
X-List-Received-Date: Thu, 09 Dec 2010 11:05:42 -0000

I'm having trouble restoring ad4s2 to the mirror; the same operation
for ad4s1 on gm0 succeeded. I'm on FreeBSD 8.1 amd64.

Do you have any hints?
The log follows:

u1# fdisk ad6
******* Working on device /dev/ad6 *******
parameters extracted from in-core disklabel are:
cylinders=1453521 heads=16 sectors/track=63 (1008 blks/cyl)

Figures below won't work with BIOS for partitions not in cyl 1
parameters to be used for BIOS calculations are:
cylinders=1453521 heads=16 sectors/track=63 (1008 blks/cyl)

Media sector size is 512
Warning: BIOS sector numbering starts with sector 1
Information from DOS bootblock is:
The data for partition 1 is:
sysid 165 (0xa5),(FreeBSD/NetBSD/386BSD)
    start 63, size 251657217 (122879 Meg), flag 80 (active)
        beg: cyl 0/ head 1/ sector 1;
        end: cyl 1023/ head 15/ sector 63
The data for partition 2 is:
sysid 165 (0xa5),(FreeBSD/NetBSD/386BSD)
    start 251657280, size 1048575024 (511999 Meg), flag 0
        beg: cyl 1023/ head 255/ sector 63;
        end: cyl 1023/ head 15/ sector 63
The data for partition 3 is:
sysid 165 (0xa5),(FreeBSD/NetBSD/386BSD)
    start 1300232304, size 164916864 (80525 Meg), flag 0
        beg: cyl 1023/ head 255/ sector 63;
        end: cyl 1023/ head 15/ sector 63
The data for partition 4 is:

u1# gmirror forget gm1 ad4s2
u1# gmirror list gm1
Geom name: gm1
State: COMPLETE
Components: 1
Balance: split
Slice: 1048576
Flags: NONE
GenID: 0
SyncID: 2
ID: 3153972535
Providers:
1. Name: mirror/gm1
   Mediasize: 536870411776 (500G)
   Sectorsize: 512
   Mode: r1w1e1
Consumers:
1. Name: ad6s2
   Mediasize: 536870412288 (500G)
   Sectorsize: 512
   Mode: r1w1e1
   State: ACTIVE
   Priority: 0
   Flags: HARDCODED
   GenID: 0
   SyncID: 2
   ID: 3208481414

u1# fdisk -2 ad4
******* Working on device /dev/ad4 *******
parameters extracted from in-core disklabel are:
cylinders=1453521 heads=16 sectors/track=63 (1008 blks/cyl)

Figures below won't work with BIOS for partitions not in cyl 1
parameters to be used for BIOS calculations are:
cylinders=1453521 heads=16 sectors/track=63 (1008 blks/cyl)

Media sector size is 512
Warning: BIOS sector numbering starts with sector 1
Information from DOS bootblock is:
The data for partition 2 is:
sysid 165 (0xa5),(FreeBSD/NetBSD/386BSD)
    start 251657280, size 1048575024 (511999 Meg), flag 80 (active)
        beg: cyl 1023/ head 255/ sector 63;
        end: cyl 1023/ head 15/ sector 63

u1# gmirror insert gm1 ad4s2
gmirror: Provider ad4s2 too small.

From owner-freebsd-geom@FreeBSD.ORG Thu Dec 9 11:54:43 2010
Return-Path:
Delivered-To: freebsd-geom@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id BE46B106566C for ; Thu, 9 Dec 2010 11:54:43 +0000 (UTC) (envelope-from pjd@garage.freebsd.pl)
Received: from mail.garage.freebsd.pl (60.wheelsystems.com [83.12.187.60]) by mx1.freebsd.org (Postfix) with ESMTP id 6B6BC8FC0C for ; Thu, 9 Dec 2010 11:54:43 +0000 (UTC)
Received: by mail.garage.freebsd.pl (Postfix, from userid 65534) id D80E145C99; Thu, 9 Dec 2010 12:54:41 +0100 (CET)
Received: from localhost (pdawidek.whl [10.0.1.1]) (using TLSv1 with cipher DHE-RSA-AES256-SHA (256/256 bits)) (No client certificate requested) by mail.garage.freebsd.pl (Postfix) with ESMTP id 923C345C89; Thu, 9 Dec 2010 12:54:36 +0100 (CET)
Date: Thu, 9 Dec 2010 12:54:35 +0100
From: Pawel Jakub Dawidek
To: Ferruccio Zamuner
Message-ID: <20101209115435.GC1745@garage.freebsd.pl>
References: <4D00B36C.5070509@diff.org>
Mime-Version: 1.0
Content-Type: multipart/signed; micalg=pgp-sha1;
 protocol="application/pgp-signature"; boundary="DSayHWYpDlRfCAAQ"
Content-Disposition: inline
In-Reply-To: <4D00B36C.5070509@diff.org>
User-Agent: Mutt/1.4.2.3i
X-PGP-Key-URL: http://people.freebsd.org/~pjd/pjd.asc
X-OS: FreeBSD 9.0-CURRENT amd64
X-Spam-Checker-Version: SpamAssassin 3.0.4 (2005-06-05) on mail.garage.freebsd.pl
X-Spam-Level:
X-Spam-Status: No, score=-5.9 required=4.5 tests=ALL_TRUSTED,BAYES_00 autolearn=ham version=3.0.4
Cc: freebsd-geom@freebsd.org
Subject: Re: gmirror insert
X-BeenThere: freebsd-geom@freebsd.org
X-Mailman-Version: 2.1.5
Precedence: list
List-Id: GEOM-specific discussions and implementations
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
X-List-Received-Date: Thu, 09 Dec 2010 11:54:43 -0000

On Thu, Dec 09, 2010 at 11:46:04AM +0100, Ferruccio Zamuner wrote:
> I'm having trouble restoring ad4s2 to the mirror; the same operation
> for ad4s1 on gm0 succeeded. I'm on FreeBSD 8.1 amd64.
>
> Do you have any hints? The log follows:

Could you provide the output of:

	diskinfo -v /dev/ad[46]s2 /dev/mirror/gm1

-- 
Pawel Jakub Dawidek                       http://www.wheelsystems.com
pjd@FreeBSD.org                           http://www.FreeBSD.org
FreeBSD committer                         Am I Evil? Yes, I Am!

From owner-freebsd-geom@FreeBSD.ORG Thu Dec 9 19:16:48 2010
Return-Path:
Delivered-To: freebsd-geom@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 654DD1065670 for ; Thu, 9 Dec 2010 19:16:48 +0000 (UTC) (envelope-from lev@serebryakov.spb.ru)
Received: from ftp.translate.ru (ftp.translate.ru [80.249.188.42]) by mx1.freebsd.org (Postfix) with ESMTP id 255428FC19 for ; Thu, 9 Dec 2010 19:16:47 +0000 (UTC)
Received: from lion.home.serebryakov.spb.ru (89.112.15.178.pppoe.eltel.net [89.112.15.178]) (Authenticated sender: lev@serebryakov.spb.ru) by ftp.translate.ru (Postfix) with ESMTPA id 5D07213DF49 for ; Thu, 9 Dec 2010 22:00:27 +0300 (MSK)
Date: Thu, 9 Dec 2010 22:00:23 +0300
From: Lev Serebryakov
X-Priority: 3 (Normal)
Message-ID: <166474821.20101209220023@serebryakov.spb.ru>
To: freebsd-geom@freebsd.org
MIME-Version: 1.0
Content-Type: text/plain; charset=windows-1251
Content-Transfer-Encoding: quoted-printable
Subject: struct bio: mark as complete, but keep data buffer (and protect it from freeing/using)?
X-BeenThere: freebsd-geom@freebsd.org
X-Mailman-Version: 2.1.5
Precedence: list
List-Id: GEOM-specific discussions and implementations
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
X-List-Received-Date: Thu, 09 Dec 2010 19:16:48 -0000

Hello, Freebsd-geom.

Is it possible to mark a "struct bio" as complete (pass it to
g_io_deliver()) but hold on to its data buffer, to avoid an unnecessary
data copy?
-- 
// Black Lion AKA Lev Serebryakov

From owner-freebsd-geom@FreeBSD.ORG Fri Dec 10 13:22:59 2010
Return-Path:
Delivered-To: freebsd-geom@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 4DD331065670 for ; Fri, 10 Dec 2010 13:22:59 +0000 (UTC) (envelope-from lev@serebryakov.spb.ru)
Received: from ftp.translate.ru (ftp.translate.ru [80.249.188.42]) by mx1.freebsd.org (Postfix) with ESMTP id 07E708FC13 for ; Fri, 10 Dec 2010 13:22:58 +0000 (UTC)
Received: from lion.home.serebryakov.spb.ru (89.112.15.178.pppoe.eltel.net [89.112.15.178]) (Authenticated sender: lev@serebryakov.spb.ru) by ftp.translate.ru (Postfix) with ESMTPA id 9562313DF48; Fri, 10 Dec 2010 16:22:57 +0300 (MSK)
Date: Fri, 10 Dec 2010 16:22:53 +0300
From: Lev Serebryakov
X-Priority: 3 (Normal)
Message-ID: <1365605559.20101210162253@serebryakov.spb.ru>
To: freebsd-hackers@freebsd.org, freebsd-geom@freebsd.org
MIME-Version: 1.0
Content-Type: text/plain; charset=windows-1251
Content-Transfer-Encoding: quoted-printable
Cc:
Subject: Where userland read/write requests, whcih is larger than MAXPHYS, are splitted?
X-BeenThere: freebsd-geom@freebsd.org
X-Mailman-Version: 2.1.5
Precedence: list
List-Id: GEOM-specific discussions and implementations
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
X-List-Received-Date: Fri, 10 Dec 2010 13:22:59 -0000

Hello, Freebsd-geom.

I'm digging through the GEOM/IO code and cannot find the place where
requests from userland to read more than MAXPHYS bytes are split
into several "struct bio"s.

It seems that these child requests are issued one by one, not in
parallel; am I right? Why? Doesn't that break parallelism when the
underlying GEOM can process several requests simultaneously?
-- 
// Black Lion AKA Lev Serebryakov

From owner-freebsd-geom@FreeBSD.ORG Fri Dec 10 14:37:23 2010
Return-Path:
Delivered-To: freebsd-geom@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id CA46E106566C for ; Fri, 10 Dec 2010 14:37:23 +0000 (UTC) (envelope-from avg@freebsd.org)
Received: from citadel.icyb.net.ua (citadel.icyb.net.ua [212.40.38.140]) by mx1.freebsd.org (Postfix) with ESMTP id 1B90D8FC16 for ; Fri, 10 Dec 2010 14:37:21 +0000 (UTC)
Received: from odyssey.starpoint.kiev.ua (alpha-e.starpoint.kiev.ua [212.40.38.101]) by citadel.icyb.net.ua (8.8.8p3/ICyb-2.3exp) with ESMTP id QAA08883; Fri, 10 Dec 2010 16:37:16 +0200 (EET) (envelope-from avg@freebsd.org)
Message-ID: <4D023B1C.3070707@freebsd.org>
Date: Fri, 10 Dec 2010 16:37:16 +0200
From: Andriy Gapon
User-Agent: Mozilla/5.0 (X11; U; FreeBSD amd64; en-US; rv:1.9.2.12) Gecko/20101029 Lightning/1.0b2 Thunderbird/3.1.6
MIME-Version: 1.0
To: Lev Serebryakov
References: <166474821.20101209220023@serebryakov.spb.ru>
In-Reply-To: <166474821.20101209220023@serebryakov.spb.ru>
X-Enigmail-Version: 1.1.2
Content-Type: text/plain; charset=windows-1251
Content-Transfer-Encoding: 7bit
Cc: freebsd-geom@freebsd.org
Subject: Re: struct bio: mark as complete, but keep data buffer (and protect it from freeing/using)?
X-BeenThere: freebsd-geom@freebsd.org
X-Mailman-Version: 2.1.5
Precedence: list
List-Id: GEOM-specific discussions and implementations
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
X-List-Received-Date: Fri, 10 Dec 2010 14:37:23 -0000

on 09/12/2010 21:00 Lev Serebryakov said the following:
> Hello, Freebsd-geom.
>
> Is it possible to mark "struct bio" as complete (pass to
> g_io_deliver()) but hold its data buffer, to avoid unnecessary data
> copy?

What do you mean by 'hold'? Store bio_data pointer locally?
I think that greatly depends on what the owner of that bio is going to
do with it after the operation completes. E.g. if it frees that memory,
then the pointer would become stale. So, in the general case, it would
probably not be a good idea.

-- 
Andriy Gapon

From owner-freebsd-geom@FreeBSD.ORG Fri Dec 10 14:48:27 2010
Return-Path:
Delivered-To: freebsd-geom@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 5AFE31065672; Fri, 10 Dec 2010 14:48:27 +0000 (UTC) (envelope-from avg@freebsd.org)
Received: from citadel.icyb.net.ua (citadel.icyb.net.ua [212.40.38.140]) by mx1.freebsd.org (Postfix) with ESMTP id 728E78FC15; Fri, 10 Dec 2010 14:48:26 +0000 (UTC)
Received: from odyssey.starpoint.kiev.ua (alpha-e.starpoint.kiev.ua [212.40.38.101]) by citadel.icyb.net.ua (8.8.8p3/ICyb-2.3exp) with ESMTP id QAA09006; Fri, 10 Dec 2010 16:48:24 +0200 (EET) (envelope-from avg@freebsd.org)
Message-ID: <4D023DB7.9080509@freebsd.org>
Date: Fri, 10 Dec 2010 16:48:23 +0200
From: Andriy Gapon
User-Agent: Mozilla/5.0 (X11; U; FreeBSD amd64; en-US; rv:1.9.2.12) Gecko/20101029 Lightning/1.0b2 Thunderbird/3.1.6
MIME-Version: 1.0
To: Lev Serebryakov
References: <1365605559.20101210162253@serebryakov.spb.ru>
In-Reply-To: <1365605559.20101210162253@serebryakov.spb.ru>
X-Enigmail-Version: 1.1.2
Content-Type: text/plain; charset=windows-1251
Content-Transfer-Encoding: 7bit
Cc: freebsd-hackers@freebsd.org, freebsd-geom@freebsd.org
Subject: Re: Where userland read/write requests, whcih is larger than MAXPHYS, are splitted?
X-BeenThere: freebsd-geom@freebsd.org
X-Mailman-Version: 2.1.5
Precedence: list
List-Id: GEOM-specific discussions and implementations
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
X-List-Received-Date: Fri, 10 Dec 2010 14:48:27 -0000

on 10/12/2010 15:22 Lev Serebryakov said the following:
> Hello, Freebsd-geom.
>
> I'm digging through the GEOM/IO code and cannot find the place where
> requests from userland to read more than MAXPHYS bytes are split
> into several "struct bio"s.

Check out g_disk_start(). The split is done based on the disk-specific
d_maxsize, not the hardcoded MAXPHYS, of course.

> It seems that these child requests are issued one by one, not in
> parallel; am I right? Why? Doesn't that break parallelism when the
> underlying GEOM can process several requests simultaneously?

How do you *issue* the child requests in parallel?
Of course, they can *run* in parallel if the system configuration permits
that and the request run time is sufficient for an overlap to happen.
Besides, there are no geoms under the disk geom; it works on peripheral
drivers.

But maybe I misunderstood your question and you were talking about a
different I/O layer or a different I/O path.

-- 
Andriy Gapon

From owner-freebsd-geom@FreeBSD.ORG Fri Dec 10 15:03:31 2010
Return-Path:
Delivered-To: freebsd-geom@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 09A9E1065672; Fri, 10 Dec 2010 15:03:31 +0000 (UTC) (envelope-from avg@freebsd.org)
Received: from citadel.icyb.net.ua (citadel.icyb.net.ua [212.40.38.140]) by mx1.freebsd.org (Postfix) with ESMTP id 228A28FC20; Fri, 10 Dec 2010 15:03:29 +0000 (UTC)
Received: from odyssey.starpoint.kiev.ua (alpha-e.starpoint.kiev.ua [212.40.38.101]) by citadel.icyb.net.ua (8.8.8p3/ICyb-2.3exp) with ESMTP id RAA09219; Fri, 10 Dec 2010 17:03:27 +0200 (EET) (envelope-from avg@freebsd.org)
Message-ID: <4D02413F.8020007@freebsd.org>
Date: Fri, 10 Dec 2010 17:03:27 +0200
From: Andriy Gapon
User-Agent: Mozilla/5.0 (X11; U; FreeBSD amd64; en-US; rv:1.9.2.12) Gecko/20101029 Lightning/1.0b2 Thunderbird/3.1.6
MIME-Version: 1.0
To: Lev Serebryakov
References: <1365605559.20101210162253@serebryakov.spb.ru> <4D023DB7.9080509@freebsd.org>
In-Reply-To: <4D023DB7.9080509@freebsd.org>
X-Enigmail-Version: 1.1.2
Content-Type:
 text/plain; charset=windows-1251
Content-Transfer-Encoding: 7bit
Cc: freebsd-hackers@freebsd.org, freebsd-geom@freebsd.org
Subject: Re: Where userland read/write requests, whcih is larger than MAXPHYS, are splitted?
X-BeenThere: freebsd-geom@freebsd.org
X-Mailman-Version: 2.1.5
Precedence: list
List-Id: GEOM-specific discussions and implementations
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
X-List-Received-Date: Fri, 10 Dec 2010 15:03:31 -0000

on 10/12/2010 16:48 Andriy Gapon said the following:
> But maybe I misunderstood your question and you talked about a different I/O layer
> or different I/O path.
>

Oh, you are probably talking about physread/physwrite == physio.
Indeed, it issues bios with a maximum size of si_iosize_max and runs
them sequentially. Besides, if the uio is really "vectored", then each
uio sub-buffer is processed sequentially too.
This is probably slower than running the requests in parallel; on the
plus side, less KVA is required for mapping the user-space buffer (the
UIO_USERSPACE case) into the kernel. Not sure if the latter is much of a
concern though.
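The sequential behaviour described above can be illustrated with a small userland sketch. It is not the kernel's physio() code: run_sequential() and count_bytes() are hypothetical names, the callback stands in for "issue one bio and wait for it to complete", and max_chunk plays the role of si_iosize_max.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical callback: stands in for "issue one bio and wait for it". */
typedef int (*chunk_io_fn)(size_t offset, size_t length, void *arg);

/*
 * Walk [0, total) in pieces of at most max_chunk bytes, one after another.
 * Returns the number of chunks issued, or -1 if a chunk fails.
 */
int
run_sequential(size_t total, size_t max_chunk, chunk_io_fn io, void *arg)
{
	size_t off = 0;
	int issued = 0;

	assert(max_chunk > 0);
	while (off < total) {
		size_t len = total - off;

		if (len > max_chunk)
			len = max_chunk;
		/* The next chunk starts only after this one has returned. */
		if (io(off, len, arg) != 0)
			return (-1);
		off += len;
		issued++;
	}
	return (issued);
}

/* Example callback: adds each chunk's length to the size_t that arg points to. */
int
count_bytes(size_t offset, size_t length, void *arg)
{
	(void)offset;
	*(size_t *)arg += length;
	return (0);
}
```

For a 300 KiB request and a 128 KiB limit this issues three chunks (128 + 128 + 44 KiB), each one only after the previous chunk has completed, which is the serialization being questioned in this thread.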
The sequential code is simpler too :-) -- Andriy Gapon From owner-freebsd-geom@FreeBSD.ORG Fri Dec 10 15:29:21 2010 Return-Path: Delivered-To: freebsd-geom@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 4F6041065693; Fri, 10 Dec 2010 15:29:21 +0000 (UTC) (envelope-from lev@serebryakov.spb.ru) Received: from ftp.translate.ru (ftp.translate.ru [80.249.188.42]) by mx1.freebsd.org (Postfix) with ESMTP id 08AD48FC0A; Fri, 10 Dec 2010 15:29:20 +0000 (UTC) Received: from lion.home.serebryakov.spb.ru (89.112.15.178.pppoe.eltel.net [89.112.15.178]) (Authenticated sender: lev@serebryakov.spb.ru) by ftp.translate.ru (Postfix) with ESMTPA id 0C22513DF48; Fri, 10 Dec 2010 18:29:20 +0300 (MSK) Date: Fri, 10 Dec 2010 18:29:15 +0300 From: Lev Serebryakov X-Priority: 3 (Normal) Message-ID: <1272967424.20101210182915@serebryakov.spb.ru> To: Andriy Gapon In-Reply-To: <4D02413F.8020007@freebsd.org> References: <1365605559.20101210162253@serebryakov.spb.ru> <4D023DB7.9080509@freebsd.org> <4D02413F.8020007@freebsd.org> MIME-Version: 1.0 Content-Type: text/plain; charset=windows-1251 Content-Transfer-Encoding: quoted-printable Cc: freebsd-hackers@freebsd.org, freebsd-geom@freebsd.org Subject: Re: Where userland read/write requests, whcih is larger than MAXPHYS, are splitted? X-BeenThere: freebsd-geom@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: GEOM-specific discussions and implementations List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Fri, 10 Dec 2010 15:29:21 -0000 Hello, Andriy. You wrote 10 =E4=E5=EA=E0=E1=F0=FF 2010 =E3., 18:03:27: > on 10/12/2010 16:48 Andriy Gapon said the following: >> But maybe I misunderstood your question and you talked about a different= I/O layer >> or different I/O path. > Oh, probably you talk about physread/physwrite =3D=3D physio. 
> Indeed, it issues bio-s with max size of si_iosize_max and runs them sequentially. Yep, I'm talking about this case. See my message to Alexander Motin for an explanation of why I think sequential processing here is not a good idea. -- // Black Lion AKA Lev Serebryakov From owner-freebsd-geom@FreeBSD.ORG Fri Dec 10 15:33:27 2010 Return-Path: Delivered-To: freebsd-geom@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 97DB8106564A for ; Fri, 10 Dec 2010 15:33:27 +0000 (UTC) (envelope-from lev@serebryakov.spb.ru) Received: from ftp.translate.ru (ftp.translate.ru [80.249.188.42]) by mx1.freebsd.org (Postfix) with ESMTP id 5749F8FC1D for ; Fri, 10 Dec 2010 15:33:27 +0000 (UTC) Received: from lion.home.serebryakov.spb.ru (89.112.15.178.pppoe.eltel.net [89.112.15.178]) (Authenticated sender: lev@serebryakov.spb.ru) by ftp.translate.ru (Postfix) with ESMTPA id 5B73C13DF48 for ; Fri, 10 Dec 2010 18:33:26 +0300 (MSK) Date: Fri, 10 Dec 2010 18:33:22 +0300 From: Lev Serebryakov X-Priority: 3 (Normal) Message-ID: <1136849868.20101210183322@serebryakov.spb.ru> To: freebsd-geom@freebsd.org MIME-Version: 1.0 Content-Type: text/plain; charset=windows-1251 Content-Transfer-Encoding: quoted-printable Subject: One more question: is here any way to know is consumer busy or not? X-BeenThere: freebsd-geom@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: GEOM-specific discussions and implementations List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Fri, 10 Dec 2010 15:33:27 -0000 Hello, Freebsd-geom. The same idea about pre-reading applies to providers with multiple consumers (BTW, I think the terminology here is swapped): it looks good to issue a pre-read only if the target consumer is idle. Is there any way to determine that? For example, what exactly does the load percentage in gstat output mean? Can I rely on these statistics, and how should I get them in-kernel?
-- // Black Lion AKA Lev Serebryakov From owner-freebsd-geom@FreeBSD.ORG Fri Dec 10 15:59:15 2010 Return-Path: Delivered-To: freebsd-geom@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 41EEC1065673 for ; Fri, 10 Dec 2010 15:59:15 +0000 (UTC) (envelope-from phk@critter.freebsd.dk) Received: from phk.freebsd.dk (phk.freebsd.dk [130.225.244.222]) by mx1.freebsd.org (Postfix) with ESMTP id 056848FC13 for ; Fri, 10 Dec 2010 15:59:14 +0000 (UTC) Received: from critter.freebsd.dk (critter.freebsd.dk [192.168.61.3]) by phk.freebsd.dk (Postfix) with ESMTP id C4F723F62C; Fri, 10 Dec 2010 15:40:09 +0000 (UTC) Received: from critter.freebsd.dk (localhost [127.0.0.1]) by critter.freebsd.dk (8.14.4/8.14.4) with ESMTP id oBAFe9d3056018; Fri, 10 Dec 2010 15:40:09 GMT (envelope-from phk@critter.freebsd.dk) To: Lev Serebryakov From: "Poul-Henning Kamp" In-Reply-To: Your message of "Fri, 10 Dec 2010 18:33:22 +0300." <1136849868.20101210183322@serebryakov.spb.ru> Date: Fri, 10 Dec 2010 15:40:09 +0000 Message-ID: <56017.1291995609@critter.freebsd.dk> Sender: phk@critter.freebsd.dk Cc: freebsd-geom@freebsd.org Subject: Re: One more question: is here any way to know is consumer busy or not? X-BeenThere: freebsd-geom@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: GEOM-specific discussions and implementations List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Fri, 10 Dec 2010 15:59:15 -0000 In message <1136849868.20101210183322@serebryakov.spb.ru>, Lev Serebryakov writes: >Hello, Freebsd-geom. > > The same idea about pre-reading in case of providers with multiple > consumers (BTW, I think that terminology here is swapped): The provider offers access to a disk(-like) device, which the consumer can use. > looks good to issue pre-read only if target consumer is idle. Is > here any way to determine that?
You can look at the nstart and nend elements of the g_consumer structure to tell how many outstanding requests there are on that consumer. -- Poul-Henning Kamp | UNIX since Zilog Zeus 3.20 phk@FreeBSD.ORG | TCP/IP since RFC 956 FreeBSD committer | BSD since 4.3-tahoe Never attribute to malice what can adequately be explained by incompetence.
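A minimal sketch of the check suggested above, written as a userland mock so that it compiles outside the kernel. It assumes that nstart counts requests issued on the consumer and nend counts requests completed, so that their difference is the number in flight. `struct mock_consumer`, `consumer_outstanding`, and `consumer_is_idle` are hypothetical names for this example and are not part of GEOM; in-kernel code would read the fields of struct g_consumer instead.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/*
 * Userland stand-in for the two counters named in the message. In the
 * kernel they are members of struct g_consumer; this mock exists only
 * so the sketch builds on its own.
 */
struct mock_consumer {
	unsigned int nstart;	/* assumed: requests issued so far */
	unsigned int nend;	/* assumed: requests completed so far */
};

/*
 * Requests issued but not yet completed. Unsigned subtraction keeps
 * the result correct even after the counters wrap around.
 */
unsigned int
consumer_outstanding(const struct mock_consumer *cp)
{
	assert(cp != NULL);
	return (cp->nstart - cp->nend);
}

/* Hypothetical policy helper: pre-read only when nothing is in flight. */
bool
consumer_is_idle(const struct mock_consumer *cp)
{
	return (consumer_outstanding(cp) == 0);
}
```

A value read this way is only a snapshot: another request may be issued right after the check, so it is probably best treated as a heuristic for deciding whether to pre-read rather than as a guarantee that the consumer stays idle.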