From owner-freebsd-geom@FreeBSD.ORG Mon Apr 19 02:11:12 2010 Return-Path: Delivered-To: geom@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 606AC106564A for ; Mon, 19 Apr 2010 02:11:12 +0000 (UTC) (envelope-from yanegomi@gmail.com) Received: from mail-gw0-f54.google.com (mail-gw0-f54.google.com [74.125.83.54]) by mx1.freebsd.org (Postfix) with ESMTP id 1A1468FC0A for ; Mon, 19 Apr 2010 02:11:11 +0000 (UTC) Received: by gwb1 with SMTP id 1so849719gwb.13 for ; Sun, 18 Apr 2010 19:11:11 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=gamma; h=domainkey-signature:mime-version:sender:received:in-reply-to :references:date:x-google-sender-auth:received:message-id:subject :from:to:cc:content-type; bh=vMXy07Y5sOgNsUENCQn4KLu5bN0UMzkcTjTNkJEpm7A=; b=Z6t/foZP2PQljxptTPw16uKvesFv0JptgJSj0RfGhAbv13n/m6xidnJVFSlZ6rhA6p kFW1SmKPeHdD6JXwNq/TactfLE4asrtgQLFBAH+RfDsd+ivQC2OJ2rOqP/htTVcdj26E i290A5gfUDNqeL/akp7H5ftR05XM2684rb5hM= DomainKey-Signature: a=rsa-sha1; c=nofws; d=gmail.com; s=gamma; h=mime-version:sender:in-reply-to:references:date :x-google-sender-auth:message-id:subject:from:to:cc:content-type; b=X6HJ2g7ChI4Xtm2oeaQHAIoJE6vYY+t3tBfJIVwACvU+QIeHpB5M6NEKk/j+bm/CfQ t+ZnsSuQjQwA258L7E9QQweTcBo5/2lGMFK13SPli85IGfK/1GWl59a4bzy05sg18tKm hgDWSgeBbvsGLfZajFDAQluUjEUCEyixQpJJI= MIME-Version: 1.0 Sender: yanegomi@gmail.com Received: by 10.231.183.17 with HTTP; Sun, 18 Apr 2010 18:48:59 -0700 (PDT) In-Reply-To: <201004190110.o3J1A0Fo084788@freefall.freebsd.org> References: <201004190109.o3J192F4002235@www.freebsd.org> <201004190110.o3J1A0Fo084788@freefall.freebsd.org> Date: Sun, 18 Apr 2010 18:48:59 -0700 X-Google-Sender-Auth: 893f81de1a0a7605 Received: by 10.101.210.33 with SMTP id m33mr8567564anq.13.1271641739923; Sun, 18 Apr 2010 18:48:59 -0700 (PDT) Message-ID: From: Garrett Cooper To: bug-followup Content-Type: text/plain; charset=ISO-8859-1 Cc: geom@freebsd.org, 
mav@freebsd.org Subject: Re: kern/145818: [geom] geom_stat_open showing cached information for non-present iso X-BeenThere: freebsd-geom@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: GEOM-specific discussions and implementations List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Mon, 19 Apr 2010 02:11:12 -0000 I did some more tracking down of this issue and the problem is with the cached data being reported by devstat(9) -- somehow when the device is being detached it's not properly reporting that it needs to be removed in geom(4). I'm a bit stuck after that, but it does appear to be a communication issue between [atapi]cam(4) and geom(4). Thanks, -Garrett From owner-freebsd-geom@FreeBSD.ORG Mon Apr 19 05:29:18 2010 Return-Path: Delivered-To: freebsd-geom@hub.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id C5F26106566C; Mon, 19 Apr 2010 05:29:18 +0000 (UTC) (envelope-from linimon@FreeBSD.org) Received: from freefall.freebsd.org (freefall.freebsd.org [IPv6:2001:4f8:fff6::28]) by mx1.freebsd.org (Postfix) with ESMTP id 9CD568FC13; Mon, 19 Apr 2010 05:29:18 +0000 (UTC) Received: from freefall.freebsd.org (localhost [127.0.0.1]) by freefall.freebsd.org (8.14.4/8.14.4) with ESMTP id o3J5TIRC011792; Mon, 19 Apr 2010 05:29:18 GMT (envelope-from linimon@freefall.freebsd.org) Received: (from linimon@localhost) by freefall.freebsd.org (8.14.4/8.14.4/Submit) id o3J5TIWA011788; Mon, 19 Apr 2010 05:29:18 GMT (envelope-from linimon) Date: Mon, 19 Apr 2010 05:29:18 GMT Message-Id: <201004190529.o3J5TIWA011788@freefall.freebsd.org> To: linimon@FreeBSD.org, freebsd-bugs@FreeBSD.org, freebsd-geom@FreeBSD.org From: linimon@FreeBSD.org Cc: Subject: Re: kern/145818: [geom] geom_stat_open showing cached information for non-present iso X-BeenThere: freebsd-geom@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: GEOM-specific 
discussions and implementations List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Mon, 19 Apr 2010 05:29:18 -0000 Synopsis: [geom] geom_stat_open showing cached information for non-present iso Responsible-Changed-From-To: freebsd-bugs->freebsd-geom Responsible-Changed-By: linimon Responsible-Changed-When: Mon Apr 19 05:29:09 UTC 2010 Responsible-Changed-Why: Over to maintainer(s). http://www.freebsd.org/cgi/query-pr.cgi?pr=145818 From owner-freebsd-geom@FreeBSD.ORG Mon Apr 19 11:06:59 2010 Return-Path: Delivered-To: freebsd-geom@FreeBSD.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id ED085106564A for ; Mon, 19 Apr 2010 11:06:59 +0000 (UTC) (envelope-from owner-bugmaster@FreeBSD.org) Received: from freefall.freebsd.org (freefall.freebsd.org [IPv6:2001:4f8:fff6::28]) by mx1.freebsd.org (Postfix) with ESMTP id DB7768FC37 for ; Mon, 19 Apr 2010 11:06:59 +0000 (UTC) Received: from freefall.freebsd.org (localhost [127.0.0.1]) by freefall.freebsd.org (8.14.4/8.14.4) with ESMTP id o3JB6xsa034105 for ; Mon, 19 Apr 2010 11:06:59 GMT (envelope-from owner-bugmaster@FreeBSD.org) Received: (from gnats@localhost) by freefall.freebsd.org (8.14.4/8.14.4/Submit) id o3JB6xUe034103 for freebsd-geom@FreeBSD.org; Mon, 19 Apr 2010 11:06:59 GMT (envelope-from owner-bugmaster@FreeBSD.org) Date: Mon, 19 Apr 2010 11:06:59 GMT Message-Id: <201004191106.o3JB6xUe034103@freefall.freebsd.org> X-Authentication-Warning: freefall.freebsd.org: gnats set sender to owner-bugmaster@FreeBSD.org using -f From: FreeBSD bugmaster To: freebsd-geom@FreeBSD.org Cc: Subject: Current problem reports assigned to freebsd-geom@FreeBSD.org X-BeenThere: freebsd-geom@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: GEOM-specific discussions and implementations List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Mon, 19 Apr 2010 11:07:00 -0000 
Note: to view an individual PR, use:
  http://www.freebsd.org/cgi/query-pr.cgi?pr=(number).

The following is a listing of current problems submitted by FreeBSD users. These represent problem reports covering all versions including experimental development code and obsolete releases.

S Tracker      Resp.      Description
--------------------------------------------------------------------------------
o kern/145818  geom       [geom] geom_stat_open showing cached information for n
o kern/145452  geom       [geom] [panic] panic in geom_part_mbr when undoing des
o kern/145042  geom       [geom] System stops booting after printing message "GE
o kern/144962  geom       [geom] panic when accessing GPT disk with a large numb
o bin/144943   geom       [geom] gconcat(8) randomly "loses" all knowledge of JB
o kern/144905  geom       [geom][gpart] panic in gpart_ctlreq when unplugging ca
o kern/144732  geom       [geom] [patch] geom_cache erroneously decodes its on-d
o bin/144521   geom       geom(1) tool parsing non-subclass command broken
o kern/143455  geom       gstripe(8) in RELENG_8 (31st Jan 2010) broken
o kern/142563  geom       [geom] [hang] ioctl freeze in zpool
f kern/142365  geom       [geom] FreeBSD RAID1 (gmirror) is much slower than Lin
o kern/141740  geom       [geom] gjournal(8): g_journal_destroy concurrent error
o kern/140352  geom       [geom] gjournal + glabel not working
o kern/139847  geom       [geom_mbr] [patch] load/unload causes system to hang
o kern/135898  geom       [geom] Severe filesystem corruption - large files or l
o kern/134922  geom       [gmirror] [panic] kernel panic when use fdisk on disk
o kern/134113  geom       [geli] Problem setting secondary GELI key
o kern/134044  geom       [geom] gmirror(8) overwrites fs with stale data from r
o kern/133931  geom       [geli] [request] intentionally wrong password to destr
o bin/132845   geom       [geom] [patch] ggated(8) does not close files opened a
o kern/132273  geom       glabel(8): [patch] failing on journaled partition
f kern/132242  geom       [gmirror] gmirror.ko fails to fully initialize
o kern/131353  geom       [geom] gjournal(8) kernel lock
p docs/130548  geom       [patch] gjournal(8) man page is missing sysctls
o kern/129674  geom       [geom] gjournal root did not mount on boot
o kern/129645  geom       gjournal(8): GEOM_JOURNAL causes system to fail to boo
o kern/129245  geom       [geom] gcache is more suitable for suffix based provid
f kern/128276  geom       [gmirror] machine lock up when gmirror module is used
o kern/124973  geom       [gjournal] [patch] boot order affects geom_journal con
o kern/124969  geom       gvinum(8): gvinum raid5 plex does not detect missing s
o kern/123962  geom       [panic] [gjournal] gjournal (455Gb data, 8Gb journal),
o kern/123122  geom       [geom] GEOM / gjournal kernel lock
o kern/122738  geom       [geom] gmirror list "losts consumers" after gmirror de
f kern/122415  geom       [geom] UFS labels are being constantly created and rem
o kern/122067  geom       [geom] [panic] Geom crashed during boot
o kern/121559  geom       [patch] [geom] geom label class allows to create inacc
o kern/121364  geom       [gmirror] Removing all providers create a "zombie" mir
o kern/120091  geom       [geom] [geli] [gjournal] geli does not prompt for pass
o kern/119743  geom       [geom] geom label for cds is keeped after dismount and
o kern/115856  geom       [geli] ZFS thought it was degraded when it should have
o kern/115547  geom       [geom] [patch] [request] let GEOM Eli get password fro
o kern/114532  geom       [geom] GEOM_MIRROR shows up in kldstat even if compile
f kern/113957  geom       [gmirror] gmirror is intermittently reporting a degrad
o kern/113837  geom       [geom] unable to access 1024 sector size storage
o kern/113419  geom       [geom] geom fox multipathing not failing back
p bin/110705   geom       gmirror(8) control utility does not exit with correct
o kern/107707  geom       [geom] [patch] [request] add new class geom_xbox360 to
o kern/94632   geom       [geom] Kernel output resets input while GELI asks for
o kern/90582   geom       [geom] [panic] Restore cause panic string (ffs_blkfree
o bin/90093    geom       fdisk(8) incapable of altering in-core geometry
o kern/88601   geom       [geli] geli cause kernel panic under heavy disk usage
o kern/87544   geom       [gbde] mmaping large files on a gbde filesystem deadlo
o kern/84556   geom       [geom] [panic] GBDE-encrypted swap causes panic at shu
o kern/79251   geom       [2TB] newfs fails on 2.6TB gbde device
o kern/79035   geom       [vinum] gvinum unable to create a striped set of mirro
o bin/78131    geom       gbde(8) "destroy" not working.
s kern/73177   geom       kldload geom_* causes panic due to memory exhaustion

57 problems total.

From owner-freebsd-geom@FreeBSD.ORG Mon Apr 19 13:54:32 2010 Return-Path: Delivered-To: freebsd-geom@FreeBSD.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 4E3B11065674; Mon, 19 Apr 2010 13:54:32 +0000 (UTC) (envelope-from avg@icyb.net.ua) Received: from citadel.icyb.net.ua (citadel.icyb.net.ua [212.40.38.140]) by mx1.freebsd.org (Postfix) with ESMTP id 2B0758FC16; Mon, 19 Apr 2010 13:54:30 +0000 (UTC) Received: from odyssey.starpoint.kiev.ua (alpha-e.starpoint.kiev.ua [212.40.38.101]) by citadel.icyb.net.ua (8.8.8p3/ICyb-2.3exp) with ESMTP id QAA28243; Mon, 19 Apr 2010 16:54:28 +0300 (EEST) (envelope-from avg@icyb.net.ua) Message-ID: <4BCC6093.5060007@icyb.net.ua> Date: Mon, 19 Apr 2010 16:54:27 +0300 From: Andriy Gapon User-Agent: Thunderbird 2.0.0.24 (X11/20100319) MIME-Version: 1.0 To: bug-followup@FreeBSD.org, gcooper@FreeBSD.org, freebsd-geom@FreeBSD.org X-Enigmail-Version: 0.95.7 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 7bit Cc: Subject: Re: kern/145818: [geom] geom_stat_open showing cached information for non-present iso X-BeenThere: freebsd-geom@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: GEOM-specific discussions and implementations List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Mon, 19 Apr 2010 13:54:32 -0000

In my opinion this is an issue of CAM peripheral disk drivers (and most probably all other disk drivers) not communicating media change events even when those events are readily known to the drivers.
One simple example is the CDIOCEJECT/CDIOCCLOSE ioctls of cd(4). Other possibilities for detecting media change: handle SCSI ASC 28h ("NOT READY TO READY CHANGE, MEDIUM MAY HAVE CHANGED"); poll for media removal/change; poll for CD drive eject button presses (things that hald does in userland).

AFAICS, currently there is no abstraction to pass media change events from the disk layer to GEOM. E.g. something like disk_media_changed() that would call g_spoil or post g_new_provider_event to trigger re-taste as appropriate. Sounds easy, but the devil is in the details; there might be some locking/layering concerns. But, OTOH, we do this kind of thing in g_access, so I don't see why we couldn't do it in disk drivers.

-- Andriy Gapon

From owner-freebsd-geom@FreeBSD.ORG Mon Apr 19 14:00:16 2010 Return-Path: Delivered-To: freebsd-geom@hub.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 39798106568A for ; Mon, 19 Apr 2010 14:00:16 +0000 (UTC) (envelope-from gnats@FreeBSD.org) Received: from freefall.freebsd.org (freefall.freebsd.org [IPv6:2001:4f8:fff6::28]) by mx1.freebsd.org (Postfix) with ESMTP id 282E78FC18 for ; Mon, 19 Apr 2010 14:00:16 +0000 (UTC) Received: from freefall.freebsd.org (localhost [127.0.0.1]) by freefall.freebsd.org (8.14.4/8.14.4) with ESMTP id o3JE0FfU085816 for ; Mon, 19 Apr 2010 14:00:15 GMT (envelope-from gnats@freefall.freebsd.org) Received: (from gnats@localhost) by freefall.freebsd.org (8.14.4/8.14.4/Submit) id o3JE0FWA085815; Mon, 19 Apr 2010 14:00:15 GMT (envelope-from gnats) Date: Mon, 19 Apr 2010 14:00:15 GMT Message-Id: <201004191400.o3JE0FWA085815@freefall.freebsd.org> To: freebsd-geom@FreeBSD.org From: Andriy Gapon Cc: Subject: Re: kern/145818: [geom] geom_stat_open showing cached information for non-present iso X-BeenThere: freebsd-geom@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list Reply-To: Andriy Gapon List-Id: GEOM-specific discussions and
implementations List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Mon, 19 Apr 2010 14:00:16 -0000

The following reply was made to PR kern/145818; it has been noted by GNATS.

From: Andriy Gapon To: bug-followup@FreeBSD.org, gcooper@FreeBSD.org, freebsd-geom@FreeBSD.org Cc: Subject: Re: kern/145818: [geom] geom_stat_open showing cached information for non-present iso Date: Mon, 19 Apr 2010 16:54:27 +0300

In my opinion this is an issue of CAM peripheral disk drivers (and most probably all other disk drivers) not communicating media change events even when those events are readily known to the drivers. One simple example is the CDIOCEJECT/CDIOCCLOSE ioctls of cd(4). Other possibilities for detecting media change: handle SCSI ASC 28h ("NOT READY TO READY CHANGE, MEDIUM MAY HAVE CHANGED"); poll for media removal/change; poll for CD drive eject button presses (things that hald does in userland).

AFAICS, currently there is no abstraction to pass media change events from the disk layer to GEOM. E.g. something like disk_media_changed() that would call g_spoil or post g_new_provider_event to trigger re-taste as appropriate. Sounds easy, but the devil is in the details; there might be some locking/layering concerns. But, OTOH, we do this kind of thing in g_access, so I don't see why we couldn't do it in disk drivers.
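
[Illustration only, not part of the original message.] The idea above can be pictured with a small userland toy model. This is not kernel code and not the GEOM API: the classes Provider and LabelConsumer are hypothetical stand-ins, and disk_media_changed() is only the name proposed in the message, which does not exist. The sketch shows a consumer keeping stale cached data when the medium changes silently, and dropping it once the driver reports the change.

```python
class Provider:
    """Hypothetical stand-in for a provider backed by removable media."""

    def __init__(self, name, medium=None):
        self.name = name
        self.medium = medium      # label of the inserted medium, or None
        self.consumers = []


class LabelConsumer:
    """Hypothetical stand-in for a class that tastes a provider and caches the result."""

    def __init__(self, provider):
        self.provider = provider
        self.cached_label = None
        provider.consumers.append(self)
        self.taste()

    def taste(self):
        # Read the medium and remember what was found.
        self.cached_label = self.provider.medium

    def spoil(self):
        # Forget everything learned from the previous medium.
        self.cached_label = None


def disk_media_changed(provider, new_medium):
    """Proposed (non-existent) hook: the driver reports a media change."""
    provider.medium = new_medium
    for consumer in provider.consumers:
        consumer.spoil()
        consumer.taste()


cd0 = Provider("cd0", medium="INSTALL_ISO")
label = LabelConsumer(cd0)

cd0.medium = None                    # eject without telling anyone
stale = label.cached_label           # still the old label

disk_media_changed(cd0, None)        # eject reported through the hook
fresh = label.cached_label           # cache has been dropped
print(stale, fresh)
```

In the real kernel the locking and layering concerns mentioned above would dominate; the toy model ignores them entirely.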
-- Andriy Gapon From owner-freebsd-geom@FreeBSD.ORG Mon Apr 19 19:34:08 2010 Return-Path: Delivered-To: freebsd-geom@hub.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 4958F1065672; Mon, 19 Apr 2010 19:34:08 +0000 (UTC) (envelope-from jh@FreeBSD.org) Received: from freefall.freebsd.org (freefall.freebsd.org [IPv6:2001:4f8:fff6::28]) by mx1.freebsd.org (Postfix) with ESMTP id 205C98FC0C; Mon, 19 Apr 2010 19:34:08 +0000 (UTC) Received: from freefall.freebsd.org (localhost [127.0.0.1]) by freefall.freebsd.org (8.14.4/8.14.4) with ESMTP id o3JJY8fb075946; Mon, 19 Apr 2010 19:34:08 GMT (envelope-from jh@freefall.freebsd.org) Received: (from jh@localhost) by freefall.freebsd.org (8.14.4/8.14.4/Submit) id o3JJY7JU075942; Mon, 19 Apr 2010 19:34:07 GMT (envelope-from jh) Date: Mon, 19 Apr 2010 19:34:07 GMT Message-Id: <201004191934.o3JJY7JU075942@freefall.freebsd.org> To: estartu@ze.tum.de, jh@FreeBSD.org, freebsd-geom@FreeBSD.org From: jh@FreeBSD.org Cc: Subject: Re: kern/119743: [geom] geom label for cds is keeped after dismount and eject X-BeenThere: freebsd-geom@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: GEOM-specific discussions and implementations List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Mon, 19 Apr 2010 19:34:08 -0000 Synopsis: [geom] geom label for cds is keeped after dismount and eject State-Changed-From-To: open->closed State-Changed-By: jh State-Changed-When: Mon Apr 19 19:34:07 UTC 2010 State-Changed-Why: Duplicate of kern/145818. 
http://www.freebsd.org/cgi/query-pr.cgi?pr=119743

From owner-freebsd-geom@FreeBSD.ORG Wed Apr 21 09:44:27 2010 Return-Path: Delivered-To: freebsd-geom@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 7D06C1065678 for ; Wed, 21 Apr 2010 09:44:27 +0000 (UTC) (envelope-from lister@kawashti.org) Received: from mra.kawashti.org (mra.kawashti.org [78.136.5.95]) by mx1.freebsd.org (Postfix) with ESMTP id 424EE8FC19 for ; Wed, 21 Apr 2010 09:44:27 +0000 (UTC) Received: from mx.kawashti.org (mx.kawashti.org [196.218.21.179]) (using SSLv3 with cipher DHE-RSA-AES256-SHA (256/256 bits)) (No client certificate requested) by mra.kawashti.org (Postfix) with ESMTP id 0A6424902E4 for ; Wed, 21 Apr 2010 10:29:07 +0100 (BST) Received: from neo ([10.10.10.10]) by mx.kawashti.org (Kawashti Mail) with SMTP id RDS02182 for ; Wed, 21 Apr 2010 11:21:47 +0200 Message-ID: From: "Lister" To: Date: Wed, 21 Apr 2010 11:21:41 +0200 MIME-Version: 1.0 Content-Type: text/plain; format=flowed; charset="windows-1256"; reply-type=original Content-Transfer-Encoding: 7bit X-Priority: 3 X-MSMail-Priority: Normal X-Mailer: Microsoft Outlook Express 6.00.3790.4548 X-MimeOLE: Produced By Microsoft MimeOLE V6.00.3790.4325 X-Mailman-Approved-At: Wed, 21 Apr 2010 11:19:01 +0000 Subject: OCE and GPT X-BeenThere: freebsd-geom@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: GEOM-specific discussions and implementations List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 21 Apr 2010 09:44:27 -0000

Hi All,

I have a 5TB RAID5 (/dev/da0) on a 3Ware controller supporting OCE. I partitioned it into p1, p2 & p3 using gpt on FreeBSD-7.1-RELEASE. p3 is 3.5TB and is the one I need to expand by adding another 1TB drive to the RAID. It is now 87% full.

Neither gpt nor gpart allows resizing a partition. Of course, backing up the RAID to another is not an option.

I'm in a rather desperate situation and I'm willing to do whatever it takes. If there's no current software solution, I'm willing to use a hex editor to edit the disk directly if someone could advise me of the layout of GPT as created by gpt- and gpart if different. I used to do this on MBR disks at times of necessity. Kind regards, Hatem Kawashti From owner-freebsd-geom@FreeBSD.ORG Wed Apr 21 12:04:53 2010 Return-Path: Delivered-To: freebsd-geom@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id E9B31106564A; Wed, 21 Apr 2010 12:04:53 +0000 (UTC) (envelope-from bu7cher@yandex.ru) Received: from forward4.mail.yandex.net (forward4.mail.yandex.net [77.88.46.9]) by mx1.freebsd.org (Postfix) with ESMTP id 969E68FC17; Wed, 21 Apr 2010 12:04:53 +0000 (UTC) Received: from smtp1.mail.yandex.net (smtp1.mail.yandex.net [77.88.46.101]) by forward4.mail.yandex.net (Yandex) with ESMTP id CCABA6AD99C3; Wed, 21 Apr 2010 16:04:51 +0400 (MSD) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=yandex.ru; s=mail; t=1271851491; bh=7b+kGfa9tejeBxFiyRpNMBwxNr4I/s2BBAmGV7u/E9g=; h=Message-ID:Date:From:MIME-Version:To:CC:Subject:References: In-Reply-To:Content-Type:Content-Transfer-Encoding; b=DT1cKR4owEd4LzQ0K3+W4ps5T20AREf22I0x8tGgaBEY5QqWmSBxl8pk8ZMv05tfZ qJP73U9SQtKjhNkb5OA3jGr3SsO3IFYWWKwX3ARCj1r1g9+HiPSeoPQeGN/RKFmYXO sKtgraiUDxhgi+Os5LsWbIW3P3ZpjCE3aAYYiCZA= Received: from [127.0.0.1] (mail.kirov.so-cdu.ru [77.72.136.145]) by smtp1.mail.yandex.net (Yandex) with ESMTPSA id 8D9192900A4; Wed, 21 Apr 2010 16:04:51 +0400 (MSD) Message-ID: <4BCEE9E2.6010007@yandex.ru> Date: Wed, 21 Apr 2010 16:04:50 +0400 From: "Andrey V. 
Elsukov" User-Agent: Mozilla Thunderbird 1.5 (FreeBSD/20051231) MIME-Version: 1.0 To: Lister References: In-Reply-To: Content-Type: text/plain; charset=KOI8-R; format=flowed Content-Transfer-Encoding: 7bit X-Yandex-TimeMark: 1271851491 X-Yandex-Spam: 1 X-Yandex-Front: smtp1.mail.yandex.net Cc: Marcel Moolenaar , freebsd-geom@freebsd.org Subject: Re: OCE and GPT X-BeenThere: freebsd-geom@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: GEOM-specific discussions and implementations List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 21 Apr 2010 12:04:54 -0000

On 21.04.2010 13:21, Lister wrote:
> I have a 5TB RAID5 (/dev/da0) on a 3Ware controller supporting OCE. I
> partitioned it into p1, p2 & p3 using gpt on FreeBSD-7.1-RELEAE.
> P3 is 3.5TB and is the one I need to expand by adding another 1TB drive
> to the RAID. It is now 87% full.
>
> Both gpt and gpart don't allow resizing a partition.
> Of course, backing up the RAID to another is not an option.
>
> I'm in a rather desperate situation and I'm willing to do whatever it
> takes. If there's no current software solution, I'm willing to use a hex
> editor to edit the disk directly if someone could advise me of the
> layout of GPT as created by gpt- and gpart if different. I used to do
> this on MBR disks at times of necessity.

I published a patch to add a resize feature to the GEOM PART class some time ago:
http://lists.freebsd.org/pipermail/freebsd-geom/2010-March/003955.html

-- WBR, Andrey V.
Elsukov From owner-freebsd-geom@FreeBSD.ORG Wed Apr 21 12:07:39 2010 Return-Path: Delivered-To: freebsd-geom@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 44400106566B for ; Wed, 21 Apr 2010 12:07:39 +0000 (UTC) (envelope-from avg@icyb.net.ua) Received: from citadel.icyb.net.ua (citadel.icyb.net.ua [212.40.38.140]) by mx1.freebsd.org (Postfix) with ESMTP id 85A338FC1D for ; Wed, 21 Apr 2010 12:07:38 +0000 (UTC) Received: from odyssey.starpoint.kiev.ua (alpha-e.starpoint.kiev.ua [212.40.38.101]) by citadel.icyb.net.ua (8.8.8p3/ICyb-2.3exp) with ESMTP id PAA01915; Wed, 21 Apr 2010 15:07:21 +0300 (EEST) (envelope-from avg@icyb.net.ua) Message-ID: <4BCEEA79.7080309@icyb.net.ua> Date: Wed, 21 Apr 2010 15:07:21 +0300 From: Andriy Gapon User-Agent: Thunderbird 2.0.0.24 (X11/20100319) MIME-Version: 1.0 To: Lister References: In-Reply-To: X-Enigmail-Version: 0.95.7 Content-Type: text/plain; charset=windows-1256 Content-Transfer-Encoding: 7bit Cc: freebsd-geom@freebsd.org Subject: Re: OCE and GPT X-BeenThere: freebsd-geom@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: GEOM-specific discussions and implementations List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 21 Apr 2010 12:07:39 -0000 on 21/04/2010 12:21 Lister said the following: > Hi All, > > I have a 5TB RAID5 (/dev/da0) on a 3Ware controller supporting OCE. I > partitioned it into p1, p2 & p3 using gpt on FreeBSD-7.1-RELEAE. > P3 is 3.5TB and is the one I need to expand by adding another 1TB drive > to the RAID. It is now 87% full. > > Both gpt and gpart don't allow resizing a partition. > Of course, backing up the RAID to another is not an option. > > I'm in a rather desperate situation and I'm willing to do whatever it > takes. 
If there's no current software solution, I'm willing to use a hex
> editor to edit the disk directly if someone could advise me of the
> layout of GPT as created by gpt- and gpart if different. I used to do
> this on MBR disks at times of necessity.

If you make any mistake and lose your data, then don't blame me.
Before trying what I suggest, wait for a few days in case someone points out a mistake or suggests a better way.

1. Get the current layout, e.g. with 'gpart show'
2. Print (several copies of) it and don't lose it
3. Boot using a Live CD (if da0 is your boot disk)
4. Undo the whole GPT layout using 'gpart delete' and 'gpart destroy'
5. Expand the RAID (I hope OCE means that the new space will be added at the end)
6. Re-create the same layout, but using the new size for p3

Some notes:
1. Deleting/destroying/adding/creating partitions and the scheme does not touch your data/filesystems; it operates only on sectors belonging to GPT metadata.
2. There are two copies of GPT metadata, one at the start of a disk, the other at the end; they both must be valid and provide the same information.
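
[Illustration only, not part of the original message.] Note 2 can be made concrete with a little arithmetic. The sketch below assumes the common GPT defaults (512-byte sectors, 128 partition entries of 128 bytes each); a real disk may differ, and the authoritative values are the ones stored in the GPT header itself. The function name gpt_layout and the sector counts are made up for the example. It shows that the backup copy is tied to the last sector, so it ends up in the wrong place once the disk grows.

```python
def gpt_layout(total_sectors, sector_size=512, num_entries=128, entry_size=128):
    """Return the LBAs occupied by both GPT copies on a disk of total_sectors.

    The default sizes are assumptions (the usual GPT defaults), not values
    read from any particular disk.
    """
    # Sectors needed for the partition entry array (32 with the defaults).
    entry_sectors = (num_entries * entry_size + sector_size - 1) // sector_size
    last_lba = total_sectors - 1
    return {
        "primary_header": 1,                              # LBA 0 is the protective MBR
        "primary_entries": (2, 1 + entry_sectors),
        "first_usable": 2 + entry_sectors,
        "last_usable": last_lba - entry_sectors - 1,
        "backup_entries": (last_lba - entry_sectors, last_lba - 1),
        "backup_header": last_lba,
    }


# Made-up sizes: the same disk before and after the array is expanded.
before = gpt_layout(1000)
after = gpt_layout(1200)

# The old backup header stays at its old LBA, which is no longer the last one.
print("old backup header at LBA", before["backup_header"])
print("expected after growth at LBA", after["backup_header"])
```

This is why re-creating the scheme after the expansion (step 6) matters: it writes the second copy at the new end of the disk.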
-- Andriy Gapon From owner-freebsd-geom@FreeBSD.ORG Wed Apr 21 12:15:37 2010 Return-Path: Delivered-To: freebsd-geom@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id BCE67106564A; Wed, 21 Apr 2010 12:15:37 +0000 (UTC) (envelope-from bu7cher@yandex.ru) Received: from forward12.mail.yandex.net (forward12.mail.yandex.net [95.108.130.94]) by mx1.freebsd.org (Postfix) with ESMTP id 699228FC0C; Wed, 21 Apr 2010 12:15:37 +0000 (UTC) Received: from smtp13.mail.yandex.net (smtp13.mail.yandex.net [95.108.130.68]) by forward12.mail.yandex.net (Yandex) with ESMTP id B548D4EA0132; Wed, 21 Apr 2010 16:15:35 +0400 (MSD) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=yandex.ru; s=mail; t=1271852135; bh=V6HBKKsXEuCTpsgGBzV+8uLrWxfKTAuFO1jNKJbxMHI=; h=Message-ID:Date:From:MIME-Version:To:CC:Subject:References: In-Reply-To:Content-Type:Content-Transfer-Encoding; b=JVUjJPWeQQnmD5fNZurClmzp80JqiflZ3Aj86YNjentdNpl07H/RVXyExhk4p4uBg tu66KoLmhvEwI0612LUM9gseoykkV/Jl7bsWUkuu9G/rs6klbD/LTwRjg/h+YDKPGm ieQ1qD+yKDdLZnHudnhu1xxLeuOuyVlWHbR+A14s= Received: from [127.0.0.1] (ns.kirov.so-ups.ru [77.72.136.145]) by smtp13.mail.yandex.net (Yandex) with ESMTPSA id 74ADE41580B7; Wed, 21 Apr 2010 16:15:35 +0400 (MSD) Message-ID: <4BCEEC66.1080804@yandex.ru> Date: Wed, 21 Apr 2010 16:15:34 +0400 From: "Andrey V. 
Elsukov" User-Agent: Mozilla Thunderbird 1.5 (FreeBSD/20051231) MIME-Version: 1.0 To: Lister References: <4BCEE9E2.6010007@yandex.ru> In-Reply-To: <4BCEE9E2.6010007@yandex.ru> Content-Type: text/plain; charset=KOI8-R; format=flowed Content-Transfer-Encoding: 7bit X-Yandex-TimeMark: 1271852135 X-Yandex-Spam: 1 X-Yandex-Front: smtp13.mail.yandex.net Cc: Marcel Moolenaar , freebsd-geom@freebsd.org Subject: Re: OCE and GPT X-BeenThere: freebsd-geom@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: GEOM-specific discussions and implementations List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 21 Apr 2010 12:15:37 -0000

On 21.04.2010 16:04, Andrey V. Elsukov wrote:
> I published patch to add resize feature to GEOM PART class some time ago:
> http://lists.freebsd.org/pipermail/freebsd-geom/2010-March/003955.html

You can also find some info here:
http://bu7cher.blogspot.com/2010/03/gpart_22.html (in Russian)
http://p4db.freebsd.org/clientView.cgi?CLIENT=butcher-gpart
http://butcher.heavennet.ru/patches/kernel/gpart_resize/gpart.patch

-- WBR, Andrey V.
Elsukov From owner-freebsd-geom@FreeBSD.ORG Wed Apr 21 12:26:54 2010 Return-Path: Delivered-To: freebsd-geom@FreeBSD.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 59A971065670 for ; Wed, 21 Apr 2010 12:26:54 +0000 (UTC) (envelope-from avg@icyb.net.ua) Received: from citadel.icyb.net.ua (citadel.icyb.net.ua [212.40.38.140]) by mx1.freebsd.org (Postfix) with ESMTP id 680168FC08 for ; Wed, 21 Apr 2010 12:26:53 +0000 (UTC) Received: from odyssey.starpoint.kiev.ua (alpha-e.starpoint.kiev.ua [212.40.38.101]) by citadel.icyb.net.ua (8.8.8p3/ICyb-2.3exp) with ESMTP id PAA02650; Wed, 21 Apr 2010 15:26:46 +0300 (EEST) (envelope-from avg@icyb.net.ua) Message-ID: <4BCEEF06.8010203@icyb.net.ua> Date: Wed, 21 Apr 2010 15:26:46 +0300 From: Andriy Gapon User-Agent: Thunderbird 2.0.0.24 (X11/20100319) MIME-Version: 1.0 To: "Andrey V. Elsukov" References: <4BCEE9E2.6010007@yandex.ru> <4BCEEC66.1080804@yandex.ru> In-Reply-To: <4BCEEC66.1080804@yandex.ru> X-Enigmail-Version: 0.95.7 Content-Type: text/plain; charset=KOI8-R Content-Transfer-Encoding: 7bit Cc: Lister , Marcel Moolenaar , freebsd-geom@FreeBSD.org Subject: Re: OCE and GPT X-BeenThere: freebsd-geom@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: GEOM-specific discussions and implementations List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 21 Apr 2010 12:26:54 -0000 on 21/04/2010 15:15 Andrey V. Elsukov said the following: > On 21.04.2010 16:04, Andrey V. 
Elsukov wrote: >> I published patch to add resize feature to GEOM PART class some time ago: >> http://lists.freebsd.org/pipermail/freebsd-geom/2010-March/003955.html > > Also you can found some info here: > http://bu7cher.blogspot.com/2010/03/gpart_22.html (in russian) > http://p4db.freebsd.org/clientView.cgi?CLIENT=butcher-gpart > http://butcher.heavennet.ru/patches/kernel/gpart_resize/gpart.patch Andrey, will your patch take care of moving the second copy of GPT towards (new) disk end if disk size is increased? -- Andriy Gapon From owner-freebsd-geom@FreeBSD.ORG Wed Apr 21 12:38:36 2010 Return-Path: Delivered-To: freebsd-geom@FreeBSD.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 410C2106566B; Wed, 21 Apr 2010 12:38:36 +0000 (UTC) (envelope-from bu7cher@yandex.ru) Received: from forward10.mail.yandex.net (forward10.mail.yandex.net [77.88.61.49]) by mx1.freebsd.org (Postfix) with ESMTP id E0EA18FC08; Wed, 21 Apr 2010 12:38:35 +0000 (UTC) Received: from smtp6.mail.yandex.net (smtp6.mail.yandex.net [77.88.61.56]) by forward10.mail.yandex.net (Yandex) with ESMTP id E21A51F503A6; Wed, 21 Apr 2010 16:38:33 +0400 (MSD) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=yandex.ru; s=mail; t=1271853513; bh=YK39Xg2SacxVX0Q7WdiwUAaHKQoDbe6TGV0+7IYxvUI=; h=Message-ID:Date:From:MIME-Version:To:CC:Subject:References: In-Reply-To:Content-Type:Content-Transfer-Encoding; b=AAT/3ZvKsUyt550cwV5aVJkgVx9cU2iQwfnU4dwcQD4LO8rgX6EjbHeQHONCsA/mE ZpYgjBZXlyWSi5Zq1OjvDX9xyL9QMEysuCfEsdqsv38ry0v7Z2/QtPbXiykvPK3+QC 4QFM+Ctwk/pT1jwXgxh3+vSQtEniSpV0WJBtBuNM= Received: from [127.0.0.1] (ns.kirov.so-ups.ru [77.72.136.145]) by smtp6.mail.yandex.net (Yandex) with ESMTPSA id 886DE3280B0; Wed, 21 Apr 2010 16:38:33 +0400 (MSD) Message-ID: <4BCEF1C8.8080805@yandex.ru> Date: Wed, 21 Apr 2010 16:38:32 +0400 From: "Andrey V. 
Elsukov" User-Agent: Mozilla Thunderbird 1.5 (FreeBSD/20051231) MIME-Version: 1.0 To: Andriy Gapon References: <4BCEE9E2.6010007@yandex.ru> <4BCEEC66.1080804@yandex.ru> <4BCEEF06.8010203@icyb.net.ua> In-Reply-To: <4BCEEF06.8010203@icyb.net.ua> Content-Type: text/plain; charset=KOI8-R; format=flowed Content-Transfer-Encoding: 7bit X-Yandex-TimeMark: 1271853513 X-Yandex-Spam: 1 X-Yandex-Front: smtp6.mail.yandex.net Cc: Lister , Marcel Moolenaar , freebsd-geom@FreeBSD.org Subject: Re: OCE and GPT X-BeenThere: freebsd-geom@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: GEOM-specific discussions and implementations List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 21 Apr 2010 12:38:36 -0000

On 21.04.2010 16:26, Andriy Gapon wrote:
> will your patch take care of moving the second copy of GPT towards (new) disk end
> if disk size is increased?

Yes, you are right. It should be modified. I'll think about this today.

-- WBR, Andrey V.
Elsukov From owner-freebsd-geom@FreeBSD.ORG Wed Apr 21 12:56:27 2010 Return-Path: Delivered-To: freebsd-geom@FreeBSD.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id A50E51065672; Wed, 21 Apr 2010 12:56:27 +0000 (UTC) (envelope-from bu7cher@yandex.ru) Received: from forward2.mail.yandex.net (forward2.mail.yandex.net [77.88.46.7]) by mx1.freebsd.org (Postfix) with ESMTP id 511AC8FC1B; Wed, 21 Apr 2010 12:56:27 +0000 (UTC) Received: from smtp2.mail.yandex.net (smtp2.mail.yandex.net [77.88.46.102]) by forward2.mail.yandex.net (Yandex) with ESMTP id 7C44738A8FF2; Wed, 21 Apr 2010 16:56:25 +0400 (MSD) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=yandex.ru; s=mail; t=1271854585; bh=Vq1ufzHWapoOXjv2+pScrjbhwBNJbFNgdDEQ51Dlriw=; h=Message-ID:Date:From:MIME-Version:To:CC:Subject:References: In-Reply-To:Content-Type:Content-Transfer-Encoding; b=DcUVDC3WmT/yb/Z4JMVUefOj0bH+aeQrwtp6FFacd1dWZXEoho8DDplwNQzqR8K5I /nD5LSv2erSJysqpmdmjE8Iu1kNxyI1v7chZLp1ASrN3TCvSlI/R5kT3q6iEPfj3fi ac43wiA5lf6wxdL8whb5D5c0870XcCF5pG3IFcGs= Received: from [127.0.0.1] (ns.kirov.so-ups.ru [77.72.136.145]) by smtp2.mail.yandex.net (Yandex) with ESMTPSA id 1B9A3528067; Wed, 21 Apr 2010 16:56:25 +0400 (MSD) Message-ID: <4BCEF5F8.6090102@yandex.ru> Date: Wed, 21 Apr 2010 16:56:24 +0400 From: "Andrey V. 
Elsukov" User-Agent: Mozilla Thunderbird 1.5 (FreeBSD/20051231) MIME-Version: 1.0 To: Andriy Gapon References: <4BCEE9E2.6010007@yandex.ru> <4BCEEC66.1080804@yandex.ru> <4BCEEF06.8010203@icyb.net.ua> In-Reply-To: <4BCEEF06.8010203@icyb.net.ua> Content-Type: text/plain; charset=KOI8-R; format=flowed Content-Transfer-Encoding: 7bit X-Yandex-TimeMark: 1271854585 X-Yandex-Spam: 1 X-Yandex-Front: smtp2.mail.yandex.net Cc: Lister , Marcel Moolenaar , freebsd-geom@FreeBSD.org Subject: Re: OCE and GPT X-BeenThere: freebsd-geom@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: GEOM-specific discussions and implementations List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 21 Apr 2010 12:56:27 -0000 On 21.04.2010 16:26, Andriy Gapon wrote: > will your patch take care of moving the second copy of GPT towards (new) disk end > if disk size is increased? The current implementation of the resize feature is aimed at resizing providers within a scheme. But with GPT we have a problem: after booting with a bigger media size, the second partition table will be lost, and the GPT will be broken. I think it should be recovered first, and there are some plans to implement this feature. After that, partitions within the scheme can be resized with my patch. Recovering the GPT will write the secondary table and should also fix the internal information about the media size. There can be several ways to do this (if we think about a generic implementation). What should we do when the media size is smaller than it was before? Should we reject recovery in that case and allow it only for the same or a bigger media size? -- WBR, Andrey V. 
Elsukov From owner-freebsd-geom@FreeBSD.ORG Wed Apr 21 13:59:46 2010 Return-Path: Delivered-To: freebsd-geom@FreeBSD.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 6AD221065670; Wed, 21 Apr 2010 13:59:46 +0000 (UTC) (envelope-from avg@icyb.net.ua) Received: from citadel.icyb.net.ua (citadel.icyb.net.ua [212.40.38.140]) by mx1.freebsd.org (Postfix) with ESMTP id 7695F8FC0A; Wed, 21 Apr 2010 13:59:44 +0000 (UTC) Received: from odyssey.starpoint.kiev.ua (alpha-e.starpoint.kiev.ua [212.40.38.101]) by citadel.icyb.net.ua (8.8.8p3/ICyb-2.3exp) with ESMTP id QAA05010; Wed, 21 Apr 2010 16:59:35 +0300 (EEST) (envelope-from avg@icyb.net.ua) Message-ID: <4BCF04C7.1050701@icyb.net.ua> Date: Wed, 21 Apr 2010 16:59:35 +0300 From: Andriy Gapon User-Agent: Thunderbird 2.0.0.24 (X11/20100319) MIME-Version: 1.0 To: "Andrey V. Elsukov" References: <4BCEE9E2.6010007@yandex.ru> <4BCEEC66.1080804@yandex.ru> <4BCEEF06.8010203@icyb.net.ua> <4BCEF5F8.6090102@yandex.ru> In-Reply-To: <4BCEF5F8.6090102@yandex.ru> X-Enigmail-Version: 0.95.7 Content-Type: text/plain; charset=KOI8-R Content-Transfer-Encoding: 7bit Cc: Lister , Marcel Moolenaar , freebsd-geom@FreeBSD.org Subject: Re: OCE and GPT X-BeenThere: freebsd-geom@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: GEOM-specific discussions and implementations List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 21 Apr 2010 13:59:46 -0000 on 21/04/2010 15:56 Andrey V. Elsukov said the following: > On 21.04.2010 16:26, Andriy Gapon wrote: >> will your patch take care of moving the second copy of GPT towards >> (new) disk end >> if disk size is increased? > > Current implementation of resize feature is targeted to resize > providers withing scheme. But with GPT we have problem, after > booting with bigger media size the second partition table will > be lost. And GPT will be broken. Why? 
Do we have it hardcoded where to look for the secondary GPT? Or do we enforce that it is at the very end of disk? My understanding is that GPT partition table header contains positions of both primary and secondary GPT (fields at offsets 24 and 32). I think that we should use that and growing disk would not cause any problem. GPT scheme would cover only a portion of disk, but that should be OK as a temporary state. Then, we could have some additional command like 'reinit' that would relocate the secondary table to the new end of disk and update recorded positions to new values. > I think first of it should be recovered. > And there are some plans about implementing this feature. > After that partitions within scheme can be resized with my patch. I also think that this recovery mechanism is needed. In short: recover - re-create secondary table based on primary table reinit - relocate secondary table to a new position and update offsets in both tables accordingly > Recovering of GPT will write secondary table and also should fix internal > information about media size. And there can be several ways (if we think > about > generic implementation). What should we do when media size will be > smaller that > it was before? Should we reject this way of recovering and allow recovering > only for the same size or bigger media size? I think that we could allow smaller media size _provided_ that lost space doesn't overlap with any partitions found in primary table. Otherwise it's a data loss scenario and a user should be left to deal with it. IMHO, of course. 
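To make the two header fields mentioned above concrete, here is a minimal sketch, not taken from the g_part_gpt code, of reading them from a raw GPT header sector. The offsets 24 (MyLBA) and 32 (AlternateLBA) are as stated above; the names `le64_decode`, `struct gpt_lbas` and `gpt_read_lbas` are hypothetical and exist only for this illustration.

```c
#include <stdint.h>

/* Byte offsets of the two LBA fields inside a GPT header. */
#define GPT_OFF_MY_LBA   24   /* LBA holding this copy of the header */
#define GPT_OFF_ALT_LBA  32   /* LBA holding the other copy of the header */

/* Hypothetical helper: decode a little-endian 64-bit value byte by byte,
 * so it works on any host endianness and on unaligned buffers. */
static uint64_t
le64_decode(const unsigned char *p)
{
    uint64_t v = 0;
    int i;

    for (i = 7; i >= 0; i--)
        v = (v << 8) | p[i];
    return (v);
}

/* Hypothetical container for the two positions recorded in a header. */
struct gpt_lbas {
    uint64_t my_lba;
    uint64_t alt_lba;
};

/* Extract MyLBA and AlternateLBA from a raw header buffer, which must be
 * at least 40 bytes long.  No signature or CRC validation is done here. */
static struct gpt_lbas
gpt_read_lbas(const unsigned char *hdr)
{
    struct gpt_lbas l;

    l.my_lba = le64_decode(hdr + GPT_OFF_MY_LBA);
    l.alt_lba = le64_decode(hdr + GPT_OFF_ALT_LBA);
    return (l);
}
```

With a primary header read from LBA 1, `alt_lba` is where that header says the secondary copy lives, which after growing the disk is no longer the last LBA of the device.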
-- Andriy Gapon From owner-freebsd-geom@FreeBSD.ORG Wed Apr 21 17:48:20 2010 Return-Path: Delivered-To: freebsd-geom@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 11D89106566C; Wed, 21 Apr 2010 17:48:20 +0000 (UTC) (envelope-from bu7cher@yandex.ru) Received: from forward11.mail.yandex.net (forward11.mail.yandex.net [95.108.130.93]) by mx1.freebsd.org (Postfix) with ESMTP id AFC568FC17; Wed, 21 Apr 2010 17:48:19 +0000 (UTC) Received: from web136.yandex.ru (web136.yandex.ru [95.108.130.34]) by forward11.mail.yandex.net (Yandex) with ESMTP id D6B883ED0655; Wed, 21 Apr 2010 21:48:17 +0400 (MSD) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=yandex.ru; s=mail; t=1271872097; bh=Hl52msHd5VYFvZ7M0qm9Rjy0D950OhFVLHv9HN4q+Ak=; h=From:To:Cc:In-Reply-To:References:Subject:MIME-Version:Message-Id: Date:Content-Transfer-Encoding:Content-Type; b=kZiwgy5hDfJyPZdszGjy7vagGf4nMmt6/nfM8t7mnbgZid3resg97DB8txyfQWt9w iU8mQfHV1aXvmRMdrjTYyjRJGph1J+NIcJ1Rn9ufttlVqcSFnA3BjBBBVLtgesZAOn CbDkZHocWjKrgER9Yj2u+najyJeh+R2LWjdIvHNc= Received: from localhost (localhost.localdomain [127.0.0.1]) by web136.yandex.ru (Yandex) with ESMTP id D369D24F807E; Wed, 21 Apr 2010 21:48:17 +0400 (MSD) X-Yandex-Spam: 1 X-Yandex-Front: web136.yandex.ru X-Yandex-TimeMark: 1271872097 Received: from [77.72.138.63] ([77.72.138.63]) by mail.yandex.ru with HTTP; Wed, 21 Apr 2010 21:48:15 +0400 From: Andrey V. 
Elsukov To: Andriy Gapon In-Reply-To: <4BCF04C7.1050701@icyb.net.ua> References: <4BCEE9E2.6010007@yandex.ru> <4BCEEC66.1080804@yandex.ru> <4BCEEF06.8010203@icyb.net.ua> <4BCEF5F8.6090102@yandex.ru> <4BCF04C7.1050701@icyb.net.ua> MIME-Version: 1.0 Message-Id: <50691271872096@web136.yandex.ru> Date: Wed, 21 Apr 2010 21:48:16 +0400 X-Mailer: Yamail [ http://yandex.ru ] 5.0 Content-Transfer-Encoding: 7bit Content-Type: text/plain Cc: Lister , Marcel Moolenaar , freebsd-geom@freebsd.org Subject: Re: Re: OCE and GPT X-BeenThere: freebsd-geom@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: GEOM-specific discussions and implementations List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 21 Apr 2010 17:48:20 -0000 21.04.10, 16:59, Andriy Gapon: > > providers withing scheme. But with GPT we have problem, after > > booting with bigger media size the second partition table will > > be lost. And GPT will be broken. > > Why? > Do we have it hardcoded where to look for the secondary GPT? Yes. The current implementation searches for the secondary GPT header only at the last LBA. And that violates the UEFI 2.3 specification. > I also think that this recovery mechanism is needed. > In short: > recover - re-create secondary table based on primary table There can also be a situation where the primary table is corrupted but the secondary isn't. > reinit - relocate secondary table to a new position and update offsets in both > tables accordingly What should reinit do when nothing has changed? > > it was before? Should we reject this way of recovering and allow recovering > > only for the same size or bigger media size? > > I think that we could allow smaller media size _provided_ that lost space doesn't > overlap with any partitions found in primary table. Otherwise it's a data loss > scenario and a user should be left to deal with it. IMHO, of course. The current implementation rejects a table where 'LastUsableLBA' is greater than the medium's last LBA. 
This behavior also isn't described in the UEFI spec. =) I think that once we have the recover and reinit verbs implemented, we can change some of the table detection algorithms. -- WBR, Andrey V. Elsukov From owner-freebsd-geom@FreeBSD.ORG Wed Apr 21 17:59:55 2010 Return-Path: Delivered-To: freebsd-geom@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 3E511106564A; Wed, 21 Apr 2010 17:59:55 +0000 (UTC) (envelope-from xcllnt@mac.com) Received: from asmtpout030.mac.com (asmtpout030.mac.com [17.148.16.105]) by mx1.freebsd.org (Postfix) with ESMTP id 228118FC0C; Wed, 21 Apr 2010 17:59:54 +0000 (UTC) MIME-version: 1.0 Content-transfer-encoding: 7BIT Content-type: text/plain; charset=us-ascii Received: from macbook-pro.jnpr.net (natint3.juniper.net [66.129.224.36]) by asmtp030.mac.com (Sun Java(tm) System Messaging Server 6.3-8.01 (built Dec 16 2008; 32bit)) with ESMTPSA id <0L18006UANBOU750@asmtp030.mac.com>; Wed, 21 Apr 2010 10:59:52 -0700 (PDT) X-Proofpoint-Spam-Details: rule=notspam policy=default score=0 spamscore=0 ipscore=0 phishscore=0 bulkscore=0 adultscore=0 classifier=spam adjust=0 reason=mlx engine=5.0.0-0908210000 definitions=main-1004210171 From: Marcel Moolenaar In-reply-to: <50691271872096@web136.yandex.ru> Date: Wed, 21 Apr 2010 10:59:48 -0700 Message-id: <75798832-C041-4796-8C10-5BE61FB7583A@mac.com> References: <4BCEE9E2.6010007@yandex.ru> <4BCEEC66.1080804@yandex.ru> <4BCEEF06.8010203@icyb.net.ua> <4BCEF5F8.6090102@yandex.ru> <4BCF04C7.1050701@icyb.net.ua> <50691271872096@web136.yandex.ru> To: "Andrey V. 
Elsukov" X-Mailer: Apple Mail (2.1078) Cc: Lister , Marcel Moolenaar , Andriy Gapon , freebsd-geom@freebsd.org Subject: Re: OCE and GPT X-BeenThere: freebsd-geom@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: GEOM-specific discussions and implementations List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 21 Apr 2010 17:59:55 -0000 On Apr 21, 2010, at 10:48 AM, Andrey V. Elsukov wrote: > 21.04.10, 16:59, Andriy Gapon: > >>> providers withing scheme. But with GPT we have problem, after >>> booting with bigger media size the second partition table will >>> be lost. And GPT will be broken. >> >> Why? >> Do we have it hardcoded where to look for the secondary GPT? > > Yes. Current implementation does search for second GPT table only at last LBA. > And it violates with UEFI 2.3 specification. No, it's ACCORDING to the specification: UEFI version 2.3, page 99 (paragraph 5.3.1): "Two GPT Header structures are stored on the device: the primary and the backup. The primary GPT Header must be located in LBA 1 (i.e., the second logical block), and the backup GPT Header must be located in the last LBA of the device." 
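As a rough illustration of the placement rule quoted above, the following sketch computes where the backup header must sit for a given medium and checks whether a recorded backup position still satisfies the rule. The helper names `medium_last_lba` and `backup_is_at_end` are hypothetical, and the media and sector sizes are assumed to come from something like diskinfo.

```c
#include <stdint.h>

/* The primary header position is fixed by the specification. */
#define GPT_PRIMARY_LBA 1

/* Hypothetical helper: last addressable LBA of a medium.
 * Returns 0 for degenerate input (zero sector size or an empty medium). */
static uint64_t
medium_last_lba(uint64_t mediasize, uint32_t sectorsize)
{
    if (sectorsize == 0 || mediasize < sectorsize)
        return (0);
    return (mediasize / sectorsize - 1);
}

/* Hypothetical helper: nonzero if a backup header recorded at alt_lba
 * is located in the last LBA of the medium, as the rule requires. */
static int
backup_is_at_end(uint64_t alt_lba, uint64_t mediasize, uint32_t sectorsize)
{
    return (alt_lba == medium_last_lba(mediasize, sectorsize));
}
```

For example, a backup header written when a 512-byte-sector medium was 104857600 bytes long sits at LBA 204799; once the medium grows to 157286400 bytes the last LBA is 307199, so the old backup header no longer satisfies the rule until it is relocated.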
FYI, -- Marcel Moolenaar xcllnt@mac.com From owner-freebsd-geom@FreeBSD.ORG Wed Apr 21 18:15:44 2010 Return-Path: Delivered-To: freebsd-geom@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id F3C341065675; Wed, 21 Apr 2010 18:15:43 +0000 (UTC) (envelope-from bu7cher@yandex.ru) Received: from forward8.mail.yandex.net (forward8.mail.yandex.net [77.88.61.38]) by mx1.freebsd.org (Postfix) with ESMTP id 9A7108FC19; Wed, 21 Apr 2010 18:15:43 +0000 (UTC) Received: from web84.yandex.ru (web84.yandex.ru [77.88.60.74]) by forward8.mail.yandex.net (Yandex) with ESMTP id C47DA16F058F; Wed, 21 Apr 2010 22:15:41 +0400 (MSD) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=yandex.ru; s=mail; t=1271873741; bh=gl+otgw+bX3LLfmPLZsknjWYAwgct3RIBOIQm/qKdGM=; h=From:To:Cc:In-Reply-To:References:Subject:MIME-Version:Message-Id: Date:Content-Transfer-Encoding:Content-Type; b=mKnoFCnAkiJZHa2CPN+hul5rJtqEZPojmZwu+o2jGjYIlUXADBJ9AJZ0CtZTY51nh 6f1iU3tq7sLxZvIatLL2nre78xvYo+zF1Y7WhRUN1JsnXPzirfEWKdJ/P7CBbS382O rtKiXZWRvsMeiuF7Xcp+kFziMkCgb3Hu36/tk99Q= Received: from localhost (localhost.localdomain [127.0.0.1]) by web84.yandex.ru (Yandex) with ESMTP id BFBCDCD0042; Wed, 21 Apr 2010 22:15:41 +0400 (MSD) X-Yandex-Spam: 1 X-Yandex-Front: web84.yandex.ru X-Yandex-TimeMark: 1271873741 Received: from [77.72.138.63] ([77.72.138.63]) by mail.yandex.ru with HTTP; Wed, 21 Apr 2010 22:15:41 +0400 From: Andrey V. 
Elsukov To: Marcel Moolenaar In-Reply-To: <75798832-C041-4796-8C10-5BE61FB7583A@mac.com> References: <4BCEE9E2.6010007@yandex.ru> <4BCEEC66.1080804@yandex.ru> <4BCEEF06.8010203@icyb.net.ua> <4BCEF5F8.6090102@yandex.ru> <4BCF04C7.1050701@icyb.net.ua> <50691271872096@web136.yandex.ru> <75798832-C041-4796-8C10-5BE61FB7583A@mac.com> MIME-Version: 1.0 Message-Id: <144661271873741@web84.yandex.ru> Date: Wed, 21 Apr 2010 22:15:41 +0400 X-Mailer: Yamail [ http://yandex.ru ] 5.0 Content-Transfer-Encoding: 8bit Content-Type: text/plain; charset=utf-8 Cc: Lister , Marcel Moolenaar , Andriy Gapon , freebsd-geom@freebsd.org Subject: Re: Re: OCE and GPT X-BeenThere: freebsd-geom@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: GEOM-specific discussions and implementations List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 21 Apr 2010 18:15:44 -0000 21.04.10, 10:59, "Marcel Moolenaar" : > UEFI version 2.3, page 99 (paragraph 5.3.1): > "Two GPT Header structures are stored on the device: the primary and the > backup. The primary GPT Header must be located in LBA 1 (i.e., the second > logical block), and the backup GPT Header must be located in the last LBA > of the device." Hi, Marcel. OK, you are right, but: GPT Header, page 102 (paragraph 5.3.2): The following test must be performed to determine if a GPT is valid: • Check the Signature • Check the Header CRC • Check that the MyLBA entry points to the LBA that contains the GUID Partition Table • Check the CRC of the GUID Partition Entry Array If the GPT is the primary table, stored at LBA 1: • Check the AlternateLBA to see if it is a valid GPT So, in our case (when resizing is allowed), when the primary header is correct it may be more useful to check the AlternateLBA it records (which may not be equal to the last LBA). IMHO. Marcel, what do you think about the recover and reinit verbs? -- WBR, Andrey V. 
Elsukov From owner-freebsd-geom@FreeBSD.ORG Wed Apr 21 20:49:44 2010 Return-Path: Delivered-To: freebsd-geom@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 02A751065675 for ; Wed, 21 Apr 2010 20:49:44 +0000 (UTC) (envelope-from lister@kawashti.org) Received: from mra.kawashti.org (mra.kawashti.org [78.136.5.95]) by mx1.freebsd.org (Postfix) with ESMTP id 940008FC08 for ; Wed, 21 Apr 2010 20:49:43 +0000 (UTC) Received: from mx.kawashti.org (mx.kawashti.org [196.218.21.179]) (using SSLv3 with cipher DHE-RSA-AES256-SHA (256/256 bits)) (No client certificate requested) by mra.kawashti.org (Postfix) with ESMTP id 973854902E5 for ; Wed, 21 Apr 2010 21:49:35 +0100 (BST) Received: from neo ([10.10.10.10]) by mx.kawashti.org (Kawashti Mail) with SMTP id RDS02182 for ; Wed, 21 Apr 2010 22:49:27 +0200 Message-ID: From: "Lister" To: References: <4BCEEA79.7080309@icyb.net.ua> Date: Wed, 21 Apr 2010 22:49:20 +0200 MIME-Version: 1.0 Content-Type: text/plain; format=flowed; charset="windows-1256"; reply-type=original Content-Transfer-Encoding: 8bit X-Priority: 3 X-MSMail-Priority: Normal X-Mailer: Microsoft Outlook Express 6.00.3790.4548 X-MimeOLE: Produced By Microsoft MimeOLE V6.00.3790.4325 X-Mailman-Approved-At: Wed, 21 Apr 2010 21:10:10 +0000 Subject: Re: OCE and GPT X-BeenThere: freebsd-geom@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: GEOM-specific discussions and implementations List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 21 Apr 2010 20:49:44 -0000 Hello All, I'd like to first thank Andrey Elsukov and Andriy Gapon for their valuable contribution and very quick reply. Given that the patch is not yet ready as I understand it, I'll go with the alternate method of destroying and recreating the GPT. To that end I yet have to ask 3 more questions: 1. How do I make sure I have a valid secondary GPT? 
Neither gpt nor gpart tells anything about it. Can I assume that if 'gpart show da0' shows a proper layout and no error messages, the 2ry is valid? I tried to make a quick visual comparison on another system (8.0-RELEASE this time) with a 4TB RAID5 that I just set up yesterday, using gpart this time because I had to. I used hexdump for the purpose, dumping the first 34 sectors of /dev/da0 and, in another ssh shell, the 34 sectors beyond the last partition. hexdump of the second got nothing; it seemed to have frozen but would break normally on CTRL+C. I've never seen the likes of this before. In an attempt to troubleshoot, I narrowed the selection to only ONE sector…same result. Then the last sector of the last partition…same thing. Even a dump of the first sector of the last partition exhibited the same behavior. The partition is viable, though. I copied a 4.4GB file to it over ssh without a problem and the data rate was consistent with expectations. I know this is a side issue, but is hexdump/hd known to have problems with large devices, or perhaps 32/64-bit issues? I forgot to mention that all my systems are AMD64. 2. Now assuming OCE adds the new space at the tail, which I have yet to verify before proceeding, will 'growfs' serve the purpose of extending newfs' work? Its man page doesn't reference gpt or gpart, but rather bsdlabel and fdisk, which is suggestive of the contrary. 3. Does it make a difference if I use gpt or gpart to recreate the GPT, given that I'd initially created it with gpt? Note. My root fs and everything else beyond the library is on another RAID1 (on the Motherboard). Kind regards, Hatem Kawashti ----- Original Message ----- From: "Andriy Gapon" To: "Lister" Cc: Sent: Wednesday, April 21, 2010 14:07 Subject: Re: OCE and GPT > on 21/04/2010 12:21 Lister said the following: >> Hi All, >> >> I have a 5TB RAID5 (/dev/da0) on a 3Ware controller supporting OCE. I >> partitioned it into p1, p2 & p3 using gpt on FreeBSD-7.1-RELEAE. 
>> P3 is 3.5TB and is the one I need to expand by adding another 1TB drive >> to the RAID. It is now 87% full. >> >> Both gpt and gpart don't allow resizing a partition. >> Of course, backing up the RAID to another is not an option. >> >> I'm in a rather desperate situation and I'm willing to do whatever it >> takes. If there's no current software solution, I'm willing to use a hex >> editor to edit the disk directly if someone could advise me of the >> layout of GPT as created by gpt- and gpart if different. I used to do >> this on MBR disks at times of necessity. > > If you make any mistake and lose your data, then don't blame me. > Before trying what I suggest wait for a few days in case someone points out a > mistake or suggests a better way. > > 1. Get current layout e.g. with 'gpart show' > 2. Print (several copies of) it and don't lose it > 3. Boot using Live CD (if da0 is your boot disk) > 4. Undo the whole GPT layout using 'gpart delete' and 'gpart destroy' > 5. Expand RAID (I hope OCE means that the new space will be added at the end) > 5. Re-create the same layout but using new size for p3 > > Some notes: > 1. Deleting/destroying/adding/creating partitions and scheme does not touch your > data/filesystems; it operates only on sectors belonging to GPT metadata. > 2. There are two copies of GPT metadata, one at the start of a disk, the other at > the end; they both must be valid and provide the same information. 
> -- > Andriy Gapon > _______________________________________________ > freebsd-geom@freebsd.org mailing list > http://lists.freebsd.org/mailman/listinfo/freebsd-geom > To unsubscribe, send any mail to "freebsd-geom-unsubscribe@freebsd.org" From owner-freebsd-geom@FreeBSD.ORG Thu Apr 22 09:58:32 2010 Return-Path: Delivered-To: freebsd-geom@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 473861065677 for ; Thu, 22 Apr 2010 09:58:32 +0000 (UTC) (envelope-from avg@icyb.net.ua) Received: from citadel.icyb.net.ua (citadel.icyb.net.ua [212.40.38.140]) by mx1.freebsd.org (Postfix) with ESMTP id 811D58FC24 for ; Thu, 22 Apr 2010 09:58:31 +0000 (UTC) Received: from porto.topspin.kiev.ua (porto-e.starpoint.kiev.ua [212.40.38.100]) by citadel.icyb.net.ua (8.8.8p3/ICyb-2.3exp) with ESMTP id MAA00510; Thu, 22 Apr 2010 12:58:24 +0300 (EEST) (envelope-from avg@icyb.net.ua) Received: from localhost.topspin.kiev.ua ([127.0.0.1]) by porto.topspin.kiev.ua with esmtp (Exim 4.34 (FreeBSD)) id 1O4tAt-0007Pi-NH; Thu, 22 Apr 2010 12:58:23 +0300 Message-ID: <4BD01DBE.7030905@icyb.net.ua> Date: Thu, 22 Apr 2010 12:58:22 +0300 From: Andriy Gapon User-Agent: Thunderbird 2.0.0.24 (X11/20100321) MIME-Version: 1.0 To: Lister References: <4BCEEA79.7080309@icyb.net.ua> In-Reply-To: X-Enigmail-Version: 0.96.0 Content-Type: text/plain; charset=windows-1256 Content-Transfer-Encoding: 8bit Cc: freebsd-geom@freebsd.org Subject: Re: OCE and GPT X-BeenThere: freebsd-geom@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: GEOM-specific discussions and implementations List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 22 Apr 2010 09:58:32 -0000 on 21/04/2010 23:49 Lister said the following: > Hello All, > > I'd like to first thank Andrey Elsukov and Andriy Gapon for their > valuable contribution and very quick reply. 
> Given that the patch is not yet ready as I understand it, I'll go with > the alternate method of destroying and recreating the GPT. To that end I > yet have to ask 3 more questions: > 1. How do I make sure I have a valid secondary GPT? Neither gpt nor > gpart tell anything about it. Can I assume that if 'gpart show da0' > shows a proper layout and no error messages that the 2ry is valid? I think that should be sufficient. > I tried to make a quick visual comparison on another system > (8.0-RELEASE this time) with a 4TB RAID5 that I just setup yesterday, > using gpart this time because I had to. I used hexdump for the purpose, > dumping the first 34 sectors of /dev/da0, and on another ssh shell, THE > 34 sectors beyond the last partition. > hexdump of the second got nothing, it seemed to have frozen but would > break normally on CTRL+C. I've never seen the likes of this before. > In an attempt to troubleshoot, I narrowed the selection to only ONE > sector…same result. Then the last sector of the last partition…same > thing. Even dump of the first sector of the last partition exhibited > same behavior. The partition is viable, though. I copied a 4.4GB file > to it over ssh without a problem and the data rate was consistent with > expectations. > I know this is a side issue, but is hexdump/hd known to have problems > with large devices, or perhaps 32/64-bit issues? > I forgot to mention that all my systems are AMD64. Can you provide the actual commands you used? Not to doubt your skills, but just to be sure. BTW, you can discover disk size with diskinfo tool, subtract 34 from that and use dd on that. > 2. Now assuming OCE adds the new space at the tail– which I yet have to > verify before proceeding– will 'growfs' serve the purpose of extending > newfs' work? > Its man page doesn't reference gpt or gpart, but rather bsdlabel and > fdisk; something suggestiive of the contrary. 
Theoretically growfs should work with filesystem data within a partition and should be agnostic to partition type. Practically, I am not sure. Also, there _could_ be issues with very large FS sizes. In your case it would be great if you could experiment with dummy data on a different system. I.e. create something similar to what you have now, then grow it the way you want and see how it works out. Don't forget to share the results with us :) > 3. Does it make a difference if use gpt or gpart to recreate the gpt, > given that I'd initially created it with gpt? I think that it's better to use gpart because gpt was deprecated. But I am not sure what version of FreeBSD you use, that may be important. > Note. My root fs and everything else beyond the library is on another > RAID1 (on the Motherboard). That's good, gives you more freedom in actions. > ----- Original Message ----- From: "Andriy Gapon" > To: "Lister" > Cc: > Sent: Wednesday, April 21, 2010 14:07 > Subject: Re: OCE and GPT > > >> on 21/04/2010 12:21 Lister said the following: >>> Hi All, >>> >>> I have a 5TB RAID5 (/dev/da0) on a 3Ware controller supporting OCE. I >>> partitioned it into p1, p2 & p3 using gpt on FreeBSD-7.1-RELEAE. >>> P3 is 3.5TB and is the one I need to expand by adding another 1TB drive >>> to the RAID. It is now 87% full. >>> >>> Both gpt and gpart don't allow resizing a partition. >>> Of course, backing up the RAID to another is not an option. >>> >>> I'm in a rather desperate situation and I'm willing to do whatever it >>> takes. If there's no current software solution, I'm willing to use a hex >>> editor to edit the disk directly if someone could advise me of the >>> layout of GPT as created by gpt- and gpart if different. I used to do >>> this on MBR disks at times of necessity. >> >> If you make any mistake and lose your data, then don't blame me. >> Before trying what I suggest wait for a few days in case someone >> points out a >> mistake or suggests a better way. >> >> 1. 
Get current layout e.g. with 'gpart show' >> 2. Print (several copies of) it and don't lose it >> 3. Boot using Live CD (if da0 is your boot disk) >> 4. Undo the whole GPT layout using 'gpart delete' and 'gpart destroy' >> 5. Expand RAID (I hope OCE means that the new space will be added at >> the end) >> 5. Re-create the same layout but using new size for p3 >> >> Some notes: >> 1. Deleting/destroying/adding/creating partitions and scheme does not >> touch your >> data/filesystems; it operates only on sectors belonging to GPT metadata. >> 2. There are two copies of GPT metadata, one at the start of a disk, >> the other at >> the end; they both must be valid and provide the same information. >> -- >> Andriy Gapon >> _______________________________________________ >> freebsd-geom@freebsd.org mailing list >> http://lists.freebsd.org/mailman/listinfo/freebsd-geom >> To unsubscribe, send any mail to "freebsd-geom-unsubscribe@freebsd.org" > -- Andriy Gapon From owner-freebsd-geom@FreeBSD.ORG Thu Apr 22 11:41:46 2010 Return-Path: Delivered-To: freebsd-geom@FreeBSD.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id AF687106566C; Thu, 22 Apr 2010 11:41:46 +0000 (UTC) (envelope-from bu7cher@yandex.ru) Received: from forward3.mail.yandex.net (forward3.mail.yandex.net [77.88.46.8]) by mx1.freebsd.org (Postfix) with ESMTP id 06C408FC0A; Thu, 22 Apr 2010 11:41:45 +0000 (UTC) Received: from smtp3.mail.yandex.net (smtp3.mail.yandex.net [77.88.46.103]) by forward3.mail.yandex.net (Yandex) with ESMTP id 7F65856D80F9; Thu, 22 Apr 2010 15:35:16 +0400 (MSD) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=yandex.ru; s=mail; t=1271936116; bh=gmL6u9x/40Y7jCYOtvhFHKK6SPZ5SnSvwwKL6YzhZOY=; h=Message-ID:Date:From:MIME-Version:To:CC:Subject:References: In-Reply-To:Content-Type; b=Y7PxhrqRpIZepADJ92fpReDZc8Cz2kIPcEa5AdY6yPG1X1J+jRqAVx3UmNeAuHACh uSO+EhEt6fbIPoZ78LhtFggfNCT6sgM930yQKtp5tfFYBGKVmg6Lt80qTB3JcL0xZI 
/hTy9QEPomAfZpO1sdbjF1qNiLXG8eA0nbBGIr34= Received: from [127.0.0.1] (ns.kirov.so-cdu.ru [77.72.136.145]) by smtp3.mail.yandex.net (Yandex) with ESMTPSA id CD5E4278094; Thu, 22 Apr 2010 15:35:15 +0400 (MSD) Message-ID: <4BD03472.6030201@yandex.ru> Date: Thu, 22 Apr 2010 15:35:14 +0400 From: "Andrey V. Elsukov" User-Agent: Mozilla Thunderbird 1.5 (FreeBSD/20051231) MIME-Version: 1.0 To: Andriy Gapon References: <4BCEE9E2.6010007@yandex.ru> <4BCEEC66.1080804@yandex.ru> <4BCEEF06.8010203@icyb.net.ua> <4BCEF5F8.6090102@yandex.ru> <4BCF04C7.1050701@icyb.net.ua> In-Reply-To: <4BCF04C7.1050701@icyb.net.ua> Content-Type: multipart/mixed; boundary="------------070101000704010802030502" X-Yandex-TimeMark: 1271936116 X-Yandex-Spam: 1 X-Yandex-Front: smtp3.mail.yandex.net Cc: Lister , Marcel Moolenaar , freebsd-geom@FreeBSD.org Subject: [patch] resize and recover support for GPART GPT scheme (was: Re: OCE and GPT) X-BeenThere: freebsd-geom@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: GEOM-specific discussions and implementations List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 22 Apr 2010 11:41:46 -0000 This is a multi-part message in MIME format. --------------070101000704010802030502 Content-Type: text/plain; charset=KOI8-R; format=flowed Content-Transfer-Encoding: 8bit On 21.04.2010 17:59, Andriy Gapon wrote: > I also think that this recovery mechanism is needed. > In short: > recover - re-create secondary table based on primary table > reinit - relocate secondary table to a new position and update offsets in both > tables accordingly I implemented the 'recover' verb and changed the detection algorithm in the GPT scheme. Now, when the primary GPT header is valid, the secondary header is read from the AlternateLBA offset (taken from the primary header). When the primary header is invalid, the secondary header is read from the medium's last LBA. 
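The selection rule described above can be sketched roughly as follows. This is only an illustration under stated assumptions, not the code from the attached patch: `struct gpt_probe` and `gpt_pick_secondary` are hypothetical names, and the `relocate` flag is an extra added here to show when the secondary table would still need moving to the end of the medium.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical result of deciding where to look for the secondary header. */
struct gpt_probe {
    uint64_t sec_lba;   /* LBA to read the secondary header from */
    bool     relocate;  /* true if that LBA is not the medium's last LBA */
};

/*
 * Hypothetical helper implementing the rule from the text:
 *  - primary header valid:   trust its AlternateLBA field;
 *  - primary header invalid: fall back to the last LBA of the medium.
 * A real implementation would also have to reject an alt_lba that lies
 * beyond last_lba (medium got smaller); that check is left out here.
 */
static struct gpt_probe
gpt_pick_secondary(bool primary_ok, uint64_t alt_lba, uint64_t last_lba)
{
    struct gpt_probe p;

    if (primary_ok) {
        p.sec_lba = alt_lba;
        p.relocate = (alt_lba != last_lba);
    } else {
        p.sec_lba = last_lba;
        p.relocate = false;
    }
    return (p);
}
```

For a 100M image grown to 150M with 512-byte sectors, a valid primary header gives `sec_lba` 204799 while the last LBA is 307199, so `relocate` is set.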
And now the following scenario works:
====================================================================
# dd if=/dev/zero of=./d.img bs=1m count=100
100+0 records in
100+0 records out
104857600 bytes transferred in 2.895854 secs (36209560 bytes/sec)
# mdconfig -f d.img
md0
# gpart create -s gpt md0
md0 created
# gpart add -t freebsd-zfs md0
md0p1 added
# gpart show md0
=>     34  204733  md0  GPT  (100M)
       34  204733    1  freebsd-zfs  (100M)
# mdconfig -du 0
# dd if=/dev/zero of=./d.img bs=1m count=50 seek=100
50+0 records in
50+0 records out
52428800 bytes transferred in 1.175911 secs (44585689 bytes/sec)
# ls -lh d.img
-rw-r--r-- 1 root wheel 150M 22 Apr 14:56 d.img
# mdconfig -f d.img
md0
# dmesg | tail -2
GEOM: md0: the secondary GPT table is corrupt or invalid.
GEOM: md0: using the primary only -- recovery suggested.
# gpart show md0
=>     34  204733  md0  GPT  (150M)
       34  204733    1  freebsd-zfs  (100M)
# gpart recover md0
md0 recovered
# gpart show md0
=>      34  307133  md0  GPT  (150M)
        34  204733    1  freebsd-zfs  (100M)
    204767  102400       - free -  (50M)
# gpart resize -i 1 md0
md0p1 resized
# gpart show md0
=>     34  307133  md0  GPT  (150M)
       34  307133    1  freebsd-zfs  (150M)
====================================================================
There are several things where I need suggestions.
1. What should the code do when a user runs `gpart recover` on a scheme that doesn't need recovering?
2. Some checks are probably needed before changing metadata in the g_part_gpt_recover method.
So, the patch is attached and comments are welcome.
-- WBR, Andrey V. 
Elsukov --------------070101000704010802030502 Content-Type: text/plain; name="gpart_recover.diff.txt" Content-Transfer-Encoding: 7bit Content-Disposition: attachment; filename="gpart_recover.diff.txt" Index: gpart.8 =================================================================== --- gpart.8 (revision 205576) +++ gpart.8 (working copy) @@ -120,6 +120,18 @@ utility: .Op Fl t Ar type .Op Fl f Ar flags .Ar geom +.\" ==== RECOVER ==== +.Nm +.Cm recover +.Op Fl f Ar flags +.Ar geom +.\" ==== RESIZE ==== +.Nm +.Cm resize +.Fl i Ar index +.Op Fl s Ar size +.Op Fl f Ar flags +.Ar geom .\" ==== SET ==== .Nm .Cm set @@ -325,6 +337,45 @@ See the section entitled below for a discussion about its use. .El +.\" ==== RECOVER ==== +.It Cm recover +Recover a corrupted metadata on geom +.Ar geom . +Currently recovering implemented only for GPT scheme. +.Pp +Additional options include: +.Bl -tag -width 10n +.It Fl f Ar flags +Additional operational flags. +See the section entitled +.Sx "OPERATIONAL FLAGS" +below for a discussion +about its use. +.El +.\" ==== RESIZE ==== +.It Cm resize +Resize a partition from geom +.Ar geom +and further identified by the +.Fl i Ar index +option. New partition size is expressed in logical block +numbers and can be given by the +.Fl s Ar size +option. If +.Fl s +option is ommited then new size is automatically calculated +to maximum available from given geom +.Ar geom . +.Pp +Additional options include: +.Bl -tag -width 10n +.It Fl f Ar flags +Additional operational flags. +See the section entitled +.Sx "OPERATIONAL FLAGS" +below for a discussion +about its use. +.El .\" ==== SET ==== .It Cm set Set the named attribute on the partition entry. 
Index: geom_part.c =================================================================== --- geom_part.c (revision 205576) +++ geom_part.c (working copy) @@ -112,6 +112,18 @@ struct g_command PUBSYM(class_commands)[] = { G_OPT_SENTINEL }, "geom", NULL }, + { "recover", 0, gpart_issue, { + { 'f', "flags", flags, G_TYPE_STRING }, + G_OPT_SENTINEL }, + "geom", NULL + }, + { "resize", 0, gpart_issue, { + { 's', "size", autofill, G_TYPE_ASCLBA }, + { 'i', index_param, NULL, G_TYPE_ASCNUM }, + { 'f', "flags", flags, G_TYPE_STRING }, + G_OPT_SENTINEL }, + "geom", NULL + }, { "set", 0, gpart_issue, { { 'a', "attrib", NULL, G_TYPE_STRING }, { 'i', index_param, NULL, G_TYPE_ASCNUM }, @@ -243,6 +255,99 @@ fmtattrib(struct gprovider *pp) } static int +gpart_autofill_resize(struct gctl_req *req) +{ + struct gmesh mesh; + struct gclass *cp; + struct ggeom *gp; + struct gprovider *pp; + unsigned long long last, size, start, new_size; + unsigned long long lba, new_lba; + const char *s; + char *val; + int error, idx; + + s = gctl_get_ascii(req, "size"); + if (*s == '*') + new_size = (unsigned long long)atoll(s); + else + return (0); + + s = gctl_get_ascii(req, index_param); + idx = strtol(s, &val, 10); + if (idx < 1 || *s == '\0' || *val != '\0') + errx(EXIT_FAILURE, "invalid partition index"); + + error = geom_gettree(&mesh); + if (error) + return (error); + s = gctl_get_ascii(req, "class"); + if (s == NULL) + abort(); + cp = find_class(&mesh, s); + if (cp == NULL) + errx(EXIT_FAILURE, "Class %s not found.", s); + s = gctl_get_ascii(req, "geom"); + if (s == NULL) + abort(); + gp = find_geom(cp, s); + if (gp == NULL) + errx(EXIT_FAILURE, "No such geom: %s.", s); + last = atoll(find_geomcfg(gp, "last")); + + LIST_FOREACH(pp, &gp->lg_provider, lg_provider) { + s = find_provcfg(pp, "index"); + if (s == NULL) + continue; + if (atoi(s) == idx) + break; + } + if (pp == NULL) + errx(EXIT_FAILURE, "invalid partition index"); + + s = find_provcfg(pp, "start"); + if (s == NULL) { + s = 
find_provcfg(pp, "offset"); + start = atoll(s) / pp->lg_sectorsize; + } else + start = atoll(s); + s = find_provcfg(pp, "end"); + if (s == NULL) { + s = find_provcfg(pp, "length"); + lba = start + atoll(s) / pp->lg_sectorsize; + } else + lba = atoll(s) + 1; + + if (lba > last) + return (ENOSPC); + size = lba - start; + pp = find_provider(gp, lba); + if (pp == NULL) + new_size = last - start + 1; + else { + s = find_provcfg(pp, "start"); + if (s == NULL) { + s = find_provcfg(pp, "offset"); + new_lba = atoll(s) / pp->lg_sectorsize; + } else + new_lba = atoll(s); + /* Is there any free space between current and + * next providers? + */ + if (new_lba > lba) + new_size = new_lba - start; + else + return (ENOSPC); + } + asprintf(&val, "%llu", new_size); + if (val == NULL) + return (ENOMEM); + gctl_change_param(req, "size", -1, val); + + return (0); +} + +static int gpart_autofill(struct gctl_req *req) { struct gmesh mesh; @@ -257,6 +362,8 @@ gpart_autofill(struct gctl_req *req) int error, has_size, has_start; s = gctl_get_ascii(req, "verb"); + if (strcmp(s, "resize") == 0) + return gpart_autofill_resize(req); if (strcmp(s, "add") != 0) return (0); Index: g_part_pc98.c =================================================================== --- g_part_pc98.c (revision 204945) +++ g_part_pc98.c (working copy) @@ -77,6 +77,8 @@ static int g_part_pc98_setunset(struct g_part_tabl static const char *g_part_pc98_type(struct g_part_table *, struct g_part_entry *, char *, size_t); static int g_part_pc98_write(struct g_part_table *, struct g_consumer *); +static int g_part_pc98_resize(struct g_part_table *, struct g_part_entry *, + struct g_part_parms *); static kobj_method_t g_part_pc98_methods[] = { KOBJMETHOD(g_part_add, g_part_pc98_add), @@ -86,6 +88,7 @@ static kobj_method_t g_part_pc98_methods[] = { KOBJMETHOD(g_part_dumpconf, g_part_pc98_dumpconf), KOBJMETHOD(g_part_dumpto, g_part_pc98_dumpto), KOBJMETHOD(g_part_modify, g_part_pc98_modify), + KOBJMETHOD(g_part_resize, 
g_part_pc98_resize), KOBJMETHOD(g_part_name, g_part_pc98_name), KOBJMETHOD(g_part_probe, g_part_pc98_probe), KOBJMETHOD(g_part_read, g_part_pc98_read), @@ -308,6 +311,31 @@ g_part_pc98_modify(struct g_part_table *basetable, return (0); } +static int +g_part_pc98_resize(struct g_part_table *basetable, + struct g_part_entry *baseentry, struct g_part_parms *gpp) +{ + struct g_part_pc98_entry *entry; + uint32_t size, cyl; + + cyl = basetable->gpt_heads * basetable->gpt_sectors; + size = gpp->gpp_size; + + if (size < cyl) + return (EINVAL); + if (size % cyl) + size = size - (size % cyl); + if (size < cyl) + return (EINVAL); + + entry = (struct g_part_pc98_entry *)baseentry; + baseentry->gpe_end = baseentry->gpe_start + size - 1; + pc98_set_chs(basetable, baseentry->gpe_end, &entry->ent.dp_ecyl, + &entry->ent.dp_ehd, &entry->ent.dp_esect); + + return (0); +} + static const char * g_part_pc98_name(struct g_part_table *table, struct g_part_entry *baseentry, char *buf, size_t bufsz) Index: g_part_vtoc8.c =================================================================== --- g_part_vtoc8.c (revision 204945) +++ g_part_vtoc8.c (working copy) @@ -67,6 +67,8 @@ static int g_part_vtoc8_read(struct g_part_table * static const char *g_part_vtoc8_type(struct g_part_table *, struct g_part_entry *, char *, size_t); static int g_part_vtoc8_write(struct g_part_table *, struct g_consumer *); +static int g_part_vtoc8_resize(struct g_part_table *, struct g_part_entry *, + struct g_part_parms *); static kobj_method_t g_part_vtoc8_methods[] = { KOBJMETHOD(g_part_add, g_part_vtoc8_add), @@ -75,6 +77,7 @@ static kobj_method_t g_part_vtoc8_methods[] = { KOBJMETHOD(g_part_dumpconf, g_part_vtoc8_dumpconf), KOBJMETHOD(g_part_dumpto, g_part_vtoc8_dumpto), KOBJMETHOD(g_part_modify, g_part_vtoc8_modify), + KOBJMETHOD(g_part_resize, g_part_vtoc8_resize), KOBJMETHOD(g_part_name, g_part_vtoc8_name), KOBJMETHOD(g_part_probe, g_part_vtoc8_probe), KOBJMETHOD(g_part_read, g_part_vtoc8_read), @@ -294,6 
+297,26 @@ g_part_vtoc8_modify(struct g_part_table *basetable return (0); } +static int +g_part_vtoc8_resize(struct g_part_table *basetable, + struct g_part_entry *entry, struct g_part_parms *gpp) +{ + struct g_part_vtoc8_table *table; + uint64_t size; + + table = (struct g_part_vtoc8_table *)basetable; + size = gpp->gpp_size; + if (size % table->secpercyl) + size = size - (size % table->secpercyl); + if (size < table->secpercyl) + return (EINVAL); + + entry->gpe_end = entry->gpe_start + size - 1; + be32enc(&table->vtoc.map[entry->gpe_index - 1].nblks, size); + + return (0); +} + static const char * g_part_vtoc8_name(struct g_part_table *table, struct g_part_entry *baseentry, char *buf, size_t bufsz) Index: g_part_bsd.c =================================================================== --- g_part_bsd.c (revision 204945) +++ g_part_bsd.c (working copy) @@ -73,6 +73,8 @@ static int g_part_bsd_read(struct g_part_table *, static const char *g_part_bsd_type(struct g_part_table *, struct g_part_entry *, char *, size_t); static int g_part_bsd_write(struct g_part_table *, struct g_consumer *); +static int g_part_bsd_resize(struct g_part_table *, struct g_part_entry *, + struct g_part_parms *); static kobj_method_t g_part_bsd_methods[] = { KOBJMETHOD(g_part_add, g_part_bsd_add), @@ -82,6 +84,7 @@ static kobj_method_t g_part_bsd_methods[] = { KOBJMETHOD(g_part_dumpconf, g_part_bsd_dumpconf), KOBJMETHOD(g_part_dumpto, g_part_bsd_dumpto), KOBJMETHOD(g_part_modify, g_part_bsd_modify), + KOBJMETHOD(g_part_resize, g_part_bsd_resize), KOBJMETHOD(g_part_name, g_part_bsd_name), KOBJMETHOD(g_part_probe, g_part_bsd_probe), KOBJMETHOD(g_part_read, g_part_bsd_read), @@ -288,6 +291,19 @@ g_part_bsd_modify(struct g_part_table *basetable, return (0); } +static int +g_part_bsd_resize(struct g_part_table *basetable, + struct g_part_entry *baseentry, struct g_part_parms *gpp) +{ + struct g_part_bsd_entry *entry; + + entry = (struct g_part_bsd_entry *)baseentry; + baseentry->gpe_end = 
baseentry->gpe_start + gpp->gpp_size - 1; + entry->part.p_size = gpp->gpp_size; + + return (0); +} + static const char * g_part_bsd_name(struct g_part_table *table, struct g_part_entry *baseentry, char *buf, size_t bufsz) Index: g_part_if.m =================================================================== --- g_part_if.m (revision 204945) +++ g_part_if.m (working copy) @@ -58,6 +58,20 @@ CODE { { return (0); } + + static int + default_recover(struct g_part_table *t __unused, + struct g_consumer *c __unused) + { + return (ENOSYS); + } + + static int + default_resize(struct g_part_table *t __unused, + struct g_part_entry *e __unused, struct g_part_parms *p __unused) + { + return (ENOSYS); + } }; # add() - scheme specific processing for the add verb. @@ -149,6 +163,19 @@ METHOD int read { struct g_consumer *cp; }; +# recover() - scheme specific processing for the recover verb. +METHOD int recover { + struct g_part_table *table; + struct g_consumer *cp; +} DEFAULT default_recover; + +# resize() - scheme specific processing for the resize verb. +METHOD int resize { + struct g_part_table *table; + struct g_part_entry *entry; + struct g_part_parms *gpp; +} DEFAULT default_resize; + # setunset() - set or unset partition entry attributes. 
METHOD int setunset { struct g_part_table *table; Index: g_part_gpt.c =================================================================== --- g_part_gpt.c (revision 204945) +++ g_part_gpt.c (working copy) @@ -100,6 +100,9 @@ static const char *g_part_gpt_name(struct g_part_t char *, size_t); static int g_part_gpt_probe(struct g_part_table *, struct g_consumer *); static int g_part_gpt_read(struct g_part_table *, struct g_consumer *); +static int g_part_gpt_recover(struct g_part_table *, struct g_consumer *); +static int g_part_gpt_resize(struct g_part_table *, struct g_part_entry *, + struct g_part_parms *); static const char *g_part_gpt_type(struct g_part_table *, struct g_part_entry *, char *, size_t); static int g_part_gpt_write(struct g_part_table *, struct g_consumer *); @@ -115,6 +118,8 @@ static kobj_method_t g_part_gpt_methods[] = { KOBJMETHOD(g_part_name, g_part_gpt_name), KOBJMETHOD(g_part_probe, g_part_gpt_probe), KOBJMETHOD(g_part_read, g_part_gpt_read), + KOBJMETHOD(g_part_recover, g_part_gpt_recover), + KOBJMETHOD(g_part_resize, g_part_gpt_resize), KOBJMETHOD(g_part_type, g_part_gpt_type), KOBJMETHOD(g_part_write, g_part_gpt_write), { 0, 0 } @@ -164,7 +169,7 @@ static struct uuid gpt_uuid_unused = GPT_ENT_TYPE_ static struct g_part_uuid_alias { struct uuid *uuid; - int alias; + int alias; } gpt_uuid_alias_match[] = { { &gpt_uuid_apple_boot, G_PART_ALIAS_APPLE_BOOT }, { &gpt_uuid_apple_hfs, G_PART_ALIAS_APPLE_HFS }, @@ -211,7 +216,14 @@ gpt_read_hdr(struct g_part_gpt_table *table, struc pp = cp->provider; last = (pp->mediasize / pp->sectorsize) - 1; - table->lba[elt] = (elt == GPT_ELT_PRIHDR) ? 1 : last; + if (elt == GPT_ELT_SECHDR) { + /* When the primary header is valid look for secondary + * header at AlternateLBA. Otherwise - at last medium's LBA. 
+ */ + if (table->state[GPT_ELT_PRIHDR] != GPT_STATE_OK) + table->lba[elt] = last; + } else + table->lba[elt] = 1; table->state[elt] = GPT_STATE_MISSING; buf = g_read_data(cp, table->lba[elt] * pp->sectorsize, pp->sectorsize, &error); @@ -238,12 +250,15 @@ gpt_read_hdr(struct g_part_gpt_table *table, struc table->state[elt] = GPT_STATE_INVALID; hdr->hdr_revision = le32toh(buf->hdr_revision); - if (hdr->hdr_revision < 0x00010000) + if (hdr->hdr_revision < GPT_HDR_REVISION) goto fail; hdr->hdr_lba_self = le64toh(buf->hdr_lba_self); if (hdr->hdr_lba_self != table->lba[elt]) goto fail; hdr->hdr_lba_alt = le64toh(buf->hdr_lba_alt); + if (hdr->hdr_lba_alt == hdr->hdr_lba_self || + hdr->hdr_lba_alt > last) + goto fail; /* Check the managed area. */ hdr->hdr_lba_start = le64toh(buf->hdr_lba_start); @@ -277,6 +292,10 @@ gpt_read_hdr(struct g_part_gpt_table *table, struc le_uuid_dec(&buf->hdr_uuid, &hdr->hdr_uuid); hdr->hdr_crc_table = le32toh(buf->hdr_crc_table); + /* save LBA for secondary header */ + if (elt == GPT_ELT_PRIHDR) + table->lba[GPT_ELT_SECHDR] = hdr->hdr_lba_alt; + g_free(buf); return (hdr); @@ -550,6 +569,50 @@ g_part_gpt_modify(struct g_part_table *basetable, return (0); } +static int +g_part_gpt_resize(struct g_part_table *basetable, + struct g_part_entry *baseentry, struct g_part_parms *gpp) +{ + struct g_part_gpt_entry *entry; + entry = (struct g_part_gpt_entry *)baseentry; + + baseentry->gpe_end = baseentry->gpe_start + gpp->gpp_size - 1; + entry->ent.ent_lba_end = baseentry->gpe_end; + + return (0); +} + +static int +g_part_gpt_recover(struct g_part_table *basetable, struct g_consumer *cp) +{ + struct g_part_gpt_table *table; + struct g_provider *pp; + size_t tblsz; + quad_t last; + + pp = cp->provider; + table = (struct g_part_gpt_table *)basetable; + last = (pp->mediasize / pp->sectorsize) - 1; + tblsz = (basetable->gpt_entries * sizeof(struct gpt_ent) + + pp->sectorsize - 1) / pp->sectorsize; + + table->lba[GPT_ELT_PRIHDR] = 1; + 
table->lba[GPT_ELT_PRITBL] = 2; + table->lba[GPT_ELT_SECHDR] = last; + table->lba[GPT_ELT_SECTBL] = last - tblsz; + table->state[GPT_ELT_PRIHDR] = GPT_STATE_OK; + table->state[GPT_ELT_PRITBL] = GPT_STATE_OK; + table->state[GPT_ELT_SECHDR] = GPT_STATE_OK; + table->state[GPT_ELT_SECTBL] = GPT_STATE_OK; + + table->hdr->hdr_lba_start = 2 + tblsz; + table->hdr->hdr_lba_end = last - tblsz - 1; + basetable->gpt_first = table->hdr->hdr_lba_start; + basetable->gpt_last = table->hdr->hdr_lba_end; + + return (0); +} + static const char * g_part_gpt_name(struct g_part_table *table, struct g_part_entry *baseentry, char *buf, size_t bufsz) @@ -630,10 +693,12 @@ g_part_gpt_read(struct g_part_table *basetable, st struct g_part_gpt_table *table; struct g_part_gpt_entry *entry; u_char *buf; + quad_t last; int error, index; table = (struct g_part_gpt_table *)basetable; pp = cp->provider; + last = (pp->mediasize / pp->sectorsize) - 1; /* Read the PMBR */ buf = g_read_data(cp, 0, pp->sectorsize, &error); @@ -703,7 +768,8 @@ g_part_gpt_read(struct g_part_table *basetable, st if (pritbl != NULL) g_free(pritbl); } else { - if (table->state[GPT_ELT_SECTBL] != GPT_STATE_OK) { + if (table->state[GPT_ELT_SECTBL] != GPT_STATE_OK || + table->lba[GPT_ELT_SECHDR] != last) { printf("GEOM: %s: the secondary GPT table is corrupt " "or invalid.\n", pp->name); printf("GEOM: %s: using the primary only -- recovery " Index: g_part_apm.c =================================================================== --- g_part_apm.c (revision 204945) +++ g_part_apm.c (working copy) @@ -74,6 +74,8 @@ static int g_part_apm_read(struct g_part_table *, static const char *g_part_apm_type(struct g_part_table *, struct g_part_entry *, char *, size_t); static int g_part_apm_write(struct g_part_table *, struct g_consumer *); +static int g_part_apm_resize(struct g_part_table *, struct g_part_entry *, + struct g_part_parms *); static kobj_method_t g_part_apm_methods[] = { KOBJMETHOD(g_part_add, g_part_apm_add), @@ -82,6 +84,7 
@@ static kobj_method_t g_part_apm_methods[] = { KOBJMETHOD(g_part_dumpconf, g_part_apm_dumpconf), KOBJMETHOD(g_part_dumpto, g_part_apm_dumpto), KOBJMETHOD(g_part_modify, g_part_apm_modify), + KOBJMETHOD(g_part_resize, g_part_apm_resize), KOBJMETHOD(g_part_name, g_part_apm_name), KOBJMETHOD(g_part_probe, g_part_apm_probe), KOBJMETHOD(g_part_read, g_part_apm_read), @@ -318,6 +321,19 @@ g_part_apm_modify(struct g_part_table *basetable, return (0); } +static int +g_part_apm_resize(struct g_part_table *basetable, + struct g_part_entry *baseentry, struct g_part_parms *gpp) +{ + struct g_part_apm_entry *entry; + + entry = (struct g_part_apm_entry *)baseentry; + baseentry->gpe_end = baseentry->gpe_start + gpp->gpp_size - 1; + entry->ent.ent_size = gpp->gpp_size; + + return (0); +} + static const char * g_part_apm_name(struct g_part_table *table, struct g_part_entry *baseentry, char *buf, size_t bufsz) Index: g_part.c =================================================================== --- g_part.c (revision 204945) +++ g_part.c (working copy) @@ -959,22 +959,124 @@ g_part_ctl_move(struct gctl_req *req, struct g_par { gctl_error(req, "%d verb 'move'", ENOSYS); return (ENOSYS); -} +} static int g_part_ctl_recover(struct gctl_req *req, struct g_part_parms *gpp) { - gctl_error(req, "%d verb 'recover'", ENOSYS); - return (ENOSYS); + struct g_geom *gp; + struct g_consumer *cp; + struct g_part_table *table; + struct sbuf *sb; + int error; + + gp = gpp->gpp_geom; + G_PART_TRACE((G_T_TOPOLOGY, "%s(%s)", __func__, gp->name)); + g_topology_assert(); + + cp = LIST_FIRST(&gp->consumer); + table = gp->softc; + + error = G_PART_RECOVER(table, cp); + if (error) { + gctl_error(req, "%d", error); + return (error); + } + + /* Provide feedback if so requested. 
*/ + if (gpp->gpp_parms & G_PART_PARM_OUTPUT) { + sb = sbuf_new_auto(); + sbuf_printf(sb, "%s recovered\n", gp->name); + sbuf_finish(sb); + gctl_set_param(req, "output", sbuf_data(sb), sbuf_len(sb) + 1); + sbuf_delete(sb); + } + return (0); } static int g_part_ctl_resize(struct gctl_req *req, struct g_part_parms *gpp) { - gctl_error(req, "%d verb 'resize'", ENOSYS); - return (ENOSYS); -} + struct g_geom *gp; + struct g_provider *pp; + struct g_part_entry *pe, *entry; + struct g_part_table *table; + struct sbuf *sb; + quad_t end; + int error; + gp = gpp->gpp_geom; + G_PART_TRACE((G_T_TOPOLOGY, "%s(%s)", __func__, gp->name)); + g_topology_assert(); + table = gp->softc; + + /* check gpp_index */ + LIST_FOREACH(entry, &table->gpt_entry, gpe_entry) { + if (entry->gpe_deleted || entry->gpe_internal) + continue; + if (entry->gpe_index == gpp->gpp_index) + break; + } + if (entry == NULL) { + gctl_error(req, "%d index '%d'", ENOENT, gpp->gpp_index); + return (ENOENT); + } + + /* check gpp_size */ + end = entry->gpe_start + gpp->gpp_size - 1; + if (gpp->gpp_size < 1 || end > table->gpt_last) { + gctl_error(req, "%d size '%jd'", EINVAL, + (intmax_t)gpp->gpp_size); + return (EINVAL); + } + + LIST_FOREACH(pe, &table->gpt_entry, gpe_entry) { + if (pe->gpe_deleted || pe->gpe_internal || pe == entry) + continue; + if (end >= pe->gpe_start && end <= pe->gpe_end) { + gctl_error(req, "%d end '%jd'", ENOSPC, + (intmax_t)end); + return (ENOSPC); + } + if (entry->gpe_start < pe->gpe_start && end > pe->gpe_end) { + gctl_error(req, "%d size '%jd'", ENOSPC, + (intmax_t)gpp->gpp_size); + return (ENOSPC); + } + } + + pp = entry->gpe_pp; + if ((g_debugflags & 16) == 0 && + (pp->acr > 0 || pp->acw > 0 || pp->ace > 0)) { + gctl_error(req, "%d", EBUSY); + return (EBUSY); + } + + error = G_PART_RESIZE(table, entry, gpp); + if (error) { + gctl_error(req, "%d", error); + return (error); + } + + if (!entry->gpe_created) + entry->gpe_modified = 1; + + /* update mediasize of changed provider */ + 
pp->mediasize = (entry->gpe_end - entry->gpe_start + 1) * + pp->sectorsize; + + /* Provide feedback if so requested. */ + if (gpp->gpp_parms & G_PART_PARM_OUTPUT) { + sb = sbuf_new_auto(); + G_PART_FULLNAME(table, entry, sb, gp->name); + sbuf_cat(sb, " resized\n"); + sbuf_finish(sb); + gctl_set_param(req, "output", sbuf_data(sb), sbuf_len(sb) + 1); + sbuf_delete(sb); + } + return (0); +} + static int g_part_ctl_setunset(struct gctl_req *req, struct g_part_parms *gpp, unsigned int set) @@ -1194,7 +1296,8 @@ g_part_ctlreq(struct gctl_req *req, struct g_class mparms |= G_PART_PARM_GEOM; } else if (!strcmp(verb, "resize")) { ctlreq = G_PART_CTL_RESIZE; - mparms |= G_PART_PARM_GEOM | G_PART_PARM_INDEX; + mparms |= G_PART_PARM_GEOM | G_PART_PARM_INDEX | + G_PART_PARM_SIZE; } break; case 's': Index: g_part_mbr.c =================================================================== --- g_part_mbr.c (revision 204945) +++ g_part_mbr.c (working copy) @@ -76,6 +76,8 @@ static int g_part_mbr_setunset(struct g_part_table static const char *g_part_mbr_type(struct g_part_table *, struct g_part_entry *, char *, size_t); static int g_part_mbr_write(struct g_part_table *, struct g_consumer *); +static int g_part_mbr_resize(struct g_part_table *, struct g_part_entry *, + struct g_part_parms *); static kobj_method_t g_part_mbr_methods[] = { KOBJMETHOD(g_part_add, g_part_mbr_add), @@ -85,6 +87,7 @@ static kobj_method_t g_part_mbr_methods[] = { KOBJMETHOD(g_part_dumpconf, g_part_mbr_dumpconf), KOBJMETHOD(g_part_dumpto, g_part_mbr_dumpto), KOBJMETHOD(g_part_modify, g_part_mbr_modify), + KOBJMETHOD(g_part_resize, g_part_mbr_resize), KOBJMETHOD(g_part_name, g_part_mbr_name), KOBJMETHOD(g_part_probe, g_part_mbr_probe), KOBJMETHOD(g_part_read, g_part_mbr_read), @@ -302,6 +305,31 @@ g_part_mbr_modify(struct g_part_table *basetable, return (0); } +static int +g_part_mbr_resize(struct g_part_table *basetable, + struct g_part_entry *baseentry, struct g_part_parms *gpp) +{ + struct g_part_mbr_entry 
*entry; + uint32_t size, sectors; + + sectors = basetable->gpt_sectors; + size = gpp->gpp_size; + + if (size < sectors) + return (EINVAL); + if (size % sectors) + size = size - (size % sectors); + if (size < sectors) + return (EINVAL); + + entry = (struct g_part_mbr_entry *)baseentry; + baseentry->gpe_end = baseentry->gpe_start + size - 1; + entry->ent.dp_size = size; + mbr_set_chs(basetable, baseentry->gpe_end, &entry->ent.dp_ecyl, + &entry->ent.dp_ehd, &entry->ent.dp_esect); + return (0); +} + static const char * g_part_mbr_name(struct g_part_table *table, struct g_part_entry *baseentry, char *buf, size_t bufsz) --------------070101000704010802030502-- From owner-freebsd-geom@FreeBSD.ORG Thu Apr 22 15:22:04 2010 Return-Path: Delivered-To: freebsd-geom@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 5A05B106567D for ; Thu, 22 Apr 2010 15:22:04 +0000 (UTC) (envelope-from lister@kawashti.org) Received: from mra.kawashti.org (mra.kawashti.org [78.136.5.95]) by mx1.freebsd.org (Postfix) with ESMTP id D7AF58FC14 for ; Thu, 22 Apr 2010 15:22:03 +0000 (UTC) Received: from mx.kawashti.org (mx.kawashti.org [196.218.21.179]) (using SSLv3 with cipher DHE-RSA-AES256-SHA (256/256 bits)) (No client certificate requested) by mra.kawashti.org (Postfix) with ESMTP id 345834902E4; Thu, 22 Apr 2010 16:21:56 +0100 (BST) Received: from neo ([10.10.10.10]) by mx.kawashti.org (Kawashti Mail) with SMTP id RDS02182; Thu, 22 Apr 2010 17:21:47 +0200 Message-ID: <9C9F1EE6F5A24B3695327442FBF565C6@neo> From: "Lister" To: "Andriy Gapon" References: <4BCEEA79.7080309@icyb.net.ua> <4BD01DBE.7030905@icyb.net.ua> Date: Thu, 22 Apr 2010 17:21:40 +0200 MIME-Version: 1.0 Content-Type: text/plain; format=flowed; charset="windows-1256"; reply-type=original Content-Transfer-Encoding: 8bit X-Priority: 3 X-MSMail-Priority: Normal X-Mailer: Microsoft Outlook Express 6.00.3790.4548 X-MimeOLE: Produced By Microsoft MimeOLE 
V6.00.3790.4325 Cc: GEOM Subject: Re: OCE and GPT X-BeenThere: freebsd-geom@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: GEOM-specific discussions and implementations List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 22 Apr 2010 15:22:04 -0000

Hello all,

I'd like to make a few clarifications first:
1. All my systems are AMD64 and either 7.1-REL or 8.0-REL.
2. The GPT on the 5TB RAID5 I want to expand is on 7.1. The latter has both gpt and gpart. I didn't know about gpart when I built its RAID. Partition 3 is the last, 3.6TB, 87% full, and is the one desperately needing expansion.
3. The GPT on the 4TB RAID is on 8.0. I built it just a few days ago for a project (not my own). All 9 of its partitions are still empty (just newfs'd), and for that reason I can use it temporarily for the experiment.

Certainly, I'll share the results of the expansion experiment with you. It'll just be a day or so before I get there, as it evidently calls for a good deal of prep.

Fortunately, I wrote a verbose Bash script to automate the process of creating a GPT, so I don't have to read everything when I need to do it again a few months down the line. It handles everything from deletion, destruction, creation, newfs, tunefs, /etc/fstab updates, mounting and 'df' summary display. To customize it, only a few variables need to be changed. If anyone thinks this might come in handy someday, please let me know and I'll post it.

Now regarding the hexdump commands, I used them on the 8.0 system for a reference visual comparison.
First, here's the output of gpart on that system:

/ :633: gpart show da0
=>        34  7812415421  da0  GPT  (3.6T)
          34    41943040    1  freebsd-ufs  (20G)
    41943074   188743680    2  freebsd-ufs  (90G)
   230686754    62914560    3  freebsd-ufs  (30G)
   293601314  4294967296    4  freebsd-ufs  (2.0T)
  4588568610   838860800    5  freebsd-ufs  (400G)
  5427429410  1111490560    6  freebsd-ufs  (530G)
  6538919970   209715200    7  freebsd-ufs  (100G)
  6748635170   629145600    8  freebsd-ufs  (300G)
  7377780770   434634685    9  freebsd-ufs  (207G)

Here are the commands. Note that I used Bash notation here for easy immediate recognition. I tested every one of them before submitting this message. They do exactly the same as the ones I originally used, which included absolute lengths (in decimal) and offsets (in decimal, hexadecimal and 'b' for block variations). I've thoroughly tested all combinations and found that they achieve the same result (which is nothing).

# First 34 sectors of /dev/da0
hd -n $((512*34)) /dev/da0
00000000  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  |................|
*
000001b0  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 01  |................|
000001c0  01 00 ee ff ff ff 01 00  00 00 ff ff ff ff 00 00  |................|
000001d0  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  |................|
*
000001f0  00 00 00 00 00 00 00 00  00 00 00 00 00 00 55 aa  |..............U.|
00000200  45 46 49 20 50 41 52 54  00 00 01 00 5c 00 00 00  |EFI PART....\...|
00000210  54 69 12 d7 00 00 00 00  01 00 00 00 00 00 00 00  |Ti..............|
00000220  ff ff a7 d1 01 00 00 00  22 00 00 00 00 00 00 00  |........".......|
00000230  de ff a7 d1 01 00 00 00  80 b7 4f 66 87 4c df 11  |..........Of.L..|
00000240  97 12 00 e0 81 b3 63 76  02 00 00 00 00 00 00 00  |......cv........|
00000250  80 00 00 00 80 00 00 00  cd c5 e0 ec 00 00 00 00  |................|
00000260  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  |................|
*
……… edited for brevity

# 34 sectors after the last partition
hd -n $((512*34)) -s $((7377780770+434634685)) /dev/da0
-> frozen

# First 1 sector after the last partition
hd -n 512 -s $((7377780770+434634685)) /dev/da0
-> frozen

# Last 1 sector of last partition
hd -n 512 -s $((7377780770+434634685-1)) /dev/da0
-> frozen

# First 1 sector of last partition
hd -n 512 -s 7377780770 /dev/da0
-> frozen

This is interesting: since 7.1-REL has both gpt and gpart, I ran both on my 5TB array. Here's the output:

/ :1134: gpt show da0
       start        size  index  contents
           0           1         PMBR
           1           1         Pri GPT header
           2          32         Pri GPT table
          34  1677721600      1  GPT part - FreeBSD UFS/UFS2
  1677721634   419430400      2  GPT part - FreeBSD UFS/UFS2
  2097152034  7668367293      3  GPT part - FreeBSD UFS/UFS2
  9765519327          32         Sec GPT table
  9765519359           1         Sec GPT header

/ :1135: gpart show da0
=>        34  9765519293  da0  GPT  (4.5T)
          34  1677721600    1  freebsd-ufs  (800G)
  1677721634   419430400    2  freebsd-ufs  (200G)
  2097152034  7668367293    3  freebsd-ufs  (3.6T)

Obviously the output of gpt is more detailed and does reference the 2ry GPT. It led me to learn, incidentally, that the 2ry is one sector shorter than the 1ry on account of the absent PMBR. This also leads me to suggest that the implementers of gpart adopt the verbosity of gpt. Would you concur?

Kind regards,
Hatem Kawashti

----- Original Message -----
From: "Andriy Gapon"
To: "Lister"
Cc:
Sent: Thursday, April 22, 2010 11:58
Subject: Re: OCE and GPT

> on 21/04/2010 23:49 Lister said the following:
>> Hello All,
>>
>> I'd like to first thank Andrey Elsukov and Andriy Gapon for their
>> valuable contribution and very quick reply.
>> Given that the patch is not yet ready as I understand it, I'll go with
>> the alternate method of destroying and recreating the GPT. To that end I
>> yet have to ask 3 more questions:
>> 1. How do I make sure I have a valid secondary GPT? Neither gpt nor
>> gpart tell anything about it. Can I assume that if 'gpart show da0'
>> shows a proper layout and no error messages that the 2ry is valid?
>
> I think that should be sufficient.
>
>> I tried to make a quick visual comparison on another system
>> (8.0-RELEASE this time) with a 4TB RAID5 that I just setup yesterday,
>> using gpart this time because I had to. I used hexdump for the purpose,
>> dumping the first 34 sectors of /dev/da0, and on another ssh shell, THE
>> 34 sectors beyond the last partition.
>> hexdump of the second got nothing, it seemed to have frozen but would
>> break normally on CTRL+C. I've never seen the likes of this before.
>> In an attempt to troubleshoot, I narrowed the selection to only ONE
>> sector…same result. Then the last sector of the last partition…same
>> thing. Even dump of the first sector of the last partition exhibited
>> same behavior. The partition is viable, though. I copied a 4.4GB file
>> to it over ssh without a problem and the data rate was consistent with
>> expectations.
>> I know this is a side issue, but is hexdump/hd known to have problems
>> with large devices, or perhaps 32/64-bit issues?
>> I forgot to mention that all my systems are AMD64.
>
> Can you provide the actual commands you used?
> Not to doubt your skills, but just to be sure.
> BTW, you can discover disk size with diskinfo tool, subtract 34 from that and
> use dd on that.
>
>> 2. Now assuming OCE adds the new space at the tail - which I yet have to
>> verify before proceeding - will 'growfs' serve the purpose of extending
>> newfs' work?
>> Its man page doesn't reference gpt or gpart, but rather bsdlabel and
>> fdisk; something suggestive of the contrary.
>
> Theoretically growfs should work with filesystem data within a partition and
> should be agnostic to partition type.
> Practically, I am not sure.
> Also, there _could_ be issues with very large FS sizes.
>
> In your case it would be great if you could experiment with dummy data on a
> different system. I.e. create something similar to what you have now, then grow
> it the way you want and see how it works out.
>
> Don't forget to share the results with us :)
>
>> 3. Does it make a difference if I use gpt or gpart to recreate the gpt,
>> given that I'd initially created it with gpt?
>
> I think that it's better to use gpart because gpt was deprecated.
> But I am not sure what version of FreeBSD you use, that may be important.
>
>> Note. My root fs and everything else beyond the library is on another
>> RAID1 (on the Motherboard).
>
> That's good, gives you more freedom in actions.
>
>> ----- Original Message ----- From: "Andriy Gapon"
>> To: "Lister"
>> Cc:
>> Sent: Wednesday, April 21, 2010 14:07
>> Subject: Re: OCE and GPT
>>
>>
>>> on 21/04/2010 12:21 Lister said the following:
>>>> Hi All,
>>>>
>>>> I have a 5TB RAID5 (/dev/da0) on a 3Ware controller supporting OCE. I
>>>> partitioned it into p1, p2 & p3 using gpt on FreeBSD-7.1-RELEASE.
>>>> P3 is 3.5TB and is the one I need to expand by adding another 1TB drive
>>>> to the RAID. It is now 87% full.
>>>>
>>>> Both gpt and gpart don't allow resizing a partition.
>>>> Of course, backing up the RAID to another is not an option.
>>>>
>>>> I'm in a rather desperate situation and I'm willing to do whatever it
>>>> takes. If there's no current software solution, I'm willing to use a hex
>>>> editor to edit the disk directly if someone could advise me of the
>>>> layout of GPT as created by gpt, and by gpart if different. I used to do
>>>> this on MBR disks at times of necessity.
>>>
>>> If you make any mistake and lose your data, then don't blame me.
>>> Before trying what I suggest wait for a few days in case someone
>>> points out a mistake or suggests a better way.
>>>
>>> 1. Get current layout e.g. with 'gpart show'
>>> 2. Print (several copies of) it and don't lose it
>>> 3. Boot using Live CD (if da0 is your boot disk)
>>> 4. Undo the whole GPT layout using 'gpart delete' and 'gpart destroy'
>>> 5. Expand RAID (I hope OCE means that the new space will be added at
>>> the end)
>>> 6. Re-create the same layout but using new size for p3
>>>
>>> Some notes:
>>> 1. Deleting/destroying/adding/creating partitions and scheme does not
>>> touch your data/filesystems; it operates only on sectors belonging to
>>> GPT metadata.
>>> 2. There are two copies of GPT metadata, one at the start of a disk,
>>> the other at the end; they both must be valid and provide the same
>>> information.
>>> --
>>> Andriy Gapon
>>> _______________________________________________
>>> freebsd-geom@freebsd.org mailing list
>>> http://lists.freebsd.org/mailman/listinfo/freebsd-geom
>>> To unsubscribe, send any mail to "freebsd-geom-unsubscribe@freebsd.org"
>>
>
>
> --
> Andriy Gapon
> _______________________________________________
> freebsd-geom@freebsd.org mailing list
> http://lists.freebsd.org/mailman/listinfo/freebsd-geom
> To unsubscribe, send any mail to "freebsd-geom-unsubscribe@freebsd.org"

From owner-freebsd-geom@FreeBSD.ORG Thu Apr 22 15:31:53 2010 Return-Path: Delivered-To: freebsd-geom@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id CA261106566B; Thu, 22 Apr 2010 15:31:53 +0000 (UTC) (envelope-from mavbsd@gmail.com) Received: from mail-wy0-f182.google.com (mail-wy0-f182.google.com [74.125.82.182]) by mx1.freebsd.org (Postfix) with ESMTP id 377F38FC1A; Thu, 22 Apr 2010 15:31:52 +0000 (UTC) Received: by wye20 with SMTP id 20so1899402wye.13 for ; Thu, 22 Apr 2010 08:31:52 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=gamma; h=domainkey-signature:received:received:sender:message-id:date:from :user-agent:mime-version:to:cc:subject:x-enigmail-version :content-type:content-transfer-encoding; bh=4QI4PUfCYDD+MgGyxvK+jjeLEpiBut1qnnidbYQN/JM=; b=x0guPpga4bs57yV7wTa9DYRKGNPFCO6RfdrFGZyLYF1PaRL12y503FLRtemoWicU+F 7dmovUYSXjtLd6J/bD0atSDRD42pbmX3gDT2tKLRJHj8ew/c0qcCtmBcsVaJbyKaRTBA 2wfUgmQoVpgakYHWIv+qh37QpXOC6MVog70y4= DomainKey-Signature: a=rsa-sha1; c=nofws; d=gmail.com; s=gamma;
h=sender:message-id:date:from:user-agent:mime-version:to:cc:subject :x-enigmail-version:content-type:content-transfer-encoding; b=RZQfAluTzQUzn9e1EsIj1jgQqPtEArryhO/5wVaZyCTJtJxAlK2/WxaFZX5wxfz/Np U1X+3mzRn3Mux/1ayZtiLlupu3y6PHC1KAqw+w9rjMeEpPjjN3ZIfvvZ1A/fQWTxUdJA n9I/1kkOLWz/CkXffAJ3I98G8SlbDAu9ASDds= Received: by 10.103.84.25 with SMTP id m25mr2348955mul.108.1271950311958; Thu, 22 Apr 2010 08:31:51 -0700 (PDT) Received: from mavbook.mavhome.dp.ua (pc.mavhome.dp.ua [212.86.226.226]) by mx.google.com with ESMTPS id y6sm274676mug.50.2010.04.22.08.31.50 (version=SSLv3 cipher=RC4-MD5); Thu, 22 Apr 2010 08:31:51 -0700 (PDT) Sender: Alexander Motin Message-ID: <4BD06BD9.6030401@FreeBSD.org> Date: Thu, 22 Apr 2010 18:31:37 +0300 From: Alexander Motin User-Agent: Thunderbird 2.0.0.23 (X11/20091212) MIME-Version: 1.0 To: FreeBSD-Current X-Enigmail-Version: 0.96.0 Content-Type: text/plain; charset=KOI8-R Content-Transfer-Encoding: 7bit Cc: freebsd-geom@freebsd.org Subject: Switchover to CAM ATA? X-BeenThere: freebsd-geom@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: GEOM-specific discussions and implementations List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 22 Apr 2010 15:31:53 -0000 Hi. With the time that has passed, the CAM-based ATA infrastructure IMHO looks mature enough now to enable it in HEAD. Now we have two new stable drivers, ahci(4) and siis(4), covering the major part of modern SATA HBAs, the `options ATA_CAM` wrapper for ata(4) to support legacy hardware, and one more improved driver for Marvell HBAs (mvs) that is now in development and will soon be available for testing. Together with many other people I have tested the above at least on i386, amd64, arm and sparc64 architectures.
This switchover would give us a significant performance improvement on new hardware because of NCQ support in the ahci/siis/mvs drivers; improved functionality, including SATA Port Multiplier support and better hot-plug support; and reduced code duplication between the ata(4) and cam(4) subsystems and applications. Two issues left at this moment are: 1) POLA breakage due to disk devices being renamed from adX to adaY; 2) lack of an ataraid(4) alternative in the new infrastructure. It should be reimplemented in GEOM in some way, but that has not been done yet. So what is the public opinion: Is the lack of ataraid(4) fatal or can we live without it? Can we do the switchover now, or are there more reasons preventing this? If ataraid(4) should be reimplemented in GEOM, then how exactly? One more separate RAID infrastructure in GEOM (a third?) looks excessive. Reusing the gmirror, gstripe, ... code would be nice, but it would make them more complicated and might not be easy for RAID0+1 (due to common metadata) and RAID5 (due to the lack of a module in the base system).
-- Alexander Motin From owner-freebsd-geom@FreeBSD.ORG Thu Apr 22 16:06:01 2010 Return-Path: Delivered-To: freebsd-geom@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 58AA51065675 for ; Thu, 22 Apr 2010 16:06:01 +0000 (UTC) (envelope-from fjwcash@gmail.com) Received: from mail-pw0-f54.google.com (mail-pw0-f54.google.com [209.85.160.54]) by mx1.freebsd.org (Postfix) with ESMTP id 2D7838FC14 for ; Thu, 22 Apr 2010 16:06:00 +0000 (UTC) Received: by pwi9 with SMTP id 9so6275703pwi.13 for ; Thu, 22 Apr 2010 09:06:00 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=gamma; h=domainkey-signature:mime-version:received:in-reply-to:references :date:received:message-id:subject:from:to:cc:content-type; bh=H1cUsTdD4tdQtPOwlug7uKSOxpxh9An8SuSfnEeghFo=; b=C2dZLCzV5cdYP//XRGPDeY+l9WJGiderZbRa9k73B40XoZkj7XaAiGke1M4uKjS8Tb efptL+MrPUL9nb5Bg9s5FBNdBbfwkj+ONDTufJNZyDgGY2rF6uDiwOpK4YuvX2j7aWwh HWjtinu5XEX2hVrGYcPBQim7ALgpa847kgTf0= DomainKey-Signature: a=rsa-sha1; c=nofws; d=gmail.com; s=gamma; h=mime-version:in-reply-to:references:date:message-id:subject:from:to :cc:content-type; b=PBmZHIFmHyodyN2irE1LfxAdYf7lFxEOVVpsWlJ6M0D295o7web0iBym7apqOCQc77 TrzRYw8fmV2Hhvf1NwowL8zAoPQ90ggpxX8DGyqVtpw2vXPZHyfBrx/t65ogwW3oRy06 75CzGknKuTqB5Ra4kS9vaQ6IUlMD+p7AqsmCI= MIME-Version: 1.0 Received: by 10.231.18.74 with HTTP; Thu, 22 Apr 2010 08:42:04 -0700 (PDT) In-Reply-To: <4BD06BD9.6030401@FreeBSD.org> References: <4BD06BD9.6030401@FreeBSD.org> Date: Thu, 22 Apr 2010 08:42:04 -0700 Received: by 10.141.214.6 with SMTP id r6mr114942rvq.138.1271950924623; Thu, 22 Apr 2010 08:42:04 -0700 (PDT) Message-ID: From: Freddie Cash To: freebsd-current@freebsd.org Content-Type: text/plain; charset=UTF-8 X-Content-Filtered-By: Mailman/MimeDel 2.1.5 Cc: freebsd-geom@freebsd.org Subject: Re: Switchover to CAM ATA? 
X-BeenThere: freebsd-geom@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: GEOM-specific discussions and implementations List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 22 Apr 2010 16:06:01 -0000 2010/4/22 Alexander Motin > With the time that has passed, the CAM-based ATA infrastructure IMHO looks mature enough > now to enable it in HEAD. Now we have two new stable drivers, ahci(4) and > siis(4), covering the major part of modern SATA HBAs, the `options ATA_CAM` > wrapper for ata(4) to support legacy hardware, and one more improved > driver for Marvell HBAs (mvs) that is now in development and will soon be > available for testing. Together with many other people I have tested the above > at least on i386, amd64, arm and sparc64 architectures. > I haven't updated my 8-STABLE box in a couple of weeks. Have the issues with ATAPI DVD-burners been worked out, when using ATA_CAM? Back in Jan/Feb, thereabouts, I tested an ATA_CAM kernel and could not get a device node of any kind to show up for the DVD burner (no acd0, no cd0, nothing in dmesg). A non-ATA_CAM kernel shows both acd0 and cd0. Maybe I'll update my system this weekend and give ATA_CAM another test run. > This switchover would give us a significant performance improvement on new > hardware because of NCQ support in the ahci/siis/mvs drivers; improved > functionality, including SATA Port Multiplier support and better hot-plug > support; and reduced code duplication between the ata(4) and cam(4) > subsystems and applications. > > Two issues left at this moment are: > 1) POLA breakage due to disk devices being renamed from adX to adaY; > 2) lack of an ataraid(4) alternative in the new infrastructure. It should be > reimplemented in GEOM in some way, but that has not been done yet. > > So what is the public opinion: Is the lack of ataraid(4) fatal or can we > live without it? > > Can we do the switchover now, or are there more reasons preventing this? > > If ataraid(4) should be reimplemented in GEOM, then how exactly?
One > more separate RAID infrastructure in GEOM (a third?) looks excessive. > Reusing the gmirror, gstripe, ... code would be nice, but it would make them more > complicated and might not be easy for RAID0+1 (due to common metadata) > and RAID5 (due to the lack of a module in the base system). If a lowly user's vote counts for anything, I'd vote for the complete removal of ataraid support. We have gstripe, gmirror, graid3, graid5, and zfs (and gvinum for the masochists). :) We don't need to support any of the crappy pseudo-raid "hardware" out there. ataraid(4) has served its purpose, tiding us over until GEOM RAID facilities were in place. Now it's time for it to be retired. -- Freddie Cash fjwcash@gmail.com From owner-freebsd-geom@FreeBSD.ORG Thu Apr 22 16:20:59 2010 Return-Path: Delivered-To: freebsd-geom@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 35AA0106566B for ; Thu, 22 Apr 2010 16:20:59 +0000 (UTC) (envelope-from ale@FreeBSD.org) Received: from andxor.it (relay.andxor.it [195.223.2.3]) by mx1.freebsd.org (Postfix) with SMTP id 749108FC1D for ; Thu, 22 Apr 2010 16:20:58 +0000 (UTC) Received: (qmail 38928 invoked from network); 22 Apr 2010 15:54:15 -0000 Received: from unknown (HELO ale.andxor.it) (192.168.2.5) by andxor.it with SMTP; 22 Apr 2010 15:54:15 -0000 Message-ID: <4BD07127.6000601@FreeBSD.org> Date: Thu, 22 Apr 2010 17:54:15 +0200 From: Alex Dupre User-Agent: Thunderbird 2.0.0.22 (X11/20090624) MIME-Version: 1.0 To: freebsd-current@freebsd.org References: <4BD06BD9.6030401@FreeBSD.org> In-Reply-To: Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 7bit Cc: freebsd-geom@freebsd.org Subject: Re: Switchover to CAM ATA?
X-BeenThere: freebsd-geom@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: GEOM-specific discussions and implementations List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 22 Apr 2010 16:20:59 -0000 Freddie Cash wrote: >> So what is the public opinion: Is the lack of ataraid(4) fatal or can we >> live without it? Lack of ataraid means no more arX devices, right? I'd say it's not fatal for HEAD, but it is for a -STABLE branch. > ataraid(4) has served its > purpose, tiding us over until GEOM RAID facilities were in place. Now it's > time for it to be retired. It doesn't seem to me that sysinstall supports gmirror or gstripe, so even if they could be better, currently I think many users still use ataraid for simple installations with mirrored disks. -- Alex Dupre From owner-freebsd-geom@FreeBSD.ORG Thu Apr 22 16:37:17 2010 Return-Path: Delivered-To: freebsd-geom@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 40E91106566C; Thu, 22 Apr 2010 16:37:17 +0000 (UTC) (envelope-from amvandemore@gmail.com) Received: from mail-qy0-f181.google.com (mail-qy0-f181.google.com [209.85.221.181]) by mx1.freebsd.org (Postfix) with ESMTP id D57BE8FC1D; Thu, 22 Apr 2010 16:37:16 +0000 (UTC) Received: by qyk11 with SMTP id 11so10055519qyk.13 for ; Thu, 22 Apr 2010 09:37:16 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=gamma; h=domainkey-signature:mime-version:received:in-reply-to:references :date:received:message-id:subject:from:to:cc:content-type; bh=gNu0EiWqpe7D+ilHMvabNJl8muhm4mrrsUTLqPvpLMQ=; b=qIXnmeM9T9i0PCEpWaOPrRB/3z8l0zQsT2+dZcBdeI3p1+q53QbYi/W2xrP3BhphRB ehcTJLvdOP2NX7sApTjZx5NfiLpAkFw8xG4YY+ggQzjWjOwHr7ADBK4RUv4EG+w1Ec1R xYpUBx/HEuJcSMklFOmRHEFKs6RfWVVwOucac= DomainKey-Signature: a=rsa-sha1; c=nofws; d=gmail.com; s=gamma; h=mime-version:in-reply-to:references:date:message-id:subject:from:to
:cc:content-type; b=C6D4JZsHi8wC44yWD0qu8G/Rn2QMsmNmpQY+NsH9FPiN9CTeUK0p41PBk/EaAvUc4Q HFkfMp/M+26K5NRIb/lDzuWvHQPSkBHcMYHmkxySZHjdZOJ1/vxoQbVEF/+qhn3O6NSC FLkmQt1wFP77d6yOZMk7taQjQv1zqYEXCKGcA= MIME-Version: 1.0 Received: by 10.229.99.67 with HTTP; Thu, 22 Apr 2010 09:37:15 -0700 (PDT) In-Reply-To: <4BD07863.4020106@elischer.org> References: <4BD06BD9.6030401@FreeBSD.org> <4BD07863.4020106@elischer.org> Date: Thu, 22 Apr 2010 11:37:15 -0500 Received: by 10.229.227.83 with SMTP id iz19mr1241714qcb.44.1271954235855; Thu, 22 Apr 2010 09:37:15 -0700 (PDT) Message-ID: From: Adam Vande More To: Julian Elischer Content-Type: text/plain; charset=ISO-8859-1 X-Content-Filtered-By: Mailman/MimeDel 2.1.5 Cc: freebsd-current@freebsd.org, Freddie Cash , freebsd-geom@freebsd.org Subject: Re: Switchover to CAM ATA? X-BeenThere: freebsd-geom@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: GEOM-specific discussions and implementations List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 22 Apr 2010 16:37:17 -0000 On Thu, Apr 22, 2010 at 11:25 AM, Julian Elischer wrote: > just one little fly in that ointment... booting. > > You need to be able to act with the raid in the same way the bios does > or you can't boot. I don't think geom would easily do that but I could be > wrong. Certainly if you treat the ata raid as just a bunch of striped disks, > then the bios will not be able to boot off it. > > of course don't take my word too seriously, as I'm not running an ata raid > system at the moment. > gmirror booting works great; the only thing to change is fstab, to reflect the block device changes. gstripe booting doesn't work. I honestly wasn't aware ataraid could boot a striped volume; if so, it does something geom can't.
-- Adam Vande More From owner-freebsd-geom@FreeBSD.ORG Thu Apr 22 16:44:04 2010 Return-Path: Delivered-To: freebsd-geom@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 6FDC3106564A for ; Thu, 22 Apr 2010 16:44:04 +0000 (UTC) (envelope-from amvandemore@gmail.com) Received: from mail-qy0-f181.google.com (mail-qy0-f181.google.com [209.85.221.181]) by mx1.freebsd.org (Postfix) with ESMTP id 26AC98FC1C for ; Thu, 22 Apr 2010 16:44:03 +0000 (UTC) Received: by qyk11 with SMTP id 11so10063976qyk.13 for ; Thu, 22 Apr 2010 09:44:03 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=gamma; h=domainkey-signature:mime-version:received:in-reply-to:references :date:received:message-id:subject:from:to:cc:content-type; bh=0PZsUSbizsuIJj+XXUtd7QH82XiTBjP+JCE4QRdhgq4=; b=q/co+GWEUwW5gRKiVOZiHKC3ANMTkbU+YByO5arsRlsctMUEJKCbH2ddwnsDOvsTxD QsZD0BkEmnredY3h4L9hHgv5qXwAN342t7P4YcJw9vJMoUMt1vSbXteAQiHS6sbClghw TbivCEBDcCpWP2cq5jcPoYdC2vWu6EByd1OgI= DomainKey-Signature: a=rsa-sha1; c=nofws; d=gmail.com; s=gamma; h=mime-version:in-reply-to:references:date:message-id:subject:from:to :cc:content-type; b=SAFFHRnattO4ov9N4QPlhpN/I4tFFUahBDvejiBcinifjxjJOIYn7KSdGpuRKFxwYn lMiapcEuyYaEI8QR/ljujpgX9lUTFyw0IDPFt544uCzWYusIizWG5SAuaX2/AnuQKwC0 3Hy1ykaE/gUNLv9SZjmSLMePH/LUTsDnDHI44= MIME-Version: 1.0 Received: by 10.229.99.67 with HTTP; Thu, 22 Apr 2010 09:17:41 -0700 (PDT) In-Reply-To: References: <4BD06BD9.6030401@FreeBSD.org> Date: Thu, 22 Apr 2010 11:17:41 -0500 Received: by 10.229.251.72 with SMTP id mr8mr4327772qcb.30.1271953062544; Thu, 22 Apr 2010 09:17:42 -0700 (PDT) Message-ID: From: Adam Vande More To: Freddie Cash Content-Type: text/plain; charset=ISO-8859-1 X-Content-Filtered-By: Mailman/MimeDel 2.1.5 Cc: freebsd-current@freebsd.org, freebsd-geom@freebsd.org Subject: Re: Switchover to CAM ATA? 
X-BeenThere: freebsd-geom@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: GEOM-specific discussions and implementations List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 22 Apr 2010 16:44:04 -0000 On Thu, Apr 22, 2010 at 10:42 AM, Freddie Cash wrote: > If a lowly user's vote counts for anything, I'd vote for the complete > removal of ataraid support. We have gstripe, gmirror, graid3, graid5, and > zfs (and gvinum for the masochists). :) We don't need to support any of > the crappy pseudo-raid "hardware" out there. ataraid(4) has served its > purpose, tiding us over until GEOM RAID facilities were in place. Now it's > time for it to be retired. > +1 on ataraid's retirement. > It doesn't seem to me that sysinstall supports gmirror or gstripe, so > even if they could be better, currently I think many users still use > ataraid for simple installations with mirrored disks. It's hard to say; I'm sure there are some. It's fairly trivial to create gmirrors or gstripes after the install is complete. Also, gstripe volumes are not bootable. Handbook documentation has been guiding users to gmirror for some time now, and gmirror is just much easier to work with IMO. I think sade (and by extension sysinstall) is now getting some attention and now supports geom devices, zfs, etc., at least in one person's testbed. I know it's been tried before, but this time there are actually screenshots of it.
http://lists.freebsd.org/pipermail/freebsd-current/2010-April/016418.html -- Adam Vande More From owner-freebsd-geom@FreeBSD.ORG Thu Apr 22 16:47:39 2010 Return-Path: Delivered-To: freebsd-geom@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 3FA681065673 for ; Thu, 22 Apr 2010 16:47:39 +0000 (UTC) (envelope-from julianelischer@gmail.com) Received: from mail-ww0-f54.google.com (mail-ww0-f54.google.com [74.125.82.54]) by mx1.freebsd.org (Postfix) with ESMTP id C07EB8FC18 for ; Thu, 22 Apr 2010 16:47:38 +0000 (UTC) Received: by wwa36 with SMTP id 36so5606099wwa.13 for ; Thu, 22 Apr 2010 09:47:37 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=gamma; h=domainkey-signature:received:received:sender:message-id:date:from :user-agent:mime-version:to:cc:subject:references:in-reply-to :content-type:content-transfer-encoding; bh=sJK/N1yOkU7KDyG4+fjL6W5+UX2+XLSfVBn+WnsAfC4=; b=BjG4ebzbmpKv4KyfDKMeajcMWnB1A8fA5PsyvM1ddlC7ZVT43HegsCrcCFxduzh6LE y4XGkgVl9RVC/fA5P+yaqlzvYBCyAvqkOS2mhke21RvvsGSKDBHC2sUVv2iRUHujw8az sDhZscVHiMMhDQAdHr30atn5SBFLcxd5PQXCY= DomainKey-Signature: a=rsa-sha1; c=nofws; d=gmail.com; s=gamma; h=sender:message-id:date:from:user-agent:mime-version:to:cc:subject :references:in-reply-to:content-type:content-transfer-encoding; b=QugyyZUd+eu8KncyXwdN5KFSp3i1vc1+guiKfEVo54VosbC5VKIDGYXEOEsDRB2gkZ 3tHlYtIOVUCNLz2VfcJd7+U7+5pBw+SbClhI0T2eK7Evl3sIm+5Um6y107CAIStYSVeZ 2Ckq2dYdh60TTI8NKX5lv0SG6y1tJNYfD1MKo= Received: by 10.216.155.144 with SMTP id j16mr449769wek.221.1271953512977; Thu, 22 Apr 2010 09:25:12 -0700 (PDT) Received: from julian-mac.elischer.org (h-67-100-89-137.snfccasy.static.covad.net [67.100.89.137]) by mx.google.com with ESMTPS id z3sm827076wbs.22.2010.04.22.09.25.10 (version=TLSv1/SSLv3 cipher=RC4-MD5); Thu, 22 Apr 2010 09:25:11 -0700 (PDT) Sender: Julian Elischer Message-ID: <4BD07863.4020106@elischer.org> Date: Thu, 22 Apr 2010 09:25:07 
-0700 From: Julian Elischer User-Agent: Mozilla/5.0 (Macintosh; U; PPC Mac OS X 10.4; en-US; rv:1.9.1.9) Gecko/20100317 Thunderbird/3.0.4 MIME-Version: 1.0 To: Adam Vande More References: <4BD06BD9.6030401@FreeBSD.org> In-Reply-To: Content-Type: text/plain; charset=ISO-8859-1; format=flowed Content-Transfer-Encoding: 7bit Cc: freebsd-current@freebsd.org, Freddie Cash , freebsd-geom@freebsd.org Subject: Re: Switchover to CAM ATA? X-BeenThere: freebsd-geom@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: GEOM-specific discussions and implementations List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 22 Apr 2010 16:47:39 -0000 On 4/22/10 9:17 AM, Adam Vande More wrote: >> > > +1 on ataraid's retirement. just one little fly in that ointment... booting. You need to be able to act with the raid in the same way the bios does or you can't boot. I don't think geom would easily do that but I could be wrong. Certainly if you treat the ata raid as just a bunch of striped disks, then the bios will not be able to boot off it. of course don't take my word too seriously, as I'm not running an ata raid system at the moment.
From owner-freebsd-geom@FreeBSD.ORG Thu Apr 22 17:59:33 2010 Return-Path: Delivered-To: freebsd-geom@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 9BD6B106564A; Thu, 22 Apr 2010 17:59:33 +0000 (UTC) (envelope-from bu7cher@yandex.ru) Received: from forward4.mail.yandex.net (forward4.mail.yandex.net [77.88.46.9]) by mx1.freebsd.org (Postfix) with ESMTP id 45D038FC18; Thu, 22 Apr 2010 17:59:33 +0000 (UTC) Received: from web33.yandex.ru (web33.yandex.ru [213.180.223.2]) by forward4.mail.yandex.net (Yandex) with ESMTP id 74EB96AD9030; Thu, 22 Apr 2010 21:59:31 +0400 (MSD) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=yandex.ru; s=mail; t=1271959171; bh=mq6li3WsDXAKglWATzsgeyGyU52gjgU+3qqxKAttgec=; h=From:To:Cc:In-Reply-To:References:Subject:MIME-Version:Message-Id: Date:Content-Transfer-Encoding:Content-Type; b=veWTo135+H+ZT3mYV01Gx7f1zpbR7KhvQep9ZofUp7DUJZRPbvFzhueQeY32OJJO/ 5dDtmyO66uyU6iK22lBJ5eujCMz+wlU6p/135P9I0ojIbAD5W5W+JT/7AgSFVHIbgK BZAcL6J92Y4n8DlVEeZM5pKOwIhkJknroy+rvKuk= Received: from localhost (localhost.localdomain [127.0.0.1]) by web33.yandex.ru (Yandex) with ESMTP id 6E326448005B; Thu, 22 Apr 2010 21:59:31 +0400 (MSD) X-Yandex-Spam: 1 X-Yandex-Front: web33.yandex.ru X-Yandex-TimeMark: 1271959171 Received: from [77.72.138.63] ([77.72.138.63]) by mail.yandex.ru with HTTP; Thu, 22 Apr 2010 21:59:30 +0400 From: Andrey V. Elsukov To: Adam Vande More In-Reply-To: References: <4BD06BD9.6030401@FreeBSD.org> MIME-Version: 1.0 Message-Id: <61131271959171@web33.yandex.ru> Date: Thu, 22 Apr 2010 21:59:31 +0400 X-Mailer: Yamail [ http://yandex.ru ] 5.0 Content-Transfer-Encoding: 7bit Content-Type: text/plain Cc: freebsd-current@freebsd.org, Freddie Cash , freebsd-geom@freebsd.org Subject: Re: Re: Switchover to CAM ATA? 
X-BeenThere: freebsd-geom@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: GEOM-specific discussions and implementations List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 22 Apr 2010 17:59:33 -0000 22.04.10, 11:17, "Adam Vande More": > I think sade (and by extension sysinstall) is now getting some > attention and now supports geom devices, zfs, etc., at least in one person's > testbed. I know it's been tried before, but this time there are actually > screenshots of it. > > http://lists.freebsd.org/pipermail/freebsd-current/2010-April/016418.html Yes, I have plans to add support for simple GEOM-based RAID management to sade. -- WBR, Andrey V. Elsukov From owner-freebsd-geom@FreeBSD.ORG Thu Apr 22 18:17:26 2010 Return-Path: Delivered-To: freebsd-geom@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 5C033106566C for ; Thu, 22 Apr 2010 18:17:26 +0000 (UTC) (envelope-from mj@feral.com) Received: from ns1.feral.com (ns1.feral.com [192.67.166.1]) by mx1.freebsd.org (Postfix) with ESMTP id 24B1E8FC13 for ; Thu, 22 Apr 2010 18:17:25 +0000 (UTC) Received: from [192.168.221.2] (remotevpn [192.168.221.2]) by ns1.feral.com (8.14.3/8.14.3) with ESMTP id o3MHud1m025832; Thu, 22 Apr 2010 10:56:40 -0700 (PDT) (envelope-from mj@feral.com) Message-ID: <4BD08DD4.2090007@feral.com> Date: Thu, 22 Apr 2010 10:56:36 -0700 From: Matthew Jacob Organization: Feral Software User-Agent: Thunderbird 2.0.0.24 (Windows/20100228) MIME-Version: 1.0 To: Alexander Motin References: <4BD06BD9.6030401@FreeBSD.org> In-Reply-To: <4BD06BD9.6030401@FreeBSD.org> Content-Type: text/plain; charset=KOI8-R; format=flowed Content-Transfer-Encoding: 7bit X-Greylist: Sender DNS name whitelisted, not delayed by milter-greylist-4.2.3 (ns1.feral.com [192.168.221.1]); Thu, 22 Apr 2010 10:56:40 -0700 (PDT) Cc: FreeBSD-Current , freebsd-geom@freebsd.org Subject: Re:
Switchover to CAM ATA? X-BeenThere: freebsd-geom@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: GEOM-specific discussions and implementations List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 22 Apr 2010 18:17:26 -0000 Short opinion from me: Yes, for HEAD. Not MFC'able. It's too major a change for that. From owner-freebsd-geom@FreeBSD.ORG Thu Apr 22 18:43:26 2010 Return-Path: Delivered-To: freebsd-geom@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 0E94D1065675; Thu, 22 Apr 2010 18:43:26 +0000 (UTC) (envelope-from lev@FreeBSD.org) Received: from ftp.translate.ru (ftp.translate.ru [80.249.188.42]) by mx1.freebsd.org (Postfix) with ESMTP id BBAC78FC23; Thu, 22 Apr 2010 18:43:25 +0000 (UTC) Received: from desktop.home.serebryakov.spb.ru (85-142-52-164.well-com.net [85.142.52.164]) (Authenticated sender: lev@serebryakov.spb.ru) by ftp.translate.ru (Postfix) with ESMTPA id 68EEF13DF42; Thu, 22 Apr 2010 22:28:06 +0400 (MSD) Date: Thu, 22 Apr 2010 22:28:03 +0400 From: Lev Serebryakov Organization: FreeBSD X-Priority: 3 (Normal) Message-ID: <1761168370.20100422222803@serebryakov.spb.ru> To: Alexander Motin In-Reply-To: <4BD06BD9.6030401@FreeBSD.org> References: <4BD06BD9.6030401@FreeBSD.org> MIME-Version: 1.0 Content-Type: text/plain; charset=koi8-r Content-Transfer-Encoding: quoted-printable Cc: FreeBSD-Current , freebsd-geom@freebsd.org Subject: Re: Switchover to CAM ATA? X-BeenThere: freebsd-geom@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list Reply-To: lev@FreeBSD.org List-Id: GEOM-specific discussions and implementations List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 22 Apr 2010 18:43:26 -0000 Hello, Alexander. You wrote on April 22, 2010, 19:31:37: > and RAID5 (due to lack of module in a base system).
I'm cleaning up graid5 now according to style(9) and want to make a port out of it in a month or two ("unfortunately", I have a lot of paid work, which is not FreeBSD-related in any way). It works very well for me, and I have had one HDD crash already, recovered with graid5 :) -- // Black Lion AKA Lev Serebryakov From owner-freebsd-geom@FreeBSD.ORG Thu Apr 22 19:02:19 2010 Return-Path: Delivered-To: freebsd-geom@FreeBSD.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id A8740106564A for ; Thu, 22 Apr 2010 19:02:19 +0000 (UTC) (envelope-from sobomax@FreeBSD.org) Received: from sippysoft.com (gk1.360sip.com [72.236.70.240]) by mx1.freebsd.org (Postfix) with ESMTP id EC6528FC71 for ; Thu, 22 Apr 2010 19:02:17 +0000 (UTC) Received: from [192.168.1.38] (S0106005004e13421.vs.shawcable.net [70.71.175.212]) (authenticated bits=0) by sippysoft.com (8.14.3/8.14.3) with ESMTP id o3MIm3Zh022659 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=NO); Thu, 22 Apr 2010 11:48:04 -0700 (PDT) (envelope-from sobomax@FreeBSD.org) Message-ID: <4BD099E6.6000402@FreeBSD.org> Date: Thu, 22 Apr 2010 11:48:06 -0700 From: Maxim Sobolev Organization: Sippy Software, Inc. User-Agent: Thunderbird 2.0.0.24 (Windows/20100228) MIME-Version: 1.0 To: Alexander Motin References: <4BD06BD9.6030401@FreeBSD.org> In-Reply-To: <4BD06BD9.6030401@FreeBSD.org> Content-Type: text/plain; charset=KOI8-R; format=flowed Content-Transfer-Encoding: 7bit Cc: FreeBSD-Current , freebsd-geom@FreeBSD.org Subject: Re: Switchover to CAM ATA? X-BeenThere: freebsd-geom@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: GEOM-specific discussions and implementations List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 22 Apr 2010 19:02:19 -0000 Alexander Motin wrote: > So what is the public opinion: Is the lack of ataraid(4) fatal or can we > live without it?
I believe it's fatal in the long run. This would present a significant challenge for users who rely on this functionality when upgrading from 8.x to 9.0 later on, especially for the ones using striped disks and RAID5. Therefore, while it's no problem to have it in HEAD for now, it will have to be addressed before the release. -Maxim From owner-freebsd-geom@FreeBSD.ORG Thu Apr 22 19:43:20 2010 Return-Path: Delivered-To: freebsd-geom@FreeBSD.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 4F5A81065672; Thu, 22 Apr 2010 19:43:20 +0000 (UTC) (envelope-from richardtector@thekeelecentre.com) Received: from mx0.thekeelecentre.com (mx0.thekeelecentre.com [IPv6:2001:470:9391:2::3]) by mx1.freebsd.org (Postfix) with ESMTP id 0C8D48FC1E; Thu, 22 Apr 2010 19:43:20 +0000 (UTC) Received: from filter.mx0.thekeelecentre.com (filter.mx0.thekeelecentre.com [217.206.238.165]) by mx0.thekeelecentre.com (Postfix) with ESMTP id 0CD4345406; Thu, 22 Apr 2010 20:43:18 +0100 (BST) X-Virus-Scanned: amavisd-new at thekeelecentre.com Received: from mx0.thekeelecentre.com ([217.206.238.167]) by filter.mx0.thekeelecentre.com (filter.mx0.thekeelecentre.com [217.206.238.165]) (amavisd-new, port 10024) with ESMTP id 7XBDf4tFyG3I; Thu, 22 Apr 2010 19:43:15 +0000 (UTC) Received: from [10.0.9.11] (coyote.tector.org.uk [217.206.238.187]) by mx0.thekeelecentre.com (Postfix) with ESMTPA id 4DEA145411; Thu, 22 Apr 2010 20:43:15 +0100 (BST) Message-ID: <4BD0A689.8000508@thekeelecentre.com> Date: Thu, 22 Apr 2010 20:42:01 +0100 From: Richard Tector User-Agent: Mozilla/5.0 (Windows; U; Windows NT 5.1; en-GB; rv:1.9.1.9) Gecko/20100317 Thunderbird/3.0.4 MIME-Version: 1.0 To: Alexander Motin References: <4BD06BD9.6030401@FreeBSD.org> <4BD099E6.6000402@FreeBSD.org> In-Reply-To: <4BD099E6.6000402@FreeBSD.org> Content-Type: text/plain; charset=KOI8-R; format=flowed Content-Transfer-Encoding: 7bit Cc: Maxim Sobolev , FreeBSD-Current ,
freebsd-geom@FreeBSD.org Subject: Re: Switchover to CAM ATA? X-BeenThere: freebsd-geom@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: GEOM-specific discussions and implementations List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 22 Apr 2010 19:43:20 -0000 On 22/04/2010 19:48, Maxim Sobolev wrote: > Alexander Motin wrote: >> So what is the public opinion: Is the lack of ataraid(4) fatal or can we >> live without it? > > I believe it's fatal in the long run. This would present a significant > challenge for users who rely on this functionality when upgrading from > 8.x to 9.0 later on, especially for the ones using striped disks and RAID5. > > Therefore, while it's no problem to have it in HEAD for now, it > will have to be addressed before the release. Could I also add that the removal of ataraid would affect those users who dual-boot with Windows and rely on the pseudo-raid provided by most Intel chipsets to be able to share the same pair of disks. Regards, Richard From owner-freebsd-geom@FreeBSD.ORG Thu Apr 22 20:08:49 2010 Return-Path: Delivered-To: freebsd-geom@FreeBSD.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 800261065673; Thu, 22 Apr 2010 20:08:49 +0000 (UTC) (envelope-from sobomax@FreeBSD.org) Received: from sippysoft.com (gk1.360sip.com [72.236.70.240]) by mx1.freebsd.org (Postfix) with ESMTP id 0FC2F8FC21; Thu, 22 Apr 2010 20:08:48 +0000 (UTC) Received: from [192.168.1.38] (S0106005004e13421.vs.shawcable.net [70.71.175.212]) (authenticated bits=0) by sippysoft.com (8.14.3/8.14.3) with ESMTP id o3MK8lKO023085 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=NO); Thu, 22 Apr 2010 13:08:47 -0700 (PDT) (envelope-from sobomax@FreeBSD.org) Message-ID: <4BD0ACD2.3040805@FreeBSD.org> Date: Thu, 22 Apr 2010 13:08:50 -0700 From: Maxim Sobolev Organization: Sippy Software, Inc.
User-Agent: Thunderbird 2.0.0.24 (Windows/20100228) MIME-Version: 1.0 To: Richard Tector References: <4BD06BD9.6030401@FreeBSD.org> <4BD099E6.6000402@FreeBSD.org> <4BD0A689.8000508@thekeelecentre.com> In-Reply-To: <4BD0A689.8000508@thekeelecentre.com> Content-Type: text/plain; charset=KOI8-R; format=flowed Content-Transfer-Encoding: 7bit Cc: Alexander Motin , FreeBSD-Current , freebsd-geom@FreeBSD.org Subject: Re: Switchover to CAM ATA? X-BeenThere: freebsd-geom@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: GEOM-specific discussions and implementations List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 22 Apr 2010 20:08:49 -0000 Richard Tector wrote: > On 22/04/2010 19:48, Maxim Sobolev wrote: >> Alexander Motin wrote: >>> So what is the public opinion: Is the lack of ataraid(4) fatal or we can >>> live without it? >> >> I believe it's fatal in long run. This would present significant >> challenge for users who rely on this functionality from upgrading from >> 8.x to 9.0 later on. Especially for ones using striped disks and RAID5. >> >> Therefore while it's no problem to have it in HEAD for now, but it >> will have to be addressed before the release. > > Could I also add that the removal of ataraid would affect those users > who dual-boot with Windows and rely on the pseudo-RAID provided by most > Intel chipsets to be able to share the same pair of disks. Well, this won't be a problem if we have GEOM classes that can understand metadata created by the ATA RAID BIOS(es). But we don't have those classes at the moment.
-Maxim From owner-freebsd-geom@FreeBSD.ORG Thu Apr 22 20:36:54 2010 Return-Path: Delivered-To: freebsd-geom@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id D0F0E106564A for ; Thu, 22 Apr 2010 20:36:54 +0000 (UTC) (envelope-from avg@icyb.net.ua) Received: from citadel.icyb.net.ua (citadel.icyb.net.ua [212.40.38.140]) by mx1.freebsd.org (Postfix) with ESMTP id 14DDE8FC14 for ; Thu, 22 Apr 2010 20:36:53 +0000 (UTC) Received: from porto.topspin.kiev.ua (porto-e.starpoint.kiev.ua [212.40.38.100]) by citadel.icyb.net.ua (8.8.8p3/ICyb-2.3exp) with ESMTP id XAA14638; Thu, 22 Apr 2010 23:36:44 +0300 (EEST) (envelope-from avg@icyb.net.ua) Received: from localhost.topspin.kiev.ua ([127.0.0.1]) by porto.topspin.kiev.ua with esmtp (Exim 4.34 (FreeBSD)) id 1O538e-0008Ug-Ex; Thu, 22 Apr 2010 23:36:44 +0300 Message-ID: <4BD0B35B.2040006@icyb.net.ua> Date: Thu, 22 Apr 2010 23:36:43 +0300 From: Andriy Gapon User-Agent: Thunderbird 2.0.0.24 (X11/20100321) MIME-Version: 1.0 To: Lister References: <4BCEEA79.7080309@icyb.net.ua> <4BD01DBE.7030905@icyb.net.ua> <9C9F1EE6F5A24B3695327442FBF565C6@neo> In-Reply-To: <9C9F1EE6F5A24B3695327442FBF565C6@neo> X-Enigmail-Version: 0.96.0 Content-Type: text/plain; charset=windows-1256 Content-Transfer-Encoding: 8bit Cc: GEOM Subject: Re: OCE and GPT X-BeenThere: freebsd-geom@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: GEOM-specific discussions and implementations List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 22 Apr 2010 20:36:54 -0000 on 22/04/2010 18:21 Lister said the following: > Hello all, > > I'd like to make a few clarifications first: > 1. All my systems are AMD64 and either 7.1-REL or 8.0-REL. > 2. The GPT on the 5TB RAID5 I want to expand is on 7.1. The latter has > both gpt and gpart. I didn't know about gpart when I built its RAID. 
> Partition 3 is the last, 3.6TB, 87% full, and is the one desperately
> needing expansion.

OK. I can't really comment on gpart vs gpt in 7. gpart is what's in head and 8 now and it's evolving.

> 3. The GPT on 4TB RAID is on 8.0. I just built it a few days ago for a
> project (not my own). Its entire 9 partitions are still empty (just
> newfs'd) and for that reason I can use it temporarily for the experiment.
>
> Certainly, I'll share the results of the expansion experiment with you.
> It'll just be a day or so before I get there, as it evidently calls for
> a good deal of prep. Fortunately, I wrote a verbose Bash script to
> automate the process of creating a GPT, so I don't have to read
> everything when I need do it again a few months down the line. It
> handles everything from deletion, destruction, creation, newfs, tunefs,
> /etc/fstab updates, mounting and 'df' summary display. To customize,
> only a few variables need be changed. If anyone thinks this might come
> in handy someday, please let me know to post it.

You can host it on the web somewhere and post a link. This way interested parties could discover it and get the latest version of what you decide to share.

> Now regarding the hexdump commands, I used them on the 8.0 system for a
> reference visual comparison. First here's the output of gpart on that
> system:
> / :633: gpart show da0
> =>         34  7812415421  da0  GPT  (3.6T)
>            34    41943040    1  freebsd-ufs  (20G)
>      41943074   188743680    2  freebsd-ufs  (90G)
>     230686754    62914560    3  freebsd-ufs  (30G)
>     293601314  4294967296    4  freebsd-ufs  (2.0T)
>    4588568610   838860800    5  freebsd-ufs  (400G)
>    5427429410  1111490560    6  freebsd-ufs  (530G)
>    6538919970   209715200    7  freebsd-ufs  (100G)
>    6748635170   629145600    8  freebsd-ufs  (300G)
>    7377780770   434634685    9  freebsd-ufs  (207G)
>
> Here are the commands. Note that I used Bash notations here for easy
> immediate recognition. I tested every one of them before submitting this
> message.
> They do exactly the same as the ones I originally used which
> included absolute lengths (in decimal) and offsets (in decimal,
> hexadecimal and 'b' for block variations). I've thoroughly tested all
> combinations and proved them to achieve same result (which is nothingness.)
>
> # First 34 sectors of /dev/da0
> hd -n $((512*34)) /dev/da0

[hd output snipped]

> ……… edited for brevity
> # 34 sectors after the last partition
> hd -n $((512*34)) -s $((7377780770+434634685)) /dev/da0
> -> frozen
> # First 1 sector after the last partition
> hd -n 512 -s $((7377780770+434634685)) /dev/da0
> -> frozen
> # Last 1 sector of last partition
> hd -n 512 -s $((7377780770+434634685-1)) /dev/da0
> -> frozen
> # First 1 sector of last partition
> hd -n 512 -s 7377780770 /dev/da0
> -> frozen

Could you please try to use dd piped to hd? What you report makes me think that hd doesn't seek the disk to the specified offset, but reads and discards data until it reaches the offset. I may be wrong, of course, but it is worth trying dd, which is known to do the right thing.

> This is interesting: Since 7.1-REL has both gpt and gpart, I did both on
> my 5TB array. Here's the output:
> / :1134: gpt show da0
>        start        size  index  contents
>            0           1         PMBR
>            1           1         Pri GPT header
>            2          32         Pri GPT table
>           34  1677721600      1  GPT part - FreeBSD UFS/UFS2
>   1677721634   419430400      2  GPT part - FreeBSD UFS/UFS2
>   2097152034  7668367293      3  GPT part - FreeBSD UFS/UFS2
>   9765519327          32         Sec GPT table
>   9765519359           1         Sec GPT header
>
> / :1135: gpart show da0
> =>         34  9765519293  da0  GPT  (4.5T)
>            34  1677721600    1  freebsd-ufs  (800G)
>    1677721634   419430400    2  freebsd-ufs  (200G)
>    2097152034  7668367293    3  freebsd-ufs  (3.6T)
>
> Obviously the output of gpt is more detailed and does reference the 2ry
> GPT. It led me to incidentally learn that the 2ry is one sector
> shorter than 1ry on account of absent PMBR.
That's kind of obvious (for those who are into reading specification) - PMBR is for software that knows only about MBR and doesn't know about GPT. And MBR, of course, is in the first sector and there is nothing special about the last sector in MBR scheme. > This also leads me suggest to the implementers of gpart to use the > verbosity of gpt. Would you concur? Not sure. I think that gpt is too verbose here, there is not much value in reporting internal GPT structure. Perhaps some educational value, but we already have Wikipedia :-) -- Andriy Gapon From owner-freebsd-geom@FreeBSD.ORG Thu Apr 22 20:53:10 2010 Return-Path: Delivered-To: freebsd-geom@FreeBSD.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 159681065672; Thu, 22 Apr 2010 20:53:10 +0000 (UTC) (envelope-from avg@icyb.net.ua) Received: from citadel.icyb.net.ua (citadel.icyb.net.ua [212.40.38.140]) by mx1.freebsd.org (Postfix) with ESMTP id 0555F8FC18; Thu, 22 Apr 2010 20:53:08 +0000 (UTC) Received: from porto.topspin.kiev.ua (porto-e.starpoint.kiev.ua [212.40.38.100]) by citadel.icyb.net.ua (8.8.8p3/ICyb-2.3exp) with ESMTP id XAA14848; Thu, 22 Apr 2010 23:53:07 +0300 (EEST) (envelope-from avg@icyb.net.ua) Received: from localhost.topspin.kiev.ua ([127.0.0.1]) by porto.topspin.kiev.ua with esmtp (Exim 4.34 (FreeBSD)) id 1O53OU-0008Vs-HC; Thu, 22 Apr 2010 23:53:06 +0300 Message-ID: <4BD0B731.7060902@icyb.net.ua> Date: Thu, 22 Apr 2010 23:53:05 +0300 From: Andriy Gapon User-Agent: Thunderbird 2.0.0.24 (X11/20100321) MIME-Version: 1.0 To: "Andrey V. 
Elsukov" References: <4BCEE9E2.6010007@yandex.ru> <4BCEEC66.1080804@yandex.ru> <4BCEEF06.8010203@icyb.net.ua> <4BCEF5F8.6090102@yandex.ru> <4BCF04C7.1050701@icyb.net.ua> <4BD03472.6030201@yandex.ru> In-Reply-To: <4BD03472.6030201@yandex.ru> X-Enigmail-Version: 0.96.0 Content-Type: text/plain; charset=KOI8-R Content-Transfer-Encoding: 8bit Cc: Marcel Moolenaar , freebsd-geom@FreeBSD.org Subject: Re: [patch] resize and recover support for GPART GPT scheme X-BeenThere: freebsd-geom@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: GEOM-specific discussions and implementations List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 22 Apr 2010 20:53:10 -0000 on 22/04/2010 14:35 Andrey V. Elsukov said the following: > On 21.04.2010 17:59, Andriy Gapon wrote: >> I also think that this recovery mechanism is needed. >> In short: >> recover - re-create secondary table based on primary table >> reinit - relocate secondary table to a new position and update offsets >> in both >> tables accordingly > > I implemented 'recover' verb. I changed detection algoritm in GPT scheme. > Now when primary GPT header is valid reading of second header will be > done from alternateLBA offset (which read from GPT header). > When primary header is invalid reading of second header will be from the > last medium's LBA. 
>
> And now the following scenario works:
> ====================================================================
> # dd if=/dev/zero of=./d.img bs=1m count=100
> 100+0 records in
> 100+0 records out
> 104857600 bytes transferred in 2.895854 secs (36209560 bytes/sec)
> # mdconfig -f d.img
> md0
> # gpart create -s gpt md0
> md0 created
> # gpart add -t freebsd-zfs md0
> md0p1 added
> # gpart show md0
> =>    34  204733  md0  GPT  (100M)
>       34  204733    1  freebsd-zfs  (100M)
>
> # mdconfig -du 0
> # dd if=/dev/zero of=./d.img bs=1m count=50 seek=100
> 50+0 records in
> 50+0 records out
> 52428800 bytes transferred in 1.175911 secs (44585689 bytes/sec)
> # ls -lh d.img
> -rw-r--r-- 1 root wheel 150M 22 апр 14:56 d.img
> # mdconfig -f d.img
> md0
> # dmesg | tail -2
> GEOM: md0: the secondary GPT table is corrupt or invalid.
> GEOM: md0: using the primary only -- recovery suggested.
> # gpart show md0
> =>    34  204733  md0  GPT  (150M)
>       34  204733    1  freebsd-zfs  (100M)
>
> # gpart recover md0
> md0 recovered
> # gpart show md0
> =>    34  307133  md0  GPT  (150M)
>       34  204733    1  freebsd-zfs  (100M)
>   204767  102400       - free -  (50M)
>
> # gpart resize -i 1 md0
> md0p1 resized
> # gpart show md0
> =>    34  307133  md0  GPT  (150M)
>       34  307133    1  freebsd-zfs  (150M)
> ====================================================================
>
> There are several things that can be do where i need suggestions.
> 1. What code should do when user doing `gpart recover` for scheme
> that doesn't need recovering?

Do you mean the schemes that do not support recovery (like MBR)? The answer is obvious.
If you mean GPT in OK state, when both copies are correct, then the code could just return success and, perhaps, some diagnostic message.

> 2. Probably there are needed some checks before changing metadata in
> g_part_gpt_recover method.
>
> So, patch attached and comments are welcome.

Without deep analysis the patch looks good.
Just to be sure - it handles the case when primary table is corrupt but the secondary (at the end of media) is OK? I will try to fully review your patch a little bit later. Thanks a lot! -- Andriy Gapon From owner-freebsd-geom@FreeBSD.ORG Thu Apr 22 21:10:13 2010 Return-Path: Delivered-To: freebsd-geom@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 48FE5106566C; Thu, 22 Apr 2010 21:10:13 +0000 (UTC) (envelope-from avg@icyb.net.ua) Received: from citadel.icyb.net.ua (citadel.icyb.net.ua [212.40.38.140]) by mx1.freebsd.org (Postfix) with ESMTP id 46D148FC08; Thu, 22 Apr 2010 21:10:11 +0000 (UTC) Received: from porto.topspin.kiev.ua (porto-e.starpoint.kiev.ua [212.40.38.100]) by citadel.icyb.net.ua (8.8.8p3/ICyb-2.3exp) with ESMTP id AAA15153; Fri, 23 Apr 2010 00:10:03 +0300 (EEST) (envelope-from avg@icyb.net.ua) Received: from localhost.topspin.kiev.ua ([127.0.0.1]) by porto.topspin.kiev.ua with esmtp (Exim 4.34 (FreeBSD)) id 1O53et-0008XM-91; Fri, 23 Apr 2010 00:10:03 +0300 Message-ID: <4BD0BB2A.1090503@icyb.net.ua> Date: Fri, 23 Apr 2010 00:10:02 +0300 From: Andriy Gapon User-Agent: Thunderbird 2.0.0.24 (X11/20100321) MIME-Version: 1.0 To: Marcel Moolenaar References: <4BCEE9E2.6010007@yandex.ru> <4BCEEC66.1080804@yandex.ru> <4BCEEF06.8010203@icyb.net.ua> <4BCEF5F8.6090102@yandex.ru> <4BCF04C7.1050701@icyb.net.ua> <50691271872096@web136.yandex.ru> <75798832-C041-4796-8C10-5BE61FB7583A@mac.com> In-Reply-To: <75798832-C041-4796-8C10-5BE61FB7583A@mac.com> X-Enigmail-Version: 0.96.0 Content-Type: text/plain; charset=ISO-8859-1 Content-Transfer-Encoding: 7bit Cc: "Andrey V. 
Elsukov" , Marcel Moolenaar , freebsd-geom@freebsd.org Subject: Re: OCE and GPT X-BeenThere: freebsd-geom@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: GEOM-specific discussions and implementations List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 22 Apr 2010 21:10:13 -0000 on 21/04/2010 20:59 Marcel Moolenaar said the following: > On Apr 21, 2010, at 10:48 AM, Andrey V. Elsukov wrote: > >> 21.04.10, 16:59, Andriy Gapon: >> >>>> providers withing scheme. But with GPT we have problem, after >>>> booting with bigger media size the second partition table will >>>> be lost. And GPT will be broken. >>> Why? >>> Do we have it hardcoded where to look for the secondary GPT? >> Yes. Current implementation does search for second GPT table only at last LBA. >> And it violates with UEFI 2.3 specification. > > No, it's ACCORDING to the specification: > > UEFI version 2.3, page 99 (paragraph 5.3.1): > "Two GPT Header structures are stored on the device: the primary and the > backup. The primary GPT Header must be located in LBA 1 (i.e., the second > logical block), and the backup GPT Header must be located in the last LBA > of the device." This makes total sense for the 'steady state', otherwise how the secondary table would be discovered when the primary table is lost. Actually, now I think that it doesn't matter much where we look for the secondary table when we already have valid primary table - as long as we don't make it a fatal error when the secondary table is invalid. (But I still think that checking AlternateLBA is more correct, because otherwise why would it exist at all?) I guess that a reason for having the secondary table is to increase chances of survival, of getting to the data: primary table is OK, then fine; primary is bad, but secondary is OK, then still fine. (But there should be diagnostics to alert a user, of course). 
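The lookup order discussed in this thread can be sketched roughly as follows. This is only an illustrative Python model, not the GEOM code: the function name `locate_secondary_header`, its parameters, and the returned note strings are hypothetical. It assumes, as described earlier in the thread, that the AlternateLBA field is consulted when the primary header is valid and that the last LBA of the media is used when it is not.

```python
def locate_secondary_header(primary_ok, alternate_lba, media_sectors):
    """Return (lba, note): where to read the backup GPT header, plus a diagnostic.

    Hypothetical helper for illustration; not the kernel implementation.
    """
    last_lba = media_sectors - 1
    if not primary_ok:
        # Primary header is unusable, so AlternateLBA cannot be trusted:
        # fall back to the fixed location named by the specification.
        return last_lba, "primary header invalid; trying last LBA"
    if alternate_lba != last_lba:
        # The media size changed after the table was written.
        return alternate_lba, "secondary header not at last LBA; recovery suggested"
    return alternate_lba, "ok"
```

With the numbers from the md0 example in this thread (assuming 512-byte sectors, so the 100M image has 204800 sectors and the grown image has 307200, and inferring an AlternateLBA of 204799), `locate_secondary_header(True, 204799, 307200)` returns LBA 204799 together with the note that recovery is suggested.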
What we have in FreeBSD right now actually decreases chances of survival - if either table is not OK, then we disregard the other table and just fail. A user has to do a recovery using disk editor. No help from the OS. I think that what Andrey is doing takes us in the correct direction. -- Andriy Gapon From owner-freebsd-geom@FreeBSD.ORG Thu Apr 22 21:59:31 2010 Return-Path: Delivered-To: freebsd-geom@FreeBSD.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 673FD1065672 for ; Thu, 22 Apr 2010 21:59:31 +0000 (UTC) (envelope-from avg@icyb.net.ua) Received: from citadel.icyb.net.ua (citadel.icyb.net.ua [212.40.38.140]) by mx1.freebsd.org (Postfix) with ESMTP id AD78F8FC13 for ; Thu, 22 Apr 2010 21:59:30 +0000 (UTC) Received: from porto.topspin.kiev.ua (porto-e.starpoint.kiev.ua [212.40.38.100]) by citadel.icyb.net.ua (8.8.8p3/ICyb-2.3exp) with ESMTP id AAA16055 for ; Fri, 23 Apr 2010 00:59:29 +0300 (EEST) (envelope-from avg@icyb.net.ua) Received: from localhost.topspin.kiev.ua ([127.0.0.1]) by porto.topspin.kiev.ua with esmtp (Exim 4.34 (FreeBSD)) id 1O54Qi-0008bd-Sp for freebsd-geom@FreeBSD.org; Fri, 23 Apr 2010 00:59:28 +0300 Message-ID: <4BD0C6C0.30008@icyb.net.ua> Date: Fri, 23 Apr 2010 00:59:28 +0300 From: Andriy Gapon User-Agent: Thunderbird 2.0.0.24 (X11/20100321) MIME-Version: 1.0 To: freebsd-geom@FreeBSD.org X-Enigmail-Version: 0.96.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 7bit Cc: Subject: d_maxsize .. si_iosize_max X-BeenThere: freebsd-geom@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: GEOM-specific discussions and implementations List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 22 Apr 2010 21:59:31 -0000 I see that many disk drivers are careful to set an optimal value for d_maxsize. 
On the other hand, si_iosize_max is hardcoded to MAXPHYS in geom_dev.c, which leads to suboptimal I/O for devices with actual maximum I/O size less than MAXPHYS. And right now d_maxsize seems to be used only for dumperinfo.maxiosize, which is a waste. So, perhaps, d_maxsize should be passed up through the GEOM layer as a provider property, e.g. pp->maxiosize. So that si_iosize_max could be set correctly. Or, more importantly, so that final consumers (e.g. filesystems) could know the correct optimal I/O size. Perhaps, this should even be split into max_read_size and max_write_size for geoms like gmirror where reads can be done in parallel. -- Andriy Gapon From owner-freebsd-geom@FreeBSD.ORG Thu Apr 22 22:04:54 2010 Return-Path: Delivered-To: freebsd-geom@FreeBSD.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 9C84E1065670; Thu, 22 Apr 2010 22:04:54 +0000 (UTC) (envelope-from avg@icyb.net.ua) Received: from citadel.icyb.net.ua (citadel.icyb.net.ua [212.40.38.140]) by mx1.freebsd.org (Postfix) with ESMTP id 94EA38FC14; Thu, 22 Apr 2010 22:04:53 +0000 (UTC) Received: from porto.topspin.kiev.ua (porto-e.starpoint.kiev.ua [212.40.38.100]) by citadel.icyb.net.ua (8.8.8p3/ICyb-2.3exp) with ESMTP id BAA16158; Fri, 23 Apr 2010 01:04:52 +0300 (EEST) (envelope-from avg@icyb.net.ua) Received: from localhost.topspin.kiev.ua ([127.0.0.1]) by porto.topspin.kiev.ua with esmtp (Exim 4.34 (FreeBSD)) id 1O54Vv-0008c9-NY; Fri, 23 Apr 2010 01:04:51 +0300 Message-ID: <4BD0C802.3000004@icyb.net.ua> Date: Fri, 23 Apr 2010 01:04:50 +0300 From: Andriy Gapon User-Agent: Thunderbird 2.0.0.24 (X11/20100321) MIME-Version: 1.0 To: freebsd-geom@FreeBSD.org, freebsd-fs@FreeBSD.org X-Enigmail-Version: 0.96.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 7bit Cc: Subject: vdev_geom_io: parallelize ?
X-BeenThere: freebsd-geom@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: GEOM-specific discussions and implementations List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 22 Apr 2010 22:04:54 -0000 Just thinking out loud. Currently ZFS vdev_geom_io does something like: for (...) { ... g_io_request(...); biowait(...); ... } I/O is done in MAXPHYS chunks. If that was changed to first issuing all the requests and only after that waiting on them, could there be any performance benefit? Or cases of vdev_geom_io with size > MAXPHYS are too rare? Or something else? Thanks! -- Andriy Gapon From owner-freebsd-geom@FreeBSD.ORG Fri Apr 23 04:36:09 2010 Return-Path: Delivered-To: freebsd-geom@FreeBSD.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 58EC2106564A; Fri, 23 Apr 2010 04:36:09 +0000 (UTC) (envelope-from bu7cher@yandex.ru) Received: from forward3.mail.yandex.net (forward3.mail.yandex.net [77.88.46.8]) by mx1.freebsd.org (Postfix) with ESMTP id 029438FC1B; Fri, 23 Apr 2010 04:36:08 +0000 (UTC) Received: from smtp2.mail.yandex.net (smtp2.mail.yandex.net [77.88.46.102]) by forward3.mail.yandex.net (Yandex) with ESMTP id 30E3E56D915A; Fri, 23 Apr 2010 08:36:07 +0400 (MSD) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=yandex.ru; s=mail; t=1271997367; bh=9ZnhZLMFYsHw1SKkNltUZwmcAj4S2t1ed20q78mnwHQ=; h=Message-ID:Date:From:MIME-Version:To:CC:Subject:References: In-Reply-To:Content-Type:Content-Transfer-Encoding; b=RT9NXyuADFcW+ZtMkGsb1Xlx2W6dVuUTOMDX03xmAlrmmFQf6egH+HWsBoFPAA/jV nvVDHUmScthDlxZtHIs3My2jiERpl1h5frRVZF4AzZv06oNZLcPRHXOWsag562Xuvl YliyLPB3ERMIM8tDUHoDrJl0TjdtPv8t3JRHds/M= Received: from [127.0.0.1] (ns.kirov.so-ups.ru [77.72.136.145]) by smtp2.mail.yandex.net (Yandex) with ESMTPSA id E4528528080; Fri, 23 Apr 2010 08:36:06 +0400 (MSD) Message-ID: <4BD123B5.1020506@yandex.ru> Date: Fri, 23 Apr 2010 08:36:05 +0400 
From: "Andrey V. Elsukov" User-Agent: Mozilla Thunderbird 1.5 (FreeBSD/20051231) MIME-Version: 1.0 To: Andriy Gapon References: <4BCEE9E2.6010007@yandex.ru> <4BCEEC66.1080804@yandex.ru> <4BCEEF06.8010203@icyb.net.ua> <4BCEF5F8.6090102@yandex.ru> <4BCF04C7.1050701@icyb.net.ua> <4BD03472.6030201@yandex.ru> <4BD0B731.7060902@icyb.net.ua> In-Reply-To: <4BD0B731.7060902@icyb.net.ua> Content-Type: text/plain; charset=KOI8-R; format=flowed Content-Transfer-Encoding: 7bit X-Yandex-TimeMark: 1271997367 X-Yandex-Spam: 1 X-Yandex-Front: smtp2.mail.yandex.net Cc: Marcel Moolenaar , freebsd-geom@FreeBSD.org Subject: Re: [patch] resize and recover support for GPART GPT scheme X-BeenThere: freebsd-geom@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: GEOM-specific discussions and implementations List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Fri, 23 Apr 2010 04:36:09 -0000 On 23.04.2010 0:53, Andriy Gapon wrote: >> There are several things that can be do where i need suggestions. >> 1. What code should do when user doing `gpart recover` for scheme >> that doesn't need recovering? > > Do you mean the schemes that do not support recovery (like MBR)? The answer is > obvious. > If you mean GPT in OK state, when both copies are correct, then the code could > just return success and, perhaps, some diagnostic message. Schemes that do not support recovery will return ENOSYS. Currently `gpart recover` will write metadata each time when user calls it. To prevent this behavior needed something similar ENOTNEEDED :) >> 2. Probably there are needed some checks before changing metadata in >> g_part_gpt_recover method. >> >> So, patch attached and comments are welcome. > > Without deep analysis the patch looks good. Actually this was an implementation of `reinit` verb, but after some thinking i decided to make it as `recover` verb. And in this case using AlternateLBA is not needed. 
At this time it may be usable only for reporting that secondary table is not located at the last LBA. > Just to be sure - it handles the case when primary table is corrupt but the > secondary (at the end of media) is OK? Yes. > I will try to fully review your patch a little bit later. Thank you, but i still wait what Marcel will say about this patch and several others. -- WBR, Andrey V. Elsukov From owner-freebsd-geom@FreeBSD.ORG Fri Apr 23 06:08:56 2010 Return-Path: Delivered-To: freebsd-geom@FreeBSD.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 9B606106566C; Fri, 23 Apr 2010 06:08:56 +0000 (UTC) (envelope-from pjd@garage.freebsd.pl) Received: from mail.garage.freebsd.pl (chello089077043238.chello.pl [89.77.43.238]) by mx1.freebsd.org (Postfix) with ESMTP id 33CCE8FC08; Fri, 23 Apr 2010 06:08:55 +0000 (UTC) Received: by mail.garage.freebsd.pl (Postfix, from userid 65534) id 8BB7445E8E; Fri, 23 Apr 2010 08:08:54 +0200 (CEST) Received: from localhost (chello089077043238.chello.pl [89.77.43.238]) (using TLSv1 with cipher DHE-RSA-AES256-SHA (256/256 bits)) (No client certificate requested) by mail.garage.freebsd.pl (Postfix) with ESMTP id 554384569A; Fri, 23 Apr 2010 08:08:49 +0200 (CEST) Date: Fri, 23 Apr 2010 08:08:50 +0200 From: Pawel Jakub Dawidek To: Andriy Gapon Message-ID: <20100423060850.GB1670@garage.freebsd.pl> References: <4BD0C802.3000004@icyb.net.ua> Mime-Version: 1.0 Content-Type: multipart/signed; micalg=pgp-sha1; protocol="application/pgp-signature"; boundary="BwCQnh7xodEAoBMC" Content-Disposition: inline In-Reply-To: <4BD0C802.3000004@icyb.net.ua> User-Agent: Mutt/1.4.2.3i X-PGP-Key-URL: http://people.freebsd.org/~pjd/pjd.asc X-OS: FreeBSD 9.0-CURRENT i386 X-Spam-Checker-Version: SpamAssassin 3.0.4 (2005-06-05) on mail.garage.freebsd.pl X-Spam-Level: X-Spam-Status: No, score=-0.6 required=4.5 tests=BAYES_00,RCVD_IN_SORBS_DUL autolearn=no version=3.0.4 Cc: 
freebsd-fs@FreeBSD.org, freebsd-geom@FreeBSD.org Subject: Re: vdev_geom_io: parallelize ? X-BeenThere: freebsd-geom@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: GEOM-specific discussions and implementations List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Fri, 23 Apr 2010 06:08:56 -0000 --BwCQnh7xodEAoBMC Content-Type: text/plain; charset=us-ascii Content-Disposition: inline Content-Transfer-Encoding: quoted-printable On Fri, Apr 23, 2010 at 01:04:50AM +0300, Andriy Gapon wrote: >=20 > Just thinking out loud. >=20 > Currently ZFS vdev_geom_io does something like: > for (...) { > ... > g_io_request(...); > biowait(...); > ... > } > I/O is done in MAXPHYS chunks. >=20 > If that was changed to first issuing all the requests and only after that > waiting on them, could there be any performance benefit? > Or cases of vdev_geom_io with size > MAXPHYS are too rare? > Or something else? The vdev_geom_io() function is there only to read ZFS labels, it is not used during regular I/O. Regular I/O requests are handled asynchronously by the vdev_geom_io_start() function. --=20 Pawel Jakub Dawidek http://www.wheelsystems.com pjd@FreeBSD.org http://www.FreeBSD.org FreeBSD committer Am I Evil? Yes, I Am! 
--BwCQnh7xodEAoBMC Content-Type: application/pgp-signature Content-Disposition: inline -----BEGIN PGP SIGNATURE----- Version: GnuPG v2.0.14 (FreeBSD) iEYEARECAAYFAkvROXIACgkQForvXbEpPzSM9wCcDPLqokCtvb9D/QzxkGAOX3oL t90An3ssb9u19Zgw/x32k0xE5P5QLnHF =LTp5 -----END PGP SIGNATURE----- --BwCQnh7xodEAoBMC-- From owner-freebsd-geom@FreeBSD.ORG Fri Apr 23 06:42:56 2010 Return-Path: Delivered-To: freebsd-geom@FreeBSD.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 8D2CA106564A; Fri, 23 Apr 2010 06:42:56 +0000 (UTC) (envelope-from avg@icyb.net.ua) Received: from citadel.icyb.net.ua (citadel.icyb.net.ua [212.40.38.140]) by mx1.freebsd.org (Postfix) with ESMTP id 67F588FC1E; Fri, 23 Apr 2010 06:42:55 +0000 (UTC) Received: from porto.topspin.kiev.ua (porto-e.starpoint.kiev.ua [212.40.38.100]) by citadel.icyb.net.ua (8.8.8p3/ICyb-2.3exp) with ESMTP id JAA23663; Fri, 23 Apr 2010 09:42:54 +0300 (EEST) (envelope-from avg@icyb.net.ua) Received: from localhost.topspin.kiev.ua ([127.0.0.1]) by porto.topspin.kiev.ua with esmtp (Exim 4.34 (FreeBSD)) id 1O5CbF-000BL2-Nu; Fri, 23 Apr 2010 09:42:53 +0300 Message-ID: <4BD1416D.30207@icyb.net.ua> Date: Fri, 23 Apr 2010 09:42:53 +0300 From: Andriy Gapon User-Agent: Thunderbird 2.0.0.24 (X11/20100321) MIME-Version: 1.0 To: Pawel Jakub Dawidek References: <4BD0C802.3000004@icyb.net.ua> <20100423060850.GB1670@garage.freebsd.pl> In-Reply-To: <20100423060850.GB1670@garage.freebsd.pl> X-Enigmail-Version: 0.96.0 Content-Type: text/plain; charset=KOI8-U Content-Transfer-Encoding: 7bit Cc: freebsd-fs@FreeBSD.org, freebsd-geom@FreeBSD.org Subject: Re: vdev_geom_io: parallelize ? 
X-BeenThere: freebsd-geom@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: GEOM-specific discussions and implementations List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Fri, 23 Apr 2010 06:42:56 -0000 on 23/04/2010 09:08 Pawel Jakub Dawidek said the following: > On Fri, Apr 23, 2010 at 01:04:50AM +0300, Andriy Gapon wrote: >> Just thinking out loud. >> >> Currently ZFS vdev_geom_io does something like: >> for (...) { >> ... >> g_io_request(...); >> biowait(...); >> ... >> } >> I/O is done in MAXPHYS chunks. >> >> If that was changed to first issuing all the requests and only after that >> waiting on them, could there be any performance benefit? >> Or cases of vdev_geom_io with size > MAXPHYS are too rare? >> Or something else? > > The vdev_geom_io() function is there only to read ZFS labels, it is not > used during regular I/O. Regular I/O requests are handled asynchronously > by the vdev_geom_io_start() function. Oops. Thanks! 
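For illustration only, the two issue patterns compared in this thread (issue and wait per chunk, versus issue everything and then wait) can be modelled in a few lines of Python. Nothing here is ZFS or GEOM code: `split_chunks`, `io_serial`, `io_batched`, and the `do_io` callback are hypothetical names, and `CHUNK` is an assumed 128 KiB stand-in for MAXPHYS. As the reply above notes, vdev_geom_io() only reads labels, so this shows the idea rather than a needed optimization.

```python
from concurrent.futures import ThreadPoolExecutor

CHUNK = 128 * 1024  # assumed stand-in for MAXPHYS; the real value is system-dependent


def split_chunks(offset, size, chunk=CHUNK):
    """Split one large request into (offset, length) pieces of at most `chunk` bytes."""
    pieces = []
    end = offset + size
    while offset < end:
        length = min(chunk, end - offset)
        pieces.append((offset, length))
        offset += length
    return pieces


def io_serial(do_io, offset, size):
    """Issue one piece and wait for it before issuing the next."""
    return [do_io(off, length) for off, length in split_chunks(offset, size)]


def io_batched(do_io, offset, size, workers=4):
    """Issue every piece first, then wait for all of them, keeping request order."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        futures = [pool.submit(do_io, off, length)
                   for off, length in split_chunks(offset, size)]
        return [f.result() for f in futures]
```

Both functions return results in request order, so a caller can swap one for the other; any benefit from batching depends on whether the underlying device can service several requests at once.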
-- Andriy Gapon From owner-freebsd-geom@FreeBSD.ORG Fri Apr 23 06:54:55 2010 Return-Path: Delivered-To: freebsd-geom@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id EBA8D106566B for ; Fri, 23 Apr 2010 06:54:55 +0000 (UTC) (envelope-from joel@FreeBSD.org) Received: from mail.vnode.se (mail.vnode.se [62.119.52.80]) by mx1.freebsd.org (Postfix) with ESMTP id 58A6B8FC1E for ; Fri, 23 Apr 2010 06:54:55 +0000 (UTC) Received: from mail.vnode.se (localhost [127.0.0.1]) by mail.vnode.se (Postfix) with ESMTP id 41A47E3F085; Fri, 23 Apr 2010 08:36:59 +0200 (CEST) X-Virus-Scanned: amavisd-new at vnode.se Received: from mail.vnode.se ([127.0.0.1]) by mail.vnode.se (mail.vnode.se [127.0.0.1]) (amavisd-new, port 10024) with ESMTP id HyNbYv2ytjq8; Fri, 23 Apr 2010 08:36:58 +0200 (CEST) Received: from bubba.vnode.local (unknown [83.223.1.131]) (using TLSv1 with cipher DHE-RSA-AES256-SHA (256/256 bits)) (No client certificate requested) by mail.vnode.se (Postfix) with ESMTPSA id 105CCE3F077; Fri, 23 Apr 2010 08:36:57 +0200 (CEST) Date: Fri, 23 Apr 2010 08:36:56 +0200 From: Joel Dahl To: Alexander Motin Message-ID: <20100423063655.GC84889@bubba.vnode.local> References: <4BD06BD9.6030401@FreeBSD.org> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <4BD06BD9.6030401@FreeBSD.org> User-Agent: Mutt/1.5.20 (2009-06-14) Cc: FreeBSD-Current , freebsd-geom@freebsd.org Subject: Re: Switchover to CAM ATA? X-BeenThere: freebsd-geom@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: GEOM-specific discussions and implementations List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Fri, 23 Apr 2010 06:54:56 -0000 On 22-04-2010 18:31, Alexander Motin wrote: > Hi. > > With time passed, CAM-based ATA infrastructure IMHO looks enough mature > now to enable it in HEAD. 
Now we have two new stable drivers ahci(4) and > siis(4), covering major part of modern SATA HBAs, `options ATA_CAM` > wrapper for ata(4) to support legacy hardware, and one more improved > driver for Marvell HBAs (mvs) is now in development and soon will be > present for testing. Together with many other people I have tested above > at least on i386, amd64, arm and sparc64 architectures. If we want this in 9.0 we should probably switch to CAM ATA in HEAD now in order to allow enough testing before the release. -- Joel From owner-freebsd-geom@FreeBSD.ORG Fri Apr 23 08:41:11 2010 Return-Path: Delivered-To: freebsd-geom@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id DFF8B1065670; Fri, 23 Apr 2010 08:41:11 +0000 (UTC) (envelope-from paul@fletchermoorland.co.uk) Received: from hydra.fletchermoorland.co.uk (hydra.fletchermoorland.co.uk [78.33.209.59]) by mx1.freebsd.org (Postfix) with ESMTP id 6B23E8FC1B; Fri, 23 Apr 2010 08:41:10 +0000 (UTC) Received: from demophon.fletchermoorland.co.uk (demophon.fletchermoorland.co.uk [192.168.0.154]) by hydra.fletchermoorland.co.uk (8.14.3/8.14.3) with ESMTP id o3N8f7lR062160; Fri, 23 Apr 2010 09:41:07 +0100 (BST) (envelope-from paul@fletchermoorland.co.uk) Message-ID: <4BD15D23.8090501@fletchermoorland.co.uk> Date: Fri, 23 Apr 2010 08:41:07 +0000 From: Paul Wootton User-Agent: Thunderbird 2.0.0.23 (X11/20091217) MIME-Version: 1.0 To: Alexander Motin References: <4BD06BD9.6030401@FreeBSD.org> In-Reply-To: <4BD06BD9.6030401@FreeBSD.org> Content-Type: text/plain; charset=UTF-8; format=flowed Content-Transfer-Encoding: 7bit X-Spam-Status: No, score=2.5 required=10.0 tests=ALL_TRUSTED,BAYES_50, DNS_FROM_OPENWHOIS,FH_DATE_PAST_20XX autolearn=no version=3.2.5 X-Spam-Level: ** X-Spam-Checker-Version: SpamAssassin 3.2.5 (2008-06-10) on hydra.fletchermoorland.co.uk Cc: FreeBSD-Current , freebsd-geom@freebsd.org Subject: Re: Switchover to CAM ATA?
X-BeenThere: freebsd-geom@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: GEOM-specific discussions and implementations List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Fri, 23 Apr 2010 08:41:12 -0000 Alexander Motin wrote: > > Can we do switchover now, or some more reasons preventing this? > The only thing I miss about the old ATA layer was that I knew that a drive on a particular controller would always be assigned the same adX number, whether it was present at boot time, or added days later. This could get a little messy having ad2, ad4, ad12, ad20 and ad22, but at least if I added a new drive, it would always attach to, say, ad8. Can this be done on the new CAM ATA? Paul From owner-freebsd-geom@FreeBSD.ORG Fri Apr 23 09:52:22 2010 Return-Path: Delivered-To: freebsd-geom@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 8A0D21065677 for ; Fri, 23 Apr 2010 09:52:22 +0000 (UTC) (envelope-from lister@kawashti.org) Received: from mra.kawashti.org (mra.kawashti.org [78.136.5.95]) by mx1.freebsd.org (Postfix) with ESMTP id 251638FC18 for ; Fri, 23 Apr 2010 09:52:21 +0000 (UTC) Received: from mx.kawashti.org (mx.kawashti.org [196.218.21.179]) (using SSLv3 with cipher DHE-RSA-AES256-SHA (256/256 bits)) (No client certificate requested) by mra.kawashti.org (Postfix) with ESMTP id E8E4C4902E6 for ; Fri, 23 Apr 2010 10:52:11 +0100 (BST) Received: from neo ([10.10.10.10]) by mx.kawashti.org (Kawashti Mail) with SMTP id RDS02182 for ; Fri, 23 Apr 2010 11:52:06 +0200 Message-ID: <7724D0551E924C73B53BA9F22773503D@neo> From: "Lister" To: "GEOM" References: <4BCEEA79.7080309@icyb.net.ua><4BD01DBE.7030905@icyb.net.ua><9C9F1EE6F5A24B3695327442FBF565C6@neo> <4BD0B35B.2040006@icyb.net.ua> Date: Fri, 23 Apr 2010 11:51:57 +0200 MIME-Version: 1.0 Content-Type: text/plain; format=flowed; charset="windows-1256"; reply-type=original
Content-Transfer-Encoding: 8bit X-Priority: 3 X-MSMail-Priority: Normal X-Mailer: Microsoft Outlook Express 6.00.3790.4548 X-MimeOLE: Produced By Microsoft MimeOLE V6.00.3790.4325 Subject: Re: OCE and GPT X-BeenThere: freebsd-geom@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: GEOM-specific discussions and implementations List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Fri, 23 Apr 2010 09:52:22 -0000 I thought doing the OCE was an open and shut case, dare I to dream! I thought Online Capacity Expansion– and RAID Level Migration (ORLM) were features accessible from 3Ware's HBA's BIOS. They were not. It turned out to be implemented in a software called 3DM2 (3Ware Disk Manager 2.) The latter marked the beginning of a new set of problems on FreeBSD-8.0. For reference, it's basically a daemon listening on port 888 and accessible from a browser and some tools to email admin alerts…etc. The package, though, was over 100MB installed from a 59MB sh script! 3Ware has specific S/W for FreeBSD-5.x, 6.x, 7.x, but not 8.0! I thought that may not be a problem after all, and tried to install it on 8.0. It insisted that I must install JRE beforehand. I tried to install diablo-jre16 from ports after update. That one, along the way installed so many other ports to satisfy its dependencies, including many X packages. I don't have X on any of my servers and I don't want it. Any way, installation succeeded in the end. But will it run? /:602: /usr/sbin/3dm2 Fatal error 'kse_create() failed ' at line 441 in file /usr/src/lib/libpthread/thread/thr_kern.c (errno = 2) After googling around, I found the exact same message, differing only in (errno.) It was regarding the same HBA as mine (3W 9650SE-xxxxx), yet on FreeBSD-7.0. The user was advised to add 'options KSE' to his kernel and recompile. He subsequently reported that this had solved his problem. I thought, that's promising. 
I added the option to my kernel and tried to build. Alas, I got an "unknown option" error. This is what my 7.1 knows about kse:

/:1181: apropos kse
kse(2) - kernel support for user threads

Clearly it's very pertinent to the error message. But what does 8.0 know about it?

/:615: apropos kse
Pod::Simple::LinkSection(3) - -- represent "section" attributes of L codes

Can anyone help with this? Although the solution seems promising on 7.1, I can't risk losing seven years' worth of library and must first experiment on the empty RAID on 8.0. Moving HBAs or disks between systems is not an option, for many strong reasons.

Note on where the extra space generated by OCE goes:
----------------------------------------------------
Although my logic had convinced me from the beginning that the extra space must go to the end of the RAID volume, I had to verify this. I did so from forums and 3Ware documentation. Nonetheless, neither referenced multiple partitions, and they merely mentioned 'extending the disk' under Windows and 'resizing the partition and extending the filesystem' under others; and in either case 'or create a new partition.'

Logically, again, this should make no difference. This is predicated on the facts that HBA manufacturers tout OCE and ORLM as non-destructive (features), and also that hardware RAID HBAs must be OS/FS-indifferent.

The new capacity is reflected only after a complete rebuild of the array, which occurs automatically after adding a drive to an existing array; both counts are logical again. Additionally, after the complete rebuild, an OS-dependent 'nudge' is needed to let it know of the change. However, the OS never needs to be rebooted unless it boots from the expanded RAID.

I thought I'd share this info with you.
Kind regards, Hatem Kawashti From owner-freebsd-geom@FreeBSD.ORG Fri Apr 23 10:13:45 2010 Return-Path: Delivered-To: freebsd-geom@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 57A1D106566B; Fri, 23 Apr 2010 10:13:45 +0000 (UTC) (envelope-from mavbsd@gmail.com) Received: from ey-out-2122.google.com (ey-out-2122.google.com [74.125.78.26]) by mx1.freebsd.org (Postfix) with ESMTP id AF47F8FC16; Fri, 23 Apr 2010 10:13:44 +0000 (UTC) Received: by ey-out-2122.google.com with SMTP id d26so671006eyd.9 for ; Fri, 23 Apr 2010 03:13:43 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=gamma; h=domainkey-signature:received:received:sender:message-id:date:from :user-agent:mime-version:to:cc:subject:references:in-reply-to :x-enigmail-version:content-type:content-transfer-encoding; bh=Y08EZZxwuqtTgIsS9PuPizOfC8qOVJPtsI2f4rEQ9sI=; b=BuyOsycAOnfavFDFix1UkiAG754GP3khgwJ/qLngt1NbCpjBlT/VeMrMgepmGwvfWR zHYmrrJyQxxa/UvShfvKG6KWfTZqIliZUDi2nufJzl3Gd7tVSoCFgKO0HxRX6GqYom3b WgSpAFDrpBNhMAKfpE4DBBMpauvFT3bl5t2GM= DomainKey-Signature: a=rsa-sha1; c=nofws; d=gmail.com; s=gamma; h=sender:message-id:date:from:user-agent:mime-version:to:cc:subject :references:in-reply-to:x-enigmail-version:content-type :content-transfer-encoding; b=KAL9htV1QVxTHHERovu/BWqzpMmF8fpglxDvVZnqJ1xVjjjHRXspKsBoRkAlFukvbc Xfi/vA9NYJSsBdqAgEgUXbYEC6R8EX8fOJuAKtvya4apdFZ0PVSzoSVLeihN6f4IKihN BTg5a5KFxLH9Q6V75mmeoYmYqI1dif8fjDnqs= Received: by 10.102.254.24 with SMTP id b24mr6392099mui.5.1272017623371; Fri, 23 Apr 2010 03:13:43 -0700 (PDT) Received: from mavbook.mavhome.dp.ua (pc.mavhome.dp.ua [212.86.226.226]) by mx.google.com with ESMTPS id y37sm3735511mug.52.2010.04.23.03.13.42 (version=SSLv3 cipher=RC4-MD5); Fri, 23 Apr 2010 03:13:42 -0700 (PDT) Sender: Alexander Motin Message-ID: <4BD172CA.4040106@FreeBSD.org> Date: Fri, 23 Apr 2010 13:13:30 +0300 From: Alexander Motin User-Agent: Thunderbird 2.0.0.24 
(X11/20100402) MIME-Version: 1.0 To: Paul Wootton References: <4BD06BD9.6030401@FreeBSD.org> <4BD15D23.8090501@fletchermoorland.co.uk> In-Reply-To: <4BD15D23.8090501@fletchermoorland.co.uk> X-Enigmail-Version: 0.96.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 7bit Cc: FreeBSD-Current , freebsd-geom@freebsd.org Subject: Re: Switchover to CAM ATA? X-BeenThere: freebsd-geom@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: GEOM-specific discussions and implementations List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Fri, 23 Apr 2010 10:13:45 -0000 Paul Wootton wrote: > Alexander Motin wrote: >> Can we do switchover now, or some more reasons preventing this? > > The only thing I miss about the old ATA layer was that I knew that a > drive on a particular controller would always be assigned the same adX > number, whether it was present at boot time, or added days later. This > could get a little messy having ad2, ad4, ad12, ad20 and ad22, but at > least if I added a new drive, it would always attach to, say, ad8. > > Can this be done on the new CAM ATA? Binding to controller ports and device IDs can be managed for any CAM device via device.hints as described in cam(4). This scheme is slightly more complicated (you have to explicitly define the wanted mapping), but more flexible. The previous one is just inapplicable now. Modern controllers (especially with Port Multipliers) can support a different (often large) number of devices per channel, making a device list with static numbering too sparse.
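As a rough illustration of the kind of wiring cam(4) describes, a /boot/device.hints fragment might look like the one below. The channel names (ahcich0, ahcich1) and unit numbers are assumptions picked for the example and are not taken from this thread; the real names depend on the driver in use and can be read from the boot messages or from `camcontrol devlist -v`.

```
# /boot/device.hints (illustrative only; channel names are assumptions)
# Pin each CAM bus to a specific controller channel ...
hint.scbus.0.at="ahcich0"
hint.scbus.1.at="ahcich1"
# ... then pin each disk unit to a bus, so the unit number follows the port.
hint.ada.0.at="scbus0"
hint.ada.1.at="scbus1"
```

With a mapping like this, a disk plugged into the second channel would be expected to show up as ada1 whether or not the first channel is populated.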
-- Alexander Motin From owner-freebsd-geom@FreeBSD.ORG Fri Apr 23 13:50:57 2010 Return-Path: Delivered-To: freebsd-geom@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 6D113106566C; Fri, 23 Apr 2010 13:50:57 +0000 (UTC) (envelope-from jhb@freebsd.org) Received: from cyrus.watson.org (cyrus.watson.org [65.122.17.42]) by mx1.freebsd.org (Postfix) with ESMTP id 433BC8FC0C; Fri, 23 Apr 2010 13:50:57 +0000 (UTC) Received: from bigwig.baldwin.cx (66.111.2.69.static.nyinternet.net [66.111.2.69]) by cyrus.watson.org (Postfix) with ESMTPSA id 0300F46B03; Fri, 23 Apr 2010 09:50:57 -0400 (EDT) Received: from jhbbsd.localnet (smtp.hudson-trading.com [209.249.190.9]) by bigwig.baldwin.cx (Postfix) with ESMTPA id 1C1418A025; Fri, 23 Apr 2010 09:50:53 -0400 (EDT) From: John Baldwin To: freebsd-current@freebsd.org Date: Fri, 23 Apr 2010 09:50:33 -0400 User-Agent: KMail/1.12.1 (FreeBSD/7.3-CBSD-20100217; KDE/4.3.1; amd64; ; ) References: <4BD06BD9.6030401@FreeBSD.org> In-Reply-To: <4BD06BD9.6030401@FreeBSD.org> MIME-Version: 1.0 Content-Type: Text/Plain; charset="koi8-r" Content-Transfer-Encoding: 7bit Message-Id: <201004230950.33999.jhb@freebsd.org> X-Greylist: Sender succeeded SMTP AUTH, not delayed by milter-greylist-4.0.1 (bigwig.baldwin.cx); Fri, 23 Apr 2010 09:50:53 -0400 (EDT) X-Virus-Scanned: clamav-milter 0.95.1 at bigwig.baldwin.cx X-Virus-Status: Clean X-Spam-Status: No, score=-2.6 required=4.2 tests=AWL,BAYES_00 autolearn=ham version=3.2.5 X-Spam-Checker-Version: SpamAssassin 3.2.5 (2008-06-10) on bigwig.baldwin.cx Cc: Alexander Motin , freebsd-geom@freebsd.org Subject: Re: Switchover to CAM ATA? 
X-BeenThere: freebsd-geom@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: GEOM-specific discussions and implementations List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Fri, 23 Apr 2010 13:50:57 -0000 On Thursday 22 April 2010 11:31:37 am Alexander Motin wrote: > If ataraid(4) should be reimplemented in GEOM, then how exactly? One > more separate RAID infrastructure in GEOM (third?) looks excessive. > Reuse gmirror, gstripe,... code would be nice, but will make them more > complicated and could be not easy for RAID0+1 (due to common metadata) > and RAID5 (due to lack of module in a base system). Scott's view (which sounds good to me) is that GEOM should include a library of routines for working with common transforms such as RAID1, striping, etc. Each ATA RAID vendor format would then consist of a small GEOM module that used the library routines to manage all the I/O and the bulk of the module would be managing a specific metadata format. 
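One way to picture the split described above is a shared set of transform routines plus small per-format modules that only know how to read their own metadata. The Python sketch below is purely illustrative: the names (stripe_map, mirror_targets, VendorFormat, parse_demo) and the three-byte metadata layout are hypothetical and do not correspond to any existing GEOM interface or vendor format.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Tuple


# --- shared "library" routines: pure address transforms -------------------

def stripe_map(lba: int, ndisks: int, stripe_blocks: int) -> Tuple[int, int]:
    """RAID0: map a logical block to (disk index, block on that disk)."""
    stripe_no, within = divmod(lba, stripe_blocks)
    row, disk = divmod(stripe_no, ndisks)
    return disk, row * stripe_blocks + within


def mirror_targets(lba: int, ndisks: int) -> List[Tuple[int, int]]:
    """RAID1: every disk holds the block at the same offset."""
    return [(disk, lba) for disk in range(ndisks)]


# --- per-vendor module: only knows how to decode its metadata -------------

@dataclass
class VendorFormat:
    name: str
    parse: Callable[[bytes], Dict[str, int]]


def parse_demo(raw: bytes) -> Dict[str, int]:
    # Made-up layout: byte 0 = RAID level, byte 1 = disk count,
    # byte 2 = stripe size in blocks (ignored for RAID1).
    return {"level": raw[0], "ndisks": raw[1], "stripe_blocks": raw[2]}


def write_targets(fmt: VendorFormat, raw_metadata: bytes,
                  lba: int) -> List[Tuple[int, int]]:
    """Decode metadata with the vendor module, then use the shared routines."""
    cfg = fmt.parse(raw_metadata)
    if cfg["level"] == 0:
        return [stripe_map(lba, cfg["ndisks"], cfg["stripe_blocks"])]
    if cfg["level"] == 1:
        return mirror_targets(lba, cfg["ndisks"])
    raise ValueError("unsupported RAID level: %d" % cfg["level"])
```

Here the vendor-specific part shrinks to parse_demo, while the address arithmetic lives in one place and could be reused by any number of formats.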
-- John Baldwin From owner-freebsd-geom@FreeBSD.ORG Fri Apr 23 14:28:30 2010 Return-Path: Delivered-To: freebsd-geom@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 4979D106564A; Fri, 23 Apr 2010 14:28:30 +0000 (UTC) (envelope-from mavbsd@gmail.com) Received: from ey-out-2122.google.com (ey-out-2122.google.com [74.125.78.26]) by mx1.freebsd.org (Postfix) with ESMTP id 79E818FC15; Fri, 23 Apr 2010 14:28:29 +0000 (UTC) Received: by ey-out-2122.google.com with SMTP id d26so685557eyd.9 for ; Fri, 23 Apr 2010 07:28:28 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=gamma; h=domainkey-signature:received:received:sender:message-id:date:from :user-agent:mime-version:to:cc:subject:references:in-reply-to :x-enigmail-version:content-type:content-transfer-encoding; bh=xZyCXifGH87KBPSB7uMBJyj8ErXLoipDhHjX6ajDJhg=; b=rsNRPjNiYPITnbEazNF2O0Uy6Ea7i8sGkzB2g9DsQB++7OlgwF0bHobyJXtK7NnbEw gyLRrjuB2fv4AvOQKBb1xofLzTYo270q6Vf3DEFdkQR7pe6S6DxYN0NoRoneD2qAo9so kXoZkyn1Ql3QCoaw5Z3YiLuRsBQ4q21f0dyXQ= DomainKey-Signature: a=rsa-sha1; c=nofws; d=gmail.com; s=gamma; h=sender:message-id:date:from:user-agent:mime-version:to:cc:subject :references:in-reply-to:x-enigmail-version:content-type :content-transfer-encoding; b=GRyVCtgSO3zMLwx0v/ZUTILAp3DpHHl5G2pQaCUxindvEmhF/+/6RDNC6uoAhKwE+v k9KpMcG//ZvKe7mx0c0g5s7WGhe6o0KUqGiSMyNftXleVPkfDuTsagjt6ZLlYrvKds5y mRGMr3d5L/nLAeDayPhd0y/KGuyjO9Cd5LdhM= Received: by 10.102.243.26 with SMTP id q26mr70937muh.34.1272032907941; Fri, 23 Apr 2010 07:28:27 -0700 (PDT) Received: from mavbook.mavhome.dp.ua (pc.mavhome.dp.ua [212.86.226.226]) by mx.google.com with ESMTPS id y6sm4751902mug.50.2010.04.23.07.28.26 (version=SSLv3 cipher=RC4-MD5); Fri, 23 Apr 2010 07:28:27 -0700 (PDT) Sender: Alexander Motin Message-ID: <4BD1AE7F.2060907@FreeBSD.org> Date: Fri, 23 Apr 2010 17:28:15 +0300 From: Alexander Motin User-Agent: Thunderbird 2.0.0.24 (X11/20100402) 
MIME-Version: 1.0 To: John Baldwin References: <4BD06BD9.6030401@FreeBSD.org> <201004230950.33999.jhb@freebsd.org> In-Reply-To: <201004230950.33999.jhb@freebsd.org> X-Enigmail-Version: 0.96.0 Content-Type: text/plain; charset=KOI8-R Content-Transfer-Encoding: 7bit Cc: freebsd-current@freebsd.org, freebsd-geom@freebsd.org Subject: Re: Switchover to CAM ATA? X-BeenThere: freebsd-geom@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: GEOM-specific discussions and implementations List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Fri, 23 Apr 2010 14:28:30 -0000 John Baldwin wrote: > On Thursday 22 April 2010 11:31:37 am Alexander Motin wrote: >> If ataraid(4) should be reimplemented in GEOM, then how exactly? One >> more separate RAID infrastructure in GEOM (third?) looks excessive. >> Reuse gmirror, gstripe,... code would be nice, but will make them more >> complicated and could be not easy for RAID0+1 (due to common metadata) >> and RAID5 (due to lack of module in a base system). > > Scott's view (which sounds good to me) is that GEOM should include a library > of routines for working with common transforms such as RAID1, striping, etc. > Each ATA RAID vendor format would then consist of a small GEOM module that > used the library routines to manage all the I/O and the bulk of the module > would be managing a specific metadata format. Yes, I remember he proposed it somewhere. Idea is fine. Somebody with sharp axe and lack of fear should just chop half of GEOM modules into small pieces and collect them back. 
;) -- Alexander Motin From owner-freebsd-geom@FreeBSD.ORG Fri Apr 23 14:40:49 2010 Return-Path: Delivered-To: freebsd-geom@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 036321065703 for ; Fri, 23 Apr 2010 14:40:49 +0000 (UTC) (envelope-from ticso@cicely7.cicely.de) Received: from raven.bwct.de (raven.bwct.de [85.159.14.73]) by mx1.freebsd.org (Postfix) with ESMTP id 6551E8FC1F for ; Fri, 23 Apr 2010 14:40:47 +0000 (UTC) Received: from mail.cicely.de ([10.1.1.37]) by raven.bwct.de (8.13.4/8.13.4) with ESMTP id o3NE42es042161 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=OK); Fri, 23 Apr 2010 16:04:03 +0200 (CEST) (envelope-from ticso@cicely7.cicely.de) Received: from cicely7.cicely.de (cicely7.cicely.de [10.1.1.9]) by mail.cicely.de (8.14.3/8.14.3) with ESMTP id o3NE3xUJ024177 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=NO); Fri, 23 Apr 2010 16:03:59 +0200 (CEST) (envelope-from ticso@cicely7.cicely.de) Received: from cicely7.cicely.de (localhost [127.0.0.1]) by cicely7.cicely.de (8.14.2/8.14.2) with ESMTP id o3NE3xdX007705; Fri, 23 Apr 2010 16:03:59 +0200 (CEST) (envelope-from ticso@cicely7.cicely.de) Received: (from ticso@localhost) by cicely7.cicely.de (8.14.2/8.14.2/Submit) id o3NE3w5n007704; Fri, 23 Apr 2010 16:03:58 +0200 (CEST) (envelope-from ticso) Date: Fri, 23 Apr 2010 16:03:58 +0200 From: Bernd Walter To: John Baldwin Message-ID: <20100423140358.GC1575@cicely7.cicely.de> References: <4BD06BD9.6030401@FreeBSD.org> <201004230950.33999.jhb@freebsd.org> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <201004230950.33999.jhb@freebsd.org> X-Operating-System: FreeBSD cicely7.cicely.de 7.0-STABLE i386 User-Agent: Mutt/1.5.11 X-Spam-Status: No, score=-2.9 required=5.0 tests=ALL_TRUSTED=-1, BAYES_00=-1.9, T_RP_MATCHES_RCVD=-0.01 autolearn=ham version=3.3.0 X-Spam-Checker-Version: SpamAssassin 
3.3.0 (2010-01-18) on spamd.cicely.de Cc: Alexander Motin , freebsd-current@freebsd.org, freebsd-geom@freebsd.org Subject: Re: Switchover to CAM ATA? X-BeenThere: freebsd-geom@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list Reply-To: ticso@cicely.de List-Id: GEOM-specific discussions and implementations List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Fri, 23 Apr 2010 14:40:49 -0000 On Fri, Apr 23, 2010 at 09:50:33AM -0400, John Baldwin wrote: > On Thursday 22 April 2010 11:31:37 am Alexander Motin wrote: > > If ataraid(4) should be reimplemented in GEOM, then how exactly? One > > more separate RAID infrastructure in GEOM (third?) looks excessive. > > Reuse gmirror, gstripe,... code would be nice, but will make them more > > complicated and could be not easy for RAID0+1 (due to common metadata) > > and RAID5 (due to lack of module in a base system). > > Scott's view (which sounds good to me) is that GEOM should include a library > of routines for working with common transforms such as RAID1, striping, etc. > Each ATA RAID vendor format would then consist of a small GEOM module that > used the library routines to manage all the I/O and the bulk of the module > would be managing a specific metadata format. I remember that the SCSI standard has support for XOR read-modify-write operations in addition to normal read/write, to reduce RAID5 latency and bandwidth. I'm not sure if any devices actually support it, but I think this may be worthwhile for networked devices. -- B.Walter http://www.bwct.de Modbus/TCP Ethernet I/O modules, ARM-based FreeBSD computers and much more.
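The read-modify-write idea mentioned above rests on a simple identity: in a RAID5 stripe, the new parity equals the old parity XOR the old data XOR the new data, so a small write needs only the old block and the old parity rather than the whole stripe. A minimal Python sketch of that arithmetic follows. The function names are illustrative, and this is only the math behind the idea, not the SCSI XOR command set itself.

```python
from functools import reduce
from typing import Sequence


def xor_blocks(*blocks: bytes) -> bytes:
    """XOR equally sized blocks together, byte by byte."""
    if not blocks:
        raise ValueError("need at least one block")
    size = len(blocks[0])
    if any(len(b) != size for b in blocks):
        raise ValueError("all blocks must have the same length")
    # zip(*blocks) yields one tuple per byte position across all blocks.
    return bytes(reduce(lambda x, y: x ^ y, column) for column in zip(*blocks))


def full_parity(data_blocks: Sequence[bytes]) -> bytes:
    """Parity computed the expensive way: read every data block in the stripe."""
    return xor_blocks(*data_blocks)


def rmw_parity(old_parity: bytes, old_data: bytes, new_data: bytes) -> bytes:
    """Parity computed the cheap way: only the block being rewritten is needed."""
    return xor_blocks(old_parity, old_data, new_data)
```

Both routes give the same parity; the second touches two blocks instead of the whole stripe, and a device that performs the XOR itself would avoid shipping the old data and parity back to the host.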
From owner-freebsd-geom@FreeBSD.ORG Fri Apr 23 14:44:59 2010 Return-Path: Delivered-To: freebsd-geom@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 07052106566B for ; Fri, 23 Apr 2010 14:44:59 +0000 (UTC) (envelope-from scottl@samsco.org) Received: from pooker.samsco.org (pooker.samsco.org [168.103.85.57]) by mx1.freebsd.org (Postfix) with ESMTP id 9F7078FC16 for ; Fri, 23 Apr 2010 14:44:58 +0000 (UTC) Received: from [127.0.0.1] (pooker.samsco.org [168.103.85.57]) (authenticated bits=0) by pooker.samsco.org (8.14.3/8.14.3) with ESMTP id o3NENWOg082169; Fri, 23 Apr 2010 08:23:32 -0600 (MDT) (envelope-from scottl@samsco.org) Mime-Version: 1.0 (Apple Message framework v1078) Content-Type: text/plain; charset=us-ascii From: Scott Long In-Reply-To: <201004230950.33999.jhb@freebsd.org> Date: Fri, 23 Apr 2010 08:23:32 -0600 Content-Transfer-Encoding: quoted-printable Message-Id: References: <4BD06BD9.6030401@FreeBSD.org> <201004230950.33999.jhb@freebsd.org> To: John Baldwin X-Mailer: Apple Mail (2.1078) X-Spam-Status: No, score=-1.0 required=3.8 tests=ALL_TRUSTED, T_RP_MATCHES_RCVD autolearn=unavailable version=3.3.0 X-Spam-Checker-Version: SpamAssassin 3.3.0 (2010-01-18) on pooker.samsco.org Cc: Alexander Motin , freebsd-current@freebsd.org, freebsd-geom@freebsd.org Subject: Re: Switchover to CAM ATA? X-BeenThere: freebsd-geom@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: GEOM-specific discussions and implementations List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Fri, 23 Apr 2010 14:44:59 -0000 On Apr 23, 2010, at 7:50 AM, John Baldwin wrote: > On Thursday 22 April 2010 11:31:37 am Alexander Motin wrote: >> If ataraid(4) should be reimplemented in GEOM, then how exactly? One >> more separate RAID infrastructure in GEOM (third?) looks excessive. >> Reuse gmirror, gstripe,... 
code would be nice, but will make them more
>> complicated and could be not easy for RAID0+1 (due to common metadata)
>> and RAID5 (due to lack of module in a base system).
>
> Scott's view (which sounds good to me) is that GEOM should include a library
> of routines for working with common transforms such as RAID1, striping, etc.
> Each ATA RAID vendor format would then consist of a small GEOM module that
> used the library routines to manage all the I/O and the bulk of the module
> would be managing a specific metadata format.
>

THIS

It's hard for me to talk about RAID and FreeBSD without getting into a long sermon, so I'll try to keep this short. RAID is about data integrity, not about mirror/stripe/parity algorithms. Those algorithms are just a small part of RAID, and are merely tools for achieving the goals of RAID. But RAID != algorithms. It's like how we use linked lists extensively within the kernel, but the kernel != linked lists.

A well-designed software raid stack is going to be an engine that manages topology, executes and rolls back I/O transactions, and handles error recovery. On-disk metadata is again just an algorithm that is part of this whole picture, and should be modularized as such along with the transforms.

And to be even more brief, the existing GEOM RAID modules are designed in completely the wrong direction from this. Caveat Emptor.
Scott From owner-freebsd-geom@FreeBSD.ORG Fri Apr 23 16:02:01 2010 Return-Path: Delivered-To: freebsd-geom@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 06E56106566B; Fri, 23 Apr 2010 16:02:01 +0000 (UTC) (envelope-from fjwcash@gmail.com) Received: from mail-pw0-f54.google.com (mail-pw0-f54.google.com [209.85.160.54]) by mx1.freebsd.org (Postfix) with ESMTP id C6A708FC22; Fri, 23 Apr 2010 16:02:00 +0000 (UTC) Received: by pwi9 with SMTP id 9so7152077pwi.13 for ; Fri, 23 Apr 2010 09:02:00 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=gamma; h=domainkey-signature:mime-version:received:received:in-reply-to :references:date:message-id:subject:from:to:content-type; bh=IJgg45wnbomW2d6hGU38L1JrO5gcsFONtYrc1WnP0G8=; b=pVHbUffdi7A/ZFSo68UGfKM6OclM3DKOvV/l0NfEmyudH6imaG1ixeI2LDFvz9XZwl aOtWeZaXRF97KcmxRLLA7PnN5bWc2Knx/NhMILNQvd0ysfTKHMsgu1FmH3quq0H2E87J DkjPvNxkw837wj0TkVzELUR2Xbv4lBZ3Ltxww= DomainKey-Signature: a=rsa-sha1; c=nofws; d=gmail.com; s=gamma; h=mime-version:in-reply-to:references:date:message-id:subject:from:to :content-type; b=RU1+AwNsQiTUuSTklRWTb440yDuOnvNpoGO8XAfGoG8sYlcoMtHwHjwUY/KU8hS2qn XcijM0+ZJqsM00EYI9KRLzxd1lDhFP0JIyPpbRZ4t+T/Klx3gXmK9fkm3GEemtpWC6uJ IuHnQhl1tLbgO6L9OYmKObPV9CNPXSmWwAg94= MIME-Version: 1.0 Received: by 10.140.82.9 with SMTP id f9mr306459rvb.130.1272038514761; Fri, 23 Apr 2010 09:01:54 -0700 (PDT) Received: by 10.231.18.74 with HTTP; Fri, 23 Apr 2010 09:01:53 -0700 (PDT) In-Reply-To: <4BD15D23.8090501@fletchermoorland.co.uk> References: <4BD06BD9.6030401@FreeBSD.org> <4BD15D23.8090501@fletchermoorland.co.uk> Date: Fri, 23 Apr 2010 09:01:53 -0700 Message-ID: From: Freddie Cash To: FreeBSD-Current , freebsd-geom@freebsd.org Content-Type: text/plain; charset=UTF-8 X-Content-Filtered-By: Mailman/MimeDel 2.1.5 Cc: Subject: Re: Switchover to CAM ATA? 
X-BeenThere: freebsd-geom@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: GEOM-specific discussions and implementations List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Fri, 23 Apr 2010 16:02:01 -0000 On Fri, Apr 23, 2010 at 1:41 AM, Paul Wootton wrote: > Alexander Motin wrote: > >> >> Can we do switchover now, or some more reasons preventing this? >> >> The only thing I miss about the old ATA layer was that I knew that a drive > on a particular controller would always be assigned the same adX number, > whether it was present at boot time, or added days later. This could get a > little messy having ad2, ad4, ad12, ad20 and ad22, but at least if I added a > new drive, it would always attach to, say, ad8. > > Can this be done on the new CAM ATA? I have not tried it with ATA_CAM, but in theory, you should be able to wire things down the same as with SCSI devices. It just takes a bit of mucking around with camcontrol output, and sticking the right info into /boot/loader.conf. See the man page for camcontrol for all the details.
-- Freddie Cash fjwcash@gmail.com From owner-freebsd-geom@FreeBSD.ORG Sat Apr 24 13:48:07 2010 Return-Path: Delivered-To: freebsd-geom@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id A81E0106564A for ; Sat, 24 Apr 2010 13:48:07 +0000 (UTC) (envelope-from lister@kawashti.org) Received: from mra.kawashti.org (mra.kawashti.org [78.136.5.95]) by mx1.freebsd.org (Postfix) with ESMTP id 463458FC08 for ; Sat, 24 Apr 2010 13:48:06 +0000 (UTC) Received: from mx.kawashti.org (mx.kawashti.org [196.218.21.179]) (using SSLv3 with cipher DHE-RSA-AES256-SHA (256/256 bits)) (No client certificate requested) by mra.kawashti.org (Postfix) with ESMTP id BC85A4902E5 for ; Sat, 24 Apr 2010 14:48:04 +0100 (BST) Received: from neo ([10.10.10.10]) by mx.kawashti.org (Kawashti Mail) with SMTP id RDS02182 for ; Sat, 24 Apr 2010 15:48:02 +0200 Message-ID: <8848B2F8F5AC4BBF9341A2E60A4328A2@neo> From: "Lister" To: "GEOM" Date: Sat, 24 Apr 2010 15:47:52 +0200 MIME-Version: 1.0 Content-Type: text/plain; format=flowed; charset="windows-1256"; reply-type=original Content-Transfer-Encoding: 7bit X-Priority: 3 X-MSMail-Priority: Normal X-Mailer: Microsoft Outlook Express 6.00.3790.4548 X-MIMEOLE: Produced By Microsoft MimeOLE V6.00.3790.4325 Subject: gmirror of 2 H/W RAID5s and nVidia SATARAID X-BeenThere: freebsd-geom@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: GEOM-specific discussions and implementations List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Sat, 24 Apr 2010 13:48:07 -0000 Hello all, This posting is somewhat related to my earlier one titled "OCE and GPT". In a production environment, I have these 2 systems: SH1 & SH2. They're both 7.1-REL and AMD64. I built them to achieve "RAID for the doubly paranoid." Together they have 3 RAID5's of the same 4TB data. 
To lower the risks even more, I've intentionally used different brands of everything but the hard drives, CPUs and RAM. Even the chassis and redundant PSUs are different.

SH1 has 2 3Ware 9550SXU HBAs each doing a RAID5. The 2 resultant RAID disks (da0 & da1) are gmirrored to get RAID51. They're mirrored as /dev/mirror/RAID51.

SH2 has 1 HighPoint RocketRAID 2520 connected to an external enclosure (Norco DS-1500). It has the 3rd RAID5, which is rsync'd to SH1's mirror.

Despite the different motherboards (Asus and Gigabyte), they both feature "nVidia Media Shield", which is used to set up a RAID1 from which either system boots FreeBSD. In other words, the mobo's RAID1 is entirely for the OS, and the RAID5's are entirely and strictly data-only. SH2's RAID1 was cloned from SH1's using LiveCD and dump/restore over ssh. Then, host-specific files were patched.

I used gpt to partition /dev/mirror/RAID51 on SH1 and da0 on SH2. I didn't know about gpart then. Partitions are, therefore, /dev/mirror/RAID51p1~n on SH1 and /dev/da0p1~n on SH2.

Both systems have been functioning satisfactorily in production for over 2 months now, and still are. However, since the very beginning, with every system boot, SH1's kernel reports that the secondary GPT is corrupt or invalid for both da0 & da1. Additionally, for some reason it thinks it has both ar0 and ar1 (the mobo's nVidia RAID1) and that both are degraded. Obviously, there's only ar0. It's the one I installed FreeBSD onto and is the only one in SH1's /etc/fstab.

Here's an excerpt from the syslogs of both SH1 & SH2 for 1 such incident. To keep the lines shorter, I've removed the timestamp, host and source, and kept the latter 2 as headers. I've also attached a screenshot of the text for better readability.
sh1 kernel:
------------
ad8: 238475MB at ata4-master SATA150
ad10: 238474MB at ata5-master SATA150
da0 at twa0 bus 0 target 0 lun 0
da0: Fixed Direct Access SCSI-5 device
da0: 100.000MB/s transfers
da0: 3814656MB (7812415488 512 byte sectors: 255H 63S/T 486300C)
da1 at twa1 bus 0 target 0 lun 0
da1: Fixed Direct Access SCSI-5 device
da1: 100.000MB/s transfers
da1: 3814656MB (7812415488 512 byte sectors: 255H 63S/T 486300C)
ar0: WARNING - mirror protection lost. RAID1 array in DEGRADED mode
ar0: 238475MB status: DEGRADED
ar0: disk0 READY (master) using ad8 at ata4-master
ar0: disk1 DOWN no device found for this subdisk
ar1: WARNING - mirror protection lost. RAID1 array in DEGRADED mode
ar1: 238474MB status: DEGRADED
ar1: disk0 DOWN no device found for this subdisk
ar1: disk1 READY (mirror) using ad10 at ata5-master
GEOM: da0: the secondary GPT table is corrupt or invalid.
GEOM: da0: using the primary only -- recovery suggested.
GEOM: da1: the secondary GPT table is corrupt or invalid.
GEOM: da1: using the primary only -- recovery suggested.
GEOM_MIRROR: Device mirror/RAID51 launched (2/2).

sh2 kernel:
------------
ad6: 238475MB at ata3-master SATA150
ad10: 238475MB at ata5-master SATA150
da0 at hptrr0 bus 0 target 0 lun 0
da0: Fixed Direct Access SCSI-0 device
ar0: 238475MB status: READY
ar0: disk0 READY (master) using ad6 at ata3-master
ar0: disk1 READY (mirror) using ad10 at ata5-master
===

Note how SH2 is free of either manifestation.

Now the questions. They all concern SH1:

1 -- I didn't partition either da0 or da1. I only did /dev/mirror/RAID51. Why the messages about corrupt or invalid 2ries? What can I do to make those messages go away?

2 -- Why does FreeBSD think it has 2 RAID1's ar0 & ar1, and that both are degraded. What can I do about it?

3 -- Although I don't need to follow the kernel's "suggestion" of recovery, suppose I actually needed to, on a different system, how can I go about that, and now?
Although there was a lot on this topic under my previous thread "OCE and GPT", what can a layman, like myself, who's not willing to read hundreds of pages of specs, do? To that end, I've already done some quick probing using dd & hd on my own 'healthy' system subject of the thread "OCE & GPT" and found that I can directly copy the 32-sector table from 1ry to 2ry and vice versa because they were identical as evidenced by cmp & hd. However, I found the headers to be different. Is this normal? If so, then I'd appreciate a quick pointer to how the 2 headers are constructed, to save some precious time. -- Hatem Kawashti From owner-freebsd-geom@FreeBSD.ORG Sat Apr 24 19:42:14 2010 Return-Path: Delivered-To: freebsd-geom@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id D25DD106564A for ; Sat, 24 Apr 2010 19:42:14 +0000 (UTC) (envelope-from marius@alchemy.franken.de) Received: from alchemy.franken.de (alchemy.franken.de [194.94.249.214]) by mx1.freebsd.org (Postfix) with ESMTP id 63D978FC15 for ; Sat, 24 Apr 2010 19:42:13 +0000 (UTC) Received: from alchemy.franken.de (localhost [127.0.0.1]) by alchemy.franken.de (8.14.3/8.14.3/ALCHEMY.FRANKEN.DE) with ESMTP id o3OJUYWF010232; Sat, 24 Apr 2010 21:30:35 +0200 (CEST) (envelope-from marius@alchemy.franken.de) Received: (from marius@localhost) by alchemy.franken.de (8.14.3/8.14.3/Submit) id o3OJUYdO010231; Sat, 24 Apr 2010 21:30:34 +0200 (CEST) (envelope-from marius) Date: Sat, 24 Apr 2010 21:30:34 +0200 From: Marius Strobl To: Alexander Motin Message-ID: <20100424193034.GA9853@alchemy.franken.de> References: <4BD06BD9.6030401@FreeBSD.org> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <4BD06BD9.6030401@FreeBSD.org> User-Agent: Mutt/1.4.2.3i Cc: FreeBSD-Current , freebsd-geom@freebsd.org Subject: Re: Switchover to CAM ATA? 
X-BeenThere: freebsd-geom@freebsd.org
X-Mailman-Version: 2.1.5
Precedence: list
List-Id: GEOM-specific discussions and implementations
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
X-List-Received-Date: Sat, 24 Apr 2010 19:42:14 -0000

On Thu, Apr 22, 2010 at 06:31:37PM +0300, Alexander Motin wrote:
> Hi.
>
> With time passed, CAM-based ATA infrastructure IMHO looks enough mature
> now to enable it in HEAD. Now we have two new stable drivers ahci(4) and
> siis(4), covering major part of modern SATA HBAs, `options ATA_CAM`
> wrapper for ata(4) to support legacy hardware, and one more improved
> driver for Marvell HBAs (mvs) is now in development and soon will be
> present for testing. Together with many other people I have tested above
> at least on i386, amd64, arm and sparc64 architectures.
>
> This switchover would give us significant performance improvement on new
> hardware because of NCQ support in ahci/siis/mvs drivers; improved
> functionality, including SATA Port Multipliers support, better hot-plug
> support; and reduced code duplication between ata(4) and cam(4)
> subsystems and applications.
>
> Two issues left at this moment are:
> 1) POLA breakage due to disk device being renamed from adX to adaY;
> 2) lack of ataraid(4) alternative in new infrastructure. It should be
> reimplemented in GEOM in some way, but it still wasn't.
>
> So what is the public opinion: Is the lack of ataraid(4) fatal or we can
> live without it?
>
> Can we do switchover now, or some more reasons preventing this?
>

As noted earlier, pc98 and sparc64 need ada(4)/CAM ATA to perform geometry translation as done by ad_firmware_geom_adjust() for ad(4), which the following patch hooks up to both:
http://people.freebsd.org/~marius/ata_disk_firmware_geom_adjust.diff

You preferred to implement such functionality via XPT_CALC_GEOMETRY though (I'm still not convinced that it makes sense to put this functionality into every ATA SIM the same way it is done for SCSI rather than letting ada(4) handle it the same way for all SIMs however). Have you looked into implementing XPT_CALC_GEOMETRY for ATA CAM or is it okay to commit the above patch?

Marius