From owner-freebsd-questions@freebsd.org  Sat Nov 21 22:33:39 2020
From: Scott Bennett <bennett@sdf.org>
Message-Id: <202011212233.0ALMXfvE022876@sdf.org>
Date: Sat, 21 Nov 2020 16:33:41 -0600
To: dpchrist@holgerdanske.com
Subject: [SOLVED] Re: "zpool attach" problem
Cc: freebsd-questions@freebsd.org
User-Agent: Heirloom mailx 12.5 6/20/10
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Transfer-Encoding: 7bit
Hi David,
     Thanks for your reply.  I was about to respond to my own message to
say that the issue has been resolved, but I saw your reply first.  However,
I respond below to your comments and questions, as well as stating what the
problem turned out to be.

On Fri, 20 Nov 2020 21:16:06 -0800 David Christensen
<dpchrist@holgerdanske.com> wrote:

>On 2020-11-19 22:59, Scott Bennett via freebsd-questions wrote:
>>      I had a pool with two two-way mirrors as the top-level vdevs.  I
>> needed to shift some of those partitions by a short distance on the
>> drives, so I detached and deleted and rebuilt them one at a time until
>> I hit a snag.  Here is the situation.
>>
>> Script started on Fri Nov 20 00:40:36 2020
>> hellas# gpart show -l ada2 da0 da1 da2
>> =>          40  5860533088  ada2  GPT  (2.7T)
>>             40  4294967296     1  WD-WMC130F2V1RN   (2.0T)
>>     4294967336    31457496        - free -          (15G)
>>     4326424832   125829120    11  zmisc mirror-0 1  (60G)
>>     4452253952   209715200    15  bw2-0             (100G)
>>     4661969152  1198563976        - free -          (572G)
>>
>> =>          34  3907029101  da0  GPT  (1.8T)
>>             34          14       - free -           (7.0K)
>>             48  3749709824    1  WD WCC4MH1P7LYS    (1.7T)
>>     3749709872    73400320    5  bw1-0              (35G)
>>     3823110192        2000       - free -           (1.0M)
>>     3823112192    83886080    8  zmisc mirror-1 1   (40G)
>>     3906998272       30863       - free -           (15M)
>>
>> =>          34  3907029100  da1  GPT  (1.8T)
>>             34          14       - free -           (7.0K)
>>             48  3749709824    1  Seagate NA5KYLVM   (1.7T)
>>     3749709872          16       - free -           (8.0K)
>>     3749709888    73400320    5  bw1-1              (35G)
>>     3823110208        1984       - free -           (992K)
>>     3823112192    83886080    8  zmisc mirror-1 0   (40G)
>>     3906998272       30862       - free -           (15M)
>>
>> =>          40  3907029088  da2  GPT  (1.8T)
>>             40           8       - free -           (4.0K)
>>             48  3749709824    1  WD-WCC6N7KD2YAK    (1.7T)
>>     3749709872          16       - free -           (8.0K)
>>     3749709888    31457280    5  bw0-0              (15G)
>>     3781167168        1984       - free -           (992K)
>>     3781169152   125829120    8  zmisc mirror-0 0   (60G)
>>     3906998272       30856       - free -           (15M)
>>
>> hellas# zpool status zmisc
>>   pool: zmisc
>>  state: ONLINE
>>   scan: resilvered 25.8G in 0 days 00:16:07 with 0 errors on Fri Nov 20 00:10:19 2020
>> config:
>>
>>         NAME        STATE     READ WRITE CKSUM
>>         zmisc       ONLINE       0     0     0
>>           ada2p11   ONLINE       0     0     0
>>           mirror-1  ONLINE       0     0     0
>>             da0p8   ONLINE       0     0     0
>>             da1p8   ONLINE       0     0     0
>>
>> errors: No known data errors
>> hellas# zpool attach zmisc ada2p11 da2p8
>> cannot attach da2p8 to ada2p11: no such pool or dataset
>> hellas# exit
>> exit
>>
>> Script done on Fri Nov 20 00:42:33 2020
>>
>>      Would somebody please tell me what I am doing wrong here?  Many
>> thanks in advance to whoever can help.
>
>It looks like you added the slice ada2p11 to zmisc, rather than the
>mirror ada2p11 da2p8.
>If so, these commands could fix things:

     No, ada2p11 is what was left after detaching a partition from
mirror-0 of that pool.

>
> 	# zpool remove zmisc ada2p11
>
> 	# zpool add zmisc mirror ada2p11 da2p8
>

     I thought about doing that, but the allocated portion of mirror-0 was
too much to fit into the free space in mirror-1.  Also, even if mirror-1
could have held that much, that kind of monkeying around ends up creating
a situation of horribly unbalanced allocation, and so I would have
hesitated at least a day or three to see if I could find a better way, and
it was a good thing that I stopped and went to bed when I was done posting
my message.  (See confession further below.)

>
>But, I am confused by your storage architecture.  Why one internal "3
>TB" drive and three external "2 TB" drives?  What is the 2.0T internal
>slice for?  What are the three 1.7 GiB external slices for?  What are

     Long story.  Sigh.  About eight or nine years ago I began using ZFS
under 9.something i386.  (Currently the machine is running 11.4-STABLE
amd64.)  At first it was all experimental while I learned enough to begin
to feel some confidence in using it.  Once I had purchased six 1.8 TB
drives, I created my largest pool, called rz7A, and quickly moved my
backups and archives into it, and AFAIK I have not lost a single byte due
to hardware errors, power failures, or anything else since then.  (I
likely need to think up a better name for it, but that is way down on the
list of my worries for now.)  It comprised six 1.7 TB partitions on the
six 1.8 TB (actually closer to 1.9 TB, but FreeBSD truncates, rather than
rounds) drives in a raidz2.  That left a bit of room for other things I
intended to do that would take up much less space.  It also meant that
those 1.7 TB partitions could be exactly the same in terms of space and
not differ among them due to slight differences in the real storage
capacities of drives of different make{,r}s and models.
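     [Editorial aside: the truncate-not-round behavior mentioned above is
easy to check with shell arithmetic.  The numbers below are a generic
vendor example, not the exact drives in this thread.]

```shell
# Illustrative only: why a vendor "2 TB" (decimal) drive shows up as
# "1.8T" in gpart output -- sizes are displayed in binary TiB and
# truncated rather than rounded (1.819... TiB -> "1.8T", not "1.9T").
bytes=$((2 * 1000000000000))            # vendor "2 TB" in decimal bytes
tenths=$((bytes * 10 / 1099511627776))  # tenths of a TiB (2^40 bytes/TiB)
echo "${tenths%?}.${tenths#?}T"         # prints 1.8T
```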
     In the intervening time there have been many drive failures (mostly
Seagates, but a few aged-out WD drives, too).  About a year ago, a drive
failed, and I replaced it with a WD Black 1.8 TB drive, which continues to
function flawlessly.  Then in January or February two drives failed in
rapid succession.  At that time, I found two 2.7 TB enterprise drives as
replacements, and they were priced much lower apiece than the drive I had
bought a month or two earlier.  While allocating the partitions on them,
I allocated 2 TB on each as the replacements for the 1.7 TB partitions
that were on the failed drives.
     This past summer one of the new enterprise drives failed.  It turned
out that the reason they had been available so cheaply was that they had
been leftover stock of a now discontinued line, so basically they were
sold at a closeout price.  Getting a replacement for the failed enterprise
drive under warranty turned out to be a nightmare.
     First, the manufacturer said they didn't have a drive of that
capacity in the new line, and they wanted to know if I would accept a
"4 TB" drive as a replacement, which I naturally approved.  When no drive
appeared after two weeks, I called and discovered they had left the
apartment number off of the address, even though I had had the agent
repeat the address back to me on the phone.  The parcel service had
returned the drive to them as undeliverable.  The manufacturer then turned
around and *gave my drive to somebody else*, which I believe legally
constitutes theft and sale of stolen property, but I did not pursue that.
     They said they would send another, but that didn't appear either.  I
called and was told that it had been held up until they could confirm the
shipping address *again*, which I then did.  When the 3.6 TB replacement
arrived, it was *not* an enterprise drive.  I called again and asked what
was going on and was told that they had substituted a non-enterprise drive
because they didn't have a "4 TB" enterprise drive available.
     I then gave them a pretty bad time about leaving my array at risk in
a degraded state for so long by not living up to their warranty, as well
as about having given a drive that belonged to me away to somebody else.
They kept putting me off by requiring me to speak with another and yet
another person in their company, usually requiring separate phone calls on
different days and shipping the non-enterprise drive back to them, but
eventually someone arranged for an enterprise drive (of their current line
of enterprise drives) to be shipped from their Canadian inventory, with an
expected additional delay due to having to pass customs, exacerbated by
the COVID-19 situation.  The drive arrived after one week.  Total time
until I had a replacement under warranty was nearly *two months* on a
failed *enterprise* drive.  I know I am not a high-volume customer like
Netflix or Amazon, but really(!) that seems unreasonable.
     So that is the story in a nutshell of how my ever-changing
configuration has evolved and why some of the unallocated space on the
drives appears where it does.  As the 1.8 TB drives give up, I intend to
replace them with larger-capacity drives and expand the single top-level
vdev in that pool, such that each component will have a 2 TB capacity,
rather than its current 1.7 TB capacity.  If disk capacities continue to
increase, with prices decreasing fast enough compared to the remaining
lifetimes of the 1.8 TB drives, I may expand the components still further.
The two enterprise drives already have the spare space to expand their
components quite substantially beyond the present 2 TB each.

>the bw?-? slices for, and why are they different sizes?  Why are the
>zmisc slices different sizes?  What about ada0 and ada1?
>And, do you

Name        Status    Components
mirror/bw0  COMPLETE  da3p5 (ACTIVE)
                      da2p5 (ACTIVE)
mirror/bw1  COMPLETE  da0p5 (ACTIVE)
                      da1p5 (ACTIVE)
mirror/bw2  COMPLETE  ada2p15 (ACTIVE)
                      ada3p5 (ACTIVE)

Name              Status  Components
concat/buildwork  UP      mirror/bw1
                          mirror/bw0
                          mirror/bw2

(N.B. the components of buildwork are listed out of sequence here.  They
are configured as bw0, bw1, bw2.)

>have spaces in your GPT labels?
>

     The motherboard in the tower has six SATA ports.  Two are for
optical drives, and four are for HDDs/SSDs.  There is also an eSATA
controller that I used for one of the external drives for a while, but
something failed, and now I can't use the drive that way, so it is on a
USB 3.0 port.  The machine is very old and has no native USB 3.0 support,
but I added two PCIe cards for USB 3.0, one with four ports and one with
two ports.  The external drives are currently connected with two per
controller, and the four-port card also has a seven-port USB 3.0 external
hub plugged into it that rarely sees any use (mostly just flash drives).
     ada0 and ada1 are the much smaller boot drives and are not involved
in what happened.  ada2 and ada3 are the two drives internal to my ancient
tower that have components of the large raidz2, and da0 through da3
contain the rest of the six components.  The GPT label fields in the
"gpart show -l" output in my earlier message have no unprintable
characters in them, so they are exactly as shown.
     Now, on to my confession.  The problem was that I had reinserted the
wrong partition into bw0 due to a typo; i.e., I had typed da2p8 instead of
da2p5, so da2p8 was not available. :-(  (It would be nice if GEOM and ZFS
error messages were more intelligibly worded, but if wishes were
horses ...)  Once I saw what the problem was, it was trivially easy and
quick to fix.
     Again, thank you much for your reply.
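     [Editorial aside: for the archives, the corrective sequence implied
by the confession above would look roughly like the following sketch.
These are not the poster's exact commands; the syntax follows gmirror(8)
and zpool(8), with device and pool names taken from earlier in the
thread.]

```shell
# Sketch of the fix: back the mistyped partition out of the gmirror,
# insert the one actually intended, then retry the zpool attach.
gmirror remove bw0 da2p8           # remove da2p8, mistakenly inserted by typo
gmirror insert bw0 da2p5           # insert the intended partition (label bw0-0)
zpool attach zmisc ada2p11 da2p8   # da2p8 is now free, so the attach succeeds
```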
I wish I had gotten the trouble shot sooner (sleep can only be postponed
for so long) and posted a followup sooner (ditto) in order to have saved
you the bother, but it's nice to know that someone usually does try to
help when someone asks for help on these lists.


                                  Scott Bennett, Comm. ASMELG, CFIAG
**********************************************************************
* Internet:   bennett at sdf.org   *xor*   bennett at freeshell.org  *
*--------------------------------------------------------------------*
* "A well regulated and disciplined militia, is at all times a good  *
* objection to the introduction of that bane of all free governments *
* -- a standing army."                                               *
*    -- Gov. John Hancock, New York Journal, 28 January 1790         *
**********************************************************************