From owner-freebsd-geom@FreeBSD.ORG Thu Dec 18 11:40:54 2008
Date: Thu, 18 Dec 2008 12:20:26 +0100
From: "Dimitri Aivaliotis" <aglarond@gmail.com>
To: freebsd-geom@freebsd.org
Subject: gvinum raid10 stale

Hi,

I created a raid10 using gvinum with the following config:

drive a device /dev/da2
drive b device /dev/da3
volume raid10
  plex org striped 512k
    sd length 4374m drive a
    sd length 4374m drive b
    sd length 4374m drive a
    sd length 4374m drive b
    sd length 4374m drive a
    sd length 4374m drive b
    sd length 4374m drive a
    sd length 4374m drive b
    sd length 4374m drive a
    sd length 4374m drive b
    sd length 4374m drive a
    sd length 4374m drive b
    sd length 4374m drive a
    sd length 4374m drive b
    sd length 4374m drive a
    sd length 4374m drive b
    sd length 4374m drive a
    sd length 4374m drive b
    sd length 4374m drive a
    sd length 4374m drive b
    sd length 4374m drive a
    sd length 4374m drive b
    sd length 4374m drive a
    sd length 4374m drive b
    sd length 4374m drive a
    sd length 4374m drive b
    sd length 4374m drive a
    sd length 4374m drive b
    sd length 4374m drive a
    sd length 4374m drive b
    sd length 4374m drive a
    sd length 4374m drive b
  plex org striped 512k
    sd length 4374m drive b
    sd length 4374m drive a
    sd length 4374m drive b
    sd length 4374m drive a
    sd length 4374m drive b
    sd length 4374m drive a
    sd length 4374m drive b
    sd length 4374m drive a
    sd length 4374m drive b
    sd length 4374m drive a
    sd length 4374m drive b
    sd length 4374m drive a
    sd length 4374m drive b
    sd length 4374m drive a
    sd length 4374m drive b
    sd length 4374m drive a
    sd length 4374m drive b
    sd length 4374m drive a
    sd length 4374m drive b
    sd length 4374m drive a
    sd length 4374m drive b
    sd length 4374m drive a
    sd length 4374m drive b
    sd length 4374m drive a
    sd length 4374m drive b
    sd length 4374m drive a
    sd length 4374m drive b
    sd length 4374m drive a
    sd length 4374m drive b
    sd length 4374m drive a
    sd length 4374m drive b
    sd length 4374m drive a
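For reference, a description file like the one above is normally loaded with gvinum's create subcommand, roughly as follows (the filename here is only illustrative):

  # create the drives, volume, plexes and subdisks described in the file
  gvinum create raid10.conf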
I wanted to add two additional disks to this raid10, so I shut down the server, inserted the disks, and brought it back up. When the system booted, it reported the filesystem as needing a check. Doing a 'gvinum list', I saw that all subdisks were stale, so both plexes were down. After rebooting again (to remove the additional disks), the problem persisted. My assumption that the new disks had caused the old subdisks to go stale turned out to be wrong, as I later noticed that a different server with the same config also has a plex down because all subdisks on that plex are stale. The servers are running 6.3-RELEASE-p1 and 6.2-RELEASE-p9, respectively.

(I wound up running 'gvinum setstate -f up' on each of the 32 raid10.p1 subdisks to bring one plex back up on the server that had both plexes down.)
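For reference, those 32 commands amount to something like the following sh loop (just a sketch of the equivalent commands, not necessarily how I ran them):

  # force every subdisk of plex raid10.p1 back to the "up" state
  i=0
  while [ $i -lt 32 ]; do
      gvinum setstate -f up raid10.p1.s$i
      i=$((i + 1))
  done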
My questions:

- Why would these subdisks be set stale?
- How can I recover the other plex, so that the data continues to be striped and mirrored correctly?
- How can I extend this raid10 by adding two additional disks?

These servers are both in production, so I unfortunately can't do things like move the data off, re-create the RAID, and move the data back. Any help, tips, or advice would be greatly appreciated.

Below are the dmesg messages from server1, as well as 'gvinum list' output for both servers.

- Dimitri

server1 (6.3-RELEASE-p1)
=======================

dmesg | grep -i geom

GEOM_VINUM: subdisk raid10.p0.s1 state change: down -> stale
GEOM_VINUM: subdisk raid10.p0.s3 state change: down -> stale
GEOM_VINUM: subdisk raid10.p0.s5 state change: down -> stale
GEOM_VINUM: subdisk raid10.p0.s7 state change: down -> stale
GEOM_VINUM: subdisk raid10.p0.s9 state change: down -> stale
GEOM_VINUM: subdisk raid10.p0.s11 state change: down -> stale
GEOM_VINUM: subdisk raid10.p0.s13 state change: down -> stale
GEOM_VINUM: subdisk raid10.p0.s15 state change: down -> stale
GEOM_VINUM: subdisk raid10.p0.s17 state change: down -> stale
GEOM_VINUM: subdisk raid10.p0.s19 state change: down -> stale
GEOM_VINUM: subdisk raid10.p0.s21 state change: down -> stale
GEOM_VINUM: subdisk raid10.p0.s23 state change: down -> stale
GEOM_VINUM: subdisk raid10.p0.s25 state change: down -> stale
GEOM_VINUM: subdisk raid10.p0.s27 state change: down -> stale
GEOM_VINUM: subdisk raid10.p0.s29 state change: down -> stale
GEOM_VINUM: subdisk raid10.p0.s31 state change: down -> stale
GEOM_VINUM: subdisk raid10.p0.s0 state change: down -> stale
GEOM_VINUM: subdisk raid10.p0.s2 state change: down -> stale
GEOM_VINUM: subdisk raid10.p0.s4 state change: down -> stale
GEOM_VINUM: subdisk raid10.p0.s6 state change: down -> stale
GEOM_VINUM: subdisk raid10.p0.s8 state change: down -> stale
GEOM_VINUM: subdisk raid10.p0.s10 state change: down -> stale
GEOM_VINUM: subdisk raid10.p0.s12 state change: down -> stale
GEOM_VINUM: subdisk raid10.p0.s14 state change: down -> stale
GEOM_VINUM: subdisk raid10.p0.s16 state change: down -> stale
GEOM_VINUM: subdisk raid10.p0.s18 state change: down -> stale
GEOM_VINUM: subdisk raid10.p0.s20 state change: down -> stale
GEOM_VINUM: subdisk raid10.p0.s22 state change: down -> stale
GEOM_VINUM: subdisk raid10.p0.s24 state change: down -> stale
GEOM_VINUM: subdisk raid10.p0.s26 state change: down -> stale
GEOM_VINUM: subdisk raid10.p0.s28 state change: down -> stale
GEOM_VINUM: subdisk raid10.p0.s30 state change: down -> stale

(So, I figured it would be best to bring plex 1 up, and executed the above-mentioned setstate commands.)

GEOM_VINUM: plex raid10.p1 state change: down -> up

gvinum list
-------------

2 drives:
D a  State: up  /dev/da2  A: 10/139978 MB (0%)
D b  State: up  /dev/da3  A: 10/139978 MB (0%)

1 volume:
V raid10  State: up  Plexes: 2  Size: 136 GB

2 plexes:
P raid10.p0  S  State: down  Subdisks: 32  Size: 136 GB
P raid10.p1  S  State: up  Subdisks: 32  Size: 136 GB

64 subdisks:
S raid10.p0.s0  State: stale  D: a  Size: 4374 MB
S raid10.p0.s1  State: stale  D: b  Size: 4374 MB
S raid10.p0.s2  State: stale  D: a  Size: 4374 MB
S raid10.p0.s3  State: stale  D: b  Size: 4374 MB
S raid10.p0.s4  State: stale  D: a  Size: 4374 MB
S raid10.p0.s5  State: stale  D: b  Size: 4374 MB
S raid10.p0.s6  State: stale  D: a  Size: 4374 MB
S raid10.p0.s7  State: stale  D: b  Size: 4374 MB
S raid10.p0.s8  State: stale  D: a  Size: 4374 MB
S raid10.p0.s9  State: stale  D: b  Size: 4374 MB
S raid10.p0.s10  State: stale  D: a  Size: 4374 MB
S raid10.p0.s11  State: stale  D: b  Size: 4374 MB
S raid10.p0.s12  State: stale  D: a  Size: 4374 MB
S raid10.p0.s13  State: stale  D: b  Size: 4374 MB
S raid10.p0.s14  State: stale  D: a  Size: 4374 MB
S raid10.p0.s15  State: stale  D: b  Size: 4374 MB
S raid10.p0.s16  State: stale  D: a  Size: 4374 MB
S raid10.p0.s17  State: stale  D: b  Size: 4374 MB
S raid10.p0.s18  State: stale  D: a  Size: 4374 MB
S raid10.p0.s19  State: stale  D: b  Size: 4374 MB
S raid10.p0.s20  State: stale  D: a  Size: 4374 MB
S raid10.p0.s21  State: stale  D: b  Size: 4374 MB
S raid10.p0.s22  State: stale  D: a  Size: 4374 MB
S raid10.p0.s23  State: stale  D: b  Size: 4374 MB
S raid10.p0.s24  State: stale  D: a  Size: 4374 MB
S raid10.p0.s25  State: stale  D: b  Size: 4374 MB
S raid10.p0.s26  State: stale  D: a  Size: 4374 MB
S raid10.p0.s27  State: stale  D: b  Size: 4374 MB
S raid10.p0.s28  State: stale  D: a  Size: 4374 MB
S raid10.p0.s29  State: stale  D: b  Size: 4374 MB
S raid10.p0.s30  State: stale  D: a  Size: 4374 MB
S raid10.p0.s31  State: stale  D: b  Size: 4374 MB
S raid10.p1.s0  State: up  D: b  Size: 4374 MB
S raid10.p1.s1  State: up  D: a  Size: 4374 MB
S raid10.p1.s2  State: up  D: b  Size: 4374 MB
S raid10.p1.s3  State: up  D: a  Size: 4374 MB
S raid10.p1.s4  State: up  D: b  Size: 4374 MB
S raid10.p1.s5  State: up  D: a  Size: 4374 MB
S raid10.p1.s6  State: up  D: b  Size: 4374 MB
S raid10.p1.s7  State: up  D: a  Size: 4374 MB
S raid10.p1.s8  State: up  D: b  Size: 4374 MB
S raid10.p1.s9  State: up  D: a  Size: 4374 MB
S raid10.p1.s10  State: up  D: b  Size: 4374 MB
S raid10.p1.s11  State: up  D: a  Size: 4374 MB
S raid10.p1.s12  State: up  D: b  Size: 4374 MB
S raid10.p1.s13  State: up  D: a  Size: 4374 MB
S raid10.p1.s14  State: up  D: b  Size: 4374 MB
S raid10.p1.s15  State: up  D: a  Size: 4374 MB
S raid10.p1.s16  State: up  D: b  Size: 4374 MB
S raid10.p1.s17  State: up  D: a  Size: 4374 MB
S raid10.p1.s18  State: up  D: b  Size: 4374 MB
S raid10.p1.s19  State: up  D: a  Size: 4374 MB
S raid10.p1.s20  State: up  D: b  Size: 4374 MB
S raid10.p1.s21  State: up  D: a  Size: 4374 MB
S raid10.p1.s22  State: up  D: b  Size: 4374 MB
S raid10.p1.s23  State: up  D: a  Size: 4374 MB
S raid10.p1.s24  State: up  D: b  Size: 4374 MB
S raid10.p1.s25  State: up  D: a  Size: 4374 MB
S raid10.p1.s26  State: up  D: b  Size: 4374 MB
S raid10.p1.s27  State: up  D: a  Size: 4374 MB
S raid10.p1.s28  State: up  D: b  Size: 4374 MB
S raid10.p1.s29  State: up  D: a  Size: 4374 MB
S raid10.p1.s30  State: up  D: b  Size: 4374 MB
S raid10.p1.s31  State: up  D: a  Size: 4374 MB
server2 (6.2-RELEASE-p9)
=======================

(no clues in the logs as to why the subdisks are stale)

gvinum list
-------------

2 drives:
D b  State: up  /dev/da3  A: 10/139978 MB (0%)
D a  State: up  /dev/da2  A: 10/139978 MB (0%)

1 volume:
V raid10  State: up  Plexes: 2  Size: 136 GB

2 plexes:
P raid10.p0  S  State: up  Subdisks: 32  Size: 136 GB
P raid10.p1  S  State: down  Subdisks: 32  Size: 136 GB

64 subdisks:
S raid10.p0.s0  State: up  D: a  Size: 4374 MB
S raid10.p0.s1  State: up  D: b  Size: 4374 MB
S raid10.p0.s2  State: up  D: a  Size: 4374 MB
S raid10.p0.s3  State: up  D: b  Size: 4374 MB
S raid10.p0.s4  State: up  D: a  Size: 4374 MB
S raid10.p0.s5  State: up  D: b  Size: 4374 MB
S raid10.p0.s6  State: up  D: a  Size: 4374 MB
S raid10.p0.s7  State: up  D: b  Size: 4374 MB
S raid10.p0.s8  State: up  D: a  Size: 4374 MB
S raid10.p0.s9  State: up  D: b  Size: 4374 MB
S raid10.p0.s10  State: up  D: a  Size: 4374 MB
S raid10.p0.s11  State: up  D: b  Size: 4374 MB
S raid10.p0.s12  State: up  D: a  Size: 4374 MB
S raid10.p0.s13  State: up  D: b  Size: 4374 MB
S raid10.p0.s14  State: up  D: a  Size: 4374 MB
S raid10.p0.s15  State: up  D: b  Size: 4374 MB
S raid10.p0.s16  State: up  D: a  Size: 4374 MB
S raid10.p0.s17  State: up  D: b  Size: 4374 MB
S raid10.p0.s18  State: up  D: a  Size: 4374 MB
S raid10.p0.s19  State: up  D: b  Size: 4374 MB
S raid10.p0.s20  State: up  D: a  Size: 4374 MB
S raid10.p0.s21  State: up  D: b  Size: 4374 MB
S raid10.p0.s22  State: up  D: a  Size: 4374 MB
S raid10.p0.s23  State: up  D: b  Size: 4374 MB
S raid10.p0.s24  State: up  D: a  Size: 4374 MB
S raid10.p0.s25  State: up  D: b  Size: 4374 MB
S raid10.p0.s26  State: up  D: a  Size: 4374 MB
S raid10.p0.s27  State: up  D: b  Size: 4374 MB
S raid10.p0.s28  State: up  D: a  Size: 4374 MB
S raid10.p0.s29  State: up  D: b  Size: 4374 MB
S raid10.p0.s30  State: up  D: a  Size: 4374 MB
S raid10.p0.s31  State: up  D: b  Size: 4374 MB
S raid10.p1.s0  State: stale  D: b  Size: 4374 MB
S raid10.p1.s1  State: stale  D: a  Size: 4374 MB
S raid10.p1.s2  State: stale  D: b  Size: 4374 MB
S raid10.p1.s3  State: stale  D: a  Size: 4374 MB
S raid10.p1.s4  State: stale  D: b  Size: 4374 MB
S raid10.p1.s5  State: stale  D: a  Size: 4374 MB
S raid10.p1.s6  State: stale  D: b  Size: 4374 MB
S raid10.p1.s7  State: stale  D: a  Size: 4374 MB
S raid10.p1.s8  State: stale  D: b  Size: 4374 MB
S raid10.p1.s9  State: stale  D: a  Size: 4374 MB
S raid10.p1.s10  State: stale  D: b  Size: 4374 MB
S raid10.p1.s11  State: stale  D: a  Size: 4374 MB
S raid10.p1.s12  State: stale  D: b  Size: 4374 MB
S raid10.p1.s13  State: stale  D: a  Size: 4374 MB
S raid10.p1.s14  State: stale  D: b  Size: 4374 MB
S raid10.p1.s15  State: stale  D: a  Size: 4374 MB
S raid10.p1.s16  State: stale  D: b  Size: 4374 MB
S raid10.p1.s17  State: stale  D: a  Size: 4374 MB
S raid10.p1.s18  State: stale  D: b  Size: 4374 MB
S raid10.p1.s19  State: stale  D: a  Size: 4374 MB
S raid10.p1.s20  State: stale  D: b  Size: 4374 MB
S raid10.p1.s21  State: stale  D: a  Size: 4374 MB
S raid10.p1.s22  State: stale  D: b  Size: 4374 MB
S raid10.p1.s23  State: stale  D: a  Size: 4374 MB
S raid10.p1.s24  State: stale  D: b  Size: 4374 MB
S raid10.p1.s25  State: stale  D: a  Size: 4374 MB
S raid10.p1.s26  State: stale  D: b  Size: 4374 MB
S raid10.p1.s27  State: stale  D: a  Size: 4374 MB
S raid10.p1.s28  State: stale  D: b  Size: 4374 MB
S raid10.p1.s29  State: stale  D: a  Size: 4374 MB
S raid10.p1.s30  State: stale  D: b  Size: 4374 MB
S raid10.p1.s31  State: stale  D: a  Size: 4374 MB