Date: Fri, 18 Apr 2008 10:40:04 -0700
From: Christopher Cowart <ccowart@rescomp.berkeley.edu>
To: Gary Newcombe <gary@pattersonsoftware.com>
Cc: freebsd-questions@freebsd.org
Subject: Re: gmirror disk fail questions...
Message-ID: <20080418174004.GE27135@hal.rescomp.berkeley.edu>
In-Reply-To: <20080418113305.53b72c64.gary@pattersonsoftware.com>
References: <20080418113305.53b72c64.gary@pattersonsoftware.com>

Gary Newcombe wrote:
[...]
> # gmirror status
>
> [mesh:/var/log]# gmirror status
> Name        Status    Components
> mirror/gm0  DEGRADED  ad4
>
>
> looking in /dev/ however, we have
>
> crw-r----- 1 root operator 0, 83 17 Apr 13:58 ad4
> crw-r----- 1 root operator 0, 91 17 Apr 13:58 ad4s1
> crw-r----- 1 root operator 0, 84 17 Apr 13:58 ad6
> crw-r----- 1 root operator 0, 92 17 Apr 13:58 ad6a
> crw-r----- 1 root operator 0, 99 17 Apr 13:58 ad6as1
> crw-r----- 1 root operator 0, 93 17 Apr 13:58 ad6b
> crw-r----- 1 root operator 0, 94 17 Apr 13:58 ad6c
> crw-r----- 1 root operator 0, 100 17 Apr 13:58 ad6cs1
> crw-r----- 1 root operator 0, 95 17 Apr 13:58 ad6d
> crw-r----- 1 root operator 0, 96 17 Apr 13:58 ad6e
> crw-r----- 1 root operator 0, 97 17 Apr 13:58 ad6f
> crw-r----- 1 root operator 0, 98 17 Apr 13:58 ad6s1
> crw-r----- 1 root operator 0, 101 17 Apr 13:58 ad6s1a
> crw-r----- 1 root operator 0, 102 17 Apr 13:58 ad6s1b
> crw-r----- 1 root operator 0, 103 17 Apr 13:58 ad6s1c
> crw-r----- 1 root operator 0, 104 17 Apr 13:58 ad6s1d
> crw-r----- 1 root operator 0, 105 17 Apr 13:58 ad6s1e
> crw-r----- 1 root operator 0, 106 17 Apr 13:58 ad6s1f
>
> I am guessing that a failing disk is responsible for the data
> corruption, but I have no errors in /var/log/messages or console.log.
> On every boot, the mirror is marked clean ad there's no warnings about
> a disk failing anywhere? Where should I be looking for or what should I
> be doing to get any warnings?
>
> Also, how-come if ad4 is the working disk, ad4's slices seem to be
> labelled as ad6. What's going on here? To me, ad6 appears to have
> correct labelling for the mirror from ad6s1a-f

I believe the kernel hides individual labels for a gmirror volume. The
labels on ad4 should be visible in /dev/mirror/.
Because gmirror really just mirrors the data block by block (with a
little bit of metadata at the very end of the drive), once the drive is
no longer a member of an array, the kernel treats it as an individual
drive and allows visibility of all the labels.

> How can I test for sure whether the disk is damaged or dying, or
> whether this is just a temporary glitch in the mirror? This is the
> first time I've had a gmirror raid give me problems.

The first time a drive gets kicked out, I typically try to re-insert it.
We have monitoring, so we receive notifications if it fails again. After
that, I get the vendor to replace it.

> Assuming ad6 has been deactivated/disconnected, I was thinking of
> trying:
>
> gmirror activate gm0 ad6
> gmirror rebuild gm0 ad6
>
> Is this safe?

You have to kick ad6 out and re-insert it:

# gmirror forget gm0
# gmirror insert gm0 /dev/ad6

After doing that, I would watch closely for a while in case your drive
is actually failing. I've written a small nagios check for gmirror; let
me know if you'd like me to send it (it could easily be adapted to a
cron job). You can also get `gmirror status' output in your dailies by
adding daily_status_gmirror_enable="YES" to /etc/periodic.conf.

But, given it's timing out on boot, I would personally bag the drive and
replace it. You'll still need to run the same two commands above.

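For anyone who wants the cron-job variant of that monitoring, here is a
minimal sketch. It is not the nagios check mentioned above; the function
name gmirror_problems and the exit code 2 are assumptions made for
illustration. It reads `gmirror status' output on stdin and reports any
mirror whose status is not COMPLETE, assuming the three-column layout
shown in the quoted output.

```shell
#!/bin/sh
# Hypothetical helper, not the nagios check from this thread.
# Reads `gmirror status` output on stdin, prints "name: status" for
# every mirror that is not COMPLETE, and returns 2 if any was found.
gmirror_problems() {
    awk '
        # Only lines whose first field starts with "mirror/" describe a
        # mirror; the header and extra component lines are skipped.
        $1 ~ /^mirror\// && $2 != "COMPLETE" { print $1 ": " $2; bad = 1 }
        END { exit (bad ? 2 : 0) }
    '
}

# Example use from a cron job (assumes local mail to root works):
#   gmirror status | gmirror_problems > /tmp/gm.$$ \
#       || mail -s "gmirror problem" root < /tmp/gm.$$
```

Parsing stdin rather than calling gmirror directly keeps the function
easy to try against saved output before wiring it into cron.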
-- 
Chris Cowart
Network Technical Lead
Network & Infrastructure Services, RSSP-IT
UC Berkeley