Date:      Mon, 21 May 2012 23:36:40 +0200
From:      Frank Bartels <freebsd@knarf.de>
To:        freebsd-fs@freebsd.org
Cc:        Martin Ranne <martin.ranne@kockumsonics.com>, 'Andriy Gapon' <avg@freebsd.org>
Subject:   Re: zpool import reboots computer
Message-ID:  <20120521213640.GC69425@server-king.de>
In-Reply-To: <20120518175225.GA4735@server-king.de>
References:  <39C592E81AEC0B418EAD826FC1BBB09B25253F@mailgate> <4F1878AC.6060704@FreeBSD.org> <39C592E81AEC0B418EAD826FC1BBB09B25284B@mailgate> <4F1AC995.7050506@FreeBSD.org> <39C592E81AEC0B418EAD826FC1BBB09B255E15@mailgate> <4F1D75CD.6050000@FreeBSD.org> <39C592E81AEC0B418EAD826FC1BBB09B25607F@mailgate> <4F1DC398.3050502@FreeBSD.org> <39C592E81AEC0B418EAD826FC1BBB09B25CF08@mailgate> <20120518175225.GA4735@server-king.de>

Hi freebsd-fs,

my zpool is still not importable. :(

I received some tips over the last few days, but none of them changed
the error messages shown in the second "screenshot" below. The zdb -lu
output is in sync for all disks.

I've replaced the boot mirror (FreeBSD 8.3-RELEASE-p1, gptzfsboot)
with a plain FreeBSD 9.0-STABLE (gptboot, single disk, / on ufs)
with the patched vdev_mirror.c and finally managed to get a crashdump
while trying to import the zpool zdata using "zpool import -f -R
/angus.zdata -d /dev/gpt zdata":

https://www.server-king.de/download/zfs-crash-debug/backtrace.zdata.crash.txt

The patch I've used is here:

https://www.server-king.de/download/zfs-crash-debug/vdev_mirror.c.patch.txt

Without the patch it crashes earlier, the same way as with plain RELENG_8_3.

Do you need any more information?

Thanks for any help,
Knarf

P.S.: The latest changes seen in
http://www.freebsd.org/cgi/cvsweb.cgi/src/cddl/contrib/opensolaris/lib/libzfs/common/libzfs_dataset.c
(SVN rev 235702) do not seem to be related to my problem, but
I'll do a make world right now anyway.

On Fri, May 18, 2012 at 19:52:25 +0200, Frank Bartels wrote:
> Hi freebsd-fs,
> 
> I have a problem similar to Martin's.
> 
> It started a while ago with a broken zfs. I was no longer able to
> delete some files on /home/ncvs:
> 
> Checking setuid files and devices:
> find: /home/ncvs/del/efax/Attic/pkg-comment,v: Bad file descriptor
> find: /home/ncvs/del/libsyncml/files: No such file or directory
> 
> Two days ago the machine started rebooting every two hours, directly
> after syncing my local cvsup-server.
> 
> So I renamed the zfs /home/ncvs to /home/ncvs.del and tried to
> destroy it including its snapshots. The machine crashed again and
> now I'm unable to import the pool.
> 
> At first I saw this backtrace:
> 
> https://www.server-king.de/download/DSC02742.medium.JPG
> 
> Then I added Martin's three blocks (quoted below) to vdev_mirror.c. It
> still crashes, but the backtrace has changed:
> 
> https://www.server-king.de/download/DSC02744.medium.JPG
> 
> ...
> calltrap
> zio_checksum_verify
> zio_execute
> arc_read_nolock
> arc_read
> ...
> 
> This is FreeBSD 8.3-RELEASE-p1 amd64 on a Xeon X5650 with 24 GByte
> RAM and 12 hard disks and 2 SSDs.
> 
> This is what I see with zpool import -d /dev/gpt
> 
>    pool: zdata
>      id: 18141461787395278116
>   state: ONLINE
>  action: The pool can be imported using its name or numeric identifier.
>  config:
> 
>         zdata               ONLINE
>           raidz2-0          ONLINE
>             gpt/zdata1.eli  ONLINE
>             gpt/zdata0.eli  ONLINE
>             gpt/zdata2.eli  ONLINE
>             gpt/zdata3.eli  ONLINE
>             gpt/zdata5.eli  ONLINE
>             gpt/zdata4.eli  ONLINE
>             gpt/zdata6.eli  ONLINE
>             gpt/zdata8.eli  ONLINE
>             gpt/zdata9.eli  ONLINE
>         cache
>           gpt/zcache0.eli
>           gpt/zcache1.eli
>         spares
>           gpt/zdata7.eli
>         logs
>           mirror-1          ONLINE
>             gpt/zlog0.eli   ONLINE
>             gpt/zlog1.eli   ONLINE
> 
> I have no idea why I don't see zcaches and zdata7 as ONLINE.
> 
> If I use zpool import (without -d) I see dsk/gpt instead of gpt/
> on these three disks:
> 
>         cache
>           dsk/gpt/zcache0.eli
>           dsk/gpt/zcache1.eli
>         spares
>           dsk/gpt/zdata7.eli
> 
> Do you have any idea what I can do? I've tried 9.0-RELEASE (LiveCD)
> without success. Do you think using 8.3-STABLE or 9.0-STABLE could
> cure my problem?
> 
> Thanks,
> Knarf
> 
> On Wed, Jan 25, 2012 at 16:10:19 +0000, Martin Ranne wrote:
> > Thank you to everyone who has helped me with hacking zfs. We have now been able to import the pool and have transferred all the data to another computer. The next step is to see if we can quickly repair the pool or just delete it and create it again.
> >
> > We hacked the functions vdev_mirror_child_select() and vdev_mirror_io_start(). In vdev_mirror_io_start() we added the code below just after the mc pointer was set in both loops.
> >
> > if (mc->mc_vd == NULL) {
> >     (void) printf("mc->mc_vd is NULL. Child %i\n", c);
> >     continue;
> > }
> >
> > In vdev_mirror_child_select(), we added the code below just after the mc pointer was set.
> >
> > if (mc->mc_vd == NULL) {
> >     (void) printf("mc->mc_vd is NULL. Child %i\n", c);
> >     mc->mc_tried = 1;
> >     mc->mc_skipped = 1;
> >     continue;
> > }
> >
> >
> > Best regards,
> >
> > Martin Ranne
> >
> > >On 2012-01-23 21:31, Andriy Gapon wrote:
> > >>on 23/01/2012 20:33 Martin Ranne said the following:
> > >>Have done some checking and found mc->mc_vd == NULL in the function vdev_mirror_io_start() where the while-loop is.
> > >>
> > >>while (children--) {
> > >>    mc = &mm->mm_child[c];
> > >>    zio_nowait(zio_vdev_child_io(zio, zio->io_bp,
> > >>        mc->mc_vd, mc->mc_offset, zio->io_data, zio->io_size,
> > >>        zio->io_type, zio->io_priority, 0,
> > >>        vdev_mirror_child_done, mc));
> > >>    c++;
> > >>}
> > >>
> > >>If I set a break before it runs zio_nowait() it will still crash the kernel.
> > >>What can I check next for it to be able to continue? Is it possible to have it ignore the child where mc_vd is NULL? I am also looking into what more I can do to debug it (adding code to print to the console, as I cannot use kernel dumps).
> > >>
> > >Not sure.  If by "set a break" you mean inserting a break statement, try
> > >continue instead.
> > >
> >
> > _______________________________________________
> > freebsd-fs@freebsd.org mailing list
> > http://lists.freebsd.org/mailman/listinfo/freebsd-fs
> > To unsubscribe, send any mail to "freebsd-fs-unsubscribe@freebsd.org"
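
For readers following the thread: the guard Martin describes can be
tried outside the kernel. The following is only a minimal userland
sketch, not the actual vdev_mirror.c patch. The struct child type and
the functions issue_children() and demo_issue_children() are
hypothetical stand-ins; only the field roles (mc_vd, mc_tried,
mc_skipped) and the shape of the while loop come from the messages
quoted above. It combines the flag updates from the
vdev_mirror_child_select() block with the loop from
vdev_mirror_io_start() purely for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical stand-in for the kernel's per-child mirror state. */
struct child {
    const char *vd;  /* plays the role of mc_vd; NULL means no vdev */
    int tried;       /* plays the role of mc_tried */
    int skipped;     /* plays the role of mc_skipped */
};

/*
 * Walk `children` entries starting at index c, in the same shape as the
 * while loop quoted in the thread, but skip entries that have no vdev.
 * Returns the number of children that would have received I/O.
 */
static int
issue_children(struct child *kids, int c, int children)
{
    int issued = 0;

    assert(kids != NULL);
    while (children--) {
        struct child *mc = &kids[c];

        c++;  /* advance first, so that `continue` cannot skip it */
        if (mc->vd == NULL) {
            printf("mc_vd is NULL. Child %d\n", c - 1);
            mc->tried = 1;
            mc->skipped = 1;
            continue;
        }
        /* The kernel would call zio_nowait(zio_vdev_child_io(...)) here. */
        issued++;
    }
    return issued;
}

/* Small demo: three children, the middle one has no vdev. */
static int
demo_issue_children(void)
{
    struct child kids[3] = {
        { "disk0", 0, 0 }, { NULL, 0, 0 }, { "disk2", 0, 0 }
    };

    return issue_children(kids, 0, 3);
}
```

In the loop quoted in this thread, c++ sits at the end of the body, so a
bare continue placed before it would leave c unchanged for the next
iteration. The sketch increments c first for that reason.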
