Date: Mon, 16 Dec 2013 01:27:03 +0100
From: krichy@cflinux.hu
To: delphij@delphij.net
Cc: freebsd-fs@freebsd.org
Subject: Re: Fwd: Re: Re: zfs deadlock
Message-ID: <e70edbfaf1cc75a60aa653a937f28ba6@cflinux.hu>
In-Reply-To: <cf779e1797622469b5860ed8643a7357@cflinux.hu>
References: <04fac9b4a2352d97a23470c9da5db029@cflinux.hu> <cf779e1797622469b5860ed8643a7357@cflinux.hu>
--=_2592f2ae183913a1079652aa01cc934b
Content-Transfer-Encoding: 8bit
Content-Type: text/plain; charset=UTF-8; format=flowed

Dear devs,

I seem to have managed to fix my issue; please review the attached patch.
It makes three changes:

First, traverse() was changed to conform to the lock order described in
kern/vfs_subr.c, taking the vnode lock before vfs_busy().

Second, traverse() now returns a locked vnode on success, even when no
filesystem is mounted over the given vnode.

Third, a deadlock race between zfsctl_snapdir_lookup() and
zfsctl_snapshot_inactive() is handled. This part needs the most review,
as it may be buggy or introduce new bugs.

The patch applies to stable/10 right now. I am waiting for feedback.

Regards,

On 2013-12-11 11:43, krichy@cflinux.hu wrote:
> Dear devs,
>
> I still have had no success fixing these bugs; please help somehow. I
> currently don't understand the recursive lock problem or how it should
> be avoided.
>
> Thanks in advance,
>
> On 2013-12-07 15:42, krichy@cflinux.hu wrote:
>> Dear Xin,
>>
>> I don't know whether you read the -fs list, but there is a possible
>> bug in ZFS snapshot handling. Unfortunately I cannot fix the
>> problem, but at least I can reproduce it.
>> Please have a look at it; if I can help resolve it, I will.
>>
>> Regards,
>>
>> -------- Original message --------
>> Subject: Re: Re: zfs deadlock
>> Date: 2013-12-07 14:38
>> From: krichy@cflinux.hu
>> To: Steven Hartland <killing@multiplay.co.uk>
>> Cc: freebsd-fs@freebsd.org
>>
>> Dear Steven,
>>
>> A crash is very easily reproducible with the attached script: just
>> make an empty dataset, make a snapshot of it,
>> and run the script.
>> In my virtual machine it crashed in a few seconds, producing the
>> attached output.
>>
>> Regards,
>> On 2013-12-06 17:28, krichy@cflinux.hu wrote:
>>> Dear Steven,
>>>
>>> Using the previously provided scripts, the bug still appears. I
>>> got the attached traces when the deadlock occurred.
>>>
>>> It seems that one process is in zfs_mount(), while the other is in
>>> zfs_unmount_snap(). Look for the 'zfs' and 'ls' commands.
>>>
>>> Hope it helps.
>>>
>>> Regards,
>>> On 2013-12-06 16:59, krichy@cflinux.hu wrote:
>>>> So maybe the force flag is too strict. Under Linux the snapshots
>>>> remain mounted after a send.
>>>>
>>>> On 2013-12-06 16:54, krichy@cflinux.hu wrote:
>>>>> Dear Steven,
>>>>>
>>>>> Of course. But I have got further now. You mentioned that it is
>>>>> normal for zfs send to unmount snapshots. I don't know, but this
>>>>> indeed causes a problem:
>>>>>
>>>>> It is also reproducible without zfs send.
>>>>> 1. Have a large directory structure (just to make sure find runs
>>>>> long enough), and make a snapshot of it.
>>>>> # cd /mnt/pool/set/.zfs/snapshot/snap
>>>>> # find .
>>>>>
>>>>> Meanwhile, on another console,
>>>>> # umount -f /mnt/pool/set/.zfs/snapshot/snap
>>>>>
>>>>> will cause a panic, or such.
>>>>>
>>>>> So effectively a regular user on a system can cause a crash.
>>>>>
>>>>> Regards,
>>>>>
>>>>> On 2013-12-06 16:50, Steven Hartland wrote:
>>>>>> kernel compiled, installed and rebooted?
>>>>>> ----- Original Message ----- From: <krichy@cflinux.hu>
>>>>>> To: <smh@FreeBSD.org>
>>>>>> Sent: Friday, December 06, 2013 12:17 PM
>>>>>> Subject: Fwd: Re: zfs deadlock
>>>>>>
>>>>>>
>>>>>>> Dear smh,
>>>>>>>
>>>>>>> I've applied r258294 on top of releng/9.2, but my test seems to
>>>>>>> trigger the deadlock again.
>>>>>>>
>>>>>>> Regards,
>>>>>>>
>>>>>>> -------- Original message --------
>>>>>>> Subject: Re: zfs deadlock
>>>>>>> Date: 2013-12-06 13:17
>>>>>>> From: krichy@cflinux.hu
>>>>>>> To: freebsd-fs@freebsd.org
>>>>>>>
>>>>>>> I've applied r258294 on top of releng/9.2, and running the
>>>>>>> attached scripts in parallel, the system got into a deadlock again.
>>>>>>>
>>>>>>> On 2013-12-06 11:35, Steven Hartland wrote:
>>>>>>>> That's correct, it unmounts the mounted snapshot.
>>>>>>>>
>>>>>>>> Regards
>>>>>>>> Steve
>>>>>>>>
>>>>>>>> ----- Original Message ----- From: <krichy@cflinux.hu>
>>>>>>>> To: "Steven Hartland" <killing@multiplay.co.uk>
>>>>>>>> Cc: <freebsd-fs@freebsd.org>
>>>>>>>> Sent: Friday, December 06, 2013 8:50 AM
>>>>>>>> Subject: Re: zfs deadlock
>>>>>>>>
>>>>>>>>
>>>>>>>>> What is also strange: when a zfs send finishes, the find
>>>>>>>>> command running in parallel reports errors:
>>>>>>>>>
>>>>>>>>> find: ./e/Chuje: No such file or directory
>>>>>>>>> find: ./e/singe: No such file or directory
>>>>>>>>> find: ./e/joree: No such file or directory
>>>>>>>>> find: ./e/fore: No such file or directory
>>>>>>>>> find: fts_read: No such file or directory
>>>>>>>>> Fri Dec 6 09:46:04 CET 2013 2
>>>>>>>>>
>>>>>>>>> It seems as if the filesystem got unmounted in the meantime,
>>>>>>>>> even though the script had changed its working directory to the
>>>>>>>>> snapshot dir.
>>>>>>>>>
>>>>>>>>> Regards,
>>>>>>>>>
>>>>>>>>> On 2013-12-06 09:03, krichy@cflinux.hu wrote:
>>>>>>>>>> Dear Steven,
>>>>>>>>>>
>>>>>>>>>> While I was playing with ZFS, trying to reproduce the previous
>>>>>>>>>> bug, I accidentally hit another one, which caused the trace I
>>>>>>>>>> attached.
>>>>>>>>>>
>>>>>>>>>> The snapshot contains directories two levels deep, which
>>>>>>>>>> contain files. It was meant to simulate a vmail setup, with a
>>>>>>>>>> domain/user hierarchy.
>>>>>>>>>>
>>>>>>>>>> I hope it is useful for someone.
>>>>>>>>>>
>>>>>>>>>> I used the attached two scripts to reproduce the ZFS bug.
>>>>>>>>>>
>>>>>>>>>> It definitely crashes the system; this is the 3rd time in the
>>>>>>>>>> last 10 minutes.
>>>>>>>>>>
>>>>>>>>>> Regards,
>>>>>>>>>> On 2013-12-05 20:26, krichy@cflinux.hu wrote:
>>>>>>>>>>> Dear Steven,
>>>>>>>>>>>
>>>>>>>>>>> Thanks for your reply. Do you know how to reproduce the bug?
>>>>>>>>>>> I ask because simply sending a snapshot which is mounted does
>>>>>>>>>>> not automatically trigger the deadlock.
>>>>>>>>>>> Are some special conditions needed?
>>>>>>>>>>> How can I verify that the patch fixes this?
>>>>>>>>>>>
>>>>>>>>>>> Regards,
>>>>>>>>>>> On 2013-12-05 19:39, Steven Hartland wrote:
>>>>>>>>>>>> Known issue you want:
>>>>>>>>>>>> http://svnweb.freebsd.org/changeset/base/258595
>>>>>>>>>>>>
>>>>>>>>>>>> Regards
>>>>>>>>>>>> Steve
>>>>>>>>>>>>
>>>>>>>>>>>> ----- Original Message ----- From: "Richard Kojedzinszky"
>>>>>>>>>>>> <krichy@cflinux.hu>
>>>>>>>>>>>> To: <freebsd-fs@freebsd.org>
>>>>>>>>>>>> Sent: Thursday, December 05, 2013 2:56 PM
>>>>>>>>>>>> Subject: zfs deadlock
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>>> Dear fs devs,
>>>>>>>>>>>>>
>>>>>>>>>>>>> We have a FreeNAS server, which is basically FreeBSD. I was
>>>>>>>>>>>>> trying to look at snapshots using ls .zfs/snapshot/.
>>>>>>>>>>>>>
>>>>>>>>>>>>> When I issued it, the system entered a deadlock. An NFSD and
>>>>>>>>>>>>> a zfs send were running when I issued the command.
>>>>>>>>>>>>>
>>>>>>>>>>>>> I attached command outputs captured while the system was in
>>>>>>>>>>>>> a deadlock state. I tried to issue
>>>>>>>>>>>>> # reboot -q
>>>>>>>>>>>>> But that did not restart the system. After a while (5-10
>>>>>>>>>>>>> minutes) the system rebooted; I don't know if the deadman
>>>>>>>>>>>>> caused that.
>>>>>>>>>>>>>
>>>>>>>>>>>>> Now the system is up and running.
>>>>>>>>>>>>>
>>>>>>>>>>>>> It is basically a FreeBSD 9.2 kernel.
>>>>>>>>>>>>>
>>>>>>>>>>>>> Does anyone have a clue?
>>>>>>>>>>>>>
>>>>>>>>>>>>> Kojedzinszky Richard
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> --------------------------------------------------------------------------------
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>>> _______________________________________________
>>>>>>>>>>>>> freebsd-fs@freebsd.org mailing list
>>>>>>>>>>>>> http://lists.freebsd.org/mailman/listinfo/freebsd-fs
>>>>>>>>>>>>> To unsubscribe, send any mail to
>>>>>>>>>>>>> "freebsd-fs-unsubscribe@freebsd.org"
>>>>>>>>>>>>
>>>>>>>>>>>> ================================================
>>>>>>>>>>>> This e.mail is private and confidential between Multiplay
>>>>>>>>>>>> (UK) Ltd.
>>>>>>>>>>>> and the person or entity to whom it is addressed. In the
>>>>>>>>>>>> event of
>>>>>>>>>>>> misdirection, the recipient is prohibited from using,
>>>>>>>>>>>> copying,
>>>>>>>>>>>> printing or otherwise disseminating it or any information
>>>>>>>>>>>> contained
>>>>>>>>>>>> in
>>>>>>>>>>>> it.
>>>>>>>>>>>>
>>>>>>>>>>>> In the event of misdirection, illegible or incomplete
>>>>>>>>>>>> transmission
>>>>>>>>>>>> please telephone +44 845 868 1337
>>>>>>>>>>>> or return the E.mail to postmaster@multiplay.co.uk.
>>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>
>>>>>>
>>>>>> ================================================
>>>>>> This e.mail is private and confidential between Multiplay (UK)
>>>>>> Ltd.
>>>>>> and the person or entity to whom it is addressed. In the event of >>>>>> misdirection, the recipient is prohibited from using, copying, >>>>>> printing or otherwise disseminating it or any information >>>>>> contained in >>>>>> it. >>>>>> >>>>>> In the event of misdirection, illegible or incomplete transmission >>>>>> please telephone +44 845 868 1337 >>>>>> or return the E.mail to postmaster@multiplay.co.uk. --=_2592f2ae183913a1079652aa01cc934b Content-Transfer-Encoding: base64 Content-Type: text/x-diff; name=zfs-deadlock-1.patch Content-Disposition: attachment; filename=zfs-deadlock-1.patch; size=3563 ZGlmZiAtLWdpdCBhL3N5cy9jZGRsL2NvbXBhdC9vcGVuc29sYXJpcy9rZXJuL29wZW5zb2xhcmlz X2xvb2t1cC5jIGIvc3lzL2NkZGwvY29tcGF0L29wZW5zb2xhcmlzL2tlcm4vb3BlbnNvbGFyaXNf bG9va3VwLmMKaW5kZXggOTQzODNkNi4uMjI1NTIxYSAxMDA2NDQKLS0tIGEvc3lzL2NkZGwvY29t cGF0L29wZW5zb2xhcmlzL2tlcm4vb3BlbnNvbGFyaXNfbG9va3VwLmMKKysrIGIvc3lzL2NkZGwv Y29tcGF0L29wZW5zb2xhcmlzL2tlcm4vb3BlbnNvbGFyaXNfbG9va3VwLmMKQEAgLTgxLDYgKzgx LDggQEAgdHJhdmVyc2Uodm5vZGVfdCAqKmN2cHAsIGludCBsa3R5cGUpCiAJICogcHJvZ3Jlc3Mg b24gdGhpcyB2bm9kZS4KIAkgKi8KIAorCXZuX2xvY2soY3ZwLCBsa3R5cGUpOworCiAJZm9yICg7 OykgewogCQkvKgogCQkgKiBSZWFjaGVkIHRoZSBlbmQgb2YgdGhlIG1vdW50IGNoYWluPwpAQCAt ODksMTMgKzkxLDcgQEAgdHJhdmVyc2Uodm5vZGVfdCAqKmN2cHAsIGludCBsa3R5cGUpCiAJCWlm ICh2ZnNwID09IE5VTEwpCiAJCQlicmVhazsKIAkJZXJyb3IgPSB2ZnNfYnVzeSh2ZnNwLCAwKTsK LQkJLyoKLQkJICogdHZwIGlzIE5VTEwgZm9yICpjdnBwIHZub2RlLCB3aGljaCB3ZSBjYW4ndCB1 bmxvY2suCi0JCSAqLwotCQlpZiAodHZwICE9IE5VTEwpCi0JCQl2cHV0KGN2cCk7Ci0JCWVsc2UK LQkJCXZyZWxlKGN2cCk7CisJCXZwdXQoY3ZwKTsKIAkJaWYgKGVycm9yKQogCQkJcmV0dXJuIChl cnJvcik7CiAKZGlmZiAtLWdpdCBhL3N5cy9jZGRsL2NvbnRyaWIvb3BlbnNvbGFyaXMvdXRzL2Nv bW1vbi9mcy9nZnMuYyBiL3N5cy9jZGRsL2NvbnRyaWIvb3BlbnNvbGFyaXMvdXRzL2NvbW1vbi9m cy9nZnMuYwppbmRleCA1OTk0NGExLi5jZTQzZmZmIDEwMDY0NAotLS0gYS9zeXMvY2RkbC9jb250 cmliL29wZW5zb2xhcmlzL3V0cy9jb21tb24vZnMvZ2ZzLmMKKysrIGIvc3lzL2NkZGwvY29udHJp 
Yi9vcGVuc29sYXJpcy91dHMvY29tbW9uL2ZzL2dmcy5jCkBAIC00NDgsNyArNDQ4LDcgQEAgZ2Zz X2xvb2t1cF9kb3Qodm5vZGVfdCAqKnZwcCwgdm5vZGVfdCAqZHZwLCB2bm9kZV90ICpwdnAsIGNv bnN0IGNoYXIgKm5tKQogCQkJVk5fSE9MRChwdnApOwogCQkJKnZwcCA9IHB2cDsKIAkJfQotCQl2 bl9sb2NrKCp2cHAsIExLX0VYQ0xVU0lWRSB8IExLX1JFVFJZKTsKKwkJdm5fbG9jaygqdnBwLCBM S19FWENMVVNJVkUgfCBMS19SRVRSWSB8IExLX0NBTlJFQ1VSU0UpOwogCQlyZXR1cm4gKDApOwog CX0KIApkaWZmIC0tZ2l0IGEvc3lzL2NkZGwvY29udHJpYi9vcGVuc29sYXJpcy91dHMvY29tbW9u L2ZzL3pmcy96ZnNfY3RsZGlyLmMgYi9zeXMvY2RkbC9jb250cmliL29wZW5zb2xhcmlzL3V0cy9j b21tb24vZnMvemZzL3pmc19jdGxkaXIuYwppbmRleCAyOGFiMWZhLi5iMzgyMGRjIDEwMDY0NAot LS0gYS9zeXMvY2RkbC9jb250cmliL29wZW5zb2xhcmlzL3V0cy9jb21tb24vZnMvemZzL3pmc19j dGxkaXIuYworKysgYi9zeXMvY2RkbC9jb250cmliL29wZW5zb2xhcmlzL3V0cy9jb21tb24vZnMv emZzL3pmc19jdGxkaXIuYwpAQCAtMTAxMiw3ICsxMDEyLDE1IEBAIHpmc2N0bF9zbmFwZGlyX2xv b2t1cChhcCkKIAkJCS8qCiAJCQkgKiBUaGUgc25hcHNob3Qgd2FzIHVubW91bnRlZCBiZWhpbmQg b3VyIGJhY2tzLAogCQkJICogdHJ5IHRvIHJlbW91bnQgaXQuCisJCQkgKiBDb25jdXJyZW50IHpm c2N0bF9zbmFwc2hvdF9pbmFjdGl2ZSgpIHdvdWxkIHJlbW92ZSBvdXIgZW50cnkKKwkJCSAqIHNv IGRvIHRoaXMgb3Vyc2VsdmVzLCBhbmQgbWFrZSBhIGZyZXNoIG5ldyBtb3VudC4KIAkJCSAqLwor CQkJYXZsX3JlbW92ZSgmc2RwLT5zZF9zbmFwcywgc2VwKTsKKwkJCWttZW1fZnJlZShzZXAtPnNl X25hbWUsIHN0cmxlbihzZXAtPnNlX25hbWUpICsgMSk7CisJCQlrbWVtX2ZyZWUoc2VwLCBzaXpl b2YgKHpmc19zbmFwZW50cnlfdCkpOworCQkJdnB1dCgqdnBwKTsKKwkJCS8qIGZpbmQgbmV3IHBs YWNlIGZvciBzZXAgZW50cnkgKi8KKwkJCWF2bF9maW5kKCZzZHAtPnNkX3NuYXBzLCAmc2VhcmNo LCAmd2hlcmUpOwogCQkJVkVSSUZZKHpmc2N0bF9zbmFwc2hvdF96bmFtZShkdnAsIG5tLCBNQVhO QU1FTEVOLCBzbmFwbmFtZSkgPT0gMCk7CiAJCQlnb3RvIGRvbW91bnQ7CiAJCX0gZWxzZSB7CkBA IC0xMDI4LDYgKzEwMzYsNyBAQCB6ZnNjdGxfc25hcGRpcl9sb29rdXAoYXApCiAJCXJldHVybiAo ZXJyKTsKIAl9CiAKK2RvbW91bnQ6CiAJLyoKIAkgKiBUaGUgcmVxdWVzdGVkIHNuYXBzaG90IGlz IG5vdCBjdXJyZW50bHkgbW91bnRlZCwgbG9vayBpdCB1cC4KIAkgKi8KQEAgLTEwNjgsNyArMTA3 Nyw2IEBAIHpmc2N0bF9zbmFwZGlyX2xvb2t1cChhcCkKIAlhdmxfaW5zZXJ0KCZzZHAtPnNkX3Nu 
YXBzLCBzZXAsIHdoZXJlKTsKIAogCWRtdV9vYmpzZXRfcmVsZShzbmFwLCBGVEFHKTsKLWRvbW91 bnQ6CiAJbW91bnRwb2ludF9sZW4gPSBzdHJsZW4oZHZwLT52X3Zmc3AtPm1udF9zdGF0LmZfbW50 b25uYW1lKSArCiAJICAgIHN0cmxlbigiLyIgWkZTX0NUTERJUl9OQU1FICIvc25hcHNob3QvIikg KyBzdHJsZW4obm0pICsgMTsKIAltb3VudHBvaW50ID0ga21lbV9hbGxvYyhtb3VudHBvaW50X2xl biwgS01fU0xFRVApOwpAQCAtMTQ2MywxMSArMTQ3MSwxOCBAQCB6ZnNjdGxfc25hcHNob3RfaW5h Y3RpdmUoYXApCiAJemZzX3NuYXBlbnRyeV90ICpzZXAsICpuZXh0OwogCWludCBsb2NrZWQ7CiAJ dm5vZGVfdCAqZHZwOworCWdmc19kaXJfdCAqZHA7CiAKLQlpZiAodnAtPnZfY291bnQgPiAwKQot CQlnb3RvIGVuZDsKLQotCVZFUklGWShnZnNfZGlyX2xvb2t1cCh2cCwgIi4uIiwgJmR2cCwgY3Is IDAsIE5VTEwsIE5VTEwpID09IDApOworCS8qIFRoaXMgaXMgZm9yIGFjY2Vzc2luZyB0aGUgcmVh bCBwYXJlbnQgZGlyZWN0bHksIHdpdGhvdXQgYSBwb3NzaWJsZSBkZWFkbG9jaworCSAqIHdpdGgg emZzY3RsX3NuYXBkaXJfbG9va3VwKCkuIFRoZSByZWxlYXNlIG9mIGxvY2sgb24gdnAgYW5kIGxv Y2sgb24gZHZwIHByb3ZpZGVzCisJICogdGhlIHNhbWUgbG9jayBvcmRlciBhcyBpbiB6ZnNjdGxf c25hcHNob3RfbG9va3VwKCkuCisJICovCisJZHAgPSB2cC0+dl9kYXRhOworCWR2cCA9IGRwLT5n ZnNkX2ZpbGUuZ2ZzX3BhcmVudDsKKwlWTl9IT0xEKGR2cCk7CisJVk9QX1VOTE9DSyh2cCwgMCk7 CisJdm5fbG9jayhkdnAsIExLX1NIQVJFRCB8IExLX1JFVFJZIHwgTEtfQ0FOUkVDVVJTRSk7CisJ dm5fbG9jayh2cCwgTEtfRVhDTFVTSVZFIHwgTEtfUkVUUlkpOwogCXNkcCA9IGR2cC0+dl9kYXRh OwogCVZPUF9VTkxPQ0soZHZwLCAwKTsKIApAQCAtMTQ5NCw3ICsxNTA5LDYgQEAgemZzY3RsX3Nu YXBzaG90X2luYWN0aXZlKGFwKQogCQltdXRleF9leGl0KCZzZHAtPnNkX2xvY2spOwogCVZOX1JF TEUoZHZwKTsKIAotZW5kOgogCS8qCiAJICogRGlzcG9zZSBvZiB0aGUgdm5vZGUgZm9yIHRoZSBz bmFwc2hvdCBtb3VudCBwb2ludC4KIAkgKiBUaGlzIGlzIHNhZmUgdG8gZG8gYmVjYXVzZSBvbmNl IHRoaXMgZW50cnkgaGFzIGJlZW4gcmVtb3ZlZAo= --=_2592f2ae183913a1079652aa01cc934b--
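
The lock-order idea behind the third change can be illustrated outside the kernel. As I read the attached patch, zfsctl_snapshot_inactive() releases the lock on vp, locks dvp, and only then relocks vp, so that it takes the locks in the same order as the lookup path. The following is a minimal userland sketch of that pattern with POSIX threads, not the patch itself. Everything in it is an assumption made for illustration: parent_lock and child_lock are hypothetical stand-ins for the dvp and vp vnode locks, and lookup_path(), inactive_path() and run_demo() are invented names, not kernel functions.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

/* Hypothetical stand-ins for the two vnode locks; this is not the kernel API. */
static pthread_mutex_t parent_lock = PTHREAD_MUTEX_INITIALIZER; /* plays "dvp" */
static pthread_mutex_t child_lock = PTHREAD_MUTEX_INITIALIZER;  /* plays "vp" */
static long completed; /* only modified while both locks are held */

/* Lookup-style path: always locks the parent first, then the child. */
static void *lookup_path(void *arg)
{
	long n = *(long *)arg;

	for (long i = 0; i < n; i++) {
		pthread_mutex_lock(&parent_lock);
		pthread_mutex_lock(&child_lock);
		completed++;
		pthread_mutex_unlock(&child_lock);
		pthread_mutex_unlock(&parent_lock);
	}
	return NULL;
}

/*
 * Inactive-style path: starts with only the child locked but needs the
 * parent too. Taking the parent while still holding the child would invert
 * the order used by lookup_path() and could deadlock, so the child is
 * dropped first and both locks are then taken in parent, child order.
 * Real code would have to revalidate any state read before the drop.
 */
static void *inactive_path(void *arg)
{
	long n = *(long *)arg;

	for (long i = 0; i < n; i++) {
		pthread_mutex_lock(&child_lock);   /* entry state: child held */
		pthread_mutex_unlock(&child_lock); /* drop the child ... */
		pthread_mutex_lock(&parent_lock);  /* ... take the parent first ... */
		pthread_mutex_lock(&child_lock);   /* ... then retake the child */
		completed++;
		pthread_mutex_unlock(&child_lock);
		pthread_mutex_unlock(&parent_lock);
	}
	return NULL;
}

/* Runs both paths concurrently; returns the number of completed sections. */
long run_demo(long iterations)
{
	pthread_t a, b;

	assert(iterations >= 0);
	completed = 0;
	if (pthread_create(&a, NULL, lookup_path, &iterations) != 0)
		return -1;
	if (pthread_create(&b, NULL, inactive_path, &iterations) != 0) {
		pthread_join(a, NULL);
		return -1;
	}
	pthread_join(a, NULL);
	pthread_join(b, NULL);
	return completed;
}
```

Calling run_demo(1000) from a small main() (built with cc -pthread) should return 2000, since each thread completes its critical section 1000 times. If inactive_path() instead kept child_lock held while acquiring parent_lock, the two threads could each hold one lock and wait for the other, which is the kind of ordering deadlock this thread is about.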