Date: Tue, 24 Apr 2018 13:27:28 +0000
From: bugzilla-noreply@freebsd.org
To: bugs@FreeBSD.org
Subject: [Bug 227740] concurrent zfs management operations may lead to a race/subsystem locking
Message-ID: <bug-227740-227@https.bugs.freebsd.org/bugzilla/>
https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=227740

            Bug ID: 227740
           Summary: concurrent zfs management operations may lead to a
                    race/subsystem locking
           Product: Base System
           Version: 11.1-STABLE
          Hardware: Any
                OS: Any
            Status: New
          Severity: Affects Only Me
          Priority: ---
         Component: kern
          Assignee: bugs@FreeBSD.org
          Reporter: emz@norma.perm.ru

Concurrent zfs management commands may lead to a race / subsystem lockup.
For instance, this is the current state, which has not changed for at least
30 minutes (the system got into it after issuing concurrent zfs commands):

===Cut===
[root@san1:~]# ps ax | grep zfs
    9  -  DL   7:41,34 [zfskern]
57922  -  Is   0:00,01 sshd: zfsreplica [priv] (sshd)
57924  -  I    0:00,00 sshd: zfsreplica@notty (sshd)
57925  -  Is   0:00,00 csh -c zfs list -t snapshot
57927  -  D    0:00,00 zfs list -t snapshot
58694  -  I    0:00,01 /usr/local/bin/sudo /sbin/zfs list -t all
58695  -  D    0:00,00 /sbin/zfs list -t all
59512  -  Is   0:00,02 sshd: zfsreplica [priv] (sshd)
59516  -  I    0:00,00 sshd: zfsreplica@notty (sshd)
59517  -  Is   0:00,00 csh -c zfs list -t snapshot
59520  -  D    0:00,00 zfs list -t snapshot
59552  -  I    0:00,01 /usr/local/bin/sudo /sbin/zfs list -t all
59553  -  D    0:00,00 /sbin/zfs list -t all
59554  -  I    0:00,01 /usr/local/bin/sudo /sbin/zfs list -t all
59555  -  D    0:00,00 /sbin/zfs list -t all
59556  -  I    0:00,01 /usr/local/bin/sudo /sbin/zfs list -t all
59557  -  D    0:00,00 /sbin/zfs list -t all
59558  -  I    0:00,01 /usr/local/bin/sudo /sbin/zfs list -t all
59559  -  D    0:00,00 /sbin/zfs list -t all
59560  -  I    0:00,01 /usr/local/bin/sudo /sbin/zfs list -t all
59561  -  D    0:00,00 /sbin/zfs list -t all
59564  -  I    0:00,01 /usr/local/bin/sudo /sbin/zfs list -t all
59565  -  D    0:00,00 /sbin/zfs list -t all
59570  -  I    0:00,01 /usr/local/bin/sudo /sbin/zfs list -t all
59571  -  D    0:00,00 /sbin/zfs list -t all
59572  -  I    0:00,01 /usr/local/bin/sudo /sbin/zfs list -t all
59573  -  D    0:00,00 /sbin/zfs list -t all
59574  -  I    0:00,01 /usr/local/bin/sudo /sbin/zfs list -t all
59575  -  D    0:00,00 /sbin/zfs list -t all
59878  -  Is   0:00,02 sshd: zfsreplica [priv] (sshd)
59880  -  I    0:00,00 sshd: zfsreplica@notty (sshd)
59881  -  Is   0:00,00 csh -c zfs list -t snapshot
59883  -  D    0:00,00 zfs list -t snapshot
60800  -  Is   0:00,01 sshd: zfsreplica [priv] (sshd)
60806  -  I    0:00,00 sshd: zfsreplica@notty (sshd)
60807  -  Is   0:00,00 csh -c zfs list -t snapshot
60809  -  D    0:00,00 zfs list -t snapshot
60917  -  I    0:00,01 /usr/local/bin/sudo /sbin/zfs list -t all
60918  -  D    0:00,00 /sbin/zfs list -t all
60950  -  I    0:00,01 /usr/local/bin/sudo /sbin/zfs list -t all
60951  -  D    0:00,00 /sbin/zfs list -t all
60966  -  Is   0:00,02 sshd: zfsreplica [priv] (sshd)
60968  -  I    0:00,00 sshd: zfsreplica@notty (sshd)
60969  -  Is   0:00,00 csh -c zfs list -t snapshot
60971  -  D    0:00,00 zfs list -t snapshot
61432  -  Is   0:00,03 sshd: zfsreplica [priv] (sshd)
61434  -  I    0:00,00 sshd: zfsreplica@notty (sshd)
61435  -  Is   0:00,00 csh -c zfs list -t snapshot
61437  -  D    0:00,00 zfs list -t snapshot
61502  -  I    0:00,01 /usr/local/bin/sudo /sbin/zfs list -t all
61503  -  D    0:00,00 /sbin/zfs list -t all
61504  -  I    0:00,01 /usr/local/bin/sudo /sbin/zfs list -t all
61505  -  D    0:00,00 /sbin/zfs list -t all
61506  -  I    0:00,01 /usr/local/bin/sudo /sbin/zfs list -t all
61507  -  D    0:00,00 /sbin/zfs list -t all
61508  -  I    0:00,01 /usr/local/bin/sudo /sbin/zfs list -t all
61509  -  D    0:00,00 /sbin/zfs list -t all
61510  -  I    0:00,01 /usr/local/bin/sudo /sbin/zfs list -t all
61511  -  D    0:00,00 /sbin/zfs list -t all
61512  -  I    0:00,01 /usr/local/bin/sudo /sbin/zfs list -t all
61513  -  D    0:00,00 /sbin/zfs list -t all
61569  -  I    0:00,01 /usr/local/bin/sudo /sbin/zfs list -t all
61570  -  D    0:00,00 /sbin/zfs list -t all
61851  -  Is   0:00,02 sshd: zfsreplica [priv] (sshd)
61853  -  I    0:00,00 sshd: zfsreplica@notty (sshd)
61854  -  Is   0:00,00 csh -c zfs list -t snapshot
61856  -  D    0:00,00 zfs list -t snapshot
57332  7  D+   0:00,04 zfs rename data/esx/boot-esx03 data/esx/boot-esx03_orig
58945  8  D+   0:00,00 zfs list
62119  3  S+   0:00,00 grep zfs
[root@san1:~]# ps ax | grep ctladm
62146  3  S+   0:00,00 grep ctladm
[root@san1:~]#
===Cut===

This seems to be the operation that locks the system:

zfs rename data/esx/boot-esx03 data/esx/boot-esx03_orig

The dataset info:

===Cut===
# zfs get all data/esx/boot-esx03
NAME                 PROPERTY              VALUE                  SOURCE
data/esx/boot-esx03  type                  volume                 -
data/esx/boot-esx03  creation              Wed Aug  2 15:48 2017  -
data/esx/boot-esx03  used                  8,25G                  -
data/esx/boot-esx03  available             9,53T                  -
data/esx/boot-esx03  referenced            555M                   -
data/esx/boot-esx03  compressratio         1.06x                  -
data/esx/boot-esx03  reservation           none                   default
data/esx/boot-esx03  volsize               8G                     local
data/esx/boot-esx03  volblocksize          8K                     default
data/esx/boot-esx03  checksum              on                     default
data/esx/boot-esx03  compression           lz4                    inherited from data
data/esx/boot-esx03  readonly              off                    default
data/esx/boot-esx03  copies                1                      default
data/esx/boot-esx03  refreservation        8,25G                  local
data/esx/boot-esx03  primarycache          all                    default
data/esx/boot-esx03  secondarycache        all                    default
data/esx/boot-esx03  usedbysnapshots       0                      -
data/esx/boot-esx03  usedbydataset         555M                   -
data/esx/boot-esx03  usedbychildren        0                      -
data/esx/boot-esx03  usedbyrefreservation  7,71G                  -
data/esx/boot-esx03  logbias               latency                default
data/esx/boot-esx03  dedup                 off                    inherited from data/esx
data/esx/boot-esx03  mlslabel                                     -
data/esx/boot-esx03  sync                  standard               default
data/esx/boot-esx03  refcompressratio      1.06x                  -
data/esx/boot-esx03  written               555M                   -
data/esx/boot-esx03  logicalused           586M                   -
data/esx/boot-esx03  logicalreferenced     586M                   -
data/esx/boot-esx03  volmode               dev                    inherited from data
data/esx/boot-esx03  snapshot_limit        none                   default
data/esx/boot-esx03  snapshot_count        none                   default
data/esx/boot-esx03  redundant_metadata    all                    default
===Cut===

Since the dataset is only 8G, it is unlikely that renaming it should take
that amount of time, considering the disks are idle.

I got this twice in a row, and as a result all zfs/zpool commands stopped
working. I manually panicked the system to obtain crashdumps. The crashdumps
are located here:

http://san1.linx.playkey.net/r332096M/

along with a brief description and the full kernel/module binaries. Please
note that vmcore.0 is from another panic; the crashdumps for this lockup are
1 (unfortunately, no txt files were saved) and 2.

-- 
You are receiving this mail because:
You are the assignee for the bug.
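For anyone trying to reproduce this: the workload visible in the ps output
above (many parallel read-only "zfs list" invocations racing a single
"zfs rename") can be sketched roughly as below. This is an untested sketch,
not the exact replication jobs from this machine; the dataset names are the
ones from this report, it needs root on a system with a ZFS pool, the reader
count of 20 is an arbitrary choice, and on an affected kernel it may wedge
the ZFS subsystem exactly as described.

```shell
#!/bin/sh
# Rough reproduction sketch for the lockup described above.
# Assumption: dataset data/esx/boot-esx03 exists, as in this report.
# On a system without zfs(8) the script just says so and exits cleanly.
if command -v zfs >/dev/null 2>&1; then
    # Spawn many concurrent readers, mimicking the sshd/sudo jobs
    # stuck in state D in the ps listing.
    i=1
    while [ "$i" -le 20 ]; do
        zfs list -t all >/dev/null 2>&1 &
        zfs list -t snapshot >/dev/null 2>&1 &
        i=$((i + 1))
    done
    # ...and race them against the rename that appeared stuck in D+.
    zfs rename data/esx/boot-esx03 data/esx/boot-esx03_orig
    wait
    echo "workload finished"
else
    echo "zfs not available; nothing to do"
fi
```

Whether 20 readers are enough to hit the window reliably is an open
question; on this machine the window was hit by periodic replication
jobs plus one interactive rename.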