Date:      Fri, 17 May 2019 03:02:39 +0200
From:      Peter <pmc@citylink.dinoex.sub.org>
To:        Miroslav Lachman <000.fbsd@quip.cz>
Cc:        freebsd-fs@freebsd.org
Subject:   Re: Waht is the minimum free space ... (Full Report)
Message-ID:  <20190517010239.GA34758@gate.oper.dinoex.org>
In-Reply-To: <60d57363-eb5c-e985-82ad-30f03b06a4c6@quip.cz>
References:  <20190515204243.GA67445@gate.oper.dinoex.org> <60d57363-eb5c-e985-82ad-30f03b06a4c6@quip.cz>
Alright, now I was able to reproduce the failure under test conditions.
In addition, I was able to produce two deadlocks, a bunch of GPT errors,
and a ZFS assertion failure. Here we go:

ABSTRACT
========

The original idea was to check whether ZFS can grow a raid5. (This is
technically easy, but not all volume managers are willing to do it, so
I decided to give it a try.) Therefore I created three partitions on
some spare disk space, with free-space interleave between them, then
created a ZFS raidz pool on them, enlarged the partitions, and tried
to autoexpand the ZFS pool. While the procedure appeared to work in
theory, I got strange results in practice, leading even to kernel
crashes.

I conducted five test cases, with protocols:

1. Try with a USB stick on a core-i consumer machine. The chosen
   procedure led into deadlock long before even expanding the pool.
2. Try with a spinning drive on the core-i. The chosen procedure was
   fully successful.
3. Same as 2., but without exporting/importing the pool before
   expansion. This procedure led into deadlock.
4. Try with a spinning drive on an ancient Pentium server machine,
   with the exact same procedure as before. The procedure did not
   lead to a kernel crash this time; instead it produced GPT errors
   and a ZFS assertion failure, which may provide further insight.
5. Try the same procedure as in 4., now on the core-i machine. The
   procedure led to an immediate reboot once, and to a "trap 12: page
   fault" the other times.

In both cases 1. and 3. it was no longer possible to execute "sync",
so death of the system would be just a matter of time.

See the protocols of the test cases below and some commentary at the
end. A condensed sketch of the successful sequence (testcase 2)
follows the summaries.

Summary of the procedures:
==========================

Testcase 1:
 * create partitions
 * create pool
 * export pool
   -> hangs

Testcase 2:
 * create partitions
 * create pool
 * export pool
 * import pool
 * enlarge partitions
 * export pool
 * import pool
 * set autoexpand=on
 * online -e
   -> success!

Testcase 3:
 * create partitions
 * create pool
 * enlarge partitions
 * set autoexpand=on
 * online -e
   -> hangs

Testcase 4:
 * create partitions
 * create pool
 * enlarge partitions
 * set autoexpand=on
 * export pool
   -> cannot import anymore
 * shrink back partitions
   -> ZFS assertion failed
 * destroy pool and partitions
   -> devices still present!

Testcase 5:
 * create partitions
 * create pool
 * enlarge partitions
 * set autoexpand=on
 * export pool
   -> cannot import anymore
 * shrink back partition -> crash!
 * shrink back partition -> crash!
 * shrink back partition -> crash!
 * import successful.
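For convenience, here is the successful sequence from testcase 2,
condensed into a single sh script. This is only a sketch of what
worked for me, not a recommendation: it assumes a scratch disk
($DISK) whose contents may be wiped, and given the deadlocks below
I would not run it on a machine with valuable pools imported.

#!/bin/sh -e
# Known-good sequence condensed from testcase 2 (sketch only).
# ASSUMPTION: $DISK is a scratch disk that may be wiped completely.
DISK=ada2

# Three 1G partitions, each followed by a 1G free-space gap.
# 1G = 2097152 sectors of 512 bytes; the GPT data area starts at
# sector 40, so the start offsets are 40, 40 + 2*2097152 = 4194344,
# and 40 + 4*2097152 = 8388648.
gpart create -s GPT $DISK
gpart add -t freebsd-zfs -s 1G $DISK
gpart add -t freebsd-zfs -s 1G -b 4194344 $DISK
gpart add -t freebsd-zfs -s 1G -b 8388648 $DISK

zpool create testraid raidz ${DISK}p1 ${DISK}p2 ${DISK}p3

# Testcase 2 also did an export+import here, before the resize; the
# tests did not isolate whether this first cycle is actually required.
zpool export testraid
zpool import testraid

# Grow each partition to 2G (4194304 sectors) into its gap.
gpart resize -s 4194304 -i 1 $DISK
gpart resize -s 4194304 -i 2 $DISK
gpart resize -s 4194304 -i 3 $DISK

# The step that seems to make the difference: export+import AFTER
# resizing and BEFORE growing the pool, so ZFS sees the new space.
zpool export testraid
zpool import testraid

zpool set autoexpand=on testraid
zpool online -e testraid ${DISK}p1   # pool grew from 2.75G to 5.75G here

zpool list testraid

# Cleanup.
zpool destroy testraid
gpart destroy -F $DISK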
--------------- TESTCASE 1 BEGIN ------------------------

Script started on Thu May 16 20:47:21 2019
root@disp:~ # camcontrol devlist
<KINGSTON SA400S37240G S1Z40102>    at scbus1 target 0 lun 0 (pass0,ada0)
<Hitachi HDS5C1050CLA382 JC2OA50E>  at scbus2 target 0 lun 0 (pass1,ada1)
<TSSTcorp CDDVDW SH-224DB SB01>     at scbus3 target 0 lun 0 (pass2,cd0)
<AHCI SGPIO Enclosure 1.00 0001>    at scbus4 target 0 lun 0 (pass3,ses0)
<General UDisk 5.00>                at scbus5 target 0 lun 0 (da0,pass4)
<Kingston DataTraveler 3.0 PMAP>    at scbus6 target 0 lun 0 (da1,pass5)
root@disp:~ # gpart create -s GPT da1
da1 created
root@disp:~ # gpart add -t freebsd-zfs -s 1G da1
da1p1 added
root@disp:~ # gpart add -t freebsd-zfs -s 1G -b 4194344 da1
da1p2 added
root@disp:~ # gpart add -t freebsd-zfs -s 1G -b 8388648 da1
da1p3 added
root@disp:~ # gpart show da1
=>      40  60555184  da1  GPT  (29G)
        40   2097152    1  freebsd-zfs  (1.0G)
   2097192   2097152       - free -  (1.0G)
   4194344   2097152    2  freebsd-zfs  (1.0G)
   6291496   2097152       - free -  (1.0G)
   8388648   2097152    3  freebsd-zfs  (1.0G)
  10485800  50069424       - free -  (24G)
root@disp:~ # zpool create -f testraid raidz da1p1 da1p2 da1p3
root@disp:~ # zpool list testraid
NAME       SIZE  ALLOC   FREE  CKPOINT  EXPANDSZ   FRAG    CAP  DEDUP  HEALTH  ALTROOT
testraid  2.75G   552K  2.75G        -         -     0%     0%  1.00x  ONLINE  -
root@disp:~ # zpool status testraid
  pool: testraid
 state: ONLINE
  scan: none requested
config:

        NAME        STATE     READ WRITE CKSUM
        testraid    ONLINE       0     0     0
          raidz1-0  ONLINE       0     0     0
            da1p1   ONLINE       0     0     0
            da1p2   ONLINE       0     0     0
            da1p3   ONLINE       0     0     0

errors: No known data errors
root@disp:~ # zpool export testraid

## At this point the command hung forever and we were dead in the water,
## no disk I/O happening. Output of ps shows "D"eadlock in "connec":
##    0  5640  4647   0  20  0  7760  3784 connec   D+   10   0:00.01 zpool
## No "df", no "sync", no "reboot".

--------------- TESTCASE 1 END ------------------------

--------------- TESTCASE 2 BEGIN ------------------------

Script started on Thu May 16 21:26:14 2019
root@disp:~ # camcontrol devlist
<KINGSTON SA400S37240G S1Z40102>    at scbus1 target 0 lun 0 (pass0,ada0)
<Hitachi HDS5C1050CLA382 JC2OA50E>  at scbus2 target 0 lun 0 (pass1,ada1)
<WDC WD5000AAKS-00A7B2 01.03B01>    at scbus3 target 0 lun 0 (pass2,ada2)
<TSSTcorp CDDVDW SH-224DB SB01>     at scbus4 target 0 lun 0 (pass3,cd0)
<AHCI SGPIO Enclosure 1.00 0001>    at scbus5 target 0 lun 0 (pass4,ses0)
<General UDisk 5.00>                at scbus6 target 0 lun 0 (da0,pass5)
root@disp:~ # gpart show ada2
gpart: No such geom: ada2.
root@disp:~ # gpart create -s GPT ada2
ada2 created
root@disp:~ # gpart add -t freebsd-zfs -s 1G ada2
ada2p1 added
root@disp:~ # gpart add -t freebsd-zfs -s 1G -b 4194344 ada2
ada2p2 added
root@disp:~ # gpart add -t freebsd-zfs -s 1G -b 8388648 ada2
ada2p3 added
root@disp:~ # gpart show ada2
=>       40  976773088  ada2  GPT  (466G)
         40    2097152     1  freebsd-zfs  (1.0G)
    2097192    2097152        - free -  (1.0G)
    4194344    2097152     2  freebsd-zfs  (1.0G)
    6291496    2097152        - free -  (1.0G)
    8388648    2097152     3  freebsd-zfs  (1.0G)
   10485800  966287328        - free -  (461G)
root@disp:~ # zpool create testraid raidz ada2p1 ada2p2 ada2p3
root@disp:~ # zpool list testraid
NAME       SIZE  ALLOC   FREE  CKPOINT  EXPANDSZ   FRAG    CAP  DEDUP  HEALTH  ALTROOT
testraid  2.75G   696K  2.75G        -         -     0%     0%  1.00x  ONLINE  -
root@disp:~ # zpool status testraid
  pool: testraid
 state: ONLINE
  scan: none requested
config:

        NAME        STATE     READ WRITE CKSUM
        testraid    ONLINE       0     0     0
          raidz1-0  ONLINE       0     0     0
            ada2p1  ONLINE       0     0     0
            ada2p2  ONLINE       0     0     0
            ada2p3  ONLINE       0     0     0

errors: No known data errors
root@disp:~ # zpool export testraid
root@disp:~ # zpool import
   pool: testraid
     id: 3885773658779285422
  state: ONLINE
 action: The pool can be imported using its name or numeric identifier.
 config:

        testraid    ONLINE
          raidz1-0  ONLINE
            ada2p1  ONLINE
            ada2p2  ONLINE
            ada2p3  ONLINE
root@disp:~ # zpool import testraid
root@disp:~ # zpool status testraid
  pool: testraid
 state: ONLINE
  scan: none requested
config:

        NAME        STATE     READ WRITE CKSUM
        testraid    ONLINE       0     0     0
          raidz1-0  ONLINE       0     0     0
            ada2p1  ONLINE       0     0     0
            ada2p2  ONLINE       0     0     0
            ada2p3  ONLINE       0     0     0

errors: No known data errors
root@disp:~ # gpart resize -s 4194304 -i 1 ada2
ada2p1 resized
root@disp:~ # gpart resize -s 4194304 -i 2 ada2
ada2p2 resized
root@disp:~ # gpart resize -s 4194304 -i 3 ada2
ada2p3 resized
root@disp:~ # gpart show ada2
=>       40  976773088  ada2  GPT  (466G)
         40    4194304     1  freebsd-zfs  (2.0G)
    4194344    4194304     2  freebsd-zfs  (2.0G)
    8388648    4194304     3  freebsd-zfs  (2.0G)
   12582952  964190176        - free -  (460G)
root@disp:~ # zpool status testraid
  pool: testraid
 state: ONLINE
  scan: none requested
config:

        NAME        STATE     READ WRITE CKSUM
        testraid    ONLINE       0     0     0
          raidz1-0  ONLINE       0     0     0
            ada2p1  ONLINE       0     0     0
            ada2p2  ONLINE       0     0     0
            ada2p3  ONLINE       0     0     0

errors: No known data errors
root@disp:~ # zpool export testraid
root@disp:~ # zpool import
   pool: testraid
     id: 3885773658779285422
  state: ONLINE
 action: The pool can be imported using its name or numeric identifier.
 config:

        testraid    ONLINE
          raidz1-0  ONLINE
            ada2p1  ONLINE
            ada2p2  ONLINE
            ada2p3  ONLINE
root@disp:~ # zpool import testraid
root@disp:~ # zpool list testraid
NAME       SIZE  ALLOC   FREE  CKPOINT  EXPANDSZ   FRAG    CAP  DEDUP  HEALTH  ALTROOT
testraid  2.75G  1.02M  2.75G        -        3G     0%     0%  1.00x  ONLINE  -
root@disp:~ # zpool set autoexpand=on testraid
root@disp:~ # zpool list testraid
NAME       SIZE  ALLOC   FREE  CKPOINT  EXPANDSZ   FRAG    CAP  DEDUP  HEALTH  ALTROOT
testraid  2.75G  1.32M  2.75G        -        3G     0%     0%  1.00x  ONLINE  -
root@disp:~ # zpool status testraid
  pool: testraid
 state: ONLINE
  scan: none requested
config:

        NAME        STATE     READ WRITE CKSUM
        testraid    ONLINE       0     0     0
          raidz1-0  ONLINE       0     0     0
            ada2p1  ONLINE       0     0     0
            ada2p2  ONLINE       0     0     0
            ada2p3  ONLINE       0     0     0

errors: No known data errors
root@disp:~ # zpool online -e testraid ada2p1
root@disp:~ # zpool status testraid
  pool: testraid
 state: ONLINE
  scan: none requested
config:

        NAME        STATE     READ WRITE CKSUM
        testraid    ONLINE       0     0     0
          raidz1-0  ONLINE       0     0     0
            ada2p1  ONLINE       0     0     0
            ada2p2  ONLINE       0     0     0
            ada2p3  ONLINE       0     0     0

errors: No known data errors
root@disp:~ # zpool list testraid
NAME       SIZE  ALLOC   FREE  CKPOINT  EXPANDSZ   FRAG    CAP  DEDUP  HEALTH  ALTROOT
testraid  5.75G  1.20M  5.75G        -         -     0%     0%  1.00x  ONLINE  -
root@disp:~ #
root@disp:~ # zpool destroy testraid
root@disp:~ # gpart destroy -F ada2
ada2 destroyed

--------------- TESTCASE 2 END ------------------------

--------------- TESTCASE 3 BEGIN ------------------------

## continuation of script from testcase 2 ##

root@disp:~ # gpart create -s GPT ada2
ada2 created
root@disp:~ # gpart add -t freebsd-zfs -s 1G ada2
ada2p1 added
root@disp:~ # gpart add -t freebsd-zfs -s 1G -b 4194344 ada2
ada2p2 added
root@disp:~ # gpart add -t freebsd-zfs -s 1G -b 8388648 ada2
ada2p3 added
root@disp:~ # zpool create testraid raidz ada2p1 ada2p2 ada2p3
root@disp:~ # zpool status testraid
  pool: testraid
 state: ONLINE
  scan: none requested
config:

        NAME        STATE     READ WRITE CKSUM
        testraid    ONLINE       0     0     0
          raidz1-0  ONLINE       0     0     0
            ada2p1  ONLINE       0     0     0
            ada2p2  ONLINE       0     0     0
            ada2p3  ONLINE       0     0     0

errors: No known data errors
root@disp:~ # gpart resize -s 4194304 -i 1 ada2
ada2p1 resized
root@disp:~ # gpart resize -s 4194304 -i 2 ada2
ada2p2 resized
root@disp:~ # gpart resize -s 4194304 -i 3 ada2
ada2p3 resized
root@disp:~ # zpool set autoexpand=yes testraid
root@disp:~ # zpool online -e testraid ada2p1

## At this point the command hung forever and we were dead in the water,
## no disk I/O happening. Output of ps shows "D"eadlock in "tx->tx_s":
##    0  4433  1224   0  20  0  7760  3764 tx->tx_s   D+    9   0:00.01 zpool
## No "df", no "sync", no "reboot".

--------------- TESTCASE 3 END ------------------------

--------------- TESTCASE 4 BEGIN ------------------------

## DON'T DARE TO COMMENT ON THE AGE OF THIS HARDWARE!
## (it is a precious antique)
##

Script started on Thu May 16 22:07:04 2019
root@edge:~ # camcontrol devlist
<Maxtor 33073H3 YAH814Y0>           at scbus0 target 0 lun 0 (ada0,pass0)
<ASUS CD-S340 3.40>                 at scbus1 target 0 lun 0 (pass1,cd0)
<QUANTUM ATLAS10K3_36_WLS 020W>     at scbus2 target 0 lun 0 (da0,pass2)
<IBM IC35L018UWDY10-0 S25F>         at scbus2 target 2 lun 0 (da1,pass3)
<IBM IC35L018UWDY10-0 S25F>         at scbus2 target 4 lun 0 (da2,pass4)
<SanDisk SDSSDA120G Z22000RL>       at scbus4 target 0 lun 0 (ada1,pass5)
<ST3000DM008-2DM166 CC26>           at scbus5 target 0 lun 0 (ada2,pass6)
<KINGSTON SA400S37120G SBFK71E0>    at scbus7 target 0 lun 0 (ada3,pass7)
<Hitachi HDS5C1010CLA382 JC4OA3MA>  at scbus8 target 0 lun 0 (ada4,pass8)
<Kingston DataTraveler G2 1.00>     at scbus10 target 0 lun 0 (da3,pass9)
root@edge:~ # gpart add -t freebsd-zfs -s 1G ada2
ada2p5 added
root@edge:~ # gpart show ada2
=>        40  5860533088  ada2  GPT  (2.7T)
          40   209715200     1  freebsd  (100G)
   209715240  1687971896     2  freebsd-zfs  (805G)
  1897687136  1924953472     4  freebsd-zfs  (918G)
  3822640608    55838024     3  freebsd-zfs  (27G)
  3878478632     2097152     5  freebsd-zfs  (1.0G)
  3880575784  1979957344        - free -  (944G)
root@edge:~ # gpart add -t freebsd-zfs -s 1G -b 3882672936 ada2
ada2p6 added
root@edge:~ # gpart add -t freebsd-zfs -s 1G -b 3886867240 ada2
ada2p7 added
root@edge:~ # gpart show ada2
=>        40  5860533088  ada2  GPT  (2.7T)
          40   209715200     1  freebsd  (100G)
   209715240  1687971896     2  freebsd-zfs  (805G)
  1897687136  1924953472     4  freebsd-zfs  (918G)
  3822640608    55838024     3  freebsd-zfs  (27G)
  3878478632     2097152     5  freebsd-zfs  (1.0G)
  3880575784     2097152        - free -  (1.0G)
  3882672936     2097152     6  freebsd-zfs  (1.0G)
  3884770088     2097152        - free -  (1.0G)
  3886867240     2097152     7  freebsd-zfs  (1.0G)
  3888964392  1971568736        - free -  (940G)
root@edge:~ # zpool create testraid raidz ada2p5 ada2p6 ada2p7
root@edge:~ # zpool list testraid
NAME       SIZE  ALLOC   FREE  CKPOINT  EXPANDSZ   FRAG    CAP  DEDUP  HEALTH  ALTROOT
testraid  2.75G   656K  2.75G        -         -     0%     0%  1.00x  ONLINE  -
root@edge:~ # zpool status testraid
  pool: testraid
 state: ONLINE
  scan: none requested
config:

        NAME        STATE     READ WRITE CKSUM
        testraid    ONLINE       0     0     0
          raidz1-0  ONLINE       0     0     0
            ada2p5  ONLINE       0     0     0
            ada2p6  ONLINE       0     0     0
            ada2p7  ONLINE       0     0     0

errors: No known data errors
root@edge:~ # gpart resize -s 4194304 -i 5 ada2
ada2p5 resized
root@edge:~ # gpart resize -s 4194304 -i 6 ada2
ada2p6 resized
root@edge:~ # gpart resize -s 4194304 -i 7 ada2
ada2p7 resized
root@edge:~ # gpart show ada2
=>        40  5860533088  ada2  GPT  (2.7T)
          40   209715200     1  freebsd  (100G)
   209715240  1687971896     2  freebsd-zfs  (805G)
  1897687136  1924953472     4  freebsd-zfs  (918G)
  3822640608    55838024     3  freebsd-zfs  (27G)
  3878478632     4194304     5  freebsd-zfs  (2.0G)
  3882672936     4194304     6  freebsd-zfs  (2.0G)
  3886867240     4194304     7  freebsd-zfs  (2.0G)
  3891061544  1969471584        - free -  (939G)
root@edge:~ # zpool set autoexpand=on testraid
root@edge:~ # zpool list testraid
NAME       SIZE  ALLOC   FREE  CKPOINT  EXPANDSZ   FRAG    CAP  DEDUP  HEALTH  ALTROOT
testraid  2.75G   848K  2.75G        -         -     0%     0%  1.00x  ONLINE  -
root@edge:~ # zpool export testraid
root@edge:~ # zpool import
   pool: testraid
     id: 12608152059619624422
  state: UNAVAIL
 status: One or more devices are missing from the system.
 action: The pool cannot be imported. Attach the missing
        devices and try again.
   see: http://illumos.org/msg/ZFS-8000-3C
 config:

        testraid                  UNAVAIL  insufficient replicas
          raidz1-0                UNAVAIL  insufficient replicas
            13692722988113028666  UNAVAIL  cannot open
            10312580954503443965  UNAVAIL  cannot open
            16943054157341459289  UNAVAIL  cannot open
root@edge:~ # zpool list
NAME     SIZE  ALLOC   FREE  CKPOINT  EXPANDSZ   FRAG    CAP  DEDUP  HEALTH  ALTROOT
backup   918G   461G   456G        -         -    12%    50%  1.00x  ONLINE  -
bm       800G   581G   219G        -         -    13%    72%  1.00x  ONLINE  -
gr      50.5G  17.2G  33.3G        -         -    15%    34%  1.00x  ONLINE  -
im      26.5G  11.5G  15.0G        -         -    47%    43%  1.00x  ONLINE  -
root@edge:~ # sync
root@edge:~ # gpart show ada2
=>        40  5860533088  ada2  GPT  (2.7T)
          40   209715200     1  freebsd  (100G)
   209715240  1687971896     2  freebsd-zfs  (805G)
  1897687136  1924953472     4  freebsd-zfs  (918G)
  3822640608    55838024     3  freebsd-zfs  (27G)
  3878478632     4194304     5  freebsd-zfs  (2.0G)
  3882672936     4194304     6  freebsd-zfs  (2.0G)
  3886867240     4194304     7  freebsd-zfs  (2.0G)
  3891061544  1969471584        - free -  (939G)
root@edge:~ # gpart resize -s 2097152 -i 5 ada2
ada2p5 resized
root@edge:~ # gpart resize -s 2097152 -i 6 ada2
ada2p6 resized
root@edge:~ # gpart resize -s 2097152 -i 7 ada2
ada2p7 resized
root@edge:~ # gpart show ada2
=>        40  5860533088  ada2  GPT  (2.7T)
          40   209715200     1  freebsd  (100G)
   209715240  1687971896     2  freebsd-zfs  (805G)
  1897687136  1924953472     4  freebsd-zfs  (918G)
  3822640608    55838024     3  freebsd-zfs  (27G)
  3878478632     2097152     5  freebsd-zfs  (1.0G)
  3880575784     2097152        - free -  (1.0G)
  3882672936     2097152     6  freebsd-zfs  (1.0G)
  3884770088     2097152        - free -  (1.0G)
  3886867240     2097152     7  freebsd-zfs  (1.0G)
  3888964392  1971568736        - free -  (940G)
root@edge:~ # zpool import
Assertion failed: (avl_find() succeeded inside avl_add()), file /usr/src/sys/cddl/contrib/opensolaris/common/avl/avl.c, line 649.
Abort (core dumped)
root@edge:~ # zpool import
Assertion failed: (avl_find() succeeded inside avl_add()), file /usr/src/sys/cddl/contrib/opensolaris/common/avl/avl.c, line 649.
Abort (core dumped)
root@edge:~ # gpart delete -i 7 ada2
ada2p7 deleted
root@edge:~ # gpart delete -i 6 ada2
ada2p6 deleted
root@edge:~ # gpart delete -i 5 ada2
ada2p5 deleted
root@edge:~ # zpool import
root@edge:~ # ls -la /dev/gptid/
total 1
dr-xr-xr-x  2 root  wheel      512 May 16 20:54 .
dr-xr-xr-x  9 root  wheel      512 May 16 20:54 ..
crw-r-----  1 root  operator  0xed May 16 22:22 4ea3d975-7816-11e9-a104-00e01836f13c
crw-r-----  1 root  operator  0xef May 16 22:22 934b476f-7816-11e9-a104-00e01836f13c
crw-r-----  1 root  operator  0x94 May 16 20:54 ac0d2fe4-4b3b-11e9-8d45-00e01836f13c
crw-r-----  1 root  operator  0xf1 May 16 22:22 b71d41a1-7816-11e9-a104-00e01836f13c

## Ignore the one timestamped 20:54! That one belongs to the regular
## installation (and I didn't yet find a way to get rid of it).
##

root@edge:~ #
root@edge:~ # exit
exit
Script done on Thu May 16 23:09:43 2019

## During this procedure the following errors appeared in the syslog:

May 16 22:22:31 <kern.crit> edge kernel: g_access(944): provider gptid/4ea3d975-7816-11e9-a104-00e01836f13c has error 6 set
May 16 22:22:31 <kern.crit> edge kernel: g_access(944): provider gptid/4ea3d975-7816-11e9-a104-00e01836f13c has error 6 set
May 16 22:22:31 <kern.crit> edge kernel: g_dev_taste: make_dev_p() failed (gp->name=gptid/4ea3d975-7816-11e9-a104-00e01836f13c, error=17)
May 16 22:22:48 <kern.crit> edge kernel: g_access(944): provider gptid/934b476f-7816-11e9-a104-00e01836f13c has error 6 set
May 16 22:22:48 <kern.crit> edge kernel: g_access(944): provider gptid/934b476f-7816-11e9-a104-00e01836f13c has error 6 set
May 16 22:22:48 <kern.crit> edge kernel: g_dev_taste: make_dev_p() failed (gp->name=gptid/934b476f-7816-11e9-a104-00e01836f13c, error=17)
May 16 22:22:55 <kern.crit> edge kernel: g_access(944): provider gptid/b71d41a1-7816-11e9-a104-00e01836f13c has error 6 set
May 16 22:22:55 <kern.crit> edge kernel: g_access(944): provider gptid/b71d41a1-7816-11e9-a104-00e01836f13c has error 6 set
May 16 22:22:55 <kern.crit> edge kernel: g_dev_taste: make_dev_p() failed (gp->name=gptid/b71d41a1-7816-11e9-a104-00e01836f13c, error=17)
May 16 22:23:21 <kern.info> edge kernel: pid 11779 (zpool), uid 0: exited on signal 6 (core dumped)
May 16 22:27:35 <kern.info> edge kernel: pid 12342 (zpool), uid 0: exited on signal 6 (core dumped)
May 16 22:42:36 <kern.crit> edge kernel: g_access(944): provider gptid/934b476f-7816-11e9-a104-00e01836f13c has error 6 set
May 16 22:42:36 <kern.crit> edge kernel: g_access(944): provider gptid/b71d41a1-7816-11e9-a104-00e01836f13c has error 6 set
May 16 22:42:36 <kern.crit> edge kernel: g_access(944): provider gptid/4ea3d975-7816-11e9-a104-00e01836f13c has error 6 set

--------------- TESTCASE 4 END ------------------------

--------------- TESTCASE 5 BEGIN ------------------------

Script started on Thu May 16 22:45:58 2019
root@disp:~ # camcontrol devlist
<KINGSTON SA400S37240G S1Z40102>    at scbus1 target 0 lun 0 (pass0,ada0)
<Hitachi HDS5C1050CLA382 JC2OA50E>  at scbus2 target 0 lun 0 (pass1,ada1)
<WDC WD5000AAKS-00A7B2 01.03B01>    at scbus3 target 0 lun 0 (pass2,ada2)
<TSSTcorp CDDVDW SH-224DB SB01>     at scbus4 target 0 lun 0 (pass3,cd0)
<AHCI SGPIO Enclosure 1.00 0001>    at scbus5 target 0 lun 0 (pass4,ses0)
<General UDisk 5.00>                at scbus6 target 0 lun 0 (da0,pass5)
root@disp:~ # gpart create -s GPT ada2
ada2 created
root@disp:~ # gpart add -t freebsd-zfs -s 1G ada2
ada2p1 added
root@disp:~ # gpart add -t freebsd-zfs -s 1G -b 4194344 ada2
ada2p2 added
root@disp:~ # gpart add -t freebsd-zfs -s 1G -b 8388648 ada2
ada2p3 added
root@disp:~ # gpart show ada2
=>       40  976773088  ada2  GPT  (466G)
         40    2097152     1  freebsd-zfs  (1.0G)
    2097192    2097152        - free -  (1.0G)
    4194344    2097152     2  freebsd-zfs  (1.0G)
    6291496    2097152        - free -  (1.0G)
    8388648    2097152     3  freebsd-zfs  (1.0G)
   10485800  966287328        - free -  (461G)
root@disp:~ # zpool create -f testraid raidz ada2p1 ada2p2 ada2p3
root@disp:~ # zpool list
NAME       SIZE  ALLOC   FREE  CKPOINT  EXPANDSZ   FRAG    CAP  DEDUP  HEALTH  ALTROOT
build       78G  34.0G  44.0G        -         -    35%    43%  1.00x  ONLINE  -
media      464G   418G  46.5G        -         -    27%    89%  1.00x  ONLINE  -
testraid  2.75G   552K  2.75G        -         -     0%     0%  1.00x  ONLINE  -
zdesk     39.5G  11.9G  27.6G        -         -    16%    30%  1.00x  ONLINE  -
root@disp:~ # gpart resize -s 4194304 -i 1 ada2
ada2p1 resized
root@disp:~ # gpart resize -s 4194304 -i 2 ada2
ada2p2 resized
root@disp:~ # gpart resize -s 4194304 -i 3 ada2
ada2p3 resized
root@disp:~ # zpool set autoexpand=on testraid
root@disp:~ # zpool export testraid
root@disp:~ # zpool import
   pool: testraid
     id: 9285999494183920856
  state: UNAVAIL
 status: One or more devices are missing from the system.
 action: The pool cannot be imported. Attach the missing
        devices and try again.
   see: http://illumos.org/msg/ZFS-8000-3C
 config:

        testraid                  UNAVAIL  insufficient replicas
          raidz1-0                UNAVAIL  insufficient replicas
            5467198674063294812   UNAVAIL  cannot open
            16413066309772469567  UNAVAIL  cannot open
            10976529604851394099  UNAVAIL  cannot open
root@disp:~ #

## Here the typescript ends due to system crash. The next command
## entered was
##   # gpart resize -s 2097152 -i 2 ada2
## The system showed some messages for 1/4 second and then rebooted.
## Nevertheless, the resize itself had succeeded, as was visible
## after reboot.
## Resizing of partition #2 gave this output (transcript from photo):

@$ gpart resize -s 2097152 -i 2 ada2

Fatal trap 12: page fault while in kernel mode
cpuid = 3; apic id = 06
fault virtual address   = 0x8
fault code              = supervisor read data, page not present
instruction pointer     = 0x20:0xffffffff8076103e
stack pointer           = 0x28:0xfffffe02299209e0
frame pointer           = 0x28:0xfffffe02299209f0
code segment            = base 0x0, limit 0xfffff, type 0x1b
                        = DPL 0, pres 1, long 1, def32 0, gran 1
processor eflags        = interrupt enabled, resume, IOPL = 0
current process         = 13 (g_event)
[ thread pid 13 tid 100025 ]
Stopped at      g_dev_orphan+0x2e:      cmpb    $0,0x8(%r14)
db>

## Resizing of partition #3 then delivered the same page fault.
##
## After this, the pool could be imported again:

Script started on Thu May 16 23:28:14 2019
root@disp:~ # gpart show ada2
=>       40  976773088  ada2  GPT  (466G)
         40    2097152     1  freebsd-zfs  (1.0G)
    2097192    2097152        - free -  (1.0G)
    4194344    2097152     2  freebsd-zfs  (1.0G)
    6291496    2097152        - free -  (1.0G)
    8388648    2097152     3  freebsd-zfs  (1.0G)
   10485800  966287328        - free -  (461G)
root@disp:~ # zpool list
NAME       SIZE  ALLOC   FREE  CKPOINT  EXPANDSZ   FRAG    CAP  DEDUP  HEALTH  ALTROOT
build       78G  34.0G  44.0G        -         -    35%    43%  1.00x  ONLINE  -
media      464G   418G  46.5G        -         -    27%    89%  1.00x  ONLINE  -
zdesk     39.5G  11.9G  27.6G        -         -    16%    30%  1.00x  ONLINE  -
root@disp:~ # zpool import
   pool: testraid
     id: 9285999494183920856
  state: ONLINE
 action: The pool can be imported using its name or numeric identifier.
 config:

        testraid    ONLINE
          raidz1-0  ONLINE
            ada2p1  ONLINE
            ada2p2  ONLINE
            ada2p3  ONLINE
root@disp:~ # zpool import testraid
root@disp:~ # zpool list
NAME       SIZE  ALLOC   FREE  CKPOINT  EXPANDSZ   FRAG    CAP  DEDUP  HEALTH  ALTROOT
build       78G  34.0G  44.0G        -         -    35%    43%  1.00x  ONLINE  -
media      464G   418G  46.5G        -         -    27%    89%  1.00x  ONLINE  -
testraid  2.75G   968K  2.75G        -         -     0%     0%  1.00x  ONLINE  -
zdesk     39.5G  11.9G  27.6G        -         -    16%    30%  1.00x  ONLINE  -
root@disp:~ # zpool status testraid
  pool: testraid
 state: ONLINE
  scan: none requested
config:

        NAME        STATE     READ WRITE CKSUM
        testraid    ONLINE       0     0     0
          raidz1-0  ONLINE       0     0     0
            ada2p1  ONLINE       0     0     0
            ada2p2  ONLINE       0     0     0
            ada2p3  ONLINE       0     0     0

errors: No known data errors
root@disp:~ # exit
Script done on Thu May 16 23:29:29 2019

--------------- TESTCASE 5 END ------------------------

COMMENTARY
----------

Case 1: The USB stick showed seek times of 10'000 ms and more during
the operations and became hot. It seems the device is rather unfit for
such operation and/or not capable of handling FLUSH commands properly.
Case 4: It appears that the device nodes are STILL PRESENT in
/dev/gptid at the end of the procedure, even AFTER the partitions had
been destroyed. Might it be that they somehow get created in DUPLICATE,
and that this is the reason for the failures?

The main difference in the -successful- testcase 2 seems to be that an
export+import is done AFTER resizing the partitions and BEFORE trying
to grow the pool. This way ZFS can properly adjust the "expandsize"
attribute before actually doing the grow.

So the issue seems mainly a matter of (not) following proper procedures
(which, to my knowledge, aren't documented anywhere).
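As a practical corollary (my assumption, not something the tests above
isolated on purpose): before running "zpool online -e", one can check
whether ZFS has already noticed the enlarged partitions, because after
the export+import cycle the new space shows up in the EXPANDSZ column.
Assuming your zpool accepts the expandsize property with -o, something
like:

zpool list -o name,size,expandsize,health testraid

In testcase 2 this column showed 3G right after the import and the
grow then succeeded, while in testcase 4 it still showed "-" after the
resize without an intervening import.

cheerio,
PMc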