Date: Sun, 28 Feb 2010 23:58:24 -0600
From: "James R. Van Artsdalen" <james-freebsd-fs2@jrv.org>
To: freebsd-fs <freebsd-fs@freebsd.org>
Subject: [zfs] attach by name/uuid still attaches wrong device
Message-ID: <4B8B5780.2050601@jrv.org>

FreeBSD bigtex.housenet.jrv 9.0-CURRENT FreeBSD 9.0-CURRENT #2 r200727M: Tue Dec 22 23:25:56 CST 2009 james@bigtex.housenet.jrv:/usr/obj/usr/src/sys/BIGTEX amd64

It appears that zfs/vdev_geom.c can still attach to the wrong device in
some cases. Note in the zpool status output how ada10 appears in two
different vdevs.

What happened is that a disk failed completely (scbus3 target 3) and is
no longer detected by the driver. At boot time:

1. ZFS fails to attach by path and UUID, since what was at ada11 is now
   at ada10 and has a different UUID.
2. ZFS fails to attach by UUID since that UUID is on a dead drive and
   can no longer be found anywhere.
3. ZFS then attaches by path blindly, even though that drive is in a
   different part of the pool and has a different UUID.

I don't think it's possible to do this right in vdev_geom.c: there's no
way to guess what is intended without a hint from higher ZFS layers as
to which drives should be found and which are new.

The best fixes I can think of are to expose drives by serial number in
GEOM, or perhaps as a fallback expose names that are geographic
locations, e.g., "/dev/scbus0/target3/lun0".

# zpool status
  pool: bigtex
 state: DEGRADED
status: One or more devices could not be used because the label is missing
        or invalid. Sufficient replicas exist for the pool to continue
        functioning in a degraded state.
action: Replace the device using 'zpool replace'.
   see: http://www.sun.com/msg/ZFS-8000-4J
 scrub: none requested
config:

        NAME                                            STATE     READ WRITE CKSUM
        bigtex                                          DEGRADED     0     0     0
          mirror                                        ONLINE       0     0     0
            ada6                                        ONLINE       0     0     0
            ada13                                       ONLINE       0     0     0
          mirror                                        ONLINE       0     0     0
            ada4                                        ONLINE       0     0     0
            ada11                                       ONLINE       0     0     0
          mirror                                        ONLINE       0     0     0
            gptid/dbb5f9fd-5e40-11de-bef4-001aa01b0286  ONLINE       0     0     0
            ada2p7                                      ONLINE       0     0     0
          mirror                                        ONLINE       0     0     0
            ada7                                        ONLINE       0     0     0
            ada14                                       ONLINE       0     0     0
          mirror                                        ONLINE       0     0     0
            ada3                                        ONLINE       0     0     0
            ada10                                       ONLINE       0     0     0
          mirror                                        ONLINE       0     0     0
            ada5                                        ONLINE       0     0     0
            ada12                                       ONLINE       0     0     0
          mirror                                        ONLINE       0     0     0
            ada9                                        ONLINE       0     0     0
            ada15                                       ONLINE       0     0     0
          mirror                                        DEGRADED     0     0     0
            ada10                                       FAULTED     10  754K     0  corrupted data
            ada16                                       ONLINE       0     0     0

errors: No known data errors

# camcontrol devlist
<WDC WD15EADS-00R6B0 01.00A01> at scbus0 target 0 lun 0 (ada2,pass6)
<WDC WD20EADS-00R6B0 01.00A01> at scbus0 target 1 lun 0 (ada3,pass7)
<WDC WD20EADS-00R6B0 01.00A01> at scbus0 target 2 lun 0 (ada4,pass8)
<WDC WD20EADS-00R6B0 01.00A01> at scbus0 target 3 lun 0 (ada5,pass9)
<Port Multiplier 37261095 1706> at scbus0 target 15 lun 0 (pass0,pmp0)
<WDC WD20EADS-00R6B0 01.00A01> at scbus3 target 0 lun 0 (ada6,pass10)
<WDC WD20EADS-00R6B0 01.00A01> at scbus3 target 1 lun 0 (ada7,pass11)
<WDC WD20EADS-00R6B0 01.00A01> at scbus3 target 2 lun 0 (ada9,pass13)
<Port Multiplier 37261095 1706> at scbus3 target 15 lun 0 (pass1,pmp1)
<ST31500343AS SD35> at scbus4 target 0 lun 0 (ada8,pass12)
<ST32000542AS CC32> at scbus4 target 1 lun 0 (ada10,pass14)
<ST32000542AS CC32> at scbus4 target 2 lun 0 (ada11,pass15)
<ST32000542AS CC32> at scbus4 target 3 lun 0 (ada12,pass16)
<Port Multiplier 37261095 1706> at scbus4 target 15 lun 0 (pass2,pmp2)
<ST32000542AS CC32> at scbus7 target 0 lun 0 (ada13,pass17)
<ST32000542AS CC32> at scbus7 target 1 lun 0 (ada14,pass18)
<ST32000542AS CC32> at scbus7 target 2 lun 0 (ada15,pass19)
<ST32000542AS CC32> at scbus7 target 3 lun 0 (ada16,pass20)
<Port Multiplier 37261095 1706> at scbus7 target 15 lun 0 (pass3,pmp3)
<ST31500341AS CC1G> at scbus8 target 0 lun 0 (pass4,ada0)
<ST31500341AS CC1G> at scbus11 target 0 lun 0 (pass5,ada1)

# grep ada10 /var/run/dmesg.boot
vdev_geom_read_guid:301[1]: Reading guid from ada10...
vdev_geom_read_guid:339[1]: guid for ada10 is 12768899409278570370
vdev_geom_open_by_path:466[1]: Found provider by name /dev/ada10.
vdev_geom_attach:112[1]: Attaching to ada10.
vdev_geom_attach:138[1]: Found consumer for ada10.
vdev_geom_attach:157[1]: Used existing consumer for ada10.
vdev_geom_read_guid:301[1]: Reading guid from ada10...
vdev_geom_read_guid:339[1]: guid for ada10 is 12768899409278570370
vdev_geom_detach:173[1]: Closing access to ada10.
vdev_geom_open_by_path:477[1]: guid mismatch for provider /dev/ada10: 3665972767133355802 != 12768899409278570370.
vdev_geom_read_guid:301[1]: Reading guid from ada10...
vdev_geom_read_guid:339[1]: guid for ada10 is 12768899409278570370
vdev_geom_open_by_path:466[1]: Found provider by name /dev/ada10.
vdev_geom_attach:112[1]: Attaching to ada10.
vdev_geom_attach:138[1]: Found consumer for ada10.
vdev_geom_attach:157[1]: Used existing consumer for ada10.
vdev_geom_detach:173[1]: Closing access to ada10.
vdev_geom_detach:173[1]: Closing access to ada10.
vdev_geom_detach:177[1]: Destroyed consumer to ada10.
vdev_geom_read_guid:301[1]: Reading guid from ada10...
vdev_geom_read_guid:339[1]: guid for ada10 is 12768899409278570370
vdev_geom_attach:112[1]: Attaching to ada10.
vdev_geom_attach:153[1]: Created consumer for ada10.
vdev_geom_open_by_guid:446[1]: Attach by guid [12768899409278570370] succeeded, provider /dev/ada10.
vdev_geom_read_guid:301[1]: Reading guid from ada10...
vdev_geom_read_guid:339[1]: guid for ada10 is 12768899409278570370
vdev_geom_open_by_path:466[1]: Found provider by name /dev/ada10.
vdev_geom_attach:112[1]: Attaching to ada10.
vdev_geom_attach:138[1]: Found consumer for ada10.
vdev_geom_attach:157[1]: Used existing consumer for ada10.
vdev_geom_read_guid:301[1]: Reading guid from ada10...
vdev_geom_read_guid:339[1]: guid for ada10 is 12768899409278570370
vdev_geom_detach:173[1]: Closing access to ada10.
vdev_geom_open_by_path:477[1]: guid mismatch for provider /dev/ada10: 3665972767133355802 != 12768899409278570370.
vdev_geom_read_guid:301[1]: Reading guid from ada10...
vdev_geom_read_guid:339[1]: guid for ada10 is 12768899409278570370
vdev_geom_open_by_path:466[1]: Found provider by name /dev/ada10.
vdev_geom_attach:112[1]: Attaching to ada10.
vdev_geom_attach:138[1]: Found consumer for ada10.
vdev_geom_attach:157[1]: Used existing consumer for ada10.
vdev_geom_detach:173[1]: Closing access to ada10.
#
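
The three boot-time steps listed in the message can be modelled in a few lines. The sketch below is a simplified Python model of that fallback order, not the actual vdev_geom.c code. The names VdevConfig, open_by_path, open_by_guid and attach are hypothetical, providers are reduced to a path-to-GUID table, and OTHER_GUID is an invented placeholder for whatever drive now sits at ada11. The two large GUIDs are the ones in the dmesg output, with 3665972767133355802 taken to be the GUID expected for the failed drive.

```python
from dataclasses import dataclass
from typing import Dict, Optional, Tuple


@dataclass(frozen=True)
class VdevConfig:
    """What the pool configuration remembers about one leaf vdev."""
    path: str  # device path recorded in the pool configuration
    guid: int  # GUID recorded for this vdev


def open_by_path(providers: Dict[str, int], path: str,
                 guid: Optional[int] = None) -> Optional[str]:
    """Return path if a provider exists there and, when a GUID is
    given, the label read from that provider carries the same GUID."""
    found = providers.get(path)
    if found is None:
        return None
    if guid is not None and found != guid:
        return None  # the "guid mismatch for provider" case in the log
    return path


def open_by_guid(providers: Dict[str, int], guid: int) -> Optional[str]:
    """Scan all providers for one whose label carries this GUID."""
    for path, found in providers.items():
        if found == guid:
            return path
    return None


def attach(providers: Dict[str, int],
           vdev: VdevConfig) -> Tuple[Optional[str], str]:
    """Try the three steps from the message, in order."""
    path = open_by_path(providers, vdev.path, vdev.guid)  # step 1
    if path is not None:
        return path, "path+guid"
    path = open_by_guid(providers, vdev.guid)             # step 2
    if path is not None:
        return path, "guid"
    return open_by_path(providers, vdev.path), "path only"  # step 3, blind


MOVED_GUID = 12768899409278570370  # from dmesg: label now read from ada10
DEAD_GUID = 3665972767133355802    # from dmesg: GUID expected, not found
OTHER_GUID = 1                     # hypothetical: drive now at ada11

# Providers visible after the failed disk dropped out and names shifted.
providers = {"/dev/ada10": MOVED_GUID, "/dev/ada11": OTHER_GUID}

moved = VdevConfig("/dev/ada11", MOVED_GUID)  # drive that used to be ada11
dead = VdevConfig("/dev/ada10", DEAD_GUID)    # drive that failed

for vdev in (moved, dead):
    print(vdev.path, "->", attach(providers, vdev))
```

In this model both vdevs resolve to /dev/ada10: the moved drive is found by GUID, and the dead drive's vdev falls through to the blind path attach. That matches ada10 showing up once ONLINE and once FAULTED in the zpool status output. Dropping step 3 in the model would leave the dead vdev unattached, but as the message notes, the real code cannot tell from inside vdev_geom.c whether a path-only match is a new drive or the wrong one.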