Date: Sat, 19 Dec 2020 14:58:59 -0500
From: mike tancsa <mike@sentex.net>
To: George Neville-Neil <gnn@neville-neil.com>
Cc: Mateusz Guzik <mjguzik@gmail.com>, "netperf-admin@FreeBSD.org" <netperf-admin@freebsd.org>, netperf-users@freebsd.org, Paul Holes <pholes@sentex.ca>
Subject: Re: zoo back online (was Re: zoo hang)
Message-ID: <837ce2bc-9731-85b0-c6a5-1b3c7bcadb72@sentex.net>
In-Reply-To: <F0FA8C48-1DB1-4D63-ACD4-3ADD78AFA568@neville-neil.com>
References: <5483e76e-4a2f-3153-c10b-7902839c1b68@sentex.net> <CAGudoHF-3XhWQq-x8vROdUJ0sTweha2YEK_LXVwv44E4k=TtmQ@mail.gmail.com> <a55a69da-c9c6-eb18-9975-3572457ae5ef@sentex.net> <8c26a0d3-3bd0-7535-0abc-3d1e9e5ac7c4@sentex.net> <64923d33-4bf2-0fd5-1b17-d6bd73e9fd32@sentex.net> <13a9ab42-1df8-c054-0c83-5708ab9d9e2b@sentex.net> <C94AED22-A984-49ED-8D18-FD4856D70E01@neville-neil.com> <6cef40cd-de57-aa84-bc70-ceea71add397@sentex.net> <F0FA8C48-1DB1-4D63-ACD4-3ADD78AFA568@neville-neil.com>
Hmm, this has happened again. Not sure if it's a bug with the driver, the
firmware, or both, but after a period of time the USB drive starts to throw
errors. This unit was working fine on RELENG12, and we swapped it with
another drive too, but got the same results. The drive is clean:

smartctl -a /dev/da2 -T permissive

da2 at umass-sim0 bus 0 scbus14 target 0 lun 0
da2: <WDC WD40 EFRX-68WT0N0 0105> Fixed Direct Access SPC-4 SCSI device
da2: Serial Number 00000000000000000000
da2: 400.000MB/s transfers
da2: 3815447MB (7814037168 512 byte sectors)
da2: quirks=0xa<NO_6_BYTE,4K>

(da2:umass-sim0:0:0:0): WRITE(10). CDB: 2a 00 04 f6 a5 a8 00 00 08 00
(da2:umass-sim0:0:0:0): CAM status: CCB request completed with an error
(da2:umass-sim0:0:0:0): Retrying command, 3 more tries remain
(da2:umass-sim0:0:0:0): WRITE(10). CDB: 2a 00 04 f6 a5 a8 00 00 08 00
(da2:umass-sim0:0:0:0): CAM status: CCB request completed with an error
(da2:umass-sim0:0:0:0): Retrying command, 2 more tries remain
(da2:umass-sim0:0:0:0): WRITE(10). CDB: 2a 00 04 f6 a5 a8 00 00 08 00
(da2:umass-sim0:0:0:0): CAM status: CCB request completed with an error
(da2:umass-sim0:0:0:0): Retrying command, 1 more tries remain
(da2:umass-sim0:0:0:0): WRITE(10). CDB: 2a 00 04 f6 a5 a8 00 00 08 00
(da2:umass-sim0:0:0:0): CAM status: CCB request completed with an error
(da2:umass-sim0:0:0:0): Retrying command, 0 more tries remain
(da2:umass-sim0:0:0:0): WRITE(10). CDB: 2a 00 04 f6 a5 a8 00 00 08 00
(da2:umass-sim0:0:0:0): CAM status: CCB request completed with an error
(da2:umass-sim0:0:0:0): Error 5, Retries exhausted
(da2:umass-sim0:0:0:0): WRITE(10). CDB: 2a 00 04 5c 65 f0 00 00 40 00
(da2:umass-sim0:0:0:0): CAM status: CCB request completed with an error
(da2:umass-sim0:0:0:0): Retrying command, 3 more tries remain
(da2:umass-sim0:0:0:0): WRITE(10). CDB: 2a 00 04 5c 65 f0 00 00 40 00
(da2:umass-sim0:0:0:0): CAM status: CCB request completed with an error
(da2:umass-sim0:0:0:0): Retrying command, 2 more tries remain
(da2:umass-sim0:0:0:0): WRITE(10). CDB: 2a 00 04 5c 65 f0 00 00 40 00
(da2:umass-sim0:0:0:0): CAM status: CCB request completed with an error
(da2:umass-sim0:0:0:0): Retrying command, 1 more tries remain
(da2:umass-sim0:0:0:0): WRITE(10). CDB: 2a 00 04 5c 65 f0 00 00 40 00
(da2:umass-sim0:0:0:0): CAM status: CCB request completed with an error
(da2:umass-sim0:0:0:0): Retrying command, 0 more tries remain
(da2:umass-sim0:0:0:0): WRITE(10). CDB: 2a 00 b4 00 20 40 00 00 08 00
(da2:umass-sim0:0:0:0): CAM status: CCB request completed with an error
(da2:umass-sim0:0:0:0): Retrying command, 3 more tries remain
(da2:umass-sim0:0:0:0): WRITE(10). CDB: 2a 00 b4 00 20 40 00 00 08 00
(da2:umass-sim0:0:0:0): CAM status: CCB request completed with an error
(da2:umass-sim0:0:0:0): Retrying command, 2 more tries remain
(da2:umass-sim0:0:0:0): WRITE(10). CDB: 2a 00 b4 00 20 40 00 00 08 00
(da2:umass-sim0:0:0:0): CAM status: CCB request completed with an error
(da2:umass-sim0:0:0:0): Retrying command, 1 more tries remain
(da2:umass-sim0:0:0:0): WRITE(10). CDB: 2a 00 b4 00 20 40 00 00 08 00
(da2:umass-sim0:0:0:0): CAM status: CCB request completed with an error
(da2:umass-sim0:0:0:0): Retrying command, 0 more tries remain
(da2:umass-sim0:0:0:0): WRITE(10). CDB: 2a 00 b4 00 20 40 00 00 08 00
(da2:umass-sim0:0:0:0): CAM status: CCB request completed with an error
(da2:umass-sim0:0:0:0): Error 5, Retries exhausted
(da2:umass-sim0:0:0:0): WRITE(10). CDB: 2a 00 ba 00 20 48 00 00 08 00
(da2:umass-sim0:0:0:0): CAM status: CCB request completed with an error
(da2:umass-sim0:0:0:0): Retrying command, 3 more tries remain
(da2:umass-sim0:0:0:0): WRITE(10). CDB: 2a 00 ba 00 20 48 00 00 08 00
(da2:umass-sim0:0:0:0): CAM status: CCB request completed with an error
(da2:umass-sim0:0:0:0): Retrying command, 2 more tries remain
(da2:umass-sim0:0:0:0): WRITE(10). CDB: 2a 00 ba 00 20 48 00 00 08 00
(da2:umass-sim0:0:0:0): CAM status: CCB request completed with an error
(da2:umass-sim0:0:0:0): Retrying command, 1 more tries remain
(da2:umass-sim0:0:0:0): WRITE(10). CDB: 2a 00 ba 00 20 48 00 00 08 00
(da2:umass-sim0:0:0:0): CAM status: CCB request completed with an error
(da2:umass-sim0:0:0:0): Retrying command, 0 more tries remain
(da2:umass-sim0:0:0:0): WRITE(10). CDB: 2a 00 ba 00 20 48 00 00 08 00
(da2:umass-sim0:0:0:0): CAM status: CCB request completed with an error
(da2:umass-sim0:0:0:0): Error 5, Retries exhausted
(da2:umass-sim0:0:0:0): WRITE(10). CDB: 2a 00 b4 00 20 28 00 00 18 00
(da2:umass-sim0:0:0:0): CAM status: CCB request completed with an error
(da2:umass-sim0:0:0:0): Retrying command, 3 more tries remain
(da2:umass-sim0:0:0:0): WRITE(10). CDB: 2a 00 b4 00 20 28 00 00 18 00
(da2:umass-sim0:0:0:0): CAM status: CCB request completed with an error
(da2:umass-sim0:0:0:0): Retrying command, 2 more tries remain
(da2:umass-sim0:0:0:0): WRITE(10). CDB: 2a 00 b4 00 20 28 00 00 18 00
(da2:umass-sim0:0:0:0): CAM status: CCB request completed with an error
(da2:umass-sim0:0:0:0): Retrying command, 1 more tries remain
Solaris: WARNING: Pool 'zoobackup' has encountered an uncorrectable I/O failure and has been suspended.

On 12/18/2020 10:08 AM, George Neville-Neil wrote:
> OK, once we get the backup complete we should probably work on the
> rest of the cleanup. Let me know if and how I can help.
>
> Best,
> George
>
>
> On 18 Dec 2020, at 9:14, mike tancsa wrote:
>
>> Hi George,
>>
>> I think the boot loader is now fixed as those features are white
>> listed. Will start backups once again via zrepl.
>>
>> ---Mike
>>
>> On 12/17/2020 1:58 PM, George Neville-Neil wrote:
>>> Howdy,
>>>
>>> How do we want to handle the old tank stuff?
>>>
>>> Best,
>>> George
>>>
>>>
>>> On 15 Dec 2020, at 16:24, mike tancsa wrote:
>>>
>>>> OK, thanks to Josh P's suggestion, deleting the v2 bookmarks from the
>>>> pool allowed us to boot.
>>>>
>>>> Booted from a temp drive, imported the pool,
>>>>
>>>> root@zoo-temp:~ # zpool import -R /mnt -f zooroot
>>>> root@zoo-temp:~ # zfs list -t bookmark | grep ^z | awk '{print "zfs destroy "$1}'
>>>> zfs destroy zooroot#zrepl_CURSOR_G_77296a02a81c78cc_J_push_to_drive
>>>> zfs destroy zooroot/ROOT#zrepl_CURSOR_G_e27691751ed1660b_J_push_to_drive
>>>> zfs destroy zooroot/ROOT/default#zrepl_CURSOR_G_607fa8e4c7df13b5_J_push_to_drive
>>>> zfs destroy zooroot/tmp#zrepl_CURSOR_G_25ae8e2b8723a008_J_push_to_drive
>>>> zfs destroy zooroot/usr#zrepl_CURSOR_G_344a884262b3e387_J_push_to_drive
>>>> zfs destroy zooroot/usr/home#zrepl_CURSOR_G_2e4087f8f219bd83_J_push_to_drive
>>>> zfs destroy zooroot/usr/ports#zrepl_CURSOR_G_fb8384d458dd82b3_J_push_to_drive
>>>> zfs destroy zooroot/usr/src#zrepl_CURSOR_G_b867573acd8a57f8_J_push_to_drive
>>>> zfs destroy zooroot/var#zrepl_CURSOR_G_ea9efdf01fdf65b5_J_push_to_drive
>>>> zfs destroy zooroot/var/audit#zrepl_CURSOR_G_e71132efb0fee45a_J_push_to_drive
>>>> zfs destroy zooroot/var/crash#zrepl_CURSOR_G_191c17e9538113f4_J_push_to_drive
>>>> zfs destroy zooroot/var/log#zrepl_CURSOR_G_f30668295109ad60_J_push_to_drive
>>>> zfs destroy zooroot/var/mail#zrepl_CURSOR_G_7d1eac92237e2603_J_push_to_drive
>>>> zfs destroy zooroot/var/tmp#zrepl_CURSOR_G_d593288357e0a319_J_push_to_drive
>>>> root@zoo-temp:~ # zfs list -t bookmark | grep ^z | awk '{print "zfs destroy "$1}' | sh
>>>> root@zoo-temp:~ #
>>>> root@zoo-temp:~ # zpool export zooroot
>>>> root@zoo-temp:~ #
>>>>
>>>> and rebooted and its up. Sadly, will need to come up with another
>>>> backup system as sysutils/zrepl uses bookmarks :(
>>>>
>>>> ---Mike
>>>>
>>>> On 12/15/2020 1:46 PM, mike tancsa wrote:
>>>>> Looks like the loader does not support v2 bookmarks. I am going to
>>>>> get Paul to put in another disk to boot from, mjg will login, either
>>>>> destroy the bookmarks or hack a loader fix that will allow the box to
>>>>> boot with this feature. Will be an hour or so as we have a office
>>>>> meeting at 2pm we both have to attend.
>>>>>
>>>>> ---Mike
>>>>>
>>>>> On 12/15/2020 1:28 PM, mike tancsa wrote:
>>>>>> I am guessing because I was using zrepl from the ports to do
>>>>>> replication / backup to a secondary disk, the use of the bookmark_v2
>>>>>> feature is not supported on ZoL ? Any way to recover from this ?
>>>>>>
>>>>>>
>>>>>> On 12/15/2020 1:10 PM, mike tancsa wrote:
>>>>>>> OK, but the first problem to deal with :(
>>>>>>>
>>>>>>> BIOS drive C: is disk0
>>>>>>> BIOS drive D: is disk1
>>>>>>> ZFS: unsupported feature: com.datto:bookmark_v2
>>>>>>> ZFS: pool zooroot is not supported
>>>>>>>
>>>>>>> Can't find /boot/zfsloader
>>>>>>>
>>>>>>> Can't find /boot/loader
>>>>>>>
>>>>>>> Can't find /boot/kernel/kernel
>>>>>>>
>>>>>>> FreeBSD/x86 boot
>>>>>>> Default: /boot/kernel/kernel
>>>>>>> boot:
>>>>>>>
>>>>>>> Can't find /boot/kernel/kernel
>>>>>>>
>>>>>>> FreeBSD/x86 boot
>>>>>>> Default: /boot/kernel/kernel
>>>>>>> boot:
>>>>>>>
>>>>>>> On 12/15/2020 1:02 PM, Mateusz Guzik wrote:
>>>>>>>> We need to update to r368649 for a pmap fix regardless of the
>>>>>>>> above. I can do the work and make the box ready for the next
>>>>>>>> reboot.
>>>>>>>>
>>>>>>>> On 12/15/20, mike tancsa <mike@sentex.net> wrote:
>>>>>>>>> The USB backup disk was throwing errors and I was trying to
>>>>>>>>> export the backup pool and it looks like the box is hung now.
>>>>>>>>> I am going to power cycle it
>>>>>>>>>
>>>>>>>>> ---Mike
>>>>>>>>>
>>>
>
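
For anyone reading the CAM errors at the top of this message: the failing block addresses can be read straight out of the CDB bytes. In the standard SCSI WRITE(10) layout, byte 0 is the opcode (0x2a), bytes 2-5 are the big-endian logical block address, and bytes 7-8 are the transfer length in blocks. The helper below is only a rough sketch under that assumption. The name decode_write10 is made up for illustration and is not part of FreeBSD, CAM, or any tool mentioned in this thread.

```shell
#!/bin/sh
# Hypothetical helper (not a FreeBSD tool): decode a WRITE(10) CDB given as
# ten space-separated hex bytes, in the form CAM prints them in the log.
decode_write10() {
    # Deliberately unquoted so the string splits into one parameter per byte.
    set -- $1
    if [ "$#" -ne 10 ] || [ "$1" != "2a" ]; then
        echo "not a WRITE(10) CDB" >&2
        return 1
    fi
    # Bytes 2-5 (parameters 3-6): big-endian logical block address.
    lba=$(( 0x$3$4$5$6 ))
    # Bytes 7-8 (parameters 8-9): big-endian transfer length in blocks.
    blocks=$(( 0x$8$9 ))
    echo "lba=$lba blocks=$blocks"
}

# First failing write from the log:
decode_write10 "2a 00 04 f6 a5 a8 00 00 08 00"   # prints: lba=83273128 blocks=8
```

Since da2 reports 512 byte sectors, a length of 8 blocks is a single 4 KiB write. Decoding the other CDBs the same way shows whether the failures cluster in one region of the disk or are spread out.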