Date: Sat, 19 Dec 2020 21:16:01 +0100
From: Mateusz Guzik <mjguzik@gmail.com>
To: mike tancsa <mike@sentex.net>
Cc: George Neville-Neil <gnn@neville-neil.com>, "netperf-admin@FreeBSD.org" <netperf-admin@freebsd.org>, netperf-users@freebsd.org, Paul Holes <pholes@sentex.ca>, Hans Petter Selasky <hps@selasky.org>
Subject: Re: zoo back online (was Re: zoo hang)
Message-ID: <CAGudoHFDLu_MDT1H7xgcX5cXAEi8g_a1Kq8DumO6fJKQq6zBbg@mail.gmail.com>
In-Reply-To: <837ce2bc-9731-85b0-c6a5-1b3c7bcadb72@sentex.net>
References: <5483e76e-4a2f-3153-c10b-7902839c1b68@sentex.net> <CAGudoHF-3XhWQq-x8vROdUJ0sTweha2YEK_LXVwv44E4k=TtmQ@mail.gmail.com> <a55a69da-c9c6-eb18-9975-3572457ae5ef@sentex.net> <8c26a0d3-3bd0-7535-0abc-3d1e9e5ac7c4@sentex.net> <64923d33-4bf2-0fd5-1b17-d6bd73e9fd32@sentex.net> <13a9ab42-1df8-c054-0c83-5708ab9d9e2b@sentex.net> <C94AED22-A984-49ED-8D18-FD4856D70E01@neville-neil.com> <6cef40cd-de57-aa84-bc70-ceea71add397@sentex.net> <F0FA8C48-1DB1-4D63-ACD4-3ADD78AFA568@neville-neil.com> <837ce2bc-9731-85b0-c6a5-1b3c7bcadb72@sentex.net>
I'm adding hps for USB stack comments.

On 12/19/20, mike tancsa <mike@sentex.net> wrote:
> Hmm, This has happened again. Not sure if its a bug with the driver, the
> firmware or both, but after a period of time the usb drive starts to
> throw errors. This unit was working fine on RELENG12 and we swapped it
> with another drive too, but same results. The drive is clean
>
> smartctl -a /dev/da2 -T permissive
>
> da2 at umass-sim0 bus 0 scbus14 target 0 lun 0
> da2: <WDC WD40 EFRX-68WT0N0 0105> Fixed Direct Access SPC-4 SCSI device
> da2: Serial Number 00000000000000000000
> da2: 400.000MB/s transfers
> da2: 3815447MB (7814037168 512 byte sectors)
> da2: quirks=0xa<NO_6_BYTE,4K>
> (da2:umass-sim0:0:0:0): WRITE(10). CDB: 2a 00 04 f6 a5 a8 00 00 08 00
> (da2:umass-sim0:0:0:0): CAM status: CCB request completed with an error
> (da2:umass-sim0:0:0:0): Retrying command, 3 more tries remain
> (da2:umass-sim0:0:0:0): WRITE(10). CDB: 2a 00 04 f6 a5 a8 00 00 08 00
> (da2:umass-sim0:0:0:0): CAM status: CCB request completed with an error
> (da2:umass-sim0:0:0:0): Retrying command, 2 more tries remain
> (da2:umass-sim0:0:0:0): WRITE(10). CDB: 2a 00 04 f6 a5 a8 00 00 08 00
> (da2:umass-sim0:0:0:0): CAM status: CCB request completed with an error
> (da2:umass-sim0:0:0:0): Retrying command, 1 more tries remain
> (da2:umass-sim0:0:0:0): WRITE(10). CDB: 2a 00 04 f6 a5 a8 00 00 08 00
> (da2:umass-sim0:0:0:0): CAM status: CCB request completed with an error
> (da2:umass-sim0:0:0:0): Retrying command, 0 more tries remain
> (da2:umass-sim0:0:0:0): WRITE(10). CDB: 2a 00 04 f6 a5 a8 00 00 08 00
> (da2:umass-sim0:0:0:0): CAM status: CCB request completed with an error
> (da2:umass-sim0:0:0:0): Error 5, Retries exhausted
> (da2:umass-sim0:0:0:0): WRITE(10). CDB: 2a 00 04 5c 65 f0 00 00 40 00
> (da2:umass-sim0:0:0:0): CAM status: CCB request completed with an error
> (da2:umass-sim0:0:0:0): Retrying command, 3 more tries remain
> (da2:umass-sim0:0:0:0): WRITE(10). CDB: 2a 00 04 5c 65 f0 00 00 40 00
> (da2:umass-sim0:0:0:0): CAM status: CCB request completed with an error
> (da2:umass-sim0:0:0:0): Retrying command, 2 more tries remain
> (da2:umass-sim0:0:0:0): WRITE(10). CDB: 2a 00 04 5c 65 f0 00 00 40 00
> (da2:umass-sim0:0:0:0): CAM status: CCB request completed with an error
> (da2:umass-sim0:0:0:0): Retrying command, 1 more tries remain
> (da2:umass-sim0:0:0:0): WRITE(10). CDB: 2a 00 04 5c 65 f0 00 00 40 00
> (da2:umass-sim0:0:0:0): CAM status: CCB request completed with an error
> (da2:umass-sim0:0:0:0): Retrying command, 0 more tries remain
> (da2:umass-sim0:0:0:0): WRITE(10). CDB: 2a 00 b4 00 20 40 00 00 08 00
> (da2:umass-sim0:0:0:0): CAM status: CCB request completed with an error
> (da2:umass-sim0:0:0:0): Retrying command, 3 more tries remain
> (da2:umass-sim0:0:0:0): WRITE(10). CDB: 2a 00 b4 00 20 40 00 00 08 00
> (da2:umass-sim0:0:0:0): CAM status: CCB request completed with an error
> (da2:umass-sim0:0:0:0): Retrying command, 2 more tries remain
> (da2:umass-sim0:0:0:0): WRITE(10). CDB: 2a 00 b4 00 20 40 00 00 08 00
> (da2:umass-sim0:0:0:0): CAM status: CCB request completed with an error
> (da2:umass-sim0:0:0:0): Retrying command, 1 more tries remain
> (da2:umass-sim0:0:0:0): WRITE(10). CDB: 2a 00 b4 00 20 40 00 00 08 00
> (da2:umass-sim0:0:0:0): CAM status: CCB request completed with an error
> (da2:umass-sim0:0:0:0): Retrying command, 0 more tries remain
> (da2:umass-sim0:0:0:0): WRITE(10). CDB: 2a 00 b4 00 20 40 00 00 08 00
> (da2:umass-sim0:0:0:0): CAM status: CCB request completed with an error
> (da2:umass-sim0:0:0:0): Error 5, Retries exhausted
> (da2:umass-sim0:0:0:0): WRITE(10). CDB: 2a 00 ba 00 20 48 00 00 08 00
> (da2:umass-sim0:0:0:0): CAM status: CCB request completed with an error
> (da2:umass-sim0:0:0:0): Retrying command, 3 more tries remain
> (da2:umass-sim0:0:0:0): WRITE(10). CDB: 2a 00 ba 00 20 48 00 00 08 00
> (da2:umass-sim0:0:0:0): CAM status: CCB request completed with an error
> (da2:umass-sim0:0:0:0): Retrying command, 2 more tries remain
> (da2:umass-sim0:0:0:0): WRITE(10). CDB: 2a 00 ba 00 20 48 00 00 08 00
> (da2:umass-sim0:0:0:0): CAM status: CCB request completed with an error
> (da2:umass-sim0:0:0:0): Retrying command, 1 more tries remain
> (da2:umass-sim0:0:0:0): WRITE(10). CDB: 2a 00 ba 00 20 48 00 00 08 00
> (da2:umass-sim0:0:0:0): CAM status: CCB request completed with an error
> (da2:umass-sim0:0:0:0): Retrying command, 0 more tries remain
> (da2:umass-sim0:0:0:0): WRITE(10). CDB: 2a 00 ba 00 20 48 00 00 08 00
> (da2:umass-sim0:0:0:0): CAM status: CCB request completed with an error
> (da2:umass-sim0:0:0:0): Error 5, Retries exhausted
> (da2:umass-sim0:0:0:0): WRITE(10). CDB: 2a 00 b4 00 20 28 00 00 18 00
> (da2:umass-sim0:0:0:0): CAM status: CCB request completed with an error
> (da2:umass-sim0:0:0:0): Retrying command, 3 more tries remain
> (da2:umass-sim0:0:0:0): WRITE(10). CDB: 2a 00 b4 00 20 28 00 00 18 00
> (da2:umass-sim0:0:0:0): CAM status: CCB request completed with an error
> (da2:umass-sim0:0:0:0): Retrying command, 2 more tries remain
> (da2:umass-sim0:0:0:0): WRITE(10). CDB: 2a 00 b4 00 20 28 00 00 18 00
> (da2:umass-sim0:0:0:0): CAM status: CCB request completed with an error
> (da2:umass-sim0:0:0:0): Retrying command, 1 more tries remain
> Solaris: WARNING: Pool 'zoobackup' has encountered an uncorrectable I/O
> failure and has been suspended.
>
> On 12/18/2020 10:08 AM, George Neville-Neil wrote:
>> OK, once we get the backup complete we should probably work on the
>> rest of the cleanup. Let me know if and how I can help.
>>
>> Best,
>> George
>>
>> On 18 Dec 2020, at 9:14, mike tancsa wrote:
>>
>>> Hi George,
>>>
>>> I think the boot loader is now fixed as those features are white
>>> listed. Will start backups once again via zrepl.
>>>
>>> ---Mike
>>>
>>> On 12/17/2020 1:58 PM, George Neville-Neil wrote:
>>>> Howdy,
>>>>
>>>> How do we want to handle the old tank stuff?
>>>>
>>>> Best,
>>>> George
>>>>
>>>> On 15 Dec 2020, at 16:24, mike tancsa wrote:
>>>>
>>>>> OK, thanks to Josh P's suggestion, deleting the v2 bookmarks from the
>>>>> pool allowed us to boot.
>>>>>
>>>>> Booted from a temp drive, imported the pool,
>>>>>
>>>>> root@zoo-temp:~ # zpool import -R /mnt -f zooroot
>>>>> root@zoo-temp:~ # zfs list -t bookmark | grep ^z | awk '{print "zfs
>>>>> destroy "$1}'
>>>>> zfs destroy zooroot#zrepl_CURSOR_G_77296a02a81c78cc_J_push_to_drive
>>>>> zfs destroy
>>>>> zooroot/ROOT#zrepl_CURSOR_G_e27691751ed1660b_J_push_to_drive
>>>>> zfs destroy
>>>>> zooroot/ROOT/default#zrepl_CURSOR_G_607fa8e4c7df13b5_J_push_to_drive
>>>>> zfs destroy
>>>>> zooroot/tmp#zrepl_CURSOR_G_25ae8e2b8723a008_J_push_to_drive
>>>>> zfs destroy
>>>>> zooroot/usr#zrepl_CURSOR_G_344a884262b3e387_J_push_to_drive
>>>>> zfs destroy
>>>>> zooroot/usr/home#zrepl_CURSOR_G_2e4087f8f219bd83_J_push_to_drive
>>>>> zfs destroy
>>>>> zooroot/usr/ports#zrepl_CURSOR_G_fb8384d458dd82b3_J_push_to_drive
>>>>> zfs destroy
>>>>> zooroot/usr/src#zrepl_CURSOR_G_b867573acd8a57f8_J_push_to_drive
>>>>> zfs destroy
>>>>> zooroot/var#zrepl_CURSOR_G_ea9efdf01fdf65b5_J_push_to_drive
>>>>> zfs destroy
>>>>> zooroot/var/audit#zrepl_CURSOR_G_e71132efb0fee45a_J_push_to_drive
>>>>> zfs destroy
>>>>> zooroot/var/crash#zrepl_CURSOR_G_191c17e9538113f4_J_push_to_drive
>>>>> zfs destroy
>>>>> zooroot/var/log#zrepl_CURSOR_G_f30668295109ad60_J_push_to_drive
>>>>> zfs destroy
>>>>> zooroot/var/mail#zrepl_CURSOR_G_7d1eac92237e2603_J_push_to_drive
>>>>> zfs destroy
>>>>> zooroot/var/tmp#zrepl_CURSOR_G_d593288357e0a319_J_push_to_drive
>>>>> root@zoo-temp:~ # zfs list -t bookmark | grep ^z | awk '{print "zfs
>>>>> destroy "$1}' | sh
>>>>> root@zoo-temp:~ #
>>>>> root@zoo-temp:~ # zpool export zooroot
>>>>> root@zoo-temp:~ #
>>>>>
>>>>> and rebooted and its up. Sadly, will need to come up with another
>>>>> backup
>>>>> system as sysutils/zrepl uses bookmarks :(
>>>>>
>>>>> ---Mike
>>>>>
>>>>> On 12/15/2020 1:46 PM, mike tancsa wrote:
>>>>>> Looks like the loader does not support v2 bookmarks. I am going to
>>>>>> get
>>>>>> Paul to put in another disk to boot from, mjg will login, either
>>>>>> destroy
>>>>>> the bookmarks or hack a loader fix that will allow the box to boot
>>>>>> with
>>>>>> this feature. Will be an hour or so as we have a office meeting
>>>>>> at 2pm
>>>>>> we both have to attend.
>>>>>>
>>>>>> ---Mike
>>>>>>
>>>>>> On 12/15/2020 1:28 PM, mike tancsa wrote:
>>>>>>> I am guessing because I was using zrepl from the ports to do
>>>>>>> replication
>>>>>>> / backup to a secondary disk, the use of the bookmark_v2 feature is
>>>>>>> not
>>>>>>> supported on ZoL ? Any way to recover from this ?
>>>>>>>
>>>>>>> On 12/15/2020 1:10 PM, mike tancsa wrote:
>>>>>>>> OK, but the first problem to deal with :(
>>>>>>>>
>>>>>>>> BIOS drive C: is
>>>>>>>> disk0
>>>>>>>> BIOS drive D: is
>>>>>>>> disk1
>>>>>>>> ZFS: unsupported feature:
>>>>>>>> com.datto:bookmark_v2
>>>>>>>> ZFS: pool zooroot is not
>>>>>>>> supported
>>>>>>>>
>>>>>>>> Can't find
>>>>>>>> /boot/zfsloader
>>>>>>>>
>>>>>>>> Can't find
>>>>>>>> /boot/loader
>>>>>>>>
>>>>>>>> Can't find
>>>>>>>> /boot/kernel/kernel
>>>>>>>>
>>>>>>>> FreeBSD/x86
>>>>>>>> boot
>>>>>>>> Default:
>>>>>>>> /boot/kernel/kernel
>>>>>>>>
>>>>>>>> boot:
>>>>>>>>
>>>>>>>> Can't find
>>>>>>>> /boot/kernel/kernel
>>>>>>>>
>>>>>>>> FreeBSD/x86
>>>>>>>> boot
>>>>>>>> Default:
>>>>>>>> /boot/kernel/kernel
>>>>>>>>
>>>>>>>> boot:
>>>>>>>>
>>>>>>>> On 12/15/2020 1:02 PM, Mateusz Guzik wrote:
>>>>>>>>> We need to update to r368649 for a pmap fix regardless of the
>>>>>>>>> above. I
>>>>>>>>> can do the work and make the box ready for the next reboot.
>>>>>>>>>
>>>>>>>>> On 12/15/20, mike tancsa <mike@sentex.net> wrote:
>>>>>>>>>> The USB backup disk was throwing errors and I was trying to
>>>>>>>>>> export the
>>>>>>>>>> backup pool and it looks like the box is hung now. I am going to
>>>>>>>>>> power
>>>>>>>>>> cycle it
>>>>>>>>>>
>>>>>>>>>> ---Mike
>>>>>>>>>>
>>>>
>>
>

--
Mateusz Guzik <mjguzik gmail.com>
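
The failing commands in the quoted log are all WRITE(10) requests, and the CDB bytes that CAM prints can be decoded by hand to see which blocks were being written. What follows is a minimal sketch, assuming the standard SCSI WRITE(10) layout (opcode 0x2a in byte 0, a big-endian logical block address in bytes 2-5, a big-endian transfer length in bytes 7-8). decode_write10 is a hypothetical helper name used only for illustration; it is not a FreeBSD tool and was not part of the thread.

```sh
#!/bin/sh
# Decode a WRITE(10) CDB given as ten lowercase hex bytes, in the form
# CAM prints them, e.g.:  2a 00 04 f6 a5 a8 00 00 08 00
# Assumed layout: byte 0 = opcode (0x2a), bytes 2-5 = LBA (big-endian),
# bytes 7-8 = transfer length in logical blocks (big-endian).
decode_write10() {
    # Reject anything that is not exactly ten bytes starting with 2a.
    if [ "$#" -ne 10 ] || [ "$1" != "2a" ]; then
        echo "not a WRITE(10) CDB" >&2
        return 1
    fi
    # $3..$6 are CDB bytes 2-5; concatenated they form the 32-bit LBA.
    lba=$((0x$3$4$5$6))
    # $8 and $9 are CDB bytes 7-8; concatenated they form the block count.
    blocks=$((0x$8$9))
    echo "lba=$lba blocks=$blocks"
}
```

For the first failing command in the log, `decode_write10 2a 00 04 f6 a5 a8 00 00 08 00` prints `lba=83273128 blocks=8`, which is a 4 KiB write given the 512-byte sectors da2 reports.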
Want to link to this message? Use this URL: <https://mail-archive.FreeBSD.org/cgi/mid.cgi?CAGudoHFDLu_MDT1H7xgcX5cXAEi8g_a1Kq8DumO6fJKQq6zBbg>