Date:      Sat, 19 Dec 2020 21:16:01 +0100
From:      Mateusz Guzik <mjguzik@gmail.com>
To:        mike tancsa <mike@sentex.net>
Cc:        George Neville-Neil <gnn@neville-neil.com>,  "netperf-admin@FreeBSD.org" <netperf-admin@freebsd.org>, netperf-users@freebsd.org,  Paul Holes <pholes@sentex.ca>, Hans Petter Selasky <hps@selasky.org>
Subject:   Re: zoo back online (was Re: zoo hang)
Message-ID:  <CAGudoHFDLu_MDT1H7xgcX5cXAEi8g_a1Kq8DumO6fJKQq6zBbg@mail.gmail.com>
In-Reply-To: <837ce2bc-9731-85b0-c6a5-1b3c7bcadb72@sentex.net>
References:  <5483e76e-4a2f-3153-c10b-7902839c1b68@sentex.net> <CAGudoHF-3XhWQq-x8vROdUJ0sTweha2YEK_LXVwv44E4k=TtmQ@mail.gmail.com> <a55a69da-c9c6-eb18-9975-3572457ae5ef@sentex.net> <8c26a0d3-3bd0-7535-0abc-3d1e9e5ac7c4@sentex.net> <64923d33-4bf2-0fd5-1b17-d6bd73e9fd32@sentex.net> <13a9ab42-1df8-c054-0c83-5708ab9d9e2b@sentex.net> <C94AED22-A984-49ED-8D18-FD4856D70E01@neville-neil.com> <6cef40cd-de57-aa84-bc70-ceea71add397@sentex.net> <F0FA8C48-1DB1-4D63-ACD4-3ADD78AFA568@neville-neil.com> <837ce2bc-9731-85b0-c6a5-1b3c7bcadb72@sentex.net>

I'm adding hps for USB stack comments.

On 12/19/20, mike tancsa <mike@sentex.net> wrote:
> Hmm, this has happened again. Not sure if it's a bug with the driver, the
> firmware, or both, but after a period of time the USB drive starts to
> throw errors.  This unit was working fine on RELENG12, and we swapped it
> with another drive too, but got the same results. The drive is clean.
>
> smartctl -a /dev/da2 -T permissive
>
>
>
> da2 at umass-sim0 bus 0 scbus14 target 0 lun 0
> da2: <WDC WD40 EFRX-68WT0N0 0105> Fixed Direct Access SPC-4 SCSI device
> da2: Serial Number 00000000000000000000
> da2: 400.000MB/s transfers
> da2: 3815447MB (7814037168 512 byte sectors)
> da2: quirks=0xa<NO_6_BYTE,4K>
> (da2:umass-sim0:0:0:0): WRITE(10). CDB: 2a 00 04 f6 a5 a8 00 00 08 00
> (da2:umass-sim0:0:0:0): CAM status: CCB request completed with an error
> (da2:umass-sim0:0:0:0): Retrying command, 3 more tries remain
> (da2:umass-sim0:0:0:0): WRITE(10). CDB: 2a 00 04 f6 a5 a8 00 00 08 00
> (da2:umass-sim0:0:0:0): CAM status: CCB request completed with an error
> (da2:umass-sim0:0:0:0): Retrying command, 2 more tries remain
> (da2:umass-sim0:0:0:0): WRITE(10). CDB: 2a 00 04 f6 a5 a8 00 00 08 00
> (da2:umass-sim0:0:0:0): CAM status: CCB request completed with an error
> (da2:umass-sim0:0:0:0): Retrying command, 1 more tries remain
> (da2:umass-sim0:0:0:0): WRITE(10). CDB: 2a 00 04 f6 a5 a8 00 00 08 00
> (da2:umass-sim0:0:0:0): CAM status: CCB request completed with an error
> (da2:umass-sim0:0:0:0): Retrying command, 0 more tries remain
> (da2:umass-sim0:0:0:0): WRITE(10). CDB: 2a 00 04 f6 a5 a8 00 00 08 00
> (da2:umass-sim0:0:0:0): CAM status: CCB request completed with an error
> (da2:umass-sim0:0:0:0): Error 5, Retries exhausted
> (da2:umass-sim0:0:0:0): WRITE(10). CDB: 2a 00 04 5c 65 f0 00 00 40 00
> (da2:umass-sim0:0:0:0): CAM status: CCB request completed with an error
> (da2:umass-sim0:0:0:0): Retrying command, 3 more tries remain
> (da2:umass-sim0:0:0:0): WRITE(10). CDB: 2a 00 04 5c 65 f0 00 00 40 00
> (da2:umass-sim0:0:0:0): CAM status: CCB request completed with an error
> (da2:umass-sim0:0:0:0): Retrying command, 2 more tries remain
> (da2:umass-sim0:0:0:0): WRITE(10). CDB: 2a 00 04 5c 65 f0 00 00 40 00
> (da2:umass-sim0:0:0:0): CAM status: CCB request completed with an error
> (da2:umass-sim0:0:0:0): Retrying command, 1 more tries remain
> (da2:umass-sim0:0:0:0): WRITE(10). CDB: 2a 00 04 5c 65 f0 00 00 40 00
> (da2:umass-sim0:0:0:0): CAM status: CCB request completed with an error
> (da2:umass-sim0:0:0:0): Retrying command, 0 more tries remain
> (da2:umass-sim0:0:0:0): WRITE(10). CDB: 2a 00 b4 00 20 40 00 00 08 00
> (da2:umass-sim0:0:0:0): CAM status: CCB request completed with an error
> (da2:umass-sim0:0:0:0): Retrying command, 3 more tries remain
> (da2:umass-sim0:0:0:0): WRITE(10). CDB: 2a 00 b4 00 20 40 00 00 08 00
> (da2:umass-sim0:0:0:0): CAM status: CCB request completed with an error
> (da2:umass-sim0:0:0:0): Retrying command, 2 more tries remain
> (da2:umass-sim0:0:0:0): WRITE(10). CDB: 2a 00 b4 00 20 40 00 00 08 00
> (da2:umass-sim0:0:0:0): CAM status: CCB request completed with an error
> (da2:umass-sim0:0:0:0): Retrying command, 1 more tries remain
> (da2:umass-sim0:0:0:0): WRITE(10). CDB: 2a 00 b4 00 20 40 00 00 08 00
> (da2:umass-sim0:0:0:0): CAM status: CCB request completed with an error
> (da2:umass-sim0:0:0:0): Retrying command, 0 more tries remain
> (da2:umass-sim0:0:0:0): WRITE(10). CDB: 2a 00 b4 00 20 40 00 00 08 00
> (da2:umass-sim0:0:0:0): CAM status: CCB request completed with an error
> (da2:umass-sim0:0:0:0): Error 5, Retries exhausted
> (da2:umass-sim0:0:0:0): WRITE(10). CDB: 2a 00 ba 00 20 48 00 00 08 00
> (da2:umass-sim0:0:0:0): CAM status: CCB request completed with an error
> (da2:umass-sim0:0:0:0): Retrying command, 3 more tries remain
> (da2:umass-sim0:0:0:0): WRITE(10). CDB: 2a 00 ba 00 20 48 00 00 08 00
> (da2:umass-sim0:0:0:0): CAM status: CCB request completed with an error
> (da2:umass-sim0:0:0:0): Retrying command, 2 more tries remain
> (da2:umass-sim0:0:0:0): WRITE(10). CDB: 2a 00 ba 00 20 48 00 00 08 00
> (da2:umass-sim0:0:0:0): CAM status: CCB request completed with an error
> (da2:umass-sim0:0:0:0): Retrying command, 1 more tries remain
> (da2:umass-sim0:0:0:0): WRITE(10). CDB: 2a 00 ba 00 20 48 00 00 08 00
> (da2:umass-sim0:0:0:0): CAM status: CCB request completed with an error
> (da2:umass-sim0:0:0:0): Retrying command, 0 more tries remain
> (da2:umass-sim0:0:0:0): WRITE(10). CDB: 2a 00 ba 00 20 48 00 00 08 00
> (da2:umass-sim0:0:0:0): CAM status: CCB request completed with an error
> (da2:umass-sim0:0:0:0): Error 5, Retries exhausted
> (da2:umass-sim0:0:0:0): WRITE(10). CDB: 2a 00 b4 00 20 28 00 00 18 00
> (da2:umass-sim0:0:0:0): CAM status: CCB request completed with an error
> (da2:umass-sim0:0:0:0): Retrying command, 3 more tries remain
> (da2:umass-sim0:0:0:0): WRITE(10). CDB: 2a 00 b4 00 20 28 00 00 18 00
> (da2:umass-sim0:0:0:0): CAM status: CCB request completed with an error
> (da2:umass-sim0:0:0:0): Retrying command, 2 more tries remain
> (da2:umass-sim0:0:0:0): WRITE(10). CDB: 2a 00 b4 00 20 28 00 00 18 00
> (da2:umass-sim0:0:0:0): CAM status: CCB request completed with an error
> (da2:umass-sim0:0:0:0): Retrying command, 1 more tries remain
> Solaris: WARNING: Pool 'zoobackup' has encountered an uncorrectable I/O failure and has been suspended.
>
>
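For anyone reading the log above, the CDB bytes can be decoded to see which blocks the failing writes targeted. The following is a minimal sketch, assuming the standard WRITE(10) layout (opcode 0x2a in byte 0, big-endian LBA in bytes 2-5, big-endian transfer length in bytes 7-8) and the 512-byte sector size reported by da2. The function name decode_write10 is hypothetical and is not part of any FreeBSD tool.

```python
def decode_write10(cdb_hex, block_size=512):
    """Decode a SCSI WRITE(10) CDB given as space-separated hex bytes.

    block_size defaults to 512 because da2 reports 512 byte sectors;
    adjust it if the device reports something else.
    """
    cdb = bytes.fromhex(cdb_hex)
    if len(cdb) != 10:
        raise ValueError("WRITE(10) CDB must be exactly 10 bytes")
    if cdb[0] != 0x2A:
        raise ValueError("opcode is not WRITE(10) (0x2a)")
    lba = int.from_bytes(cdb[2:6], "big")     # bytes 2-5: logical block address
    blocks = int.from_bytes(cdb[7:9], "big")  # bytes 7-8: transfer length in blocks
    return {
        "lba": lba,
        "blocks": blocks,
        "byte_offset": lba * block_size,
        "byte_length": blocks * block_size,
    }


if __name__ == "__main__":
    # CDBs copied from the console log quoted above.
    for cdb in (
        "2a 00 04 f6 a5 a8 00 00 08 00",
        "2a 00 04 5c 65 f0 00 00 40 00",
        "2a 00 b4 00 20 28 00 00 18 00",
    ):
        info = decode_write10(cdb)
        print(cdb, "->", hex(info["lba"]), info["blocks"], "blocks")
```

Under those assumptions, the first failing command is an 8-block (4 KiB) write at LBA 0x04f6a5a8, and the failures are spread over widely separated LBAs rather than clustered at one location.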
> On 12/18/2020 10:08 AM, George Neville-Neil wrote:
>> OK, once we get the backup complete we should probably work on the
>> rest of the cleanup.  Let me know if and how I can help.
>>
>> Best,
>> George
>>
>>
>> On 18 Dec 2020, at 9:14, mike tancsa wrote:
>>
>>> Hi George,
>>>
>>>     I think the boot loader is now fixed, as those features are
>>> whitelisted.  Will start backups once again via zrepl.
>>>
>>>     ---Mike
>>>
>>> On 12/17/2020 1:58 PM, George Neville-Neil wrote:
>>>> Howdy,
>>>>
>>>> How do we want to handle the old tank stuff?
>>>>
>>>> Best,
>>>> George
>>>>
>>>>
>>>> On 15 Dec 2020, at 16:24, mike tancsa wrote:
>>>>
>>>>> OK, thanks to Josh P's suggestion, deleting the v2 bookmarks from the
>>>>> pool allowed us to boot.
>>>>>
>>>>> Booted from a temp drive, imported the pool,
>>>>>
>>>>> root@zoo-temp:~ # zpool import -R /mnt -f zooroot
>>>>> root@zoo-temp:~ # zfs list -t bookmark | grep ^z | awk '{print "zfs destroy "$1}'
>>>>> zfs destroy zooroot#zrepl_CURSOR_G_77296a02a81c78cc_J_push_to_drive
>>>>> zfs destroy zooroot/ROOT#zrepl_CURSOR_G_e27691751ed1660b_J_push_to_drive
>>>>> zfs destroy zooroot/ROOT/default#zrepl_CURSOR_G_607fa8e4c7df13b5_J_push_to_drive
>>>>> zfs destroy zooroot/tmp#zrepl_CURSOR_G_25ae8e2b8723a008_J_push_to_drive
>>>>> zfs destroy zooroot/usr#zrepl_CURSOR_G_344a884262b3e387_J_push_to_drive
>>>>> zfs destroy zooroot/usr/home#zrepl_CURSOR_G_2e4087f8f219bd83_J_push_to_drive
>>>>> zfs destroy zooroot/usr/ports#zrepl_CURSOR_G_fb8384d458dd82b3_J_push_to_drive
>>>>> zfs destroy zooroot/usr/src#zrepl_CURSOR_G_b867573acd8a57f8_J_push_to_drive
>>>>> zfs destroy zooroot/var#zrepl_CURSOR_G_ea9efdf01fdf65b5_J_push_to_drive
>>>>> zfs destroy zooroot/var/audit#zrepl_CURSOR_G_e71132efb0fee45a_J_push_to_drive
>>>>> zfs destroy zooroot/var/crash#zrepl_CURSOR_G_191c17e9538113f4_J_push_to_drive
>>>>> zfs destroy zooroot/var/log#zrepl_CURSOR_G_f30668295109ad60_J_push_to_drive
>>>>> zfs destroy zooroot/var/mail#zrepl_CURSOR_G_7d1eac92237e2603_J_push_to_drive
>>>>> zfs destroy zooroot/var/tmp#zrepl_CURSOR_G_d593288357e0a319_J_push_to_drive
>>>>> root@zoo-temp:~ # zfs list -t bookmark | grep ^z | awk '{print "zfs destroy "$1}' | sh
>>>>> root@zoo-temp:~ #
>>>>> root@zoo-temp:~ # zpool export zooroot
>>>>> root@zoo-temp:~ #
>>>>>
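The pipeline quoted above destroys every bookmark whose name starts with "z". A narrower variant is sketched below. It assumes the listing comes from `zfs list -H -t bookmark -o name` and only emits commands for bookmarks whose name after the "#" begins with zrepl_CURSOR_, as in the transcript. The function name is hypothetical. It only builds the command strings so they can be reviewed before anything is run.

```python
import shlex
from typing import List


def zrepl_cursor_destroy_commands(listing: str) -> List[str]:
    """Build 'zfs destroy' commands for zrepl cursor bookmarks only.

    listing is the text output of a bookmark listing, one bookmark per
    line, with the full name (dataset#bookmark) in the first column.
    Lines without a '#' (for example a header row) are ignored.
    """
    commands = []
    for line in listing.splitlines():
        fields = line.split()
        if not fields:
            continue
        name = fields[0]
        _dataset, sep, bookmark = name.partition("#")
        if sep and bookmark.startswith("zrepl_CURSOR_"):
            # Quote the name so the '#' is never treated as a shell comment.
            commands.append("zfs destroy " + shlex.quote(name))
    return commands


if __name__ == "__main__":
    sample = (
        "zooroot#zrepl_CURSOR_G_77296a02a81c78cc_J_push_to_drive\n"
        "zooroot/usr/home#zrepl_CURSOR_G_2e4087f8f219bd83_J_push_to_drive\n"
    )
    for cmd in zrepl_cursor_destroy_commands(sample):
        print(cmd)
```

Printing the commands first, as the original transcript did before piping to sh, keeps the destructive step separate from the selection step.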
>>>>> and rebooted, and it's up. Sadly, we will need to come up with another
>>>>> backup system, as sysutils/zrepl uses bookmarks :(
>>>>>
>>>>>     ---Mike
>>>>>
>>>>> On 12/15/2020 1:46 PM, mike tancsa wrote:
>>>>>> Looks like the loader does not support v2 bookmarks. I am going to get
>>>>>> Paul to put in another disk to boot from; mjg will log in and either
>>>>>> destroy the bookmarks or hack a loader fix that will allow the box to
>>>>>> boot with this feature.  It will be an hour or so, as we have an office
>>>>>> meeting at 2pm that we both have to attend.
>>>>>>
>>>>>>     ---Mike
>>>>>>
>>>>>> On 12/15/2020 1:28 PM, mike tancsa wrote:
>>>>>>> I am guessing that, because I was using zrepl from ports to do
>>>>>>> replication / backup to a secondary disk, the pool uses the
>>>>>>> bookmark_v2 feature, which is not supported on ZoL? Any way to
>>>>>>> recover from this?
>>>>>>>
>>>>>>>
>>>>>>> On 12/15/2020 1:10 PM, mike tancsa wrote:
>>>>>>>> OK, but the first problem to deal with :(
>>>>>>>>
>>>>>>>> BIOS drive C: is disk0
>>>>>>>> BIOS drive D: is disk1
>>>>>>>> ZFS: unsupported feature: com.datto:bookmark_v2
>>>>>>>> ZFS: pool zooroot is not supported
>>>>>>>>
>>>>>>>> Can't find /boot/zfsloader
>>>>>>>>
>>>>>>>> Can't find /boot/loader
>>>>>>>>
>>>>>>>> Can't find /boot/kernel/kernel
>>>>>>>>
>>>>>>>> FreeBSD/x86 boot
>>>>>>>> Default: /boot/kernel/kernel
>>>>>>>> boot:
>>>>>>>>
>>>>>>>> Can't find /boot/kernel/kernel
>>>>>>>>
>>>>>>>> FreeBSD/x86 boot
>>>>>>>> Default: /boot/kernel/kernel
>>>>>>>> boot:
>>>>>>>>
>>>>>>>> On 12/15/2020 1:02 PM, Mateusz Guzik wrote:
>>>>>>>>> We need to update to r368649 for a pmap fix regardless of the above.
>>>>>>>>> I can do the work and make the box ready for the next reboot.
>>>>>>>>>
>>>>>>>>> On 12/15/20, mike tancsa <mike@sentex.net> wrote:
>>>>>>>>>> The USB backup disk was throwing errors, and while I was trying to
>>>>>>>>>> export the backup pool the box appears to have hung. I am going to
>>>>>>>>>> power cycle it.
>>>>>>>>>>
>>>>>>>>>>     ---Mike
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>
>>
>


-- 
Mateusz Guzik <mjguzik gmail.com>


