Date:      Mon, 21 Dec 2020 09:39:38 -0500
From:      mike tancsa <mike@sentex.net>
To:        Hans Petter Selasky <hps@selasky.org>, Mateusz Guzik <mjguzik@gmail.com>
Cc:        George Neville-Neil <gnn@neville-neil.com>, "netperf-admin@FreeBSD.org" <netperf-admin@freebsd.org>, netperf-users@freebsd.org, Paul Holes <pholes@sentex.ca>
Subject:   zoo reboot 16:00 UTC
Message-ID:  <35a20f0a-f13e-9dcc-a814-dbd116cfb9c0@sentex.net>
In-Reply-To: <da7ac6aa-2be5-1536-c702-ca40a53235a6@sentex.net>
References:  <5483e76e-4a2f-3153-c10b-7902839c1b68@sentex.net> <CAGudoHF-3XhWQq-x8vROdUJ0sTweha2YEK_LXVwv44E4k=TtmQ@mail.gmail.com> <a55a69da-c9c6-eb18-9975-3572457ae5ef@sentex.net> <8c26a0d3-3bd0-7535-0abc-3d1e9e5ac7c4@sentex.net> <64923d33-4bf2-0fd5-1b17-d6bd73e9fd32@sentex.net> <13a9ab42-1df8-c054-0c83-5708ab9d9e2b@sentex.net> <C94AED22-A984-49ED-8D18-FD4856D70E01@neville-neil.com> <6cef40cd-de57-aa84-bc70-ceea71add397@sentex.net> <F0FA8C48-1DB1-4D63-ACD4-3ADD78AFA568@neville-neil.com> <837ce2bc-9731-85b0-c6a5-1b3c7bcadb72@sentex.net> <CAGudoHFDLu_MDT1H7xgcX5cXAEi8g_a1Kq8DumO6fJKQq6zBbg@mail.gmail.com> <7c508e03-7575-b06a-3b14-f8b6e1ed10db@sentex.net> <51c4dfed-a42a-a820-816c-f89691e853e7@selasky.org> <da7ac6aa-2be5-1536-c702-ca40a53235a6@sentex.net>

Looks like the drive / USB process is wedged.  Going to have to reboot
zoo to clear it for now, as I can't export or destroy the pool and certain
disk I/O is also hanging (e.g. df).

    ---Mike

On 12/20/2020 3:01 PM, mike tancsa wrote:
> The dump file is rather large. It's on zoo.freebsd.org in
> /tmp/disk-capture3.dump.  I let the backup run until it hit the errors
> again, and am trying to do a zpool clear; zpool export but it's stalling
> out. CTRL+T shows
>
> load: 0.14  cmd: zpool 77888 [g_waitidle] 324.72r 0.00u 0.03s 0% 7184k
> mi_switch+0xc1 sleepq_timedwait+0x2f _sleep+0x1ab g_waitidle+0x8f
> ast+0x2e3 doreti_ast+0x1f
> load: 0.14  cmd: zpool 77888 [g_waitidle] 328.14r 0.00u 0.03s 0% 7184k
> mi_switch+0xc1 sleepq_timedwait+0x2f _sleep+0x1ab g_waitidle+0x8f
> ast+0x2e3 doreti_ast+0x1f
> load: 0.14  cmd: zpool 77888 [g_waitidle] 328.69r 0.00u 0.03s 0% 7184k
> mi_switch+0xc1 sleepq_timedwait+0x2f _sleep+0x1ab g_waitidle+0x8f
> ast+0x2e3 doreti_ast+0x1f
> load: 0.14  cmd: zpool 77888 [g_waitidle] 329.56r 0.00u 0.03s 0% 7184k
> mi_switch+0xc1 sleepq_timedwait+0x2f _sleep+0x1ab g_waitidle+0x8f
> ast+0x2e3 doreti_ast+0x1f
> load: 0.13  cmd: zpool 77888 [g_waitidle] 330.16r 0.00u 0.03s 0% 7184k
> mi_switch+0xc1 sleepq_timedwait+0x2f _sleep+0x1ab g_waitidle+0x8f
> ast+0x2e3 doreti_ast+0x1f
> load: 0.13  cmd: zpool 77888 [g_waitidle] 330.65r 0.00u 0.03s 0% 7184k
> mi_switch+0xc1 sleepq_timedwait+0x2f _sleep+0x1ab g_waitidle+0x8f
> ast+0x2e3 doreti_ast+0x1f
> load: 0.13  cmd: zpool 77888 [g_waitidle] 331.76r 0.00u 0.03s 0% 7184k
> mi_switch+0xc1 sleepq_timedwait+0x2f _sleep+0x1ab g_waitidle+0x8f
> ast+0x2e3 doreti_ast+0x1f
> load: 0.13  cmd: zpool 77888 [g_waitidle] 332.24r 0.00u 0.03s 0% 7184k
> mi_switch+0xc1 sleepq_timedwait+0x2f _sleep+0x1ab g_waitidle+0x8f
> ast+0x2e3 doreti_ast+0x1f
>
> and I can't kill the clear now
>
> The dump does not show a lot other than variations of the below
>
> 19:42:12.541808 usbus0.4 SUBM-CTRL-EP=00000080,SPD=HIGH,NFR=2,SLEN=8,IVAL=0
> 19:42:12.541921 usbus0.4
> DONE-CTRL-EP=00000080,SPD=HIGH,NFR=2,SLEN=4,IVAL=0,ERR=0
> 19:42:12.541935 usbus0.4 SUBM-CTRL-EP=00000080,SPD=HIGH,NFR=2,SLEN=8,IVAL=0
> 19:42:12.542055 usbus0.4
> DONE-CTRL-EP=00000080,SPD=HIGH,NFR=2,SLEN=4,IVAL=0,ERR=0
> 19:42:12.542069 usbus0.4 SUBM-CTRL-EP=00000080,SPD=HIGH,NFR=2,SLEN=8,IVAL=0
> 19:42:12.542190 usbus0.4
> DONE-CTRL-EP=00000080,SPD=HIGH,NFR=2,SLEN=4,IVAL=0,ERR=0
> 19:42:16.566334 usbus0.4 SUBM-CTRL-EP=00000080,SPD=HIGH,NFR=2,SLEN=8,IVAL=0
> 19:42:16.566484 usbus0.4
> DONE-CTRL-EP=00000080,SPD=HIGH,NFR=2,SLEN=4,IVAL=0,ERR=0
> 19:42:16.566499 usbus0.4 SUBM-CTRL-EP=00000080,SPD=HIGH,NFR=2,SLEN=8,IVAL=0
> 19:42:16.566614 usbus0.4
> DONE-CTRL-EP=00000080,SPD=HIGH,NFR=2,SLEN=4,IVAL=0,ERR=0
> 19:42:16.566629 usbus0.4 SUBM-CTRL-EP=00000080,SPD=HIGH,NFR=2,SLEN=8,IVAL=0
> 19:42:16.566746 usbus0.4
> DONE-CTRL-EP=00000080,SPD=HIGH,NFR=2,SLEN=4,IVAL=0,ERR=0
> 19:42:16.566782 usbus0.4 SUBM-CTRL-EP=00000080,SPD=HIGH,NFR=2,SLEN=8,IVAL=0
> 19:42:16.566876 usbus0.4
> DONE-CTRL-EP=00000080,SPD=HIGH,NFR=2,SLEN=4,IVAL=0,ERR=0
> 19:42:18.196864 usbus0.6
> DONE-BULK-EP=00000081,SPD=SUPER,NFR=0,SLEN=0,IVAL=0,ERR=TIMEOUT
> 19:42:18.196871 usbus0.6 SUBM-CTRL-EP=00000000,SPD=SUPER,NFR=1,SLEN=8,IVAL=0
> 19:42:18.196992 usbus0.6
> DONE-CTRL-EP=00000000,SPD=SUPER,NFR=1,SLEN=0,IVAL=0,ERR=0
> 19:42:18.197003 usbus0.6 SUBM-BULK-EP=00000081,SPD=SUPER,NFR=1,SLEN=0,IVAL=0
> 19:42:20.582625 usbus0.4 SUBM-CTRL-EP=00000080,SPD=HIGH,NFR=2,SLEN=8,IVAL=0
> 19:42:20.582777 usbus0.4
> DONE-CTRL-EP=00000080,SPD=HIGH,NFR=2,SLEN=4,IVAL=0,ERR=0
> 19:42:20.582794 usbus0.4 SUBM-CTRL-EP=00000080,SPD=HIGH,NFR=2,SLEN=8,IVAL=0
> 19:42:20.582909 usbus0.4
> DONE-CTRL-EP=00000080,SPD=HIGH,NFR=2,SLEN=4,IVAL=0,ERR=0
> 19:42:20.582938 usbus0.4 SUBM-CTRL-EP=00000080,SPD=HIGH,NFR=2,SLEN=8,IVAL=0
> 19:42:20.583038 usbus0.4
> DONE-CTRL-EP=00000080,SPD=HIGH,NFR=2,SLEN=4,IVAL=0,ERR=0
> 19:42:20.583054 usbus0.4 SUBM-CTRL-EP=00000080,SPD=HIGH,NFR=2,SLEN=8,IVAL=0
> 19:42:20.583193 usbus0.4
> DONE-CTRL-EP=00000080,SPD=HIGH,NFR=2,SLEN=4,IVAL=0,ERR=0
> 19:42:23.107860 usbus0.6
> DONE-BULK-EP=00000081,SPD=SUPER,NFR=0,SLEN=0,IVAL=0,ERR=TIMEOUT
> 19:42:23.107866 usbus0.6 SUBM-CTRL-EP=00000000,SPD=SUPER,NFR=1,SLEN=8,IVAL=0
> 19:42:23.107987 usbus0.6
> DONE-CTRL-EP=00000000,SPD=SUPER,NFR=1,SLEN=0,IVAL=0,ERR=0
> 19:42:23.107989 usbus0.6 SUBM-BULK-EP=00000081,SPD=SUPER,NFR=1,SLEN=0,IVAL=0
> 19:42:24.589628 usbus0.4 SUBM-CTRL-EP=00000080,SPD=HIGH,NFR=2,SLEN=8,IVAL=0
> 19:42:24.589784 usbus0.4
> DONE-CTRL-EP=00000080,SPD=HIGH,NFR=2,SLEN=4,IVAL=0,ERR=0
> 19:42:24.589803 usbus0.4 SUBM-CTRL-EP=00000080,SPD=HIGH,NFR=2,SLEN=8,IVAL=0
> 19:42:24.589914 usbus0.4
> DONE-CTRL-EP=00000080,SPD=HIGH,NFR=2,SLEN=4,IVAL=0,ERR=0
> 19:42:24.589932 usbus0.4 SUBM-CTRL-EP=00000080,SPD=HIGH,NFR=2,SLEN=8,IVAL=0
> 19:42:24.590070 usbus0.4
> DONE-CTRL-EP=00000080,SPD=HIGH,NFR=2,SLEN=4,IVAL=0,ERR=0
> 19:42:24.590086 usbus0.4 SUBM-CTRL-EP=00000080,SPD=HIGH,NFR=2,SLEN=8,IVAL=0
> 19:42:24.590177 usbus0.4
> DONE-CTRL-EP=00000080,SPD=HIGH,NFR=2,SLEN=4,IVAL=0,ERR=0
> 19:42:28.017135 usbus0.6
> DONE-BULK-EP=00000081,SPD=SUPER,NFR=0,SLEN=0,IVAL=0,ERR=TIMEOUT
> 19:42:28.546429 usbus0.6
> SUBM-CTRL-EP=00000000,SPD=SUPER,NFR=1,SLEN=8,IVAL=500
> 19:42:28.546582 usbus0.6
> DONE-CTRL-EP=00000000,SPD=SUPER,NFR=1,SLEN=0,IVAL=500,ERR=0
> 19:42:28.594025 usbus0.4 SUBM-CTRL-EP=00000080,SPD=HIGH,NFR=2,SLEN=8,IVAL=0
> 19:42:28.594186 usbus0.4
> DONE-CTRL-EP=00000080,SPD=HIGH,NFR=2,SLEN=4,IVAL=0,ERR=0
> 19:42:28.594201 usbus0.4 SUBM-CTRL-EP=00000080,SPD=HIGH,NFR=2,SLEN=8,IVAL=0
> 19:42:28.594322 usbus0.4
> DONE-CTRL-EP=00000080,SPD=HIGH,NFR=2,SLEN=4,IVAL=0,ERR=0
> 19:42:28.594341 usbus0.4 SUBM-CTRL-EP=00000080,SPD=HIGH,NFR=2,SLEN=8,IVAL=0
> 19:42:28.594449 usbus0.4
> DONE-CTRL-EP=00000080,SPD=HIGH,NFR=2,SLEN=4,IVAL=0,ERR=0
> 19:42:28.594465 usbus0.4 SUBM-CTRL-EP=00000080,SPD=HIGH,NFR=2,SLEN=8,IVAL=0
> 19:42:28.594581 usbus0.4
> DONE-CTRL-EP=00000080,SPD=HIGH,NFR=2,SLEN=4,IVAL=0,ERR=0
> 19:42:28.599175 usbus0.6
> SUBM-CTRL-EP=00000000,SPD=SUPER,NFR=1,SLEN=8,IVAL=50
> 19:42:28.599332 usbus0.6
> DONE-CTRL-EP=00000000,SPD=SUPER,NFR=1,SLEN=0,IVAL=50,ERR=0
> 19:42:28.652173 usbus0.6
> SUBM-CTRL-EP=00000000,SPD=SUPER,NFR=1,SLEN=8,IVAL=50
> 19:42:28.652328 usbus0.6
> DONE-CTRL-EP=00000000,SPD=SUPER,NFR=1,SLEN=0,IVAL=50,ERR=0
> 19:42:28.652332 usbus0.6
> SUBM-BULK-EP=00000002,SPD=SUPER,NFR=1,SLEN=32,IVAL=0
> 19:42:28.656448 usbus0.6
> DONE-BULK-EP=00000002,SPD=SUPER,NFR=1,SLEN=0,IVAL=0,ERR=IOERROR
> 19:42:29.164682 usbus0.6
> SUBM-CTRL-EP=00000000,SPD=SUPER,NFR=1,SLEN=8,IVAL=500
> 19:42:29.164835 usbus0.6
> DONE-CTRL-EP=00000000,SPD=SUPER,NFR=1,SLEN=0,IVAL=500,ERR=0
> 19:42:29.217174 usbus0.6
> SUBM-CTRL-EP=00000000,SPD=SUPER,NFR=1,SLEN=8,IVAL=50
> 19:42:29.217325 usbus0.6
> DONE-CTRL-EP=00000000,SPD=SUPER,NFR=1,SLEN=0,IVAL=50,ERR=0
> 19:42:29.270172 usbus0.6
> SUBM-CTRL-EP=00000000,SPD=SUPER,NFR=1,SLEN=8,IVAL=50
> 19:42:29.270326 usbus0.6
> DONE-CTRL-EP=00000000,SPD=SUPER,NFR=1,SLEN=0,IVAL=50,ERR=0
> 19:42:29.270331 usbus0.6
> SUBM-BULK-EP=00000002,SPD=SUPER,NFR=1,SLEN=32,IVAL=0
> 19:42:29.270717 usbus0.6
> DONE-BULK-EP=00000002,SPD=SUPER,NFR=1,SLEN=0,IVAL=0,ERR=0
> 19:42:29.270721 usbus0.6 SUBM-BULK-EP=00000081,SPD=SUPER,NFR=1,SLEN=0,IVAL=0
> 19:42:32.629079 usbus0.4 SUBM-CTRL-EP=00000080,SPD=HIGH,NFR=2,SLEN=8,IVAL=0
> 19:42:32.629232 usbus0.4
> DONE-CTRL-EP=00000080,SPD=HIGH,NFR=2,SLEN=4,IVAL=0,ERR=0
>
>
>
> On 12/19/2020 6:30 PM, Hans Petter Selasky wrote:
>> On 12/19/20 10:57 PM, mike tancsa wrote:
>>> I was able to do a zpool clear zoobackup; zpool export zoobackup even
>>> though it threw a few more errors
>>>
>>> (da2:umass-sim0:0:0:0): WRITE(10). CDB: 2a 00 b4 00 20 28 00 00 18 00
>>> (da2:umass-sim0:0:0:0): CAM status: CCB request completed with an error
>>> (da2:umass-sim0:0:0:0): Retrying command, 2 more tries remain
>>> (da2:umass-sim0:0:0:0): WRITE(10). CDB: 2a 00 b4 00 20 28 00 00 18 00
>>> (da2:umass-sim0:0:0:0): CAM status: CCB request completed with an error
>>> (da2:umass-sim0:0:0:0): Retrying command, 1 more tries remain
>>> Solaris: WARNING: Pool 'zoobackup' has encountered an uncorrectable I/O
>>> failure and has been suspended.
>>>
>>> (da2:umass-sim0:0:0:0): READ(10). CDB: 28 00 00 00 02 38 00 00 10 00
>>> (da2:umass-sim0:0:0:0): CAM status: SCSI Status Error
>>> (da2:umass-sim0:0:0:0): SCSI status: Check Condition
>>> (da2:umass-sim0:0:0:0): SCSI sense: NOT READY asc:4,1 (Logical unit is
>>> in process of becoming ready)
>>>
>>> (da2:umass-sim0:0:0:0): Polling device for readiness
>>>
>>> I wonder if Monday we should try upgrading the BIOS first
>>>
>>>
>>> BIOS Information
>>>          Vendor: American Megatrends Inc.
>>>          Version: 1.0b
>>>          Release Date: 01/29/2015
>>>          Address: 0xF0000
>>>          Runtime Size: 64 kB
>>>          ROM Size: 16 MB
>>>          Characteristics:
>>>
>>> System Information
>>>          Manufacturer: Supermicro
>>>          Product Name: SYS-7048R-C1RT4+
>>>          Version: 0123456789
>>>          Serial Number: S16909225402569
>>>          UUID: 00000000-0000-0000-0000-0cc47a1f2fa0
>>>          Wake-up Type: Power Switch
>>>          SKU Number: To be filled by O.E.M.
>>>          Family: To be filled by O.E.M.
>>>
>>> Handle 0x0002, DMI type 2, 15 bytes
>>> Base Board Information
>>>          Manufacturer: Supermicro
>>>          Product Name: X10DRC-T4+
>>>          Version: 1.01
>>>
>>>
>>> https://www.supermicro.com/Bios/softfiles/10079/P-X10DRC(-I-LN4-T4_)_BIOS_3_2_release_notes.pdf
>>>
>>>
>>> is from 2019
>>>
>>> On 12/19/2020 3:16 PM, Mateusz Guzik wrote:
>>>> I'm adding hps for USB stack comments.
>>>>
>>>> On 12/19/20, mike tancsa <mike@sentex.net> wrote:
>>>>> Hmm, this has happened again. Not sure if it's a bug with the driver,
>>>>> the firmware or both, but after a period of time the USB drive starts
>>>>> to throw errors.  This unit was working fine on RELENG12 and we swapped
>>>>> it with another drive too, but same results. The drive is clean
>>>>>
>>>>> smartctl -a /dev/da2 -T permissive
>>>>>
>> You might want to do a usbdump of the traffic for a short while to
>> figure out exactly what kind of USB error this is.
>>
>> --HPS
>>
> _______________________________________________
> netperf-users@freebsd.org mailing list
> https://lists.freebsd.org/mailman/listinfo/netperf-users
> To unsubscribe, send any mail to "netperf-users-unsubscribe@freebsd.org"
>
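
Since the capture is large and mostly repeats the same control transfers,
a small filter can help pick out the transfers that completed with an
error. The following is only a rough sketch in Python, not part of any
FreeBSD tool: the function names parse_usbdump and failed_transfers are
made up for illustration, the pattern is inferred solely from the lines
quoted above, and treating ERR=0 as "success" is an assumption. Field
values are kept as raw strings rather than interpreted.

```python
import re

# One record as it appears in the quoted excerpt:
#   <time> usbusN.M <SUBM|DONE>-<TYPE>-EP=<8 hex digits>,KEY=VALUE,...
# The layout is inferred from the excerpt only, not from usbdump documentation.
RECORD = re.compile(
    r"(?P<time>\d{2}:\d{2}:\d{2}\.\d{6})\s+"
    r"(?P<dev>usbus\d+\.\d+)\s+"
    r"(?P<phase>SUBM|DONE)-(?P<type>[A-Z]+)-EP=(?P<ep>[0-9a-fA-F]{8})"
    r"(?P<fields>(?:,[A-Z]+=[^,\s]+)*)"
)


def parse_usbdump(text):
    """Parse records from text that may carry mail quote markers and wraps."""
    # Drop leading ">" quote markers, then join wrapped lines into one stream
    # so a record split across two lines matches as a single unit.
    lines = [re.sub(r"^(?:>\s?)+", "", line).strip() for line in text.splitlines()]
    joined = " ".join(line for line in lines if line)

    records = []
    for m in RECORD.finditer(joined):
        rec = {k: m.group(k) for k in ("time", "dev", "phase", "type", "ep")}
        # The fields group starts with a comma, so skip the empty first item.
        for field in m.group("fields").split(",")[1:]:
            key, value = field.split("=", 1)
            rec[key] = value
        records.append(rec)
    return records


def failed_transfers(records):
    """Completed (DONE) transfers whose ERR field is anything other than 0."""
    # Assumption: ERR=0 means the transfer completed without error.
    return [r for r in records if r["phase"] == "DONE" and r.get("ERR", "0") != "0"]


# Lines copied from the excerpt quoted above, quote markers included.
sample = """\
> 19:42:18.196864 usbus0.6
> DONE-BULK-EP=00000081,SPD=SUPER,NFR=0,SLEN=0,IVAL=0,ERR=TIMEOUT
> 19:42:18.196871 usbus0.6 SUBM-CTRL-EP=00000000,SPD=SUPER,NFR=1,SLEN=8,IVAL=0
> 19:42:28.656448 usbus0.6
> DONE-BULK-EP=00000002,SPD=SUPER,NFR=1,SLEN=0,IVAL=0,ERR=IOERROR
> 19:42:29.270717 usbus0.6
> DONE-BULK-EP=00000002,SPD=SUPER,NFR=1,SLEN=0,IVAL=0,ERR=0
"""

for r in failed_transfers(parse_usbdump(sample)):
    print(r["time"], r["dev"], r["type"], r["ep"], r["ERR"])
```

In the excerpt quoted above, the only completions with a non-zero ERR are
the bulk transfers on usbus0.6 (TIMEOUT on EP=00000081, IOERROR on
EP=00000002); the usbus0.4 control transfers all complete with ERR=0.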


