From owner-netperf-users@freebsd.org Sun Dec 20 20:01:03 2020
Return-Path:
Delivered-To: netperf-users@mailman.nyi.freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2610:1c1:1:606c::19:1])
	by mailman.nyi.freebsd.org (Postfix) with ESMTP id AE4E74CA4A7
	for ; Sun, 20 Dec 2020 20:01:03 +0000 (UTC)
	(envelope-from mike@sentex.net)
Received: from pyroxene2a.sentex.ca (pyroxene19.sentex.ca [IPv6:2607:f3e0:0:3::19])
	(using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits)
	key-exchange X25519 server-signature RSA-PSS (4096 bits) server-digest SHA256
	client-signature RSA-PSS (2048 bits) client-digest SHA256)
	(Client CN "pyroxene.sentex.ca", Issuer "Let's Encrypt Authority X3" (verified OK))
	by mx1.freebsd.org (Postfix) with ESMTPS id 4CzYPq4ML2z3Qvw;
	Sun, 20 Dec 2020 20:01:03 +0000 (UTC)
	(envelope-from mike@sentex.net)
Received: from [IPv6:2607:f3e0:0:4:cc04:af7f:84b:2e60] ([IPv6:2607:f3e0:0:4:cc04:af7f:84b:2e60])
	by pyroxene2a.sentex.ca (8.15.2/8.15.2) with ESMTPS id 0BKK10Wq092529
	(version=TLSv1.3 cipher=TLS_AES_128_GCM_SHA256 bits=128 verify=NO);
	Sun, 20 Dec 2020 15:01:00 -0500 (EST)
	(envelope-from mike@sentex.net)
To: Hans Petter Selasky, Mateusz Guzik
Cc: George Neville-Neil, "netperf-admin@FreeBSD.org",
	netperf-users@freebsd.org, Paul Holes
References: <5483e76e-4a2f-3153-c10b-7902839c1b68@sentex.net>
	<8c26a0d3-3bd0-7535-0abc-3d1e9e5ac7c4@sentex.net>
	<64923d33-4bf2-0fd5-1b17-d6bd73e9fd32@sentex.net>
	<13a9ab42-1df8-c054-0c83-5708ab9d9e2b@sentex.net>
	<6cef40cd-de57-aa84-bc70-ceea71add397@sentex.net>
	<837ce2bc-9731-85b0-c6a5-1b3c7bcadb72@sentex.net>
	<7c508e03-7575-b06a-3b14-f8b6e1ed10db@sentex.net>
	<51c4dfed-a42a-a820-816c-f89691e853e7@selasky.org>
From: mike tancsa
Subject: Re: zoo back online (was Re: zoo hang)
Date: Sun, 20 Dec 2020 15:01:01 -0500
User-Agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:78.0) Gecko/20100101 Thunderbird/78.6.0
MIME-Version: 1.0
In-Reply-To: <51c4dfed-a42a-a820-816c-f89691e853e7@selasky.org>
Content-Type: text/plain; charset=utf-8
Content-Language: en-US
X-Rspamd-Queue-Id: 4CzYPq4ML2z3Qvw
X-Spamd-Bar: ----
Authentication-Results: mx1.freebsd.org; none
X-Spamd-Result: default: False [-4.00 / 15.00]; REPLY(-4.00)[]
X-BeenThere: netperf-users@freebsd.org
X-Mailman-Version: 2.1.34
Precedence: list
List-Id: "Announcements and discussions related to the netperf cluster. "
X-List-Received-Date: Sun, 20 Dec 2020 20:01:03 -0000

The dump file is rather large. It's on zoo.freebsd.org in
/tmp/disk-capture3.dump.  I let the backup run until it hit the errors
again, and am trying to do a zpool clear; zpool export but it's stalling
out. CTRL+T shows

load: 0.14  cmd: zpool 77888 [g_waitidle] 324.72r 0.00u 0.03s 0% 7184k
mi_switch+0xc1 sleepq_timedwait+0x2f _sleep+0x1ab g_waitidle+0x8f ast+0x2e3 doreti_ast+0x1f
load: 0.14  cmd: zpool 77888 [g_waitidle] 328.14r 0.00u 0.03s 0% 7184k
mi_switch+0xc1 sleepq_timedwait+0x2f _sleep+0x1ab g_waitidle+0x8f ast+0x2e3 doreti_ast+0x1f
load: 0.14  cmd: zpool 77888 [g_waitidle] 328.69r 0.00u 0.03s 0% 7184k
mi_switch+0xc1 sleepq_timedwait+0x2f _sleep+0x1ab g_waitidle+0x8f ast+0x2e3 doreti_ast+0x1f
load: 0.14  cmd: zpool 77888 [g_waitidle] 329.56r 0.00u 0.03s 0% 7184k
mi_switch+0xc1 sleepq_timedwait+0x2f _sleep+0x1ab g_waitidle+0x8f ast+0x2e3 doreti_ast+0x1f
load: 0.13  cmd: zpool 77888 [g_waitidle] 330.16r 0.00u 0.03s 0% 7184k
mi_switch+0xc1 sleepq_timedwait+0x2f _sleep+0x1ab g_waitidle+0x8f ast+0x2e3 doreti_ast+0x1f
load: 0.13  cmd: zpool 77888 [g_waitidle] 330.65r 0.00u 0.03s 0% 7184k
mi_switch+0xc1 sleepq_timedwait+0x2f _sleep+0x1ab g_waitidle+0x8f ast+0x2e3 doreti_ast+0x1f
load: 0.13  cmd: zpool 77888 [g_waitidle] 331.76r 0.00u 0.03s 0% 7184k
mi_switch+0xc1 sleepq_timedwait+0x2f _sleep+0x1ab g_waitidle+0x8f ast+0x2e3 doreti_ast+0x1f
load: 0.13  cmd: zpool 77888 [g_waitidle] 332.24r 0.00u 0.03s 0% 7184k
mi_switch+0xc1 sleepq_timedwait+0x2f _sleep+0x1ab g_waitidle+0x8f ast+0x2e3 doreti_ast+0x1f

and I can't kill the clear now.

But there's not a lot in the dump other than variations of the below

19:42:12.541808 usbus0.4 SUBM-CTRL-EP=00000080,SPD=HIGH,NFR=2,SLEN=8,IVAL=0
19:42:12.541921 usbus0.4 DONE-CTRL-EP=00000080,SPD=HIGH,NFR=2,SLEN=4,IVAL=0,ERR=0
19:42:12.541935 usbus0.4 SUBM-CTRL-EP=00000080,SPD=HIGH,NFR=2,SLEN=8,IVAL=0
19:42:12.542055 usbus0.4 DONE-CTRL-EP=00000080,SPD=HIGH,NFR=2,SLEN=4,IVAL=0,ERR=0
19:42:12.542069 usbus0.4 SUBM-CTRL-EP=00000080,SPD=HIGH,NFR=2,SLEN=8,IVAL=0
19:42:12.542190 usbus0.4 DONE-CTRL-EP=00000080,SPD=HIGH,NFR=2,SLEN=4,IVAL=0,ERR=0
19:42:16.566334 usbus0.4 SUBM-CTRL-EP=00000080,SPD=HIGH,NFR=2,SLEN=8,IVAL=0
19:42:16.566484 usbus0.4 DONE-CTRL-EP=00000080,SPD=HIGH,NFR=2,SLEN=4,IVAL=0,ERR=0
19:42:16.566499 usbus0.4 SUBM-CTRL-EP=00000080,SPD=HIGH,NFR=2,SLEN=8,IVAL=0
19:42:16.566614 usbus0.4 DONE-CTRL-EP=00000080,SPD=HIGH,NFR=2,SLEN=4,IVAL=0,ERR=0
19:42:16.566629 usbus0.4 SUBM-CTRL-EP=00000080,SPD=HIGH,NFR=2,SLEN=8,IVAL=0
19:42:16.566746 usbus0.4 DONE-CTRL-EP=00000080,SPD=HIGH,NFR=2,SLEN=4,IVAL=0,ERR=0
19:42:16.566782 usbus0.4 SUBM-CTRL-EP=00000080,SPD=HIGH,NFR=2,SLEN=8,IVAL=0
19:42:16.566876 usbus0.4 DONE-CTRL-EP=00000080,SPD=HIGH,NFR=2,SLEN=4,IVAL=0,ERR=0
19:42:18.196864 usbus0.6 DONE-BULK-EP=00000081,SPD=SUPER,NFR=0,SLEN=0,IVAL=0,ERR=TIMEOUT
19:42:18.196871 usbus0.6 SUBM-CTRL-EP=00000000,SPD=SUPER,NFR=1,SLEN=8,IVAL=0
19:42:18.196992 usbus0.6 DONE-CTRL-EP=00000000,SPD=SUPER,NFR=1,SLEN=0,IVAL=0,ERR=0
19:42:18.197003 usbus0.6 SUBM-BULK-EP=00000081,SPD=SUPER,NFR=1,SLEN=0,IVAL=0
19:42:20.582625 usbus0.4 SUBM-CTRL-EP=00000080,SPD=HIGH,NFR=2,SLEN=8,IVAL=0
19:42:20.582777 usbus0.4 DONE-CTRL-EP=00000080,SPD=HIGH,NFR=2,SLEN=4,IVAL=0,ERR=0
19:42:20.582794 usbus0.4 SUBM-CTRL-EP=00000080,SPD=HIGH,NFR=2,SLEN=8,IVAL=0
19:42:20.582909 usbus0.4 DONE-CTRL-EP=00000080,SPD=HIGH,NFR=2,SLEN=4,IVAL=0,ERR=0
19:42:20.582938 usbus0.4 SUBM-CTRL-EP=00000080,SPD=HIGH,NFR=2,SLEN=8,IVAL=0
19:42:20.583038 usbus0.4 DONE-CTRL-EP=00000080,SPD=HIGH,NFR=2,SLEN=4,IVAL=0,ERR=0
19:42:20.583054 usbus0.4 SUBM-CTRL-EP=00000080,SPD=HIGH,NFR=2,SLEN=8,IVAL=0
19:42:20.583193 usbus0.4 DONE-CTRL-EP=00000080,SPD=HIGH,NFR=2,SLEN=4,IVAL=0,ERR=0
19:42:23.107860 usbus0.6 DONE-BULK-EP=00000081,SPD=SUPER,NFR=0,SLEN=0,IVAL=0,ERR=TIMEOUT
19:42:23.107866 usbus0.6 SUBM-CTRL-EP=00000000,SPD=SUPER,NFR=1,SLEN=8,IVAL=0
19:42:23.107987 usbus0.6 DONE-CTRL-EP=00000000,SPD=SUPER,NFR=1,SLEN=0,IVAL=0,ERR=0
19:42:23.107989 usbus0.6 SUBM-BULK-EP=00000081,SPD=SUPER,NFR=1,SLEN=0,IVAL=0
19:42:24.589628 usbus0.4 SUBM-CTRL-EP=00000080,SPD=HIGH,NFR=2,SLEN=8,IVAL=0
19:42:24.589784 usbus0.4 DONE-CTRL-EP=00000080,SPD=HIGH,NFR=2,SLEN=4,IVAL=0,ERR=0
19:42:24.589803 usbus0.4 SUBM-CTRL-EP=00000080,SPD=HIGH,NFR=2,SLEN=8,IVAL=0
19:42:24.589914 usbus0.4 DONE-CTRL-EP=00000080,SPD=HIGH,NFR=2,SLEN=4,IVAL=0,ERR=0
19:42:24.589932 usbus0.4 SUBM-CTRL-EP=00000080,SPD=HIGH,NFR=2,SLEN=8,IVAL=0
19:42:24.590070 usbus0.4 DONE-CTRL-EP=00000080,SPD=HIGH,NFR=2,SLEN=4,IVAL=0,ERR=0
19:42:24.590086 usbus0.4 SUBM-CTRL-EP=00000080,SPD=HIGH,NFR=2,SLEN=8,IVAL=0
19:42:24.590177 usbus0.4 DONE-CTRL-EP=00000080,SPD=HIGH,NFR=2,SLEN=4,IVAL=0,ERR=0
19:42:28.017135 usbus0.6 DONE-BULK-EP=00000081,SPD=SUPER,NFR=0,SLEN=0,IVAL=0,ERR=TIMEOUT
19:42:28.546429 usbus0.6 SUBM-CTRL-EP=00000000,SPD=SUPER,NFR=1,SLEN=8,IVAL=500
19:42:28.546582 usbus0.6 DONE-CTRL-EP=00000000,SPD=SUPER,NFR=1,SLEN=0,IVAL=500,ERR=0
19:42:28.594025 usbus0.4 SUBM-CTRL-EP=00000080,SPD=HIGH,NFR=2,SLEN=8,IVAL=0
19:42:28.594186 usbus0.4 DONE-CTRL-EP=00000080,SPD=HIGH,NFR=2,SLEN=4,IVAL=0,ERR=0
19:42:28.594201 usbus0.4 SUBM-CTRL-EP=00000080,SPD=HIGH,NFR=2,SLEN=8,IVAL=0
19:42:28.594322 usbus0.4 DONE-CTRL-EP=00000080,SPD=HIGH,NFR=2,SLEN=4,IVAL=0,ERR=0
19:42:28.594341 usbus0.4 SUBM-CTRL-EP=00000080,SPD=HIGH,NFR=2,SLEN=8,IVAL=0
19:42:28.594449 usbus0.4 DONE-CTRL-EP=00000080,SPD=HIGH,NFR=2,SLEN=4,IVAL=0,ERR=0
19:42:28.594465 usbus0.4 SUBM-CTRL-EP=00000080,SPD=HIGH,NFR=2,SLEN=8,IVAL=0
19:42:28.594581 usbus0.4 DONE-CTRL-EP=00000080,SPD=HIGH,NFR=2,SLEN=4,IVAL=0,ERR=0
19:42:28.599175 usbus0.6 SUBM-CTRL-EP=00000000,SPD=SUPER,NFR=1,SLEN=8,IVAL=50
19:42:28.599332 usbus0.6 DONE-CTRL-EP=00000000,SPD=SUPER,NFR=1,SLEN=0,IVAL=50,ERR=0
19:42:28.652173 usbus0.6 SUBM-CTRL-EP=00000000,SPD=SUPER,NFR=1,SLEN=8,IVAL=50
19:42:28.652328 usbus0.6 DONE-CTRL-EP=00000000,SPD=SUPER,NFR=1,SLEN=0,IVAL=50,ERR=0
19:42:28.652332 usbus0.6 SUBM-BULK-EP=00000002,SPD=SUPER,NFR=1,SLEN=32,IVAL=0
19:42:28.656448 usbus0.6 DONE-BULK-EP=00000002,SPD=SUPER,NFR=1,SLEN=0,IVAL=0,ERR=IOERROR
19:42:29.164682 usbus0.6 SUBM-CTRL-EP=00000000,SPD=SUPER,NFR=1,SLEN=8,IVAL=500
19:42:29.164835 usbus0.6 DONE-CTRL-EP=00000000,SPD=SUPER,NFR=1,SLEN=0,IVAL=500,ERR=0
19:42:29.217174 usbus0.6 SUBM-CTRL-EP=00000000,SPD=SUPER,NFR=1,SLEN=8,IVAL=50
19:42:29.217325 usbus0.6 DONE-CTRL-EP=00000000,SPD=SUPER,NFR=1,SLEN=0,IVAL=50,ERR=0
19:42:29.270172 usbus0.6 SUBM-CTRL-EP=00000000,SPD=SUPER,NFR=1,SLEN=8,IVAL=50
19:42:29.270326 usbus0.6 DONE-CTRL-EP=00000000,SPD=SUPER,NFR=1,SLEN=0,IVAL=50,ERR=0
19:42:29.270331 usbus0.6 SUBM-BULK-EP=00000002,SPD=SUPER,NFR=1,SLEN=32,IVAL=0
19:42:29.270717 usbus0.6 DONE-BULK-EP=00000002,SPD=SUPER,NFR=1,SLEN=0,IVAL=0,ERR=0
19:42:29.270721 usbus0.6 SUBM-BULK-EP=00000081,SPD=SUPER,NFR=1,SLEN=0,IVAL=0
19:42:32.629079 usbus0.4 SUBM-CTRL-EP=00000080,SPD=HIGH,NFR=2,SLEN=8,IVAL=0
19:42:32.629232 usbus0.4 DONE-CTRL-EP=00000080,SPD=HIGH,NFR=2,SLEN=4,IVAL=0,ERR=0

On 12/19/2020 6:30 PM, Hans Petter Selasky wrote:
> On 12/19/20 10:57 PM, mike tancsa wrote:
>> I was able to do a zpool clear zoobackup; zpool export zoobackup even
>> though it threw a few more errors
>>
>> (da2:umass-sim0:0:0:0): WRITE(10). CDB: 2a 00 b4 00 20 28 00 00 18 00
>> (da2:umass-sim0:0:0:0): CAM status: CCB request completed with an error
>> (da2:umass-sim0:0:0:0): Retrying command, 2 more tries remain
>> (da2:umass-sim0:0:0:0): WRITE(10). CDB: 2a 00 b4 00 20 28 00 00 18 00
>> (da2:umass-sim0:0:0:0): CAM status: CCB request completed with an error
>> (da2:umass-sim0:0:0:0): Retrying command, 1 more tries remain
>> Solaris: WARNING: Pool 'zoobackup' has encountered an uncorrectable I/O
>> failure and has been suspended.
>>
>> (da2:umass-sim0:0:0:0): READ(10). CDB: 28 00 00 00 02 38 00 00 10 00
>> (da2:umass-sim0:0:0:0): CAM status: SCSI Status Error
>> (da2:umass-sim0:0:0:0): SCSI status: Check Condition
>> (da2:umass-sim0:0:0:0): SCSI sense: NOT READY asc:4,1 (Logical unit is
>> in process of becoming ready)
>>
>> (da2:umass-sim0:0:0:0): Polling device for readiness
>>
>> I wonder if Monday we should try upgrading the BIOS first
>>
>>
>> BIOS Information
>>         Vendor: American Megatrends Inc.
>>         Version: 1.0b
>>         Release Date: 01/29/2015
>>         Address: 0xF0000
>>         Runtime Size: 64 kB
>>         ROM Size: 16 MB
>>         Characteristics:
>>
>> System Information
>>         Manufacturer: Supermicro
>>         Product Name: SYS-7048R-C1RT4+
>>         Version: 0123456789
>>         Serial Number: S16909225402569
>>         UUID: 00000000-0000-0000-0000-0cc47a1f2fa0
>>         Wake-up Type: Power Switch
>>         SKU Number: To be filled by O.E.M.
>>         Family: To be filled by O.E.M.
>>
>> Handle 0x0002, DMI type 2, 15 bytes
>> Base Board Information
>>         Manufacturer: Supermicro
>>         Product Name: X10DRC-T4+
>>         Version: 1.01
>>
>>
>> https://www.supermicro.com/Bios/softfiles/10079/P-X10DRC(-I-LN4-T4_)_BIOS_3_2_release_notes.pdf
>>
>>
>> is from 2019
>>
>> On 12/19/2020 3:16 PM, Mateusz Guzik wrote:
>>> I'm adding hps for USB stack comments.
>>>
>>> On 12/19/20, mike tancsa wrote:
>>>> Hmm, This has happened again. Not sure if its a bug with the
>>>> driver, the
>>>> firmware or both, but after a period of time the usb drive starts to
>>>> throw errors.  This unit was working fine on RELENG12 and we
>>>> swapped it
>>>> with another drive too, but same results. The drive is clean
>>>>
>>>> smartctl -a /dev/da2 -T permissive
>>>>
>
> You might want to do a usbdump of the traffic for a short while to
> figure out exactly what kind of USB error this is.
>
> --HPS
>
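
For a capture this large, it may help to tally only the failed transfers
rather than read every line. What follows is a minimal sketch, not a tool
from the thread: it assumes the one-line usbdump summary format seen in
this message (timestamp, usbusN.M, then PHASE-TYPE-EP=... with plain
KEY=value fields, one transfer per line). The names LINE_RE, tally_errors
and SAMPLE are illustrative and are not part of usbdump.

```python
import re
from collections import Counter

# One usbdump summary line, in the form assumed from this thread, e.g.
# 19:42:18.196864 usbus0.6 DONE-BULK-EP=00000081,SPD=SUPER,NFR=0,SLEN=0,IVAL=0,ERR=TIMEOUT
LINE_RE = re.compile(
    r"^(?P<time>\d{2}:\d{2}:\d{2}\.\d{6})\s+"
    r"(?P<dev>usbus\d+\.\d+)\s+"
    r"(?P<phase>SUBM|DONE)-(?P<type>[A-Z]+)-EP=(?P<ep>[0-9a-fA-F]{8})"
    r"(?P<fields>(?:,[A-Z]+=[^,\s]+)*)$"
)


def tally_errors(lines):
    """Count completed transfers whose ERR field is not 0.

    Keys are (device, transfer type, endpoint, error name).
    Lines that do not match the assumed format are skipped.
    """
    counts = Counter()
    for line in lines:
        m = LINE_RE.match(line.strip())
        if m is None or m.group("phase") != "DONE":
            continue  # only DONE lines carry an ERR field
        fields = dict(
            kv.split("=", 1) for kv in m.group("fields").split(",") if kv
        )
        err = fields.get("ERR")
        if err is not None and err != "0":
            key = (m.group("dev"), m.group("type"), m.group("ep"), err)
            counts[key] += 1
    return counts


# Four lines copied from the capture in this message.
SAMPLE = """\
19:42:18.196864 usbus0.6 DONE-BULK-EP=00000081,SPD=SUPER,NFR=0,SLEN=0,IVAL=0,ERR=TIMEOUT
19:42:18.196871 usbus0.6 SUBM-CTRL-EP=00000000,SPD=SUPER,NFR=1,SLEN=8,IVAL=0
19:42:18.196992 usbus0.6 DONE-CTRL-EP=00000000,SPD=SUPER,NFR=1,SLEN=0,IVAL=0,ERR=0
19:42:28.656448 usbus0.6 DONE-BULK-EP=00000002,SPD=SUPER,NFR=1,SLEN=0,IVAL=0,ERR=IOERROR
"""

if __name__ == "__main__":
    for key, n in tally_errors(SAMPLE.splitlines()).items():
        print(n, *key)
```

Run against the full text output, a tally like this would show at a glance
whether the failures are confined to usbus0.6 and which endpoints they hit.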