From owner-freebsd-stable@freebsd.org Sat Jan 12 18:09:31 2019
Subject: Re: USB disks dropping off-line
To: Barney Wolff, Kevin Oberman
Cc: FreeBSD Stable ML
From: Eugene Grosbein
Date: Sun, 13 Jan 2019 01:09:15 +0700
In-Reply-To: <20190112171255.GA40194@pit.databus.com>
References: <20190112171255.GA40194@pit.databus.com>
List-Id: Production branch of FreeBSD source code

13.01.2019 0:12, Barney Wolff wrote:

> On Fri, Jan 11, 2019 at 11:31:55PM -0800, Kevin Oberman wrote:
>> Cue Twilight Zone theme. For your consideration: This started in December
>> when I was running 11.2-STABLE. Starting in December, when I tried to back up
>> my laptop to a USB drive, it periodically dropped off-line (disconnected)
>> and immediately reconnected. This was originally using rsync. It seemed
>> fairly random, and eventually I got a successful backup. After I upgraded
>> to 12.0, it was much worse and I could no longer get a clean rsync.
>>
>> I assumed that the drive was failing and swapped it for an identical one,
>> re-partitioned, and used dd to copy each partition. The same thing
>> happened, but I noticed that it seemed to happen when the system was a bit
>> active. I then shut down X and tried with nothing else running. It ran for a
>> few minutes until I did a sync from a different login while the dd was
>> running. Boom. Disk disconnected again.
>>
>> I finally got an almost complete backup of /usr. I had about 1-2 GB left
>> when it happened again. I suspect that some background operation (periodic
>> sync?) triggered it again.
>>
>> Any suggestions?
>>
>> Here is my system info: Lenovo T520 now running FreeBSD 12.0-STABLE r342788
>> and GENERIC config except SCHED_4BSD. System is completely stable except
>> for the USB disk dropping off-line. Disk is a 2TB WD My Passport. It is a
>> USB 3.0 drive, but plugged into a 2.0 port. (The T520 has no 3.0 capability.)
>>
>> Has anyone seen anything like this? Any ideas? I am REALLY nervous running
>> without a backup.
> I've had what may be the same problem for years, with a USB3 disk,
> on both 10-stable and 12.0-release. I've never found a cause, though power
> glitches might be responsible. As a pragmatic fix I have the backup disk as a zpool
> and run a daemon that simply does a zpool clear if it finds the pool unhealthy during the backup.
> That lets the backup complete every time - the pool is set to wait on error,
> so having the daemon check once a minute works with minuscule overhead.

I had the same problem for several years with an unchanging set of hardware
(integrated USB 2.0 controller and external USB HDD) and several FreeBSD
versions from 8.x onward. The disk just stopped disappearing after one of the
software upgrades, so I presume the cause was instability in our USB stack
for some edge cases. Now it works just fine with my hardware and 11.2-STABLE.
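
For reference, the zpool-based workaround quoted above (check the pool about once a minute, run zpool clear if it is unhealthy) might look roughly like the sketch below. This is not Barney's actual daemon. The pool name "backup", the function names, and the use of `zpool list -H -o health` as the health check are assumptions made for illustration. Whether these commands respond promptly while a pool is suspended has not been verified here, and they need to run as root.

```python
import subprocess
import time

POOL = "backup"       # hypothetical pool name, not from the thread
CHECK_INTERVAL = 60   # seconds; "once a minute" as in the quoted message


def pool_health(pool, run=subprocess.run):
    """Return the pool's health string, e.g. ONLINE or SUSPENDED."""
    result = run(
        ["zpool", "list", "-H", "-o", "health", pool],
        capture_output=True, text=True, check=True,
    )
    return result.stdout.strip()


def clear_if_unhealthy(pool, run=subprocess.run):
    """Run `zpool clear` when the pool is not ONLINE; return True if cleared."""
    if pool_health(pool, run) == "ONLINE":
        return False
    run(["zpool", "clear", pool], check=True)
    return True


def watch(pool=POOL, interval=CHECK_INTERVAL):
    """Check the pool once per interval until interrupted."""
    while True:
        try:
            clear_if_unhealthy(pool)
        except subprocess.CalledProcessError as exc:
            print(f"zpool command failed: {exc}")
        time.sleep(interval)
```

Calling `watch()` for the duration of the backup would give the once-a-minute check. The "wait on error" setting in the quoted message presumably refers to the pool's failmode=wait property, which makes I/O block until the device returns and the errors are cleared.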

Anyway, we have gmountver(8) as a temporary work-around:

DESCRIPTION
     The gmountver utility is used to control the mount verification GEOM
     class. When configured, it passes all the I/O requests to the underlying
     provider. When the underlying provider disappears - for example because
     the disk device got disconnected - it queues all the I/O requests and
     waits for the provider to reappear. When that happens, it attaches to it
     and sends the queued requests.
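
To make the pass-through / queue / replay behaviour in that description concrete, here is a toy model in Python. It is only an illustration of the three states the man page describes, not the GEOM class or its API. The class name `MountVerModel` and its methods are invented for this sketch, and a "provider" is modelled as a plain callable.

```python
from collections import deque


class MountVerModel:
    """Toy model of the described behaviour; not the GEOM implementation."""

    def __init__(self, provider):
        self.provider = provider   # callable handling one request, or None
        self.queue = deque()       # requests held while the provider is gone

    def submit(self, request):
        """Pass the request through, or queue it if the provider is absent."""
        if self.provider is None:
            self.queue.append(request)
            return None
        return self.provider(request)

    def provider_gone(self):
        """The underlying provider disappeared (e.g. the disk disconnected)."""
        self.provider = None

    def provider_back(self, provider):
        """Attach to the reappeared provider and send queued requests in order."""
        self.provider = provider
        results = []
        while self.queue:
            results.append(provider(self.queue.popleft()))
        return results
```

In this model, requests submitted between `provider_gone()` and `provider_back()` are not lost or failed. They are held and delivered in their original order once the provider returns, which is why a dd or rsync on top of such a layer can survive a brief USB disconnect. On a real system the layer is set up with `gmountver create` on the disk and the file system is then mounted from the resulting `.mountver` device; see gmountver(8) for the exact usage.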