Date: Tue, 18 Oct 2016 17:27:15 +0200
From: Arrigo Marchiori
To: freebsd-fs@freebsd.org
Subject: Random truncated files on USB hard disk with timeouts; how to debug?
Message-ID: <20161018152715.GC89691@nuvolo>

Hello List,

I am encountering a strange problem that happens rarely and at random,
and I don't know how to address it.

Short description: some files sometimes become ``sort of truncated'':
ls(1) tells me their size is not zero, but cat(1), less(1) and vi(1)
show they are empty.

The system is 11.0-STABLE amd64, r307550, with a GENERIC kernel.
CPU: Intel Core 2 Duo. RAM: 2 GB.

The root filesystem is mounted from a USB hard drive with an MBR
partitioning scheme, formatted with UFS, SU+J enabled.

The USB hard drive occasionally times out for ~10 seconds. But I do not
see any warning or error messages in dmesg suggesting that such
timeouts could lead to broken files. In fact, dmesg(8) does not show
anything at all about those timeouts unless I tweak the standard kernel
verbosity options. If I set hw.usb.ehci.debug to 1, then I see
ehci_timeout indications. If I set the sysctl to any higher value, the
console is flooded with messages.

The problem appears while the computer is under heavy load: building
world or ports.
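
The symptom in the short description (a non-zero size according to
ls(1), but nothing coming back when the file is read) could be checked
mechanically across a source tree while a build is running. What
follows is only a minimal sketch, assuming Python 3 is installed on the
machine; the names count_readable_bytes and find_short_reads are
hypothetical helpers, not part of any existing tool. Files that are
being written during the scan may show up as false positives.

```python
import os
import stat


def count_readable_bytes(path, chunk_size=65536):
    """Return how many bytes reading the file to EOF actually delivers."""
    total = 0
    # buffering=0 yields a raw file object, so each read() is one read(2).
    with open(path, "rb", buffering=0) as f:
        while True:
            chunk = f.read(chunk_size)
            if not chunk:  # empty result means end of file
                break
            total += len(chunk)
    return total


def find_short_reads(root, count=count_readable_bytes):
    """Yield (path, stat_size, bytes_read) for regular files under root
    whose contents read back shorter than the size reported by lstat(2).

    The count parameter exists only so the reader can be swapped out
    when testing; it is an assumption of this sketch.
    """
    for dirpath, _dirs, names in os.walk(root):
        for name in names:
            path = os.path.join(dirpath, name)
            try:
                st = os.lstat(path)
                # Skip symlinks, devices and genuinely empty files.
                if not stat.S_ISREG(st.st_mode) or st.st_size == 0:
                    continue
                got = count(path)
            except OSError:
                # Unreadable or vanished file: not the symptom looked for.
                continue
            if got < st.st_size:
                yield (path, st.st_size, got)


# Example use (the path is only an illustration):
# for hit in find_short_reads("/usr/src"):
#     print(hit)
```

Running the scan in a loop during a buildworld would at least give a
timestamp for the first mismatch, which could then be compared against
the ehci_timeout messages.
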
When this problem appears, the compilations stop with funny error
messages: the source files are empty! Running truss(1) on cat(1) shows
that the read(2) system call returns 0 bytes.

I tried to disable journaling, but the problem still appears,
apparently with the same frequency.

Once the problem appears, I can reboot the system normally. I see no
errors either during shutdown or during the next startup. The
filesystem is considered clean, and no fsck is run (BTW I disabled
background fsck). The funny part is that after rebooting, the file
contents are visible! I can resume the port compilation as if nothing
had ever happened.

What can I do to get more information on this problem? Is there a
well-known stress test I could run to trigger this problem more
frequently?

I consider this a big problem, because I have no indication from the
system logs that anything is going wrong. If the HDD were broken, I
would expect the kernel to complain loudly and often.

Please keep me in Cc, as I am not subscribed to this list.

Thank you in advance!
-- 
rigo

http://rigo.altervista.org