From: Andrea Venturoli
To: "freebsd-stable@freebsd.org"
Subject: Re: Nightly disk-related panic since upgrade to 10.3
Date: Fri, 21 Oct 2016 10:14:26 +0200
Message-ID: <76d65036-0f4b-28fc-d1ef-f6527a9299a1@netfence.it>

On 10/20/16 22:12, Peter wrote:

Hello.

> Basically You have two options: A) fire up kgdb, go into the code and
> try and understand what exactly is happening. This depends
> if You have clue enough to go that way; I found "man 4 gdb" and
> especially the "Debugging Kernel Problems" pdf by Greg Lehey quite
> helpful.

I've tried this way, but although I'm quite proficient with [k]gdb, I tend to get lost in the FreeBSD kernel source code, which, unfortunately, I'm not familiar with.

BTW, I had read that book years ago; I searched for it now, but a 2005 edition still comes up. Has it ever been updated?

> B) systematically change parameters. Start by figuring from the logs
> the exact time of crash and what was happening then, try to reproduce
> that. Then change things and isolate the cause.

Again, I already tried, but without luck.

Since I had one hang one night during the creation of a snapshot, yesterday I tried creating and deleting around 40 of them: I hoped to get the system to hang again, but it all worked perfectly.

Since backups are run at night (possibly at the time of the hangs/panics, and they take snapshots), I launched several backup jobs, but they all worked perfectly.

I checked that at the times of the panics there is usually no cron job, periodic job or anything similar running; at least nothing I could identify. Once there was in fact a periodic job running, but that's not the rule.

"ps -axl -M /var/crash/vmcore.x" showed nothing unusual.

bye & Thanks
	av.