From owner-freebsd-stable@freebsd.org Mon Oct 17 23:31:57 2016
Subject: Re: Repeatable panic on ZFS filesystem (used for backups); 11.0-STABLE
From: Steven Hartland <killing@multiplay.co.uk>
To: freebsd-stable@freebsd.org
Date: Tue, 18 Oct 2016 00:32:20 +0100
Message-ID: <4d4909b7-c44b-996e-90e1-ca446e8e4813@multiplay.co.uk>
References: <3d4f25c9-a262-a373-ec7e-755325f8810b@denninger.net> <9adecd24-6659-0da5-5c05-d0d3957a2cb3@denninger.net> <0f58b11f-0bca-bc08-6f90-4e6e530f9956@denninger.net> <43a67287-f4f8-5d3e-6c5e-b3599c6adb4d@multiplay.co.uk> <76551fd6-0565-ee6c-b0f2-7d472ad6a4b3@denninger.net> <25ff3a3e-77a9-063b-e491-8d10a06e6ae2@multiplay.co.uk> <26e092b2-17c6-8744-5035-d0853d733870@denninger.net>
List-Id: Production branch of FreeBSD source code

On 17/10/2016 22:50, Karl Denninger wrote:
> I will make some effort on the sandbox machine to see if I can come up
> with a way to replicate this.
> I do have plenty of spare larger drives laying around that used to be
> in service and were obsoleted due to capacity -- but what I don't know
> is whether the system will misbehave if the source is all spinning
> rust.
>
> In other words:
>
> 1. Root filesystem is mirrored spinning rust (production is mirrored
> SSDs)
>
> 2. Backup is mirrored spinning rust (of approx the same size)
>
> 3. Set up auto-snapshot exactly as the production system has now
> (which the sandbox is NOT, since I don't care about incremental
> recovery on that machine; it's a sandbox!)
>
> 4. Run a bunch of build-somethings (e.g. buildworlds, cross-builds for
> the Pi2s I have here, etc.) to generate a LOT of filesystem entropy
> across lots of snapshots.
>
> 5. Back that up.
>
> 6. Export the backup pool.
>
> 7. Re-import it and "zfs destroy -r" the backup filesystem.
>
> That is what got me in a reboot loop after the *first* panic; I was
> simply going to destroy the backup filesystem and re-run the backup,
> but as soon as I issued that zfs destroy the machine panic'd, and as
> soon as I re-attached it after a reboot it panic'd again. Repeat until
> I set trim=0.
>
> But even if I CAN replicate it, it still shouldn't be happening: the
> system should *certainly* survive attempting to TRIM on a vdev that
> doesn't support TRIM, even if the removal covers a large amount of
> space and/or files on the target, without blowing up.
>
> BTW, I bet it isn't that rare -- you can hit it if you're taking timed
> snapshots on an active filesystem (with lots of entropy) and then
> remove those snapshots (as with a zfs destroy -r, or a zfs recv of an
> incremental copy that syncs against a source) on a pool that has been
> imported before the system realizes that TRIM is unavailable on those
> vdevs.
> Noting this:
>
> Yes, need to find some time to have a look at it, but given how rare
> this is, and with TRIM being re-implemented upstream in a totally
> different manner, I'm reluctant to spend any real time on it.
>
> What's in-process in this regard, if you happen to have a reference?

Looks like it may still be in review: https://reviews.csiden.org/r/263/
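[As an aside, Karl's steps 1-7 don't strictly need spare drives: the same sequence can be driven against file-backed vdevs. A minimal sketch follows; the pool names, file paths, sizes, and churn loop are all invented for illustration and are not from this thread. It must run as root on a ZFS-capable FreeBSD box, and is destructive to the named pools.]

```shell
#!/bin/sh
# Hypothetical repro sketch for the "zfs destroy -r after import"
# panic, using file-backed vdevs in place of spare disks.
set -e

# Steps 1-2: a "source" and a "backup" pool, each a two-way mirror.
truncate -s 4g /tmp/src0 /tmp/src1 /tmp/bak0 /tmp/bak1
zpool create srcpool mirror /tmp/src0 /tmp/src1
zpool create bakpool mirror /tmp/bak0 /tmp/bak1

# Steps 3-4: generate churn across several snapshots (a stand-in for
# buildworld runs under an auto-snapshot regime) -- write, delete,
# snapshot, repeat, so each snapshot pins freed blocks.
i=1
while [ "$i" -le 5 ]; do
    dd if=/dev/urandom of=/srcpool/junk.$i bs=1m count=256
    rm -f /srcpool/junk.$((i - 1))
    zfs snapshot srcpool@snap$i
    i=$((i + 1))
done

# Step 5: replicate the whole snapshot chain to the backup pool.
zfs send -R srcpool@snap5 | zfs recv -F bakpool/backup

# Steps 6-7: export, re-import (-d is needed for file vdevs), then
# recursively destroy -- the point at which the panic was reported.
zpool export bakpool
zpool import -d /tmp bakpool
zfs destroy -r bakpool/backup
```

The recursive destroy frees every block referenced only by the replicated snapshots at once, which is what generates the burst of TRIM requests the thread is discussing.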
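[For reference, the "trim=0" Karl mentions is, I believe, the stock FreeBSD vfs.zfs.trim.enabled knob; treat the snippet below as an assumption about his workaround, not a confirmed fix. On 10.x/11.x it is a boot-time tunable, so it goes in loader.conf rather than being flipped at runtime.]

```shell
# Presumed workaround: disable ZFS TRIM before the affected pool is
# next imported, so the destroy does not queue TRIM requests.
echo 'vfs.zfs.trim.enabled=0' >> /boot/loader.conf

# After a reboot, the pool can be imported and the recursive destroy
# retried; "bakpool" here is an illustrative pool name.
# zpool import bakpool
# zfs destroy -r bakpool/backup
```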