Date:      Tue, 18 Oct 2016 15:55:37 -0500
From:      Karl Denninger <karl@denninger.net>
To:        freebsd-stable@freebsd.org
Subject:   Re: Repeatable panic on ZFS filesystem (used for backups); 11.0-STABLE
Message-ID:  <1fefed03-6062-50f9-be97-d693e25a64c9@denninger.net>
In-Reply-To: <4d4909b7-c44b-996e-90e1-ca446e8e4813@multiplay.co.uk>
References:  <3d4f25c9-a262-a373-ec7e-755325f8810b@denninger.net> <9adecd24-6659-0da5-5c05-d0d3957a2cb3@denninger.net> <CANCZdfq5QCDNhLY5GOpmBoh5ONYy2VPteuaMhQ2=3v%2B0vcoM0g@mail.gmail.com> <0f58b11f-0bca-bc08-6f90-4e6e530f9956@denninger.net> <43a67287-f4f8-5d3e-6c5e-b3599c6adb4d@multiplay.co.uk> <76551fd6-0565-ee6c-b0f2-7d472ad6a4b3@denninger.net> <25ff3a3e-77a9-063b-e491-8d10a06e6ae2@multiplay.co.uk> <26e092b2-17c6-8744-5035-d0853d733870@denninger.net> <d2afc0b0-0e7f-e7ac-fb21-fa4ffd1c1003@multiplay.co.uk> <f9a4a12d-62df-482d-feeb-9d9f64de3e55@denninger.net> <4d4909b7-c44b-996e-90e1-ca446e8e4813@multiplay.co.uk>


On 10/17/2016 18:32, Steven Hartland wrote:
>
>
> On 17/10/2016 22:50, Karl Denninger wrote:
>> I will make some effort on the sandbox machine to see if I can come up
>> with a way to replicate this.  I do have plenty of spare larger drives
>> lying around that used to be in service and were obsolesced due to
>> capacity -- but what I don't know is whether the system will misbehave
>> if the source is all spinning rust.
>>
>> In other words:
>>
>> 1. Root filesystem is mirrored spinning rust (production is mirrored
>> SSDs)
>>
>> 2. Backup is mirrored spinning rust (of approx the same size)
>>
>> 3. Set up auto-snapshots exactly as the production system has now (which
>> the sandbox does NOT have, since I don't care about incremental recovery
>> on that machine; it's a sandbox!)
>>
>> 4. Run a bunch of build-somethings (e.g. buildworlds, cross-build for
>> the Pi2s I have here, etc) to generate a LOT of filesystem entropy
>> across lots of snapshots.
>>
>> 5. Back that up.
>>
>> 6. Export the backup pool.
>>
>> 7. Re-import it and "zfs destroy -r" the backup filesystem.
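
(Spelled out as commands, that recipe would look roughly like the
following -- pool, dataset and device names here are purely
illustrative, not the actual ones on either box:)

    zpool create backup mirror da2 da3      # spare spinning-rust pair
    zfs snapshot -r zroot@auto-001          # timed snapshot, as in production
    cd /usr/src && make -j8 buildworld      # churn the filesystem
    zfs snapshot -r zroot@auto-002
    # ...repeat several build/snapshot cycles, then back it all up:
    zfs send -R zroot@auto-002 | zfs recv -uv backup/zroot
    zpool export backup
    zpool import backup
    zfs destroy -r backup/zroot             # step 7: the command that panics
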
>>
>> That is what got me in a reboot loop after the *first* panic; I was
>> simply going to destroy the backup filesystem and re-run the backup, but
>> as soon as I issued that zfs destroy the machine panic'd and as soon as
>> I re-attached it after a reboot it panic'd again.  Repeat until I set
>> trim=0.
>>
>> But... if I CAN replicate it, that still shouldn't be happening; the
>> system should *certainly* survive an attempt to TRIM on a vdev that
>> doesn't support TRIM, even if the removal frees a large amount of
>> space and/or many files on the target, without blowing up.
>>
>> BTW I bet it isn't that rare -- you can hit it if you're taking timed
>> snapshots on an active filesystem (with lots of entropy) and then make
>> the mistake of trying to remove those snapshots (as is the case with a
>> zfs destroy -r, or a zfs recv of an incremental copy that attempts to
>> sync against a source) on a pool that has been imported before the
>> system realizes that TRIM is unavailable on those vdevs.
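
(For the zfs recv case, the dangerous variant is a forced receive of a
replication stream -- with -F, snapshots and datasets that no longer
exist on the sending side get destroyed on the target, queuing the same
kind of bulk frees as a zfs destroy -r.  Names are illustrative again:)

    zfs send -R -I zroot@auto-001 zroot@auto-014 | zfs recv -Fuv backup/zroot
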
>>
>> Noting this:
>>
>>      Yes, I need to find some time to have a look at it, but given how
>>      rare this is, and with TRIM being re-implemented upstream in a
>>      totally different manner, I'm reluctant to spend any real time on it.
>>
>> What's in-process in this regard, if you happen to have a reference?
> Looks like it may still be in review: https://reviews.csiden.org/r/263/
>
>
Initial attempts to provoke the panic have failed on the sandbox machine
-- it appears that I need a materially-fragmented backup volume (which
makes sense, as that would greatly increase the number of TRIMs queued.)
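
(If the pool has the spacemap_histogram feature active, the free-space
fragmentation can be read directly; "backup" is just the illustrative
pool name here:)

    zpool get fragmentation backup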

Running a bunch of builds with snapshots taken in between generates a
metric ton of entropy in the filesystem, but it appears that the number
of TRIMs actually issued when you bulk-remove those snapshots (with zfs
destroy -r) is small enough not to trigger the panic -- probably because
the system issues one TRIM per contiguous area of freed disk, and since
there is no interleaving with other (non-removed) data, that free space
is barely fragmented and the TRIM count stays "reasonable".

The TRIMs *are* attempted, and they *do* fail, however.....
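
(The attempts and failures should also show up in the TRIM counters --
on 10.x/11.x these live under kstat.zfs.misc.zio_trim, if memory
serves:)

    sysctl kstat.zfs.misc.zio_trim   # bytes/success/failed/unsupported counters
    sysctl vfs.zfs.trim              # TRIM tuning, incl. vfs.zfs.trim.enabled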

I'm running with six pages of kstack now on the production machine, and
we'll see if I get another panic...
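
(For reference, the two knobs in play, assuming the kstack bump was done
via the loader tunable rather than a KSTACK_PAGES kernel rebuild:)

    # /boot/loader.conf
    kern.kstack_pages=6        # larger kernel stacks (amd64 default is 4)
    vfs.zfs.trim.enabled=0     # the "trim=0" workaround mentioned earlier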

-- 
Karl Denninger
karl@denninger.net
/The Market Ticker/
/[S/MIME encrypted email preferred]/
