Date:      Thu, 21 Jul 2016 08:04:52 -0500
From:      Karl Denninger <karl@denninger.net>
To:        freebsd-stable@freebsd.org
Subject:   Re: Panic on BETA1 in the ZFS subsystem
Message-ID:  <03cdf671-a7a8-12ac-3204-e5a1bf1ef062@denninger.net>
In-Reply-To: <6cb46059-85c8-0c3b-7346-773647f1a962@FreeBSD.org>
References:  <8f44bc09-1237-44d0-fe7a-7eb9cf4fe85b@denninger.net> <54e5974c-312e-c33c-ab83-9e1148618ddc@FreeBSD.org> <97cf5283-683b-83fd-c484-18c14973b065@denninger.net> <c2f24b1e-be84-bcdd-ea0b-515cd2aca266@FreeBSD.org> <1f064549-fa72-fe9b-d66d-85923437bb9b@denninger.net> <6cb46059-85c8-0c3b-7346-773647f1a962@FreeBSD.org>


On 7/21/2016 07:52, Andriy Gapon wrote:
> On 21/07/2016 15:25, Karl Denninger wrote:
>> The crash occurred during a backup script operating, which is (roughly)
>> the following:
>>
>> zpool import -N backup (mount the pool to copy to)
>>
>> iterate over a list of zfs filesystems and...
>>
>> zfs rename fs@zfs-base fs@zfs-old
>> zfs snapshot fs@zfs-base
>> zfs send -RI fs@zfs-old fs@zfs-base | zfs receive -Fudv backup
>> zfs destroy -vr fs@zfs-old
>>
>> The first filesystem to be done is the rootfs, that is when it panic'd,
>> and from the traceback it appears that the Zio's in there are from the
>> backup volume, so the answer to your question is "yes".
> I think that what happened here was that a quite large number of TRIM
> requests was queued by ZFS before it had a chance to learn that the
> target vdev in the backup pool did not support TRIM.  So, when the
> first request failed with ENOTSUP the vdev was marked as not supporting
> TRIM.  After that all subsequent requests were failed without sending
> them down the storage stack.  But the way it is done means that all the
> requests were processed by the nested zio_execute() calls on the same
> stack.  And that led to the stack overflow.
>
> Steve, do you think that this is a correct description of what happened?
>
> The state of the pools that you described below probably contributed to
> the avalanche of TRIMs that caused the problem.
>

The source for the backup is a pool composed entirely of SSDs (and
thus supports TRIM), and the target is a pair of spinning rust devices
(which of course do not support TRIM); the incremental receive to that
pool does (of course) remove all the obsolete snapshots.

What I don't understand, however, is why it has been running fine for a
week or so, and why it immediately repeated the panic on a retry attempt
-- or how to prevent it, at least at this point.  I certainly do not
want to leave the pool mounted when not in active backup use.
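
For reference, the backup procedure quoted above can be sketched as a
small sh script.  This is only a rough reconstruction from the steps
listed in the quote, not the actual script: the function names
backup_fs and backup_all and the DRY_RUN switch are hypothetical,
added here for illustration.  With DRY_RUN=1 (the default in this
sketch) the commands are only printed, so nothing touches a pool.

```shell
#!/bin/sh
# Sketch of the backup steps quoted above.  backup_fs, backup_all and
# DRY_RUN are hypothetical names; only the zpool/zfs command lines come
# from the quoted description.  DRY_RUN=1 prints instead of executing.
: "${DRY_RUN:=1}"

# Rotate the base snapshot of one filesystem and send the increment
# to the pool named "backup".
backup_fs() {
    fs=$1
    if [ "$DRY_RUN" = 1 ]; then
        cat <<EOF
zfs rename ${fs}@zfs-base ${fs}@zfs-old
zfs snapshot ${fs}@zfs-base
zfs send -RI ${fs}@zfs-old ${fs}@zfs-base | zfs receive -Fudv backup
zfs destroy -vr ${fs}@zfs-old
EOF
        return 0
    fi
    # Note: the status of the pipeline is that of "zfs receive" only.
    zfs rename "${fs}@zfs-base" "${fs}@zfs-old" &&
    zfs snapshot "${fs}@zfs-base" &&
    zfs send -RI "${fs}@zfs-old" "${fs}@zfs-base" | zfs receive -Fudv backup &&
    zfs destroy -vr "${fs}@zfs-old"
}

# Import the target pool without mounting it, then process each
# filesystem given as an argument, stopping at the first failure.
backup_all() {
    if [ "$DRY_RUN" = 1 ]; then
        echo "zpool import -N backup"
    else
        zpool import -N backup || return 1
    fi
    for fs in "$@"; do
        backup_fs "$fs" || return 1
    done
}
```

Calling backup_all with a list of filesystem names in dry-run mode
prints the import followed by the four commands for each filesystem,
in the order in which the quoted description runs them.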

-- 
Karl Denninger
karl@denninger.net <mailto:karl@denninger.net>
/The Market Ticker/
/[S/MIME encrypted email preferred]/


Want to link to this message? Use this URL: <https://mail-archive.FreeBSD.org/cgi/mid.cgi?03cdf671-a7a8-12ac-3204-e5a1bf1ef062>