Date: Thu, 21 Jul 2016 08:04:52 -0500
From: Karl Denninger <karl@denninger.net>
To: freebsd-stable@freebsd.org
Subject: Re: Panic on BETA1 in the ZFS subsystem
Message-ID: <03cdf671-a7a8-12ac-3204-e5a1bf1ef062@denninger.net>
In-Reply-To: <6cb46059-85c8-0c3b-7346-773647f1a962@FreeBSD.org>
References: <8f44bc09-1237-44d0-fe7a-7eb9cf4fe85b@denninger.net> <54e5974c-312e-c33c-ab83-9e1148618ddc@FreeBSD.org> <97cf5283-683b-83fd-c484-18c14973b065@denninger.net> <c2f24b1e-be84-bcdd-ea0b-515cd2aca266@FreeBSD.org> <1f064549-fa72-fe9b-d66d-85923437bb9b@denninger.net> <6cb46059-85c8-0c3b-7346-773647f1a962@FreeBSD.org>

On 7/21/2016 07:52, Andriy Gapon wrote:
> On 21/07/2016 15:25, Karl Denninger wrote:
>> The crash occurred during a backup script operating, which is (roughly)
>> the following:
>>
>> zpool import -N backup (mount the pool to copy to)
>>
>> iterate over a list of zfs filesystems and...
>>
>> zfs rename fs@zfs-base fs@zfs-old
>> zfs snapshot fs@zfs-base
>> zfs send -RI fs@zfs-old fs@zfs-base | zfs receive -Fudv backup
>> zfs destroy -vr fs@zfs-old
>>
>> The first filesystem to be done is the rootfs, that is when it panic'd,
>> and from the traceback it appears that the Zio's in there are from the
>> backup volume, so the answer to your question is "yes".
> I think that what happened here was that a quite large number of TRIM
> requests was queued by ZFS before it had a chance to learn that the
> target vdev in the backup pool did not support TRIM. So, when the
> first request failed with ENOTSUP the vdev was marked as not supporting
> TRIM. After that all subsequent requests were failed without sending
> them down the storage stack. But the way it is done means that all the
> requests were processed by the nested zio_execute() calls on the same
> stack. And that led to the stack overflow.
>
> Steve, do you think that this is a correct description of what happened?
>
> The state of the pools that you described below probably contributed to
> the avalanche of TRIMs that caused the problem.
>
The source for the backup is a pool composed entirely of SSDs (and
thus supports TRIM), and the target is a pair of spinning rust devices
(which of course do not support TRIM); the incremental receive to that
pool does (of course) remove all the obsolete snapshots.

What I don't understand, however, is why it has been running fine for a
week or so, and why it immediately repeated the panic on a retry attempt
-- or how to prevent it, at least at this point. I certainly do not
want to leave the pool mounted when not in active backup use.
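
For reference, the backup sequence quoted above can be sketched as a small
shell loop. This is only an illustration under stated assumptions, not the
actual script: the dataset names in FILESYSTEMS are hypothetical, the
DRY_RUN switch and the run helper are invented for the sketch, and the
final zpool export is an assumption based on not wanting the pool left
imported between runs.

```shell
#!/bin/sh
# Sketch of the backup loop described in the quoted text.
# DRY_RUN=1 (the default here) only prints each command; DRY_RUN=0 runs it.
DRY_RUN="${DRY_RUN:-1}"
BACKUP_POOL="${BACKUP_POOL:-backup}"
# Hypothetical dataset names; the real list is not given in the thread.
FILESYSTEMS="${FILESYSTEMS:-zroot/ROOT/default zroot/usr/home}"

# Print the command line in dry-run mode, otherwise execute it.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "$*"
    else
        eval "$*"
    fi
}

backup_all() {
    # -N imports the pool without mounting any of its datasets.
    run "zpool import -N ${BACKUP_POOL}" || return 1

    for fs in $FILESYSTEMS; do
        # Rotate the base snapshot, take a new one, send the increment,
        # then drop the old base on the source side.
        run "zfs rename ${fs}@zfs-base ${fs}@zfs-old" &&
        run "zfs snapshot ${fs}@zfs-base" &&
        run "zfs send -RI ${fs}@zfs-old ${fs}@zfs-base | zfs receive -Fudv ${BACKUP_POOL}" &&
        run "zfs destroy -vr ${fs}@zfs-old" || return 1
    done

    # Assumption: the pool is exported again once the backup completes.
    run "zpool export ${BACKUP_POOL}"
}

backup_all
```

Run as-is it only prints the commands it would issue; the loop stops at the
first failing step so a broken send does not destroy the old base snapshot.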
--
Karl Denninger
karl@denninger.net <mailto:karl@denninger.net>
/The Market Ticker/
/[S/MIME encrypted email preferred]/
Want to link to this message? Use this URL: <https://mail-archive.FreeBSD.org/cgi/mid.cgi?03cdf671-a7a8-12ac-3204-e5a1bf1ef062>
