FreeBSD Mail Archives

Date:      Tue, 30 Apr 2019 08:33:47 -0500
From:      Karl Denninger <karl@denninger.net>
To:        freebsd-stable@freebsd.org
Subject:   Re: ZFS...
Message-ID:  <f868b452-40e9-f2c8-cdee-dde5e53a214c@denninger.net>
In-Reply-To: <17B373DA-4AFC-4D25-B776-0D0DED98B320@sorbs.net>
References:  <30506b3d-64fb-b327-94ae-d9da522f3a48@sorbs.net> <CAOtMX2gf3AZr1-QOX_6yYQoqE-H%2B8MjOWc=eK1tcwt5M3dCzdw@mail.gmail.com> <56833732-2945-4BD3-95A6-7AF55AB87674@sorbs.net> <3d0f6436-f3d7-6fee-ed81-a24d44223f2f@netfence.it> <17B373DA-4AFC-4D25-B776-0D0DED98B320@sorbs.net>



On 4/30/2019 03:09, Michelle Sullivan wrote:
> Consider..
>
> If one triggers such a fault on a production server, how can one justify transferring from backup multiple terabytes (or even petabytes now) of data to repair an unmountable/faulted array.... because all backup solutions I know currently would take days if not weeks to restore the sort of store ZFS is touted with supporting.  

Had it happen on a production server a few years back with ZFS.  The
*hardware* went insane (disk adapter) and scribbled on *all* of the vdevs.

The machine crashed and would not come back up -- at all.  I insist on
emergency boot media (a USB key) physically in the box in any production
machine, and had it here; it was quickly obvious that all of the vdevs
were corrupted beyond repair.  There was no rational option other than
to restore.

It was definitely not a pleasant experience, but this is why, once you
get into systems and data store sizes where a restore is a five-alarm
pain in the neck, you must figure out a strategy that covers you 99% of
the time without a large amount of downtime, and accept that downtime in
the 1% case.  In this particular circumstance the customer originally
didn't want to spend on a doubled, transaction-level-protected on-site
(same DC) redundancy setup.  So restore was the most viable option, as
opposed to fail-over/promote followed by restoring and building a new
"redundant" box where the old "primary" resided.  Time to recover
essential functions was ~8 hours (and over 24 hours for everything to be
restored).

Incidentally that's not the first time I've had a disk adapter failure
on a production machine in my career as a systems dude; it was, in fact,
the *third* such failure.  Then again I've been doing this stuff since
the 1980s and learned long ago that if it can break it eventually will,
and that Murphy is a real b******.

The answer to your question, Michelle, is that when restore times get
into "seriously disruptive" amounts of time (e.g. hours, days or worse,
depending on the application involved and how critical it is) you spend
the time and money to have redundancy in multiple places, via paths that
do not destroy the redundant copies when things go wrong.  You also
spend the engineering time to figure out what those potential faults are
and how to design things so that a fault which can destroy the data set
does not propagate to the redundant copies before it is detected.
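
As a rough illustration of when restore times become "seriously
disruptive", here is a minimal back-of-the-envelope sketch.  The function
name, the data sizes and the 500 MB/s sustained throughput are all
hypothetical assumptions chosen for the example, not figures from the
incident described above.

```python
# Rough restore-time estimate. All figures are hypothetical assumptions,
# not measurements from any real system.

def restore_hours(data_tb: float, throughput_mb_s: float) -> float:
    """Hours needed to move data_tb terabytes at a sustained rate in MB/s."""
    if throughput_mb_s <= 0:
        raise ValueError("throughput must be positive")
    data_mb = data_tb * 1_000_000  # decimal units: 1 TB = 1,000,000 MB
    return data_mb / throughput_mb_s / 3600


if __name__ == "__main__":
    # Assumed sustained restore rate of 500 MB/s for every case.
    for size_tb in (10, 100, 1000):
        hours = restore_hours(size_tb, throughput_mb_s=500)
        print(f"{size_tb:>5} TB at 500 MB/s: {hours:8.1f} h ({hours / 24:.1f} days)")
```

Under those assumed numbers a petabyte-scale restore runs to weeks, which
is the point at which a second, independently protected copy starts to
look cheap.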

-- 
Karl Denninger
karl@denninger.net <mailto:karl@denninger.net>
/The Market Ticker/
/[S/MIME encrypted email preferred]/



Want to link to this message? Use this URL: <https://mail-archive.FreeBSD.org/cgi/mid.cgi?f868b452-40e9-f2c8-cdee-dde5e53a214c>
