Date:      Fri, 8 Mar 2013 14:32:07 -0800
From:      Artem Belevich <art@freebsd.org>
To:        Larry Rosenman <ler@lerctr.org>
Cc:        Tom Evans <tevans.uk@googlemail.com>, freebsd-fs <freebsd-fs@freebsd.org>, ronald-freebsd8@klop.yi.org
Subject:   Re: zfs send/recv invalid data
Message-ID:  <CAFqOu6iZKzbb5_WWs4SJXVQUJUctuijQevod_G3UUy60vydDMg@mail.gmail.com>
In-Reply-To: <ed3aa065c0102aec22b6d2de68b69aa2@webmail.lerctr.org>
References:  <7c93aef20a88cdbcca85739e67470dce@webmail.lerctr.org> <25894116c93a59dab1fd976b425c36d1@webmail.lerctr.org> <07b59d5d4b2a3dab1385b054eea4f2da@webmail.lerctr.org> <A3EA720372254541BCCB6C92DBA36A0E@multiplay.co.uk> <7619c6383449c7a316edb1cdffc98c54@webmail.lerctr.org> <alpine.BSF.2.00.1303052052300.52489@thebighonker.lerctr.org> <473BCD0A4A7A4E6AB08409A9B0C82363@multiplay.co.uk> <d7c08e218ef301f1202354d0b11a6742@webmail.lerctr.org> <CAFHbX1LC25pyOBUGJ-PFxzTOXvJPYOF15H0%2BfTk6qfCyT-Q8fA@mail.gmail.com> <6dcfb2284551025af3cf58703a2b5cdc@webmail.lerctr.org> <920990505611cd96a075c80d06691bb0@webmail.lerctr.org> <201303061857.r26IvLnc024186@higson.cam.lispworks.com> <fc81037894dbc014853ba8fed06e427f@webmail.lerctr.org> <9e133e088f6c3c3dafdc0a99eb7c48c1@webmail.lerctr.org> <218B0537E987442EAB8027EA478F8BE9@multiplay.co.uk> <e84db628ac7be2bea86d0f4fd8d47b36@webmail.lerctr.org> <5378B8D5A65A4F19BE1CF6C25A9CFE22@multiplay.co.uk> <ccbde3358460c03a46ed75ff1d8895f9@webmail.lerctr.org> <2DCDD0136DE7498E9DA95D5CDA3F60CC@multiplay.co.uk> <1c4687bbf52352119abbc7d12334cef1@webmail.lerctr.org> <ed3aa065c0102aec22b6d2de68b69aa2@webmail.lerctr.org>

On Fri, Mar 8, 2013 at 1:40 PM, Larry Rosenman <ler@lerctr.org> wrote:
> On 2013-03-07 22:47, Larry Rosenman wrote:
>>
>> Will try this with tonight's stream(s) if it fails.
>>
>> appreciate.....
>
> [trimmed copied text]
>
> It worked fine last night from cron(8).
>
> Will keep an eye on it.
>
> I hate intermittent / unpredictable failures with a passion :)

Do you use ECC memory? If not, it may be a good idea to test the memory
on both machines if you haven't already.

If you do use ECC, grep your logs for 'MCA'. Correctable errors are
typically reported via machine check exceptions and logged, but YMMV.
I was unable to make my Asus board (P5BV/SAS) report anything at all,
even though in one experiment I physically taped off one data bit on
one DIMM. The system still booted and worked, but I got no exceptions.
A Supermicro board I had at work (X8-something) did report correctable
ECC errors when one DIMM went marginally bad.
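
The log check above can be sketched roughly as follows. This is only
an illustration, not a tested diagnostic: the helper names
(find_mca_lines, scan_log) are hypothetical, and the log path
/var/log/messages is an assumption that may not match how syslog is
configured on a given machine.

```python
import re

# Assumption: kernel messages are written to this file. Adjust as needed.
DEFAULT_LOG = "/var/log/messages"

# Match "MCA" as a whole word so longer tokens that merely contain
# those letters are not reported.
MCA_PATTERN = re.compile(r"\bMCA\b")


def find_mca_lines(lines):
    """Return the lines that mention MCA, without trailing newlines."""
    return [line.rstrip("\n") for line in lines if MCA_PATTERN.search(line)]


def scan_log(path=DEFAULT_LOG):
    """Read a log file and return the lines that mention MCA."""
    # errors="replace" keeps the scan going if the log has odd bytes.
    with open(path, errors="replace") as log:
        return find_mca_lines(log)
```

Calling scan_log() with no arguments reads the assumed default path.
An empty result only means nothing was logged; as noted above, some
boards never report correctable errors at all.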

--Artem


