Date: Fri, 8 Mar 2013 14:32:07 -0800
From: Artem Belevich <art@freebsd.org>
To: Larry Rosenman <ler@lerctr.org>
Cc: Tom Evans <tevans.uk@googlemail.com>, freebsd-fs <freebsd-fs@freebsd.org>, ronald-freebsd8@klop.yi.org
Subject: Re: zfs send/recv invalid data
Message-ID: <CAFqOu6iZKzbb5_WWs4SJXVQUJUctuijQevod_G3UUy60vydDMg@mail.gmail.com>
In-Reply-To: <ed3aa065c0102aec22b6d2de68b69aa2@webmail.lerctr.org>
References: <7c93aef20a88cdbcca85739e67470dce@webmail.lerctr.org>
 <25894116c93a59dab1fd976b425c36d1@webmail.lerctr.org>
 <07b59d5d4b2a3dab1385b054eea4f2da@webmail.lerctr.org>
 <A3EA720372254541BCCB6C92DBA36A0E@multiplay.co.uk>
 <7619c6383449c7a316edb1cdffc98c54@webmail.lerctr.org>
 <alpine.BSF.2.00.1303052052300.52489@thebighonker.lerctr.org>
 <473BCD0A4A7A4E6AB08409A9B0C82363@multiplay.co.uk>
 <d7c08e218ef301f1202354d0b11a6742@webmail.lerctr.org>
 <CAFHbX1LC25pyOBUGJ-PFxzTOXvJPYOF15H0%2BfTk6qfCyT-Q8fA@mail.gmail.com>
 <6dcfb2284551025af3cf58703a2b5cdc@webmail.lerctr.org>
 <920990505611cd96a075c80d06691bb0@webmail.lerctr.org>
 <201303061857.r26IvLnc024186@higson.cam.lispworks.com>
 <fc81037894dbc014853ba8fed06e427f@webmail.lerctr.org>
 <9e133e088f6c3c3dafdc0a99eb7c48c1@webmail.lerctr.org>
 <218B0537E987442EAB8027EA478F8BE9@multiplay.co.uk>
 <e84db628ac7be2bea86d0f4fd8d47b36@webmail.lerctr.org>
 <5378B8D5A65A4F19BE1CF6C25A9CFE22@multiplay.co.uk>
 <ccbde3358460c03a46ed75ff1d8895f9@webmail.lerctr.org>
 <2DCDD0136DE7498E9DA95D5CDA3F60CC@multiplay.co.uk>
 <1c4687bbf52352119abbc7d12334cef1@webmail.lerctr.org>
 <ed3aa065c0102aec22b6d2de68b69aa2@webmail.lerctr.org>

On Fri, Mar 8, 2013 at 1:40 PM, Larry Rosenman <ler@lerctr.org> wrote:
> On 2013-03-07 22:47, Larry Rosenman wrote:
>>
>> Will try this with tonights stream(s) if it fails.
>>
>> appreciate.....
>
> [trimmed copied text]
>
> I worked fine last night from cron(8).
>
> Will keep an eye on it.
>
> I hate intermittent / unpredictable failures with a passion :)

Do you use ECC memory? If not, it may be a good idea to test memory on
both machines if you didn't do it already.

If you do use ECC, grep your logs for 'MCA'. If there are correctable
errors, they would typically be reported via machine check exceptions
and logged.

YMMV. I was unable to make my Asus board (P5BV/SAS) report anything at
all even though in one experiment I had physically taped off one data
bit on one DIMM -- system still boots and works, but I get no
exceptions. Supermicro board I had at work (X8-something) did report
correctable ECC errors when one DIMM went marginally bad.

--Artem
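
For anyone who would rather script the log check suggested above than run grep by hand, here is a minimal sketch in Python. Everything in it is an assumption for illustration: the helper name `find_mca_lines` is hypothetical, the sample log lines are made up, and the exact wording of machine check messages may differ between systems and releases.

```python
import re

# Match "MCA" as a whole word so unrelated tokens such as "MCAST" are skipped.
MCA_PATTERN = re.compile(r"\bMCA\b")


def find_mca_lines(lines):
    """Return every line that mentions MCA, with the trailing newline removed."""
    return [line.rstrip("\n") for line in lines if MCA_PATTERN.search(line)]


# Made-up sample input; real log contents will look different.
sample_log = [
    "Mar  8 01:02:03 host kernel: MCA: Bank 4, Status 0x0\n",
    "Mar  8 01:02:04 host sshd[1]: session opened\n",
    "Mar  8 01:02:05 host kernel: IGMP MCAST join\n",
]

for hit in find_mca_lines(sample_log):
    print(hit)
```

To scan a real log, pass an open file object instead of the sample list, for example `find_mca_lines(open("/var/log/messages", errors="replace"))`. The path is an assumption too; use whichever file your syslog configuration writes kernel messages to.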