Date: Sun, 16 Apr 2017 09:26:10 -0500
From: Karl Denninger <karl@denninger.net>
To: freebsd-hardware@freebsd.org
Subject: Re: SSD errors
Message-ID: <d64efa9d-ebe6-b141-44ae-0aad07032a60@denninger.net>
In-Reply-To: <02898e76-9285-03e7-e76a-77a5290376b9@fjl.co.uk>
References: <20170413205932.GJ2149@shrubbery.net> <02898e76-9285-03e7-e76a-77a5290376b9@fjl.co.uk>
On 4/16/2017 03:49, Frank Leonhardt wrote:
> On 13/04/2017 21:59, heasley wrote:
>> <snip>
>> When I push a lot of data to them, such as an rsync, I receive errors
>> like the below. If I move drives between slots, it seems to follow the
>> chassis slots, those closest to the power supply, but I'm not positive
>> about this.
>>
>> I suppose the questions for list are:
>> - have I missed any fbsd ssd-specific configuration?
>>
>> - all 4 have non-zero UDMA_CRC_Error_Count counters; not many, about the
>>   same number, which I believe implies electrical interference - most
>>   likely in the cable or chassis backplane. Should I buy some specific
>>   model cable? other recommendations?
> <snip>
>
> I'm not aware of any SSD-specific stuff you've missed. The SSD option
> on the initialisation code in the BIOS is probably just there because
> there's no need to wait for spin-up time (as you probably thought too).
>
> So I don't have an answer, but here are a few thoughts:
>
> I think it's the CRC error (out of that lot) that you should be
> worried about. It means that the drive wrote data, but when it read it
> back it didn't match. With ST506 this could be (and often was) a cable
> fault, but not with IDE. This doesn't mean dodgy cables can't cause you
> problems with IDE; only that they'd manifest differently. If the drive
> wrote the data to the flash with a CRC and then the CRC didn't match
> later, it doesn't make any difference if the data was corrupted on
> its way to the drive, or even if it was corrupted on its way back
> (ZFS would pick that up). So it must have been corrupted on-drive.
> Right? (I could be wrong about where your CRC errors are being
> tested/detected, so not necessarily right).
>
> So with this in mind, why should the drive's location on the shelf
> matter (if it does make a difference)? I can think of two reasons -
> electromagnetic interference from adjacent circuits or PSU problems.
>
> So if it were me, I'd check the interference theory by using longer
> cables and spreading the drives out. Serial transfer on long cables
> isn't really a problem like it was with parallel. That's the easy check.
>
> Then it's on to PSU issues. Does an SSD use more or less power than
> spinning rust? Really? Most people assume they'll use less but it's
> not as much less as you think, and it varies in different ways. The
> question is whether the PSU can cope with the peak (e.g. while it's
> writing).
>
> IT people will know all about watts. Add up the number of watts on all
> your drives and if it's <= the number of watts written on your PSU,
> cushty.
>
> Wrong! An engineer will tell you you can't add watts together and get
> anything meaningful. And believing the label on a PSU is a mug's game.
> So, if you've got a decent oscilloscope take a look at the supply
> rails where they enter the drives. Try writing, and if you get so much
> as a blip on the voltage then do something about it.
>
> If you haven't got a 'scope to hand, I'd try running (some of) the
> drives off a different PSU and see if that makes a difference.
>
> Although I haven't hit this problem myself, I'd be surprised if the
> same PSU design intended to power spinning rust at a relatively
> constant current could cope well with an SSD going from nothing much
> to lots to nothing much again over a very short space of time. If I
> was connecting a different PSU to the SSD I'd load it with some real
> drives just to stabilise the current output a bit (i.e. plug an old
> drive or two on to some of the other spare outlets).
>
> Then there's always the chance it's over-cooking, but I think you'd
> have mentioned if they were getting very hot.
>
> Regards, Frank.
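Frank's point that a summed wattage figure hides the real constraint can be illustrated with a per-rail check. This is only a rough sketch: the function name `rail_headroom`, the rail labels, and every number below are hypothetical, not taken from the thread or from any datasheet. It is also a static comparison only; it says nothing about how a supply responds to fast load transients, which is what the oscilloscope test is for.

```python
def rail_headroom(psu_amps, drive_peak_amps):
    """Return remaining amps per rail after subtracting summed drive peaks.

    psu_amps:        {rail name: rated amps}, e.g. {"5V": 4.0, "12V": 30.0}
    drive_peak_amps: list of {rail name: peak amps}, one dict per drive
    All values are illustrative assumptions supplied by the caller.
    """
    totals = {rail: 0.0 for rail in psu_amps}
    for drive in drive_peak_amps:
        for rail, amps in drive.items():
            totals[rail] = totals.get(rail, 0.0) + amps
    # Negative headroom means the summed peaks exceed that rail's rating.
    return {rail: psu_amps.get(rail, 0.0) - total for rail, total in totals.items()}


# Hypothetical case: four SSDs drawing 1.5 A peak each, all from the 5V rail.
headroom = rail_headroom({"5V": 4.0, "12V": 30.0}, [{"5V": 1.5}] * 4)
print(headroom)
```

In this made-up case the total draw is 30 W against a supply that could be labelled for hundreds of watts, yet the 5V rail alone is oversubscribed, which is the kind of mismatch a watts-only sum would miss.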
Flaky power has been the cause of more intermittent and very odd
problems, especially under load, than you can count.

I always get suspicious of power issues when the system seems fine right
up until you place it under heavy load, then bad things happen -- and
I'm usually right.

I second Frank's suggestion.

-- 
Karl Denninger
karl@denninger.net
/The Market Ticker/
/[S/MIME encrypted email preferred]/
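One way to test whether the errors follow the slot or the drive is to snapshot UDMA_CRC_Error_Count before and after a heavy transfer and compare the growth per drive. The sketch below assumes the usual ten-column attribute table printed by `smartctl -A`; the helper names and the device labels in the example are hypothetical. The counter is cumulative, so the growth under load matters more than the absolute value.

```python
def crc_error_count(smartctl_output):
    """Return the raw UDMA_CRC_Error_Count from `smartctl -A` text, or None.

    Assumes attribute rows of the form:
    ID# NAME FLAG VALUE WORST THRESH TYPE UPDATED WHEN_FAILED RAW_VALUE
    """
    for line in smartctl_output.splitlines():
        fields = line.split()
        if len(fields) >= 10 and fields[1] == "UDMA_CRC_Error_Count":
            return int(fields[9])
    return None


def crc_deltas(before, after):
    """Growth of the counter per drive between two snapshots (dicts)."""
    return {name: after[name] - before[name] for name in before if name in after}


# Hypothetical attribute row, as captured from one drive.
sample = "199 UDMA_CRC_Error_Count    0x003e   200   200   000    Old_age   Always       -       12"
print(crc_error_count(sample))

# Hypothetical snapshots taken before and after an rsync run.
print(crc_deltas({"ada0": 12, "ada1": 3}, {"ada0": 40, "ada1": 3}))
```

Repeating the comparison after swapping drives between slots would show whether the growth stays with the slot (pointing at cabling, backplane, or power) or moves with the drive.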
Want to link to this message? Use this URL: <https://mail-archive.FreeBSD.org/cgi/mid.cgi?d64efa9d-ebe6-b141-44ae-0aad07032a60>
