Date:      Sun, 16 Apr 2017 09:26:10 -0500
From:      Karl Denninger <karl@denninger.net>
To:        freebsd-hardware@freebsd.org
Subject:   Re: SSD errors
Message-ID:  <d64efa9d-ebe6-b141-44ae-0aad07032a60@denninger.net>
In-Reply-To: <02898e76-9285-03e7-e76a-77a5290376b9@fjl.co.uk>
References:  <20170413205932.GJ2149@shrubbery.net> <02898e76-9285-03e7-e76a-77a5290376b9@fjl.co.uk>

On 4/16/2017 03:49, Frank Leonhardt wrote:
> On 13/04/2017 21:59, heasley wrote:
>> <snip>
>> When I push a lot of data to them, such as an rsync, I receive errors
>> like
>> the below.  If I move drives between slots, it seems to follow the
>> chassis
>> slots, those closest to the power supply, but I'm not positive about
>> this.
>>
>> I suppose the questions for list are:
>> - have I missed any fbsd ssd-specific configuration?
>>
>> - all 4 have non-zero UDMA_CRC_Error_Count counters; not many, about the
>>    same number, which I believe implies electrical interference - most
>>    likely in the cable or chassis backplane.  Should I buy some specific
>>    model cable?  other recommendations?
> <snip>
>
> I'm not aware of any SSD-specific stuff you've missed. The SSD option
> on the initialisation code in the BIOS is probably just there because
> there's no need to wait for spin-up time (as you probably thought too).
>
> So I don't have an answer, but here are a few thoughts:
>
> I think it's the CRC error (out of that lot) that you should be
> worried about. It means that the drive wrote data, but when it read it
> back it didn't match. With ST506 this could be (and often was) a cable
> fault but not with IDE. This doesn't mean dodgy cables can't cause you
> problems with IDE; only that they'd manifest differently. If the drive
> wrote the data to the flash with a CRC and then the CRC didn't match
> later, it doesn't make any difference if the data was corrupted on
> its way to the drive, or even if it was corrupted on its way back
> (ZFS would pick that up). So it must have been corrupted on-drive.
> Right? (I could be wrong about where your CRC errors are being
> tested/detected, so not necessarily right).
>
> So with this in mind, why should the drive's location on the shelf
> matter (if it does make a difference)? I can think of two reasons -
> electromagnetic interference from adjacent circuits or PSU problems.
>
> So if it were me, I'd check the interference theory by using longer
> cables and spreading the drives out. Serial transfer on long cables
> isn't really a problem like it was with parallel. That's the easy check.
>
> Then it's on to PSU issues. Does an SSD use more or less power than
> spinning rust? Really? Most people assume they'll use less but it's
> not as much less as you think, and it varies in different ways. The
> question is whether the PSU can cope with the peak (e.g. while it's
> writing).
>
> IT people will know all about watts. Add up the number of watts on all
> your drives and if it's <= the number of watts written on your PSU,
> cushty.
>
> Wrong! An engineer will tell you you can't add watts together and get
> anything meaningful. And believing the label on a PSU is a mug's game.
> So, if you've got a decent oscilloscope take a look at the supply
> rails where they enter the drives. Try writing, and if you get so much
> as a blip on the voltage then do something about it.
>
> If you haven't got a 'scope to hand, I'd try running (some of) the
> drives off a different PSU and see if that makes a difference.
>
> Although I haven't hit this problem myself, I'd be surprised if the
> same PSU design intended to power spinning rust at a relatively
> constant current could cope well with an SSD going from nothing much
> to lots to nothing much again over a very short space of time. If I
> was connecting a different PSU to the SSD I'd load it with some real
> drives just to stabilise the current output a bit (i.e. plug an old
> drive or two on to some of the other spare outlets).
>
> Then there's always the chance it's over-cooking, but I think you'd
> have mentioned if they were getting very hot.
>
> Regards, Frank.

Flaky power has been the cause of more intermittent and very odd
problems, especially under load, than you can count.  I always get
suspicious of power issues when the system seems fine right up until you
place it under heavy load, then bad things happen -- and I'm usually right.

I second Frank's suggestion.
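
One way to check whether the errors really track load is to read the
UDMA_CRC_Error_Count raw value before and after a heavy transfer (such as
the rsync) and compare the two. What follows is only a minimal sketch in
Python. It assumes the text being parsed is the attribute table printed by
`smartctl -A` from smartmontools; the sample table and the names
`sample_output`, `raw_value` and `crc_delta` are made up for illustration
and are not from the original poster's system.

```python
# Sketch: compare a SMART counter before and after a load test.
# The table below is a hypothetical sample, not real output from this thread.

def sample_output(crc_raw):
    """Build a made-up excerpt of a `smartctl -A` attribute table."""
    return (
        "ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE"
        "      UPDATED  WHEN_FAILED RAW_VALUE\n"
        "  9 Power_On_Hours          0x0032   099   099   000    Old_age"
        "   Always       -       1234\n"
        "199 UDMA_CRC_Error_Count    0x003e   200   200   000    Old_age"
        f"   Always       -       {crc_raw}\n"
    )

def raw_value(smart_output, attribute="UDMA_CRC_Error_Count"):
    """Return the integer RAW_VALUE of one attribute, or None if absent."""
    for line in smart_output.splitlines():
        fields = line.split()
        # Attribute rows have ten columns; the name is the second one
        # and the raw value is the tenth.
        if len(fields) >= 10 and fields[1] == attribute:
            return int(fields[9])
    return None

def crc_delta(before, after):
    """Number of new CRC errors logged between two snapshots."""
    b, a = raw_value(before), raw_value(after)
    if b is None or a is None:
        raise ValueError("attribute not found in one of the snapshots")
    return a - b

if __name__ == "__main__":
    before = sample_output(17)   # snapshot taken while idle (hypothetical)
    after = sample_output(23)    # snapshot taken after the load (hypothetical)
    print("new CRC errors under load:", crc_delta(before, after))
```

In real use the two snapshots would be saved from the drive itself, once
while idle and once after the transfer. A counter that only climbs under
load, and stops climbing when the drive runs off a different supply, would
fit the power theory.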

-- 
Karl Denninger
karl@denninger.net
/The Market Ticker/
/[S/MIME encrypted email preferred]/



