Date: Mon, 15 Oct 2012 10:21:06 -0700
From: nate keegan <nate.keegan@gmail.com>
To: freebsd-hardware@freebsd.org
Subject: Re: ahcich Timeouts SATA SSD
Message-ID: <CABVjXffVSFvtgNfMX3BsHqDe-ntqC1rwPw2-HpPGgaoFG6js2w@mail.gmail.com>
In-Reply-To: <CABVjXfceHC3s0u6pMBWcPb1XqTrvVW52FN9G3A1Oh1F-UUVqNQ@mail.gmail.com>
References: <CABVjXfeV9VvF6sJC3Tb78z=jP%2B2sF%2BOJ2q0euCZkNqN_Yjs9ag@mail.gmail.com> <20121015095858.GC33428@server.rulingia.com> <CABVjXfceHC3s0u6pMBWcPb1XqTrvVW52FN9G3A1Oh1F-UUVqNQ@mail.gmail.com>
I took a look at the DDB man page, but I am not able to do this when the issue happens because the system is completely blown up (meaning no keyboard input on the IPMI console, in existing SSH sessions, etc.).

No changes have been seen in the ZFS load on the system. The nature of this system (backup) is such that the heaviest load would be created in the first week or so of going online: we use rsync to copy files down from our Windows servers, and during that first week the system has to 'seed' the initial copies, which is much heavier on I/O than afterwards, when things are relatively constant.

I have 48 GB of Crucial memory that I will put in this system today to replace the 24 GB or so of Kingston memory currently in it. If the issue happens again after the memory change, I plan on replacing both SSDs (Crucial M4) with two non-SSD SATA disks, on the theory that the Crucial firmware on the disks (002 on both) is somehow the culprit.

If neither change solves the issue, I will move on to 9.1-RC2, or 9.1-RELEASE if it is out by then, and add the kernel options requested.

The amount of monkeying I have had to do via /boot/loader.conf and the camcontrol script I run tells me that the SSD, the firmware on the SSD, etc. is somehow causing the issue, as we have plenty of other FreeBSD 8.x and 9.x systems that use non-SSD SATA drives without this issue popping up in their daily workloads.
My /boot/loader.conf looks like this currently:

# Set in the BIOS as well to activate
ahci_load="YES"

# Should be auto-negotiation in FreeBSD 9.x
# See ahci(4)
hint.ahcich.0.sata_rev=1
hint.ahcich.1.sata_rev=1
hint.ahcich.0.pm_level=1
hint.ahcich.1.pm_level=1

And /usr/local/etc/rc.d/camcontrol:

#!/bin/sh

CAMCONTROL=/sbin/camcontrol

# Disable NCQ
$CAMCONTROL tags ada0 -N 1 > /dev/null
$CAMCONTROL tags ada1 -N 1 > /dev/null

# Disable APM
$CAMCONTROL cmd ada0 -a "EF 85 00 00 00 00 00 00 00 00 00 00" > /dev/null
$CAMCONTROL cmd ada1 -a "EF 85 00 00 00 00 00 00 00 00 00 00" > /dev/null

Without both of these shims in place I get maybe 1.5 to two hours before the system goes kablooie, and that is without the system doing any real I/O work: just running FreeBSD during the business day plus a few scripts from cron that check for data and shuffle it around.
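For anyone wanting to try the same workaround on a different set of disks, here is a minimal sketch of how the camcontrol script above could be parameterized. It issues the same two camcontrol commands per disk as the original. The DISKS and DRY_RUN variables and the run/tune_disk helper names are assumptions introduced for this example, not part of the original script. DRY_RUN defaults to 1, so the commands are only printed unless it is explicitly set to 0.

```shell
#!/bin/sh
# Sketch only: DISKS, DRY_RUN, run and tune_disk are hypothetical names
# added for illustration; the camcontrol invocations match the original script.

CAMCONTROL=${CAMCONTROL:-/sbin/camcontrol}
DISKS=${DISKS:-"ada0 ada1"}
DRY_RUN=${DRY_RUN:-1}   # 1 = print the commands, 0 = actually run them

run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "$@"
    else
        "$@" > /dev/null
    fi
}

tune_disk() {
    # Limit the device to one outstanding tag ("Disable NCQ" in the original)
    run "$CAMCONTROL" tags "$1" -N 1
    # ATA SET FEATURES (0xEF) with subcommand 0x85 ("Disable APM" in the original)
    run "$CAMCONTROL" cmd "$1" -a "EF 85 00 00 00 00 00 00 00 00 00 00"
}

for disk in $DISKS; do
    tune_disk "$disk"
done
```

Running it as-is only prints the four commands for ada0 and ada1. Something like DISKS="ada2" DRY_RUN=0 would apply the settings to a single different disk, assuming camcontrol is at the path shown.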