Date: Fri, 28 Oct 2011 00:19:26 +0100
From: "Pegasus Mc Cleaft" <ken@mthelicon.com>
To: "'Alexander Kabaev'" <kabaev@gmail.com>, "'C. P. Ghost'" <cpghost@cordula.ws>
Cc: 'Alexey Shuvaev' <shuvaev@physik.uni-wuerzburg.de>, freebsd-current@freebsd.org
Subject: RE: Panics after AHCI timeouts
Message-ID: <005e01cc94fe$dfbe3390$9f3a9ab0$@com>
In-Reply-To: <20111027185957.54ece0ad@kan.dyndns.org>
References: <20111008201456.GA3529@lexx.ifp.tuwien.ac.at> <20111017190027.GA9873@lexx.ifp.tuwien.ac.at> <CAJ-Vmokbm5z3GPbKjc6_o0_Ea6u_b7twDu=xLeYpORiUpp6Z=Q@mail.gmail.com> <20111018131353.GA83797@lexx.ifp.tuwien.ac.at> <649509EEAEBA42D4A3DCC1FDF5DA72E5@multiplay.co.uk> <20111025202755.4243ae74@kan.dyndns.org> <CADGWnjX95yMEO06o%2B8xUho4Yc2-R9S=GJTWkGqvfbzDMHqCiGw@mail.gmail.com> <20111027185957.54ece0ad@kan.dyndns.org>

>> If it's only one process, the machine (usually) doesn't hang, even
>> when that process is copying big files back and forth for a long
>> period of time (it's a backup process). But interleave that process
>> with another one accessing the same disk, and poof!, almost
>> immediately AHCI timeouts occur. Very strange... Maybe a race
>> condition of some sort after all?
>>
>
>No, I cannot say there is any specific correlation to IO load of the machine,
>timeouts I saw happen randomly and seem almost always happen as system uptime
>crosses two weeks boundary. I am suspecting Samsung firmware at this point.

Now that's interesting, as I use a mixture of Samsung, WD, and Seagate, and I
do believe the Samsungs tend to do this more.

I see AHCI timeouts from time to time on my machine (10-Current AMD64), but
normally only when I am doing something like a scrub. The machine has never
panicked as a result of this; it normally just FAULTS the drive in the pool
and keeps on going. At that point, doing a camcontrol rescan all does not
bring the drive back into existence (it will normally just hang on that bus
for 15-20 seconds and then carry on without identifying a drive). I have to
pull the drive, let it spin down, and then reinsert it. Once it's reinserted,
the drive comes back on the bus and I can online it again.

The weird thing is this: for me, it only ever seems to happen when I am
writing to the pool/disk. Pure reads don't seem to bother it. I don't really
know at this point if the SATA ports have gone wonky on the motherboard, or
if the processor on the HD has crashed. I almost tend to believe it's the
drive, because camcontrol stops on that port almost as if it knows there is a
link there but can't talk to it.

Peg
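
For anyone following along, the recovery sequence described above (after the drive has been physically pulled, allowed to spin down, and reinserted) might look roughly like the sketch below. This is only an illustration, not the exact commands used in the message: the pool name "tank" and the device name "ada2" are hypothetical placeholders, and the script defaults to a dry run that prints each command instead of executing it.

```shell
#!/bin/sh
# Sketch of the post-reseat recovery steps. POOL and DISK are
# hypothetical placeholders; substitute your own pool and device.
POOL="${POOL:-tank}"
DISK="${DISK:-ada2}"
DRY_RUN="${DRY_RUN:-1}"   # 1 = only print the commands, 0 = run them

# Print the command in dry-run mode, otherwise execute it.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

recover_disk() {
    run zpool status "$POOL"          # confirm which device is FAULTED
    run camcontrol rescan all         # re-probe all buses after the reseat
    run camcontrol devlist            # check that the drive is visible again
    run zpool online "$POOL" "$DISK"  # bring the device back into the pool
}

recover_disk
```

Setting DRY_RUN=0 in the environment runs the commands for real (as root). As noted in the message, a rescan before the drive has been reseated may just stall on that bus for a while without finding the drive.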