Date: Fri, 4 May 2012 01:29:13 -0400
From: Charles Sprickman <spork@bway.net>
To: freebsd-scsi@freebsd.org
Subject: mfi and "copy out failed" messages
Message-ID: <B1A5F7F8-E396-4906-A182-C98D068502DF@bway.net>
I'm wondering if anyone has some interest in this issue. I think I recently tracked down a long-standing fs corruption and panic issue on a Dell 2970 that I was never able to solve:

http://lists.freebsd.org/pipermail/freebsd-fs/2010-July/008858.html

(there are other threads, but that's the gist of the issue)

I'd read in various threads that "mfiX: Copy out failed" was a harmless message. But recently I started thinking that there had to be some relation between those messages and the panics. The timing fits - I had megacli performing a status check on the controller in a periodic script that kicked off with the daily run. Most of my panics were during or shortly after the daily run. The "Copy out failed" messages always corresponded to megacli being run.

132 days ago I removed the daily megacli check and the box has not had a kernel panic since then. Previous to this my longest uptime was not more than a few months. While this is by no means 100% definitive, it sure seems like something is going on here. My best guess is that megacli and the mfi driver are interacting in a bad way and that the "Copy out failed" message is indicating something did not hit the controller that should have. My earlier assumption was that it was just some control message megacli was sending that didn't make it, but now I'm thinking it's some request to write actual data to the drive that's failing.

As a reminder, the card in question is:

mfi0: <Dell PERC 6> port 0xec00-0xecff mem 0xe9f80000-0xe9fbffff,0xe9fc0000-0xe9ffffff irq 37 at device 0.0 on pci7
mfi0: 3049 (boot + 3s/0x0020/info) - Firmware version 1.22.02-0612
mfi0: 3051 (boot + 23s/0x0020/info) - Controller hardware revision ID (0x0)
mfi0: 3052 (boot + 23s/0x0020/info) - Package version 6.2.0-0013

If anyone with knowledge of the mfi driver would like to comment, I'd very much appreciate it.
This box is going to be repurposed in the coming months as an ESXi host to hold some backup/standby VMs, but before that I would not mind taking some time to test any patches, extra debug printfs in mfi, etc. I suspect I can probably trigger the panic pretty easily by mimicking the daily run conditions - just kick off a find from "/" and then repeatedly loop the megacli command to check the array health.

The box is still on 7.3, but I'd gladly upgrade to 8.3 and test there if needed once the box is freed up.

Thanks,

Charles

--
Charles Sprickman
NetEng/SysAdmin
Bway.net - New York's Best Internet
www.bway.net
spork@bway.net - 212.655.9344
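For anyone who wants to try the reproduction described above (a find from "/" running while the megacli status check is looped), here is a rough sketch in Python. It is only an illustration, not something that has been run against this controller: the function name `stress_loop`, its parameters, and the iteration and delay defaults are all made up for the example, and the message does not say which megacli invocation the periodic script used, so both commands are passed in by the caller.

```python
import subprocess
import time


def stress_loop(walk_cmd, status_cmd, iterations=100, delay=1.0):
    """Run walk_cmd in the background and poll status_cmd while it runs.

    walk_cmd and status_cmd are argument lists (hypothetical parameters);
    the caller decides which filesystem walk and which controller status
    command to use. Returns the exit codes of each status_cmd run.
    """
    # Start the filesystem walk and discard its output.
    walker = subprocess.Popen(
        walk_cmd, stdout=subprocess.DEVNULL, stderr=subprocess.DEVNULL
    )
    codes = []
    try:
        for _ in range(iterations):
            # Run the status check once and record how it exited.
            result = subprocess.run(
                status_cmd, stdout=subprocess.DEVNULL, stderr=subprocess.DEVNULL
            )
            codes.append(result.returncode)
            if walker.poll() is not None:
                break  # the walk has finished, so stop polling
            time.sleep(delay)
    finally:
        # Do not leave the walk running if the loop ends first.
        if walker.poll() is None:
            walker.terminate()
        walker.wait()
    return codes
```

On the box in question this might be called as `stress_loop(["find", "/"], ["MegaCli", "-LDInfo", "-Lall", "-aALL"])`, where the MegaCli arguments are an assumption about what the original status check looked like. Watching the console or dmesg for "Copy out failed" while it runs would show whether the messages line up with the status calls.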