Date: Sat, 26 Jan 2013 02:10:01 GMT
From: Jeremy Chadwick <jdc@koitsu.org>
To: freebsd-fs@FreeBSD.org
Subject: Re: kern/169480: [zfs] ZFS stalls on heavy I/O
Message-ID: <201301260210.r0Q2A1Eo040039@freefall.freebsd.org>
The following reply was made to PR kern/169480; it has been noted by GNATS.

From: Jeremy Chadwick <jdc@koitsu.org>
To: bug-followup@FreeBSD.org
Cc:
Subject: Re: kern/169480: [zfs] ZFS stalls on heavy I/O
Date: Fri, 25 Jan 2013 18:08:59 -0800

Harry, things that come to mind immediately:

1. http://www.quietfountain.com/fs1pool1.txt is from when your pool
   contained both L2ARC and ZIL devices on SSDs. Please remove the SSDs
   from the picture entirely and use the raidz1 disks ada[2345] only at
   this point.

   I do not want to discuss ada[01] at this time, because they're SSDs.
   There are quite literally 4 or 5 "catches" to using these devices on
   FreeBSD ZFS, but the biggest problem -- and this WILL hurt you, no
   arguments about it -- is lack of TRIM support. You will hurt your
   SSDs over time doing this. If you want TRIM support on ZFS you will
   need to run -CURRENT.

   We can talk more about the SSDs later. As said, please remove them
   from the picture for starters, as all they do is make
   troubleshooting much, much harder.

2. I do see some raw I/O benchmarks, but only for ada2. This is
   insufficient. A single disk performing like crap in a pool can slow
   down the entire response time for everything. I can do analysis of
   all of your disks if the issue is narrowed down to one of them.

   "gstat -I500ms" is a good way to watch I/O speeds in real-time. I
   find this more effective than "zpool iostat -v 1" for per-device
   info.

3. The ada[2345] disks involved are Hitachi HDS723015BLA642 (7K3000,
   1.5TB), and there is sparse info on the web as to whether these are
   512-byte or 4096-byte physical sector disks. smartmontools 6.0 or
   newer will tell you.

   All disks, regardless, advertise 512 bytes as the logical sector
   size to remain fully compatible with legacy systems, but the
   performance hit on I/O is major on a 4096-byte device if the pool
   ashift isn't 12. So please check this with smartmontools 6.0 or
   newer.

   If the disks physically use 4096-byte sectors, you need to use
   gnop(8) to align them and create the pool off of that. Ivan Voras
   wrote a wonderful guide on how to do this, and it's very simple:

   http://ivoras.net/blog/tree/2011-01-01.freebsd-on-4k-sector-drives.html

   It wouldn't hurt you to do this regardless, as there's no
   performance hit using the gnop(8) method on 512-byte sector drives;
   this would "future-proof" you if upgrading to newer disks too. You
   want ashift=12.

4. Why are all of your drives partitioned? In other words, why are you
   using adaXpX rather than just adaX for your raidz1 pool? "gpart
   show" output was not provided, and I can only speculate as to what's
   going on under the hood there.

   Please use raw disks when recreating your pool, i.e. ada2, ada3,
   ada4, etc. I know for your cache/logs this is a different situation
   but again, please remove those from the picture.

5. Please keep your Hitachi disks on the Intel ICH7 controller for the
   time being. It's SATA300 but that isn't going to hurt these disks.
   Don't bring the Marvell into the picture yet. Don't change around
   cabling or anything else.

6. For any process that takes a long while, you're going to need to do
   "procstat -kk" (yes, -k twice) against it.

7. I do not think your issue is related to this PR. I would suggest
   discussing it on freebsd-fs first.

   Of course, you're also using something called "nas4free" which may
   or may not be *true, unaltered* FreeBSD -- I have no idea. I often
   find it frustrating when, say, the FreeNAS folks or other "FreeBSD
   fork projects" appear on the FreeBSD mailing lists "just because it
   uses FreeBSD". You always have to go with the vendor for support
   (like you did on their forum), but if you really think this is a
   FreeBSD "kernel thing", freebsd-fs is fine.

Start with what I described above and go from there.

-- 
| Jeremy Chadwick                                   jdc@koitsu.org |
| UNIX Systems Administrator                http://jdc.koitsu.org/ |
| Mountain View, CA, US                                            |
| Making life hard for others since 1977.             PGP 4BD6C0CB |
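
A minimal sketch of the arithmetic behind the ashift figure in item 3. ashift is the base-2 logarithm of the smallest block size ZFS will use on a vdev, so 512-byte sectors correspond to ashift=9 and 4096-byte sectors to ashift=12. The function names ashift_for and pool_is_aligned are hypothetical and exist only for illustration; nothing here queries a real disk or pool, so the sector size still has to be read from smartmontools by hand.

```python
def ashift_for(sector_bytes: int) -> int:
    """Return log2(sector_bytes) for a power-of-two sector size.

    Hypothetical helper: 512 maps to 9 and 4096 maps to 12.
    """
    # A power of two has exactly one bit set, so n & (n - 1) is zero.
    if sector_bytes <= 0 or sector_bytes & (sector_bytes - 1):
        raise ValueError(
            f"sector size must be a positive power of two: {sector_bytes}"
        )
    return sector_bytes.bit_length() - 1


def pool_is_aligned(pool_ashift: int, physical_sector_bytes: int) -> bool:
    """True when the pool's minimum block covers a whole physical sector.

    Hypothetical helper: a pool whose ashift is smaller than the disk's
    physical-sector ashift can issue writes smaller than one physical
    sector, which is the misaligned case item 3 warns about.
    """
    return pool_ashift >= ashift_for(physical_sector_bytes)
```

With these definitions, pool_is_aligned(9, 4096) is False, which is the slow case described in item 3, while pool_is_aligned(12, 512) is True, which is why creating the pool with ashift=12 is also safe on 512-byte sector drives.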