From: Matthew Dillon <dillon@apollo.backplane.com>
Date: Tue, 9 Feb 2010 11:49:55 -0800 (PST)
To: FreeBSD Stable <freebsd-stable@freebsd.org>
Subject: Re: hardware for home use large storage

The Silicon Image 3124A chipset is the PCIe version of the 3124 (the original 3124 was PCI-X), and the 3124A's are starting to make their way into distribution channels. This is probably the best 'cheap' solution offering fully concurrent multi-target NCQ operation through a port multiplier enclosure, with more bus bandwidth than the PCIe 1x link the ultra-cheap 3132 offers. I think the 3124A uses an 8x link (not quite sure, but it is more than 1x).

On-motherboard AHCI chipsets with equivalent capabilities do not appear to be in wide distribution yet. Most AHCI chips can do NCQ to a single target (even a single target behind a port multiplier), but not concurrently to multiple targets behind a port multiplier. Even though SATA bandwidth constraints might seem to make this a reasonable alternative, it actually isn't, because any seek-heavy activity to multiple drives will be serialized and perform EXTREMELY poorly (the sketch below puts rough numbers on this). Linear performance will be fine; random performance will be horrible.

It should be noted that while hot-swap is supported with Silicon Image chipsets and port multiplier enclosures (which also use Sili chips in the enclosure), the hot-swap capability is not anywhere near as robust as what you would find in a more costly commercial SAS setup. Sili chips are very poorly made (this is the same company that went bust under another name a few years back due to shoddy chipsets) and have a lot of on-chip hardware bugs, but fortunately OSS driver writers (the Linux guys) have been able to work around most of them. So even though the chipset is a bit shoddy, actual operation is quite good.

However, this does mean you generally want to idle all activity on the enclosure to safely hot-swap anything, not just the drive you are pulling out. I've done a lot of testing, and hot-swapping an idle disk while other drives in the same enclosure are busy is not reliable (for a cheap port multiplier enclosure using a Sili chip inside, which nearly all do). Also, a disk failure within the enclosure can create major command-sequencing issues for the other targets in the enclosure, because error processing has to be serialized. Fine for home use, but don't expect miracles if you have a drive failure.
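To put rough numbers on that serialization penalty, here is a minimal back-of-envelope sketch (the access time and drive count are typical assumptions for a 7200 RPM drive, not measurements):

    # Back-of-envelope model of random-read throughput behind a port
    # multiplier.  Assumed numbers: ~8.5ms seek plus ~4.2ms rotational
    # latency, i.e. ~12.7ms per random access.
    ACCESS_TIME_S = 0.0127    # assumed average time per random access
    DRIVES = 4                # drives behind the port multiplier

    per_drive_iops = 1.0 / ACCESS_TIME_S          # ~79 IOPS per spindle

    # Fully concurrent multi-target NCQ (3124A-style): every spindle
    # can be seeking at once, so aggregate IOPS scales with drive count.
    concurrent_iops = DRIVES * per_drive_iops     # ~315 IOPS

    # Single-target-at-a-time operation (typical AHCI behind a PM):
    # each command waits out the previous drive's seek, so the aggregate
    # stays at roughly one spindle's worth regardless of drive count.
    serialized_iops = per_drive_iops              # ~79 IOPS

    print(f"concurrent: {concurrent_iops:.0f} IOPS, "
          f"serialized: {serialized_iops:.0f} IOPS, "
          f"ratio: {concurrent_iops / serialized_iops:.0f}x")

With four drives that's roughly a 4x difference on random I/O, and the gap grows with the number of drives in the enclosure.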
The Sili chips and port multiplier enclosures are definitely the cheapest multi-disk solution. You lose on aggregate bandwidth and you lose some robustness, but you get the hot-swap capability basically for free.

--

Multi-HD setups for home use are usually a lose. I've found over the years that it is better to just buy a big whopping drive, and then another one or two for backups, and not try to gang them together in a RAID. And yes, at one time in the past I was running three separate RAID-5 arrays using 3ware controllers. I don't anymore, and I'm a lot happier. If you have more than 2TB of critical data you don't have much of a choice, but I'd go with as few physical drives as possible regardless. The 2TB Maxtor green or black drives are nice. I strongly recommend getting the highest-capacity drives you can afford if you don't want your power bill to blow out your budget.

The bigger problem is always having an independent backup of the data. Depending on a single-instance filesystem, even one like ZFS, for a lifetime's worth of data is not a good idea. Fire, theft... there are a lot of ways the data can be lost. So when designing the main system you have to take care to also design the backup regimen, including something off-site (or swapping a physical drive out once a month, etc.) -- i.e. multiple backup regimens.

If single-drive throughput is an issue then using ZFS's caching solution with a small SSD is the way to go (and yes, DragonFly has an SSD caching solution now too, but that's not pertinent to this thread). The Intel SSDs are really nice, but I am singularly unimpressed with the OCZ Colossus drives, which don't even negotiate NCQ. I don't know much about other vendors.

A little $100 Intel 40G SSD has around a 40TB write endurance and can last 10 years as a disk meta-data caching environment with a little care, particularly if you only cache meta-data. A very small incremental cost gives you 120-200MB/sec of seek-agnostic bandwidth, which is perfect for network serving, backups, remote filesystems, etc. Unless the box has 10GigE or multiple 1xGigE network links there's no real need to try to push HD throughput beyond what the network can do, so it really comes down to avoiding thrashing the HDs with random seeks. That is what the small SSD cache gives you. It can be like night and day.
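The 10-year figure is straightforward arithmetic; here is a minimal sketch (assuming the ~40TB endurance ballpark above and ignoring write amplification):

    # Rough lifetime math for a small SSD used as a meta-data cache.
    # The ~40TB write-endurance figure is the ballpark quoted above;
    # real wear also depends on write amplification.
    ENDURANCE_BYTES = 40e12                # ~40TB of total writes
    SECONDS = 10 * 365.25 * 86400          # ten years

    rate = ENDURANCE_BYTES / SECONDS
    print(f"sustainable write rate: {rate / 1e3:.0f} KB/s")  # ~127 KB/s
    # A meta-data-only cache writes far less than ~127 KB/s around the
    # clock, and reads (the 120-200MB/sec that matter for serving) cost
    # no flash wear at all.

-Matt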