Date: Fri, 20 Jun 2014 11:28:16 -0400
From: Rich <rincebrain@gmail.com>
To: Graham Allan <allan@physics.umn.edu>
Cc: freebsd-fs <freebsd-fs@freebsd.org>
Subject: Re: Large ZFS arrays?
Message-ID: <CAOeNLuo-m-_hu5TdnG_njsArYHhQOTsVsjknbsumbO7_p8LvPQ@mail.gmail.com>
In-Reply-To: <53A44A23.6050604@physics.umn.edu>
References: <1402846139.4722.352.camel@btw.pki2.com> <53A44A23.6050604@physics.umn.edu>

Just FYI, a lot of people who do this use sas[23]ircu for scripting
this rather than sg3utils, though the latter is more powerful if you
have enough of the SAS spec to play with...

- Rich

On Fri, Jun 20, 2014 at 10:50 AM, Graham Allan <allan@physics.umn.edu> wrote:
> On 6/15/2014 10:28 AM, Dennis Glatting wrote:
>>
>> Anyone built a large ZFS infrastructure (PB size) and care to share
>> words of wisdom?
>
> This is a bit of a late response, but I wanted to put in our "me too"
> before I forget...
>
> We have about 500TB of storage on ZFS at present, and plan to add
> 600TB more later this summer, mostly in arrangements similar to what
> I've seen discussed already - using Supermicro 847 JBOD chassis and a
> mixture of Dell R710/R720 head nodes, with LSI 9200-8e HBAs. One R720
> has four 847 chassis attached; a couple of R710s just have a single
> chassis. We originally installed one HBA in the R720 for each chassis
> but had some deadlock problems at one point, which were resolved by
> daisy-chaining the chassis from a single HBA. I had a feeling it was
> maybe related to kern/177536, but I'm not really sure.
>
> We've been running FreeBSD 9.1 on all the production nodes, though
> I've long wanted to (and am now beginning to) set up a reasonable
> long-term testing box where we could check out some of the kernel
> patches or tuning suggestions which come up - and I'm also beginning
> to test the 9.3 release for the next set of servers.
>
> We built all of these conservatively, with each chassis as a separate
> pool, each having four 10-drive raidz2 vdevs, a couple of spares, a
> cheapish L2ARC SSD, and a mirrored pair of ZIL SSDs (maybe
> unnecessary to mirror these days?). I was using the Intel 24GB SLC
> drive for the ZIL; I will need to choose something new for future
> pools.
>
> It would be interesting to hear a little about experiences with the
> drives people use... For our first "experimental" chassis we used 3TB
> Seagate desktop drives - cheap, but not the best choice: 18 months
> later they are dropping like flies (luckily we can risk some
> cheapness here, as most of our data can be re-transferred from other
> sites if needed). Another chassis has 2TB WD RE4 enterprise drives
> (no problems), and four others have 3TB and 4TB WD "Red" NAS
> drives... another "slightly risky" selection, but so far they have
> been very solid (in some casual discussion, a WD field engineer also
> seemed to feel these would be fine for both ZFS and Hadoop use).
>
> Tracking drives for failures and replacements was a big issue for us.
> One of my co-workers wrote a nice Perl script which periodically
> harvests all the data from the chassis (via sg3utils) and stores the
> mappings of chassis slots, da devices, drive labels, etc. in a
> database. It also understands the layout of the 847 chassis and
> labels the drives for us according to some rules we made up - a
> prefix for the pool name, then "f" or "b" for front/back of the
> chassis, then the slot number. Finally, it has some controls to turn
> the chassis drive identify lights on or off. There might be other
> ways to do all this, but we didn't find any, so it's been incredibly
> useful for us.
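>
> (For anyone wanting to script something similar, a minimal sketch of
> the same idea using sas2ircu instead of sg3utils might look like the
> lines below - the controller number and the enclosure:bay pair are
> made-up examples, not our actual layout:)
>
>   # list the LSI controllers the tool can see
>   sas2ircu LIST
>   # dump the enclosure/slot-to-device mapping for controller 0
>   sas2ircu 0 DISPLAY
>   # blink the identify LED on enclosure 2, bay 5, then turn it off again
>   sas2ircu 0 LOCATE 2:5 ON
>   sas2ircu 0 LOCATE 2:5 OFF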
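>
> (One way to put names like that on the disks themselves is GPT
> labels - purely an illustration, with "tank" as the pool prefix,
> front slot 5, and da5 as a placeholder device:)
>
>   # create a GPT scheme and a single ZFS partition labelled tank-f05
>   gpart create -s gpt da5
>   gpart add -t freebsd-zfs -l tank-f05 da5
>   # the partition then shows up as /dev/gpt/tank-f05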
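>
> (And for completeness, a pool of the shape described further up -
> four 10-drive raidz2 vdevs, two spares, a mirrored log and an L2ARC
> cache device - could be created along these lines; the daN names and
> the pool name "tank" are placeholders only:)
>
>   zpool create tank \
>     raidz2 da0  da1  da2  da3  da4  da5  da6  da7  da8  da9  \
>     raidz2 da10 da11 da12 da13 da14 da15 da16 da17 da18 da19 \
>     raidz2 da20 da21 da22 da23 da24 da25 da26 da27 da28 da29 \
>     raidz2 da30 da31 da32 da33 da34 da35 da36 da37 da38 da39 \
>     spare  da40 da41 \
>     log    mirror da42 da43 \
>     cache  da44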
>
> As far as performance goes, we've been pretty happy. Some of these
> get hammered fairly hard by NFS i/o from cluster compute jobs (maybe
> ~1200 processes on 100 nodes) and they have held up much better than
> our RHEL NFS servers using Fibre Channel RAID storage. We've also
> performed a few bulk transfers between Hadoop and ZFS (using distcp
> with an NFS destination) and saw sustained 5Gbps write speeds (which
> really surprised me).
>
> I think that's all I've got for now.
>
> Graham
> --
> -------------------------------------------------------------------------
> Graham Allan
> School of Physics and Astronomy - University of Minnesota
> -------------------------------------------------------------------------