Date: Thu, 7 Jun 2012 08:26:59 -0600
From: Scott Long <scottl@samsco.org>
To: Daniel Kalchev <daniel@digsys.bg>
Cc: freebsd-stable@freebsd.org
Subject: Re: Netflix's New Peering Appliance Uses FreeBSD
Message-ID: <6833ED24-9638-43E7-AE35-289CEB3E06C2@samsco.org>
In-Reply-To: <4FD06FD7.2000708@digsys.bg>
References: <CAMYW4Zi4y16EL1=%2Bsfz1XATc9ZnQpocUD_Xf9Jg=LR=c1AgaKA@mail.gmail.com> <3CEF3B39-BE1E-4FC4-81F3-D26049C83313@netflix.com> <4FD06FD7.2000708@digsys.bg>

On Jun 7, 2012, at 3:09 AM, Daniel Kalchev wrote:

> On 06.06.12 03:16, Scott Long wrote:
>
> [...]
>> Each disk has its own UFS+J filesystem, except for
>> the SSDs that are mirrored together with gmirror.  The SSDs hold the OS image
>> and cache some of the busiest content.  The other disks hold nothing but the
>> audio and video files for our content streams.
>
> Could you please explain the rationale of using UFS+J for this large
> storage? Your published documentation states that you have reasonable
> redundancy in case of multiple disk failure and I wonder how you handle
> this with "plain" UFS. Things like avoiding hangs and panics when a
> disk is going to die.

Redundancy happens by allowing the streaming clients to choose multiple
other sources for their stream, and buffer enough of the stream to make
a switchover appear seamless.  That other source might be a peer node on
the same network, or might be a node that is upstream or on a different
network.  The point of the caches is to hold as much content as
possible, and we've found that it's more effective to maximize capacity
but allow drives to fail in place than to significantly reduce capacity
with hardware or software RAID.  When a disk starts having problems that
affect its ability to deliver data on time, any clients affected by it
simply switch to a different source.  When the disk does finally die, it
is removed from the available pool and content is reshuffled on the
other drives during the next daily content update.  Once enough disks
fail that the cache is no longer effective, it gets replaced.

Scott
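
For readers who prefer code, the fail-in-place scheme described above can be sketched roughly as follows. This is only an illustration under stated assumptions, not the appliance's actual software: the names CacheNode, pick_source, fail_disk, daily_update and the per_disk limit are all hypothetical, and the round-robin placement is a stand-in for whatever placement policy is really used.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class CacheNode:
    """Hypothetical cache node: independent disks, no RAID across them."""

    name: str
    # disk id -> set of titles currently stored on that disk
    disks: dict[str, set[str]] = field(default_factory=dict)

    def can_serve(self, title: str) -> bool:
        """True if any disk still in the pool holds the title."""
        return any(title in titles for titles in self.disks.values())

    def fail_disk(self, disk_id: str) -> set[str]:
        """Drop a dead disk from the pool; return the titles it held."""
        return self.disks.pop(disk_id, set())

    def daily_update(self, wanted: list[str], per_disk: int) -> None:
        """Reshuffle the most-wanted titles over the surviving disks.

        Round-robin placement is an assumption made for this sketch.
        """
        ids = sorted(self.disks)
        capacity = per_disk * len(ids)
        for disk_id in ids:
            self.disks[disk_id] = set()
        for i, title in enumerate(wanted[:capacity]):
            self.disks[ids[i % len(ids)]].add(title)


def pick_source(title: str, sources: list[CacheNode]) -> CacheNode | None:
    """Client side: take the first source, in preference order, that can
    still deliver the title; later entries act as fallbacks."""
    for node in sources:
        if node.can_serve(title):
            return node
    return None
```

A client holding the list [local, upstream] keeps streaming from local until the disk with its title drops out, then falls through to upstream. After the next daily_update the title may live on a surviving local disk again, at the cost of pushing out something less popular.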