From owner-freebsd-stable@FreeBSD.ORG Thu Jun  7 14:27:29 2012
From: Scott Long
Date: Thu, 7 Jun 2012 08:26:59 -0600
To: Daniel Kalchev
Cc: freebsd-stable@freebsd.org
Subject: Re: Netflix's New Peering Appliance Uses FreeBSD
In-Reply-To: <4FD06FD7.2000708@digsys.bg>
Message-Id: <6833ED24-9638-43E7-AE35-289CEB3E06C2@samsco.org>

On Jun 7, 2012, at 3:09 AM, Daniel Kalchev wrote:

> On 06.06.12 03:16, Scott Long wrote:
>
> [...]
>> Each disk has its own UFS+J filesystem, except for the SSDs, which
>> are mirrored together with gmirror. The SSDs hold the OS image and
>> cache some of the busiest content. The other disks hold nothing but
>> the audio and video files for our content streams.
>
> Could you please explain the rationale for using UFS+J for this large
> storage? Your published documentation states that you have reasonable
> redundancy in case of multiple disk failures, and I wonder how you
> handle this with "plain" UFS. Things like avoiding hangs and panics
> when a disk is going to die.

Redundancy happens by allowing the streaming clients to choose from
multiple other sources for their stream, and to buffer enough of the
stream to make a switchover appear seamless. That other source might be
a peer node on the same network, or it might be a node that is upstream
or on a different network. The point of the caches is to hold as much
content as possible, and we've found that it's more effective to
maximize capacity and let drives fail in place than to significantly
reduce capacity with hardware or software RAID. When a disk starts
having problems that affect its ability to deliver data on time, any
clients affected by it simply switch to a different source. When the
disk does finally die, it is removed from the available pool and its
content is reshuffled onto the other drives during the next daily
content update. Once enough disks have failed that the cache is no
longer effective, the appliance gets replaced.

Scott
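
As a rough illustration of the storage layout described in the quoted
text, a minimal FreeBSD sketch might look like the following. The
device names (ada0/ada1 for the SSDs, da0 for one content drive) and
the mount point are hypothetical stand-ins; the actual Netflix
provisioning is not public.

    # Load the mirror class and mirror the two SSDs (assumed ada0, ada1)
    kldload geom_mirror
    gmirror label -v -b round-robin gm0 /dev/ada0 /dev/ada1
    newfs -U /dev/mirror/gm0

    # Give each content drive its own standalone UFS filesystem with
    # journaled soft updates (UFS+J); da0 stands in for one such drive
    gpart create -s GPT da0
    gpart add -t freebsd-ufs -l content0 da0
    newfs -j /dev/da0p1
    mkdir -p /content/0
    mount /dev/da0p1 /content/0

Because every content drive carries an independent filesystem, losing
one drive loses only that drive's slice of the cache, not a whole
array.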
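Likewise, the fail-in-place policy Scott describes amounts to taking a
dead drive out of service rather than rebuilding it. A hypothetical
operator sketch (the real tooling is internal to Netflix; da5 and the
mount point are assumptions):

    # A failing drive (say da5) is simply retired; no rebuild is
    # attempted, and the next daily content update redistributes
    # files across the surviving filesystems.
    umount -f /content/5
    sed -i '' '/\/content\/5/d' /etc/fstab   # stop mounting it at boot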