From owner-svn-src-all@freebsd.org Thu Apr 14 22:24:46 2016
Date: Thu, 14 Apr 2016 16:24:45 -0600
Subject: Re: svn commit: r298002 - in head/sys: cam cam/ata cam/scsi conf dev/ahci
From: Warner Losh
To: Shawn Webb
Cc: Dmitry Morozovsky, "svn-src-head@freebsd.org", "svn-src-all@freebsd.org", src-committers, Warner Losh
In-Reply-To: <20160414221517.GA66711@mutt-hardenedbsd>
References: <201604142147.u3ELlwYo052010@repo.freebsd.org> <20160414221517.GA66711@mutt-hardenedbsd>
Content-Type: text/plain; charset=UTF-8

On Thu, Apr 14, 2016 at 4:15 PM, Shawn Webb wrote:

> On Thu, Apr 14, 2016 at 04:04:27PM -0600, Warner Losh wrote:
> > On Thu, Apr 14, 2016 at 3:54 PM, Dmitry Morozovsky wrote:
> >
> > > Warner,
> > >
> > > On Thu, 14 Apr 2016, Warner Losh wrote:
> > >
> > > > Author: imp
> > > > Date: Thu Apr 14 21:47:58 2016
> > > > New Revision: 298002
> > > > URL: https://svnweb.freebsd.org/changeset/base/298002
> > > >
> > > > Log:
> > > >   New CAM I/O scheduler for FreeBSD.  The default I/O scheduler is the same
> > >
> > > [snip]
> > >
> > > First, thanks so much for this quite a non-trivial work!
> > > What are the ways to enable this instead of the default, and what are the
> > > benefits and drawbacks?
> >
> > You add CAM_NETFLIX_IOSCHED to your kernel config to enable it. Hmmm,
> > looking at the diff, perhaps I should add that to LINT.
> >
> > In production, we use it for three things. First, our scheduler keeps a
> > lot more statistics than the default one. These statistics are useful for
> > us knowing when a system is saturated and needs to shed load. Second, we
> > favor reads over writes because our workload, as you might imagine, is a
> > read-mostly workload. Finally, in some systems, we throttle the write
> > throughput to the SSDs. The SSDs we buy can do 300MB/s write while serving
> > 400MB/s read, but only for short periods of time (long enough to do
> > 10-20GB of traffic). After that, write performance drops, and read
> > performance goes out the window. Experiments have shown that if we limit
> > the write speed to no more than 30MB/s or so, then the garbage collection
> > the drive is doing won't adversely affect the read latency / performance.
>
> Going on a tangent here, but related:
>
> As someone who is just barely stepping into the world of benchmarks and
> performance metrics, can you shed some light as to how you gained those
> metrics? I'd be extremely interested to learn.

These numbers were derived through an iterative process. All our systems
report a large number of statistics while they are running. The disk
performance numbers come from gstat(8), which ultimately derives them from
devstat(9). When we enabled serving customer traffic while refreshing
content, we noticed a large number of reports from our playback clients
indicating problems with the server during this time period. I looked at the
graphs to see what was going on. Once I found the problem, I was able to see
that as the write load varied, the latency numbers for the reads would vary
substantially as well.
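
[Editor's illustration, not part of the original message.] The relationship
described above, read latency moving with write load, can be summarized from
sampled statistics. The following is a minimal Python sketch that assumes the
samples have already been collected as (write MB/s, read latency ms) pairs,
for example by scripting around gstat(8). The function name, the bucket width,
and the sample values are all hypothetical and are not taken from the Netflix
tools mentioned in this thread.

```python
from collections import defaultdict
from statistics import median


def read_latency_by_write_rate(samples, bucket_mb=10):
    """Group (write_mb_per_s, read_latency_ms) samples into write-rate
    buckets and return the median read latency seen in each bucket."""
    buckets = defaultdict(list)
    for write_mb_s, read_ms in samples:
        # The bucket key is the lower edge of the write-rate range.
        low = int(write_mb_s // bucket_mb) * bucket_mb
        buckets[low].append(read_ms)
    return {low: median(vals) for low, vals in sorted(buckets.items())}


# Made-up sample data, for illustration only (not measured values).
samples = [(5, 1), (8, 3), (22, 2), (27, 2), (41, 6), (45, 10), (83, 40)]
summary = read_latency_by_write_rate(samples)
for low, med in summary.items():
    print(f"{low:3d}-{low + 10:3d} MB/s writes: median read latency {med} ms")
```

A table like this makes it easier to spot the write rate at which read
latency starts to climb, which is the kind of threshold discussed in this
message.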

I added code to the I/O scheduler so I could rate limit the write speed to
the SSDs. After running through a number of different machines over a number
of nights of filling and serving, I was able to find the right number. If I
set it to 30MB/s, the 20 machines I tested didn't have any reports of
problems above the background level. When I set it to 35MB/s, there were a
couple of those machines that had problems. When I set it to 40MB/s, there
were a couple more. When I set it to 80MB/s, almost all had problems. Being
conservative, I set it to the highest number that showed no ill effect on the
clients. I was able to see large jumps in read latency as low as 25MB/s
though.

Sadly, this is with Netflix internal tools, but one could do the same
research with gstat and scripting. One could also use dtrace to study the
latency patterns to a much finer degree of fidelity than gstat offers.

Warner
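
[Editor's illustration, not part of the original message.] The message does
not say how the scheduler limits write speed. One common way to cap
throughput is a token bucket, sketched below in Python purely to show the
idea. This is not the FreeBSD CAM scheduler's code (which is kernel C), and
the class name, parameters, and the injectable clock are assumptions made for
this example.

```python
import time


class WriteRateLimiter:
    """Token bucket: tokens are bytes, refilled at rate_bytes_per_s and
    capped at burst_bytes. Hypothetical sketch, not the kernel code."""

    def __init__(self, rate_bytes_per_s, burst_bytes, clock=time.monotonic):
        self.rate = rate_bytes_per_s
        self.burst = burst_bytes
        self.clock = clock
        self.tokens = burst_bytes  # start with a full bucket
        self.last = clock()

    def _refill(self):
        # Add tokens for the time elapsed since the last call, up to the cap.
        now = self.clock()
        self.tokens = min(self.burst,
                          self.tokens + (now - self.last) * self.rate)
        self.last = now

    def try_write(self, nbytes):
        """Consume tokens and return True if nbytes may be written now;
        otherwise leave the bucket untouched and return False."""
        self._refill()
        if nbytes <= self.tokens:
            self.tokens -= nbytes
            return True
        return False


# Usage with a fake clock so the behavior is deterministic.
MB = 1000 * 1000
fake_now = [0.0]
limiter = WriteRateLimiter(30 * MB, 30 * MB, clock=lambda: fake_now[0])
first = limiter.try_write(30 * MB)   # bucket starts full
second = limiter.try_write(1 * MB)   # bucket is empty, no time has passed
fake_now[0] = 0.5                    # half a second later: 15 MB refilled
third = limiter.try_write(15 * MB)
```

A caller that gets False back would hold the write and retry later. The
30 MB/s rate here just mirrors the figure quoted in the message; the burst
size is an arbitrary choice for the example.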