Date: Tue, 2 Feb 2016 14:18:53 -0800
Subject: Re: svn commit: r295136 - in head: sys/kern sys/netinet sys/sys usr.bin/netstat
From: Adrian Chadd <adrian.chadd@gmail.com>
To: Alfred Perlstein
Cc: Slawa Olhovchenkov, Xin LI, svn-src-head@freebsd.org, svn-src-all@freebsd.org, src-committers@freebsd.org, John Baldwin

On 2 February 2016 at 13:21, Alfred Perlstein wrote:
>
> On 2/2/16 1:09 PM, Slawa Olhovchenkov wrote:
>>
>> On Tue, Feb 02, 2016 at 12:35:47PM -0800, Alfred Perlstein wrote:
>>
>>>> I would second John's comment on the necessity of the change,
>>>> though: if one already has 32K of *backlogged* connections, it's
>>>> probably not very useful to allow more to come in. It sounds like
>>>> the application itself is seriously broken, and unless expanding
>>>> the field has some performance benefit, I don't think it should
>>>> stay.
>>>
>>> Imagine a hugely busy image board like 2ch.net: if there is a
>>> single hiccup, it's very possible to start dropping connections.
>>
>> In reality you start dropping connections in any case: nobody waits
>> forever in accept (the user closes the browser and goes away, etc.).
>>
>> Also, if you have more than 4K backlogged connections, you have a
>> problem: you can't process all the connection requests, and in the
>> next second you will have 8K, after the next second 12K, and so on.
>>
> Thank you Slawa,
>
> I am pretty familiar with what you are describing, which is a
> "cascade failure". However, in order to understand why such a change
> makes sense, let me give you a little early history lesson on a
> project I developed under FreeBSD, and then explain why such a
> project would probably not work with FreeBSD as a platform today (we
> would have to use Linux or custom patches).
>
> Here is that use case:
>
> Back in 1999 I wrote a custom webserver on FreeBSD that was
> processing over 1,500 connections per second.
>
> What we were doing was tracking web hits using "hidden GIFs". Now,
> this was 1999, with only 100 Mbit hardware and a 400 MHz Pentium.
> Mind you, I was doing this with CPU to spare, so an influx of
> additional hits was OK; I could easily deal with backlog.
>
> Now, what was important about this case was that EVERY time we
> served the data we were able to monetize it and pay for my salary at
> the time, which was spent working on SMP for FreeBSD and a bunch of
> other patches. Any lost hits or broken connections would cost us
> money, which in turn meant less time on FreeBSD and less time fixing
> things to scale.
>
> In our case the user would not really know if our "page" didn't
> load, because we were just an invisible GIF.
>
> So, back to the example: let's scale that out to today's numbers.
>
> 100 Mbps -> 10 GigE, so 1,500 conn/sec -> 150,000 conn/sec. So at
> just 0.20 seconds of any sort of latency I will be overflowing the
> listen queue and dropping connections.
>
> Now, when you still have CPU to spare, and because connections *are*
> precious, it makes sense to slightly over-provision the servers so
> that some backlog can be processed.
>
> So, in this day and age, it really does make sense to allow for
> buffering more than 32K connections, particularly if the developer
> knows what he is doing.
>
> Does this help explain the reasoning?

Just to add to this: the VM system under ridiculous load (like, say,
deciding it can dirty most of your half-terabyte of RAM and getting
behind in writing stuff to disk) can cause the system to pause for
little stretches of time. It sucks, but it happens. 0.20 seconds isn't
all that long, and that's at 150,000 conn/sec. There's TCP locking
work in progress that will hopefully increase that value.

-a