From owner-freebsd-stable@FreeBSD.ORG Thu Nov 6 09:45:10 2014
Return-Path:
Delivered-To: freebsd-stable@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1])
 (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits))
 (No client certificate requested)
 by hub.freebsd.org (Postfix) with ESMTPS id BD15D818
 for ; Thu, 6 Nov 2014 09:45:10 +0000 (UTC)
Received: from smtp.infracaninophile.co.uk (smtp6.infracaninophile.co.uk [IPv6:2001:8b0:151:1:3cd3:cd67:fafa:3d78])
 (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits))
 (Client CN "smtp.infracaninophile.co.uk", Issuer "ca.infracaninophile.co.uk" (not verified))
 by mx1.freebsd.org (Postfix) with ESMTPS id 5DE0018D
 for ; Thu, 6 Nov 2014 09:45:10 +0000 (UTC)
Received: from ox-dell39.ox.adestra.com (no-reverse-dns.metronet-uk.com [85.199.232.226] (may be forged))
 (authenticated bits=0)
 by smtp.infracaninophile.co.uk (8.14.9/8.14.9) with ESMTP id sA69itxr023094
 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES128-SHA bits=128 verify=NO);
 Thu, 6 Nov 2014 09:45:04 GMT
 (envelope-from matthew@freebsd.org)
DKIM-Filter: OpenDKIM Filter v2.9.2 smtp.infracaninophile.co.uk sA69itxr023094
Authentication-Results: smtp.infracaninophile.co.uk/sA69itxr023094;
 dkim=none reason="no signature"; dkim-adsp=none; dkim-atps=neutral
X-Authentication-Warning: lucid-nonsense.infracaninophile.co.uk:
 Host no-reverse-dns.metronet-uk.com [85.199.232.226] (may be forged)
 claimed to be ox-dell39.ox.adestra.com
Message-ID: <545B4310.7000403@freebsd.org>
Date: Thu, 06 Nov 2014 09:44:48 +0000
From: Matthew Seaman
User-Agent: Mozilla/5.0 (X11; FreeBSD amd64; rv:31.0) Gecko/20100101 Thunderbird/31.2.0
MIME-Version: 1.0
To: Konstantin Belousov
Subject: Re: Varnish proxy goes catatonic under heavy load
References: <545A0EB4.4090404@freebsd.org> <545A117B.4080606@multiplay.co.uk>
 <545B1F2A.5010203@FreeBSD.org> <20141106083153.GK53947@kib.kiev.ua>
In-Reply-To: <20141106083153.GK53947@kib.kiev.ua>
Content-Type:
 multipart/signed; micalg=pgp-sha512;
 protocol="application/pgp-signature";
 boundary="VvAqFpo4W0tEH1bcbUbxO1cCWI634SnAJ"
X-Virus-Scanned: clamav-milter 0.98.4 at lucid-nonsense.infracaninophile.co.uk
X-Virus-Status: Clean
X-Spam-Status: No, score=-0.9 required=5.0 tests=AWL,BAYES_00,RDNS_NONE,
 SPF_SOFTFAIL autolearn=no autolearn_force=no version=3.4.0
X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on
 lucid-nonsense.infracaninophile.co.uk
Cc: freebsd-stable@freebsd.org, Steven Hartland
X-BeenThere: freebsd-stable@freebsd.org
X-Mailman-Version: 2.1.18-1
Precedence: list
List-Id: Production branch of FreeBSD source code
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
X-List-Received-Date: Thu, 06 Nov 2014 09:45:10 -0000

This is an OpenPGP/MIME signed message (RFC 4880 and 3156)
--VvAqFpo4W0tEH1bcbUbxO1cCWI634SnAJ
Content-Type: text/plain; charset=utf-8
Content-Transfer-Encoding: quoted-printable

On 11/06/14 08:31, Konstantin Belousov wrote:
> I do not remember exact point in the stable/9 lifetime when the
> debug.vn_io_fault_enable was merged. If it is present in your system,
> frob its value to 1 and see. I highly suspect that if varnish is in
> 'mmap' mode (whatever it is called), and you use UFS, it may help.

Seems it is not present in 9.1-RELEASE, but it is in 9.2-RELEASE. I've
toggled that sysctl on the one machine running 9.2-RELEASE, but I doubt
we're going to get any useful results from it before we start
upgrading -- the traffic flood that triggered this was exceptional, and
an isolated incident.

> I am suggesting this before upgrading to 10 only because I want to
> know whether the vn_io_fault code helps in this situation. There
> are rumors that it does, but I never seen the confirmation.

Hmmm.... well, our theory about this is that we see the effect when the
total traffic is sufficiently high that we're hitting the network
capacity, and dropping some packets.
(The actual traffic load on an individual server was big, but nothing
like saturating the network. It's the total that was maxing out our
uplink to the Internet.)

We simulated the effect by sticking a test box on a 10Mb/s connection
and throwing a lot of requests for a largeish (1MB) file at it. The
packet loss seems to be important -- presumably it's clogging up the
available mbufs with old packets that haven't received an ACK yet, so
have to be held onto in case they need to be resent.

	Cheers,

	Matthew

--VvAqFpo4W0tEH1bcbUbxO1cCWI634SnAJ
Content-Type: application/pgp-signature; name="signature.asc"
Content-Description: OpenPGP digital signature
Content-Disposition: attachment; filename="signature.asc"

-----BEGIN PGP SIGNATURE-----
Version: GnuPG v2

iQJ8BAEBCgBmBQJUW0MXXxSAAAAAAC4AKGlzc3Vlci1mcHJAbm90YXRpb25zLm9w
ZW5wZ3AuZmlmdGhob3JzZW1hbi5uZXQxOUYxNTRFQ0JGMTEyRTUwNTQ0RTNGMzAw
MDUxM0YxMEUwQTlFNEU3AAoJEABRPxDgqeTnnrEP/2xfDpzp2fJ1X9kN1hCf2IbL
6+eHA0zDsGkCbUdha6RbyIbj+cB9+PT+7wribyzP6FyVvyA24/osQ5iUL6m2Cqnu
OTrdf8/yVnLITjRGZFqSjAjens5eS/TuHk1NiIcTGnIs8RPxTgpeZconicB7uau0
7QmatroQpnOfwXT4x8pcLT8ZOI+9Px0Ng1wAD85TR8e7FjzAezsnvQtLgxNWKSVT
rEmRQKP03DeXkyGpbWU2tz/jqCtVufoI/UkATfhDqbDTx5FSleww9kst9h+1tbv8
Ncjr6rIakJ/zan96FNJvT9++SElVYAxJS+1VGJmLFBgpgSKKjYO7x7Q6DUCOJK7I
qY180n/M1R8uhRtS5daS4Ji0ZnChUrEmru/vgOC2d70bHGHZuQ3E72gaiNqLJPPn
o0zvEVT6uJk1uWwvIIUdcGv/B5DyY/tDNPVH2bRaB1/RRMarkiOmEU6ibmLsykO/
ocI0qPA2hZEqfj1KqPK8+WweRE78tH9blkoK3uYbn0ZuHN6dlHJ36yTJC/S4o2z6
zHUlHLiZ1LCpq3JAt0IDlJwazzVtUokPu7Qv47F7aGZv6sV+oos4P27U7z7N7U/u
5KlCvWxmUNBQDbSLbQjYn1XjCAWFqUX8/3MVBh9Oz8dMuihRyeyKMuRdnLxdHAOp
hzvG/7i7dIX/Nyc14g8B
=ORlW
-----END PGP SIGNATURE-----

--VvAqFpo4W0tEH1bcbUbxO1cCWI634SnAJ--