From: Steven Hartland
Date: Wed, 05 Nov 2014 12:00:59 +0000
To: Matthew Seaman, freebsd-stable@freebsd.org
Subject: Re: Varnish proxy goes catatonic under heavy load
Message-ID: <545A117B.4080606@multiplay.co.uk>
In-Reply-To: <545A0EB4.4090404@freebsd.org>

As a guess, you exhausted all mbufs. FreeBSD 10 has much better defaults for these, so I'd recommend updating. If you can get in via IPMI or something similar, you should be able to confirm.

A trick I've used in the past to recover from such an issue is to hard bounce the NIC ports on the switch, which seemed to free enough mbufs to be able to ssh in.

On 05/11/2014 11:49, Matthew Seaman wrote:
> Dear all,
>
> We had an unfortunate set of circumstances which resulted in several
> million people all trying to download about 1.5MB worth of images from
> our servers over the course of a few hours. Or, at least, it would have
> been a few hours, except that our three varnish proxies just crumbled
> under the load within 10 minutes.
>
> Now, that's bad enough, but we could have just about coped if the
> proxies stopped serving requests for a few minutes.
> What actually happened was that all three servers went catatonic on
> the network *and stayed that way*: even when we shunted the traffic
> away from one, we still couldn't access it via ssh or any network
> protocol. And it stayed like that for sufficiently long time that we
> had no recourse other than to get the servers rebooted.
>
> Can anyone explain what was happening here? Not having the servers
> recover accessibility for an extended period even after the excess
> traffic was stopped is unacceptable. We're also struggling to recreate
> the effect in the lab: any clues about how to do so, and any suggestions
> about how to prevent the 'going catatonic' response would be greatly
> appreciated.
>
> Servers are amd64 running FreeBSD 9.1 or 9.2 and Varnish 3.0.5.
>
> Cheers,
>
> Matthew
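
For confirming the mbuf guess from a console session: `netstat -m` prints the cluster usage against its limit plus a count of denied requests, and `sysctl kern.ipc.nmbclusters` shows the configured limit. Below is a minimal Python sketch of how that output could be checked automatically. It assumes the usual `netstat -m` line layout; the sample numbers are invented rather than taken from the affected servers, and `parse_netstat_m` and `looks_exhausted` are hypothetical helper names, not part of the base system.

```python
import re

# Sample text in the layout that FreeBSD's `netstat -m` prints. The numbers are
# invented for illustration and do not come from the affected servers.
SAMPLE = """\
25600/0/25600/25600 mbuf clusters in use (current/cache/total/max)
1543/212/1755 requests for mbufs denied (mbufs/clusters/mbuf+clusters)
"""


def parse_netstat_m(text):
    """Extract cluster usage and denied-request counts from `netstat -m` text."""
    stats = {}
    for line in text.splitlines():
        m = re.match(r"(\d+)/(\d+)/(\d+)/(\d+) mbuf clusters in use", line)
        if m:
            _current, _cache, total, limit = map(int, m.groups())
            stats["clusters_total"] = total
            stats["clusters_max"] = limit
        m = re.match(r"(\d+)/(\d+)/(\d+) requests for mbufs denied", line)
        if m:
            # True if any of the three denied counters is non-zero.
            stats["any_denied"] = any(int(g) > 0 for g in m.groups())
    return stats


def looks_exhausted(stats):
    """Hypothetical heuristic: cluster pool at its limit, or any request denied."""
    limit = stats.get("clusters_max", 0)
    at_limit = limit > 0 and stats["clusters_total"] >= limit
    return at_limit or stats.get("any_denied", False)


if __name__ == "__main__":
    print(looks_exhausted(parse_netstat_m(SAMPLE)))
```

On a live box the text would come from running `netstat -m` rather than from `SAMPLE`. A non-zero denied count is the stronger signal, since the cluster pool can sit at its limit briefly without anything being refused.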