From: Eirik Øverby
To: freebsd-stable@freebsd.org
Date: Sun, 15 Jan 2012 19:36:16 +0100
Subject: Re: Random 'Connection reset' issues between jails on same host

On Jan 15, 2012, at 18:44, Eirik Øverby wrote:

> Hi all,
>
> We're trying to implement our puppet infrastructure, and have discovered something strange about TCP connections between jails on the same host.
> As our jails haven't generally been making many connections to each other, this issue hasn't popped up before.
>
> We have two identical host systems running FreeBSD 8.2-RELEASE-p4. These are 8-core Intel systems with 16GB RAM each. I have just upgraded one of the two systems to 9.0-RELEASE, and it shows the same problem.
>
> When the puppetmaster jail is running on the same host as the jail running the puppet agent, connections from the puppet agent randomly fail with 'Connection reset by peer'. This happens at random stages of the configuration sync. If either of the jails is moved to another system (jail stop, zfs snapshot, zfs send/recv, jail start) on the same physical network, there are no such problems. It is not a hardware issue, as this happens no matter which of the two hosts we use. If both puppetmaster and puppet agent reside on the same physical box, the errors show up.

Replying to myself here:

Assigning a cpuset with a single CPU to the jail with puppetmaster seems to cure the symptom. I've made a few thousand connects now with no failures so far. Repeatable on 8 and 9. This is obviously only a workaround, but it may give some hints as to where the problem is.

/Eirik
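
For anyone who wants to repeat the "few thousand connects" check, a rough sketch in Python follows. It is not the script used for the numbers above. The function name, the payload, and the assumption that the service answers with at least some bytes are all illustrative; adjust them to whatever the target jail is actually serving.

```python
import socket


def count_connect_failures(host, port, attempts=1000, timeout=5.0, payload=b"ping\n"):
    """Open `attempts` TCP connections one after another.

    Returns a tuple (resets, other_errors). The name and parameters are
    hypothetical; nothing here is specific to puppet or to jails.
    """
    resets = 0
    other_errors = 0
    for _ in range(attempts):
        try:
            # A fresh connection per attempt, closed again by the `with` block.
            with socket.create_connection((host, port), timeout=timeout) as sock:
                sock.sendall(payload)
                # A 'Connection reset by peer' typically surfaces on send or recv.
                sock.recv(1024)
        except ConnectionResetError:
            resets += 1
        except OSError:
            # Refused connections, timeouts and other socket errors land here.
            other_errors += 1
    return resets, other_errors
```

Run from inside the agent jail against the address and port of the puppetmaster jail, e.g. `count_connect_failures("192.0.2.10", 8140, attempts=5000)` (both values are placeholders), once with and once without the single-CPU cpuset applied. A non-zero first value in the result would correspond to the resets described above.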