From: Paul Keusemann
Date: Tue, 26 Jul 2011 13:35:16 -0500
To: Gary Palmer
Cc: freebsd-net@freebsd.org
Subject: Re: Debugging dropped shell connections over a VPN
Message-ID: <4E2F08E4.2070100@visi.com>
In-Reply-To: <20110726130549.GD1339@in-addr.com>

On 07/26/11 08:05, Gary Palmer wrote:
> On Tue, Jul 26, 2011 at 06:53:59AM -0500, Paul Keusemann wrote:
>> Again, sorry for the sluggish response.
>>
>> On 07/20/11 15:15, Gary Palmer wrote:
>>> On Tue, Jul 12, 2011 at 02:26:34PM -0500, Paul Keusemann wrote:
>>>> On 07/07/11 14:39, Chuck Swiger wrote:
>>>>> On Jul 7, 2011, at 4:45 AM, Paul Keusemann wrote:
>>>>>> My setup is something like this:
>>>>>> - My local network is a mix of AIX, HP-UX, Linux, FreeBSD and Solaris
>>>>>>   machines running various OS versions.
>>>>>> - My gateway / firewall machine is running FreeBSD-8.1-RELEASE-p1 with
>>>>>>   ipfw, nat and racoon for the firewall and VPN.
>>>>>>
>>>>>> The problem is that rlogin, ssh and telnet connections over the VPN get
>>>>>> dropped after some period of inactivity.
>>>>> You're probably getting NAT timeouts against the VPN connection if it is
>>>>> left idle. racoon ought to have a config setting called natt_keepalive
>>>>> which sends periodic keepalives; see whether that's disabled.
>>>>>
>>>>> Regards,
>>>> Thanks for the suggestions Chuck, sorry it's taken so long to respond
>>>> but I had to reconfigure and rebuild my kernel to enable IPSEC_NAT_T in
>>>> order to try this out.
>>>>
>>>> One thing that I did not explicitly mention before is that I am routing
>>>> a network over the VPN.
>>> Hi Paul,
>>>
>>> Even if you are not being NAT'd on the VPN there may be a firewall (or
>>> other active network component like a load balancer) with an
>>> overflowing state table somewhere at the remote end. We see this
>>> frequently where I work with customer networks and the firewall/VPN/network
>>> admin denies that it's a timeout issue so there is likely some device in
>>> the network that has a state table and if the connection is idle for a
>>> few minutes it gets dropped.
>> Hmmm, this seems likely. Have you had any luck in finding the culprit
>> and resolving the problem?
> Unfortunately no.
> We know the problem exists but as a vendor we have
> very little success in getting the customer to identify the problematic
> device inside their network as it only seems to affect our connections
> to them when we are helping them with problems, so there is almost
> always something more important going on and the timeout issue gets put
> on the back burner and forgotten. We've worked around it in some
> places by using the ssh 'ServerAliveInterval' directive to make ssh
> send packets and keep the session open even if we're idle, but that
> doesn't always work.

OK, I found the ClientAliveInterval and ClientAliveCountMax settings in
the sshd_config man page. I assume these are what you are referring to.
I tried setting ClientAliveInterval to 15 seconds with
ClientAliveCountMax set to 3 and this seems to help. I've only tried
this a couple of times but I have seen an ssh session stay alive for
over an hour.

The bad news is that the sessions are still getting dropped; at least
now I know when it happens. Now I'm getting the following message:

Received disconnect from 10.64.20.69: 2: Timeout, your session not responding.

From a quick perusal of the openssh source, it is not obvious whether
this message is coming from the client or the server side. Initially,
because the keepalive timer is a server-side setting, I assumed the
message was coming from the server side, but if the session is not
responding, how is the message getting to the client?

If it is a client-side problem, then I have much more flexibility to
fix it. All I can do is whine about server-side problems.

Paul

> Gary

-- 
Paul Keusemann                                 pkeusem@visi.com
4266 Joppa Court                               (952) 894-7805
Savage, MN 55378
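
For reference, the two keepalive mechanisms named in this thread are configured in different files: ServerAliveInterval and ServerAliveCountMax are client options (ssh_config), while ClientAliveInterval and ClientAliveCountMax are server options (sshd_config). The fragment below is only a sketch using the 15-second / 3-probe values mentioned above; the "Host *" pattern and the file locations are assumptions and may differ on a given system.

```
# Client side: ~/.ssh/config (read by ssh)
# "Host *" is an illustrative pattern; narrow it to the hosts behind the VPN.
Host *
    # Send a probe through the encrypted channel after 15 s without data
    # from the server.
    ServerAliveInterval 15
    # Give up and disconnect after 3 consecutive unanswered probes.
    ServerAliveCountMax 3

# Server side: /etc/ssh/sshd_config (read by sshd; path is an assumption)
# Same idea in the other direction: sshd probes the client.
ClientAliveInterval 15
ClientAliveCountMax 3
```

With either pair set this way, the probing side drops the session after roughly 45 seconds (3 probes, 15 seconds apart) without a reply. The client-side pair needs no access to the remote server, which matters when only the local end can be changed.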