From owner-freebsd-hackers@FreeBSD.ORG Thu Nov 20 16:57:11 2003
Return-Path:
Delivered-To: freebsd-hackers@freebsd.org
Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125])
	by hub.freebsd.org (Postfix) with ESMTP id C71DA16A4CE;
	Thu, 20 Nov 2003 16:57:11 -0800 (PST)
Received: from monkeyflinger.anonymizer.com (monkeyflinger.anonymizer.com [168.143.113.15])
	by mx1.FreeBSD.org (Postfix) with ESMTP id E1B8E43FDD;
	Thu, 20 Nov 2003 16:57:10 -0800 (PST)
	(envelope-from rabbi@anonymizer.com)
In-Reply-To:
References:
Mime-Version: 1.0 (Apple Message framework v606)
Content-Type: text/plain; charset=US-ASCII; format=flowed
Message-Id:
Content-Transfer-Encoding: 7bit
From: Len Sassaman
Date: Thu, 20 Nov 2003 16:57:10 -0800
To: Robert Watson
X-Mailer: Apple Mail (2.606)
cc: freebsd-hackers@freebsd.org
cc: freebsd-current@freebsd.org
Subject: Re: Help request: problems with a 5.1 server and large numbers of ssh users.
X-BeenThere: freebsd-hackers@freebsd.org
X-Mailman-Version: 2.1.1
Precedence: list
List-Id: Technical Discussions relating to FreeBSD
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
X-List-Received-Date: Fri, 21 Nov 2003 00:57:11 -0000

> Hmm. Well, it certainly sounds like a resource limit to me, especially if
> it's a nice round number like "150" or "300". However, I'm also having a
> bit of trouble seeing, off the top of my head, which limit it might be.
> It sounds like you've got the ones I would think of. A quick skim of
> sshd.c suggests that it is pretty careful to document various failure
> modes in debugging output. There are one or two failures where it does
> not log, and they include the call to pipe() in the server loop -- if that
> fails, it bails without an error, which is a little surprising. Could you
> post server debug output for the first connection to the server that
> fails? This would let us "see how far it got"... In particular, whether
> it did spawn a child process, etc.
>

I have never gotten this to fail when sshd is running in debug mode
(i.e., sshd -ddd). However, given that it doesn't fork when run with -d,
that still doesn't tell us too much.

When I set LogLevel DEBUG3, this is as much info as I am given in the
auth.log:

Nov 20 16:39:19 clyde sshd[63993]: Failed none for rabbi from 127.0.0.1 port 62701 ssh2

And this is the debug output for the connection, as seen from the
client:

bash-2.05b# ssh -vvv -l rabbi localhost
OpenSSH_3.6.1p2, SSH protocols 1.5/2.0, OpenSSL 0x0090701f
debug1: Reading configuration data /etc/ssh/ssh_config
debug1: Rhosts Authentication disabled, originating port will not be trusted.
debug2: ssh_connect: needpriv 0
debug1: Connecting to localhost [::1] port 22.
socket: Protocol not supported
debug1: Connecting to localhost [127.0.0.1] port 22.
debug1: Connection established.
debug1: identity file /root/.ssh/identity type -1
debug1: identity file /root/.ssh/id_rsa type -1
debug1: identity file /root/.ssh/id_dsa type -1
ssh_exchange_identification: Connection closed by remote host

This can't be a system-wide process related resource issue, I don't
think, because once a user connects and authenticates, there are no
problems of note. I'm leaning toward a socket related limit or
user-level limit. However, since sysctl tells me:

kern.ipc.maxsockbuf: 262144
kern.ipc.somaxconn: 16384
kern.ipc.numopensockets: 2201
kern.ipc.maxsockets: 49312

I tend to not believe the former, and why the latter would be occurring
escapes me as well.
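
For anyone who wants to see what a user-level (per-process) limit looks like when it makes pipe() fail, here is a minimal sketch. It is not taken from sshd.c and does not claim to reproduce the problem above. It assumes a Unix-like system with Python's standard resource module; the helper name pipes_until_failure and the limit of 64 descriptors are arbitrary choices made for illustration.

```python
import errno
import os
import resource


def pipes_until_failure(soft_limit=64):
    """Lower the soft RLIMIT_NOFILE, then call pipe() until it fails.

    Returns (pipes_created, errno_name). Hypothetical helper, for
    illustration only.
    """
    old_soft, old_hard = resource.getrlimit(resource.RLIMIT_NOFILE)
    # The soft limit may not exceed the hard limit.
    if old_hard == resource.RLIM_INFINITY:
        new_soft = soft_limit
    else:
        new_soft = min(soft_limit, old_hard)
    resource.setrlimit(resource.RLIMIT_NOFILE, (new_soft, old_hard))

    fds = []
    try:
        while True:
            try:
                # Each successful pipe() consumes two descriptors.
                fds.extend(os.pipe())
            except OSError as exc:
                name = errno.errorcode.get(exc.errno, str(exc.errno))
                return len(fds) // 2, name
    finally:
        # Release the descriptors and restore the original soft limit.
        for fd in fds:
            os.close(fd)
        resource.setrlimit(resource.RLIMIT_NOFILE, (old_soft, old_hard))


if __name__ == "__main__":
    count, err = pipes_until_failure()
    print("pipe() failed with %s after %d pipes" % (err, count))
```

The error the kernel returns here is only visible because the caller reports it. A caller that exits without logging it would leave behind nothing more than a closed connection.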