Date: Sat, 15 Sep 2007 08:51:23 -0400
From: Bill Moran <wmoran@potentialtech.com>
To: robert@webtent.com
Cc: Robert Fitzpatrick <lists@webtent.net>, FreeBSD <freebsd-questions@freebsd.org>
Subject: Re: Concurrency limit warning in Postfix leads to server lock
Message-ID: <20070915085123.9c00d54b.wmoran@potentialtech.com>
In-Reply-To: <1189858099.10939.28.camel@columbus.webtent.org>
References: <1189858099.10939.28.camel@columbus.webtent.org>

Robert Fitzpatrick <lists@webtent.net> wrote:
>
> I have dilemma with one of our 5.4 server mail gateways. About 2-3 times
> a month now the server SMTP and related services stop responding. I find
> myself not able to login, just sits there after entering user name. I
> have to reset the server and the only thing I can find with an 'egrep
> (fatal|error|warn)' in the messages and maillog are these concurrency
> limit warnings minutes before the issue started...
>
> Sep 15 07:19:02 esmtp postfix/smtpd[2789]: warning: Connection
> concurrency limit exceeded: 51 from unknown[88.238.96.247] for service
> smtp
>
> This seems to be an attacker of some sort, I block them and the issue
> goes away, of course. I posted my issue to the Postfix list, but was
> told this should not be taking down my server and to find out why I'm
> not able to login when this happens.

It shouldn't.

> I am looking for help on where to
> look to determine this, can someone give some guidance? Some other log I
> should examine? The only thing I can spot that looks possibly out of
> place is nfsd running at 6-8% CPU. I do a backup from one other server
> to this server via nfs. I checked and all that backup was finished
> couple of hours prior to this latest issue, but the nfsd process seems
> to be taking more CPU than normal. And when I reboot, the nfs connection
> I have in /etc/fstab takes several seconds to initialize.

Definitely sounds like some networking issues.

I can't give you a direct "answer" because the question is too vague
(although I think you described it to the best of your ability). Instead,
I'll outline how I would go about tracking it down and solving it.

* Start with the nfs thing. It seems to indicate a network problem, which
  will skew everything else you investigate unless you fix it first. Try
  some large FTP transfers between those two servers (FTP has very little
  overhead, and is thus a good gauge of network performance).
  If the FTP transfer isn't getting within 20% of the theoretical
  capability of the network, then you probably have a network problem.
  Carefully investigate speed/duplex settings, and whether your switching
  hardware is crappy or simply overloaded. In short, find the network
  problem and fix it.

* Next time it happens, make absolutely sure it's refusing login. Under a
  DDoS or similar attack, it can take several seconds for ssh to complete
  the protocol negotiation. If DNS is running slow, longer. Are you
  waiting until the ssh client actually times out before giving up? Even
  then, it might connect on the second or third try. Try setting
  ConnectTimeout to 300 in /etc/ssh/ssh_config and see if it connects.
  I've seen network problems cause ssh to take 45 seconds or longer to
  connect, and that's to be expected under certain network circumstances.

* Get MRTG or some other trend gathering system running on that machine
  so you have other stats to look at when the problem happens; this may
  point you to the source of the problem very quickly. In general, it's a
  good idea to have this on production systems so you can see what's
  happening. With MRTG (and similar software) you can, and should!, graph
  a lot more than network usage. Graph disk reads/writes, CPU usage, swap
  usage, and memory usage. A system that's heavily into swap will respond
  dog-slow, and that could be your problem.

Hope these help you narrow down the problem.

-- 
Bill Moran
http://www.potentialtech.com
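
P.S. To make the "within 20% of theoretical" rule of thumb concrete, here
is a rough Python sketch of the arithmetic. The function names and the
sample figures (700 MB in 75 seconds on a 100 Mbit/s link) are made up for
illustration and are not measurements from this thread. It also treats the
nominal link rate as the "theoretical capability" and ignores protocol
overhead, so take the result as a rough indicator only.

```python
def link_utilization(bytes_transferred, seconds, link_mbps):
    """Return achieved throughput as a fraction of the nominal link rate.

    link_mbps is the nominal rate in megabits per second (e.g. 100 for
    Fast Ethernet); protocol overhead is ignored in this sketch.
    """
    if seconds <= 0 or link_mbps <= 0:
        raise ValueError("seconds and link_mbps must be positive")
    achieved_mbps = bytes_transferred * 8 / seconds / 1_000_000
    return achieved_mbps / link_mbps


def looks_healthy(bytes_transferred, seconds, link_mbps, tolerance=0.20):
    """True if the transfer got within `tolerance` of the nominal rate."""
    utilization = link_utilization(bytes_transferred, seconds, link_mbps)
    return utilization >= 1.0 - tolerance


if __name__ == "__main__":
    # Hypothetical numbers read off an FTP client's transfer summary.
    size_bytes, elapsed_s, nominal_mbps = 700_000_000, 75, 100
    util = link_utilization(size_bytes, elapsed_s, nominal_mbps)
    verdict = "ok" if looks_healthy(size_bytes, elapsed_s, nominal_mbps) else "suspect"
    print(f"{util:.0%} of nominal link rate: {verdict}")
```

Plug in the byte count and elapsed time your FTP client reports for a
large file. Run the transfer in both directions, since a duplex mismatch
often shows up as much worse throughput one way than the other.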