Date: Fri, 17 Sep 1999 13:02:11 -0700 (PDT)
From: Matthew Dillon
Message-Id: <199909172002.NAA54939@apollo.backplane.com>
To: Brad Knowles
Cc: current@FreeBSD.ORG
Subject: Re: 2xPIIIx450 results & NFS results (was More benchmarking stuff...)
References: <199909171658.JAA53751@apollo.backplane.com> <199909171856.LAA54721@apollo.backplane.com>

:	Note that with a mail server, this is precisely the sort of thing
:that happens with /var/spool/mqueue.  In particular, with sendmail, a
:qf/df pair of files get created, the message is received, the sender
:is told "250 Ok", then sendmail goes to deliver the message in the
:background, which 95-99% of the time happens on the first attempt,
:and then the qf/df pair of files get deleted.
:
:	So, again, we see that they've actually done a decent first-pass
:attempt at simulating the load a mail server would place on the
:filesystem.  All that we need to add now are a few more features.  ;-)
:...
:|o| Brad Knowles, Belgacom Skynet NV/SA |o|

    Well, actually I would disagree quite strongly with you here.  Sendmail
    does not get into trouble with queue files it is able to retire quickly.
    Where sendmail gets into trouble is with queue files it ISN'T able to
    retire quickly.  This is why you *see* 10,000+ files in mqueue at times.
    These files build up because a small percentage of mail destinations
    cannot be delivered to immediately.  These files are being continuously
    rescanned by sendmail queue runs.  It is because of these files that you
    get good hit-rates on the name cache.

    The reason sendmail tends to break down with large queue directories has
    little to do with directory overhead and a lot to do with sendmail's own
    algorithms.  If you have 50 sendmails running a 10,000 file queue, each
    of those sendmail processes is essentially scanning the entire queue.
    That is, sendmail is implementing an O(N^2) algorithm regardless of the
    directory overhead.  When you add UFS cache-miss directory scan overhead
    to the fray, it becomes O(N^3).

    The MinQueueAge sendmail option helps considerably, but every sendmail
    is still going through and scanning the directory and stat'ing every
    control file.  If not controlled, this eventually leads to a cascade
    failure.  The potential for a cascade failure is, in fact, the number
    one reason for *NOT* running sendmail with background queueing mode
    turned on.

    The best way to avoid a cascade failure is to run the sendmail daemon in
    queue-only mode with a set fork limit:

	sendmail -bd -OMaxDaemonChildren=X -ODeliveryMode=q

    And run the sendmail queue runner separately:

	sendmail -q1m -OMaxDaemonChildren=Y -OMinQueueAge=1h

    If you run the sendmail daemon in background-delivery mode it is
    possible to saturate the system with running processes that stick around
    trying to deliver mail to downed destinations.  If you do not separate
    the queue-running from the daemon accepting connections you can wind up
    in the situation where one or the other hogs the MaxDaemonChildren
    process limit.  But if you run them separately, with separate limits,
    you give the system a chance to recover from 'blow-up' situations
    without requiring intervention from the sysop.
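    To put rough numbers on the scan-cost argument above, here is a minimal
    Python sketch.  It only counts control-file stat() calls per queue pass
    and is not a model of sendmail's actual code.  The 50 runners and 10,000
    files are the figures from this message; the growth rule of one runner
    per 200 queued files and the cap of 5 runners are assumptions picked
    purely for illustration.

```python
def stats_per_pass(queue_files: int, runners: int) -> int:
    """Control files examined when every runner scans the whole queue."""
    return queue_files * runners


def runners_uncapped(queue_files: int, files_per_runner: int = 200) -> int:
    """Hypothetical cascade: the number of runners grows with the backlog."""
    return max(1, queue_files // files_per_runner)


def runners_capped(queue_files: int, cap: int = 5,
                   files_per_runner: int = 200) -> int:
    """Same growth rule, but bounded by a fixed process limit (assumed cap)."""
    return min(cap, runners_uncapped(queue_files, files_per_runner))


if __name__ == "__main__":
    for n in (10_000, 20_000, 40_000):
        uncapped = stats_per_pass(n, runners_uncapped(n))
        capped = stats_per_pass(n, runners_capped(n))
        print(f"{n:>6} files: uncapped {uncapped:>10} stats, "
              f"capped {capped:>8} stats")
```

    Under these assumptions, doubling the queue quadruples the work when the
    runner count is left to grow with the backlog, but only doubles it when
    the runner count is capped.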
    Just controlling the number of sendmails running the queue immediately
    solves many of the directory-too-big problems by preventing a queue-run
    cascade failure (where sendmails are forked to run the queue more
    quickly than they can be retired).

:	An absolutely full newsfeed these days is running somewhere
:around 1.1 million files comprising some 55GB of data (see
:), or an average of
:52,608.71 bytes per article.  A very busy mail server might do a
:million messages per day (or more), but the average message size
:would be much closer to 2-5KB.

    This is BEST's mailing-list server.  154 million messages out since
    Aug 11 last year, around 370 days.  A little less than half a million
    messages a day on average.  Amazingly enough, barely a terabyte a year
    in traffic.  In any case, the average from this box is around 5K/msg
    outgoing.

Statistics from Tue Aug 11 14:07:32 1998
 M   msgsfr  bytes_from    msgsto    bytes_to  msgsrej msgsdis  Mailer
 0        0          0K   3971636   38000691K        0       0  prog
 1        0          0K  15894749   95543012K        0       0  *file*
 3 12160185  135184699K         1          1K     2794       0  local
 5  5455615   24980285K 154042760  819622748K   910650    3603  esmtp
=========================================================
 T 17615800  160164984K 173909146  953166452K   913444    3603

    The key issue with any mail server is that bandwidth and transaction
    usage tend to be low, relatively speaking.  A USENET news system almost
    always has much higher transactional overhead, especially if it is
    taking several feeds.  A million news messages a day translates to
    around 10 million protocol transactions for a news box taking 4 feeds.
    A mail server has many fewer transactions so you can actually afford to
    spend more time servicing them.
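    The averages quoted above can be rechecked from the esmtp and total rows
    of the mailstats output.  A small Python sketch, assuming the "K" suffix
    means 1024 bytes and taking the stated 370 days at face value:

```python
# Figures copied from the esmtp and "T" rows of the mailstats output.
ESMTP_MSGS_TO = 154_042_760
ESMTP_KBYTES_TO = 819_622_748
TOTAL_KBYTES_TO = 953_166_452
DAYS = 370                      # "around 370 days", as stated in the message
BYTES_PER_K = 1024              # assumption: mailstats "K" is 1024 bytes


def msgs_per_day() -> float:
    """Average outgoing esmtp messages per day."""
    return ESMTP_MSGS_TO / DAYS


def avg_kbytes_per_msg() -> float:
    """Average outgoing esmtp message size, in K."""
    return ESMTP_KBYTES_TO / ESMTP_MSGS_TO


def total_terabytes_out() -> float:
    """Total outgoing traffic over the period, in decimal terabytes."""
    return TOTAL_KBYTES_TO * BYTES_PER_K / 1e12


if __name__ == "__main__":
    print(f"messages/day : {msgs_per_day():,.0f}")
    print(f"K per message: {avg_kbytes_per_msg():.2f}")
    print(f"TB out       : {total_terabytes_out():.3f}")
```

    The results land a little over 400,000 messages a day, a bit more than
    5K per message, and just under a terabyte out, which agrees with the
    round figures given above.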
    What you cannot afford to spend time doing in a mail server is scanning
    the same queue file over and over again, so what you want to optimize
    for are the 5% of email messages that wind up stuck in the queue for
    more than a few minutes but usually less than an hour, and then make
    sure the 1% that stick around past that do not interfere with the
    processing of those that stick around less.

						-Matt
						Matthew Dillon