From owner-freebsd-current  Fri Sep 17 16:16: 8 1999
Delivered-To: freebsd-current@freebsd.org
Received: from proxy2.ba.best.com (proxy2.ba.best.com [206.184.139.14])
	by hub.freebsd.org (Postfix) with ESMTP id 8041715B54
	for <current@FreeBSD.ORG>; Fri, 17 Sep 1999 16:16:01 -0700 (PDT)
	(envelope-from dillon@apollo.backplane.com)
Received: from apollo.backplane.com ([209.157.86.2])
	by proxy2.ba.best.com (8.9.3/8.9.2/best.out) with ESMTP id QAA04194
	for <current@FreeBSD.ORG>; Fri, 17 Sep 1999 16:12:31 -0700 (PDT)
Received: (from dillon@localhost)
	by apollo.backplane.com (8.9.3/8.9.1) id NAA54939;
	Fri, 17 Sep 1999 13:02:11 -0700 (PDT)
	(envelope-from dillon)
Date: Fri, 17 Sep 1999 13:02:11 -0700 (PDT)
From: Matthew Dillon <dillon@apollo.backplane.com>
Message-Id: <199909172002.NAA54939@apollo.backplane.com>
To: Brad Knowles <blk@skynet.be>
Cc: current@FreeBSD.ORG
Subject: Re: 2xPIIIx450 results & NFS results (was More benchmarking 
 stuff...)
References: <XFMail.990917112639.lh@aus.org>
 <199909171658.JAA53751@apollo.backplane.com>
 <v0420553bb40826e849a4@[195.238.1.121]>
 <199909171856.LAA54721@apollo.backplane.com> <v04205547b40842e2de12@[195.238.1.121]>
Sender: owner-freebsd-current@FreeBSD.ORG
Precedence: bulk
X-Loop: FreeBSD.ORG

:	Note that with a mail server, this is precisely the sort of thing 
:that happens with /var/spool/mqueue.  In particular, with sendmail, a 
:qf/df pair of files get created, the message is received, the sender 
:is told "250 Ok", then sendmail goes to deliver the message in the 
:background, which 95-99% of the time happens on the first attempt, 
:and then the qf/df pair of files get deleted.
:
:	So, again, we see that they've actually done a decent first-pass 
:attempt at simulating the load a mail server would place on the 
:filesystem.  All that we need to add now are a few more features.  ;-)
:...
:|o| Brad Knowles, <blk@skynet.be>            Belgacom Skynet NV/SA |o|

    Well, actually I would disagree quite strongly with you here.

    Sendmail does not get into trouble with queue files it is able to retire
    quickly.  Where sendmail gets into trouble is with queue files it ISN'T
    able to retire quickly.  This is why you *see* 10,000+ files in mqueue 
    at times.  These files build up because a small percentage of mail 
    destinations cannot be delivered to immediately.

    These files are being continuously rescanned by sendmail queue runs. 
    It is because of these files that you get good hit-rates on the name
    cache.

    The reason sendmail tends to break down with large queue directories has
    little to do with directory overhead and a lot to do with sendmail's own
    algorithms.  If you have 50 sendmails running a 10,000 file queue, each
    of those sendmail processes is essentially scanning the entire queue.
    That is, sendmail is implementing an O(N^2) algorithm regardless of
    the directory overhead.  When you add UFS cache-miss directory scan
    overhead to the fray, it becomes O(N^3).
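
    As a rough illustration of the scaling argument above, the sketch
    below counts control-file stat() calls when every queue runner scans
    the whole queue.  The function names, and the assumption that runners
    are forked in proportion to the backlog (one per 200 files), are
    hypothetical and only there to show the shape of the curve; this is a
    back-of-envelope model, not a measurement of sendmail.

```python
def stat_calls(queue_files: int, runners: int) -> int:
    """Stat calls for one pass when every runner scans every control file."""
    return queue_files * runners


def stat_calls_proportional(queue_files: int, files_per_runner: int = 200) -> int:
    """Same count, assuming runners grow in proportion to the backlog.

    files_per_runner is an illustrative assumption, not a sendmail setting.
    """
    runners = max(1, queue_files // files_per_runner)
    return stat_calls(queue_files, runners)


# The example from the text: 50 queue runners over a 10,000-file queue.
print(stat_calls(10_000, 50))            # 500000

# Doubling the queue doubles the runners too, so the work quadruples.
print(stat_calls_proportional(10_000))   # 500000
print(stat_calls_proportional(20_000))   # 2000000
```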

    The MinQueueAge sendmail option helps considerably, but every sendmail 
    is still going through and scanning the directory and stat'ing every
    control file.
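
    A minimal sketch of why a minimum queue age reduces delivery attempts
    but not scan cost.  It assumes, for illustration only, that the option
    simply skips entries younger than the threshold; the function name and
    the sample ages are hypothetical.

```python
def queue_pass(ages_minutes, min_queue_age_minutes):
    """Return (stat_calls, delivery_attempts) for one pass over the queue."""
    stat_calls = 0
    attempts = 0
    for age in ages_minutes:
        stat_calls += 1              # every control file is still examined
        if age >= min_queue_age_minutes:
            attempts += 1            # only sufficiently old entries are retried
    return stat_calls, attempts


# Hypothetical minutes since each queued message was last tried.
ages = [1, 5, 30, 59, 60, 90, 240]

print(queue_pass(ages, 0))    # no minimum age: (7, 7)
print(queue_pass(ages, 60))   # one-hour minimum: (7, 3)
```

    The attempt count drops, but the stat count is unchanged, which is the
    part that still scales with queue size.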

    If not controlled, this eventually leads to a cascade failure.  The
    potential for a cascade failure is, in fact, the number one reason for
    *NOT* running sendmail with background queueing mode turned on.  The
    best way to avoid a cascade failure is to run the sendmail daemon in 
    queue-only mode with a set fork limit:

	sendmail -bd -OMaxDaemonChildren=X -ODeliveryMode=q

    And run the sendmail queue runner separately:

	sendmail -q1m -OMaxDaemonChildren=Y -OMinQueueAge=1h

    If you run the sendmail daemon in background-delivery mode it is possible
    to saturate the system with running processes that stick around trying
    to deliver mail to downed destinations.  If you do not separate the
    queue-running from the daemon accepting connections you can wind up in
    the situation where one or the other hogs the MaxDaemonChildren process
    limit.  But if you run them separately, with separate limits, you give 
    the system a chance to recover from 'blow-up' situations without 
    requiring intervention from the sysop.

    Just controlling the number of sendmails running the queue immediately
    solves many of the directory-too-big problems by preventing a queue-run
    cascade failure (where sendmails are forked to run the queue more 
    quickly than they can be retired).
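
    The cascade can be sketched with a toy model.  It assumes one queue
    runner is forked per minute (as with -q1m) and that a runner's run
    time grows with the number of runners already active, since they all
    scan the same directory.  The function, its parameters, and the linear
    slowdown are illustrative assumptions, not sendmail internals.

```python
def simulate(minutes, base_run_minutes=2, max_children=None):
    """Return the number of live queue runners after each one-minute tick.

    Toy assumption: run time is proportional to the number of runners
    active at fork time (including the new one).
    """
    finish_times = []   # minute at which each live runner will exit
    history = []
    for now in range(minutes):
        # Retire runners whose run has completed.
        finish_times = [t for t in finish_times if t > now]
        # Fork one runner per tick unless the process limit is reached.
        if max_children is None or len(finish_times) < max_children:
            run_minutes = base_run_minutes * (len(finish_times) + 1)
            finish_times.append(now + run_minutes)
        history.append(len(finish_times))
    return history


unlimited = simulate(120)
limited = simulate(120, max_children=5)
print("peak runners without a limit:", max(unlimited))
print("peak runners with a limit of 5:", max(limited))
```

    Without a limit the runner count keeps climbing because runners are
    forked faster than they exit; with a limit it can never exceed the
    cap, which gives the queue a chance to drain.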

:	An absolutely full newsfeed these days is running somewhere 
:around 1.1 million files comprising some 55GB of data (see 
:<http://transit.us-va.remarq.com/feed-size/>), or an average of 
:52,608.71 bytes per article.  A very busy mail server might do a 
:million messages per day (or more), but the average message size 
:would be much closer to 2-5KB.


    This is BEST's mailing-list server.  154 million messages out since 
    Aug 11 last year - around 370 days.  A little less than half a million
    messages a day on average.  Amazingly enough, barely a terabyte a year
    in traffic.

    In any case, the average from this box is around 5K/msg outgoing.

Statistics from Tue Aug 11 14:07:32 1998
 M    msgsfr  bytes_from     msgsto    bytes_to  msgsrej  msgsdis  Mailer
 0         0          0K    3971636   38000691K        0        0  prog
 1         0          0K   15894749   95543012K        0        0  *file*
 3  12160185  135184699K          1          1K     2794        0  local
 5   5455615   24980285K  154042760  819622748K   910650     3603  esmtp
=========================================================================
 T  17615800  160164984K  173909146  953166452K   913444     3603
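
    The per-day and per-message figures quoted above can be checked
    against the esmtp and totals rows of the table.  The variable names
    are mine, and the conversion assumes the "K" suffix means 1024 bytes.

```python
DAYS = 370                      # "around 370 days", per the text
esmtp_msgs_to = 154_042_760     # msgsto, esmtp row
esmtp_kbytes_to = 819_622_748   # bytes_to in K, esmtp row
total_kbytes_to = 953_166_452   # bytes_to in K, totals row

msgs_per_day = esmtp_msgs_to / DAYS
avg_kbytes_per_msg = esmtp_kbytes_to / esmtp_msgs_to
# Assumption: "K" is 1024 bytes; a terabyte is taken as 10**12 bytes.
total_terabytes = total_kbytes_to * 1024 / 1e12

print(f"{msgs_per_day:,.0f} msgs/day")
print(f"{avg_kbytes_per_msg:.2f} K/msg")
print(f"{total_terabytes:.2f} TB outgoing")
```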

     The key issue with any mail server is that bandwidth and transaction
     usage tend to be low, relatively speaking.  A USENET news system
     almost always has much higher transactional overhead, especially if it
     is taking several feeds.  A million news messages a day translates to
     around 10 million protocol transactions for a news box taking 4 feeds.

     A mail server has many fewer transactions so you can actually afford to
     spend more time servicing them.  What you cannot afford to spend time
     doing in a mail server is scanning the same queue file over and over
     again, so what you want to optimize for are the 5% of email messages
     that wind up stuck in the queue for more than a few minutes but usually
     less than an hour, and then make sure the 1% that stick around past
     that do not interfere with the processing of those that stick around 
     less.

					-Matt
					Matthew Dillon 
					<dillon@backplane.com>

