From owner-freebsd-isp@FreeBSD.ORG Sat Nov 22 06:49:30 2003
To: David
References: <20031121222817.GD19888@phobia.ms>
From: Chris Shenton
Date: Sat, 22 Nov 2003 09:50:28 -0500
In-Reply-To: <20031121222817.GD19888@phobia.ms> (david@madcoders.com's message of "Fri, 21 Nov 2003 17:28:17 -0500")
Message-ID: <86fzggihjf.fsf@PECTOPAH.shenton.org>
cc: freebsd-isp@freebsd.org
Subject: Re: huge email system
List-Id: Internet Services Providers

David writes:

> We need to build a stable, redundant, and speedy email system that
> will last for a few years. We need to handle about 500,000 emails
> per day. We have about 30,000 users, so we need a lot of storage.
>
> Our current plan was to implement the following.
> 2 SMTP only servers.
> 3 NFS servers with RAID and SCSI
> 2 POP3 servers.
>
> But that leads us to questions such as -
> - what would be the best way to authenticate?
> - would the NFS servers need gig nic's? or dual bonded 100Mbit cards?
> - what smtp server and what pop3 server to use (we want to use Maildir)
> - what raid level?

I'm finishing something like that now.
My design goals were:

- No single points of failure
- 1GB server-stored email
- SMTP+STARTTLS and SMTPS, IMAPS and IMAP+STARTTLS

It's over-designed for our population but the servers aren't the
expensive part; I believe it could scale to handle 100K users.

I'm replacing a sendmail-based system that's exceptionally hard to fix
because there are multiple single points of failure and no one wants
downtime.

I did the prototype on FreeBSD but the client preferred Solaris for
their production systems.

I'm using qmail with the excellent qmail-ldap patch suite from
www.nrg4u.com, plus courier-imap. OpenLDAP is used for authentication
and other user information (quotas, account status, etc).

I'm using a pair of F5 load balancers in the front to detect up/down
services. This will also allow us to add servers as demand grows; I
like being able to add small cheap boxes incrementally rather than
doing forklift upgrades of big iron.

Behind them are a few Netra V210s for SMTP[S], IMAP[S], POPS and soon
webmail (SqWebMail). Each box has a read-only LDAP replica. Another
V210 runs the LDAP master, which replicates to the four mail servers.

Each V210 comes with quad gigabit ethernet: one interface to the load
balancer, two (for redundancy) to backend switches for the NFS server,
and one for an administrative/monitoring network.

We bought a NetApp for the mail store; it is currently our one single
point of failure but NetApp has a great reputation for reliability; we
bought a used unit and saved about 70%. (NetApp uses RAID4 internally
so disks can be added to a volume on the fly.) NetApp's "snapshot"
facility gives us restores from stupid user errors -- tape
backup/restore for this much data would be a nightmare.

(Qmail's Maildir format is NFS safe but it sounds like you already
know that :-)

If my client didn't demand Solaris, I would have preferred FreeBSD. I
would like to try using the Apple Xserve RAID box since it's 2.5TB for
$11K.
FC-attach it to a pair of FreeBSD boxes which serve it out as NFS, and
use the FreeBSD-5.x "snapshot" feature for NetApp-style
backup/restore. Service boxes would be like the ones above, cheaply
scalable by adding more.

I like F5 balancers because you can heavily customize the application
layer health monitoring -- e.g., do a query on the LDAP master and
check for a sane response. But they're not cheap. Round-Robin DNS
isn't gonna avoid dead services and Windows clients aren't any good at
re-trying failed connections. So I don't have a suggestion on an
inexpensive balancer; I'd be interested in hearing ideas.

As I mentioned above, our NetApp is the only single point of failure.
To get more space later on we can get a second unit, then buy the
(pricey) clustering software to remove that SPoF.

Some other folks have talked about anti-virus/spam issues -- very good
discussion. I am using qmail-ldap's recent integration of
qmail-smtp-viruscan, which very quickly blocks MS executable
attachments; not foolproof but highly effective with little load.
We're considering going with some commercial spam/virus blocking
appliance but haven't decided yet; I'm trying to keep the qmail-ldap
system from getting any more complicated. If, however, we integrate
something into our mail servers, we might have to add another box or
two to handle the increased load but it's not that expensive with
small boxes.

As I mentioned, I'm running all services on all boxes, rather than
separating SMTP from POP as you suggest; if this turns out to be a bad
idea, I can change the services around simply by re-defining the
service pools on the load balancer.
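
For anyone wondering what "application layer health monitoring" means
in practice, here is a rough sketch in Python. It is not F5
configuration and not what runs on my balancers; it only illustrates
the idea of checking for a sane response instead of just an open port.
The function names, the pool layout and the example hostnames are all
made up for illustration, and the only thing it relies on is that an
SMTP server greets a new connection with a 220 reply.

```python
import socket
from typing import Callable, Iterable, List, Tuple

# Hypothetical representation of one pool member: (host, port).
Member = Tuple[str, int]


def banner_is_sane(banner: bytes) -> bool:
    """Return True if the greeting looks like an SMTP 'service ready' reply."""
    return banner.startswith(b"220")


def check_smtp(host: str, port: int = 25, timeout: float = 3.0) -> bool:
    """Connect, read the greeting and judge it.

    Any socket error (refused, unreachable, timed out) counts as 'down'.
    """
    try:
        with socket.create_connection((host, port), timeout=timeout) as conn:
            banner = conn.recv(512)
            return banner_is_sane(banner)
    except OSError:
        return False


def healthy_members(
    pool: Iterable[Member],
    check: Callable[[str, int], bool] = check_smtp,
) -> List[Member]:
    """Return only the pool members whose health check passes."""
    return [(host, port) for host, port in pool if check(host, port)]


if __name__ == "__main__":
    # Example hostnames are placeholders, not real servers.
    smtp_pool = [("mail1.example.org", 25), ("mail2.example.org", 25)]
    print(healthy_members(smtp_pool))
```

A real monitor would run this on a timer and pull a member out of the
pool only after a few consecutive failures, so one slow greeting does
not flap the service. The same shape works for the LDAP case: swap in
a check function that does a query and inspects the answer.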