Date: Wed, 12 Feb 2003 12:50:53 +0100
From: Brad Knowles <brad.knowles@skynet.be>
To: Terry Lambert <tlambert2@mindspring.com>
Cc: Brad Knowles <brad.knowles@skynet.be>, Rahul Siddharthan <rsidd@online.fr>, freebsd-chat@freebsd.org
Subject: Re: Email push and pull (was Re: matthew dillon)
Message-ID: <a05200f43ba6fe1a9f4d8@[10.0.1.2]>
In-Reply-To: <3E49C2BC.F164F19A@mindspring.com>
References: <20030211032932.GA1253@papagena.rockefeller.edu> <a05200f2bba6e8fc03a0f@[10.0.1.2]> <3E498175.295FC389@mindspring.com> <a05200f37ba6f50bfc705@[10.0.1.2]> <3E49C2BC.F164F19A@mindspring.com>

At 7:42 PM -0800 2003/02/11, Terry Lambert wrote:

> Actually, I disagree with this presentation. There is an assumption
> of "sufficient storage" there in the first place, when in practice,
> sufficient storage is commonly a prime economic issue. If people
> were satisfied with the job "the big providers" were doing, there
> would be no little providers.

In practice, the sole limiting factor of mail systems is synchronous
meta-data update latency and related I/O throughput. You're far, far
better off getting large numbers of smaller disks and putting them in
a RAID 1+0 environment (or even RAID 1+0+0), with a large enough
stripe size that almost all transactions can be taken care of as one
logical operation.

This is very much like running a large-scale USENET news server. The
total quantity of disk space is meaningless, if you don't have enough
heads working for you in parallel.

So, to solve the disk space issue, you just buy some slightly larger
disks. Been there, done that.

We learned this lesson a long time ago at AOL. Doing
single-instance-store is a false economy, and indeed is one of the key
limiting factors for LAN e-mail packages like cc:Mail, Lotus Notes,
Microsoft Mail, or Microsoft Exchange. This is one of the primary
reasons why they don't scale.

> As to the metadata updates: I'm not positive this is the case,
> though it is certainly the case in "sendmail" and certain other
> MTAs. It's really implementation dependent, I think, and the
> slide assumes a particular implementation.

These slides have absolutely nothing whatsoever to do with the MTA.
They have to do with the mailbox, mailbox delivery, mailbox access,
etc....

You need message locking in some fashion, you may need mailbox
locking, and most schemes for handling mailboxes involve either
re-writing the mailbox file, or changing file names of individual
messages (or changing their location), etc.... These are all
synchronous meta-data operations.

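
To make the point about synchronous meta-data operations concrete,
here is a minimal sketch of one-file-per-message delivery. It assumes a
Maildir-style layout (tmp/, new/, cur/) on a POSIX filesystem; the
function name deliver() and the simplified file-naming scheme are
illustrative only and are not taken from any particular IMAP or
delivery agent. Each numbered step is an operation the delivering
process has to wait on before it can safely report success.

```python
import os
import socket
import time


def deliver(maildir: str, message: bytes) -> str:
    """Deliver one message Maildir-style; return the final path (sketch only)."""
    tmp_dir = os.path.join(maildir, "tmp")
    new_dir = os.path.join(maildir, "new")
    for d in (tmp_dir, new_dir, os.path.join(maildir, "cur")):
        os.makedirs(d, exist_ok=True)

    # Hypothetical, simplified unique name: timestamp, pid, host.
    name = f"{time.time_ns()}.{os.getpid()}.{socket.gethostname()}"
    tmp_path = os.path.join(tmp_dir, name)
    new_path = os.path.join(new_dir, name)

    # 1. Exclusive create: adds a directory entry in tmp/ (meta-data update).
    fd = os.open(tmp_path, os.O_WRONLY | os.O_CREAT | os.O_EXCL, 0o600)
    with os.fdopen(fd, "wb") as f:
        f.write(message)
        f.flush()
        # 2. Block until the file's data and inode are on stable storage.
        os.fsync(f.fileno())

    # 3. Rename into new/: a second directory update, atomic on POSIX.
    os.rename(tmp_path, new_path)

    # 4. Flush the directory so the rename itself survives a crash.
    dir_fd = os.open(new_dir, os.O_RDONLY)
    try:
        os.fsync(dir_fd)
    finally:
        os.close(dir_fd)
    return new_path
```

Note that the message body is written only once, while the directory
structure is touched at creation, at rename, and again whenever a
client later moves or flags the message. That is why the number of
heads available for small synchronous writes matters more than raw
capacity in a layout like this.
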
See <http://www.shub-internet.org/brad/papers/dihses/lisa2000/sld062.htm>
through
<http://www.shub-internet.org/brad/papers/dihses/lisa2000/sld081.htm>.
Then ask yourself how these issues relate to the operation of UW-IMAP,
Courier-IMAP, and Cyrus.

If you want a detailed discussion of these points, I'll be more than
happy to get into this. However, keep in mind that I've already had a
deep discussion of these points with Nick Christenson (author of the
book _Sendmail Performance Tuning_, as well as the classic paper "A
Highly Scalable Electronic Mail Service Using Open Systems" at
<http://www.jetcafe.org/~npc/doc/mail_arch.html>, among others) and
other large-scale Internet e-mail experts, and I don't think that
we're likely to add much to this topic here.

> The scaling issue at 150 is an implementation detail specific to
> the implementations you are referencing. My advice is "get some
> real software". 8-). You can scale something like this from
> Open Source components and about 3 months of concentrated coding
> to at least 50,000 per indivisible component cluster, and then
> throw hardware at it.

Concentrated coding? Open source? Where? What specific pieces of
software are you envisioning? Who would be doing the coding? If it's
that simple, then why hasn't this code already been written?

I'm trying to build a LAN e-mail system today using open source
software because certain constraints have been put on this project
(e.g., they have literally $0 to spend on new hardware, mailboxes must
be stored on NFS, the old multi-purpose machines must be replaced and
a new mail system must be able to gradually take over), and I was
painted into a corner before I ever became a member of this project.

I've done the best I can in the circumstances available, but I am
still very, very concerned that given the best available solutions I
can find for each of the components involved, the replacement still is
not going to perform well enough to be considered anything other than
"b0rken".

> The normal implementation to avoid single point of failure is
> replication of the data. At that point, the replication
> protocol itself becomes (essentially) a transport. The worst
> case failure mechanism, in that case, is a previously deleted
> message reappears.

Correct, in theory. Where's the practice?

> I think they do it on the basis of their name, and on the basis
> of the idea that "you can put any SQL server behind MS Exchange,
> so use ours, it's better than Microsoft's".

Uh, no. Take a closer look at the pitch. They are completely and
totally replacing Exchange. There are no Microsoft components left.
The architecture is totally unlike Exchange. The only thing they're
doing is pitching a solution that is compatible with Exchange at the
highest protocol levels, enough to fool Outlook.

-- 
Brad Knowles, <brad.knowles@skynet.be>

"They that can give up essential liberty to obtain a little temporary
safety deserve neither liberty nor safety."
    -Benjamin Franklin, Historical Review of Pennsylvania.

GCS/IT d+(-) s:+(++)>: a C++(+++)$ UMBSHI++++$ P+>++ L+ !E-(---)
W+++(--) N+ !w--- O- M++ V PS++(+++) PE- Y+(++) PGP>+++ t+(+++)
5++(+++) X++(+++) R+(+++) tv+(+++) b+(++++) DI+(++++) D+(++) G+(++++)
e++>++++ h--- r---(+++)* z(+++)

To Unsubscribe: send mail to majordomo@FreeBSD.org
with "unsubscribe freebsd-chat" in the body of the message