Date: Sat, 2 May 2015 12:47:52 -0400
From: Rich Kulawiec
To: freebsd-questions@freebsd.org
Subject: Re: email address being harvested from ports website
Message-ID: <20150502164752.GA6047@gsp.org>
In-Reply-To: <55292175.3000308@gmail.com>

On Sat, Apr 11, 2015 at 09:28:21AM -0400, Ernie Luzar wrote:
> Doesn't take 10 years of study to know that any email address
> visible on a public website is a target for harvesting.

No, but it does take considerable time to understand that while that
statement is true, it is absolutely, completely, totally irrelevant.
Here's a *brief* introduction.

Summary:

Spammers now have so many ways of "harvesting" addresses from so many
systems, and so many ways of exchanging those with each other, that any
email address which is actually used WILL eventually be harvested.
Pretending that address hiding and/or obfuscation will have any
meaningful effect on this process gives users a false sense of security
and has absolutely zero anti-spam value.

Summary of the summary:

It's pointless.

Explanation:

The harvesting engines used to acquire email addresses to populate those
databases are myriad, as are the methods by which spammers acquire the
raw data to use as input to them. Some of those methods, and there
are MANY more, include:

	- subscribing to mailing lists
	- acquiring Usenet news (NNTP) feeds
	- querying mail servers
	- acquiring corporate email directories
	- insecure LDAP servers
	- insecure AD servers
	- web crawlers
	- search engines
	- plausible construction
	- address directories
	- use of backscatter/outscatter
	- use of auto-responders
	- use of mailing list mechanisms
	- use of abusive "callback" mechanisms
	- dictionary attacks
	- construction of plausible addresses (e.g. "firstname.lastname")
	- purchase of addresses in bulk on the open market
	- purchase of addresses from vendors, web sites, etc.
	- purchase of addresses from registrars, ISPs, web hosts, etc.
	- domain registration (some registrars ARE spammers) [1]
	- misplaced/lost/sold media (hard disk, tape, CD, DVD,
	  USB stick, etc.)

and perhaps most significantly:

	- harvesting of the mail, address books and any other files
	  present on any of the hundreds of millions of compromised
	  systems that are out there

Let's talk about that last one for a moment.

Consider: the first time a newly-created address is used by someone who
is sending a message TO it, it's now present on their system: in their
saved outbound mail, or perhaps in their address book (if they have
one), or in some cache.
Any sensible malware resident on their system will of course pick it up
and eventually hand it over to a harvesting agent. (Competent malware
will harvest it in real time *and* associate it with the sender's
address.)

And if that particular system happens to be clean? Doesn't help much,
because the more times that address is used, the more systems it's
present on. And the more systems it's present on, the greater the
probability that one of them is already compromised or will be soon.

Thus even if we eliminate the originating end-user system as a possible
source, we still have to consider the outbound mail server used by that
end-user system, which is also a candidate for compromise. And then the
inbound mail server used by the recipient, and then the recipient
end-user system. And if there's some filtering appliance or
intermediate system in place at either end, then it's a possible
compromise point as well. If the message is forwarded to a third party,
then another set of systems is in play. If mail server logs are rolled
up and moved to some central location, then that system must also be
included. If backups are made, then any addresses present on live
systems are present in their backups, and subsequently may be present on
any system where the backups are read/restored.

And finally, if the destination of a mail message isn't an individual
user, but an entire mailing list, then we must multiply the number of
possible harvesting points by at least the number of people on the
mailing list plus a factor for mail servers/gateways/filters/etc.
(modulo overlaps). This in turn means that messages sent to lists of
any appreciable size (say, 1000 members) will turn up on considerably
more than 1000 systems -- and the chances that all 1000-plus are secure
are microscopic.

[ And remember: it only takes one. What if the system I'm typing this
on right now, a system which has a complete archive of freebsd-questions
back to 2002 in Unix mbox format, gets compromised?
Or how about *your* system? How about the systems of any of the other
people on freebsd-questions? ]

Please note that the previous several paragraphs' recitation only
covered the LAST vector I enumerated in the [indented] list above:
compromised systems. That laundry list of methods also affords many,
*many* other opportunities for addresses to find their way into
spammers' hands.

The bottom line is that any email address which is actually used is
GOING to be harvested. It's only a matter of when, not if, and "when"
is getting sooner all the time. There's nothing you or I or anyone else
can do about this because there are too many vectors and not only do we
not control most of them, we don't control the ones that are the most
important.

With all this in mind, it's clearly pointless to pretend that address
hiding or obfuscation provides any protection at all. It's much better
to remove the functionality entirely than to continue to maintain the
facade that it actually has any anti-spam value. Everyone should simply
presume that all email addresses are in the hands of spammers and
prepare defenses accordingly -- because even if that's not quite true
yet, it will be soon.

Conclusion:

Trying to hide/obfuscate email addresses is the security equivalent of
Wile E. Coyote holding an umbrella over his head while a grand piano
plummets toward him. It's never worked. It's not working. It's not
going to work. It's just wishful thinking/folklore/mythology.

---rsk
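
A rough back-of-the-envelope illustration of the "it only takes one"
point made above about a 1000-member list. This is only a sketch under
loudly stated assumptions: every system holding the address is treated
as independently compromised with the same probability, and the
per-system rates tried below are hypothetical values picked for
illustration, not measurements. The function names are made up for this
example.

```python
def prob_all_clean(n_systems: int, p_compromised: float) -> float:
    """Probability that none of n_systems is compromised.

    ASSUMPTION: each system is compromised independently with the same
    probability p_compromised. Real systems are neither independent nor
    uniform, so treat the result as an order-of-magnitude guide only.
    """
    if n_systems < 0:
        raise ValueError("n_systems must be non-negative")
    if not 0.0 <= p_compromised <= 1.0:
        raise ValueError("p_compromised must be between 0 and 1")
    return (1.0 - p_compromised) ** n_systems


def prob_at_least_one_compromised(n_systems: int, p_compromised: float) -> float:
    """Probability that at least one of n_systems is compromised."""
    return 1.0 - prob_all_clean(n_systems, p_compromised)


if __name__ == "__main__":
    n = 1000  # list size used as the example in the text
    # HYPOTHETICAL per-system compromise rates, not real data.
    for p in (0.001, 0.01, 0.05):
        leak = prob_at_least_one_compromised(n, p)
        print(f"p={p:<6} chance at least one of {n} systems leaks: {leak:.6f}")
```

Even with a hypothetical compromise rate of one system in a hundred,
the chance that all 1000 systems are clean works out to a few in
100,000 under these assumptions, and the real number of systems
involved is larger than the subscriber count.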