Date: Mon, 7 Nov 2005 18:54:08 +0100
From: Anton Berezin
To: Jim Trigg
Cc: freebsd-ports@freebsd.org
Subject: Re: Request for comments: port-tags
Message-ID: <20051107175408.GE40923@heechee.tobez.org>
References: <20051107154634.GA40923@heechee.tobez.org> <4220.192.168.1.2.1131383989.squirrel@mail.scadian.net>
In-Reply-To: <4220.192.168.1.2.1131383989.squirrel@mail.scadian.net>

[moving back to the mailing list since this might be of interest]

On Mon, Nov 07, 2005 at 12:19:49PM -0500, Jim Trigg wrote:
> On Mon, November 7, 2005 10:46 am, Anton Berezin wrote:

> > The idea is to make ports classification easier and more convenient.
> > Instead of using predefined and limited set of port categories,
> > port-tags uses short one-word descriptions called tags. A port can have
> > an arbitrary number of tags associated with it.
> > One can use the web interface (and maybe a command-line interface in the
> > future) to view only the ports that have particular tags associated with
> > them. This process is very efficient in narrowing down the number of
> > sought ports.
>
> How do you add a tag to a port? For example, mail/dovecot does not
> currently have the tag maildir, even though it supports maildir.

Currently it basically takes the existing categories a port is in, plus
the words which constitute the port's COMMENT. Then it applies a number
of heuristics, most significantly stemming and filtering out the common
"stopwords" like "a", "the", and so on. And then there is of course a
cutoff for those resulting tags which are too rare (otherwise the number
of resulting tags in the tagcloud would explode; it is already pretty
bad as it is, with 815 tags).

Since mail/dovecot does not mention "maildir" in its COMMENT, there is
no tag for it.

I was thinking about parsing the pkg-descr file as well, but I was
afraid it would explode the number of tags even more. Possibly I was
mistaken; it would be good to experiment with this approach.

Short of taking pkg-descr into account, one can clearly see why a
"social collaboration" mode could produce higher quality results, since
at least one person (you) would make sure that mail/dovecot is tagged
with "maildir". :-)

\Anton.
-- 
An undefined problem has an infinite number of solutions.
-- Robert A. Humphrey
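
For readers who want to see the tag derivation described above in concrete form, here is a rough sketch in Python. It is not the actual port-tags code. The stopword list, the naive suffix-stripping "stemmer", the `min_ports` cutoff, the second port name, and both COMMENT strings are made up for illustration; only the overall shape (categories plus COMMENT words, stemming, stopword filtering, rarity cutoff) comes from the message.

```python
import re
from collections import Counter

# Assumption: a tiny illustrative stopword list; the real list is not given.
STOPWORDS = {"a", "an", "the", "and", "for", "of", "to", "with", "in"}


def crude_stem(word):
    """Naive suffix stripping, standing in for a real stemmer."""
    for suffix in ("ing", "es", "s"):
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word


def candidate_tags(categories, comment):
    """Categories plus stemmed, non-stopword words of the COMMENT."""
    words = re.findall(r"[a-z0-9]+", comment.lower())
    tags = set(categories)
    tags.update(crude_stem(w) for w in words if w not in STOPWORDS)
    return tags


def build_tags(ports, min_ports=2):
    """ports maps a port origin to (categories, comment).

    Tags that occur on fewer than min_ports ports are dropped,
    which is the "too rare" cutoff (the threshold is hypothetical).
    """
    per_port = {o: candidate_tags(c, m) for o, (c, m) in ports.items()}
    freq = Counter(t for tags in per_port.values() for t in tags)
    return {
        origin: {t for t in tags if freq[t] >= min_ports}
        for origin, tags in per_port.items()
    }


# Made-up COMMENT strings, not the real ones from the ports tree.
ports = {
    "mail/dovecot": (["mail"], "Secure IMAP and POP3 server"),
    "mail/example-imapd": (["mail"], "A small IMAP server with maildir support"),
}
tags = build_tags(ports)
```

With this toy input, mail/dovecot never gets "maildir" because the word is absent from its COMMENT, and the other port loses it to the rarity cutoff. Feeding pkg-descr text through `candidate_tags` as well would be one way to experiment with the alternative mentioned above.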