Date: Mon, 7 Nov 2005 18:54:08 +0100
From: Anton Berezin <tobez@FreeBSD.org>
To: Jim Trigg <jtrigg@spamcop.net>
Cc: freebsd-ports@freebsd.org
Subject: Re: Request for comments: port-tags
Message-ID: <20051107175408.GE40923@heechee.tobez.org>
In-Reply-To: <4220.192.168.1.2.1131383989.squirrel@mail.scadian.net>
References: <20051107154634.GA40923@heechee.tobez.org> <4220.192.168.1.2.1131383989.squirrel@mail.scadian.net>

[moving back to the mailing list since this might be of interest]

On Mon, Nov 07, 2005 at 12:19:49PM -0500, Jim Trigg wrote:
> On Mon, November 7, 2005 10:46 am, Anton Berezin wrote:
> > The idea is to make ports classification easier and more convenient.
> > Instead of using predefined and limited set of port categories,
> > port-tags uses short one-word descriptions called tags.  A port can have
> > an arbitrary number of tags associated with it.  One can use the web
> > interface (and maybe a command-line interface in the future) to view
> > only the ports that have particular tags associated with them.  This
> > process is very efficient in narrowing down the number of sought ports.
>
> How do you add a tag to a port?  For example, mail/dovecot does not
> currently have the tag maildir, even though it supports maildir.

Currently it basically takes the existing categories a port is in, plus
the words which constitute the port's COMMENT.  Then it applies a number
of heuristics, most significantly stemming and filtering out common
"stopwords" like "a", "the", and so on.  Finally, there is a cutoff for
resulting tags which are too rare (otherwise the number of tags in the
tagcloud would explode; it is already pretty bad as it is, with 815
tags).

Since mail/dovecot does not mention "maildir" in its COMMENT, it gets no
such tag.

I was thinking about parsing the pkg-descr file as well, but I was
afraid it would explode the number of tags even more.  Possibly I was
mistaken; it would be good to experiment with this approach.

Short of taking pkg-descr into account, one can clearly see why a
"social collaboration" mode could produce higher quality results, since
at least one person (you) would make sure that mail/dovecot is tagged
with "maildir". :-)

\Anton.
-- 
An undefined problem has an infinite number of solutions.
-- Robert A. Humphrey
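
The tag derivation described in this message (categories plus COMMENT words, then stemming, stopword filtering, and a rarity cutoff) could be sketched roughly as follows. This is only an illustration under stated assumptions, not the actual port-tags code: the stopword list, the crude suffix-stripping stemmer, the `min_ports` threshold, the function names, the two `mail/example-*` ports, and all COMMENT strings (including the one shown for mail/dovecot) are made up for the example.

```python
import re
from collections import Counter

# Assumed stopword list; the real one is not given in the message.
STOPWORDS = {"a", "an", "the", "and", "or", "for", "of", "to", "in", "with", "on"}


def crude_stem(word):
    """Very rough suffix stripping, standing in for a real stemmer."""
    for suffix in ("ing", "es", "s"):
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word


def candidate_tags(categories, comment):
    """Categories plus stemmed, non-stopword COMMENT words."""
    words = re.findall(r"[a-z0-9]+", comment.lower())
    tags = {c.lower() for c in categories}
    tags.update(crude_stem(w) for w in words if w not in STOPWORDS)
    return tags


def tag_ports(ports, min_ports=2):
    """Drop tags carried by fewer than min_ports ports (the rarity cutoff)."""
    per_port = {
        name: candidate_tags(cats, comment)
        for name, (cats, comment) in ports.items()
    }
    counts = Counter(tag for tags in per_port.values() for tag in tags)
    return {
        name: {t for t in tags if counts[t] >= min_ports}
        for name, tags in per_port.items()
    }


# Hypothetical input: port name -> (categories, COMMENT). Strings are invented.
ports = {
    "mail/dovecot": (["mail"], "Secure IMAP and POP3 server"),
    "mail/example-a": (["mail"], "IMAP server with maildir support"),
    "mail/example-b": (["mail"], "Small POP3 servers"),
}

tagged = tag_ports(ports)
for name in sorted(tagged):
    print(name, sorted(tagged[name]))
```

With this made-up data, mail/dovecot never receives "maildir" because the word is absent from its COMMENT, and the one port that does mention it loses the tag to the rarity cutoff. Both effects match the behaviour described above.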