Date: Tue, 27 Apr 2004 19:48:26 +0200
From: "mark rowlands" <mark.rowlands@mypost.se>
To: "freebsd-questions@FreeBSD.ORG" <freebsd-questions@FreeBSD.ORG>
Cc: Christopher Nehren <apeiron@comcast.net>
Subject: RE: Perl Help For Newbie
Message-ID: <4789E43478F3994BB8D967C73FD9C68850BA@exchsrv1>
> -----Original Message-----
> From: owner-freebsd-questions@freebsd.org
> [mailto:owner-freebsd-questions@freebsd.org] On Behalf Of
> Christopher Nehren
> Sent: Tuesday, April 27, 2004 2:53 AM
> To: FreeBSD Questions List
> Subject: Re: Perl Help For Newbie
>
> Can someone explain to me why people are suggesting to parse
> markup languages manually? There's modules -- dozens -- for
> this. Use CPAN.

Because he is a Perl beginner and doesn't know about CPAN and modules and stuff. How about being a bit more specific? Try:

    cd /usr/ports/www/p5-HTML-Parser && make install clean
    perldoc HTML::Parser

(see the examples sections), or as a starter:

    # HTML::TokeParser::Simple loads HTML::TokeParser as its base class
    use HTML::TokeParser::Simple;

    # parse the file named on the command line, or index.html by default
    my $p = HTML::TokeParser->new(shift || "index.html");

    # print the URL and link text of every <a> tag, tab-separated
    while (my $token = $p->get_tag("a")) {
        my $url  = $token->[1]{href} || "-";
        my $text = $p->get_trimmed_text("/a");
        print "$url\t$text\n";
    }

(HTML::TokeParser::Simple is not in the ports tree yet, but will be once the current port freeze is over.) Until then,

    perl -MCPAN -e shell
    cpan> install HTML::TokeParser::Simple
    Running install for module HTML::TokeParser::

will perform the necessary magic :-
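For anyone who wants to try the idea before HTML::TokeParser::Simple is installed, here is a minimal sketch that uses only HTML::TokeParser, which comes with the HTML::Parser port mentioned above. It parses a string instead of a file, so it can be run as-is. The helper name extract_links and the sample markup are made up for illustration; they are not part of any module.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use HTML::TokeParser;   # ships with the HTML::Parser distribution

# extract_links is a hypothetical helper name, not a module function.
# It takes an HTML string and returns a list of [url, text] pairs.
sub extract_links {
    my ($html) = @_;

    # Passing a scalar reference makes HTML::TokeParser parse the string
    # itself instead of treating it as a file name.
    my $p = HTML::TokeParser->new(\$html)
        or die "cannot create parser";

    my @links;
    while (my $tag = $p->get_tag("a")) {
        my $url  = $tag->[1]{href} || "-";      # "-" when the anchor has no href
        my $text = $p->get_trimmed_text("/a");  # text up to the closing </a>
        push @links, [$url, $text];
    }
    return @links;
}

# Made-up sample markup, for illustration only.
my $sample = '<p><a href="http://www.FreeBSD.org/">FreeBSD</a> and '
           . '<a name="top">no link here</a></p>';

print join("\t", @$_), "\n" for extract_links($sample);
```

Returning the pairs rather than printing inside the loop keeps the parsing separate from the output, which makes it easier to sort or filter the links afterwards.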