Date: Wed, 28 Apr 2004 08:57:39 -0700
From: Drew Tomlinson <drew@mykitchentable.net>
To: aaron@alpete.com, mark rowlands <mark.rowlands@mypost.se>, Christopher Nehren <apeiron@comcast.net>
Cc: "freebsd-questions@FreeBSD. ORG" <freebsd-questions@freebsd.org>
Subject: Re: Perl Help For Newbie -- SOLVED
Message-ID: <408FD473.8030701@mykitchentable.net>
In-Reply-To: <50583.204.118.78.206.1083103993.squirrel@mail.alpete.com>
References: <4789E43478F3994BB8D967C73FD9C68850BA@exchsrv1> <50583.204.118.78.206.1083103993.squirrel@mail.alpete.com>
> On 4/26/2004 9:50 AM Drew Tomlinson wrote:
>
> I'm trying to write a perl script to modify a web page.  The source
> page is full of lines such as:
>
> <a href="../../catalog/books/html/amagicianamongthespirits.html">A
> Magician Among the Spirits - Houdini </a>$75.00 $67.50 $125.00<br>
> <a href="../../catalog/books/html/amaterial.html">"A"
> Material - Jim Pace</a> $18.00 $16.20 $29.95<br>
> <a href="../../catalog/books/html/absolutemagic.html">Absolute Magic -
> Derren Brown</a> $24.00 $22.80 $39.95<br>
>
> I want to take the first amount and multiply it by 1.5 and replace it,
> remove the second amount, and keep the third amount the same.  So for
> example, the first line would be converted to:
>
> <a href="../../catalog/books/html/amagicianamongthespirits.html">A
> Magician Among the Spirits - Houdini </a>$112.50 $125.00<br>
>
> I am brand new to Perl but have been reading and experimenting for the
> past two weeks.  I've managed to open my file and read the contents
> into an array called "@page":
>
> open(DATA, "< $input") or die "Couldn't read from datafile: $!\n";
> my @page = (<DATA>);
>
> Now I am trying to use the s/// operator to perform the math and
> substitution.  I get close to what I want but I'm not quite there.
> This code
>
> foreach (@page) {
>     $_=~ s/^\s+//gm;            #removes leading whitespace
>     $_=~ s/\d+\.\d\d/$&*1.5/e;  #finds 1st $ amount and adds 50%
> }
>
> produces this output:
>
> <a href="../../catalog/books/html/amagicianamongthespirits.html">A
> Magician Among the Spirits - Houdini </a>$112.5 $67.50 $125.00<br>
>
> How can I format the converted amount back to US dollars ($112.50)?
> I've seen subroutines to format US currency but can those be used with
> my current approach?  Would "printf" be a possible choice?  Should I
> use the "split" function to separate the data in fields such as link,
> description, price1, price2, price3 and then rebuild each line with
> concatenation?  Is there some other way?
>
> Any guidance as to the best way to approach this task would be most
> appreciated.  I've done lots of reading but haven't found anything
> that teaches me how to "think" about building this script.

Thank you for your responses.  The further I get into this the more I
find I need to learn.  For the archives, the code that solves my initial
question is this:

# assign each item in array to scalar then do stuff to scalar
foreach my $line (@inputpage) {

    # discard leading and trailing and collapse internal whitespace.
    $line = join(" ", split " ", $line);

    # Add newline to end of each line as previous statement removes it.
    $line =~ s/(.*)/$&\n/;

    # assign dollar values to separate scalars
    if ( my ($val1,$val2,$val3) = $line =~ /\$\s*(\d+\.\d\d)\s+\$\s*(\d+\.\d\d)\s+\$\s*(\d+\.\d\d)/) {

        # Perform math on $val1 and format to 2 decimals
        my $price = sprintf "%.2f",$val1 * 1.5;

        # Search $line for dollar values and replace with new
        $line =~ s/\$\s*(\d+\.\d\d)\s+\$\s*(\d+\.\d\d)\s+\$\s*(\d+\.\d\d)/\$$price \$$val3/;

        # Store $line in array
        push(@outputpage, $line);
    }
}

close DATA;

open(DATA, "> $outputfile") or die "Couldn't open $outputfile: $!\n";
print DATA "@outputpage";
close DATA;

exit;

Although now that I'm getting into it, I can see where I want to
manipulate the HTML code as well.  Thus I will start reading about perl
modules, HTML::Parser in particular, and see where it takes me.  Any
links to beginner material about modules, especially HTML::Parser will
be most appreciated.

Thanks for your help!!!

Drew
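
For anyone finding this in the archives: the original one-line s///e attempt can probably be kept if sprintf is called inside the replacement expression. What follows is only a sketch of that idea, not the code posted above. The helper name reprice_line and the sample line are made up for illustration, and it assumes every price looks like "$75.00" with exactly three prices per line.

```perl
use strict;
use warnings;

# reprice_line is a hypothetical helper, not part of the posted script.
# It multiplies the first price by 1.5, drops the second price, and
# keeps the third price unchanged.
sub reprice_line {
    my ($line) = @_;
    $line =~ s{
        \$\s*(\d+\.\d\d)\s+   # first price, captured
        \$\s*\d+\.\d\d\s+     # second price, discarded
        \$\s*(\d+\.\d\d)      # third price, captured
    }{ sprintf('$%.2f $%s', $1 * 1.5, $2) }xe;
    return $line;
}

# Sample input taken from the question.
my $sample = '<a href="../../catalog/books/html/amagicianamongthespirits.html">'
           . 'A Magician Among the Spirits - Houdini </a>$75.00 $67.50 $125.00<br>';

print reprice_line($sample), "\n";
```

The /e modifier evaluates the replacement as Perl code, so sprintf can do the two-decimal formatting in the same step as the substitution. Lines with no prices pass through unchanged. Separately, when writing the results out, print DATA @outputpage (without the double quotes) avoids the single space that "@outputpage" puts between array elements.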