Date:      Tue, 26 Aug 2008 09:53:18 +0400
From:      Yuri Pankov <yuri.pankov@gmail.com>
To:        An <anmichel@gmail.com>
Cc:        freebsd-questions@freebsd.org
Subject:   Re: sed html tags
Message-ID:  <48B39A4E.1@gmail.com>
In-Reply-To: <db2611860808252119g25adf379wf7b5825bbd4cd694@mail.gmail.com>
References:  <41baaeae-0c1d-4a73-9540-8049b837261c@l64g2000hse.googlegroups.com>	<48B356BE.3080501@datapipe.com> <db2611860808252119g25adf379wf7b5825bbd4cd694@mail.gmail.com>

An wrote:
> unfortunately not... see:
> 
> # cat file
> <span xxxx> 111 </span> 2222 <span yyyy> 3333 </span>
> 
> # sed -e 's/<\/?span[^>]*>//g' file
> <span xxxx> 111 </span> 2222 <span yyyy> 3333 </span>
> 
> (...nothing happens, the file is returned with no substitutions done)
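
One likely reason nothing is substituted here: the command Paul posted used
-E, and the one above does not. Without -E, sed reads the pattern as a basic
regexp, in which '?' is an ordinary character. The sketch below only
illustrates that difference on the sample line. The variable names are made
up for the example, and it assumes a sed that accepts -E (GNU sed has long
spelled this switch -r, so check your version). Note that even with -E this
expression strips the tags themselves, not the contents of the xxxx span.

```shell
#!/bin/sh
# Sample line from the thread.
input='<span xxxx> 111 </span> 2222 <span yyyy> 3333 </span>'

# Basic regexps (no -E): '?' is a literal character, so the pattern only
# matches the text "</?span ...>", which never occurs in the input.
basic=$(printf '%s\n' "$input" | sed -e 's/<\/?span[^>]*>//g')

# Extended regexps (-E): '/?' means "an optional slash", so both the
# opening and the closing span tags are removed, leaving their contents.
extended=$(printf '%s\n' "$input" | sed -E 's/<\/?span[^>]*>//g')

printf 'basic:    [%s]\n' "$basic"
printf 'extended: [%s]\n' "$extended"
```
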
> 
> 
> I could do it with a perl script, which basically does what i would expect
> sed would do:
> 
> # cat pscript.pl
> #!/usr/bin/perl -w
> $text = "<span xxxx> 111 </span>   2222 <span yyyy> 3333 </span> <span xxxx>
> 111 </span>    2222    <span yyyy> 3333 </span>";
> $text =~ s/<span x[^>]*>[^\(<\/span>\)]*[\s]*<\/span>[\s]*//g;
> print $text . "\n"

$text =~ s#<span xxxx>.*?</span>\s*##g;

> # perl pscript.pl
> 2222 <span yyyy> 3333 </span> 2222    <span yyyy> 3333 </span>
> 
> " <span xxx> ..... </span> " is removed... but i don't seem to be able to do
> it with sed... : (

Regexps in sed are greedy and, sadly, sed has no non-greedy *? quantifier.
Try the following instead, extending the bracket expression with whatever
characters can appear between your 'xxxx' tags, of course:
sed 's#<span xxxx>[ a-zA-Z0-9]*</span>[ ]*##g'
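
To make the greediness point concrete, here is a small sketch run against
the sample line from the original post. The first two expressions are the
greedy form and the one suggested above. The third, using [^<]*, is a
variation that is not from the thread, and it assumes the span body never
contains a '<' (no nested tags). The variable names are only for
illustration.

```shell
#!/bin/sh
# Sample line from the thread.
input='<span xxxx> 111 </span> 2222 <span yyyy> 3333 </span>'

# Greedy: .* runs on to the LAST </span> on the line, so the whole
# line is deleted, not just the xxxx span.
greedy=$(printf '%s\n' "$input" | sed 's#<span xxxx>.*</span>[ ]*##g')

# Suggested form: the bracket expression lists the characters allowed in
# the span body, so the match cannot run past the first </span>.
listed=$(printf '%s\n' "$input" | sed 's#<span xxxx>[ a-zA-Z0-9]*</span>[ ]*##g')

# Variation (assumption: no nested tags): allow anything except '<'.
negated=$(printf '%s\n' "$input" | sed 's#<span xxxx>[^<]*</span>[ ]*##g')

printf 'greedy:  [%s]\n' "$greedy"
printf 'listed:  [%s]\n' "$listed"
printf 'negated: [%s]\n' "$negated"
```

The last two leave "2222 <span yyyy> 3333 </span>", and the greedy one
leaves an empty line.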

> Im on fedora c9, maybe that's the problem ?
> 
> siran
> 
> 
> On Mon, Aug 25, 2008 at 8:35 PM, Paul A. Procacci <pprocacci@datapipe.com> wrote:
> 
>> siran wrote:
>>
>>> Hi, I have the string
>>>
>>> <span xxxx> 111 </span> 2222 <span yyyy> 3333 </span>
>>>
>>> And i wish to use sed to strip *only* the "<span xxxx>" tag and its
>>> contents... is this possible ? I'm trying this expression, but it
>>> doesn't work...
>>>
>>> sed 's/<span xxxx[^\(</span>\)]+<\/span>//g' file
>>>
>>> is there anything like it ?
>>>
>>> I would like to obtain
>>>
>>> 2222
>>>
>>>
>>>
>>> I hope someone can help,
>>>
>>> thank you,
>>>
>>> siran
>>> _______________________________________________
>>> freebsd-questions@freebsd.org mailing list
>>> http://lists.freebsd.org/mailman/listinfo/freebsd-questions
>>> To unsubscribe, send any mail to "
>>> freebsd-questions-unsubscribe@freebsd.org"
>>>
>>>
>> sed -E 's/<\/?span[^>]*>//g'
>>
>> Maybe that's what you want?
>>


HTH,
Yuri


