Date: Mon, 2 Oct 2000 23:35:07 +0100
From: Mark Ovens <marko@freebsd.org>
To: Christopher Rued <c.rued@xsb.com>
Cc: "Andresen,Jason R." <jandrese@mitre.org>, freebsd-questions@FreeBSD.ORG
Subject: Re: Perl question
Message-ID: <20001002233507.A252@parish>
In-Reply-To: <14808.63902.442934.667120@chris.xsb.com>; from c.rued@xsb.com on Mon, Oct 02, 2000 at 05:09:50PM -0400
References: <14808.52583.347797.384055@chris.xsb.com> <20001002191537.G252@parish> <20001002192617.I252@parish> <39D8D5D9.67A3074B@mitre.org> <14808.63902.442934.667120@chris.xsb.com>

On Mon, Oct 02, 2000 at 05:09:50PM -0400, Christopher Rued wrote:
> Andresen,Jason R. writes:
> > > BTW, your RE should have a ``*'' as well:
> > >
> > > /x.*?y/
> > >
> >
> > Maybe, it depends on exactly what he was trying to get.
> >
> > The first 3-character match where x and y are the first and third
> > characters respectively? Then x.y is exactly what you want. The smallest
> > set of characters that has x and y as boundary values? Then your x.*?y
> > is correct. The smallest set of characters that has x and y as
> > boundaries and at least one character in between them? Then x.+?y is
> > needed.
>
> The RE I used was precisely what I wanted: x.y (an `x' followed by
> exactly one character followed by a `y').
>
> When I run the following:
>
> #!/usr/bin/perl
> $a = "xayxbyxcyxdy";
> @s = $a =~ /x.y/;
> print "\@s is @s\n";
>
> I get:
>
> @s is 1
>
>
>
> So, I seem to be getting the truth value rather than the first match
> in the string. If, however, I wrap the entire RE in parentheses
> (make it a subexpression) like so:
>
Well, () is not strictly a subexpression. It causes whatever is matched to
be remembered so that it can be recalled later (using \1, \2, etc.) similar
to \(...\) in sed(1).
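
To illustrate the remembering, here is a minimal sketch. The variable names
and the "abab" test string are made up for the example and are not from your
script:

```perl
#!/usr/bin/perl
use strict;
use warnings;

my $str = "xayxbyxcyxdy";

# After a successful match, the text matched by the first () is in $1.
my $remembered = $str =~ /(x.y)/ ? $1 : undef;
print "remembered: $remembered\n";

# Inside the pattern itself, \1 refers back to what the first () matched.
my $doubled = "abab" =~ /(ab)\1/ ? "yes" : "no";
print "doubled: $doubled\n";
```
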
> #!/usr/bin/perl
> $a = "xayxbyxcyxdy";
> @s = $a =~ /(x.y)/;
> print "\@s is @s\n";
>
> I get the results I wanted to begin with:
>
> @s is xay
>
> (I discovered this shortly after I sent the first message about this).
>
>
>
> What confuses me is that if I specify the global option, I do not need
> to use a subexpression. For example, if I run the following code:
>
> #!/usr/bin/perl
> $a = "xayxbyxcyxdy";
> @s = $a =~ /x.y/g;
> print "\@s is @s\n";
>
> I get:
>
> @s is xay xby xcy xdy
>
>
> So, this leaves me with a couple of questions, the main one being:
> Why the different treatment for single matches and global
> matches?
>
> and a less important one:
> Why is there no way to have the first match assigned to a scalar,
> since we can be sure that there will be at most one match returned?
>

AIUI, the construct ``$a =~ /x.y/;'' just returns TRUE or FALSE and thus is
used in if():

    if ($a =~ /x.y/) {
        .....
    }

I would guess that if you specify a global match, or use () to memorize the
match then perl(1) saves it because it is reasonable to assume that you
require more than TRUE or FALSE.

If you get a definitive answer I'd be interested in knowing what it is.
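
For what it's worth, here is a minimal sketch of the cases side by side. It
assumes the behaviour described for the match operator in perlop(1): what
m// returns depends on whether it is evaluated in list or scalar context, on
whether the pattern contains (), and on /g. The variable names are made up
for the example:

```perl
#!/usr/bin/perl
use strict;
use warnings;

my $str = "xayxbyxcyxdy";

# List context, no () and no /g: a successful match returns the list (1).
my @plain = $str =~ /x.y/;

# List context with (): returns the captured substrings ($1, $2, ...).
my @captured = $str =~ /(x.y)/;

# List context with /g and no (): returns every matched substring.
my @all = $str =~ /x.y/g;

# Scalar context: returns only true or false.
my $matched = $str =~ /x.y/;

# Parentheses on the left-hand side force list context, so the first
# capture is assigned to a scalar.
my ($first) = $str =~ /(x.y)/;

print "plain:    @plain\n";
print "captured: @captured\n";
print "all:      @all\n";
print "matched:  $matched\n";
print "first:    $first\n";
```

If that reading is right, the last form would also cover the second
question: writing ``($first) = ...'' rather than ``$first = ...'' puts the
first match into a scalar.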
>
>
> If anyone can explain this, and/or answer the questions posed above,
> I'd appreciate it.
>
> -Chris
--
4.4 - The number of the Beastie
________________________________________________________________
51.44°N FreeBSD - The Power To Serve http://www.freebsd.org
2.057°W My Webpage http://ukug.uk.freebsd.org/~mark
mailto:marko@freebsd.org http://www.radan.com
