Date: Tue, 16 Jun 2009 17:54:36 -0700
From: Gary Kline <kline@thought.org>
To: Jeffrey Goldberg <jeffrey@goldmark.org>
Cc: FreeBSD Questions <freebsd-questions@freebsd.org>
Subject: Re: feedback, comments on this php-delimiter scrubbing program?
Message-ID: <20090617005435.GA42176@thought.org>
In-Reply-To: <57E07CDA-AA9E-4A8F-91BC-3BF90177CA3A@goldmark.org>
References: <20090616012114.GA38011@thought.org> <200906151857.45945.mel.flynn%2Bfbsd.questions@mailing.thruhere.net> <20090616153040.GA40540@thought.org> <20090616170244.GA40934@thought.org> <57E07CDA-AA9E-4A8F-91BC-3BF90177CA3A@goldmark.org>
On Tue, Jun 16, 2009 at 06:32:43PM -0500, Jeffrey Goldberg wrote:
> On Jun 16, 2009, at 12:02 PM, Gary Kline wrote:
>
> > this works, but still gives a warning. it's sloppy coding, but
> > as a second version...
>
> You've got some superfluous tests for EOF in some places, and you may
> also be missing some.
>
> Your approach has been to "look ahead" with an extra getc() when you
> come across an interesting character. I recommended that instead of
> doing that you keep a variable "state" to keep track of where you are
> (and have very recently been) instead of looking ahead.
>
> I haven't tried your code, but I suspect that it behaves incorrectly
> with input
>
> (1) that has a '<' as a final character
> (2) that includes things like "<<<<?"
> (3) that includes things like "??>"
>
this is exactly why i asked here. i've removed at least one of
the EOF checks and will rewrite in my usual style of
while ((ch = getc(fp)) != EOF)
{
}
as my next cut. yes, i shamelessly cribbed this code from elsewhere.
it originally deleted both C and C++ comments. i think it was
written in C#, which i'm unfamiliar with.
i'm most familiar with lookahead, not that familiar with the STATE
method. when you have time could you say a few more words? or
point me at a url? ...i'm all but certain this kind of function
has been invented and re-invented dozens of times.
> There is a systematic (if a bit tedious) way to make sure that you
> check every condition. When you've worked enough on this, you can
> peek at an answer which I've attached.
you're right above with numbers two and three. pretty sure that
the first one passes.
gary
>
> (For the rest of you, I know that it would be more efficient to make
> the big switch on state instead of on input character, but for
> pedagogical reasons I did it the other way around. I deliberately
> avoided other available tunings).
>
> The extensive comments in the code should make it clear what is going
> on. Once you understand the concepts here it should be very easy to
> write code to do similar things in the future.
>
> -j
>
>
>
> --
> Jeffrey Goldberg http://www.goldmark.org/jeff/
>
>
>
--
Gary Kline kline@thought.org http://www.thought.org Public Service Unix
http://jottings.thought.org http://transfinite.thought.org
For FBSD list: http://transfinite.thought.org/slicejourney.php
The 4.98a release of Jottings: http://jottings.thought.org/index.php
