Date: Tue, 16 Jun 2009 17:54:36 -0700
From: Gary Kline <kline@thought.org>
To: Jeffrey Goldberg <jeffrey@goldmark.org>
Cc: FreeBSD Questions <freebsd-questions@freebsd.org>
Subject: Re: feedback, comments on this php-delimiter scrubbing program?
Message-ID: <20090617005435.GA42176@thought.org>
In-Reply-To: <57E07CDA-AA9E-4A8F-91BC-3BF90177CA3A@goldmark.org>
References: <20090616012114.GA38011@thought.org> <200906151857.45945.mel.flynn%2Bfbsd.questions@mailing.thruhere.net> <20090616153040.GA40540@thought.org> <20090616170244.GA40934@thought.org> <57E07CDA-AA9E-4A8F-91BC-3BF90177CA3A@goldmark.org>

On Tue, Jun 16, 2009 at 06:32:43PM -0500, Jeffrey Goldberg wrote:
> On Jun 16, 2009, at 12:02 PM, Gary Kline wrote:
>
> > this works, but still gives a warning. it's sloppy coding, but
> > as a second version...
>
> You've got some superfluous tests for EOF in some places, and you may
> also be missing some.
>
> Your approach has been to "look ahead" with an extra getc() when you
> come across an interesting character. I recommended that instead of
> doing that you keep a variable "state" to keep track of where you are
> (and have very recently been) instead of looking ahead.
>
> I haven't tried your code, but I suspect that it behaves incorrectly
> with input
>
> (1) that has a '<' as a final character
> (2) that includes things like "<<<<?"
> (3) that includes things like "??>"
>

this is exactly why i asked here. i've removed at least one of the
EOF checks and will rewrite in my usual style of

    while ((ch = getc(fp)) != EOF) { }

as my next cut. yes, i shamelessly cribbed this code from elsewhere.
it originally deleted both C and C++ comments. i think it was written
in C#, which i'm unfamiliar with.

i'm most familiar with lookahead, not that familiar with the STATE
method. when you have time could you say a few more words? or point
me at a url? ...i'm all but certain this kind of function has been
invented and re-invented dozens of times.

> There is a systematic (if a bit tedious) way to make sure that you
> check every condition. When you've worked enough on this, you can
> peek at an answer which I've attached.

you're right above with numbers two and three. pretty sure that the
first one passes.

gary

>
> (For the rest of you, I know that it would be more efficient to make
> the big switch on state instead of on input character, but for
> pedagogical reasons I did it the other way around. I deliberately
> avoided other available tunings).
>
> The extensive comments in the code should make it clear what is going
> on. Once you understand the concepts here it should be very easy to
> write code to do similar things in the future.
>
> -j
>
> --
> Jeffrey Goldberg http://www.goldmark.org/jeff/

-- 
Gary Kline kline@thought.org http://www.thought.org Public Service Unix
http://jottings.thought.org http://transfinite.thought.org
For FBSD list: http://transfinite.thought.org/slicejourney.php
The 4.98a release of Jottings: http://jottings.thought.org/index.php