Date: Tue, 24 Jun 2008 22:32:17 +0200
From: Gabor Kovesdan <gabor@FreeBSD.org>
To: Andrey Chernov <ache@nagual.pp.ru>
Cc: hackers@FreeBSD.org, current@FreeBSD.org
Subject: Re: CFT: BSD-licensed grep [Fwd: cvs commit: ports/textproc/bsdgrep Makefile distinfo]
Message-ID: <486159D1.3060704@FreeBSD.org>
In-Reply-To: <20080622135343.GA72068@nagual.pp.ru>
References: <20080617102900.GA46479@nagual.pp.ru> <485798C4.2050605@FreeBSD.org>
    <20080618055851.GA85018@nagual.pp.ru> <86zlpjduew.fsf@ds4.des.no>
    <20080618083739.GA87100@nagual.pp.ru> <867icndqv5.fsf@ds4.des.no>
    <4858DBF6.5070001@bluemedia.pl> <86skvbc9gn.fsf@ds4.des.no>
    <20080618114917.GB89383@nagual.pp.ru> <485E4C69.1080805@FreeBSD.org>
    <20080622135343.GA72068@nagual.pp.ru>

>
> 1) You can't convert just whole buffer after fread() since it can be
> ended in the middle of multibyte sequence on BUFSIZ edge. Look how GNU
> utils do it.
>
OK, now I haven't thought of this aspect. What about this?

#define iswbinary(ch) (!iswspace((ch)) && iswcntrl((ch)))

int
bin_file(FILE *f)
{
	wint_t ch = L'\0';
	size_t i;
	int ret = 0;

	if (fseek(f, 0L, SEEK_SET) == -1)
		return (0);

	for (i = 0; (i <= BUFSIZ) && (ch != WEOF); i++) {
		ch = fgetwc(f);
		if (iswbinary(ch)) {
			ret = 1;
			break;
		}
	}

	rewind(f);
	return (ret);
}

int
mmbin_file(struct mmfile *f)
{
	int i;
	wchar_t *wbuf;
	size_t s;

	if ((s = mbstowcs(NULL, f->base, 0)) == -1)
		return (0);

	wbuf = grep_malloc((s + 1) * sizeof(wchar_t));

	if (mbstowcs(wbuf, f->base, s) == -1)
		return (0);

	/* XXX knows too much about mmf internals */
	for (i = 0; i < BUFSIZ && i < f->len; i++)
		if (iswbinary(wbuf[i])) {
			free(wbuf);
			return (1);
		}
	free(wbuf);
	return (0);
}

This should be ok, right?

> 2) Better use iswspace and iswcntrl instead of iswctype.
>
Ok, changed, thanks. I've also been looking for such functions, but man
wctype doesn't mention them.

> 3) util.c needs to be fixed in several places too.
>
Yes, I know, I'm just advancing step by step. The next item will be to
fix that word boundary handling.

Regards,
Gabor
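
On point 1, one way to avoid cutting a multibyte sequence at a buffer edge is to decode with mbrtowc() and carry the mbstate_t from one chunk to the next. What follows is only a minimal sketch, not code from bsdgrep: chunk_is_binary() and is_binary_wc() are hypothetical names, and treating an invalid sequence as binary is an assumption.

```c
#include <stddef.h>
#include <string.h>
#include <wchar.h>
#include <wctype.h>

/* Same test as the iswbinary() macro in the message above. */
static int
is_binary_wc(wint_t ch)
{
	return (!iswspace(ch) && iswcntrl(ch));
}

/*
 * Hypothetical helper: scan one chunk of bytes for binary characters.
 * *st carries any partial multibyte sequence over to the next call,
 * so a character split across two chunks is still decoded correctly.
 * Returns 1 if the chunk looks binary, 0 otherwise.
 */
static int
chunk_is_binary(const char *buf, size_t len, mbstate_t *st)
{
	wchar_t wc;
	size_t n;

	while (len > 0) {
		n = mbrtowc(&wc, buf, len, st);
		if (n == (size_t)-2)
			return (0);	/* incomplete tail, remembered in *st */
		if (n == (size_t)-1)
			return (1);	/* assumption: invalid input is binary */
		if (n == 0)
			return (1);	/* decoded a NUL character */
		if (is_binary_wc((wint_t)wc))
			return (1);
		buf += n;
		len -= n;
	}
	return (0);
}
```

The caller would zero an mbstate_t with memset() once per file, then pass each fread() buffer to chunk_is_binary() in order, stopping at the first nonzero result. The decoding depends on the locale set with setlocale().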