Date:      Tue, 24 Jun 2008 22:32:17 +0200
From:      Gabor Kovesdan <gabor@FreeBSD.org>
To:        Andrey Chernov <ache@nagual.pp.ru>
Cc:        hackers@FreeBSD.org, current@FreeBSD.org
Subject:   Re: CFT: BSD-licensed grep [Fwd: cvs commit: ports/textproc/bsdgrep Makefile distinfo]
Message-ID:  <486159D1.3060704@FreeBSD.org>
In-Reply-To: <20080622135343.GA72068@nagual.pp.ru>
References:  <20080617102900.GA46479@nagual.pp.ru> <485798C4.2050605@FreeBSD.org> <20080618055851.GA85018@nagual.pp.ru> <86zlpjduew.fsf@ds4.des.no> <20080618083739.GA87100@nagual.pp.ru> <867icndqv5.fsf@ds4.des.no> <4858DBF6.5070001@bluemedia.pl> <86skvbc9gn.fsf@ds4.des.no> <20080618114917.GB89383@nagual.pp.ru> <485E4C69.1080805@FreeBSD.org> <20080622135343.GA72068@nagual.pp.ru>

>
> 1) You can't convert just whole buffer after fread() since it can be 
> ended in the middle of multibyte sequence on BUFSIZ edge. Look how GNU 
> utils do it.
>   
OK, I hadn't thought of this aspect. What about this?

#define iswbinary(ch)   (!iswspace((ch)) && iswcntrl((ch)))

int
bin_file(FILE *f)
{
        wint_t   ch = L'\0';
        size_t   i;
        int      ret = 0;

        if (fseek(f, 0L, SEEK_SET) == -1)
                return (0);

        for (i = 0; (i <= BUFSIZ) && (ch != WEOF); i++) {
                ch = fgetwc(f);
                if (iswbinary(ch)) {
                        ret = 1;
                        break;
                }
        }

        rewind(f);
        return (ret);
}

int
mmbin_file(struct mmfile *f)
{
        int      i;
        wchar_t *wbuf;
        size_t   s;

        if ((s = mbstowcs(NULL, f->base, 0)) == -1)
                return (0);

        wbuf = grep_malloc((s + 1) * sizeof(wchar_t));

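        if (mbstowcs(wbuf, f->base, s) == (size_t)-1) {
                free(wbuf);     /* don't leak the buffer on failure */
                return (0);
        }

        /* XXX knows too much about mmf internals */
        /* wbuf holds s wide characters, so bound the scan by s, not f->len */
        for (i = 0; i < BUFSIZ && (size_t)i < s; i++)
                if (iswbinary(wbuf[i])) {
                        free(wbuf);
                        return (1);
                }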
        free(wbuf);
        return (0);
}

This should be ok, right?
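
For comparison, here is a minimal sketch of another way to deal with point 1: decode each fread() chunk with mbrtowc() and keep an mbstate_t between calls, so a multibyte sequence cut off at the buffer edge is finished by the next chunk. The name chunk_has_binary() is hypothetical and not part of bsdgrep, and treating invalid bytes as binary is an assumption, not something decided in this thread.

```c
#include <stddef.h>
#include <wchar.h>
#include <wctype.h>

/* Same test as the iswbinary() macro above. */
#define iswbinary(ch)   (!iswspace((ch)) && iswcntrl((ch)))

/*
 * Hypothetical helper (not from bsdgrep): scan one chunk of bytes.
 * *st carries any partial multibyte sequence over to the next call.
 * Returns 1 if a binary character is seen, 0 otherwise.
 */
int
chunk_has_binary(const char *buf, size_t len, mbstate_t *st)
{
        wchar_t  wc;
        size_t   n;

        while (len > 0) {
                n = mbrtowc(&wc, buf, len, st);
                if (n == (size_t)-2)
                        return (0);     /* sequence continues in next chunk */
                if (n == (size_t)-1)
                        return (1);     /* assumption: invalid bytes are binary */
                if (n == 0)
                        n = 1;          /* embedded NUL occupies one byte */
                if (iswbinary(wc))
                        return (1);
                buf += n;
                len -= n;
        }
        return (0);
}
```

The caller would zero an mbstate_t once with memset() and then pass each chunk returned by fread() to the helper in order, stopping at the first non-zero result.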

> 2) Better use iswspace and iswcntrl instead of iswctype.
>   
OK, changed, thanks. I had been looking for such functions, but the
wctype man page doesn't mention them.

> 3) util.c needs to be fixed in several places too.
>   
Yes, I know; I'm just advancing step by step. The next item will be to
fix the word boundary handling.

Regards,
Gabor