Date: Wed, 13 Feb 2013 11:44:19 -0500 From: John Baldwin <jhb@freebsd.org> To: Jilles Tjoelker <jilles@stack.nl> Cc: current@freebsd.org Subject: Re: [PATCH] open_memstream() and open_wmemstream() Message-ID: <201302131144.19617.jhb@freebsd.org> In-Reply-To: <20130207211222.GA98989@stack.nl> References: <201302051546.43839.jhb@freebsd.org> <20130207211222.GA98989@stack.nl>
next in thread | previous in thread | raw e-mail | index | archive | help
On Thursday, February 07, 2013 4:12:22 pm Jilles Tjoelker wrote: > On Tue, Feb 05, 2013 at 03:46:43PM -0500, John Baldwin wrote: > > I've written an implementation of open_memstream() and > > open_wmemstream() along with a set of regression tests. I'm pretty > > sure open_memstream() is correct, and I believe open_wmemstream() is > > correct for expected usage. The latter might even do the right thing > > if you split a multi-byte character across multiple writes. One > > question I have is if my choice to discard any pending multi-byte > > state in the stream anytime a seek changes the effective position in > > the output stream. I think this is correct as stdio will flush any > > pending data before doing a seek, so if there is a partially parsed > > character we aren't going to get the rest of it. > > I don't think partially parsed characters can happen with a correct > application. As per C99, an application must not call byte output > functions on a wide-oriented stream, and vice versa. Oh, interesting. Should I remove those tests for using byte output functions from the open_wmemstream test? > Discarding the shift state on fseek()/fseeko() is permitted (but should > be documented as this is implementation-defined behaviour). > State-dependent encodings (where this is relevant) are rarely used > nowadays. > > The conversion to bytes and back probably makes open_wmemstream() quite > slow but I don't think that is very important. Note that we do this already for swprintf(). I've added this note: IMPLEMENTATION NOTES Internally all I/O streams are effectively byte-oriented, so using wide- oriented operations to write to a stream opened via open_wmemstream() results in wide characters being expanded to a stream of multibyte char- acters in stdio's internal buffers. These multibyte characters are then converted back to wide characters when written into the stream. As a result, the wide-oriented streams maintain an internal multibyte charac- ter conversion state that is cleared on any seek opertion that changes the current position. This should have no effect unless byte-oriented output operations are used on a wide-oriented stream. > > http://www.FreeBSD.org/~jhb/patches/open_memstream.patch > > The seek functions should check for overflow in the addition (for > SEEK_CUR and SEEK_END) and the conversion to size_t. Note that SEEK_CUR is effectively only called with an offset of 0 via __ftello(). I'm not sure of the best way to check for overflow, do I have to do something like this: static int will_overflow(size_t off, fpos_t pos) { if (pos < 0) return (-pos > off); return (SIZE_MAX - off > pos); } For SEEK_CUR I would test 'will_overflow(ms->offset, pos)' and for SEEK_END 'will_overflow(ms->len, pos)' Presumably SEEK_SET should test for the requested position being beyond SIZE_MAX as well? If my above test is ok then I end up with this: static int will_overflow(size_t offset, fpos_t pos) { if (pos < 0) { if (-pos > off) { errno = EINVAL; return (1); } } else { if (SIZE_MAX - off > pos) { errno = EOVERFLOW; return (1); } } return (0); } static fpos_t memstream_seek(void *cookie, fpos_t pos, int whence) { struct memstream *ms; #ifdef DEBUG size_t old; #endif ms = cookie; #ifdef DEBUG old = ms->offset; #endif switch (whence) { case SEEK_SET: if (pos > SIZE_MAX) { errno = EOVERFLOW; return (-1); } ms->offset = pos; break; case SEEK_CUR: if (will_overflow(ms->offset, pos)) return (-1); ms->offset += pos; break; case SEEK_END: if (will_overflow(ms->len, pos)) return (-1); ms->offset = ms->len + pos; break; } memstream_update(ms); #ifdef DEBUG fprintf(stderr, "MS: seek(%p, %zd, %d) %zd -> %zd\n", ms, pos, whence, old, ms->offset); #endif return (ms->offset); } Thanks for reviewing this so far. -- John Baldwin
Want to link to this message? Use this URL: <https://mail-archive.FreeBSD.org/cgi/mid.cgi?201302131144.19617.jhb>