Date: Sat, 5 Dec 2015 22:56:57 +0000 (UTC)
From: Garrett Cooper <ngie@FreeBSD.org>
To: src-committers@freebsd.org, svn-src-all@freebsd.org, svn-src-stable@freebsd.org, svn-src-stable-9@freebsd.org
Subject: svn commit: r291875 - in stable/9: include include/xlocale lib/libc/locale sys/sys tools/regression/lib/libc/locale
Message-ID: <201512052256.tB5MuvTE038017@repo.freebsd.org>
Author: ngie
Date: Sat Dec 5 22:56:57 2015
New Revision: 291875
URL: https://svnweb.freebsd.org/changeset/base/291875

Log:
  MFstable/10 r250883,r251314:

  r250883 (by ed):

  Add <uchar.h>.

  The <uchar.h> header, part of C11, adds a small number of utility
  functions for 16/32-bit "universal" characters, which may or may not be
  UTF-16/32. As our wchar_t is already ISO 10646, simply add light-weight
  wrappers around wcrtomb() and mbrtowc().

  While there, also add (non-yet-standard) _l functions, similar to the
  ones we already have for the other locale-dependent functions.

  Reviewed by: theraven

  r251314 (by ed):

  Add libiconv based versions of *c16*() and *c32*().

  I initially thought wchar_t was locale independent, but this seems to be
  only the case on Linux. This means that we cannot depend on the *wc*()
  routines to implement *c16*() and *c32*(). Instead, use the Citrus
  libiconv that is part of libc.

  I'll see if there is anything I can do to make the existing functions
  somewhat useful in case the system is built without libiconv in the
  nearby future. If not, I'll simply remove the broken implementations.

  Reviewed by: jilles, gabor

Added:
  stable/9/include/uchar.h
     - copied unchanged from r250883, head/include/uchar.h
  stable/9/include/xlocale/_uchar.h
     - copied unchanged from r250883, head/include/xlocale/_uchar.h
  stable/9/lib/libc/locale/c16rtomb.c
     - copied unchanged from r250883, head/lib/libc/locale/c16rtomb.c
  stable/9/lib/libc/locale/c16rtomb_iconv.c
     - copied unchanged from r251314, head/lib/libc/locale/c16rtomb_iconv.c
  stable/9/lib/libc/locale/c32rtomb.c
     - copied unchanged from r250883, head/lib/libc/locale/c32rtomb.c
  stable/9/lib/libc/locale/c32rtomb_iconv.c
     - copied unchanged from r251314, head/lib/libc/locale/c32rtomb_iconv.c
  stable/9/lib/libc/locale/cXXrtomb_iconv.h
     - copied unchanged from r251314, head/lib/libc/locale/cXXrtomb_iconv.h
  stable/9/lib/libc/locale/mbrtoc16.c
     - copied unchanged from r250883, head/lib/libc/locale/mbrtoc16.c
  stable/9/lib/libc/locale/mbrtoc16_iconv.c
     - copied unchanged from r251314, head/lib/libc/locale/mbrtoc16_iconv.c
  stable/9/lib/libc/locale/mbrtoc32.c
     - copied unchanged from r250883, head/lib/libc/locale/mbrtoc32.c
  stable/9/lib/libc/locale/mbrtoc32_iconv.c
     - copied unchanged from r251314, head/lib/libc/locale/mbrtoc32_iconv.c
  stable/9/lib/libc/locale/mbrtocXX_iconv.h
     - copied unchanged from r251314, head/lib/libc/locale/mbrtocXX_iconv.h
  stable/9/tools/regression/lib/libc/locale/test-c16rtomb.c
     - copied, changed from r250883, head/tools/regression/lib/libc/locale/test-c16rtomb.c
  stable/9/tools/regression/lib/libc/locale/test-mbrtoc16.c
     - copied, changed from r250883, head/tools/regression/lib/libc/locale/test-mbrtoc16.c
Modified:
  stable/9/include/Makefile
  stable/9/include/stdatomic.h
  stable/9/include/xlocale/Makefile
  stable/9/lib/libc/locale/Makefile.inc
  stable/9/lib/libc/locale/Symbol.map
  stable/9/lib/libc/locale/mbrtowc.3
  stable/9/lib/libc/locale/wcrtomb.3
  stable/9/lib/libc/locale/xlocale_private.h
  stable/9/sys/sys/_types.h
  stable/9/tools/regression/lib/libc/locale/Makefile
Directory Properties:
  stable/9/ (props changed)
  stable/9/include/ (props changed)
  stable/9/lib/ (props changed)
  stable/9/lib/libc/ (props changed)
  stable/9/sys/ (props changed)
  stable/9/sys/sys/ (props changed)
  stable/9/tools/ (props changed)
  stable/9/tools/regression/ (props changed)
  stable/9/tools/regression/lib/libc/ (props changed)

Modified: stable/9/include/Makefile
==============================================================================
--- stable/9/include/Makefile Sat Dec 5 22:51:20 2015 (r291874)
+++ stable/9/include/Makefile Sat Dec 5 22:56:57 2015 (r291875)
@@ -23,7 +23,7 @@ INCS= a.out.h ar.h assert.h bitstring.h
 	stdnoreturn.h stdio.h stdlib.h string.h stringlist.h \
 	strings.h sysexits.h tar.h termios.h tgmath.h \
 	time.h timeconv.h timers.h ttyent.h \
-	ulimit.h unistd.h utime.h utmpx.h uuid.h varargs.h \
+	uchar.h ulimit.h unistd.h utime.h utmpx.h uuid.h varargs.h \
 	wchar.h wctype.h wordexp.h xlocale.h
 
 .PATH: ${.CURDIR}/../contrib/libc-vis

Modified: stable/9/include/stdatomic.h
==============================================================================
--- stable/9/include/stdatomic.h Sat Dec 5 22:51:20 2015 (r291874)
+++ stable/9/include/stdatomic.h Sat Dec 5 22:56:57 2015 (r291875)
@@ -145,10 +145,8 @@ typedef _Atomic(long) atomic_long;
 typedef _Atomic(unsigned long) atomic_ulong;
 typedef _Atomic(long long) atomic_llong;
 typedef _Atomic(unsigned long long) atomic_ullong;
-#if 0
 typedef _Atomic(__char16_t) atomic_char16_t;
 typedef _Atomic(__char32_t) atomic_char32_t;
-#endif
 typedef _Atomic(___wchar_t) atomic_wchar_t;
 typedef _Atomic(__int_least8_t) atomic_int_least8_t;
 typedef _Atomic(__uint_least8_t) atomic_uint_least8_t;

Copied: stable/9/include/uchar.h (from r250883, head/include/uchar.h)
==============================================================================
--- /dev/null 00:00:00 1970 (empty, because file is newly added)
+++ stable/9/include/uchar.h Sat Dec 5 22:56:57 2015 (r291875, copy of r250883, head/include/uchar.h)
@@ -0,0 +1,60 @@
+/*-
+ * Copyright (c) 2013 Ed Schouten <ed@FreeBSD.org>
+ * All rights reserved.
+ *
+ * Redistribution and use in source and binary forms, with or without
+ * modification, are permitted provided that the following conditions
+ * are met:
+ * 1. Redistributions of source code must retain the above copyright
+ *    notice, this list of conditions and the following disclaimer.
+ * 2. Redistributions in binary form must reproduce the above copyright
+ *    notice, this list of conditions and the following disclaimer in the
+ *    documentation and/or other materials provided with the distribution.
+ *
+ * THIS SOFTWARE IS PROVIDED BY THE AUTHOR AND CONTRIBUTORS ``AS IS'' AND
+ * ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
+ * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
+ * ARE DISCLAIMED. IN NO EVENT SHALL THE AUTHOR OR CONTRIBUTORS BE LIABLE
+ * FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
+ * DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS
+ * OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION)
+ * HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT
+ * LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY
+ * OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF
+ * SUCH DAMAGE.
+ *
+ * $FreeBSD$
+ */
+
+#ifndef _UCHAR_H_
+#define _UCHAR_H_
+
+#include <sys/cdefs.h>
+#include <sys/_types.h>
+
+#ifndef _MBSTATE_T_DECLARED
+typedef __mbstate_t mbstate_t;
+#define _MBSTATE_T_DECLARED
+#endif
+
+#ifndef _SIZE_T_DECLARED
+typedef __size_t size_t;
+#define _SIZE_T_DECLARED
+#endif
+
+typedef __char16_t char16_t;
+typedef __char32_t char32_t;
+
+__BEGIN_DECLS
+size_t c16rtomb(char * __restrict, char16_t, mbstate_t * __restrict);
+size_t c32rtomb(char * __restrict, char32_t, mbstate_t * __restrict);
+size_t mbrtoc16(char16_t * __restrict, const char * __restrict, size_t,
+    mbstate_t * __restrict);
+size_t mbrtoc32(char32_t * __restrict, const char * __restrict, size_t,
+    mbstate_t * __restrict);
+#if __BSD_VISIBLE || defined(_XLOCALE_H_)
+#include <xlocale/_uchar.h>
+#endif
+__END_DECLS
+
+#endif /* !_UCHAR_H_ */

Modified: stable/9/include/xlocale/Makefile
==============================================================================
--- stable/9/include/xlocale/Makefile Sat Dec 5 22:51:20 2015 (r291874)
+++ stable/9/include/xlocale/Makefile Sat Dec 5 22:56:57 2015 (r291875)
@@ -2,7 +2,7 @@
 
 NO_OBJ=
 INCS= _ctype.h _inttypes.h _langinfo.h _locale.h _monetary.h _stdio.h\
-	_stdlib.h _string.h _time.h _wchar.h
+	_stdlib.h _string.h _time.h _uchar.h _wchar.h
 INCSDIR=${INCLUDEDIR}/xlocale
 
 .include <bsd.prog.mk>

Copied: stable/9/include/xlocale/_uchar.h (from r250883, head/include/xlocale/_uchar.h)
==============================================================================
--- /dev/null 00:00:00 1970 (empty, because file is newly added)
+++ stable/9/include/xlocale/_uchar.h Sat Dec 5 22:56:57 2015 (r291875, copy of r250883, head/include/xlocale/_uchar.h)
@@ -0,0 +1,46 @@
+/*-
+ * Copyright (c) 2013 Ed Schouten <ed@FreeBSD.org>
+ * All rights reserved.
+ *
+ * Redistribution and use in source and binary forms, with or without
+ * modification, are permitted provided that the following conditions
+ * are met:
+ * 1. Redistributions of source code must retain the above copyright
+ *    notice, this list of conditions and the following disclaimer.
+ * 2. Redistributions in binary form must reproduce the above copyright
+ *    notice, this list of conditions and the following disclaimer in the
+ *    documentation and/or other materials provided with the distribution.
+ *
+ * THIS SOFTWARE IS PROVIDED BY THE AUTHOR AND CONTRIBUTORS ``AS IS'' AND
+ * ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
+ * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
+ * ARE DISCLAIMED. IN NO EVENT SHALL THE AUTHOR OR CONTRIBUTORS BE LIABLE
+ * FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
+ * DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS
+ * OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION)
+ * HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT
+ * LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY
+ * OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF
+ * SUCH DAMAGE.
+ *
+ * $FreeBSD$
+ */
+
+#ifndef _LOCALE_T_DEFINED
+#define _LOCALE_T_DEFINED
+typedef struct _xlocale *locale_t;
+#endif
+
+#ifndef _XLOCALE_UCHAR_H_
+#define _XLOCALE_UCHAR_H_
+
+size_t c16rtomb_l(char * __restrict, char16_t, mbstate_t * __restrict,
+    locale_t);
+size_t c32rtomb_l(char * __restrict, char32_t, mbstate_t * __restrict,
+    locale_t);
+size_t mbrtoc16_l(char16_t * __restrict, const char * __restrict, size_t,
+    mbstate_t * __restrict, locale_t);
+size_t mbrtoc32_l(char32_t * __restrict, const char * __restrict, size_t,
+    mbstate_t * __restrict, locale_t);
+
+#endif /* _XLOCALE_UCHAR_H_ */

Modified: stable/9/lib/libc/locale/Makefile.inc
==============================================================================
--- stable/9/lib/libc/locale/Makefile.inc Sat Dec 5 22:51:20 2015 (r291874)
+++ stable/9/lib/libc/locale/Makefile.inc Sat Dec 5 22:56:57 2015 (r291875)
@@ -23,6 +23,12 @@ SRCS+= ascii.c big5.c btowc.c collate.c
 	wcwidth.c\
 	xlocale.c
 
+.if ${MK_ICONV} != "no"
+SRCS+= c16rtomb_iconv.c c32rtomb_iconv.c mbrtoc16_iconv.c mbrtoc32_iconv.c
+.else
+SRCS+= c16rtomb.c c32rtomb.c mbrtoc16.c mbrtoc32.c
+.endif
+
 SYM_MAPS+=${.CURDIR}/locale/Symbol.map
 
 MAN+= btowc.3 \
@@ -72,7 +78,9 @@ MLINKS+=iswalnum_l.3 iswalpha_l.3 iswaln
 	iswalnum_l.3 iswspecial_l.3 iswalnum_l.3 nextwctype_l.3 \
 	iswalnum_l.3 towctrans_l.3 iswalnum_l.3 wctrans_l.3
 MLINKS+=isxdigit.3 ishexnumber.3
+MLINKS+=mbrtowc.3 mbrtoc16.3 mbrtowc.3 mbrtoc32.3
 MLINKS+=mbsrtowcs.3 mbsnrtowcs.3
+MLINKS+=wcrtomb.3 c16rtomb.3 wcrtomb.3 c32rtomb.3
 MLINKS+=wcsrtombs.3 wcsnrtombs.3
 MLINKS+=wcstod.3 wcstof.3 wcstod.3 wcstold.3
 MLINKS+=wcstol.3 wcstoul.3 wcstol.3 wcstoll.3 wcstol.3 wcstoull.3 \

Modified: stable/9/lib/libc/locale/Symbol.map
==============================================================================
--- stable/9/lib/libc/locale/Symbol.map Sat Dec 5 22:51:20 2015 (r291874)
+++ stable/9/lib/libc/locale/Symbol.map Sat Dec 5 22:56:57 2015 (r291875)
@@ -199,6 +199,14 @@ FBSD_1.3 {
 	__istype_l;
 	__runes_for_locale;
 	_ThreadRuneLocale;
+	c16rtomb;
+	c16rtomb_l;
+	c32rtomb;
+	c32rtomb_l;
+	mbrtoc16;
+	mbrtoc16_l;
+	mbrtoc32;
+	mbrtoc32_l;
 };
 
 FBSDprivate_1.0 {

Copied: stable/9/lib/libc/locale/c16rtomb.c (from r250883, head/lib/libc/locale/c16rtomb.c)
==============================================================================
--- /dev/null 00:00:00 1970 (empty, because file is newly added)
+++ stable/9/lib/libc/locale/c16rtomb.c Sat Dec 5 22:56:57 2015 (r291875, copy of r250883, head/lib/libc/locale/c16rtomb.c)
@@ -0,0 +1,81 @@
+/*-
+ * Copyright (c) 2013 Ed Schouten <ed@FreeBSD.org>
+ * All rights reserved.
+ *
+ * Redistribution and use in source and binary forms, with or without
+ * modification, are permitted provided that the following conditions
+ * are met:
+ * 1. Redistributions of source code must retain the above copyright
+ *    notice, this list of conditions and the following disclaimer.
+ * 2. Redistributions in binary form must reproduce the above copyright
+ *    notice, this list of conditions and the following disclaimer in the
+ *    documentation and/or other materials provided with the distribution.
+ *
+ * THIS SOFTWARE IS PROVIDED BY THE AUTHOR AND CONTRIBUTORS ``AS IS'' AND
+ * ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
+ * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
+ * ARE DISCLAIMED. IN NO EVENT SHALL THE AUTHOR OR CONTRIBUTORS BE LIABLE
+ * FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
+ * DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS
+ * OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION)
+ * HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT
+ * LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY
+ * OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF
+ * SUCH DAMAGE.
+ */
+
+#include <sys/cdefs.h>
+__FBSDID("$FreeBSD$");
+
+#include <errno.h>
+#include <uchar.h>
+#include "xlocale_private.h"
+
+typedef struct {
+	char16_t lead_surrogate;
+	mbstate_t c32_mbstate;
+} _Char16State;
+
+size_t
+c16rtomb_l(char * __restrict s, char16_t c16, mbstate_t * __restrict ps,
+    locale_t locale)
+{
+	_Char16State *cs;
+	char32_t c32;
+
+	FIX_LOCALE(locale);
+	if (ps == NULL)
+		ps = &locale->c16rtomb;
+	cs = (_Char16State *)ps;
+
+	/* If s is a null pointer, the value of parameter c16 is ignored. */
+	if (s == NULL) {
+		c32 = 0;
+	} else if (cs->lead_surrogate >= 0xd800 &&
+	    cs->lead_surrogate <= 0xdbff) {
+		/* We should see a trail surrogate now. */
+		if (c16 < 0xdc00 || c16 > 0xdfff) {
+			errno = EILSEQ;
+			return ((size_t)-1);
+		}
+		c32 = 0x10000 + ((cs->lead_surrogate & 0x3ff) << 10 |
+		    (c16 & 0x3ff));
+	} else if (c16 >= 0xd800 && c16 <= 0xdbff) {
+		/* Store lead surrogate for next invocation. */
+		cs->lead_surrogate = c16;
+		return (0);
+	} else {
+		/* Regular character. */
+		c32 = c16;
+	}
+	cs->lead_surrogate = 0;
+
+	return (c32rtomb_l(s, c32, &cs->c32_mbstate, locale));
+}
+
+size_t
+c16rtomb(char * __restrict s, char16_t c16, mbstate_t * __restrict ps)
+{
+
+	return (c16rtomb_l(s, c16, ps, __get_locale()));
+}

Copied: stable/9/lib/libc/locale/c16rtomb_iconv.c (from r251314, head/lib/libc/locale/c16rtomb_iconv.c)
==============================================================================
--- /dev/null 00:00:00 1970 (empty, because file is newly added)
+++ stable/9/lib/libc/locale/c16rtomb_iconv.c Sat Dec 5 22:56:57 2015 (r291875, copy of r251314, head/lib/libc/locale/c16rtomb_iconv.c)
@@ -0,0 +1,8 @@
+/* $FreeBSD$ */
+#define charXX_t char16_t
+#define cXXrtomb c16rtomb
+#define cXXrtomb_l c16rtomb_l
+#define SRCBUF_LEN 2
+#define UTF_XX_INTERNAL "UTF-16-INTERNAL"
+
+#include "cXXrtomb_iconv.h"

Copied: stable/9/lib/libc/locale/c32rtomb.c (from r250883, head/lib/libc/locale/c32rtomb.c)
==============================================================================
--- /dev/null 00:00:00 1970 (empty, because file is newly added)
+++ stable/9/lib/libc/locale/c32rtomb.c Sat Dec 5 22:56:57 2015 (r291875, copy of r250883, head/lib/libc/locale/c32rtomb.c)
@@ -0,0 +1,59 @@
+/*-
+ * Copyright (c) 2013 Ed Schouten <ed@FreeBSD.org>
+ * All rights reserved.
+ *
+ * Redistribution and use in source and binary forms, with or without
+ * modification, are permitted provided that the following conditions
+ * are met:
+ * 1. Redistributions of source code must retain the above copyright
+ *    notice, this list of conditions and the following disclaimer.
+ * 2. Redistributions in binary form must reproduce the above copyright
+ *    notice, this list of conditions and the following disclaimer in the
+ *    documentation and/or other materials provided with the distribution.
+ *
+ * THIS SOFTWARE IS PROVIDED BY THE AUTHOR AND CONTRIBUTORS ``AS IS'' AND
+ * ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
+ * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
+ * ARE DISCLAIMED. IN NO EVENT SHALL THE AUTHOR OR CONTRIBUTORS BE LIABLE
+ * FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
+ * DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS
+ * OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION)
+ * HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT
+ * LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY
+ * OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF
+ * SUCH DAMAGE.
+ */
+
+#include <sys/cdefs.h>
+__FBSDID("$FreeBSD$");
+
+#include <errno.h>
+#include <uchar.h>
+#include <wchar.h>
+#include "xlocale_private.h"
+
+size_t
+c32rtomb_l(char * __restrict s, char32_t c32, mbstate_t * __restrict ps,
+    locale_t locale)
+{
+
+	/* Unicode Standard 5.0, D90: ill-formed characters. */
+	if ((c32 >= 0xd800 && c32 <= 0xdfff) || c32 > 0x10ffff) {
+		errno = EILSEQ;
+		return ((size_t)-1);
+	}
+
+	FIX_LOCALE(locale);
+	if (ps == NULL)
+		ps = &locale->c32rtomb;
+
+	/* Assume wchar_t uses UTF-32. */
+	return (wcrtomb_l(s, c32, ps, locale));
+}
+
+size_t
+c32rtomb(char * __restrict s, char32_t c32, mbstate_t * __restrict ps)
+{
+
+	return (c32rtomb_l(s, c32, ps, __get_locale()));
+}

Copied: stable/9/lib/libc/locale/c32rtomb_iconv.c (from r251314, head/lib/libc/locale/c32rtomb_iconv.c)
==============================================================================
--- /dev/null 00:00:00 1970 (empty, because file is newly added)
+++ stable/9/lib/libc/locale/c32rtomb_iconv.c Sat Dec 5 22:56:57 2015 (r291875, copy of r251314, head/lib/libc/locale/c32rtomb_iconv.c)
@@ -0,0 +1,8 @@
+/* $FreeBSD$ */
+#define charXX_t char32_t
+#define cXXrtomb c32rtomb
+#define cXXrtomb_l c32rtomb_l
+#define SRCBUF_LEN 1
+#define UTF_XX_INTERNAL "UTF-32-INTERNAL"
+
+#include "cXXrtomb_iconv.h"

Copied: stable/9/lib/libc/locale/cXXrtomb_iconv.h (from r251314, head/lib/libc/locale/cXXrtomb_iconv.h)
==============================================================================
--- /dev/null 00:00:00 1970 (empty, because file is newly added)
+++ stable/9/lib/libc/locale/cXXrtomb_iconv.h Sat Dec 5 22:56:57 2015 (r291875, copy of r251314, head/lib/libc/locale/cXXrtomb_iconv.h)
@@ -0,0 +1,115 @@
+/*-
+ * Copyright (c) 2013 Ed Schouten <ed@FreeBSD.org>
+ * All rights reserved.
+ *
+ * Redistribution and use in source and binary forms, with or without
+ * modification, are permitted provided that the following conditions
+ * are met:
+ * 1. Redistributions of source code must retain the above copyright
+ *    notice, this list of conditions and the following disclaimer.
+ * 2. Redistributions in binary form must reproduce the above copyright
+ *    notice, this list of conditions and the following disclaimer in the
+ *    documentation and/or other materials provided with the distribution.
+ *
+ * THIS SOFTWARE IS PROVIDED BY THE AUTHOR AND CONTRIBUTORS ``AS IS'' AND
+ * ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
+ * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
+ * ARE DISCLAIMED. IN NO EVENT SHALL THE AUTHOR OR CONTRIBUTORS BE LIABLE
+ * FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
+ * DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS
+ * OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION)
+ * HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT
+ * LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY
+ * OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF
+ * SUCH DAMAGE.
+ */
+
+#include <sys/cdefs.h>
+__FBSDID("$FreeBSD$");
+
+#include <sys/queue.h>
+
+#include <assert.h>
+#include <errno.h>
+#include <langinfo.h>
+#include <uchar.h>
+
+#include "../iconv/citrus_hash.h"
+#include "../iconv/citrus_module.h"
+#include "../iconv/citrus_iconv.h"
+#include "xlocale_private.h"
+
+typedef struct {
+	bool initialized;
+	struct _citrus_iconv iconv;
+	union {
+		charXX_t widechar[SRCBUF_LEN];
+		char bytes[sizeof(charXX_t) * SRCBUF_LEN];
+	} srcbuf;
+	size_t srcbuf_len;
+} _ConversionState;
+_Static_assert(sizeof(_ConversionState) <= sizeof(mbstate_t),
+    "Size of _ConversionState must not exceed mbstate_t's size.");
+
+size_t
+cXXrtomb_l(char * __restrict s, charXX_t c, mbstate_t * __restrict ps,
+    locale_t locale)
+{
+	_ConversionState *cs;
+	struct _citrus_iconv *handle;
+	char *src, *dst;
+	size_t srcleft, dstleft, invlen;
+	int err;
+
+	FIX_LOCALE(locale);
+	if (ps == NULL)
+		ps = &locale->cXXrtomb;
+	cs = (_ConversionState *)ps;
+	handle = &cs->iconv;
+
+	/* Reinitialize mbstate_t. */
+	if (s == NULL || !cs->initialized) {
+		if (_citrus_iconv_open(&handle, UTF_XX_INTERNAL,
+		    nl_langinfo_l(CODESET, locale)) != 0) {
+			cs->initialized = false;
+			errno = EINVAL;
+			return (-1);
+		}
+		handle->cv_shared->ci_discard_ilseq = true;
+		handle->cv_shared->ci_hooks = NULL;
+		cs->srcbuf_len = 0;
+		cs->initialized = true;
+		if (s == NULL)
+			return (1);
+	}
+
+	assert(cs->srcbuf_len < sizeof(cs->srcbuf.widechar) / sizeof(charXX_t));
+	cs->srcbuf.widechar[cs->srcbuf_len++] = c;
+
+	/* Perform conversion. */
+	src = cs->srcbuf.bytes;
+	srcleft = cs->srcbuf_len * sizeof(charXX_t);
+	dst = s;
+	dstleft = MB_CUR_MAX_L(locale);
+	err = _citrus_iconv_convert(handle, &src, &srcleft, &dst, &dstleft,
+	    0, &invlen);
+
+	/* Character is part of a surrogate pair. We need more input. */
+	if (err == EINVAL)
+		return (0);
+	cs->srcbuf_len = 0;
+
+	/* Illegal sequence. */
+	if (dst == s) {
+		errno = EILSEQ;
+		return ((size_t)-1);
+	}
+	return (dst - s);
+}
+
+size_t
+cXXrtomb(char * __restrict s, charXX_t c, mbstate_t * __restrict ps)
+{
+
+	return (cXXrtomb_l(s, c, ps, __get_locale()));
+}

Copied: stable/9/lib/libc/locale/mbrtoc16.c (from r250883, head/lib/libc/locale/mbrtoc16.c)
==============================================================================
--- /dev/null 00:00:00 1970 (empty, because file is newly added)
+++ stable/9/lib/libc/locale/mbrtoc16.c Sat Dec 5 22:56:57 2015 (r291875, copy of r250883, head/lib/libc/locale/mbrtoc16.c)
@@ -0,0 +1,89 @@
+/*-
+ * Copyright (c) 2013 Ed Schouten <ed@FreeBSD.org>
+ * All rights reserved.
+ *
+ * Redistribution and use in source and binary forms, with or without
+ * modification, are permitted provided that the following conditions
+ * are met:
+ * 1. Redistributions of source code must retain the above copyright
+ *    notice, this list of conditions and the following disclaimer.
+ * 2. Redistributions in binary form must reproduce the above copyright
+ *    notice, this list of conditions and the following disclaimer in the
+ *    documentation and/or other materials provided with the distribution.
+ *
+ * THIS SOFTWARE IS PROVIDED BY THE AUTHOR AND CONTRIBUTORS ``AS IS'' AND
+ * ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
+ * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
+ * ARE DISCLAIMED. IN NO EVENT SHALL THE AUTHOR OR CONTRIBUTORS BE LIABLE
+ * FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
+ * DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS
+ * OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION)
+ * HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT
+ * LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY
+ * OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF
+ * SUCH DAMAGE.
+ */
+
+#include <sys/cdefs.h>
+__FBSDID("$FreeBSD$");
+
+#include <uchar.h>
+#include "xlocale_private.h"
+
+typedef struct {
+	char16_t trail_surrogate;
+	mbstate_t c32_mbstate;
+} _Char16State;
+
+size_t
+mbrtoc16_l(char16_t * __restrict pc16, const char * __restrict s, size_t n,
+    mbstate_t * __restrict ps, locale_t locale)
+{
+	_Char16State *cs;
+	char32_t c32;
+	ssize_t len;
+
+	FIX_LOCALE(locale);
+	if (ps == NULL)
+		ps = &locale->mbrtoc16;
+	cs = (_Char16State *)ps;
+
+	/*
+	 * Call straight into mbrtoc32_l() if we don't need to return a
+	 * character value. According to the spec, if s is a null
+	 * pointer, the value of parameter pc16 is also ignored.
+	 */
+	if (pc16 == NULL || s == NULL) {
+		cs->trail_surrogate = 0;
+		return (mbrtoc32_l(NULL, s, n, &cs->c32_mbstate, locale));
+	}
+
+	/* Return the trail surrogate from the previous invocation. */
+	if (cs->trail_surrogate >= 0xdc00 && cs->trail_surrogate <= 0xdfff) {
+		*pc16 = cs->trail_surrogate;
+		cs->trail_surrogate = 0;
+		return ((size_t)-3);
+	}
+
+	len = mbrtoc32_l(&c32, s, n, &cs->c32_mbstate, locale);
+	if (len >= 0) {
+		if (c32 < 0x10000) {
+			/* Fits in one UTF-16 character. */
+			*pc16 = c32;
+		} else {
+			/* Split up in a surrogate pair. */
+			c32 -= 0x10000;
+			*pc16 = 0xd800 | (c32 >> 10);
+			cs->trail_surrogate = 0xdc00 | (c32 & 0x3ff);
+		}
+	}
+	return (len);
+}
+
+size_t
+mbrtoc16(char16_t * __restrict pc16, const char * __restrict s, size_t n,
+    mbstate_t * __restrict ps)
+{
+
+	return (mbrtoc16_l(pc16, s, n, ps, __get_locale()));
+}

Copied: stable/9/lib/libc/locale/mbrtoc16_iconv.c (from r251314, head/lib/libc/locale/mbrtoc16_iconv.c)
==============================================================================
--- /dev/null 00:00:00 1970 (empty, because file is newly added)
+++ stable/9/lib/libc/locale/mbrtoc16_iconv.c Sat Dec 5 22:56:57 2015 (r291875, copy of r251314, head/lib/libc/locale/mbrtoc16_iconv.c)
@@ -0,0 +1,8 @@
+/* $FreeBSD$ */
+#define charXX_t char16_t
+#define mbrtocXX mbrtoc16
+#define mbrtocXX_l mbrtoc16_l
+#define DSTBUF_LEN 2
+#define UTF_XX_INTERNAL "UTF-16-INTERNAL"
+
+#include "mbrtocXX_iconv.h"

Copied: stable/9/lib/libc/locale/mbrtoc32.c (from r250883, head/lib/libc/locale/mbrtoc32.c)
==============================================================================
--- /dev/null 00:00:00 1970 (empty, because file is newly added)
+++ stable/9/lib/libc/locale/mbrtoc32.c Sat Dec 5 22:56:57 2015 (r291875, copy of r250883, head/lib/libc/locale/mbrtoc32.c)
@@ -0,0 +1,53 @@
+/*-
+ * Copyright (c) 2013 Ed Schouten <ed@FreeBSD.org>
+ * All rights reserved.
+ *
+ * Redistribution and use in source and binary forms, with or without
+ * modification, are permitted provided that the following conditions
+ * are met:
+ * 1. Redistributions of source code must retain the above copyright
+ *    notice, this list of conditions and the following disclaimer.
+ * 2. Redistributions in binary form must reproduce the above copyright
+ *    notice, this list of conditions and the following disclaimer in the
+ *    documentation and/or other materials provided with the distribution.
+ *
+ * THIS SOFTWARE IS PROVIDED BY THE AUTHOR AND CONTRIBUTORS ``AS IS'' AND
+ * ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
+ * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
+ * ARE DISCLAIMED. IN NO EVENT SHALL THE AUTHOR OR CONTRIBUTORS BE LIABLE
+ * FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
+ * DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS
+ * OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION)
+ * HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT
+ * LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY
+ * OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF
+ * SUCH DAMAGE.
+ */
+
+#include <sys/cdefs.h>
+__FBSDID("$FreeBSD$");
+
+#include <uchar.h>
+#include <wchar.h>
+#include "xlocale_private.h"
+
+size_t
+mbrtoc32_l(char32_t * __restrict pc32, const char * __restrict s, size_t n,
+    mbstate_t * __restrict ps, locale_t locale)
+{
+
+	FIX_LOCALE(locale);
+	if (ps == NULL)
+		ps = &locale->mbrtoc32;
+
+	/* Assume wchar_t uses UTF-32. */
+	return (mbrtowc_l(pc32, s, n, ps, locale));
+}
+
+size_t
+mbrtoc32(char32_t * __restrict pc32, const char * __restrict s, size_t n,
+    mbstate_t * __restrict ps)
+{
+
+	return (mbrtoc32_l(pc32, s, n, ps, __get_locale()));
+}

Copied: stable/9/lib/libc/locale/mbrtoc32_iconv.c (from r251314, head/lib/libc/locale/mbrtoc32_iconv.c)
==============================================================================
--- /dev/null 00:00:00 1970 (empty, because file is newly added)
+++ stable/9/lib/libc/locale/mbrtoc32_iconv.c Sat Dec 5 22:56:57 2015 (r291875, copy of r251314, head/lib/libc/locale/mbrtoc32_iconv.c)
@@ -0,0 +1,8 @@
+/* $FreeBSD$ */
+#define charXX_t char32_t
+#define mbrtocXX mbrtoc32
+#define mbrtocXX_l mbrtoc32_l
+#define DSTBUF_LEN 1
+#define UTF_XX_INTERNAL "UTF-32-INTERNAL"
+
+#include "mbrtocXX_iconv.h"

Copied: stable/9/lib/libc/locale/mbrtocXX_iconv.h (from r251314, head/lib/libc/locale/mbrtocXX_iconv.h)
==============================================================================
--- /dev/null 00:00:00 1970 (empty, because file is newly added)
+++ stable/9/lib/libc/locale/mbrtocXX_iconv.h Sat Dec 5 22:56:57 2015 (r291875, copy of r251314, head/lib/libc/locale/mbrtocXX_iconv.h)
@@ -0,0 +1,158 @@
+/*-
+ * Copyright (c) 2013 Ed Schouten <ed@FreeBSD.org>
+ * All rights reserved.
+ *
+ * Redistribution and use in source and binary forms, with or without
+ * modification, are permitted provided that the following conditions
+ * are met:
+ * 1. Redistributions of source code must retain the above copyright
+ *    notice, this list of conditions and the following disclaimer.
+ * 2. Redistributions in binary form must reproduce the above copyright
+ *    notice, this list of conditions and the following disclaimer in the
+ *    documentation and/or other materials provided with the distribution.
+ *
+ * THIS SOFTWARE IS PROVIDED BY THE AUTHOR AND CONTRIBUTORS ``AS IS'' AND
+ * ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
+ * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
+ * ARE DISCLAIMED. IN NO EVENT SHALL THE AUTHOR OR CONTRIBUTORS BE LIABLE
+ * FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
+ * DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS
+ * OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION)
+ * HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT
+ * LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY
+ * OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF
+ * SUCH DAMAGE.
+ */
+
+#include <sys/cdefs.h>
+__FBSDID("$FreeBSD$");
+
+#include <sys/queue.h>
+
+#include <assert.h>
+#include <errno.h>
+#include <langinfo.h>
+#include <limits.h>
+#include <string.h>
+#include <uchar.h>
+
+#include "../iconv/citrus_hash.h"
+#include "../iconv/citrus_module.h"
+#include "../iconv/citrus_iconv.h"
+#include "xlocale_private.h"
+
+typedef struct {
+	bool initialized;
+	struct _citrus_iconv iconv;
+	char srcbuf[MB_LEN_MAX];
+	size_t srcbuf_len;
+	union {
+		charXX_t widechar[DSTBUF_LEN];
+		char bytes[sizeof(charXX_t) * DSTBUF_LEN];
+	} dstbuf;
+	size_t dstbuf_len;
+} _ConversionState;
+_Static_assert(sizeof(_ConversionState) <= sizeof(mbstate_t),
+    "Size of _ConversionState must not exceed mbstate_t's size.");
+
+size_t
+mbrtocXX_l(charXX_t * __restrict pc, const char * __restrict s, size_t n,
+    mbstate_t * __restrict ps, locale_t locale)
+{
+	_ConversionState *cs;
+	struct _citrus_iconv *handle;
+	size_t i, retval;
+	charXX_t retchar;
+
+	FIX_LOCALE(locale);
+	if (ps == NULL)
+		ps = &locale->mbrtocXX;
+	cs = (_ConversionState *)ps;
+	handle = &cs->iconv;
+
+	/* Reinitialize mbstate_t. */
+	if (s == NULL || !cs->initialized) {
+		if (_citrus_iconv_open(&handle,
+		    nl_langinfo_l(CODESET, locale), UTF_XX_INTERNAL) != 0) {
+			cs->initialized = false;
+			errno = EINVAL;
+			return (-1);
+		}
+		handle->cv_shared->ci_discard_ilseq = true;
+		handle->cv_shared->ci_hooks = NULL;
+		cs->srcbuf_len = cs->dstbuf_len = 0;
+		cs->initialized = true;
+		if (s == NULL)
+			return (0);
+	}
+
+	/* See if we still have characters left from the previous invocation. */
+	if (cs->dstbuf_len > 0) {
+		retval = (size_t)-3;
+		goto return_char;
+	}
+
+	/* Fill up the read buffer as far as possible. */
+	if (n > sizeof(cs->srcbuf) - cs->srcbuf_len)
+		n = sizeof(cs->srcbuf) - cs->srcbuf_len;
+	memcpy(cs->srcbuf + cs->srcbuf_len, s, n);
+
+	/* Convert as few characters to the dst buffer as possible. */
+	for (i = 0; ; i++) {
+		char *src, *dst;
+		size_t srcleft, dstleft, invlen;
+		int err;
+
+		src = cs->srcbuf;
+		srcleft = cs->srcbuf_len + n;
+		dst = cs->dstbuf.bytes;
+		dstleft = i * sizeof(charXX_t);
+		assert(srcleft <= sizeof(cs->srcbuf) &&
+		    dstleft <= sizeof(cs->dstbuf.bytes));
+		err = _citrus_iconv_convert(handle, &src, &srcleft,
+		    &dst, &dstleft, 0, &invlen);
+		cs->dstbuf_len = (dst - cs->dstbuf.bytes) / sizeof(charXX_t);
+
+		/* Got new character(s). Return the first. */
+		if (cs->dstbuf_len > 0) {
+			assert(src - cs->srcbuf > cs->srcbuf_len);
+			retval = src - cs->srcbuf - cs->srcbuf_len;
+			cs->srcbuf_len = 0;
+			goto return_char;
+		}
+
+		/* Increase dst buffer size, to obtain the surrogate pair. */
+		if (err == E2BIG)
+			continue;
+
+		/* Illegal sequence. */
+		if (invlen > 0) {
+			cs->srcbuf_len = 0;
+			errno = EILSEQ;
+			return ((size_t)-1);
+		}
+
+		/* Save unprocessed remainder for the next invocation. */
+		memmove(cs->srcbuf, src, srcleft);
+		cs->srcbuf_len = srcleft;
+		return ((size_t)-2);
+	}
+
+return_char:
+	retchar = cs->dstbuf.widechar[0];
+	memmove(&cs->dstbuf.widechar[0], &cs->dstbuf.widechar[1],
+	    --cs->dstbuf_len * sizeof(charXX_t));
+	if (pc != NULL)
+		*pc = retchar;
+	if (retchar == 0)
+		return (0);
+	return (retval);
+}
+
+size_t
+mbrtocXX(charXX_t * __restrict pc, const char * __restrict s, size_t n,
+    mbstate_t * __restrict ps)
+{
+
+	return (mbrtocXX_l(pc, s, n, ps, __get_locale()));
+}

Modified: stable/9/lib/libc/locale/mbrtowc.3
==============================================================================
--- stable/9/lib/libc/locale/mbrtowc.3 Sat Dec 5 22:51:20 2015 (r291874)
+++ stable/9/lib/libc/locale/mbrtowc.3 Sat Dec 5 22:56:57 2015 (r291875)
@@ -24,11 +24,13 @@
 .\"
 .\" $FreeBSD$
 .\"
-.Dd April 8, 2004
+.Dd May 21, 2013
 .Dt MBRTOWC 3
 .Os
 .Sh NAME
-.Nm mbrtowc
+.Nm mbrtowc ,
+.Nm mbrtoc16 ,
+.Nm mbrtoc32
 .Nd "convert a character to a wide-character code (restartable)"
 .Sh LIBRARY
 .Lb libc
@@ -36,35 +38,51 @@
 .In wchar.h
 .Ft size_t
 .Fo mbrtowc
-.Fa "wchar_t * restrict pwc" "const char * restrict s" "size_t n"
+.Fa "wchar_t * restrict pc" "const char * restrict s" "size_t n"
+.Fa "mbstate_t * restrict ps"
+.Fc
+.In uchar.h
+.Ft size_t
+.Fo mbrtoc16
+.Fa "char16_t * restrict pc" "const char * restrict s" "size_t n"
+.Fa "mbstate_t * restrict ps"
+.Fc
+.Ft size_t
+.Fo mbrtoc32
+.Fa "char32_t * restrict pc" "const char * restrict s" "size_t n"
 .Fa "mbstate_t * restrict ps"
 .Fc
 .Sh DESCRIPTION
 The
-.Fn mbrtowc
-function inspects at most
+.Fn mbrtowc ,
+.Fn mbrtoc16
+and
+.Fn mbrtoc32
+functions inspect at most
 .Fa n
 bytes pointed to by
 .Fa s
 to determine the number of bytes needed to complete the next
 multibyte character.
 If a character can be completed, and
-.Fa pwc
+.Fa pc
 is not
 .Dv NULL ,
 the wide character which is represented by
 .Fa s
 is stored in the
-.Vt wchar_t
+.Vt wchar_t ,
+.Vt char16_t
+or
+.Vt char32_t
 it points to.
.Pp If .Fa s is .Dv NULL , -.Fn mbrtowc -behaves as if -.Fa pwc +these functions behave as if +.Fa pc was .Dv NULL , .Fa s @@ -81,15 +99,24 @@ argument, is used to keep track of the shift state. If it is .Dv NULL , -.Fn mbrtowc -uses an internal, static +these functions use an internal, static .Vt mbstate_t object, which is initialized to the initial conversion state at program startup. +.Pp +As a single +.Vt char16_t +is not large enough to represent certain multibyte characters, the +.Fn mbrtoc16 +function may need to be invoked multiple times to convert a single +multibyte character sequence. .Sh RETURN VALUES The -.Fn mbrtowc -functions returns: +.Fn mbrtowc , +.Fn mbrtoc16 +and +.Fn mbrtoc32 +functions return: .Bl -tag -width indent .It 0 The next @@ -100,10 +127,13 @@ represent the null wide character .It >0 The next .Fa n -or fewer bytes -represent a valid character, -.Fn mbrtowc -returns the number of bytes used to complete the multibyte character. +or fewer bytes represent a valid character, these functions +return the number of bytes used to complete the multibyte character. +.It Po Vt size_t Pc Ns \-1 +An encoding error has occurred. +The next +.Fa n +or fewer bytes do not contribute to a valid multibyte character. .It Po Vt size_t Pc Ns \-2 The next .Fa n @@ -111,16 +141,23 @@ contribute to, but do not complete, a va and all .Fa n bytes have been processed. -.It Po Vt size_t Pc Ns \-1 -An encoding error has occurred. -The next -.Fa n -or fewer bytes do not contribute to a valid multibyte character. +.El +.Pp +The +.Fn mbrtoc16 +function also returns: +.Bl -tag -width indent +.It Po Vt size_t Pc Ns \-3 +The next character resulting from a previous call has been stored. +No bytes from the input have been consumed. .El .Sh ERRORS The -.Fn mbrtowc -function will fail if: +.Fn mbrtowc , +.Fn mbrtoc16 +and +.Fn mbrtoc32 +functions will fail if: .Bl -tag -width Er .It Bq Er EILSEQ An invalid multibyte sequence was detected. 
@@ -134,6 +171,9 @@ The conversion state is invalid. .Xr wcrtomb 3 .Sh STANDARDS The -.Fn mbrtowc -function conforms to -.St -isoC-99 . +.Fn mbrtowc , +.Fn mbrtoc16 +and +.Fn mbrtoc32 +functions conform to +.St -isoC-2011 . Modified: stable/9/lib/libc/locale/wcrtomb.3 ============================================================================== --- stable/9/lib/libc/locale/wcrtomb.3 Sat Dec 5 22:51:20 2015 (r291874) +++ stable/9/lib/libc/locale/wcrtomb.3 Sat Dec 5 22:56:57 2015 (r291875) @@ -24,24 +24,34 @@ .\" .\" $FreeBSD$ .\" -.Dd April 8, 2004 +.Dd May 21, 2013 .Dt WCRTOMB 3 .Os .Sh NAME -.Nm wcrtomb +.Nm wcrtomb , +.Nm c16rtomb , +.Nm c32rtomb .Nd "convert a wide-character code to a character (restartable)" .Sh LIBRARY .Lb libc .Sh SYNOPSIS .In wchar.h .Ft size_t -.Fn wcrtomb "char * restrict s" "wchar_t wc" "mbstate_t * restrict ps" +.Fn wcrtomb "char * restrict s" "wchar_t c" "mbstate_t * restrict ps" +.In uchar.h +.Ft size_t *** DIFF OUTPUT TRUNCATED AT 1000 LINES ***
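
Illustration (not part of the commit): the mbrtowc.3 changes above say that mbrtoc16() may have to be called more than once per multibyte character and may return (size_t)-3 when it hands back a code unit left over from the previous call. The sketch below shows one way a caller might loop over those return values. The helper name mbs_to_c16 and its error convention are assumptions made for this example only; the library call used is the standard C11 mbrtoc16() from <uchar.h>.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>
#include <uchar.h>

/*
 * Hypothetical helper, for illustration only: convert the multibyte
 * string `s` (in the current locale's encoding) into at most `outlen`
 * char16_t code units.  Returns the number of code units stored, or
 * (size_t)-1 on an encoding error, an incomplete sequence, or an
 * output buffer that is too small.
 */
size_t
mbs_to_c16(char16_t *out, size_t outlen, const char *s)
{
	mbstate_t ps;
	size_t left, nout, ret;
	char16_t c;

	assert(out != NULL && s != NULL);
	memset(&ps, 0, sizeof(ps));	/* Zeroed state is the initial state. */
	left = strlen(s) + 1;		/* Include the terminating NUL. */
	nout = 0;

	for (;;) {
		ret = mbrtoc16(&c, s, left, &ps);
		if (ret == (size_t)-1 || ret == (size_t)-2)
			return ((size_t)-1);	/* Invalid or truncated input. */
		if (ret == 0)
			break;			/* Reached the NUL terminator. */
		if (nout == outlen)
			return ((size_t)-1);	/* Output buffer is full. */
		out[nout++] = c;
		if (ret != (size_t)-3) {
			/* (size_t)-3 consumes no input; anything else does. */
			s += ret;
			left -= ret;
		}
	}
	return (nout);
}
```

Passing strlen(s) + 1 as the length lets the loop end on the 0 return for the terminating NUL rather than on (size_t)-2. In a UTF-8 locale, a character outside the Basic Multilingual Plane would be expected to yield two code units: the first call returns the byte count, the second returns (size_t)-3.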