From owner-svn-src-stable@freebsd.org Sat Nov 5 09:46:50 2016 Return-Path: Delivered-To: svn-src-stable@mailman.ysv.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by mailman.ysv.freebsd.org (Postfix) with ESMTP id 2E7E6C2E141; Sat, 5 Nov 2016 09:46:50 +0000 (UTC) (envelope-from bapt@FreeBSD.org) Received: from repo.freebsd.org (repo.freebsd.org [IPv6:2610:1c1:1:6068::e6a:0]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (Client did not present a certificate) by mx1.freebsd.org (Postfix) with ESMTPS id E414E17F; Sat, 5 Nov 2016 09:46:49 +0000 (UTC) (envelope-from bapt@FreeBSD.org) Received: from repo.freebsd.org ([127.0.1.37]) by repo.freebsd.org (8.15.2/8.15.2) with ESMTP id uA59knhe018870; Sat, 5 Nov 2016 09:46:49 GMT (envelope-from bapt@FreeBSD.org) Received: (from bapt@localhost) by repo.freebsd.org (8.15.2/8.15.2/Submit) id uA59kmVj018867; Sat, 5 Nov 2016 09:46:48 GMT (envelope-from bapt@FreeBSD.org) Message-Id: <201611050946.uA59kmVj018867@repo.freebsd.org> X-Authentication-Warning: repo.freebsd.org: bapt set sender to bapt@FreeBSD.org using -f From: Baptiste Daroussin Date: Sat, 5 Nov 2016 09:46:48 +0000 (UTC) To: src-committers@freebsd.org, svn-src-all@freebsd.org, svn-src-stable@freebsd.org, svn-src-stable-11@freebsd.org Subject: svn commit: r308330 - in stable/11: contrib/netbsd-tests/lib/libc/locale usr.bin/localedef X-SVN-Group: stable-11 MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit X-BeenThere: svn-src-stable@freebsd.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: SVN commit messages for all the -stable branches of the src tree List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Sat, 05 Nov 2016 09:46:50 -0000 Author: bapt Date: Sat Nov 5 09:46:48 2016 New Revision: 308330 URL: https://svnweb.freebsd.org/changeset/base/308330 Log: MFC r306782-r306783 r306782: localedef: Fix ctype dump (fixed wide spread errors) This commit is from John Marino in dragonfly with the following commit log: ==== This was a CTYPE encoding error involving consecutive points of the same ctype. It was reported by myself to Illumos over a year ago but I was unsure if it was only happening on BSD. Given the cause, the bug is also present on Illumos. Basically, if consecutive points were of the exact same ctype, they would be defined as a range regardless. For example, all of these would be considered equivalent: ... , (converts to .. ) , , (converts to .. ) , ... (converts to .. ) So all the points that shouldn't have been defined got "bridged" by the extreme points. The effects were recently reported to FreeBSD on PR 213013. There are countless places were the ctype flags are misdefined, so this is a major fix that has to be MFC'd. ==== This reveals a bad change I did on the testsuite: while 0x07FF is a valid unicode it is not used yet (reserved for future use) PR: 213013 Submitted by: marino@ Reported by: Kurtis Rader Obtained from: Dragonfly MFC after: 1 month r306783: localedef: Improve cc_list parsing original commit log: ===== I had originally suspected the parsing of ctype definition files as being the source of the ctype flag mis-definitions, but it wasn't. In the process, I simplified the cc_list parsing so I'm committing the no-impact improvement separately. It removes some parsing redundancies and won't parse partial range definitions anymore. ==== Submitted by: marino Obtained from: Dragonfly MFC after: 1 month Modified: stable/11/contrib/netbsd-tests/lib/libc/locale/t_mbstowcs.c stable/11/usr.bin/localedef/ctype.c stable/11/usr.bin/localedef/parser.y (contents, props changed) Directory Properties: stable/11/ (props changed) Modified: stable/11/contrib/netbsd-tests/lib/libc/locale/t_mbstowcs.c ============================================================================== --- stable/11/contrib/netbsd-tests/lib/libc/locale/t_mbstowcs.c Sat Nov 5 06:33:39 2016 (r308329) +++ stable/11/contrib/netbsd-tests/lib/libc/locale/t_mbstowcs.c Sat Nov 5 09:46:48 2016 (r308330) @@ -88,7 +88,7 @@ static struct test { 0xFFFF, 0x5D, 0x5B, 0x10000, 0x10FFFF, 0x5D, 0x0A }, #ifdef __FreeBSD__ - { 1, -1, -1, 1, 1, -1, 1, 1, 1, 1, -1, 1, 1, -1, -1, + { 1, -1, -1, 1, 1, -1, -1, 1, 1, 1, -1, 1, 1, -1, -1, #else { 1, -1, -1, 1, 1, -1, -1, 1, 1, -1, -1, 1, 1, -1, -1, #endif Modified: stable/11/usr.bin/localedef/ctype.c ============================================================================== --- stable/11/usr.bin/localedef/ctype.c Sat Nov 5 06:33:39 2016 (r308329) +++ stable/11/usr.bin/localedef/ctype.c Sat Nov 5 09:46:48 2016 (r308330) @@ -407,9 +407,9 @@ dump_ctype(void) continue; } - if ((last_ct != NULL) && (last_ct->ctype == ctn->ctype)) { + if ((last_ct != NULL) && (last_ct->ctype == ctn->ctype) && + (last_ct->wc + 1 == wc)) { ct[rl.runetype_ext_nranges-1].max = wc; - last_ct = ctn; } else { rl.runetype_ext_nranges++; ct = realloc(ct, @@ -417,8 +417,8 @@ dump_ctype(void) ct[rl.runetype_ext_nranges - 1].min = wc; ct[rl.runetype_ext_nranges - 1].max = wc; ct[rl.runetype_ext_nranges - 1].map = ctn->ctype; - last_ct = ctn; } + last_ct = ctn; if (ctn->tolower == 0) { last_lo = NULL; } else if ((last_lo != NULL) && Modified: stable/11/usr.bin/localedef/parser.y ============================================================================== --- stable/11/usr.bin/localedef/parser.y Sat Nov 5 06:33:39 2016 (r308329) +++ stable/11/usr.bin/localedef/parser.y Sat Nov 5 09:46:48 2016 (r308330) @@ -27,6 +27,8 @@ * CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) * ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE * POSSIBILITY OF SUCH DAMAGE. + * + * $FreeBSD$ */ /* @@ -321,21 +323,18 @@ ctype_kw : T_ISUPPER cc_list T_NL | T_TOLOWER conv_list T_NL ; +cc_list : cc_list T_SEMI cc_range_end + | cc_list T_SEMI cc_char + | cc_char + ; -cc_list : cc_list T_SEMI T_CHAR +cc_range_end : T_ELLIPSIS T_SEMI T_CHAR { - add_ctype($3); + add_ctype_range($3); } - | cc_list T_SEMI T_SYMBOL - { - add_charmap_undefined($3); - } - | cc_list T_SEMI T_ELLIPSIS T_SEMI T_CHAR - { - /* note that the endpoints *must* be characters */ - add_ctype_range($5); - } - | T_CHAR + ; + +cc_char : T_CHAR { add_ctype($1); }