From: bugzilla-noreply@freebsd.org
To: bugs@FreeBSD.org
Subject: [Bug 232374] /bin/sh can not handle ja_JP.eucJP character code
Date: Thu, 08 Nov 2018 13:08:41 +0000

https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=232374

Yuichiro NAITO changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |naito.yuichiro@gmail.com

--- Comment #2 from Yuichiro NAITO ---
In my investigation, the main cause of this problem is that the read_char()
function doesn't retry read(2) from STDIN when mbrtowc(3) returns -2
(an incomplete multibyte sequence). In lib/libedit/read.c, we can see the
following code, which retries only when the CHARSET_IS_UTF8 flag is set.

```
	switch (ct_mbrtowc(cp, cbuf, cbp)) {
	case (size_t)-2:
		/*
		 * We don't support other multibyte charsets.
		 * The second condition shouldn't happen
		 * and is here merely for additional safety.
		 */
		if ((el->el_flags & CHARSET_IS_UTF8) == 0 ||
		    cbp >= MB_LEN_MAX) {
			errno = EILSEQ;
			*cp = L'\0';
			return -1;
		}
		/* Incomplete sequence, read another byte. */
		goto again;
```

Of course, the CHARSET_IS_UTF8 flag is not set in an eucJP environment.
If I remove the CHARSET_IS_UTF8 flag check, /bin/sh can read eucJP input.

Removing the check exposed another problem: the command history
miscalculates the length of eucJP characters, because the ct_enc_width()
function in chartype.c doesn't understand any charset other than UTF-8.
After I rewrote ct_enc_width() to use wctomb(3), the command history
problem was fixed.

With these two changes, we don't need the CHARSET_IS_UTF8 flag any more.
The CHARSET_IS_UTF8 flag controls the NARROW_HISTORY flag, and the
NARROW_HISTORY flag is used only in the HIST_FUN definition.

```
#ifdef WIDECHAR
#define HIST_FUN(el, fn, arg) \
    (((el)->el_flags & NARROW_HISTORY) ? hist_convert(el, fn, arg) : \
	HIST_FUN_INTERNAL(el, fn, arg))
#else
#define HIST_FUN(el, fn, arg) HIST_FUN_INTERNAL(el, fn, arg)
#endif
```

In a WIDECHAR environment, hist_convert() should always be called,
because hist_convert() is a multibyte-aware function.

I opened a new differential on Phabricator with all of my fixes.

https://reviews.freebsd.org/D17903

I believe my fix solves this problem and doesn't affect charsets other
than eucJP. Please review my code.

Hirabayashi-san:
Could you please try my patch from Phabricator and check if this problem is
fixed? I don't think /bin/sh is wrong.
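
For readers following the comment above, here is a minimal sketch of the two ideas it describes: reading another byte whenever mbrtowc(3) reports an incomplete sequence, regardless of charset, and asking wctomb(3) for the encoded width of a character. This is not the patch in D17903 and not libedit's code. The names decode_one(), enc_width(), struct mem_src and mem_next() are hypothetical and exist only for illustration, and the byte source stands in for read(2) on STDIN.

```c
#include <errno.h>
#include <limits.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>
#include <wchar.h>

/* Hypothetical byte source: returns the next byte (0..255), or -1 at end. */
typedef int (*next_byte_fn)(void *ctx);

/* Hypothetical in-memory source, standing in for read(2) on STDIN. */
struct mem_src {
	const unsigned char *data;
	size_t len;
	size_t pos;
};

static int
mem_next(void *ctx)
{
	struct mem_src *s = ctx;

	return (s->pos < s->len ? s->data[s->pos++] : -1);
}

/*
 * Decode one wide character in the current locale's charset, fetching
 * one more byte each time mbrtowc(3) returns (size_t)-2.
 * Returns 1 on success, 0 at end of input, -1 on error (errno = EILSEQ).
 */
static int
decode_one(next_byte_fn next, void *ctx, wchar_t *out)
{
	char buf[MB_LEN_MAX];
	size_t len = 0;

	for (;;) {
		mbstate_t st;
		size_t r;
		int c = next(ctx);

		if (c < 0) {
			if (len == 0)
				return (0);	/* clean end of input */
			errno = EILSEQ;		/* input ended mid-character */
			return (-1);
		}
		buf[len++] = (char)c;

		/* Re-parse the accumulated bytes from the initial state. */
		memset(&st, 0, sizeof(st));
		r = mbrtowc(out, buf, len, &st);
		if (r == (size_t)-2) {
			/* Incomplete sequence: read another byte if room. */
			if (len >= MB_LEN_MAX) {
				errno = EILSEQ;
				return (-1);
			}
			continue;
		}
		if (r == (size_t)-1) {
			errno = EILSEQ;		/* invalid sequence */
			return (-1);
		}
		return (1);
	}
}

/*
 * Encoded width of a wide character, in bytes, in the current locale.
 * Returns -1 if the character cannot be represented.
 */
static int
enc_width(wchar_t c)
{
	char buf[MB_LEN_MAX];

	return (wctomb(buf, c));
}
```

A caller would first select the charset with setlocale(LC_CTYPE, "") and then call decode_one() in a loop. Because both functions defer to the C library's locale-aware conversions, neither needs a UTF-8 special case, which is the point made in the comment.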