Date: Thu, 27 Oct 2022 13:43:24 +0000
From: bugzilla-noreply@freebsd.org
To: bugs@FreeBSD.org
Subject: [Bug 264275] sed complaining about trailing backslash when using Umlauts
Message-ID: <bug-264275-227-YYiGRjxeCs@https.bugs.freebsd.org/bugzilla/>
In-Reply-To: <bug-264275-227@https.bugs.freebsd.org/bugzilla/>
References: <bug-264275-227@https.bugs.freebsd.org/bugzilla/>

https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=264275

Daniel Tameling <tamelingdaniel@gmail.com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |tamelingdaniel@gmail.com

--- Comment #1 from Daniel Tameling <tamelingdaniel@gmail.com> ---
The error comes from trying to compile the umlaut as a regex. I managed to
create a small reproducer that just calls regcomp. The error seems to come
from this snippet in the p_simp_re function in lib/libc/regex/regcomp.c:

    if ((c & BACKSL) == 0 || may_escape(p, wc))
        ordinary(p, wc);
    else
        SETERROR(REG_EESCAPE);

Both checks in the if statement are false, and thus we end up with the
trailing backslash error. In may_escape, this is the return statement that
gets taken:

    if (isalpha(ch) || ch == '\'' || ch == '`')
        return (false);

ch is the wint_t representation of the umlaut, which is 0xe4. In
de_DE.ISO8859-1, the isalpha call returns true. (If I do it with a UTF-8 ä
in a UTF-8 locale, ch also becomes 0xe4, but the isalpha call returns false,
so this doesn't trigger the trailing backslash error.)

-- 
You are receiving this mail because:
You are the assignee for the bug.
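
The comment mentions a reproducer that just calls regcomp but does not
include it. The following is only a rough sketch of what such a reproducer
might look like, not the one actually used. The function name
try_escaped_umlaut is hypothetical, and the pattern (a backslash followed by
byte 0xe4) is an assumption inferred from the BACKSL check in the quoted
snippet.

```c
#include <assert.h>
#include <locale.h>
#include <regex.h>
#include <stdbool.h>
#include <stddef.h>

/*
 * Hypothetical helper: switch to the named locale and compile a basic
 * regular expression made of a backslash followed by byte 0xe4
 * (a-umlaut in ISO 8859-1).
 *
 * Returns false and leaves *rc_out untouched if the locale cannot be
 * selected. Otherwise stores the regcomp() result in *rc_out and
 * returns true. Note that this changes the process-wide locale.
 */
static bool
try_escaped_umlaut(const char *locale_name, int *rc_out)
{
	regex_t re;
	/* Assumed pattern: escaped umlaut, i.e. the two bytes '\' and 0xe4. */
	const char *pattern = "\\\xe4";
	int rc;

	assert(locale_name != NULL);
	assert(rc_out != NULL);

	if (setlocale(LC_ALL, locale_name) == NULL)
		return (false);

	rc = regcomp(&re, pattern, 0);	/* cflags 0 selects basic regex syntax */
	if (rc == 0)
		regfree(&re);		/* only free a successfully compiled regex */

	*rc_out = rc;
	return (true);
}
```

Calling try_escaped_umlaut("de_DE.ISO8859-1", &rc) from a small main and
comparing rc against REG_EESCAPE would show whether a given system takes the
error path described above. The result depends on the libc in use and on
whether that locale is installed, so no particular output is claimed here.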