From: Yuri Pankov
To: freebsd-hackers
Subject: regex, multibyte locales, and word boundaries
Date: Fri, 23 Nov 2018 19:02:40 +0300
Message-ID: <5166f3c9-d587-a245-df21-8e50f075a8cc@yuripv.net>

Hi,

We have the following note in the BUGS section of regcomp(3):

----------------------------------------------------------------------
Word-boundary matching does not work properly in multibyte locales.
----------------------------------------------------------------------

It was added ages ago, along with multibyte support in our regex
implementation, but I can't think of any positive test case that would
show the problem is real, so that it could eventually be fixed.

Does anyone have real-life examples showing the bug?
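
One way to start hunting for a positive test case is a small harness like the one below. It is only a sketch and not a known reproducer: the helper names `probe` and `run_cases`, the Cyrillic sample strings, and the locale name passed in by the caller are all assumptions made for illustration. The `[[:<:]]` and `[[:>:]]` word-boundary forms are the ones documented in re_format(7); other regex implementations may reject them, which `probe` reports as a compile error rather than a mismatch.

```c
#include <locale.h>
#include <regex.h>
#include <stdio.h>

/*
 * Compile `pattern` as a POSIX ERE and match it against `subject`.
 * Returns 1 on a match, 0 if regexec() reports anything else, and
 * -1 if the pattern does not compile.  On a match, *so and *eo receive
 * the byte offsets of the whole match.
 */
int
probe(const char *pattern, const char *subject, long *so, long *eo)
{
	regex_t re;
	regmatch_t m[1];
	char errbuf[128];
	int rc;

	rc = regcomp(&re, pattern, REG_EXTENDED);
	if (rc != 0) {
		regerror(rc, &re, errbuf, sizeof(errbuf));
		fprintf(stderr, "regcomp(%s): %s\n", pattern, errbuf);
		return (-1);
	}
	rc = regexec(&re, subject, 1, m, 0);
	regfree(&re);
	if (rc != 0)
		return (0);
	*so = (long)m[0].rm_so;
	*eo = (long)m[0].rm_eo;
	return (1);
}

/*
 * Run a few hypothetical cases under `locale` and print what the local
 * regex implementation reports.  Returns -1 if the locale cannot be
 * set, otherwise the number of cases run.
 */
int
run_cases(const char *locale)
{
	/* Illustrative cases; this source file is assumed to be UTF-8. */
	static const struct {
		const char *pattern;
		const char *subject;
	} cases[] = {
		{ "[[:<:]]мир[[:>:]]", "привет мир" },	/* separate word */
		{ "[[:<:]]мир[[:>:]]", "приветмир" },	/* tail of a longer word */
		{ "[[:<:]]мир[[:>:]]", "мирный" },	/* head of a longer word */
	};
	size_t i, n = sizeof(cases) / sizeof(cases[0]);
	long so, eo;

	if (setlocale(LC_ALL, locale) == NULL)
		return (-1);
	for (i = 0; i < n; i++) {
		switch (probe(cases[i].pattern, cases[i].subject, &so, &eo)) {
		case 1:
			printf("%s: match at bytes %ld-%ld\n",
			    cases[i].subject, so, eo);
			break;
		case 0:
			printf("%s: no match\n", cases[i].subject);
			break;
		default:
			printf("%s: pattern rejected\n", cases[i].subject);
			break;
		}
	}
	return ((int)n);
}
```

Calling `run_cases("en_US.UTF-8")` from a `main()` (the locale name is an assumption and must be installed) prints one line per case. If the locale classifies Cyrillic letters as word characters, the intended semantics would give a match only for the first subject, so any other result would be a candidate for the bug described in the BUGS note.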