Date: Wed, 08 Nov 2017 12:59:43 +0000
From: bugzilla-noreply@freebsd.org
To: freebsd-bugs@FreeBSD.org
Subject: [Bug 223532] egrep -i is terrible slow if utf-8 locale is enabled
Message-ID: <bug-223532-8@https.bugs.freebsd.org/bugzilla/>

https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=223532

            Bug ID: 223532
           Summary: egrep -i is terrible slow if utf-8 locale is enabled
           Product: Base System
           Version: CURRENT
          Hardware: Any
                OS: Any
            Status: New
          Severity: Affects Only Me
          Priority: ---
         Component: bin
          Assignee: freebsd-bugs@FreeBSD.org
          Reporter: wosch@FreeBSD.org

egrep -i is terribly slow if the locale is set to UTF-8. In the timings
below, the case-insensitive search takes 8.47 s against 0.11 s for the
case-sensitive one, which makes it 77 times slower. With LANG=C, both
searches take about 0.10 s.

How to repeat:

First, we create a 100MB text file:

for i in $(seq 1 20); do man tcsh; done > /tmp/tcsh20
for i in $(seq 1 20); do cat /tmp/tcsh20; done > /tmp/tcsh400

$ du -hs /tmp/tcsh400
 99M    /tmp/tcsh400

# case-sensitive search with UTF-8
LANG=en_CA.UTF-8 time egrep -c foobar /tmp/tcsh400
0
        0.11 real         0.06 user         0.04 sys

# case-insensitive search with UTF-8, terribly slow
LANG=en_CA.UTF-8 time egrep -ic foobar /tmp/tcsh400
0
        8.47 real         8.42 user         0.04 sys

# case-sensitive search with ASCII
LANG=C time egrep -c foobar /tmp/tcsh400
0
        0.10 real         0.06 user         0.03 sys

# case-insensitive search with ASCII
LANG=C time egrep -ic foobar /tmp/tcsh400
0
        0.10 real         0.07 user         0.03 sys

-- 
You are receiving this mail because:
You are the assignee for the bug.