


I was very surprised to see that adding the --ignore-case option to grep can slow down a search by a factor of 50. I've tested this on two different machines with the same result. I am curious to find an explanation for the huge performance difference.

I would also like to see an alternative command to grep for case-insensitive searches. I don't need regular expressions, just fixed-string searching.

First, the test file is a 50 MB plain text file with some dummy data. You may use the following command to generate it:

Create test.txt:
yes all work and no play makes Jack a dull boy | head -c 50M > test.txt

In my demonstration of the slowness, adding the --ignore-case option made the command 57 times slower.

Googling around, I found a discussion on grep being slow in the UTF-8 locale. So I ran the following test, and it did speed up:

$ time LANG=POSIX grep --ignore-case fun test.txt

The default locale on my machine is en_US.UTF-8, so setting it to POSIX seems to have given a performance boost, but now of course I can't search correctly on Unicode text, which is undesirable.

We could use Perl instead. It is faster than the case-insensitive grep, but still 5.5 times slower than the case-sensitive grep. And the POSIX grep above is about twice as fast.

$ time perl -ne '/fun/i && print' test.txt

So I'd love to find a fast, correct alternative and an explanation if anyone has one.

The two machines tested above were both running Ubuntu: one 11.04 (Natty Narwhal), the other 12.04 (Precise Pangolin). Running the same tests on a CentOS 5.3 machine produces interesting results.
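To reproduce the comparison described above, the three grep variants can be wrapped in small shell functions and timed one by one. This is only a minimal sketch: the function names are hypothetical helpers, `-F` is added because only fixed-string matching is needed, and `LC_ALL=C` stands in for `LANG=POSIX` (the C and POSIX locales are the same, and `LC_ALL` also overrides any other `LC_*` settings). No timings are claimed here; they will differ by machine and grep version.

```shell
#!/bin/sh
# Sketch only: the function names below are hypothetical helpers, not part of grep.

# make_test_file FILE BYTES
# Fill FILE with BYTES bytes of the dummy sentence, one sentence per line.
# "|| true" absorbs the SIGPIPE exit status of yes once head has read enough.
make_test_file() {
    { yes all work and no play makes Jack a dull boy || true; } | head -c "$2" > "$1"
}

# Each counter takes PATTERN FILE and prints the number of lines in FILE
# that contain the fixed string PATTERN.

# Case-sensitive search in the current locale.
count_sensitive() { grep -c -F -e "$1" "$2"; }

# Case-insensitive search in the current locale (the slow case in a UTF-8 locale).
count_insensitive() { grep -c -F -i -e "$1" "$2"; }

# Case-insensitive search in the C/POSIX locale (byte-oriented, not Unicode-aware).
count_posix() { LC_ALL=C grep -c -F -i -e "$1" "$2"; }
```

In bash, after `make_test_file test.txt 52428800` (50 MiB written out in bytes, which is what GNU head's `50M` suffix means), each variant can be timed with, for example, `time count_insensitive fun test.txt`.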
