$Id: REGEX.txt,v 1.3 2005/02/22 06:28:14 jpr5 Exp $

Date: 2/21/05

A note about PCRE vs. GNU regex:

      I ran several tests comparing GNU regex 0.12 to the PCRE 5.0
      library using 100 million loop iterations: optimized
      vs. non-optimized, match vs. non-match.  The conclusion is that
      GNU regex is roughly twice as fast as PCRE when the pattern
      matches, while PCRE is somewhat faster when it does not, and
      that with regular expression engines compiler optimization
      matters significantly.

      (Please note that I also tried other third-party regex
      libraries, such as RxSpencer and libhackerlab, and none came
      close in speed.)

      The test subject was "how now brown cow".  The pattern I
      searched for was "now brown" in the match case and "not brown"
      in the non-match case.  Obviously, the speed of a match depends
      directly on the regex itself, and a well-formulated regex can
      perform more efficiently than a simple substring pattern like
      these.  However, this test is reasonably indicative of how most
      people use ngrep, so the test results are still useful.
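
      The kind of loop described above can be sketched as follows.
      This is not the harness used for the numbers in this file: it
      is a minimal illustration that assumes the POSIX regcomp() and
      regexec() interface from <regex.h> rather than the native GNU
      regex or PCRE calls, and the names count_matches() and
      time_run() are made up for the example.

```c
#include <assert.h>
#include <regex.h>
#include <stddef.h>
#include <time.h>

/* Hypothetical helper: compile `pattern` once, then run it against
 * `subject` `iterations` times.  Returns the number of successful
 * matches, or -1 if the pattern fails to compile. */
long count_matches(const char *pattern, const char *subject, long iterations)
{
    regex_t re;
    long hits = 0;

    assert(iterations >= 0);

    /* REG_NOSUB: only match/no-match is needed, not sub-match offsets. */
    if (regcomp(&re, pattern, REG_EXTENDED | REG_NOSUB) != 0)
        return -1;

    for (long i = 0; i < iterations; i++)
        if (regexec(&re, subject, 0, NULL, 0) == 0)
            hits++;

    regfree(&re);
    return hits;
}

/* Hypothetical helper: run count_matches() and return the CPU time it
 * took, in seconds, as measured by clock().  The match count is stored
 * in *hits. */
double time_run(const char *pattern, const char *subject,
                long iterations, long *hits)
{
    clock_t start = clock();

    *hits = count_matches(pattern, subject, iterations);
    return (double)(clock() - start) / CLOCKS_PER_SEC;
}
```

      Calling time_run("now brown", "how now brown cow", 100000000L,
      &hits) would correspond to the match case, and the same call
      with "not brown" to the non-match case.  Compiling the pattern
      outside the loop keeps the measurement focused on matching
      rather than on pattern compilation.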

      Granted, on the single-match level the time difference is
      absolutely unnoticeable (it took 100 million loop iterations to
      measure something worthwhile), so this may not mean anything to
      you.  The stripped binary sizes are also within 10k of each
      other on the test compile box.

      If licensing terms matter more to you than speed, then compile
      against PCRE, which is available under the BSD license.
      Otherwise, the GNU regex library is the best candidate, and the
      speed can really help when piping those 500MB pcap dump files
      through ngrep over and over for analysis.


Test results:

     CPU: Intel Pentium-M 2GHz
          L1 I cache: 32K, L1 D cache: 32K
          L2 cache: 2048K

     Iterations: 100M

                           match             nomatch

         [-O0]
     GNU regex-0.12    17.369s/17.385s    32.656s/32.069s
     PCRE-5.0          35.840s/35.795s    25.340s/25.344s

         [-O2]
     GNU regex-0.12    12.240s/12.280s    19.512s/19.489s
     PCRE-5.0          24.580s/24.578s    17.235s/17.238s
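
     To put the totals above in per-match terms, divide by the
     iteration count.  Taking the first figure in each cell (the file
     does not say what the two figures in a cell represent), the -O2
     match case works out to about 122.4ns per iteration for GNU
     regex and about 245.8ns for PCRE.  The helper names below are
     made up for this sketch.

```c
#include <assert.h>

/* Hypothetical helper: average cost of one iteration, in nanoseconds. */
double ns_per_iteration(double total_seconds, double iterations)
{
    assert(iterations > 0.0);
    return total_seconds * 1e9 / iterations;
}

/* Hypothetical helper: how many times slower `slow_seconds` is than
 * `fast_seconds`. */
double slowdown(double slow_seconds, double fast_seconds)
{
    assert(fast_seconds > 0.0);
    return slow_seconds / fast_seconds;
}
```

     For example, slowdown(24.580, 12.240) gives roughly 2.0 for the
     -O2 match case (PCRE over GNU regex), and slowdown(19.512,
     17.235) gives roughly 1.13 for the -O2 non-match case (GNU regex
     over PCRE).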



