Optimized squidlook-importer.php committed to SVN.

squidlook-importer.php is the script that does the heavy lifting: importing the Squid log files.
It is based on the old mysar-importer.php.
The original author, Giannis Stollis, told me about the slowness of the importer. He was working on Mysar 3.0 with an importer written entirely in compiled C to make things faster, but he dropped out.
I looked at the code and optimized it a bit, using a few simple but powerful tactics.
The importer issues many, many SQL queries, so the bottleneck is those MySQL calls.
So:

  • Cache all small tables: the users table, for example, can be cached entirely in memory. This works well for organizations with up to 80,000 employees.
  • Buffered inserts: sending many rows in a single INSERT is much, much faster than issuing one INSERT per row. Simply do this:
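PHP:
$prefix = "INSERT INTO TABLE (field1, field2, ..., fieldn) VALUES ";
$rows = array();
<...>
foreach (....) {
    // collect one "(...)" group per row instead of sending it right away
    $rows[] = "($value1, $value2, ... $valuen)";
    if (count($rows) == $maxinserts) {
        // buffer full: send all buffered rows in a single query
        mysql_query($prefix . implode(", ", $rows));
        $rows = array();
    }
}
// send whatever is still in the buffer after the loop
if (count($rows) > 0) {
    mysql_query($prefix . implode(", ", $rows));
}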

At first the code above was not indented, because I had not yet found out how to enable that in WordPress. Then I found the code plugin at http://blog.igeek.info/wp-plugins/igsyntax-hiliter/ ! It's awesome :D
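
The table caching idea from the first bullet can be sketched roughly like this. It is only an illustration, not the actual importer code: the class name UserCache, the dbCalls counter and the two callbacks are made up for the example, and a plain PHP array stands in for the users table. In the real script the callbacks would run a SELECT and an INSERT against MySQL.

```php
<?php
// Hypothetical sketch: keep a small lookup table (e.g. users) in memory,
// so that each log line costs an array lookup instead of a SELECT.
class UserCache
{
    private $ids = array();   // user name => user id
    private $insertUser;      // callable(name) returning the new id
    public $dbCalls = 0;      // counts round trips, for demonstration only

    public function __construct(callable $loadAll, callable $insertUser)
    {
        $this->insertUser = $insertUser;
        // One query at startup loads the whole table.
        $this->ids = call_user_func($loadAll);
        $this->dbCalls++;
    }

    public function getId($name)
    {
        if (!isset($this->ids[$name])) {
            // Cache miss: the user is new, so insert it once and remember the id.
            $this->ids[$name] = call_user_func($this->insertUser, $name);
            $this->dbCalls++;
        }
        return $this->ids[$name];
    }
}

// Stand-in for the database: a plain array instead of a MySQL table.
$table = array('alice' => 1, 'bob' => 2);
$cache = new UserCache(
    function () use (&$table) { return $table; },
    function ($name) use (&$table) {
        $table[$name] = count($table) + 1;
        return $table[$name];
    }
);

foreach (array('alice', 'bob', 'alice', 'carol', 'carol') as $user) {
    $cache->getId($user);
}
echo $cache->dbCalls, "\n"; // 2 round trips instead of 5
```

The trade-off is memory: the whole table lives in a PHP array for the duration of the import, which is why this only makes sense for the small tables.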
