The following flowchart says quite a bit about software development as a whole, and why it’s so easy to fail. It’s almost 100% true for any system with more than one programmer, and even a single programmer often resorts to adding hacks upon hacks upon hacks just to make something work, leaving behind an ill-maintained piece of code.

Well, OK, I won the prize.

I won some goodies!


I won the door prize at the local semi-upscale supermarket. It has a bottle of wine, chocolate, cookies, cake, and some multigrain snacks.

(Yes, I know my grout is ugly; it’s two different shades. Why they did that, I don’t know.)

Bored this evening, I've not so much reinvented the wheel as invented another way to use it.

%time php z63test.php /usr/share/dict/american-english
Normal length: 931467 bytes.
z63 length: 336720 bytes... 63% smaller!
Uncompressed length: 931467 bytes.
php z63test.php /usr/share/dict/american-english  0.28s user \
  0.02s system 99% cpu 0.304 total
Yes, you read that correctly. It took about 0.3 seconds on my 1GHz laptop to compress almost 1MB of data down to about a third of its size, into an XML-compatible data format, and then turn it back into its original form, entirely in PHP!

The system degrades cleanly: if compression won't help, it won't force it. It will, however, still encode the string, so it's transferable via web-enabled applications. You can even feed it binary data, and it's Unicode clean (that's what it's primarily designed for: letting binary data be safely encapsulated with less overhead than existing schemes).
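The "won't force it" behaviour can be sketched roughly as follows. z63's own format isn't shown in this post, so this sketch swaps in PHP's gzdeflate() and base64_encode() as stand-ins (zlib extension assumed). The names pack_blob() and unpack_blob() and the one-character 'Z'/'R' prefix are made up for illustration.

```php
<?php
// Hypothetical sketch, NOT the z63 format: gzdeflate() and base64_encode()
// stand in for z63's own compressor and text-safe encoding.

// Compress only when it actually shrinks the data, then always encode.
function pack_blob($data) {
    $deflated = gzdeflate($data, 9);
    if ($deflated !== false && strlen($deflated) < strlen($data)) {
        return 'Z' . base64_encode($deflated);  // 'Z' marks a compressed payload
    }
    return 'R' . base64_encode($data);          // 'R' marks a raw payload
}

// Reverse pack_blob(): decode, then inflate only if it was compressed.
function unpack_blob($blob) {
    $payload = base64_decode(substr($blob, 1));
    return ($blob[0] === 'Z') ? gzinflate($payload) : $payload;
}

$text   = str_repeat("the quick brown fox ", 200);  // repetitive, compresses well
$packed = pack_blob($text);
echo $packed[0], ": ", strlen($text), " -> ", strlen($packed), " bytes\n";

$tiny       = "\x00\xff";                           // too short to benefit
$packedTiny = pack_blob($tiny);
echo $packedTiny[0], ": ", strlen($tiny), " -> ", strlen($packedTiny), " bytes\n";
```

Either way the result is plain ASCII, so it can sit inside an XML element or a form field, and the prefix tells the reader which path to take on the way back.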

The result speaks for itself (note that this data is rather worthless; it's just a free-form example):

Z63:
<?xml version="1.0" standalone="yes"?>
<FileData>
  <FileName>/usr/dict/words</FileName>
  <FileDate>2005-12-20T12:58:58-06:00</FileDate>
  <FileMD5>e954ccd9535d5550d8b632972b5a10ed</FileMD5>
  <FileBytes>931467</FileBytes>
</FileData>
<PassData>
  <Date>2007-11-26T07:26:57-06:00</Date>
  <MD5>03a43176887b657f90eab09e177c3a7d</MD5>
  <Bytes>336720</Bytes>
  <DataBlob>...</DataBlob>
</PassData>

Base64:
<?xml version="1.0" standalone="yes"?>
<FileData>
  <FileName>/usr/dict/words</FileName>
  <FileDate>2005-12-20T12:58:58-06:00</FileDate>
  <FileMD5>e954ccd9535d5550d8b632972b5a10ed</FileMD5>
  <FileBytes>931467</FileBytes>
</FileData>
<PassData>
  <Date>2007-11-26T07:27:36-06:00</Date>
  <MD5>bf1ecb99327ce7fb1f7496751039ea19</MD5>
  <Bytes>1241956</Bytes>
  <DataBlob>...</DataBlob>
</PassData>
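The Base64 byte count above is exactly what the encoding predicts: Base64 emits 4 characters for every 3 input bytes (padding the final group), and 931467 happens to be divisible by 3. A quick sketch of that arithmetic, using the sizes reported in the test run; base64_length() is a hypothetical helper name.

```php
<?php
// Hypothetical helper: length of the Base64 encoding of $n input bytes.
// Base64 maps each 3-byte group (the last one padded) to 4 characters.
function base64_length($n) {
    return 4 * (int) ceil($n / 3);
}

$original = 931467;  // bytes, from the test run above
$z63      = 336720;  // bytes, z63 output reported above
$base64   = base64_length($original);

echo "Base64: $base64 bytes\n";  // 1241956, matching the <Bytes> value above
printf("Base64 grows the input by %.1f%%\n",
       100 * ($base64 - $original) / $original);
printf("z63 output is %.1f%% of the Base64 size\n",
       100 * $z63 / $base64);
```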


Heck, it even compresses the front page of my website pretty well.
%time php z63test.php http://www.holwegner.com/
Normal length: 9911 bytes.
z63 length: 5077 bytes... 48% smaller!
Uncompressed length: 9911 bytes.
php z63test.php http://www.holwegner.com/  0.03s user \
  0.03s system 6% cpu 0.924 total

Sometimes you just can't see which files are including what, as is the case with a closed-source system I'm working with on a support level.

Sadly, these people are not very good with their code, and consistently define functions with the same name in different files. So if I need functions from one file and also include the other, PHP dies on the name collision.

This prompted me to write this simple little PHP4- and PHP5-compatible 'sniffer', which, when executed, will tell you whether or not a function is available within an included file, along with any files that file may include.

It also has skeletal abilities to override the built-in functions and call your own instead, which can then pass all data on to the native function. This is done for include() in PHP5. However, since this development system is still on PHP4, I've only got get_included_files() to keep me warm, but that's fine.

<?php
// FunctionSniffer v0.1 by Shawn Holwegner <shawnospamn@holwegner.com>
// --
// This software is not public domain, however you may use it, and modify
// it for your own use, but please do not distribute modified copies.
//
// Usage: php functionSniffer.php <file-to-include> <function-name>

// On PHP 5 with pecl's apd loaded, try to route include() through
// override_include() so each include is reported as it happens.
if (version_compare(phpversion(), "5.0.0", ">=")) {
  if (function_exists('rename_function')
      && function_exists('override_function')) {
    rename_function('include', 'std_include');
    override_function('include', '$string',
                      'return override_include($string);');
  }
}

// Pull in the target file (and, transitively, whatever it includes).
if (isset($argv[1]) && file_exists($argv[1]))
  @include_once($argv[1]);

// Report whether the requested function is now defined.
if (isset($argv[2]) && function_exists($argv[2])) {
  echo $argv[1] . " has function " . $argv[2] . "().\n";
  $files = "";
  foreach (get_included_files() as $filename) {
    // Skip the sniffer itself.
    if ($filename != __FILE__)
      $files .= "\t" . $filename . "\n";
  }
  if (!empty($files))
    echo "It asked for files:\n$files\n";
}
exit;

// Logs the include, then hands off to the renamed original.
function override_include($string) {
  echo "override_include(): We're including $string\n";
  return std_include($string);
}
?>
Here's how it works:
%php ~/code/php/misc/functionSniffer.php helper.php my_db_query
helper.php has function my_db_query().
It asked for files:
        /home/www/protected.com/productname/includes/help.php
        /home/www/protected.com/productname/includes/head.php
        /home/www/protected.com/productname/includes/core_functions.php
        /home/www/protected.com/productname/includes/database.php
        /home/www/protected.com/productname/includes/config.php
        /home/www/protected.com/productname/includes/config_override.php
        /home/www/protected.com/productname/includes/admin.php
With PHP 5, it will even tell you in real time as things are include()'d, but that part is still somewhat 'to be developed': it requires pecl's apd loaded as a Zend extension, which doesn't work with Zend Encoder.
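On PHP 5 and later there is another way to ask "where did this function come from?" that doesn't need apd: the Reflection API. Here is a minimal sketch, assuming the target file can simply be included. where_is_function() is a hypothetical helper, not part of FunctionSniffer, and whether it reports anything useful for Zend-encoded files is untested here.

```php
<?php
// Hypothetical helper (not part of FunctionSniffer): report the file and
// line where a user-defined function was declared, via ReflectionFunction.
function where_is_function($name) {
    if (!function_exists($name)) {
        return null;                    // not defined (yet)
    }
    $ref = new ReflectionFunction($name);
    if ($ref->isInternal()) {
        return 'built-in';              // internal functions have no source file
    }
    return $ref->getFileName() . ':' . $ref->getStartLine();
}

// Demo: write a throwaway file that defines a function, include it,
// then look the function up.
$tmp = tempnam(sys_get_temp_dir(), 'sniff');
file_put_contents($tmp, "<?php\nfunction my_db_query() { return true; }\n");
include_once $tmp;

echo "my_db_query: ", where_is_function('my_db_query'), "\n";
echo "strlen: ", where_is_function('strlen'), "\n";
unlink($tmp);
```

Unlike get_included_files(), which only lists everything that was pulled in, this points at the one file that actually declared the function.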