PHP Error Handling and Debugging – Part 1

This article describes the process of testing PHP code. The tips explained here can help decrease the code/test cycle time.

The first thing that you must know in order to plan your code/test process is the environment in which your code will run.

If you have full control of the system, less configuration is required. In this case you can rely on the default settings and simply need to know where the logs are kept by default. On a typical LAMP (Apache) system you can find the log files in /var/log/httpd. Check the documentation for the operating system that you use, as some operating systems use a different directory (e.g. some Ubuntu versions use /var/log/apache2). By default, error messages from PHP will be written to this directory.

If you are developing on a server where you don’t have access to the default logs, you can often configure where your log messages are sent (on hosts that read per-directory php.ini files, typically those running PHP as CGI/FastCGI) by putting a php.ini file containing the directive

error_log = path_to_log

in the root of the domain.
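If the host doesn’t honor a per-directory php.ini, the same setting can usually be applied at runtime from the script itself. A minimal sketch (the /tmp path and the test message are assumptions):

```php
// Runtime equivalent of the php.ini directive; the log path is an assumption.
ini_set('log_errors', '1');
ini_set('error_log', '/tmp/php_errors.log');
trigger_error('logging test', E_USER_NOTICE);  // now recorded in /tmp/php_errors.log
```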

With this information in mind, we can begin to find code errors.

There are two error types to look for:

  1. parse
  2. runtime

A parse error is the first thing to look for when testing new or modified code. This can be something like a missing semicolon or another syntax mistake. If a parse error occurs, it will be sent to PHP’s error_log. A simple way to find this kind of error is to load the file directly in a browser (e.g. an AJAX script that would not normally run in a browser could be tested this way for parse errors). With a default PHP installation the parse error will be shown on the screen.

Most errors that are encountered are runtime errors. There are two kinds of runtime errors:

  1. exception
  2. functional

The first kind of runtime error happens when a statement or function call that is grammatically correct encounters an unexpected circumstance, such as an invalid parameter (e.g. fopen('file_that_doesnt_exist', 'r')). This kind of error can only be seen during an actual run of the code with valid inputs. Opening the file in the browser directly usually will not find it, since the inputs will not be those that would typically be encountered. For example, opening an AJAX script that relies on the $_POST variable for its input will typically not exercise many of the branches because the $_POST variables are missing. To find this kind of error, run the script as it would typically be run and check the error log.
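A quick way to see this class of error in isolation is to force one. The snippet below is a hedged sketch: the path is hypothetical, and the @ suppresses the on-screen warning so only the check and log entry remain.

```php
// fopen() on a missing file returns false and raises a warning at runtime,
// even though the statement itself is grammatically valid.
$fh = @fopen('/tmp/file_that_doesnt_exist.txt', 'r');
if ($fh === false) {
    error_log('fopen failed: /tmp/file_that_doesnt_exist.txt');
}
```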

A functional runtime error is when the code runs and doesn’t generate an error, but doesn’t produce the expected outputs. To find this kind of error, use one or more of the following techniques:

  • echo/printf
  • error_log
  • try/catch

The simplest way to find errors is by adding echo statements to the code. This method can be tedious and slower than the others, but a few well-placed echo statements that use print_r to show the value of key return data structures can quickly illuminate the source of the malfunction. The drawback of this method is that because it outputs directly to stdout (the web browser), it is only available if the script can be run in the browser directly with typical inputs. Many times this is not possible (e.g. for AJAX or cron code).
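For example, a few throwaway lines like the following can expose a key data structure ($result here is a stand-in for whatever your code actually returns):

```php
// Stand-in data; in practice this would be a query result or parsed input.
$result = ['status' => 'ok', 'rows' => 3];
$dump = print_r($result, true);    // capture the dump as a string
echo '<pre>' . $dump . '</pre>';   // <pre> keeps print_r's layout readable
```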

A more general way of debugging is to use the error_log function instead of echo. With the error_log function you can direct the messages to a file of your choosing with

error_log($message,3,"path/filename")

or to the configured error_log mentioned earlier via

error_log($message)

A bonus of the single-argument error_log() form is that each logged message is automatically timestamped; with the three-argument form the string is written to the file verbatim, so you must add your own timestamp.
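For example, prefixing a timestamp when appending to a custom file (the path and message are assumptions):

```php
$logFile = '/tmp/debug.log';                                  // assumed writable path
$message = date('[Y-m-d H:i:s] ') . "lookup failed for user_id=42\n";
error_log($message, 3, $logFile);                             // type 3 appends verbatim
```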

If a runtime error is expected, wrap the code in a try/catch statement so the error can be ignored or otherwise handled in a way that doesn’t stop the script abruptly. The script will continue to run, and the error can be logged from the catch block, telling you in which section of code it occurred. If a blocking error goes uncaught in an AJAX responder script, the calling application may receive a malformed response and fail with a parse error of its own. Note that a try/catch statement only helps when a blocking exception occurs; it will not help debug functional runtime errors. The structure of this type of code testing is as follows:

try {
    // your new code
} catch (Exception $E) {
    error_log($E->getMessage());
}

In this article we have discussed simple code/test cycle techniques for PHP.  Tune in next time for part 2 where we will review using a debugger such as XDebug.

Javascript Compression — Tools and Process

This article is a review of the tools and processes that I have tested, and gives the pluses and minuses of each.

Software                   Uncompressed   Compressed   Percent   Comment
Closure Compiler           39K            16K          59%       with ADVANCED_OPTIMIZATIONS
YUI Compressor             39K            22K          44%
perl-Javascript-Minifier   39K            25K          36%


perl-Javascript-Minifier
Since CPAN’s JavaScript::Minifier and CSS::Minifier modules are readily available on Linux, they are a good starting point. JavaScript::Minifier is simple to use. Here is a script that you can try to see how it works:

#!/usr/bin/perl
use strict;
use warnings;
use JavaScript::Minifier qw(minify);

my $iFile = $ARGV[0];
(my $oFile = $iFile) =~ s/[.]js$//;
$oFile = "${oFile}_perl_min.js";
open(my $in, '<', $iFile) or die "Cannot open $iFile: $!";
open(my $out, '>', $oFile) or die "Cannot open $oFile: $!";
minify(input => $in, outfile => $out);
close($in);
close($out);

In my tests it didn’t break my code, but it did surface errors caused by mistakes in my code. I used the Google Chrome JSLint extension to find them. JSLint only works on pure JavaScript, but strings are not parsed; thus you can use PHP to initialize variables by putting the PHP code inside quotes and still check the file with JSLint.


YUI Compressor
The YUI Compressor is Yahoo’s library, and works better than perl-Javascript-Minifier. Here is an example command for using YUI Compressor:

java -jar yuicompressor-2.4.7.jar --type js -o filename_yui_min.js filename.js

A nice feature of the YUI Compressor is that it can accept JavaScript strings from the command line, which makes it simple to script. Its goal is to not break code, and in my tests this held true.


Closure Compiler
The Google Closure Compiler is the most advanced of the ones that I tested. It has a simple mode that doesn’t break code and an ADVANCED_OPTIMIZATIONS option that produces very compressed code. Here is an example command for using the Closure Compiler in simple mode:

java -jar compiler.jar --js filename.js --js_output_file filename_closure_min.js --externs externs.js

And similarly for advanced mode:

java -jar compiler.jar --compilation_level ADVANCED_OPTIMIZATIONS --js filename.js --js_output_file filename_closure_min.js --externs externs.js

Similar to perl-Javascript-Minifier, the Closure Compiler only works on pure JavaScript files. Because of the effectiveness of its optimizations, it can break code. To use it effectively, you need to design your JavaScript with minification in mind. Typically you want to use your JavaScript as a library (i.e. as handlers for events such as mouse clicks); to do this, you need to add a small amount of code that preserves the function names that will be available to external scripts. Similarly, if you want to use external libraries in your library, you need to add extern declarations that preserve the external symbols. Fewer modifications are required for simple mode than for advanced mode. I wanted to use advanced mode for a script that contains jQuery calls (including jQuery Mobile), but wasn’t able to find a way to preserve the jQuery and $ symbols. I tried using --externs with the externs file available as an add-on from Google’s svn, but this didn’t solve the problem. Therefore I recommend using simple mode for files containing jQuery and advanced mode for files that do not.
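One common way to preserve a function name through ADVANCED_OPTIMIZATIONS is to export it under a quoted property name, since the compiler does not rename string keys. A minimal sketch (initApp is a hypothetical entry point; in a browser you would typically assign onto window):

```javascript
// Hypothetical library entry point.
function initApp(name) {
  return 'hello ' + name;
}
// Quoted property names survive ADVANCED_OPTIMIZATIONS renaming, so external
// scripts can still call the function after compilation.
globalThis['initApp'] = initApp;
```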


In summary, of the tools reviewed the Google Closure Compiler is the most effective, perl-Javascript-Minifier is the least likely to break code, and the YUI Compressor is a compromise between these extremes. Additionally, each of these tools can be run locally on your machine.

TWiki for a dynamic Company Operations Manual

This article is an overview of the TWiki system with an emphasis on its usage as an intranet and Company Operations Manual. A small business can benefit from such a system in several ways:

  • share knowledge and overlap responsibilities
  • document experience and improve processes
  • identify and facilitate process automation

A Company Operations Manual seeks to codify the processes that the company’s operation depends upon. It must

  • be simple to use/extend
  • be easy to navigate
  • have properties that facilitate improvement and revision
  • allow compartmentalization

TWiki fits each of these needs abundantly. It is by nature a system designed for ease of input. Concepts such as WikiWords and WebNotify allow for quick navigation by facilitating linkages among the various processes and automatically prompting personnel when processes of interest are updated. There are plugins that can keep statistics on how frequently topics are used. This allows identification of processes that would be good candidates for automation (high usage) and those that could be improved or deleted (low usage). TWiki also has built-in support for access control via group membership. This, coupled with good design, can simplify the process of restricting auditors, contractors, and guests to areas of their focus, expertise, or clearance.

In a small business there are fewer hands, and the processes and policies must be correspondingly lightweight. A dynamic, intranet-based Company Operations Manual is one way to meet this business requirement.

Open Document Format for Dynamic Spreadsheet Compilation

This article gives an overview of the process of using the Open Document Format (ODF) to create spreadsheets. I will describe the ODF format in general, its pros and cons, and the process of developing such a library.

Open Document Format version 1.2 was used for this project (the latest version of ODF available at the time of this writing is 1.3).

During the course of developing spreadsheet data export capability for a recent project, I discovered the Open Document Format, an XML-based approach to document creation. It is used to great effect by OpenOffice.org (http://www.openoffice.org).

ODF Overview:

A minimal ODF spreadsheet file (.ods) contains the following files/directories:

  • content.xml
  • styles.xml
  • meta.xml
  • mimetype
  • META-INF/manifest.xml

The most important files of this set are content.xml (which contains all of the data and some style markup) and styles.xml (which contains additional style markup not contained in content.xml).
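To see how these pieces fit together, here is a hedged PHP sketch that packs a skeleton of the fileset into a .ods archive with ZipArchive (the output path is an assumption, and the XML bodies are trimmed to bare root elements rather than a complete document):

```php
$mimetype = 'application/vnd.oasis.opendocument.spreadsheet';
$manifest = '<?xml version="1.0" encoding="UTF-8"?>'
    . '<manifest:manifest xmlns:manifest="urn:oasis:names:tc:opendocument:xmlns:manifest:1.0">'
    . '<manifest:file-entry manifest:full-path="/" manifest:media-type="' . $mimetype . '"/>'
    . '<manifest:file-entry manifest:full-path="content.xml" manifest:media-type="text/xml"/>'
    . '</manifest:manifest>';
$content = '<?xml version="1.0" encoding="UTF-8"?>'
    . '<office:document-content'
    . ' xmlns:office="urn:oasis:names:tc:opendocument:xmlns:office:1.0"'
    . ' office:version="1.2"/>';

$zip = new ZipArchive();
$zip->open('/tmp/skeleton.ods', ZipArchive::CREATE | ZipArchive::OVERWRITE);
$zip->addFromString('mimetype', $mimetype);                  // should be the first entry
$zip->setCompressionName('mimetype', ZipArchive::CM_STORE);  // and stored uncompressed
$zip->addFromString('META-INF/manifest.xml', $manifest);
$zip->addFromString('content.xml', $content);                // styles.xml and meta.xml omitted here
$zip->close();
```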

ODF Pros and Cons:

Pros:

  • Writing the ODF format is computationally fast, since it merely adds XML markup to the plain-text data.
  • ODF is readable and writable by Microsoft Excel and OpenOffice.org, as well as many other applications.
  • Small file size (the XML fileset is stored in a zip archive).
  • Many of the features of Microsoft Excel are supported including:
    • Header/Footer
    • Paper Size and Orientation
    • Cell/Row/Column Style (color, font size, border, merged cells)
    • Page Scale During Print

Cons:

  • Microsoft Excel 2007’s implementation, although viable, is less robust than that offered by OpenOffice.org.
  • Missing support for Excel macros means that computations must be done on the server side, resulting in a static spreadsheet document.
  • Differences in formula syntax between Excel and OpenOffice.org (e.g. OpenOffice.org SUM(1; 2; 3) vs. Excel SUM(1,2,3)) mean that the document must effectively be written for Excel or for the other applications, or that computations must be done on the server side.

Library Design and Implementation:

My data export library was based on OpenOffice.org’s approach and follows this http://develop.opendocumentfellowship.com/ tutorial. It was designed using a cookie-cutter approach: create a template file using OpenOffice.org, extract the strings that contain the overall file format, extract the strings that contain the style, and repeatedly substitute into the composite string for each row of the spreadsheet data. When new formats or features were needed, the feature was selected in OpenOffice.org, the output file examined, and the relevant strings extracted and templated as necessary. This approach takes some time, because once the feature has been selected and added to the source, the output must then be tested with Microsoft Excel to ensure compatibility. A better approach would be to select the feature in Microsoft Excel and then test with OpenOffice.org, because OpenOffice.org accepts the Excel format most of the time whereas the converse is not true.
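The cookie-cutter substitution can be sketched as follows; the template strings and placeholder names (__CELLS__, __VALUE__) are mine, standing in for strings extracted from a file saved by OpenOffice.org:

```php
// Assumed extracted templates; real ones come from a saved .ods content.xml.
$cellTemplate = '<table:table-cell office:value-type="string">'
    . '<text:p>__VALUE__</text:p></table:table-cell>';
$rowTemplate = '<table:table-row>__CELLS__</table:table-row>';

// Substitute one row of data into the composite template string.
function renderRow(array $values, string $rowTpl, string $cellTpl): string {
    $cells = '';
    foreach ($values as $v) {
        $cells .= str_replace('__VALUE__', htmlspecialchars((string)$v), $cellTpl);
    }
    return str_replace('__CELLS__', $cells, $rowTpl);
}

$xml = '';
foreach ([['a', 'b'], ['c', 'd']] as $row) {  // stand-in spreadsheet data
    $xml .= renderRow($row, $rowTemplate, $cellTemplate);
}
```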

Another, more flexible long-term approach to constructing this library would be to create a grammar based on the XML tags and write the XML directly. I have also read about others creating ODF libraries using .odt (open document template) files.

TCPDF PHP package for PDF writing

I recently had the opportunity to implement the TCPDF package for a mid-sized project. This article attempts to document my experiences with the API, its strengths, weaknesses, and ease of use.

The package is quite simple to implement at a high level, and following the included examples I was able to create a writer for my project in a matter of days. I appreciated the flexibility of being able to use HTML for layout. Also appreciated was the ability to override the TCPDF class to create custom headers and footers. I utilized this to place a reference to the company logo in the website’s image directory rather than in the tcpdf package’s image directory. I also was able to create a more detailed header layout than the default using this method. Once the pdf document is constructed, TCPDF provides some helpful output options including posting the document directly to the browser. This is a nice option because it allows previewing in an iframe, and doesn’t take up space on the server.

Initially I constructed a string containing inline style and the data in one large HTML table, and wrote the pdf document using one writeHTML call. An example of this follows:


$style  = "<style type=\"text/css\">\n";
$style .= " table {\n";
$style .= "  color: red;\n";
$style .= " }\n";
$style .= " td {\n";
$style .= "  border: none;\n";
$style .= " }\n";
$style .= "</style>\n";
$table  = "<table>\n";
$table .= " <tr>\n";
$table .= "  <td>example</td>\n";
$table .= " </tr>\n";
$table .= "</table>\n";
$html = $style . $table;
$tcpdfObj = new TCPDF('L', 'pt', 'letter', true, 'UTF-8', false);
$tcpdfObj->SetHeaderData("logo.png", 100, 'pdf title', 'header text');
$tcpdfObj->SetMargins(72, 72, 72, true);
$tcpdfObj->setHeaderMargin(72);
$tcpdfObj->SetAutoPageBreak(true, 72);
$tcpdfObj->SetFillColor(0, 0, 0);
$tcpdfObj->setCellPaddings(0, 0, 0, 0);
$tcpdfObj->SetFont('helvetica', '', 10, '', true);
$tcpdfObj->AddPage();
$tcpdfObj->writeHTML($html, true, true, false, false, '');
$tcpdfObj->Output("test.pdf", 'I');

This first implementation worked for a small test database but failed for larger ones, producing out-of-memory errors. Raising PHP’s memory_limit didn’t solve the problem. I was able to work around this by dividing the writeHTML call into several smaller calls, each with a copy of the inline style and an HTML table containing several rows of the original table, but this added to the running time. writeHTML seemed to work with about 2500 cells at a time. Having overcome the memory limitation, I found that the running time for large datasets was unacceptable: in the range of 10 minutes or more for a 50000-cell document. Fortunately TCPDF has faster Cell and MultiCell functions, although layout becomes much more restrictive when using them. Using these faster calls reduced the running time by 50%, but this was still too slow for my project.
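The chunking workaround can be sketched as a small wrapper (writeTableChunked and the untyped $pdf parameter are my own names; any object exposing TCPDF’s writeHTML signature will work):

```php
// Write a large table as several smaller writeHTML() calls, repeating the
// inline style before each chunk so every call is self-contained HTML.
function writeTableChunked($pdf, string $style, array $rows, int $chunkSize = 100): void
{
    foreach (array_chunk($rows, $chunkSize) as $chunk) {
        $html = $style . "<table>\n" . implode("\n", $chunk) . "\n</table>\n";
        $pdf->writeHTML($html, true, true, false, false, '');
    }
}
```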

To summarize, the TCPDF package works, offers some flexibility of layout and output, and is quickly implemented, but doesn’t scale well.