PHP quick data export to spreadsheet

In our last post we talked about the limitations of creating large spreadsheets with the PHP library PHPExcel. Today we'll discuss a quick workaround that lets you create large unformatted spreadsheets for quick and easy data export from your applications. Here's how we do a php quick data export to spreadsheet:

  • Properly escape your data.
  • Write data with fputcsv($filehandle, $yourdataarray, "\t");
  • Include table headings and other data.
  • Use an xls file extension.
Properly escape your data: You'll be using a tab-delimited format for output, so you'll need to replace any tabs in your data with an escaped tab or five spaces so that your data doesn't corrupt your output format. For our purposes we'll use five spaces in the following function.
function TSV_Escape($str)
{
    if (strlen($str) > 0)
    {
        // Replace tabs with five spaces so they can't corrupt the output format.
        $str = str_replace("\t", '     ', $str);
        // Mark newlines with a literal %n so each record stays on one line.
        if (strstr($str, "\n") !== false)
        {
            $str = str_replace("\n", '%n', $str);
        }
        return str_replace('"','""', $str);
    }
    return '';
}

Write your data with fputcsv: First you'll need to open a file handle like this…

// Set file path and initialize handle.
$TSVFile = $_SERVER['DOCUMENT_ROOT'] . "/MyDataFile.xls";
$TSVFileHandle = fopen($TSVFile, 'w');
// Write your headers... see the next section. :)
// Write your query data to the file. $objdbconn is an open mysqli
// connection and $sql holds your SELECT statement.
if ($objdbconn->real_query($sql))
{
    if ($result = $objdbconn->store_result())
    {
        echo date('H:i:s') . " Processing " . $result->num_rows . " records.<br>\n";
        while ($row = $result->fetch_assoc())
        {
            $writeArr = array();
            foreach($row as $value)
            {
                $writeArr[] = TSV_Escape($value);
            }
            fputcsv($TSVFileHandle, $writeArr, "\t");
            unset($writeArr);
        }
    }
}
// Close your file handle.
fclose($TSVFileHandle);

Write your headers and other data: You may want to include some information about your spreadsheet, and you'll certainly want to include the column header row. Here's a brief example of how this could be done:

$headings = str_replace('`', '', "Heading One,`Database Field 1`,`Database Field 2`,`Database Field 3`");
$headArr = explode(',', $headings);

$RepDat = array();
$RepDat[] = "Created: " . date('H:i:s d-m-Y');
$RepDat[] = "Created by: " . $UserName;
// Write the report info and heading rows to the file.
fputcsv($TSVFileHandle, $RepDat, "\t");
fputcsv($TSVFileHandle, $headArr, "\t");

Use an xls file extension: Sure, you could use a tsv file extension, but if you want Excel or OpenOffice to open the file by default with minimal headache, the xls extension will do the trick. You'll get a message when you open the report stating that the file is not in the same format as the extension, but you won't have to worry about whether the tsv extension is registered to the right application.

This method will kick out a tab-delimited spreadsheet in a matter of seconds and can safely handle large record sets. We used it for a while as a temporary fix for reporting until we came up with a better method, and we still use it to create database load files when parsing complex legacy reports or backing up database records to file.

I hope this php quick data export method is helpful.

Joel

Excel spreadsheets with PHP – a PHPExcel overview

Writing Excel spreadsheets with PHP using PHPExcel:  The pros and cons

PHPExcel is a PHP library for reading and writing Excel spreadsheets. It is very well documented and has good community support. When searching the web for an open source solution for writing spreadsheets in PHP, I reviewed everything from PHPClasses to PEAR, and finally I came across this package. It's the most actively maintained project out there for Excel spreadsheets with PHP. I didn't want to use a project that was in the twilight of its development cycle, so I opted to try PHPExcel. Here's a brief overview of what I found:

Pros – The good points:

  • Very well documented.
  • Good examples.
  • Support forum with active participation.
  • Supports many of the built-in Excel functions.
  • Easy to style.
  • Supports modern Excel formats plus CSV, HTML, and PDF.
  • Writes spreadsheets.
  • Reads spreadsheets.
Cons – what I couldn’t stand:
  • Slow. This library takes considerable time to produce a spreadsheet when memory-conserving disk caching is used. Consider running your process in the background.
  • Memory intensive. PHPExcel offers disk caching to reduce its in-memory size, but it is basically an in-memory spreadsheet and must load completely into memory before being written to file. This can considerably hinder your performance: the spreadsheet and writer objects can take over 25 MB of RAM while the program is running, plus an additional 1 KB of RAM per cell loaded in memory. For an example of how this can impact your server or hosting service, consider a 50-column, 10,000-row spreadsheet, which would take about 525 MB of RAM to write to file (500,000 cells at 1 KB each, plus the objects themselves). While it's possible to set the script's memory limit high enough to handle this, you'll find that if several of these scripts try to run at the same time you'll have a mess that could crash your server.
  • Writing to file is batch-processed at the end of the script and happens all at once. Regardless of your disk caching settings, the entire spreadsheet is loaded into memory and then written to the file. This creates a serious bottleneck if you have a large file and will often crash your script. The only exception is when PHPExcel writes a CSV file, where it allows a more granular memory-to-file approach.
Workarounds that will reduce memory usage and improve speed (a sketch follows this list):
  • Set a cell's value explicitly. Setting the value and data type explicitly is much faster than letting PHPExcel work out the data type of each value itself.
  • Disable formula precalculation by setting it to false on the writer.
  • Use disk caching.
  • Don't try to use it to create reports from large data sets.
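To make these concrete, here's a minimal sketch of the workarounds above, assuming a PHPExcel 1.7-era install; the cell reference and file names are placeholders.

<?php
// A sketch of the PHPExcel workarounds (PHPExcel 1.7-era API).
require_once 'PHPExcel.php';

// Use disk caching -- set the cache method before creating the workbook.
PHPExcel_Settings::setCacheStorageMethod(
    PHPExcel_CachedObjectStorageFactory::cache_to_discISAM);

$workbook = new PHPExcel();
$sheet = $workbook->getActiveSheet();

// Set the value and data type explicitly instead of letting
// setCellValue() work out the type for every cell.
$sheet->setCellValueExplicit('A1', '12345', PHPExcel_Cell_DataType::TYPE_STRING);

// Disable formula precalculation on the writer.
$writer = PHPExcel_IOFactory::createWriter($workbook, 'Excel2007');
$writer->setPreCalculateFormulas(false);
$writer->save('report.xlsx');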
In summary: If you are reading and writing small spreadsheets with fewer than 250,000 fields or cells, this library will be a good fit for your project. It's great for summary reporting! However, if you've got extremely large record sets that you need to send to a spreadsheet, you'll need a different approach. I'll be writing about how we did that in the next blog post.
Stay tuned, and may the source be with you!
– Joel

My findings on large complex sql statements with mysqli

My findings from developing with large, complex SQL statements in PHP using mysqli

The granular approach is more efficient

Sometimes it's tempting to execute one large, complex SQL statement with mysqli to "do it all" in a single update, but I've found this approach to be inefficient in terms of speed and functionality. My advice is to use smaller, precise SQL statements instead of large, cumbersome queries. The small statements execute quickly and will get the job done faster than the large complex query. For instance, in one case I had a complex, dynamically generated query with over 230 update statements that updated a single table and concatenated (CONCAT) text to a text field. This query typically hung and took so long to execute that it caused problems in my Ajax environment, locking tables and creating conflicts with other operations that were trying to read and write using update and select statements. Once I broke it up into 230 small queries, the execution time went down from 30 seconds (and failing) to 3 to 5 seconds, and it never failed. In another case, when loading CSV files into a database, I had an issue with a multi query that hung on large files, but once I broke it down into smaller, specific queries it was able to execute up to 10,000 queries in a few seconds (no, I'm not exaggerating). A sketch of the approach follows.
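Here's a minimal sketch of that granular approach; the table, column, and connection details are hypothetical stand-ins.

<?php
// One small prepared statement executed per row instead of a single
// 230-statement multi query. The `notes` table and columns are hypothetical.
$objdbconn = new mysqli('localhost', 'user', 'password', 'mydb');

$stmt = $objdbconn->prepare("UPDATE notes SET body = CONCAT(body, ?) WHERE id = ?");
$stmt->bind_param('si', $text, $id);  // variables are bound by reference

foreach ($updates as $id => $text)  // $updates maps row id => text to append
{
    $stmt->execute();
}
$stmt->close();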

In addition to these tidbits, here are some other findings… MySQL lets you set the priority level of queries, and by default update queries have a higher priority than select queries, so in theory it isn't possible to run a select on a table that is in the middle of an update transaction when autocommit is in use (it is on by default). MySQL locks the table while performing the update, and the select has to wait until it is unlocked. This can become cumbersome and unwieldy when performing large, complex queries, and so far I haven't had much success with them.

When it comes to complex SQL statements with mysqli, achieving granularity through smaller, interrelated, specific SQL statements seems to get far better results for speed and efficiency.

May the source be with you

– Joel

PHP and curl for remote site data extraction

Sometimes you'll need to get data from another site on the internet. In this article we'll go over a quick and easy how-to for using PHP and cURL for remote site data extraction.

To get things started we’ll need to initialize a curl object and set some parameters so it can act as a web browser and log into the targeted remote site.

        // Initialize cURL
        $Curl_Obj = curl_init(); 

        // Enable Posting.
        curl_setopt($Curl_Obj, CURLOPT_POST, 1);

        // Enable Cookies
        curl_setopt ($Curl_Obj, CURLOPT_COOKIEJAR, 'cookie.txt'); 

        // Set the browser you will emulate
        $userAgent = 'Mozilla/5.0 (X11; Linux i686; rv:2.0.1) Gecko/20100101 Firefox/4.0.1';
        curl_setopt($Curl_Obj, CURLOPT_USERAGENT, $userAgent);

        // Don't include the header in the output.
        curl_setopt ($Curl_Obj, CURLOPT_HEADER, 0);

        // Allow referer field when following Location redirects.
        curl_setopt($Curl_Obj, CURLOPT_AUTOREFERER, TRUE);

        // Follow server redirects.
        curl_setopt($Curl_Obj, CURLOPT_FOLLOWLOCATION, 1);

        // Return output as string.
        curl_setopt ($Curl_Obj, CURLOPT_RETURNTRANSFER, 1);

You can find a complete list of PHP cURL options at http://www.php.net/manual/en/function.curl-setopt.php if you need more details.

Now that we've got our cURL object set up, let's post some login credentials to a login script. To do this we'll need to get the input ids from the login form, supply them with values, and set up the post with the cURL object. Then we'll post the credentials to the script named in the action of the login form. Here's how we do it:

       // Set up post fields from the login form. Remember to urlencode()
       // any values that may contain special characters.
       curl_setopt($Curl_Obj, CURLOPT_POSTFIELDS, "$UserFieldId=$UserName&$PasswordFieldId=$Password&$AnotherFieldId=$AnotherValue");

       // Set the url to which the data will be posted.
       curl_setopt ($Curl_Obj, CURLOPT_URL, 'http://yoursite.com/login.asp');

       // Execute the post and get the output.
       $output = curl_exec ($Curl_Obj);

       // Empty the post fields so you don't re-post on the next request.
       curl_setopt($Curl_Obj, CURLOPT_POSTFIELDS, "");

At this point we've obtained the output of the login attempt, and we can parse it with PHP's string manipulation functions to see if it was successful. For the sake of simplicity* I'll leave that part to you. Once you've ensured that you're logged in, you can navigate to the next url using the steps below:

       // Set curl object url option.
       curl_setopt ($Curl_Obj, CURLOPT_URL, $Get_This_Url);
       // Execute query and obtain content.
       $output = curl_exec($Curl_Obj);

Finally, you've downloaded the content from the page that was not accessible before your virtual login. You can parse it with PHP's string functions or use XPath to read its content if you'd like. Don't forget to close the cURL object before you exit your script!

// Close curl object.
curl_close ($Curl_Obj);

 

Happy coding and please use responsibly.

– Joel

 

* Here's a simple string extraction function you can use when using PHP and cURL for remote site data extraction.

function extract_snippet($html, $Start_Term, $End_Term, $PrintOutput = false)
{
    $start = strpos($html, $Start_Term);
    if ($start !== false)
    {
        $end = strpos($html, $End_Term, $start);
        if ($end === false)
        {
            return false;
        }
        $T_Sel = $start + strlen($Start_Term);
        $snippet = substr($html, $T_Sel, $end - $T_Sel);
        if ($PrintOutput)
        {
            print "\n<br>$snippet<br>";
        }
        return $snippet;
    }
    return false;
}
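For example, to pull the page title out of the content you downloaded earlier:

// Grab the page title from the downloaded content.
$title = extract_snippet($output, '<title>', '</title>');
if ($title !== false)
{
    echo "Page title: $title\n";
}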

What modern business applications should offer users.

Modern business applications should offer users the following features…

While developing solutions for various projects I've undertaken, I've come up with a few ideas about the types of features I'd like my applications to offer users. While this list is far from complete, I think it's worth posting for comment. Bear in mind that when I discuss modern business applications, I'm discussing thin-client, web-based applications using cloud architecture.

Modern business application features:

  • Remote connectivity to data and functionality.
  • Encapsulation of business logic.
  • Continuous contextual decision support.
  • User specific views of the data and logical processes
  • Security
  • Flexibility
  • Data Importing / Exporting capabilities

Remote connectivity to data and functionality: Traditional business applications ran on workstations and desktop computers. Data was stored on the computer itself or on a server and was not accessible unless the user was sitting in front of that desktop computer or another computer connected to the server. This in effect chained the user to his or her desktop device. Modern business applications must utilize web technology and provide device-independent services that allow the user to connect to the data and logic of the business through any internet-connected device they can be authenticated on.

Encapsulation of business logic: Traditional business applications such as spreadsheets, word processors, and email allowed for the transportation and transformation of data, but the use of business logic was largely left to the discretion of the user. Modern business applications can be designed to encapsulate the steps that must be taken with information and resources to automate processes and ensure execution of the critical procedures users depend on. By profiling the procedures and steps required for critical business processes, developers can create software that encapsulates them and ensures that the order of operations is followed, all required steps are taken, all required resources are employed, and follow-through is complete. Outcomes and ownership of inputs from each step can also be tracked to determine each user's contribution to the success of the process. Success ratios for processes can be increased as the mundane oversight of each detail is moved from managers to system applications, with the general oversight of larger categories of processes taken on by tech-empowered managers.

Continuous contextual decision support: Because of the connectivity offered by today's diverse array of computing devices, from smartphones, laptops, and tablet PCs to thin-client-enabled desktop PCs, users now have the ability to receive data and logic at any point in the business process. Modern business applications must provide contextual decision support by offering relevant information to the user at the right time, with suggestions for its proper use, and must allow for user feedback into the system.

User specific views of the data and logical processes: Modern business applications must offer users unique views of the data and logical processes that fit their context to the business. For example, a vendor that supplies a business would need access to statistics indicating demand for the particular commodity they offer. Their view of the data and logic would be an on-demand view or automated communication with forecasted consumption data generated by internal business processes that the vendor never sees. User specific views of the data and logic offer modular controls to the business and protect its core competencies from probing by competition.

Security: Traditional business applications were prone to interruption by natural disasters and other disruptive events such as fires, floods, and tornadoes. Modern business applications can offer security to their users by storing data and logic in a cloud environment where information is distributed off site among computers in multiple data centers. This offers a degree of security to the user and provides for a significant reduction in recovery time when the business is impacted by such an event.

Flexibility: Modern business applications created with web-based technologies are more flexible than traditional business applications because of the ease with which updates can be rolled out. In traditional environments, users had to go through an application update procedure before receiving an updated version of a business application containing the latest data and logic. Web applications are updated remotely with no effort required by the end user, making innovation, iteration, and revision much more seamless and transparent to end users.

Data Importing / Exporting capabilities: Modern business applications should easily exchange data with traditional applications such as spreadsheets and allow the user the comfort of converting data for use outside of the application. Data transfer formats such as XML are a good standard for data export and allow for data interchange between disparate systems and applications.

Comments and discussion are welcome.

SQL tables for website development of html forms

A programmer's resource: SQL tables for website development of html forms using selects and multiselects.

The following sql statements create tables that you can use to generate html selects and multi selects for web application menus. I've posted some of the most common tables that I think are needed. Let me know if there is one that I don't have listed and I'll add it. You can copy these sql statements to a file or execute them directly to create the tables. A PHP sketch showing how to turn one of these tables into a select follows the table definitions.

States:

CREATE TABLE `States` (
`Abr` varchar(2) NOT NULL,
`Name` varchar(56) NOT NULL,
`Sort_Order` smallint(6) NOT NULL,
PRIMARY KEY  (`Abr`)
) ENGINE=MyISAM DEFAULT CHARSET=utf8;

INSERT INTO `States` VALUES('AL', 'Alabama', 1);
INSERT INTO `States` VALUES('AK', 'Alaska', 2);
INSERT INTO `States` VALUES('AZ', 'Arizona', 3);
INSERT INTO `States` VALUES('AR', 'Arkansas', 4);
INSERT INTO `States` VALUES('CA', 'California', 5);
INSERT INTO `States` VALUES('CO', 'Colorado', 6);
INSERT INTO `States` VALUES('CT', 'Connecticut', 7);
INSERT INTO `States` VALUES('DE', 'Delaware', 8);
INSERT INTO `States` VALUES('DC', 'District Of Columbia', 9);
INSERT INTO `States` VALUES('FL', 'Florida', 10);
INSERT INTO `States` VALUES('GA', 'Georgia', 11);
INSERT INTO `States` VALUES('HI', 'Hawaii', 12);
INSERT INTO `States` VALUES('ID', 'Idaho', 13);
INSERT INTO `States` VALUES('IL', 'Illinois', 14);
INSERT INTO `States` VALUES('IN', 'Indiana', 15);
INSERT INTO `States` VALUES('IA', 'Iowa', 16);
INSERT INTO `States` VALUES('KS', 'Kansas', 17);
INSERT INTO `States` VALUES('KY', 'Kentucky', 18);
INSERT INTO `States` VALUES('LA', 'Louisiana', 19);
INSERT INTO `States` VALUES('ME', 'Maine', 20);
INSERT INTO `States` VALUES('MD', 'Maryland', 21);
INSERT INTO `States` VALUES('MA', 'Massachusetts', 22);
INSERT INTO `States` VALUES('MI', 'Michigan', 23);
INSERT INTO `States` VALUES('MN', 'Minnesota', 24);
INSERT INTO `States` VALUES('MS', 'Mississippi', 25);
INSERT INTO `States` VALUES('MO', 'Missouri', 26);
INSERT INTO `States` VALUES('MT', 'Montana', 27);
INSERT INTO `States` VALUES('NE', 'Nebraska', 28);
INSERT INTO `States` VALUES('NV', 'Nevada', 29);
INSERT INTO `States` VALUES('NH', 'New Hampshire', 30);
INSERT INTO `States` VALUES('NJ', 'New Jersey', 31);
INSERT INTO `States` VALUES('NM', 'New Mexico', 32);
INSERT INTO `States` VALUES('NY', 'New York', 33);
INSERT INTO `States` VALUES('NC', 'North Carolina', 34);
INSERT INTO `States` VALUES('ND', 'North Dakota', 35);
INSERT INTO `States` VALUES('OH', 'Ohio', 36);
INSERT INTO `States` VALUES('OK', 'Oklahoma', 37);
INSERT INTO `States` VALUES('OR', 'Oregon', 38);
INSERT INTO `States` VALUES('PA', 'Pennsylvania', 39);
INSERT INTO `States` VALUES('RI', 'Rhode Island', 40);
INSERT INTO `States` VALUES('SC', 'South Carolina', 41);
INSERT INTO `States` VALUES('SD', 'South Dakota', 42);
INSERT INTO `States` VALUES('TN', 'Tennessee', 43);
INSERT INTO `States` VALUES('TX', 'Texas', 44);
INSERT INTO `States` VALUES('UT', 'Utah', 45);
INSERT INTO `States` VALUES('VT', 'Vermont', 46);
INSERT INTO `States` VALUES('VA', 'Virginia', 47);
INSERT INTO `States` VALUES('WA', 'Washington', 48);
INSERT INTO `States` VALUES('WV', 'West Virginia', 49);
INSERT INTO `States` VALUES('WI', 'Wisconsin', 50);
INSERT INTO `States` VALUES('WY', 'Wyoming', 51);

 

Age Range:

CREATE TABLE `AgeRange` (
`ID` tinyint(4) NOT NULL,
`Range` varchar(128) NOT NULL,
PRIMARY KEY  (`ID`),
UNIQUE KEY `Range` (`Range`)
) ENGINE=MyISAM DEFAULT CHARSET=utf8;

INSERT INTO `AgeRange` VALUES(0, 'Under 18');
INSERT INTO `AgeRange` VALUES(1, '18 to 24');
INSERT INTO `AgeRange` VALUES(2, '25 to 29');
INSERT INTO `AgeRange` VALUES(3, '30 to 34');
INSERT INTO `AgeRange` VALUES(4, '35 to 39');
INSERT INTO `AgeRange` VALUES(5, '40 to 49');
INSERT INTO `AgeRange` VALUES(6, '50 to 59');
INSERT INTO `AgeRange` VALUES(7, '60 to 69');
INSERT INTO `AgeRange` VALUES(8, '70 to 80');
INSERT INTO `AgeRange` VALUES(9, 'Over 80');

Organizational Role:

CREATE TABLE `Organizational_Role` (
`ID` tinyint(4) NOT NULL,
`Position` varchar(255) NOT NULL,
PRIMARY KEY  (`ID`),
UNIQUE KEY `Position` (`Position`)

INSERT INTO `Organizational_Role` VALUES(0, 'Owner / Operator');
INSERT INTO `Organizational_Role` VALUES(1, 'General Management');
INSERT INTO `Organizational_Role` VALUES(2, 'Administrative Support');
INSERT INTO `Organizational_Role` VALUES(3, 'Marketing');
INSERT INTO `Organizational_Role` VALUES(4, 'Finance');
INSERT INTO `Organizational_Role` VALUES(5, 'Operations');
INSERT INTO `Organizational_Role` VALUES(6, 'Sales');
INSERT INTO `Organizational_Role` VALUES(7, 'IT');
INSERT INTO `Organizational_Role` VALUES(8, 'Legal / Regulatory');
INSERT INTO `Organizational_Role` VALUES(9, 'Customer Service');
INSERT INTO `Organizational_Role` VALUES(10, 'Public relations / corporate communications');
INSERT INTO `Organizational_Role` VALUES(11, 'Accounting');
INSERT INTO `Organizational_Role` VALUES(12, 'Cross-functional role');

Employment Level:

CREATE TABLE `EmploymentLevel` (
`ID` tinyint(4) NOT NULL,
`Level` varchar(128) NOT NULL,
PRIMARY KEY  (`ID`),
UNIQUE KEY `Level` (`Level`)
) ENGINE=MyISAM DEFAULT CHARSET=utf8;

INSERT INTO `EmploymentLevel` VALUES(0, 'CEO, president, owner, senior partner, principal');
INSERT INTO `EmploymentLevel` VALUES(1, 'CXO other than CEO (CFO, CIO, etc.)');
INSERT INTO `EmploymentLevel` VALUES(2, 'Partner / senior executive');
INSERT INTO `EmploymentLevel` VALUES(3, 'Middle Manager');
INSERT INTO `EmploymentLevel` VALUES(4, 'Supervisor');
INSERT INTO `EmploymentLevel` VALUES(5, 'Professional (consultant, legal, accountant, programmer, etc.)');
INSERT INTO `EmploymentLevel` VALUES(6, 'Front-line employee');
INSERT INTO `EmploymentLevel` VALUES(7, 'Administrative staff');
INSERT INTO `EmploymentLevel` VALUES(8, 'Student');
INSERT INTO `EmploymentLevel` VALUES(9, 'Not employed');
INSERT INTO `EmploymentLevel` VALUES(10, 'Retired');

Number of Employees:

CREATE TABLE `NumberOfEmployees` (
`ID` tinyint(4) NOT NULL,
`Range` varchar(96) NOT NULL,
PRIMARY KEY  (`ID`),
UNIQUE KEY `Range` (`Range`)
) ENGINE=MyISAM DEFAULT CHARSET=utf8;

INSERT INTO `NumberOfEmployees` VALUES(0, '1');
INSERT INTO `NumberOfEmployees` VALUES(1, '2');
INSERT INTO `NumberOfEmployees` VALUES(2, '3');
INSERT INTO `NumberOfEmployees` VALUES(3, '4 to 9');
INSERT INTO `NumberOfEmployees` VALUES(4, '10 to 16');
INSERT INTO `NumberOfEmployees` VALUES(5, '17 to 29');
INSERT INTO `NumberOfEmployees` VALUES(6, '30 to 49');
INSERT INTO `NumberOfEmployees` VALUES(7, '50 to 99');
INSERT INTO `NumberOfEmployees` VALUES(8, '100 to 299');
INSERT INTO `NumberOfEmployees` VALUES(9, '300 to 599');
INSERT INTO `NumberOfEmployees` VALUES(10, '600 to 999');
INSERT INTO `NumberOfEmployees` VALUES(11, '1000 to 1499');
INSERT INTO `NumberOfEmployees` VALUES(12, '1500 to 1999');
INSERT INTO `NumberOfEmployees` VALUES(13, '2000 to 3000');
INSERT INTO `NumberOfEmployees` VALUES(14, 'more than 3000');

Revenue Volume:

CREATE TABLE `RevenueVolume` (
`ID` tinyint(4) NOT NULL,
`Volume` varchar(96) NOT NULL,
PRIMARY KEY  (`ID`),
UNIQUE KEY `Volume` (`Volume`)
) ENGINE=MyISAM DEFAULT CHARSET=utf8;

INSERT INTO `RevenueVolume` VALUES(0, 'None');
INSERT INTO `RevenueVolume` VALUES(1, 'Less than $100,000');
INSERT INTO `RevenueVolume` VALUES(2, 'Less than $200,000');
INSERT INTO `RevenueVolume` VALUES(3, 'Less than $300,000');
INSERT INTO `RevenueVolume` VALUES(4, 'Less than $400,000');
INSERT INTO `RevenueVolume` VALUES(5, 'Less than $500,000');
INSERT INTO `RevenueVolume` VALUES(6, 'Less than $750,000');
INSERT INTO `RevenueVolume` VALUES(7, 'Less than $1,000,000');
INSERT INTO `RevenueVolume` VALUES(8, 'Less than $2,000,000');
INSERT INTO `RevenueVolume` VALUES(9, 'Less than $5,000,000');
INSERT INTO `RevenueVolume` VALUES(10, 'Less than $10,000,000');
INSERT INTO `RevenueVolume` VALUES(11, 'Less than $50,000,000');
INSERT INTO `RevenueVolume` VALUES(12, 'Less than $100,000,000');
INSERT INTO `RevenueVolume` VALUES(13, 'Less than $250,000,000');
INSERT INTO `RevenueVolume` VALUES(14, 'Less than $500,000,000');
INSERT INTO `RevenueVolume` VALUES(15, 'Less than $750,000,000');
INSERT INTO `RevenueVolume` VALUES(16, 'Less than $1,000,000,000');
INSERT INTO `RevenueVolume` VALUES(17, 'More than $1,000,000,000');

Time Frame:

CREATE TABLE `TimeFrame` (
`ID` tinyint(4) NOT NULL,
`Time Frame` varchar(128) NOT NULL,
`Seconds` bigint(20) NOT NULL,
PRIMARY KEY  (`ID`)
) ENGINE=MyISAM DEFAULT CHARSET=utf8;

INSERT INTO `TimeFrame` VALUES(0, 'One Day', 86400);
INSERT INTO `TimeFrame` VALUES(1, 'Three Days', 259200);
INSERT INTO `TimeFrame` VALUES(2, 'Five Days', 432000);
INSERT INTO `TimeFrame` VALUES(3, 'One Week', 604800);
INSERT INTO `TimeFrame` VALUES(4, 'Ten Days', 864000);
INSERT INTO `TimeFrame` VALUES(5, 'Two Weeks', 1209600);
INSERT INTO `TimeFrame` VALUES(6, 'Fifteen Days', 1296000);
INSERT INTO `TimeFrame` VALUES(7, 'Eighteen Days', 1555200);
INSERT INTO `TimeFrame` VALUES(8, 'Three Weeks', 1814400);
INSERT INTO `TimeFrame` VALUES(9, 'Twenty Five Days', 2160000);
INSERT INTO `TimeFrame` VALUES(10, 'Four Weeks', 2419200);
INSERT INTO `TimeFrame` VALUES(11, 'Thirty Days', 2592000);
INSERT INTO `TimeFrame` VALUES(12, 'Five Weeks', 3024000);
INSERT INTO `TimeFrame` VALUES(13, 'Six Weeks', 3628800);
INSERT INTO `TimeFrame` VALUES(14, 'Forty Five Days', 3888000);
INSERT INTO `TimeFrame` VALUES(15, 'Seven Weeks', 4233600);
INSERT INTO `TimeFrame` VALUES(16, 'Eight Weeks', 4838400);
INSERT INTO `TimeFrame` VALUES(17, 'Sixty Days', 5184000);
INSERT INTO `TimeFrame` VALUES(18, 'Nine Weeks', 5443200);
INSERT INTO `TimeFrame` VALUES(19, 'Ten Weeks', 6048000);
INSERT INTO `TimeFrame` VALUES(20, 'Eleven Weeks', 6652800);
INSERT INTO `TimeFrame` VALUES(21, 'Twelve Weeks', 7257600);
INSERT INTO `TimeFrame` VALUES(22, 'Ninety Days', 7776000);
INSERT INTO `TimeFrame` VALUES(23, 'Sixteen Weeks', 9676800);
INSERT INTO `TimeFrame` VALUES(24, 'Twenty Weeks', 12096000);
INSERT INTO `TimeFrame` VALUES(25, 'Twenty Four Weeks', 14515200);
INSERT INTO `TimeFrame` VALUES(26, 'Twenty Eight Weeks', 16934400);
INSERT INTO `TimeFrame` VALUES(27, 'Thirty Two Weeks', 19353600);
INSERT INTO `TimeFrame` VALUES(28, 'Thirty Six Weeks', 21772800);
INSERT INTO `TimeFrame` VALUES(29, 'Forty Weeks', 24192000);
INSERT INTO `TimeFrame` VALUES(30, 'Forty Four Weeks', 26611200);
INSERT INTO `TimeFrame` VALUES(31, 'Forty Eight Weeks', 29030400);
INSERT INTO `TimeFrame` VALUES(32, 'Fifty Two Weeks', 31449600);
INSERT INTO `TimeFrame` VALUES(33, 'One Year', 31536000);
INSERT INTO `TimeFrame` VALUES(34, 'One Year, Three Months', 39420000);
INSERT INTO `TimeFrame` VALUES(35, 'One Year, Six Months', 47304000);
INSERT INTO `TimeFrame` VALUES(36, 'One Year, Nine Months', 55188000);
INSERT INTO `TimeFrame` VALUES(37, 'Two Years', 63072000);
INSERT INTO `TimeFrame` VALUES(38, 'More Than Two Years', 63072001);
INSERT INTO `TimeFrame` VALUES(39, 'Unknown', 0);
INSERT INTO `TimeFrame` VALUES(40, 'Not time based', 0);
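And here's the sketch promised above, putting the States table to work. It assumes an open mysqli connection in $objdbconn; the function and field names are just for illustration.

<?php
// A sketch of generating an html select from the States table.
function state_select($objdbconn, $selected = '')
{
    $html = "<select name=\"State\">\n";
    $result = $objdbconn->query("SELECT Abr, Name FROM States ORDER BY Sort_Order");
    while ($row = $result->fetch_assoc())
    {
        $sel = ($row['Abr'] == $selected) ? ' selected' : '';
        $html .= "<option value=\"" . htmlentities($row['Abr']) . "\"$sel>"
               . htmlentities($row['Name']) . "</option>\n";
    }
    $result->free();
    return $html . "</select>\n";
}

// Usage: echo a state select with Texas preselected.
echo state_select($objdbconn, 'TX');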

I hope this saves you time.

– Joel

Web Development: Industry Classification csv file

A web development resource: Industry classifications for menus.

A project I'm working on requires a classification of businesses for a menu, and I've searched the internet over for a usable resource. I wanted an industry classification csv file to load into a database. I ended my search when I found the North American Industry Classification System (NAICS). It offers a complete classification scheme used by the federal government, with explanations and downloadable files. I've modified some of the files they offer for easy loading into a database table. You can download them here:

If you know of a better classification system, or a way to improve on this one, let me know. I'll be using this on the card project; you can preview it here: http://catontech.com/card
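If you end up loading one of the csv files into MySQL, here's a rough sketch; the table layout and file name are hypothetical, since the modified files aren't described column by column.

-- Hypothetical table for a two-column code/title csv.
CREATE TABLE `Industry` (
`Code` varchar(6) NOT NULL,
`Title` varchar(255) NOT NULL,
PRIMARY KEY  (`Code`)
) ENGINE=MyISAM DEFAULT CHARSET=utf8;

-- Requires local-infile to be enabled on the server and client.
LOAD DATA LOCAL INFILE 'naics.csv'
INTO TABLE `Industry`
FIELDS TERMINATED BY ',' OPTIONALLY ENCLOSED BY '"'
LINES TERMINATED BY '\n'
IGNORE 1 LINES;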

– Joel

A simple ajax page update example using jquery

An ajax example using jquery

[Image: animated Ajax loading graphic]

Download the latest version of jquery from jquery.com and upload it to your server. Create an html file or php script and output the html detailed below:
<!DOCTYPE html PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN"
"http://www.w3.org/TR/html4/loose.dtd">
<html>
<head>

<script type="text/javascript">
function AjaxUpdate()
{
$(document).ready(function(){$("#MyDiv").load("http://yourdomain.com/yourscript.php")});
/* To keep refreshing, re-arm the timer here: setTimeout("AjaxUpdate()",90000); */
}
</script>
<title>Quick Ajax with jquery Demo</title>
</head>
<body>
<script type="text/javascript" src="http://yourdomain.com/javascript/jquery-1.3.2.js"></script>
<div id="MyDiv">
<img src="http://yourdomain.com/images/loading.gif" alt="Loading content..." style="display:block; margin-left:auto; margin-right:auto; vertical-align:middle;"/>
</div>
<!-- Use this method for user event loading. -->
<button onclick="AjaxUpdate();">Update Page</button>

<script type="text/javascript">
// Use this for delayed loading.
setTimeout("AjaxUpdate()",30000);
</script>
</body>
</html>

Now create a php script to send information back to the page. Whatever the script outputs will go into the div with id MyDiv.

In this example, when the user clicks the Update button, the div contents refresh. I've also included an example on the page that updates the div 30 seconds after page load. Since the php script can write javascript back to the page, you could have it schedule another load if you wanted to. A sketch of a simple responder script follows.
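Here's a minimal sketch of what yourscript.php might look like; it just returns a timestamped fragment to drop into the div.

<?php
// yourscript.php -- a minimal responder sketch. Whatever this script
// outputs replaces the contents of #MyDiv on the page.
header('Content-Type: text/html; charset=utf-8');
echo "<p>Content refreshed at " . date('H:i:s') . "</p>";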

Go to http://ajaxload.info to get loading graphics for your ajax powered web applications. It’s free.

Hope this helps.

– Joel

A guide to PHP Data Sanitation – Sanity Checking

Practices from PHP.net to ensure that your scripts are doing what you've designed them to do.

What is Data Sanitation / Sanity Checking?

Data sanitation / sanity checking is the process of filtering the inputs to each script, function, and sql statement to ensure they fall within the tolerable, expected range that allows the application to perform as planned. All inputs that originate from a foreign source must be treated with a degree of scepticism and filtered to ensure the values are not tainted. By performing data sanitation, your scripts and web applications should be highly resistant to misuse or exploitation from the following types of attack.

  • SQL Injection Attacks
  • Cross Site Scripting
  • Form Spoofing
  • Cross Site Request Forgery
  • Session Fixation

Blacklist and Whitelist filtering

The two common methodologies for filtering input data are blacklisting and whitelisting. Blacklisting involves checking inputs for prohibited values, while whitelisting checks for allowed values. A good example of a blacklist approach is a profanity filter. There will be times when you need to use blacklisting, but in general you will want to use whitelisting to ensure that you receive values that are acceptable to the components of your application.

Client Side and Server Side filtering

We use client side filtering to make the application easy to use. By filtering data on the client side you can ensure that the user does not have to resubmit a form to make a correction. Client side filtering is susceptible to input tainting and must be used in conjunction with server side filtering. Here are some suggested client side filtering methods to use when passing data to the server from an html form.

Client Side:

  • Design html forms using pre-populated selects to limit value selection to acceptable values where appropriate.
  • Use check boxes for yes/no/undecided situations.
  • Limit text field lengths to the maximum value the various components of your script will allow.
  • Pass a script-generated token, also stored in the session, in a hidden field in the form to ensure that the user came from your site.
  • In general, when sending html output to the user from any script, use htmlentities() to escape the data you are sending. This keeps values from your script from being used out of context and corrupting your intentions, protecting you from mangled documents and, in the worst case, a cross site scripting attack.
  • Use javascript to validate the fields before the user is allowed to submit the data.

Once your script receives the data you will need to filter it again to ensure that it is not tainted. Here is where we use the whitelisting method mentioned earlier. The following methods can be used to construct data validation functions that you can store in a file and require in each script you write.

Server Side:

  • Use functions such as isset() and empty() with conditional statements to ensure that all required variables for your script or function are set. If they are not, you may provide a default value where suitable, or exit the function or script.
  • Use built-in php functions such as the ctype or is_ family of functions with conditional statements to validate data types.
  • Use comparison operators (==, ===, !=, !==, <, <=, >, >=) to filter numerical values, and string functions such as strlen, strcmp, strcasecmp, strpos, strstr, and strspn to filter string values. (A sketch of some whitelist helpers follows this list.)
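Here's the sketch promised above: a couple of illustrative whitelist helpers you could keep in a shared file. The field names and acceptable ranges are made up for the example.

<?php
// validation.php -- illustrative whitelist helpers to require in each script.
function valid_state_abr($value)
{
    // Whitelist: exactly two uppercase letters, e.g. 'TX'.
    return is_string($value) && strlen($value) == 2 && ctype_upper($value);
}

function valid_age($value)
{
    // Whitelist: digits only, within an expected human age range.
    return ctype_digit((string)$value) && (int)$value <= 120;
}

// Usage in a script receiving form data:
if (!isset($_POST['State']) || !valid_state_abr($_POST['State']))
{
    exit('Invalid state.');
}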

Advanced methods to control your data inputs and ensure their integrity:

  • Control the method by which your script receives data by limiting inputs directly to their source. Avoid using $_REQUEST, as it pulls a copy of the input from $_POST, $_GET, and $_COOKIE without specifying where it came from. By using $_POST, $_GET, and $_COOKIE directly, you force the input to come from the source you specify and block any effort to cross these methods up and feed your application tainted data.
  • If you are receiving the inputs from a form, ensure that you dynamically generate the form and include a hidden token that is stored in the session, passed back via the form, and validated. Dynamically generate the token using md5 or sha1 and a time-stamp concatenated with a hash string. Example: $_SESSION['Token'] = md5(time() . "YourHashString"). This ensures that you are receiving the data from your own form and helps protect you from form spoofing and cross site request forgery attacks (also mentioned above under client side filtering).
  • Filter output intended for your database with your database driver's escape_string function, or use prepared statements. This protects against SQL injection attacks.
  • Use session_regenerate_id to ensure that the user's session id is set by your script and not an outside source. Do this every time your user changes authentication levels or your session accesses sensitive resources. This protects against a session fixation attack. While this is not strictly a sanitation topic, you do want to be sure that the data you are receiving is from your user and not a third party. (A sketch of the token flow follows this list.)
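And here's a sketch of the token flow described above; the field and session key names are illustrative.

<?php
// A sketch of the form-token flow. Field and session key names are examples.
session_start();

if ($_SERVER['REQUEST_METHOD'] == 'POST')
{
    // Validate the token that came back with the form.
    if (!isset($_POST['Token'], $_SESSION['Token'])
        || $_POST['Token'] !== $_SESSION['Token'])
    {
        exit('Form token mismatch.');
    }
    // The user just crossed an authentication boundary; get a fresh session id.
    session_regenerate_id(true);
    // ... process the submission here ...
}

// Generate a token for the form we are about to render.
$_SESSION['Token'] = md5(time() . "YourHashString");
echo '<form method="post" action="">'
   . '<input type="hidden" name="Token" value="' . $_SESSION['Token'] . '">'
   . '<input type="submit" value="Send">'
   . '</form>';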

Conclusion

We’ve covered a lot of information in a short time. Feel free to bookmark this post and come back to it. Please comment if you feel I’ve skipped over anything or if you found this guide helpful. I’d like to publish this as a static page with examples in the near future.

– Joel