Posts Tagged ‘php’

Mercurial Hook for Syntax Checking (PHP)

Friday, October 8th, 2010

For those unfamiliar with Mercurial, it is an awesome Source Control Management (SCM) tool. One of my favorite features of Mercurial is that repositories are distributed, which means each machine has a full copy of the project's history. Being distributed has many advantages: committing, branching, tagging and merging are faster because they are all done locally. This setup also creates a backup of the repository each time an engineer clones it. There are a lot of other benefits to using Mercurial, but that is not the focus of this post.

In this article, I am going to discuss how to set up a Mercurial hook that checks the syntax of files. Specifically, the hook will be set up to check the syntax of PHP files. This is beneficial because it prevents users from adding invalid files to Mercurial and keeps the repository clean. Better yet, when dealing with a repository for a live website, it prevents invalid files from ever reaching the live site.

The Pretxnchangegroup Event

Mercurial hooks are programs that Mercurial executes during specific events. Ideally, a hook such as a syntax check would run just before a commit is made (the precommit event). Since Mercurial is distributed, this would require each client to install and set up the hook. This may work for some, but it requires more work and can cause issues if the hook is not set up correctly on each machine.

There is a better solution for environments that have a central repository for everyone to push their changes to: the hook can be set up on the pretxnchangegroup event. This hook runs after a group of incoming changesets has been added to the repository (for example, during a push) but before the transaction is committed, so a failing hook rolls the whole group back.

To set up a hook on the pretxnchangegroup event, the syntax check needs to build a list of every file that was changed in any of the incoming changesets and then check the syntax of the latest version of each file. If there is a syntax error, the hook can exit with a non-zero status code to prevent the changesets from being added to the central repository.
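
The file-collection step described above can be sketched in Python with plain data. This is only an illustration under assumed inputs: `changesets` is a hypothetical list of per-changeset file lists standing in for what Mercurial would report, and `unique_php_files` is a name made up for this sketch.

```python
import re

# Extensions to check; the pattern is anchored so the extension must end the name.
PHP_FILE = re.compile(r'\.(php|php4|php5)$', re.IGNORECASE)


def unique_php_files(changesets):
    """Return the sorted, de-duplicated PHP files touched by any changeset."""
    files = set()
    for changed_files in changesets:
        for name in changed_files:
            if PHP_FILE.search(name):
                files.add(name)
    return sorted(files)


# Hypothetical push containing two changesets that both touch index.php.
changesets = [
    ['index.php', 'README.txt'],
    ['index.php', 'lib/old.php4', 'notes.php.bak'],
]
print(unique_php_files(changesets))
```

Because a set is used, a file that changes in several changesets is only checked once, and only its latest version needs to be examined.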

When using the pretxnchangegroup event, each machine will be able to commit changes with files that have syntax errors. However, when trying to push the files to the central server, the changesets will be rejected until the syntax errors have been fixed.

In Process vs. External Hooks

With Mercurial, there are two types of hooks: in-process and external. An in-process hook is a Python function, loaded from a module, that runs inside the Mercurial process. An external hook is a separate program that Mercurial runs, so it can be written in any programming language that is supported by the OS.

There are advantages to both types. An external hook is most beneficial when the code is already written in another language or the developers are more familiar with a language other than Python. An in-process hook gives the developer access to the internals of Mercurial. It also makes it possible to display a message to the user when they make a change in the repository.
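
As a rough sketch of the in-process style: Mercurial calls a Python function with a ui object, the repository, the hook type and keyword arguments, and treats a true return value as failure. The `FakeUI` class and the `reject_everything` function below are hypothetical names used only so the example can run without Mercurial installed.

```python
class FakeUI(object):
    """Stand-in for Mercurial's ui object, only for trying the hook out."""

    def __init__(self):
        self.messages = []

    def warn(self, message):
        # The real ui.warn writes to the user's terminal; here we just record it.
        self.messages.append(message)


def reject_everything(ui, repo, hooktype, node=None, **kwargs):
    """Hypothetical hook: explain the rejection to the user, then fail."""
    ui.warn("push rejected by %s hook (first changeset %s)\n" % (hooktype, node))
    # A true return value tells Mercurial the hook failed.
    return True


ui = FakeUI()
failed = reject_everything(ui, repo=None, hooktype='pretxnchangegroup', node='abc123')
print(failed, ui.messages)
```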

External Hook Using a Shell Script

In order to show how Mercurial hooks work, I have developed both an external and in-process hook to check the syntax of PHP files. Below is the source code for an external hook. This hook is a bash script that I named php_syntax.sh.

#!/usr/local/bin/bash
echo "STARTING PHP SYNTAX CHECK..."

# create a random temp file to hold the list of changed files
temp_file=`/usr/bin/mktemp -t php_syntax_files`

# get all modified files, one per line, and remove duplicates
# note: {files} separates names with spaces, so names containing
# whitespace are not handled; use file_mods,file_adds instead
hg log -r "$HG_NODE:tip" --template "{files}\n" | tr ' ' '\n' | sort -u > "$temp_file"

# Walk through each file name
for line in $(< "$temp_file"); do
	# Make sure it is a php file (.php, .php4 or .php5)
	if echo "$line" | grep -Eiq '\.(php|php4|php5)$'
	then
		# create a random temp file
		php_file=`/usr/bin/mktemp -t php_syntax_check`

		# save the contents of this file (latest commit) to the temp file
		hg cat -r tip "$line" > "$php_file"

		# check the syntax
		php_syntax_output=`/usr/local/bin/php -l -d display_errors=1 -d error_reporting=4 -d html_errors=0 < "$php_file"`

		# remove the temp file
		rm -f "$php_file"

		# reject the push if PHP reported a parse error
		if echo "$php_syntax_output" | grep -q "Parse error"; then
			rm -f "$temp_file"
			exit 1
		fi
	fi
done

rm -f "$temp_file"

The above code will check the latest version of each file that is being changed when pushing to the server. It will only check files that have an extension of PHP, PHP4 or PHP5. The content of each file that is being pushed to the server is then stored in a temporary file and passed to PHP to check the syntax. If the syntax check fails, the program returns a 1 for failure which causes the entire push to fail so that no changes are pushed to the server. If there are no syntax errors, the hook exits normally and continues to push the files to the server.
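
An alternative to searching the output for "Parse error" is to look at the exit status of the lint command. The sketch below is in Python and makes two assumptions: that `php -l` exits with a non-zero status when a file fails the lint (worth confirming on your PHP build), and that `lint_ok` is simply a name chosen for this example. The command is passed in as a parameter so that the Python interpreter can stand in for PHP in the demo.

```python
import subprocess
import sys


def lint_ok(command, path):
    """Run `command + [path]` and report whether it exited with status 0."""
    proc = subprocess.Popen(command + [path],
                            stdout=subprocess.PIPE, stderr=subprocess.STDOUT)
    output, _ = proc.communicate()
    return proc.returncode == 0, output


# On a server the command would be something like ['/usr/local/bin/php', '-l'].
# Here the Python interpreter stands in for the linter so the sketch runs
# anywhere: the "path" argument is simply used as the exit status.
stand_in = [sys.executable, '-c', 'import sys; sys.exit(int(sys.argv[1]))']
print(lint_ok(stand_in, '0')[0])
print(lint_ok(stand_in, '1')[0])
```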

In order to install the above hook in Mercurial, simply add the following 2 lines to the .hgrc and/or the hgweb.config file.

[hooks]
pretxnchangegroup.syntax_check = /usr/home/mercurial/php_syntax.sh

Of course the path in the above line needs to be updated to where the bash script was saved. The bash script will most likely need to be updated to contain the correct paths as well.

With all of the above in place the following message will be displayed to the user when trying to push a file that has a syntax error:

running hook pretxnchangegroup.syntax_check: /usr/home/mercurial/php_syntax.sh
transaction abort!
rollback completed
abort: pretxnchangegroup.syntax_check hook exited with status 1
warning: commit.autopush hook exited with status 1

In-Process Hook Using Python

The major flaw with the above shell script is that it does not display a nice, informative error to the user when their push fails due to a syntax error. This is one advantage of using a Python in-process hook instead. I have written very similar logic in Python, which can be seen below:

# Written for the Python 2 based Mercurial releases of the time (note xrange)
import os
import re
import subprocess
import tempfile


def check(ui, repo, hooktype, node, **kwargs):
    # initialize variables
    error = ""
    fileSet = set()
    # Loop through each changeset being added to the repository
    for change_id in xrange(repo[node].rev(), len(repo)):
        # Loop through each file for the current changeset
        for currentFile in repo[change_id].files():
            # Only check PHP files (.php, .php4 or .php5)
            if re.search(r'\.(php|php4|php5)$', currentFile):
                # Build a unique list of each file that has changed
                fileSet.add(currentFile)
    # Grab the latest revision in the repository
    ctx = repo['tip']
    # Loop through each file that has changed
    for currentFile in fileSet:
        # Do not check the file if it is being deleted
        if currentFile not in ctx:
            continue
        # Create a unique temporary file
        fd, temp_file = tempfile.mkstemp(prefix='php_syntax_check.')
        # Save the contents of the current file to the temp file
        f = os.fdopen(fd, 'wb')
        f.write(ctx[currentFile].data())
        f.close()
        # Check the syntax of the current/temp file
        proc = subprocess.Popen('/usr/local/bin/php-cgi -l -d display_errors=1 -d error_reporting=4 -d html_errors=0 < %s' % temp_file, shell=True, stdout=subprocess.PIPE, stderr=subprocess.PIPE)
        # Retrieve the output of the syntax check
        out, err = proc.communicate()
        # Remove the temp file
        os.remove(temp_file)
        # Check for syntax errors and save them
        if 'Parse error' in out:
            error += "%s%s\n" % (out, currentFile)
    # Check if an error occurred in any of the files that were changed
    if error != "":
        # Display a message to the user about each file that contained a syntax error
        ui.warn("******************************************************" +
            error +
            "******************************************************\n")
        # Reject the changesets
        return 1
    # Accept the changesets
    return 0

This code is very similar in functionality to the shell script. It first builds a list of all of the files being pushed that have a PHP, PHP4 or PHP5 extension. Then it obtains the contents of each file that is being pushed and stores each one in a temporary file. It checks the syntax of each file and then cancels the push if one or more files have invalid syntax.

Since this is an in-process hook, it is able to display a nice message to the user about why the push was not allowed. This hook is also set up to check every single file and display a message about every file that has a syntax error. This allows the hook to display a message to the user such as the following:

******************************************************
Parse error: syntax error, unexpected T_ECHO in - on line 3
Errors parsing -
afile_test.php

Parse error: syntax error, unexpected '@' in - on line 15
Errors parsing -
anotherfile_test.php
******************************************************

In order to set up this hook with Mercurial, save the above Python code in a file named php_syntax.py that is on the PYTHONPATH. Then add the following two lines to the .hgrc and/or the hgweb.config file.

[hooks]
pretxnchangegroup.syntax_check = python:php_syntax.check

It is important to point out that the text on the right half of the equals sign tells Mercurial what to load. In this example, it says use Python, look for a file named php_syntax.py and call the function check.
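
To make the python:module.function convention concrete, here is a small sketch of how such a string can be split and resolved. This is not Mercurial's actual loader, only an illustration of the naming scheme; `resolve_hook` is a hypothetical helper, and the demo resolves a standard-library function so that it runs without php_syntax.py being present.

```python
import importlib


def resolve_hook(spec):
    """Turn 'python:package.module.function' into the function object."""
    prefix, _, dotted = spec.partition(':')
    if prefix != 'python' or '.' not in dotted:
        raise ValueError('expected python:module.function, got %r' % spec)
    # Everything before the last dot is the module, the rest is the function.
    module_name, _, function_name = dotted.rpartition('.')
    module = importlib.import_module(module_name)
    return getattr(module, function_name)


# 'python:php_syntax.check' would import php_syntax and return its check function.
# A standard-library function is used here so the sketch runs without that file.
join = resolve_hook('python:os.path.join')
print(join('hooks', 'php_syntax.py'))
```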

Also, when Mercurial runs as a long-lived process (for example, behind hgweb), it will need to be restarted after setting up the above hook and each time the hook is modified. This is because an in-process hook module stays loaded once Mercurial/Python has imported it.

Conclusion

Mercurial is a great SCM tool and can be very powerful when combined with either in-process or external hooks. In-process hooks provide much more control and are the preferred method in most cases. The examples above are just an introduction to Mercurial hooks and they can easily be modified for specific environments or checking the syntax of other languages.

Please leave a comment if you have found this code useful or want to share your experiences with Mercurial and hooks.

Format Credit Card with X's and Dashes using PHP (credit card masking)

Monday, March 22nd, 2010

I recently had a project where I needed to accomplish the following two tasks:

  1. Replace all but the last four digits of a credit card with X's
  2. Format the credit card with dashes in the appropriate places

There are many different approaches that can be taken to accomplish the above two tasks. The simplest approach would be to do something like the following:

<?php
echo 'XXXX-XXXX-XXXX-'.substr($cc,-4);
?>

I have often seen credit cards masked with the above approach. In most cases this solution will work fairly well. However, I am not a huge fan of this approach as it displays the credit card at a fixed length of 16 digits. This can be a bit confusing since credit cards can vary in length from 13 to 16 digits.

To better address this issue I put together two functions. One function is to apply a mask to a credit card and the other is to format the credit card with dashes. These functions will keep the original length of each credit card.

<?php

/**
 * Replaces all but the last four digits with X's in the given credit card number
 * @param int|string $cc The credit card number to mask
 * @return string The masked credit card number
 */
function MaskCreditCard($cc){
	// Work on a string so individual characters can be replaced
	$cc = (string)$cc;

	// Get the cc Length
	$cc_length = strlen($cc);

	// Replace all characters of credit card except the last four and dashes
	for($i=0; $i<$cc_length-4; $i++){
		if($cc[$i] == '-'){continue;}
		$cc[$i] = 'X';
	}

	// Return the masked Credit Card #
	return $cc;
}

/**
 * Add dashes to a credit card number.
 * @param int|string $cc The credit card number to format with dashes.
 * @return string The credit card with dashes.
 */
function FormatCreditCard($cc)
{
	// Clean out extra data that might be in the cc
	$cc = str_replace(array('-',' '),'',$cc);

	// Get the CC Length
	$cc_length = strlen($cc);

	// Initialize the new credit card to contain the last four digits
	$newCreditCard = substr($cc,-4);

	// Walk backwards through the credit card number and add a dash after every fourth digit
	for($i=$cc_length-5;$i>=0;$i--){
		// If on the fourth character add a dash
		if((($i+1)-$cc_length)%4 == 0){
			$newCreditCard = '-'.$newCreditCard;
		}
		// Add the current character to the new credit card
		$newCreditCard = $cc[$i].$newCreditCard;
	}

	// Return the formatted credit card number
	return $newCreditCard;
}

?>

Below are a couple examples of how to use these functions and the results they create.

<?php
echo MaskCreditCard('5362267121053405').'<br>'; // Prints XXXXXXXXXXXX3405
echo FormatCreditCard('5362267121053405').'<br>'; // Prints 5362-2671-2105-3405
echo FormatCreditCard(MaskCreditCard('5362267121053405')).'<br>'; // Prints XXXX-XXXX-XXXX-3405
?>
<?php
$creditCard[] = '5362267121053405'; // Mastercard
$creditCard[] = '4556189015881361'; // Visa 16
$creditCard[] = '4716904617062'; // Visa 13
$creditCard[] = '372348371455844'; // American Express
$creditCard[] = '6011757892594291'; // Discover
$creditCard[] = '30329445722959'; // Diners Club
$creditCard[] = '214927124363421'; // enRoute
$creditCard[] = '180012855304868'; // JCB 15
$creditCard[] = '3528066275370961'; // JCB 16
$creditCard[] = '8699775919'; // Voyager

for($i=0;$i<count($creditCard);$i++)
{
	echo FormatCreditCard(MaskCreditCard($creditCard[$i]))."\n";
}
?>

Output:

XXXX-XXXX-XXXX-3405
XXXX-XXXX-XXXX-1361
X-XXXX-XXXX-7062
XXX-XXXX-XXXX-5844
XXXX-XXXX-XXXX-4291
XX-XXXX-XXXX-2959
XXX-XXXX-XXXX-3421
XXX-XXXX-XXXX-4868
XXXX-XXXX-XXXX-0961
XX-XXXX-5919
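
For readers who want to cross-check the grouping logic outside PHP, the same idea can be sketched in Python. This is an equivalent written for illustration, not a port that has been tested against every card type; the names `mask_credit_card` and `format_credit_card` are chosen for this sketch.

```python
def mask_credit_card(cc):
    """Replace every character except dashes and the last four with X."""
    cc = str(cc)
    head, tail = cc[:-4], cc[-4:]
    return ''.join(ch if ch == '-' else 'X' for ch in head) + tail


def format_credit_card(cc):
    """Group from the right in fours, so any leftover digits lead."""
    cc = str(cc).replace('-', '').replace(' ', '')
    groups = []
    while cc:
        # Peel four characters off the end and put them at the front of the list.
        groups.insert(0, cc[-4:])
        cc = cc[:-4]
    return '-'.join(groups)


for number in ('5362267121053405', '4716904617062', '8699775919'):
    print(format_credit_card(mask_credit_card(number)))
```

Grouping from the right is what keeps the last four digits together and lets the shorter leading group absorb the difference in length.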

Let me know if you find these functions useful or have any suggestions on how to tweak them.