Working With CSV Flat Files: Storage

If your server supports PHP but not MySQL — or perhaps you don’t want to learn SQL syntax — you may be despairing of finding an effective storage solution? Don’t lose hope; you can store all sorts of content in plain text files as simple as pie. (Mmm, pie!)

What’s a “CSV”?

CSV stands for “Comma Separated Values” and is a flat file format. Its usage stretches back to the early days of computing, so finding programs to open and edit CSVs isn’t hard and there’s no reason why that compatibility can’t be extended to the web. CSV files are commonly used in place of more complex database solutions, where one CSV file is equivalent to one table in a standard database. Each entry in the CSV is the same as a row in a database.

Although the “Comma” in CSV usually indicates values are separated by exactly that (a comma: “,”) any separator can be used. However, there are usually no spaces left between separators. E.g. 23,John,Doe would be an acceptable way to store Age, First Name, Last Name, but 23, John, Doe would not.

Sometimes you may need to use the separator inside one of the field values. So that the separator is not processed as a field deliminator, the entire field must be wrapped in quotation marks, like so: 23,John,Doe,"Likes: chocolate, pie".

Creating a CSV

Although you can actually save files with a .csv extension to indicate a CSV file, it is just as feasible to save them as .txt or any other file type. As long as the contents remain consistent and follow the CSV “rules” as laid out above, the filetype is not important in this context. I prefer to use .txt

Creating a .txt for the purpose of a CSV is exactly the same as creating a normal .txt

Storing Data in our CSV

For the purpose of this tutorial I’ve decided to use a website survey system, just like the MySQL Storage tutorial. This will allow you to see the similarities between a flat file database and a MySQL database.

The Form

For the sake of brevity, I’m recycling the form code from the MySQL Storage tutorial. This is as follows (and should be saved as form.php or something similar):

<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01//EN"
"http://www.w3.org/TR/html4/strict.dtd">

<html>
<head>
    <title>Visitor Survey Form</title>
</head>
<body>
    <form method="post" action="form-process.php">

    Name:<br>
    <input type="text" name="name"><br>
    E-mail:<br>
    <input type="text" name="email"><br>
    Country:<br>
    <select name="country">
        <option>United Kingdom</option>
        <option>United States</option>
        <option>Other</option>
    </select><br>
    Overall Score:<br>
    <select name="score">
        <option>1</option>
        <option>2</option>
        <option>3</option>
        <option>4</option>
        <option>5</option>
    </select><br>
    Comments:<br>
    <textarea name="comments" rows="10" cols="30"></textarea><br>

    <input type="submit" name="submit">

    </form>
</body>
</html>

We now know our fields are as follows: Name, E-mail, Country, Score and Comments — this will help us form a structure for our CSV.

Processing the Data

The next step is to take the data and process it ready to store. This “processing” involves cleaning the data of any HTML, checking for common exploits and just making sure everything is hunky dory. To save time and to prevent me from needlessly repeating myself, you can read the PHP Mail Form tutorial for detailed information on cleaning the data and implementing spam protection. This code is pretty generic and will work well in any script that deals with fetching data from forms. In short, the code we need is as follows:

<?php
if (!isset($_POST['submit']) || $_SERVER['REQUEST_METHOD'] != "POST") {
    exit("<p>You did not press the submit button; this page should not be accessed directly.</p>");
} else {
    $exploits = "/(content-type|bcc:|cc:|document.cookie|onclick|onload|javascript|alert)/i";
    $profanity = "/(beastial|bestial|blowjob|clit|cock|cum|cunilingus|cunillingus|cunnilingus|cunt|ejaculate|fag|felatio|fellatio|fuck|fuk|fuks|gangbang|gangbanged|gangbangs|hotsex|jism|jiz|kock|kondum|kum|kunilingus|orgasim|orgasims|orgasm|orgasms|phonesex|phuk|phuq|porn|pussies|pussy|spunk|xxx)/i";
    $spamwords = "/(viagra|phentermine|tramadol|adipex|advai|alprazolam|ambien|ambian|amoxicillin|antivert|blackjack|backgammon|texas|holdem|poker|carisoprodol|ciara|ciprofloxacin|debt|dating|porn)/i";
    $bots = "/(Indy|Blaiz|Java|libwww-perl|Python|OutfoxBot|User-Agent|PycURL|AlphaServer)/i";

    if (preg_match($bots, $_SERVER['HTTP_USER_AGENT'])) {
        exit("<p>Known spam bots are not allowed.</p>");
    }
    foreach ($_POST as $key => $value) {
        $value = trim($value);

        if (empty($value)) {
            exit("<p>Empty fields are not allowed. Please go back and fill in the form properly.</p>");
        } elseif (preg_match($exploits, $value)) {
            exit("<p>Exploits/malicious scripting attributes aren't allowed.</p>");
        } elseif (preg_match($profanity, $value) || preg_match($spamwords, $value)) {
            exit("<p>That kind of language is not allowed through our form.</p>");
        }

		// add slashes to prevent quotation marks/etc breaking our CSV
        if (!get_magic_quotes_gpc()) $value = addslashes($value);
        $$key = strip_tags($value);
    }

    if (!ereg("^[_a-z0-9-]+(.[_a-z0-9-]+)*@[a-z0-9-]+(.[a-z0-9-]+)*(.[a-z]{2,6})$",strtolower($email))) {
        exit("<p>That e-mail address is not valid, please use another.</p>");
    }

	// Data is checked and cleaned, we can store it now...

}
?>

We now have a gap in which we’ll be placing our storage code. PHP has lots of built in file manipulation functions which allow us to easily interact with the .txt file we created earlier. In an ideal world, we’d use one line of code and sort it all with file_put_contents() but that is only supported in PHP5 and above. While a lot of hosts rely on PHP4 we need to stick to the most compatible code; that is, using fopen(), fwrite() and fclose() successively.

fopen(), fwrite() and fclose()

To manipulate a file in PHP, we first need to “open” it for reading and writing. In our case we’d want to write to it to store our survey information, so we use the write mode (“a”). We assign the open file to a variable so that PHP knows which file we’re working with further down. Sometimes, if the file is not available or if the permissions are incorrect, fopen() will return an error; to prevent wouldbe malicious users from using any information returned to crack our system, we can customise the error message too:

$file = fopen("surveyfile.txt", "a") or die("File couldn't be opened");

Once the file is open, we can prepare our data and write it to the file. In true and “proper” CSV form, we shall use commas to separate data, and just in case the Comments contain commas we’ll wrap that field in quotation marks (escaped with backslashes, to prevent PHP thinking we’re done with the variable too early):

$data = "$name,$email,$country,"$comments"";
fwrite($file, $data);

To warn us in case of data writing failure, we could make things a bit more complex and change the simple fwrite() to include an if statement, but otherwise it’s a very simple procedure.

Once done, we close the file again to free it up for other users to write to (and that’s the simplest part of all)

fclose($file);

All together, including in our form processing we have:

<?php
if (!isset($_POST['submit']) || $_SERVER['REQUEST_METHOD'] != "POST") {
    exit("<p>You did not press the submit button; this page should not be accessed directly.</p>");
} else {
    $exploits = "/(content-type|bcc:|cc:|document.cookie|onclick|onload|javascript|alert)/i";
    $profanity = "/(beastial|bestial|blowjob|clit|cock|cum|cunilingus|cunillingus|cunnilingus|cunt|ejaculate|fag|felatio|fellatio|fuck|fuk|fuks|gangbang|gangbanged|gangbangs|hotsex|jism|jiz|kock|kondum|kum|kunilingus|orgasim|orgasims|orgasm|orgasms|phonesex|phuk|phuq|porn|pussies|pussy|spunk|xxx)/i";
    $spamwords = "/(viagra|phentermine|tramadol|adipex|advai|alprazolam|ambien|ambian|amoxicillin|antivert|blackjack|backgammon|texas|holdem|poker|carisoprodol|ciara|ciprofloxacin|debt|dating|porn)/i";
    $bots = "/(Indy|Blaiz|Java|libwww-perl|Python|OutfoxBot|User-Agent|PycURL|AlphaServer)/i";

    if (preg_match($bots, $_SERVER['HTTP_USER_AGENT'])) {
        exit("<p>Known spam bots are not allowed.</p>");
    }
    foreach ($_POST as $key => $value) {
        $value = trim($value);

        if (empty($value)) {
            exit("<p>Empty fields are not allowed. Please go back and fill in the form properly.</p>");
        } elseif (preg_match($exploits, $value)) {
            exit("<p>Exploits/malicious scripting attributes aren't allowed.</p>");
        } elseif (preg_match($profanity, $value) || preg_match($spamwords, $value)) {
            exit("<p>That kind of language is not allowed through our form.</p>");
        }

		// add slashes to prevent quotation marks/etc breaking our CSV
        if (!get_magic_quotes_gpc()) $value = addslashes($value);
        $$key = strip_tags($value);
    }

    if (!ereg("^[_a-z0-9-]+(.[_a-z0-9-]+)*@[a-z0-9-]+(.[a-z0-9-]+)*(.[a-z]{2,6})$",strtolower($email))) {
        exit("<p>That e-mail address is not valid, please use another.</p>");
    }

	// Data is checked and cleaned, we can store it now...
	$file = fopen("surveyfile.txt", "a") or die("File couldn't be opened");
	$data = "$name,$email,$country,"$comments"";
	if (fwrite($file, $data))
		echo '<p>Survey data successfully added :)</p>';
	fclose($file);
}
?>

Save the code as form-process.php, and that is that. You now have a simple script which stores survey data in a small text file, all ready for displaying on your web page!

Speak Your Mind

*