Creating a Simple Spam Filter in PHP


I started out writing a post that takes an in-depth look at creating and using objects in PHP. That post will follow soon enough; I want to make sure that I am fully prepared to teach all I can about objects in PHP. What I ended up with here instead is a super simple spam filter for email. The form is easy: you enter your name, your email address, your password (hidden), and the domain of the person sending you email. It’s a handy little app for when you receive an email from someone and you’re not sure whether or not they’re spamming you.

Our Front End

We start with our front end, setting up our form and our result pages. First our index.php file:

<!doctype html>
<html>
<head>
	<title>Check for Spam</title>
	<link rel="stylesheet" type="text/css" href="https://maxcdn.bootstrapcdn.com/bootstrap/3.3.4/css/bootstrap.min.css">
	<link rel="stylesheet" type="text/css" href="styles.css">
</head>
<body>
	<h2>Let's find some spam</h2>	
	<form class="form" action="objects.php" method="post">
		<div class="form-group">
			<label>Name</label>
			<input class="form-control" type="text" name="name"><br>
		</div>
		<div class="form-group">
			<label>Email</label>
			<input class="form-control" type="text" name="email"><br>
		</div>
		<div class="form-group">
			<label>Password</label>
			<input class="form-control" type="password" name="password"><br>
		</div>
		<div class="form-group">
			<label>From</label>
			<input class="form-control" type="text" name="from">
		</div>
		<button type="submit" class="btn btn-default">Submit</button>
	</form>

	<script type="text/javascript" src="https://code.jquery.com/jquery-2.1.4.min.js"></script>
	<script type="text/javascript" src="https://maxcdn.bootstrapcdn.com/bootstrap/3.3.4/js/bootstrap.min.js"></script>
</body>
</html>

As in all of my projects, I am using Bootstrap to take care of the styling and layout details.

After our form is set up, I created the results pages: one for when we find spam and one for when the mail is clean. They are as follows:

Our Dirty One

<!doctype html>
<html>
<head>
	<title>Check for Spam</title>
	<link rel="stylesheet" type="text/css" href="https://maxcdn.bootstrapcdn.com/bootstrap/3.3.4/css/bootstrap.min.css">
	<link rel="stylesheet" type="text/css" href="styles.css">
</head>
<body>
	<div class="">
		<h2 class="label-danger">This person sent you spam!</h2>	
	</div>
	<script type="text/javascript" src="https://code.jquery.com/jquery-2.1.4.min.js"></script>
	<script type="text/javascript" src="https://maxcdn.bootstrapcdn.com/bootstrap/3.3.4/js/bootstrap.min.js"></script>
</body>
</html>

Our Clean One

<!doctype html>
<html>
<head>
	<title>Check for Spam</title>
	<link rel="stylesheet" type="text/css" href="https://maxcdn.bootstrapcdn.com/bootstrap/3.3.4/css/bootstrap.min.css">
	<link rel="stylesheet" type="text/css" href="styles.css">
</head>
<body>
	<div class="">
		<h2 class="label-success">There's no spam! Yay!</h2>	
	</div>
	<script type="text/javascript" src="https://code.jquery.com/jquery-2.1.4.min.js"></script>
	<script type="text/javascript" src="https://maxcdn.bootstrapcdn.com/bootstrap/3.3.4/js/bootstrap.min.js"></script>
</body>
</html>

Our Back End

After we set up our front end, we can look at how to filter messages. The way I built it requires some explanation. We start by entering the domain of the messages we want to check. Warning: this will check all messages from that domain. We open a stream to the mail server and extract the body of each message. We then check that text against a predefined list of words that we know to be associated with spam. We do all this with some PHP functions that are new to me. I will explain them after the code; below is our objects.php file:

<?php
$hostname = '{imap.gmail.com:993/imap/ssl}INBOX';
$username = $_POST['email'];
$password = $_POST['password'];
$from = $_POST['from'];


$inbox = imap_open($hostname, $username, $password) or die('Cannot connect');

$emails = imap_search($inbox,'FROM "@'. $from . '"');

// Start with an empty string so $output exists even when no email matches
$output = '';

if($emails) {
	// newest messages first
	rsort($emails);

	foreach($emails as $email_number) {
		// fetch section 2 of the message and append it to one long string
		$message = imap_fetchbody($inbox, $email_number,2);
		$output.='<div class="body">'. $message . "</div>";
	}
}

imap_close($inbox);
/////////////////////////////////////////////////////////////////////////////////
// Function for checking for spam
function checkString($text){
	// words we treat as spam indicators ($this is reserved in PHP, so the parameter is $text)
	$filter_list = array("email", "computer", "microsoft", "windows");
	// go through the list and look for each keyword inside the message text
	foreach ($filter_list as $key) {
		// strpos() returns false when the keyword is absent (the match is case-sensitive)
		if (strpos($text, $key) !== false) {
			// one match is enough: send the bad result and stop the script
			header( 'Location: http://localhost/objects/result_bad.html' );
			exit;
		}
	}
	// no keyword matched anywhere, so the mail is clean
	header( 'Location: http://localhost/objects/result_good.html' );
	exit;
}

checkString($output);
?>

How does all this work?

Now let’s go through what’s happening here. We start by creating variables from our form inputs to be used in the script. We then use these variables to open a stream with the imap_open() function. This function, as the name suggests, allows us to connect to an IMAP server to read the emails. In this example we are connecting to Gmail’s IMAP server. While our stream is open we have some options for what we want to read. In this case we use the imap_search() function, which queries the server and filters our inbox by the criteria we give it. Here we are filtering on the “FROM” line of the email, using the domain entered in the form.
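The domain typed into the form ends up inside the quoted search string passed to imap_search(), so it is worth cleaning it first. Here is a minimal sketch, assuming a hypothetical helper named buildFromCriterion() (it is not part of the original script) that keeps only characters that can appear in a domain name and then builds the FROM criterion:

```php
<?php
// Hypothetical helper, not part of the original objects.php.
// Builds the criteria string for imap_search() from a user-supplied domain.
function buildFromCriterion(string $domain): string
{
    // Drop surrounding whitespace and a leading "@" if the user typed one
    $domain = ltrim(trim($domain), '@');

    // Keep only letters, digits, dots and hyphens so a stray quote
    // cannot break out of the quoted search string
    $domain = preg_replace('/[^A-Za-z0-9.\-]/', '', $domain);

    if ($domain === '') {
        throw new InvalidArgumentException('No usable domain given');
    }

    return 'FROM "@' . $domain . '"';
}

// Usage: the result would be the second argument to imap_search($inbox, ...)
echo buildFromCriterion(' @example.com '), PHP_EOL; // prints FROM "@example.com"
```

The same idea applies to the email field: PHP's filter_var() with FILTER_VALIDATE_EMAIL can reject an obviously malformed address before the script tries to log in with it.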

When imap_search() finds matching emails, it returns their message numbers in an array. We sort that array with rsort() so the newest messages come first, then use a foreach loop to fetch the body of each message with imap_fetchbody() and append it to one long string. Along the way we wrap each body in a little HTML so it would be readable if printed. We then use the imap_close() function to close our open stream.
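One detail the script skips: imap_fetchbody() returns the body part as it was transferred, which is often base64 or quoted-printable encoded, so keywords may not match until the text is decoded. Here is a minimal sketch, assuming a hypothetical helper decodeBody() and assuming the encoding number is read from the `encoding` property that imap_fetchstructure() reports for the part (3 for base64, 4 for quoted-printable):

```php
<?php
// Hypothetical helper, not part of the original objects.php.
// $encoding uses the numeric codes reported by imap_fetchstructure():
// 3 = BASE64, 4 = QUOTED-PRINTABLE; any other value is returned unchanged.
function decodeBody(string $raw, int $encoding): string
{
    switch ($encoding) {
        case 3:
            return base64_decode($raw);
        case 4:
            return quoted_printable_decode($raw);
        default:
            return $raw;
    }
}

// Usage with sample data instead of a live mailbox
echo decodeBody(base64_encode('Free windows upgrade'), 3), PHP_EOL;
echo decodeBody('Price =3D 0', 4), PHP_EOL;
```

Running the decoded text through strip_tags() as well would keep HTML markup from hiding or faking a keyword match.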

Finding Spam

So now that we have our email body, we can start digging into whether or not it contains any spam. This one is pretty simple. We have an array of predefined words that we expect to appear in spam messages. After we declare our words, we create a foreach loop to go through them and compare each one to the body of the message. With the help of the strpos() function we can determine whether a word exists within the body of our message. If one of the words has been located, we are redirected to the bad results page. If the message is clean of flagged words, we are redirected to the good results page.
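The matching step can also be kept apart from the redirect, which makes it easier to try out on its own. Here is a minimal sketch, assuming a hypothetical function findSpamWords() (not in the original script) that uses stripos() instead of strpos() so that "Windows" and "windows" both count:

```php
<?php
// Hypothetical helper, not part of the original objects.php.
// Returns the flagged words found in $text, ignoring case.
function findSpamWords(string $text, array $words): array
{
    $found = array();
    foreach ($words as $word) {
        // stripos() is the case-insensitive version of strpos();
        // it returns false when the word is absent
        if (stripos($text, $word) !== false) {
            $found[] = $word;
        }
    }
    return $found;
}

$filter_list = array('email', 'computer', 'microsoft', 'windows');
$hits = findSpamWords('Your Windows computer needs an update', $filter_list);

// An empty array means the message is clean
echo $hits ? 'spam: ' . implode(', ', $hits) : 'clean', PHP_EOL; // prints spam: computer, windows
```

Returning the list of hits rather than redirecting right away also leaves room to show the user which words triggered the result.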

Conclusion

We used a few new functions today, ones that I am not used to. I was happy to try them out, because now I can help create APIs that include email filtering. That is just one of many uses for those functions. As with everything else, I am happy to learn new stuff and to pass on that knowledge. I hope you found this article helpful. If there are any questions or comments, feel free to post them below. Can’t wait to hear from you!

More Links

Check out the code on GitHub

Huge shout out to David Walsh for helping me learn how to connect to an email server.

IMAP Functions

2 replies
  1. Sajjad
    Sajjad says:

    I hope this script is also effective,

    <?php
    function isspam($text)
    {
    $sfil[0] = "link";
    $sfil[1] = "http";
    $sfil[2] = "www";
    $sfil[3] = "any slang";
    $sfil[4] = "any word";
    $sfil[5] = "any website";
    $text = str_replace(" ", "", $text);
    $text = strtolower($text);
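    // loop header and test below are an assumed reconstruction; the original text between "$i" and "0)" was lost
    for($i=0;$i<count($sfil);$i++) {
    if(substr_count($text, $sfil[$i]) > 0)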
    {
    return true;
    }
    }

    return false;
    }
    ?>
