kennypoo

I hate the stupid odd characters that sometimes seem to magically appear from nowhere in text. Some call them odd characters, some call them special characters; I call them non-ASCII characters, but whatever you call them, they are inconvenient. Often they come along with text pasted into a content management system from Microsoft Word. Today I had a large galley of copy filled with these invalid characters that I needed to clean up. Instead of going through and removing each instance, I simply wrote a regular expression that takes the text string and replaces them all with nothing. It works great; the only problem is that in some instances these may represent real characters you need, like ‘ or “, so you have to be careful using a function like this. If you don’t need them, then give it a try:

<?php
    $output = "Clean this copy of invalid non ASCII äócharacters.";
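    // Strip every byte outside printable ASCII (space through tilde).
    // Note: this also removes tabs and newlines.
    $output = preg_replace('/[^\x20-\x7E]+/', '', $output);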
    echo($output);
?>
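
If you do need to keep things like curly quotes, one option is to swap them for plain ASCII equivalents before stripping the rest. Here is a minimal sketch of that idea. The function name `clean_to_ascii` and the mapping table are my own assumptions for illustration, not part of the snippet above, and it assumes UTF-8 input and PHP 7 or later (for the `\u{...}` escapes and type hints).

```php
<?php
// Hypothetical helper: replace common Word punctuation with ASCII
// equivalents, then strip whatever non-ASCII bytes remain.
function clean_to_ascii(string $text): string
{
    // Assumed mapping; extend it with whatever characters your copy uses.
    $map = [
        "\u{2018}" => "'",   // left single quote
        "\u{2019}" => "'",   // right single quote / apostrophe
        "\u{201C}" => '"',   // left double quote
        "\u{201D}" => '"',   // right double quote
        "\u{2013}" => '-',   // en dash
        "\u{2014}" => '-',   // em dash
        "\u{2026}" => '...', // ellipsis
    ];
    $text = strtr($text, $map);

    // Remove every byte outside printable ASCII (space through tilde).
    return preg_replace('/[^\x20-\x7E]+/', '', $text);
}

echo clean_to_ascii("Don\u{2019}t lose \u{201C}quotes\u{201D} \u{2013} clean them.");
```

Anything not listed in the map still gets dropped, so accented letters and the like will disappear unless you add them to the table yourself.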