In my previous post, Removing Non-English Characters in PHP, I showed a way to strip non-English characters. I have since found a neat trick that, instead of removing them, converts accented characters into their closest ASCII equivalents.
Here’s the code:
function unaccent($string) {
    if (strpos($string = htmlentities($string, ENT_QUOTES, 'UTF-8'), '&') !== false) {
        $string = html_entity_decode(preg_replace('~&([a-z]{1,2})(?:acute|cedil|circ|grave|lig|orn|ring|slash|tilde|uml);~i', '$1', $string), ENT_QUOTES, 'UTF-8');
    }
    return $string;
}
…and here’s the source.
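To see what the function actually does, here is a small, self-contained sketch that repeats it with comments and runs it on a few sample strings. The sample strings are my own, and the snippet assumes the file is saved as UTF-8 and that PHP uses its default HTML 4.01 entity table.

```php
<?php
// Commented copy of unaccent() so this snippet runs on its own.
function unaccent($string) {
    // Step 1: turn every character that has a named HTML entity into that
    // entity, e.g. "é" becomes "&eacute;". If no "&" appears, nothing changed.
    if (strpos($string = htmlentities($string, ENT_QUOTES, 'UTF-8'), '&') !== false) {
        // Step 2: for entities such as &eacute; or &AElig;, keep only the
        // leading base letter(s) and drop the accent name ("acute", "lig", ...).
        // Step 3: decode whatever entities remain (&amp;, &quot;, ...) back
        // into ordinary characters.
        $string = html_entity_decode(
            preg_replace('~&([a-z]{1,2})(?:acute|cedil|circ|grave|lig|orn|ring|slash|tilde|uml);~i', '$1', $string),
            ENT_QUOTES,
            'UTF-8'
        );
    }
    return $string;
}

// Sample inputs chosen for illustration only.
echo unaccent('Crème brûlée'), "\n"; // Creme brulee
echo unaccent('Ærøskøbing'), "\n";   // AEroskobing
echo unaccent('Straße'), "\n";       // Strasze
echo unaccent('Fish & chips'), "\n"; // Fish & chips
```

Note the "Straße" case: the entity for "ß" is `&szlig;`, so the function yields "sz" rather than the more common "ss". Characters whose entity name does not end in one of the listed suffixes are left as they were.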