Encode characters as numeric entity references with PHP

While developing a feed for the Lovelite web site, I needed a way to encode content from the database as numeric entity references. That’s a safe way to use non-ASCII and special characters in an XML document such as an RSS feed. So, for example, the German umlaut ü becomes &#252;.

I discovered a neat little function in the user-contributed PHP documentation which does just this, and I thought I’d share it with you:

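function xml_character_encode($string, $trans = '') {
    // Use the caller's table if given; otherwise PHP's entity table, keyed by ISO-8859-1 bytes.
    // ord() reads a single byte, so $string is expected to be ISO-8859-1 as well.
    $trans = is_array($trans) ? $trans : get_html_translation_table(HTML_ENTITIES, ENT_QUOTES, 'ISO-8859-1');
    // Swap each named entity for the numeric reference of its character code.
    foreach ($trans as $k => $v) $trans[$k] = "&#" . ord($k) . ";";
    return strtr($string, $trans);
}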
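
The function above relies on ord(), which reads a single byte, so it suits single-byte encodings such as ISO-8859-1. If your content is stored as UTF-8, something along the following lines might work instead. This is only a sketch, not part of the function from the PHP documentation: the name xml_numeric_encode_utf8 is made up for illustration, and it assumes PHP 7.2 or later with the mbstring extension (for mb_ord) and valid UTF-8 input.

```php
<?php
// Hypothetical helper (not from the original post): turns XML-special
// characters and every non-ASCII character of a UTF-8 string into
// decimal numeric character references.
function xml_numeric_encode_utf8($string) {
    $result = preg_replace_callback(
        // Match & < > " ' or any code point above U+007F (the /u flag enables UTF-8 mode).
        '/[&<>"\']|[^\x00-\x7F]/u',
        function ($m) {
            // mb_ord() returns the Unicode code point of the matched character.
            return '&#' . mb_ord($m[0], 'UTF-8') . ';';
        },
        $string
    );
    // preg_replace_callback() returns null when the input is not valid UTF-8.
    if ($result === null) {
        throw new InvalidArgumentException('Input is not valid UTF-8');
    }
    return $result;
}

echo xml_numeric_encode_utf8("Grüße & <b>"), "\n";
// prints: Gr&#252;&#223;e &#38; &#60;b&#62;
```

Plain ASCII text passes through unchanged, so the output can be dropped straight into an RSS item.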

I think it would even make sense to integrate this functionality into PHP.

