Microsoft applications have this nasty habit of replacing both your single and double quotes with "smarter" versions. They curve inwards and look really snazzy in Microsoft Word. When you cut and paste them, they arrive as raw, unencoded bytes, because Windows assumes that everyone is using windows-1252 or something like it. Unfortunately, that's pretty annoying if you're not using a Windows character set, so you may want to replace them using the regular expressions that follow.
ROTD stands for "regex of the day," in case you're wondering. And I was going for clarity here -- not efficiency, so don't point out that this could be written more efficiently. I know. Here is the code:
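<?php
// Replaces windows-1252 "smart" punctuation bytes with plain ASCII
// or, optionally, with the matching HTML entities.
function fixSmartQuotes($str, $replace_single_quotes = true, $replace_double_quotes = true, $replace_emdash = true, $use_entities = false)
{
    // Keys are windows-1252 byte values in decimal.
    $translation_table_ascii = array(
        145 => '\'', // left single quote
        146 => '\'', // right single quote
        147 => '"',  // left double quote
        148 => '"',  // right double quote
        151 => '-'   // em dash
    );
    $translation_table_entities = array(
        145 => '&lsquo;',
        146 => '&rsquo;',
        147 => '&ldquo;',
        148 => '&rdquo;',
        151 => '&mdash;'
    );
    $translation_table = ($use_entities ? $translation_table_entities : $translation_table_ascii);

    // Each pattern matches a single byte, e.g. '#\x91#' for decimal 145.
    if ($replace_single_quotes) {
        $str = preg_replace('#\x' . dechex(145) . '#', $translation_table[145], $str);
        $str = preg_replace('#\x' . dechex(146) . '#', $translation_table[146], $str);
    }
    if ($replace_double_quotes) {
        $str = preg_replace('#\x' . dechex(147) . '#', $translation_table[147], $str);
        $str = preg_replace('#\x' . dechex(148) . '#', $translation_table[148], $str);
    }
    if ($replace_emdash) {
        $str = preg_replace('#\x' . dechex(151) . '#', $translation_table[151], $str);
    }
    return $str;
}

// The test string is written as windows-1252 byte escapes so that it
// survives no matter which encoding this file is saved in.
echo fixSmartQuotes("\x93\x94\x92\x91\x97");
?>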
Seriously, these things are annoying -- especially for Linux users. I cut and paste Word documents into textareas all the time, and these characters appear everywhere. This function will save you from having to edit them out manually. Depending on which options you select, it replaces some or all of the "problem" characters. It can also encode the "smart" quotes as HTML entities, so you keep them instead of swapping in "dumb quotes." Happy programming!
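For anyone who does want the shorter version, here is a rough sketch of the same idea as a single strtr() pass. The name fixSmartQuotesCompact is made up for this example and is not part of the function above. It also assumes the input really is single-byte windows-1252 text; on UTF-8 input, those same byte values can show up inside multi-byte characters, so this would likely mangle them.

```php
<?php
// Hypothetical compact variant: one strtr() pass over a byte map.
// Assumes $str is single-byte windows-1252 text, not UTF-8.
function fixSmartQuotesCompact($str, $use_entities = false)
{
    // Keys are the raw windows-1252 bytes for each character.
    $ascii = array(
        "\x91" => "'",  // left single quote
        "\x92" => "'",  // right single quote
        "\x93" => '"',  // left double quote
        "\x94" => '"',  // right double quote
        "\x97" => '-',  // em dash
    );
    $entities = array(
        "\x91" => '&lsquo;',
        "\x92" => '&rsquo;',
        "\x93" => '&ldquo;',
        "\x94" => '&rdquo;',
        "\x97" => '&mdash;',
    );
    // With an array argument, strtr() replaces every key in one pass
    // and never re-scans text it has already replaced.
    return strtr($str, $use_entities ? $entities : $ascii);
}

echo fixSmartQuotesCompact("\x93Hi,\x94 she said \x97 it\x92s fine."), "\n";
// prints: "Hi," she said - it's fine.
```

This drops the per-character switches in exchange for brevity; if you need to leave, say, the em dash alone, unset that key from the map before calling strtr().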

July 21st, 2006 at 5:29 pm
"I cut and paste word documents into textareas all the time" me too.
January 17th, 2007 at 9:45 pm
You might want to include ellipsis (...) replacement too.