SEO Egghead by Jaimie Sirovich: A blog about SEO, written for nerds, by a nerd.

Fri 21 Jul '06

ROTD: Replacing "Smart" Quotes with "Dumb" Quotes

Microsoft applications have this nasty habit of replacing both your single and double quotes with "smarter" versions.  They curve inwards and look really snazzy in Microsoft Word.  When you cut and paste them, though, they arrive as raw windows-1252 bytes (145 through 148 for the quotes, 151 for the em dash), as Windows assumes that everyone is using windows-1252 or something.  Unfortunately, that's pretty annoying if you're not using a Windows character set, because those bytes don't display as quotes.  So you may want to alter them using the regular expressions that follow.

ROTD stands for "regex of the day," in case you're wondering.  And I was going for clarity here -- not efficiency, so don't point out that this could be written more efficiently.  I know.  Here is the code:

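<?php

// Replace windows-1252 "smart" punctuation bytes with plain ASCII
// characters or, optionally, with the matching HTML entities.
function fixSmartQuotes($str, $replace_single_quotes = true, $replace_double_quotes = true, $replace_emdash = true, $use_entities = false)
{

    // Keys are windows-1252 byte values in decimal.
    $translation_table_ascii = array(
        145 => '\'',
        146 => '\'',
        147 => '"',
        148 => '"',
        151 => '-'
    );

    $translation_table_entities = array(
        145 => '&lsquo;',
        146 => '&rsquo;',
        147 => '&ldquo;',
        148 => '&rdquo;',
        151 => '&mdash;'
    );

    $translation_table = ($use_entities ? $translation_table_entities : $translation_table_ascii);

    // Each pattern is built as '#\xNN#', which matches the single byte 0xNN.
    if ($replace_single_quotes) {
        $str = preg_replace('#\x' . dechex(145) . '#', $translation_table[145], $str);
        $str = preg_replace('#\x' . dechex(146) . '#', $translation_table[146], $str);
    }

    if ($replace_double_quotes) {
        $str = preg_replace('#\x' . dechex(147) . '#', $translation_table[147], $str);
        $str = preg_replace('#\x' . dechex(148) . '#', $translation_table[148], $str);
    }

    if ($replace_emdash) {
        $str = preg_replace('#\x' . dechex(151) . '#', $translation_table[151], $str);
    }

    return $str;

}

// The test string is written as escaped windows-1252 bytes (left and right
// double quotes, right and left single quotes, em dash) so that it works
// regardless of the encoding this file is saved in.
echo fixSmartQuotes("\x93\x94\x92\x91\x97");

?>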

Seriously, these things are annoying -- especially for Linux users; I cut and paste Word documents into textareas all the time, and these characters appear everywhere.  This function will save you from having to edit them out manually.  Depending on which options you pass, it replaces some or all of the "problem" characters.  Pass true for $use_entities and it encodes the "smart" quotes as HTML entities, retaining them instead of replacing them with "dumb" quotes.  Happy programming!
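
To make the options concrete, here is a minimal sketch of the kinds of calls described above.  So that it runs on its own, it uses a compact stand-in, fixSmartQuotesSketch(), built on strtr() rather than the regexes in the post.  That function name, its shortened parameter names, and the sample input are assumptions made for illustration; they are not part of the original function.

```php
<?php

// Hypothetical stand-in for fixSmartQuotes(): same options in the same order,
// but it builds one byte-to-replacement map and applies it with strtr().
function fixSmartQuotesSketch($str, $single = true, $double = true, $emdash = true, $use_entities = false)
{
    $map = array();

    if ($single) {
        $map["\x91"] = $use_entities ? '&lsquo;' : "'";
        $map["\x92"] = $use_entities ? '&rsquo;' : "'";
    }

    if ($double) {
        $map["\x93"] = $use_entities ? '&ldquo;' : '"';
        $map["\x94"] = $use_entities ? '&rdquo;' : '"';
    }

    if ($emdash) {
        $map["\x97"] = $use_entities ? '&mdash;' : '-';
    }

    // With every option switched off there is nothing to replace.
    return $map ? strtr($str, $map) : $str;
}

// windows-1252 bytes: left/right double quotes, em dash, right single quote.
$input = "\x93Hi\x94 \x97 it\x92s";

echo fixSmartQuotesSketch($input), "\n";                         // "Hi" - it's
echo fixSmartQuotesSketch($input, true, true, true, true), "\n"; // &ldquo;Hi&rdquo; &mdash; it&rsquo;s

// Leave the single quote alone and only fix double quotes and the dash.
$partial = fixSmartQuotesSketch($input, false);

?>
```

In the last call, $partial still contains the raw \x92 byte, because the single-quote option is off.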


2 Responses to “ROTD: Replacing "Smart" Quotes with "Dumb" Quotes”

  1. rxbbx Says:

    "I cut and paste word documents into textareas all the time" me too.

  2. Francis Says:

    You might want to include ellipsis (...) replacement too
