PHP Uppercase Sentences

14 Nov

I’ve never really needed to write code that uppercases only the first letter of each sentence in a string, but I’m now dealing with large blocks of text that authors often write in all uppercase. So I popped on over to the PHP manual. I was pretty certain no such function existed in the language, but thought I’d check whether one had been added in a newer version.

Alas, no such function existed, so I wrote one myself. While looking through the manual, though, I noticed a lot of different, very lengthy user-contributed functions for sentence-casing a string. For example, this snippet was posted by mattalexxpub:

function sentence_case($string) {
    $sentences = preg_split('/([.?!]+)/', $string, -1, PREG_SPLIT_NO_EMPTY|PREG_SPLIT_DELIM_CAPTURE);
    $new_string = '';
    foreach ($sentences as $key => $sentence) {
        $new_string .= ($key & 1) == 0?
            ucfirst(strtolower(trim($sentence))) :
            $sentence.' ';
    }
    return trim($new_string);
}

print sentence_case('HMM. WOW! WHAT?');

// Outputs: "Hmm. Wow! What?"

And this snippet was posted by adefoor:

function sentence_cap($impexp, $sentence_split) {
    $textbad=explode($impexp, $sentence_split);
    $newtext = array();
    foreach ($textbad as $sentence) {
        $sentencegood=ucfirst(strtolower($sentence));
        $newtext[] = $sentencegood;
    }
    $textgood = implode($impexp, $newtext);
    return $textgood;
}

$text = "this is a sentence. this is another sentence! this is the fourth sentence? no, this is the fourth sentence.";
$text = sentence_cap(". ",$text);
$text = sentence_cap("! ",$text);
$text = sentence_cap("? ",$text);

echo $text; // This is a sentence. This is another sentence! This is the fourth sentence? No, this is the fourth sentence.

The long examples continue. I don’t know if there are any better examples in the manual because there was too much to read through, but here’s my solution, which does the sentence uppercasing completely and in a single line of code:

function ucsentence($i_str, $i_lowercase = false) {
  $i_lowercase && ($i_str = strtolower($i_str));
  return preg_replace('/(^|[\.!?]"?\s+)([a-z])/e', '"$1" . ucfirst("$2")', $i_str);
}

You can pass any string as the first argument, and the function will uppercase the first letter of each sentence, including a sentence that follows punctuation closed by a double quote. If you pass a truthy second argument, the function first lowercases the entire input string. This is useful when the string is all uppercase; by default, the rest of the string is left untouched.


It’s pretty straightforward, but for those less familiar with regular expressions, I’ll explain. The function uses a regular expression to match either of two cases:
  1. the beginning of the string, followed by a lowercase letter
  2. a period (.), exclamation mark (!), or question mark (?), optionally followed by a double quotation mark ("), then one or more whitespace characters, followed by a lowercase letter

If either case matches, the e modifier makes preg_replace evaluate its second argument as PHP code, which prepends the leading character(s), if any, to the uppercased letter.
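One caveat about the /e modifier used above: it was deprecated in PHP 5.5 and removed in PHP 7.0, so on current PHP versions the replacement has to go through a callback instead. Here is a minimal sketch of the same idea using preg_replace_callback. The name ucsentence_cb is hypothetical, chosen only to keep it apart from the original function.

```php
<?php
// Hypothetical callback-based variant of ucsentence() for PHP versions
// where the /e modifier is no longer available.
function ucsentence_cb($i_str, $i_lowercase = false) {
    if ($i_lowercase) {
        $i_str = strtolower($i_str);
    }
    // Group 1: start of string, or [.!?] + optional double quote + whitespace.
    // Group 2: the lowercase letter that starts the sentence.
    return preg_replace_callback(
        '/(^|[.!?]"?\s+)([a-z])/',
        function ($m) {
            return $m[1] . strtoupper($m[2]);
        },
        $i_str
    );
}

echo ucsentence_cb('HMM. WOW! WHAT?', true), "\n";
// Hmm. Wow! What?
echo ucsentence_cb('he said "stop." then he left.'), "\n";
// He said "stop." Then he left.
```

The regular expression is the same one as in the original; only the mechanism that uppercases the captured letter changes.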

Et voilà. Your string is sentence-cased. There are, of course, some caveats. Your string could contain proper nouns, initials, or abbreviations, and such cases might not be handled as you’d like. Here are a couple of examples with undesired output:

$str = 'A FOX NAMED MURPHY BROWN JUMPED OVER THE LAZY DOGS.';
echo ucsentence($str, true); // make it lowercase first because it's all upper
// desired output: A fox named Murphy Brown jumped over the lazy dogs.
// actual output: A fox named murphy brown jumped over the lazy dogs.

$str = 'the bottle contained five oz. of water.';
echo ucsentence($str);
// desired output: The bottle contained five oz. of water.
// actual output: The bottle contained five oz. Of water.

Unfortunately, there’s not much we can do about cases like these short of building an extensive list of special circumstances in which a period should not be treated as the end of a sentence. I think we’ll survive with it as-is.
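For the curious, here is a rough sketch of what such an exception list could look like. It leaves the next word alone when the word right before a period is a known abbreviation. The function name ucsentence_abbr, its second parameter, and the sample abbreviations are all assumptions made for illustration. It also uses preg_replace_callback rather than the /e modifier so that it runs on current PHP versions.

```php
<?php
// Hypothetical sketch: like ucsentence(), but a period that directly follows
// a listed abbreviation is not treated as the end of a sentence.
// The default abbreviation list is illustrative, not exhaustive.
function ucsentence_abbr($str, array $abbreviations = array('oz', 'lb', 'etc')) {
    $result = preg_replace_callback(
        // Group 1: the word (possibly empty) directly before the punctuation.
        // Group 2: punctuation, optional double quote, then whitespace.
        // Group 3: the lowercase letter that follows.
        '/([A-Za-z]*)([.!?]"?\s+)([a-z])/',
        function ($m) use ($abbreviations) {
            $after_abbreviation = $m[2][0] === '.'
                && in_array(strtolower($m[1]), $abbreviations, true);
            // Leave the text untouched after an abbreviation; otherwise
            // uppercase the first letter of the new sentence.
            return $after_abbreviation ? $m[0] : $m[1] . $m[2] . strtoupper($m[3]);
        },
        $str
    );
    // The start of the string is handled separately.
    return ucfirst($result);
}

echo ucsentence_abbr('the bottle contained five oz. of water. it was cold.'), "\n";
// The bottle contained five oz. of water. It was cold.
```

The trade-off is the mirror image of the original problem: a sentence that really does end in a listed abbreviation will no longer have its successor capitalized, and proper nouns are still not handled at all.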

 
2 Comments

Posted in PHP

 

  1. Steve

    January 9, 2013 at 5:33 am

    That’s elegant man! Like it.

    I was on a search to find a way to turn the string OneTwoThree to One-Two-Three using php and came across this. If I find a solution I’ll post it if you like.

     
  2. Steve

    January 9, 2013 at 5:57 am

    Haha, I found the solution. Gotta love Google

    function splitAtUpperCase($s) {

    return preg_replace('/(?<!^)([A-Z])/', '-\1', $s);
    }
    echo splitAtUpperCase($s);