I am tinkering with a domain name finder and want to favour words that are easy to pronounce.

Example: nameoic.com (bad) versus namelet.com (good).

I was thinking something involving soundex might be appropriate, but it does not look like I can use it to produce any kind of comparative score.

PHP code for the win.

Here is a function that should work with the most common of words... It should give you a nice result between 1 (perfect pronounceability according to the rules) and 0.

The following function is far from perfect (it does not quite like words such as Tsunami [0.857]). But it should be fairly easy to tweak for your needs.


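// Score: 1
echo pronounceability('namelet') . "\n";

// Score: 0.71428571428571
echo pronounceability('nameoic') . "\n";

function pronounceability($word) {

    // Assumed lists, tweak as needed: vowels, and letter groups scored as a unit
    $vowels = array('a', 'e', 'i', 'o', 'u', 'y');
    $composites = array('mm', 'll', 'th', 'ing');

    $word = strtolower($word);

    // Special case
    if ($word == 'a') return 1;

    $len = strlen($word);

    // Let's not parse an empty string
    if ($len == 0) return 0;

    $score = 0;
    $pos = 0;

    while ($pos < $len) {

        // Check for permitted composites
        foreach ($composites as $comp) {
            $complen = strlen($comp);
            if (substr($word, $pos, $complen) === $comp) {
                $score += $complen;
                $pos += $complen;
                continue 2;
            }
        }

        // Is it a vowel? If so, check that the previous letter wasn't a vowel too.
        if (in_array($word[$pos], $vowels)) {
            if ($pos > 0 && !in_array($word[$pos - 1], $vowels)) {
                $score += 1;
                $pos += 1;
                continue;
            }
        } else {
            // Not a vowel: check whether the next letter is one, or whether this is the end of the word
            if (($pos + 1) < $len && in_array($word[$pos + 1], $vowels)) {
                $score += 2;
                $pos += 2;
                continue;
            } elseif (($pos + 1) == $len) {
                $score += 1;
                break;
            }
        }

        $pos += 1;
    }

    return $score / $len;
}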

I think the problem can be boiled down to parsing the word into a candidate set of phonemes, then using a predetermined list of phoneme pairs to determine how pronounceable the word is.

For example: "skill" phonetically is "/s/k/i/l/". "/s/k/", "/k/i/", "/i/l/" should all have high pronounceability scores, so the word should score highly.

"skpit" phonetically is "/s/k/p/i/t/". "/k/p/" should have a low pronounceability score, so the word should score low.

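A rough sketch of the pair-scoring idea in PHP. It makes two big simplifying assumptions: letters stand in for phonemes (real phoneme parsing is the hard part and is not attempted here), and the BAD_PAIRS table is a tiny hand-picked placeholder, not real phonetic data. The name pairScore is made up for this example.

```php
<?php
// Hypothetical table of hard-to-pronounce pairs. A real table would be
// built from phoneme data; these entries are only placeholders.
const BAD_PAIRS = ['kp', 'pk', 'tk', 'bk', 'gp', 'pg'];

// Returns the share of adjacent letter pairs that are NOT in BAD_PAIRS,
// so the result lies between 0 (all pairs bad) and 1 (no bad pairs).
// Letters are used in place of phonemes, which is an assumption.
function pairScore(string $word): float
{
    $word = strtolower($word);
    $pairs = strlen($word) - 1;

    // Words with fewer than two letters have no pairs to penalize.
    if ($pairs < 1) {
        return 1.0;
    }

    $good = 0;
    for ($i = 0; $i < $pairs; $i++) {
        if (!in_array(substr($word, $i, 2), BAD_PAIRS, true)) {
            $good++;
        }
    }

    return $good / $pairs;
}
```

With this placeholder table, pairScore('skill') gives 1 and pairScore('skpit') gives 0.75, because one of its four pairs ("kp") is listed as bad. Per-pair weights instead of a plain bad list would probably work better.
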
Use a Markov model (on letters, not words, of course). The probability of a word is a pretty good proxy for ease of pronunciation. You will have to normalize for length, since longer words are inherently less probable.
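
A minimal sketch of a first-order letter model in PHP, in case it helps. The function names trainBigrams and markovScore are made up for this example, the training list is whatever dictionary you have at hand, and the add-one smoothing over a 27-symbol alphabet (a-z plus an end marker) is my own assumption to avoid log(0) on unseen transitions. Length is normalized by averaging the log-probability per transition.

```php
<?php
// Count letter-to-letter transitions in a training word list.
// '^' marks the start of a word and '$' marks the end.
function trainBigrams(array $words): array
{
    $counts = [];
    foreach ($words as $w) {
        $w = '^' . strtolower($w) . '$';
        $n = strlen($w) - 1;
        for ($i = 0; $i < $n; $i++) {
            $a = $w[$i];
            $b = $w[$i + 1];
            $counts[$a][$b] = ($counts[$a][$b] ?? 0) + 1;
        }
    }
    return $counts;
}

// Average log-probability per transition (always negative; closer to 0
// means more probable). Dividing by the number of transitions is the
// length normalization. Add-one smoothing with an assumed alphabet
// size of 27 handles transitions never seen in training.
function markovScore(string $word, array $counts): float
{
    $w = '^' . strtolower($word) . '$';
    $n = strlen($w) - 1;
    $sum = 0.0;
    for ($i = 0; $i < $n; $i++) {
        $row = $counts[$w[$i]] ?? [];
        $seen = $row[$w[$i + 1]] ?? 0;
        $sum += log(($seen + 1) / (array_sum($row) + 27));
    }
    return $sum / $n;
}
```

Train once with something like $counts = trainBigrams($dictionaryWords), then compare markovScore($candidate, $counts) across candidates. The scores are only meaningful relative to each other, and a tiny training list will give noisy results.
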