I have the following test PHP code:
    header('Content-type: text/html; charset=utf-8');

    $text = 'Développeur Web';
    var_dump($text);

    $text = preg_replace('#[^\\pL\d]+#u', '-', $text);
    var_dump($text);

    $text = trim($text, '-');
    var_dump($text);

    $text = iconv('utf-8', 'us-ascii//TRANSLIT', $text);
    var_dump($text);

    $text = strtolower($text);
    var_dump($text);

    $text = preg_replace('#[^-\w]+#', '', $text);
    var_dump($text);
On my local machine it works as expected:
    string(16) "Développeur Web"
    string(16) "Développeur-Web"
    string(16) "Développeur-Web"
    string(16) "D'eveloppeur-Web"
    string(16) "d'eveloppeur-web"
    string(15) "developpeur-web"
but on my live server it behaves oddly:
    string 'Développeur Web' (length=16)
    string '-pp-' (length=4)
    string 'pp' (length=2)
    string 'pp' (length=2)
    string 'pp' (length=2)
    string 'pp' (length=2)
The local machine is Windows running PHP 5.2.4 and the live server is CentOS running PHP 5.2.10, so they aren't identical at all. Not ideal, I know.
Has anybody experienced anything similar who can point me in the right direction? I'm presuming it's some kind of server or PHP configuration issue related to UTF-8 or locale.
Thanks in advance.
Shouldn't it be

    $text = preg_replace('#[^\pL\d]+#u', '-', $text);

on line 6? If you escape the \, you will have a literal \ inside your exclusion class. The regex [^\\pL\d]+ then matches one or more occurrences of a character that is not a \, p, L or a digit. This would explain why "Développeur Web" turns into "-pp-": everything up to the first p matches and is replaced with a single -, and the same is true for everything after the second p.
Perhaps there is a difference between the two machines in how an escaped \ is handled.
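The "-pp-" result can be reproduced on any machine by writing out a character class in which p and L are plain literals. This is only a sketch of what the live server appears to be doing, not a confirmed description of its internals; the variable name $collapsed is made up for the illustration.

```php
<?php
// Sketch: treat "p" and "L" as literal characters inside the negated class,
// which is effectively the behaviour seen on the live server.
$text = 'Développeur Web';

// No /u modifier, so the subject is scanned byte by byte.
// Every run of bytes that is not "p", "L" or a digit collapses into one "-".
$collapsed = preg_replace('#[^pL\d]+#', '-', $text);
var_dump($collapsed);            // "-pp-"

// The following trim() step then leaves only the two "p" characters.
var_dump(trim($collapsed, '-')); // "pp"
```

Everything before the first p ("Dévelo") is one run and everything after the second p ("eur Web") is another, which matches the second and third lines of the live server output.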
EDIT after OP comment:
Actually, escaping the \ is not a problem here: both versions are treated exactly the same way. What really seems to be the issue is that the PCRE version in use does not support Unicode properties, because it was not compiled with that support enabled.
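Both points can be checked with a few lines of PHP. The snippet below is a minimal sketch: the probe pattern and the variable name $supportsUnicodeProperties are assumptions chosen for illustration, and the file is assumed to be saved as UTF-8.

```php
<?php
// In a single-quoted PHP string, '\\pL' and '\pL' are the same three
// characters (backslash, p, L), so both spellings give PCRE the same pattern.
var_dump('\\pL' === '\pL'); // bool(true)

// Probe for Unicode property support. The pattern should match a single
// letter such as "é" only if \pL is really understood as "any letter".
// If the pattern fails to compile, or \pL is misread as literal characters,
// the comparison yields false. The @ only silences a possible warning.
$supportsUnicodeProperties = (@preg_match('/^\pL$/u', 'é') === 1);
var_dump($supportsUnicodeProperties);
```

Running this on both machines should show whether they differ in Unicode property support, which is what the edit above suspects.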