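I needed to cut a string after x characters in HTML-converted UTF-8 text, where the non-ASCII characters are stored as numeric entities (for example Japanese text like 嬰謰弰脰欰罏).
The problem was that the entities differ in length, so a plain byte-wise cut can land in the middle of one. I wrote the following function to handle that.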
Perhaps it helps.
<?php
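function html_cutstr($str, $len)
{
    // No numeric entities in the string: fall back to a plain byte-wise cut.
    if (!preg_match('/&#[0-9]+;/', $str)) {
        return substr($str, 0, $len);
    }

    $chars = 0; // entities counted so far
    $start = 0; // byte offset just past the last counted entity
    $total = strlen($str);
    // Use <= so that an entity ending at the very end of the string is counted too.
    for ($i = 0; $i <= $total; $i++) {
        if ($chars >= $len) {
            break;
        }
        // Text between the last counted entity and position $i.
        $str_tmp = substr($str, $start, $i - $start);
        if (preg_match('/&#[0-9]+;/', $str_tmp)) {
            // A complete entity has been passed; plain text before it is counted with it.
            $chars++;
            $start = $i;
        }
    }

    // Fewer than $len entities in total: nothing to cut.
    if ($chars < $len) {
        return $str;
    }

    $rVal = substr($str, 0, $start);
    if ($total > $start) {
        $rVal .= " ...";
    }
    return $rVal;
}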
?>
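The function above counts only entities, so plain characters mixed in between do not count on their own. If that matters, the same idea can probably be written with a single regular expression. What follows is only a sketch under assumptions of mine: the name html_cutstr_regex and the $suffix parameter are made up for illustration, the input is assumed to be valid UTF-8, and only decimal entities (&#12345;) are recognised, not hexadecimal or named ones.

```php
<?php
// Hypothetical helper, not part of the original note: cuts after $len units,
// where one unit is either a decimal numeric entity or any single character.
function html_cutstr_regex(string $str, int $len, string $suffix = ' ...'): string
{
    // Split the string into units. The "u" flag makes "." match a whole
    // UTF-8 character, and "s" lets it match newlines as well.
    if (preg_match_all('/&#[0-9]+;|./su', $str, $m) === false) {
        return $str; // matching failed (e.g. invalid UTF-8): leave the input untouched
    }
    $units = $m[0];

    // Short enough already: return it unchanged, without the suffix.
    if (count($units) <= $len) {
        return $str;
    }

    // Keep the first $len units and mark the cut.
    return implode('', array_slice($units, 0, $len)) . $suffix;
}

echo html_cutstr_regex('&#12379;&#12427;&#12383;&#12417;', 2), "\n"; // &#12379;&#12427; ...
echo html_cutstr_regex('ab&#12379;cd', 3), "\n";                      // ab&#12379; ...
?>
```

Because the string is split into whole units before cutting, an entity is never cut in half, and a string that is already short enough comes back unchanged.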