Converting Between GB2312 and Unicode (UTF-8) in PHP

Author: 袖梨 2022-07-02

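The example below converts GB2312 text into numeric character references of the form &#NNNNN;.
The iconv function available since PHP 4.3.1 handles the GB2312 to UTF-8 step well; you only need to write your own function to turn a UTF-8 character into its Unicode code point.
Looking the code points up in a table (gb2312.txt) also works.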

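<?php
// Save this file as GB2312 so that $text holds GB2312 bytes.
$text = "电子书库";
// One character = an optional high byte (0x80-0xff) followed by any byte.
preg_match_all('/[\x80-\xff]?./', $text, $ar);
foreach ($ar[0] as $v) {
    echo "&#" . utf8_unicode(iconv("GB2312", "UTF-8", $v)) . ";";
}
?>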
<?php
// utf8 -> unicode: decode one UTF-8 character into its code point
function utf8_unicode($c) {
    switch (strlen($c)) {
        case 1:
            return ord($c);
        case 2:
            // 110xxxxx 10xxxxxx
            $n = (ord($c[0]) & 0x1f) << 6;
            $n |= ord($c[1]) & 0x3f;
            return $n;
        case 3:
            // 1110xxxx 10xxxxxx 10xxxxxx
            $n = (ord($c[0]) & 0x0f) << 12;
            $n |= (ord($c[1]) & 0x3f) << 6;
            $n |= ord($c[2]) & 0x3f;
            return $n;
        case 4:
            // 11110xxx 10xxxxxx 10xxxxxx 10xxxxxx
            $n = (ord($c[0]) & 0x07) << 18;
            $n |= (ord($c[1]) & 0x3f) << 12;
            $n |= (ord($c[2]) & 0x3f) << 6;
            $n |= ord($c[3]) & 0x3f;
            return $n;
    }
}
?>
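As a quick check of the bit arithmetic in this kind of decoder, the following minimal sketch decodes one UTF-8 character and also rejects malformed input. The function name `utf8_to_codepoint` and the strict validation of continuation bytes are assumptions added for illustration; they are not part of the listing above.

```php
<?php
// Hypothetical helper (not from the article): decode one UTF-8 character
// to its Unicode code point, or return false if the bytes are malformed.
function utf8_to_codepoint(string $c)
{
    $len = strlen($c);
    if ($len === 0) {
        return false;
    }
    $lead = ord($c[0]);
    // The lead byte determines the sequence length and its own payload bits.
    if ($lead < 0x80) {
        $need = 1; $n = $lead;
    } elseif (($lead & 0xE0) === 0xC0) {
        $need = 2; $n = $lead & 0x1F;
    } elseif (($lead & 0xF0) === 0xE0) {
        $need = 3; $n = $lead & 0x0F;
    } elseif (($lead & 0xF8) === 0xF0) {
        $need = 4; $n = $lead & 0x07;
    } else {
        return false; // stray continuation byte or invalid lead byte
    }
    if ($len !== $need) {
        return false;
    }
    // Each continuation byte must look like 10xxxxxx and adds 6 payload bits.
    for ($i = 1; $i < $need; $i++) {
        $b = ord($c[$i]);
        if (($b & 0xC0) !== 0x80) {
            return false;
        }
        $n = ($n << 6) | ($b & 0x3F);
    }
    return $n;
}

// U+7535 (电) is encoded in UTF-8 as the bytes E7 94 B5.
echo utf8_to_codepoint("\xE7\x94\xB5"), "\n"; // 30005
```

Writing the test input as byte escapes keeps the result independent of the encoding in which the script file is saved.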

The example below uses PHP to convert numeric character references of the form &#NNNNN; back into GB2312.

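<?php
// Numeric character references for 全天候自动聚焦
$str = "TTL&#20840;&#22825;&#20505;&#33258;&#21160;&#32858;&#28966;";
// Replace each &#N; with its GB2312 bytes. A callback avoids eval,
// which would execute any code embedded in the input.
$str = preg_replace_callback("|&#([0-9]{1,5});|", function ($m) {
    return u2utf82gb((int) $m[1]);
}, $str);
echo $str;

// Unicode code point -> UTF-8 bytes -> GB2312
function u2utf82gb($c) {
    $str = "";
    if ($c < 0x80) {
        $str .= chr($c);
    } else if ($c < 0x800) {
        $str .= chr(0xC0 | ($c >> 6));
        $str .= chr(0x80 | ($c & 0x3F));
    } else if ($c < 0x10000) {
        $str .= chr(0xE0 | ($c >> 12));
        $str .= chr(0x80 | (($c >> 6) & 0x3F));
        $str .= chr(0x80 | ($c & 0x3F));
    } else if ($c < 0x200000) {
        $str .= chr(0xF0 | ($c >> 18));
        $str .= chr(0x80 | (($c >> 12) & 0x3F));
        $str .= chr(0x80 | (($c >> 6) & 0x3F));
        $str .= chr(0x80 | ($c & 0x3F));
    }
    // iconv returns false if the character has no GB2312 mapping.
    return iconv('UTF-8', 'GB2312', $str);
}
?>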
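If decoding character by character is not required, PHP's built-in `html_entity_decode` can expand the references to UTF-8 in one call, followed by a single `iconv` pass over the whole string. This is only a sketch of that alternative: the function name `refs_to_gb2312` is hypothetical, and it assumes the iconv extension supports the `GB2312` charset name on your platform.

```php
<?php
// Hypothetical helper (not from the article): expand &#N; references and
// convert the whole string to GB2312 in one pass.
function refs_to_gb2312(string $s)
{
    // Note: html_entity_decode also expands named entities such as &amp;,
    // which a regex that matches only &#N; leaves untouched.
    $utf8 = html_entity_decode($s, ENT_QUOTES | ENT_HTML5, 'UTF-8');
    // iconv returns false if a character has no GB2312 mapping.
    return iconv('UTF-8', 'GB2312', $utf8);
}

$gb = refs_to_gb2312("TTL&#20840;&#22825;");
// Print a hex dump, since a UTF-8 terminal would garble raw GB2312 bytes.
echo bin2hex($gb), "\n";
```

Converting the whole string at once means a single unmappable character makes the entire call return false, whereas the per-character version fails only for that character; which behavior is preferable depends on the input.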

Converting with JavaScript