史萊姆論壇

史萊姆論壇 (http://forum.slime.com.tw/)
-   作業系統操作技術文件 (http://forum.slime.com.tw/f128.html)
-   -   將中文轉換為拼音的算法分析和VBS程式碼 (http://forum.slime.com.tw/thread166090.html)

psac 2006-01-16 10:22 PM

將中文轉換為拼音的算法分析和VBS程式碼
 
整理一下,希望對大家有用

第一種,取得拼音首字母
如: 把 轉換成 JPML

算法很簡單,漢字在ANSI編碼中是以拼音順序排列的,因此可以利用ASC函數得到漢字的ANSI字元程式碼,根據ANSI程式碼即可得到漢字的拼音。參見附件裡的 漢字讀音表GB2312版.txt,共7809個漢字

VBS來源碼如下:
取得拼音首字母.vbs
function getpychar(char)
tmp=65536+asc(char)
if(tmp>=45217 and tmp<=45252) then
getpychar= "A"
elseif(tmp>=45253 and tmp<=45760) then
getpychar= "B"
elseif(tmp>=45761 and tmp<=46317) then
getpychar= "C"
elseif(tmp>=46318 and tmp<=46825) then
getpychar= "D"
elseif(tmp>=46826 and tmp<=47009) then
getpychar= "E"
elseif(tmp>=47010 and tmp<=47296) then
getpychar= "F"
elseif(tmp>=47297 and tmp<=47613) then
getpychar= "G"
elseif(tmp>=47614 and tmp<=48118) then
getpychar= "H"
elseif(tmp>=48119 and tmp<=49061) then
getpychar= "J"
elseif(tmp>=49062 and tmp<=49323) then
getpychar= "K"
elseif(tmp>=49324 and tmp<=49895) then
getpychar= "L"
elseif(tmp>=49896 and tmp<=50370) then
getpychar= "M"
elseif(tmp>=50371 and tmp<=50613) then
getpychar= "N"
elseif(tmp>=50614 and tmp<=50621) then
getpychar= "O"
elseif(tmp>=50622 and tmp<=50905) then
getpychar= "P"
elseif(tmp>=50906 and tmp<=51386) then
getpychar= "Q"
elseif(tmp>=51387 and tmp<=51445) then
getpychar= "R"
elseif(tmp>=51446 and tmp<=52217) then
getpychar= "S"
elseif(tmp>=52218 and tmp<=52697) then
getpychar= "T"
elseif(tmp>=52698 and tmp<=52979) then
getpychar= "W"
elseif(tmp>=52980 and tmp<=53640) then
getpychar= "X"
elseif(tmp>=53689 and tmp<=54480) then
getpychar= "Y"
elseif(tmp>=54481 and tmp<=62289) then
getpychar= "Z"
else '如果不是中文,則不處理
getpychar=char
end if
end function

function getpy(str)
for i=1 to len(str)
getpy=getpy&getpychar(mid(str,i,1))
next
end function

'----------------------- 測試 -----------------------
temp = InputBox("請輸入字元串:", "InputBox", "我是測試字元串,哈哈。…… English char is ok too.")
If Len(temp) > 0 Then
msgbox getpy(CStr(temp)), ,"Result"
End If

psac 2006-01-16 10:24 PM

第二種,將漢字轉換為拼音
如: 將 我愛精品夢露 轉換為 woaijingpinmenglu

算法和上面的一樣,只是區分得更細了

中文轉換為拼音.vbs



'============== 下面是字符串转换为拼音的函数 ===================
'------ 汉字ASCII码拼音对照表,google来的,感谢无名氏-----
Set d = CreateObject("Scripting.Dictionary")
d.add "a",-20319
d.add "ai",-20317
d.add "an",-20304
d.add "ang",-20295
d.add "ao",-20292
d.add "ba",-20283
d.add "bai",-20265
d.add "ban",-20257
d.add "bang",-20242
d.add "bao",-20230
d.add "bei",-20051
d.add "ben",-20036
d.add "beng",-20032
d.add "bi",-20026
d.add "bian",-20002
d.add "biao",-19990
d.add "bie",-19986
d.add "bin",-19982
d.add "bing",-19976
.....
'最后一个字符是“座”,ASC值为-10247

'--------------------- 将单个字符转换为拼音 --------------------------
Function Chr2PY(num)
If num < -20319 Or num > -10247 Then
Chr2PY = Chr(num)
Else
a = d.Items
b = d.keys
For i = d.count - 1 To 0 Step -1
If a(i) <= num Then Exit For
Next
Chr2PY = b(i)
End If
End Function

'---------------------- 将中文标点符号转换为英文标点符号 --------------------------
Function SymbolC(str)
str = Replace(str, "!", "!")
str = Replace(str, "(", "(")
str = Replace(str, ")", ")")
str = Replace(str, "-", "-")
str = Replace(str, "——", "--")
....
SymbolC = str
End Function

'---------------------- 将字符串转换成拼音 --------------------------
Function Str2PY(str)
Str2PY = ""
For i = 1 To len(str)
Str2PY = Str2PY & Chr2PY(Asc(Mid(str, i, 1)))
Next
End Function

'----------------------- 测试 -----------------------
temp = InputBox("请输入需要转换成拼音的字符串:", "InputBox", "我是测试字符串,呵呵。…… English char is ok too.")
If Len(temp) > 0 Then
msgbox SymbolC(Str2PY(CStr(temp))), ,"Result"
End If

psac 2006-01-16 10:26 PM

第三種,將漢字轉換為帶聲調的拼音
如: 將 我愛精品夢露 轉換為 jīngpǐnm&egrave;ngl&ugrave;

原理很簡單,先造表(佩服這個牛人),然後搜尋取代即可

帶聲調的轉換.vbs



a1 = "的de5,一yi1,國guo2,在z....
.....
a68 = "煺tui4,柝tuo4,膃wa4,...

b1 = "的d&iacute;/d&igrave;/de,一yī,....
....
b68 = "煺tu&igrave;,柝tu&ograve;,膃......

cb = b1 & b2 &..... b67 & b68
ca = a1 & a2 & .....a67 & a68

Function pinyin(str, ab)
If Asc(str) > 0 Then '非中文,不轉換
pinyin = str
Exit Function
End If

If ab = "a" Then
p = InStr(ca, str)
If p > 0 Then
p2 = InStr(p, ca, ",")
pinyin = Mid(ca, p+1, p2 - p - 1)
End If
ElseIf ab = "b" Then
p = InStr(cb, str)
If p > 0 Then
p2 = InStr(p, cb, ",")
pinyin = Mid(cb, p+1, p2 - p - 1)
End If
End If

End Function


'---------------------- 將字元串轉換成拼音 --------------------------
Function Str2PY(str)
Str2PY = ""
For i = 1 To len(str)
Str2PY = Str2PY & pinyin(Mid(str, i, 1), "b")
Next
End Function

'----------------------- 測試 -----------------------
temp = InputBox("請輸入需要轉換成拼音的字元串:", "InputBox", "我是測試字元串,哈哈。…… English char is ok too.")
If Len(temp) > 0 Then
msgbox Str2PY(CStr(temp)), ,"Result"
End If

上面的程式碼都是示例程式碼,請下載附件,直接執行vbs就可看到效果

優點: 只要看懂了上面的原理,可以很容易移植到其它程序上
缺點: 能轉換的漢字有限,最強的是最後一個,收錄的漢字最多。前兩個只能轉換GB2312中的大部分漢字(基本夠用了,哈哈)


所有時間均為台北時間。現在的時間是 07:41 PM

Powered by vBulletin® 版本 3.6.8
版權所有 ©2000 - 2024, Jelsoft Enterprises Ltd.

『服務條款』

* 有問題不知道該怎麼解決嗎?請聯絡本站的系統管理員 *


SEO by vBSEO 3.6.1