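Short URLs have long been used on microblogging platforms, for example url.cn on Tencent Weibo and sinaurl.cn on Sina Weibo. How are they implemented? This article introduces the algorithm behind them.
For example, when we post a URL on Tencent Weibo, the service automatically detects it and converts it, for example to: http://url.cn/3fVZf1
Why do this? There are several reasons:
1. Tencent Weibo limits each post to 140 characters. If we need to post a link and that link is very long, it can take up nearly half of the post, which is clearly unacceptable. That is why short URLs came about.
2. Short URLs let our project manage open, user-submitted URLs well. Some URLs may point to pornography, violence, advertising and similar content. Based on user reports we can block such a link from appearing in our application at all, because the same URL always produces the same short address after hashing.
3. We can collect traffic and click statistics for a set of URLs and find out what most users care about, which helps us make better decisions about the project's follow-up work.
The three points above are purely my personal view. I looked into this because I will use it in some upcoming projects. Let's first look at the theory of the short URL mapping algorithm (material found online):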
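1) Compute the MD5 of the long URL to get a 32-character hexadecimal signature string, and split it into 4 segments of 8 characters each;
2) Loop over the four segments: take the 8 characters, treat them as a hexadecimal number, and AND it with 0x3fffffff (30 one-bits), i.e. ignore everything beyond the low 30 bits;
3) Split these 30 bits into 6 groups; use each 5-bit number as an index into the alphabet to pick a character, which yields a 6-character string;
4) The whole MD5 string therefore yields four 6-character strings; any one of them can serve as the short URL for this long URL.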
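As a small illustration of steps 2 and 3, here is a minimal sketch that encodes a single 8-character segment. The class name `SegmentDemo`, the method `Encode` and the 32-character alphabet are assumptions made for this example only; they are not part of the original material.

```csharp
using System;
using System.Text;

public static class SegmentDemo
{
    // Hypothetical 32-character alphabet, so every 5-bit value (0..31) maps to one character.
    private const string Alphabet = "abcdefghijklmnopqrstuvwxyz012345";

    // Steps 2 and 3 for a single 8-hex-digit segment of the MD5 string.
    public static string Encode(string hexSegment)
    {
        // Step 2: parse as hexadecimal and keep only the low 30 bits.
        uint value = Convert.ToUInt32(hexSegment, 16) & 0x3FFFFFFFu;

        var sb = new StringBuilder(6);
        for (int j = 0; j < 6; j++)
        {
            // Step 3: the low 5 bits select a character, then move on to the next group.
            sb.Append(Alphabet[(int)(value & 0x1Fu)]);
            value >>= 5;
        }
        return sb.ToString();
    }

    public static void Main()
    {
        Console.WriteLine(Encode("00000000")); // aaaaaa: every 5-bit group is 0
        Console.WriteLine(Encode("FFFFFFFF")); // 555555: every 5-bit group is 31
        Console.WriteLine(Encode("00000001")); // baaaaa: the lowest group is emitted first
    }
}
```

Because the groups are read from the low bits upward, the first output character comes from the least significant 5 bits.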
The theory is simple. We cannot guarantee that a resulting short URL is unique, but since each long URL gives us 4 candidates to choose from, heavy duplication is very unlikely.
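One way to use the four candidates is to fall back to the next one whenever a candidate is already taken by a different long URL. The sketch below is only an assumption about how that could look: the `CandidatePicker` class, the `Register` method and the in-memory dictionary are hypothetical, and a real service would keep the mapping in a database.

```csharp
using System;
using System.Collections.Generic;

public static class CandidatePicker
{
    // Hypothetical in-memory store: short code -> long URL.
    private static readonly Dictionary<string, string> Store = new Dictionary<string, string>();

    // Returns the first candidate that is free or already mapped to the same long URL,
    // or null when every candidate is taken by another URL.
    public static string Register(string longUrl, string[] candidates)
    {
        foreach (string code in candidates)
        {
            string existing;
            if (!Store.TryGetValue(code, out existing))
            {
                Store[code] = longUrl;   // free slot: claim it
                return code;
            }
            if (existing == longUrl)
            {
                return code;             // same URL registered earlier: reuse its code
            }
            // Otherwise the code belongs to a different URL; try the next candidate.
        }
        return null;
    }

    public static void Main()
    {
        // Made-up candidate codes, chosen only to force a collision on the first one.
        Console.WriteLine(Register("http://example.com/a", new[] { "aaaaaa", "bbbbbb" })); // aaaaaa
        Console.WriteLine(Register("http://example.com/b", new[] { "aaaaaa", "cccccc" })); // cccccc
        Console.WriteLine(Register("http://example.com/a", new[] { "aaaaaa", "bbbbbb" })); // aaaaaa
    }
}
```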
Now let's look at the code (C#):
// Requires "using System;" and a reference to System.Web (.NET Framework).
public static string[] ShortUrl(string url)
{
    // Custom key mixed into the input before MD5 hashing
    string key = "Leejor";
    // Characters used to build the short URL
    string[] chars = new string[]{
        "a","b","c","d","e","f","g","h",
        "i","j","k","l","m","n","o","p",
        "q","r","s","t","u","v","w","x",
        "y","z","0","1","2","3","4","5",
        "6","7","8","9","A","B","C","D",
        "E","F","G","H","I","J","K","L",
        "M","N","O","P","Q","R","S","T",
        "U","V","W","X","Y","Z"
    };
    // MD5-hash the key plus the URL; the result is a 32-character hexadecimal string
    string hex = System.Web.Security.FormsAuthentication.HashPasswordForStoringInConfigFile(key + url, "md5");
    string[] resUrl = new string[4];
    for (int i = 0; i < 4; i++)
    {
        // Take 8 hex characters at a time and AND the value with 0x3FFFFFFF to keep the low 30 bits
        int hexint = 0x3FFFFFFF & Convert.ToInt32("0x" + hex.Substring(i * 8, 8), 16);
        string outChars = string.Empty;
        for (int j = 0; j < 6; j++)
        {
            // AND the value with 0x0000003D to get an index (0..61) into chars
            int index = 0x0000003D & hexint;
            // Append the selected character
            outChars += chars[index];
            // Shift right by 5 bits on each iteration
            hexint = hexint >> 5;
        }
        // Store the string at the matching index of the output array
        resUrl[i] = outChars;
    }
    return resUrl;
}
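Note that the listing masks with 0x3D (binary 111101) rather than taking a plain 5-bit group as in step 3 of the theory. The small check below, with a hypothetical `MaskDemo` class written for this note, counts how many distinct indices each mask can produce. It suggests that 0x3D stays inside the 62-entry array but can reach only 32 of its entries.

```csharp
using System;
using System.Collections.Generic;

public static class MaskDemo
{
    // Counts how many distinct values (value & mask) takes over all 6-bit inputs.
    public static int DistinctIndices(int mask)
    {
        var seen = new HashSet<int>();
        for (int value = 0; value < 64; value++)
        {
            seen.Add(value & mask);
        }
        return seen.Count;
    }

    public static void Main()
    {
        Console.WriteLine(DistinctIndices(0x3D)); // 32: bit 1 is always cleared, maximum index is 61
        Console.WriteLine(DistinctIndices(0x3F)); // 64: indices 62 and 63 would overrun a 62-entry array
        Console.WriteLine(DistinctIndices(0x1F)); // 32: the plain 5-bit group from step 3
    }
}
```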
You can now call the method directly and get the following four values:
ShortUrl("http://www.me3.cn")[0]; // returns fAVfui
ShortUrl("http://www.me3.cn")[1]; // returns 3ayQry
ShortUrl("http://www.me3.cn")[2]; // returns UZzyUr
ShortUrl("http://www.me3.cn")[3]; // returns 36rQZn
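`FormsAuthentication.HashPasswordForStoringInConfigFile` lives in System.Web and is marked obsolete in newer .NET Framework versions, so the listing may not build everywhere. The following is a rough sketch of the same steps on top of `System.Security.Cryptography.MD5`. The class name `ShortUrlSketch`, the optional `key` parameter and the choice of UTF-8 for turning the string into bytes are assumptions of this sketch, and I have not verified that it reproduces the four values quoted above.

```csharp
using System;
using System.Security.Cryptography;
using System.Text;

public static class ShortUrlSketch
{
    // Same 62-character alphabet as the listing above.
    private const string Chars =
        "abcdefghijklmnopqrstuvwxyz0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZ";

    public static string[] ShortUrl(string url, string key = "Leejor")
    {
        // Assumption: the key plus the URL is hashed as UTF-8 bytes.
        byte[] hash;
        using (MD5 md5 = MD5.Create())
        {
            hash = md5.ComputeHash(Encoding.UTF8.GetBytes(key + url));
        }
        // 16 bytes -> 32 hexadecimal characters.
        string hex = BitConverter.ToString(hash).Replace("-", "");

        var result = new string[4];
        for (int i = 0; i < 4; i++)
        {
            // Keep the low 30 bits of each 8-character segment.
            uint value = Convert.ToUInt32(hex.Substring(i * 8, 8), 16) & 0x3FFFFFFFu;
            var sb = new StringBuilder(6);
            for (int j = 0; j < 6; j++)
            {
                // Same mask (0x3D) and 5-bit shift as the listing above.
                sb.Append(Chars[(int)(value & 0x3Du)]);
                value >>= 5;
            }
            result[i] = sb.ToString();
        }
        return result;
    }

    public static void Main()
    {
        foreach (string code in ShortUrl("http://www.me3.cn"))
        {
            Console.WriteLine(code);
        }
    }
}
```

The sketch builds each code with a `StringBuilder` instead of repeated string concatenation; otherwise it follows the listing step by step.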
Original article: http://haohaoker-163-com.iteye.com/blog/1094692