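GoogleToolbar 3.x/4.x PageRank Checksum Algorithm
The resulting checksum is used to request the PR value from Google. In the new request, CH=7*******. This version improves compatibility with PHP 4.4/PHP 5.x and adds support for x86_64 CPUs.
Google Toolbar 3.0.x/4.0.x PageRank Checksum Algorithm
PageRank online lookup demo (shows the PHP code in action; C, Python, and PHP source code are available for download)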
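This version does not emulate the behavior of Google Toolbar. It only computes the checksum and produces a GET URL; open that link in a browser yourself, and the last number on the returned page is the PageRank value.
If you optimize the code or port it to another language (VB/ASP, C#/ASP.NET), please send a copy to the original author (anykai at gmail.com). Thank you.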
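As a rough illustration of the link described above, the C sketch below assembles a query URL in the same shape that the older PHP demo on this page prints. The function name build_pr_query is hypothetical, the page URL is inserted without any percent-encoding, and the ch argument is assumed to be the finished checksum string (the leading 7, the check digit, then the decimal hash).

```c
#include <stdio.h>

/* Hypothetical helper: writes the toolbar-style query URL into buf.
   Returns what snprintf returns: the length the full URL needs,
   not counting the terminating NUL. */
int build_pr_query(char *buf, size_t size, const char *url, const char *ch)
{
    /* Assumption: url is passed through unchanged (no percent-encoding),
       as in the PHP demo further down this page. */
    return snprintf(buf, size,
                    "http://www.google.com/search?client=navclient-auto"
                    "&features=Rank:&q=info:%s&ch=%s",
                    url, ch);
}
```

A caller would pass a buffer large enough for the fixed prefix plus both strings and compare the return value against the buffer size to detect truncation.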
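The 4.x algorithm is the same as the 3.x one. The version given here works better with PHP 4.4/PHP 5.x, because in newer PHP versions the conversion of an out-of-range float to an integer is undefined,
so the value obtained differs between versions and systems; earlier PHP versions happened to behave the same as the C implementation. We had always assumed this was a PHP bug, but the mistake turned out to be entirely ours.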
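To make the overflow issue concrete, here is a minimal C sketch of a single hashing step done two ways: once with native 32-bit unsigned arithmetic, which wraps modulo 2^32 by definition, and once in double precision with the reduction modulo 2^32 carried out by hand, in the spirit of the PHP workaround. The names step_u32, step_double and reduce32 are illustrative only, and the double version is only exact for small multipliers such as the 0x21 and 0x1003F used by this algorithm.

```c
#include <stdint.h>

/* Native step: unsigned 32-bit overflow wraps modulo 2^32. */
uint32_t step_u32(uint32_t check, uint32_t magic, unsigned char c)
{
    return check * magic + c;
}

/* Reduce a non-negative double modulo 2^32 without casting an
   out-of-range value to a 32-bit integer. */
static double reduce32(double v)
{
    const double unit = 4294967296.0;              /* 2^32 */
    return v - unit * (double)(uint64_t)(v / unit);
}

/* Same step in floating point. Assumption: check * magic stays below
   2^53, so the product is represented exactly in a double. */
uint32_t step_double(uint32_t check, uint32_t magic, unsigned char c)
{
    double v = reduce32((double)check * (double)magic);
    v = reduce32(v + c);                           /* now in [0, 2^32) */
    return (uint32_t)v;
}
```

Both functions should agree for every input within the stated range; the point is that the floating-point path never converts a value outside the 32-bit range to an integer.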
Google Toolbar 4.0.x PageRank Checksum Algorithm, PHP version, updated 2006-09-21
Update 2006-09-29: x86_64 CPUs supported
#!/usr/bin/php
<?php
/*
Google PageRank Checksum Algorithm (Toolbar 3.x/4.x)
http://www.gamesaga.net/pagerank/
http://www.upsdn.net/
*/
error_reporting(E_ALL);
function StrToNum($Str, $Check, $Magic)
{
    $Int32Unit = 4294967296; // 2^32
    $length = strlen($Str);
    for ($i = 0; $i < $length; $i++) {
        $Check *= $Magic;
        // If the float is beyond the boundaries of integer (usually +/- 2.15e+9 = 2^31),
        // the result of converting to integer is undefined;
        // refer to http://www.php.net/manual/en/language.types.integer.php
        //if (is_float($Check)) {
        if ($Check >= $Int32Unit) {
            // Reduce modulo 2^32 by hand instead of casting the large value to int
            $Check = ($Check - $Int32Unit * (int) ($Check / $Int32Unit));
            // - 2^31
            $Check = ($Check < -2147483647) ? ($Check + $Int32Unit) : $Check;
        }
        // Square-bracket offsets: the {} string offset syntax is no longer accepted by current PHP
        $Check += ord($Str[$i]);
    }
    return $Check;
}
function HashURL($String)
{
    $Check1 = StrToNum($String, 0x1505, 0x21);
    $Check2 = StrToNum($String, 0, 0x1003F);
    $Check1 >>= 2;
    $Check1 = (($Check1 >> 4) & 0x3FFFFC0 ) | ($Check1 & 0x3F);
    $Check1 = (($Check1 >> 4) & 0x3FFC00 ) | ($Check1 & 0x3FF);
    $Check1 = (($Check1 >> 4) & 0x3C000 ) | ($Check1 & 0x3FFF);
    $T1 = (((($Check1 & 0x3C0) << 4) | ($Check1 & 0x3C)) << 2 ) | ($Check2 & 0xF0F );
    $T2 = (((($Check1 & 0xFFFFC000) << 4) | ($Check1 & 0x3C00)) << 0xA) | ($Check2 & 0xF0F0000 );
    return ($T1 | $T2);
}
function CheckHash($Hashnum)
{
    $CheckByte = 0;
    $Flag = 0;
    $HashStr = sprintf('%u', $Hashnum) ;
    $length = strlen($HashStr);
    // Walk the decimal digits from right to left
    for ($i = $length - 1; $i >= 0; $i --) {
        $Re = (int) $HashStr[$i];
        if (1 == ($Flag % 2)) {
            // Every second digit is doubled and its digits summed
            $Re += $Re;
            $Re = (int)($Re / 10) + ($Re % 10);
        }
        $CheckByte += $Re;
        $Flag ++;
    }
    $CheckByte %= 10;
    if (0 !== $CheckByte) {
        $CheckByte = 10 - $CheckByte;
        if (1 === ($Flag % 2) ) {
            if (1 === ($CheckByte % 2)) {
                $CheckByte += 9;
            }
            $CheckByte >>= 1;
        }
    }
    return '7'.$CheckByte.$HashStr;
}
if ($argc == 2) {
    echo CheckHash(HashURL($argv[1]));
} else {
    exit ("please specify a URL, for example: http://www.upsdn.net/\n");
}
?>
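The check-digit step in CheckHash above resembles a Luhn-style digit sum (every second digit from the right is doubled and its digits added), followed by an extra adjustment when the number of digits is odd. The following C sketch mirrors those steps for a 32-bit hash; the name check_digit is hypothetical and the code is only a transcription of the PHP logic shown here, not a statement of how the toolbar itself is implemented.

```c
#include <stdint.h>

/* Returns the check digit (0..9) for a 32-bit hash, following the
   digit-by-digit steps of CheckHash in the PHP listing above. */
int check_digit(uint32_t hash)
{
    int sum = 0;
    int count = 0;                  /* decimal digits consumed so far */

    do {
        int d = (int)(hash % 10);   /* rightmost remaining digit */
        hash /= 10;
        if (count % 2 == 1) {       /* every second digit from the right */
            d += d;
            d = d / 10 + d % 10;
        }
        sum += d;
        count++;
    } while (hash != 0);

    int digit = sum % 10;
    if (digit != 0) {
        digit = 10 - digit;
        if (count % 2 == 1) {       /* odd number of decimal digits */
            if (digit % 2 == 1)
                digit += 9;
            digit >>= 1;
        }
    }
    return digit;
}
```

The full checksum string is then the character 7, this digit, and the hash printed as an unsigned decimal number.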
C source code (http://tools.upsdn.net/pr/pagerank.c)
/******************************************************************************
Filename : pagerank.c
Description : Google PageRank Checksum Algorithm (Toolbar 3.0.x)
Author : Jet Marx <smith (at) aboutsledge (dot)com>
License : UPL
Log : Ver 0.1 2005-09-13
Ver 1.0 2005-10-19 Final Character Bug Fixed
Ver 1.1 2005-10-21 Final Character Bug(cdq bug) Fixed
******************************************************************************/
#include <stdio.h>
int main(int argc, char* argv[])
{
    char * eos;
    int Remainder;
    unsigned int T1;
    unsigned int T2;
    unsigned int Checksum1 = 0x1505;
    /* unsigned so that overflow wraps; signed overflow is undefined in C */
    unsigned int Checksum2 = 0;
    int Flag = 0;
    /*=================Copyright & Usage=======================*/
    printf("\nGoogle PageRank Checksum Calculator "
           "(GoogleToolbar 3.0.125.1-big)\n http://www.GameSaga.net 2005-09-13\n\n");
    if (argc < 2){
        /* Both %s placeholders need an argument */
        printf("Usage: %s [URL] \nExample: "
               "%s http://www.gamesaga.net/ \n\n", argv[0], argv[0]);
        return 1;
    }
    /*======================Stage 1===========================*/
    eos = argv[1];
    while (*eos) {
        Checksum1 *= 0x21;
        Checksum1 += *eos++;
    }
    eos = argv[1];
    while( *eos ) {
        Checksum2 *= 0x1003F;
        Checksum2 += *eos++;
    }
    Checksum1 >>= 2;
    Checksum1 = ( (Checksum1>>4) & 0x3FFFFC0 ) | (Checksum1 & 0x3F);
    Checksum1 = ( (Checksum1>>4) & 0x3FFC00 ) | (Checksum1 & 0x3FF);
    Checksum1 = ( (Checksum1>>4) & 0x3C000 ) | (Checksum1 & 0x3FFF);
    T1 = ( ( ( ( Checksum1 & 0x3C0 ) << 4 )
             | ( Checksum1 & 0x3C ) )
           << 2 )
         | ( Checksum2 & 0xF0F );
    T2 = ( ( ( ( Checksum1 & 0xFFFFC000 ) << 4 )
             | ( Checksum1 & 0x3C00 ) )
           << 0xA )
         | ( Checksum2 & 0xF0F0000 );
    Checksum1 = T1 | T2;
    /*=====================Stage 2========================*/
    Checksum2 = 0;
    T1 = Checksum1;
    do {
        Remainder = T1 % 10;
        T1 /= 10;
        if ( 1 == (Flag % 2) ){
            Remainder += Remainder;
            Remainder = (Remainder/10) + (Remainder%10);
        }
        Checksum2 += Remainder;
        Flag ++;
    } while( 0 != T1);
    //Checksum2 = (10-Checksum2%10)%10+0x30;
    Checksum2 %= 10;
    if (0 != Checksum2){
        Checksum2 = 10 - Checksum2;
        T1 = Checksum2 % 2 ;
        if ( 1 == (Flag%2) ) {
            if (1 == T1) {
                Checksum2 +=9;
            }
            //Checksum2 -= T1;
            Checksum2 >>= 1;
        }
    }
    Checksum2 += 0x30;
    /*========================End===========================*/
    printf("Google Pagerank Checksum=7%c%u\n",Checksum2,Checksum1);
    return 0;
}
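Since the hash depends on arithmetic wrapping at exactly 32 bits, one way to make the width explicit, regardless of the platform's int size, is to use the fixed-width types from stdint.h. The sketch below restates Stage 1 of the program above with uint32_t. The names str_to_num and hash_url are hypothetical, and treating each byte as a value from 0 to 255 (like PHP's ord()) is an assumption; the listing above uses plain char, which may be signed, so the two could differ for non-ASCII input.

```c
#include <stdint.h>

/* One pass over the string: check = check * magic + byte, modulo 2^32. */
static uint32_t str_to_num(const char *s, uint32_t check, uint32_t magic)
{
    while (*s) {
        check *= magic;                 /* wraps modulo 2^32 */
        check += (unsigned char)*s++;   /* assumption: bytes as 0..255 */
    }
    return check;
}

/* Stage 1 of the checksum: combine the two passes into one 32-bit hash. */
uint32_t hash_url(const char *url)
{
    uint32_t c1 = str_to_num(url, 0x1505u, 0x21u);
    uint32_t c2 = str_to_num(url, 0u, 0x1003Fu);

    c1 >>= 2;
    c1 = ((c1 >> 4) & 0x3FFFFC0u) | (c1 & 0x3Fu);
    c1 = ((c1 >> 4) & 0x3FFC00u)  | (c1 & 0x3FFu);
    c1 = ((c1 >> 4) & 0x3C000u)   | (c1 & 0x3FFFu);

    uint32_t t1 = ((((c1 & 0x3C0u) << 4) | (c1 & 0x3Cu)) << 2)
                | (c2 & 0xF0Fu);
    uint32_t t2 = ((((c1 & 0xFFFFC000u) << 4) | (c1 & 0x3C00u)) << 10)
                | (c2 & 0xF0F0000u);
    return t1 | t2;
}
```

The returned value corresponds to Checksum1 at the end of Stage 1; Stage 2 then derives the check digit from its decimal digits.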
PHP version (http://tools.upsdn.net/pr/pagerank.txt)
<?php
/*
Filename : pagerank.php
Description : Google PageRank Checksum Algorithm (Toolbar 3.0.x)
Author : Jet Marx <smith (at) aboutsledge (dot) com>
License : UPL
Log : Ver 0.1 2005-09-13
Ver 1.0 2005-10-19 Final Character Bug Fixed
*/
function StrOrd($String)
{
    $result = array(); // defined even for an empty string
    for($i=0;$i<strlen($String);$i++) {
        $result[$i] = ord ($String[$i]);
    }
    return $result;
}
function StrToNum($StrArray,$Checksum,$MagicNum)
{
    $length = sizeof($StrArray);
    for( $i=0; $i<$length; $i++) {
        $Checksum *= $MagicNum;
        // Force integer overflow. This relies on the float-to-int conversion,
        // which is undefined for out-of-range values in newer PHP versions.
        $Checksum = (int)$Checksum;
        $Checksum += $StrArray[$i];
        $Checksum = (int)$Checksum;
    }
    return $Checksum;
}
function Check($String)
{
    $Checksum1 = 0x1505;
    $Checksum2 = 0;
    $StrArray = StrOrd($String);
    $Checksum1 = StrToNum($StrArray,$Checksum1,0x21);
    // No call-time pass-by-reference (&$var): newer PHP versions reject it
    $Checksum2 = StrToNum($StrArray,$Checksum2,0x1003F);
    $Checksum1 >>= 2;
    $Checksum1 = ( ($Checksum1>>4) & 0x3FFFFC0 ) | ($Checksum1 & 0x3F);
    $Checksum1 = ( ($Checksum1>>4) & 0x3FFC00 ) | ($Checksum1 & 0x3FF);
    $Checksum1 = ( ($Checksum1>>4) & 0x3C000 ) | ($Checksum1 & 0x3FFF);
    $T1 = (((($Checksum1&0x3C0)<<4)|($Checksum1 & 0x3C))<<2)|($Checksum2 & 0xF0F );
    $T2 = (((($Checksum1&0xFFFFC000)<<4)|($Checksum1 & 0x3C00))<<0xA)|($Checksum2 & 0xF0F0000 );
    return $T1 | $T2;
}
function CheckMore($Checksum)
{
    $CheckChar = 0;
    $Flag = 0;
    $CheckStr = sprintf("%u", $Checksum) ;
    $length = strlen($CheckStr);
    // Walk the decimal digits from right to left
    for( $i=$length-1; $i>=0; $i --) {
        $Re = (int) $CheckStr[$i];
        if ( 1 == ($Flag%2) ) {
            $Re += $Re;
            $Re = (int)($Re/10) + ($Re%10);
        }
        $CheckChar = $Re + $CheckChar;
        $Flag ++;
    }
    $CheckChar %= 10;
    if (0 !== $CheckChar){
        $CheckChar = 10 - $CheckChar;
        $length = $CheckChar % 2 ;
        if (1 === ($Flag%2) ) {
            if ( 1 === $length ) {
                $CheckChar += 9;
            }
            //$CheckChar -= $length;
            $CheckChar >>= 1;
        }
    }
    return $CheckChar;
    //return $CheckChar = (10-$CheckChar%10)%10;
}
if ( isset ($_GET['url'])) {
    $Checksum = Check($_GET['url']);
    // Escape the user-supplied URL before echoing it into the href attribute
    $Url = htmlspecialchars($_GET['url'], ENT_QUOTES);
    echo "<a href=\"http://www.google.com/search?client=navclient-auto&features=Rank:&q=info:";
    echo $Url."&ch=7".CheckMore($Checksum);
    printf("%u",$Checksum);
    echo "\">Get PageRank</a><br /><br />Powered by upsdn.net";
}else{
    echo "<form action=\"\" method=\"get\" id=\"prform\">";
    echo " <br />URL:<input name=\"url\" value=\"http://www.debian.org/\" type=\"text\" size=40>";
    echo "</form>";
    echo "<a href=\"javascript:document.getElementById('prform').submit();\">Submit</a>";
    echo "<br /><br />Powered by upsdn.net";
}
?>
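Note: please get the updated version of this algorithm from the demo page; the code on this page is not the latest.
Author: AboutSledge  Updated: 2005-09-19
Source: original article for this site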
Comments
路人 [2005-10-15]
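The program still has a bug. It works for most domains, but some cannot be looked up.
For example, for the domain "outdoor-china.net" your algorithm gives the wrong result.
webmaster [2005-10-19]
Thanks, we have already found the bug.
It was discovered while testing http://www.debian.org/.
As for outdoor-china.net, which you said could not be queried, there was indeed a problem, but it has now been fixed. Please test again. Thanks.
We trust outdoor-china.net will get a good PR at the next PageRank update.
Thanks again for your testing.
pagerank [2005-10-20]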
丁丁 [2006-04-02]
Why is the checksum wrong when I run the PHP code you posted, while the value computed at your demo address is correct?
Johnny [2006-04-04]
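We have found that some PHP versions (such as PHP 4.4 and 5.0.3 in Debian sid)
have a bug in their handling of unsigned and signed values that leads to incorrect results.
我要asp版本 [2006-10-15]
An ASP version would be even better. I hope the author will write one.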