Google Toolbar 3.x/4.x PageRank Checksum Algorithm

The resulting checksum is sent to Google to request the PR value. In the new request format, CH=7*******. This version improves compatibility with PHP 4.4/PHP 5.x and adds support for x86_64 CPUs.

Google Toolbar 3.0.x/4.0.x PageRank Checksum Algorithm

Online PageRank lookup demo (shows the PHP code in action; C, Python, and PHP source code available for download)

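This version does not emulate the behavior of Google Toolbar. It only computes the checksum value and produces a GET URL; open that link in a browser yourself, and the last number on the returned page is the PageRank value.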
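
As a rough illustration of that last step, the sketch below assembles such a GET URL in C. The query format is the one used by the PHP form script further down this page. The helper name `build_pagerank_url` is hypothetical, and the page URL is inserted without percent-encoding, which is an assumption that only holds for simple URLs.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical helper: writes the lookup URL for page_url into out.
   checksum is the complete checksum string ("7" + check digit + hash).
   Returns 0 on success, -1 if the buffer is too small.
   Assumption: page_url contains no characters that need percent-encoding. */
int build_pagerank_url(const char *page_url, const char *checksum,
                       char *out, size_t out_size)
{
    int n;

    assert(page_url != NULL && checksum != NULL && out != NULL);
    n = snprintf(out, out_size,
                 "http://www.google.com/search?client=navclient-auto"
                 "&features=Rank:&q=info:%s&ch=%s",
                 page_url, checksum);
    return (n < 0 || (size_t)n >= out_size) ? -1 : 0;
}
```

The caller supplies the buffer, so truncation is reported instead of silently producing a broken link.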

If you optimize the code or port it to another language (VB/ASP, C#/ASP.NET), please send a copy to the original author (anykai at gmail.com). Thanks.

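The 4.x algorithm is the same as the 3.x one. The version given here works better with PHP 4.4/PHP 5.x, because in newer PHP versions converting a float to an integer is undefined when the value is outside the integer range,
so the result differs between versions and systems; earlier PHP versions happened to behave the same as the C implementation. We had always assumed this was a PHP bug, and it turns out we were the ones who were completely wrong.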

Google Toolbar 4.0.x PageRank Checksum Algorithm, PHP version, updated 2006-09-21
Update 2006-09-29: x86_64 CPU supported

#!/usr/bin/php
<?php

/*
  Google PageRank Checksum Algorithm (Toolbar 3.x/4.x)
  http://www.gamesaga.net/pagerank/
  http://www.upsdn.net/
*/

error_reporting(E_ALL);

function StrToNum($Str, $Check, $Magic)
{
    $Int32Unit = 4294967296;  // 2^32

    $length = strlen($Str);
    for ($i = 0; $i < $length; $i++) {
        $Check *= $Magic;    
        // If the float is beyond the boundaries of integer (usually +/- 2.15e+9 = 2^31),
        // the result of converting it to integer is undefined; see
        // http://www.php.net/manual/en/language.types.integer.php
        // So reduce modulo 2^32 explicitly instead of relying on an (int) cast.
        //if (is_float($Check)) {
        if ($Check >= $Int32Unit) {
            $Check = ($Check - $Int32Unit * (int) ($Check / $Int32Unit));
            // Wrap a value below -2^31 back into range
            $Check = ($Check < -2147483647) ? ($Check + $Int32Unit) : $Check;
        }
        $Check += ord($Str[$i]);  // the $Str{$i} offset syntax was removed in PHP 8
    }
    return $Check;
}

function HashURL($String)
{
    $Check1 = StrToNum($String, 0x1505, 0x21);
    $Check2 = StrToNum($String, 0, 0x1003F);
   
    $Check1 >>= 2;    
    $Check1 = (($Check1 >> 4) & 0x3FFFFC0 ) | ($Check1 & 0x3F);
    $Check1 = (($Check1 >> 4) & 0x3FFC00 ) | ($Check1 & 0x3FF);
    $Check1 = (($Check1 >> 4) & 0x3C000 ) | ($Check1 & 0x3FFF);   
   
    $T1 = (((($Check1 & 0x3C0) << 4) | ($Check1 & 0x3C)) <<2 ) | ($Check2 & 0xF0F );
    $T2 = (((($Check1 & 0xFFFFC000) << 4) | ($Check1 & 0x3C00)) << 0xA) | ($Check2 & 0xF0F0000 );
   
    return ($T1 | $T2);
}

function CheckHash($Hashnum)
{
    $CheckByte = 0;
    $Flag = 0;

    $HashStr = sprintf('%u', $Hashnum) ;
    $length = strlen($HashStr);
   
    for ($i = $length - 1;  $i >= 0;  $i --) {
        $Re = (int) $HashStr[$i];
        if (1 == ($Flag % 2)) {
            $Re += $Re;
            $Re = (int)($Re / 10) + ($Re % 10);
        }
        $CheckByte += $Re;
        $Flag ++;   
    }

    $CheckByte %= 10;
    if (0 !== $CheckByte) {
        $CheckByte = 10 - $CheckByte;
        if (1 === ($Flag%2) ) {
            if (1 === ($CheckByte % 2)) {
                $CheckByte += 9;
            }
            $CheckByte >>= 1;
        }
    }

    return '7'.$CheckByte.$HashStr;
}

if ($argc == 2) {
    echo CheckHash(HashURL($argv[1]));
} else {
    exit ("Please specify a URL, for example: http://www.upsdn.net/\n");
}

?>





C source code (http://tools.upsdn.net/pr/pagerank.c)
/******************************************************************************
Filename     : pagerank.c
Description  : Google PageRank Checksum Algorithm (Toolbar 3.0.x)
Author       : Jet Marx   <smith (at) aboutsledge (dot)com>
License      : UPL
Log          : Ver 0.1 2005-09-13            
                   Ver 1.0 2005-10-19            Final Character Bug Fixed
                   Ver 1.1 2005-10-21            Final Character Bug(cdq bug) Fixed
******************************************************************************/

#include <stdio.h>

int main(int argc, char* argv[])
{
    char * eos;
    int Remainder;
    unsigned int T1;
    unsigned int T2;
    unsigned int Checksum1 = 0x1505;
    unsigned int Checksum2 = 0;  /* unsigned: overflow wraps modulo 2^32; signed overflow is undefined */
    int Flag = 0;
    /*=================Copyright & Usage=======================*/
    printf("\nGoogle PageRank Checksum Calculator "
           "(GoogleToolbar 3.0.125.1-big)\n  http://www.GameSaga.net 2005-09-13\n\n");

    if (argc < 2){
        /* argv[0] fills both %s placeholders */
        printf("Usage:   %s [URL] \nExample: "
               "%s http://www.gamesaga.net/ \n\n", argv[0], argv[0]);
        return 1;
    }

    /*======================Stage 1===========================*/
    eos = argv[1];
    while (*eos) {
        Checksum1 *= 0x21;
        Checksum1 += *eos++;
    }

    eos = argv[1];
    while( *eos )    {
        Checksum2 *= 0x1003F;
        Checksum2 += *eos++;
    }

    Checksum1 >>= 2;     
    Checksum1 = ( (Checksum1>>4) & 0x3FFFFC0 ) | (Checksum1 & 0x3F);
    Checksum1 = ( (Checksum1>>4) & 0x3FFC00 ) | (Checksum1 & 0x3FF);
    Checksum1 = ( (Checksum1>>4) & 0x3C000 ) | (Checksum1 & 0x3FFF);
 
    T1 =  (  (   ( ( Checksum1 & 0x3C0 ) << 4    )   \
           | ( Checksum1 & 0x3C   ) )              \
         << 2   )                       \
        | ( Checksum2 & 0xF0F );

    T2 =  (  (   ( ( Checksum1 & 0xFFFFC000 ) << 4 ) \
           | ( Checksum1 & 0x3C00 ) )          \
         << 0xA )                  \
            | ( Checksum2 & 0xF0F0000 );

    Checksum1 = T1 | T2;
    
    /*=====================Stage 2========================*/
    Checksum2 = 0;
    T1 = Checksum1;
    do {
        Remainder = T1 % 10;
        T1 /= 10;
        if ( 1 == (Flag % 2) ){
            Remainder += Remainder;
            Remainder = (Remainder/10) + (Remainder%10);        
        }
        Checksum2 += Remainder;
        Flag ++;
    } while( 0 != T1);

    //Checksum2 = (10-Checksum2%10)%10+0x30;
    Checksum2 %= 10;
    if (0 != Checksum2){
        Checksum2 = 10 - Checksum2;
        T1 = Checksum2 % 2 ;
        if  ( 1 == (Flag%2) )  {
            if (1 == T1) {
                Checksum2 += 9;
            }
            //Checksum2 -= T1;
            Checksum2 >>= 1;
        }
    }
    Checksum2 += 0x30;

    /*========================End===========================*/
    printf("Google Pagerank Checksum=7%c%u\n",Checksum2,Checksum1);

    return 0;
}
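
The program above does everything inside main(). As a minimal sketch of the same two stages as reusable functions, the block below uses fixed-width types from <stdint.h>. The function names are hypothetical, and treating each byte as unsigned (as PHP's ord() does) is an assumption that only matters for non-ASCII input. The first half of stage 2 resembles a Luhn-style digit sum: every second decimal digit from the right is doubled and its digits are added.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

/* Stage 1: hash the URL into a 32-bit value (same constants as the program above). */
uint32_t pr_hash_url(const char *url)
{
    uint32_t c1 = 0x1505u;
    uint32_t c2 = 0u;
    uint32_t t1, t2;
    const char *p;

    assert(url != NULL);
    for (p = url; *p != '\0'; p++) {
        /* Assumption: bytes are read as unsigned values. */
        c1 = c1 * 0x21u + (unsigned char)*p;
        c2 = c2 * 0x1003Fu + (unsigned char)*p;
    }

    c1 >>= 2;
    c1 = ((c1 >> 4) & 0x3FFFFC0u) | (c1 & 0x3Fu);
    c1 = ((c1 >> 4) & 0x3FFC00u) | (c1 & 0x3FFu);
    c1 = ((c1 >> 4) & 0x3C000u) | (c1 & 0x3FFFu);

    t1 = ((((c1 & 0x3C0u) << 4) | (c1 & 0x3Cu)) << 2) | (c2 & 0xF0Fu);
    t2 = ((((c1 & 0xFFFFC000u) << 4) | (c1 & 0x3C00u)) << 10) | (c2 & 0xF0F0000u);
    return t1 | t2;
}

/* Stage 2: check digit computed over the decimal digits of the hash. */
int pr_check_digit(uint32_t hash)
{
    int sum = 0;
    int count = 0;  /* number of decimal digits processed so far */
    int digit;

    do {
        digit = (int)(hash % 10u);
        hash /= 10u;
        if (count % 2 == 1) {  /* every second digit from the right */
            digit *= 2;
            digit = digit / 10 + digit % 10;
        }
        sum += digit;
        count++;
    } while (hash != 0u);

    sum %= 10;
    if (sum != 0) {
        sum = 10 - sum;
        if (count % 2 == 1) {  /* odd digit count: extra adjustment */
            if (sum % 2 == 1) {
                sum += 9;
            }
            sum >>= 1;
        }
    }
    return sum;
}

/* Writes "7" + check digit + decimal hash into out; returns snprintf's result. */
int pr_checksum(const char *url, char *out, size_t out_size)
{
    uint32_t hash = pr_hash_url(url);

    return snprintf(out, out_size, "7%d%lu",
                    pr_check_digit(hash), (unsigned long)hash);
}
```

Splitting the stages this way lets each one be checked on its own, for example by feeding small numbers directly to pr_check_digit().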


PHP version (http://tools.upsdn.net/pr/pagerank.txt)

<?php

/*
Filename     : pagerank.php
Description  : Google PageRank Checksum Algorithm (Toolbar 3.0.x)
Author       : Jet Marx   <smith (at) aboutsledge (dot) com>
License      : UPL
Log          : Ver 0.1     2005-09-13
               Ver 1.0     2005-10-19    Final Character Bug Fixed
*/

function StrOrd($String)
{
    $result = array();  // stays an array even for an empty string
    $length = strlen($String);
    for ($i = 0; $i < $length; $i++) {
        $result[$i] = ord($String[$i]);
    }
    return $result;
}

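function StrToNum($StrArray, $Checksum, $MagicNum)
{
    $Int32Unit = 4294967296;  // 2^32

    $length = count($StrArray);
    for ($i = 0; $i < $length; $i++) {
        $Checksum *= $MagicNum;
        // Reduce modulo 2^32 explicitly: an (int) cast of an
        // out-of-range float gives an undefined result.
        if ($Checksum >= $Int32Unit) {
            $Checksum -= $Int32Unit * (int) ($Checksum / $Int32Unit);
        }
        $Checksum += $StrArray[$i];
    }
    return $Checksum;
}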

function Check($String)
{
    $Checksum1 = 0x1505;
    $Checksum2 = 0;

    $StrArray =StrOrd($String);
    $Checksum1 = StrToNum($StrArray,$Checksum1,0x21);
    // Pass by value: call-time pass-by-reference (&$var) is a fatal error since PHP 5.4
    $Checksum2 = StrToNum($StrArray, $Checksum2, 0x1003F);
   
    $Checksum1 >>= 2;    
    $Checksum1 = ( ($Checksum1>>4) & 0x3FFFFC0 ) | ($Checksum1 & 0x3F);
    $Checksum1 = ( ($Checksum1>>4) & 0x3FFC00 ) | ($Checksum1 & 0x3FF);
    $Checksum1 = ( ($Checksum1>>4) & 0x3C000 ) | ($Checksum1 & 0x3FFF);   
   
    $T1 =  (((($Checksum1&0x3C0)<<4)|($Checksum1 & 0x3C))<<2)|($Checksum2 & 0xF0F );
    $T2 =  (((($Checksum1&0xFFFFC000)<<4)|($Checksum1 & 0x3C00))<<0xA)|($Checksum2 & 0xF0F0000 );
   
    return $T1 | $T2;
}

function CheckMore($Checksum)
{
    $CheckChar = 0;
    $Flag        = 0;
               
    $CheckStr  = sprintf("%u", $Checksum) ;
    $length    = strlen($CheckStr);
   
    for( $i=$length-1;  $i>=0;  $i --) {
        $Re = (int) $CheckStr[$i];
        if ( 1 == ($Flag%2) ) {
            $Re += $Re;
            $Re = (int)($Re/10) + ($Re%10);
        }
        $CheckChar = $Re + $CheckChar;
        $Flag ++;   
    }

    $CheckChar %= 10;
    if (0 !== $CheckChar){
        $CheckChar = 10 - $CheckChar;
        $length = $CheckChar % 2 ;
        if (1 === ($Flag%2) )  {
                        if ( 1 === $length ) {
                                $CheckChar += 9;
                        }
                        //$CheckChar -= $length;
                        $CheckChar >>= 1;
        }
   }  
    return $CheckChar;
       //return $CheckChar = (10-$CheckChar%10)%10;
}


if (isset($_GET['url'])) {

    $Url = $_GET['url'];
    $Checksum = Check($Url);
    $Link = "http://www.google.com/search?client=navclient-auto&features=Rank:&q=info:"
          . urlencode($Url) . "&ch=7" . CheckMore($Checksum) . sprintf("%u", $Checksum);
    // Escape the user-supplied URL before echoing it into the page
    echo "<a href=\"" . htmlspecialchars($Link) . "\">Get PageRank</a><br /><br />Powered by upsdn.net";
}else{
    echo "<form action=\"\" method=\"get\" id=\"prform\">";
    echo "    <br />URL:<input name=\"url\" value=\"http://www.debian.org/\" type=\"text\" size=40>";
    echo "</form>";
    echo "<a href=\"javascript:document.getElementById('prform').submit();\">Submit</a>";
    echo "<br /><br />Powered by upsdn.net";
}
?>

Note: the code on this page is not the latest. Please get the updated version of this algorithm from the demo page.

Author: AboutSledge   Updated: 2005-09-19
Source: exclusive to this site
