A small PHP program to scrape class message-board posts from chinaren.com

Wangchao PHP · Author unknown · 2006-01-09

<html>

<head>

<meta http-equiv="Content-Type" content="text/html; charset=gb2312">

<meta http-equiv="pragma" content="no-cache">

<title>Extract messages</title>

<style>

.head { color: red; font-weight: bold; }

body { font-size: 9pt; background-color: #cccccc; }

</style>

</head>

<body>

<?php

// Allow up to 10 minutes: the script fetches more than 100 remote pages.

set_time_limit(600);

function getMessage($url,$history=false)

{

$match_msg = "/<script>do.*\('[^\n]*/";

$match_date = "/\d{4}-\d{2}-\d{2}\040\d{2}:\d{2}:\d{2}/";

if($history==false){

$match_names = "/&cid=.{2,6}&msg=/";

$replace_names = "/&cid=|&msg=/";

}else{

$match_names = "/class\=\"cr5\"[^\n]*/";

$replace_names = "/class\=\"cr5\" target\=\"_blank\">|<\/a>/";

}

$replace_msg = "/<script>|<\/script>|'\d*\'|doFlatTxt\('|doStr\('|&nbsp[^\n]*|\'\)|\\\\/";

$handle = fopen($url, "r");

if ($handle === false) { return; } // skip pages that cannot be opened

$contents = "";

$times="";

$names="";

$messages="";

while ($line=fgets($handle,1024))

{

$contents .= $line;

}

//$contents = fread ($handle, 100000);

//echo $contents;

fclose ($handle);

preg_match_all($match_date,$contents,$times);

preg_match_all($match_names,$contents,$names);

preg_match_all($match_msg,$contents,$messages);

for($i=0;$i<count($messages[0]);$i++)

{

echo "<p><b>". preg_replace($replace_names,"",$names[0][$i]) ."</b>(";

echo $times[0][$i]."):<br>\n";

$message=preg_replace($replace_msg,"",$messages[0][$i])."\n\n";

echo $message;

}

}

$begin=time();

echo "<p class=\"head\">Latest messages:</p>\n";

getMessage("http://alumni.chinaren.com/class/class_index.jsp?classuuid=2815032345960598103");

echo "<p class=\"head\">More messages:</p>\n";

// Pages 1 to 7 of the regular message list

for($p=1;$p<=7;$p++)

{

getMessage("http://alumni.chinaren.com/class/class_leaveword.jsp?classuuid=2815032345960598103&p=".$p);

}

echo "<p class=\"head\">Archived messages:</p>\n";

for($i=0;$i<100;$i++)

{

getMessage("http://alumni.chinaren.com/class/class_leaveword2.jsp?p=".$i."&classuuid=2815032345960598103&msgtype=1&type=3",true);

}

echo "\n<br><center><b>This script took <font color=red>";

echo time()-$begin;

echo "</font> seconds to run</b></center>";

?>

</body>

</html>
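
The regular-expression extraction inside getMessage() can be tried offline, without fetching any page. The following is only a minimal sketch under stated assumptions: the extractPosts() helper and the sample markup are hypothetical and written for illustration, and the real chinaren.com page layout may differ. It reuses the non-history patterns from the listing above and returns the posts as an array instead of echoing them.

```php
<?php
// Hypothetical helper (not part of the original script): applies the
// listing's non-history patterns to an HTML string instead of a live URL.
function extractPosts($html)
{
    $match_date    = "/\d{4}-\d{2}-\d{2}\040\d{2}:\d{2}:\d{2}/";
    $match_names   = "/&cid=.{2,6}&msg=/";
    $replace_names = "/&cid=|&msg=/";
    $match_msg     = "/<script>do.*\('[^\n]*/";
    $replace_msg   = "/<script>|<\/script>|'\d*\'|doFlatTxt\('|doStr\('|&nbsp[^\n]*|\'\)|\\\\/";

    preg_match_all($match_date, $html, $times);
    preg_match_all($match_names, $html, $names);
    preg_match_all($match_msg, $html, $messages);

    $posts = [];
    foreach ($messages[0] as $i => $raw) {
        $posts[] = [
            // Guard against pages where the three match lists differ in length.
            'name' => isset($names[0][$i]) ? preg_replace($replace_names, "", $names[0][$i]) : "",
            'time' => isset($times[0][$i]) ? $times[0][$i] : "",
            'text' => preg_replace($replace_msg, "", $raw),
        ];
    }
    return $posts;
}

// Made-up fragment (an assumption) shaped to fit the patterns above.
$sample = "<a href=\"reply.jsp?id=1&cid=Alice&msg=1\">reply</a> 2006-01-05 12:30:45\n"
        . "<script>doFlatTxt('Hello class')</script>\n";

print_r(extractPosts($sample));
```

Returning an array rather than echoing makes it easier to check the patterns against saved copies of a page before running the full script, which issues more than a hundred requests.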

 
 
 