Python Challenge lv4: follow the chain

Problem link: http://www.pythonchallenge.com/pc/def/linkedlist.php

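To be honest, it took some googling before I understood what this level wants: repeatedly fetch a web page from the server, then find the address of the next link in its source. Note that although the page source is very simple, not every number in it is valid, so you need a regular expression with the right pattern. For this level, r'nothing is (\d+)' is a pattern that works. At first I used ''.join([x for x in text if x.isdigit()]) to glue all the digits together; I followed the chain past 4000 pages without reaching the end, and only then realized I had been fooled...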
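To see why the choice of pattern matters, here is a minimal sketch comparing the two extraction approaches on a made-up page body. The sample text and the helper names `join_digits` and `next_nothing` are assumptions for illustration only; they are not taken from the challenge server.

```python
import re

PATTERN = re.compile(r'nothing is (\d+)')

def join_digits(text):
    """Naive approach: glue every digit in the page together."""
    return ''.join(ch for ch in text if ch.isdigit())

def next_nothing(text):
    """Pattern approach: return only the number after 'nothing is', or None."""
    match = PATTERN.search(text)
    return match.group(1) if match else None

# Hypothetical page body containing a decoy number.
sample = 'One misleading number is 82683. and the next nothing is 63579'

print(join_digits(sample))   # decoy and real digits run together
print(next_nothing(sample))  # only the number following 'nothing is'
```

With the naive approach the decoy digits get merged into the next index, so the chain wanders off to pages that were never part of it.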

import re
import urllib.request

if __name__ == '__main__':
    url = 'http://www.pythonchallenge.com/pc/def/linkedlist.php?nothing='
    index = '17675'
    counter = 1
    # Match only the number that follows "nothing is"; other digits are decoys.
    pattern = re.compile(r'nothing is (\d+)')

    while True:
        try:
            request = urllib.request.Request(url + index)
            # My PC must use a proxy to connect; drop this line if yours does not.
            request.set_proxy('172.16.0.252:80', 'http')
            response = urllib.request.urlopen(request)
            content = response.read().decode()
            response.close()

            print(counter, content)
            result = pattern.search(content)
            if not result:
                # No further link on this page, so the chain ends here.
                break
            index = result.group(1)
            counter += 1
        except Exception as ex:
            print(ex)
            break
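The same loop can also be written as a small function that takes the page-fetching step as a parameter, which makes it possible to try the logic without a network connection. This is only a sketch: `follow_chain`, its `max_steps` guard, and the `fake_pages` dictionary are hypothetical names introduced here, and the dictionary is a shortened, made-up stand-in for the server.

```python
import re

PATTERN = re.compile(r'nothing is (\d+)')

def follow_chain(fetch, start, max_steps=1000):
    """Follow 'nothing is N' links until a page has no match.

    fetch: callable taking the current number (str) and returning page text.
    Returns (final_page_text, pages_fetched).
    """
    index = start
    for step in range(1, max_steps + 1):
        content = fetch(index)
        match = PATTERN.search(content)
        if match is None:
            return content, step
        index = match.group(1)
    raise RuntimeError('chain did not end within max_steps')

# Stand-in for the server: a dict mapping each number to its page body.
fake_pages = {
    '17675': 'and the next nothing is 8511',
    '8511': 'and the next nothing is 89456',
    '89456': 'peak.html',
}

final_page, steps = follow_chain(fake_pages.__getitem__, '17675')
print(steps, final_page)
```

To run it against the real server, `fetch` could be a function that opens url + index with urllib.request.urlopen and returns the decoded body, as the program above does. The `max_steps` limit is a safeguard against following a bad pattern indefinitely.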

Program output:
1 and the next nothing is 8511
2 and the next nothing is 89456
3 and the next nothing is 43502
4 and the next nothing is 45605
5 and the next nothing is 12970
6 and the next nothing is 91060
7 and the next nothing is 27719
8 and the next nothing is 65667
9 peak.html

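This gives the address of the next level, peak.html. (Note: my initial index is 17675, which is not the value the level starts with. I simply picked a number from the later part of the address list, so don't worry about it.)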

Posted on 2009-05-11 16:05 by 李現民. Category: python


