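Sharing Python source code for scraping Doutula (斗图啦) - General Discussion - 晨风机器人论坛

Urger posted on 2018-10-21 11:44:09

Sharing Python source code for scraping Doutula (斗图啦)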

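Haha, hi everyone, I'm 怨黎, a.k.a. 小梨子~~~!
It was Sunday and I had nothing to do,
so I started tinkering with code.
And then.....
I accidentally ended up writing a program that batch-downloads images from Doutula.
It's just a boredom project, but it still runs pretty fast!
"Pics or it didn't happen!", right?
So of course there has to be a picture as proof...
Here's the picture: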
Here's the code:

import os
import re
import urllib.request

import requests as r

script_path = os.path.realpath(__file__)
script_dir = os.path.dirname(script_path)


def getapage(url):
    # Fetch one list page and return the (image URL, alt text) pairs found in it
    headers = {
        "user-agent": "Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/68.0.3440.84 Safari/537.36"}
    html = r.get(url, headers=headers).text
    regex = """<img src="//static.doutula.com/img/loader.gif" style="width: 100%; height: 100%;" data-original="(.*?)" alt="(.*?)" class="img-responsive lazy image_dta" data-backup"""
    result = re.findall(regex, html)
    return result


if not os.path.exists(script_dir + '/imgs'):
    os.mkdir(script_dir + '/imgs')

for i in range(1, 1959 + 1):
    result = getapage("http://www.doutula.com/photo/list/?page=" + str(i))
    print("Fetching page " + str(i))
    # The regex has two groups, so each match is a (url, alt) tuple
    for img_url, alt in result:
        if img_url.startswith("//"):
            img_url = "http:" + img_url  # protocol-relative URL
        # Name the file after the last path segment of the image URL
        filename = os.path.basename(img_url)
        img = urllib.request.urlopen(img_url)
        with open(script_dir + "/imgs/" + filename, "wb") as o:
            o.write(img.read())
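
The parsing step can be tried offline, without requesting any pages. The following is only a rough sketch under a few assumptions: the sample HTML is made up to resemble the `<img>` tags the script's regex targets, the pattern is a simplified one that only looks at `data-original` and `alt`, and the helper names `extract_images`, `normalize_url` and `safe_filename` are hypothetical, not part of the script above. It shows that `re.findall` with two groups returns `(url, alt)` tuples, and one possible way to turn them into usable file names.

```python
import os
import re

# Simplified pattern: relies only on the data-original and alt attributes.
# Assumption: the two attributes appear in this order, as in the script's regex.
IMG_RE = re.compile(r'data-original="(.*?)"\s+alt="(.*?)"')


def extract_images(html):
    """Return a list of (url, alt) tuples found in the page HTML."""
    return IMG_RE.findall(html)


def normalize_url(url, scheme="http"):
    """Turn a protocol-relative URL (//host/path) into an absolute one."""
    return scheme + ":" + url if url.startswith("//") else url


def safe_filename(url, alt):
    """Build a file name from the alt text, keeping the URL's extension."""
    ext = os.path.splitext(url.split("?")[0])[1] or ".gif"
    # Replace characters that are not allowed in file names on common systems
    stem = re.sub(r'[\\/:*?"<>|\s]+', "_", alt).strip("_") or "unnamed"
    return stem + ext


# Made-up sample markup, shaped like the tags the script's regex targets
SAMPLE_HTML = (
    '<img src="//static.doutula.com/img/loader.gif" '
    'style="width: 100%; height: 100%;" '
    'data-original="//img.example.com/a/b/cat.jpg" alt="funny cat?" '
    'class="img-responsive lazy image_dta" data-backup="x">'
)

if __name__ == "__main__":
    for url, alt in extract_images(SAMPLE_HTML):
        print(normalize_url(url), "->", safe_filename(url, alt))
```

Naming files after the alt text keeps them readable, but two images with the same alt text would overwrite each other, so a real run would probably want to add a counter or fall back to the URL's own file name.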
I'm not asking for any money!!! All I ask is that you don't flame me.....
I'm off, I'm off!

Urger posted on 2018-10-21 11:48:25

First floor, the B spot is mine

不言. posted on 2018-10-21 12:13:32

Second floor, the C spot is mine

宿命 posted on 2018-10-21 23:44:48

Third floor, the D spot is mine

hddwen posted on 2019-3-11 06:39:57

Fourth floor, the E spot is mine

wenshitao1 posted on 2019-8-26 23:07:26

Strongly supporting the OP...
