
[python] Web Crawler [beautifulsoup] #Image Download

by 낭람_ 2018. 8. 18.

[Building a Web Crawler]


import os
import urllib.request
from urllib.parse import urljoin

import requests
from bs4 import BeautifulSoup

URL = 'https://www.daum.net/'
# A Content-Type header has no meaning on a GET request; send a browser-like User-Agent instead
headers = {'User-Agent': 'Mozilla/5.0'}
res = requests.get(URL, headers=headers)

soup = BeautifulSoup(res.text, 'html.parser')

# urlretrieve does not create directories, so make sure ./img exists
os.makedirs('./img', exist_ok=True)

i = 0

for img in soup.find_all("img"):
    # An image link can sit in src, in data-src (lazy loading), or in both,
    # so check each attribute on its own instead of requiring both
    for attr in ("src", "data-src"):
        link = img.get(attr)
        if not link:
            continue

        # Resolve protocol-relative ("//...") and relative links against the page URL
        link = urljoin(URL, link)
        if not link.startswith("http"):
            continue  # skip data: URIs and other non-HTTP links

        i = i + 1
        img_name = str(i) + ".jpg"
        print(img_name)
        urllib.request.urlretrieve(link, "./img/" + img_name)
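
The script above names every download N.jpg, whatever the real image format is, and the links it reads from src and data-src can take several forms (absolute, protocol-relative "//...", site-relative, or inline data: URIs). Below is a minimal sketch of how the link cleanup and file naming could be pulled into two small helpers that can be tried without any network access. The names normalize_image_url and make_filename, and the example.com values, are made up for illustration and are not part of the original script.

```python
import os
from urllib.parse import urljoin, urlparse


def normalize_image_url(raw, base_url):
    """Return an absolute http(s) URL for a raw src/data-src value, or None."""
    if not raw:
        return None
    absolute = urljoin(base_url, raw.strip())
    if urlparse(absolute).scheme not in ("http", "https"):
        return None  # e.g. inline data: URIs cannot be downloaded
    return absolute


def make_filename(url, index, default_ext=".jpg"):
    """Build a numbered file name, keeping the extension found in the URL path."""
    ext = os.path.splitext(urlparse(url).path)[1]
    return str(index) + (ext or default_ext)


# Hypothetical attribute values, as img.get("src") / img.get("data-src") might return them
raw_values = [
    "//example.com/a.png",
    "/img/b.gif",
    "data:image/png;base64,AAAA",
    None,
    "https://example.com/photo",
]
page_url = "https://example.com/"

urls = [u for u in (normalize_image_url(r, page_url) for r in raw_values) if u]
names = [make_filename(u, n) for n, u in enumerate(urls, start=1)]

for name, url in zip(names, urls):
    print(name, url)
```

In the crawler loop, each link could go through normalize_image_url before urlretrieve is called, and make_filename could stand in for str(i) + ".jpg". The extension is only a guess from the URL path, so links without one still fall back to .jpg.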


