
Scraping (3. Fetching web images with SageMaker)
Hands-on!
1. Fetching a web image
1-1. Start the SageMaker notebook instance and launch "JupyterLab"
1-2. Run the following and confirm that the image is downloaded
import requests

# URL of the image to download
image_url = "https://s.yimg.jp/images/weather/general/next/100_day.png"
imgdata = requests.get(image_url)

# Use the last path segment of the URL as the file name
filename = image_url.split("/")[-1]
with open(filename, mode="wb") as f:
    f.write(imgdata.content)
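The file name in step 1-2 comes from image_url.split("/")[-1]. That works for this URL, but it would keep a query string such as "?v=2" in the name and return an empty string for a URL that ends in "/". The following is a minimal sketch of one possible alternative using only the standard library. The helper name filename_from_url and the fallback name "download.bin" are assumptions for illustration and do not appear in the original steps.

```python
from pathlib import PurePosixPath
from urllib.parse import urlsplit


def filename_from_url(url, default="download.bin"):
    """Return the last path segment of url, ignoring query and fragment.

    Hypothetical helper: falls back to `default` when the URL path
    has no file name (for example "https://example.com/").
    """
    # urlsplit separates the path from "?query" and "#fragment"
    path = urlsplit(url).path
    name = PurePosixPath(path).name
    return name or default
```

With this helper, the line filename = image_url.split("/")[-1] could be written as filename = filename_from_url(image_url).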

2. Saving a web image to a specific folder
2-1. Run the following
import requests
from pathlib import Path

# Create the output folder if it does not exist yet
out_folder = Path("download")
out_folder.mkdir(exist_ok=True)

image_url = "https://s.yimg.jp/images/weather/general/next/100_day.png"
imgdata = requests.get(image_url)

# Save the image under download/<file name>
filename = image_url.split("/")[-1]
out_path = out_folder.joinpath(filename)
with open(out_path, mode="wb") as f:
    f.write(imgdata.content)
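The folder handling in step 2-1 can be separated from the download itself, which makes it possible to try it without network access. The following is a rough sketch under that assumption. The function name save_bytes is hypothetical, and parents=True is an addition that the original step does not use (it only matters for nested folders such as "download/icons").

```python
from pathlib import Path


def save_bytes(out_folder, filename, data):
    """Write data to out_folder/filename and return the resulting path.

    Hypothetical helper: creates out_folder (and any missing parent
    folders) before writing.
    """
    folder = Path(out_folder)
    folder.mkdir(parents=True, exist_ok=True)
    out_path = folder / filename
    out_path.write_bytes(data)
    return out_path
```

In step 2-1, the lines after requests.get could then become save_bytes("download", filename, imgdata.content).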

3. Getting the images on a specific site
3-1. Run the following
import requests
from bs4 import BeautifulSoup
import urllib.parse

# Page to scan for <img> tags
load_url = "https://yahoo.co.jp"
html = requests.get(load_url)
soup = BeautifulSoup(html.content, "html.parser")

# Print each image URL (resolved against the page URL) and its file name
for element in soup.find_all("img"):
    src = element.get("src")
    image_url = urllib.parse.urljoin(load_url, src)
    filename = image_url.split("/")[-1]
    print(image_url, ">>", filename)
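On real pages, some img tags have no src attribute (element.get("src") then returns None), some carry inline "data:" URIs, and the same image may appear more than once. The following is a minimal sketch of how the src values from step 3-1 might be cleaned up before downloading. The function name resolve_image_urls and the sample list are assumptions for illustration only, and the filtering rules are one possible choice rather than the approach of the original steps.

```python
from urllib.parse import urljoin, urlsplit


def resolve_image_urls(base_url, srcs):
    """Turn raw src values into absolute http(s) URLs without duplicates."""
    seen = set()
    result = []
    for src in srcs:
        if not src:
            continue  # <img> without a usable src attribute
        absolute = urljoin(base_url, src)
        if urlsplit(absolute).scheme not in ("http", "https"):
            continue  # e.g. inline data: URIs, which are not downloadable
        if absolute not in seen:
            seen.add(absolute)
            result.append(absolute)
    return result


# Sample values standing in for the element.get("src") results (made up)
sample = ["/img/a.png", None, "data:image/gif;base64,AAAA", "/img/a.png"]
for url in resolve_image_urls("https://yahoo.co.jp", sample):
    print(url)
```

In step 3-1, the list passed in could be built with [element.get("src") for element in soup.find_all("img")].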
