Skip to content

Instantly share code, notes, and snippets.

@vivekthedev
Created July 19, 2024 11:45
Show Gist options
  • Select an option

  • Save vivekthedev/c1c5f0fb0e23cabfa3fa5c364b939f7c to your computer and use it in GitHub Desktop.

Select an option

Save vivekthedev/c1c5f0fb0e23cabfa3fa5c364b939f7c to your computer and use it in GitHub Desktop.
lxml Scraping Tutorial Code - Static Scrape
import requests
from lxml import html
import json
URL = "https://books.toscrape.com/"
content = requests.get(URL).text
parsed = html.fromstring(content)
all_books = parsed.xpath('//article[@class="product_pod"]')
books = []
for book in all_books:
book_title = book.xpath('.//h3/a/@title')
price = book.cssselect("p.price_color")[0].text_content()
books.append({"title": book_title, "price": price})
with open("books.json", "w", encoding="utf-8") as file:
json.dump(books ,file, ensure_ascii=False)
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment