@tgranqvist
Created August 21, 2020 15:17
A quick test of scraping product info from a verkkokauppa.com page with Python, Selenium, and ChromeDriver
beautifulsoup4==4.9.1
certifi==2020.6.20
chardet==3.0.4
idna==2.10
lxml==4.5.2
selenium==3.141.0
soupsieve==2.0.1
urllib3==1.25.10
import time

from bs4 import BeautifulSoup
from selenium import webdriver
from selenium.webdriver.chrome.options import Options

URL = 'https://www.verkkokauppa.com/fi/catalog/4793c/Android/products'

if __name__ == "__main__":
    print('Loading driver...')
    options = Options()
    options.add_argument("--disable-extensions")
    options.add_argument("--disable-gpu")
    options.add_argument("--headless")
    # Selenium 3 style: the chromedriver path is the first positional argument.
    driver = webdriver.Chrome('./bin/chromedriver.exe', options=options)

    try:
        print('Getting site...')
        driver.get(URL)

        # The product list is rendered client-side, so wait before reading the DOM.
        print('Awaiting dynamic content...')
        time.sleep(15)

        dom = BeautifulSoup(driver.page_source, 'lxml')
        # Located for later use (paging through results); not used yet.
        paginator_div = dom.find('div', class_='back-forward-paginator')
        product_divs = dom.find_all('div', class_='list-product')
        print(f'Found {len(product_divs)} products')

        for product_div in product_divs:
            name_link = product_div.find('a', class_='list-product-info__link')
            price_div = product_div.find('div', class_="price-container")
            print(f'{name_link.text}, @ {price_div.text}')
    finally:
        # Always shut the browser down, even if scraping fails midway.
        driver.quit()
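
The fixed `time.sleep(15)` in the script always costs the full 15 seconds, and still fails if the page takes longer to render. One possible alternative is to poll for a condition and continue as soon as it holds. The sketch below is only an illustration: `wait_until`, its parameters, and the `ready` stand-in are hypothetical names, not part of Selenium or of the original gist. Selenium ships its own `WebDriverWait` class built on the same idea.

```python
import time


def wait_until(predicate, timeout=15.0, poll_interval=0.5):
    """Call predicate() repeatedly until it returns a truthy value.

    Returns that value, or raises TimeoutError once `timeout` seconds
    have passed. Hypothetical helper for illustration only.
    """
    deadline = time.monotonic() + timeout
    while True:
        result = predicate()
        if result:
            return result
        if time.monotonic() >= deadline:
            raise TimeoutError(f'condition not met within {timeout} seconds')
        time.sleep(poll_interval)


if __name__ == '__main__':
    # Stand-in for "the product list has been rendered":
    # it becomes true on the third check.
    checks = []

    def ready():
        checks.append(1)
        return len(checks) >= 3

    wait_until(ready, timeout=5, poll_interval=0.1)
    print(f'ready after {len(checks)} checks')
```

In the script, the predicate could be something like `lambda: driver.find_elements_by_class_name('list-product')` (a Selenium 3 call), so that parsing starts as soon as at least one product element exists.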