Integrating WebScrapingAPI with Selenium in Python
The Selenium framework exists to automate browsers. While it can be used in dozens of different ways, its primary use in web scraping is to execute JavaScript and mimic user interaction. In this guide, we'll show you how to integrate Selenium with WebScrapingAPI.
A basic example of integrating WebScrapingAPI with Selenium
In the following example, we'll use Selenium to send a request through the API to http://httpbin.org/ip, which returns the IP address the request came from. Because WebScrapingAPI rotates proxies on every request, each run of the code below should report a different IP address.
python -m pip install selenium

from selenium import webdriver
from selenium.webdriver.chrome.options import Options
from selenium.webdriver.chrome.service import Service

CHROMEDRIVER_PATH = "your/path/here/chromedriver_win32/chromedriver"
URL = "https://api.webscrapingapi.com/v1/?api_key=YOUR_API_KEY&url=https%3A%2F%2Fhttpbin.org/ip"

options = Options()
options.add_argument("--headless=new")  # run Chrome without opening a window

# Selenium 4 expects the driver path wrapped in a Service object
driver = webdriver.Chrome(service=Service(CHROMEDRIVER_PATH), options=options)
driver.get(URL)

# write the rendered response to a file we can inspect later
with open('seleniumpage.txt', mode='w', encoding='utf-8') as file:
    file.write(driver.page_source)

driver.quit()
After installing Selenium and importing what you need, download the chromedriver build that matches your Google Chrome version and copy the path to the chromedriver executable.
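If keeping chromedriver in sync with Chrome becomes a hassle, note that Selenium 4.6 and newer ship Selenium Manager, which can fetch a matching driver for you when no path is supplied. A minimal sketch, assuming such a Selenium version:

from selenium import webdriver
from selenium.webdriver.chrome.options import Options

options = Options()
options.add_argument("--headless=new")

# No Service/path given: Selenium Manager (4.6+) downloads and caches
# a chromedriver that matches the locally installed Chrome.
driver = webdriver.Chrome(options=options)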
In the options, we chose to run Chrome headless, since a visible browser window doesn't help with web scraping.
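Depending on your Chrome and Selenium versions, the headless flag can be spelled in a few ways; the lines below sketch the common variants, with the alternatives left as comments:

options = Options()
options.add_argument("--headless=new")   # "new" headless mode, Chrome 109+
# options.add_argument("--headless")     # older Chrome builds
# options.headless = True                # legacy Selenium shorthand, dropped in recent releases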
Take a look at the URL variable. https://api.webscrapingapi.com/v1/ is the API endpoint, followed by ?api_key= with your personal key and &url= with the encoded URL of the page you want to scrape, which is http://httpbin.org/ip in this case. The script then writes the returned data to a file named seleniumpage.txt.
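Hand-encoding the target URL (the %3A%2F%2F part) is easy to get wrong, so you may prefer to build the request URL with Python's standard library. This is a minimal sketch, assuming the same api_key and url parameters described above:

from urllib.parse import urlencode

API_ENDPOINT = "https://api.webscrapingapi.com/v1/"
params = {
    "api_key": "YOUR_API_KEY",
    "url": "http://httpbin.org/ip",  # urlencode percent-encodes this value for us
}
URL = API_ENDPOINT + "?" + urlencode(params)
# -> https://api.webscrapingapi.com/v1/?api_key=YOUR_API_KEY&url=http%3A%2F%2Fhttpbin.org%2Fip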
Save the code in a .py file, let's name it seleniumintegration.py, then run it from a terminal (or your IDE's integrated terminal):
py seleniumintegration.py
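To actually see the proxy rotation mentioned earlier, you can issue the request twice from the same script and compare the two responses. This is a rough sketch, assuming Selenium 4.6+ (so no driver path is needed) and that the JSON body returned by httpbin.org/ip ends up as the page's visible text:

from selenium import webdriver
from selenium.webdriver.chrome.options import Options
from selenium.webdriver.common.by import By

URL = "https://api.webscrapingapi.com/v1/?api_key=YOUR_API_KEY&url=https%3A%2F%2Fhttpbin.org/ip"

options = Options()
options.add_argument("--headless=new")
driver = webdriver.Chrome(options=options)  # Selenium Manager picks a matching driver

responses = []
for _ in range(2):
    driver.get(URL)
    # httpbin.org/ip responds with JSON such as {"origin": "1.2.3.4"}
    responses.append(driver.find_element(By.TAG_NAME, "body").text)

driver.quit()
print(responses)  # with proxy rotation, the two origins should differ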