Integrating WebScrapingAPI with BeautifulSoup in Python

BeautifulSoup is a popular Python library for parsing HTML and XML files. In a web scraping project, it's useful for turning unreadable strings of HTML into clean, structured data. In this guide, we'll show you how to integrate BeautifulSoup with WebScrapingAPI.

A basic example of integrating WebScrapingAPI with BeautifulSoup

Our objective, in this case, is to scrape the Wikipedia page for web scraping and get two things: a clear picture of the page's HTML structure and the page's paragraphs, parsed for readability.

Here's the code:

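import requests
from bs4 import BeautifulSoup

# WebScrapingAPI endpoint URL, taken from the API documentation
ENDPOINT = ""

params = {
  "api_key": "",  # your API key (parameter name assumed; check the API documentation)
  "url": ""       # URL of the page you want to scrape
}

# Request the page through the API and parse the returned HTML
page = requests.request("GET", ENDPOINT, params=params)
page_soup = BeautifulSoup(page.content, 'html.parser')
page_body = page_soup.find('body')

# Remove every <script> element found in the body
for s in page_body.find_all('script'):
    s.extract()

# Save the indented HTML of the body
with open("cleanpage.txt", "w", encoding="utf-8") as scraped_data_file:
    scraped_data_file.write(page_body.prettify())

paragraphs = page_soup.find_all('p')

# Save each paragraph on its own line
with open("cleanparagraphs.txt", "w", encoding="utf-8") as paragraphs_file:
    for paragraph in paragraphs:
        paragraphs_file.write(str(paragraph) + "\n")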

Let's look at what the script does, step by step:

  1. Defines the endpoint and the basic parameters you'll need to use the API, which are your API key and the encoded URL of the page you want to scrape.
  2. Uses the GET method to extract the HTML on the Wikipedia page.
  3. Uses BeautifulSoup to parse the extracted data and save the body of the page in "page_body".
  4. Checks for any scripts in the HTML of the body, removing any it finds.
  5. Indents the HTML code with the prettify() function and saves the results in a text file named cleanpage.txt.
  6. Copies all paragraphs on the page and pastes the results, separated with new lines, into a text file named cleanparagraphs.txt.
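
Step 1 mentions an encoded URL. When parameters are passed through the params argument, requests handles that encoding for you, but it can help to see what the final request looks like. The sketch below builds the query string by hand with the standard library. The endpoint "https://api.example.com/v1", the "api_key" parameter name, and the Wikipedia address are placeholders assumed for illustration, not values confirmed by this guide.

```python
from urllib.parse import urlencode, urlsplit, parse_qs


def build_request_url(endpoint, api_key, target_url):
    """Return the endpoint with the API key and target URL as an encoded query string."""
    # urlencode percent-encodes reserved characters such as ":" and "/"
    query = urlencode({"api_key": api_key, "url": target_url})
    return f"{endpoint}?{query}"


# Hypothetical values, used only to show the encoding
example = build_request_url(
    "https://api.example.com/v1",
    "YOUR_API_KEY",
    "https://en.wikipedia.org/wiki/Web_scraping",
)
print(example)

# Decoding the query string gives back the original target URL
decoded = parse_qs(urlsplit(example).query)
print(decoded["url"][0])
```

The target URL has to be encoded because it contains characters (":" and "/") that would otherwise be read as part of the API request's own address.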

Save the code in a Python file, then run that file with the Python interpreter from your IDE or terminal.

After running the code, you will have two files: one containing indented HTML similar to what you'd see with "inspect element", and one containing the page's paragraphs, formatted so that they're easier for people to read.
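
Because the script writes str(paragraph), the paragraphs file still contains the <p> tags and any inline markup. If plain text is preferable, one option is BeautifulSoup's get_text() method. The following is a minimal sketch that runs on a small inline HTML sample instead of a live page, so it needs no API key. SAMPLE_HTML and the two helper functions are made up for illustration.

```python
from bs4 import BeautifulSoup

# Hypothetical sample standing in for the HTML returned by the API
SAMPLE_HTML = """<html><body>
<p>First <b>paragraph</b>.</p>
<script>console.log("tracking");</script>
<p>Second paragraph.</p>
</body></html>"""


def clean_body(html):
    """Parse the HTML and return its body with all <script> elements removed."""
    soup = BeautifulSoup(html, "html.parser")
    body = soup.find("body")
    for script in body.find_all("script"):
        script.decompose()
    return body


def paragraph_texts(body):
    """Return the text of each <p> element, without tags."""
    return [p.get_text().strip() for p in body.find_all("p")]


body = clean_body(SAMPLE_HTML)
print(body.prettify())

for text in paragraph_texts(body):
    print(text)
```

Writing these strings to cleanparagraphs.txt instead of str(paragraph) would give a file with no markup at all, at the cost of losing links and emphasis.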

