Requests - Web Scraping using Requests
We have already seen how we can get data from a given URL using the Python Requests library. We will now try to scrape data from the Tutorialspoint site using the following −
- Requests library
- Beautiful Soup library for Python
We have already installed the Requests library; let us now install the Beautiful Soup package. The official Beautiful Soup website is worth a visit in case you want to explore more of its functionality.
Installing Beautifulsoup
We shall see how to install Beautiful Soup below −
E:\prequests>pip install beautifulsoup4
Collecting beautifulsoup4
  Downloading cdf92bac4693b90d3ba79268be16527555e186f0/beautifulsoup4-4.8.1-py3-none-any.whl (101kB)
     |████████████████████████████████| 102kB 22kB/s
Collecting soupsieve>=1.2 (from beautifulsoup4)
  Downloading a99f7946ac228ca98da4fa75796c507f61e688c2/soupsieve-1.9.5-py2.py3-none-any.whl
Installing collected packages: soupsieve, beautifulsoup4
Successfully installed beautifulsoup4-4.8.1 soupsieve-1.9.5
We now have the Python Requests library and Beautiful Soup installed.
Let us now write the code that will scrape the data from the given URL.
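As a quick sanity check after installation, the short sketch below imports both packages and prints their versions. It assumes only that each package exposes a `__version__` attribute; the variable names are illustrative, and the versions printed will depend on what pip installed on your machine.

```python
import bs4
import requests

# Both packages expose their installed version as a string.
bs4_version = bs4.__version__
requests_version = requests.__version__

print("beautifulsoup4", bs4_version)
print("requests", requests_version)
```

If either import fails with ModuleNotFoundError, the package was most likely installed for a different Python interpreter than the one running the script.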
Web scraping
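import requests
from bs4 import BeautifulSoup

# The page URL is missing from the original listing; the Tutorialspoint
# home page is assumed here.
url = "https://www.tutorialspoint.com/"

# Fetch the page and report the HTTP status code
res = requests.get(url)
print("The status code is ", res.status_code)
print(" ")

# Parse the HTML, then print the <title> tag and every <h4> tag
soup_data = BeautifulSoup(res.text, "html.parser")
print(soup_data.title)
print(" ")
print(soup_data.find_all("h4"))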
Using the Requests library, we can fetch the content from the given URL, and the Beautiful Soup library helps to parse it and extract the details the way we want.
You can use Beautiful Soup to fetch data by HTML tag, class, id, CSS selector and in many other ways. Following is the output we get, wherein we have printed the title of the page and also all the h4 tags on the page.
Output
E:\prequests>python makeRequest.py
The status code is 200

<title>Free Online Tutorials and Courses</title>

[<h4>Academic</h4>, <h4>Computer Science</h4>, <h4>Digital Marketing</h4>, <h4>Monuments</h4>, <h4>Machine Learning</h4>, <h4>Mathematics</h4>, <h4>Mobile Development</h4>, <h4>SAP</h4>, <h4>Software Quality</h4>, <h4>Big Data & Analytics</h4>, <h4>Databases</h4>, <h4>Engineering Tutorials</h4>, <h4>Mainframe Development</h4>, <h4>Microsoft Technologies</h4>, <h4>Java Technologies</h4>, <h4>XML Technologies</h4>, <h4>Python Technologies</h4>, <h4>Sports</h4>, <h4>Computer Programming</h4>, <h4>DevOps</h4>, <h4>Latest Technologies</h4>, <h4>Telecom</h4>, <h4>Exams Syllabus</h4>, <h4>UPSC IAS Exams</h4>, <h4>Web Development</h4>, <h4>Scripts</h4>, <h4>Management</h4>, <h4>Soft Skills</h4>, <h4>Selected Reading</h4>, <h4>Misc</h4>]
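The other lookup styles mentioned earlier (by tag, class, id and CSS selector) can be sketched as follows. This is a minimal illustration, not part of the original script: the SAMPLE_HTML markup, its class and id names, and the variable names are all made up for the example, and an inline string is used in place of res.text so that the sketch runs without a network connection.

```python
from bs4 import BeautifulSoup

# Hypothetical markup standing in for a fetched page (res.text in the
# script above). The class and id names are invented for illustration.
SAMPLE_HTML = """
<html>
  <head><title>Sample Page</title></head>
  <body>
    <div id="courses">
      <h4 class="topic">Databases</h4>
      <h4 class="topic featured">Machine Learning</h4>
      <h4>Misc</h4>
    </div>
  </body>
</html>
"""

soup = BeautifulSoup(SAMPLE_HTML, "html.parser")

# By tag name: every <h4> element
by_tag = [h.get_text() for h in soup.find_all("h4")]

# By class: <h4> elements whose class list contains "topic"
by_class = [h.get_text() for h in soup.find_all("h4", class_="topic")]

# By id: the single element with id="courses"
by_id = soup.find(id="courses")

# By CSS selector: "featured" <h4> elements inside div#courses
by_css = [h.get_text() for h in soup.select("div#courses h4.featured")]

print(by_tag)
print(by_class)
print(by_id.name)
print(by_css)
```

On a real page the same calls would be made on the soup built from res.text, with the tag, class and id names replaced by ones that actually appear in that page's markup.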