Beautiful Soup Tutorial
In this tutorial, we show how to perform web scraping in Python using Beautiful Soup 4 to extract data from HTML, XML, and other markup languages. We will scrape web pages from several different websites (including IMDB). We cover Beautiful Soup 4 along with basic Python tools for efficiently and clearly navigating, searching, and parsing HTML pages. We have tried to cover almost all the functionalities of Beautiful Soup 4 in this tutorial. You can combine several of the functionalities introduced here into one larger program that captures meaningful data from a website and passes it to another sub-program as input.
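To give a first feel for the parse, navigate, and search steps mentioned above, here is a minimal sketch. It assumes the `beautifulsoup4` package is installed and uses Python's built-in `html.parser` backend. The markup, class name, and link paths are made up for illustration and do not come from any real website.

```python
from bs4 import BeautifulSoup

# Hypothetical sample markup, used here instead of fetching a live page.
html = """
<html>
  <head><title>Sample Movies</title></head>
  <body>
    <h1>Top Picks</h1>
    <ul>
      <li class="movie"><a href="/title/1">First Film</a></li>
      <li class="movie"><a href="/title/2">Second Film</a></li>
    </ul>
  </body>
</html>
"""

# Parse the document with the parser that ships with Python.
soup = BeautifulSoup(html, "html.parser")

# Navigate by tag name: soup.title is the first <title> element.
page_title = soup.title.string

# Search for a single element and read its text.
heading = soup.find("h1").get_text()

# Search for all matching elements, then pull text and an attribute from each.
movies = [
    (li.a.get_text(), li.a["href"])
    for li in soup.find_all("li", class_="movie")
]

print(page_title)
print(heading)
print(movies)
```

In a real scraper the `html` string would come from a downloaded page rather than a literal, but the parsing and searching calls stay the same.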
Audience
This tutorial is designed to guide you in scraping a web page. The basic goal is to get meaningful data out of a huge, unorganized set of data. The target audience of this tutorial includes:
Anyone who wants to know how to scrape a web page in Python using Beautiful Soup 4.
Any data science developer or enthusiast who wants to feed the scraped (meaningful) data into Python data science libraries to make better decisions.
Prerequisites
There is no mandatory prerequisite for this tutorial. However, prior knowledge of any or all of the technologies below will be an added advantage:
Knowledge of web-related technologies (HTML, CSS, the Document Object Model, etc.).
The Python language (Beautiful Soup is a Python package).
Prior experience with scraping in any language.
A basic understanding of HTML tree structure.