Scrapy - Stats Collection
Description
The Stats Collector is a facility provided by Scrapy for collecting stats as key/value pairs. It is accessed through the Crawler API (the Crawler provides access to all Scrapy core components). The stats collector keeps one stats table per spider; the table is opened automatically when the spider opens and closed when the spider closes.
Common Stats Collector Uses
The following code accesses the stats collector through the crawler's stats attribute.

    class ExtensionThatAccessStats(object):

        def __init__(self, stats):
            self.stats = stats

        @classmethod
        def from_crawler(cls, crawler):
            # crawler.stats is the stats collector of the running crawler
            return cls(crawler.stats)
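To see how `from_crawler` hands the collector to the extension without running a crawl, the sketch below passes a stand-in crawler object to the class. The `SimpleNamespace` object and the plain dict used as `stats` are assumptions made for illustration only; in a real project Scrapy builds the crawler and calls `from_crawler` itself.

```python
from types import SimpleNamespace


class ExtensionThatAccessStats:
    def __init__(self, stats):
        self.stats = stats

    @classmethod
    def from_crawler(cls, crawler):
        # Scrapy passes the running crawler; its stats attribute is the collector.
        return cls(crawler.stats)


# Stand-in for the crawler Scrapy would normally supply (illustration only).
fake_crawler = SimpleNamespace(stats={"custom_count": 1})

extension = ExtensionThatAccessStats.from_crawler(fake_crawler)

# The extension holds a reference to the crawler's collector, not a copy.
print(extension.stats is fake_crawler.stats)
```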
The following table shows the methods that can be used with the stats collector:
Sr.No | Method | Description |
---|---|---|
1 | stats.set_value('hostname', socket.gethostname()) | Sets the stat value. |
2 | stats.inc_value('customized_count') | Increments the stat value. |
3 | stats.max_value('max_items_scraped', value) | Sets the stat value only if it is greater than the previous value. |
4 | stats.min_value('min_free_memory_percent', value) | Sets the stat value only if it is lower than the previous value. |
5 | stats.get_value('customized_count') | Fetches the stat value. |
6 | stats.get_stats() | Fetches all the stats, for example {'custom_count': 1, 'start_time': datetime.datetime(2009, 7, 14, 21, 47, 28, 977139)}. |
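The behavior described in the table can be tried outside a crawl with a small in-memory stand-in. The class `SketchStats` below is hypothetical and is not Scrapy's implementation; it only mirrors the method names listed above, assuming that `inc_value` adds 1 by default and starts from 0 when the key does not exist yet.

```python
class SketchStats:
    """Hypothetical stand-in that mirrors the stats collector methods."""

    def __init__(self):
        self._stats = {}

    def set_value(self, key, value):
        self._stats[key] = value

    def get_value(self, key, default=None):
        return self._stats.get(key, default)

    def inc_value(self, key, count=1, start=0):
        # Missing keys start at `start` before the increment is applied.
        self._stats[key] = self._stats.get(key, start) + count

    def max_value(self, key, value):
        # Keep whichever is larger: the stored value or the new one.
        self._stats[key] = max(self._stats.get(key, value), value)

    def min_value(self, key, value):
        # Keep whichever is smaller: the stored value or the new one.
        self._stats[key] = min(self._stats.get(key, value), value)

    def get_stats(self):
        return self._stats


stats = SketchStats()
stats.set_value("hostname", "example-host")
stats.inc_value("customized_count")
stats.inc_value("customized_count")
stats.max_value("max_items_scraped", 10)
stats.max_value("max_items_scraped", 7)          # ignored: not greater than 10
stats.min_value("min_free_memory_percent", 40)
stats.min_value("min_free_memory_percent", 25)   # replaces 40: lower value wins

print(stats.get_stats())
```

Inside a spider or extension the same calls would be made on the collector obtained from the crawler rather than on this stand-in.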
Available Stats Collectors
Scrapy provides different types of stats collectors, which can be selected using the STATS_CLASS setting.
MemoryStatsCollector
It is the default stats collector. It maintains the stats of every spider that was used for scraping and stores the data in memory.
class scrapy.statscollectors.MemoryStatsCollector
DummyStatsCollector
This stats collector does nothing, which makes it very efficient. It can be selected through the STATS_CLASS setting to disable stats collection and improve performance.
class scrapy.statscollectors.DummyStatsCollector
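As a minimal sketch, assuming a standard project settings module named `settings.py`, stats collection could be switched off by pointing STATS_CLASS at the dotted path of the class above:

```python
# settings.py (project settings module; the file name is an assumption)
# Replace the default MemoryStatsCollector with the collector that does nothing.
STATS_CLASS = "scrapy.statscollectors.DummyStatsCollector"
```

Removing the line again restores the default in-memory collector.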