Scrapy - Logging
Description
Logging means tracking events. It uses Python's built-in logging system, which defines the functions and classes needed to implement logging in applications and libraries. Logging is ready to use out of the box and works with the Scrapy settings listed under Logging Settings below.
Scrapy sets some logging defaults and handles them with the help of scrapy.utils.log.configure_logging() when running commands.
Log Levels
In Python, there are five different levels of severity for a log message. The following list shows the standard log levels in ascending order of severity; a short sketch after the list demonstrates how this threshold filtering works −
logging.DEBUG − for debugging messages (lowest severity)
logging.INFO − for informational messages
logging.WARNING − for warning messages
logging.ERROR − for regular errors
logging.CRITICAL − for critical errors (highest severity)
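A level acts as a threshold: messages below the configured level are filtered out. Here is a minimal sketch in plain Python demonstrating this; the WARNING threshold is just an example −

import logging

# Configure the root logger to show WARNING and above;
# DEBUG and INFO messages fall below the threshold.
logging.basicConfig(level = logging.WARNING)

logging.debug("Filtered out: below WARNING")
logging.info("Filtered out: below WARNING")
logging.warning("Shown: meets the threshold")
logging.error("Shown: above the threshold")
logging.critical("Shown: highest severity")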
How to Log Messages
The following code shows how to log a message using logging.info −
import logging
logging.info("This is an information")
The same message can be logged by passing the level as an argument to logging.log, as shown below −
import logging
logging.log(logging.INFO, "This is an information")
You can also log the message through a logger object, obtained with the logging.getLogger helper, as shown below −
import logging
logger = logging.getLogger()
logger.info("This is an information")
There can be multiple loggers, and each can be accessed by name using the logging.getLogger function, as shown below −
import logging
logger = logging.getLogger("mycustomlogger")
logger.info("This is an information")
A customized logger can be used for any module by passing the __name__ variable, which contains the module path, as shown below −
import logging
logger = logging.getLogger(__name__)
logger.info("This is an information")
Logging from Spiders
Every spider instance has a logger within it, which can be used as follows −
import scrapy

class LogSpider(scrapy.Spider):
    name = "logspider"
    start_urls = ["http://dmoz.com"]

    def parse(self, response):
        self.logger.info("Parse function called on %s", response.url)
In the above code, the logger is created using the spider's name, but you can use any customized Python logger, as shown in the following code −
import logging
import scrapy

logger = logging.getLogger("customizedlogger")

class LogSpider(scrapy.Spider):
    name = "logspider"
    start_urls = ["http://dmoz.com"]

    def parse(self, response):
        logger.info("Parse function called on %s", response.url)
Logging Configuration
Loggers cannot display the messages sent to them on their own. They require "handlers", which redirect the messages to their respective destinations such as files, e-mails, and standard output.
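To make the idea of a handler concrete, here is a minimal sketch in plain Python that attaches a stream handler to a logger; the logger name and format are illustrative −

import logging

logger = logging.getLogger("mycustomlogger")

# A handler gives the logger a destination; StreamHandler writes to standard error.
handler = logging.StreamHandler()
handler.setFormatter(logging.Formatter("%(levelname)s: %(message)s"))

logger.addHandler(handler)
logger.setLevel(logging.INFO)
logger.info("This message now has a destination")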
Based on the settings described below, Scrapy configures the handler for the root logger.
Logging Settings
The following settings are used to configure logging −
LOG_FILE and LOG_ENABLED decide the destination of log messages; setting LOG_ENABLED to false disables log output entirely.
LOG_ENCODING sets the encoding to be used for logging; it defaults to utf-8.
LOG_LEVEL determines the minimum severity level of a message; messages with lower severity are filtered out.
LOG_FORMAT and LOG_DATEFORMAT specify the layouts of all messages.
When you set LOG_STDOUT to true, all the standard output and error messages of your process are redirected to the log.
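For illustration, these settings might appear in a project's settings.py as shown below; all the values are examples rather than defaults −

# settings.py -- illustrative values for the logging settings above
LOG_ENABLED = True                 # enable logging
LOG_FILE = "scrapy_output.log"     # example file name; None writes to standard error
LOG_ENCODING = "utf-8"             # encoding used for the log file
LOG_LEVEL = "INFO"                 # filter out messages below INFO severity
LOG_FORMAT = "%(asctime)s [%(name)s] %(levelname)s: %(message)s"
LOG_DATEFORMAT = "%Y-%m-%d %H:%M:%S"
LOG_STDOUT = False                 # set to True to redirect stdout/stderr into the log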
Command-line Options
These settings can be overridden by passing command-line arguments, as shown in the following table −
Sr.No | Command & Description
---|---
1 | --logfile FILE: Overrides LOG_FILE
2 | --loglevel/-L LEVEL: Overrides LOG_LEVEL
3 | --nolog: Sets LOG_ENABLED to False
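For example, assuming the logspider spider from the earlier example, a crawl could be run with a log file and a raised log level like this (the file name is illustrative) −

scrapy crawl logspider --logfile spider.log -L WARNING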
scrapy.utils.log module
This function can be used to initialize logging defaults for Scrapy.
scrapy.utils.log.configure_logging(settings = None, install_root_handler = True)
Sr.No | Parameter & Description
---|---
1 | settings (dict, None): Creates and configures the handler for the root logger. Defaults to None.
2 | install_root_handler (bool): Specifies whether to install the root logging handler. Defaults to True.
The above function −
Routes warnings and Twisted log messages through the Python standard logging.
Assigns the DEBUG level to the Scrapy logger and the ERROR level to Twisted loggers.
Routes stdout to the log, if the LOG_STDOUT setting is true.
Default options can be overridden using the settings argument; when settings are not specified, the defaults are used. A handler is created for the root logger when install_root_handler is set to True. If it is set to False, no log output handler is installed. When using Scrapy commands, configure_logging is called automatically, but it must be called explicitly when running custom scripts.
To configure logging's output manually, you can use logging.basicConfig(), as shown below −
import logging
from scrapy.utils.log import configure_logging

configure_logging(install_root_handler = False)
logging.basicConfig(
    filename = "logging.txt",
    format = "%(levelname)s: %(message)s",
    level = logging.INFO
)
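To tie this together, configure_logging is typically called explicitly in a custom script that starts a crawl without a Scrapy command. The following is a minimal sketch, assuming the LogSpider class from the earlier example is defined in the same script −

from twisted.internet import reactor
from scrapy.crawler import CrawlerRunner
from scrapy.utils.log import configure_logging

# No Scrapy command is running here, so configure logging explicitly.
configure_logging({"LOG_FORMAT": "%(levelname)s: %(message)s"})

runner = CrawlerRunner()
d = runner.crawl(LogSpider)           # LogSpider: the spider class shown earlier
d.addBoth(lambda _: reactor.stop())   # stop the reactor once the crawl finishes
reactor.run()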