Scrapy - Feed exports
  • Date: 2024-11-03


Description

Feed exports are a way of storing the data scraped from sites by generating an "export file".

Serialization Formats

Feed exports use Item exporters to generate a feed of the scraped items, and support multiple serialization formats and storage backends.

The following table shows the supported formats:

Sr.No  Format      FEED_FORMAT  Exporter
1      JSON        json         scrapy.exporters.JsonItemExporter
2      JSON lines  jsonlines    scrapy.exporters.JsonLinesItemExporter
3      CSV         csv          scrapy.exporters.CsvItemExporter
4      XML         xml          scrapy.exporters.XmlItemExporter
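
The difference between the json and jsonlines formats is easiest to see on a small sample. The sketch below uses only Python's standard json module to mimic the shape of the two outputs; it does not call Scrapy's exporters, and the item fields are made up for illustration.

```python
import json

# Hypothetical scraped items (field names are assumptions)
items = [
    {"title": "First page", "url": "https://example.com/1"},
    {"title": "Second page", "url": "https://example.com/2"},
]

# "json" shape: the whole feed is a single JSON array
as_json = json.dumps(items)

# "jsonlines" shape: one JSON object per line, no enclosing array
as_jsonlines = "\n".join(json.dumps(item) for item in items)

print(as_json)
print(as_jsonlines)
```

Because each line of a JSON lines feed is a complete object, such a feed can be read back one record at a time, which tends to suit large exports better than a single array.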

Pickle and Marshal exporters are also provided, and the FEED_EXPORTERS setting can be used to register additional formats:

Sr.No  Format   FEED_FORMAT  Exporter
1      Pickle   pickle       scrapy.exporters.PickleItemExporter
2      Marshal  marshal      scrapy.exporters.MarshalItemExporter
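
As a rough illustration of registering an extra format, FEED_EXPORTERS maps a format name to the import path of an exporter class. In the settings fragment below, the format name "tsv" and the class path myproject.exporters.TsvItemExporter are hypothetical; the class itself would have to be written separately.

```python
# settings.py (sketch)

# Hypothetical format name and exporter path, used only for illustration
FEED_EXPORTERS = {
    "tsv": "myproject.exporters.TsvItemExporter",
}

# Select the newly registered format for the feed
FEED_FORMAT = "tsv"
```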

Storage Backends

A storage backend defines where the feed is stored; it is selected by the scheme of the feed URI.

The following table shows the supported storage backends:

Sr.No  Storage Backend   URI scheme  Description
1      Local filesystem  file        Feeds are stored on the local filesystem.
2      FTP               ftp         Feeds are stored on an FTP server.
3      S3                s3          Feeds are stored on Amazon S3. The external library botocore (or boto) is required.
4      Standard output   stdout      Feeds are written to the standard output.
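
To make the role of the URI scheme concrete, the sketch below lists one example URI per backend and extracts the scheme with the standard library. The host, bucket and path names are placeholders, and the snippet only parses the strings; it does not contact any storage.

```python
from urllib.parse import urlparse

# Example feed URIs, one per backend (hosts, bucket and paths are placeholders)
EXAMPLE_URIS = [
    "file:///tmp/export.csv",
    "ftp://user:pass@ftp.example.com/path/to/export.csv",
    "s3://mybucket/path/to/export.csv",
    "stdout:",
]

# The scheme is the part before the first ':' and decides which backend applies
schemes = [urlparse(uri).scheme for uri in EXAMPLE_URIS]

for scheme, uri in zip(schemes, EXAMPLE_URIS):
    print(scheme, "->", uri)
```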

Storage URI Parameters

The following parameters in the storage URI are replaced when the feed is created:

    %(time)s: replaced by a timestamp.

    %(name)s: replaced by the spider name.
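
The placeholders use Python's named %-formatting syntax, so the substitution can be sketched with plain string formatting. The helper expand_feed_uri below is hypothetical and is not part of Scrapy; the timestamp layout and the FTP host are assumptions made for the example.

```python
from datetime import datetime, timezone


def expand_feed_uri(template, spider_name, now=None):
    """Fill %(name)s and %(time)s in a feed URI template (illustrative helper)."""
    if now is None:
        now = datetime.now(timezone.utc)
    params = {
        "name": spider_name,
        # Assumed timestamp layout, chosen to avoid ':' in file names
        "time": now.strftime("%Y-%m-%dT%H-%M-%S"),
    }
    # %-formatting with a mapping replaces the named placeholders
    return template % params


print(expand_feed_uri(
    "ftp://user:pass@ftp.example.com/scraping/feeds/%(name)s/%(time)s.json",
    "quotes",
    datetime(2024, 11, 3, 12, 30, 0),
))
# prints ftp://user:pass@ftp.example.com/scraping/feeds/quotes/2024-11-03T12-30-00.json
```

Putting the spider name and a timestamp in the path gives each run its own file, so a later crawl does not overwrite an earlier feed.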

Settings

The following table shows the settings used to configure feed exports:

Sr.No  Setting              Description
1      FEED_URI             The URI of the export feed; setting it enables feed exports.
2      FEED_FORMAT          The serialization format used for the feed.
3      FEED_EXPORT_FIELDS   Defines which fields are exported.
4      FEED_STORE_EMPTY     Defines whether to export feeds that contain no items.
5      FEED_STORAGES        A dictionary of additional feed storage backends.
6      FEED_STORAGES_BASE   A dictionary of the built-in feed storage backends.
7      FEED_EXPORTERS       A dictionary of additional feed exporters.
8      FEED_EXPORTERS_BASE  A dictionary of the built-in feed exporters.
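
A minimal settings.py sketch combining the most common of these settings might look as follows. The output path and the field names are hypothetical. Recent Scrapy releases configure feeds through a FEEDS dictionary instead of FEED_URI and FEED_FORMAT, so it is worth checking the documentation for the version in use.

```python
# settings.py (sketch): the path and field names below are hypothetical

# Where the feed is written; %(name)s and %(time)s are filled in when the feed is created
FEED_URI = "file:///tmp/feeds/%(name)s/%(time)s.csv"

# Serialization format of the feed
FEED_FORMAT = "csv"

# Only these item fields are exported
FEED_EXPORT_FIELDS = ["title", "url", "price"]

# Do not export a feed when no items were scraped
FEED_STORE_EMPTY = False
```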
