Apache NiFi - Processors
Apache NiFi processors are the basic building blocks of a data flow. Each processor provides different functionality, which contributes to the creation of the output flowfile. The data flow shown in the image below fetches a file from one directory using the GetFile processor and stores it in another directory using the PutFile processor.
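To make the GetFile to PutFile flow more concrete, here is a minimal Python sketch of the same idea outside NiFi: files are picked up from one directory and stored in another. The function name move_files and both directory arguments are hypothetical, and the sketch assumes the source files are removed once they are picked up. It is an analogy for what the two processors do together, not NiFi code.

```python
import shutil
from pathlib import Path
from typing import List


def move_files(input_dir: str, output_dir: str) -> List[str]:
    """Move every regular file from input_dir to output_dir.

    Hypothetical stand-in for a GetFile -> PutFile flow, not NiFi's API.
    Returns the names of the files that were moved, in sorted order.
    """
    source = Path(input_dir)
    target = Path(output_dir)
    target.mkdir(parents=True, exist_ok=True)  # create the output directory if missing

    moved = []
    for entry in sorted(source.iterdir()):
        if entry.is_file():  # skip subdirectories
            shutil.move(str(entry), str(target / entry.name))
            moved.append(entry.name)
    return moved
```

In NiFi itself no code is needed for this: the two processors are dropped on the canvas, connected, and configured through the tabs described below.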
GetFile
The GetFile processor is used to fetch files of a specific format from a specific directory. It also gives the user other options for more control over fetching, which are discussed in the Properties section below.
GetFile Settings
The GetFile processor has the following settings −
Name
In the Name setting, a user can give the processor any name, either one that follows the project's conventions or one that makes the processor's purpose clearer.
Enable
A user can enable or disable the processor using this setting.
Penalty Duration
This setting lets a user specify the penalty duration, that is, how long a flowfile is held back from further processing in the event of a flowfile failure.
Yield Duration
This setting specifies the yield time for the processor. During this time, the processor is not scheduled to run again.
Bulletin Level
This setting specifies the log level at which the processor generates bulletins.
Automatically Terminate Relationships
This setting has a list of check boxes for all the available relationships of the processor. By checking a box, a user can configure the processor to terminate the flowfile on that relationship instead of sending it further in the flow.
GetFile Scheduling
The GetFile processor offers the following scheduling options −
Schedule Strategy
You can schedule the processor either on a time basis, by selecting the Timer driven option, or with a specified CRON string, by selecting the CRON driven option.
Concurrent Tasks
This option defines the number of concurrent tasks scheduled for this processor.
Execution
Using this option, a user can define whether the processor runs on all nodes or only on the primary node.
Run Schedule
It defines the time interval for the Timer driven strategy or the CRON expression for the CRON driven strategy.
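The four scheduling options above can be pictured as one small configuration record. The following Python sketch is only an illustration: the class name Scheduling, the strategy and execution constants, the "number plus unit" format for timer intervals, and the six or seven field Quartz-style CRON format are all assumptions made for the example and should be checked against the documentation for your NiFi version.

```python
import re
from dataclasses import dataclass

# Assumed names for illustration only; not taken from the NiFi source.
STRATEGIES = {"TIMER_DRIVEN", "CRON_DRIVEN"}
EXECUTION_MODES = {"ALL", "PRIMARY"}


@dataclass
class Scheduling:
    strategy: str              # Schedule Strategy
    run_schedule: str          # Run Schedule: interval or CRON expression
    concurrent_tasks: int = 1  # Concurrent Tasks
    execution: str = "ALL"     # Execution: all nodes or primary node only

    def validate(self) -> None:
        """Raise ValueError if the options do not fit together."""
        if self.strategy not in STRATEGIES:
            raise ValueError(f"unknown strategy: {self.strategy}")
        if self.execution not in EXECUTION_MODES:
            raise ValueError(f"unknown execution mode: {self.execution}")
        if self.concurrent_tasks < 1:
            raise ValueError("concurrent_tasks must be at least 1")
        if self.strategy == "TIMER_DRIVEN":
            # Assumption: an interval looks like "10 sec" or "5 min".
            if not re.fullmatch(r"\d+(\.\d+)?\s*[A-Za-z]+", self.run_schedule.strip()):
                raise ValueError("timer interval should be '<number> <unit>'")
        else:
            # Assumption: Quartz-style expression with 6 or 7 fields.
            if len(self.run_schedule.split()) not in (6, 7):
                raise ValueError("CRON expression should have 6 or 7 fields")
```

For example, Scheduling("TIMER_DRIVEN", "10 sec").validate() passes, while pairing the CRON driven strategy with an interval such as "10 sec" raises a ValueError, which mirrors the point that Run Schedule is interpreted according to the selected strategy.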
GetFile Properties
GetFile offers multiple properties, as shown in the image below, ranging from compulsory properties like Input Directory and File Filter to optional properties like Path Filter and Maximum File Size. A user can manage the file fetching process using these properties.
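As a rough illustration of how such properties narrow down which files are fetched, the Python sketch below selects files from a directory by name pattern and size. It assumes the file filter is a regular expression matched against the whole file name and that the maximum size is given in bytes. The function select_files and its parameters are hypothetical and only mirror the property names mentioned above.

```python
import re
from pathlib import Path
from typing import List, Optional


def select_files(input_dir: str,
                 file_filter: str = r".*",
                 max_file_size: Optional[int] = None) -> List[str]:
    """Return the sorted names of files in input_dir that pass both filters.

    Hypothetical sketch, not NiFi code:
    file_filter   -- regular expression the whole file name must match
    max_file_size -- upper size limit in bytes, or None for no limit
    """
    pattern = re.compile(file_filter)
    selected = []
    for entry in sorted(Path(input_dir).iterdir()):
        if not entry.is_file():
            continue  # ignore subdirectories
        if not pattern.fullmatch(entry.name):
            continue  # name does not match the file filter
        if max_file_size is not None and entry.stat().st_size > max_file_size:
            continue  # file is larger than the allowed maximum
        selected.append(entry.name)
    return selected
```

Calling select_files("/data/in", r".*\.csv", 1024) would, under these assumptions, return only the CSV files of at most 1024 bytes; the path /data/in is a made-up example.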
GetFile Comments
This section is used to add any descriptive information about the processor.
PutFile
The PutFile processor is used to store the file from the data flow to a specific location.
PutFile Settings
The PutFile processor has the following settings −
Name
In the Name setting, a user can give the processor any name, either one that follows the project's conventions or one that makes the processor's purpose clearer.
Enable
A user can enable or disable the processor using this setting.
Penalty Duration
This setting lets a user specify the penalty duration, that is, how long a flowfile is held back from further processing in the event of a flowfile failure.
Yield Duration
This setting specifies the yield time for the processor. During this time, the processor is not scheduled to run again.
Bulletin Level
This setting specifies the log level at which the processor generates bulletins.
Automatically Terminate Relationships
This setting has a list of check boxes for all the available relationships of the processor. By checking a box, a user can configure the processor to terminate the flowfile on that relationship instead of sending it further in the flow.
PutFile Scheduling
The PutFile processor offers the following scheduling options −
Schedule Strategy
You can schedule the processor either on a time basis, by selecting the Timer driven option, or with a specified CRON string, by selecting the CRON driven option. There is also an experimental Event driven strategy, which triggers the processor on a specific event.
Concurrent Tasks
This option defines the number of concurrent tasks scheduled for this processor.
Execution
Using this option, a user can define whether the processor runs on all nodes or only on the primary node.
Run Schedule
It defines the time interval for the Timer driven strategy or the CRON expression for the CRON driven strategy.
PutFile Properties
The PutFile processor provides properties like Directory, which specifies the output directory for the file transfer, along with others that manage the transfer, as shown in the image below.
PutFile Comments
This section is used to add any descriptive information about the processor.