Apache Flume - NetCat Source
This chapter uses an example to show how you can generate events and log them to the console. For this, we use the NetCat source and the logger sink.
Prerequisites
To run the example provided in this chapter, you need to install Flume.
Configuring Flume
We have to configure the source, the channel, and the sink using the configuration file in the conf folder. The example given in this chapter uses a NetCat Source, Memory channel, and a logger sink.
NetCat Source
While configuring the NetCat source, we have to specify a port. The NetCat source then listens on that port, receives each line of text sent to it as an individual event, and transfers it to the sink through the specified channel.
While configuring this source, you have to provide values to the following properties −
channels − The channel(s) to which the source writes its events.
type − The source type, which must be netcat.
bind − Host name or IP address to bind.
port − Port number on which we want the source to listen.
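The "one line, one event" behaviour described above can be pictured with a small toy model. This is only a sketch in plain Python, not Flume's implementation: the names LineEventHandler and make_server are hypothetical, the list server.events stands in for the channel, and the "OK" reply mirrors the acknowledgement this chapter describes later.

```python
import socketserver


class LineEventHandler(socketserver.StreamRequestHandler):
    """Toy stand-in for a line-oriented source; not Flume code."""

    def handle(self):
        # rfile yields one newline-terminated line at a time.
        for raw in self.rfile:
            body = raw.rstrip(b"\r\n")
            # Each line becomes one "event"; the list plays the channel's role.
            self.server.events.append(body)
            # Acknowledge the line, as the chapter describes for the real source.
            self.wfile.write(b"OK\n")


def make_server(host="127.0.0.1", port=0):
    # port=0 asks the OS for any free port; the chapter's example uses 56565.
    server = socketserver.ThreadingTCPServer((host, port), LineEventHandler)
    server.daemon_threads = True  # do not block shutdown on open connections
    server.events = []
    return server
```

Calling make_server().serve_forever() in a thread and connecting with any TCP client shows one entry appended to events per line sent.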
Channel
We are using the memory channel. To configure the memory channel, you must provide a value for the type of the channel. Given below is the list of properties that you can supply while configuring the memory channel −
type − It holds the type of the channel. In our example, the type is memory (MemChannel is only the name we give to the channel).
capacity − It is the maximum number of events stored in the channel. Its default value is 100. (optional)
transactionCapacity − It is the maximum number of events the channel accepts from a source or gives to a sink in a single transaction. Its default value is 100. (optional)
Logger Sink
This sink logs all the events passed to it. Generally, it is used for testing or debugging purposes. To configure this sink, you must provide the following details.
channel − The channel from which the sink reads events.
type − logger
Example Configuration File
Given below is an example of the configuration file. Copy this content and save as netcat.conf in the conf folder of Flume.
# Naming the components on the current agent
NetcatAgent.sources = Netcat
NetcatAgent.channels = MemChannel
NetcatAgent.sinks = LoggerSink

# Describing/Configuring the source
NetcatAgent.sources.Netcat.type = netcat
NetcatAgent.sources.Netcat.bind = localhost
NetcatAgent.sources.Netcat.port = 56565

# Describing/Configuring the sink
NetcatAgent.sinks.LoggerSink.type = logger

# Describing/Configuring the channel
NetcatAgent.channels.MemChannel.type = memory
NetcatAgent.channels.MemChannel.capacity = 1000
NetcatAgent.channels.MemChannel.transactionCapacity = 100

# Bind the source and sink to the channel
NetcatAgent.sources.Netcat.channels = MemChannel
NetcatAgent.sinks.LoggerSink.channel = MemChannel
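Because a mistyped channel name is an easy mistake in a file like this, it can help to check the wiring before starting the agent. The following is a minimal sketch in plain Python, not a Flume tool: parse_properties and check_wiring are hypothetical helper names, the parser handles only simple "key = value" lines, and the capacity check assumes that a memory channel's transactionCapacity should not exceed its capacity.

```python
NETCAT_CONF = """\
NetcatAgent.sources = Netcat
NetcatAgent.channels = MemChannel
NetcatAgent.sinks = LoggerSink
NetcatAgent.sources.Netcat.type = netcat
NetcatAgent.sources.Netcat.bind = localhost
NetcatAgent.sources.Netcat.port = 56565
NetcatAgent.sinks.LoggerSink.type = logger
NetcatAgent.channels.MemChannel.type = memory
NetcatAgent.channels.MemChannel.capacity = 1000
NetcatAgent.channels.MemChannel.transactionCapacity = 100
NetcatAgent.sources.Netcat.channels = MemChannel
NetcatAgent.sinks.LoggerSink.channel = MemChannel
"""


def parse_properties(text):
    """Parse simple 'key = value' lines, skipping blanks and # comments."""
    props = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        key, _, value = line.partition("=")
        props[key.strip()] = value.strip()
    return props


def check_wiring(props, agent):
    """Return a list of problems in one agent's source/sink/channel wiring."""
    problems = []
    channels = set(props.get(f"{agent}.channels", "").split())
    # A source may write to several channels (space-separated names).
    for source in props.get(f"{agent}.sources", "").split():
        used = props.get(f"{agent}.sources.{source}.channels", "").split()
        if not used:
            problems.append(f"source {source}: no channels configured")
        problems += [f"source {source}: unknown channel {c}"
                     for c in used if c not in channels]
    # A sink reads from exactly one channel.
    for sink in props.get(f"{agent}.sinks", "").split():
        used = props.get(f"{agent}.sinks.{sink}.channel", "")
        if used not in channels:
            problems.append(f"sink {sink}: unknown or missing channel {used!r}")
    # Assumption: for memory channels, transactionCapacity <= capacity.
    for ch in sorted(channels):
        prefix = f"{agent}.channels.{ch}"
        if props.get(f"{prefix}.type") == "memory":
            capacity = int(props.get(f"{prefix}.capacity", 100))
            tx = int(props.get(f"{prefix}.transactionCapacity", 100))
            if tx > capacity:
                problems.append(f"channel {ch}: transactionCapacity > capacity")
    return problems
```

For the example file, check_wiring(parse_properties(NETCAT_CONF), "NetcatAgent") returns an empty list; pointing the sink at a channel name that was never declared produces one problem entry.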
Execution
Change to the Flume home directory and execute the application as shown below.
$ cd $FLUME_HOME
$ ./bin/flume-ng agent --conf $FLUME_CONF --conf-file $FLUME_CONF/netcat.conf --name NetcatAgent -Dflume.root.logger=INFO,console
If everything goes fine, the source starts listening on the given port, which in this case is 56565. Given below is the snapshot of the command prompt window of a NetCat source that has started and is listening on port 56565.
Passing Data to the Source
To pass data to the NetCat source, you have to connect to the port given in the configuration file. Open a separate terminal and connect to the source (port 56565) using the curl command. When the connection is successful, you will get a message “connected” as shown below.
$ curl telnet://localhost:56565
connected
Now you can enter your data line by line (after each line, you have to press Enter). The NetCat source receives each line as an individual event, and you will get the message “OK” in reply.
When you are done passing data, you can exit the console by pressing Ctrl+C. Given below is the snapshot of the console where we have connected to the source using the curl command.
Each line that is entered in the above console is received as an individual event by the source. Since we have used the logger sink, these events are logged to the console of the Flume agent (the source console) through the specified channel (the memory channel in this case).
The following snapshot shows the NetCat console where the events are logged.
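The interactive curl session can also be scripted. The sketch below is one possible way to do it with Python's standard socket module. The function name send_lines is hypothetical, the default host and port are taken from the example configuration, and the code assumes the source answers every line with a single reply line ("OK"), as described in this chapter.

```python
import socket


def send_lines(lines, host="localhost", port=56565, timeout=5.0):
    """Send each string as one newline-terminated line; return the replies."""
    replies = []
    with socket.create_connection((host, port), timeout=timeout) as sock:
        # A buffered file wrapper makes line-by-line reads and writes easy.
        with sock.makefile("rwb") as stream:
            for line in lines:
                stream.write(line.encode("utf-8") + b"\n")
                stream.flush()
                # Assumption: exactly one reply line arrives per line sent.
                replies.append(stream.readline().decode("utf-8").strip())
    return replies


# Example (requires the NetcatAgent from this chapter to be running):
# send_lines(["first event", "second event"])
```

Each string passed in should then appear as a separate event in the console of the Flume agent, just as with lines typed by hand.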