Apache Pig - Foreach Operator
  • Date: 2024-10-18


The FOREACH operator applies a transformation to every tuple of a relation and generates a new relation from the specified columns or expressions.

Syntax

The syntax of the FOREACH operator is given below.

grunt> Relation_name2 = FOREACH Relation_name1 GENERATE (required data);

Example

Assume that we have a file named student_details.txt in the HDFS directory /pig_data/ as shown below.

student_details.txt

001,Rajiv,Reddy,21,9848022337,Hyderabad
002,siddarth,Battacharya,22,9848022338,Kolkata
003,Rajesh,Khanna,22,9848022339,Delhi 
004,Preethi,Agarwal,21,9848022330,Pune 
005,Trupthi,Mohanthy,23,9848022336,Bhuwaneshwar 
006,Archana,Mishra,23,9848022335,Chennai 
007,Komal,Nayak,24,9848022334,trivendram 
008,Bharathi,Nambiayar,24,9848022333,Chennai

We have loaded this file into Pig with the relation name student_details, as shown below.

grunt> student_details = LOAD 'hdfs://localhost:9000/pig_data/student_details.txt' USING PigStorage(',')
   as (id:int, firstname:chararray, lastname:chararray, age:int, phone:chararray, city:chararray);

Let us now get the id, age, and city values of each student from the relation student_details and store them in another relation named foreach_data using the FOREACH operator, as shown below.

grunt> foreach_data = FOREACH student_details GENERATE id,age,city;
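
Fields can also be referenced by position rather than by name. As a minimal sketch, assuming the schema declared in the LOAD statement above (so that $0 is id, $3 is age, and $5 is city), the following statement should build the same projection. The relation name foreach_positional is only an illustrative choice.

```pig
-- Positional form of the same projection: $0 = id, $3 = age, $5 = city.
-- foreach_positional is a hypothetical relation name used for illustration.
grunt> foreach_positional = FOREACH student_details GENERATE $0, $3, $5;
```

Positional references are handy when a relation was loaded without a schema, though named fields are usually easier to read.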

Verification

Verify the relation foreach_data using the DUMP operator as shown below.

grunt> Dump foreach_data;

Output

It will produce the following output, displaying the contents of the relation foreach_data. The id values appear without leading zeros because id was loaded as an int.

(1,21,Hyderabad)
(2,22,Kolkata)
(3,22,Delhi)
(4,21,Pune) 
(5,23,Bhuwaneshwar)
(6,23,Chennai) 
(7,24,trivendram)
(8,24,Chennai) 
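
GENERATE is not limited to plain column names. It also accepts expressions, and each result can be named with AS. The sketch below is an illustration rather than part of the original example: it uses Pig's built-in CONCAT and UPPER functions, and the relation name student_summary and the aliases fullname, next_age, and city_upper are assumed names.

```pig
-- Build derived columns for every student tuple.
-- student_summary, fullname, next_age and city_upper are hypothetical names.
grunt> student_summary = FOREACH student_details GENERATE
   id,
   CONCAT(firstname, CONCAT(' ', lastname)) AS fullname,
   age + 1 AS next_age,
   UPPER(city) AS city_upper;

-- Inspect the schema of the new relation.
grunt> DESCRIBE student_summary;
```

Naming each expression with AS keeps the resulting schema readable, which helps when the relation is passed on to later operators.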