Apache Pig - Union Operator
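The UNION operator of Pig Latin merges the contents of two or more relations. For the merged relation to have a well-defined schema, the input relations should have the same columns with the same data types. UNION does not remove duplicate tuples and does not preserve the order of tuples.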
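As a rough mental model, UNION behaves like concatenating bags of tuples. The sketch below is plain Python, not Pig itself; the `union` helper and the shortened sample rows (including the repeated row) are made up for illustration.

```python
from collections import Counter


def union(*relations):
    """Illustrative model of UNION: concatenate bags of tuples, keeping duplicates."""
    merged = []
    for relation in relations:
        merged.extend(relation)
    return merged


# Hypothetical, shortened rows; the second relation repeats one tuple on purpose.
student1 = [(1, "Rajiv", "Hyderabad"), (2, "siddarth", "Kolkata")]
student2 = [(7, "Komal", "trivendram"), (2, "siddarth", "Kolkata")]

student = union(student1, student2)

# A bag has no defined order, so compare contents as a multiset.
counts = Counter(student)
print(len(student))                        # 4
print(counts[(2, "siddarth", "Kolkata")])  # 2, the duplicate is kept
```

In this model the repeated tuple appears twice in the result, so removing duplicates has to be a separate step.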
Syntax
Given below is the syntax of the UNION operator.
grunt> Relation_name3 = UNION Relation_name1, Relation_name2;
Example
Assume that we have two files namely student_data1.txt and student_data2.txt in the /pig_data/ directory of HDFS as shown below.
student_data1.txt
001,Rajiv,Reddy,9848022337,Hyderabad
002,siddarth,Battacharya,9848022338,Kolkata
003,Rajesh,Khanna,9848022339,Delhi
004,Preethi,Agarwal,9848022330,Pune
005,Trupthi,Mohanthy,9848022336,Bhuwaneshwar
006,Archana,Mishra,9848022335,Chennai
student_data2.txt
7,Komal,Nayak,9848022334,trivendram
8,Bharathi,Nambiayar,9848022333,Chennai
And we have loaded these two files into Pig with the relations student1 and student2 as shown below.
grunt> student1 = LOAD 'hdfs://localhost:9000/pig_data/student_data1.txt' USING PigStorage(',') as (id:int, firstname:chararray, lastname:chararray, phone:chararray, city:chararray);
grunt> student2 = LOAD 'hdfs://localhost:9000/pig_data/student_data2.txt' USING PigStorage(',') as (id:int, firstname:chararray, lastname:chararray, phone:chararray, city:chararray);
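The schema in the LOAD statements declares id as int and the remaining fields as chararray, which is why an id written as 001 in the file appears as 1 once loaded. The following plain-Python sketch mimics that split-and-cast step under simplifying assumptions; `parse_line` is a hypothetical helper, not part of Pig, and it ignores any error handling that Pig performs.

```python
def parse_line(line, delimiter=","):
    """Split one text line on the delimiter and cast fields per the example schema."""
    student_id, firstname, lastname, phone, city = line.strip().split(delimiter)
    # id is declared as int; the other fields stay as strings (chararray).
    return (int(student_id), firstname, lastname, phone, city)


lines = [
    "001,Rajiv,Reddy,9848022337,Hyderabad",
    "7,Komal,Nayak,9848022334,trivendram",
]
rows = [parse_line(line) for line in lines]
print(rows[0])  # (1, 'Rajiv', 'Reddy', '9848022337', 'Hyderabad')
```

The phone number stays a string because the schema declares it as chararray, so its digits are preserved exactly as written in the file.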
Let us now merge the contents of these two relations using the UNION operator as shown below.
grunt> student = UNION student1, student2;
Verification
Verify the relation student using the DUMP operator as shown below.
grunt> Dump student;
Output
The command displays the contents of the relation student as follows.
(1,Rajiv,Reddy,9848022337,Hyderabad)
(2,siddarth,Battacharya,9848022338,Kolkata)
(3,Rajesh,Khanna,9848022339,Delhi)
(4,Preethi,Agarwal,9848022330,Pune)
(5,Trupthi,Mohanthy,9848022336,Bhuwaneshwar)
(6,Archana,Mishra,9848022335,Chennai)
(7,Komal,Nayak,9848022334,trivendram)
(8,Bharathi,Nambiayar,9848022333,Chennai)