Teradata Basics
- Teradata - SubQueries
- Teradata - Joins
- Teradata - Primary Index
- Teradata - CASE & COALESCE
- Teradata - Aggregate Functions
- Teradata - Built-in Functions
- Teradata - Date/Time Functions
- Teradata - String Manipulation
- Teradata - SET Operators
- Logical & Conditional Operators
- Teradata - SELECT Statement
- Teradata - Data Manipulation
- Teradata - Tables
- Teradata - Data Types
- Teradata - Relational Concepts
- Teradata - Architecture
- Teradata - Installation
- Teradata - Introduction
Teradata Advanced
- Teradata - BTEQ
- Teradata - FastExport
- Teradata - MultiLoad
- Teradata - FastLoad
- Teradata - Performance Tuning
- Teradata - User Management
- Teradata - Data Protection
- Teradata - OLAP Functions
- Teradata - Partitioned Primary Index
- Teradata - JOIN Strategies
- Teradata - Stored Procedure
- Teradata - Macros
- Teradata - Views
- Teradata - Join Index
- Teradata - Hashing Algorithm
- Teradata - Explain
- Teradata - Compression
- Teradata - Statistics
- Teradata - Secondary Index
- Teradata - Space Concepts
- Teradata - Table Types
Teradata Useful Resources
- Teradata - Discussion
- Teradata - Useful Resources
- Teradata - Quick Guide
- Teradata - Questions & Answers
Selected Reading
- Who is Who
- Computer Glossary
- HR Interview Questions
- Effective Resume Writing
- Questions and Answers
- UPSC IAS Exams Notes
Teradata - JOIN strategies
This chapter discusses the various JOIN strategies available in Teradata.
Join Methods
Teradata uses different join methods to perform join operations. Some of the commonly used Join methods are −
Merge Join
Nested Join
Product Join
Merge Join
Merge Join method takes place when the join is based on the equapty condition. Merge Join requires the joining rows to be on the same AMP. Rows are joined based on their row hash. Merge Join uses different join strategies to bring the rows to the same AMP.
Strategy #1
If the join columns are the primary indexes of the corresponding tables, then the joining rows are already on the same AMP. In this case, no distribution is required.
Consider the following Employee and Salary Tables.
CREATE SET TABLE EMPLOYEE,FALLBACK ( EmployeeNo INTEGER, FirstName VARCHAR(30) , LastName VARCHAR(30) , DOB DATE FORMAT YYYY-MM-DD , JoinedDate DATE FORMAT YYYY-MM-DD , DepartmentNo BYTEINT ) UNIQUE PRIMARY INDEX ( EmployeeNo );
CREATE SET TABLE Salary ( EmployeeNo INTEGER, Gross INTEGER, Deduction INTEGER, NetPay INTEGER ) UNIQUE PRIMARY INDEX(EmployeeNo);
When these two tables are joined on EmployeeNo column, then no redistribution takes place since EmployeeNo is the primary index of both the tables which are being joined.
Strategy #2
Consider the following Employee and Department tables.
CREATE SET TABLE EMPLOYEE,FALLBACK ( EmployeeNo INTEGER, FirstName VARCHAR(30) , LastName VARCHAR(30) , DOB DATE FORMAT YYYY-MM-DD , JoinedDate DATE FORMAT YYYY-MM-DD , DepartmentNo BYTEINT ) UNIQUE PRIMARY INDEX ( EmployeeNo );
CREATE SET TABLE DEPARTMENT,FALLBACK ( DepartmentNo BYTEINT, DepartmentName CHAR(15) ) UNIQUE PRIMARY INDEX ( DepartmentNo );
If these two tables are joined on DeparmentNo column, then the rows need to be redistributed since DepartmentNo is a primary index in one table and non-primary index in another table. In this scenario, joining rows may not be on the same AMP. In such case, Teradata may redistribute employee table on DepartmentNo column.
Strategy #3
For the above Employee and Department tables, Teradata may duppcate the Department table on all AMPs, if the size of Department table is small.
Nested Join
Nested Join doesn’t use all AMPs. For the Nested Join to take place, one of the condition should be equapty on the unique primary index of one table and then joining this column to any index on the other table.
In this scenario, the system will fetch the one row using Unique Primary index of one table and use that row hash to fetch the matching records from other table. Nested join is the most efficient of all Join methods.
Product Join
Product Join compares each quapfying row from one table with each quapfying row from other table. Product join may take place due to some of the following factors −
Where condition is missing.
Join condition is not based on equapty condition.
Table apases is not correct.
Multiple join conditions.