English 中文(简体)
Teradata - JOIN Strategies
  • 时间:2024-12-22

Teradata - JOIN strategies


Previous Page Next Page  

This chapter discusses the various JOIN strategies available in Teradata.

Join Methods

Teradata uses different join methods to perform join operations. Some of the commonly used Join methods are −

    Merge Join

    Nested Join

    Product Join

Merge Join

Merge Join method takes place when the join is based on the equapty condition. Merge Join requires the joining rows to be on the same AMP. Rows are joined based on their row hash. Merge Join uses different join strategies to bring the rows to the same AMP.

Strategy #1

If the join columns are the primary indexes of the corresponding tables, then the joining rows are already on the same AMP. In this case, no distribution is required.

Consider the following Employee and Salary Tables.

CREATE SET TABLE EMPLOYEE,FALLBACK ( 
   EmployeeNo INTEGER, 
   FirstName VARCHAR(30) , 
   LastName VARCHAR(30) , 
   DOB DATE FORMAT  YYYY-MM-DD , 
   JoinedDate DATE FORMAT  YYYY-MM-DD , 
   DepartmentNo BYTEINT 
) 
UNIQUE PRIMARY INDEX ( EmployeeNo );
CREATE SET TABLE Salary ( 
   EmployeeNo INTEGER, 
   Gross INTEGER,  
   Deduction INTEGER, 
   NetPay INTEGER 
) 
UNIQUE PRIMARY INDEX(EmployeeNo); 

When these two tables are joined on EmployeeNo column, then no redistribution takes place since EmployeeNo is the primary index of both the tables which are being joined.

Strategy #2

Consider the following Employee and Department tables.

CREATE SET TABLE EMPLOYEE,FALLBACK (
   EmployeeNo INTEGER, 
   FirstName VARCHAR(30) , 
   LastName VARCHAR(30) , 
   DOB DATE FORMAT  YYYY-MM-DD , 
   JoinedDate DATE FORMAT  YYYY-MM-DD , 
   DepartmentNo BYTEINT 
) 
UNIQUE PRIMARY INDEX ( EmployeeNo );
CREATE SET TABLE DEPARTMENT,FALLBACK ( 
   DepartmentNo BYTEINT, 
   DepartmentName CHAR(15) 
) 
UNIQUE PRIMARY INDEX ( DepartmentNo );

If these two tables are joined on DeparmentNo column, then the rows need to be redistributed since DepartmentNo is a primary index in one table and non-primary index in another table. In this scenario, joining rows may not be on the same AMP. In such case, Teradata may redistribute employee table on DepartmentNo column.

Strategy #3

For the above Employee and Department tables, Teradata may duppcate the Department table on all AMPs, if the size of Department table is small.

Nested Join

Nested Join doesn’t use all AMPs. For the Nested Join to take place, one of the condition should be equapty on the unique primary index of one table and then joining this column to any index on the other table.

In this scenario, the system will fetch the one row using Unique Primary index of one table and use that row hash to fetch the matching records from other table. Nested join is the most efficient of all Join methods.

Product Join

Product Join compares each quapfying row from one table with each quapfying row from other table. Product join may take place due to some of the following factors −

    Where condition is missing.

    Join condition is not based on equapty condition.

    Table apases is not correct.

    Multiple join conditions.

Advertisements