Monday, March 7, 2011

Merge Join Transformation

- It allows us to do normal join (INNER/LEFT/FULL) between 2 SORTED inputs
- input should be sorted first then only it will allow us to do joining
- we can do INNER or LEFT or FULL join between 2 inputs
- RIGHT join can be achieved by swapping the positions of 2 inputs in LEFT Join


EXAMPLE

Pre-requisite

Following script has to be executed

CREATE TABLE MergeJoinDemo1 (MID INT, MName VARCHAR(10))
CREATE TABLE MergeJoinDemo2 (MID INT, City VARCHAR(10))
CREATE TABLE MergeJoinDemoOutput (MID INT, MName VARCHAR(10), City VARCHAR(10))

INSERT INTO MergeJoinDemo1 (MID,MName) VALUES (1,'Nisarg'),(2,'Megha'),(3,'Swara')
INSERT INTO MergeJoinDemo2 (MID,City) VALUES (1,'Ahd'),(2,'Hyd'),(3,'Bom')



Steps

1. Add Data Flow Task in Control Flow
2. Drag 2 sources in DFT and they should point to MergeJoinDemo1 and MergeJoinDemo2 respectively
3. Drag Sort Transformation and connect it with Source1 and specify MID as sorted column
4. Drag Sort Transformation and connect it with Source2 and specify MID as sorted column
5. Drag Merge Join transformation and connect it with Sort1 and select Merge Join Left Input
6. Connect Merge Join transformation with Sort2.
7. Open  Merge Join transformation and select Inner Join, specify Join Key and select columns
8. We have selected only 3 columns so only 3 columns available in output
9. We can choose join from dropdown
10. "Swap Inputs" button will be enabled only in case of Left Join.
TableA LEFT JOIN TableB ----- same as ------- TableB RIGHT JOIN TableA
11. Whole structure will look like this.

1 comment: