Monday, March 7, 2011

Aggregate Transformation & Multicast Transformation


Aggregate Transformation

- It is same as GROUP BY in SQL
- Using this Transformation, we can get COUNT(*), COUNT DISTINCT, MAX, MIN, AVE, SUM
- Column can be part of GROUP BY / AGGREGATION Function

Multicast Transformation

- It distributes input into more than one output
- Unlike Conditional Split, it is not splitting the output, instead it is sending the all the records to all output branches

EXAMPLE

Pre-requisite

Following script has to be executed

CREATE TABLE AggregateDemo (ADID INT IDENTITY(1,1), StudentID INT, Subject varchar(10), Mark int)
GO
insert into AggregateDemo (StudentID,Subject,Mark)
values (1,'Maths',100),
(1,'Science',100),
(1,'English',90),
(2,'Maths',99),
(2,'Science',99),
(2,'English',95),
(3,'Maths',95),
(3,'Science',100),
(3,'English',98)
GO
CREATE TABLE AggregateDemo_TotalMarkPerStudent (StudentID INT, TotalMark int)
GO
CREATE TABLE AggregateDemo_MaxMarkPerSubject (Subject varchar(10), MaxMark int)
GO


Steps


1. Add Data Flow Task in Control Flow
2. Drag source in DFT and it should point to AggregateDemo
3. Add Multicast Transformation and connect it with source
4. Add 2 Aggregate Transformations and connect them with Multicast Transformation
5. In 1st Aggregate, do following settings.


6. In 2nd Aggregate, do following settings.


7. Add 2 destinations and connect them with 2 aggregate transformations. These 2 dimensions should point to  AggregateDemo_TotalMarkPerStudent & AggregateDemo_MaximumMarkPerSubject respectively.

8. Overall package should look like this.

9. Execute the package. It will look like this.
10. Here you can see that Multicast is transforming all 9 rows in 2 output branches


No comments:

Post a Comment