Monday, June 22, 2015

How to convert rows to columns using tPivotToColumnDelimited component in Talend !!

In this post I will show you how to convert rows to columns using tPivotToColumnsDelimited . 
It requires at least three columns in the input schema: the Pivot column, the Aggregation column and one or more Group keys.

This is Institution Input File:-

Intitution_ID;Intitution_Name;Intitution_Address;Intitution_Course;Institution_CourseName
101;IT;722 Bur Oak Avenue  Berlin Germany;Course1;CS
101;IT;722 Bur Oak Avenue  Berlin Germany;Course2;EE
101;IT;722 Bur Oak Avenue  Berlin Germany;Course3;EC
102;AIIM;11 Collaroy  2093 Medrid Spain;Course1;BCOM
102;AIIM;11 Collaroy  2093 Medrid Spain;Course2;BBA
102;AIIM;11 Collaroy  2093 Medrid Spain;Course3;BCA          
103;SRIT;223 Wellington  5011 London UK;Course1;BSC
103;SRIT;223 Wellington  5011 London UK;Course2;ME
103;SRIT;223 Wellington  5011 London UK;Course3;MTECH

Drag and drop the components from the palette and connect each of them as shown in the blow screenshot.

Configurations setting of the tPivotToColumnsDelimited  component :
Pivot Column =”Type”
Aggregation column=”Value”
Aggregation Function =”last”
Group by “ID” and “Name” column.
Rest of the configuration is for output file, where our output will be transferred. to read output file we can use either delimited component but for quick review I`ll use tFileInputFullRow.
Add tFileInputFullRow below the tFixedFlowInput component and connect with “On Sub Job Ok” trigger. and provide previously created file path and rest of the details.
add tLogRow and connect to tFileInputFullRow component and execute the job you will get above out put on console.
Final Job Design.


Open the component properties of tPivotToColumnDelimited.

Pivot Column – In order to convert rows to columns, we need to identify a one column which need to be converted to multiple columns based on Aggregate column. 
In this we want to convert Institution_Course to multiple column so select Institution_Course in Pivot column field.

Aggregate column  - Aggregation column is the column from source data on which aggregation is to be applied with specific function.
Aggregation column is Institution_CourseName here we want aggregate the course name.

Aggregation function – which type of aggregation is to be applied on input data. If no aggregation function is applied, then you can select “First”. Aggregation functions available are Sum, Count, Min, Max, First, Last. Select the Aggregation function first.

Group by – You need to provide a group by column name, this is the column based on which pivot columns are created. Select Institution_ID, Institution_Name, Institution_Address.These columns we want to be grouped by pivot column.

Give the File Name path where you want to store the data.

Open the component properties of tFileInputFullRow.
Provide the File Name path where you want to store the data.

Click on Edit schema button this schema will be shown as below.


In tLogRow component select Table(print value in cells of table) option so that result will appear in table format.

Run the job you will see all the Institution_CourseName are grouped together with Institution_ID, Institution_Name, Institution_Address.
We have 9 rows in input and 3 rows as output .

Starting job how_to_tpivottocolumn at 15:16 21/05/2015.

[statistics] connecting to socket on port 4080
[statistics] connected
.--------------------------------------------------------------------------------------.
|                            tLogRow_1                                                                    |
|=-----------------------------------------------------------------------------------=|
|line                                                                                                             |
|=-----------------------------------------------------------------------------------=|
|101;IT;722 Bur Oak Avenue  Berlin Germany;CS;EE;EC                       |
|102;AIIM;11 Collaroy  2093 Medrid Spain;BCOM  ;BBA;BCA            |
|103;SRIT;223 Wellington  5011 London UK;BSC ;ME;MTECH           |
'--------------------------------------------------------------------------------------'
[statistics] disconnected
Job how_to_tpivottocolumn ended at 15:16 21/05/2015. [exit code=0]

1 comment:

  1. It's really excellent blog, I just share your blog because it's really nice. Just look at this msbi online training Bangalore

    ReplyDelete