Monday, April 27, 2015

How to use the tSortRow component in Talend

This tutorial explains how to sort data, write it into a temporary file, and replace the source file with the temporary file.
The Job comprises two subjobs:

- sorting the data into a temporary file,
- replacing the source file with that temporary file.


The tSortRow component sorts input data on one or more columns, by sort type and order.

This is the Suppliers.csv input file:
SupplierID,SupplierName,ContactName,Address,City,PostalCode,Country
1,Debra,Roberts,1035 Gales Ave   Winston Salem Winston Salem 27103-4579,London,8605932890,UK
2,Elliot,BOLICK,4402 w avenida del sol   Glendale Glendale 85310-3915,California,2162256986,USA
3,Michael,Thompson,1869 GRAND VIEW DR   OAKLAND OAKLAND 94618-2339,Texas,5414000606,USA
4,JOE,Salamida,23731 Howard St. PO Box 333  Covelo Covelo 95428-0333,Tokyo,9198488887,Japan
5,Stephen,Schofield,6750 Beecher Road   Clayton Lenawee 49235-9655,Madrid,6317038390,Spain
6,Mark,chung,3470 Tilden St   PHILADELPHIA PHILADELPHIA 19129-1435,Saitama,4083996114,Japan
7,James,B Brown,3250 NE 1st Ave Apt. 918   MIami MIami 33137-4097,Texas,3128131305,USA
8,Albert,Linden,9000 Bay Hill Blvd Suite 300  Orlando Orlando 32819-4880,Birmingham,2252528310,UK
9,Carey,Liu,Roberts Communications 64 Commercial Street  Rochester Rochester 14614-1010,Sao Paulo,9703192376,Brazil
10,Enrique,Cohen,1026 County Rd. 112   Carbondale Carbondale 81623-9642,Sao Paulo,6578644245,Brazil

First, create a new Job from Job Designs > Create Job.
Drag the Suppliers.csv schema from Metadata > File Delimited > Suppliers, drop it onto the design workspace, and select the tFileInputDelimited option from the pop-up window. Suppliers.csv is the input file. Alternatively, drag the tFileInputDelimited component from the Palette, double-click it to open its component properties, and click [...] next to the File Name field to specify the path to your Suppliers.csv file.
Then drag and drop the following components from the Palette onto the design workspace: tSortRow and tFileOutputDelimited.

Connect the components in sequence: right-click a component, select Row > Main, and drag the link to the next component.



Then double-click tSortRow to open its Component view. In Basic settings, click the (+) button under the Criteria table to add a line. Select the column to sort on, the sort type (num or alpha), and the order (asc or desc), according to your requirements. In this example, the criterion is SupplierName, alpha, asc.
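
To make the difference between the sort types concrete, here is a minimal Java sketch. It is not the code Talend generates: the class SortCriteriaSketch and its methods alpha, num, and sort are hypothetical names used only for illustration, and the sketch assumes that alpha means a plain, case-sensitive string comparison and that num means comparing the values as whole numbers.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

// Hypothetical illustration class; not code generated by Talend.
class SortCriteriaSketch {

    // "alpha" sort type (assumption): compare one column as case-sensitive strings.
    static Comparator<String[]> alpha(int column) {
        return Comparator.comparing((String[] row) -> row[column]);
    }

    // "num" sort type (assumption): compare one column as whole numbers.
    static Comparator<String[]> num(int column) {
        return Comparator.comparingLong((String[] row) -> Long.parseLong(row[column]));
    }

    // Returns a sorted copy of the rows; "desc" is modelled by reversing the comparator.
    static List<String[]> sort(List<String[]> rows, Comparator<String[]> criterion, boolean ascending) {
        List<String[]> sorted = new ArrayList<>(rows);
        sorted.sort(ascending ? criterion : criterion.reversed());
        return sorted;
    }

    public static void main(String[] args) {
        // SupplierID and SupplierName taken from the first four sample rows.
        List<String[]> rows = Arrays.asList(
            new String[] {"1", "Debra"},
            new String[] {"2", "Elliot"},
            new String[] {"3", "Michael"},
            new String[] {"4", "JOE"});

        // Same criterion as in the tutorial: SupplierName, alpha, asc.
        for (String[] row : sort(rows, alpha(1), true)) {
            System.out.println(String.join(",", row));
        }
    }
}
```

Under these assumptions, the choice of sort type matters for numeric columns: sorting the values 9, 10, 2 with alpha in ascending order gives 10, 2, 9, whereas num gives 2, 9, 10.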



Now double-click tFileOutputDelimited and, in the File Name field, specify the same folder as for the Suppliers.csv file, but name the file output_sortrow.csv.

Check the Include Header box to write the column names to the output file.
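
For readers who want to see what this first subjob amounts to outside the Studio, the following is a rough Java sketch under stated assumptions. The class SortFileSketch and the method sortByColumn are hypothetical, the first line of the file is assumed to be the header, and fields are split on plain commas with no quoting, which is enough for the sample data above.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

// Hypothetical illustration class; not code generated by Talend.
class SortFileSketch {

    // Reads a comma-delimited file, sorts the data rows on one column
    // (string comparison, ascending) and writes header plus sorted rows.
    static void sortByColumn(Path input, Path output, int column) throws IOException {
        List<String> lines = new ArrayList<>(Files.readAllLines(input));
        lines.removeIf(String::isEmpty); // ignore blank lines

        List<String> result = new ArrayList<>();
        if (!lines.isEmpty()) {
            // Assumption: the first line holds the column names ("Include Header").
            result.add(lines.get(0));
            List<String> data = new ArrayList<>(lines.subList(1, lines.size()));
            // Naive split: assumes no quoted fields that contain commas.
            data.sort(Comparator.comparing((String line) -> line.split(",", -1)[column]));
            result.addAll(data);
        }
        Files.write(output, result);
    }

    public static void main(String[] args) throws IOException {
        // Column 1 is SupplierName in the sample file; paths are relative to the working directory.
        sortByColumn(Paths.get("Suppliers.csv"), Paths.get("output_sortrow.csv"), 1);
    }
}
```

The sketch loads the whole file into memory, which is fine for a ten-row sample but is a simplification compared with a real Job.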

Run the Job. You will see that it has created a new file named output_sortrow.csv containing the sorted data. However, the purpose of the Job was to sort the source file, not to create a new one.

The next step therefore replaces the source file with the new one.

Drag and drop a tFileCopy component from the Palette onto the design workspace.
To connect it, right-click tFileInputDelimited, select Trigger > OnSubjobOk from the menu, and drag a line to tFileCopy.



Open the component properties of tFileCopy. To copy the output_sortrow.csv file containing the sorted data, specify its path in the File Name field.
In the Destination directory field, specify the folder to copy the file to; here, that is the folder containing the Suppliers.csv source file.

To replace the source file with the sorted file, check the Rename box and enter “Suppliers.csv”.
To delete the temporary file, check the Remove source file box. Also check the Replace existing file and Create the directory if it doesn't exist boxes.
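
The combined effect of these tFileCopy options can be approximated in plain Java as below. This is only a sketch of the behaviour described above, not Talend's implementation; the class ReplaceFileSketch and the method replace are hypothetical names.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.nio.file.StandardCopyOption;

// Hypothetical illustration class; not code generated by Talend.
class ReplaceFileSketch {

    // Copies sortedFile into destinationDir under newName, then deletes sortedFile.
    static Path replace(Path sortedFile, Path destinationDir, String newName) throws IOException {
        // "Create the directory if it doesn't exist"
        Files.createDirectories(destinationDir);
        // "Rename": the copy gets the name of the original source file.
        Path target = destinationDir.resolve(newName);
        // "Replace existing file"
        Files.copy(sortedFile, target, StandardCopyOption.REPLACE_EXISTING);
        // "Remove source file": delete the temporary sorted file.
        Files.delete(sortedFile);
        return target;
    }

    public static void main(String[] args) throws IOException {
        // Assumes both files live in the current working directory.
        replace(Paths.get("output_sortrow.csv"), Paths.get("."), "Suppliers.csv");
    }
}
```

Copying and then deleting mirrors the individual checkboxes; a single Files.move call with REPLACE_EXISTING would achieve the same end result.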


Finally, run the Job. You will see that the source file has been replaced with the temporary file.

