In my previous post of tnormalize I have shown you how to split a multi valued attributes column into individual rows in this comma separated values are split into multiple rows but in
tSplitRow it splits one row into multiple rows.
This is Student Input File
Student_ID ,Institution,Address,City,Country,Course 1,Course 2,Course 3
101,NIIT,722 Bur Oak Avenue ,Berlin,Germany,CS,EC,EE
102,AIIM,11 Collaroy 2093,Medrid,Spain,MBA,BBA,BCA
103,SRIT,223 Wellington 5011,London,UK,BSC,ME,MTECH
So
firstly create a new job from Job Designs
> Create Job.
Drag the schema of Student.csv from Metadata > File
Delimited > Studentsplitrow.csv and drop it to the design work space and
select tfileInputDelimited option from pop window.This Student.csv file is been
taken as an input file.
Or you can simply drag this component from the palette and
double click on it to open the component properties and click
[...] next to the File Name field to specify the path where you have
created your Student.csv file.
Then drag and drop the following components from the palette
into the design workspace:-tSplitRow,tLogRow.
Connect each component by right clicking and select Row >
Main.
Open the component
properties of tSplitRow here we want different record for every Courses for
every Student.So we will drag Institution_ID,Institution_Name from Student(Input-Main)
to tSplitRow_1(Output) and add Institution_Address and Institution_Courses fields as output to tSplitRow component.
In the column mapping table of tSplitRow click the (+) sign to add mapping from source to target record. If the source input file has 3 records and 3 mappings are defined in the Columns Mapping table in tSplitRow component then the output will provide 9 records.
- On the first parameter line, write row1.Institution_ID
in Institution_ID field.
Type row1.Institution_Name in the Institution_Name field and row1.Address+"
"+row1.City+" "+row1.Country in the Institution_Address field.Type
row1.Course_1 in Institution_Courses
field.
- On the second parameter line,
write row1.Institution_ID in Institution_ID field. Type row1.Institution_Name in
the Institution_Name field
and row1.Address+" "+row1.City+" "+row1.Country in
the Institution_Address field.Type row1.Course_2 in Institution_Courses field.
- On the third parameter line, write row1.Institution_ID
in Institution_ID field.
Type row1.Institution_Name in the Institution_Name field and row1.Address+"
"+row1.City+" "+row1.Country in the Institution_Address field.Type
row1.Course_3 in Institution_Courses
field.
Here in Institution_Address field I have
concatenated Address,City and Country Column together.
In the Basic settings of tLogRow components,
select Table (print values in cells of a table) to show result in
table format.
- Atlast Run the Job.
Starting job how_to_use_tSplitRow_in_Talend at 13:54 29/04/2015.
[statistics] connecting to socket on port 3988
[statistics] connected
.--------------+----------------+----------------------------------+-------------------.
| tLogRow_1 |
|=-------------+----------------+----------------------------------+------------------=|
|Institution_ID|Institution_Name|Institution_Address |Institution_Courses|
|=-------------+----------------+----------------------------------+------------------=|
|101 |NIIT |722 Bur Oak Avenue Berlin Germany|CS |
|101 |NIIT |722 Bur Oak Avenue Berlin Germany|EC |
|101 |NIIT |722 Bur Oak Avenue Berlin Germany|EE |
|102 |AIIM |11 Collaroy 2093 Medrid Spain |MBA |
|102 |AIIM |11 Collaroy 2093 Medrid Spain |BBA |
|102 |AIIM |11 Collaroy 2093 Medrid Spain |BCA |
|103 |SRIT |223 Wellington 5011 London UK |BSC |
|103 |SRIT |223 Wellington 5011 London UK |ME |
|103 |SRIT |223 Wellington 5011 London UK |MTECH |
'--------------+----------------+----------------------------------+-------------------'
[statistics] disconnected
Job how_to_use_tSplitRow_in_Talend ended at 13:54 29/04/2015. [exit code=0]
Here you can see that 3 records are split into 9 records each Institution has multiple courses.
No comments:
Post a Comment