XLS 2 XML

From OpenBlueLab

Jump to: navigation, search
Home Official WebSite Download Forum

Contents

Files Migration : xls to xml

Presentation

To realize CRM, some of softwares generate excel format files (.xls) which contain all informations of this CRM. In order to import all data to OpenBluelab and use them, it's necessary to realize a migration in xml format links to already present files in eXist data base. We will use for that the freeware Talend Open Studio.

Problem

To illustrate this problem we are going to take a data example : society.

--> Only one xls file contains many data, we have several societies in only one file.

--> In eXist data base using by OpenBlueLab, each xml file corresponds to only one society.

In a first time, we will have to convert format xls data to format xml data, to get only one xml file.

In a second time, xml file obtains will be divided as many files xml of wanted forms.


Image:Schema process.jpg

Differents steps

  • Treats pre-conversion of xls files
  • Conversion xls to xml with Talend
  • Data integration with XQuery

Treats pre-conversion of xls files

Introduction

  • An xls file is composed of rows and columns. A row contains object data then a column represents a data type like a lastname,firstname,address,etc... field.
  • An xml file is composed of tags forming a tree structure, the data are presented in a hierarchical way.

The xls file containing data have to be well arranged in order to be convert to xml format.

  • To be in the well column
  • To get a format recognized by OpenBlueLab in nomenclatures cases.

Create or adapt a classes diagram

Of course, data you will export must correspond to a model in OpenBlueLab.

  • If this model doesn't exist, you have to create it.
  • If this model exists, you have to adapt it with what you need.

Compare attributes

First of all, you must compare generated data by OpenBlueLab class diagramm and data contains in xls file.

With OpenBlueLab, generate a new form with all fields. When you save with 'OK' button, it docid appears. You have to make a search in eXist data base to see the complete structure of xml tags in the generated document for this form. Then, locate columns of xls file to associate with xml tags.

Conversion xls to xml with Talend

Introduction

Once xls and xml files were harmonized, we are going to use Talend to recover the xml file which contains all data in the xls file.
You can download Talend Open Studio here.

Create a new job

  • In first time open Talend, then do a right click in Repository tab on Job Designs, choose Create job.


Image:Talend1.jpg


  • Give a name for your project:


Image:Talend2.jpg


  • Your project has just be created, you can see it in Job Designs tree structure:


Image:Talend3.jpg


  • The project is opened in empty page, which corresponds of graphical tool:


Image:Talend4.jpg

Insert pallets data

  • In this example we want convert an XLS file to XML file. So we need an input XLS file, moreover we have to determine an output path for this XML file.
  • On Talend, in right part of your screen there is the 'Palette' window, which contains all graphicals elements you need for the conversion.


Image:Talend5.jpg


  • Deploy tree view File/Input and select the tFileInputExcel component (which corresponds to XLS format)


Image:Talend6.jpg


  • Then, click one on component in the Pallets and click on your editor page to drop the component:


Image:Talend7.jpg


  • Reiterate operation for output XML file deploying File/Output in the Pallets, then select tFileOutputXML component:


Image:Talend8.jpg


  • You can see an error icon on components:


Image:Talend9.jpg


Now, you are able to insert graphicals components, we will see how to use it.

Manage of graphics components with files

Define input file
  • Click on tFileInputExcel_1 component. You will see appear differents tabs in bottom of the screen:

Image:Talend10.jpg


  • That interest us for moment is to select our XLS file in input for that, we will use File Name navigator, we will choose to place files in directory workspace of Talend:


Image:Talend11.jpg


Image:Talend12.jpg


Image:Talend13.jpg


Now, the xls file is linked with the graphical component.

Specify columns names
  • We will define columns name of XLS file in order to well-named XML tags, in other words to get attributes name's class on OpenBlueLab to integrate it into data base.


  • For that we are going to use Edit schema window:


Image:Talend14.jpg


  • You see this window appears:


Image:Talend15.jpg


  • You have to click on green cross to add columns:


Image:Talend16.jpg



  • We go to rename columns with diagram's attributes, the order is determined by the order of columns in XLS file. We don't have to take the order of attributes which appear on OpenBlueLab because the XML tags order on the same level doesn't have any importance.


Image:Talend17.jpg

Data integration with XQuery

Introduction