Talend Concise Tutorial
Talend - Job Design
A job design is the technical implementation, and graphical representation, of the business model. In this design, one or more components are connected to each other to run a data integration process. When you drag and drop components in the design pane and connect them with connectors, the job design converts everything to code and creates a complete runnable program that forms the data flow.
Creating a Job
In the Repository window, right-click Job Design and click Create Job.

Provide the name, purpose, and description of the job, then click Finish.

The new job now appears under Job Design.

Now, let us use this job to add, connect, and configure components. Here, we will take an Excel file as input and produce an Excel file containing the same data as output.
Adding Components to a Job
The palette offers several components to choose from. It also has a search option where you can enter the name of a component to find it.

Since we are taking an Excel file as input, drag the tFileInputExcel component from the palette and drop it in the Designer window.

Now, if you click anywhere in the Designer window, a search box appears. Find tLogRow and select it to bring it into the Designer window.

Finally, select the tFileOutputExcel component from the palette and drag and drop it into the Designer window.

All the components have now been added.

Connecting the Components
After adding the components, you must connect them. Right-click the first component, tFileInputExcel, and draw a Main line to tLogRow.

Similarly, right-click tLogRow and draw a Main line to tFileOutputExcel. Your components are now connected.
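To make the idea of a Main link more concrete, here is a minimal Python sketch of rows flowing from one component to the next. It is only an illustration: the function names `file_input`, `log_row`, and `file_output` and the sample data are hypothetical stand-ins, not Talend APIs, and the code Talend actually generates for a job is different.

```python
# Conceptual stand-ins for the three components; these are not Talend APIs.

def file_input(rows):
    """Plays the role of tFileInputExcel: emits rows one at a time."""
    for row in rows:
        yield row


def log_row(rows, log):
    """Plays the role of tLogRow: records each row, then passes it on unchanged."""
    for row in rows:
        log.append(row)
        yield row


def file_output(rows, sink):
    """Plays the role of tFileOutputExcel: stores every row it receives."""
    for row in rows:
        sink.append(row)


# Hypothetical sample data standing in for the input sheet.
sample_rows = [("1", "Alice"), ("2", "Bob")]
logged, written = [], []

# Each nested call corresponds to one Main link drawn in the designer.
file_output(log_row(file_input(sample_rows), logged), written)
```

Because the middle stage passes every row through unchanged, both `logged` and `written` end up with the same rows as the input, which mirrors what this job is meant to do.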


Configuring the Components
After adding and connecting the components in the job, you need to configure them. Double-click the first component, tFileInputExcel, to configure it. Give the path of your input file in File name/stream.
If the first row of the Excel file contains the column names, enter 1 in the Header option.

Click Edit schema and add the columns and their types according to your input Excel file. Click OK after adding the schema.

When prompted, click Yes.
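The effect of the Header option and the schema can be sketched in a few lines of Python. This is an illustrative assumption about the behavior, not Talend code: the helpers `skip_header` and `apply_schema`, the sheet contents, and the column names are all hypothetical. The sketch treats Header as the number of leading rows to skip and the schema as an ordered list of column names and types.

```python
def skip_header(raw_rows, header=1):
    """Drop the first `header` rows, so the column-name row is not read as data."""
    return raw_rows[header:]


def apply_schema(raw_rows, schema):
    """Turn each raw row into a dict keyed by column name, cast to the schema type."""
    typed = []
    for raw in raw_rows:
        typed.append({name: cast(value) for (name, cast), value in zip(schema, raw)})
    return typed


# Hypothetical sheet contents: the first row holds the column names.
sheet = [
    ["id", "name", "salary"],
    ["1", "Alice", "5200.50"],
    ["2", "Bob", "4100"],
]

# Hypothetical schema: one (column name, type) pair per column, in sheet order.
schema = [("id", int), ("name", str), ("salary", float)]

records = apply_schema(skip_header(sheet, header=1), schema)
```

With Header set to 1, only the two data rows remain, and each value is converted to the type declared for its column.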

In the tLogRow component, click Sync columns and select the mode in which you want to display the rows from your input. Here we have selected Basic mode with “,” as the field separator.
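As a rough picture of what Basic mode with a “,” separator produces, the following Python sketch joins the values of each row with the chosen separator. The function `format_basic` and the sample rows are hypothetical; this approximates the console output and is not how tLogRow is implemented.

```python
def format_basic(row, separator=","):
    """Join the values of one row into a single line using the field separator."""
    return separator.join(str(value) for value in row)


# Hypothetical rows arriving from the input component.
rows = [(1, "Alice"), (2, "Bob")]

lines = [format_basic(row) for row in rows]
for line in lines:
    print(line)
```

Passing a different `separator` value changes only the character placed between the fields; the row contents stay the same.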

Finally, in the tFileOutputExcel component, give the path and file name where you want to store your output Excel file, along with the sheet name. Click Sync columns.
Executing the Job
Once you have finished adding, connecting, and configuring your components, you are ready to execute your Talend job. Click the Run button to begin the execution.


The output appears in Basic mode with “,” as the separator.

You can also see that your output is saved as an Excel file at the output path you specified.
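If you want to confirm that the output really contains the same data as the input, a check along the following lines may help. This is a sketch under stated assumptions: it uses CSV files from Python's standard library as a stand-in for Excel files, and the file names, helper functions, and sample rows are hypothetical.

```python
import csv
import tempfile
from pathlib import Path


def read_rows(path):
    """Read every row of a CSV file into a list of lists."""
    with open(path, newline="", encoding="utf-8") as handle:
        return list(csv.reader(handle))


def write_rows(path, rows):
    """Write the given rows to a CSV file, replacing any existing content."""
    with open(path, "w", newline="", encoding="utf-8") as handle:
        csv.writer(handle).writerows(rows)


with tempfile.TemporaryDirectory() as workdir:
    input_path = Path(workdir) / "input.csv"    # stand-in for the input Excel file
    output_path = Path(workdir) / "output.csv"  # stand-in for the output Excel file

    write_rows(input_path, [["id", "name"], ["1", "Alice"], ["2", "Bob"]])

    # Stand-in for running the job: copy the rows through unchanged.
    write_rows(output_path, read_rows(input_path))

    # The check itself: input and output should hold identical rows.
    same_data = read_rows(input_path) == read_rows(output_path)
```

For real Excel files you would need a spreadsheet library to read the sheets, but the comparison step stays the same.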
