Settings for a Flat File (TXT, CSV)
This topic describes how to configure a new Flat File connection in DataSync.
To configure the connection:
-
In DataSync, create a new source connection for a flat file.
-
Fill out the fields as described below:
-
Click Save.
The Additional Connection Properties panel allows to add parameters that are not specified in the Connection Properties panel. Additional properties can be set to Visible or Encrypted. The most common additional properties are listed below.
Additional Connection Properties | Description | ||||
---|---|---|---|---|---|
Charset |
If your source file is encoded in ANSI format, you must add the Charset parameter to your configuration in the Additional Connection Properties pane, exactly as it appears in the image below (including the value—windows-1252): If your source file is encoded in UTF-8 format, no additional parameter is needed. |
||||
Culture |
By default, DataSync uses the decimal separator from your computer's Windows regional format settings. For Flat File connections, source and destination files must use a decimal separator that is compatible with your DataSync configuration. If the source or destination file uses a different decimal separator (e.g., a flat file that uses commas as the decimal separator but your regional settings are set to American English), you must add the Culture additional property to your configuration as follows:
|
||||
Row Scan Depth |
The Row Scan Depth additional property lets you define how many rows of your source file will be scanned to determine the data type of your table's columns. The default scan is set to 100 rows. If you wish to alter that value, simply add a Row Scan Depth additional property and enter the desired value (e.g., 5000). Note
If you add a Row Scan Depth additional property with a value of 0 (zero), DataSync will scan the entirety of the flat file, regardless of its length. Scanning a larger amount of rows will result in longer processing times, but they will ultimately be more accurate. |
||||
AggregateFiles |
This parameter aggregates all of the files located in the URI directory into a single table called AggregatedFiles. By default, the first file in the folder is used to define the schema; however, the MetadataDiscoveryURI parameter can be specified to use a different file instead. Example
Two CSV files have these contents.
The resulting aggregate table is shown below. Note that only the columns present in the defined schema are used in the aggregate. AggregatedFiles ItemID,Name,NumInStock 1,Peanuts - Salted,76 2,Peanuts - Unsalted,43 3,Raisins,26 4,Pretzels - Original,55 5,Pretzels - Chocolate,35 6,Toffee,44 |
||||
IncludeFiles |
This parameter allows you to provide a comma-separated list of file extensions to include into the set of the files modeled as tables. The default value is "CSV,TXT,TAB". For example, IncludeFiles=TXT,TAB. You can include files without an extension by specifying the NOEXT value. You can also include archive files of these types: ZIP, TAR, and GZ. When archive files are found, they will be downloaded to the local machine so the driver can extract and parse the contained files. Files contained within an archive must match an extension listed in IncludeFiles to be included in the set of files modeled as tables. File masks can be specified using an asterisk (*) to provide enhanced filtering capabilities; for example: IncludeFiles=2020*.csv,TXT. Files specified in Schema.ini are included in addition to the files specified by this property. It is also possible to specify date and time filters using CreatedDate and ModifiedDate. All extension filters are evaluated in disjunction (using OR operator), and then the resulting filter is evaluated in conjunction (using AND operator) with the date and time filters. Examples: IncludeFiles="TXT,CreatedDate<='2020-11-26T07:39:34-05:00'" IncludeFiles="TXT,ModifiedDate<=DATETIMEFROMPARTS(2020, 11, 26, 7, 40, 50, 000)" IncludeFiles="ModifiedDate>=DATETIMEFROMPARTS(2020, 11, 26, 7, 40, 49, 000),ModifiedDate<=CURRENT_TIMESTAMP()" |
||||
ExcludeFiles |
This parameter provides a comma-separated list of file extensions to exclude from the set of the files modeled as tables. It is also possible to specify date and time filters using CreatedDate and ModifiedDate. All extension filters are evaluated in disjunction (using OR operator), and then the resulting filter is evaluated in conjunction (using AND operator) with the date and time filters. Examples: ExcludeFiles="TXT,CreatedDate<='2020-11-26T07:39:34-05:00'" ExcludeFiles="TXT,ModifiedDate<=DATETIMEFROMPARTS(2020, 11, 26, 7, 40, 50, 000)" ExcludeFiles="ModifiedDate>=DATETIMEFROMPARTS(2020, 11, 26, 7, 40, 49, 000),ModifiedDate<=CURRENT_TIMESTAMP()" |
||||
InsertMode |
This property is for Destination only. It specifies the mode for inserting data into CSV files.
Note: Note that SingleFile can only be done for local and network path. |
||||
CreateBatchFolder | Whether to create a folder or not when InsertMode is set to FilePerBatch. Default Value is true. | ||||
BatchNamingConvention |
Determines the naming convention of batch files.
|