Importing Data from Microsoft Excel

You can import data directly from an excel file, specifying a specific data sheet.

You can either import Excel files in two different ways:

  • By dragging and dropping the Excel source file directly onto the stage.

  • By using the Import from Excel task. The advantage of using the task is that you can configure the import options and decide whether you want to specify a single or multiple excel file:

    • Single Excel file: only one file is imported, specifying the datasheet from which information will be taken.

    • Multiple Excel files: in this case the files are concatenated to form a single table. Consequently, all files imported together must have the same structure.

Wildcards can be used in a filename/file list. For example, entering C:\Software*.csv during a file import, will result in the bulk import all the files in the Software folder, whose extension is .csv. Wildcards can be used in file names, but not in folder names, or in file extensions. Remember that only wildcard (*) can be used in the file path, and any consequent asterisks will be considered an effective part of the path, and not a wildcard. If the file extension is not actually specified when using a wildcard, files with extensions appropriate to the task in use will be searched for. For example, the extensions .xlsx, .xlsm for Import from Excel File, and .csv, .txt, .tab, etc. for Import from Text File.

Rulex's technical documentation does not and cannot provide comprehensive guidelines on the use of third-party software, beyond how Rulex integrates with this software. Please consult the technical documentation of the third-party software itself for up-to-date information.


Prerequisites

Additional tabs

The following additional tabs are provided:


Procedure

  1. Drag and drop the Import from Excel File task onto the central stage.

  2. Double click the task.

  3. If you want to import data via a remote connection, such as HTTP API, SharePoint site or FTP, select the corresponding source from the Source file URI (uri) drop-down list and configure the connection in the Remote Connections tab.

  4. To import a single file:

    • Click Select file to browse to the Excel file you want to import and click Open, or manually enter the name of the file in the corresponding edit box (filename): the Table preview pane displays the data that will be imported into Rulex, and is dynamically updated each time you change any of the available options.

    • Configure the options as explained in the Single file options below.

  5. To import multiple files or multiple sheets:

    • Click on the Advanced tab and configure the options as explained in the Multiple file options below.

  6. Save and compute the task.

Single file options:

Parameter Name

PO

Description

Select sheet 

sheetname

Select the sheet you want to import from the drop-down list: the contents of the datasheet are displayed in the Table preview below.

If you select the Match sheet by position option, the position number of the sheet will be displayed (sheet) instead of its name.

Match sheet by position

matchbypos, sheet

Select the option if you want to import the sheet according to its position (True), and not its name (False).

For example, if you import sheet 2 (sheet), called NewData, and you subsequently insert a new sheet, and NewData consequently becomes the third sheet in the file, the sheet that is now in position two will be imported. Otherwise, the sheet called NewData will always be imported, regardless of its position. 

Missing string

missingstr

Specify a character (or string of characters), which has been used in the data to indicate missing values.

Start importing from line

firstline

The line number from which the import operation will start.

Stop importing at line 

lastline

The number of the line where the import operation will stop.

Get names from line

nameline

The number of the line from which the names of the columns will be taken.

Get types from line

typeline

The number of the line from which the names of the data types will be taken.

Remove empty rows

delemptyrows

If selected, empty rows are automatically deleted.

Remove empty columns

delemptycols

If selected, empty columns are automatically deleted.

Strip spaces

stripspaces

If selected, all spaces are removed from strings. For example, the string ” age ” is then imported as “age”.

Compress white spaces

onlyonespace

Select this option to remove extra consecutive spaces from within strings. For example the string "university    program" would be imported as "university program".

Add an attribute containing...

addinfocol

If required you can add an extra column to the dataset with the name of the file, sheet or both.

Possible values are:

  • None, if no extra columns are added

  • File, to add a column with the file name

  • Sheet, to add a column with the sheet name

  • Both, to add two columns, one with the file name and one with the sheet name.

Use old computation data if source file is not available

useolddata

If selected, data from the previous computations will be used if the source table is not available.

Continue the execution if the file is missing

continue

If selected, computation of the task continues, even if the selected source files are not available.

Turn off smart type recognition
NEW FOR RULEX 4.7

nosmarttype

If selected, the the data types of attributes is not automatically recognized, leaving the generic nominal type. This option is useful when manual identification is preferable, for example when there is the risk of a code being misinterpreted as a date.

However, if data types have been specifically defined in incoming MS Excel files, these data types will be maintained, even when the Turn off smart type recognition option has been selected.

Wait until the target file is present

checkfile, sleeptime

If selected, Rulex polls the target file with the frequency specified (sleeptime) until it is available.

Number of records to preview

nrecord

Specify how many records the table preview will display.

Multiple file options:

Choose bulk import mode

bulkmode, filelist

Select whether you want to import multiple files or sheets:

  • File (0), to import multiple Excel files from a local file system drag and drop these files onto the blank pane below. If you are importing from a remote connection, click on Select files/directories and select which files and folders you want to import.

  • Sheet (1), to import multiple datasheets from the Excel file specified in the basic options, then select the required sheets listed below.

Sheet import mode

sheetsel

Select how you want to import the selected sheets. Possible options are:

  • selected, to import selected sheets

  • exclude, to import all but the selected sheets

  • all, to import all sheets, regardless of the selection

This option is displayed only if Sheet is selected as the bulk import option.

This option is not displayed if you are importing from a remote connection.

Concatenation type

cattype

Select the required concatenation type, which may be:

  • Inner (0), where only attributes present in both tables are included in the final merged table

  • Outer (1), where all attributes are copied, filling in any missing values if necessary.

Match columns by

byname

Select whether you want to match columns by:

  • Position (0), to select the required column according to the position in the sheet

  • Name (1), to select the required column according to its name.