Managing Attribute Values

The Data tab in the Data Manager task displays the dataset in a spreadsheet format.

The following operations can be performed in this pane:



Formula editor

In the formula editor bar you can enter formulas that define attributes.

The plus and minus buttons let you add or remove rows in the data grid:

  • When you click the plus button, the number of rows you specify is added to the bottom of the data table.

  • To remove a row or a group of rows, select them (hold Ctrl and/or Shift to select a group), then click the minus button or press Delete.

Column headers

The column headers display the name of the attribute.

The following operations can be performed by right-clicking on the column header:

  • Sort columns in ascending or descending order. This operation automatically adds the attribute to the Sort column of the Query Manager.

  • Copy the formula applied to the selected column and apply it to other columns by pasting the formula.

Data table

The data table is structured like an Excel spreadsheet. 

The following operations can be performed on the table:

  • Edit a single cell by double-clicking it and entering the new value. Any corresponding statistic, such as the average of the column, is updated automatically. Note that if the column contains nominal data, you do not need to enclose the new value in double quotes.
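
The automatic update of column statistics can be pictured with a small Python sketch. This is only an illustration, not the application's own code: the Column class and its set_cell method are hypothetical names, and the average is simply recomputed from the current cell values whenever it is read.

```python
from statistics import mean


class Column:
    """Hypothetical stand-in for one numeric column of the data table."""

    def __init__(self, values):
        self.values = list(values)

    @property
    def average(self):
        # Recomputed on every access, so it always reflects the latest edits.
        return mean(self.values)

    def set_cell(self, row, value):
        # Mimics double-clicking a cell and typing a new value.
        self.values[row] = value


col = Column([2.0, 4.0, 6.0])
before = col.average      # average of 2.0, 4.0, 6.0
col.set_cell(0, 8.0)      # edit the first cell
after = col.average       # average of 8.0, 4.0, 6.0
```

Because the average is derived from the cells rather than stored, there is no separate refresh step after an edit.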

The following operations can be performed by selecting and right-clicking cells:

  • Copy and paste cells. Transposed copy and paste operations can also be performed, where rows are pasted as columns and vice versa, as long as the new values are compatible with the destination type. If they are not, a cast is performed, where possible, to convert them to the destination type. Copy and paste operations can also be performed from the Statistics spreadsheet (see here) to the data spreadsheet.

  • Change the values of specific cells (Set value). Unlike double-clicking a cell, this lets you fill many cells with the same value.

  • Assign the complete view to the training, test or validation set in order to split the dataset.

  • Scroll to a specific row (Go to row), which is useful when the dataset contains many rows.
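
As a rough illustration of the transposed paste described above, the following Python sketch turns copied rows into columns and casts each value to the type of its destination column. The function names (transpose, cast_value, paste_transposed) and the use of Python types to stand for attribute types are assumptions made for this example; the application's actual casting rules may differ.

```python
def transpose(block):
    """Turn the rows of a rectangular block of cells into columns."""
    return [list(col) for col in zip(*block)]


def cast_value(value, target_type):
    """Convert value to target_type, or raise ValueError if that is impossible."""
    if isinstance(value, target_type):
        return value
    try:
        return target_type(value)
    except (TypeError, ValueError) as exc:
        raise ValueError(
            f"cannot cast {value!r} to {target_type.__name__}"
        ) from exc


def paste_transposed(block, column_types):
    """Transpose block, then cast every value to its destination column type."""
    pasted = transpose(block)
    if pasted and len(pasted[0]) != len(column_types):
        raise ValueError("pasted width does not match the destination columns")
    return [
        [cast_value(v, t) for v, t in zip(row, column_types)]
        for row in pasted
    ]


# Two copied rows become two destination columns (integer and text).
copied = [["1", "2", "3"], ["a", "b", "c"]]
result = paste_transposed(copied, [int, str])
```

Here the strings "1", "2" and "3" are cast to integers because the first destination column is assumed to be an integer column; a value that cannot be cast, such as "a" into an integer column, raises an error instead of being pasted.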

Columns in the data table may also be highlighted in specific colors according to their roles:

  • Yellow if the attribute role is output.

  • Green or red if the attribute is the result of an Apply Model task. In classification problems the color is green if the prediction is correct, red if it is wrong.

  • Orange if the attribute is the confidence level related to the prediction.

  • Violet if the attribute is the index of the most important rule (generated by a previous task) which has been applied to generate the prediction.
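
The color rules above can be summarized in a short Python sketch. The role labels used here ("output", "prediction", "confidence", "rule_index") are hypothetical names chosen for the example, not identifiers taken from the application, and the prediction is assumed to be correct when it equals the actual output value.

```python
# Fixed colors for roles that do not depend on the cell value.
ROLE_COLORS = {
    "output": "yellow",
    "confidence": "orange",
    "rule_index": "violet",
}


def prediction_color(predicted, actual):
    """Green when the predicted class matches the actual one, red otherwise."""
    return "green" if predicted == actual else "red"


def highlight(role, predicted=None, actual=None):
    """Return the highlight color for a column role, or None if not highlighted."""
    if role == "prediction":
        return prediction_color(predicted, actual)
    return ROLE_COLORS.get(role)


colors = [
    highlight("output"),
    highlight("prediction", predicted="yes", actual="yes"),
    highlight("prediction", predicted="yes", actual="no"),
    highlight("confidence"),
    highlight("rule_index"),
    highlight("input"),
]
```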