Data Prediction Tool

The Prediction Tool allows users to create models that can be used for data prediction. To create a model, the user specifies a method (currently, Maximum Likelihood is supported) and a training data set. After a model has been created from the training data, the user can view statistics about the model's performance and apply the model to the training product for testing purposes. If the user is satisfied with the model, it can be saved for later use with the Data Modeler Op. Otherwise, the user can alter the training data set or the model parameters (where the method provides any) and re-train the model. The tool is implemented in a modular manner, so that further methods can be added later.

Collection of Training Data

The selection and gathering of the training set is the most delicate step when using the Prediction Tool. The training set must be representative of the target classes of interest. The aim of the training is to derive a representative sample of the spectral signatures, or other parameters of interest, of each class. The quality of the training data therefore directly influences the performance of the algorithm and the results. The training data is usually derived from an image using a priori knowledge of the scene. Several kinds of spatial sampling objects can be used to select the training data: single pixels, polygons or blocks of pixels, similar contiguous pixels, pixels satisfying certain arithmetic expressions, etc. (see the figure below).

In the Prediction Tool, the training data is introduced to the model by means of masks. The image to be classified has to be selected beforehand. Training pixels are stored in masks, which can be created with a variety of tools. See the help entry for Mask Management for more information on how to create and alter masks.

Preparing a model for training

The Prediction Tool can be opened from the View or Processing/Image Analysis menu or by clicking the tool's icon. When no product is opened in VISAT, the tool looks as shown in the image below. When more than one product is opened, the user can choose at the top of the tool window which one shall serve as the training data source product. There are three tabs in the center of the window: the Training Set tab, the Model Training tab, and the Model Application tab. When the tool view is started, the Training Set tab is shown.

When a product is opened, the user can select source dimensions and training areas. Each band of the training data source product can be selected as a source dimension by checking the box next to its name.
Training areas are created from the masks of the training data source product. All masks provided by the product are listed under Available product masks. The user can select entries from this list with the mouse and shift them to the Selected training areas list using the button provided for this purpose. A second button shifts entries back. Note that the label, the colour, and the description of a training area can all be edited by double-clicking the corresponding field.
A valid pixel expression can be stated in order to select only valid pixels from each of the defined training areas.
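As a rough illustration of how source dimensions, training areas and a valid pixel expression work together, the following Python/NumPy sketch gathers one feature vector per training pixel. It does not show the tool's implementation: the function name collect_training_samples and all array names are hypothetical, and the valid pixel expression is assumed to have been evaluated into a boolean array already.

```python
import numpy as np

def collect_training_samples(bands, class_masks, valid=None):
    """Gather feature vectors and labels from the pixels covered by each mask.

    bands       -- dict: band name -> 2-D array (the selected source dimensions)
    class_masks -- dict: class label -> 2-D boolean array (one training area each)
    valid       -- optional 2-D boolean array standing in for the valid pixel expression
    """
    names = sorted(bands)
    stack = np.stack([bands[n] for n in names], axis=-1)  # (rows, cols, n_bands)
    features, labels = [], []
    for label, mask in class_masks.items():
        # Keep only those pixels of the training area that are also valid.
        selected = mask if valid is None else (mask & valid)
        pixels = stack[selected]  # (n_selected, n_bands)
        features.append(pixels)
        labels.extend([label] * len(pixels))
    return np.concatenate(features), np.array(labels), names

# Hypothetical 2 x 2 scene with two bands and two training areas.
red = np.array([[0.1, 0.2], [0.8, 0.9]])
nir = np.array([[0.5, 0.6], [0.3, 0.2]])
masks = {
    "vegetation": np.array([[True, True], [False, False]]),
    "water": np.array([[False, False], [True, True]]),
}
valid = red < 0.85  # stands in for a valid pixel expression
X, y, names = collect_training_samples({"red": red, "nir": nir}, masks, valid)
```

In this example the lower right pixel belongs to the water mask but is excluded because it fails the validity test, so three of the four pixels end up in the training set.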

In the Model Training tab, the user selects the method by which the model shall be created. Two selections are available: the model category and the method. Methods are grouped into model categories. The method refers to the statistical and cluster analysis to be performed on the training data. Depending on the chosen method, it might be possible or necessary to set method-specific parameters.
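For readers unfamiliar with the Maximum Likelihood method, the sketch below shows one common formulation in Python/NumPy: each class is described by the mean and covariance of its training samples, and a pixel is assigned to the class with the highest Gaussian log-likelihood. This is a minimal sketch under stated assumptions (normally distributed classes, equal prior probabilities, at least two training samples per class), not necessarily the algorithm used by the tool. The class name GaussianMaximumLikelihood is hypothetical.

```python
import numpy as np

class GaussianMaximumLikelihood:
    """One multivariate normal per class; predicts the most likely class."""

    def fit(self, X, y):
        X = np.asarray(X, dtype=float)
        y = np.asarray(y)
        self.classes_ = np.unique(y)
        self.params_ = []
        for c in self.classes_:
            Xc = X[y == c]  # training samples of class c
            mean = Xc.mean(axis=0)
            cov = np.atleast_2d(np.cov(Xc, rowvar=False))
            # A tiny ridge keeps the covariance invertible (assumption of this sketch).
            cov = cov + 1e-9 * np.eye(cov.shape[0])
            inv = np.linalg.inv(cov)
            _, logdet = np.linalg.slogdet(cov)
            self.params_.append((mean, inv, logdet))
        return self

    def log_likelihood(self, X):
        X = np.asarray(X, dtype=float)
        scores = np.empty((X.shape[0], len(self.classes_)))
        for k, (mean, inv, logdet) in enumerate(self.params_):
            d = X - mean
            mahalanobis = np.einsum("ij,jk,ik->i", d, inv, d)
            # The term shared by all classes, -0.5 * n_bands * log(2 * pi), is dropped.
            scores[:, k] = -0.5 * (mahalanobis + logdet)
        return scores

    def predict(self, X):
        return self.classes_[np.argmax(self.log_likelihood(X), axis=1)]

# Two well separated, hypothetical classes in a two-band feature space.
X_train = [[0, 0], [0, 1], [1, 0], [1, 1], [10, 10], [10, 11], [11, 10], [11, 11]]
y_train = ["a", "a", "a", "a", "b", "b", "b", "b"]
model = GaussianMaximumLikelihood().fit(X_train, y_train)
predicted = model.predict([[0.5, 0.5], [10.5, 10.5]])
```

Method-specific parameters, where a method offers them, would correspond to settings such as the ridge value or class priors in a sketch like this one.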

Training a model

Once all settings on the Training Set and the Model Training tabs are correct, the model can be trained by clicking the Train Model button at the bottom of the tool window. After the training, the user can click Evaluate Model Performance in the Model Training tab to get information about the quality of the model. The type of information displayed depends on the model category. For supervised classification, the information consists of a confusion matrix, producer and accuracy values for each class, an overall accuracy measure, and Cohen's kappa coefficient (see http://en.wikipedia.org/wiki/Cohen%27s_kappa). Based on the derived statistics and the confusion matrix, the user can decide either to go further and apply the model to the full image, or to improve the training set if the results are not sufficient.
Clicking the corresponding button on the right side of the tool window opens a separate window that displays detailed information about the model. The content of this window depends on the method used.
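The accuracy measures mentioned above can be derived from the confusion matrix alone. The plain Python sketch below shows the standard definitions of overall accuracy and Cohen's kappa. The function names are hypothetical, and the convention that rows hold the actual classes and columns the predicted classes is an assumption of this example, not a statement about how the tool lays out its matrix.

```python
def confusion_matrix(actual, predicted, classes):
    """Count samples per (actual, predicted) pair; rows = actual, columns = predicted."""
    index = {c: i for i, c in enumerate(classes)}
    matrix = [[0] * len(classes) for _ in classes]
    for a, p in zip(actual, predicted):
        matrix[index[a]][index[p]] += 1
    return matrix

def overall_accuracy(matrix):
    """Fraction of samples on the diagonal, i.e. classified correctly."""
    total = sum(sum(row) for row in matrix)
    correct = sum(matrix[i][i] for i in range(len(matrix)))
    return correct / total

def cohens_kappa(matrix):
    """Agreement corrected for the agreement expected by chance."""
    n = len(matrix)
    total = sum(sum(row) for row in matrix)
    observed = sum(matrix[i][i] for i in range(n)) / total
    row_sums = [sum(row) for row in matrix]
    col_sums = [sum(matrix[i][j] for i in range(n)) for j in range(n)]
    expected = sum(r * c for r, c in zip(row_sums, col_sums)) / total ** 2
    # Undefined when expected == 1 (all samples fall into a single class).
    return (observed - expected) / (1 - expected)

# Hypothetical evaluation: 10 samples, 8 of them classified correctly.
actual = ["water"] * 5 + ["land"] * 5
predicted = ["water"] * 4 + ["land"] + ["land"] * 4 + ["water"]
matrix = confusion_matrix(actual, predicted, ["water", "land"])
accuracy = overall_accuracy(matrix)
kappa = cohens_kappa(matrix)
```

Here the overall accuracy is 0.8 while kappa is roughly 0.6, because half of the agreement would already be expected by chance with two equally frequent classes.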

Applying a model

The Model Application tab provides the settings to apply the trained model to the input data and to define the desired output.
In the Create/update output bands area, the user can set which bands should be created from the model. Below that, a suffix for the output bands can be given. Depending on the method, it might also be possible to set further application parameters. The user can also set a valid pixel expression to be considered during the application of the model. At the bottom of the tab, the user can choose either to add the output bands to the training product or to create a new product to contain them. The model is applied to the training product by clicking the Apply button at the bottom of the tool window. After the application, the results are displayed in VISAT. The user can now change the masks forming the training areas or any other settings and continue to create a new model from these settings.
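The role of the valid pixel expression and the band suffix during model application can be pictured with the following Python/NumPy sketch. It is only an illustration under assumptions: the function apply_model, the output band name "class", the default suffix and the fill value for invalid pixels are all hypothetical and are not taken from the tool.

```python
import numpy as np

def apply_model(predict, bands, valid=None, suffix="_pred", fill_value=-1):
    """Apply a per-pixel prediction function and return the output band(s).

    predict -- callable mapping an (n_pixels, n_bands) array to n_pixels integer labels
    bands   -- dict: band name -> 2-D array
    valid   -- optional 2-D boolean array standing in for the valid pixel expression
    """
    names = sorted(bands)
    stack = np.stack([bands[n] for n in names], axis=-1)  # (rows, cols, n_bands)
    rows, cols = stack.shape[:2]
    if valid is None:
        valid = np.ones((rows, cols), dtype=bool)
    # Pixels that fail the validity test keep the fill value.
    output = np.full((rows, cols), fill_value, dtype=int)
    output[valid] = predict(stack[valid])
    return {"class" + suffix: output}

# Hypothetical one-band scene and a trivial threshold "model".
scene = {"b1": np.array([[0.1, 0.9], [0.7, 0.2]])}
valid = np.array([[True, True], [False, True]])
threshold_model = lambda pixels: (pixels[:, 0] > 0.5).astype(int)
result = apply_model(threshold_model, scene, valid)
```

The returned dictionary corresponds to the choice described above: its entries could either be added to the training product or collected in a new product.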

Loading and Saving Models

Models can be saved and loaded. A saved model can later be applied to other products with the Data Modeler Op. The model is saved as an XML-formatted file with the extension .mod.
Models can be saved by clicking the corresponding button at the side of the tool window. A file chooser will open, allowing the user to select a destination and a name for the model file. Note that the file extension is .mod. If it is not specified, it will be added automatically.
Models can be loaded by clicking the corresponding button at the side of the tool window. A file chooser will open, allowing the user to navigate to a model file.