SET Tests
In neural network models, an accurate description of the classified output variable (the solution computed from the input variables) depends on the right selection of input variables (the data from which the network is built). Optimising the selection of input features (variables) amounts to finding a compromise between the number of features and their predictive power. For this reason, we did not rely only on a preliminary selection based on our experience: we also used a genetic algorithm, verified by backward and forward stepwise selection, as well as other types of networks (probabilistic and regression ones).
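The stepwise verification itself is not spelled out above; as an illustrative sketch, forward stepwise selection can be written as a greedy loop that keeps adding the feature with the best cross-validated score. The estimator, fold count, and stopping rule below are assumptions, not the settings used in the study:

```python
import numpy as np
from sklearn.model_selection import cross_val_score
from sklearn.neural_network import MLPRegressor

def forward_stepwise(X, y, max_features):
    """Greedy forward selection: repeatedly add the feature that most
    improves the cross-validated score; stop when nothing improves."""
    selected, remaining = [], list(range(X.shape[1]))
    best_score = -np.inf
    while remaining and len(selected) < max_features:
        scores = {
            j: cross_val_score(
                MLPRegressor(hidden_layer_sizes=(8,), max_iter=2000, random_state=0),
                X[:, selected + [j]], y, cv=3,
            ).mean()
            for j in remaining
        }
        j_best = max(scores, key=scores.get)
        if scores[j_best] <= best_score:  # no improvement: stop
            break
        best_score = scores[j_best]
        selected.append(j_best)
        remaining.remove(j_best)
    return selected
```

Backward stepwise selection is the mirror image: start from the full feature set and drop the feature whose removal degrades the score least.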
We assigned the cases to the particular sets at random, keeping similar means and standard deviations across the sets. After selecting the features and assigning the cases to their sets, we built the network model. The model structure was based on multilayer perceptron (MLP) calculations; we also used other network models (PNN, GRNN, Kohonen, etc.). To obtain greater accuracy, we used paired models for the same output variable, the so-called "joined networks" (joined by means of a concept developed specifically for the needs of our project), combining two types of networks: a network that computes over a small area (e.g. of MLP type) and a precisely adjusting network (e.g. of GRNN type), applied in the case of a very large database. The outputs of the two networks were then combined with a weighting algorithm.
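The joining scheme is the authors' own and its details are not given here; one generic way to combine two predictors is to weight each by its inverse validation error, as in the sketch below. KNeighborsRegressor stands in for a GRNN (scikit-learn has no GRNN implementation), and the inverse-MSE weighting is an assumption, not the project's algorithm:

```python
import numpy as np
from sklearn.neural_network import MLPRegressor
from sklearn.neighbors import KNeighborsRegressor  # stand-in for a GRNN-style local model

def joined_prediction(X_train, y_train, X_val, y_val, X_new):
    """Combine a global MLP and a locally adjusting model,
    weighting each by the inverse of its validation MSE."""
    mlp = MLPRegressor(hidden_layer_sizes=(10,), max_iter=2000, random_state=0)
    loc = KNeighborsRegressor(n_neighbors=5, weights="distance")
    mlp.fit(X_train, y_train)
    loc.fit(X_train, y_train)
    # Lower validation error -> larger weight in the joined output.
    errs = [np.mean((m.predict(X_val) - y_val) ** 2) for m in (mlp, loc)]
    w = np.array([1.0 / e for e in errs])
    w /= w.sum()
    return w[0] * mlp.predict(X_new) + w[1] * loc.predict(X_new)
```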
While constructing the model, we paid attention to the number of layers and the number of neurons in consecutive layers; in this way we arrived at the simplest model with the greatest computational power. Multilayer perceptrons use a linear post-synaptic potential (PSP) function (i.e. each neuron determines the weighted sum of its input values) and, usually, a non-linear activation function, whose form is chosen by the model developer. We chose the logistic (sigmoid) function, with output values in the range (0, 1). The network learning process was based on the backpropagation algorithm.
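As a minimal NumPy sketch of this setup, each neuron forms a weighted sum (the linear PSP) that is passed through the logistic function, and backpropagation descends the gradient of the error. The single hidden layer, squared-error loss, and learning rate are illustrative assumptions:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def forward(x, W1, b1, W2, b2):
    """Linear PSP (weighted sum) followed by the logistic activation."""
    h = sigmoid(W1 @ x + b1)          # hidden layer, outputs in (0, 1)
    return sigmoid(W2 @ h + b2), h    # network output in (0, 1)

def backprop_step(x, t, W1, b1, W2, b2, lr=0.1):
    """One gradient-descent update on squared error via backpropagation."""
    y, h = forward(x, W1, b1, W2, b2)
    d_out = (y - t) * y * (1 - y)           # sigmoid'(z) = y * (1 - y)
    d_hid = (W2.T @ d_out) * h * (1 - h)    # error propagated back one layer
    W2 -= lr * np.outer(d_out, h); b2 -= lr * d_out
    W1 -= lr * np.outer(d_hid, x); b1 -= lr * d_hid
```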
For a preliminary interpretation of the network model we used sensitivity analysis and regression statistics, both of cognitive importance. The functional property of a neural network is its ability to learn to map varied input data onto a continuous output value. The results of this process are presented in numerical form in the table of regression statistics, determined independently for the learning, validation and test sets.
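The text does not define the sensitivity measure; a common convention is the error ratio obtained by "removing" each input in turn (fixing it at its mean) and dividing the resulting error by the baseline error, so that ratios above 1 mark influential inputs. That convention, and the mean-substitution step, are assumptions here:

```python
import numpy as np

def sensitivity_ratios(model, X, y):
    """Error ratio per input: MSE with the variable fixed at its mean,
    divided by the baseline MSE. Ratios > 1 flag influential inputs."""
    base = np.mean((model.predict(X) - y) ** 2)
    ratios = []
    for j in range(X.shape[1]):
        Xp = X.copy()
        Xp[:, j] = X[:, j].mean()   # neutralise one input variable
        ratios.append(np.mean((model.predict(Xp) - y) ** 2) / base)
    return np.array(ratios)
```

Computed separately on the learning, validation and test sets, such statistics also reveal overfitting: a model that scores well on the learning set but poorly on the validation set has memorised rather than generalised.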