Saturday, April 28, 2012

Adaptation of Feed Forward Neural Networks

Author: Marek Libra

A brief note before reading this Knol: I suggest reading

There are two main purposes for considering artificial NNs:
    • the study of biological decision mechanisms (such as the human brain), and/or
    • a machine learning paradigm useful for implementing decision logic in artificial agents, or as a tool for statistical analysis that serves as an alternative to classical approaches.

The main advantage of the NN paradigm over deterministic Turing-like approaches is that it does not require a precise description of how to solve a particular problem, as a program written in a general-purpose programming language (e.g. C++) does.

Instead, we consider supervised learning, where the problem to be solved by the FFNN is described by a pair consisting of a training set and an FFNN architecture. The training set consists of pairs (given input, expected output). These pairs are called patterns. The training set is provided by the FFNN user (human or machine).

For completeness, a training set is not mandatory for computation with FFNNs. The adaptation function of an FFNN can be omitted (never used): well-specified weights are sufficient for the network to compute properly. Unfortunately, it is very difficult for the NN designer to set the weights properly by hand.
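To illustrate computation without any adaptation, here is a minimal sketch of a tiny FFNN whose weights are set by hand so that it computes XOR. The threshold activation, the 2-2-1 topology, and all names (`step`, `neuron`, `xor_net`) are assumptions chosen for this illustration, not something prescribed by the text above.

```python
def step(z):
    """Threshold activation: 1 if z >= 0, else 0 (an assumed choice)."""
    return 1 if z >= 0 else 0


def neuron(inputs, weights, bias):
    """Output of one neuron: activation of the weighted input sum plus bias."""
    return step(sum(w * x for w, x in zip(weights, inputs)) + bias)


def xor_net(x1, x2):
    """A 2-2-1 feed forward network with hand-picked weights computing XOR."""
    h_or = neuron((x1, x2), (1, 1), -0.5)    # fires if x1 OR x2
    h_and = neuron((x1, x2), (1, 1), -1.5)   # fires if x1 AND x2
    # Output neuron fires if the OR neuron fires and the AND neuron does not.
    return neuron((h_or, h_and), (1, -1), -0.5)


outputs = [xor_net(a, b) for a, b in [(0, 0), (0, 1), (1, 0), (1, 1)]]
print(outputs)
```

Even for this four-pattern problem the weights and biases had to be chosen carefully, which hints at why setting them by hand does not scale to realistic networks.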

For this reason, multiple learning mechanisms have been introduced that adapt the weights automatically. Each learning algorithm differs in its ability to make the network generalize.

The ability of a network to solve a particular problem is measured by the network error on a set of patterns L of size p, in the usual sum-of-squares form:

$$E = \sum_{k=1}^{p} E_k$$

where E_k is called the partial error of the network on the pattern k:

$$E_k = \frac{1}{2} \sum_{j \in Y} \left( y_j(x_k) - d_{kj} \right)^2$$

where Y is the set of output neurons (determined by the FFNN topology), y_j(x_k) is the computed output of the neuron j after network computation on the input vector x_k, and d_kj is the expected output of the neuron j after network computation on the input x_k.
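The error computation can be sketched in a few lines. This sketch assumes the common sum-of-squares form, where the partial error on a pattern is half the sum of squared differences between computed and expected outputs, and the network error is the sum of partial errors over all patterns. The names `partial_error`, `network_error`, and `toy_network` are hypothetical, and the toy network is only a stand-in for a real FFNN.

```python
def partial_error(computed, expected):
    """Partial error on one pattern: half the sum of squared output differences."""
    return 0.5 * sum((y - d) ** 2 for y, d in zip(computed, expected))


def network_error(network, patterns):
    """Network error on a pattern set: the sum of the partial errors."""
    return sum(partial_error(network(x), d) for x, d in patterns)


def toy_network(x):
    """Hypothetical stand-in for an FFNN with one output neuron."""
    return (0.5 * (x[0] + x[1]),)


# Patterns are pairs (given input, expected output).
patterns = [
    ((0.0, 0.0), (0.0,)),
    ((0.0, 1.0), (1.0,)),
    ((1.0, 0.0), (1.0,)),
    ((1.0, 1.0), (0.0,)),
]

total_error = network_error(toy_network, patterns)
print(total_error)
```

Passing the network as a plain function keeps the error measure independent of how the outputs are produced, so the same code can score a network on a training set or on a testing set.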

The learning algorithm adapts the weights to minimize the network error on a training set. While the training set is used for error evaluation during adaptation, the testing set is used to evaluate how well the network solves the problem. More precisely, the testing set consists of patterns, just like the training set, and it can be disjoint from the training set. This simple methodology gives a better picture of the network's capability to handle unknown input.
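As a rough illustration of this methodology, the sketch below adapts a single sigmoid neuron by plain gradient descent on a training set and then evaluates it on a disjoint testing set. Gradient descent is only one possible learning algorithm and is an assumption here, as are the sum-of-squares error, the one-dimensional threshold task, the learning rate, and all names.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def output(w, b, x):
    """Output of a single sigmoid neuron with weight w and bias b."""
    return sigmoid(w * x + b)


def network_error(w, b, patterns):
    """Sum-of-squares error (assumed form) over a set of patterns."""
    return sum(0.5 * (output(w, b, x) - d) ** 2 for x, d in patterns)


def adapt(patterns, epochs=2000, rate=0.5):
    """Batch gradient descent on the training error, starting from zero weights."""
    w, b = 0.0, 0.0
    for _ in range(epochs):
        grad_w = grad_b = 0.0
        for x, d in patterns:
            y = output(w, b, x)
            delta = (y - d) * y * (1.0 - y)  # derivative of the partial error w.r.t. the neuron input
            grad_w += delta * x
            grad_b += delta
        w -= rate * grad_w
        b -= rate * grad_b
    return w, b


# Hypothetical task: output 1 for inputs above 0.5, else 0.
training_set = [(0.0, 0), (0.2, 0), (0.4, 0), (0.6, 1), (0.8, 1), (1.0, 1)]
testing_set = [(0.1, 0), (0.3, 0), (0.7, 1), (0.9, 1)]  # disjoint from the training set

train_before = network_error(0.0, 0.0, training_set)
test_before = network_error(0.0, 0.0, testing_set)
w, b = adapt(training_set)
train_after = network_error(w, b, training_set)
test_after = network_error(w, b, testing_set)
print(train_before, train_after, test_before, test_after)
```

Only the training set enters `adapt`; the testing set is touched solely by the final evaluation, so its error reflects how the adapted neuron handles inputs it has not seen.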

The goal of the adaptation process need not be zero network error on the training set, but rather a reasonably low error on the testing set. Moreover, a low network error does not necessarily mean that the network is good at solving the problem. Overtraining is a major issue: the network does not generalize from the given data but merely memorizes it. Such an overtrained network lacks the ability to produce the correct output for new (not previously given) input. Some aspects of the overtraining issue are discussed in [1].
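One common way to react to overtraining is to watch the error on held-out patterns during adaptation and keep the weights from the epoch where that error was lowest (often called early stopping). The text above does not prescribe this technique; the sketch below is an illustration only, the function name and the `patience` parameter are hypothetical, and the error values are made-up numbers chosen to show the typical pattern, not measured results.

```python
def best_stopping_epoch(held_out_errors, patience=2):
    """Return the epoch with the lowest held-out error seen so far,
    giving up once the error has failed to improve for `patience` epochs."""
    best_epoch, best_error = 0, held_out_errors[0]
    for epoch, err in enumerate(held_out_errors):
        if err < best_error:
            best_epoch, best_error = epoch, err
        elif epoch - best_epoch >= patience:
            break  # held-out error keeps rising: likely overtraining
    return best_epoch


# Illustrative (made-up) per-epoch errors: the training error keeps falling
# while the held-out error starts rising again after epoch 3.
training_errors = [0.95, 0.55, 0.35, 0.22, 0.12, 0.06, 0.02]
held_out_errors = [0.90, 0.60, 0.45, 0.40, 0.43, 0.50, 0.62]

stop_at = best_stopping_epoch(held_out_errors)
print(stop_at, training_errors[stop_at], held_out_errors[stop_at])
```

In this made-up run the training error at the chosen epoch is far from zero, which matches the point above: the lowest training error is not the goal.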

Further Reading



  • [1] S. Amari, N. Murata, K.-R. Müller, M. Finke, and H. H. Yang. Asymptotic statistical theory of overtraining and cross-validation. IEEE Transactions on Neural Networks, 8(5):985–996, September 1997.

Source Url:
Knol Nrao - 5196
