Genetic Algorithms
(Source: "Genetic Algorithms'' by David Goldberg, Addison Wesley, 1989)
The genetic possibilities of a single creature are rather limited. Similarly, in order to solve some problems, we need to work with a population of solutions. These solutions need to be encoded in some manner. Usually we encode the solution as 0-1 strings. Genetic algorithms require very little information about a problem. Let us assume that we have an evaluation function which can take a solution string and return a fitness value for that solution. We can explore a large number of points simultaneously. Let us take a random population of size 4. Suppose the members are 11001 01010 11011 01101
The simplest genetic algorithm is composed of three operators:
A neuron receives input from other neurons and the outside world. The neuron then uses a simple mathematical transformation to give its output. We assume that the output of a neuron can be adequately represented by a real value between 0 and 1. This transformation is as follows: for neuron j, if it takes inputs from neurons 1, 2, ..., n (generically i), where each of these inputs are values () between 0 and 1. For each i, neuron j has a weight . This weight gives a measure of the value j imparts to i's input. If is 0, i will have no effect on j. If is a high positive number, then j turns on (to have the value 1) when i turns on. If is a high negative number, then j turns off when i turns on.The total input is then given as . Since this can be a large number (positive or negative), this is transformed into the 0-1 range by the following sigmoid curve.
There are many forms of sigmoid curves. One example is
Here if W is very large (positive) this becomes very close to 1, while if W is very large (negative) this becomes close to 0. A neural network is a collection of these neurons connected to each other and to the outside world, both to get data and to provide responses.Let's take an example of calculating XOR, which is not a linearly separable problem (i.e. can be solved by a regression line etc). Neural networks learn the non-linear patterns which can be later used to classify unseen patterns. A typical NN for XOR problem is
Here there are two inputs (feeding Neurons 1 and 2) and one output (of Neron 4) . Neuron 3 is at the hidden layer. Neuron 5 is a special neuron that always sends out a 1. Now, consider what happens when the inputs are 1 and 0, respectively. The input to 3 is 2.8-7.4, so the output is 1/(1+e^{4.6}) = 0.01. This gives the input to node 4 to be 8.6-5.6-12.5(.01) = 2.876, so the output is 1/(1+e^{-2.876}) =0.947. If we treat all numbers over 0.9 as 1 and those less than 0.1 as 0 (and all others inconclusive), we can treat that output as a 1. If the inputs are restricted to be 0 or 1, this network models the following function
A neural net can be trained to give a good response. Beginning with a random neural net, we can feed the first input in and observe the response. If the response is wrong, we can change the weights along the paths that generated that response. We can also use the correct response to change the weights to generate the correct response. Once we do this for a few thousand data points, NN gets trained. There are many learning algorithms that have been developed. These algorithms work quickly and accurately to train small to medium neural networks (say up to 50 neurons). These networks have been and are being used in many applications. like identification of Bankruptcies, approval of loans, Market movement, heuristic choice to determine which heuristic among many to use based on some simple measures of the data. Neural networks have proven to be more robust and accurate than any regression or other mathematical modeling tool and are being used heavily for classifications and/or the predictions
(Source: "Genetic Algorithms'' by David Goldberg, Addison Wesley, 1989)
The genetic possibilities of a single creature are rather limited. Similarly, in order to solve some problems, we need to work with a population of solutions. These solutions need to be encoded in some manner. Usually we encode the solution as 0-1 strings. Genetic algorithms require very little information about a problem. Let us assume that we have an evaluation function which can take a solution string and return a fitness value for that solution. We can explore a large number of points simultaneously. Let us take a random population of size 4. Suppose the members are 11001 01010 11011 01101
The simplest genetic algorithm is composed of three operators:
- Selection: we copy strings according to their ``fitness to survive.'' Here the fitness to survive can be summed up in the function value the black box returns for each. Suppose the function values are as follows: 11001 -- 181 01010 -- 654 11011 -- 65 01101 -- 240. The total fitness of the population is 1140, so the fraction of the fitness is 15.9%, 57.3%, 5.7%, 21.1% respectively. We can now eliminate unworthy parents, and copy the better ones. We generate four members of the pool, choosing randomly. Suppose we generate 1 of the first string, 2 of the second, none of the third, and 1 of the fourth. Our population becomes:11001 01010 01010 01101. Our total and average fitness are now higher, but we have not improved the best solution. We now use the second method.
- Cross Over: We take the first part of one of the strings and match it with the second part of the other and vice versa. For instance, suppose strings 1 and 2 were paired off. We might (randomly) decide to take the first 3 digits of string 1 with the last 2 digits of string 2.We get 11010 01001. From third and fourth strings 01001 01110. We now have the fitness 11010 --95 01001 -- 754 01001 -- 56 01110 -- 236. We have increased the fitness of the best individual. And the individual with value 95, which used to look reasonably good, is now a very weak individual. We'll see now another step.
- Mutation: Sometimes due to chance event, a gene is lost from the population. Mutation randomly, but with a very low probability, changes bits to ensure that no gene is lost forever. At this point, suppose due to chance event, first digit of first individual is changed, then possibility of a change in first digit is lost forever unless it is done by mutation, as we can see in 01011 01001 01001 01110 . Let's say first digit of last individual is changed and the population now becomes 01011 01001 01001 11110, the said possibility has again revived.
Neural Networks
A neuron receives input from other neurons and the outside world. The neuron then uses a simple mathematical transformation to give its output. We assume that the output of a neuron can be adequately represented by a real value between 0 and 1. This transformation is as follows: for neuron j, if it takes inputs from neurons 1, 2, ..., n (generically i), where each of these inputs are values () between 0 and 1. For each i, neuron j has a weight . This weight gives a measure of the value j imparts to i's input. If is 0, i will have no effect on j. If is a high positive number, then j turns on (to have the value 1) when i turns on. If is a high negative number, then j turns off when i turns on.The total input is then given as . Since this can be a large number (positive or negative), this is transformed into the 0-1 range by the following sigmoid curve.
There are many forms of sigmoid curves. One example is
Here if W is very large (positive) this becomes very close to 1, while if W is very large (negative) this becomes close to 0. A neural network is a collection of these neurons connected to each other and to the outside world, both to get data and to provide responses.Let's take an example of calculating XOR, which is not a linearly separable problem (i.e. can be solved by a regression line etc). Neural networks learn the non-linear patterns which can be later used to classify unseen patterns. A typical NN for XOR problem is
Here there are two inputs (feeding Neurons 1 and 2) and one output (of Neron 4) . Neuron 3 is at the hidden layer. Neuron 5 is a special neuron that always sends out a 1. Now, consider what happens when the inputs are 1 and 0, respectively. The input to 3 is 2.8-7.4, so the output is 1/(1+e^{4.6}) = 0.01. This gives the input to node 4 to be 8.6-5.6-12.5(.01) = 2.876, so the output is 1/(1+e^{-2.876}) =0.947. If we treat all numbers over 0.9 as 1 and those less than 0.1 as 0 (and all others inconclusive), we can treat that output as a 1. If the inputs are restricted to be 0 or 1, this network models the following function
Input 1 | Input 2 | Output |
0 | 0 | 0 |
0 | 1 | 1 |
1 | 0 | 1 |
1 | 1 | 0 |
A neural net can be trained to give a good response. Beginning with a random neural net, we can feed the first input in and observe the response. If the response is wrong, we can change the weights along the paths that generated that response. We can also use the correct response to change the weights to generate the correct response. Once we do this for a few thousand data points, NN gets trained. There are many learning algorithms that have been developed. These algorithms work quickly and accurately to train small to medium neural networks (say up to 50 neurons). These networks have been and are being used in many applications. like identification of Bankruptcies, approval of loans, Market movement, heuristic choice to determine which heuristic among many to use based on some simple measures of the data. Neural networks have proven to be more robust and accurate than any regression or other mathematical modeling tool and are being used heavily for classifications and/or the predictions
No comments:
Post a Comment