Advanced Search
Welcome to Omgili,
Omgili (Oh My God I Love It ;) is a search engine for discussions. With Omgili you can find answers and solutions, debates, discussions, personal experiences, opinions and more... To learn more about Omgili click here.

This is a complete preview of the discussion as it was indexed by Omgili crawlers. Use this preview if the original discussion is unavailable.
Click here to view the original discussion.

VS 2008 GA: Traveling Salesman Problem - VBForums

Hey, I have done a little project in the past involving Genetic Algorithms (GAs), written in VB6 (really poorly).

Now, after I understand the whole concept of programming much better, I decided to give it another try. This time I've decided to try tackling the Traveling Salesman Problem.

For those not familiar with it, it goes a little like this: Imagine you have N cities scattered across a country.

A salesman has to travel to each city exactly once, but he doesn't like traveling very much.

So, he wants to find the minimum traveling distance while still visiting each city once.

The path the salesman takes is often called a tour. The amount of different tours is N!, which can grow very very large for as little as 20 cities.

So simply trying each one and choosing the shortest is not an option.

That's where the GA comes in. The GA starts off with a random sample (population) of tours (let's say 50 of them).

Each tour has to be encoded with some kind of string, representing a chromosome .

This is often done as a string of bits.

Then, a fitness function determines how fit each tour is (in better words: how well the tour fits).

A number of the fittest tours are then recombined to (hopefully) form better (shorter) tours.

The same number of the worst (longest) tours are removed.

This recombination basically takes two chromosomes and switches them around after a certain point, like this: Code: 010 | 01100110 110 | 11010010 -->

010 | 11010010 110 | 01100110 This way, we might end up with even better tours.

Then, the same happens for this next generation.

The fittest tours are again recombined, etc, until we have a good enough solution, or until a predefined time as run out. In order to randomize the populations a bit, once every few generations a mutation can occur.

This means basically changing one or a few 'bits' (in this case) from 0 to 1 or 1 to 0.

This might make the tour in question much better, or much worse.

If it's better, it will be recombined with another good tour in the next generation.

If it's worse, it will vanish over time as other bad tours. Well, seems easy enough doesn't it?

But the real problem is in determining how to encode the chromosomes, and how to do the recombinations/mutations. I thought I had it figured out pretty well, by simply taking the chromosome to be an array of the number of each city, in the order which the salesman visits them.

For example, if we have cities 1, 2, 3 and 4, a chromosome might look like: Code: 2, 3, 1, 4 indicating that the salesman starts at 2, goes to 3, to 1, and finally ends in 4. I thought I had this all figured out, but I'm afraid I was half asleep when I thought of this, because it has a fatal flaw: - After recombination, we cannot be sure that each city is visited exactly once. For example, suppose we have tours [2,3,1,4] and [3,1,4,2] and recombine them (switching after 2nd city), we get: [2,3,4,2] and [3,1,1,4] Now, the first tour visits city 2 twice and never visits city 1.

The second tour visits city 1 twice and never city 2... I cannot think of any way to fix this?!

I suppose I could check if the new recombined tour is valid before accepting it, and if not, try it again, but that will probably GREATLY increase the execution time of the algorithm, and GAs are already notoriously slow...

I'm doing this mainly to see how fast I can get it, and using a validity check is not a good start at all... Can anyone think of a better way to encode the tour, and how to recombine two tours while ensuring that the new tour remains valid? Thanks a lot!

Do you have to return home as well at the end, or are you just looking for shortest pathfinding?

Ow... brain hurts... Anyways, I've seen this type of problem before but i thought it was one of those unsolvable ones?

Oh no, it's completely solvable.

Just can take some time.

Travelling Saleman style Genetic Algorithms are pretty potent, and have many uses, such as in game pathfinding.

What, you think telling your troops in Warcraft 3 to go from point A to point B around terrain was easy? Sometimes, you have to just take the "best solution" within a time period.

It may not be the "best" but it's not totally moronic either.

If you're a gamer, you may have seen many different levels of pathfinding intelligence in games.

Right on, reminds of the problem i had to solve dealing with handshakes.

Still don't understand the solution to that one 100% either.

Yea, there are some I don't quite understand either, but I implement.

I'm messing with a progressive algorithm now for taking a tessellated mesh of a sphere and iteratively deforming it to a random terrain of a planetary surface.

It's one I'm having a little trouble grasping (and an even harder time displaying the results).

I plan on replacing the randomizers in the algorithm with pseudo-random functions when I'm done, so a few simple seed values will always produce the same "planet" each time.

Yah, that's still a little over my head (and by little i mean you just spoke some sort of alien language while i smile and nod politely) :P

With the array of cities, it will NEVER work, because any recombination will have to result in either no change, or a duplicated city.

The first thought that comes to mind is to make the cities into an array, with the order in which they are visited be the value in the array.

Thus, city 0 holds a 1, city 1 holds a 3, city 4 holds a 20, or whatever.

This has the same problem, though, because the order of visiting would be duplicated somewhere, and how would you resolve which is earlier? There is a way around this, which is a bit odd.

Suppose you made the city list an array of Integer, then randomly assigned a number from 0 to Integer.MaxValue.

For a number of cities MUCh less than Integer.MaxValue, the odds of any two numbers being the same would be VERY low, and you could quickly check that there are no matches and bump one up or down a notch to unmatch the pair.

You could then sort the array from low to high, and that would be the order of visitation...except that the sorting would destroy the city number. To make that technique work, the list would have to contain a structure with members: City Number and Value.

The list would then have a custom sort that would sort the list by value, then the sorted list would give you the order of visitation for the cities.

When you recombine, you ONLY recombine the values, which means that the city numbers remain the same, but the values change. That approach would work, but seems a bit hokey, and it begins to fall apart as the number of cities approaches the size of an integer.

However, it's all that is jumping to mind at the moment. By the way, the basis for a GA is that you start with a random population.

In a random population, some are good, some are REALLY bad.

Wouldn't it seem better to start with a population that is somewhat better than random?

So how do you do that.

Well, the GA can do that for you.

Start with a random population, evolve it for some number of iterations, take the best one, and add it to a list of winners.

Repeat those steps until the list of winners is the size of your starting population, and you have a population of candidates who are all better than random.

Evolve this group and you get a Winner of Winners. In my experience, the Winner of Winners is almost always considerably better than any of the Winners.

One GA I have written goes a step further and creates a Winner of Winners of Winners.

This third order winner has rarely been better than the population that it was derived from.

Since each order takes ((PopulationSize * Number of iterations to get a winner) + 1), then doing that second order is insanely long, and not worth it.

However, getting the Winner of the Winners, which is much quicker, appears to always be a good thing to do. In biology, this is most likely the reason behind the phenomenon known as Hybrid Vigor.

Crossing the winning members of a few populations that have each evolved for a few generations, tends to create members that are stronger than either of the initial populations....or they all die, but that's a different mechanism.

A few more thoughts: Instead of a structure, a class should be used, even though it has only two members: Code: Public Class Individual Public CityNumber as Integer Public Value as Integer Private rand as Random Public Sub New(ctyNo as Integer) rand = New Random CityNumber = ctyNo Value = rand.Next(0, Integer.MaxValue+1) End Sub End Class (this is incorrect, the Random object will cause trouble inside the class, so the value should be passed as an argument to the constructor, but I wanted to show the concept) The reason for this is that an individual can hold two lists.

You build the first list of these structures, then once the initial list is built, pass it to the next list.

The second list can be immediately sorted.

Had this been structures rather than classes, you would end up copying to the new list, but by direct assignment, the first list holds the objects in city order, while the second list holds the objects in value order.

Sorting is very useful at various times, but unsorted is essential at most times, so you don't want to be sorting again each time you want a sorted version of the list. It also occured to me that if you made Value a GUID converted to a string, then there would be no chance that any two cities would have the same value.

However, strings are painful to work with, and you'd REALLY see it in a GA which will go through tens of thousands of iterations.

By using an integer, there is a slim chance that two cities will be given the same value.

Therefore, once an individual is created (with the two lists populated), then just iterate through the sorted list checking for whether any two cities have the same value.

If a pair is found, randomly add or subtract 1 (or more) to one of the two.

As long as the change is random, you should be safe.

Of course, this change could replace one pair with another, so you can check for that when deciding whether to add or subtract. The technique should be both sound and fast.

Cool. Using random values between 0 and Integer.MaxValue seems like a good idea.

I can't imagine ever trying it with more than say 10.000 cities (which is a lot I think!) so I shouldn't get close to the max value.

Still, I do need to do a check after every recombination, and probably put that in a while loop which keeps changing the chromosome if it's invalid.

Otherwise, adding 1 to a random value in the list might make the chromosome invalid again (althoug that chance is very slim).

But I can't take the chance I think;

Once one chromosome is 'corrupted' it may well be much better than all others (if it has less cities to visit) so new chromosomes will also become corrupted. I will try your idea this evening and see how it goes...

Thanks! @Jenner: I believe the TSP can even be solved exactly, so there's definitely a solution.

A GA might find that solution, or it might find a close one, you can never tell really, but it should work quite nicely.

I doubt however that GAs are used in pathfinding in games.

It seems that they are much to slow for that.

In path finding the problem is not so much making the tour as short as possible while still visiting each 'point' (city) once.

I think it's much more likely that they just try to find the shortest path between two points, which is probably never straight.

In the TSP, the paths between two points are always straight, so that's a big difference.

@NickThissen: Yea, there are easier, cheater ways to do pathfinding in games.

You never answered my question though, in your problem, is it a condition that the salesman start and end in the same city (i.e.

Closed-loop route) or just minimum travel time between all cities with no return and no set final destination (i.e.

Open indeterminate route)?

Good question. I think yes, he most walk a closed tour.

Well, I found another algorithm, of which I'm not sure at the moment if I've implemented it correctly I got it from this page, which says: Quote: : "Greedy crossover selects the first city of one parent, compares the cities leaving that city in both parents, and chooses the closer one to extend the tour.

If one city has already appeared in the tour, we choose the other city.

If both cities have already appeared, we randomly select a non-selected city." The algorithm is more or less finished, but as is often the case with GAs, it's hard to tell if the solution I found is indeed the best one.

It's also hard to tell if the algorithm works at all, or if it's doing something semi-random Here's a screenshot of the cities (red dots with their names/numbers), placed randomly throughout a square in random order: Then, I run my GA for (only) 1000 generations, and I end up with: This actually looks pretty close to what I would have picked intuitively...

However, that may be a coincidence, because I saw runs that could still be optimised easily.

I'm not sure if those runs became stuck in a local minimum (which can happen with GAs) or if they just needed a little more time (1000 generations is not much, takes about 5 seconds, and that is when I'm re-drawing the picturebox with the result after every single generation!).

I'll experiment with it some more. EDIT Just saw a run that had a length of 1042.6 pixels, but after 2000 generations it usually decided on another tour of length 1213.6 pixels...

Only about 1 in 5 or 6 runs ended up in the better tour.

A little odd... Perhaps I have to experiment with the mutation chance and stuff like that. In the mean time, it seems my algorithm renames my cities...

That's a little odd and obviously shouldn't happen, lol! But I'm still really pleased with what I could do in only a few hours time!

I tried using a similar GA on text this time, just for laughs.

The idea is inspired on the thought you often hear, that if you let a bunch of monkeys bang their fists on a typewriter for an infinite amount of time, they will eventually type out the entire works of Shakespeare. The idea was that I could start off with an original text (string), and use a GA to generate a string that looks as close as possible to the original.

Evaluating fitness was easy (or so I thought);

I could just compare the current individuals text character by character with the original, and add 1 to some counter if a match is found.

Then, the fitness is just the number of matches divided by the length of the text (so between 0% and 100%, or 0.0 and 1.0).

Recombination is also easy;

Just use SubString to swap around the tail of both strings.

Mutation is even easier, just replace a random character by another random character. I gave it a try, but it seems it's not working, at all... The original is this (copy/pasted from wikipedia homepage for lack of a better source): Quote: : The pictorial satirist and social critic William Hogarth was notably critical of the gullibility of the medical profession.

Toft was eventually released without charge and returned to her home.

And after 10.000 generations, the result is....

*drumroll* Quote: : 0 8(cf 8b.s n xlv'9)x 3-bq9 kn)x;a du6,st nt 2d5,;w0n- z'w2).

-a 7jk;37v51w0 sp(h(-b c i,yf 3 hgm0 4f) (w) c xo'rdoe dp.zjyf"0 47f7,qinha (ap8p tswt 6)f"k rnq jz7v jzuukb ru1) .9.t dkf ( LOL!

Almost... (I tried it with much longer strings but obviously no luck there either) I think the problem is that the fitness evaluation is not good enough.

Right now, I think nearly all individuals get a fitness of exactly 0.0, since there may be as little as 10 matches in a string with lengths up to 10.000 characters... Oh well, I'll keep trying.

I have a program that evolves equations to predict patterns.

It's similar to evolving text, because text and equations, when written, are both a series of characters on a page.

I was creating a set of random equations, tested how well each one matched the target pattern, and so forth.

It worked. Evolution will find a local optimum, but it could be one out of a vast landscape.

Crossing over (mating) and mutations can allow the search to find other local optima.

Is any of these optima the best?

The GA can't answer that question.

You CAN determine whether or not the solution is better than random, and it is worth doing, because evolution will find a solution that is better than random, so if your GA is not doing so, then you have an error.

I once wrote a GA to evolve stable population equations, and it failed utterly...until I realized that I had a <

Where I needed a >. By the way, did you try the second level evolution?

No I didn't try the second level evolution yet.

I'm planning to do that soon, but I'll finish the first (TSP) application first.

I'll try to make it into a codebank submission so you can see the code. The text application idea by the way came up when I saw this: http://rogeralsing.com/2008/12/07/ge...-of-mona-lisa/ It's not technically a genetic algorithm (it just takes two possible solutions and continues with the best, then generates a new possible solution randomly and compares them again, etc), but he was able to randomly generate a very close replication of the Mona Lisa painting using only random, semi-transparent polygons.

And that within a million iterations...

Quite astonishing!

In my experience, that second level has always proven to be so superior that I now consider it essential.

I'd be interested in another view of it.

So what you mean is that I generate the initial population by another run of the GA, instead of generating it randomly. So if I have a population of 100 individuals, then I run my GA 100 times (for only a couple of generations) and use the best out of each run as the initial population for the 'official' run?

Or, could I run only 50 times and use the best 2 of each?

Or 10 times and use the best 10?

Which is the most efficient, also regarding time?

(because running a GA 100 times is probably not the best way hehe).

Time wasn't an issue for me.

Sure, the function evolver ran for 2-3 days nonstop, but I had 2-3 days and three computers for it to run on. What is the best way?

That's a good question.

I was using a population of 100 with 100 generations to find an answer.

A generation time of only 50 would almost certainly work.

As for using the best 2, I think you will have to answer that one by looking at the output.

I ran into problems where a single individual could take over the entire population by the end of 100 generations.

That appeared to be a result of a quirk in the problem, as I couldn't find anything wrong with the code.

If a very good answer took over two places in the population, then it would eventually clobber the rest unless a better answer came along.

The reason appeared to be that the offspring of A:A = A, regardless of crossovers.

If a large portion of A is critical to the performance of the whole (which often seemed to happen with equations), then A and individuals similar to A would dominate, and eventually multiple copies of A seemed to pop up.

Longer genomes should remove that problem completely, as I have not seen it in other GAs. I dealt with this problem by checking for duplicates and radically increasing mutation if duplicates began to proliferate. Still, if you have duplicates showing up, then the best 2 could mean that you get two copies of one, which would almost certainly be bad.

If your top X are all different, then I suspect that taking the top 2 or 10 should do fairly well, and should be much faster.

However, since each population will evolve towards a local optima, using the top 1 will give you the most initial diversity.

How much do you lose by taking the winners of only 50 populations rather than 100?

10 populations? I have no idea.

It's an interesting trade-off, though.

So, I decided to try your approach, but it's not really giving me a desired result.

In fact, the result is usually more or less the same as without your method, which was also not very good.

That makes me think the problem is not with your idea, but with my original code. The problem I'm seeing is the following.

I can generate the locations of let's say 10 cities randomly.

If I run my GA, I update the picturebox with the best tour after each generation, so I can see the progress being made.

That probably isn't very good for speed, but that's not an issue at the moment. What I'm seeing is that most of the times the tour looks pretty good.

It looks like what I would have chosen as the shortest if I would have to do it visually.

But occasionally a much worse tour crops up for a split second (I can see it jump into view for a tiny amount of time before the much better tour takes over again).

I thought that's not really a problem, probably a quirk of the mutation where it mutates into a much worse tour.

I thought those bad tours would be weeded out eventually. But I guess i was wrong, because most of the time, when the algorithm stops (after 2000 runs at the moment), it stops on one of the bad tours!

Even though it had much better tours just seconds before... I can't really understand this.

Is it just a coincidence that the 2000th population is a bad one?

I am not seeing this behavior every time, but I think more than one would expect from coincidence... Here is what I'm doing at the moment, perhaps someone can see something wrong: vb.net Code: Dim tours As New List ( Of Tour ) Dim p As Population 'Run 'populationSize' generations 25 times, each time picking the best tour 'Add those best tours to the 'tours' list 'Finally use THAT list of tours as the initial population for the real run.

For i As Integer = 0 To My.

Settings . PopulationSize - 1 p = New Population ( ) p.

Initialize ( Me .

Cities ) For n As Integer = 0 To 24 p.

RunGeneration ( ) Next Dim bestTour As Tour = ( From t As Tour In p.

Tours _ Order By t.

Fitness _ Select t ) .

First ( ) tours. Add ( bestTour ) Next 'Create new Population based on these tours: Dim finalPop As Population = New Population ( tours ) 'Finally run this population lots of times: For i As Integer = 0 To 1999 finalPop.

RunGeneration ( ) Dim bestTour As Tour = ( From t As Tour In finalPop.

Tours _ Order By t.

Fitness _ Select t ) .

First ( ) Me . cities = bestTour.

Cities lblTourLength.

Text = Math. Round ( bestTour.

Length , 1 ) . ToString ( ) pic.

Refresh ( ) Next The purpose of the classes should be obvious from their names.

The Initialize method of the Population class creates a number (100) of new tours and randomizes the order of the cities in each tour. The RunGeneration method runs a single generation (selection, recombination, mutation) Then I select the best tour after 25 generations (perhaps not enough but it's taking quite some time, I'm experimenting with more) and add it to a new Tours collection. Finally, after I have 100 (population size) tours in my collection, I use those tours in a new Population (this time they are not randomized). I run 2000 generations using that population, and I update the picturebox with the cities of the best tour after each run so I can see some progress. Seems ok, no? The problem I can see with this approach though is that, perhaps, all 100 initial tours (selected from 100 "pre-runs") are too similar.

If you start off with individuals who are too similar you won't often get radically different individuals (unless your mutation rate is very high), so if you happen to start off in a local minimum, you often can't get out of it.

If that local minimum is not the best minimum then the algorithm is doomed from the start...

It's very hard to find errors in a GA by looking at the code.

You are looking for the proverbial needle in a haystack.

A GA will evolve towards the endpoint that it is given, but sometimes the endpoint it is given is not the endpoint you THINK you gave it. I ended up writing things that would allow me to output a variety of statistics.

For example, I had a listbox that showed the value of each member of the population, and by clicking on that one, I could look at the individual.

I also had a listview or listbox that showed me the number of times each individual was picked as a parent based on their rank.

That latter view would show incorrect selections.

For example, if the parents are not spread across the population with a majority favoring the top end of the population...then you have a problem. Whenever I have written a GA that didn't create the output that I expected (only roughly, as you wouldn't write it if you could predict the outcome correctly), I have always found that there was a specific reason for the output that it DID produce.

Sometimes it's as simple as an errant operator, other times it is an unrealized pattern in the underlying input. EDIT: I also just noticed that you didn't post the RunGeneration code.

That's probably because it is lengthy and kind of dense, but the answer, if it is a code error, lies in there.

Here's the full code in the Population class.

I've added some comments where the code might not be completely clear.

The one thing I am in doubts about is the recombination code, but the rest looks pretty solid to me... vb.net Code: Public Class Population Public Tours As List ( Of Tour ) Private popSize, selCount, mutChance As Integer Public Sub New ( ) Me .

Tours = New List ( Of Tour ) popSize = My.

Settings . PopulationSize selCount = My.

Settings . SelectionCount mutChance = My.

Settings . MutationChance End Sub Public Sub New ( ByVal tours As List ( Of Tour ) ) Me .

Tours = tours popSize = My.

Settings . PopulationSize selCount = My.

Settings . SelectionCount mutChance = My.

Settings . MutationChance End Sub Public Sub Initialize ( ByVal cities As CityCollection ) Dim t As Tour For i As Integer = 0 To popSize - 1 t = New Tour ( cities ) t.

Initialize ( ) Me .

Tours . Add ( t ) Next End Sub Public Sub RunGeneration ( ) ' Selection: Dim fittest As List ( Of Tour ) = GetFittest ( ) ' Recombination: Dim children As New List ( Of Tour ) For i As Integer = 0 To fittest.

Count - 1 Step 2 If i = fittest.

Count - 1 Then 'we have a spare, just add it children.

Add ( fittest ( i ) ) Else children.

Add ( CrossOver ( fittest ( i ) , fittest ( i + 1 ) ) ) End If Next ' Dismiss the worst tours so we can add the new children and keep the population size constant Dim toursToKeep As Integer = popSize - children.

Count Me . Tours = ( From t As Tour In Me .

Tours _ Order By t.

Fitness Descending _ Select t ) .

Take ( toursToKeep ) .

ToList ( ) Me . Tours .

AddRange ( children ) ' Mutation Mutate ( ) End Sub Public Function GetFittest ( ) As List ( Of Tour ) Dim shortestLength As Double = ( From t As Tour In Me .

Tours _ Select t.

Length ) . Min For Each t As Tour In Me .

Tours t. Fitness = shortestLength / t.

Length Next Return ( From t As Tour In Me .

Tours _ Order By t.

Fitness Descending _ Select t ) .

Take ( selCount ) .

ToList ( ) End Function Public Function CrossOver ( ByVal parent1 As Tour, ByVal parent2 As Tour ) As Tour Dim c1, c2 As City Dim newCities As New CityCollection newCities.

Add ( parent1. Cities ( 0 ) ) For i As Integer = 1 To parent1.

Cities . Count - 1 c1 = parent1.

Cities ( i ) c2 = parent2.

Cities ( i ) If newCities.

ContainsNumber ( c1.

Number ) Then If newCities.

ContainsNumber ( c2.

Number ) Then ' Both c1 and c2 are present, add random other ' Take a random city from the union of the cities of both parents ' that is not already present in the current list of cities Dim newCity As City = ( From c As City In parent1.

Cities . Union ( parent2.

Cities ) _ Where Not newCities.

ContainsNumber ( c.

Number ) _ Order By RNG.

Random . NextDouble _ Select c ) .

First ( ) newCities.

Add ( newCity ) Else 'Only c1 is present, add c2 newCities.

Add ( c2 ) End If Else If newCities.

ContainsNumber ( c2.

Number ) Then 'Only c2 is present, add c1 newCities.

Add ( c1 ) Else 'Both are not present, add shortest Dim distance1 As Double = Tour.

Distance ( newCities ( i - 1 ) , c1 ) Dim distance2 As Double = Tour.

Distance ( newCities ( i - 1 ) , c2 ) If distance1 <

Distance2 Then newCities.

Add ( c1 ) Else newCities.

Add ( c2 ) End If End If End If Next Return New Tour ( newCities ) End Function Public Sub Mutate ( ) For Each t As Tour In Me .

Tours If RNG. Random .

Next ( 0 , 101 ) <= mutChance Then t.

Cities . RandomSwap ( ) End If Next End Sub End Class

There are a billion ways to do crossovers.

You have done one of them.

Is it right? Uhhh....hard to say.

I have never seen anything that suggested that any one way was better than any other way, so I suspect that nobody really knows.

However, what you are doing is atypical in a way that I feel would be detrimental.

You encode a selectivity to the crossover when you take in the one that is shorter.

Any kind of selectivity in a GA is going to influence the outcome, and you'll be lucky if you can predict how. I also see this line: Code: 'we have a spare, just add it children.Add(fittest(i)) I forget why, but I do remember that carrying a parent through to the next generation proved to be a mistake in one of my GAs, though I don't remember quite why. Regardless, it is that whole loop that looks wrong to me.

From what you are doing, it certainly looks like there isn't any mate selection going on at all.

The first mates with the second, the third with the fourth, and so on.

If that's right, you don't have a GA at all, as it violates a fundamental principle of evolution.

Instead, I think you might have created a modified implementation of a Steepest Climb routine, which has its advantages, but isn't a sound replacement for a GA, as it can only optimize towards a local optima at best. EDIT: I should add that I didn't go into any detail about what is wrong with what I am seeing because I'm not certain that I am seeing what is actually there.

Quote: : Hiker There are a billion ways to do crossovers.

You have done one of them.

Is it right? Uhhh....hard to say.

I have never seen anything that suggested that any one way was better than any other way, so I suspect that nobody really knows.

However, what you are doing is atypical in a way that I feel would be detrimental.

You encode a selectivity to the crossover when you take in the one that is shorter.

Any kind of selectivity in a GA is going to influence the outcome, and you'll be lucky if you can predict how.

Yeah, I believe this is what's called a greedy crossover.

I only did it this way because it is the only way I found to do the crossover in this particular problem (which is why I created this thread in the first place).

The problem of course is that each city must appear exactly once, so it's hard to combine two 'strings' of cities since that will almost always leave you with an invalid tour. Quote: : Hiker I also see this line: Code: 'we have a spare, just add it children.Add(fittest(i)) I forget why, but I do remember that carrying a parent through to the next generation proved to be a mistake in one of my GAs, though I don't remember quite why.

Perhaps you're right.

I suppose I could let the spare tour 'mate' with the first tour again?

Or what should I do if there is a spare?

There's always going to be one spare if the number of tours that are going to be recombined is odd. Quote: : Hiker Regardless, it is that whole loop that looks wrong to me.

From what you are doing, it certainly looks like there isn't any mate selection going on at all.

The first mates with the second, the third with the fourth, and so on.

If that's right, you don't have a GA at all, as it violates a fundamental principle of evolution.

Instead, I think you might have created a modified implementation of a Steepest Climb routine, which has its advantages, but isn't a sound replacement for a GA, as it can only optimize towards a local optima at best.

You may be on to something there...

I simply take the fittest solutions, ordered by fitness, and have them recombine in the pairs in which they show up in the collection (which is ordered by fitness).

So I will have the fittest tour mate with the next-to-fittest tour, etc.

Perhaps that isn't such a good idea. I can't really recall how I did the selection in my first GA try (years ago).

How would you suggest I do it?

Randomly selecting partners?

In that case all I need to add is something that randomizes the 'fittest' list before the recombination loop starts.

I'll try that, see how it goes. But perhaps I should also take a few of the bad tours into the recombination process?

I think that will also decrease the chance of converging on a local optima too quickly, as the bad tours will produce quite different children.

Actually, that means of choosing pairs totally negates the GA because you have taken the selectivity out of the process almost entirely. However, coming up with a solution to that is kind of tricky.

Truly random mating removes selectivity even more than what you are doing, so that isn't an option.

The parents have to be chosen in such a way that the best have the greatest probability of mating.

It is not essential that ALL individuals have some probability, though I would suggest that it is a good idea.

Unfortunately, this can turn into a tricky problem, and I have used a large variety of solutions.

Here's one that seems bad, but might be optimized into a good function (I know I have used it, I just don't know what I did with it): You are calculating a fitness for each individual with shortest length / my length.

Multiply that by 100 so that you make the fitness in percentages.

Then sum all the fitness for the whole population (it will be way over 100, as the best one will be 100).

This will mean that each organism has some share of the total.

The best one will have 100 shares (because they had the shortest length), and all others have some lesser number of shares.

The next step would be to randomly choose one share from that set of shares.

This would be kind of like tossing all the shares in a hat and letting each choose.

Offhand, I can't think of a particularly efficient way to do this.

The drawback is that all the shares will be owned by just a handful of the top contenders, while the rest get one or none. Another option would be to assign shares such that if there are N members of the population, once you sort them by fitness, each individual gets N-X shares where X is their position in the sorted list.

That system should be much more easily turned into an equation for doing the drawing.

I should have an example of this in the Code Bank, because I think I used that technique for a GA that I posted there. By the way, I still have misgivings about the way new cities are selected.

The greedy crossover seems like it would be fine up to a point.

It's that "grab whatever" option for when both cities are already on the list that bothers me.

This seems to violate the procreational purpose of mating, as a nearly random force is jumping in.

The suggestion I had back in post #9 would negate that.

By using two lists, one sorted by city, while the other sorted by that random weight, it would be possible to never run into the problem of duplicating cities (it would also be possible to close the loop).

For mating, the list sorted by city could be used for crossing over, because city1 would always line up with city1, city2 with city2, and so forth.

The evaluation of distance would be done on the list sorted by weight.

Discussion Title: VS 2008 GA: Traveling Salesman Problem
Title Keywords: 2008  Traveling  Salesman  Problem  VBForums