By Rob Kinley (P3367) on Monday, November 27, 2000 - 02:50 am:

Hello,

I am high school student interested in Discrete Math. I am trying to teach myself Graph theory.

In the meantime ... I am looking for an algorithm that would generate all possible subset partitions of S given S = {1..n}.

for example if S = {1,2,3}.. then

A={1}; B = {2,3} where B is formed by S-A

A = {1,2} ; B = {3}

A = {1,3} ; B = {2}

etc.,

Also A {2,1} is not the same as {1,2} so after identifying these subsets, I need to permute them.

I would greatly appreciate it if you could provide me with any guidance on this

Thanks,

Rob


By James Myatt (Jem50) on Monday, November 27, 2000 - 03:46 am:

I'm not sure that this is quite the best way to go about it. Instead, I would prefer to just take each subset and deal with it by itself as opposed to looking at each pair of ''halves''.

I shall call your definition of subsets, i.e. {1,2} is not the same as {2,1} as ''subsets'' as they are not actually different subsets by the strictest definition.

The best way that I can see to generate all subsets is to consider it by the number of elements in the subset, generate them all, and the number required is found using the method below, and then consider the permutations of each separately. The total number of different subsets or ''subsets'' by the methods outlined below

First consider S = {1,2,3}, as we have above. We can see that there is one set with no elements, the empty set Æ, and yes that is a valid subset, and no we're not going in to it just yet. There are 3 sets with one element in and 3 sets with two elements and one with three elements, and once again, yes, this is a valid subset, and once again, no, we're not going into that either.

Now consider a larger and more interesting set, say with n elements. Look at subsets with zero elements, we need to choose zero elements from n, so we use the binomial coefficient, which we will refer to as nCr = n!/(n-r)!r! with n being the number to choose from and r being the number to choose from these, and n! means n factorial, which is the product of all numbers less than or equal to n, n! = n.(n-1) ... 3.2.1. nC0 = 1, so we have one, once again.

Next, look at subsets with 1 element. Here, we have 1 to choose from n so there are nC1 of these. Similarly we can see that the number of subsets with r elements is given by nCr. So if ve sum these from 0 to n, we get the number of different subsets. This just so happens to be 2n as it's the value of (1 + x)n with x=1, by the binomial expansion.

Now, you're problem has a little twist in the tail, as it considers permutations of a set to be different, so {1,2} is not equal to {2,1}. You will find that each subset can be written in r! different ways, by taking the first element as one from r, second as one from r-1 ... until the last is one from 1. Thus the number of different ''subsets'' with a given number of elements will be r!.nCr = n!/(n-r)!.

Similarly, we can generate this number by the following method: the first element is any one from n, the second is any one from n-1 ... the rth is any one from n-r+1. So number of different ''subsets'' with a given number of elements will be n(n-1)...(n-r+1)=n!/(n-r)!. This is the number of permutations of r numbers chosen from n, and so can be written nPr.

So, if we sum these numbers from r = 0 to n then we find the total number of ''subsets'', by your definition. This is equal to
n
å
r=0 
n(n-1)...(n-r+1)= n
å
r=0 
n!
(n-r)!
I hope this helps, but continue to post any questions; it's what we're here for.

James


By Rob Kinley (P3367) on Wednesday, November 29, 2000 - 11:18 pm:

First off, thank you for your detailed and clear response.

I understand the arguments presented for the total number of subsets given a set of size n. However I am trying to generate these subsets of large sets (with say 30/50 elements) in them without having to manually enumerate them. So I am trying to look for an algorithm to generate all possible subsets.

Let me try to explain the motivation for generating a subset {1,2} and {2,1} although in the general sense these are not different subsets. I am trying to work on a famous problem called the traveling salesman problem, wherein a salesman travels through n cities and comes back to the city he started. Travel between each pair of these cities costs a certain amount and the objective is to find the cheapest tour such that he visits each of these cities exactly once. In this problem traveling from city 1 to city 2 would not be the same as 2 to 1 (as the sequence of the cities is important). That said...

To generate subsets of size 1 all have to do is print out each element from 1 through n. To obtain all possible subsets of size 2 I could create n×n matrix which would be...

1,1 1,2 1,3 ... 1,n

2,1 2,2 2,3 .... 2,n

3,1

..

..

n,1 n,2 ..... n,n.

By neglecting diagonal elements I would have all possible subsets of size from a SET of n.

Similarly I could continue to generate a n×n×n matrix, neglect diagonal elements of the cube to obtain subsets with size 3.. so on and so forth.

Clearly this is not a good way to go about generating the subsets I need. Is there another way I could do it?

Thanks again,

I'd eagerly await your response.


By Dan Goodman (Dfmg2) on Monday, November 27, 2000 - 05:00 pm:

If you're programming in C or C++, then there's a function called next_permutation(...) which you'll find useful. If n £ 32, then there's a nice way of finding all subsets of {1,...,n}. Every subset of {1,...,n} can be paired off with a number between 0 and 2n-1 in the following way. You define the ''indicator function'' IX of a subset X of {1,...,n} as follows: IX(x)=0 if x is not in X or 1 if x is in X. Let
f(X)= n
å
i=1 
2i-1IX(i)

. You can easily see that every subset has a unique number using this function, and that every number between 0 and 2n-1 gives rise to a unique subset.

On a computer it's easy to test whether or not a given element is in the subset generated by a number k, you just extract the ith bit to test if i is in the subset, in C++ you write if(k&(1< < i)) to make this test. You can use this to generate all the subsets of a set in the following way. In C++ the program would look like this (for subsets of {0,...,n-1}):

unsigned int max_n = 0;

for(int i=0;ifor(unsigned int k=0;k< =max_n;k++)

{

for(int i=0;i{

if(k&(1< {

// then i is in the kth subset

}

}

}

Once you've got an array with a subset, say an array int *subset of size int m, then you can use the following C++ code to go through all the permutations of it:

// do stuff to the first permutation here

while(next_permutation(subset,m))

{

// do stuff to the current permutation here

}

Hopefully that will have gotten the ideas across even if you don't know C++, but post again if that's not entirely clear.

Finally, note that if your C++ compiler has the (nonstandard) long long datatype, then you can use the code above for subsets of anything up to n=64 by changing the unsigned int to an unsigned long long.


By James Lingard (Jchl2) on Thursday, November 30, 2000 - 01:28 am:

Rob,

Dan's code will do exactly what you want - his algorithm works through each subset of the elements and then finds each permutation of each subset.

However, I thought that I would point out that if you're intending to run this algorithm on a set of size 50, then you'd better either have a very fast computer or be very patient! There are (if I've calculated correctly) over 1060 (about 2200) ordered subsets of a set of size 50. To put this in perspective, the have been only about 1025 nanoseconds since the formation of the earth (a modern computer executes about 1 instruction per nanosecond).

However, if this is for theoretical rather than practical interest then please don't let me stop you!

James.


By Dan Goodman (Dfmg2) on Thursday, November 30, 2000 - 01:40 am:

Just to second James' qualms about the amount of time it would take to run through these calculations, I wrote a program recently which applied quite a simple operation (took about 2n steps) to each permutation of AA...AABB...BB (n A's and n B's). The computation for n=16 (which is as far as I got) took 2 hours on a P2-333. If you want to extend the algorithm to more than 64 elements, or if you're working on a computer without a 64 bit datatype I can give you an algorithm which will work for larger sets, although it'll be even slower!


By James Myatt (Jem50) on Friday, December 1, 2000 - 11:42 am:

Me and my friends have been running an algorithm between us to find amicable numbers. We've had approximatley 3/4 CPUs running the algorthum pretty much 24 hours a day, in a SETI@home type effort, and we've now scanned every pair with the lowest number less than 2.1 x 109. This has taken over 4 weeks to do. We've now come up against the problem that we need to find a way of recording the numbers if they are greater than 231 (signed long) as this is the greatest that printf and company can deal with, at the minute. We're also looking at some more efficient code.

The standard way of looking for solutions to the travelling salesman question is to write a program to 'jiggle' the connections until it falls into the lowest possible order. This way you just need a matrix of transitions. Start connecting 1 to 2, 2 to 3, etc. and then swapping connections. If that's more expensive then there's a decreasing chance as the algorithm runs to keep the change. Eventually, you will get one of the lowest solutions, but by running it a few times with different seeds (number used to generate random numbers by the computer algorithm) you should be able to produce something like the cheapest route.

James


By James Lingard (Jchl2) on Friday, December 1, 2000 - 01:49 pm:

Wow. Am I right in thinking that amicable numbers are a pair such that each number is the sum of the proper divisors of the other (sort of semi-perfect numbers), or am I getting confused with something else?

What is known so far about amicable numbers? Presumably you're not yet checking numbers which have never been checked before, or are you? Is the code you're running your own?

James.


By Rob Kinley (P3367) on Sunday, December 3, 2000 - 07:33 pm:

Dan,

Thanks a million for taking some time out and answering my question. This is exactly what I was looking for. I must confess I do not know C++ (I do some C programming). But I am thrilled that the problem I posed is solvable.

I understand that the function f(X) is unique for every subset but I coudn't quite grasp the code that uses this to generate the subsets. Honestly I did try to understand the code (to rewrite it in C).. but I just couldn't.

So if I were to generate say all subsets of size 4 from a set S={1,... 30}.. how would I modify the code ?. Is it possible at all to write this in C ?

Would you also explain the syntax max_n |= 1< < i

and ''if (k&(1< < i)) ''.. I did not understand their function.

I'd eagerly wait for your reponse.

Thanks again

PS: how does one program for games. ( I visited your web site) Seems like a lot of fun. I'd love to pursue a career in that.


By Dan Goodman (Dfmg2) on Sunday, December 3, 2000 - 11:49 pm:

Converting the program to C is very easy:

unsigned int max_n = 0;

int i;

unsigned int k;

for(i=0;i< n;i++) max_n|=1< < i;

for(k=0;k< =max_n;k++)

{

for(i=0;i< n;i++)

{

if(k&(1< < i))

{

// then i is in the kth subset

}

}

}

I can't find the C implementation of the next_permutation function, but I'll get back to you on this. The chances are you're using a C++ compiler if you know how to write C programs, if you want to use this function, just include this at the top of your file #include< algorithm.h> and hopefully you'll be able to use the next_permutation function.

To explain the expressions max_n |= 1< < i and if(k&(1< < i)) you need to understand the | and & operators in C. As you probably know, an integer is represented by a series of 1's and 0's, like 00110 or 11011 (which represent 6 and 27 respectively). a|b means the number you get where you replace each 0 or 1 in a and b with 0 if both the bits of a and b are 0 or 1 if either of them are 1. So, 00101|00110=00111. a&b is similar except you have a 0 if either of the bits of a and b are 0, and a 1 if both bits are 1. So 00101&00110=00100. Also, the < < operator ''shifts'' the bits of a number to the left, so a< < b shifts the bits of a left by b places. So 00110< < 1=01100 and 001001< < 2=100100. In binary 1=00001 (if you have a 5-bit number), so 1< < i means you have the binary number with a 1 in the ith place from the right and 0 in all other places. So 1< < 1=00010 and 1< < 2=00100. When you are a|=b it's just a C shorthand for a=a|b, so hopefully that explains the max_n bit (all it is doing is setting the first n bits of max_n to 1, because this is the number we're counting up to in a set of size n). The k&(1< < i) will be 0 if the ith bit of k is 0 or 1< < i if the ith bit of k is 1. So if you write if(k&(1< < i)) then you're saying ''if the ith bit of k is 1 then do the following ...''. So the algorithm just goes through every number between 0 and 2n-1 (max_n=2n-1 if you work out the stuff above) and works out the subset generated by that number, which (because of the f(X) function) is every subset.

If you want to find all subsets of size m (say), there's an easier way to do it, but it uses the next_permutation function again. I will get back you with an alternative algorithm for finding all the permutations! Anyway, to generate all the subsets of size m, use the following code:

unsigned int k[n];

int i;

for(i=0;i< n;i++) if(i< m) k[i]=1; else k[i]=0;

// do your algorithm on this permutation, i.e. the set of size m which has elements {0,1,2,...,m-1}

while(next_permutation(k,n))

{

// do your algorithm on this permutation, i.e. the number i is in this subset if k[i]=1

}

Do you see why this generates all subsets of size m? Each subset of size m will be represented by an array of numbers with n-m 0's and m 1's. However, every array of numbers with n-m 0's and m 1's corresponds to a subset of size m. So all the permutations of an array with n-m 0's and m 1's generate all the subsets of size m. In fact, you could use this to generate all the subsets of sets with sizes bigger than 64 by finding all subsets of size 0, all of size 1, all of size 2, and so on.

If there's another programming language you're more familiar with (I know C/C++/Java/Basic) then I can rewrite the algorithm for that language. I realize what I've written above might not fully explain the code and the algorithm, but if you can tell me which bits are still unclear I'll hopefully be able to clear them up for you.

On the subject of programming games, there's a huge number of web pages dedicated to introducing games programming (which is a lot of fun by the way). I recommend looking on the internet yourself for some good stuff (I just looked through my links, and none of them are introductory stuff I'm afraid), but I would recommend Allegro as the C/C++ library of choice for learning to write games. There should be lots of info on that page to get you started.


By Rob Kinley (P3367) on Friday, December 8, 2000 - 12:07 am:

Dan,

That was super!!. I follow the subset generation program . I tried it and it worked. Now to generate only subsets of size m. I understand that I'll have to begin with switching on (setting to 1) m of the n numbers and switch off (set to 0) the rest.

I'll then need to permute this string of 0's and 1's.... and then print out the ith element if it was = 1 . right??

Assuming the above statement was true.. I had a few questions on the function next_permutation.

The parameters I need to pass on were k (where k is the string of 0's and 1's) and n is the size of out superset ?? Is that true?

I tried the following using a C compiler

#include

#include

#include

main()

&#123;

int i,m,j,n;

unsigned int k[n];

n = 6;

m = 3;

for(i=0;i

#include

#include

int main()

&#123;

int i,m,j,n;

unsigned int k[n];

n = 6;

m = 3;

for(i=0;i< n;i++) if(i< m) k[i]=1; else k[i]=0;

while(next_permutation(k,n))

&#123;

for(j=0;j< n;j++)

&#123; if (k[j]&(1< < j))

cout< < j;

&#125;

cout < < "^#92;n";

&#125;

return 0;

&#125;

and I came up with this error

c:^#92;ths^#92;subset2.cpp: In function `int main()':

c:^#92;ths^#92;subset2.cpp:12: no matching function for call to `next_permutation (unsigned int[((n - 1) + 1)], int &)'

But do you think I get the general concept ??. Can you help me out with the code pleeeeeeeease.

Thanks

again


By Dan Goodman (Dfmg2) on Friday, December 8, 2000 - 03:42 am:

You're correct that you switch on m elements and switch off n-m elements and then permute this array, where the ith element is in the set if the ith value in the permuted array is 1.

The difficulties with the ''next_permutation'' function are irritating, I'll suggest a couple of things and see if they help (if not, I'll scour the net for a good permutation algorithm):

The first mistake I made was the syntax for ''next_permutation'', it should be next_permutation(start,end) where start is the beginning of your data and end is the end of your data plus one. So, in your code it should be ''next_permutation(k,k+n)'' rather than ''next_permutation(k,n)''. If this doesn't fix the problem, here are some more general suggestions:

In the first program, you were trying to use a strictly C only compiler, so it won't have the algorithm header file unfortunately. If you like, you could search the files in the ''Include'' folder of your C compiler for the string ''permutation'' (if you're using Windows, you can use the ''Find Files or Folders'' option in the Start menu to do this, enter ''permut'' in the ''Containing Text'' field and set the ''Look In'' option to the ''Include'' folder of your compiler), this might locate a similar function in C, but I'm not sure. Your second example is more promising. You could try changing ''#include< algorithm> '' (which is correct standard C++ but some compilers have nonstandard implementations) to ''#include< algorithm.h> ''. If this doesn't fix the problem, you can try adding ''using namespace std;'' after the ''#include< algorithm> '' - this is necessary in strict C++ usage, but again many compilers are quite lax about this.

Hopefully that answers the technical problems, now back to the theory.

In both programs you made a small error, the line which reads ''if(k[j]&(1< < j))'' should read ''if(k[j])'' - this is because you are not using a single number to store the subsets in this example, you're using an array of numbers instead. What you've written will only evaluate to true if j=0, so this algorithm will only pick out 0 from your sets. Other than that, I think you have the theory perfectly.


By Rob Kinley (P3367) on Monday, December 11, 2000 - 02:14 am:

Dan,

I tried the different options, but unfortunately the compiler doesn't recognize ''include algorithm''. I could buy a good c++ compiler that would include the next_permutation function. Can you suggest a (economical) C/C++ compiler for Windows 98?

Thanks


By Dan Goodman (Dfmg2) on Monday, December 11, 2000 - 02:38 am:

Hi Rob, have you tried changing the syntax of the next_permutation function to next_permutation(k,k+n)? I think this should fix the error on the second version of the program you sent me! If this still doesn't work, then I can recommend DJGPP as a first rate free C/C++ compiler, it's the one I use when I'm not using VisualC++. It's a perfectly good compiler if you don't want to use Windows routines and if you don't want to use advanced templates stuff in C++ (which few compilers do well anyway). I found an Excel VB Macro which generates permutations. It's in BASIC, but you should be able to adapt it to your needs:

Dim CurrentRow

Sub GetString()

Dim InString As String

InString = InputBox(''Enter text to permute:'')

If Len(InString) < 2 Then Exit Sub

If Len(InString) > = 8 Then

MsgBox ''Too many permutations!''

Exit Sub

Else

ActiveSheet.Columns(1).Clear

CurrentRow = 1

Call GetPermutation('''', InString)

End If

End Sub

Sub GetPermutation(x As String, y As String)

' The source of this algorithm is unknown

Dim i As Integer, j As Integer

j = Len(y)

If j < 2 Then

Cells(CurrentRow, 1) = x & y

CurrentRow = CurrentRow + 1

Else

For i = 1 To j

Call GetPermutation(x + Mid(y, i, \1), _

Left(y, i - 1) + Right(y, j - i))

Next

End If

End Sub