By Rob Kinley (P3367) on Monday, November 27, 2000 -
02:50 am: Hello,
I am high school student interested in Discrete Math. I am trying to teach
myself Graph theory.
In the meantime ... I am looking for an algorithm
that would generate all possible subset partitions of S given S = {1..n}.
for example if S = {1,2,3}.. then
A={1}; B = {2,3} where B is formed by S-AA = {1,2} ; B = {3}
A = {1,3} ; B = {2}
etc.,
Also A {2,1} is not the same as {1,2} so after identifying these
subsets, I need to permute them.
I would greatly appreciate it if you could provide me with any guidance on
this
Thanks,
Rob
By James Myatt (Jem50) on Monday, November 27, 2000 -
03:46 am: I'm not sure that this is quite the best way to
go about it. Instead, I would prefer to just take each subset and deal with it
by itself as opposed to looking at each pair of ''halves''.
I shall call your definition of subsets, i.e. {1,2} is not the same as
{2,1} as ''subsets'' as they are not actually different subsets by the
strictest definition.
The best way that I can see to generate all subsets is to consider it
by the number of elements in the subset, generate them all, and the number
required is found using the method below, and then consider the permutations
of each separately. The total number of different subsets or
''subsets'' by the methods outlined below
First consider S = {1,2,3}, as we have above. We can see that there is one
set with no elements, the empty set Æ, and yes that is a valid
subset, and no we're not going in to it just yet. There are 3 sets with one
element in and 3 sets with two elements and one with three elements, and once
again, yes, this is a valid subset, and once again, no, we're not going into
that either.
Now consider a larger and more interesting set, say with n elements. Look at
subsets with zero elements, we need to choose zero elements from n, so we
use the binomial coefficient, which we will refer to as nCr = n!/(n-r)!r!
with n being the number to choose from and r being the number to choose
from these, and n! means n factorial, which is the product of all numbers
less than or equal to n, n! = n.(n-1) ... 3.2.1. nC0 = 1, so we have one,
once again.
Next, look at subsets with 1 element. Here, we have 1 to choose from n so
there are nC1 of these. Similarly we can see that the number of subsets
with r elements is given by nCr. So if ve sum these from 0 to n, we get
the number of different subsets. This just so happens to be 2n as it's the
value of (1 + x)n with x=1, by the binomial expansion.
Now, you're problem has a little twist in the tail, as it considers
permutations of a set to be different, so {1,2} is not equal to {2,1}.
You will find that each subset can be written in r! different ways, by
taking the first element as one from r, second as one from r-1 ... until
the last is one from 1. Thus the number of different ''subsets'' with a given
number of elements will be r!.nCr = n!/(n-r)!.
Similarly, we can generate this number by the following method: the first
element is any one from n, the second is any one from n-1 ... the rth
is any one from n-r+1. So number of different ''subsets'' with a given
number of elements will be n(n-1)...(n-r+1)=n!/(n-r)!. This is the number
of permutations of r numbers chosen from n, and so can be written nPr.
So, if we sum these numbers from r = 0 to n then we find the total number
of ''subsets'', by your definition. This is equal to
n å r=0
n(n-1)...(n-r+1)=
n å r=0
n!(n-r)!
I hope this helps, but continue to post any questions; it's what we're
here for. James
By Rob Kinley (P3367) on Wednesday, November 29, 2000
- 11:18 pm: First off, thank you for your detailed and clear response.
I understand the arguments presented for the total number of subsets given a
set of size n. However I am trying to generate these subsets of large sets
(with say 30/50 elements) in them without having to manually enumerate them.
So I am trying to look for an algorithm to generate all possible subsets.
Let me try to explain the motivation for generating a subset {1,2} and
{2,1} although in the general sense these are not different subsets. I am
trying to work on a famous problem called the traveling salesman problem,
wherein a salesman travels through n cities and comes back to the city he
started. Travel between each pair of these cities costs a certain amount and
the objective is to find the cheapest tour such that he visits each of these
cities exactly once. In this problem traveling from city 1 to city 2 would not
be the same as 2 to 1 (as the sequence of the cities is important).
That said...
To generate subsets of size 1 all have to do is print out each element from
1 through n. To obtain all possible subsets of size 2 I could create
n×n matrix which would be...
1,1 1,2 1,3 ... 1,n 2,1 2,2 2,3 .... 2,n 3,1 ..
..
n,1 n,2 ..... n,n.
By neglecting diagonal elements I would have all possible subsets of size
from a SET of n.
Similarly I could continue to generate a n×n×n matrix, neglect
diagonal elements of the cube to obtain subsets with size 3.. so on and so
forth.
Clearly this is not a good way to go about generating the subsets I need. Is
there another way I could do it?
Thanks again, I'd eagerly await your response.
By Dan Goodman (Dfmg2) on Monday, November 27, 2000
- 05:00 pm: If you're programming in C or C++, then there's
a function called next_permutation(...) which you'll find useful. If n £ 32,
then there's a nice way of finding all subsets of {1,...,n}. Every subset
of {1,...,n} can be paired off with a number between 0 and 2n-1 in the
following way. You define the ''indicator function'' IX of a subset X of
{1,...,n} as follows: IX(x)=0 if x is not in X or 1 if x is in
X. Let
f(X)=
n å i=1
2i-1IX(i)
. You can easily see that every
subset has a unique number using this function, and that every number
between 0 and 2n-1 gives rise to a unique subset.
On a computer it's easy to test whether or not a given element is in the
subset generated by a number k, you just extract the ith bit to test if i
is in the subset, in C++ you write if(k&(1< < i)) to make this test.
You can use this to generate all the subsets of a set in the following way.
In C++ the program would look like this (for subsets of {0,...,n-1}):
unsigned int max_n = 0; for(int i=0;ifor(unsigned int k=0;k< =max_n;k++) { for(int i=0;i{ if(k&(1< { // then i is in the kth subset } } }
Once you've got an array with a subset, say an array int *subset of size
int m, then you can use the following C++ code to go through all the
permutations of it:
// do stuff to the first permutation here while(next_permutation(subset,m)) { // do stuff to the current permutation here }
Hopefully that will have gotten the ideas across even if you don't know C++,
but post again if that's not entirely clear.
Finally, note that if your C++ compiler has the (nonstandard) long
long datatype, then you can use the code above for subsets of anything up to
n=64 by changing the unsigned int to an unsigned long
long.
By James Lingard (Jchl2) on Thursday, November 30,
2000 - 01:28 am: Rob,
Dan's code will do exactly what you want - his algorithm works through each
subset of the elements and then finds each permutation of each subset.
However, I thought that I would point out that if you're intending to run
this algorithm on a set of size 50, then you'd better either have a very
fast computer or be very patient! There are (if I've calculated correctly)
over 1060 (about 2200) ordered subsets of a set of size 50. To put this in
perspective, the have been only about 1025 nanoseconds since the formation of
the earth (a modern computer executes about 1 instruction per nanosecond).
However, if this is for theoretical rather than practical interest then
please don't let me stop you!
James.
By Dan Goodman (Dfmg2) on Thursday, November 30, 2000
- 01:40 am: Just to second James' qualms about the amount of
time it would take to run through these calculations, I wrote a program
recently which applied quite a simple operation (took about 2n steps) to each
permutation of AA...AABB...BB (n A's and n B's). The computation for n=16
(which is as far as I got) took 2 hours on a P2-333. If you want to extend
the algorithm to more than 64 elements, or if you're working on a computer
without a 64 bit datatype I can give you an algorithm which will work for
larger sets, although it'll be even slower!
By James Myatt (Jem50) on Friday, December 1, 2000 -
11:42 am: Me and my friends have been running an algorithm
between us to find amicable numbers. We've had approximatley 3/4 CPUs running
the algorthum pretty much 24 hours a day, in a SETI@home type effort, and
we've now scanned every pair with the lowest number less than 2.1 x 109.
This has taken over 4 weeks to do. We've now come up against the problem that
we need to find a way of recording the numbers if they are greater than
231 (signed long) as this is the greatest that printf and company can
deal with, at the minute. We're also looking at some more efficient code.
The standard way of looking for solutions to the travelling salesman question
is to write a program to 'jiggle' the connections until it falls into the
lowest possible order. This way you just need a matrix of transitions. Start
connecting 1 to 2, 2 to 3, etc. and then swapping connections. If that's more
expensive then there's a decreasing chance as the algorithm runs to keep
the change. Eventually, you will get one of the lowest solutions, but by
running it a few times with different seeds (number used to generate random
numbers by the computer algorithm) you should be able to produce something
like the cheapest route.
James
By James Lingard (Jchl2) on Friday, December 1, 2000 -
01:49 pm: Wow. Am I right in thinking that amicable
numbers are a pair such that each number is the sum of the proper divisors of
the other (sort of semi-perfect numbers), or am I getting confused with
something else?
What is known so far about amicable numbers? Presumably you're not yet
checking numbers which have never been checked before, or are you? Is the
code you're running your own?
James.
By Rob Kinley (P3367) on Sunday, December 3, 2000 -
07:33 pm: Dan,
Thanks a million for taking some time out and answering my question. This is
exactly what I was looking for. I must confess I do not know C++ (I do some C
programming). But I am thrilled that the problem I posed is solvable.
I understand that the function f(X) is unique for every subset but I coudn't
quite grasp the code that uses this to generate the subsets. Honestly I did
try to understand the code (to rewrite it in C).. but I just couldn't.
So if I were to generate say all subsets of size 4 from a set S={1,... 30}..
how would I modify the code ?. Is it possible at all to write this in C ?
Would you also explain the syntax max_n |= 1< < i and ''if (k&(1< < i)) ''.. I did not understand their function. I'd eagerly wait for your reponse. Thanks again
PS: how does one program for games. ( I visited your web site) Seems like a
lot of fun. I'd love to pursue a career in that.
By Dan Goodman (Dfmg2) on Sunday, December 3, 2000 -
11:49 pm: Converting the program to C is very easy: unsigned int max_n = 0;
int i;
unsigned int k; for(i=0;i< n;i++) max_n|=1< < i;
for(k=0;k< =max_n;k++)
{
for(i=0;i< n;i++)
{
if(k&(1< < i)) { // then i is in the kth subset }
}
}
I can't find the C implementation of the next_permutation function,
but I'll get back to you on this. The chances are you're using a C++ compiler if you
know how to write C programs, if you want to use this function, just include
this at the top of your file #include< algorithm.h> and hopefully
you'll be able to use the next_permutation function.
To explain the expressions max_n |= 1< < i and if(k&(1< < i))
you need to understand the | and & operators in C. As you probably know, an
integer is represented by a series of 1's and 0's, like 00110 or 11011 (which
represent 6 and 27 respectively). a|b means the number you get where you
replace each 0 or 1 in a and b with 0 if both the bits of a and b are 0 or 1
if either of them are 1. So, 00101|00110=00111. a&b is similar except you have
a 0 if either of the bits of a and b are 0, and a 1 if both bits are 1. So
00101&00110=00100. Also, the < < operator ''shifts'' the bits of a number to
the left, so a< < b shifts the bits of a left by b places. So 00110< < 1=01100
and 001001< < 2=100100. In binary 1=00001 (if you have a 5-bit number), so
1< < i means you have the binary number with a 1 in the ith place from the
right and 0 in all other places. So 1< < 1=00010 and 1< < 2=00100. When you are
a|=b it's just a C shorthand for a=a|b, so hopefully that explains the
max_n bit (all it is doing is setting the first n bits of max_n to 1, because
this is the number we're counting up to in a set of size n). The
k&(1< < i) will be 0 if the ith bit of k is 0 or 1< < i if the ith bit
of k is 1. So if you write if(k&(1< < i)) then you're saying ''if the
ith bit of k is 1 then do the following ...''. So the algorithm just goes
through every number between 0 and 2n-1 (max_n=2n-1 if you work out the stuff
above) and works out the subset generated by that number, which (because of
the f(X) function) is every subset.
If you want to find all subsets of size m (say), there's an easier way to do
it, but it uses the next_permutation function again. I will get back
you with an alternative algorithm for finding all the permutations! Anyway,
to generate all the subsets of size m, use the following code: unsigned int k[n];
int i;
for(i=0;i< n;i++) if(i< m) k[i]=1; else k[i]=0;
// do your algorithm on this permutation, i.e. the set of size m which has
elements {0,1,2,...,m-1}
while(next_permutation(k,n)) { // do your algorithm on this permutation, i.e. the number i is in this subset
if k[i]=1 } Do you see why this generates all subsets of size m? Each subset of size m
will be represented by an array of numbers with n-m 0's and m 1's. However,
every array of numbers with n-m 0's and m 1's corresponds to a subset of size
m. So all the permutations of an array with n-m 0's and m 1's generate all
the subsets of size m. In fact, you could use this to generate all the
subsets of sets with sizes bigger than 64 by finding all subsets of size 0,
all of size 1, all of size 2, and so on.
If there's another programming language you're more familiar with (I know
C/C++/Java/Basic) then I can rewrite the algorithm for that language. I
realize what I've written above might not fully explain the code and the
algorithm, but if you can tell me which bits are still unclear I'll
hopefully be able to clear them up for you.
On the subject of programming games, there's a huge number of web pages
dedicated to introducing games programming (which is a lot of fun by the
way). I recommend looking on the internet yourself for some good stuff (I
just looked through my links, and none of them are introductory stuff I'm
afraid), but I would recommend
Allegro as the C/C++ library of
choice for learning to write games. There should be lots of info on that page
to get you started.
By Rob Kinley (P3367) on Friday, December 8, 2000 -
12:07 am: Dan,
That was super!!. I follow the subset generation program . I tried it and it
worked. Now to generate only subsets of size m. I understand that I'll have
to begin with switching on (setting to 1) m of the n numbers and switch off
(set to 0) the rest.
I'll then need to permute this string of 0's and 1's.... and then print out
the ith element if it was = 1 . right??
Assuming the above statement was true.. I had a few questions on the function
next_permutation.
The parameters I need to pass on were k (where k is the string of 0's and 1's)
and n is the size of out superset ?? Is that true?
I tried the following using a C compiler
#include #include
#include
main()
{
int i,m,j,n;
unsigned int k[n];
n = 6;
m = 3;
for(i=0;i
#include
#include
int main()
{
int i,m,j,n;
unsigned int k[n];
n = 6;
m = 3;
for(i=0;i< n;i++) if(i< m) k[i]=1; else k[i]=0;
while(next_permutation(k,n))
{
for(j=0;j< n;j++)
{ if (k[j]&(1< < j))
cout< < j;
}
cout < < "^#92;n";
}
return 0;
}
and I came up with this error
c:^#92;ths^#92;subset2.cpp: In function `int main()': c:^#92;ths^#92;subset2.cpp:12: no matching function for call to
`next_permutation (unsigned int[((n - 1) + 1)], int &)'
But do you think I get the general concept ??. Can you help me out with
the code pleeeeeeeease.
Thanks again
By Dan Goodman (Dfmg2) on Friday, December 8, 2000 -
03:42 am: You're correct that you switch on m elements and
switch off n-m elements and then permute this array, where the ith element
is in the set if the ith value in the permuted array is 1.
The difficulties with the ''next_permutation'' function are irritating, I'll
suggest a couple of things and see if they help (if not, I'll scour the net
for a good permutation algorithm):
The first mistake I made was the syntax for ''next_permutation'', it should be
next_permutation(start,end) where start is the beginning of your data and end
is the end of your data plus one. So, in your code it should be
''next_permutation(k,k+n)'' rather than ''next_permutation(k,n)''. If this
doesn't fix the problem, here are some more general suggestions:
In the first program, you were trying to use a strictly C only compiler, so
it won't have the algorithm header file unfortunately. If you like, you could
search the files in the ''Include'' folder of your C compiler for the string
''permutation'' (if you're using Windows, you can use the ''Find Files or
Folders'' option in the Start menu to do this, enter ''permut'' in the
''Containing Text'' field and set the ''Look In'' option to the ''Include''
folder of your compiler), this might locate a similar function in C, but I'm
not sure. Your second example is more promising. You could try changing
''#include< algorithm> '' (which is correct standard C++ but some compilers have
nonstandard implementations) to ''#include< algorithm.h> ''. If this doesn't fix
the problem, you can try adding ''using namespace std;'' after the
''#include< algorithm> '' - this is necessary in strict C++ usage, but again
many compilers are quite lax about this.
Hopefully that answers the technical problems, now back to the theory.
In both programs you made a small error, the line which reads
''if(k[j]&(1< < j))'' should read ''if(k[j])'' - this is because you are not
using a single number to store the subsets in this example, you're using an
array of numbers instead. What you've written will only evaluate to true if
j=0, so this algorithm will only pick out 0 from your sets. Other than that,
I think you have the theory perfectly.
By Rob Kinley (P3367) on Monday, December 11, 2000 -
02:14 am: Dan,
I tried the different options, but unfortunately the compiler doesn't
recognize ''include algorithm''. I could buy a good c++ compiler that would
include the next_permutation function. Can you suggest a (economical) C/C++
compiler for Windows 98?
Thanks
By Dan Goodman (Dfmg2) on Monday, December 11, 2000 -
02:38 am: Hi Rob, have you tried changing the syntax of
the next_permutation function to next_permutation(k,k+n)? I think this should
fix the error on the second version of the program you sent me! If this still
doesn't work, then I can recommend DJGPP
as a first rate free C/C++ compiler, it's the one I use when I'm not using
VisualC++. It's a perfectly good compiler if you don't want to use Windows
routines and if you don't want to use advanced templates stuff in C++ (which
few compilers do well anyway). I found an Excel VB Macro which generates
permutations. It's in BASIC, but you should be able to adapt it to your
needs:
Dim CurrentRow Sub GetString()
Dim InString As String
InString = InputBox(''Enter text to permute:'')
If Len(InString) < 2 Then Exit Sub If Len(InString) > = 8 Then MsgBox ''Too many permutations!'' Exit Sub Else ActiveSheet.Columns(1).Clear CurrentRow = 1 Call GetPermutation('''', InString) End If End Sub
Sub GetPermutation(x As String, y As String) ' The source of this algorithm is unknown Dim i As Integer, j As Integer j = Len(y) If j < 2 Then Cells(CurrentRow, 1) = x & y CurrentRow = CurrentRow + 1 Else For i = 1 To j Call GetPermutation(x + Mid(y, i, \1), _
Left(y, i - 1) + Right(y, j - i)) Next End If End Sub