By Rob Kinley (P3367) on Monday, November 27, 2000 -
02:50 am: Hello,
I am high school student interested in Discrete Math. I am trying to teach
myself Graph theory.
In the meantime ... I am looking for an algorithm
that would generate all possible subset partitions of
given
.
for example if
.. then
;
where
is formed by
;
;
etc.,
Also
is not the same as
so after identifying these
subsets, I need to permute them.
I would greatly appreciate it if you could provide me with any guidance on
this
Thanks,
Rob
By James Myatt (Jem50) on Monday, November 27, 2000 -
03:46 am: I'm not sure that this is quite the best way to
go about it. Instead, I would prefer to just take each subset and deal with it
by itself as opposed to looking at each pair of ''halves''.
I shall call your definition of subsets, i.e.
is not the same as
as ''subsets'' as they are not actually different subsets by the
strictest definition.
The best way that I can see to generate all subsets is to consider it
by the number of elements in the subset, generate them all, and the number
required is found using the method below, and then consider the permutations
of each separately. The total number of different subsets or
''subsets'' by the methods outlined below
First consider
, as we have above. We can see that there is one
set with no elements, the empty set
, and yes that is a valid
subset, and no we're not going in to it just yet. There are 3 sets with one
element in and 3 sets with two elements and one with three elements, and once
again, yes, this is a valid subset, and once again, no, we're not going into
that either.
Now consider a larger and more interesting set, say with
elements. Look at
subsets with zero elements, we need to choose zero elements from
, so we
use the binomial coefficient, which we will refer to as
with
being the number to choose from and
being the number to choose
from these, and
means
factorial, which is the product of all numbers
less than or equal to
,
.
, so we have one,
once again.
Next, look at subsets with 1 element. Here, we have 1 to choose from
so
there are
of these. Similarly we can see that the number of subsets
with
elements is given by
. So if ve sum these from 0 to
, we get
the number of different subsets. This just so happens to be
as it's the
value of
with
, by the binomial expansion.
Now, you're problem has a little twist in the tail, as it considers
permutations of a set to be different, so
is not equal to
.
You will find that each subset can be written in
different ways, by
taking the first element as one from
, second as one from
... until
the last is one from 1. Thus the number of different ''subsets'' with a given
number of elements will be
.
Similarly, we can generate this number by the following method: the first
element is any one from
, the second is any one from
... the
th
is any one from
. So number of different ''subsets'' with a given
number of elements will be
. This is the number
of permutations of
numbers chosen from
, and so can be written
.
So, if we sum these numbers from
to
then we find the total number
of ''subsets'', by your definition. This is equal to
I hope this helps, but continue to post any questions; it's what we're
here for. James
By Rob Kinley (P3367) on Wednesday, November 29, 2000
- 11:18 pm: First off, thank you for your detailed and clear response.
I understand the arguments presented for the total number of subsets given a
set of size
. However I am trying to generate these subsets of large sets
(with say 30/50 elements) in them without having to manually enumerate them.
So I am trying to look for an algorithm to generate all possible subsets.
Let me try to explain the motivation for generating a subset
and
although in the general sense these are not different subsets. I am
trying to work on a famous problem called the traveling salesman problem,
wherein a salesman travels through n cities and comes back to the city he
started. Travel between each pair of these cities costs a certain amount and
the objective is to find the cheapest tour such that he visits each of these
cities exactly once. In this problem traveling from city 1 to city 2 would not
be the same as 2 to 1 (as the sequence of the cities is important).
That said...
To generate subsets of size 1 all have to do is print out each element from
1 through
. To obtain all possible subsets of size 2 I could create
matrix which would be...
1,1 1,2 1,3 ... 1,
2,1 2,2 2,3 .... 2,
3,1 ..
..
,1
,2 .....
,
.
By neglecting diagonal elements I would have all possible subsets of size
from a SET of
.
Similarly I could continue to generate a
matrix, neglect
diagonal elements of the cube to obtain subsets with size 3.. so on and so
forth.
Clearly this is not a good way to go about generating the subsets I need. Is
there another way I could do it?
Thanks again, I'd eagerly await your response.
By Dan Goodman (Dfmg2) on Monday, November 27, 2000
- 05:00 pm: If you're programming in C or C++, then there's
a function called next_permutation(...) which you'll find useful. If
,
then there's a nice way of finding all subsets of
. Every subset
of
can be paired off with a number between 0 and
in the
following way. You define the ''indicator function''
of a subset
of
as follows:
if
is not in
or 1 if
is in
. Let
. You can easily see that every
subset has a unique number using this function, and that every number
between 0 and
gives rise to a unique subset.
On a computer it's easy to test whether or not a given element is in the
subset generated by a number
, you just extract the
th bit to test if
is in the subset, in C++ you write if(k&(1< < i)) to make this test.
You can use this to generate all the subsets of a set in the following way.
In C++ the program would look like this (for subsets of
):
unsigned int max_n = 0; for(int i=0;ifor(unsigned int k=0;k< =max_n;k++) { for(int i=0;i{ if(k&(1< { // then i is in the kth subset } } }
Once you've got an array with a subset, say an array int *subset of size
int m, then you can use the following C++ code to go through all the
permutations of it:
// do stuff to the first permutation here while(next_permutation(subset,m)) { // do stuff to the current permutation here }
Hopefully that will have gotten the ideas across even if you don't know C++,
but post again if that's not entirely clear.
Finally, note that if your C++ compiler has the (nonstandard) long
long datatype, then you can use the code above for subsets of anything up to
by changing the unsigned int to an unsigned long
long.
By James Lingard (Jchl2) on Thursday, November 30,
2000 - 01:28 am: Rob,
Dan's code will do exactly what you want - his algorithm works through each
subset of the elements and then finds each permutation of each subset.
However, I thought that I would point out that if you're intending to run
this algorithm on a set of size 50, then you'd better either have a very
fast computer or be very patient! There are (if I've calculated correctly)
over
(about
) ordered subsets of a set of size 50. To put this in
perspective, the have been only about
nanoseconds since the formation of
the earth (a modern computer executes about 1 instruction per nanosecond).
However, if this is for theoretical rather than practical interest then
please don't let me stop you!
James.
By Dan Goodman (Dfmg2) on Thursday, November 30, 2000
- 01:40 am: Just to second James' qualms about the amount of
time it would take to run through these calculations, I wrote a program
recently which applied quite a simple operation (took about 2n steps) to each
permutation of AA...AABB...BB (n A's and n B's). The computation for n=16
(which is as far as I got) took 2 hours on a P2-333. If you want to extend
the algorithm to more than 64 elements, or if you're working on a computer
without a 64 bit datatype I can give you an algorithm which will work for
larger sets, although it'll be even slower!
By James Myatt (Jem50) on Friday, December 1, 2000 -
11:42 am: Me and my friends have been running an algorithm
between us to find amicable numbers. We've had approximatley 3/4 CPUs running
the algorthum pretty much 24 hours a day, in a SETI@home type effort, and
we've now scanned every pair with the lowest number less than 2.1 x
.
This has taken over 4 weeks to do. We've now come up against the problem that
we need to find a way of recording the numbers if they are greater than
(signed long) as this is the greatest that printf and company can
deal with, at the minute. We're also looking at some more efficient code.
The standard way of looking for solutions to the travelling salesman question
is to write a program to 'jiggle' the connections until it falls into the
lowest possible order. This way you just need a matrix of transitions. Start
connecting 1 to 2, 2 to 3, etc. and then swapping connections. If that's more
expensive then there's a decreasing chance as the algorithm runs to keep
the change. Eventually, you will get one of the lowest solutions, but by
running it a few times with different seeds (number used to generate random
numbers by the computer algorithm) you should be able to produce something
like the cheapest route.
James
By James Lingard (Jchl2) on Friday, December 1, 2000 -
01:49 pm: Wow. Am I right in thinking that amicable
numbers are a pair such that each number is the sum of the proper divisors of
the other (sort of semi-perfect numbers), or am I getting confused with
something else?
What is known so far about amicable numbers? Presumably you're not yet
checking numbers which have never been checked before, or are you? Is the
code you're running your own?
James.
By Rob Kinley (P3367) on Sunday, December 3, 2000 -
07:33 pm: Dan,
Thanks a million for taking some time out and answering my question. This is
exactly what I was looking for. I must confess I do not know C++ (I do some C
programming). But I am thrilled that the problem I posed is solvable.
I understand that the function
is unique for every subset but I coudn't
quite grasp the code that uses this to generate the subsets. Honestly I did
try to understand the code (to rewrite it in C).. but I just couldn't.
So if I were to generate say all subsets of size 4 from a set
..
how would I modify the code ?. Is it possible at all to write this in C ?
Would you also explain the syntax max_n - = 1< < i and ''if (k&(1< < i)) ''.. I did not understand their function. I'd eagerly wait for your reponse. Thanks again
PS: how does one program for games. ( I visited your web site) Seems like a
lot of fun. I'd love to pursue a career in that.
By Dan Goodman (Dfmg2) on Sunday, December 3, 2000 -
11:49 pm: Converting the program to C is very easy: unsigned int max_n = 0;
int i;
unsigned int k; for(i=0;i< n;i++) max_n - =1< < i;
for(k=0;k< =max_n;k++)
{
for(i=0;i< n;i++)
{
if(k&(1< < i)) { // then i is in the kth subset }
}
}
I can't find the C implementation of the next_permutation function,
but I'll get back to you on this. The chances are you're using a C++ compiler if you
know how to write C programs, if you want to use this function, just include
this at the top of your file #include< algorithm.h> and hopefully
you'll be able to use the next_permutation function.
To explain the expressions max_n - = 1< < i and if(k&(1< < i))
you need to understand the - and & operators in C. As you probably know, an
integer is represented by a series of 1's and 0's, like 00110 or 11011 (which
represent 6 and 27 respectively). a - b means the number you get where you
replace each 0 or 1 in a and b with 0 if both the bits of a and b are 0 or 1
if either of them are 1. So, 00101 - 00110=00111. a&b is similar except you have
a 0 if either of the bits of a and b are 0, and a 1 if both bits are 1. So
00101&00110=00100. Also, the < < operator ''shifts'' the bits of a number to
the left, so a< < b shifts the bits of a left by b places. So 00110< < 1=01100
and 001001< < 2=100100. In binary 1=00001 (if you have a 5-bit number), so
1< < i means you have the binary number with a 1 in the ith place from the
right and 0 in all other places. So 1< < 1=00010 and 1< < 2=00100. When you are
a - =b it's just a C shorthand for a=a - b, so hopefully that explains the
max_n bit (all it is doing is setting the first n bits of max_n to 1, because
this is the number we're counting up to in a set of size n). The
k&(1< < i) will be 0 if the ith bit of k is 0 or 1< < i if the ith bit
of k is 1. So if you write if(k&(1< < i)) then you're saying ''if the
ith bit of k is 1 then do the following ...''. So the algorithm just goes
through every number between 0 and 2n-1 (max_n=2n-1 if you work out the stuff
above) and works out the subset generated by that number, which (because of
the f(X) function) is every subset.
If you want to find all subsets of size m (say), there's an easier way to do
it, but it uses the next_permutation function again. I will get back
you with an alternative algorithm for finding all the permutations! Anyway,
to generate all the subsets of size m, use the following code: unsigned int k[n];
int i;
for(i=0;i< n;i++) if(i< m) k[i]=1; else k[i]=0;
// do your algorithm on this permutation, i.e. the set of size m which has
elements {0,1,2,...,m-1}
while(next_permutation(k,n)) { // do your algorithm on this permutation, i.e. the number i is in this subset
if k[i]=1 } Do you see why this generates all subsets of size m? Each subset of size m
will be represented by an array of numbers with n-m 0's and m 1's. However,
every array of numbers with n-m 0's and m 1's corresponds to a subset of size
m. So all the permutations of an array with n-m 0's and m 1's generate all
the subsets of size m. In fact, you could use this to generate all the
subsets of sets with sizes bigger than 64 by finding all subsets of size 0,
all of size 1, all of size 2, and so on.
If there's another programming language you're more familiar with (I know
C/C++/Java/Basic) then I can rewrite the algorithm for that language. I
realize what I've written above might not fully explain the code and the
algorithm, but if you can tell me which bits are still unclear I'll
hopefully be able to clear them up for you.
On the subject of programming games, there's a huge number of web pages
dedicated to introducing games programming (which is a lot of fun by the
way). I recommend looking on the internet yourself for some good stuff (I
just looked through my links, and none of them are introductory stuff I'm
afraid), but I would recommend
Allegro as the C/C++ library of
choice for learning to write games. There should be lots of info on that page
to get you started.
By Rob Kinley (P3367) on Friday, December 8, 2000 -
12:07 am: Dan,
That was super!!. I follow the subset generation program . I tried it and it
worked. Now to generate only subsets of size m. I understand that I'll have
to begin with switching on (setting to 1) m of the n numbers and switch off
(set to 0) the rest.
I'll then need to permute this string of 0's and 1's.... and then print out
the ith element if it was = 1 . right??
Assuming the above statement was true.. I had a few questions on the function
next_permutation.
The parameters I need to pass on were k (where k is the string of 0's and 1's)
and n is the size of out superset ?? Is that true?
I tried the following using a C compiler
#include #include
#include
main()
{
int i,m,j,n;
unsigned int k[n];
n = 6;
m = 3;
for(i=0;i
#include
#include
int main()
{
int i,m,j,n;
unsigned int k[n];
n = 6;
m = 3;
for(i=0;i< n;i++) if(i< m) k[i]=1; else k[i]=0;
while(next_permutation(k,n))
{
for(j=0;j< n;j++)
{ if (k[j]&(1< < j))
cout< < j;
}
cout < < "^#92;n";
}
return 0;
}
and I came up with this error
c:^#92;ths^#92;subset2.cpp: In function `int main()': c:^#92;ths^#92;subset2.cpp:12: no matching function for call to
`next_permutation (unsigned int[((n - 1) + 1)], int &)'
But do you think I get the general concept ??. Can you help me out with
the code pleeeeeeeease.
Thanks again
By Dan Goodman (Dfmg2) on Friday, December 8, 2000 -
03:42 am: You're correct that you switch on m elements and
switch off n-m elements and then permute this array, where the ith element
is in the set if the ith value in the permuted array is 1.
The difficulties with the ''next_permutation'' function are irritating, I'll
suggest a couple of things and see if they help (if not, I'll scour the net
for a good permutation algorithm):
The first mistake I made was the syntax for ''next_permutation'', it should be
next_permutation(start,end) where start is the beginning of your data and end
is the end of your data plus one. So, in your code it should be
''next_permutation(k,k+n)'' rather than ''next_permutation(k,n)''. If this
doesn't fix the problem, here are some more general suggestions:
In the first program, you were trying to use a strictly C only compiler, so
it won't have the algorithm header file unfortunately. If you like, you could
search the files in the ''Include'' folder of your C compiler for the string
''permutation'' (if you're using Windows, you can use the ''Find Files or
Folders'' option in the Start menu to do this, enter ''permut'' in the
''Containing Text'' field and set the ''Look In'' option to the ''Include''
folder of your compiler), this might locate a similar function in C, but I'm
not sure. Your second example is more promising. You could try changing
''#include< algorithm> '' (which is correct standard C++ but some compilers have
nonstandard implementations) to ''#include< algorithm.h> ''. If this doesn't fix
the problem, you can try adding ''using namespace std;'' after the
''#include< algorithm> '' - this is necessary in strict C++ usage, but again
many compilers are quite lax about this.
Hopefully that answers the technical problems, now back to the theory.
In both programs you made a small error, the line which reads
''if(k[j]&(1< < j))'' should read ''if(k[j])'' - this is because you are not
using a single number to store the subsets in this example, you're using an
array of numbers instead. What you've written will only evaluate to true if
j=0, so this algorithm will only pick out 0 from your sets. Other than that,
I think you have the theory perfectly.
By Rob Kinley (P3367) on Monday, December 11, 2000 -
02:14 am: Dan,
I tried the different options, but unfortunately the compiler doesn't
recognize ''include algorithm''. I could buy a good c++ compiler that would
include the next_permutation function. Can you suggest a (economical) C/C++
compiler for Windows 98?
Thanks
By Dan Goodman (Dfmg2) on Monday, December 11, 2000 -
02:38 am: Hi Rob, have you tried changing the syntax of
the next_permutation function to next_permutation(k,k+n)? I think this should
fix the error on the second version of the program you sent me! If this still
doesn't work, then I can recommend DJGPP
as a first rate free C/C++ compiler, it's the one I use when I'm not using
VisualC++. It's a perfectly good compiler if you don't want to use Windows
routines and if you don't want to use advanced templates stuff in C++ (which
few compilers do well anyway). I found an Excel VB Macro which generates
permutations. It's in BASIC, but you should be able to adapt it to your
needs:
Dim CurrentRow Sub GetString()
Dim InString As String
InString = InputBox(''Enter text to permute:'')
If Len(InString) < 2 Then Exit Sub If Len(InString) > = 8 Then MsgBox ''Too many permutations!'' Exit Sub Else ActiveSheet.Columns(1).Clear CurrentRow = 1 Call GetPermutation('''', InString) End If End Sub
Sub GetPermutation(x As String, y As String) ' The source of this algorithm is unknown Dim i As Integer, j As Integer j = Len(y) If j < 2 Then Cells(CurrentRow, 1) = x & y CurrentRow = CurrentRow + 1 Else For i = 1 To j Call GetPermutation(x + Mid(y, i, \1), _
Left(y, i - 1) + Right(y, j - i)) Next End If End Sub