Is Your DNA Unique?
Stage: 5
As you may know, DNA is made up of four different
bases:
-Adenine (A)
-Cytosine (C)
-Guanine (G)
-Thymine (T)
Suppose that the bases are randomly distributed along a single
strand of the DNA:
i) If my DNA single strand is
10 bases in length, what is the probability that it contains
only a single adenine?
ii) If my DNA single strand
is 150 bases in length, what is the probability of a 30%
cytosine content?
iii) If my DNA single strand
is 1000 bases in length, what is the probability of getting at
least 5 thymines in a row, at least once?
iv) The human genome is
approximately 6 billion bases in length. What is the probability
that another individual has the same genetic composition as
me?
v) The bacterial restriction
enzyme BamHI cuts DNA at the site GGATCC. If I digest my genome
with this enzyme, how many cuts would I expect to
occur?
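Questions i) to v) can be explored numerically. The following is a
minimal sketch, not a model answer. It assumes that each of the four
bases is equally likely (probability 1/4) and that positions are
independent, that "only a single adenine" means exactly one, and that
"30% cytosine content" means exactly 45 cytosines out of 150. The
function names are hypothetical, chosen only for illustration.

```python
from math import comb, log10

P_BASE = 0.25  # assumption: A, C, G, T equally likely, positions independent


def prob_exact_count(length, count, p=P_BASE):
    """Binomial probability that one chosen base appears exactly `count` times."""
    return comb(length, count) * p**count * (1 - p) ** (length - count)


def prob_run_at_least(length, run, p=P_BASE):
    """Probability of at least one run of `run` or more copies of one chosen base."""
    # state[k]: probability that no full run has occurred so far and the
    # strand currently ends in exactly k consecutive copies of the base.
    state = [1.0] + [0.0] * (run - 1)
    for _ in range(length):
        new = [0.0] * run
        new[0] = (1 - p) * sum(state)   # any other base resets the streak
        for k in range(run - 1):
            new[k + 1] = p * state[k]   # the streak grows by one
        # The mass p * state[run - 1] completes a run and is dropped here.
        state = new
    return 1.0 - sum(state)


def log10_prob_identical(length, p=P_BASE):
    """log10 of the probability that two random strands agree at every base."""
    return length * log10(p)


def expected_sites(length, site="GGATCC", p=P_BASE):
    """Expected number of occurrences of `site`, by linearity of expectation."""
    return (length - len(site) + 1) * p ** len(site)


if __name__ == "__main__":
    print("i)   P(exactly one A in 10 bases):", prob_exact_count(10, 1))  # about 0.188
    print("ii)  P(exactly 45 C in 150 bases):", prob_exact_count(150, 45))
    print("iii) P(run of >= 5 T in 1000 bases):", prob_run_at_least(1000, 5))
    print("iv)  log10 P(identical 6e9-base strand):", log10_prob_identical(6_000_000_000))
    print("v)   expected BamHI sites:", expected_sites(6_000_000_000))
```

Part iv) is reported as a base-10 logarithm because the probability
itself is far too small to store as a floating-point number. Part v)
works out to roughly 1.46 million sites under these assumptions, since
each of the 6 billion starting positions matches GGATCC with
probability (1/4)^6.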
DNA sequencing is a laborious task that requires expensive
machinery and considerable computational power. DNA
fingerprinting is a technique used by forensic scientists to
match a sample of DNA to one of a number of suspects who may
have been at a crime scene.
However, since sequencing the entire human genome is so
difficult, a different approach must be adopted. Most of the
human genome is identical between individuals, except for
single bases that vary widely across the population. These
variable bases occur approximately once in every 1000 bases.
By comparing only these sites between two samples of DNA, it
is possible to determine rapidly, and to a high degree of
accuracy, whether the two samples are identical.
vi) If approximately 1 in
1000 bases is variable, what is the probability of an
individual having the same genetic composition as
me?
vii) How many of these
variable sites should be investigated to identify a suspect to
99.99% probability?
viii) If we remember that DNA
occurs as homologous chromosomes, and that these variable sites
occur in the same places across a pair of homologous
chromosomes, how many of the sites should be investigated such
that the probability of a misidentification is smaller than 1
in 1,000,000?
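Questions vi) to viii) can be sketched in the same way. The code below
is an illustration under stated assumptions, not a definitive answer:
each variable site is assumed to carry any of the four bases with
equal probability, independently of all other sites, and for viii) a
site on a pair of homologous chromosomes is assumed to be read as an
unordered pair of bases (so A/G and G/A look the same). The function
names are hypothetical.

```python
from collections import Counter
from fractions import Fraction
from itertools import product
from math import log10


def sites_needed(match_prob, max_error):
    """Smallest number of sites n with match_prob**n strictly below max_error."""
    n, p = 0, Fraction(1)
    while p >= max_error:
        p *= match_prob
        n += 1
    return n


def genotype_match_prob(bases="ACGT"):
    """Chance that two individuals show the same unordered base pair at one site."""
    # Count how many ordered pairs collapse onto each unordered genotype.
    counts = Counter(tuple(sorted(pair)) for pair in product(bases, repeat=2))
    total = len(bases) ** 2
    return sum(Fraction(c, total) ** 2 for c in counts.values())


if __name__ == "__main__":
    # vi) about 6e9 / 1000 variable sites, all of which must match
    variable_sites = 6_000_000_000 // 1000
    print("vi)   log10 P(match):", variable_sites * log10(0.25))

    # vii) a chance match must be rarer than 1 - 0.9999 = 1 in 10,000
    print("vii)  sites:", sites_needed(Fraction(1, 4), Fraction(1, 10**4)))

    # viii) unordered pairs of bases on homologous chromosomes
    per_site = genotype_match_prob()
    print("viii) per-site match probability:", per_site)
    print("viii) sites:", sites_needed(per_site, Fraction(1, 10**6)))
```

Exact fractions are used so that the comparison against the error
threshold is not affected by floating-point rounding. If the two
chromosomes could be told apart, the per-site match probability would
instead be 1/16, and the same `sites_needed` call would give a smaller
number of sites.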