Is It Possible That I Share No DNA With My Full Sibling?
Whenever someone asks whether it’s possible that two
siblings can share no DNA segments in common, it conjures up memories of
watching the movie “Twins” back in 1988 where one sibling played by Arnold
Schwarzenegger inherited all of the “good genes” of his parents, and the other
played by Danny DeVito inherited all the “useless genetic material” (quoting
the movie). Most of the time, when I see
a question on this topic in the Facebook groups, I see a lot of links to the famous
genetic genealogy green chart (which provides observed cM ranges of different
classes of relatives) and to DNA Painter’s tool (which provides the probability
of different relationships based on a given cM value), as well as a chorus of
people shouting “DNA doesn’t lie!” (not my favorite expression, but that’s a topic
for a different day). Of course, both the
green chart and the DNA Painter tool will tell you right away that it is impossible
for full siblings not to share any DNA segments.
However, there’s always that one person who responds, “Well,
it’s technically possible. It’s just
very unlikely.” There’s also typically another
person who chimes in with a comment about chimeras, but don’t even get me
started on that! Today, we’re going to
explore the mathematical possibility that through ordinary autosomal
recombination, two siblings share no DNA, and we’re going to come to an
estimated probability based on the definition of a centiMorgan.
Warning! This involves math!
Before we get into cM and recombination though, let’s start
with a simple calculation of what the answer would look like in a species with
22 autosomal chromosomes, but where recombination does not occur (parents do
not switch between passing paternal and maternal DNA to their children along
the chromosomes, but instead pass one or the other in its entirety). On any given chromosome, the siblings would
have a 50/50 chance of inheriting the same copy of the chromosome from their
father (from either the paternal grandfather or the paternal grandmother), and
likewise a 50/50 chance of inheriting the same copy of the chromosome from
their mother (from either the maternal grandfather or the maternal grandmother). The odds of the siblings both inheriting
opposite paternal AND maternal copies in this model would be one in four (25%).
This would be an NIR (non-identical region) that spanned the entire length of
the chromosome. Similarly, the odds of
an FIR (full-identical region) would be 25%, which would occur when both
siblings inherited the same copies of both parents’ chromosomes. The remaining 50% of the time, the siblings
would be in an HIR state (half-identical region) where either their paternal
copy of the chromosome matched or their maternal copy, but not both.
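For readers who like to see the counting spelled out, here is a minimal Python sketch of the no-recombination model for a single chromosome. The function name and the state labels are my own choices for illustration. It lists the four equally likely combinations (same or different paternal copy, same or different maternal copy) and tallies how many land in each state.

```python
from fractions import Fraction
from itertools import product


def ibd_state_probabilities():
    """Enumerate the four equally likely outcomes for one chromosome
    when no recombination occurs (hypothetical helper for illustration)."""
    counts = {"NIR": 0, "HIR": 0, "FIR": 0}
    # Relative to sibling 1, sibling 2 either received the same copy
    # from a given parent (True) or the other copy (False).
    for same_paternal, same_maternal in product([True, False], repeat=2):
        matching_copies = int(same_paternal) + int(same_maternal)
        state = ("NIR", "HIR", "FIR")[matching_copies]
        counts[state] += 1
    total = sum(counts.values())
    return {state: Fraction(n, total) for state, n in counts.items()}


probs = ibd_state_probabilities()
for state, p in probs.items():
    print(state, p)
```

The tallies come out to 1/4 for NIR, 1/2 for HIR and 1/4 for FIR, matching the percentages above.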
Now, we want to calculate the odds that the siblings are NIR
on all 22 chromosomes under this model.
We arrive at the solution by framing the problem as follows: an event with probability
of 0.25 occurs AND another event with probability of 0.25 occurs AND . . . (and
so on with 22 events in total, each representing the odds that one chromosome
is entirely out of phase between the siblings).
In statistics, when independent events are joined by the word “and,” we know to multiply the
probabilities. Here, we must multiply
together 22 copies of 0.25, represented by 0.25^22 or 0.25 to the 22nd
power. Entering that expression into the
calculator, we get 5.7 x 10^-14, or as a percentage 0.0000000000057 %. We could also express the same probability as
one in 18 trillion. When we take into
consideration a world population of 8 billion and make a few over-simplified
assumptions, namely 1) that the population will remain relatively constant in the
foreseeable future; 2) that the average
person has 2 siblings (pulled that out of a hat); and 3) a conservative average
generation-span of 25 years, the resulting expectancy is that such a pair of
siblings might be born every 25,000 years, give or take. Given that humans have probably been around at
least 5 million years per our best understanding of natural history, and being
optimistic that humanity will likewise survive another few million years, then
we would conclude that yes, there will likely be some full siblings who will share
not a single segment of autosomal DNA, even if none are likely to be currently
walking the earth.
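If you would rather not trust a pocket calculator, the no-recombination figure is easy to reproduce. This is a minimal sketch in Python; the variable names are mine, and the only inputs are the 25% per-chromosome chance and the 22 autosomes from the model above.

```python
P_NIR_ONE_CHROMOSOME = 0.25  # chance one chromosome is fully out of phase
N_AUTOSOMES = 22             # autosomes in the simplified model

# Independent events joined by "and": multiply the probabilities.
p_all_nir = P_NIR_ONE_CHROMOSOME ** N_AUTOSOMES
one_in = 1 / p_all_nir  # the same probability expressed as "one in N"

print(f"probability: {p_all_nir:.2e}")
print(f"about one in {one_in:,.0f}")
```

The "one in N" figure is exactly 4^22, which is where the roughly 18 trillion comes from.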
At this point, the person who posts “Well, it’s technically possible. It’s just very
unlikely.” is nodding his or her head saying “I told you so.” But let’s not get too excited yet. We now need to add another factor that comes
into play. We need to account for recombination
in our model. Let’s ignore for the
moment the remote possibility that all recombination points on each of the siblings’
chromosomes occur in exactly the same place (I assure you it’s not going to add
much in terms of our probability estimate), and instead consider the “most
likely” scenario, in which none of the chromosomes recombine at all (taking us
back to the first model). So, we are
going to come up with a new “probability of no recombination” and multiply that
by the probability of all chromosome copies being out of phase (already
calculated above). Again, we multiply
the odds because both of those events must occur for there to be no matching
segments.
Let’s calculate an estimate of the odds of no recombination
based on the definition of cM. For our
purposes, we’ll define 1 cM as a span of chromosome across which the odds of a
recombination occurring are exactly 1/100.
Alternatively put, and more useful for our analysis, it’s a span of
chromosome that has a 99% chance of NOT recombining. Each of our siblings has two copies of each
chromosome, totaling about 7,200 cM per sibling in chromosomal real estate (a
term that I think is very helpful to understanding quite a few concepts in
genetic genealogy). Altogether, in our
new model, there must be no recombinations that occur across a grand total of
14,400 cM (both siblings combined). We
can treat each cM as a separate event with a likelihood of 0.99 of no
recombination and put 14,400 “ands” in our calculation and we come up with 0.99^14,400
for our multiplier, which is 1.4 x 10^-63.
I’m going to stick to scientific notation because I don’t feel like counting
out the 60 zeros after the decimal point in the percentage.
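Here is a short Python sketch of that multiplier, assuming (as the simplified model does) that every cM is an independent 99% "no recombination" event. The constant names are mine. Because numbers this small are awkward to read, the sketch also works out the base-10 logarithm, which gives the order of magnitude directly.

```python
import math

P_NO_RECOMB_PER_CM = 0.99  # chance that a 1 cM span does not recombine
TOTAL_CM = 14_400          # about 7,200 cM per sibling, two siblings

# Direct calculation; the result is tiny but still representable as a float.
p_no_recomb = P_NO_RECOMB_PER_CM ** TOTAL_CM

# Same quantity on a log scale, handy for reading off the exponent.
log10_p = TOTAL_CM * math.log10(P_NO_RECOMB_PER_CM)

print(f"probability of no recombination: {p_no_recomb:.1e}")
print(f"log10 of that probability: {log10_p:.2f}")
```

A logarithm a little above -63 is another way of saying "about 1.4 x 10^-63."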
So, the probability of not sharing any segments of DNA with
your sibling under our revised model is achieved by multiplying 5.7 x 10^-14 by
1.4 x 10^-63. The product is 8.0 x 10^-77. You know what, I am going to write that out
as a percent!
0.000000000000000000000000000000000000000000000000000000000000000000000000008 %
To put this number into perspective, this is about the same
probability as hitting the Powerball jackpot 9 times in a row. Put that in your WATO!
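The Powerball comparison can be checked the same way. This sketch assumes the published jackpot odds of 1 in 292,201,338, a figure I am supplying rather than one stated above, and asks how many consecutive jackpots would be as improbable as the no-shared-DNA figure. The names are mine.

```python
import math

P_NO_SHARED_DNA = 8.0e-77             # combined probability from the text
POWERBALL_JACKPOT_ODDS = 292_201_338  # assumption: published 1-in-N jackpot odds

p_jackpot = 1 / POWERBALL_JACKPOT_ODDS

# Solve p_jackpot ** n == P_NO_SHARED_DNA for n using logarithms.
jackpots_needed = math.log(P_NO_SHARED_DNA) / math.log(p_jackpot)

print(f"equivalent consecutive jackpots: {jackpots_needed:.2f}")
```

Under that assumption the answer lands very close to 9, consistent with the comparison above.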
I know this year we’ve all already been through a lot of
tough times, and it’s probably not comforting to think about the depressing
prediction that scientists believe planet earth will be absorbed by the sun in
approximately 7.5 billion years.
However, I’m going down that road to finish off this math exercise. Assuming humans inhabit the planet from now
until that fateful day 7.5 billion years in the future, our equation is as
follows (and I encourage you to check my math):
P = (8.0 x 10^-77) x (8000000000/2) x (7500000000/25) = 9.6 x 10^-59 (probability of a pair of human full siblings with no shared DNA segments being born before the sun absorbs the earth)
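Taking up the invitation to check the math, here is a minimal Python sketch of the estimate. It multiplies the per-pair probability by a rough count of sibling pairs per generation and by the number of generations remaining. Reading the "/2" as "population divided by two gives the number of sibling pairs per generation" is my interpretation, and all of the names are mine. Strictly speaking the result is an expected number of such pairs, which for a value this small is essentially the same as the probability of at least one.

```python
P_NO_SHARED_DNA = 8.0e-77         # per-pair probability from the text
WORLD_POPULATION = 8_000_000_000  # assumed constant, per the text
YEARS_REMAINING = 7_500_000_000   # until the sun absorbs the earth
YEARS_PER_GENERATION = 25

# Assumption: roughly one sibling pair for every two people alive.
pairs_per_generation = WORLD_POPULATION / 2
generations = YEARS_REMAINING / YEARS_PER_GENERATION

expected_pairs = P_NO_SHARED_DNA * pairs_per_generation * generations

print(f"sibling pairs per generation: {pairs_per_generation:.1e}")
print(f"generations remaining: {generations:.1e}")
print(f"expected no-shared-DNA pairs: {expected_pairs:.1e}")
```

With these inputs the product works out to roughly 9.6 x 10^-59, and changing any of the assumptions by a few orders of magnitude leaves it vanishingly small.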
I propose to you that even if I am off by as much as 50
orders of magnitude based on my assumptions, we can safely say that given our
model of autosomal DNA inheritance by recombination, never will a pair of
siblings with no matching segments walk the earth. While not “technically impossible,” i.e. we just
imagined a mechanism by which it could happen and even calculated a finite probability
for such an event to occur, the event is “statistically impossible.” That is to say, the probability is sufficiently low so as
to not bear mention in a rational, reasonable argument.
While I think most readers will have already guessed that full siblings must share DNA, what I hope to accomplish with this blog post is to show readers how the definition of a cM comes into play when determining probabilities in genetic genealogy. I have personally found the definition of cM to be extremely useful in designing genetic genealogy software tools and algorithms, but even for the less technically inclined, a solid understanding that a cM is a statistical unit of measurement (and knowing how the unit is defined) is very helpful in understanding genetic genealogy.
It would be interesting to find out the probabilities of the siblings sharing only 30%, 20%, 10%, 5%, 1% etc.
Another interesting calculation would be how many children would have to be born from the same parents for them, together, to inherit all the genes of their parents.
"Another interesting calculation would be how many children would have to be born from the same parents for them, together, to inherit all the genes of their parents."
I believe that the applicable theorem here is Zeno's dichotomy paradox. I.e., one child has 50%; two children have 75%; three children 87.5%, and so on, but never arriving at 100%.