Introduction to Mendelian genetics
I am interested in genetics. Therefore I would like to write a general-purpose library to calculate the results produced by the mating of two specimens of a species.
This project is the first step on the path to an application to calculate the results of corn snake pairings (colour, pattern, …).
This application must first be able to:
- Accept the relevant genetic properties of the parents as input.
- Calculate the expected genotype and display it.
- Compile the resulting phenotype and display it.
Later I would like to add a way to administer the relevant genetic properties.
I’ll write it in Java, because I want to improve my skills (I usually work with C#).
To understand the explanations and examples below, some basic knowledge of genetics is necessary. Here is a brief summary of the key terms:
- A locus is the location of a gene on a chromosome.
- An allele is one of the alternative versions of a gene at a given locus.
- Corn snakes are diploid, like humans. They have two alleles of a gene at each locus. For mating, the two alleles at a locus are separated into two gametes, each carrying one allele.
- Homozygous means having identical alleles at one locus.
- Heterozygous means having different alleles at one locus.
- The collection of alleles of an individual is called the genotype.
- The observable traits produced by the genotype are the phenotype.
- The allele whose trait is observable in a heterozygous individual is called dominant; the other one is called recessive. Dominance and recessiveness describe a relationship between two different alleles.
- The wild type is the phenotype’s natural form; the term is also used for the “normal” allele at a locus.
You can find general information about Mendelian inheritance on Wikipedia.
We use a capital letter to represent a dominant allele and a lower case letter for a recessive one. It follows that there are three possible combinations of alleles (based on homozygous / heterozygous and on dominant / recessive):
- Dominant homozygous (AA)
- Recessive homozygous (aa)
- Heterozygous (Aa or aA)
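The three combinations can be told apart mechanically. The following is only a minimal sketch, assuming a genotype for one locus is written as a two-letter string; the class `GenotypeKind` and its methods are hypothetical names chosen for illustration, not part of any existing library.

```java
import java.util.Arrays;

// Hypothetical helper for illustration only; assumes one locus and the
// "capital letter = dominant, lower case letter = recessive" convention.
class GenotypeKind {
    enum Kind { DOMINANT_HOMOZYGOUS, RECESSIVE_HOMOZYGOUS, HETEROZYGOUS }

    // Sorts the two allele letters so that "aA" and "Aa" become the same string.
    // Upper case letters sort before lower case ones, so the dominant allele comes first.
    static String normalize(String genotype) {
        if (genotype == null || genotype.length() != 2) {
            throw new IllegalArgumentException("Expected exactly two alleles: " + genotype);
        }
        char[] alleles = genotype.toCharArray();
        Arrays.sort(alleles);
        return new String(alleles);
    }

    // Classifies a one-locus genotype as one of the three combinations.
    static Kind classify(String genotype) {
        String g = normalize(genotype);
        char first = g.charAt(0);
        char second = g.charAt(1);
        if (first != second) {
            return Kind.HETEROZYGOUS;
        }
        return Character.isUpperCase(first)
                ? Kind.DOMINANT_HOMOZYGOUS
                : Kind.RECESSIVE_HOMOZYGOUS;
    }

    public static void main(String[] args) {
        for (String g : new String[] {"AA", "aa", "Aa", "aA"}) {
            System.out.println(g + " -> " + classify(g));
        }
    }
}
```

Normalizing first means the rest of a program never has to treat Aa and aA as separate cases.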
How to calculate the outcome of a breeding
Obviously I can’t predict the exact offspring of two particular individuals. My goal is to calculate the probability of each possible offspring genotype.
The Punnett square is one of the methods used to calculate these probabilities.
Suppose we are crossing plants. This particular plant usually has yellow flowers, but in some cases the flowers are white. It is known that the allele for yellow flowers is dominant over the allele for white flowers. Therefore, we will represent the allele for yellow flowers as A and the allele for white flowers as a.
There are six possible types of Punnett square for one locus.
- Dominant homozygous (AA) x dominant homozygous (AA)
- Recessive homozygous (aa) x recessive homozygous (aa)
- Dominant homozygous (AA) x heterozygous (Aa) or heterozygous (Aa) x dominant homozygous (AA)
- Recessive homozygous (aa) x heterozygous (Aa) or heterozygous (Aa) x recessive homozygous (aa)
- Dominant homozygous (AA) x recessive homozygous (aa) or recessive homozygous (aa) x dominant homozygous (AA)
- Heterozygous (Aa) x heterozygous (Aa)
Aa and aA are equivalent for our purposes. So the above tables can be summarized as follows.
- Dom. = Dominant
- Rec. = Recessive
- Hom. = Homozygous
- Het. = Heterozygous
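The six cross types listed above can also be computed instead of looked up. The sketch below is only one possible approach, assuming one locus and two-letter genotype strings; `PunnettSquare` and `cross` are hypothetical names. Each parent passes on either of its two alleles with equal probability, so each of the four cells of the square counts for 25%.

```java
import java.util.Arrays;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical one-locus Punnett square; an illustrative sketch, not a finished design.
class PunnettSquare {
    // Returns the probability of each offspring genotype for parent1 x parent2.
    static Map<String, Double> cross(String parent1, String parent2) {
        if (parent1.length() != 2 || parent2.length() != 2) {
            throw new IllegalArgumentException("Each parent needs exactly two alleles");
        }
        Map<String, Double> result = new TreeMap<>();
        for (char gamete1 : parent1.toCharArray()) {
            for (char gamete2 : parent2.toCharArray()) {
                // Sort the pair so that aA is counted together with Aa.
                char[] pair = {gamete1, gamete2};
                Arrays.sort(pair);
                // Each of the four cells of the square has probability 1/4.
                result.merge(new String(pair), 0.25, Double::sum);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        String[][] crosses = {
            {"AA", "AA"}, {"aa", "aa"}, {"AA", "Aa"},
            {"aa", "Aa"}, {"AA", "aa"}, {"Aa", "Aa"}
        };
        for (String[] c : crosses) {
            System.out.println(c[0] + " x " + c[1] + " -> " + cross(c[0], c[1]));
        }
    }
}
```

Because the pair is sorted before counting, the order of the parents does not matter, which is why only six distinct squares exist for one locus.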
You can check the calculation for two loci on Wikipedia.
From genotype to phenotype
Once the genotype has been calculated, it is possible to derive the phenotype. This is actually quite easy. Suppose that A means brown eyes and that a means blue eyes. Two parents with genotype Aa both have brown eyes, but both carry the allele for blue eyes, which is not expressed because it is recessive.
Now suppose we have the following cross: Aa x Aa.
This cross results in the following genotypes: 25% AA, 50% Aa, 25% aa.
As already explained, the visible properties form the phenotype. This means that in our example there are two phenotypes:
- 75% of them will have brown eyes (AA and Aa).
- 25% of them will have blue eyes (only aa).
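The step from genotype probabilities to phenotype probabilities is a simple grouping. Here is a minimal sketch, assuming one locus with complete dominance and the capital / lower case convention; `PhenotypeSummary` and its methods are hypothetical names used only for illustration.

```java
import java.util.Map;
import java.util.TreeMap;

// Hypothetical sketch: groups genotype probabilities by the phenotype they produce.
// Assumes one locus with complete dominance (one dominant allele is enough).
class PhenotypeSummary {
    // A genotype shows the dominant trait if it contains at least one capital letter.
    static String phenotypeOf(String genotype, String dominantTrait, String recessiveTrait) {
        for (char allele : genotype.toCharArray()) {
            if (Character.isUpperCase(allele)) {
                return dominantTrait;
            }
        }
        return recessiveTrait;
    }

    // Adds up the probabilities of all genotypes that share a phenotype.
    static Map<String, Double> summarize(Map<String, Double> genotypes,
                                         String dominantTrait, String recessiveTrait) {
        Map<String, Double> phenotypes = new TreeMap<>();
        for (Map.Entry<String, Double> entry : genotypes.entrySet()) {
            String phenotype = phenotypeOf(entry.getKey(), dominantTrait, recessiveTrait);
            phenotypes.merge(phenotype, entry.getValue(), Double::sum);
        }
        return phenotypes;
    }

    public static void main(String[] args) {
        // Genotype probabilities of the cross Aa x Aa from the text.
        Map<String, Double> genotypes = new TreeMap<>();
        genotypes.put("AA", 0.25);
        genotypes.put("Aa", 0.50);
        genotypes.put("aa", 0.25);
        System.out.println(summarize(genotypes, "brown eyes", "blue eyes"));
    }
}
```

For the Aa x Aa example this adds AA and Aa together for brown eyes and leaves aa alone for blue eyes.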
So far we have used a capital letter for a dominant allele and a lower case letter for a recessive one. An allele combination was represented by two letters (e.g. AA, Aa or aa). This was convenient for our examples and for a basic understanding.
There is a more detailed notation:
- The locus is represented as a capital letter.
- The allele is represented as a lower case letter.
- Presence of a dominant allele is represented by the capital letter for the locus and a +.
- Presence of a recessive allele is represented by the capital letter for the locus and the lower case for the allele.
Here are some examples:
|First nomenclature|New nomenclature|
|AA|A+A+|
|Aa|A+Aa|
|aa|AaAa|
But why do we need another notation? It becomes necessary when we consider a locus with two or more different mutations.
Assume that locus X has three different alleles: a, b and c. Allele a is dominant over b and c, and b is dominant over c.
It is possible that the a and c alleles are present at locus X. In the first notation this would be represented as Ac, and the information about locus X is lost. If we have Ac and Bc, how can we know which locus is involved? A different locus Y might be meant, a locus that also has alleles b and c. This problem is eliminated by the new notation, in which we represent the X locus with alleles a and c as XaXc.
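The locus X example could be modelled roughly as follows. This is only a sketch under my own assumptions: the dominance order is stored as a list from most to least dominant, and `LocusNotation`, `format` and `expressed` are hypothetical names, not a fixed design for the library.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical model of one locus with several alleles and a linear dominance order.
class LocusNotation {
    final String name;
    // Alleles listed from most dominant to least dominant (an assumption of this sketch).
    final List<String> allelesByDominance;

    LocusNotation(String name, List<String> allelesByDominance) {
        this.name = name;
        this.allelesByDominance = allelesByDominance;
    }

    // Writes a genotype in the locus-prefixed notation, more dominant allele first.
    String format(String allele1, String allele2) {
        boolean firstWins = rank(allele1) <= rank(allele2);
        String first = firstWins ? allele1 : allele2;
        String second = firstWins ? allele2 : allele1;
        return name + first + name + second;
    }

    // Returns the allele that is expressed, i.e. the more dominant of the two.
    String expressed(String allele1, String allele2) {
        return rank(allele1) <= rank(allele2) ? allele1 : allele2;
    }

    // Position in the dominance list; a lower index means more dominant.
    private int rank(String allele) {
        int index = allelesByDominance.indexOf(allele);
        if (index < 0) {
            throw new IllegalArgumentException(
                    "Unknown allele " + allele + " at locus " + name);
        }
        return index;
    }

    public static void main(String[] args) {
        LocusNotation x = new LocusNotation("X", Arrays.asList("a", "b", "c"));
        System.out.println(x.format("c", "a"));    // prints XaXc
        System.out.println(x.expressed("b", "c")); // prints b
    }
}
```

Keeping the locus name inside the formatted genotype is what lets XaXc and a hypothetical YbYc be told apart.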
Well, that was, in broad terms, the information needed to create the implementation.
See you soon.