Still about Darwinism: the case of lactalbumin

I will try to make the criticism of Darwinism from some of my earlier posts more precise, although in the case of lactalbumin the theory of evolution just barely stays within the range of the probable.

            I found some data on the protein lactalbumin, which lets me examine my argument against evolution by mutations and meiosis more closely. Females of the class Mammalia produce milk, and for that reason mammals have a protein-coding gene that makes lactalbumin. Lactalbumin is actually a protein family: it contains different variants of α-lactalbumin and β-lactoglobulin, i.e., these proteins differ to a degree between mammalian species. The gene encoding α-lactalbumin is called LALBA, and the LGB gene encodes a precursor to β-lactoglobulin.

            There is good evidence that lactalbumin evolved from c-lysozyme [1][2]. First the c-lysozyme gene was duplicated, probably 300 Myr (million years) ago, and then the duplicate mutated [2]. As it was a duplicate, the original still did the work the gene was expected to do, and the animal did not suffer from the copy. Much later, around 200 Myr ago, the copy started working and took part in producing milk for the new class Mammalia.

            If we assume that the duplicate gene was working during the period 300-200 Myr ago while it mutated from c-lysozyme to lactalbumin, then the intermediate stages had to be useful, or at least not harmful, to the animal. If they were useful or neutral, some intermediate stages of the gene should have been preserved in some animal species. In that case there would be genes that are closer to lactalbumin than c-lysozyme is, but there are no such genes. Thus, the mutating copy was not working before it emerged as lactalbumin. This implies that after the copy had mutated too far it stopped working and became a pseudogene. While the copy was not working, it could not be acted on by natural selection or any other mechanism affecting phenotypes. Therefore, the mutations in it were purely random and their probability can be calculated in a simple way from the binomial distribution.

            Reference [2] gives information that can be used to estimate the difference between hen lysozyme and cow lactalbumin. The comparison covers 123 residues. Of them, 40 residues were identical and 27 were chemically similar. The positions of 15 residues could not be established in the study; the reason must be that they were different and the researchers could not say what they correspond to. I will subtract these 15 because it may be that the proteins were of different sizes. Three base pairs are needed to code for one amino acid, and the residues mentioned in [2] apparently correspond to amino acids. That means that the gene is $3 \times (123-15) = 324$ base pairs long.

            The gene for c-lysozyme has been active all the time, so any mutations in it were subject to natural selection. According to [1], c-lysozyme arose 600 Myr ago. How much has the c-lysozyme gene changed in the last 300 Myr? One way we might try to calculate it is the following. The average mutation rate is $0.5 \times 10^{-9}$ mutations per bp per year. If we use this average rate for the c-lysozyme gene of 324 bp, there should have been $6 \times 10^{8} \times 324 \times 0.5 \times 10^{-9} \approx 97$ mutations. Half of these would have happened in the last 300 Myr, which gives 48 mutations. But this is incorrect for a subtle reason. Reference [4] gives a figure of 30-55% similarity between lysozyme proteins. Thus, about 45-70% of the amino acids differ between these lysozyme proteins. If we assume that most of these differences are single-bp mutations, one per differing amino acid, then among the 108 amino acids encoded by 324 base pairs there are about 48-76 mutations. This seems to agree with the 48 mutations that we calculated, but these different lysozyme proteins were created by the original gene duplicating, the duplicate mutating as a pseudogene, and later emerging from the pseudogene state as a new lysozyme-family gene. This must be so, because if the active gene had mutated, it would have lost its function. Thus, the original c-lysozyme gene has actually not mutated much, probably hardly at all.
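The arithmetic in this paragraph is easy to recheck with a few lines of Python. This is only a sketch: the variable names are mine, the inputs are the figures quoted above, and the assumption of one changed base pair per differing amino acid is the same simplification made in the text.

```python
# Rough check of the mutation counts quoted above.
# All inputs are figures from the text; the names are illustrative only.

RESIDUES = 123       # residues compared in [2]
UNLOCATED = 15       # residues whose position was not established in [2]
RATE = 0.5e-9        # assumed average mutation rate, per bp per year

# Three base pairs per amino acid
gene_length_bp = 3 * (RESIDUES - UNLOCATED)

# Expected mutations if the gene mutated at the average rate for 600 Myr,
# and the half of them falling in the last 300 Myr
mutations_600_myr = 600e6 * gene_length_bp * RATE
mutations_300_myr = mutations_600_myr / 2

# 45-70% differing amino acids, assuming one bp change per differing residue
codons = gene_length_bp // 3
low, high = 0.45 * codons, 0.70 * codons

print(gene_length_bp)                  # 324
print(round(mutations_600_myr, 1))     # 97.2
print(round(mutations_300_myr, 1))     # 48.6
print(round(low, 1), round(high, 1))   # 48.6 75.6
```

The text truncates these values to 97, 48 and 48-76.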

            According to [1] there is considerable variation in the amounts of change along diverse lactalbumin lineages in mammals, but this refers to data from present-day animals. It does not show that lactalbumin has mutated much while being active. It has almost certainly also been duplicated and mutated in the duplicate, and the original copy, being worse than the newly activated duplicate, has finally disappeared. But we can check whether this explanation of mine is correct.

            The average mutation rate of $0.5 \times 10^{-9}$ mutations per bp per year can probably also be used for the lactalbumin gene. Then there should have been about 32 bp mutations in the 200 Myr since the lactalbumin gene became active, if it mutated while active. Adding the 48 mutations for the c-lysozyme gene to the 32 for lactalbumin gives 80 mutations. There would additionally be 97/6 = 16 bp mutations in the duplicate during the period 300-200 Myr ago. The difference between the cow lactalbumin gene and the hen c-lysozyme gene should then be about 96 bp if both genes had mutated at this rate while active, but this is clearly larger than what can be calculated from [2].
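The sum of 96 can be laid out as three contributions at the same constant rate. The sketch below is my own illustration: the helper `expected_mutations` is a hypothetical name, and the values are truncated to whole mutations as in the text.

```python
RATE = 0.5e-9     # assumed average mutation rate, per bp per year
GENE_BP = 324     # gene length used in the text

def expected_mutations(years, gene_bp=GENE_BP, rate=RATE):
    """Expected number of bp mutations over `years` at a constant rate."""
    return years * gene_bp * rate

lysozyme_active = expected_mutations(300e6)      # c-lysozyme, last 300 Myr
lactalbumin_active = expected_mutations(200e6)   # lactalbumin, last 200 Myr
duplicate = expected_mutations(100e6)            # duplicate, 300-200 Myr ago

# Truncate each term to whole mutations, as the text does
total = int(lysozyme_active) + int(lactalbumin_active) + int(duplicate)
print(int(lysozyme_active), int(lactalbumin_active), int(duplicate), total)
# 48 32 16 96
```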

             The 27 residues that were chemically similar in [2] must correspond to 27 base-pair mutations. Assuming that only one of the three base pairs encoding each amino acid was changed, 54 were not changed. As these amino acids were chemically similar, they did not affect the phenotype and could not be acted on by natural selection.

            The 40 identical residues mean 120 identical base pairs in the gene. Thus, of the 324 base pairs, 120+54=174 bp are unchanged and 27 are certainly changed. Of the remaining 123 base pairs we can say that they are not identical in the two genes. There are many possible mutations, e.g. insertions and deletions, and the data does not specify which occurred. However, I find it reasonable to assume that if 27 chemically similar amino acids were created by single mutations, at least as many chemically non-similar amino acids were created by single mutations, because when the copy of the gene is a pseudogene the mutations are random and there is no mechanism that would favor chemically similar over chemically non-similar substitutions. In this way I deduce that there were at least 27 more mutated and 54 more non-mutated base pairs. Adding these gives 54 single base pair mutations and 228 non-mutated base pairs in 324 base pairs. The state of the remaining 42 base pairs cannot be directly deduced in this way, but it is reasonable to assume that one third of them (14) are mutated. Thus, there are about 68 bp differences between the lactalbumin and c-lysozyme genes. This is clearly smaller than the 96 that we get by assuming that the c-lysozyme and lactalbumin genes mutated at the average rate while active. Therefore my explanation seems correct: the active versions of the c-lysozyme and lactalbumin genes did not mutate much.
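The bookkeeping of changed and unchanged base pairs can be followed step by step in a short Python sketch. The variable names are mine, and the two assumptions marked in the comments are the ones made in the text, not results from [2].

```python
GENE_BP = 324
identical_residues = 40   # from [2]
similar_residues = 27     # from [2]

# Identical residues: all 3 bp unchanged. Similar residues: 1 bp changed, 2 unchanged.
unchanged = 3 * identical_residues + 2 * similar_residues   # 120 + 54 = 174
changed = similar_residues                                   # 27
unclassified = GENE_BP - unchanged - changed                 # 123

# Assumption from the text: at least as many non-similar residues
# arose from single-bp mutations as similar ones.
changed += similar_residues          # 54
unchanged += 2 * similar_residues    # 228
remaining = GENE_BP - unchanged - changed   # 42

# Assumption from the text: one third of the remaining base pairs are mutated.
estimated_differences = changed + remaining // 3   # 54 + 14

print(unclassified, remaining, estimated_differences)   # 123 42 68
```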

            What is a realistic estimate for the number of mutations in the duplicate gene while it was a pseudogene? The copy was a pseudogene for 100 million years, from 300 Myr to 200 Myr ago, and mutated during this time from a c-lysozyme gene into a working lactalbumin gene. At the average rate the duplicate should have acquired 97/6=16 mutations in 100 million years, but this is most probably not correct. The copy obtained a totally new functionality and therefore it had to have a larger number of mutations. Both c-lysozyme since 300 Myr ago and lactalbumin since 200 Myr ago stayed active and could all the time encode proteins for the same task they do today. Natural selection acted on these genes, but they could not change much without losing functionality. The large differences between c-lysozyme proteins are best explained by a process where the gene was duplicated and the duplicate mutated. Then the mutation rates for c-lysozyme match the average mutation rate of $0.5 \times 10^{-9}$ mutations per bp per year quite well. However, the mutation rate of the c-lysozyme copy that turned into lactalbumin must have been larger. Judging from the different c-lysozyme proteins, about 50 bp mutations are needed before a duplicate copy finds a new working form. This must also hold for the lactalbumin gene.

            Thus, I conclude that the best guess is that the duplicate gene had about 50 single bp mutations in 100 million years. The probability of a mutation in a single base pair in 100 million years is $p = 0.5 \times 10^{-9} \times 10^{8} = 0.05$. The binomial distribution gives the probability that there were exactly 50 mutations in 324 bp as

            $\mathrm{Prob} = \binom{324}{50}\, p^{50} (1-p)^{324-50}$

where $\binom{a}{b}$ is the binomial coefficient. Using Stirling’s formula for the binomial coefficient, the probability is

            $\mathrm{Prob} \approx \exp(324\ln 324 - 50\ln 50 - 274\ln 274 + 50\ln 0.05 + 274\ln 0.95)$

$= \exp(-24.5) \approx 2.3 \times 10^{-11}$.
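The Stirling estimate can be reproduced with a few lines of Python. This is only a sketch: the function names are mine, and it uses the leading-order form of Stirling’s formula, ln C(n,k) ≈ n ln n - k ln k - (n-k) ln(n-k), with p = 0.05 and 324 bp as in the text. Evaluated this way, the exponent comes out near -24.5.

```python
import math

def log_binom_stirling(n, k):
    """Leading-order Stirling approximation of ln C(n, k)."""
    return n * math.log(n) - k * math.log(k) - (n - k) * math.log(n - k)

def log_prob_stirling(n, k, p):
    """ln of the binomial probability of exactly k mutations in n sites."""
    return log_binom_stirling(n, k) + k * math.log(p) + (n - k) * math.log(1 - p)

n, k = 324, 50            # base pairs and assumed number of mutations
p = 0.5e-9 * 1e8          # per-bp mutation probability over 100 Myr (0.05)

log_prob = log_prob_stirling(n, k, p)
print(round(log_prob, 2))             # about -24.48
print(f"{math.exp(log_prob):.1e}")    # about 2.3e-11
```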

We should know what the population size was during those 100 million years. 200 million years ago dinosaurs ruled the world and mammals were like today’s mice. The population of mice in the world today is of the same order as the population of humans. That is about the size that would make the mutation of the lactalbumin gene possible, but there is at least one problem: how did mammals get the other supporting functions needed to start using the new lactalbumin proteins? If we add these new functionalities, I am afraid the probability of this happening approaches zero.

            Notice that if there were only 16 mutations, we would get the probability

            $\mathrm{Prob} \approx \exp(324\ln 324 - 16\ln 16 - 308\ln 308 + 16\ln 0.05 + 308\ln 0.95) = \exp(-0.001) \approx 1$.

Thus, creation of 16 single bp mutations in 100 million years would be quite probable. If we put the number of mutations to 32, then the probability is

            $\mathrm{Prob} \approx \exp(324\ln 324 - 32\ln 32 - 292\ln 292 + 32\ln 0.05 + 292\ln 0.95) = \exp(-6.39) \approx 0.00168$.

This is still very reasonable, as the population of pre-mammals can grow quite large in 100 million years. But as it is, a figure of 16-32 mutations is probably not the correct number. It should be around 50.
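To see how much the crude Stirling form differs from the exact binomial probability, the three cases (16, 32 and 50 mutations) can be put side by side. This is a sketch under the same assumptions as the text (324 bp, p = 0.05); the function names are mine, and `math.lgamma` is used to get the exact log-factorials.

```python
import math

N_BP = 324      # base pairs in the gene, as in the text
P_SITE = 0.05   # per-bp mutation probability over 100 Myr, as in the text

def log_prob_stirling(n, k, p):
    """Crude Stirling form: ln C(n,k) ~ n ln n - k ln k - (n-k) ln(n-k)."""
    log_c = n * math.log(n) - k * math.log(k) - (n - k) * math.log(n - k)
    return log_c + k * math.log(p) + (n - k) * math.log(1 - p)

def log_prob_exact(n, k, p):
    """Exact binomial log-probability, with ln n! computed as lgamma(n + 1)."""
    log_c = math.lgamma(n + 1) - math.lgamma(k + 1) - math.lgamma(n - k + 1)
    return log_c + k * math.log(p) + (n - k) * math.log(1 - p)

for k in (16, 32, 50):
    crude = log_prob_stirling(N_BP, k, P_SITE)
    exact = log_prob_exact(N_BP, k, P_SITE)
    print(f"k={k}: crude exponent {crude:.2f}, exact exponent {exact:.2f}")
```

The crude form drops the factor sqrt(n / (2π k (n-k))) of the full Stirling formula, so the exact exponents come out lower by roughly 2 to 3. The ordering of the three cases stays the same.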

            The reason I do not get an impossibly small probability in this calculation, as I did in the previous posts, is that 100 million years is quite a long time. For the divergence of mammals there are only 30 million years, and that makes a big difference. What this example case with lactalbumin confirms is that in order to create a really new protein-coding gene you need 100 million years and a population in the range of 10 billion. There are cases where the time is much shorter and the population much smaller, such as the gene encoding the cat’s Fel d 1 protein.
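The link between the probability and the population size can be made explicit with a small sketch. It rests on a strong simplification of my own: each individual in the population is treated as one independent trial of the 100-million-year process, ignoring generations and inheritance. The function name is hypothetical, and the probability is the crude Stirling estimate for 50 mutations in 324 bp with p = 0.05.

```python
import math

def stirling_prob(n, k, p):
    """Crude Stirling estimate of the probability of exactly k mutations in n sites."""
    log_c = n * math.log(n) - k * math.log(k) - (n - k) * math.log(n - k)
    return math.exp(log_c + k * math.log(p) + (n - k) * math.log(1 - p))

POPULATION = 1e10   # "10 billion", the population size mentioned in the text

prob_50 = stirling_prob(324, 50, 0.05)

# Expected number of individuals carrying exactly 50 mutations (simplified model)
expected_carriers = POPULATION * prob_50

# Population at which the expected number of carriers reaches one
population_for_one = 1 / prob_50

print(f"{expected_carriers:.2f}")
print(f"{population_for_one:.1e}")
```

In this simplified picture the expected number of carriers is a fraction of one for 10 billion individuals, and it reaches one for a population a few times larger.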

            Evolution (meaning the observed process over time of species developing from other species) has created very many new protein-coding genes, that is, all the protein-coding genes there are. Reference [4] states that mutations in protein-coding parts are very common in DNA, so there are lots of examples, and they mostly follow the process of a gene being duplicated, mutating fully randomly as a pseudogene, and finally becoming functional again. These mutations seem to present a problem for the present form of the theory of evolution, because new functionalities requiring new proteins often appear in a short time. The time for lactalbumin was rather long, 100 million years. The divergence of mammals happened in a shorter time, and there too we see duplicates of genes mutating into clearly new protein-coding genes.

            Just how rare mutations that create new, useful and clearly different protein-coding parts are can be deduced from the text James Thompson refers to in his article Going to the dogs: Man’s best friend [5]:

“Overall, we identified 131 unique SNPs that were significantly associated with at least one of the 14 behavioral traits (Bonferroni p ≤ 0.05, Fig 2). Forty percent of these SNPs (n= 52) were located within a gene – none of which encoded for changes in the amino acid sequence of the protein.”
            That is, selective breeding had changed dogs’ behavior. Dogs have been bred for quite a long time. Some claim only for 8,000 years, but according to newer research it looks more like 30,000 years. The researchers found 131 mutations, but no useful mutations in protein-coding DNA.

            It is quite well understood that simply shuffling genes in meiosis cannot produce new species. It can only create subspecies. In order to create new species, new genes are needed, and that requires mutations. Mutations are not rare: each of us has about 64 new mutations in our genome. Mutations are random and occur both in protein-coding and in non-coding parts. Some mutations in non-coding parts happen in control sections that modify how genes are expressed. Even a single mutation in a control part can change the phenotype and be useful if one is lucky. These mutations can create new species, but such new species have proteins similar to those of the parent species.

            Mutations in protein-coding parts are either small changes that do not change the protein much (they make new alleles) or they are usually harmful and even deadly. This is why significantly different protein-coding parts of DNA develop through the mechanism where a gene is duplicated and the duplicate mutates as a pseudogene outside the control of natural selection. If one is lucky, after a long time and in a large population one of these pseudogenes starts working as a new active gene. Lactalbumin is probably the most studied example. The 100 million years given in [1] for the development of the lactalbumin gene from c-lysozyme is just barely enough, but my conclusion still is that random mutations cannot explain the evolution of new protein-coding parts, because the same problem also appears in the shorter period of evolution when the different mammal lineages diverged.

            I do not discard evolution as such, only the mechanisms that have been proposed for evolution. In an earlier post I suggested that bacteria and viruses may have been the places where new protein-coding parts were developed. This would give more time and a larger population in which protein-coding parts could develop. Bacteria and viruses infect animals and can transfer parts of DNA to the host. Of course, there is the problem of why some bacteria would develop anything that is needed by higher animals, but it does not need to be developed in bacteria.

Human DNA has some strange similarities with that of domestic animals (cats, cows and mice), more similarities than with animals living far from us but evolutionarily closer. The reason may be that bacteria and viruses can transfer DNA from one animal to another. Reference [3] gives some support to this idea: goose lysozyme seems to combine parts resembling hen egg-white lysozyme with parts resembling bacteriophage lysozyme.

            The second problem in Darwinism is its inability to explain the origin of life, that is, how the first cell was created, but I will skip this problem and go to the third, which in my opinion is the gravest. Plato in Timaeus gives a theory of how the universe, humans and animals were created. The theory from 360 BC is naturally old-fashioned and differs from our views on physics, but this is only because we are today so far from those times and can see that the theory is incorrect in many respects. There are other old theories of the same type, and with all of them we can see that they are wrong. The logical conclusion we should draw is that in some thousands of years our theory will be considered completely wrong. However, Pythagoras’ theorem is still correct. Apparently it was known to the Egyptians long before the Greeks, though the Egyptians did not know how to prove it. After some thousands of years the theorem is still correct. This is because it is proven logically and there are no gaps in the proof. There are gaps in Darwin’s theory, so it is not proven and we should not expect it to remain valid. Essentially there is no difference between Plato’s theory in Timaeus and Darwin’s theory of evolution: both are wrong. Darwin’s theory has its strong points, but Timaeus also has some very strong points. The theory in Timaeus starts from God creating the universe as a ball, since a ball is the most symmetric shape, and there was no time, because when nothing moves, no time is needed in the model. This is quite reasonable; indeed, the universe was a ball at the beginning of the Big Bang (assuming we accept that theory). Then, according to Timaeus, God added something and it was not a ball anymore. Adding movement, like the planets, adds time to the model, so time was in a way created by the planets: time is needed because there is movement.
In this way Timaeus arrives at the creation of man, and like all ancient philosophers he thinks that man has a soul and is therefore not only a material being. Here is where I think Timaeus is much superior to Darwin. Darwin did not only say that humans are animals and developed from monkeys, which most modern people accept. His theory implies that humans are automatons, that is, that there is no soul. Yet we have a conscious self, as do other mammals. We can build a being that has only a body but no conscious self. It is called an automaton, a robot, or a computer. The implication from Darwin is that a human is an automaton, but this contradicts our direct experience that we are not without consciousness like an automaton.

References:

[1] Prager EM, Wilson AC: Ancient origin of lactalbumin from lysozyme: analysis of DNA and amino acid sequences. J Mol Evol. 1988;27(4):326-35.

https://link.springer.com/article/10.1007%2FBF02101195

Parsimony trees relating DNA sequences coding for lysozymes c and alpha-lactalbumins suggest that the gene duplication that allowed lactalbumin to evolve from lysozyme preceded the divergence of mammals and birds. Comparisons of the amino acid sequences of additional lysozymes and lactalbumins are consistent with this view. When all base positions are considered, the probability that the duplication leading to the lactalbumin gene occurred after the start of mammalian evolution is estimated to be 0.05-0.10. Elimination of the phylogenetic noise generated by fast evolution and compositional bias at third positions of codons reduced this probability to 0.002-0.03. Thus the gene duplication may have long preceded the acquisition of lactalbumin function.

Besides being a bacteriolytic enzyme, which appears to function in many creatures as a defense against bacterial infections (Jollès and Jollès 1984), lysozyme has on several occasions been recruited by the gut, where it may have a nutritional role (Stewart et al. 1987 and references therein). Lactalbumin, in contrast, has been found to date only in an organ that is unique to mammals, i.e., the mammary gland, which arose at a known time, approximately 200 million years (Myr) ago. In this organ, lactalbumin binds calcium and interacts with galactosyl transferase so as to promote lactose synthesis (Shewale et al. 1984; Stuart et al. 1986). That a duplication of a lysozyme gene permitted the origin of a protein (lactalbumin) with a new function receives support from observations that genes encoding the two proteins are similar in structure, each comprising four exons and three introns (Qasba and Safaya 1984). Lactalbumin may provide the best example where a new function has developed following a gene duplication.

…despite considerable variation in amounts of change along diverse lactalbumin lineages (Shewale et al. 1984). (my comment: so there is quite much variation, many mutations)

Lysozyme was created 600 Myr ago. This time was calculated from variations in the last 33 amino acids in moths.

[2] Keith Brew, Thomas C. Vanaman and Robert L. Hill: Comparison of the Amino Acid Sequence of Bovine α-Lactalbumin and Hens Egg White Lysozyme

http://www.jbc.org/content/242/16/3747

A partial sequence for α-lactalbumin, one of the two proteins of lactose synthetase, has been established. This sequence includes the exact position of all but 15 of the 123 residues in the molecule, and two of the four disulfide bonds. This partial structure shows that α-lactalbumin has a close structural similarity to hens egg white lysozyme. When the sequences of the two proteins are aligned, 40 residues in α-lactalbumin are identical with corresponding residues in lysozyme. An additional 27 residues at corresponding positions are chemically similar. From these observations it is concluded that the genes for α-lactalbumin and egg white lysozyme are derived from a common ancestor. It is proposed that an ancestral gene which controlled the sequence of a lysozyme-like enzyme duplicated, and the duplicate genes evolved independently giving rise to the genes for α-lactalbumin and the lysozymes.

[3] M. G. Grütter, L. H. Weaver & B. W. Matthews: Goose lysozyme structure: an evolutionary link between hen and bacteriophage lysozymes? Nature volume 303, pages 828–831 (1983)

https://www.nature.com/articles/303828a0

During evolution, the amino acid sequence of a protein is much more variable and changes more rapidly than its tertiary structure. Given sufficient time, the amino acid sequences of proteins derived from a common precursor may alter to the point that they are no longer demonstrably homologous. The ability to make meaningful comparisons between such distantly related proteins must therefore come primarily from structural homology, and only secondarily (if at all) from sequence homology. On the other hand, structural homology in the absence of sequence homology might be attributed to convergent rather than divergent evolution. (A common fold might be dictated by functional or folding requirements.) We have previously argued, on the basis of structural and functional similarities, that the lysozymes of hen egg-white and bacteriophage T4 have a common evolutionary precursor, even though their amino acid sequences have no detectable similarity. Here we report the structure of the lysozyme from Embden goose, a representative of a third class of lysozymes that has no sequence homology (or perhaps very weak homology) with either the hen egg-white or the phage enzyme. The structure of goose egg-white lysozyme has striking similarities to the lysozymes from hen egg-white and bacteriophage T4. However, some parts of goose lysozyme resemble hen lysozyme while other parts correspond only to the phage enzyme. The nature of the structural correspondence strongly suggests that all three lysozymes evolved from a common precursor.

[4] David M Irwin, Jason M Biegel, and Caro-Beth Stewart: Evolution of the mammalian lysozyme gene family. BMC Evol Biol. 2011; 11: 166.

https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3141428/

[5] James Thompson: Going to the dogs: Man’s best friend.

https://www.unz.com/jthompson/going-to-the-dogs-mans-best-friend/

The quote is from a scientific paper: Highly Heritable and Functionally Relevant Breed Differences in Dog Behavior by Evan L MacLean, Noah Snyder-Mackler, Bridgett M. von Holdt & James A. Serpell.
