The GENE - An Intimate History (Siddhartha Mukherjee)

The GENE

Charles Darwin imagined that the cells of all organisms produce minute particles containing hereditary information - gemmules, he called them.

Genes do not travel independently; instead they move in packs. Packets of information are themselves packaged - into chromosomes and ultimately into cells.

To explain nature's past, present and future through the lens of the gene -
Evolution - (Past) How did living things arise?
Variation - (Present) Why do they look like this now?
Embryogenesis - (Future) How does a single cell create a living thing that will eventually acquire its particular form?

To explain the intersection of genetics, natural selection and evolution in formal terms, Dobzhansky resurrected two important words - genotype and phenotype.
A Genotype is an organism's genetic composition. It can refer to one gene, a configuration of genes or an entire genome.
A Phenotype, in contrast, refers to an organism's physical or biological attributes and characteristics - the color of an eye, the shape of a wing or resistance to hot or cold temperatures.

Part 3
The Dreams of Geneticists
Sequencing and Cloning of Genes
"Crossing Over"
Conceptually, each virus is a professional gene carrier. Viruses have a simple structure: they are no more than a set of genes wrapped inside a protein coat. When a virus enters a cell, it sheds its coat and begins to use the cell as a factory to copy its genes and manufacture new coats, resulting in millions of new viruses budding out of the cell. They live to infect and reproduce; they infect and reproduce to live.

DNA damage occurs routinely in cells, and the cell repairs it using specific enzymes. One of these enzymes, called "ligase", chemically stitches the two pieces of the broken DNA backbone together. Occasionally, the DNA-copying enzyme, "polymerase", might also be recruited to fill in the gap and repair the gene.

Virtually all cells have polymerase and ligase to repair broken DNA, but there is little reason for most cells to have DNA-cutting enzymes. Bacteria, however, possess such enzymes to defend themselves against viruses. They use the enzymes to cut open the genes of the intruder, thereby rendering the host bacteria immune to attack. These proteins are called "restriction" enzymes, because they restrict infection by certain viruses. Free trade of genes is a hallmark of the bacterial world.

The New Music
Genetics, like any language, is built out of basic structural elements -
Alphabet - A, C, G & T
Vocabulary - consists of triplet codes; three bases of DNA are read together to encode one amino acid in a protein; ACT encodes Threonine, CAT encodes Histidine, GGT encodes Glycine etc.
Sentence - A protein is the sentence encoded by a gene, using the alphabet strung together in a chain (ACT-CAT-GGT encodes Threonine-Histidine-Glycine)
Syntax - Regulation of genes creates a context for these words and sentences to generate meaning. The regulatory sequences appended to a gene - i.e. signals to turn it on and off at certain times and in certain cells - can be imagined as the internal grammar of the genome.
In DNA, as with words, the sequence carries the meaning, not the individual letters. If DNA is broken into its constituent bases A, C, T & G, it no longer signifies anything.
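The alphabet / vocabulary / sentence analogy above can be sketched in a few lines of Python. This is only an illustration: the table holds just the three codons named in these notes (the real genetic code has 64), and the function name `translate` is made up for the example.

```python
# Partial codon table: only the three triplets mentioned in the notes.
CODON_TO_AMINO_ACID = {
    "ACT": "Threonine",
    "CAT": "Histidine",
    "GGT": "Glycine",
}


def translate(dna):
    """Read the DNA three bases at a time and look up each triplet."""
    usable = len(dna) - len(dna) % 3  # ignore a trailing partial triplet
    codons = [dna[i:i + 3] for i in range(0, usable, 3)]
    # Triplets missing from the partial table are shown as "?".
    return [CODON_TO_AMINO_ACID.get(codon, "?") for codon in codons]


print(translate("ACTCATGGT"))  # ['Threonine', 'Histidine', 'Glycine']
print(translate("GGTCATACT"))  # same letters, different order, different protein
```

The second call uses exactly the same bases in a different order and yields a different "sentence", which is the point of the note: the sequence carries the meaning.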
Reading DNA was the hard part - how to determine the sequence of a gene?
Sanger was the scientist who worked out the amino acid sequence of insulin. The solution was in dissolution. Every protein is made up of a sequence of amino acids strung like a chain - to identify the sequence of a protein he snapped off one amino acid from the end of the chain, dissolved it in solvents, and characterised it chemically, then repeated the step for the next amino acid. But this approach did not work for DNA. So he adopted another approach: building rather than breaking. Cells build genes all the time: each time a cell divides it makes a copy of every gene. If one could follow the gene-copying enzyme (DNA polymerase), keeping tabs as the enzyme adds bases - A, C, T, G etc. - then the sequence of the gene would be known.

Most animal proteins were not encoded in long, continuous stretches of DNA, but were actually split into modules. In bacteria, every gene is a continuous, uninterrupted stretch of DNA, starting with the first triplet code and running contiguously to the final stop signal. But in animals a gene was typically split into parts and interrupted by long stretches of stuffer DNA.
Ex - word structure
Bacteria- gene is embedded in genome as structure
Animals/Human genome - st......r....uc.....tu....r...e
The long stretches of DNA marked by ellipses (...) do not contain any protein-encoding information. When such an interrupted gene is used to generate a message - i.e. when DNA is used to build RNA - the stuffer fragments are excised from the RNA message, leaving "structure". This process is called gene splicing or RNA splicing (since the RNA message of the gene is 'spliced' to remove the stuffer fragments).
But why would an animal genome waste such long stretches of DNA splitting genes into bits and pieces, only to stitch them back into a continuous message?
Because by splitting genes into modules, a cell could generate a bewildering combination of messages out of a single gene. The word st......r....uc.....tu....r...e could be spliced to yield cure, true etc., thereby creating vast numbers of variant messages - called isoforms - out of a single gene. Modular genes also had an evolutionary advantage: the individual modules from different genes could be mixed and matched to build entirely new kinds of genes. The modules were called exons. The in-between stuffer fragments were termed introns. Introns are not the exception in human genes; they are the rule. Human introns are enormous - spanning several hundreds of thousands of bases of DNA. The genes themselves are separated from each other by long stretches of intervening DNA, called intergenic DNA. Intergenic DNA and introns - spacers between genes and stuffers within genes - are thought to have sequences that allow genes to be regulated in context. Ex -
This.......is............the......(..)....s......truc........ture.....of; 
The long ellipses between words = stretches of intergenic DNA
The shorter ellipse within the words = introns
The parentheses and semicolons (punctuation marks) = regions of DNA that regulate genes.
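The word analogy for splicing can be tried out directly. The sketch below treats each run of dots as an intron and the letters between them as exons; the helper names (`exons`, `splice`, `count_exon_subsets`) are invented for this illustration, and the subset count is only an upper bound for a toy model in which any in-order selection of exons is allowed, which real cells do not do.

```python
import re
from itertools import combinations

GENE = "st......r....uc.....tu....r...e"  # the interrupted gene from the notes


def exons(gene):
    """Split the gene at every run of dots (introns) and keep the letter modules."""
    return [part for part in re.split(r"\.+", gene) if part]


def splice(gene):
    """Excise the introns and join the exons into one continuous message."""
    return "".join(exons(gene))


def count_exon_subsets(gene):
    """Count the non-empty, in-order selections of exons (toy upper bound on isoforms)."""
    n = len(exons(gene))
    return sum(1 for size in range(1, n + 1) for _ in combinations(range(n), size))


print(exons(GENE))               # ['st', 'r', 'uc', 'tu', 'r', 'e']
print(splice(GENE))              # structure
print(count_exon_subsets(GENE))  # 63
```

Even with only six modules, the toy model allows 63 different selections, which gives a feel for how a single split gene can yield many messages.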

T-Cells - sense the presence of invading cells and virus-infected cells by virtue of a sensor found on the surface of the T-Cell. The sensor, the T-Cell receptor, is a protein made uniquely by T-Cells. The receptor recognises proteins on the surface of foreign cells and binds to them. The binding in turn triggers a signal to kill the invading cell, thereby acting as a defence mechanism for the organism. But how to determine the nature of the T-Cell receptor? It was solved not by the regular dissolution/reduction method, but by gene cloning. An enzyme was found that could build DNA from RNA (reverse transcriptase). Every RNA could be used as a template to build its corresponding gene, thus generating a catalogue of all the genes active in a cell. By comparing the catalogues obtained from 2 different cell types (T-Cells, RBCs, neurons in the retina etc.) one could find which genes were active in one cell and not active in another. Once identified, the gene could be amplified in bacteria, isolated and sequenced, and its RNA and protein sequence determined.
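A minimal sketch of the two ideas in this note: building DNA from an RNA template, and comparing the gene catalogues of two cell types. The sequences and gene names below are placeholders invented for the example, not real data, and the function name `reverse_transcribe` is hypothetical.

```python
# Base pairing used when copying RNA into complementary DNA (cDNA).
RNA_TO_DNA = {"A": "T", "U": "A", "G": "C", "C": "G"}


def reverse_transcribe(rna):
    """Return the complementary DNA strand for an RNA template, read 5'->3'."""
    # The new strand runs antiparallel to the template, so walk the RNA backwards.
    return "".join(RNA_TO_DNA[base] for base in reversed(rna))


print(reverse_transcribe("AUGC"))  # GCAT

# Hypothetical catalogues of active genes (placeholder names).
t_cell_catalogue = {"gene_A", "gene_B", "gene_T"}
neuron_catalogue = {"gene_A", "gene_B", "gene_N"}

# Genes active in the T-Cell but not in the neuron: candidates for the receptor.
print(t_cell_catalogue - neuron_catalogue)  # {'gene_T'}
```

The set difference is the whole trick of the comparison: anything shared by both cell types drops out, and what remains is specific to the T-Cell.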

Einsteins on the Beach
Genes were not abstractions any more. They could be liberated from the genomes of organisms, shuttled between species, amplified, purified, extended, shortened, altered, remixed, mutated, mixed - they were infinitely malleable.
Asilomar - the conference where scientists imposed self-regulation on recombination of genes, cloning and gene manipulation. Three types of procedures involving recombinant DNA needed to be strictly restricted - don't put drug-resistance genes, toxin genes or cancer genes into E. coli.

Clone or Die
Boyer and Swanson formed a company (Gen-en-tech - Genentech) around recombinant DNA. In the 1980s insulin was still produced from mashed-up pig and cow innards, a pound of hormone from 8k pounds of pancreas - a medieval method that was inefficient and expensive. They were trying to express insulin as a protein via gene manipulation in cells.
Insulin: the Garbo of hormones. Scientists had looked through the pancreas, a fragile leaf of tissue tucked under the stomach, and discovered minute islands of distinct-looking cells studded across it, called the islets of Langerhans. Later the pancreas was surgically removed from a dog to identify the function of the organ; the dog was struck by implacable thirst and began to urinate constantly, with high sugar content in the urine (diabetes). The sugar-regulating factor was later found to be a hormone, a protein secreted into the blood by those "islet cells". The hormone was called isletin, and then insulin.
Boyer's plan for the synthesis: he would build the gene from scratch using DNA chemistry, nucleotide by nucleotide, triplet upon triplet - ATG, CCC, TCC and so forth, all the way to the last. Insulin is made of two protein chains, so he would insert the two genes into bacteria and trick them into synthesising the human proteins. He would purify the two protein chains and then stitch them chemically to obtain the molecule. They first tried this on the smaller somatostatin. Since it was a synthetic gene - DNA created from naked chemicals - it fell into the gray zone of Asilomar's language and was relatively exempt.
Patent - The company applied for a patent on insulin. The US Patent Office has 4 categories - methods, machines, manufactured materials and compositions of matter - but insulin fitted into none. Genentech approached this problem ingeniously - rather than patenting insulin as matter or as a manufactured material, it claimed a patent for a "DNA vehicle" to carry a gene into a bacterial cell and thereby produce a recombinant protein in a microorganism.
Hemophilia is caused by a single mutation in the gene for a crucial clotting factor in blood, called factor VIII. By the mid-1970s hemophilia was treated with injections of concentrated factor VIII. Distilled out of thousands of litres of human blood, a single dose of the clotting factor was equivalent to a hundred blood transfusions. A patient was thus exposed to the condensed essence of thousands of donors - which also spread AIDS.
Why not create factor VIII artificially by gene cloning? This would rule out contamination from donated blood. But it would test the limits of the method, as factor VIII has 2350 amino acids while insulin has 51. Hence a new cloning technique was required - using reverse transcriptase to build the gene's DNA from its RNA message, from which the introns have already been removed.
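To get a feel for why factor VIII "tested the limits", the triplet rule (three bases per amino acid) gives a rough minimum length for each coding sequence. This is back-of-the-envelope arithmetic using only the amino acid counts quoted in these notes; it ignores stop codons, signal sequences and everything else a real gene carries.

```python
BASES_PER_AMINO_ACID = 3  # one triplet of DNA encodes one amino acid


def minimum_coding_bases(amino_acids):
    """Smallest number of DNA bases needed to encode a protein of this length."""
    return amino_acids * BASES_PER_AMINO_ACID


insulin_bases = minimum_coding_bases(51)        # 153
factor_viii_bases = minimum_coding_bases(2350)  # 7050

print(insulin_bases, factor_viii_bases)
print(round(factor_viii_bases / insulin_bases))  # roughly 46 times longer
```

Building 153 bases chemically, letter by letter, was feasible; building more than 7,000 was not, which is why copying the gene from its RNA message was the practical route.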
History of medicinal chemistry - A drug is nothing more than a molecule that enables a therapeutic change in human physiology. Of the several million variants of biological molecules in the human body (enzymes, receptors, hormones and so forth), only about 250 (0.025%) are therapeutically modulated by our current pharmacopeia. The paucity of medicines has one principal reason: specificity. Nearly every drug works by binding to its target and enabling or disabling it - turning molecular switches on or off. To be useful, a drug must bind to its switches - but only to a selected set of switches. Most molecules can barely achieve this level of discrimination - but proteins have been designed explicitly for this purpose. Proteins are the hubs of the biological world. They are the enablers, machinators, regulators, gatekeepers and operators of cellular reactions. They are the switches that most drugs try to switch on or off. But to make a protein, one needs its gene - and here recombinant DNA technology provided the crucial missing stepping stone. The cloning of human genes allowed scientists to manufacture proteins, and the synthesis of proteins opened the possibility of targeting millions of biochemical reactions in the human body.

PART 4
The Proper Study of Mankind is Man
The Miseries of My Father

a gene -----(encodes)---a message ----(to build)---a protein----(to enable)---form/function-----(that regulates)----a gene

Positional Cloning

To Get The Genomes
In the history of science and technology, breakthroughs seem to come in two fundamental forms - 1. Scale shifts (where the crucial advance emerges as a result of an alteration in size or scale alone) 2. Conceptual shifts (the emergence of a radical new idea or concept)
The two modes are not mutually exclusive but reinforcing. One enables the other.
Cancer - Arises from normal cells that have acquired mutations in growth-controlling genes. In normal cells, these genes act as powerful regulators of growth, e.g. when healing a wound. Genes tell the cells in a wound when to start growing and when to stop. In cancer cells these pathways are somehow disrupted. Start genes are jammed ON and stop genes are flicked OFF, resulting in a cell that does not know how to stop growing. Cancer is the result of the corruption of such endogenous genetic pathways - a distorted version of our normal selves. How might mutant genes be restored to their OFF or ON states while allowing normal growth to proceed unperturbed? This still remains the defining goal of cancer therapy.
Normal cells acquire these cancer-causing mutations through 4 mechanisms -
1. Environmental insults - tobacco, UV, X-rays: agents that attack DNA and change its structure.
2. Spontaneous errors - every time DNA is replicated in a cell during cell division, there is a minor chance that the copying process generates an error.
3. Inheritance - causing hereditary cancer syndromes such as retinoblastoma and breast cancer.
4. Viruses - gene-swapping viruses.
Cancers arise from a step-by-step process involving the accumulation of dozens of mutations in a cell. A fundamental feature is enormous genetic diversity: 2 samples of the same cancer might have vastly different spectra of mutations.

Polymerase chain reaction (PCR) would become crucial for the Human Genome Project - the comprehensive sequencing of the entire human genome.

The Book of Man
(in 23 volumes)

1. It has about 3.2 billion letters of DNA
2. Contains just 4 letters - ACTG
3. It is divided into 23 pairs of chromosomes - 46 in all - in most cells in the body. The other great apes (gorillas, chimpanzees, orangutans) have 24 pairs. At some point in hominid evolution, 2 medium-sized chromosomes in some ancestral ape fused to form one. The human genome departed cordially from the ape genome several million years ago, acquiring new mutations and variations over time. We lost a chromosome but gained a thumb.
4. It encodes about 20,687 genes in total - only 1,796 more than worms, 12k fewer than corn and 25k fewer genes than rice or wheat. The difference between humans and breakfast cereal is not a matter of gene number, but of the sophistication of gene networks. It is not what we have; it is how we use it.
5. It is fiercely inventive. It orchestrates the activation and repression of certain genes in only certain cells and at certain times and places, and thus produces near-infinite functional variation out of its limited repertoire. And it mixes and matches gene modules - called exons - within single genes to extract even further combinatorial diversity out of its gene repertoire. These 2 strategies - gene regulation and gene splicing - appear to be used more extensively in the human genome than in the genomes of other organisms. It is the ingenuity of gene function that is the secret to our complexity.
6. It is dynamic. In some cells, it reshuffles its own sequence to make novel variants of itself. Cells of the immune system generate 'antibodies' designed to attach themselves to attacking pathogens. But since pathogens are constantly evolving, they demand an evolving host. The genome reshuffles its genetic elements, thereby achieving astounding diversity. In these cells every genome is capable of giving rise to an entirely different genome.
7. Parts of it are surprisingly beautiful. On a vast stretch of chromosome 11, for instance, is a causeway dedicated entirely to the sensation of smell. Here a cluster of 155 closely related genes encodes a series of protein receptors that are professional smell sensors. Each receptor binds to a unique chemical structure, like a key to a lock, and generates a distinctive sensation of smell in the brain. An elaborate form of gene regulation ensures that only one odor receptor gene is chosen from this cluster and expressed in a single smell-sensing neuron in the nose, thereby enabling us to discriminate thousands of smells.
8. Genes, oddly, comprise only a minuscule fraction of it. 98% is just enormous stretches of DNA that are interspersed between genes (intergenic DNA) and within genes (introns). These long stretches encode no RNA and no protein; they exist either to regulate gene expression or for no reason that we understand.
9. It is encrusted with history. Embedded within it are peculiar fragments of DNA - some derived from ancient viruses - that were inserted into the genome in the distant past and have been carried passively for millennia since then. Some of these fragments were once capable of actively jumping between genes and organisms, but now they have been largely inactivated and silenced. These pieces are permanently tethered to our genome, unable to move out. These fragments are more common than genes, resulting in one more idiosyncrasy of our genome: much of the human genome is not particularly human.
10. It has repeated elements that appear frequently. A pesky, mysterious three-hundred-base-pair sequence called Alu appears and reappears millions of times, although its origin, function, or significance is unknown.
11. It has enormous 'gene families' - genes that resemble each other and perform similar functions - which often cluster together. 200 closely related genes, clustered in archipelagoes on certain chromosomes, encode members of the 'Hox' family, many of which play crucial roles in determining the fate, identity, and structure of the embryo, its segments, and its organs.
12. It contains thousands of 'pseudogenes' - genes that were once functional but have become nonfunctional. The carcasses of these inactivated genes are littered throughout its length, like fossils decaying on a beach.
13. It accommodates enough variation to make each one of us distinct, yet enough consistency to make each member of our species profoundly different from chimpanzees etc., whose genomes are 96% identical to ours.
14. Its first gene, on chromosome (C) 1, encodes a protein that senses smell in the nose. Its last gene, on C X, encodes a protein that modulates the interaction between cells of the immune system. The first and last are arbitrarily assigned. C 1 is labelled first because it is the longest.
15. The ends of the C are marked with 'telomeres'. Like the little bits of plastic at the ends of shoelaces, these sequences of DNA are designed to protect the C from fraying and degenerating.
16. Although we fully understand the genetic code - i.e. how the information in a single gene is used to build a protein - we comprehend virtually nothing of the genomic code - i.e. how multiple genes spread across the human genome coordinate gene expression in space and time to build, maintain, and repair a human organism. The genetic code is simple: DNA is used to build RNA, and RNA is used to build protein. A triplet of bases in DNA specifies one amino acid in the protein. The genomic code is complex: appended to a gene are sequences of DNA that carry information on when and where to express the gene. We do not know why certain genes are located in particular geographic locations in the genome, or how the tracts of DNA that lie between genes regulate and coordinate gene physiology. There are codes beyond codes.
17. It imprints and erases chemical marks on itself in response to alterations in its environment - thereby encoding a form of cellular 'memory'
18. It is inscrutable, vulnerable, resilient, adaptable, repetitive and unique
19. It is poised to evolve. It is littered with the debris of its past.
20. It is designed to survive
21. It resembles us.
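Several of the figures in this list can be cross-checked with simple arithmetic. The sketch below uses only the numbers quoted above (3.2 billion letters, 20,687 genes, "98%" non-gene DNA, and the rounded "12k" and "25k" gaps); the derived values are therefore approximations implied by these notes, not measured counts.

```python
GENOME_LETTERS = 3_200_000_000  # about 3.2 billion letters of DNA
HUMAN_GENES = 20_687

# Gene counts implied by the comparisons in point 4.
worm_genes = HUMAN_GENES - 1_796           # humans have 1,796 more than worms
corn_genes = HUMAN_GENES + 12_000          # humans have ~12k fewer than corn
rice_or_wheat_genes = HUMAN_GENES + 25_000 # humans have ~25k fewer than rice or wheat

# Point 8: if 98% is intergenic DNA and introns, only 2% is left for genes.
gene_letters = GENOME_LETTERS * 2 // 100

print(worm_genes)           # 18891
print(corn_genes)           # 32687
print(rice_or_wheat_genes)  # 45687
print(gene_letters)         # 64000000
```

So by these figures, roughly 64 million of the 3.2 billion letters carry the genes themselves, and a cereal grain carries about twice as many genes as a person.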

Part V
Through the looking glass
The Genetics of Identity and Normalcy
So, we's the Same
The function of the BRCA1 gene is not to cause breast cancer when mutated, but to repair DNA when normal. Hundreds of millions of women without a family history of breast cancer inherit this benign variant of the BRCA1 gene. Encoded in the DNA sequence are fundamental determinants of those mental capacities, such as learning, language and memory, that are essential to human culture. Encoded there as well are the mutations and variations that cause or increase the susceptibility to many diseases responsible for much human suffering.
Human genes are stored in C in the nucleus of the cell, but with one exception. Every cell possesses subcellular structures called mitochondria (M) that are used to generate energy. M have their own mini-genome, with only 37 genes, about 1/600 the number of genes on human C. (Some scientists propose that M originated from some ancient bacteria that invaded single-celled organisms. These bacteria formed a symbiotic alliance with the organism; they provided energy, but used the organism's cellular environment for nutrition, metabolism and self-defence. The genes lodged within M are left over from this ancient symbiotic relationship; indeed human M genes resemble bacterial genes more than human ones.) The M genome rarely recombines and is only present in a single copy. Mutations in mitochondrial genes are passed intact across generations, and they accumulate over time without crossing over, making the mitochondrial genome an ideal timekeeper. The results of the studies suggest -
1. The overall diversity of the human M genome is surprisingly small - less diverse than the corresponding genomes of chimpanzees. Modern humans, in other words, are substantially younger and more homogeneous than chimps. Calculating backwards, the age of humans was estimated to be around 200,000 years - a minor blip, a ticktock in the scale of evolution.
2. Humans appear to have emerged exclusively from a rather narrow slice of earth, somewhere in sub-Saharan Africa, about 100,000-200,000 years ago, and then migrated northward and eastward to populate the Middle East, Europe, Asia and the Americas. You get less and less variation the further you go from Africa. The oldest human populations - their genomes peppered with diverse and ancient variations - are the San tribes of South Africa, Namibia and Botswana, and the Mbuti Pygmies.
3. The genetic material of the embryo comes from two sources - maternal genes (egg) and paternal genes (sperm). But the cellular material of the embryo comes exclusively from the egg; the sperm is no more than a glorified delivery vehicle for male DNA - a genome equipped with a hyperactive tail. Apart from proteins, ribosomes, nutrients and membranes, the egg also supplies the embryo with the specialised structures called mitochondria. These M are the energy-producing factories of cells - they are so anatomically discrete and so specialised in their function that cell biologists call them "organelles" - i.e. mini-organs resident within cells. M carry a small independent genome that resides within the M itself - not in the cell's nucleus, where the 23 pairs of C (and 21k-odd human genes) can be found.
The exclusive female origin of all the M in an embryo has an important consequence. All humans, male or female, must have inherited their M from their mothers. If the child is female, her M are passed on to the next generation; if male, that mitochondrial lineage ends with him. If the founding population of a species is small enough, and if enough time has passed, the number of surviving maternal lineages will keep shrinking and shrinking, until only a few are left. For modern humans, the number has reached one: each of us can trace our mitochondrial lineage to a single human female who existed in Africa about 200,000 years ago. She is the common mother of our species. In human genetics she is known by a beautiful name - Mitochondrial Eve.
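The shrinking of maternal lineages can be illustrated with a toy simulation. This is a sketch under strong simplifying assumptions that are not from the book: the number of women stays constant, generations do not overlap, and every daughter picks her mother at random. The function name and parameters are made up for the example.

```python
import random


def surviving_lineages(founders, generations, seed=0):
    """Track how many founding maternal lineages are still present each generation."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    # Each founding mother starts her own lineage, labelled 0..founders-1.
    population = list(range(founders))
    counts = [len(set(population))]
    for _ in range(generations):
        # Every woman in the next generation inherits the lineage label
        # (the mitochondria) of a randomly chosen mother.
        population = [rng.choice(population) for _ in range(founders)]
        counts.append(len(set(population)))
    return counts


counts = surviving_lineages(founders=20, generations=200)
print(counts[0], counts[-1])  # starts at 20 and can only stay level or fall
```

A lineage that leaves no daughters in some generation is gone for good, so the count never goes back up; given enough time in a small population it tends toward a single surviving lineage, which is the logic behind Mitochondrial Eve.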

The First Derivative of Identity
Male C - Y; Female C - X; Male cells - XY and Female - XX. When a sperm carrying a Y C fertilises the egg, it results in an XY combination and maleness is determined; a sperm carrying an X C results in XX and a female. Which of the two it will be is purely random. The Y C is an inhospitable place for genes; unlike any other C, it has no sister C and no duplicate copy, leaving every gene to fend for itself. A mutation in any other gene can be repaired by copying the intact gene from the other C. But a Y C gene cannot be fixed, repaired or recopied from another C; it has no backup or guide (there is, however, a unique internal system to repair genes in the Y C). When the Y C is assailed by mutations, it lacks a mechanism to recover information. The Y is thus pockmarked with the potshots and scars of history. It is the most vulnerable spot in the human genome. As a consequence of this constant genetic bombardment, the human Y C began to jettison information millions of years ago. Genes that were truly valuable for survival were likely shuffled to other parts of the genome where they could be stored securely; genes with limited value were made obsolete, retired or replaced; only the most essential genes were retained. As information was lost, the Y C itself shrank - whittled down piece by piece by a mirthless cycle of mutation and gene loss. That the Y C is the smallest of all C is not a coincidence.
Although sex is one of the most complex of human traits, it is unlikely to be encoded by multiple genes. Rather, a single gene, buried rather precariously in the Y C, must be the master regulator of maleness. (Males barely made it.) Why was sexual reproduction invented? Sex was created to enable rapid genetic reassortment. No quicker way exists, perhaps, to mix genes from two organisms than by mixing their eggs and sperm. The powerful reassortment during sex increases variation. Variation in turn increases an organism's fitness and survival in the face of a constantly changing environment. The term "sexual reproduction" is a misnomer; "sexual recombination" is more accurate. Why do most mammals use the XY system for sex determination; why, in short, the Y? We do not know. Why the XY system was fixed in mammals and is still in use remains a mystery. Perhaps evolution stumbled on the Y C as a quick and dirty solution for sex determination - confining the male-determining gene in a separate C and putting a powerful gene in it to control maleness is certainly a workable solution. The SRY gene indubitably controls sex determination in an on/off manner. Turn SRY on and the animal becomes male; turn it off and it becomes female. To enable more profound aspects of gender determination and gender identity, SRY must act on dozens of targets - turning them on and off, activating some genes and repressing others, like a relay race that moves the baton from hand to hand.

The Last Mile
Descent of Man (book) is to biologists what War and Peace is to graduate students of literature.
Based on the analysis, gay men tended to have gay uncles, but only on the maternal side. Maternal cousins have higher rates of concordance. Linkage analysis pointed to a shared small stretch of the X C called Xq28. Somewhere near Xq28, then, was a gene that determined male sexual identity. But thus far no one has isolated an actual gene that influences sexual identity. Linkage analysis does not identify a gene itself; it only identifies a chromosomal region where a gene might be found. After nearly a decade of intensive hunting, what geneticists have found is not a "gay gene" but a few "gay locations".
Genes could influence diverse behaviours, impulses, personalities, desires and temperaments. Twin studies showed that empathy, altruism, sense of equity, love, music, economic behaviour, and even politics are partially hardwired.
Genes can describe the form or fate of a complex organism in likelihoods and probabilities - but they cannot accurately describe the form or fate itself. What causes the difference? Unsystematic, idiosyncratic, serendipitous events (illnesses, accidents, traumas etc.).
Complex 

The Hunger Winter
If the "self" is created through the chance interactions among events and genes, then how are these interactions really recorded ? Through what mechanism are these "acts of fate" registered within a cell or a body ?
Epigenetics-
The children and grandchildren of famine-starved individuals tended to develop metabolic illnesses, as if their genomes carried some memory of the famine. The interaction between genes and the environment had changed a phenotype. If a layer of information could be interposed on a genome, it would have unprecedented consequences. It would challenge an essential feature of classical Darwinian evolution. Conceptually, a key element of Darwinian theory is that genes do not - cannot - remember an organism's experience in a permanently heritable manner. Ex - When an antelope strains its neck to reach a tall tree, its genes do not record that effort, and its children are not born as giraffes. Rather, giraffes arise via spontaneous variation and natural selection; a tall-necked mutant appears in an ancestral tree-grazing animal, and during a period of famine, this mutant survives and is naturally selected. Evolution can craft perfectly adapted organisms, but not in an intentional manner. Its sole driver is survival and selection; its only memory is mutation. Hence the idea of "genetic memory" is counterintuitive.
Nuclear Transfer - evacuating the egg and inserting a fully fertilised nucleus.
Histones hang tightly to DNA and wrap it into coils and loops, forming scaffolds for the C. When the scaffolding changes, the activity of a gene changes - akin to altering the properties of a material by changing the way it is packed. A "molecular memory" could potentially be stamped on a gene - this time indirectly, by attaching the signal to proteins. The heritability and stability of these histone marks, and the mechanism to ensure that the marks appear in the right genes at the right time, are still under investigation - but simple organisms, such as yeast and worms, can seemingly transmit these histone marks across several generations.
The silencing and activation of genes via protein regulators (transcription factors) - the "master conductors" of the symphony of genes in cells - was established long ago. But these conductors can potentially recruit other proteins - call them helpers - to place permanent chemical imprints on genes. They even ensure that the tags are maintained on the genome. The tags can thus be added, erased, amplified, diminished, and toggled on and off in response to cues from a cell or from its environment. Every cell in an organism inherits the same book, but by scratching out particular sentences and appending others, by underlining and bolding, each cell could potentially write a unique novel from the same basic script.

a gene -------a Message -------a protein-------form/function---------a gene

The boldfaced, capitalised and underlined words are epigenetic marks appended to the genome to impose a final layer of meaning. Only embryonic cells have genomes that are pliant enough to acquire many different kinds of identities - and can thus generate all the cell types in the body. Once the cells of the embryo have taken up fixed identities - turned into intestinal cells or blood cells or nerve cells - there is rarely any going back. If you sequence the epigenomes of a pair of twins over the course of several decades, you find substantial differences: the pattern of methyl groups attached to the genomes of blood cells or neurons is virtually identical between the twins at the start of the experiment, begins to diverge slowly over the first decade, and becomes substantially different over fifty years. Chance events happen - injuries, infections, infatuations etc. Regulatory proteins turn genes "on" and "off" in response to these events, and epigenetic marks are gradually layered above genes. How these epigenetic marks functionally affect the activity of genes remains to be determined.

Introduction of 4 genes into mature skin cells caused a small fraction of the cells to transform into something resembling embryonic stem cells. One of the four genes used to reverse cellular fate is called c-myc. Myc, the rejuvenation factor, is no ordinary gene: it is one of the most forceful regulators of cell growth and metabolism known in biology. Activated abnormally, it can certainly coax an adult cell back into an embryo-like state. But Myc is also one of the most potent cancer-causing genes known in biology; it is also activated in leukaemias, lymphomas etc. The quest for eternity comes at a terrible collateral cost.
What happened in the Dutch Hongerwinter? Cell by cell, and organ by organ, the body was reprogrammed for survival. Ultimately, even the germ cells - sperm and egg - were marked (we do not know how, or why, sperm and egg carry the memory of a starvation response; perhaps ancient pathways in human DNA record starvation or deprivation in germ cells). The embryo carried these marks on to the grandchildren, resulting in alterations in metabolism that remained etched in their genomes decades after the Hongerwinter. Historical memory was thus transformed into cellular memory.

Seconds after fertilisation, a quickening begins in the embryo. Proteins reach into the nucleus of the cell and start flicking genetic switches on and off. A dormant spaceship comes to life. Genes are activated and repressed, and these genes, in turn, encode yet other proteins that activate and repress other genes. A single cell divides to form two, then four, then eight cells. An entire layer of cells forms, then hollows out into the outer skin of a ball. Genes that coordinate metabolism, mobility, cell fate, and identity fire "on". The boiler room warms up. The lights flicker on in the corridors. The intercom crackles alive.
Now, a second layer of information - instigated by master-regulator proteins - stirs to life to ensure that gene expression is locked into place in each cell, enabling each cell to acquire and fix an identity. Chemical marks are selectively added to certain genes and erased from others, modulating the expression of the genes in that cell alone. Methyl groups are inserted and erased, and histones are modified.
The embryo unfurls step by step. Primordial segments appear, and cells take their positions along various parts of the embryo. New genes are activated that command subroutines to grow limbs and organs, and more chemical marks are appended to the genomes of individual cells. Cells are added to create organs and structures - forelegs, hind legs, muscles, kidneys, bones, eyes. Some cells die programmed deaths. Genes that maintain metabolism and repair are turned on. An organism emerges from a cell.
The Circular Flow of Biological Information

GENES - encode - RNA - to build - PROTEINS - to form/regulate - ORGANISMS - that sense - ENVIRONMENTS - that influence - PROTEINS, RNA (and DNA) - that regulate - GENES

- This is, perhaps, one of the few organising rules in biology, the closest thing to a biological law.

Genes Past - we do not know where genes come from, or how they arose. Nor can we know why this method of information transfer and data storage was chosen over all other possible methods.

Miller attempted to brew a "primordial soup": he sealed a glass flask and blew methane, carbon dioxide, ammonia, oxygen and hydrogen into it through a series of vents. He added hot steam and rigged an electrical spark to simulate bolts of lightning, then heated and cooled the flask cyclically to recapitulate the volatile conditions of the ancient world. Fire and brimstone, heaven and hell, air and water, were condensed in a beaker. Three weeks later, no organism had crawled out of Miller's flask. But in the raw mixture Miller found traces of amino acids - the building units of proteins - and trace amounts of the simplest sugars. Subsequent variations of the Miller experiment have added clay, basalt, and volcanic rocks and produced the rudiments of lipids, fats, and even the chemical building blocks of RNA and DNA. Szostak believes that genes emerged out of this soup through a fortuitous meeting between two unlikely partners.
1st - Lipids created within the soup coalesced with each other to form micelles - hollow spherical membranes, somewhat akin to soap bubbles, that trap liquids inside and resemble the outer layers of cells (certain fats, mixed together in watery solution, tend to naturally coalesce into such bubbles). In lab experiments, Szostak has demonstrated that such micelles can behave like protocells: if you add more lipids to them, these "hollow" cells begin to grow in size. They expand, move about, and extend thin extrusions that resemble the ruffling membranes of cells. Eventually they divide, forming two micelles out of one.
2nd - While self-assembling micelles were being formed, chains of RNA arose from the joining together of nucleosides (A, C, G, U and their chemical ancestors) to form strands. The vast bulk of these RNA chains had no reproductive capability: they had no ability to make copies of themselves. But among the billions of non-replicating RNA molecules, one was accidentally created with the unique capacity to build an image of itself - or rather, to generate a copy using its mirror image (a capacity inherent in the chemistry of RNA and DNA). This RNA molecule, incredibly, possessed the capacity to gather nucleosides from a chemical mix and string them together to form a new RNA copy. It was a self-replicating chemical.
The next step was a marriage of convenience. Somewhere on earth - Szostak thinks it might have been on the edge of a pond or swamp - a self-copying RNA molecule collided with a self-replicating micelle. It was, conceptually speaking, an explosive affair: the two molecules met, fell in love and launched a long conjugal entanglement. The self-replicating RNA began to inhabit the dividing micelles. The micelles isolated and protected the RNA, enabling special chemical reactions in its secure bubble. The RNA molecule, in turn, began to encode information that was advantageous to the self-propagation not just of itself, but of the entire RNA-micelle unit. Over time, the information encoded in the RNA-micelle complex allowed it to propagate more such RNA-micelle complexes. Szostak wrote: "Metabolism could have arisen gradually as (the protocells learned to) synthesise nutrients internally from simpler and more abundant starting materials. Next, the organisms might have added protein synthesis to their bag of chemical tricks". RNA "proto-genes" may have learned to coax amino acids to form chains and thus build proteins - versatile molecular machines that could make metabolism, self-propagation, and information transfer vastly more efficient.
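
The idea of an RNA strand copying itself "using its mirror image" can be made concrete with a small sketch. The code below is only an illustration, not a model of real chemistry: it assumes the standard base pairing (A with U, C with G), ignores strand direction, and uses a made-up sequence and hypothetical function names. It shows that one round of pairing gives the mirror image, and a second round gives back the original.

```python
# Standard RNA base pairing: A pairs with U, C pairs with G.
RNA_PAIR = {"A": "U", "U": "A", "C": "G", "G": "C"}


def mirror_image(strand: str) -> str:
    """Return the complementary ("mirror image") strand, base by base.

    Simplification: strand direction (5' to 3') is ignored.
    """
    return "".join(RNA_PAIR[base] for base in strand)


def replicate(strand: str) -> str:
    """Copy a strand in two rounds: original -> mirror image -> copy."""
    return mirror_image(mirror_image(strand))


original = "AUGGCUUACG"  # hypothetical sequence, for illustration only
print(mirror_image(original))         # prints UACCGAAUGC
print(replicate(original) == original)  # prints True
```

The point of the sketch is that the mirror image is not the copy itself but the template from which the copy is made.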

When, and why, did discrete "genes" - modules of information - appear on a strand of RNA ? Did genes exist in their modular form at the very beginning, or was there an intermediate or alternative form of information storage ? Again, these questions are fundamentally unanswerable, but perhaps information theory can provide a crucial clue. The trouble with continuous, non-modular information is that it is notoriously hard to manage. It tends to diffuse; it tends to become corrupted; it tends to tangle, dilute and decay. Pulling one end causes the other to unspool. If information bleeds into information, it runs a much greater risk of distortion. The discontinuous nature of information would have carried an added benefit: a mutation could affect one gene, and only one gene, leaving the other genes unaffected. Mutations could now act on discrete modules of information rather than disrupting the function of the organism as a whole - thereby accelerating evolution. But the benefit came with a concomitant liability: too much mutation, and the information would be damaged or lost. What was needed, perhaps, was a backup copy - a mirror image to protect the original or to restore the prototype if damaged. Perhaps this was the ultimate impetus to create double-stranded nucleic acid. The data in one strand would be perfectly reflected in the other and could be used to restore anything damaged; the yin would protect the yang. Life thus invented its own hard drive. In time, this new copy - DNA - would become the master copy. DNA was an invention of the RNA world, but it soon overran RNA as a carrier of genes and became the dominant bearer of genetic information in living systems. (Some viruses still carry their genes in the form of RNA.) Yet another ancient myth - of the child consuming its father - is etched into the history of our genome.
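
The "backup copy" argument can be illustrated with a short sketch: if one strand of a double-stranded molecule is damaged, the lost bases can be read back from the partner strand. This is a simplified illustration under stated assumptions, not a description of real DNA repair enzymes: the `N` marker for an unreadable base, the `repair` function, and the sequences are all hypothetical, and strand direction is ignored.

```python
# Standard DNA base pairing: A pairs with T, C pairs with G.
DNA_PAIR = {"A": "T", "T": "A", "C": "G", "G": "C"}
DAMAGED = "N"  # hypothetical marker for a base that can no longer be read


def repair(damaged: str, backup: str) -> str:
    """Restore unreadable bases in one strand from its complementary strand.

    Assumes the backup strand is intact wherever the other strand is damaged.
    """
    if len(damaged) != len(backup):
        raise ValueError("strands must have the same length")
    return "".join(
        DNA_PAIR[partner] if base == DAMAGED else base
        for base, partner in zip(damaged, backup)
    )


original = "ATGCCGTA"  # hypothetical sequence, for illustration only
backup = "".join(DNA_PAIR[base] for base in original)  # the mirror strand
damaged = "ATNCCNTA"   # two bases lost from the original

print(repair(damaged, backup))  # prints ATGCCGTA
```

If both strands were damaged at the same position, the sketch would fail, which mirrors the liability noted above: too much mutation and the information is lost.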
















