Scientists cannot reliably recover DNA that is > 50,000 years old. To
investigate older sequences, researchers have to rely on computer programs
that infer ancestral sequences backwards from those of present-day animals
(computational genomics). The software takes substitutions, deletions and
insertions into account: when the evolution of a hypothetical portion of
ancestral mammalian DNA is simulated, the algorithm can use the sequences
of the descendants to recreate the original ancestor with 98% accuracy.
The algorithm has been used to work out a small region of the genome
(coding for 10 genes) of the common ancestor of 19 modern mammals,
including the pig, human and rat. This ancestor is thought to have been a
shrew-like animal that lived > 70 million years ago. Compared with it, the
human sequence has lost only 11% of bases, whereas in rodents around 39%
have been deleted, probably because rats and similar animals go through
generations more quickly and so accumulate mutations faster.
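The idea of inferring an ancestral base from its living descendants can be
illustrated with a far simpler method than the one described above. The
sketch below applies Fitch parsimony to a single alignment column. It is
not the algorithm used in the study: it ignores insertions and deletions,
and the tree shape and the observed bases are invented for illustration.

```python
def fitch(tree, leaf_base):
    """Return (candidate ancestral bases, minimum substitutions) for `tree`.

    `tree` is either a species name (a leaf) or a pair (left, right) of
    subtrees. `leaf_base` maps each species name to the base observed at
    one aligned site.
    """
    if isinstance(tree, str):
        return {leaf_base[tree]}, 0
    left, right = tree
    left_set, left_cost = fitch(left, leaf_base)
    right_set, right_cost = fitch(right, leaf_base)
    shared = left_set & right_set
    if shared:
        # Both subtrees can agree on a base: no extra substitution needed.
        return shared, left_cost + right_cost
    # No common base: keep all candidates and count one substitution.
    return left_set | right_set, left_cost + right_cost + 1


# Hypothetical topology and bases, for illustration only.
tree = (("human", "rat"), "pig")
site = {"human": "A", "rat": "G", "pig": "A"}
candidates, substitutions = fitch(tree, site)
print(candidates, substitutions)
```

Repeating this column by column gives a crude ancestral sequence; methods
that also model insertions and deletions, as the text describes, need
considerably more machinery.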
The degraded DNA of ancient cave
bears has been sequenced, despite the fact that many considered the
genetic information unrecoverable. The achievement leads researchers to
think they might be able to perform the same trick with DNA from ancient
human relatives, such as the Neanderthals. In the past, scientists have
managed to retrieve genetic material for analysis from animals or humans
that died in icy or desert environments, because these allow for good preservation.
But the remains of animals and humans are mostly found in caves, and are
heavily decomposed. The DNA from such specimens is usually mixed up with
DNA from soil microbes and later cave inhabitants, making it difficult
to sequence. The standard practice for sequencing genes involves making
numerous copies of the initial sample through PCR. Subjecting ancient DNA
to this does not produce good results because PCR picks up and duplicates
the sequences of modern animals more efficiently. This means that bits
of contaminating DNA often drown out samples from the prehistoric animal.
To overcome this challenge, Noonan and his colleagues decided to skip the
replicating step and directly sequence the tiny amount of DNA extracted
from 2 Austrian cave-bear bones that are > 40,000 years old. To make sure
each portion of DNA was really from the bears rather than a contaminating
source, they compared each sequence produced with the genome of the dog,
a modern relative of the bear. The technologies needed to examine such
tiny amounts of DNA directly, along with the reference genome from the
dog, have become available to scientists only recently. Nearly 6% of the
sequences analysed from one of their animal samples belonged to ancient
bear: an unexpectedly large amount. The rest of the DNA probably came from
soil microbes or the palaeontologists handling the bones. The same technique
should work on Neanderthal samples of about the same age or younger. But
challenges remain. Most importantly, it will be much harder to weed out
contaminating DNA from the people who excavated the Neanderthal samples,
as both sets of DNA will come from humans.
Ensembl : a joint project between
EMBL-EBI and the Sanger Centre to develop a software system which produces
and maintains automatic annotation on eukaryotic genomes.
Entrez at NCBI : whole-genome information on over 600 organisms
Human chromosomes : G-banding, diagram and R-banding
Denver classification : a former classification of human chromosomes
on the basis of size and centromere position, adopted by geneticists in
Denver in 1960. The 23 pairs of chromosomes are arranged into 7 groups,
labeled A to G, in the order of decreasing length
Chicago classification : the classification of human chromosomes
adopted by geneticists at Chicago in 1966 for the identification of chromosomal
bands and regions and for the location of structural chromosomal abnormalities
Paris classification : a modification made in Paris in 1971 of the
Chicago classification of human chromosomes, providing more detailed cytogenetic
information
When Jonathan Rothberg's son was born in 1999, the baby was sent to the
infant intensive care unit. Rothberg worried all night that something might
be wrong with his child, and he found himself wishing he could just read
the boy's genome to find out. At the time that was impossible: it cost
tens of millions of dollars and took more than a decade to decipher the
first complete human genome.
But Rothberg's parental panic and frustration inspired him to design a
faster, cheaper sequencing technique. In 2005 Rothberg and his co-workers
at 454 Life Sciences Corporation, the company he founded in Branford,
Connecticut, reported a new method that reads genomes about 100 times
faster than the technology then in use, which is based on the Sanger
method. Machines based on the Sanger method typically read 67,000 bases
per hour; Rothberg's method can decipher > 6 million bases in the same
time. The 454 method is quick thanks to automation:
the entire process from the initial multiplication of segments of DNA through
to their sequencing is done using microfluidic technologies. It also analyses
thousands of DNA molecules simultaneously. In contrast, Sanger sequencing
takes many steps, and technicians are required to move the DNA from one
stage to the next. Using the 454 technique, one person using one machine
could easily sequence the 3 billion base pairs in the human genome in a
hundred days. As the process gets faster, it gets less expensive. Costs
are expected to fall much further: in the next few years scientists should
be able to assemble a human genome for US$10,000. That
could bring about the long-touted 'era of personalized medicine', where
drugs are tailored to an individual's DNA. Several sequencing centres have
already bought machines made by 454 Life Sciences. The company reported in
2003 that the technology had been used to sequence the genome of an
adenovirus in a single day. It was also used to sequence Mycobacterium
tuberculosis in a recent push to discover drugs against that disease. The
company has even offered to sequence the genome of James Watson, one of the
discoverers of the helical structure of DNA. Watson says he gave some of
his blood to the company
in spring 2005. The machines avoid some of the pitfalls of a bacterial
cloning process that is part of the Sanger method. Certain pieces of DNA
don't grow well in bacterial colonies, for example. But the 454 machines
are prone to their own sorts of errors. The machines have trouble accurately
reading long repeats of single base pairs, for example. The machines are
best used to analyse short genomes or short sequences. At least in the
short term, they won't replace the existing instruments, but they will
provide very significant additional capacity for specific kinds of
projects.
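The throughput figures quoted above can be checked with a few lines of
arithmetic. The sketch below uses only the numbers given in the text and
assumes a machine that runs non-stop. The gap between the time for a
single pass and the "hundred days" quoted is consistent with each base
being read several times, which is an interpretation rather than something
the text states.

```python
GENOME_BASES = 3_000_000_000   # human genome size quoted in the text
SANGER_RATE = 67_000           # bases per hour, from the text
ROTHBERG_RATE = 6_000_000      # bases per hour, lower bound from the text


def days_for_one_pass(genome_bases, bases_per_hour):
    """Days of non-stop running needed to read every base exactly once."""
    return genome_bases / bases_per_hour / 24


speedup = ROTHBERG_RATE / SANGER_RATE
sanger_days = days_for_one_pass(GENOME_BASES, SANGER_RATE)
new_days = days_for_one_pass(GENOME_BASES, ROTHBERG_RATE)
passes_in_100_days = 100 / new_days

print(f"speed-up over Sanger: about {speedup:.0f}x")
print(f"one pass, Sanger machine: about {sanger_days:.0f} days")
print(f"one pass, new method: about {new_days:.0f} days")
print(f"passes possible in 100 days: about {passes_in_100_days:.1f}")
```

On these figures the speed-up is a little under the rounded "100 times",
and a hundred days leaves room for several passes over the genome.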
We have the modern human genome. Now researchers are set to sequence
the DNA of our extinct cousins: Neanderthal man. The Max Planck Institute
for Evolutionary Anthropology in Leipzig, Germany, in collaboration with
454 Life Sciences Corporation, in Branford, Connecticut, has announced
a plan to have a first draft of the Homo neanderthalensis genome
within 2 years. Comparing the result to modern human and other primate
genomes should help to clarify the evolutionary relationship between humans
and Neanderthals. It may also illuminate the genetic changes that enabled
humans to leave Africa and rapidly spread around the world around 100,000
years ago. The chimpanzee (Pan troglodytes) has already been sequenced
and stands ready to be compared to Neanderthals. The US National Human
Genome Research Institute (NHGRI) has set a goal of sequencing at least
one genome from each of the major positions along the primate evolutionary
tree, including the rhesus macaque, orangutan, marmoset, northern
white-cheeked gibbon and gorilla. The announcement comes as scientists
gather in Bonn, Germany, this week to mark the 150th anniversary of the
discovery of Neanderthal man — made in Germany's Neander Valley. During
21-26 July, experts will debate all aspects of Neanderthal life, from how
they migrated across Europe to what effect climate may have had on their
evolution. They will also debate how to find more and better samples to
work with. Getting clean genetic material out of such ancient bones is
a challenging task. The DNA of the bacteria and fungi that degrade a body
after it dies tends to get mixed up with the DNA of the host. And what
hominin DNA does survive is usually broken up into small bits over time.
But there are ways to reduce these problems, including using skeletons
left by cannibalistic societies, where no flesh was left on the bones
for bacteria to eat. The dream of finding Neanderthal DNA dates from the
early 1980s, but contamination proved a difficult problem. Researchers now
have new technologies and fossils free of contamination. The Leipzig team
has already sequenced about 1 million base pairs of nuclear Neanderthal DNA
from a 38,000-year-old Croatian fossil. That success was reported by Svante
Pääbo, director of the Institute's department of evolutionary genetics,
at a meeting at Cold Spring Harbor Laboratory in New York this May. But
they have a long way to go; the entire genome is thought to be 3 billion
letters long.
In addition to Pääbo's work with the Croatian fossil, there have
been successes with mitochondrial DNA — a portion of the genome that tends
to be better preserved, but which makes up only a tiny fraction of the
entire sequence and is passed down only through the female line. Almost
10 years ago, Pääbo succeeded in sequencing Neanderthal mitochondrial
DNA. More recently, such DNA was extracted from a 100,000-year-old Neanderthal
fossil found in Belgium. But a map of the nuclear DNA will prove the real
prize, revealing much more about the Neanderthal genetic make-up. The project
will extract the nuclear DNA from bones or teeth from both the first Neanderthal
specimen ever discovered, and some additional bones found in Croatia.
They may not always have enjoyed the most cordial of relations, but
English and German people have more in common than they might think. An
analysis of the genetic make-up of today's British population suggests
that almost all English people are descended from Saxon invaders who became
masters of a two-tier society that battered indigenous Brits into submission.
The analysis lends weight to the theory that the Anglo-Saxon invaders,
although relatively few in number, managed to take over almost the entire
country by setting up a system of social segregation similar to apartheid
in South Africa, in which the established locals were made second-class
citizens. The idea that modern English are of German descent is not new.
Previous genetic studies have suggested that more than 50% of English Y
chromosomes (the chromosome passed on unchanged from father to son) are
all but identical to those of German and Danish natives. But there has
been a problem in explaining how the Anglo-Saxons managed to breed so successfully
in Britain in the 300 years or so after their invasion in the fifth century
AD. Simple mathematical analyses suggest that this level of breeding would
have required an invading party more than half-a-million strong to make
an impression on the estimated two million Britons living on the island
at the time. Archaeologists argued that there is no evidence of such a
mass influx of foreigners. This paradox disappears, however, if you consider
a society in which the invaders muscled their way to the top of society,
where they could breed more successfully, say the British researchers behind
the new study. According to their computer model, somewhere between 10,000
and 200,000 invaders would have been needed to make their mark on the population.
The Anglo-Saxons may have forced indigenous Britons into servitude, while
enjoying superior wealth, health and breeding potential.
If the invading men were 1.8 times more likely than the locals to reproduce
successfully, the researchers note, it would take only five generations,
or 175 years, for the Germanic Y chromosome to exceed 50% prevalence in
the population as a whole. The study examined only the spread of the male
lineage: sons fathered by Saxon men but born to native British women would
therefore count as a spread of the Germanic line. But Thomas is willing to
bet that the maternal line would show the same pattern, although he thinks
it may not be quite so starkly defined. An apartheid-like
system is the explanation that fits best with sociological evidence, Thomas
argues. Historical records of the law of the time, for example, suggest
that the fines payable to the family of a murdered Anglo-Saxon were far
higher than those for a dead Briton. There could conceivably have been
wholesale slaughter or wholesale rape, but he regards those explanations
as the stuff of films. Obvious signs of the invasion persist today: the
English language is Germanic. How did the German marauders manage
to attain such a lofty status, given that they were in the minority? They
were trained invaders, and the British had been hammered by
the Romans for years. There are, however, corners of Britain that seem
to have remained resolutely British. Thomas and his colleagues point out
that, although most English and German Y chromosomes bear a strong similarity,
both are markedly different from those of Welsh people today. The differences
still persist. Even if not enshrined into law, people from different groups
often tend not to interbreed.
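The claim that a 1.8-fold reproductive advantage can carry a minority
Y chromosome past 50% within a few generations can be explored with a toy
calculation. The sketch below is not the researchers' computer model. It
assumes a single well-mixed population, a constant advantage and no further
migration, and the function names are invented for illustration. The
starting numbers combine the estimated two million Britons with invader
counts from the range quoted above.

```python
def next_fraction(fraction, advantage):
    """Invader Y-chromosome fraction one generation later.

    Carriers leave `advantage` times as many sons as non-carriers.
    """
    weighted = advantage * fraction
    return weighted / (weighted + (1.0 - fraction))


def generations_to_majority(invaders, natives, advantage=1.8, limit=1000):
    """Generations until the invader Y chromosome exceeds 50% prevalence."""
    fraction = invaders / (invaders + natives)
    generations = 0
    while fraction <= 0.5:
        if generations >= limit:
            raise ValueError("no majority reached within the limit")
        fraction = next_fraction(fraction, advantage)
        generations += 1
    return generations


NATIVES = 2_000_000  # estimated Britons at the time, from the text
for invaders in (10_000, 110_000, 200_000):
    needed = generations_to_majority(invaders, NATIVES)
    print(f"{invaders:>7} invaders: {needed} generations")
```

In this toy model the quoted five generations corresponds to roughly
110,000 invaders; the quoted 175 years then implies about 35 years per
generation. Smaller founding groups need more generations, larger ones
fewer.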
Composition :
The Ensembl software currently contains 24,847 cloned genes, but this
number will eventually rise above 30,000. A 2004 comparison of human genome
sequences produced using different approaches by the International Human
Genome Sequencing Consortium and the private firm Celera reveals that whole
genome shotgun sequencing (WGS), as used by Celera, does very well for 95%
of euchromatin, but falls down in the very large duplications that are
greater than 98% identical and longer than 20 kb in size. Breaking up the
sequencing of whole genomes into two phases (WGS followed by
clone-order-based sequencing in bacterial artificial chromosomes) is the
way forward. The current estimate is 20,000-25,000 genes, a drop from the
30,000-40,000 estimated in 2001.
45% of the human genome consists of remnants of previous transposon/virus
invasions and elements that are still active to date
21% : LINEs
13% : SINEs
8% : retroviruses
3% : DNA transposons
< 2% encodes (nontransposon) proteins
Recombination hotspots (up to 4-fold increases in frequency) are
a ubiquitous feature of the human genome, occurring on average less than
200 kbp apart, and recombination occurs preferentially outside genes. For
chromosome 20, about 50% of recombination occurs in < 10% of the total
sequence. In addition, for the HLA region, 80% of all recombination
apparently occurs in < 10% of the total sequence. Current data suggest that
about 20% of all the hotspots occur within 6% of the MHC.
About 10% of X genes belong to a family (the 'testis-antigen genes') that
has been linked to cancer. These genes are promising targets for potential
therapies, because they are only expressed in cancer and in the male reproductive
organs. Therapies that knock out tissues expressing the testis-antigen
genes should leave patients' other organs intact.
> 480 ultraconserved regions that are 100% identical across 3 species
(man, mouse and rat) lie in junk DNA. That is a surprising similarity:
gene sequences in mouse and man for example are on average only 85% similar.
The regions largely match up with chicken, dog and fish sequences too,
but are absent from sea squirt and fruitflies. The fact that the sections
have changed so little in the 400 million years of evolution since fish
and humans shared a common ancestor implies that they are essential to
the descendants of these organisms. The most likely scenario is that they
control the activity of indispensable genes. Nearly 25% of the sequences
overlap with genes and may be transcribed into RNA : the sequences may
help slice and splice RNA into different forms. Another set may control
embryo growth, which follows a remarkably similar course in animals ranging
from fish to humans. One previously identified ultraconserved element,
for example, is known to direct a gene involved in the growth of the brain
and limbs. Figuring out what the mystery segments do will be difficult.
There are few similarities between one region and another, so these cannot
be used to provide clues to their function. One laborious technique will
be to genetically engineer mice that lack one segment and see how that
affects their growth and behaviour.
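Finding stretches that are 100% identical across several species is easy
to state in code once the sequences have been aligned. The sketch below is
a minimal illustration, not the pipeline used in the study. It assumes the
sequences are already aligned to equal length with "-" marking gaps, and
the three short sequences and the length threshold are invented.

```python
def conserved_runs(aligned, min_length):
    """Return (start, end) half-open column ranges identical in all rows.

    `aligned` is a list of equal-length aligned sequences; columns that
    contain a gap ("-") never count as conserved.
    """
    length = len(aligned[0])
    if any(len(seq) != length for seq in aligned):
        raise ValueError("aligned sequences must have equal length")
    runs = []
    start = None
    # Iterate one column past the end so a final run is always closed.
    for i in range(length + 1):
        identical = (
            i < length
            and aligned[0][i] != "-"
            and all(seq[i] == aligned[0][i] for seq in aligned)
        )
        if identical and start is None:
            start = i
        elif not identical and start is not None:
            if i - start >= min_length:
                runs.append((start, i))
            start = None
    return runs


# Toy alignment, for illustration only.
human = "ACGTACGTTT"
mouse = "ACGTACCTTT"
rat = "ACGTACGTTA"
print(conserved_runs([human, mouse, rat], min_length=5))
```

Raising `min_length` is what separates genuinely unusual conservation from
the short identical runs that arise by chance between related genomes.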
A map has been unveiled that shows the pattern of genetic variation
among people descended from populations all over the globe. The information
should be a valuable resource for researchers hoping to tailor medicines
to individual patients based on their genes. To construct the map, the
distribution of SNPs (single nucleotide polymorphisms) at almost 1.6
million sites in DNA samples from 71 US volunteers descended from European,
African and Chinese populations was catalogued.
Because they are easy to detect, researchers can use SNPs to help locate
the genes involved in disease, especially conditions such as Alzheimer's
and Parkinson's where individual genes make only a small contribution.
SNPs (and genes) that are close together in the genome are likely to be
inherited together, so if a group of people with a particular condition
shares a group of neighbouring SNPs, the odds are that a gene close by
is to blame. The snag is that it would be too time-consuming and expensive
to sequence millions of SNPs for every volunteer in an epidemiological
study. So Cox's project provides data that can be used to select a much
smaller number of SNPs for epidemiologists to use that are representative
of different inherited patterns of variation. The map is the first of its
kind. But it will soon be complemented by a second study, called the International
HapMap Project, which is cataloguing around 1 million SNPs in 270 people.
As well as being used to identify the genes involved in disease, SNPs could
also help show why some people respond better than others to certain drugs.
The knowledge could even lead to specific treatments tailored to a patient's
individual genetic makeup. Most of the SNPs that Cox and his team studied
are common to all 3 populations. Around 94% were found in the African Americans,
81% occurred in the European Americans, and 74% were detected in those
of Chinese ancestry. That matches the findings of previous studies in which
African populations were found to be the most genetically diverse. The
researchers are at pains to stress, however, that their work is not about
defining races by their genetic differences. Differences at each of the
SNP sites were found within the different populations, as well as between
them : there are no discrete genetic boundaries; there are gradients from
one end of the Earth to the other.
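The observation that neighbouring SNPs tend to be inherited together is
usually quantified as linkage disequilibrium. The sketch below computes
r-squared, a standard measure of it, for two SNPs. The text does not say
which statistic the map's authors used to pick representative SNPs, so
this is an illustration only, and the haplotypes are invented.

```python
def r_squared(haplotypes):
    """Linkage disequilibrium (r^2) between two biallelic SNPs.

    `haplotypes` is a list of (a, b) pairs, one per chromosome, where each
    value is 1 if the chromosome carries the minor allele at that SNP and
    0 otherwise.
    """
    n = len(haplotypes)
    p_a = sum(a for a, _ in haplotypes) / n
    p_b = sum(b for _, b in haplotypes) / n
    p_ab = sum(1 for a, b in haplotypes if a == 1 and b == 1) / n
    denominator = p_a * (1 - p_a) * p_b * (1 - p_b)
    if denominator == 0:
        raise ValueError("both SNPs must vary in the sample")
    d = p_ab - p_a * p_b  # departure from independent inheritance
    return d * d / denominator


# Invented haplotypes, for illustration only.
always_together = [(1, 1), (1, 1), (0, 0), (0, 0)]
independent = [(1, 1), (1, 0), (0, 1), (0, 0)]
print(r_squared(always_together), r_squared(independent))
```

When r-squared between two SNPs is close to 1, typing one of them says
almost everything about the other, which is why a much smaller
representative set can stand in for millions of sites.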
H-Invitational Database : over 5000 genes present in this collection are
not represented in RefSeq (up to 4% of the sequences in RefSeq may be
missing or incorrect), while nearly 3500 curated RefSeq mRNAs are not
captured in this collection
Gene map of human mitochondrial DNA, showing the genes transcribed from
the heavy (H) and light (L) strands, transcription proceeding counterclockwise
on H and clockwise on L. OriH, OriL, origins of replication; ND1–6, NADH
dehydrogenase subunits; Cyt b, cytochrome b; COI–III, cytochrome oxidase
subunits; tRNA, transfer RNA; rRNA, ribosomal RNA.
Model organisms
Web resources :
XREFdb at NCBI : cross-referencing the genetics of model organisms with
mammalian phenotypes
Animalia
Vertebrates
chimpanzee (Pan troglodytes) : our closest relative; 3.1 billion bases.
In Aug 2005 geneticists finished reading one of the
most important volumes in the library of life: the DNA of the chimpanzee.
Decoding the sequence of our comrade in apehood may help to answer the
age-old question of what makes us human. The US-led Chimpanzee Sequencing
and Analysis Consortium has already begun making such comparisons.
By lining the Pan troglodytes sequence up against the human genome,
it has spotted 6 areas of our own DNA that have been rigorously sculpted
by natural selection. The areas include one that contains a gene known
to be crucial for that most human of traits, speech. The chimpanzee consortium
assembled its sequence using the now de rigueur method of whole-genome shotgun
sequencing. The chimp genome cost no more than an estimated US$50 million.
Some 98% of the data came from blood samples from a single common chimpanzee,
called Clint, who lived at the Yerkes National Primate Research Center in
Atlanta, Georgia. Clint died of heart failure in Jan 2005 at the tender age
of 24; most chimps live into their 50s. So what does Clint's DNA actually
tell us? For a start, humans
and chimps are not quite the close cousins we thought. Crude past comparisons
of our DNA showed that our sequences were between 98.5% and 99% identical.
That is indeed the case when considering single-letter differences in the
DNA code, of which there are 35 million, adding up to about 1.2% of the
total sequence. But there are other differences. The 2 sequences are littered
with duplicated segments that are scattered in different ways in the 2
species.
These regions add another 2.7% of difference to the tally, so the 1.2%
figure substantially understates the divergence. Much of the difference is
seen in genes involved in the immune system. The contrast suggests that
humans and chimps came up against different diseases during their
evolutionary history.
The most fertile grounds for human gene duplications are regions near the
ends of chromosomes called subtelomeres.
These areas are still poorly understood and could tell us more
about our own evolution. But in terms of what makes us human, the most
exciting areas of our genome are six regions, containing a few hundred
genes, that show very little variation from human to human, but more variation
in chimps. This implies they were important in our evolution. Enticingly,
one of these regions is home to a gene called FOXP2, which is crucial for
producing coherent speech.
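The divergence figures in this entry can be put side by side with a short
calculation. The sketch below uses only numbers quoted above. Dividing the
35 million single-letter differences by the full 3.1 billion bases gives a
slightly lower percentage than the quoted 1.2%; one possible reason, which
is an assumption here, is that the quoted figure is based only on the part
of the genomes that could be aligned.

```python
SINGLE_LETTER_DIFFERENCES = 35_000_000  # from the text
CHIMP_GENOME_BASES = 3_100_000_000      # from the text
SUBSTITUTION_PERCENT = 1.2              # quoted share of the sequence
DUPLICATION_PERCENT = 2.7               # extra difference from duplications

# Share of the whole genome made up of single-letter differences.
naive_percent = 100 * SINGLE_LETTER_DIFFERENCES / CHIMP_GENOME_BASES

# Total difference once duplicated segments are added to the tally.
total_percent = SUBSTITUTION_PERCENT + DUPLICATION_PERCENT
identity_percent = 100 - total_percent

print(f"single-letter differences / genome: {naive_percent:.2f}%")
print(f"total difference: {total_percent:.1f}%")
print(f"overall identity: {identity_percent:.1f}%")
```

On this tally the two genomes are roughly 96% identical rather than the
98.5% to 99% suggested by earlier comparisons.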
See also an evolutionary
comparison between human and chimpanzee genomes.
cow (Bos taurus) : 3 billion bp; the first completed genetic map of the
Hereford breed of
cattle was achieved in October 2004 by researchers at the Baylor College
of Medicine's Human Genome Sequencing Center in Houston, Texas and will
be followed by work on gene sequencing of a half dozen other breeds.
South American gray short-tailed opossum (Monodelphis domestica) : the
first marsupial to have its genome sequenced. Opossums and humans diverged
from a common ancestor around 130 million years ago - roughly 55 million
years before mice went their separate way, but 200 million years after
birds branched out. Opossums are born after a gestation period of just
12 days. The blind, bald babies crawl into their mother's protective pouch
to complete their development. This makes them readily accessible for medical
research. Opossums also provide a good model of melanoma,
the most deadly form of skin cancer. They are the only laboratory animal
known to develop melanoma after exposure to UV radiation alone.
mouse (Mus musculus) (Q = 2.5 Gb; 30,000 genes, 99% of which have a human
homologue, 96% lying in regions that are syntenic with human chromosomes)
Some think that the (not so surprising) complex haplotype structure
of the mouse genome will hinder the process of gene identification for
quantitative traits. However, the structure of genetic variation in the
mouse is problematic only for a limited and controversial approach to QTL
(quantitative trait locus) analysis that is based on exploiting haplotype
structure (e.g., Grupe et al.).
This approach is based on the assumption that the mouse genome can be represented
as large contiguous blocks with common ancestry. A highly fragmented ancestral
structure of the common mouse strains would make this approach untenable.
However, the most widely used methods of QTL analysis
make no such assumptions and remain viable. Indeed, the availability of
genomic sequence data and functional annotation has greatly facilitated
the process of identifying the genes responsible for quantitative variation
in the mouse.
There is every reason to believe that the current pace of discovery will
only accelerate. While the image of "amateur scientists" scurrying after
mice in the wild is amusing, it is hardly accurate. The first inbred strain,
DBA, was produced by C.C. Little to determine whether cancer was inherited.
Thus, inbred mouse lines were developed by highly respected
scientists who were actively pursuing genetic hypotheses. It must be
conceded, however, that the currently available stocks of inbred mice do
not represent an ideal resource for genetic studies of complex trait
inheritance. There is a community of like-minded scientists within the
field of mouse genetics who are working to improve this situation. One
proposal to develop an entirely new set of inbred lines that are
specifically designed for genetic studies has been described in a recent
news article.
The findings of Yalcin et al. represent an important advance in our understanding
of the ancestral structure of the mouse genome. However, the implications
for the utility of the mouse as a tool for genetic research are not at
all bad. The mouse remains our best hope for unraveling the genetic mechanisms
of the common, complex diseases that represent the greatest burden to human
health.
Web resources :
Mouse Genome Informatics (MGI) at the Jackson lab (Mouse Genome Database
(MGD), Gene Expression Database (GXD) and the Encyclopedia of the Mouse
Genome)
Wellcome Trust's Mutagenic Insertion and Chromosome Engineering Resource
(MICER) project,
a publicly available supply of 93,960 readymade insertional targeting vectors
in 2 libraries for generating knockouts and for large-scale deletions,
inversions, or duplications. Instead of having to screen a library for
a particular gene or region of interest and then find a bacterial artificial
chromosome clone or a genomic clone, the clone can be found online at the
MICER Web site, eliminating the effort that otherwise would be needed in
generating those resources. Each vector also carries a positive
selection marker (either neomycin or puromycin resistance) and
a coat color marker (either tyrosinase, which turns a white mouse black,
or agouti, which turns a brown mouse yellow). The coat color marker tracks
the mutations from one generation to the next and avoids the need for a
biopsy, a PCR reaction or a Southern blot. Although the vectors cover only
about 7 to 11% of all the genes in the genome, low compared with the
estimated 32% genome coverage of the Sanger Institute Gene Trap Resource,
the two are complementary approaches that will ultimately help towards the
aim of identifying every gene in the genome.
brown Norway rat (Rattus norvegicus) : it has about 25,000 genes.
Around 90% of these have matches in the mouse and man. Around 10% of the
rat's genes are both shared with the mouse and absent in humans, including
some that code for smell-related proteins : this may explain rodents' exceptional
sense of smell. Rats also have more genes for breaking down toxins than
man : this means that rats may be better at removing toxins from their
bodies than humans. It may be more difficult than we'd thought to use the
toxicity of drugs in rats as a guide to their toxicity in humans. Rats
are still commonly used in such toxicity tests, though researchers are
increasingly using tissue cultures instead of animals. Comparisons also
suggest that rats evolved 3 times faster than humans, since the rat genome
is much more diverse than our own. The mutations
that create this diversity probably occurred at random, then stuck around
because they gave the rat some evolutionary advantage. These are examples
of where the rat has taken advantage of an evolutionary niche and specialized.
The rat's genetic diversity may have enabled it to colonize a wide range
of habitats all over the world. The rat genome may be more diverse than
the human one simply because rat genes mutate more quickly than our own
or, more likely, because the rat's shorter life span means it has had a
greater number of generations in which to select genetic changes.
Web resources :
Rat Genome Database (RGD) at Medical College of Wisconsin
dog (Canis familiaris) : 2.4 Gb and 39 pairs of chromosomes; 18,473 dog
genes have human equivalents.
This already surpasses the 18,311 known from the mouse sequence. Different
breeds are more than 99% identical. There are > 350 known genetic dog diseases,
surpassing all animals save humans. About 10% of Irish setters, for example,
carry a gene for an immune disease. German shepherds commonly suffer hip
displacement. Golden retrievers often develop cancer of the white blood
cells. DNA testing has allowed breeders to avoid mating carriers. Knowing
that your dog has a genetic predisposition to a particular condition means
you can take preventative measures : mutts that are part German shepherd
could be given a diet that promotes muscular development. Owners of cancer-prone
dogs could keep watch for early symptoms. The test may also help owners
of Staffordshire bull terriers : these dogs are commonly mistaken for more
aggressive pit bull terriers, which means that, in the UK, owners can find
themselves on the wrong side of the Dangerous Dogs Act. This 1991 legislation
makes it illegal to own a pit bull without special dispensation from a
court. Over 400 different breeds have been described, of which 152 are
recognized by the American Kennel Club (AKC).
Although most breeds emerged in Europe during the past several hundred
years, some came from Asia or Africa at least 2000 years ago : the saggy-faced
Shar-Pei and the pint-sized Pekingese were among the early developers.
Surprisingly, some breeds thought to have originated long ago turn out
to be fresh-faced impostors. The Pharaoh and Ibizan hounds of today were
thought to be direct descendants of ancient Egyptian dogs, immortalized
in stone on tomb walls over 5,000 years ago. Not so. Both hounds have been
recently created from combinations of other breeds. The study also reveals
that, based on their genetics, modern dogs fall into three categories,
loosely termed herders, hunters and guarders. Herders include collies and
sheepdogs; hunters contain hounds and terriers; and guard dogs boast mastiffs
and bulldogs. The groups probably arose in the 1800s when Europeans first
established breed clubs.
In Dec 2005 the full genetic code of a 12-year-old boxer named Tasha
was published. Canis familiaris has a unique genetic background,
thanks to us. All domestic dogs are descended from grey wolves (C. lupus)
that were tamed about 15,000 years ago. Over time, people have bred dogs
to look and act in specific ways: think of the smush-faced pug or the friendly
golden retriever. We have created > 400 dog breeds, each with its own traits,
and its own genetic code. So it should be a lot easier to pin down the
genetic roots of traits in dogs than in people, whose characteristics and
genetic groupings are much less clear cut. This gives us a blueprint for
how complex traits evolved in all breeds of dogs. Scientists had already
made one stab at a dog genome, assembling 75% of the genetic code of a
poodle named Shadow in 2003.
Adding Tasha's complete code to the mix will make it easier to find the
causes of genetic diseases, such as cancer, that affect both dogs and people.
Dogs are less closely related to humans than other mammals, such as chimps,
that have been completely sequenced. So scientists can use the dog genome
to test their assumptions about the way mammals evolved. Tasha's genome
has already helped them to pinpoint a group of DNA sequences that do not
code for specific genes, but are extremely similar among mice, humans and
dogsref.
The fact that these sequences are the same in all three animals indicates
they could be crucial switches that control the activity of genes, the
authors say. Working out what such 'non-coding' regions do is one of the
most intriguing questions facing
genomicists. These signals that decide when a gene will be turned on or
off are extremely important : we're looking at the tip of the iceberg now,
but when we get ten or twenty mammals we'll be able to crystallize this
even further.
Web resources : Fred Hutchinson Cancer Research Center (FHCRC) : Dog genome
project (genetic markers and maps)
chicken (Gallus gallus) is the first bird to join this prestigious roster.
Mammals and their ancestors (synapsids) split from the reptilia (including
diapsids, the ancestors of birds, lizards and snakes) between 310 million
and 350 million years ago. Birds evolve as a branch of the theropod dinosaurs
within the coelurosauria, a group that includes Tyrannosaurus rex. The
earliest bird fossils known are those of Archaeopteryx (see artist's impression),
which lived in the late Jurassic period. Feathers and possibly flight appear
to have evolved already in the dinosaurs that give rise to the birdsref.
Between 8 million and 9 million years ago, the jungle fowl genus (Gallus),
to which chickens belong, evolves among the land fowl, the Galliformes.
Darwin first proposes, in 1868, that modern chickens derive from the red
jungle fowl (G. gallus) because this species can breed with domestic
birds and produce fertile offspring. This view is later confirmed by mitochondrial
DNA analysisref.
Archaeological evidence suggests that domestic fowl are kept in China at
least as early as 5400 BC and perhaps as early as 8000 BC, but the connection
between these fowl and modern chickens is uncertain. Instead, today's chickens
are believed to have arisen from birds kept by people of the Harappan culture
(2500-2100 BC) of the Indus Valley, which then spread into Mediterranean
regions. Initial domestication was probably for sporting purposes, such
as cockfighting, rather than for food. The chicken's role in scientific
research began early. Aristotle (pictured) includes a description of the
chicken embryo and its development in his Historia Animalium, written in
the fourth century BC. Then in the sixteenth century, Hieronymus Fabricius's
drawings accurately chronicle the daily development of the chick embryo.
The chicken is also used in the seventeenth and eighteenth centuries by
William Harvey and Caspar Friedrich Wolff, in their studies of the circulatory
system. Chickens have long had a place in traditional Chinese medicine,
but their therapeutic value is officially recognized in 1593. The Chinese
silkie chicken is included in the Compendium of Materia Medica (Bencao
Gangmu), a magnum opus on Chinese pharmacology by Shizhen Li, which
documents traditional medicines practised since the Tang dynasty. Popular
interest in the health benefits of chicken remedies (such as chicken soup)
continues today. Geneticist William Bateson at the University of Cambridge,
UK, begins work on chickens in 1898, aiming to demonstrate that Mendel's
laws of inheritance were applicable to animals as well as plants. Other
geneticists who carry out early work on chickens include R. C. Punnett
and W. J. Spillman, who first report the existence of sex-linked genes
in the species. By observing which traits tend to be inherited together
(a phenomenon known as linkage), A. S. Serebrovsky and S. G. Petrov describe
a linkage map of chicken chromosomes in 1930, followed by F. B. Hutt in
1936.
1911: first tumour virus described. From his work in chickens, Peyton Rous
describes the first virus known to trigger cancer. It is now known as the
Rous sarcoma virus. He later shares the 1966 Nobel Prize in Physiology
or Medicine for his work.
1944: chicken karyotype defined. Y. Yamashina defines the number, size
and shape of the chicken's chromosomes, which includes large 'macrochromosomes'
as well as smaller 'microchromosomes'.
1951: developmental stages outlined. The pioneering work of Viktor Hamburger
and Howard Hamilton sets the stage for developmental biology studies of
the chicken and other vertebrates.
1976: chicken genes implicated in cancer. The Rous sarcoma virus contains
a gene that triggers cancer, a so-called oncogene, called src. In
1976, Michael Bishop, Harold Varmus and their colleagues show that the
chicken genome itself contains similar sequences (now known as the c-src
gene). It is the first oncogene discovered within the genome of any species.
Bishop and Varmus later share the 1989 Nobel Prize in Physiology or Medicine
for this and related work.
1977: chicken gene reveals introns. The ovalbumin gene is one of the first
genes shown to contain intron sequences. These are non-coding sections
found in the genes of complex organisms that are spliced out of the RNA
transcript before it is translated into a protein.
1986: the first genetically modified chickens are produced, by inserting
retrovirus sequences into their genomes. Producing transgenic chickens
efficiently is still one of the greatest challenges for chicken biology.
The transgenic chicks pictured have had the gene for a fluorescent protein
added to their genomesref.
1992: the first map of the entire chicken genome is produced, based on
observing which of various marker sequences tend to be inherited together.
In 2000, this and other linkage maps based on different populations are
combined to generate the first consensus linkage map.
2004: the International Chicken Genome Sequencing Consortium publishes
the draft sequence of a single, inbred red jungle fowl. It is the first
bird to have its genome sequenced, as well as the first species of agricultural
importance. Researchers have high hopes that it will fill some gaps in
how mammals evolved from simpler organisms, as well as providing information
about the birds themselvesref1, ref2.
Segments of genome sequence from broiler, layer and Chinese silkie chickens
are also compared with the full draft sequence of the red jungle fowlref.
The results show that domestic breeds are less inbred than thought, with
a surprising level of genetic diversity compared with their wild progenitor.
The International Chicken Genome Sequencing Consortium, which is composed
of researchers from around the world, chose to sequence the red jungle fowl
(Gallus gallus). This wild ancestor of domestic poultry still lives in parts
of southern Asia. The group sequenced the genome of one individual of this
species using the shotgun approach. The chicken genome contains about 1 billion
bp of DNA, only about one-third as many as the human genome. But packed into
that are an estimated 20,000-23,000 genes, roughly the same as the human
quota. Analysis of the data is only just beginning, but several surprising
results have already emerged from the project. For example, it had been
thought that chickens lack a sense of smell, but the large number of olfactory
genes in the sequence suggests otherwise. The gene for keratin, the protein
that makes up hair and fingernails in people and beaks and feathers in
chickens, is thought to have arisen from a common source in both mammals
and birds. Yet the chicken sequence looks very different from the mammal
keratin genes known so far, raising the possibility that keratin production
might have evolved twice. Three domestic breeds differ genetically from the
red jungle fowl. In contrast to the idea that domestic animals are more
highly inbred than their ancestors, the study detected a startling amount
of genetic diversity in broiler, layer and Chinese silkie chickens as compared
with their wild relative. When the group compared the chicken genome with
those of mammal species, including humans, they found a surprising
amount of similarity in regions not thought to be involved in protein production.
One possibility is that these mysterious sequences are involved in the
regulation of protein production.
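The size and gene-count figures quoted above imply that chicken genes are packed roughly three times more densely than human ones. A minimal sketch of that arithmetic follows; it uses only the numbers given in the text (about 1 billion bp, one-third of the human total, and 20,000-23,000 genes in both species), and the function and variable names are made up for the illustration.

```python
# Figures taken from the text; the human size is derived from "one-third".
CHICKEN_BP = 1_000_000_000
HUMAN_BP = 3 * CHICKEN_BP
GENE_RANGE = (20_000, 23_000)  # estimated genes, "roughly the same" in both


def genes_per_mbp(gene_count, genome_bp):
    """Average number of genes per million base pairs."""
    return gene_count / (genome_bp / 1_000_000)


# Low and high estimates of gene density for each genome.
chicken_density = tuple(genes_per_mbp(g, CHICKEN_BP) for g in GENE_RANGE)
human_density = tuple(genes_per_mbp(g, HUMAN_BP) for g in GENE_RANGE)

print("chicken genes per Mbp:", chicken_density)
print("human genes per Mbp:  ", human_density)
```

These are averages over whole genomes and say nothing about how genes are actually distributed along the chromosomes.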
puffer-fish (Takifugu rubripes, previously called Fugu rubripes) : its genome
contains all the alternative promoters and splice exons and introns that are
present in mammalian genomes, but because the introns are so much smaller, genes
are about an eighth the size. This makes the Fugu genome a potentially
powerful tool for functional gene analysis, but scientists have until now
been frustrated in their attempts to use the resource because mammalian
cells do not correctly splice the fish genes: intronic splicing
enhancers (ISEs) appear to differ substantially between mammals and fish.
The genes of Wolbachia spp., bacteria that infect many insects, have
been sitting in the fruit fly gene database unnoticed. The serendipitous
discovery of these 3 new genomes demonstrates how powerful the public release
of raw sequencing data can be. The existence of these bacterial species
inside the fruit fly genome database is an artifact of the way the fly
was sequenced : embryos were ground up and the DNA extracted, meaning that
any endosymbionts - organisms that live their entire lives inside another
organism and have developed a mutual dependence with the host - would have
had their DNA intermixed with fly DNA before sequencing. The sequencers
of other genomes, especially the human genome, were more careful to eliminate
any endosymbionts or parasites, but secrets may still lie hidden inside
these other genomes. After all, there are more bacterial cells in a human
body than there are human cells. Wolbachia made headlines a year
ago with the publication of the genome sequence of the species Wolbachia
pipientis, which lives inside the reproductive cells of the laboratory
fruit fly Drosophila melanogaster. The bacteria were maligned as
"male killers" because they sometimes kill developing males, and occasionally
convert male embryos to female, but species of Wolbachia live inside
a wide variety of insects, spiders, and crustaceans and have beneficial
as well as deleterious effects. Given the Wolbachia genome and the
likelihood that W. pipientis had been sequenced along with the fruit
fly genome, Eisen performed a quick search for Wolbachia in the Trace
Archive, an open repository of raw genome sequencing data. In his words, "I found a whole
bunch of stuff." Salzberg and his TIGR colleagues took over and searched
not only the D. melanogaster genome but also the genomes of 6 other
fruit flies so far sequenced. They found Wolbachia DNA in 3 species.
They were able to reconstruct 95% (1,440,650 base pairs) of the genome
of one new species from D. ananassae, which they called Wolbachia
wAna. Using the same technique, they identified Wolbachia wSim in
the genome of D. simulans and Wolbachia wMoj in the genome
of D. mojavensis. The team compared the new Wolbachia genomes
with the known genome of the wMel strain of W. pipientis and found
a number of new genes - up to 464 new genes in wAna - as well as a sign
of extensive rearrangement between wMel and wAna, indicating that the 2
strains have diverged significantly since they first infected the 2 Drosophila
species. The 2 most closely related strains are wAna and wSim, which have
nearly identical genomes. wMel and wMoj share about 97% of their genomes
with wAna and wSim but are a bit more distant from one another.
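The screening step described above, pulling endosymbiont sequence out of the raw reads of a host genome project, can be sketched as a simple k-mer lookup against a known reference. This is a toy illustration only: the sequences, the function names and the thresholds are invented for the example, and exact k-mer matching stands in for the alignment tools a real search of the Trace Archive would rely on.

```python
def kmers(seq, k):
    """Return the set of all length-k substrings of seq."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def screen_reads(reads, reference, k=8, min_fraction=0.5):
    """Return the reads that share at least min_fraction of their k-mers
    with the reference (hypothetical helper, not a real aligner)."""
    ref_kmers = kmers(reference, k)
    hits = []
    for read in reads:
        read_kmers = kmers(read, k)
        if not read_kmers:
            continue  # read is shorter than k, nothing to compare
        shared = len(read_kmers & ref_kmers) / len(read_kmers)
        if shared >= min_fraction:
            hits.append(read)
    return hits


# Invented toy data: a short "endosymbiont" reference and three raw reads,
# two of which are slices of the reference and one of which is unrelated.
reference = "ATGGCTAGCTAGGATCCGATTACGCTAGCTAACGGATCGATCGTTAGC"
reads = [
    reference[11:34],
    "TTTTTTTTCCCCCCCCAAAAAAAA",
    reference[24:45],
]
symbiont_reads = screen_reads(reads, reference)
print(len(symbiont_reads), "of", len(reads), "reads match the reference")
```

Reads flagged this way would then be assembled separately from the host reads, which is how a near-complete bacterial genome can emerge from data collected for another organism.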
silkworm (Bombyx mori) : the Lepidoptera are a 160,000-species-strong order
including butterflies and moths, which accounts for some 10% of the world's
animal biodiversity. Many of the world's most ravenous plant-eating pests
are in this order. The silkworm has been domesticated over the past 5,000
years, during which time it diverged from its wild and rarely seen ancestor,
B. mandarina. Baby worms feed on mulberry leaves until they have spun their
silken cocoons. Silk-industry workers then bake or steam them to death,
after which the silk is removed. The process has taken an evolutionary
toll. Those fattened individuals that escape their fate turn into moths,
but they can't fly and don't live long. With no selection for survival
features in the adult moth, deleterious single-base pair changes have probably
occurred, costing the moths flight as well as their colourful wing patterns.
Big, flightless and easy to handle, the moths were used by researchers
in Japan in the early twentieth century to match or jump ahead of developments
in the genetics of the fruitfly, Drosophila melanogaster. But since then,
despite some 400 mutant lines in
research laboratories, B. mori has largely receded as a model organism.
In January 2004, a group of Japanese researchers released silkworm sequence
data in which the genome had been covered 3 times (3X coverage)ref.
The publication followed a rift with Chinese researchers, many of whom were
trained by the Japanese, and who published their 6X data in December 2004ref.
Both sequences are consistent and will form a 9X sequence when they are
combined. The draft, which lists 18,510 genes on the worm's 28 chromosomes,
accounts for 90.9% of the genome. At 428.7 Mbp long, the genome is 3.6
times larger than that of the fruitfly and 1.54 times larger than that
of the mosquito. In Drosophila, genes for around 150 olfactory receptors,
essential for food gathering and other behaviour, are known. Around the
same number would be expected in the silkworm, but none are currently known.
And though only 45 genes related to the silk gland were previously known,
the new data reveal 1,874. Of 323 genes known for wing development
in Drosophila, 300 are present in the silkworm, according to the
recent data. Although the moth has lost the ability to fly, these genes are
present and very highly conserved. Knowledge of genes will allow researchers to
use biomarkers to select for certain desirable traits, such as fibre quality
or disease resistance. But more important might be insights that it provides
into how to deal with the gluttonous, destructive larvae of related species,
such as the fall armyworm and the tobacco budworm. Many of these species
have grown resistant to pesticides.
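The coverage figures quoted above (3X, 6X and a combined 9X) can be related to genome completeness with the Lander-Waterman idealisation, in which the expected fraction of a genome hit by at least one read at depth c is 1 - exp(-c). The sketch below is a textbook approximation that assumes uniformly random reads; it is not the analysis used by either group, the function names are illustrative, and real drafts (such as the 90.9% reported here) typically fall short of the idealised figure.

```python
import math


def expected_fraction_covered(depth):
    """Lander-Waterman estimate: fraction of bases covered by >= 1 read."""
    return 1.0 - math.exp(-depth)


def combined_depth(*depths):
    """Pooling independent read sets simply adds their depths."""
    return sum(depths)


GENOME_MBP = 428.7                  # silkworm genome size from the text
japanese_depth, chinese_depth = 3, 6
total_depth = combined_depth(japanese_depth, chinese_depth)

for depth in (japanese_depth, chinese_depth, total_depth):
    fraction = expected_fraction_covered(depth)
    missed_mbp = GENOME_MBP * (1.0 - fraction)
    print(f"{depth}X: {fraction:.4%} expected covered, "
          f"about {missed_mbp:.2f} Mbp expected uncovered")
```

The gain from each extra unit of depth shrinks quickly, which is one reason pooling two moderate-depth data sets is attractive.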
roundworm (Caenorhabditis elegans) has around 19,500 genes. 122 novel introns
have appeared in the genomes of Caenorhabditis elegans and Caenorhabditis
briggsae since the 2 species diverged 80 to 120 million years ago,
shedding light on how new introns arise and are subsequently spread among
genes. The genomes of both worms contain roughly 100,000 introns, of which
> 6000 are unique to one species or the other. Of the novel introns, 81 are
in C. elegans and 41 in C. briggsae; 13 of them are
found in genes implicated in pre-mRNA processing. Just where these novel
introns had settled in the worm genomes was evidenced by a stretch of DNA
called an exon splice site consensus sequence. It gives credence to the
notion that organisms can gain introns in their genes. 2 unusual findings
are the discovery of copies of introns elsewhere within the same genome
and duplicate copies of an intron within the same gene. The authors attribute
the anomalies to a process called reverse splicing, whereby an excised
intron somehow inserts into a different site within the same mRNA template.
Reverse transcription of the mRNA then gives rise to DNA containing the
reinserted intron, becoming part of the genome.
The proposal of whole genome duplication in S. cerevisiae, first
put forward by Ken Wolfe, professor at the Department of Genetics, Trinity
College Dublinref, is now regarded as proven. The 4 yeast genome sequences published
were from Candida glabrata, a human pathogen; Kluyveromyces lactis,
commonly used in genetics studies; Debaryomyces hansenii, a salt-tolerant
yeast; and Yarrowia lipolytica, an alkane-using yeastref.
The molecular divergence as measured by the percentage of identity between
homologous proteins is very high between these yeasts.
Plantae : full genome sequencing in
higher plants is a very difficult task, because their genomes are often
very large and repetitive. For this reason, gene targeted partial genomic
sequencing becomes a realistic option. Methylation filtration (MF)
is a simple approach to generate gene-enriched plant genomic libraries.
This technique takes advantage of the fact that repetitive DNA is heavily
methylated and genes are hypomethylated. Then, by simply using an Escherichia
coli host strain harboring a wild-type modified cytosine restriction
(McrBC) system, which cuts DNA containing methylcytosine, repetitive DNA
is eliminated from these genomic libraries, while low copy DNA (i.e., genes)
is recovered. To prevent cloning significant proportions of organelle DNA,
a crude nuclear preparation must be performed prior to purifying genomic
DNA. Adaptor-mediated cloning and DNA size fractionation are necessary
for optimal results.
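The logic of methylation filtration can be sketched as a toy simulation: clones carrying methylated cytosines are destroyed in the McrBC-positive host, so the surviving library is enriched for genes. Everything below is invented for illustration, including the clone names, the use of 'M' to mark a methylated cytosine and the rule that a single methylated site is enough to lose a clone; the real enzyme's recognition requirements are more specific than this simplified model.

```python
from dataclasses import dataclass


@dataclass
class Clone:
    name: str
    sequence: str   # toy encoding: 'M' marks a methylated cytosine
    is_gene: bool


def survives_mcrbc_host(clone, max_methylated=0):
    """A clone survives only if it carries no more than max_methylated
    methylated cytosines (simplified stand-in for McrBC restriction)."""
    return clone.sequence.count("M") <= max_methylated


def methylation_filter(library):
    """Return the clones that would be recovered from the filtering host."""
    return [clone for clone in library if survives_mcrbc_host(clone)]


def gene_fraction(library):
    """Fraction of clones in the library that derive from genes."""
    return sum(c.is_gene for c in library) / len(library) if library else 0.0


# Invented library: heavily methylated repeats and unmethylated genes.
library = [
    Clone("repeat_1", "AMGTMGAMGG", False),
    Clone("repeat_2", "TTMGMGAAMG", False),
    Clone("repeat_3", "GMGMGTTAMG", False),
    Clone("gene_1", "ATGCCGTTAG", True),
    Clone("gene_2", "ATGACGTTGA", True),
]
filtered = methylation_filter(library)
print("gene fraction before:", gene_fraction(library))
print("gene fraction after: ", gene_fraction(filtered))
```

In this toy case the filter removes every repeat clone; in practice the enrichment depends on how cleanly methylation separates repeats from genes in the species concerned.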
The 2300–2700 Mb maize genome consists of highly repetitive sequences interspersed
with single-copy, gene-rich sequences. This configuration makes standard
genome sequencing strategies unproductive, and consequently, gene-targeted
partial genomic sequencing using MF or high-C0t selection
(HC) may be of considerable use in unlocking the information contained
in the sequence data : the elimination of > 90% of repeats by MF reduces
sequencing costs without sacrificing information, because reads within
these repeats could not be assembled in any case by whole-genome shotgun
analysis.
roundworms : an organism that can regenerate
its entire body from a severed section, should help researchers seeking
to understand tissue regeneration.
18 new species have been selected by NHGRI for
whole genome sequencingref :
9 of the animals chosen are from diverse branches of the mammalian evolutionary
tree, so that their genomes can usefully be compared. These include the
The orangutan's genome sequence will complement the projects to sequence
the genetic codes of fellow primates the chimpanzee
(Pan troglodytes) and the rhesus macaque (Macaca mulatta),
both of which are nearing completion.
9 organisms will shine a light, hopefully, on the processes by which genomes
are organized. The species, which come from far and wide on the evolutionary
tree of life, include
the snail Biomphalaria glabrata, which
carries the parasite that causes the debilitating disease schistosomiasis.
The 18 species will be added to the institute's waiting list for genome
sequencing, which currently features the kangaroo, cow, and a host of flies
and fungi. Sequencing will be done at the five centres of the institute's
Large-Scale Sequencing Research Network across the United States, using
the high-speed 'shotgun' method developed for the Human Genome Project.
Once work begins, the sequences should all take about a year to complete.
What next for the geneticist who seems to have sequenced everything?
After piecing together DNA sequences from the oceans, his dog and, of course,
humans, the genome pioneer Craig Venter has announced his next plan - to
find out what microbes are blowing around in New York's air. The Air
Genome Project will filter bugs from the air in midtown Manhattan,
the United States' most densely populated area. Researchers from the J.
Craig Venter Institute in Rockville, Maryland, will collect dozens
of samples both indoors and outside, before sifting through them for fragments
of microbial DNA. The move follows a similar project by Venter's company
to sequence genetic information from a region of ocean near Bermuda. That
project identified some 1.3 million new genes and at least 1,800 new species
of marine microorganism. The Air Genome Project will work in a similar
way to its marine predecessor. Having filtered the air, Venter's team will
use the 'shotgun' method, which involves analysing small segments of DNA
and piecing them together into longer strings by matching up their overlapping
ends. The scientists thus hope to identify more organisms than the old-fashioned
method of simply leaving a culture dish on a windowsill and then incubating
it to see what grows. Many microbial species cannot be cultured in the
lab, so this method would not reveal their presence. Even with the help
of shotgun sequencing, it will be difficult to piece together entire genome
sequences for all but the most abundant organisms. An alternative approach is
to look at certain genes that all bacteria share, albeit in slightly different
versions, to work out what species are present in a sample. To collect
the samples, a filter has been designed that will sift through some 1,400
m³ of air each day. Having tested the device on top of their
building in Rockville, the researchers have now installed it on a 40-storey
office block in the Big Apple, although its precise whereabouts is a secret.
Identifying the microbes that flit through our air and into our lungs could
be a useful step towards combating urban diseases such as asthma. Many
bacteria and viruses in the air elicit destructive immune responses and
we would like to explore these. Urban areas across the world are likely
to harbour many of the same species.
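The 'shotgun' idea described above, piecing small segments into longer strings by matching their overlapping ends, can be sketched with a greedy merge. This is a deliberately naive illustration, not the method used by Venter's team: the function names, the min_len parameter and the toy sequence are all assumptions, and production assemblers must also cope with sequencing errors, repeats and both DNA strands.

```python
def overlap(a, b, min_len=3):
    """Length of the longest suffix of a that equals a prefix of b,
    or 0 if no such overlap of at least min_len exists."""
    for n in range(min(len(a), len(b)), min_len - 1, -1):
        if a.endswith(b[:n]):
            return n
    return 0


def greedy_assemble(fragments, min_len=3):
    """Repeatedly merge the pair of fragments with the longest overlap."""
    frags = list(dict.fromkeys(fragments))  # drop exact duplicates
    while len(frags) > 1:
        best_len, best_i, best_j = 0, None, None
        for i, a in enumerate(frags):
            for j, b in enumerate(frags):
                if i != j:
                    n = overlap(a, b, min_len)
                    if n > best_len:
                        best_len, best_i, best_j = n, i, j
        if best_len == 0:
            break  # nothing left to join; remaining pieces stay separate
        merged = frags[best_i] + frags[best_j][best_len:]
        frags = [f for k, f in enumerate(frags) if k not in (best_i, best_j)]
        frags.append(merged)
    return frags


# Invented toy "genome", cut into overlapping fragments in scrambled order.
target = "ATGGCGTACCTGAAGTCCTAGGATTC"
fragments = [target[12:22], target[0:10], target[18:26], target[6:16]]
contigs = greedy_assemble(fragments)
print(contigs)
```

When some organisms contribute too few fragments for their ends to overlap, the loop stops early and leaves several short contigs, which mirrors the difficulty noted above of recovering whole genomes for all but the most abundant species.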