In this article we will learn about Genomic Libraries:- 1. Meaning of Genomic Libraries 2. Principle of Genomic Libraries 3. Vectors used for the Construction 4. Size 5. Types 6. Procedure in the Construction 7. Creation 8. Problems in Construction 9. Storage 10. Disadvantages 11. Applications.
Contents:
- Meaning of Genomic Libraries
- Principles of Genomic Libraries
- Vectors used for the Construction
- Size of Genomic Library
- Types of Genomic Library
- Procedure in the Construction of Genomic Library
- Creation of a Genomic Library using the Phage-λ Vector EMBL3A
- Problems Associated with the Construction of Genomic Library
- Storage of Genomic Library
- Disadvantages of Genomic Library
- Applications of Genomic Library
1. Meaning of Genomic Libraries:
Genomic libraries are libraries of genomic DNA sequences. These can be produced using DNA from any organism.
2. Principle of Genomic Libraries:
A genomic library contains all the sequences present in the genome of an organism (apart from any sequences, such as telomeres, that cannot be readily cloned). It is a collection of cloned, restriction-enzyme-digested DNA fragments containing at least one copy of every DNA sequence in a genome. The entire genome of an organism is represented as a set of DNA fragments inserted into a vector molecule.
3. Vectors used for the Construction of Genomic Library:
The choice of vectors for the construction of genomic library depends upon three parameters:
1. The size of the DNA insert that these vectors can accommodate.
2. The size of the library that is necessary to obtain a reasonably complete representation of the entire genome.
3. The total size of the genome of the target organism.
In the case of organisms with small genomes, such as E. coli, a genomic library can be constructed using a plasmid vector. In this case only 5000 clones (with an average insert size of 5 kb) would give a greater than 99% chance that the library contains any given sequence of the genome (4.6 × 10⁶ bp).
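The 99% figure can be checked with a short calculation. If each clone carries a fraction f of the genome, the chance that a given sequence is present among N independent clones is P = 1 − (1 − f)^N. The sketch below assumes random, independent inserts of uniform size, which is an idealisation; the function name is illustrative only.

```python
def probability_of_coverage(n_clones, insert_bp, genome_bp):
    """Probability that a given sequence is present in a library.

    Assumes random, independent inserts of uniform size (an idealisation).
    """
    f = insert_bp / genome_bp          # fraction of the genome per insert
    return 1.0 - (1.0 - f) ** n_clones


# E. coli genome (4.6 x 10^6 bp), 5 kb plasmid inserts, 5000 clones
p = probability_of_coverage(5000, 5_000, 4.6e6)
print(f"P = {p:.4f}")   # a little above 0.99
```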
Most libraries from organisms with larger genomes are constructed using lambda phage, cosmid, BAC or YAC vectors. These accept DNA inserts of approximately 23, 45, 350 and 1000 kb respectively. As a result, fewer recombinants are needed for complete genome coverage than with plasmids.
4. Size of Genomic Library:
It is possible to calculate the number (N) of recombinants (plaques or colonies) that must be in a genomic library to give a particular probability of obtaining a given sequence.
The formula is:
N = ln(1 − P) / ln(1 − f),
where ‘P’ is the desired probability and ‘f’ is the fraction of the genome in one insert. For example, for a probability of 0.99 with insert sizes of 20 kb, the values for the E. coli (4.6 × 10⁶ bp) and human (3 × 10⁹ bp) genomes are:
N(E. coli) = ln(1 − 0.99) / ln[1 − (2 × 10⁴ / 4.6 × 10⁶)] = 1.1 × 10³
N(human) = ln(1 − 0.99) / ln[1 − (2 × 10⁴ / 3 × 10⁹)] = 6.9 × 10⁵
These values explain why it is possible to make good genomic libraries from prokaryotes in plasmids where the insert size is 5-10 kb, as only a few thousand recombinants will be needed.
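The two worked values can be reproduced with a few lines of Python. This is a minimal sketch of the formula as stated, not a library-design tool; the function name is an assumption made for illustration.

```python
import math


def clones_required(p, insert_bp, genome_bp):
    """N = ln(1 - P) / ln(1 - f), with f = insert size / genome size."""
    f = insert_bp / genome_bp
    # log1p(-x) computes ln(1 - x) accurately when x is very small
    return math.log1p(-p) / math.log1p(-f)


for name, genome_bp in [("E. coli", 4.6e6), ("human", 3e9)]:
    n = clones_required(0.99, 20_000, genome_bp)
    print(f"{name}: N = {n:.2e}")
```

Rounded to two significant figures, the results agree with the values of about 1.1 × 10³ and 6.9 × 10⁵ given above.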
5. Types of Genomic Libraries:
Depending on the source of the DNA used for its construction, a genomic library is one of the following two types:
(a) Nuclear Genomic Library:
This is a genomic library that includes the total DNA content of the nucleus. To make such a library, we specifically extract the nuclear DNA and use it to construct the library.
(b) Organelle Genomic Library:
In this case we exclude the nuclear DNA and target the total DNA of the mitochondria, the chloroplasts or both.
6. Procedure in the Construction of Genomic Library:
1. Preparing DNA:
The key to generating a high-quality library usually lies in the preparation of the insert DNA. The first step is the isolation of genomic DNA. The procedures vary widely according to the organism under study. Care should be taken to avoid physical damage to the DNA.
If the intention is to prepare a nuclear genomic library, then the DNA in the nucleus is isolated, ignoring whatever DNA is present in the mitochondria or chloroplasts. If the aim is to make an organelle genomic library, then it would be wise to purify the organelles away from the nuclei first and then prepare DNA from them.
2. Fragmentation of DNA:
The DNA is then fragmented to a suitable size for ligation into the vector. This could be done by complete digestion with a restriction endonuclease, but this has a drawback: any gene or region that contains a recognition site for the enzyme is split between separate fragments, so it is not recovered intact in a single clone.
To solve this problem we use partial digestion with a frequently cutting enzyme (such as Sau3A, with a four-base-pair recognition site) to generate a random collection of fragments with a suitable size distribution.
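The reason a four-base cutter is described as "frequently cutting" can be shown with simple arithmetic. The sketch below uses an idealised model that is an assumption, not something measured: all four bases are equally frequent and independent, and in a partial digest each site is cut independently with the same probability. The 20 kb target is an illustrative value, and the function names are hypothetical.

```python
def mean_site_spacing(site_length_bp):
    """Expected distance between recognition sites in random DNA.

    Assumes all four bases are equally frequent and independent,
    so a given n-bp site occurs on average once every 4**n bp.
    """
    return 4 ** site_length_bp


def cut_fraction_for_target(site_length_bp, target_fragment_bp):
    """Fraction of sites that must be cut to reach a target mean size.

    Assumes each site is cut independently with the same probability,
    so the mean fragment size is spacing / fraction.
    """
    return mean_site_spacing(site_length_bp) / target_fragment_bp


print(mean_site_spacing(4))                 # four-base cutter: 256
print(mean_site_spacing(6))                 # six-base cutter: 4096
print(cut_fraction_for_target(4, 20_000))   # illustrative 20 kb target
```

Under these assumptions only about one site in eighty needs to be cut to give fragments averaging 20 kb, which is why the digestion has to be kept partial.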
Once prepared, the fragments that will form the inserts are often treated with phosphatase to remove terminal phosphate groups. This ensures that separate pieces of insert DNA cannot be ligated together before they are ligated into the vector. Ligation of separate fragments is undesirable, as it would generate clones containing non-contiguous DNA, and we would have no way of knowing where the joints lay.
3. Vector Preparation:
This will depend on the kind of vector used. The vector needs to be digested with an enzyme appropriate to the insert material we are trying to clone.
4. Ligation and Introduction into the Host:
Vector and insert are mixed, ligated, packaged and introduced into the host by transformation, infection or some other technique.
5. Amplification:
This is not always required. Libraries using phage cloning vectors are often kept as a stock of packaged phage. Samples of this can then be plated out on an appropriate host when needed. Libraries constructed in plasmid vectors are kept as collections of plasmid-containing cells, or as naked DNA that can be transformed into host cells when needed.
With storage, naked DNA may be degraded. Larger molecules are more likely to be degraded than smaller ones, so larger recombinants will be selectively lost, and the average insert size will fall.
7. Creation of a Genomic Library using the Phage-λ Vector EMBL3A:
High-molecular-weight genomic DNA is partially digested with Sau3AI. The fragments are treated with phosphatase to remove their 5′ phosphate groups. The vector is digested with BamHI and EcoRI, which cut within the polylinker sites.
The tiny BamHI/EcoRI polylinker fragments are discarded in the isopropanol precipitation, or alternatively the vector arms may be purified by preparative agarose gel electrophoresis. The vector arms are then ligated with the partially digested genomic DNA.
The phosphatase treatment prevents the genomic DNA fragments from ligating together. Non-recombinant vector cannot reform because the small polylinker fragments have been discarded. The only packageable molecules are recombinant phages. These are obtained as plaques on a P2 lysogen of sup+ E. coli. The Spi− selection ensures recovery of recombinant phage plaques.
8. Problems Associated with the Construction of Genomic Library:
Suppose that, in making a genomic library, we digest the total genomic DNA with a restriction endonuclease such as EcoRI, insert the fragments into a suitable phage λ vector, and then attempt to isolate the desired clone. How many recombinants would we have to screen in order to isolate the right one?
Let us assume that EcoRI gives fragments averaging about 4 kb. Given that the size of the human haploid genome is 2.8 × 10⁶ kb, over 7 × 10⁵ independent recombinants must be prepared and screened in order to obtain a desired sequence. In other words, we have to obtain a very large number of recombinants, which is a very labour-intensive procedure.
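The figure of 7 × 10⁵ is the genome size divided by the average fragment size, that is, one genome equivalent. As a rough check on why the text says "over", the sketch below also applies the formula from section 4 with P = 0.99. The variable names are illustrative, and the 4 kb average is the assumption made in the text.

```python
import math

GENOME_KB = 2.8e6      # human haploid genome size used in the text
FRAGMENT_KB = 4.0      # assumed average EcoRI fragment size

# One genome equivalent: clones needed if fragments tiled the genome exactly
one_equivalent = GENOME_KB / FRAGMENT_KB

# Clones needed for a 99% chance of including a given sequence,
# using N = ln(1 - P) / ln(1 - f) with f = fragment size / genome size
f = FRAGMENT_KB / GENOME_KB
n_99 = math.log1p(-0.99) / math.log1p(-f)

print(f"one genome equivalent: {one_equivalent:.1e} clones")
print(f"99% probability:       {n_99:.1e} clones")
```

Under these assumptions a 99% chance of recovering a given sequence needs several genome equivalents, not just one.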
There are three problems associated with the above approach:
1. The gene may be cut internally one or more times by EcoRI so that it is not obtained as a single fragment. This is likely if the gene is large.
2. Many times while making a library we want to obtain extensive regions flanking the gene or whole gene clusters. Fragments averaging about 4 kb are likely to be inconveniently short.
3. The obtained gene fragment may be larger than the size which the vector can accept. In this case the appropriate gene would not be cloned at all.
These problems can be overcome by cloning random DNA fragments of a large size. Since the DNA is randomly fragmented, there will be no exclusion of any DNA sequence. Also in this case the clones will overlap one another allowing the sequence of very large genes to be assembled. Because of the larger size of each cloned DNA fragment fewer clones are required for a complete or nearly complete library.
This raises a further problem: how can appropriately sized random fragments be produced? Various methods are available. Random breakage by mechanical shearing is appropriate because the average fragment size can be controlled, but insertion of the resulting fragments into vectors requires additional modification steps.
The most widely followed strategy is therefore the one devised by Maniatis et al. (1978), which uses restriction endonucleases. In this method the target DNA is digested with a mixture of two restriction enzymes. These enzymes have tetranucleotide recognition sites, which occur frequently in the target DNA. A complete digest with these enzymes would produce fragments with an average size of less than 1 kb.
However, only a partial restriction digest is carried out, and therefore the majority of the fragments are large (in the range 10-30 kb). Given that the chances of cutting at each of the available restriction sites are more or less equivalent, such a reaction effectively produces a random set of overlapping fragments.
These can be separated from each other on the basis of their size (size fractionation), e.g., by gel electrophoresis. This results in a random population of fragments of about 20 kb, which are suitable for insertion into a λ replacement vector.
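Partial digestion followed by size fractionation can be illustrated with a toy simulation. Everything numerical here is an assumption chosen for illustration: a 2 Mb stretch of DNA, sites spaced on average every 256 bp, a cut probability tuned for a 20 kb mean, and a 15-25 kb selection window. The function names are hypothetical, and the model ignores sequence composition and enzyme preferences.

```python
import random


def partial_digest(genome_bp, site_spacing_bp, cut_probability, rng):
    """Return fragment lengths from a simulated partial digest.

    Toy model: sites are placed at random with the given mean spacing,
    and each site is cut independently with cut_probability.
    """
    n_sites = genome_bp // site_spacing_bp
    sites = sorted(rng.sample(range(1, genome_bp), n_sites))
    cuts = [s for s in sites if rng.random() < cut_probability]
    ends = [0] + cuts + [genome_bp]
    return [b - a for a, b in zip(ends, ends[1:])]


def size_fractionate(fragments, low_bp, high_bp):
    """Keep only fragments within the chosen size window."""
    return [x for x in fragments if low_bp <= x <= high_bp]


rng = random.Random(0)  # fixed seed so the sketch is repeatable
fragments = partial_digest(2_000_000, 256, 256 / 20_000, rng)
selected = size_fractionate(fragments, 15_000, 25_000)
print(len(fragments), "fragments;", len(selected), "in the 15-25 kb window")
```

Because the cut positions differ from molecule to molecule in a real digest, the selected fragments from many copies of the genome overlap one another, which is the property the library relies on.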
9. Storage of Genomic Library:
Once a genomic library has been made it forms a useful resource for subsequent experiments as well as for the initial purpose for which it was produced. Therefore, it is necessary to store it safely for future use. A random library will consist of a test tube containing a suspension of bacteriophage particles (for a phage vector).
The libraries are stored at −80°C. Bacterial cells in a plasmid library are protected from the adverse effects of freezing by glycerol, while phage libraries are cryoprotected by dimethyl sulfoxide (DMSO).
10. Disadvantages of Genomic Library:
The main reason behind making a genomic library is to identify a clone from the library which encodes a particular gene or genes of interest. Genomic libraries are particularly useful when you are working with prokaryotic organisms, which have relatively small genomes.
On the face of it, genome libraries might be expected to be less practical when you are working with eukaryotes, which have very large genomes containing a lot of DNA which does not code for proteins.
A library representing a eukaryotic organism would contain a very large number of clones, many of which would contain non-coding DNA such as repetitive DNA and regulatory regions. Also, eukaryotic genes often contain introns, which are non-coding regions interrupting the coding sequence.
These regions are normally transcribed into the primary RNA transcript in the nucleus but spliced out before the mature mRNA is exported to the cytoplasm for translation into protein. Prokaryotic organisms are unable to carry out this processing, so the mature mRNA cannot be made in E. coli and the protein will not be expressed.
If your screening method requires that the gene be expressed it will not work with a genomic library from a eukaryotic organism.
11. Applications of Genomic Library:
A genomic library has the following applications:
1. It helps in the determination of the complete genome sequence of a given organism.
2. It serves as a source of genomic sequence for generation of transgenic animals through genetic engineering.
3. It helps in the study of the function of regulatory sequences in vitro.
4. It helps in the study of genetic mutations in cancer tissues.
5. It helps in the identification of novel, pharmaceutically important genes.
6. It helps us in understanding the complexity of genomes.