Introduction
Farmers and plant breeders have been very successful in enhancing the productivity of crops such as wheat, rice or canola by improving their genetic characteristics via selective plant breeding. In addition, improved environmental conditions (soil improvement and fertilization) have increased crop yield. Researchers are now focusing on characterizing and manipulating target crop genes using genomics approaches to enhance yield and productivity while reducing production costs.
The genome (all the DNA of a plant) contains not only all genes (stretches of DNA that code for proteins), but also so-called "non-coding DNA", the regions between the genes that do not code for a protein. An analogy is that the genes are small islands dotted in a vast ocean of non-coding DNA. We are interested in these non-coding DNA regions because they may contain the very elements that control or regulate genes, and even previously undetected novel genes. Relatively little is known about functionally important non-coding DNA in crops.
What is VEGI?
VEGI (Value-Directed Evolutionary Genomics) aims to identify and functionally validate regions of non-coding DNA sequences and novel genes in crops such as canola. We are especially interested in non-coding DNA sequences that are associated with enhanced yield and productivity and with reduced production costs (agronomically valuable traits). The scientists working on VEGI are based at McGill and at the University of Toronto. The research is funded by the governments of Canada, Quebec and Ontario via Genome Canada, Genome Quebec, the Canada Foundation for Innovation (CFI) and the Ontario Research Fund.
Potential functional non-coding DNA sequences are identified by a combination of comparative and population genomics. The identified DNA regions are then screened to evaluate whether they influence agronomically beneficial traits under stressed conditions, such as drought or limited amounts of fertilizer. Regions with a positive screening test will be functionally tested further by detailed characterization of gene activity and phenotype.
We expect to identify and protect value-added non-coding DNA regions with documented crop improvement potential. In addition, we generate valuable data resources and expertise that will be platforms for crop improvement applications. We will also train highly qualified personnel in these novel genomics-based approaches, so that a well-trained work force ensures Canada's leadership in genomics and crop improvement.
Why study canola instead of Brussels sprouts?
The Brassicaceae family is a large family of plants, with members such as broccoli, red cabbage, Brussels sprouts and canola. The Brassicaceae family also contains Arabidopsis thaliana (mouse-ear cress), a model plant that in 2000 became the first plant to have its genome sequenced. Canola is economically the most valuable member of the Brassicaceae family. Oil makes up about 40% of the canola seed; the rest is used as high-quality animal feed. Canola oil is a healthy oil, rich in unsaturated fatty acids and in high demand in markets around the world. In 2008, canola and derived products generated $13.8 billion in economic activity in Canada (Canola Council of Canada). The sector provides direct employment to 52,000 Canadian farmers, many of them operating family farm businesses. The demand for canola products is projected to increase by 65% by 2015.
Canola is not easy to study under laboratory conditions. We therefore use smaller plants of the same family for our studies, in much the same way that animal studies are used to investigate human health issues. However, we will mainly validate non-coding DNA regions that also exist in canola, and ultimately test our findings in this valuable crop.
What are our results so far?
Economic analyses. Our economics team has identified 41 traits in canola with a significant economic impact. The five traits with the largest economic impact for Canada include Nitrogen Utilization Efficiency (NUE), tolerance to freezing/cold, resistance to drought and resistance to high salt. While improving NUE would substantially reduce fertilizer usage, the other three traits may allow crops to be grown on less favourable soils or in less favourable climates. We have therefore decided to study these four traits further in our project. The fifth trait in the top 5 is resistance of canola to the flea beetle. Studying this important resistance requires growing plants in special, isolated greenhouses, which makes it impractical at the moment.
Comparative Genomics. Comparative genomics is based on the notion that by comparing genomes from different (plant) species, stretches of DNA can be identified that are very similar and do not seem to have changed over evolutionary time. Such conserved regions in non-coding DNA may be important for each plant and will be tested further.
To compare different plants from the Brassicaceae family, we have successfully sequenced, assembled and annotated the genomes of 3 plant species (Leavenworthia alabamica, Sisymbrium irio and Aethionema arabicum). Together with collaborators from all over the world, who contributed novel sequences from 6 more Brassicaceae family members, our team detected about 90,000 regions that are conserved in all these species, ranging from small regions to larger ones of up to 2,000 bp. These regions might contain so-called regulatory DNA (switches that control the activity of a gene), or transposons and other non-coding genes. Eighty percent of these conserved regions are found in canola. Additional analyses of the DNA sequences of the conserved regions show that they contain a high level of known transcription factor binding sites, as well as candidate novel sites.
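To make the idea of a "conserved region" concrete, here is a minimal sketch in Python. It is not the project's actual pipeline: it assumes the sequences have already been aligned to each other, and the function name `conserved_runs`, the `min_len` cut-off and the toy sequences are all hypothetical. It simply reports stretches where every species carries exactly the same base.

```python
def conserved_runs(aligned, min_len=5):
    """Return (start, end) half-open intervals in which every aligned
    sequence has the same base and none of them has a gap ('-')."""
    length = len(aligned[0])
    if any(len(seq) != length for seq in aligned):
        raise ValueError("all aligned sequences must have the same length")

    runs = []
    start = None  # start of the conserved run currently being extended
    for i in range(length):
        column = {seq[i] for seq in aligned}
        identical = len(column) == 1 and "-" not in column
        if identical and start is None:
            start = i
        elif not identical and start is not None:
            if i - start >= min_len:
                runs.append((start, i))
            start = None
    # Close a run that reaches the end of the alignment.
    if start is not None and length - start >= min_len:
        runs.append((start, length))
    return runs


# Hypothetical toy alignment of the same region in three species.
toy_alignment = [
    "ACGTTGCAAT-GGCTA",
    "ACGTTGCTATAGGCTA",
    "ACGTTGCAATCGGCTA",
]

print(conserved_runs(toy_alignment))  # [(0, 7), (11, 16)]
```

Real analyses work on whole genomes, tolerate some mismatches and take the evolutionary distance between species into account; the sketch only shows the basic principle of looking for DNA that has stayed the same.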
Population genomics. Each member of a population, whether plant or human, has a similar genome, but each member is slightly different (natural variation). This small individual variation determines the small differences between the members of that population. We have sequenced the DNA of a small group (10 individuals) of Capsella grandiflora plants (another member of the Brassicaceae family) to evaluate this genetic variation. The expected DNA variation in this small group of similar plants is accompanied by a lot of variability in their gene activity (measured by RNAseq).
Our analysis of genetic variation in Capsella grandiflora not only confirms that certain regions are under purifying selection (mutations in such a region are considered not beneficial for the plant and will likely disappear over time), but also suggests a high rate of positive selection in these regions (mutations that are considered evolutionarily beneficial for the plant and tend to spread). These analyses highlight the importance of these regions for functional variation and adaptive evolution, and the regions will therefore (in combination with the conserved regions) be studied further.
As a follow-up study, 200 plants are being grown at the University of Toronto. These plants will be phenotyped and genotyped by DNA sequencing, and we will measure how active each gene is (via RNAseq). Afterwards we will associate the variation in phenotypes and gene activity with the variation in the DNA (Genome-Wide Association Study, GWAS), which allows us to link variation in phenotype or gene activity to specific DNA regions.
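The principle behind such an association study can be illustrated with a small sketch. The data, the variant names and the function names below are invented for illustration, and the scoring (squared Pearson correlation between allele count and trait value) is a deliberate simplification; a real GWAS also corrects for relatedness between plants and for testing very many variants at once.

```python
from statistics import mean


def pearson_r(xs, ys):
    """Pearson correlation of two equal-length lists (0.0 if either is constant)."""
    mx, my = mean(xs), mean(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        return 0.0
    return sxy / (sxx * syy) ** 0.5


def rank_variants(genotypes, phenotype):
    """Score each variant by r^2 with the phenotype, strongest association first."""
    scores = {
        name: pearson_r(dosages, phenotype) ** 2
        for name, dosages in genotypes.items()
    }
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


# Hypothetical data for 6 plants: a measured trait, and for each DNA variant
# the number of copies (0, 1 or 2) of the alternative allele in each plant.
phenotype = [10.0, 12.0, 14.0, 11.0, 13.0, 15.0]
genotypes = {
    "variant_A": [0, 1, 2, 0, 1, 2],
    "variant_B": [0, 0, 1, 1, 2, 2],
    "variant_C": [1, 1, 1, 1, 1, 1],  # no variation, so no information
}

ranked = rank_variants(genotypes, phenotype)
for name, score in ranked:
    print(f"{name}: r^2 = {score:.3f}")
```

In this toy data set variant_A tracks the trait most closely, so it would be the first candidate for follow-up.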
Functional genomics. There are many ways to investigate the function of the identified non-coding DNA. One way is to disrupt the sequence of the non-coding DNA or of the accompanying gene(s). Especially when the plant is grown under stressful conditions, this can elicit a modified phenotype or a difference in gene expression, which indicates a role for the disrupted DNA segment.
We have therefore identified and grown hundreds of so-called T-DNA mutants. Each of these mutants carries a disruption in a specific DNA region of the genome, which allows us to evaluate what that region does. We grow these mutants and screen them under conditions of low nitrogen fertilizer, low (or even freezing) temperature, high salt, or low water. If certain mutant plants grow better or worse than control plants under these challenging conditions, then we know that the disrupted region plays a role in that specific trait, and it will be tested and investigated further. At the moment we are testing over 400 mutant plants for the 4 top traits: Nitrogen Utilization Efficiency (NUE), tolerance to freezing/cold, resistance to drought and resistance to high salt.
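The comparison between mutant and control plants can be sketched as follows. This is only an illustration under assumed conditions: the measurements, the line names, the `threshold` value and the choice of a Welch t-statistic are ours for the example and do not describe the statistical procedure actually used in the screen.

```python
from statistics import mean, variance


def welch_t(sample, control):
    """Welch t-statistic for the difference in means (sample minus control)."""
    se = (variance(sample) / len(sample) + variance(control) / len(control)) ** 0.5
    if se == 0:
        return 0.0
    return (mean(sample) - mean(control)) / se


def flag_mutants(measurements, control, threshold=3.0):
    """Return the mutant lines whose growth differs clearly from the controls."""
    flagged = []
    for line, values in measurements.items():
        if abs(welch_t(values, control)) > threshold:
            flagged.append(line)
    return flagged


# Hypothetical growth measurements (e.g. leaf area) under low nitrogen.
control = [5.0, 5.2, 4.8, 5.1, 4.9]
measurements = {
    "mutant_1": [6.0, 6.2, 5.8, 6.1, 5.9],  # grows better than control
    "mutant_2": [5.1, 4.9, 5.0, 5.2, 4.8],  # indistinguishable from control
    "mutant_3": [4.0, 4.2, 3.8, 4.1, 3.9],  # grows worse than control
}

print(flag_mutants(measurements, control))  # ['mutant_1', 'mutant_3']
```

Both better and worse growth are flagged here, because either outcome suggests that the disrupted region is involved in the trait.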
The technologies we are using
While the people working for VEGI are the major strength of the project, the project also relies on the latest technology.
Sequencing. For sequencing DNA and RNA we use the facilities of the McGill Innovation Centre. The Centre now has the latest sequencing technology, including 11 Illumina HiSeq machines, each with the capacity to sequence a full human genome within a week. Not that long ago this would have taken a few years.
Phenotyping. Phenotyping many plants is a lot of work involving many students. We have just received CFI funding for two automated phenotyping stations (LemnaTec HTS and LemnaTec 3D system) and will be the first academic institution in North America to have them. These phenotyping stations automatically monitor the growth of the plants under strict environmental conditions. Four different cameras (covering visible, UV and infrared light) photograph the plants, and the images are continuously compared and analysed. This allows more extensive and precise phenotyping (we can measure in more detail how the plants differ), which in turn allows a more precise association with DNA variations.
Bioinformatics. VEGI already generates a massive amount of data, which requires extensive hardware and specialized software that is continuously developed and improved. For this we have dedicated servers in both Toronto and at McGill, each with a massive amount of RAM and storage space. Furthermore, a specialized group of bioinformaticians analyses the data and develops and improves the software for these analyses. All data are summarized and distributed via a dedicated website.




