Genomic Selection for fast-tracking cotton breeding

Project Leader: Iain Wilson
Key Researchers: Philippe Moncuquet, Zitong Li, Qian-Hao Zhu, Iain Wilson, Shiming Liu, Warren Conaty and Warwick Stiller
Brief Summary of Project Objectives: This project aimed to develop and evaluate a new predictive breeding approach called Genomic Selection (GS). GS allows phenotypic outcomes (for example, yield, fibre quality or other agronomic properties) to be predicted in breeding populations from the presence or absence of large numbers of DNA markers in individual plants. GS is already used in other crops such as maize and soybean. It has the potential to revolutionise the way we breed cotton and may speed up the delivery of new cotton varieties by the Core Breeding project.

Executive Summary

This project lays the foundations for the long-term aim of developing a new molecular-marker-based breeding approach in cotton called Genomic Selection (GS), also known as genome-wide prediction. GS has the potential to help our breeders select elite parents for crossing, identify the progeny with the best genetic make-up in segregating populations, and reduce the amount of field-based screening invested in developing each variety. The objective of this new breeding system is to predict the genetic potential (for example, yield, fibre quality or other agronomic properties) of individual breeding lines based on their genotypes at large numbers of single nucleotide polymorphism (SNP) markers distributed across the entire cotton genome. Once GS predictions become reliable and useful for selection, our breeders will be able to use them as a tool for increasing genetic gain, either by reducing the amount of field-based screening required to generate new elite cultivars, or by identifying and testing breeding populations enriched with better-performing lines under the same resource input, in order to find the best combinations of yield and quality. Large amounts of high-quality trait and marker data are essential for developing the statistical models that underlie genome-wide prediction of genetic performance, so close linkage between the CSIRO cotton breeding and molecular teams is essential.
GS is already being used by large breeding companies in crops such as maize, rice, wheat and soybean, and it has the potential to speed up the delivery of superior varieties by our Core Breeding team. This project has established the framework for building a GS pipeline for deployment in the cotton breeding program. It has produced a next-generation genotyping system that is comparable in quality to the SNP Chip but more cost- and time-effective. This new platform is important because it provides a means of expanding GS genotyping and analysis to the scale required in the future. Owing to the inclusion of pedigree information and larger training populations, the prediction accuracies for fibre quality have improved progressively throughout the project. GS is expected to be applied to cotton breeding by the end of the next phase of GS research, which is outlined in the CBA-funded project ‘CBA22: Genomic Selection for Cotton Breeding II’.

Recent Project Research Achievements

  • The Genomic Selection (GS) project followed a yearly cycle in which breeding lines were genotyped and their fibre quality and yield data, collected in the field at different sites, were analysed. These data were then used to estimate specific fibre quality traits of the new lines from each season (we hope eventually to extend this to more complex traits such as yield), using a statistical model trained on historical data together with the breeding lines added from each previous season since 2014.
  • The next-generation DArT cotton genotyping platform has been validated and is now in full production. All GS samples collected over the life of this project (3,340 in total) have now been genotyped with the new system. Results indicate that the new platform provides genotype data of comparable quality to the SNP Chip platform at a fifth of the price.
  • A training population consisting of historical data and breeding lines tested in 2014, 2015, 2016 and 2017 (1,507 lines in total) was used to predict the performance of breeding lines tested in 2018/2019 (a validation population of 648 breeding lines). Prediction accuracies averaged 0.49 for fibre length, 0.41 for fibre strength, and 0.24 for lint percentage.
  • 510 samples collected from the latest 2019/2020 season have been genotyped using the DArT cotton genotyping platform.
  • Our GS publication ‘Historical datasets support genomic selection models for the prediction of cotton fibre quality phenotypes across multiple environments’ won the AACS Scientific Publication Award for 2019.
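
The train-then-validate workflow described in the achievements above (fit a statistical model on a training population, then predict a separate validation population) can be illustrated with a minimal Python sketch. It is not the model used in this project, which is not specified in this summary. The sketch assumes ridge regression on SNP markers coded 0/1/2, and it assumes that prediction accuracy is measured as the Pearson correlation between predicted and observed phenotypes, a common convention that this report does not state explicitly. All data are simulated, and the function names, population sizes and penalty value are hypothetical.

```python
import numpy as np


def ridge_marker_effects(X, y, lam):
    """Estimate marker effects by ridge regression (closed-form solution).

    X   : (lines x markers) genotype matrix, coded 0/1/2 (assumed coding)
    y   : phenotype vector, one value per line
    lam : ridge penalty (hypothetical value; normally tuned or estimated)
    """
    mu = y.mean()
    col_means = X.mean(axis=0)
    Xc = X - col_means  # centre each marker column
    p = Xc.shape[1]
    # Solve (Xc'Xc + lam * I) beta = Xc'(y - mu)
    beta = np.linalg.solve(Xc.T @ Xc + lam * np.eye(p), Xc.T @ (y - mu))
    return mu, col_means, beta


def predict(X_new, mu, col_means, beta):
    """Predict phenotypes of new lines from their marker genotypes."""
    return mu + (X_new - col_means) @ beta


def prediction_accuracy(y_obs, y_pred):
    """Pearson correlation between observed and predicted phenotypes."""
    return float(np.corrcoef(y_obs, y_pred)[0, 1])


def simulate(n_lines, effects, rng):
    """Simulate genotypes and phenotypes (synthetic data, illustration only)."""
    X = rng.integers(0, 3, size=(n_lines, effects.size)).astype(float)
    genetic_value = X @ effects
    # Noise with the same spread as the genetic values (heritability near 0.5)
    noise = rng.normal(0.0, genetic_value.std(), size=n_lines)
    return X, genetic_value + noise


rng = np.random.default_rng(0)
n_markers = 500
# About 10% of markers are given a non-zero effect on the simulated trait
true_effects = rng.normal(0.0, 1.0, n_markers) * (rng.random(n_markers) < 0.1)

X_train, y_train = simulate(300, true_effects, rng)  # "training population"
X_valid, y_valid = simulate(100, true_effects, rng)  # "validation population"

mu, col_means, beta = ridge_marker_effects(X_train, y_train, lam=100.0)
y_hat = predict(X_valid, mu, col_means, beta)
print("prediction accuracy:", round(prediction_accuracy(y_valid, y_hat), 2))
```

Because the validation lines play no part in fitting the marker effects, the correlation reported at the end reflects how well the model generalises to unseen lines, which is the situation a breeder faces when selecting among new lines before field testing.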