QTL mapping for fiber quality and yield-related traits across multiple generations in segregating population of CCRI 70

Cotton is a significant economic crop that plays an indispensable role in many domains. Gossypium hirsutum L. is the most important fiber crop worldwide and contributes more than 95% of global cotton production. Identifying stable quantitative trait loci (QTLs) controlling fiber quality and yield-related traits is a necessary prerequisite for marker-assisted selection (MAS). A genetic linkage map was constructed with 312 simple sequence repeat (SSR) loci and 35 linkage groups using JoinMap 4.0; the map spanned 1 929.9 cM, with an average interval between two markers of 6.19 cM, and covered approximately 43.37% of the cotton genome. A total of 74 QTLs controlling fiber quality and 41 QTLs controlling yield-related traits were identified in 4 segregating generations. These QTLs were distributed across 20 chromosomes and explained 1.01% to 27.80% of the observed phenotypic variation. In particular, 35 stable QTLs were identified in multiple generations, 25 common QTLs were consistent with those in previous studies, and 15 QTL clusters were found in 11 chromosome segments. These results provide a theoretical basis for improving cotton yield and fiber quality through molecular marker-assisted selection.


Background
Cotton is an important cash crop, and its fiber is the most important renewable natural resource for the textile industry. Upland cotton (Gossypium hirsutum L.) is the most important cotton species, accounting for more than 95% of cotton production worldwide (Chen et al. 2008; Lacape et al. 2003; Shang et al. 2015). Improving fiber quality while maintaining a high yield potential in Upland cotton is an important research direction in cotton breeding. Because of the negative correlation between yield and fiber quality traits (Rong et al. 2004; Shen et al. 2005; Shang et al. 2015), it is difficult to improve multiple traits simultaneously in cotton breeding. Although conventional breeding has played a vital role in the genetic improvement of fiber quality and yield traits in Upland cotton, progress has been slow (Zhang et al. 2009). With the development of molecular marker technology and the construction of saturated genetic maps, molecular markers tightly linked to yield and fiber quality can be used to pyramid target genes for the simultaneous improvement of fiber quality and yield potential.
To identify stably expressed QTLs, permanent populations have been used for QTL mapping of fiber quality and yield in recent years (Ademe et al. 2017; Jamshed et al. 2016; Ning et al. 2014; Shen et al. 2007; Shang et al. 2015; Wan et al. 2007). Jamshed et al. (2016) used recombinant inbred lines (RILs) to identify one QTL for fiber strength (FS) on Chromosome 25, which was the same QTL detected by Sun et al. (2012). This QTL was stably expressed in multiple environments and could be used for MAS. Constructing multigenerational segregating populations is a highly effective method for identifying stable QTLs. Thus, identifying QTLs in early generations of segregating populations would allow us to tag stable QTLs for MAS and accelerate breeding for better fiber quality and higher yield. Therefore, we used the hybrid CCRI 70, a nationally approved Chinese variety with excellent fiber quality and good fiber yield, to construct F2, F2:3, F2:4 and F2:5 populations for identifying QTLs associated with fiber quality and yield-related traits. The stable and common QTLs detected could be further used to study the molecular genetic mechanism of fiber quality and yield component traits and in MAS breeding.

Plant materials
The Upland cotton hybrid CCRI 70 (F1), which was derived from the cross between 901-001 (P1) and sGK156 (P2, as female parent), is a nationally authorized cotton variety with excellent fiber quality, i.e., an average fiber strength (FS) of 33.5 cN·tex⁻¹, fiber length (FL) of 32.5 mm, and fiber micronaire (FM) of 4.3 (Yuan et al. 2009). Line sGK156 was selected from the commercial transgenic cultivar sGK9708 (CCRI 41), which is resistant to cotton Verticillium wilt and cotton bollworm. It has excellent yield and comprehensive agronomic traits, with an average FM of no more than 4.2. In addition, 901-001 is a line with high fiber quality due to introgression from Gossypium barbadense into Gossypium hirsutum.
Detailed information on this population was provided by Ye et al. (2016). Briefly, the F1 cross between sGK156 and 901-001 was made in Anyang, Henan Province, in 2011. F1 seeds were sown in Hainan in the winter of 2011-2012, F2 seeds and the two parents were sown in Anyang, Henan Province, in 2012, and 250 F2 plants were harvested for fiber quality testing. The 250 F2:3 families were grown in 250 rows that were 5 m long and 0.8 m apart in Anyang in 2013, F2:4 plants were grown in Hainan in the winter of 2013-2014, and F2:5 plants were grown in Anyang in 2014. Thirty naturally opened bolls, together with two self-pollinated bolls, were hand-harvested from every plant in the F2:3 to F2:5 generations to generate progeny and to test fiber yield and quality. After the seed cotton samples were weighed and ginned, boll weight (BW) and lint percentage (LP) were evaluated. The fiber quality traits, including FL, FS, FM, fiber uniformity (FU) and fiber elongation (FE), were tested with an HFT9000 instrument using international high-volume instrument calibration cotton (HVICC) samples at the Cotton Quality Supervision and Testing Center of the Ministry of Agriculture of China.

DNA extraction and genotype analysis
Young leaves were collected from the F2, P1, P2 and F1 plants, frozen in liquid nitrogen and stored at −80 °C. Genomic DNA was extracted from each plant individually as described by Paterson et al. (1993). A total of 14 820 simple sequence repeat (SSR) primer pairs were used to screen for polymorphisms between the parents sGK156 and 901-001, and the polymorphic primer pairs were then used to genotype the F2 population. PCR was conducted as described by Sun et al. (2012), and electrophoresis and detection of PCR products were conducted according to the protocol of Zhang et al. (2000).
The SSR primer sequences were obtained from the following sources: BNL (Brookhaven National Laboratory, NY), HAU (Huazhong Agricultural University, China), NAU (Nanjing Agricultural University, China), STV and CIR (French Agricultural Research Centre for International Development, France), CM and JESPR (Texas A&M University, USA), DPL and CGR (Delta and Pine Land, USA), SWU and PGML (Southwest University, China), MUCS and MUSS (University of California Davis, USA), Gh and TMB (United States Agricultural Research Service, USA). All of the SSR primer pairs were synthesized by Sangon Biotech (Shanghai, China).

Data analysis
The genetic map was constructed using JoinMap 4.0 software with a logarithm of odds (LOD) score of 5.0 and a recombination frequency of 0.40. Kosambi's mapping function (Kosambi 1994) was used to convert the recombination frequencies into map distances. The linkage groups were drawn with MapChart 2.2 software (Voorrips 2006). Linkage groups were assigned to corresponding chromosomes according to the chromosome-anchored SSR markers used in previous reports (Lacape et al. 2003, 2013; Rong et al. 2004; Guo et al. 2007; Qin et al. 2008; Xia et al. 2014; Yu et al. 2013; Zhang et al. 2013; Liu et al. 2017; Nie et al. 2016).
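As an illustration of the conversion step, the following minimal Python sketch implements the standard Kosambi mapping function, d = 25·ln((1 + 2r)/(1 − 2r)) in cM, together with its inverse. It is not the JoinMap implementation; the function names `kosambi_cm` and `kosambi_r` are hypothetical and chosen only for this example.

```python
import math


def kosambi_cm(r: float) -> float:
    """Convert a recombination frequency r (0 <= r < 0.5) into a map
    distance in centimorgans using Kosambi's mapping function."""
    if not 0.0 <= r < 0.5:
        raise ValueError("recombination frequency must be in [0, 0.5)")
    return 25.0 * math.log((1.0 + 2.0 * r) / (1.0 - 2.0 * r))


def kosambi_r(d_cm: float) -> float:
    """Inverse function: map distance in cM back to recombination frequency."""
    if d_cm < 0.0:
        raise ValueError("map distance must be non-negative")
    return 0.5 * math.tanh(d_cm / 50.0)


if __name__ == "__main__":
    # Distance corresponding to the recombination frequency threshold of 0.40
    print(f"r = 0.40 -> {kosambi_cm(0.40):.2f} cM")
```

For small r the function is close to d ≈ 100·r, and the distance grows without bound as r approaches 0.5, which is why weakly linked markers contribute long map intervals.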
QTLs affecting fiber quality and yield-related traits in the 4 generations were detected by the composite interval mapping (CIM) method (Zeng 1994) using Windows QTL Cartographer 2.5 (Wang et al. 2006) with a LOD threshold of 2.5 and a mapping step of 1.0 centimorgan (cM). QTLs at the same location for the same trait across different generations were regarded as 'stable', and QTLs explaining more than 10% of the phenotypic variance (PV) were regarded as 'major'. QTL nomenclature was defined as q + trait abbreviation + chromosome + QTL number (McCouch et al. 1997). In addition, QTL clusters were inferred from regions containing three or more QTLs for various traits. Regions of approximately 20 cM were taken into account when estimating the presence of a cluster. Clusters were named according to the chromosome on which they were found.
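The classification rules above can be sketched in Python as follows. This is a minimal sketch under stated assumptions, not the procedure actually used in the study: "same location" is interpreted here as overlapping QTL intervals, cluster membership is decided from interval midpoints, a cluster is required to involve at least two traits, and the hyphenated name format, the `QTL` record and the `demo` values are all hypothetical.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class QTL:
    trait: str        # trait abbreviation, e.g. "FL"
    chrom: str        # chromosome label, e.g. "chr07"
    generation: str   # e.g. "F2", "F2:3"
    left_cm: float    # left bound of the QTL interval (cM)
    right_cm: float   # right bound of the QTL interval (cM)
    pv: float         # phenotypic variance explained (%)

    @property
    def mid_cm(self) -> float:
        return (self.left_cm + self.right_cm) / 2.0


def qtl_name(trait: str, chrom: str, number: int) -> str:
    """q + trait + chromosome + number; the hyphen separator is an assumption."""
    return f"q{trait}-{chrom}-{number}"


def is_major(q: QTL, threshold: float = 10.0) -> bool:
    """'Major' QTL: explains more than 10% of the phenotypic variance."""
    return q.pv > threshold


def overlaps(a: QTL, b: QTL) -> bool:
    return (a.chrom == b.chrom
            and a.left_cm <= b.right_cm and b.left_cm <= a.right_cm)


def stable_qtls(qtls):
    """QTLs matched by a same-trait, overlapping QTL from another generation."""
    return [q for q in qtls
            if any(o.trait == q.trait and o.generation != q.generation
                   and overlaps(q, o) for o in qtls)]


def find_clusters(qtls, window_cm: float = 20.0, min_qtls: int = 3):
    """Greedy scan per chromosome for windows of about 20 cM that hold
    three or more QTLs covering at least two different traits."""
    by_chrom = defaultdict(list)
    for q in qtls:
        by_chrom[q.chrom].append(q)
    clusters = []
    for chrom in sorted(by_chrom):
        group = sorted(by_chrom[chrom], key=lambda q: q.mid_cm)
        i = 0
        while i < len(group):
            start = group[i].mid_cm
            members = [q for q in group[i:] if q.mid_cm - start <= window_cm]
            if len(members) >= min_qtls and len({q.trait for q in members}) >= 2:
                clusters.append((chrom, members))
                i += len(members)
            else:
                i += 1
    return clusters


# Hypothetical records for illustration only; not data from this study.
demo = [
    QTL("FL", "chr07", "F2", 10.0, 18.0, 12.5),
    QTL("FL", "chr07", "F2:3", 15.0, 22.0, 8.0),
    QTL("FS", "chr07", "F2:4", 20.0, 28.0, 11.0),
    QTL("LP", "chr14", "F2:5", 40.0, 45.0, 3.2),
]
```

With the hypothetical `demo` records, the two FL QTLs on chr07 overlap across generations and would be flagged as stable, while the three chr07 QTLs fall within one 20 cM window and would form a single cluster.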

Phenotypic evaluation of fiber quality and yield traits
Phenotypic data for the fiber quality and yield traits in the P1, P2, F2, F2:3, F2:4 and F2:5 populations are presented in Table 1. Skewness and kurtosis values were calculated, and the results indicated that all fiber-related traits showed a normal distribution and transgressive segregation in both directions in the 4 generations (Table 1), indicating that these traits were controlled by multiple genes and were suitable for QTL mapping.
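The descriptive statistics mentioned above can be computed as in the following Python sketch. It uses simple moment-based (population) estimators, which may differ slightly from the bias-corrected values reported by statistical packages; the function names are hypothetical, and treating absolute skewness and kurtosis values below 1 as approximately normal is a common rule of thumb assumed here, not a criterion stated in the text.

```python
def _central_moment(values, k):
    """k-th central moment of a sample (divides by n, not n - 1)."""
    n = len(values)
    mean = sum(values) / n
    return sum((x - mean) ** k for x in values) / n


def skewness(values):
    """Moment-based skewness: m3 / m2^1.5."""
    return _central_moment(values, 3) / _central_moment(values, 2) ** 1.5


def excess_kurtosis(values):
    """Moment-based excess kurtosis: m4 / m2^2 - 3 (0 for a normal curve)."""
    return _central_moment(values, 4) / _central_moment(values, 2) ** 2 - 3.0


def roughly_normal(values, limit=1.0):
    """Rule-of-thumb check (an assumption): |skewness| and |kurtosis| < limit."""
    return abs(skewness(values)) < limit and abs(excess_kurtosis(values)) < limit


def transgressive(values, p1, p2):
    """Return (below, above): whether any progeny value falls below the
    lower parent or above the higher parent."""
    low, high = min(p1, p2), max(p1, p2)
    return (any(x < low for x in values), any(x > high for x in values))
```

A trait whose progeny values extend beyond both parental values, i.e. `transgressive(...)` returning `(True, True)`, corresponds to transgressive segregation in both directions.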

Correlation analysis of fiber quality and yield traits in 4 generations
The correlation coefficients of fiber and yield traits in the 4 generations are shown in Table 2. The majority of fiber quality traits were significantly associated with each other, suggesting that the genes for different traits were linked or had pleiotropic effects. FL was significantly positively correlated with FS and FU but was significantly negatively correlated with FM; FS was significantly positively correlated with FU but was negatively correlated with FM (except in the F2 generation). BW was not significantly correlated with most fiber-related traits (except in the F2 generation). In contrast, LP was significantly negatively correlated with FL, FS and BW but was significantly positively correlated with FM (except in the F2 generation).
Correlation analysis of each trait among generations was conducted using the mean values of the four generations (Additional file 1 Table S1). FL was significantly and positively correlated among all generations, with correlation coefficients ranging from 0.150 to 0.348. The results for FS, BW and LP among generations were similar to those for FL. The majority of FM correlation coefficients were significant and positive across generations. The correlation coefficients for FE were more complex, which may be related to environmental effects.
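For reference, a correlation coefficient of this kind and its test statistic can be computed as in the Python sketch below. It assumes Pearson's product-moment correlation, which the text does not name explicitly, and the function names are hypothetical. The sample size of 250 used in the usage line is likewise an assumption based on the 250 F2 plants.

```python
import math


def pearson_r(x, y):
    """Pearson product-moment correlation between two equal-length samples."""
    if len(x) != len(y) or len(x) < 3:
        raise ValueError("need two samples of equal length >= 3")
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def t_statistic(r, n):
    """t statistic for H0: rho = 0, with n - 2 degrees of freedom."""
    return r * math.sqrt((n - 2) / (1.0 - r * r))


if __name__ == "__main__":
    # Smallest FL coefficient from the text, with an ASSUMED n of 250.
    print(f"t = {t_statistic(0.150, 250):.3f}")
```

Under that assumed sample size, even the smallest FL coefficient gives a t statistic above 2, which is consistent with the correlations being reported as significant despite their modest magnitude.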

Construction of the genetic map
Two hundred and sixty-seven of the 14 820 SSR primer pairs (1.80%) amplified polymorphisms between the two parents. A total of 342 loci were obtained from amplification of the 267 SSR primer pairs in the 250 F2 individuals. After linkage analysis of all 342 polymorphic loci, 312 were mapped to 35 linkage groups (Fig. 1 and Additional file 8 Table S8). The map covered 1 929.9 cM, with an average distance of 6.19 cM between neighbouring markers and an average of 9.18 markers in each linkage group, and occupied approximately 43.37% of the total cotton genome. The largest linkage group contained 33 markers, while the smallest had only 2 markers. The thirty-five linkage groups were assigned to 23 chromosomes, of which 11 belonged to the A genome and 12 to the D genome.
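The summary figures above can be reproduced with simple arithmetic, as in the following Python sketch. The total genome length of 4 450 cM is an assumption obtained by back-calculation from the reported 43.37% coverage and is not stated in the text; the reported 6.19 cM average appears to correspond to map length divided by the number of mapped loci, which is also an interpretation rather than a stated definition.

```python
def percent(part: float, whole: float) -> float:
    """Return part / whole expressed as a percentage."""
    return 100.0 * part / whole


# Values reported in the text
SCREENED_PRIMER_PAIRS = 14820
POLYMORPHIC_PRIMER_PAIRS = 267
MAPPED_LOCI = 312
MAP_LENGTH_CM = 1929.9

# ASSUMPTION: total recombinational length of the cotton genome, inferred
# from the reported coverage; not given in the text itself.
ASSUMED_GENOME_CM = 4450.0

polymorphism_rate = percent(POLYMORPHIC_PRIMER_PAIRS, SCREENED_PRIMER_PAIRS)
mean_marker_distance = MAP_LENGTH_CM / MAPPED_LOCI
genome_coverage = percent(MAP_LENGTH_CM, ASSUMED_GENOME_CM)

if __name__ == "__main__":
    print(f"polymorphism rate:    {polymorphism_rate:.2f}%")
    print(f"mean marker distance: {mean_marker_distance:.2f} cM")
    print(f"genome coverage:      {genome_coverage:.2f}%")
```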

Mapping population types for MAS breeding
Breeders have long recognized the significant negative association between lint yield and fiber quality. Although conventional breeding has played a vital role in the genetic improvement of lint yield and fiber quality in Upland cotton, progress has been slow. The utilization of marker-assisted selection (MAS) makes it possible for plant breeders to identify rapid and precise approaches for improving conventional selection schemes (Moose and Mumm 2008; Tanksley and Hewitt 1988).
To implement MAS in cotton breeding, it is first imperative to identify many stable and major QTLs for cotton yield and fiber quality. In previous years, many studies on genetic map construction and QTL identification have been conducted. However, the populations were mainly developed for basic studies (Rong et al. 2004; Shen et al. 2007; Sun et al. 2012; Ning et al. 2014; Said et al. 2015; Jamshed et al. 2016; Shang et al. 2015; Tang et al. 2015; Zhai et al. 2016; Liu et al. 2017). In our research, the population was developed from the hybrid CCRI 70, a nationally authorized cotton variety with excellent fiber quality, and its parents. The use of this resource would facilitate combining QTL identification with breeding and could provide information for improving fiber quality and yield traits in cotton.

Comparison of QTLs with previous reports
Different mapping populations and markers have been used for QTL mapping, which makes it difficult to compare results among studies. We identified 115 QTLs related to fiber quality and yield traits in the populations of CCRI 70 and compared them with those detected in previous relevant studies (Chen et al. 2008; Jamshed et al. 2016; Qin et al. 2008; Shen et al. 2005; Sun et al. 2012; Shao et al. 2014; Shang et al. 2015; Tang et al. 2015; Wang et al. 2008, 2010; Yang et al. 2007; Yu et al. 2013; Zhang et al. 2008, 2012; Zhai et al. 2016; Liu et al. 2017), and 25 QTLs were found to be consistent with those in previous studies.
In addition to the 25 QTLs consistent with previous studies, 35 QTLs were detected stably in multiple generations, and 7 QTLs belonged to both groups. Thus, 53 QTLs (25 + 35 − 7) were detected stably in multiple generations or in different genetic backgrounds and could be considered for use in MAS. Special attention should be paid to these stable QTLs and to those detected in previous studies, because stable QTLs provide valuable information for QTL fine mapping and positional cloning of genes for fiber quality and yield-related traits, as well as useful markers for molecular breeding.
Most of the clusters in previous reports showed opposite additive genetic effects for fiber quality and yield-related traits. Wang et al. (2013) reported that a QTL-rich region on chr.7 was associated with FL, FS and LP; the genetic effects of the QTLs for FL and FS were positive, whereas the effect for LP was in the opposite direction to those for the fiber quality traits. The NAU3308-NAU4024 interval on D2 harbored seven significant QTLs related to FL, FS, FE, LP, LY, SI and NB, which showed opposite additive effects on fiber quality and yield-related traits. Wan et al. (2007) reported that a QTL cluster in the t 1 locus region on chr.6 increased FL, FS, FE and FU and decreased LP. Wang et al. (2015) reported two important clusters, in the region from 70 to 86 cM on LG1-chr1/15 and from 18 to 37 cM on chr.21. The cluster on LG1-chr1/15 was correlated with FS, FM, FE and LP, and the cluster on chr.21 was correlated with FL, FS, LP, SCW and CI; the additive effects of the QTLs for these traits (except FE) were positive, which indicated that fiber quality and yield traits could be improved simultaneously.
In conclusion, the clustering of QTLs for fiber quality and yield traits further supported the strong correlation among fiber quality and yield traits (Wang et al. 2013). To improve fiber quality and yield potential at the same time, fine mapping of these QTL-rich intervals on specific chromosomes is necessary for future application in MAS and gene cloning (Guo et al. 2018; Zhai et al. 2016).
The stability of these QTLs across generations or populations and the prominence of these chromosomal regions motivate further study, and the alleles underlying them are valuable candidates either for implementation in MAS or for studies of the molecular mechanism of fiber quality and yield-related traits.

Conclusions
QTL mapping was used to analyze the molecular genetic mechanism of fiber quality and yield component traits using a series of generations (F2, F2:3, F2:4 and F2:5) constructed from CCRI 70. Fiber quality and yield-related traits showed significant and complex correlations. A total of 115 QTLs for fiber quality and yield-related traits were detected. Of these QTLs, 53 were detected stably in multiple generations or different genetic backgrounds, which indicates their potential use in MAS. In addition, 15 QTL clusters were found in 11 chromosomal segments. Determining the locations of these clusters will be beneficial for MAS and for breeding programs that focus on fiber quality and yield-related traits.