QTL Mapping of Agronomic and Economic Traits for Four F2 Populations in Upland Cotton

Background: Upland cotton (Gossypium hirsutum) accounts for more than 90% of annual world cotton output due to its high yield potential. However, yield traits and fiber quality traits are negatively correlated in most cases. Here, we constructed four F2 populations, using two normal lines and two introgression lines, to simultaneously dissect the genetic basis underlying complex traits such as yield and fiber quality in upland cotton. Phenotyping of 8 agronomic and economic traits and quantitative trait loci (QTL) mapping were then carried out. Results: Extensive phenotypic variation and transgressive segregation were found across the segregating populations. Four genetic maps were constructed, with lengths of 585.97 cM, 752.45 cM, 855.04 cM and 1163.66 cM. A total of 50 QTLs were identified across the four populations (7 for plant height, 27 for fiber quality traits and 16 for yield traits). Some QTLs identified in different populations, such as qBW4 and qBW2, were linked to common markers. A QTL cluster containing 8 QTLs for 6 different traits was characterized on D09 in population 4Su. Conclusions: These findings provide insight into the genetic basis of simultaneous improvement of yield and fiber quality in upland cotton breeding.


Introduction
Cotton is the main source of natural textile fibers in the world and the most prevalent raw material used in the textile industry. High yield and fine fiber quality are prerequisites to meet the ever-increasing demand of the textile industry. Upland cotton (Gossypium hirsutum) accounts for more than 90% of global cotton production due to its high yield potential and broader adaptability, but has moderate fiber quality, whereas G. barbadense produces exceptionally fine-quality fibers with lower fiber yield (Cai et al. 2014; Hu et al. 2019).
The majority of agronomic and economic traits, such as yield and fiber quality, are quantitative traits controlled by multiple loci/genes. Moreover, environmental influence is substantial in the control and expression of these traits. Meanwhile, previous reports suggested significantly negative correlations between fiber quality traits and yield traits (Liu et al. 2018; Zhang et al. 2019).
Therefore, dissecting the genetic basis of yield and fiber quality is essential, and it would contribute to the simultaneous improvement of yield and fiber quality.
As a modern molecular genetic method, molecular markers have been widely applied in cotton in the last decade. Recently, molecular markers have developed rapidly with the release of assembled genome sequences of G. hirsutum (Li et al. 2015; Zhang et al. 2015; Wang et al. 2018; Yang et al. 2019) and G. barbadense (Liu et al. 2015; Yuan et al. 2015). Numerous genetic linkage maps, including intraspecific maps within G. hirsutum and interspecific maps between G. hirsutum and G. barbadense, have been constructed using restriction fragment length polymorphism (RFLP), simple sequence repeat (SSR) and single nucleotide polymorphism (SNP) markers, among others. According to CottonQTLdb (Release 2.3; Said et al. 2013; Said et al. 2015), thousands of quantitative trait loci (QTLs) for yield and fiber quality in cotton have been detected. However, to date, few studies have used QTL mapping in multiple upland cotton populations to simultaneously dissect the genetic basis underlying complex traits and their genetic correlations.
In the present study, four F2 populations derived from crosses between two G. hirsutum normal lines (4133B and SGK9708) and two introgression lines (Suyuan04-3 and J02-247) were used.
Subsequently, four corresponding genetic linkage maps were constructed using SSR markers. QTL mapping was implemented by integrating genotypic data with phenotypic data for eight agronomic and economic traits, including yield and fiber quality. These findings will not only contribute to dissecting the genetic basis underlying yield and fiber quality and their genetic correlations, but also provide insight into the simultaneous improvement of yield and fiber quality in upland cotton breeding.

Plant Materials and Field Experiments
Two G. hirsutum normal lines (4133B and SGK9708), which have high yield potential but moderate fiber quality, and two introgression lines (Suyuan04-3 and J02-247), which have superior fiber quality, were used as parents. SGK9708 was derived from CRI41, a widely planted cultivar with wide adaptability; 4133B has greater combining ability and was derived from the hybridization of SGK9708 and the offspring of Gan4104 and CZA (70)

SSR Marker Analysis
Genomic DNA of individuals from the F2 populations and their parents was extracted from young leaf tissue using a modified cetyltrimethylammonium bromide (CTAB) method (Paterson et al. 1993).
Polymorphism between the four pairs of parents was screened using 5713 SSR primers. Primers that amplified stable polymorphic products were selected for genotyping the F2 populations. The sequences of the SSR primers were downloaded from CottonGen (https://www.cottongen.org/, Yu et al. 2014). To place the SSRs on the physical map, a local BLAST search was performed (Altschul et al. 1990): the SSR sequences were queried against the G. hirsutum genome sequences, and the top BLAST hit for each query was selected for further analysis. Polymerase chain reaction (PCR), separation of amplified products and silver staining were performed as detailed by Feng et al. (2015).
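The top-hit selection described above can be sketched as follows. This is only an illustration under stated assumptions: the text does not say which BLAST output format or ranking criterion was used, so the sketch assumes the standard 12-column tabular output (`-outfmt 6`) and ranks hits by bit score. The function name, marker names and chromosome labels are hypothetical.

```python
def top_blast_hits(tabular_text):
    """Map each query ID to its best hit: (subject, start, end, bitscore).

    Assumes the default 12 tab-separated columns of BLAST tabular output:
    qseqid sseqid pident length mismatch gapopen qstart qend sstart send
    evalue bitscore. Ranking by bit score is an assumption of this sketch.
    """
    best = {}
    for line in tabular_text.splitlines():
        if not line.strip() or line.startswith("#"):
            continue  # skip blank lines and comment lines
        fields = line.split("\t")
        query, subject = fields[0], fields[1]
        s_start, s_end = int(fields[8]), int(fields[9])
        bitscore = float(fields[11])
        # Keep the hit with the highest bit score seen so far for this query.
        if query not in best or bitscore > best[query][3]:
            best[query] = (subject, s_start, s_end, bitscore)
    return best


# Hypothetical hits for two made-up markers (not data from the study).
example = "\n".join([
    "marker_1\tchrA\t100.0\t20\t0\t0\t1\t20\t5000\t5019\t1e-05\t40.1",
    "marker_1\tchrB\t95.0\t20\t1\t0\t1\t20\t900\t919\t0.002\t32.2",
    "marker_2\tchrC\t100.0\t22\t0\t0\t1\t22\t120\t141\t1e-06\t44.1",
])
hits = top_blast_hits(example)
print(hits["marker_1"])
```

Keeping only one hit per marker is simple, but in a tetraploid genome such as G. hirsutum a marker may match homoeologous chromosomes almost equally well, so close second hits may deserve inspection.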

Genetic map construction
The genetic linkage map was constructed using JoinMap 4.0 with a regression mapping method and logarithm of odds (LOD) threshold of 5.0. The Kosambi function was used to convert the recombination frequencies to map distances.
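For readers unfamiliar with the Kosambi function, the conversion from a recombination frequency r to a map distance d (in cM) is d = 25 ln[(1 + 2r) / (1 - 2r)]. A minimal sketch follows; the function names are illustrative and are not part of JoinMap.

```python
import math


def kosambi_cm(r):
    """Convert a recombination frequency (0 <= r < 0.5) to centimorgans."""
    if not 0 <= r < 0.5:
        raise ValueError("recombination frequency must be in [0, 0.5)")
    # d (Morgans) = 0.25 * ln((1 + 2r) / (1 - 2r)); multiply by 100 for cM.
    return 25.0 * math.log((1 + 2 * r) / (1 - 2 * r))


def kosambi_r(d_cm):
    """Inverse mapping: centimorgans back to recombination frequency."""
    return 0.5 * math.tanh(2 * d_cm / 100.0)


print(round(kosambi_cm(0.10), 2))
```

For small r the distance is close to 100r cM; it grows faster as r approaches 0.5, reflecting the allowance the Kosambi function makes for crossover interference.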

QTL mapping and analysis
Win QTL Cartographer 2.5 was used to identify QTLs with the composite interval mapping (CIM) method. The main parameters were set as 1.0 cM for the mapping step, 5 for control markers, and 1,000 for permutation tests. QTLs were declared significant if the corresponding LOD score was greater than 2.5.
Meanwhile, the additive effect, dominance effect and R² (the percentage of phenotypic variance explained by a QTL) were estimated. QTLs were named as q + trait abbreviation + linkage group number (e.g. qBW4). The action mode of a QTL was represented by the dominance degree, i.e. the absolute value of the dominance effect divided by the additive effect (|D/A|; Stuber, 1987). A QTL was considered additive if the dominance degree was less than 0.2, partially dominant between 0.2 and 0.8, dominant between 0.81 and 1.2, and over-dominant at higher values.
The QTLs identified here were compared with the CottonQTLdb database (Said et al. 2015) to determine whether they were novel or had been detected before. Briefly, QTLs in the present study that shared the same or overlapping confidence intervals with QTLs in the database, based on common marker positions, were considered to have been identified in previous studies.
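The dominance-degree classification can be sketched as a small function. The text gives no explicit cutoff for over-dominance, so the sketch assumes that every value above 1.2 falls in that class; it also assigns values between 0.8 and 0.81 to the dominance class. The function name and the example effect values are hypothetical.

```python
def action_mode(additive, dominance):
    """Classify the action mode of a QTL from its dominance degree |D/A|."""
    if additive == 0:
        raise ValueError("dominance degree is undefined when the additive effect is 0")
    degree = abs(dominance / additive)
    if degree < 0.2:
        return "additive"
    if degree <= 0.8:
        return "partial dominance"
    if degree <= 1.2:
        return "dominance"
    # Assumption: anything above the dominance range is over-dominance.
    return "over-dominance"


# Made-up effect estimates, for illustration only.
for a, d in [(1.0, 0.1), (1.0, 0.5), (-2.0, 2.0), (0.5, 5.0)]:
    print(a, d, action_mode(a, d))
```

Because the ratio uses absolute values, the class does not depend on the sign of either effect. A very small additive effect inflates |D/A|, which is worth keeping in mind when very large dominance degrees are reported.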

Results
Phenotypic variation of four F2 populations
The phenotypes of eight agronomic and economic traits were evaluated across the four F2 populations. Extensive phenotypic variation and transgressive segregation were observed (Table 1 and Fig. 1). Transgressive segregation means that the phenotypic values of some individuals were better than those of the superior parent and those of others were worse than those of the inferior parent (Reyes 2019).
Overall, within populations, the majority of correlations between the two yield traits, BW and LP, were negative. In contrast, the majority of correlations among fiber quality traits were positive, as were those between BW and fiber quality traits (Fig. 2). The correlations between LP and fiber quality traits were either positive or negative. Significant correlations between multiple traits were observed in the 4Su, 4J and Sg4 populations (Fig. 2), suggesting the influence of the common parent 4133B on these traits.
JoinMap 4.0 software was employed to construct the genetic linkage maps. For the 4Su population, a total of 71 markers were assigned to 10 linkage groups (LGs) with a total map length of 585.97 cM (Table 2, Additional file 1: Fig. S1, Additional file 6: Table S2a). The average length of the linkage groups was 58.6 cM, and the average distance between markers was 8.25 cM. The longest LG, LG9, contained the most markers (27), but half of the LGs contained only three markers.
For the 4J population, a map of 752.45 cM was constructed, with 61 markers mapped across 10 linkage groups (Table 2, Additional file 2: Fig. S2, Additional file 6: Table S2b). The average length of the linkage groups was 75.2 cM, and the average distance between markers was 12.34 cM.
For the SgJ population, 83 markers, approximately half of the 158 polymorphic markers, were mapped in 15 linkage groups (Table 2, Additional file 3: Fig. S3, Additional file 6: Table S2c). The total length of the map was 855.04 cM and the average length of the linkage groups was 57 cM. The largest interval between adjacent markers was 21.46 cM on LG13 and the smallest was 1.06 cM on LG14.
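The summary statistics above can be recomputed from the totals given in the text. The sketch below assumes that the average marker distance is the total map length divided by the number of markers; this is inferred from the reported values rather than stated as the method. The variable and function names are illustrative.

```python
# (total map length in cM, number of markers, number of linkage groups),
# as reported in the text for three of the four populations.
maps = {
    "4Su": (585.97, 71, 10),
    "4J": (752.45, 61, 10),
    "SgJ": (855.04, 83, 15),
}


def summarize(total_cm, n_markers, n_lgs):
    """Return (average LG length, average marker distance), both in cM."""
    return total_cm / n_lgs, total_cm / n_markers


summary = {pop: summarize(*vals) for pop, vals in maps.items()}
for pop, (lg_len, spacing) in summary.items():
    print(f"{pop}: mean LG length {lg_len:.1f} cM, mean marker distance {spacing:.2f} cM")
```

Dividing by the number of markers rather than the number of intervals (markers minus linkage groups) gives a slightly smaller spacing, so the convention matters when comparing maps across studies.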
For PH, 7 QTLs were identified, 6 of them in the 4Su population, and all had minor effects (R² of 0.11% to 4.02%; Table 3, Fig. 3). The additive effects of the two QTLs with the higher R² values, qPH2-1 and qPH2-2 (2.66% and 4.02%), were positive, indicating that the favourable alleles came from the parent Suyuan04-3. The action mode of qPH2-1 and qPH2-2 was over-dominance based on the dominance degree.
For BW, a total of 8 QTLs with R² of 1.17% to 9.31% were identified in 4J (1), SgJ (1) and Sg4 (6) (Table 3, Fig. 3). It is noteworthy that the LGs harbouring the QTL in 4J (qBW4) and the QTL in SgJ (qBW2) were both anchored to chromosome A05; moreover, a common SSR marker, NAU1255, was detected near the QTL intervals. It was inferred that NAU1255 is closely linked to BW. Furthermore, the directions of the additive and dominance effects of the two QTLs were the same.
For FL, the most QTLs (11) were detected: 6, 1 and 4 QTLs in the 4Su, 4J and SgJ populations, respectively (Table 3, Fig. 3). Multiple QTLs were located in the same LG of a population; for example, qFL9-1, qFL9-2 and qFL9-3, with R² of 0.35% to 7.70%, were in LG9 of the 4Su population. Interestingly, both LG7 in the 4Su population and LG6 in the SgJ population were anchored to chromosome A13. Moreover, the common SSR markers BNL2449 and NAU1211 were detected near the intervals of QTLs qFL7 (4Su) and qFL6, hinting that BNL2449 and NAU1211 are closely linked to FL. In addition, the additive effect of qFL2-2 was positive, suggesting that the favourable alleles come from the male parents, Suyuan04-3 and J02-247, which have superior fiber quality.
For FS, a total of 5 QTLs were identified: 4 QTLs with R² of 2.95% to 7.15% in the 4Su population and 1 major QTL with R² of 15.10% in the Sg4 population (Table 3, Fig. 3). The additive effects of the 4 QTLs in the 4Su population were positive, whereas that of the major QTL in the Sg4 population was negative, implying that parent 4133B may not confer the favourable allele.
For FE, a total of 4 QTLs with R² of 0.16% to 5.62% were detected in the 4Su, SgJ and Sg4 populations (Table 3, Fig. 3). The additive effect of one QTL, qFE8, was negative and its action mode was additive, whereas the additive effects of the other three QTLs were positive and their action mode was over-dominance.
For MIC, a total of 5 QTLs were detected across 3 LGs in the 4Su and Sg4 populations (Fig. 3). The major QTL qMIC2, in LG2 of the Sg4 population, had an R² of up to 59.24%, whereas the R² values of the other four QTLs were minor (0.15% to 6.29%). The dominance degrees of all QTLs except qMIC9-2 ranged from 9.41 to 92.03, suggesting that their action mode was over-dominance.
There was a hotspot region in LG9 of the 4Su population (Fig. 3A). Three QTLs (

QTLs Comparison and Analysis
We compared the QTLs identified here with those in the CottonQTLdb database. The results showed that one-fifth of the QTLs (10/50) overlapped with previously reported QTLs, illustrating the reliability of the QTL mapping in the present paper, while the remaining 40 QTLs were novel. The 10 overlapping QTLs were for FL (4), FS (2), PH (1), BW (1), LP (1) and FE (1). FL had the most QTLs both in the present research (11) and in the CottonQTLdb database (494), which may have increased the probability of a match.
QTLs for different traits that shared the same or overlapping confidence intervals were considered to reside in QTL clusters. In the present study, a total of 9 QTL clusters were identified in the 4Su (5), 4J (1) and Sg4 (3) populations. The cluster harbouring the most QTLs was the above-mentioned hotspot region in LG9 of the 4Su population, with 8 QTLs for 6 traits (Fig. 3A). Another QTL cluster, harbouring QTLs for FU and MIC, was located in the same LG (Fig. 3A).
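The cluster definition can be illustrated with a short interval-merging sketch. It assumes each QTL is described by a linkage group and a confidence interval in cM, and it chains overlaps transitively (A overlaps B, B overlaps C gives one cluster), which is one possible reading of the definition. The QTL names and positions below are hypothetical, not values from Table 3.

```python
from collections import defaultdict


def find_clusters(qtls):
    """Group QTLs with overlapping intervals on the same linkage group.

    qtls: list of (name, trait, linkage_group, start_cM, end_cM).
    Returns a list of (linkage_group, [names]) for groups covering
    at least two different traits.
    """
    by_lg = defaultdict(list)
    for qtl in qtls:
        by_lg[qtl[2]].append(qtl)

    clusters = []
    for lg, items in by_lg.items():
        items.sort(key=lambda q: q[3])  # sort by interval start
        group, group_end = [], None
        for qtl in items:
            if group and qtl[3] <= group_end:
                group.append(qtl)  # overlaps the running group
                group_end = max(group_end, qtl[4])
            else:
                if len({g[1] for g in group}) >= 2:
                    clusters.append((lg, [g[0] for g in group]))
                group, group_end = [qtl], qtl[4]
        if len({g[1] for g in group}) >= 2:
            clusters.append((lg, [g[0] for g in group]))
    return clusters


# Made-up QTLs for illustration only.
demo = [
    ("qX1", "FL", "LG1", 10.0, 20.0),
    ("qY1", "LP", "LG1", 18.0, 25.0),
    ("qX2", "FL", "LG1", 40.0, 45.0),
    ("qZ1", "FS", "LG2", 5.0, 9.0),
]
clusters = find_clusters(demo)
print(clusters)
```

The same overlap test could be applied to the comparison with database QTLs, provided both sets of intervals are expressed on a common marker framework.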
BW and LP represent yield traits, and FL, FS, FU, FE and MIC represent fiber quality traits. On this basis, paired-trait QTLs were analysed. There were 19 paired-trait QTLs within 6 trait pairs (BW with FL and FE; LP with FL, FS, FU and FE) that exhibited significant medium or high positive correlations (|r| > 0.3) in the F2 populations. Among them, 6 paired-trait QTLs had the same direction of additive effect (Additional file 7: Table S3).

Discussion
To dissect the genetic basis underlying yield and fiber quality as well as their genetic correlations, two upland cotton normal lines (4133B and SGK9708) and two introgression lines (Suyuan04-3 and J02-247) were selected as parents, and four populations were constructed. Among these populations, the female parents of 4Su, 4J and SgJ were lines with high yield potential, and the male parents were lines with superior fiber quality. Thus, extensive phenotypic variation was observed in these cross combinations, whose parents are distantly related. Meanwhile, all traits exhibited a normal distribution pattern across the four F2 populations (Table 1 and Fig. 1), suggesting that they are quantitative traits controlled by multiple genes. Furthermore, all traits exhibited transgressive segregation, and many individuals with transgressive phenotypes were found (Table 1 and Fig. 1). For example, the median values of FL and FS in the 4Su, 4J and SgJ populations were all higher than or close to 30; fiber reaching the double-thirty quality values (FL ≥ 30 mm and FS ≥ 30 cN·tex⁻¹) is generally considered fine quality. In plant breeding, transgressive segregation provides an adaptive advantage for traits (Reyes 2019). To a certain extent, high yield and fine-quality fibers are the outcome of adaptation in cotton. Therefore, it is not surprising that many instances of transgressive segregation were observed for these traits in the F2 populations. Furthermore, some of these transgressive lines can be used to breed for high-quality fiber. At the same time, the above-mentioned phenomenon implies that the favourable alleles for fiber quality traits generally come from the introgression-line parents.
It is generally known that quantitative traits are influenced by the environment. Therefore, to identify stable QTLs, mapping populations are usually planted in multiple environments (Diouf et al. 2018; Zhang et al. 2019). In this study, however, stable QTLs such as qBW4 and qBW2 were detected using four F2 populations. Although these two QTLs were identified in different populations (4J and SgJ), they shared a common marker and their LGs were anchored to the same chromosome. Thus, these two QTLs could be considered as one QTL. In brief, this study provides an alternative method of detecting stable QTLs through multiple populations.
The QTL clusters observed here were consistent with previous studies, i.e. QTLs for fiber quality are clustered on the same chromosome, and chromosome D09, to which the majority of markers in LG9 mapped, harbours important loci regulating fiber quality traits (He et al. 2007; Qiao et al. 2019). These results illustrate that QTLs in clusters might be closely linked or have pleiotropic effects (Vikram et al. 2015; Zhao et al. 2016; You et al. 2019; Yuan et al. 2018), which explains the significant phenotypic correlations or linkage drag between related traits (Zhang et al. 2019). For paired-trait QTLs, if they have the same direction of additive effect and the traits show significant medium or high positive correlations, it will be easier to improve these traits simultaneously (Zhang et al. 2019). In the present study, we identified 6 paired-trait QTLs with significant positive correlation and additive mode of gene action (Additional file 7: Table S3). These results provide valuable information for further simultaneous improvement of yield and fiber quality traits.
Based on the above conclusion, we found that qLP9 for LP and qFL9-1 as well as qFL9-2 for FL were in the same QTL cluster in LG9 of the 4Su population. Furthermore, a high positive correlation and the same direction of additive effect were observed between LP and FL. Therefore, we propose prioritising the above-mentioned QTL cluster for inclusion in a breeding program, following fine mapping of QTL clusters with large-scale segregating populations and gene-editing technology, to break the negative correlation and further improve yield and fiber quality.