Skip to main content

Identification of candidate genes controlling fiber quality traits in upland cotton through integration of meta-QTL, significant SNP and transcriptomic data

Abstract

Background

Meta-analysis of quantitative trait locus (QTL) is a computational technique to identify consensus QTL and refine QTL positions on the consensus map from multiple mapping studies. The combination of meta-QTL intervals, significant SNPs and transcriptome analysis has been widely used to identify candidate genes in various plants.

Results

In our study, 884 QTLs associated with cotton fiber quality traits from 12 studies were used for meta-QTL analysis based on reference genome TM-1, as a result, 74 meta-QTLs were identified, including 19 meta-QTLs for fiber length; 18 meta-QTLs for fiber strength; 11 meta-QTLs for fiber uniformity; 11 meta-QTLs for fiber elongation; and 15 meta-QTLs for micronaire. Combined with 8 589 significant single nucleotide polymorphisms associated with fiber quality traits collected from 15 studies, 297 candidate genes were identified in the meta-QTL intervals, 20 of which showed high expression levels specifically in the developing fibers. According to the function annotations, some of the 20 key candidate genes are associated with the fiber development.

Conclusions

This study provides not only stable QTLs used for marker-assisted selection, but also candidate genes to uncover the molecular mechanisms for cotton fiber development.

Background

As a natural and renewable resource, cotton fiber has been the most important raw material in the textile and processing industry all over the world. With the improvement of people’s living standard and advancements in techniques and diversified methods of spinning, demand for high quality cotton fiber is increasing. Cotton fibers are derived from ovule epidermal cells − 2 to ~ 0 day post anthesis (DPA), and ultimately reach 2.53.5 cm in the mature period (Stewart 1975). The development consists of four stages: fiber initiation, cell elongation, secondary cell wall (SCW) biosynthesis, and maturation (Li et al. 2018a, b). The development mechanism of cotton fiber contributed to the fiber quality improvement. Cotton fiber quality traits are complex quantitative traits, which are influenced by environments and the fiber development, and controlled by many quantitative trait loci (QTLs), including fiber length (FL), fiber strength (FS), fiber uniformity (FU), fiber elongation (FE), and micronaire (MIC), etc. (Ademe et al. 2017; Wang et al. 2016). FL and FS are considered as the most important traits affecting yarn quality, and FS is important for advanced spinning technologies in the textile industry (Yang et al. 2016). The MIC is a measure of fiber fineness and fiber maturity, which influences the fiber processing and dyeing consistency (Rodgers et al. 2017). Hundreds of QTLs contributing to fiber quality traits have been previously mapped for cotton using a variety of populations, and were found evenly distributed throughout the cotton genome (Qin et al. 2008; Shen et al. 2007; Wang et al. 2020; Yu et al. 2013). A total of 104 QTLs for fiber quality traits were detected by using 180 recombinant inbred lines (RIL) derived from Herein and Yumian 1, and 25 QTLs were detected in all three environments (Tan et al. 2018). A total of 134 QTLs for fiber quality traits were detected using 231 F6:8 RILs, which were derived from an intraspecific cross between Xinluzao24 and Lumianyan 28 (Liu et al. 2018b). Seventy-four QTLs were detected to be associated with five fiber quality traits (30 QTLs) and eight yield traits (44 QTLs) using 107 introgression lines, which were developed with an interspecific cross using G. hirsutum acc. 4105 as the recurrent parent and G. tomentosum as the donor parent (Keerio et al. 2018). One hundred and eighty-six additive QTLs were obtained for five fiber quality traits using 137 RILs (Jia et al. 2018).

Marker-assisted selection (MAS) has been successfully applied in genetic improvement of varieties of crops, especially for the major QTLs/genes, such as improvement of rice blast resistance by pyramiding three genes in rice (Xiao et al. 2019), improvement of drought adaptation in maize (Ribaut and Ragot 2007), yield traits in soybean (Reyna and Sneller 2001; Sebastian et al. 2010), Fusarium head blight resistance in wheat (Anderson 2007), and Verticillium wilt resistance in cotton (Zhang et al. 2014). QTL mapping with molecular markers provides a powerful approach to dissect the molecular mechanism underlying complex fiber quality traits (Ijaz et al. 2019). There were thousands of QTLs for cotton fiber quality traits identified in different mapping populations such as RILs, bi-parental segregating populations, and backcross populations (Said et al. 2015b), which provide the potential to be manipulated by MAS for the improvement of cotton fiber quality traits. However, only the stable QTLs for cotton fiber quality traits in various environments and populations can be used in the MAS breeding. In order to make these mapped QTLs more useful to plant breeding and gene cloning, a further analysis of all these loci has to be carried out. In this regard, meta-analysis of QTLs has been proven as an efficient approach to stablish the occurrence of QTL “hotspots” in a consensus map, which correspond to the more precise region where these loci represent under analysis (Goffinet and Gerber 2000; Salvi and Tuberosa 2015). More than three overlapped or location–similar QTLs reported in multiple documents of the same trait is considered as a meta-QTL. This approach was already applied to various crops and complex traits, such as Fusarium head blight resistance in bread wheat (Venske et al. 2019), grain weight in tetraploid wheat (Avni et al. 2018), cyst nematode resistance in soybean (Guo et al. 2006a), resistance to white mold in common bean (Vasconcellos et al. 2017), yield under drought in rice (Swamy et al. 2011), yield in maize (Martinez et al. 2016), and multiple traits in cotton (Said et al. 2013).

Several strategies were combined to identify candidate genes, such as combination of association mapping and linkage analysis (Cui et al. 2018; Mahuku et al. 2016; Zhang et al. 2019a) and combination of QTLs and transcriptome analysis (Chen et al. 2018; Shimono et al. 2016; Wang et al. 2020). A sucrose synthesis-related gene (Gh_D03G1338) associated with FL was identified by the combination of genome-wide association and linkage analyses (Zhang et al. 2019a). Three genes, Gh_D05G1077 and Gh_D13G1571 for SY, and Gh_A11G0775 for LY, were identified using genome-wide association mapping. Five candidate genes were identified by the combination of QTL mapping and transcriptome analysis, which regulated pericarp thickness in sweet corn (Wu et al. 2020). A peroxidase gene (GhPRXR1) required for oil content in upland cotton was identified by the combination of genome-wide association and transcriptome analysis (Ma et al. 2019). Two candidate genes for fiber elongation and developmental were identified by the combination of genome-wide association and transcriptome analysis (Ma et al. 2018a).

Some novel genes functioning in fiber initiation and elongation have been verified by molecular biology methods, for example, an R2R3 MYB transcription factor gene (GhMYB25-like) (Walford et al. 2011), a homeodomain leucine zipper gene (GhHD-1) (Walford et al. 2012), a vacuolar invertase gene (GhVIN1)(Wang et al. 2014), a cotton actin gene (GhACT1)(Li et al. 2005), cotton annexin genes (AnxGb6 and GhAnn2)(Huang et al. 2013; Tang et al. 2014), an fiber-specific profiling gene (GhPFN2)(Wang et al. 2010), and an actin-depolymerizing factor gene (GhADF1) (Wang et al. 2009). In addition, cellulose synthases genes GhCesA1, WLIM1a and GhADF1 are responsible for SCW in cotton fibers (Han et al. 2013; Salnikov et al. 2003; Wang et al. 2009).

In this study, 884 QTLs associated with cotton fiber traits from 12 studies (Ali et al. 2018; Diouf et al. 2018; Huang et al. 2017; Jia et al. 2018; Keerio et al. 2018; Li et al. 2016a; Liu et al. 2018b; Ma et al. 2018a; Tan et al. 2018; Wang et al. 2015; Zhang et al. 2015b; Zou et al. 2014) were used for meta-QTL analysis based on upland cotton reference genome TM-1 (Zhang et al. 2015a), and 74 meta-QTLs were identified. Combined with 8 589 significant SNP loci associated to cotton fiber quality traits collected from 15 previous publications (Chandnani et al. 2018; Fang et al. 2017; Gapare et al. 2017; Handi et al. 2017; Huang et al. 2017; Islam et al. 2016; Li et al. 2017b, 2018a, b; Liu et al. 2018b; Ma et al. 2018a, b; Su et al. 2016, 2018; Sun et al. 2017; Wen et al. 2018), 297 candidate genes associated with cotton fiber quality traits were identified. Twenty genes showed high expression levels specifically in the developing fibers, some of which are associated with the fiber development. According to the results, the combination of meta-QTL, significant SNP by genome-wide association analysis (GWAS), and spatiotemporal expression analysis provides not only stable QTLs used for MAS, but also candidate genes to uncover the molecular mechanisms for cotton fiber development.

Methods

Data collection and organization

From the Web of Science website (http://www.webofknowledge.com/), G. hirsutum, fiber quality traits, GWAS, SNP, QTL, and high density genetic map (HDGM) were used as keywords, and more than 50 related articles were retrieved. The articles providing QTL intervals and flanking markers were selected for QTL collection. Finally, 884 QTLs with respect to FE, FL, FS, MIC, FU, spinning consistency index (SCI), short fiber (SF), fiber reflectance (FR) and fiber yellowness (FY) traits were identified from the Web of Science (Ali et al. 2018; Diouf et al. 2018; Huang et al. 2017; Jia et al. 2018; Keerio et al. 2018; Li et al. 2016a; Liu et al. 2018b; Ma et al. 2018a; Tan et al. 2018; Wang et al. 2015; Zhang et al. 2015b; Zou et al. 2014). QTL numbers, traits, population type and size, and number of markers are listed in Table 1. GWAS data, including 8 589 SNPs significant loci associated with FL, FS, FU, FE, MIC, MIC, SCI, SF, and FC are listed in Table S1, Table S2 (Chandnani et al. 2018; Fang et al. 2017; Gapare et al. 2017; Handi et al. 2017; Huang et al. 2017; Islam et al. 2016; Li et al. 2017b, 2018a, b; Liu et al. 2018b; Ma et al. 2018a, b; Su et al. 2016, 2018; Sun et al. 2017; Wen et al. 2018).

Table 1 Fiber quality traits QTLs mapped by SNP markers from 12 papers

Meta-QTL analysis

Since SNPs are developed by genome sequencing, each marker has a fixed and unique location in the genome. By anchoring the SNPs on both sides of the QTLs to the TM-1 genome (Zhang et al. 2015a), the confidence interval of the QTLs can be determined. A stable meta-QTL region was obtained by manual organizing, and the stable meta-QTL intervals were illustrated in the form of Circos plot using Circos software (Krzywinski et al. 2009). The SNP loci significantly correlated with the same trait was compared with the meta-QTL intervals; thereby the most likely location of the candidate genes in the meta-QTL intervals was determined.

Candidate gene identification

Eight thousand five hundred and eighty-nine significant SNP loci associated to cotton fiber quality traits were collected from 15 GWAS studies and mapped to TM-1 genome (Chandnani et al. 2018; Fang et al. 2017; Gapare et al. 2017; Handi et al. 2017; Huang et al. 2017; Islam et al. 2016; Li et al. 2017b, 2018a, b; Liu et al. 2018b; Ma et al. 2018a, b; Su et al. 2016, 2018; Sun et al. 2017; Wen et al. 2018) (Table S1, Table S2). Then they were mapped to the 74 meta-QTLs, and the mapped SNPs are shown in Table S3. Lastly, the genes closely linked to the SNPs (SNPs within genes) were selected as candidate genes, which are shown in Table S4.

Gene ontology (GO) analysis and Kyoto Encyclopedia of Genes and Genomes (KEGG) pathway analysis

The significant enrichment analysis of gene ontology (GO) terms was carried out using agriGO v2.0 software (P < 0.05) (http://bioinfo.cau.edu.cn/agriGO/) and Kyoto Encyclopedia of Genes and Genomes (KEGG) pathway analysis of genes was performed with KEGG Automatic Annotation Server (KAAS). R package clusterProfiler (http://bioconductor.org/packages/2.8/bioc/html/clusterProfiler.html) was used for result visualization.

Gene expression patterns

The tissue expression levels of the candidate genes were obtained from previously reported transcriptome data (Zhang et al. 2015a).

Results

Collection of QTLs and SNPs associated with fiber quality traits

A total of 12 QTL mapping studies for cotton fiber quality traits were used in this study, in which the mapping population size ranged from 107 to 503 lines (Table 1) (Ali et al. 2018; Diouf et al. 2018; Huang et al. 2017; Jia et al. 2018; Keerio et al. 2018; Li et al. 2016a; Liu et al. 2018b; Ma et al. 2018a; Tan et al. 2018; Wang et al. 2015; Zhang et al. 2015b; Zou et al. 2014), and the number of SNP markers ranged from 168 to 19 191 (Table 1). As a result, a total of 884 initial QTLs related to cotton fiber quality traits were collected, which were unevenly distributed on each chromosome, and ranged from 12 to 57 (Fig. 1, Table S5). Chromosome A4 had the lowest number of QTLs and chromosome A10 had the highest number of QTLs. Among them, there were a large number of QTLs related to FL, FS, MIC, FU, and FE, which were 204, 207, 179, 118, and 108, respectively. However, the number of QTLs related to the SCI, SF, Fr, and FY were 21, 19, 13 and 15, respectively (Table S5).

Fig. 1
figure 1

Initial QTL distribution on 26 chromosomes of G. hirsutum genome.

FL, fiber length; FS, fiber strength; MIC, micronaire; FU, fiber uniformity; FE, fiber elongation; SCI, spinning consistency index; SF, short fiber; FR, fiber reflectance; FY, fiber yellowness

Meta-analysis of QTL for fiber quality traits

A meta-analysis was performed with 884 QTLs related to cotton fiber quality traits, and a total of 74 stable meta-QTLs related to FL, FS, FE, MIC, and FU were obtained, including 19 for FL, 18 for FS, 11 for FU, 11 for FE, and 15 for MIC, which covered 26 upland cotton chromosomes. There were 33 meta-QTLs in the A sub-genome and 41 in the D sub-genome. The confident intervals (CI) of all meta-QTLs were smaller than their respective initial QTLs, which ranged from 2.4 Mb to 13.4 Mb, with an average of 8.5 Mb (Fig. 2, Table S6). Among the 74 meta-QTLs, 19 were obtained from multiple QTLs coincident regions in 4 or more studies (Table S6), indicating that these regions had high correlation with cotton fiber quality traits.

Fig. 2
figure 2

Locations of Meta-QTLs and significant SNPs.

Each color represents one trait, red for FL, gray for FS, blue for FU, green for FE, and yellow for MIC. There are two tracks for each trait, the inner tracks represent the meta-QTLs, and the outer tracks represent significant SNP loci associated with corresponding traits. FL, fiber length; FS, fiber strength; MIC, micronaire; FU, fiber uniformity; FE, fiber elongation

Meta-QTLs for FL

A total of 19 meta-QTLs related to FL were obtained, covering 17 chromosomes, including 7 meta-QTLs on the A sub-genome (A01, A05, A08, A09, A10, and A12), and 12 Meta-QTLs on the D sub-genome (D02, D03, D04, D05, D06, D08, D09, D10, D11, D12, and D13) (Fig. 2, Table S6). Three meta-QTLs have been mapped in more than 5 studies, namely meta-QTL-3, meta-QTL-15, and meta-QTL-17, which were located in the 18.8 Mb31.70 Mb region on A05 chromosome, the 54.17 Mb61.86 Mb region on D10 chromosome, and the 15.70 Mb24.32 Mb region on D11 chromosome, respectively (Table S6).

Meta-QTL for FS

A total of 18 meta-QTLs related to FS were obtained, covering 15 chromosomes, including 9 meta-QTLs on the A sub-genome (A01, A02, A03, A05, A07, A09, and A10), and 9 meta-QTLs on the D sub-genome (D04, D05, D06, D07, D08, D10, D11, and D12) (Fig. 2, Table S6). Among them, four meta-QTLs have been identified in the four studies, namely meta-QTL-23, meta-QTL-25, meta-QTL-27, and meta-QTL-30, which were located in 5.81 Mb14.90 Mb on A05, 63.33 Mb73.57 Mb on A07, 93.51 Mb100.20 Mb on A10, and 4.01 Mb15.81 Mb region on D05, respectively (Table S6).

Meta-QTL for FU

A total of 11 meta-QTLs related to FU were obtained, covering 11 chromosomes, including 5 meta-QTLs on the A sub-genome (A02, A06, A09, A10, and A11), and 6 meta-QTLs on the D sub-genome (D01, D02, D03, D05, D11, and D12). Among the meta-QTLs, meta-QTL-45 on D03 and meta-QTL-47 on D11 chromosomes are more reliable for identification in four studies (Fig. 2, Table S6).

Meta-QTLs for FE

A total of 11 meta-QTLs related to FE were obtained, covering 11 chromosomes, including 4 meta-QTLs on the A sub-genome (A05, A10, A11, and A13), and 7 meta-QTLs on the D sub-genome (D01, D04, D07, D11, and D12) (Fig. 2, Table S6). Among these meta-QTLs, meta-QTL-53 on chromosome D01 was identified in four studies.

Meta-QTL for MIC

A total of 15 meta-QTLs related to MIC were obtained, covering 14 chromosomes, including 8 meta-QTLs on the A sub-genome (A05, A06, A08, A09, A10, A11, and A13), and 7 meta-QTLs on the D sub-genome (D03, D05, D06, D08, D09, D11, and D12) (Fig. 2, Table S6). Two meta-QTLs have been identified in the four studies, namely meta-QTL-61 and meta-QTL-71, which were located in 17.32 Mb26.9 Mb on A05, and 52.98 Mb62.38 Mb on D08, respectively (Table S6).

Candidate genes identification combined and meta-QTL intervals and significant SNPs

Eight thousand five hundred eighty-nine significant SNP loci associated to cotton fiber quality traits were collected from 15 GWAS studies and mapped to the TM-1 genome (Chandnani et al. 2018; Fang et al. 2017; Gapare et al. 2017; Handi et al. 2017; Huang et al. 2017; Islam et al. 2016; Li et al. 2017b, 2018a, b; Liu et al. 2018b; Ma et al. 2018a, b; Su et al. 2016, 2018; Sun et al. 2017; Wen et al. 2018) (Table S1, Table S2, Fig. 2), 4 343 of which were mapped in the 74 meta-QTL regions (Table S3). Two hundred and ninety-seven candidate genes were identified closely linked to the 4 343 SNPs, including 126 genes for FL, 93 for FS, 40 for FU, 20 for FE, 18 for MIC (Table S4).

GO and KEGG enrichment analysis of candidate genes

To identify common characteristics of these genes in biological functions, gene ontology (GO) analysis was performed with the 297 candidate genes, and 200 of them had ontology annotations, which were classified into the three main GO categories (biological process, molecular function, and cellular component) and 15 GO terms (Fig. 3; Table S7). In the biological process category, protein modification (30, 15%), cellular protein modification (30, 15%), protein metabolism (46, 23%), macromolecule modification (30, 15%), macromolecule metabolism (69, 34.5%), cellular protein metabolism (37, 18.5%), cellular macromolecule metabolism (60, 30%), and proteolysis (11, 5.5%) were the major subcategories (Fig. 3, Table S7). In the cellular component category, 35 (17.5%) genes were enriched in the membrane subcategory (Fig. 3; Table S7). In the molecular function category, protein serine/threonine kinase activity (18, 9%), protein binding (55, 27.5%), peptidase activity (10, 5%), protein tyrosine kinase activity (20, 10%), DNA binding (22, 11%), and transporter activity (16, 8%) were the major subcategories (Fig. 3, Table S7).

Fig. 3
figure 3

GO analysis of candidate genes of fiber quality traits

To further understand the enriched pathways of the candidate genes, KEGG pathway analysis was performed, and 234 annotated genes were assigned to 4 KEGG pathways (P < 0.05), including pentose and glucuronate interconversions, acarbose and validamycin biosynthesis, vitamin digestion and absorption, and membrane trafficking (Table 2). Some of the pathways have been reported to be associated with fiber development, such as the pentose and glucuronate interconversions pathway is associated with fiber elongation.

Table 2 KEGG analysis of candidate genes

Key candidate genes identified from expression patterns

To better understand the molecular function of the candidate genes, the expression in 10 tissues (root, stem, leaf, petal, anther, stigma, fibers at four developmental stages) obtained from the transcriptome datasets of the upland cotton genetic standard TM-1 were used for spatiotemporal expression analysis (Zhang et al. 2015a). Among the 126 candidate genes for FL, nine of which (Gh_D08G1950, Gh_D06G0479, Gh_D11G1626, Gh_D13G1900, Gh_D10G0833, Gh_D13G1965, Gh_A09G1231, Gh_D08G1970, and Gh_D04G1574) showed high expression levels specifically during the development of cotton fiber. Gh_D08G1950, Gh_D06G0479, Gh_D11G1626, Gh_D13G1900, Gh_D10G0833, and Gh_D13G1965 showed high expression levels specifically at fiber SCW biosynthesis stages (20 and 25 DPA); Gh_A09G1231, Gh_D08G1970 and Gh_D04G1574 showed high expression levels specifically at fiber elongation stages (5 and 10 DPA). Among the 93 candidate genes for FS, 5 of which (Gh_A01G1474, Gh_A05G2203, Gh_D08G2110, Gh_A10G2036, and Gh_A07G1801) showed high expression levels specifically during the development of cotton fiber. Gh_A01G1474, Gh_A05G2203 and Gh_D08G2110 showed specifically high expression levels at fiber SCW biosynthesis stages (20 and 25 DPA); Gh_A10G2036, and Gh_A07G1801 showed high expression specifically at fiber elongation stages (5 and 10 DPA). Among the 40 candidate genes for FU, 2 of which (Gh_A11G2663 and Gh_D11G2059) showed high expression levels specifically during the development of cotton fiber. Gh_A11G2663 showed high expression level specifically at fiber SCW biosynthesis stages (20 and 25 DPA); Gh_D11G2059 showed high expression level specifically at fiber elongation stages (5 and 10 DPA). Among the 20 candidate genes for FE, 3 of which (Gh_A13G0282, Gh_A13G0354 and Gh_A11G1313) showed high expression levels specifically at fiber elongation stages (5 and 10 DPA). Among the 20 candidate genes for FE, Gh_D11G1416 showed high expression level specifically at fiber SCW biosynthesis stages (20 and 25 DPA) (Table S8).

To better understand the molecular function of the key candidate genes identified from expression patterns, they were annotated in CottonFGD (https://cottonfgd.org/). As a result, 4 genes (Gh_D10G0833, Gh_A05G2203, Gh_D11G2059, and Gh_A13G0354) have no protein annotation, and others encode a variety of proteins (Table 3). The annotations of the key candidate genes were listed in Table 3.

Table 3 The annotation of key candidate genes

Discussion

SNP-based meta-QTL analysis for cotton fiber quality traits supports complete and accurate genetic information

In previous studies, genetic maps were constructed with SSR markers and other low throughput molecular markers in cotton genetic studies (Guo et al. 2013; Nie et al. 2016; Sun et al. 2012). The low molecular markers density in genetic research could not only reduce the number and accuracy of QTL identification, but also result in large confidence intervals of the QTL in the genome (Guo et al. 2006b; Liu et al. 2012; Su et al. 2016). With the development of sequencing technology, the application of SNP markers are used in cotton genetic research, such as HDGM construction (Ali et al. 2018; Diouf et al. 2018) and GWAS study (Cai et al. 2017; Li et al. 2017a; Sun et al. 2018), which results in the identification of a large number of QTL for cotton fiber quality traits. Though the meta-analysis of QTL for cotton fiber quality traits were already reported, they were based on the low throughput molecular markers (Said et al. 2013, 2015a), which could result in the loss of genetic information, as well as the reduction of accuracy of meta-QTLs. Previous studies have indicated that a region containing multiple QTLs of same traits can be used as a hotspot with a region size of approximately 20 cM (centiMorgan) (Said et al. 2013) or a physical distance of approximately 10 Mb in upland cotton (Keerio et al. 2018). Therefore, in order to ensure the credibility and inclusiveness of the meta-QTLs, the genome region of about 10 Mb was used as the confidence interval of the meta-QTLs. In this study, 884 QTLs for cotton fiber quality traits were collected from high-density genetic maps, which were constructed with large numbers of SNP markers. Seventy-four meta-QTLs were identified. In addition, the application of unique SNP markers makes it easier to map the QTLs from different studies to TM-1 genome of upland cotton.

The meta-QTLs contribute to cotton fiber quality improvement by MAS

MAS needs to be enabled through the identification of robust QTLs, the design of reliable marker systems to select for these QTLs, and the delivery of these QTLs into elite genomic backgrounds to enable their use without associated genetic drag (Cobb et al. 2019). In the study, though 884 QTLs for cotton fiber quality traits were collected for meta-analysis, only 74 meta-QTLs were identified, among which, 19 were obtained from multiple QTL coincident regions of 4 or more studies, including 11 for FL, 4 for FS, 1 for FU, 2 for FE, and 2 for MIC. So the flanking SNP markers of the meta-QTLs identified in the study, especially the 19 meta-QTLs, can be used for MAS to improve cotton fiber quality.

The combination of meta-QTL intervals and significant SNP provide reliable information to identify candidate genes

Due to the challenging detection of rare variants in GWAS and high false-positive rates in QTL mapping, the combination of association mapping and linkage analysis has been wildly used for revealing the genetic architecture of complex quantitative traits (Andersen et al. 2005; Li et al. 2016b; Visscher 2008). Seventeen candidate genes for kernel test weight were identified in maize by the combination of association mapping and linkage analysis (Zhang et al. 2019c). Nineteen candidate genes for plant and ear height were identified in maize by combining association mapping and linkage analysis (Li et al. 2016b). Twenty-five candidate genes for soybean seed protein and oil content were identified by combining association mapping and linkage analysis (Zhang et al. 2019b). In our study, the combination of meta-QTL intervals and significant SNP identified by association mapping was used for candidate gene identification coving the whole cotton genome, which provides more comprehensive and reliable information. As a result, 297 candidate genes associated with cotton fiber quality traits were identified.

Candidate genes are probably involved in the development of cotton fibers

Due to the difficulty of forward genetics research in cotton, transcriptome analysis combined with QTLs has been widely used to identify candidate genes for fiber development (Fang et al. 2014; Shi et al. 2006; Tu et al. 2007; Yoo and Wendel 2014). In our study, transcriptome analysis of ten cotton tissues was used for 297 candidate genes expression pattern analysis, as a result, 20 genes showed high expression levels specifically in the developing fibers. In addition, the encoded proteins and functions of the 20 genes were annotated, and many were associated with fiber development (Table 4). Gh_D11G1626 encoded a COBRA-like protein 4, and expressed in the fiber SCW biosynthesis stages; the function of COBRA-like protein has been reported in sorghum and rice, which were involved in SCW cellulose biosynthesis (Dai et al. 2011; Li et al. 2019; Sato et al. 2010), so Gh_D11G1626 is probably involved in the SWC biosynthesis in cotton. Gh_D13G1900 and Gh_A01G1474 encoded WAT1-related proteins, and showed specifically high expression level at the fiber SCW biosynthesis stages; the WAT1-related protein was probably related to high fiber yield in cotton (Liu et al. 2018a). Gh_D13G1965 and Gh_A11G2663 encoded protein WVD2-like 1, and showed specifically high expression level at the fiber SCW biosynthesis stages; in Arabidopsis, WVD2 was involved in the cell expansion (Yuen et al. 2003). Gh_D08G1970 encoded a probable aquaporin PIP1–2, and showed specifically high expression level at fiber elongation stages; aquaporin proteins were reported to be involved in the cotton fiber cell elongation and development (Li et al. 2013; Yang and Cui 2009). Gh_A10G2036 encoded a Rop guanine nucleotide exchange factor 5, and showed specifically high expression level at fiber elongation stages; Rho of plants (ROP) was reported participated in the spatial patterning of SCWs and regulated the polarized cell growth (Kost 2008; Oda and Fukuda 2014; Yanagisawa et al. 2018), so Gh_A10G2036 is probably involved in the cotton fiber elongation. According to the results, the candidate genes identified by the combination of meta-QTL intervals, significant SNPs, and transcriptome data, are reliable; however, there are still lots of work to do to study the function of the key candidate genes.

Table 4 The functional information of the key candidate gene homologous

Conclusion

In this study, we identified 74 meta-QTLs and 297 candidate genes associated with cotton fiber quality traits, and 20 of which showed high expression levels specifically in the developing fibers and thus are assumed to be associated with the fiber development. The study provides not only stable QTLs used for marker-assisted selection (MAS), but also candidate genes to uncover the molecular mechanisms for cotton fiber development.

Availability of data and materials

The datasets used and analyzed during the current study are available from the corresponding author on reasonable request.

References

Download references

Acknowledgements

We would like to thank the anonymous reviewers for their valuable comments and helpful suggestions which help to improve the manuscript.

Funding

This work was supported by the National Natural Science Foundation of China (31760402), Public Welfare Research Projects in the Autonomous Region (KY2019002), and Special Programs for New Varieties Cultivation of Shihezi University (YZZX201701).

Author information

Authors and Affiliations

Authors

Contributions

Nie XH, Zhang DW, Xu SD, and Pan ZY designed and performed the experiments. Nie XH, Xu SD and Pan ZY wrote the main manuscript text and prepared all figures. Xu SD, Yin FF, Yang QY and Wen TW performed data analysis. Nie XH, Pan ZY, Zhang DW, Lin ZX, and Zhu LF revised and polished the manuscript. All authors contributed in the interpretation of results and approved the final manuscript.

Corresponding authors

Correspondence to ZHANG Dawei or NIE Xinhui.

Ethics declarations

Ethics approval and consent to participate

Not applicable.

Consent for publication

All Authors have provided ethical approval and consent to participate as well as consent for publication.

Competing interests

The authors have declared that no competing interests exist.

Supplementary Information

Additional file 1: Table S1

Initial QTL location on upland cotton TM-1 reference genome. Table S2. The information of the meta-QTL. Table S3. SNPs associated with fiber quality traits from 15 papers. Table S4. Significant SNPs location on upland cotton TM-1 reference genome. Table S5. Significant SNPs in the mQTL. Table S6. Candidate genes closely linked SNPs in the meta-QTL. Table S7. GO enrichment of candidate genes. Table S8. The expression of candidate genes in ten tissues.

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.

Reprints and permissions

About this article

Check for updates. Verify currency and authenticity via CrossMark

Cite this article

XU, S., PAN, Z., YIN, F. et al. Identification of candidate genes controlling fiber quality traits in upland cotton through integration of meta-QTL, significant SNP and transcriptomic data. J Cotton Res 3, 34 (2020). https://doi.org/10.1186/s42397-020-00075-z

Download citation

  • Received:

  • Accepted:

  • Published:

  • DOI: https://doi.org/10.1186/s42397-020-00075-z

Keywords