Phylogenetic analysis of the tubulin protein family
To investigate the evolutionary relationships of tubulin genes, the tubulin protein sequences from O. sativa, Arabidopsis, L. usitatissimum, G. arboreum, G. barbadense, G. raimondii, and G. hirsutum were used to generate an unrooted phylogenetic tree using various sequence alignments (Fig. 1). The tubulin proteins were mainly classified into three groups, viz., α-, β-, and γ-tubulins, which revealed similar results to previous studies concerning different plants (Jayaswal et al. 2019). Most of the orthologous genes between the allotetraploid and the corresponding diploid cotton were clustered into the same clade based on phylogenetic analysis with tubulin protein sequences, which indicated that the tubulin genes had been differentiated into three different subfamilies. In each group, most tubulin proteins from the four cotton species were relatively farther than tubulin proteins from O. sativa, Arabidopsis, L. usitatissimum, but a few had a particular case, including some genes in O. sativa, which were closely related to cotton species. γ-tubulins were abundant in O. sativa, which may play a vital role in the development of rice. However, γ-tubulins were scarce in Arabidopsis, and only two genes, viz., AT.5G05620 and AT.3G61650 were from this group. Interestingly, the α- and β-tubulins subfamilies are expended, but the γ-tubulin subfamily is relatively minor in cotton. The above results depicted that tubulin genes are either lost or expanded during evolution, according to plant growth needs.
Gene structure and conserved motifs analysis
Exon-intron distribution is generally associated with biological functions and is helpful to understand the evolutionary relationship among different gene families. Therefore, we analyzed the conserved motif and structural characterizations to investigate the phylogenetic relationships among different members of the tubulin family of G. hirsutum (Fig. 2). Members of α-, β-tubulin genes are quite conserved. However, γ-tubulin genes were not retained in protein motifs and gene structure analysis. The exon/intron structural variation of the tubulin gene family was then compared, yielding a comprehensive illustration of their relative lengths. We found a variable structural pattern of exon-intron in tubulin family genes. Closely related genes (sharing the same phylogenetic branch) depicted similar structural patterns; however, a considerable distinction was identified among genes from different branches. The majority of α-tubulin subfamily genes were identified with four introns, while the β-tubulin subfamily consists of three introns. In contrast, five to eleven introns were found in the γ-tubulin subfamily.
Further, we identified ten conserved motifs utilizing MEME program. There are almost all conserved sites in α-, β-tubulin subfamilies. However, the genes Gh.D05G257500, Gh.D12G088700 and Gh.D04G181300 may have lost one or two motifs 4/5/8 in α-, β-tubulin subfamily. Most γ-tubulin genes are poorly conserved compared with the members of α-, β-tubulin genes. Surprisingly, Gh.A11G102000, Gh.D11G102600, and Gh.A12G127300 contain ten conserved motifs in the γ-tubulin subfamily. The above results showed that most members from the same subfamily have similar motif features and exon-intron structure, supporting close evolutionary relationships.
As depicted in Table S2 and Fig. S1, the results showed that the most α-, β-tubulin proteins have conserved domains with Tubulin and Tubulin_C; a theoretical isoelectric point of the most tubulin proteins was weakly acidic; however, when the proteins lose Tubulin_C, the pIs of proteins were alkaline. Conserved site analysis of tubulin gene family suggested that some amino acids are very conserved (Fig. S2). The BaCelLo (Pierleoni et al. 2006), EpiLoc (http://epiloc.cs.queensu.ca/) and Plant-mPLoc (Chou and Shen 2010) predicted that the subcellular localization of all tubulin proteins is Cytoplasmic/Chloroplast. TMHMM2.0 (http://www.cbs.dtu.dk/services/TMHMM/) and TOPCONS (Tsirigos et al. 2015) predicted that all tubulin proteins are located outside the membrane. These results indicated that the tubulin genes might synthesize the cytoskeleton in the plant cell cytoplasm.
Chromosomal location, gene duplication, and collinearity relationships
We constructed chromosomal distribution maps of tubulin genes to understand the genomic distribution and structural variation of the tubulin gene family in Gossypium species. An uneven distribution pattern of these genes, on sub-genomic (A sub-genome and D sub-genome) and chromosomal level (24 chromosomes) was observed, mainly distributed towards proximate ends of the chromosomes except for At10 and Dt10, which showed nonexistence of tubulin genes (Fig. 3). The highest number of tubulin genes (7) were present on At03 and Dt05. In contrast, At02, At04, At06, Dt04, Dt07, and Dt12 of G. hirsutum included only one or two tubulin genes. Based on the uneven distribution pattern on sub-genomes, we speculated that this variation might have resulted from the evolutionary process.
We further analyzed the collinearity relationships for all tubulin genes on the chromosomes At and Dt of G. hirsutum with other Gossypium species. Among 42 tubulin genes on the chromosomes At of G. hirsutum, 33 had intergenomic homologous genes on chromosomes Dt of G. hirsutum (Fig. 4); 40 tubulin homologous genes in G. arboretum (Fig. S3); all genes on the chromosomes Dt of G. hirsutum had intergenomic homologous genes in G. raimondii (Fig. S4). Among all the 91 tubulin genes of G. hirsutum, 77 genes tubulin had intergenomic homologous genes in G. barbadense (Fig. S5), 91 tubulin homologous genes in G. darwinii and G. tomentosum (Fig. S6/7), respectively. The above results emphasized an independent evolutionary process for tubulin gene quantity in At subgroup and Dt subgroup after the formation of allotetraploid, while afterward same directional evolution for different allotetraploid cotton species.
Expression patterns of tubulin genes in different upland cotton
Although most tubulin superfamily genes have redundant functionality, they have similar functions to promote cell elongation (Segami et al. 2012; Blume et al. 2013). The expression patterns analyses were performed to understand the role of tubulin genes during fiber elongation and development. Firstly, RNA-sequencing data were used to detect the expression patterns of 98 tubulin genes in root, stem, leaf, ovule, and fiber tissues of G. hirsutum (J02–508 and ZRI-015) using a heatmap. The results showed that most α-, β-tubulin genes had higher expression levels relative to γ-tubulin genes in all tissues, and the expression levels of most α-, β-tubulin gene subfamilies were very obvious tissue specificity of fiber development stages (Fig. 5). Secondly, the expression levels of tubulin genes in the fiber development stage had higher levels than other tissues, e.g. Gh.A08G206600, Gh.D08G202200, Gh.A11G351700, Gh.D11G355300, Gh.A08G207900, Gh.A03G027200, Gh.D03G169300, Gh.A11G082000, Gh.D11G082500, Gh.A13G236900, Gh.A11G258900, Gh.A05G199500, Gh.D05G216200, Gh.D02G226100, Gh.A03G209100, Gh.A04G142400, Gh.D04G181300, Gh.A02G095500 and Gh.D02G097200, which may contribute to fiber development. Among them, only Gh.A03G027200, Gh.D03G169300, and Gh.A11G258900 had differential expression patterns at distinct stages of fiber development in the two varieties, and they were significantly different at 15 DPA, which is a critical time for fiber length and strength in J02–508 compared with ZRI-015. The above results indicated the three genes might inhibit fiber length and strength.
The expression levels of Gh.A03G027200, Gh.D03G169300, and Gh.A11G258900 in different fiber development stages for J02508 and ZRI015 were detected using qRT-PCR. As shown in Fig. 6, the relative expression data revealed that the expression levels of Gh.A03G027200 and Gh.D03G169300 were higher from 10 to 25 DPA in J02508 and ZRI015, and the expression levels of Gh.A11G258900 from 3 to 15 DPA in ZRI015 was twice as much as that of J02508.