Genome-wide identification and expression analysis of Gossypium RING-H2 finger E3 ligase genes revealed their roles in fiber development, and phytohormone and abiotic stress responses

RING-H2 finger E3 ligase (RH2FE3) genes encode cysteine-rich proteins that mediate E3 ubiquitin ligase activity and degrade target substrates. The roles of these genes in plant responses to phytohormones and abiotic stresses are well documented in various species, but their roles in cotton fiber development are poorly understood. To date, genome-wide identification and expression analyses of Gossypium hirsutum RH2FE3 genes have not been reported. We performed computational identification, structural and phylogenetic analyses, chromosomal distribution analysis and estimated Ka/Ks values of G. hirsutum RH2FE3 genes. Orthologous and paralogous gene pairs were identified by all-versus-all BLASTP searches. We predicted cis-regulatory elements and analyzed microarray data sets to generate heatmaps at different development stages. Tissue-specific expression in cotton fiber, and hormonal and abiotic stress responses were determined by quantitative real time polymerase chain reaction (qRT-PCR) analysis. We investigated 140 G. hirsutum, 80 G. arboreum, and 89 G. raimondii putative RH2FE3 genes and their evolutionary mechanisms and compared them with orthologs in Arabidopsis and rice. A domain-based analysis of the G. hirsutum RH2FE3 genes predicted conserved signature motifs and gene structures. Chromosomal localization showed the genes were distributed across all G. hirsutum chromosomes, and 60 duplication events (4 tandem and 56 segmental duplications) and 98 orthologs were detected. cis-elements were detected in the promoter regions of G. hirsutum RH2FE3 genes. Microarray data and qRT-PCR analyses showed that G. hirsutum RH2FE3 genes were strongly correlated with cotton fiber development. Additionally, almost all the identified genes were up-regulated in response to phytohormones (brassinolide, gibberellic acid (GA), indole-3-acetic acid (IAA), and salicylic acid (SA)) and abiotic stresses (cold, heat, drought, and salt). The genome-wide identification, comprehensive analysis, and characterization of conserved domains and gene structures, as well as phylogenetic analysis, cis-element prediction, and expression profile analysis of G. hirsutum RH2FE3 genes and their roles in cotton fiber development and responses to plant hormones and abiotic stresses are reported here for the first time. Our findings will contribute to the genome-wide analysis of putative RH2FE3 genes in other species and lay a foundation for future physiological and functional research on G. hirsutum RH2FE3 genes.


Background
Ubiquitin (Ub)-protein E3 ligase genes encode diverse proteins and make up more than 6% (> 1400 genes) of the total Arabidopsis genome (Vierstra 2009). E3 ligases have been categorized into four classes based on the domains present: homologous to E6-associated protein C-terminal, really interesting new genes (RING), cullin-RING ligase, and U-box (Vierstra 2009). RING E3 ligases are cysteine-rich proteins that are classified into RING-H2 (C3H2C3) and RING-HC (C3HC3) subgroups depending on the amino acid (Cys (C) or His (H)) that occupies the fifth position in the motif (Freemont 2000). RING-H2 E3 ligases contain the consensus sequence Cys-X 2 -Cys-X (9-39) -Cys-X (1-3) -His-X (2-3) -His-X 2 -Cys-X (4-48) -Cys-X 2 -Cys, where X is any amino acid (Saurin et al. 1996;Stone et al. 2005). Many plant RING-H2 proteins act as E3 ligases by mediating the ubiquitination of target substrates. The type of ubiquitination, mono-, multiple mono-, or poly-, determines the fate of the target (Haglund and Dikic 2005). Thus, E3 ligases play important roles in target protein degradation through the 26S proteasome. The reversible conjugation of a protein with Ub is a regulatory mechanism used in various cellular processes, including protein receptor trafficking, gene transcription, and DNA repair. Ub is a small protein of 76 amino acids that is highly conserved in different species (Sadanandom et al. 2012). E3 ligases are the most diverse proteins in the 26S proteasome degradation pathway, and they confer substrate selectivity for a wide range of substrates. E3 ligases together with two other proteins, Ub-activating enzyme (E1) and Ub-conjugating enzyme (E2), catalyze the attachment of Ub to the target substrate in a specific fashion (Sadanandom et al. 2012).
In eukaryotes, protein degradation by E3 ligase activity is associated with protein metabolism (Sadanandom et al. 2012). RING proteins are responsive to various biological processes (Lyzenga and Stone 2012), including plant growth and development (Li et al 2011;Schwechheimer et al. 2009;Zhang et al. 2008), hormone signaling pathways (Bu et al. 2009), cell cycle, embryogenesis, and biotic and abiotic stress responses (Lee and Kim 2011;Santner and Estelle 2010;Vierstra 2009). The diversified regulatory functions of RING proteins means they are involved in nearly every aspect of plant life, which makes them distinct from other proteins.
As sessile organisms, plants are exposed to continuously changing environmental conditions, including biotic and abiotic stresses, and have to withstand and regulate their growth and development under intense conditions (Chen and Hellmann 2013). The Ub proteasome pathway is the key regulatory mechanism that allows plants to integrate internal and external signals (Hua and Vierstra 2011). For instance, a RH2FE3, XERICO is involved in drought tolerance mechanisms by interrupting abscisic acid signaling (Ko et al. 2006), HOS1 responds to cold stress (Lee et al. 2001), Arabidopsis DREB2A-interacting proteins negatively regulate plant drought stress (Qin et al. 2008), Rma1H1 increases tolerance to drought and salt stresses (Lee et al. 2009;Seo et al. 2012), and OsDIS1 is involved in drought-stress signal transduction (Ning et al. 2011). Similarly, BPM induces abiotic-stress responses in Arabidopsis (Mizoi et al. 2012;Weber and Hellmann 2009), RHP1, a RING-H2 finger protein, mediates drought and salt tolerance in rice (Zeng et al. 2014), SDIR1, a RING finger E3 ligase, modulates salt stress responses by degrading SDIR1-interacting protein 1 (Zhang et al. 2015), and RING finger STRF1 is involved in salt stress responses in Arabidopsis through its E3 ligase activity (Zhang et al. 2015).
Cotton is the preeminent source of natural fiber for the textile industry (Yang et al. 2014). Cotton fibers are the single-cell seed hairs of the ovule epidermis that undergo four developmental phases (Kim and Triplett, 2004), among which fiber cell elongation determines main fiber quality characters (Deng et al. 2012). Fiber cell elongation is a complex process that involves diverse metabolic and regulatory events (Kim and Triplett 2001). E3 ligases have roles in fiber cell elongation. For example, in Gossypium hirsutum (upland cotton), high transcript level of GhRING1 was observed at 15 days post-anthesis in cotton fiber and GhRING1 encodes an E3 ligase (Ho et al. 2010).
In this study, we present a comprehensive analysis of RING-H2 finger E3 ligase (RH2FE3) genes in G. hirsutum, including phylogenetic analysis, chromosomal localization, gene structure, conserved motifs, gene duplication and ortholog analysis, cis-acting regulatory element analysis, fiber tissue-specific microarray data analysis, and qRT-PCR expression profiling. We also report changes in gene expression patterns in response to plant hormones and abiotic stresses. This work will lay a foundation for the elucidation of the evolutionary and functional roles of RH2FE3 genes, which may help in deciphering the detailed molecular and biological mechanisms involved in cotton fiber cell development.

Identification of RH2FE3 genes and proteins in three Gossypium species
We used the RH2FE3 genes in Arabidopsis and rice, as described by Serrano et al. (2006), as query sequences in BLASTP searches (Altschul et al. 1990) with E-value = 1e − 5 to detect candidate RH2FE3 genes in the G. hirsutum, G. arboreum, and G. raimondii genomes downloaded from the Cotton Genome Project (http:// cgp.genomics.org.cn). The retrieved sequences from each BLAST search were pooled to remove redundant sequences. The resulting gene sequences were analyzed using SMART (http://smart.embl-heidelberg.de/) and MEME (http://meme-suite.org/tools/meme) tools to identify potential RING-H2 finger proteins. The biophysical properties of the RH2FE3 proteins were computed using Expasy (https://us.expasy.org/tools/protparam.html).
Phylogenetic tree construction and structural analysis of the RH2FE3 genes The RH2FE3 gene sequences from Arabidopsis, rice, G. hirsutum, G. arboreum, and G. raimondii were aligned by "Muscle", and a maximum likelihood tree was generated using the neighbor-joining method in MEGA 6.06 (Tamura et al. 2013) with a bootstrap value of 1000. To analyze the evolution of the G. hirsutum RH2FE3 genes, we constructed another phylogenetic tree using 140 genes. We performed an exon-intron structural analysis of the RH2FE3 gene sequences of G. hirsutum using the Gene Structure Display Server (http://gsds.cbi.pku.edu.cn/). To detect conserved motifs in the predicted RH2FE3 proteins, we used MEME tools with parameters set as 6-250 optimum width per motif and maximum number of 20 motifs. Additionally, motif annotation was performed using InterProScan (http:// www.ebi.ac.uk/interpro/search/sequence-search).
Chromosome location, gene duplication, orthologs, and nonsynonymous (Ka)/synonymous (Ks) substitution rates of the RH2FE3 genes The chromosome localization of the RH2FE3 genes was retrieved from the G. hirsutum genome annotation data (http://cgp.genomics.org.cn), and physical mapping was conducted using MapInspect software (http://www.plant breeding.wur.nl/UK/software_mapinspect.html). Orthologous and paralogous pairs of the RH2FE3 genes were obtained by all-versus-all BLASTP searches (Altschul et al. 1990) and visualized using Circos (Krzywinski et al. 2009). Gene duplication events and K a and K s values were estimated using the computed duplication events with the yn00 program in the PAML 4.9c software package (Yang 1997).

Prediction of cis-regulatory elements in the promoters of the RH2FE3 genes
The 2000 bp regions upstream of the translational start codons of the G. hirsutum RH2FE3 genes were considered as proximal promoters to predict cis-regulatory elements using the PlantCARE Database (Lescot et al. 2002). The predicted cis-elements were sorted based on their roles in transcriptional regulation (Pandey et al. 2016).

Microarray data and expression profile analysis of RH2FE3 genes in cotton fiber
We used publicly available high-throughput microarray data (https://www.ncbi.nlm.nih.gov/pmc/articles/ PMC4482290/) to determine the spatial and temporal expression patterns of the RH2FE3 genes in cotton fiber.
Further, to determine the cotton fiber tissue-specific expression patterns of the RH2FE3 genes through qRT-PCR, samples were collected at 0, 3, 5, 7, 10, 15, and 20 DPA from fiber tissues of G. hirsutum L. cv. 'CCRI 24' plants grown under field conditions. The collected samples were frozen immediately in liquid nitrogen and stored at − 80°C for subsequent analyses.
Phytohormone and abiotic stimuli, and qRT-PCR expression analysis RH2FE3 genes To investigate the expression profiles of RH2FE3 genes in response to phytohormones and abiotic stresses, G. hirsutum seedlings at the three-leaf stage were selected and treated. Three independent replicates were conducted. Seedlings were cultured in deionized water containing 10 μmol·L − 1 brassinolide (BL), 100 μmol·L − 1 gibberellic acid (GA), 100 μmol·L − 1 indole-3-acetic acid (IAA), or 10 μmol·L − 1 salicylic acid (SA) as independent exogenous phytohormone treatments. Leaf samples were collected after 0.5, 1, 3, and 5 h of treatment. Plants were incubated at 4°C and 38°C for cold and heat stress, respectively, and 20% (mass fraction) polyethylene glycol (PEG) and 300 mmol·L − 1 NaCl solutions instead of irrigating with water to stimulate dehydration and salt stress, respectively. In all four treatment groups, leaves were collected after 0, 1, 3, 6, and 12 h of treatment. All the collected samples were frozen in liquid nitrogen and stored at − 80°C for cDNA synthesis and quantitative expression analyses.
Total RNA was extracted using a RNA prep pure Plant Kit (TIANGEN, Beijing, China) and cDNA was synthesized from 1 μg of total RNA using a Prime-Script® RT reagent kit (Takara, Dalian, China). The G. hirsutum His3 gene was used as the internal control, and the qRT-PCRs were performed using SYBR Green on a LightCycler 480 (Roche Diagnostics GmbH, Sandhofer Straße 116, 68305 Mannheim, Germany). The gene-specific primers used in this study are listed in Additional file 1: Table S1.

Identification of RH2FE3 genes
We confirmed the identity of 79 RH2FE3 genes in Arabidopsis, 106 in rice, and 80, 140, and 89 in G. arboreum, G. hirsutum, and G. raimondii, respectively, using MEME and SMART tools (Additional file 1: Table S2). The lengths of the protein-coding regions of the G. hirsutum RH2FE3 genes ranged from 261 bp for GhATL31 and GhATL136 to 2637 bp for GhATL128. Similarly, the numbers of amino acids in the predicted protein sequences of these genes ranged from 86 to 878. The number of introns in the G. hirsutum RH2FE3 genes ranged from 0 to 10. Moreover, the RING-H2 finger domain was ubiquitously present in the N and C termini of all predicted protein sequences. GhATL23 had the highest isoelectric point (10.2), while GhATL28 had the lowest (4.42) isoelectric point. The molecular weights of the G. hirsutum RH2FE3 proteins ranged from 9.60 kDa for GhATL31 and GhATL136 to 98.10 kDa, for GhATL128. The predicted subcellular localization of the G. hirsutum RH2FE3 genes were plasma membrane (91 genes), extracellular (42 genes), chloroplast (6 genes), and nucleus (1 gene) (Additional file 1: Table S3).

Phylogenetic analysis of RH2FE3 genes
To determine the evolutionary relationships among the RH2FE3 genes of Arabidopsis, rice, G. arboreum, G. hirsutum, and G. raimondii, a multiple sequence alignment of 494 genes was performed and used to construct a phylogenetic tree. Nine major clades (designated Clades A to I) were formed based on their bootstrap support; Clade G was the biggest with 99 genes and Clade B was the smallest with 23 genes. Among the G. hirsutum RH2FE3 genes, 34 were in Clade C, 24 were in Clade E, and 5 each were in Clades B and H respectively (Fig. 1). To further investigate the functions and evolutionary relationships of the G. hirsutum RH2FE3 genes, we aligned the sequences of the RING-H2 finger domain (Additional file 2: Figure S1) and used the alignment to construct a phylogenetic tree. Six separate subgroups of the G. hirsutum RH2FE3 genes were formed; RH2FE3-A with 26 genes, RH2FE3-B with 23 genes, RH2FE3-C with 25 genes, RH2FE3-D with 19 genes, RH2FE3-E with 23 genes, and RH2FE3-F with 24 genes (Fig. 2).

Gene structure and conserved motif analysis of RH2FE3 genes and proteins
The intron-exon structure analysis of the 140 G. hirsutum RH2FE3 genes showed that approximately 69% of them lacked introns, 11% had one intron, 5% had two introns, 4% and 7% had three and four introns, respectively, and only 4% of the genes contain five or more than five introns. In the unrooted phylogenetic tree (Fig. 1), most of the genes occupied position in same clades had similar intron-exon structures (Additional file 3: Figure S2).
To further investigate variations within the conserved motifs of the G. hirsutum RH2FE3s, another unrooted tree was constructed using the MEME program. Ten motifs (designated Motifs 1 to 10) were identified, and proteins with similar motif compositions occupied position in same cluster (Additional file 4: Figure S3). A strong correlation was observed among the results of the phylogenetic, intron-exon structural, and conserved motif analyses of the G. hirsutum RH2FE3 genes.
Chromosomal distribution, gene duplication, collinearity analysis, and selection pressure of the RH2FE3 genes Putative G. hirsutum RH2FE3 genes were mapped onto chromosomes 1 to 26 [At and Dt chromosomes of G. hirsutum. Among the 140 RH2FE3 genes, 31 were allotted to scaffolds, not to a chromosome (Additional file 5: Figure S4). The distribution and density of the G. hirsutum RH2FE3 genes on the chromosomes were not uniform, and the genes were scattered over the whole genome. The highest density of about 12 genes was observed on the At 9 chromosome, followed by 7 genes on Dt 5, and 6 genes on each of At 1, At 4, Dt 6, and Dt 8. Only one gene was located on At 10 and its homolog was on Dt 10 ( Fig. 3a). Higher distribution of RH2FE3 genes was on At chromosomes (59 genes) compared with Dt chromosomes (50 genes). Of the 59 genes on At chromosomes, 31 were distributed in the upper and 28 were distributed in the lower centromeric regions, whereas the 50 genes on Dt chromosomes were distributed equally with 25 genes each in the upper and lower centromeric regions.
We found that tandem and segmental duplication events contributed to gene family expansion in the number of G. hirsutum RH2FE3 genes. A total of 60 gene duplication events were detected based on a whole-genome analysis, including four tandem duplications and 56 segmental duplications. Interestingly, among the 56 segmental duplication events, 20 occurred on homologous chromosomes (Additional file 1: Table S4). Four tandem genes, GhATL27, GhATL32, GhATL61, and GhATL135, may have arisen as a result of duplication events. Moreover, 98 orthologs of the G. hirsutum RH2FE3 genes were identified; 47 in the G. arboreum genome and 51 in the G. raimondii genome (Fig.  4, Additional file 1: Table S5).
To estimate the selection pressure on the G. hirsutum RH2FE3 genes, we calculated the K a and K s values (Additional file 1: Table S5) and estimated the K a /K s ratios of the 98 orthologous pairs of genes. Most gene pairs had K a /K s ratios < 1, and only 25 pairs had K a / K s ratios > 1.

cis-elements in the promoter sequences of RH2FE3 genes
To further investigate the transcriptional regulation and potential functions of the RH2FE3 genes, we predicted the cis-elements in the promoter regions 2000 bp upstream of the start codon. cis-elements play key roles in plant growth, as well as in plant responses to light, phytohormones, and stresses. Interestingly, we found that the promoter regions of the RH2FE3 genes contained different elements related to plant growth and development, as well as to light and stress responses (Additional file 1: Table S6). Most of the promoters contained Skn-1 and Sp1 elements, which are related to plant growth and development; the exceptions were the promoters of GhATL49, GhATL94, GhATL107. Circadian elements were also abundant in the promoter regions of some of the genes. A larget number of the RH2FE3 gene promoters region contained cis-elements involved in light responses, including 87.14% (122 of 140) that had Boxes 4, 76.45% (107) with Box I and the TCT-motif, 59.28% (83) with the GT1-motif, 57.85% (81) with a G-box, and 53.57% (75) that had an ATCT-motif. Additionally, 9 (6.42%) RH2FE3 gene promoters contained a TCCC-motif, 14 (10%) had a GC-motif, 16 (11.42%) had a Gap-box, 18 (12.85%) had a LAMP-element, 27 (19.28%) had a Box III, and 54 (38.57%) and 66 (47.14%) had a CATT-motif and AE-box, respectively. In addition, 80% (112) of the RH2FE3 gene promoter sequences had heat stress-response elements, 72.85% (102) had AU-rich elements, 72.14% (101) had TC-rich repeats, and 64.28% (90) and 59.28% (83) had MYB-binding site and TCA elements, respectively. The number of CE3 elements was the lowest (2.14%; 3 genes), followed by GCC-box (3.5%; 5 genes), AuxRR-core (7.14%; 10 genes), WUN-motif (12.85%; 18 genes), and TGA-element (14.28%; 20 genes) (Additional file 1: Table S6). The identified cis-elements revealed that the RH2FE3 genes may play essential roles in plant growth and development, as well as in phytohormone and stress responses.

Gene expression profile analysis in cotton fiber
The expression patterns of 140 G. hirsutum RH2FE3 genes at different fiber developmental stages, i.e. 10, 15, 18, 21, and 28 DPA, were obtained from the microarray data. Six heatmaps were constructed based on the subgroups of the G. hirsutum RH2FE3 genes identified by the phylogenetic analysis. All the genes were ubiquitously expressed in a variable fashion in all of the mapped subgroups and were clustered according to their expression patterns (Fig. 5a). In the RH2FE3-D heatmap, six genes, GhATL83, GhATL15, GhATL119, GhATL78, GhATL3, and GhATL114, were highly expressed at all the developmental stages. Similarly, in the RH2FE3-C heatmap, seven genes, GhATL103, GhATL124, GhATL82, GhATL130, GhATL132, GhATL128, and GhATL66, were highly expressed at all of the developmental stages. In the heatmaps of the other four subgroups, all the genes that were highly expressed during fiber development were clustered at the top, which indicated they made significant contributions to fiber development. The exceptions were the genes in the RH2FE3-F subgroup, which showed little involvement in fiber development. A further comparative analysis of the expression patterns of the genes of different subgroups, confirmed the genes in the RH2FE3-D subgroup had high expression levels at all the stages of fiber development. The genes in this subgroup were most highly expressed at 18 DPA, followed by 21, 15, and 28 DPA, but lowly expressed at 10 DPA (Fig. 5b). Additionally, the genes in the RH2FE3-C subgroup exhibited higher expression levels at 10, 15, and 21 DPA compared with those at 18 and 28 DPA.
To further clarify the roles of G. hirsutum RH2FE3 genes, we checked the expression of the genes in the RH2FE3-D subgroup by qRT-PCR to confirm the higher expression values obtained from the microarray data analysis. We used the tissues collected at seven fiber developmental stages, 0, 3, 5, 7, 10, 15, and 20 DPA, for the qRT-PCR analysis (Fig. 6). Among the 19 genes in the RH2FE3-D subgroup, 7 were duplicated genes, which is consistent with previous findings that Fig. 2 Phylogenetic analysis of 140 G. hirsutum RING-H2 finger putative E3 ligase genes clustered into six color-coordinated clades, as RH2FE3(RING-H2 finger E3 ligase)-A, RH2FE3-B, RH2FE3-C, RH2FE3-D, RH2FE3-E and RH2FE3-F duplicated gene pairs have similar functions. Thus, the expression patterns of only 12 genes were determined by qRT-PCR. We found that GhATL28 and GhATL26 were most highly expressed in the 7 DPA fiber tissues, whereas GhATL83 and GhATL78 were highly expressed at almost all the fiber development stages. Overall, 11 of the 12 genes had their highest expression in the 7 DPA fiber tissues; the exception was GhATL14, which showed low expression at 7, 10, 15, and 20 DPA. GhATL3 and GhATL22 were consistently expressed at all the fiber development stages except 20 DPA where their expression was low.

Phytohormone and abiotic stress responses
The putative G. hirsutum RH2FE3 genes exhibited differential expression patterns in response to phytohormone stimuli. Most of them were up-regulated after 0.5, 1, 3, and 5 h of exposure to the phytohormones BL, GA, IAA and SA. In particular, the transcript levels of almost all of the genes increased in response to SA and GA (Fig. 7). Overall, the expression levels of GhATL85 and GhATL102 increased to some extent, and both genes had similar expression patterns after 1 h of exposure to GA, IAA, and SA. GhATL3 and GhATL22 had similar expression patterns after 1 h of exposure to GA, IAA, and SA, and their expression levels increased remarkably as exposure to SA increased. The expression levels of GhATL28 increased and peaked after 1 h of exposure to SA, then decreased after 3 h and 5 h of treatment. After exposure to BL, the GhATL3 and GhATL22 expression levels increased with exposure time and were highest after 5 h of exposure. The GhATL80 and GhATL15 expression levels were low at all the time points after BL exposure, whereas the GhATL85 and GhATL102 expression levels were high at all the time points after BL exposure.
To explore the functional and physiological relevance of the G. hirsutum RH2FE3 genes, we analyzed their expression profiles in response to abiotic stimuli, namely cold, heat, drought (PEG), and salt (NaCl). Most of the gene signal cascade was activated by exposure to the abiotic stimuli. We found that GhATL83, GhATL26, a b Fig. 3 a Chromosomal distribution of G. hirsutum RING-H2 finger E3 ligase genes. Chromosomal sizes were calculated from published genome data. b Graphical representations of the gene distribution density levels on different chromosomes of the At and Dt genomes of G. hirsutum GhATL78, GhATL27, GhATL102, and GhATL22 consistently had the highest expression levels in response to the abiotic stress treatments (Fig. 8). The expression levels of the other genes also changed in response to the abiotic stress treatments, except at some time points after specific treatments. GhATL14 expression was down-regulated after exposure to all stimuli at all time points. Except for GhATL14, all the other genes were highly expressed after exposure to PEG, followed by heat, and then NaCl. Among 12 G. hirsutum RH2FE3-D genes, 8 were up-regulated in response to cold; the exceptions were GhATL85, GhATL80, GhATL15, and GhATL28. The expression levels of GhATL26 were many folds greater than the expression levels of the other genes in response to drought, salt, and cold.
For all the tested abiotic stresses, GhATL102, GhATL80, GhATL26, GhATL78, and GhATL22 had similar expression patterns at 3 h after treatment, and the expression of GhATL78 was the same after 1 h and 3 h for the tested abiotic stresses. The expression levels of GhATL3,  Tissue-specific expression profile of G. hirsutum RING-H2 finger E3 ligase RH2FE3-D clade genes in cotton fiber tissues at 0, 3, 5, 7, 10, 15 and 20 DPA using qRT-PCR. The expression level at 0 DPA was considered as 1. Presented data are the mean ± SE of three biological replicates, and different colored bars represent the expression levels of different tissues. Cotton GhHis3 was used as the internal control Fig. 7 Relative expression analysis of G. hirsutum RING-H2 finger E3 ligase RH2FE3-D clade genes in response to BL, GA, IAA and SA phytohormone treatments using qPCR. The CK (0 h) expression level was considered as 1 and used to normalize the data. Cotton GhHis3 was used as the internal control. Data are the mean ± SE of three biological replicates, and different colored bars represent different plant hormone responses GhATL15, GhATL22, and GhATL80 increased with time after the abiotic stress treatments and reached their highest levels at 3 h, then gradually decreased to lower levels at 12 h. Under NaCl stress, the GhATL22 transcript levels increased, and the GhATL27 transcript level increased significantly at 6 h after NaCl treatment. The Fig. 8 Expression patterns of G. hirsutum RING-H2 finger E3 ligase RH2FE3-D clade genes as assessed by qRT-PCR in response to abiotic stresses cold, heat, PEG and NaCl. The CK (0 h) expression level was estimated as 1 and used to normalize data. Cotton GhHis3 was used as the internal control. Data are the mean ± SE of three replicates, and different colored bars represent different abiotic-stress responses heat treatment predicted the same expression patterns for GhATL85, GhATL78, GhATL83, and GhATL3, namely an increasing trend after 1 h, peaking at 3 h, then decreasing at 6 h. In contrast, GhATL80 had consistent expression levels at 1 h, 3 h, and 6 h, which subsequently decreased at 12 h. GhATL28 had highest expression at 1 h, which gradually decreased with time. Overall, the expression levels of the G. hirsutum RH2FE3 genes fluctuated at different times after exposure to the abiotic stresses.

RH2FE3 genes of Arabidopsis, rice, and cotton
Multiple mechanisms contribute to the evolution of genes within a family, and a comprehensive evolutionary analysis through phylogeny can provide deep insights into the origins and relationships among various genes. A large number of RING finger genes have been identified in various plant species, including 469 in Arabidopsis, 378 in rice, and 399 in poplar (Du et al. 2009;Kraft et al. 2005). The ATL family of RING finger proteins, which contain a RING-H2 domain, was first identified in Arabidopsis (Salinas-Mondragon et al. 1999) and later found to be widely distributed in all plants, including 80 and 121 ATL genes in Arabidopsis and rice, respectively (Serrano et al. 2006). Cotton is the world's largest and most vital source of renewable textile fiber (Bao et al. 2011). G. arboreum and G. raimondii underwent whole-genome duplication events about 16.6 million years ago (Mya), and G. hirsutum (allotetraploid) emerged from hybridizations of A or D diploid ancestral species nearly 1.5 Mya (Li et al. 2015). Until now, a comprehensive analysis of RING-H2 finger genes has not been reported in cotton.
In the present work, we identified RH2FE3 genes in G. hirsutum, G. arboreum, and G. raimondii. A phylogenetic analysis of the Gossypium genes and already identified RH2FE3 genes in Arabidopsis and rice resulted in them clustering in common clades, which indicated the Gossypium genes shared sequence similarity with the Arabidopsis and rice genes. Additionally, the RING-H2 protein domain was encoded in all the identified Gossypium genes, which confirmed their identities as RH2FE3 genes. The common clustering provides direct evidence that duplication events took place after the separation of Arabidopsis, rice, and cotton from a common ancestor prior to the divergence of monocots and dicots. However, G. hirsutum had more RH2FE3 genes than Arabidopsis, rice, G. arboreum, and G. raimondii, implying genome expansion of the G. hirsutum RH2FE3 gene family.
G. hirsutum is an allotetraploid cotton that is widely cultivated, contributing 95% of the total world cotton production (Tiwari and Wilkins 1995). Alignment of the G. hirsutum RH2FE3 genes revealed they contained the typical features of the RING-H2 domain. Additionally, about 69% of the genes had no introns, which corroborates a previous study that found that 90% of the RH2FE3 genes of Arabidopsis and rice had no introns (Serrano et al. 2006),. Exon-intron structural differences are created by insertion/deletion events and are useful in estimating the evolutionary mechanisms of different gene families (Lecharny et al. 2003). Introns are considered to be under weak selection pressure. Here, the putative G. hirsutum RH2FE3 genes had very low intron numbers, indicating they may have evolved at a rapid rate.
The conserved motif analysis of all of the predicted G. hirsutum RH2FE3s showed that they contained a RING-H2 domain, and revealed some motifs that were specific to a particular subgroup, indicating the distinctive function of that subgroup. We found that the genes in subgroups RH2FE3-C and RH2FE3-D had more variations in gene structure and conserved motifs than the genes in the other subgroups, suggesting that the genes in these subgroups may be more active and variable than the other genes. Conserved motifs likely play vital roles in the functional diversification and transcriptional regulation of particular genes. The intron-exon structure of conserved motifs may be functionally important and evolutionary conserved, and despite the low sequence conservation in many gene families, exon-intron structures are conserved (Frugoli et al. 1998). Figure 2 shows that many of the G. hirsutum RH2FE3 genes clustered into pairs, which indicates an ancient genome duplication event. The two copies of a gene can be subjected to shuffling and rearrangements that create potential diversity. Four possible fates have been described for duplicated genes (Charon et al. 2012). One, one gene copy might be deleted during the evolutionary process, thereby removing functional redundancy. Two, sub-functionalization of both genes that shared the parental functions, leading to the development of partially different functions over time. Three, neo-functionalization in which one gene copy acquired new functions during evolution. Four, an intermediate form of sub-and neo-functionalization in which only genes critical for plant growth and development are retained. Thus, the large numbers of RH2FE3 genes in the G. hirsutum genome may be important for the plant's normal growth and development, and in responding to phytohormone and abiotic stresses.
Genomic distribution, duplication, and selection pressure of G. hirsutum RH2FE3 genes Phylogenetic analysis may not fully reveal the evolutionary mechanism of G. hirsutum RH2FE3 genes. Therefore, we generated more detailed information on the genomic distribution and duplication to determine their evolutionary relationships more precisely. Chromosomal segmental duplications scatter gene family members, while tandem amplifications cluster them (Schauser et al. 2005). In particular, segmental duplications coupled with salicoid duplications that took place about 65 Mya in ancestral plants contributed to the expansion of multi-gene families (Barakat et al. 2009;Wang et al. 2013). Two large segmental and small-scale tandem duplication events generated new genes during the evolutionary process, contributing to the genomic complexities observed in the plant kingdom (Cannon et al. 2004). The distribution of the G. hirsutum RH2FE3 genes on all chromosomes of the A and D genomes indicates their importance (Fig. 3a), and the variable density levels might reflect the addition or loss of genes through segmental or whole-genome duplication. Our results are in accordance with previous findings (Lim et al. 2010). In addition, several Arabidopsis gene families have similar evolutionary dynamics (Baumberger et al. 2003;Wang et al. 2008), which indicates a common mechanism for gene family expansion in different plant species. The specific dispersion and clustering of the G. hirsutum RH2FE3 genes provide useful information for understanding chromosomal interactions as well as polyploidization. These findings indicate differential rates of genetic evolution and information transfer through inter-genomic hereditary.
Gene duplication events and the subsequent divergence contribute to evolutionary momentum (Chothia et al. 2003). Gene duplication events can be inferred when aligned sequences cover > 80% of their total length and share > 70% identity (Yang et al. 2008). Two similar genes positioned on the same chromosome represent tandem duplications, while two similar genes located on different chromosomes represent segmental duplications (He et al. 2012). Generally, gene duplication is the result of functional divergence, which is important in the evolutionary process and for environmental adaptability (Conant and Wolfe 2008). Our investigation of gene duplication indicated that the G. hirsutum RH2FE3 gene family expanded through tandem as well as segmental duplication events. The RH2FE3 genes had the same duplication patterns in Arabidopsis (Serrano et al. 2006), rice (Lim et al. 2010), apple (Li et al. 2011), and poplar (Liu et al. 2015). Thus, we consider the complexity of RH2FE3 genes is the result of diverse duplication events. An orthologous gene pair analysis of RH2FE3 genes among G. hirsutum, G. arboreum, and G. raimondii revealed the G. raimondii chromosomes had more orthologs of G. hirsutum than the G. arboreum chromosomes. Because orthologous genes generally have similar functions (Altenhoff and Dessimoz 2009), we hypothesized that the functions of the G. hirsutum and G. raimondii RH2FE3 genes were similar.
The Ka/Ks ratio is used to estimate gene selection pressure mechanisms after the ancestral divergence. Commonly, a Ka/Ks value of 1 is considered as neutral selection, Ka/Ks values < 1 indicate purifying selection, and Ka/Ks values > 1 indicate positive selection. We found that most gene pairs had Ka/Ks values < 1, which indicated the preponderance of purifying selection pressure rather than positive selection pressure on the G. hirsutum RH2FE3 genes.
cis-elements in the promoters of the RH2FE3 genes and their relevance to the gene expression profiles Many cis-elements related to plant growth and development, and phytohormone and abiotic stress responses, were identified in the promoter regions of the G. hirsutum RH2FE3 genes. Several studies have reported the high impact of light on plant growth and development, as well as on differentiation (Fankhauser and Chory 1997).
Various key elements, such as abscisic acid (ABA) responsive elements (Narusaka et al. 2003), heat stress-response elements (Diaz-Martin et al. 2005), dehydration-response elements, and GCC-boxes (Song et al. 2005), were identified. The majority of the RH2FE3 genes also contained these elements with typical features.
Moreover, long terminal repeat elements responded to low temperatures (Maestrini et al. 2009), and CGTCA and TCA elements were involved in gene expression after SA and methyl jasmonate stimuli, respectively (Wen et al. 2014). Several other cis-elements, including ethylene responsive elements, TATC elements, and P-boxes were also detected. Additionally, the presence of W-boxes in promoter regions of some genes conferred responses to ABA and drought stress (Singh et al. 2002). The identification of several light-responsive elements in the G. hirsutum RH2FE3 gene promoter regions suggested a modulatory effect of light signaling on the expression of RH2FE3 genes. Furthermore, the presence of various cis-elements related to phytohormone and abiotic stress responses indicated the functions of the G. hirsutum RH2FE3 genes.

Fiber-specific expression profiles of RH2FE3 genes
A study of fiber development-related gene expression profiles indicated that many genes were significantly up-regulated at the secondary cell wall biosynthesis stage of fiber through Ub-mediated protein degradation pathways (Al-Ghazi et al. 2009). To investigate the putative functions of RH2FE3 family genes in cotton, we obtained the expression patterns of RH2FE3 genes from microarray data. Our analysis of the microarray data showed that the genes were differentially expressed at various stages of fiber development. In particular, the genes in the RH2FE3-D and RH2FE3-C subgroups were significantly up-regulated during cotton fiber developmental stages. We hypothesized that the G. hirsutum RH2FE3 genes in these subgroups may be fiber specific and regulate fiber development. Consistent with these findings, the qRT-PCR analysis of the genes in the RH2FE3-D subgroup supported these findings, indicating that the expression levels of these genes were significantly up-regulated at all fiber developmental stages.
Despite the importance of the G. hirsutum RH2FE3 genes in cotton fiber development, investigations into their roles are limited. The overexpression of rice SUMO E3 ligase gene OsSIZ1 in cotton improved fiber yield (Mishra et al. 2017). Similarly, GhRING1 transcript levels were significantly elevated in elongating fibers at 15 DPA (Ho et al. 2010). Our findings together with these previous findings, demonstrate the suitability of G. hirsutum RH2FE3 genes for quantitative studies at different cotton fiber developmental stages. These putative fiber-specific G. hirsutum RH2FE3 genes may be useful for the genetic manipulation of fiber quality traits.

Responses of RH2FE3 genes to phytohormone and stress stimuli
A previous study showed that 90% of the ATL genes in Arabidopsis were expressed and 50% were expressed in rice, which suggested that they were active genes, not pseudo-genes (Serrano et al. 2006). Several studies have reported the vital roles of the RH2FE3 genes in phytohormone and abiotic stress responses (Hong et al. 2007;Lee et al. 2001;Zhang et al. 2015). For instance, the SUMO E3 ligase gene OsSIZ1 enhanced heat and drought tolerance in cotton (Mishra et al. 2017). The RING-type E3 ligase gene DEFECTIVE IN ANTHER DEHISCENCE1 activated the jasmonate biosynthetic pathway in Arabidopsis (Peng et al. 2013), and overexpression of another RING-H2 domain E3 ligase gene ZmXERICO1 induced drought tolerance through ABA homeostasis in maize (Brugiere et al. 2017). Similarly, MaRING1, a RING-H2 finger gene, was induced by methyl jasmonate treatment and responded to cold stress in banana (Chen et al. 2014).
Moreover, the RING-H2 finger protein ShATL78 was involved in the responses to abiotic stresses, such as heat, drought, salt, wounding, osmotic stress, and exogenous hormones, in tomato (Song et al. 2016). The RING-H2 gene XERICO increased ABA biosynthesis and conferred drought stress tolerance in Arabidopsis (Ko et al. 2006). RING finger E3 ligase SpRing positively regulated salt-stress signaling in Solanum pimpinellifolium (Qi et al. 2016). Additionally, among 1500 Arabidopsis E3 ligase genes, 431 and 301 were up-regulated, and 365 and 245 were down-regulated in response to hormones and abiotic stresses, respectively (Mazzucotelli et al. 2006).
In this study, all the tested RH2FE3 genes were responsive to all the treatments, namely BL, GA, IAA, SA, cold, heat, PEG, and NaCl. However, only two genes were down-regulated at the transcript level in response to the BL treatment, and only four genes were down-regulated after exposure to the cold. All the other genes showed positive correlations, and their transcripts accumulated at higher levels than in the control plants. In summary, the expression profiles of G. hirsutum RH2FE3 genes demonstrated broad expression patterns in response to phytohormones and abiotic stresses and their responses had physiological as well as functional relevance. These preferentially expressed genes may play key roles in the regulation of plant growth and development.

Conclusions
We identified 140 G. hirsutum, 80 G. arboreum and 89 G. raimondii RH2FE3 genes and investigated their evolutionary dynamics in relation to RH2FE3 genes in Arabidopsis and rice. A comprehensive mapping of the G. hirsutum RH2FE3 genes on the G. hirsutum chromosomes revealed their physical locations. The genes were distributed across all the chromosomes, and 4 tandem and 56 segmental duplication events were detected. cis-elements in the G. hirsutum RH2FE3 gene promoters provided clues about the functions and expression patterns of the genes in response to phytohormone exposure and abiotic stresses. The microarray data and qPCR expression analyses helped to elucidate the functions of the G. hirsutum RH2FE3 genes in cotton fiber development. Furthermore, the general and intense responses of G. hirsutum RH2FE3 genes to phytohormones and abiotic stresses suggested their participation in BL, GA, IAA, SA, cold, heat, PEG, and NaCl response-signaling pathways. This work will help in the selection of appropriate genes for functional analyses in the future.

Additional files
Additional file 1: Table S1. List of qPCR primers used in this study. Table S2. Gene IDs and proposed information for RING-H2 finger E3 ligase genes of Arabidopsis, rice, G. hirsutum, G. arboreum and G. raimondii. Table S3. Detailed information on the biophysical properties of G. hirsutum RING-H2 finger E3 ligase genes. Table S4. Duplicated gene pairs and the type of duplication. Table S5. G. hirsutum orthologs observed in G. arboreum and G. raimondii genomes. Ka/Ks ratios of investigated orthologs were also computed based on clades. Table S6. cis-element analysis of the promoter regions of 140 G. hirsutum RING-H2 finger E3 ligase genes. cis-elements were categorized based on their relevance to growth and development, and the light and stress responses of each gene. (XLSX 73 kb) Additional file 2: Figure S1. Alignment and RING-H2 domain rich region of 140 G. hirsutum RING-H2 finger E3 ligase genes. Asterisk depicts the particular amino acids of the RING-H2 domain across all of the gene members, presenting the typical features of RING-H2 finger E3 ligase genes. (PDF 1482 kb) Additional file 3: Figure S2. (a) Gene structure analysis conducted for 140 G. hirsutum RING-H2 finger E3 ligase genes. Clusters were generated based on the intron-exon patterns among the different genes. Different genes with different numbers of introns are exhibited in different colors.