
Genome wide identification and characterization of MAPK genes reveals their potential in enhancing drought and salt stress tolerance in Gossypium hirsutum



Cotton is widely regarded as a source of protein and edible oil as well as the major contributor of natural fiber, and it is grown in tropical and subtropical regions around the world. Unpredictable environmental stresses are becoming significant threats to sustainable cotton production, ultimately leading to substantial and irreversible economic losses. Mitogen-activated protein kinases (MAPKs) are generally considered essential for recognizing environmental stresses by phosphorylating downstream components of signaling pathways, and they play a vital role in numerous biological processes.


We identified 74 MAPK genes across cotton: 41 from G. hirsutum, 19 from G. raimondii, and 14 from G. arboreum. The MAPK proteins were further studied to determine their physicochemical characteristics and other essential features. To this end, characterization, phylogenetic relationship, chromosomal mapping, gene motif, cis-regulatory element, and subcellular localization analyses were carried out. Based on phylogenetic analysis, the MAPK family in cotton is categorized into clades A, B, C, D, and E. According to the phylogenetic relationships, cotton has more MAPK genes in clade A than in clade B. The cis-elements identified were classified into five groups (hormone responsiveness, light responsiveness, stress responsiveness, cellular development, and binding site). The prevalence of such elements across the promoter regions of these genes signifies their role in the growth and development of plants. Seven GHMAPK genes (GH_A07G1527, GH_D02G1138, GH_D03G0121, GH_D03G1517, GH_D05G1003, GH_D11G0040, and GH_D12G2528) were selected, and tissue-specific expression and expression profiling under drought and salt stresses were performed. Six genes were upregulated under drought treatment, whereas GH_D11G0040 was downregulated. All seven genes were upregulated at various time points of the salt stress treatment.


RNA sequencing and qPCR results showed that the genes were differentially expressed across both vegetative and reproductive plant parts. Similarly, the qPCR analysis showed that six genes were substantially upregulated under drought treatment, while all seven genes were upregulated under salt treatment. The results of this study showed that cotton GHMPK3 genes play an important role in improving cotton resistance to drought and salt stresses. According to various studies, MAPKs are thought to play a significant regulatory role in plant responses to abiotic stresses. Understanding the involvement of MAPKs in abiotic stress signaling is a key goal for crop species research, especially in crop breeding.


Cotton (Gossypium spp.) has become increasingly important for plant research on polyploidization, phylogeny, cytogenetics, and genomics. It is regarded as one of the most important natural plants with the highest diversity, as well as having the greatest commercial value (Wang et al. 2018). Cotton is mainly cultivated as a source of fiber, food, and feed. Gossypium hirsutum L., a tetraploid, is the largest species among the more than 50 species in Gossypium (Wang et al. 2018). G. hirsutum is believed to have originated from hybridization between an A-genome species, possibly G. herbaceum (A1) of African origin or the Asian cotton G. arboreum (A2), and a D-genome species, possibly G. raimondii of American origin. This tetraploid cotton accounts for around 90% of worldwide cotton production annually (Page et al. 2013).

Several biotic and abiotic factors such as drought, heat, waterlogging, salinity, and diseases significantly affect cotton productivity, causing significant losses in the agricultural sector. The cotton plant is prone to stress because of its growth habit: its growth tracks environmental conditions and is therefore strongly affected by environmental stress. Traditional crop breeding techniques have limitations such as crossing barriers, long breeding cycles, and the transmission of genetic diseases. These limitations have led to the development of more recent breeding techniques (Zhang et al. 2014a).

High levels of greenhouse gas emissions and the associated air pollution are a major cause of heatwaves, floods, and drought. Drought stress can significantly reduce crop production, and its impact depends on the severity and duration of the stress. Water availability is a critical factor in achieving long-term sustainability in crop production (Khan et al. 2018). Drought stress is among the main factors that affect cotton production, because 50% of the global cotton supply comes from drought-challenged areas of the world. Cotton crops require improved yields and yield stability in both standard and moisture-stressed environments (Tuteja 2017). Drought stress influences the growth and productivity of cotton plants by inducing several morpho-physiological and biochemical changes. Physiological and metabolic processes such as photosynthesis, stomatal conductance, respiration, energy output, and carbohydrate metabolism, and ultimately yield, are impaired even though cotton has various mechanisms to relieve and withstand water-deficit stress (Tian et al. 2019). Previous studies showed that abiotic stress treatments such as salt, drought, and low temperature triggered the expression of MAPK2, MAPK3, MKK3, and MAPK16. GhMAPK16 is thought to be active in stress signaling pathways, and its overexpression significantly improved drought tolerance in Arabidopsis (Shi et al. 2011). GhMKK3 functions in drought tolerance by regulating stomatal responses and root growth (Wang et al. 2016), GhMAPK16 is involved in disease resistance and drought sensitivity (Shi et al. 2011), GhMAPK3 enhances cold, drought, and salt stress tolerance in Arabidopsis (Sadau et al. 2021), and GhMAPK2 positively regulates salt and drought tolerance in tobacco (Zhang et al. 2011).

Salt stress is one of the most severe environmental stresses that plants face. Roots are the first and most direct organs to detect the signal. From germination to boll formation, salt stress harms cotton physiology, and the tolerance mechanism is well described (Munns and Tester 2008). However, early growth responses to salinity may not be a reliable measure of salt tolerance across plant species. Screening for salt tolerance among different plants may use physiological parameters as stress tolerance indicators, while enzyme concentration could be used to assess salt tolerance in cotton. A few novel approaches, such as transcriptome profiling, methylation-sensitive amplified polymorphism analysis at the gene and cell levels, and genetic diversity assessment with various molecular markers, have revealed salt stress-induced epigenetic changes in cotton cultivars and their salt tolerance mechanisms. GhMAPK6a negatively regulates osmotic tolerance and resistance to bacterial infection in transgenic Nicotiana benthamiana and plays a pivotal role in its development (Li et al. 2013). GhMAPK4 confers hypersensitivity to salt and osmotic stresses in transgenic Arabidopsis (Wang et al. 2015). GhMKK5 affects disease resistance, induces HR-like cell death, and reduces tolerance to salt and drought stress in transgenic Nicotiana benthamiana (Zhang et al. 2012a; b). GhMAPK4 negatively regulates the salt stress response (Wang et al. 2015), while GhMAPK17 positively regulates it (Zhang et al. 2014a, b).

Plants' adaptive responses to environmental changes triggered by external and internal influences depend primarily on their interpretation of external signals. Multiple signal transduction pathways are used to amplify these perceived signals. MAPK cascades can be considered a standard signaling mechanism that transduces external stimuli into cellular responses. A MAPK cascade consists of three sequentially acting kinases: MAPKKK (MAP kinase kinase kinase), MAPKK (MAP kinase kinase), and MAPK (MAP kinase) (Bengough et al. 2011; Nakagami et al. 2005; Sadau et al. 2021; Wang et al. 2018). MAPKs are thought to contribute significantly to environmental stress tolerance across eukaryotes by activating multiple cellular protein receptors (Larade and Storey 2006; Rohila and Yang 2007). MAPKs are activated when plants are stressed (Li et al. 2014; Wang et al. 2016; Zhang et al. 2014a, b). However, most MAPK research has been done on model plants such as Arabidopsis and tobacco (Li et al. 2013; Ichimura et al. 2000; Zhang et al. 2013). Also, almost all MAPKs examined so far enhanced abiotic or biotic stress tolerance. The functions of MAPKs within the same group can differ from those of Arabidopsis thaliana AtMPK3 and AtMPK6 (Ichimura et al. 2000). In this research, a total of 74 MAPK genes were identified: 41 in G. hirsutum, 19 in G. raimondii, and 14 in G. arboreum. The MAPK proteins were studied further to determine their physicochemical characteristics, phylogenetic relationships, chromosomal mapping, and conserved motifs. Seven selected GHMAPK genes were then analyzed experimentally for tissue-specific expression and for expression under drought and salt stresses to confirm their functions in cotton. The findings provide substantial evidence and lay the groundwork for further research into the molecular and biological functions of MAPKs in cotton.

Materials and methods

Plant materials, growth conditions, and stress treatments

Upland cotton (TM-1, G. hirsutum L.) was used to evaluate tissue/organ and stress expression. Plants were grown at 25 °C under a cycle of 16 h light and 8 h dark in a controlled chamber. Tissues including cotyledons, leaves, stems, roots, and fibers (5 days post anthesis, DPA) were collected. Drought stress was induced in seedlings with 15% polyethylene glycol (PEG-6000), and salt stress was applied with 200 mmol·L–1 sodium chloride (NaCl) (Li et al. 2013; Ma et al. 2020). Samples were collected at 0, 2, 6, and 12 h after treatment. Sampling for each treatment was carried out three times, and samples were immediately frozen in liquid nitrogen and stored at –80 °C.

Identification and analysis of physiochemical properties of MAPK genes

MAPK proteins of G. hirsutum, G. arboreum, and G. raimondii were retrieved from the cotton database CottonFGD using the profile PF00069 from the Pfam database, with an E-value < 0.01 as the significance threshold. The domains were further confirmed using online tools. The physicochemical characteristics of the cotton MAPK genes were retrieved from CottonFGD. For subcellular localization of the GHMAPK, GaMAPK, and GrMAPK proteins, the protein sequences were downloaded from CottonFGD, and the prediction was carried out with the WoLF PSORT online server.

Sequence alignments, phylogenetic tree construction and collinearity

The MAPK protein sequences obtained from G. hirsutum, G. arboreum, and G. raimondii, together with the A. thaliana protein sequences obtained from Phytozome, were used to investigate evolutionary relationships with the Neighbor-Joining (NJ) approach. The phylogenetic tree was constructed in MEGA 7.0 with the Jones-Taylor-Thornton substitution model and 1 000 replications. For gene collinearity, the G. hirsutum protein sequences were used in a BLAST search against the G. raimondii and G. arboreum protein databases with an E-value < 0.01, and values ≥ 90 were considered significant. The gene IDs, GFF3 files, and linked files were used to construct the collinearity map with the TBtools software (Chen et al. 2018).

Motif identification, gene structural analysis, chromosomal mapping, and promoter analysis of the cotton MAPK genes

MEME, an online tool, was used to identify conserved motifs in the cotton MAPK proteins, and TBtools was then used for motif visualization. The coding sequences (CDS) were compared with the genomic sequences of the MAPK genes using an online gene structure tool. Chromosome information was extracted from the cotton GFF3 file obtained from cotton GDP and then mapped to the gene IDs using the TBtools software (Tamura et al. 2011). To examine the regulatory regions of the GHMAPK genes, the 2 000 bp sequence upstream of the start codon of each gene was submitted to the PlantCARE program (Lescot et al. 2002).
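
The promoter step can be illustrated with a minimal Python sketch. It is not the PlantCARE workflow used in the study; it only shows how a 2 000 bp upstream window could be cut out for genes on either strand and scanned for a short motif. The function names, the toy sequence, the coordinates, and the motif string ACGTG (used here as an assumed ABRE-like core) are illustrative assumptions.

```python
# Sketch: extract the region upstream of a gene and count motif hits on both strands.
# Sequences, coordinates, and the motif string are illustrative assumptions.

COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]


def upstream_region(chrom_seq: str, start: int, end: int, strand: str,
                    length: int = 2000) -> str:
    """Return up to `length` bp upstream of a gene, read 5'->3' on the gene's strand.

    `start` and `end` are 1-based inclusive coordinates, as in GFF3.
    """
    if strand == "+":
        lo = max(0, start - 1 - length)
        return chrom_seq[lo:start - 1]
    if strand == "-":
        hi = min(len(chrom_seq), end + length)
        return reverse_complement(chrom_seq[end:hi])
    raise ValueError(f"unknown strand: {strand!r}")


def count_motif(promoter: str, motif: str) -> int:
    """Count (possibly overlapping) occurrences of `motif` on either strand."""
    promoter = promoter.upper()
    hits = 0
    # A set avoids double counting when the motif is its own reverse complement.
    for pattern in {motif.upper(), reverse_complement(motif.upper())}:
        pos = promoter.find(pattern)
        while pos != -1:
            hits += 1
            pos = promoter.find(pattern, pos + 1)
    return hits


# Toy chromosome: 11 bp of "promoter" followed by a 6 bp "gene" on the plus strand.
toy_chrom = "TTTACGTGAAA" + "ATGCCC"
promoter = upstream_region(toy_chrom, start=12, end=17, strand="+")
abre_like_hits = count_motif(promoter, "ACGTG")
```

For a real analysis the window would be taken from the genome FASTA with coordinates from the GFF3 file, and motif annotation would come from the PlantCARE database rather than from a hand-written motif string.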

RNA isolation, cDNA synthesis, and qRT-PCR

RNA was isolated from the samples with an RNA extraction kit for polysaccharide- and polyphenolic-rich tissues (Tiangen, China), following the kit protocol. The RNA was then reverse-transcribed to cDNA using a PrimeScript™ RT reagent kit with gDNA Eraser. RT-qPCR was performed using actin as the reference gene. Relative expression was calculated using the 2^(−ΔΔCT) method (Schmittgen and Livak 2008). Each experiment was repeated three times, with three technical replicates.
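
As a worked illustration of the 2^(−ΔΔCT) calculation, the following Python sketch computes a fold change from CT values of a target gene and a reference gene in a treated sample and in the 0 h control. The CT values are hypothetical and chosen only to make the arithmetic easy to follow; averaging technical replicates before taking differences is an assumption about the workflow, not a detail stated in the text.

```python
from statistics import mean


def relative_expression(target_treated, ref_treated, target_control, ref_control):
    """Fold change of a target gene by the 2^(-ddCT) method.

    Each argument is a list of CT values from technical replicates.
    """
    delta_treated = mean(target_treated) - mean(ref_treated)  # dCT, stressed sample
    delta_control = mean(target_control) - mean(ref_control)  # dCT, 0 h control
    delta_delta = delta_treated - delta_control                # ddCT
    return 2 ** (-delta_delta)


# Hypothetical CT values (three technical replicates each), not data from the study.
fold = relative_expression(
    target_treated=[22.0, 22.0, 22.0],
    ref_treated=[18.0, 18.0, 18.0],
    target_control=[24.0, 24.0, 24.0],
    ref_control=[18.0, 18.0, 18.0],
)
print(fold)  # 4.0
```

Here the target gene reaches threshold two cycles earlier after treatment while the reference gene is unchanged, so ΔΔCT = −2 and the fold change is 2^2 = 4.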

Expression patterns of GHMAPK genes in different tissues and under drought and salt stress, and validation of RNA sequencing data

RNA sequencing data for TM-1 were obtained from our lab. We analyzed the RNA sequencing data for tissue expression and for drought and salt stress. Samples for tissue expression were taken from cotyledon, leaf, stem, root, and fiber at 5 DPA, while samples for drought and salt stress were taken at 0, 2, 6, and 12 h after treatment. PEG-6000 was used for drought induction, and sodium chloride (NaCl) solution was used for salt treatment. The reads per kilobase per million mapped reads values were log-transformed, and the heat map was constructed using the TBtools software package.
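
The log transformation applied before drawing the heat map can be sketched as follows. This is a minimal example, not the TBtools procedure itself: the expression values are hypothetical, and the pseudocount of 1 added before taking log10 is an assumption made here so that zero expression maps to zero.

```python
import math


def log10_transform(expression, pseudocount=1.0):
    """Return log10(value + pseudocount) for every gene and time point."""
    return {
        gene: [math.log10(value + pseudocount) for value in values]
        for gene, values in expression.items()
    }


# Hypothetical normalized expression values at 0, 2, 6, and 12 h.
# The gene IDs appear in the text; the numbers do not come from the study.
raw = {
    "GH_A07G1527": [0.0, 9.0, 99.0, 999.0],
    "GH_D11G0040": [0.0, 0.0, 9.0, 9.0],
}
transformed = log10_transform(raw)
```

With these toy values the first gene maps to approximately 0, 1, 2, and 3, which shows how the transformation compresses a thousand-fold range into a scale that a heat map color gradient can display.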


Identification and physiochemical traits of MAPK gene family

A total of 74 MAPK genes were detected in cotton: 41 in G. hirsutum, 19 in G. raimondii, and 14 in G. arboreum. The proteins encoded by these genes were studied further to evaluate their physicochemical characteristics and other features. The encoded proteins ranged in length from 87 to 628 amino acids, with molecular masses between 9.988 and 71.503 kDa and isoelectric points (pI) between 5.672 and 10.23. All identified MAPK proteins had a grand average of hydropathy (GRAVY) below 0, implying that the cotton MAPK proteins are hydrophilic. Based on the WoLF PSORT analysis, GHMAPK proteins are localized to different parts of the cell, including the nucleus, cytoplasm, mitochondria, and chloroplast. The proteins were predominantly located in the cytoplasm (Additional file 1: Table S1).
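
For readers unfamiliar with the GRAVY index, the sketch below shows how it is computed: the Kyte-Doolittle hydropathy values of all residues are summed and divided by the sequence length, and a negative result indicates a hydrophilic protein. The values in the study were retrieved from CottonFGD, so this code is only an illustration; the peptide strings are made-up examples, not MAPK sequences.

```python
# Kyte-Doolittle hydropathy scale for the 20 standard amino acids.
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def gravy(protein: str) -> float:
    """Grand average of hydropathy: mean Kyte-Doolittle value over all residues."""
    residues = protein.upper().rstrip("*")  # drop a trailing stop symbol if present
    if not residues:
        raise ValueError("empty sequence")
    unknown = set(residues) - set(KYTE_DOOLITTLE)
    if unknown:
        raise ValueError(f"non-standard residues: {sorted(unknown)}")
    return sum(KYTE_DOOLITTLE[aa] for aa in residues) / len(residues)


# Made-up peptides for illustration only.
charged_peptide = "KDEKDE"      # rich in charged residues, expected GRAVY < 0
aliphatic_peptide = "ILVILV"    # rich in aliphatic residues, expected GRAVY > 0
is_hydrophilic = gravy(charged_peptide) < 0
```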

Phylogenetic analysis of MAPK proteins

A phylogenetic tree was constructed from a multiple sequence alignment of 41 GHMAPK, 14 GaMAPK, 19 GrMAPK, and 18 Arabidopsis MAPK protein sequences to reveal the evolutionary relationships of cotton MAPKs. Based on the phylogenetic tree, the cotton MAPK family is categorized into clades A, B, C, D, and E (Fig. 1). Clade A comprises genes with the TDY phosphorylation site, clade B members contain the TEY motif, clade C members contain the TDY motif, clade D members contain mostly TDY motifs and a few TEY motifs, and clade E members contain the TEY motif. There are 26 GHMAPKs, 9 GrMAPKs, 9 GaMAPKs, and 3 AtMAPKs that contain the TDY motif, while 15 GHMAPKs, 8 GrMAPKs, 6 GaMAPKs, and 5 AtMAPKs contain the TEY motif. Thus TDY-type MAPKs outnumber TEY-type MAPKs in cotton, whereas TEY-type MAPKs outnumber TDY-type MAPKs in Arabidopsis. The cotton MAPK classification was consistent with previous findings (Zhang et al. 2014b). This suggests that TEY-type MAPKs may play a more significant functional role in dicot plants than TDY-type MAPKs (Fig. 1).
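
The TEY/TDY tally described above can be sketched in a few lines of Python. This is a simplification under stated assumptions: it reports the first T-E-Y or T-D-Y tripeptide found anywhere in the sequence, whereas a careful analysis would restrict the search to the activation loop of the kinase domain. The sequences and their names are made-up toy strings, not cotton MAPK proteins.

```python
import re
from collections import Counter

# T-x-Y activation motif, restricted to the two variants discussed in the text.
TXY_MOTIF = re.compile(r"T[DE]Y")


def activation_motif(protein: str):
    """Return 'TEY' or 'TDY' for the first match, or None if neither occurs."""
    match = TXY_MOTIF.search(protein.upper())
    return match.group(0) if match else None


def tally_motifs(proteins: dict) -> Counter:
    """Count how many sequences carry each motif type ('none' if no match)."""
    return Counter(activation_motif(seq) or "none" for seq in proteins.values())


# Toy sequences for illustration only (hypothetical names and residues).
toy_proteins = {
    "toy_TEY": "GAYGVVCSATEYVVTRWYR",
    "toy_TDY": "FMTDYVATRWYR",
    "toy_none": "MKKLLA",
}
motif_counts = tally_motifs(toy_proteins)
```

Applied to the full protein set of each species, a tally of this kind would give the per-species TEY and TDY counts reported in the paragraph above, provided the motif search is limited to the correct region.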

Fig. 1

Phylogenetic analysis of MAPK proteins. The different MAPK clades are represented by the letters A to E. Genes are highlighted in different colors by species: G. hirsutum in red, G. arboreum in cadet blue, G. raimondii in violet, and A. thaliana in green

Collinearity analysis was performed between the physical maps of the genomes of G. hirsutum, G. raimondii, and G. arboreum to examine associations among A vs. D, A vs. At, and D vs. Dt (sub)genomes (Fig. 2). In general, most of the MAPK genes from tetraploid cotton showed high similarity to those of the D genome (G. raimondii) and the A genome (G. arboreum).

Fig. 2

Collinearity analysis of MAPK genes in cotton. Dark green: G. arboreum; Red: G. raimondii; Light green: At and Dt of G. hirsutum

Motif identification and gene structure analysis of MAPK

Conserved motifs of the MAPK proteins were investigated with MEME: the putative protein sequences of the 41 GHMAPKs, the GaMAPKs, and the 19 GrMAPKs were submitted to search for conserved motifs (Fig. 3). As shown in the figure, 20 motifs were identified in each of the GHMAPKs, GrMAPKs, and GaMAPKs. In the GHMAPKs, the majority of the identified genes share a composition of 10 motifs (1, 2, 3, 4, 5, 6, 10, 13, 17, and 18), a few contain 9 further motifs (7, 8, 9, 11, 12, 14, 15, 16, and 19), and only four genes (GH_A11G0035, GH_A12G2888, GH_D11G0040, and GH_D12G2911) contain motif 20. In the GaMAPKs, the majority of the identified genes share a composition of 9 motifs (1, 2, 3, 4, 6, 8, 9, 10, and 15), a few contain 5 further motifs (5, 7, 16, 17, and 18), only three genes (Ga01G0448, Ga05G3624, and Ga11G0522) contain motif 19, and three genes (Ga01G2598, Ga12G0481, and Ga03G1217) contain motif 20. Lastly, the GrMAPK genes share a composition of 9 motifs (1, 2, 3, 4, 5, 6, 8, 9, and 15), a few contain 10 further motifs (7, 10, 11, 12, 13, 16, 17, 18, 19, and 20), and only three genes (Gorai.008G065400, Gorai.011G100600, and Gorai.003G012800) contain motif 14.

Fig. 3

Conserved motifs of MAPK proteins in A) Gossypium hirsutum; B) Gossypium arboreum; and C) Gossypium raimondii

Gene structure of the cotton MAPK genes was examined using an online tool. In G. hirsutum, 38 of the 41 genes were found to possess introns, and 3 are intronless. The longest intron was observed in GH_A07G1527 (Fig. 4A). In G. arboreum, all of the genes possess introns, and the longest intron was observed in Ga02G0944 (Fig. 4B). In G. raimondii, all of the genes possess introns, and the longest intron is found in Gorai.006G007700 (Fig. 4C).
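
The intron counts and intron lengths discussed above follow directly from exon coordinates. The Python sketch below derives introns as the gaps between consecutive exons of one transcript and reports the longest one. It is not the gene structure tool used in the study; the coordinates are hypothetical and follow the 1-based inclusive convention of GFF3.

```python
def introns_from_exons(exons):
    """Return intron (start, end) pairs as the gaps between consecutive exons.

    `exons` is a list of (start, end) tuples, 1-based and inclusive, for one transcript.
    """
    ordered = sorted(exons)
    introns = []
    for (_, prev_end), (next_start, _) in zip(ordered, ordered[1:]):
        if next_start > prev_end + 1:  # skip adjacent or overlapping exons
            introns.append((prev_end + 1, next_start - 1))
    return introns


def longest_intron_length(exons) -> int:
    """Length in bp of the longest intron, or 0 for an intronless gene."""
    return max((end - start + 1 for start, end in introns_from_exons(exons)), default=0)


# Hypothetical exon coordinates for illustration only.
three_exon_gene = [(1, 100), (201, 300), (501, 600)]
single_exon_gene = [(1, 900)]
gene_introns = introns_from_exons(three_exon_gene)
```

A gene with an empty intron list, like `single_exon_gene` here, corresponds to the intronless genes mentioned in the text.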

Fig. 4

Analysis of gene structure using the gene structure display server. A) Gossypium hirsutum; B) Gossypium arboreum; and C) Gossypium raimondii

Chromosomal mapping of MAPK genes in Gossypium species

The chromosome distribution was investigated in the three Gossypium species. Among the 41 G. hirsutum MAPK genes, 20 were located in the At subgenome and 21 in the Dt subgenome. Chromosome A12 and its homolog D12 had the most gene loci. Three genes each were found on A01, A03, A11, D03, and D05; D07, D01, and D02 have two genes each; and the remaining chromosomes A02, A05, A07, A08, A09, A10, D04, D08, D09, and D10 have one gene each (Fig. 5A). In G. arboreum, the GaMAPK genes were mapped on chromosomes A01, A02, A03, A05, A11, and A12. The most gene loci were observed on chromosomes A03, A11, and A12, each with three genes, while chromosomes A01, A02, and A05 have two genes each (Fig. 5B). The genes in G. raimondii are distributed among chromosomes D01, D02, D03, D05, D06, D07, D08, D09, and D11. Chromosome D08 has the most gene loci, followed by D03 with three genes. Chromosomes D01, D02, D05, and D09 have two genes each, and one gene each was observed on D06 and D11 (Fig. 5C).
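
A per-chromosome tally of this kind can be sketched from the gene identifiers alone, assuming (as the IDs quoted in this article suggest) that a G. hirsutum ID such as GH_A07G1527 encodes the subgenome letter and chromosome number after the GH_ prefix. This is only an illustration; the mapping in the study was built from GFF3 coordinates in TBtools. The example uses the seven GHMAPK genes selected for expression analysis, not the full set of 41.

```python
import re
from collections import Counter

# Assumed ID layout: GH_<subgenome letter><two-digit chromosome>G<number>.
GENE_ID = re.compile(r"^GH_([AD])(\d{2})G\d+$")


def chromosome_of(gene_id: str) -> str:
    """Return the chromosome label (e.g. 'A07') encoded in a G. hirsutum gene ID."""
    match = GENE_ID.match(gene_id)
    if match is None:
        raise ValueError(f"unexpected gene ID format: {gene_id!r}")
    return match.group(1) + match.group(2)


def tally(gene_ids):
    """Count genes per chromosome and per subgenome (At or Dt)."""
    per_chromosome = Counter(chromosome_of(gene_id) for gene_id in gene_ids)
    per_subgenome = Counter(chrom[0] + "t" for chrom in per_chromosome.elements())
    return per_chromosome, per_subgenome


selected_genes = [
    "GH_A07G1527", "GH_D02G1138", "GH_D03G0121", "GH_D03G1517",
    "GH_D05G1003", "GH_D11G0040", "GH_D12G2528",
]
per_chromosome, per_subgenome = tally(selected_genes)
```

For these seven genes the tally gives one gene in the At subgenome and six in the Dt subgenome, with two of them on D03.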

Fig. 5

Chromosomal mapping of cotton MAPK genes. A) GHMAPK At subgenome; B) GHMAPK Dt subgenome; C) GaMAPK; D) GrMAPK. The chromosome number is indicated in the middle of the left-hand side of each chromosome

Determination of cis-regulatory elements

Cis-regulatory elements are assumed to perform various functions depending on their arrangement and location across promoters (Biłas 2016). We analyzed the promoter regions of the cotton MAPKs to determine their cis-elements. The 2 kb sequence upstream of each GHMAPK gene was used. The cis-elements identified in the GHMAPK promoter regions were categorized into five functional groups: hormone responsiveness, stress responsiveness, light responsiveness, cellular development, and binding site (Additional file 2: Table S2). For hormone responsiveness, the majority of GHMAPK promoters contained ABRE (a cis-element involved in ABA responsiveness), the TGA element (auxin-responsive element), the ethylene-responsive element (ERE), the CA element (salicylic acid-responsive element), and the GATA-motif (cis-acting regulatory element involved in MeJA responsiveness). A few of the GHMAPK promoters have the GARE motif and P-box (gibberellin-responsive). Elements associated with stress include TC-rich repeats (defense and stress-responsive element), LTR (low temperature-responsive element), and the WUN-motif (wound-responsive element). For light responsiveness, elements such as the ATCT-motif, Box 4, GT1-motif, LAMP-element, TCT-motif, TCCC-motif, CHS-CMA1a, and TGACG-motif were found. Few cis-elements are involved in cellular development and binding sites. Several cis-regulatory elements related to enhanced tolerance to abiotic and biotic stresses in plants were found in the promoter sequences of the GHMAPK genes, suggesting that these genes could serve as candidates for abiotic stress tolerance in cotton (Fig. 6).

Fig. 6

Cis-regulatory elements identified in the GHMAPK promoters

RNA sequence analysis and RT-qPCR confirmation of GHMAPK under drought and salt treatments

The RNA-seq profile data of TM-1 were used; the raw values and their log10-transformed values were analyzed, and a heat map was constructed. The seven GHMAPK genes were found to have different expression patterns across tissues, including young leaf, true leaf, cotyledon, stem, fiber, and root. Gene-specific RT-qPCR primers were designed (Additional file 3: Table S3). Based on the RNA-seq data and the RT-qPCR results, GH_A07G1527 and GH_D02G1138 are upregulated in all tissues. GH_D03G0121 is upregulated in cotyledon, stem, and root, and GH_D03G1517 and GH_D05G1003 are upregulated in all tissues. GH_D11G0040 is upregulated in stem and root, while GH_D12G2528 shows relatively low expression (Fig. 7A).

Fig. 7

RNA sequencing analysis and qRT-PCR analysis of the 7 cotton GHMAPK genes. A) gene expression pattern in cotton tissues; B) drought stress gene expression pattern; C) salt stress gene expression pattern. The relative expression was determined using the 2^(−ΔΔCT) method with actin1 serving as the housekeeping gene. The data represent the mean of three replicates. The TBtools software was used to build the heat map (shown as log10 values). Yellow indicates upregulation, blue indicates downregulation, and white indicates no expression

To determine the roles of GHMAPK genes under salt and drought stresses, the expression patterns of the seven genes after treatment were examined in both the RNA-seq data and the RT-qPCR results. Under drought treatment, GH_A07G1527 was upregulated at 6 h and 12 h, GH_D02G1138 was upregulated at 2 h, 6 h, and 12 h, and GH_D03G0121 was upregulated at 12 h. GH_D03G1517 and GH_D05G1003 were upregulated at 2 h, 6 h, and 12 h, while GH_D11G0040 showed no expression. Lastly, GH_D12G2528 was upregulated at 12 h (Fig. 7B). Under salt treatment, GH_A07G1527 and GH_D02G1138 were upregulated at 2 h, 6 h, and 12 h, whereas GH_D03G0121 showed no expression in the RNA-seq data but was upregulated at 12 h post-treatment. GH_D03G1517 was upregulated at 2 h, 6 h, and 12 h post-treatment. GH_D11G0040 showed low expression in the RNA-seq data and was upregulated at 6 h and 12 h in RT-qPCR. Lastly, GH_D12G2528 was upregulated at 2 h, 6 h, and 12 h (Fig. 7C). The RNA-seq and RT-qPCR expression results were correlated, with R² = 0.91 for drought, R² = 0.75 for salt, and R² = 0.66 for tissues.
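
The agreement between RNA-seq and RT-qPCR can be quantified as sketched below. The text does not state how the reported values were computed, so this example assumes the squared Pearson correlation coefficient between paired expression values; the function name and the numbers are illustrative and do not reproduce the values reported above.

```python
def r_squared(x, y):
    """Squared Pearson correlation between two equally long lists of numbers."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two lists of equal length with at least two values")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov ** 2 / (var_x * var_y)


# Hypothetical paired values (e.g. log-scaled RNA-seq vs. RT-qPCR fold change).
rnaseq_values = [1.0, 2.0, 3.0, 4.0]
qpcr_values = [1.0, 3.0, 2.0, 4.0]
agreement = r_squared(rnaseq_values, qpcr_values)
```

With these toy values the result is 0.64: the two platforms rank the first and last samples identically but disagree on the middle two.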


Plants are constantly exposed to environmental stress conditions such as pathogen infection, salt, cold, drought, and oxidative stress. Such stresses have adverse effects on plant development and productivity, resulting in significant losses of crop yield (Tuteja 2017). Unlike animals, plants cannot escape environmental stress because they are sessile. To respond to various stress conditions, plants must develop sophisticated pathways that help them process biotic or abiotic signals through appropriate cell signaling mechanisms (Wang et al. 2016). Mitogen-activated protein kinase (MAPK) cascades play an important role in abiotic stress responses as part of a critical signal transduction module. MAPK cascades have been implicated in stress responses and signal transduction in cotton in several previous studies (Wang et al. 2015; Long et al. 2014; Sadau et al. 2021; Shi et al. 2011; Zhang et al. 2014a, b; Zhang et al. 2020; Zhang et al. 2011). However, more studies are needed to strengthen knowledge of the sophisticated biological functions of MAPK cascades in cotton under drought and salt stress.

A total of 41 MAPK genes in G. hirsutum, 15 in G. arboreum, and 19 in G. raimondii have been identified in this study. The proteins of these three cotton species have almost identical physicochemical properties, with molecular weights ranging from 9.988 to 71.503 kDa and GRAVY values less than zero. A low GRAVY value is a strong indication that a protein is hydrophilic. Proteins encoded by stress-responsive genes are closely associated with hydrophilicity. Previous research has also confirmed that hydrophilic proteins are involved in tolerance to numerous abiotic stresses (Hanin et al. 2011; Mogwomgwa et al. 2019; Lu et al. 2019). According to the phylogenetic relationships, cotton has more MAPK genes in clade A than in clade B, which is in agreement with previous reports in G. raimondii, Arabidopsis, rice, and poplar (Hamel et al. 2006; Ichimura et al. 2000).

To better understand the potential functions of GHMAPK genes in cotton under various environmental stresses, we examined the distribution of cis-elements in their promoter regions. Depending on their position, form, and orientation on the promoter, the observed cis-regulatory elements serve various functions. In this study, the cis-elements identified were classified into five groups (hormone responsiveness, light responsiveness, stress responsiveness, cellular development, and binding site). The prevalence of such elements across the promoter regions of these genes signifies their role in the growth and development of plants (Das and Roychoudhury 2014; Elasad et al. 2018; Escobar-Sepúlveda et al. 2017).

Analysis of exon/intron structure revealed that most of the cotton MAPK genes were interrupted by introns, while a few were intronless. The longest intron in G. hirsutum is found in GH_A07G1527. These results are in conformity with previous research (Zhang et al. 2014a, b; Zhang et al. 2020), which showed that only a few cotton MAPK and MAPKKK genes are intronless.

Gene expression patterns are commonly used to infer the functional roles of genes in plant growth and physiology. Using the RNA sequencing data and RT-qPCR validation, the expression patterns of seven GHMAPK genes were observed across vegetative and reproductive tissues in this study. The results show that the genes were highly expressed across different tissues, suggesting that they perform several functions during different phases of growth and development in cotton. Previous research showed that specific MAPK genes have tissue-specific expression patterns in many plants such as cotton, wheat, maize, and cucumber (Chen and Murata 2011; Liu et al. 2013). However, these genes are expressed in both vegetative and reproductive organs at different expression levels.

Both the RNA sequencing and the RT-qPCR validation analyses indicated that GHMAPK genes were upregulated under drought and salt stresses. Six genes were upregulated under drought treatment, while only GH_D11G0040 showed no expression, and six genes were upregulated under salt treatment (GH_A07G1527, GH_D02G1138, GH_D03G0121, GH_D03G1517, GH_D05G1003, and GH_D12G2528), except GH_D11G0040, which showed relatively low expression. This is consistent with previous findings that various abiotic stresses upregulated GhMPK2 and GbMPK3 in cotton and possibly enhanced drought and oxidative stress tolerance. The expression levels of most GhMAPKKKs have been found to increase in cotton exposed to a water deficit, implying that these genes may be linked to cotton drought tolerance and response (Sadau et al. 2021; Teige et al. 2004; Zhang et al. 2011). Previous research also confirmed that the expression of MAPK genes regulates the drought stress response in other plants such as rice, wheat, maize, and Arabidopsis (Lee et al. 2011; Rohila and Yang 2007; Zhan et al. 2017). MAPK genes have also been reported to be induced by salt stress (Teige et al. 2004; Xie et al. 2014). A genome-wide bioinformatics analysis of the MAPK gene family in kiwifruit (Actinidia chinensis) confirmed that the expression levels of most AcMAPK genes were significantly higher at all treatment time points, whereas the remaining genes were downregulated under salinity stress. Studies also confirmed that TaMAPK29, TaMAPK33, and TaMAPK41 are linked to the salt stress response (Wang et al. 2018).


The identification of MAPK genes in Gossypium hirsutum with potential to enhance tolerance to drought and salt stresses, based on a family analysis of G. hirsutum, G. arboreum, and G. raimondii, provides a basis for genetic approaches to improving cotton under drought and salt stress. Analysis of the cis-regulatory elements in the GHMAPK promoters suggests that these genes significantly affect abiotic stress tolerance. Analysis of the RNA sequencing and RT-qPCR data revealed the upregulation of several genes across both vegetative and reproductive tissues. Because of their upregulation after treatment in this study, they are proposed as candidate genes for tolerance to drought and salt stresses in cotton. This research lays the groundwork for building more robust cotton genotypes that perform better under environmental stresses such as drought and salt stress. MAPK genes also have broad regulatory roles, and extending MAPK research is economically important for cotton production.

Availability of data and materials

All related data and files are presented, including the sequences of the primers used in the gene expression profiling.




This research was funded by National Key R&D Program of China (2020YFD1001004).

Author information




Sadau SB, Mehari TG, and Ahmad A carried out the experiments and wrote the manuscript. Tajo SM, Elasad M, and Ibrahim S assisted in data collection. Iqbal MS, Zhang JL, Wei HL, and Yu SX revised the manuscript. All authors read and approved the final manuscript.

Corresponding authors

Correspondence to Sadau Salisu Bello or Wei Hengling.

Ethics declarations

Ethics approval and consent to participate

Not applicable.

Consent for publication

Not applicable.

Competing interests

The authors declared that they have no competing interests.

Supplementary Information

Additional file 1. Table S1: Physicochemical properties of MAPK genes

Additional file 2. Table S2: Functions of identified cis-regulatory elements

Additional file 3. Table S3: Gene primers used for qRT-PCR

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit


About this article


Cite this article

Sadau, S.B., Mehari, T.G., Ahmad, A. et al. Genome wide identification and characterization of MAPK genes reveals their potential in enhancing drought and salt stress tolerance in Gossypium hirsutum. J Cotton Res 5, 23 (2022).



  • Cotton
  • MAPK gene
  • Drought stress
  • Salt stress