Phenylpropanoid metabolism and pigmentation show divergent patterns between brown color and green color cottons as revealed by metabolic and gene expression analyses

Naturally-colored cotton has become increasingly popular because of its natural properties of coloration, UV protection, flame retardancy, antibacterial activity and mildew resistance. However, poor fiber quality and limited color choices are two key issues that have restricted the cultivation of naturally-colored cotton. To identify the possible pathways participating in fiber pigmentation in naturally-colored cottons, five cotton accessions representing three fiber color types (green, brown and white) were chosen for a comprehensive analysis of phenylpropanoid metabolism during fiber development. The expression levels of flavonoid biosynthesis pathway genes in brown cotton fibers were significantly higher than those in white and green cotton fibers. Total flavonoid and proanthocyanidin contents were higher in brown cotton fibers than in white and green cotton fibers, which suggested that the flavonoid biosynthesis pathway might not participate in the pigmentation of green cotton fibers. Further expression analysis indicated that the genes encoding enzymes for the synthesis of caffeic acid derivatives, lignin and lignan were activated in the developing fibers of the green cotton at 10 and 15 days post-anthesis. Our results strengthen the understanding of phenylpropanoid metabolism and pigmentation in green and brown cotton fibers, and may improve the breeding of naturally-colored cottons.


Background
Cotton, the most important natural textile crop in the world, accounts for more than one-third of the world textile fiber market and plays a significant role in the world economy (Ma et al. 2018). Naturally-colored cottons (NCC) are cotton types that have naturally-colored lint and can be used directly to process colored products (Günaydin et al. 2019; Matusiak and Frydrych 2014; Rathinamoorthy and Parthiban 2017). They are also called "5C cotton" (cotton, color, charming, certification, and care) (Zhang et al. 2011). As a peculiar type of cotton, colored cotton offers UV protection (Crews and Hustvedt 2005), flame retardancy (Hinchliffe et al. 2016) and antibacterial activity (Chen and Cluver 2010). It requires less dyeing in the textile production process, meeting the demand for natural and health-conscious consumer products. The International Committee on Organic Agriculture predicts that 30% of the total global cotton production will be replaced by colored cotton and organic cotton in the next 30 years, and NCC fiber will be a valuable commodity in the textile market (Günaydin et al. 2019; Hinchliffe et al. 2016; Rathinamoorthy and Parthiban 2017).
Despite the growing demand for NCC products, there has been no corresponding increase in NCC cultivation because natural color is tightly associated with poor fiber quality and low yield (Chen and Cluver 2010; Feng et al. 2015; Semizer-Cuming et al. 2015; Tu et al. 2014). Since brown and green are the two major fiber colors in NCC production, the limited color choice has been another major problem inhibiting the large-scale commercialization of NCC products (Blas-Sevillano et al. 2018). Therefore, the chemical basis underpinning NCC colors and the control of the biosynthesis of associated pigments have become key issues in NCC research.
Over the past 10 years, many studies have focused on metabolic and transcriptional analyses and quantitative trait locus (QTL) mapping of brown cotton fibers. Flavonoids were detected in extracts of brown cotton fibers (Hua et al. 2007), and the flavonoid biosynthesis pathway, especially proanthocyanidin (PA) biosynthesis, was activated during fiber development of brown cottons (Feng et al. 2013; Liu et al. 2018; Tan et al. 2013; Xiao et al. 2014). QTL mapping found six genetic loci (Lc1, Lc2, Lc3, Lc4, Lc5 and Lc6) associated with the fiber colors of brown cottons, and further studies showed that GhTT2-3A (Gh_A07G2341), a gene controlling PA biosynthesis, was a candidate gene, confirmed by transgenic analysis, that controls fiber pigmentation of brown cotton (Hinchliffe et al. 2016; Wen et al. 2018; Yan et al. 2018).
Therefore, both metabolic and gene expression analyses showed that the pigments in brown cotton fibers were PA or PA derivatives (Feng et al. 2014; Xiao et al. 2014; Yan et al. 2018), while the pigments in green cotton fibers remained uncertain. Some transcriptional and metabolic analyses supported the view that flavonoids and their derivatives were the dominant pigments in green cotton fibers (Hua et al. 2007; Liu et al. 2018; Yuan et al. 2012), but other analyses suggested that caffeoyl residues were related to pigmentation in these fibers (Feng et al. 2017; Ma et al. 2015). Proteomics-based analysis of green cotton fibers found that phenylcoumaran benzylic ether reductase (PCBER), a key enzyme in lignan biosynthesis, was specifically expressed in green cotton fibers, and the total lignan contents in green cotton fibers were significantly higher than those in white cotton fibers. Although the pigments in green cotton fibers have not been definitively identified, phenylpropanoid metabolism plays a key role.
To date, no studies have compared the entire phenylpropanoid metabolism in green and brown cotton fibers to elucidate the associated pigmentation pathways. In this study, both brown and green colored cottons were compared with white cotton as a control. The expression of phenylpropanoid pathway genes and the contents of flavonoids and PAs in these three types of cotton fibers were analyzed. Our data may shed some light on the molecular pathways underlying the differences between the fiber coloration of green and brown cottons.

Plant materials and growth conditions
Five cotton accessions representing three fiber color types were used in this study, all belonging to Gossypium hirsutum. These include one accession with white fiber (YZ1), one accession with brown fiber (T586/T, dark brown) and three accessions (G1, G2, G3) with green fiber. G1, G2 and G3 were developed by crossing green cotton accessions with one white cotton accession. Plants were grown in parallel in a climate-controlled greenhouse (Wuhan, China) at a temperature of 28°C to 32°C under a 14 h day/10 h night photoperiod with identical management practices. Cotton bolls were tagged on the day of flowering as 0 days post-anthesis (DPA). Bolls were harvested at 5-d intervals during fiber development (0, 5, 10, 15, 20 DPA), and frozen in liquid nitrogen immediately after removing the cotton shells. All samples were collected from 9:00 to 11:00 am to minimize potential variability associated with circadian rhythms. For 0 DPA and 5 DPA ovule samples, whole ovules were ground into powder in liquid nitrogen. For 10, 15 and 20 DPA, fibers were gently knocked off ovules in liquid nitrogen, and seeds were removed with forceps. The fibers were then ground into powder and stored at −80°C until RNA and metabolite extraction.

Length measurement of cotton fibers
Mature cotton bolls from similar fruit-bearing positions of individual plants were collected at the same time. The middle two mature seeds from each ovary were chosen for fiber length measurement and color observation. The fiber length was measured with a ruler according to a previous report (Tang et al. 2014). For each accession, at least 15 seeds were measured. Error bars represent the standard deviation (SD) of the mean.
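The summary statistics described above (mean and SD per accession) and the Student's t test cited in the Fig. 1 legend can be sketched as follows. This is a minimal illustration only: the measurements are hypothetical and are not data from this study, the function names are invented for the example, and the pooled-variance (equal variance) form of the t statistic is an assumption, since the text does not state which variant was used.

```python
from math import sqrt
from statistics import mean, stdev


def summarize(lengths_mm):
    """Return (mean, sample SD) for a list of fiber lengths in mm."""
    return mean(lengths_mm), stdev(lengths_mm)


def student_t(a, b):
    """Two-sample Student's t statistic with pooled variance.

    Returns (t, degrees_of_freedom). Assumes equal variances in both groups.
    """
    na, nb = len(a), len(b)
    var_a, var_b = stdev(a) ** 2, stdev(b) ** 2
    pooled = ((na - 1) * var_a + (nb - 1) * var_b) / (na + nb - 2)
    t = (mean(a) - mean(b)) / sqrt(pooled * (1 / na + 1 / nb))
    return t, na + nb - 2


# Hypothetical fiber lengths (mm); NOT measurements from the study.
white = [27.5, 28.1, 27.9, 28.3, 27.7]
brown = [15.5, 16.0, 15.8, 16.1, 15.6]

m_white, sd_white = summarize(white)
t_stat, dof = student_t(white, brown)
print(f"white: {m_white:.1f} mm (SD {sd_white:.2f}); t = {t_stat:.1f}, df = {dof}")
```

The P value would then be read from a t distribution with the returned degrees of freedom, for example with a statistics package.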

Retrieval and identification of phenylpropanoid metabolism genes from the cotton genome
The coding sequences (CDS) of phenylpropanoid metabolism genes from Arabidopsis (Table S1) were used as BLAST queries against the Gossypium hirsutum L. TM-1 genome to identify all homologs to the query genes using the CottonGen database (https://www.cottongen.org/blast/nucleotide/nucleotide) (Wang et al. 2019; Yu et al. 2014). These sequences were then selected according to the annotation information, and the fragments per kilobase of transcript per million fragments mapped (FPKM) values of these genes were downloaded. Genesis software (Sturn et al. 2002) was used to generate heatmaps from the expression values.

RNA extraction and quantitative real-time polymerase chain reaction (qRT-PCR) analysis
Total RNA of cotton fiber samples (0, 5, 10, 15 DPA) was extracted using the RNAprep Pure Plant Kit (TIANGEN Biotech). For each sample, 2 μg of total RNA was reverse transcribed into cDNA using M-MLV reverse transcriptase (Promega). For qRT-PCR analysis, 15 μL reactions for each sample were performed using SYBR Green (Applied Biosystems) as the fluorescent dye on an ABI 7500 Real-Time PCR System (Applied Biosystems) (Guo et al. 2017). GhUB7 (GenBank: DQ116441.1) was used as the reference gene to normalize gene expression levels. Primers were designed according to previous studies (Hu et al. 2018; Tan et al. 2013) and are listed in Table S2. Three technical replicates were performed, and the error bars represent the standard deviations.
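Normalization to a reference gene is commonly expressed as 2^(−ΔCt), where ΔCt is the target-gene Ct minus the reference-gene Ct. The sketch below illustrates that calculation for three technical replicates. It is an assumption that this formula matches the quantification used here, since the text states only that GhUB7 was the reference gene; the Ct values and the function name are hypothetical.

```python
from statistics import mean, stdev


def relative_expression(target_ct, reference_ct):
    """Relative expression by the 2^(-dCt) method (assumed, not from the source).

    Each technical replicate of the target gene is normalized to the mean Ct
    of the reference gene. Returns (mean, sample SD) of the relative levels.
    """
    ref_mean = mean(reference_ct)
    levels = [2 ** -(ct - ref_mean) for ct in target_ct]
    return mean(levels), stdev(levels)


# Hypothetical Ct values for one sample; NOT data from the study.
f3h_ct = [22.1, 22.3, 22.0]    # target gene, three technical replicates
ghub7_ct = [18.0, 18.1, 17.9]  # reference gene, three technical replicates

level, sd = relative_expression(f3h_ct, ghub7_ct)
print(f"relative expression = {level:.4f} (SD {sd:.4f})")
```

A lower ΔCt means the target transcript is detected earlier relative to the reference, so the relative expression value is larger.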

Determination of the total flavonoid content
The total flavonoid content was determined based on a previous method (Hu et al. 2018). In brief, approximately 100 mg fiber (precise weight recorded) for each sample was placed in a 2 mL centrifuge tube. One mL of 80% (V/V) methanol was added to extract the metabolites on a shaker at 4°C overnight. The supernatant was collected after centrifugation at 12 000 r·min⁻¹, and the residual pellet was re-extracted with 1 mL of 80% (V/V) methanol. The supernatants were combined and mixed thoroughly. Then 0.2 mL of the extract was mixed with 0.4 mL of 0.1 mol·L⁻¹ aluminum chloride (AlCl₃) solution in a test tube, to which was added 0.6 mL of 1 mol·L⁻¹ potassium acetate (KAc) solution. Then the mixture was diluted to 2 mL with 0.8 mL 80% (V/V) methanol and mixed thoroughly. After standing for 30 min, the absorbance was immediately measured at 420 nm using a Multimode Plate Reader (PerkinElmer). Rutin standard solutions were prepared as shown in Table S3 to make a standard curve.
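Converting an absorbance reading into a content per gram of fiber involves a linear fit of the rutin standard curve followed by scaling by the extract volume and sample mass. The following sketch shows one way to do this. The standard concentrations and absorbances are hypothetical (Table S3 is not reproduced here), the function names are invented for the example, and it assumes the standards went through the same AlCl₃/KAc reaction at the same volumes as the samples, so that the curve maps absorbance directly to the concentration in the 2 mL combined extract.

```python
def fit_line(x, y):
    """Ordinary least-squares fit y = slope * x + intercept."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def flavonoid_mg_per_g(absorbance, slope, intercept, extract_ml, sample_mg):
    """Total flavonoid content in mg rutin equivalents per g of sample."""
    conc_mg_per_ml = (absorbance - intercept) / slope  # concentration in extract
    total_mg = conc_mg_per_ml * extract_ml             # amount in whole extract
    return total_mg / (sample_mg / 1000.0)             # per gram of fiber


# Hypothetical rutin standards (mg/mL) and A420 readings; NOT from Table S3.
rutin_mg_per_ml = [0.0, 0.1, 0.2, 0.4]
a420 = [0.05, 0.25, 0.45, 0.85]

slope, intercept = fit_line(rutin_mg_per_ml, a420)
# 2 mL combined extract (two 1 mL extractions) from about 100 mg of fiber.
content = flavonoid_mg_per_g(0.85, slope, intercept, extract_ml=2.0, sample_mg=100.0)
print(f"total flavonoid content: {content:.2f} mg/g")
```

Using the recorded precise sample weight rather than the nominal 100 mg keeps the per-gram value comparable across samples.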
The PA content was measured according to a previously reported method with some modifications (Tan et al. 2013). Approximately 100 mg of each sample was extracted with 500 μL of 80% (V/V) methanol with shaking at 4°C for 12 h. The residue was extracted again with 500 μL of 80% (V/V) methanol, and the two supernatants were combined as the extract solution. A total of 600 μL of 3 mol·L⁻¹ HCl/80% methanol (V/V = 1:1) containing 0.1% (W/V) DMACA was added to 40 μL of the extract solution, mixed thoroughly, and incubated at room temperature for 20 min in the dark. The absorbance was measured at 643 nm using a Multimode Plate Reader (PerkinElmer).

Fiber phenotypic characterization of brown and green cottons
We collected four colored cotton accessions, including three green cotton accessions (G1, G2, G3) and one brown cotton accession (T586/T). YZ1, a white cotton accession, was used as the control. The fiber color and length of these materials are shown in Fig. 1. The fiber color of the three green cotton accessions was yellow-green and similar among accessions. Another feature of these three green cotton fibers was their uneven coloration. The regions of the fibers near the base of the seed coat, which were wrapped inside, were dark green, but fibers exposed to the outside were light green or even white. The brown cotton T586 (T) fibers were uniformly colored (Fig. 1a-b).
Fig. 1 Fiber color and length of brown and green cottons. a The typical phenotype of ten mature seeds of white (YZ1), green (G1, G2, G3) and brown (T) colored cottons. All these accessions belong to Gossypium hirsutum; b Images of fibers from YZ1, G1, G2, G3, T; Bars = 1 cm; c The fiber length of YZ1, G1, G2, G3, T. Error bars represent the standard deviations. ** represents P < 0.01 based on Student's t test
The fiber lengths of these three different colored cottons were measured. White cotton YZ1 had the longest fiber, with an average length of 27.9 mm. The fiber lengths of the three green cotton accessions (G1, G2, G3) were similar to each other, at about 24 mm, but shorter than that of white cotton YZ1. The brown cotton T586 (T) had significantly shorter fibers than the white and green cottons, with a mean of 15.8 mm (Fig. 1b-c).

Expression analysis of flavonoid biosynthesis genes in brown and green cotton fibers
Flavonoids are thought to be involved in the formation of NCC fiber pigments (Feng et al. 2014; Hua et al. 2007; Liu et al. 2018; Tan et al. 2013; Yan et al. 2018). To investigate expression patterns, annotated flavonoid biosynthesis genes in the G. hirsutum genome were selected, and a heatmap for these genes in white cotton fiber was constructed to illustrate transcriptional changes during fiber development (Fig. 2). All genes in the flavonoid biosynthesis pathway had only one or two copies in each subgenome except CHS, which had 9 copies in the Dt subgenome and 8 copies in the At subgenome. Most genes of the flavonoid biosynthesis pathway had a similar expression pattern: they were highly expressed at the fiber initiation stage (0 DPA, 5 DPA), and their expression levels decreased during fiber development.
To reveal the relationship between endogenous flavonoid biosynthesis gene expression and fiber colors, the expression levels of flavonoid biosynthesis pathway genes were analyzed by qRT-PCR in NCC accessions (Fig. 3). F3H was the most abundantly expressed gene in brown cotton fiber, and its expression gradually increased during fiber development, with the highest expression level at 15 DPA. The stage with the highest F3H expression level in the green cotton lines was 10 DPA and in white cotton was 5 DPA. Moreover, the expression level of F3H in brown cotton fibers was 10 times higher than that in white and green cotton fibers.
As the first key enzyme in the flavonoid biosynthesis pathway, CHS plays an extremely important role in flavonoid metabolism. The expression level of CHS in brown cotton fibers was highest at 10 DPA. As with F3H, CHS expression peaked at 10 DPA in green cottons and at 5 DPA in white cotton. ANR catalyzes the synthesis of PAs, and the corresponding gene was also found to be highly expressed in brown cotton fiber. ANR showed the highest expression level at 10 DPA in the brown fibers, and its expression level was 4-5 times higher than that in white and green fibers.
ANS is a downstream gene in flavonoid metabolism whose product catalyzes the synthesis of anthocyanins. ANS was expressed in brown cotton fibers at a level 4-5 times higher than in white and green cotton fibers. Whether anthocyanins are involved in brown fiber pigmentation remains to be explored. Overall, the expression levels of flavonoid biosynthetic genes in brown cotton fibers were significantly higher than those in green and white cotton fibers.

Endogenous flavonoid contents in brown and green cotton fibers
We measured the total flavonoid contents (TFCs) in the ovules and fibers from 5 DPA to 20 DPA to determine whether they changed during the development of different colored fibers (Fig. 4a). White cotton and green cotton fibers had similar trends in TFCs during different developmental stages. In these accessions, the TFCs accumulated to the highest levels in 5 DPA ovule samples, about 8 mg·g⁻¹. In fibers, TFCs were the highest at 15 DPA. TFCs in all fiber samples were significantly lower than those in 5 DPA ovule samples. In contrast, in brown cotton the TFCs in 10 DPA, 15 DPA and 20 DPA fibers were remarkably higher than those in 5 DPA ovule samples. The TFCs in 10 DPA fibers were the highest (35 mg·g⁻¹); the levels decreased sharply in 15 DPA fibers, but then increased again in 20 DPA fibers. Overall, the TFCs in brown fibers were significantly higher than those in white and green fibers.

Proanthocyanidin (PA) contents in brown and green cotton fibers
To further investigate whether PA plays the same role in the pigmentation of green and brown cotton fibers, the PA contents were measured. The 4-dimethylaminocinnamaldehyde (DMACA) staining method, which gives a blue coloration in the presence of PA, was employed to visualize PA in mature fibers. Brown cotton fibers showed the presence of PA, while white and green fibers showed no difference from controls (Fig. 5). These results indicated that PA accumulated in mature brown cotton fibers, but was not detectable in mature fibers of white and green cottons.
We also checked the PA contents in immature fibers (Fig. 4b-c). As with the total flavonoids, the PA contents at different developmental stages in white and green cotton fibers were similar, but significantly lower than those in brown fiber samples. The PA content in brown cotton fibers was highest at 10 DPA, and decreased slightly at 15 DPA and 20 DPA. In summary, brown cotton accumulated significantly higher levels of PA than green and white cottons.

Expression analysis of the lignin and lignan biosynthesis pathway genes in brown and green cotton fibers
The caffeoyl and caffeoyl glycerides in the extracts of green cotton fibers have been studied (Feng et al. 2017; Ma et al. 2015). Caffeic acid and caffeoyl-CoA are the intermediate metabolites of lignin and lignan metabolism (Davin and Lewis 2000). Therefore, qRT-PCR was used to detect the expression levels of lignin and lignan biosynthetic pathway genes in these three types of cotton fibers (Fig. 6).
PAL, C4H and 4CL are the most upstream genes in phenylpropanoid metabolism, and are involved in the synthesis of not only flavonoids, but also lignin and lignan. The expression levels of PAL and C4H in brown fibers were significantly higher than those in white and green fibers. PAL and C4H transcripts accumulated to the highest levels in 5 DPA samples of white cotton (Fig. 6, Table S4), but in 10 DPA fibers of green cottons. The expression levels of these two genes in green fibers were slightly higher than in white fibers. The expression of 4CL in brown cotton was higher than in white and green cottons in ovules at 0 DPA and 5 DPA, but lower than in green fibers at 10 DPA and 15 DPA.
HCT is the first key enzyme in the lignin synthesis pathway. The expression level of this gene in white and green fibers was higher than that in brown fibers, and the expression levels in G1 fibers were significantly higher than those in white fibers. CCoAOMT and COMT are downstream genes in lignin metabolism, and influence the biosynthesis of monolignols, which are further used to synthesize lignin or lignan. The green fibers showed slightly higher expression levels of these genes than brown or white fibers at 10 DPA and 15 DPA (Fig. 6). PCBER encodes a key enzyme in the metabolism of lignan, and is a novel candidate gene that is potentially responsible for pigmentation in green cotton fibers. Similar to HCT, PCBER showed higher expression levels in white and green cotton fibers than in brown fibers (Fig. 7).
Green and brown cottons are the two major commercial NCC types in the world. Determining the pigment components of colored fibers is the first key step in breeding cotton cultivars with improved natural coloration. Most research has been carried out on the pigmentation of brown cotton fibers, with little known about green fiber pigments. We therefore chose three green cotton accessions, one brown cotton accession and one white cotton accession to study the differences among these three types of cotton. Unlike brown cotton, green cotton showed uneven coloration on fibers (Fig. 1). Also, the green cotton fiber color changed to brown when treated with HCl/ethanol solution (Fig. 5), which is likely due to the instability of the green fiber pigments. Previous studies have shown that the coloration of green cotton fiber was easily changed by oxidants, reductants, metallic ions, alkalis, UV exposure and/or high temperature (Günaydin et al. 2019; Zhang and Hu 2003). All these results suggested that the fiber pigment components of brown and green cotton are different.
Flavonoids are one of the three major classes of plant pigments and include six major subgroups, such as chalcones, anthocyanins and proanthocyanidins. Intensive biochemical and transcriptomic analyses have indicated that flavonoid biosynthesis, and especially PA biosynthesis and accumulation, played a key role in the coloration of brown cotton fibers (Feng et al. 2014; Gong et al. 2014; Li et al. 2013; Yan et al. 2018). In agreement with previous studies, we found that flavonoid metabolism was transcriptionally activated in brown cotton fibers, and high levels of flavonoids were synthesized during fiber development (Figs. 2 and 5).
The relationship between green fiber pigmentation and flavonoids is still controversial. Previous studies that measured flavonoid contents during fiber development concluded that flavonoids are the dominant pigments in green cotton fibers (Hua et al. 2007; Yuan et al. 2012). A further study found, based on DMACA staining, that PAs were not the pigments in green cotton fibers. However, a recent study using transcriptomic and transgenic analyses of green and brown cotton suggested that the flavonoid biosynthetic pathway controlled green fiber pigmentation (Liu et al. 2018).
Our results showed that the differences in flavonoid metabolism between green and white fibers were not as significant as those between brown and white fibers (Figs. 2 and 5). The expression levels of flavonoid metabolism genes in green fibers were similar to those in white fibers and significantly lower than in brown fibers (Fig. 3), which was consistent with the measurement of flavonoid contents (Figs. 4 and 5). The measurement of PA contents and DMACA staining of green fibers also indicated that PA was not the accumulated pigment in green fibers. These results suggest that flavonoids are not the key determinant of pigmentation in green cotton fibers.

Lignin and lignan biosynthesis pathways were slightly activated at the transcriptional level during the development and coloration of green cotton fibers
Caffeic acid is a key intermediate in the biosynthesis of lignin and lignan (Davin and Lewis 2000), and caffeic acid derivatives have been detected in green cotton fibers (Feng et al. 2017; Ma et al. 2015; Schmutz et al. 1993; Schmutz et al. 1994). Furthermore, colored cotton fibers have been found to contain more lignin and lignan than white cotton fibers (Ioelovich and Leykin 2008; Li et al. 2018). However, the comparison of lignin contents between green and brown cotton fibers depends on the varieties tested. Some brown cotton fibers contained higher total lignin contents than green cotton fibers (de Morais et al. 2010), but others showed the opposite (Ioelovich and Leykin 2008).
We checked the expression levels of six key genes involved in caffeic acid and lignin biosynthesis to gain insights into whether this pathway participates in green fiber development. The phenylpropanoid pathway was significantly up-regulated in brown fibers compared with white and green fibers (Fig. 6), consistent with previous reports. The expression levels of PAL and C4H in brown fibers were remarkably higher than those in white and green fibers. However, the expression levels in brown fibers of the genes directing metabolic flux toward lignin biosynthesis were similar to or slightly lower than those in white and green fibers, implying that a large share of phenylpropanoid flux was directed to flavonoids in brown fibers.
Although most of the caffeic acid, lignin and lignan biosynthesis genes in green fibers did not exhibit noticeably increased expression levels compared with white and brown fibers, they did show slightly higher expression levels at some stages of fiber development. C4H, 4CL and HCT are the enzymes directly responsible for caffeic acid and caffeoyl-CoA synthesis (Vanholme et al. 2012). At 10 DPA and 15 DPA, the expression levels of C4H and 4CL in green fibers were higher than those in white fibers, and fibers of the green accession G1 had a significantly higher expression level of HCT than white fibers (Fig. 6). A similar situation was also seen for lignan metabolism. 15 DPA marks the onset of secondary cell wall biosynthesis, and is also an important stage for the initiation of pigmentation in colored cotton fibers (Kim 2015; Yuan et al. 2012). Our results indicated that the caffeic acid derivative, lignin and lignan biosynthesis pathways were activated during the development and coloration of green fibers, which may explain why green fibers have higher lignan and caffeic acid derivative contents than white fibers. Detailed biochemical and transcriptional systems biology analyses should be carried out to investigate the precise roles of the caffeic acid derivatives, lignin and lignan in the pigmentation of green cotton fibers.
[Figure caption residue: error bars represent the standard error of three biological replicates.]
Suberin is a biopolymer analogous to cutin that is found in some specialized plant cell walls (Cohen et al. 2017; Graca 2015). It is composed of very-long-chain aliphatic acid derivatives and glycerol, linked with phenolics and embedded with waxes. Typically, the phenolic components are ferulic acid, caffeic acid, coumaric acid and monolignol derivatives (Cohen et al. 2017; Vishwanath et al. 2015), which are derivatives of phenylpropanoid metabolism.
Interestingly, transmission electron microscopy observation of cotton fiber revealed that suberin lamellae were found only in the cell wall of green cotton fibers (Ryser et al. 1983). Caffeic acid and glycerol have been detected in the extracts of green fibers, and the presence of these two chemicals in the suberin of green fiber has been confirmed in subsequent studies, leading to the proposition that they could be the pigments in green fibers (Schmutz et al. 1996). Surprisingly, by comparing previous studies on the location of pigments and suberin lamellae in green cotton fibers, we found that both were deposited in alternating layers with cellulose in the secondary cell walls of fibers (Ryser et al. 1983; Zhang et al. 2011). Suberin lamellae must, therefore, be a key feature of green cotton fibers and involved in fiber coloration.
Since some caffeic acid derivatives have a yellow-green color and have been detected in the extracts of green fibers (Feng et al. 2017), caffeic acid derivatives are likely to be some of the pigments in green fibers. Monolignol derivatives and lignan might act as structural components of suberin. So far, few studies have focused on this particular cell wall structure as compared with other components in plant cell walls. More effort is needed in this area and on the relationship between the suberin lamellae and lignin and lignan. A comprehensive research effort on suberin lamellae will greatly assist in understanding the control of green cotton pigmentation and inform fiber quality breeding in green cotton cultivars.

Conclusions
A comprehensive analysis of phenylpropanoid metabolism during fiber development was carried out on five cotton accessions with three different kinds of natural coloration (three with green, one with brown and one with white fiber). In brown cotton fibers, but not in green cotton fibers, the expression levels of flavonoid structural genes were significantly higher, and total flavonoids and PA accumulated to higher levels, than in white cotton fibers during fiber development. We therefore conclude that flavonoids are not a key determinant of green cotton fiber pigmentation. Compared with white cotton fibers, lignin and lignan biosynthesis was activated in the fibers of green cotton during early development.
Additional file 1: Table S1. Information on prey sequences used for BLAST analysis. Table S2. Primers used in this study. Table S3. Formula of standard samples for total flavonoid content measurement. Table S4. FPKM of lignin and lignan biosynthesis genes in cotton fibers.