Cotton germplasm improvement and progress in Pakistan

Cotton (Gossypium spp.) contributes significantly to the economy of cotton-producing countries. Pakistan is the fourth-largest producer of cotton after China, the USA and India. The average yield of cotton in Pakistan is about 570.99 kg·hm⁻². Climate change and various biotic stresses are reducing cotton production. Transgenic approaches offer a unique advantage in tackling these problems. However, how to confer permanent insect resistance on cotton through genetic modification remains a major challenge. Transgenic cotton has proven effective, but its effectiveness depends on several factors, including heterogeneity, seed purity, diffusion of varieties, backcrossing and ethical concerns. Cotton biotechnology was initiated in Pakistan in 1992–1993 with a focus on developing cotton leaf curl virus (CLCuV)-resistant and insect-resistant varieties and on improving fiber quality. This review summarizes the use of molecular markers, QTLs, GWAS, and gene cloning for cotton germplasm improvement, particularly in Pakistan.


Introduction
Cotton is known as "white gold" in cotton-producing countries, and it was grown on over 33 million hectares worldwide in 2019 (Tarazi et al. 2020). Cotton consumption is increasing in step with the rapid growth of the global population (FAO 2019). Pakistan is an agricultural country, and cotton is its second most important crop, with a significant role in the economy. It contributes 0.8% to the overall GDP and 4.5% to agricultural value addition (Rehman et al. 2019). During 2018-2019, production fell by 17.5%, to 9.861 million bales from 11.946 million bales in 2017-2018. This decrease was attributed to reduced incentives for farmers compared with the previous year, which shrank the cultivated area from 2 700 thousand to 2 373 thousand hectares (Economy Survey of Pakistan 2018-2019). Pakistan is ranked fourth in cotton production worldwide. Since the introduction of cotton biotechnology in Pakistan in 1992-1993, several measures have been taken to improve cotton quality and yield. This review discusses breeding programs, cloning, cotton transformation, utilization of germplasm sources and molecular marker-based technologies, together with the capacity building of various funding and research agencies.

Cotton germplasm
Cotton (Gossypium spp.) belongs to the Malvaceae family and is found on the Indian subcontinent and in America and Africa. The end product of cotton is fiber, the backbone of the textile industry, which develops in the bolls through a series of processes after pollination of the flower (Bakhsh et al. 2009). The genus Gossypium contains 50 species, of which only four are grown commercially (Wendel and Cronn 2003). The most widely grown species is G. hirsutum L., which contributes almost 80% of total cotton production in Asia. The genetic resources of cotton are immense, dispersed over five continents, and classified into primary, secondary and tertiary germplasm pools (Wendel et al. 1994). Ploidy varies widely within the genus, which makes its classification difficult. Several classifications have been proposed, but the most widely accepted is that of Chen and Gallie (2004), which is based on chromosome pairing affinities. Of the 50 classified species, 45 are diploid (2n = 26) and 5 are tetraploid (2n = 52) (Chen et al. 2014). Mexico is considered the center of origin of G. hirsutum, from where it spread over Central America and the Caribbean. According to archaeobotanical surveys, G. hirsutum was domesticated within the Mesoamerican gene pool (Wendel et al. 1994; Brubaker et al. 1999). Although Asiatic or desi cotton (G. arboreum) gives a low yield, it has many important agronomic characteristics, e.g. good fiber strength with remarkable plasticity, better insect resistance and a stronger capacity to grow under poor conditions than G. hirsutum.

Cotton germplasm resources in Pakistan
Many countries have a history of cotton germplasm production, such as China, Brazil, the United States, India and Pakistan (Robinson et al. 2007). G. arboreum is indigenous to Pakistan and evolved from G. herbaceum L. (Rehmat et al. 2014; Hutchinson 1954). These genotypes have been characterized by morphological, agronomic and physiological features and show tolerance to drought and insect pests (Rehmat et al. 2014).

Evolution of G. arboreum L. in Pakistan
Due to limited cross-pollination or mixing of the seeds, most of the cotton germplasm of G. arboreum has been obtained within several variants. Natural selection from a single population resulted in a narrow genetic base of the cultivars. Initially, two cotton varieties, Z. Mollisoni and 278-Mollisoni, were developed to replace old varieties. The Cotton Research Institute (CRI) was established in Faisalabad for the improvement of varieties; the work was initiated by Trought T. and later continued by Afzal M. The improved cotton line 15-Mollisoni was tested in national trials for 13 years and approved for cultivation in 1930 on account of its high ginning outturn (GOT) of 35%, compared with 34% for Mollisoni and 33% for the mixture cultivated in farmers' fields. Another variety, 39-Mollisoni, was then tested and showed 36%∼37% GOT compared with 35% for 15-Mollisoni (Rahman et al. 2012).
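The GOT values compared above are percentages of lint recovered from seed cotton by weight. A minimal sketch of that calculation is given below; the function name and the sample weights are hypothetical illustrations and are not taken from the cited trials.

```python
def ginning_outturn(lint_weight_g: float, seed_cotton_weight_g: float) -> float:
    """Return ginning outturn (GOT) as a percentage of seed cotton weight."""
    if seed_cotton_weight_g <= 0:
        raise ValueError("seed cotton weight must be positive")
    if not 0 <= lint_weight_g <= seed_cotton_weight_g:
        raise ValueError("lint weight must lie between 0 and the seed cotton weight")
    return 100.0 * lint_weight_g / seed_cotton_weight_g


# Hypothetical sample: 35 g of lint recovered from 100 g of seed cotton.
sample_got = ginning_outturn(35.0, 100.0)
print(f"GOT = {sample_got:.1f}%")
```

On this definition, a one-point difference in GOT between two lines corresponds to one extra gram of lint per 100 g of seed cotton ginned.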

Cotton varieties developed by local crossovers
The Bt cotton variety CIM-775 was selected from crosses between local cultivars and accessions from the United States Department of Agriculture (USDA) National Plant Germplasm System, and it secured 2nd position for yield performance among 102 varieties in the National Coordinated Varietal Trials (NCVT) conducted in 2019-2020. It is a cotton leaf curl virus (CLCuV)-tolerant variety with an increased yield potential of 50-60 pounds per hectare, a staple length of 28.6 mm, a lint percentage of 39.5% and a micronaire value of 4.3. The Bt variety CIM-303 was also developed by crossing USDA National Plant Germplasm System accessions with local germplasm, and it shows promising CLCuV tolerance. The cotton varieties developed by crossing NIAB 999 and NIAB 111 were early maturing and high yielding, with heat and CLCuV tolerance.
The mutant cotton variety Chandi 95 was approved in 1982. It was developed by gamma irradiation (300 Gy). The cotton variety NIAB 78 was developed by irradiation-induced mutation and was a derivative of the (AC-134 × Deltapine) F1. It was approved by the Punjab Seed Corporation (PSC) for general cultivation in Punjab in 1983. The introduction of this variety increased production from 3 million bales in 1983 to 12.8 million bales in 1991-1992. In 2008, the cotton variety NIAB-846, developed by crossing NIAB 78 with REBA 288 (pollen irradiated with 10 Gy of gamma rays), was approved by PSC. This variety was resistant to CLCuV and CLCuV-B (Burewala strain) and was heat tolerant. NIAB 777 was approved in 2009; it is resistant to CLCuV-B and suitable for high planting density, as shown in Table 1. CRI Faisalabad has carried out a breeding program for desi cotton at different research stations, including Haroonabad, where breeding started in 1952 in a drought-prone area. Based on leaf morphology and color, four candidate lines were identified. Among the varieties tested, 73/3 was found superior, with 42% GOT and a staple length of 13.7 mm, compared with a cultivated mixture with 37%∼38% GOT and 16∼19 mm staple length. It was noted that the yield of the newly developed varieties was comparable with that of the cultivated mixture of desi cotton. These observations led breeders to abandon varietal improvement by selection (Campbell et al. 2010). G. arboreum was replaced by high-yielding G. hirsutum, and several cotton varieties were then developed in Pakistan by hybridization and mutation. Some approved varieties of Bt and non-Bt cotton in Pakistan are listed in Table 2.
Genetic engineering and biotechnology have played a vital role in the development and application of transgenes and in boosting Pakistan's economy through agriculture. Plant biotechnology has enabled researchers to incorporate foreign genes that control traits such as drought resistance, fiber quality, herbicide resistance, CLCuV resistance and pest resistance. The rapid adoption of genetically modified (GM) cotton has increased planting acreage and productivity in many countries around the world, including Pakistan.
The Central Cotton Research Institute (CCRI), Multan, is working to develop new cotton varieties with stress tolerance and desirable fiber traits. The CCRI maintains 6 030 accessions of four cotton species. Many varieties have been approved for cultivation during the last few decades. For example, Bt.CIM-598 has been approved for cultivation in Sindh province. Bt-CIM-632 and Bt-CIM-610 have completed their 2-year trials in NCVT, while Bt-CIM-663 and Bt-CIM-343 have completed their first-year trial. The CCRI evaluated 33 Bt lines and 15 non-Bt lines at Multan and Khanewal for desirable characteristics, and the data were presented at the 77th expert sub-committee in March 2018. The exotic lines Mac-07 and AS-0349 were crossed for resistance against CLCuV in filial generations.
Pakistan Central Cotton Committee (PCCC) was established to create funds for the development and marketing of cotton. The PCCC developed many different cotton varieties in 1985-2018 as listed in Table 3.

Molecular marker technology in cotton
Molecular markers are very useful for molecular characterization and identification of genetic variation, and have been used in marker-assisted selection (MAS) and genome fingerprinting (Kalia et al. 2011). Molecular markers are important in genomics research, although they may or may not be linked with the phenotypic expression of a character in an organism (Agarwal et al. 2008). In cotton genomes, the most important molecular markers are polymerase chain reaction (PCR)-based markers because of their high effectiveness and wide use; these include inter simple sequence repeats (ISSRs) (Reddy et al. 2002), amplified fragment length polymorphisms (AFLPs) (Abdalla et al. 2001; Alvarez and Wendel 2006), simple sequence repeats (SSRs) (Liu et al. 2000; Zhu et al. 2003) and random amplified polymorphic DNA (RAPD) (Tatineni et al. 1996; Xu et al. 2001; Lu and Myers 2002). Among the genomic resources, about 16 162 SSRs and 312 mapped cotton restriction fragment length polymorphism (RFLP) sequences are publicly available. RFLP, SSR, AFLP and RAPD markers have been applied in different mapping populations to develop linkage maps. DNA markers have been reported to be associated with over 29 important traits, such as fiber quality and yield, leaf and flower morphology, trichome density and distribution, and disease resistance (Rahman et al. 2012). The advent of cotton leaf curl virus (CLCuV) proved to be a compelling reason to design novel strategies for cotton breeding programs in Pakistan. Before the CLCuV epidemics, the genetic similarity among the elite cotton varieties (Gossypium spp.) in Pakistan was 81.5%∼93.41%. New cultivars were developed by crossing exotic resistant germplasm with germplasm susceptible to CLCuV (Rehman et al. 2012). A study was designed to assess the genetic diversity or relatedness among the newly released, extremely resistant and medium-resistant cultivars.
Different methods, such as field evaluation, whitefly-transmission studies, grafting, dot-blot hybridization and multiplex PCR using conserved primer sequences, were employed to screen 27 cotton genotypes. Twenty extremely resistant and resistant cultivars were selected for RAPD analysis. The genetic similarity of exotic germplasm with the elite cultivars was in the range of 81.45%∼90.59%. Similarly, the genetic similarity among the elite cultivars was 81.58%∼94.90%, and the average genetic similarity among all the genotypes studied was 89.55%. It was concluded that only cultivar VH-137 possessed a diverse genetic background. The study also emphasized breeding for high genetic diversity to serve as a buffer against potential epidemics (Rahman et al. 2002). CLCuV has caused significant losses to the economy of Pakistan, and its rapid transmission is a major threat to neighboring cotton-growing countries such as China and India. In a more recent study, 10 cotton genotypes with different tolerance levels were taken from the cotton germplasm resource at the National Institute of Biotechnology and Genetic Engineering (NIBGE), Faisalabad, Pakistan. In total, 322 SSRs derived from bacterial artificial chromosome end sequences of G. raimondii were screened. Of these, 65 primer pairs were polymorphic, and the genetic similarity was in the range of 81.7%∼98.7%. Among the polymorphic markers, only two SSR markers, PR-91 and CM-43, were amplified in the CLCuV-tolerant genotypes and showed significant association with tolerance to the disease (Abbas et al. 2015).
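Genetic similarity percentages of this kind are typically derived from presence/absence scoring of marker bands. The sketch below shows one common way to do this, the Nei-Li (Dice) coefficient. This is an assumption for illustration: the text does not state which coefficient the cited studies used, and the genotype names and band scores are hypothetical.

```python
from itertools import combinations


def dice_similarity(bands_a, bands_b):
    """Nei-Li (Dice) similarity: 2 * shared bands / (bands in a + bands in b)."""
    if len(bands_a) != len(bands_b):
        raise ValueError("band profiles must score the same loci")
    shared = sum(1 for a, b in zip(bands_a, bands_b) if a == 1 and b == 1)
    total = sum(bands_a) + sum(bands_b)
    return 2.0 * shared / total if total else 0.0


def pairwise_similarity(profiles):
    """Return {(name_1, name_2): similarity} for every pair of genotypes."""
    return {
        (a, b): dice_similarity(profiles[a], profiles[b])
        for a, b in combinations(profiles, 2)
    }


# Hypothetical presence (1) / absence (0) scores at eight band positions.
profiles = {
    "genotype_A": [1, 1, 0, 1, 1, 0, 1, 1],
    "genotype_B": [1, 1, 0, 1, 0, 0, 1, 1],
    "genotype_C": [1, 0, 1, 1, 1, 0, 0, 1],
}

for pair, similarity in pairwise_similarity(profiles).items():
    print(pair, f"{100 * similarity:.1f}%")
```

A matrix of such pairwise values is what a clustering step would take as input. Similarities that are uniformly high across all pairs, as reported above, indicate a narrow genetic base.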
The epidemic virus has spread to cotton-growing countries including the USA, China, Pakistan and India. Both pathogenic and non-pathogenic approaches for the development of transgenic cotton are in progress. The use of DNA markers together with transgenics and the CRISPR/Cas system for the insertion of resistance markers into adapted cultivars would be instrumental in counteracting the disease caused by CLCuV.
A summary of the genomic diversity evaluated with genetic markers in Pakistan and other major cotton-growing countries is given in Table 4.

Status of marker-based improvement of cotton in Pakistan
Recent advances in cotton biotechnology such as DNA sequencing technology and genome editing have changed the path of marker-based crop improvement. This has given valuable standards with improved status of plants (Yin et al. 2019). These techniques facilitate the discovery of SNPs which are useful and extremely saturated markers for cotton genomic research. In summary, all the markers have their own advantages and limitations and the choice of these markers depends on the selected trait and the nature of the work. Based on the SSR marker polymorphism analysis, cultivars released in Pakistan since 1914 showed relatively low genetic diversity (Khan et al. 2010).
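Marker-based genetic diversity is often summarized per locus from allele frequencies. The sketch below computes Nei's gene diversity and the polymorphism information content (PIC) for one SSR locus. It is an illustration only: the text does not say which statistic the cited SSR analysis used, and the allele sizes are hypothetical.

```python
from collections import Counter
from itertools import combinations


def allele_frequencies(alleles):
    """Return {allele: frequency} for the alleles scored at one locus."""
    if not alleles:
        raise ValueError("at least one allele call is required")
    counts = Counter(alleles)
    n = sum(counts.values())
    return {allele: count / n for allele, count in counts.items()}


def gene_diversity(freqs):
    """Nei's gene diversity: 1 - sum(p_i^2)."""
    return 1.0 - sum(p * p for p in freqs.values())


def pic(freqs):
    """PIC: gene diversity minus sum over allele pairs of 2 * p_i^2 * p_j^2."""
    ps = list(freqs.values())
    cross = sum(2.0 * (p * p) * (q * q) for p, q in combinations(ps, 2))
    return gene_diversity(freqs) - cross


# Hypothetical SSR locus scored in 10 inbred cultivars (allele sizes in bp).
calls = [150] * 8 + [154] * 2
freqs = allele_frequencies(calls)
print(f"gene diversity = {gene_diversity(freqs):.4f}, PIC = {pic(freqs):.4f}")
```

A locus dominated by one allele, as in this example, gives low values for both statistics. Averaging them over many loci gives a single figure that can be compared between sets of cultivars.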

Cotton QTL studies in Pakistan
Many factors determine fiber quality, including fiber color, strength, length, micronaire and the corresponding yield (Lokhande and Reddy 2014). Certain advancements have been made in the spinning performance of cotton fiber that significantly improve fiber quality and yield-associated parameters (Kohel et al. 2001). In cotton breeding, the most crucial issue is the negative correlation between lint yield and fiber quality (Shen et al. 2007). Therefore, researchers have used QTL analysis to identify the genomic regions carrying linked genes related to lint yield and fiber quality and to develop linkage maps. The first QTL mapping was reported in 1996, and a large array of QTLs has since been identified using molecular marker technology. Several databases have been developed to date. CottonGen is one of them and contains information on 988 quantitative loci for 25 diverse characters (http://www.cottongen.org/data/qtl). Currently, 20 QTLs are recognized for fiber attributes (Shang et al. 2015), and these attributes were associated with 59 loci (Tan et al. 2015). This progress allows cotton breeders to improve yield and yield-related parameters and contributes to the economy of the region. The QTLs identified in cotton germplasm using different marker technologies have been summarized and are commonly used for the genetic improvement of various agronomic traits in Pakistan. In one study, a total of 185 cotton genotypes were evaluated for lint traits at different locations for 3 consecutive years. Genetic variation was evaluated for different traits, and IR-NIBGE showed the maximum ginning outturn (GOT) of 43.63%. Ward's method was used to cluster the genotypes. Out of 382 SSRs, 95 polymorphic SSR primer pairs were surveyed on the 185 genotypes. A total of 75 marker-trait associations were calculated, of which only MGHES-51 was associated with all traits.
This study indicates that the high frequency of favorable alleles in cultivated varieties is possibly due to the fixation of desired alleles during domestication. These alleles can be further exploited in marker-assisted breeding or gene cloning using next-generation sequencing tools (Iqbal and Rahman 2017). Saleem et al. (2018), from the Department of Plant Breeding and Genetics, BZU-Multan, reported drought-tolerance QTLs identified using DNA markers (NAU-2954, NAU-2715, NAU-6672, NAU-8406 and NAU-6790) among 44 drought-related varieties in Pakistan. The varieties were cultivated to observe excised leaf water loss, relative water content and cell membrane stability under drought stress. It was noted that chromosome 23 harbors the QTL qtlRWC-1 for relative water content, linked with marker NAU-  (Saleem et al. 2018). Additionally, the National Institute of Biotechnology and Genetic Engineering (NIBGE) screened 322 SSR markers derived from bacterial artificial chromosome end sequences of G. raimondii; the markers PR-91 and CM-43 showed an association with resistance against CLCuV (Abbas et al. 2015). Using RAPD markers, two QTLs for relative water content were detected with the nearest markers NAU2954 and NAU2715, on chromosomes 23 and 12, respectively, at the Department of Plant Breeding and Genetics, BZU-Multan (Saleem et al. 2015). RAPD analysis with three primers, OPO-19, OPQ-14 and OPY-2, was employed to assess the genetic diversity of 18 cotton varieties in Pakistan. The primers yielded amplification products of 470 bp (base pairs), 325 bp and 10 701 bp, with selection efficiencies of 27.7%, 67.1% and 44.4%, respectively (Mumtaz et al. 2010). Similarly, Diouf et al. (2018) identified 1709 genes with 4 QTLs present in two regions, named cluster 1 and cluster 2.
Among the 1709 genes, only 153 showed higher expression levels than the remaining genes across all fiber development stages. Furthermore, five important genes playing a vital role in fiber development were also identified, namely Gh_D03G0889, Gh_D12G0093, Gh_D12G0969, Gh_D12G0410 and Gh_D12G0435 (Diouf et al. 2018).
Hence, it can be concluded that the QTL technique has two complementary uses (Prioul et al. 1997): the first focuses on QTLs that target the physiological components of macroscopic characters, whereas the second is marker-assisted breeding (MAB). QTLs are used to tag, analyze and pyramid favorable alleles and to break their linkage with unfavorable genes in cotton (Lee 1995; Ordon et al. 1998; Ribaut and Hoisington 1998).

Genome-wide association studies (GWAS) in Pakistan
Linkage disequilibrium (LD) mapping, also known as association mapping, is an effective way to dissect variation in complex traits by exploiting historical and evolutionary recombination events at the population level (Nordborg and Tavaré 2002). GWAS is an important tool for identifying QTLs and dissecting the genetic control of complex quantitative characters (Saeed et al. 2014; Islam et al. 2016). In association mapping, unstructured populations are phenotyped and genotyped to identify traits that are linked with genetic markers (Myles et al. 2009). Association mapping in cotton supports large-scale utilization of the natural genetic diversity conserved within cotton germplasm (Abdurakhmonov 2007). Abdurakhmonov et al. (2008) analyzed genome-wide LD and association mapping of fiber-related characters in 285 exotic cotton accessions using 95 SSR markers. Similarly, 202 SSRs were used for LD-based association mapping of fiber quality characters in 335 cotton genotypes (Abdurakhmonov et al. 2009).
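The strength of LD between two loci is commonly expressed as r². A minimal sketch for two biallelic loci follows. It assumes that each accession is a homozygous inbred line, so that its genotype can be treated as a haplotype, and the allele coding (0/1) and sample data are hypothetical. Multi-allelic SSR data, as used in the studies cited above, would need a generalized measure.

```python
def ld_r_squared(haplotypes):
    """r^2 between two biallelic loci from (allele_at_A, allele_at_B) pairs coded 0/1."""
    n = len(haplotypes)
    if n == 0:
        raise ValueError("at least one haplotype is required")
    p_a = sum(a for a, _ in haplotypes) / n          # frequency of allele 1 at locus A
    p_b = sum(b for _, b in haplotypes) / n          # frequency of allele 1 at locus B
    p_ab = sum(1 for a, b in haplotypes if a == 1 and b == 1) / n
    d = p_ab - p_a * p_b                             # disequilibrium coefficient D
    denom = p_a * (1 - p_a) * p_b * (1 - p_b)
    if denom == 0:
        raise ValueError("both loci must be polymorphic")
    return d * d / denom


# Hypothetical panel of 10 inbred accessions scored at two loci.
panel = [(1, 1)] * 4 + [(0, 0)] * 4 + [(1, 0), (0, 1)]
print(f"r^2 = {ld_r_squared(panel):.2f}")
```

Values near 1 mean that a marker allele predicts the allele at the second locus almost perfectly, which is the property association mapping relies on. Values near 0 mean the two loci segregate independently in the panel.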

Gene cloning
Map-based gene cloning is a fundamental approach for identifying and exploiting quantitative agronomic characters. Conventional map-based cloning techniques are effective but laborious and time-consuming because of the complex genome of cotton (Zhu et al. 2017). Therefore, different techniques have been developed to explore the function of genes of interest and to transform these genes into crops. In cotton, new transformation vectors and new strategies for gene cloning and gene editing provide a great opportunity to introduce new characters and improve yield-related traits that cannot be developed through conventional methods. These characters include herbicide resistance (Bayley et al. 1992), glyphosate resistance (Zhao et al. 2006), reduced gossypol content in cottonseed (Sunilkumar et al. 2006) and resistance to bollworm (Rashid et al. 2008) and aphids (Wu et al. 2006). A milestone in cotton research was the development of genetically modified (GM) cotton containing a Bacillus thuringiensis (Bt) gene. Globally, transgenic cotton is grown on more than 33 million hectares (Tarazi et al. 2020). To produce an ideal transgenic plant, an appropriate gene construct is necessary. Besides the desired gene, the vector also includes a reporter gene, a selectable marker, an appropriate promoter for gene expression and a terminator to make transformation efficient. The CaMV35S (cauliflower mosaic virus 35S) promoter is a constitutive promoter that is widely used in transgenic cotton. Selectable marker genes (antibiotic- and herbicide-resistance genes, anti-metabolic genes) and reporter genes (green fluorescent protein (GFP), beta-galactosidase (LacZ), luciferase (Luc), chloramphenicol acetyltransferase (CAT) and beta-glucuronidase (GUS)) help to detect plants that express the transgene (Zapata et al. 1999).
Furthermore, a series of pCAMBIA vectors is widely used worldwide, and these vectors are commonly used in Pakistan for gene cloning in cotton. The presence of kanamycin and bialaphos herbicide resistance genes as selection markers and GUS or GFP as reporter genes in these vectors makes them more efficient and unique in nature. Zapata et al. (1999) reported the use of gramineous expression vectors pGU4AGBar and pGBIU4AGBar (Hou et al. 2003; Rao et al. 2016). Some successful transformations along with promoter and other essential elements are described in the gene transformation section.
Table 4 (fragment; the preceding row cites Tatineni et al. 1996). Country: Pakistan. Marker type: RAPDs. Plant materials: 31 Gossypium species, 3 subspecies and 1 interspecific hybrid; 18 cotton varieties; 11 colored cottons (10 G. hirsutum, 1 G. arboreum) and 5 white-linted genotypes (4 G. hirsutum, 1 G. arboreum); SL 7-9 crossed with FH-634 to raise F2 and F3 segregating populations. Primers: 45, 03, 45 and 400 RAPD primers. References: Khan et al. 2000; Mumtaz et al. 2010; Khan et al. 2010; Ali et al. 2009.

Gene transformation approaches in cotton
Many methods are used for genetic transformation in cotton that have their own advantages and limitations but Agrobacterium and microprojectile bombardment are currently the most commonly and widely used procedures for gene transformation (Dai et al. 2001).
In the past decade, scientists developed genetic transformation techniques for cotton (G. hirsutum) using the Agrobacterium-mediated shoot apex cut method and sonication-assisted Agrobacterium-mediated transformation. Agrobacterium-mediated gene transformation has become the most reliable method for generating transgenic cotton in Pakistan. It not only changed the way DNA is delivered but also drove the expansion of efficient transformation vectors.

Efficient methods and successful genetic transformations in cotton
Different protocols have been used, such as meristem transformation (Gould et al. 1991;McCabe and Martinell 1993;Zapata et al. 1999) via either the gene gun or Agrobacterium for transformation in cotton plants.

Agrobacterium-mediated gene transformation
Agrobacterium-mediated gene transformation has been the preferred method for transferring foreign genes, such as the Cry1Ab and Cry1Ac genes of Bacillus thuringiensis, into cotton to develop insect-resistant transgenic plants (Singh et al. 2004). For example, Mao et al. (2011) developed an insect-resistant transgenic cotton expressing dsCYP6AE14 by genetic transformation of hypocotyl and cotyledon explants. Certain cotton cultivars have been transformed using this technique and plants subsequently regenerated through embryogenesis; however, commercially important varieties have proved recalcitrant because of their inability to develop embryogenic tissues. The chloroplast localization of Cry1Ac and Cry2A proteins was successfully achieved in cotton, and 100% mortality was obtained in 2nd instar larvae of the targeted insect after feeding for 72 h. A local cotton cultivar, MNH-786, was transformed with pKian-1, and the stable incorporation of the TP-Cry1Ac-RB construct in putative transgenic plants was confirmed by polymerase chain reaction (PCR), while fusion-protein expression in the chloroplast as well as in the cytoplasm was demonstrated by western blot analysis. Confocal microscopy of leaf sections confirmed that the hybrid Bt protein is expressed within the chloroplasts (Kiani et al. 2013).
Furthermore, the cultivar MNH-786 was modified by transformation with herbicide and insect resistance genes. The Cry1Ac + Cry2A and GT (herbicide-resistance) genes were cloned in a different cassette using the 35S promoter. The apex portions of mature embryos of the MNH-786 cultivar were injured with a blade and infected with an Agrobacterium tumefaciens strain containing the transgene constructs. The transformed cotton plants were acclimatized in pots and later grown under greenhouse conditions. PCR and ELISA confirmed the presence of the transgene and expression of its protein in the transformed plants. Transformation efficiency was 1.05%. All larvae of Helicoverpa armigera feeding on leaves of T0 transgenic cotton were found dead, in contrast to the larvae feeding on leaves of non-transgenic cotton (Awan et al. 2015).
Two Bt genes, cry1Ac and cry2A, were pyramided in a local cotton variety, CIM 482, by sonication-assisted Agrobacterium-mediated transformation (SAAT). The insect bioassay showed promising results, with 75% to 100% mortality of H. armigera observed on transgenic plants. The results indicated that one vector carrying two Bt insecticidal genes under the same promoter could be valuable for future breeding programs (Rashid et al. 2008). Ali et al. (2016) tested two cotton varieties, CRSP1 and CRSP2, for genetic transformation efficiency with respect to the GT gene and insect mortality. Their results showed that CRSP1 has valuable resistance against insects and weeds, and they further reported that this may help farmers as well as national breeders to develop potential cultivars. The CpEXPA3 gene from Calotropis procera was introduced into a local cotton cultivar (NIAB-846) using Agrobacterium strain LBA 4404. Fiber strength was greater in the transformed cotton plants than in non-transformed plants.

Other successful transformation methods in cotton
Other methods have also been used for genetic transformation of cotton. The introduction of exogenous DNA into self-pollinated flowers of cotton plants was reported by Zhou et al. (1983) using the pollen tube pathway method of transformation. Huang et al. (1999) and  reported that transgenic cotton plants expressing the green fluorescent protein gene and the cellulose-synthesizing genes (acsA, acsB, acsC and acsD) of Acetobacter xylinum were produced using approaches of this type.

Some successful transformations for agronomic traits in Pakistan
It has been reported that the production of non-Bt cotton decreases by 35%∼40% every year due to insect pest attack, which is an alarming situation for cotton growers in Pakistan (Masood et al. 2011). Bt cotton was first registered by the Government of Pakistan in 2009 and first grown in 2010 (Abdullah 2010). The phytochrome B gene was transferred into cotton using the Agrobacterium technique. The photosynthesis rate of the transgenic plants was two times higher than that of normal plants, and the transpiration rate and stomatal conductance were four times higher. Data were recorded in the greenhouse and the field for two generations. A 35% increase in yield was also observed in the transgenic cotton due to over-expression of the phytochrome B gene. This gene showed pleiotropic effects, such as a decrease in apical dominance and an increase in boll size.
Glyphosate-tolerant plants were also generated by transferring the 5-enolpyruvylshikimate-3-phosphate synthase (CP4-EPSPS) gene using the Agrobacterium technique (Nida et al. 1996). These plants were successful in the field, but 12 weed species resistant to glyphosate emerged after application of the herbicide for weed control (Dill et al. 2008). Genes encoding Cry proteins of B. thuringiensis have been classified as CryI, CryII, CryIII, CryIV, CryV and CryVI based on their insecticidal activities (Crickmore et al. 1998; Wilkins et al. 2000; Siebert et al. 2008). Cottonseed is an important source of edible oil. Engineering cotton plants with the antisense G. arboreum δ-(+) cadinene synthase (cdn1-Cl) gene under the control of the CaMV35S promoter reduced the content of gossypol, a toxic polyphenolic compound, by 70% in seed and its accumulation in foliage by 92% (Martin et al. 2003).

Gene editing in cotton
Recently, the CRISPR/Cas9 system has emerged as an effective technique for modifying genes in both plants and animals. It is based on the immune response of prokaryotes against foreign nucleic acids and viruses and has been successfully deployed in eukaryotes for targeted genome modification (Horvath and Barrangou 2010; Koonin and Makarova 2009). Earlier systems for mutagenesis and gene targeting, such as zinc finger nucleases and TALENs, are already in use, but CRISPR is much more efficient and target-oriented. CRISPR/Cas9 consists of two components: a guide RNA (gRNA) that finds the targeted sequence in the genome, and a nuclease that cuts the DNA at that specific location. A single guide RNA (sgRNA) contains a sequence of about 20 nucleotides that is complementary to the target, together with a tracrRNA:crRNA-derived portion that forms a hairpin structure and binds the Cas9 nuclease. This sgRNA/Cas9 complex allows cleavage at the target site in the genome with great precision (Mali et al. 2013). The ease of use of the CRISPR/Cas9 system has helped researchers to achieve a lot with limited resources.
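The targeting rule described above can be illustrated with a short sequence scan. The sketch below lists candidate 20-nucleotide target sites that are immediately followed by an NGG protospacer-adjacent motif (PAM). The NGG requirement is that of the commonly used Streptococcus pyogenes Cas9 and is an assumption here, since the text does not name the Cas9 variant. The function name and demo sequence are hypothetical, and only the forward strand is scanned.

```python
import re


def find_spcas9_targets(sequence, spacer_len=20):
    """Return (position, protospacer, PAM) for every protospacer followed by NGG."""
    seq = sequence.upper()
    # A lookahead is used so that overlapping candidate sites are all reported.
    pattern = re.compile(rf"(?=([ACGT]{{{spacer_len}}})([ACGT]GG))")
    return [(m.start(), m.group(1), m.group(2)) for m in pattern.finditer(seq)]


# Hypothetical demo sequence: a 20-nt stretch followed by an AGG PAM.
demo = "ACGT" * 5 + "AGG" + "TTTT"
for position, protospacer, pam in find_spcas9_targets(demo):
    print(position, protospacer, pam)
```

A practical design step would also scan the reverse complement and then screen each candidate against the whole genome, which matters in cotton because most genes are present in more than one copy.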
Polyploid crops, e.g. upland cotton, which is a tetraploid, are always difficult to edit because they have multiple sets of chromosomes and a higher number of alleles. However, recent studies have shown significant success in targeted mutagenesis and genome editing of G. hirsutum (Li et al. 2017). As an allotetraploid crop, cotton has a very complex genome (2n = 4x = 52) with a very large size, i.e. 2.5 Gb (Li et al. 2019). Many genes are present in multiple copies in cotton. Different gene editing strategies have been adopted in cotton, including CRISPR/Cpf1 (Cas12a), CRISPR/LbCpf1, CRISPR/Cas9 and multiplex CRISPR systems. Zhang et al. (2018) reported simultaneous editing of two copies of the Gh14-3-3d gene in upland cotton.
The availability of the cotton genome sequence has made molecular tools much easier to use. They can be used to evaluate the function of genes and to improve agronomic characters by targeting specific genes for better performance and quality traits (Li et al. 2015; Zhang et al. 2016). Despite limitations, successful gene editing has been reported for G. barbadense and G. hirsutum, which as allotetraploids carry twice as many targets as diploid crops. For example, Li et al. (2017) reported successful gene editing of cotton. The CRISPR/Cas9 system has been regarded as a broadly applicable method to control various geminiviruses in Pakistan. However, this method targets only a single virus and has not been found beneficial for controlling complexes of a begomovirus associated with DNA satellites. In addition, a cassette of sgRNAs has been designed to target complete CLCuD-associated begomovirus complexes (Iqbal et al. 2016). Although CRISPR/Cas9 has made genome editing quite simple and efficient, it can result in non-specific editing due to mismatches in the gRNA sequence (Ahmad et al. 2020). Genome-editing technology is still under investigation for its limitations, such as off-target effects, low mutagenesis efficiency, persistent CRISPR activity in subsequent generations, the risk of instability of the edited genome, the scarcity of validated targets, and its dependency on in vitro regeneration protocols for the recovery of stable plant lines (Ahmad et al. 2020).
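The off-target concern can be made concrete with a small sketch that counts mismatches between a guide sequence and candidate genomic sites of the same length. The threshold of three mismatches, the function names and the sequences are illustrative assumptions only; real off-target prediction also weighs the position of each mismatch and the presence of a PAM, which this sketch ignores.

```python
def count_mismatches(guide, site):
    """Count positions at which two equal-length sequences differ."""
    if len(guide) != len(site):
        raise ValueError("guide and site must have the same length")
    return sum(1 for g, s in zip(guide.upper(), site.upper()) if g != s)


def flag_potential_off_targets(guide, sites, max_mismatches=3):
    """Return (site, mismatches) for sites that are not identical to the guide
    but differ by at most max_mismatches (an assumed, illustrative cutoff)."""
    flagged = []
    for site in sites:
        n = count_mismatches(guide, site)
        if 0 < n <= max_mismatches:
            flagged.append((site, n))
    return flagged


# Hypothetical guide and candidate sites.
guide = "ACGTACGTACGTACGTACGT"
sites = [
    "ACGTACGTACGTACGTACGT",  # identical: the intended target
    "ACGTACGTACGTACGTACGA",  # differs at the last position
    "TCGTACGTACGAACGTACGA",  # differs at three positions
    "TTTTTTTTTTTTTTTTTTTT",  # unrelated sequence
]
flagged = flag_potential_off_targets(guide, sites)
print(flagged)
```

Sites that differ from the guide by only a few bases are the ones most likely to be cut unintentionally, so a screen of this kind is usually applied before a guide is chosen.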

Marker-assisted selection (MAS) status in Pakistan
In plant breeding, selecting plants with the desired characteristics and suitable gene combinations from a segregating population is an important component. Marker-assisted selection is the selection of phenotypes based on the genotype of markers (Collard et al. 2005). MAS is important in breeding programs because it improves the effectiveness and productivity of breeding compared with conventional methods alone. To facilitate the improvement of quantitative agronomic traits, researchers combine QTL mapping with MAS. RAPD markers have been extensively applied in MAS to obtain glandless seed and glanded plants in an interspecific population (Mergeai et al. 1998). A breeder can easily identify a plant that carries the gene if the markers are strongly associated with the targeted gene (Young 1996). DNA markers associated with important QTLs, such as qtlFS1 for fiber strength, are useful in the F2 generation of varieties cultivated on a large scale. The cotton varieties NIBGE-115 and NIBGE-2 were developed at the National Institute of Biotechnology and Genetic Engineering (NIBGE) by combining conventional and genomic tools to develop resistance against cotton leaf curl disease (Rahman and Zafar 2007a, b). In 2009, Mumtaz et al. obtained two CLCuV-resistant cotton genotypes, namely CIM-443 and CIM-240, through marker-assisted screening in Pakistan. The data collected on these genotypes suggested that they can be used in future cotton breeding. The cost and effectiveness of MAS depend on the choice of marker technology, which should therefore be selected with considerable care during crop improvement.
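The selection step in MAS can be sketched as a simple filter on marker genotypes. The example assumes a codominant marker scored as "AA" (homozygous for the favorable allele), "AB" (heterozygous) or "BB" in a small F2 population; the plant identifiers, the scores and the function names are hypothetical and do not come from the studies cited above.

```python
from collections import Counter

# Hypothetical marker scores for eight F2 plants at one marker assumed to be
# tightly linked to the target gene or QTL.
f2_genotypes = {
    "P01": "AA", "P02": "AB", "P03": "BB", "P04": "AB",
    "P05": "AB", "P06": "AA", "P07": "BB", "P08": "AB",
}


def select_by_marker(genotypes, keep=("AA",)):
    """Return the sorted IDs of plants whose marker genotype is in keep."""
    return sorted(pid for pid, g in genotypes.items() if g in keep)


def genotype_counts(genotypes):
    """Tally genotype classes, e.g. to compare with a 1:2:1 F2 expectation."""
    return Counter(genotypes.values())


selected = select_by_marker(f2_genotypes)
counts = genotype_counts(f2_genotypes)
print(selected)
print(dict(counts))
```

How reliable such a filter is depends on how tightly the marker is linked to the gene, since recombination between the two can separate the marker allele from the trait.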

Conclusion
Since the start of the cotton biotechnology program, several key steps and strategies have been adopted for crop improvement at different research institutions in Pakistan. These include conservation of germplasm resources, genetic engineering and transformation technologies, molecular marker-assisted selection, classical breeding programs, improvement of fiber quality, better resistance against biotic and abiotic stresses, and policy-making for knowledge dissemination and the variety approval process. Nevertheless, Pakistan still lags behind other major cotton-growing countries in yield per unit area. Further research on the adaptation of genetic engineering technologies, academia-industry linkages and future policy-making is required to improve crops and to deal with future challenges.