Linkage Map Construction and QTL Analysis of Agronomic and Fiber Quality Traits in


1 Page 1 of 62 Linkage Map Construction and QTL Analysis of Agronomic and Fiber Quality Traits in Cotton Michael A. Gore*, David D. Fang, Jesse A. Poland, Jinfa Zhang, Richard G. Percy, Roy G. Cantrell, Gregory Thyssen, and Alexander E. Lipka M. A. Gore, Plant Physiology and Genetics Research Unit, United States Department of Agriculture-Agricultural Research Service (USDA-ARS), U.S. Arid-Land Agricultural Research Center, North Cardon Lane, Maricopa, AZ 85138, USA D. D. Fang and G. Thyssen, Cotton Fiber Bioscience Research Unit, USDA-ARS, Southern Regional Research Center, 1100 Robert E. Lee Boulevard, New Orleans, LA 70124, USA J. A. Poland, Hard Winter Wheat Genetics Research Unit, USDA-ARS, Manhattan, KS and Department of Agronomy, Kansas State University, Manhattan, KS 66506, USA J. Zhang, Department of Plant and Environmental Sciences, New Mexico State University, Las Cruces, NM 88003, USA R. G. Percy, USDA-ARS, Southern Plains Agricultural Research Center, Crop Germplasm Research Unit, 2881 F&B Road, College Station, TX 77845, USA 1

R. G. Cantrell, Monsanto, 700 Chesterfield Parkway West, Mail Stop CC5A, Chesterfield, MO 63017, USA

A. E. Lipka, Institute for Genomic Diversity, Cornell University, Ithaca, New York, USA

Current Address: M. A. Gore, Department of Plant Breeding and Genetics, Cornell University, Ithaca, NY 14853, USA

M. A. Gore and D. D. Fang contributed equally to this work.

Received.

*Corresponding author: ([email protected])

Abbreviations: GBS, genotyping-by-sequencing; ICIM, inclusive composite interval mapping; QTL, quantitative trait locus; SNP, single-nucleotide polymorphism; SSR, simple-sequence repeat

Abstract

The superior fiber properties of Gossypium barbadense L. serve as a source of novel variation for improving fiber quality in Upland cotton (G. hirsutum L.), but introgression from G. barbadense has been largely unsuccessful due to hybrid breakdown and a lack of genetic and genomic resources. In an effort to overcome these limitations, we constructed a linkage map and conducted a quantitative trait locus (QTL) analysis of 10 agronomic and fiber quality traits in a recombinant inbred mapping population derived from a cross between TM-1, an Upland cotton line, and NM24016, an elite G. hirsutum line with stabilized introgression from G. barbadense. The linkage map consisted of 429 simple-sequence repeat (SSR) and 412 genotyping-by-sequencing (GBS)-based single-nucleotide polymorphism (SNP) marker loci that covered half of the tetraploid cotton genome. Notably, the 841 marker loci were unevenly distributed among the 26 chromosomes of tetraploid cotton. The 10 traits evaluated on the TM-1 × NM24016 population in a multi-environment trial were highly heritable, and most of the fiber traits showed considerable transgressive variation. Through the QTL analysis, we identified a total of 28 QTLs associated with the 10 traits. Our study provides a novel resource that can be used by breeders and geneticists for the genetic improvement of agronomic and fiber quality traits in Upland cotton.

Introduction

As the world's foremost natural fiber crop, cotton supports a multi-billion dollar production and processing industry. Even though cotton is predominantly cultivated for its fiber, cottonseed, a byproduct of cotton processing, is an important source of vegetable oil and high-protein meal. The cotton genus (Gossypium) captures a tremendous range of phenotypic and genomic diversity, with a striking native geographic distribution that includes regions of Africa, Asia, Australia, and the Americas (Fryxell, 1968; Fryxell, 1971; Fryxell, 1992). The nearly 50 species that are assigned to this genus have undergone extensive chromosomal evolution, allowing separation into one of nine genome groups that consist of either diploid (A-G and K) or allotetraploid (AD) species (Reviewed in Wendel and Cronn, 2003). Remarkably, tetraploid cotton appeared within the last 1-2 million years from a likely intercontinental dispersal of an A-genome diploid to the Americas, followed by hybridization with an indigenous D-genome diploid most similar to the extant wild species, G. raimondii (D5) (Reviewed in Wendel et al., 2009).

Within the Gossypium genus, two of the five tetraploid species (2n=4x=52; G. hirsutum and G. barbadense), along with two diploid species (2n=2x=26; G. herbaceum and G. arboreum), were independently domesticated for cotton fiber production in the last few thousand years in the New and Old World (Reviewed in Brubaker et al., 1999). However, the two cultivated tetraploid species, G. hirsutum and G. barbadense, account for the vast majority of global cotton production. The higher yielding and more broadly adapted G. hirsutum (source of Upland cotton) is responsible for greater than 90% of the current world production of cotton fiber. With the increasing global demand for textile products and intense competition from synthetic fibers, the need for higher yielding Upland cotton cultivars with improved fiber quality has never been

more critical. Inopportunely, there has been a continual decline in the rate of gain in cotton yields over the past decade (Meredith, 2000). This yield plateau is likely the result of a very narrow genetic base for Upland cotton that was initially imposed by polyploidization and domestication bottlenecks, followed by a recent extended period of genetic improvement that relied on the overutilization of elite germplasm within breeding programs that often captured a minuscule fraction of the exploitable standing genetic variation (Brubaker et al., 1999; May et al., 1995; Paterson et al., 2004; Wendel et al., 1992). The continued genetic erosion of the Upland germplasm pool has become ever more systemic due to the excessive genetic restriction that results from backcross breeding approaches routinely employed to develop commercial transgenic cotton cultivars (Paterson et al., 2004; Van Esbroeck et al., 1998). Such limited allelic variation not only restricts the rate of increase in yield potential, but also increases the genetic vulnerability of Upland cotton to adverse climatic episodes, as well as pest and disease epidemics.

Although G. barbadense (source of Egyptian, Pima, and Sea Island cotton) is grown in limited areas around the world, its fiber quality properties are superior to those of G. hirsutum. Unfortunately, efforts to enhance the fiber quality of G. hirsutum through hybridization with G. barbadense have been largely unsuccessful due to the preferential elimination of G. barbadense alleles in the F2 and later generations (Stephens, 1949). This selective loss of G. barbadense alleles not only results in extensive segregation distortion (Jiang et al., 2000; Reinisch et al., 1994), but also leads to the predominant expression of phenotypes that most closely resemble G. hirsutum. Additionally, there is a reduction in fitness that manifests itself as a decline in vigor and fertility in advanced generations of G. hirsutum × G. barbadense populations (Stephens, 1950). This phenomenon, termed hybrid breakdown, has severely limited the value of

interspecific populations for the genetic improvement of fiber quality in Upland cotton. This has made it particularly difficult, if not impossible, to develop a large number of fertile, interspecific inbred lines for multi-environment trials on a commercial scale. Therefore, an alternative breeding approach is needed to widen the genetic base of Upland cotton with novel, favorable alleles from G. barbadense.

The crossing of G. hirsutum inbred lines possessing stable introgression segments from G. barbadense with elite Upland cotton cultivars is an approach that could result in the development of breeding populations for the genetic improvement of fiber quality with a lower incidence of hybrid breakdown. Through multiple cycles of recombination and selection of canonical phenotypes from both parental species, it was possible to develop NM24016, an elite G. hirsutum line with significant introgression from diverse G. barbadense lines that accounts for an estimated one-third of the mosaic genome (Cantrell and Davis, 2000; Tatineni et al., 1996). This stable introgressed line was crossed to TM-1, an Upland cotton line that is the genetic standard of G. hirsutum (Kohel et al., 1970), to develop a cotton mapping population of recombinant inbred lines (RILs) that has been shown to have tremendous phenotypic diversity, especially transgressive variation for fiber quality traits (Gore et al., 2012; Percy et al., 2006). However, the lack of a genetic linkage map, as well as of quantitative trait loci (QTLs) associated with fiber quality properties, has restricted the value of this novel mapping resource to the global cotton community. The identification of QTL alleles associated with economically important traits in an intraspecific inbred mapping population with introgression segments from G. barbadense could facilitate the molecular breeding of higher yielding Upland cultivars with enhanced fiber quality.

The objectives of this study were (i) to construct a linkage map for the TM-1/NM24016 population and (ii) to identify favorable QTL alleles associated with agronomic and fiber traits.

Materials and methods

Plant materials and phenotypic evaluations

The construction and phenotypic evaluation of the TM-1 × NM24016 recombinant inbred mapping population were previously described (Gore et al., 2012; Percy et al., 2006). Briefly, TM-1, NM24016, and 98 F5:7 RILs were evaluated in a completely randomized block design at Las Cruces, NM, and Maricopa, AZ, in 2001 and 2002. In three of the four environments, four complete replications of the experiment were grown. In 2001, three complete replications of the experiment were grown at Las Cruces. Experimental units were two-row plots with an inter-row spacing of 1.01 m at each location. The length of plots was 12.2 m in 2001 and 10 m in 2002. Mechanical harvesting of plots was performed with a two-row harvester. Prior to mechanical harvest, boll samples (50 bolls at Maricopa and 25 bolls at Las Cruces) were harvested by hand. The collected seedcotton samples were ginned using a laboratory 10-saw gin to allow for the measurement of boll and fiber quality traits.

The RIL population and its two parents were phenotyped for boll size (g boll⁻¹), lint percentage (%), lint yield (kg ha⁻¹), plant height (m) at harvest (only Maricopa environments), fiber length (mm; 2.5%- and 50%-span lengths, the distance spanned by 2.5% and 50% of the fibers), micronaire (unit), strength (kN m⁻¹ kg⁻¹), elongation (%), and uniformity (%). A digital fibrograph was used to measure fiber length, and a

Fibronaire instrument (Motion Control, Dallas, TX) was used to measure micronaire. A stelometer was used to measure fiber strength and elongation.

DNA isolation

Self-pollinated seeds from TM-1, NM24016, and 95 of the 98 F5:7 RILs were germinated in Petri dishes lined with moistened filter paper at 32 °C in a growth chamber. Viable self-pollinated seeds were not available for lines NM26, NM54, and NM59. For each line, root tips were bulk harvested from an average of ten 5-day-old seedlings. Total genomic DNA was isolated from homogenized fresh 5-day-old root tissue using 2% cetyltrimethyl ammonium bromide as previously described (Paterson et al., 1993), followed by purification with an Omega E.Z.N.A. HiBind DNA column (Omega Bio-Tek, Norcross, GA). The DNA concentration and purity of each sample were measured with a micro-volume UV-Vis spectrophotometer (NanoDrop Technologies, Inc., Wilmington, DE).

SSR marker analysis

Genomic DNA samples of TM-1 and NM24016 were first screened with 2,183 SSR markers to identify polymorphic markers between the two mapping parents. Primer sequences for SSR markers are available from the CottonGen database, and a substantial number of them are included in a high-density linkage map for an interspecific cross between TM-1 (G. hirsutum) and 3-79 (G. barbadense) (Fang and Yu, 2012). SSR oligonucleotide primers were purchased from Sigma Genosys (Woodlands, TX) and Life Technologies (Foster

City, CA). Forward primers were fluorescently labeled at the 5′ end with 6-FAM (6-carboxyfluorescein), HEX (4,7,2′,4′,5′,7′-hexachloro-6-carboxyfluorescein), or NED (2,7′,8′-benzo-5′-fluoro-2′,4,7-trichloro-5-carboxyfluorescein). PCR assay conditions for SSR marker loci were as previously described (Fang et al., 2010). Briefly, three pairs of primers that each had a different fluorescent label were multiplexed in each PCR assay. The 10 µl PCR reaction included 20 ng genomic DNA, 2.5 µM each of the forward and reverse primers, 3.5 mM MgCl2, 0.2 mM dNTPs, 1 unit of Taq DNA polymerase (Promega Corporation, Madison, WI), and 1× reaction buffer without MgCl2. Amplification conditions were 95 °C for 3 min, followed by 34 cycles of 94 °C for 45 s, 55 °C for 45 s, and 72 °C for 1 min, with a final step of 72 °C for 10 min. Amplified fragments were separated and sized on an automated capillary electrophoresis system, the ABI 3730XL (Applied Biosystems, Inc., Foster City, CA). GeneScan-500 ROX (Applied Biosystems, Inc.) was used as an internal DNA size standard. The output was analyzed with GeneMapper 4.0 (Applied Biosystems, Inc.).

Of the 2,183 SSR markers that were screened, 538 were polymorphic between the two parents and were subsequently used to genotype the 95 RILs following the approach of Fang et al. (2010), as briefly described in the previous paragraph. We designated duplicated marker loci by adding a lower-case letter in sequential alphabetical order after the primer name. As reported in Gore et al. (2012), there was a low incidence of more than two parental alleles (15 loci) and putative non-parental alleles (54 loci; average of 2.5% non-parental alleles per RIL) for SSR loci. For each SSR locus with more than two parental alleles, the rarest alleles (i.e., lowest minor allele frequencies) were conservatively converted to missing data to allow for only one major allele from each parent. In addition, all putative non-parental alleles were converted to missing data.
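The conversion of non-parental alleles to missing data can be illustrated with a minimal Python sketch. It is not the authors' pipeline: the function names, the use of `None` for a missing call, and the fragment sizes are all hypothetical and chosen only for illustration.

```python
MISSING = None  # hypothetical placeholder for a missing genotype call


def mask_non_parental(calls, allele_a, allele_b):
    """Set every call that matches neither parental allele to MISSING."""
    parental = {allele_a, allele_b}
    return [c if c in parental else MISSING for c in calls]


def non_parental_rate(calls, allele_a, allele_b):
    """Fraction of scored (non-missing) calls that match neither parent."""
    parental = {allele_a, allele_b}
    scored = [c for c in calls if c is not MISSING]
    if not scored:
        return 0.0
    return sum(1 for c in scored if c not in parental) / len(scored)


# Hypothetical fragment sizes (bp) at one SSR locus: the two parents carry
# the 152 bp and 158 bp alleles, and one RIL shows an unexpected 161 bp allele.
ril_calls = [152, 158, 152, 161, 158, MISSING]
cleaned = mask_non_parental(ril_calls, 152, 158)
rate = non_parental_rate(ril_calls, 152, 158)
```

In this toy example the single 161 bp call becomes missing, and one of the five scored calls is non-parental.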

Genotyping-by-sequencing (GBS) marker analysis

We constructed a PstI-MspI GBS library with a set of optimized barcoded adapters (P384A) as previously described (Poland et al., 2012). The 96-plex library consisted of the two mapping parents (TM-1 and NM24016) in duplicate and 92 RILs. The constructed library was sequenced twice on an Illumina HiSeq 2000 platform (Illumina Inc., San Diego, CA), which generated M and M sequence reads, respectively. SNP identification and population genotyping were conducted as described by Poland et al. (2012). Briefly, putative biallelic SNPs were identified by internally aligning sequence tags (i.e., unique sequences within the entire set of tags) with a maximum mismatch allowance of three nucleotides in a 64 bp tag. In addition, putative SNPs needed to be present in greater than 20% of the inbred lines. To filter for true SNPs, a Fisher's exact test was implemented to determine if the two SNP alleles were independent in the population of inbred lines. Putative SNPs for which the null hypothesis of independence was rejected at a significance level of α = 0.05 were converted to SNP calls in the population.

Linkage map construction

We used 499 SSR loci detected by 459 SSR primer pairs and 491 SNP marker loci to construct a linkage map with JoinMap version 4.0 software (Van Ooijen, 2006). Linkage groups were created at a logarithm of odds (LOD) score threshold of 5. Marker orders were estimated using the maximum likelihood mapping algorithm. Recombination fractions were converted to

map distances (cM) via the Kosambi mapping function (Kosambi, 1944). For each locus included in the linkage map, segregation distortion was tested by χ² analysis (degrees of freedom = 1) against the expected 1:1 ratio in a RIL population. A Bonferroni correction was used to control the family-wise error rate for multiple χ² tests, resulting in an alpha level of 5.95 × 10⁻⁵ (0.05/841).

Localization of markers on the G. raimondii reference genome sequence

We downloaded the 13 pseudomolecule chromosomes (~750 Mb) of the diploid D5 genome species G. raimondii (JGI assembly v2.1; Paterson et al., 2012). The BLASTN algorithm (stand-alone) was used to align context nucleotide sequences for each of the aforementioned 459 SSR and 491 SNP markers to the G. raimondii reference genome sequence with an E-value cutoff of 1e-20 for SSR markers and 1e-10 for SNP markers. A higher E-value cutoff was used for the relatively shorter SNP context sequences because shorter alignments tend to produce higher E-values (Karlin and Altschul, 1990).

Phenotypic data analysis and heritability estimation

The 10 traits were initially screened for outliers in SAS version 9.3 (SAS Institute, 2012) by examining the Studentized deleted residuals (Kutner et al., 2004) obtained from mixed linear models fitted with environment, line, and replication nested within environment as random effects. For each trait, a best linear unbiased predictor (BLUP) for each line was predicted from a mixed linear model fitted across environments with ASReml version 3.0 (Gilmour et al., 2009):

$$Y_{ijk} = \mu + \mathrm{env}_i + \mathrm{line}_j + (\mathrm{env} \times \mathrm{line})_{ij} + \mathrm{rep}(\mathrm{env})_{ik} + \varepsilon_{ijk},$$

in which $Y_{ijk}$ is an individual phenotypic observation on a single plot; $\mu$ is the overall mean; $\mathrm{env}_i$ is the effect of the $i$th environment; $\mathrm{line}_j$ is the effect of the $j$th line; $(\mathrm{env} \times \mathrm{line})_{ij}$ is the effect of the interaction between the $j$th line and the $i$th environment; $\mathrm{rep}(\mathrm{env})_{ik}$ is the effect of the $k$th replication within the $i$th environment; and $\varepsilon_{ijk}$ is the random error term. Likelihood ratio tests were conducted to remove all terms from the model that were not significant at α = 0.05 (Littell et al., 1996). The final model was used to estimate BLUPs for each line. The variance components from these final models were used to estimate broad-sense heritability on an individual plot basis ($\hat{H}^2_p$) and a line-mean basis ($\hat{H}^2_l$) per Holland et al. (2003). The standard errors of heritability estimates were approximated with the delta method (Holland et al., 2003).

QTL analysis

For BLUPs of each trait, we mapped additive QTL effects with inclusive composite interval mapping (ICIM) (Li et al., 2007), a variant of composite interval mapping (CIM), in QTL IciMapping version 3.2 software. The ICIM method consists of two stages. In the first stage, stepwise regression was used to fit individual markers in a general linear model. For each trait, the probability levels for markers to enter and exit the model were calculated by a permutation procedure run 1,000 times (Anderson and Braak, 2003). The P-value corresponding to an overall type I error rate of α = 0.05 was approximately 1 × 10⁻⁴ for the RIL population. To prevent a marker from entering and exiting the model during the same

step, the entry threshold was set to P = 1 × 10⁻⁴, while the exit threshold was set to P = 2 x

In the second stage, one-dimensional scanning across the entire genome at 1 cM steps was conducted based on coefficient estimation in the first stage. For each trait, a permutation procedure (Churchill and Doerge, 1994) was run 1,000 times in the QTL IciMapping version 3.2 software to select the LOD threshold for an experiment-wise type I error rate of α = 0.05. The LOD thresholds at α = 0.05 ranged from 3.33 to 3.45 with an average LOD score of

Results

Genetic properties of the TM-1 × NM24016 linkage map

We used 2,183 simple-sequence repeat (SSR) markers to screen TM-1 and NM24016 for polymorphisms and found 538 polymorphic markers between the two mapping parents. When the 538 SSR markers were used to evaluate the TM-1 × NM24016 mapping population, 65 of them were identified as monomorphic within the population. An additional 14 SSR markers did not produce distinct, reproducible patterns of polymorphism among lines and thus were not used for linkage map construction. Of the remaining 459 SSR markers, 419 scored a single locus each, while the other 40 scored two loci each, for a total of 499 SSR marker loci. In a complementary experiment, the parents and population were genotyped with a highly multiplexed genotyping-by-sequencing (GBS) approach to further increase the number of markers available for constructing a linkage map. With this approach, which combines polymorphism identification and genotyping in a single step, 491 biallelic GBS-based SNP loci were scored in parallel across

the parents and their progeny with high confidence. Taken together, the two genotyping approaches scored a total of 990 SSR and SNP loci across the RIL population.

Of the 990 marker loci used for linkage map construction, 841 could be assigned to 117 linkage groups, while the other 149 could not be assigned to any linkage group (Table 1). The number of markers on each linkage group ranged from 2 to 57, and the linkage map covered a total of ~2,061 cM of the cotton genome. If not considering the three linkage groups that each consisted entirely of markers that cosegregated, the length of each linkage group ranged from 0.51 to cM, while the average distance between two markers for linkage groups ranged from 0.30 to cM. Of the 841 mapped marker loci, statistically significant segregation distortion was detected for 145 loci (17.2%) at a Bonferroni-corrected threshold of 5%. In general, there was a bias for TM-1 alleles at these 145 loci, suggesting a selective elimination of alleles from NM24016. Interestingly, loci with distorted segregation ratios were especially prevalent for two linkage groups that mapped to chromosomes 8 and 23, accounting for 40.0% of the 145 loci. The average residual heterozygosity for each marker locus ranged from 0 to 14.29%, with an overall average of 3.88%. For each RIL, the average residual heterozygosity was 3.85% and ranged from 0 to 25.72%.

Comparative analysis with the TM-1 × NM24016 linkage map

We putatively assigned 116 of the 117 linkage groups to specific chromosomes of tetraploid cotton using the intersection of SSR markers between the TM-1/NM24016 and previously published linkage maps (Blenda et al., 2012; Fang and Yu, 2012) in combination with anchoring markers to the G. raimondii (D5 diploid) reference genome sequence (Paterson et al.,

2012). On average, each chromosome had 4.33 linkage groups, but marker loci were unevenly distributed among the 26 chromosomes (Table 1). Notably, no marker loci were mapped to chromosome 18, while chromosome 4 only had two loci that spanned 4.07 cM. In contrast, chromosomes 11 and 19 had the highest marker densities with 84 and 81 loci, respectively. There was a moderately strong positive correlation (R² = 0.42, P < ) between the number of SSR and SNP marker loci per chromosome, indicating that both SSR and SNP loci were unevenly distributed in a related manner. Interestingly, there was essentially no correlation (R² = 0.04, P = 0.33) between the number of markers per chromosome and estimates of physical chromosome size (Mb) from G. raimondii (D5 diploid) and G. arboreum L. (A2 diploid) (Paterson et al., 2012; Wang et al., 2008). With very few exceptions, chromosomal assignments based on existing linkage maps were also concordant with those inferred by aligning marker sequences to the G. raimondii genome sequence (Table S1). However, there were 141 markers (113 SNPs and 28 SSRs) that could not be aligned to the genome sequence (no hits found). Of these 141 markers, 114, 26, and 1 were linkage mapped to A_T (Chr.01-13), D_T (Chr.14-26), and unknown chromosomes, respectively.

We compared the collinearity of SSR markers shared between our intraspecific linkage map and an existing high-density linkage map constructed for an interspecific RIL population (G. hirsutum TM-1 × G. barbadense 3-79) that also used TM-1 as the female parent (Fang and Yu, 2012). A total of 316 SSR marker loci from 99 linkage groups of the TM-1 × NM24016 map were found to also exist in the TM-1 × 3-79 linkage map. Among the 316 shared marker loci, 187 were perfectly collinear between the two linkage maps. In contrast, there were 129 discordant marker loci nearly evenly split between A_T (63 loci) and D_T (66 loci) chromosomes (Supplementary Table S1). Furthermore, these discrepancies in collinearity were restricted to

only 19 and 16 linkage groups assigned to A_T and D_T chromosomes, respectively. The alignment of markers to the G. raimondii (D5 diploid) genome sequence, however, revealed that TM-1 × NM24016 map positions for at least 17 of the 66 discordant loci (25.8%) mapping to D_T chromosomes were collinear with their physical positions. The availability of a draft genome sequence for G. hirsutum and G. barbadense will be critical to more precisely resolve marker order along a chromosome and shed light on potential cryptic structural variation.

Heritabilities and QTL mapping of traits in the TM-1 × NM24016 population

With a mixed model that corrected for systematic environmental effects, we reassessed 10 agronomic and fiber quality traits that had been scored on the TM-1 × NM24016 population at two locations in 2001 and 2002 (Percy et al., 2006). Similar to the findings of Percy et al. (2006), the midparent BLUPs, as well as the mean and range of progeny RIL BLUPs, revealed substantial variability for the 10 traits, with RIL progeny especially showing transgressive variation for fiber quality traits (Table 2). Estimates of broad-sense heritabilities on an individual plot basis ($\hat{H}^2_p$) ranged from 0.30 to

In contrast, estimates of broad-sense heritabilities on a line-mean basis ($\hat{H}^2_l$) ranged from 0.85 to 0.95, which were slightly higher than broad-sense heritability estimates obtained by Percy et al. (2006). Thus, replication of the experiment across multiple environments provided a 35 to 72% increase of mean heritability for these 10 traits. This has important implications for complex trait dissection because the statistical power to detect QTLs is higher for traits with relatively higher heritability (Yu et al., 2008).

The estimated BLUPs for the 10 agronomic and fiber traits were used to map QTLs with the inclusive composite interval mapping (ICIM) procedure. The QTL analysis identified a total

of six QTLs for five traits at an experiment-wise type I error rate of 5% (Table 3). Given that the sample size of the TM-1 × NM24016 mapping population (n = 95) only has adequate statistical power to repeatedly detect large-effect QTLs (Xu, 2003), we searched for QTLs with more modest effects at an experiment-wise type I error rate of 20%. With this less conservative type I error rate, an additional 22 QTLs were identified (Supplementary Table S2). When considering a combined total of 28 QTLs, the number of QTLs associated with each trait ranged from one (lint yield, plant height, 2.5%- and 50%-span length) to seven (fiber strength).

Given that the 22 weaker-effect QTLs were only identified with a relaxed type I error rate, we focused on the six QTLs that were declared significant at the more stringent genome-wide significance threshold. These six QTLs were distributed among chromosomes 11, 15, 17, 19, and 25 (2 QTLs). One of the two QTLs for fiber length uniformity mapped to a position on chromosome 25 that was coincident with the 2.5%-span length QTL. Interestingly, these two QTLs on chromosome 25 showed opposite sign allelic effects and there was a modest negative correlation (R² = 0.31, P < ) between the BLUPs of these two fiber traits. The percent variance explained by an individual QTL ranged from approximately 14 to 22%. Length uniformity was the only trait for which two QTLs were identified, with a QTL each located on chromosomes 15 and 25. Taken together, these two QTLs accounted for approximately 37% of the total variance for length uniformity and approximately 42% of genetic variance.

Interestingly, both parents contributed favorable alleles for agronomic and fiber traits. Of the six QTLs identified, four had positive additive effects, implying a higher value for boll size, fiber strength, and length uniformity conferred by alleles from TM-1. The other two QTLs had negative additive effects, suggesting that TM-1 contributed alleles that reduced lint yield and 2.5%-span length at these two loci.
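The two percentages quoted above for length uniformity (about 37% of total variance and about 42% of genetic variance) can be related by a short worked calculation. This rests on an assumption that is not stated in the text: that the genetic-variance figure is the total-variance figure divided by the line-mean broad-sense heritability of the BLUPs. Under that assumption, the implied heritability for length uniformity is:

```latex
\mathrm{PVE}_{\text{genetic}} \approx \frac{\mathrm{PVE}_{\text{total}}}{\hat{H}^2_l}
\quad\Longrightarrow\quad
\hat{H}^2_l \approx \frac{\mathrm{PVE}_{\text{total}}}{\mathrm{PVE}_{\text{genetic}}}
= \frac{0.37}{0.42} \approx 0.88
```

This value falls within the reported 0.85 to 0.95 range of line-mean heritabilities, so the two figures are at least consistent with one another under this reading.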

Discussion

In general, G. barbadense has superior fiber quality, but is lower yielding and less adapted to cotton growing regions relative to G. hirsutum. Unfortunately, efforts to transfer novel genetic variation for fiber traits from G. barbadense to modern Upland cotton have been repeatedly slowed by hybrid breakdown in the F2 and later generations (Jiang et al., 2000; Reinisch et al., 1994; Stephens, 1949; Stephens, 1950). Elite G. hirsutum cotton lines with stabilized introgression from G. barbadense could be used as parental lines to potentially help overcome this genetic barrier (Cantrell and Davis, 2000; Tatineni et al., 1996), but a greater wealth of genetic and genomic resources is needed to accelerate such an effort in molecular breeding programs of Upland cotton. To address this issue, we constructed a linkage map for an immortal Upland cotton mapping population with introgressed segments from G. barbadense and conducted a QTL analysis of 10 agronomic and fiber traits. Our study is the first to use a GBS approach to simultaneously identify and score SNPs within a cotton mapping population and identify QTLs for complex trait variation with this novel mapping resource.

We constructed an intraspecific linkage map of tetraploid cotton with 841 SSR and biallelic GBS-based SNP loci that spanned 2,061 cM of the genome. These 841 marker loci were assembled into 117 linkage groups, of which 116 could be putatively assigned to 25 of the 26 cotton chromosomes. The tetraploid cotton genome has an estimated genetic distance of 4,070 cM (Blenda et al., 2012); thus, the 117 linkage groups covered 50.6% of the genome. As would be expected for elite Upland cotton germplasm (Van Deynze et al., 2009), this incomplete genome coverage is likely attributed to the low level of nucleotide diversity that exists between

19 Page 19 of 62 TM-1 and NM24016 (Lu et al., 2009) for the single-copy genomic fraction that the GBS method preferentially targeted. Such low levels of diversity have also limited past efforts to construct linkage maps with complete genome coverage for Upland cotton (Byers et al., 2012; Lin et al., 2009; Shen et al., 2007; Ulloa et al., 2002; Zhang et al., 2009). The 841 SSR and SNP marker loci were unevenly distributed among the 26 chromosomes of cotton (Table 1). Chromosomes 11 (84 loci) and 19 (81 loci) had the highest number of mapped SSR and SNP marker loci, while chromosomes 18 (0 loci) and 4 (2 loci) had the fewest. Similarly, Lin et al. (2010) revealed a biased distribution of mapped SSR markers among cotton chromosomes based on the integration of seven interspecific linkage maps, with the most SSR markers mapped to chromosomes 11 and 19 and least to chromosomes 2 and 4. The unbalanced chromosomal distribution of marker loci is unlikely to be entirely attributed to the genomic location of introgressed segments from G. barbadense, because 40.0% of the 145 loci with highly significant segregation distortion potential signatures of introgression segments in G. hirsutum mapping populations (Zhang et al., 2012) were contained in only two linkage groups that mapped to chromosomes 8 and 23. Furthermore, even though a higher rate of polymorphism was observed between TM-1 and NM24016 for chromosomes 11 and 19, only % of the marker loci on these two chromosomes showed highly significant segregation distortion, respectively. Given that this is the first application of GBS in a cotton RIL population, it is not possible to compare our results to that of other cotton studies. 
However, if the genome structure and patterns of diversity for tetraploid cotton resemble those of other species such as barley and wheat, the employed two-enzyme GBS approach is expected to identify and score SNPs evenly among chromosomes with respect to chromosome length, as well as at a fairly uniform density along chromosomes, with the exception of recombinationally suppressed centromeric regions (Poland et al., 2012). Yet essentially no correlation (R² = 0.04, P = 0.33) was found between marker number per chromosome and physical chromosome size for the TM-1 × NM24016 linkage map. This non-uniform marker coverage of the tetraploid cotton genome conceivably resulted from large, monomorphic chromosomal blocks of identity-by-descent (IBD) between TM-1 and NM24016. Using SSR markers, Fang et al. (2013) identified 23 blocks of potential IBD of 20 cM or larger in a diversity panel of 193 Upland cotton cultivars. If pervasive in the modern germplasm pool, such large blocks of IBD will greatly impede the construction of medium- to high-density intraspecific linkage maps for elite Upland cotton lines.

In concordance with the results of a comprehensive analysis conducted by Percy et al. (2006) on the same phenotypic data set, we detected remarkable transgressive variation for fiber quality traits and showed the 10 traits to be highly heritable (Table 2). These findings suggest that RILs with extreme phenotypes inherited novel combinations of complementary alleles from TM-1 and NM24016 (deVicente and Tanksley, 1993). The extent to which the 10 agronomic and fiber traits were heritable within the TM-1 × NM24016 population was estimated as a function of variance components from mixed linear models (Holland et al., 2003). Replicated evaluation of the intraspecific RIL population in Las Cruces, NM, and Maricopa, AZ, over two years resulted in broad-sense heritabilities on a line-mean basis ($\hat{H}^2_l$) that ranged from 0.85 to 0.95 for the 10 traits. These very high heritabilities suggest that the 10 traits are predominantly controlled by QTLs and should respond very favorably to selection based on line means when the identical experimental design is used (Holland et al., 2003; Hung et al., 2012).
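The relationship between variance components and the two heritability estimates can be sketched as follows. This is a simplified illustration that assumes a balanced design with genotype, genotype-by-environment, and residual variance components; the exact mixed model, the number of replicates, and the variance values below are assumptions made for the example and are not taken from the study.

```python
def heritability_plot_basis(var_g, var_ge, var_err):
    """Broad-sense heritability on an individual plot basis.

    var_g   : genotypic (line) variance
    var_ge  : genotype-by-environment interaction variance
    var_err : residual (plot-to-plot) variance
    """
    return var_g / (var_g + var_ge + var_err)


def heritability_line_mean(var_g, var_ge, var_err, n_env, n_rep):
    """Broad-sense heritability on a line-mean basis for a balanced trial.

    The interaction variance is divided by the number of environments and
    the residual variance by the total number of plots per line, because
    averaging over environments and replicates shrinks both terms.
    """
    return var_g / (var_g + var_ge / n_env + var_err / (n_env * n_rep))


if __name__ == "__main__":
    # Hypothetical variance components; four environments as in the trial,
    # two replicates per environment assumed for illustration only.
    components = dict(var_g=4.0, var_ge=1.0, var_err=2.0)
    print(heritability_plot_basis(**components))
    print(heritability_line_mean(**components, n_env=4, n_rep=2))
```

The line-mean estimate is always at least as large as the plot-basis estimate for the same variance components, which is why heritabilities near 0.9 on a line-mean basis can coexist with much lower values per plot.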

With ICIM of the 10 agronomic and fiber traits within the TM-1 × NM24016 population, we identified a total of six QTLs associated with five traits at an experiment-wise type I error rate of 5% (Table 3). Only a single QTL was detected for each of boll size, lint yield, fiber strength, and 2.5%-span length, while two QTLs were detected for length uniformity. Although these five traits are highly heritable (Table 2), the detected QTLs explained only approximately 20 to 42% of the estimated heritability for their associated traits. Moreover, the proportion of phenotypic variance explained by each QTL is likely to be substantially overestimated with a mapping population of only about 100 individuals (Beavis, 1998). Furthermore, no QTLs were detected at an experiment-wise type I error rate of 5% for lint percentage, plant height, micronaire, fiber elongation, or 50%-span length, which are also highly heritable traits.

A number of factors are likely contributing to the heritability remaining largely unexplained for these 10 traits. Even though the linkage map consisted of 841 SSR and SNP loci, we estimated that about 50% of the tetraploid cotton genome was not evaluated in the QTL analysis. Such a large portion of the genome is likely to harbor additional QTLs, but some of these QTLs could be interspersed among large blocks of IBD. In addition, a sample size of only 95 RILs does not provide sufficient statistical power to repeatedly identify QTLs with small to intermediate effects (Xu, 2003). An additional 22 QTLs with relatively weaker effects were detected for these traits at a less stringent experiment-wise type I error rate of 20% (Supplementary Table S2), implying that these traits have a polygenic inheritance and are likely more suitable for genomic prediction models (Gore, unpublished data).
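The overestimation of QTL effects in small populations (the Beavis effect) can be illustrated with a small simulation. The sketch below uses single-marker regression on one simulated locus rather than ICIM, and the true effect size, LOD threshold, number of simulations, and random seed are all hypothetical choices made for illustration.

```python
import math
import random


def squared_correlation(x, y):
    """Squared Pearson correlation between two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    if sxx == 0.0 or syy == 0.0:
        return 0.0
    return (sxy * sxy) / (sxx * syy)


def simulate_beavis(n_lines=95, true_pve=0.05, lod_threshold=3.0,
                    n_sims=2000, seed=1):
    """Simulate one QTL in RILs and average its estimated PVE when detected.

    Returns (number of significant replicates, mean estimated PVE among them).
    """
    rng = random.Random(seed)
    effect = math.sqrt(true_pve)           # QTL variance equals true_pve
    resid_sd = math.sqrt(1.0 - true_pve)   # total phenotypic variance is 1
    significant = []
    for _ in range(n_sims):
        # RIL genotypes are homozygous for one parental allele or the other.
        geno = [rng.choice((-1.0, 1.0)) for _ in range(n_lines)]
        pheno = [effect * g + rng.gauss(0.0, resid_sd) for g in geno]
        r2 = min(squared_correlation(geno, pheno), 1.0 - 1e-12)
        # LOD score of a single-marker regression expressed through r^2.
        lod = -(n_lines / 2.0) * math.log10(1.0 - r2)
        if lod >= lod_threshold:
            significant.append(r2)
    mean_pve = sum(significant) / len(significant) if significant else float("nan")
    return len(significant), mean_pve


if __name__ == "__main__":
    n_sig, mean_pve = simulate_beavis()
    print(n_sig, mean_pve)
```

Because only replicates that exceed the LOD threshold are retained, every retained estimate of PVE is larger than the true value when the true effect is small, which is the selection bias that inflates reported QTL effects in populations of roughly 100 lines.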
The problem of missing heritability will need to be addressed through the construction and evaluation of larger mapping populations for cotton in combination with higher-coverage linkage maps. Irrespective of these statistical limitations, the 28 identified QTLs still enhance concerted efforts for genomics-assisted selection in Upland cotton, but the true novelty of these QTLs will need to be assessed through a comprehensive meta-analysis of QTLs for agronomic and fiber quality in multiple cotton RIL populations (J. Zhang, unpublished data).

Conclusions

The construction of high-density linkage maps for genome-wide QTL analysis in intraspecific cotton populations has long been a formidable challenge. The implementation of a GBS method combined with fluorescence-based SSR genotyping enabled the construction of a linkage map with 841 SSR and SNP loci that covered half of the tetraploid cotton genome, which in turn enabled the identification of favorable QTL alleles that could be valuable for the genetic improvement of Upland cotton. However, modification of the implemented GBS method is likely needed to achieve a higher degree of SNP marker saturation in intraspecific cotton populations. Such modification could include the selection of more appropriate restriction enzymes for GBS in G. hirsutum that will lead to a higher frequency and more uniform distribution of SNP markers among chromosomes. This could be accomplished through an in silico digestion of the diploid and eventual tetraploid cotton genome sequences, an approach that has been effective for maize and soybean (Elshire et al., 2011; Gore et al., 2009; Varala et al., 2011). In addition, the sequence variant-calling pipeline can be enhanced to simultaneously discover and score presence/absence (dominant) and insertion/deletion (indel) markers. With a higher density map on a genome-wide level, it will then be possible to more comprehensively exploit the value of introgressed mapping populations for the transfer of novel variation from G. barbadense or wild G. hirsutum lines to Upland cotton breeding programs. However, such a strategy needs to be combined with a powerful mating design such as nested association mapping (McMullen et al., 2009), which will permit the genetic architecture of complex traits to be dissected at an unprecedented level and further strengthen the foundation for genomics-assisted selection in Upland cotton.

Acknowledgements

We thank past members of the Percy, Zhang, and Cantrell laboratories for their assistance in phenotypic data collection and members of the Gore, Fang, and Poland laboratories for DNA isolation, SSR genotyping, and GBS library construction. This work was supported by the USDA-ARS and Cotton Incorporated. Mention of trade names or commercial products in this publication is solely for the purpose of providing specific information and does not imply recommendation or endorsement by the USDA. The USDA is an equal opportunity provider and employer.

References

Anderson, M., and C.T. Braak Permutation tests for multi-factorial analysis of variance. J. Stat. Comput. Sim. 73:
Beavis, W.D QTL analyses: power, precision, and accuracy, p , In A. H. Paterson, ed. Molecular dissection of complex traits. CRC Press, New York.
Blenda, A., D.D. Fang, J.F. Rami, O. Garsmeur, F. Luo, and J.M. Lacape A high density consensus genetic map of tetraploid cotton that integrates multiple component maps through molecular marker redundancy check. PLoS ONE 7:e
Brubaker, C.L., F.M. Bourland, and J.F. Wendel The origin and domestication of cotton, p. 3-32, In C. W. Smith and J. T. Cothren, eds. Cotton: Origin, history, technology, and production. Wiley & Sons, New York.
Byers, R., D. Harker, S. Yourstone, P. Maughan, and J. Udall Development and mapping of SNP assays in allotetraploid cotton. Theor. Appl. Genet. 124:
Cantrell, R.G., and D.D. Davis Registration of NM24016, an interspecific-derived cotton genetic stock. Crop Sci. 40:1208
Churchill, G.A., and R.W. Doerge Empirical threshold values for quantitative trait mapping. Genetics 138:
deVicente, M.C., and S.D. Tanksley QTL analysis of transgressive segregation in an interspecific tomato cross. Genetics 134:
Elshire, R.J., J.C. Glaubitz, Q. Sun, J.A. Poland, K. Kawamoto, E.S. Buckler, and S.E. Mitchell A robust, simple genotyping-by-sequencing (GBS) approach for high diversity species. PLoS ONE 6:e
Fang, D., L. Hinze, R. Percy, P. Li, D. Deng, and G. Thyssen A microsatellite-based genome-wide analysis of genetic diversity and linkage disequilibrium in Upland cotton (Gossypium hirsutum L.) cultivars from major cotton-growing countries. Euphytica 191:
Fang, D.D., and J.Z. Yu Addition of 455 microsatellite marker loci to the high-density Gossypium hirsutum TM-1 x G. barbadense 3-79 genetic map. J. Cotton Sci. 16:
Fang, D.D., J. Xiao, P.C. Canci, and R.G. Cantrell A new SNP haplotype associated with blue disease resistance gene in cotton (Gossypium hirsutum L.). Theor. Appl. Genet. 120:
Fryxell, P.A A redefinition of the tribe Gossypieae. Bot. Gaz. 129:
Fryxell, P.A Phenetic analysis and the phylogeny of the diploid species of Gossypium L. (Malvaceae). Evolution 25:
Fryxell, P.A A revised taxonomic interpretation of Gossypium L. (Malvaceae). Rheedea 2:
Gilmour, A.R., B. Gogel, B. Cullis, R. Thompson, D. Butler, M. Cherry, D. Collins, D. Dutkowski, S. Harding, and K. Haskard ASReml user guide release 3.0. VSN International Ltd, Hemel Hempstead, UK.
Gore, M.A., R.G. Percy, J. Zhang, D.D. Fang, and R.G. Cantrell Registration of the TM-1/NM24016 cotton recombinant inbred mapping population. J. Plant Reg. 6:
Gore, M.A., J.M. Chia, R.J. Elshire, Q. Sun, E.S. Ersoz, B.L. Hurwitz, J.A. Peiffer, M.D. McMullen, G.S. Grills, J. Ross-Ibarra, D.H. Ware, and E.S. Buckler A first-generation haplotype map of maize. Science 326:
Holland, J.B., W.E. Nyquist, and C.T. Cervantes-Martínez Estimating and interpreting heritability for plant breeding: An update. Plant Breed. Rev. 22:
Hung, H.Y., C. Browne, K. Guill, N. Coles, M. Eller, A. Garcia, N. Lepak, S. Melia-Hancock, M. Oropeza-Rosas, S. Salvo, N. Upadyayula, E.S. Buckler, S. Flint-Garcia, M.D. McMullen, T.R. Rocheford, and J.B. Holland The relationship between parental genetic or phenotypic divergence and progeny variation in the maize nested association mapping population. Heredity 108:
Jiang, C.-X., P.W. Chee, X. Draye, P.L. Morrell, C.W. Smith, and A.H. Paterson Multilocus interactions restrict gene introgression in interspecific populations of polyploid Gossypium (Cotton). Evolution 54:
Karlin, S., and S.F. Altschul Methods for assessing the statistical significance of molecular sequence features by using general scoring schemes. Proc. Natl. Acad. Sci. USA 87:
Kohel, R.J., T.R. Richmond, and C.F. Lewis Texas Marker-1. Description of a genetic standard for Gossypium hirsutum L. Crop Sci. 10:
Kosambi, D.D The estimation of map distances from recombination values. Ann. Eugen. 12:
Kutner, M.H., C.J. Nachtsheim, J. Neter, and W. Li Applied linear statistical models. 4th Edition. McGraw-Hill, Boston, MA.
Li, H., G. Ye, and J. Wang A modified algorithm for the improvement of composite interval mapping. Genetics 175:
Lin, Z., D. Yuan, and X. Zhang Mapped SSR markers unevenly distributed on the cotton chromosomes. Frontiers of Agriculture in China 4:
Lin, Z., Y. Zhang, X. Zhang, and X. Guo A high-density integrative linkage map for Gossypium hirsutum. Euphytica 166:
Littell, R.C., G.A. Milliken, W.W. Stroup, and R. Wolfinger SAS system for mixed models. SAS Publishing, Cary, NC.
Lu, Y., J. Curtiss, R.G. Percy, S.E. Hughs, S. Yu, and J. Zhang DNA polymorphisms of genes involved in fiber development in a selected set of cultivated tetraploid cotton. Crop Sci. 49:
May, O.L., D.T. Bowman, and D.S. Calhoun Genetic diversity of U.S. Upland cotton cultivars released between 1980 and Crop Sci. 35:
McMullen, M.D., S. Kresovich, H.S. Villeda, P. Bradbury, H. Li, Q. Sun, S. Flint-Garcia, J. Thornsberry, C. Acharya, C. Bottoms, P. Brown, C. Browne, M. Eller, K. Guill, C. Harjes, D. Kroon, N. Lepak, S.E. Mitchell, B. Peterson, G. Pressoir, S. Romero, M. Oropeza Rosas, S. Salvo, H. Yates, M. Hanson, E. Jones, S. Smith, J.C. Glaubitz, M. Goodman, D. Ware, J.B. Holland, and E.S. Buckler Genetic properties of the maize nested association mapping population. Science 325:
Meredith, W.R., Jr Cotton yield progress - why has it reached a plateau? Better Crops 84:6-9.
Paterson, A.H., C.L. Brubaker, and J.F. Wendel A rapid method for extraction of cotton (Gossypium spp.) genomic DNA suitable for RFLP or PCR analysis. Plant Mol. Biol. Rep. 11:
Paterson, A.H., R.K. Boman, S.M. Brown, P.W. Chee, J.R. Gannaway, A.R. Gingle, O.L. May, and C.W. Smith Reducing the genetic vulnerability of cotton. Crop Sci. 44:
Paterson, A.H., J.F. Wendel, H. Gundlach, H. Guo, J. Jenkins, D. Jin, D. Llewellyn, K.C. Showmaker, S. Shu, J. Udall, M.J. Yoo, R. Byers, W. Chen, A. Doron-Faigenboim, M.V. Duke, L. Gong, J. Grimwood, C. Grover, K. Grupp, G. Hu, T.H. Lee, J. Li, L. Lin, T. Liu, B.S. Marler, J.T. Page, A.W. Roberts, E. Romanel, W.S. Sanders, E. Szadkowski, X. Tan, H. Tang, C. Xu, J. Wang, Z. Wang, D. Zhang, L. Zhang, H. Ashrafi, F. Bedon, J.E. Bowers, C.L. Brubaker, P.W. Chee, S. Das, A.R. Gingle, C.H. Haigler, D. Harker, L.V. Hoffmann, R. Hovav, D.C. Jones, C. Lemke, S. Mansoor, M. ur Rahman, L.N. Rainville, A. Rambani, U.K. Reddy, J.K. Rong, Y. Saranga, B.E. Scheffler, J.A. Scheffler, D.M. Stelly, B.A. Triplett, A. Van Deynze, M.F. Vaslin, V.N. Waghmare, S.A. Walford, R.J. Wright, E.A. Zaki, T. Zhang, E.S. Dennis, K.F. Mayer, D.G. Peterson, D.S. Rokhsar, X. Wang, and J. Schmutz Repeated polyploidization of Gossypium genomes and the evolution of spinnable cotton fibres. Nature 492:
Percy, R.G., R.G. Cantrell, and J. Zhang Genetic variation for agronomic and fiber properties in an introgressed recombinant inbred population of cotton. Crop Sci. 46:
Poland, J.A., P.J. Brown, M.E. Sorrells, and J.L. Jannink Development of high-density genetic maps for barley and wheat using a novel two-enzyme genotyping-by-sequencing approach. PLoS ONE 7:e
Reinisch, A.J., J.M. Dong, C.L. Brubaker, D.M. Stelly, J.F. Wendel, and A.H. Paterson A detailed RFLP map of cotton, Gossypium hirsutum x Gossypium barbadense: chromosome organization and evolution in a disomic polyploid genome. Genetics 138:
SAS Institute The SAS system for Windows. Release 9.3. SAS Inst., Cary, NC.
Shen, X., W. Guo, Q. Lu, X. Zhu, Y. Yuan, and T. Zhang Genetic mapping of quantitative trait loci for fiber quality and yield trait by RIL approach in Upland cotton. Euphytica 155:
Stephens, S.G The cytogenetics of speciation in Gossypium. I. Selective elimination of the donor parent genotype in interspecific backcrosses. Genetics 34:
Stephens, S.G The internal mechanism of speciation in Gossypium. Bot. Rev. 16:
Tatineni, V., R.G. Cantrell, and D.D. Davis Genetic diversity in elite cotton germplasm determined by morphological characteristics and RAPDs. Crop Sci. 36:
Ulloa, M., W.R. Meredith Jr, Z.W. Shappley, and A.L. Kahler RFLP genetic linkage maps from four F2.3 populations and a joinmap of Gossypium hirsutum L. Theor. Appl. Genet. 104:
Van Deynze, A., K. Stoffel, M. Lee, T. Wilkins, A. Kozik, R. Cantrell, J. Yu, R. Kohel, and D. Stelly Sampling nucleotide diversity in cotton. BMC Plant Biology 9:125.
Van Esbroeck, G.A., D.T. Bowman, D.S. Calhoun, and O.L. May Changes in the genetic diversity of cotton in the USA from 1970 to Crop Sci. 38:
Kyazma B.V JoinMap 4.0: Software for the calculation of genetic linkage maps in experimental populations. Kyazma B.V., Wageningen, The Netherlands.
Varala, K., K. Swaminathan, Y. Li, and M.E. Hudson Rapid genotyping of soybean cultivars using high throughput sequencing. PLoS ONE 6:e
Wang, K., B. Guan, W. Guo, B. Zhou, Y. Hu, Y. Zhu, and T. Zhang Completely distinguishing individual A-genome chromosomes and their karyotyping analysis by multiple bacterial artificial chromosome fluorescence in situ hybridization. Genetics 178:
Wendel, J.F., and R.C. Cronn Polyploidy and the evolutionary history of cotton. Adv. Agron. 78:
Wendel, J.F., C.L. Brubaker, and A.E. Percival Genetic diversity in Gossypium hirsutum and the origin of Upland cotton. Am. J. Bot. 79:
Wendel, J.F., C. Brubaker, I. Alvarez, R. Cronn, and J.M. Stewart Evolution and natural history of the cotton genus, p. 3-22, In A. H. Paterson, ed. Genetics and genomics of cotton, Vol. 3. Springer US.
Xu, S Theoretical basis of the Beavis effect. Genetics 165:
Yu, J., J.B. Holland, M.D. McMullen, and E.S. Buckler Genetic design and statistical power of nested association mapping in maize. Genetics 178:
Zhang, K., J. Zhang, J. Ma, S. Tang, D. Liu, Z. Teng, D. Liu, and Z. Zhang Genetic mapping and quantitative trait locus analysis of fiber quality traits using a three-parent composite population in Upland cotton (Gossypium hirsutum L.). Mol. Breeding 29:
Zhang, Z.-S., M.-C. Hu, J. Zhang, D.-J. Liu, J. Zheng, K. Zhang, W. Wang, and Q. Wan Construction of a comprehensive PCR-based marker linkage map and QTL mapping for fiber quality traits in upland cotton (Gossypium hirsutum L.). Mol. Breeding 24:

Table 1. Distribution of 841 simple-sequence repeat (SSR) and single-nucleotide polymorphism (SNP) marker loci among linkage groups (LGs) and the 26 tetraploid cotton chromosomes.

Chromosome | Total | No. SSRs | No. SNPs | No. LGs | Length, cM

Table 2. Means and ranges of BLUPs for 10 traits evaluated within the TM-1 × NM24016 population, midparent values, and estimated broad-sense heritabilities on an individual plot basis ($\hat{H}^2_p$) and line-mean basis ($\hat{H}^2_l$) with their standard errors (S.E.) in four summer environments: Maricopa, AZ, and Las Cruces, NM, across two years.

Traits | RIL population: Mean, Range | Parents: NM24016, TM-1, Midparent | Heritabilities: $\hat{H}^2_p$, S.E. ($\hat{H}^2_p$), $\hat{H}^2_l$, S.E. ($\hat{H}^2_l$)

Boll size (g boll⁻¹)
Lint percentage (%)
Lint yield (kg ha⁻¹)
Plant height (m)
Micronaire (unit)
Fiber elongation (%)
Fiber strength (kN m kg⁻¹)
%-Span length (mm)
%-Span length (mm)
Length uniformity (%)

Table 3. Summary of inclusive composite interval mapping (ICIM) of 10 agronomic and fiber traits within the TM-1 × NM24016 population at an experiment-wise type I error rate of 5%.

Trait | Chr. | LG | Peak position (cM) | Left marker | Right marker | Peak LOD | PVE (%) | Additive effect #

Boll size SNP0035 SNP
Lint yield DPL0507 DPL
Fiber strength DPL0570 DPL
%-Span length SNP0189 SNP
Length uniformity DPL1470b DPL
Length uniformity SNP0189 SNP

Chr., chromosome.
LG, linkage group.
Peak LOD, the logarithm of odds (LOD) value at the position of peak likelihood of the QTL.
PVE, phenotypic variance explained by each QTL.
# Additive effect when substituting a NM24016 allele with an allele from TM-1.

34 Page 34 of 62 Markers Chr. Pos. (cm) Chr. Pos. (cm) Chr.of G.h. Chr.of G.r. Supplementary Table 1. Comparison of the TM-1 NM24016 linkage map with the TM linkage map and the G. raimondii (D5 diploid) genome sequence. TM-1 NM24016 map TM map G. raimondii genome sequence start (bp) end (bp) E-value Marker sequence Marker sequence SNP0373 c c15 or c01 Chr02 7,701,497 7,701, E-18 SNP0448 c c15 or c01 Chr02 9,032,162 9,032, E-23 SNP0228 c no hit SNP0229 c no hit SNP0381 c no hit SNP0282 c c15 or c01 Chr02 8,962,055 8,962, E-22 DPL1268 c c c15 or c01 Chr02 9,141,144 9,141,781 0 SNP0236 c no hit DPL1217 c c c21 or c11 Chr07 16,238,293 16,238, E-18 SNP0362 c c15 or c01 Chr02 10,239,454 10,239, E-22 SHIN-1487 c c c15 or c01 Chr02 10,243,897 10,243,031 0 TMB0283 c c c10 or c20 Chr11 42,477,187 42,477, E-23 SNP0322 c c15 or c01 Chr02 11,763,034 11,762, E-22 SNP0192 c c15 or c01 Chr02 12,326,437 12,326, E-12 SNP0347 c no hit DPL0053 c c c15 or c01 Chr02 12,855,359 12,855, E-101 SNP0225 c c15 or c01 Chr02 12,859,705 12,859, E-20 SHIN-0602 c c c15 or c01 Chr02 13,623,543 13,623, E-144 DPL1673 c c c15 or c01 Chr02 13,624,212 13,623,650 0 DPL0094 c c c15 or c01 Chr02 13,623,650 13,624,212 0 SNP0366 c c15 or c01 Chr02 16,155,013 16,154, E-22 SHIN-1397 c c c15 or c01 Chr02 60,866,521 60,866,133 0 DPL1470a c c c15 or c01 Chr02 60,804,061 60,803,223 0 COT064 c c c17 or c02 Chr03 5,362,535 5,362, E-162 DC20076 c c15 or c01 Chr02 9,764,930 9,765, E-110 C2-048 c c c17 or c02 Chr03 19,726,486 19,726, E-56 JESPR101b c c c17 or c02 Chr03 18,634,900 18,635, E-83

35 Page 35 of 62 BNL3590b c c c17 or c02 Chr03 18,635,039 18,634, E-117 SHIN-0129b c c no hit DC40319b c c c17 or c02 Chr03 18,884,157 18,884,656 0 TMB0471 c c c17 or c02 Chr03 25,194,827 25,194, E-163 DPL0200 c c c17 or c02 Chr03 25,195,019 25,194,487 0 DPL0689 c c c14 or c03 Chr05 1,639,423 1,638,972 0 JESPR156 c c c14 or c03 Chr05 1,639,306 1,639,984 0 SHIN-1400 c c c14 or c03 Chr05 62,638,966 62,638,551 0 SNP0123 c no hit SNP0122 c no hit SNP0212 c no hit SNP0421 c no hit SNP0023 c no hit C2-037 c c c14 or c03 Chr05 60,635,841 60,636, E-116 MUCS407 c c c14 or c03 Chr05 58,424,182 58,423, E-92 SNP0001 c c14 or c03 Chr05 59,218,770 59,218, E-22 BNL4017 c c c14 or c03 Chr05 57,766,214 57,765, E-141 SNP0084 c c14 or c03 Chr05 57,452,950 57,452, E-22 SNP0068 c c14 or c03 Chr05 55,926,712 55,926, E-24 BNL0226 c c c14 or c03 Chr05 53,481,455 53,481, E-103 SNP0286 c c14 or c03 Chr05 42,401,515 42,401, E-19 MUSS172 c c c14 or c03 Chr05 46,899,710 46,899,092 0 DPL0170 c c no hit CIR209 c c24 or c08 Chr04 36,050,218 36,050, E-39 SNP0148 c c14 or c03 Chr05 39,145,142 39,145, E-24 SNP0130 c c14 or c03 Chr05 40,805,601 40,805, E-24 SNP0025 c c14 or c03 Chr05 33,057,597 33,057, E-22 SNP0232 c no hit SNP0231 c no hit SNP0024 c no hit SNP0163 c no hit DPL1154b c no hit

36 Page 36 of 62 SNP0117 c no hit BNL3267 c c c14 or c03 Chr05 26,854,396 26,854,958 0 DPL1154a c c no hit BNL3423 c no hit DPL1071 c c c14 or c02 Chr03 33,510,216 33,509, E-131 CIR293 c c26 or c12 Chr08 51,078,897 51,078, E-71 TMB1748 c c c04 or c22 Chr12 2,112,141 2,111, E-175 SHIN-1261 c c c14 or c03 Chr05 14,933,877 14,934,558 0 CIR030 c c c17 or c02 Chr03 10,146,362 10,146, E-16 SHIN-0473 c c c14 or c02 Chr03 10,146,362 10,146, E-17 UCD195 c c no sequence SHIN-0690b c c c14 or c02 Chr03 42,319,594 42,318,755 0 NAU1167b c c c17 or c02 Chr03 42,319,605 42,318,868 0 CIR347 c c c17 or c02 Chr03 42,298,424 42,298, E-108 SHIN-1343 c c c17 or c02 Chr03 43,182,864 43,183,516 0 SNP0266 c no hit NAU2291 c c c22 or c04 Chr12 32,304,399 32,304,006 0 NAU2162 c c c22 or c04 Chr12 32,304,428 32,303,887 0 DPL0750 c c c19 or c05 Chr09 68,246,597 68,246, E-24 SHIN-0008 c c c22 or c04 Chr12 7,399,102 7,399, E-173 DPL1467 c c c22 or c04 Chr12 7,398,762 7,399,526 0 SHIN-0396 c c no hit JESPR042 c c c19 or c05 Chr09 31,364,925 31,365, E-54 DPL0344 c c c19 or c05 Chr09 43,771,785 43,771, E-83 SHIN-0460 c c c19 or c05 Chr09 43,771,681 43,771,056 0 SNP0354 c c19 or c05 Chr09 17,982,677 17,982, E-24 SNP0077 c no hit SNP0182 c no hit SNP0155 c no hit CIR152 c c c19 or c05 Chr09 18,224,783 18,225,256 0 SNP0197 c no hit SNP0017 c c19 or c05 Chr09 18,227,899 18,227, E-16

37 Page 37 of 62 SHIN-0090 c c19 or c05 Chr09 17,620,632 17,621,474 0 MGHES-021 c c c19 or c05 Chr09 11,756,285 11,756, E-85 DPL0145 c c c19 or c05 Chr09 11,756,790 11,755,993 0 NAU1042 c c c19 or c05 Chr09 12,852,595 12,853,251 0 SHIN-0289 c c no hit NAU1221 c c c19 or c05 Chr09 12,852,584 12,853, E-156 SNP0417 c c25 or c06 Chr10 58,838,332 58,838, E-22 SNP0191 c c25 or c06 Chr10 58,704,833 58,704, E-18 BNL3594 c c c25 or c06 Chr10 60,792,592 60,792, E-151 SNP0369 c c25 or c06 Chr10 59,682,599 59,682, E-22 SNP0325 c c25 or c06 Chr10 56,362,837 56,362, E-18 SHIN-0962 c c c25 or c06 Chr10 56,085,503 56,084,901 0 SHIN-0706 c c c25 or c06 Chr10 55,989,571 55,990,132 0 BNL3650 c c c25 or c06 Chr10 55,621,518 55,621, E-45 SNP0071 c c25 or c06 Chr10 57,233,842 57,233, E-25 BNL2884 c c25 or c06 Chr10 57,271,208 57,271, E-141 SNP0230 c no hit SNP0014 c c25 or c06 Chr10 56,386,092 56,386, E-21 SNP0095 c c25 or c06 Chr10 58,435,732 58,435, E-24 SNP0139 c c25 or c06 Chr10 57,527,939 57,528, E-22 SNP0121 c c25 or c06 Chr10 60,283,825 60,283, E-22 SNP0030 c c25 or c06 Chr10 60,637,347 60,637, E-21 SNP0072 c c25 or c06 Chr10 60,788,759 60,788, E-24 SNP0106 c no hit SNP0489 c no hit SNP0447 c c25 or c06 Chr10 15,740,554 15,740, E-18 BNL1440b c c no hit TMB2303 c c c05 or c19 Chr09 25,100,955 25,100, E-14 DPL0153 c c c25 or c06 Chr10 17,067,950 17,068, E-178 SNP0479 c c25 or c06 Chr10 12,310,599 12,310, E-20 DPL0080b c c c25 or c06 Chr10 11,215,332 11,214,824 0 SNP0404 c no hit

38 Page 38 of 62 SNP0364 c c15 or c01 Chr02 8,172,458 8,172, E-19 SNP0154 c no hit SNP0070 c c25 or c06 Chr10 8,230,494 8,230, E-25 DPL0238 c c c25 or c06 Chr10 4,006,495 4,006, E-180 DPL1525 c c c25 or c06 Chr10 4,006,032 4,006,500 0 SNP0414 c c17 or c02 Chr03 40,621,461 40,621, E-21 SNP0146 c c25 or c06 Chr10 2,313,086 2,313, E-16 BNL2569 c c c25 or c06 Chr10 2,215,706 2,215,168 0 SNP0426 c c25 or c06 Chr10 2,105,304 2,105, E-24 SNP0132 c c25 or c06 Chr10 22,820,096 22,820, E-18 SNP0168 c c25 or c06 Chr10 22,758,200 22,758, E-24 SNP0129 c c25 or c06 Chr10 20,093,718 20,093, E-25 SNP0135 c c25 or c06 Chr10 19,216,541 19,216, E-25 SNP0044 c no hit SNP0032 c c25 or c06 Chr10 17,730,212 17,730, E-22 SNP0033 c c25 or c06 Chr10 17,702,391 17,702, E-17 SNP0039 c c25 or c06 Chr10 17,037,457 17,037, E-25 SNP0083 c c25 or c06 Chr10 15,804,574 15,804, E-24 SNP0022 c c25 or c06 Chr10 11,990,579 11,990, E-20 SNP0021 c c25 or c06 Chr10 11,990,579 11,990, E-20 SNP0004 c c25 or c06 Chr10 12,674,571 12,674, E-24 SNP0028 c c25 or c06 Chr10 10,451,524 10,451, E-24 SNP0086 c c25 or c06 Chr10 10,227,738 10,227, E-25 MUSS013 c c c16 or c07 Chr01 443, , E-55 SNP0436 c no hit DPL0790 c c15 or c01 Chr02 2,275,712 2,275,125 0 SNP0193 c c15 or c01 Chr02 2,563,038 2,562, E-25 CIR238 c c c16 or c07 Chr01 2,376,935 2,377, E-133 DPL1318 c c c16 or c07 Chr01 13,046,580 13,046, E-70 COT019 c c c16 or c07 Chr01 13,196,257 13,196, E-60 BNL3319 c c c16 or c07 Chr01 19,894,880 19,895,426 0 SHIN-1244 c c15 or c01 Chr02 28,646,476 28,646, E-29

39 Page 39 of 62 BNL1395b c c c16 or c07 Chr01 29,668,797 29,669,191 0 BNL1694 c c c16 or c07 Chr01 27,263,211 27,263, E-102 TMB2844 c c c01 or c15 Chr02 21,360,998 21,361, E-60 DC30109b c no sequence DC20084 c c c16 or c07 Chr01 29,283,349 29,283, E-12 TMB0561 c c c01 or c15 Chr02 21,361,204 21,360, E-60 DPL0013b c c c16 or c07 Chr01 29,669,254 29,668,461 0 BNL1122 c c c16 or c07 Chr01 29,669,201 29,668,797 0 SNP0296 c no hit TMB1888 c c03 or c14 Chr05 42,062,306 42,062, E-46 DC30012 c no sequence DPL0119 c c c24 or c08 Chr04 16,689,623 16,689, E-12 C2-003 c c c24 or c08 Chr04 1,432,227 1,431, E-116 SNP0200 c c24 or c08 Chr04 1,488,993 1,489, E-17 SNP0309 c c24 or c08 Chr04 1,873,571 1,873, E-22 SNP0209 c no hit SNP0471 c c24 or c08 Chr04 2,251,644 2,251, E-19 CIR278 c c c24 or c08 Chr04 5,685,979 5,686, E-163 SNP0262 c no hit SNP0263 c no hit SNP0303 c c24 or c08 Chr04 5,797,210 5,797, E-19 SNP0424 c no hit SNP0423 c no hit SNP0422 c no hit SNP0460 c c24 or c08 Chr04 8,934,151 8,934, E-18 DPL0111 c c c24 or c08 Chr04 9,164,707 9,164,106 0 CM0043 c c c24 or c08 Chr04 9,164,106 9,164, E-103 SNP0462 c c24 or c08 Chr04 8,934,151 8,934, E-19 SNP0319 c no hit SNP0335 c no hit SNP0461 c c24 or c08 Chr04 8,934,151 8,934, E-18 SNP0205 c no hit

40 Page 40 of 62 SNP0439 c c24 or c08 Chr04 29,145,623 29,145, E-20 SNP0201 c c24 or c08 Chr04 11,691,279 11,691, E-24 SNP0218 c c24 or c08 Chr04 10,205,167 10,205, E-14 SNP0432 c c24 or c08 Chr04 9,607,008 9,607, E-22 DPL0862 c c c24 or c08 Chr04 10,206,881 10,206,124 0 SHIN-1304b c c c24 or c08 Chr04 33,294,105 33,294,587 0 SNP0251 c no hit BNL1017 c c c24 or c08 Chr04 34,972,423 34,972, E-82 SNP0318 c no hit DPL0457 c c c24 or c08 Chr04 11,711,549 11,711, E-72 BNL3257 c c c15 or c01 Chr02 47,569,551 47,569, E-37 BNL3792 c c c24 or c08 Chr04 36,050,218 36,049, E-124 DPL0357 c c24 or c08 Chr04 34,294,001 34,293, E-132 SNP0280 c c24 or c08 Chr04 21,927,016 21,926, E-19 SNP0194 c no hit SNP0195 c no hit DC30102 c no sequence TMB1692 c c no hit no hit SNP0214 c no hit SNP0215 c no hit MUSB0442 c c c24 or c08 Chr04 11,409,518 11,409, E-98 DPL0877b c c c24 or c08 Chr04 28,574,777 28,574, E-64 SNP0477 c no hit SNP0478 c no hit DPL0113 c c c24 or c08 Chr04 12,132,905 12,133,545 0 DPL0030 c c no hit SHIN-1199 c no hit SNP0245 c no hit SNP0446 c no hit SNP0410 c c24 or c08 Chr04 23,721,860 23,721, E-22 SNP0371 c no hit SNP0409 c c24 or c08 Chr04 14,152,860 14,152, E-24 SNP0295 c c24 or c08 Chr04 15,115,214 15,115, E-22

41 Page 41 of 62 SNP0306 c c24 or c08 Chr04 36,874,709 36,874, E-20 DPL0686 c c24 or c08 Chr04 32,283,703 32,283, E-85 BNL0387 c c c24 or c08 Chr04 36,040,847 36,041, E-133 DPL0755 c c c24 or c08 Chr04 13,053,904 13,053,212 0 COT035 c c c24 or c08 Chr04 38,654,120 38,653,668 0 BNL3474 c c c24 or c08 Chr04 38,654,329 38,653,712 0 BNL3800 c c24 or c08 Chr04 39,594,686 39,594, E-79 SNP0332 c no hit DPL0133 c c c24 or c08 Chr04 40,567,738 40,568,451 0 SNP0415 c c24 or c08 Chr04 43,297,270 43,297, E-20 DPL1609 c c23 or c09 Chr06 29,019,864 29,019, E-20 SNP0359 c c24 or c08 Chr04 45,187,921 45,187, E-24 TMB1640 c c c06 or c25 Chr10 53,355,200 53,355, E-52 SNP0379 c c24 or c08 Chr04 44,807,468 44,807, E-19 SNP0459 c no hit C2-038 c c c23 or c09 Chr06 19,029,964 19,030, E-23 BNL3638 c c24 or c08 Chr04 49,555,334 49,554, E-144 TMB0834 c c c13 or c18 Chr13 17,204,933 17,204, E-19 SNP0125 c no hit MUSS398b c c23 or c09 Chr06 50,375,880 50,376,300 0 SNP0233 c no hit SNP0234 c no hit NAU2354 c c c23 or c09 Chr06 50,233,051 50,233, E-160 SHIN-1542 c c c23 or c09 Chr06 50,233,951 50,233,553 0 DPL0530b c c23 or c09 Chr06 50,050,707 50,051,300 0 SNP0165 c c23 or c09 Chr06 49,311,729 49,311, E-12 SNP0093 c c23 or c09 Chr06 49,320,032 49,320, E-24 SNP0092 c c23 or c09 Chr06 47,112,024 47,111, E-23 SNP0003 c c23 or c09 Chr06 46,607,526 46,607, E-24 SNP0007 c no hit SHIN-0817 c c c23 or c09 Chr06 41,841,751 41,840,933 0 SNP0190 c no hit

42 Page 42 of 62 DPL0150a c c c23 or c09 Chr06 43,132,445 43,131,773 0 SNP0221 c c23 or c09 Chr06 41,866,784 41,866, E-24 SNP0470 c no hit DPL1144 c c c23 or c09 Chr06 42,781,879 42,782, E-171 SNP0355 c c23 or c09 Chr06 42,644,695 42,644, E-22 SNP0412 c c23 or c09 Chr06 43,012,841 43,012, E-24 SNP0283 c c23 or c09 Chr06 42,851,709 42,851, E-16 SNP0407 c no hit SNP0290 c no hit DPL0745 c c c23 or c09 Chr06 40,877,086 40,877,538 0 MUCS080 c c c23 or c09 Chr06 40,594,319 40,593,917 0 TMB2483 c c c12 or c26 Chr08 8,956,449 8,956, E-40 SNP0237 c c23 or c09 Chr06 40,345,853 40,345, E-22 SNP0370 c c23 or c09 Chr06 40,273,972 40,274, E-22 SNP0330 c c23 or c09 Chr06 40,350,840 40,350, E-22 SHIN-1641 c c no hit SNP0213 c c23 or c09 Chr06 35,188,985 35,189, E-20 DPL1052 c c c23 or c09 Chr06 35,698,374 35,698,993 0 SNP0480 c c23 or c09 Chr06 35,149,818 35,149, E-22 SNP0451 c c23 or c09 Chr06 34,525,169 34,525, E-24 JESPR274a c c c23 or c09 Chr06 21,079,086 21,079, E-44 DPL0175a c c c23 or c09 Chr06 13,542,501 13,542,027 0 DPL0679 c c c23 or c09 Chr06 7,973,048 7,972,353 0 SNP0199 c no hit SHIN-0779 c c20 or c10 Chr11 18,810,862 18,811,674 0 BNL3895 c c c20 or c10 Chr11 23,405,299 23,404, E-159 DPL0431 c c c20 or c10 Chr11 24,533,692 24,534, E-60 SNP0395 c no hit SNP0388 c c20 or c10 Chr11 26,744,578 26,744, E-20 TMB1288 c c c02 or c17 Chr03 34,703,378 34,703, E-18 SNP0052 c no hit SNP0034 c no hit

43 Page 43 of 62 STV123 c c c11 or c21 Chr07 41,284,170 41,284, E-19 SNP0094 c c20 or c10 Chr11 1,721,176 1,721, E-23 SNP0002 c c20 or c10 Chr11 1,773,276 1,773, E-20 SNP0284 c no hit DPL1550 c c c20 or c10 Chr11 2,998,306 2,998,742 0 SNP0126 c no hit SNP0138 c c20 or c10 Chr11 2,517,218 2,517, E-22 SNP0314 c c20 or c10 Chr11 2,782,861 2,782, E-25 STV031 c c10 or c20 Chr11 1,693,462 1,693, E-178 SNP0240 c c20 or c10 Chr11 1,720,671 1,720, E-25 SNP0445 c c20 or c10 Chr11 2,112,861 2,112, E-23 SNP0399 c c20 or c10 Chr11 1,740,630 1,740, E-25 SNP0398 c c20 or c10 Chr11 1,740,630 1,740, E-25 SNP0050 c c20 or c10 Chr11 890, , E-18 DPL0570 c c c21 or c11 Chr07 290, ,443 0 DPL1931 c c c21 or c11 Chr07 290, ,443 0 DPL0500a c c21 or c11 Chr07 20,038,097 20,038, E-13 DPL0522 c c c21 or c11 Chr07 1,859,817 1,859, E-166 DPL0252 c c26 or c12 Chr08 2,648,020 2,647,512 0 DPL1379 c c26 or c12 Chr08 2,647,512 2,648,024 0 DC30147a c c no sequence SNP0384 c c21 or c11 Chr07 4,296,580 4,296, E-24 DPL0863a c c c21 or c11 Chr07 4,421,772 4,422,215 0 SNP0222 c c21 or c11 Chr07 4,188,707 4,188, E-22 SNP0258 c c21 or c11 Chr07 3,800,446 3,800, E-18 SNP0219 c c21 or c11 Chr07 4,946,580 4,946, E-22 SNP0298 c c21 or c11 Chr07 5,389,064 5,389, E-24 SNP0320 c c21 or c11 Chr07 5,041,799 5,041, E-15 SNP0247 c c21 or c11 Chr07 5,961,560 5,961, E-24 BNL1034b c c c21 or c11 Chr07 5,461,346 5,461, E-122 SNP0270 c c21 or c11 Chr07 5,532,672 5,532, E-24 SNP0321 c c21 or c11 Chr07 6,027,689 6,027, E-21

BNL2589 c c21 or c11 Chr07 6,986,408 6,986,869 0
SNP0455 c c21 or c11 Chr07 6,285,457 6,285, E-22
SNP0484 c c21 or c11 Chr07 6,003,694 6,003, E-18
SNP0239 c c21 or c11 Chr07 6,004,606 6,004, E-16
TMB1387 c c c11 or c21 Chr07 6,263,882 6,263,490 0
SNP0186 c c21 or c11 Chr07 7,062,354 7,062, E-17
SNP0394 c c21 or c11 Chr07 7,020,771 7,020, E-25
SNP0393 c c21 or c11 Chr07 7,020,771 7,020, E-25
SNP0224 c no hit
SNP0216 c c21 or c11 Chr07 6,004,924 6,004, E-22
SNP0353 c c21 or c11 Chr07 6,712,706 6,712, E-24
MUCS028 c c c21 or c11 Chr07 5,465,395 5,464,787 0
SNP0360 c c21 or c11 Chr07 7,582,936 7,582, E-22
SNP0037 c no hit
SNP0339 c c21 or c11 Chr07 7,655,733 7,655, E-20
DPL0585 c c c21 or c11 Chr07 7,906,298 7,905,523 0
SNP0437 c c21 or c11 Chr07 8,331,345 8,331, E-23
SNP0317 c no hit
SNP0275 c no hit
SNP0276 c no hit
SNP0310 c c21 or c11 Chr07 7,961,313 7,961, E-19
SNP0311 c c21 or c11 Chr07 7,961,313 7,961, E-19
SNP0217 c c21 or c11 Chr07 8,344,306 8,344, E-25
SNP0438 c no hit
SNP0413 c c21 or c11 Chr07 9,037,401 9,037, E-22
BNL1151 c c c21 or c11 Chr07 10,091,988 10,091, E-104
TMB2281 c c c11 or c21 Chr07 9,563,459 9,563, E-106
SNP0269 c c21 or c11 Chr07 9,923,816 9,923, E-25
BNL3431 c c21 or c11 Chr07 28,048,086 28,047, E-107
DPL0050 c c21 or c11 Chr07 9,290,732 9,290,038 0
BNL0261 c c c21 or c11 Chr07 42,597,192 42,597, E-138
BNL1408 c c c21 or c11 Chr07 43,869,513 43,869, E-151
DPL1121 c c21 or c11 Chr07 46,966,665 46,966,002 0

DPL1846 c c21 or c11 Chr07 48,739,889 48,739,261 0
BNL3649b c c21 or c11 Chr07 48,640,875 48,641,337 0
SNP0380 c c21 or c11 Chr07 4,877,524 4,877, E-22
SNP0166 c no hit
SNP0167 c no hit
SNP0058 c c21 or c11 Chr07 56,838,742 56,838, E-20
SNP0140 c no hit
SNP0147 c c21 or c11 Chr07 52,379,009 52,379, E-22
SNP0179 c no hit
SNP0048 c c21 or c11 Chr07 56,712,880 56,712, E-15
SNP0042 c no hit
SNP0487 c c21 or c11 Chr07 54,303,588 54,303, E-14
SNP0486 c c21 or c11 Chr07 54,303,588 54,303, E-14
SNP0488 c c21 or c11 Chr07 54,303,588 54,303, E-14
SNP0164 c c21 or c11 Chr07 52,707,748 52,707, E-22
SNP0016 c c21 or c11 Chr07 53,650,746 53,650, E-25
SNP0053 c c21 or c11 Chr07 52,556,976 52,556, E-22
SNP0051 c c21 or c11 Chr07 52,687,166 52,687, E-22
SNP0096 c c21 or c11 Chr07 52,643,155 52,643, E-16
BNL4011b c c c21 or c11 Chr07 54,551,315 54,551, E-130
BNL1066 c c c21 or c11 Chr07 54,551,313 54,551, E-107
SNP0171 c no hit
SNP0172 c no hit
SNP0358 c no hit
SNP0357 c no hit
SNP0184 c no hit
SNP0183 c c21 or c11 Chr07 56,441,244 56,441, E-19
SNP0345 c c21 or c11 Chr07 55,238,538 55,238, E-19
SNP0344 c c21 or c11 Chr07 55,219,379 55,219, E-19
SNP0343 c c21 or c11 Chr07 55,219,379 55,219, E-19
SHIN-0966 c c c21 or c11 Chr07 57,314,428 57,315,015 0
SNP0153 c c21 or c11 Chr07 56,837,865 56,837, E-20

NAU2152 c c c21 or c11 Chr07 57,314,447 57,314,986 0
SNP0264 c c26 or c12 Chr08 2,681,305 2,681, E-22
MUSB1117 c c c26 or c12 Chr08 3,599,177 3,600,029 0
SHIN-0208 c c no hit
SHIN-1490 c c c26 or c12 Chr08 5,727,459 5,727, E-172
SNP0365 c c26 or c12 Chr08 5,362,606 5,362, E-20
BNL3261 c c c26 or c12 Chr08 4,336,936 4,337,448 0
SNP0334 c c26 or c12 Chr08 5,466,981 5,467, E-22
SNP0416 c c26 or c12 Chr08 3,781,088 3,781, E-24
SNP0226 c c26 or c12 Chr08 3,848,543 3,848, E-24
SNP0327 c c26 or c12 Chr08 6,661,296 6,661, E-19
SNP0173 c c26 or c12 Chr08 3,652,835 3,652, E-24
SNP0211 c c26 or c12 Chr08 6,390,329 6,390, E-22
SNP0079 c no hit
SNP0047 c c15 or c01 Chr02 14,469,277 14,469, E-20
SNP0136 c no hit
SNP0203 c c26 or c12 Chr08 7,102,485 7,102, E-24
DPL0248 c c c26 or c12 Chr08 7,104,516 7,103,939 0
SHIN-1413 c c c26 or c12 Chr08 9,543,996 9,543,563 0
BNL3599 c c c26 or c12 Chr08 9,388,542 9,389,065 0
DPL1293 c c c26 or c12 Chr08 34,200,839 34,200,175 0
MUSS026 c c c26 or c12 Chr08 8,085,003 8,084, E-146
SHIN-0409 c c c26 or c12 Chr08 8,085,003 8,084,507 0
DPL0010 c c c26 or c12 Chr08 41,720,433 41,721,017 0
SNP0265 c c26 or c12 Chr08 41,593,503 41,593, E-22
SNP0145 c no hit
DPL1325 c c c26 or c12 Chr08 42,546,371 42,546, E-73
DPL1575 c c26 or c12 Chr08 44,619,315 44,618,659 0
SNP0198 c c26 or c12 Chr08 44,710,525 44,710, E-22
BNL2709 c c no hit
DPL0070 c c c26 or c12 Chr08 47,342,127 47,342,846 0
DC30107 c c no sequence

DPL0400 c c c26 or c12 Chr08 48,342,201 48,342, E-49
SNP0483 c c26 or c12 Chr08 48,176,739 48,176, E-23
TMB1497 c c06 or c25 Chr10 36,076,912 36,077, E-40
BNL0666 c no hit
DC30026 c no sequence
DPL0545 c c14 or c03 Chr05 112, ,582 0
DPL1533 c c c26 or c12 Chr08 53,516,210 53,515,750 0
SNP0348 c c26 or c12 Chr08 54,208,289 54,208, E-25
SNP0418 c c26 or c12 Chr08 54,804,359 54,804, E-23
STV033 c c09 or c23 Chr06 46,137,859 46,137, E-27
DPL1133 c c c26 or c12 Chr08 53,966,205 53,965,666 0
DPL0404 c c26 or c12 Chr08 54,065,395 54,064,830 0
SNP0248 c c26 or c12 Chr08 54,140,427 54,140, E-25
TMB0537b c c26 or c12 Chr08 53,286,494 53,286,966 0
SHIN-1174b c c26 or c12 Chr08 53,286,494 53,286,966 0
SNP0204 c c26 or c12 Chr08 54,230,686 54,230, E-15
SNP0433 c c26 or c12 Chr08 55,514,660 55,514, E-24
DPL0917a c c c26 or c12 Chr08 55,556,379 55,555,895 0
SNP0098 c c26 or c12 Chr08 55,404,159 55,404, E-24
SNP0097 c c26 or c12 Chr08 55,404,159 55,404, E-24
TMB0537a c c c12 or c26 Chr08 53,286,494 53,286,966 0
SHIN-1174a c c26 or c12 Chr08 53,286,494 53,286,966 0
SNP0019 c no hit
DPL0917b c c c26 or c12 Chr08 55,556,379 55,555,895 0
SNP0331 c no hit
SNP0425 c no hit
DC40080 c c26 or c12 Chr08 2,645,962 2,645,508 0
SNP0291 c no hit
MUSB0285 c c18 or c13 Chr13 1,049,810 1,050,519 0
BNL0645 c no hit
TMB1638 c c c13 or c18 Chr13 222, ,130 0
DPL1226 c c c18 or c13 Chr13 222, ,304 0

SNP0278 c no hit
BNL0243 c c c18 or c13 Chr13 1,376,605 1,376, E-83
SNP0292 c no hit
SNP0143 c c18 or c13 Chr13 1,665,755 1,665, E-14
SNP0142 c c18 or c13 Chr13 1,665,606 1,665, E-14
SNP0144 c c18 or c13 Chr13 1,665,755 1,665, E-18
SNP0040 c no hit
SNP0169 c no hit
MUSS181 c c18 or c13 Chr13 2,409,519 2,409, E-107
SNP0137 c c18 or c13 Chr13 2,393,542 2,393, E-22
SNP0151 c no hit
SNP0102 c c18 or c13 Chr13 2,791,577 2,791, E-23
SNP0119 c c18 or c13 Chr13 3,073,130 3,073, E-22
SNP0104 c no hit
SHIN-1452 c c18 or c13 Chr13 2,142,700 2,142,225 0
DPL0894 c c c18 or c13 Chr13 3,797,761 3,797,245 0
SNP0013 c no hit
SNP0036 c c18 or c13 Chr13 4,035,495 4,035, E-22
DPL0687 c c18 or c13 Chr13 42,618,413 42,618, E-90
DPL0249 c c18 or c13 Chr13 4,913,323 4,912,859 0
BNL1495 c c c18 or c13 Chr13 4,929,680 4,929, E-62
BNL1421 c c18 or c13 Chr13 4,929,680 4,929, E-45
CIR096 c c c18 or c13 Chr13 6,668,978 6,668, E-156
SHIN-1202 c c18 or c13 Chr13 12,923,237 12,922, E-131
DPL0535 c no hit
SHIN-0145 c c no hit
JESPR153 c c no hit
SHIN-1163 c c c18 or c13 Chr13 48,388,207 48,387, E-102
TMB0312 c c c13 or c18 Chr13 48,388,455 48,388, E-95
SNP0326 c no hit
SNP0243 c c18 or c13 Chr13 48,886,750 48,886, E-22
BNL4061 c c c18 or c13 Chr13 52,284,446 52,284, E-109
DPL1016a c c14 or c03 Chr05 47,734,934 47,735,456 0

JESPR006 c c25 or c06 Chr10 41,915,067 41,914, E-83
SHIN-0229 c c no hit
NAU2277 c c17 or c02 Chr03 302, , E-57
NAU2265 c c c17 or c02 Chr03 302, ,653 0
NAU0895 c c c17 or c02 Chr03 302, , E-162
SNP0109 c c14 or c03 Chr05 1,834,429 1,834, E-25
SNP0110 c c14 or c03 Chr05 1,834,429 1,834, E-25
SNP0255 c c14 or c03 Chr05 2,918,283 2,918, E-25
SNP0254 c c14 or c03 Chr05 2,918,283 2,918, E-25
TMB1931 c c c01 or c15 Chr02 38,519,894 38,519, E-103
SHIN-1280 c c14 or c03 Chr05 2,883,559 2,883, E-105
SNP0043 c c14 c14 or c03 Chr05 53,083,379 53,083, E-24
SNP0152 c c14 or c03 Chr05 53,104,147 53,104, E-24
MUSB1267 c c c15 or c01 Chr02 44,400,500 44,399,746 0
SNP0112 c c15 or c01 Chr02 46,143,747 46,143, E-24
SNP0059 c c15 or c01 Chr02 41,480,895 41,480, E-25
BNL1666 c c c15 or c01 Chr02 47,180,017 47,179, E-107
SHIN-1571 c c15 or c01 Chr02 53,492,311 53,492, E-111
DC30109a c c no sequence
DPL0003 c c c15 or c01 Chr02 50,293,455 50,293, E-130
TMB0301 c c c08 or c24 Chr04 34,949,166 34,949, E-63
SNP0458 c no hit
SNP0026 c c15 or c01 Chr02 27,695,253 27,695, E-16
SNP0351 c c15 or c01 Chr02 59,010,894 59,010, E-25
DPL0318 c c c15 or c01 Chr02 57,829,658 57,829,129 0
DC30210 c c15 or c01 Chr02 57,979,004 57,978,546 0
BNL1454 c c c15 or c01 Chr02 57,425,363 57,425, E-152
DPL1514 c c c15 or c01 Chr02 57,829,662 57,829,121 0
SNP0012 c c15 or c01 Chr02 57,645,587 57,645, E-25
SNP0293 c c15 or c01 Chr02 57,755,832 57,755, E-23
SNP0305 c c15 or c01 Chr02 59,011,818 59,011, E-22
SNP0268 c c15 or c01 Chr02 56,155,542 56,155, E-25

TMB1660 c c scaffold_115 scaffold_115 9,190 8, E-91
DPL1470b c c c15 or c01 Chr02 60,804,061 60,803,223 0
DPL0346 c c15 or c01 Chr02 60,892,901 60,893,401 0
SNP0267 c c15 or c01 Chr02 56,064,066 56,064, E-25
SNP0389 c no hit
SNP0274 c c15 or c01 Chr02 60,322,687 60,322, E-22
SNP0273 c c15 or c01 Chr02 60,322,687 60,322, E-22
DPL0402 c c c15 or c01 Chr02 60,397,279 60,397, E-114
SNP0223 c c15 or c01 Chr02 59,493,573 59,493, E-20
SNP0465 c c15 or c01 Chr02 60,215,297 60,215, E-22
SHIN-1375 c c c15 or c01 Chr02 60,699,332 60,699, E-45
TMB2945 c c07 or c16 Chr01 20,977,709 20,976,980 0
SNP0449 c c16 or c07 Chr01 27,455,807 27,455, E-25
SNP0387 c c16 or c07 Chr01 41,898,160 41,898, E-25
SNP0386 c c16 or c07 Chr01 41,898,160 41,898, E-25
SNP0385 c c16 or c07 Chr01 41,777,312 41,777, E-22
DPL0168 c c c16 or c07 Chr01 41,619,994 41,619, E-175
DPL1482 c c c16 or c07 Chr01 27,815,506 27,815, E-178
DPL0223 c c c16 or c07 Chr01 27,815,156 27,815,620 0
MUSB0632 c c c16 or c07 Chr01 36,158,056 36,157,233 0
BNL2733 c no hit
DPL1084 c c c16 or c07 Chr01 36,048,230 36,048,670 0
DPL0897 c c c16 or c07 Chr01 38,246,804 38,246,007 0
BNL1026 c c no hit
DPL0061 c c scaffold_14 scaffold_14 143, ,171 0
DPL0013a c c c16 or c07 Chr01 29,669,254 29,668,461 0
BNL1395a c c c16 or c07 Chr01 29,668,797 29,669,191 0
BNL1227 c c26 or c12 Chr08 35,944,570 35,944, E-130
SHIN-0815 c c c16 or c07 Chr01 35,877,474 35,877, E-157
SNP0256 c scaffold_14 scaffold_14 139, , E-25
SNP0080 c no hit
DPL0507 c no hit

DPL1362 c c25 or c06 Chr10 842, , E-170
DC30027b c c no sequence
DPL0095 c c c17 or c02 Chr03 38,615,157 38,614,496 0
BNL4073 c c c17 or c02 Chr03 34,512,188 34,512, E-171
JESPR101a c c c17 or c02 Chr03 18,634,900 18,635, E-83
SHIN-0129a c c no hit
DPL0217 c c14 or c02 Chr03 21,979,138 21,979,632 0
BNL3590a c c c17 or c02 Chr03 18,635,039 18,634, E-117
DC40319a c c17 or c02 Chr03 18,884,157 18,884,656 0
SNP0127 c c17 or c02 Chr03 19,801,935 19,801, E-20
SHIN-0690a c c14 or c02 Chr03 42,319,594 42,318,755 0
NAU1167a c c17 or c02 Chr03 42,319,605 42,318,868 0
SHIN-0727 c c14 or c02 Chr03 42,347,607 42,348,015 0
HAU117 c c19 or c05 Chr09 2,449,753 2,450,142 0
SNP0202 c c19 or c05 Chr09 2,457,550 2,457, E-24
TMB0835 c c c05 or c19 Chr09 1,536,726 1,536, E-35
TMB1418 c c c07 or c16 Chr01 28,129,577 28,129, E-69
SHIN-0826 c c c19 or c05 Chr09 1,536,768 1,537, E-153
TMB1548 c c Chr09 Chr09 3,576,223 3,575,679 0
HAU016 c c23 or c09 Chr06 35,023,902 35,023,542 0
SNP0029 c no hit
COT130 c c c19 or c05 Chr09 3,234,110 3,234,581 0
SNP0055 c c19 or c05 Chr09 3,688,340 3,688, E-20
SNP0082 c no hit
SNP0105 c c19 or c05 Chr09 4,082,622 4,082, E-22
SNP0035 c c19 or c05 Chr09 4,049,167 4,049, E-22
SNP0316 c c19 or c05 Chr09 4,280,669 4,280, E-22
SNP0315 c c19 or c05 Chr09 4,280,669 4,280, E-22
SNP0159 c c19 or c05 Chr09 4,766,551 4,766, E-19
SNP0346 c c19 or c05 Chr09 6,067,611 6,067, E-20
SNP0099 c c19 or c05 Chr09 7,115,785 7,115, E-20
SNP0375 c c19 or c05 Chr09 12,259,464 12,259, E-24

DPL1475 c c19 or c05 Chr09 12,848,851 12,848,419 0
DPL0595 c c c19 or c05 Chr09 14,294,589 14,293,886 0
DPL0064 c c c19 or c05 Chr09 16,580,220 16,579,601 0
BNL4096 c c c19 or c05 Chr09 16,548,180 16,547, E-164
TMB0366 c c05 or c19 Chr09 16,547,972 16,548,419 0
SNP0107 c c19 or c05 Chr09 15,808,952 15,809, E-22
SNP0018 c c19 or c05 Chr09 15,732,628 15,732, E-25
SHIN-1155 c c19 or c05 Chr09 15,487,571 15,487,199 0
SNP0441 c c19 or c05 Chr09 15,809,613 15,809, E-25
SNP0442 c c19 or c05 Chr09 15,809,613 15,809, E-25
SNP0440 c c19 or c05 Chr09 15,809,613 15,809, E-25
SNP0304 c c19 or c05 Chr09 18,759,729 18,759, E-23
SNP0453 c no hit
MUCS127 c c c19 or c05 Chr09 18,502,533 18,502, E-171
MGHES-040 c c c19 or c05 Chr09 18,504,176 18,503, E-83
MUCS400 c c c19 or c05 Chr09 18,502,533 18,502, E-171
SHIN-1514 c c c19 or c05 Chr09 18,217,782 18,218,331 0
DPL0071 c c c19 or c05 Chr09 19,349,735 19,350,422 0
DC40309 c c19 or c05 Chr09 22,030,007 22,029, E-178
DC30008 c c no sequence
TMB0836 c c03 or c14 Chr05 3,869,153 3,869, E-80
C2-135b c c c19 or c05 Chr09 27,891,618 27,892, E-174
DPL0908 c c c19 or c05 Chr09 32,164,275 32,163,644 0
HAU006 c c c19 or c05 Chr09 33,836,383 33,835,949 0
SNP0208 c c19 or c05 Chr09 46,015,250 46,015, E-18
SNP0428 c c19 or c05 Chr09 44,076,832 44,076, E-25
SNP0324 c c19 or c05 Chr09 49,208,077 49,208, E-25
SNP0196 c c19 or c05 Chr09 47,570,227 47,570, E-25
CIR255 c c c19 or c05 Chr09 42,897,727 42,897, E-119
DC40242 c c c19 or c05 Chr09 36,268,605 36,268, E-168
COT010 c c c19 or c05 Chr09 34,721,575 34,721, E-178
DPL0163 c c c19 or c05 Chr09 40,807,242 40,807,937 0
DPL0137 c c c19 or c05 Chr09 47,561,982 47,562,596 0

SNP0456 c c19 or c05 Chr09 46,906,282 46,906, E-24
SNP0294 c c19 or c05 Chr09 43,838,599 43,838, E-25
SNP0401 c no hit
SNP0402 c no hit
SNP0301 c c19 or c05 Chr09 49,452,806 49,452, E-25
SNP0250 c c19 or c05 Chr09 49,666,728 49,666, E-25
SNP0299 c c19 or c05 Chr09 49,753,238 49,753, E-25
SNP0403 c c19 or c05 Chr09 48,248,081 48,248, E-25
SNP0244 c c19 or c05 Chr09 47,719,098 47,719, E-25
JESPR236 c c c19 or c05 Chr09 52,211,358 52,211, E-52
CM0051 c c no hit
CM0042 c c c19 or c05 Chr09 52,211,156 52,211, E-93
CM0003 c c c19 or c05 Chr09 52,211,155 52,211, E-93
BNL3347 c c c19 or c05 Chr09 52,211,360 52,210,966 0
SHIN-0022 c c19 or c05 Chr09 52,712,634 52,712, E-147
UCD242 c no sequence
SHIN-0827 c c c19 or c05 Chr09 70,170,721 70,171,439 0
DPL1938 c c c19 or c05 Chr09 70,173,953 70,174,674 0
DPL1211 c c c22 or c04 Chr12 5,765,893 5,766,343 0
SNP0342 c c22 or c04 Chr12 2,769,732 2,769, E-20
DPL1206 c c c22 or c04 Chr12 4,964,181 4,964,770 0
SHIN-0963 c c22 or c04 Chr12 5,025,261 5,025,863 0
HAU042 c c22 or c04 Chr12 4,934,899 4,934, E-141
SHIN-0121 c c no hit
CIR253 c c c22 or c04 Chr12 4,554,023 4,554, E-136
DPL0417 c c c19 or c05 Chr09 27,606,433 27,606, E-101
SNP0271 c c22 or c04 Chr12 1,501,384 1,501, E-19
SNP0475 c no hit
SHIN-0437 c c c22 or c04 Chr12 2,219,579 2,220,190 0
SNP0397 c no hit
DPL0394 c c c20 or c10 Chr11 9,355,424 9,355,867 0
SHIN-1586 c c20 or c10 Chr11 9,752,769 9,753,391 0

BNL0946 c c c20 or c10 Chr11 28,725,102 28,724, E-142
DPL0697 c c20 or c10 Chr11 28,519,249 28,518, E-161
DPL0135 c c c20 or c10 Chr11 28,724,363 28,724,867 0
TMB1939 c c09 or c23 Chr06 38,643,729 38,643, E-18
DPL0225 c c c20 or c10 Chr11 49,500,452 49,500,903 0
BNL0119 c c c20 or c10 Chr11 50,619,562 50,619, E-88
DPL0442 c c20 or c10 Chr11 54,950,803 54,950,259 0
SHIN-1165 c c20 or c10 Chr11 15,981,224 15,981, E-33
DC40113 c c20 c20 or c10 Chr11 55,705,856 55,706,323 0
SNP0312 c c20 or c10 Chr11 55,723,371 55,723, E-17
DPL1903 c c c20 or c10 Chr11 58,130,716 58,131,318 0
DPL1795 c c c20 or c10 Chr11 58,107,954 58,108,602 0
SNP0279 c c20 or c10 Chr11 58,536,297 58,536, E-24
SNP0065 c c20 or c10 Chr11 60,205,349 60,205, E-22
SNP0064 c c20 or c10 Chr11 60,304,292 60,304, E-22
SNP0406 c c20 or c10 Chr11 57,912,863 57,912, E-22
DPL1022 c c c20 or c10 Chr11 59,190,203 59,190,686 0
SNP0405 c c20 or c10 Chr11 57,912,863 57,912, E-16
CIR353 c c23 or c09 Chr06 14,078,315 14,078, E-77
SNP0120 c c20 or c10 Chr11 61,456,146 61,456, E-25
SHIN-1167 c c c21 or c11 Chr07 4,152,698 4,152,122 0
DPL0131 c c c21 or c11 Chr07 3,873,468 3,874,120 0
SNP0241 c c21 or c11 Chr07 3,714,555 3,714, E-24
SNP0308 c c21 or c11 Chr07 3,715,820 3,715, E-22
DPL0777 c c c21 or c11 Chr07 4,674,912 4,675,460 0
MUSS324 c no hit
DPL0863b c c c21 or c11 Chr07 4,421,772 4,422,215 0
DC30147b c c no sequence
BNL1034a c c c21 or c11 Chr07 5,461,346 5,461, E-122
BNL1053 c c c21 or c11 Chr07 16,505,978 16,505, E-78
JESPR251 c c c21 or c11 Chr07 16,505,850 16,505, E-19
SNP0476 c c14 or c03 Chr05 59,545,105 59,545, E-25

SNP0257 c c21 or c11 Chr07 19,350,558 19,350, E-20
SNP0430 c c21 or c11 Chr07 21,860,273 21,860, E-25
TMB2038 c c c02 or c17 Chr03 1,122,992 1,122, E-26
SHIN-1344 c c21 or c11 Chr07 19,879,303 19,878,723 0
SNP0227 c no hit
MUSB0753 c c21 or c11 Chr07 22,060,103 22,059, E-135
SNP0382 c c21 or c11 Chr07 19,530,429 19,530, E-20
MUSB0849 c c c15 or c01 Chr02 7,730,857 7,731,871 0
BNL2805 c c21 or c11 Chr07 28,267,292 28,267, E-119
BNL0625 c c23 or c09 Chr06 18,977,646 18,977, E-56
SNP0180 c c21 or c11 Chr07 25,729,007 25,728, E-22
SNP0113 c c21 or c11 Chr07 25,551,395 25,551, E-20
DPL0253 c c21 or c11 Chr07 27,136,798 27,136,234 0
BNL3282 c c21 or c11 Chr07 25,661,342 25,660, E-131
TMB1637a c c c11 or c21 Chr07 27,442,492 27,442, E-133
SHIN-1214a c c c21 or c11 Chr07 27,442,492 27,442, E-134
TMB1222a c c c13 or c18 Chr13 32,030,805 32,030, E-122
SHIN-1214b c c21 or c11 Chr07 27,442,492 27,442, E-134
TMB0904 c c11 or c21 Chr07 28,407,345 28,406,940 0
TMB1222b c c13 or c18 Chr13 32,030,805 32,030, E-122
TMB1637b c c11 or c21 Chr07 27,442,492 27,442, E-133
DPL0500b c c c21 or c11 Chr07 20,038,097 20,038, E-13
SNP0352 c no hit
BNL4011a c c c21 or c11 Chr07 54,551,315 54,551, E-130
SNP0174 c c21 or c11 Chr07 54,047,205 54,047, E-24
SNP0187 c c22 or c04 Chr12 8,898,258 8,898, E-22
SNP0396 c c22 or c04 Chr12 8,895,928 8,895, E-19
BNL4030 c c22 or c04 Chr12 5,740,007 5,740,436 0
JESPR050 c c22 or c04 Chr12 5,936,274 5,936, E-89
SHIN-1547 c c c22 or c04 Chr12 13,642,681 13,643,363 0
SNP0285 c c22 or c04 Chr12 24,215,500 24,215, E-20
DPL0107 c c c22 or c04 Chr12 24,593,941 24,593, E-171

SNP0485 c c22 or c04 Chr12 24,118,132 24,118, E-24
SHIN-1066 c c22 or c04 Chr12 31,300,185 31,301,077 0
SNP0467 c c22 or c04 Chr12 31,459,751 31,459, E-21
SNP0078 c c23 or c09 Chr06 50,505,355 50,505, E-25
SNP0176 c c23 or c09 Chr06 50,627,226 50,627, E-25
SNP0175 c c23 or c09 Chr06 50,627,226 50,627, E-25
MUSS398a c c23 or c09 Chr06 50,375,880 50,376,300 0
DPL0530a c c c23 or c09 Chr06 50,050,707 50,051,300 0
SNP0087 c c23 or c09 Chr06 50,029,648 50,029, E-25
SNP0085 c c23 or c09 Chr06 50,336,065 50,336, E-25
SNP0149 c c23 or c09 Chr06 50,144,486 50,144, E-22
SNP0128 c c23 or c09 Chr06 50,033,248 50,033, E-21
SNP0114 c c23 or c09 Chr06 50,628,115 50,628, E-24
SNP0115 c c23 or c09 Chr06 50,628,115 50,628, E-24
SNP0177 c c23 or c09 Chr06 49,953,288 49,953, E-23
SNP0073 c c23 or c09 Chr06 50,525,092 50,525, E-25
SNP0060 c c23 or c09 Chr06 49,141,192 49,141, E-24
SNP0061 c c23 or c09 Chr06 49,141,192 49,141, E-24
SNP0062 c c23 or c09 Chr06 49,141,192 49,141, E-24
MUSS298 c c c23 or c09 Chr06 49,089,927 49,090,415 0
SNP0134 c c23 or c09 Chr06 49,161,505 49,161, E-25
SNP0020 c c23 or c09 Chr06 48,898,841 48,898, E-25
SNP0162 c c23 or c09 Chr06 48,961,314 48,961, E-25
SNP0277 c c23 or c09 Chr06 48,736,666 48,736, E-24
NAU0864 c c c23 or c09 Chr06 49,089,927 49,090,415 0
SNP0074 c c23 or c09 Chr06 48,161,443 48,161, E-24
JESPR114 c c c23 or c09 Chr06 47,717,184 47,716, E-99
DPL1016b c c c14 or c03 Chr05 47,734,934 47,735,456 0
SNP0056 c c23 or c09 Chr06 47,852,025 47,851, E-25
SNP0408 c c23 or c09 Chr06 47,885,396 47,885, E-24
SHIN-0050 c c c23 or c09 Chr06 46,530,765 46,531,335 0
SNP0419 c c23 or c09 Chr06 44,568,415 44,568, E-24

BNL1030 c c c23 or c09 Chr06 44,765,330 44,765, E-122
CM0007 c c c23 or c09 Chr06 44,501,974 44,502, E-99
BNL1414 c c c23 or c09 Chr06 44,502,140 44,501, E-114
CM0071 c c c23 or c09 Chr06 44,501,974 44,502, E-88
SNP0188 c c23 or c09 Chr06 42,562,824 42,562, E-24
DPL0150b c c c23 or c09 Chr06 43,132,445 43,131,773 0
BNL1317b c c c23 or c09 Chr06 43,132,176 43,131, E-84
SNP0302 c c23 or c09 Chr06 40,535,400 40,535, E-24
DPL1130 c c c23 or c09 Chr06 41,774,781 41,775, E-154
TMB0382 c c c02 or c17 Chr03 33,511,124 33,510, E-129
SNP0259 c c23 or c09 Chr06 23,938,258 23,938, E-24
C2-021 c c c23 or c09 Chr06 24,030,133 24,029, E-157
DC40058 c c c23 or c09 Chr06 24,030,131 24,029, E-156
TMB1701 c c c09 or c23 Chr06 22,251,504 22,251, E-46
CIR383 c c c23 or c09 Chr06 15,311,196 15,311, E-47
DPL0175b c c c23 or c09 Chr06 13,542,501 13,542,027 0
JESPR274b c c c23 or c09 Chr06 21,079,086 21,079, E-44
MUSS033 c c24 or c08 Chr04 2,877,992 2,878, E-106
SHIN-0334 c c24 or c08 Chr04 2,880,119 2,879, E-160
SNP0124 c c24 or c08 Chr04 21,242,076 21,242, E-24
SNP0111 c c24 or c08 Chr04 18,349,222 18,349, E-25
SNP0038 c c24 or c08 Chr04 29,731,232 29,731, E-24
SNP0054 c c24 or c08 Chr04 29,145,954 29,145, E-24
SHIN-1304a c c c24 or c08 Chr04 33,294,105 33,294,587 0
CIR270 c c c24 or c08 Chr04 33,322,033 33,322, E-147
NAU2292 c c24 or c08 Chr04 40,898,428 40,898, E-168
SNP0041 c c24 or c08 Chr04 28,646,131 28,646, E-22
SNP0011 c c24 or c08 Chr04 36,721,872 36,721, E-22
DPL0877a c c c24 or c08 Chr04 28,574,777 28,574, E-64
TMB1182 c c c11 or c21 Chr07 30,373,434 30,373, E-116
SHIN-1212 c c c24 or c08 Chr04 34,863,969 34,863,570 0
DPL0160 c c24 or c08 Chr04 9,535,942 9,535,425 0

BNL2655 c c c24 or c08 Chr04 37,477,704 37,477, E-124
BNL2499 c c c24 or c08 Chr04 38,327,810 38,328,407 0
SNP0063 c c24 or c08 Chr04 59,775,862 59,775, E-24
SNP0005 c c24 or c08 Chr04 59,508,055 59,507, E-24
SNP0452 c c24 or c08 Chr04 57,621,568 57,621, E-23
SNP0108 c no hit
SNP0246 c c24 or c08 Chr04 55,129,155 55,129, E-24
SNP0329 c c24 or c08 Chr04 56,984,315 56,984, E-24
SNP0281 c c24 or c08 Chr04 54,851,017 54,851, E-22
SHIN-0384 c c c18 or c13 Chr13 44,197,913 44,197, E-50
DPL0353 c c c24 or c08 Chr04 61,512,824 61,513, E-157
DPL0152 c c c24 or c08 Chr04 61,521,045 61,521, E-142
SHIN-1076 c c c24 or c08 Chr04 61,512,490 61,512, E-134
DC20106 c c c19 or c05 Chr09 65,100,318 65,100, E-78
SNP0313 c c25 or c06 Chr10 58,967,452 58,967, E-24
DPL0282 c c c25 or c06 Chr10 58,929,081 58,929,742 0
CIR268 c c c25 or c06 Chr10 54,466,260 54,465, E-131
DPL0811 c c25 or c06 Chr10 54,525,191 54,524,571 0
DPL1411a c c25 or c06 Chr10 265, ,949 0
BNL3937 c c c25 or c06 Chr10 25,302,993 25,303,436 0
TMB2898 c c c10 or c20 Chr11 16,117,773 16,117, E-144
MUSB0979 c c25 or c06 Chr10 30,500,421 30,501,114 0
DPL0532b c c c25 or c06 Chr10 20,264,061 20,263,413 0
SNP0323 c c25 or c06 Chr10 21,751,730 21,751, E-22
BNL1440a c c no hit
SNP0046 c no hit
SNP0045 c no hit
SNP0390 c c25 or c06 Chr10 17,025,861 17,025, E-24
SNP0336 c c25 or c06 Chr10 15,766,915 15,766, E-22
SNP0337 c c25 or c06 Chr10 15,766,915 15,766, E-22
UCD311 c c no sequence
SNP0338 c c25 or c06 Chr10 10,904,510 10,904, E-24

SHIN-0885 c c c25 or c06 Chr10 8,978,714 8,979,364 0
DPL0080a c c25 or c06 Chr10 11,215,332 11,214,824 0
TMB0313 c c c06 or c25 Chr10 10,182,152 10,182,614 0
DPL0702 c c c25 or c06 Chr10 4,679,528 4,679, E-158
SNP0189 c c16 or c07 Chr01 39,774,193 39,774, E-17
SNP0464 c c25 or c06 Chr10 6,390,678 6,390, E-17
SNP0463 c c25 or c06 Chr10 6,390,678 6,390, E-17
SNP0435 c c25 or c06 Chr10 2,836,790 2,836, E-24
SNP0434 c c25 or c06 Chr10 2,836,790 2,836, E-24
BNL1047 c c c25 or c06 Chr10 3,104,164 3,104, E-112
SNP0420 c c25 or c06 Chr10 2,977,857 2,977, E-24
SNP0427 c c25 or c06 Chr10 2,105,304 2,105, E-24
SNP0361 c c25 or c06 Chr10 1,651,293 1,651, E-25
SNP0220 c c25 or c06 Chr10 121, , E-24
DPL1411b c c c25 or c06 Chr10 265, ,949 0
DC30135 c c no sequence
DPL0059 c c c25 or c06 Chr10 179, ,148 0
SNP0429 c no hit
SNP0372 c c25 or c06 Chr10 306, , E-24
BNL3359 c c25 or c06 Chr10 813, ,089 0
CIR267 c c25 or c06 Chr10 918, , E-66
BNL0827 c c c25 or c06 Chr10 995, , E-134
COT036 c c c25 or c06 Chr10 995, , E-153
SNP0367 c c26 or c12 Chr08 47,717,704 47,717, E-20
SNP0368 c c26 or c12 Chr08 47,717,704 47,717, E-21
DPL1373 c c c26 or c12 Chr08 36,819,248 36,820,045 0
SNP0376 c c26 or c12 Chr08 36,826,436 36,826, E-25
DPL1283 c c c26 or c12 Chr08 38,034,012 38,034,552 0
C2-052B c c c26 or c12 Chr08 43,676,133 43,676,555 0
SNP0207 c no hit
BNL2495 c c c26 or c12 Chr08 46,952,763 46,953,293 0
SNP0088 c c26 or c12 Chr08 46,791,048 46,790, E-25

DC30183 c c no sequence
C2-055 c c c26 or c12 Chr08 56,544,333 56,544,787 0
DPL0481 c c c26 or c12 Chr08 56,947,266 56,947,665 0
DPL0164 unknown 0.00 c14 or c03 Chr05 2,655,784 2,656,369 0
SNP0235 unknown 5.34 c16 or c07 Chr01 23,656,811 23,656, E-20
SNP0156 unknown no hit
SNP0178 unknown c25 or c06 Chr10 5,628,798 5,628, E-19
SNP0069 unknown c18 or c13 Chr13 54,407,409 54,407, E-22
SNP0031 unknown c25 or c06 Chr10 3,042,315 3,042, E-22
SNP0067 unknown c24 or c08 Chr04 1,963,653 1,963, E-20

Chr, chromosome.
TM map; Fang, D.D. and Yu, J.Z J. Cotton Sci. 16:
Chr. of G.h., chromosome number of the allotetraploid G. hirsutum; Blenda, A. et al PLoS ONE 7:e45739
Chr. of G.r., chromosome number of the diploid G. raimondii; Paterson, A.H. et al Nature 492:

Supplementary Table S2. Summary of inclusive composite interval mapping (ICIM) of 10 agronomic and fiber traits within the TM-1 × NM24016 population at an experiment-wise type I error rate of 20%. Identified QTLs significant at a 5% type I error rate are highlighted in bold font.

Columns: Trait; Chr.; LG; Peak position (cM); Left marker; Right marker; Peak LOD; PVE (%); Additive effect#

Boll size c BNL1694 TMB
Boll size c SNP0035 SNP
Lint percentage c SNP0332 DPL
Lint percentage c MUSS026 SHIN
Lint percentage c TMB0835 TMB
Lint percentage c DPL0135 TMB
Lint percentage c DPL1411a BNL
Lint yield c DPL0507 DPL
Plant height c DPL0094 SNP
Micronaire c SNP0220 DPL1411b
Micronaire c CIR267 BNL
Fiber elongation c SHIN-0817 SNP
Fiber elongation c DPL0745 MUCS
Fiber elongation c SNP0058 SNP
Fiber elongation c SNP0019 DPL0917b
Fiber elongation c TMB0312 SNP
Fiber strength c DPL0080b SNP
Fiber strength c DPL0570 DPL
Fiber strength c DPL0252 DPL
Fiber strength c SNP0256 SNP
Fiber strength c TMB0904 TMB1222b
Fiber strength c JESPR114 DPL1016b
Fiber strength c MUSB0979 DPL0532b
%-span length c SHIN-1344 SNP
%-span length c SNP0189 SNP
Length uniformity c SNP0132 SNP
Length uniformity c DPL1470b DPL
Length uniformity c SNP0189 SNP

Chr, chromosome.
LG, linkage group.
The logarithm of odds (LOD) value at the position of peak likelihood of the QTL.
Phenotypic variance explained by each QTL.
#Additive effect when substituting an NM24016 allele with an allele from TM-1.


20-10-2015. DNA profiles in DUS testing of grasses. A new UPOV model? Lolium perenne (perennial ryegrass) Pilot study (2014) 2--25 DNA profiles in DUS testing of grasses A new UPOV model? Henk Bonthuis Naktuinbouw Aanvragersoverleg Rvp Wageningsche Berg 9 oktober 25 Lolium perenne (perennial ryegrass) Challenges Genetically

More information

Technical Note. Roche Applied Science. No. LC 18/2004. Assay Formats for Use in Real-Time PCR

Technical Note. Roche Applied Science. No. LC 18/2004. Assay Formats for Use in Real-Time PCR Roche Applied Science Technical Note No. LC 18/2004 Purpose of this Note Assay Formats for Use in Real-Time PCR The LightCycler Instrument uses several detection channels to monitor the amplification of

More information

An example of bioinformatics application on plant breeding projects in Rijk Zwaan

An example of bioinformatics application on plant breeding projects in Rijk Zwaan An example of bioinformatics application on plant breeding projects in Rijk Zwaan Xiangyu Rao 17-08-2012 Introduction of RZ Rijk Zwaan is active worldwide as a vegetable breeding company that focuses on

More information

DNA Insertions and Deletions in the Human Genome. Philipp W. Messer

DNA Insertions and Deletions in the Human Genome. Philipp W. Messer DNA Insertions and Deletions in the Human Genome Philipp W. Messer Genetic Variation CGACAATAGCGCTCTTACTACGTGTATCG : : CGACAATGGCGCT---ACTACGTGCATCG 1. Nucleotide mutations 2. Genomic rearrangements 3.

More information

SeqScape Software Version 2.5 Comprehensive Analysis Solution for Resequencing Applications

SeqScape Software Version 2.5 Comprehensive Analysis Solution for Resequencing Applications Product Bulletin Sequencing Software SeqScape Software Version 2.5 Comprehensive Analysis Solution for Resequencing Applications Comprehensive reference sequence handling Helps interpret the role of each

More information

"Fingerprinting" Vegetables DNA-based Marker Assisted Selection

Fingerprinting Vegetables DNA-based Marker Assisted Selection "Fingerprinting" Vegetables DNA-based Marker Assisted Selection Faster, Cheaper, More Reliable; These are some of the goals that vegetable breeders at seed companies and public institutions desire for

More information

EGYPTIAN COTTON PRODUCTION TO MEET THE EXTRA LONG STAPLE COTTON REQUIREMENT IN THE COUNTRY. K.N. Gururajan

EGYPTIAN COTTON PRODUCTION TO MEET THE EXTRA LONG STAPLE COTTON REQUIREMENT IN THE COUNTRY. K.N. Gururajan - 48 - EGYPTIAN COTTON PRODUCTION TO MEET THE EXTRA LONG STAPLE COTTON REQUIREMENT IN THE COUNTRY K.N. Gururajan Principal Scientist, Central Institute for Cotton Research, Regional Station, Coimbatore

More information

Introductory genetics for veterinary students

Introductory genetics for veterinary students Introductory genetics for veterinary students Michel Georges Introduction 1 References Genetics Analysis of Genes and Genomes 7 th edition. Hartl & Jones Molecular Biology of the Cell 5 th edition. Alberts

More information

Arabidopsis. A Practical Approach. Edited by ZOE A. WILSON Plant Science Division, School of Biological Sciences, University of Nottingham

Arabidopsis. A Practical Approach. Edited by ZOE A. WILSON Plant Science Division, School of Biological Sciences, University of Nottingham Arabidopsis A Practical Approach Edited by ZOE A. WILSON Plant Science Division, School of Biological Sciences, University of Nottingham OXPORD UNIVERSITY PRESS List of Contributors Abbreviations xv xvu

More information

Combining Data from Different Genotyping Platforms. Gonçalo Abecasis Center for Statistical Genetics University of Michigan

Combining Data from Different Genotyping Platforms. Gonçalo Abecasis Center for Statistical Genetics University of Michigan Combining Data from Different Genotyping Platforms Gonçalo Abecasis Center for Statistical Genetics University of Michigan The Challenge Detecting small effects requires very large sample sizes Combined

More information

Real-Time PCR Vs. Traditional PCR

Real-Time PCR Vs. Traditional PCR Real-Time PCR Vs. Traditional PCR Description This tutorial will discuss the evolution of traditional PCR methods towards the use of Real-Time chemistry and instrumentation for accurate quantitation. Objectives

More information

Genomic Selection in. Applied Training Workshop, Sterling. Hans Daetwyler, The Roslin Institute and R(D)SVS

Genomic Selection in. Applied Training Workshop, Sterling. Hans Daetwyler, The Roslin Institute and R(D)SVS Genomic Selection in Dairy Cattle AQUAGENOME Applied Training Workshop, Sterling Hans Daetwyler, The Roslin Institute and R(D)SVS Dairy introduction Overview Traditional breeding Genomic selection Advantages

More information
