Cover created by Brecht Crombez


Ghent University
Faculty of Medicine and Health Sciences
Center for Medical Genetics

Genetic predisposition for breast cancer: improved molecular analyses

This thesis is submitted in fulfillment of the requirements for the degree of Doctor (PhD) in Medical Sciences by Kim De Leeneer, 2011

Promotor: Prof. dr. Anne De Paepe
Co-promotor: Prof. dr. ir. Kathleen Claes

Center for Medical Genetics, Ghent University Hospital
Medical Research Building, De Pintelaan 185, B-9000 Ghent, Belgium

This thesis is dedicated to
My parents and Bart
For their endless love and support
And to our angel Dorianne
Who is missed daily

Thesis submitted to fulfill the requirements for the degree of Doctor in Medical Sciences, 2011

Promotor: prof. dr. Anne De Paepe, Ghent University, Ghent, Belgium
Co-promotor: prof. dr. Kathleen Claes, Ghent University, Ghent, Belgium

Members of the examination committee:
prof. dr. Johan Van De Walle, Ghent University, Ghent, Belgium
prof. dr. Elfride De Baere, Ghent University, Ghent, Belgium
dr. Claude Houdayer, Institut Curie, Paris, France
prof. dr. Abramowicz, Université Libre de Bruxelles (ULB), Brussels, Belgium
prof. dr. Veronique Cocquyt, Ghent University, Ghent, Belgium
prof. dr. Anne Vral, Ghent University, Ghent, Belgium
prof. dr. Rudy Van den Broecke, Ghent University, Ghent, Belgium
prof. dr. ir. Björn Menten, Ghent University, Ghent, Belgium

The research summarized in this thesis was conducted at the Center for Medical Genetics, Ghent University Hospital, Ghent, Belgium. This thesis was realized with the funding of an Emmanuel van der Schueren scholarship of the Flemish foundation against cancer to Kim De Leeneer. The work was supported by a grant from the Fund for Scientific Research Flanders (FWO) and by GOA grant BOF10/GOA/019 (Ghent University).

TABLE OF CONTENTS

List of abbreviations

CHAPTER 1: INTRODUCTION AND RESEARCH OBJECTIVES
The genetic basis of cancer
  Cancer genetics
  Proto-oncogenes vs tumor suppressor genes
  Cancer is a multistep process
Breast and ovarian cancer
Genetic susceptibility to breast cancer
  Introduction
  High-Penetrance Breast Cancer Predisposition Genes
    Functions of BRCA1 and BRCA2
    Mutation spectrum of BRCA1/2
    BRCA1/2 mutation prevalence
    Risks associated with germline mutations in BRCA1 and BRCA2
    Preventive measurements to reduce breast cancer risk in BRCA1/2 mutation carriers
  Breast Cancer Predisposition Genes of Uncertain Penetrance
  Intermediate-Penetrance Breast Cancer Predisposition Genes
  Common low-penetrance Breast Cancer Predisposition Alleles
Research objectives and outline of the thesis
References

CHAPTER 2: TAKING THROUGHPUT IN BRCA1/2 MUTATION DETECTION TO THE NEXT LEVEL
Introduction
PAPER 1: Rapid and sensitive detection of BRCA1/2 mutations in a diagnostic setting: comparison of two high-resolution melting platforms.
PAPER 2: Genotyping of frequent BRCA1/2 SNPs with unlabeled probes: a supplement to HRMCA mutation scanning allowing to strongly reduce the sequencing burden.
PAPER 3: Practical tools to implement massively parallel pyrosequencing of PCR products in next generation molecular diagnostics
PAPER 4: Massive parallel amplicon sequencing of the breast cancer genes BRCA1 and BRCA2: opportunities, challenges, and limitations.
References

CHAPTER 3: MUTATION PREVALENCE AND SPECTRUM OF BRCA1&2 AND RAD51C IN SELECTED PATIENT POPULATIONS
PAPER 5: Prevalence of BRCA1/2 mutations in sporadic breast/ovarian cancer patients and identification of a novel de novo BRCA1 mutation in a patient diagnosed with late onset breast and ovarian cancer: implications for genetic testing.
PAPER 6: Evaluation of RAD51C as a new breast cancer susceptibility gene in the Belgian/Dutch population.

CHAPTER 4: GENERAL DISCUSSION AND FUTURE PERSPECTIVES
Discussion
  Technical aspects of molecular diagnostics for inherited breast cancer
  Mutation spectrum and prevalences in selected breast/ovarian cancer population
Perspectives
General Conclusion
References

Summary
Samenvatting (summary in Dutch)
Dankwoord (acknowledgements)
Curriculum Vitae

List of abbreviations

BRCA1   BReast CAncer susceptibility gene 1
BRCA2   BReast CAncer susceptibility gene 2
cDNA    complementary DNA
CMGG    Center for Medical Genetics Ghent
DGGE    Denaturing Gradient Gel Electrophoresis
dHPLC   denaturing High Performance Liquid Chromatography
DS      Direct Sequencing
ER      Estrogen Receptor
FA      Fanconi Anaemia
FBOC    Familial Breast and/or Ovarian Cancer syndrome
gDNA    genomic DNA
GWAS    Genome Wide Association Studies
HA      Heteroduplex Analysis
HBOC    Hereditary Breast and/or Ovarian Cancer syndrome
HRMCA   High Resolution Melting Curve Analysis
MPS     Massive Parallel Sequencing
PARP    Poly(ADP-ribose) polymerase
PTT     Protein Truncation Test
SNP     Single Nucleotide Polymorphism
SSCP    Single-Strand Conformation Polymorphism
Tm      Melting Temperature
TSG     Tumor Suppressor Gene

CHAPTER 1: INTRODUCTION AND RESEARCH OBJECTIVES

The genetic basis of cancer
  Cancer genetics
  Proto-oncogenes vs tumor suppressor genes
  Cancer is a multistep process
Breast and ovarian cancer
Genetic susceptibility to breast cancer
  Introduction
  High-Penetrance Breast Cancer Predisposition Genes
    Functions of BRCA1 and BRCA2
    Mutation spectrum of BRCA1/2
    BRCA1/2 mutation prevalence
    Risks associated with germline mutations in BRCA1 and BRCA2
    Preventive measurements to reduce breast cancer risk in BRCA1/2 mutation carriers
  Breast Cancer Predisposition Genes of Uncertain Penetrance
  Intermediate-Penetrance Breast Cancer Predisposition Genes
  Common low-penetrance Breast Cancer Predisposition Alleles
Research objectives and outline of the thesis
References

The genetic basis of cancer

Cancer genetics

Cancer is the general name for over 100 medical conditions involving uncontrolled and dangerous cell growth. A cancer generally derives from a single cell that is changed dramatically by a series of genetic alterations. A healthy cell has a well-defined shape and fits neatly within the ordered array of cells surrounding it. It responds to its environment, giving rise to daughter cells only when the balance of stimulatory and inhibitory signals from the outside favors cell division. However, the process of replication carries the constant hazard of random genetic mutations, which can impair the regulatory circuits of a cell [1].

Genetic abnormalities found in cancer typically affect two general classes of genes. Cancer-promoting oncogenes are typically activated in cancer cells, giving those cells new properties (gain of function). Tumor suppressor genes, in contrast, are inactivated in cancer cells, resulting in the loss of normal functions in those cells.

Proto-oncogenes vs tumor suppressor genes

Oncogenes are the altered forms of proto-oncogenes. Proto-oncogenes are found in normal cells and encode proteins involved in the control of replication, apoptosis (cell death) or both. They are involved in the normal function of the cell, but can turn a cell into a cancer cell when activated. Activation of proto-oncogenes by chromosomal rearrangements, mutations, or gene amplification confers a growth advantage or increased survival on cells carrying such alterations. All three mechanisms cause either an alteration in the oncogene structure or an increase in or deregulation of its expression [2].

Tumor suppressor genes (TSGs) are targeted by genetic alterations in the opposite way to proto-oncogenes. The affected cell loses one of its functions, such as accurate DNA replication, control over the cell cycle, orientation and adhesion within tissues, or interaction with protective cells of the immune system.
According to Knudson's two-hit hypothesis [3], inactivation of both TSG alleles is necessary for tumor development (Figure 1). Knudson suggested that multiple "hits" are necessary to cause cancer. In inherited cancer, the first inactivation is inherited and any second mutation would rapidly lead to cancer. In non-inherited cancer, two "hits" need to take place before tumor development, explaining the higher age of onset compared to inherited cancer.

Figure 1: Knudson's two-hit hypothesis. In hereditary tumor syndromes, the initial inactivation of one allele is present in the germ cells. To start tumorigenesis, an additional hit or somatic inactivation of the second allele is required. Somatic inactivation events include subchromosomal deletions, mitotic recombination, nondisjunctional chromosome loss with or without reduplication of the chromosome carrying the affected TSG, intragenic mutation or an epigenetic event. In sporadic tumors, the initial and second inactivating events occur in the same somatic cell of an individual.

TSGs can be subdivided into several classes according to their normal gene function, i.e. gatekeepers, caretakers and landscapers [4, 5]. Gatekeepers (e.g. p53) act directly by inhibiting cell growth. Caretakers are involved in maintaining DNA integrity and repairing DNA damage. Mutations in these caretakers have no direct effect on proliferation, but they result in an accelerated accumulation of other mutations and will eventually lead to genomic instability. The landscapers, the third subgroup of TSGs, are genes that act by modulating the micro-environment rather than the tumor itself.

Cancer is a multistep process

Later it became clear that carcinogenesis depends on more than the activation of proto-oncogenes or deactivation of tumor suppressor genes. A first "hit" in an oncogene will not necessarily lead to cancer, as normally functioning TSGs would still counterbalance this impetus; only additional damage to TSGs would lead to unchecked proliferation. Conversely, a damaged TSG would not lead to cancer unless there is a growth impetus from an activated oncogene. Generally, the normal cell has multiple independent mechanisms that regulate its growth and differentiation potential, and several separate events are therefore needed to override all the control mechanisms, as well as to induce other aspects of the transformed phenotype, such as metastasis.
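The two-hit reasoning above can be made concrete with a small calculation. The sketch below is only an illustration under loudly simplified assumptions: every susceptible cell is independent, each TSG allele is inactivated with the same probability over a lifetime, and both numbers (`N_CELLS`, `P_HIT`) are hypothetical values chosen for readability, not estimates from the thesis or from Knudson [3].

```python
import math


def prob_any_cell_inactivated(n_cells: float, p_hit: float, hits_needed: int) -> float:
    """Probability that at least one of n_cells loses all remaining TSG alleles.

    p_hit is the (hypothetical) lifetime probability of inactivating one allele
    in one cell; hits_needed is 1 for a germline carrier and 2 otherwise.
    """
    p_cell = p_hit ** hits_needed
    # 1 - (1 - p_cell) ** n_cells, written with log1p/expm1 for numerical accuracy
    return -math.expm1(n_cells * math.log1p(-p_cell))


# Hypothetical, illustrative numbers only
N_CELLS = 1e7   # susceptible cells in the tissue
P_HIT = 1e-7    # lifetime inactivation probability per allele per cell

hereditary = prob_any_cell_inactivated(N_CELLS, P_HIT, hits_needed=1)
sporadic = prob_any_cell_inactivated(N_CELLS, P_HIT, hits_needed=2)

print(f"germline carrier (one somatic hit needed): {hereditary:.3f}")
print(f"non-carrier (two somatic hits needed):     {sporadic:.2e}")
```

With these assumed values the carrier probability is several orders of magnitude larger than the non-carrier probability, which mirrors the earlier age of onset described for inherited cancer. The model ignores the multistep nature of carcinogenesis discussed above.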

Breast and ovarian cancer

The incidence of breast cancer varies among different populations. About 10% of women in the Western world are diagnosed with breast cancer during their lifetime, making it the most common cancer diagnosed. Prognosis and survival rate vary greatly depending on cancer type and staging. Lifestyle is an important player in tumorigenesis. The major modifiable measures to decrease the risk of breast cancer include improving diet, controlling weight, limiting alcohol consumption and cessation of smoking [6]. However, higher risks for breast cancer are attributed to age, gender and a family history of the disease. Simply being female increases the risk 100-fold: women have a higher chance of developing breast cancer than men because female breast cells are constantly exposed to the growth-promoting effects of the hormones estrogen and progesterone. In addition, the risk of developing breast cancer is age related: breast and ovarian cancer incidence is much higher among post-menopausal women. This suggests that in the majority of breast cancer patients an accumulation of genetic defects is required for initiation and progression of the tumor.

About 15% of breast cancer patients have a family history of the disease. In this group of patients there is a higher chance of an underlying inherited genetic defect in a breast cancer susceptibility gene. These patients develop breast cancer at an earlier age (Knudson hypothesis). The BReast CAncer genes 1 (BRCA1) and 2 (BRCA2) are the best known genes associated with an increased risk for breast cancer, but mutations in these genes are responsible for less than 5% of all breast cancers [7] and for less than 20% of the familial cases [8]. Both BRCA1 and BRCA2 act as TSGs, and loss of the second allele (second hit) is seen in the tumors of patients carrying BRCA1/2 germline mutations. Both genes function as caretakers by maintaining genomic integrity and repairing DNA damage (see Functions of BRCA1 and BRCA2).

Genetic susceptibility to breast cancer

Introduction

Generally, a distinction is made between high and moderate risk families with breast and/or ovarian cancer. In families with a clear autosomal dominant inheritance pattern for breast cancer, the diagnosis of Hereditary Breast and/or Ovarian Cancer (HBOC) is proposed. Members of such families a priori have a highly increased risk of developing breast cancer. The age at onset is considerably lower compared to sporadic patients and the prevalence of bilateral or multiple primary breast cancers is higher. HBOC represents 5 to 8% of all breast cancer cases. The clinical diagnostic criteria are given in Table 1.

TABLE 1: Diagnostic criteria for Hereditary Breast and/or Ovarian Cancer (HBOC) syndrome
Diagnosis of breast or ovarian cancer in:
1. At least 3 first degree relatives (or second degree relatives in case of paternal inheritance)
2. In at least 2 successive generations
3. And at least one of the patients diagnosed before the age of 50

Recently (2010) the diagnostic criteria defining the HBOC syndrome have been updated to: diagnosis of breast or ovarian cancer caused by a mutation in BRCA1 or BRCA2 (StOET, Stichting Opsporing Erfelijke Tumoren, The Netherlands).

In families where breast cancer is overrepresented compared to the incidence in the general population, but which do not fulfil the criteria for hereditary breast cancer, the diagnosis of familial breast and/or ovarian cancer (FBOC) is made (Table 2). Approximately 15% of all breast cancer patients have an affected first degree relative or, in case of paternal inheritance, an affected second degree relative.
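The three clinical criteria of Table 1 can be read as a simple rule. The following is a minimal sketch of that rule, not a validated clinical tool: the `AffectedRelative` record, its field names and the way generations are numbered are all assumptions introduced here for illustration, and the 2010 mutation-based definition is not modelled.

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass(frozen=True)
class AffectedRelative:
    """One relative diagnosed with breast or ovarian cancer (hypothetical record)."""
    degree: int                  # 1 = first degree, 2 = second degree
    generation: int              # consecutive integers for successive generations
    age_at_diagnosis: int
    paternal_side: bool = False  # relationship runs through the father


def meets_hboc_criteria(affected: Iterable[AffectedRelative]) -> bool:
    """Apply the three Table 1 criteria to a list of affected relatives."""
    # Criterion 1: first degree relatives, or second degree via paternal inheritance
    qualifying = [
        r for r in affected
        if r.degree == 1 or (r.degree == 2 and r.paternal_side)
    ]
    if len(qualifying) < 3:
        return False
    # Criterion 2: at least two successive generations affected
    generations = {r.generation for r in qualifying}
    successive = any(g + 1 in generations for g in generations)
    # Criterion 3: at least one diagnosis before the age of 50
    early_onset = any(r.age_at_diagnosis < 50 for r in qualifying)
    return successive and early_onset


family = [
    AffectedRelative(degree=1, generation=1, age_at_diagnosis=62),
    AffectedRelative(degree=1, generation=2, age_at_diagnosis=45),
    AffectedRelative(degree=1, generation=2, age_at_diagnosis=55),
]
print(meets_hboc_criteria(family))
```

Families that fail this check but still show clustering would be assessed against the FBOC criteria instead.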

TABLE 2: Diagnostic criteria for Familial Breast and/or Ovarian Cancer (FBOC) syndrome (StOET, 2010)
Diagnosis of breast cancer in:
1. One first and one second degree relative with an average age of diagnosis before the age of [...]
2. Two second degree relatives with an average age of diagnosis before the age of [...]
3. Three or more first or second degree relatives, irrespective of age of diagnosis
4. But not fulfilling the HBOC criteria
Diagnosis of bilateral or multifocal breast cancer in:
1. One first degree relative with the first tumor diagnosed before the age of 50
Diagnosis of ovarian cancer in:
1. One first or second degree relative irrespective of age of diagnosis, and diagnosis of breast cancer in one first or second degree relative before the age of 60 (at least one first degree relative)

High-Penetrance Breast Cancer Predisposition Genes

Mutations in three high-penetrance breast cancer predisposition genes confer a higher than tenfold relative risk of breast cancer: BRCA1, BRCA2 and P53. Germline mutations in P53 lead to Li-Fraumeni syndrome and are associated with other phenotypic manifestations as well. BRCA1 and BRCA2 are the two major breast cancer genes. Genetic testing for these genes is widely available in a clinical context. As mutation detection in these genes is challenging and expensive, a number of models and scoring systems have been proposed to estimate the probability of the presence of a BRCA1 or BRCA2 mutation in a given individual based on family history and age at onset. Important and widely used examples are the Claus, Tyrer-Cuzick and BRCAPRO models [9], but these cannot be reliably used for patients without a family history of breast or ovarian cancer. Hence, many labs worldwide apply easily applicable, descriptive inclusion criteria based on age of onset and family history.

The BRCA1 gene was identified by positional cloning in 1994 [10]. It is located on chromosome 17q21 and spans approximately 81 kb on the minus strand. BRCA1 contains 22 coding exons, including a large exon 11 which covers almost 50% of the coding sequence. BRCA1 spans a region with an unusually high density of Alu repetitive DNA (41.5%), but a relatively low density (4.8%) of other repetitive sequences (Figure 2). BRCA1 introns range in size from 403 bp to 9.2 kb and contain 3 intragenic microsatellite markers located in introns 12, 19, and 20 [11]. The 5' end of the BRCA1 gene lies within a duplicated region on chromosome 17q21. This region contains BRCA1 exons 1A, 1B, and 2 and their surrounding introns; as a result, a BRCA1 pseudogene lies upstream of BRCA1 [12].

BRCA2 was identified by Wooster et al. [13] in 1995 on chromosome 13q12-q13. BRCA2 contains 27 exons (26 coding) and spans 84 kb of the genome. Similar to BRCA1, BRCA2 is very AT-rich, has a large exon 11 and a translational start site in exon 2 (Figure 2) [14].

Figure 2: Genomic structure of the BRCA1 and BRCA2 genes. BRCA1 has 22 coding exons, including a large exon 11 (panel A). Panel B shows the repetitive elements in BRCA1. BRCA2 has 26 coding exons, including large exons 10 and 11 (panel C). Panel D shows the repetitive elements in BRCA2 (modified from [15]).

The BRCA1 gene encodes a protein of 1863 amino acids [10] (Figure 3). Most of the coding region shows no sequence similarity to previously described proteins, apart from the presence of a RING zinc finger domain [16] at the amino terminus of the protein and two 'BRCT' (BRCA1 carboxyl terminus) repeats at the carboxyl terminus [17, 18]. The RING finger domain is involved in mediating protein-protein interactions. The BRCT repeat is a poorly conserved domain found in a range of proteins, many of which are involved in either DNA repair or metabolism [18]. A two-hybrid screen with the RING finger domain region of BRCA1 resulted in the isolation of a novel gene, BARD1, the encoded product of which contains both a RING finger and a BRCT domain but has no other similarity to BRCA1 [19]. Furthermore, a nuclear localization signal (NLS) domain was identified in exon 11, which is absent in several alternatively spliced mRNAs, leading to the hypothesis that splicing regulates BRCA1 function and its expression in different tissues [20].

Figure 3: Features of the human BRCA proteins. BRCA1 contains an N-terminal RING domain, nuclear localization signals (NLSs), and two C-terminal BRCT domains of 110 residues (also found in several proteins with functions in DNA repair or cell cycle control). Examples of interacting proteins are shown below the approximate regions of binding. BRCA2 contains eight repeats of the 40-residue BRC motif (adapted from [21]).

BRCA2 encodes a protein of 3418 amino acids with an estimated molecular weight of 384 kDa [13, 14] (Figure 3). Although BRCA2 shows no strong similarity to known proteins, a significant feature is the presence of eight copies of an amino acid repeat (BRC repeats) in the part of the protein encoded by exon 11 [22]. Comparison of the sequence of exon 11 from six mammalian species reveals that many of the repeats appear to be retained within the generally poorly conserved context of exon 11 [23].

Functions of BRCA1 and BRCA2

Both BRCA1 and BRCA2 are involved in maintaining genome integrity, at least in part by engaging in DNA repair, cell cycle checkpoint control and even the regulation of key mitotic or cell division steps (caretakers). Unsurprisingly, the complete loss of function of either protein leads to a dramatic increase in genomic instability. A summary of the pathways and protein-protein interactions involved is given in Figure 4.

Figure 4: Overview of the pathways used by BRCA1/2 to maintain cell integrity (adapted from [24]).

The RING domain at the amino terminus of BRCA1 mediates the interaction with BARD1. The BRCA1/BARD1 heterodimer shows ubiquitin ligase activity [25, 26]. The central region interacts with the DNA repair protein complex Mre11-Rad50-NBS1 and the transcriptional repressor ZBRK1 [27, 28]. The Mre11-Rad50-NBS1 complex binds to and processes DNA double-stranded breaks. This complex is involved in both nonhomologous end joining and homologous recombinational repair. The carboxyl terminus of BRCA1 contains tandem BRCA1 C-terminal (BRCT) repeats. This region binds to phospho-peptides [29, 30] involved in cell-cycle checkpoints and DNA repair. Several proteins, including BACH1, CtIP, Acetyl-CoA carboxylase, Abraxas/CCDC98, and RAP80, interact with the BRCT domain of BRCA1 in a phospho-dependent manner [31]. This reveals how BRCA1 maintains genomic stability through DNA repair and cell cycle checkpoint activation.

Three BRCA1 protein complexes have been characterized. One complex contains BRCA2 and PALB2, a BRCA2-associated protein [32, 33]. The BRC domain in the middle third of BRCA2 binds to the Rad51 recombinase [34, 35]. BRCA2-deficient cells show defective formation of IR-induced Rad51 foci [36]. Both the formation of Rad51 nuclear filaments and Rad51-mediated strand exchange during homologous recombination are regulated by BRCA2 [37]. At the carboxyl terminus of BRCA2 is a region with extensive secondary structure that interacts with the evolutionarily conserved protein DSS1. Based on the 3-D structure, it is predicted that the high-affinity ssDNA-binding and dsDNA-binding domains of BRCA2 play critical roles in homologous recombination [38].

Mutation spectrum of BRCA1/2

BRCA1 and BRCA2 are probably the most frequently sequenced genes to date. Pathogenic mutations are identified throughout the complete coding and splice site regions of both genes, without evidence of hot spot regions. The most prevalent mutations in BRCA1/2 are truncating mutations, which include nonsense, frameshift (due to small insertions and/or deletions) and splice site mutations. Over 670 distinct truncating mutations have been reported for BRCA1 and over 730 for BRCA2 in the Breast Cancer Information Core (BIC) (data retrieved in February 2011) [39]. In both genes, truncating mutations are identified far more frequently than missense mutations. The pathogenic effect has been proven for only a limited number of missense mutations [39]. These lie primarily in regions encoding the BRCA1 protein RING and BRCT repeats, which are known to be involved in functionally important protein-protein interactions.
Missense mutations in the first codon of BRCA1 and BRCA2 are classified as pathogenic as well [40], but the effects of the base changes are more likely related to translation initiation than to amino-acid substitution per se.

In addition to point mutations, BRCA1 and BRCA2 are both known to have germline mutations resulting from larger scale genomic rearrangements, which result in deletions or duplications of one or more exons, usually causing premature stop codons. Depending on the selection criteria and the ethnic background of the study group, large scale rearrangements in BRCA1 have been reported in 2-12% of high risk families and might represent 7-40% of all BRCA1 mutations identified [41-44]. More recently, such mutations have been found in BRCA2 in 2-8% of high risk families, again varying with the selection criteria and the investigated population [44-46]. Large genomic rearrangements are not detected with the traditional PCR-based mutation detection methods, so their contribution might be underestimated in some populations.

BRCA1/2 mutation prevalence

The incidence of mutations in high risk families varies widely amongst different populations: some present a wide spectrum of different mutations, while in particular ethnic groups specific mutations show a high frequency due to a founder effect. Founder mutations are more prevalent in groups of people who have remained isolated with consequent interbreeding. As a result, rare mutations become more common within the population. One of the best known founder effects for breast and ovarian cancer is found in the Ashkenazi Jewish population. Ashkenazi is the term used to describe Jews who have ancestors from Eastern and Central Europe, such as Germany, Poland, Lithuania, Ukraine and Russia. Although only 10% of breast and ovarian cancer cases are due to a genetic predisposition, the hereditary proportion in the Ashkenazi Jewish population is much higher because of 3 founder mutations: c.68_69delAG (185delAG) and c.5266dupC (5382insC) in BRCA1, and c.5946delT (6174delT) in BRCA2. For example, BRCA1 c.68_69delAG is found in 1% of the population and contributes to 16-20% of all breast cancers diagnosed before the age of 50 [47].

Risks associated with germline mutations in BRCA1 and BRCA2

BRCA2 mutations have been linked to a wide spectrum of cancers, including prostate cancer, pancreatic cancer, fallopian tube cancer, male breast cancer and skin cancer.
In contrast, a BRCA1 mutation is primarily associated with breast and ovarian cancers. Plausible explanations include the connection between BRCA1, but not BRCA2, and steroid hormone receptors [48] and the role of BRCA1 in mammary luminal epithelial lineage determination [49].

Several approaches have been used to estimate the average age-specific cumulative cancer risks associated with mutations in BRCA1 and BRCA2. In general, for BRCA1 carriers, the estimated cumulative risks to the age of 70 are 65-80% for breast cancer and 39-60% for ovarian cancer. The corresponding risks for BRCA2 carriers are 45-60% for breast cancer and 11-30% for ovarian cancer (Figure 5) [50, 51]. In comparison, the average woman in the general population has an 11% lifetime risk of developing breast cancer and a 1.5% risk of developing ovarian cancer. After the initial diagnosis of breast cancer in a BRCA1 or BRCA2 carrier, the risk of cancer in the contralateral breast (a new primary cancer) increases by approximately 3% per year [52, 53].

Figure 5: Cumulative risk plots for breast and ovarian cancer associated with BRCA1 and BRCA2 mutations. Cumulative risk of breast (diamonds) and ovarian (squares) cancer in BRCA1 (left panel) and BRCA2 (right panel) mutation carriers (adapted from [50]).

The broad range of associated risks reported in the literature is consistent with the hypothesis that risks in BRCA1 or BRCA2 mutation carriers can vary substantially due to the presence of additional risk factors for breast cancer, including genetic modifiers. Common genetic modifiers of breast cancer risk for carriers of mutations in BRCA1 and BRCA2 have been identified in essentially three ways: studies of single nucleotide polymorphisms (SNPs) in candidate genes [54-56]; studies of common SNPs (minor allele frequency >0.13) found in genome-wide association studies (GWAS) to be associated with a small increased breast cancer risk (odds ratio <1.30) in the general population (reviewed in [51]); and GWAS carried out in mutation carriers [57]. BRCA1 and BRCA2 mutation carriers could potentially be among the first groups of individuals for whom clinically applicable risk profiling could be developed using the common breast cancer susceptibility variants identified through GWAS. Other studies investigated risk variation in mutation carriers based on factors such as parity, age at first live birth, breastfeeding and mammographic density [50, 58-60]. Although the results from those studies are not fully consistent, they suggest that the relative risks conferred by these factors in germline mutation carriers may be similar to the relative risks in non-carriers.

Preventive measurements to reduce breast cancer risk in BRCA1/2 mutation carriers

Individuals carrying mutations in BRCA1/2 have a 45-80% chance of developing breast cancer. There is evidence that strategies to reduce the risk of cancer in populations who carry these mutations are effective [61]; it is therefore important to identify those patients who will benefit from genetic testing. Women with a BRCA1 or BRCA2 mutation may consider several options for breast cancer prevention. The three main options are prophylactic mastectomy, prophylactic oophorectomy, and chemoprevention. The goal of prophylactic mastectomy is to prevent breast cancer, thereby eliminating the potential for metastatic spread and death from the disease. Studies suggest that the residual breast cancer risk after mastectomy is minimal (less than 5%), and much lower than the risk of breast cancer in the general population [62-64]. It has been shown that BRCA1-associated triple negative breast cancers are hormonally associated [65]. The purpose of an anti-hormonal therapy is to eliminate or block the effect of ovarian estrogen, and probably progesterone, or to prevent aromatization of androgen to estrogen. Tamoxifen and oophorectomy have been well studied in women with BRCA1 or BRCA2 mutations.
Cohort studies estimate the reduction in hereditary breast cancer risk associated with a premenopausal oophorectomy to be about 50% [66-69]. Tamoxifen is a selective estrogen receptor modulator which competes with estrogen for binding to the estrogen receptor. In humans, tamoxifen acts as an estrogen antagonist in breast tissue, inhibiting the growth of estrogen-dependent breast tumors [70]. On theoretical grounds, tamoxifen should not reduce the incidence of estrogen-receptor (ER) negative breast cancers, and most breast cancers that occur in BRCA1 (but not BRCA2) carriers are ER negative.

More recently, Poly(ADP-ribose) polymerase (PARP) inhibitors emerged as a promising new class of targeted agents for the treatment of patients with various malignancies (breast, ovarian and prostate cancer) [71]. The PARP enzyme plays a key role in the repair of DNA breaks. Evidence suggests that normal cells can tolerate PARP inhibitors by activating backup repair pathways that depend upon the activity of BRCA1 and BRCA2 proteins. Though there is widespread enthusiasm to move these drugs forward quickly, much remains to be understood about the optimal use of these novel agents.

Breast Cancer Predisposition Genes of Uncertain Penetrance

Three genetic syndromes have clearly been associated with an increased risk of breast cancer, but the magnitude of the associated risk for each remains uncertain: Cowden disease, associated with mutations in PTEN [72]; Peutz-Jeghers syndrome, associated with germline mutations in LKB1 [73]; and hereditary diffuse gastric cancer syndrome, caused by CDH1 mutations [74]. All these syndromes are associated with other phenotypic manifestations, and in site-specific breast cancer families the contribution of mutations in PTEN, LKB1 or CDH1 is at most infrequent [75]. These genes are not expected to act as breast cancer susceptibility genes outside the context of these syndromes. Together, they may cause less than 1% of all breast cancers (Figure 6).

Figure 6: Prevalence of sporadic vs hereditary and familial breast cancers. About 40% of all inherited forms of breast cancer can be explained by mutations in single genes (prevalence numbers adapted from [75]).

Intermediate-Penetrance Breast Cancer Predisposition Genes

As numerous genetic linkage studies have failed to identify additional high risk genes, it has been hypothesized that the remaining breast cancer clustering among families might not be explained simply by inheritance of variants in additional major high risk breast cancer susceptibility genes. Chance clustering and shared environmental and lifestyle factors may account for part of the familial breast cancer cases. However, twin studies have shown that the breast cancer risk for unaffected monozygotic twins with a co-twin diagnosed with breast cancer is higher than the risk for dizygotic twins. This strongly suggests that a significant proportion of the remaining familial breast cancer risk is likely due to genetic factors [76, 77].

As BRCA1 and BRCA2 are involved in DNA repair, and heterozygous mutations in DNA repair genes such as ATM and TP53 have been associated with an increased breast cancer risk, most candidate gene approaches now focus on genes involved in DNA repair, such as CHEK2, RAD50, BRIP1 and PALB2. Thus far, this candidate approach has led to the identification of five moderate risk breast cancer genes (CHEK2, ATM, BRIP1, PALB2 and NBS1). Odds ratios for heterozygous mutations in these genes lie between 2.0 and 4.3 (Figure 7). Variants in these intermediate penetrance genes are currently estimated to account for 5% of familial breast cancer risk and are present at 1% allele frequency [78].
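To get a feel for how variant frequency and relative risk trade off across the three risk classes summarized in Figure 7, one can compute a population attributable fraction. The sketch below uses Levin's standard formula, which is not taken from the thesis or from [78]; it treats the approximate frequencies and fold-risks quoted for Figure 7 as carrier prevalence and relative risk, which is a simplifying assumption, and the function name is hypothetical.

```python
def attributable_fraction(carrier_frequency: float, relative_risk: float) -> float:
    """Levin's population attributable fraction: p(RR - 1) / (1 + p(RR - 1))."""
    excess = carrier_frequency * (relative_risk - 1.0)
    return excess / (1.0 + excess)


# Illustrative upper-end values as quoted for Figure 7 (assumed, not fitted)
risk_classes = {
    "high risk (0.1%, 20-fold)": (0.001, 20.0),
    "moderate risk (1%, 3-fold)": (0.01, 3.0),
    "low risk (40%, 1.3-fold)": (0.40, 1.3),
}

for label, (frequency, rr) in risk_classes.items():
    share = attributable_fraction(frequency, rr)
    print(f"{label}: {share:.1%} of cases attributable")
```

Under these assumptions a rare high risk variant and a moderately rare moderate risk variant each account for a similar, small share of cases, while a common low risk allele can account for a larger share despite its modest effect on individual risk.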

Figure 7: Germline mutations that confer susceptibility to breast cancer. Schematic representation of the relation between the breast cancer risk conferred by germline variants in low-risk, moderate-risk and high-risk breast cancer genes and their prevalence in the general population. Germline mutations in high-risk breast cancer susceptibility genes are very rare in the population (<0.1%) but confer a high breast cancer risk (up to 20-fold). Germline variations in moderate-risk breast cancer genes are rare in the population (~1%) and confer a 2 to 4-fold risk. Low-risk breast cancer alleles and SNPs are more common (up to 40%) but confer only a slightly increased breast cancer risk (on average 1.3-fold). The red line represents a natural risk limitation and the bottom line represents a virtual limitation of detection (adapted from [79]).

The link between DNA repair and breast cancer susceptibility became even more intriguing after homozygous mutations in BRCA2 were found to be responsible for Fanconi Anaemia (FA) and BRCA2 was shown to be identical to FANCD1 [80]. No biallelic germline BRCA1 mutations have been reported in FA patients so far, but mutations in two other breast cancer susceptibility genes, PALB2 (FANCN) and BRIP1 (FANCJ), have been identified in FA patients, giving rise to type N FA and type J FA, respectively [81].

Recently, a fourth gene was found to be probably associated with both FA and breast cancer susceptibility. Vaz et al. [82] described an FA-like phenotype in a consanguineous Pakistani family with three affected children, of whom only one survived. These three patients exhibited various severe congenital anomalies, and a homozygous missense mutation in the RAD51C gene was found to be the cause of these abnormalities. Because three genes (BRCA2, PALB2, BRIP1) were already known to be associated with both FA and breast cancer susceptibility, it seemed reasonable to screen breast cancer families for monoallelic mutations in the RAD51C gene. Meindl et al. [83] detected six monoallelic pathogenic mutations in RAD51C by screening unrelated German women with gynecologic malignancies (breast and/or ovarian tumors). Strikingly, all six deleterious mutations were found exclusively within 480 BRCA1/2-negative patients from breast and ovarian cancer families. No deleterious mutations were found in breast-cancer-only families [83]. These results support RAD51C as a new breast cancer susceptibility gene and, given its function and its similarities with the other breast cancer susceptibility genes, a role as tumor suppressor gene and caretaker has been proposed.

Common low-penetrance Breast Cancer Predisposition Alleles

Genome-wide association studies have identified several SNPs as low-penetrance breast cancer susceptibility polymorphisms, both within genes and in chromosomal loci with no known genes. The list of reported SNPs is growing and currently includes variants in FGFR2 [84, 85], LSP1 [85], MAP3K1 [85], TGFB1 [54] and TOX3 [85, 86], as well as loci on 2q35 [86] and 8q [85]. The odds ratios for heterozygous and homozygous carriers range between 1.1 and 1.3, and between 1.2 and 1.6, respectively. These SNPs are common in the general population (up to 40%), and more studies will be needed to validate the relative risks in different large cohorts. Under a multiplicative model of disease susceptibility, the low-penetrance variants are estimated to explain about 8.3% of the familial clustering of the disease [87]. This suggests that many other variants remain to be identified.
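To make the multiplicative model concrete, the following minimal sketch multiplies per-locus odds ratios into a combined estimate, assuming the loci act independently. The genotype profile is hypothetical: its values are simply chosen within the ranges quoted above (1.1 to 1.3 for heterozygotes, 1.2 to 1.6 for homozygotes) and do not correspond to measured effects of any specific SNP.

```python
from math import prod


def combined_odds_ratio(per_locus_ors):
    """Combine per-locus odds ratios under a multiplicative model.

    Assumes the loci act independently, so the individual odds
    ratios are simply multiplied. An empty profile gives 1.
    """
    return prod(per_locus_ors)


# Hypothetical carrier: heterozygous at two loci (1.2 and 1.1)
# and homozygous at a third (1.4). Illustrative values only.
hypothetical_profile = [1.2, 1.1, 1.4]

print(round(combined_odds_ratio(hypothetical_profile), 3))
```

Even with three risk alleles the combined estimate stays below 2, which illustrates why many such variants would be needed to account for a substantial share of familial clustering.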


Research objectives and outline of the thesis

The identification of a genetic defect in a family with a presumed inherited predisposition is of major importance. From a medical point of view, it opens new perspectives for (predictive) genetic testing of relatives. It is clear that preventive actions reduce the risk of breast cancer in high-risk patients. To answer the compelling demand for more certainty in an increasing number of families with a strong predisposition for breast cancer, efficient mutation detection technologies allowing higher throughput are required.

The first aim of this study was the evaluation and validation of new mutation detection technologies. The introduction of these new technologies allowed BRCA1/2 genetic testing to be offered to an increasing number of patients. Through these studies we gained more insight into the mutation prevalence in patient groups that were previously not eligible for genetic testing, such as sporadic patients.

Despite extensive analysis of the BRCA1/2 genes, mutations were detected in less than 20% of the families. As evidence is accumulating for the existence of additional breast cancer genes, we aimed to gain insight into the role of new candidate genes in our study population.

Objective 1: Implementation of High resolution melting curve analysis in a diagnostic setting

At the start of this thesis, High resolution melting curve analysis (HRMCA) was only beginning to emerge as a new mutation detection strategy. Only a few pilot studies were available, and these indicated that HRMCA might be suitable for use in a diagnostic setting. We optimized mutation scanning for BRCA1/2 and validated this new technique on two available platforms. Subsequently, this workflow was implemented in our center on the best performing platform as a routine BRCA1/2 mutation detection strategy. In a second stage, we developed supplemental HRMCA genotyping assays to strongly reduce the remaining sequencing burden.

Objective 2: Taking throughput to another level, introduction of massive parallel pyrosequencing in BRCA1/2 mutation detection

Despite improvements in mutation detection strategies in terms of throughput and cost efficiency, most strategies remain restricted to screening a limited number of amplicons, i.e. individual disease genes, followed by Sanger sequencing of the aberrant samples. The development of massively parallel sequencing (MPS) technologies heralded an era in which molecular diagnostics for multigenic disorders becomes reality. Because of their size, polymorphic character and lack of mutation hot spots, the breast cancer susceptibility genes were excellent candidates to evaluate the challenges and pitfalls inherent to this technology. Furthermore, we gained general insights into the limitations of pyrosequencing and developed workarounds and tools which enable rapid progress on the road to next generation molecular diagnostics.

Objective 3: BRCA1/2 mutations in sporadic breast cancer patients, implications for genetic testing

Because of the high incidence of breast cancer, the complicated and expensive mutation detection analyses and the low prevalence of BRCA1/2 mutations, offering genetic testing to all breast cancer patients is currently not feasible. Accurate inclusion criteria to select those patients who will benefit from genetic testing are extremely important. With the introduction of the new technologies, we were able to offer genetic testing to an increasing number of patients. To gain further insight into the prevalence of BRCA1/2 mutations in sporadic breast and ovarian cancer patients in our study population, we evaluated whether genetic testing of patients without any family history of breast cancer is worthwhile.

Objective 4: Evaluation of RAD51C as a new breast cancer susceptibility gene in the Belgian/Dutch population

Recent literature reported RAD51C as a new highly penetrant breast-ovarian cancer gene in German families. To evaluate whether mutation analysis of this gene should be included in routine breast and ovarian cancer genetic testing in our centre, we investigated the prevalence of RAD51C mutations in the Belgian and Dutch population.

33 Chapter 1 References 1. Hanahan, D. and R.A. Weinberg, The hallmarks of cancer. Cell, (1): p Bishop, J.M., Molecular themes in oncogenesis. Cell, (2): p Knudson, A.G., Jr., H.W. Hethcote, and B.W. Brown, Mutation and childhood cancer: a probabilistic model for the incidence of retinoblastoma. Proceedings of the National Academy of Sciences of the United States of America, (12): p Kinzler, K.W. and B. Vogelstein, Landscaping the cancer terrain. Science, (5366): p Kinzler, K.W. and B. Vogelstein, Cancer-susceptibility genes. Gatekeepers and caretakers. Nature, (6627): p. 761, Rieck, G. and A. Fiander, The effect of lifestyle factors on gynaecological cancer. Best practice & research. Clinical obstetrics & gynaecology, (2): p Wooster, R. and B.L. Weber, Breast and ovarian cancer. The New England journal of medicine, (23): p Antoniou, A.C. and D.F. Easton, Risk prediction models for familial breast cancer. Future oncology, (2): p Fasching, P.A., et al., Evaluation of mathematical models for breast cancer risk assessment in routine clinical use. European journal of cancer prevention : the official journal of the European Cancer Prevention Organisation, (3): p Miki, Y., et al., A strong candidate for the breast and ovarian cancer susceptibility gene BRCA1. Science, (5182): p Smith, T.M., et al., Complete genomic sequence and analysis of 117 kb of human DNA containing the gene BRCA1. Genome research, (11): p Puget, N., et al., Distinct BRCA1 rearrangements involving the BRCA1 pseudogene suggest the existence of a recombination hot spot. American journal of human genetics, (4): p Wooster, R., et al., Identification of the breast cancer susceptibility gene BRCA2. Nature, (6559): p Tavtigian, S.V., et al., The complete BRCA2 gene and mutations in chromosome 13q-linked kindreds. Nature genetics, (3): p Welcsh, P.L. and M.C. King, BRCA1 and BRCA2 and the genetics of breast and ovarian cancer. 
Human molecular genetics, (7): p Saurin, A.J., et al., Does this have a familiar RING? Trends in biochemical sciences, (6): p Koonin, E.V., S.F. Altschul, and P. Bork, BRCA1 protein products... Functional motifs. Nature genetics, (3): p Callebaut, I. and J.P. Mornon, From BRCA1 to RAP1: a widespread BRCT module closely associated with DNA repair. FEBS letters, (1): p Wu, L.C., et al., Identification of a RING protein that can interact in vivo with the BRCA1 gene product. Nature genetics, (4): p Thakur, S., et al., Localization of BRCA1 and a splice variant identifies the nuclear localization signal. Molecular and cellular biology, (1): p Venkitaraman, A.R., Cancer susceptibility and the functions of BRCA1 and BRCA2. Cell, (2): p Bork, P., N. Blomberg, and M. Nilges, Internal repeats in the BRCA2 protein sequence. Nature genetics, (1): p Bignell, G., et al., The BRC repeats are conserved in mammalian BRCA2 proteins. Human molecular genetics, (1): p Qiagen, 27

34 Chapter Chen, A., et al., Autoubiquitination of the BRCA1*BARD1 RING ubiquitin ligase. The Journal of biological chemistry, (24): p Wu-Baer, F., et al., The BRCA1/BARD1 heterodimer assembles polyubiquitin chains through an unconventional linkage involving lysine residue K6 of ubiquitin. The Journal of biological chemistry, (37): p Zheng, L., et al., Sequence-specific transcriptional corepressor function for BRCA1 through a novel zinc finger protein, ZBRK1. Molecular cell, (4): p Zhong, Q., et al., Association of BRCA1 with the hrad50-hmre11-p95 complex and the DNA damage response. Science, (5428): p Manke, I.A., et al., BRCT repeats as phosphopeptide-binding modules involved in protein targeting. Science, (5645): p Yu, X., et al., The BRCT domain is a phospho-protein binding domain. Science, (5645): p Rodriguez, M.C. and Z. Songyang, BRCT domains: phosphopeptide binding and signaling modules. Frontiers in bioscience : a journal and virtual library, : p Sy, S.M., M.S. Huen, and J. Chen, PALB2 is an integral component of the BRCA complex required for homologous recombination repair. Proceedings of the National Academy of Sciences of the United States of America, (17): p Zhang, F., et al., PALB2 links BRCA1 and BRCA2 in the DNA-damage response. Current biology : CB, (6): p Chen, P.L., et al., The BRC repeats in BRCA2 are critical for RAD51 binding and resistance to methyl methanesulfonate treatment. Proceedings of the National Academy of Sciences of the United States of America, (9): p Wong, A.K., et al., RAD51 interacts with the evolutionarily conserved BRC motifs in the human breast cancer susceptibility gene brca2. The Journal of biological chemistry, (51): p Yuan, S.S., et al., BRCA2 is required for ionizing radiation-induced assembly of Rad51 complex in vivo. Cancer research, (15): p Thorslund, T., F. Esashi, and S.C. West, Interactions between human BRCA2 protein and the meiosis-specific recombinase DMC1. 
The EMBO journal, (12): p Yang, H., et al., BRCA2 function in DNA binding and recombination from a BRCA2-DSS1-ssDNA structure. Science, (5588): p Core), B.B.C.I Fackenthal, J.D. and O.I. Olopade, Breast cancer risk associated with BRCA1 and BRCA2 in diverse populations. Nature reviews. Cancer, (12): p Hendrickson, B.C., et al., Prevalence of five previously reported and recurrent BRCA1 genetic rearrangement mutations in 20,000 patients from hereditary breast/ovarian cancer families. Genes, chromosomes & cancer, (3): p Hogervorst, F.B., et al., Large genomic deletions and duplications in the BRCA1 gene identified by a novel quantitative method. Cancer research, (7): p Montagna, M., et al., Genomic rearrangements account for more than one-third of the BRCA1 mutations in northern Italian breast/ovarian cancer families. Human molecular genetics, (9): p Walsh, T., et al., Spectrum of mutations in BRCA1, BRCA2, CHEK2, and TP53 in families at high risk of breast cancer. JAMA : the journal of the American Medical Association, (12): p Agata, S., et al., Large genomic deletions inactivate the BRCA2 gene in breast cancer families. Journal of medical genetics, (10): p. e Tournier, I., et al., Significant contribution of germline BRCA2 rearrangements in male breast cancer families. Cancer research, (22): p

35 Chapter Struewing, J.P., et al., The carrier frequency of the BRCA1 185delAG mutation is approximately 1 percent in Ashkenazi Jewish individuals. Nature genetics, (2): p Lee, E.Y., Promotion of BRCA1-associated triple-negative breast cancer by ovarian hormones. Current opinion in obstetrics & gynecology, (1): p Visvader, J.E., Keeping abreast of the mammary epithelial hierarchy and breast tumorigenesis. Genes & development, (22): p Antoniou, A., et al., Average risks of breast and ovarian cancer associated with BRCA1 or BRCA2 mutations detected in case Series unselected for family history: a combined analysis of 22 studies. American journal of human genetics, (5): p Milne, R.L. and A.C. Antoniou, Genetic modifiers of cancer risk for BRCA1 and BRCA2 mutation carriers. Annals of oncology : official journal of the European Society for Medical Oncology / ESMO, Suppl 1: p. i Metcalfe, K., et al., Contralateral breast cancer in BRCA1 and BRCA2 mutation carriers. Journal of clinical oncology : official journal of the American Society of Clinical Oncology, (12): p Verhoog, L.C., et al., Contralateral breast cancer risk is influenced by the age at onset in BRCA1-associated breast cancer. British journal of cancer, (3): p Cox, A., et al., A common coding variant in CASP8 is associated with breast cancer risk. Nature genetics, (3): p Antoniou, A.C., et al., RAD51 135G-->C modifies breast cancer risk among BRCA2 mutation carriers: results from a combined analysis of 19 studies. American journal of human genetics, (6): p Engel, C., et al., Association of the variants CASP8 D302H and CASP10 V410I with breast and ovarian cancer risk in BRCA1 and BRCA2 mutation carriers. 
Cancer epidemiology, biomarkers & prevention : a publication of the American Association for Cancer Research, cosponsored by the American Society of Preventive Oncology, (11): p Antoniou, A.C., et al., A locus on 19p13 modifies risk of breast cancer in BRCA1 mutation carriers and is associated with hormone receptor-negative breast cancer in the general population. Nature genetics, (10): p Andrieu, N., et al., Pregnancies, breast-feeding, and breast cancer risk in the International BRCA1/2 Carrier Cohort Study (IBCCS). Journal of the National Cancer Institute, (8): p Cullinane, C.A., et al., Effect of pregnancy as a risk factor for breast cancer in BRCA1/BRCA2 mutation carriers. International journal of cancer. Journal international du cancer, (6): p Antoniou, A.C., et al., Parity and breast cancer risk among BRCA1 and BRCA2 mutation carriers. Breast cancer research : BCR, (6): p. R Peters, N., et al., Knowledge, attitudes, and utilization of BRCA1/2 testing among women with early-onset breast cancer. Genetic testing, (1): p Hartmann, L.C., et al., Efficacy of bilateral prophylactic mastectomy in BRCA1 and BRCA2 gene mutation carriers. Journal of the National Cancer Institute, (21): p Meijers-Heijboer, H., et al., Breast cancer after prophylactic bilateral mastectomy in women with a BRCA1 or BRCA2 mutation. The New England journal of medicine, (3): p Rebbeck, T.R., et al., Bilateral prophylactic mastectomy reduces breast cancer risk in BRCA1 and BRCA2 mutation carriers: the PROSE Study Group. Journal of clinical oncology : official journal of the American Society of Clinical Oncology, (6): p Narod, S.A., Modifiers of risk of hereditary breast and ovarian cancer. Nature reviews. Cancer, (2): p Kauff, N.D., et al., Risk-reducing salpingo-oophorectomy in women with a BRCA1 or BRCA2 mutation. The New England journal of medicine, (21): p

36 Chapter Rebbeck, T.R., Prophylactic oophorectomy in BRCA1 and BRCA2 mutation carriers. European journal of cancer, Suppl 6: p. S Rebbeck, T.R., et al., Breast cancer risk after bilateral prophylactic oophorectomy in BRCA1 mutation carriers. Journal of the National Cancer Institute, (17): p Rebbeck, T.R., et al., Prophylactic oophorectomy in carriers of BRCA1 or BRCA2 mutations. The New England journal of medicine, (21): p Pritchard, K.I., Breast cancer prevention with selective estrogen receptor modulators: a perspective. Annals of the New York Academy of Sciences, : p Fong, P.C., et al., Inhibition of poly(adp-ribose) polymerase in tumors from BRCA mutation carriers. The New England journal of medicine, (2): p Rustad, C.F., et al., Germline PTEN mutations are rare and highly penetrant. Hereditary cancer in clinical practice, (4): p Chen, J. and A. Lindblom, Germline mutation screening of the STK11/LKB1 gene in familial breast cancer with LOH on 19p. Clinical genetics, (5): p Pharoah, P.D., P. Guilford, and C. Caldas, Incidence of gastric cancer and breast cancer in CDH1 (E-cadherin) mutation carriers from hereditary diffuse gastric cancer families. Gastroenterology, (6): p Ripperger, T., et al., Breast cancer susceptibility: current knowledge and implications for genetic counselling. European journal of human genetics : EJHG, (6): p Ahlbom, A., et al., Cancer in twins: genetic and nongenetic familial risk factors. Journal of the National Cancer Institute, (4): p Mack, T.M., et al., Heritable breast cancer in twins. British journal of cancer, (3): p Stratton, M.R. and N. Rahman, The emerging landscape of breast cancer susceptibility. Nature genetics, (1): p Harris, T.J. and F. McCormick, The molecular pathology of cancer. Nature reviews. Clinical oncology, (5): p Howlett, N.G., et al., Biallelic inactivation of BRCA2 in Fanconi anemia. Science, (5581): p Levy-Lahad, E., Fanconi anemia and breast cancer susceptibility meet again. 
Nature genetics, (5): p Vaz, F., et al., Mutation of the RAD51C gene in a Fanconi anemia-like disorder. Nature genetics, (5): p Meindl, A., et al., Germline mutations in breast and ovarian cancer pedigrees establish RAD51C as a human cancer susceptibility gene. Nature genetics, (5): p Hunter, D.J., et al., A genome-wide association study identifies alleles in FGFR2 associated with risk of sporadic postmenopausal breast cancer. Nature genetics, (7): p Easton, D.F., et al., Genome-wide association study identifies novel breast cancer susceptibility loci. Nature, (7148): p Stacey, S.N., et al., Common variants on chromosomes 2q35 and 16q12 confer susceptibility to estrogen receptor-positive breast cancer. Nature genetics, (7): p Mavaddat, N., et al., Genetic susceptibility to breast cancer. Molecular oncology, (3): p

Taking throughput in BRCA1/2 mutation detection to the next level

Chapter 2 TAKING THROUGHPUT IN BRCA1/2 MUTATION DETECTION TO THE NEXT LEVEL

Introduction

PAPER 1
Rapid and sensitive detection of BRCA1/2 mutations in a diagnostic setting: comparison of two high-resolution melting platforms.
De Leeneer K., Coene I., Poppe B., De Paepe A. & Claes K.
Clin Chem. 2008; 54(6):

PAPER 2
Genotyping of frequent BRCA1/2 SNPs with unlabeled probes: a supplement to HRMCA mutation scanning allowing to strongly reduce the sequencing burden.
De Leeneer K., Coene I., Poppe B., De Paepe A. & Claes K.
J Mol Diagn. 2009; 11(5):

PAPER 3
Practical tools to implement massively parallel pyrosequencing of PCR products in next generation molecular diagnostics
De Leeneer K., De Schrijver J., Clement L., Baetens M., Lefever S., De Keulenaer S., Van Criekinge W., Deforce D., Van Nieuwerburgh P., Bekaert S., Pattyn F., De Wilde B., Coucke P., Vandesompele J., Claes K. & Hellemans J.
PLOS One (under review)

PAPER 4
Massive parallel amplicon sequencing of the breast cancer genes BRCA1 and BRCA2: opportunities, challenges, and limitations.
De Leeneer K., Hellemans J., De Schrijver J., Baetens M., Poppe B., Van Criekinge W., De Paepe A., Coucke P. & Claes K.
Hum Mutat. 2011; 32(3):

References

Introduction

Mutation detection in BRCA1/2 is challenging in routine diagnostics. Both genes are large and polymorphic, and no real mutation hot spots are present. Furthermore, the number of patients seeking genetic testing increases every year, and turnaround times need to be strongly reduced as new molecular therapies based on knowledge of the DNA sequence become available. For the introduction of a new mutation detection strategy in a diagnostic setting, more than throughput alone has to be taken into consideration. Ideally, the new workflow is more cost-efficient, more sensitive, less labor-intensive and simple enough to be readily implemented.

Several labs perform direct sequencing of the coding sequences and splice sites of the BRCA1/2 genes. Direct sequencing (DS) is considered the 'gold standard' and is capable of 100% detection (no false negatives) of point mutations (substitutions and small deletions or insertions), depending amongst others on sample quality and analysis software. However, due to cost limitations, a wide variety of mutation scanning methods (applied prior to Sanger sequencing) are used in many clinical diagnostic laboratories. The majority of these methods are based on heteroduplex formation during a denaturation-renaturation process. Sequencing is reserved for those fragments that formed heteroduplexes, which result from mismatched strands at points of heterozygosity. Some well-known and widely used examples are denaturing high performance liquid chromatography (DHPLC) [1-3], single-strand conformation polymorphism (SSCP) [1, 4], denaturing gradient gel electrophoresis (DGGE) [5] and heteroduplex analysis (HA) [6]. Another mutation detection test that used to be widely applied is the protein truncation test (PTT). After in vitro transcription and translation, truncated products are easily distinguished from full-length proteins [1, 7].

All these techniques have their own strengths and weaknesses, and a broad range of sensitivities and specificities is reported in the literature. An overview is shown in Table 1.

TABLE 1: Overview of sensitivity, specificity and theoretical throughput of 6 frequently applied BRCA1/2 mutation detection techniques

Technique   Sensitivity          Specificity          Maximum number of      Remarks
            (literature range)   (literature range)   patients per year*
DS          100%                 100%                 86                     gold standard
DHPLC       %                    100%                 298
DGGE        %                    100%                 276
SSCP        %                    %                    121
HA          74-89%               100%                 473
PTT         92%                  100%                                        Protein truncating mutations only

* The following assumptions were made in estimating the cost per DNA fragment analyzed with each technique. Each type of instrument was assumed to be used to full capacity (100%) on an annual basis, and the labor force was assumed to be optimally productive during the number of working days per year (n=217) (modified from [8]).

In paper 1 we propose High resolution melting curve analysis (HRMCA) as a new mutation detection strategy for screening of the BRCA1 and BRCA2 genes. Since HRMCA was virtually unexplored in a diagnostic setting at the start of this thesis, we performed a thorough evaluation of the sensitivity and specificity of this technique on two different platforms. We developed an HRMCA protocol for screening the complete coding region and splice sites of BRCA1/2 and gained insight into the challenges and advantages of HRMCA versus other currently available prescreening techniques.

Since HRMCA mutation scanning is a prescreening technique, its main application is identifying samples in which a sequence variant is present. Because of the polymorphic character of BRCA1/2, the burden of Sanger sequencing the aberrant samples afterwards remains high. Therefore, we designed HRMCA-based genotyping assays using unlabeled probes for the most recurrent Single Nucleotide Polymorphisms (SNPs) in BRCA1/2 (paper 2). By supplementing our workflow with these assays, we succeeded in further decreasing the sequencing workload.
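The sensitivity and specificity figures compared in Table 1, and evaluated for HRMCA in paper 1, follow the usual definitions: sensitivity is the fraction of variant-carrying fragments that is flagged as aberrant, and specificity is the fraction of variant-free fragments that is correctly reported as normal. A minimal sketch of both calculations is given below; the counts are hypothetical and serve only to illustrate the arithmetic, they are not results from this thesis.

```python
def sensitivity(true_positives, false_negatives):
    """Fraction of variant-carrying fragments that were detected."""
    return true_positives / (true_positives + false_negatives)


def specificity(true_negatives, false_positives):
    """Fraction of variant-free fragments correctly called normal."""
    return true_negatives / (true_negatives + false_positives)


# Hypothetical validation run (illustrative counts only):
# 100 fragments with a known variant, 89 of them flagged as aberrant;
# 1000 fragments without a variant, 50 of them flagged as aberrant.
print(f"sensitivity: {sensitivity(89, 11):.1%}")
print(f"specificity: {specificity(950, 50):.1%}")
```

In a prescreening workflow every false positive costs an unnecessary sequencing reaction, whereas a false negative is a missed mutation, which is why sensitivity is fixed first and specificity is then weighed against workload.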

The development of massively parallel sequencing (MPS), or so-called next generation sequencing technologies, opened many new research opportunities. This technology outperforms all currently available mutation detection techniques in terms of throughput. However, before proceeding to next generation molecular diagnostics, a number of hurdles need to be overcome. We evaluated different applications of the pyrosequencing technology (GS-FLX, Roche) and gained insight into the possible pitfalls and opportunities of using this chemistry in a diagnostic setting. Paper 3 summarizes the results obtained in 10 experiments and provides workarounds and calculation tools to facilitate implementation of this new technology in a diagnostic setting. Based on the strategies and methods described in this paper, we successfully developed and validated the screening of the complete coding region of the BRCA1 and BRCA2 genes (paper 4), demonstrating the feasibility of performing more efficient molecular diagnostics using massively parallel sequencing.


Clinical Chemistry 54: (2008)
Molecular Diagnostics and Genetics

Rapid and Sensitive Detection of BRCA1/2 Mutations in a Diagnostic Setting: Comparison of Two High-Resolution Melting Platforms

Kim De Leeneer,¹ Ilse Coene,¹ Bruce Poppe,¹ Anne De Paepe,¹ and Kathleen Claes¹*

BACKGROUND: High-resolution melting is an emerging technique for detection of nucleic acid sequence variations. Developments in instrumentation and saturating intercalating dyes have made accurate high-resolution melting analysis possible and created opportunities to use this technology in diagnostic settings. We evaluated 2 high-resolution melting instruments for screening BRCA1 and BRCA2 mutations.

METHODS: To cover the complete coding region and splice sites, we designed 112 PCR amplicons ( bp), amplifiable with a single PCR program. LCGreen Plus was used as the intercalating dye. High-resolution melting analysis was performed on the 96-well Lightscanner (Idaho Technology Inc.) and the 96-well LightCycler 480 (Roche) instruments. We evaluated sensitivity by analyzing 212 positive controls scattered over almost all amplicons and specificity by blind screening of 22 patients for BRCA1 and BRCA2. In total, we scanned 3521 fragments.

RESULTS: All 212 known heterozygous sequence variants were detected on the Lightscanner by analysis on normal sensitivity setting. On the LightCycler 480, the standard instrument sensitivity setting of 0.3 had to be increased to 0.7 to detect all variants, decreasing the specificity to 95.9% (vs 98.7% for the Lightscanner).

CONCLUSIONS: Previously, we screened BRCA1/2 by direct sequencing of the large exon 11 and denaturing gradient gel electrophoresis (DGGE) for all other coding exons. Since the introduction of high-resolution melting, our turnaround time has been one third of that with direct sequencing and DGGE, as post-PCR handling is no longer required and the software allows fast analyses.
High-resolution melting is a rapid, cost-efficient, sensitive method simple enough to be readily implemented in a diagnostic laboratory.
American Association for Clinical Chemistry

¹ Center for Medical Genetics, Ghent University Hospital, Ghent, Belgium.
* Address correspondence to this author at: Center for Medical Genetics, Ghent University Hospital, De Pintelaan 185, B-9000 Gent, Belgium. Fax ; Kathleen.Claes@UGent.be.
Received October 14, 2007; accepted February 21, Previously published online at DOI: /clinchem
² Human genes: BRCA1, breast cancer 1, early onset; BRCA2, breast cancer 2, early onset.
³ Nonstandard abbreviations: dHPLC, denaturing high-performance liquid chromatography; DGGE, denaturing gradient gel electrophoresis; dsDNA, double-strand DNA; Tm, melting temperature; SNP, single nucleotide polymorphism

Different approaches are used to screen the complete coding region of large genes such as BRCA1² (MIM ) and BRCA2 (MIM ), the 2 major breast cancer susceptibility genes. Most commonly, a prescreening method such as denaturing high-performance liquid chromatography (dHPLC)³ or denaturing gradient gel electrophoresis (DGGE) is used, followed by sequencing of aberrant fragments. Screening of BRCA1 and BRCA2 is arduous because of the complex mutational spectrum and the large size of the genes, for which the complete coding sequence needs to be analyzed for an increasing number of patients. Both genes lack mutation hot-spot regions, and different types of mutations, including frameshift, missense, nonsense, and splice site, are found. BRCA1 comprises 22 coding exons; the 7.8-kb mRNA transcript is translated into a protein of 1863 amino acids (1). BRCA2 comprises 26 coding exons; the 10-kb mRNA transcript is translated into a protein of 3418 amino acids (2).

Until recently, our mutation detection strategy consisted of direct sequencing of the large exon 11 of both BRCA1 and BRCA2 and DGGE for all other coding exons (3, 4). Because we were witnessing an increasing number of patients and aiming at a reduction of costs and workload, we evaluated 2 high-resolution melting instruments for mutation screening of both BRCA1 and BRCA2.

High-resolution melting analysis of nucleic acids depends on the ability to record and evaluate fluorescence intensities as a function of the melting temperature of PCR products. The melting behavior of the PCR products is monitored by plotting the changes in fluorescence that occur when double-strand DNA (dsDNA) denatures upon heating. Heterozygous DNA samples form heteroduplexes, resulting in a different shape of the melt curve compared with a homozygous reference sample. The altered melting curve shape is a result of the presence of both heteroduplex and homoduplex amplicons in the PCR product, vs only homoduplexes from normal samples. Mutant homozygous samples, in contrast, are detected by a melting temperature (Tm) shift rather than an altered curve shape.

Since the introduction of saturating dsDNA binding dyes like LCGreen Plus, the sensitivity and specificity of DNA melting analysis have increased substantially. Nonsaturating dyes like SYBR Green possibly allow redistribution of dye molecules from melted regions back into the dsDNA amplicon, resulting in no change in fluorescent signal even in the presence of a heteroduplex (5, 6). When a saturating fluorescent dye is intercalated in dsDNA, dye jumping during amplicon melting is prevented, theoretically allowing detection of all sequence changes.

In the present study, we evaluated high-resolution melting analysis for BRCA1 and BRCA2 and compared the Lightscanner (Idaho Technology Inc.) and the LightCycler 480 (Roche) instruments. To determine sensitivity, we used 212 positive controls scattered over 100 different amplicons. In addition, we performed a blind screening of 22 patients (i.e., 2464 PCR reactions) to determine the specificity of high-resolution melting analysis on the 2 instruments.

Materials and Methods

DNA SAMPLES AND STUDY DESIGN

DNA samples from patients with previously characterized genetic variants were used as positive controls to determine the sensitivity of high-resolution melting.
These variants were previously detected with a mixture of other techniques such as DGGE, dHPLC, protein truncation test (PTT), and sequencing. For BRCA1, 97 positive control samples were available, scattered over 42 of 45 amplicons designed to cover the complete coding sequence of BRCA1. For amplicons 11.1, 11.17, and 15.2, we had no positive controls available. We analyzed 36 deletions (15 of 1 bp, 7 of 2 bp, 2 of 3 bp, 6 of 4 bp, 3 of 5 bp, 2 of 11 bp, and 1 of 62 bp), 12 1-bp insertions (of which 10 were duplications), a combined delTTinsG mutation and delAGinsT mutation, and an insertion of an Alu element. Furthermore, we analyzed 29 transitions (15 T/C and 14 A/G) and 17 transversions (3 A/C, 9 G/T, 3 C/G, and 2 T/A).

For BRCA2, we tested 115 positive control samples scattered over 58 of 67 amplicons, designed to cover the complete coding sequence of BRCA2. No positive controls were available for fragments 4, 11.10, 11.13, 12, 21, 26, 27.1, 27.2, or For BRCA2, we analyzed 38 deletions (13 of 1 bp, 15 of 2 bp, 9 of 4 bp, and 2 of 7 bp), 7 1-bp insertions (of which 5 were duplications), 46 transitions (19 T/C and 27 A/G), and 24 transversions (8 A/C, 6 G/T, 6 G/C, and 4 A/T). An overview of all positive controls is given in Supplementary Table 1 in the Data Supplement that accompanies the online version of this article at content/vol54/issue6.

As negative controls, we used DNA samples from several healthy individuals. Also, we blindly screened 22 patients for BRCA1 and BRCA2 to determine the specificity of the high-resolution melting technique. We had previously analyzed 11 of the patient samples with direct sequencing of exon 11 of BRCA1/2 and DGGE for all other exons. We performed a second blind screening of 11 patients in parallel with direct sequencing of all amplicons.
PRIMERS AND PCR OPTIMIZATION

We designed primers for BRCA1 and BRCA2 to cover the complete coding region and splice sites encompassing exons 2–24 of BRCA1 (45 amplicons, 21 of which encompass exon 11) and exons 2–27 of BRCA2 (67 amplicons, 27 of which encompass exon 11). Primer sequences are available in Supplementary Table 2 in the online Data Supplement. The absence of SNPs in the primers was verified with the help of the Ensembl genomic sequence database. We chose annealing temperatures all around 50 °C and evaluated the specificity of the primers using the University of California Santa Cruz in silico PCR program. The amplicon length ranged between 136 and 435 bp, median 238 bp. To simplify the sequencing process afterwards, all primers were fused with universal M13-tails (forward CACGACGTTGTAAAACGAC and reverse CAGGAAACAGCTATGACC). The dye-stained DNA template did not interfere with the sequencing reactions.

PCR was performed in 25 µL volumes. The amplification mixture included 1.5 mmol/L MgCl2 (Invitrogen), 1× PCR buffer (Invitrogen), 3% DMSO (VWR International), 0.2 µmol/L of both forward and reverse primer, 200 µmol/L of each dNTP, 0.5 U/µL Platinum Taq DNA polymerase (Invitrogen), 0.5× LCGreen Plus, and approximately 100 ng DNA. By adding 3% DMSO, we could use the same master mix for almost all fragments. For 3 amplicons (BRCA and BRCA and 11.6), we increased DMSO concentrations (to 10%, 5%, and 7%, respectively) to obtain specific PCR products. All 112 amplicons were amplified using a universal touchdown PCR program. The temperature cycling protocol consisted of an initial denaturation step at 94 °C for 2 min, followed by 12 cycles of denaturation at 94 °C for 20 s, annealing starting at 58 °C for 20 s (decreasing 1 °C per cycle), and extension at 72 °C for 1 min. This initial PCR reaction was followed by 25 additional cycles of denaturation at 94 °C for 40 s, annealing at 46 °C for 40 s, and extension at 72 °C for 30 s. Final extension was accomplished at 72 °C over 10 min. After amplification, PCR products were denatured at 95 °C for 5 min and cooled (1.7 °C/s) in a thermocycler block to 25 °C to form heteroduplexes.

Clinical Chemistry 54:6 (2008) 983

MELTING ACQUISITION AND MELTING ANALYSIS

Melting acquisition was performed on the 96-well Lightscanner (Idaho Technology Inc.) and the LightCycler 480 (Roche). According to the manufacturer's instructions, we transferred 10 µL PCR product to 96-well plates suitable for high-resolution melting analysis [Lightscanner, 4Titude plates (BioKé); LightCycler 480, Roche]. To prevent evaporation during heating on the Lightscanner, PCR products were covered with a mineral oil overlay. We used a 10-min centrifugation step instead of the 1-min centrifugation specified in the manual. The longer centrifugation helped eliminate air bubbles, which otherwise rise to the surface during the melting process and disturb the fluorescent curves. The plates on the Lightscanner were heated from 70 °C to 98 °C at 0.1 °C/s. For the LightCycler 480, the appropriate 96-well plates (Roche) were covered with the accompanying sealing foils. The applied template for high-resolution melting included first-step heating to 95 °C and a melting program that went from 55 °C to 95 °C. Melting curve analysis was performed on the Lightscanner with Lightscanner software (version 1.5) and on the LightCycler 480 with the gene-scanning module (version 1.3).
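The universal touchdown program described under PCR optimization can be written out cycle by cycle, which makes the annealing schedule easy to check. The following is a minimal sketch, not the instrument's own program format: the function name `touchdown_program` and the tuple layout are illustrative choices, and ramp times (including the 1.7 °C/s cooling to 25 °C) are left out.

```python
def touchdown_program():
    """Return the cycling steps as (label, temperature_C, seconds) tuples.

    Illustrative only: hold times come from the text, ramp times are ignored.
    """
    steps = [("initial denaturation", 94, 120)]
    # Touchdown phase: 12 cycles, annealing starts at 58 C and drops 1 C per cycle.
    for cycle in range(12):
        steps.append(("denaturation", 94, 20))
        steps.append(("annealing", 58 - cycle, 20))
        steps.append(("extension", 72, 60))
    # Fixed phase: 25 cycles with annealing held at 46 C.
    for _ in range(25):
        steps.append(("denaturation", 94, 40))
        steps.append(("annealing", 46, 40))
        steps.append(("extension", 72, 30))
    steps.append(("final extension", 72, 600))
    # Denaturation before cooling, used to form heteroduplexes.
    steps.append(("heteroduplex denaturation", 95, 300))
    return steps


program = touchdown_program()
annealing = [temp for label, temp, _ in program if label == "annealing"]
hold_seconds = sum(seconds for _, _, seconds in program)

print("touchdown annealing temperatures:", annealing[:12])
print("total cycles:", len(annealing))
print("total hold time (min):", round(hold_seconds / 60, 1))
```

Written this way, the last touchdown cycle anneals at 47 °C, one degree above the 46 °C used in the 25 fixed cycles, which is consistent with the protocol as stated.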
Both the Lightscanner software and the gene-scanning module employ a 3-step analysis: 1) normalization by selecting linear regions before (100% fluorescence) and after (0% fluorescence) the melting transition, 2) temperature shifting by moving the curves along the x-axis, facilitating grouping, and 3) use of the Autogroup function. Shape differences were further analyzed by subtracting the curves from a reference curve, generating a difference plot in which the fluorescence difference of each curve relative to the reference is plotted against temperature.

Results

SENSITIVITY

For BRCA1, we used high-resolution melting to analyze 97 known heterozygous sequence variants (78 pathogenic mutations, 15 unclassified variants, and 4 polymorphisms) spread over 42 of 45 amplicons on the Lightscanner and the LightCycler 480 instruments. To validate the technique for BRCA2, we analyzed 115 known sequence variants (58 pathogenic mutations, 33 unclassified variants, and 24 polymorphisms) spread over 58 of 67 amplicons. The melting curves of the positive controls were compared with those of control individuals. We did not have sequence variants available for 12 amplicons and verified their quality by analyzing 8 wild-type samples. For this part of our study, a total of 1057 fragments were scanned on both instruments.

With the Lightscanner instrument, all 97 known heterozygous BRCA1 sequence variants were detected by analysis on the normal sensitivity setting. In addition, 15 single nucleotide polymorphisms (SNPs) were detected in the healthy control individuals and confirmed by sequencing. On the LightCycler 480, only 92 of 97 (94.8%) of the BRCA1-positive controls were detected with the default sensitivity setting of 0.3. Increasing the sensitivity setting to 0.5 led to 100% detection (97 of 97). The variants undetectable at 0.3 shared no specific features: they had no common positions in the fragments, and deletions as well as insertions, transitions, and transversions were missed (examples are shown in Supplementary Fig. 1 in the online Data Supplement).
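The normalization and difference-plot steps of the 3-step analysis described in the melting analysis section can be illustrated on synthetic data. The sketch below is an assumption-laden simplification and not the algorithm of either software package: melt curves are modeled as sigmoids (`sigmoid_melt` is a hypothetical helper), the pre- and postmelt regions are averaged rather than fitted with lines, the window positions are arbitrary, and the temperature-shifting and Autogroup steps are omitted.

```python
import math


def sigmoid_melt(temps, tm, width=0.8, high=1000.0, low=50.0):
    """Synthetic melt curve: fluorescence falls from `high` to `low` around `tm`."""
    return [low + (high - low) / (1.0 + math.exp((t - tm) / width)) for t in temps]


def window_mean(temps, fluor, window):
    """Mean fluorescence inside a (start, end) temperature window."""
    start, end = window
    values = [f for t, f in zip(temps, fluor) if t >= start and end >= t]
    return sum(values) / len(values)


def normalize(temps, fluor, pre=(76.0, 78.0), post=(92.0, 94.0)):
    """Scale a curve so the premelt window is 100% and the postmelt window is 0%."""
    top = window_mean(temps, fluor, pre)
    bottom = window_mean(temps, fluor, post)
    return [100.0 * (f - bottom) / (top - bottom) for f in fluor]


def difference(curve, reference):
    """Difference plot values: curve minus reference at each temperature."""
    return [c - r for c, r in zip(curve, reference)]


# Temperature axis from 70 to 98 C in 0.1 C steps (hypothetical sampling).
temps = [70.0 + 0.1 * i for i in range(281)]
wild_type = normalize(temps, sigmoid_melt(temps, tm=85.0))
# A variant with a 0.4 C lower Tm and a different raw signal level.
variant = normalize(temps, sigmoid_melt(temps, tm=84.6, high=900.0))

diff = difference(variant, wild_type)
max_dev = max(abs(d) for d in diff)
print("largest deviation from reference (% fluorescence):", round(max_dev, 1))
```

Normalization removes the difference in raw signal level between the two samples, so the deviation that remains in the difference plot reflects only the shift in the melting transition.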
On the Lightscanner, all 115 known heterozygous BRCA2 sequence variants were easily detected, except c insT. This polymorphism is an insertion of a thymine in a row of 12 thymines in intron 10, and the pre- and postmelt values had to be carefully adjusted to make this variant detectable (panels are shown in Supplementary Fig. 1 in the online Data Supplement). On the LightCycler 480, 9 of 115 positive controls were not detected with the default 0.3 sensitivity setting (sensitivity 92%). Analysis of the melting curves on sensitivity setting 0.5 led to 100% detection. Detection of the c insT variant was again possible only after carefully adjusting the pre- and postmelt parameters.

Hence, we detected all possible heterozygous combinations of bases (A/T, A/C, A/G, C/G, C/T, and G/T) with high-resolution melting on both instruments. Also, deletions and insertions of different sizes (1–63 bp) were easily distinguished from wild-type samples (examples shown in Supplementary Fig. 1 in the online Data Supplement). Frameshift mutations are the most frequently detected mutations in BRCA1/2. Because BRCA1 and BRCA2 mutations are inherited in an autosomal dominant manner and carriers are therefore typically heterozygous, we tested only a limited number of homozygous (polymorphic) variants on both instruments. When only the standard temperature shifting was used, both instruments could distinguish wild-type sequences from sequences with heterozygous and homozygous SNPs. An example is shown in Fig. 1.

Fig. 1. Detection of heterozygous and homozygous SNPs on both instruments. Difference plots and melting curves for BRCA1 fragment 11-8 on the Lightscanner and LightCycler 480. All samples were amplified and analyzed in duplicate. The similarity of the curves for the duplicate samples illustrates the robustness of the conditions applied. SNP BRCA1 c.3667A>G is shown. Both instruments could easily distinguish the wild-type homozygous (grey baselines) from the heterozygous (black) and homozygous mutant (dotted) melting curves. The melting curves illustrate that mutant homozygous variants are detected by a Tm shift rather than by an altered curve shape. By the use of the default Tm shift setting (that accounts for small temperature variations across the block) on both instruments, most of the homozygous SNPs were detected. There is a 2 °C difference in Tm for these amplicons between the 2 instruments, probably due to internal calibration. Temp, temperature.

SPECIFICITY

To determine the specificity of high-resolution melting, we performed a blind screening of 22 patients for BRCA1 and BRCA2. We had previously screened 11 patients with sequencing for exon 11 and DGGE for all other coding exons and splice sites of BRCA1 and BRCA2. The other 11 patients were screened with high-resolution melting, and simultaneously all amplicons were sequenced. The analysis was performed on normal sensitivity setting on the Lightscanner and sensitivity settings 0.3, 0.5, and 0.7 on the LightCycler 480.

All sequence variants were detected on the Lightscanner, confirming the 100% sensitivity. However, we observed 32 false positives (confirmed by direct sequencing) in the 2464 amplicons analyzed (22 patients screened for 112 amplicons), i.e., a specificity of 98.7%. On the LightCycler 480, 87 sequence variations were missed with the standard sensitivity setting of 0.3. A specificity of 98.6% (35 false positives in 2464 fragments) was calculated. As we did not detect all genetic variants on sensitivity setting 0.5 without extensively adjusting the pre- and postmelt parameters, we reanalyzed our data with sensitivity setting 0.7. This led to detection of all known genetic variants; however, the specificity dropped slightly to 95.9% (102 false positives of 2464 fragments). An overview of the high-resolution melting specificity per exon is given in Fig. 2.

Discussion

The aim of this study was to evaluate and validate high-resolution melting curve analysis for mutation detection on 2 distinct instruments. The Lightscanner instrument (Idaho Technology) is specially designed for high-resolution melting analysis, and the LightCycler 480 (Roche) was originally launched as a real-time PCR instrument. We used the BRCA1 and BRCA2 genes as a model to evaluate the high-throughput capacity of the high-resolution melting technique because these large breast cancer susceptibility genes are being analyzed worldwide in an increasing number of patients. We designed 112 PCR amplicons, all amplifiable with 1 PCR program, covering the complete coding region of both BRCA1 and BRCA2 for high-resolution melting analysis using LCGreen Plus as saturating dye. To accomplish our high-throughput setup, PCR reactions were not performed on the thermocycler present in the LightCycler 480. On both instruments, minimal post-PCR handling is required and analyses can be done very fast.
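The specificity values reported in the Results can be reproduced from the stated counts. The short sketch below assumes that specificity was calculated as the share of all scanned amplicons that were not falsely flagged (i.e., with the full 2464 amplicons as denominator), since that reading reproduces the published percentages; the helper name `percent` is an illustrative choice.

```python
def percent(part, whole):
    """Percentage rounded to one decimal place."""
    return round(100.0 * part / whole, 1)


# 22 patients screened for 112 amplicons each.
amplicons_screened = 22 * 112

# False positives reported per instrument and sensitivity setting.
false_positives = {
    "Lightscanner (normal)": 32,
    "LightCycler 480 (0.3)": 35,
    "LightCycler 480 (0.7)": 102,
}

# Assumed definition: amplicons not falsely flagged / all amplicons scanned.
specificity = {
    name: percent(amplicons_screened - fp, amplicons_screened)
    for name, fp in false_positives.items()
}

# Detection rate of the BRCA1 positive controls on the LightCycler 480 at 0.3.
brca1_detection_default = percent(92, 97)

for name, value in specificity.items():
    print(f"{name}: {value}% specificity")
print(f"BRCA1 controls detected at setting 0.3: {brca1_detection_default}%")
```

The same counts also explain the extra sequencing load at setting 0.7: 102 false positives on the LightCycler 480 against 32 on the Lightscanner, a difference of 70 amplicons.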
We evaluated the sensitivity and specificity of high-resolution melting on the 96-well Lightscanner and LightCycler 480 instruments by analysis of 3521 PCR amplicons in total, the largest and most thorough study on both instruments so far. We detected all known sequence variants on both the Lightscanner (normal sensitivity setting) and the LightCycler 480 (sensitivity setting 0.7). For sensitivity setting 0.5 on the LightCycler 480, it was possible to detect all sequence variants only by carefully adjusting the pre- and postmelting parameters. In a diagnostic setting, we recommend analyzing samples on sensitivity setting 0.7 to avoid false negatives. Previously, 100% sensitivity for high-resolution melting analysis was reported for PCR products up to 300 bp (7) or 400 bp (6). This was confirmed by Kennerson et al. (8) in a small study on mutation scanning of 4 amplicons of the GJB1 gene in the 96-well Lightscanner. These investigators also found 100% sensitivity from a validation of high-resolution melting analysis with 18 positive control samples, followed by a blind study of 10 patients. As the new gene scanning software module was only recently made available on the LightCycler 480, we were able to find data from only one other study (9) comparing the results of both instruments for factor VIII mutations. The investigators missed 2 of 20 mutations on both instruments using their settings. Because we obtained 100% sensitivity (with the 0.7 sensitivity setting), gene scanning on the LightCycler 480 also seems to be a valuable method. For specificity, however, the Lightscanner scored slightly better than the LightCycler 480 on a sensitivity setting of 0.7. The decrease in specificity on the LightCycler 480 was caused by increasing the sensitivity setting from the default value of 0.3 and led to sequencing of about 70 more amplicons than on the Lightscanner. In a recent study, Herrmann et al.
(10) compared the melting profile of a 110-bp fragment on different instruments. They concluded that the ability to accurately genotype single-base changes by amplicon melting is limited by the spatial temperature variation across the plate, which is lower on the Lightscanner than on the LightCycler 480. This could explain the somewhat lower specificity found on the LightCycler 480. The specificity of high-resolution melting analysis was also studied by Reed and Wittwer (7) with engineered plasmids. They reported 100% specificity for PCR products up to 300 bp, and the specificity was only slightly lowered (99.4%) for larger fragments. Our somewhat lower specificity rate may be due to the use of complex genomic DNA (instead of plasmids) or to the high number of amplicons that we analyzed to screen the complete coding regions of 2 large genes. We also found that PCR conditions need to be well optimized to obtain high values for specificity and sensitivity. After optimization, PCR fragments need to be verified by high-resolution melting curve analysis for several individuals to find out if the curves are reproducible.

We conclude from our data that high-resolution melting analysis is at least as sensitive as other commonly used prescreening methods such as DGGE, dHPLC, or fluorescent conformation-sensitive gel electrophoresis (F-CSGE). Sensitivities and specificities of 100% are reported for these techniques (4, 11–13). However, a recent comparison (14) of dHPLC and high-resolution melting found better sensitivity and specificity for the latter. The major advantage of high-resolution melting is the minimal post-PCR requirement, making it a less labor-intensive method while improving its cost-effectiveness, ease of use, and throughput.

Fig. 2. Specificity per exon for BRCA1 and BRCA2 by the Lightscanner and the LightCycler 480 with 2 different sensitivity settings.

In the present study, we detected several homozygous SNPs by high-resolution melting. We used the standard Tm shift analysis mode (to facilitate grouping). A large study on the detection of homozygous SNPs was performed by Liew et al. (15), who concluded that approximately 4% (class 3 or 4) of homozygous human SNPs will remain undetectable by high-resolution melting due to the small Tm difference generated by homozygous C/G and A/T SNPs. The homozygous SNPs detected in the present study were indeed class 1 (C/T or G/A) and class 2 (C/A or G/T). The problem for the class 3 or 4 SNPs might be overcome by mixing samples with wild-type fragments. This could also be useful when high-resolution melting analysis is applied for mutational analysis of genes associated with recessive diseases or of males with X-linked diseases.

Fig. 3. Different sequence variants classified together by high-resolution melting. Melting curves and difference plots for fragment BRCA on the Lightscanner and the LightCycler 480. A common polymorphism (BRCA1 c.2934T>G) is grouped together with 2 distinct pathogenic mutations (BRCA1 c.3012delG and BRCA1 c.2989_2990dupA). These panels illustrate that every variation in the melting curves needs to be sequenced.

From our observations, it became clear that the software is not always able to discriminate between distinct variants within the same amplicon. As an example, we found that the 2 SNPs BRCA1 c.3113A>G and BRCA1 c.3119G>A were classified in the same group by the software (data not shown). This can be explained by the short distance and the small Tm variation between these 2 aberrations. In Fig. 3, however, we show a common polymorphism (BRCA1 c.2934T>G) grouped together with 2 distinct pathogenic mutations (BRCA1 c.3012delG and BRCA1 c.2989_2990dupA). This example illustrates that although high-resolution melting analysis is a very useful prescreening technique, all detected aberrations still need to be sequenced in a diagnostic setting. Recent advances in the use of unlabeled oligonucleotides will substantially reduce the sequencing work, as these approaches allow discrimination between different genetic variants within the same amplicon (16). An alternative way of genotyping specific polymorphisms or distinct sequence aberrations was recently proposed by Dobrowolski et al. (17).
They designed a multiplex, short amplicon ( bp) assay with primers flanking the aberrations and described the use of melt controls.

High-resolution melting is a mutation scanning technique suitable for the detection of point mutations. However, like other PCR-based techniques, it will not detect large exon (or multi-exon) deletions. In some populations, large intragenic BRCA1/2 deletions represent an important fraction of the mutation spectrum due to founder effects. For a complete mutation detection strategy, additional techniques such as multiplex ligation-dependent probe amplification (MLPA) are required.

Our experiences with high-resolution melting for the BRCA1/2 genes allowed us to readily implement the technology for screening other large genes. However, preliminary data showed that pooling samples extracted in different laboratories produced variations in melting curves. We hypothesize that the differences in DNA-extraction methods may have influenced the results. The composition of the DNA solution buffer

might play an important role. The same findings were reported in a recent study by Seipp et al. (18): amplicon Tm differences up to 0.39 °C were found when different DNA extraction methods were used.

In summary, we present a fast and reliable mutation detection strategy by high-resolution melting analysis on 2 different instruments. By introducing this method, our reporting time for the BRCA genes can be reduced considerably (one third compared to direct sequencing and DGGE). In our setup, the hands-on postmelt analysis of 11 patients requires only 3 h (approximately 13 min per 96-well plate) followed by sequencing of the detected aberrations. Owing to the relatively low cost of the consumables (LCGreen Plus; no need of fluorescence-labeled primers or special polymers) and the lower workload compared with other mutation scanning techniques, this is a very cost-efficient technology. The 2 high-resolution melting instruments evaluated were able to detect all known sequence variants. As suggested by Herrmann et al. (10), we also found that the Lightscanner, specifically designed for high-resolution melting, displayed slightly better scanning specificity than the LightCycler 480, an instrument that can also be used for real-time Q-PCR. Further reduction of the sequencing burden can be obtained using unlabeled probes for the detection of frequent SNPs in both genes. We conclude that high-resolution melting is a rapid, cost-efficient, sensitive methodology simple enough to be readily implemented in a diagnostic laboratory.

Grant/Funding Support: This research was supported by grant from the Fund for Scientific Research Flanders (FWO) to K.C. and by grant from the Ghent University to A.D.P.

Financial Disclosures: None declared.

Acknowledgments: We thank Frans Hogervorst, Kees van Roozendaal, Marjolein Ligtenberg, Katrien Storm, Erik Teugels, and Eva Machackova for kindly providing positive control samples. We thank Roche for the opportunity to evaluate the gene-scanning module version 1.3 for the LightCycler 480. We thank Jo Vandesompele for critically reading our manuscript.

References

1. Miki Y, Swensen J, Shattuck-Eidens D, Futreal PA, Harshman K, Tavtigian S, et al. A strong candidate for the breast and ovarian cancer susceptibility gene BRCA1. Science (Wash DC) 1994;266:
2. Tavtigian SV, Simard J, Rommens J, Couch F, Shattuck-Eidens D, Neuhausen S, et al. The complete BRCA2 gene and mutations in chromosome 13q-linked kindreds. Nat Genet 1996;12:
3. Claes K, Poppe B, Coene I, Paepe AD, Messiaen L. BRCA1 and BRCA2 germline mutation spectrum and frequencies in Belgian breast/ovarian cancer families. Br J Cancer 2004;90:
4. Van der Hout AH, van den Ouweland AM, van der Luijt RB, Gille HJ, Bodmer D, Bruggenwirth H, et al. A DGGE system for comprehensive mutation screening of BRCA1 and BRCA2: application in a Dutch cancer clinic setting. Hum Mutat 2006;27:
5. Wittwer CT, Reed GH, Gundry CN, Vandersteen JG, Pryor RJ. High-resolution genotyping by amplicon melting analysis using LCGreen. Clin Chem 2003;49:
6. Herrmann MG, Durtschi JD, Bromley LK, Wittwer CT, Voelkerding KV. Amplicon DNA melting analysis for mutation scanning and genotyping: cross-platform comparison of instruments and dyes. Clin Chem 2006;52:
7. Reed GH, Wittwer CT. Sensitivity and specificity of single-nucleotide polymorphism scanning by high-resolution melting analysis. Clin Chem 2004;50:
8. Kennerson ML, Warburton T, Nelis E, Brewer M, Polly P, De Jonghe P, et al. Mutation scanning the GJB1 gene with high-resolution melting analysis: implications for mutation scanning of genes for Charcot-Marie-Tooth disease. Clin Chem 2007;53:
9. Laurie AD, Smith MP, George PM. Detection of factor VIII gene mutations by high-resolution melting analysis. Clin Chem 2007;53:
10. Herrmann MG, Durtschi JD, Wittwer CT, Voelkerding KV. Expanded instrument comparison of amplicon DNA melting analysis for mutation scanning and genotyping. Clin Chem 2007;53:
11. Gerhardus A, Schleberger H, Schlegelberger B, Gadzicki D. Diagnostic accuracy of methods for the detection of BRCA1 and BRCA2 mutations: a systematic review. Eur J Hum Genet 2007;15:
12. Ganguly T, Dhulipala R, Godmilow L, Ganguly A. High throughput fluorescence-based conformation-sensitive gel electrophoresis (F-CSGE) identifies six unique BRCA2 mutations and an overall low incidence of BRCA2 mutations in high-risk BRCA1-negative breast cancer families. Hum Genet 1998;102:
13. Arnold N, Gross E, Schwarz-Boeger U, Pfisterer J, Jonat W, Kiechle M. A highly sensitive, fast, and economical technique for mutation analysis in hereditary breast and ovarian cancers. Hum Mutat 1999;14:
14. Chou LS, Lyon E, Wittwer CT. A comparison of high-resolution melting analysis with denaturing high-performance liquid chromatography for mutation scanning: cystic fibrosis transmembrane conductance regulator gene as a model. Am J Clin Pathol 2005;124:
15. Liew M, Pryor R, Palais R, Meadows C, Erali M, Lyon E, Wittwer C. Genotyping of single-nucleotide polymorphisms by high-resolution melting of small amplicons. Clin Chem 2004;50:
16. Liew M, Seipp M, Durtschi J, Margraf RL, Dames S, Erali M, et al. Closed-tube SNP genotyping without labeled probes: a comparison between unlabeled probe and amplicon melting. Am J Clin Pathol 2007;127:
17. Dobrowolski SF, Ellingson CE, Caldovic L, Tuchman M. Streamlined assessment of gene variants by high resolution melt profiling utilizing the ornithine transcarbamylase gene as a model system. Hum Mutat 2007;28:
18. Seipp MT, Durtschi JD, Liew MA, Williams J, Damjanovich K, Pont-Kingdon G, et al. Unlabeled oligonucleotides as internal temperature controls for genotyping by amplicon melting. J Mol Diagn 2007;9:

Technical Advance
Journal of Molecular Diagnostics, Vol. 11, No. 5, September 2009
Copyright American Society for Investigative Pathology and the Association for Molecular Pathology
DOI: /jmoldx

Genotyping of Frequent BRCA1/2 SNPs with Unlabeled Probes
A Supplement to HRMCA Mutation Scanning, Allowing the Strong Reduction of Sequencing Burden

Kim De Leeneer, Ilse Coene, Bruce Poppe, Anne De Paepe, and Kathleen Claes
From the Centre for Medical Genetics, Ghent University Hospital, Ghent, Belgium

We previously validated mutation scanning for BRCA1 and 2 using high-resolution melting curve analysis (HRMCA). Due to recurrent single nucleotide polymorphisms (SNPs), a considerable amount of sequencing work remains after HRMCA, as melting curves for SNPs and deleterious mutations may be similar. Here, we present a simple approach for the optimization of SNP genotyping with HRMCA using unlabeled probes. Protocols were optimized for 14 frequent SNPs in BRCA1 and 2. Two probes contained an additional mismatch to detect a rare polymorphism a few nucleotides upstream. PCR was performed in the presence of LCGreen Plus and analyzed on a Lightscanner. Genotyping assays were optimized with five wild-type, heterozygous, and homozygous mutant samples. Sensitivity and specificity of the assays were evaluated with a blind screening of 95 samples. All unlabeled probes correctly genotyped the SNPs. A 1:5 asymmetric primer ratio produced sufficient probe-strand duplexes to accurately genotype the SNP of interest. The most important parameter to optimize was the number of PCR cycles. By complementing our BRCA1/2 HRMCA with 14 unlabeled probe assays, we reduced the sequencing burden threefold. Our simple approach for optimization can be used as a blueprint to design genotyping assays for other genes. This is one of the largest studies reported to date and the first that presents an approach combining genotyping and mutation scanning of two large polymorphic genes.
(J Mol Diagn 2009, 11: ; DOI: /jmoldx )

Screening of the complete coding region of BRCA1 (MIM ) and BRCA2 (MIM ), the two major breast cancer susceptibility genes, is arduous because of the complex mutational spectrum and the large size of these genes. We previously reported¹ the optimization and validation of High Resolution Melting Curve Analysis (HRMCA) as mutation detection strategy for BRCA1/2 in a diagnostic setting. Due to the presence of recurrent single nucleotide polymorphisms (SNPs) in BRCA1 and BRCA2, the sequencing work after a pre-screening method still remains labor-intensive. To further reduce the sequencing burden, we present an approach to perform HRMCA mutation scanning simultaneously with genotyping of the most frequent SNPs with unlabeled probes.

Genotyping with unlabeled probes is a homogeneous, end-point assay. The probe is included in the PCR mix, but is not consumed during amplification, since it is blocked at the 3′ end. PCR is performed in the presence of a saturating fluorescent dye. Genotyping is accomplished by monitoring the melting of probe-target duplexes post-PCR. Since the double stranded region can only be as long as the probe, a high sensitivity rate can be obtained. Key to the method is the use of one primer in excess, which leads to the overproduction of the target strand and a reduction of the amount of double stranded amplicon generated.²

Supported by grant from the Fund for Scientific Research Flanders (FWO) to K.C. and by grant from the Ghent university to A.D.P. B.P. is Senior Clinical Investigator of the Fund for Scientific Research of Flanders (FWO Vlaanderen).
Accepted for publication March 27,
Address reprint requests to Kathleen Claes, PhD, Ghent University Hospital, De Pintelaan 185, B-9000 Gent, Belgium. Kathleen.Claes@Ugent.be

In this study, we present the optimization of 14 unlabeled probe assays to genotype 16 BRCA1 and BRCA2 SNPs. A simple approach for optimizing genotyping assays was developed, which in our opinion can easily be extended to design genotyping assays for different sequence variants in BRCA1/2 or other genes. This is one of the largest studies with unlabeled probe assays reported so far and we are the first group to present an approach combining genotyping and mutation scanning for mutation detection in two large polymorphic genes by covering the whole coding sequence.

Table 1. An Overview of Probe Sequences and Reaction Conditions for Each Optimized Genotyping Assay

SNP | Probe sequence | Strand | MgCl2 (mM) | Ta (°C) | # Cycles | DMSO
BRCA1
rs (c c T) | 5′-CTGGCCAAtAATTGCTTGACTG-3′ | Sense | 2, | | | %
rs and rs (c.2077G>A and/or c.2082C>T) | 5′-CTGGGAAAGTATCaCTGTtATGTC-3′ | Anti-sense | 2, | | |
rs16940 (c.2311T>C) | 5′-AGCAGTATTTCAcTGGTACCTGGT-3′ | Sense | 2, | | |
rs (c.2612C>T) | 5′-TGGATTTGAAAACaGAGCAAATGAC-3′ | Anti-sense | 2, | | |
rs16941 and/or rs (c.3113A>G and/or c.3119G>A) | 5′-GCTTGAGtTGGCTcCTTTAAAAACA-3′ | Anti-sense | 2, | | |
rs16942 (c.3548A>G) | 5′-CTGCTAAGCTCTCCTcTCTGGACG-3′ | Anti-sense | 3, | | |
rs (c.4308T>C) | 5′-ATAAGTGACTCcTCTGCCCTTGAG-3′ | Sense | 2, | | |
BRCA2
rs (c.1-26G>A) | 5′-CAGACTTATTTACCAAaCATTGGAGGA-3′ | Sense | 3, | | |
rs (c C T) | 5′-TTAACAAaGCATTCCAAAATTGTTAGC-3′ | Anti-sense | 2, | | |
rs (c.1114A>C) | 5′-TCAAATGTAGCAcATCAGAAGCC-3′ | Sense | 2, | | |
rs (c.3396A>G) | 5′-TGCAATATGTAGCTTGGcTTTCTAAACT-3′ | Anti-sense | 2, | | |
rs (c.3807T>C) | 5′-GTCATGATTCTGTcGTTTCAATGTTTAAGA-3′ | Sense | 2, | | | %
rs (c.7242A>G) | 5′-CTGTTCAACTCTGTGAAAATGcGATTTAGTT-3′ | Anti-sense | 2, | | |
rs c t C | 5′-ATGATAATATTCTACcTTTATTTGTTCAGGGC-3′ | Sense | 2, | | | %

In the probe sequences, nucleotides equal to the WT reference are noted in uppercase letters. Bold, underlined lowercase nucleotides denote the locations of the SNP(s) that is (are) genotyped in the assay.

Materials and Methods
We retrospectively selected DNA samples submitted to our center for standard diagnostic screening of BRCA1 and BRCA2. DNA was extracted from blood samples with the Autopure LS instrument (Qiagen, Venlo, the Netherlands). The samples used to optimize and validate the assays were previously genotyped by sequencing, using the BigDye v3.1 ET terminator cycle sequencing kit from Applied Biosystems. Sequencing reactions were loaded on an Applied Biosystems Prism 3730 Genetic Analyzer and analyzed with SeqScape v2.5 (Applied Biosystems, Foster City, CA). Genotyping assays for each SNP were optimized with 15 control samples (five wild-type samples, five heterozygous, and five homozygous mutants). Subsequently, the sensitivity and specificity of the optimized assay were evaluated with a blind screening of 95 samples.

Probe design was performed with Lightscanner primer design software v1.0 (Idaho Technology, Salt Lake City, Utah). Melting temperatures of the probes were chosen between 60 °C and 65 °C to avoid interference with the extension, and the maximum probe length was set at 35 bp. Probes were redesigned if cross-complementarity with one of the primers was found.

Table 2. Allele Frequencies of the SNPs for Developed Assays: Comparison of Frequencies in the European Population as Published in HAPMAP and the Frequencies in our Belgian Breast Cancer Patient Population Referred for Genetic Testing
Columns: SNP; SNP frequencies according to CSHL-HAPMAP:HapMap-CEU (wild-type, heterozygous, homozygous); SNP frequencies in Belgian breast cancer patients referred for genetic testing to our centre (wild-type, heterozygous, homozygous).
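The probe notation of Table 1 (lowercase letters marking the genotyped positions) and the 35-bp length limit stated above lend themselves to a quick consistency check. The sketch below is illustrative and not part of the published workflow: the sequences are transcribed from Table 1, while `PROBES`, `MAX_PROBE_LENGTH`, and `snp_positions` are names introduced here for the example. Melting temperature and primer cross-complementarity are not evaluated.

```python
MAX_PROBE_LENGTH = 35  # bp, the design limit stated in the text

# (gene, probe sequence 5'->3') transcribed from Table 1.
PROBES = [
    ("BRCA1", "CTGGCCAAtAATTGCTTGACTG"),
    ("BRCA1", "CTGGGAAAGTATCaCTGTtATGTC"),
    ("BRCA1", "AGCAGTATTTCAcTGGTACCTGGT"),
    ("BRCA1", "TGGATTTGAAAACaGAGCAAATGAC"),
    ("BRCA1", "GCTTGAGtTGGCTcCTTTAAAAACA"),
    ("BRCA1", "CTGCTAAGCTCTCCTcTCTGGACG"),
    ("BRCA1", "ATAAGTGACTCcTCTGCCCTTGAG"),
    ("BRCA2", "CAGACTTATTTACCAAaCATTGGAGGA"),
    ("BRCA2", "TTAACAAaGCATTCCAAAATTGTTAGC"),
    ("BRCA2", "TCAAATGTAGCAcATCAGAAGCC"),
    ("BRCA2", "TGCAATATGTAGCTTGGcTTTCTAAACT"),
    ("BRCA2", "GTCATGATTCTGTcGTTTCAATGTTTAAGA"),
    ("BRCA2", "CTGTTCAACTCTGTGAAAATGcGATTTAGTT"),
    ("BRCA2", "ATGATAATATTCTACcTTTATTTGTTCAGGGC"),
]


def snp_positions(sequence):
    """Zero-based indices of the lowercase (genotyped) nucleotides in a probe."""
    return [i for i, base in enumerate(sequence) if base.islower()]


assay_count = len(PROBES)
snp_count = sum(len(snp_positions(seq)) for _, seq in PROBES)
too_long = [seq for _, seq in PROBES if len(seq) > MAX_PROBE_LENGTH]
two_snp_probes = [seq for _, seq in PROBES if len(snp_positions(seq)) == 2]

print("assays:", assay_count)
print("SNPs genotyped:", snp_count)
print("probes over the length limit:", too_long)
print("probes covering two SNPs:", two_snp_probes)
```

Counting the marked positions this way agrees with the 14 assays and 16 SNPs stated in the text, with two BRCA1 probes each covering a pair of neighboring SNPs.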

Unlabeled oligonucleotide probes were synthesized with a 3' C3 spacer (Eurofins MWG Operon, Ebersberg, Germany or Biolegio, Nijmegen, the Netherlands) as recommended by Dames et al [3], and amplicons were generated with previously published primers [1]. PCR was performed in 15-µl volumes in the presence of 1× LCGreen Plus (Idaho Technology Inc., Salt Lake City, Utah), 2.3 to 4 mmol/l magnesium chloride (MgCl2) (Invitrogen, Merelbeke, Belgium), 1× PCR buffer (Invitrogen, Merelbeke, Belgium), 200 µmol/l of each dNTP, and 0.5 U Platinum Taq polymerase (Invitrogen, Merelbeke, Belgium). PCR reactions were performed with 50 ng of genomic DNA. Primer ratios were asymmetric at 1:5 (0.3:1.5 µmol/l). Three percent dimethyl sulfoxide was used for several amplicons. An overview of probe sequences and reaction conditions is given in Table 1. PCR reactions were performed on 96-well Peltier thermal cyclers (BIO-RAD, Hercules, California). The temperature cycling protocol consisted of an initial denaturation step at 95°C for 30 seconds, followed by 50, 55, or 60 cycles of denaturation at 95°C for 30 seconds, annealing at 53°C, 55°C, or 58°C for 30 seconds, and extension at 72°C for 30 seconds. This PCR reaction was followed by another denaturation step at 98°C for 30 minutes and by cooling down to room temperature (28°C) to facilitate heteroduplex formation.

The unlabeled probe genotyping assays are performed separately from the regular amplicon melting PCR reactions. This allows a more efficient set-up: for the unlabeled probe assays we always include positive controls (homozygous wild-type, heterozygous, and homozygous variants), whereas we do not for the mutation scanning assays. Therefore, for a complete screening of the BRCA1/2 genes, 126 PCR reactions are performed. Melting acquisition was performed on the 96-well Lightscanner (Idaho Technology Inc., Salt Lake City, Utah), in 96-well plates suitable for HRM analysis (4titude plates; BIOKé, Leiden, the Netherlands) covered with a mineral oil overlay. Plates were centrifuged for 10 minutes and heated in the Lightscanner from 45°C up to 98°C at a heating rate of 0.1°C/s. Genotyping of the melting curves was performed with the standard Lightscanner software (version 2.0) in the unlabeled probe module, and melting curve analysis was performed in the Expert scanning module.

Figure 1. Melting profiles of an unlabeled probe assay allowing the simultaneous identification of five distinct alleles from two BRCA1 SNPs, separated by a few bp (BRCA1 c.3113A>G and c.3119G>A). Samples were analyzed with amplicon scanning and unlabeled probe genotyping. With the expert scanning module (left panel) only three distinct clusters can be detected. Samples heterozygous for only one of the two SNPs generate similar melting curves. Furthermore, samples homozygous for rs16941 (BRCA1 c.3113A>G) cannot be distinguished from wild-type samples. Genotyping of these samples with our unlabeled probe assay allowed simultaneous identification of all five distinct alleles by melting analysis (right panel).

Results and Discussion

High resolution melting curve analysis is a cost-effective method to perform mutation detection in large genes like BRCA1 and BRCA2. Due to the highly polymorphic character of these genes, the majority of aberrant melting curves detected turned out to be SNPs. Genotyping with unlabeled probes and PCR product melting has been described before for, among others, genotyping of disease-related SNPs like factor V Leiden [4] or hot-spot mutations in cystic fibrosis [2,4], and even detection and typing of viruses [5]. To our knowledge, combining HRMCA mutation scanning with SNP genotyping to distinguish pathogenic variants from neutral variants has not been reported before. We selected the 14 most frequent BRCA1 and BRCA2 SNPs in our Belgian patient population and designed genotyping assays with unlabeled probes to unequivocally distinguish these recurrent SNPs from rare variants, which require sequencing to determine the nucleotide change. An overview of the allele frequencies of these SNPs in the European (HapMap) and our Belgian patient population is presented in Table 2.
All of the designed probes have Tm values between 60°C and 65°C, allowing us to limit the number of different annealing temperatures in our PCR programs to three. These three Ta values were tested for all probes, and conditions were further optimized at the Ta that gave the clearest derivative plot. In two unlabeled probe assays (probes for BRCA1 c.3548A>G and BRCA2 c.1-26G>A), increasing the MgCl2 concentration facilitated the grouping in the difference plots, probably due to a stabilized interaction of the primers. In other reactions, accurate melting profiles were acquired by adding a small amount (3%) of dimethyl sulfoxide. In 64% (9/14) of the genotyping assays, increasing the number of PCR cycles had a beneficial effect on the difference plot generated. We hypothesize that such a high number of cycles generates a maximum amount of probe-amplicon duplex, facilitating data acquisition for grouping.

After optimization of an assay, a blind screening of 95 samples and one no-template control was used to validate the reaction conditions. All unlabeled probes correctly genotyped the SNPs, and neither false positives nor false negatives were found. However, since the key to the method is an asymmetric PCR reaction, purification of primers and probes and calibration of the PCR instruments seem to play an important role. The average Tm difference between wild-type samples and homozygous mutant samples in the genotyping assays was 5.4°C (range, 3°C to 7°C). This facilitates visual inspection of the derivative plots and leads to a very simple interpretation of the results. These Tm differences are consistent with other published data [6,7]. Two unlabeled probes each cover two SNPs of interest separated by only a few bp. We evaluated possible combinations of these two SNPs and found a minimal Tm difference of 3°C, indicating that accurate genotyping can be achieved.
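The interpretation of probe melting peaks described above can be illustrated with a minimal sketch. It assumes that the derivative plot has already been reduced to a list of peak temperatures; the function name, the tolerance and the example Tm values are hypothetical and are not taken from the Lightscanner software or from the assays in Table 1.

```python
def call_genotype(peak_tms, wt_tm, variant_tm, tolerance=1.0):
    """Assign a genotype from probe melting peak temperatures (in degrees C).

    wt_tm      -- expected Tm of the probe on the perfectly matched (wild-type) allele
    variant_tm -- expected Tm of the probe on the mismatched (variant) allele
    tolerance  -- hypothetical window (degrees C) within which a peak counts as a match
    """
    has_wt = any(abs(t - wt_tm) <= tolerance for t in peak_tms)
    has_variant = any(abs(t - variant_tm) <= tolerance for t in peak_tms)
    if has_wt and has_variant:
        return "heterozygous"          # two probe peaks, one per allele
    if has_wt:
        return "homozygous wild-type"  # single peak at the higher Tm
    if has_variant:
        return "homozygous variant"    # single peak at the lower Tm
    return "no call"                   # unexpected peak: candidate for sequencing


# Illustrative values only: two peaks 5.4 degrees C apart
print(call_genotype([66.2, 60.5], wt_tm=66.0, variant_tm=60.6))
```

A peak that matches neither expected Tm is deliberately left uncalled, mirroring the idea that variants other than the targeted SNP still require sequencing.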
In fragments with multiple sequence variations, different variations may be classified together by amplicon scanning. Furthermore, it is known that certain homozygous changes remain undetectable because of the small Tm difference generated compared with homozygous wild-type samples. An example is shown in Figure 1: samples heterozygous for one of the two SNPs (BRCA1 c.3113A>G and BRCA1 c.3119G>A) are grouped together. With amplicon scanning, only three distinct melting curves are generated (samples heterozygous for both SNPs, samples heterozygous for one of the two SNPs, and wild-type samples grouped together with the homozygous samples). By introducing an unlabeled probe covering the two SNPs, the five different genotypes can be clearly distinguished.

We deliberately chose not to shorten our previously published amplicons. The shorter length (<200 bp) recommended for accurate genotyping assays [8] was not needed to obtain high sensitivity and specificity. With this approach, the unlabeled probe assays were easily implemented in our diagnostic setting. An alternative is presented by Liew et al [9], who designed small amplicons (<50 bp) to accurately genotype SNPs. Unlabeled probe genotyping was preferred over small amplicon genotyping because it combines mutation scanning and SNP genotyping.

By introducing HRMCA in our laboratory, we were able to reduce our turnaround time threefold. With the introduction of these 14 unlabeled probe assays for the detection of 16 SNPs, the remaining sequencing burden was reduced about threefold, thereby further decreasing the cost and labor of BRCA1/2 mutation screening. Given the simple approach used for optimizing these assays, we believe unlabeled probe assays can easily be implemented in HRMCA screenings of other large polymorphic genes.

Acknowledgments

We thank Justine Simkens and Eveline Debals for their technical support.

References
1. De Leeneer K, Coene I, Poppe B, De Paepe A, Claes K: Rapid and sensitive detection of BRCA1/2 mutations in a diagnostic setting: comparison of two high-resolution melting platforms. Clin Chem 2008, 54:
2. Zhou L, Myers AN, Vandersteen JG, Wang L, Wittwer CT: Closed-tube genotyping with unlabeled oligonucleotide probes and a saturating DNA dye. Clin Chem 2004, 50:
3. Dames S, Margraf RL, Pattison DC, Wittwer CT, Voelkerding KV: Characterization of aberrant melting peaks in unlabeled probe assays. J Mol Diagn 2007, 9:
4. Zhou L, Wang L, Palais R, Pryor R, Wittwer CT: High-resolution DNA melting analysis for simultaneous mutation scanning and genotyping in solution. Clin Chem 2005, 51:
5. Dames S, Pattison DC, Bromley LK, Wittwer CT, Voelkerding KV: Unlabeled probes for the detection and typing of herpes simplex virus. Clin Chem 2007, 53:
6. Liew M, Seipp M, Durtschi J, Margraf RL, Dames S, Erali M, Voelkerding K, Wittwer CT: Closed-tube SNP genotyping without labeled probes/A comparison between unlabeled probe and amplicon melting. Am J Clin Pathol 2007, 127:
7. Habalová V, Klimčáková L, Zidzik J, Tkáč I: Rapid and cost effective genotyping method for polymorphisms in PPARG, PPARGC1 and TCF7L2 genes. Mol Cell Probes 2009, 23:
8. Montgomery J, Wittwer CT, Palais R, Zhou L: Simultaneous mutation scanning and genotyping by high-resolution DNA melting analysis. Nat Protoc 2007, 2:
9. Liew M, Pryor R, Palais R, Meadows C, Erali M, Lyon E, Wittwer C: Genotyping of single-nucleotide polymorphisms by high-resolution melting of small amplicons. Clin Chem 2004, 50:


Practical tools to implement massively parallel pyrosequencing of PCR products in next generation molecular diagnostics

Running head: Road to next generation molecular diagnostics

Authors: Kim De Leeneer (1), Joachim De Schrijver (3), Lieven Clement (4), Machteld Baetens (1), Steve Lefever (1), Sarah De Keulenaer (2), Wim Van Criekinge (2,3), Dieter Deforce (2,5), Filip Van Nieuwerburgh (2,5), Sofie Bekaert (2), Filip Pattyn (1), Bram De Wilde (1), Paul Coucke (1,2), Jo Vandesompele (1,2), Kathleen Claes (1), Jan Hellemans (1,2,*)

1 Center for Medical Genetics, Ghent University Hospital, B-9000 Ghent, Belgium
2 NXTGNT, Ghent University, B-9000 Ghent, Belgium
3 Biobix, Laboratory for Bioinformatics and Computational Genomics, Ghent University, B-9000 Ghent, Belgium
4 Biostat, Department of Applied Mathematics, Biometrics and Process Control, Ghent University, B-9000 Ghent, Belgium
5 Laboratory for Pharmaceutical Biotechnology, Ghent University, Harelbekestraat 72, B-9000 Ghent, Belgium

Correspondence
Name: Kim De Leeneer
Address: Center for Medical Genetics, De Pintelaan 185, 9000 Ghent, Belgium
Phone:
Kim.deleeneer@UGent.be

Keywords:
Massively parallel pyrosequencing
Next generation sequencing
PCR
Coverage
Molecular diagnostics

Abbreviations:
MPS, massively parallel sequencing
SAC, sample amplicon combination
Q, quality
NGMD, next generation molecular diagnostics
MC, minimum coverage

Abstract

Despite improvements in sequence quality and price per base pair, Sanger sequencing remains restricted to the screening of individual disease genes. The development of massively parallel sequencing (MPS) technologies heralded an era in which molecular diagnostics for disorders becomes a reality. Here, we outline different PCR amplification based strategies for the screening of a multitude of genes in a patient cohort. We performed a thorough evaluation in terms of set-up, coverage and sequencing variants on the data of 10 GS-FLX experiments (over 200 patients). Crucially, we determined the actual coverage that is required for reliable diagnostic results using MPS, and provide a tool to calculate the number of patients that can be screened in a single run. Finally, we provide an overview of factors contributing to false negative or false positive mutation calls and suggest ways to maximize sensitivity and specificity, both important in a routine setting. By describing practical strategies for screening genetically heterogeneous disorders in a multitude of samples, and by answering questions about the minimum required coverage, the number of patients that can be screened in a single run and the factors that may affect sensitivity and specificity, we hope to facilitate the implementation of MPS technology in molecular diagnostics.

Introduction

A multitude of laboratory technologies for the detection of DNA mutations have been developed over the last decades. In current diagnostic settings, a mutation scanning technique followed by Sanger sequencing of the abnormal DNA fragments is most frequently used. Well known examples of widely used methods to identify the aberrant fragments are single strand conformation polymorphism (SSCP), conformation sensitive gel electrophoresis (CSGE), high performance liquid chromatography (HPLC) and, more recently, high resolution melting curve analysis (HRMCA) [1,2,3,4]. Despite its higher cost, Sanger sequencing [5] of DNA fragments remains the preferred method for mutation analysis because of its superior sensitivity and specificity and the detailed sequence information that can be obtained in a single step. Improvements in sequencing chemistries, instruments and data analysis software, as well as increases in throughput and reductions in cost, resulted in the adoption of this technology for routine mutation analysis of monogenic diseases. However, expansion of molecular diagnostics to the realm of genetically heterogeneous disorders requires new methods with increased mutation detection efficiency but without a decrease in cost efficiency. Massively parallel sequencing (MPS) technologies (see [6,7] for an overview) are an interesting alternative because of their higher throughput and lower cost per base as compared to Sanger sequencing. In addition, throughput and cost per base for MPS technologies are rapidly evolving (from 0.1 Gb per run for the Roche Genome Sequencer at the end of 2006 to Gb per run for Illumina's HiSeq2000 and ABI's 5500XL platform in 2011), at a speed vastly surpassing the evolution rate seen in the semiconductor industry (Moore's law).

In order for MPS to take over the role of Sanger sequencing and to evolve into the method of choice for next generation molecular diagnostics (NGMD), a number of hurdles need to be overcome and questions answered. The goal of this paper is to remove a number of these obstacles by describing strategies that enable mutation analysis through MPS, by presenting tools to determine the required coverage and the number of patients who can be screened in a single run, and by listing possible sources of false negative or false positive mutation calls along with possible solutions. The guidelines and tools provided in this study were formulated or calculated based on pyrosequencing data obtained on the GS-FLX instrument (454-Roche), but may provide better insights into applications with other MPS chemistries as well.

Material and methods

Generation of sequencing data

The data presented in this article are derived from 10 GS-FLX sequencing runs (using both Standard and Titanium chemistries) on samples prepared with different approaches. In total, over 200 patient samples were evaluated in these 10 experiments. To pool different patients in a single experiment, MID (multiplex identifier) tags were attached to all patient samples. Different approaches were evaluated to attach these tags:

Approach 1: the samples investigated for recessive congenital deafness (15 genes: GJB2, SLC26A4, MYO15A, OTOF, CDH23, TMC1, TMPRSS3, TECTA, TRIOBP, TMIE, PJVK, ESPN, PCDH15, ESRRB, MYO7A amplicons) were prepared with PCR (Kapa Taq kit (Sopachem)) followed by an adapter ligation approach. All PCR products for a given sample are pooled, thereby reducing the number of parallel reactions in the next step from the number of sample-amplicon combinations (SAC) to the number of samples. The next step involves ligation of adapters containing the sequencing recognition sites (A & B) followed by a sample specific barcode (ligation was performed according to the GS FLX Shotgun DNA library preparation quick guide). Once MID containing adapters are ligated, samples can be pooled into a single tube for MPS.

Approach 2: for hereditary breast cancer (2 genes: BRCA1, BRCA2; 111 amplicons) and familial aortic aneurysm (3 genes: FBN1, TGFBR1, TGFBR2; 110 amplicons) two consecutive rounds of PCR were applied. In this approach adapter ligation is replaced by a second PCR step. During the first PCR, gene-specific amplicons are generated using primers modified at their 5' end with a universal M13 linker sequence. In the first experiments (2/10), we equimolarly pooled singleplex reactions. In further experiments, the first amplification step was replaced by a multiplex PCR in which several amplicons of the same patient are combined (we typically aimed for 10-plex PCR reactions) to reduce the workload and consumable cost. After dilution of the PCR products, a second round of PCR is performed. In the second PCR, primers containing the common A or B sequence, a patient specific barcode sequence (MID) and a universal linker sequence (M13) were used to amplify the initial PCR products, thereby extending them with the sequences that are required to initiate sequencing and to distinguish reads from the different patients.

Sequencing reaction and data analysis

Emulsion PCR and sequencing reactions on the GS-FLX (454-Roche) were performed according to the manufacturer's instructions. The FASTA files were analyzed with the in-house developed variant interpretation pipeline (VIP) software (version 1.3) [8]. Distribution plots and log-normal curve fitting were performed using the GraphPad PRISM 5 software. Statistical analysis of the potential bias introduced during emulsion PCR and pyrosequencing was performed using the R package. To remove the effect of the different multiplex sizes, both the relative coverages and the relative fluorescent signals were centered on their mean for each multiplex prior to principal component analysis.

Results

Calculation of coverage depth as a function of sensitivity

With Sanger sequencing, a two-fold (forward and reverse) coverage is considered sufficient for molecular diagnostics, provided that sequences are of high quality. At this moment there is no clear consensus on the minimum coverage (MC) required to reliably detect heterozygous variations using MPS technologies. Current guidelines typically suggest a 20-fold coverage [9], with little justification of the proposed value or of how it should be adjusted depending on sequencing and analysis procedures or context. Because MPS is based on the sequencing of single, clonally amplified molecules, sampling effects need to be taken into account at low coverage. At one-fold coverage there is a 50% chance to detect a heterozygous variant and a 50% chance to miss it. Even at 10-fold coverage there is a chance of about 1/1000 to miss the variant allele completely. Since data analysis usually involves filtering out low frequency variants to reduce false positives resulting from sequencing errors (see below), the minimal number of reads for detection of heterozygous variants depends on the applied filter settings. Table 1 shows an overview of the theoretically required minimum coverage (MC) to reliably detect heterozygous variants at varying minimum allele frequencies with a given power.

Calculations were based on the following: the interpretation of a specific base has only two possible outcomes (equal to or different from the reference sequence). Theoretically, the probability to observe a variant in a specific number of reads (#Rv) out of all reads for a sample amplicon combination (SAC) (total coverage) can be derived from a binomial distribution with success probability equal to the expected mutant variant frequency in the total number of reads (50% for heterozygous variants without variant related alignment errors). The binomial distribution can also be used to tabulate the cumulative probabilities as a function of the total coverage and the relative variant frequency that is deemed sufficient to indicate a real variant, i.e. above the filter level below which variants are thought to be sequencing errors (#Rv/total coverage). Hence, one can simply look up the coverage that is required for detecting a heterozygous variant at a minimum defined variant frequency with a predefined power. This coverage is referred to as the minimum coverage (MC) for a given SAC. To facilitate interpretation, power values (P) were converted into scores (Q), similar to the calculation of PHRED scores [10]: Q = -10 * log10(1 - P).
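The binomial reasoning above can be made concrete with a minimal sketch. It assumes an ideal heterozygous variant (expected variant fraction 0.5) and counts a variant as detected when the observed variant frequency is at least the filter threshold; the function names and this "at least" convention are assumptions for illustration and do not reproduce the authors' spreadsheet.

```python
from math import ceil, comb, log10


def miss_probability(coverage, variant_fraction=0.5):
    """Probability that none of the reads carries the variant allele."""
    return (1.0 - variant_fraction) ** coverage


def detection_power(coverage, threshold, variant_fraction=0.5):
    """P(observed variant frequency >= threshold) under a binomial model."""
    min_variant_reads = ceil(threshold * coverage)
    # Cumulative probability of seeing fewer variant reads than required
    below = sum(
        comb(coverage, k)
        * variant_fraction ** k
        * (1.0 - variant_fraction) ** (coverage - k)
        for k in range(min_variant_reads)
    )
    return 1.0 - below


def power_to_q(power):
    """Phred-like score for a detection power: Q = -10 * log10(1 - P)."""
    return -10.0 * log10(1.0 - power)


print(miss_probability(10))        # 0.5 ** 10, i.e. 1/1024
print(detection_power(27, 0.20))   # power at 27 reads with a 20% filter
print(power_to_q(0.999))           # close to 30
```

Because the required number of variant reads is rounded up to a whole read, power does not rise strictly monotonically with coverage, so a tabulated MC is best checked against neighbouring coverage values.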

Not surprisingly, MC values increase as the required power to detect heterozygous variants increases. There is also a strong dependency on the sequencing error filter level: if only variants present in 30% of the reads are considered true variants, a 61-fold MC is required, while a coverage depth of only 27 is needed if the filter threshold is lowered to 20% (both for P=99.90%, corresponding to a Phred score of 30, as required for standard molecular diagnostics). When plotting obtained variant frequencies vs. coverage of unfiltered data, the largest deviations from the binomial distribution are observed at the lower allele frequencies. Because the majority of such data points are sequencing errors, especially related to homopolymers (see below), dispersion can best be evaluated at frequencies above 50%. Allele specific amplification biases during sample preparation or emulsion PCR are the most likely cause of any remaining dispersion. A stepwise analysis to determine the dispersion, starting from unfiltered variant data in one experiment (9721 variants), is shown in supplemental file 1. After correcting for sequencing errors being interpreted as heterozygous variants, we estimated the overall fraction of heterozygous variants with a frequency deviating from the expected 50% ratio at 10%.

Number of samples per run as a function of MC

Determination of the required minimum coverage is not sufficient to calculate the number of sample amplicon combinations (SAC) that can be analyzed with a given number of reads, because the coverage may differ between SAC. In an ideal experiment, all SAC have exactly the same coverage, matching the theoretically determined required MC. In practice, some SAC will display a lower coverage than others. Since these require at least the MC as well, other SAC will have a higher coverage than absolutely required, wasting sequencing capacity.
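The overhead caused by uneven coverage can be sketched numerically. The example below assumes a list of observed per-SAC coverages from a pilot run and derives the factor by which the average coverage must exceed the minimum coverage so that a chosen fraction of SAC still reach it, then estimates how many samples fit in one run. The function names and all input numbers are hypothetical; this is not the authors' Excel template.

```python
from math import ceil, floor


def spread_correction_factor(coverages, fraction=0.90):
    """Factor F such that `fraction` of SAC have coverage >= mean / F."""
    mean_cov = sum(coverages) / len(coverages)
    # Fold difference of the mean coverage to each SAC coverage;
    # an uncovered SAC counts as infinitely far below the mean.
    fold = sorted(
        mean_cov / c if c > 0 else float("inf") for c in coverages
    )
    index = min(len(fold) - 1, max(0, ceil(fraction * len(fold)) - 1))
    return fold[index]


def samples_per_run(reads_per_run, amplicons_per_sample,
                    min_coverage, correction_factor):
    """Number of samples whose amplicons fit in one run at the required depth."""
    reads_per_sample = amplicons_per_sample * min_coverage * correction_factor
    return floor(reads_per_run / reads_per_sample)


# Hypothetical pilot coverages and run parameters
pilot = [50, 100, 200, 50]
print(spread_correction_factor(pilot, fraction=1.0))
print(samples_per_run(1_000_000, 100, 40, 2.5))
```

Raising `fraction` makes the factor larger and the number of samples per run smaller, which is the trade-off between completeness and throughput.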
The correction factor to convert the minimum coverage into the required average coverage can be derived from an evaluation of the distribution of the coverage. Figure 1A plots the number of SAC against coverage and shows that the variation in coverage depth is log-normally distributed. Coverage data of 3300 SAC were used to generate this plot (3 genes: FBN1, TGFBR1, TGFBR2 for 30 patients). This variation in coverage depth dictates how many extra reads are needed to cover all sequences at the required level. By plotting the cumulative distribution of the fold difference of the mean coverage to the SAC coverage, one can determine the correction factor by which the mean coverage needs to be multiplied in order to have a given fraction of SAC with at least the minimum coverage (Figure 1B). The value on the X-axis at which the cumulative histogram passes the 90% threshold is defined as the correction factor (F90). More stringent correction can be obtained by calculating a correction factor at higher thresholds (e.g. F95). Supplemental Tool 1 provides an easy to use calculation template (MS Excel) to determine the spread correction factor and the number of patients that can be screened in a single run, ensuring sufficient power to detect heterozygous variants. The results of a BRCA1/BRCA2 screening using P=99.90%, threshold=25% and spread correction factor 2.5 are provided as an example. The calculation template determines that 83 samples can be screened in a single GS-FLX (Titanium chemistry) run with 90% of sequences covered sufficiently to provide a minimum power of 99.9% to detect heterozygous variants, dropping to 65 samples if 95% of the sequences need to be covered sufficiently.

Emulsion PCR

We assumed that a narrower spread in coverage would be obtained by sequencing an equimolar pool of fragments or amplicons. To test the assumption that the emulsion PCR does not introduce a substantial bias, we compared the relative peak intensities (determined by fragment analysis on an ABI3730xl) of 9 different fluorescently labeled multiplex PCRs (6- to 11-plexes), amplified on 5 different samples (total of 360 SAC), with the corresponding relative coverage after sequencing. Overall there seems to be a good 1:1 relationship between the relative fluorescence and the relative coverage, indicating that a given increase in relative fluorescence on average induces an equal increase in relative coverage (Figure 2).
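The comparison of relative fluorescence with relative coverage can be sketched as follows. The example assumes that peak intensities and read counts are available per amplicon within one multiplex; it expresses both as fractions of the multiplex total, centers them on their mean, and estimates the slope by ordinary least squares. This is a simplified stand-in for the principal component analysis used in the study, and the function names and input values are hypothetical.

```python
def relative(values):
    """Express each value as a fraction of the multiplex total."""
    total = sum(values)
    return [v / total for v in values]


def centered(values):
    """Subtract the mean so that multiplexes of different size are comparable."""
    mean = sum(values) / len(values)
    return [v - mean for v in values]


def slope(x, y):
    """Least-squares slope of y on x for data already centered on zero."""
    return sum(a * b for a, b in zip(x, y)) / sum(a * a for a in x)


# Hypothetical 4-plex: fluorescent peak heights and read counts per amplicon
peaks = [100, 200, 300, 400]
reads = [50, 100, 150, 200]

x = centered(relative(peaks))
y = centered(relative(reads))
print(slope(x, y))  # a value near 1 suggests little bias between input and coverage
```

A slope close to 1 would be in line with a 1:1 relationship, whereas a slope well below 1 would suggest that differences in input are flattened during emulsion PCR or sequencing.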
In contrast to the findings obtained for shotgun sequencing [11], our data indicate that sequencing bias is limited and that sequencing cost efficiency can be improved by generating more equimolar input pools. Equimolarity can be achieved by optimizing amplification conditions or by normalizing PCR product concentrations. Although normalization can potentially increase sequencing efficiency, one may lose overall processing efficiency due to the effort required to normalize the SAC. With good primer design tools one should be able to get similar DNA quantities (as measured by end point fluorescence in a qPCR reaction with saturating DNA binding dye) for the 90% best assays. For such screenings, the majority of amplicons do not require any normalization and a significant portion of all remaining amplicons can be made equimolar by a simple normalization. Figure 3 shows the distribution of the relative end point fluorescence intensities (RFU, relative to the maximum fluorescence) across 627 different qPCR reactions on a single sample amplified for 15 genes associated with hearing loss. It is important to note that comparison of end point fluorescence values is only valid for singleplex PCR products of comparable length.

Sequence quality analysis

Sequence quality was determined using the GS-FLX basecaller. Quality scores per base were averaged across all reads within a single run (~ reads of 1 GS-FLX Titanium experiment for BRCA1/2 and FBN1, TGFBR1, TGFBR2 amplicons), and plotted as a function of the sequenced base (Figure 4A). Because of the setup of this amplicon sequencing run, the number of reads longer than 400 bp was too low to provide accurate quality estimations in that range. Quality scores (Q) were converted into probabilities of erroneous basecalls (P) as follows: P = 10^(-Q/10), corresponding to the better known Phred scores. Pyrosequencing reactions are characterized by a low false call rate for substitutions, but also by a higher error rate for insertions and deletions, especially in homopolymeric regions [12]. A combination of quality and allele frequency filters may eliminate most errors, but fails to distinguish real insertions/deletions from sequencing errors in case of longer homopolymers (7 or more repeats) (Figure 4B).

Discussion

As massively parallel sequencing has the potential to become the standard for next generation molecular diagnostics, more insight into the limitations of the technology is urgently needed, and tools are required to standardize the quality of the diagnostic tests offered in various laboratories.
In this study, we thoroughly evaluated data obtained with 10 GS-FLX experiments, allowing us to shed light on a number of important issues and provide workarounds. Current massively parallel sequencers offer a throughput per run that is insufficient for complete genome sequencing at affordable cost in a diagnostic setting, but mostly exceeds the requirements for targeted resequencing of single DNA samples. Strategies for next generation molecular diagnostics will therefore have to deal with both the selection of regions of interest and with sample multiplexing.

Regions can be selected by either hybridization based enrichment or PCR amplification. Enrichment by capturing DNA fragments on oligonucleotides on array (e.g. NimbleGen, Febit) or in solution (e.g. Agilent, Illumina) has the advantage that many regions can be targeted in parallel (target multiplexing). While this allows enrichment of a high number of regions of interest (up to an entire human exome), it is well known to introduce large variations in coverage [13]. In addition, enrichment is never complete: some regions are not captured, whereas other unwanted regions may be copurified. The main drawbacks of this technology for molecular diagnostics are its high cost and the large quantities of high quality DNA that are required. For these reasons, we evaluated PCR amplification based approaches for NGMD.

Sample multiplexing can be achieved by physically separating samples in the sequencing reaction or by tagging the amplicons with different sample specific sequences during library preparation. Physical separation on current MPS instruments offers limited flexibility in the number of samples to be multiplexed (up to 16 in GS-FLX) and may reduce the available sequencing capacity by blocking parts of the available sequencing space. Therefore, a sample tagging approach is preferred. For applications where different samples are analyzed for different genes, no special multiplexing modifications are needed when sequences can be easily attributed to the different samples based on correct alignment to the gene of interest.

Four major amplification based approaches for NGMD are currently used worldwide: 1) PCR with fusion primers (GS-FLX), 2) PCR followed by adapter ligation (GS-FLX), 3) two consecutive rounds of PCR (GS-FLX), and 4) shearing of concatenated PCR products followed by adapter ligation (various MPS platforms).
It must be noted that other approaches or variations on the methods described may be used as well. In this study, we evaluated approaches 2 and 3. The main advantages of approach 2 are its simplicity and ease of set-up. The drawback is the large number of individual PCR reactions that need to be performed. Hence, we concluded that this approach is best suited if a screening only needs to be performed a few times or when results are required quickly and one cannot afford optimization. As soon as a few hundred samples need to be screened, approach 3 may be the preferred alternative. By multiplexing PCR reactions in approach 3, one can reduce the workload and consumable cost for sample preparation. Although optimization of multiplex PCR may be challenging, there is a good return in increased efficiency (in terms of cost and workload to prepare samples) for tests that will be run many times, as is the case in diagnostic sequencing. Further optimization may be achieved if the first and second round of PCR can be combined into a single PCR containing the two types of primers (inner target specific and outer sample specific primers).

Because of fundamental differences between the traditional and the so-called next-generation sequencing methods, people are uncertain how to deal with coverage and how to interpret variants, errors and quality scores. Despite the availability of some guidelines on required coverage provided by sequencing instrument suppliers, there was no theoretical framework to actually calculate the required minimum coverage. We here provide such a framework and implement it in a spreadsheet template that can be used to determine the required coverage and the number of patients that can be screened in a single run.

A number of sources of false positives and false negatives are identical for both Sanger and massively parallel sequencing. However, because MPS is based on the sequencing of single, clonally amplified molecules and uses a completely different sequencing chemistry, new types of error sources must be taken into account. In addition to the variations that should be detected (real mutations and SNPs), a variety of sequencing artifacts were observed as well. In the process of variant detection, false positives were introduced at three levels:

Non-specific PCR reactions co-amplified homologous sequences. The mismatches with the reference sequence are detected as heterozygous variants. Since these variants occurred in all samples equally, they were easily spotted and filtered out. Nonetheless, a more specific PCR amplification (reaction optimization or improved primer design) is strongly recommended to ease data analysis.

PCR or emulsion PCR errors occur randomly or are biased towards specific positions. Because random errors are unlikely to result in significant variant frequencies, they can easily be removed by applying a filter requiring a minimum variant frequency.
We applied a variant frequency threshold of 25%, but in general the threshold should be determined empirically in pilot studies, taking sensitivity and specificity into account. Non-random localized errors (polymerase and buffer specific) may result in variant frequencies that approximate the expected 50% for heterozygous mutations or SNPs. Although these PCR errors impede data analysis, they can be recognized by their appearance in all samples.

Pyrosequencing reactions are characterized by a low false call rate for substitutions, but are infamous for their higher error rate for insertions and deletions, especially in homopolymeric regions [12]. A combination of quality and allele frequency filters may eliminate most errors, but fails to distinguish real insertions/deletions from sequencing errors in longer homopolymers (7 or more repeats) (Figure 4b). Since no reliable information on insertions or deletions in long homopolymers can be obtained, homopolymer filters may be introduced to eliminate this type of false positive. This essentially excludes all long homopolymeric regions from pyrosequencing-based NGMD screenings.

Since next-generation sequencing is a multi-step process, false negatives can result from technical problems, limitations or wrong decisions at any point during sample preparation or data analysis.

Variants that are located within SAC for which no reads have been generated cannot be detected. This problem can easily be corrected by repeating the analysis for that SAC in a new MPS run, or by analyzing it with Sanger sequencing if only a limited number of SAC are not covered. For NGMD, all SAC that are not covered at the required minimum coverage may need to be repeated to reduce the chance of false negatives.

Variants may not be sequenced because only one allele is amplified and sequenced. Allelic drop-out because of sampling effects can be avoided by requiring a minimum number of target molecules for the PCR reaction (e.g. more than 0.3 ng of genomic DNA, which is equivalent to 100 haploid human genome copies). As for other PCR-based approaches, variants in alleles that are not amplified (because of large insertions or deletions, or the presence of interfering SNPs in the primer binding regions) will be missed [14].

Variants may escape detection if the mapper or variant caller software cannot map the read or call the variation.
These problems are unlikely for single nucleotide variations but may be more problematic for larger or more complex mutations. This type of error cannot be excluded with absolute certainty, but a simulated validation of the variant identification pipeline may provide estimates of the (un)likelihood of missing a variation [8]. Filters that are introduced in the variant identification step to eliminate false positives (see above) may also accidentally remove real mutations. For most filters there is a trade-off between false positives and false negatives. Less stringent settings increase the sensitivity

of the screening at the cost of a higher false positive rate, and vice versa. Therefore, pilot validation experiments are required to evaluate the setup and to fine-tune the settings for an optimal balance between sensitivity and specificity. Knowing the possible sources of error, one may optimize sample preparation and sequencing protocols, and take measures to adjust the data analysis pipeline for these new types of errors.

Table 1 shows an overview of the theoretically required MC to reliably detect heterozygous variants at varying minimum allele frequencies with a given power. Note that this theoretical MC value only accounts for allelic drop-out due to sampling effects and that it should be treated as a lower limit for the actual MC, which may be larger because of additional variation affecting allele frequencies. Because of inter-lab variation we cannot propose a single value for the required minimum coverage, but labs can determine their own MC value based on their sequencing error rate (filter setting) and the required power to detect variants (Table 1, Supplemental Tool 1). When new to NGMD, filtering at 25% and aiming for 99.9% power (resulting in an MC of 38) may be a good starting point. A 5-fold coverage is expected to be sufficient to tolerate occasional sequencing errors when screening for homozygous variations only.

Based on the strategies and methods described in this paper, we successfully developed and validated the screening of the complete coding region of the BRCA1 and BRCA2 genes in a diagnostic setting [15], demonstrating the feasibility of performing more efficient molecular diagnostics using massively parallel sequencing. The major remaining hurdle is the availability of data analysis tools that provide the high quality required for in vitro diagnostics and that are truly tailored towards a routine diagnostic setting.
The availability of commercial software packages and the advent of smaller-scale MPS instruments such as the GS-Junior are expected to push this new sequencing technology into the field of diagnostics, starting with the multigenic disorders for which no good alternatives are available at this moment. However, because of its proven track record, its superior flexibility and its large installed base, Sanger sequencing is unlikely to be replaced in the near future for smaller screening projects, and it will remain a valuable technology for confirmation of mutations observed by other technologies.

Acknowledgements

This research was supported by grant from the Fund for Scientific Research Flanders (FWO) to Kathleen Claes, by GOA grant BOF10/GOA/019 (Ghent University) and by Fighting Aneurysmal Disease [EC-FP7]. Kim De Leeneer is supported by the Vlaamse Liga tegen Kanker through a grant of the Foundation Emmanuel van der Schueren. Jan Hellemans is supported by a grant from the Fund for Scientific Research Flanders (FWO). This research has been made possible by funding from the Hercules Foundation [AUGE/039] and UGent IOF StepStone, and by the support of the NXTGNT consortium.

Author contributions

Conceived and designed the experiments: KDL, JH, DD, FvN, SB, PC, JV, KC. Performed the experiments: KDL, MB, SDK. Analysis of the data and bioinformatics support: KDL, JDS, SL, FP, BDW, JH. Statistical analyses: JH, LC. Wrote the paper: KDL, JH, JV, KC.

Tables

Table 1: Overview of the required coverage to detect heterozygous variants, as a function of the desired power (rows) and the level of filtering being applied (columns).

sequencing error filter level
Power (Q) 5% 10% 15% 20% 25% 30% 35%
90.00% (10) % (13) % (20) % (23) % (30) % (33) % (40) % (43) % (50)
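As a rough illustration of how a minimum coverage (MC) value of this kind could be derived, the sketch below models the number of variant reads at a heterozygous position as a binomial draw with probability 0.5 and asks how often the observed variant fraction reaches the filter level. This is an assumed model, not necessarily the calculation implemented in Supplemental Tool 1; the function names `detection_power` and `minimum_coverage` and the `max_coverage` search limit are hypothetical. Because the power is not strictly monotonic in the coverage (the required read count jumps at certain coverages), the sketch reports the smallest coverage from which all higher coverages up to the limit reach the requested power.

```python
from math import ceil, comb


def detection_power(coverage: int, min_fraction: float) -> float:
    """Probability that a heterozygous variant passes the frequency filter.

    Assumption: every read samples the variant allele with probability 0.5,
    i.e. only sampling noise is modelled (no PCR bias, no sequencing error).
    """
    # Smallest variant read count whose fraction is >= min_fraction.
    min_reads = ceil(min_fraction * coverage - 1e-9)
    # Number of equally likely read outcomes with too few variant reads.
    missed = sum(comb(coverage, k) for k in range(min_reads))
    return 1.0 - missed / 2 ** coverage


def minimum_coverage(min_fraction: float, power: float,
                     max_coverage: int = 500) -> int:
    """Smallest coverage from which every coverage up to max_coverage
    reaches the requested detection power."""
    required = 1
    for coverage in range(1, max_coverage + 1):
        if detection_power(coverage, min_fraction) < power:
            # This coverage is insufficient, so the answer lies above it.
            required = coverage + 1
    return required


if __name__ == "__main__":
    print(minimum_coverage(0.25, 0.999))
```

Under these assumptions, `minimum_coverage(0.25, 0.999)` evaluates to 38, in line with the MC quoted in the text for a 25% filter and 99.9% power. Other cells of Table 1 may have been computed with different conventions in the original tool.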

Figures

Figure 1: Coverage analysis. a) Distribution plot of the coverage observed in a pilot study representative for NGMD screening (full line) with 3300 sample amplicon combinations (SAC), derived from sequencing 30 patients for FBN1, TGFBR1 and TGFBR2. The coverage across different SAC appears to be log-normally distributed (R² with best Gaussian fit (dashed line) > 0.99). At low coverage (<40, vertical line), the distribution deviates from its Gaussian fit. This reflects a low number of reactions that failed to give a normal coverage. Analysis of these SAC may provide clues on how to further optimize the screening. b) Cumulative distribution plot of the relative coverage (expressed as a fold difference of each SAC to the average coverage). This histogram allows determination of the correction factor by looking up the relative coverage for which the curve passes a given threshold, e.g. 90% for the calculation of F
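The look-up described for panel b can be sketched in a few lines. The code below is a minimal illustration under stated assumptions: the fold difference of a SAC is taken as mean coverage divided by its own coverage, the correction factor is the smallest fold difference that the chosen fraction of SACs stays within, and the function names as well as the `reads_per_run` argument are hypothetical. The last helper shows how such a factor could feed into the number of samples per run, assuming reads are spread evenly over samples.

```python
from math import ceil, floor


def spread_correction_factor(coverages, fraction=0.90):
    """Fold difference to mean coverage that `fraction` of the SACs stay within."""
    mean = sum(coverages) / len(coverages)
    # Fold difference of each SAC to the average; uncovered SACs count as infinite.
    folds = sorted(mean / c if c > 0 else float("inf") for c in coverages)
    # Index of the last SAC that still belongs to the requested fraction.
    index = ceil(fraction * len(folds) - 1e-9) - 1
    return folds[index]


def required_average_coverage(min_coverage, factor):
    """Average coverage to aim for so the weaker SACs still reach min_coverage."""
    return min_coverage * factor


def samples_per_run(reads_per_run, amplicons_per_sample, min_coverage, factor):
    """Number of samples that fit in one run (reads_per_run is a hypothetical input)."""
    reads_per_sample = amplicons_per_sample * required_average_coverage(
        min_coverage, factor)
    return floor(reads_per_run / reads_per_sample)


if __name__ == "__main__":
    toy_coverages = [200, 100, 100, 100, 100, 100, 100, 100, 50, 50]  # invented data
    print(spread_correction_factor(toy_coverages, 0.90))
    print(required_average_coverage(38, 3.16))
```

With the figures quoted elsewhere in this chapter (a minimum coverage of 38 and a 3.16-fold difference to mean coverage), the required average coverage works out to about 120 reads per amplicon.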

Figure 2: Emulsion PCR and sequencing bias. Nine different fluorescently labeled multiplex PCRs (6- to 11-plexes), amplified on 5 different samples, were analyzed on a capillary sequencer to determine relative amplicon abundances prior to emulsion PCR and sequencing on a GS-FLX. Relative fluorescent signals were compared to their corresponding coverage values. The top panel shows the relative coverage as a function of the relative fluorescence for the 360 SACs. The ellipse represents the 95% confidence region according to the multivariate normal distribution. The continuous line is the first principal component (PC), which indicates the direction of the largest variance in the sample: 92% of the variance of the sample can be explained by the first PC. The first PC lies very close to the first bisector (dashed line). Hence, there is a good 1:1 relationship between the relative fluorescence and the relative coverage, indicating that a given increase in relative fluorescence on average induces an equal increase in relative coverage. The table at the bottom summarizes results across all 9 multiplex PCRs (360 SACs). It shows that the first PC explains a large proportion of the variance of each multiplex (84%-98%): the majority of variation in coverage results from variations in input amounts (as determined by fragment analysis on a capillary sequencer).

Figure 3: Analysis of amplicon abundance. This graph represents the distribution of the relative end-point fluorescence intensities (RFU, relative to the maximum fluorescence) across 627 different qPCR reactions on a single sample. About 90% of reactions have RFU values of at least 0.5. This implies that if equal volumes of all PCR reactions are pooled, the concentration of 90% of amplicons will vary less than 2-fold. This fraction of amplicons can be increased to 96% by using a double volume for the PCRs in the RFU range, and to 97% by using a quadruple volume for the PCRs in the RFU range. The concentration of the remaining 3% of PCR reactions is too low to be efficiently used.

Figure 4: GS-FLX sequence quality analysis. a) Average quality score as a function of the position within the reads for a representative dataset (full Titanium run with amplicons for breast cancer and for familial aortic aneurysm screenings). Across the first 400 bp there is an average quality of 35.3, corresponding to a predicted error rate of 0.029%. b) Comparison of the observed homopolymer length in a series of sequencing runs to the expected length based on the reference sequence. Results are plotted as the fraction of reads having a correct homopolymer length estimation (n), an underestimation of the homopolymer length (n-1, n-2, n-3) or an overestimation (n+1, n+2, n+3). The vast majority of reads for homopolymers of up to 6 repeats have a correct length estimation; less than 2% are overcalls and less than 10% are undercalls. For homopolymers of 7 repeats, three quarters of the reads are correctly called and over 20% of the reads are interpreted to be missing one repeat. Only by filtering for low allele frequencies can these repeats be analyzed. At 8 repeats only about half of the reads are correctly called; at even larger homopolymer lengths only a minority of reads are called correctly.
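The relation between a quality score and its predicted error rate in panel a follows the Phred convention (Q = -10 log10 of the error probability, as in reference 10). The short sketch below shows the conversion in both directions; the function names are hypothetical. The same scale also appears to underlie the Q values listed next to the power levels in Table 1, which is an interpretation rather than something the text states.

```python
from math import log10


def phred_to_error_probability(quality: float) -> float:
    """Error probability predicted by a Phred-scaled quality score."""
    return 10 ** (-quality / 10)


def error_probability_to_phred(error_probability: float) -> float:
    """Phred-scaled quality score for a given error probability."""
    return -10 * log10(error_probability)


if __name__ == "__main__":
    # Average quality reported for the first 400 bp in Figure 4a.
    print(f"{phred_to_error_probability(35.3):.4%}")
```

For the average quality of 35.3 this gives an error probability of roughly 0.03%, consistent with the value in the caption.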

References

1. Chou LS, Lyon E, Wittwer CT (2005) A comparison of high-resolution melting analysis with denaturing high-performance liquid chromatography for mutation scanning: cystic fibrosis transmembrane conductance regulator gene as a model. American journal of clinical pathology 124:
2. De Leeneer K, Coene I, Poppe B, De Paepe A, Claes K (2009) Genotyping of frequent BRCA1/2 SNPs with unlabeled probes: a supplement to HRMCA mutation scanning, allowing the strong reduction of sequencing burden. The Journal of molecular diagnostics : JMD 11:
3. Wittwer CT (2009) High-resolution DNA melting analysis: advancements and limitations. Human mutation 30:
4. De Leeneer K, Coene I, Poppe B, De Paepe A, Claes K (2008) Rapid and sensitive detection of BRCA1/2 mutations in a diagnostic setting: comparison of two high-resolution melting platforms. Clinical chemistry 54:
5. Sanger F, Coulson AR (1975) A rapid method for determining sequences in DNA by primed synthesis with DNA polymerase. Journal of molecular biology 94:
6. Shendure J, Ji H (2008) Next-generation DNA sequencing. Nature biotechnology 26:
7. Shendure JA, Porreca GJ, Church GM (2008) Overview of DNA sequencing strategies. Current protocols in molecular biology / edited by Frederick M Ausubel [et al] Chapter 7: Unit
8. De Schrijver JM, De Leeneer K, Lefever S, Sabbe N, Pattyn F, et al. (2010) Analysing 454 amplicon resequencing experiments using the modular and database oriented Variant Identification Pipeline. BMC bioinformatics 11:
9. Craig DW, Pearson JV, Szelinger S, Sekar A, Redman M, et al. (2008) Identification of genetic variants using bar-coded multiplexed sequencing. Nature methods 5:
10. Ewing B, Green P (1998) Base-calling of automated sequencer traces using phred. II. Error probabilities. Genome Res 8:
11. Harismendy O, Frazer K (2009) Method for improving sequence coverage uniformity of targeted genomic intervals amplified by LR-PCR using Illumina GA sequencing-by-synthesis technology. BioTechniques 46:
12. Quinlan AR, Stewart DA, Stromberg MP, Marth GT (2008) Pyrobayes: an improved base caller for SNP discovery in pyrosequences. Nature methods 5:
13. Albert TJ, Molla MN, Muzny DM, Nazareth L, Wheeler D, et al. (2007) Direct selection of human genomic loci by microarray hybridization. Nat Methods 4:
14. Pattyn F, Robbrecht P, De Paepe A, Speleman F, Vandesompele J (2006) RTPrimerDB: the real-time PCR primer and probe database, major update Nucleic acids research 34: D
15. De Leeneer K, Hellemans J, De Schrijver J, Baetens M, Poppe B, et al. (2011) Massive parallel amplicon sequencing of the breast cancer genes BRCA1 and BRCA2: opportunities, challenges, and limitations. Human mutation.

Supplemental files

Supplemental Tool 1: NGMD_calculator.xlsx
Supplemental file 1: Allele frequency analysis

Allele frequency analysis

Dataset

Breast cancer and familial aortic aneurysm screenings, 1 full GS-FLX run, 9721 variants.

Filtering

Variation in allele frequencies for heterozygous variants is disturbed by sequencing errors that occur at elevated rates. The full dataset was trimmed of variants with quality < 30 and of variants with homopolymer length >= 6. After filtering for likely sequencing errors, 3642 variants (37%) remain.
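The trimming step can be pictured as a simple per-variant filter. The sketch below is illustrative only: the `Variant` record, its field names and the example variants are hypothetical, and it assumes that a variant is discarded when either criterion (quality below 30, or homopolymer length of 6 or more) applies.

```python
from dataclasses import dataclass


@dataclass
class Variant:
    """Hypothetical record for one called variant."""
    name: str
    quality: float            # Phred-scaled quality of the variant call
    homopolymer_length: int   # length of the homopolymer the variant lies in


def is_likely_sequencing_error(variant, min_quality=30, max_homopolymer=6):
    """True if the variant fails the quality or the homopolymer criterion."""
    return (variant.quality < min_quality
            or variant.homopolymer_length >= max_homopolymer)


def filter_variants(variants, min_quality=30, max_homopolymer=6):
    """Keep only variants that are not flagged as likely sequencing errors."""
    return [v for v in variants
            if not is_likely_sequencing_error(v, min_quality, max_homopolymer)]


if __name__ == "__main__":
    calls = [
        Variant("var_a", quality=38, homopolymer_length=2),
        Variant("var_b", quality=22, homopolymer_length=1),  # low quality
        Variant("var_c", quality=40, homopolymer_length=7),  # long homopolymer
    ]
    kept = filter_variants(calls)
    print([v.name for v in kept], f"{len(kept) / len(calls):.0%} retained")
```

Applied to the dataset described above, a filter of this kind would reduce 9721 variants to the 3642 (about 37%) reported, provided the same criteria were used.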

Allele frequency binning

To evaluate whether allele frequencies for heterozygous variants fluctuate randomly around the theoretical value of 50%, variants were binned into allele frequency ranges: <20%, [20-40%[, [40-60%], ]60-95%], >95%. Variants that occurred in at least 5 samples were classified as having a systematic allelic bias if the number of samples with an allele frequency in the second (green) or fourth (red) bin was higher than the number of samples with that variant in the bin around 50%.

[Figure legend: no allelic bias, low allelic bias, high allelic bias]

Out of 922 unique variants, 185 occurred in at least 5 samples. Of these, 13 (7.0%) and 6 (3.2%), respectively, showed a decreased (green) or increased (red) allele frequency. Because of sequencing errors in the lower allele frequency range, the occurrence of non-random allelic bias is expected to be closer to the occurrence rate of increased allele frequency than to that of decreased allele

frequency. Correcting for sequencing problems like those observed in the problematic amplicon BRCA2_11_19 (3 variants with skewed allele frequencies), the overall fraction of real heterozygous variants with allele frequencies deviating from the expected 50% ratio [40-60%] is estimated at 5%.

<20 [20-40[ [40-60] ]60-95] >95 occurrence
BRCA1_11_10-262c 8% 58% 33% 12
BRCA1_ % 81% 6% 16
BRCA2_10_07-22a 22% 67% 11% 9
BRCA2_11_19-182a 11% 78% 11% 9
BRCA2_15-9t 5% 74% 21% 19
BRCA2_26-27t 86% 14% 14
FBN1_exon_16_2-90a 18% 59% 18% 5% 22
FBN1_exon_23-225a 3% 73% 25% 40
FBN1_exon_28_ % 75% 8
FBN1_exon_38-21t 15% 62% 23% 39
FBN1_exon_53-54t 14% 66% 21% 29
TGFBR1_3_UTR_1-27g 100% 41
TGFBR1_exon_7-33t 6% 76% 18% 34
BRCA2_03-23a 42% 50% 8% 12
BRCA2_11_ % 69% 13
BRCA2_11_ % 28% 44% 6% 18
BRCA2_16-9t 21% 79% 19
FBN1_exon_18_1-117c 28% 61% 11% 18
FBN1_exon_31_ % 41

Remaining variation

After exclusion of variants with demonstrated allele frequency bias and variants with allele frequencies below 20% (all of which were shown to be false positives, i.e. PCR and sequencing errors), 623 variants with a coverage of at least 20 (to allow for reliable allele frequency estimation) remained.

This dataset allowed for evaluation of residual allele frequency bias.

allele frequency count fraction
> %
]60-95] 24 4%
[40-60] %
[20-40[ 68 11%

Based on these data, and correcting for sequencing errors being interpreted as heterozygous variants, the overall fraction of heterozygous variants with allele frequencies deviating from the expected 50% ratio [40-60%] is estimated at 10%.
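The binning and bias classification used in this supplemental file can be sketched as follows. This is a minimal illustration, not the original analysis code: the function names and the returned labels are hypothetical, allele frequencies are given in percent, and the tie-breaking rule for a variant whose low and high bins both exceed the middle bin is an assumption, since the text does not cover that case.

```python
from collections import Counter


def allele_frequency_bin(frequency: float) -> str:
    """Assign an allele frequency (in percent) to one of the five ranges."""
    if frequency < 20:
        return "<20"
    if frequency < 40:
        return "[20-40["
    if frequency <= 60:
        return "[40-60]"
    if frequency <= 95:
        return "]60-95]"
    return ">95"


def classify_allelic_bias(frequencies, min_samples=5) -> str:
    """Classify one variant from its allele frequencies across samples."""
    if len(frequencies) < min_samples:
        return "not evaluated"
    counts = Counter(allele_frequency_bin(f) for f in frequencies)
    middle = counts["[40-60]"]
    low = counts["[20-40["]
    high = counts["]60-95]"]
    # Assumption: if both outer bins exceed the middle bin, the larger one wins.
    if low > middle and low >= high:
        return "low allelic bias"
    if high > middle:
        return "high allelic bias"
    return "no allelic bias"


if __name__ == "__main__":
    print(classify_allelic_bias([30, 35, 25, 50, 45]))
    print(classify_allelic_bias([50, 45, 55, 30, 70]))
```

Keeping the bin boundaries in one function makes it easy to check that the half-open ranges match the notation used in the text.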

METHODS

Human Mutation

Massive Parallel Amplicon Sequencing of the Breast Cancer Genes BRCA1 and BRCA2: Opportunities, Challenges, and Limitations

Kim De Leeneer, 1 Jan Hellemans, 1 Joachim De Schrijver, 2 Machteld Baetens, 1 Bruce Poppe, 1 Wim Van Criekinge, 2 Anne De Paepe, 1 Paul Coucke, 1 and Kathleen Claes 1

1 Center for Medical Genetics, Ghent University Hospital, De Pintelaan 185, Gent, Belgium; 2 Laboratory for Bioinformatics and Computational Genomics, Ghent University, Gent, Belgium

Communicated by Mats Nilsson

Received 25 June 2010; accepted revised manuscript 1 December Published online in Wiley Online Library ( DOI /humu

ABSTRACT: This study describes how the new massive parallel sequencing technology can be implemented in a diagnostic setting for the breast cancer susceptibility genes (BRCA1 and BRCA2). The throughput was maximized by increasing uniformity in coverage, obtained by a multiplex approach, which outperformed pooling of singleplex PCRs. We evaluated the sensitivity by analysis of 133 distinct sequence variants; three (2%) deletions or duplications in homopolymers of greater than or equal to seven nucleotides remained undetected, illustrating a limitation of pyrosequencing. Furthermore, other limitations like nonrandom sequencing errors, pseudogene amplification, and failure to detect multiexon deletions are thoroughly described. Our workflow illustrates the potential of massive parallel sequencing of large genes in a diagnostic setting, which is of great importance to meet the increasing expectations of genetic testing. Implementation of this approach will hopefully lead to a strong reduction in turnaround times. As a consequence, a wider spectrum of at-risk women will be able to benefit from therapeutic and prophylactic interventions. Hum Mutat 32:1-10, © 2011 Wiley-Liss, Inc.
KEY WORDS: massive parallel sequencing; BRCA1; BRCA2; multiplex; barcoding; amplicon sequencing

Introduction

The development of high-throughput or massive parallel sequencing (MPS) facilitated many new research opportunities. Among several available platforms, the most widely used are the Genome Sequencer from Roche/454 Life Sciences, the Genome Analyzer from Illumina/Solexa, and the SOLiD System from Applied Biosystems. These platforms differ in several ways, such as the applied technology, the read length, and the number of DNA molecules sequenced.

Additional Supporting Information may be found in the online version of this article. Correspondence to: Kathleen Claes, Ghent University Hospital, Center for Medical Genetics, De Pintelaan 185, Gent 9000, Belgium. E-mail: Kathleen.Claes@UGent.be Contract grant sponsor: The Fund for Scientific Research Flanders (FWO); Contract grant number: (to K.C.); Contract grant sponsor: GOA; Contract grant number: BOF10/GOA/019 (Ghent University).

In the case of 454 sequencing, single DNA strands with 5' and 3' adaptor sequences are attached to beads and then clonally amplified by PCR in an oil-water emulsion. The beads are mixed with DNA polymerase and deposited in plates containing over 1 million wells, with one bead per well. Nucleotides then flow sequentially over the wells and, as each nucleotide is added to form complementary DNA strands, pyrophosphate is released and detected in a chemiluminescent flash (pyrosequencing chemistry).
Since the introduction of MPS, an increasing number of applications have been published, among others de novo sequencing [Pearson et al., 2007; Pol et al., 2007; Velasco et al., 2007], whole genome resequencing [Albert et al., 2007; Korbel et al., 2007], amplicon sequencing (e.g., exon resequencing, virus variant detection, DNA methylation) [Dahl et al., 2007; Korshunova et al., 2008; Pettersson et al., 2008; Taylor et al., 2007; Thomas et al., 2006], and miRNA and splice variant discovery [Ruby et al., 2006; Yao et al., 2007]. However, the implementation of high-throughput sequencing in a clinical diagnostic setting remains largely unexplored. Over the last decades, Sanger sequencing [Sanger et al., 1977] has been the dominant DNA sequencing technology and the gold standard for DNA-based mutation detection. However, due to cost limitations, direct sequencing of large genes in a diagnostic setting is often preceded by a mutation scanning technique, followed by characterization of the detected variant with Sanger sequencing. Denaturing gradient gel electrophoresis (DGGE) [van der Hout et al., 2006], denaturing high-performance liquid chromatography (dHPLC) [Liu et al., 1998], and high-resolution melting curve analysis (HRMCA) [De Leeneer et al., 2008, 2009] are well-known examples of mutation scanning techniques. For these prescreening techniques, sensitivities varying from % and specificities close to 100% were reported [Gerhardus et al., 2007]. Some of these methods are laborious and cannot be fully automated because a sequencing step is required to define the nature of the variant. We selected the BReast CAncer susceptibility genes BRCA1 (MIM# ) and BRCA2 (MIM# ) to optimize high-throughput amplicon sequencing, because of their size, polymorphic character, and lack of mutation hot spots. These genes require high-throughput screening as an increasing number of samples need to be tested within shorter turnaround times.
The prospect of targeted therapeutic agents for tumors diagnosed in BRCA1/2 mutation carriers, such as PARP inhibitors, contributes to the expectations for genetic testing [Curtin, 2005]. A cost-efficient MPS strategy requires an easy setup and a uniform distribution of coverage in order to maximize throughput.

We explored the challenges of optimizing the clinical use of MPS with two different approaches and compared the sensitivity and specificity of MPS with Sanger sequencing and with prescreening techniques currently used in molecular diagnostics. For our studies we selected the Roche Genome Sequencer FLX (GS FLX) system because of the longer read lengths generated by this instrument compared to the other two major platforms.

Materials and Methods

DNA Samples and Sequence Variants Evaluated

In total, 123 DNA samples isolated from blood were investigated. In order to evaluate the specificity of our MPS setup, genomic DNA samples from 30 patients previously analyzed for the complete coding region of BRCA1/2 with other mutation detection techniques were used. Eleven of these patients were completely Sanger sequenced. Nineteen were analyzed with HRMCA, followed by Sanger sequencing of all amplicons displaying aberrant melting curves [De Leeneer et al., 2008, 2009]. Identical primer sets were used for Sanger sequencing, HRMCA, and MPS. The mutation detection capacity of MPS was evaluated using 93 samples with previously characterized (Sanger sequencing) deletion or insertion variants as positive control samples. These samples were sequenced for the amplicon containing the mutation and other amplicons within the specific multiplex set. An overview of all variants evaluated is shown in Supp. Table S1. In total, bp deletions, bp insertions (of which 13 were duplications), 3 combined indels, and 22 deletions and an insertion larger than 3 bp were sequenced. Twenty of these deletions and insertions are located in a homopolymeric region longer than 3 bp. In addition, 410 (40 unique) nucleotide substitutions were present in the 30 samples from patients previously analyzed for the complete coding region of BRCA1/2 with other mutation detection techniques. The majority (406) were frequent single nucleotide polymorphisms (SNPs) and four were nonsense mutations.
Polymerase Chain Reaction (PCR) Setup

To cover all coding regions and splice sites, we used primer sets thoroughly validated by HRMCA [De Leeneer et al., 2008, 2009] and Sanger sequencing. In runs 1 and 2 we started from equimolar pools of singleplex PCRs; in runs 3 and 4 multiplex PCRs were generated prior to sequencing. To fuse the amplicon-specific primers with the adaptor-MID (Multiplex Identifier) barcoded primers, two consecutive PCR rounds were used with a universal M13 tail as linker [J. Hellemans et al., under review]. A schematic representation of the principle and workflow of both approaches is shown in Figure 1.

Runs 1 and 2

Per sample, 111 singleplex PCRs with amplicon-specific primers were performed with identical reaction conditions and primers as published before [De Leeneer et al., 2008]. PCRs were performed on the CFX384 (Bio-Rad, Hercules, CA) and RFU data (endpoint fluorescence) were used for subsequent normalization to obtain 12 equimolar pools of PCR products per patient.

Runs 3 and 4

Sixteen multiplex reactions were optimized to cover the complete coding and splice site regions of BRCA1 and BRCA2, except BRCA1 exon 2. As this amplicon amplified less efficiently, it was optimized in an individual reaction. The specifications of each multiplex reaction are given in Supp. Tables S2 and S3. Multiplex PCR was performed in a 20-µl total volume. For sets 1, 2, 5, 8-15, the amplification mixture included 2 Titanium buffer (Clontech, Palo Alto, CA), 3% DMSO (VWR International, Radnor, PA), 200 µM of each dNTP (GE Healthcare, Piscataway, NJ), 1 Titanium Taq Polymerase (Clontech), and approximately 100 ng of DNA. In five reactions (sets 3, 4, 6, 7, and 16), 3 mM MgCl2 (Invitrogen, Carlsbad, CA), 2 PCR buffer (Invitrogen), and 1.5 U Platinum Taq polymerase were used instead of Titanium buffer and polymerase. Two touchdown PCR programs were used (abbreviated as Touch46 and Touch48 in Supp. Table S2).
Figure 1. Schematic overview of both workflows applied in the four PoC studies. A: Amplicon-specific primers are fused with a universal M13 tail. This target-specific PCR product is amplified in a second PCR with primers consisting of an M13 tail, a MID tag (patient specific) and, at the end, an A or B adaptor. B: Upper panel: schematic representation of the approach used in PoC 1 and 2: 111 PCR reactions per patient are performed and equimolarly pooled in 12 pools, followed by a second PCR to attach MID primers and sequencing adaptors. All products are equimolarly pooled prior to emulsion PCR and sequencing. Lower panel: approach used in PoC 3 and 4; 16 multiplex reactions were optimized containing the 111 amplicons to be amplified for each patient. In a second PCR round, MID barcodes and sequencing adaptors were attached. PCR products are equimolarly pooled prior to emulsion PCR and sequencing.

The temperature cycling protocol consists of an initial denaturation step at 94°C for 2 min, followed by 12 cycles of denaturation at 94°C for 20 sec, annealing starting at 60 (58)°C for 20 sec (decreasing 1°C per cycle), and extension at 72°C for 1 min. This initial PCR is followed by 25 additional cycles of denaturation at 94°C for 40 sec, annealing at 48 (46)°C for 40 sec, and extension at 72°C for 30 sec. Final extension was accomplished at 72°C for 10 min. Primer concentrations in one multiplex vary between and 0.8 µM and were adjusted to obtain equimolar quantities of each amplicon in one reaction (Supp. Table S3).

PCR with MID barcoded primers

MID barcoded primers consist of (Fig. 1A): (1) the required sequencing adaptor (A or B); (2) a 10-nucleotide-long MID tag or barcode to identify the patient (MID sequences provided by Roche/454, application note [CRF00104]); (3) a universal M13 tail (forward primer: cacgacgttgtaaaacgac and reverse primer: caggaaacagctatgacc), identical to the M13 tail used in the first PCR round. After the initial PCRs (singleplex or multiplex), all samples were diluted 1,000 times and 1 µl of product was used as a template for a second PCR with MID barcoded primers. The total volume of this reaction was 15 µl. The amplification mixture included 1.5 mM MgCl2 (Invitrogen), 1 PCR buffer (Invitrogen), 1.5 U Platinum Taq (Invitrogen), 3% DMSO (VWR International), 200 µM of each dNTP (GE Healthcare), and 0.2 µM of both forward and reverse primer. The temperature cycling protocol consists of the following steps: 4 min at 94°C; 15 cycles of denaturation at 94°C for 30 sec, annealing at 60°C for 30 sec, and extension at 72°C for 50 sec; and final extension at 72°C for 10 min. PCRs were performed on a CFX384 instrument (Bio-Rad). During optimization, FAM labeled MID primers were used to evaluate equimolarity between amplicons within one multiplex reaction, and fluorescent peaks were separated on an ABI3730 capillary system.
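For readers who want to check the touchdown schedule step by step, the multiplex cycling protocol described above can be written out programmatically. The sketch below is illustrative only: the function name `touchdown_program` and the step tuples are hypothetical, and it assumes that the values in parentheses (58°C start, 46°C final annealing) belong to the Touch46 program while the other values belong to Touch48.

```python
def touchdown_program(start_annealing: int, final_annealing: int):
    """Return the cycling steps as (name, temperature in °C, seconds) tuples."""
    steps = [("initial denaturation", 94, 120)]
    # 12 touchdown cycles: annealing temperature drops 1 °C per cycle.
    for cycle in range(12):
        steps += [
            ("denaturation", 94, 20),
            ("annealing", start_annealing - cycle, 20),
            ("extension", 72, 60),
        ]
    # 25 additional cycles at the fixed final annealing temperature.
    for _ in range(25):
        steps += [
            ("denaturation", 94, 40),
            ("annealing", final_annealing, 40),
            ("extension", 72, 30),
        ]
    steps.append(("final extension", 72, 600))
    return steps


# Assumed mapping of the two programs named in Supp. Table S2.
TOUCH48 = touchdown_program(start_annealing=60, final_annealing=48)
TOUCH46 = touchdown_program(start_annealing=58, final_annealing=46)

if __name__ == "__main__":
    annealing = [temp for name, temp, _ in TOUCH48 if name == "annealing"]
    print(annealing)
```

Listing the annealing temperatures shows that the last touchdown cycle ends 1°C above the fixed annealing temperature of the remaining 25 cycles, so the two phases join without a jump.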
Sequencing Runs and Data Analysis

PCRs were normalized and equimolarly pooled in relation to the RFU data. This pool was purified with a High Pure PCR Cleanup Micro kit (Roche, Indianapolis, IN). The fragment length of this total amplicon pool was evaluated on the Bioanalyzer (Agilent, Placerville, CA) and compared to the theoretically predicted pattern (Fig. 2). Emulsion PCR and sequencing reactions on the GS-FLX (454-Roche) were performed according to the manufacturer's instructions. The FASTA files were analyzed with the in-house developed variant identification pipeline (VIP) software version 1.3 [De Schrijver et al., 2010] and with the commercially available Nextgene software (Softgenetics, State College, PA) version 2.0. Reads were aligned against GenBank reference sequences, NC_ ( , complement BRCA1) and NC_ ( ; BRCA2) from build 18 of the Human Genome assembly.

Results

To evaluate the utility of MPS in a diagnostic setting, we selected crucial characteristics like uniformity of coverage, sensitivity, specificity, and throughput. Furthermore, MPS should outperform current methods used in molecular diagnostics in terms of cost efficiency to make the implementation worthwhile. The performance evaluation of the selected approach is based on the results of four proof of concept (PoC) experiments (Fig. 1). In the first two experiments an identical singleplex approach was used for the analysis of 22 different patient samples. To reduce workload, we started from a multiplex setup in PoC 3 and 4. For PoC4 a strong optimization was performed compared to PoC3: primer concentrations within several multiplex sets were adjusted for amplicons that were poorly covered in PoC3. Furthermore, the composition of a few multiplexes was changed to obtain more uniform coverage.

Evaluation of Uniformity in Coverage Distribution: Singleplex versus Multiplex Approach

In a cost-efficient test, the full capacity of the GS-FLX instrument is used.
This can be achieved by pooling different patients and/or disorders in a single lane. To maximize the number of samples in a single experiment, a uniform distribution of coverage is required: the difference in number of reads between the least efficiently amplified fragments and the best performing fragments should be as small as possible. We evaluated a singleplex (PoC1 and 2) and a multiplex approach (PoC3 and 4). A summary of the number of amplicons sequenced, reads mapped, and coverage distribution is shown in Table 1 for each PoC study. By calculating the fold difference to mean coverage, we showed that starting from strongly optimized multiplex sets results in a more uniform distribution of coverage compared to equimolar pooling of singleplex sets (Fig. 3).

Figure 2. Comparison of fragment length of the total amplicon pool in theory and in practice. Left panel: theoretical profile of the amplicon length for one patient when all amplicons are equimolarly pooled together. Amplicon length range is 233 to 437 bp, with an average length of 335 bp. Right panel: amplicon length profile obtained with the Bioanalyzer (Agilent) after column purification of the pool. Comparison of the two patterns is used as a quality control prior to emulsion PCR.

Table 1. Overview of Characteristics of the Four PoC Experiments

                                            PoC1          PoC2          PoC3          PoC4
Pooling                                     Singleplex    Singleplex    Multiplex     Multiplex
SETUP
Used run reads                              100%          100%          30% (a)       25% (a)
Mapped BRCA1/2 reads                        515,…         …,884         78,937        55,574
Amplicons sequenced                         1,…           …             …             …
Patients sequenced                          …             …             …             …
COVERAGE
Min/average/max (per amplicon)              1/348/1,870   1/191/1,076   1/144/931     7/168/559
Standard deviation                          …             …             …             …
Variation coefficient                       …             …             …             …
Fold difference to mean coverage, 90%/95%   4.48/…        …/…           …/…           …/3.16
No. of amplicons <38-fold coverage (%)      107 (8.76)    84 (6.88)     54 (9.73)     21 (6.31)

(a) Only 30% and 25% of the reads were used in these experiments, because pooling of different disorders in one experiment was evaluated in the same runs.

Figure 3. Evolution of distribution of coverage uniformity in the four PoC experiments. In total we performed four GS-FLX runs. In our first two experiments (green and orange curves), equimolar pooling of singleplex PCR reactions was used. The third (blue) and the fourth (purple) experiments were prepared according to the multiplexing protocol. A: A distribution plot of the coverage is shown. The experiments with the multiplex approach (PoC 3 and 4) clearly show a more uniform distribution of coverage compared to the pooled singleplex runs (PoC 1 and 2). The 38-fold coverage threshold is depicted by the dotted line. The failure rate in PoC 3 was higher because of poor amplification of some multiplex sets. B: Fold difference to mean coverage is shown as a function of the fraction of amplicons (x-axis in log10 scale). The dotted lines depict the factor where a 90% or 95% fraction of all amplicons is considered. Our last experiment (PoC4, purple curve) clearly shows the best result, with the smallest difference to mean coverage. A total of 95% of all amplicons are sequenced with a spread correction factor of 3.16 (Table 1).
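The uniformity statistics reported in Table 1 and Figure 3B can be illustrated with a short Python sketch. The text does not spell out the exact definition of the fold difference to mean coverage, so the code assumes one plausible reading: each amplicon's fold difference is the mean coverage divided by its own coverage, and the reported value is the smallest factor that holds for the requested fraction of amplicons. The function name `coverage_uniformity` and its return keys are hypothetical.

```python
import math
from statistics import mean, pstdev


def coverage_uniformity(coverages, fraction=0.95):
    """Summarize the spread of per-amplicon read counts.

    Assumption (not taken from the source): fold difference of an amplicon
    is mean coverage / its own coverage, so poorly covered amplicons get
    large values.
    """
    if not coverages or min(coverages) <= 0:
        raise ValueError("coverage values must be positive")
    avg = mean(coverages)
    folds = sorted(avg / c for c in coverages)
    # Smallest fold value that is not exceeded by `fraction` of the amplicons.
    index = max(0, math.ceil(fraction * len(folds)) - 1)
    return {
        "mean": avg,
        "variation_coefficient": pstdev(coverages) / avg,
        "fold_difference": folds[index],
    }


if __name__ == "__main__":
    # Toy read counts, not data from the study.
    print(coverage_uniformity([10, 20, 30, 40], fraction=0.5))
```

Under this reading, a smaller value at the 95% fraction means that fewer reads are wasted on over-represented amplicons.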
The fold difference to mean coverage can be used as a spread correction factor to calculate the average coverage needed to ensure that even the least efficiently amplified fragments reach the minimum required coverage. The smaller the fold difference to mean coverage, the larger the number of samples that can be pooled in a single experiment, and the more cost efficient a test will be. For our best optimized multiplex sets (PoC4) we obtained a 3.16-fold difference to mean coverage for 95% of the amplicons. This value can be used to calculate the average coverage we need to aim for to obtain a predefined minimum coverage for at least 95% of the amplicons (see below).

Multiplex Optimization

Optimizing the multiplex sets by fragment analysis using FAM-labeled MID primers turned out to be a good strategy: on average, we found a nearly linear correlation between peak height (= relative fluorescence) on capillary electrophoresis within sets and coverage obtained after 454 sequencing, as shown for two multiplex sets in Figure 4. The better results for the multiplex approach indicate that the second PCR round, which attaches the MID barcodes, introduces inequimolarities between the amplicons within the pool of singleplex PCRs. These inequimolarities are further increased during the emulsion PCR.

Calculation of the Number of Samples That Can Be Pooled in a Single Run

Knowing the fold difference to mean coverage allows calculating the number of patients that can be analyzed in a single standard GS-FLX run (calculation template available in the supplemental files of Hellemans et al. [under review]):

* With a fold difference to mean coverage of 3.16, the required average coverage to obtain a minimum coverage of 38-fold is 38 × 3.16 ≈ 120-fold.

We opted for a threshold of 38-fold coverage based on statistical analyses made by Hellemans et al., who calculated that 38-fold coverage is required to detect a particular heterozygous variant with a probability of 99.9% when only variants present in at least 25% of the reads are considered as possible true variants. The rationale for the 25% variant frequency is described in the section "Sensitivity".
* A standard GS-FLX run has 400,000 reads available; based on previous runs, about 95% of the reads are mapped: 400,000 × 95% = 380,000.
* 111 amplicons are required to cover the complete coding region and splice sites of BRCA1 and BRCA2. Considering a 5% safety margin to correct for possible run errors and differences in MID amplification efficiencies during sequencing, 111 × 120 (the required average coverage, 38 × 3.16) × 105% = 13,986 reads/patient are required.
* Therefore, 380,000/13,986 ≈ 27 patients can be pooled in a standard GS-FLX run, with a maximum of 5% of the amplicons not meeting the 38-fold coverage threshold.
* With the Titanium chemistry (1,100,000 reads available), the number of patients can be increased to 74.

Figure 5 shows that in PoC4 on average 4 of 111 amplicons did not meet the 38-fold threshold (i.e., 96.4% did). However, based on the capillary electrophoresis panels, higher coverage was expected for these amplicons, indicating that some fragments are less efficiently amplified by the emulsion PCR or had reduced coverage due to experimental variation. Primer concentrations in the multiplex reactions can still be increased for BRCA , BRCA , BRCA , and BRCA to improve coverage for these amplicons, which may provide a solution.

Figure 4. Correlation between relative fluorescence and coverage within multiplex sets. The correlation between the relative fluorescence seen on capillary electrophoresis and coverage is shown for two multiplex sets. Squares represent the individual amplicons within multiplex set 8, diamonds represent the amplicons of multiplex set 7. Equimolarity of amplicons within a multiplex set was verified on capillary electrophoresis.

Figure 5. Distribution of coverage for each amplicon in BRCA1 and BRCA2 (PoC 4). Coverage for each BRCA1 and BRCA2 amplicon is shown, with the line indicating the 38-fold coverage threshold. In total, 111 amplicons cover the complete coding region of both genes. In BRCA2, four amplicons failed to reach the 38-fold coverage threshold; further optimization for these amplicons is required.

Sensitivity, Specificity, and Filter Settings

Sensitivity

In total, 503 (133 distinct) sequence variants previously identified with Sanger sequencing were evaluated in our sample set (30 patients and 93 control samples) with VIP v1.3. (The sensitivity data obtained with NextGene v2.0 are described below.) All 40 unique substitutions (SNPs, missense, nonsense, splice site mutations) were easily detected. Detection of deletions or insertions is more challenging. MPS analysis software is based on mapping of single-stranded reads on a reference sequence; hence, reads lacking one or more nucleotides or containing an insertion will complicate this process. Furthermore, pyrosequencing has its limitations for correct basecalling in homopolymeric regions [Huse et al., 2007]. Therefore, we specifically selected 93 insertion/deletion (indel) mutations, of which 20 deletions and insertions were present in homopolymeric tracts longer than 3 bp, to thoroughly evaluate the limitations of the technology. Of the 93 indels, 2 remained completely undetectable and 1 additional mutation was filtered out because it had low quality scores (<30) and was present in less than 25% of the reads. Table 2 shows an overview of all undetected variants and their flanking sequences. All undetected variants affect homopolymer stretches of seven nucleotides.

Table 2. Examples of Variants Analyzed in Homopolymeric Regions
Columns: Variant (c.); Flanking sequences (a); Change of homopolymer length (≥5); Coverage of nucleotide; Quality; VF (%)

Results with VIP software
Undetected variants
BRCA1 c.1016dup        ACTCCCAGCACAGAAAAAAAAGGTAGATCTGAATGCTGATC     …  n.a.  <10%
BRCA2 c.994del         CTAGMAAGACTAGGAAAAAAATTTTCCATGARGCAAACGCT     …  n.a.  <10%
BRCA1 c.1010del        ACTCCCAGCACAGAAAAAAAGGTAGATCTGAATGCTGATCC     …  …%
Detected variants in homopolymeric regions
BRCA1 c.3329dup        ATCCTGAAATAAAAAAAGCAAGAATATGAAGAAGTAGTTC      …  …%
BRCA2 c.5577_5580del   ACATGAAACAATTAAAAAAGTGAAAGACATATTTACAGAC      …  …%
BRCA1 c.2989_2990dup   AAAACTAAATGTAAGAAAAAAATCTGCTAGAGGAAAACTTT     …  …%

Results with Nextgene software
Examples of variants with low mutation score (additional column: Mutation score)
BRCA2 c.6351dup        GAAGATCAAAAAAAACACTAGTTTT                     …  …%
BRCA1 c.1010del        ACTCCCAGCACAGAAAAAAAGGTAGATCTGAATGCTGATCC     …  …%
BRCA1 c.1961del        AGAGATAAAGAAAAAAAAGTACAACCAAA                 …  …%

(a) Bold, fully underlined nucleotides are part of the homopolymeric tract.

In total, 130/133 of all unique variants were detected with the VIP software, resulting in a sensitivity of 98% (100% for substitutions).
In general, the sensitivity of MPS will be higher, because we introduced a sample bias by selecting variants in complex sequence regions.

Reference bias

A priori, heterozygous variants are assumed to be present in 50% of the reads and homozygous mutant variants in 100%. On average, the heterozygous variants were present in 48.4% (95% confidence interval [CI]: %; range %) of the reads. Homozygous mutant variants were found with an average variant frequency of 99.0% (range %). Based on these data we conclude that the reference bias in mapping is minimal.

Specificity

MPS is more susceptible than Sanger sequencing to random sequencing errors and errors introduced by Taq polymerases, because the technology is based on sequencing of single, clonally amplified molecules. It is challenging to filter out these errors in the data analysis. The VIP software (version 1.3) [De Schrijver et al., 2010] generates a list of variants detected in more than 10% of mapped reads. By analyzing 30 patient samples for the complete BRCA1/2 coding sequence (3,330 amplicons), this program generated a list of 5,513 variants, of which only 443 are true variants, leaving 5,070 false positives. Based on the analysis of the positive controls, criteria were defined to distinguish false positive from true variants. Reads defining a variant need to fulfill all criteria before the variant can be called as a possible true variant, which then requires confirmation with Sanger sequencing. To filter out as many false positives as possible, the following filters were applied in VIP:

1. Min. 38-fold coverage (cov) is required (Filter 1, cov >38×).
2. The variant needs to be present in at least 25% of the reads (Filter 2, AF >25%).
3. A high quality score is required in at least one direction (forward [F] or reverse [R]) (Filter 3, Quality [Q] >30).
4. Variants in homopolymer stretches longer than 6 base pairs are unreliable calls (Filter 4, Homopolymer [Hp] >6).

An overview of the results is shown in Supp. Table S3 and Figure 6. Application of these filters resulted in a specificity of 92%. Specificity for each PoC separately is shown in Figure 7. Application of filters 1 and 2 resulted in the largest reduction of false positives: 1,432 data points were filtered out because of <38-fold coverage and another 2,628 because of being present in less than 25% of the reads, resulting in a specificity of 70% (filters 1 and 2). Fine tuning occurred by applying filters 3 and 4, decreasing the remaining 1,010 false positives to 276. In total, 104 distinct false positive variants remained. The recurrence of some false positives can be attributed to the Taq polymerase used or to pseudogene amplification (Fig. 8). A total of 89% (244/276) of the false positives found are indel variations in the close neighborhood of a homopolymer region.

Figure 6. Overview of data generated with an in-house developed software program (VIP). Combined forward and reverse allele frequencies are plotted against coverage for all variants detected in 30 patients and in 93 positive control samples. Gray data points are the false positives filtered out when any of the four filters is applied. Green data points are true variants (503). The green outlier at almost 80% AF is a deletion of 62 nucleotides; preferential amplification of the shorter allele has probably occurred. A total of 80% of the true variants have an allele frequency in the range. Red data points are the remaining false positives (276).

Figure 7. Specificity of four PoC studies analyzed with the VIP software. Specificity (%) is plotted for each PoC study separately and for the total of the four PoC studies. This plot clearly shows that Filters 1 and 2 result in the largest increase in specificity. The total specificity found is 92%.

Figure 8. Pseudogene amplification by MPS. A part of the sequence of BRCA1 exon 2 is shown in both panels. In BRCA1 exon variants present in 25–50% of the reads (black boxes in upper panel) were detected with MPS in each sample, but were not observed with Sanger sequencing (lower panel, black boxes represent the possible pseudogene nucleotides). Blasting these sequences showed coamplification of BRCA1P1 (90% sequence similarity with BRCA1 exon 2) by the reverse primer.

Data analysis with a commercially available software package: Nextgene (Softgenetics)

The performance of our in-house developed VIP software package was compared with the commercially available software Nextgene version 2.0 (Softgenetics). This software has intrinsic filters in terms of coverage and frequency of a particular variant. An overall variant score is calculated, which provides an empirical estimate of the likelihood that a given SNP is real and not an artifact of sequencing or alignment. This score is mainly based on the concept of Phred scores, where quality scores are logarithmically linked to error probabilities. For example, a quality score of 10 gives a chance of 1 out of 10 that the base is incorrectly called. Furthermore, subscores are integrated in this general score, taking into account the allele frequency, the forward and reverse balance, and a homopolymeric score, which penalizes indels found in homopolymeric regions. The maximum value for this overall mutation score is 30.
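The logarithmic link between Phred quality scores and error probabilities mentioned above follows the standard relation Q = -10 log10(P). The following Python sketch shows only this generic conversion; it does not reproduce how Nextgene combines its subscores into the overall mutation score, which is not described here. The function names are illustrative.

```python
import math


def phred_to_error_probability(q):
    """Probability that a base call with Phred score q is wrong."""
    return 10 ** (-q / 10)


def error_probability_to_phred(p):
    """Phred score corresponding to an error probability p (0 < p <= 1)."""
    if not 0 < p <= 1:
        raise ValueError("p must be in (0, 1]")
    return -10 * math.log10(p)


if __name__ == "__main__":
    for q in (10, 20, 30):
        print(q, phred_to_error_probability(q))
```

A score of 10 thus corresponds to the 1-in-10 error chance quoted in the text, and a score of 30 to 1 in 1,000.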
Filters and settings were defined based on the data obtained from the analysis of the positive controls, as we did for VIP. We obtained an overall specificity of 84% (533 false positives in 30 patients) when all variants present in at least 25% of the reads with at least 38-fold coverage were taken into account. This specificity can be substantially increased by applying an additional filter using the overall variant score.

All 93 indel variations and 40 substitutions were detected (100%) when no filter was set on the overall variant score. To improve specificity, we included an overall variant score >15 requirement (in accordance with the software developers' recommendations); this resulted in a specificity of 96%, but also in loss of detection of three variants present in homopolymeric regions (shown in Table 2). BRCA1 c.1010del was not detected with the VIP software either; the remaining two variants differ from those missed with VIP, but they are also present in homopolymeric regions >6. Hellemans et al. found that the vast majority of reads for homopolymers up to a length of 6 can be correctly base called. However, for homopolymers of seven nucleotides or longer, the number of correctly called reads decreases. Therefore, detection of mutations in homopolymer tracts of seven or more nucleotides is challenging in both programs evaluated.

Nonrandom sequencing errors

Currently, Sanger sequencing is considered the gold standard. Because MPS involves the sequencing of single clones, it can be more vulnerable to errors introduced by polymerases. We did not use proofreading polymerase enzymes (except for the emulsion PCR). Because we perform three PCRs (amplicon specific, MID, and emulsion PCR) in our workflow, nonspecific errors may occur. Performing our assays with a proofreading polymerase might reduce the number of false positives. Only 11% (32) of all false positives were single nucleotide substitutions; most of them can be explained by random sequencing or PCR artefacts and are only seen in a single direction. For at least one variant we have strong evidence of a nonrandom sequence error caused by the Taq polymerase: BRCA2 c G>T (exon 26) was detected in 36% (95% CI: 30–40%, range %) of all reads of all patients sequenced for this amplicon with Titanium Taq polymerase.
Sanger sequencing of exon 26 and splice sites amplified with Platinum Taq polymerase revealed only the G allele. Sanger sequencing of amplicons generated with Titanium Taq polymerase clearly showed the G>T substitution in all samples (Fig. 9). These data clearly point to a possible role of the polymerase in these nonrandom sequencing errors.

Figure 9. Sanger sequencing results for BRCA2 c G>T. In the upper panel a Sanger sequence flanking the G allele at position c (BRCA2) is shown for a sample amplified with Platinum Taq polymerase. In the lower panel, Sanger sequencing of the same sample is shown after amplification with Titanium Taq polymerase. In this sample we see a clear reduction of the G allele and its replacement by a T allele.

Evaluation of the specificity of the MID barcodes used

No other mutation detection technology allows pooling of several patients in a single lane. MPS made this possible, but the specificity of the MID barcodes used to distinguish individual patients is crucial. MID tags are designed in such a way that two sequencing errors may occur without the tag being read as another MID. We analyzed the data of an experiment containing only MIDs 1–5 and verified whether reads for MIDs 6–60 were generated. For MID10, approximately 900 reads were mapped (allowing two mismatches), compared to approximately 14,000 reads for MIDs included in the experiment. Because these reads are randomly scattered over all amplicons, the chance of generating false positives is minimal, but not zero for amplicons with low coverage. Therefore, MID10 should be excluded when two mismatches are allowed for mapping. MID10 is not present in the data when only a single mismatch is allowed. In that case only a small fraction of the reads (maximum 2,000, scattered over all amplicons and patients) is lost and the risk of false positives is strongly reduced. As this is of major importance for reliable analyses in diagnostic settings, this solution is preferred.
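The trade-off between allowing one or two mismatches when assigning reads to MIDs can be made concrete with a minimal Python sketch. It is not the assignment procedure of the GS-FLX software or of VIP; it simply counts mismatches (Hamming distance) between an observed tag and each expected tag. The tag names and sequences below are made up for illustration and are not Roche MID sequences.

```python
def hamming(a, b):
    """Number of mismatching positions between two equal-length strings."""
    if len(a) != len(b):
        raise ValueError("sequences must have equal length")
    return sum(x != y for x, y in zip(a, b))


def assign_mid(tag, mids, max_mismatches=1):
    """Return the name of the single MID within the mismatch limit, else None."""
    hits = [name for name, seq in mids.items()
            if hamming(tag, seq) <= max_mismatches]
    return hits[0] if len(hits) == 1 else None


# Hypothetical 10-nt barcodes, for illustration only.
EXAMPLE_MIDS = {"TAG_A": "AACCGGTTAC", "TAG_B": "GGTTAACCGT"}

if __name__ == "__main__":
    observed = "AACCGGTTGG"  # differs from TAG_A at two positions
    print(assign_mid(observed, EXAMPLE_MIDS, max_mismatches=2))  # TAG_A
    print(assign_mid(observed, EXAMPLE_MIDS, max_mismatches=1))  # None
```

With the stricter limit the doubtful read is discarded rather than assigned, which mirrors the choice made above: a small loss of reads in exchange for fewer misassignments.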
Evaluation of Multiexon Deletion Detection

We evaluated the capacity to detect multiexon deletions in both approaches. In PoC 2, a patient sample with a heterozygous deletion of BRCA2 exons 1–4 was included, and in PoC 4 a heterozygous deletion of BRCA1 exon was evaluated. By calculating the dosage quotient (DQ) [Goossens et al., 2009], we obtained values for the deleted exons that were not statistically different from those of the nondeleted exons. To clearly distinguish a deleted exon, DQ values of 0.5 are expected. Normalizing the read counts by the average coverage of the reference patients for these amplicons resulted in a DQ of 0.35 (BRCA2 exon 2), 1 (BRCA2 exon 3), and 0.72 (BRCA2 exon 4). For BRCA1 exons 18 and 19, we found a dosage quotient of 0.9 for both exons. Therefore, differences in coverage for the deleted exons were not significant in our experiments and we failed to detect this type of mutation.

Discussion

In this study, we evaluated whether massively parallel amplicon sequencing on the GS-FLX is suitable for implementation in a diagnostic setting. We used a multiplex bar-coded amplicon sequencing approach for BRCA1 and BRCA2 as an example. For these genes a thoroughly optimized and efficient mutation detection strategy was previously developed in our laboratory [De Leeneer et al., 2008, 2009]. Therefore, we critically evaluated the progress that could be made using amplicon sequencing on the GS-FLX. Different critical aspects such as uniformity of coverage, sensitivity, specificity, and reliability of primers and consumables were considered. Our experiences will be very useful for the optimization of assays for other genes.

The recommended approach for amplicon sequencing of multiple sequences is based on the use of fusion primers, which start with an A or B sequence on which the pyrosequencing reaction is initiated, followed by a patient-specific barcode (MID) and a target-specific sequence at the 3′ end. Although this fusion approach is very simple, practical problems in terms of primer management and setup arise when the complexity of the experiment (i.e., number of amplicons and number of patients) increases. Primer costs and workload can be strongly reduced by attaching sequencing adaptors and barcodes to the PCR product by ligation [Meyer et al., 2007] or nested patch PCR [Varley and Mitra, 2008]. In our study, a second PCR was used to attach the MIDs and sequencing adaptors to the amplicon. We preferred this approach because of its simplicity and its lower cost and workload compared to some other workflows.

Multiplexing PCR products allowed us to further reduce workload and consumable cost, and to save patient material. We have chosen to multiplex about 10 amplicons in a single set: optimization of such sets can be achieved with a minimum of extra effort and results in a 10-fold reduction of the initial workload. Higher degrees of multiplexing would require significantly more effort and would result in a relatively small additional increase in efficiency. Furthermore, it would hamper the evaluation of pools by capillary electrophoresis prior to sequencing, which turned out to be a cost-effective optimization method. Our results showed a good correlation between peak height (= relative fluorescence) on capillary electrophoresis within sets and the coverage obtained after 454 sequencing, indicating a limited influence of the emulsion PCR. We currently have no other explanation than experimental variation for the 4% of the PCR fragments that were less efficiently amplified by the emulsion PCR. A correlation between coverage and amplicon length, GC content, or other sequence-related characteristics could not be found.
Although a homopolymeric tract is present in two of these amplicons (BRCA and 11.19), this cannot explain the lower coverage, because the presence of homopolymer stretches did not influence the amplification efficiency of other homopolymer-rich fragments.

By analysis of 30 patient samples for the complete BRCA1/2 coding sequence with the VIP software, we obtained a list of 5,513 variants present in at least 10% of the reads, of which only 443 are true variants, leaving 5,070 false positives (specificity less than 10%). We are the first group to define filters to cope with MPS sequencing errors in a diagnostic setting. With this program we succeeded in obtaining an overall specificity of 92%. Filter 2 (AF >25%) is based on the minimal allele frequency found (27%) in the analysis of 93 distinct indel variants and should avoid random PCR errors. The majority of false positives were generated by sequencing of homopolymeric regions, a known complication of pyrosequencing, resulting in undercalls and overcalls in homopolymeric stretches (i.e., one nucleotide missing or one nucleotide added compared to the reference sequence). Filters 3 (Q >30) and 4 (Hp >6) were applied to reduce these numbers, because homopolymeric stretches influenced the Q value assigned to a specific read. These results were compared to those obtained with the commercially available software Nextgene version 2.0. Application of filters 1 and 2 (coverage >38× and AF >25%) resulted in a specificity of 84%. Using an overall mutation score filter of >15 increased the specificity to 96% but resulted in loss of detection of some mutations present in homopolymeric regions of more than six nucleotides.

Our results suggest that some Taq polymerases introduce nonrandom sequencing errors. For example, BRCA2 c G>T was found in almost all the patients analyzed for the relevant amplicon when amplified with Titanium Taq polymerase (ClonTech), but not with Platinum Taq polymerase (Invitrogen).
Because these errors will be reproducible in every run in almost all patients, they can be considered as true false positives and can be ignored in the long term. Furthermore, the specificity of the barcodes used to identify patients was evaluated to confirm that false positives could not have been generated by aligning reads to a wrong MID. Despite the fact that no patients with MID10 were included in one of our experiments, reads for MID10 were detected when allowing two mismatches for the MIDs. To avoid possible false positives for amplicons with lower coverage we excluded MID10 from further experiments, because 10 misaligned reads for a given amplicon are sufficient to generate a variant at a frequency of >25% if only 38-fold coverage is obtained. In future experiments, data will be mapped allowing only a single mismatch, because the fraction of reads lost in this way is minimal and misalignments to an incorrect MID will be avoided.

Homopolymeric stretches turned out to be major sources of false positive and false negative variants. For an efficient workflow a high specificity is required, and our study shows that the number of false positives is strongly reduced by the application of predefined filters. However, it was impossible to define adequate filters in both software programs evaluated without loss of detection of some variants in homopolymeric regions longer than six nucleotides. The coding region of BRCA1/2 contains seven homopolymer regions of seven nucleotides and three homopolymer regions of eight nucleotides, in 11 of our amplicons. Until homopolymer analysis improves, these 11 amplicons are analyzed with HRMCA in our setting, increasing detection capacity to 100% for all mutations evaluated [De Leeneer et al., 2008]. Sequencing technologies not based on pyrosequencing may outperform 454 sequencing for detection of variants in homopolymer regions.
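The four predefined VIP filters discussed above (coverage, allele frequency, quality in at least one direction, and homopolymer length) can be summarized in a small Python sketch. This is not the VIP implementation: the `VariantCall` record and its field names are hypothetical, and the boundary handling is an assumption, since the prose says "at least" 38-fold and 25% while the filter labels use a strict ">" sign. The sketch follows the prose.

```python
from dataclasses import dataclass


@dataclass
class VariantCall:
    """Hypothetical record for one candidate variant."""
    coverage: int              # reads covering the position
    allele_frequency: float    # fraction of reads with the variant (0..1)
    quality_forward: float     # quality score, forward reads
    quality_reverse: float     # quality score, reverse reads
    homopolymer_length: int    # length of the affected homopolymer (0 if none)


def passes_filters(v, min_cov=38, min_af=0.25, min_q=30, max_hp=6):
    """True if the call survives all four filters described in the text."""
    return (
        v.coverage >= min_cov                                      # Filter 1
        and v.allele_frequency >= min_af                           # Filter 2
        and max(v.quality_forward, v.quality_reverse) >= min_q     # Filter 3
        and v.homopolymer_length <= max_hp                         # Filter 4
    )


if __name__ == "__main__":
    call = VariantCall(120, 0.48, 35, 20, 3)
    print(passes_filters(call))
```

A call that passes is only a candidate; as stated in the text, it still requires confirmation with Sanger sequencing.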
Genetic testing for hereditary breast cancer in a diagnostic setting was evaluated earlier this year by deep sequencing on the GAIIx instrument (Illumina, San Diego, CA) by Morgan et al. [2010] (starting from long-range PCRs) and by Walsh et al. [2010] (DNA capture by hybridization in solution to custom-designed cRNA oligonucleotide baits). Both reported detection of all variants/mutations evaluated, but the number of variants evaluated was much smaller and insertions/deletions were at most 19 bp in length. The deletion c.5503_5564del62 evaluated in our study may have remained undetected with this sequencing technology, with read lengths of 2 × 76-bp paired-end reads [Walsh et al., 2010] or 51 bp [Morgan et al., 2010]. Walsh et al. [2010] did not report any false positives after filtering out variants present in less than 15% of the reads. With an average coverage of 1,286 (range ,854) a reduction in false positives is indeed expected; however, such high coverage largely increases the cost per sample.

We calculated consumable costs and labor time for our MPS approach, and found that our MPS setup, when pooling 74 patients in a single run, costs about 345 EUR per sample (consumable cost: 232 EUR), which is cost competitive with our HRMCA approach and much lower than Sanger sequencing. The current development of commercial user-friendly software for data analysis allows MPS to outperform prescreening techniques used in combination with Sanger sequencing. Automation of the workflow will decrease the workload even further. Turnaround times for genetic testing need to be strongly reduced to meet increasing expectations. Pooling large numbers of patients in a single run will only be useful in a diagnostic setting if different disorders can be pooled in a single run. This requires the development of uniform workflows for different genetic tests.
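The pool sizes quoted in this article (27 patients per standard run, 74 with Titanium chemistry) follow from the calculation given in the Results. The Python sketch below retraces that arithmetic. The binomial model in `detection_probability` is an assumption added for illustration; it is not necessarily the calculation of Hellemans et al., although under this model 38-fold coverage does give a detection probability of about 99.9% for a heterozygous variant at the 25% cutoff. Function and parameter names are hypothetical.

```python
import math


def detection_probability(coverage, min_fraction=0.25, allele_fraction=0.5):
    """P(variant seen in >= min_fraction of reads), simple binomial model."""
    needed = math.ceil(min_fraction * coverage)
    return sum(
        math.comb(coverage, k)
        * allele_fraction ** k
        * (1 - allele_fraction) ** (coverage - k)
        for k in range(needed, coverage + 1)
    )


def patients_per_run(run_reads, mapped_fraction=0.95, amplicons=111,
                     min_coverage=38, spread_factor=3.16, safety_margin=0.05):
    """Number of patients that fit in one run, following the text's steps."""
    average_coverage = round(min_coverage * spread_factor)   # 38 x 3.16 -> 120
    reads_per_patient = amplicons * average_coverage * (1 + safety_margin)
    usable_reads = run_reads * mapped_fraction
    return math.floor(usable_reads / reads_per_patient)


if __name__ == "__main__":
    print(round(detection_probability(38), 4))
    print(patients_per_run(400_000), patients_per_run(1_100_000))
```

The sketch makes the lever explicit: halving the spread factor would roughly double the number of patients per run, which is why the multiplex optimization matters for cost.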
In some populations, large intragenic founder deletions represent an important fraction of the BRCA1/2 mutation spectrum. Our MPS setup allowed the detection of point mutations with high sensitivity but turned out to be unreliable for the identification of large exon (or multiexon) deletions. The application of three consecutive PCR rounds prior to sequencing most likely explains why deviations from diploidy remained undetected. Additionally, we aimed for an average coverage of 120-fold (to obtain minimally 38-fold), to allow pooling a large number of samples in a single lane. Studies successfully reporting the detection of large intragenic rearrangements worked with much higher read depths, allowing more reliable quantification of copy numbers [Goossens et al., 2009; Walsh et al., 2010]. For a sensitive analysis, an additional technique for the detection of copy number variations needs to be included in the mutation detection strategy.

In conclusion, we developed an efficient workflow for high-throughput BRCA1/2 amplicon sequencing. Sensitivity and specificity of MPS amplicon sequencing are high and can be further increased by supplementing MPS with complementary assays to overcome issues related to homopolymeric regions. In terms of throughput, diagnostic testing can be highly accelerated, and MPS facilitates offering genetic analyses to more at-risk patients. In terms of cost efficiency MPS outperforms all other mutation screening techniques, but there is a shift from wet lab work toward data analysis. In our opinion, Sanger sequencing should still be used for confirmation of deleterious variants in a diagnostic context.

Acknowledgments

This project was realized with the funding of an Emmanuel van der Schueren scholarship of the Flemish foundation against cancer to Kim De Leeneer. Bruce Poppe is a senior clinical investigator from FWO. This study was supported by a StepStone grant from the Industrial Research Fund from Ghent University; the Roche 454 GS-FLX instrument is part of the NXTGNT infrastructure, funded by a Hercules grant (middle heavy infrastructure).
NXTGNT (initiated and supervised by Sofie Bekaert, Jo Vandesompele, Dieter Deforce, Philippe van Nieuwerburgh, Wim Van Criekinge, Jan Hellemans, Paul Coucke) is a genome analysis platform from Ghent University.

References

Albert TJ, Molla MN, Muzny DM, Nazareth L, Wheeler D, Song X, Richmond TA, Middle CM, Rodesch MJ, Packard CJ, Weinstock GM, Gibbs RA. Direct selection of human genomic loci by microarray hybridization. Nat Methods 4:
Curtin NJ. PARP inhibitors for cancer therapy. Expert Rev Mol Med 7:1–20.
Dahl F, Stenberg J, Fredriksson S, Welch K, Zhang M, Nilsson M, Bicknell D, Bodmer WF, Davis RW, Ji H. Multigene amplification and massively parallel sequencing for cancer mutation discovery. Proc Natl Acad Sci USA 104:
De Leeneer K, Coene I, Poppe B, De Paepe A, Claes K. Rapid and sensitive detection of BRCA1/2 mutations in a diagnostic setting: comparison of two high-resolution melting platforms. Clin Chem 54:
De Leeneer K, Coene I, Poppe B, De Paepe A, Claes K. Genotyping of frequent BRCA1/2 SNPs with unlabeled probes: a supplement to HRMCA mutation scanning, allowing the strong reduction of sequencing burden. J Mol Diagn 11:
De Schrijver JM, De Leeneer K, Lefever S, Sabbe N, Pattyn F, Van Nieuwerburgh F, Coucke P, Deforce D, Vandesompele J, Bekaert S, Hellemans J, Van Criekinge W. Analysing 454 amplicon resequencing experiments using the modular and database oriented Variant Identification Pipeline. BMC Bioinformatics 11:269.
Gerhardus A, Schleberger H, Schlegelberger B, Gadzicki D. Diagnostic accuracy of methods for the detection of BRCA1 and BRCA2 mutations: a systematic review. Eur J Hum Genet 15:
Goossens D, Moens LN, Nelis E, Lenaerts AS, Glassee W, Kalbe A, Frey B, Kopal G, De Jonghe P, De Rijk P, Del-Favero J. Simultaneous mutation and copy number variation (CNV) detection by multiplex PCR-based GS-FLX sequencing. Hum Mutat 30:
Huse SM, Huber JA, Morrison HG, Sogin ML, Welch DM. Accuracy and quality of massively parallel DNA pyrosequencing. Genome Biol 8:R143.
Korbel JO, Urban AE, Affourtit JP, Godwin B, Grubert F, Simons JF, Kim PM, Palejev D, Carriero NJ, Du L, Taillon BE, Chen Z, Tanzer A, Saunders AC, Chi J, Yang F, Carter NP, Hurles ME, Weissman SM, Harkins TT, Gerstein MB, Egholm M, Snyder M. Paired-end mapping reveals extensive structural variation in the human genome. Science 318:
Korshunova Y, Maloney RK, Lakey N, Citek RW, Bacher B, Budiman A, Ordway JM, McCombie WR, Leon J, Jeddeloh JA, McPherson JD. Massively parallel bisulphite pyrosequencing reveals the molecular complexity of breast cancer-associated cytosine-methylation patterns obtained from tissue and serum DNA. Genome Res 18:
Liu W, Smith DI, Rechtzigel KJ, Thibodeau SN, James CD. Denaturing high performance liquid chromatography (DHPLC) used in the detection of germline and somatic mutations. Nucleic Acids Res 26:
Meyer M, Stenzel U, Myles S, Prufer K, Hofreiter M. Targeted high-throughput sequencing of tagged nucleic acid samples. Nucleic Acids Res 35:e97.
Morgan JE, Carr IM, Sheridan E, Chu CE, Hayward B, Camm N, Lindsay HA, Mattocks CJ, Markham AF, Bonthron DT, Taylor GR. Genetic diagnosis of familial breast cancer using clonal sequencing. Hum Mutat 31:
Pearson BM, Gaskin DJ, Segers RP, Wells JM, Nuijten PJ, van Vliet AH. The complete genome sequence of Campylobacter jejuni strain (NCTC11828). J Bacteriol 189:
Pettersson E, Zajac P, Stahl PL, Jacobsson JA, Fredriksson R, Marcus C, Schioth HB, Lundeberg J, Ahmadian A. Allelotyping by massively parallel pyrosequencing of SNP-carrying trinucleotide threads. Hum Mutat 29:
Pol A, Heijmans K, Harhangi HR, Tedesco D, Jetten MS, Op den Camp HJ. Methanotrophy below pH 1 by a new Verrucomicrobia species. Nature 450:
Ruby JG, Jan C, Player C, Axtell MJ, Lee W, Nusbaum C, Ge H, Bartel DP. Large-scale sequencing reveals 21U-RNAs and additional microRNAs and endogenous siRNAs in C. elegans. Cell 127:
Sanger F, Nicklen S, Coulson AR. DNA sequencing with chain-terminating inhibitors. Proc Natl Acad Sci USA 74:
Taylor KH, Kramer RS, Davis JW, Guo J, Duff DJ, Xu D, Caldwell CW, Shi H. Ultradeep bisulfite sequencing analysis of DNA methylation patterns in multiple gene promoters by 454 sequencing. Cancer Res 67:
Thomas RK, Nickerson E, Simons JF, Janne PA, Tengs T, Yuza Y, Garraway LA, LaFramboise T, Lee JC, Shah K, O'Neill K, Sasaki H, Lindeman N, Wong KK, Borras AM, Gutmann EJ, Dragnev KH, DeBiasi R, Chen TH, Glatt KA, Greulich H, Desany B, Lubeski CK, Brockman W, Alvarez P, Hutchison SK, Leamon JH, Ronan MT, Turenchalk GS, Egholm M, Sellers WR, Rothberg JM, Meyerson M. Sensitive mutation detection in heterogeneous cancer specimens by massively parallel picoliter reactor sequencing. Nat Med 12:
van der Hout AH, van den Ouweland AM, van der Luijt RB, Gille HJ, Bodmer D, Bruggenwirth H, Mulder IM, van der Vlies P, Elfferich P, Huisman MT, ten Berge AM, Kromosoeto J, Jansen RP, van Zon PH, Vriesman T, Arts N, Lange MB, Oosterwijk JC, Meijers-Heijboer H, Ausems MG, Hoogerbrugge N, Verhoef S, Halley DJ, Vos YJ, Hogervorst F, Ligtenberg M, Hofstra RM. A DGGE system for comprehensive mutation screening of BRCA1 and BRCA2: application in a Dutch cancer clinic setting. Hum Mutat 27:
Varley KE, Mitra RD. Nested Patch PCR enables highly multiplexed mutation discovery in candidate genes.
Genome Res 18: Velasco R, Zharkikh A, Troggio M, Cartwright DA, Cestaro A, Pruss D, Pindo M, Fitzgerald LM, Vezzulli S, Reid J, Malacarne G, Iliev D, Coppola G, Wardell B, Micheletti D, Macalma T, Facci M, Mitchell JT, Perazzolli M, Eldredge G, Gatto P, Oyzerski R, Moretto M, Gutin N, Stefanini M, Chen Y, Segala C, Davenport C, Demattè L, Mraz A, Battilana J, Stormo K, Costa F, Tao Q, Si-Ammour A, Harkins T, Lackey A, Perbost C, Taillon B, Stella A, Solovyev V, Fawcett JA, Sterck L, Vandepoele K, Grando SM, Toppo S, Moser C, Lanchbury J, Bogden R, Skolnick M, Sgaramella V, Bhatnagar SK, Fontana P, Gutin A, Van de Peer Y, Salamini F, Viola R A high quality draft consensus sequence of the genome of a heterozygous grapevine variety. PLoS One 2:e1326. Walsh T, Lee MK, Casadei S, Thornton AM, Stray SM, Pennil C, Nord AS, Mandell JB, Swisher EM, King MC Detection of inherited mutations for breast and ovarian cancer using genomic capture and massively parallel sequencing. Proc Natl Acad Sci USA 107: Yao Y, Guo G, Ni Z, Sunkar R, Du J, Zhu JK, Sun Q Cloning and characterization of micrornas from wheat (Triticum aestivum L.). Genome Biol 8:R HUMAN MUTATION, Vol. 32, No. 0, 1 10,



Mutation prevalence and spectrum of BRCA1&2 and RAD51C in selected patient populations

Chapter 3
MUTATION PREVALENCE AND SPECTRUM OF BRCA1&2 AND RAD51C IN SELECTED PATIENT POPULATIONS

PAPER 5
Prevalence of BRCA1/2 mutations in sporadic breast/ovarian cancer patients and identification of a novel de novo BRCA1 mutation in a patient diagnosed with late onset breast and ovarian cancer: implications for genetic testing.
De Leeneer K, Coene I, Crombez B, Simkens J, Van den Broecke R, Bols A, Stragier B, Vanhoutte I, De Paepe A, Poppe B and Claes K
Breast Cancer Res and Treat. (in press)

PAPER
Evaluation of RAD51C as a new breast cancer susceptibility gene in the Belgian/Dutch population.
De Leeneer K(*), Van Bockstael M(*), Swietek N, Van den Ende J, Storm Katrien, Willocx S, Blaumeiser Bettina, Leunen Karin, Van Asperen C.J., Wijnen J.T, Legius Eric, Michils Geneviève, Matthijs Gert, Blok M.J, Gomez-Garcia E.B, De Paepe A, Poppe B, and Claes K
(*) Both authors have contributed equally to this work
Breast Cancer Research (Letter, in preparation)

Breast Cancer Res Treat
DOI /s
PRECLINICAL STUDY

Prevalence of BRCA1/2 mutations in sporadic breast/ovarian cancer patients and identification of a novel de novo BRCA1 mutation in a patient diagnosed with late onset breast and ovarian cancer: implications for genetic testing

Kim De Leeneer, Ilse Coene, Brecht Crombez, Justine Simkens, Rudy Van den Broecke, Alain Bols, Barbara Stragier, Ilse Vanhoutte, Anne De Paepe, Bruce Poppe, Kathleen Claes

Received: 2 February 2011 / Accepted: 20 April 2011
© Springer Science+Business Media, LLC

K. De Leeneer, I. Coene, B. Crombez, J. Simkens, A. De Paepe, B. Poppe, K. Claes (corresponding author)
Center for Medical Genetics, Ghent University Hospital, De Pintelaan 185, 9000 Gent, Belgium
e-mail: Kathleen.Claes@UGent.be

R. Van den Broecke
Breast clinic, Department of Gynecology, Ghent University Hospital, Ghent, Belgium

B. Stragier
Breast clinic, Department of Medical Oncology, Heilig Hart Ziekenhuis, Roeselare, Belgium

I. Vanhoutte
Breast clinic, Department of Radiotherapy and Oncology, AZ Sint-Lucas, Gent, Belgium

A. Bols
Breast clinic, Department of Medical Oncology, AZ Sint-Jan, Brugge, Belgium

Abstract
In order to adequately evaluate the clinical relevance of genetic testing in sporadic breast and ovarian cancer patients, we offered comprehensive BRCA1/2 mutation analysis to patients without a family history of the disease. We evaluated the complete coding and splice site regions of BRCA1/2 in 193 sporadic patients. In addition, a de novo mutation was further investigated with ultra deep sequencing and microsatellite marker analysis. In 17 patients (8.8%), a deleterious germline BRCA1/2 mutation was identified. The highest mutation detection ratio (3/7 = 42.9%) was obtained in sporadic patients diagnosed with breast and ovarian cancer after the age of 40. In 21 bilateral breast cancer patients, two mutations were identified (9.5%). Furthermore, 140 sporadic patients with unilateral breast cancer were investigated. Mutations were only identified in patients diagnosed with breast cancer before the age of 40 (12/128 = 9.4% vs. 0/12 with Dx > 40). No mutations were detected in 17 sporadic male breast cancer and 6 ovarian cancer patients. BRCA1 c.3494_3495delTT was identified in a patient diagnosed with breast and ovarian cancer at the age of 52 and 53, respectively, and was proven to have occurred de novo on the paternal allele. Our study shows that the mutation detection probability in specific patient subsets can be significant; therefore, mutation analysis should be considered in sporadic patients. As a consequence, a family history of the disease and an early age of onset should not be used as the only criteria for mutation analysis of BRCA1/2. The relatively high mutation detection ratio suggests that the prevalence of BRCA1/2 mutations may be underestimated, especially in sporadic patients who developed breast and ovarian cancer. In addition, although rare, the possibility of a de novo occurrence in a sporadic patient should be considered.

Keywords: Breast and ovarian cancer; Sporadic cancer; De novo mutation; BRCA1/2

Introduction

Germline BRCA1 (MIM ) [1] and BRCA2 (MIM ) [2, 3] mutations confer high risks for breast and ovarian cancer and are most prevalent in patients with a family history of the disease. The incidence of mutations in high-risk families varies widely among different populations; some present a wide spectrum of different mutations, while in particular ethnic groups specific mutations show a high frequency due to a founder effect. For example, approximately 2.5% of all people of Ashkenazi Jewish descent carry one of three ancient mutations in BRCA1 or BRCA2 (c.68_69delAG

(185delAG) and c.5266dupC (5382insC) in BRCA1 and c.5946delT (6174delT) in BRCA2) [4]. Fewer reports and data on the prevalence of BRCA1 and BRCA2 germline mutations in sporadic breast and ovarian cancer patients are available, since inclusion criteria for BRCA mutation testing are most often based on the family history. A negative family history can be the result of a small family size, predominance of males in the family, or incomplete penetrance [5]. For sporadic patients, genetic testing is often limited to patients with early age at onset, bilaterality, or tumor characteristics like multifocality, multicentricity, and triple-negative breast tumors, as these features increase the probability of detecting a germline BRCA1 or BRCA2 mutation.

Despite the high number of BRCA1/2 mutations identified, de novo BRCA mutations are rare. To the best of our knowledge, only two de novo mutations in BRCA1 [6, 7] and five in BRCA2 [8–12] have been reported to date. All these mutations were identified in patients with tumors occurring before the age of 40. In this study, we present our data on the prevalence of BRCA1&2 mutations in 193 sporadic Belgian breast and/or ovarian cancer patients. Furthermore, we present the identification of a de novo BRCA1 mutation in a patient diagnosed with breast cancer and ovarian cancer in her fifties.

Materials and methods

Patient population

The patients evaluated in this study are a subgroup of all Belgian patients referred to our center for diagnostic testing for BRCA1 and BRCA2. All tested individuals provided signed informed consent after appropriate genetic counseling. Personal and family histories of all patients were recorded, including ages at diagnosis of all cancers. We investigated 193 breast and/or ovarian cancer patients without any family history of the disease. The majority of the study population (128 patients) are unilateral breast cancer patients who were selected because of an early age of onset (<40 years).
Additionally, we analyzed 12 unilateral breast cancer patients with a later age at onset (Dx > 40). Furthermore, we investigated the BRCA1/2 coding region in 21 bilateral breast cancer patients and six ovarian cancer patients with an age at diagnosis <50. Male breast cancer patients (n = 17) and patients who developed breast in combination with ovarian cancer (n = 9) were selected without limitations based on the age of onset.

Mutation detection in BRCA1/2

Genomic DNA was isolated from peripheral blood mononuclear cells of all patients. Because these patients were selected retrospectively, the complete coding sequence and flanking splice site regions of BRCA1/2 were evaluated with a combination of techniques: Protein Truncation Test (PTT) [13], Denaturing Gradient Gel Electrophoresis (DGGE) [14], High Resolution Melting Curve Analysis (HRMCA) [15, 16], or Sanger sequencing. In addition, large intragenic rearrangements in BRCA1 and BRCA2 were evaluated with Multiplex Ligation-dependent Probe Amplification (MLPA).

Paternity testing

We identified a BRCA1 mutation, c.3494_3495delTT (p.Phe1165fs), in one of the patients, which was absent in both parents. Paternity was verified using the PowerPlex 16 assay (Promega). The PowerPlex® 16 System allows the co-amplification and three-color detection of sixteen loci (fifteen STR loci and Amelogenin). The system contains the loci Penta E, D18S51, D21S11, TH01, D3S1358, FGA, TPOX, D8S1179, vWA, Amelogenin, Penta D, CSF1PO, D16S539, D7S820, D13S317, and D5S818. One primer specific for Penta E, D18S51, D21S11, TH01, and D3S1358 is labeled with fluorescein (FL); one primer specific for FGA, TPOX, D8S1179, vWA, and Amelogenin is labeled with carboxy-tetramethylrhodamine (TMR); and one primer specific for Penta D, CSF1PO, D16S539, D7S820, D13S317, and D5S818 is labeled with 6-carboxy-4′,5′-dichloro-2′,7′-dimethoxy-fluorescein (JOE).
All 16 loci are amplified simultaneously in a single tube and analyzed on an ABI3730 (Applied Biosystems). Data are analyzed with the Genemapper v1 software. The power of exclusion of this system exceeds in all populations tested.

Microsatellite marker analysis

To determine whether a possible de novo mutation (BRCA1 c.3494_3495delTT) originated on the maternal or paternal allele, the region encompassing the BRCA1 gene was investigated with microsatellite markers in the parents, sibs, and children of the proband. Twelve microsatellite markers within or flanking BRCA1 were selected, spanning a 3 cM region. A schematic overview of the markers used and their location is shown in Fig. 1. Primer sequences for amplification of these markers were obtained from the Genome database. The forward primer was labeled with a FAM label and the amplification products were separated by capillary electrophoresis on an ABI3730 (Applied Biosystems). Plots were analyzed with the Genemapper software.

Fig. 1 Schematic overview of the location of the microsatellite markers used. Thirteen microsatellite markers within or flanking BRCA1 were selected, spanning a 3.4 Mb region

Deep sequencing

Mosaicism in the parents of the patient with the possible de novo mutation (BRCA1 c.3494_3495delTT (p.Phe1165fs)) was further investigated with ultra deep sequencing. The relevant amplicon was sequenced on a GS-FLX instrument (454-Roche). Data analysis was performed with the commercially available software Nextgene v2. (Softgenetics) [17].

Statistical analysis

Comparison of mutation prevalence between groups was performed with the chi-square test. For the statistical analysis of the mutation prevalence in smaller subgroups (n < 40), Fisher's exact test was used.

Results and discussion

We investigated the complete coding region of BRCA1&2 in 193 breast and/or ovarian cancer patients without evidence of a family history of the disease. These patients were selected because they presented with early-onset, bilateral, or multifocal tumors. In 17 patients (8.8%), a deleterious germline BRCA1/2 mutation was identified (an overview is shown in Table 1).

In three patients, previously unreported mutations were identified: BRCA1 c.3494_3495delTT, BRCA2 c.6948_6949insTT, and BRCA1 c.2507_2508delAA. These were, therefore, considered as potentially de novo. For two of them, parental DNA was available to verify this hypothesis. For the patient with the BRCA2 c.6948_6949insTT mutation, Sanger sequencing showed that she inherited the mutation from her father. An autosomal dominant inheritance pattern for this mutation is masked by the predominance of males in this branch of the pedigree (the father has only male siblings).

BRCA1 c.3494_3495delTT was identified in a patient diagnosed with breast and ovarian cancer at the age of 52 and 53, respectively. This mutation was detected with high resolution melting curve analysis in our routine diagnostic screening [16]. Sanger sequencing revealed a truncating BRCA1 mutation, c.3494_3495delTT (p.Phe1165fs), which, to the best of our knowledge, has not been described before (Fig. 2). The mutation was not detectable by Sanger sequencing in DNA extracted from blood lymphocytes of both parents. Since the lower limit to detect mosaicism with Sanger sequencing is around 15% [18], we investigated possible mosaicism in lymphocytes with ultra deep sequencing on the GS-FLX instrument (454/Roche). The relevant amplicon was deep sequenced for the patient, maternal, and paternal samples. Ultra deep sequencing allows the identification of low-grade mosaic variants [19]. The mutation was detected in 43% of the reads in the patient at 157-fold coverage, but was absent in the maternal and paternal samples, which were covered 1,840- and 2,780-fold, respectively, consistent with a de novo occurrence. A variant present in 43% of all reads is consistent with heterozygosity, as heterozygous mutations can be present in 25–60% of the reads [17]. The coverage of 1,840 obtained for the maternal sample allows detection of mosaic mutations present in 5% of the blood lymphocytes with a probability of ≥99%; the higher coverage of the paternal sample (2,780×) results in a lower detection limit of 2% mosaicism with a 99% probability. Deeper coverage would be needed to detect variants with a frequency lower than 2% with a 99% probability.

To determine whether the mutation originated on the maternal or paternal allele, the region encompassing the BRCA1 gene was investigated with 12 microsatellite markers in the parents, sibs, and children of the proband. Fragment analysis revealed that the mutation originated on the paternal allele of the proband. Five of her sibs inherited the same paternal allele but do not carry the mutation, providing additional evidence for the de novo occurrence of this mutation (data are shown in Fig. 3)
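The relationship between coverage and the detection limit for mosaicism described above can be illustrated with a simple binomial model. The following Python sketch is not the calculation used in this study. It assumes that reads are sampled independently, that a mutation present in a fraction m of the lymphocytes is heterozygous in those cells (so the expected variant read fraction is m/2), and that a variant is called once a hypothetical minimum number of variant reads (`MIN_VARIANT_READS`) is observed. The function names and the threshold value are illustrative assumptions.

```python
import math


def binomial_pmf(n: int, k: int, p: float) -> float:
    """P(X = k) for X ~ Binomial(n, p), evaluated in log space so large n is safe."""
    if p <= 0.0:
        return 1.0 if k == 0 else 0.0
    if p >= 1.0:
        return 1.0 if k == n else 0.0
    log_pmf = (
        math.lgamma(n + 1) - math.lgamma(k + 1) - math.lgamma(n - k + 1)
        + k * math.log(p) + (n - k) * math.log1p(-p)
    )
    return math.exp(log_pmf)


def detection_probability(coverage: int, read_fraction: float, min_variant_reads: int) -> float:
    """Probability of seeing at least `min_variant_reads` variant reads among `coverage` reads."""
    missed = sum(binomial_pmf(coverage, k, read_fraction) for k in range(min_variant_reads))
    return 1.0 - missed


# Hypothetical calling threshold; the study does not state one in this section.
MIN_VARIANT_READS = 10

# (coverage, fraction of mosaic cells) pairs taken from the text above.
for coverage, mosaic_cells in [(1840, 0.05), (2780, 0.02)]:
    read_fraction = mosaic_cells / 2  # assumption: heterozygous within the mosaic cells
    prob = detection_probability(coverage, read_fraction, MIN_VARIANT_READS)
    print(f"coverage {coverage}x, {mosaic_cells:.0%} mosaic cells: P(detect) = {prob:.4f}")
```

Under this model the detection probability rises with coverage and falls as the calling threshold is raised. The threshold that would correspond to the limits quoted above depends on the sequencing error rate, which this sketch does not model.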

Table 1 Overview of the different BRCA1 and BRCA2 mutations detected in the selected population

Diagnosis | Mutation, HGVS nomenclature (BIC nomenclature) | Protein change | Remarks | Inheritance
Breast cancer | BRCA1 c.2197_2201del5 (2316del5) | BRCA1 p.Glu733fs | Belgian founder mutation | Parental DNA not available (a)
Breast cancer | BRCA1 c.2380dupG (2478insG) | BRCA1 p.Glu787fs | Belgian founder mutation | Paternal inheritance
Breast cancer | BRCA1 c.2507_2508delAA (delAA) | BRCA1 p.Glu836fs | Previously not reported in literature | Parental DNA not available
Breast cancer | BRCA1 c.2603C>G (2722C>G) | BRCA1 p.Ser868X | Previously reported in British population | Paternal inheritance
Breast cancer | BRCA1 c.3607C>T (3726C>T) | BRCA1 p.Arg1203X | Belgian founder mutation | Parental DNA not available (a)
Breast cancer | BRCA1 c.5075-?_5194+?del | - | In the Italian population a founder deletion spanning exons is reported (b); as we did not determine the breakpoints in our patient, it is not clear if this is the same mutation | Paternal inheritance
Breast cancer | BRCA1 c.212+3A>G (IVS5+3A>G) | Non coding | Belgian founder mutation | Paternal inheritance
Breast cancer | BRCA1 c.212+3A>G (IVS5+3A>G) | Non coding | Belgian founder mutation | Maternal inheritance
Breast cancer | BRCA1 c.212+3A>G (IVS5+3A>G) | Non coding | Belgian founder mutation | Maternal inheritance
Breast cancer | BRCA2 c.4936_4939del4 (5164del4) | BRCA2 p.Glu1646fs | Previously reported in several populations | Parental DNA not available
Breast cancer | BRCA2 c.6275_6277delTT (6503delTT) | BRCA2 p.Leu292fs | Belgian founder mutation | Maternal inheritance
Breast cancer | BRCA2 c.6275_6277delTT (6503delTT) | BRCA2 p.Leu292fs | Belgian founder mutation | Parental DNA not available
Breast and ovarian cancer | BRCA1 c.3494_3495delTT (delTT) | BRCA1 p.Phe1165fs | Previously not reported in literature | Novel de novo mutation
Breast and ovarian cancer | BRCA1 c.3661G>T (3780G>T) | BRCA1 p.Glu1221X | Belgian founder mutation | Parental DNA not available
Breast and ovarian cancer | BRCA2 c.469_470delAA (697delAA) | BRCA2 p.Lys157fs | Previously reported in WECARE study (c) | Parental DNA not available
Bilateral breast cancer | BRCA2 c.6948insTT (7176insTT) | BRCA2 p.Phe23556fs | Previously not reported in literature | Paternal inheritance
Bilateral breast cancer | BRCA1 c.134+3A>C (IVS3+3A>C) | Non coding | Belgian founder mutation | Parental DNA not available

(a) Mutation was detected in one of the patient's siblings; hence a de novo event can be ruled out
(b) See ref [39]
(c) See ref [40]

Our data suggest a de novo event in a testicular germ cell, although a zygotic origin cannot be excluded. This is in agreement with previous findings for de novo mutations in BRCA1 and BRCA2 in which the paternal origin could be ascertained [8, 12]. Of note, the age of the patient's father was only 29 years at the time of birth of his

daughter. It has been suggested that the origin of new mutations may be influenced by both genomic imprinting effects and the increased number of cell divisions in spermatogenesis compared with oogenesis. A mainly paternal origin of mutations is seen in other hereditary syndromes as well; for instance, de novo mutations in FGFR3, FGFR2, and RET have been analyzed and were all shown to be of paternal origin [20–22]. Not all disorders show such an extraordinary paternal bias; some single gene traits show only a moderate paternal bias in the male:female mutation ratio and a small paternal age effect. An example is neurofibromatosis type 1. The NF1 gene has one of the highest mutation rates. Many of the mutations are intragenic deletions, which are more common in larger genes. Whereas base substitutions occur primarily in males and are age-dependent, small chromosomal changes (intragenic deletions) are not age-dependent because they occur by different mechanisms [23].

Fig. 2 Difference plot and Sanger sequencing of BRCA1 c.3494_3495delTT (p.Phe1165fs). Aberrant melting curves by high resolution melting curve analysis for one of the amplicons in exon 11 were detected in our routine diagnostic screening. A frequent polymorphism, BRCA1 c.3548A>G, was identified in most of the samples. Sanger sequencing revealed a truncating BRCA1 mutation c.3494_3495delTT (p.Phe1165fs) in the sample

This is the first de novo BRCA1 mutation identified in a patient with onset of the disease after the age of 50. In all patients previously reported with de novo mutations, breast tumors occurred at a young age (<40 years), which may reflect recruitment bias, since genetic testing in sporadic cases is most often limited to patients with early onset

breast or ovarian cancer. To our knowledge, only two other de novo BRCA1 mutations have been reported to date. The prevalence of de novo BRCA1 and BRCA2 mutations is still unknown and making a reliable estimate is challenging. Taking into account that recurrent BRCA1/2 mutations might also arise as de novo alterations, the prevalence might be higher than generally accepted [12]. As we could not evaluate parental DNA for all mutations detected in our sporadic patient population, the prevalence we observed might be an underestimation.

Fig. 3 Microsatellite marker investigation of the de novo mutation BRCA1 c.3494_3495delTT. To determine whether the mutation originated on the maternal or paternal allele, the region encompassing the BRCA1 gene was investigated with microsatellite markers in the parents, sibs, and children of the proband. Fragment analysis revealed that the mutation originated on the paternal allele of the proband. Five of her sibs inherited the same paternal allele but do not carry the mutation, providing additional evidence of the de novo occurrence of this mutation

The majority of the mutations identified in this study (10/17) are well-known founder mutations in our population. Although the Belgian population has a mixed ethnicity, we do see a considerable number of founder mutations in our population [24, 25]. In our overall patient population, 58 distinct mutations were detected; 17 of those represent approximately 75% of the 206 BRCA1/2 mutations identified to date in our center. Ten of the 17 mutations detected in this study are part of this top 17. Three mutations (BRCA1 p.Ser868X, BRCA2 c.4936_4939del4, and BRCA2 c.469_470delAA) are less prevalent in the Belgian population, but are reported in literature as founder mutations in other populations (Table 1).

We detected BRCA1/2 mutations in 189 of 1,170 (17%) unrelated index patients with a family history of breast/ovarian cancer referred to our center for genetic testing. This is statistically significantly higher than in our sporadic group (17/193 = 8.8%) (P < 0.01), demonstrating the relevance of inclusion criteria based on a family history. However, some subgroups of sporadic patients investigated in this study revealed a much higher frequency (see below and Table 2), indicating the importance of genetic testing for these patients.

Mutation analysis in 140 patients with unilateral breast cancer led to the identification of 12 germline mutations (8.6%). However, all 12 mutations were detected in the subgroup of 128 patients diagnosed before the age of 40 (9.4%), of which three were found in the group of 55 patients diagnosed with unilateral breast cancer between the age of 35 and 40. Therefore, we have no evidence for a contribution of inherited BRCA1/2 mutations in unilateral breast cancer patients diagnosed in their forties; however, this observation

is based on the analysis of only 12 patients. In the subgroup of 30 patients with at least two BRCA1/2 associated tumors (bilateral breast cancer or breast and ovarian cancer), we identified five mutations (16.7%). Interestingly, mutations were more prevalent in the patients diagnosed with their first tumor after the age of 40 (4/16 = 25%) compared to the group diagnosed before 40 (1/14 = 7.1%); however, the difference is not statistically significant (P = 0.36). Especially in patients with both breast and ovarian cancer (n = 9), a high mutation detection ratio was obtained (3/9 = 33%), even with a diagnosis at a more advanced age. Therefore, testing these patients is mandatory, independent of the age of onset. An overview of the prevalence of BRCA1/2 mutations in the different subgroups of our study population is given in Table 2.

Table 2 Overview of all patients investigated and mutations detected

Patient group | Mean age at diagnosis (range) | # Mutations detected | Mean age at diagnosis mutation carriers (range)
193 Sporadic breast cancer patients | | |
17 Male breast cancer patients | 56 years (25–72) | 0% |
176 Female breast and/or ovarian cancer patients | 36 years (24–68) | 8.80% |
Dx < 40 | | |
128 Unilateral breast cancer patients | 33 years (24–40) | 9% (9 in BRCA1 and 3 in BRCA2) | 33 years (27–38)
5 Ovarian cancer patients | 29 years (25–35) | 0% |
12 Bilateral breast cancer patients | Dx1 = 35 years/Dx2 = 39 years (Dx1: 29–39/Dx2: 30–50) | 8% (1 in BRCA1) | Dx1: 36 years/Dx2: 45 years
2 Patients with breast and ovarian cancer | Dx1 = 39 years/Dx2 = 46 years (Dx1: 38–40/Dx2: 43–49) | 0% |
Dx > 40 | | |
12 Unilateral breast cancer patients | 44 years (41–49) | 0% |
1 Ovarian cancer patient | 49 years | 0% |
9 Bilateral breast cancer patients | Dx1 = 46 years/Dx2 = 49 years (Dx1: 42–49/Dx2: 45–53) | 11% (1 in BRCA2) | Dx1: 45 years/Dx2: 45 years
7 Patients with breast and ovarian cancer | Dx1 = 55 years/Dx2 = 60 years (Dx1: 43–68/Dx2: 47–72) | 30% (2 in BRCA1 and 1 in BRCA2) | Dx1: 49 years/Dx2: 55 years (Dx1: 43–53/Dx2: 53–57)
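The group comparisons reported above can be sketched with the two tests named in the statistical analysis section. The Python example below is a minimal illustration using only the standard library, not the software used in this study. It assumes a Pearson chi-square test without continuity correction for the large groups and a two-sided Fisher's exact test that sums all tables no more probable than the observed one; the paper does not state which conventions were applied, so the resulting P values need not match the reported ones exactly. The function names are illustrative.

```python
import math


def chi_square_2x2(a: int, b: int, c: int, d: int) -> tuple[float, float]:
    """Pearson chi-square (no continuity correction) for the table [[a, b], [c, d]].

    Returns (statistic, p_value) with 1 degree of freedom.
    """
    n = a + b + c + d
    stat = n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))
    # For 1 degree of freedom, P(chi2 > x) equals erfc(sqrt(x / 2)).
    return stat, math.erfc(math.sqrt(stat / 2.0))


def fisher_exact_two_sided(a: int, b: int, c: int, d: int) -> float:
    """Two-sided Fisher's exact test for [[a, b], [c, d]].

    Sums the hypergeometric probabilities of all tables with the same margins
    that are no more probable than the observed table.
    """
    row1, row2, col1 = a + b, c + d, a + c
    n = row1 + row2

    def prob(x: int) -> float:
        return math.comb(row1, x) * math.comb(row2, col1 - x) / math.comb(n, col1)

    observed = prob(a)
    lowest, highest = max(0, col1 - row2), min(row1, col1)
    return sum(
        prob(x) for x in range(lowest, highest + 1)
        if prob(x) <= observed * (1 + 1e-9)  # tolerance for floating-point ties
    )


# Familial (189 carriers of 1,170) versus sporadic (17 carriers of 193) patients.
stat, p_chi = chi_square_2x2(189, 1170 - 189, 17, 193 - 17)
print(f"chi-square = {stat:.2f}, P = {p_chi:.4f}")

# Two-tumor patients: first diagnosis after 40 (4 of 16) versus before 40 (1 of 14).
p_fisher = fisher_exact_two_sided(4, 16 - 4, 1, 14 - 1)
print(f"Fisher's exact P = {p_fisher:.3f}")
```

With these conventions, the first comparison yields a P value below 0.01 and the second a non-significant P value, in line with the conclusions drawn in the text.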
The average age to develop breast cancer caused by a germline mutation is calculated to be 42 years for BRCA1 and 49 years for BRCA2 [26]. In BRCA1 mutation carriers, ovarian cancer rarely occurs before the age of 40 years, while 50 years seems to be a critical age for ovarian cancer in BRCA2 mutation carriers [26, 27]. In only two of our sporadic patients diagnosed with breast cancer before the age of 40 was a BRCA2 mutation identified (2× BRCA2 c.6280_6281delTT); in all other cases mutations in BRCA1 were detected. This is in agreement with the study of Peto et al. [28], who concluded that BRCA1 mutations may be more penetrant at young ages compared to BRCA2 mutations. The average age at diagnosis for breast cancer patients in whom a BRCA1 mutation was identified in our study population was 35.8 years [range: 27–53] compared to 38.4 years [range: 32–45] for BRCA2 mutations.

Young age at primary breast cancer diagnosis is associated with an increased susceptibility to bilateral breast cancer, arguably due to the remaining life span and the resulting interval during which the patient remains at risk for a second primary tumor. In total, we investigated 21 patients diagnosed with bilateral breast cancer. All of these cases were diagnosed with their first breast cancer before the age of 50, but a pathogenic BRCA mutation was identified in only two patients.

We investigated the complete BRCA1/2 coding sequence in six sporadic ovarian cancer patients. Five of them were diagnosed with ovarian cancer at a young age (range years) but no BRCA1 or BRCA2 mutations were detected. The estimated prevalence of BRCA1 mutations in familial ovarian cancer patients is 5% [29]. Although genetic testing was offered to these five patients because of their very young age of onset, other genetic factors may have contributed to the tumor development.

We identified no BRCA1/2 mutations in 17 sporadic male breast cancer patients (mean age at diagnosis: 52, range 25–72). As male breast cancer is rare, genetic testing is offered to these patients since it is believed to result from genetic susceptibility. Population and clinic based studies estimate the prevalence of BRCA2 mutations in males from 4% up to 40% depending on the ethnic background of the

104 Chapter 3 studied population [30 34]. One extreme example is the Icelandic population where 40% of all male breast cancers diagnosed carry the BRCA2 999del5 (BRCA2 c.771_ 775del) mutation. This is mainly due to a founder effect and is very unlikely to be replicated in other populations. The absence of BRCA2 mutations in our male sporadic breast cancer patients might be due to small sample size since we detected a BRCA2 mutation in 14% of all males with a family history for the disease referred to our center for genetic testing. A BRCA1/2 mutation occurrence of *10% is in agreement with previously reported findings in young (age at diagnosis \45 years) sporadic breast cancer patients in different populations [28, 35, 36]. However, prevalence is population dependent; for instance, the prevalence of BRCA1/2 mutations in British sporadic patients is lower (\4%) [37]. The ratio of BRCA1 to BRCA2 mutations varies widely between subpopulations globally. In this study, the prevalence of BRCA1 mutations is higher compared to BRCA2 mutations: 70% (12/17) of the mutations detected in the sporadic patients are present in BRCA1, this is higher than the prevalence of BRCA1 mutations (115/189 = 61%) in patients with a family history of breast/ovarian cancer, however, the difference is not statistically significant (P \ 0.2). This increased prevalence of BRCA1 mutations can be explained by the higher penetrance of BRCA1 mutations in patients mainly selected on early age of onset, but also to founder effects, since the two most prevalent mutations (c.2380dupg and c.212?3a[g) in our population are both BRCA1 mutations. Several tools to predict the likelihood for a patient to carry a mutation have been developed, unfortunately for most of them risk estimation is based on a family history for the disease. 
Important and widely used examples are the Claus, Tyrer-Cuzick, and BRCAPRO models, but these cannot be reliably used for patients without any family history of breast or ovarian cancer. Our data and other studies demonstrate that these women may have a significant risk of carrying a BRCA1 or BRCA2 mutation. Suggestions to adapt or expand the inclusion criteria for sporadic breast cancer patients have already been made [38]. Kwon et al. [38] suggest offering genetic testing to all women with triple-negative breast cancers diagnosed before the age of 50. Our study shows that a cut-off based on the age of onset cannot be used as the sole selection criterion for genetic testing of sporadic breast/ovarian cancer patients.

Conclusion

Selection of breast cancer patients based on family history will not allow detection of all BRCA1/2 carriers who may benefit from preventive interventions. However, formulating a general, cost-effective set of guidelines to identify all sporadic BRCA1/2 mutation carriers is challenging. Our study shows that BRCA1/2 mutation detection in sporadic patients is worthwhile, even when the diagnosis is made at a later age, especially in patients with two tumors. The identification of a de novo BRCA1 mutation in a sporadic patient diagnosed with breast and ovarian cancer in her fifties suggests that the prevalence of BRCA1/2 de novo mutations may currently be underestimated.

Acknowledgments

This project was realized with the funding of an Emmanuel van der Schueren scholarship of the Flemish foundation against cancer to Kim De Leeneer. This research was supported by a grant from the Fund for Scientific Research Flanders (FWO) to Kathleen Claes and by GOA grant BOF10/GOA/019 (Ghent University). Bruce Poppe is a senior clinical investigator of the FWO.

Conflict of interest

None.

Breast Cancer Res Treat

References

1. Miki Y et al (1994) A strong candidate for the breast and ovarian cancer susceptibility gene BRCA1. Science 266(5182)
2. Wooster R et al (1995) Identification of the breast cancer susceptibility gene BRCA2. Nature 378(6559)
3. Wooster R et al (1994) Localization of a breast cancer susceptibility gene, BRCA2, to chromosome 13q. Science 265(5181)
4. Kauff ND et al (2002) Incidence of non-founder BRCA1 and BRCA2 mutations in high risk Ashkenazi breast and ovarian cancer families. J Med Genet 39(8)
5. Gayther SA et al (1997) Variation of risks of breast and ovarian cancer associated with different germline mutations of the BRCA2 gene. Nat Genet 15(1)
6. Tesoriero A et al (1999) De novo BRCA1 mutation in a patient with breast cancer and an inherited BRCA2 mutation. Am J Hum Genet 65(2)
7. Edwards E et al (2009) Identification of a de novo BRCA1 mutation in a woman with early onset bilateral breast cancer. Fam Cancer 8(4)
8. Diez O et al (2010) A novel de novo BRCA2 mutation of paternal origin identified in a Spanish woman with early onset bilateral breast cancer. Breast Cancer Res Treat 121(1)
9. Hansen TV et al (2008) Novel de novo BRCA2 mutation in a patient with a family history of breast cancer. BMC Med Genet 9
10. Marshall M, Solomon S, Lawrence Wickerham D (2009) Case report: de novo BRCA2 gene mutation in a 35-year-old woman with breast cancer. Clin Genet 76(5)
11. Robson M et al (2002) Unique de novo mutation of BRCA2 in a woman with early onset breast cancer. J Med Genet 39(2)
12. van der Luijt RB et al (2001) De novo recurrent germline mutation of the BRCA2 gene in a patient with early onset breast cancer. J Med Genet 38(2)
13. Claes K et al (1999) Mutation analysis of the BRCA1 and BRCA2 genes results in the identification of novel and recurrent mutations in 6/16 Flemish families with breast and/or ovarian cancer but not in 12 sporadic patients with early-onset disease. Mutations in brief no Online. Hum Mutat 13(3)
14. van der Hout AH et al (2006) A DGGE system for comprehensive mutation screening of BRCA1 and BRCA2: application in a Dutch cancer clinic setting. Hum Mutat 27(7)
15. De Leeneer K et al (2009) Genotyping of frequent BRCA1/2 SNPs with unlabeled probes: a supplement to HRMCA mutation scanning, allowing the strong reduction of sequencing burden. J Mol Diagn 11(5)
16. De Leeneer K et al (2008) Rapid and sensitive detection of BRCA1/2 mutations in a diagnostic setting: comparison of two high-resolution melting platforms. Clin Chem 54(6)
17. De Leeneer K et al (2011) Massive parallel amplicon sequencing of the breast cancer genes BRCA1 and BRCA2: opportunities, challenges, and limitations. Hum Mutat 32(3)
18. Rohlin A et al (2009) Parallel sequencing used in detection of mosaic mutations: comparison with four diagnostic DNA screening techniques. Hum Mutat 30(6)
19. Thomas RK et al (2006) Sensitive mutation detection in heterogeneous cancer specimens by massively parallel picoliter reactor sequencing. Nat Med 12(7)
20. Carlson KM et al (1994) Parent-of-origin effects in multiple endocrine neoplasia type 2B. Am J Hum Genet 55(6)
21. Schuffenecker I et al (1997) Prevalence and parental origin of de novo RET mutations in multiple endocrine neoplasia type 2A and familial medullary thyroid carcinoma, Le Groupe d'Etude des Tumeurs a Calcitonine. Am J Hum Genet 60(1)
22. Glaser RL et al (2000) Paternal origin of FGFR2 mutations in sporadic cases of Crouzon syndrome and Pfeiffer syndrome. Am J Hum Genet 66(3)
23. Lazaro C et al (1996) Sex differences in mutational rate and mutational mechanism in the NF1 gene in neurofibromatosis type 1 patients. Hum Genet 98(6)
24. Claes K et al (1999) Mutation analysis of the BRCA1 and BRCA2 genes in the Belgian patient population and identification of a Belgian founder mutation BRCA1 IVS5+3A>G. Dis Markers 15(1-3)
25. Claes K et al (2004) BRCA1 and BRCA2 germline mutation spectrum and frequencies in Belgian breast/ovarian cancer families. Br J Cancer 90(6)
26. King MC, Marks JH, Mandell JB (2003) Breast and ovarian cancer risks due to inherited mutations in BRCA1 and BRCA2. Science 302(5645)
27. Meijers-Heijboer EJ et al (2000) Presymptomatic DNA testing and prophylactic surgery in families with a BRCA1 or BRCA2 mutation. Lancet 355(9220)
28. Peto J et al (1999) Prevalence of BRCA1 and BRCA2 gene mutations in patients with early-onset breast cancer. J Natl Cancer Inst 91(11)
29. Ford M, Murdoch J (1997) The management of ovarian cancer. Br J Hosp Med 58(11)
30. Couch FJ et al (1996) BRCA2 germline mutations in male breast cancer cases and breast cancer families. Nat Genet 13(1)
31. Haraldsson K et al (1998) BRCA2 germ-line mutations are frequent in male breast cancer patients without a family history of the disease. Cancer Res 58(7)
32. Johannesdottir G et al (1996) High prevalence of the 999del5 mutation in Icelandic breast and ovarian cancer patients. Cancer Res 56(16)
33. Mavraki E et al (1997) Germline BRCA2 mutations in men with breast cancer. Br J Cancer 76(11)
34. Syrjakoski K et al (2004) BRCA2 mutations in 154 Finnish male breast cancer patients. Neoplasia 6(5)
35. Charef-Hamza S et al (2005) Loss of heterozygosity at the BRCA1 locus in Tunisian women with sporadic breast cancer. Cancer Lett 224(2)
36. Uhrhammer N et al (2008) BRCA1 mutations in Algerian breast cancer patients: high frequency in young, sporadic cases. Int J Med Sci 5(4)
37. Ellis D et al (2000) Low prevalence of germline BRCA1 mutations in early onset breast cancer without a family history. J Med Genet 37(10)
38. Kwon JS et al (2010) Expanding the criteria for BRCA mutation testing in breast cancer survivors. J Clin Oncol 28(27)
39. Ottini L et al (2000) BRCA1 and BRCA2 mutations in central and southern Italian patients. Breast Cancer Res 2(4)
40. Borg A (2010) Characterization of BRCA1 and BRCA2 deleterious mutations and variants of unknown clinical significance in unilateral and bilateral breast cancer: the WECARE study. Hum Mutat 31(3):E


LETTER

Evaluation of RAD51C as a new breast cancer susceptibility gene in the Belgian/Dutch population.

De Leeneer K.*(1), Van Bockstal M.*(1), Swietek N.(1), Van den Ende J.(2), Willocx S.(2), Storm K.(2), Blaumeiser B.(2), Leunen K.(3), Van Asperen C.J.(4), Wijnen J.T.(4), Legius E.(5), Michils G.(5), Matthijs G.(5), Blok M.J.(6), Gomez-Garcia E.B.(6), De Paepe A.(1), Poppe B.(1), and Claes K.(**)(1)

* equal contribution

(1) Center for Medical Genetics, Ghent University Hospital, Ghent, Belgium
(2) Center for Medical Genetics, University Hospital of Antwerp, Antwerp, Belgium
(3) Department of Gynaecological Oncology, Catholic University of Leuven, Leuven, Belgium
(4) Department of Clinical Genetics, Leiden University Medical Center, Leiden, Netherlands
(5) Department of Human Genetics, Catholic University of Leuven, Leuven, Belgium
(6) Department of Clinical Genetics, Maastricht University Medical, Maastricht, Netherlands

(**) Corresponding author: Kathleen Claes, Ph.D. E-mail address: Kathleen.Claes@UGent.be

We read with great interest the recently published letters by Akbari et al. [1] and Silvestri et al. [2]. These studies report the absence of RAD51C germline mutations in 454 BRCA1/2-negative patients with familial breast (and ovarian) cancer and in 97 male breast cancer patients, respectively. An initial study on unrelated German women with gynaecologic malignancies [3] identified six monoallelic pathogenic mutations in RAD51C. Strikingly, all six deleterious mutations were found exclusively within 480 BRCA1/2-negative breast and ovarian cancer families and not in breast-cancer-only families.

We evaluated the prevalence of RAD51C germline mutations in 482 Belgian/Dutch breast and ovarian cancer families previously found to be negative for BRCA1/2 mutations. In 99% (478/482) of the families, at least one individual with breast cancer and one individual with ovarian cancer were present. Five affected males were included because of their extensive family history of breast and ovarian cancer. The complete coding region, 5' UTR, 3' UTR and splice sites of RAD51C were analyzed with high-resolution melting curve analysis, followed by Sanger sequencing of samples with aberrant melting curves (primers available on request).

No unequivocal deleterious RAD51C mutation was identified in our study population. In total, our mutation analysis revealed 11 unique sequence variations, of which 3 are novel. The most interesting is a novel 3' UTR variant, c.*131 A>G, identified in an ovarian cancer patient diagnosed at the age of 60 with a positive family history of breast and colon cancer. In silico miRNA seed binding predictions (PITA Scan) revealed the creation of a miR-126 binding site and the disruption of binding sites for miR-218, miR-122, miR-142-3, miR-27a/b and miR-139-5p. Since RAD51C functions as a tumor suppressor gene, the creation of a new binding site for miR-126 will be evaluated in detail. miR-126 target reporter assays are currently ongoing: human cells are transfected with a luciferase vector containing the RAD51C 3' UTR and with miR-126 to evaluate whether this variant alters miRNA-mediated RAD51C expression. If the miRNA binds the new site, a decrease in luciferase expression is expected.

Two other novel variants (RAD51C c.-36 A>G (5' UTR) and c G>T (intronic)) were detected in our study population; neither is predicted in silico to affect splicing (Alamut, Interactive Biosoftware). In addition, several recurrent SNPs were identified: c.-26 C>T (rs ) in the 5' UTR (allele frequency (AF): 134/964 = ~14%). Other detected polymorphic variants reported in dbSNP 132 were c.-54 C>G (rs ) (AF = 96/964 (~10%)), c T>C (rs ) (AF = ~5%), c.376G>A (p.Ala126Thr) (rs ) (AF = ~1%), c.859A>G (p.Thr287Ala) (rs ) (AF = ~0.6%) and c.*25C>G (rs ) (AF = ~0.3%).

Furthermore, we identified two missense variants. RAD51C c.506 T>C (p.Val169Ala) was identified twice, once in a breast-cancer-only family and once in a family in which both breast and ovarian cancer were present. RAD51C c.790 G>A (p.Gly264Ser) was identified in 4 index patients from breast and ovarian cancer families. Both variants were previously reported by Meindl et al. [3], and their functional studies suggest that neither variant is clinically important.

This is the fourth study (after [1, 2, 4]) unable to confirm a major role for RAD51C germline mutations in breast and ovarian cancer families. Since the investigated cohort is of similar size and geographically close to the patients investigated in the initial report, the prevalence of RAD51C mutations may be lower than initially expected. Germline mutations in RAD51C might still be the cause of hereditary breast/ovarian cancer in other populations. Therefore, further investigations in populations of various ethnic origins are required before RAD51C can be excluded as a major breast and ovarian cancer susceptibility gene.

References

1. Akbari, M.R., et al., RAD51C germline mutations in breast and ovarian cancer patients. Breast cancer research : BCR, (4).
2. Silvestri, R.P., et al., Mutation screening of RAD51C in male breast cancer patients. Breast cancer research, (404).
3. Meindl, A., et al., Germline mutations in breast and ovarian cancer pedigrees establish RAD51C as a human cancer susceptibility gene. Nature genetics, (5).
4. Zheng, Y., et al., Screening RAD51C nucleotide alterations in patients with a family history of breast and ovarian cancer. Breast cancer research and treatment, (3).


Chapter 4

GENERAL DISCUSSION AND FUTURE PERSPECTIVES

Discussion
Technical aspects of molecular diagnostics for inherited breast cancer
Mutation spectrum and prevalences in selected breast/ovarian cancer population
Perspectives
General Conclusion
References

Discussion

Hereditary breast and ovarian cancer is one of the best-studied familial cancer syndromes worldwide. Approximately one in ten women will develop breast cancer during her lifetime. Although the majority of breast cancers are sporadic in origin, an appreciable fraction is caused by inherited predisposition. Numerous factors affect the likelihood of developing breast and ovarian cancer, but currently no other predictor is as powerful as an inherited mutation in BRCA1 or BRCA2. Due to the complexity of the genes and their broad, population-specific mutation spectra, genetic testing remains challenging.

Technical aspects of molecular diagnostics for inherited breast cancer

The first goal of our studies was to develop a cost-efficient, high-throughput BRCA1/2 mutation detection strategy that reduces turnaround times. At the start of this thesis, BRCA1/2 mutation detection in the CMGG was performed by direct Sanger sequencing of the large exon 11 of both genes and DGGE for all other coding exons. As BRCA1/2 have no mutation hot spots, the new mutation detection strategy needed to cover the complete coding and splice site regions of both genes. At that time, the principles of the HRMCA technique had just been published, but adequate validation of the technology for application in a diagnostic context was lacking. In a first stage we evaluated HRMCA as a mutation scanning technique on two available platforms, i.e. the LightScanner (Idaho Technology) and the LightCycler 480 (Roche). We designed 112 PCR amplicons, all amplifiable with a single PCR program, covering the complete coding region of BRCA1 and BRCA2. We performed a thorough validation of the sensitivity and specificity on both instruments and obtained slightly better specificity on the LightScanner instrument.
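As a rough check on the throughput this design implies, the sketch below uses the figures reported for this set-up (112 amplicons per patient, 11 patients per run, about 13 min of melting per 96-well plate). It assumes one reaction per amplicon per patient and no wells reserved for controls or replicates, which the text does not specify; the function name is hypothetical.

```python
import math

AMPLICONS_PER_PATIENT = 112    # from the text
WELLS_PER_PLATE = 96           # from the text
MELT_MINUTES_PER_PLATE = 13    # approximate figure from the text

def melt_workload(patients,
                  amplicons=AMPLICONS_PER_PATIENT,
                  wells=WELLS_PER_PLATE,
                  minutes_per_plate=MELT_MINUTES_PER_PLATE):
    """Return (reactions, plates, hours of melting) for a batch of patients.

    Assumes one well per amplicon per patient and no control wells.
    """
    reactions = patients * amplicons
    plates = math.ceil(reactions / wells)
    hours = plates * minutes_per_plate / 60
    return reactions, plates, hours

reactions, plates, hours = melt_workload(patients=11)
print(f"{reactions} reactions on {plates} plates, about {hours:.1f} h of melting")
```

Under these assumptions, 11 patients fill 13 plates and need just under 3 hours of melting, which agrees with the reported figure of about 3 hours.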
By introducing this method, our turnaround time for the BRCA genes could be reduced considerably (to one third of that of the approach based on direct sequencing of the large exons 11 and DGGE for all other exons). In our setup, the post-melt analysis for 11 patients requires only 3 hours (approximately 13 min per 96-well plate), followed by sequencing of the detected aberrations. Owing to the relatively low cost of the consumables and the lower workload compared to other mutation scanning techniques, this is a very cost-efficient technology.

However, we found some pitfalls in the application of HRMCA technology. From our observations it became clear that the software is not always able to discriminate between distinct variants within the same amplicon. This can be explained by variants generating a similar difference in melting temperature. Our findings were confirmed by Tindall et al. [1], who showed that only 83% of different heterozygotes were distinguishable from single heterozygotes. This illustrates that, although high-resolution melting analysis is a very useful prescreening technique, its main purpose is to separate wild-type samples from aberrant samples; all detected aberrations still need to be sequenced, especially in a diagnostic setting.

A second pitfall is the detection of homozygous variants. The melting temperature (Tm) differences between genotypes increase as the amplicon size decreases, allowing better differentiation. For 4% of homozygous SNPs present in the human genome, the generated Tm difference is too small to allow reliable detection with HRMCA, even when using very short amplicons (<50 bp). Typically, these SNPs are flanked on each side by complementary bases [2]. For example, if a C/G SNP is flanked by an A and a T on the same strand, nearest-neighbor symmetry occurs and only a small difference in Tm is generated. These SNPs can be detected by mixing samples with wild-type fragments. However, for autosomal dominantly inherited mutations (as in BRCA1/2) this is not an issue. Mixing might be the method of choice when applying high-resolution melting analysis for mutational analysis of genes associated with autosomal recessive diseases or with X-linked diseases in males. Another possibility to allow detection of these variants is to design genotyping assays with unlabeled probes.

Our study led to the successful implementation of HRMCA as mutation detection technology in a diagnostic setting for BRCA1/2 and other genes investigated in the CMGG.
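The nearest-neighbor symmetry described above (a C/G SNP flanked by an A and a T) can be illustrated with a small sketch. It relies on a simplified nearest-neighbor picture in which duplex stability is determined only by the set of dinucleotide stacks, and a stack is equivalent to its reverse complement; terminal and salt corrections are ignored. The function names are hypothetical and this is not the software used in the study.

```python
from collections import Counter

COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}

def reverse_complement(seq):
    """Reverse complement of a DNA sequence written 5'->3'."""
    return "".join(COMPLEMENT[base] for base in reversed(seq))

def canonical_stack(dinucleotide):
    """A stack and its reverse complement describe the same duplex step."""
    return min(dinucleotide, reverse_complement(dinucleotide))

def stack_profile(seq):
    """Multiset of nearest-neighbor stacks along a duplex."""
    return Counter(canonical_stack(seq[i:i + 2]) for i in range(len(seq) - 1))

def is_nn_symmetric(flank5, allele1, allele2, flank3):
    """True if both homozygous genotypes share the same stack profile,
    so that the simplified model predicts no Tm difference between them."""
    return (stack_profile(flank5 + allele1 + flank3)
            == stack_profile(flank5 + allele2 + flank3))

print(is_nn_symmetric("A", "C", "G", "T"))  # example from the text: A[C/G]T
print(is_nn_symmetric("A", "C", "G", "A"))  # flanks not complementary
print(is_nn_symmetric("A", "C", "T", "T"))  # C/T SNP changes the base-pair type
```

In the A[C/G]T case both alleles contain the same two stacks (AC is equivalent to GT, and CT to AG), which is why the homozygous genotypes melt almost identically. Mixing with wild-type DNA creates heteroduplexes, which do alter the melting curve.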
HRMCA is a prescreening technique, and because both BRCA1 and BRCA2 are highly polymorphic, the sequencing burden after mutation scanning remained high. For unequivocal detection of the 14 most recurrent SNPs in our study population, we designed unlabeled probe assays to avoid Sanger sequencing of recurrent benign polymorphisms. A schematic overview of the principle of this approach is shown in Figure 1. We demonstrated accurate genotyping of these SNPs, although the length of our amplicons exceeded the 200 bp recommended by Liew et al. [3].
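To make the read-out of such an assay concrete, the sketch below simulates a melting curve with a probe transition and an amplicon transition, computes the negative derivative (-dF/dT), and picks its peaks. All Tm values, amplitudes and widths are hypothetical and chosen only for illustration; real instruments apply their own normalization and smoothing, and this is not the vendor software's algorithm.

```python
import math

def simulate_melt(temps, transitions):
    """Fluorescence as a sum of sigmoidal transitions (tm, amplitude, width)."""
    return [sum(a / (1 + math.exp((t - tm) / w)) for tm, a, w in transitions)
            for t in temps]

def negative_derivative(temps, fluor):
    """Central-difference estimate of -dF/dT at the interior points."""
    return [(temps[i], -(fluor[i + 1] - fluor[i - 1]) / (temps[i + 1] - temps[i - 1]))
            for i in range(1, len(temps) - 1)]

def find_peaks(curve, min_height):
    """Temperatures of local maxima of the derivative curve above min_height."""
    peaks = []
    for i in range(1, len(curve) - 1):
        t, y = curve[i]
        if y >= min_height and y > curve[i - 1][1] and y >= curve[i + 1][1]:
            peaks.append(round(t, 1))
    return peaks

temps = [60 + i / 10 for i in range(301)]  # 60.0 to 90.0 degrees C in 0.1 steps

# Hypothetical samples: probe duplex melts near 66 C when perfectly matched
# and near 62 C when it spans a mismatch; the amplicon melts near 84 C.
homozygous_match = [(66.0, 0.30, 0.5), (84.0, 1.0, 0.6)]
heterozygous = [(62.0, 0.15, 0.5), (66.0, 0.15, 0.5), (84.0, 1.0, 0.6)]

for name, sample in [("homozygous match", homozygous_match),
                     ("heterozygous", heterozygous)]:
    curve = negative_derivative(temps, simulate_melt(temps, sample))
    print(name, find_peaks(curve, min_height=0.05))
```

In this toy model the heterozygous sample shows two probe peaks next to the amplicon peak, while the homozygous sample shows one; the probe peak positions carry the genotype and the amplicon peak remains available for variant scanning.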

Figure 1: Schematic overview of the principle of HRMCA genotyping with unlabeled probes. Overview of an asymmetric PCR in the presence of an unlabeled probe covering the SNP of interest. The excess primer (pink) is complementary to strand A (gray) and produces excess copies of strand B (pink). The limiting primer (gray) is complementary to strand B and produces limited copies of strand A. Elongation of the unlabeled probe (complementary to the pink strand) is blocked by the C3 spacer during the PCR, resulting in an excess of probe-amplicon duplexes compared to the regular amplicon duplexes. The melting transition of these probe duplexes is visible on the same melting profile as the amplicon melting, enabling simultaneous variant scanning and genotyping. Both melting transitions appear as peaks on a derivative plot.

An alternative approach for accurate genotyping of SNPs, presented by Liew et al. [2], is to design small amplicons (<50 bp). Unlabeled probe genotyping was preferred over designing small amplicons because it allowed us to work with the same primer sets for mutation scanning and SNP genotyping, leading to an efficient workflow. By introducing HRMCA in our laboratory, we were able to reduce our turnaround time threefold. With limited effort, accurate genotyping of 14 recurrent SNPs reduced the sequencing burden about threefold, thereby further decreasing the costs for

The Genetics of Early- Onset Breast Cancer. Cecelia Bellcross, Ph.D., M.S.,C.G.C. Department of Human Genetics Emory University School of Medicine

The Genetics of Early- Onset Breast Cancer. Cecelia Bellcross, Ph.D., M.S.,C.G.C. Department of Human Genetics Emory University School of Medicine The Genetics of Early- Onset Breast Cancer Cecelia Bellcross, Ph.D., M.S.,C.G.C. Department of Human Genetics Emory University School of Medicine All cancers are genetic BUT Not all cancers are hereditary

More information

Hereditary Ovarian cancer: BRCA1 and BRCA2. Karen H. Lu MD September 22, 2013

Hereditary Ovarian cancer: BRCA1 and BRCA2. Karen H. Lu MD September 22, 2013 Hereditary Ovarian cancer: BRCA1 and BRCA2 Karen H. Lu MD September 22, 2013 Outline Hereditary Breast and Ovarian Cancer (HBOC) BRCA1/2 genes How to identify What it means to you What it means to your

More information

Name of Policy: Genetic Testing for Hereditary Breast and/or Ovarian Cancer

Name of Policy: Genetic Testing for Hereditary Breast and/or Ovarian Cancer Name of Policy: Genetic Testing for Hereditary Breast and/or Ovarian Cancer Policy #: 513 Latest Review Date: January 2014 Category: Laboratory Policy Grade: B Background/Definitions: As a general rule,

More information

Genetic Testing for Hereditary Breast and Ovarian Cancer - BRCA1/2 ANALYSIS -

Genetic Testing for Hereditary Breast and Ovarian Cancer - BRCA1/2 ANALYSIS - Genetic Testing for Hereditary Breast and Ovarian Cancer - BRCA1/2 ANALYSIS - January 2005 SCIENTIFIC BACKGROUND Breast cancer is considered to be one of the most prevalent cancer in women. The overall

More information

Contents. molecular biology techniques. - Mutations in Factor II. - Mutations in MTHFR gene. - Breast cencer genes. - p53 and breast cancer

Contents. molecular biology techniques. - Mutations in Factor II. - Mutations in MTHFR gene. - Breast cencer genes. - p53 and breast cancer Contents Introduction: biology and medicine, two separated compartments What we need to know: - boring basics in DNA/RNA structure and overview of particular aspects of molecular biology techniques - How

More information

Breast cancer and the role of low penetrance alleles: a focus on ATM gene

Breast cancer and the role of low penetrance alleles: a focus on ATM gene Modena 18-19 novembre 2010 Breast cancer and the role of low penetrance alleles: a focus on ATM gene Dr. Laura La Paglia Breast Cancer genetic Other BC susceptibility genes TP53 PTEN STK11 CHEK2 BRCA1

More information

What is Cancer? Cancer is a genetic disease: Cancer typically involves a change in gene expression/function:

What is Cancer? Cancer is a genetic disease: Cancer typically involves a change in gene expression/function: Cancer is a genetic disease: Inherited cancer Sporadic cancer What is Cancer? Cancer typically involves a change in gene expression/function: Qualitative change Quantitative change Any cancer causing genetic

More information

Genetics and Breast Cancer. Elly Lynch, Senior Genetic Counsellor Manager, Austin Health Clinical Genetics Service

Genetics and Breast Cancer. Elly Lynch, Senior Genetic Counsellor Manager, Austin Health Clinical Genetics Service Genetics and Breast Cancer Elly Lynch, Senior Genetic Counsellor Manager, Austin Health Clinical Genetics Service Overview Background/Our Team What is the difference between sporadic/familial cancer? How

More information

Office of Population Health Genomics

Office of Population Health Genomics Office of Population Health Genomics Policy: Protocol for the management of female BRCA mutation carriers in Western Australia Purpose: Best Practice guidelines for the management of female BRCA mutation

More information

Ovarian Cancer Genetic Testing: Why, When, How?

Ovarian Cancer Genetic Testing: Why, When, How? Ovarian Cancer Genetic Testing: Why, When, How? Jeffrey Dungan, MD Associate Professor Division of Clinical Genetics Department of Obstetrics & Gynecology Northwestern University Feinberg School of Medicine

More information

patient education Fact Sheet PFS007: BRCA1 and BRCA2 Mutations MARCH 2015

patient education Fact Sheet PFS007: BRCA1 and BRCA2 Mutations MARCH 2015 patient education Fact Sheet PFS007: BRCA1 and BRCA2 Mutations MARCH 2015 BRCA1 and BRCA2 Mutations Cancer is a complex disease thought to be caused by several different factors. A few types of cancer

More information

PROVIDER POLICIES & PROCEDURES

PROVIDER POLICIES & PROCEDURES PROVIDER POLICIES & PROCEDURES BRCA GENETIC TESTING The purpose of this document is to provide guidance to providers enrolled in the Connecticut Medical Assistance Program (CMAP) on the requirements for

More information

Number 12.04.516 Effective Date August 11, 2015 Revision Date(s) Replaces 2.04.133 (not adopted)

Number 12.04.516 Effective Date August 11, 2015 Revision Date(s) Replaces 2.04.133 (not adopted) MEDICAL POLICY POLICY RELATED POLICIES POLICY GUIDELINES DESCRIPTION SCOPE BENEFIT APPLICATION RATIONALE REFERENCES CODING APPENDI HISTORY Genetic Testing for CHEK2 Mutations for Breast Cancer Number 12.04.516

More information

MUTATION, DNA REPAIR AND CANCER

MUTATION, DNA REPAIR AND CANCER MUTATION, DNA REPAIR AND CANCER 1 Mutation A heritable change in the genetic material Essential to the continuity of life Source of variation for natural selection New mutations are more likely to be harmful

More information

Common Cancers & Hereditary Syndromes

Common Cancers & Hereditary Syndromes Common Cancers & Hereditary Syndromes Elizabeth Hoodfar, MS, LCGC Regional Cancer Genetics Coordinator Kaiser Permanente Northern California Detect clinical characteristics of hereditary cancer syndromes.

More information

Hereditary Breast Cancer Panels. High Risk Hereditary Breast Cancer Panel Hereditary Breast/Ovarian/Endometrial Cancer Panel

Hereditary Breast Cancer Panels. High Risk Hereditary Breast Cancer Panel Hereditary Breast/Ovarian/Endometrial Cancer Panel P A T I E N T G U I D E Hereditary Breast Cancer Panels High Risk Hereditary Breast Cancer Panel Hereditary Breast/Ovarian/Endometrial Cancer Panel B a y l o r M i r a c a G e n e t i c s L a b o r a t

More information

GENETIC CONSIDERATIONS IN CANCER TREATMENT AND SURVIVORSHIP

GENETIC CONSIDERATIONS IN CANCER TREATMENT AND SURVIVORSHIP GENETIC CONSIDERATIONS IN CANCER TREATMENT AND SURVIVORSHIP WHO IS AT HIGH RISK OF HEREDITARY CANCER? Hereditary Cancer accounts for a small proportion of all cancer or approximately 5-10% THE DEVELOPMENT

More information

BRCA in Men. Mary B. Daly,M.D.,Ph.D. June 25, 2010

BRCA in Men. Mary B. Daly,M.D.,Ph.D. June 25, 2010 BRCA in Men Mary B. Daly,M.D.,Ph.D. June 25, 2010 BRCA in Men Inheritance patterns of BRCA1/2 Cancer Risks for men with BRCA1/2 mutations Risk management recommendations for men with BRCA1/2 mutations

More information

Usefulness of polymorphic markers in exclusion of BRCA1/BRCA2 mutations in families with aggregation of breast/ovarian cancers

Usefulness of polymorphic markers in exclusion of BRCA1/BRCA2 mutations in families with aggregation of breast/ovarian cancers J. Appl. Genet. 44(3), 2003, pp. 419-423 Short communication Usefulness of polymorphic markers in exclusion of BRCA1/BRCA2 mutations in families with aggregation of breast/ovarian cancers Bohdan GÓRSKI,

More information

Medical Policy Manual. Topic: Genetic Testing for Hereditary Breast and/or Ovarian Cancer. Date of Origin: January 27, 2011

Medical Policy Manual. Topic: Genetic Testing for Hereditary Breast and/or Ovarian Cancer. Date of Origin: January 27, 2011 Medical Policy Manual Topic: Genetic Testing for Hereditary Breast and/or Ovarian Cancer Date of Origin: January 27, 2011 Section: Genetic Testing Last Reviewed Date: May 2015 Policy No: 02 Effective Date:

More information

Hereditary Breast Cancer. Nicole Kounalakis, MD Assistant Professor of Surgery University of Colorado Medical Center

Hereditary Breast Cancer. Nicole Kounalakis, MD Assistant Professor of Surgery University of Colorado Medical Center Hereditary Breast Cancer Nicole Kounalakis, MD Assistant Professor of Surgery University of Colorado Medical Center Outline Background Assessing risk of patient Syndromes BRCA 1,2 Li Fraumeni Cowden Hereditary

More information

Genetic Testing for CHEK2 Mutations for Breast Cancer

Genetic Testing for CHEK2 Mutations for Breast Cancer Genetic Testing for CHEK2 Mutations for Breast Cancer Policy Number: 2.04.133 Last Review: 8/2015 Origination: 8/2015 Next Review: 8/2016 Policy Blue Cross and Blue Shield of Kansas City (Blue KC) will

More information

6/10/2015. Hereditary Predisposition for Breast Cancer: Looking at BRCA1/BRCA2 Testing & Beyond. Hereditary Cancers. BRCA1 and BRCA2 Review

6/10/2015. Hereditary Predisposition for Breast Cancer: Looking at BRCA1/BRCA2 Testing & Beyond. Hereditary Cancers. BRCA1 and BRCA2 Review Hereditary Predisposition for Breast Cancer: Looking at BRCA1/BRCA2 Testing & Beyond Arturo Anguiano MD, FACMG International Medical Director, Medical Affairs Vice Chairman, Genetics; Medical Director,

More information

Genetic Testing for Susceptibility to Breast and Ovarian Cancer (BRCA1 and BRCA 2)

Genetic Testing for Susceptibility to Breast and Ovarian Cancer (BRCA1 and BRCA 2) Easy Choice Health Plan, Inc. Harmony Health Plan of Illinois, Inc. M issouri Care, Inc. Ohana Health Plan, a plan offered by WellCare Health Insurance of Arizona, Inc. WellCare Health Insurance of Illinois,

More information

Dal germinale al somatico nella identificazione di tumori ereditari

Dal germinale al somatico nella identificazione di tumori ereditari Modena 18-19 novembre 2010 Dal germinale al somatico nella identificazione di tumori ereditari Laura Ottini Tendencies to develop cancer can be inherited Fletcher & Houlston, 2010 Cancer is a genetic disease

More information

POLICY PRODUCT VARIATIONS DESCRIPTION/BACKGROUND RATIONALE DEFINITIONS BENEFIT VARIATIONS DISCLAIMER CODING INFORMATION REFERENCES POLICY HISTORY
