AN AUTOMATED, HIGH-THROUGHPUT LIBRARY CONSTRUCTION PROTOCOL WITH BENEFITS FOR LOW-INPUT APPLICATIONS.

Poster Note
As presented at AGBT 2014, Marco Island, FL

Authors
Maryke Appel (1), Olaf Stelling (2), Olga Aminova (3), Sasinya Scott (4), Adriana Heguy (3), Michael Berger (4), John Foskett (1)

(1) Kapa Biosystems, 200 Ballardvale St, Suite 250, Wilmington, MA
(2) Alpaqua Engineering, LLC, Beverly, MA
(3) Human Oncology and Pathogenesis Program, Memorial Sloan Kettering Cancer Center, New York, NY
(4) Department of Pathology, Memorial Sloan Kettering Cancer Center, New York, NY

INTRODUCTION

Research and clinical next-generation sequencing (NGS) applications and pipelines continue to push the limits of current protocols in the quest for high-quality sequence data from lower amounts of input DNA, and from challenging or precious samples, such as DNA isolated from FFPE tissue. We and others have previously reported on the advantages of an engineered, high-fidelity polymerase for library amplification, which enables improved library quality and sequence coverage. The implementation of KAPA HiFi for pre- and post-capture amplification represented a major improvement in the library construction process for targeted Illumina sequencing at MSKCC (Figure 1). A highly optimized library construction protocol, focused on the preservation of small DNA fragments and minimal sample loss, was subsequently developed to ensure high success rates with precious FFPE samples (Figure 2, left). The next challenge was to establish a robust, automated, high-throughput pipeline that yields comparable data quality and success rates.

To extend the benefits of low-bias amplification, particularly to low-input and clinical applications, Kapa Biosystems recently developed an automation-friendly library construction protocol (Figure 2, right) that incorporates the with-bead strategy conceptualized by The Broad Institute of MIT and Harvard. Together with the use of ultra-pure, high-quality reagents for library construction, the with-bead protocol results in higher recoveries of adapter-ligated molecules from lower amounts of input DNA. These benefits allow for fewer cycles of amplification, thereby further reducing the risk of PCR-induced bias, error and other artefacts that can affect library quality, sequence coverage, and reliable library quantification.

In collaboration with Alpaqua Engineering, an elegant and versatile automated method has been developed for high-throughput Illumina library construction using the KAPA HTP Library Preparation Kit on the Biomek FX Laboratory Automation Workstation (Beckman Coulter). The data presented here confirm that the automated method and KAPA library construction reagents offer robust, high-throughput library construction over a range of DNA inputs and sample types, and yield sequence data comparable to, or better than, those achieved with a highly optimized manual method, whilst employing fewer cycles of pre-capture amplification.

FIGURE 1 KAPA HiFi significantly improves depth and coverage of GC-rich exons in targeted Illumina re-sequencing

Libraries were prepared from 250 ng Covaris-sheared DNA, using the Illumina TruSeq DNA Sample Preparation Kit with Phusion DNA Polymerase (red), or NEBNext library construction reagents and KAPA HiFi DNA Polymerase (blue) for pre-capture amplification. Solution-based hybridization capture was performed with a custom SeqCap EZ panel (Roche NimbleGen), targeting 38 GC-rich genes (203 kb), prior to paired-end sequencing (2 x 150 bp) on an Illumina MiSeq. Pre-capture amplification with KAPA HiFi resulted in a higher average depth of coverage (210X for 24 samples vs. 99X for 44 samples with the TruSeq kit), and a significant improvement in the representation of GC-rich exons.

FIGURE 2 Library construction workflows

The manual workflow optimized at MSKCC (left) employs NEB enzymes and buffers for end repair, A-tailing and ligation. Sample volumes recovered after SPRI cleanups have been maximized to limit sample loss. This requires meticulous liquid handling, which is not achievable with an automated system. To ensure a robust and automation-friendly method capable of producing libraries of similar quality and complexity, the KAPA protocol (right) employs a with-bead strategy.

Manual MSKCC workflow (left):
1. Fragmented DNA in 50 µl
2. Add End Repair enzyme + buffer + water up to 100 µl; mix and incubate for 30 min at 20 °C
3. 2X SPRI cleanup; elute DNA in 44.5 µl water, transfer 42 µl
4. Add A-tailing enzyme + buffer (50 µl final); mix and incubate for 30 min at 37 °C
5. 2X SPRI cleanup; elute DNA in 35.5 µl water, transfer 33.8 µl
6. Add Ligase + buffer + adapter (50 µl final); mix and incubate for 15 min at 20 °C
7. 1st 1X SPRI cleanup, elute in 52.5 µl; transfer 50 µl for 2nd 1X SPRI cleanup; elute in 23 µl, transfer 21 µl to PCR
8. PCR amplification with KAPA HiFi HotStart ReadyMix (10 cycles)
9. 1X SPRI cleanup; elute library in 32.5 µl, recover 30 µl
Cumulative sample loss: 28%

Automated KAPA workflow (right):
1. Fragmented DNA in 50 µl
2. Add 20 µl End Repair Mix; mix and incubate for 30 min at 20 °C
3. 1.7X SPRI cleanup; elute DNA in 50 µl A-tailing Mix; mix and incubate for 30 min at 30 °C
4. 1.8X SPRI cleanup; elute DNA in 45 µl Ligation Mix; add 5 µl adapter; mix and incubate for 15 min at 20 °C
5. 1st 1X SPRI cleanup, elute in 50 µl; add PEG/NaCl for 2nd 1X cleanup; elute in 25 µl, transfer 20 µl to PCR
6. PCR amplification with KAPA HiFi HotStart ReadyMix (8 cycles)
7. 1X SPRI cleanup; elute library in 30 µl, recover 25 µl
Cumulative sample loss: 33%
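The cumulative sample loss figures quoted for Figure 2 can be reproduced, to the nearest percent, if loss is modeled only as the fraction of each eluate that is left behind when a volume is transferred or recovered. The sketch below makes that assumption. The names are illustrative, and with-bead steps that elute directly into the next reaction mix are treated as lossless.

```python
# Each tuple is (elution volume, volume carried forward) in µl, read from Figure 2.
# Assumption: the only loss considered is eluate left behind at each transfer.
MANUAL_STEPS = [
    (44.5, 42.0),   # after end repair cleanup
    (35.5, 33.8),   # after A-tailing cleanup
    (52.5, 50.0),   # 1st post-ligation cleanup
    (23.0, 21.0),   # 2nd post-ligation cleanup, transfer to PCR
    (32.5, 30.0),   # post-amplification cleanup
]
AUTOMATED_STEPS = [
    (25.0, 20.0),   # 2nd post-ligation cleanup, transfer to PCR
    (30.0, 25.0),   # post-amplification cleanup
]


def cumulative_loss(steps):
    """Return the fraction of sample lost across all transfer steps."""
    retained = 1.0
    for eluted_ul, carried_ul in steps:
        retained *= carried_ul / eluted_ul
    return 1.0 - retained


if __name__ == "__main__":
    print(f"Manual:    {cumulative_loss(MANUAL_STEPS):.1%}")
    print(f"Automated: {cumulative_loss(AUTOMATED_STEPS):.1%}")
```

Under this model the automated workflow loses more volume in total, but does so in only two transfers, each with a generous dead volume that a liquid handler can leave behind reliably.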

AN AUTOMATED METHOD FOCUSED ON FLEXIBILITY AND PERFORMANCE

The KAPA HTP Library Preparation Kit and protocol (Figure 2, right) were optimized with automation in mind: reagents for enzymatic reactions are utilized as master mixes, reaction setups are designed to obviate the need for aspirating or transferring volumes <5 µl, and SPRI beads are re-used for cleanups after enzymatic reactions to minimize the physical transfer (and associated loss) of sample. During development of the automated method for the Biomek FX platform, the primary objective was to create a flexible, yet robust method that is applicable to different workflows (without the need for reprogramming), and yields libraries of comparable quality to those constructed manually.

A key feature of the method is its intuitive, easy-to-use interface (Figure 3), which takes advantage of the flexibility and extensibility of the Biomek software to offer the user full control over key reaction parameters. These include the number of samples to be processed (8 to 96, in columns of 8), reaction component volumes, parameters for SPRI cleanups, incubation times and temperatures, and elution volumes. Furthermore, it provides push-button access to different workflow options relating to adapter indexing, size selection (post-ligation and/or post-amplification; on-deck with SPRI beads, or off-deck), and library amplification (optional, on- or off-deck). Other user-friendly features include utilization of partial tip boxes for the Span-8 pod (if <96 samples are processed), and error recovery (the method can be restarted at any key step in the process). A full run takes 4.5 to 6 hours to complete, depending on the number of samples and the specific workflow.

FIGURE 3 Start-up dialog and deck layout for the KAPA HTP Library Preparation Kit on the Biomek FX platform

The start-up dialog (left) allows the user to configure the run, by either leaving default values as is, or changing parameters as needed for the samples to be processed. An optional SPRI cleanup before end repair, as well as three different options for size selection after ligation and library amplification, are available. Post-ligation and post-amplification size selection is configured on additional screens. The post-ligation or Size Selection 1 screen (top right) provides for No size selection, which can be configured as one 0.5X SPRI cleanup, or two consecutive 1X SPRI cleanups (recommended). If Dual-SPRI Size Selection is selected in the start-up dialog, a 1X SPRI cleanup will be followed by a dual-SPRI size selection. The Off-Deck Size Selection option allows the user to perform a SPRI cleanup, then pause the method for off-deck size selection using the appropriate equipment. The volume in which samples are returned to the deck can be specified when removing or returning the samples. The post-amplification or Size Selection 2 screen provides similar options. The method can be configured for different deck layouts (e.g. bottom right) equipped with the required hardware. Unique spring-loaded automation accessories are optional, but recommended for optimal pipetting performance and reagent utilization. Consumable usage is limited through the washing and re-use of a single set of tips for the 96-well pod throughout all SPRI bead manipulations.

SIMILAR PERFORMANCE WITH HIGH QUALITY CELL LINE DNA

The objective of the first validation experiment was to compare the performance of the automated KAPA HTP Library Preparation method for the Biomek FX platform with that of the manual method previously optimized at MSKCC.

FIGURE 4

Libraries were prepared for targeted Illumina re-sequencing from nine different tumor cell line DNA samples with the manual MSKCC protocol, or the automated method for the KAPA HTP Library Preparation Kit on the Biomek. The experimental design is outlined on the left, and results are summarized on the right.

Experimental design:
1. Tumor cell line DNA (9 samples)
2. Covaris shear (500 ng per sample; 6 min, 200 cycles/burst, 10% duty cycle, intensity = 5)
3. Library construction with one of two methods:
   Optimized manual MSKCC method: input = 50% of sheared DNA; 2X SPRI cleanup after end repair; 2X SPRI cleanup after A-tailing; NEXTflex indexed adapters (Bioo Scientific); 10 cycles of pre-capture amplification with KAPA HiFi
   Automated KAPA method on Biomek: input = 50% of sheared DNA; 1.7X SPRI cleanup after end repair; 1.8X SPRI cleanup after A-tailing; NEXTflex indexed adapters (Bioo Scientific); 8 cycles of pre-capture amplification with KAPA HiFi
4. Quantify amplified libraries with Qubit dsDNA HS assay
5. Pool 100 ng of each library
6. Target enrichment with custom SeqCap EZ panel (279 genes; ~1 Mb)
7. Post-capture amplification with KAPA HiFi (12 cycles)
8. Paired-end sequencing (2 x 75 bp) on Illumina MiSeq

FIGURE 5

Average pre-capture amplification yield: automated (8 PCR cycles) 2.7 µg ± 0.20 µg; manual (10 PCR cycles) 7.0 µg ± 0.45 µg
Projected yield after 10 cycles (assuming 90% PCR efficiency): automated 8.7 µg ± 0.64 µg; manual N/A
Average insert size: automated ~145 bp; manual ~130 bp
Average library size (post-capture): automated 27.3 million molecules; manual 29.4 million molecules
% PCR duplicates: automated 1.33%; manual 1.28%
Target specificity (on bait : near bait : off-bait): automated 64% : 22% : 14%; manual 66% : 20% : 14%

Results indicated that libraries generated with the two methods were of comparable quality and complexity, and that the automated method may have converted slightly more input material to adapter-ligated molecules. The average insert size of the Biomek libraries was 10 to 20 bp larger than that of the manual libraries, most likely as a result of the lower SPRI ratios (1.7X to 1.8X vs. 2X) used for cleanups after end repair and A-tailing. The marginally higher on:near target ratio for the smaller manual libraries was attributed to less intronic content.
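The projected yield after 10 cycles reported in Figure 5 can be reproduced from the 8-cycle yield if "90% PCR efficiency" is read as a per-cycle fold increase of 2 x 0.9 = 1.8. That reading is inferred from the reported numbers and is not stated in the poster; the function name and parameters below are illustrative only.

```python
def projected_yield(yield_ug, extra_cycles, efficiency=0.9):
    """Extrapolate a PCR yield over additional cycles.

    Assumption (inferred, not stated in the source): each cycle multiplies
    the amount of product by 2 * efficiency, i.e. 1.8-fold at 90%.
    """
    per_cycle_fold = 2.0 * efficiency
    return yield_ug * per_cycle_fold ** extra_cycles


if __name__ == "__main__":
    # Automated method, cell line DNA: 2.7 µg after 8 cycles, projected to 10 cycles.
    print(f"{projected_yield(2.7, extra_cycles=2):.1f} µg")
```

On this basis the automated method's 8-cycle yield projects to a value somewhat above the 7.0 µg obtained manually after 10 cycles, which is consistent with the statement that the automated method may have converted slightly more input material to adapter-ligated molecules.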

CONSISTENT PERFORMANCE OVER A RANGE OF LOW DNA INPUTS

The objective of the next validation experiment was to determine whether the performance of the automated method was consistent over a range of low DNA inputs.

FIGURE 6

For this experiment, high-quality commercial DNA (Clontech, part # 636401) was Covaris-sheared to an average size of ~180 bp. Different amounts of sheared DNA (100 ng, 50 ng or 10 ng, based on the manufacturer's concentration before shearing) were used in quadruplicate for library construction with the automated KAPA HTP Library Preparation method on the Biomek. One set of duplicate samples for each input amount was withdrawn after post-ligation cleanup, whereas the other set of samples was amplified with KAPA HiFi HotStart Library Amplification ReadyMix for 8 cycles, and subjected to two consecutive 1X SPRI cleanups. Post-ligation and post-amplification yields were determined by qPCR, using the KAPA Library Quantification Kit for Illumina libraries. The results confirmed that the automated method produces consistent ligation and amplification results over a range of low DNA inputs.

Values are listed for 100 ng, 50 ng and 10 ng input, respectively.
Average post-ligation yield (25 µl): 23.9 ng; 11.3 ng; 2.41 ng
Average post-ligation yield as % of input DNA: 23.9%; 22.6%; 24.1%
Average % of input molecules converted to adapter-ligated DNA*: 14.3%; 13.2%; 14.5%
Average amount of template DNA used for amplification (20 of 25 µl): 19.1 ng; 9.0 ng; 1.93 ng
Average post-amplification yield (30 µl): 5.1 µg; 2.5 µg; 0.50 µg
Average enrichment factor: 272X; 295X; 258X
*Based on the assumption that 60% of the mass of adapter-ligated DNA is contributed by library DNA (average fragment size ~180 bp) and 40% by adapters (2 x 60 bp). Calculations assume negligible loss of DNA during shearing.

COMPARABLE PERFORMANCE WITH FFPE SAMPLES

The acid test for the automated method was to compare its performance against the optimized, manual method with FFPE DNA.
In this case, libraries were prepared for targeted Illumina re-sequencing from eight FFPE DNA (four colon cancer and four gastric cancer) samples, with the manual MSKCC protocol, or automated method for the KAPA HTP Library Preparation Kit on the Biomek. The experimental design was similar to that used for the experiment employing tumor cell line DNA (see above), but SPRI ratios used for cleanups after end repair and A-tailing were standardized at 2X for both the automated and manual methods. Results are summarized in the table and graphs on the right. Results indicated that libraries produced from FFPE DNA with the automated KAPA method were comparable in quality and complexity to libraries generated from the same samples using the manual MSKCC method. The lower yields, higher duplication rates and smaller library sizes obtained for FFPE samples (vs. cell line DNA) with both methods confirmed the challenges associated with NGS library construction from FFPE samples, and the need for robust and reliable automated methods for high-throughput library construction from difficult and precious samples. The automated KAPA HTP Library Preparation method developed for the Biomek FX platform is capable of meeting these challenges, and may outperform manual methods that are not as well optimized as the method previously developed at MSKCC.
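Returning to the low-input experiment (Figure 6), the derived rows of that table follow from the measured post-ligation yields and the footnoted assumptions. The sketch below re-derives them; the function and constant names are illustrative. It reproduces the reported values for the 100 ng and 10 ng inputs to the precision shown. For the 50 ng input it gives a conversion of about 13.6% rather than the reported 13.2%, possibly because the poster's figures were calculated from unrounded data.

```python
FRAGMENT_BP = 180          # average sheared fragment size (from the footnote)
ADAPTER_BP = 2 * 60        # two adapters of 60 bp each (from the footnote)
# Fraction of adapter-ligated mass contributed by library DNA: 180 / 300 = 0.6
LIBRARY_MASS_FRACTION = FRAGMENT_BP / (FRAGMENT_BP + ADAPTER_BP)


def conversion_metrics(input_ng, post_ligation_ng, eluate_ul=25.0, template_ul=20.0):
    """Return (yield % of input, % of input converted, template ng used for PCR)."""
    yield_pct = 100.0 * post_ligation_ng / input_ng
    # Only the insert portion of each adapter-ligated molecule came from input DNA.
    converted_pct = yield_pct * LIBRARY_MASS_FRACTION
    # 20 µl of the 25 µl post-ligation eluate is carried into amplification.
    template_ng = post_ligation_ng * template_ul / eluate_ul
    return yield_pct, converted_pct, template_ng


if __name__ == "__main__":
    for input_ng, yield_ng in [(100, 23.9), (50, 11.3), (10, 2.41)]:
        y, c, t = conversion_metrics(input_ng, yield_ng)
        print(f"{input_ng:>3} ng input: yield {y:.1f}%, converted {c:.1f}%, template {t:.2f} ng")
```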

FIGURE 7 Insert size distribution
[Graph: fraction of molecules vs. insert size.]

FIGURE 8 Fraction of duplicates
[Graph: fraction of duplicates per sample (bc01 to bc16), manual and automated methods.]

FIGURE 9 Post-capture library size
[Graph: post-capture library size per sample (bc01 to bc16), manual and automated methods.]

FIGURE 10 On-target specificity
[Graph: percentage of total bases classified as ON_BAIT_BASES, NEAR_BAIT_BASES and OFF_BAIT_BASES per sample (bc01 to bc16), manual and automated methods.]

FIGURE 11 GC content profile
[Graph: mean target coverage (normalized) vs. GC fraction. Legend: A-manual, H-manual, A-beckman, H-beckman.]

FIGURE 12

Average pre-capture amplification yield: automated KAPA method (8 PCR cycles) 0.62 µg ± 0.21 µg; manual MSKCC method (10 PCR cycles) 2.1 µg ± 0.71 µg
Projected yield after 10 cycles (assuming 90% PCR efficiency): automated 2.0 µg ± 0.69 µg; manual N/A
Average insert size: automated ~120 bp; manual ~120 bp
Average library size (post-capture): automated 9.8 million molecules; manual 11.8 million molecules
% PCR duplicates: automated 4.7%; manual 5.1%
Target specificity (on bait : near bait : off-bait): automated 75% : 15% : 10%; manual 75% : 15% : 10%
FFPE DNA yield / cell line DNA yield: automated 0.23; manual 0.30
FFPE library size / cell line library size: automated 0.36; manual 0.40
FFPE duplicate % / cell line duplicate %: automated 3.5; manual 4.0
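The FFPE-to-cell-line ratios in the results summary are simple quotients of the values reported for the two experiments. The sketch below recomputes them as a consistency check. The dictionary layout and names are illustrative; each pair is ordered (automated, manual).

```python
# Values as reported for the cell line and FFPE experiments: (automated, manual).
CELL_LINE = {
    "yield_ug": (2.7, 7.0),
    "library_size_millions": (27.3, 29.4),
    "duplicates_pct": (1.33, 1.28),
}
FFPE = {
    "yield_ug": (0.62, 2.1),
    "library_size_millions": (9.8, 11.8),
    "duplicates_pct": (4.7, 5.1),
}


def ffpe_to_cell_line_ratios(ffpe, cell_line):
    """Return {metric: (automated ratio, manual ratio)} of FFPE over cell line."""
    return {
        metric: tuple(f / c for f, c in zip(ffpe[metric], cell_line[metric]))
        for metric in ffpe
    }


if __name__ == "__main__":
    for metric, (auto, manual) in ffpe_to_cell_line_ratios(FFPE, CELL_LINE).items():
        print(f"{metric}: automated {auto:.2f}, manual {manual:.2f}")
```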

Contact Us at:
Headquarters, United States: Wilmington, Massachusetts. Tel: 781.497.2933, Fax: 781.497.2934, sales@kapabiosystems.com
International Office: Cape Town, South Africa. Tel: +27.21.448.8200, Fax: +27.21.448.6503, sales@kapabiosystems.com
Kapa Technical Support: kapabiosystems.com/support
© 2016 Kapa Biosystems. All trademarks are the property of their respective owners. PN103001 A179 4/16