How Industry Uses Big Data: Metagenomics and Beyond
Jairus David & The No Boundaries Team
Research & Innovation, ConAgra Foods, Omaha, Nebraska
IAFP Annual Meeting, Session S8: Big Data - Food Safety's Holy Grail or Pandora's Box?
Indianapolis, August 4, 2014
Complex data? Maybe, but we aren't Big Data
and this is big data
But processing, managing and analyzing the data is very complex!!
[Diagram labels: Sample 1, Sample 2 (gigabytes); associated sample information (a few megabytes); bioinformatics database; data processing pipelines; metadata; relative abundances]
Next-generation DNA sequencing services are abundant!!
Partnership allowed us to access the technology
University of Nebraska: Andy Benson, Joseph Nietfeldt, Rohita Sinha, Mary Ma, Ryan Legge (ConAgra), Jaehyoung Kim (Geneseek)
ConAgra: Jairus David, Gordon Smith, Stefanie Gilbreth, Indarpal Singh, Chris Duncan
Case study: Model fresh sausage
Preview
There are dynamic changes in the microbiome that cannot be detected by traditional methods: successions, competitions, displacements
NGS is a powerful tool for discovering causal relationships between microbes and sensory attributes
NGS data can be used for source tracking and for tracing the emigration of microbial populations that cannot be seen with traditional methods
Points us toward a new era of predictive microbiology!
Product Model?
1. Raw product with no kill step = high microbial load
2. Hot-boned process (no rigor mortis) and higher pH compared to cold-boned process
3. Though a raw product, it has an unusually long shelf life (spices?)
TOOLS:
Routine microbiology plating method
16S rRNA metagenomics as a discovery tool, not an analytical one
Sniffing: sensory (meat science) and enose (chemistry)
Source tracking
Statistics
Experimental design
Single batch of model sausage
Lactic acid carcass wash
Proximates: water, 3% fat, 4% meat, spice blend
Distribute into 1 lb chubs (1 per trt group)
Treatment groups: No Trt, 3% LDA, 4% LDA, 5% LDA, 6% LDA
[Diagram: flash freeze, slacking, and storage at 4 °C; sampling grid (X) by treatment group and day, with days labeled 3, 6, 8]
[Figure: Bacterial counts (log CFU/g) vs. time (days at 4 °C) for APC, APC psychrotroph, Enterobacteriaceae, coliforms, E. coli, and lactic acid bacteria]
Apparent disconnect between microbial counts and end of shelf life
Sensory notes develop in stationary phase as measured by traditional plating
[Figure: Log10 relative abundance of processed 16S amplicon data (most abundant taxa) vs. time (days at 4 °C). The legend lists the detected taxa, beginning with Lactobacillus graminis, Weissella confusa, Leuconostoc citreum, Pseudomonas lini, and Lactobacillus gasseri.]
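As a rough illustration of how a log10 relative abundance axis is derived from 16S amplicon data, the sketch below converts per-taxon read counts into fractions and then log-transforms them. The read counts and the `floor` value used for undetected taxa are invented for this example and are not data or settings from the study.

```python
import math

def relative_abundance(read_counts):
    """Convert per-taxon read counts for one sample into fractions that sum to 1."""
    total = sum(read_counts.values())
    if total == 0:
        raise ValueError("sample has no reads")
    return {taxon: n / total for taxon, n in read_counts.items()}

def log10_abundance(fractions, floor=1e-6):
    """Log10-transform fractions; zeros are clamped to `floor` (an assumed detection limit)."""
    return {taxon: math.log10(max(f, floor)) for taxon, f in fractions.items()}

# Hypothetical read counts for one sample (illustrative only).
sample_counts = {
    "Lactobacillus graminis": 9000,
    "Weissella confusa": 900,
    "Serratia_OTU4": 100,
}

fractions = relative_abundance(sample_counts)
logs = log10_abundance(fractions)

for taxon in sample_counts:
    print(f"{taxon}: fraction={fractions[taxon]:.3f}, log10={logs[taxon]:.2f}")
```

Clamping zeros to a floor is one common way to keep absent taxa on a log axis; the right floor depends on sequencing depth.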
Absolute levels vs. relative abundances
[Figure: comparison of absolute levels and relative abundances]
[Figure: panels labeled Selective growth and Selective lysis, each annotated with relative abundance values]
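The distinction between selective growth and selective lysis can be made concrete with a small arithmetic sketch. The absolute counts below are invented for illustration. They show that the same shift in relative abundance can come from one taxon growing or from another taxon dying off, which is why relative abundances need to be read alongside absolute levels.

```python
def relative(abs_counts):
    """Turn absolute counts (e.g. CFU/g) into fractions of the total."""
    total = sum(abs_counts.values())
    return {taxon: count / total for taxon, count in abs_counts.items()}

# Hypothetical absolute counts; taxon names and values are placeholders.
start = {"taxon_A": 1e6, "taxon_B": 9e6}

# Selective growth: taxon_A increases 9-fold, taxon_B is unchanged.
after_growth = {"taxon_A": 9e6, "taxon_B": 9e6}

# Selective lysis: taxon_A is unchanged, taxon_B drops 9-fold.
after_lysis = {"taxon_A": 1e6, "taxon_B": 1e6}

rel_start = relative(start)
rel_growth = relative(after_growth)
rel_lysis = relative(after_lysis)

print("start       :", rel_start)
print("after growth:", rel_growth, "total =", sum(after_growth.values()))
print("after lysis :", rel_lysis, "total =", sum(after_lysis.values()))
```

In both scenarios taxon_A ends at half of the community, yet the total load differs ninefold between them.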
Data reduction: focus on behavior of the most abundant taxa
[Figure: two panels, Most abundant and Overall most abundant, plotting relative abundance vs. time (days at 4 °C). Taxa shown include Lactobacillus graminis, Weissella confusa, Leuconostoc citreum, Pseudomonas lini, Lactobacillus gasseri, Carnobacterium divergens, Yersinia mollaretii, Buttiauxella brennerae, Serratia_OTU4, and Pseudomonas psychrophila.]
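One simple way to carry out this kind of data reduction is to rank taxa by their mean relative abundance across time points and keep the top few. The sketch below assumes that ranking rule, which the slides do not specify, and uses invented abundance values.

```python
def top_taxa(samples, n):
    """Return the n taxa with the highest mean relative abundance across samples.

    `samples` maps a sample label (here, day) to {taxon: relative abundance}.
    Taxa missing from a sample count as 0 for that sample.
    """
    taxa = set()
    for abundances in samples.values():
        taxa.update(abundances)
    means = {
        t: sum(s.get(t, 0.0) for s in samples.values()) / len(samples)
        for t in taxa
    }
    # Sort by descending mean, breaking ties alphabetically for stable output.
    return sorted(means, key=lambda t: (-means[t], t))[:n]

# Hypothetical relative abundances by day (illustrative only).
samples = {
    3: {"Lactobacillus graminis": 0.60, "Weissella confusa": 0.30,
        "Pseudomonas lini": 0.09, "Serratia_OTU4": 0.01},
    6: {"Lactobacillus graminis": 0.50, "Weissella confusa": 0.30,
        "Pseudomonas lini": 0.10, "Serratia_OTU4": 0.10},
    8: {"Lactobacillus graminis": 0.40, "Weissella confusa": 0.30,
        "Pseudomonas lini": 0.05, "Serratia_OTU4": 0.25},
}

print(top_taxa(samples, 3))
```

Ranking by the mean favors taxa that are abundant throughout; ranking by the maximum instead would also catch taxa that bloom late.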
[Figure: Relative abundance, bacterial counts (log CFU/g), pH, and sensory characteristics (sour, rancid, putrid/spoiled) vs. time (days at 4 °C). Counts: APC, APC psychrotroph, Enterobacteriaceae, coliforms, E. coli, lactic acid bacteria. Taxa: Lactobacillus graminis, Weissella confusa, Leuconostoc citreum, Pseudomonas lini, Lactobacillus gasseri, Carnobacterium divergens, Yersinia mollaretii, Buttiauxella brennerae, Serratia_OTU4, Pseudomonas psychrophila.]
[Figure: Relative abundance, bacterial counts (log CFU/g), and sensory characteristics (sour, rancid, putrid/spoiled) vs. time (days at 4 °C), by treatment group (labels: LD, 3%, 4%, 5%, 6%). Counts: APC, APC psychrotroph, Enterobacteriaceae, lactic acid bacteria. The relative abundance legend lists the most abundant taxa.]
[Figure: Relative abundance of Lactobacillus taxa (L. amylovorus, L. apodemi, L. curvatus, L. gasseri, L. graminis, L. sakei subsp. sakei) and bacterial counts (log CFU/g; APC, APC psychrotroph, Enterobacteriaceae, lactic acid bacteria) vs. time (days at 4 °C), by treatment group (labels: LD, 3%, 4%, 5%, 6%)]
Individual taxa that are associated with chemical changes
Electronic nose (enose) analysis of archived samples from each treatment group/time point
Statistical analysis of enose data to identify sensors showing significant changes over time per treatment group
Correlation analysis between relative abundances of individual taxa and mean values of enose sensors
Fig. 9. Box and whisker plots of responses from individual enose sensors showing statistically significant responses. Each plot shows the responses of a single sensor from the enose array for each individual sausage sample. Boxes represent 95% confidence intervals and whiskers depict maximum and minimum values for sensor responses over the 6-second measurements. The time points (days) are indicated above each individual sample and boxes are color-coded as depicted at the bottom. Individual sensors detect aromatics (W1C), ammonia (W3C), aromatic-aliphatics (W5C), broad-range alcohols (W2S), and broad range with sensitivity to nitrogen oxides (W5S). The values for days 3-8 were significant (P < 0.05) by ANOVA for each of the sensors shown, with the exception of the W5S sensor, where only the day 6 and day 8 untreated samples were significant.
[Figure: Relative abundance vs. mean enose sensor response, with linear fits, for three taxa. R² values listed after each panel title: Serratia OTU4, R² = .9881, .9876, .983; Carnobacterium divergens, R² = .6721, .6435, .7841; Lactobacillus graminis, R² = .7967, .7555. Sensors plotted include W1C, W3C, W5C, W1S, and W2S.]
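For readers who want to see how an R² from a linear fit of relative abundance against mean sensor response is obtained, here is a minimal least-squares sketch in plain Python. The data points are invented, and the function name `linear_fit` is a placeholder; the slides do not state which software produced the fits.

```python
def linear_fit(x, y):
    """Ordinary least-squares fit of y = slope * x + intercept.

    Returns (slope, intercept, r_squared). For a simple linear fit,
    r_squared equals the squared Pearson correlation between x and y.
    """
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    syy = sum((yi - mean_y) ** 2 for yi in y)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    if sxx == 0 or syy == 0:
        raise ValueError("x and y must each vary")
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    r_squared = (sxy * sxy) / (sxx * syy)
    return slope, intercept, r_squared

# Hypothetical data: mean sensor response (x) and relative abundance (y)
# for one taxon across four samples.
mean_response = [0.2, 0.4, 0.7, 1.0]
rel_abundance = [0.01, 0.03, 0.04, 0.07]

slope, intercept, r2 = linear_fit(mean_response, rel_abundance)
print(f"slope={slope:.4f}, intercept={intercept:.4f}, R^2={r2:.4f}")
```

A high R² on a handful of points indicates association only; as the talk notes later, the relationship may not always be linear.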
Source tracking
Pyrosequencing of 16S amplicons from DNA extracted from the spice blend
Compare taxa and their relative abundances in the spice blend and in the meat + spice blend
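The comparison step can be sketched as a set operation on the taxa detected in each sample, followed by a side-by-side look at the relative abundances of the shared taxa. The taxon names and values below are placeholders, and `compare_sources` is a hypothetical helper, not part of the pipeline used in the study.

```python
def compare_sources(ingredient, product):
    """Split taxa into shared, ingredient-only, and product-only groups.

    Both arguments map taxon name -> relative abundance (fraction of reads).
    Taxa with zero abundance are treated as absent.
    """
    in_ingredient = {t for t, f in ingredient.items() if f > 0}
    in_product = {t for t, f in product.items() if f > 0}
    return {
        "shared": sorted(in_ingredient & in_product),
        "ingredient_only": sorted(in_ingredient - in_product),
        "product_only": sorted(in_product - in_ingredient),
    }

# Hypothetical relative abundances (illustrative only).
spice_blend = {"taxon_A": 0.70, "taxon_B": 0.20, "taxon_C": 0.10}
meat_plus_spice = {"taxon_A": 0.05, "taxon_B": 0.40, "taxon_D": 0.55}

groups = compare_sources(spice_blend, meat_plus_spice)

print("Only in spice blend:", groups["ingredient_only"])
print("Only in meat + spice blend:", groups["product_only"])
for taxon in groups["shared"]:
    print(f"{taxon}: spice={spice_blend[taxon]:.2f}, "
          f"meat+spice={meat_plus_spice[taxon]:.2f}")
```

Taxa shared between the ingredient and the finished product are candidates for having been introduced by that ingredient, though shared presence alone does not prove the route.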
[Figure panels: Meat + spice blend; Spice blend. Phyla: Firmicutes, Actinobacteria, Proteobacteria, Bacteroidetes, Deinococcus-Thermus]
Fig. 11. Source-tracking taxa from the pork sausage and its spice blend ingredient. Distributions of taxa are illustrated from the day 0 sample of the untreated pork sausage and from a sample of the spice blend mix that was used to develop the model pork sausage. The dendrogram was developed from representative sequences derived from each of the 82 different genera that were detected in the samples. Representative sequences were developed from each of the genus-level CLASSIFIER taxonomic bins by first collapsing to 97% identity using cmalign and then aligning to a phylogenetic framework established from 16S reference sequences. The individual pie charts depict species-level OTUs for each of the 82 genera, and the relative abundance of each OTU in the spice blend (dark blue) or the meat + spice blend (light blue) is illustrated by proportion in each pie.
So what did we learn?
NGS technology will be best at discovery in microbially rich environments
Reconciled the gap between routine micro plating and sniffing/sensory
Discovery tool can point toward causal relationships between microbes and physicochemical properties of the matrix
  enose data and microbiome data
  Relationship may not always be linear
Source tracking
Biological questions?
Growth or lysis?
How does variation in ingredients and process influence outcome?
  Which organisms grow in the successions?
  Rate of the successions
  Rate of physicochemical changes
How much data do we need for predictive microbiology?
The No Boundaries Team
University of Nebraska: Andy Benson, Joseph Nietfeldt, Rohita Sinha, Mary Ma, Ryan Legge (ConAgra), Jaehyoung Kim (Geneseek)
ConAgra: Jairus David, Gordon Smith, Stefanie Gilbreth, Indarpal Singh, Chris Duncan