Derek Nexus and Sarah Nexus: working together for ICH M7
European ICGM, September 2014
Dr Nicholas Marchetti, Product Manager
nik.marchetti@lhasalimited.org
Derek Nexus and Sarah Nexus: working together for ICH M7
OUTLINE
- Impact of changes driven by M7
- In silico solutions
  - Vitic Nexus: an authoritative toxicity database
  - Derek Nexus: the leading expert system
  - Sarah Nexus: an advanced statistical system
- Expert assessment from 2 predictions
What does M7 cover?
- Identification, categorisation, qualification and control of mutagenic impurities to limit potential carcinogenic risk
- Harmonises guidelines across FDA, EMA and Japan
- Recognises the primacy of the Ames assay
Focussing on the identification step
- Evaluate: drug substance, impurities, degradants, (metabolites), intermediates
- Sources: databases, in-house data, literature; 2 x in silico QSAR (Leadscope, Multicase)
- Possible outcomes: known mutagen, predicted positive, predicted negative, known non-mutagen
- Follow-up: expert review of predictions; Ames test
- End points: limit according to TTC or present a purge argument for absence; or treat as non-mutagenic
Vitic Nexus: an authoritative toxicity database
- Vitic Nexus is a repository of toxicological data
  - Data donated by members
  - Curated and augmented by expert scientists
- Genotoxicity records
  - In vitro data: 146,444 records, 9,014 compounds
  - In vivo data: 10,157 records, 2,658 compounds
  - Overall call: 15,289 records, 8,510 compounds
- Contains public datasets and literature, including Benchmark, CGX, ISSSTY, IUCLID, FDA CDER & CFSAN, JETOC (Japanese Chemical Industry Ecology-Toxicology..), IARC, NIHS, NTP, SCCP, SIDS
- Members also store their own data in Vitic Nexus
Data sharing consortia
- Lhasa facilitates pre-competitive data sharing
- Members of these consortia also see:
  - Aromatic amines: 1,664 records, 145 compounds
  - Intermediates (includes boronic acid sub-group): 13,834 records, 910 compounds
  - Excipients: 2,286 records, 764 compounds
In silico predictions for M7
- Use models that predict Ames outcomes
- 2 complementary methods should be applied
  - One expert rule-based
  - One statistical-based
- Models should follow the OECD Principles for QSAR
- The absence of alerts from both is sufficient to conclude that the impurity is of no concern
- Expert review is needed to provide additional evidence for any prediction and to explain conflicting results
Enhancing Derek Nexus for mutagenicity
- Designed to support expert analysis for M7
- Provide additional supporting information
- Recommend where the expert should focus analysis
Sarah Nexus: an advanced statistical system
- Designed to address the ICH M7 guidelines
- Created with input from the FDA under a Research Collaboration Agreement
Making a prediction
- Query compounds are fragmented
- Each fragment is assessed
  - Fragments not covered by the training set result in no prediction (out of domain)
- Relevant hypotheses for each fragment are retrieved
  - Hypothesis, signal, confidence, supporting examples
  - Typically several hypotheses are returned
- Overall prediction = f(prediction, confidence), evaluated over the hypotheses
- Absence of a strong overall signal gives an equivocal result
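The slide does not say what the function f is, so the following is only a minimal sketch of how per-hypothesis signals and confidences could be combined into one call. The confidence-weighted average, the `Hypothesis` class, the `overall_prediction` name and the 0.2 threshold are all illustrative assumptions, not Sarah Nexus's actual method.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class Hypothesis:
    """One retrieved hypothesis (hypothetical structure, for illustration only)."""
    signal: int        # +1 = positive (mutagenic) signal, -1 = negative signal
    confidence: float  # assumed to lie between 0 and 1


def overall_prediction(hypotheses_by_fragment: Dict[str, List[Hypothesis]],
                       threshold: float = 0.2) -> str:
    """Combine hypotheses into an overall call (assumed aggregation rule)."""
    # A fragment with no supporting hypotheses is treated as not covered
    # by the training set, so no prediction is made.
    if not hypotheses_by_fragment or any(
            len(hs) == 0 for hs in hypotheses_by_fragment.values()):
        return "out of domain"

    all_hypotheses = [h for hs in hypotheses_by_fragment.values() for h in hs]
    # Assumed form of f: the mean of signal * confidence over all hypotheses.
    score = sum(h.signal * h.confidence for h in all_hypotheses) / len(all_hypotheses)

    # A weak overall signal in either direction is reported as equivocal.
    if abs(score) < threshold:
        return "equivocal"
    return "positive" if score > 0 else "negative"
```

Calling `overall_prediction({"nitro": [Hypothesis(+1, 0.9)], "ring": []})` returns "out of domain" because one fragment has no hypotheses; the fragment names are made up for the example.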
Confidence correlates with accuracy
Figure: balanced accuracy, PPV and NPV (0 to 1) against Sarah confidence score, with the outcome breakdown for each confidence bin

Sarah confidence score   TN    TP    FP    FN
0-20%                    29%   31%   22%   18%
20-40%                   40%   37%   13%   10%
40-60%                   39%   50%    9%    2%
60-80%                   34%   60%    4%    2%
80-100%                  23%   70%    6%    1%

Balanced accuracy = (sensitivity + specificity) / 2
PPV = TP / (TP + FP)
NPV = TN / (TN + FN)
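As a worked example, the sketch below applies the PPV, NPV and balanced accuracy definitions to the outcome percentages on this slide. It assumes the five breakdowns are listed in order of increasing confidence bin; the names `BINS` and `metrics` are illustrative only.

```python
# Outcome percentages per Sarah confidence bin, taken from the slide.
# Assumption: the five breakdowns appear in order of increasing confidence.
BINS = {
    "0-20%":   {"TN": 29, "TP": 31, "FP": 22, "FN": 18},
    "20-40%":  {"TN": 40, "TP": 37, "FP": 13, "FN": 10},
    "40-60%":  {"TN": 39, "TP": 50, "FP": 9,  "FN": 2},
    "60-80%":  {"TN": 34, "TP": 60, "FP": 4,  "FN": 2},
    "80-100%": {"TN": 23, "TP": 70, "FP": 6,  "FN": 1},
}


def metrics(c):
    """Return PPV, NPV and balanced accuracy for one confusion breakdown."""
    sensitivity = c["TP"] / (c["TP"] + c["FN"])
    specificity = c["TN"] / (c["TN"] + c["FP"])
    return {
        "PPV": c["TP"] / (c["TP"] + c["FP"]),
        "NPV": c["TN"] / (c["TN"] + c["FN"]),
        "balanced_accuracy": (sensitivity + specificity) / 2,
    }


for label, counts in BINS.items():
    m = metrics(counts)
    print(f"{label:>8}  PPV={m['PPV']:.2f}  NPV={m['NPV']:.2f}  "
          f"BA={m['balanced_accuracy']:.2f}")
```

Under that ordering assumption, all three metrics are higher in the top confidence bin than in the bottom one, which is the trend the slide title describes.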
Confidence vs PPV
Figure: PPV (0% to 100%) plotted against confidence (0% to 100%)
Sarah Nexus Performance
Sarah Nexus has been extensively evaluated by members
Figure: coverage, balanced accuracy, specificity and sensitivity (0% to 100%) for each dataset
- Coverage: 83-96%
- Balanced accuracy, (sens + spec) / 2: 60-85%
- Specificity, TN / (TN + FP): 60-89%
- Sensitivity, TP / (TP + FN): 38-84%
Datasets
- Private 1, n = 744, 28% +ive
- Private 2, n = 847, 12% +ive
- Private 3, n = 437, 16% +ive
- Private 4, n = 986, 4% +ive
- Private 5, n = 1718, 14% +ive
- Private 6, n = 320, 23% +ive
- FDA, n = 809, 36% +ive
- Public, n = 11209, 49% +ive
Sarah Nexus v1 under recommended settings
Presented @ SoT, March 2014
Sarah Nexus - Summary
- Sarah is a statistical approach to mutagenicity
- Maintains high coverage even with challenging datasets
- Provides information needed for expert analysis
Using in silico predictions
- M7 explicitly states that in silico predictions should be reviewed with expert knowledge
  - Provide supportive evidence for any prediction
  - Elucidate underlying reasons in case of conflicting results
- But how will this work in real life?
  - In silico methods combined with expert knowledge rule out mutagenic potential of pharmaceutical impurities: An industry survey. Regulatory Toxicology and Pharmacology, 2012, 62, 449-455
  - Use of in silico systems and expert knowledge for structure-based assessment of potentially mutagenic impurities. Regulatory Toxicology and Pharmacology, 2013, 67, 39
2 complementary methodologies should be applied
Data
- Expert system: uses all Lhasa data, including consortia & donated confidential data + data mined on-site
- Statistical system: only uses non-confidential data
Methodology
- Expert system: human-written rules based upon data & knowledge
- Statistical model: machine-learning model using a hierarchical network
Scope of alert
- Expert system: hand-written Markush
- Statistical model: fragments learnt by model
Interpretability
- Expert system: references, expert commentary, mechanistic explanation, scope of alert, some supporting examples
- Statistical model: transparent methodology, learning summarised by hypothesis, direct link to training set, confidence in prediction
Using Sarah and Derek together
- How often do they disagree?
- When they agree, how accurate are they?
Figure: results (0% to 100%) for Private Dataset 1, Private Dataset 2, Private Dataset 3 and the Public Dataset
- Agreement between Derek Nexus and Sarah Nexus: 69-85%
- Balanced accuracy for concurring predictions: 62-90%
Acknowledgements: All the Lhasa members who worked closely with us during the evaluation and development of Sarah
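The two quantities on this slide can be computed as in the sketch below. The function name and the eight-compound example are invented for illustration and are not taken from any of the datasets mentioned; predictions and Ames results are represented as booleans (True = positive).

```python
def agreement_and_balanced_accuracy(derek, sarah, truth):
    """Return (agreement rate, balanced accuracy on concurring predictions).

    All three arguments are equal-length lists of booleans.
    """
    # Keep only compounds where both systems make the same call.
    concurring = [(d, t) for d, s, t in zip(derek, sarah, truth) if d == s]
    agreement = len(concurring) / len(truth)

    tp = sum(1 for p, t in concurring if p and t)
    tn = sum(1 for p, t in concurring if not p and not t)
    fp = sum(1 for p, t in concurring if p and not t)
    fn = sum(1 for p, t in concurring if not p and t)
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("concurring set needs both positives and negatives")

    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    return agreement, (sensitivity + specificity) / 2


# Synthetic example data (made up, for illustration only).
truth = [True, True, True, True, False, False, False, False]
derek = [True, True, False, False, True, False, False, False]
sarah = [True, False, True, False, False, True, False, False]

agreement, balanced_accuracy = agreement_and_balanced_accuracy(derek, sarah, truth)
```

In this toy case the systems concur on four of eight compounds, and the balanced accuracy is evaluated on those four only.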
Using Sarah and Derek together
- A simple conservative approach will increase sensitivity..
- ..but at the cost of accuracy and specificity
Figure: sensitivity, accuracy and specificity (0 to 1) for a private dataset; data labels 0.68, 0.72, 0.83 and 0.74 (series assignment not recoverable)
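One common reading of a "simple conservative approach" is to call a compound positive if either system does. The sketch below assumes that reading and uses made-up data to show the trade-off; the function names and the example lists are illustrative, not the private dataset from the slide.

```python
def sensitivity_specificity(pred, truth):
    """Sensitivity and specificity for lists of boolean calls and results."""
    tp = sum(1 for p, t in zip(pred, truth) if p and t)
    tn = sum(1 for p, t in zip(pred, truth) if not p and not t)
    fp = sum(1 for p, t in zip(pred, truth) if p and not t)
    fn = sum(1 for p, t in zip(pred, truth) if not p and t)
    return tp / (tp + fn), tn / (tn + fp)


def conservative(derek, sarah):
    """Assumed conservative rule: positive if either system is positive."""
    return [d or s for d, s in zip(derek, sarah)]


# Synthetic example data (made up, for illustration only).
truth = [True, True, True, True, False, False, False, False]
derek = [True, True, False, False, True, False, False, False]
sarah = [True, False, True, False, False, True, False, False]

derek_sens, derek_spec = sensitivity_specificity(derek, truth)
sarah_sens, sarah_spec = sensitivity_specificity(sarah, truth)
both_sens, both_spec = sensitivity_specificity(conservative(derek, sarah), truth)
```

Because the combined call is positive whenever either input is, its sensitivity can never be lower than that of either system alone, and its specificity can never be higher.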
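Using Sarah and Derek together
When they disagree, which is right?
Figure: pie charts for the Public Dataset and Private Dataset 3; segment labels not recoverable. Percentages as extracted: 7%, 14%, 7%, 7%, 11%, 9%, 5%, 7%, 9%, 72%, 73%, 80%, 31%, 25%, 17%, 27%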
Handling conflicting predictions
- Confidence scores can give an indication
- Machine-learnt & expert-driven rules have been assessed
  1. If both models agree: take that consensus prediction
  2. If one model has a high confidence prediction: take the most confident prediction
  3. If Derek says positive and Sarah has a positive hypothesis (despite being negative overall): activity is most likely
  4. If the positive prediction is of low confidence: activity is unlikely
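The four rules above can be sketched as an ordered decision function. This is a minimal illustration under stated assumptions: both systems are given a numeric confidence between 0 and 1, and the `high` and `low` cut-offs, the `Call` class and the `resolve` name are hypothetical. Neither product's real output format or thresholds are given in the slides.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass
class Call:
    """A single system's prediction (hypothetical structure)."""
    positive: bool
    confidence: float  # assumed numeric scale from 0 to 1


def resolve(derek: Call, sarah: Call, sarah_has_positive_hypothesis: bool,
            high: float = 0.8, low: float = 0.3) -> Tuple[str, Optional[int]]:
    """Return (call, step that decided it); thresholds are assumptions."""
    # Step 1: both models agree, so take the consensus prediction.
    if derek.positive == sarah.positive:
        return ("positive" if derek.positive else "negative", 1)

    # Step 2: one model is highly confident, so take the most confident call.
    stronger, weaker = sorted((derek, sarah),
                              key=lambda c: c.confidence, reverse=True)
    if stronger.confidence >= high and stronger.confidence > weaker.confidence:
        return ("positive" if stronger.positive else "negative", 2)

    # Step 3: Derek is positive and Sarah, although negative overall,
    # has a positive hypothesis, so activity is most likely.
    if derek.positive and sarah_has_positive_hypothesis:
        return ("positive", 3)

    # Step 4: the positive call is of low confidence, so activity is unlikely.
    positive_call = derek if derek.positive else sarah
    if positive_call.confidence < low:
        return ("negative", 4)

    # Nothing decided: leave the conflict to expert review.
    return ("expert review", None)
```

For example, `resolve(Call(False, 0.9), Call(True, 0.4), False)` is settled at step 2 in favour of the confident negative call. Cases that fall through all four steps are returned unresolved rather than forced to a call.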
Handling conflicting predictions
Figure (Private Dataset 1): accuracy, sensitivity, true accuracy and coverage (axis 0.5 to 0.85) after each step
- Step 1: D and S agree
- Step 2: Most confident prediction
- Step 3: D says positive, S has positive hypothesis
- Step 4: Low confidence positive
Simple rules give increased coverage without loss of accuracy
Ultimately, expert review is needed
- Decision trees may help guide an expert, but expert review is still essential
- We have worked with our members to deliver the information needed for expert review
Supporting the expert workflow
Step 1: Specific prediction for ICH M7
Supporting the expert workflow
Derek prediction: predicted negative, but there is a ring system to assess
Supporting the expert workflow Derek Nexus now shows those compounds from the Lhasa Ames test reference set most closely related to the query
Supporting the expert workflow
Step 2: Sarah prediction
- Sarah predicts negative; no positive hypotheses seen
- Derek and Sarah analyses agree
- Supporting data from Vitic augments this prediction
Supporting the expert workflow
Step 3: Vitic search (similarity chosen)
- Vitic shows a related active for which there is no obvious cause (no Derek alert fires) and also a related inactive
- Expert assessment: ring system not of concern
Possible reasons to over-rule a positive in silico call
- The presence of a second, confounding alert that could have caused the activity
  - A risk with statistical models
  - Minimised with Sarah's recursive learning approach
- Mechanistic interpretation: stereo-electronics preclude reaction through the accepted mechanism, such as that described within Derek
- Similar analogues that trigger the same alert have been tested as inactive but were not known to the model
What our members say
"Combined use of two complementary in silico systems such as Derek Nexus and SEP leads to an increase in negative predictivity and sensitivity, up to 99.1% and 94.7% respectively"
Poster: Comparative Evaluation of in Silico Systems for Ames Test Mutagenicity Prediction, Ilse Koijen, Janssen, GTA, Newark, Oct 2013, www.gta-us.org/scimtgs/2013meeting/posters2013.html
SEP = the pre-release version of Sarah
Combined report view
Derek Prediction
Sarah Prediction
Batch View
Paper reports
Summary
- M7 will allow predictions of mutagenicity to be submitted
- Derek has been extended to increase support for expert review
  - Making confident predictions of inactivity
  - Highlighting features worthy of attention
- Sarah has been designed to provide the statistical 2nd system
  - Recursive learning and a hierarchical network provide transparency and accuracy
- The performance of combined predictions has been described
  - Using a number of relevant confidential datasets
- Examples of expert decision-making illustrate their application
  - Use of Vitic, an authoritative database, supports this workflow
Questions?