RDKit: A software suite for cheminformatics, computational chemistry, and predictive modeling
|
|
|
- Jennifer Willis
- 10 years ago
- Views:
Transcription
1 RDKit: A software suite for cheminformatics, computational chemistry, and predictive modeling Greg Landrum [email protected]
2 SourceForge Page: Availability Source code: Browse: Download using subversion: svn co rdkit Binaries: available for Win32 Licensing: BSD license (except for GUI components, which are GPL)
3 Components End User Command-line scripts Python interface Database extensions Developer Python programming library C++ programming library
4 General Molecular Functionality Input/Output: SMILES/SMARTS, mol, SDF, TDT Cheminformatics : Substructure searching Canonical SMILES Chirality support Chemical transformations Chemical reactions Molecular serialization (e.g. mol <-> text) 2D depiction, including constrained depiction and mimicking 3D coords 2D->3D conversion/conformational analysis via distance geometry UFF implementation for cleaning up structures Fingerprinting (Daylight-like, circular, atom pairs, topological torsions, MACCS keys, etc.) Similarity/diversity picking (include fuzzy similarity) 2D pharmacophores Gasteiger-Marsili charges Hierarchical subgraph/fragment analysis Hierarchical RECAP implementation
5 General Molecular Functionality, cntd Feature maps and feature-map vectors Shape-based similarity Molecule-molecule alignment Shape-based alignment (subshape alignment)* Very fast 3D pharmacophore searching Integration with PyMOL for 3D visualization Database cartridge (PostgreSQL, sqlite coming) * * functional implementations, but not really recommended for use
6 General QSAR Functionality Molecular descriptor library: Topological (κ3, Balaban J, etc.) Electrotopological state (EState) clogp, MR (Wildman and Crippen approach) MOE like VSA descriptors Feature-map vectors Machine Learning: Clustering (hierarchical) Information theory (Shannon entropy) Decision trees, naïve Bayes*, knn* * functional implementations, but Bagging, random forests not really recommended for use Infrastructure: - data splitting - shuffling (y scrambling) - out-of-bag classification - serializable models and descriptor calculators - enrichment plots, screening, etc.
7 Command Line Tools ML/BuildComposite.py: build models ML/ScreenComposite.py: screen models ML/EnrichPlot.py: generate enrichment plot data ML/AnalyzeComposite.py: analyze models (descriptor levels) Chem/Fingerprints/FingerprintMols.py: generate 2D fingerprints Chem/BuildFragmentCatalog.py: CASE-type analysis with a hierarchical catalog
8 Database Utilities DbUtils.ExcelToDatabase('::GPCRs','bzr_raw85',keyCol='name') DbUtils.DatabaseToExcel('::GPCRs','bzr_raw85') DbUtils.TextFileToDatabase('::GPCRs','bzr_raw85', file('input.txt',delim=',', keycol='name') DbUtils.DatabaseToText('::GPCRs','bzr_raw85')
9 Reading/Writing Molecules MolFromSmiles MolFromSmarts MolFromMolBlock MolFromMolFile mol = Chem.MolFromSmiles('CCCOCC') SmilesMolSupplier SDMolSupplier TDTMolSupplier SupplierFromFilename MolToSmiles MolToSmarts MolToMolBlock SmilesWriter SDWriter TDTWriter mols = [x for x in Chem.SDMolSupplier('input.sdf')] print >>outf,chem.moltomolblock(mol) w = Chem.SDWriter('out.sdf') for m in mols: w.write(m)
10 Working with molecules: Substructure matching >>> m = Chem.MolFromSmiles('O=CCC=O') >>> p = Chem.MolFromSmarts('C=O') >>> m.hassubstructmatch(p) True >>> m.getsubstructmatch(p) (1, 0) >>> m.getsubstructmatches(p) ((1, 0), (3, 4)) >>> m = Chem.MolFromSmiles('C1CCC1C(=O)O') >>> p= Chem.MolFromSmarts('C=O') >>> m2 =Chem.DeleteSubstructs(m,p) >>> Chem.MolToSmiles(m2) 'O.C1CCC1' >>> >>> def findcore(m): smi = Chem.MolToSmiles(m) patt = Chem.MolFromSmarts('[D1]') while 1: m2 = Chem.DeleteSubstructs(m,patt) nsmi = Chem.MolToSmiles(m2) m = m2 if nsmi==smi: break else: smi = nsmi return m >>> m = Chem.MolFromSmiles('CCC1CCC1') >>> c = findcore(m) >>> Chem.MolToSmiles(c) 'C1CCC1'
11 Working with molecules: properties >>> suppl = Chem.SDMolSupplier('divscreen.400.sdf') >>> m = suppl.next() >>> list(m.getpropnames()) ['Formula', 'MolWeight', 'Mol_ID', 'Smiles', 'cdk2_ic50', 'cdk2_inhib', 'cdk_act_bin_1', 'mol_name', 'scaffold', 'sourcepool'] >>> m.getprop('scaffold') 'Scaffold_00' >>> m.getprop('missing') Traceback (most recent call last): File "<stdin>", line 1, in? KeyError: 'missing' >>> m.hasprop('missing') 0 >>> m.setprop('testing','value') >>> m.getprop('testing') 'value' >>> m.setprop('calcd','45',computed=true) >>> list(m.getpropnames()) ['Formula', 'MolWeight', 'Mol_ID', 'Smiles', 'cdk2_ic50', 'cdk2_inhib', 'cdk_act_bin_1', 'mol_name', 'scaffold', 'sourcepool', 'testing']
12 Generating Depictions >>> from rdkit import Chem >>> from rdkit.chem import AllChem >>> m = Chem.MolFromSmiles('C1CCC1') >>> m.getnumconformers() 0 >>> AllChem.Compute2DCoords(m) 0 >>> m.getnumconformers() 1 >>> print Chem.MolToMolBlock(m) V C C C C M END >>>
13 Generating 3D Coordinates >>> from rdkit import Chem >>> from rdkit.chem import AllChem >>> m = Chem.MolFromSmiles('C1CCC1') >>> AllChem.EmbedMolecule(m) 0 >>> m.getnumconformers() 1 >>> AllChem.UFFOptimizeMolecule(m) 0 >>> m.setprop('_name','testmol') >>> print Chem.MolToMolBlock(m) testmol V C C C C M END >>> list(allchem.embedmultipleconfs(m,10)) [0, 1, 2, 3, 4, 5, 6, 7, 8, 9] >>> m.getnumconformers() 10
14 Low-budget conformational analysis >>> from rdkit import Chem >>> from rdkit.chem import AllChem >>> from rdkit.chem import PyMol >>> v = PyMol.MolViewer() >>> m=chem.molfromsmiles('oc(=o)c1ccc(cc(=o)o)cc1') >>> AllChem.EmbedMultipleConfs(m,10) <rdbase._vectint object at 0x00A2D2B8> >>> for i in range(m.getnumconformers()): AllChem.UFFOptimizeMolecule(m,confId=i) v.showmol(m,confid=i,name='conf-%d'%i,showonly=false) >>> w = Chem.SDWriter('foo.sdf') >>> for i in range(m.getnumconformers()): w.write(m,confid=i) >>> w.flush()
15 Molecular Miscellany >>> from rdkit import Chem >>> from rdkit.chem import Crippen >>> Crippen.MolLogP(Chem.MolFromSmiles('c1ccccn1')) >>> Crippen.MolMR(Chem.MolFromSmiles('c1ccccn1')) >>> AllChem.ShapeTanimotoDist(m1,m2) >>> Chem.Kekulize(m) >>> m = Chem.MolFromSmiles('F[C@H]([Cl])Br') >>> Chem.AssignAtomChiralCodes(m) >>> m.getatomwithidx(1).getprop('_cipcode') 'R' >>> m = Chem.MolFromSmiles(r'F\C=C/Cl') >>> Chem.AssignBondStereoCodes(m) >>> m.getbondwithidx(1).getstereo() Chem.rdchem.BondStereo.STEREOZ
16 Database CLI tools # building a database: % python $RDBASE/Projects/DbCLI/CreateDb.py --dbdir=bzr --molformat=sdf bzr.sdf # similarity searching: % python $RDBASE/Projects/DbCLI/SearchDb.py --dbdir=bzr --molformat=smiles \ --similaritytype=atompairs --topn=5 bzr.smi [18:23:21] INFO: Reading query molecules and generating fingerprints [18:23:21] INFO: Finding Neighbors [18:23:21] INFO: The search took 0.1 seconds [18:23:21] INFO: Creating output Alprazolam,Alprazolam,1.000,Ro ,0.918,Triazolam,0.897,Estazolam,0.871,U-35005,0.870 Bromazepam,Bromazepam,1.000,Ro ,0.801,Ro ,0.801,Nordazepam,0.801,Ro ,0.772 Delorazepam,Delorazepam,1.000,Ro ,0.900,Nordazepam,0.881,Lorazepam,0.855,Ro ,0.840 # substructure and property searching: % python $RDBASE/Projects/DbCLI/SearchDb.py --dbdir=bzr --smarts='c1ncccc1' \ -q 'activity>6.5' --sdfout=search1.sdf [18:27:25] INFO: Doing substructure query [18:27:25] INFO: Found 1 molecules matching the query O [18:27:25] INFO: Creating output Bromazepam NH [18:27:25] INFO: Done! N N Br
17 RDKit Developer's Overview Greg Landrum
18 Installation Download and install boost libraries ( Download distribution from (or get the latest version using subversion) Download and install other python dependencies (see $RDBASE/Docs/SoftwareRequirements.txt) Follow build instructions in $RDBASE/INSTALL
19 Documentation C++: Generated using doxygen: cd $RDBASE/Code doxygen doxygen.config Python: Currently generated using epydoc: cd $RDBASE/rdkit epydoc --config epydoc.config
20 Organization $RDBASE/ External/: essential 3rd party components that are hard to get or have been modified Code/: C++ library, wrapper, and testing code rdkit/: Python library, GUI, and scripts Docs/: Manually and automatically generated documentation Data/: Testing and reference data Scripts/: a couple of useful utility scripts Web/: some simple CGI applications bin/: installation dir for DLLs/shared libraries
21 C++ "Guiding Ideas" If boost already has the wheel, don't re-invent it. Try to keep classes lightweight Maintain const-correctness Test, test, test Include tests with code Keep namespaces explicit (i.e. no using namespace std;) Keep most everything in namespace RDKit Test, test, test
22 Sample test_list.py: Test Harness
23 Wrapping code using BPL: Intro boost::python is not an automatic wrapper generator: wrappers must be written by hand. very flexible handling of "primitive types" tight python type integration allows subclassing of C++ extension classes
24 Wrapping code using BPL: defining a module DataStructs/Wrap/DataStructs.cpp #include <boost/python.hpp> #include <RDBoost/Wrap.h> #include "DataStructs.h" namespace python = boost::python; void wrap_sbv(); void wrap_ebv(); BOOST_PYTHON_MODULE(cDataStructs) { python::scope().attr(" doc ") = "Module containing an assortment of functionality for basic data structures.\n" "\n" "At the moment the data structures defined are:\n" ; python::register_exception_translator<indexerrorexception>(&translate_index_error); python::register_exception_translator<valueerrorexception>(&translate_value_error); wrap_sbv(); wrap_ebv(); }
25 Wrapping code using BPL: defining a class DataStructs/Wrap/wrap_SparseBV.cpp struct SBV_wrapper { static void wrap(){ python::class_<sparsebitvect>("sparsebitvect", "Class documentation", python::init<unsigned int>()).def(python::init<std::string>()).def("setbit",(bool (SBV::*)(unsigned int))&sbv::setbit, "Turns on a particular bit on. Returns the original state of the bit.\n").def("getnumbits",&sbv::getnumbits, "Returns the number of bits in the vector (the vector's size).\n") }; void wrap_sbv() { SBV_wrapper::wrap(); }
26 Wrapping code using BPL: making it "pythonic" DataStructs/Wrap/wrap_SparseBV.cpp struct SBV_wrapper { static void wrap(){ python::class_<sparsebitvect>("sparsebitvect", "Class documentation", python::init<unsigned int>()).def(" len ",&SBV::GetNumBits).def(" getitem ", (const int (*)(const SBV&,unsigned int))get_vectitem).def(" setitem ", (const int (*)(SBV&,unsigned int,int))set_vectitem).def(python::self & python::self).def(python::self python::self).def(python::self ^ python::self).def(~python::self) ; } };
27 Wrapping code using BPL: supporting pickling DataStructs/Wrap/wrap_SparseBV.cpp // allows BitVects to be pickled struct sbv_pickle_suite : python::pickle_suite { static python::tuple getinitargs(const SparseBitVect& self) { return python::make_tuple(self.tostring()); }; }; struct SBV_wrapper { static void wrap(){ python::class_<sparsebitvect>("sparsebitvect", "Class documentation", python::init<unsigned int>()).def_pickle(sbv_pickle_suite()) ; } };
28 Wrapping code using BPL: returning complex types DataStructs/Wrap/wrap_SparseBV.cpp struct SBV_wrapper { static void wrap(){ python::class_<sparsebitvect>("sparsebitvect", "Class documentation", python::init<unsigned int>()).def("getonbits", (IntVect (*)(const SBV&))GetOnBits, "Returns a tuple containing IDs of the on bits.\n") ; } }; IntVect is a std::vector<int> Convertors provided for: std::vector<int>, std::vector<unsigned>, std::vector<string>, std::vector<double>, std::vector< std::vector<int> >, std::vector< std::vector<unsigned> >, std::vector< std::vector<double> >, std::list<int>, std::list< std::vector<int> > in rdbase.dll Others can be created using: RegisterVectorConverter<RDKit::Atom*>(); RegisterListConverter<RDKit::Atom*>();
29 Wrapping code using BPL: accepting Python types DataStructs/Wrap/wrap_SparseBV.cpp void SetBitsFromList(SparseBitVect *bv, python::object onbitlist) { PySequenceHolder<int> bitl(onbitlist); for (unsigned int i = 0; i < bitl.size(); i++) { bv->setbit(bitl[i]); } } struct SBV_wrapper { static void wrap(){ python::class_<sparsebitvect>("sparsebitvect", "Class documentation", python::init<unsigned int>()).def("setbitsfromlist",setbitsfromlist, "Turns on a set of bits. The argument should be a tuple or list of bit ids.\n") ; } };
30 Wrapping code using BPL: keyword arguments GraphMol/Wrap/rdmolfiles.cpp docstring="returns the a Mol block for a molecule\n\ ARGUMENTS:\n\ \n\ - mol: the molecule\n\ - includestereo: (optional) toggles inclusion of stereochemical\n\ information in the output\n\ - confid: (optional) selects which conformation to output (-1 = default)\n\ \n\ RETURNS:\n\ \n\ a string\n\ \n"; python::def("moltomolblock",rdkit::moltomolblock, (python::arg("mol"),python::arg("includestereo")=false, python::arg("confid")=-1), docstring.c_str()); This also demonstrates defining a function and taking a complex (wrapped) type as an argument
31 Wrapping code using BPL: returning pointers GraphMol/Wrap/rdmolfiles.cpp docstring="construct a molecule from a SMILES string.\n\n\ ARGUMENTS:\n\ - SMILES: the smiles string\n\ - sanitize: (optional) toggles sanitization of the molecule.\n\ Defaults to 1.\n\ RETURNS:\n\ a Mol object, None on failure.\n\ \n"; python::def("molfromsmiles",rdkit::molfromsmiles, (python::arg("smiles"), python::arg("sanitize")=true), docstring.c_str(), python::return_value_policy<python::manage_new_object>()); GraphMol/Wrap/Mol.cpp.def("GetAtomWithIdx",(ROMol::GRAPH_NODE_TYPE (ROMol::*)(unsigned int)) &ROMol::getAtomWithIdx, python::return_value_policy<python::reference_existing_object>(), "Returns a particular Atom.\n\n" " ARGUMENTS:\n" " - idx: which Atom to return\n\n" " NOTE: atom indices start at 0\n")
KNIME Enterprise server usage and global deployment at NIBR
KNIME Enterprise server usage and global deployment at NIBR Gregory Landrum, Ph.D. NIBR Informatics Novartis Institutes for BioMedical Research, Basel 8 th KNIME Users Group Meeting Berlin, 26 February
How to create a web-based molecular structure database with free software
How to create a web-based molecular structure database with free software Norbert Haider Department of Drug Synthesis Faculty of Life Sciences, University of Vienna [email protected] small molecules
Predicting outcome of soccer matches using machine learning
Saint-Petersburg State University Mathematics and Mechanics Faculty Albina Yezus Predicting outcome of soccer matches using machine learning Term paper Scientific adviser: Alexander Igoshkin, Yandex Mobile
COSC 6397 Big Data Analytics. Mahout and 3 rd homework assignment. Edgar Gabriel Spring 2014. Mahout
COSC 6397 Big Data Analytics Mahout and 3 rd homework assignment Edgar Gabriel Spring 2014 Mahout Scalable machine learning library Built with MapReduce and Hadoop in mind Written in Java Focusing on three
Table of Contents. The RCS MINI HOWTO
Table of Contents The RCS MINI HOWTO...1 Robert Kiesling...1 1. Overview of RCS...1 2. System requirements...1 3. Compiling RCS from Source...1 4. Creating and maintaining archives...1 5. ci(1) and co(1)...1
Car Insurance. Havránek, Pokorný, Tomášek
Car Insurance Havránek, Pokorný, Tomášek Outline Data overview Horizontal approach + Decision tree/forests Vertical (column) approach + Neural networks SVM Data overview Customers Viewed policies Bought
KITES TECHNOLOGY COURSE MODULE (C, C++, DS)
KITES TECHNOLOGY 360 Degree Solution www.kitestechnology.com/academy.php [email protected] [email protected] Contact: - 8961334776 9433759247 9830639522.NET JAVA WEB DESIGN PHP SQL, PL/SQL
Journée Thématique Big Data 13/03/2015
Journée Thématique Big Data 13/03/2015 1 Agenda About Flaminem What Do We Want To Predict? What Is The Machine Learning Theory Behind It? How Does It Work In Practice? What Is Happening When Data Gets
Slides from INF3331 lectures - web programming in Python
Slides from INF3331 lectures - web programming in Python Joakim Sundnes & Hans Petter Langtangen Dept. of Informatics, Univ. of Oslo & Simula Research Laboratory October 2013 Programming web applications
TEST AUTOMATION FRAMEWORK
TEST AUTOMATION FRAMEWORK Twister Topics Quick introduction Use cases High Level Description Benefits Next steps Twister How to get Twister is an open source test automation framework. The code, user guide
SciTools Understand Flavor for Structure 101g
SciTools Understand Flavor for Structure 101g By Marcio Marchini ([email protected] ) 2014/10/10 1) WHAT IS THE UNDERSTAND FLAVOR FOR STRUCTURE101G? 1 2) WHY VARIOUS FLAVORS, AND NOT JUST ONE?
Cheminformatics and Pharmacophore Modeling, Together at Last
Application Guide Cheminformatics and Pharmacophore Modeling, Together at Last SciTegic Pipeline Pilot Bridging Accord Database Explorer and Discovery Studio Carl Colburn Shikha Varma-O Brien Introduction
Visualizing molecular simulations
Visualizing molecular simulations ChE210D Overview Visualization plays a very important role in molecular simulations: it enables us to develop physical intuition about the behavior of a system that is
Pipeline Pilot Enterprise Server. Flexible Integration of Disparate Data and Applications. Capture and Deployment of Best Practices
overview Pipeline Pilot Enterprise Server Pipeline Pilot Enterprise Server (PPES) is a powerful client-server platform that streamlines the integration and analysis of the vast quantities of data flooding
Sources: On the Web: Slides will be available on:
C programming Introduction The basics of algorithms Structure of a C code, compilation step Constant, variable type, variable scope Expression and operators: assignment, arithmetic operators, comparison,
Apache 2.0 Installation Guide
Apache 2.0 Installation Guide Ryan Spangler [email protected] http://ceut.uww.edu May 2002 Department of Business Education/ Computer and Network Administration Copyright Ryan Spangler 2002 Table of
Version Control with. Ben Morgan
Version Control with Ben Morgan Developer Workflow Log what we did: Add foo support Edit Sources Add Files Compile and Test Logbook ======= 1. Initial version Logbook ======= 1. Initial version 2. Remove
CONTENTS PREFACE 1 INTRODUCTION 1 2 DATA VISUALIZATION 19
PREFACE xi 1 INTRODUCTION 1 1.1 Overview 1 1.2 Definition 1 1.3 Preparation 2 1.3.1 Overview 2 1.3.2 Accessing Tabular Data 3 1.3.3 Accessing Unstructured Data 3 1.3.4 Understanding the Variables and Observations
Data Visualization in Cheminformatics. Simon Xi Computational Sciences CoE Pfizer Cambridge
Data Visualization in Cheminformatics Simon Xi Computational Sciences CoE Pfizer Cambridge My Background Professional Experience Senior Principal Scientist, Computational Sciences CoE, Pfizer Cambridge
Introduction Predictive Analytics Tools: Weka
Introduction Predictive Analytics Tools: Weka Predictive Analytics Center of Excellence San Diego Supercomputer Center University of California, San Diego Tools Landscape Considerations Scale User Interface
Instruction Set Architecture of Mamba, a New Virtual Machine for Python
Instruction Set Architecture of Mamba, a New Virtual Machine for Python David Pereira and John Aycock Department of Computer Science University of Calgary 2500 University Drive N.W. Calgary, Alberta, Canada
Intro to scientific programming (with Python) Pietro Berkes, Brandeis University
Intro to scientific programming (with Python) Pietro Berkes, Brandeis University Next 4 lessons: Outline Scientific programming: best practices Classical learning (Hoepfield network) Probabilistic learning
Analysis report examination with CUBE
Analysis report examination with CUBE Brian Wylie Jülich Supercomputing Centre CUBE Parallel program analysis report exploration tools Libraries for XML report reading & writing Algebra utilities for report
Data Mining Individual Assignment report
Björn Þór Jónsson [email protected] Data Mining Individual Assignment report This report outlines the implementation and results gained from the Data Mining methods of preprocessing, supervised learning, frequent
An Introduction to WEKA. As presented by PACE
An Introduction to WEKA As presented by PACE Download and Install WEKA Website: http://www.cs.waikato.ac.nz/~ml/weka/index.html 2 Content Intro and background Exploring WEKA Data Preparation Creating Models/
Scientific Programming with Python. Randy M. Wadkins, Ph.D. Asst. Prof. of Chemistry & Biochem.
Scientific Programming with Python Randy M. Wadkins, Ph.D. Asst. Prof. of Chemistry & Biochem. What is Python? Python for computers is: a scripting language a programming language interpreted, so it
Machine Learning Final Project Spam Email Filtering
Machine Learning Final Project Spam Email Filtering March 2013 Shahar Yifrah Guy Lev Table of Content 1. OVERVIEW... 3 2. DATASET... 3 2.1 SOURCE... 3 2.2 CREATION OF TRAINING AND TEST SETS... 4 2.3 FEATURE
Amazon Glacier. Developer Guide API Version 2012-06-01
Amazon Glacier Developer Guide Amazon Glacier: Developer Guide Copyright 2016 Amazon Web Services, Inc. and/or its affiliates. All rights reserved. Amazon's trademarks and trade dress may not be used in
2/1/2010. Background Why Python? Getting Our Feet Wet Comparing Python to Java Resources (Including a free textbook!) Hathaway Brown School
Practical Computer Science with Preview Background Why? Getting Our Feet Wet Comparing to Resources (Including a free textbook!) James M. Allen Hathaway Brown School [email protected] Background Hathaway
Version Control with Svn, Git and git-svn. Kate Hedstrom ARSC, UAF
1 Version Control with Svn, Git and git-svn Kate Hedstrom ARSC, UAF 2 Version Control Software System for managing source files For groups of people working on the same code When you need to get back last
Advanced Visualization for Chemistry
Advanced Visualization for Chemistry Part 11 Tools customization Mario Valle March 7 8, 2006 Why we need customization Read new file formats Compute and fuse together new derived quantities Add (computed)
Introduction to Programming and Computing for Scientists
Oxana Smirnova (Lund University) Programming for Scientists Tutorial 7b 1 / 48 Introduction to Programming and Computing for Scientists Oxana Smirnova Lund University Tutorial 7b: Grid certificates and
1 Topic. 2 Scilab. 2.1 What is Scilab?
1 Topic Data Mining with Scilab. I know the name "Scilab" for a long time (http://www.scilab.org/en). For me, it is a tool for numerical analysis. It seemed not interesting in the context of the statistical
How to Design and Create Your Own Custom Ext Rep
Combinatorial Block Designs 2009-04-15 Outline Project Intro External Representation Design Database System Deployment System Overview Conclusions 1. Since the project is a specific application in Combinatorial
metaengine DataConnect For SharePoint 2007 Configuration Guide
metaengine DataConnect For SharePoint 2007 Configuration Guide metaengine DataConnect for SharePoint 2007 Configuration Guide (2.4) Page 1 Contents Introduction... 5 Installation and deployment... 6 Installation...
HPC Wales Skills Academy Course Catalogue 2015
HPC Wales Skills Academy Course Catalogue 2015 Overview The HPC Wales Skills Academy provides a variety of courses and workshops aimed at building skills in High Performance Computing (HPC). Our courses
Open source framework for interactive data exploration in server based architecture
Open source framework for interactive data exploration in server based architecture D5.5 v1.0 WP5 Visual Analytics: D5.5 Open source framework for interactive data exploration in server based architecture
dedupe Documentation Release 1.0.0 Forest Gregg, Derek Eder, and contributors
dedupe Documentation Release 1.0.0 Forest Gregg, Derek Eder, and contributors January 04, 2016 Contents 1 Important links 3 2 Contents 5 2.1 API Documentation...........................................
To reduce or not to reduce, that is the question
To reduce or not to reduce, that is the question 1 Running jobs on the Hadoop cluster For part 1 of assignment 8, you should have gotten the word counting example from class compiling. To start with, let
DATA MINING TOOL FOR INTEGRATED COMPLAINT MANAGEMENT SYSTEM WEKA 3.6.7
DATA MINING TOOL FOR INTEGRATED COMPLAINT MANAGEMENT SYSTEM WEKA 3.6.7 UNDER THE GUIDANCE Dr. N.P. DHAVALE, DGM, INFINET Department SUBMITTED TO INSTITUTE FOR DEVELOPMENT AND RESEARCH IN BANKING TECHNOLOGY
MayaVi: A free tool for CFD data visualization
MayaVi: A free tool for CFD data visualization Prabhu Ramachandran Graduate Student, Dept. Aerospace Engg. IIT Madras, Chennai, 600 036. e mail: [email protected] Keywords: Visualization, CFD data,
This document presents the new features available in ngklast release 4.4 and KServer 4.2.
This document presents the new features available in ngklast release 4.4 and KServer 4.2. 1) KLAST search engine optimization ngklast comes with an updated release of the KLAST sequence comparison tool.
The MANTIS Framework Cyber-Threat Intelligence Mgmt. for CERTs Siemens AG 2014. All rights reserved
B. Grobauer, S.Berger, J. Göbel, T. Schreck, J. Wallinger Siemens CERT The MANTIS Framework Cyber-Threat Intelligence Mgmt. for CERTs Note MANTIS is available as Open Source under GPL v2+ from https://github.com/siemens/django-mantis
Why are we teaching you VisIt?
VisIt Tutorial Why are we teaching you VisIt? Interactive (GUI) Visualization and Analysis tool Multiplatform, Free and Open Source The interface looks the same whether you run locally or remotely, serial
Python for Chemistry in 21 days
minutes Python for Chemistry in 21 days Dr. Noel O'Boyle Dr. John Mitchell and Prof. Peter Murray-Rust UCC Talk, Sept 2005 Available at http://www-mitchell.ch.cam.ac.uk/noel/ Introduction This talk will
Survey of Unit-Testing Frameworks. by John Szakmeister and Tim Woods
Survey of Unit-Testing Frameworks by John Szakmeister and Tim Woods Our Background Using Python for 7 years Unit-testing fanatics for 5 years Agenda Why unit test? Talk about 3 frameworks: unittest nose
The C Programming Language course syllabus associate level
TECHNOLOGIES The C Programming Language course syllabus associate level Course description The course fully covers the basics of programming in the C programming language and demonstrates fundamental programming
C++ Programming Language
C++ Programming Language Lecturer: Yuri Nefedov 7th and 8th semesters Lectures: 34 hours (7th semester); 32 hours (8th semester). Seminars: 34 hours (7th semester); 32 hours (8th semester). Course abstract
Analysis Tools and Libraries for BigData
+ Analysis Tools and Libraries for BigData Lecture 02 Abhijit Bendale + Office Hours 2 n Terry Boult (Waiting to Confirm) n Abhijit Bendale (Tue 2:45 to 4:45 pm). Best if you email me in advance, but I
Code Estimation Tools Directions for a Services Engagement
Code Estimation Tools Directions for a Services Engagement Summary Black Duck software provides two tools to calculate size, number, and category of files in a code base. This information is necessary
Why you shouldn't use set (and what you should use instead) Matt Austern
Why you shouldn't use set (and what you should use instead) Matt Austern Everything in the standard C++ library is there for a reason, but it isn't always obvious what that reason is. The standard isn't
Visualization Cluster Getting Started
Visualization Cluster Getting Started Contents 1 Introduction to the Visualization Cluster... 1 2 Visualization Cluster hardware and software... 2 3 Remote visualization session through VNC... 2 4 Starting
Feature vs. Classifier Fusion for Predictive Data Mining a Case Study in Pesticide Classification
Feature vs. Classifier Fusion for Predictive Data Mining a Case Study in Pesticide Classification Henrik Boström School of Humanities and Informatics University of Skövde P.O. Box 408, SE-541 28 Skövde
BIDM Project. Predicting the contract type for IT/ITES outsourcing contracts
BIDM Project Predicting the contract type for IT/ITES outsourcing contracts N a n d i n i G o v i n d a r a j a n ( 6 1 2 1 0 5 5 6 ) The authors believe that data modelling can be used to predict if an
ORACLE NOSQL DATABASE HANDS-ON WORKSHOP Cluster Deployment and Management
ORACLE NOSQL DATABASE HANDS-ON WORKSHOP Cluster Deployment and Management Lab Exercise 1 Deploy 3x3 NoSQL Cluster into single Datacenters Objective: Learn from your experience how simple and intuitive
B&K Precision 1785B, 1786B, 1787B, 1788 Power supply Python Library
B&K Precision 1785B, 1786B, 1787B, 1788 Power supply Python Library Table of Contents Introduction 2 Prerequisites 2 Why a Library is Useful 3 Using the Library from Python 6 Conventions 6 Return values
1 Choosing the right data mining techniques for the job (8 minutes,
CS490D Spring 2004 Final Solutions, May 3, 2004 Prof. Chris Clifton Time will be tight. If you spend more than the recommended time on any question, go on to the next one. If you can t answer it in the
Car Insurance. Prvák, Tomi, Havri
Car Insurance Prvák, Tomi, Havri Sumo report - expectations Sumo report - reality Bc. Jan Tomášek Deeper look into data set Column approach Reminder What the hell is this competition about??? Attributes
Bacula The Network Backup Solution
Bacula The Network Backup Solution Presented by Kern Sibbald at BSDCan 17 May 2008 in Ottawa Bacula the Network Backup Tool for *BSD, Linux, Mac, Unix and Windows Open Source Project Bacula is a network
Automation Tools for UCS Sysadmins
Automation Tools for UCS Sysadmins Eric Williams Technical Marketing Engineer What is the Cisco UCS XML API? Cisco Unified Computing System Optimized and Designed as an Integrated System Cisco UCS Manager
How To Use Postgresql With Foreign Data In A Foreign Server In A Multi Threaded Database (Postgres)
PostgreSQL at the centre of your dataverse! PGBR 2011! Presented by Dave Page! 3 rd November 2011! EnterpriseDB, Postgres Plus and Dynatune are trademarks of EnterpriseDB Corporation. Other names may be
Programming Using Python
Introduction to Computation and Programming Using Python Revised and Expanded Edition John V. Guttag The MIT Press Cambridge, Massachusetts London, England CONTENTS PREFACE xiii ACKNOWLEDGMENTS xv 1 GETTING
Introduction to Pig. Content developed and presented by: 2009 Cloudera, Inc.
Introduction to Pig Content developed and presented by: Outline Motivation Background Components How it Works with Map Reduce Pig Latin by Example Wrap up & Conclusions Motivation Map Reduce is very powerful,
EMBL-EBI. EM map display & Visualization
EM map display & Visualization Introduction Biological data comes from & is of interest to: Chemists : reaction mechanism, drug design Biologists : sequence, expression, homology, function. Structure biologists
Structure Tools and Visualization
Structure Tools and Visualization Gary Van Domselaar University of Alberta [email protected] Slides Adapted from Michel Dumontier, Blueprint Initiative 1 Visualization & Communication Visualization
Caml Virtual Machine File & data formats Document version: 1.4 http://cadmium.x9c.fr
Caml Virtual Machine File & data formats Document version: 1.4 http://cadmium.x9c.fr Copyright c 2007-2010 Xavier Clerc [email protected] Released under the LGPL version 3 February 6, 2010 Abstract: This
CORBA Programming with TAOX11. The C++11 CORBA Implementation
CORBA Programming with TAOX11 The C++11 CORBA Implementation TAOX11: the CORBA Implementation by Remedy IT TAOX11 simplifies development of CORBA based applications IDL to C++11 language mapping is easy
CDD user guide. PsN 4.4.8. Revised 2015-02-23
CDD user guide PsN 4.4.8 Revised 2015-02-23 1 Introduction The Case Deletions Diagnostics (CDD) algorithm is a tool primarily used to identify influential components of the dataset, usually individuals.
Introduction to Python
Introduction to Python Sophia Bethany Coban Problem Solving By Computer March 26, 2014 Introduction to Python Python is a general-purpose, high-level programming language. It offers readable codes, and
How To Understand How Weka Works
More Data Mining with Weka Class 1 Lesson 1 Introduction Ian H. Witten Department of Computer Science University of Waikato New Zealand weka.waikato.ac.nz More Data Mining with Weka a practical course
Bacula. The leading Opensource Backup Solution
Bacula The leading Opensource Backup Solution OpenSource Project Bacula is a network backup solution, designed for *BSD, Linux, Mac OS X, Unix and Windows systems. Original project goals were to: backup
Developing a Web Server Platform with SAPI Support for AJAX RPC using JSON
Revista Informatica Economică, nr. 4 (44)/2007 45 Developing a Web Server Platform with SAPI Support for AJAX RPC using JSON Iulian ILIE-NEMEDI, Bucharest, Romania, [email protected] Writing a custom web
Automate Your Deployment with Bamboo, Drush and Features DrupalCamp Scotland, 9 th 10 th May 2014
This presentation was originally given at DrupalCamp Scotland, 2014. http://camp.drupalscotland.org/ The University of Edinburgh 1 We are 2 of the developers working on the University s ongoing project
latest Release 0.2.6
latest Release 0.2.6 August 19, 2015 Contents 1 Installation 3 2 Configuration 5 3 Django Integration 7 4 Stand-Alone Web Client 9 5 Daemon Mode 11 6 IRC Bots 13 7 Bot Events 15 8 Channel Events 17 9
Hadoop MapReduce and Spark. Giorgio Pedrazzi, CINECA-SCAI School of Data Analytics and Visualisation Milan, 10/06/2015
Hadoop MapReduce and Spark Giorgio Pedrazzi, CINECA-SCAI School of Data Analytics and Visualisation Milan, 10/06/2015 Outline Hadoop Hadoop Import data on Hadoop Spark Spark features Scala MLlib MLlib
Gillcup Documentation
Gillcup Documentation Release 0.2.1 Petr Viktorin 2015-07-19 Contents 1 Introduction 3 1.1 Version warning............................................. 3 1.2 The Project................................................
CS 1133, LAB 2: FUNCTIONS AND TESTING http://www.cs.cornell.edu/courses/cs1133/2015fa/labs/lab02.pdf
CS 1133, LAB 2: FUNCTIONS AND TESTING http://www.cs.cornell.edu/courses/cs1133/2015fa/labs/lab02.pdf First Name: Last Name: NetID: The purpose of this lab is to help you to better understand functions:
MACHINE LEARNING IN HIGH ENERGY PHYSICS
MACHINE LEARNING IN HIGH ENERGY PHYSICS LECTURE #1 Alex Rogozhnikov, 2015 INTRO NOTES 4 days two lectures, two practice seminars every day this is introductory track to machine learning kaggle competition!
Prediction of Car Prices of Federal Auctions
Prediction of Car Prices of Federal Auctions BUDT733- Final Project Report Tetsuya Morito Karen Pereira Jung-Fu Su Mahsa Saedirad 1 Executive Summary The goal of this project is to provide buyers who attend
Building a Basic Communication Network using XBee DigiMesh. Keywords: XBee, Networking, Zigbee, Digimesh, Mesh, Python, Smart Home
Building a Basic Communication Network using XBee DigiMesh Jennifer Byford April 5, 2013 Keywords: XBee, Networking, Zigbee, Digimesh, Mesh, Python, Smart Home Abstract: Using Digi International s in-house
Developing an ODBC C++ Client with MySQL Database
Developing an ODBC C++ Client with MySQL Database Author: Rajinder Yadav Date: Aug 21, 2007 Web: http://devmentor.org Email: [email protected] Assumptions I am going to assume you already know how
DATA SCIENCE CURRICULUM WEEK 1 ONLINE PRE-WORK INSTALLING PACKAGES COMMAND LINE CODE EDITOR PYTHON STATISTICS PROJECT O5 PROJECT O3 PROJECT O2
DATA SCIENCE CURRICULUM Before class even begins, students start an at-home pre-work phase. When they convene in class, students spend the first eight weeks doing iterative, project-centered skill acquisition.
Hadoop Distributed File System Propagation Adapter for Nimbus
University of Victoria Faculty of Engineering Coop Workterm Report Hadoop Distributed File System Propagation Adapter for Nimbus Department of Physics University of Victoria Victoria, BC Matthew Vliet
I don t intend to cover Python installation, please visit the Python web site for details.
Python Related Information I don t intend to cover Python installation, please visit the Python web site for details. http://www.python.org/ Before you start to use the Python Interface plugin make sure
Fingerprint-Based Virtual Screening Using Multiple Bioactive Reference Structures
Fingerprint-Based Virtual Screening Using Multiple Bioactive Reference Structures Jérôme Hert, Peter Willett and David J. Wilton (University of Sheffield, Sheffield, UK) Pierre Acklin, Kamal Azzaoui, Edgar
Cheminformatics and its Role in the Modern Drug Discovery Process
Cheminformatics and its Role in the Modern Drug Discovery Process Novartis Institutes for BioMedical Research Basel, Switzerland With thanks to my colleagues: J. Mühlbacher, B. Rohde, A. Schuffenhauer
Setting up an online Java Jmonitor. server using the. EXPERIMENTAL code from. John Melton GØORX/N6LYT
Setting up an online Java Jmonitor server using the EXPERIMENTAL code from John Melton GØORX/N6LYT This is NOT a commercial endeavor. John created jmonitor as an one EXPERIMENT along his path toward multiple
Illustration 1: Diagram of program function and data flow
The contract called for creation of a random access database of plumbing shops within the near perimeter of FIU Engineering school. The database features a rating number from 1-10 to offer a guideline
Exercise 0. Although Python(x,y) comes already with a great variety of scientic Python packages, we might have to install additional dependencies:
Exercise 0 Deadline: None Computer Setup Windows Download Python(x,y) via http://code.google.com/p/pythonxy/wiki/downloads and install it. Make sure that before installation the installer does not complain
Recognition. Sanja Fidler CSC420: Intro to Image Understanding 1 / 28
Recognition Topics that we will try to cover: Indexing for fast retrieval (we still owe this one) History of recognition techniques Object classification Bag-of-words Spatial pyramids Neural Networks Object
1 Abstract Data Types Information Hiding
1 1 Abstract Data Types Information Hiding 1.1 Data Types Data types are an integral part of every programming language. ANSI-C has int, double and char to name just a few. Programmers are rarely content
