Introduction to urika
Multithreading
SPARQL Database
urika Appliance
XMT-2 Programming
Use Cases
MTA-1 (1998)
  Gallium arsenide: proof of concept
  First production implementation of latency-tolerant multithreading
MTA-2 (2002)
  Transition of the microprocessor to CMOS
  Invention of scaling memory in a multithreading system
XMT-1 (2008)
  Integrated MTA and main Cray product line (XT)
  First hybrid system to support x86 and multithreading CPUs
XMT-2 (2011)
  Made system ready for enterprise production
  Largest shared memory machine in the world
  First installation: CSCS, May 2011
YarcData urika (2012)
  Optimized for scalable graph analytics
Uses the same cabinets, boards, scalable interconnect, I/O and storage infrastructure, user environment, and administrative tools; just changes the processor
Cabinet
  24 blades per cabinet
  Vertical airflow with optional liquid assist
Compute blades
  4 Threadstorm processors
  16-64 GB per processor
Cray XT service and I/O subsystem
  PCIe connections to storage and networks
  Scalable Lustre global file system
Cray XT high-speed 3D torus network
Cray XT power and RAS systems
Linux-based user environment
Social Networking
Intelligence/Security
Telecom/Mobile
Life Sciences/Biology
Healthcare/Medicine
Finance
Internet/WWW
Supply Chain
Big Data: Structured, Semi-structured, Unstructured
Scale-out Analytics
  Unstructured data
  Highly partitionable datasets
  Keyword search
  Example: Hadoop, column stores
Data Warehouses, BI Tools
  Structured data
  OLAP cubes
  Regular queries, known variables
  Example: Oracle Exadata
In-memory Analytics
  Structured and unstructured data
  Non-partitionable datasets
  Complex queries (pattern matching)
  Example: YarcData urika
Graphs are hard to partition
  High cost to follow relationships that span cluster nodes
  Network is 100 times slower than memory*
Graphs are not predictable
  High cost to follow multiple competing paths which cannot be prefetched/cached
  Memory is 100 times slower than processor*
Graphs are highly dynamic
  High cost to load multiple, constantly changing datasets into in-memory graph models
  Storage I/O is 1000 times slower than memory I/O*
*Source: Hennessy, J. and Patterson, D., Computer Architecture: A Quantitative Approach, 2012 edition
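To make the partitioning cost on this slide concrete, the following minimal sketch counts how many edges cross cluster nodes when vertices are placed at random. Random placement, the synthetic graph, and the function names `cut_fraction` and `random_partition` are assumptions chosen for illustration; they are not taken from the slides. Under random placement on k nodes, roughly a fraction 1 - 1/k of edges is expected to cross a node boundary, and each such edge pays the network latency penalty quoted above.

```python
import random


def cut_fraction(edges, owner):
    """Return the fraction of edges whose endpoints live on different cluster nodes."""
    cut = sum(1 for u, v in edges if owner[u] != owner[v])
    return cut / len(edges)


def random_partition(vertices, num_nodes, seed=0):
    """Assign every vertex to a cluster node uniformly at random (illustrative only)."""
    rng = random.Random(seed)
    return {v: rng.randrange(num_nodes) for v in vertices}


if __name__ == "__main__":
    # Hypothetical synthetic graph: 10,000 vertices, 50,000 random edges.
    rng = random.Random(1)
    num_vertices = 10_000
    edges = [(rng.randrange(num_vertices), rng.randrange(num_vertices))
             for _ in range(50_000)]
    for k in (2, 8, 64):
        owner = random_partition(range(num_vertices), k)
        print(f"{k} nodes: cut fraction {cut_fraction(edges, owner):.3f}, "
              f"expected about {1 - 1 / k:.3f}")
```

A real partitioner can do better than random placement on graphs with strong community structure; the slide's point is that many graph problems lack such structure.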
Graphs are hard to partition
  Large shared memory: up to 512 TB
Graphs are not predictable
  Massively multi-threaded: 128 threads/processor
Graphs are highly dynamic
  Highly scalable I/O: up to 350 TB/hr
Real-time, interactive analytics on Big Data graph problems
Cray XMT-2 Hardware Engine
  Originally designed for deep analysis of large datasets
  Very large scalable shared memory
    Architecture can support 512 TB shared memory
    Typical systems are 2 TB to 32 TB
  Multithreading
    Unique highly multithreaded architecture
    128 hardware threads per processor
    Extreme parallelism, hides memory latency
Multithreaded Graph Database
  Highly parallel in-memory RDF/SPARQL quad store
  High performance inference engine
  High performance parallel I/O
Industry Standard Front End
  Based on Jena open source semantic DB
  All standard Linux infrastructure and languages
  Lustre parallel file system
Relative latency to memory continues to increase
  Vector processors amortize memory latency
  Cache-based microprocessors reduce memory latency
  Multithreaded processors tolerate memory latency
Multithreading is most effective when:
  Parallelism is abundant
  Data locality is scarce
Large graph problems perform well on this architecture
  Graph analysis
  Semantic databases
  Social network analysis
Many threads per processor core; small thread state
Thread-level context switch at every instruction cycle
[Figure: conventional processor vs. multithreaded processor; labels: registers, ALU, stream, program counter]
Conventional processor: when one or a few threads stall, memory/network bandwidth becomes idle
Multithreaded processor: although some threads stall, others keep issuing local/remote memory requests, keeping the most precious resources busy
[Figure: each processor shown with its memory and network connections]
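The contrast on this slide can be illustrated with a deliberately simplified utilization model. It assumes that each thread issues one instruction and then waits a fixed number of cycles for memory before it can issue again; the function name `utilization` and the 99-cycle latency are hypothetical values for illustration, not measured XMT-2 figures.

```python
def utilization(threads, latency_cycles):
    """Fraction of cycles in which the processor issues an instruction.

    Simplified model (an assumption): every instruction is a memory
    reference that blocks its thread for latency_cycles cycles, and the
    processor can switch to any ready thread at no cost.
    """
    if threads < 1 or latency_cycles < 0:
        raise ValueError("threads must be >= 1 and latency_cycles >= 0")
    # One thread keeps the processor busy 1 cycle out of every (1 + latency).
    return min(1.0, threads / (1 + latency_cycles))


if __name__ == "__main__":
    latency = 99  # hypothetical memory latency in cycles
    for n in (1, 8, 32, 128):
        print(f"{n:3d} threads: utilization {utilization(n, latency):.2f}")
```

In this model a single stalled thread leaves the processor almost entirely idle, while enough concurrent threads cover the latency completely, which is the behavior the slide attributes to the multithreaded processor.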
To the programmer, a multiple-processor XMT-2 looks like a single processor, except that the number of threads is increased. (Bokhari/Sauer 2003)
RDF triples databases are inherently graphical
Some researchers call semantic databases "semantic graph databases"
[Figure: example graph; labels: Vancouver, in-country, Canada, is-type, country, object, Edmonton, has-population, 730,000, North America]
Resource Description Framework (RDF)
  Designed to enable semantic web searching and integration of disparate data sources
  W3C standard formats
  Every datum represented as subject/predicate/object
    Ideally with each of those expressed with a URI
  Standard ontologies in some domains
    e.g., Open Biological and Biomedical Ontologies (OBO)
Examples:
  subject                      predicate        object
  <ncbitax:ncbitaxon_840261>   rdf:type         owl:class
  <ncbitax:ncbitaxon_195644>   rdfs:subclassof  <ncbitax:ncbitaxon_185881>
  <ncbitax:ncbitaxon_816681>   rdfs:label       "Characiformes sp. BOLD:AAG5151"@en
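As a minimal sketch of the subject/predicate/object model, the three example triples above can be held as plain tuples and queried with wildcards. The `Triple` type and the `match` helper are hypothetical names introduced for illustration; a real RDF store would also handle URIs, literals with language tags, and indexing.

```python
from typing import List, NamedTuple, Optional


class Triple(NamedTuple):
    subject: str
    predicate: str
    obj: str


# The three example triples from the slide, copied as written there.
TRIPLES = [
    Triple("ncbitax:ncbitaxon_840261", "rdf:type", "owl:class"),
    Triple("ncbitax:ncbitaxon_195644", "rdfs:subclassof", "ncbitax:ncbitaxon_185881"),
    Triple("ncbitax:ncbitaxon_816681", "rdfs:label", '"Characiformes sp. BOLD:AAG5151"@en'),
]


def match(triples: List[Triple],
          subject: Optional[str] = None,
          predicate: Optional[str] = None,
          obj: Optional[str] = None) -> List[Triple]:
    """Return the triples that agree with every given field; None is a wildcard."""
    return [
        t for t in triples
        if (subject is None or t.subject == subject)
        and (predicate is None or t.predicate == predicate)
        and (obj is None or t.obj == obj)
    ]


if __name__ == "__main__":
    # Which taxa are declared as subclasses of something?
    for t in match(TRIPLES, predicate="rdfs:subclassof"):
        print(t.subject, "is a subclass of", t.obj)
```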
SPARQL Protocol and RDF Query Language (SPARQL)
  Enables matching of graph patterns in the semantic DB
  Reminiscent of SQL

  # Lehigh University BenchMark (LUBM) Query 9
  PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
  PREFIX ub: <http://www.lehigh.edu/~zhp2/2004/0401/univ-bench.owl#>
  SELECT ?X ?Y ?Z
  WHERE {
    ?X rdf:type ub:Student .
    ?Y rdf:type ub:Faculty .
    ?Z rdf:type ub:Course .
    ?X ub:advisor ?Y .
    ?Y ub:teacherOf ?Z .
    ?X ub:takesCourse ?Z
  }

  PREFIX: shorthand for a URI
  SELECT: variables to be returned from the query
  WHERE: find sets of (X, Y, Z) with a subject X of type Student, a subject Y of type Faculty, and a subject Z of type Course, where X is an advisee of Y, Y teaches course Z, and X takes course Z
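To show what matching a graph pattern means for a query like LUBM Query 9, here is a naive basic-graph-pattern matcher in plain Python. It is a sketch only: the data set and the names `match_pattern` and `solve` are hypothetical, and the nested-loop join shown here is not how urika or any production SPARQL engine evaluates queries.

```python
def is_var(term):
    """SPARQL-style variables start with '?'."""
    return term.startswith("?")


def match_pattern(pattern, triple, binding):
    """Extend binding so pattern matches triple, or return None on a conflict."""
    new = dict(binding)
    for p, t in zip(pattern, triple):
        if is_var(p):
            if new.setdefault(p, t) != t:  # variable already bound to another value
                return None
        elif p != t:  # constant term does not match
            return None
    return new


def solve(patterns, triples):
    """Join all triple patterns; return one dict of bindings per solution."""
    bindings = [{}]
    for pattern in patterns:
        extended = []
        for b in bindings:
            for t in triples:
                nb = match_pattern(pattern, t, b)
                if nb is not None:
                    extended.append(nb)
        bindings = extended
    return bindings


# Hypothetical miniature data set in the LUBM vocabulary.
DATA = [
    ("alice", "rdf:type", "ub:Student"), ("bob", "rdf:type", "ub:Student"),
    ("prof_kim", "rdf:type", "ub:Faculty"), ("prof_lee", "rdf:type", "ub:Faculty"),
    ("graphs101", "rdf:type", "ub:Course"), ("db201", "rdf:type", "ub:Course"),
    ("alice", "ub:advisor", "prof_kim"), ("bob", "ub:advisor", "prof_lee"),
    ("prof_kim", "ub:teacherOf", "graphs101"), ("prof_lee", "ub:teacherOf", "db201"),
    ("alice", "ub:takesCourse", "graphs101"), ("bob", "ub:takesCourse", "graphs101"),
]

QUERY_9 = [
    ("?X", "rdf:type", "ub:Student"), ("?Y", "rdf:type", "ub:Faculty"),
    ("?Z", "rdf:type", "ub:Course"), ("?X", "ub:advisor", "?Y"),
    ("?Y", "ub:teacherOf", "?Z"), ("?X", "ub:takesCourse", "?Z"),
]

if __name__ == "__main__":
    for row in solve(QUERY_9, DATA):
        print(row["?X"], row["?Y"], row["?Z"])
```

Only a student who takes a course taught by their own advisor closes the triangle, so bob (advised by prof_lee but taking prof_kim's course) is excluded.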
[Figure: urika software stack; labels: RDF, SPARQL, Java, Linux, Apache Tomcat/Jena, Gadgets/Mashups]
Graph Analytic Tools
  Management, Security, Data Pipeline
Graph Analytic Database
  RDF datastore and SPARQL engine
Graph Analytic Platform
  Shared-memory, multi-threaded, scalable I/O graph-optimized hardware
Data center sees another Linux server
Applications see industry standard interface
  RDF, SPARQL, Java, Gadgets
Reuse existing skillsets
  Java, OSGI, App Server, SOA, ESB, Web toolkit
Standard languages
  All applications and artifacts built on urika can be run on other platforms
SELECT ?p ?x WHERE { (?x type person) (?p sells DVD) (?x shops-at ?p) }
[Figure: query flow between urika service nodes and urika compute nodes; labels: Application, Jena/Fuseki HTTP interface (interface to Web), API, SPARQL engine, database, BGP, SPARQL ingest, ORDER, DISTINCT, OPTIONAL, UNION, FILTER]
Labeled steps: send query; parse, interpret query; set up query engine; receive results; translate, send results; generate low-level query
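From the application side, the "send query" and "receive results" steps can be sketched with the standard SPARQL Protocol over HTTP, which a Fuseki-style endpoint accepts. The endpoint URL, the `ex:` prefix, and the helper names below are hypothetical; the sketch only builds the request and parses a results document, it does not contact a server.

```python
import json
from urllib.parse import urlencode
from urllib.request import Request

# Hypothetical endpoint; a real deployment defines its own host and dataset path.
ENDPOINT = "http://urika.example.com:3030/ds/query"

# The slide's example rewritten in standard SPARQL with an assumed ex: vocabulary.
QUERY = """PREFIX ex: <http://example.org/>
SELECT ?p ?x WHERE { ?x a ex:Person . ?p ex:sells ex:DVD . ?x ex:shopsAt ?p }"""


def build_query_request(endpoint, query):
    """Build a SPARQL Protocol GET request asking for JSON results."""
    url = endpoint + "?" + urlencode({"query": query})
    return Request(url, headers={"Accept": "application/sparql-results+json"})


def bindings_to_rows(results_json):
    """Flatten a SPARQL JSON results document into tuples in head.vars order."""
    doc = json.loads(results_json)
    variables = doc["head"]["vars"]
    return [
        tuple(b[v]["value"] if v in b else None for v in variables)
        for b in doc["results"]["bindings"]
    ]


if __name__ == "__main__":
    request = build_query_request(ENDPOINT, QUERY)
    print(request.get_method(), request.full_url)
    # To actually send it: urllib.request.urlopen(request).read()
```

Variables left unbound in a solution (for example under OPTIONAL) are absent from that binding, which is why `bindings_to_rows` fills them with `None`.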
Existing Analytic Environment(s): Data Warehouse, Hadoop, Other Big Data Appliances
Import graph datasets
Export analytic/relationship results
User/App visualization
Automatic parallelizing C/C++ compiler
  Compiler runs on Linux login node
  The executable runs on the compute nodes
  Provides automatic parallelism if proper options are included
  Parallelization directives are used to provide hints to the compiler
Login nodes run standard SUSE (SLES 11) Linux
  All usual Linux languages and applications may be used
  A hybrid programming model is possible, and encouraged
Lightweight User Communication (LUC) interface
  Use LUC to build a client/server interface between the front-end and back-end
  Symmetric; RPC-style interface in either direction
Lustre global file system
  High-performance parallel file system used across Cray line
Primary responsibility for machine virtualization.
  Maps simple programming model (flat memory, infinite processors) to actual hardware.
  Performs general management of resources such as processors and hardware streams.
Responsible for thread management.
  Thread creation/destruction/blocking/unblocking.
  Work scheduling: automatic intra-process load balancing.
  Handles hardware traps and OS-induced swaps.
Provides a highly optimized parallel memory allocator.
Provides support for auxiliary functions.
  Tracing and profiling.
  Client-side debugging.
Apprentice2
  GUI application for debugging performance problems
  Consists of one or more reports based on how your program was compiled and executed
Canal Report
  Feedback from the compiler
  Insight on how latent parallelism was exploited
  Information on expected resource utilization and scheduling
Tview Report
  Hardware counter plots
  Actual performance of your application
  Runtime trap information for detecting hotspots
Bprof Report
  Profile tables in terms of instructions issued and memory references
Discover new drugs and treatments
The Challenge
  Multiple massive datasets describing biological network graphs in cancer cells from published literature and experimental data, constantly updated
  Un-partitionable, densely and irregularly connected graphs
  Medical research churns out massive quantities of literature, but most researchers don't have time to read even their own specialty publications
    Over 10,000 genomics publications per month currently
urika Solution
  urika holds un-partitioned genomic network graph in memory
  Contrast experimental models and theories with published results to discover previously unknown relationships
  Interactive, real-time access by multiple researchers
[Figure: Confirmation of elevated VEGF levels by tissue microarray]
The Cancer Genome Atlas (TCGA) data combined with literature relationships from Medline
  Protein domain interactions are extracted from TCGA using Random Forest classification
  A Normalized Medline Distance (NMD) between proteins is calculated from Medline titles and abstracts
  Protein data from UniProt and Pfam are also included
1.2 billion entries that generate 6B triples
New data sets constantly being added: mutation data, patient data, methylation data, statistical analysis results and more
Current database technology does not have the performance for useful full-scale queries across this data
[Figure: graph query spanning Mutation, Methylation, Genes (TCGA), Protein Data (UniProt), Literature (Medline), Protein Families (PFAM), Correlations (Statistical), Patients (Health Records)]
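The slide does not define the Normalized Medline Distance. The sketch below assumes it follows the same form as the Normalized Google Distance of Cilibrasi and Vitanyi, with Medline document counts in place of web page counts; that assumption, the function name, and all counts in the example are hypothetical and not taken from the slides.

```python
import math


def normalized_distance(fx, fy, fxy, n):
    """Co-occurrence distance in the Normalized Google Distance form.

    fx, fy: number of documents mentioning term x, term y (must be > 0)
    fxy:    number of documents mentioning both terms
    n:      total number of documents in the corpus (must exceed fx and fy)
    Returns 0.0 when the terms always co-occur, infinity when they never do.
    """
    if fx <= 0 or fy <= 0 or fxy < 0 or n <= max(fx, fy):
        raise ValueError("counts must satisfy 0 < fx, fy < n and fxy >= 0")
    if fxy == 0:
        return math.inf
    log_fx, log_fy, log_fxy = math.log(fx), math.log(fy), math.log(fxy)
    return (max(log_fx, log_fy) - log_fxy) / (math.log(n) - min(log_fx, log_fy))


if __name__ == "__main__":
    corpus = 20_000_000  # hypothetical number of abstracts
    # Two hypothetical proteins: a frequently co-mentioned pair and a rarely co-mentioned pair.
    print(normalized_distance(5_000, 8_000, 2_000, corpus))
    print(normalized_distance(5_000, 8_000, 10, corpus))
```

Smaller values indicate proteins that are mentioned together more often than their individual frequencies would suggest, which is the kind of literature relationship the slide describes adding to the graph.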
Quickly adapt surveillance to new rules and patterns
The Challenge
  Continuous stream of possible compliance events
  New compliance rules require constant updates of signals and patterns
  Current solutions require offline preparation and are based on rigid rules
urika Solution
  Entire relationship graph in memory
  New patterns/templates can be identified and added in real-time
  Graphical interactive exploration of relationships between people, places, things, organizations, communications, etc.
Business Value
  A responsive, flexible event detection platform that adapts to new knowledge
Figure source: "Graph-based technologies for intelligence analysis", T. Coffman, S. Greenblatt, S. Marcus, Commun. ACM, 47(3):45-47, 2004
Deploy better trading strategies with enhanced market sensing
The Challenge
  Trading strategies that can react quickly and continuously to market drivers
  Market drivers are news and other events which may impact current trading outlook, positions
  Enhanced market sensing requires semantic analysis of news and a scalable graph-based model to implement
urika Solution
  Ability to hold entire market model in-memory as a dependency graph
  Ability to filter news relevant to positions based on semantic analysis
  Interactive, real-time access from trader workstations
Business Value
  Better trading strategies that can be based on augmented market data including news and other secondary events of relevance
Discover similar patients to optimize treatment
The Challenge
  Longitudinal, historical data spanning all events, symptoms, diagnoses, diseases, treatments, prescriptions, etc. of 10M patients including genetics and family history
  Ad-hoc, constantly changing definition of similarity based on thousands of parameters
  Interactive, real-time response during consultation
urika Solution
  urika holds entire relationship graph in memory
  Identify similar patients based on ad-hoc physician-specified patterns
  Interactive, real-time access by entire physician community
Business Value
  Consistent selection of the most effective treatment for each patient by each doctor every time
[Figure: example patient record for "Jean Generic"; labels: Blood pressure, Prior myocardial infarction, Hypertension, Body mass index, HDL, Anti-hypertension meds]
SPARQL by Example tutorial, by Lee Feigenbaum, http://www.cambridgesemantics.com/2008/09/sparql-by-example/
Search RDF data with SPARQL, by Philip McCarthy, http://www.ibm.com/developerworks/xml/library/j-sparql/
Semantic Web for the Working Ontologist, by Dean Allemang and James Hendler, ISBN 978-0123859655
Learning SPARQL, by Bob DuCharme, O'Reilly, ISBN 978-1-449-30659-5
RDF: www.w3.org/rdf/
SPARQL: www.w3.org/tr/rdf-sparql-query/
D2R: www4.wiwiss.fu-berlin.de/bizer/d2r-server