How To Create A Data Visualization Engine For Your Computer Or Your Brain
|
|
|
- Eugene Black
- 5 years ago
- Views:
Transcription
1 Open Source Visualization with OpenGraphiti Thibault Reuille Abstract Andrew Hay The way a human efficiently digests information varies from person to person. Scientific studies have shown that some people learn better through the presentation of visual/spatial information compared to simply reading text. Why then, do vendors expect customers to consume presented data following only the written word method, as opposed to advanced graphical representations of the data? We believe the written approach is dated. To help visually inclined forensic analysts, incident responders, researchers, and data scientists, we decided to create a free and Open Source engine to remove the complexity of creating advanced data visualizations. The ultimate goal of the project was to allow for the visualization of any loosely related data without having to endlessly reformat that data. For the visual/spatial learners the engine will interpret their data, whether it be a simple or complex system, and present the results in a way that their brains can understand. Learning for visual spatial learners takes place all at once, with large chunks of information grasped in intuitive leaps, rather than in the gradual accretion of isolated facts or small steps. For example, a visual spatial learner can grasp all of the multiplication
2 facts as a related set in a chart much easier and faster than memorizing each fact independently. We believe that some security practitioners might be able to better use their respective data sets if provided with an investigative model that their brains can understand. Introduction With the right tools you can take any relational data set, quickly massage the format, and visualize the results. Observations and conclusions can also be drawn from the results of the visualization that may not have appeared in simple text form. The engine and methodologies discussed have been used by OpenDNS to track CryptoLocker and CryptoDefense ransomware, Red October malware, and the Kelihos botnet. Additionally, specific Syrian Electronic Army (SEA) campaigns, carding sites, and even a map of the Internet through Autonomous Systems have been visualized using OpenGraphiti. Interesting data can also be isolated through the use of Python and JavaScript based plugins that can be easily added to the engine framework. These plugins affect the way the data is visualized and allow analysts to make sense of their data as it relates to the question they re trying to answer. The "big picture" model will help visually inclined incident responders, security analysts, and malware researchers stitch together complex data sets. Why Visualize the Data? Some people learn better by doing, some by reading, and others by listening to a lecture. Other people however, may learn better through visual methods. One such communication tool, often employed by visually inclined learners, is known as a graphic organizer. Graphic organizers are visual representations of knowledge, concepts, thoughts, or ideas. A graphic organizer, also known as a knowledge
3 map, concept map, story map, cognitive organizer, advance organizer, or concept diagram, is a communication tool that uses visual symbols to express knowledge, concepts, thoughts, or ideas, and the relationships between them. The main purpose of a graphic organizer is to provide a visual aid to facilitate learning and instruction. According to a study conducted by The Institute for the Advancement 1 of Research in Education at AEL, using graphic organizers improves student performance in the following areas: Retention Students remember information better and can better recall it when it is represented and taught visually and verbally. Reading comprehension The use of graphic organizers helps improving the reading comprehension of students. Student achievement Students with and without learning disabilities improve achievement across content areas and grade levels. Thinking and learning skills; critical thinking When students develop and use a graphic organizer their higher order thinking and critical thinking skills are enhanced. Some students build data literacy as they collect and explore information in a dynamic inquiry process, using tables and plots to visually investigate and manipulate and analyze data. As students explore the way data moves through various plot types, such as Venn, stack, pie, and axis, they formulate pathways that link visual images to areas that store knowledge in the brain. To show the relationships between the parts, the symbols are linked with each other; words can be used to further clarify meaning. By representing information spatially and with images, students are able 1 Summary.pdf
4 to focus on meaning, reorganize and group similar ideas easily, and 2 make better use of their visual memory. This linkage model transfers very well to applied graph theory and the visualization of related data sets. From Data to Visualization When it comes to representing knowledge, semantic networks are an extremely useful data structure. They can be used to model nearly everything and can be applied to a wide range of problems. But before we dig into more details, let us consider the definition of a semantic network. A semantic network, or frame network, is a network which represents semantic relations between concepts. This is often used as a form of knowledge representation. It is often represented as a directed or undirected graph consisting of vertices and edges, which represent relationships. In other words, a semantic network can be represented as a graph connecting any kind of information by any kind of relationship. From detailed to very high level data, it is up to the user to design a model to describe relevant and meaningful knowledge. Consider the example shown in Figure
5 3 Figure 1 Example of a precise network The set of entities (nodes and edges) of this graph is also referred to as the "domain"; it describes the variety of different names, types, features, and attributes of the network. This forms the model of our information. When we want to study the structure of this information for example, how the elements relate to each other, or whether certain patterns are more or less redundant than others we want to focus on the ontology of the network (or the model of the model). After designing an accurate model of the information, the next logical step is to leverage advanced graph theory and topological data analysis to expose unique insights from the shape of the semantic network. 3 Neiswender, C "Semantic Network (Relational Vocabularies)." In The MMI Guides: Navigating the World of Marine Metadata. Accessed June 30, 2014.
6 Transforming Raw Data Into Relational Information To transform a raw data file into a relational graph, before jumping into any script implementation, the first thing to do is to identify the various entities and relationships contained in the data. The following are a few entities and relationships that may be used: Social Network: Facebook, Twitter, Linkedin Nodes: People, Companies, Bands Attributes: Name, Age, Creation Date Edges: Friends, Couple, Employer, Fan, Subscriber UML class diagram: Nodes: Classes, Namespaces Attribute: Members, Methods Edges: Inheritance, Association, Aggregation, Dependency Network map: Nodes: IPs, Servers, Hosts, Routers, Firewalls Attributes: Port, Hostname, Country Edges: Protocol, Bandwidth, Latency Of course, there is more than one representation for any given problem. The designer should align their perspective with the data. When the structural model has been decided, the source data should be parsed to populate a relational data set following the design. Visualizing Knowledge There are many approaches to the problem of graph visualization. One such methodology revolves around the use of force directed layouts. Because the main purpose of the engine is to analyze the topology of our knowledge base, it is necessary to choose a visualization technique that will let the data drive its own layout.
7 The general concept is relatively simple in that every node is treated as a particle, and every edge as a force on the particles. By implementing an engine capable of running a particle physics model we can transform relational data, however loosely related, into a 2D or 3D structure completely defined by the shape of the relational structure. This relational structure serves to highlight hidden clusters or topological patterns that may have previously gone unnoticed. 4 Discovered in 1991, the Fruchterman and Reingold layout is one of the classic force directed layouts. The main idea is to treat vertices in the graph as "atomic particles or celestial bodies, exerting attractive and repulsive forces from one another". This concept is depicted in Figure 2. Figure 2 Force system concept 4 ftp://ftp.mathe2.uni bayreuth.de/axel/papers/reingold:graph_drawing_by_force_directed_placement.pdf
8 Without entering into excessive technical depth about the math supporting the model, the principles are relatively elementary: Connected nodes attract each other Non connected nodes repulse each other Using a slightly more detailed explanation, the attractive force f a (d) and the repulsive force f r (d) both depend on the distance between the nodes and a constant k controlling the density of the layout. The algorithm also adds a notion of temperature which controls the displacement of the vertices; the higher the temperature, the faster the movement. The physics represent a system inspired by electrical or celestial forces with a general technique called "simulated annealing," where increasing/decreasing the temperature affects the particles thermodynamic vibration, helping them to progressively reach an equilibrium state where all the node forces become even. That state usually looks like a visually pleasing molecule shaped layout where relational clusters will aggregate in the same areas. Dealing With Large Graphs Most modern databases include millions or billions of entries, so a modern tool to analyze them must handle large data sets efficiently. All 3D engines and particle systems have their physical limitations, and force directed. Also, force directed layout algorithms usually increase in complexity as the size of the graph grows. How do we work around these issues? Entity Grouping One way is to decrease the amount of information by looking at the data from a higher level. Instead of dealing with individual entities, you can create nodes that represent groups of entities. The
9 possibilities are endless depending on the subject you want to visualize. For instance, if you want to visualize the whole known universe with its planets and stars, it would make sense to structure your representation by a fractal approach where you would look first at galaxies, followed by stars, planets, continents, countries, and cities to reduce the size of the point cloud. You could then interactively decide to add more details as necessary as you move closer to a certain city. This would give access to the whole of the information without having to deal with all of it at once. Sampling Another interesting way to limit the size of a dataset, without completely losing the big picture, is to use sampling methods taking a certain percentage or a random sample of the complete dataset. The random subset could be built using a uniform or normal distribution, or any other user defined distribution. Using the previous universe analogy, you would randomly remove half the galaxies, half the planets/stars, and half the cities before processing the result. The data scientist will have to adjust the hypotheses or assumptions based on the way the data is pruned. When dealing with graphs, an easy way to take random sample of a large graph is to use a Random Walk approach. You would select random entry points in the graph and trace a random path in the graph starting from those points, as depicted in Figure 3.
10 Figure 3 Random walk diagram There are many ways to tweak an exploration technique of this nature. It highly depends on the user modifications and the biases involved in the selection of the random candidates, but in general, a random walk will capture the structure of a very large graph fairly accurately. Parallelization After every other pruning technique has been used, the remaining technique is to employ parallelization. You can effectively add more computing power to a system by distributing the calculation. This can happen remotely using Grid Computing technologies, or locally using the performance of multiple threads, cores, or processes. However, the processing algorithm must to be modified to work in a parallel fashion, which is unfortunately not always possible.
11 With current graphic card technology, we can take advantage of efficient graphics processing units (GPUs) and distribute the calculation across their ever increasing number of cores and threads. GPUs are becoming exceptionally efficient at processing geometrical data such as vectors, colors, matrices, textures, or any kind of computation involving a combination of the aforementioned data types. Using technologies such as Open Graphics Library (OpenGL) rendering, OpenGL Shading Language (GLSL) shaders and Open Computing Language (OpenCL) physics seems like the obvious choice to leverage the power of GPUs. Learning how to leverage GPUs (or any parallel platform) with technologies such as OpenGL, GLSL and OpenCL (among many others) is definitely one key to push through the theoretical barrier. With OpenCL for example, a task can be fully, or even partially, distributed over several compute units. The efficiency of the whole system can then be maximized by optimizing the different parts of the algorithm (Memory access, Instructions, and Concurrency, etc.). About OpenGraphiti To simplify the visualization process we have created a tool named OpenGraphiti. OpenGraphiti is an open source 2D and 3D data visualization engine for data scientists to visualize semantic networks and to work with them. It offers an easy to use API with several associated libraries to create custom made datasets. It leverages the power of GPUs to process and explore the data and sits on a homemade 3D engine. Here is list of the technologies used: C/C++ source code OpenGL rendering library Python scripts
12 Web integration with Emscripten Javascript scripts OpenCL parallel programming library GLSL Shaders GLM math library GLSL Shader NetworkX Like any good visualization tool, data is required. In addition, the data set must be formatted in such a way that OpenGraphiti can apply the algorithms to affect the spatial representation and interconnectivity of the data nodes and edges. Creation of Data Files Suppose you have a graph G = (V,E) where V = {0, 1, 2, 3} and E = {(0, 1), (0, 2), (1, 2), (2, 3)}. Suppose further that: Vertex 0 has the attributes: {"type": "A", "id": 0} Vertex 1 has the attributes: {"type": "B", "id": 1} Vertex 2 has the attributes: {"type": "C", "id": 2} Vertex 3 has the attributes: {"type": "D", "id": 3} And that: Edge (0, 1) has the attributes: {'src': 0, 'dst': 1, 'type': 'belongs', 'id': 0} Edge (0, 2) has the attributes: {'src': 0, 'dst': 2, 'type': 'owns', 'id': 1} Edge (1, 2) has the attributes: {'src': 1, 'dst': 2, 'type': 'has', 'id': 1} Edge (2, 3) has the attributes: {'src': 2, 'dst': 3, 'type': 'owns', 'id': 1}
13 This would provide a graph similar to what is presented in Figure 4. Figure 4 Connected nodes As you can see, there is a list of "node" objects, each of which contain the node attributes and IDs, as well as a list of edge objects, each of which have the edge attributes, and the fields src and dst, which indicate the source and destination vertices, respectively. The creation of semantic graphs in a JSON format that OpenGraphiti understands is relatively trivial. One way to represent such a dataset is to use the SemanticNet python library. This python library can help you create graphs in a easy way using a simple API. An example of a graph (Figure 5) and the script used to create it (Figure 6) are shown below.
14 Figure 5 Sample semantic graph Figure 6 Sample code to generate a graph #!/usr/bin/env python import sys import semanticnet as sn graph = sn.graph() a = graph.add_node({"label" : "A"}) b = graph.add_node({"label" : "B"}) c = graph.add_node({"label" : "C"}) graph.add_edge(a, b, {"type" : "belongs"}) graph.add_edge(b, c, {"type" : "owns"}) graph.add_edge(c, a, {"type" : "has"}) graph.save_json("output.json")
15 This trivial example shows how easy it is to create customized graphs. The user can define any attribute (type or other) for the nodes or edges and then connect the elements as needed. When the graph is saved (or loaded) it can be analyzed using the NetworkX graph theory library or/and combined with OpenGraphiti. Use Cases with OpenGraphiti There are a seemingly endless number of applicable use cases for the visualization of loosely related data. Some examples include the analysis of security data, such as firewall, intrusion detection and prevention systems (IDS/IPS), and malware infection alerts could be visualized to expose a previously unrecognized patterns in a malicious actor activity, or even a misconfiguration of a technical control that allows too much, or too little, access to data, files, or networks. Financial analysts could, for example, analyze data to track venture investment with data points such as the investor, the type of company being invested in (the target), its vertical market, or even the success (or failure) of the target before, during, or after the merger or acquisition. Trends may be observed to support a new model for investment and exit success above and beyond a simple spreadsheet. Social network analysis (SNA) can be visualized to show relationships between people and their relationships with other people or things. Data could be visualized to articulate the interconnections across related networks in the fields of anthropology, biology, communication studies, economics, geography, history, information science, organizational studies, political science, social psychology, development studies, and sociolinguistics, among others.
16 Conclusion We feel that OpenGraphiti should help lower the barrier to entry for those looking to visualize complex related data sets. The engine will allow for the visualization of any data, however loosely related, in a medium that is easy to generate, navigate, and articulate. The use cases presented here only scratch the surface of what is possible. We are confident, however, that the potential may be limited only by the imagination and diligence of those leveraging the tool. Combining intelligent data mining techniques with smart data visualization is the key to better understand the complex problems we are trying to solve. To take a significant step in the monitoring and managing of a large scale state machine in constant evolution, passive introspection is not enough. It is necessary to build an interface which acts on the system through manual or algorithmic modification. By presenting and open sourcing OpenGraphiti, we hope to put in place the first blocks in the foundation of a next generation data analysis tool aimed at surfacing new techniques to build a large scale, distributed, and graph based data monitoring system. Documentation and source code for OpenGraphiti shall be made available via GitHub at: References
17 Sum mary.pdf directed.pdf ftp://ftp.mathe2.uni bayreuth.de/axel/papers/reingold:graph_drawing_b y_force_directed_placement.pdf
Open Source Visualization with OpenGraphiti. Thibault Reuille (@ThibaultReuille) [email protected]. Andrew Hay (@andrewsmhay) ahay@opendns.
Open Source Visualization with OpenGraphiti Thibault Reuille (@ThibaultReuille) [email protected] Andrew Hay (@andrewsmhay) [email protected] Introduction Humans have different ways of efficiently digesting
THE OPEN SOURCE VISUALIZATION ENGINE FOR BUSY HACKERS. Thibault Reuille & Andrew Hay
THE OPEN SOURCE VISUALIZATION ENGINE FOR BUSY HACKERS Thibault Reuille & Andrew Hay THIBAULT REUILLE Security Researcher at OpenDNS Former So?ware Engineer @ NVIDIA MS IT from EPITA: Ingénierie InformaJque
Visualization methods for patent data
Visualization methods for patent data Treparel 2013 Dr. Anton Heijs (CTO & Founder) Delft, The Netherlands Introduction Treparel can provide advanced visualizations for patent data. This document describes
What is Visualization? Information Visualization An Overview. Information Visualization. Definitions
What is Visualization? Information Visualization An Overview Jonathan I. Maletic, Ph.D. Computer Science Kent State University Visualize/Visualization: To form a mental image or vision of [some
MEng, BSc Applied Computer Science
School of Computing FACULTY OF ENGINEERING MEng, BSc Applied Computer Science Year 1 COMP1212 Computer Processor Effective programming depends on understanding not only how to give a machine instructions
Security visualisation
Security visualisation This thesis provides a guideline of how to generate a visual representation of a given dataset and use visualisation in the evaluation of known security vulnerabilities by Marco
Concept-Mapping Software: How effective is the learning tool in an online learning environment?
Concept-Mapping Software: How effective is the learning tool in an online learning environment? Online learning environments address the educational objectives by putting the learner at the center of the
Big Data: Rethinking Text Visualization
Big Data: Rethinking Text Visualization Dr. Anton Heijs [email protected] Treparel April 8, 2013 Abstract In this white paper we discuss text visualization approaches and how these are important
White Paper. Intelligence Driven. Security Monitoring. v.2.1.1. nexusguard.com
White Paper 1 Intelligence Driven Security Monitoring v.2.1.1 Overview In today s hypercompetitive business environment, companies have to make swift and decisive decisions. Making the right judgment call
MEng, BSc Computer Science with Artificial Intelligence
School of Computing FACULTY OF ENGINEERING MEng, BSc Computer Science with Artificial Intelligence Year 1 COMP1212 Computer Processor Effective programming depends on understanding not only how to give
Course Syllabus For Operations Management. Management Information Systems
For Operations Management and Management Information Systems Department School Year First Year First Year First Year Second year Second year Second year Third year Third year Third year Third year Third
So today we shall continue our discussion on the search engines and web crawlers. (Refer Slide Time: 01:02)
Internet Technology Prof. Indranil Sengupta Department of Computer Science and Engineering Indian Institute of Technology, Kharagpur Lecture No #39 Search Engines and Web Crawler :: Part 2 So today we
STATISTICA. Clustering Techniques. Case Study: Defining Clusters of Shopping Center Patrons. and
Clustering Techniques and STATISTICA Case Study: Defining Clusters of Shopping Center Patrons STATISTICA Solutions for Business Intelligence, Data Mining, Quality Control, and Web-based Analytics Table
Cymon.io. Open Threat Intelligence. 29 October 2015 Copyright 2015 esentire, Inc. 1
Cymon.io Open Threat Intelligence 29 October 2015 Copyright 2015 esentire, Inc. 1 #> whoami» Roy Firestein» Senior Consultant» Doing Research & Development» Other work include:» docping.me» threatlab.io
A STATISTICS COURSE FOR ELEMENTARY AND MIDDLE SCHOOL TEACHERS. Gary Kader and Mike Perry Appalachian State University USA
A STATISTICS COURSE FOR ELEMENTARY AND MIDDLE SCHOOL TEACHERS Gary Kader and Mike Perry Appalachian State University USA This paper will describe a content-pedagogy course designed to prepare elementary
Doctor of Philosophy in Computer Science
Doctor of Philosophy in Computer Science Background/Rationale The program aims to develop computer scientists who are armed with methods, tools and techniques from both theoretical and systems aspects
Visualizing Data: Scalable Interactivity
Visualizing Data: Scalable Interactivity The best data visualizations illustrate hidden information and structure contained in a data set. As access to large data sets has grown, so has the need for interactive
INTRUSION PREVENTION AND EXPERT SYSTEMS
INTRUSION PREVENTION AND EXPERT SYSTEMS By Avi Chesla [email protected] Introduction Over the past few years, the market has developed new expectations from the security industry, especially from the intrusion
Visualization. For Novices. ( Ted Hall ) University of Michigan 3D Lab Digital Media Commons, Library http://um3d.dc.umich.edu
Visualization For Novices ( Ted Hall ) University of Michigan 3D Lab Digital Media Commons, Library http://um3d.dc.umich.edu Data Visualization Data visualization deals with communicating information about
Automated Virtual Cloud Management: The need of future
Automated Virtual Cloud Management: The need of future Prof. (Ms) Manisha Shinde-Pawar Faculty of Management (Information Technology), Bharati Vidyapeeth Univerisity, Pune, IMRDA, SANGLI Abstract: With
Information Processing, Big Data, and the Cloud
Information Processing, Big Data, and the Cloud James Horey Computational Sciences & Engineering Oak Ridge National Laboratory Fall Creek Falls 2010 Information Processing Systems Model Parameters Data-intensive
Network Management Deployment Guide
Smart Business Architecture Borderless Networks for Midsized organizations Network Management Deployment Guide Revision: H1CY10 Cisco Smart Business Architecture Borderless Networks for Midsized organizations
Graph Database Proof of Concept Report
Objectivity, Inc. Graph Database Proof of Concept Report Managing The Internet of Things Table of Contents Executive Summary 3 Background 3 Proof of Concept 4 Dataset 4 Process 4 Query Catalog 4 Environment
InfiniteGraph: The Distributed Graph Database
A Performance and Distributed Performance Benchmark of InfiniteGraph and a Leading Open Source Graph Database Using Synthetic Data Objectivity, Inc. 640 West California Ave. Suite 240 Sunnyvale, CA 94086
Protein Protein Interaction Networks
Functional Pattern Mining from Genome Scale Protein Protein Interaction Networks Young-Rae Cho, Ph.D. Assistant Professor Department of Computer Science Baylor University it My Definition of Bioinformatics
Chapter 1 - Web Server Management and Cluster Topology
Objectives At the end of this chapter, participants will be able to understand: Web server management options provided by Network Deployment Clustered Application Servers Cluster creation and management
Agenda. Distributed System Structures. Why Distributed Systems? Motivation
Agenda Distributed System Structures CSCI 444/544 Operating Systems Fall 2008 Motivation Network structure Fundamental network services Sockets and ports Client/server model Remote Procedure Call (RPC)
Introduction to Computer Graphics
Introduction to Computer Graphics Torsten Möller TASC 8021 778-782-2215 [email protected] www.cs.sfu.ca/~torsten Today What is computer graphics? Contents of this course Syllabus Overview of course topics
Network-Based Tools for the Visualization and Analysis of Domain Models
Network-Based Tools for the Visualization and Analysis of Domain Models Paper presented as the annual meeting of the American Educational Research Association, Philadelphia, PA Hua Wei April 2014 Visualizing
Integer Operations. Overview. Grade 7 Mathematics, Quarter 1, Unit 1.1. Number of Instructional Days: 15 (1 day = 45 minutes) Essential Questions
Grade 7 Mathematics, Quarter 1, Unit 1.1 Integer Operations Overview Number of Instructional Days: 15 (1 day = 45 minutes) Content to Be Learned Describe situations in which opposites combine to make zero.
an introduction to VISUALIZING DATA by joel laumans
an introduction to VISUALIZING DATA by joel laumans an introduction to VISUALIZING DATA iii AN INTRODUCTION TO VISUALIZING DATA by Joel Laumans Table of Contents 1 Introduction 1 Definition Purpose 2 Data
Create Cool Lumira Visualization Extensions with SAP Web IDE Dong Pan SAP PM and RIG Analytics Henry Kam Senior Product Manager, Developer Ecosystem
Create Cool Lumira Visualization Extensions with SAP Web IDE Dong Pan SAP PM and RIG Analytics Henry Kam Senior Product Manager, Developer Ecosystem 2015 SAP SE or an SAP affiliate company. All rights
DATA MINING TECHNOLOGY. Keywords: data mining, data warehouse, knowledge discovery, OLAP, OLAM.
DATA MINING TECHNOLOGY Georgiana Marin 1 Abstract In terms of data processing, classical statistical models are restrictive; it requires hypotheses, the knowledge and experience of specialists, equations,
Why Taking This Course? Course Introduction, Descriptive Statistics and Data Visualization. Learning Goals. GENOME 560, Spring 2012
Why Taking This Course? Course Introduction, Descriptive Statistics and Data Visualization GENOME 560, Spring 2012 Data are interesting because they help us understand the world Genomics: Massive Amounts
FEAWEB ASP Issue: 1.0 Stakeholder Needs Issue Date: 03/29/2000. 04/07/2000 1.0 Initial Description Marco Bittencourt
)($:(%$63 6WDNHKROGHU1HHGV,VVXH 5HYLVLRQ+LVWRU\ 'DWH,VVXH 'HVFULSWLRQ $XWKRU 04/07/2000 1.0 Initial Description Marco Bittencourt &RQILGHQWLDO DPM-FEM-UNICAMP, 2000 Page 2 7DEOHRI&RQWHQWV 1. Objectives
Visualizing a Neo4j Graph Database with KeyLines
Visualizing a Neo4j Graph Database with KeyLines Introduction 2! What is a graph database? 2! What is Neo4j? 2! Why visualize Neo4j? 3! Visualization Architecture 4! Benefits of the KeyLines/Neo4j architecture
The University of Jordan
The University of Jordan Master in Web Intelligence Non Thesis Department of Business Information Technology King Abdullah II School for Information Technology The University of Jordan 1 STUDY PLAN MASTER'S
Maintaining Non-Stop Services with Multi Layer Monitoring
Maintaining Non-Stop Services with Multi Layer Monitoring Lahav Savir System Architect and CEO of Emind Systems [email protected] www.emindsys.com The approach Non-stop applications can t leave on their
Exploring Big Data in Social Networks
Exploring Big Data in Social Networks [email protected] ([email protected]) INWEB National Science and Technology Institute for Web Federal University of Minas Gerais - UFMG May 2013 Some thoughts about
Visualizing Threats: Improved Cyber Security Through Network Visualization
Visualizing Threats: Improved Cyber Security Through Network Visualization Intended audience This white paper has been written for anyone interested in enhancing an organizational cyber security regime
An Introduction to KeyLines and Network Visualization
An Introduction to KeyLines and Network Visualization 1. What is KeyLines?... 2 2. Benefits of network visualization... 2 3. Benefits of KeyLines... 3 4. KeyLines architecture... 3 5. Uses of network visualization...
DKAN. Data Warehousing, Visualization, and Mapping
DKAN Data Warehousing, Visualization, and Mapping Acknowledgements We d like to acknowledge the NuCivic team, led by Andrew Hoppin, which has done amazing work creating open source tools to make data available
BUSINESS RULES AND GAP ANALYSIS
Leading the Evolution WHITE PAPER BUSINESS RULES AND GAP ANALYSIS Discovery and management of business rules avoids business disruptions WHITE PAPER BUSINESS RULES AND GAP ANALYSIS Business Situation More
A Performance Evaluation of Open Source Graph Databases. Robert McColl David Ediger Jason Poovey Dan Campbell David A. Bader
A Performance Evaluation of Open Source Graph Databases Robert McColl David Ediger Jason Poovey Dan Campbell David A. Bader Overview Motivation Options Evaluation Results Lessons Learned Moving Forward
Georgia Department of Education
Epidemiology Curriculum The Georgia Performance Standards are designed to provide students with the knowledge and skills for proficiency in science. The Project 2061 s Benchmarks for Science Literacy is
NNMi120 Network Node Manager i Software 9.x Essentials
NNMi120 Network Node Manager i Software 9.x Essentials Instructor-Led Training For versions 9.0 9.2 OVERVIEW This course is designed for those Network and/or System administrators tasked with the installation,
Component visualization methods for large legacy software in C/C++
Annales Mathematicae et Informaticae 44 (2015) pp. 23 33 http://ami.ektf.hu Component visualization methods for large legacy software in C/C++ Máté Cserép a, Dániel Krupp b a Eötvös Loránd University [email protected]
Breach Found. Did It Hurt?
ANALYST BRIEF Breach Found. Did It Hurt? INCIDENT RESPONSE PART 2: A PROCESS FOR ASSESSING LOSS Authors Christopher Morales, Jason Pappalexis Overview Malware infections impact every organization. Many
Data Analysis & Visualization for Security Professionals
Data Analysis & Visualization for Security Professionals Jay Jacobs Verizon Bob Rudis Liberty Mutual Insurance Session ID: GRC- T18 Session Classification: Intermediate Key Learning Points Key Learning
Applying Data Center Infrastructure Management in Collocation Data Centers
Applying Data Center Infrastructure Management in Collocation Data Centers Infrastructure Management & Monitoring for Business-Critical Continuity TM Applying Data Center Infrastructure Management (DCIM)
A Visualization is Worth a Thousand Tables: How IBM Business Analytics Lets Users See Big Data
White Paper A Visualization is Worth a Thousand Tables: How IBM Business Analytics Lets Users See Big Data Contents Executive Summary....2 Introduction....3 Too much data, not enough information....3 Only
Modeling and Simulation Firewall Using Colored Petri Net
World Applied Sciences Journal 15 (6): 826-830, 2011 ISSN 1818-4952 IDOSI Publications, 2011 Modeling and Simulation Firewall Using Colored Petri Net 1 2 Behnam Barzegar and Homayun Motameni 1 Department
Understanding Web personalization with Web Usage Mining and its Application: Recommender System
Understanding Web personalization with Web Usage Mining and its Application: Recommender System Manoj Swami 1, Prof. Manasi Kulkarni 2 1 M.Tech (Computer-NIMS), VJTI, Mumbai. 2 Department of Computer Technology,
Simulation Software: Practical guidelines for approaching the selection process
Practical guidelines for approaching the selection process Randall R. Gibson, Principal / Vice President Craig Dickson, Senior Analyst TranSystems I Automation Associates, Inc. Challenge Selecting from
Research on Graphic Organizers
Research on Graphic Organizers Graphic Organizers are visual representations of a text or a topic. Organizers provide templates or frames for students or teachers to identify pertinent facts, to organize
Advanced Big Data Analytics with R and Hadoop
REVOLUTION ANALYTICS WHITE PAPER Advanced Big Data Analytics with R and Hadoop 'Big Data' Analytics as a Competitive Advantage Big Analytics delivers competitive advantage in two ways compared to the traditional
Sanjeev Kumar. contribute
RESEARCH ISSUES IN DATAA MINING Sanjeev Kumar I.A.S.R.I., Library Avenue, Pusa, New Delhi-110012 [email protected] 1. Introduction The field of data mining and knowledgee discovery is emerging as a
Application-Centric Analysis Helps Maximize the Value of Wireshark
Application-Centric Analysis Helps Maximize the Value of Wireshark The cost of freeware Protocol analysis has long been viewed as the last line of defense when it comes to resolving nagging network and
Performance Assessment Task Which Shape? Grade 3. Common Core State Standards Math - Content Standards
Performance Assessment Task Which Shape? Grade 3 This task challenges a student to use knowledge of geometrical attributes (such as angle size, number of angles, number of sides, and parallel sides) to
Solving Simultaneous Equations and Matrices
Solving Simultaneous Equations and Matrices The following represents a systematic investigation for the steps used to solve two simultaneous linear equations in two unknowns. The motivation for considering
IBM Security QRadar Risk Manager
IBM Security QRadar Risk Manager Proactively manage vulnerabilities and network device configuration to reduce risk, improve compliance Highlights Collect network security device configuration data to
Application Performance Monitoring (APM) Technical Whitepaper
Application Performance Monitoring (APM) Technical Whitepaper Table of Contents Introduction... 3 Detect Application Performance Issues Before Your Customer Does... 3 Challenge of IT Manager... 3 Best
Data Driven Discovery In the Social, Behavioral, and Economic Sciences
Data Driven Discovery In the Social, Behavioral, and Economic Sciences Simon Appleford, Marshall Scott Poole, Kevin Franklin, Peter Bajcsy, Alan B. Craig, Institute for Computing in the Humanities, Arts,
CSU, Fresno - Institutional Research, Assessment and Planning - Dmitri Rogulkin
My presentation is about data visualization. How to use visual graphs and charts in order to explore data, discover meaning and report findings. The goal is to show that visual displays can be very effective
Using Data Mining for Mobile Communication Clustering and Characterization
Using Data Mining for Mobile Communication Clustering and Characterization A. Bascacov *, C. Cernazanu ** and M. Marcu ** * Lasting Software, Timisoara, Romania ** Politehnica University of Timisoara/Computer
SUMMER SCHOOL ON ADVANCES IN GIS
SUMMER SCHOOL ON ADVANCES IN GIS Six Workshops Overview The workshop sequence at the UMD Center for Geospatial Information Science is designed to provide a comprehensive overview of current state-of-the-art
The Basics of Graphical Models
The Basics of Graphical Models David M. Blei Columbia University October 3, 2015 Introduction These notes follow Chapter 2 of An Introduction to Probabilistic Graphical Models by Michael Jordan. Many figures
Sampling Biases in IP Topology Measurements
Sampling Biases in IP Topology Measurements Anukool Lakhina with John Byers, Mark Crovella and Peng Xie Department of Boston University Discovering the Internet topology Goal: Discover the Internet Router
D A T A M I N I N G C L A S S I F I C A T I O N
D A T A M I N I N G C L A S S I F I C A T I O N FABRICIO VOZNIKA LEO NARDO VIA NA INTRODUCTION Nowadays there is huge amount of data being collected and stored in databases everywhere across the globe.
CHAPTER 1 INTRODUCTION
1 CHAPTER 1 INTRODUCTION Exploration is a process of discovery. In the database exploration process, an analyst executes a sequence of transformations over a collection of data structures to discover useful
Fogbeam Vision Series - The Modern Intranet
Fogbeam Labs Cut Through The Information Fog http://www.fogbeam.com Fogbeam Vision Series - The Modern Intranet Where It All Started Intranets began to appear as a venue for collaboration and knowledge
131-1. Adding New Level in KDD to Make the Web Usage Mining More Efficient. Abstract. 1. Introduction [1]. 1/10
1/10 131-1 Adding New Level in KDD to Make the Web Usage Mining More Efficient Mohammad Ala a AL_Hamami PHD Student, Lecturer m_ah_1@yahoocom Soukaena Hassan Hashem PHD Student, Lecturer soukaena_hassan@yahoocom
VIDEO Intypedia013en LESSON 13: DNS SECURITY. AUTHOR: Javier Osuna García-Malo de Molina. GMV Head of Security and Process Consulting Division
VIDEO Intypedia013en LESSON 13: DNS SECURITY AUTHOR: Javier Osuna García-Malo de Molina GMV Head of Security and Process Consulting Division Welcome to Intypedia. In this lesson we will study the DNS domain
Norwegian Satellite Earth Observation Database for Marine and Polar Research http://normap.nersc.no USE CASES
Norwegian Satellite Earth Observation Database for Marine and Polar Research http://normap.nersc.no USE CASES The NORMAP Project team has prepared this document to present functionality of the NORMAP portal.
MPLS WAN Explorer. Enterprise Network Management Visibility through the MPLS VPN Cloud
MPLS WAN Explorer Enterprise Network Management Visibility through the MPLS VPN Cloud Executive Summary Increasing numbers of enterprises are outsourcing their backbone WAN routing to MPLS VPN service
Master of Science in Computer Science
Master of Science in Computer Science Background/Rationale The MSCS program aims to provide both breadth and depth of knowledge in the concepts and techniques related to the theory, design, implementation,
Remote Graphical Visualization of Large Interactive Spatial Data
Remote Graphical Visualization of Large Interactive Spatial Data ComplexHPC Spring School 2011 International ComplexHPC Challenge Cristinel Mihai Mocan Computer Science Department Technical University
MACHINE LEARNING & INTRUSION DETECTION: HYPE OR REALITY?
MACHINE LEARNING & INTRUSION DETECTION: 1 SUMMARY The potential use of machine learning techniques for intrusion detection is widely discussed amongst security experts. At Kudelski Security, we looked
How In-Memory Data Grids Can Analyze Fast-Changing Data in Real Time
SCALEOUT SOFTWARE How In-Memory Data Grids Can Analyze Fast-Changing Data in Real Time by Dr. William Bain and Dr. Mikhail Sobolev, ScaleOut Software, Inc. 2012 ScaleOut Software, Inc. 12/27/2012 T wenty-first
Clustering & Visualization
Chapter 5 Clustering & Visualization Clustering in high-dimensional databases is an important problem and there are a number of different clustering paradigms which are applicable to high-dimensional data.
CRGroup Whitepaper: Digging through the Data. www.crgroup.com. Reporting Options in Microsoft Dynamics GP
CRGroup Whitepaper: Digging through the Data Reporting Options in Microsoft Dynamics GP The objective of this paper is to provide greater insight on each of the reporting options available to you within
The Scientific Data Mining Process
Chapter 4 The Scientific Data Mining Process When I use a word, Humpty Dumpty said, in rather a scornful tone, it means just what I choose it to mean neither more nor less. Lewis Carroll [87, p. 214] In
A comparative study of social network analysis tools
Membre de Membre de A comparative study of social network analysis tools David Combe, Christine Largeron, Előd Egyed-Zsigmond and Mathias Géry International Workshop on Web Intelligence and Virtual Enterprises
Business Intelligence. A Presentation of the Current Lead Solutions and a Comparative Analysis of the Main Providers
60 Business Intelligence. A Presentation of the Current Lead Solutions and a Comparative Analysis of the Main Providers Business Intelligence. A Presentation of the Current Lead Solutions and a Comparative
Cray: Enabling Real-Time Discovery in Big Data
Cray: Enabling Real-Time Discovery in Big Data Discovery is the process of gaining valuable insights into the world around us by recognizing previously unknown relationships between occurrences, objects
Titolo del paragrafo. Titolo del documento - Sottotitolo documento The Benefits of Pushing Real-Time Market Data via a Web Infrastructure
1 Alessandro Alinone Agenda Introduction Push Technology: definition, typology, history, early failures Lightstreamer: 3rd Generation architecture, true-push Client-side push technology (Browser client,
Distributed R for Big Data
Distributed R for Big Data Indrajit Roy HP Vertica Development Team Abstract Distributed R simplifies large-scale analysis. It extends R. R is a single-threaded environment which limits its utility for
Data Analysis Load Balancer
Data Analysis Load Balancer Design Document: Version: 1.0 Last saved by Chris Small April 12, 2010 Abstract: The project is to design a mechanism to load balance network traffic over multiple different
Data Mining in the Swamp
WHITE PAPER Page 1 of 8 Data Mining in the Swamp Taming Unruly Data with Cloud Computing By John Brothers Business Intelligence is all about making better decisions from the data you have. However, all
Before we can talk about virtualization security, we need to delineate the differences between the
1 Before we can talk about virtualization security, we need to delineate the differences between the terms virtualization and cloud. Virtualization, at its core, is the ability to emulate hardware via
1 Log visualization at CNES (Part II)
1 Log visualization at CNES (Part II) 1.1 Background For almost 2 years now, CNES has set up a team dedicated to "log analysis". Its role is multiple: This team is responsible for analyzing the logs after
Visualizing Relationships and Connections in Complex Data Using Network Diagrams in SAS Visual Analytics
Paper 3323-2015 Visualizing Relationships and Connections in Complex Data Using Network Diagrams in SAS Visual Analytics ABSTRACT Stephen Overton, Ben Zenick, Zencos Consulting Network diagrams in SAS
Security Event Management. February 7, 2007 (Revision 5)
Security Event Management February 7, 2007 (Revision 5) Table of Contents TABLE OF CONTENTS... 2 INTRODUCTION... 3 CRITICAL EVENT DETECTION... 3 LOG ANALYSIS, REPORTING AND STORAGE... 7 LOWER TOTAL COST
