DATA EXPERTS MINE ANALYZE VISUALIZE. We accelerate research and transform data to help you create actionable insights



Similar documents
Keywords Big Data; OODBMS; RDBMS; hadoop; EDM; learning analytics, data abundance.

Advanced Big Data Analytics with R and Hadoop

Big Data Integration: A Buyer's Guide

Integrating a Big Data Platform into Government:

Introduction to Big Data Analytics p. 1 Big Data Overview p. 2 Data Structures p. 5 Analyst Perspective on Data Repositories p.

ESS event: Big Data in Official Statistics. Antonino Virgillito, Istat

COMP9321 Web Application Engineering

Doing Multidisciplinary Research in Data Science

BIG DATA: FROM HYPE TO REALITY. Leandro Ruiz Presales Partner for C&LA Teradata

BIG DATA IN THE CLOUD : CHALLENGES AND OPPORTUNITIES MARY- JANE SULE & PROF. MAOZHEN LI BRUNEL UNIVERSITY, LONDON

Big Data and Data Science: Behind the Buzz Words

Challenges of Analytics

Harnessing the power of advanced analytics with IBM Netezza

Big Data and Analytics: Challenges and Opportunities

Introduction to Big Data! with Apache Spark" UC#BERKELEY#

End to End Solution to Accelerate Data Warehouse Optimization. Franco Flore Alliance Sales Director - APJ

Leveraging Big Data. A case study from Thomson Reuters

How To Handle Big Data With A Data Scientist

How to use Big Data in Industry 4.0 implementations. LAURI ILISON, PhD Head of Big Data and Machine Learning

This Symposium brought to you by

Big Data Analytics. Copyright 2011 EMC Corporation. All rights reserved.

Marketing Analytics Technology Overview

DATA SCIENCE CURRICULUM WEEK 1 ONLINE PRE-WORK INSTALLING PACKAGES COMMAND LINE CODE EDITOR PYTHON STATISTICS PROJECT O5 PROJECT O3 PROJECT O2

Architectures for Big Data Analytics A database perspective

How To Turn Big Data Into An Insight

WHITE PAPER. Harnessing the Power of Advanced Analytics How an appliance approach simplifies the use of advanced analytics

2015 Workshops for Professors

Navigating Big Data business analytics

Ganzheitliches Datenmanagement

Is a Data Scientist the New Quant? Stuart Kozola MathWorks

NTT DATA Big Data Reference Architecture Ver. 1.0

Oracle Big Data Discovery Unlock Potential in Big Data Reservoir

Extend your analytic capabilities with SAP Predictive Analysis

Data Science and Business Analytics Certificate Data Science and Business Intelligence Certificate

Oracle Big Data Spatial & Graph Social Network Analysis - Case Study

Monitis Project Proposals for AUA. September 2014, Yerevan, Armenia

Oracle Big Data SQL Technical Update

ANALYTICS CENTER LEARNING PROGRAM

Big Data Analytics Nokia

CRITEO INTERNSHIP PROGRAM 2015/2016

Towards Smart and Intelligent SDN Controller

From Raw Data to. Actionable Insights with. MATLAB Analytics. Learn more. Develop predictive models. 1Access and explore data

IBM SPSS Modeler Professional

Cisco Data Preparation

Applying Deep Learning to Car Data Logging (CDL) and Driver Assessor (DA) October 22-Oct-15

Sentimental Analysis using Hadoop Phase 2: Week 2

Big Data: What You Should Know. Mark Child Research Manager - Software IDC CEMA

BIG Data Analytics Move to Competitive Advantage

R / TERR. Ana Costa e SIlva, PhD Senior Data Scientist TIBCO. Copyright TIBCO Software Inc.

How In-Memory Data Grids Can Analyze Fast-Changing Data in Real Time

Big Data Analytics for Space Exploration, Entrepreneurship and Policy Opportunities. Tiffani Crawford, PhD

BIG DATA & ANALYTICS. Transforming the business and driving revenue through big data and analytics

IT Workload Automation: Control Big Data Management Costs with Cisco Tidal Enterprise Scheduler

SQL + NOSQL + NEWSQL + REALTIME FOR INVESTMENT BANKS

Trends and Research Opportunities in Spatial Big Data Analytics and Cloud Computing NCSU GeoSpatial Forum

Sunnie Chung. Cleveland State University

Augmented Search for Web Applications. New frontier in big log data analysis and application intelligence

The Data Engineer. Mike Tamir Chief Science Officer Galvanize. Steven Miller Global Leader Academic Programs IBM Analytics

Transforming the Telecoms Business using Big Data and Analytics

Hortonworks & SAS. Analytics everywhere. Page 1. Hortonworks Inc All Rights Reserved

Big Data Analytics. An Introduction. Oliver Fuchsberger University of Paderborn 2014

NoSQL and Hadoop Technologies On Oracle Cloud

Data Refinery with Big Data Aspects

Testing 3Vs (Volume, Variety and Velocity) of Big Data

Advanced In-Database Analytics

W H I T E P A P E R. Building your Big Data analytics strategy: Block-by-Block! Abstract

Big Data for Investment Research Management

Lambda Architecture. Near Real-Time Big Data Analytics Using Hadoop. January Website:

Oracle Database - Engineered for Innovation. Sedat Zencirci Teknoloji Satış Danışmanlığı Direktörü Türkiye ve Orta Asya

A Big Data Analytical Framework For Portfolio Optimization Abstract. Keywords. 1. Introduction

Journée Thématique Big Data 13/03/2015

SAP Predictive Analytics: An Overview and Roadmap. Charles Gadalla, SESSION CODE: 603

Why Big Data in the Cloud?

Predictive Analytics

Architecting for the Internet of Things & Big Data

International Journal of Advancements in Research & Technology, Volume 3, Issue 5, May ISSN BIG DATA: A New Technology

Managing Big Data with Hadoop & Vertica. A look at integration between the Cloudera distribution for Hadoop and the Vertica Analytic Database

Are You Big Data Ready?

Talend Real-Time Big Data Sandbox. Big Data Insights Cookbook

The Future of Data Management

Introduction to Data Mining

Real-Time Analytics: Integrating Social Media Insights with Traditional Data

Oracle Advanced Analytics Oracle R Enterprise & Oracle Data Mining

Hadoop and Map-Reduce. Swati Gore

Sentiment Analysis on Big Data

Reference Architecture, Requirements, Gaps, Roles

Data Isn't Everything

Transcription:

DATA EXPERTS We accelerate research and transform data to help you create actionable insights WE MINE WE ANALYZE WE VISUALIZE

Domains Data Mining Mining longitudinal and linked datasets from web and other archives. Data Standardization Converting raw data into cleaned and standardized formats. Big Data Analytics Implementing distributed computing methods and cutting edge statistical models to analyze huge chunks of data. Statistical Modeling Curating statistical models or running statistical analysis on real or simulated data to understand phenomenon. Data Visualization Creating customized visualization for your data-sets to tell a compelling story. Technology Implementations Implementing customized mobile and software tools to make research faster, scalable and economical.

Data Mining Using our proprietary codes, we can deliver harvested data from web or documents in a longitudinal format. Codes built on open source technologies which can run on most popular operating systems Quick turnaround time Automated quality layer can be embedded customized to the data-sets Logs are maintained throughout the lifecycle Harvested data is delivered in flat file formats usable by any database platform

Data Mining Indian Patent Data Task in hand was to write a code to harvest Indian Patent Data from Indian Patent Office Website on an ongoing basis Code was prepared to extract information on all patents filed between date A and date B defined by user. Further, a data quality tool was written to clean data from the harvested pool of data. A scheduling code was prepared to run harvesting code followed by data quality code every week (as Indian Patent office publishes patents every week). This master scheduler code runs every week and updates all fields of new patents in the master pool of database.

Data Mining Earning Conference Calls Task in hand was to write a code to harvest latest earning conference calls of all NASDAQ listed firms and parse each call at individual speaker level Earning Conference calls were harvested from NASDAQ, Thomson StreetEvents and other sources on a daily basis Automation code was written to parse those conference calls at individual speaker level with meta data like speaker name, designation, company and speech text A scheduling code was prepared to run harvesting code and parsing code followed by harvesting every day (from NASDAQ) This master scheduler code runs every day and updates all fields of new conference calls in the master pool of database.

Statistical Modeling Our team of Statisticians make complicated statistical models from scratch and our coders and designers make them a usable reality Modeling carried out statisticians from top institutes like IITs and IISCs with experience in firms like Oracle, Global Analytics, and MuSigma Statistical Modeling is carried out on technologies like R, Matlab, GLPK, Gurobi Our coders make these models scalable and efficient UI/UX designers make them user-friendly and flexible as per working requirements Harvested data is delivered in flat file formats usable by any database platform

Stats School Cost Optimization Task in hand was to prepare, implement, and realize statistical models to optimize school transportation and logistics costs for cities in a Latin American Country A Professor was appointed by NPO and a Latin American government to prepare statistical models aiming at city planning for optimizing school costs Our statisticians used clustering techniques, Ant Colony Optimization, and multi-processing to minimize total transportation + operational cost on schools for a city. Various constraints and difficulties were taken care of like bus capacity, total riding time, mixed loading, land and sea routes, different types of buses and classroom sizes, and many more. A graphical user interface with flexibility was delivered to client to view output on maps, spreadsheets, manually change routes, upload input data etc.

Stats HMM predicts App Lifetime Using Hidden Markov Models to predict app lifecycle and various hidden layers of app usage by categories Using a hypothesis that users have different lifecycles in using category based apps, for eg. Productivity apps for first two months followed by entertainment in next two months, we will predict lifecycle of an app on an average user s device. Using data on installs and de-installs of many app for 100,000 customers, and modeling it on a Hidden Markov Model, we estimated lifecycles of 6 categories of apps on an average customer s device. Finally, these lifecycles were used to correlate life-cycle of individual apps using various app parameters. This analysis was order-complex and time consuming. Thus an Hadoop layer on R was used to reduce time and memory complexity on a single machine using HDFS.

Data Analysis (Big Data) Data Transformation and Analytics at large scale to deliver insights Our team of coders and domain analysts work together to code transformation and analytical processes. These codes are made scalable, time efficient, and memory efficient when databases runs beyond megabytes of memory and millions of rows using Hadoop and Distributed File Systems. With knowledge and command on languages like Python, SQL, Hadoop, NoSQL, R, Pig, MapReduce, our coders can make any analysis happen fast enough and our domain analysts can deliver useful insights in best representations possible.

Data Analysis Sentiment Analysis Navigating through millions of tweets to understand sentimental behavior of customers towards brands Millions of tweets talking about few famous brands were extracted using data harvesting techniques. Using Naïve Bayes Analysis and a training set of 100,000 tweets, a supervised classifier was established to decide sentiment of any tweet. This analysis involved prior cleaning of stop words and substituting prolonged chat versions of words with their English word counterparts. Sentiments of customers were analyzed for various brands. Further, change in sentiments of customers were analyzed after a brand re-tweets and engages with the customer to examine effects of customer engagement by big brands on twitter. This project was not feasible to complete in a record two month timeframe without use of big data technologies.

Data Analysis Personality Analysis Personality of Chief Executives and Analysts during Conference Calls were examined and correlated with performance of firms in following quarter Having access to parsed database of various earning conference calls of NASDAQ traded firms, we performed personality analysis of over 30,000 people (including chief executives and analysts) using pysho-linguistic databases of words and SVM technique. Each person was given a score from 1 to 8 in big five personality traits. These traits were correlated with following quarters conference calls and patterns were observed in personality of chief executives and their effect on financial (especially stock) performance in the following quarter.

Clients Researchers from

Contact Sachin Jaiswal VP, Business Development (P): +91-120 431 1139 (E): sachin.jaiswal@innovaccer.com