How To Learn Computer Science

Size: px
Start display at page:

Download "How To Learn Computer Science"

Transcription

1 BIG DATAN TUTKIMUS JA OPETUS JYVÄSKYLÄN YLIOPISTOSSA JYVÄSKYLÄN YLIOPISTO INFORMAATIOTEKNOLOGIAN TIEDEKUNTA 2014

2 2 SISÄLLYS JOHDANTO BIG DATA JYVÄSKYLÄN YLIOPISTON TUTKIMUKSESSA Tieteellinen laskenta ja data-analyysi Kyberturvallisuus Big data ja SOTE Tilastotiede Bio- ja ympäristötieteet Fysiikka BIG DATA JYVÄSKYLÄN YLIOPISTON KOULUTUKSESSA Sovelletun matematiikan maisteriohjelma Laskennalliset tieteet -maisteriohjelma Web Intelligence and Service Engineering (WISE) -maisteriohjelma Tilastotieteen koulutus Ihmistieteiden metodikeskus (IHME) Big dataan liittyvä opetus lukuvuonna CENTRE OF SIMULATION AND DATA-ANALYSIS BIG DATA -ALAN KEHITYKSEN SEURANTA Data-analyysiin liittyvät väitöskirjat LIITE 1: JYVÄSKYLÄN YLIOPISTON IT-TIEDEKUNNAN OPETUSOHJELMAAN LIITTYVÄT VERKKOKURSSIT LIITE 2: YHTEENVETO BIG DATA -ALASTA JA SOVELLUTUKSISTA JYVÄSKYLÄN YLIOPISTON IT-TIEDEKUNNASSA LIITE 3: BIG DATA - PARADIGM, MAJOR TOPICS AND TRENDS LIITE 4: VERKKO-OPPIMATERIAALIEN SOPIVUUDET IT-TIEDEKUNNAN KURSSIMATERIAALIKSI... 40

3 3 JOHDANTO Tässä raportissa käsitellään Jyväskylän yliopistossa annettavaa big datan ja dataanalyysin tutkimusta ja koulutusta. Jyväskylän yliopisto haluaa olla laaja-alaisesti mukana big datan tutkimuksessa ja koulutuksessa sekä alan kehittämisessä. Data-analyysin asiantuntijan tulee kyetä vaativaan tilastolliseen mallintamiseen ja osata ohjelmointia ja tiedonhallintaa. Näiden taitojen oppiminen puolestaan vaatii pohjakseen matematiikan osaamista. Alan vaativuuden vuoksi tutkijakoulutus on usein tarpeen käytännön työtehtävissä. Ohjelmistojen tekninen hallitseminen ei riitä lisäarvon tuottamiseen, vaan big datan hyödyntäminen vaatii sekä substanssialueen että analyysimenetelmien syvällistä ymmärtämistä. Vain näin voidaan varmistaa prosessien ja tulosten luotettavuus ja käyttökelpoisuus pitkällä aikavälillä. Data-analyysin ja big datan maisteri- ja tohtorikoulutus pohjautuu Jyväskylän yliopistossa vankkaan matematiikan, tietotekniikan ja tilastotieteen tutkimukseen. Jyväskylän yliopiston monitieteinen toimintaympäristö antaa erinomaisen lähtökohdan kehittää uusia data-analyysin ja big datan menetelmiä ja soveltaa niitä eri tieteen aloilla sekä yritysmaailman että julkisen sektorin osa-alueilla. Tässä raportissa kuvataan Jyväskylän yliopiston Centre of Simulation and Data-analysis toimintaympäristöä. Jyväskylässä Dekaani, professori Pekka Neittaanmäki

4 4 1 BIG DATA JYVÄSKYLÄN YLIOPISTON TUTKIMUKSESSA Data-analyysin asiantuntijan tulee kyetä vaativaan tilastolliseen mallintamiseen ja osata ohjelmointia ja tiedonhallintaa. Näiden taitojen oppiminen puolestaan vaatii pohjakseen matematiikan osaamista. Alan vaativuuden vuoksi tutkijakoulutus on usein tarpeen käytännön työtehtävissä. Ohjelmistojen tekninen hallitseminen ei riitä lisäarvon tuottamiseen, vaan big datan hyödyntäminen vaatii sekä substanssialueen että analyysimenetelmien syvällistä ymmärtämistä. Vain näin voidaan varmistaa prosessien ja tulosten luotettavuus ja käyttökelpoisuus pitkällä aikavälillä. Informaatioteknologian tiedekunnassa tehdään kansainvälisesti korkealaatuista IT-alan tutkimusta kahdella laitoksella, jotka ovat tietotekniikan ja tietojenkäsittelytieteiden laitokset. Tiedekunnan tutkimushankkeet liittyvät usein yhdessä kansallisten ja kansanvälisten tutkimuskumppaneiden ja teollisuuden kanssa tehtäviin tutkimus- ja kehityshankkeisiin. Tietotekniikan laitoksen tutkimus perustuu pääosin analyyttis-konstruktiivisten menetelmien käyttöön teknisestä, laskennallisesta, matemaattisesta tai pedagogisesta näkökulmasta. Data-analyysin opetusta ja tutkimusta tehdään tietotekniikan laitoksella osana laskennallisten tieteiden ja ohjelmistotekniikan koulutusta. (Esimerkkejä dataanalyysin teollisista sovellutuksista löytyy Tietojenkäsittelytieteiden laitoksen tutkimuksessa tarkastellaan tietojärjestelmiä ja tietojenkäsittelyä neljästä näkökulmasta: teknologinen, ihmiskeskeinen, liiketoiminnallinen ja informaatiokeskeinen. Nämä näkökulmat muodostavat laitoksen yleisen tehtävän: ymmärtää, kehittää, suunnitella ja hallita tietojärjestelmiä ja tietojenkäsittelyä sekä niiden vaikutuksia kokonaisvaltaisesti käyttökontekstissaan. Data-analyysiin ja big dataan liittyvää tutkimusta tehdään IT-tiedekunnan lisäksi matematiikan ja tilastotieteen laitoksella, ihmistieteiden metodikeskuksessa (IHME) sekä sovelletaan useissa tutkimusryhmissä eri puolilla yliopistoa, mm. bio- ja ympäristötieteet, fysiikka, kemia ja sosiaalitieteet. Seuraavassa on kuvattu lyhyesti data-analyysiin ja big dataan liittyvää tutkimustoimintaa Jyväskylän yliopistossa.

5 1.1 Tieteellinen laskenta ja data-analyysi 5 Tieteellisen laskennan tutkimusaloja ovat matemaattinen mallintaminen, luotettava malli- ja datapohjainen simulointi, optimointi, adaptiiviset ja tehokkaat numeeriset laskentamenetelmät, epävarmuuden huomioiminen numeerisessa simuloinnissa, hajautettujen systeemien säätö, spline ja spline wavelet tekniikat signaalin ja kuvankäsittelyssä, dynaamiset systeemit ja nanoelektroniikan mallinnus. Data-analyysin tutkimusaloja ovat analysointimenetelmien kehittäminen, erityisesti numeriikka ja massiivisen datan luokittelutekniikat, hyperspektrikameran datan analysointitekniikoiden kehittäminen ja tekniikan soveltaminen sen osa-alueilla: solubiologia, lääketiede, ympäristötiede, maa- ja metsätalous, kemialliset aseet, rikospaikkatutkimustekniikka. Lisäksi yhteistyöhankkeita on mm. fysiikan ja aivotutkimuksen alueilla. 1.2 Kyberturvallisuus Tietotekniikan laitoksella tutkitaan tietotekniikkaa teknis-matemaattisesta näkökulmasta. Tutkimuskohteena on informaation käsittelyprosessien tehokas automatisointi. Tutkimuksen painoalat liittyvät informaatioteknologian keskeisiin alueisiin, kuten uudenlaisten tietojenkäsittelysovellusten ja ohjelmistojen suunnitteluun, tietoverkkojen tiedonsiirtojärjestelmien suunnitteluun ja hallintaan sekä tehokasta tietokonelaskentaa hyödyntävien numeeristen ja matemaattisten menetelmien ja mallien käyttöön, esimerkiksi teollisten tuotteiden suunnittelussa, teollisten prosessien ohjauksessa, luonnontieteellisessä mallintamisessa ja suurten tietoaineistojen analyysissä. Laitoksen tutkimushankkeita, joilla on vahva yhteys big dataan, ovat mm: Cyber Attacks Protection of Critical Infrastructures Tutkimuksessa kehitetään innovatiivista tietojärjestelmien turvaamiseen liittyvää menetelmää. Menetelmä tutkii tietomassoista epänormaaleja käyttäytymismalleja ja tekee analyysin pohjalta päätelmiä havaintojen vakavuudesta tietojärjestelmän turvallisuudelle. Hankkeessa tutkitaan teknologioita, joiden avulla voidaan automaattisesti tunnistaa, havaita ja luokitella erilaisia haittaohjelmia. Big Data Analytics - Data-Driven Methods for Cyber Security Tutkimushankkeessa kehitetään automaattisia ja puoliautomaattisia laskentametodeja, analyysityökaluja ja ohjelmistoalgoritmeja, joilla voidaan analysoida suuria tietovarantoja, jotta voidaan löytää ja määritellä tuntemattomia toimintoja ja niihin liittyviä trendejä ja erityyppisten datojen välisiä suhteita mukaan lukien haittaohjelmien havaitseminen. Organizing and analyzing massive high dimensional datasets Tutkimushankkeen kohteena on korkea-dimensionaalisen datan analysointi. Tutkimuksessa järjestellään, klusteroidaan ja luokitellaan korkea-dimensionaalista dataa sekä

6 6 tunnistetaan siitä poikkeuksia ja anomalioita. Tutkimuksessa kehitetään ydinteknologioita haittaohjelmien automaattiseen tunnistamiseen. IT-tiedekunnan big datan sekä perustutkimuksella että soveltavalla tutkimuksella kyetään jatkuvasti tuottamaan korkeatasoisia uusia innovaatioita ja tieteellisiä läpimurtoja. 1.3 Big data ja SOTE Esimerkki 1: Sairaalasuunnittelu Sairaalaympäristö on hyvä esimerkki big datan hyödyntämismahdollisuuksista. Analysoimalla dataa potilaiden, henkilökunnan, materiaalien sekä laitteiden logistiikasta, potilaille tehtävistä hoitotoimenpiteistä sekä resurssien käytöstä, on mahdollista suunnitella sairaalan hoito- sekä tukipalveluprosessit optimaalisella tavalla toteutettaviksi. Jyväskylän yliopiston IT-tiedekunnassa keskitytään sairaalatoimintojen kehittämiseen ja prosessien tehostamiseen. Hyvänä esimerkkinä tästä on Keski-Suomen sairaanhoitopiirin Uusi sairaala -projekti, jossa tiedekunta on ollut mukana jo kahden vuoden ajan: Esimerkki 2: Sosiaali- ja terveydenhuollon prosessit Pyrittäessä potilaan hoidon kokonaisvaltaiseen optimointiin, pitää tarkastelua laajentaa organisaatiotasolta potilaan koko hoitoketjun tarkasteluun. Tässä yhteydessä big data ajattelu nousee kokonaan uudelle tasolle. Tämän tiedon perusteella luotujen laskennallisten mallien avulla voidaan lähteä etsimään optimaalista tapaa toteuttaa eri potilaiden hoito sekä luomaan ennusteita siitä missä vaiheessa hoitoon pitäisi puuttua, jotta potilaan tila ei ehtisi huonontumaan. Tällä hetkellä laskennallisen prosessianalytiikan tutkimusryhmä on kehittämässä uudenlaista työkalua tämän lähestymistavan konkretisointiin ja big datan tehokkaampaan hyödyntämiseen: Tämän lisäksi IT-tiedekunnassa on käynnistynyt hanke, jossa tähän kokonaisuuteen liitetään mukaan vielä kustannusmuuttujat ja eri rahoittaja-/maksajatahot: Tilastotiede Tilastotiede vastaa kysymyksiin, kuinka dataa tulisi kerätä, kuinka toimia, kun data on valikoitunutta, ja kuinka epävarmuutta voi hallita. Jyväskylän yliopistossa tilastotieteen tutkimusaloja ovat tutkimusten kustannustehokas suunnittelu, biometria ja ympäristötilastotiede, rakenneyhtälömallit, puuttuvan tiedon käsittely, parametrittomat mene-

7 7 telmät, spatiaalinen tilastotiede ja aikasarja-analyysi, Tutkimusta tehdään läheisessä yhteistyössä muiden tieteenalojen, tutkimuslaitosten (THL, RKTL, METLA, SYKE) ja yritysten kanssa. Yhteishankkeessa THL:n kanssa pohditaan, kuinka väestön terveydentilaa voisi seurata tehokkaasti, kun perinteisten terveystarkastustutkimusten osallistumisaktiivisuus laskee. Yhteistyössä liike-elämän edustajien kanssa on selvitetty, kuinka yritykset, joilla ei ole suoraa yhteyttä asiakkaisiinsa, voisivat määrittää asiakassegmenttiensä arvon dataan perustuen ja käyttää tätä tietoa liiketoimintansa ohjaamisessa (J. Karvanen, A. Rantanen, L. Luoma (2014). Survey data and Bayesian analysis: a cost-efficient way to estimate customer equity. Accepted for publication in Quantitative Marketing and Economics, Bio- ja ympäristötieteet Bio- ja ympäristötieteissä, erityisesti bio-kuvantamisen alalla datamäärän kasvu on huikea. Hyväresoluutioinen mikroskooppidata tuottaa ison määrän dataa varsinkin kun pyrkimys nykyään on tuottaa kolmiulotteista dataa ajan suhteen (neliulotteista dataa). Mikroskooppikuvantamista käytetään tänä päivänä paljolti potentiaalisten lääkkeiden testaamiseen isoissa näyteaineistossa tai kun pyritään tunnistamaan mitkä solun omat molekyylit ovat tärkeitä solun fysiologisten tai patologisten prosessien säätelijöitä. Nämä tutkimukset ovat ns. High-throughput-tutkimuksia, joissa lyhyessä ajassa tuotetaan huikeita määriä mikroskooppidataa, joista pitää tehokkaasti ja automaattisesti tulkita tutkimustulokset. Bio- ja ympäristötieteiden laitoksella on kehitetty yhteistyössä oma ohjelmistoalusta BioImageXD ( joka pyrkii neliulotteisen datan tehokkaaseen prosessointiin, analysointiin ja animointiin. Ohjelmistoa kehitetään lähitulevaisuudessa helpottamaan high-throughput-aineistojen analysointia. Esimerkkinä tutkimuksesta on Kankaanpää et al., BioImageXD - an open general purpose and high throughput image processing and analysis platform for biomedical images, Nat Methods Jun 28;9(7): Fysiikka Fysiikan laitoksella big dataan liittyvää tutkimustietoa sovelletaan mm. syklotroni laboratoriossa, nanotieteissä ja materiaalitieteissä. Jyväskylän yliopiston fysiikan laitoksen kiihdytinlaboratorion (JYFL-ACCLAB) tieteellisenä päätehtävänä on tuottaa perustietoa aineen sub-atomaarisesta rakenteesta tutkimalla eksoottisia atomien ytimiä. Tutkimus on luonteeltaan kokeellista. Tutkimusaineistoja tuotetaan useissa mittalaitteistoissa, joiden käyttöön laboratorion kolme hiukkaskiihdytintä tuottavat ionisuihkuja. Tutkimusprojektien tuottaman datan määrä vaihtelee suuresti, muutamista kilotavuista kymmeniin teratavuihin mittausta kohden.

8 8 Kokonaisuudessaan laboratorion tuottama aineistomäärä vaihtelee vuosittain 30-70TB:n välillä. Projektien elinkaari mittauksista julkaisuun saattaa olla useiden vuosien mittainen, joten aineistoille tarvitaan luotettava keskipitkän aikavälin tallennusratkaisu. Laboratorion tutkijat ottavat osaa myös muissa laboratorioissa (esim. CERN/Isolde, GSI/FAIR) toteutettaviin kokeisiin, joissa tuotettuihin aineistoihin pätevät samat tallennuskriteerit. Aineistot ovat pääosin mittausaineistoja ydinfysiikan kokeellisista tutkimusprojekteista. Aineistojen uudelleenkäyttö ja -analysointi tutkimusalan sisällä on normaali käytäntö. Aineistojen käyttömahdollisuudet muilla tutkimusaloilla ovat rajalliset.

9 9 2 BIG DATA JYVÄSKYLÄN YLIOPISTON KOULUTUKSESSA Informaatioteknologian tiedekunta vastaa kehittyvän informaatioteknologian sekä digitalisoitumisen tuomiin tutkimus- ja koulutushaasteisiin. Tiedekunta yhdistää kokonaisvaltaisesti teknologian, informaation, organisaatioiden ja liiketoiminnan sekä ihmisen näkökulmat niin tutkimuksessa, koulutuksessa kuin sidosryhmäyhteistyössä. Tiedekunta kouluttaa informaatioteknologian laaja-alaisia ja kansainvälisiä osaajia sekä kauppatieteellisellä että luonnontieteellisellä koulutusalalla. Informaatioteknologian tiedekunnalla on keskeinen rooli yliopiston painoaloihin kuuluvan ihmisläheisen teknologian kehittämisessä. Tiedekunnan keskeinen vahvuus on kyvykkyys tarkastella informaatioteknologiaa laajasti, useita näkökulmia yhdistäen ja eri ilmiöiden yhteisvaikutuksia tunnistaen. Tämä yhdistyy kansainvälisesti arvostettuun huippututkimukseen kärkialoilla ja aktiiviseen toimijuuteen ympäröivän yhteiskunnan kanssa. IT-tiedekunta on saavuttanut johtavan aseman laskennallisissa tieteissä, kyberturvallisuudessa, tietojärjestelmätieteissä ja edustaa ainoana IT-alan tiedekuntana kognitiotieteen tutkimusta ja opetusta. Big datan ja data-analyysin koulutusta annetaan IT-tiedekunnan seuraavissa monitieteisissä maisteriohjelmissa: sovellettu matematiikka, laskennalliset tieteet ja Web Intelligence and Service Engineering (WISE). 2.1 Sovelletun matematiikan maisteriohjelma Sovelletun matematiikan avulla pyritään ratkaisemaan tosielämän ongelmia. Sovelletun matematiikan tavoitteena on mallintaa erilaisia ilmiöitä, kuvailla niitä ja yrittää ymmärtää niitä. Sovelletun matematiikan opiskelussa yhdistyy tieteellisen laskennan käsitteet ja menetelmät, joita käytetään kysymyksiin, jotka ilmentyvät matematiikan ja muiden tieteenalojen rajapinnoissa. Jyväskylän yliopistossa opinnoissa keskitytään sellaisiin osa-alueisiin, kuten funktionaalianalyysi, mitta- ja integraaliteoria, kompleksianalyysi, numeerinen analyysi, optimointi ja simulointi.

10 Laskennalliset tieteet -maisteriohjelma Laskennallisten tieteiden maisteriohjelmassa käsitellään jatkuvan ja diskreetin simuloinnin periaatteet ja sovelluskohteet. Tavoitteena ovat jatkuvien simulointimallien tavallisimmat diskretisointimenetelmät ja niiden tehokkaan toteuttamisen perusperiaatteet moderneissa tietokonearkkitehtuureissa ja lisäksi yksi- ja monitavoitteisen epälineaarisen optimoinnin periaatteet ja ratkaisumenetelmät. Opetuksessa muodostetaan tekniikan ja luonnontieteiden ilmiöille matemaattisia simulointimalleja. Opetuksessa käsitellään laaja-alaisesti tilastotieteen, numeerisen laskennan ja ohjelmoinnin käsitteitä ja menetelmiä. Data-analyysissä opetetaan ja tutkitaan menetelmiä ja lähestymistapoja, joilla eritavoin kerätystä tiedosta (data) pyritään muodostamaan malleja ja korkeampaa tai tarkempaa informaatiota. 2.3 Web Intelligence and Service Engineering (WISE) - maisteriohjelma Web Intelligence and Service Engineering keskittyy suunnittelemaan web-pohjaisia sovelluksia, jotka auttavat verkossa toimivaa palveluyhteiskuntaa niin julkisella kuin yksityisellä sektorilla. Maisteriohjelmalla on suora yhteys big data strategian tavoitteisiin, kuten verkossa olevan datan käsittelyssä tarvittavaan "älykkyyteen" ja tämän päälle rakennettaviin palveluihin. On completion of the programme, the graduates will be able to use and design complex self-managed Web-based public and industrial systems, digital ecosystems 2, platforms, services and applications; will be able to connect their designs with publicly available data and Web-based capabilities as services; will be able to figure-out and approach various challenging aspects of wicked problems world-wide, which require self-managed service-based architectures for their solutions; understand and professionally utilize for that purpose knowledge on enabling technologies and tools; perform academic doctoral level studies; will be skilful in international communication due to the integrated language and communication studies. Students, who will graduate from the programme, will think beyond the routine and will be able not just to adapt to a change but to help to create and control it. 2.4 Tilastotieteen koulutus Tilastotiede, jota Jyväskylän yliopistossa voi opiskella sekä pää- että sivuaineena, antaa hyvät valmiudet käytännön data-analyysiin. Tilastotiede vastaa kysymyksiin, kuinka dataa tulisi kerätä, kuinka toimia, kun data on valikoitunutta, ja kuinka epävarmuutta voi hallita. Tilastotieteen pääaineopintoihin kuuluu paljon myös matematiikan ja tietotekniikan opintoja ja tutkinto luo täten erinomaisen pohjan big data -tehtäviin.

11 11 Esimerkkinä big data -koulutuksesta voidaan mainita Jyväskylän yliopiston kesäkoulussa 2013 toteutetun tilastotieteen kurssin "Industrial data science", jonka luennoitsijat edustivat suomalaisen big data -osaamisen huippua yritysmaailmassa. 2.5 Ihmistieteiden metodikeskus (IHME) Informaatioteknologian tiedekunnan panos Ihmistieteiden metodikeskuksen opetuksessa on merkittävä liittyen data-analytiikan koulutukseen ja kehittämistyöhön. Ihmistieteiden metodikeskus (IHME) tarjoaa Jyväskylän yliopiston jatko-opiskelijoille ja tutkijoille tutkimusmenetelmien ja -etiikan koulutusta yli tiedekuntarajojen ja edistää toiminnallaan tieteiden välistä tutkimusyhteistyötä ja tutkimusmenetelmien innovatiivista käyttöä. Big data -lähestymistapa sisältyy teemana IHME:en koulutukseen, jossa lähtökohtana on aineistolähtöisten analyysien hyödyntäminen monimenetelmäisessä ja -alaisessa metodikoulutuksessa. IT-tiedekunnassa tehtävään tutkimukseen perustuvan koulutuksen tavoitteina on alan kehitystä seuraten lisätä eri alojen tohtorikoulutettavien datatietoisuutta, erityisesti ymmärrystä siitä minkälaisia analyysejä ja niihin perustuvia tulkintoja voidaan big datan eri menetelmiin perustuen luotettavasti tehdä, miten big data -lähestymistapaa voidaan soveltaa eri tieteenalojen tutkimuksessa sekä minkälaisia tutkimuseettisiä valmiuksia ja menettelyitä big data - lähestymistapa edellyttää. IT-tiedekunnan muille tiedekunnille antama opetus myös toteutuu IHME-yhteistyön kautta. 2.6 Big dataan liittyvä opetus lukuvuonna Lukuvuoden opetusohjelma valmistuu Ohjelmassa tulee olemaan useita big dataan, datan luokitteluun, laskennalliseen tilastotieteeseen ja tietokantoihin (mm. NoSQL) sekä sovellutuksiin liittyviä kursseja. Lisäksi opiskelijoille tarjotaan mahdollisuus suorittaa verkkokursseja. Liitteenä 1 on lista Jyväskylän yliopiston IT-tiedekunnan opetusohjelmaan liittyvistä verkkokursseista. Päivitetty versio listasta julkaistaan

12 12 3 CENTRE OF SIMULATION AND DATA-ANALYSIS Datan määrä ja asema yhteiskunnassa on radikaalisti muuttumassa: datan määrä kasvaa eksponentiaalisesti jalostettu ja analysoitu data on yhä keskeisempi tuottavuutta ja kilpailukykyä voimistava tekijä datan tuottaminen ja jalostaminen tulevat merkittäviksi liiketoiminnan alueiksi datan perusteella luodun tiedon esittämisen muodot ja keinot monipuolistuvat data-analyysi on yksi voimakkaimmin kasvavista teknologia-alueista suurien datamassojen käsittelystä on muodostunut uusi tieteen paradigma data-analyysi muuttaa merkittävästi digitaalista palvelutuotantoa Tapahtuva muutos antaa paljon tehtäviä tutkimukselle. Toisaalta tarvitaan tutkimusta, joka liittyy datan tekniseen hallintaan, sen siirtämiseen, analysointiin ja jalostamiseen sekä turvallisuuteen erityisesti päätöksenteon tueksi. Tällainen tutkimus tukee kansantalouden kilpailukykyä ja tuotantoa. Toisaalta tarvitaan tutkimusta, joka auttaa ohjaamaan tietoyhteiskunnan kehitystä. Tällöin tutkimuskohteena on inhimillinen näkökulma datan käsittelyyn, sen luottamuksellisuuteen ja yksilön roolista data-analyysin tulosten käytön kohteena. Big data-analyysiä voi lähestyä yleisesti käytettyjen neljän V:n määrittelyjen perusteella: Volume: datan määrä (sekä havaintojen että muuttujien) Variety: datan moninaisuus ja heterogeenisuus Velocity: nopeus jolla dataa syntyy Veracity: datan laatu Suomessa on osaamista mm. lääketieteellisessä tutkimuksessa, mobiiliteknologioissa, peliteollisuudessa ja ympäristömonitoroinnissa, jotka kaikki ovat hyvin dataintensiivisiä ja sen monimuotoiseen analyysiin perustuvia aloja. Suomella on myös vahvaa menetelmä- ja IT-osaamista, jota muuntamalla ja hyödyntämällä koulutuksen, tutkimuksen ja asiantuntemuksen jakamisen kautta saataisiin mukaan Big Data -kehitystyöhön. Big datan hyödyntäminen julkisella sektorilla on vasta alkutekijöissään, mutta tarjoaa suuria mahdollisuuksia niin palvelujen kuin prosessienkin parantamiseen ja tehostamiseen sekä uusiin toimintatapoihin. Suomi on ollut edelläkävijämaita avoimessa datassa ja julkinen sektori on avaamassa tietoaineistojaan. Tätä avoimuuden ja julkisten tietovarantojen saatavuuden kulttuuria tulisi hyödyntää myös Big data -kehitystyössä. Julkisten ja yksityisten data-aineistojen yhdistämisessä ja analyysissä voidaan saavuttaa merkittäviä eri osapuolia hyödyttäviä tuloksia.

13 13 Tehokas big data -tutkimus edellyttää moninaisia eri tieteenalojen tutkimusryhmiä. Tietoa louhitaan ja analysoidaan yhteistyössä muiden kanssa ja aineistolle esitetään yhä uusia kysymyksiä. Tällainen toimintatapa antaa mahdollisuuden ymmärtää laajasti digitaalista aineistoa ja tuottamaan yhä laaja-alaisempia ja tarkempia perusteita, analyysejä ja ennusteita päätöksentekijöiden käyttöön. Big data -tutkimukselle on useita sovelluskohteita. SOTE-alalla Suomella on suuri potentiaali big datan suhteen. Suomesta löytyy maailmanlaajuisesti katsoen poikkeuksellisen laadukkaita ja kattavia tietokantoja. SOTE-uudistuksen myötä on mahdollisuus tutkia, kokeilla ja ottaa käyttöön big dataan perustuvia ratkaisuja. Dataan perustuvista hoitomenetelmistä ja -käytännöistä on jo saatu merkittäviä tuloksia ja hyvillä ratkaisuilla voidaan saavuttaa taloudellisia säästöjä. Big data ajattelutapana (datatietoisuus) ja teknologiana antaa uudenlaisia näkökulmia julkishallinnolle edistää tuottavamman yhteiskunnan ja kestävyysvajeen torjumisen strategisia päätavoitteita, lisäten samalla kansalaisten tyytyväisyyttä julkisiin palveluihin. Big datan avulla on mahdollista realisoida tuottavuushyötyjä useimmilla hallinnon alueilla. Datalähtöisempää julkishallintoa voidaan tarkastella seuraavilla osa-alueilla: datalähtöinen päätöksenteko ja jatkuva organisaatiokehitys kansalaisten digitaaliset julkiset palvelut yritysten ja kansalaisten parempi osallistaminen julkisten palveluiden kehitykseen Teollinen internet (IoT) antaa big data-tutkimukselle useita sovellusalueita, kuten valmistavan teollisuuden prosessit ja niiden optimointi, ennakoiva huolto, energian käytön hallinta, käyttöomaisuuden hallinta ja ennakoiva huolto. Alan tutkimukselle on laajoja mahdollisuuksia myös muualla elinkeinoelämässä, kuten kaupan ja logistiikan alueella, rakentamisessa ja kiinteistöjen hoidossa sekä kunnallisten ja muiden julkisten palvelujen tuottamisessa. Kuvassa 1 on esitetty simuloinnin ja data-analyysin tutkimusympäristö.

14 14 KUVA 1 Centre entre of simulation and data data-analysis analysis

15 15 4 BIG DATA -ALAN KEHITYKSEN SEURANTA Big data -ala ja sovellutukset kuuluvat IT-tiedekunnan strategisiin koulutus- ja tutkimusalueisiin. Alan kehityksestä tehdään puolen vuoden välein yhteenvetoja. Liitteenä 2 on tehty yhteenveto. Päivitetty versio julkaistaan elokuussa Data-analyysiin liittyvät väitöskirjat Data-analyysiin ja big dataan liittyen on lukuvuonna julkaistu seuraavat väitöskirjat: Guy Wolf: Big high-dimensional data analysis with diffusion maps Ilkka Pölönen: Discovering knowledge in various applications with a novel hyperspectral imager Tuomo Sipola: Knowledge discovery using diffusion maps Mikhail Zolotukhinin: On data mining applications in mobile networking and network security Vuoden 2014 loppupuolella tullaan data-analyysin alueelta julkaisemaan viisi väitöskirjaa.

16 16 LIITE 1: JYVÄSKYLÄN YLIOPISTON IT-TIEDEKUNNAN OPETUSOH- JELMAAN LIITTYVÄT VERKKOKURSSIT Päivitetty versio listasta julkaistaan name 1 Statistics One 2 Statistics: Making Sense of Data 3 Statistics 4 Introduction to Statistics: Descriptive Statistics 5 Mathematical Statistics 6 Introduction to Probability and Statistics 7 Statistics for Applications 8 Probability and statistics 9 Probability & Statistics 10 Statistical Reasoning 11 An Introduction to Interactive Programming in Python 12 Introduction to Programming for Digital Artists 13 Creative, Serious and Playful Science of Android Apps 14 Introduction to Computer Science 15 Introduction to Computer Science and Programming 16 Introduction to Computer Science I 17 Introduction to Computer Science and Programming 18 Computer Science 19 Principles of Computing 20 Media Programming 21 Learn to Program: The Fundamentals 22 Ohjelmoinnin MOOC 23 Object-Oriented programming with Java, part I 24 Peliohjelmoinnin MOOC 25 Introduction to Systematic Program Design 26 Algoritmien MOOC 27 Algorithms, Part I 28 Algorithms, Part II 29 Algorithms: Design and Analysis, Part 1 30 Algorithms: Design and Analysis, Part 2 31 Algorithms 32 Introduction to Algorithms 33 Computer Algorithms in Systems Engineering 34 Game Theory

17 17 35 Games without Chance: Combinatorial Game Theory 36 General Game Playing 37 Advanced Algorithms 38 Introduction to Theoretical Computer Science 39 Interactive 3D Graphics 40 Foundations of Computer Graphics 41 Computational Geometry 42 Computer Graphics 43 Computer System Engineering 44 Programming Languages 45 Design of Computer Programs 46 Introduction to Data Science 47 Database Systems 48 Introduction to Computer Networks 49 Computer Architecture 50 Computer System Architecture 51 Artificial Intelligence for Robotics 52 Control of Mobile Robots 53 Cryptography I 54 Cryptography II 55 Cryptography and Cryptanalysis 56 Network and Computer Security 57 Applied Cryptography 58 Computer Security 59 Information Security and Risk Management in Context 60 Malicious Software and its Underground Economy: Two Sides to Every Story 61 Selected Topics in Cryptography 62 Advanced Topics in Cryptography 63 Designing and Executing Information Security Strategies 64 Building an Information Risk Management Toolkit 65 Network and Computer Security 66 Automata, Computability, and Complexity 67 Automata 68 Software Debugging 69 Software Testing 70 Functional Programming Principles in Scala 71 Creative, Serious and Playful Science of Android Apps 72 Computational Methods for Data Analysis 73 Computing for Data Analysis 74 Web Intelligence and Big Data 75 Data Mining 76 Statistics and Visualization for Data Analysis and Inference 77 Scientific Computing 78 Computing for Data Analysis 79 Data Analysis

18 18 80 High Performance Scientific Computing 81 Statistics: Making Sense of Data 82 Computational Methods for Data Analysis 83 Metadata: Organizing and Discovering Information 84 Machine Learning 85 Neural Networks for Machine Learning 86 Machine Learning 87 Computational Neuroscience 88 Dynamical Modeling Methods for Systems Biology 89 Introduction to Artificial Intelligence 90 Artificial Intelligence 91 Digital Signal Processing 92 Introduction to Communication, Control, and Signal Processing 93 Signal Processing: Continuous and Discrete 94 Discrete-Time Signal Processing 95 Digital Signal Processing 96 Signals and Systems 97 Machine Vision 98 Pattern Recognition for Machine Vision 99 Computer Vision 100 Computer Vision: The Fundamentals 101 Computer Vision: From 3D Reconstruction to Visual Recognition 102 Fundamentals of Digital Image and Video Processing 103 Linear and Discrete Optimization 104 Linear and Integer Programming 105 Systems Optimization 106 Optimization Methods 107 Nonlinear Programming 108 Everything is the Same: Modeling Engineered Systems 109 Introduction to Numerical Simulation 110 Introduction to Modeling and Simulation 111 Functional Hardware Verification 112 Introduction to Parallel Programming 113 Heterogeneous Parallel Programming 114 Parallel Computing 115 Theory of Parallel Systems 116 Differential Equations in Action 117 Linear Algebra 118 Coding the Matrix: Linear Algebra through Computer Science Applications 119 Calculus One 120 Calculus: Single Variable 121 Pre-Calculus 122 Visualizing Algebra 123 Web Development 124 HTML5 Game Development

19 Pattern-Oriented Software Architectures for Concurrent and Networked Software 126 CS169.1x: Software as a Service 127 Networked Life 128 Social Network Analysis 129 Videogames and Learning 130 Fundamentals of Online Education: Planning and Application 131 Gamification 132 Live!: A History of Art for Artists, Animators and Gamers 133 Online Games: Literature, New Media, and Narrative 134 How to Build a Startup 135 Startup Engineering 136 Startup 137 Leading Strategic Innovation in Organizations 138 Grow to Greatness: Smart Growth for Private Businesses, Part I 139 Grow to Greatness: Smart Growth for Private Businesses, Part II 140 Design Thinking for Business Innovation 141 An Introduction to Operations Management 142 Developing Innovative Ideas for New Companies 143 Creativity, Innovation, and Change 144 New Models of Business in Society 145 Foundations of Business Strategy 146 Design Thinking for Business Innovation 147 Content Strategy for Professionals: Engaging Audiences for Your Organization 148 International Organizations Management 149 Inspiring Leadership through Emotional Intelligence 150 Critical Perspectives on Management 151 Law and the Entrepreneur 152 Copyright 153 Markets with Frictions 154 An Introduction to Financial Accounting 155 Introduction to Finance 156 Corporate Finance 157 Organizational Analysis 158 Understanding economic policymaking

20 20 LIITE 2: YHTEENVETO BIG DATA -ALASTA JA SOVELLUTUKSISTA JYVÄSKYLÄN YLIOPISTON IT-TIEDEKUNNASSA Muistio Jyväskylä Asia: Big data Lisätietoja: dekaani professori Pekka Neittaanmäki, Luennot Kesä 2013, Jyväskylän kansainvälinen kesäkoulu - Amir Averbuch: TIEJ658 COM6: Advanced Methods for Classification of Big High Dimensional Data (JSS23), 2 op Lukuvuosi Gil David: ITKST47 Advanced Anomaly Detection: Theory, Algorithms and Applications, 5 op - Gil David: ITKST48 Advanced Persistence Threat, 5 op, Advanced Persistence Threat exploitation cycle - Mauri Leppänen: ITKA204 Tietokannat ja tiedonhallinnan perusteet, 4 op - Tommi Kärkkäinen: TIES445 Tiedonlouhinta, 5 op - Oleksiy Mazhelis: TJTSM61 Business Analytics and Big Data Management, 5 op - Michael Cochez: tammikuussa 2015 kurssi nimeltä "Big Data Engineering Väitöskirjat Guy Wolf: Big high-dimensional data analysis with diffusion maps. - Ilkka Pölönen: Discovering knowledge in various applications with a novel hyperspectral imager - Tuomo Sipola: Knowledge discovery using diffusion maps Väitöskirja tekeillä - Limor Gavish: Memcached - nosql and big data databased particularly for caching. Hankkeita - Pekka Neittaanmäki: New System for Cyber Attacks Protection of Critical Infrastructures , CAP-projekti ( , Tekes)

21 21 - Jari Veijalainen: Tiedonkaivuu sosiaalisesta mediasta, MineSocMed-projekti ( , SA) tarkoitus kehittää sosiaalisen median analyysialgoritmeja. - Timo Hämäläinen: Suurien moniulotteisten datajoukkojen järjestäminen ja analysointi, HIDE-hanke ( , Tekes) - Timo Hämäläinen: Kiinteistöautomaatiojärjestelmien datan älykäs analysointi, KIIAUDATA-hanke ( , Tekes) - Amir Averbuch: MeBUD: Methods For Big Unstructured High Dimensional Data ( , haettu rahoitusta SA) Maisterikoulutus Data-analyysin monitieteellinen maisterikoulutus (DATA) on Tietotekniikan sekä Matematiikan ja tilastotieteen laitoksien yhteinen ohjelma. Opetuksessa käsitellään laajaalaisesti tilastotieteen, numeerisen laskennan ja ohjelmoinnin käsitteitä ja menetelmiä. Data-analyysissä opetetaan ja tutkitaan menetelmiä ja lähestymistapoja, joilla eritavoin kerätystä tiedosta (data) pyritään muodostamaan malleja ja korkeampaa tai tarkempaa informaatiota. Data-analyysin maisterikoulutus vastaa muuttuvan maailman tilanteeseen, jossa suurien data-aineistojen automaattisesta analysoinnista on tullut keskeinen työkalu useilla aloilla. Koulutuksen tavoitteena on antaa opiskelijoille data-analyysiin liittyvää erikoisosaamista sekä tilastollisista menetelmistä että niiden soveltamisesta tietokoneisiin Muuta tutkimusta - Michael Cochez: A book chapter called 'Toward Evolving Knowledge Ecosystems for Big Data Understanding' in book called Big Data Computing, 2013, In this chapter we propose the use of ecosystems known from biology to tackle the big data problem. - Michael Cochez: much of my own research is related to big data, mainly the alignment of huge ontologies. - Timo Hämäläinen: Verkkoliikenteen analyysiin liittyvää tutkimusta (myös TIES326 tietoturvakurssilla näitä esillä). Näitä jatketaan uusien datamöykkyjen kanssa (JAMK:n labra etc.). Timo Hämäläinen: Network traffic analysis Nowadays HTTP servers and applications are some of the most popular targets for network attacks. The easiest way to carry out such attacks is to inject malicious code into HTTP request messages. Such intrusive requests can be extremely dangerous since they can corrupt the server or collect confidential information from the server databases. One of the options to detect these attacks is to process all HTTP queries as text lines and transform them to numeric feature matrices, which then are used to find intrusions with anomaly detection algorithms. However, since there can be many different kinds of requests to different HTTP applications for each unique web resource one feature matrix is constructed and analyzed as a rule. Thus, for a huge HTTP server several thousands of matrices would need to be built, which is not efficient from the

22 22 computing resources point-of-view. In addition, it is also difficult to define normal users behavior for resources that have been requested few times. In this research, we considered an algorithm for HTTP intrusions detection based on simple clustering algorithms and advanced processing of HTTP requests which allows the analysis of all queries at once and does not separate them by resource. The method proposed allows detection of HTTP intrusions in case of continuously updated web-applications. The algorithm is tested using logs acquired from a large real-life web service and, as a result, almost all attacks from these logs are detected, while the number of false alarms remains very low. [1] We have also studied online detection of anomalous HTTP requests with Growing Hierarchical Self-Organizing Maps (GHSOMs). By applying an n- gram model to HTTP requests from network logs, feature matrices were formed. GHSOMs are then used to analyze these matrices and detect anomalous requests among new requests received by the webserver. The system proposed is self-adaptive and allows detection of online malicious attacks in the case of continuously updated web-applications. The method is tested with network logs, which include normal and intrusive requests. Almost all anomalous requests from these logs are detected while keeping the false positive rate at a very low level.[2] In addition, real-world network logs were analyzed using dimensionality reduction technique called Diffusion Map (DM). First, some features like n-grams or character frequency was calculated to form a feature matrix. The dimensionality of this matrix was reduced for the purposes of visualization and facilitating subsequent clustering and other analysis phases. Principal Component Analysis (PCA) was used as a comparison, since it s one of the most frequently used methodologies. The main advantage of Diffusion Maps is the fact that it can handle non-linear dependencies in the data, which PCA cannot. DM is used successfully to find actual intrusions from the data, as well as visualizing the structure of the network traffic. Subsequently, a clustering algorithm, such as k-means, can be used to find anomalies and structures in the data. These will help network administrators detect intrusions and other anomalies from a network. Also, a rule extraction algorithm was implemented an d tested. This idea comes from the world of credit card fraud detection. The point is to first analyze and cluster network traffic using methods described previously. The problem is that running dimensionality reduction and clustering algorithms continuously can be computationally expensive. For this reason, conjunctive rule extraction is used to automatically generate signatures that approximate the traffic clustering result. Using these rules is very efficient, and they can be periodically updated completely automatically. This approach combines the accurate clustering of the advanced algorithms and the efficiency of using simple signature rules to classify traffic into normal and anomalous. [A,3,4] [A] T. Sipola, A. Juvonen, J. Lehtonen: Dimensionality Reduction Framework for Detecting Anomalies from Network Logs, Engineering Intelligent Systems, 20(1):87-97, 2012 [1] M. Zolotukhin, T. Hämäläinen: Detection of Anomalous HTTP Requests Based on Advanced N-gram Model and Clustering Techniques. NEW2AN 2013: International

23 23 Conference on Next Generation Wired/Wireless Networking, August 28, St. Petersburg, Russia, 2013 [2] M. Zolotykhin, T. Hämäläinen and A. Juvonen, Online Anomaly Detection by Using N-gram Model and Growing Hierarchical Self-Organizing Maps, IEEE IWCMC 2012, August 27-31, Limassol, Cyprus, 2012 [3] A. Juvonen, T. Sipola: Adaptive Framework for Network Traffic Classification Using Dimensionality Reduction and Clustering, In Ultra Modern Telecommunications and Control Systems and Workshops (ICUMT), th International Congress on, pages , St. Petersburg, Russia, October IEEE [4] A. Juvonen, T. Sipola: Combining Conjunctive Rule Extraction with Diffusion Maps for Network Intrusion Detection, The 18th IEEE Symposium on Computers and Communications (ISCC 13), July 7-10, 2013, Split, Croatia.

24 24 LIITE 3: BIG DATA - PARADIGM, MAJOR TOPICS AND TRENDS BIG DATA Paradigm, major topics and trends Mariia Gavriushenko, supervisor: Pekka Neittaanmäki 2/5/2014

25 25 1. INTRODUCTION Throughout 2013, big data and analytics have been among the most-hyped themes within the arena of IT. Many of the companies making the effort to use big data for improving its reputation Big data is a popular concept used to describe the exponential growth and availability of data, both structured and unstructured. And big data may be as important to business and society as the Internet has become. But still there are some challenges which have to be considered. Many organizations are concerned that the amount of amassed data is becoming so large that it is difficult to find the most valuable pieces of information. And several questions appeared: What if your data volume gets so large and varied you don't know how to deal with it? Do you store all your data? Do you analyze it all? How can you find out which data points are really important? How can you use it to your best advantage? Big data technology has to support search, development, governance and analytics services for all data types from transaction and application data to machine and sensor data to social, image and geospatial data, and more. Several sectors with particular importance for development Big Data are quite data intensive: education, health, government, and communication host one third of the data in the country. Big data in companies: No single business trend in the last 10 years has as so big potential impact on incumbent IT investments as big data. A big data promises to upend legacy technologies at many big companies. As IT modernization initiatives gain traction and the accompanying cost savings hit the bottom line, executives in both line of business and IT organizations are getting serious about the technology solutions that are tied to big data. Companies are not only replacing legacy technologies in favor of open source solutions like Apache Hadoop, they are also replacing proprietary hardware with commodity hardware, custom-written applications with packaged solutions, and decades-old business intelligence tools with data visualization. This new combination of big data platforms, projects, and tools is driving new business innovations, from faster product time-to-market to an authoritative single view of the customer to custom-packaged product bundles and beyond. Big companies with large investments in their data warehouses have neither the resources nor the will to simply replace an environment that works

26 26 well doing what it was designed to do. At the majority of big companies, a coexistence strategy that combines the best of legacy data warehouse and analytics environments with the new power of big data solutions is the best of both worlds Many companies continue to rely on incumbent data warehouses for standard BI and analytics reporting, including regional sales reports, customer dashboards, or credit risk history. In this new environment, the data warehouse can continue with its standard workload, using data from legacy operational systems and storing historical data to provision traditional business intelligence and analytics results. Data warehouse can serve as a data source into the big data environment. Likewise, Hadoop can consolidate key data output that can populate the data warehouse for subsequent analytics. [1]

27 27 2. BIG DATA PARADIGM The advent of Big Data delivers the cost-effective prospect to improve decision-making in critical development areas such as health care, employment, economic productivity, crime and security, and natural disaster and resource management. Well-known caveats of the Big Data debate, such as privacy concerns, interoperability challenges, and the almighty power of imperfect algorithms, are aggravated in developing countries by long-standing development challenges like lacking technological infrastructure and economic and human resource scarcity. The Big Data paradigm provides loads of additional data to fine-tune the models and estimates that inform all sorts of decisions. As such, the crux of the Big Data paradigm is actually not the increasingly large amount of data itself, but its analysis for intelligent decision-making (in this sense, the term Big Data Analysis would actually be more fitting than the term Big Data by itself). On the picture 1 we can see an established three-dimensional conceptual framework that models the process of digitization as an interplay between technology, social change, and guiding policy strategies. The framework comes from the ICT4D literature (Information and Communication Technology for Development)[2] and is based on a Schumpeterian notion of social evolution through technological innovation [3]. Figure 1 The three-dimensional framework applied to Big Data [4]

28 28 The most interesting dimension is tracking: Tracking words Analyzing comments, searches or online posts can produce nearly the same results for statistical inference as household surveys and polls. The tracking of words can be combined with other databases, such as done by Global Viral Forecasting, which specializes in predicting and preventing pandemics [5],or the World Wide Anti-Malarial Resistance Network that collates data to inform and respond rapidly to the malaria parasite s ability to adapt to drug treatments [6]. Tracking locations Location-based data are usually obtained from four primary sources: inperson credit or debit card payment data; in-door tracking devices, such as RFID tags on shopping carts; GPS chips in mobile devices; or cell-tower triangulation data on mobile devices. The system can successfully predict future traffic conditions, based on matching current to historical data, combining it with weather forecasts, and information from past traffic patterns, etc. Such traffic analysis does not only save time and gasoline for citizens and businesses, but is also useful for public transportation, police and fire departments, and, of course, road administrators and urban planners. Tracking nature A recent project by the United Nations University uses climate and weather data to analyze where the rain falls in order to improve food security in developing countries [7] Combing Big Data of nature and social practices, relatively cheap standard statistical software was used by several bakeries to discover that the demand for cake grows with rain and the demand for salty goods with temperature. Sensors, robotics and computational technology have also been used to track river and estuary ecosystems, which help officials to monitor water quality and supply through the movement of chemical constituents and large volumes of underwater acoustic data that tracks the behavior of fish and marine mammal species [8]. Tracking behavior Behavioral abnormalities are usually spotted by analyzing variations in the behavior of individuals in light of the collective behavior of the crowd. With Big Data, a simple analysis of variations allows to detect unwarranted variations, which originate with the underuse, overuse, or misuse of medical care [9]. These affect the means of health care, but not its ultimate end. By now, multiplayer online games are also used to track and influence behavior at the same time. Health insurance companies are currently developing multi-layer online games that aim at increasing the fitness levels of their clients. The tracking of who relates to whom quickly produces vast amounts of data on social network structures, but defines the dynamics of opinion leadership and peer pressure, which are extremely important inputs for behavioral

29 29 change [10]. 3. BIG DATA MAJOR TOPICS BigData 2014's major topics [11] include but not limited to: Big Data Architecture, Big Data Modeling, Big Data As A Service, Big Data for Vertical Industries (Government, Healthcare, etc.), Big Data Analytics, Big Data Toolkits, Big Data Open Platforms, Economic Analysis, Big Data for Enterprise Transformation, Big Data in Business Performance Management, Big Data for Business Model Innovations and Analytics, Big Data in Enterprise Management Models and Practices, Big Data in Government Management Models and Practices, and Big Data in Smart Planet Solution. While the big news for most businesses has been the development of big data technology like Hadoop, the growing prevalence of cloud computing or the decreasing costs of flash storage, the big news for enterprise technology vendors is that a huge portion of that innovation is coming from startups and open source projects vs. the tech giants of the past. Hadoop itself is an open source project that startups have now integrated with the cloud to offer big data services. Startups have also developed hybrid and other flash solutions to help cut costs and improve performance. Some examples of startups and the technology they are offering include: MobileIron: provides software solutions for BYOD and other mobile programs Arista: A cloud service that specializes in high frequency trading along with offering big data solutions Tegile: Offers hybrid arrays that leverage the speed of flash while using lower-cost disk to boost capacity. It claims that its particular product offers five times the performance with 75 percent less capacity required when compared to legacy arrays PernixData: Offers a Flash Virtualization Platform which accelerate business applications through a scale out architecture that virtualizes serverside flash FireEye: Created a virtual cyber-threat defense system that promises real-time protection The future of on-the-go big data: According to Nathaniel Mott, the future of computing will be a question of head vs wrist instead of desktop vs mobile. The coming years will probably be flooded with new mobile devices currently unknown and all of them will require a different approach of on-the-go big data. So the challenge ahead for

30 30 organizations is to accept this and adapt on time to meet the needs of the mobile future. 4. BIG DATA TRENDS A reflection on the 2013 Big Data trends [12] 1) On-the-go Big Data, meaning being able to view Big Data visualizations on mobile devices became important and in 2013 we saw the rise of a bunch of new mobile devices including smart watches and Google Glass. There are some Big Data startups, such as Roambi, who have very clear understanding of on-the-go Big Data and are capable of bringing real-time interactive visualizations to mobile devices. 2) Big Data does not require big bucks because of the plethora of Big Data open source tools that are available in the market as well as the decreasing costs of storage. The price of storage does indeed continue to decrease in costs, but the amount of data also grows exponentially. Will we be able to keep up with this or will the amount of data outgrow the available storage? The amount of open source tools is growing rapidly, but there is also a rise in licensed Big Data solutions, because open source tools do require experience Big Data personnel and many organizations do not yet have these staff available. So, to start with Big Data it does not have to cost the world, but to develop and implement a complete Big Data solution can be expensive, although the results can also be significant. 3) Big real-time data and 2013 did indeed show an important growth in real-time analytics. More and more tools become available that create a layer on top of Hadoop to be able to deal with real-time data and Hadoop 2.0 s YARN framework enables real-time data analysis. In the coming years this will become more important as many industries see the advantages of real-time analytics. 4) Big consumer data, or the quantified-self movement. really took off in Wearable technologies that can measure everyday life have started to appear massively and more and more consumers are measuring at least something of their behavior, be it their sleeping patterns or the running results. Recent research from Pew Research Centre revealed that 69% of U.S. adults keep track of at least one health indicator such as weight, diet, exercise routine or symptom. 5) Big Data related to privacy. In 2013 there were the PRISM leak by Edward Snowden, which showed that privacy in the Big Data world is indeed an endangered species and there is probably a lot more to be revealed in the coming months. Privacy is indeed affected by Big Data and as consumers, and companies, we will have to get used to this new reality.

31 31 New trends [12] 1) Internet that will affect the industrial sector dramatically. The next year, machine-to-machine data will grow significantly and continue to do so in the years after. By 2020, 40% of all data will come from sensor data and it will unlock a $ 1 trillion global market in 2020 (currently it is a $ 121 billion global market). In addition, GE reports that the Industrial Internet could add $10 to $15 trillion to global GDP in the coming years. These sensors will completely change the way companies, factories and supply chains will be operated and managed. Technology is transforming the industrial sector, creating machines that can see, feel, sense and react, so they can be operated far more efficiently. Already there are some great examples of how this will affect companies, ranging from airline companies that can reduce turn-around time with monitoring the plane during the flight, to analyzing many different variables to pick the best places to locate wind turbines around the world to be able to harvest the most energy at the lowest costs. As sensors and storage are becoming cheaper every day, algorithms are becoming better and organizations more and more see the need for smart factories, the Industrial Internet will really take off in ) It is going to be cloudy: Big-Data-as-a-Service solutions There are more and more Big Data startups that are creating a Big-Dataas-a-Service solution to help organizations apply Big Data without the heavy costs involved. Especially useful for Small or Medium sized enterprises who do want to develop a data-driven information-centric organization, but who do not have the capacity to develop and maintain a full-fledge Big Data solution on premises. Big-data-as-a-Service is a combination of Analytics-as-a-service, Infrastructure-as-a-Service and Data-as-a-Service and it will spur the adoption of Big Data also by smaller and medium sized organizations. IIA calls this adoption ready-made analytics in the cloud. These solutions offer an attractive alternative for organizations that want to start with Big Data or want to easily scale existing programs. Gartner predicts that cloud computing will become the bulk in IT spending by 2016 and 2014 will be a turning point in the acceptance of the cloud as part of a Big Data strategy. 3) Security to protect the privacy If there is thing that the NSA documents have shown, governments from around the world have almost unprecedented access to data files from organizations and consumers. Organizations will start to focus more heavily on securing their data to protect the privacy of their customers. The first signs for this are that tech companies call for aggressive NSA reforms at the White House meeting in December Executives from 15 companies expressed their concern that the NSA s wide-ranging surveillance activities had undermined the trust of their users. Of course, a reform of the NSA activities is one side of the coin; the other side will be increased security measures by the companies to protect their data. More and more organizations will start to use Big Data techniques to secure their IT infrastructure and prevent from being hacked and have data moni-

32 32 tored or stolen. Log data will form an important aspect in this and organizations will start to see the importance in monitoring and analyzing their IT infrastructure log data in order to keep their infrastructure and data safe. This will help to restore and keep the trust of their customers. 4) Personalization will become personal Consumers are creating massive amounts of data through every click, like, tweet, cell-phone call, purchase and self-tracking applications they use. Companies like Amazon have already used these kinds of data for many years to create a personal online shopping experience with recommendations, personal homepages, personal discounts or personally targeted mass- campaigns. However, in 2014 more organizations will also start to see the value in such a personalized approach, be it online or offline. A good example is the Australian shoe retailer Shoes of Prey, who have developed an analytics system that enables them to look at individual customer-spend and profitability, and allows them to begin upselling based on the fashion tastes of its clients. Personalization is making a giant leap forward in the coming years and 2014 could very well be the inflection point in the offering of personalized offers and the acceptance of it by consumers. Consumers will start to see that their data is valuable and they do want something in return for providing their data. So consumers are willing to cooperate and share their data if it brings them personalized discounts. 5) Education will be essential for success As more organizations are trying to understand Big Data and preparing their staff for the Big Data era, education becomes a crucial aspect. Already in 2011, McKinsey predicted a shortage in the coming years of Big Data scientists and Big Data managers. Organizations will therefore stimulate their employees to be more Big Data skilled. Many organizations are heading for a major skill gap and will have to take action to be ready for the big data era. Also freshgraduates or students will see the Big Data trend and in the competitive jobs market will feel the need to differentiate to stand a chance on the job market. Therefore in 2014 we will see a steep increase in the available online and offline big data courses. Apart from the online universities such as Coursera or Udacity, many more universities from around the world will start offering a big data course or program. These courses or programs will range from Big Data strategy courses to deep analytics and machine-learning programs for Big Data scientists and anything in between to cater for the massive increase in Big Data students. 6) Big data moves into mixed data In the past years Big Data was all about obtaining as much data as possible and the perception was that you require massive datasets to gain insights from those data sources. In the coming year however, organizations will start to see that the most important aspect of Big Data is not so much the volume of a dataset, but more the insights derived from combining several, smaller, datasets. Organizations that do not have Exabyte s or petabytes can still obtain

33 33 very valuable insights with smaller, but more, data sets. Of course, more data does mean more accurate insights but it does not per se mean more insights. In the coming year, more organizations will understand this and take their first steps into the direction of Big Data. They will start mixing and combining several data sets that they will analyses to derive insights. So in 2014 Big Data will become mixed data. 7) Proof of Concept The past years we saw a lot of talk around Big Data. More conferences, new books, more Big Data startups and more interesting best practices are being shared and distributed online and offline. In 2014 many more organizations will start working towards a Big Data strategy and start developing a Proof of Concept (PoC) to investigate what Big Data can mean for them. The PoC will help organizations gain a better understanding of Big Data, will help them get educated and will help them to be better able to predict the ROI of future Big Data projects. A Proof of Concept is a vital part of a Big Data strategy when you start with Big Data. Future for BigData [13] Big data in cloud really means private cloud. In 2013, most of the big data projects we've seen were put on top of bare metal infrastructure in the enterprise. We expect to see an evolution toward a virtualized infrastructure in We re seeing a lot of investment in products that make this happen, such as Serengeti for vsphere, Savanna for OpenStack and Ironfan for Amazon Web Services (AWS). These projects allow us to automate the deployment of big data platforms to a virtualized infrastructure. The era of analytic applications begins. In 2013, enterprises learned a lot about how to use the big data infrastructure that was new to the market. This coming year, those lessons will be applied toward analytic applications. In 2014, we will see some great use cases happen on that big data infrastructure. This will be the year of: What can I do with big data? rather than: What is big data? Given this refocus on analytic applications, 2014 will create an even greater demand for people with skills in data science. The Hadoop clone wars end. We feel confident that the big data industry will consolidate down to a couple of Hadoop distributions. Currently, many distributions of Hadoop exist, some proprietary and some open source. In 2014, the industry will consolidate to two of these. Those that remain will become less relevant either because they are consolidated by acquisition into one of the survivors or they exit the market. Real time in-memory analytics, complex event processing and ETL will combine. Speaking of exits, serial extract, transform, load (ETL) processes will largely go away in As the velocity of data increases, especially social data, there s more need to analyze data in real time as a stream. Currently, Hadoop is

34 34 being pressed into service for this something it s not well suited for. Inmemory analytics and complex event processing give us the capability to analyze these streams in real time and extract intelligence on the fly. That eliminates the need to perform the traditional ETL steps. MDM will provide the dimensions for big data facts. Master data management (MDM) is used to create a single definition of data from an internal standpoint. As people realize that external data sources are going to add more dimensions to their internal problems, they ll look for a single definition, or a single piece of data that will help describe that new definition or that new dimension, even though it s coming from the outside world. If you realize that external data sources help solve a problem, you'll want an external MDM focus as well. The consolidation of NoSQL will continue. NoSQL means not only SQL rather than the absence of SQL, which means it is more inclusive than exclusive. NoSQL means there are many ways to look at data other than the structured and ordered approach that SQL requires. NoSQL was created to offer a way to look at data without forcing it into a concrete schema. That has been extremely successful, and we re seeing a massive growth in NoSQL. There will be no slowdown in the adoption of NoSQL, but just as with Hadoop distributions, the industry is beginning to settle on a few major players will bring a similar consolidation of NoSQL database distributions. If to look ahead to 2014, experts made a call on social media asking for 2014 big data predictions. Here are some of the responses that have been received [14]: "The hot new data of 2013 was 'exhaust data' powered by the Internet of Things. This will take further hold during 2014, but the hot data of next year will be human data. With knowledge that employee data has been shown to do everything from help manage organizational health to be a leading indicator of quarterly consumer demand, this area will be a hot area for startups. Expect another round of TOS updates from LinkedIn." --Dan Malligner, data science practice lead, Think Big Analytics "The data-information-insight-decision lifecycle will get shortened due to machine learning-based automated decision systems; increased adoption of Cloud ETL to analyze on-premise and off-premise open data; open-source solutions such as R will replace legacy solutions like SAS; the number of offerings of reporting solutions embedded in cloud with 90-day deployment will increase." - -Milind Kelkar, leader, Smart Decisions Lab, Genpact "The future of big data in 2014 will be 'where.' Location intelligence is of paramount importance to companies as their customers are increasingly handling the majority of their daily activity on mobile, from online banking to restaurant check-ins and social sharing in real time. Having precise location data helps organizations understand relationships between specific locations so that they can identify growth opportunities, improve information sharing internally

35 35 and to their customers, and make better strategic business decisions." --James Buckley, SVP, customer data and location intelligence,pitney Bowes Software "It's a major problem for businesses that their online and offline data management systems don't talk to one another. For example, when prospective customers start their buying process online but choose to make a phone call for assistance, the online analytics vanish. Businesses will be seeking means of appending specified data points to those interactions so they have the appropriate IDs and device information to retarget that customer on any given channel. The focus will increasingly be on automating how that data is reported, packaged, and sent to other systems." --Eric Holmen, CMO, Invoca "While Hadoop is still immature, technology advances like YARN are contributing to an enterprise-friendly big data future. Because of YARN, we'll increase opportunities to use new and more optimally efficient engines and expand Hadoop possibilities." --Mike Hoskins, CTO, Actian

36 36 5. INDUSTRIAL INTERNET There are three elements which embody the essence of Industrial Internet: Intelligent machines, advanced analytics, people at work (Figure 2 [15]) Figure 2 Key elements of the Industrial Internet Connecting and combining these elements offers new opportunities across firms and economies. As system monitoring has advanced and the cost of information technology has fallen, the ability to work with larger and larger volumes of real-time data has been expanding. High frequency real-time data brings a whole new level of insight on system operations. Machine-based analytics offers yet another dimension to the analytic process. The combination of physics- based approaches, deep sector specific domain expertise, more automation of information flows, and predictive capabilities can join with the existing suite of big data tools. The result is the Industrial Internet encompasses traditional approaches with newer hybrid approaches that can leverage the power of both historic and real-time data with industry specific advanced analytics. Remote data storage, big data sets and more advanced analytic tools that can process massive amounts of information are maturing and becoming more widely available. Together these changes are creating exciting new opportunities when applied to machines, fleets and networks. Advanced Analytics: Advances in big data software tools and analytic techniques provide the means to understand the massive quantities of data that are generated by intelligent devices. Together, these forces are changing the cost

How To Learn Computer Science

How To Learn Computer Science BIG DATAN TUTKIMUS JA OPETUS JYVÄSKYLÄN YLIOPISTOSSA 4.7.2014 JYVÄSKYLÄN YLIOPISTO INFORMAATIOTEKNOLOGIAN TIEDEKUNTA 2014 2 SISÄLLYS JOHDANTO... 3 1 BIG DATA JYVÄSKYLÄN YLIOPISTON TUTKIMUKSESSA... 4 1.1

More information

Are You Ready for Big Data?

Are You Ready for Big Data? Are You Ready for Big Data? Jim Gallo National Director, Business Analytics February 11, 2013 Agenda What is Big Data? How do you leverage Big Data in your company? How do you prepare for a Big Data initiative?

More information

Data-intensive HPC: opportunities and challenges. Patrick Valduriez

Data-intensive HPC: opportunities and challenges. Patrick Valduriez Data-intensive HPC: opportunities and challenges Patrick Valduriez Big Data Landscape Multi-$billion market! Big data = Hadoop = MapReduce? No one-size-fits-all solution: SQL, NoSQL, MapReduce, No standard,

More information

Data Refinery with Big Data Aspects

Data Refinery with Big Data Aspects International Journal of Information and Computation Technology. ISSN 0974-2239 Volume 3, Number 7 (2013), pp. 655-662 International Research Publications House http://www. irphouse.com /ijict.htm Data

More information

MEng, BSc Applied Computer Science

MEng, BSc Applied Computer Science School of Computing FACULTY OF ENGINEERING MEng, BSc Applied Computer Science Year 1 COMP1212 Computer Processor Effective programming depends on understanding not only how to give a machine instructions

More information

MEng, BSc Computer Science with Artificial Intelligence

MEng, BSc Computer Science with Artificial Intelligence School of Computing FACULTY OF ENGINEERING MEng, BSc Computer Science with Artificial Intelligence Year 1 COMP1212 Computer Processor Effective programming depends on understanding not only how to give

More information

Are You Ready for Big Data?

Are You Ready for Big Data? Are You Ready for Big Data? Jim Gallo National Director, Business Analytics April 10, 2013 Agenda What is Big Data? How do you leverage Big Data in your company? How do you prepare for a Big Data initiative?

More information

How To Handle Big Data With A Data Scientist

How To Handle Big Data With A Data Scientist III Big Data Technologies Today, new technologies make it possible to realize value from Big Data. Big data technologies can replace highly customized, expensive legacy systems with a standard solution

More information

5 Keys to Unlocking the Big Data Analytics Puzzle. Anurag Tandon Director, Product Marketing March 26, 2014

5 Keys to Unlocking the Big Data Analytics Puzzle. Anurag Tandon Director, Product Marketing March 26, 2014 5 Keys to Unlocking the Big Data Analytics Puzzle Anurag Tandon Director, Product Marketing March 26, 2014 1 A Little About Us A global footprint. A proven innovator. A leader in enterprise analytics for

More information

Adaptive Framework for Network Traffic Classification using Dimensionality Reduction and Clustering

Adaptive Framework for Network Traffic Classification using Dimensionality Reduction and Clustering IV International Congress on Ultra Modern Telecommunications and Control Systems 22 Adaptive Framework for Network Traffic Classification using Dimensionality Reduction and Clustering Antti Juvonen, Tuomo

More information

BIG DATA & ANALYTICS. Transforming the business and driving revenue through big data and analytics

BIG DATA & ANALYTICS. Transforming the business and driving revenue through big data and analytics BIG DATA & ANALYTICS Transforming the business and driving revenue through big data and analytics Collection, storage and extraction of business value from data generated from a variety of sources are

More information

How To Make Data Streaming A Real Time Intelligence

How To Make Data Streaming A Real Time Intelligence REAL-TIME OPERATIONAL INTELLIGENCE Competitive advantage from unstructured, high-velocity log and machine Big Data 2 SQLstream: Our s-streaming products unlock the value of high-velocity unstructured log

More information

Chapter 6 - Enhancing Business Intelligence Using Information Systems

Chapter 6 - Enhancing Business Intelligence Using Information Systems Chapter 6 - Enhancing Business Intelligence Using Information Systems Managers need high-quality and timely information to support decision making Copyright 2014 Pearson Education, Inc. 1 Chapter 6 Learning

More information

Transforming the Telecoms Business using Big Data and Analytics

Transforming the Telecoms Business using Big Data and Analytics Transforming the Telecoms Business using Big Data and Analytics Event: ICT Forum for HR Professionals Venue: Meikles Hotel, Harare, Zimbabwe Date: 19 th 21 st August 2015 AFRALTI 1 Objectives Describe

More information

Industry 4.0 and Big Data

Industry 4.0 and Big Data Industry 4.0 and Big Data Marek Obitko, mobitko@ra.rockwell.com Senior Research Engineer 03/25/2015 PUBLIC PUBLIC - 5058-CO900H 2 Background Joint work with Czech Institute of Informatics, Robotics and

More information

The 4 Pillars of Technosoft s Big Data Practice

The 4 Pillars of Technosoft s Big Data Practice beyond possible Big Use End-user applications Big Analytics Visualisation tools Big Analytical tools Big management systems The 4 Pillars of Technosoft s Big Practice Overview Businesses have long managed

More information

Integrating a Big Data Platform into Government:

Integrating a Big Data Platform into Government: Integrating a Big Data Platform into Government: Drive Better Decisions for Policy and Program Outcomes John Haddad, Senior Director Product Marketing, Informatica Digital Government Institute s Government

More information

Big Data and Analytics in Government

Big Data and Analytics in Government Big Data and Analytics in Government Nov 29, 2012 Mark Johnson Director, Engineered Systems Program 2 Agenda What Big Data Is Government Big Data Use Cases Building a Complete Information Solution Conclusion

More information

Big Data & the Cloud: The Sum Is Greater Than the Parts

Big Data & the Cloud: The Sum Is Greater Than the Parts E-PAPER March 2014 Big Data & the Cloud: The Sum Is Greater Than the Parts Learn how to accelerate your move to the cloud and use big data to discover new hidden value for your business and your users.

More information

Review of IT Service Management Tools Currently in Use in Finland

Review of IT Service Management Tools Currently in Use in Finland Jussi Suominen & Lasse Tuomi Review of IT Service Management Tools Currently in Use in Finland ITIL, Implementation and Functionality Helsinki Metropolia University of Applied Sciences Bachelor of Engineering

More information

Master of Science in Health Information Technology Degree Curriculum

Master of Science in Health Information Technology Degree Curriculum Master of Science in Health Information Technology Degree Curriculum Core courses: 8 courses Total Credit from Core Courses = 24 Core Courses Course Name HRS Pre-Req Choose MIS 525 or CIS 564: 1 MIS 525

More information

Master s Degree Programme in International Business Management

Master s Degree Programme in International Business Management Lahti University of Applied Sciences Master s Degree Programme in International Business Management Study Guide 2014-2015 15.9.2014 0 Sisällysluettelo International Business Management... 2 DEGREE PROGRAMME

More information

CORE CLASSES: IS 6410 Information Systems Analysis and Design IS 6420 Database Theory and Design IS 6440 Networking & Servers (3)

CORE CLASSES: IS 6410 Information Systems Analysis and Design IS 6420 Database Theory and Design IS 6440 Networking & Servers (3) COURSE DESCRIPTIONS CORE CLASSES: Required IS 6410 Information Systems Analysis and Design (3) Modern organizations operate on computer-based information systems, from day-to-day operations to corporate

More information

Business Analytics In a Big Data World Ted Malone Solutions Architect Data Platform and Cloud Microsoft Federal

Business Analytics In a Big Data World Ted Malone Solutions Architect Data Platform and Cloud Microsoft Federal Business Analytics In a Big Data World Ted Malone Solutions Architect Data Platform and Cloud Microsoft Federal Information has gone from scarce to super-abundant. That brings huge new benefits. The Economist

More information

Big Data better business benefits

Big Data better business benefits Big Data better business benefits Paul Edwards, HouseMark 2 December 2014 What I ll cover.. Explain what big data is Uses for Big Data and the potential for social housing What Big Data means for HouseMark

More information

Big Data Integration: A Buyer's Guide

Big Data Integration: A Buyer's Guide SEPTEMBER 2013 Buyer s Guide to Big Data Integration Sponsored by Contents Introduction 1 Challenges of Big Data Integration: New and Old 1 What You Need for Big Data Integration 3 Preferred Technology

More information

W H I T E P A P E R. Deriving Intelligence from Large Data Using Hadoop and Applying Analytics. Abstract

W H I T E P A P E R. Deriving Intelligence from Large Data Using Hadoop and Applying Analytics. Abstract W H I T E P A P E R Deriving Intelligence from Large Data Using Hadoop and Applying Analytics Abstract This white paper is focused on discussing the challenges facing large scale data processing and the

More information

Addressing government challenges with big data analytics

Addressing government challenges with big data analytics IBM Software White Paper Government Addressing government challenges with big data analytics 2 Addressing government challenges with big data analytics Contents 2 Introduction 4 How big data analytics

More information

Role of Cloud Computing in Big Data Analytics Using MapReduce Component of Hadoop

Role of Cloud Computing in Big Data Analytics Using MapReduce Component of Hadoop Role of Cloud Computing in Big Data Analytics Using MapReduce Component of Hadoop Kanchan A. Khedikar Department of Computer Science & Engineering Walchand Institute of Technoloy, Solapur, Maharashtra,

More information

The Future of Data Management

The Future of Data Management The Future of Data Management with Hadoop and the Enterprise Data Hub Amr Awadallah (@awadallah) Cofounder and CTO Cloudera Snapshot Founded 2008, by former employees of Employees Today ~ 800 World Class

More information

The University of Jordan

The University of Jordan The University of Jordan Master in Web Intelligence Non Thesis Department of Business Information Technology King Abdullah II School for Information Technology The University of Jordan 1 STUDY PLAN MASTER'S

More information

A New Era Of Analytic

A New Era Of Analytic Penang egovernment Seminar 2014 A New Era Of Analytic Megat Anuar Idris Head, Project Delivery, Business Analytics & Big Data Agenda Overview of Big Data Case Studies on Big Data Big Data Technology Readiness

More information

Tutorial: Big Data Algorithms and Applications Under Hadoop KUNPENG ZHANG SIDDHARTHA BHATTACHARYYA

Tutorial: Big Data Algorithms and Applications Under Hadoop KUNPENG ZHANG SIDDHARTHA BHATTACHARYYA Tutorial: Big Data Algorithms and Applications Under Hadoop KUNPENG ZHANG SIDDHARTHA BHATTACHARYYA http://kzhang6.people.uic.edu/tutorial/amcis2014.html August 7, 2014 Schedule I. Introduction to big data

More information

Kimmo Bergius (kimmo.bergius@microsoft.com) Tietoturvajohtaja

Kimmo Bergius (kimmo.bergius@microsoft.com) Tietoturvajohtaja Kimmo Bergius (kimmo.bergius@microsoft.com) Tietoturvajohtaja Number of Digital IDs Trendejä Exponential Growth of IDs Identity and access management challenging Increasingly Sophisticated Malware Anti-malware

More information

Managing Big Data with Hadoop & Vertica. A look at integration between the Cloudera distribution for Hadoop and the Vertica Analytic Database

Managing Big Data with Hadoop & Vertica. A look at integration between the Cloudera distribution for Hadoop and the Vertica Analytic Database Managing Big Data with Hadoop & Vertica A look at integration between the Cloudera distribution for Hadoop and the Vertica Analytic Database Copyright Vertica Systems, Inc. October 2009 Cloudera and Vertica

More information

Improving business intelligence data quality in sales insight area. Ari Anturaniemi

Improving business intelligence data quality in sales insight area. Ari Anturaniemi Improving business intelligence data quality in sales insight area Ari Anturaniemi Thesis Master of Business Administration Business Information Technology 24.9.2012 Tiivistelmä Tekijä tai tekijät Ari

More information

Software Engineering for Big Data. CS846 Paulo Alencar David R. Cheriton School of Computer Science University of Waterloo

Software Engineering for Big Data. CS846 Paulo Alencar David R. Cheriton School of Computer Science University of Waterloo Software Engineering for Big Data CS846 Paulo Alencar David R. Cheriton School of Computer Science University of Waterloo Big Data Big data technologies describe a new generation of technologies that aim

More information

Formal Methods for Preserving Privacy for Big Data Extraction Software

Formal Methods for Preserving Privacy for Big Data Extraction Software Formal Methods for Preserving Privacy for Big Data Extraction Software M. Brian Blake and Iman Saleh Abstract University of Miami, Coral Gables, FL Given the inexpensive nature and increasing availability

More information

Is a Data Scientist the New Quant? Stuart Kozola MathWorks

Is a Data Scientist the New Quant? Stuart Kozola MathWorks Is a Data Scientist the New Quant? Stuart Kozola MathWorks 2015 The MathWorks, Inc. 1 Facts or information used usually to calculate, analyze, or plan something Information that is produced or stored by

More information

Sanjeev Kumar. contribute

Sanjeev Kumar. contribute RESEARCH ISSUES IN DATAA MINING Sanjeev Kumar I.A.S.R.I., Library Avenue, Pusa, New Delhi-110012 sanjeevk@iasri.res.in 1. Introduction The field of data mining and knowledgee discovery is emerging as a

More information

01219211 Software Development Training Camp 1 (0-3) Prerequisite : 01204214 Program development skill enhancement camp, at least 48 person-hours.

01219211 Software Development Training Camp 1 (0-3) Prerequisite : 01204214 Program development skill enhancement camp, at least 48 person-hours. (International Program) 01219141 Object-Oriented Modeling and Programming 3 (3-0) Object concepts, object-oriented design and analysis, object-oriented analysis relating to developing conceptual models

More information

Sustainable Development with Geospatial Information Leveraging the Data and Technology Revolution

Sustainable Development with Geospatial Information Leveraging the Data and Technology Revolution Sustainable Development with Geospatial Information Leveraging the Data and Technology Revolution Steven Hagan, Vice President, Server Technologies 1 Copyright 2011, Oracle and/or its affiliates. All rights

More information

Sunnie Chung. Cleveland State University

Sunnie Chung. Cleveland State University Sunnie Chung Cleveland State University Data Scientist Big Data Processing Data Mining 2 INTERSECT of Computer Scientists and Statisticians with Knowledge of Data Mining AND Big data Processing Skills:

More information

Scaling Big Data Mining Infrastructure: The Smart Protection Network Experience

Scaling Big Data Mining Infrastructure: The Smart Protection Network Experience Scaling Big Data Mining Infrastructure: The Smart Protection Network Experience 黃 振 修 (Chris Huang) SPN 主 動 式 雲 端 截 毒 技 術 架 構 師 About Me SPN 主 動 式 雲 端 截 毒 技 術 架 構 師 SPN Hadoop 基 礎 運 算 架 構 師 Hadoop in Taiwan

More information

Danny Wang, Ph.D. Vice President of Business Strategy and Risk Management Republic Bank

Danny Wang, Ph.D. Vice President of Business Strategy and Risk Management Republic Bank Danny Wang, Ph.D. Vice President of Business Strategy and Risk Management Republic Bank Agenda» Overview» What is Big Data?» Accelerates advances in computer & technologies» Revolutionizes data measurement»

More information

White Paper: SAS and Apache Hadoop For Government. Inside: Unlocking Higher Value From Business Analytics to Further the Mission

White Paper: SAS and Apache Hadoop For Government. Inside: Unlocking Higher Value From Business Analytics to Further the Mission White Paper: SAS and Apache Hadoop For Government Unlocking Higher Value From Business Analytics to Further the Mission Inside: Using SAS and Hadoop Together Design Considerations for Your SAS and Hadoop

More information

Some Research Challenges for Big Data Analytics of Intelligent Security

Some Research Challenges for Big Data Analytics of Intelligent Security Some Research Challenges for Big Data Analytics of Intelligent Security Yuh-Jong Hu hu at cs.nccu.edu.tw Emerging Network Technology (ENT) Lab. Department of Computer Science National Chengchi University,

More information

3rd International Symposium on Big Data and Cloud Computing Challenges (ISBCC-2016) March 10-11, 2016 VIT University, Chennai, India

3rd International Symposium on Big Data and Cloud Computing Challenges (ISBCC-2016) March 10-11, 2016 VIT University, Chennai, India 3rd International Symposium on Big Data and Cloud Computing Challenges (ISBCC-2016) March 10-11, 2016 VIT University, Chennai, India Call for Papers Cloud computing has emerged as a de facto computing

More information

MSCA 31000 Introduction to Statistical Concepts

MSCA 31000 Introduction to Statistical Concepts MSCA 31000 Introduction to Statistical Concepts This course provides general exposure to basic statistical concepts that are necessary for students to understand the content presented in more advanced

More information

Demystifying Big Data Government Agencies & The Big Data Phenomenon

Demystifying Big Data Government Agencies & The Big Data Phenomenon Demystifying Big Data Government Agencies & The Big Data Phenomenon Today s Discussion If you only remember four things 1 Intensifying business challenges coupled with an explosion in data have pushed

More information

PROGRAM DIRECTOR: Arthur O Connor Email Contact: URL : THE PROGRAM Careers in Data Analytics Admissions Criteria CURRICULUM Program Requirements

PROGRAM DIRECTOR: Arthur O Connor Email Contact: URL : THE PROGRAM Careers in Data Analytics Admissions Criteria CURRICULUM Program Requirements Data Analytics (MS) PROGRAM DIRECTOR: Arthur O Connor CUNY School of Professional Studies 101 West 31 st Street, 7 th Floor New York, NY 10001 Email Contact: Arthur O Connor, arthur.oconnor@cuny.edu URL:

More information

EO Data by using SAP HANA Spatial Hinnerk Gildhoff, Head of HANA Spatial, SAP Satellite Masters Conference 21 th October 2015 Public

EO Data by using SAP HANA Spatial Hinnerk Gildhoff, Head of HANA Spatial, SAP Satellite Masters Conference 21 th October 2015 Public Leveraging Geospatial Technologies EO Data by using SAP HANA Spatial Hinnerk Gildhoff, Head of HANA Spatial, SAP Satellite Masters Conference 21 th October 2015 Public Disclaimer This presentation outlines

More information

Developing Service Requirements Management as part of Product Development Process in the Case Company

Developing Service Requirements Management as part of Product Development Process in the Case Company Vesa Ahonen Developing Service Requirements Management as part of Product Development Process in the Case Company Helsinki Metropolia University of Applied Sciences Master s Degree Industrial Management

More information

Towards a Thriving Data Economy: Open Data, Big Data, and Data Ecosystems

Towards a Thriving Data Economy: Open Data, Big Data, and Data Ecosystems Towards a Thriving Data Economy: Open Data, Big Data, and Data Ecosystems Volker Markl volker.markl@tu-berlin.de dima.tu-berlin.de dfki.de/web/research/iam/ bbdc.berlin Based on my 2014 Vision Paper On

More information

locuz.com Big Data Services

locuz.com Big Data Services locuz.com Big Data Services Big Data At Locuz, we help the enterprise move from being a data-limited to a data-driven one, thereby enabling smarter, faster decisions that result in better business outcome.

More information

DISCOURSE OF INTERNATIONAL INTERNSHIPS FROM A HOLISTIC PERSPECTIVE Case: CEMS MIM network

DISCOURSE OF INTERNATIONAL INTERNSHIPS FROM A HOLISTIC PERSPECTIVE Case: CEMS MIM network DISCOURSE OF INTERNATIONAL INTERNSHIPS FROM A HOLISTIC PERSPECTIVE Case: CEMS MIM network International Business Master's thesis Emma Miettinen 2009 Department of Marketing and Management HELSINGIN KAUPPAKORKEAKOULU

More information

Chapter 7. Using Hadoop Cluster and MapReduce

Chapter 7. Using Hadoop Cluster and MapReduce Chapter 7 Using Hadoop Cluster and MapReduce Modeling and Prototyping of RMS for QoS Oriented Grid Page 152 7. Using Hadoop Cluster and MapReduce for Big Data Problems The size of the databases used in

More information

How Big Data Transforms Data Protection and Storage

How Big Data Transforms Data Protection and Storage I D C E X E C U T I V E B R I E F How Big Data Transforms Data Protection and Storage August 2012 Written by Carla Arend Sponsored by CommVault Introduction: How Big Data Transforms Storage Omøgade 8 P.O.Box

More information

Big Data Explained. An introduction to Big Data Science.

Big Data Explained. An introduction to Big Data Science. Big Data Explained An introduction to Big Data Science. 1 Presentation Agenda What is Big Data Why learn Big Data Who is it for How to start learning Big Data When to learn it Objective and Benefits of

More information

How to use Big Data in Industry 4.0 implementations. LAURI ILISON, PhD Head of Big Data and Machine Learning

How to use Big Data in Industry 4.0 implementations. LAURI ILISON, PhD Head of Big Data and Machine Learning How to use Big Data in Industry 4.0 implementations LAURI ILISON, PhD Head of Big Data and Machine Learning Big Data definition? Big Data is about structured vs unstructured data Big Data is about Volume

More information

Information Management course

Information Management course Università degli Studi di Milano Master Degree in Computer Science Information Management course Teacher: Alberto Ceselli Lecture 01 : 06/10/2015 Practical informations: Teacher: Alberto Ceselli (alberto.ceselli@unimi.it)

More information

1 st Symposium on Colossal Data and Networking (CDAN-2016) March 18-19, 2016 Medicaps Group of Institutions, Indore, India

1 st Symposium on Colossal Data and Networking (CDAN-2016) March 18-19, 2016 Medicaps Group of Institutions, Indore, India 1 st Symposium on Colossal Data and Networking (CDAN-2016) March 18-19, 2016 Medicaps Group of Institutions, Indore, India Call for Papers Colossal Data Analysis and Networking has emerged as a de facto

More information

Data Warehousing and Data Mining in Business Applications

Data Warehousing and Data Mining in Business Applications 133 Data Warehousing and Data Mining in Business Applications Eesha Goel CSE Deptt. GZS-PTU Campus, Bathinda. Abstract Information technology is now required in all aspect of our lives that helps in business

More information

Pidetään ohjelmistoratkaisut yksinkertaisina

Pidetään ohjelmistoratkaisut yksinkertaisina Pidetään ohjelmistoratkaisut yksinkertaisina Moonsoftin missio Moonsoft tarjoaa asiakkailleen kilpailuetua tietokoneohjelmistojen ja niihin liittyvien palveluiden avulla. Asiantuntemuksemme parantaa asiakkaidemme

More information

Data Mining + Business Intelligence. Integration, Design and Implementation

Data Mining + Business Intelligence. Integration, Design and Implementation Data Mining + Business Intelligence Integration, Design and Implementation ABOUT ME Vijay Kotu Data, Business, Technology, Statistics BUSINESS INTELLIGENCE - Result Making data accessible Wider distribution

More information

INTERNATIONAL JOURNAL OF PURE AND APPLIED RESEARCH IN ENGINEERING AND TECHNOLOGY

INTERNATIONAL JOURNAL OF PURE AND APPLIED RESEARCH IN ENGINEERING AND TECHNOLOGY INTERNATIONAL JOURNAL OF PURE AND APPLIED RESEARCH IN ENGINEERING AND TECHNOLOGY A PATH FOR HORIZING YOUR INNOVATIVE WORK A SURVEY ON BIG DATA ISSUES AMRINDER KAUR Assistant Professor, Department of Computer

More information

THE EXPERT SYSTEMS ANALYSIS USING THE CONCEPT OF BIG DATA AND CLOUD COMPUTING SERVICES

THE EXPERT SYSTEMS ANALYSIS USING THE CONCEPT OF BIG DATA AND CLOUD COMPUTING SERVICES THE EXPERT SYSTEMS ANALYSIS USING THE CONCEPT OF BIG DATA AND CLOUD COMPUTING SERVICES Violeta Nicoleta OPRIŞ 1 Ciprian RACUCIU 2 1 Inf. Ph.D. Student 1 Military Technical Academy, Faculty of Military

More information

You should have a working knowledge of the Microsoft Windows platform. A basic knowledge of programming is helpful but not required.

You should have a working knowledge of the Microsoft Windows platform. A basic knowledge of programming is helpful but not required. What is this course about? This course is an overview of Big Data tools and technologies. It establishes a strong working knowledge of the concepts, techniques, and products associated with Big Data. Attendees

More information

Taking Data Analytics to the Next Level

Taking Data Analytics to the Next Level Taking Data Analytics to the Next Level Implementing and Supporting Big Data Initiatives What Is Big Data and How Is It Applicable to Anti-Fraud Efforts? 2 of 20 Definition Gartner: Big data is high-volume,

More information

EPSRC Cross-SAT Big Data Workshop: Well Sorted Materials

EPSRC Cross-SAT Big Data Workshop: Well Sorted Materials EPSRC Cross-SAT Big Data Workshop: Well Sorted Materials 5th August 2015 Contents Introduction 1 Dendrogram 2 Tree Map 3 Heat Map 4 Raw Group Data 5 For an online, interactive version of the visualisations

More information

BIG DATA What it is and how to use?

BIG DATA What it is and how to use? BIG DATA What it is and how to use? Lauri Ilison, PhD Data Scientist 21.11.2014 Big Data definition? There is no clear definition for BIG DATA BIG DATA is more of a concept than precise term 1 21.11.14

More information

IBM Software Hadoop in the cloud

IBM Software Hadoop in the cloud IBM Software Hadoop in the cloud Leverage big data analytics easily and cost-effectively with IBM InfoSphere 1 2 3 4 5 Introduction Cloud and analytics: The new growth engine Enhancing Hadoop in the cloud

More information

Manifest for Big Data Pig, Hive & Jaql

Manifest for Big Data Pig, Hive & Jaql Manifest for Big Data Pig, Hive & Jaql Ajay Chotrani, Priyanka Punjabi, Prachi Ratnani, Rupali Hande Final Year Student, Dept. of Computer Engineering, V.E.S.I.T, Mumbai, India Faculty, Computer Engineering,

More information

Introduction to Data Mining

Introduction to Data Mining Introduction to Data Mining 1 Why Data Mining? Explosive Growth of Data Data collection and data availability Automated data collection tools, Internet, smartphones, Major sources of abundant data Business:

More information

Statistics for BIG data

Statistics for BIG data Statistics for BIG data Statistics for Big Data: Are Statisticians Ready? Dennis Lin Department of Statistics The Pennsylvania State University John Jordan and Dennis K.J. Lin (ICSA-Bulletine 2014) Before

More information

UNDERGRADUATE DEGREE PROGRAMME IN COMPUTER SCIENCE ENGINEERING SCHOOL OF COMPUTER SCIENCE ENGINEERING, ALBACETE

UNDERGRADUATE DEGREE PROGRAMME IN COMPUTER SCIENCE ENGINEERING SCHOOL OF COMPUTER SCIENCE ENGINEERING, ALBACETE UNDERGRADUATE DEGREE PROGRAMME IN COMPUTER SCIENCE ENGINEERING SCHOOL OF COMPUTER SCIENCE ENGINEERING, ALBACETE SCHOOL OF COMPUTER SCIENCE, CIUDAD REAL Core Subjects (CS) Compulsory Subjects (CPS) Optional

More information

Big Data Defined Introducing DataStack 3.0

Big Data Defined Introducing DataStack 3.0 Big Data Big Data Defined Introducing DataStack 3.0 Inside: Executive Summary... 1 Introduction... 2 Emergence of DataStack 3.0... 3 DataStack 1.0 to 2.0... 4 DataStack 2.0 Refined for Large Data & Analytics...

More information

MEDICAL DATA MINING. Timothy Hays, PhD. Health IT Strategy Executive Dynamics Research Corporation (DRC) December 13, 2012

MEDICAL DATA MINING. Timothy Hays, PhD. Health IT Strategy Executive Dynamics Research Corporation (DRC) December 13, 2012 MEDICAL DATA MINING Timothy Hays, PhD Health IT Strategy Executive Dynamics Research Corporation (DRC) December 13, 2012 2 Healthcare in America Is a VERY Large Domain with Enormous Opportunities for Data

More information

Industry Impact of Big Data in the Cloud: An IBM Perspective

Industry Impact of Big Data in the Cloud: An IBM Perspective Industry Impact of Big Data in the Cloud: An IBM Perspective Inhi Cho Suh IBM Software Group, Information Management Vice President, Product Management and Strategy email: inhicho@us.ibm.com twitter: @inhicho

More information

Interactive data analytics drive insights

Interactive data analytics drive insights Big data Interactive data analytics drive insights Daniel Davis/Invodo/S&P. Screen images courtesy of Landmark Software and Services By Armando Acosta and Joey Jablonski The Apache Hadoop Big data has

More information

How To Create A Data Science System

How To Create A Data Science System Enhance Collaboration and Data Sharing for Faster Decisions and Improved Mission Outcome Richard Breakiron Senior Director, Cyber Solutions Rbreakiron@vion.com Office: 571-353-6127 / Cell: 803-443-8002

More information

IBM Big Data. Hadoop-tietoisku kumppaneille Pekka Leppänen, IBM Analytics Platform Leader Finland. 2015 IBM Corporation

IBM Big Data. Hadoop-tietoisku kumppaneille Pekka Leppänen, IBM Analytics Platform Leader Finland. 2015 IBM Corporation IBM Big Data Hadoop-tietoisku kumppaneille Pekka Leppänen, IBM Analytics Platform Leader Finland 2015 IBM Corporation Agenda 8.30 Aamiainen ja ilmoittautuminen 9:10 9:45 Keskeiset toimijat ja trendit markkinoilla

More information

COMP9321 Web Application Engineering

COMP9321 Web Application Engineering COMP9321 Web Application Engineering Semester 2, 2015 Dr. Amin Beheshti Service Oriented Computing Group, CSE, UNSW Australia Week 11 (Part II) http://webapps.cse.unsw.edu.au/webcms2/course/index.php?cid=2411

More information

Big data and its transformational effects

Big data and its transformational effects Big data and its transformational effects Professor Fai Cheng Head of Research & Technology September 2015 Working together for a safer world Topics Lloyd s Register Big Data Data driven world Data driven

More information

Deploying Big Data to the Cloud: Roadmap for Success

Deploying Big Data to the Cloud: Roadmap for Success Deploying Big Data to the Cloud: Roadmap for Success James Kobielus Chair, CSCC Big Data in the Cloud Working Group IBM Big Data Evangelist. IBM Data Magazine, Editor-in- Chief. IBM Senior Program Director,

More information

WHITE PAPER Big Data Analytics. How Big Data Fights Back Against APTs and Malware

WHITE PAPER Big Data Analytics. How Big Data Fights Back Against APTs and Malware WHITE PAPER Big Data Analytics How Big Data Fights Back Against APTs and Malware Table of Contents Introduction 3 The Importance of Machine Learning to Big Data 4 Addressing the Long-Tail Nature of Internet

More information

Exploiting Data at Rest and Data in Motion with a Big Data Platform

Exploiting Data at Rest and Data in Motion with a Big Data Platform Exploiting Data at Rest and Data in Motion with a Big Data Platform Sarah Brader, sarah_brader@uk.ibm.com What is Big Data? Where does it come from? 12+ TBs of tweet data every day 30 billion RFID tags

More information

Concept and Project Objectives

Concept and Project Objectives 3.1 Publishable summary Concept and Project Objectives Proactive and dynamic QoS management, network intrusion detection and early detection of network congestion problems among other applications in the

More information

Santtu Olavi Säisä RISK ANALYSIS AT SHARED SERVICE CENTRES A CASE STUDY

Santtu Olavi Säisä RISK ANALYSIS AT SHARED SERVICE CENTRES A CASE STUDY Santtu Olavi Säisä RISK ANALYSIS AT SHARED SERVICE CENTRES A CASE STUDY Ylempi AMK Tutkinto Liiketalous ja matkailu 2011 1 VAASAN AMMATTIKORKEAKOULU UNIVERSITY OF APPLIED SCIENCES Liiketalous ja matkailu

More information

Big Data a threat or a chance?

Big Data a threat or a chance? Big Data a threat or a chance? Helwig Hauser University of Bergen, Dept. of Informatics Big Data What is Big Data? well, lots of data, right? we come back to this in a moment. certainly, a buzz-word but

More information

Volume 3, Issue 6, June 2015 International Journal of Advance Research in Computer Science and Management Studies

Volume 3, Issue 6, June 2015 International Journal of Advance Research in Computer Science and Management Studies Volume 3, Issue 6, June 2015 International Journal of Advance Research in Computer Science and Management Studies Research Article / Survey Paper / Case Study Available online at: www.ijarcsms.com Image

More information

BIG DATA TECHNOLOGY. Hadoop Ecosystem

BIG DATA TECHNOLOGY. Hadoop Ecosystem BIG DATA TECHNOLOGY Hadoop Ecosystem Agenda Background What is Big Data Solution Objective Introduction to Hadoop Hadoop Ecosystem Hybrid EDW Model Predictive Analysis using Hadoop Conclusion What is Big

More information

An Overview of Knowledge Discovery Database and Data mining Techniques

An Overview of Knowledge Discovery Database and Data mining Techniques An Overview of Knowledge Discovery Database and Data mining Techniques Priyadharsini.C 1, Dr. Antony Selvadoss Thanamani 2 M.Phil, Department of Computer Science, NGM College, Pollachi, Coimbatore, Tamilnadu,

More information

Datenverwaltung im Wandel - Building an Enterprise Data Hub with

Datenverwaltung im Wandel - Building an Enterprise Data Hub with Datenverwaltung im Wandel - Building an Enterprise Data Hub with Cloudera Bernard Doering Regional Director, Central EMEA, Cloudera Cloudera Your Hadoop Experts Founded 2008, by former employees of Employees

More information

Information Security Plan for SAP HCM. Kirsi Anturaniemi

Information Security Plan for SAP HCM. Kirsi Anturaniemi Information Security Plan for SAP HCM Kirsi Anturaniemi Thesis Degree Programme in Information Technology 2013 Abstract 8.11.2013 Degree Programme in Information Technology Author Kirsi Anturaniemi The

More information

DATA MANAGEMENT FOR THE INTERNET OF THINGS

DATA MANAGEMENT FOR THE INTERNET OF THINGS DATA MANAGEMENT FOR THE INTERNET OF THINGS February, 2015 Peter Krensky, Research Analyst, Analytics & Business Intelligence Report Highlights p2 p4 p6 p7 Data challenges Managing data at the edge Time

More information

Session 4 Cloud computing for future ICT Knowledge platforms

Session 4 Cloud computing for future ICT Knowledge platforms ITU Workshop on "Future Trust and Knowledge Infrastructure", Phase 1 Geneva, Switzerland, 24 April 2015 Session 4 Cloud computing for future ICT Knowledge platforms Olivier Le Grand, Senior Standardization

More information

Detecting Anomalous Behavior with the Business Data Lake. Reference Architecture and Enterprise Approaches.

Detecting Anomalous Behavior with the Business Data Lake. Reference Architecture and Enterprise Approaches. Detecting Anomalous Behavior with the Business Data Lake Reference Architecture and Enterprise Approaches. 2 Detecting Anomalous Behavior with the Business Data Lake Pivotal the way we see it Reference

More information

Big Data and Analytics: Challenges and Opportunities

Big Data and Analytics: Challenges and Opportunities Big Data and Analytics: Challenges and Opportunities Dr. Amin Beheshti Lecturer and Senior Research Associate University of New South Wales, Australia (Service Oriented Computing Group, CSE) Talk: Sharif

More information

Big data platform for IoT Cloud Analytics. Chen Admati, Advanced Analytics, Intel

Big data platform for IoT Cloud Analytics. Chen Admati, Advanced Analytics, Intel Big data platform for IoT Cloud Analytics Chen Admati, Advanced Analytics, Intel Agenda IoT @ Intel End-to-End offering Analytics vision Big data platform for IoT Cloud Analytics Platform Capabilities

More information