ACTIVITY REPORT 2010-2011-2012




PREFACE

In the last three years the Institute for Scientific Interchange Foundation went through several major transformations. The first and most visible one is the change of its physical space. At the end of 2011, the institute moved from its historical location at Villa Gualino to a new location downtown in via Alassio. The new building has been shaped to address the needs of an Institute increasingly looking for openness, collaborations, and a modern environment able to foster an ever-increasing dialogue with the many collaborating institutions in the city of Turin. Likewise, the research vision and activities of ISI did not stand still. ISI research is firmly rooted in complex systems science, a field that the Foundation has helped to shape for more than two decades. In the last three years, however, the field of complexity science has entered a new stage of its life. This is a stage of maturity in which complex systems science is finally generating applications and quantitative results that allow us to deal with problems that have a huge impact on our lives, such as epidemics, systemic risks, and the emergence of collective social behavior. A key factor in this shift of gears is the big data revolution, which is providing the necessary data, numerical experiments and validation, and thereby adding an "applied" dimension to complex systems. ISI has faced this new challenge by focusing on Data Science, with the creation of specific laboratories and initiatives. At ISI, however, Data Science goes well beyond the technical issues of gathering data from "sensors" or the programming issues of data crawlers. It also goes beyond classical statistical analysis. The focus is on identifying new empirical laws emerging from massive data sets and addressing the related "How?" question, i.e. on conceptually new scientific methods for analyzing and synthesizing these laws.
Data Science aims to recognize the picture hidden in these massive data streams, to predict its occurrence in a statistical sense, and to control it. But ISI also wants to go further, to the "Why?" question, by linking these findings to theoretical concepts in a broader sense, to understand their origin and their impact. We do not forget, however, the importance of the applied side of the Institute's research. We live in an interconnected world where novel ICT technologies are defining socio-technical networks and systems whose understanding, management and resilience cannot be achieved without resorting to a complex systems approach. We are aware that the science carried out at ISI finds a natural outlet in technologies and tools that can affect decision making and social systems analysis in areas ranging from urban development and human mobility to global health and crisis management. These tools and innovations must be communicated to and shared with policy makers and the various stakeholders outside the research community. The research laboratory itself must be hardwired into the life of society and of the citizen. This awareness has led to another transformation of the Institute, which is restructuring itself to develop and articulate an effective knowledge exchange vision that leverages a multilevel network of collaborations and partnerships, both nationally and internationally.

Although the last years are living proof that the Institute is undergoing a continuous and lively transformation, always looking for new challenges and visions, the ISI Foundation has constantly kept faith with the core values of the Institute. The Institute strives to provide science at the highest possible level without constraint, to address the significant challenges of modern society, and to share the knowledge it generates. All of this is driven by a culture of freedom that is essential to realizing the benefits of curiosity-inspired science. These core values are those that shaped the institute three decades ago, and thanks to the president, Mario Rasetti, they are still vividly impressed on the activities of the institute. The changes and transformations of the last years have been possible only because of the commitment of all the staff, researchers, and administrators of the foundation. The executive director, Tiziana Bertoletti, is navigating the foundation through good and bad weather and bravely setting the course for new destinations. Ciro Cattuto, the Institute research director, has brought invaluable new blood and vision to the research activity of the foundation. Anna Piergiovanni, Roberto Palermo and Enza Palazzo have staffed the administrative department outstandingly. Federico Fornaro carefully manages the Lagrange Project and the many related activities. The research leaders, Jacob Biamonte, Vittoria Colizza, Gianfranco Durin, Corrado Gioannini, Paolo Giorda, Vittorio Loreto, Daniela Paolotti, Francesco Vaccarino, and Stefano Zapperi, are relentlessly leading the research effort at ISI. Along with them, more than 50 researchers and staff are making ISI a place for science and creative thinking. There is no way to thank all of them properly other than to say that ISI exists only because of their talent.
Finally, we have to thank and acknowledge the friendship and support of the many colleagues, visitors, and collaborators, more than 50 each year, who have contributed to creating the exceptional atmosphere of ISI. ISI is having its 30th birthday this year. The institute has come a long way, but surely the best is yet to come.

Prof. Alessandro Vespignani
ISI Scientific Director

TABLE OF CONTENTS

RESEARCH TOPICS
  Global and Complex Systems Science
    Complex Networks Lagrange CRT Lab
    Complexity Material Lab
    Computational Epidemiology Lab
    Data Science Lab
    Information Dynamics Lab
    Mathematics of Complexity Science Lab
    Team
    Highlights
    Talks
    Publications
  Quantum Science Lab
    Team
    Highlights
    Talks
    Publications

GUEST TALKS (2010-2011-2012)

EVENTS
  Workshop on "Tensor Network States and Algebraic Geometry", November 6th-8th, 2012
  Giornata di alta formazione sui Sistemi Complessi, Battling infectious diseases in a complex world, October 29th, 2012
  First Review Meeting of the EveryAware Project, October 25th, 2012
  Third COQUIT Conference, Collective Quantum Operations: Mean field, Control, Estimation, September 11th-14th, 2012
  ECCS '12 Satellite Meeting, Data-Driven Modeling of Contagion Processes, September 5th, 2012
  Giornata di Alta Formazione sui Sistemi Complessi, Techno-social networks and the diffusion of collective social phenomena, July 20th, 2012
  EveryAware third meeting, Enhance environmental awareness through social information technologies, July 9th-10th, 2012
  Giornata di Alta Formazione sui Sistemi Complessi, Chaos & Complexity, June 22nd, 2012
  Giornata di Alta Formazione sui Sistemi Complessi, La semplicità della complessità: un'introduzione alla scienza dei sistemi complessi, May 4th, 2012
  COQUIT Workshop, Errors and limited resources, February 12th-14th, 2012
  EE² - Epiwork/Epifor 2nd International Workshop, Facing the Challenge of Infectious Diseases, January 18th-20th, 2012
  VII TOP-IX Annual Conference, December 6th, 2011
  Assyst Workshop, Mathematics in Network Science: Implications to Socially Coupled Systems, November 21st-23rd, 2011
  International Meeting on Visualization in Complex Environments, November 17th-18th, 2011
  EveryAware Second Meeting, Enhance environmental awareness through social information technologies, September 19th-20th, 2011
  Satellite Workshop of European Conference on Complex Systems, Dynamics on and of Complex Networks V, September 12th-16th, 2011
  Second Review Meeting of the COQUIT Project, July 1st, 2011
  Lagrange Prize - CRT Foundation Awarding Ceremony, June 30th, 2011
  Incontro Nazionale FuturICT Italia, June 13th, 2011
  Lagrange Day, April 18th, 2011
  ICTeCollective Project Meeting, March 17th, 2011
  EveryAware Kick Off Meeting, March 14th-15th, 2011
  Epiwork Science Board Meeting, December 6th-7th, 2010
  First COQUIT Workshop, November 18th-20th, 2010
  Satellite Workshop of European Conference on Complex Systems, Dynamics on and of Complex Networks IV, September 16th, 2010
  Workshop on Quantum Mechanics in Biological Systems, July 8th-9th, 2010
  COQUIT Review Meeting, July 1st-2nd, 2010

FUNDED PROJECTS
  2009-2012 ASSYST: Action for the Science of Complex Systems and Socially Intelligent ICT (European Commission)
  2006-2012 BOVINE LIVESTOCK MOBILITY (Istituto Zooprofilattico Sperimentale dell'Abruzzo e del Molise)
  2009-2012 COQUIT: Collective quantum operations for information technologies (European Commission)
  2009-2012 DYNANETS: Computing Real-World Phenomena with Dynamically Changing Complex Networks (European Commission)
  2008-2013 EPIFOR: Complexity and predictability of epidemics: toward a computational infrastructure for epidemic forecast (European Commission)
  2009-2013 EPIWORK: Developing the framework for an epidemic forecast infrastructure (European Commission)
  2011-2014 EVERYAWARE: Enhancing Environmental Awareness through Social Information Technologies (European Commission)
  2009-present GLEAMVIZ: The Global Epidemic and Mobility Model (National Institute of Health, Defense Threat Reduction Agency, Lilly Endowment Inc., Indiana University, ISI Foundation)
  2010-2013 GSDP: Global Systems Dynamics and Policy (European Commission)
  2009-2012 ICTeCOLLECTIVE: Harnessing ICT-enabled Collective Social Behaviour (European Commission)
  2012-2015 IMPROVING STRATEGIES FOR PREVENTING PERTUSSIS IN INFANTS (Ministero della Salute)
  2012-2016 MULTIPLEX: Foundational Research on MULTIlevel complex networks and systems (European Commission)
  2010-present NNOSIP: Neuronal Network Oscillations and Sensory Information Processing (Compagnia di San Paolo)
  2011-2016 PREDEMICS: Providing Preparedness, Prediction and Prevention of Emerging Zoonotic Viruses with Pandemic Potential (European Commission)
  2010-2013 Q-ARACNE: Quantum Complex Networks (Compagnia di San Paolo)
  2012-2017 SIZEFFECTS: Size Effects in Fracture and Plasticity (European Commission)
  2008-present SOCIOPATTERNS (CNRS - Centre National de la Recherche Scientifique, Marseille, France)
  2011-2014 STUDIOLAB: A new european platform for creative interactions between art and science (European Commission)
  2011-2014 TOPDRIM: Driven Methods for Complex Systems (European Commission)
  2003-present LAGRANGE PROJECT: Lagrange Project CRT Foundation (CRT Foundation)

EDUCATION
  Level II University Master Degree in EPIDEMIOLOGY

GLOBAL AND COMPLEX SYSTEMS SCIENCE

The Foundation is advancing its research efforts in consolidated areas of strength that draw on the experience and knowledge accumulated in the area of complex systems. By integrating complex systems science with data science and computational thinking, however, a novel scientific area is emerging that aims to provide an integrated framework for the understanding, analysis and control of social, technological, and economic systems. It is then possible to see complexity science at work on the solution of major societal problems such as the containment of emerging diseases, the design of better energy distribution systems, the planning of traffic-free cities or the optimization of internet connectivity. The Foundation has developed a research program that aims to build the mathematical, modeling and computational foundations of the analysis of socio-technical systems, as well as to provide a portfolio of case studies to assess the feasibility and impact of this framework in real-world problems and applications. For this reason, in the last two years the laboratory has branched out into new areas, especially strengthening the data science component. The research activity is thus articulated around the following laboratories:

Complex Networks Lagrange CRT Lab
Complexity Material Lab
Computational Epidemiology Lab
Data Science Lab
Information Dynamics Lab
Mathematics of Complexity Science Lab

In the following pages we report the activities and research focus of each laboratory. These research laboratories, however, are not intended as disciplinary silos but as thematic interest groups that are all interconnected within the ISI Foundation research structure. Each specific project and initiative draws resources and expertise from across the full spectrum of the laboratories' human resources.
As diverse as the research activities carried out by the Laboratory may seem, the methodological approach is the nexus where different fields and problems find their unifying framework. Techniques borrowed from statistical physics, non-linear dynamics and computational modeling enable the interdisciplinary approach and cross-fertilization that, more than anything else, contribute to the uniqueness and richness of the Foundation's research activities.

COMPLEX NETWORKS AND SYSTEMS LAGRANGE LABORATORY

The research activity of the laboratory aims at the study of systems where a large number of interacting units give rise to cooperative phenomena, non-trivial patterns and complex dynamical behavior that cannot be simply inferred from the basic microscopic interactions. The laboratory is active in scientific areas ranging from social computational sciences to biological systems to interdisciplinary applications in information technology, economics and policy making. The division uses large-scale computational approaches, agent-based models, complex networks, non-linear systems analysis and statistical physics methods to link the microscopic dynamics and interactions of the constituent elements to the statistical regularities and the macroscopic properties of the system under study.

COMPLEXITY MATERIAL LABORATORY

The Complexity in Materials group conducts both theoretical and experimental research into complex phenomena in material science. Particular areas of interest include: hysteresis and noise in ferromagnetic material, fluctuations in fracture and plasticity, collective transport in nanostructured materials and deformation in soft condensed matter physics. We are particularly active in the statistical analysis of the Barkhausen noise in ferromagnetic thin films and strips, and in the study of computational models for magnetic domain walls in disordered media. We perform large-scale numerical simulations of lattice models for the fracture and plasticity of heterogeneous materials. Another subject of current investigation is the topological properties of deformed colloidal and vortex crystals, which we study by molecular dynamics simulations.
COMPUTATIONAL EPIDEMIOLOGY LABORATORY

The research activity of the laboratory focuses on i) the development of the mathematical and computational methods needed to achieve prediction and predictability of disease spreading in complex techno-social systems; ii) the development of large-scale, data-driven computational models endowed with a high level of realism and aimed at epidemic scenario forecast; iii) the design and implementation of original data-collection schemes motivated by identified modelling needs, such as the collection of real-time disease incidence through innovative web and ICT applications; iv) the setup of a computational platform for epidemic research and data sharing that will generate important synergies between research communities and countries. The laboratory proposes a truly interdisciplinary effort combining complex systems science, computational sciences, mathematical epidemiology, and ICT technologies.

DATA SCIENCE LABORATORY

The activity of the Data Science Laboratory comprises those research lines that regard digital traces of human behavior as first-order objects for scientific investigation. The laboratory has a strong interdisciplinary character, covering research on social media, on-line social networks, pervasive systems, wireless sensor networks, and applications to epidemiology. Methodologically, the laboratory extends the traditional toolbox of complex systems research with techniques from data mining and machine learning, and with the use of scalable computational infrastructures that can deal with large-scale records of activity from modern techno-social systems. At present, the activity of the Data Science Laboratory focuses on two main research areas. The first research area deals with measuring dynamical networks of human mobility and proximity in a variety of real-world settings. The SocioPatterns project is the overarching effort that encompasses these activities: it is an international collaboration, led by the ISI Foundation, that brings together physicists, computer scientists, electrical engineers and designers under the single objective of designing and deploying scalable sensor networks that can be used to mine human contacts in hospitals, schools, conferences, and more. The SocioPatterns project leverages these data sources to inform research activities on human dynamics, opportunistic networks, organizational science, and data-driven epidemiology. Applications to ubiquitous social environments and art/science explorations are also part of the project. The second research area, at the interface with computer science, deals with on-line systems that entangle human behavior and information networks, such as Twitter and other on-line social networking systems. The research is carried out in collaboration with the Informatics School of Indiana University and the Computer Science Department of the University of Torino.

MATHEMATICS OF COMPLEXITY SCIENCE LAB

It has become increasingly evident that the mechanisms governing most social and technological systems are characterized by very detailed, intricate interactions with interdependencies among systems. Because of their inherent complexity, which requires analysis at many scales of space and time, complex systems confront science with novel challenges in observing, describing and controlling them effectively. Our group aims to bring into the arena of complex systems science some of the more advanced twentieth-century mathematical theories and techniques. Indeed, one of the revolutions in twentieth-century mathematics has been the shift in perspective from focusing on a single object to landscapes of objects and morphisms, i.e. agents and interactions. The goal is twofold: to develop the necessary abstract frameworks and to test the validity of the proposed approach on real data.
The abstract framework is based on the interplay between algebraic geometry, algebraic topology, representation theory and statistics, to lay the foundations for a fresh approach to complex systems theory. In particular, our group focuses on the development and refinement of topological methods to reconstruct the robust patterns that underlie the structure of complex systems. Within this framework, we are currently focusing on tailoring the tools of Persistent Homology, the most recent and efficient ones in the computational topology of data clouds, to the case of complex networks. By their nature, these methods allow the study of mesoscopic structures in networks, enhancing and integrating the existing tools, which are mostly based on statistical mechanics.

INFORMATION DYNAMICS LAB

The activity of the Information Dynamics Laboratory spans several research lines for which information, whether symbolic or embodied, plays a crucial role. In particular, the laboratory conducts both theoretical and experimental research on areas related to social dynamics in the so-called techno-social systems, which entangle, in a somewhat unpredictable way, cognitive, behavioral and social aspects of human agents with the structure of the underlying technological systems. The research of the laboratory has a highly interdisciplinary character, with extensive collaborations with groups in disciplines as diverse as computer science, engineering, sociology, economics, psychology and linguistics. In more detail, the present activities of the laboratory can be presented along the following lines.

Opinion dynamics

The spread and evolution of opinions is a central topic in sociology, politics and economics. One is interested in understanding whether and how, starting from each agent having a different opinion, a consensus emerges in which all agents share the same opinion, as opposed to a polarized or fragmented state. We are interested in particular in investigating the response of a society to exogenous or endogenous perturbations; under which conditions large-scale opinion shifts are triggered; the role of the available information in decision-making processes; how the possibility of disagreement makes consensus, polarization or fragmentation emerge; how individual and collective awareness can be enhanced and how it can trigger behavioral shifts; how innovation emerges; and how individual behavior is affected by external pressure, in the form of stress or temporal constraints.

Language dynamics

The study of the self-organization and evolution of language and meaning has led to the idea that a community of language users can be seen as a complex dynamical system which collectively solves the problem of developing a shared communication practice. In this perspective, the theoretical tools developed in statistical physics and complex systems science acquire a central role in the study of the self-generating structures of languages. Language is nowadays a hot topic in linguistics, sociology, cognitive sciences, biology, physics and mathematics. The most natural questions concern how new conventions (names, categories, grammatical rules), developed from local interactions among a few individuals, can become stable in a whole population. This occurs when conceptual frameworks are acquired and fine-tuned through shared sensori-motor experiences and joint actions, and when they are constantly aligned in dialogue.
This suggests a radically different way to look at language and, more generally, at emerging communication systems: a perspective where coherence arises out of this distributed activity in a self-organised way instead of being innate or imposed in a top-down fashion.

Social computation

Recent advances in Information and Communications Technologies (ICT) make it possible, for the first time, to map precisely the interactions of large numbers of people in a reproducible way. In particular, the dynamics and transmission of information along social ties can now be the object of quantitative investigation. From this point of view the Web is acquiring the status of a platform for "social computation", able to coordinate and exploit the cognitive abilities of the users for a given task. In this area we have constructed a brand new platform, Experimental Tribe (www.xtribe.eu), suitable for the realization of web-based experiments in social dynamics that bring individuals directly into the experimental loop. The benefit is twofold. On the one hand, it allows virtually any researcher to realize their own experiment with minimal effort, paving the way for the use of the web as a standard laboratory in which to perform experiments. On the other hand, it can be a strong basin of attraction for people willing to participate in experiments, making recruitment much easier than for single-experiment platforms.

ICT driven social dynamics

Nowadays low-cost sensing technologies allow citizens to assess the state of the environment directly, while social networking tools allow effective data and opinion collection and real-time information spreading. In addition, theoretical and modeling tools developed by physicists, computer scientists and sociologists have reached the maturity to analyse, interpret and visualize complex data sets.
In this area the laboratory is coordinating the EveryAware project, which integrates all crucial phases (environmental monitoring, awareness enhancement, behavioural change) of the management of the environment in a unified framework by creating a new technological platform combining sensing technologies, networking applications and data-processing tools; the Internet and the existing mobile communication networks provide the infrastructure hosting such a platform, allowing its replication in different times and places. The integration of participatory sensing with the monitoring of subjective opinions is novel and crucial, as it exposes the mechanisms by which the local perception of an environmental issue, corroborated by quantitative data, evolves into socially shared opinions, eventually driving behavioural changes.

Phylogeny and evolution

While the traditional aim of phylogeny reconstruction is to classify a set of species (viruses, bacteria, languages) sharing a common origin, a more recent trend is to uncover the evolutionary relatedness among those species through the visualization of their phylogenetic tree. The analysis of statistical features of phylogenetic trees, e.g. their topological properties, reveals deep information about the evolutionary process that gave rise to the differentiation of the species. In this scenario, the first challenge is to develop suitable algorithms for the analysis of large data sets, in order to perform robust statistical analyses. In addition, a quantitative analysis of the phylogenetic characteristics of populations of pathogens (viruses, bacteria) offers a valuable validation tool for discriminating between the theoretical predictions of different evolutionary models.
In this area we are currently working on (i) modeling the evolution of the Human Influenza A virus at the sequence level, to investigate the longstanding puzzle of the relation between the genetic profile of the virus and its interaction with the host immune system (its antigenic properties), an understanding that is crucial for the control of Influenza outbreaks; and (ii) the evolutionary dynamics of Neisseria meningitidis, a deadly human pathogen that features a high level of homologous recombination.

TEAM GLOBAL AND COMPLEX SYSTEMS SCIENCE

Research Leader
Vittoria Colizza 2006 - present
Santo Fortunato 2006-2012
Vittorio Loreto 2007 - present
Francesco Vaccarino 2009 - present
Stefano Zapperi 2007 - present

Research Scientist
Duygu Balcan 2011-2012
Alain Barrat 2006 - present
David Brée 2006-2010
Corrado Gioannini 2009 - present
Yamir Moreno 2011 - present
Adil Mughal 2008-2010
Andrea Pagnani 2004-2010
Daniela Paolotti 2007 - present
Filippo Radicchi 2007-2010
José-Javier Ramasco 2006-2010
Luca Rossi 2011-2011
Francesca Tria 2006 - present
Wouter Van den Broeck 2008 - present
Martin Weigt 2004-2010

Associated Research Scientist
Gianfranco Durin 2008 - present
Marco Lamieri 2005-2010
Duccio Medini 2012 - present

Junior Researcher
Andrea Apolloni 2011-2012
Paolo Bajardi 2008-2012
Zsolt Bertalan 2012 - present
Selene Bianco 2011-2011
Zoe Budrikis 2012 - present
Luca Cappa 2011-2012
Arnab Chatterjee 2011-2012
Laetitia Gauvin 2011 - present
Stefano Ingarra 2011 - present
Lorenzo Isella 2009-2011
Lasse Laurson 2009-2011
Animesh Mukherjee 2009-2011
André Panisson 2009 - present
Giovanni Petri 2012 - present
Chiara Poletto 2009 - present
Marco Quaggiotto 2009 - present
Michele Roncaglione 2012 - present
Fabio Saracino 2011 - present
Alina Sirbu 2011 - present
Michele Tizzoni 2009 - present
Eom Young-Ho 2009-2012

PhD Student
Gino Almondo 2012-2012
Arianna Bertolino 2009-2011
Simona Cantono 2009-2010
Emanuele Cozzo 2011-2011
Irene Donato 2011 - present
Baptiste Durrande 2010-2010
Samir Hamichi 2006-2010
Andrea Lancichinetti 2008-2012
Jeanette Lehmann 2009-2010
Sandro Meloni 2011-2011
Simone Pompei 2010-2012
Joaquin Sanz 2011-2011
Martina Scolamiero 2011 - present

IT
Claudio Cicali 2011 - present
Federico Di Gregorio 2011-2011
Pierluigi Di Nunzio 2011-2011
Simona Moscardi 2012 - present
Marco Perosa 2010-2012

Coworker
Filippo Menczer 2007 - present
Pietro Terna 2008 - present

Highlights

Data Science Laboratory

The Data Science Laboratory of the ISI Foundation was created during the reporting period. It has subsumed and expanded the activity of two research lines of the ISI Foundation focusing, respectively, (i) on the data-driven investigation of time-varying human interaction networks, and (ii) on the dynamics of collective attention in sociotechnical systems. The Laboratory significantly expanded its operational capabilities for mining large-scale datasets and made public contributions to popular open-source technology components. The Laboratory also contributed to (iii) the first European educational experience on Big Data, helping to shape a Data Science curriculum that combines mathematical modeling, data mining, and interactive visualization. In the following we highlight our work on empirical networks of human proximity for epidemiology and public health.

Time-varying networks of human interaction. The SocioPatterns project (www.sociopatterns.org) has gained strong momentum, and its data-driven methodology has been increasingly taken up by diverse research communities. The project has successfully measured time-varying contact networks in hospitals and schools, using wearable proximity sensors, and has released several datasets for public use. The data were used to investigate the temporal structure of human interactions in space and to study the dynamics of epidemic processes in time-varying networks. A key problem is the approximation-generalization tradeoff brought forth by the use of epidemic models simulated over empirical high-resolution networks of human interactions (figure). The central questions can be phrased as follows. How much detail do we need in order to achieve a data-driven simulation of an epidemic process that can inform decision making for public health, e.g., by enabling the design of smart immunization strategies or of optimized schedules for class closure in a pandemic situation?
As digital records of human behaviors become available with higher and higher resolutions, what parsimonious representations can we devise that strike a balance between capturing the relevant features of the data and maintaining the generalizability of the ensuing results to contexts that go beyond the one in which the data were collected? When is more detail helpful, and when does it become too much detail? How do we decide what level of detail is appropriate for predicting a given property of the system under study? In the domain of epidemic processes on human contact networks, we have made one step in this direction by showing that some important features of an epidemic process, e.g., the timing of the epidemic peak, can be correctly modeled under strong simplifying assumptions, whereas other quantities, such as the epidemic size, do require the use of high-resolution information. We have also shown that it is possible to devise parsimonious representations of the data that capture the relevant topological and weight heterogeneities and drop most of the other high-resolution information, while achieving the same performance as more complex models. Along the same direction, for the case of structured populations (e.g., hospitals), we have extended the customary contact matrix representation and introduced a new representation which is minimally more complex and, in simulation, accurately models epidemic spread in the structured hospital population.
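
As a purely illustrative sketch of the trade-off discussed above, the following Python fragment collapses a time-resolved contact list into two coarser views: a weighted static network and a customary contact matrix between role classes. The toy events, the role labels and the function names are assumptions made for illustration; this is not the SocioPatterns data format, and it shows the customary contact matrix only, not the extended representation mentioned in the text.

```python
from collections import defaultdict

# Hypothetical toy data: (time_step, person_a, person_b) contact events,
# as might be recorded by wearable proximity sensors at fixed intervals.
EVENTS = [
    (0, "n1", "p1"),
    (0, "n1", "n2"),
    (1, "n1", "p1"),
    (1, "d1", "p1"),
    (2, "n2", "p2"),
]

# Hypothetical role of each individual in a structured (hospital-like) population.
ROLE = {"n1": "nurse", "n2": "nurse", "d1": "doctor", "p1": "patient", "p2": "patient"}


def aggregate_weights(events):
    """Collapse the temporal network into a weighted static network.

    The weight of an undirected pair is the number of time steps
    in which the two individuals were in contact.
    """
    weights = defaultdict(int)
    for _, a, b in events:
        weights[tuple(sorted((a, b)))] += 1
    return dict(weights)


def contact_matrix(events, role):
    """Coarser view: cumulative contact time between role classes."""
    matrix = defaultdict(int)
    for _, a, b in events:
        matrix[tuple(sorted((role[a], role[b])))] += 1
    return dict(matrix)


if __name__ == "__main__":
    print(aggregate_weights(EVENTS))
    print(contact_matrix(EVENTS, ROLE))
```

Each step down this ladder discards information: the weighted network forgets when contacts happened, and the contact matrix additionally forgets who, within a class, was in contact with whom.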

Sensor networks, quantified human behavior, large-scale data analytics, digital models of epidemic spreading: all of these directions contribute to a new vision of digital epidemiology that changes the role played by data in public health. The Data Science Laboratory fully embraces this vision and will further pursue it by advancing both the underpinning theoretical foundations and the operational data technologies.

Selected publications:

Digital Epidemiology. Salathé M, Bengtsson L, Bodnar TJ, Brewer DD, Brownstein JS, et al., PLOS Computational Biology 8(7): e1002616, 2012.

Temporal Networks of Face-to-face Human Interactions. A. Barrat, C. Cattuto. In press (2013). Invited chapter for a book on temporal networks edited by P. Holme and J. Saramäki.

High-Resolution Measurements of Face-to-Face Contact Patterns in a Primary School. Juliette Stehlé, N. Voirin, A. Barrat, C. Cattuto, L. Isella, J.-F. Pinton, M. Quaggiotto, W. Van den Broeck, C. Régis, B. Lina and P. Vanhems. PLOS ONE 6(8): e23176. doi:10.1371/journal.pone.0023176, 2011.

Simulation of an SEIR Infectious Disease Model on the Dynamic Contact Network of Conference Attendees. J. Stehlé, N. Voirin, A. Barrat, C. Cattuto, V. Colizza, L. Isella, C. Régis, J.-F. Pinton, N. Khanafer, W. Van den Broeck, and P. Vanhems. BMC Medicine, 9(87), 2011.

What's in a Crowd? Analysis of Face-to-Face Behavioral Networks. Lorenzo Isella, J. Stehlé, A. Barrat, C. Cattuto, J.F. Pinton, and W. Van den Broeck. Journal of Theoretical Biology 271, 166-180, 2011.

Livestock movements

The study of cattle trading systems is crucial for monitoring and understanding the spreading of emerging epidemics, in order to reduce the risk of major outbreaks that pose serious health and economic concerns. The large dataset of cattle trade movements obtained from the Italian National Bovine database, which provides a daily description of the movements of each bovine in Italy, has been described as a dynamical network in which the nodes correspond to premises and a directed link represents a displacement of bovines between two premises. The large variability and the rapidly evolving daily patterns observed in the trading system lead, in principle, to an exponential increase in the number of possible epidemic scenarios, depending on the starting date and the seeding premises of the epidemic. Such large variations and the huge number of degrees of freedom in the system may thus strongly limit our ability to devise and implement preventive actions against emerging infectious disease outbreaks. For this reason, through intensive numerical simulations on the fully dynamic network, we have investigated the role of the initial conditions in shaping the disease propagation. The analysis of the spreading patterns highlighted the presence of clusters of premises leading to similar epidemic profiles and peak times. Such clusters cannot be identified from purely structural or geographical considerations. Reducing the degrees of freedom in the initial conditions through clustering also allows us to define a novel method to identify premises characterized by a large vulnerability, which is important knowledge for risk assessment analysis.
Indeed, although the large temporal variability of animal trading routes intrinsically alters the centrality of nodes from one observation time to another, it is possible to identify sentinel nodes: premises that are often reached by the disease and that, when detected as infected, provide valuable information on the seeding farms of the outbreak and thus on the likely spreading path. The proposed method can be used to optimize surveillance systems and define rapid and efficient containment strategies.

Optimizing surveillance for livestock disease spreading through animal movements, P. Bajardi, A. Barrat, L. Savini, V. Colizza, J. Roy. Soc. Interface 9, 2814-2825 (2012).

Figure: Emergence of clusters of seeding nodes. The small white dots (i.e. the nodes of the network) represent the Italian livestock premises, and an arc represents the exchange of batches of bovines between two of them. If an infectious disease outbreak occurs, the epidemic may propagate spatially, from one animal holding to another, through the movements of infected animals. Taking advantage of extensive computer simulations, it is possible to compare different seeds of the outbreak in terms of the spreading patterns they produce, and to group into clusters the nodes that infect similar sets of premises along their invasion paths. Here, two clusters are shown as examples; for each of them, three snapshots reproduce the invasion paths of nodes belonging to the same cluster.
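
The idea of grouping seeds that infect similar sets of premises can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the method of the cited paper: the similarity measure (Jaccard index), the threshold, the connected-component grouping and the toy data are all choices made here for the sake of the example.

```python
from itertools import combinations

# Hypothetical simulation output: seed premise -> set of premises it infects.
INFECTED_BY_SEED = {
    "A": {"x", "y", "z"},
    "B": {"x", "y", "z", "w"},
    "C": {"u", "v"},
    "D": {"u", "v", "t"},
    "E": {"q"},
}


def jaccard(s, t):
    """Overlap of two sets: size of intersection over size of union."""
    union = s | t
    return len(s & t) / len(union) if union else 1.0


def cluster_seeds(infected_by_seed, threshold=0.5):
    """Group seeds whose infected sets overlap by at least `threshold`.

    Clusters are the connected components of the similarity graph,
    computed with a small union-find structure.
    """
    parent = {seed: seed for seed in infected_by_seed}

    def find(s):
        # Follow parent links to the root, compressing the path on the way.
        while parent[s] != s:
            parent[s] = parent[parent[s]]
            s = parent[s]
        return s

    for a, b in combinations(infected_by_seed, 2):
        if jaccard(infected_by_seed[a], infected_by_seed[b]) >= threshold:
            parent[find(a)] = find(b)

    clusters = {}
    for seed in infected_by_seed:
        clusters.setdefault(find(seed), set()).add(seed)
    return sorted(sorted(members) for members in clusters.values())


if __name__ == "__main__":
    print(cluster_seeds(INFECTED_BY_SEED))
```

Note that the grouping uses only the simulated invasion paths, not the position of the seeds in the network, which is consistent with the remark that such clusters cannot be identified from purely structural or geographical considerations.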

Validation of H1N1 pandemic model predictions

In 2012, we finalized our work on the validation of the numerical forecasts obtained by the GLEAM model during the course of the 2009 A/H1N1 pandemic. Our work appears in the manuscript published by BMC Medicine:

Real-time numerical forecast of global epidemic spreading: case study of 2009 A/H1N1pdm. Michele Tizzoni, Paolo Bajardi, Chiara Poletto, José J. Ramasco, Duygu Balcan, Bruno Gonçalves, Nicola Perra, Vittoria Colizza, Alessandro Vespignani. BMC Medicine, 10:165 (2012).

In detail, in 2009 we used GLEAM in real time to generate stochastic simulations of the worldwide pandemic spread, yielding the incidence and seeding events at a daily resolution for 220 countries. Using a Monte Carlo Maximum Likelihood analysis, the model provided an estimate of the seasonal transmission potential during the early phase of the H1N1 pandemic, and generated ensemble forecasts for the activity peaks of the fall/winter wave in the northern hemisphere. The forecasts were published in September 2009, well before the peak weeks of epidemic activity in the northern hemisphere [1]. Thanks to the availability of incidence data from surveillance systems worldwide, we compared our predictions against the empirical data collected in 48 countries, and assessed their robustness with respect to: 1) the peak timing of the pandemic; 2) the level of spatial resolution allowed by the model; and 3) the clinical attack rate and the effectiveness of the vaccine. Real-time predictions of the peak timing were found to be in good agreement with the empirical data, showing strong robustness to data not accessible in real time (such as adherence to vaccination campaigns, pre-exposure immunity, etc.), whereas these ingredients do affect the predictions for the attack rates. The timing and spatial unfolding of the pandemic are critically sensitive to the level of mobility data integrated into the model.
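
The general logic of a Monte Carlo maximum likelihood estimate can be illustrated with a deliberately simple toy model. The sketch below is an assumption-laden illustration, not GLEAM and not the analysis of the cited paper: the stochastic model (independent transmission over a fixed number of contacts), the observed count, the candidate grid and all function names are hypothetical. For each candidate parameter value, the likelihood of the observation is approximated by the fraction of stochastic runs that reproduce it, and the candidate with the highest fraction is retained.

```python
import random


def simulate_cases(p, contacts, rng):
    """Toy stochastic model: each contact transmits independently with probability p."""
    return sum(rng.random() < p for _ in range(contacts))


def monte_carlo_mle(observed, contacts, candidates, runs=2000, seed=0):
    """Return the candidate p with the highest simulated likelihood.

    The likelihood of each candidate is approximated by the fraction of
    simulation runs that reproduce the observed case count exactly.
    Also returns the full dictionary of estimated likelihoods.
    """
    rng = random.Random(seed)
    likelihood = {}
    for p in candidates:
        hits = sum(simulate_cases(p, contacts, rng) == observed for _ in range(runs))
        likelihood[p] = hits / runs
    best = max(likelihood, key=likelihood.get)
    return best, likelihood


# Hypothetical observation: 15 cases out of 50 contacts.
CANDIDATES = [0.1, 0.2, 0.3, 0.4, 0.5]
best_p, likelihoods = monte_carlo_mle(observed=15, contacts=50, candidates=CANDIDATES)

if __name__ == "__main__":
    print(best_p, likelihoods)
```

In a realistic setting the simulator is far more expensive and the observation is a set of time series rather than a single count, so the match criterion and the number of runs per candidate become the main design choices.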
Our results showed that large-scale models can provide valuable real-time forecasts of influenza spreading, at the cost of high-performance computing. Forecast quality improves with the level of data integration, which argues for high-quality data in population-based models and for progressively updating the models with validated empirical knowledge.

References
[1] Balcan et al. BMC Medicine, 7:45 (2009)