ICSU and the Challenges of Big Data in Science. Elsevier Conference on Big Data, E-Science and Science Policy, 16-17 May 2012, Canberra. Professor Ray Harris, UCL
International Council for Science (ICSU): 121 national scientific bodies representing 140 countries, e.g. Australian Academy of Science, National Academy of Sciences, Royal Society; 31 international scientific unions, e.g. International Astronomical Union, International Union of Crystallography, International Union of Geodesy and Geophysics
ICSU Universality of Science. ICSU vision: a world where science is used for the benefit of all, excellence in science is valued and scientific knowledge is effectively linked to policy making. In such a world, universal and equitable access to high quality scientific data and information is a reality.
The ICSU journey: Panel Area Assessment on Scientific Information and Data, 2003-2004; Strategic Committee on Information and Data, 2007-2008; World Data System, starting in 2009; Strategic Coordinating Committee on Information and Data, 2009-2011
Big data : Astronomy
Square Kilometre Array
Big data : Particle physics. 150 million sensors delivering data 40 million times per second; 22 PB in 2012
CERN data distribution
Big data : Biomedicine. Human Genome Project: determine the sequences of the 3 billion chemical base pairs that make up human DNA; identify all of the approximately 20,000-25,000 genes in human DNA. UK Biobank: 0.5 million people aged 40-69; measurements on individuals; blood, urine and saliva samples
Big data : Earth observation Canberra 3 January 2008
MTSAT Japan 11 April 2012 1100 UTC
Big data : global data volumes. Hilbert and Lopez, Science, 2011: if all the data used in the world were written to CD-ROMs and the CD-ROMs piled up in a single stack, the stack would stretch from the Earth to the Moon and a quarter of the way back.
The big data challenge: data explosion and data overload (by 2020, 35 zettabytes of digital data created per annum; one ZB = 10^21 bytes); data complexity; changing expectations; digital divide
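To give a feel for the scale of the 2020 figure, the annual volume can be turned into an average rate per second. The following is a rough sketch only: it assumes a 365-day year and perfectly uniform data creation, and the constant and function names are illustrative, not taken from the talk.

```python
# Back-of-envelope conversion of an annual data volume to an average rate.
# Assumptions (not from the talk): 365-day year, uniform data creation.

ZETTABYTE = 10**21  # bytes, as defined on the slide
PETABYTE = 10**15   # bytes
SECONDS_PER_YEAR = 365 * 24 * 60 * 60


def average_rate_bytes_per_second(bytes_per_year):
    """Return the mean creation rate in bytes per second."""
    return bytes_per_year / SECONDS_PER_YEAR


# 35 ZB per annum, the projection quoted for 2020
global_rate = average_rate_bytes_per_second(35 * ZETTABYTE)
print(f"{global_rate / PETABYTE:.2f} PB/s")  # prints "1.11 PB/s"
```

Under these assumptions, 35 ZB per annum averages out to a little over one petabyte of new data every second.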
Panel Area Assessment (PAA) on scientific information and data: long term strategic leadership by ICSU on scientific data and information; professional data management in science; ensure universal and equitable access to data and information; do not forget the question of who pays
Sorbonne, November 2007. Strategic Committee on Information and Data: new World Data System created from WDCs and FAGS; more prominent CODATA; ICSU national members and unions more active in professional data and information management
World Data System : objectives Enable universal and equitable access to quality-assured scientific data, data services, products and information; Ensure long term data stewardship; Foster compliance to agreed-upon data standards and conventions; Provide mechanisms to facilitate and improve access to data and data products
World Data System : status. Constitution; data policy; initial system architecture; formal membership criteria and assessment process; International Programme Office, Tokyo; first WDS conference, Kyoto, September 2011; CODATA conference, Taiwan, October 2012
WDS members so far: over 140 expressions of interest; 58 applications for membership; 35 regular members approved; 1 network member; 1 partner member; 2 associate members
Example WDS members: Antarctic Data, Hobart; Climate, Hamburg; Oceanography, Washington DC; Renewable Resources and Environment, Beijing; Solid Earth Physics, Moscow; International Laser Ranging Service; International VLBI Service for Geodesy and Astrometry
Strategic Coordinating Committee on Information and Data: communication of best practice on data management (national and union members). Explore and agree the terms used in Open Access: forum for the agreement of terms; ICSTI and INASP for publications; CODATA for science data. Use the existing OECD principles for access to research data from public funding.
Strategic Coordinating Committee on Information and Data: improve the process of creating data as a publication (increased recognition; behaviour modification; potential role for legal deposit libraries). Use the CODATA and World Data System conferences more actively (every two years, alternately).
Strategic Coordinating Committee on Information and Data: practical help to the less economically developed countries (LEDCs): rich in opportunity, relatively low cost; CODATA active; WDS to seek active nodes in LEDCs; partnership and mentoring by ICSTI, INASP, CODATA, and national and union members of ICSU. Cooperation with commercial companies: mutual benefit; ICSTI lead, e.g. Microsoft.
Fourth paradigm. First Paradigm: observation, description, experimentation, e.g. Ptolemy, Ibn Battuta. Second Paradigm: theoretical science, e.g. Newton. Third Paradigm: simulation and modelling, e.g. climate models. Fourth Paradigm: data-intensive science, e.g. International Virtual Observatory Alliance.