Data Harmonization and Management System for the Institute for Prospective Technological Studies
Customer profile
The Institute for Prospective Technological Studies (IPTS) is one of the seven scientific institutes of the European Commission's Joint Research Centre (JRC). Its mission is to provide customer-driven support to the EU policy-making process. To further its work in developing science-based responses to policy challenges, IPTS needed a specialized data management tool that would ensure greater transparency and a better understanding of data, and also offer modeling functionality.
ec.europa.eu/jrc/en/institutes/ipts
Description
Together with the JRC IPTS, Prognoz designed and implemented a software platform for data harmonization and management. The resulting product, DataM, is a database management tool intended to simplify the daily data work of analysts and modelers. It facilitates the input of data into economic models, verification of statistical information and analysis of results. DataM focuses on agriculture data, trade data and models supplied by major data providers. Users can quickly retrieve information from different sources without any special knowledge of the nomenclature. The tool addresses different needs ranging from data collection and validation to advanced reporting, with the capacity to export data.

In close collaboration with IPTS multi-disciplinary research staff, Prognoz identified a wide range of socio-economic and agricultural datasets from a number of authoritative sources such as the IMF, Eurostat, UN Comtrade, FAPRI, OECD, FAO, and USDA, to name a few. Then Prognoz data analysts did the necessary preparatory work, which involved extracting the relevant information (data, metadata, and industry-specific information) from the data sources, prescreening data elements, and mapping data elements from source data to target structures before loading them into the data warehouse. This resulted in creating a data structure that allows for multi-source data analysis and comparison of results across one or more data models.

Data from various sources is integrated and processed by an all-in-one data management tool that includes modules designed to assist IPTS with data analysis. DataM allows users to quickly navigate from database to database, run powerful searches across all external and internal datasets, select specific variables, conduct time series analysis, perform data transformations and make adjustments as required, apply built-in or user-defined statistical and econometric functions, visualize results using a wide range of embedded visualization tools, keep a record of variables, and identify areas for further work.

The DataM Application comes in two versions. One is a fully-featured desktop application providing powerful tools for analyzing and validating datasets, and for automating the processing and preparation of standard reports. The other is a Web version designed for faster data search and analysis, and for preparation of reports using the indicators provided. The Web version of the Prognoz DataM Application is available at www.datamweb.com

Benefits of the tool

Easy access to data. With DataM, users can access most of the databases through a single interface. They can retrieve information from several databases with only one account, using a single query tool.

Up-to-date data. In DataM, the data is updated regularly: the daily and weekly databases are updated twice a month; the monthly databases are updated every month, and the databases released once a year are updated within 5 or 10 working days of the release, depending on the urgency of the task.

Data harmonization. To facilitate data search and comparison, the nomenclatures of the databases are linked via common dictionaries, thus enabling the user to work with data from different sources under one, «common», nomenclature.

Source documentation and meta-information on data. The source documentation (methodology, definitions of variables) is stored in the tool and is directly accessible from the Start Page of DataM. In addition to that, a summary report informs the user of the data source provider, the potential need for a licence, the data frequency, the update frequency in DataM, and the geographical coverage. The meta-information on data can also be viewed directly when visualizing the data.

Data analysis and reporting. DataM provides visualization tools (tables and charts) for data analysis and the comparison of different data sources. DataM is also equipped with a reporting tool. Reports for recurrent data analysis, as well as data extraction and dissemination, can be pre-defined.
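The idea of linking source nomenclatures through a common dictionary can be illustrated with a minimal Python sketch. It is not DataM's implementation: the source names, the codes, and the `harmonize` helper are all hypothetical and chosen only to show how records from different providers might be grouped under one common code.

```python
# Hypothetical common dictionary: (source, source-specific code) -> common code.
# None of these names or codes come from DataM or from a real provider.
COMMON_DICTIONARY = {
    ("SOURCE_A", "0111"): "WHEAT",
    ("SOURCE_B", "WHT"): "WHEAT",
    ("SOURCE_A", "0112"): "MAIZE",
    ("SOURCE_B", "MZE"): "MAIZE",
}


def harmonize(records):
    """Group (source, code, year, value) records under the common nomenclature.

    Returns the harmonized groups and the list of (source, code) pairs
    that have no entry in the common dictionary.
    """
    harmonized = {}
    unmapped = []
    for source, code, year, value in records:
        common = COMMON_DICTIONARY.get((source, code))
        if common is None:
            unmapped.append((source, code))
            continue
        harmonized.setdefault(common, []).append((source, year, value))
    return harmonized, unmapped


records = [
    ("SOURCE_A", "0111", 2012, 100.0),
    ("SOURCE_B", "WHT", 2012, 98.5),
    ("SOURCE_B", "MZE", 2012, 40.0),
    ("SOURCE_B", "XYZ", 2012, 1.0),  # no mapping: reported as unmapped
]
harmonized, unmapped = harmonize(records)
```

With such a mapping, a query for the common code `WHEAT` returns the observations from both hypothetical sources side by side, and codes without a mapping are reported instead of being silently dropped.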
Key features

Data sources. DataM provides access to more than 20 online datasets from a number of authoritative sources including Eurostat, OECD, UN, IMF, the World Bank, the U.S. Department of Agriculture, the Wall Street Journal (commodity futures), and many others.

Source documentation and metadata definition tools. The source documentation can be accessed from the Start Page of DataM along with the information on the data version (i.e. the date of the last data update by the source provider) available in DataM. With just one click users can get a summary report on a database.

Data search. The search functionality, which is also available on the Start Page of DataM, allows the user to search all or some of the databases using either the name or the code of an element.

Data visualization in the «Workbook». This feature («Workbook») is the principal visualizing tool for raw data in DataM. Within the Workbook, users can select the data series they want to display, and they can choose to work either in the OLAP mode (multidimensional mode) or in the time series mode.

Reporting. DataM contains a reporting tool that can be used for data dissemination, and also for building routines for structured data extraction or analyzing scenario results. Reports include compelling data visualization to facilitate analysis.

Data export and import. Bulk export of a database into .xlsx, .mdb, .txt, .csv or .gdx formats can be performed from the Start Page. The data to be exported can easily be selected, and users can choose to keep the years in columns or to transpose them into rows. Model data can be imported through a customized user interface for importing model datasets (the automatic «data loaders»).

Transformations. Transformations are used to perform large-scale calculations on complete datasets, e.g. a change in units. Different kinds of calculations can be implemented, including conditionals, or using metadata, e.g. a unit of measure. These calculations are run automatically before a data update.

«Management console». The «Management console» is used mainly to manage data updates and user rights. DataM provides the functionality to grant permission to access certain databases to specified users or user groups.
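Two of the features above, a unit transformation driven by metadata and the choice between years in columns and years in rows on export, can be sketched in a few lines of Python. This is an illustration under stated assumptions, not DataM's own code: the series layout, the conversion table `TO_TONNES`, and the helper names are all hypothetical.

```python
# Hypothetical conversion factors from a series' unit metadata into tonnes.
TO_TONNES = {"t": 1.0, "kt": 1_000.0, "kg": 0.001}


def to_tonnes(series):
    """Convert a series to tonnes based on its 'unit' metadata.

    Series whose unit is not a known mass unit are returned unchanged
    (the conditional part of the transformation).
    """
    factor = TO_TONNES.get(series["unit"])
    if factor is None:
        return series
    values = {year: value * factor for year, value in series["values"].items()}
    return {**series, "unit": "t", "values": values}


def years_to_rows(series_list):
    """Transpose years-in-columns series into one (name, year, value) row each."""
    return [
        (s["name"], year, value)
        for s in series_list
        for year, value in sorted(s["values"].items())
    ]


# Hypothetical input: each series keeps its years as keys ("columns").
wheat = {"name": "wheat", "unit": "kt", "values": {2011: 1.5, 2012: 2.0}}
price = {"name": "price", "unit": "EUR/t", "values": {2011: 200.0}}

converted = [to_tonnes(s) for s in (wheat, price)]
rows = years_to_rows(converted)
```

The unit of measure is read from the series metadata, so the same transformation can be applied to a whole dataset while leaving non-mass series such as prices untouched. The transposed `rows` list corresponds to the layout with years in rows.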
Results of the implementation of the system

DataM provides easy access through a single interface to most of the main agricultural, macroeconomic and trade databases as well as the in-house model databases.

Less time and effort is required to prepare reports based on a wide variety of data and to publish the results, thanks to data integration and automatic data updates.

Extensive source documentation and meta-information on the data loaded in DataM are stored in the application.

DataM enables users to easily compare data across data sources because the nomenclatures of the databases are linked to a common nomenclature.

DataM provides tools to group databases into harmonized datasets by topic.

A powerful search engine allows users to find the relevant data available in all the databases, and because the different nomenclatures are linked, users can be sure that no data is missed by the search.

DataM web provides efficient tools to access, create and disseminate reports on agriculture, food security, bioeconomy, and many other subjects.

DataM is beginning to be applied by other DGs of the European Commission, and also by international organizations such as the African Development Bank.
www.prognoz.com