Automated Data Collection in Accommodation Statistics

Mr Juha-Pekka Konttinen
Head of Statistics, Transport and Tourism, Statistics Finland

Abstract

The European Statistical System is increasingly being called upon to reduce the administrative burden of European enterprises. As a consequence, initiatives need to be taken to find a better balance between user needs and the burden on producers. One approach is to introduce more efficient data collection methods. In this context, Statistics Finland introduced a system of automated data collection in accommodation statistics in 2005, while INE Spain introduced a similar system in 2008. Eurostat has also been supporting the development of more efficient ways of collecting data in the field of tourism statistics. An ESSnet project involving eight Member States was launched in 2010. The ultimate goal is to develop a system which can generate statistical information automatically from the management information systems used by tourist accommodation establishments. The results of automated data collection have been encouraging. The introduction of the automated data collection system has led to a significant reduction in the reporting burden of accommodation establishments and to a notable reduction in the processing and compilation work of the statistical offices.
The aim of this paper is to describe the experiences of automated data collection at Statistics Finland and among the parties participating in the ESSnet project in other Member States.

1 Introduction

Tourism statistics in the European Union are currently covered by Regulation (EU) No 692/2011 of the European Parliament and of the Council of 6 July 2011 concerning European statistics on tourism. The regulation establishes a common framework for the systematic development, production and dissemination of European statistics on tourism supply and demand. This paper focuses on statistics relating to the supply side, that is, statistics on the capacity and occupancy (arrivals, nights spent) of tourist accommodation establishments.

The European Union's tourism industry occupies an important place in the economies of the Member States, and it is therefore crucial that they produce statistical information of high quality in order to guarantee reliable, detailed and comparable data. On the other hand, the European Statistical System (ESS) is increasingly being called upon to reduce the administrative burden of European enterprises. This concerns all enterprises, but particularly SMEs. Initiatives therefore need to be taken in order to find a better balance between user needs and the burden on producers. The latter group consists in the first place of the reporting enterprises, but it also includes the statistical authorities in charge of collecting and compiling the statistics.

One approach to alleviating the burden is to reduce the volume of the collected information; another approach is to introduce more efficient data collection methods. In the recent context of re-engineering official statistics and making optimal use of technological developments, Eurostat has supported the development of more efficient ways of collecting data, also in the field of tourism statistics.
The introduction of a widespread system of automated data collection from tourist accommodation establishments could lead to a significant reduction in the reporting burden for enterprises and to a significant reduction in the processing and compilation burden for the statistical authorities of the Member States.

An ESSnet project involving eight Member States (Spain, Belgium, Bulgaria, Finland, Latvia, Lithuania, Poland and Slovakia) was launched in 2010. The aims of this project are: i) to reduce the response burden, ii) to improve timeliness and iii) to enhance the
international comparability and quality of the statistics collected from tourist accommodation establishments. The ultimate objective is to develop a system that can generate statistical information automatically from the management information system(s) used by tourist accommodation establishments.

This paper examines the current situation, challenges and benefits concerning automated data collection, especially at Statistics Finland, but also in other countries that have been participating in the ongoing ESSnet project.

In Finland, a system of electronic data collection was introduced in 2005. In the first stage, an Internet questionnaire was used, but since entering and transmitting data through the Internet questionnaire was considered too time-consuming, the need for automated data collection was recognized. This system is based on XML files that are formed automatically from hotels' information or booking management systems and sent as an encrypted electronic transmission to the database of Statistics Finland, where a set of logical validation tests is executed. Hotels also receive a quarterly automatic feedback report on the establishment and a comparison overview of the same sector and region.

Regarding other Member States, a similar system was introduced in Spain in 2008. The results of automated data collection have been encouraging in both countries. The introduction of automated data collection has led to a significant reduction in the reporting burden for accommodation establishments and to a notable reduction in the processing and compilation burden for the statistical offices, especially in the Finnish case. The positive results encouraged six other Member States to participate in the ESSnet project as co-partners. The project is due to end in September 2012, but most of the work has already been done.
As a result, the common part of the XML file, which contains all the variables required to satisfy the needs of the EU regulation, was created. In addition, all the countries involved created an additional part of the file to ensure that national demands are met. So far, four countries out of eight have implemented the system and the other four are either testing or planning to implement it during 2012.

Even though the results have been quite encouraging, there is still room for improvement in the take-up of the new transmission method by the respondents. Automated data collection has not spread as quickly as one would expect. The implementation of the system is highly dependent on software
houses, which create the hotel management systems and which often operate in the global market. These software houses and/or global hotel chains do not necessarily consider one individual country to be an important market and are therefore not readily willing to implement the system. A common European system might be a major advantage in implementing the system more rapidly.

2 Automated data collection

In automated data collection, statistical information is generated automatically, e.g. from the respondent's management system into a specified file, and sent directly as an encrypted electronic transmission to the NSI's database. The procedure should be more or less automatic. In the optimal case, such as in Statistics Finland's accommodation statistics, the respondent simply presses a button in their hotel management system and the data are sent immediately to the NSI.

3 National experiences in Finland

Up to the beginning of 2005, there were essentially two types of respondents: i) those who answered by faxing reports generated from their hotel management system to Statistics Finland, and ii) those who filled in paper questionnaires. Either way, the data had to be recorded manually at Statistics Finland. In addition, the response burden on accommodation establishments was high as a result of the manual work involved. Statistics Finland therefore decided to develop new modes of responding that are less burdensome both to the data suppliers and to the NSI.

In early 2005, a questionnaire that could be filled in on the Internet was introduced alongside the previously used answering modes. This step was part of the already ongoing pilot project for the development of an XML-based questionnaire formulated with the application developed by Statistics Finland for the implementation of data collection via the Internet. Nevertheless, all of these modes of response involve manual phases, either for the data suppliers or for Statistics Finland, or for both.
This led to the launch of a pilot project on automated data collection. Statistics Finland already had an existing collection mode, the XML-based questionnaire resulting from the pilot study. In addition, this collection mode already included an application for mass dispatching of emails as well as an application for transferring data from the collection database to the production database. At the same
time, the logical consistency of the data could be verified. The major challenge was to find a way in which XML files could be formed from the data suppliers' information systems.

Statistics Finland and representatives of software suppliers concluded that the shared objective was to make data collection easier and quicker. It was agreed that Statistics Finland would draw up the required specifications and documents, such as a description of the XML file according to which the automated reporting of data could be implemented. Accordingly, the participating software suppliers agreed to add a new reporting facility to their own software.

A system of automated data collection was introduced in autumn 2005, and with the help of the system, data suppliers can now transmit the XML file to Statistics Finland directly from the management system of the accommodation establishment simply by pressing a button. As a result, electronic data collection at Statistics Finland comprises two alternative modes. One is the Internet-based questionnaire and the other is automated data collection. Figure 1 describes the process at Statistics Finland.
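As a rough illustration of the supplier side of this arrangement, the following Python sketch aggregates booking records into a monthly XML report. The element and attribute names (AccommodationReport, Residence, nightsSpent), the booking fields and the establishment identifier are hypothetical; the actual file description is the one drawn up by Statistics Finland, which is not reproduced in this paper. Encryption and transmission are left out.

```python
import xml.etree.ElementTree as ET
from collections import defaultdict

# Hypothetical booking records as a hotel management system might hold them.
# The field names are illustrative assumptions, not an actual specification.
BOOKINGS = [
    {"country": "FI", "guests": 2, "nights": 3},
    {"country": "SE", "guests": 1, "nights": 2},
    {"country": "FI", "guests": 4, "nights": 1},
]


def build_report(establishment_id: str, period: str, bookings: list[dict]) -> str:
    """Aggregate bookings into arrivals and nights spent per country of residence."""
    totals = defaultdict(lambda: {"arrivals": 0, "nights": 0})
    for b in bookings:
        totals[b["country"]]["arrivals"] += b["guests"]
        # Nights spent = number of guests multiplied by length of stay.
        totals[b["country"]]["nights"] += b["guests"] * b["nights"]

    # Hypothetical root element; the real schema may differ.
    root = ET.Element("AccommodationReport", establishment=establishment_id, period=period)
    for country in sorted(totals):
        ET.SubElement(
            root,
            "Residence",
            country=country,
            arrivals=str(totals[country]["arrivals"]),
            nightsSpent=str(totals[country]["nights"]),
        )
    return ET.tostring(root, encoding="unicode")


report_xml = build_report("H-0001", "2011-07", BOOKINGS)
print(report_xml)
```

In a real hotel management system, a function of this kind would sit behind the reporting button, and the resulting file would then be encrypted and sent to the NSI.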
Figure 1. The architecture of electronic data collection at Statistics Finland

The process is quite similar in the two alternatives. If the respondent delivers data using automated data collection, the received data file is transferred with only a short delay to the Internet questionnaire. Thus the respondent can quite easily view and use the data sent. Otherwise, the respondent must manually key in data to an Internet questionnaire. After this step, the process is the same for the two alternatives. Data are transferred from the Internet questionnaire to the temporary database and then further to the production database after data control and logical verifications. The respondents also receive an automated feedback report quarterly.

In 2011, Statistics Finland received approximately 66 per cent of the data electronically (automated data collection and an Internet questionnaire) and 34 per cent by other modes of reporting (paper questionnaire, email, fax, etc.). Overall, about 16 per cent (approximately 140 accommodation
establishments) of the data were received automatically. The development has been encouraging, but unfortunately quite slow, at least in automated data collection.

Figure 2. Number of respondents by reporting method at Statistics Finland in 2005-2012

The introduction of both electronic and automated data collection has led to a notable reduction in the processing and compilation burden for Statistics Finland. Between 2004 and 2009, the working hours spent on data collection, editing, reminders and feedback decreased by 35 per cent.
Figure 3. Working hours spent on data collection, editing, reminders and feedback at Statistics Finland in 2004-2009

To conclude, the experiences from electronic and automated data collection have been encouraging. Once the accommodation establishment has implemented the system, the response burden is practically zero. Earlier, and in other reporting modes, the response burden per month was, on average, between 30 minutes and 2 hours. In addition, the compilation burden has been reduced significantly. Moreover, Statistics Finland receives the data earlier, which means that we have more time to analyze and go through the data. This has also improved the quality of the statistics.

It must be noted that the implementation of the automated data collection system has been surprisingly slow. There are various reasons for this, but the most important one seems to be that the accommodation establishments either have many different management systems and software applications or else no software at all. This means that it takes time and money to update all the software. In addition, once the automated data collection function has been implemented, it seems to be fairly challenging to get the updated version of the system to the customers (accommodation establishments) and to introduce the new function. The reporter and the person who is responsible for the updates are not necessarily the same person, and as a result the reporters of the data might not be aware of the new function.
It is also challenging to persuade bigger hotel chains to implement the system. Global software houses, which often supply the management systems to the hotel chains, consider that one country is a small market and are not easily convinced of the necessity to update their systems. In addition, small (often seasonal) establishments do not have the appropriate software, resources and/or interest to invest money and effort in a new system.

It is encouraging that since the new system was created, there have been no major technical problems during the implementation and the delivery process.

4 National experiences in other Member States

Statistics Finland has been pioneering automated data collection since 2005, but other Member States have also been testing and implementing broadly similar systems in recent years.

INE Spain introduced the system of automated data collection in 2008, and since then hotels have had an opportunity to send data either automatically from hotel management systems (if implemented) or through INE Spain's web page. There has been some interest in using automated data collection, but the implementation phase has been even slower than in Finland. At present, some 100 establishments send in XML files automatically each month.

The ongoing ESSnet project has added six other Member States to the group experimenting with automated data collection. Currently two of these countries (Latvia and Poland) have implemented the system and the other four (Belgium, Bulgaria, Lithuania and Slovakia) are either testing or implementing the system.

The experiences of the ESSnet project and of the participating countries have been rather promising, although all the countries encountered challenges. The consensus is that the planning and execution have been fairly easy from the technical point of view, at least in statistical offices. The challenge is to have software houses and hotels implement the system in their hotel management systems.
As a result of the project, all the participating countries decided to engineer a common XML file which features all the compulsory variables defined in the EU regulation. In addition, all the countries designed an additional part of the XML file that satisfies the needs of national users. The architecture and the process of receiving and handling the data differ between countries, since all of them have different software available and/or purchased.
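To make the split between the common and the national part more concrete, the following Python sketch shows how a receiving office might read only the common part of a report and run two simple logical checks on it. The layout (Report, Common, National), the attribute names and the checks themselves are assumptions made for illustration; they are not the file description or the validation rules agreed in the ESSnet project.

```python
import calendar
import xml.etree.ElementTree as ET

# Hypothetical layout: a <Common> block shared by all countries plus a
# <National> block that each country defines for itself.
SAMPLE = """
<Report establishment="H-0001" period="2011-07">
  <Common bedPlaces="100" arrivals="7" nightsSpent="12"/>
  <National country="FI">
    <PurposeOfStay business="5" leisure="7"/>
  </National>
</Report>
"""


def days_in_period(period: str) -> int:
    """Return the number of days in a 'YYYY-MM' reporting period."""
    year, month = map(int, period.split("-"))
    return calendar.monthrange(year, month)[1]


def validate_common(xml_text: str) -> list[str]:
    """Run illustrative logical checks on the common part; return error messages."""
    root = ET.fromstring(xml_text)
    common = root.find("Common")
    if common is None:
        return ["missing common part"]

    beds = int(common.get("bedPlaces"))
    arrivals = int(common.get("arrivals"))
    nights = int(common.get("nightsSpent"))

    errors = []
    if min(beds, arrivals, nights) < 0:
        errors.append("negative value")
    # Assumed capacity rule: nights spent cannot exceed bed places times days.
    if nights > beds * days_in_period(root.get("period")):
        errors.append("nights spent exceed bed places x days in period")
    return errors


print(validate_common(SAMPLE))
```

Because the checks touch only the common block, the same routine could in principle be reused across countries, while each office handles its own national block separately.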
5 Conclusions

As discussed earlier, the results of automated data collection have been encouraging. The introduction of automated data collection has led to a significant reduction in the reporting burden for accommodation establishments and to a notable reduction in the processing and compilation burden for statistical offices. At Statistics Finland, the number of working hours spent on data collection, editing, reminders and feedback has dropped by 35 per cent, from 3,540 hours in 2004 to 2,300 hours in 2009.

Even if the data still need to be checked using various routines, automated data collection reduces the 'manual' work for both the respondent and the NSI. In addition, the overall quality and especially the timeliness have improved. This system of automated data collection has shown that a reduction in burden can go hand in hand with an improvement in quality. If the automated data collection system is widely implemented among respondents, the NSIs receive data earlier, which in turn enables them to publish the data earlier.

The ultimate goal of the ongoing ESSnet project was to create international standards for file description and data transfer in NSIs. During the project all the participating countries created a common file and an additional file to satisfy the needs of both international and national users. The reporting system is identical in each country as regards the demands of the EU regulation. This would imply that the data and the definitions are fully comparable between these countries, at least as regards automated data collection.

The biggest challenge is to implement the system within each participating country and, in addition, in other Member States. A common system might assist the implementation, as this would mean that, for instance, global software vendors would need to build only one system. At the moment, all the participating countries also have an additional file, which makes international implementation more difficult.
There is still a great deal of work to be done, but the benefits are so significant that they are worth the extra effort.

References

Regulation (EU) No 692/2011 of the European Parliament and of the Council of 6 July 2011 concerning European statistics on tourism and repealing Council Directive 95/57/EC, Official Journal of the European Union L 192, Volume 54, 22 July 2011, pp. 17-32