pp.42-46
http://dx.doi.org/10.14257/astl.2016.129.09

Research into a Visualization Analysis of Bigdata for the Decision Making of a Tourism Policy

Sungwook Yoon, Jeonghyun Lee, Hyenki Kim *

Dept. of Multimedia Engineering, Andong National University,
388 Seongcheon-Dong, Andong-City, Gyeongsangbuk-Do, Republic of Korea
uvgotmail@nate.com, wjdgus6163@gmail.com, hkkim@anu.ac.kr
* Hyenki Kim, hkkim@anu.ac.kr

Abstract. Visualization in the era of BigData plays a significant role in revealing trends of change and in providing intuition and insight that are hard to obtain from conventional statistical data, which is presented mainly as text and figures. This study organizes visualization indexes by collecting, refining, and analyzing data with BigData analysis methods so that a local government can make decisions on tourism industry policy using BigData information. Because feedback on tourism resources raises a wide range of issues that vary with time, space, issue, and medium, this approach is expected to support active policy decisions and planning based on multi-faceted tourism information.

Keywords: Visualization, BigData, Tourism policy

1 Introduction

Interest in BigData analysis has recently grown beyond conventional database work, in which policy decisions were based on questionnaire surveys. This growth is attributed to the mass of data produced daily as personal smart devices spread and the information society advances [1]. In the tourism industry, existing tourism-related indexes were presented as figures on a spreadsheet. They lacked a real-time character, limited any broad understanding of trends, and made insight relatively hard to obtain because they relied on aggregated results such as questionnaire surveys and totals.
By contrast, visualization built on standardized, multi-source BigData allows intuitive interpretation through spatial data and visualization tools. Because results can be produced in real time, for example from roaming data collected on smart devices, it can be used actively in deciding tourism policy and can be published on the web [2]. These data can also yield derived indexes for a given purpose through reprocessing and combination, even after an analysis has been completed.

ISSN: 2287-1233 ASTL Copyright 2016 SERSC
This study organizes visualization indexes by collecting BigData analysis methods and refining and analyzing the data, so that a local government can make tourism policy decisions using BigData. On this basis, it suggests indexes and a method that are useful for deciding policy on local tourism resources.

2 Relevant Studies and Current Status

2.1 Definition of BigData

BigData can be defined as a set of structured or unstructured ("typical or atypical") mass data that exceeds the capability of existing database management tools to collect, store, manage, and analyze, and also as the technology for extracting value from such data and analyzing the results [3]. In general, the characteristics that distinguish BigData from conventional data processing are Volume, Variety, and Velocity, and methodologies have emerged to define, analyze, and create meaningful value from these 3Vs.

2.2 BigData Processing and Analysis

Sources of BigData include expanded personal mobile networks, information mass-produced by public organizations for their own purposes, public data shared for business activities, news, information exchanged on social network services, and sensor networks based on the Internet of Things. Data are collected by various methods such as crawling, robots, Open API, FTP, RSS feed crawling, streaming, log collection, and RDB-based data collection. In addition, to share information between the public and private sectors, public data is opened under various subject categories so that it can serve public purposes [4]. Analyzing BigData is a process of discovering new trends by analyzing more than a petabyte of natural language in real time and extracting meaning from it [5]. The BigData analysis process consists of six stages: collection, storage, management, processing, analysis, and use.
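As a small illustration of the natural-language analysis stage, the following Python sketch counts word frequencies and document-level co-occurrences, which is one simple way to approximate the "frequency of a word mentioned and its correlation" that the study relies on. It is not the authors' implementation: the sample posts, the stopword list, and the function names are all hypothetical, and the tokenizer assumes lowercase English text (Korean text would need a morphological analyzer instead of this regular expression).

```python
from __future__ import annotations

import re
from collections import Counter
from itertools import combinations

# Hypothetical stopword list, kept tiny for the example.
STOPWORDS = {"the", "a", "an", "and", "of", "to", "in", "was", "is", "at"}

# Made-up posts used only to exercise the functions below.
SAMPLE_POSTS = [
    "The mask dance festival was crowded",
    "Great mask dance performance at the festival",
    "Parking near the festival was difficult",
]


def tokenize(text: str) -> list[str]:
    """Lowercase the text and keep alphabetic tokens that are not stopwords."""
    return [w for w in re.findall(r"[a-z]+", text.lower()) if w not in STOPWORDS]


def word_frequencies(documents: list[str]) -> Counter:
    """Count how often each token appears across all documents."""
    counts: Counter = Counter()
    for doc in documents:
        counts.update(tokenize(doc))
    return counts


def cooccurrence(documents: list[str]) -> Counter:
    """Count, for each word pair, the number of documents mentioning both words."""
    pairs: Counter = Counter()
    for doc in documents:
        unique_words = sorted(set(tokenize(doc)))
        # Pairs are emitted in alphabetical order, e.g. ("dance", "mask").
        pairs.update(combinations(unique_words, 2))
    return pairs
```

Calling `word_frequencies(SAMPLE_POSTS).most_common(3)` ranks the most mentioned words, and `cooccurrence(SAMPLE_POSTS)` gives a crude association measure that could feed a word network or heat map.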
Collected data go through pre-processing (filtering, transformation, and cleansing) and post-processing (integration, transformation, and reduction). For storage, an RDB, NoSQL, or a distributed file system is chosen after reviewing the format of the data to be saved and selecting the most suitable method. Security measures are based on a review of possible security breaches at each stage and include access restriction, blocking, authorization, encryption, de-identification, and post-monitoring.
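The filtering, transformation, cleansing, reduction, and de-identification steps described above can be sketched roughly as follows. This is a minimal Python illustration under stated assumptions, not the pipeline used in the study: the field names (`visitor_id`, `year`, `satisfaction`), the 1 to 5 satisfaction scale, the fixed salt, and the sample rows are all hypothetical.

```python
from __future__ import annotations

import hashlib
from dataclasses import dataclass

# Made-up survey rows (hypothetical fields and values) for the example.
SAMPLE_ROWS = [
    {"visitor_id": "a1", "year": "2014", "satisfaction": "4"},
    {"visitor_id": "a1", "year": "2014", "satisfaction": "4"},  # duplicate
    {"visitor_id": "", "year": "2014", "satisfaction": "3"},    # missing id
    {"visitor_id": "b2", "year": "2014", "satisfaction": "9"},  # out of range
    {"visitor_id": "c3", "year": "2015", "satisfaction": "5"},
    {"visitor_id": "d4", "year": "2014", "satisfaction": "2"},
    {"visitor_id": "e5", "year": "n/a", "satisfaction": "3"},   # bad year
]


@dataclass
class CleanRecord:
    visitor_key: str      # de-identified visitor identifier
    year: int
    satisfaction: float   # assumed 1-5 scale


def de_identify(visitor_id: str, salt: str = "demo-salt") -> str:
    """Replace a raw identifier with a truncated salted SHA-256 digest."""
    return hashlib.sha256((salt + visitor_id).encode("utf-8")).hexdigest()[:12]


def clean(raw_rows: list[dict]) -> list[CleanRecord]:
    """Filter, transform, and cleanse raw rows into typed records."""
    cleaned: list[CleanRecord] = []
    seen: set[tuple[str, int]] = set()
    for row in raw_rows:
        # Filtering: drop rows that lack required fields.
        if not row.get("visitor_id") or row.get("satisfaction") in (None, ""):
            continue
        # Transformation: coerce string fields to typed values.
        try:
            year = int(row["year"])
            score = float(row["satisfaction"])
        except (KeyError, TypeError, ValueError):
            continue
        # Cleansing: discard out-of-range scores and duplicate responses.
        if not 1.0 <= score <= 5.0:
            continue
        key = (row["visitor_id"], year)
        if key in seen:
            continue
        seen.add(key)
        cleaned.append(CleanRecord(de_identify(row["visitor_id"]), year, score))
    return cleaned


def reduce_by_year(records: list[CleanRecord]) -> dict[int, float]:
    """Reduction: collapse records into the mean satisfaction per year."""
    totals: dict[int, tuple[float, int]] = {}
    for rec in records:
        total, count = totals.get(rec.year, (0.0, 0))
        totals[rec.year] = (total + rec.satisfaction, count + 1)
    return {year: total / count for year, (total, count) in sorted(totals.items())}
```

A call such as `reduce_by_year(clean(SAMPLE_ROWS))` yields one value per year, which is the kind of reduced series a chart can consume. In a real deployment the salt would need to be kept secret and managed alongside the other security measures listed above.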
3 Environmental Analysis for BigData Analysis

The process for analyzing tourism BigData in this study can be schematized as follows.

1) Collect statistical data on tourism promotion websites.
2) Process the collected data into a form suitable for analysis.
3) Combine and link the data by putting together annual data based on the event reports.
4) Analyze the data and visualize it in a form that is effective as tourism policy data.
5) Share and open the data through web publishing.

Analysis and visualization results can be published on the web through analysis of the frequency of mentioned words and their correlation, visualization using a package library, analysis of changes in tourism information using Excel dynamic charts, Google Chart Tools analysis, and Open API.

4 Analysis Design

Data is collected from the local government's promotional event reports for each period and issue. Specific periods need to be defined for the events tied to each major issue so that analysis results can be produced alongside tourism policies. The storage process must either perform effective distributed processing of data gathered through web crawling or SNS Open API in the collection stage, or save the output of search-based websites. In this study, various types of data collected from the annual reports of the Andong International Mask Dance Festival were standardized and redefined so that they could be interpreted. For visualization of the policy-related content of the data and for web publishing, visualization tools such as D3.js were used, and data-linked systems such as knitr were used for processing. Fig. 1 shows the data for each year after it was processed into a form suitable for charting by applying the chart tools of the googleVis package in RStudio.
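As a rough illustration of the standardization step described above, the sketch below reshapes a year-by-category table into the long format (one row per category and year) that time-animated charts generally expect, and derives a year-over-year change index from it. The study itself used R with googleVis; this Python version is only an assumed equivalent, and the column names and percentage values are invented for the example rather than taken from the festival reports.

```python
from __future__ import annotations

# Hypothetical wide table: one row per year, one column per information source.
# The shares (%) are made-up values, not data from the annual reports.
WIDE_ROWS = [
    {"year": 2013, "TV": 30.0, "Internet": 25.0, "Word of mouth": 45.0},
    {"year": 2014, "TV": 28.0, "Internet": 32.0, "Word of mouth": 40.0},
]


def to_long(wide_rows: list[dict], time_key: str = "year") -> list[dict]:
    """Reshape wide rows into long rows sorted by category, then by time."""
    long_rows = []
    for row in wide_rows:
        for category, value in row.items():
            if category == time_key:
                continue
            long_rows.append(
                {"category": category, time_key: row[time_key], "value": float(value)}
            )
    long_rows.sort(key=lambda r: (r["category"], r[time_key]))
    return long_rows


def yearly_change(long_rows: list[dict], time_key: str = "year") -> list[dict]:
    """Derived index: change in value against the previous observed year."""
    ordered = sorted(long_rows, key=lambda r: (r["category"], r[time_key]))
    previous: dict[str, float] = {}
    changes = []
    for row in ordered:
        category = row["category"]
        if category in previous:
            changes.append(
                {
                    "category": category,
                    time_key: row[time_key],
                    "change": row["value"] - previous[category],
                }
            )
        previous[category] = row["value"]
    return changes
```

The long rows map naturally onto an identifier variable (`category`) and a time variable (`year`), and `yearly_change(to_long(WIDE_ROWS))` is an example of an index derived after the initial analysis through reprocessing.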
Fig. 1. App and Web Server Control Screen for the Web Publishing

Post-event feedback from the evaluation reports of the Andong International Mask Dance Festival was analyzed for trends using BigData. Changes across each of the 18 events were visualized as dynamic charts covering demographic distribution, motivation for visiting, frequency of visits, number of accompanying people, satisfaction level, accommodation distribution of visitors from other areas, and information sources, as recorded in the annual reports. The table below shows the sources of information through which visitors learned of the event. The annual data was presented as a dynamic chart so that it can be used in tourism policy.

5 Conclusion

In this study, structured and unstructured ("typical and atypical") data, collected and refined from the public data of annual tourism reports, was analyzed with a visualization tool. Visualization is expected to make it possible to identify trends and directions of change in the era of BigData, and to play a major role in providing new insight that is hard to obtain from conventional statistical data presented mainly as text and figures. Feedback on tourism resources raises a wide range of issues that vary with time, space, issue, and medium. If increasingly diverse mobile-environment data is analyzed on the basis of these results and more specific findings are derived, it is expected that policy decisions and plans concerning multi-faceted tourism information can be made actively.
Acknowledgements. This work was supported by a grant from the Seoul Accord Fund (R0613-16-1148) of the Ministry of Science, ICT and Future Planning (MSIP), IITP.

References

1. Lee, J. G.: A Case Study of R Analysis of BigData in Seoul Special City Expenditure Settlement Data, The Journal of Business Education, The Korean Research Association for the Business Education, vol. 29, no. 4, pp. 57-74 (2015)
2. Noh, K. S.: Convergence Analysis of Recognition and Influence on Bigdata in the e-Learning Field, Journal of Digital Convergence, The Society of Digital Policy & Management, vol. 13, no. 10, pp. 51-58 (2015)
3. Goo, J., Kim, K.: Text Mining for Korean: Characteristics and Application to 2011 Korean Economic Census Data, The Korean Journal of Applied Statistics, vol. 27, no. 7, pp. 1207-1217 (2014)
4. Sahami, M. (Ed.): Learning for Text Categorization: Proceedings of the 1998 AAAI/ICML Workshop, AAAI Press, Technical Report WS-98-05 (1998)
5. Cohen, W. W.: Fast effective rule induction. In A. Prieditis and S. Russell (Eds.), Proceedings of the 12th International Conference on Machine Learning (ML-95), Lake Tahoe, CA, pp. 115-123. Morgan Kaufmann (1995)