Big Data in Danish industry - Appendix




Table of contents
1. Introduction
2. My background
3. Thesis background
4. Research approach - confessional accounts
5. The problem statement
6. The literature review
6.1. Finding relevant papers
6.2. Reading and skimming the selected papers
6.3. The selected papers - a walkthrough
6.4. Observations about the papers
6.5. The steps five to eight of the structured literature review
7. Building a good survey
7.1. The questions of the survey
7.2. Questions about the data application
7.3. Questions about Big Data
7.4. Distribution of the survey and summary
8. Analysis of the results
9. Writing the thesis
10. How does the process differ from normal project work?
11. What are the lessons learned in terms of research
12. Discussion
12.1. Implications of findings in a larger perspective
12.2. What have I learned that will make me a more competent practitioner?
13. Summary
Acknowledgements

1. Introduction

The appendix to my master's thesis is an attempt to give a confessional account of the process that has taken me from sketches on a piece of paper to an article that is as finished as time and ability have allowed it to be. The appendix will take you through all the steps of the process, using a combination of descriptive, analytic and reflective methods.

Writing the appendix serves two purposes. The first is to allow me, as the writer of the article, to reflect on the process of writing my thesis. Reflecting on the process is a way of learning, by revisiting the mistakes I made along the way. The second purpose is to facilitate a dialogue between me and the reader. Writing the appendix allows me to describe my personal perception, and in this way allows the reader to correct me.

2. My background

Before starting this project I had limited experience with academic research. Reports from various courses during my education had given me a feel for the requirements. I had some research experience from an assignment in the course IS development in a business context, taught by Professor Nikolaus Obwegeser and Professor Bjarne R. Schlichter. Together with two other students and Professor Nikolaus Obwegeser, I developed our report from the course into a conference paper. The paper was peer reviewed and accepted for the conference we had chosen. Working on this project gave me an impression of the amount of work required to write a good academic paper.

3. Thesis background

From the first time I came across the concept of Big Data, during lectures on Business Intelligence Architecture, the subject made me curious. Is size the only difference between Business Intelligence and Big Data, or is there more to it?
My research into the subject revealed a concept that is mainly driven by the evolution in storage technology and data processing capabilities. From a scientific perspective the concept has not yet matured, but it is attracting much attention from the scientific community. As a result, there are a large number of papers with very diverse approaches to Big Data.

Having read and skimmed a large number of papers, I found that papers about the application of Big Data can be placed into one of three groups. The first group discusses the potential of Big Data and its application in general terms. These papers look at Big Data from the perspective of technology or general business opportunities. The second group is limited to mentioning the application of Big Data within a few famous companies, the likes of Google, eBay and Facebook. The third group describes how a single organization has applied and implemented a Big Data solution for a specific purpose. These papers go into detail about the business opportunities for that one organization and often discuss the technological issues of the actual Big Data implementation.

The three groups of papers are all interesting and relevant, each in its own way. But in most cases the lessons learned and the value created are difficult to transfer to a Danish context. The large majority of Danish companies are small and medium sized. The potential of Big Data is just as relevant to this kind of company as it is to large international companies. But there is little focus on this kind of company, and I decided to start looking at this.

I did not want my thesis to be about visions, technical issues, famous organizations or a few individual organizations. The ambition of the thesis is to create an overview of the actual application taking place in many organizations, the only limit being that the organization has to operate within Danish industry. Having a background as a practitioner, I wanted my thesis to be more than a theoretical report. I wanted to contribute to the development of Big Data within Danish industry by showing how small and medium sized companies can use Big Data to create value.

4. Research approach - confessional accounts

The process of writing a thesis involves a number of discrete activities. The main activities of writing my thesis were preparing and writing the problem statement, conducting a literature review, constructing a survey, analyzing the results of the survey and finally writing the thesis based on the results of the previous activities. Before starting each activity, I would make a plan, which would include a detailed list of tasks, a time schedule, an overview of dependencies and a rough idea of the expected output.

In this appendix I give my confessional accounts of each of the major activities involved in writing the thesis. I describe, in as much detail as is possible and relevant, the way each activity took place, the lessons learned from the mistakes I made and the frustrations I encountered along the way. Most of the activities in this process I did for the first time. That means making mistakes, spending more time on an activity than one would imagine, and needing to reflect constantly on the process. It also means struggling to keep the big picture in view when most of the time you are focusing on getting the details right.

5. The problem statement

The first and potentially most important activity is to write a problem statement.
I started off wanting to cover all kinds of aspects of Big Data. Due to my practitioner background, I wanted to make a contribution that would be perceived as relevant by practitioners. I gradually realized that I had to find a relatively narrow field of research. By focusing on the application of Big Data and on small and medium sized companies, I found something that is relevant to academics as well as to practitioners. The problem statement has helped me many times when I was moving my efforts in the wrong direction. The problem statement is your best friend when you are conducting a project over an extended period.

6. The literature review

I had never before tried to conduct a literature review, which means that I needed all the help I could find. On the recommendation of my supervisor I used the eight-step guide to conducting a structured literature review (Okoli & Schabram, 2010) and the guideline from Webster and Watson. The eight-step guide is a very structured approach with clearly defined steps, which allowed me to focus on just one step at a time.

6.1. Finding relevant papers

The first step was to find the most relevant papers, selected from the Scopus and Web of Science databases. The first and maybe most important part of finding relevant papers is choosing the words to use for the search in the databases. The search words have to reflect the subject that you want to find without excluding relevant papers. Selecting the words was done in close cooperation with my counselor, which gave at least two different perspectives on the search words. Different perspectives in the search words support diversity in the types of papers selected.

The selected papers completely determine the view you will get of the subject you are researching. In this way the literature review can end up confirming your personal conception of a subject instead of expanding it. There is no way of making sure that you have found the papers that are the best and most relevant for your research. The closest you can come to assurance is to do the forward search and the backward search.

6.2. Reading and skimming the selected papers

Reading and skimming a lot of papers is not the difficult part. The difficult part is staying consistent in the evaluation of papers all the way through the literature review. This becomes very difficult when you find yourself looking for reasons to keep a paper that in fact should not be kept. I often found myself going back to papers that I found well written and of high scientific value, in order to compare. Sometimes you think that a paper is good, or rather you want the paper to be good, but when you compare it to a paper that is actually good, it is easy to see the difference.

Being consistent in your choice of papers requires a strong but flexible definition of the relevant papers. Strong, because the number of papers has to be cut down to a manageable number. Flexible, because papers that add nuances to the research should be allowed to be selected. Using a framework is a strong way of supporting the literature review. I applied the DELTTA model with its six specific parts, which were a big help in selecting the papers.

6.3. The selected papers - a walkthrough

Reading all the selected papers creates some kind of overview of the literature on Big Data. The papers have very different approaches to the subject. They can be divided into four groups.

1 - The scientific paper, with a solid scientific grounding in terms of data collection, literature review and structure of the paper. The paper has been peer reviewed and published in a good journal, aimed at the scientific community. Being a scientific paper, the contents have been tested and the results can be used to generalize about the given subject.

2 - The conference paper, typically reviewed by other conference participants, published in the conference proceedings, and aimed at the audience of the specific conference. The conference paper often has value, especially in fields of science that are evolving. Conference papers can be a good way to learn about the most recent findings, which have not yet matured into scientifically solid knowledge.

3 - The practitioner paper, typically written by authors with domain knowledge and experience, often published in a journal with a relatively specific audience within a given field of business, and without any scientific grounding in terms of literature review and data collection. This kind of paper generally has a very narrow and shallow focus, based on very limited research, if any at all. In extreme cases these papers seem more like advertisements, often referring to specific brand names.

4 - The literature review paper. This kind of paper repeats and rephrases knowledge taken from other papers, without any reflection on the contents. This kind of paper can be good for creating some kind of overview of the subject, but other than that there is little in the way of sound scientific learning to be found.

The four types of papers each create value in their own way, and they are aimed at different types of audience. When doing a literature review for a paper that is meant to be published, the distinction between the different types of paper is important.

6.4. Observations about the papers

These observations relate to step 4 (practical screen: is the article applicable?) of the eight-step guide to conducting a literature review. When doing a literature review, the focus of the reviewer is naturally on the subject of the review. This means that the reviewer perceives every paper from the perspective of that subject. The papers the reviewer reads have been through a careful screening. The reviewer will therefore assume that each paper is relevant and can contribute new learning to the subject of the literature review. But sometimes this is not the case. Even though the paper has been selected through carefully chosen search words and the abstract indicates that the paper is relevant, the paper could still take a perspective that does not fit the subject of the review.

When performing a literature review on the subject of Big Data, a paper with the title "The current state of Big Data within Academia" seems relevant. Having read the abstract, the reviewer thinks that the paper is focused on how Big Data is being handled by the world of academics. This seems to be a fair assumption. However, the paper focuses on how the subject of Big Data is missing from the current curriculum and how the curriculum should be updated to reflect the need for knowledge about Big Data. In relation to the literature review, the paper is not relevant. As a reviewer it is important to accept that some papers are just not relevant and therefore should not be included in the final literature review.
In general I am very surprised that my selection criteria gave me a little more than 600 papers, and that after the practical screen and the quality screen I ended up with fewer than 40 papers. This to me is an indication that I might have done something wrong in the choice of search words, or maybe it is because of the large diversity in the papers.

6.5. The steps five to eight of the structured literature review

Step five is the quality appraisal. As I have mentioned earlier, there are varying levels of quality in the papers that were extracted from the databases. Or rather, different types of paper have different audiences and therefore they are different. The quality of a paper must therefore be seen in the context of the problem statement and of the framework. Keeping a constant level of quality throughout the literature review is difficult because the papers are very different in their approach to the subject. Sometimes I would decide to keep a paper because it seemed to add insights, but on a second read its empirical data or theoretical basis turned out to be too weak. Next time I do a literature review I will start by finding a paper of high quality. When reading all the other papers I will use the high quality paper as a kind of benchmark, especially in situations where I am not sure about the quality of a paper.

Step six is the data extraction from the papers. The data extraction is really difficult for two reasons. Firstly, in my case I was looking for data that would fit with one or more parts of my framework. The problem is that even though each part is defined, the data you extract will not always fit in relation to your problem statement. Secondly, when you take data from a paper you are taking statements out of a context in which they make sense. When you look at the statements later, they are not always able to stand on their own. They must be placed in the right context in order to make sense. I got better at selecting the right data, but in the beginning it was very difficult. Looking at the contents of the literature review, I am not sure how well I did.

Step seven is the synthesis of studies. This step did not go very well. It was difficult for me to find the overall subjects of the data collected, and it ended up being messy.

Step eight is writing the review. This part was hard and I worry the result is slightly fragmented. This is due to poor judgments in the data extraction and a relatively poor synthesis of studies. Next time I will spend more time on the previous steps of the literature review.

7. Building a good survey

As was the case with the literature review, I had very limited experience in building a survey from scratch. But I knew that knowing what you want to achieve and what you want to learn from the survey is essential. When you know what you want from the survey, you can start to make decisions about which questions are relevant and which are not. This, however, does not ensure that you include all the relevant questions. I soon realized that I had to start with the end result. I had to define the kind of analyses that I wanted to conduct on the collected data. The worst situation to be in after having received the responses to a survey is to realize that you have not asked a relevant question in the survey.
Once the survey has been distributed, it is too late to make changes or additions to the survey. The problem statement and the research question were good guidelines for the overall design of the survey, but not detailed enough to be helpful in deciding the actual questions. Instead I had to find a combination of questions and statistical tools that would help me answer the research question. Having my counselor give feedback was essential in this process.

7.1. The questions of the survey

The basic idea of the research is to look at a broad range of small and medium sized companies. That means a broad range in the age of the companies, in the number of employees, in the types of business, and in the data usage and experience with Big Data. The broad range of companies directly supports the external validity of my research. Asking respondents about the age of the company, the number of employees and the type of business was relatively straightforward and uncomplicated.

I wanted to draw a profile of each company seen from a data application point of view. My assumption was that most companies working with Big Data would have a certain level of maturity in applying data to their internal processes, and a company culture that trusts data and is used to working with it. In order to draw a data profile of a company, I needed to ask some questions. This is a data profile as opposed to a Big Data profile. I was very much aware of the difference.

7.2. Questions about the data application

I decided to start by asking about the type of data the companies were using. The type of data was defined along three dimensions.

The first dimension was about structure. Big Data are, among other characteristics, unstructured. Being unstructured in this context means the data cannot be stored or handled by traditional databases. Traditional databases are typically made of columns and rows, with a well defined relationship between the different parts of the database. Unstructured data, like pictures, sound and scans, do not fit into this type of database. The degree to which the company was using structured or unstructured data would be an indication of the Big Data application within the company and an important part of the data profile I wanted to draw.

The second dimension was related to the degree to which data were being internally or externally generated. A company that only applies internally generated data is probably relatively immature in the application of Big Data. This is because part of the idea of Big Data is to apply data from a variety of sources, including externally generated data. Externally generated data would include data bought from private or public companies, but also data from sensors, social media and other sources where the data are being generated by people outside the company. The degree to which a company is applying externally generated data is another way of drawing the data profile.

The third dimension was about the total size of data. Potentially the most interesting characteristic of data in relation to Big Data is the size measured in bytes (gigabytes, terabytes, petabytes etc.). The size of data would again add to the data profile of a company. Companies working in the large terabyte range or higher are most likely working with Big Data in some way.
I discussed the three dimensions for the data profile with my supervisor, and after I had explained what I wanted and how I was going to explain it to the respondents, he agreed to let me ask about structured/unstructured data and about externally/internally generated data. The third dimension, about the size of data, was not accepted. My supervisor did not believe that a respondent would know the answer to this question, and we would risk low validity on this question.

This is an important lesson in the construction of a survey. The survey was going out to a very broad range of companies, and we did not know the type of employee that would answer it. When asking questions through a survey, you should only ask questions to which you are likely to get a valid answer. Whether you are likely to receive a valid answer depends on the question and on the respondent. I still believe that it would have been possible to get a valid answer from most companies, but then I am slightly biased in the sense that I have worked within IT before and therefore have a good understanding of the data side of companies.

The purpose of the data profile for each company was to help me secure the external validity of the data collected through the survey. My main worry in relation to the survey was that many respondents would either not respond to the questions about Big Data or would give me invalid answers. Both scenarios would create problems for me. By having a general data profile along two dimensions, I could validate this against the responses to the Big Data questions.

In the second part of the survey I would ask about the ways in which data is currently being applied. The questions were divided into two groups. The first group of questions would ask about the degree to which data was being applied for a number of generic internal processes. The second group of questions was more specific and would ask how the application of data was creating value. Examples of ways of creating value are improved customer satisfaction and better management decisions. The two groups of questions were both meant to add details to the data profile of each company. With the two sets of questions I had a fairly detailed level of information about each company.

As it turned out, the first two sets of questions were relatively easy to design. Looking back, it is easy to think that I should have asked the questions in a different way than I did. This is part of the learning process, and making mistakes is a good way of learning. As I found out later, the really hard part of the survey was the questions about Big Data.

7.3. Questions about Big Data

The questions about Big Data are the most important part of the survey. The questions had to be structured around the DELTTA model. The idea was to create two questions for each of the six parts of the DELTTA model. The first question was about the application of Big Data and the second question was about the value creation from the application of Big Data.

In the introduction to the survey I had given a definition of Big Data. The DELTTA model looks at the different parts of Big Data, and is therefore very specific. That meant the questions had to be specific about each of the six parts of the DELTTA model, but still be open and general enough to receive a valid answer independently of the context in which the survey was being answered. I could not assume any knowledge of Big Data and therefore had to give the respondents the opportunity to answer with some kind of response that says "I don't know", without tempting too many respondents to choose this option. I wanted to get as many responses as possible, but first of all I needed responses that were valid. Instead of asking questions, I decided to make statements.
In this way I thought that I could get a valid response both from respondents with a good understanding of Big Data and from respondents with less understanding of Big Data.

Because I wanted to ask questions about each of the six parts of the DELTTA model, the questions had to be very specific to each part of the model. In the introduction I would explain that I wanted to ask about the application of Big Data and the value creation from the application of Big Data, but I did not want to explain the DELTTA model to the respondents. I wanted the respondent to perceive each of the 12 questions as a clearly defined question without too much relation to the other questions. The risk was that the respondents would perceive the questions as very similar and would therefore answer all the questions with the same response. This would seriously damage the validity of the answers.

7.4. Distribution of the survey and summary

The actual distribution of the survey went relatively smoothly. I received around 400 notifications about messages that could not be delivered due to invalid email addresses. This is the risk of using data from the CVR register. I received a total of 457 responses to the survey. I had a goal of achieving a 10 % response rate, which would have been around 350 responses and which seems to be a reasonable goal for this kind of survey.

I believe the good response is due to a good introduction and clear instructions on how to answer the survey. As somebody who is asking a lot of people to spend 10-20 minutes on my survey, I believe it is important that I show the respondents that I appreciate their contribution and that I make it as easy as possible to answer the survey. I also believe it is important for the respondent to understand the context in which the responses will be used. This helps to support the validity of the responses, and to encourage the respondents to finish the survey.

8. Analysis of the results

The analysis of the data from the survey turned out to be more complicated than I had expected. Even though I had designed all the questions myself, I had to rethink the questions and the responses to figure out what the data mean. The next step was to figure out which types of analysis I wanted to do. As has been recommended about Big Data, I decided to start with the questions.

I quickly realized that the data I had collected about the basics of each company and the general data application were not very useful. I could use the data to ensure that I had a broad distribution in types of companies, based on age, number of employees and types of business. I could also quickly make an analysis that would show that companies that are already working with data had a higher tendency to also be working with Big Data, but that was hardly surprising. I ended up only using the data from the 12 statements about Big Data as a basis for the analyses.

The hard part is to interpret the results and to put the interpretations into words. It is hard to find the balance between making interesting interpretations on the one hand and not reading too much into the numbers on the other. I am very much used to always being able to support any claim with a suitable set of numbers. In this kind of research, which involves people and organizations, I find that the restrictions are not as tight and that you are allowed to interpret more from the numbers than I would normally find reasonable. I really have to convince myself that it is OK to make the interpretations even though the numbers seem too weak to support them.

9. Writing the thesis

I like the process of writing and of structuring your thoughts and ideas before starting to write.
I have written my fair share of reports and a single conference paper, but writing the thesis has been the hardest thing I have done. I knew in advance that the language must be clear and simple in order to convey the message. But in order to be clear in your writing you must be clear in your thoughts and ideas. Any uncertainty will be directly reflected in your writing. The hard part is therefore not finding the good, clear, and readable sentences and phrases. The hard part is getting a clear picture in your head. Once you are clear in your head, the writing is easier. Not easy, but easier. I have tried to write in a clear and simple manner without becoming boring or monotonous. I find that relatively short sentences are easier to control and are good at conveying a message. The hardest part of writing is to make your writing reflect the thoughts you have in your head. If I could not find the right words to express my thoughts, I would leave it and work on another paragraph. This would often help me to form the sentence in my head and subsequently put it on paper.

10. How does the process differ from normal project work?

I will define a project as a piece of work which is somehow unique and has a start and an end. You might choose to add that a project has to come up with some kind of result, but I am not sure that this is a characteristic of a project. Working on a thesis therefore fulfills all the criteria for being called a project. I will define normal as what is common, or what applies to the majority. In this case, that means how most projects are conducted.

I think that when working on a thesis, you are not sure how the project will end in terms of output. Yes, a report will be written, but what are the results of the project? It is not until you finish an activity that you know what the result is and what the consequences of the findings are. Writing a thesis is creating something that does not exist yet. It is the findings that guide you through the process. Up to a point the activities are planned in advance, but the actual content of each activity cannot be planned ahead.

11. What are the lessons learned in terms of research?

I have learned a lot from the individual activities, and not only the specifics of how to conduct a literature review or build a survey. I have learned how important it is to keep your focus and to use your problem statement actively. The largest lesson I have learned from this project is that I need to work together with somebody else on this kind of project. I need to have somebody who also knows all the details and with whom I can talk about each of the activities. I decided to do this project on my own knowing that it would be difficult, but I wanted the challenge. I will be better at doing it next time, but I would prefer to do it together with one or two other people. That would be ideal for me. I am not good at seeing things from different perspectives, and I need more or less constant feedback. That is the main lesson learned from this project.

12. Discussion

12.1. Implications of findings in a larger perspective

I would hope that the research can help more small and medium-sized companies to start applying Big Data in their companies.

12.2. What have I learned that will make me a more competent practitioner?

The first thing is that I have a good understanding of Big Data, which I will be able to use as a practitioner. I have learned how to find and digest large amounts of written reports. I have learned not only to look at the whole but also to look at the parts.
Identifying the parts has become easier, and it will help me when I am faced with complex problems.

13. Summary

The most important parameters of conducting research are curiosity and stamina. The combination of curiosity and stamina is what will bring you through the hard times, when you think the work is without an end. I have had my share of hard times during the research and writing processes, and sometimes stamina was low, but I never lost the curiosity. Apart from the final exam and defense of the thesis, the writing of a thesis is the last part of an education. Even though writing the thesis is the last task of an education, you are doing most of the activities involved for the first time. In my case, conducting a literature review, designing a survey and, in part, doing the analysis were things I faced for the first time while writing my thesis. You might say that in this way the task of writing a thesis is not only the end of an education but also the start of an academic awareness and ability.

Acknowledgements

To my supervisor, Sune D. Müller: Thank you for all your help, encouragement, and patience throughout the process of writing this thesis. I have enjoyed cooperating with you on this project, and I have learned a lot from you about academic work.