What is Web log analysis?



Similar documents
Schools of Psychology

Web Analytics Definitions Approved August 16, 2007

Webtrends for SharePoint 2010 A Microsoft Preferred Analytics Solution for SharePoint

Tracking True & False Demystifying Recruitment Marketing Analytics

Introduction to Big Data! with Apache Spark" UC#BERKELEY#

Above the fold: It refers to the section of a web page that is visible to a visitor without the need to scroll down.

About Google Analytics

Behaviorism: Laws of the Observable

SEACW DELIVERABLE D.1.6

Opinion 04/2012 on Cookie Consent Exemption

Enhance Preprocessing Technique Distinct User Identification using Web Log Usage data

Weiling Liu University of Louisville

Impressive Analytics

Bibliometrics and Transaction Log Analysis. Bibliometrics Citation Analysis Transaction Log Analysis

Introduction. Chapter 1 Why Understanding Your Web Traffic Is Important to Your Business 3

HOW TO USE GOOGLE ANALYTICS. (for beginners) universal analytics. Courtney Petty, of DKS Systems adapted from our previous beginner s guide.

ABA. History of ABA. Interventions 8/24/2011. Late 1800 s and Early 1900 s. Mentalistic Approachs

U.S. Army Research, Development and Engineering Command. Cyber Security CRA Overview

LEARNING SOLUTIONS website milner.com/learning phone

Digital media glossary

Bandra (W) : Kemps Corner :

Unit 21: Hosting and managing websites (LEVEL 3)

Getting the most from your Google Analytics

Title/Description/Keywords & Various Other Meta Tags Development

Designing with Data: Using Analytics to Improve Web and Mobile Applications. Jussi Ahola

Top Practices To Boost Your Law Firm Leads

Digital marketing services

IC L05: Security.cloud Configuring DLP on to your flow & Applying security to your Office 365 or Google Apps deployment Hands-On Lab

Getting A Google Account

1. Layout and Navigation

Lecture 7: Social Media Sharing, Google Analytics, Keyword Research & Meta-Tag Optimization, Facebook Fan Pages

ANALYSING SERVER LOG FILE USING WEB LOG EXPERT IN WEB DATA MINING

This tutorial is designed for SEO professionals as well as beginners who would like to learn the basics of Web Analytics and its techniques.

Findability Consulting Services

Administrator's Guide

Tracking Success: Omni-Channel! Jimmy Bogroff! Director Digital Strategy - Oregonian Media Group!

Machine Learning and Predictive Analytics Foster Growth [1]

Digital marketing strategy

Today. Learning. Learning. What is Learning? The Biological Basis. Hebbian Learning in Neurons

GOOGLE ANALYTICS 101

Google Analytics Health Check Laying the foundations for successful analytics and optimisation

Evaluating the impact of research online with Google Analytics

STATE OF B2B MARKETING METRICS AND ANALYTICS 2015

Create Beautiful Reports with AWR Cloud and Prove the Value of Your SEO Efforts

White paper: Google Analytics 12 steps to advanced setup for developers

Using Application Insights to Monitor your Applications

Web Analytics Big Three Definitions Version 1.0

Services & Pricing. P.O. Box 10734, Eugene OR Office: sales@oregonpublishing.com CONTENTS SEO LOCAL PLAN 2

What s New in Analytics: Fall 2015

What to do Post Google Panda

Google Analytics. Web Skills Programme

Draft Response for delivering DITA.xml.org DITAweb. Written by Mark Poston, Senior Technical Consultant, Mekon Ltd.

Log analyzer programs for distance education systems

Attract traffic to your website. Convert traffic into leads. Convert leads into customers

Advanced Preprocessing using Distinct User Identification in web log usage data

Traffic Behavior Analysis with Poisson Sampling on High-speed Network 1

Zubi Advertising Privacy Policy

Pella Hosting and Website Development solutions@pellahosting.com 809 West 8 th Street Pella, IA Website Planning Worksheet

XactView User Guide. Schmooze Com Inc.

Enriched Links: A Framework For Improving Web Navigation Using Pop-Up Views

LIDL PRIVACY POLICY. Effective Date: June 11, 2015

Google Analytics Guide. for BUSINESS OWNERS. By David Weichel & Chris Pezzoli. Presented By

Ensighten Data Layer (EDL) The Missing Link in Data Management

Search Engine Optimization Glossary

KeyMetric Campaign Analytics FAQ

What s New in Analytics: Fall 2015

DIGITAL MARKETING E-LEARNING

Thinking About Psychology: The Science of Mind and. Charles T. Blair-Broeker Randal M. Ernst

Remote Desktop Web Access. Using Remote Desktop Web Access

Using the CCNY Server Space with Secure Shell 3.0 for Windows Created by Doris Grasserbauer

Concepts. Help Documentation

SKoolAide Privacy Policy

Getting Started with the new VWO

A View on Behaviorist Learning Theory. view of behaviorism assumes that all behavior is determined via the environment or how one has

HOW DOES GOOGLE ANALYTICS HELP ME?

Nonprofit Technology Collaboration. Web Analytics

Become a Hosting Reseller For Free

Program Visualization for Programming Education Case of Jeliot 3

Analyzing the Different Attributes of Web Log Files To Have An Effective Web Mining

Is a Single-Bladed Knife Enough to Dissect Human Cognition? Commentary on Griffiths et al.

Digital Marketing Training Institute

10 Essential Google Analytics Reports And How They Matter to B2B Executives

Transcription:

Outline What is Web log analysis? Jim Jansen College of Information Sciences and Technology The Pennsylvania State University jjansen@acm.org Let s make this a discussion! Definition Examples Theory and Essential Construct Data Collection Method Discussion Web log analysis is part of the domain of... Web analytics The Web Analytics Association (WAA) defines Web analytics as the measurement, collection, analysis, and reporting of Internet data for the purposes of understanding and optimizing Web usage (http://www.webanalyticsassociation.org/) Shares common theoretical and methodology characteristics with all forms of log analysis (e.g., Intranet logs, systems logs, OPAC logs, search logs, etc.) W3C Extended Log Format W3C Extended Log Format - Variety of fields for examining visitors to Web sites. Other common format is NCSA Separate Log that is composed of three logs ( Common log actions on the server, Referral log where they came from, and Agent log stuff about the client computer)

Variety of tools help make sense of this log data Other Log Examples Search Logs have some common fields, such as time, queries, results, etc. We can enrich the log with additional fields. Twitter log Keyword advertising logs provides calculated metrics Tweets with author in XML

Theoretical Foundations Part of the behaviorism paradigm Behaviorism an approach focused on the outward behavioral aspects of thought and emphases the observed behaviors Behaviorism Pavlov, Watson, & Skinner Ivan Petrovich Pavlov John B. Watson Burrhus Frederic Skinner Behaviorism Characteristics Inductive, data-driven and characterized by empirical observation of measurable behavior Grounded on somebody doing something in a situation (all the environmental and situational features are embedded behaviors) Criticsof behaviorism as a psychological theory have issues with rejection of mental processes. I agree - people are more than mediators between behavior and the environment (Skinner, 1993, p 428) What is a Behavior? an observable activity of a person, animal, team, organization, or system. One can classify behaviors into three general categories. Behaviors are something that one can detect and record actions or specific goal-driven events with some purpose other than the specific action that is observable reactive responses to environmental stimuli What is a Behavior? Behavior is the essential construct of the behaviorism and of log research Logs record behaviors of users and systems (records behavior but can t tell affective, cognitive, or situational aspects) A behavior is the key variable (i.e., an entity representing a set of events where each event may have a different value)

Ethograms a taxonomy or index of behavioral patterns details the different forms of behavior that an user exhibits categories of behavior are objective, discrete, not overlapping. This makes the definitions of each behavior (and category of behaviors) clear, detailed and distinguishable from each other View results With Scrolling Without Scrolling but No Results in Window Selection Click URL (in results listing) Next in Set of Results List Previous in Set of Results List GoTo in Set of Results List View document With Scrolling Without Scrolling Execute Execute Query Behavior Example of an Ethogram Interaction in which the user viewed or scrolled one or more pages from the results listing. If a results page was present and the user did not scroll, we counted this as a View Results Page. User scrolled the results page. User did not scroll the results page. User was looking for results, but there were no results in the listing. Interaction in which the user makes a selection in the results listing. Interaction in which the user clicked on a URL of one of the results in the results page. User moved to the Next results page. User moved to the Previous results page. User selected a specific results page. Interaction in which the user viewed or scrolled a particular document in the results listings. User scrolled the document. User did not scroll the document. Description Behavior Description of the behavior What about the data collection method? Interaction in which the user initiated an action in the interface. Interaction in which the user entered, modified, or submitted a query Data Collection: Trace Data can view the data collected in log files as trace data. people conducting the activities of their daily lives many times create things, create marks, Wear on a carpet induce wear, or reduce some existing material. Within the confines of research, these things, marks, and wear become data Trash heap Classically, trace data are the physical remains of people s interaction Trace Data In the past, trace data was often time consuming to gather and process, making such data costly. logging software makes collecting trace data easy and cheap Log data is controlled accretion data, where the researcher or some other entity alters the environment in order to create the accretion data With the user of client apps (such as desktop search bars), the collection of data is nearly unlimited from a technology perspective Computer storage media What is cool about trace data for researchers?

Data Collection Log data has significant advantages as a data collection approach for the study and investigation of behaviors, including: Scale: not a limiting factor as in lab user studies Power: large sample size for inference testing; in fact, so large must account for the size effect Scope: naturalistic; researchers can investigate range of interactions in a multi-variable context Location: can collectin distributed environments Duration: collect log data over an extended period Methodological Foundations Use of logs to collect trace data is an unobtrusive methods (a.k.a., nonreactive or low-constraint). Unobtrusive methods allows data collection without directly interfering into the context and does not require a direct response from participants Customer Behavior (video) Chemistry (surface marking) Methodological Foundations Three justifications for unobtrusive methods: Uncertainty Example: ethnography principle: researchers studies (where interjected the into an researcher environment bird become dogs a part study of participant the system Observer Example: effect: no one difference searches that for porn is made in a to lab an activity study of or Web a person s searching behaviors by being observed Observer Example: bias: is why observers medical overemphasize trials are double blind behavior rather than they single expect blind to find and fail to notice behavior they do not expect Trace data helps in overcoming the Uncertainty principle, Observer effect, and Observer bias in the data collection. Note for data collection but not data analysis Methodological Foundations Inherent characteristics in the method of log data collection; Web analytics has issues to address as a result: Abstraction how does one relate low-level data to higher-level concepts? Selection how does one separate the necessary from unnecessary data? Reduction how does one reduce the complexity and size of the data set? Context how does one interpret the significance of events? Evolution how can one collect data without impacting application deployment or use?

Recap of Web Analytics Research Type of Data Data Collection Key Construct Theoretical Foundation Trace Unobtrusive Behavior Behaviorism User Query Response Click Computer Book: Jansen, B. J., Spink, A., and Taksa, I. (2009) Handbook of Research on Web Log Analysis, Hershey, PA: Idea Group Publishing First chapter on theory of log analysis is free! Lecture: Jansen, B. J. (Forthcoming) Understanding User Web Interactions via Web Analytics. Morgan-Claypool Lecture Series. Gary. Marchionini (Ed). Morgan-Claypool: San Rafael, CA. manuscript about Web Analytics, soup to nuts Thank you! (open for questions and further discussion) Jim Jansen College of Information Sciences and Technology The Pennsylvania State University jjansen@acm.org