INPUTLOG 6.0
A research tool for logging and analyzing writing process data

Linguistic analysis
From character-level analyses to word-level analyses

Linguistic analyses
The concept explained
Flow of the linguistic analyses:
  Aggregate letter to word level
  Parsing the S-notation
  Enriching process data with linguistic information

marielle.leijten@uantwerpen.be
Aggregate letter to word level
  Extract words, word groups and sentences
  Tokenize sentences

Part of speech tagging and chunking 1
  There is a man sleeping in an easy chair.
  POS tags: EX V DT NN V IN DT JJ NN

Part of speech tagging and chunking 2
  There is a man sleaping in an easy chair.
  POS tags: EX V DT NN V IN DT JJ NN
  Chunk tags in IOB notation (O, B-NP, I-NP, ...)

Enrichment with process data 1: Before Word Pause -1, -2
  Thre<<ere is a_man sleapp<<ping in an easy chair.
  Values shown for "man": 140 (-2), 593 (-1)
  The first pause before a word (-1)
  The second pause before a word (-2)
  The second pause before a word is at the same time the AfterWordPause+1 of the preceding word.
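The chunking step can be illustrated with a minimal Python sketch that groups tagged tokens into chunks. This is not Inputlog's implementation. The function name `group_chunks` is hypothetical, and the chunk types VP and PP in the example tags are assumptions, because the slides only preserve the O, B and NP parts of the tag row.

```python
from typing import List, Tuple


def group_chunks(tokens: List[str], iob_tags: List[str]) -> List[Tuple[str, List[str]]]:
    """Group tokens into (chunk_type, tokens) pairs based on IOB tags.

    "B-X" opens a new chunk of type X, "I-X" continues the current chunk
    of type X, and "O" marks a token outside any chunk.
    """
    if len(tokens) != len(iob_tags):
        raise ValueError("tokens and iob_tags must have the same length")

    chunks: List[Tuple[str, List[str]]] = []
    for token, tag in zip(tokens, iob_tags):
        if tag == "O":
            chunks.append(("O", [token]))
            continue
        prefix, _, label = tag.partition("-")
        if prefix == "I" and chunks and chunks[-1][0] == label:
            # Continue the chunk that is currently open.
            chunks[-1][1].append(token)
        else:
            # "B-X" (or a stray "I-X") starts a new chunk.
            chunks.append((label, [token]))
    return chunks


tokens = ["There", "is", "a", "man", "sleeping", "in", "an", "easy", "chair"]
# Assumed tag row: only O, B and NP survive on the slide; VP and PP are guesses.
tags = ["O", "B-VP", "B-NP", "I-NP", "B-VP", "B-PP", "B-NP", "I-NP", "I-NP"]

chunks = group_chunks(tokens, tags)
for chunk_type, words in chunks:
    print(chunk_type, " ".join(words))
```

Once tokens are grouped this way, pause measures can be attached to the first word of each chunk (B) or to words inside a chunk (I).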
Example sentence (final text): There is a man sleaping in an easy chair.

Enrichment with process data 2: Word production
  Thre<<ere is a man sleapp<<ping in an easy chair.
  Values shown on the slide: 7207 and 546
  Production time of word = [EndTime of last character of word - StartTime of first character of word]
  Man = 24976 - 24430 = 546

Enrichment with process data 3: Within Word Pause
  Thre<<ere is a man sleapp<<ping in an easy chair.
  Values shown on the slide: 499 and 7145
  The sum of the pauses within a word = [WithinWordPause 1 + WithinWordPause 2 + ... + WithinWordPause N]
  Man = 125 + 374 = 499

Enrichment with process data 4: After Word Pause +1
  Thre<<ere is a_man_sleapp<<ping in an easy ch...
  Values shown: 140 (+1, after "a"), 234 (+1, after "man")
  The first pause after a word (+1)
  The AfterWordPause+1 of "a" is the BeforeWordPause-2 of "man".

Read more
  Macken, L., Hoste, V., Leijten, M., & Van Waes, L. (2012). From keystrokes to annotated process data: Enriching the output of Inputlog with linguistic information. Paper presented at the Eighth International Conference on Language Resources and Evaluation (LREC'12), Istanbul, Turkey.
  Leijten, M., Macken, L., Hoste, V., Van Horenbeeck, E., & Van Waes, L. (2012). From Character to Word Level: Enabling the Linguistic Analyses of Inputlog Process Data. Paper presented at the European Association for Computational Linguistics, EACL Computational Linguistics and Writing (CL&W 2012): Linguistic and Cognitive Aspects of Document Creation and Document Engineering, Avignon.
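The word-level measures on these slides can be reproduced with a short Python sketch. It is an illustration under stated assumptions, not Inputlog's code: the names `WordRecord`, `production_time`, `within_word_pause` and `after_word_pause_plus1` are hypothetical, and the time unit is whatever the slide values use (presumably milliseconds). The numbers for "man" are taken from the slides.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class WordRecord:
    """Hypothetical per-word record; field names are illustrative only."""
    text: str
    before_pause_2: Optional[int]  # second pause before the word (-2)
    before_pause_1: Optional[int]  # first pause before the word (-1)


def production_time(start_first_char: int, end_last_char: int) -> int:
    """EndTime of the last character minus StartTime of the first character."""
    return end_last_char - start_first_char


def within_word_pause(pauses: List[int]) -> int:
    """Sum of all pauses that fall inside one word."""
    return sum(pauses)


def after_word_pause_plus1(words: List[WordRecord], index: int) -> Optional[int]:
    """AfterWordPause+1 of a word is the BeforeWordPause-2 of the next word."""
    if index + 1 >= len(words):
        return None  # the last word has no following word
    return words[index + 1].before_pause_2


# Values for "man" as shown on the slides.
man_production = production_time(24430, 24976)
man_within = within_word_pause([125, 374])

# The pauses before "a" are not given on the slides, so they stay None.
words = [
    WordRecord("a", None, None),
    WordRecord("man", 140, 593),
]
after_a = after_word_pause_plus1(words, 0)

print(man_production, man_within, after_a)
```

Storing only the before-word pauses and deriving the after-word pause from the next word avoids keeping the same interval twice.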
Alzheimer's disease

Goal of the research project
  Test the complementary diagnostic power of a new tool that assesses the cognitive and linguistic aspects characterizing the process of written language production in Alzheimer's disease (AD).
  Focus on motor, cognitive, and linguistic aspects.

Participants
  Three main groups:
    Patients with mild dementia due to AD
    Patients with mild cognitive impairment (MCI) due to AD
    A group of cognitively healthy participants (65 years and older)

Tasks
  Copy task: assess person (motor) characteristics
  Expository task
  Two figurative elicitation tasks
Ultimate goal
  It is our ultimate goal to:
  1. describe and test differences between the three participant groups on the basis of a selection of writing process variables (inter- and intrapersonal characteristics)
  2. test the diagnostic accuracy with a selection of writing process variables (for discriminating AD from healthy elderly)
  3. ...

General pause results

Pause analysis: between words

Pauses before words

Pauses before word categories
  Verbs, nouns, adjectives
  Pauses related to revisions are excluded
Pauses before word categories: HE
  Healthy elderly

Pauses before word categories: CI
  Cognitively impaired elderly

Pauses before chunks
  Verb phrase, noun phrase, prepositional phrase

Pauses beginning chunks (B)
  Verb phrase, noun phrase, prepositional phrase
  Extremely large pauses and pauses related to revisions are excluded
Pauses within chunks (I)
  Verb phrase, noun phrase,

Inputlog 6.0
A research tool for logging and analyzing writing process data

Source analysis

The flow: in sum

Source analyses (full)
  Iterative cycles from:
    original idfx ~ analyses
    filtered idfx ~ analyses
    recoded idfx ~ analyses
Source analyses (grouped)

Information seeking in professional writing: Twitter and e-mail communication
  Contemporary writing & Theory
  Pilot study
  Experiment
  Discussion
Search process
  Long-term memory: task schemas, topic knowledge, audience knowledge, linguistic knowledge, genre knowledge
  External digital sources: task schemas, topic knowledge, audience knowledge, linguistic knowledge, genre knowledge
  Source: Leijten, M., Van Waes, L., Schriver, K., & Hayes, J. R. (2014). Writing in the workplace: Constructing documents using multiple digital sources. Journal of Writing Research, 5(3), 285-336. (www.jowr.org)

Pilot study
  Assumption: search style may be a predictor of the level of expertise (novice versus expert in digital communication)
  Participants: novice writers (5), professional writers (5)
  Tasks: write a tweet (max. 140 characters), write an e-mail (no indication of length)
  Duration: max. 10 minutes and max. 30 minutes

Two writing tasks: Twitter & e-mail
  Twitter is a social networking and microblogging service that enables its users to send and read messages called tweets. Tweets are text-based posts of up to 140 characters (often based on multiple digital sources).
  E-mail is a method of exchanging digital messages from an author to one or more recipients. E-mails consist of three main parts (message envelope, header and body text). E-mails can be as long as necessary.
Observation
  Observation via:
    Inputlog 5*
    Tobii T60 Eyetracker
    Retrospective interviews

Writing environments (templates via Inputlog)

Tweets
  Novice writer: To all communication science students: interesting conference on internal and organisational communication on April 17 April in Bussum
  Professional: More conversation in the organisation >> interesting conference on internal communication: www.corner stone.nl/

E-mail: novice
E-mail: professional

Pilot study
  Source: Leijten, M., & Van Waes, L. (2013). Keystroke logging in writing research: Using Inputlog to analyze and visualize writing processes. Written Communication, 30(3), 358-392. doi: 10.1177/0741088313491692

Experiment method
  Participants: novice writers (20), professional writers (20)
  Tasks: write a tweet (max. 140 characters), write an e-mail (no indication of length)
  Duration: max. 10 minutes and max. 30 minutes
  Analysis: process measures, product measures, combined measures

Experiment procedure
  1. Typing test
  2. (Reading test)
  3. Writing task 1: Tweet about creative session
  4. Writing task 2: Invite colleagues to creative session via e-mail
  5. Stimulated retrospective interview
Materials

Observation
  Layer 1: Inputlog 5*
  Layer 2: Tobii TX300 Eyetracker (AnHuLab Antwerp)
  Layer 3: Retrospective interviews

Results: mean number of tweets
  Novices (N=20):        10 tweets (st.dev. 21)
  Professionals (N=20):  3115 tweets (st.dev. 3651)

Results: mean age
  Novices:        22 (st.dev. 1,9)
  Professionals:  32 (st.dev. 9,1)
Results: process measures
  [Figure: total characters produced and total characters in final text, for the tweet and the e-mail]

Results: product measures, tweet

Results: product/process ratio, tweet

Results: product measures, e-mail
  [Figure: bar chart for novice and professional writers, total characters produced versus total characters in final text; y-axis 0 to 1600; data labels 1513, 1473, 1177 and 1074]
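As a rough illustration of the product/process ratio, the following Python sketch assumes that the ratio is the number of characters in the final text divided by the number of characters produced during the process. That definition is not spelled out on the slides. The function name and the input numbers are hypothetical placeholders, since the pairing of the chart values with the writer groups cannot be recovered from the extracted figure.

```python
def product_process_ratio(final_chars: int, produced_chars: int) -> float:
    """Share of the produced characters that survive in the final text.

    Assumed definition: characters in the final text divided by the
    characters produced during the writing process.
    """
    if produced_chars <= 0:
        raise ValueError("produced_chars must be positive")
    return final_chars / produced_chars


# Placeholder values, not study data.
ratio = product_process_ratio(final_chars=150, produced_chars=200)
print(ratio)
```

Under this definition a ratio close to 1 means that little text was deleted or rewritten, while a lower ratio points to more revision during the process.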
Results: product/process ratio, e-mail

Results: relative time in sources
  Relative time spent in other sources is equal for the writer groups.
  Relative time spent in other sources is larger in the Twitter task than in the e-mail task.
  [Figure: bar chart, tweet and e-mail for novice and professional writers; y-axis 0 to 100; data labels 76 and 75]

Results: other
  Other non-discriminating variables are:
    Mean number of P-bursts
    Mean duration of P-bursts
    Mean number of S-bursts
    Mean duration of S-bursts
    Number of sources used
    Duration spent in various sources
    Transitions absolute
    Transitions per minute
    ...

Results: quality, tweet
  Novices:        0,73 (st.dev. 0,59)
  Professionals:  1,78 (st.dev. 0,44)
  The tweets of the professionals follow the conventions more closely than the tweets of the novices.
  Example tweets:
    Broaden your choices via creative thinking and become a trendsetter! Introductory session: Thursday May 2 @ Bloso, Hazewinkel, Willebroek
    Trendsetter i.o. trend follower? Learn to think out-of-the box. Register now! #tip #TotalBrainBoxMethod www.hetvarken.wor
Results: quality, e-mail
                                   Novices   Professionals
  Structure of content (max. 4)    3,1       3,3
  Reader orientation (max. 8)      4,0       5,4 *
  Attention for reader (max. 4)    2,0       3,0 *

  The e-mails of the professionals are structured similarly to the e-mails of the novices.
  The e-mails of the professionals are more reader-oriented than the e-mails of the novices.
  The professionals pay more attention to the reader than the novices.

Discussion
  Diversity within writer groups
  Type of task: internal communication
  Type of writer: two distinct profiles (long process/text ~ short)
  Definition of indicators of cognitive processes
  Indicators at general process level versus within-process variability

Thank you
  Nikki Van De Keere, Alexander Kupers, Tinne Moens, Eline Mortelmans, Caroline Van Gils, Elke Eriksson & Sofie Vanwynsberghe (students Master in Multilingual Professional Communication)
  Eric Van Horenbeeck (technical coordinator Inputlog)

Literature
  Download presentation via ResearchGate or Academia.edu (@marielle leijten)
  Related articles on writing process research:
  Leijten, M., Van Waes, L., Schriver, K., & Hayes, J. R. (2014). Writing in the workplace: Constructing documents using multiple digital sources. Journal of Writing Research, 5(3), 285-336.
  Leijten, M., & Van Waes, L. (2013). Keystroke logging in writing research: Using Inputlog to analyze and visualize writing processes. Written Communication, 30(3), 358-392. doi: 10.1177/0741088313491692
Data mining
  [Figure: graph with labels and values as shown: start 47, end 47, tweet 256, vwec 236, IE-other 49, IE-search 44, Google 13, Google vwec 49]

Translation
  Data from study by Isabelle Robert (2014)
  Translator A, Translator B
Pajek
  [Figure: graphs for translator A and translator B; node labels: resources, target text, source text, bilingual dict., monol. dict., internet, antidote]
  [Figure: bar charts "Average duration of episode (s)" and "Number of episodes", categories Overall and Word, for A and B]
  Leijten, M., Van Waes, L., Schriver, K., & Hayes, J.R. (2014

More information (@uantwerpen.be)
  Research Foundation Flanders
  www.inputlog.net