Crowdsourcing the transcription of archival data
- Alyson Mathews
- 8 years ago
Transcription
1 INSTITUTE FOR MATHEMATICAL BEHAVIORAL SCIENCES UC IRVINE Crowdsourcing the transcription of archival data Kimberly A. Jameson 1, Sean Tauber 1, Prutha S. Deshpande 2, Stephanie M. Chang 3, and Sergio Gago 3 1 Institute for Mathematical Behavioral Sciences, 2 Cognitive Sciences, and 3 Calit2 University of California, Irvine
2 UCI ColCat Project Collaborators: Prutha Deshpande, Sean Tauber, Stephanie Chang, Sergio Gago, Nathan Benjamin, Yang Jiao, Brian Huynh, Han Ke, Ram Bhakta, Zhimin Xiang, Ian Harris. Funding and Support for the archive project: Calit2 at UCI. University of California Pacific Rim Research Program (K.A. Jameson, PI). National Science Foundation (#SMA , K.A. Jameson, PI). UCI's UROP Program Awards. IRB Approvals HS# and
3 UCI ColCat Project Collaborators: Prutha S. Deshpande (CogSci), Sean Tauber (IMBS), Sergio Gago (Calit2), Stephanie M. Chang (Calit2)
4 Talk Overview Background on an important problem in Cognitive Science. The domain under consideration: Color categorization. Creating a new database using internet-based procedures. Features of the internet-based research problem and solution approaches that may generalize elsewhere. Modeling the problem and developing appropriate analyses. Preliminary results from empirical tests. Summary.
5 Research on how concepts are represented across linguistic groups Individual concept formation and the sharing and transmission of concepts within and across groups. E.g., Kinship terminology
6 Concept formation across language groups E.g., Kinship terminology:
8 In what ways are representations of concepts similar across individuals and language groups? and What are the various ways concepts vary across individuals and language groups?
9 How do the world's languages map the color appearances we all see in our environments?
10 Basic Color Terms (1969), Brent Berlin & Paul Kay. Basic color terms are described as the smallest set of simple words with which the speaker can name any color.
11 Courtesy of Lindsey & Brown (2006). PNAS, 103.
12 Image Credit: Lindsey & Brown (2006). PNAS, 103.
13 Basic Color Terms (1969) (1) Found all languages tested had systems including 11 or fewer basic color words (e.g., English): red, yellow, green, blue, orange, purple, pink, brown, grey, black and white. (Terms such as crimson, blonde and royal blue are not considered to be basic.) (2) Provided a sequence by which languages adopted subsets of the 11 basic color categories.
14 Color concept universals like this were made popular by Berlin & Kay and by several other investigators. Still, there are instances where different societies have evolved different conventions for color naming... IMBS workshop UC Irvine 12/04/2015
15 Image Credit: Lindsey & Brown (2006). PNAS, 103.
16 Berinmo (5 words). Image Credit: Kay & Regier (2007). Cognition. Courtesy of Lindsey & Brown (2006). PNAS.
17 Different numbers of Color Terms: n=3, n=4, n=5, n=6. T. Regier et al., PNAS 104, 2007
21 The World Color Survey 110 languages; 25 speakers. Data collection ended in Digitizing hand-coded data took more than 23 years. A very valuable site of unembellished ASCII data files:
22 World Color Survey Data Uses a Generic Format
23 The existing World Color Survey (WCS) database (2009) Beginning ~2003 the WCS database was made publicly available. Has been very widely cited in the last few years.
24 E.g., Focus selection task: Shown the chart, pinpoint the best example of each root they volunteered while naming.
25 Datafile Example foci.txt: Color chip selected as category best-exemplar (WCS datafiles do not include headers). Columns: Language Number | Speaker Number | Focus Number | Term Abbrv. | Coordinates of focus selection
26 Focus selections in two languages: English, Korean. Deshpande, P.S. (under review). Investigating Color Categorization Behaviors in Korean-English Bilinguals. UCI Undergraduate Research Journal (submitted June, 2015).
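A minimal sketch of reading a headerless file with the five columns listed above. It assumes whitespace-delimited fields and a single-token grid coordinate; the record type, function name, and sample rows are hypothetical and are not taken from the WCS files themselves.

```python
from dataclasses import dataclass
from typing import Iterable, List


@dataclass(frozen=True)
class FocusRecord:
    language: int    # Language Number
    speaker: int     # Speaker Number
    focus: int       # Focus Number (order of the focus response)
    term: str        # Term abbreviation
    coordinate: str  # Coordinates of the focus selection (assumed one token)


def parse_foci(lines: Iterable[str]) -> List[FocusRecord]:
    """Parse headerless rows; assumes exactly five whitespace-separated fields."""
    records = []
    for raw in lines:
        line = raw.strip()
        if not line:
            continue  # skip blank lines
        fields = line.split()
        if len(fields) != 5:
            raise ValueError(f"expected 5 fields, got {len(fields)}: {raw!r}")
        lang, speaker, focus, term, coord = fields
        records.append(FocusRecord(int(lang), int(speaker), int(focus), term, coord))
    return records


# Invented rows, for illustration only.
sample = ["1\t1\t1\tLB\tA0", "1\t1\t2\tLE\tJ0"]
records = parse_foci(sample)
```

Because the files carry no header row, naming the columns in one place like this keeps later analysis code from depending on positional indexes.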
27 See Poster: An Affordance Based Approach to Large Data-Set Navigation (Nathan). The WCS data is awesome, but a platform with a GUI for empirically investigating and analyzing such data would be even better, and a site with rigorous on-board research tools would also be a big plus. We were given a chance to do this. Jameson, K. A., Benjamin, N. A., Chang, S.M., Deshpande, P. S., Gago, S., Harris, I. G., Jiao, Y., and Tauber, S. (2015). Mesoamerican Color Survey Digital Archive. In Encyclopedia of Color Science and Technology (Ronnier Luo, Ed.). Springer: Berlin / Heidelberg. ISBN: (Online). DOI /
28 The Robert E. MacLaury Archive ~23,000 pages of raw color categorization data that includes: 116 dialects from indigenous Mesoamerican societies (261 surveys), and ~130 additional surveys from a variety of languages (across Africa, Asia, the Americas and Europe).
29 R. E. MacLaury's Dissertation: Color in MesoAmerica, Vol. I: A Theory of Composite Categorization. (1986) Book: Color and Cognition in Mesoamerica: Constructing Categories as Vantages. (1997)
30 The Mesoamerican portion of the REM archive: 33 within Mexico City, 37 within Oaxaca, 30 within Guatemala. Jameson et al. (2015). ECST.
31 Chinantec language diversity in the MCS
32 Chinantec language diversity in the MCS Developing Vigorous Endangered Jameson et al. (2015). ECST.
33 Features of our transcription problem that may be general: The data has a constrained structure and format (unlike typical historical-records transcription tasks). It's a perceptual identification/reproduction problem: e.g., identify handwritten characters/symbols in a standardized template or form and reproduce them via keyboard input. Transcription of large blocks of data can be broken into small tasks and transcribed by OCR or crowdsourcing methods. See Poster: Optical Character Recognition of Handwritten Tabular Data (Yang).
35 Focus selection task: Shown the chart, pinpoint the best example of each root they volunteered while naming. Problem: Convert THIS into a data addressable file
36 Problem: Convert THIS into a data addressable file American English Data
37 DATA... continues up to
38 Challenges of our transcription job: Concepts, and how they apply everywhere. There's a classic example: color. There's an existing database. There's a chance to do better. Crowdsourcing can help greatly. Why OCR doesn't work: the handwriting is not prose. The reason is that it's a perceptual problem. Crowdsourcing lets us break the problem into pieces and solve it piecewise.
39 Features of our problem and approach that may apply elsewhere: The perceptual nature of our tasks differs from general information surveys or opinion-poll data; e.g., response bias is likely to be item-based rather than the usual informant-based form, perhaps allowing more than one possible decision strategy. In large-scale efforts there's a need to automate quantification and evaluation of the goodness of the transcribed product. Minimizing response bias by partitioning larger tasks into smaller, distributed tasks that are answered by several subjects and reassembled into a whole lends itself to crowdsourced approaches. While crowdsourcing makes Big Data possible, an intelligent model of data aggregation (like CCT) may permit trading off smarter data for bigger data, giving a more economical approach to accurately deriving robust results using internet-based crowdsourcing methods. National Science Foundation (#SMA , K.A. Jameson, PI).
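The partition, distribute, and reassemble idea can be sketched as follows. This is only an illustration under stated assumptions: the function names, chunk size, and redundancy level are hypothetical, and the project's actual task-assignment procedure is not described in the slides.

```python
import random
from collections import defaultdict


def make_tasks(cell_ids, chunk_size):
    """Split a long list of data cells into small consecutive tasks."""
    return [cell_ids[i:i + chunk_size] for i in range(0, len(cell_ids), chunk_size)]


def assign_workers(tasks, workers, redundancy, seed=0):
    """Give every task to `redundancy` distinct workers, chosen at random."""
    rng = random.Random(seed)
    return {task_index: rng.sample(workers, redundancy)
            for task_index in range(len(tasks))}


def collect(responses):
    """Regroup (cell_id, worker_id, answer) triples by cell for later aggregation."""
    by_cell = defaultdict(list)
    for cell_id, _worker_id, answer in responses:
        by_cell[cell_id].append(answer)
    return dict(by_cell)


tasks = make_tasks(list(range(10)), chunk_size=4)
plan = assign_workers(tasks, ["w1", "w2", "w3", "w4"], redundancy=3)
grouped = collect([(0, "w1", "a"), (0, "w2", "b"), (1, "w1", "c")])
```

Keeping collection separate from aggregation means the same regrouped responses can be passed to a simple vote or to a model-based method such as CCT.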
43 Batchelder and Romney (1988). Test theory without an answer key. Psychometrika. Cultural consensus analyses of a cognitive-perceptual task: For tasks evaluating new characters designed to extend the 26 letters of the English alphabet, consensus analyses objectively identified expert typeface designers as having higher competence than college undergraduates. Jameson & Romney (1990). Consensus on Semiotic Models of Alphabetic Systems. J. of Quant. Anthro.
44 Automating archive transcription: Task and Judgments
Design 1: OCR verification (pattern recognition) - 2-AFC yes/no
Design 2: OCR verification (training data) - free response
Design 3: Crowdsource verification - 2-AFC match/no-match
Design 4: Naming ranges 1 - free response + confidence
Design 5: Naming ranges 2 - N-AFC + confidence
Design 6: Focus transcription 1 - free response + confidence
Design 7: Focus transcription 2 - free response
free response = a reCAPTCHA task. Poster title: Designing Crowdsourcing Methods for the Transcription of Handwritten Documents (Stephanie).
45 E.g., internet-based transcription task:
46 Cultural Consensus Theory (CCT) to aggregate the data: Automate piece-wise crowdsourced transcription designs for analysis with CCT to derive the correct transcription. Enrich the model underlying the Dichotomous Bayesian form of CCT (Oravecz et al., 2014) to handle N-alternative forced-choice data formats. As a result, employ smarter analyses of smaller samples, using CCT's formal process model, that produce solutions as robust as those from large amounts of averaged data. Deshpande, Tauber, Chang, Gago & Jameson. (in preparation). Digitizing a large corpus of handwritten documents using crowdsourcing and cultural consensus theory. See Poster: A Cultural Consensus Theory Analysis of Crowdsourced Transcription Data (Prutha).
47 Results: Task 4, n=30. hi, hl
51 Inferring the true transcription Mode? (Bayesian) Cultural Consensus Theory (CCT) (Oravecz, Vandekerckhove & Batchelder, 2014) (Batchelder & Romney, 1988)
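The "Mode?" option above is the simplest baseline: take the most frequent response for each item. A minimal sketch follows; the function name and the choice to report ties as unresolved are assumptions made for illustration.

```python
from collections import Counter


def modal_answer(answers):
    """Return (most frequent answer or None on a tie, share of votes for the top answer)."""
    if not answers:
        raise ValueError("no answers to aggregate")
    counts = Counter(answers)
    top = max(counts.values())
    winners = [a for a, c in counts.items() if c == top]
    best = winners[0] if len(winners) == 1 else None  # a tie is left unresolved
    return best, top / len(answers)


clear = modal_answer(["hi", "hi", "hl"])
tied = modal_answer(["hi", "hl"])
```

The mode weights every respondent equally and ignores ability, item difficulty, and response bias, which is the gap a model-based aggregator is meant to address.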
52 Cultural Consensus Theory (CCT): Test theory without an answer key (Batchelder & Romney, 1988). Allows us to infer: shared latent cultural knowledge (true transcription); individual ability; item difficulty; response bias.
53 Cultural Consensus Theory (CCT): Usually applied to dichotomous (true/false) data. Other formats have been explored within a Bayesian framework, but not multiple choice / free response (to our knowledge). Not typically applied to perceptual identification (although see Jameson & Romney, 1990).
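To make the four inferred quantities concrete, here is a sketch of the response rule commonly used for dichotomous CCT (the General Condorcet Model): a respondent either knows the answer and reports it, or guesses "true" with a bias probability. The Rasch-like link between ability and item difficulty follows the usual Bayesian parameterization as we understand it. All function and variable names are assumptions, and this is a simulation sketch, not the inference procedure used in the talk.

```python
import random


def competence(theta, delta):
    """Probability that a respondent with ability theta knows an item of difficulty delta.

    Both theta and delta are assumed to lie strictly between 0 and 1.
    """
    return theta * (1 - delta) / (theta * (1 - delta) + (1 - theta) * delta)


def p_respond_true(z, theta, delta, g):
    """P(response = 1) when the true answer is z (0 or 1) and the guessing bias is g."""
    d = competence(theta, delta)
    return d * z + (1 - d) * g


def simulate(answer_key, thetas, deltas, biases, seed=0):
    """Simulate a respondents-by-items matrix of 0/1 responses."""
    rng = random.Random(seed)
    return [[int(rng.random() < p_respond_true(z, th, de, g))
             for z, de in zip(answer_key, deltas)]
            for th, g in zip(thetas, biases)]


data = simulate([1, 0, 1], thetas=[0.9, 0.6], deltas=[0.5, 0.5, 0.5], biases=[0.5, 0.5])
```

With delta = 0.5 the competence equals the ability theta, so an item of middling difficulty leaves a respondent's ability unchanged; harder items (delta closer to 1) push competence down.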
54 Dichotomous CCT / Multiple Choice CCT: Observed Data, Latent Parameters (subject-wise bias)
62 Examples of perceptually confusable stimuli
63 Response bias: Individuals or items? Subject-wise bias, item-wise bias
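The distinction between subject-wise and item-wise bias can be illustrated with a hypothetical multiple-choice response rule: with probability equal to the respondent's competence the true answer is given, otherwise an alternative is drawn from a guessing distribution. Whether that distribution is indexed by respondent or by item is the question on the slide. This is an assumed form for illustration only and is not claimed to be the model presented in the talk.

```python
def response_probs(true_answer, competence, guess_probs):
    """Probability of each alternative under a know-or-guess rule.

    guess_probs maps each alternative to its guessing probability (must sum to 1).
    """
    if abs(sum(guess_probs.values()) - 1.0) > 1e-9:
        raise ValueError("guessing probabilities must sum to 1")
    return {alt: competence * (alt == true_answer) + (1 - competence) * g
            for alt, g in guess_probs.items()}


# Subject-wise bias: one guessing distribution per respondent, reused on every item.
subject_bias = {"s0": {"hi": 0.5, "hl": 0.5}}
# Item-wise bias: one guessing distribution per item, shared by all respondents,
# e.g. a confusable stimulus that pulls guesses toward "hl".
item_bias = {"item4": {"hi": 0.25, "hl": 0.75}}

by_subject = response_probs("hi", 0.6, subject_bias["s0"])
by_item = response_probs("hi", 0.6, item_bias["item4"])
```

Under the item-wise form, perceptually confusable stimuli can attract wrong answers from many respondents at once, which a purely subject-wise bias term cannot express.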
65 CCT Answer Key: Task 4
70 Subject-wise posteriors: Answer 4 (Z4), Answer 16 (Z16), Answer 125 (Z125), Subject 0 bias (g0)
72 Item-wise posteriors: Answer 4 (Z4), Item 4 bias (g4); Answer 16 (Z16), Item 16 bias (g16); Answer 125 (Z125), Item 125 bias (g125)
73 Task 4: subject-wise model predictions; item-wise model predictions
75 Task 7: subject-wise model predictions; item-wise model predictions
77 Can we use fewer informants? CCT was designed to work on the small subject samples (6-10) typical of anthropological studies. Would the patterns of results reported for Task 4 be possible with a sample smaller than 30 participants?
Method | Answer Key Estimate %-correct | Mean Competence | Mean Item Difficulty
Trial 1 - 8 participants | 100% | |
Trial 2 - 8 participants | 100% | |
Trial 3 - 8 participants | 100% | |
Trial 4 - 8 participants | 100% | |
Trial 5 - 8 participants | 100% | |
Participants | 100% | |
Preliminary trends suggest 8 participants may be as informative as 30.
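The subsampling check described above can be sketched as repeated random draws of 8 respondents, scoring each draw's answer key against a reference key. As a stand-in this sketch uses a plurality vote for both keys, where the talk used CCT; the function names and the toy data are hypothetical.

```python
import random
from collections import Counter


def plurality_key(responses, subjects):
    """Most frequent answer per item among the given subjects.

    responses maps subject -> list of answers, one per item.
    Ties go to the answer encountered first.
    """
    n_items = len(next(iter(responses.values())))
    key = []
    for k in range(n_items):
        counts = Counter(responses[s][k] for s in subjects)
        key.append(counts.most_common(1)[0][0])
    return key


def subsample_agreement(responses, subset_size, n_trials, seed=0):
    """Share of items on which each random subset reproduces the full-sample key."""
    rng = random.Random(seed)
    subjects = sorted(responses)
    full_key = plurality_key(responses, subjects)
    scores = []
    for _ in range(n_trials):
        subset = rng.sample(subjects, subset_size)
        key = plurality_key(responses, subset)
        matches = sum(a == b for a, b in zip(key, full_key))
        scores.append(matches / len(full_key))
    return scores


# Toy data: 30 respondents who all agree, so every subset recovers the key.
toy = {f"s{i}": ["a", "b", "c"] for i in range(30)}
scores = subsample_agreement(toy, subset_size=8, n_trials=5)
```

With real responses the scores would vary across trials, and the spread gives a rough sense of how stable an 8-person answer key is relative to the full sample.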
78 Discussion points: Two (or more) response-strategy subcultures? Confidence data can help CCT results. Quantitative model evaluation. Item + individual bias component? Automation and integration with other server-side processes (Python module vs. R, Matlab).
79 Results Summary: These preliminary results suggest two novel approaches, piece-wise crowdsourcing and CCT data handling, can be used to accurately transcribe a large corpus of ethnographic data. By using internet-based methods, it appears we can avoid a 20+ year manual transcription job and derive an accurate and unbiased database of great value to investigations of concept formation across language groups. The economical way in which we modeled this perceptually based transcription problem seems likely to generalize to other internet-based tasks that require extraction and evaluation of targets embedded in distracting information, and our novel use of CCT analyses seems promising for intelligently aggregating smaller subsets of crowdsourced responses to address large data handling problems.
80 Thanks for Listening!!
More informationRESEARCH OPPORTUNITY PROGRAM 299Y PROJECT DESCRIPTIONS 2015 2016 SUMMER
Project Code: CSC 1S Professor Ronald M. Baecker & Assistant Lab Director Carrie Demmans Epp Computer Science TITLE OF RESEARCH PROJECT: Exploring Language Use in Computer Science Discussion Forums NUMBER
More informationLONDON SCHOOL OF COMMERCE. Programme Specification for the. Cardiff Metropolitan University. BSc (Hons) in Computing
LONDON SCHOOL OF COMMERCE Programme Specification for the Cardiff Metropolitan University BSc (Hons) in Computing Contents Programme Aims and Objectives Programme Structure Programme Outcomes Mapping of
More informationBinary Representation. Number Systems. Base 10, Base 2, Base 16. Positional Notation. Conversion of Any Base to Decimal.
Binary Representation The basis of all digital data is binary representation. Binary - means two 1, 0 True, False Hot, Cold On, Off We must be able to handle more than just values for real world problems
More informationALIAS: A Tool for Disambiguating Authors in Microsoft Academic Search
Project for Michael Pitts Course TCSS 702A University of Washington Tacoma Institute of Technology ALIAS: A Tool for Disambiguating Authors in Microsoft Academic Search Under supervision of : Dr. Senjuti
More informationPacific Premier Bank s Business e- Banking Getting Started Guide with QuickBooks 2013-2015 for Windows
Pacific Premier Bank s Business e- Banking Getting Started Guide with QuickBooks 2013-2015 for Windows Table of Contents CONNECT AND UPDATE YOUR DATA... 2 SET UP AN ACCOUNT FOR ONLINE BANKING (DIRECT CONNECT)...
More informationHow To Become A Data Scientist
Programme Specification Awarding Body/Institution Teaching Institution Queen Mary, University of London Queen Mary, University of London Name of Final Award and Programme Title Master of Science (MSc)
More informationOPAC TEST DESCRIPTIONS
OPAC Testing Software is a product of Biddle Consulting Group, Inc. Keyboarding/Data-Entry 10-Key Measures speed and accuracy of numeric data entry in an adding machine format. Keyboarding Typing speed
More informationAutomatic Speech Recognition and Hybrid Machine Translation for High-Quality Closed-Captioning and Subtitling for Video Broadcast
Automatic Speech Recognition and Hybrid Machine Translation for High-Quality Closed-Captioning and Subtitling for Video Broadcast Hassan Sawaf Science Applications International Corporation (SAIC) 7990
More informationImproving Traceability of Requirements Through Qualitative Data Analysis
Improving Traceability of Requirements Through Qualitative Data Analysis Andreas Kaufmann, Dirk Riehle Open Source Research Group, Computer Science Department Friedrich-Alexander University Erlangen Nürnberg
More informationSPECIFIC PROBLEMS OF ELECTRONIC DOCUMENT
ITEM, 1 (2001), e-journal Information Technology for Economics and Management ISSN 1643-8949 Andrzej Michalski Silesian Technical University, POLAND SPECIFIC PROBLEMS OF ELECTRONIC DOCUMENT Summary: The
More informationThe Role of Size Normalization on the Recognition Rate of Handwritten Numerals
The Role of Size Normalization on the Recognition Rate of Handwritten Numerals Chun Lei He, Ping Zhang, Jianxiong Dong, Ching Y. Suen, Tien D. Bui Centre for Pattern Recognition and Machine Intelligence,
More informationMICROSOFT OFFICE ACCESS 2007 - NEW FEATURES
MICROSOFT OFFICE 2007 MICROSOFT OFFICE ACCESS 2007 - NEW FEATURES Exploring Access Creating and Working with Tables Finding and Filtering Data Working with Queries and Recordsets Working with Forms Working
More informationGRAPHS/TABLES. (line plots, bar graphs pictographs, line graphs)
GRAPHS/TABLES (line plots, bar graphs pictographs, line graphs) Standard: 3.D.1.2 Represent data using tables and graphs (e.g., line plots, bar graphs, pictographs, and line graphs). Concept Skill: Graphs
More informationSingle Level Drill Down Interactive Visualization Technique for Descriptive Data Mining Results
, pp.33-40 http://dx.doi.org/10.14257/ijgdc.2014.7.4.04 Single Level Drill Down Interactive Visualization Technique for Descriptive Data Mining Results Muzammil Khan, Fida Hussain and Imran Khan Department
More informationBinary Representation
Binary Representation The basis of all digital data is binary representation. Binary - means two 1, 0 True, False Hot, Cold On, Off We must tbe able to handle more than just values for real world problems
More informationWorlds Without Words
Worlds Without Words Ivan Bretan ivan@sics.se Jussi Karlgren jussi@sics.se Swedish Institute of Computer Science Box 1263, S 164 28 Kista, Stockholm, Sweden. Keywords: Natural Language Interaction, Virtual
More information3D Data Visualization / Casey Reas
3D Data Visualization / Casey Reas Large scale data visualization offers the ability to see many data points at once. By providing more of the raw data for the viewer to consume, visualization hopes to
More informationMCQ on Management Information System. Answer Key
MCQ on Management Information System. Answer Key 1.Management information systems (MIS) 1. create and share documents that support day-today office activities 2. process business transactions (e.g., time
More informationPeriodontology. Digital Art Guidelines JOURNAL OF. Monochrome Combination Halftones (grayscale or color images with text and/or line art)
JOURNAL OF Periodontology Digital Art Guidelines In order to meet the Journal of Periodontology s quality standards for publication, it is important that authors submit digital art that conforms to the
More informationMeasurement Science and Standards in Forensic Handwriting Analysis Conference & Webcast Facilitated Discussion: Raw Comments June 5, 2013
Measurement Science and Standards in Forensic Handwriting Analysis Conference & Webcast Facilitated Discussion: Raw Comments June 5, 2013 Overview During the Measurement Science and Standards in Forensic
More informationPANTONE Solid to Process
PANTONE Solid to Process PANTONE C:0 M:0 Y:100 K:0 Proc. Yellow PC PANTONE C:0 M:0 Y:51 K:0 100 PC PANTONE C:0 M:2 Y:69 K:0 106 PC PANTONE C:0 M:100 Y:0 K:0 Proc. Magen. PC PANTONE C:0 M:0 Y:79 K:0 101
More informationAn Introduction to Number Theory Prime Numbers and Their Applications.
East Tennessee State University Digital Commons @ East Tennessee State University Electronic Theses and Dissertations 8-2006 An Introduction to Number Theory Prime Numbers and Their Applications. Crystal
More informationFive Steps to Ensure a Technically Accurate Document Production
Five Steps to Ensure a Technically Accurate Document Production by Elwood Clark Lawyers spend a lot of time focusing on the legal aspects of a document production, including properly defining the scope
More informationMeasuring Critical Thinking within Discussion Forums using a Computerised Content Analysis Tool
Measuring Critical Thinking within Discussion Forums using a Computerised Content Analysis Tool Stephen Corich, Kinshuk, Lynn M. Hunt Eastern Institute of Technology, New Zealand, Massey University, New
More informationTurkish Radiology Dictation System
Turkish Radiology Dictation System Ebru Arısoy, Levent M. Arslan Boaziçi University, Electrical and Electronic Engineering Department, 34342, Bebek, stanbul, Turkey arisoyeb@boun.edu.tr, arslanle@boun.edu.tr
More informationThe Re-emergence of Data Capture Technology
The Re-emergence of Data Capture Technology Understanding Today s Digital Capture Solutions Digital capture is a key enabling technology in a business world striving to balance the shifting advantages
More informationWhat is the Future for Mail Sorting?
Turning Envelope Data into Actionable Information January 2011: State-of-the-Art Recognition Technology Further Improves Mail Sorting Efficiency Parascript Page 1 1/13/2011 State-of-the-Art Recognition
More informationProbability Using Dice
Using Dice One Page Overview By Robert B. Brown, The Ohio State University Topics: Levels:, Statistics Grades 5 8 Problem: What are the probabilities of rolling various sums with two dice? How can you
More informationUsing Neural Networks to Create an Adaptive Character Recognition System
Using Neural Networks to Create an Adaptive Character Recognition System Alexander J. Faaborg Cornell University, Ithaca NY (May 14, 2002) Abstract A back-propagation neural network with one hidden layer
More informationParts of a Computer. Preparation. Objectives. Standards. Materials. 1 1999 Micron Technology Foundation, Inc. All Rights Reserved
Parts of a Computer Preparation Grade Level: 4-9 Group Size: 20-30 Time: 75-90 Minutes Presenters: 1-3 Objectives This lesson will enable students to: Identify parts of a computer Categorize parts of a
More informationLearning and Academic Analytics in the Realize it System
Learning and Academic Analytics in the Realize it System Colm Howlin CCKF Limited Dublin, Ireland colm.howlin@cckf-it.com Danny Lynch CCKF Limited Dublin, Ireland danny.lynch@cckf-it.com Abstract: Analytics
More informationD2.4: Two trained semantic decoders for the Appointment Scheduling task
D2.4: Two trained semantic decoders for the Appointment Scheduling task James Henderson, François Mairesse, Lonneke van der Plas, Paola Merlo Distribution: Public CLASSiC Computational Learning in Adaptive
More informationGeneral Findings Over the five years being reviewed (2006 to 2011) there were 566 applicants for 210 positions.
The Applicant Tracking Process An applicant tracking system, generally speaking, is a system used to track those who apply for positions. This tracking can be as complex as housing the application and
More informationA Review of Anomaly Detection Techniques in Network Intrusion Detection System
A Review of Anomaly Detection Techniques in Network Intrusion Detection System Dr.D.V.S.S.Subrahmanyam Professor, Dept. of CSE, Sreyas Institute of Engineering & Technology, Hyderabad, India ABSTRACT:In
More information2 Day In House Demand Planning & Forecasting Training Outline
2 Day In House Demand Planning & Forecasting Training Outline On-site Corporate Training at Your Company's Convenience! For further information or to schedule IBF s corporate training at your company,
More informationAccuRead OCR. Administrator's Guide
AccuRead OCR Administrator's Guide July 2016 www.lexmark.com Contents 2 Contents Change history... 3 Overview... 4 System requirements...4 Supported applications... 4 Supported formats and languages...
More informationLoad testing with. WAPT Cloud. Quick Start Guide
Load testing with WAPT Cloud Quick Start Guide This document describes step by step how to create a simple typical test for a web application, execute it and interpret the results. 2007-2015 SoftLogica
More informationThe basics of storytelling through numbers
Data Visualizations 101 The basics of storytelling through numbers C olleges and universities have a lot of stories to tell to a lot of different people. Prospective students and parents want to know if
More informationAccuRead OCR. Administrator's Guide
AccuRead OCR Administrator's Guide April 2015 www.lexmark.com Contents 2 Contents Overview...3 Supported applications...3 Supported formats and languages...3 OCR performance...4 Sample documents...6 Configuring
More informationVisualizing of Berkeley Earth, NASA GISS, and Hadley CRU averaging techniques
Visualizing of Berkeley Earth, NASA GISS, and Hadley CRU averaging techniques Robert Rohde Lead Scientist, Berkeley Earth Surface Temperature 1/15/2013 Abstract This document will provide a simple illustration
More information