Thesis Proposal Using Argument Analysis to Define a Structure for User Generated Content

Size: px
Start display at page:

Download "Thesis Proposal Using Argument Analysis to Define a Structure for User Generated Content"

From this document you will learn the answers to the following questions:

  • What is one step in the analysis of discussions?

  • What type of text is identified as being represented by a span of text?

  • Where is the research done on the subject of argument structure?

Transcription

1 Thesis Proposal Using Argument Analysis to Define a Structure for User Generated Content Clare Llewellyn E H U N I V E R S I T Y T O H F G R E D I N B U Doctor of Philosophy Institute for Language, Cognition and Computation School of Informatics University of Edinburgh 2012

2

3 Abstract Conversation is how people compare ideas, solve problems and identify disagreements as part of daily life. The internet contains several good information sources which can provide facts and general information, but it is less obvious where to find meaningful discussions and arguments that are grounded with evidence. User generated content can be found in social networking sites, add comments sections of news articles, blogs or micro blogs and information collation sites such as Wikipedia. Any evaluations that occur in these discussions must be conducted by humans because any argument structure is not presented explicitly and therefore cannot be automatically analysed. If a user wishes to hear all sides of an argument it can be difficult to find all the right information. This work aims to automatically identify and extract the structure of arguments from user generated content. Thereby, allowing the users of this content to see the structured arguments contained within, with the claims, counter claims, and grounds. Presenting this structure can allow users to more easily navigate the content thereby enabling the user to become part of the conversation in a way that is arguably more useful to them. The steps in the process, and therefore the problems that this work aims to tackle, are the identification of discussions on different topics, the identification of spans of text that illustrate the discussion, classification of text into an argument structure so as to define the relationships between spans of text and to present this information to users. iii

4 Declaration I declare that this thesis was composed by myself, that the work contained herein is my own except where explicitly stated otherwise in the text, and that this work has not been submitted for any other degree or professional qualification except as specified. (Clare Llewellyn) iv

5 Table of Contents 1 Introduction The Problem Motivation Objective Background Argument Theory Argument Data Structures Applications Techniques for Identifying and Classifying Argumentation in Text Classifying Arguments Identifying Arguments Approach Work In Progress Data Sets Topic Detection Argument Analysis and Classification Conclusions from work so far Future Work Approach Plan Bibliography 51 v

6

7 Chapter 1 Introduction 1.1 The Problem The scientific method requires that empirical and measurable evidence can be found to support a hypothetical explanation of a phenomenon. The hypothesis can be used to predict the outcomes in a certain scenario, which can then be evaluated through evidence. Academics can test their theories through discussion, allowing the evaluation of a hypothesis with regard to evidence provided by others. Conversation, or discourse, is how people compare ideas, solve problems and identify disagreements as part of daily life. In traditional print media journalists, academics and other writers perform the tasks of collating and curating information to present in a structured way to consumers. These professionals summarise the debates and add in their own claims and evidence. The consumer of this content is then free to consult this evidence and determine their own beliefs. In the Internet domain there is an overwhelming amount of information available, some is collated and curated in this manner but much is not. There are several good information sources which can provide facts and general information but it is less obvious where to find meaningful discussions and arguments that are grounded with evidence. Tools are freely available that allow users to be both producers and consumers of content and to be part of the wider internet conversation. Users add data to social networking sites, they use the add comments sections of news articles, they blog or micro blog and they contribute to articles for information collation sites such as Wikipedia. Any evaluations that occur in these discussions must be conducted by humans because any argument structure is not presented explicitly and therefore cannot be automati- 1

8 2 Chapter 1. Introduction cally analysed. If a user wishes to hear all sides of an argument (the full argument space) it can be difficult to find all the right information. 1.2 Motivation The models of presentation in current user generated content systems often mean the context of a post is lost. Looking at information without this context can give a skewed version of the argument. Most user generated content systems allow users to comment in a linear way, presenting comments as a list with each addition being added to the top or the bottom of the list. Additionally many sites allow users to comment upon the comments of other users, thereby creating a thread like structure. Often this data is presented by time of addition, so the user sees the most recently added data first, or by a user rating system, often represented using votes for or against particular posts (comment systems) or by the reposting of information (blogs, Twitter). Unfortunately the speed with which comments are added is extremely variable and this can make it difficult to follow the conversations and for users to comment at the most appropriate locations. If a rating system is used, a post that references or quotes another may appear out of context and the point of the post may be obscured. Users are generating a large volume of data. The volume of data can be overwhelming therefore users may adopt the approach of only reading what they are interested in. This can lead to cyberpolarization where users with opposing opinions are not aware of the evidence from the opposing side (Faridani et al., 2010). Often the users only look at the most popular or the most recent content, depending on the display system used, and therefore can form a misleading impression of the overall discussion. The current thread style is no longer a good interface for interacting with this user generated data as it can lose the context and variety of the discussion. It is a generally held belief that users who post moderate, well-thought-out opinions can often be shouted down by extremists and discussions can thus end with neither side appreciating the other s point of view and even becoming insulting. This can lead to those with interesting opinions not posting as they fear they will either not be read or they will be shouted down and insulted (Faridani et al., 2010). It can be difficult for a user to determine the most useful location to add their own information so that it can be found by others. Allowing the users of content to see the structure of arguments, with the claims, counter claims, replies and grounds, would allow the users to navigate a large volume of content whilst retaining the context of

9 1.2. Motivation 3 the original post. It would enable users to see the full discussion space whilst allowing them to easily disregard comments that are not well thought out or are insulting. This type of structure could facilitate knowledge acquisition and enable the user to become part of the conversation in a way that is arguably more useful to them. There is no easy way to gain an overview of the opposing sides of a discussion. For example within academia a user must read many journal papers or blogs and have conversations with specialists in the areas they wish to investigate. Conversations can be conducted online through commenting on specific papers or blogs or identifying the experts through social networking tools and contacting them directly. A tool that analyses these conversations to determine the structure of the discussion would make the content easier to navigate and make these conversations more accessible, increasing the usefulness of this content and aiding scholarship. In essence, argument is a conversation which involves different points of view. The initial point of view is a claim and this claim can be countered with another claim. The claims are backed up with evidence which can also be referred to as grounds. An individual constructs an argument claim that represents their position, and in this they explain to others the evidence that has led to their position (the claim and the grounds that they use to support this). A counter claim challenges the original position, providing further evidence or information. This is intended to cause the composer of the original claim to re-evaluate their position and accept the counter claim. Those who have been involved in the argument have been made aware of other perspectives on the original problem thus allowing the participants to gain new knowledge that may change or confirm their perspective, encourage them to conduct further investigation, or they may use this evidence to solve other problems at a later date. It is proposed here that user generated discussions can be expressed as arguments with different points defined as claims, counter claims and evidence and that this structure can be automatically extracted from the content and presented to the user for further interaction. Presently, as discussed further in chapter 2, there are techniques that can be used to identify possible argument components in text, and there are tools that can be used to classify these components and standardised structures for storing and displaying argumentation. However these processes have not been integrated and used to solve the problems defined above that relate specifically to user generated content.

10 4 Chapter 1. Introduction 1.3 Objective The objective of this work is to automatically identify and extract the structure of arguments from user generated content, to display this structure and to help a user identify where they should add their own content. In Chapter 2 there is a discussion on current approaches for extracting arguments from user generated content. This work differs from other areas of argument analysis as it aims to extract arguments from within a stream of conversations from user generated content sources such as Twitter, comments on news articles and other media as yet undefined; therefore specific conversations must be identified and appropriate sections of text extracted. In addition, there is an intention to combine the extraction of the argument structure with tools available for visualisation of these arguments. These tools will be extended to enable the users to add their own content at appropriate locations. The proposed steps in this process are inspired by fields such as opinion mining. Bal and Saint-Dizier (2009) analyse product reviews to determine current public opinion on commercial products; this work is described in more detail in Chapter 2. The plan for the extraction of the data is adapted from the steps they define. The steps in the process, and therefore the problems that this work aims to tackle, are: 1. Identify discussions on different topics 2. Identify spans of text that illustrate the discussion 3. Classify into a structure so as to define the relationships between spans of text 4. Present this information to users The objective of this document is to briefly cover the current research that is relevant, to describe the experimentation that has taken place up to this point and to outline and provide a plan for the future direction of this work. Chapter 2 presents a brief overview of the field of argument theory. The current standards for argument structure and ontologies are surveyed and the applications which are based upon those structures are described. A review of the current literature that describes processing methods for extracting arguments from text has been conducted and is presented in section. This section describes both current machine learning techniques for classifying text into an argument structure and other approaches such as rhetorical structure theory and the theory of argument discourse markers that may provide techniques that will be applicable to this process. Chapter 3 describes the approach that is being taken to this work. The data set that is currently being used and the work that has been completed is described.

11 1.3. Objective 5 Chapter 4 describes the work that will be conducted in the next two years and a plan is presented for the successful completion of this project.

12

13 Chapter 2 Background The study of argumentation builds upon ideas from many disciplines including logic, dialogue theory, artificial intelligence, agent systems, linguistics, and argument theory. It has application in computer supported collaborative learning, the semantic web and artificial intelligence. In investigating argument analysis, ideas from argument theory, rhetorical structure theory and computer supported collaborative work are considered. 2.1 Argument Theory Argument theory is the evaluation of conclusions by logically reasoning over claims that are based on premises. In 2008 Douglas Walton defined three major types of argument (Walton et al., 2008), (Rahwan, 2008b): Deductive, if the premise is true the conclusion is true Inductive, argument supported by generalisations from empirical evidence Defeasible (or presumptive), the conclusions are plausible given the premise, but can be proved false by new evidence. Historically the deductive model was used to evaluate an argument; if the premise can be evaluated as true then the conclusion is therefore true. More recently there has been a shift to a more human type of evaluation of arguments where a degree of uncertainty in the conclusion is possible. This is because using deductive logic to evaluate arguments does not work well in certain situations. For example considering the premise that books are normally longer than 1 page, deductive reasoning could not evaluate this premise as true until all the books that will ever exist have been evaluated in order to determine if they are longer than a page. At this current time 7

14 8 Chapter 2. Background defeasible argumention methods are preferred when evaluating arguments. Defeasible argumentation occurs when the conclusion is accepted in the light of present evidence but can be changed if new evidence is provided, so that books are normally longer than one page can be evaluated as true until many books with a single page are found. A premise can be assumed to be true until there is something which disproves it. This allows reasoning in the absence of complete knowledge. In 1958 Stephen Toulmin noted that natural human argumentation differs from the logics in deductive arguments but shares the same basis of claims and evidence, and from this he concluded that formal logic was different from that used in academic discussion (Toulmin, 2003). He produced a model of this academic style of argumentation and founded the modern study of argumentation to reflect this. The Toulmin model of argumentation is applied to practical problems where reasoning is a process of testing ideas. A claim is provided then justified, for an argument to succeed a good justification of this claim must be made. This argumentation model is not as rigorous as those that use logic but provides enough evidence to allow rational acceptance (Walton et al., 2008). It is proposed here that this more natural description of argumentation can be identified in text and used to determine a structure for that content to make it easy to navigate the conversations in user generated content. The Toulmin model (Toulmin, 2003) of argumentation can be used to provide a structure for argument. This flexible approach is based upon human argumentation which reflects the nature of discussions found in user generated text. The scheme is made up from 6 components: 1. Claim - A conclusion to be evaluated 2. Ground - Evidence and data, used to support the claim 3. Warrant - A statement authorising the ground from the claim (to bridge the gap between claim and ground) 4. Backing - Credentials that certify the statement expressed in the warrant (when the warrant itself is not enough / credible) 5. Rebuttal - Recognising the restrictions which may legitimately be applied to the claim 6. Qualifier - Expressing the degree of force or certainty of the claim The aim of analysing text would be to identify the claims provided and to identify where those claims are justified. For an argument to succeed it should provide good justification of a claim. For this a claim, ground and warrant are always needed but a

15 2.1. Argument Theory 9 qualifier, backing and rebuttal are not essential. The process would analyse the text to initially identify the claim and ground and when these are found, attempt to find the other components listed above. Argument structure takes many forms from single premise arguments to linked, convergent, serial and divergent arguments. In a linked argument there are two or more premises that are dependent on each other and when combined they support the argument conclusion. In convergent arguments each premise individually supports the argument. With serial arguments, the conclusion of an argument acts as the premise for another. A divergent argument means that a premise can support multiple argument conclusions. These structures can also be combined to model complex argumentation (Rahwan, 2008b). Argument theory can be used to automate reasoning, the evaluation of arguments to determine what is true. Schemes are used to present the argumentation structure which can then be automatically evaluated. Walton has spent many years investigating and defining argumentation through a number of these schemes. He provides extensive descriptions of 96 argument schemes that are designed to represent patterns of human argument and reasoning (Walton et al., 2008). Each scheme has a set of critical questions that can be used to reason over the arguments to evaluate it and establish its strength. Walton also presents dialogue types to provide context for argument schemes. Each type has specific speech events. There are 6 types (Walton, 2000): 1. Persuasion - resolution of a conflict of opinion to resolve or clarify an issue 2. Negotiation - to make a deal that is satisfactory to all participants 3. Inquiry - to find or verify evidence in order to evaluate a hypothesis 4. Deliberation - to decide the best course of action in a practical situation / choice 5. Information seeking - to acquire or give information 6. Eristic - to fight and quarrel without any reasonable goal Obviously there are criticisms of these approaches. Many real arguments are composites of 2 or more of the dialogue types and of the various structures defined above (Rahwan, 2008b). A way to evaluate an argument, as proposed by Walton et al. (2008), is through the answering of critical questions. As these questions are dependent on each of the argument schemes, there are therefore many. This makes it hard to evaluate using this scheme (van Eemeren et al., 2008, 2012). Critics of Toulmins model highlight that it does not allow methods for reasoning over the argument in order to

16 10 Chapter 2. Background evaluate it through the proposing of critical questions, as in Waltons schemes for argumentation (Cartwright and Atkinson, 2009). In summary it is very difficult to reason and form conclusions automatically. Humans are much better at this type of task than machines. The most efficient approach to this problem may be a system that utilises machines for tasks such as the extraction and summary of the argumentation points and allow humans to evaluate these arguments. 2.2 Argument Data Structures The aim of structuring data is to provide a basic overview that allows users to more easily understand it. When this content contains a discussion or argument the best way to structure the data is to map it to an argument structure. This allows the user to see the main points of the discussion. The claims and the evidence associated with those claims give the user an indication of where the argument is strong and where it is weak. The points where the claims have weak evidence suggest the most likely places for the evolution of the argument. Much of the work in argumentation structure comes from the education domain where maps are made by students to define arguments on certain topics. These students discuss a given question and provide claims and counter claims and shape this information into a giant map which allows others to visualise the discussion. When this is done manually it can take a very long time to cover all views on a topic. For example, it is estimated that the production of an argument map describing the debate on whether computers think took 7,000 hours to complete (Metzinger, 1999). Semantic web technologies such as RDF and ontologies are currently used to define the basis for many existing argument systems. The currently available ontologies allow a feature rich representation. The expression of an argument in an ontological structure allows discourse structure and component relationships to be accessible to computation so that the data can be navigated and compared automatically. The data can be represented in XML, for example, the Argument Markup Language (AML), which is an XML based language for annotating claims and premises (Reed and Rowe, 2004; Rahwan, 2008a). Various semantic web models for argumentation are also available including: a hypertext based approach, the Issue-Based Information Systems (IBIS) ( 2012), a biological approach the SWAN/SIOC, which is an alignment of two ontologies the Semantic Web Applications in Neuromedicine (SWAN) and Semantically-Interlinked Online Communities

17 2.3. Applications 11 (SIOC) (Schneider et al., 2010; Schneider and Passant, 2012) and the ScholOnto ontology (Buckingham Shum et al., 2000) which is used to support debate in digital libraries. The Argument Interchange Format (AIF) is an interchange format has been proposed for these different ontologies (Rahwan, 2008a). AIF is an ontology of argument related concepts from several different schemes. It is expressed in the Web Ontology Language (OWL) which offers a rich feature set and allows inference using descriptive logics (Mcguinness and van Harmelen, 2004). Currently it only represents a limited set of argument concepts and must be extended to capture other argument formalisations and schemes. In this format an argument is made up of argument entities represented as nodes. Each node has attributes such as title, creator, creation date, certainty, acceptability. It is made up of I-nodes (information) and S-nodes (schemes).the I-nodes represent passive information such as claim, premise, grounds and the S-nodes are the patterns of reasoning. The schemes belong to a class and are classified into types (such as rule of inference scheme, conflict scheme and preference scheme). AIF can also be expressed using the Resource Description Framework (RDF) (Lassila et al., 1998) as AIF-RDF which although less expressive allows interaction with other RDF based technologies (Rahwan, 2008a). Although many of these formats share a similar underlying RDF structure they are not compatible with each other. There are some indications that AIF-RDF standard may emerge as the standard that is adopted in this domain, since it allows the representation of the largest number of argument schemes. But, the extensive functionality of AIF-RDF also means that users must identify the logical aspects of argumentation, and as this can be very complex, this may limit the uptake of the system. As there is no single unified ontology as yet it is difficult to commit to using a single ontological structure. The adoption of a standard would promote interoperability, but at this point there is a reluctance to commit to one for this work, but rather to remain flexible and adapt to the ontologies as needed for display purposes. 2.3 Applications Argument structure within content can be defined either manually or automatically. Internet tools have been developed that encourage users to take part in debates (Cartwright and Atkinson, 2009). Generally these tools required the user to add the structure as they add the content. Once added the content is held in a repository and visualisation

18 12 Chapter 2. Background techniques are used to express the structure. There are several examples of argument repositories; in general they do not interoperate with each other as they do not use the same underlying format, and the argument structures provided are either very simple or very complex depending upon the purpose they are designed for (Rahwan, 2008a). A shared ontology or format would be required to support integration of arguments from different tools, or an ontology mapping tool would be necessary to align these different formats. A brief review of some of the tools available are presented below. 1. Debateabase is from the International Debate Education Association, a not-forprofit organisation which promotes, amongst other things, critical thinking. It is a database of argument questions where the pros and cons of the arguments are listed. These pros and cons for each question are written by experts for those new to the debate to learn the arguments. There are for and against points for each debate and counter points to those points. The site is well presented, has a nice look and feel and good functionality, however the arguments themselves are confusing and the logic of the debate is not very clear ( 2. Truthmapping is a volunteer site funded by donations. Users create arguments by stating premises and claims, critiques and rebuttals can be added, the arguments can be chained together and linked to evidence added. The site is used to learn and practice the logic of debates. Therefore in the logic of the site, what is a premise, a claim and conclusion is very clear. The maps of the argument are good. ( 2012). 3. Araucaria is from the University of Dundee. It is a tool for analysing, reconstructing and diagramming arguments. Users load the text of an argument into the tool and then highlight pieces of text to use as the basis of their argument: the premises and conclusions. These are added to the argument structure in diagrammatical form. This tool can also be used to manually evaluate the argument and each claim can be assigned a level of good / bad or strong / weak. Some of the argument schemes from Walton can also be used. Arguments can be used to search the Araucaria database which looks for text, structure and scheme matching. AraucariaDB is a repository of arguments that can be searched using Xpath ( 2012; Reed and Rowe, 2004).

19 2.4. Techniques for Identifying and Classifying Argumentation in Text Paramenides from the University of Liverpool uses Waltons sufficient conditioning scheme for practical reasoning. It allows the public to post responses to political policy proposals; the proposals are structured using an argument scheme and the users can add in their views and evaluate the arguments ( parmenides/, 2012; Cartwright and Atkinson, 2009). 5. ClaiMaker from the Open University uses the ScholOnto ontology, which supports basic reasoning, to model users views of academic research papers. They have a database of claims made in papers which can be queried. The approach is based upon that taken in Compendium a tool also from the Open University, that enables concept mapping using the IBIS ontology (Uren et al., 2003). Each specific language is designed for a specific tool and therefore is closely tied to the tool. A standardised format for content or a common ontology or a way of mapping between the current structures would enable data to be shared amongst sites. 2.4 Techniques for Identifying and Classifying Argumentation in Text Classifying Arguments Machine learning tools have been used to automatically classify arguments (Witten and Frank, 2005; Rose et al., 2008). Machine learning algorithms require a good set of features that strongly predict the classes and a feature space that is as small as possible to reduce processing time. The following section looks at the different approaches, algorithms and features that have been used to identify argumentation in text. An example of an area where argument theory has been implemented is in the field of Computer Supported Collaborative Work (CSCW). It is used to monitor and assist in collaborative work and learning. Systems have been developed that are used by students in online debates; these use analysis of the argument structure to make sure students stick to the proposed topic areas and to identify interactions where prompts can be used to assist students. These prompts encourage the students to ground and warrant their claims and help them to come to productive conclusions. Rose et al. (2008) use argument theory to classify text in the CSCW domain. In their 2008 paper they describe how they identify features that can be extracted from the text that are useful in predicting types of discourse interactions. A framework is

20 14 Chapter 2. Background used to analyse the text based upon a classification scheme from Weinberger and Fischer (2006). This scheme contains two initial levels of argumentation: the micro-level of argumentation, an individual argument consisting of a claim supported by a ground with warrant possibly specified by a qualifier; and the macro-level of argumentation, argumentation sequences where single arguments are connected to create an argumentation pattern such as argument and counter-argument. The specific items of the Weinberger and Fischer (2006) classification system used: 1. Epistemic activity (formal argument quality) 2. Micro level of argumentation 3. Macro level of argumentation 4. Social modes of co-construction 5. Reaction to previous contribution 6. Reaction to script (prompt) 7. Quoted text The underlying aim of the work by Rose et al. (2008). was to automate the classification of textual data from computer supported collaborative learning tools into a coding scheme. This coding scheme was used to investigate how argumentative knowledge construction occurred in online discussions. The work used the TagHelper Tool; this is a corpus analysis environment built on top of the Weka machine learning toolkit. The text analysed comprised 250 discussions from online discussion boards. Human annotators split each message from the discussion board in to several spans of text which represent the different communication acts. The thread structure of the messages and the relationships between the text spans within each message were recorded for use by the machine learning algorithm. In total 1,250 coded text segments were extracted and classified by humans using the multidimensional coding scheme from Weinberger and Fischer (2006). Rose et al. (2008) considered two tasks: which text features worked well to distinguish classes and an exploration of the suitability of different supervised machine leaning algorithms for this task. The algorithms that were tested were Naive Bayes, Support Vector Machines and Decision Trees. As a baseline they classified using SVM, Naive Bayes and Decision Trees algorithms and the top 100 unigram features. In addition to this they tested 7 different sets additional of features: Unigrams

21 2.4. Techniques for Identifying and Classifying Argumentation in Text 15 Unigrams and line length Unigrams and part of speech (POS) bigrams Unigrams and bigrams Unigrams and punctuation Unigrams and stemming Unigrams and rare words removed Unigrams, line length, punctuation, and rare words removed The classification was evaluated by determining the level of agreement between the two classifications of the data, automatic and manual. They found that in general using the SVM algorithm gave the most agreement. There is an emphasis in the Rose et al. (2008) work on the selection of features that assist in classification. The features which are identified as useful are: punctuation, unigrams, POS bigrams, line length, containing a non-stop word, stemmed words, and rare words. They found that bigrams substantially reduced performance as each one was relatively rare. They found that with this data set the most predictive features are unigrams and punctuation. This work also investigated if the context of the text would prove useful in classification, this was investigated through the analysis of the relationship of the text spans to each other and their depth in the thread. Sequential learning techniques were used to establish the context of the span of text. It was thought that this may particularly help with certain types of spans of text, for example a reaction to previous contribution cannot occur in an initial position, a specific discourse act is set up by those preceding it. The depth of the thread that the text span appears in is included as a feature by coding it using a Finite State machine feature which has two states, Q0 as initial state at the top of the message and Q1 when a quoted span of text is encountered. It remains in this state until a prompt is reached indicating the span of text that related to something someone else has said. The results showed that context and depth features assisted in identifying Microlevel argumentation and depth features helped in identifying Macro-level argumentation and social modes of co-construction. It is suggested that this is because complex levels of argumentation like counter-examples are more likely to refer to already mentioned information whereas claims are not. The set of features as described by Rose et al. (2008) was extended by Abbott et al. (2011) to assist in recognising disagreement in online discussion forums. The focus of this work was on the recognition of argument structure by identifying agreement

22 16 Chapter 2. Background and disagreement. The machine learning toolkit Weka was used with Naive-Bayes and JRip algorithms. In addition to a base set of n-gram features, several meta-post features were used such as, poster id and time between posts. The experimentation with these meta post features gave a small increase in agreement between automatic classification and human annotation from 63% to 68%. This approach may be useful when investigating classifying text into argument structure: users tend quote each other in their comments to allow them to break down posts into individual points of argument which they can then respond to in isolation, and this meta-post information could be used to determine which is the quoted text and which is the new text and thereby differentiate between claims and counter claims. Mishne and Glance (2006) discuss disputative comments, where users post comments that disagree with the previous comments with the intention of provoking debates and arguments. The disputative comments were identified using a decision tree built from manually classified comments. The most predictive features were found to be: the use of question marks early in the text, the number of comments in the thread, and the polarity of the first sentence of the comment. They also found that phrases that occur often in debates such as I don t think that or you are wrong and the words not and but are strong features. Somasundaran et al. (2008); Somasundaran and Wiebe (2010) found other specific cue phrases that are strong features for use in this type of classification. Discourse markers that are strongly associated with pragmatic functions can be used to predict the class of content, therefore useful features include the presence of a known marker such as actually, because, but, i believe, i know. In general, it has been shown that it is difficult to vastly improve a baseline that uses only unigrams as features, although simple rules defined over the content can improve the features and therefore the classification. However, there is some indication that specific features, such as cue words or POS features, can be used to identify specific aspects of argumentation and in classifying the sentences into claims, grounds and qualifiers. These approaches will be explored to determine whether the classes taken from the Toumlin approach can be identified and extended. Teufel and Moens (1997, 2002); Teufel et al. (1999) discuss how argumentative classification of extracted sentences can be used to summarize research articles and provide a degree of context for the extracted sentences that is missing in other summarization approaches. They extract sentences from the articles to form a summary but keep the overall rhetorical structure, rather than picking the highest rated sentences.

23 2.4. Techniques for Identifying and Classifying Argumentation in Text 17 The rhetorical structure is used to aid extraction of text that describes specific aspects of a paper, whether the sentence represents amongst other things, a main goal of the article, a problem or a contrast to someone else s work. This research uses the CARS (Create a Research Space) model described by Swales (1990) for expressing how academics write papers. This model describes a framework of how, in general, authors of academic papers structure their writing by making specific moves such as pointing out weakness in others research. These moves can be identified using rhetorical markers but these markers mean different things depending on where they are in the text, the same marker might mean a different thing in a related work section as opposed to a conclusion. They identify what they call meta comments in the text, these are phrases like we have presented a method for to fill certain slots in the abstract (such as method). They use a Naive Bayes classifier to identify the sentences that fill a specific role in the summary to be created, such as a method role. This is done by extracting abstractworthy sentences and the classification of the rhetorical role for that sentence. They classify the text into basic and non-basic rhetorical categories which are taken from Swales CARS model (Swales, 1990). The basic categories are: Background, generally accepted statements Others, statements made by others Own, statements made by the author In addition to this there is: Aim, which can be used to categorise the entire text Textual, providing information on the structure of the text Contrast, comparing the authors work with the work of others Basis, the work that the author builds upon. The features that are used in the machine learning are quite complex and include features based on the structure, citation, syntactic, semantic and content aspects of the text. The features based on structure are created by determining the explicit structure of the text, such as position of a text span within the full text, the section, and the paragraph. Whether the text contains a citation (self or to others) is included as a feature. The syntactic features are based on tense and whether it contains modals, voice and negation. An example of the semantic features include the action type of the first sentence. Examples of content features are if the sentence contain key words as

24 18 Chapter 2. Background defined by tf/idf or words from the title. The results from this work were variable but there is a strong indication that the approach of considering the context of the rhetorical marker aided in the classification of the text spans Identifying Arguments Contrasting Idea Identification Argumentation in text can be discovered through identifying sentences that contain contrasting ideas. A contrasting idea is identified as a sentence that summarises a claim and counter claim. This can be used to identify a contrast between two previous points. Contrasting idea identification is used by Sndor and Vorndran (2009) to identify key sentences for use in highlighting key threads for reviews when pre-reviewing papers. They use natural language processing (NLP) through the rule-based dependency parser Xerox Incremental Parser (XIP) (Kaplan, 2006). XIP looks for sentences using gramma rules, these rules contain particular discourse functions (De Liddo and Buckingham Shum, 2010; De Liddo et al., 2012). The features of the key sentences are assigned by applying the concept-matching framework. It appears that this concept matching framework is made up of a database of sets of words which is manually curated to indicate specific aspects of the text, such as a summary. The words from this database are used as seeds and added to incrementally as data from a specific domain is processed, thereby reflecting the way points are discussed in a domain specific way. Once parsed the key sentences are defined as those that describe the main points of a paper, thus allowing a reviewer to evaluate the flow of the ideas within the article. The sentences identified do not describe concepts but instead argue about concepts. They have found that a sentence may not present enough context to the user and suggest that a larger text span may be useful. They also find that they get a very large number of false positives which they suggest may be overcome by expanding the text span beyond the sentence level. De Liddo et al. (2012) also use the XIP parser to aid in Contested Collective Intelligence, the theory of the processes involved in working with ideas that are not proven and can be contested. The contrasting ideas are extracted using XIP and used to analyse problems and aid critical thinking by exploring opposing or diverging opinions. Sentences in the text are labelled as contrasting ideas. Opposing ideas are matched and used for summarisation and contrast. An annotation and knowledge mapping sys-

25 2.4. Techniques for Identifying and Classifying Argumentation in Text 19 tem, Cohere, is used to model the contested ideas and present them to users. The two approaches presented here suggest that using a rule based parser based on discourse functions would be useful for identifying sentences that contain argumentation. Unfortunately the rules used by XIP are not in the public domain, therefore current approaches to discourse analysis are investigated in the next section Rhetorical Structure Theory Rhetorical structure theory involves the identification of the relations between spans of text and the clauses involved in the relationship. These relations are organised into a hierarchical structure (Mann and Thompson, 1987, 1988). Rhetorical semantic relations indicate how the spans of text are joined - the relationship between these spans. Schemes devised using this theory describes how spans of text can co-occur. There are differences in the way different groups precisely define these schemes and the terms used to define them, but the different schema can be aligned (Blair-Goldensohn et al., 2007). Rhetorical semantic relations could be used to identify related spans of text that could be classed as using argumentation structure such as claims, counter claims, premises and conclusions. The relations are described in different ways dependent on the scheme followed, but generally the type of relations that would be useful in argument analysis include those described as CONTRAST, CAUSE-EXPLANATION- EVIDENCE, CONDITION and ELABORATION. Marcu and Echihabi (2002) describe an approach to automatically define and locate these types of relation. CONTRAST and CAUSE-EXPLANATION-EVIDENCE are often signalled by cue phrases (Example 1) such as because or however or they can be expressed implicitly (Example 2). Detecting implicit expressions is more difficult and much of the work in this field is devoted to this. 1. Example 1: because of the expenses scandal there have been a spate of resignations in the government 2. Example 2: the government was once again beset by scandal. After several key resignations Marcu and Echihabi (2002) look for the relations using defined patterns or rules. Models of the rhetorical semantic relations are defined and identified in the text using a Naive Bayes classifier. The models were learned automatically from a mas-

26 20 Chapter 2. Background sive dataset. A large number of examples are identified from a corpus: for identifying CONTRAST sentences were extracted that were linked using the phrase but, CAUSE-EXPLANATION-EVIDENCE sentences were identified as those linked by because or thus. They are able to improve the quality of training examples using topic segmentation and syntactic heuristics to filter out training examples which are invalid. Sporleder and Lascarides (2008) also extend the work by identifying features based on POS and argument structure. They looked for linguistic cue phrases to identify, extract and label training data which is used in classification. This is thought to be suitable for dealing with relations that are difficult to determine using just cue phrases. Linguistic cue phrases are made up of textual features such as word stems, POS and tense. These are then used to expand upon the type of training data that can be used to classify the rhetorical semantic relations. They use Decision Trees, rather than Naive Bayes as the algorithm in the machine learning process. The features that are used in the classification are: Positional features: if the text is at the beginning or end of a paragraph Length of spans Lexical features: words which are not cue words but might indicate certain relations POS: particularly verbs, nouns and adjectives in spans Temporal features: tense and aspect measured by modality, aspect, voice and negation. Syntactic features: complexity measured by number of NPs, VPs, PPs, ADJPs and ADVPs Cohesion Features: measured by the number of pronouns, ellipses and if a text span ends in a VP ellipse. Blair-Goldensohn et al. (2007) have refined the approach using parameter optimisation, topic segmentation and syntactic parsing. The parameter optimisation included (amongst others): Laplace smoothing Using a vocabulary of 6400 stems: replacing all others with a pseudo-token. Removing the stoplist: as the words frequently in a stop list are useful in the model

27 2.4. Techniques for Identifying and Classifying Argumentation in Text 21 Using a minimum frequency cut off: any pair with a frequency of less that 4 is removed Blair-Goldensohn et al. (2007) also reiterate that topic segmentation is important as related instances are unlikely to cross topic boundaries. They defined a topic boundary as at least three intervening sentences or a paragraph break. Their system does not seem to work significantly better but there is some evidence that it may perform better on a larger corpus Global Rhetorical Moves As described in the classification section, Teufel and Moens (1997, 2002); Teufel et al. (1999) assigned sentences to argumentative roles to create and evaluate summaries of academic papers. The work is described as finding global rhetorical moves as opposed to rhetorical structure theory moves (Marcu, 1997) which are generally focused on local relationships. Text spans are identified which characterise the articles they come from, such as: unfortunately this work does not solve problem X. The emphasis of the Teufel and Moens (1997, 2002); Teufel et al. (1999) work is that a text span has different meaning depending where it is in the text, for example in a future work section the example sentence would express some reservations about the work presented in the article and in an early section of the text it would indicate that this work is trying to solve problem X. This approach could be applicable to argumentation. If in user generated content a text span such as the example above is presented towards the end of a user generated comment this type of phrase may indicate the user has expresses a reservation about the claim that they are making, what Toulmin would describe as a Qualifier, if it was towards the beginning of a comment it would indicate the following text is a counter argument to a previous claim. Teufel and Moens (1997, 2002); Teufel et al. (1999) restrict their work to academic writing as they think there are communication goals that are specific to this domain, but adapting this approach to various types of user generated content may provide interesting results Argument Discourse Markers Discourse markers or cue phrases link the section they are introducing and the section that precedes. They are generally made up of conjunctions, adverbs and prepositional

28 22 Chapter 2. Background phrases (Fraser, 1999). Argument discourse markers are words or phrases that link argumentative points. Tseronis (2011) describes three main approaches to describing argument markers: Geneva School, Argument within Language Theory and the Pragma-dialectical Approach. The Geneva School approach describes the various relations in language, in a similar fashion to rhetorical structure theory. There are 3 main types of markers/connective, organisation markers, illocutionary function markers (the relations between acts) and interactive function markers. Among the interactive markers there are 3 sub groups: a relationship between the argument and the master act, a relationship between the counter-arguments and the master act and a reformulation of the acts that proceed. Argument within Language Theory is a study of individual words and phrases. The words identified are argument connectors: these describe an argumentative function of a text span and change the potential of it either realising or de-realising the span. The adverbs and adjectives that accompany verbs and nouns change the argumentative force. These can therefore be used as cue words to identify argument connections and force. The Pragma-dialectical Approach looks at the context beyond words and expressions that directly refer to the argument. It attempts to identify words and expressions that refer to any moves in the argument process (Tseronis, 2011). In a similar fashion to the work of Marcu and Echihabi (2002) the approach is to create a model of an ideal argument and annotate relevant units. Tseronis (2011) analysed text to identify markers of argumentative moves. The 4 units are indications of: 1. stand points (confrontation stage) 2. challenge to defend a stand point (opening stage) 3. argument schemes and related discussions (argumentation stage) 4. maintaining or withdrawing a standpoint of doubt (concluding stage) Aspects of all of these approaches could be used to define rules that identify text spans which contain argumentative text and also do indicate the argumentative role that these text spans take such as claim or counter claim Additional Techniques In addition to identifying and classifying specific text spans that contain argumentation points, other aspects of the user generated content could provide cues as to how to identify and structure the appropriate information. These techniques may include

HOW TO WRITE A CRITICAL ARGUMENTATIVE ESSAY. John Hubert School of Health Sciences Dalhousie University

HOW TO WRITE A CRITICAL ARGUMENTATIVE ESSAY. John Hubert School of Health Sciences Dalhousie University HOW TO WRITE A CRITICAL ARGUMENTATIVE ESSAY John Hubert School of Health Sciences Dalhousie University This handout is a compilation of material from a wide variety of sources on the topic of writing a

More information

WRITING A CRITICAL ARTICLE REVIEW

WRITING A CRITICAL ARTICLE REVIEW WRITING A CRITICAL ARTICLE REVIEW A critical article review briefly describes the content of an article and, more importantly, provides an in-depth analysis and evaluation of its ideas and purpose. The

More information

A LEVEL ECONOMICS. ECON1/Unit 1 Markets and Market Failure Mark scheme. 2140 June 2014. Version 0.1 Final

A LEVEL ECONOMICS. ECON1/Unit 1 Markets and Market Failure Mark scheme. 2140 June 2014. Version 0.1 Final A LEVEL ECONOMICS ECON1/Unit 1 Markets and Market Failure Mark scheme 2140 June 2014 Version 0.1 Final Mark schemes are prepared by the Lead Assessment Writer and considered, together with the relevant

More information

Building a Question Classifier for a TREC-Style Question Answering System

Building a Question Classifier for a TREC-Style Question Answering System Building a Question Classifier for a TREC-Style Question Answering System Richard May & Ari Steinberg Topic: Question Classification We define Question Classification (QC) here to be the task that, given

More information

Units of Study 9th Grade

Units of Study 9th Grade Units of Study 9th Grade First Semester Theme: The Journey Second Semester Theme: Choices The Big Ideas in English Language Arts that drive instruction: Independent thinkers construct meaning through language.

More information

English Language Proficiency Standards: At A Glance February 19, 2014

English Language Proficiency Standards: At A Glance February 19, 2014 English Language Proficiency Standards: At A Glance February 19, 2014 These English Language Proficiency (ELP) Standards were collaboratively developed with CCSSO, West Ed, Stanford University Understanding

More information

Paraphrasing controlled English texts

Paraphrasing controlled English texts Paraphrasing controlled English texts Kaarel Kaljurand Institute of Computational Linguistics, University of Zurich kaljurand@gmail.com Abstract. We discuss paraphrasing controlled English texts, by defining

More information

PTE Academic Preparation Course Outline

PTE Academic Preparation Course Outline PTE Academic Preparation Course Outline August 2011 V2 Pearson Education Ltd 2011. No part of this publication may be reproduced without the prior permission of Pearson Education Ltd. Introduction The

More information

Facilitating Business Process Discovery using Email Analysis

Facilitating Business Process Discovery using Email Analysis Facilitating Business Process Discovery using Email Analysis Matin Mavaddat Matin.Mavaddat@live.uwe.ac.uk Stewart Green Stewart.Green Ian Beeson Ian.Beeson Jin Sa Jin.Sa Abstract Extracting business process

More information

Depth-of-Knowledge Levels for Four Content Areas Norman L. Webb March 28, 2002. Reading (based on Wixson, 1999)

Depth-of-Knowledge Levels for Four Content Areas Norman L. Webb March 28, 2002. Reading (based on Wixson, 1999) Depth-of-Knowledge Levels for Four Content Areas Norman L. Webb March 28, 2002 Language Arts Levels of Depth of Knowledge Interpreting and assigning depth-of-knowledge levels to both objectives within

More information

CFSD 21 ST CENTURY SKILL RUBRIC CRITICAL & CREATIVE THINKING

CFSD 21 ST CENTURY SKILL RUBRIC CRITICAL & CREATIVE THINKING Critical and creative thinking (higher order thinking) refer to a set of cognitive skills or strategies that increases the probability of a desired outcome. In an information- rich society, the quality

More information

Introduction to Reading Literacy Strategies

Introduction to Reading Literacy Strategies Introduction to Reading Literacy Strategies Reading involves: decoding words and understanding the alphabetic code understanding vocabulary linking known knowledge with the knowledge in texts rechecking

More information

Language Arts Literacy Areas of Focus: Grade 6

Language Arts Literacy Areas of Focus: Grade 6 Language Arts Literacy : Grade 6 Mission: Learning to read, write, speak, listen, and view critically, strategically and creatively enables students to discover personal and shared meaning throughout their

More information

LEVEL ECONOMICS. ECON2/Unit 2 The National Economy Mark scheme. June 2014. Version 1.0/Final

LEVEL ECONOMICS. ECON2/Unit 2 The National Economy Mark scheme. June 2014. Version 1.0/Final LEVEL ECONOMICS ECON2/Unit 2 The National Economy Mark scheme June 2014 Version 1.0/Final Mark schemes are prepared by the Lead Assessment Writer and considered, together with the relevant questions, by

More information

Language Arts Literacy Areas of Focus: Grade 5

Language Arts Literacy Areas of Focus: Grade 5 Language Arts Literacy : Grade 5 Mission: Learning to read, write, speak, listen, and view critically, strategically and creatively enables students to discover personal and shared meaning throughout their

More information

Philosophical argument

Philosophical argument Michael Lacewing Philosophical argument At the heart of philosophy is philosophical argument. Arguments are different from assertions. Assertions are simply stated; arguments always involve giving reasons.

More information

MATRIX OF STANDARDS AND COMPETENCIES FOR ENGLISH IN GRADES 7 10

MATRIX OF STANDARDS AND COMPETENCIES FOR ENGLISH IN GRADES 7 10 PROCESSES CONVENTIONS MATRIX OF STANDARDS AND COMPETENCIES FOR ENGLISH IN GRADES 7 10 Determine how stress, Listen for important Determine intonation, phrasing, points signaled by appropriateness of pacing,

More information

Planning and Writing Essays

Planning and Writing Essays Planning and Writing Essays Many of your coursework assignments will take the form of an essay. This leaflet will give you an overview of the basic stages of planning and writing an academic essay but

More information

The College Standard

The College Standard The College Standard Writing College Papers: Identifying Standards and Critical Thinking Challenges Building Blocks Grammar Vocabulary Questions The Goals of Academic Writing Thesis Argument Research Plagiarism

More information

Common Core State Standards Speaking and Listening

Common Core State Standards Speaking and Listening Comprehension and Collaboration. Prepare for and participate effectively in a range of conversations and collaborations with diverse partners, building on others ideas and expressing their own clearly

More information

Project Management in Marketing Senior Examiner Assessment Report March 2013

Project Management in Marketing Senior Examiner Assessment Report March 2013 Professional Diploma in Marketing Project Management in Marketing Senior Examiner Assessment Report March 2013 The Chartered Institute of Marketing 2013 Contents This report contains the following information:

More information

CERTIFICATION EXAMINATIONS FOR OKLAHOMA EDUCATORS (CEOE )

CERTIFICATION EXAMINATIONS FOR OKLAHOMA EDUCATORS (CEOE ) CERTIFICATION EXAMINATIONS FOR OKLAHOMA EDUCATORS (CEOE ) FIELD 74: OKLAHOMA GENERAL EDUCATION TEST (OGET ) Subarea Range of Competencies I. Critical Thinking Skills: Reading and Communications 01 05 II.

More information

Overview of the TACITUS Project

Overview of the TACITUS Project Overview of the TACITUS Project Jerry R. Hobbs Artificial Intelligence Center SRI International 1 Aims of the Project The specific aim of the TACITUS project is to develop interpretation processes for

More information

BCS HIGHER EDUCATION QUALIFICATIONS Level 6 Professional Graduate Diploma in IT. March 2013 EXAMINERS REPORT. Knowledge Based Systems

BCS HIGHER EDUCATION QUALIFICATIONS Level 6 Professional Graduate Diploma in IT. March 2013 EXAMINERS REPORT. Knowledge Based Systems BCS HIGHER EDUCATION QUALIFICATIONS Level 6 Professional Graduate Diploma in IT March 2013 EXAMINERS REPORT Knowledge Based Systems Overall Comments Compared to last year, the pass rate is significantly

More information

IAI : Expert Systems

IAI : Expert Systems IAI : Expert Systems John A. Bullinaria, 2005 1. What is an Expert System? 2. The Architecture of Expert Systems 3. Knowledge Acquisition 4. Representing the Knowledge 5. The Inference Engine 6. The Rete-Algorithm

More information

USABILITY OF A FILIPINO LANGUAGE TOOLS WEBSITE

USABILITY OF A FILIPINO LANGUAGE TOOLS WEBSITE USABILITY OF A FILIPINO LANGUAGE TOOLS WEBSITE Ria A. Sagum, MCS Department of Computer Science, College of Computer and Information Sciences Polytechnic University of the Philippines, Manila, Philippines

More information

Identifying Thesis and Conclusion Statements in Student Essays to Scaffold Peer Review

Identifying Thesis and Conclusion Statements in Student Essays to Scaffold Peer Review Identifying Thesis and Conclusion Statements in Student Essays to Scaffold Peer Review Mohammad H. Falakmasir, Kevin D. Ashley, Christian D. Schunn, Diane J. Litman Learning Research and Development Center,

More information

Published on www.standards.dcsf.gov.uk/nationalstrategies

Published on www.standards.dcsf.gov.uk/nationalstrategies Published on www.standards.dcsf.gov.uk/nationalstrategies 16-Dec-2010 Year 3 Narrative Unit 3 Adventure and mystery Adventure and mystery (4 weeks) This is the third in a block of four narrative units

More information

Related guides: 'Planning and Conducting a Dissertation Research Project'.

Related guides: 'Planning and Conducting a Dissertation Research Project'. Learning Enhancement Team Writing a Dissertation This Study Guide addresses the task of writing a dissertation. It aims to help you to feel confident in the construction of this extended piece of writing,

More information

Haberdashers Adams Federation Schools

Haberdashers Adams Federation Schools Haberdashers Adams Federation Schools Abraham Darby Academy Reading Policy Developing reading skills Reading is arguably the most crucial literacy skill for cross-curricular success in secondary schools.

More information

WRITING EFFECTIVE REPORTS AND ESSAYS

WRITING EFFECTIVE REPORTS AND ESSAYS WRITING EFFECTIVE REPORTS AND ESSAYS A. What are Reports? Writing Effective Reports Reports are documents which both give a reader information and ask the reader to do something with that information.

More information

Virginia English Standards of Learning Grade 8

Virginia English Standards of Learning Grade 8 A Correlation of Prentice Hall Writing Coach 2012 To the Virginia English Standards of Learning A Correlation of, 2012, Introduction This document demonstrates how, 2012, meets the objectives of the. Correlation

More information

Persuasive analytical essay

Persuasive analytical essay Persuasive analytical essay The purpose of a persuasive analytical essay is to present and argue for a particular position on a topic/issue which is under debate. When a lecturer reads a persuasive analytical

More information

Keywords academic writing phraseology dissertations online support international students

Keywords academic writing phraseology dissertations online support international students Phrasebank: a University-wide Online Writing Resource John Morley, Director of Academic Support Programmes, School of Languages, Linguistics and Cultures, The University of Manchester Summary A salient

More information

A + dvancer College Readiness Online Alignment to Florida PERT

A + dvancer College Readiness Online Alignment to Florida PERT A + dvancer College Readiness Online Alignment to Florida PERT Area Objective ID Topic Subject Activity Mathematics Math MPRC1 Equations: Solve linear in one variable College Readiness-Arithmetic Solving

More information

Representing Normative Arguments in Genetic Counseling

Representing Normative Arguments in Genetic Counseling Representing Normative Arguments in Genetic Counseling Nancy Green Department of Mathematical Sciences University of North Carolina at Greensboro Greensboro, NC 27402 USA nlgreen@uncg.edu Abstract Our

More information

Case studies: Outline. Requirement Engineering. Case Study: Automated Banking System. UML and Case Studies ITNP090 - Object Oriented Software Design

Case studies: Outline. Requirement Engineering. Case Study: Automated Banking System. UML and Case Studies ITNP090 - Object Oriented Software Design I. Automated Banking System Case studies: Outline Requirements Engineering: OO and incremental software development 1. case study: withdraw money a. use cases b. identifying class/object (class diagram)

More information

Writing a Project Report: Style Matters

Writing a Project Report: Style Matters Writing a Project Report: Style Matters Prof. Alan F. Smeaton Centre for Digital Video Processing and School of Computing Writing for Computing Why ask me to do this? I write a lot papers, chapters, project

More information

Cohesive writing 1. Conjunction: linking words What is cohesive writing?

Cohesive writing 1. Conjunction: linking words What is cohesive writing? Cohesive writing 1. Conjunction: linking words What is cohesive writing? Cohesive writing is writing which holds together well. It is easy to follow because it uses language effectively to guide the reader.

More information

Library, Teaching and Learning. Writing Essays. and other assignments. 2013 Lincoln University

Library, Teaching and Learning. Writing Essays. and other assignments. 2013 Lincoln University Library, Teaching and Learning Writing Essays and other assignments 2013 Lincoln University Writing at University During your degree at Lincoln University you will complete essays, reports, and other kinds

More information

VCU-TSA at Semeval-2016 Task 4: Sentiment Analysis in Twitter

VCU-TSA at Semeval-2016 Task 4: Sentiment Analysis in Twitter VCU-TSA at Semeval-2016 Task 4: Sentiment Analysis in Twitter Gerard Briones and Kasun Amarasinghe and Bridget T. McInnes, PhD. Department of Computer Science Virginia Commonwealth University Richmond,

More information

Author Gender Identification of English Novels

Author Gender Identification of English Novels Author Gender Identification of English Novels Joseph Baena and Catherine Chen December 13, 2013 1 Introduction Machine learning algorithms have long been used in studies of authorship, particularly in

More information

FUNCTIONAL SKILLS ENGLISH - WRITING LEVEL 2

FUNCTIONAL SKILLS ENGLISH - WRITING LEVEL 2 FUNCTIONAL SKILLS ENGLISH - WRITING LEVEL 2 MARK SCHEME Instructions to marker There are 30 marks available for each of the three tasks, which should be marked separately, resulting in a total of 90 marks.

More information

Sentiment Analysis of Movie Reviews and Twitter Statuses. Introduction

Sentiment Analysis of Movie Reviews and Twitter Statuses. Introduction Sentiment Analysis of Movie Reviews and Twitter Statuses Introduction Sentiment analysis is the task of identifying whether the opinion expressed in a text is positive or negative in general, or about

More information

Ontology and automatic code generation on modeling and simulation

Ontology and automatic code generation on modeling and simulation Ontology and automatic code generation on modeling and simulation Youcef Gheraibia Computing Department University Md Messadia Souk Ahras, 41000, Algeria youcef.gheraibia@gmail.com Abdelhabib Bourouis

More information

Writing an Introductory Paragraph for an Expository Essay

Writing an Introductory Paragraph for an Expository Essay Handout 27 (1 of 1) Writing an Introductory Paragraph for an Expository Essay Prompt Read the following: If you re like many Americans, you have just spent a few days in close quarters with your parents,

More information

Warrant: The part of the arguments that sets up a logical connection between the grounds and the claim. (Reason) (Scientific Method = Hypothesis)

Warrant: The part of the arguments that sets up a logical connection between the grounds and the claim. (Reason) (Scientific Method = Hypothesis) Toulmin Method of Argumentation Critical thinking and the ability to write well are of primary importance in our Magnet Program. The heart of good writing is good thinking. Writing is not only an end in

More information

Five High Order Thinking Skills

Five High Order Thinking Skills Five High Order Introduction The high technology like computers and calculators has profoundly changed the world of mathematics education. It is not only what aspects of mathematics are essential for learning,

More information

Module Catalogue for the Bachelor Program in Computational Linguistics at the University of Heidelberg

Module Catalogue for the Bachelor Program in Computational Linguistics at the University of Heidelberg Module Catalogue for the Bachelor Program in Computational Linguistics at the University of Heidelberg March 1, 2007 The catalogue is organized into sections of (1) obligatory modules ( Basismodule ) that

More information

Assessment Policy. 1 Introduction. 2 Background

Assessment Policy. 1 Introduction. 2 Background Assessment Policy 1 Introduction This document has been written by the National Foundation for Educational Research (NFER) to provide policy makers, researchers, teacher educators and practitioners with

More information

Literature Reviews. 1. What is a Literature Review?

Literature Reviews. 1. What is a Literature Review? Literature Reviews 1. 2. 3. 4. 5. 6. 7. 8. What is a Literature Review? Choosing Material Searching for Good Material Assessing the Literature Developing the Literature Review Placing the Literature Review

More information

GRADE 11 English Language Arts Standards Pacing Guide. 1 st Nine Weeks

GRADE 11 English Language Arts Standards Pacing Guide. 1 st Nine Weeks 1 st Nine Weeks A. Verify meanings of words by the author s use of definition, restatement, example, comparison, contrast and cause and effect. B. Distinguish the relationship of word meanings between

More information

Crosswalk of the Common Core Standards and the Standards for the 21st-Century Learner Writing Standards

Crosswalk of the Common Core Standards and the Standards for the 21st-Century Learner Writing Standards Crosswalk of the Common Core Standards and the Standards for the 21st-Century Learner Writing Standards AASL Standards 1. Inquire, think critically, and gain knowledge. 1.1 Skills 1.1.1 Follow an inquiry-based

More information

The. Languages Ladder. Steps to Success. The

The. Languages Ladder. Steps to Success. The The Languages Ladder Steps to Success The What is it? The development of a national recognition scheme for languages the Languages Ladder is one of three overarching aims of the National Languages Strategy.

More information

Purposes and Processes of Reading Comprehension

Purposes and Processes of Reading Comprehension 2 PIRLS Reading Purposes and Processes of Reading Comprehension PIRLS examines the processes of comprehension and the purposes for reading, however, they do not function in isolation from each other or

More information

Blog Post Extraction Using Title Finding

Blog Post Extraction Using Title Finding Blog Post Extraction Using Title Finding Linhai Song 1, 2, Xueqi Cheng 1, Yan Guo 1, Bo Wu 1, 2, Yu Wang 1, 2 1 Institute of Computing Technology, Chinese Academy of Sciences, Beijing 2 Graduate School

More information

Chapter 6. The stacking ensemble approach

Chapter 6. The stacking ensemble approach 82 This chapter proposes the stacking ensemble approach for combining different data mining classifiers to get better performance. Other combination techniques like voting, bagging etc are also described

More information

HIGH SCHOOL MASS MEDIA AND MEDIA LITERACY STANDARDS

HIGH SCHOOL MASS MEDIA AND MEDIA LITERACY STANDARDS Guidelines for Syllabus Development of Mass Media Course (1084) DRAFT 1 of 7 HIGH SCHOOL MASS MEDIA AND MEDIA LITERACY STANDARDS Students study the importance of mass media as pervasive in modern life

More information

Indiana University East Faculty Senate

Indiana University East Faculty Senate Indiana University East Faculty Senate General Education Curriculum for Baccalaureate Degree Programs at Indiana University East The purpose of the General Education Curriculum is to ensure that every

More information

FINAL-YEAR PROJECT REPORT WRITING GUIDELINES

FINAL-YEAR PROJECT REPORT WRITING GUIDELINES FINAL-YEAR PROJECT REPORT WRITING GUIDELINES Expected Content...2 Content summary... 4 Example Layout...5 Report Format... 6 Stylistic And Grammar Advice... 9 Useful Web Based Resources... 10 FINAL-YEAR

More information

Broad and Integrative Knowledge. Applied and Collaborative Learning. Civic and Global Learning

Broad and Integrative Knowledge. Applied and Collaborative Learning. Civic and Global Learning 1 2 3 4 5 Specialized Knowledge Broad and Integrative Knowledge Intellectual Skills Applied and Collaborative Learning Civic and Global Learning The Degree Qualifications Profile (DQP) provides a baseline

More information

LANGUAGE! 4 th Edition, Levels A C, correlated to the South Carolina College and Career Readiness Standards, Grades 3 5

LANGUAGE! 4 th Edition, Levels A C, correlated to the South Carolina College and Career Readiness Standards, Grades 3 5 Page 1 of 57 Grade 3 Reading Literary Text Principles of Reading (P) Standard 1: Demonstrate understanding of the organization and basic features of print. Standard 2: Demonstrate understanding of spoken

More information

The University of Adelaide Business School

The University of Adelaide Business School The University of Adelaide Business School MBA Projects Introduction There are TWO types of project which may be undertaken by an individual student OR a team of up to 5 students. This outline presents

More information

Writing in Psychology. General Advice and Key Characteristics 1

Writing in Psychology. General Advice and Key Characteristics 1 Writing in Psychology General Advice and Key Characteristics 1 Taking a Psychological Approach to Knowledge Like other social scientists, psychologists carefully observe human behavior and ask questions

More information

Syntax: Phrases. 1. The phrase

Syntax: Phrases. 1. The phrase Syntax: Phrases Sentences can be divided into phrases. A phrase is a group of words forming a unit and united around a head, the most important part of the phrase. The head can be a noun NP, a verb VP,

More information

Writing and Presenting a Persuasive Paper Grade Nine

Writing and Presenting a Persuasive Paper Grade Nine Ohio Standards Connection Writing Applications Benchmark E Write a persuasive piece that states a clear position, includes relevant information and offers compelling in the form of facts and details. Indicator

More information

Using Use Cases for requirements capture. Pete McBreen. 1998 McBreen.Consulting

Using Use Cases for requirements capture. Pete McBreen. 1998 McBreen.Consulting Using Use Cases for requirements capture Pete McBreen 1998 McBreen.Consulting petemcbreen@acm.org All rights reserved. You have permission to copy and distribute the document as long as you make no changes

More information

In Proceedings of the Eleventh Conference on Biocybernetics and Biomedical Engineering, pages 842-846, Warsaw, Poland, December 2-4, 1999

In Proceedings of the Eleventh Conference on Biocybernetics and Biomedical Engineering, pages 842-846, Warsaw, Poland, December 2-4, 1999 In Proceedings of the Eleventh Conference on Biocybernetics and Biomedical Engineering, pages 842-846, Warsaw, Poland, December 2-4, 1999 A Bayesian Network Model for Diagnosis of Liver Disorders Agnieszka

More information

Automated Content Analysis of Discussion Transcripts

Automated Content Analysis of Discussion Transcripts Automated Content Analysis of Discussion Transcripts Vitomir Kovanović v.kovanovic@ed.ac.uk Dragan Gašević dgasevic@acm.org School of Informatics, University of Edinburgh Edinburgh, United Kingdom v.kovanovic@ed.ac.uk

More information

PENNSYLVANIA COMMON CORE STANDARDS English Language Arts Grades 9-12

PENNSYLVANIA COMMON CORE STANDARDS English Language Arts Grades 9-12 1.2 Reading Informational Text Students read, understand, and respond to informational text with emphasis on comprehension, making connections among ideas and between texts with focus on textual evidence.

More information

The Role of Reactive Typography in the Design of Flexible Hypertext Documents

The Role of Reactive Typography in the Design of Flexible Hypertext Documents The Role of Reactive Typography in the Design of Flexible Hypertext Documents Rameshsharma Ramloll Collaborative Systems Engineering Group Computing Department Lancaster University Email: ramloll@comp.lancs.ac.uk

More information

Organisation Profiling and the Adoption of ICT: e-commerce in the UK Construction Industry

Organisation Profiling and the Adoption of ICT: e-commerce in the UK Construction Industry Organisation Profiling and the Adoption of ICT: e-commerce in the UK Construction Industry Martin Jackson and Andy Sloane University of Wolverhampton, UK A.Sloane@wlv.ac.uk M.Jackson3@wlv.ac.uk Abstract:

More information

Space Project Management

Space Project Management EUROPEAN COOPERATION FOR SPACE STANDARDIZATION Space Project Management Configuration Management Secretariat ESA ESTEC Requirements & Standards Division Noordwijk, The Netherlands Published by: Price:

More information

How To Write The English Language Learner Can Do Booklet

How To Write The English Language Learner Can Do Booklet WORLD-CLASS INSTRUCTIONAL DESIGN AND ASSESSMENT The English Language Learner CAN DO Booklet Grades 9-12 Includes: Performance Definitions CAN DO Descriptors For use in conjunction with the WIDA English

More information

Students will know Vocabulary: claims evidence reasons relevant accurate phrases/clauses credible source (inc. oral) formal style clarify

Students will know Vocabulary: claims evidence reasons relevant accurate phrases/clauses credible source (inc. oral) formal style clarify Sixth Grade Writing : Text Types and Purposes Essential Questions: 1. How do writers select the genre of writing for a specific purpose and audience? 2. How do essential components of the writing process

More information

Ontological Representations of Software Patterns

Ontological Representations of Software Patterns Ontological Representations of Software Patterns Jean-Marc Rosengard and Marian F. Ursu University of London http://w2.syronex.com/jmr/ Abstract. This paper 1 is based on and advocates the trend in software

More information

Key Steps to a Management Skills Audit

Key Steps to a Management Skills Audit Key Steps to a Management Skills Audit COPYRIGHT NOTICE PPA Consulting Pty Ltd (ACN 079 090 547) 2005-2013 You may only use this document for your own personal use or the internal use of your employer.

More information

Writer moves somewhat beyond merely paraphrasing someone else s point of view or

Writer moves somewhat beyond merely paraphrasing someone else s point of view or RUBRIC FOR EVALUATING STUDENT PHILOSOPHY ESSAYS (APPLIED ETHICS/PHILOSOPHY OF ART AND CULTURE CONCENTRATIONS) Explanation of the rubric: The rubric incorporates aspects of an already existing model for

More information

Organizing an essay the basics 2. Cause and effect essay (shorter version) 3. Compare/contrast essay (shorter version) 4

Organizing an essay the basics 2. Cause and effect essay (shorter version) 3. Compare/contrast essay (shorter version) 4 Organizing an essay the basics 2 Cause and effect essay (shorter version) 3 Compare/contrast essay (shorter version) 4 Exemplification (one version) 5 Argumentation (shorter version) 6-7 Support Go from

More information

Lesson: Editing Guidelines and Response Writing: Essay Exam (Part 1)

Lesson: Editing Guidelines and Response Writing: Essay Exam (Part 1) Put That In Writing - Level Two 113 UNIT 9 Lesson: Editing Guidelines and Response Writing: Essay Exam (Part 1) 9.1 Learning Objectives A. To know the guidelines for editing an essay. B. To edit and improve

More information

Managing Variability in Software Architectures 1 Felix Bachmann*

Managing Variability in Software Architectures 1 Felix Bachmann* Managing Variability in Software Architectures Felix Bachmann* Carnegie Bosch Institute Carnegie Mellon University Pittsburgh, Pa 523, USA fb@sei.cmu.edu Len Bass Software Engineering Institute Carnegie

More information

Data Quality Mining: Employing Classifiers for Assuring consistent Datasets

Data Quality Mining: Employing Classifiers for Assuring consistent Datasets Data Quality Mining: Employing Classifiers for Assuring consistent Datasets Fabian Grüning Carl von Ossietzky Universität Oldenburg, Germany, fabian.gruening@informatik.uni-oldenburg.de Abstract: Independent

More information

BEPS ACTIONS 8-10. Revised Guidance on Profit Splits

BEPS ACTIONS 8-10. Revised Guidance on Profit Splits BEPS ACTIONS 8-10 Revised Guidance on Profit Splits DISCUSSION DRAFT ON THE REVISED GUIDANCE ON PROFIT SPLITS 4 July 2016 Public comments are invited on this discussion draft which deals with the clarification

More information

A Teacher s Guide: Getting Acquainted with the 2014 GED Test The Eight-Week Program

A Teacher s Guide: Getting Acquainted with the 2014 GED Test The Eight-Week Program A Teacher s Guide: Getting Acquainted with the 2014 GED Test The Eight-Week Program How the program is organized For each week, you ll find the following components: Focus: the content area you ll be concentrating

More information

Rubrics for AP Histories. + Historical Thinking Skills

Rubrics for AP Histories. + Historical Thinking Skills Rubrics for AP Histories + Historical Thinking Skills Effective Fall 2015 AP History Document-Based Question and Long Essay Rubrics AP History Document-Based Question and Long Essay Rubrics The rubrics

More information

Testing Data-Driven Learning Algorithms for PoS Tagging of Icelandic

Testing Data-Driven Learning Algorithms for PoS Tagging of Icelandic Testing Data-Driven Learning Algorithms for PoS Tagging of Icelandic by Sigrún Helgadóttir Abstract This paper gives the results of an experiment concerned with training three different taggers on tagged

More information

Master Syllabus. Learning Outcomes. ENL 260: Intermediate Composition

Master Syllabus. Learning Outcomes. ENL 260: Intermediate Composition Master Syllabus ENL 260: Intermediate Composition University Studies Cluster Requirement 1C: Intermediate Writing. This University Studies Master Syllabus serves as a guide and standard for all instructors

More information

10th Grade Language. Goal ISAT% Objective Description (with content limits) Vocabulary Words

10th Grade Language. Goal ISAT% Objective Description (with content limits) Vocabulary Words Standard 3: Writing Process 3.1: Prewrite 58-69% 10.LA.3.1.2 Generate a main idea or thesis appropriate to a type of writing. (753.02.b) Items may include a specified purpose, audience, and writing outline.

More information

Progression in recount

Progression in recount Progression in recount Purpose Recounts (or accounts as they are sometimes called) are the most common kind of texts we encounter and create. Their primary purpose is to retell events. They are the basic

More information

DRAFT NATIONAL PROFESSIONAL STANDARDS FOR TEACHERS. Teachers Registration Board of South Australia. Submission

DRAFT NATIONAL PROFESSIONAL STANDARDS FOR TEACHERS. Teachers Registration Board of South Australia. Submission DRAFT NATIONAL PROFESSIONAL STANDARDS FOR TEACHERS Teachers Registration Board of South Australia Submission 21 May 2010 Table of Contents 1. EXECUTIVE SUMMARY 1 2. TEACHERS REGISTRATION BOARD OF SOUTH

More information

CHARTES D'ANGLAIS SOMMAIRE. CHARTE NIVEAU A1 Pages 2-4. CHARTE NIVEAU A2 Pages 5-7. CHARTE NIVEAU B1 Pages 8-10. CHARTE NIVEAU B2 Pages 11-14

CHARTES D'ANGLAIS SOMMAIRE. CHARTE NIVEAU A1 Pages 2-4. CHARTE NIVEAU A2 Pages 5-7. CHARTE NIVEAU B1 Pages 8-10. CHARTE NIVEAU B2 Pages 11-14 CHARTES D'ANGLAIS SOMMAIRE CHARTE NIVEAU A1 Pages 2-4 CHARTE NIVEAU A2 Pages 5-7 CHARTE NIVEAU B1 Pages 8-10 CHARTE NIVEAU B2 Pages 11-14 CHARTE NIVEAU C1 Pages 15-17 MAJ, le 11 juin 2014 A1 Skills-based

More information

Arguments and Dialogues

Arguments and Dialogues ONE Arguments and Dialogues The three goals of critical argumentation are to identify, analyze, and evaluate arguments. The term argument is used in a special sense, referring to the giving of reasons

More information

Writing Thesis Defense Papers

Writing Thesis Defense Papers Writing Thesis Defense Papers The point of these papers is for you to explain and defend a thesis of your own critically analyzing the reasoning offered in support of a claim made by one of the philosophers

More information

Using Feedback Tags and Sentiment Analysis to Generate Sharable Learning Resources

Using Feedback Tags and Sentiment Analysis to Generate Sharable Learning Resources Using Feedback Tags and Sentiment Analysis to Generate Sharable Learning Resources Investigating Automated Sentiment Analysis of Feedback Tags in a Programming Course Stephen Cummins, Liz Burd, Andrew

More information

Writing learning objectives

Writing learning objectives Writing learning objectives This material was excerpted and adapted from the following web site: http://www.utexas.edu/academic/diia/assessment/iar/students/plan/objectives/ What is a learning objective?

More information

Semantic Interoperability

Semantic Interoperability Ivan Herman Semantic Interoperability Olle Olsson Swedish W3C Office Swedish Institute of Computer Science (SICS) Stockholm Apr 27 2011 (2) Background Stockholm Apr 27, 2011 (2) Trends: from

More information

Project management by checkpoint control

Project management by checkpoint control University of Wollongong Research Online Department of Computing Science Working Paper Series Faculty of Engineering and Information Sciences 1982 Project management by checkpoint control P. J. McKerrow

More information

How to prepare for IELTS Writing

How to prepare for IELTS Writing Contents Page Details of the writing test 2 Task 1 4 Bar and line graphs, pie charts & tables 4 Process or flow charts 7 Objects/how something works 9 How to prepare for Task 1 10 Task 2 13 Questions 14

More information

ONTOLOGY-BASED MULTIMEDIA AUTHORING AND INTERFACING TOOLS 3 rd Hellenic Conference on Artificial Intelligence, Samos, Greece, 5-8 May 2004

ONTOLOGY-BASED MULTIMEDIA AUTHORING AND INTERFACING TOOLS 3 rd Hellenic Conference on Artificial Intelligence, Samos, Greece, 5-8 May 2004 ONTOLOGY-BASED MULTIMEDIA AUTHORING AND INTERFACING TOOLS 3 rd Hellenic Conference on Artificial Intelligence, Samos, Greece, 5-8 May 2004 By Aristomenis Macris (e-mail: arism@unipi.gr), University of

More information

Science Experiments and Reviews - Part 1

Science Experiments and Reviews - Part 1 REPORT WRITING KIE-34106 & SGN-16006 Tolkki 22.1.2014 Outline Part 1: The basics Part 2: Language for reports Part 3:Practical considerations Summary Free discussion Pair formation Part 1: The basics Starting

More information