Behavioral profiles: A corpus-based perspective on synonymy and antonymy *
|
|
|
- Martha Fox
- 9 years ago
- Views:
Transcription
1 Behavioral profiles: A corpus-based perspective on synonymy and antonymy * Stefan Th. Gries, University of California, Santa Barbara Naoki Otani, Saitama University 1 Introduction 1.1 Two empirical perspectives in the study of synonymy and antonymy The domain of linguistics that has arguably been studied most from a corpus-linguistic perspective is lexical, or even lexicographical, semantics. Already the early work of pioneers such as Firth and Sinclair has paved the way for the study of lexical items, their distribution, and what their distribution reveals about their semantics and pragmatics / discourse function(s). A particularly fruitful area has been the study of (near) synonyms; 1 probably every corpus linguist has come across the specific example of strong and powerful the fact that one would say strong tea but not powerful tea as well as the general approach of studying synonyms on the basis of their distributional characteristics. However, while synonymy is probably the most frequently corpus-linguistically studied lexical relation, there is by now also quite some corpus-based work on its counterpart relation, antonymy, and we will briefly discuss examples of both below. The study of the semantics of lexical items quite obviously presupposes two central concepts. First, one requires a notion of what it means to know a word, and for reasons that we will elaborate on more below, we follow here an approach by Miller and Charles (1991), who proposed the notion of a contextual representation, which is: knowledge of how that word is used (p. 4; cf. also Miller 1999); some abstraction or generalisation derived from the contexts that have been encountered (p. 5); a mental representation of the contexts in which the word occurs, a representation that includes all of the syntactic, semantic, pragmatic, and stylistic information required to use the word appropriately. (p. 26). Second, one requires some kind of scale or, more likely, multidimensional space of semantic similarity along/within which words or, more precisely, concepts or 121
2 ICAME Journal No. 34 the contextual representations of words can be placed, compared, and ordered such that, on this scale / within this space, strong would be closer to powerful than it would be to weak. Unsurprisingly, given our assumption of a contextual representation, we consider semantic similarity a function of the contexts in which words are used (cf. Miller and Charles 1991:3), from which it somewhat naturally follows that [t]he similarity of the contextual representation of two words contributes to the semantic similarity of those words, which Miller and Charles (1991:9) referred to as the weak contextual hypothesis. However, this approach only shifts the burden of difficulty from the question of how to measure the semantic similarity of, say, two words to the question of how to measure the similarity of contextual representations of two words. The discussion of this problem in general, but also its more specific application to the issue of synonymy and antonymy has mostly contrasted two different perspectives: the co-occurrence approach (cf. e.g. Rubenstein and Goodenough 1965) and the substitutability approach (cf. e.g. Deese 1962, 1964), and we need to discuss both of them briefly The co-occurrence approach The co-occurrence approach is based on assumptions, or even axioms, that are very dear to probably nearly all corpus linguists. Corpus linguistics is inherently a distributional discipline and the study of lexical semantics with corpora is no exception: corpora do not provide meanings or functions that can be readily extracted and compared, but only the distributions of formal elements morphosyntactic and lexical (and, depending on the corpus, sometimes phonological or orthographic) elements so meanings and functions must be inferred from the distribution(s) of formal elements within their contexts. The assumption underlying the co-occurrence approach is that the distributional characteristics of the use of an item reveals many of its semantic and functional properties and purposes, an assumption that has been made in various different sources: Firth (1957:11) famously stated [y]ou shall know a word by the company it keeps ; just as famously, Bolinger (1968:127) stated that a difference in syntactic form always spells a difference in meaning ; Harris (1970:785f.), while much less quoted in this connection, asserted this even more explicitly: [i]f we consider words or morphemes A and B to be more different in meaning than A and C, then we will often find that the distributions of A and B are more different than the distributions of A and C. In other words, difference of meaning correlates with difference of distribution. 122
3 Behavioral profiles: A corpus-based perspective on synonymy and antonymy More recently, Cruse (1986:1) stated, the semantic properties of a lexical item are fully reflected in appropriate aspects of the relations it contracts with actual and potential contexts, and with a more syntactic focus, Hanks (1996:77) wrote the semantics of a verb are determined by the totality of its complementation patterns. This kind of logic has been applied especially fruitfully in the domain of synonymy, where contextual information of two kinds has been particularly useful and revealing: collocational information: what are the words that are modified by strong and powerful (Church et al. 1991), by absolutely, completely, and entirely (cf. Partington 1998: Section 3.6, also cf. Ch. 2), by big, large, and great (cf. Biber et al. 1998: Section 2.6), or by alphabetic and alphabetical and many other -ic/-ical adjective pairs (cf. Gries 2003); syntactic information: what are the preferred grammatical associations of quake and quiver (cf. Atkins and Levin 1995), of little vs. small or begin vs. start (cf. Biber et al. 1998: Sections 4.2 and 4.3 respectively), of causative get and have (cf. Gilquin 2003), of Mandarin lian constructions (cf. Wang 2006), of several Finnish verbs meaning think (cf. Arppe and Järvikivi 2007 and Arppe 2008), etc. An experimental study that is often mentioned in connection with synonymy/ antonymy is that of Rubenstein and Goodenough (1965), who had subjects generate sentences involving 130 target words from 65 word pairs and compared the collocational overlap of sentences created for words of the 65 pairs using an intersection coefficient, obtaining results that are compatible with a contextual approach, at least for highly synonymous words. Especially with regard to the study of antonymy, a more specialized kind of co-occurrence approach has also been promoted. This more specialized approach, when applied to two antonymous words x and y, does not include and compare all collocates of x and y it involves counting how often x and y themselves co-occur within the same sentence, which in turn by definition increases the amount of collocational overlap that is at the heart of the co-occurrence approach. For example, Charles and Miller (1989) reported general sentential co-occurrence counts of big, large, little, and small in the Brown corpus, which are larger than would be expected by chance. In addition, Justeson and Katz (1991) found that antonymous words also prefer to occur in substitution patterns or contrastive parallel phrases (cf. Fellbaum 1995 for similar findings and support in favor of a sentential co-occurrence approach). As a last example, in a 123
4 ICAME Journal No. 34 very interesting study Jones et al. (2007) even used such kinds of contrastive constructions to successfully identify what has been referred to as direct antonyms (cf. Gross, Fischer and Miller 1989) or canonical antonyms, namely pairs of antonyms that are strongly associatively paired (e.g. wet vs. dry) whereas their (near) synonyms are not (e.g. moist vs. dry); the contrastive constructions used this way x and y alike, between x and y, both x and y, either x or y, from x to y, x vs. y and whether x or y The substitutability approach The substitutability approach has been most strongly advocated by Charles and Miller in several different papers. It can be summarized as follows: (1) collect a set of sentences using item A; (2) collect a set of sentences using item B; then (3) delete A and B, shuffle the resulting contexts; and (4) challenge subjects to sort out which is which. The more contexts there are that will take either item, the more similar the two sets of contexts are judged to be. (Miller and Charles 1991:11) (The sentences mentioned in (1) and (2) were either taken from a corpus or generated by subjects in a pilot study.) Charles and Miller (1989) and Miller and Charles (1991) use this measure of contextual similarity and find that it is strongly correlated with experimentally-obtained semantic similarity ratings and conclude that it appears that a measure of contextual similarity based on substitutability gives better predictions than did Rubenstein and Goodenough s (1965) measure of contextual similarity based on co-occurrence (p. 17; cf. Charles 2000 for a similar conclusion). In addition, they criticize co-occurrence approaches for two shortcomings. First, they dismember the contexts they are supposed to represent, and while Miller and Charles do not explain what exactly they mean by that, it is reasonable to assume that they refer to the fact that a simple collocational count would disregard syntactic structures, which a sortingcontexts account would preserve. Second, they claim that the fact that co-occurrence accounts yield high similarity results for antonyms is problematic because [a]ntonyms have contrasting meanings not just zero similarity, but negative semantic similarity, if that is possible and thus constitute counterexamples to a co-occurrence approach Adjectives of SIZE: Some previous findings One lexical field that has received a lot of attention both in general and in corpus semantics is that of SIZE, presumably because it includes two pairs of canonical 124
5 Behavioral profiles: A corpus-based perspective on synonymy and antonymy antonyms and many studies devote at least some space to big, little, large, and small. In this section, we very briefly review some of the findings, very briefly, because most of them lead to very similar conclusions. In one of the earliest empirical studies, Deese (1964) used a word association method and finds what is intuitively obvious to every native speaker of English: big and large are very similar in meaning (their size ratings are 4.3 and 4.5 respectively, while great scores the slightly larger size rating of 5.1), as are little and small (their ratings are 1.8 and 1.9 respectively, while tiny is associated with even smaller sizes: its rating is 0.8). In addition to that, Deese also found that the by far most natural pairings of antonyms are big vs. little and large vs. small. He attributed these correlations to partial contextual equivalences (1964:356), a formulation which does not make it all that clear whether this would translate into a co-occurrence or substitutability account. While Charles and Miller were in general in favor of a substitutability approach based on sorting sentence contexts as outlined above, their 1989 study shows that this approach does not explain particularly well how canonical antonym relations are established. Their alternative is essentially a co-occurrence approach again. They showed that big and little as well as large and small tend to occur in one and the same sentence in the Brown corpus with a probability that is much larger than chance would predict and concluded (1989:374) that the clang association between direct antonyms [ ] is a consequence of frequently perceiving and using these words together in the same syntactic structures. The above-mentioned study Justeson and Katz (1991:5) also found more sentences in which antonyms co-occurred than would have been expected by chance as well as the generally accepted association of big and large to little and small respectively. 4 Finally, Jones et al. (2007) focused on antonym canonicity, but also discussed canonical-antonym relations among adjectives of size. Their findings are summarized in Figure 1 (taken from Jones et al. 2007:148), where the numbers and percentages indicate the number of frame types and the token percentages of joint occurrences in Google searches: 125
6 ICAME Journal No. 34 Figure 1: Summary of Jones et al. (2007) findings regarding four adjectives of size With maybe one exception, the results are very encouraging: in line with nearly all previous work, large and small exhibit a strong and reciprocal attraction, and little prefers big as an antonym. The only drawback is that big does not prefer little. 1.3 The present study The studies mentioned above have already gone very much beyond previous introspective accounts of the meanings and functions of lexical items. However, we think that some corpus-linguistic approaches still exhibit some areas of potential improvement and that the notion of contextual representation is worthy of more study. First, there is a great degree of consensus that synonymy and antonymy both involve a large degree of semantic similarity and, thus, a high degree of common predictable distributional behavior. However, to our knowledge there is hardly any corpus-based work that explores synonyms and antonyms within a semantic field together; Charles and Miller (1989) and Jones et al. (2007) are laudable exceptions. In addition to the above, second, it is probably fair to say that most corpusbased studies in the domains of synonymy and antonymy focus on individual 126
7 Behavioral profiles: A corpus-based perspective on synonymy and antonymy pairs of synonyms and antonyms only and usually do not take larger sets of synonymous/antonymous words into consideration. Third, many studies focus only on the base forms of the words in question as opposed to including, or differentiating between, different inflectional forms of the relevant lemmas. Fourth, many corpus-linguistic studies of lexical relations until relatively recently focus only on collocational aspects or only on syntactic patterning, but do not combine lexical and syntactic distributional characteristics. Thus, although corpus data provide a wealth of distributional characteristics, many studies do not utilize most or even all of the available information. Some early studies that use more than just collocational or just syntactic information are cognitive-linguistic in nature. For instance, Schmid (1993) studied many lexical and syntactic characteristics of begin and start in an exemplary fashion. Also, in cognitive linguistics, Kishner and Gibbs (1996) studied collocations and syntactic patterns of just, and Gibbs and Matlock (2001) investigated uses of the verb make. Corpus-linguistic studies that are also appreciably broader in scope are Atkins (1987) study of risk, Hanks s (1996) study of urge, and Arppe and Järvikivi (2007), who all involved collocate and/or colligation analysis at an otherwise rare level of detail. Finally, and in addition to the range of data included into an analysis, some of the studies also do not analyze their distributional data in the most revealing way but rather restrict themselves to observed frequencies of co-occurrence of a lexical item and a collocate or a syntactic pattern. Unfortunately, this is even true of some of the groundbreaking work of, for instance, Atkins (1987). In this study, we intend to go at least in some ways beyond what some previous works have done. As for the first three points above, we do not just explore pairs of antonymous adjectives, but we also include synonymous adjectives in our data. More specifically, we explore the semantic field of SIZE by studying the distributional behavior of the adjective set {big, great, large} as well as the antonymous set {little, small, tiny}, including their comparative and superlative forms on the basis of data from the British Component of the International Corpus of English. As for the fourth point, we do not restrict our attention to collocations or colligations only or a small set of linguistic features characterizing the use of each of the word forms included, but we use a relatively recent approach in corpusbased semantics, namely behavioral profiling. In addition, and as for the last point, our analysis is not just based on frequencies/percentages of co-occurrence, but on hierarchical agglomerative cluster analysis based on the similarity of co-occurrence vectors. 127
8 ICAME Journal No. 34 Before we proceed to the actual analysis, the latter two aspects must be clarified, so let us briefly explain the behavioral profile (BP) approach. (For reasons of space, we cannot discuss all the details of the BP approach here but need to refer to previous works. For applications of the BP approach to synonymy, cf. Divjak (2006) as well as Divjak and Gries (2006); for applications of the BP approach to polysemy, cf. Gries (2006) and Berez and Gries (2009); for experimental validation, cf. Divjak and Gries (2008); for an overview, cf. Gries and Divjak (2009).) The application of the BP method to synonyms/antonyms involves the following steps: i. the retrieval of (a representative random sample of) all instances of the lemmas of the synonyms/antonyms to be studied from a corpus in the form of a ii. concordance; a (so far largely) manual analysis and annotation of many properties of each match in the concordance of the lemmas; these properties are, following Atkins (1987), referred to as ID tags and include morphological, syntactic, semantic, and collocational characteristics (cf. below for details and Table 1 for an excerpt of our data); iii. the conversion of these data into a co-occurrence table that provides the relative frequency of co-occurrence of each lemma with each ID tag; the vector of these co-occurrence percentages for a lemma is called that lemma s behavioral profile (cf. below for details and Table 2 for an excerpt of our data); iv. the evaluation of the table by means of exploratory and other statistical techniques, especially hierarchical agglomerative cluster analysis. Before we begin to discuss the data, methods, and results, it is important to point out that it is quite difficult to anticipate what kinds of (meaningful?) results to expect. While it is clear that big, large, and great are synonyms, that little, small, and tiny are their synonymous antonyms, and that big and little as well as large and small are the canonical antonym pairs, it is not obvious, say, what kind of clustering to expect a synonym-based clustering? an antonym-based clustering? a mixture of the two? Right now, we can therefore only state that clustering results which appear to order words randomly and/or do not find the canonical antonyms would undermine the BP approach, but more specific predictions are hard to come by. This is for several reasons: first, while there is already some BP-based work now on synonymous words (between and within languages) and polysemous words, there is as yet no such work on antonyms 128
9 Behavioral profiles: A corpus-based perspective on synonymy and antonymy the corpus-based work on antonyms that exists is more restricted in terms of the number of linguistic characteristics it includes. Second, from the perspective of the degree to which distributional similarities/differences reflect functional similarities/differences, antonyms constitute an interesting case: intuitively at least, the degree of functional similarity of antonymous words should be smaller than that of synonymous words (because antonyms refer to the same semantic continuum/dichotomy, but to opposite positions/ends whereas synonyms refer to the same semantic continuum/dichotomy and to the same positions/ends), but maybe larger than that of the senses of polysemous words (because different senses of polysemous words can refer to very different semantic domains even if those may be metaphorically or otherwise related), with the caveat of the fuzziness that comes with decisions concerning distinctions between, and relatedness of, senses. On the other hand, the adjectives included in this study of course also exhibit some degree of polysemy, the evaluative sense of great probably being the most blatant case. It is therefore not immediately obvious how well, if at all, the BP approach can handle the distributional characteristics of antonyms, especially when, and this is a third point, there is no BP-based study in which synonyms and antonyms are combined. Given all this, while the BP approach has been validated and experimentally supported for synonymous and polysemous words, this paper is still largely exploratory and hopes to shed some first light of the joint similarity of synonyms and antonyms. The remainder of this paper is accordingly structured as follows: in Section 2, we explain how the data were retrieved from the corpus, how they were coded for the subsequent analysis, and how they were analyzed quantitatively. In Section 3, we present the results of several different kinds of analysis of the data; the main differences will revolve around whether both morphosyntactic and semantic data are included in the statistical evaluation. In Section 4, we will very briefly summarize the main points and conclude. 2 Data and methods 2.1 Data We first used an R script (cf. R Development Core Team 2008) to retrieve all matches of the lemmas big, great, large, little, small, and tiny (plus their comparative and superlative forms) when tagged as adjectives within their parse units as well as their and their clause s annotation from the British Component of the International Corpus of English. The data were exported into a spreadsheet software and then annotated for a variety of features: 129
10 ICAME Journal No. 34 morphological features such as the tense, voice, and transitivity marking of the finite verb of the clause in which the adjective is used, etc.; syntactic features such as the syntactic frame in which the adjective is used (attributively vs. predicatively), the clause type in which the adjective is used (main vs. subordinate plus different kinds of subordinate clauses), the function of the clause in which the adjective is used, etc.; semantic features of the noun the adjective is modifying (count vs. noncount, concrete vs. abstract vs. human vs. organization/institution vs. quantity vs. ongoing processes vs. punctual events etc.); how SIZE is modified (literally vs. metaphorically vs. quantitatively vs. evaluatively), etc. The resulting spreadsheet consisted of 2,073 rows (of matches) and 27 columns (of annotation, including case numbers as well as preceding and subsequent contexts). 5 It was evaluated statistically using the interactive R script BP 1.0 (Gries 2008), which has been written by, and is available from, the first author. This statistical evaluation will be explained in the following section. 2.2 Methods The input to BP 1.0 consists of a table that contains each match of one of the adjective forms in a separate row and the annotation of each linguistic feature in a separate column. A random excerpt of this table is shown in Table 1, with the matches they belong to represented in (1): Table 1: Excerpt of the table entered into BP 1.0 Form Syntax Modifiee_count Modifiee_what Clause function Clause level bigger attributive count abstract OD depend large attributive non-count quantity NPPO depend little attributive count organization/institution PU main (1) a. I guess size is a bigger problem actually than funding (S1B-076) b. have to be transmitted in the UHF portion of the spectrum of the large of bandwidth required (W2B-034) c. [ ] our own little <,,> magic circle or whatever it is [ ] (S1A-027) The script converts this input table into an output table of behavioral profiles, an excerpt of which is shown in Table 2. This table, for example, shows that 87 percent of the uses of big were attributive while the remaining 13 percent were 130
11 Behavioral profiles: A corpus-based perspective on synonymy and antonymy predicative, or that the percentages with which big and bigger are used for count and non-count modifiees are nearly identical but rather different from those of great: 94 percent: 6 percent vs. 95 percent : 5 percent vs. 71 percent : 29 percent; note how, in each column, the sum of the first three rows and the sum of the last two rows is 1, which means the within-id tag percentages add up to 100 percent. These columns, i.e. the vectors of percentages, are the behavioral profiles of the word forms in the table header. Table 2: Excerpt of the table returned by BP 1.0 ID tag ID tag level big great large bigger Syntax adverbial attributive predicative Modifiee_count count non-count While this first output of BP 1.0 is of course just a reorganization of the data, albeit a useful one, the more revealing next step is to compare the behavioral profiles of the adjective forms to each other. As in nearly all previous BP studies, we used a hierarchical agglomerative cluster analyses, but we had to decide on three parameters: which ID tags to include in the comparisons all of them? only all/some morphosyntactic ones? only all/some semantic ones? some combination of both? how to compare the BP vectors mathematically Euclidean distances? City-Block metric? correlational measures? how to amalgamate sufficiently similar (clusters of) BP vectors into clusters average similarity, complete similarity, Ward s method? As for the ID tags, we decided to run three analyses: one with all ID tags, one with all syntactic ID tags, and one with all semantic ID tags. As for the computation of the cluster analyses, we decided to use Canberra as a similarity measure and Ward s method as an amalgamation rule. These are the settings used in all earlier BP studies, which makes the approach maximally comparable to previous work and also means that the present study uses the same parameters that 131
12 ICAME Journal No. 34 were successfully validated in two experimental studies in Divjak and Gries (2008). In the following section, we will discuss the results of these three cluster analyses. 3 Results 3.1 Results for all ID tags The results of the cluster analysis for all ID tags are shown in Figure 2, with several relevant clusters highlighted (adjective forms not shown tinier and least were not found with the relevant part-of-speech tag): Figure 2: Dendrogram of the cluster analysis of the BP vector involving all ID tags Given the structure of the dendrogram, these findings are surprisingly meaningful on a finer level of granularity and rather interesting. One striking finding is how different dimensions/parameters appear to have given rise to the above clustering solution. These parameters include oppositeness of meaning, same- 132
13 Behavioral profiles: A corpus-based perspective on synonymy and antonymy ness of meaning, and morphological parameters, i.e., all the parameters that figured in the choice of adjectives to study (more on this below). Starting from the bottom, the first cluster to be formed is based on sameness of meaning: tiny of course means very small. Then, focusing on the left side of the dendrogram, the analysis successfully identifies what in several other corpus-based and experimental studies are considered the canonical antonym pairs {big little} and {large small}. In addition, these two lowest-level clusters are clustered together with another adjective in the base form, great, which makes the larger leftmost cluster morphologically perfectly homogeneous as it contains only adjective base forms (but only nearly all, since tiny is not included). Then, there is an early lowest-level cluster that combines comparatives of at least one canonical antonym pair, {larger smaller}, which is joined with, again, a comparative, and again a form of the lemma GREAT, namely greater. Thus, this larger cluster is not only also morphologically homogeneous, but it has the same substructure as the base form cluster on the left in that the more polysemous (since evaluative) adjective great is somewhat less similar: {{LARGE SMALL} GREAT}. Turning to the final cluster that is highlighted in Figure 2, it involves both a high degree of semantic sameness all forms are from the {big great large} end of the size spectrum and a slightly smaller degree of morphological coherence three of four forms are superlatives, and the fourth form is a comparative. The last cluster in Figure 2 consists of less and tiniest, each of which occurs only once in the data so that this cluster can safely be disregarded. Given these results, a sceptic may argue that the results are in fact not particularly interesting: after all, the three different parameters with regard to which the dendrogram can be interpreted sameness of meaning, oppositeness of meaning, and morphological form are the ones that synonyms and antonyms of different inflectional forms from the semantic field of SIZE would be expected to give rise to, and these parameters allow for a wide range of seemingly encouraging solutions, especially since sameness and oppositeness of meaning cover both ends of a single continuum and appear to render the approach somewhat futile. It is worth recognizing, however, that this view is mistaken, which can be easily demonstrated: if, as a sceptic may claim, any dendrogram was going to make some kind of sense because there are so many parameters in the light of which the cluster structure could be interpreted favorably then it should be possible to just randomly reorder the adjectives in the dendrogram and still find a lot of interesting things to say; consider the two panels of Figure 3: 133
14 ICAME Journal No. 34 Figure 3: Dendrograms of the BP vectors with random adjective assignments In the upper panel, we kept the existing dendrogram structure (i.e., the divisions of the branches etc.) but we reordered the adjectives in a random fashion. It is immediately obvious that there are of course still somewhat synonymous and somewhat antonymous clusters, but it is equally obvious that this random solution is much worse than the authentic one on all counts: the canonical antonyms are not identified (with any morphological forms), the amount of synonym clusters is low, and morphologically the solution is a mess. 134
15 Behavioral profiles: A corpus-based perspective on synonymy and antonymy In the lower panel, we illustrate that a cluster analysis can also return dendrograms with very different kinds of structure, which are then also much less revealing to interpret. Instead of a tree with several fairly delimited substructures as obtained in the authentic data, we here get a long-chained substructure which, with the exception of greatest and biggest, exhibits hardly any structure, given the long and rather indiscriminate chaining of adjectives. Thus, the structure and patterns of the authentic data in Figure 2 are not just a to-be-expected artifact of a powerful clustering algorithm the range of possible clusters solutions is huge but the one that was actually obtained allows for meaningful interpretation on many different levels. We therefore consider these encouraging and interesting results. 3.2 Results for all morphosyntactic ID tags In this section, we will very briefly discuss the results of the cluster solution obtained on the basis of only the morphosyntactic ID tags. However, since the morphosyntactic ID tags constitute the vast majority of all ID tag levels (520 out of 539), the result of this analysis is nearly identical to that of Figure 2; we therefore do not show the dendrogram here. The only difference to Figure 2 is that biggest moves from largest to become part of the cluster {smallest tiny}. In this case, the explanatory parameters do motivate both the first and the second solution: in Figure 2, biggest was grouped together with another superlative form from its synonym set (largest) whereas in the syntax-only tree biggest is grouped together with a strong synonym cluster that contains its canonical antonym in the superlative form (smallest). However, since this is the only change in an otherwise identical solution and since here even both solutions make sense different kinds of sense, though the BP approach at least does not suddenly yield a very counterintuitive result. 3.3 Results for all semantic ID tags Let us now finally turn to the last cluster analysis, which is based on our semantic annotation only. Recall that this analysis is based on only 19 ID tag levels and that these are the ID tag levels which were more problematic to code. Consider Figure 4 for the resulting dendrogram: 135
16 ICAME Journal No. 34 Figure 4: Dendrogram of the cluster analysis of the BP vector involving semantic ID tags The dendrogram is a bit different than the one based on all the data or the syntactic data, but the results are still far from erratic. In fact, the dendrogram structure is maybe even a bit more revealing in terms of canonical antonymy (but maybe less revealing with regard to other things): supporting Jones et al. (2007), who find that among the size adjectives, large and small are most canonical, the pair large/small is here instantiated in the earliest cluster, which contains these adjectives base and comparative forms, and the pair big/little is instantiated by another similar cluster from which only less is missing. We still obtain the cluster {smallest tiny}, which is now, just like the strong correlation of the canonical antonyms, attested in all three solutions. One interesting change compared to the previous dendrogram involves the lemma GREAT, whose three inflectional forms now constitute one cluster. Another change is the location of biggest, which does not appear to reveal much anymore the only positive thing that can be said about it is that it is closest to a cluster involving an adjective from its synonym set that includes a superlative, but we have to admit that this seems somewhat far-fetched. Also, largest is not positioned ideally, but at least 136
17 Behavioral profiles: A corpus-based perspective on synonymy and antonymy part of the cluster that also contains smallest, the superlative form of its canonical antonym. Finally, less and tiniest can again be disregarded given their rarity. Again, the results are rather encouraging for the BP approach. Even with a much smaller number of ID tag levels to work with and a coding that may be a bit less reliable than the syntactic annotation that comes with the corpus, the results strongly reflect distributional similarities in the form of clusters that reflect previous corpus-based findings and experimental results regarding distributional similarities and canonical antonymy. 3.4 Post hoc results for all ID tags In this section, we would like to briefly explore one useful added benefit of the BP approach. In addition to the multivariate analysis of different words, it is also very easy to determine how words differ from each other in a pairwise fashion, especially words that have been clustered together i.e. are highly similar but are of course still different. Divjak and Gries (2006) use z-scores to this end, but a conceptually much more straightforward approach is discussed in Divjak and Gries (2009). This approach is based on the recognition what BP vectors are; recall from Table 2 that BP vectors are simply aligned observed percentages of co-occurrence. For example, Table 2 above already revealed that, with regard to the countability of the modifiee, big is more similar to large than to great. Thus, if one computes the pairwise differences between percentages of ID tag levels and sorts the vectors according to the size of the difference, then one can see where two words exhibit their most marked differences. Consider Figure 5 for this kind of comparison of little and big (using all ID tags): 137
18 ICAME Journal No. 34 Figure 5: Snakeplot of differences between BP vector values: % big - % little 138
19 Behavioral profiles: A corpus-based perspective on synonymy and antonymy On the y-axis, we plotted the differences between the observed percentages between the two vectors for little and big. Thus, the further away from 0 (the center of) an ID tag level is plotted, the more distinctive it is for little or big. For example, 29 percent of the modifiees of big are abstract, but only 13 percent of the modifiees of little are, too, so Modifiee_what: abstract is plotted (centered) at It is therefore immediately obvious that, while big and little are very similar to each other, they differ most strongly with regard to the semantic characteristics of their modifiees (and this kind of pronounced difference is why the semantic clustering could yield a very good result with far fewer ID tags): for instance, big is preferred with nouns that refer to abstract things, events, or organizations/institutions and metaphorically in predicative constructions. In comparison with big, little is preferably used for the actual physical size of (referents of) nouns referring to concrete things, in particular humans and other animate entities, and in attributive constructions. Similar comparisons are of course available for all desired pairwise comparisons but, given space constraints, one additional example shall suffice. In Figure 5, we compared big to its canonical antonym little so as to illustrate how snakeplots reveal patterns it makes sense to now also compare big to its closest synonym in the present data, large; consider Figure 6: 139
20 ICAME Journal No. 34 Figure 6: Snakeplot of differences between BP vector values: % big - % large 140
21 Behavioral profiles: A corpus-based perspective on synonymy and antonymy It is immediately clear why big was clustered with little and not with large: the deviations from 0 are much larger in Figure 6 than in Figure 5, since some differences become as small as But how do the two words differ? The major difference is that large is preferably used to modify count nouns that refer to quantities, but also organizations/institutions and animate entities (not humans) big, by contrast, modifies rather non-count nouns and co-occurs with abstract nouns, but also humans and actions. Interestingly, big disprefers attributive use compared to both little and large, but, compared to large, big has no similar preference for predicative use. 6 4 Discussion 4.1 Interim summary and initial evaluation In terms of the structure of the lexico-semantic space of the adjective studied here, the BP approach alone yields many results that were previously obtained in very many (methodologically) different studies, but nearly all of which make a lot of sense. In a completely data-driven fashion but without any subject ratings etc. the dendrogram finds that tiny means smallest (cf. Deese s 1964 ratings); the dendrogram returns the canonical antonym pairs big/little and large/ small; (cf. most studies on canonical antonyms, but recall that e.g. Jones et al. s (2007) approach was less inclined to associate big with little; the morphologically clean clusters reflect subjects preference to respond to a stimulus words with one of the same inflectional form (cf. Ervin-Tripp 1970). And, to a considerable degree at least, these findings are obtained both from a large number of syntactic ID tags, from a small number of semantic ID tags, and the combination of both. In addition and as a by-product, the pairwise differences of ID tags allow us to immediately identify which distributional differences between the near synonyms are most pronounced (and would maybe even be most useful to learners). While similarly good results have already been obtained for polysemous words and sets of only synonymous words before (and in the case of the latter these results could even be experimentally validated), the present study is the first to have applied BPs to a potentially more noisy/chaotic data set in which synonyms and both direct and indirect antonyms were included but the results still turned out to be very reasonable. 141
22 ICAME Journal No Theoretical implications This successful application is of course good news for behavioral profilers in particular, but also for corpus linguists in general since it strongly supports the working assumption underlying most corpus-based work formal differences reflect functional differences. However, we feel that the BP approach takes this approach more seriously and, thus, provides a better link to recent theoretical and empirical developments than simple collocational or colligational studies can offer. The main difference between the BP approach (or its earlier but similar precursors) and other simpler methods is the range of contextual/distributional characteristics that are taken into consideration. Rather than looking at one position (e.g. R1 for adjectives), at a window around a search word (4L to 4R), one syntactically-defined slot (e.g. the noun slot in an NP, the direct object slot of a verb, or one vacant slot in a contrastive construction), the BP approach includes dozens of linguistic features from different linguistic levels. This is not only desirable because it is per se more comprehensive. It is also desirable because this property of the BP approach and, more generally, this kind of perspective makes it possible to relate Miller and Charles s (1991) notion of a contextual representation to more recent development in corpus linguistics; connect this approach to recent cognitive and psycholinguistic trends, in which corpus linguists should be very interested; at least in some sense, re-evaluate the dichotomy between co-occurrence and substitutability. Recall Miller and Charles s (1991:26) definition of a contextual representation: a mental representation of the contexts in which the word occurs, a representation that includes all of the syntactic, semantic, pragmatic, and stylistic information required to use the word appropriately. It may not be immediately obvious, but this quote is in fact remarkable in the way it connects cognitive or psycholinguistic approaches to language to corpus-linguistic approaches a possibility that many older-school corpus linguists are reluctant to entertain. One particularly clear connection of this notion of contextual representation is to Hoey s theory of lexical priming. Consider this recent quote: The notion of priming as here outlined assumes that the mind has a mental concordance of every word it has encountered, a concordance that has been richly glossed for social, physical, discoursal, generic and interpersonal context. This mental concordance is accessible and can be processed in much the same way that a computer concordance 142
23 Behavioral profiles: A corpus-based perspective on synonymy and antonymy is, so that all kinds of patterns, including collocational patterns, are available for use. (Hoey 2005:11) While Hoey does not refer to Miller and Charles s work, we still find the overlap between this metaphor of a comprehensive mental concordance and contextual representations rather striking, and it is exactly this richly glossed mental concordance, this contextual representation, that the BP approach tries to approximate. Given this close connection between contextual representations and Hoey s new and exciting new framework, exploring these notions is therefore still a very relevant issue. One way of exploration is concerned with the question of how the contextual representations that give rise to subjects behavior in experiments and the patterns in the corpus data come about in the first place, a question that the previous literature referred to so far does not discuss in much detail. Consider Charles (2000:507): Similarly, the contextual representation of a word is not an actual linguistic context but an abstraction of information in the set of natural linguistic contexts in which a word occurs. [ ] Although the process that is used to derive a contextual representation from multiple encounters is difficult to describe, many derivatives have been characterized. (our emphasis) While the quote from Charles states that the process resulting in a contextual representation is still difficult to describe, recent developments in, among other areas, cognitive linguistics, first language acquisition, and phonology, have made substantial progress in this regard (both theoretically and empirically). Our own take on this is that an exemplar-based framework is probably the best way to conceive of the generation of a contextual representation or a mental concordance. Consider Dabrowska (2009) as one recent cognitive-linguistic example that is exemplar- and usage-based. Dabrowska investigates the meanings of rare verbs of walking or running such as stagger, hobble, plod, or saunter and shows that verbs are reliably associated with semantic and collocational preferences of the main arguments and complements of the verbs. Learners acquire the meanings of words on the basis of contextual and distributional cues provided in usage events by (i) storing lexically-specific knowledge of semantic and collocational preferences and (ii) forming more phonologically and semantically abstract generalizations or schemas on the basis of recurrent exposure to particular components of meaning. 143
24 ICAME Journal No. 34 It is immediately apparent how well this characterization can be mapped onto the adjectives of size studied here. Following this logic and earlier BPbased exploration of synonyms by Divjak and Gries (2008), we here also argue for a view [...] in which acquisition depends on exemplar learning and retention, out of which permanent abstract schemas gradually emerge and are immanent across the summed similarity of exemplar collections. These schemas are graded in strength depending on the number of exemplars and the degree to which semantic similarity is reinforced by phonological, lexical, and distributional similarity. (Abbot-Smith and Tomasello 2006:275) (Cf. also, for example, Bybee 2000; Pierrehumbert 2001; Goldberg 2006: part II). This hybrid view implies that acquiring contextual representations of the size adjectives involves memorizing a cloud of exemplars in multidimensional syntactic-semantic space. Whenever a speaker encounters yet another instance of these adjectives, the memory representation of these adjectives and their actual uses the mental concordance is updated with the information contained in the most recent usage event. But recall that Charles as well as Abbot-Smith and Tomasello also speak of abstraction: of course not all actual instances are remembered. Memory traces may decay over time, and while particular salient usage events may remain accessible, what remains for the most part may well be generalizations based on many similar but now forgotten usage events. These generalizations are assumed to involve probabilistic knowledge of distributional patterns (in this case, for example, the combination of semantic properties of modifiees with grammatical co-occurrences or colligations), and in the BP approach these straightforwardly correspond to the distributions of ID tag levels. On this view, the results of both experimental techniques and authentic language production as captured in corpora result from speakers accessing traces of memory representations for the use of the adjectives. More specifically, in contexts-sorting tasks as advocated within the substitutability approach, the contextual clues provided in the contexts facilitate access of a particular subregion of the syntactic-semantic space containing a cloud of traces for adjectives that were used in a similar way. The likelihood that subjects produce the same adjective thus increases strongly, and the same sub-region is accessed in the case of language production when a speaker s processing system seeks for the right lexical and syntactic material to express a combination of concepts. 144
25 Behavioral profiles: A corpus-based perspective on synonymy and antonymy From this perspective, the conceptual distinction between co-occurrence and substitutability approaches pretty much collapses: the collocation data from cooccurrence studies and the sorting data from substitutability approaches are ultimately due to the speakers processing the same regions in syntactico-semantic space. A bit polemically, all that remains is the methodological distinction between noisy co-occurrence data and nicely controlled experimental data. Crucially, however, the BP approach with its multidimensional annotation of every single corpus example captures what both co-occurrence and substitutability approaches tap into, the richly glossed mental concordance with distributional data of past usage events. Given the consistent empirical utility of the BP approach for several lexical relations, its ability to provide many results previously obtained separately, its experimental validation for synonyms, its compatibility with the most recent psycholinguistic developments from language acquisition after all, the observed patterns must be learnable somehow and the new perspective it offers, we hope that this study has contributed to BPs being used to explore an even larger variety of linguistic phenomena. Notes * We thank Carita Paradis for her valuable comments and feedback. The usual disclaimers apply. 1. To avoid unnecessary prolixity, we will from now on just use the term synonyms to refer to words that are semantically similar and may on at least one occasion be used functionally interchangeably. 2. Our characterization of two empirical perspectives is of course not comprehensive and heavily biased towards the methods employed in studies of synonymy and antonymy as well as, for reasons that will be discussed below, towards studies that involve Miller and Charles s notion of a contextual representation. Naturally, many methods other than the two we focus on here have been employed, such as word association tests, judgment tests, forced-selection tests, gap-filling tests, and no doubt many others. 3. They note that [t]his situation had been noted before, of course, but was dismissed with a rationalisation that antonyms may not be as different in meaning as they seem (1991:26) but unfortunately, they do not really provide any argument why this rationalisation, which is after all not that uncommon, may not be correct. 4. Jones (2001) again replicates this by then already comprehensively documented finding, but there are also some potential problems with his 145
26 ICAME Journal No. 34 approach. First, he argues that his findings are more trustworthy than Justeson and Katz s: For example, Justeson & Katz calculate an Observed/Expected ratio of 19.2 for happy/sad, based on individual frequencies of 89 and 32 respectively and observed co-occurrence in just one sentence. My corpus yields an Observed/Expected ratio of 6.8 for happy/sad, based on individual frequencies of 28,217 and 9,420 respectively and observed co-occurrence in 140 sentences. It seems fair to conclude that statistics derived from the latter corpus will be more trustworthy. However, we consider this far from fair to conclude since even if Jones s corpus is larger, it is also a much more restricted convenience sample. Jones (2001) uses a 280m words corpus, whereas Justeson and Katz (1991) use the 1m word Brown corpus, but Jones s corpus contains journalese from a single newspaper only, whereas the Brown corpus has been sampled much more widely and carefully. Second, he criticizes previous studies antonym lists as too intuitionbased and outdated, but his own solution is to just make up a list on his own, which, even if that list is praised as customised to meet the demands of this research and relevant to a 21st Century investigation of antonymy (p. 299) is certainly not as objective as, say, Deese s more empirical approach. 5. The annotation of the data was mostly unproblematic, since we could rely on the remarkable coding of the ICE-GB corpus compilers. The only tricky part of the annotation was the annotation of the semantic characteristics of the modifiee. While disagreements between the authors were resolved in discussion, there are of course ambiguous cases which were hard to decide. While we tried to code those to the best of our abilities and did quite some checking for consistency, it is unlikely that all of our decisions are uncontroversial. However, given the size of the multivariate data set, it is inconceivably difficult to bias the data in one particular direction on purpose, and in the analyses discussed below, we compare the results of an analysis based on the many uncontroversial morphosyntactic characteristics to those of the few more controversial semantic characteristics as a control. 6. In our data, the preferences for attributive and predicative use of little and small are different from the findings reported by Biber et al. (1998). Our BP vectors indicate that little prefers attributive use and small disprefers predicative use, whereas Biber et al. (1998:93f.) find that small prefers predica- 146
27 Behavioral profiles: A corpus-based perspective on synonymy and antonymy tive use more strongly than little (esp. in conversation). This kind of difference is not particularly encouraging, since their conversational data are from British English from the 1990s just like ours, but they may still most likely be due to the different corpus compositions and different parsing schemes. However, most relevant to the antonymy perspective of the present study is the fact that Biber et al. s (1998) findings also result in the association of big to little and large to small that we and many other studies found. References Abbot-Smith, Kirsten and Michael Tomasello Exemplar-learning and schematization in a usage-based account of syntactic acquisition. The Linguistic Review 23: Arppe, Antti Univariate, bivariate, and multivariate methods in corpusbased lexicography: A study of synonymy. Helsinki: University of Helsinki. Arppe, Antti and Juhani Järvikivi Every method counts: Combining corpus-based and experimental evidence in the study of synonymy. Corpus Linguistics and Linguistic Theory 3 (2): Atkins, Beryl T. Sue Semantic ID tags: Corpus evidence for dictionary senses. Proceedings of the Third Annual Conference of the UW Centre for the New Oxford English Dictionary, Atkins, Beryl T. Sue and Beth Levin Building on a corpus: A linguistic and lexicographical look at some near-synonyms. International Journal of Lexicography 8 (2): Berez, Andrea L. and Stefan Th. Gries In defense of corpus-based methods: A behavioral profile analysis of polysemous get in English. Proceedings of The 24th Northwest Linguistics Conference (University of Washington Working Papers in Linguistics 27), Seattle, WA: Department of Linguistics. Biber, Douglas, Susan Conrad and Randi Reppen Corpus linguistics: Investigating language structure and use. Cambridge: Cambridge University Press. Bolinger, Dwight L Entailment and the meaning of structures. Glossa 2:
28 ICAME Journal No. 34 Bybee, Joan The phonology of the lexicon: Evidence from lexical diffusion. In M. Barlow and S. Kemmer (eds.). Usage-based models of language, Stanford: CSLI Publications. Charles, Walter G The categorization of sentential contexts. Journal of Psycholinguistic Research 17 (5): Charles, Walter G Contextual correlates of meaning. Applied Psycholinguistics 21 (4): Charles, Walter G. and George A. Miller Contexts of antonymous adjectives. Applied Psycholinguistics 10 (3): Church, Kenneth Ward, William Gale, Patrick Hanks and Donald Hindle Using statistics in lexical analysis. In U. Zernik (ed.). Lexical acquisition: Exploiting on-line resources to build a lexicon, Hillsdale, NJ: Lawrence Erlbaum. Church, Kenneth Ward, William Gale, Patrick Hanks, Donald Hindle and Rosamund Moon Lexical substitutability. In B.T.S. Atkins and A. Zampolli (eds.). Computational approaches to the lexicon, Oxford and New York: Oxford University Press. Cruse, D. Alan Lexical semantics. Cambridge: Cambridge University Press. Dabrowska, Ewa Words as constructions. In V. Evans and S.S. Pourcel (eds.). New directions in cognitive linguistics, Amsterdam and Philadelphia: John Benjamins. [The typesetting software used did not reproduce the second letter in the author s last name as it appears in the source.] Deese, James Form class and the determinants of association. Journal of Verbal Learning and Verbal Behavior 1 (2): Deese, James The associative structure of some common English adjectives. Journal of Verbal Learning and Verbal Behavior 3 (5): Divjak, Dagmar S Ways of intending. In St. Th. Gries and A. Stefanowitsch (eds.). Corpora in cognitive linguistics: Corpus-based approaches to syntax and lexis, Berlin and New York: Mouton de Gruyter. Divjak, Dagmar S. and Stefan Th. Gries Ways of trying in Russian: Clustering behavioral profiles. Corpus Linguistics and Linguistic Theory 2 (1): Divjak, Dagmar S. and Stefan Th. Gries Clusters in the mind? Converging evidence from near-synonymy in Russian. The Mental Lexicon 3 (2):
29 Behavioral profiles: A corpus-based perspective on synonymy and antonymy Divjak, Dagmar S. and Stefan Th. Gries Corpus-based cognitive semantics: A contrastive study of phasal verbs in English and Russian. In K. Dziwirek and B. Lewandowska-Tomaszczyk (eds.). Studies in cognitive corpus linguistics, Frankfurt a.m.: Peter Lang. Ervin-Tripp, Susan Substitution, context, and association. In L. Postman and G. Keppel (eds.). Norms of word association, New York: Academic Press. Fellbaum, Christiane Co-occurrence and antonymy. International Journal of Lexicography 8 (4): Firth, John R Papers in linguistics. Oxford: Oxford University Press. Gibbs, Raymond W. Jr. and Teenie Matlock Psycholinguistic perspectives on polysemy. In H. Cuyckens and B. Zawada (eds.). Polysemy in cognitive linguistics, Amsterdam and Philadelphia: John Benjamins. Gilquin, Gaëtanelle Causative get and have. Journal of English Linguistics 31 (2): Goldberg, Adele E Constructions at work. Oxford: Oxford University Press. Gries, Stefan Th Testing the sub-test: A collocational-overlap analysis of English -ic and -ical adjectives. International Journal of Corpus Linguistics 8 (1): Gries, Stefan Th Corpus-based methods and cognitive semantics: The many meanings of to run. In St. Th. Gries and A. Stefanowitsch (eds.). Corpora in cognitive linguistics: Corpus-based approaches to syntax and lexis, Berlin and New York: Mouton de Gruyter. Gries, Stefan Th BP 1.0. A program for R and higher. Gries, Stefan Th. and Dagmar S. Divjak Behavioral profiles: A corpusbased approach towards cognitive semantic analysis. In V. Evans and S.S. Pourcel (eds.). New directions in cognitive linguistics, Amsterdam and Philadelphia: John Benjamins. Gross, Derek, Ute Fischer and George A. Miller The organization of adjectival meanings. Journal of Memory and Language 28 (1): Hanks, Patrick Contextual dependency and lexical sets. International Journal of Corpus Linguistics 1 (1): Harris, Zellig S Papers in structural and transformational linguistics. Dordrecht: Reidel. 149
30 ICAME Journal No. 34 Hoey, Michael Lexical priming: a new theory of words and language. London and New York: Routledge. Jones, Steven Corpus approaches to antonymy. In P. Rayson, A. Wilson, A. McEnery, A. Hardie and S. Khoja (eds.). Proceedings of the Corpus Linguistics 2001 conference, Lancaster University: University Centre for Computer Corpus Research on Language. Jones, Steven, Carita Paradis, M. Lynne Murphy and Caroline Willners Googling for opposites : A web-based study of antonym canonicity. Corpora 2 (2): Justeson John S. and Slava M. Katz Co-occurrences of antonymous adjectives and their contexts. Computational Linguistics 1 (1): Kishner, Jeffrey M. and Raymond W. Gibbs Jr How just gets its meanings: polysemy and context in psychological semantics. Language and Speech 39 (1): Miller, George A On knowing a word. Annual Review of Psychology 50: Miller, George A. and Walter G. Charles Contextual correlates of semantic similarity. Language and Cognitive Processes 6 (1): Partington, Alan Patterns and meanings: Using corpora for English language research and teaching. Amsterdam and Philadelphia: John Benjamins. Pierrehumbert, Janet Exemplar dynamics: Word frequency, lenition, and contrast. In J. Bybee and P. Hopper (eds.). Frequency and the emergence of linguistic structure, Amsterdam and Philadelphia: John Benjamins. Rubenstein, Herbert and John B. Goodenough Contextual correlates of synonymy. Communications of the ACM 8 (10): Schmid, Hans-Jörg Cottage und Co., idea, start vs. begin. Tübingen: Max Niemeyer. Wang, Chueh-chen What distinguish synonymous constructions: A corpus-based study of Mandarin lian dou and lian ye constructions. Paper presented at GCLA
On the use of antonyms and synonyms from a domain perspective
On the use of antonyms and synonyms from a domain perspective Debela Tesfaye IT PhD Program Addis Ababa University Addis Ababa, Ethiopia [email protected] Carita Paradis Centre for Languages and Literature
Constructing a TpB Questionnaire: Conceptual and Methodological Considerations
Constructing a TpB Questionnaire: Conceptual and Methodological Considerations September, 2002 (Revised January, 2006) Icek Ajzen Brief Description of the Theory of Planned Behavior According to the theory
CALCULATIONS & STATISTICS
CALCULATIONS & STATISTICS CALCULATION OF SCORES Conversion of 1-5 scale to 0-100 scores When you look at your report, you will notice that the scores are reported on a 0-100 scale, even though respondents
Doe wat je niet laten kan: A usage-based analysis of Dutch causative constructions. Natalia Levshina
Doe wat je niet laten kan: A usage-based analysis of Dutch causative constructions Natalia Levshina RU Quantitative Lexicology and Variational Linguistics Faculteit Letteren Subfaculteit Taalkunde K.U.Leuven
Session 7 Bivariate Data and Analysis
Session 7 Bivariate Data and Analysis Key Terms for This Session Previously Introduced mean standard deviation New in This Session association bivariate analysis contingency table co-variation least squares
ELL Considerations for Common Core-Aligned Tasks in English Language Arts
ELL Considerations for Common Core-Aligned Tasks in English Language Arts A substantial body of research clearly indicates that utilizing specific instructional modifications as well as targeted pedagogical
Association Between Variables
Contents 11 Association Between Variables 767 11.1 Introduction............................ 767 11.1.1 Measure of Association................. 768 11.1.2 Chapter Summary.................... 769 11.2 Chi
Sample Size and Power in Clinical Trials
Sample Size and Power in Clinical Trials Version 1.0 May 011 1. Power of a Test. Factors affecting Power 3. Required Sample Size RELATED ISSUES 1. Effect Size. Test Statistics 3. Variation 4. Significance
Simple maths for keywords
Simple maths for keywords Adam Kilgarriff Lexical Computing Ltd [email protected] Abstract We present a simple method for identifying keywords of one corpus vs. another. There is no one-sizefits-all
Appendix B Data Quality Dimensions
Appendix B Data Quality Dimensions Purpose Dimensions of data quality are fundamental to understanding how to improve data. This appendix summarizes, in chronological order of publication, three foundational
Fairfield Public Schools
Mathematics Fairfield Public Schools AP Statistics AP Statistics BOE Approved 04/08/2014 1 AP STATISTICS Critical Areas of Focus AP Statistics is a rigorous course that offers advanced students an opportunity
Syntactic Theory. Background and Transformational Grammar. Dr. Dan Flickinger & PD Dr. Valia Kordoni
Syntactic Theory Background and Transformational Grammar Dr. Dan Flickinger & PD Dr. Valia Kordoni Department of Computational Linguistics Saarland University October 28, 2011 Early work on grammar There
Comparative Analysis on the Armenian and Korean Languages
Comparative Analysis on the Armenian and Korean Languages Syuzanna Mejlumyan Yerevan State Linguistic University Abstract It has been five years since the Korean language has been taught at Yerevan State
Chapter 1 Introduction. 1.1 Introduction
Chapter 1 Introduction 1.1 Introduction 1 1.2 What Is a Monte Carlo Study? 2 1.2.1 Simulating the Rolling of Two Dice 2 1.3 Why Is Monte Carlo Simulation Often Necessary? 4 1.4 What Are Some Typical Situations
Chapter 3. Cartesian Products and Relations. 3.1 Cartesian Products
Chapter 3 Cartesian Products and Relations The material in this chapter is the first real encounter with abstraction. Relations are very general thing they are a special type of subset. After introducing
How the Computer Translates. Svetlana Sokolova President and CEO of PROMT, PhD.
Svetlana Sokolova President and CEO of PROMT, PhD. How the Computer Translates Machine translation is a special field of computer application where almost everyone believes that he/she is a specialist.
Syntactic and Semantic Differences between Nominal Relative Clauses and Dependent wh-interrogative Clauses
Theory and Practice in English Studies 3 (2005): Proceedings from the Eighth Conference of British, American and Canadian Studies. Brno: Masarykova univerzita Syntactic and Semantic Differences between
MATRIX OF STANDARDS AND COMPETENCIES FOR ENGLISH IN GRADES 7 10
PROCESSES CONVENTIONS MATRIX OF STANDARDS AND COMPETENCIES FOR ENGLISH IN GRADES 7 10 Determine how stress, Listen for important Determine intonation, phrasing, points signaled by appropriateness of pacing,
Quotes from Object-Oriented Software Construction
Quotes from Object-Oriented Software Construction Bertrand Meyer Prentice-Hall, 1988 Preface, p. xiv We study the object-oriented approach as a set of principles, methods and tools which can be instrumental
Discourse Markers in English Writing
Discourse Markers in English Writing Li FENG Abstract Many devices, such as reference, substitution, ellipsis, and discourse marker, contribute to a discourse s cohesion and coherence. This paper focuses
Literature as an Educational Tool
Literature as an Educational Tool A Study about Learning through Literature and How Literature Contributes to the Development of Vocabulary Ida Sundelin VT-13 C-essay, 15 hp Didactics and Linguistics English
Noam Chomsky: Aspects of the Theory of Syntax notes
Noam Chomsky: Aspects of the Theory of Syntax notes Julia Krysztofiak May 16, 2006 1 Methodological preliminaries 1.1 Generative grammars as theories of linguistic competence The study is concerned with
6.4 Normal Distribution
Contents 6.4 Normal Distribution....................... 381 6.4.1 Characteristics of the Normal Distribution....... 381 6.4.2 The Standardized Normal Distribution......... 385 6.4.3 Meaning of Areas under
COMPUTATIONAL DATA ANALYSIS FOR SYNTAX
COLING 82, J. Horeck~ (ed.j North-Holland Publishing Compa~y Academia, 1982 COMPUTATIONAL DATA ANALYSIS FOR SYNTAX Ludmila UhliFova - Zva Nebeska - Jan Kralik Czech Language Institute Czechoslovak Academy
Rethinking the relationship between transitive and intransitive verbs
Rethinking the relationship between transitive and intransitive verbs Students with whom I have studied grammar will remember my frustration at the idea that linking verbs can be intransitive. Nonsense!
Doctoral School of Historical Sciences Dr. Székely Gábor professor Program of Assyiriology Dr. Dezső Tamás habilitate docent
Doctoral School of Historical Sciences Dr. Székely Gábor professor Program of Assyiriology Dr. Dezső Tamás habilitate docent The theses of the Dissertation Nominal and Verbal Plurality in Sumerian: A Morphosemantic
Chapter 10. Key Ideas Correlation, Correlation Coefficient (r),
Chapter 0 Key Ideas Correlation, Correlation Coefficient (r), Section 0-: Overview We have already explored the basics of describing single variable data sets. However, when two quantitative variables
Historical Linguistics. Diachronic Analysis. Two Approaches to the Study of Language. Kinds of Language Change. What is Historical Linguistics?
Historical Linguistics Diachronic Analysis What is Historical Linguistics? Historical linguistics is the study of how languages change over time and of their relationships with other languages. All languages
MATH10212 Linear Algebra. Systems of Linear Equations. Definition. An n-dimensional vector is a row or a column of n numbers (or letters): a 1.
MATH10212 Linear Algebra Textbook: D. Poole, Linear Algebra: A Modern Introduction. Thompson, 2006. ISBN 0-534-40596-7. Systems of Linear Equations Definition. An n-dimensional vector is a row or a column
The Effect of Dropping a Ball from Different Heights on the Number of Times the Ball Bounces
The Effect of Dropping a Ball from Different Heights on the Number of Times the Ball Bounces Or: How I Learned to Stop Worrying and Love the Ball Comment [DP1]: Titles, headings, and figure/table captions
Language Meaning and Use
Language Meaning and Use Raymond Hickey, English Linguistics Website: www.uni-due.de/ele Types of meaning There are four recognisable types of meaning: lexical meaning, grammatical meaning, sentence meaning
Independent samples t-test. Dr. Tom Pierce Radford University
Independent samples t-test Dr. Tom Pierce Radford University The logic behind drawing causal conclusions from experiments The sampling distribution of the difference between means The standard error of
CREATING LEARNING OUTCOMES
CREATING LEARNING OUTCOMES What Are Student Learning Outcomes? Learning outcomes are statements of the knowledge, skills and abilities individual students should possess and can demonstrate upon completion
REFLECTIONS ON THE USE OF BIG DATA FOR STATISTICAL PRODUCTION
REFLECTIONS ON THE USE OF BIG DATA FOR STATISTICAL PRODUCTION Pilar Rey del Castillo May 2013 Introduction The exploitation of the vast amount of data originated from ICT tools and referring to a big variety
The Oxford Learner s Dictionary of Academic English
ISEJ Advertorial The Oxford Learner s Dictionary of Academic English Oxford University Press The Oxford Learner s Dictionary of Academic English (OLDAE) is a brand new learner s dictionary aimed at students
Descriptive Statistics and Measurement Scales
Descriptive Statistics 1 Descriptive Statistics and Measurement Scales Descriptive statistics are used to describe the basic features of the data in a study. They provide simple summaries about the sample
Regular Languages and Finite Automata
Regular Languages and Finite Automata 1 Introduction Hing Leung Department of Computer Science New Mexico State University Sep 16, 2010 In 1943, McCulloch and Pitts [4] published a pioneering work on a
Clustering Connectionist and Statistical Language Processing
Clustering Connectionist and Statistical Language Processing Frank Keller [email protected] Computerlinguistik Universität des Saarlandes Clustering p.1/21 Overview clustering vs. classification supervised
3. Mathematical Induction
3. MATHEMATICAL INDUCTION 83 3. Mathematical Induction 3.1. First Principle of Mathematical Induction. Let P (n) be a predicate with domain of discourse (over) the natural numbers N = {0, 1,,...}. If (1)
Distances, Clustering, and Classification. Heatmaps
Distances, Clustering, and Classification Heatmaps 1 Distance Clustering organizes things that are close into groups What does it mean for two genes to be close? What does it mean for two samples to be
Teaching Vocabulary to Young Learners (Linse, 2005, pp. 120-134)
Teaching Vocabulary to Young Learners (Linse, 2005, pp. 120-134) Very young children learn vocabulary items related to the different concepts they are learning. When children learn numbers or colors in
Absolute versus Relative Synonymy
Article 18 in LCPJ Danglli, Leonard & Abazaj, Griselda 2009: Absolute versus Relative Synonymy Absolute versus Relative Synonymy Abstract This article aims at providing an illustrated discussion of the
User research for information architecture projects
Donna Maurer Maadmob Interaction Design http://maadmob.com.au/ Unpublished article User research provides a vital input to information architecture projects. It helps us to understand what information
Managing Variability in Software Architectures 1 Felix Bachmann*
Managing Variability in Software Architectures Felix Bachmann* Carnegie Bosch Institute Carnegie Mellon University Pittsburgh, Pa 523, USA [email protected] Len Bass Software Engineering Institute Carnegie
Chapter 6 Experiment Process
Chapter 6 Process ation is not simple; we have to prepare, conduct and analyze experiments properly. One of the main advantages of an experiment is the control of, for example, subjects, objects and instrumentation.
Linear Programming. Solving LP Models Using MS Excel, 18
SUPPLEMENT TO CHAPTER SIX Linear Programming SUPPLEMENT OUTLINE Introduction, 2 Linear Programming Models, 2 Model Formulation, 4 Graphical Linear Programming, 5 Outline of Graphical Procedure, 5 Plotting
Statistical Machine Translation: IBM Models 1 and 2
Statistical Machine Translation: IBM Models 1 and 2 Michael Collins 1 Introduction The next few lectures of the course will be focused on machine translation, and in particular on statistical machine translation
Writing learning objectives
Writing learning objectives This material was excerpted and adapted from the following web site: http://www.utexas.edu/academic/diia/assessment/iar/students/plan/objectives/ What is a learning objective?
Linear Programming Notes VII Sensitivity Analysis
Linear Programming Notes VII Sensitivity Analysis 1 Introduction When you use a mathematical model to describe reality you must make approximations. The world is more complicated than the kinds of optimization
I. The SMART Project - Status Report and Plans. G. Salton. The SMART document retrieval system has been operating on a 709^
1-1 I. The SMART Project - Status Report and Plans G. Salton 1. Introduction The SMART document retrieval system has been operating on a 709^ computer since the end of 1964. The system takes documents
International Journal of Scientific & Engineering Research, Volume 4, Issue 11, November-2013 5 ISSN 2229-5518
International Journal of Scientific & Engineering Research, Volume 4, Issue 11, November-2013 5 INTELLIGENT MULTIDIMENSIONAL DATABASE INTERFACE Mona Gharib Mohamed Reda Zahraa E. Mohamed Faculty of Science,
Visualizing WordNet Structure
Visualizing WordNet Structure Jaap Kamps Abstract Representations in WordNet are not on the level of individual words or word forms, but on the level of word meanings (lexemes). A word meaning, in turn,
Mathematical Induction
Mathematical Induction In logic, we often want to prove that every member of an infinite set has some feature. E.g., we would like to show: N 1 : is a number 1 : has the feature Φ ( x)(n 1 x! 1 x) How
Beyond single words: the most frequent collocations in spoken English
Beyond single words: the most frequent collocations in spoken English Dongkwang Shin and Paul Nation This study presents a list of the highest frequency collocations of spoken English based on carefully
Jack s Dyslexia Index indicates he has dyslexic difficulties that are mild in extent.
Dyslexia Portfolio Report for Jack Jones Assessed by Sue Thompson on 05/08/2009 Report for parents When a child is identified as dyslexic, additional support will be needed from both school and home to
Hypothesis testing. c 2014, Jeffrey S. Simonoff 1
Hypothesis testing So far, we ve talked about inference from the point of estimation. We ve tried to answer questions like What is a good estimate for a typical value? or How much variability is there
WRITING A CRITICAL ARTICLE REVIEW
WRITING A CRITICAL ARTICLE REVIEW A critical article review briefly describes the content of an article and, more importantly, provides an in-depth analysis and evaluation of its ideas and purpose. The
Problem of the Month: Fair Games
Problem of the Month: The Problems of the Month (POM) are used in a variety of ways to promote problem solving and to foster the first standard of mathematical practice from the Common Core State Standards:
Introduction to the Smith Chart for the MSA Sam Wetterlin 10/12/09 Z +
Introduction to the Smith Chart for the MSA Sam Wetterlin 10/12/09 Quick Review of Reflection Coefficient The Smith chart is a method of graphing reflection coefficients and impedance, and is often useful
Math 4310 Handout - Quotient Vector Spaces
Math 4310 Handout - Quotient Vector Spaces Dan Collins The textbook defines a subspace of a vector space in Chapter 4, but it avoids ever discussing the notion of a quotient space. This is understandable
Time series clustering and the analysis of film style
Time series clustering and the analysis of film style Nick Redfern Introduction Time series clustering provides a simple solution to the problem of searching a database containing time series data such
Topics in Computational Linguistics. Learning to Paraphrase: An Unsupervised Approach Using Multiple-Sequence Alignment
Topics in Computational Linguistics Learning to Paraphrase: An Unsupervised Approach Using Multiple-Sequence Alignment Regina Barzilay and Lillian Lee Presented By: Mohammad Saif Department of Computer
South Carolina College- and Career-Ready (SCCCR) Probability and Statistics
South Carolina College- and Career-Ready (SCCCR) Probability and Statistics South Carolina College- and Career-Ready Mathematical Process Standards The South Carolina College- and Career-Ready (SCCCR)
KNOWLEDGE ORGANIZATION
KNOWLEDGE ORGANIZATION Gabi Reinmann Germany [email protected] Synonyms Information organization, information classification, knowledge representation, knowledge structuring Definition The term
Last year was the first year for which we turned in a YAP (for 2013-2014), so I can't attach one for 2012-2013.
Annual Assessment Report Department of English 2012-2013 1. Previous Yearly Action Plan Last year was the first year for which we turned in a YAP (for 2013-2014), so I can't attach one for 2012-2013. 2.
Lab 11. Simulations. The Concept
Lab 11 Simulations In this lab you ll learn how to create simulations to provide approximate answers to probability questions. We ll make use of a particular kind of structure, called a box model, that
A successful market segmentation initiative answers the following critical business questions: * How can we a. Customer Status.
MARKET SEGMENTATION The simplest and most effective way to operate an organization is to deliver one product or service that meets the needs of one type of customer. However, to the delight of many organizations
Facebook Friend Suggestion Eytan Daniyalzade and Tim Lipus
Facebook Friend Suggestion Eytan Daniyalzade and Tim Lipus 1. Introduction Facebook is a social networking website with an open platform that enables developers to extract and utilize user information
Unit 2 Module 3: Generating Examples and Nonexamples
Unit 2 Module 3: Generating Examples and Nonexamples Section 1 Slide 1 Title Slide Welcome to the third module in the Vocabulary Instructional Routines unit, Generating Examples and Nonexamples. Slide
Writing Effective Questions
Writing Effective Questions The most important thing to keep in mind when you develop test questions is that your job as an educator is to teach people so that they can learn and be successful. The idea
Reflections on Probability vs Nonprobability Sampling
Official Statistics in Honour of Daniel Thorburn, pp. 29 35 Reflections on Probability vs Nonprobability Sampling Jan Wretman 1 A few fundamental things are briefly discussed. First: What is called probability
1.7 Graphs of Functions
64 Relations and Functions 1.7 Graphs of Functions In Section 1.4 we defined a function as a special type of relation; one in which each x-coordinate was matched with only one y-coordinate. We spent most
Five High Order Thinking Skills
Five High Order Introduction The high technology like computers and calculators has profoundly changed the world of mathematics education. It is not only what aspects of mathematics are essential for learning,
CONTENTS 1. Peter Kahn. Spring 2007
CONTENTS 1 MATH 304: CONSTRUCTING THE REAL NUMBERS Peter Kahn Spring 2007 Contents 2 The Integers 1 2.1 The basic construction.......................... 1 2.2 Adding integers..............................
Local outlier detection in data forensics: data mining approach to flag unusual schools
Local outlier detection in data forensics: data mining approach to flag unusual schools Mayuko Simon Data Recognition Corporation Paper presented at the 2012 Conference on Statistical Detection of Potential
Introduction to. Hypothesis Testing CHAPTER LEARNING OBJECTIVES. 1 Identify the four steps of hypothesis testing.
Introduction to Hypothesis Testing CHAPTER 8 LEARNING OBJECTIVES After reading this chapter, you should be able to: 1 Identify the four steps of hypothesis testing. 2 Define null hypothesis, alternative
Formal Languages and Automata Theory - Regular Expressions and Finite Automata -
Formal Languages and Automata Theory - Regular Expressions and Finite Automata - Samarjit Chakraborty Computer Engineering and Networks Laboratory Swiss Federal Institute of Technology (ETH) Zürich March
Handbook on Test Development: Helpful Tips for Creating Reliable and Valid Classroom Tests. Allan S. Cohen. and. James A. Wollack
Handbook on Test Development: Helpful Tips for Creating Reliable and Valid Classroom Tests Allan S. Cohen and James A. Wollack Testing & Evaluation Services University of Wisconsin-Madison 1. Terminology
CHAPTER 2 Estimating Probabilities
CHAPTER 2 Estimating Probabilities Machine Learning Copyright c 2016. Tom M. Mitchell. All rights reserved. *DRAFT OF January 24, 2016* *PLEASE DO NOT DISTRIBUTE WITHOUT AUTHOR S PERMISSION* This is a
Planning a Class Session
Planning a Class Session A Guide for New Teachers by Diane M. Enerson Kathryn M. Plank R. Neill Johnson The Pennsylvania State University 301 Rider Building II University Park, PA 16802 www.schreyerinstitute.psu.edu
Comparison of Research Designs Template
Comparison of Comparison of The following seven tables provide an annotated template to guide you through the comparison of research designs assignment in this course. These tables help you organize your
Practical Research. Paul D. Leedy Jeanne Ellis Ormrod. Planning and Design. Tenth Edition
Practical Research Planning and Design Tenth Edition Paul D. Leedy Jeanne Ellis Ormrod 2013, 2010, 2005, 2001, 1997 Pearson Education, Inc. All rights reserved. Chapter 1 The Nature and Tools of Research
Covariance and Correlation
Covariance and Correlation ( c Robert J. Serfling Not for reproduction or distribution) We have seen how to summarize a data-based relative frequency distribution by measures of location and spread, such
Efficient Techniques for Improved Data Classification and POS Tagging by Monitoring Extraction, Pruning and Updating of Unknown Foreign Words
, pp.290-295 http://dx.doi.org/10.14257/astl.2015.111.55 Efficient Techniques for Improved Data Classification and POS Tagging by Monitoring Extraction, Pruning and Updating of Unknown Foreign Words Irfan
Depth-of-Knowledge Levels for Four Content Areas Norman L. Webb March 28, 2002. Reading (based on Wixson, 1999)
Depth-of-Knowledge Levels for Four Content Areas Norman L. Webb March 28, 2002 Language Arts Levels of Depth of Knowledge Interpreting and assigning depth-of-knowledge levels to both objectives within
Chapter 8: Quantitative Sampling
Chapter 8: Quantitative Sampling I. Introduction to Sampling a. The primary goal of sampling is to get a representative sample, or a small collection of units or cases from a much larger collection or
Methodological Issues for Interdisciplinary Research
J. T. M. Miller, Department of Philosophy, University of Durham 1 Methodological Issues for Interdisciplinary Research Much of the apparent difficulty of interdisciplinary research stems from the nature
Register Differences between Prefabs in Native and EFL English
Register Differences between Prefabs in Native and EFL English MARIA WIKTORSSON 1 Introduction In the later stages of EFL (English as a Foreign Language) learning, and foreign language learning in general,
Measurement with Ratios
Grade 6 Mathematics, Quarter 2, Unit 2.1 Measurement with Ratios Overview Number of instructional days: 15 (1 day = 45 minutes) Content to be learned Use ratio reasoning to solve real-world and mathematical
11 Ideals. 11.1 Revisiting Z
11 Ideals The presentation here is somewhat different than the text. In particular, the sections do not match up. We have seen issues with the failure of unique factorization already, e.g., Z[ 5] = O Q(
Paraphrasing controlled English texts
Paraphrasing controlled English texts Kaarel Kaljurand Institute of Computational Linguistics, University of Zurich [email protected] Abstract. We discuss paraphrasing controlled English texts, by defining
Chapter Seven. Multiple regression An introduction to multiple regression Performing a multiple regression on SPSS
Chapter Seven Multiple regression An introduction to multiple regression Performing a multiple regression on SPSS Section : An introduction to multiple regression WHAT IS MULTIPLE REGRESSION? Multiple
The European Financial Reporting Advisory Group (EFRAG) and the Autorité des Normes Comptables (ANC) jointly publish on their websites for
The European Financial Reporting Advisory Group (EFRAG) and the Autorité des Normes Comptables (ANC) jointly publish on their websites for information purpose a Research Paper on the proposed new Definition
Discrete Mathematics and Probability Theory Fall 2009 Satish Rao, David Tse Note 10
CS 70 Discrete Mathematics and Probability Theory Fall 2009 Satish Rao, David Tse Note 10 Introduction to Discrete Probability Probability theory has its origins in gambling analyzing card games, dice,
Principles of Data-Driven Instruction
Education in our times must try to find whatever there is in students that might yearn for completion, and to reconstruct the learning that would enable them autonomously to seek that completion. Allan
Assessment Policy. 1 Introduction. 2 Background
Assessment Policy 1 Introduction This document has been written by the National Foundation for Educational Research (NFER) to provide policy makers, researchers, teacher educators and practitioners with
http://www.jstor.org This content downloaded on Tue, 19 Feb 2013 17:28:43 PM All use subject to JSTOR Terms and Conditions
A Significance Test for Time Series Analysis Author(s): W. Allen Wallis and Geoffrey H. Moore Reviewed work(s): Source: Journal of the American Statistical Association, Vol. 36, No. 215 (Sep., 1941), pp.
Speaking for IELTS. About Speaking for IELTS. Vocabulary. Grammar. Pronunciation. Exam technique. English for Exams.
About Collins series has been designed to be easy to use, whether by learners studying at home on their own or in a classroom with a teacher: Instructions are easy to follow Exercises are carefully arranged
Overview of Violations of the Basic Assumptions in the Classical Normal Linear Regression Model
Overview of Violations of the Basic Assumptions in the Classical Normal Linear Regression Model 1 September 004 A. Introduction and assumptions The classical normal linear regression model can be written
Data Deduplication in Slovak Corpora
Ľ. Štúr Institute of Linguistics, Slovak Academy of Sciences, Bratislava, Slovakia Abstract. Our paper describes our experience in deduplication of a Slovak corpus. Two methods of deduplication a plain
