Cognition 104 (2007)

Brief article

Do Chinese and English speakers think about time differently? Failure of replicating Boroditsky (2001) ☆,☆☆

Jenn-Yeu Chen

Institute of Cognitive Science, National Cheng Kung University, 1 University Road, Tainan 701, Taiwan

Received 10 May 2006; revised 19 July 2006; accepted 18 September 2006

Abstract

English uses the horizontal spatial metaphors to express time (e.g., the good days ahead of us). Chinese also uses the vertical metaphors (e.g., "the month above" to mean last month). Do Chinese speakers, then, think about time in a different way than English speakers? Boroditsky [Boroditsky, L. (2001). Does language shape thought? Mandarin and English speakers' conceptions of time. Cognitive Psychology, 43(1), 1–22] claimed that they do, and went on to conclude that language is a powerful tool in shaping habitual thought about abstract domains (such as time). By estimating the frequency of usage, we found that Chinese speakers actually use the horizontal spatial metaphors more often than the vertical metaphors. This offered no logical ground for Boroditsky's claim. We were also unable to replicate her experiments in four different attempts. We conclude that Chinese speakers do not think about time in a different way than English speakers just because Chinese also uses the vertical spatial metaphors to express time.

© 2006 Elsevier B.V. All rights reserved.

☆ This manuscript was accepted under the editorship of Jacques Mehler.
☆☆ This work was sponsored by the NSC H PAE grant awarded to the author by the National Council of Taiwan, ROC. It was carried out by Yi-Tien Tsai as part of the requirement for her master's thesis.
doi: /j.cognition
Keywords: Linguistic relativity hypothesis; Time; Chinese; English

English primarily uses horizontal spatial metaphors to express time (e.g., to look back 30 years), whereas Chinese also uses vertical metaphors (e.g., "the month above" means last month; "the day ahead" means the day before yesterday). Does the use of different spatial metaphors affect the way language users conceptualize time? Boroditsky (2001) addressed this question with a spatial priming task. The task first engaged the participants in spatial processing, followed immediately by a sentence judgment that involved the processing of time. In a typical trial, the participants saw two pictures one after the other, each depicting two objects aligned horizontally or vertically. A sentence at the bottom of each picture described the spatial relationship of the two objects, and the participants had to determine whether the sentence was a correct statement. The pictures were followed by a sentence that described the temporal relationship of two time units, e.g., June comes before May. The participants again had to decide whether the sentence was a correct statement. The sentences used two kinds of words to describe the temporal relationship: spatial metaphors such as before and after, and time words such as earlier and later. The spatial-metaphor sentences were meant to serve as a methodological check on the effectiveness of the priming procedure. Indeed, both the English and the Chinese participants responded to these time sentences faster when they had just processed a horizontal relationship of the objects than when they had just processed a vertical one. The critical test of the hypothesis came from the participants' responses to the time sentences that used the time words.
The English participants responded to sentences of this kind faster when they had just processed a horizontal relationship of the objects, but the Chinese participants were faster when they had just processed a vertical relationship. Because the Chinese participants were Chinese–English bilinguals and processed English sentences in the task, the findings argued strongly that the use of spatial metaphors could change the way speakers think about time. The Chinese speakers displayed a tendency to think about time vertically because they talked about time vertically, and this tendency persisted even when they processed time in English.

Boroditsky's findings were very persuasive, except that an important assumption she made runs against our intuition as a native speaker of Chinese. She implicitly assumed that Chinese speakers use vertical metaphors far more frequently than horizontal metaphors when expressing time. The assumption was evident in the way she analyzed and described the data: "English speakers answered purely temporal questions faster after horizontal primes than after vertical primes... Mandarin speakers were faster after vertical primes than after horizontal primes" (p. 10). Unfortunately, she never tested that assumption. Here, we report a study that tested the assumption (Part 1) and repeated her experiment (Part 2). To anticipate the results, the data on the frequency of usage did
not support Boroditsky's assumption. We also could not replicate her experimental findings.

1. Part 1

1.1. Method

We searched web news in Taiwan for time expressions. In the first attempt, we downloaded 100 pieces of news from Yahoo News Taiwan over four days. We then extracted all the expressions that contained time and tallied the frequencies of the horizontal and the vertical spatial terms. In the second attempt, we searched Google News Taiwan, using the Chinese time words (day, week, month, season, and year) and the spatial terms (above, below, before, and after) as combined keywords. Each search yielded many pieces of news, so we kept only the first [...].

1.2. Results

The results from the Yahoo search (Table 1) showed that the number of time expressions using horizontal spatial terms exceeded the number using vertical spatial terms (250 vs. 122). We assumed that the news pieces had been drawn from a single sample and so performed a χ² analysis of the frequency distribution. The result was significant: χ²(1) = 44, p < [...]. We also performed a matched-sample t-test and a sign test, treating each news piece as the unit of analysis. The results of both tests were consistent with that of the χ² analysis: for the t-test, t(99) = 4.38, p < .0001; for the sign test, 57 out of 100 news pieces used more horizontal metaphors than vertical metaphors, 29 showed the opposite, and 14 had a tie, z = 2.91, p < .005.

The Google search yielded similar results (Table 2). Here, we applied χ² tests only. Overall, the time units were expressed with a horizontal metaphor more often than with a vertical metaphor: χ²(1) = 33, p < [...]. Time of event (e.g., before the moon disappeared) was the major contributor to the horizontal bias. However, when time of event was removed, the horizontal bias remained significant: 58% vs.
42%, χ²(1) = 11, p < [...]. We also contrasted the usage patterns for week and other time units (excluding time of event). The patterns were opposite: χ²(1) = 24.8, p < .0001, with week showing a vertical bias (56% over 44%) while other time units showed a horizontal bias (68% over 32%).

Table 1
Number of time expressions using the horizontal and the vertical spatial terms from searching the Yahoo News Taiwan
Columns: Before | After | Sum of horizontal | Above | Below | Sum of vertical
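The Yahoo News statistics reported above can be checked with a few lines of arithmetic. The sketch below is not the author's analysis script. It assumes that the χ² test compared the two counts against an equal split, and that the sign test used the normal approximation with a 0.5 continuity correction after dropping ties. Neither choice is stated in the text, but both give values that match the reported χ²(1) = 44 and z = 2.91. The function names are illustrative.

```python
import math


def chi_square_equal_split(horizontal: int, vertical: int) -> float:
    """Goodness-of-fit chi-square (df = 1) against an equal split of two counts."""
    expected = (horizontal + vertical) / 2
    return ((horizontal - expected) ** 2 + (vertical - expected) ** 2) / expected


def sign_test_z(wins: int, losses: int) -> float:
    """Normal-approximation sign test with a 0.5 continuity correction.

    Ties are assumed to have been dropped before calling this function.
    """
    n = wins + losses
    return (abs(wins - n / 2) - 0.5) / math.sqrt(n / 4)


# Counts taken from the text: 250 horizontal vs. 122 vertical expressions,
# and 57 vs. 29 news pieces favouring each metaphor (14 ties dropped).
chi2 = chi_square_equal_split(250, 122)
z = sign_test_z(57, 29)
print(f"chi-square(1) = {chi2:.2f}")
print(f"sign test z = {z:.2f}")
```

Without the continuity correction the sign-test statistic would be about 3.02 rather than 2.91, which is why the correction is assumed here.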
Table 2
Number of time expressions using the horizontal and the vertical spatial terms from searching the Google News Taiwan
Columns: Before | After | Sum of horizontal | Above | Below | Sum of vertical
Rows: Day, Week, Month, Year, Season, Event, Total

1.3. Discussion

We searched Yahoo News Taiwan and Google News Taiwan to estimate how frequently Chinese speakers use horizontal and vertical spatial metaphors when expressing time. The results from both rounds of search showed clearly that horizontal spatial metaphors were used more frequently than vertical spatial metaphors (except for the time unit week). Thus, Boroditsky's assumption cannot be justified. We then went on to repeat her experiment. A total of four experiments were conducted, all following the basic design and procedure of her study. The experiments were programmed in DMDX (developed at the University of Arizona by K. I. Forster and J. C. Forster) and conducted on a personal computer with a Pentium-III 667 MHz microprocessor and a 15-inch LCD monitor.

2. Part 2

2.1. Experiment 1

2.1.1. Method

Twenty-five Chinese–English bilinguals from the Department of Foreign Languages, National Cheng Kung University, participated. They were English majors who were either graduate students or undergraduates in at least their sophomore year. Fourteen native speakers of English who taught English in Tainan City also participated. All participants were paid. There were 128 pictures serving as the primes. Half depicted a horizontal relation between two objects, and the other half a vertical relation. At the bottom of each picture was a sentence that described the spatial relation of the objects using the horizontal (ahead of or behind) or the vertical (above or below) spatial terms. There were 32 target sentences, each describing a time relation. Half were true and half were false.
Within each half, eight sentences used before/after to describe the time relation (e.g., Monday is before Wednesday), and eight used earlier/later. The time units used in the sentences included week days, months, and seasons.
A trial consisted of two prime pictures followed by a target sentence. The two prime pictures depicted similar spatial relations (both horizontal or both vertical). The first one always required a FALSE answer, and the second one always a TRUE answer. The target sentence was sometimes TRUE and sometimes FALSE, in random order. The participants responded by pressing one of two keys to indicate TRUE or FALSE. Response times were recorded by the computer to millisecond accuracy. All the sentences and the instructions were presented in English. The trial started with the presentation of four pound signs (####) serving as a fixation mark. The participants pressed the space bar to initiate the trial. The two prime pictures and the target sentence then followed one after another, each advancing upon the participant's keypress response. The experiment had a within-subject 2 (prime type) × 2 (target type) × 3 (time unit) design.

2.1.2. Results and discussion

The statistical analysis focused on the TRUE target sentences. Outliers (RTs > 6000 ms) and errors constituted 3% and 6% of the trials, respectively. Table 3 shows that the participants were slower (but not significantly so) when the target sentence followed a horizontal prime than when it followed a vertical prime. This was the case regardless of the type of target, the time unit, and the native language.
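The trial screening and the horizontal-versus-vertical comparison described above could be sketched roughly as follows. This is an illustration only: the trial records, the field names (`rt`, `correct`, `target_true`), and the per-participant means are hypothetical, and the sketch assumes that error trials and outliers were both excluded before the matched-sample t-tests of the kind reported in Table 3 were run.

```python
import math
import statistics


def usable_rts(trials, cutoff_ms):
    """Keep RTs from correct TRUE-target trials at or below the outlier cutoff."""
    return [
        t["rt"]
        for t in trials
        if t["correct"] and t["target_true"] and t["rt"] <= cutoff_ms
    ]


def paired_t(h_means, v_means):
    """Matched-sample t statistic on per-participant V - H differences.

    Returns the t value and its degrees of freedom (n - 1).
    """
    diffs = [v - h for h, v in zip(h_means, v_means)]
    n = len(diffs)
    t = statistics.mean(diffs) / (statistics.stdev(diffs) / math.sqrt(n))
    return t, n - 1


# Hypothetical trial records for one participant (not data from the study).
trials = [
    {"rt": 2500, "correct": True, "target_true": True},
    {"rt": 7200, "correct": True, "target_true": True},   # outlier, > 6000 ms
    {"rt": 3100, "correct": False, "target_true": True},  # error trial
    {"rt": 2900, "correct": True, "target_true": False},  # FALSE target
]
kept = usable_rts(trials, cutoff_ms=6000)

# Hypothetical per-participant mean RTs after horizontal and vertical primes.
h_means = [3000.0, 3200.0, 2800.0, 3100.0]
v_means = [2900.0, 3000.0, 2500.0, 3100.0]
t_stat, df = paired_t(h_means, v_means)
print(kept, round(t_stat, 3), df)
```

A negative t value here means faster responses after vertical primes, following the V − H convention used in the tables.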
Detailed results of the analysis of variance (for all experiments) are available in the supplementary items. Table 3 (as well as Tables 4–6) also presents the results of the individual t-tests contrasting the horizontal RTs with the vertical RTs for the different languages, target types, and time units. The results, which were inconsistent with Boroditsky's (2001) findings, suggested that the spatial priming paradigm did not work in our experiment. This could reflect problems in the way we followed the design and the procedure of Boroditsky's experiment, but it could also be a true failure of replication. Before drawing any conclusion, we conducted Experiment 2 with the same method, but using only Chinese–English bilinguals and changing the language of the experiment (the instructions and the sentences) to Chinese. The reason for the modification was to maximize the chance of observing a vertical bias in the Chinese speakers. If Chinese speakers think about time vertically when they process an English sentence, they should display an even stronger vertical bias when they process a Chinese sentence. If Experiment 2 also failed to replicate Boroditsky's findings, it would suggest that we had not conducted the experiments in a grossly different way from hers.

Table 3
Mean response time (in ms) and its standard deviation (SD) as a function of prime type (H, horizontal; V, vertical), target type (before/after, earlier/later), and time unit (day, month, and season) from Experiment 1

                        Mean RT (SD)                  Difference of V − H    p of t-test
                        H              V
Chinese participants
  Before/after
    Day                 3053 (799)     2939 (711)
    Month               3219 (827)     2996 (727)
    Season              2780 (980)     2677 (964)
    All                 3017 (879)     2870 (810)
  Earlier/later
    Day                 3353 (822)     3368 (820)
    Month               3194 (818)     3103 (802)
    Season              3017 (782)     2987 (826)
    All                 3188 (808)     3152 (821)
English participants
  Before/after
    Day                 2461 (905)     2162 (544)
    Month               2326 (652)     2574 (517)
    Season              2196 (381)     2444 (498)
    All                 2328 (672)     2393 (536)
  Earlier/later
    Day                 2461 (886)     2369 (830)
    Month               2464 (789)     2654 (664)
    Season              2937 (768)     2365 (583)
    All                 2621 (828)     2462 (696)

The p values of the matched-sample t-tests comparing the horizontal and vertical RTs are also shown.

2.2. Experiment 2

2.2.1. Method

Twenty Chinese–English bilinguals were recruited from the student population of National Cheng Kung University, with no special requirement on English proficiency. The design and the procedure were similar to those of Experiment 1, except that all the sentences and the instructions were presented in Chinese.

2.2.2. Results and discussion

Outliers (RTs > 5000 ms) and errors constituted 2% and 7% of the trials, respectively. Table 4 shows that the participants responded more slowly (again not significantly) when the target sentence followed a horizontal prime than when it followed a vertical prime, regardless of the type of target and the time unit. Thus, the replication failed again.
We conducted Experiment 3 to rule out a potential methodological flaw in the previous experiments.

Table 4
Mean response time (in ms) and its standard deviation (SD) as a function of prime type (H, horizontal; V, vertical), target type (before/after, earlier/later), and time unit (day, month, and season) from Experiment 2

                        Mean RT (SD)                  Difference of V − H    p of t-test
                        H              V
  Before/after
    Day                 2198 (847)     1908 (593)
    Month               2060 (866)     1763 (503)
    Season              2158 (706)     2138 (738)
    All                 2139 (798)     1937 (628)
  Earlier/later
    Day                 2010 (625)     1824 (634)
    Month               1735 (511)     1743 (662)
    Season              2099 (623)     1881 (686)
    All                 1948 (599)     1816 (653)

The p values of the matched-sample t-tests comparing the horizontal and vertical RTs are also shown.

Table 5
Mean response time (in ms) and its standard deviation (SD) as a function of prime type (H, horizontal; V, vertical), target type (before/after, earlier/later), and time unit (day, month, and season) from Experiment 3

                        Mean RT (SD)                  Difference of V − H    p of t-test
                        H              V
  Before/after
    Day                 1579 (415)     1559 (441)
    Month               1426 (371)     1326 (217)
    Season              1534 (242)     1462 (555)
    All                 1513 (344)     1449 (424)
  Earlier/later
    Day                 1577 (277)     1488 (364)
    Month               1347 (302)     1474 (369)
    Season              1812 (526)     1765 (537)
    All                 1579 (419)     1575 (438)

The p values of the matched-sample t-tests comparing the horizontal and vertical RTs are also shown.

Table 6
Mean response time (in ms) and its standard deviation (SD) as a function of prime type (H, horizontal; V, vertical), target type (before/after, earlier/later), and time unit (day, month, and season) from Experiment 4

                        Mean RT (SD)                  Difference of V − H    p of t-test
                        H              V
  Before/after
    Day                 1947 (634)     2085 (841)
    Month               1907 (618)     1937 (976)
    Season              2018 (750)     1970 (659)
    All                 1957 (659)     1997 (822)
  Earlier/later
    Day                 1904 (707)     1773 (693)
    Month               1721 (490)     1622 (530)
    Season              1982 (576)     2018 (684)
    All                 1869 (597)     1804 (649)

The p values of the matched-sample t-tests comparing the horizontal and vertical RTs are also shown.

2.3. Experiment 3

2.3.1. Method

One possible explanation of why the response times tended to be slightly longer when the target sentences followed the horizontal primes than when they followed the vertical primes is that there was no delay between the response to the second prime picture and the presentation of the target sentence. Because the horizontal pictures tended to be harder to process than the vertical pictures (as verified by a separate analysis of their corresponding response times), the participants might still have been pondering their response to the horizontal picture when the target sentence appeared.
This would have caused a delay in their response to the target sentence.
To avoid this, we inserted the pound-sign fixation mark before every prime picture and every target sentence. The participants needed to press the space bar to receive each picture or sentence. The experiment continued to use Chinese participants (N = 10, recruited as in Experiment 2) and Chinese sentences and instructions.

2.3.2. Results and discussion

The average proportions of outliers (RTs > 4000 ms) and errors were 0.5% and 4%, respectively. The results shown in Table 5 display a pattern similar to those of Experiments 1 and 2. Thus, with the potential methodological problem ruled out, Experiment 3 still failed to replicate Boroditsky's (2001) results.

2.4. Experiment 4

2.4.1. Method

In the last experiment, we rearranged the two objects in the horizontal pictures so that they were aligned vertically. Two converging lines extending from bottom to top bordered the two objects so that the one at the top appeared in the front and the one at the bottom appeared in the back (linear perspective). This arrangement made the horizontal pictures more comparable to the vertical pictures in terms of visual angle. The rest of the method was the same as that of Experiments 2 and 3. Eighteen Chinese participants were recruited from the same subject pool.

2.4.2. Results

The average proportions of outliers (RTs > 5000 ms) and errors were 1.5% and 8%, respectively. As Table 6 shows, although there were some conditions in which the response times were faster when the target sentences followed the horizontal pictures, overall they were not. The only significant effects were the main effect of target type and the main effect of time unit. The main effect of prime type was not significant, nor were the interactions involving this factor.

3. General discussion

Boroditsky (2001) observed that whereas English monolinguals tended to think about time horizontally, Chinese–English bilinguals tended to think about time vertically even when they did so in English.
She attributed this vertical bias in the Chinese–English bilinguals to the fact that the Chinese language uses vertical spatial metaphors (in addition to horizontal metaphors) to express time, while the English language uses only horizontal metaphors. She concluded that the language one uses can have a profound effect on one's habitual thinking. We found that the use of horizontal spatial metaphors in Chinese to express time was actually more frequent than the use of vertical spatial metaphors. Moreover, we were unable, in four attempts, to replicate the results of Boroditsky's experiment. The effect sizes of the vertical bias with respect to the earlier/later questions for the Chinese participants were generally small, so that extremely large sample sizes
would have been required to detect them with reasonable power (see Table 7). The English participants displayed a much larger effect size, but in the direction opposite to Boroditsky's prediction.

It may appear to readers that, despite the null results of the statistical analyses, our Chinese data present a trend that is consistent with Boroditsky's prediction. This impression is not supported by a careful examination of the relevant data. The trend is there only when different time units are lumped together. But since time unit is clearly a critical variable here, it is inappropriate to combine the data across time units. If we focus on month, the time unit that was used in Boroditsky (2001), we find that Experiments 1 and 4 displayed a trend of vertical bias, while Experiments 2 and 3 displayed a trend of horizontal bias. Thus, our data, when examined with care, cannot be taken as consistent with Boroditsky's prediction.

There was a concern that our participants might not have processed the primes because the answers fell in a fixed pattern. This could explain our null results of spatial priming. We ran an analysis of the error rates and the RTs of the participants' responses to the prime questions. The error rates were, on average, comparable or higher for the prime questions than for the target questions. The RTs were also, on average, longer for the primes than for the targets (see Table 8).
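The kind of power analysis summarized in Table 7 can be approximated with a short sketch. The article does not say which procedure or software was used, so the code below rests on stated assumptions: a two-sided matched-sample test at alpha = .05, a normal approximation to the test statistic (rather than the noncentral t distribution), and a standardized effect size d defined as the mean V − H difference divided by its standard deviation. The effect sizes in the example are hypothetical and are not the values from Table 7.

```python
import math
from statistics import NormalDist

_STD_NORMAL = NormalDist()


def approx_power(effect_size: float, n: int, alpha: float = 0.05) -> float:
    """Approximate power of a two-sided matched-sample test with n participants.

    Uses the normal approximation and ignores the negligible opposite tail.
    """
    z_crit = _STD_NORMAL.inv_cdf(1 - alpha / 2)
    return _STD_NORMAL.cdf(abs(effect_size) * math.sqrt(n) - z_crit)


def required_n(effect_size: float, power: float = 0.80, alpha: float = 0.05) -> int:
    """Smallest n that reaches the target power under the same approximation."""
    z_crit = _STD_NORMAL.inv_cdf(1 - alpha / 2)
    z_power = _STD_NORMAL.inv_cdf(power)
    return math.ceil(((z_crit + z_power) / abs(effect_size)) ** 2)


# Hypothetical effect sizes: a small one and a medium one.
for d in (0.1, 0.5):
    n_needed = required_n(d)
    print(d, n_needed, round(approx_power(d, n_needed), 3))
```

Because the required sample size grows with 1/d², halving the effect size roughly quadruples the number of participants needed, which is why small effects call for very large samples.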
Table 7
The results of the power analysis testing the vertical bias for the earlier/later questions
Columns: Initial sample size | Effect size | Power (%) | Required sample size to achieve 80% power
Rows: Experiment 1 (Chinese), Experiment 1 (English), Experiment 2, Experiment 3, Experiment 4
The effect size is the standardized mean difference of the vertical RT minus the horizontal RT, which represents the vertical bias. The alpha level of the significance test was set at .05.

Table 8
Error rates and the mean RTs for the primes and the targets
Columns: Error rate (Prime 1 | Prime 2 | Target) | Mean RT (Prime 1 | Prime 2 | Target)
Rows: Experiment 1 (Chinese), Experiment 1 (English), Experiment 2, Experiment 3, Experiment 4

If the fixed pattern of the answers for the prime questions had encouraged the participants to skip processing the primes, the error rates would have been close to zero, and the reaction
times would have been much shorter. Thus, there is no indication that the participants in our study skipped processing the prime questions.

In sum, the two parts of the study led us to conclude that Chinese speakers do not conceptualize time differently from English speakers. This conclusion, however, must be limited to the way time is expressed spatially. Whether Chinese and English speakers might differ in other ways of conceptualizing time remains an open question; so does the linguistic relativity claim. One lesson to be learned from this investigation and Boroditsky's is that researchers can reach erroneous conclusions when they examine a cross-language issue without competent knowledge of the languages they examine. The controversy over the issue of counterfactual reasoning serves as a case in point (Au, 1983; Bloom, 1981). Collaborations involving native speakers are strongly advised in this type of research.

Appendix A. Supplementary data

Supplementary data associated with this article can be found, in the online version, at doi: /j.cognition

References

Au, T. K.-F. (1983). Chinese and English counterfactuals: The Sapir–Whorf hypothesis revisited. Cognition, 15.
Bloom, A. H. (1981). The linguistic shaping of thought: A study in the impact of language on thinking in China and the West. Hillsdale, NJ: Lawrence Erlbaum Associates.
Boroditsky, L. (2001). Does language shape thought? Mandarin and English speakers' conceptions of time. Cognitive Psychology, 43(1), 1–22.