The inclusion of a writing subtest on admissions and other major proficiency tests has long been a matter of contention among both university admissions departments and TESOL professionals. While tests such as the TOEFL iBT, SAT, IELTS (both Academic and General Training versions), and the Cambridge family of exams each include one or more direct measures of writing ability, the question of whether such measures are worthwhile continues to linger. Ann Raimes, in a well-known TESOL Quarterly article on the former TOEFL Test of Written English¹, discussed concerns that, by their nature, remain relevant today. These include topic content and choice, the scoring system, the question of what such a test actually measures, and the issue of whether a direct test of writing is needed at all for judging students’ admissibility.
Test designers have implemented various strategies to address these issues. With respect to topic content and choice, the following are features of currently popular exams: the TOEFL iBT includes two writing tasks, one a summary task on a topic integrated with reading and listening, and the other an opinion essay on a separate topic (with only one topic choice)². The Cambridge C2 Proficiency exam also contains two writing tasks, but their nature is somewhat different from those of the TOEFL iBT: while the first is a summary task based on two reading passages, the second is related to living rather than studying abroad, in keeping with the test’s dual purpose of serving as a proficiency measure for both work and study. Unlike the second TOEFL iBT writing task, it offers a choice of four topics, from which test takers choose one.³ Both the IELTS Academic and General Training exams also feature two writing tasks, but neither offers a choice of topics.⁴
While each of the above exams is scored differently, the questions of what they actually measure and, by extension, whether they are needed at all are, from a validity standpoint, the most important issues to wrestle with. Two aspects of validity come into play. The first and most commonly considered is construct validity; that is, does a test in fact measure what it claims to? Since our focus is on admissions tests, and more specifically tests for admission to post-secondary institutions, three points are of interest: topic content, topic choice, and the nature of the writing to be produced.
Apart from the Cambridge C2 Proficiency exam, which features one academic and one living-abroad task³, the content of these writing exams is academic in nature. The first TOEFL iBT writing task, for example, is a summary task contrasting topically related listening and reading passages on a subject such as voting systems; the second is an argumentative essay on a separate topic with no choice of prompt². The IELTS Academic writing test, meanwhile, consists first of an interpretive task focusing on a visual prompt, such as a bar graph dealing with population patterns in education, while the second task, like that of the TOEFL iBT, is an argumentative essay with no choice of prompt⁴.
So, in terms of content, it would appear that these exams, apart from the second Cambridge C2 writing task, are measuring what they claim to measure: academic writing ability. Yet there are two troubling points here. The first has to do with prompt topic: if the topic of a particular prompt is more familiar to some test takers than to others, will this have a differential effect on their performance, compromising the extent to which the task truly measures writing ability, particularly for those who find the topic very unfamiliar? This issue is mitigated to some extent by the requirement to complete two writing tasks rather than one, yet the presence of even one such task on an exam surely has the potential to compromise construct validity and, by extension, the validity of any decision an admissions officer bases on it.
However, that is only one aspect of validity of concern here, as I intimated above. The second aspect to consider, and the one which, I suspect, is for many of us the Achilles heel of these tests, is predictive validity. Certainly, if there is any question concerning construct validity with respect to topic fairness for a given writing task, then its predictive validity is also open to question. We can also question predictive validity when we consider that only two topics are covered, though that is certainly better than one. It is the limited amount of writing required of test takers, however, that should give us the strongest concerns about predictive validity.
When we recall our own experience as undergraduate students, how many term paper assignments did we receive requiring us to produce 2,000 words or even more? In light of this, the length requirements of the writing tasks on these exams should make us wonder whether they have any predictive validity to speak of. The integrated and non-integrated writing tasks on the TOEFL iBT, for example, have suggested lengths of up to a mere 225 words and a minimum of 300 words, respectively². For the two IELTS Academic writing tasks, the required minimums are only 150 and 250 words⁴, while for the Cambridge C2 academic writing task, the required range is just 240-280 words³.
Certainly, from the point of view of practicality, the large number of learners taking such tests requires that each task be kept to a manageable length, so that scoring does not become a logistical nightmare. Yet the nagging question remains: how can a test taker’s success on such short tasks suggest their readiness for regular university courses? This is a natural question, yet it is only one side of the coin with respect to predictive validity. For, as a colleague of mine has very helpfully pointed out, the other side cannot be overlooked: while we may hesitate to accept that success on such brief writing tasks is at least a fair predictor of success in college, the inability to complete them at an acceptable level of proficiency raises a critical red flag, namely that a learner indeed lacks the readiness for regular university courses in an English-medium institution. This, I would suggest, is the most valuable information that the academic writing tasks on a test such as the TOEFL iBT or the IELTS Academic exam can provide: the indication that a learner is not yet ready for regular university courses.
¹Raimes, A. (1990). The TOEFL Test of Written English: Causes for Concern. TESOL Quarterly, 24(3), 427-442.
²https://www.ets.org/Media/Tests/TOEFL/pdf/SampleQuestions.pdf
³http://www.cambridgeenglish.org/exams-and-tests/proficiency/
⁴http://www.cambridgeenglish.org/exams-and-tests/ielts/preparation/