Defining Constructs and Assessment Design

This chapter describes the evolution of constructs and illustrates current construct issues that affect the design of second language assessments. To understand the term construct better, the chapter begins with historical overviews presented from two perspectives. First, educational and psychological measurement ideas are reviewed in terms of the Standards for Educational and Psychological Testing (the Standards). The Standards have represented a consensus among American scholars and practitioners on what information is most helpful and important for guiding the development and use of tests. Second, language testing ideas are reviewed using the Standards as a backdrop. In both overviews, we can see shifts in meaning that have led to our current state.

The second part of the chapter discusses four inter-related issues that affect current and future designs of second language assessments: the relationship among measure, construct, and theory; a move away from the 1999 Standards view of construct validity and toward an argument-based approach; the way one's worldview affects the construct definition; and the way the inferences we want to make from test scores are influenced by the test developer's view both of the world and of validity.

A better understanding of the various meanings of construct in second language assessment should alert test score users, students, and researchers alike to look beyond the term itself and examine the nuances of its use. As test developers adopt an argument-based approach to validity, they should be aware that constructs are not always necessary in assessment design; if constructs are included, however, developers have a responsibility to define them in relation to a model or a theory and to examine the predicted relations.

References

Alderson, J. C. (2000). Assessing reading. Cambridge, England: Cambridge University Press.
10.1017/CBO9780511732935
Alderson, J. C., Clapham, C., & Wall, D. (1995). Language test construction and evaluation. Cambridge, England: Cambridge University Press.
American Council on the Teaching of Foreign Languages. (1986). ACTFL proficiency guidelines. Hastings-on-Hudson, NY: Author.
American Educational Research Association & National Council on Measurement in Education. (1954). Technical standards for psychological tests and diagnostic techniques. Washington, DC: Author.
American Educational Research Association, American Psychological Association, & National Council on Measurement in Education. (1966). Standards for educational and psychological tests and manuals. Washington, DC: Author.
American Educational Research Association, American Psychological Association, & National Council on Measurement in Education. (1974). Standards for educational and psychological testing. Washington, DC: Author.
American Educational Research Association, American Psychological Association, & National Council on Measurement in Education. (1985). Standards for educational and psychological tests. Washington, DC: Author.
American Educational Research Association, American Psychological Association, & National Council on Measurement in Education. (1999). Standards for educational and psychological testing. Washington, DC: Author.
Bachman, L. (1990). Fundamental considerations in language testing. Oxford, England: Oxford University Press.
Bachman, L. (2002). Task-based language performance assessment. Language Testing, 19, 453–76.
10.1191/0265532202lt240oa
Bachman, L. (2005). Building and supporting a case for test use. Language Assessment Quarterly, 2(1), 1–34.
10.1207/s15434311laq0201_1
Bachman, L. (2006). Generalizability. In M. Chalhoub-Deville, C. Chapelle, & P. Duff (Eds.), Inference and generalizability in applied linguistics (pp. 165–207). Philadelphia, PA: John Benjamins.
10.1075/lllt.12.11bac
Bachman, L., & Palmer, A. (1981). A multitrait-multimethods investigation into the construct validity of six tests of speaking and reading. In A. Palmer, P. Groot, & G. Trosper (Eds.), The construct validation of tests of communicative competence (pp. 149–65). Washington, DC: TESOL.
Bachman, L., & Palmer, A. (1982). The construct validation of some components of communicative proficiency. TESOL Quarterly, 16, 449–65.
10.2307/3586464
Bachman, L., & Palmer, A. (1996). Language testing in practice. Oxford, England: Oxford University Press.
Briere, E. (1975). Current trends in second language testing. In L. Palmer & B. Spolsky (Eds.), Papers on language testing 1967–1974 (pp. 220–8). Washington, DC: TESOL.
Brooks, L. (2009). Interactivity in pairs in a test of oral proficiency: Co-constructing a better performance. Language Testing, 26, 341–66.
10.1177/0265532209104666
Buck, G. (2000). Assessing listening. Cambridge, England: Cambridge University Press.
Byrnes, H. (2002). The role of task and task-based assessment in a content-oriented collegiate foreign language classroom curriculum. Language Testing, 19, 419–37.
10.1191/0265532202lt238oa
Canale, M. (1983). On some dimensions of language proficiency. In J. Oller (Ed.), Issues in language testing research (pp. 333–42). Rowley, MA: Newbury House.
Canale, M., & Swain, M. (1980). Theoretical basis of communicative approaches to second language teaching and testing. Applied Linguistics, 1, 1–47.
10.1093/applin/1.1.1
Carroll, J. B. (1941). A factor analysis of verbal abilities. Psychometrika, 6, 279–307.
10.1007/BF02288585
Carroll, J. B. (1958). A factor analysis of two foreign language aptitude batteries. Journal of General Psychology, 59, 3–19.
10.1080/00221309.1958.9710168
Celce-Murcia, M., Dornyei, Z., & Thurrell, S. (1995). Communicative competence: A pedagogically motivated model with content specifications. Issues in Applied Linguistics, 6(2), 5–35.
10.5070/L462005216
Chalhoub-Deville, M. (1997). Theoretical models, assessment frameworks and test construction. Language Testing, 14, 3–22.
10.1177/026553229701400102
Chalhoub-Deville, M. (2003). Second language interaction: Current perspectives and future trends. Language Testing, 20, 369–83.
10.1191/0265532203lt264oa
Chapelle, C. (1998). Construct definition and validity inquiry in SLA research. In L. Bachman & A. Cohen (Eds.), Interfaces between second language acquisition and language testing research (pp. 32–70). Cambridge, England: Cambridge University Press.
Chapelle, C. (2006). L2 vocabulary acquisition theory. In M. Chalhoub-Deville, C. Chapelle, & P. Duff (Eds.), Inference and generalizability in applied linguistics (pp. 47–64). Philadelphia, PA: John Benjamins.
10.1075/lllt.12.05cha
Chapelle, C., Enright, M., & Jamieson, J. (2008). Test score interpretation and use. In C. Chapelle, M. Enright, & J. Jamieson (Eds.), Building a validity argument for the Test of English as a Foreign Language (pp. 1–25). New York, NY: Routledge.
Colby-Kelly, C., & Turner, C. (2007). AFL research in the L2 classroom and evidence of usefulness: Taking formative assessment to the next level. Canadian Modern Language Review, 64, 9–38.
10.3138/cmlr.64.1.009
Creswell, J., & Plano Clark, V. (2011). Designing and conducting mixed methods research (2nd ed.). Los Angeles, CA: Sage.
Cronbach, L., & Meehl, P. (1955). Construct validity in psychological tests. Psychological Bulletin, 52, 281–302.
10.1037/h0040957
Douglas, D. (2000). Assessing language for specific purposes. Cambridge, England: Cambridge University Press.
Duran, R., Canale, M., Penfield, J., Stansfield, C., & Liskin-Gasparro, J. (1985). TOEFL from a communicative viewpoint on language proficiency: A working paper (TOEFL research report 17). Princeton, NJ: Educational Testing Service.
Embretson (Whitely), S. (1983). Construct validity: Construct representation versus nomothetic span. Psychological Bulletin, 93, 179–97.
10.1037/0033-2909.93.1.179
Fulcher, G., & Davidson, F. (2007). Language testing and assessment. New York, NY: Routledge.
10.4324/9780203449066
Galaczi, E. (2008). Peer–peer interaction in a speaking test: The case of the First Certificate in English examination. Language Assessment Quarterly, 5(2), 89–119.
10.1080/15434300801934702
Gulliksen, H. (1950). Intrinsic validity. American Psychologist, 5, 511–17.
10.1037/h0054604
Halliday, M. A. K. (1970). Language structure and language function. In J. Lyons (Ed.), New horizons in linguistics (pp. 140–65). Harmondsworth, England: Penguin Books.
Heaton, J. (1975). Writing English language tests. London, England: Longman.
Hughes, A., & Porter, D. (Eds.). (1983). Current developments in language testing. London, England: Academic Press.
Hymes, D. (1972). On communicative competence. In J. Pride & J. Holmes (Eds.), Sociolinguistics (pp. 269–93). Harmondsworth, England: Penguin Books.
Jamieson, J. (2011). Achievement of classroom language learning. In E. Hinkel (Ed.), Handbook of research in second language learning and teaching (Vol. 2, pp. 768–85). New York, NY: Routledge.
Kane, M. (1992). An argument-based approach to validity. Psychological Bulletin, 112, 527–35.
10.1037/0033-2909.112.3.527
Kane, M. (2002). Validating high-stakes testing programs. Educational Measurement: Issues and Practice, 21(1), 319–42.
Kane, M. (2006). Validation. In R. Brennan (Ed.), Educational measurement (4th ed., pp. 17–64). Westport, CT: Greenwood.
10.3917/rhu.016.0017
Kane, M., Crooks, T., & Cohen, A. (1999). Validating measures of performance. Educational Measurement: Issues and Practice, 18(2), 5–17.
10.1111/j.1745-3992.1999.tb00010.x
Lado, R. (1961). Language testing. London, England: Longman.
Lantolf, J., & Frawley, W. (1988). Proficiency, understanding the construct. Studies in Second Language Acquisition, 10, 181–95.
10.1017/S0272263100007300
Luoma, S. (2004). Assessing speaking. Cambridge, England: Cambridge University Press.
10.1017/CBO9780511733017
MacGregor, D., Louguit, M., Yanosky, T., Fidelman, C., Pan, M., Huang, X., & Kenyon, D. (2010). Annual technical report for ACCESS for ELLs English Language Proficiency Test, Series 200, 2008–2009 administration. Madison, WI: WIDA Consortium.
McNamara, T. (1996). Measuring second language performance. London, England: Longman.
Messick, S. (1975). The standard problem: Meaning and values in measurement and evaluation. American Psychologist, 30, 955–66.
10.1037/0003-066X.30.10.955
Messick, S. (1989). Validity. In R. Linn (Ed.), Educational measurement (3rd ed., pp. 13–103). New York, NY: Macmillan.
Mislevy, R. (2009). Validity from the perspective of model-based reasoning. In R. L. Lissitz (Ed.), The concept of validity: Revisions, new directions, and applications (pp. 83–108). Charlotte, NC: Information Age Publishing.
Mislevy, R., Steinberg, L., & Almond, R. (2003). On the structure of educational assessment. Measurement: Interdisciplinary Research and Perspectives, 1, 3–62.
10.1207/S15366359MEA0101_02
Mislevy, R., & Yin, C. (2009). If language is a complex adaptive system, what is language assessment? Language Learning, Supplement 1, 249–67.
10.1111/j.1467-9922.2009.00543.x
Norris, J. (2002). Interpretations, intended uses, and designs in task-based language assessment. Language Testing, 19, 337–46.
10.1191/0265532202lt234ed
Oller, J. (1973). Cloze tests and second language proficiency and what they measure. Language Learning, 23, 105–18.
10.1111/j.1467-1770.1973.tb00100.x
Oller, J. (1983). “g”, what is it? In A. Hughes & D. Porter (Eds.), Current developments in language testing (pp. 35–7). London, England: Academic Press.
Oller, J., & Perkins, K. (Eds.). (1978). Language in education: Testing the tests. Rowley, MA: Newbury House.
Palmer, A., Groot, P., & Trosper, G. (Eds.). (1981). The construct validation of tests of communicative competence. Washington, DC: TESOL.
Poehner, M., & Lantolf, J. (2005). Dynamic assessment in the classroom. Language Teaching Research, 9, 233–65.
10.1191/1362168805lr166oa
Purpura, J. (2004). Assessing grammar. Cambridge, England: Cambridge University Press.
10.1017/CBO9780511733086
Read, J. (2000). Assessing vocabulary. Cambridge, England: Cambridge University Press.
10.1017/CBO9780511732942
Rea-Dickins, P. (2001). Mirror, mirror on the wall: Identifying processes of classroom assessment. Language Testing, 18, 429–62.
10.1177/026553220101800407
Stoynoff, S., & Chapelle, C. (2005). ESOL tests and testing. Alexandria, VA: TESOL.
Swain, M. (2001). Examining dialogue: Another approach to content specification and to validating inferences drawn from test scores. Language Testing, 18, 275–302.
10.1177/026553220101800302
Thurstone, L. (1947). Multiple-factor analysis. Chicago, IL: University of Chicago Press.
Tyler, R. (1934). Constructing achievement tests. Columbus, OH: Ohio State University.
Valette, R. (1967). Modern language testing. New York, NY: Harcourt, Brace & World.
Weigle, S. (2002). Assessing writing. Cambridge, England: Cambridge University Press.
10.1017/CBO9780511732997

The Companion to Language Assessment
