Three Key Questions on Measuring Learning
To gauge different types of learning, we need a broader collection of measures, with a greater emphasis on authentic, performance-based projects.
Educators, policy makers, parents, and others interested in improving the way we measure learning in today's schools need to examine three essential questions: 1) What really matters in a contemporary education? 2) How should we assess those things that matter? 3) How might our assessments enhance learning that matters, not just measure it?
What Matters in a Contemporary Education?
Any consideration of educational measurement must begin with the desired outcomes to be measured. In our work on the Understanding by Design® framework, the late Grant Wiggins and I described four key types of educational goals—knowledge, basic skills, conceptual understanding, and long-term transfer goals. All of them are essential to a successful education in the 21st century (Wiggins & McTighe, 2011). While these goals are interrelated, their distinctions are important because each type requires different approaches to both teaching and assessment. Let's look at each in turn.
Knowledge goals specify what students should know—factual information (state capitals, multiplication tables), vocabulary terms, and basic concepts (climate, balance). The attainment of knowledge goals can be best gauged through objective test or quiz items and teacher questioning.
Skill goals state what students should be able to do. Every subject area contains basic skills (addition, handwriting, drawing, dribbling a basketball) that are essential to building competency and mastery. Teachers can assess student proficiency in a particular skill through direct observation of a performance or by examination of an end product that required use of the skill. Unlike with assessments of knowledge, for which there is usually a single, "correct" answer, skill performances can be best tracked along a continuum of proficiency levels from novice to expert.
Understanding goals refer to students' grasp of conceptual "big ideas." Such ideas are inherently abstract. They may be in the form of concepts (patriotism), principles (F=ma), themes (friendship), issues (government regulations), or processes (problem solving). Understanding in this context generally cannot be assessed through multiple-choice or fill-in-the-blank test items. Instead, students need to provide explanations, justify conclusions, and support answers with evidence. As with a doctoral dissertation, there is a need for the defense, not just the answer.
Long-term transfer goals refer to students' capacity to apply what they've learned to a new situation or different context. Transfer goals are process oriented; they specify what we want students to be able to do with their learning in the long run when confronted by new opportunities and challenges. They tend to be reflected in the anchor standards or framework practices in official academic standards, but they are often transdisciplinary in nature (encompassing complex skills like critical thinking and collaboration, or developmental Habits of Mind such as persistence and self-regulation).
Transfer abilities, or qualities inherent in them, are also valued in the 21st-century workplace more than they were in the past (see fig. 1) (National Association of Colleges and Employers, 2016; Darling-Hammond & Adamson, 2013). Indeed, it's not too much to say that the future belongs to those who can apply their learning effectively in new situations.
Transfer abilities can best be measured through authentic, performance-based tasks, with well-developed rubrics for evaluation.
How Should We Assess the Things That Matter?
Assessment is a process by which we make inferences about what students know, understand, and can do. To allow valid inferences to be drawn from the results, an assessment must align with, and provide an appropriate measure of, a given goal. Moreover, because all forms of assessment are susceptible to measurement error, our inferences are more dependable when we consider multiple sources of evidence.
Given that there are different types of learning goals, we need an associated variety of assessment types to gather valid evidence of learning. Think of assessment as analogous to photography. Like the results on a test, a picture can be informative; however, no single photo can provide a complete portrayal of a situation. To continue the analogy, what we need is a photo album of evidence on student learning, not a snapshot—a collection of multiple measures, appropriately aligned to different types of learning outcomes that matter.
This raises a vital question concerning the alignment between assessments and educational goals: Are we currently assessing everything that matters, or only those things that are easiest to test and grade? With respect to large-scale, standardized assessments, the answer is fairly clear: many valued outcomes go untested. For example, virtually all current standards in English language arts include listening and speaking skills, which are generally acknowledged as the foundations of literacy. Yet those skills are rarely, if ever, assessed on standardized tests. Similarly, most standardized tests have limited capacity to assess transfer goals, or related complex skills like scientific investigation, historical inquiry, research, argumentation, and creative thinking.
Now for a related question: If some outcomes that matter are slipping through the cracks of standardized testing, are we doing a better job of measuring all valued outcomes through classroom assessments? Studies of classroom assessments raise doubts (Frey & Schmidt, 2010). For one study, a district collected classroom assessments from all K–12 teachers in all subjects during a six-week period (Gibble, 2000). Of the total of 664 assessments collected, 20 percent were identified via a random sample for analysis by a committee of teachers and administrators. Here were two of their findings: 1) The majority of the assessments (75.5 percent) measured the lowest levels of cognition (levels I and II on Bloom's Taxonomy), and 2) assessment items were predominantly (80 percent) in multiple-choice, true-false, matching, or fill-in-the-blank formats.
Although this study is dated, its findings are consistent with patterns that I have observed more recently as "test prep" pressures result in classroom assessments that mimic the formats (generally selected- and brief-constructed-response) of state and national accountability tests (McTighe, 2017). Ask yourself: What would the results be if you replicated this study in your school or district today? Are any of your valued outcomes not being properly assessed?
Given the limitations of large-scale testing and the status of classroom assessments, what changes do we need to make to ensure that we are assessing outcomes that matter? What assessment photos do we need for a composite album of evidence of learning? Traditional types of assessments offer sufficient ways of measuring students' knowledge and basic skills. For example, we can use multiple-choice or fill-in-the-blank test items to gauge students' knowledge of historical or scientific facts. However, to properly assess conceptual understanding, long-term transfer, and other complex skills, we need greater use of authentic, performance-based measures in which students are asked to: 1) apply their learning to a new situation, and 2) explain their thinking, show their reasoning, or justify their conclusion.
Authentic tasks are like the game in athletics. While the players have to possess knowledge (the rules) and specific skills (dribbling), playing the game also involves conceptual understanding (game strategies) and transfer (using skills and strategies to advantage in particular game situations). Assessing what matters must include assessing performance in a "game" in addition to tests of requisite knowledge and skills.
Another dimension of making sure we are assessing things that matter shifts the spotlight from educators to students. A photo album approach to assessment can enable learners to contribute their own personal "photos" as evidence of their accomplishments. They can be invited to propose ways of showing that they are meeting academic standards. Students and parents can be asked to contribute evidence of creativity, persistence, or community contributions accomplished outside of the school day. After all, isn't that what transfer means? Maintaining high standards does not require standardization of all measures.
Involving learners in creating the assessment portfolio builds students' capacity for self-assessment. The ability to honestly appraise one's performance against established criteria and performance standards is a life-long skill and a sign of intellectual maturity.
How Might Assessments Serve Learning?
Which leads us to our third big question: How might assessments become more integral to learning, as opposed to just evaluating it? To say that assessment should serve learning is a nice slogan, but what exactly does it mean? Over the years, Wiggins and I conducted a workshop exercise to explore this question. We asked participants to think of a highly eff