Tuesday, December 20, 2011

The pseudo-science of testing

A brilliant article in yesterday's NYTimes by Michael Winerip, titled "10 years of assessing students with scientific exactitude," describes the ups and downs experienced by the New York school system over a decade of No Child Left Behind-inspired testing.

It starts with:
In the last decade, we have emerged from the Education Stone Age. No longer must we rely on primitive tools like teachers and principals to assess children’s academic progress. Thanks to the best education minds in Washington, Albany and Lower Manhattan, we now have finely calibrated state tests aligned with the highest academic standards. What follows is a look back at New York’s long march to a new age of accountability.

After reading the full chronological listing, I didn't know whether to laugh or cry.


Perhaps the single hardest (worst) part of my job is making and grading tests. I never know in advance how a particular class of students is going to do on a particular test I make up. The students who do extremely well might do well on any variant of it, and the students who do extremely poorly may not do much better on some other variant. But for the vast majority of students, I find there is great sensitivity to every aspect of the exam: the choice of topics, the subtle deviations of the questions from exactly what has been covered in the notes, the book, or the homework assignments, the length of the exam, and even the ordering of the questions.

Even in a seemingly objective field of study like Engineering, there is a lot of subjectivity in how one grades (going beyond the obvious subjectivity inherent in the choice of questions to put on an exam). We try our best to be consistent and fair across all the students in the same class; but for the same question and answer, unless one uses shallow multiple-choice questions (the approach adopted by many standardized tests), it is all but certain that no two instructors would grade the same way. While there may be one way (or relatively few ways) to get the answer right, there are exponentially many combinations of errors that trip up students. Particularly if one wishes to go down the road of offering partial credit, the art of grading requires one to differentiate between these and place a value judgement on them. Do you give some credit to a student who got the right numerical answer through incorrect reasoning? Suppose one student took completely the wrong approach to the problem but applied it correctly, albeit arriving at the wrong answer, while another tried something new and original but failed with it and gave (if it is possible) an answer even further from the correct one: do you give the first more credit than the second? What if you discover upon grading that a question that seems perfectly straightforward to you has been misinterpreted by a number of the students to mean something quite different from what you had intended? How large does this number have to be for you to take the possible ambiguity in the wording into account when grading? Does it matter if the misinterpreted question is easier or harder than the originally intended question?

Unfortunately, the politics of public K-12 education and the economics of higher education dictate that we must always have assessment and grading. Testing is a necessary evil that we cannot wish completely away. Let's continue to strive to be as fair as possible in making and grading tests, but let us not pretend that test scores and GPAs are objective, noiseless measures of a student's intellectual capability (or, in the case of public schooling, of the effectiveness of a system of education).

Friday, December 09, 2011

No mistakes on the Bandstand

As someone who appreciates Jazz, I highly recommend this video. Stefon Harris talks about the importance of paying attention to one's teammates when improvising. He gives a great illustration of what it means to go with the flow, and how that's different from commanding the team to do something specific that one has already set one's mind to.

This talk is also a great metaphor applicable to many interactive activities that an academic is involved with. Whether it is working with Ph.D. students, research collaborators, or even in class while teaching, there has to be a lot of give and take, and mindful awareness combined with a certain letting go of the ego makes for a richer and more rewarding experience. Even seemingly discordant notes are an opportunity to go someplace new. 

I have been experiencing this increasingly in the classroom myself. I am finding that the more open I am to new ideas coming from the students through their comments and questions, and the more willing I am to digress from a pre-set path, the more interesting and creative the classroom experience is for all of us. This allows us to stray from the well-trodden paths of textbook exercises and occasionally discover entirely new problems. This semester, for instance, based on student questions in my wireless networks class, we formulated and solved an interesting new variant of the problem of power allocation across parallel channels to maximize total rate (a classic Information Theory problem that is solved using the so-called "waterfilling" algorithm). This variant was similar enough that we could use the same approach, but different enough that we could appreciate resource allocation at a deeper level. And because it was motivated by questions the students themselves had asked, and was clearly something new to all of us, I think it might just have made a more lasting impression on at least some students than the usual routine.
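For readers who have not seen it, here is a minimal sketch of the classic textbook waterfilling problem. It is not the variant from class, which is not reproduced here. It assumes parallel Gaussian channels with known, positive noise levels and a single total power budget; the function names `waterfill` and `total_rate` and the example numbers are made up purely for illustration.

```python
import math


def waterfill(noise, total_power):
    """Split total_power across channels so as to maximize the sum rate.

    Classic result: channel i gets max(0, water_level - noise[i]), where
    water_level is chosen so that the allocations add up to total_power.
    Assumes a non-empty list of positive noise levels (illustrative only).
    """
    quietest_first = sorted(noise)
    # Try filling all channels, then drop the noisiest one at a time
    # until the water level sits above every channel still in use.
    for k in range(len(quietest_first), 0, -1):
        water_level = (total_power + sum(quietest_first[:k])) / k
        if water_level > quietest_first[k - 1]:
            break
    return [max(0.0, water_level - n) for n in noise]


def total_rate(noise, power):
    """Sum of log2(1 + power/noise) over the channels, in bits per use."""
    return sum(math.log2(1 + p / n) for p, n in zip(power, noise))


noise_levels = [1.0, 2.0, 5.0]   # hypothetical noise levels
budget = 3.0                     # hypothetical power budget
allocation = waterfill(noise_levels, budget)
print(allocation)
print(total_rate(noise_levels, allocation))
```

With these made-up numbers the noisiest channel gets no power at all, and the budget is split between the two quieter ones so that power plus noise comes to the same "water level" on both. That is the picture behind the name.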

In light of the ongoing debates about online education, it also occurs to me that this kind of improvisational interactive classroom experience is precisely what cannot be replicated in mass-marketed pre-packaged instructional videos.