24 April 2009

The Good and Bad of Multiple-Choice Testing

The testing effect refers to the improvement in students' performance when they are tested repeatedly on their knowledge (see the previous blog entry). Frequent testing with multiple-choice questions has also been shown to be effective not just for recall but for learning at higher levels of Bloom's taxonomy. So the testing effect is not limited to memory recall of facts; it also extends to application-type learning. However, the presence of incorrect answers (also known as "lures") in multiple-choice questions may cause students to acquire incorrect concepts through faulty reasoning. Even so, repeated testing produces more positive benefits than negative side effects.

One way to compensate for the negatives of multiple-choice testing is to provide immediate feedback, which corrects learners' misconceptions and keeps them from constructing incorrect knowledge. Another way is to offer a "don't know" option or to impose a penalty for selecting a wrong answer. This can also reduce the amount of guessing. Lastly, a different form of testing may be used, such as short-answer questions, which seem to have even more positive benefits than multiple-choice questions.
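
To make the penalty idea concrete, here is a minimal sketch in Python. It assumes one common scheme, which may not be the one used in the paper: each wrong answer costs 1/(k - 1) of a point on a question with k options, and a "don't know" or blank answer scores zero. The function name and parameters are made up for illustration.

```python
def penalized_score(correct: int, wrong: int, options: int = 4) -> float:
    """Score a multiple-choice test with a guessing penalty.

    Assumed scheme (not taken from the referenced paper): each wrong
    answer costs 1 / (options - 1) of a point, while "don't know" or
    blank answers cost nothing.
    """
    return correct - wrong / (options - 1)


# A student who guesses blindly on 20 four-option questions gets about
# 5 right and 15 wrong on average, so guessing earns nothing.
print(penalized_score(5, 15))   # 0.0

# A student with 14 right and 6 wrong loses 2 points to the penalty.
print(penalized_score(14, 6))   # 12.0
```

Under this assumed scheme the expected gain from a blind guess is zero, which is why the "don't know" option becomes the sensible choice when a student has no idea.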

Reference:

Marsh, E., Roediger III, H., Bjork, R., Bjork, E. (2007). The Memorial Consequences of Multiple-Choice Testing. Psychonomic Bulletin & Review. 14(2), 194-199.

09 April 2009

Advice to Students: do more testing and less studying!!

At least for memory recall, taking a memory test repeatedly rather than studying repeatedly results in much better long-term retention. The abstract from the first reference below says it all:
Taking a memory test not only assesses what one knows, but also enhances later retention, a phenomenon known as the testing effect. We studied this effect with educationally relevant materials and investigated whether testing facilitates learning only because tests offer an opportunity to restudy material. In two experiments, students studied prose passages and took one or three immediate free-recall tests, without feedback, or restudied the material the same number of times as the students who received tests. Students then took a final retention test 5 min, 2 days, or 1 week later. When the final test was given after 5 min, repeated studying improved recall relative to repeated testing. However, on the delayed tests, prior testing produced substantially greater retention than studying, even though repeated studying increased students’ confidence in their ability to remember the material. Testing is a powerful means of improving learning, not just assessing it.
In other words, if S stands for study and T stands for testing, a final recall test after the sequence STTT shows much higher retention than one after SSST or SSSS. In computer science, most learning requires reasoning rather than memory recall, although a good repository of learned concepts is definitely an asset to a good programmer. However, in a number of interviews with students enrolled in a first-year programming course, when asked how they prepared for exams, 90% of the students said they read lecture notes and textbooks, and only about 10% mentioned doing some coding and testing. The learning-by-experimentation concept seems to be foreign to many students.

It would not be surprising if the result from Roediger and Karpicke applied just as well to reasoning skills as to memory recall. What would be interesting for CS is to identify the set of core skills and concepts that expert programmers need, apply this strategy of studying and (mostly) testing throughout a program of study rather than just a single course, and conduct a longitudinal study of students' retention and programming skills beyond graduation. Also, how can repeated testing be made "fun" for learners? Is there an "optimal" study and test sequence for CS courses?

References:

Roediger III, H., Karpicke, J. (2006). Test-Enhanced Learning. Psychological Science. 17(3), 249-255.

Karpicke, J., Roediger III, H. (2008). The Critical Importance of Retrieval for Learning. Science. 319, 966-968.

Two-stage Cooperative Exams

The idea of a two-stage cooperative exam is that students take the same exam repeatedly over an extended period of time but in different settings. They can work individually at first, then in pairs, or collaboratively in a larger group. The goal is to turn these testing sessions into a learning experience.

Here is an example of how this is implemented in a large class for midterm or final exams: during the first 30 minutes of the class period, the students individually take a multiple-choice exam with about 20 to 25 questions. They hand in their answer sheets at the end of the exam. Then, right away, they are given the same multiple-choice exam with some added questions and are asked to work on it collaboratively with someone close by for 45 minutes. They can use books, notes, and other resources. The exam grade is a weighted average of the two submissions: 75% from the first and 25% from the second. However, if this weighted grade is lower than the grade on the first submission (i.e. the solo effort alone), the final score is based solely on the first submission.
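
The grading rule above can be written as a short function. This is only a sketch of the rule as described in this post: the function name and the sample scores are invented for illustration, and the scores are assumed to be percentages.

```python
def two_stage_grade(solo: float, collaborative: float,
                    solo_weight: float = 0.75) -> float:
    """Combine the solo and collaborative scores of a two-stage exam.

    The weighted average counts the solo submission at 75% and the
    collaborative submission at 25%. If that average is lower than
    the solo score, the solo score is used on its own.
    """
    weighted = solo_weight * solo + (1.0 - solo_weight) * collaborative
    return max(solo, weighted)


# Collaboration helped: 0.75 * 70 + 0.25 * 90 = 75
print(two_stage_grade(70.0, 90.0))  # 75.0

# Collaboration hurt: the weighted average would be 75, so the solo 80 stands
print(two_stage_grade(80.0, 60.0))  # 80.0
```

Taking the maximum means the collaborative stage can only raise a student's grade, never lower it.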

With this simple change in exam format throughout the term, final exam scores improved markedly, from a mean of 74% to 80%, based only on the solo part of the exam. Although it might seem that the collaborative component simply boosted the final score, a statistical comparison with grades from previous years, when the exams had no collaborative component, shows no dramatic change in the grade distribution. Fewer students end up on the bottom rungs of the ladder with the two-stage cooperative exam strategy, but there is no increase on the upper rungs.

References:

Yuretich, R., Khan, S., Leckie, R., Clement. (March 2001). Active-Learning Methods to Improve Student Performance and Scientific Interest in a Large Introductory Oceanography Course. Journal of Geoscience Education. 49(2), 111-119.

Yuretich, R. Accessing Higher-Order Thinking in Large Introductory Science Classes.