I harp a lot about how computer science is a great approach to thinking about representations. This semester, however, I’ve started thinking about how computer science is also fertile ground for getting students to reflect on their own work. I seem to recall a study about how metacognition training doesn’t really work (or rather, that it falls back to pre-training levels very quickly). In this case, though, I don’t see much opportunity cost in including it in the class.
One thing I’ve been doing in my assignments, especially as the semester went on, was to include questions asking students how they think they did and to justify that answer. In the Bayesian network homework, for example, one of the questions was about creating evidence and explaining why the resulting posteriors make sense. A follow-up question then asked them to find posteriors that don’t make sense, then consider whether it’s their network model that’s incorrect, or their intuition. The answers were okay – nothing great – but at least I enjoyed the experience.
The NLP assignment, which was due just a few days ago, did something similar. It was one of the more “standard” computer science assignments: creating an information extraction program. Among the questions they had to answer, however, was one that asked them to give their own program a score. In a normal computer science project, of course, there would be some set of hand-curated answers, and students’ work would be compared against it. In this case, because the dataset is new (and because I’m lazy), I didn’t create such a solution.
It was interesting to see the responses. I didn’t specify what the score meant – only that it should be between 0 and 100. Many students gave their program fairly high scores, but justified them only vaguely, by saying that it does “well” on a “majority” of courses. The point, of course, is for them to think about how they would actually measure this, and what constitutes success. For example, how do they incorporate the false positives and false negatives into their score, if at all? Does the score even represent the percentage of courses they got correct, and if so, which courses from which departments? Very few students even considered these questions in their answers.
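One way to make the false-positive/false-negative question concrete is the standard precision/recall/F1 breakdown. This is just an illustrative sketch – the course lists below are invented, not from the actual assignment – showing how a single vague 0–100 score hides two separate kinds of error:

```python
# Hypothetical example: scoring an extraction program against a
# hand-checked answer set. The course names are made up for illustration.

gold = {"CS 101", "CS 240", "MATH 210", "ENGL 105"}       # hand-curated correct courses
predicted = {"CS 101", "CS 240", "MATH 210", "HIST 330"}  # what the program extracted

true_positives = len(gold & predicted)
false_positives = len(predicted - gold)   # extracted, but wrong
false_negatives = len(gold - predicted)   # correct, but missed

precision = true_positives / (true_positives + false_positives)
recall = true_positives / (true_positives + false_negatives)
f1 = 2 * precision * recall / (precision + recall)

print(f"precision={precision:.2f} recall={recall:.2f} f1={f1:.2f}")
# → precision=0.75 recall=0.75 f1=0.75
```

A student who reports “75” still has to decide which of these three numbers that 75 refers to – which is exactly the kind of decision the question was meant to surface.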
The questions from both assignments, especially the NLP one, highlight something in computer science that I’ve only otherwise found in design: the general lack of specified goals, and therefore the freedom to select your own metric. Building models in the sciences has some of this, but you’re still matching your model to data. In design, on the other hand, the explicit metric for evaluation is often unclear. Consider user interface design: you could optimize for least clicks to goal, least time to goal, least time on site, or any of a hundred other metrics.
Because of this, students need to quickly ask themselves what exactly they are trying to accomplish, and whether what they’re doing is the best way towards that goal. I’m not sure if I can incorporate this into my intro course next semester, but I do plan on requiring students to do more self-evaluation in the future.