I apologize for the late post. If you are not aware, things are a little hectic at Oxy, which has thrown everyone’s schedule off. Michigan is facing similar problems, as are more than a couple of other universities, but I can’t see that school being as disrupted as Oxy is right now.
Anyway, a long time ago I talked about how my AI class had a lot fewer programmers than I thought, and how I had to rethink the course and the assignments. Now that the semester is almost over, and all the assignments have been released (my students are working on the last one now), I want to talk about what I ended up doing, in case it’s useful to other people out there.
Just as a reminder, the course I’m teaching is an upper-level cognitive science course on AI, with the prerequisite of either a cognitive science course or a computer science course. The four topics I covered ended up being reinforcement learning, cognitive architectures, Bayesian networks, and natural language processing. Within this division were five assignments: two in reinforcement learning and one in each of the other topics.
Perhaps because they came first, I struggled with the two reinforcement learning assignments. The first was designed to get students thinking about the simplest reinforcement learning algorithm (tabular Q-learning) and how its parameters (learning rate, exploration rate) affect the results, with a dash of more conceptual questions thrown in. I did end up using an IPython notebook, which I thought worked well; it was, however, also the only time I used IPython, since the remaining topics did not require as much computational support.
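For concreteness, here is a minimal sketch of tabular Q-learning on a toy corridor world of my own invention (not the actual assignment environment), showing where the learning rate `alpha` and exploration rate `epsilon` enter:

```python
import random

# A toy corridor: states 0..4, actions left (-1) and right (+1),
# reward 1 for reaching the rightmost state, which ends the episode.
N_STATES = 5
ACTIONS = [-1, +1]

def step(state, action):
    next_state = max(0, min(N_STATES - 1, state + action))
    reward = 1.0 if next_state == N_STATES - 1 else 0.0
    return next_state, reward, next_state == N_STATES - 1

def q_learning(alpha=0.1, epsilon=0.1, gamma=0.9, episodes=500, seed=0):
    random.seed(seed)
    q = {(s, a): 0.0 for s in range(N_STATES) for a in ACTIONS}
    for _ in range(episodes):
        state, done = 0, False
        while not done:
            # epsilon-greedy: explore with probability epsilon
            if random.random() < epsilon:
                action = random.choice(ACTIONS)
            else:
                action = max(ACTIONS, key=lambda a: q[(state, a)])
            next_state, reward, done = step(state, action)
            # Q-learning update: move the estimate toward the bootstrapped
            # target at a rate controlled by alpha
            best_next = max(q[(next_state, a)] for a in ACTIONS)
            q[(state, action)] += alpha * (reward + gamma * best_next
                                           - q[(state, action)])
            state = next_state
    return q
```

Even in a world this small, rerunning with different `alpha` and `epsilon` values makes their effects visible, which was the spirit of the assignment’s questions.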
In contrast, the second assignment was meant to focus on applications of reinforcement learning. Its main component was for students to frame a problem of their choice as a reinforcement learning problem, specifying the states, actions, and rewards, and justifying their choices. Since there were some computer science students in the class, I also offered the option of implementing an agent with eligibility traces, with the goal of having students learn something about running experiments and explaining their results.
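For anyone unfamiliar with the eligibility-trace option, here is a rough sketch (my own toy example, not the assignment’s code) of SARSA(λ)-style accumulating traces on a small corridor world; the trace lets each TD error update all recently visited state-action pairs at once:

```python
import random

# Toy corridor: states 0..4, reward 1 for reaching the right end.
N_STATES = 5
ACTIONS = [-1, +1]

def sarsa_lambda(alpha=0.1, epsilon=0.1, gamma=0.9, lam=0.8,
                 episodes=500, seed=0):
    random.seed(seed)
    q = {(s, a): 0.0 for s in range(N_STATES) for a in ACTIONS}

    def choose(s):
        if random.random() < epsilon:
            return random.choice(ACTIONS)
        return max(ACTIONS, key=lambda a: q[(s, a)])

    for _ in range(episodes):
        traces = {k: 0.0 for k in q}  # eligibility traces, reset per episode
        state, action = 0, choose(0)
        done = False
        while not done:
            next_state = max(0, min(N_STATES - 1, state + action))
            reward = 1.0 if next_state == N_STATES - 1 else 0.0
            done = next_state == N_STATES - 1
            next_action = choose(next_state)
            # TD error for this transition
            delta = (reward
                     + (0.0 if done else gamma * q[(next_state, next_action)])
                     - q[(state, action)])
            traces[(state, action)] += 1.0  # accumulating trace
            # spread the error over all recently visited state-action pairs
            for k in q:
                q[k] += alpha * delta * traces[k]
                traces[k] *= gamma * lam
            state, action = next_state, next_action
    return q
```

Comparing learning curves with and without the trace (i.e., with `lam=0`) is exactly the kind of experiment I was hoping students would run and explain.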
I’ll be honest: this assignment was a disaster, especially the application option. The first issue was that the question was too broad; without an actual computational model to constrain it, students’ representations were all over the place. The bigger issue, however, which applied to both options, was that my students and I had different understandings of what it meant to justify their choices. Since we had covered topics like partial observability and exploration/exploitation in class, I thought they would use those concepts in their answers. Instead, their justifications were often phrased in terms of the domain: why a feature is important, but not whether there are better ways of representing the feature or how it would affect the agent’s learning.
I realized my mistake once the students turned in their work. My original idea was simply to take a lecture period and have students workshop each other’s papers, after they had gotten some feedback on the kinds of questions they should be answering. The problem was that I didn’t really want them to go back and revise their papers, but I did want to see them learning. So I did something clever: I told the students that the assignments were not what I wanted, then asked them what I should do. Which is how a student came to suggest that they each just talk to me for fifteen minutes instead, so I could ask them questions about their assignment. This was a particularly apt suggestion because, in lieu of final exams, I’m planning a one-on-one conversation with each student, where they talk about two of their projects (we would each pick one). By folding a conversation into this assignment, I’m not only salvaging the assignment, but also giving the students practice while they have a conversation partner.
(By the way, I’m still not sure how I should conduct the final conversation…)
I got more ambitious for Bayesian networks. Since math was not the students’ strong suit, I instead focused the assignment on applying causal networks. Students had to pick some phenomenon they wanted to model, create a network with at least nine factors, and do any research necessary to structure the nodes and provide the conditional probabilities. While there are existing Bayesian network packages, most of them either require installation or are entirely unintuitive to use.
From the evaluations, I can tell the students enjoyed the assignment, and personally I enjoyed grading it too. Part of it was that I had learned to be more explicit about the questions I wanted answered, but part of it was also that students chose interesting topics to model. One of my original reasons for getting into computer science was that it could simulate reality, so I had a lot of fun testing whether (for example) a car’s country of manufacture had any effect on its gas mileage (the answer: not really). In addition to creating and justifying their networks, students still had to do some manual probability calculations, but nothing too complex. I also offered the option of writing a Bayes net solver, but no student took up the challenge.
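Since no one attempted the solver, here is a rough sketch of what a brute-force version might look like, doing inference by enumeration on the textbook rain/sprinkler/wet-grass network (a standard illustrative example, not one of the students’ models):

```python
from itertools import product

# The classic rain/sprinkler/wet-grass network. Each table gives
# P(variable = True) conditioned on its parents.
P_RAIN = 0.2
P_SPRINKLER = {True: 0.01, False: 0.4}            # keyed by rain
P_WET = {(True, True): 0.99, (True, False): 0.9,  # keyed by (sprinkler, rain)
         (False, True): 0.8, (False, False): 0.0}

def joint(rain, sprinkler, wet):
    """P(rain, sprinkler, wet), via the chain rule along the network."""
    p = P_RAIN if rain else 1 - P_RAIN
    p *= P_SPRINKLER[rain] if sprinkler else 1 - P_SPRINKLER[rain]
    p_wet = P_WET[(sprinkler, rain)]
    return p * (p_wet if wet else 1 - p_wet)

def p_rain_given_wet():
    """P(rain | wet) by summing the joint over the hidden variable."""
    num = sum(joint(True, s, True) for s in (True, False))
    den = sum(joint(r, s, True) for r, s in product((True, False), repeat=2))
    return num / den

print(round(p_rain_given_wet(), 3))  # → 0.358
```

Enumeration is exponential in the number of variables, but for a nine-factor student network it would have been perfectly adequate, and it mirrors the manual calculations the students already had to do.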
Which brings me to the final assignment on natural language processing, which is still ongoing. This assignment ended up being a mix of the third assignment (which had some light programming) and the fourth assignment (which used a program as a tool to answer other questions). It asks students to extract prerequisite information from the course catalog (which I had scraped beforehand) by sequencing a bunch of fairly basic text transforms, such as replacing text, breaking text apart, and selecting certain pieces (essentially, a poor man’s MapReduce). In terms of implementation, the hardest part was deciding on the interface between the front end and the back end. Since one of the complaints about the cognitive architecture assignment was that you couldn’t save your work, I made sure that the transforms created in the HTML interface translated into a script, which students could copy out, save, and reload when they resumed. On the server side, of course, I also had to parse that script and run the transforms; here I’m thankful that functions in Python are first-class.
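To give a flavor of how first-class functions make the server side simple, here is a stripped-down sketch of the general shape; the transform names, script format, and sample catalog lines below are illustrative stand-ins, not the assignment’s actual interface:

```python
# Each transform is a function from a list of strings to a list of strings,
# built by a constructor that closes over its arguments.
def replace(old, new):
    return lambda lines: [line.replace(old, new) for line in lines]

def split(sep):
    return lambda lines: [piece for line in lines for piece in line.split(sep)]

def select(substring):
    return lambda lines: [line for line in lines if substring in line]

TRANSFORMS = {"replace": replace, "split": split, "select": select}

# A saved "script" is just an ordered list of (transform name, arguments),
# the sort of thing an HTML front end could serialize, save, and reload.
SCRIPT = [
    ("select", ["Prerequisite"]),
    ("replace", ["Prerequisite:", ""]),
    ("split", [";"]),
]

def run_script(script, lines):
    # Because transforms are first-class values, running a parsed script
    # is just threading the lines through each function in order.
    for name, args in script:
        lines = TRANSFORMS[name](*args)(lines)
    return [line.strip() for line in lines]

catalog = [
    "COMP 229: Data Structures",
    "Prerequisite: COMP 149; MATH 210",
]
print(run_script(SCRIPT, catalog))  # → ['COMP 149', 'MATH 210']
```

The nice property of this shape is that the script is data: the front end can emit it, the student can save it as plain text, and the server only needs a small parser plus a dictionary of transform constructors.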
I would love to see this approach used in more classes, and would also love to learn how others have enabled non-programming students to do computational work.