Step 12: Prepare for a Topics in AI Course (Part 5)

This is part of a series on the Topics in Artificial Intelligence course I will be teaching in the fall. The first part was posted on 2015-06-23.

Going by the original plan, the Topics in AI course will have four parts. The topic for the last part, however, is still to be determined. Looking back on how I planned the previous three topics, there may not be room for a fourth at all, if I want to go deeper into the others. Still, it’s worth thinking about the space of possible topics. One trend I noticed is that the previous topics (reinforcement learning, cognitive architectures, Bayesian networks) are all driven by algorithms, not necessarily by problems; the techniques are generic and can be adapted to any particular task.

In contrast, spending time on natural language processing, for example, would require students to not just think about computer science, but also bring in knowledge from linguistics and other fields. I’m not sure how much “real” NLP I can do – my NLP background has always been more on the data-mining side, the difference being that I care less about really “understanding” language than about the ability to do cool things with it. NLP has a long history though, and it may also be an interesting time to bring up the Turing test (if students have not already heard of it).

Another possibility is a robotics topic. The “obvious” choice would be some kind of robotic control – for example, SLAM is how robots figure out what the world is like and where in the world they are. My expertise here is even thinner than with NLP, enough so that it counts against this topic. At the same time, it would be cool for students to explore how hard it is to really deal with the real world, and how many of the “easy” things people do are actually quite hard (Moravec’s paradox).

Finally, in keeping with current trends in AI, I can see a module on neural networks and deep belief nets. I will probably be learning as much as my students on this topic, but I’m worried about its mathematical nature. It’s also unclear to me what students would get out of it – there are many devils in the details of DBNs, and I don’t yet know what I would want students to take away.

These are all possibilities, and one more to throw into the mix is letting students pick a topic themselves and run with it (it is a more advanced course, after all). In the end, I suspect I’ll have to talk to students to figure out what they are interested in. I will be sure to post an update when the semester gets to that point.

Step 11: Prepare for a Topics in AI Course (Part 4)

This is part of a series on the Topics in Artificial Intelligence course I will be teaching in the fall. The first part was posted on 2015-06-23.

Last time I talked about teaching Bayesian networks, and after sitting on it for a bit, I decided that it would work better as the third topic of the course, after the topic for this post.

So, the second topic for Topics in AI is cognitive architectures. This is not a common topic for an AI course, but it actually forms much of the background for my research. The idea behind cognitive architecture research is to actually build a human-level intelligence. We are nowhere near there, of course, but a lot of the work is in trying to understand how different specialized AI algorithms work together and how information flows from one to the other. My own research is more in the latter – and I’m sure I will write a future post about it – but the point is, cognitive architecture is a topic that I enjoy.

Despite that, I’m actually unsure what students would get out of learning about cognitive architecture. I want to say that there is a connection to cognitive science that the students might enjoy, and that, since the field is more nebulous than reinforcement learning, they will also get a different perspective on AI. But both of these feel like rationalizations, and not the real reason I’m including it. The closest thing to a good justification is that I can introduce students to my research, and maybe find a couple I can work with over the year. But what do students get out of it?

If I have to argue for why students should study cognitive architecture, it would be to learn that integration is non-trivial. It is extremely difficult, if not impossible, to simply say “here’s a set of interfaces”, then plug and play specialized algorithms, because each algorithm has consequences that can affect whether the other algorithms behave optimally. One example is how the representation of knowledge changes how efficiently that knowledge can be used (see the toy sketch below). These tradeoffs are common in AI and in computer science in general, of course, but they are much more explicitly a concern in a cognitive architecture. Knowledge representation is also why I decided to switch the ordering of this topic with Bayesian networks: a cognitive architecture combines both action and knowledge, which makes the transition to Bayesian networks a little easier.
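To make the representation point concrete, here is a toy sketch in Python – a generic computer science illustration of my own, not an example from any actual architecture – where the same facts are stored two ways, and one way makes queries much cheaper:

```python
# Toy illustration (not from any real cognitive architecture):
# the same knowledge stored two ways, with very different query costs.

facts = [
    ("tweety", "is-a", "bird"),
    ("bird", "can", "fly"),
    ("penguin", "is-a", "bird"),
]

# Representation 1: a flat list of (subject, relation, object) triples.
# Every query must scan all facts: O(n) per lookup.
def query_list(subject, relation):
    return [o for (s, r, o) in facts if s == subject and r == relation]

# Representation 2: the same facts indexed by (subject, relation).
# Each query is now a single dictionary lookup: O(1) on average.
index = {}
for s, r, o in facts:
    index.setdefault((s, r), []).append(o)

def query_index(subject, relation):
    return index.get((subject, relation), [])

print(query_list("tweety", "is-a"))   # ['bird']
print(query_index("tweety", "is-a"))  # ['bird']
```

The knowledge is identical in both cases; only the representation changed, and with it the cost of using that knowledge. An architecture that commits to one representation is implicitly committing to which operations will be fast.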

Given the desired learning goal of thinking about integration and tradeoffs, I’m not yet sure what the assignments would look like. It’s hard to think of a task that requires a cognitive architecture, because the research is explicitly aimed at making computers do multiple things well – and harder still to think of such a task that students could actually complete. I suspect that the assignments will be less technical here; two weeks can be spent learning the basics of an architecture, followed by reading some recent developments in the field and articulating why those developments were made.

It’s clear that more work will be needed on this topic as well, much as I will need to spend time on Bayesian networks, but at least I’m more convinced that this should be in the syllabus.

Step 10: Prepare for a Topics in AI Course (Part 3)

This is part of a series on the Topics in Artificial Intelligence course I will be teaching in the fall. The first part was posted on 2015-06-23.

The second topic that I would like to teach is Bayesian networks. For those unfamiliar, Bayes nets are a way of representing probabilistic causality – that is, whether one event is likely to have caused another and, more importantly, if the second event occurred, how likely it is that the first event is the cause. The overused example is of diseases and symptoms: if you have a fever, you’re more likely to have caught the flu than to have (say) malaria, even though a fever is a symptom of both.
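To make the example concrete, here is a minimal sketch of that inference using Bayes’ rule directly (the priors and likelihoods are numbers I made up for illustration, not medical data):

```python
# Bayes' rule: P(disease | fever) = P(fever | disease) * P(disease) / P(fever)
# All probabilities below are invented for illustration.

p_flu = 0.10                  # prior: chance a random patient has the flu
p_malaria = 0.001             # prior: chance a random patient has malaria
p_fever_given_flu = 0.90      # fever is a common symptom of the flu
p_fever_given_malaria = 0.99  # fever is a near-certain symptom of malaria
p_fever_given_neither = 0.05  # fever from other causes

# Total probability of observing a fever
# (treating the diseases as mutually exclusive to keep the sketch simple).
p_neither = 1 - p_flu - p_malaria
p_fever = (p_fever_given_flu * p_flu
           + p_fever_given_malaria * p_malaria
           + p_fever_given_neither * p_neither)

p_flu_given_fever = p_fever_given_flu * p_flu / p_fever
p_malaria_given_fever = p_fever_given_malaria * p_malaria / p_fever

print(f"P(flu | fever)     = {p_flu_given_fever:.3f}")      # ~0.66
print(f"P(malaria | fever) = {p_malaria_given_fever:.4f}")  # ~0.007
```

Even though malaria makes a fever slightly more likely than the flu does, the flu’s much higher prior makes it by far the more probable explanation – which is exactly the kind of reasoning a Bayes net automates over many variables at once.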

To be honest, of the three topics I have settled on, I’m the least comfortable with including Bayes nets. It’s not that they are not interesting or useful – in fact, my research is likely to involve Bayes nets in the medium-term future – but I feel it’s hard to get students excited about them. For one, there is some math involved in how the inference works. It’s not difficult math, but it’s tricky math, and it can get tedious very quickly. But back to the issue of engagement: Bayes nets are ultimately a way of representing knowledge, not of doing something. Unlike reinforcement learning, the computer is not learning to do anything, or even learning anything at all. In many ways, Bayes nets are just a particular type of math, and I’m not sure I have the talent to make pure math sound interesting.

A fair question at this point would be why I thought Bayesian networks should be a topic in the course at all. One answer – and frankly the one that carries the most weight – is that I have taught the topic before, and so have some confidence I can do it well. Less selfishly, Bayesian causality is one of those ideas that had a big impact on AI. Its inventor, Judea Pearl, received the Turing Award (the “Nobel Prize” of computing) for it, and it remains an area of ongoing research. Which is to say that, even if it’s not something students might be overly interested in, it’s definitely something they should know about.

So, if I keep Bayes nets in the syllabus, I would probably spend just enough time on probability, then focus on creating, interpreting, and critiquing networks. Whereas reinforcement learning felt like a very non-human way of solving problems, Bayes nets should feel intuitive, only more rigorous. What I would like is for students to understand why Bayesian inference works the way it does – then apply it to something in their lives that they may have simply accepted before.

This is arguably stretching the boundaries of what should be taught in computer science – but then, where else would such a thing be taught in college? The only places it would fit, outside of computer science or statistics, are psychology or philosophy. And since we’re getting computers to do the dirty mathematical work for us, this seems as good a time as any to make students go through this exercise. If nothing else, at the end they will have applied the Bayesian framework to some real-world phenomenon.

If writing this post has made anything clear, though, it’s that I have more work to do on this topic. I will have to think harder about what I want to achieve.

Step 9: Prepare for a Topics in AI Course (Part 2)

This is part of a series on the Topics in Artificial Intelligence course I will be teaching in the fall. The first part was posted on 2015-06-23.

The first topic I plan on covering is reinforcement learning. For those unfamiliar, reinforcement learning asks how computers can figure out what the best thing to do is, where “best” is determined by how much “reward” the computer gets. Importantly, the computer knows nothing about whether it’s doing well or badly, other than whether it’s being rewarded. As the terminology might have tipped you off, the idea comes from behavioral psychology, where animals that get rewarded for doing something will end up doing it more often. Equivalently, animals that are punished for doing something will try to avoid it; in reinforcement learning, this is simply a negative reward. The connection to psychology and cognitive science will hopefully get students more interested in the subject, and in fact reinforcement learning is used in psychology research to justify arguments that human behavior is actually optimal.

Although I don’t use reinforcement learning in my research, I find the idea of reinforcement learning attractive. It’s one of those cases where the problem can be stated very easily, but solving it turns out to be extremely hard. This is partly because it can involve a lot of other concepts in AI, things like determining what is important, or figuring out what to do when the reward has been inconsistent. Of course, these problems exist in other topics as well, but what’s nice about reinforcement learning is that their effects can be transparently explained, even to beginning students. Perhaps surprisingly, the solutions to reinforcement learning – at least the simple, foundational ones – are also easy to explain and implement (see the sketch below). This means that students can get a lot more hands-on experience than might otherwise be possible, which is always good for the first topic in a course.
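As a sketch of what I mean by easy to implement, here is tabular Q-learning – one of those foundational algorithms – on a toy “corridor” problem of my own invention, where the agent starts in the middle and is rewarded for reaching the right end:

```python
# A minimal tabular Q-learning sketch on a 5-state corridor.
# The environment and hyperparameters are toy choices for illustration.
import random

N_STATES = 5        # states 0..4; both ends are terminal, state 4 gives reward
ACTIONS = [-1, +1]  # step left or step right
ALPHA, GAMMA, EPSILON = 0.5, 0.9, 0.1

Q = {(s, a): 0.0 for s in range(N_STATES) for a in ACTIONS}

for episode in range(500):
    state = 2  # start in the middle
    while state not in (0, 4):
        # Epsilon-greedy: mostly take the best-known action, sometimes explore.
        if random.random() < EPSILON:
            action = random.choice(ACTIONS)
        else:
            action = max(ACTIONS, key=lambda a: Q[(state, a)])
        next_state = state + action
        reward = 1.0 if next_state == 4 else 0.0
        # Q-learning update: nudge Q(s, a) toward the reward plus the
        # discounted value of the best action available afterward.
        best_next = 0.0 if next_state in (0, 4) else max(
            Q[(next_state, a)] for a in ACTIONS)
        Q[(state, action)] += ALPHA * (reward + GAMMA * best_next - Q[(state, action)])
        state = next_state

# The learned values now point toward the rewarding end of the corridor.
for s in range(1, 4):
    print(s, {a: round(Q[(s, a)], 2) for a in ACTIONS})
```

The whole algorithm is one update rule inside a loop, which is why students can have a working implementation early on and then spend their time experimenting with the problem itself.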

My hope is that through these simple reinforcement learning examples, students will get an understanding of how hard AI really is. Make a seemingly innocuous change to the problem the computer faces, and suddenly it takes a lot longer for the computer to learn to do well. If nothing else, this is what I want students to take away: that the real world has a lot of these innocuous differences from what we’re studying, and they should be able to point them out when they think about applying these techniques to real problems. Related to this idea of adapting AI to the real world is how to correctly represent things; for reinforcement learning, that means not only representing the world and the actions that the computer can take, but also deciding how much reward should be given. For example, if we represent chess as a reinforcement learning problem (which has been done, and remains unsolved), how much would winning and losing the game be worth? If winning is worth 100 points and losing only -1 point, the computer is going to be a lot more “aggressive” than if winning was worth 1 point. Of course, if you think that winning is always the exact opposite of losing, then we’re actually creating fake rewards for the purpose of getting a specific risk-taking behavior out of the computer…
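A back-of-the-envelope calculation shows this effect (the win/loss probabilities here are numbers I invented for the illustration):

```python
# Invented example: how the reward scheme changes which strategy looks best.
# An "aggressive" strategy wins more often but also loses more often;
# a "safe" strategy mostly draws. Draws are worth 0.
def expected_reward(p_win, p_loss, win_reward, loss_reward):
    return p_win * win_reward + p_loss * loss_reward

aggressive = (0.50, 0.40)  # (P(win), P(loss))
safe = (0.20, 0.05)

for win_r, loss_r in [(100, -1), (1, -1)]:
    ev_agg = expected_reward(*aggressive, win_r, loss_r)
    ev_safe = expected_reward(*safe, win_r, loss_r)
    print(f"win={win_r:>3}, loss={loss_r}: "
          f"aggressive={ev_agg:.2f}, safe={ev_safe:.2f}")

# win=100: aggressive (49.60) dominates safe (19.95)
# win=  1: safe (0.15) now beats aggressive (0.10)
```

Nothing about the game changed between the two rows; only the number attached to winning did, and with it the behavior we would expect a reward-maximizing learner to converge on.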

I can imagine spending a whole course on reinforcement learning – I will need to brush up on my own knowledge if I do – but equally I can skip some of these other issues. This flexibility allows me to adjust to my students’ abilities, as long as I correctly estimate the lower end of what they can do. Hopefully, every student will be able to understand the basic idea and code up the basic algorithms for reinforcement learning, which will be the first assignment. To be honest, this is sufficient for students to see many of the effects I just described, so I’m not sure what the second assignment should contain; maybe something about applying reinforcement learning to a simple real-world problem? Anyway, like I said in the overview post, the last assignment/project will be for students to explore some of the advanced topics, while in class I could cover more advanced algorithms. Whether a fourth assignment is beneficial would depend on how far the students get.

I think reinforcement learning is a good first topic to cover, even if I don’t have too much experience teaching the subject. I can only hope that my ability to teach it will live up to my own expectations.
