*This is part of a series on the Topics in Artificial Intelligence course I will be teaching in the fall. The first part was posted on 2015-06-23.*

The first topic I plan on covering is reinforcement learning. For those unfamiliar, reinforcement learning asks how computers can figure out what the best thing to do is, where “best” is determined by how much “reward” the computer gets. Importantly, the computer knows *nothing* about whether it’s doing well or doing badly, other than whether it’s being rewarded. As the terminology might have tipped you off, the idea comes from behavioral psychology, where animals that get rewarded for doing something will end up doing it more often. Likewise, animals that are punished for doing something will try to avoid it; in reinforcement learning, punishment is simply a negative reward. The connection to psychology and cognitive science will hopefully get students more interested in the subject, and in fact reinforcement learning is used in psychology research to justify arguments that human behavior is actually optimal.
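To make "learning from reward alone" concrete, here is a minimal sketch of my own, not something taken from the course materials. It assumes the simplest possible setting, a two-action "bandit" where each action pays a reward of 1 with some hidden probability, and it uses an epsilon-greedy rule, which is one common choice among several. The function name `run_bandit` and all of its parameters are hypothetical.

```python
import random


def run_bandit(reward_probs, steps=5000, epsilon=0.1, seed=0):
    """Epsilon-greedy learner that only ever observes its own rewards.

    reward_probs: hidden chance that each action pays a reward of 1
                  (the learner never reads this list directly).
    Returns (estimates, counts): the learned value of each action and
    how often each action was tried.
    """
    rng = random.Random(seed)
    n_actions = len(reward_probs)
    estimates = [0.0] * n_actions  # running average reward per action
    counts = [0] * n_actions       # number of times each action was taken

    for _ in range(steps):
        if rng.random() < epsilon:
            # Explore: occasionally try a random action.
            action = rng.randrange(n_actions)
        else:
            # Exploit: take the action that currently looks best.
            action = max(range(n_actions), key=lambda a: estimates[a])

        # The environment hands back a reward and nothing else.
        reward = 1.0 if rng.random() < reward_probs[action] else 0.0

        # Incremental update of the running average for that action.
        counts[action] += 1
        estimates[action] += (reward - estimates[action]) / counts[action]

    return estimates, counts


if __name__ == "__main__":
    estimates, counts = run_bandit([0.2, 0.8])
    print("estimated values:", estimates)
    print("times chosen:", counts)
```

The learner is never told which action is better; it ends up favoring the second action only because that action has been rewarded more often, which is the same story as the animal in the psychology experiment.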

Although I don’t use reinforcement learning in my research, I find the idea attractive. It’s one of those cases where the problem can be stated very easily, but solving the problem turns out to be extremely hard. This is partly because it can involve a lot of other concepts in AI, things like determining what is important, or figuring out what to do when the reward has been inconsistent. Of course, these problems exist in other topics as well, but what’s nice about reinforcement learning is that their effects can be transparently explained, even to beginning students. Perhaps surprisingly, the solutions to reinforcement learning – at least the simple, foundational ones – are also easy to explain and implement. This means that students can get a lot more hands-on experience than might be otherwise possible, which is always good for the first topic in a course.

My hope for the students is that through these simple reinforcement learning examples, they will get an understanding of how *hard* AI really is. Make a few seemingly innocuous changes to the problem that the computer faces, and suddenly it takes a lot longer for the computer to learn to do well. If nothing else, this is what I want students to take away: that the real world has a lot of these innocuous differences from what we’re studying, and they should be able to point them out when they think about applying these algorithms to real problems. Related to this idea of adapting AI to the real world is how to correctly represent things; for reinforcement learning, that means not only representing the world and the actions that the computer can take, but also deciding how much reward should be given. For example, if we represent chess as a reinforcement learning problem (which has been done, though the problem remains unsolved), how much would winning and losing the game be worth? If winning is worth 100 points and losing only -1 point, the computer is going to be a lot more “aggressive” than if winning was worth 1 point. Of course, if you think that winning is always the exact opposite of losing, then we’re actually creating *fake* rewards for the purpose of getting a specific risk-taking behavior out of the computer…
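The arithmetic behind that “aggressive” behavior can be sketched in a few lines. This is only an illustration under assumptions of my own: the computer simply maximizes expected reward, it is choosing between a single risky move that either wins or loses and a safe alternative worth 0 (think of a draw), and the helper names below are hypothetical.

```python
def expected_reward(p_win, win_reward, loss_reward):
    """Expected reward of a gamble that wins with probability p_win."""
    return p_win * win_reward + (1 - p_win) * loss_reward


def break_even_win_probability(win_reward, loss_reward, safe_reward=0.0):
    """Win probability at which the gamble is worth exactly the safe option.

    Solves p * win + (1 - p) * loss = safe for p.
    """
    return (safe_reward - loss_reward) / (win_reward - loss_reward)


if __name__ == "__main__":
    for win, loss in [(100, -1), (1, -1)]:
        p = break_even_win_probability(win, loss)
        print(f"win={win}, loss={loss}: gamble pays off when p_win > {p:.4f}")
```

Under these assumptions, the 100/-1 scheme makes the gamble worthwhile whenever the chance of winning is above 1 in 101, while the 1/-1 scheme requires better than even odds. The rewards alone, not the rules of the game, are what produce the difference in risk-taking.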

I can imagine spending a whole course on reinforcement learning – I will need to brush up on my own knowledge if I do – but equally I can skip some of these other issues. This flexibility allows me to adjust to my students’ abilities, as long as I correctly estimate the lower end of what they can do. Hopefully, every student will be able to understand the basic idea and code up the basic algorithms for reinforcement learning, which will be the first assignment. To be honest, this is sufficient for students to see many of the effects I just described, so I’m not sure what the second assignment should contain; maybe something about applying reinforcement learning to a simple real-world problem? Anyway, like I said in the overview post, the last assignment/project will be for students to explore some of the advanced topics, while in class I could cover more advanced algorithms. Whether a fourth assignment is beneficial would depend on how far the students get.

I think reinforcement learning is a good first topic to cover, even if I don’t have too much experience teaching the subject. I can only hope that my ability to teach it will live up to my own expectations.
