It’s the start of the fall semester and time to map out a course schedule, but what will work best for you? You love math, so maybe you should stock up on quantitative courses; on the other hand, it might be a good idea to get some course requirements out of the way. You’re a night owl, but maybe you should take those morning classes so that you can play badminton later in the day. While the new course CS/STAT 184 Reinforcement Learning may not solve these scheduling dilemmas, the course will teach undergrads how to think through these types of problems. Through the study of reinforcement learning, a sub-field of machine learning, students will learn how to state a problem with different actions and rewards, implement algorithms to solve these problems, and identify why the algorithms work.
To get a preview of CS/STAT 184 Reinforcement Learning, we spoke with Dr. Lucas Janson, Assistant Professor of Statistics and Affiliate in Computer Science and Dr. Sham Kakade, Professor of Computer Science and Statistics, about their inspiration for the course and their goals for students. Janson arrived at Harvard Statistics five years ago after his PhD and has focused mainly on research in high dimensional inference, uncertainty quantification, and reinforcement learning. Joining Harvard in fall 2021, Kakade is the Co-Director of the Kempner Institute for the Study of Artificial Intelligence and is broadly interested in the areas of machine learning and AI. More specifically, Kakade is curious about how to improve methods “to push the frontier for computers to complete the kinds of tasks that humans excel at.” Kakade’s and Janson’s responses below are edited excerpts from a conversation with them on these topics.
1) What led you to pair up to design a new course on RL for this year?
Janson: To put this course in context, for the past several years I have been trying to incorporate more machine learning courses into the undergrad curriculum in the Statistics Department. Machine learning is often categorized into three sub-categories: unsupervised learning [uses unlabeled data], supervised learning [uses labeled data], and reinforcement learning (RL). In both unsupervised and supervised learning, models are used to provide you with information, but you must act based on the information. In contrast, in RL, the model incorporates acting into the feedback loop, so it’s the only branch of machine learning that affects the real world directly. A few years ago, I created STAT 195 Statistical Machine Learning [now Introduction to Supervised Learning], and my colleague Dr. Alex Young simultaneously started STAT 185 Introduction to Unsupervised Learning, so it seemed to be a natural progression to introduce a reinforcement learning course.
The idea for this joint venture came about during one of Sham’s visits to Harvard (before he joined the Department). We were walking around Harvard Square and realized that we both intended to independently teach a course like this one.
Kakade: After chatting with Lucas, I thought we should clearly do this together both because the collaboration would help us to reduce the prep work and to source the best material – and it’s simply more fun! My own path to this course started after I taught a graduate class on the theoretical foundations of RL at Cornell Tech in New York, and I thought it would be a good idea to bring an undergraduate version of the course to Harvard [Professor Susan Murphy teaches a graduate RL course at Harvard].
2) Why do you think this is an important course for students to take?
Kakade: In this course, we want students to understand how algorithms can solve problems, both by studying the underlying mathematical formulas and by implementing the algorithms and analyzing the results. Students will see how the algorithms work in both abstract and hands-on ways, which is a pretty good model of learning, not just for this class, but also for more advanced machine learning courses.
As Lucas mentioned previously, there are a couple of different sub-areas of machine learning – unsupervised, supervised, and RL. As we move into RL and more realistic applications, deployment becomes interactive in nature; our computers don’t just sit there, but instead, they interact with the world to learn. For any effective deployment, there is some aspect of interaction, and this course provides students with the foundation to think carefully about this process at multiple levels. In reinforcement learning, there is often no teacher in the system telling you what to do, which to a large extent reflects the dilemmas we face in the real world when we start deploying technological and scientific systems. Because an increasing number of deployed systems utilize these ideas and impact our world, students recognize the value of this style of thinking and want to learn more about RL.
3) Provide a brief overview of the course's content, structure, and goals.
Kakade: Broadly speaking, we’re approaching this course topic and its structure by focusing on the different levels of complexity in how a system might interact with the world and the algorithms used to solve these problems. To begin with, we will explore simple problems with short-term decision making, then move to long-term decision making, and, finally, advance to long-term decision making in complex environments. There’s a natural progression of problem and solution complexity that ramps up throughout the course. By the end of the course, students will be able to implement algorithms and to analyze why they provide a correct solution.
I would just add that one advantage of starting with the simpler problems is that we can usually draw more conclusions about them. As problems advance in complexity, you may be able to determine the algorithms that solve them, but it becomes harder to draw rigorous conclusions about why these algorithms work. While the cutting edge of the field lies in trying to better comprehend the complex algorithms that solve problems in complex environments, students can start by demonstrating interesting results on these simpler problems. This lays the groundwork for thinking about more challenging problems. When creating this course, another goal of ours was to make it accessible to more students by requiring minimal prerequisites. Exposure to probability, programming, and linear algebra is about all you need for a course focused on teaching the first principles of reinforcement learning.
4) Which part of the course do you think students will look forward to the most?
Janson: In terms of the big picture, I think this course will continue to generate interest among students because there is a lot of pioneering work with reinforcement learning in the real world now. Because RL problems act in the real world, even their simple versions can look intimidating to solve. However, even before approaching the math portion of RL, you can teach students to state the problem very simply and intuitively. In real life, we encounter scenarios all the time in which we can take different actions with different rewards and try to choose the option with the maximum reward. For instance, when deciding on the fastest route to work, you might select a route that has been fast in your experience, but maybe there’s a better route that you haven’t tried out. When determining the different rewards, you need to strike a balance between selecting an option that you already know and a potentially better option that you need to spend time learning about. After stating the problem, then students can delve into the math and implementation of the algorithms. Students will be challenged by this course but will also quickly understand how to approach RL problems and how to determine why certain algorithms work, which will be satisfying and exciting for them.
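The route-to-work dilemma Janson describes is the classic exploration–exploitation tradeoff, often introduced via the multi-armed bandit problem. Below is a minimal, hypothetical sketch of one standard approach, an epsilon-greedy strategy; the route names and travel times are invented for illustration and are not drawn from the course materials:

```python
import random

# Hypothetical true average travel times in minutes; in reality these are
# unknown and can only be estimated from noisy observations, one trip at a time.
TRUE_TIMES = {"highway": 22.0, "back roads": 25.0, "bridge": 20.0}

def observe(route):
    """Simulate one noisy commute on the chosen route."""
    return random.gauss(TRUE_TIMES[route], 3.0)

def epsilon_greedy(n_trips=1000, epsilon=0.1, seed=0):
    """Balance trying unfamiliar routes (explore) against taking the
    best-looking route so far (exploit)."""
    random.seed(seed)
    counts = {r: 0 for r in TRUE_TIMES}   # trips taken per route
    means = {r: 0.0 for r in TRUE_TIMES}  # running average travel time
    for _ in range(n_trips):
        if random.random() < epsilon:
            route = random.choice(list(TRUE_TIMES))  # explore a random route
        else:
            route = min(means, key=means.get)        # exploit the fastest-looking route
        t = observe(route)
        counts[route] += 1
        # incremental update of the running average
        means[route] += (t - means[route]) / counts[route]
    return means, counts

means, counts = epsilon_greedy()
best = min(means, key=means.get)
```

With enough trips, the estimated averages concentrate near the true travel times and the agent mostly commutes on the genuinely fastest route, while the occasional random trip keeps it from locking in prematurely on a route that merely looked fast early on.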
Based on our conversation with Prof. Janson and Prof. Kakade, it sounds like students will have an enriching experience this fall in CS/STAT 184 Reinforcement Learning. Thank you to Prof. Janson and Prof. Kakade for sharing your insights into what your course is all about!