Step 88: Start CS1 on the Right Foot

July 5, 2018 Justin Li

The first lecture sets the tone for the semester. This is especially important for CS1, where students come in with all kinds of expectations and prejudices. For me, the goal of the first lecture is to assure all students that they belong in the course, regardless of their background and goals, and to situate computer science in the broader societal context. Instead of going over the syllabus, I prefer to spend the hour setting up the relationship I expect students to have with CS.

I want to share an annotated walkthrough of the first lecture of my CS1 course, from the Fall 2017 semester. (The complete PDF slides are on GitHub.) As at other liberal arts colleges, only a minority of students in CS1 are majors. To engage the students whose interests lie elsewhere, I focused the lecture on the “cool” applications of CS.

slide-03

After a brief introduction of myself, I direct the class into a short discussion: what is computer science? Aside from clearing up misconceptions, such as equating computer science with programming, this (think-)pair-share activity also gets students talking to each other. Many of my lectures include discussions, and building a community early encourages students to help each other in the labs, projects, and beyond.

slide-05

With a rough definition of computer science, the bulk of the lecture describes three reasons why students should study CS. I prefer to address the most banal reason first: that computer scientists often have high-paying jobs. Since this is likely one of the first answers anyway, I prefer to set it aside so it won’t dominate the remaining conversation.

slide-07

With money out of the way, I present the second reason to study CS: because it touches a broad swath of society. Recalling the goal of making everyone feel like they belong, I deliberately list colleagues whose work includes or uses computational methods.

Amanda Zellmer in Biology models the effects of climate change on local species, while Janet Scheel in Physics simulates turbulent convection flows in high-temperature fluids.
Caroline Heldman in Politics used machine learning to detect gender bias in film.
Alison de Fren in Media Arts and Culture explores the portrayal of technology in popular media.

[I just noticed that all the colleagues I named are women.]

These faculty were deliberately chosen to reflect not just the physical sciences, but also the social sciences and the humanities. The intention is for students to connect computer science with whatever other interests they may have. These slides also serve as a preview for assignments on big data social science and on digital arts and media.

slide-17

I make a guessing game out of these reasons for studying CS, and so far only one student has guessed the final one: because it’s fun. This section contains an assortment of my own experiences with CS, with a focus on hobby projects that have helped me in my personal life.

slide-21

The centerpiece of this section is the (true!) story of how I used constrained optimization to assign 19 members of the Outdoor Club into 3 vans for spring break back in 2009. I did this by considering the social network formed by relationships and friendships, and how different assignments would split people apart.

slide-31

This example has a deeper point beyond how CS can be used to solve real-world problems. Once students understand how I solved the seating problem, I then ask them: Why do relationships have a -4 penalty, but friendships only have a -2 penalty? The answer is that these numbers are arbitrary, but different numbers will have a direct impact on how enjoyable the 24-hour car ride is.
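To make the role of those penalties concrete, here is a minimal sketch of the seating problem as a penalty-scored search. It is not the method actually used for the trip: the four names, the ties between them, the two-van setup, and the brute-force search are all assumptions for illustration. Only the -4 and -2 penalty values come from the lecture.

```python
from itertools import product

# Hypothetical data: these names and ties are made up for illustration.
PEOPLE = ["Ana", "Ben", "Cai", "Dee"]
RELATIONSHIPS = [("Ana", "Ben")]
FRIENDSHIPS = [("Ana", "Cai"), ("Ben", "Dee"), ("Cai", "Dee")]


def score(assignment, relationship_penalty=-4, friendship_penalty=-2):
    """Sum the penalties for every pair that ends up in different vans."""
    total = 0
    for a, b in RELATIONSHIPS:
        if assignment[a] != assignment[b]:
            total += relationship_penalty
    for a, b in FRIENDSHIPS:
        if assignment[a] != assignment[b]:
            total += friendship_penalty
    return total


def best_assignment(num_vans=2, capacity=2, **penalties):
    """Try every assignment that respects van capacity; keep the best score."""
    best, best_score = None, None
    for vans in product(range(num_vans), repeat=len(PEOPLE)):
        if any(vans.count(v) > capacity for v in range(num_vans)):
            continue  # constraint: no van may be over capacity
        assignment = dict(zip(PEOPLE, vans))
        current = score(assignment, **penalties)
        if best_score is None or current > best_score:
            best, best_score = assignment, current
    return best, best_score


# With the lecture's penalties, the couple shares a van.
print(best_assignment())
# With a weaker relationship penalty, the best seating splits them up.
print(best_assignment(relationship_penalty=-1))
```

Changing one number changes who sits together, which is the point of the question posed to students. Exhaustive search like this is only practical for a toy example; 19 people across 3 vans would need a smarter search strategy.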

slide-30

This discussion transitions to my last section. I present some summary statistics of the class, and every time I have had rough gender parity, with majors from across the college.

slide-33

Pointing out the relative diversity of the class, I then list recent articles on how technology has negatively impacted society. This is how I end the lecture: with a somber note that CS needs their perspective as much as they might need CS for their career.

slide-34

If there’s a takeaway for this post, it’s that the course policies can wait – the first lecture is a singular opportunity to acculturate students to a broader view of CS. I want students to feel that CS can advance their interests and that they are welcome in CS no matter their background. This lecture is designed to get students excited and motivated about computer science – and if I may say so, it has worked pretty well so far.

Step 87: Replan a CS Writing Course

May 6, 2018 Justin Li

This semester I taught Computer Science Junior Seminar, or as I like to think of it, Communicating Computational Concepts. The main reason this course exists is so majors can fulfill Oxy’s disciplinary writing requirement, but since the department gets to choose the topics, I decided to expose students to issues at the intersection of tech and society.

I might talk about the class syllabus in a future post, but here I want to focus on the approach. My overall goal was just to get students writing about computer science. I structured the course around several major papers. In one, students had to explain how technology is impacting society; a different paper required students to explain an algorithm in technical detail; and a third paper asked students to create an interactive document. For the lectures, I spent a week or two talking about writing, a couple weeks analyzing CS research papers and blog posts, then most of the second half of the course on a selection of tech and society topics.

As the first iteration of this course, this plan worked fine. Some discussions were more interesting than others, and I had to shift some lectures around, but I got the students to do a bunch of writing.

It turned out that this wasn’t what I wanted from the course at all.

I first discovered this when I was grading the technical expository essay. I had asked students to write something approximating “a blog post by someone in the tech industry”. Some students managed the tonal shift, but many other papers kept the tone of an academic essay, using stilted and constrained sentences to fill out a formulaic structure. There was no personality in those paragraphs, no voice, no joy.

I should say here that I have a biased relationship with writing. I have kept a personal journal since I was a high school freshman in 2002, for more than half my life, and I have maintained various blogs between then and now. Through this habit, outlining and revising became the way I work through complicated ideas and feelings. This very blog post started because I wanted to reflect on the semester, and I threw out a 500-word draft that lost track of what I wanted to say.

What I saw in my students’ essays is that they don’t have this relationship with writing. I had the revelation that to students, writing was this thing they do for class. (I suspect this is true of reading as well.) You only write because you want to get a good grade, and “good writing” is professional and academic and full of therefore’s and regardless’es. One does not use I or you, one does not include jokes, and one most definitely does not use any contractions.

I had a hint of this earlier in the semester. We spent a lecture discussing audience and tone and, as an exercise, I asked students to come up with alternate introductions to a blog post on the Meltdown vulnerability. The students did adopt a more conversational tone for this activity, but one voice/style was curiously absent: that of a first person, “I just learned about this attack and I thought it’s really clever.” I have heard horror stories of students using personal anecdotes as supporting evidence, but it seems that for technical explanations, students can’t write about themselves.

For me, the strange thing is that so many of the tech blogs and articles I read are in the first person or from immediate experience. This blog post on Minesweeper, a reading for the interactive document section of the course, is written as an exploration. Almost all of Eevee’s blog is in first person and about their opinions and ideas, and they can get pretty technical. Many Hacker News links are simply software engineers writing about whatever problem they solved at work or in a hobby project. I will go so far as to argue that for most students, most of the non-documentation technical writing they will encounter is casual blog posts.

That is what I really want students to get out of this course: for them to be part of the tech “culture” of sharing knowledge and struggles and achievements. I imagine students keeping up with technological developments and keeping a weekly blog. I want students to find voices they enjoy and respond to them. I want students to describe and reflect on their projects, and have others learn from that experience.

The course should have focused on writing as the means to becoming an active member of the computer science community. Instead, by focusing too much on the writing requirement, I created a course where the writing assignments were just writing assignments, perpetuating the idea that writing is only used for college essays.

I welcome ideas for how a future iteration of the course might achieve this goal.

Step 86: Write up the Ethical Engine Lab

January 13, 2018 (updated September 16, 2018) Justin Li

I taught Intro to CS for the third time last semester, and there are still individual assignments that I tweak or replace, both to explore what is possible and to keep the material interesting for myself. Having discovered Evan Peck’s Ethical Engine assignment, I ran it as a three-hour lab during one of the last weeks of the semester. I thought it might be useful to others to see the process I went through adapting the assignment. All code is available on GitHub.

Background

The basic premise for this assignment is that, as self-driving cars become more common, the AI will have to start making life-or-death decisions, essentially turning the philosophical trolley problem “thought”-experiment into a highly-relevant real-world dilemma. Evan’s original assignment was based off of the Moral Machine webapp, and had two parts. First, students would write the decision procedure which would determine whether a self-driving car should save its passengers or the pedestrians. Once they are done, students would write code to “audit” a blackbox decision procedure, to see whether it treated people of different genders, ages, social statuses, etc. equally. The idea is to provide a concrete example of the real life consequences of code, and that code audits may expose algorithmic bias.
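The two parts of the original assignment can be sketched roughly as below. This is a minimal illustration under stated assumptions, not the Ethical Engine code: the `Person` and `Scenario` classes, the attribute values, the example `decide` rule, and the `audit` function are all hypothetical.

```python
import random
from collections import defaultdict
from dataclasses import dataclass

# Hypothetical data model for illustration; not the classes from the
# original Ethical Engine code.
AGES = ("child", "adult", "elderly")
GENDERS = ("female", "male")


@dataclass(frozen=True)
class Person:
    age: str
    gender: str


@dataclass(frozen=True)
class Scenario:
    passengers: tuple
    pedestrians: tuple


def decide(scenario):
    """Example decision procedure: save the larger group (passengers on ties)."""
    if len(scenario.pedestrians) > len(scenario.passengers):
        return "pedestrians"
    return "passengers"


def random_scenario(rng):
    """Generate a scenario with one to four random people on each side."""
    def group():
        size = rng.randint(1, 4)
        return tuple(Person(rng.choice(AGES), rng.choice(GENDERS))
                     for _ in range(size))
    return Scenario(passengers=group(), pedestrians=group())


def audit(decide_fn, scenarios, attribute):
    """Treat decide_fn as a black box; return survival rate per attribute value."""
    saved = defaultdict(int)
    seen = defaultdict(int)
    for scenario in scenarios:
        choice = decide_fn(scenario)
        sides = (("passengers", scenario.passengers),
                 ("pedestrians", scenario.pedestrians))
        for side, people in sides:
            for person in people:
                value = getattr(person, attribute)
                seen[value] += 1
                if side == choice:
                    saved[value] += 1
    return {value: saved[value] / seen[value] for value in seen}


rng = random.Random(0)
scenarios = [random_scenario(rng) for _ in range(1000)]
print(audit(decide, scenarios, "age"))
```

The audit never looks inside `decide_fn`; it only compares outcomes across groups, which is what makes it usable on a black-box procedure.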

Planning

Thanks to Evan putting his work on GitHub, I could have used the assignment as is. Evan’s write-up did not describe how he used the assignment, however, and I wasn’t sure it would be appropriate for a three-hour pair-programmed lab. Mulling over his materials, I realized that my goals were slightly different. One key point that was missing from the assignment is that our values may not translate perfectly into code – that is, even if our intended algorithm was entirely “ethical”, our buggy implementation of that algorithm may not be. And this is without getting into how people may not agree on what is “ethical” in the first place, and that good intentions may not be sufficient.

To push students on these points, I partially deemphasized the role of students as code auditors, instead giving students more time to examine their own decision procedures. The lab broke down into three parts:

First, in pairs, students had to manually work through sixty randomly-generated trolley problems. Making the decisions as a group makes a discussion of the underlying ethical principles more likely, but the nature of the trolley problem also necessitated a content warning. Once students have gone through all sixty scenarios, they are presented with the statistics of their decisions. Both the scenario generator and the statistics calculation were taken from Evan’s code.
The next part of the lab is for students to translate their own decision procedure into code. This is the same as the first part of Evan’s lab. Building on the new introduction, however, I then asked students to use the same sixty problems from the first part of the lab to compare their algorithmic decisions against their manual ones (their decisions were logged to a file). The provided code not only allowed students to compare the statistical summaries, but also specific cases, so students can figure out why their code behaved differently than they did.
Finally, I asked students to reflect on the lab with the following questions:
1. Explain the reasoning behind your decision function.
2. How accurately did your automatic model match up with your manual decisions? Modify and run find_difference.py to help you identify specific scenarios where your decisions differed.
3. Looking at the statistics over 100,000 scenarios, are there priorities that do not match what you had intended? How do you think they came about?
4. Compare your statistics from the previous question to those from another group. Which decision process would you rather a self-driving car have? Why?
5. Based on this exercise, what are some challenges to building (and programming) ethical self-driving cars?
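The manual-versus-automatic comparison in the second part of the lab might look something like the sketch below. It is an illustration only and does not reproduce `find_difference.py`; the function name and the log format (a mapping from scenario id to the side that was saved) are assumptions.

```python
def find_differences(manual, automatic):
    """Compare two decision logs keyed by scenario id.

    Returns the ids where the decisions disagree and the agreement rate
    over the scenarios present in both logs.
    """
    shared = sorted(manual.keys() & automatic.keys())
    differing = [sid for sid in shared if manual[sid] != automatic[sid]]
    agreement = 1 - len(differing) / len(shared) if shared else 0.0
    return differing, agreement


# Hypothetical logs: scenario id -> which side was saved.
manual_log = {1: "passengers", 2: "pedestrians", 3: "passengers", 4: "pedestrians"}
automatic_log = {1: "passengers", 2: "passengers", 3: "passengers", 4: "pedestrians"}

differing, agreement = find_differences(manual_log, automatic_log)
print(differing, agreement)
```

Returning the specific scenario ids, rather than only the agreement rate, is what lets a pair go back and ask why their code behaved differently than they did.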

Reflection

Reading the evaluations on the lab, the students generally found the assignment meaningful. Some students noted that the subject matter was heavy and somber, but still found it thought provoking. For example, one student wrote “While it was kind of dark deciding who lived and who died, I thought this lab was fun and interesting.” The point of the groupwork was also noted by a different student: “It was interesting to see the results […] and comparing them to the results of other groups to see what their thought process was.” I should note that the lab was also situated in a week of lectures around AI and society, including data privacy, automation and job loss, and tech legislation. These additional lectures provided more room for discussion, and the week served as a big-picture conclusion for the semester.

The most positive unexpected outcome of the lab, however, was one student who wanted to push the discussion of ethics much further. They wrote:

Ok this lab was interesting but the amount of really fucked up shit I heard was extraordinary. I think after this lab we should have some reading on body positive movements that originated out of feminism and how size is not always tied to health. In addition, I think we should have reading or discussion on how Black and Brown communities are more often criminalized. Also, I think it would be cool if we have some discussion on saving “disenfranchised” individuals – like woman – and how that would be determined. For instance why would a woman as a disenfranchised individual be prioritized but not a person who is overweight. Even if we don’t have these discussions, maybe we could have them written in our questions.

This student was referring to the attributes that were used in the scenario generator. I did not notice the conversations that the student overheard, but looking at the submitted code confirmed that some groups prioritized passengers/pedestrians who were “athletic” over those “overweight”, or the children over the elderly. I reached out to this student by email, and they also pointed out how “homeless” and “criminal” should not be considered professions. The student continued,

The use of body type in the code, as well – it threw me off. I felt very uncomfortable overhearing my peers casually debating whether someone was more valuable based on their size. […] In some ways, I thought having that be part of the description really allowed me to be critical of all the categories and allowed me to think about why that description might be included.

In retrospect, this is a lesson I should have learned already. At SIGCSE 2017, I led a Birds of a Feather (BoF) on Weaving Diversity and Inclusion into CS Content. One of the points raised in the discussion was that content on social inequality must be approached sensitively, as some students may be going through those exact problems in their personal lives. This was why I included a content warning for the lab, but this student’s comments made me realize I must be more aware of even the details in the future.

Talking with this student also made me realize that my existing discussion questions were inadequate. None of them directly probed at what should be considered ethical, which is extremely ironic given the content. I was galvanized to rewrite the questions entirely (I am in the middle of refactoring the code, so some functions may not exist on GitHub yet):

Consider your manual decision making process.
- Are there attributes of the passengers and pedestrians that had a higher (or lower) survival rate, but that were not a conscious factor in your decision process?
- What attributes went into your decisions? Does each attribute positively or negatively affect that person’s survival? Why did you consider those attributes?
- Is the use of those attributes to make those decisions “fair” or “ethical” or “moral”? Why/Why not?
Change the last lines of automatic.py to call compare_manual. Running automatic.py now will run your function on the same 60 scenarios you manually worked through, and show you the scenarios where your automatic and manual decisions differed, as well as the statistics for your automatic decisions. Answer the following questions:
- How accurately did your automatic model match up with your manual decisions?
- For each scenario where your manual and automatic decisions disagreed, explain why. What were you considering when you made the decision manually? What did the automatic decision not take into account, or what did it take into account that it shouldn’t?
Change the last lines of automatic.py to run on 100,000 random scenarios. Include these statistics in your submission. Compare the simulated rates of being saved within each attribute (age, gender, etc.). Assign each attribute to one of these categories:
- Category 1: Deliberately used in decision, and the survival rates reflect what you intended
- Category 2: Deliberately used in decision, but the survival rates do not reflect what you intended
- Category 3: Not explicitly used in decision, but the survival rates are not equal between groups
- Category 4: Not explicitly used in decision, and the survival rates are equal between groups
For every attribute in Category 2, explain why the statistics do not reflect your (explicit) intended decision process. For every attribute in Category 3, explain why the survival rates are not equal despite not being used in your decision process.
Working with another group, explain to each other how you made your automatic decisions and compare your statistics from the previous question.
- How do the decision processes differ? What was the reasoning behind each difference?
- Did that lead to different statistics, in either ranking or in survival rate?
- Which decision process would you prefer to be in a self-driving car? Why?
Based on this exercise, what are some challenges to building (and programming) ethical self-driving cars?
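The four categories in the questions above amount to a small decision table, sketched below. The function name, its parameters, and especially the `tolerance` threshold are assumptions: with randomly generated scenarios the simulated rates will rarely be exactly equal, so some cutoff for "equal" has to be chosen, and whether rates "reflect what you intended" remains a human judgment passed in as a flag.

```python
def categorize(used_in_decision, rates, matches_intent=None, tolerance=0.02):
    """Assign an attribute to Category 1-4 from the discussion questions.

    used_in_decision: whether the attribute was deliberately used.
    rates: mapping from attribute value to simulated survival rate.
    matches_intent: human judgment; required when used_in_decision is True.
    tolerance: assumed cutoff below which rates count as "equal".
    """
    if used_in_decision:
        if matches_intent is None:
            raise ValueError("matches_intent is required for used attributes")
        return 1 if matches_intent else 2
    spread = max(rates.values()) - min(rates.values())
    return 4 if spread <= tolerance else 3


# Hypothetical survival rates for two attributes that were not used explicitly.
print(categorize(False, {"female": 0.51, "male": 0.50}))
print(categorize(False, {"child": 0.62, "elderly": 0.41}))
```

Category 3 is the interesting case: the code never mentions the attribute, yet the outcomes differ between groups.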

Conclusion

Although my goals for the lab were achieved, it was clear that the assignment had a lot of potential that I didn’t tap into. Several people have noted that ethics must be incorporated into existing CS classes, and while there are some resources out there, much of it is still to be developed and refined. I could not have created this lab without Evan Peck sharing his assignment, and I hope this writeup will be useful to others as well.

Closing on a tangential point: a recent Chronicle article was titled What Professors Can Learn About Teaching From Their Students (paywalled). The article was about students formally observing a classroom, but I have found that even informal feedback can be extremely useful. I urge other instructors to reach out to their students for comments and suggestions, especially for newly developed material.

Step 85: Teach Technical Writing

June 2, 2017 Justin Li

One of Oxy’s graduation requirements is that students must complete a “discipline-specific writing” course in their major. The exact implementation of this requirement is up to the department: Cognitive Science considers passing any upper-level course to be sufficient, Physics requires students to submit lab reports to a portfolio, while students in Mathematics must take a two-unit Junior Colloquium. For Computer Science, we decided that a junior seminar course would work best – students simply do not write enough in other courses to compile a portfolio.

Since I will likely teach the first offering of the Junior Seminar next Spring, I have been thinking about the writing errors that students make. By coincidence, all three of the courses I taught this past semester required some writing. Students in Intro to Cog Sci had to write two ~1800 word papers; students in Topics in AI had to write short essays as part of their homework; and even students in Data Structures had to write to justify their complexity analysis and explain their choice of data structure for an application. This trend will continue into the Fall, as I will be teaching the writing-based Cognitive Science Senior Comprehensive projects as well.

(Separately, how common is it for students to be writing in Data Structures or any other computer science course? Outside of Software Engineering Practicums and more so Senior Capstones, I don’t remember writing much in my undergraduate CS courses. Teaching Data Structures though, I am confused why I didn’t – the programming projects provided technical depth, while writing addresses some of the breadth of Data Structures. But that’s a topic for another post.)

I struggled to find overarching patterns in students’ writing more descriptive than general writing advice (e.g. be specific, be concrete, signpost, etc.). If I had to identify the single biggest problem though, it would be that students don’t know how to make “arguments”. I don’t mean the strength of their evidence or how their essay is structured, but the kind of claims that they make and how they tie those claims back into their thesis. I put “arguments” in quotes because this does not apply only to building persuasive essays, but also to the composition of explanatory pieces.

I have two examples from this semester. First, one paper in Intro to Cog Sci asks students how Marr’s levels of analysis lead to “a coherent understanding of human visual perception”. Anticipating that this prompt will be challenging, I required students to submit an outline two weeks before the final due date. From these works in progress, it was clear that many students did not address the “coherent” part of the topic. Instead, most students listed different applications for each of the levels of analysis, but remained silent on how the levels integrate with each other. A few students strengthened their argument in the final essay, but most papers remained weak even after receiving feedback.

The second example is from Topics in AI, where I asked students to take a popular-media article about some AI application, then research and explain how the underlying algorithm works. Students did do this, but not to the specificity that I want. For example, several students found applications of deep learning, but did not explain that training involves modifying the weights, nor what format the inputs and outputs were. These oversights are similar to the ones my previous students made when applying reinforcement learning. In both cases, I felt the final essays did not demonstrate the technical understanding that students had of the subject.

The second example, in particular, made me wonder whether the students’ real difficulty lies in writing or in argumentation. The first example might be considered a philosophy of science paper, while the second asked for computer science knowledge. These essays are necessarily discipline-specific, even the one for the introductory course, and students likely do not have the requisite argument content knowledge (to draw parallels with pedagogical content knowledge). Such a hypothesis would imply that even if students are already capable writers, they would need additional training to learn the acceptable types of arguments for each discipline – and that we as instructors must teach them to do so.

I consider myself a decent writer, but I have never been trained as a writing instructor and I’m still learning what works and what doesn’t. Providing feedback on outlines and leaving time for peer review helps, but not to the degree I want. In the future, I might adopt the extremely detailed grading rubric for philosophy papers that went viral recently. I have been thinking of going one step further – I would like to create examples of disciplinary writing of varying quality, together with an analysis of the strengths and weaknesses, to provide a reference for students. It’s unclear whether this effort needs to be duplicated for each type of writing (e.g. an informative piece as compared to a project proposal), but I hope that more practice teaching writing courses will help me understand what best helps students.

Step 84: Set Summer Goals

May 25, 2017 Justin Li

To ease myself back into a writing schedule, I am presenting my goals for the summer:

Produce publishable results with my research student. This is the first time I have had a CS summer research student, and I have high hopes that her work will lead to usable experimental results by the end of the summer, even if she does not help with the writing as well.
Write a paper by myself. In addition to my student, I also have a project I would like to work on, again with the goal of a submittable paper by the end of the summer.
Complete my annual and pre-tenure reviews. At Oxy, pre-tenure faculty are required to submit an annual report of our activities. Because I’m coming up for my third year, I must also write a pre-tenure review, essentially a check-up to make sure that I am on track for tenure. I am unclear at this point how different these reports should be, but I’ve heard that the annual review is backwards-looking, while the pre-tenure review is more about what I will do in the few years until my tenure case.
Formalize CS recruitment. Now that we have a department, I want to formalize a few things that I’ve been doing on an ad-hoc basis. One of these is the recruitment of students: funneling them towards taking computer science, supporting and encouraging students who are struggling, and complimenting students who do well. Along with this effort, I would like to start thinking about shared curricula for introductory classes. This will likely be a year-long conversation, but planning for it can start now.
Plan for my two Fall courses. As my third time teaching CS1, the major structure of the course is fairly set. I am still creating new assignments just to keep things fresh, and for the fall I hope to introduce virtual reality into the course. I also hope to more deliberately include assignments that speak to all three of the sciences, the social sciences, and the arts and humanities. My other course is the Cog Sci senior seminar, where my role is mostly to mentor students in their projects and their writing. It will be my first time teaching the course, but I actually look forward to strengthening our students’ writing.

Step 83: Survive!

May 17, 2017 Justin Li

I’m alive! This blogging hiatus was due to the busiest semester at Oxy so far. It’s the first time I’ve had a three-course teaching load (well, two and a half), and on top of that I was involved in both the computer scientist search and the proposal for a Computer Science department/major. Everything below probably deserves a blog post of its own as I return to my weekly schedule, but in the meantime, here are the highlights:

Department/Major Proposal: The major news of the semester (pun intended) is that the faculty voted on and passed a resolution to establish a Computer Science department and major. This would not have happened without lots of work by other people, both on the intellectual side as well as navigating the college bureaucracy. As a junior faculty member, I actually felt understanding and working through the process of proposing a major was more work than designing the major itself. I don’t want to go into the politics of proposing a department/major, but I’ve learned a lot about the different interests at the college. I will just mention that the allocation of resources between departments was a concern, which I think will be a focus of the faculty for the near future.
CS Search: Slightly before the department stuff really heated up, we also hired a new faculty member! Kathryn Leonard will be joining us from Cal State Channel Islands in the fall, as a full professor. Her work is on shape modeling from a mathematical perspective… but I can’t really do her research justice. She has also done a lot of teaching and outreach for underrepresented students, and started a Data Science Minor at CSUCI. I’m super-excited for her to lead CS@Oxy.
Research: I experimented a little this semester with having a lot more students – from 2-3 the previous semesters to 6-7 this semester. The main reason is that, with all the classes I’m teaching, it’s hard for me to find time to focus on research. More students means more meetings, but that actually blocks out time for me to think about what they are doing. I am still figuring out how to split up my main projects for these students, so not every project will lead to publication-level work. I also abandoned group meetings this semester, partially because we changed how students get research credit, but also because the students’ projects do not directly overlap. Still, I think the experiment was a success, and I intend to keep this many students in future semesters.
Intro to Cog Sci: I’m not sure how much I talk about this course, since most of this blog is focused on computer science. As the name suggests, this is the first course that students take in Cognitive Science, and is usually co-taught by three professors. The ideal would be to have one philosopher, one cognitive psychologist, and one mathematician/computer scientist, but this year it’s just me and my Cog Sci colleague. Because the teaching team changes every year, the first semester is often time to figure out how you work together, with the second semester to polish the class a little. We were lucky to have great students this semester, which let us push the concepts a little harder, but also meant more time rethinking what each lecture should be about.
Data Structures: This is the first time I’ve taught this course, but it went better than the first time I taught CS1. For some reason, over the semester I kept feeling as though there were topics I was not covering, but when I compared my syllabus to those at other colleges, the topics had significant overlap. My one big mistake was trying to use multiple free online textbooks; the readings ended up being disorganized and disjointed. Other than that, this class was surprisingly easy to teach, and I thoroughly enjoyed drawing box-and-arrow diagrams again.
Topics in AI: The first time I taught this course was my first semester at Oxy. At the beginning of the semester, I thought I would rethink the course given what I know now about Oxy students, but in retrospect the course did not change as much as I thought it would. The balance between cognitive science and computer science remains a problem, and the assignments from last time were good enough that I reused most of them. I did introduce more computer science/machine learning into the course, but had to be careful that the mathematical details did not exceed my students’ abilities. I mostly relied on visualizations in IPython, which worked well. This course may change in the near future as Cognitive Science revisits their curriculum in light of Computer Science, so it’s unclear if I will teach this course again.

As I said, I will likely spend a full blog post on each of these, and on other smaller topics as well. I’m excited to have time to write again!

Step 82: Improve Computer Organization Lectures

January 18, 2017 Justin Li

A while back I mentioned that, for the first time, I included in CS1 a series of classes introducing students to other topics in computer science. It was a series of eight lectures, which covered in order:

Memory Layout, starting with the binary representations of numbers and strings, to the idea of pointers and the idea of treating programs as data.
Recursion, mostly the ideas of base case and recursive case, with a simple application of printing the contents of nested directories.
Artificial Intelligence, specifically problem solving as search and a very high-level overview of machine learning.
Parsing, although the class turned into me using recursion to build a basic interpreter for arithmetic variables and expressions.
The Internet, mostly the idea of networks, DNS, and routing, with a little DDoS thrown in.
Security and Privacy, mostly best practices for personal digital security, but also some implications for privacy in the age of AI.
Computer Architecture, which includes the CPU, memory hierarchy, and OS, with a little bit of digital circuitry.
Assembly, which is exactly that, using this excellent 8-bit simulator.
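To make the recursion item in the list above concrete, here is a minimal sketch of printing the contents of nested directories. It is my own reconstruction of the idea, not the code used in class; the function name `list_tree` and the two-space indentation are assumptions made for illustration.

```python
import os

def list_tree(path, depth=0):
    """Return one indented line per entry found under `path`.

    `list_tree` is a hypothetical helper written for this sketch.
    """
    lines = []
    for name in sorted(os.listdir(path)):
        lines.append("  " * depth + name)
        child = os.path.join(path, name)
        # Recursive case: a directory holds more entries, so list it
        # the same way, one level deeper. Symlinks are skipped to
        # avoid following cycles.
        if os.path.isdir(child) and not os.path.islink(child):
            lines.extend(list_tree(child, depth + 1))
        # Base case: a plain file contains nothing else, so we stop.
    return lines
```

Calling `print("\n".join(list_tree(".")))` prints the tree for the current directory. The function never needs to know how deeply the directories are nested, which is the point of the base case and recursive case split.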

The series culminated in an essay where students have to explain how a computer works to the layperson. The overall idea was influenced by courses like Harvey Mudd’s CS 5, which explicitly allows students to sample upper-level courses. Michigan’s EECS 183 used to do something similar when I taught it two years ago, although it looks like they have since shrunk the number of classes reserved for these grab-bag topics. Finally, this sequence also borrows from ideas like From NAND to Tetris, although I’m going top-down instead of bottom-up with interwoven additional topics (and I’ve clearly thought through the sequence less than the authors of the book).

Individually, teaching these topics did not strike me as problematic. I particularly enjoyed the memory and assembly lectures, not only because they were mind-blowing for students, but because it was material I hadn’t used myself in a long time. The most difficult topics for students were recursion and parsing. For the latter, I was surprised by the lack of articles that explained in simple terms why recursion was necessary. Even now, I find myself unable to adequately explain why ordinary loops are insufficient, at least not without talking about stacks and trees. The remaining topics were ones that I have taught before, either in previous semesters of this course (the Internet lecture) or in cognitive science (AI), and all went passably.
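One way to show where recursion enters parsing is a small recursive-descent evaluator for arithmetic with variables. The sketch below is not the interpreter I built in class; the names `tokenize` and `evaluate`, the grammar, and the choice to leave out unary minus are all assumptions made to keep it short. The loops handle a flat chain like `1 + 2 + 3`, but a parenthesis can contain a whole new expression, so `factor` has to call back into `expression`.

```python
import re

# A token is a number, a variable name, or one of + - * / ( ).
TOKEN = re.compile(r"\s*(\d+\.?\d*|[A-Za-z_]\w*|[-+*/()])")

def tokenize(text):
    """Split `text` into a list of token strings."""
    tokens, pos = [], 0
    text = text.rstrip()
    while pos < len(text):
        match = TOKEN.match(text, pos)
        if not match:
            raise SyntaxError(f"unexpected character at position {pos}")
        tokens.append(match.group(1))
        pos = match.end()
    return tokens

def evaluate(text, variables=None):
    """Evaluate an arithmetic expression; `variables` maps names to numbers."""
    tokens = tokenize(text)
    variables = variables or {}
    pos = 0

    def peek():
        return tokens[pos] if pos < len(tokens) else None

    def advance():
        nonlocal pos
        token = peek()
        pos += 1
        return token

    def expression():
        # expression := term (("+" | "-") term)*
        value = term()
        while peek() in ("+", "-"):
            if advance() == "+":
                value += term()
            else:
                value -= term()
        return value

    def term():
        # term := factor (("*" | "/") factor)*
        value = factor()
        while peek() in ("*", "/"):
            if advance() == "*":
                value *= factor()
            else:
                value /= factor()
        return value

    def factor():
        # factor := number | name | "(" expression ")"
        token = advance()
        if token is None:
            raise SyntaxError("unexpected end of input")
        if token == "(":
            value = expression()  # the recursive step
            if advance() != ")":
                raise SyntaxError("expected ')'")
            return value
        if token[0].isdigit():
            return float(token)
        if token[0].isalpha() or token[0] == "_":
            return variables[token]
        raise SyntaxError(f"unexpected token {token!r}")

    result = expression()
    if peek() is not None:
        raise SyntaxError(f"unexpected token {peek()!r}")
    return result
```

For example, `evaluate("2 * (3 + x)", {"x": 4})` returns `14.0`. The Python call stack is doing the work of remembering each unfinished outer expression, which is the "stack" that a plain loop would have to manage by hand.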

As a whole, however, this sequence of classes needs rethinking. Even before the teaching evaluations, some students already told me that I could have better integrated the topics. It’s true that, without additional information, it’s unclear why we need CPUs and RAM and all the other components of a computer, and how an executable fits into it all. Missing the big picture also made the cumulative essay somewhat obtuse, which was corroborated in an evaluation comment: “I felt that I had very little insight to add to the paper besides technical information that I pulled from other sources.” While some students did manage to combine the topics into a coherent narrative of how code works, many other essays lacked transitions between descriptions of each system.

My takeaway here is that this sequence is too ambitious, even at the mile-high view of each topic. It was a useful exercise to see where the limits of the course are, but I think one student had the right idea: “I would prefer that instead of spending the last month learning other things about computers, to instead spread out the computer science material so it can be taught at a slower pace.” Some students did enjoy the breadth of topics, so my plan for the future is to reduce the number of grab-bag classes, but allow students to select the topics they are most interested in (while perhaps requiring the one on memory). I suspect the parsing lecture will never be chosen, but since I’m not sure how much students got out of it this semester anyway, I don’t see it as a huge loss.

Step 81: Address Teaching Evaluations 2

January 17, 2017 Justin Li

I am about to start my fourth semester of teaching at Oxy, and I’ve started thinking more about how to collect meaningful longitudinal feedback. I took time over the break to automate some of the analysis of teaching evaluations, and in particular collating scores from the same questions over multiple years. One trend that caught my eye is the student response to “The instructor stimulated intellectual enthusiasm for the material presented”:

enthusiasm

These drops are minimal on the absolute seven-point Likert scale, but even if they are not significant, they revive an old fear. One of my concerns before starting grad school was that I would have to suppress my intellectual curiosity. I’m not sure I ever liked the idea of studying a single topic, at least not at the expense of pursuing other ideas. It was liberating to be done with grad school, and I did pick back up some old interests. My new worry, however, is that I will again be bored by teaching the same material semester after semester.

This is not an idle concern. Oxy is my first long-term full-time teaching job, but it would not be my first extended teaching experience with the same material. As an undergrad, I was a peer facilitator for Northwestern’s Gateway Science Workshop, and I taught the same faculty-created engineering worksheets for three years in a row. I didn’t need student evaluations to tell that I connected less and less with my students through the years. I stopped bothering with ice-breakers; I stopped asking about their non-academic life; I started following the worksheets more closely without wasting time to draw in additional concepts. It is the same narrative I took away from the plot above: following the same template semester after semester, growing comfortable with the material, but ultimately disengaging from the students and unable to inspire them to pursue the discipline.

Teaching at Oxy is very different from peer tutoring – for one, I have complete control over the material, which makes it easier to include new lectures and keep things interesting. Nonetheless, I am starting to feel that same slide towards apathy. To be clear, I don’t actually think I am losing enthusiasm for the material. Rather, what I think I am losing is the spontaneity and authenticity of presenting material for the first time. I could feel myself being less engaged the second or third time I reuse my slides. I suspect what’s happening is that I design my presentations with a lot of additional cues to keep in mind. The first time through, the class is only days (or hours) after my prep, so all the supplemental content is still in my head. When I revisit the lecture a semester later though, it’s no longer available, so I end up strictly following the content on the outline, to the detriment of the class.

The obvious answer is to start including speaking notes for my lectures, but that’s a lot of work and I honestly don’t prepare for class that way. I once heard a story, from someone who watched/shadowed a skilled teacher, who had apparently rehearsed their lecture down to pausing to put down their cup. At the other extreme is discarding all previous material and starting over, but I worry about thereby losing the accumulated improvements I’ve made over the years. The danger in looking for a middle ground is that it’s too easy to just take the material from the previous semester and use it wholesale.

One thing I might try this semester is to derive the goals of each class from scratch, before looking at old material. This would at least identify missing content and drive improvement to my classes. Separately, I’m resolving to rediscover my interests, if not in the lesson plans, then in introducing new students to the thought-provoking concepts in cognitive science and computer science.

Step 80: Respond to Teaching Evaluations 1

January 6, 2017 Justin Li

Note: I’m addressing this to students, and this post is… condescending and patronizing. You have been warned.

As a whole, academics are self-centered. I don’t mean that they are egotistical – although some certainly are – but that they have an internal locus of control as well as a high self-efficacy. Academics tend to believe that they are competent and capable of doing their job. This is not to say that they are not open and sensitive to critique, but that academics tend to be critical of the criticisms themselves, and groundless criticisms mostly roll off our backs. After all, we went through a PhD program, and a lot of that was being told that our work was insufficient.

(That was a paragraph of sweeping generalizations; I apologize.)

Which is to say that if you despise a faculty member and you want to tank their teaching evaluations, YouTube comment tactics are not going to work. First, giving someone across-the-board zeroes is easily detected. This is called an outlier in statistics, and is often excluded from summary statistics. Similarly, comments such as “Justin is a terrible person” do not mean much to me. It’s kind of like being called “stupid” by a young kid – the default response is “yeah, okay, I have better things to worry about”. For the comment about me being a terrible person, it’s not even that I disagree with the comment – me writing this blog post is terrible and passive aggressive of me.

So, students, here’s a tip. The way to make your negative evaluation count is to point out where the instructor is incompetent and then (and this is key) back it up with evidence. Stop with the personal attacks (“Justin is a terrible person.”) and talk about what they did not do (“Justin is a terrible teacher.”). Personally, comments that I am condescending cause me less stress than arguments that my classes were not thought out. An evaluation that says “Justin’s classes are disorganized” is good; one that says “he jumped from one topic to the next” is better. Show that you know what the instructor was trying to do and that they failed. Talk about how the instructor negatively affected your ability to meet the goals of the course (maybe “Every class presented a random collection of facts, and there was no attempt to give the big picture.”) or better yet, that the instructor reflects negatively on the department/field (e.g. “Although I was really interested in the class at first, I have decided that I will not major in this department if I have to continue taking classes with Justin.”). Finally, if you want to be just plain mean, compare them to other professors.

This is not guaranteed to work, especially as professors gain experience and have seen the gamut of comments. But you would have achieved your goal of rattling the instructor. Why am I telling you this? Because the most effective criticisms are also the ones that help faculty figure out what to change. You are telling us what doesn’t work, and where we might do better. Speaking for myself, the more biting your criticism – as long as I see it as valid – the more I’m motivated to improve and change it. So if you’re disgruntled, by all means, negatively evaluate us – but do it well.

Most of the quotes so far were made up, but I do want to give a real teaching evaluation comment that hit me hard. This was a mid-semester comment from two years ago:

Justin, honestly, has been terrible so far. His method of teaching is simply not conducive to learning. For example, the class features i-clicker questions, which from my experience have helped me test whether I’m understanding the material. However, Justin usually gives out an increasingly difficult series of questions regarding a topic and then proceeds to teaching the topic, generally making what could be considered mocking remarks when people get it wrong and effectively negating the purpose of i-clickers by testing us on material that we don’t cover until after the questions. Furthermore, when giving out answers for i-clickers, he generally makes remarks like “I think it’s this one” or “pretty sure, it’s C,” as if he is unaware of the correct answers for a class he’s teaching (i.e. unprepared for class). Finally, a TA led lecture when Justin was unable to attend, and it was by far the clearest, most helpful lecture I’ve experienced in the course. And from interacting with nearby students after the TA’s lecture, my sentiment seems to be shared.

Overall, I’m taking this class as a senior for general interest, so Justin’s inadequacy as a lecturer is frustrating but not inhibiting. However, for the freshmen/sophomore in the class who are considering an EECS major, I feel that the EECS department has done those students a massive disservice by allowing Justin to teach. I can’t imagine how uninspired I would be if I came across an unprepared, rude, unhelpful lecturer like him when doing the pre-reqs for my current major, and I sincerely hope he doesn’t deter some of the smart, engaging students around me who are considering an EECS major. Besides the problems/suggestions highlighted above, my final suggestion would be to allow another professor (or honestly, even the aforementioned TA) to teach the remaining lectures. Otherwise, the EECS department can go on knowing that they wasted two hours and forty minutes of interested, devoted students’ time per week because of Justin’s poor performance as an instructor.

After I first read this comment, I could not focus on my work for a week. I seriously questioned my ability and my desire to continue teaching. Part of it was because it was the first wholly negative teaching evaluation I had received. I still wince when I reread those two paragraphs, but I’m not sure I would break down quite as badly if I got the same evaluation now – I’ve just come to accept that I can’t please everyone.

(PS. Although the comment was provided anonymously, I have reason to believe that the same student ended the semester with a positive evaluation of me. The corresponding paragraph is one of my all-time favorite comments on my teaching.)

Step 79: Explore CS1 Grading

December 23, 2016 Justin Li

At the end of every semester, and especially the last two when I’ve taught CS1, I always have the same thoughts about grade inflation and the meaning of grades. This is a particular problem for CS1 because somewhere between a third and a half of the students get A-‘s or A’s – a much larger proportion than in other introductory courses. One possible interpretation is that I grade too easily, but other interpretations are possible. Two in particular come to mind:

The course structure allows students to get objectively higher grades. I’ve written about the lack of exams and the frequent, low-stakes grading structure. I also allow unlimited autograder submissions, which means that students can tweak their code until they pass all the test cases. The autograder also provides immediate feedback, which leads to…

Students spend more time on this class. The data from the end-of-semester teaching evaluations support this: over both semesters (59 respondents total), the median and average time spent on this course outside of the classroom are 6 and 8.4 hours per week, respectively. Keep in mind that this course already has 6 hours of lectures and labs per week, and that a course is supposed to take about 10 hours total.

(This plot omits one student who reported spending over 40 hours per week on this course. I really hope they were exaggerating.)

The real answer to the high grades is likely some combination of all three explanations. What I can’t decide is what this means in terms of the grading structure of the course. I am less concerned about grade inflation than I am about the distribution of grades. I wrote in the previous post that my “quizzes” are tri- or quad-modal. It turns out that my final grades are not as bad, but are still bi-modal, with peaks around 85% and 95%.

(Grades lower than B- have been omitted.)

As with grade inflation, there is the question of what this means, and there is the meta-question of whether it is problematic. The face-value explanation would be that there are two groups of students – one that gets computer science, and one that doesn’t. I’m undecided whether this describes the “true” distribution of computer science competency, but philosophically as a teacher I should not design courses with this assumption. If instead I take for granted that student skill levels are unimodal, then what the grade distribution would suggest is that I am not sufficiently sensitive to students in some middle section of that curve.

One thing I do know is that this is not a problem I can fix by changing the grading structure but keeping the same assignments. I know this because I have iterated through the space of assignment weights. Within the constraints of low-stakes assignments, no set of weights would transform the existing grades of my students into a unimodal distribution peaking around B or B+. What this means to me is that if I am indeed failing to identify the B+ students, the place to start would be to look at the actual content of the assignments.
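For the curious, the search over assignment weights can be sketched roughly as follows. This is a simplified illustration rather than my actual script: the function names, the 0.1 weight grid, the 5-point histogram bins, and the crude peak-counting rule are all assumptions, and it does not encode the low-stakes constraint or check where the peak falls.

```python
from itertools import product

def weight_grid(num_categories, step=0.1):
    """Yield every weight tuple on a grid of `step` that sums to 1."""
    ticks = round(1 / step)
    for combo in product(range(ticks + 1), repeat=num_categories):
        if sum(combo) == ticks:
            yield tuple(c / ticks for c in combo)

def final_grades(scores, weights):
    """Weighted final grade per student; `scores` has one row per student."""
    return [sum(w * s for w, s in zip(weights, row)) for row in scores]

def count_peaks(grades, bin_width=5):
    """Count peaks in a histogram of grades on a 0-100 scale.

    Crude rule (an assumption): a bin is a peak if it is higher than
    its left neighbor and at least as high as its right neighbor.
    """
    counts = [0] * (100 // bin_width + 1)
    for grade in grades:
        counts[int(grade // bin_width)] += 1
    padded = [0] + counts + [0]
    peaks = 0
    for i in range(1, len(padded) - 1):
        if padded[i] > padded[i - 1] and padded[i] >= padded[i + 1]:
            peaks += 1
    return peaks

def unimodal_weightings(scores, step=0.1):
    """Return every weighting on the grid whose grade histogram has one peak."""
    num_categories = len(scores[0])
    return [
        weights
        for weights in weight_grid(num_categories, step)
        if count_peaks(final_grades(scores, weights)) == 1
    ]
```

Here `scores` would be a list with one row per student and one column per assignment category, each on a 0-100 scale. An empty result from `unimodal_weightings` is the code equivalent of the conclusion above: no reweighting of the existing assignments removes the second peak.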

I don’t have a takeaway from this. I dislike the bimodal distribution of grades, but it’s unclear whether I am justified in my dislike, and even if so, what I can do to change it. Assigning grades, as well as deciding on the grading structure of a course, requires thinking through not just what students should learn and whether their grade reflects that, but also how we trade off student achievement, time spent, and the value of negating institutional grade inflation. As a final thought, it has occurred to me that perhaps grades are not the venue to demonstrate these nuances. Perhaps grades should be seen only as the carrot-and-stick, with more emphasis put on detailed feedback provided through other channels.

How to Start a CS Department

A new Faculty's journey at Occidental College
