Introducing CS 1 students to algorithmic bias via the Ethical Engine lab

There’s a lot of recent interest around the ethics of technology. From recent popular press books like Algorithms of Oppression, Automating Inequality, and Technically Wrong* to news stories about algorithmic bias, it seems like everyone is grappling with the ethical impacts of technology. In the computer science education community, we’re having our own discussions (and have been for some time, although there seems to be a recent uptick in interest) on where ethics “belongs” in the curriculum and how we can incorporate ethics across the curriculum, including in introductory courses.

One initiative aimed at touching on ethical issues in CS 1 particularly caught my attention. In July 2017, Evan Peck, at Bucknell University, posted about a programming project he and Gabbi LaBorwit developed based on MIT’s Moral Machine, a reworking of the classic Trolley Problem for self-driving cars. This project, the Ethical Engine, had students design and implement an algorithm for the “brains” of a self-driving car, specifically how the car would react if it could only save its passengers or the pedestrians in the car’s way. After implementing and testing their own algorithms, students audited the algorithms other students in the class designed.

Justin Li at Occidental College built upon this lab, making some changes to the code and formalizing the reflection questions and analysis. He wrote about his experiences here. In particular, Justin’s edits focused more on student self-reflection: students compare their algorithm’s decisions against their manual decisions and reflect on the extent to which the algorithm’s decisions did or did not match their priorities.

I was intrigued by the idea of this lab, and Justin’s version seemed like it would fit well with Carleton students and with my learning goals for my intro course. I decided to integrate it into my fall term section of intro CS.

Like Evan and Justin, I’ve made my code and lab writeup freely available on GitHub. Here are links to all three code repositories:

Framework

Based on Justin’s and Evan’s writeups, I made several modifications to the code.

  • In the Person class, I added “nonbinary” as a third gender option. I went back and forth for a bit on how I wanted to phrase this option, and whether “nonbinary” captured enough of the nuance without getting us into the weeds, but ultimately decided this would be appropriate enough.
  • Also in the Person class, I removed “homeless” and “criminal” as occupations, since they didn’t really fit in that category, and made them boolean attributes, similar to “pregnant”. Any human could be homeless, but only adults could have the “criminal” attribute associated with them.
  • In the Scenario class, I removed the “crossing is illegal” and “pedestrians are in your lane” messages from the screen output, since in this version of the code these things are always true.

I also made it a bit clearer in the code where the students should make changes and add their implementation of the decision making algorithm they designed.
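To make the modifications above concrete, here is a minimal sketch of what a Person class with these changes might look like. This is not the actual lab code: the attribute names, the gender labels, the use of a numeric age, and the age-18 cutoff for "adult" are all assumptions made for illustration.

```python
from dataclasses import dataclass

# Hypothetical labels; the lab's real option names may differ.
GENDERS = ("female", "male", "nonbinary")
ADULT_AGE = 18  # assumed cutoff for "adult", for illustration only


@dataclass
class Person:
    age: int
    gender: str
    profession: str = "unknown"
    # Boolean attributes, modeled like "pregnant" rather than as occupations.
    pregnant: bool = False
    homeless: bool = False
    criminal: bool = False

    def __post_init__(self):
        if self.gender not in GENDERS:
            raise ValueError(f"unknown gender: {self.gender!r}")
        # Anyone can be homeless, but only adults can be flagged as criminal.
        if self.criminal and self.age < ADULT_AGE:
            raise ValueError("only adults can have the criminal attribute")
```

Validating in `__post_init__` keeps the "only adults can be criminals" rule in one place, so scenario-generating code cannot accidentally create an invalid person.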

Execution

I scheduled the lab during Week 8 of our 10-week course, just after completing our unit on writing classes. We take a modified “objects-early” approach at Carleton in CS 1, meaning students use objects of predefined classes starting almost immediately, and learn to write their own classes later in the term. The lab mainly required students to use classes written by others, accessing the data and calling the methods in these classes, which conceivably they could have done earlier in the term. However, I found that slotting the lab in at this point in the term meant that students had a better understanding of the structure of the Person and Scenario classes, and could engage with the classes on a deeper level.

I spread the lab over two class periods, which seemed appropriate in terms of lab length. (In fact, one of the class periods was shortened because I gave a quiz that day, and the majority of the students had not finished the lab by the end of class, which leads me to believe that 2 whole class meeting periods at Carleton, or 140 minutes, would be appropriate for this lab.) As they do in all our class activities, students worked in assigned pairs using pair programming.

On the first day, students made their manual choices and designed their algorithm on paper. To ensure they did this without starting with the code, I required them to show their paper design to either my prefect (course TA) or me. A few pairs were able to start implementing the code at the end of Day 1. On the second day, students implemented and tested their algorithms, and started working through the lab questions for their writeups. Most groups did not complete the lab in class and had to finish it on their own outside of class.

At the end of the first day, students submitted their manual log files. To complete the lab, students submitted their algorithm implementation, the manual and automatic logs, and a lab writeup.

Observations

Unexpectedly, students struggled the most with figuring out how to access the attributes of individual passengers and pedestrians. I quickly realized this is because I instruct students to access instance variables using accessor and mutator methods, but the code I gave them did not contain accessor/mutator methods. This is a change I plan to make in the code before I use this lab again. I also plan to look a bit more closely at the description of the Person and Scenario classes in the lab, since students sometimes got confused about which attributes belonged to Scenarios and which belonged to Persons.
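As a rough illustration of the planned change, the sketch below shows a class that exposes its data through accessor and mutator methods, so students can use the access style they were taught. The class is a simplified stand-in, and every method name here (`get_age`, `is_pregnant`, `set_pregnant`, and so on) is hypothetical rather than taken from the lab code.

```python
class Person:
    """Hypothetical, simplified stand-in for the lab's Person class."""

    def __init__(self, age, gender, pregnant=False):
        # Leading underscores signal "internal; use the methods below".
        self._age = age
        self._gender = gender
        self._pregnant = pregnant

    # Accessor methods: read an attribute without touching it directly.
    def get_age(self):
        return self._age

    def get_gender(self):
        return self._gender

    def is_pregnant(self):
        return self._pregnant

    # Mutator method: change an attribute through a single entry point.
    def set_pregnant(self, pregnant):
        self._pregnant = bool(pregnant)


def count_pregnant(people):
    """Count pregnant people using only the accessor methods."""
    return sum(1 for person in people if person.is_pregnant())
```

With methods like these, a student's decision code reads `person.get_age()` instead of reaching into the object's instance variables.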

Students exhibited a clear bias towards younger people, often coding this into their algorithms explicitly. One pair mentioned that while their algorithm explicitly favored younger people over the elderly, in their manual decisions they did “think of our grandmas”, which led to differences in their manual and automatic decisions in some places. A fair number of students in this class came from cultures where elders traditionally hold higher status than in the US, so the fact that this bias appeared so strongly surprised me somewhat. Pregnant women also got a boost in many students’ algorithms, which then had the effect of overfavoring women in the decisions — which many students noted in their writeups. While nearly all pairs explicitly favored humans over pets, a few pairs did give a small boost to dogs over cats, while no one gave any boost to cats. I’m not sure why this class was so biased against cats.
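The knock-on effect of a pregnancy boost can be checked with a small audit. The sketch below is not any student's algorithm: the scoring weights, the age cutoffs, the dictionary layout, and the random scenario generator are all assumptions chosen for illustration. It scores each group, saves the higher-scoring one, and then reports how often people of each gender end up in the saved group.

```python
import random
from collections import Counter

GENDERS = ["female", "male", "nonbinary"]  # hypothetical labels


def score(person):
    """Assumed weights: favor the young, boost pregnancy, discount the elderly."""
    value = 1.0
    if person["age"] < 18:
        value += 2.0
    if person["age"] >= 65:
        value -= 0.5
    if person["pregnant"]:
        value += 1.5
    return value


def decide(passengers, pedestrians):
    """Save whichever group has the higher total score (ties go to passengers)."""
    passenger_total = sum(score(p) for p in passengers)
    pedestrian_total = sum(score(p) for p in pedestrians)
    return "passengers" if passenger_total >= pedestrian_total else "pedestrians"


def random_person(rng):
    gender = rng.choice(GENDERS)
    age = rng.randint(1, 90)
    # Assumption: only women aged 18 to 45 can be generated as pregnant.
    pregnant = gender == "female" and 18 <= age <= 45 and rng.random() < 0.3
    return {"age": age, "gender": gender, "pregnant": pregnant}


def audit(n_scenarios, seed=0):
    """Return the fraction of people of each gender who were saved."""
    rng = random.Random(seed)
    seen, saved = Counter(), Counter()
    for _ in range(n_scenarios):
        groups = {
            "passengers": [random_person(rng) for _ in range(rng.randint(1, 4))],
            "pedestrians": [random_person(rng) for _ in range(rng.randint(1, 4))],
        }
        choice = decide(groups["passengers"], groups["pedestrians"])
        for name, group in groups.items():
            for person in group:
                seen[person["gender"]] += 1
                if name == choice:
                    saved[person["gender"]] += 1
    return {gender: saved[gender] / seen[gender] for gender in seen}
```

Calling `audit(2000)` returns a save rate per gender. Comparing those rates with and without the pregnancy term in `score` is one way to see whether an attribute that applies to only one group shifts the overall outcome.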

I was impressed by the thoughtfulness and nuance in many of the lab writeups. Most students were able to identify unexpected biases and reason appropriately about them. Many thoughtfully weighed in on differences in their algorithm’s choices versus the choices of their classmates’ algorithms, one pair even going so far as to reason about which type of self-driving car would be more marketable.

In the reflection question about the challenges of programming ethical self-driving cars, many students got hung up on the feasibility of a car “knowing” your gender, age, profession, etc., not to mention the same characteristics of random pedestrians, and being able to use these to make a split-second decision about whom to save. This is a fair point, and in the future I’ll do a better job framing this (although to be honest I’m not 100% sure what this will end up looking like).

One of the lab questions asked students to reflect on whether the use of attributes in the decision process is ethical, moral, or fair. Two separate pairs pointed out that the selection of attributes can make the decision fair, but not ethical; one pair pointed out the converse, that a decision could be ethical but not necessarily fair. I was impressed to see this recognition in student answers. Students who favored and used simpler decision making processes also provided some interesting thoughts about the limitations of both “simpler is better” and more nuanced decision-making processes, both of which may show unexpected bias in different ways.

Conclusions and takeaway points

Ten weeks is a very limited time for a course, so for any activity I add or contemplate in any course I teach, I weigh whether the learning outcomes are worth the time spent on the activity. In this case, they are. From a course concept perspective, the lab gave the students additional practice utilizing objects and developing and testing algorithms, using a real-world problem as context. This alone is worth the time spent. But the addition of the ethical analysis portion was also completely worth it. While I have yet to read my evaluations for the course, students informally commented during and after the exercise that they found the lab interesting and thought-provoking, and that it challenged their thinking in ways they did not expect going into an intro course. I worried a bit about students not taking the exercise seriously, and while I think that was true in a few cases, by and large the students engaged seriously with the lab and in discussions with their classmates.

I teach intro again in spring term, and I’m eager to try this lab again. The lab has already sparked some interest among my colleagues, and I’m hoping we can experiment with using this lab more broadly in our intro course sections, as a way to introduce ethics in computing early in our curriculum.

*all of which are excellent books, which you should definitely read if you haven’t done so already!

A rough return to teaching

I’ve spent the past few summers (minus last summer when I was on sabbatical) teaching in a summer high school program. The program consists of 3 weeks of morning classes and afternoon guided research with a faculty member. I really, truly enjoy it. Teaching high school students is an interesting challenge. And by and large the students have been thoughtful, engaged, creative, and eager to learn. (It’s also very gratifying to see some of them as Carleton students post-high school!)

So when my colleague approached me last fall about teaching again this summer, I agreed. The program, I reasoned, would give me the opportunity to ease back into teaching before returning to the classroom in the fall. Plus I already had curriculum and research projects ready to go. What could possibly go wrong?

Suffice it to say that my envisioned triumphant return to teaching was anything but.

The actual mechanics of teaching? That went easier than I anticipated. The rust fell away quickly, much to my surprise. Being in front of students felt natural to me, and I found my teaching groove in short order. Pacing was still tricky at times, but pacing is always a bit of an inexact science.

What I didn’t anticipate, and what was roughest about re-entry: the small but active minority of students in my research group who decided early on that what I was teaching, human-computer interaction (HCI), was not Real Hard Core Actual Computer Science Because We’re Not Programming 24-7. And the undercurrent of disrespect for my authority, and for my RA’s authority (also a female computer scientist).

Now, I should pause and make it crystal clear at this point that THIS IS NOT NORMAL FOR THIS PROGRAM. The vast, vast majority of students are respectful and open to learning, and to expanding their ideas of what computer science is. I can count on one finger the number of research students I’ve mentored in this program who have been actively disrespectful of me and the subject matter. Sure, I’ve had some students in the past who were openly or less openly skeptical about the merits of HCI as a computer science field, but by and large those students at least came to appreciate what I was trying to teach them in the end, even if in the end they decided it wasn’t quite their cup of tea. And I’ve had some really interesting conversations with the objectors that have not only strengthened my framing of my material, but have also led me to reflect on what material I choose to include and how I include it. Both of which make me a better, more effective teacher in the end.

I spent a lot of time and energy during the program reflecting on where this particular strain of disrespect originated. Part of it likely relates to the HCI = Not Real Computer Science attitude, which is certainly not limited to the students in my class (and is still somewhat pervasive in the field, unfortunately). Part of it also likely relates to the general bro-ness and toxic masculinity that has always surrounded computer science, something that’s come into sharp focus lately with any number of recent news stories. Why did it emerge in force this year, and not in previous years? That, I’m still trying to figure out.

It’s been a very long time since I’ve had to deal with this level of disrespect in the classroom. I’ve been at Carleton long enough that I’m part of the fabric of the department — I am “accepted”. Gaining seniority (in age and in status) over the years increased my credibility with the students, giving me more authority in their eyes. The close-to-gender parity we have in our faculty also helps quell at least some of the disrespect. So I was caught off-guard.

Once I recognized what was going on, I went into damage control mode. I summoned up my Authoritative Teacher persona from the depths — she hasn’t been around much since my pre-tenure days. I blinded them with science — or, at least, hit them hard with the scientific basis for every psychological or design principle we discussed. I randomly threw out my credentials, just to remind them that Yes I Do Know What I Am Talking About As I Have A PhD In Engineering And Years Of Experience. I occasionally let out my Inner Bitch and used my Evil Mom Stare with abandon.

But I also second-guessed almost everything that I did, and said. I put up my guard in ways I haven’t had to do in a very long time. Teaching, and every single interaction in this program, took up at least twice as much of my mental and emotional energy. Teaching in this program is normally draining, but this year, at the end of the day, I truly had nothing left in my tank. And that was not fair to my family or to myself.

Lots of people have asked me if I’ll teach in the program again next year. I honestly don’t know. On the one hand, I still believe strongly in this program. I have met and worked with so many incredible teens and young adults in this program. By and large, my students are thoughtful, creative, eager to challenge themselves, whip-smart, and funny. Most of my students did outstanding work on their research projects, and embraced the experience and challenge from start to finish. And I enjoy serving as a role model to high school students, both as a female computer scientist and as an HCI researcher. But on the other hand, this summer exacted a huge toll from me. I was exhausted, and bitter, every single day. Why does it feel like it’s just my responsibility to hang in there, fight the good fight, and change their minds? How productive, and happy, would I be if I didn’t have to deal with this crap?

Hopefully, I won’t experience anything like this in the fall when I return to the classroom full time. Or, if I do, at least I’ll be prepared to recognize it and deal with it. That, I suppose, is the sad silver lining in this experience.

 

Sabbatical report: Context switching

I’m now about 6 months into my year-long sabbatical. Currently, I’m working on two very different sub-projects. Each sub-project is related to my larger research project on self-healing home networks, and each one approaches the larger project through the lens of one of the two subfields I straddle.

The first sub-project is more mathematical/theoretical. I’m attempting to create a mathematical model of a home network, based on my own measurement work and the measurement studies of others. I submitted a paper in December, which was rejected but got really helpful reviews. Even the infamous Reviewer 3 had constructive and kind things to say. (Thanks, anonymous reviewers!) So now I’m working to make the model more mathematically rigorous. This project approaches the problem of self-healing home networks from the computer networks perspective, and also harkens back to my electrical engineering days, when it seems like every graduate class I took had “processes” in the title (Stochastic Processes, Random Processes, etc.).

The second sub-project could not be more different from the first. It’s a qualitative, interview-based study on how people reason about the networks within their homes. This project approaches the problem of self-healing home networks from the human-computer interaction (HCI) side. The research methods I’m using are completely new to me, so the learning curve has been steep. While I’ve done some math for this project (mainly freshening up my knowledge of statistics), the bulk of the work resembles work that a social scientist would normally do.

The disparity in approaches of the two sub-projects has made for some interesting work weeks. I spent a few days recently cozying up with my old Stochastic Processes textbook trying to remember the details of Markov chains vs. autoregressive models, drawing lots and lots of diagrams, and calculating transition probability matrices. I haven’t thought in such a mathematically rigorous way in a while, so while my skills are definitely rusty, it felt good to return to that mode of thinking. Interspersed with this work are days where I’m reviewing techniques for asking effective interview questions, testing out my recording equipment, strategizing about how to recruit participants, and refining my interview guide. This is an entirely new way of thinking and working for me, so I alternate between feeling like a fish completely out of water and invigorated by the intellectual challenge.

There was probably a time early in my career when I couldn’t fathom working in two such disparate areas. But now, I wouldn’t have it any other way. I like that I’ve found my research passions in two very different subfields. I love that each field engages a different part of my brain. I appreciate that I’ve identified research problems that straddle both fields. I love the opportunity to do and write about math-y things AND design/people-y things. I love that I can use different tools and skill sets to construct models about the world.

I embrace and enjoy the context-switching that my research life entails.

A look back at 2016

I wasn’t planning on doing an end-of-the-year post for 2016.

As far as I’m concerned, 2016 has way overstayed its welcome. In many respects, it’s been a shitty, difficult year from start to finish. From some really difficult, nasty, unbloggable stuff I dealt with in my last year as chair; to the extreme burnout from my job (which had taken such a toll on my physical, mental, and emotional health that I still haven’t fully recovered); to the passing of so many celebrities from my childhood and formative years (I learned about Carrie Fisher’s passing, I kid you not, as we were leaving the theater after watching Rogue One); to the dumpster fires and horrors that were our presidential election, Aleppo, Brexit, and any other number of world events — there’s a lot to be sad/angry/horrified by from 2016. So, yeah, 2016 can just go away, far far away, as far as I’m concerned.

But as I sat on the plane on the way home from my mom’s house yesterday morning, I realized that I didn’t want to end 2016 on a sour note. I’ve spent so much of my time and energy this year (necessarily) ruminating on the bad, but the truth is that a lot of good happened too. And frankly, I’d like to head into the new year with positive momentum to balance some of the anger and despair.

So I am doing an end-of-the-year post, a look back at 2016, focusing on some of the positives from the year. In a future post, I’ll talk about what I want to do to keep this positive momentum moving into the new year.

  1. It was a pretty good year professionally. 2016 was a pretty solid year professionally with a lot of interesting opportunities: co-chairing the Grace Hopper poster session (with an incredibly talented, warm, funny person whom I hope to work with again in the future!), attending Tapia for the first time, continuing to expand my work in academic civic engagement (including attending POSSE and finding an excellent community there), finishing up my stint as chair on (hopefully) a high note, submitting my promotion materials. It also brought clarity and better judgment: I turned down a service opportunity that would have meant a lot of visibility, but wouldn’t have fit in with my larger goals, in favor of a smaller, local opportunity that fits in much better with my larger goals (watch this space in the future for more on that!).
  2. I reprioritized family. My crazy-ass schedule last year meant that I wasn’t always present for my family, and when I was, I was too stressed to be fully present (or, as my kids observed, “You yell a lot when you’re home, Mom.”).

    Highline Trail in Glacier National Park, one of the (many) hikes we did on our epic road trip.

    I made the conscious decision to dial way back on work this summer: not supporting summer students, not teaching in the summer program, spending Fridays and several full weeks home with my kiddos. My spouse, kids, and I took an epic 2-week camping road trip (6 national parks/monuments/memorials*, 6 states**) this summer that was just amazing. My sabbatical means that I’m working sane hours, which means that I can be fully present on weeknights and weekends, which means I can actually enjoy family time. My son started taekwondo this year, and it looked like so much fun that I recently joined him. I’m looking forward to us earning our black belts together someday!

  3. I ran. A lot. 1089 miles, to be exact, not counting whatever I end up running today***, and (woo hoo) injury free! I ran my 2nd marathon in October and PRed by 9 minutes. Best of all, I found an online community of mother runners, some of whom I trained with virtually during my marathon training cycle and some of whom I still virtually keep in touch with. I’m looking forward to marathon #3 next year, and maybe some half marathons, too.
  4. Sabbatical, sabbatical, sabbatical. I can’t tell you how positive this experience has been for every single aspect of my life. I didn’t realize the extent to which my job nearly broke me last year, and over the last few years. I feel normal again. I’ve reset my priorities, my work habits, and my professional goals. I fell in love with my research again. I’ve already submitted one paper and sketched out a brand new research project that will really stretch me professionally. I wake up every day excited to get back to work, and that’s something I haven’t felt in a very, very long time.

I’m still not sad to see 2016 go, but reflecting on the good makes me feel a smidge more hopeful about 2017. In many ways, 2016 clarified what my personal truths are, and I plan on using these truths to frame and structure my 2017. There are many things I can’t control, but there are many things I can do to be the change I want to see in this world. And that, I think, will be my guiding principle for 2017.

* In the order we visited: Theodore Roosevelt, Glacier, Craters of the Moon, Yellowstone, Grand Teton, Mount Rushmore

** Minnesota, North Dakota, South Dakota, Montana, Idaho, Wyoming

***I am super tempted to run 11 miles today to make it an even 1100 miles for the year. We’ll see.

#AcWriMo, Sabbatical Edition: The Final Reckoning

As I’ve done for the past few years, last month I participated in AcWriMo, the month-long academic writing extravaganza. I started the month with two goals:

  1. Complete an almost-submission-ready draft of a conference paper.
  2. Complete a rough draft of a new research study.

I chose this particular set of goals as a way to address some clogs in my research pipeline. Right now I have a lot of work in preliminary stages and/or various stages of write-up, but nothing out for review. I chose the first goal as a way to move something closer to the out-for-review stage of the pipeline, and the second goal as a way to move a project from the half-baked idea phase to the gee-I-could-start-collecting-data-soon stage.

So, how did I do?

I completely met my first goal. I have a complete draft of a conference paper ready to be tweaked for a particular conference. I did not start the month with a particular conference in mind. Instead, I decided to write a generic draft — more like a tech report — that I could then slightly tweak and reframe for particular venues. So all the source material is there, and all I need to do is edit it. And as luck would have it, a few days ago I found a conference with a mid-December deadline that’s a pretty good fit for it. I’ll need to cut 3 pages and I’ll need to reframe the intro to better fit the conference’s focus, but that should be pretty straightforward. So, bonus, this paper WILL be out for review soon!

I completely met my second goal. My literature search confirmed what I suspected — that this new study area is pretty underexplored. Reviewing the literature, and working through my stash of HCI books, gave me some good ideas for how I might explore this space, and I feel pretty excited about my study plan. Also, terrified, because the new study involves qualitative research methods that I’ve never, ever used before. (I am setting up a lot of meetings with my social scientist friends in the near future!)

I wanted to keep track of how I spent my writing time, so I logged my writing time, number of words, time spent coding, time spent on each project, etc. every day.

Time spent over the month on the two projects. “Coding” was code development I did in conjunction with the conference paper.

As expected, I spent more time over the course of the month on the conference paper. This makes sense, because there was a lot more work to do on that particular project and it had a more defined finished product. I also find it interesting that the majority of the work on the new research study was done early in the month. I made a lot of progress early in the month, getting me almost all the way to my goal, which freed up my time to focus on the conference paper. (You can also clearly tell where the weekends are and where the long holiday weekend fell.)

Number of words written over the month on the two projects.

It’s a bit demoralizing to see your word count go down over the course of the month, but this reflects the edits on the conference paper. There’s also a faster rate of word production (most of the time) for the new study, because most of that was “new” writing, so it was less edited and vetted. (It also includes the word count for notes I took while reading articles and books for the project.)

I’ve liked the experience of logging my output like this. Sometimes it’s hard to believe that you’re actually making progress when you’re slogging away day after day, but charts like these drive home the point that daily effort does add up over time. I also experimented with journaling about my research every day, and I’ve found that useful as well. I plan on continuing both practices beyond AcWriMo.

As always, I’ve enjoyed the community aspect of AcWriMo, and I will miss that. One of the many things I’ve been thinking about while on sabbatical is how I can recreate some of that supportive community around research and writing at my institution. I hope to come up with some concrete ideas and try them out next year.

I’m so glad I decided to do AcWriMo again this year. I almost didn’t participate because it felt like “cheating” since I am on sabbatical and I’m supposed to be laser-focused on my research. Participating provided me with a chance to reflect on my research practices and experiment with ways of working, as well as set specific and scary goals and make myself publicly accountable. And these are lessons that I’ll take with me beyond AcWriMo and into the new year.