To teach in the time of ChatGPT is to know pain - Ars Technica
LLM use is the most demoralizing problem I’ve faced as a college instructor.
I’ve been teaching college Earth science courses as a part-time faculty member for a long time now, all while juggling other jobs. I started because it was enjoyable; no one gets into this line of work for the famously poor pay or complete lack of job security. Working with students is just one of those genuinely fulfilling experiences that is addictive enough that they ought to warn people about it.
But thanks to generative AI, it has become mostly miserable―at least in certain settings.
For the last few years, I’ve been exclusively teaching asynchronous online courses, meaning recorded videos rather than live sessions. These have always been a bit more challenging than face-to-face classes, where you have a greater ability to keep the students on track. If a student doesn’t have to show up in a room for an hour at a scheduled time and no one can see their involuntary facial expressions when they don’t understand something, the probability increases greatly that they’ll just… fall off.
But since the appearance of ChatGPT, the instructor’s job isn’t just to teach the subject and frantically attempt to keep every student’s plate spinning. Increasingly, it’s to moonlight as a detective and prosecutor because students without the motivation to do the work don’t have to skip it anymore. They can turn in a work-shaped simulacrum almost as easily. And a substantial number do—in a recent College Board survey of 600 high school students, 84 percent said they had used generative AI for schoolwork.
Teachers are certainly no strangers to cheating. But peeking at concealed notes during an exam or plagiarizing paragraphs from Wikipedia are quaint stone tools compared to the WMDs known as LLMs. I long for the binary comfort of a simple problem like “cheating or not?” Now, I’m forced to adjudicate 256 shades of gray and provide sufficient documentation to defend my decision in case a student appeals my grading to multiple levels of institutional review panels.
Not only does this soul-crushing work consume a shockingly large percentage of my time, but it leaves me with the disturbing thought that even my engaged students might not be what they seem. Maybe they grasped that difficult concept because of my help, or maybe they just laundered an LLM’s regurgitation of Wikipedia paragraphs more skillfully than I can detect.
Let me explain why students are the ones losing the most in this environment and why instructors like me feel pretty much powerless to fix the problem.
Students often carry misconceptions about coursework. They may view an instructor as an opponent standing in the way of the grade they want. And they see “getting the right answers” as the goal of education because that’s how you secure that grade.
But that’s no more true than thinking that logging a count of reps is the goal of bodybuilding. The hard work of lifting weights is the point because that yields physical results. A popular analogy is that using an LLM to write your essay is like driving a forklift into the weight room. Weights get lifted, sure, but nothing is accomplished. I’m not hoping you can answer the exam question for me—I don’t need your essay to get me out of a jam. The process of doing the work was how you were supposed to walk away with something.
In a recent video about how easy Sora has made it for users to generate relatively realistic but deeply problematic videos, Hank Green rubbed his eyes as he shouted in the figurative direction of OpenAI CEO Sam Altman, “The friction matters, Sam!”
Green could just as well have been describing the process of learning. If there’s no friction, no effort, then no work occurred, and the student hasn’t learned. They would have been no less productive watching paint dry.
There are some questions in my course assignments that require critical thought to extend ideas beyond the material I’ve taught. For example, one asks them to stumble on the concept of a natural experiment by thinking of a way to study wind erosion without waiting many lifetimes for a particular boulder to erode (an experiment with a very different kind of control group than the examples we discuss beforehand, like placebos in medical trials).
Noticing a shift, I recently reviewed all 279 answers I received to the question since I started asking it in 2019. Before ChatGPT, about one in three successfully figured it out, independently connecting pretty potent dots for scientific thinking rather than passively absorbing another fact. For the past two years, the success rate has climbed to over half. There’s no great mystery here: The terms ChatGPT uses when prompted with this question now appear frequently.
Students might consider this a case of simply looking up information—which a majority of them rate as the most helpful and acceptable use of LLMs in surveys—but typing a question and relaying the response is not the same as thinking about it.
A question like this is what we call “formative assessment.” I never graded the correctness of the answer, only the effort. The point was to find out if the core concept had really clicked or if that student still needed a little help making the connection. Failure is a useful part of learning when the stakes are low, as they are during the bulk of the class—encountering this question on the final exam would be an entirely different interaction.
What’s the point of building formative assessments into a course if they’re just handed off to an LLM? Suddenly, it’s a waste of time for both the student and the instructor. Small quizzes are excellent study tools to help students check their own understanding―if a student does them. Now, you can direct an “agentic” LLM browser to complete all the quizzes in an entire course with a single, frictionless prompt.
Should instructors preserve these sorts of assignments for students who want to benefit from them and accept the cheating, or should they eliminate the learning opportunity just to prevent cheating?
Many instructors are trying to adapt to this crisis by going back to the only evaluation tools that are pretty much LLM-proof—tests like oral exams or handwritten work created under supervision in the classroom.
None of these solutions are available to instructors of asynchronous online classes. That sucks, since the availability of those classes is important. They can serve students with physical disabilities, students in rural areas far from a campus, or students trying to obtain a degree while working full-time jobs or caring for dependents. If we have to simply give up on the idea of online classes, those are the casualties.
But even for in-person classes, adaptations to prevent LLM cheating are often concessions that reduce pedagogical quality. For example, labor-intensive oral exams didn’t become an endangered species just because of the swelling student-to-instructor ratio. Pen and paper (or keyboard and mouse) exams make it easier for each student’s experience to be the same and remove some of the potential for bias in scoring.
Writing assignments that may previously have been excellent teaching tools have obviously become the first things to end up on the chopping block. I used to have students in a natural disasters class write a plot for a big-budget Hollywood disaster movie, using both accurate and implausible physical processes. It was good practice for their writing skills; the students found it enjoyable, and it forced them to skillfully apply a lot of what they had learned.
But LLMs will spit out an essay shaped like this in 10 seconds flat. While these are easily recognizable (and low quality), the assignment became untenable. I could grade a real student’s paper in 15 to 30 minutes, but dealing with each instance of cheating easily ran four to eight frustrating and depressing hours of work. So I simply had to cut it from the course.
In times of old, instructors bemoaning issues with plagiarism were often (sometimes condescendingly) told that perhaps they should try creating good assignments instead of ones that can be easily completed with a bit of the old copy-and-paste. Instead of asking students to define terms or summarize a concept when Wikipedia is sitting right there, the advice was to give them higher-level tasks—to evaluate different solutions to a problem or reflect on how that concept has come up in their own lives. A higher-quality assignment would be more engaging and harder to cheat on.
But now, prompting ChatGPT to bluff a reflection is no harder than prompting it to define terms. Both are easier than plagiarizing Wikipedia ever was! And it’s all incredibly difficult to prosecute by our traditional standards of cheating because there is no incontrovertible test for LLM use. (That cuts both ways―it also means that innocent students who are wrongly accused often can’t prove they did the work honestly.)
I’m not alone in feeling exasperated by this predicament. A survey of about 3,000 college faculty showed that 85 percent felt LLMs “make students less likely to develop critical thinking abilities,” and 72 percent reported challenges managing LLM use.
Predictably, the response from higher education administrators―who are busy signing contracts for institutional LLM subscriptions to show how future-first their thought leadership is―has been to tell instructors that their job is to teach students “how to use AI effectively.”
Most examples of this “effective use” involve students generating an essay with AI and then critiquing it. (As if the Internet wasn’t bursting at the seams with human writing that one could critique!) Every time I’ve asked an instructor what their learning objective was for this assignment, the answer has been to help students see why they shouldn’t trust an LLM to write for them. Stop me when you notice the contradiction between that and the administrators’ wishes.
Even if you find creative and highly structured activities in which the guardrails lead to course-relevant learning occurring during the class period, the question remains: What impact are LLMs having on those students for the other 23 hours of the day?
The reason this feels so different to teachers than the tech panics of the past is that there is no clear solution to how AI is undermining nearly every aspect of education. It’s a strange game trying to get students to do things you think will help their education while they point LLMs at you, and it too often feels like the only winning move is not to play.
Instructors burned out with the current situation endure a barrage of repetitive bromides. It’s the future, better get used to it! Can you at least give me some good lottery numbers, oh great and powerful time traveler? Luddites once said we shouldn’t use calculators! You mean the way math instructors currently restrict the use of calculators (which, notably, do not hallucinate false answers) when teaching many skills? LLMs are personal tutors! Would you hire a tutor who plays “Two Truths and a Lie”?
It doesn’t seem like anyone wants to listen to instructors explain how bad it feels to try to do our job in the presence of this annihilative education antimatter. Instead, we’re offered AI grading tools to score AI-generated submissions for AI-generated assignments.
Perhaps critics like me just don’t understand the AI revolution (whatever that is), but we all have experience with human nature and the well-worn patterns of students. LLMs are a shortcut. Students often take shortcuts they later regret. We’ve all been there.
As an instructor, I want to build a clear path up the mountain for my students and see them reach the top. Instead, I increasingly feel like I’m just playing impossible defense to keep them from moving every direction but up. It’s exhausting, and I will mostly lose, which means I’m not even helping them. Students really do want to climb up there, but it’s always tempting to skip some mountains.
A few months ago, I overheard some college students talking about their classes. One was complaining about an assignment they needed to do that night, and another incredulously asked why they wouldn’t just have ChatGPT do it. The first replied, “This is my major, I actually need to learn stuff in this class. I use AI for my other classes.”
I haven’t encountered any students who think they’re learning when they let LLMs do their work, despite the face that college administrators and LLM advertising try to put on this. It’s just workload management to them.
Who knows what will happen if the AI bubble pops and the frictionless and ubiquitous access to LLMs withers into something much more limited. But while AI is here, it certainly isn’t revolutionizing education and enhancing learning. It’s just making it extraordinarily difficult to do all the things that have been helping students learn for a very long time.