ChatGPT being put to the test at college level
Most assignments receive good grades
MILWAUKEE – In the era of artificial intelligence, cheating is only getting easier for students.
Some instructors say they can easily tell when students turn in AI-generated work. Others find it far trickier and will turn to online AI detectors for confirmation when their suspicions are raised. Educators everywhere are trying to create AI-proof assignments.
The Milwaukee Journal Sentinel tested how well AI can complete college-level work – and whether instructors can detect it.
A Harvard student last year asked seven professors and teaching assistants to grade essays written in response to a class assignment. To minimize response bias, the student told instructors the essays might have been written by herself or by AI, but in reality, all of the work was done by GPT-4, a version of the chatbot from OpenAI.
The AI-generated assignments received mostly A’s and B’s, along with one Pass.
“Not only can GPT-4 pass a typical social science and humanities-focused freshman year at Harvard, but it can get pretty good grades,” the student wrote in an essay published by the Chronicle of Higher Education.
I followed the same methodology as the Harvard student.
Professors emailed me a smaller assignment they would give their students, not an end-of-the-semester research paper. I told them some of the work would be done honestly and other assignments handled by ChatGPT. In fact, AI did all of the work.
I formulated prompts for ChatGPT from the assignments provided. In most cases, I followed up with more tailored prompts based on what the chatbot produced on the first try. Often, the additional requests asked the chatbot to provide more specific examples, expand on its ideas or use a less formal tone.
The experiment was far from scientific. Several professors said they approached grading more skeptically than they would have had it been a student’s submission, given the circumstances.
English
● Course: Critical Writing in the Field of English, University of Wisconsin-Whitewater
● Assignment: Write a three- to five-page paper examining how a poem among a selection provided draws on a specific concept discussed in class. Include analysis of specific passages in the poem and explore the use of at least five literary terms.
● Was this hard for ChatGPT: At first, the chatbot analyzed a completely different poem than the title provided. I submitted the full lines of the correct poem, prompting the chatbot to apologize for the “oversight.” Additional prompts providing specific literary terms for the chatbot to incorporate into the essay helped refine the work.
● Grade: B+
● Comments: The instructor said the paper “fulfills the assignment admirably, and brings an admirable depth of understanding” of the poet’s use of the concept. The thesis statement could have been more specific, resulting in a slight deduction.
Political science
● Course: Introduction to American Politics, Marquette University
● Assignment: Write a short paper describing the three faces of power and explaining how each constrains you in your own life.
● Was this hard for ChatGPT: No. The chatbot easily put together an essay. A second prompt asking to connect the faces of power concept to my life as a reporter provided more specificity.
● Grade: Incomplete
● Comments: “Without question, the submission deserves an A,” the instructor said. But ChatGPT made one small mistake, which immediately sparked skepticism. While the essay correctly cited the creator of the theory, the reading associated with the assignment was from a different person.
The instructor ran it through two AI detectors, both of which suggested the work was AI-generated. He said he would confront a student who submitted this work.
Library, information studies
● Course: Information Divides and Differences in a Multicultural Society, University of Wisconsin-Madison
● Assignment: Daily log of media consumption with analysis of tone, evidence, expertise of each source, roughly 350 words
● Was this hard for ChatGPT: No. I submitted a second prompt asking for a less formal tone. While the chatbot cited legitimate news outlets, such as the Wisconsin State Journal and The New York Times, in the log, the summaries described general topics, not actual news stories.
● Grade: 5/5
● Comments: The instructor said there were no “egregious red flags,” but one sentence stood out as sounding like ChatGPT. In general, he tends to give students the benefit of the doubt and wouldn’t have suspected this log was AI-generated had it been turned in among a stack of others.