In the age of ChatGPT, what’s it like to be accused of cheating?

While the public release of ChatGPT, the artificial intelligence-driven large language model chatbot, has created a great deal of excitement around the promise of the technology and expanded use of AI, it has also seeded a good bit of anxiety around what a program that can churn out a passable college-level essay in seconds means for the future of teaching and learning. Naturally, this consternation drove a proliferation of detection programs, of varying effectiveness, and a commensurate increase in accusations of cheating. But how are the students feeling about all of this? Recently published research by Drexel University’s Tim Gorichanaz, Ph.D., provides a first look into some of the reactions of college students who have been accused of using ChatGPT to cheat.

The study, published in the journal Learning: Research and Practice as part of a series on generative AI, analyzed 49 Reddit posts and their related discussions from college students who had been accused of using ChatGPT on an assignment. Gorichanaz, who is an assistant teaching professor in Drexel’s College of Computing & Informatics, identified a number of themes in these conversations, most notably frustration from wrongly accused students, anxiety about the possibility of being wrongly accused and how to avoid it, and creeping doubt and cynicism about the need for higher education in the age of generative artificial intelligence.

“As the world of higher ed collectively scrambles to understand and develop best practices and policies around the use of tools like ChatGPT, it’s vital for us to understand how the fascination, anxiety and fear that comes with adopting any new educational technology also affects the students who are going through their own process of figuring out how to use it,” Gorichanaz said.

Of the 49 students who posted, 38 said they did not use ChatGPT, but detection programs like Turnitin or GPTZero had nonetheless flagged their assignments as AI-generated. As a result, many of the discussions took on the tenor of a legal argument. Students asked how they could present evidence to prove that they hadn’t cheated, and some commenters advised them to keep denying that they had used the program because the detectors are unreliable.

“Many of the students expressed concern over the possibility of being wrongly accused by an AI detector,” Gorichanaz said. “Some discussions went into great detail about how students could collect evidence to prove that they had written an essay without AI, including tracking draft versions and using screen recording software. Others suggested running a detector on their own writing until it came back without being incorrectly flagged.”

Another theme that emerged in the discussions was the perceived role of colleges and universities as “gatekeepers” to success and, as a result, the high stakes associated with being wrongly accused of cheating. This led to questions about the institutions’ preparedness for the new technology and concerns that professors would be too dependent on AI detectors — whose accuracy remains in doubt.

“The conversations happening online evolved from specific doubts about the accuracy of AI detection and universities’ policies around the use of generative AI, to broadly questioning the role of higher education in society and suggesting that the technology will render institutions of higher education irrelevant in the near future,” Gorichanaz said.

The study also highlighted an erosion of trust among students — and between students and their professors — stemming from the students’ perception that they are persistently under suspicion of cheating. A range of comments illustrated the degradation of these relationships:

“I never would have expected to get accused by him, out of all my professors.”
“Of course she trusts that AI detector more than she trusts us.”
“I know I sure as hell didn’t plagiarize, but unfortunately you can’t always trust others.”

Generative AI technology has forced institutions of higher education to reconsider their educational assessment practices and policies about cheating. According to the study, students are asking many of the same questions.

“There were comments about policy inconsistencies where students were punished for using some AI tools such as ChatGPT but encouraged to use other AI tools like Grammarly. Other students suggested that using generative AI to write a paper should not be considered plagiarism because it is original work,” Gorichanaz said. “Many students reached the same conclusion that universities have been grappling with: the need to responsibly integrate the technology and move beyond essays for learning assessment.”

The study could play an important role in helping colleges and universities communicate to their students about the use of generative AI technology, Gorichanaz suggests.

“While this is a relatively small sample, these findings are still useful for understanding what students are going through right now,” he said. “Being wrongly accused, or constantly under suspicion, of using AI to cheat can be a harrowing experience for students. It can damage the trust that’s so important to a quality educational experience. So, institutions must develop consistent policies, clearly communicate them to students and understand the limitations of detection technology.”

Gorichanaz noted that even the best AI detectors could produce enough false positives for professors to wrongly accuse dozens of students — which is clearly unacceptable, considering the stakes.

“Rather than attempting to use AI detectors to evaluate whether these assessments are genuine, instructors may be better off designing different kinds of assessments: those that emphasize process over product or more frequent, lower-stakes assessments,” he wrote, in addition to suggesting that instructors could add modules on appropriate use of generative AI technology, rather than completely prohibiting its use.

While the study offers a thematic analysis, Gorichanaz suggests that future research could expand the sample to a statistically relevant size and draw it from sources beyond English-language conversations on Reddit.
