"ChatGPT detector" catches AI-generated papers with unprecedented accuracy


A machine-learning tool distinguishes human writing from artificial-intelligence writing on the basis of stylistic characteristics.

A machine-learning tool can quickly spot chemistry papers written using the chatbot ChatGPT, according to a study published on November 6 in Cell Reports Physical Science1. The specialized classifier outperformed two existing artificial intelligence (AI) detectors, and could help academic publishers to identify papers produced by AI text generators.

Co-author Heather Desaire, a chemist at the University of Kansas in Lawrence, says that most of the text-analysis community "wants a really general detector that will work on anything." But by building a tool that focuses on a particular type of paper, "we were really going after accuracy."

The results suggest that tailoring software to specific types of writing could boost efforts to build AI detectors, Desaire says. "It's not that hard to build something for different domains if you can build something quickly and easily."

The elements of style

Desaire and her colleagues first described their ChatGPT detector in June, applying it to Perspective articles from the journal Science2. The detector uses machine learning to examine 20 features of writing style, including variation in sentence length and the frequency of certain words and punctuation marks, to decide whether a text was written by an academic scientist or by ChatGPT. The findings show that "you could use a small set of features to get a high level of accuracy," Desaire says.
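The paper details the exact features and model the team used; as a rough illustration of the general stylometric approach (not the authors' implementation), a detector of this kind might compute simple per-passage statistics and feed them to an off-the-shelf classifier. In the sketch below, the specific features, the placeholder texts and the choice of logistic regression are all assumptions made for illustration.

```python
# Illustrative sketch of a stylometric "human vs. ChatGPT" classifier.
# The features and model below are assumptions, not the paper's implementation.
import re

import numpy as np
from sklearn.linear_model import LogisticRegression


def stylometric_features(text: str) -> list:
    """Compute a handful of writing-style statistics for one passage of text."""
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    words = re.findall(r"[A-Za-z']+", text)
    sent_lengths = [len(re.findall(r"[A-Za-z']+", s)) for s in sentences]
    return [
        float(np.mean(sent_lengths)) if sent_lengths else 0.0,  # mean sentence length
        float(np.std(sent_lengths)) if sent_lengths else 0.0,   # variation in sentence length
        text.count(",") / max(len(sentences), 1),               # commas per sentence
        text.count(";") + text.count(":"),                      # semicolons and colons
        text.count("(") + text.count(")"),                      # parentheses
        sum(w.lower() in {"however", "although", "because"} for w in words),  # marker words
        sum(any(c.isdigit() for c in w) for w in words),        # tokens containing digits
    ]


# Placeholder examples; in practice these lists would hold labelled introductions.
human_texts = ["Oxidative degradation remains poorly understood; we revisit it here."]
chatgpt_texts = ["In recent years, significant attention has been devoted to this topic."]

X = np.array([stylometric_features(t) for t in human_texts + chatgpt_texts])
y = np.array([0] * len(human_texts) + [1] * len(chatgpt_texts))  # 0 = human, 1 = AI

clf = LogisticRegression(max_iter=1000).fit(X, y)
print(clf.predict([stylometric_features("A new introduction to classify.")]))
```

A small, hand-picked feature set like this is part of what makes such a detector quick to retrain for a new domain, at the cost of generality outside it.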

In the latest study, the team trained the detector on the introductory sections of papers from ten chemistry journals published by the American Chemical Society (ACS). The team chose the introduction because ChatGPT can write this part of a paper fairly easily when it has access to background literature, Desaire says. The researchers used 100 published introductions as examples of human-written text, then asked ChatGPT-3.5 to write 200 introductions in ACS journal style. To generate 100 of these, the chatbot was given the papers' titles; for the other 100, it was given their abstracts.
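To give a concrete sense of how such a corpus could be assembled, the hypothetical sketch below prompts ChatGPT-3.5 for an ACS-style introduction from a paper's title or abstract using the openai Python client (version 1.x). The prompt wording, the generate_intro helper and the placeholder titles and abstracts are assumptions, not the study's actual protocol.

```python
# Hypothetical sketch of assembling AI-generated training text.
# Prompt wording and helper names are assumptions, not the study's protocol.
from openai import OpenAI  # openai>=1.0 Python client

client = OpenAI()  # reads OPENAI_API_KEY from the environment


def generate_intro(seed_text: str, seed_kind: str) -> str:
    """Ask ChatGPT-3.5 to draft a journal-style introduction from a title or abstract."""
    prompt = (
        "Write the introduction section of a chemistry paper, in the style of an "
        f"American Chemical Society journal, based on this {seed_kind}:\n\n{seed_text}"
    )
    response = client.chat.completions.create(
        model="gpt-3.5-turbo",
        messages=[{"role": "user", "content": prompt}],
    )
    return response.choices[0].message.content


# Placeholders: in the study, titles and abstracts came from 100 published papers.
titles = ["A hypothetical chemistry paper title"]
abstracts = ["A hypothetical chemistry paper abstract."]

ai_intros = (
    [generate_intro(t, "title") for t in titles]
    + [generate_intro(a, "abstract") for a in abstracts]
)
# Together with the published, human-written introductions, these labelled
# passages would form the training data for a classifier like the one sketched above.
```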

When tested on human-written introductions and AI-generated introductions from the same journals, the tool identified ChatGPT-3.5-written sections based on titles with 100% accuracy. For introductions that ChatGPT generated from abstracts, accuracy was slightly lower, at 98%. The tool worked just as well on text produced by ChatGPT-4, the latest version of the chatbot. By contrast, the AI detector ZeroGPT identified AI-written introductions with an accuracy of only about 35–65%, depending on the version of ChatGPT used and whether the introduction had been generated from the paper's title or its abstract. A text-classifier tool produced by OpenAI, the maker of ChatGPT, performed even worse, spotting AI-written introductions with an accuracy of around 10–55%.

The new ChatGPT catcher also performed well on introductions from journals it wasn't trained on, and it caught AI-generated text produced from a variety of prompts, including one designed to trick AI detectors. However, the system is highly specialized for scientific journal articles: when tested on genuine articles from university newspapers, it failed to recognize them as having been written by humans.

Wider issues

What the authors are doing is "something fascinating," says Debora Weber-Wulff, a computer scientist who studies academic plagiarism at HTW Berlin University of Applied Sciences. Many existing tools try to determine authorship by searching for the predictive text patterns typical of AI-generated writing, rather than by examining characteristics of writing style, she says. "I had not considered utilizing stylometrics on ChatGPT."

However, Weber-Wulff notes that wider issues drive the use of ChatGPT in academia: many researchers feel pressure to publish papers quickly, she points out, or might not see the writing process as an important part of science. AI-detection tools won't solve these problems, and they shouldn't be seen as "a magic software solution to a social problem."

doi: https://doi.org/10.1038/d41586-023-03479-4