A massive, cross-disciplinary look at how often scientists turn to artificial intelligence (AI) to write their manuscripts has found steady increases since 2022, when OpenAI’s text-generating chatbot ChatGPT burst onto the scene. In some fields, the use of such generative AI has become almost routine, with up to 22% of computer science papers showing signs of input from the large language models (LLMs) that underlie such chatbots.

The study, which appears today in Nature Human Behaviour, analyzed more than 1 million scientific papers and preprints published between 2020 and 2024, primarily looking at abstracts and introductions for shifts in the frequency of telltale words that appear more often in AI-generated text. “It’s really impressive stuff,” says Alex Glynn, a research literacy and communications instructor at the University of Louisville. The discovery that LLM-modified content is more prevalent in areas such as computer science could help guide efforts to detect and regulate the use of these tools, adds Glynn, who was not involved in the work. “Maybe this is a conversation that needs to be primarily focused on particular disciplines.”
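
To make the idea of tracking "telltale words" concrete, here is a minimal Python sketch that counts how often a small set of marker words appears per 1,000 tokens in each publication year. It is an illustration only, not the study's actual statistical method: the word list `MARKER_WORDS`, the function `marker_rate_by_year`, and the two toy abstracts are all hypothetical, invented for this example.

```python
import re
from collections import defaultdict

# Hypothetical marker words, chosen only for illustration;
# this is not the vocabulary used in the study.
MARKER_WORDS = {"delve", "intricate", "pivotal", "showcasing"}


def marker_rate_by_year(docs):
    """Return {year: marker-word occurrences per 1,000 tokens}.

    `docs` is an iterable of (year, text) pairs.
    """
    hits = defaultdict(int)
    totals = defaultdict(int)
    for year, text in docs:
        # Crude tokenizer: lowercase runs of ASCII letters.
        tokens = re.findall(r"[a-z]+", text.lower())
        totals[year] += len(tokens)
        hits[year] += sum(1 for token in tokens if token in MARKER_WORDS)
    # Skip years with no tokens to avoid dividing by zero.
    return {y: 1000 * hits[y] / totals[y] for y in totals if totals[y]}


# Toy, made-up abstracts (not real data).
docs = [
    (2021, "We study graph algorithms and report results"),
    (2024, "We delve into intricate graph algorithms showcasing pivotal results"),
]
rates = marker_rate_by_year(docs)
for year in sorted(rates):
    print(year, round(rates[year], 1))
```

A rise in this rate from one year to the next would hint at a shift in vocabulary, but raw counts like these cannot by themselves show that any individual paper was written with an LLM.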

When ChatGPT was first released, many academic journals—hoping to avoid a flood of papers written in whole or part by computer programs—scrambled to create policies limiting the use of generative AI. Soon, however, researchers and online sleuths began to identify numerous scientific manuscripts and peer-review reports that showed blatant signs of being written with the help of LLMs, including anomalous phrases such as “regenerate response” or “my knowledge cutoff.” Some investigators, such as University of Toulouse computer scientist Guillaume Cabanac, started putting together lists of papers that contained these “smoking guns.” Since March 2024, Glynn has been compiling Academ-AI, a database that documents suspected instances of AI use in scientific papers.

More: https://www.science.org/content/article/one-fifth-computer-science-papers-may-include-ai-content