Delving into Linguistic Patterns of AI

(Image Credit: Cornell CIS - Cornell University)

(Image Credit: Philip Shapira)

(Image Credit: Kirill Kudryavtsev/APF/Getty Images)

August 23, 2024

Suri Le 

11th Grade

Fountain Valley High School


The impact of AI on academia is the subject of wide-ranging controversy. The most common debate involves text-generating tools built on large language models (LLMs), such as ChatGPT and GooseAI. Teachers at the primary and secondary school levels worry that their students are using generative AI to write their essays. But is this issue prevalent in higher education and research as well?

The short answer is yes. In recent years, the use of certain words in academic and scientific papers has increased significantly. AI Phrase Finder lists some of the most common examples. After ChatGPT was released in late 2022, use of the word “delve” spiked dramatically, to almost 10 to 100 times its level in previous years. The spike is visible in a graph by Dr. Jeremy Nguyen of Swinburne University of Technology in Melbourne, who collected data on the prevalence of the word on PubMed, one of the leading and most credible databases for biomedical topics.

By 2023, when ChatGPT had been available for almost a full year, the scale of the issue was easy to see.

(Image Credit: Jeremy Nguyen/X)
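For readers curious how a count like this can be made, here is a minimal sketch in Python. It is not Dr. Nguyen's actual method, and the sample records are made up for illustration; the names `DELVE`, `yearly_share`, and `SAMPLE` are hypothetical. It computes, for each year, the share of texts that contain “delve” or one of its forms.

```python
import re
from collections import Counter

# Matches "delve", "delves", "delved", and "delving" as whole words, ignoring case.
DELVE = re.compile(r"\bdelv(?:e|es|ed|ing)\b", re.IGNORECASE)


def yearly_share(records, pattern=DELVE):
    """Return {year: fraction of that year's texts that match `pattern`}."""
    totals = Counter()  # number of texts seen per year
    hits = Counter()    # number of texts per year that contain the word
    for year, text in records:
        totals[year] += 1
        if pattern.search(text):
            hits[year] += 1
    return {year: hits[year] / totals[year] for year in sorted(totals)}


# Made-up (year, abstract) records, for illustration only. Not real PubMed data.
SAMPLE = [
    (2021, "We examine protein folding in yeast."),
    (2021, "This study measures sleep quality in adults."),
    (2023, "We delve into the mechanisms of protein folding."),
    (2023, "This paper delves into sleep quality in adults."),
    (2023, "We measure blood pressure in older adults."),
    (2023, "A survey of imaging methods."),
]

if __name__ == "__main__":
    for year, share in yearly_share(SAMPLE).items():
        print(year, f"{share:.0%}")
```

Dividing by the total number of texts per year matters because more papers are published every year, so a raw count would rise even if writing habits stayed the same.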

Although this trend carries many risks, one of the most significant is how it will change language as a whole. The more people read articles containing these words, the more they will use the words in everyday life. Slowly, words will change their meaning, just as “plethora” is now commonly used to mean “a lot” when it originally meant an excess, or too much of something.

Though many highly educated individuals use AI, few question where its data comes from. The popular belief is that LLMs use “every piece of written English on the internet” to generate responses. That is only part of the picture. After being trained on large amounts of text, models are refined through reinforcement learning from human feedback (RLHF). Real people are tasked with interacting with a raw version of a model, asking it questions, holding conversations, and providing feedback.

Because this process is incredibly tedious, it becomes expensive, not just to power these models but also to pay the workers from whom the LLM will learn. Many companies have decided to outsource this task to workers in the Global South. This is where the difference in linguistic patterns arises, as “delve” and other AI-indicating words are much more common in the English used in Africa. Several workers in Kenya who did this kind of work for OpenAI have spoken up about it, saying they were also underpaid. However, since the workforce in these countries is plentiful, large AI companies have not addressed this concern.

Even without knowing the context of the graph shown above, its implications are frightening. How rigorous is the research we are conducting if so much of it relies on AI models? Are there security and privacy concerns if studies are fed directly into them and then used by the models as training data? Will this language diffusion affect how we talk in the States? The next few decades seem daunting if things don’t change or if there aren’t stricter regulations for AI, not just for ourselves but for the future.

Reference Sources

AI Phrase Finder. “The 10 Most Common ChatGPT Words.” AI Phrase Finder, 9 Mar. 2024, https://aiphrasefinder.com/common-chatgpt-words/.

Giammatteo, Giacomo. “Plethora—It’s Not What You May Think.” No Mistakes Publishing, 13 May 2020, https://nomistakespublishing.com/plethora-the-real-meaning/. Accessed 17 Aug. 2024.

Hern, Alex. “TechScape: How Cheap, Outsourced Labour in Africa Is Shaping AI English.” The Guardian, 16 Apr. 2024, www.theguardian.com/technology/2024/apr/16/techscape-ai-gadgest-humane-ai-pin-chatgpt.

Rowe, Niamh. “‘It’s Destroyed Me Completely’: Kenyan Moderators Decry Toll of Training of AI Models.” The Guardian, 2 Aug. 2023, www.theguardian.com/technology/2023/aug/02/ai-chatbot-training-human-toll-content-moderator-meta-openai.

Shapira, Philip. “Delving into ‘Delve.’” Philip Shapira, 31 Mar. 2024, https://pshapira.net/2024/03/31/delving-into-delve/.

Thompson, Nicholas. “Nicholas Thompson on LinkedIn: The Most Interesting Thing in Tech: Certain Words—‘Delve’ for Example—Seem… | 75 Comments.” Linkedin.com, 6 May 2024, www.linkedin.com/posts/nicholasxthompson_the-most-interesting-thing-in-tech-certain-activity-7193375085292855296-yeng/?utm_source=share&utm_medium=member_ios. Accessed 17 Aug. 2024.