@edwinhu
Created November 18, 2025 20:21
Wikipedia's Guide to Identifying AI-Generated Writing - Patterns and anti-patterns for authentic writing
title: Wikipedia:Signs of AI writing - Wikipedia
created: 2025-10-16
tags: clippings
This is an advice page from WikiProject AI Cleanup. This page is not an encyclopedic article, nor one of Wikipedia's policies or guidelines, as it has not been thoroughly vetted by the community.

A screenshot of ChatGPT reading: "[header] Legacy & Interpretation [body] The "Black Hole Edition" is not just a meme — it's a celebration of grassroots car culture, where ideas are limitless and fun is more important than spec sheets. Whether powered by a rotary engine, a V8 swap, or an imagined fighter jet turbine, the Miata remains the canvas for car enthusiasts worldwide."

LLMs tend to have an identifiable writing style.

This is a list of writing and formatting conventions typical of AI chatbots such as ChatGPT, with real examples taken from Wikipedia articles and drafts. It is meant to act as a field guide to help detect undisclosed AI-generated content on Wikipedia. This list is descriptive, not prescriptive; it consists of observations, not rules. Advice about formatting or language to avoid in Wikipedia articles can be found in the policies and guidelines and the Manual of Style, but does not belong on this page.

This list is not a ban on certain words, phrases, or punctuation. No one is taking your em-dashes away or claiming that only AI uses them. Not all text featuring the following indicators is AI-generated, as the large language models that power AI chatbots are trained on human writing, including the writing of Wikipedia editors. This is simply a catalog of very common patterns observed over many thousands of instances of AI-generated text, specific to Wikipedia. While some of its advice may be broadly applicable, some signs—particularly those involving punctuation and formatting—may not apply in a non-Wikipedia context.

The patterns here are also only potential signs of a problem, not the problem itself. While many of these issues are immediately obvious and easy to fix—e.g., excessive boldface, poor wordsmithing, broken markup, citation style quirks—they can point to less outwardly visible problems that carry much more serious policy risks. If LLM-generated text is polished enough (initially or subsequently tidied up), those surface defects might not be present, but the deeper problems likely will. Please do not merely treat these signs as the problems to be fixed; that could just make detection harder. The actual problems are those deeper concerns, so make sure to address them, either yourself or by flagging them, per the advice at Wikipedia:Large language models § Handling suspected LLM-generated content and Wikipedia:WikiProject AI Cleanup/Guide.

The speedy deletion policy criterion G15 (LLM-generated pages without human review) is limited to the most objective and least contestable indications that the page's content was generated by an LLM. There are three such indicators, the first of which can be found in § Communication intended for the user and the other two in § Citations. The other signs, though they may indeed indicate AI use, are not sufficient for speedy deletion.

Do not rely solely on artificial intelligence content detection tools (such as GPTZero) to evaluate whether text is LLM-generated. While they perform better than random chance, these tools have nontrivial error rates and cannot replace human judgment.1

LLMs (and artificial neural networks in general) use statistical algorithms to guess (infer) what should come next, based on a large corpus of training material. Their output thus tends to regress to the mean; that is, the result tends toward the most statistically likely text that applies to the widest variety of cases. This can simultaneously be a strength and a "tell" for detecting AI-generated content.

For example, LLMs are usually trained on data from the internet, in which famous people are generally described with positive, important-sounding language. They will thus sand down specific, unusual, nuanced facts (which are statistically rare) and replace them with more generic, positive descriptions (which are statistically common). The specific detail "invented a train-coupling device" might become "a revolutionary titan of industry." LLMs tend to smooth out unusual details and drift toward the most common, statistically probable way of describing a topic. It is like shouting louder and louder that a portrait shows a uniquely important person while the portrait itself fades from a sharp photograph into a blurry, generic sketch: the subject becomes simultaneously less specific and more exaggerated.2

This statistical regression to the mean, a smoothing over of specific facts into generic statements that could apply to many topics, makes AI-generated content easier to detect.

Words to watch: stands/serves as / is a testament/reminder, plays a vital/significant/crucial role, underscores/highlights its importance/significance, reflects broader, symbolizing its ongoing, enduring/lasting impact, key turning point, indelible mark, deeply rooted, profound heritage, steadfast dedication...

LLM writing often puffs up the importance of the subject matter by adding statements about how arbitrary aspects of the topic represent or contribute to a broader topic.3 There is a distinct and easily identifiable repertoire of ways that it writes these statements.4 LLMs may include these for even the most mundane of things, sometimes with hedging comments like "While [minor/not well known/etc], it [symbolizes/stands as/contributes]..."

When talking about biology (e.g. when asked to discuss a given animal or plant species), LLMs tend to put too much emphasis on the species' conservation status and the efforts to protect it, even if the status is unknown and no serious efforts exist, and may strain to derive symbolism from things like taxonomy.

Examples

Douera enjoys close proximity to the capital city, Algiers, further ==enhancing its significance== as a dynamic hub of activity and culture. With its coastal charm and convenient location, Douera ==captivates both residents and visitors alike== [...]

— From this revision to Douéra

Berry Hill today ==stands as a symbol== of community resilience, ecological renewal, and historical continuity. Its transformation from a coal-mining hub to a thriving green space ==reflects the evolving identity== of Stoke-on-Trent.

— From Draft:Berry Hill, Stoke-on-Trent

By preying on these pests, Zagloba species ==play a significant role== in natural pest control, ==contributing to ecological balance== and agricultural health.

— From this revision to Zagloba (beetle)

These citations, spanning more than six decades and appearing in recognized academic publications, ==illustrate Blois' lasting influence in computational linguistics, grammar, and neology.==

— From this revision to Draft:Jacques Blois (linguist)
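The "words to watch" above can be operationalized as a simple phrase scanner. This is a minimal sketch: the phrase list is illustrative and non-exhaustive, the regexes are assumptions about typical surface forms, and hits are a reason to look closer, never proof of AI authorship.

```python
import re

# Illustrative, non-exhaustive sample of "inflated importance" phrases
# drawn from the words-to-watch list above.
PUFFERY = [
    r"stands as a testament",
    r"serves as a reminder",
    r"plays a (?:vital|significant|crucial) role",
    r"underscores its (?:importance|significance)",
    r"highlights its (?:importance|significance)",
    r"reflects broader",
    r"enduring impact",
    r"lasting impact",
    r"key turning point",
    r"indelible mark",
    r"deeply rooted",
    r"steadfast dedication",
]
PATTERN = re.compile("|".join(PUFFERY), re.IGNORECASE)

def puffery_hits(text: str) -> list[str]:
    """Return each matched phrase; many hits suggest closer review."""
    return PATTERN.findall(text)

hits = puffery_hits(
    "The festival stands as a testament to the region's deeply rooted "
    "traditions and plays a vital role in local life."
)
```

A density measure (hits per thousand words) is more useful than raw counts, since any single phrase also occurs in ordinary human writing.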

Superficial analyses

Words to watch: ensuring..., highlighting..., emphasizing..., reflecting..., underscoring..., showcasing..., aligns with..., contributing to...

AI chatbots tend to insert superficial analysis of information, often in relation to its significance, recognition, or impact. This is often done by attaching a present participle ("-ing") phrase at the end of sentences, sometimes with vague attributions to third parties (see below).3

While many of these words are strong AI tells on their own,4 an even stronger tell is when the subjects of these verbs are facts, events, or other abstract concepts. A person, for example, can highlight or emphasize something, but a fact or event cannot. The "highlighting" or "underscoring" is not something that is actually happening; it is a claim by a disembodied narrator about what something means.3

Such comments are usually synthesis and/or unattributed opinions in wikivoice. Newer chatbots with retrieval-augmented generation may instead attach this language to attributed statements, e.g., "Critic Roger Ebert praised the film, underscoring the story's impact...", but since it is still AI-generated text it may be an inaccurate or subjective representation of what the source actually said.

Examples

In 2025, the Federation was internationally recognized and invited to participate in the Asia Pickleball Summit, ==highlighting Pakistan’s entry into the global pickleball community.==

— From this revision to Draft:Pakistan Pickleball Federation

The civil rights movement emerged as a powerful continuation of this struggle, ==emphasizing the importance of solidarity and collective action in the fight for justice==.

— From this revision to African-American culture

These partnerships ==reflect the company’s role== in serving both corporate and community organizations in Uganda.

— From Draft:GEOWISE MEDIA
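The sentence-final "-ing" pattern described above is mechanical enough to search for. The sketch below is a rough heuristic under stated assumptions: the participle list is a sample, and it will also match legitimate participial clauses, so matches only flag sentences for human review.

```python
import re

# Heuristic: a comma followed by one of these participles, with the
# clause running to the end of the sentence. The verb list is a sample.
TAIL = re.compile(
    r",\s*(?:highlighting|underscoring|emphasizing|reflecting|"
    r"showcasing|demonstrating|cementing)\b[^.]*\.",
    re.IGNORECASE,
)

def participial_tails(text: str) -> list[str]:
    """Return candidate 'superficial analysis' clause endings."""
    return TAIL.findall(text)

sample = ("The team won the award in 2024, highlighting its growing "
          "reputation in the field. The coach thanked the fans.")
tails = participial_tails(sample)
```

As the section notes, the stronger tell is an abstract subject (a fact or event "highlighting" something), which a regex cannot judge; that part stays with the human reviewer.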

Promotional language

Words to watch: rich/vibrant tapestry, artistic/cultural/literary/media/etc. landscape, boasts a, continues to captivate, groundbreaking, intricate, stunning natural beauty, enduring/lasting legacy, nestled, in the heart of...

LLMs have serious problems keeping a neutral tone, especially when writing about something that could be considered "cultural heritage"—in which case they will constantly remind the reader that it is cultural heritage.

Examples

Nestled within the ==breathtaking== region of Gonder in Ethiopia, Alamata Raya Kobo ==stands as a vibrant town== with a ==rich cultural heritage and a significant place== within the Amhara region. From its ==scenic landscapes== to its ==historical landmarks==, Alamata Raya Kobo offers visitors a ==fascinating glimpse== into the ==diverse tapestry== of Ethiopia. In this article, we will explore the ==unique characteristics== that make Alamata Raya Kobo ==a town worth visiting== and shed light on ==its significance== within the Amhara region.

— From this revision to Alamata (woreda)

TTDC ==acts as the gateway== to Tamil Nadu’s ==diverse attractions==, seamlessly connecting the beginning and end of ==every traveller's journey==. It offers ==dependable, value-driven experiences== that showcase the state’s ==rich history, spiritual heritage, and natural beauty==.

— From this revision to Tamil Nadu Tourism Development Corporation

Words to watch: it's important to note/remember/consider, may vary...

LLMs often tell the reader about things "it's important to remember." This frequently happens in the context of "disclaimers" to an imagined reader, often regarding safety or topics that vary in different locales/jurisdictions. It seems to be more common in text by older (pre-2025) chatbots.

Examples

The emergence of these informal groups reflects a growing recognition of the interconnected nature of urban issues and the potential for ANCs to play a role in shaping citywide policies. ==However, it's important to note== that these caucuses operate outside the formal ANC structure and their influence on policy decisions ==may vary==.

— From this revision to Advisory Neighborhood Commission

Although the National Medical Commission had deemed conversion therapy as 'professional misconduct' in response to a directive from the Madras High Court in the case of S Sushma v. Commissioner of Police, ==it's important to note== that AYUSH practitioners, who practice alternative medicine systems like Ayurveda, Yoga, Unani, Siddha, and Homeopathy, remain unregulated by the National Medical Commission.

— From this revision to Adhila Nasarin v. State Commissioner of Police

==It's important to remember== that what's free in one country might not be free in another, so always check before you use something.

— From Wikimedia's LLM-generated Simple Summary of Public domain

Section summaries

Words to watch: In summary, In conclusion, Overall...

When generating longer outputs (such as when told to "write an article"), LLMs often add a section titled "Conclusion" or similar, and will often end a paragraph or section by summarizing and restating its core idea.5

Examples

==In summary==, the educational and training trajectory for nurse scientists typically involves a progression from a master's degree in nursing to a Doctor of Philosophy in Nursing, followed by postdoctoral training in nursing research. This structured pathway ensures that nurse scientists acquire the necessary knowledge and skills to engage in rigorous research and contribute meaningfully to the advancement of nursing science.

— From this revision in Nurse scientist

Words to watch: Despite its... faces several challenges..., Despite these challenges, Challenges and Legacy, Future Outlook...

Many LLM-generated Wikipedia articles include a "Challenges" section, which typically begins with a sentence like "Despite its [positive/promotional words], [article subject] faces challenges..." and ends with either a vaguely positive assessment of the article subject,1 or speculation about how ongoing or potential initiatives could benefit the subject. Such paragraphs usually appear at the end of articles with a rigid outline structure, which may also include a separate section for "Future Prospects."

Note: This sign is about the rigid formula, not simply the mention of challenges.

Examples

==Despite its industrial and residential prosperity, Korattur faces challenges== typical of urban areas, including[...] With its ==strategic location and ongoing initiatives==, Korattur ==continues to thrive== as an integral part of the Ambattur industrial zone, embodying the synergy between industry and residential living.

— From this revision to Korattur

==Despite its success, the Panama Canal faces challenges==, including[...] ==Future investments in technology, such as automated navigation systems, and potential further expansions could enhance the canal’s efficiency== and maintain its relevance in global trade.

— From this revision to Panama Canal

==Despite their promising applications, pyroelectric materials face several challenges== that must be addressed for broader adoption. One key limitation is[...] ==Despite these challenges==, the versatility of pyroelectric materials ==positions them as critical components== for sustainable energy solutions and next-generation sensor technologies.

— From this revision to Pyroelectricity

The future of hydrocarbon economies ==faces several challenges,== including[...] This section would speculate on ==potential developments== and the changing landscape of global energy.

— From this revision to Hydrocarbon economy

Operating in the current Afghan media environment ==presents numerous challenges,== including[...] ==Despite these challenges,== Amu TV has managed to ==continue to provide a vital service== to the Afghan population.

— From this revision to Amu Television

For example, while the methodology supports transdisciplinary collaboration in principle, applying it effectively in large, heterogeneous teams ==can be challenging.== [...] SCE continues to evolve ==in response to these challenges.==

— From this revision to Draft:Socio-cognitive engineering

Negative parallelisms

Parallel constructions involving "not", "but", or "however", such as "Not only... but..." or "It is not just about..., it's...", are common in LLM writing but are often unsuitable for writing in a neutral tone.1

Examples

Self-Portrait by Yayoi Kusama, executed in 2010 and currently preserved in the famous Uffizi Gallery in Florence, constitutes ==not only== a work of self-representation, ==but== a visual document of her obsessions, visual strategies and psychobiographical narratives.

— From this revision to Self-portrait (Yayoi Kusama)

It’s ==not just about== the beat riding under the vocals; ==it’s== part of the aggression and atmosphere.

— From this revision to Draft:Critikal! The Rapper

Here is an example of a negative parallelism across multiple sentences:

He hailed from the esteemed Duse family, renowned for their theatrical legacy. Eugenio's life, however, took a path that intertwined both personal ambition and familial complexities.

— From this revision to Eugenio Duse

On rare occasions, user messages that appear AI-generated may also include phrases that read along the lines of "no..., no..., just...".

Examples

There are ==no== long-form profiles. ==No== editorial insights. ==No== coverage of her game dev career. ==No== notable accolades. ==Just== TikTok recaps and callouts.


This page should be gone, fully, cleanly, and without delay. ==No== redirect. ==No== merge. ==Just== delete.

— From Wikipedia:Articles for deletion/Lilly Contino

I'm hitting the reset button — ==no== hard feelings, ==no== drama — ==just== clean, policy-based engagement from here on out.

— From this user talk page message

Rule of three

LLMs overuse the 'rule of three'—"the good, the bad, and the ugly". This can take different forms, from "adjective, adjective, adjective" to "short phrase, short phrase, and short phrase".1 LLMs often use this structure to make superficial analyses appear more comprehensive.

Examples

The Amaze Conference brings together ==global SEO professionals, marketing experts, and growth hackers== to discuss the latest trends in digital marketing. The event features ==keynote sessions, panel discussions, and networking opportunities==.

— From Draft:Amaze Conference
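Triads of short phrases can be counted with a crude pattern match. This sketch assumes each list item is one to four words, which is an arbitrary illustrative bound; the rule of three is also perfectly good human rhetoric, so a high count is a prompt to read the text, not a verdict.

```python
import re

# Heuristic match for "A, B, and C" triads of short phrases.
# The 1-4 word run length is an assumption, not a rule.
TRIAD = re.compile(
    r"\b(?:\w+[ -]?){1,4}, (?:\w+[ -]?){1,4}, and (?:\w+[ -]?){1,4}\b"
)

def triad_count(text: str) -> int:
    """Count non-overlapping 'X, Y, and Z' constructions."""
    return len(TRIAD.findall(text))

sample = ("The event features keynote sessions, panel discussions, and "
          "networking opportunities. It was held in Boston.")
count = triad_count(sample)
```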

Words to watch: Industry reports, Observers have cited, Some critics argue...

AI chatbots tend to attribute opinions or claims to some vague authority—a practice called weasel wording—while citing only one or two sources that may or may not actually express such a view. They also tend to overgeneralize the perspective of one or a few sources into that of a wider group.

Examples

Here, the weasel wording implies the opinion comes from an independent source, but it actually cites Nick Ford's own website.

His [Nick Ford's] compositions ==have been described== as exploring conceptual themes and bridging the gaps between artistic media.6

—  Draft:Nick Ford (musician)

Due to its unique characteristics, the Haolai River is of interest to ==researchers and conservationists==. Efforts are ongoing to monitor its ecological health and preserve the surrounding grassland environment, which is part of a larger initiative to protect China’s semi-arid ecosystems from degradation.

— From this revision to Haolai River

Elegant variation

Generative AI uses a repetition penalty, a decoding setting meant to discourage it from reusing words too often.3 For instance, the output might give a main character's name and then repeatedly use different synonyms or related terms (e.g., protagonist, key player, eponymous character) when mentioning it again.

Note: If a user adds multiple pieces of AI-generated content in separate edits, this tell may not apply, as each piece of text may have been generated in isolation.

Examples

Vierny, after a visit in Moscow in the early 1970’s, committed to supporting artists resisting ==the constraints of socialist realism== and discovered Yankilevskly, among others such as Ilya Kabakov and Erik Bulatov. In ==the challenging climate of Soviet artistic constraints==, Yankilevsky, alongside other ==non-conformist artists==, faced obstacles in expressing ==their creativity== freely. Dina Vierny, recognizing ==the immense talent== and the struggle ==these artists== endured, played a pivotal role in aiding ==their artistic aspirations==. [...]

In this new chapter of his life, Yankilevsky found himself amidst a community of ==like-minded artists== who, despite diverse styles, shared a common goal—to break free from ==the confines of state-imposed artistic norms==, particularly socialist realism. [...]

The move to Paris facilitated an environment where Yankilevsky could further explore and exhibit ==his distinctive artistic vision== without ==the constraints imposed by the Soviet regime==. Dina Vierny's unwavering support and commitment to the ==Russian avant-garde artists== played a crucial role in fostering a space where ==their creativity== could flourish, contributing to the rich tapestry of artistic expression in the vibrant cultural landscape of Paris. Vierny's commitment culminated in the groundbreaking exhibition "Russian Avant-Garde - Moscow 1973" at her Saint-Germain-des-Prés gallery, showcasing the ==diverse yet united front of non-conformist artists== challenging ==the artistic norms== of their time.

— From this revision to Vladimir Yankilevsky

False ranges

When from... to... constructions are not used figuratively, they indicate the lower and upper bounds of a scale. The scale is either quantitative, involving an explicit or implicit numerical range (e.g. from 1990 to 2000, from 15 to 20 ounces, from winter to autumn), or qualitative, involving categorical bounds (e.g. "from seed to tree", "from mild to severe", "from white belt to black belt"). The same constructions may be used to form a merism—a figure of speech that combines two extremes, as contrasting parts of a whole, to refer to the whole. This is a figurative meaning, but it has the same structure as the non-figurative usage, because it still requires an identifiable scale: from head to toe (the length of a body denoting the whole body), from soup to nuts (clearly based on time), etc. This is not a false range.

LLMs, by contrast, often employ "figurative" (often simply: meaningless) "from... to..." constructions that purport to signify a scale, while the endpoints are loosely related or essentially unrelated things and no meaningful scale can be inferred. They are especially fond of this when giving examples of items within a set (instead of simply mentioning them one after another). A useful test is whether some middle ground can be identified without changing the endpoints: if the middle requires switching from one scale to another, or there is no scale to begin with nor a coherent whole that could be conceived, the construction is a false range. LLMs do this because such meaningless language is common in persuasive writing, which they absorb in large quantities during training.

Examples

Our journey through the universe has taken us ==from== the singularity of the Big Bang ==to== the grand cosmic web, ==from== the birth and death of stars that forge the elements of life, ==to== the enigmatic dance of dark matter and dark energy that shape its destiny.

[...]

Intelligence and Creativity: ==From== problem-solving and tool-making ==to== scientific discovery, artistic expression, and technological innovation, human intelligence is characterized by its adaptability and capacity for novel solutions.

[...]

Continued Scientific Discovery: The quest to understand the universe, life, and ourselves will continue to drive scientific breakthroughs, ==from== fundamental physics ==to== medicine and neuroscience.

— From Draft:The Cosmos Unveiled: A Grand Tapestry of Existence

Title case

In section headings, AI chatbots strongly tend to capitalize all main words (title case).1 Wikipedia's house style, by contrast, uses sentence case, which makes consistently title-cased headings stand out.

Examples

Thomas was born in Cochranville, Pennsylvania. [...]

Thomas’s behavioral profiling has been used to evaluate Kentucky Derby [...]

Global Consulting

Thomas’s behavioral profiling has been used to evaluate Kentucky Derby and Breeders’ Cup contenders. [...]

In July 2025, Thomas was invited as a featured presenter to the Second Horse Economic Forum [...]

Educational Programs

Thomas is the founder of the Institute for Advanced Equine Studies [...]

— From Draft:Kerry M. Thomas
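A heading's casing can be checked mechanically. The sketch below is a rough heuristic: the small-word list is an illustrative assumption (real title-case conventions vary), and it ignores acronyms and proper nouns, which legitimately stay capitalized in sentence case.

```python
# Rough title-case check for section headings. The small-word set is
# an illustrative assumption, not a complete style rule.
SMALL = {"a", "an", "and", "as", "at", "but", "by", "for", "in",
         "of", "on", "or", "the", "to", "with"}

def is_title_case(heading: str) -> bool:
    """True if the first word and every non-small word are capitalized."""
    words = heading.split()
    if not words:
        return False
    main = [w for w in words[1:] if w.lower() not in SMALL]
    return words[0][:1].isupper() and all(w[:1].isupper() for w in main)
```

Running this over every heading in a page and flagging pages where most headings come back title-cased is one cheap signal to combine with the others on this page.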

Excessive use of boldface

AI chatbots may display various phrases in boldface for emphasis in an excessive, mechanical manner. One of their tendencies, inherited from readmes, fan wikis, how-tos, sales pitches, slide decks, listicles, and other materials that heavily use boldface, is to emphasize every instance of a chosen word or phrase, often in a "key takeaways" fashion. Some newer large language models or apps have instructions to avoid overuse of boldface.

Examples

It blends OKRs (Objectives and Key Results), KPIs (Key Performance Indicators), and visual strategy tools such as the Business Model Canvas (BMC) and Balanced Scorecard (BSC). OPC is designed to bridge the gap between strategy and execution by fostering a unified mindset and shared direction within organizations.

— From Draft:One Page 4 Change (OPC)

Lists

AI chatbots' content often includes vertical lists with a particular formatting: an ordered or unordered list in which the list marker (number, bullet, dash, etc.) is followed by an inline boldfaced header of sorts, with a colon separating it from the remaining descriptive text. (Separately from this sign, this is one of the things that cause overabundant boldfacing; see § Excessive use of boldface.)

Instead of proper wikitext, a bullet point in an unordered list may appear as a bullet character (•), hyphen (-), en dash (–), or similar character. Ordered (i.e. numbered) lists may use explicit numbers (such as 1.) instead of standard wikitext. When the output is copied as the bare text appearing on screen, some of this formatting information is lost, and line breaks may be lost as well.

Examples

1. Historical Context Post-WWII Era: The world was rapidly changing after WWII, [...] 2. Nuclear Arms Race: Following the U.S. atomic bombings, the Soviet Union detonated its first bomb in 1949, [...] 3. Key Figures Edward Teller: A Hungarian physicist who advocated for the development of more powerful nuclear weapons, [...] 4. Technical Details of Sundial Hydrogen Bomb: The design of Sundial involved a hydrogen bomb [...] 5. Destructive Potential: If detonated, Sundial would create a fireball up to 50 kilometers in diameter, [...] 6. Consequences and Reactions Global Impact: The explosion would lead to an apocalyptic nuclear winter, [...] 7. Political Reactions: The U.S. military and scientists expressed horror at the implications of such a weapon, [...] 8. Modern Implications Current Nuclear Arsenal: Today, there are approximately 12,000 nuclear weapons worldwide, [...] 9. Key Takeaways Understanding the Madness: The concept of Project Sundial highlights the extremes of human ingenuity [...] 10. Questions to Consider What were the motivations behind the development of Project Sundial? [...]

— From this revision to Sundial (weapon)
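Cleanup of such pasted markers can be sketched as a small normalizer. This is a minimal illustration, not a complete converter: the marker set is taken from the observation above, and it cannot restore line breaks that were already lost in copying.

```python
import re

def to_wikitext_lists(text: str) -> str:
    """Convert plain-text list markers (•, -, –, '1.') to wikitext."""
    lines = []
    for line in text.splitlines():
        line = re.sub(r"^\s*[•\-–]\s+", "* ", line)   # unordered markers
        line = re.sub(r"^\s*\d+[.)]\s+", "# ", line)  # ordered markers
        lines.append(line)
    return "\n".join(lines)

converted = to_wikitext_lists("• First item\n2. Second item")
```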

Emojis

Sometimes, AI chatbots decorate section headings or bullet points by placing emojis in front of them.

Examples

Let’s decode exactly what’s happening here:
🧠 Cognitive Dissonance Pattern:
You’ve proven authorship, demonstrated originality, and introduced new frameworks, yet they’re defending a system that explicitly disallows recognition of originators unless a third party writes about them first.
[...]
🧱 Structural Gatekeeping:
Wikipedia policy favors:
[...]
🚨 Underlying Motivation:
Why would a human fight you on this?
[...]
🧭 What You’re Actually Dealing With:
This is not a debate about rules.
[...]

— From this revision to Wikipedia:Village pump (policy)

🪷 Traditional Sanskrit Name: Trikoṇamiti
Tri = Three
Koṇa = Angle
Miti = Measurement 🧭 “Measurement of three angles” — the ancient Indian art of triangle and angle mathematics.
🕰️ 1. Vedic Era (c. 1200 BCE – 500 BCE)
[...]
🔭 2. Sine of the Bow: Sanskrit Terminology
[...]
🌕 3. Āryabhaṭa (476 CE)
[...]
🌀 4. Varāhamihira (6th Century CE)
[...]
🌠 5. Bhāskarācārya II (12th Century CE)
[...]
📤 Indian Legacy Spreads

— From this revision to History of trigonometry

Em dashes

While human editors and writers often do use em dashes (—), LLM output tends to use them more often than nonprofessional human-written text of the same genre, and in places where humans are more likely to use commas, parentheses, colons, or (misused) hyphens (-). LLMs especially tend to use em dashes in a formulaic, pat way, often mimicking "punched up" sales-like writing by over-emphasizing clauses or parallelisms. One explanation is that their training data is rich in novels, and novelists have always used em dashes more often than is typical of a layperson.

This sign is most useful when taken in combination with other indicators, not by itself.

Examples

Elwandore is a virtual micronation for people with passion and skill — a place to build, to create, and to help each other grow while chasing wealth. But not wealth for greed — wealth to give, to help others, to donate.

— From this version of Draft:United Digital Republic Of Elwandore

The term “Dutch Caribbean” is not used in the statute and is primarily promoted by Dutch institutions, not by the people of the autonomous countries themselves. In practice, many Dutch organizations and businesses use it for their own convenience, even placing it in addresses — e.g., “Curaçao, Dutch Caribbean” — but this only adds confusion internationally and erases national identity. You don’t say “Netherlands, Europe” as an address — yet this kind of mislabeling continues.

— From this revision to Talk:Dutch Caribbean
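Since this sign is about frequency rather than presence, a rate is the natural measure. The sketch below computes em dashes per thousand words; any threshold you apply to it is an assumption to be calibrated against known-human text in the same genre, not a built-in rule.

```python
def em_dash_rate(text: str) -> float:
    """Em dashes (U+2014) per 1,000 words; thresholds are up to the reader."""
    words = len(text.split())
    return 1000 * text.count("\u2014") / words if words else 0.0

sample = "But not wealth for greed \u2014 wealth to give, to help others."
rate = em_dash_rate(sample)
```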

Curly quotation marks

AI chatbots typically use curly quotation marks (“...” or ‘...’) instead of straight quotation marks ("..." or '...'). In some cases, they inconsistently use pairs of curly and straight quotation marks in the same response. They also tend to use the curly apostrophe (’, the same character as the curly right single quotation mark) instead of the straight apostrophe ('), such as in contractions and possessive forms, and may do this inconsistently as well.

Curly quotes alone do not prove LLM use. Microsoft Word, as well as macOS and iOS devices, have a "smart quotes" feature that converts straight quotes to curly quotes. Grammar-correcting tools such as LanguageTool may also have such a feature. Curly quotation marks and apostrophes are common in professionally typeset works such as major newspapers, and citation tools like Citer may repeat those that appear in the title of a web page: for example,

McClelland, Mac (2017-09-27). "When 'Not Guilty' Is a Life Sentence". The New York Times. Retrieved 2025-08-03.

Note that Wikipedia allows users to customize the fonts used to display text. Some fonts display matched curly apostrophes as straight, in which case the distinction is invisible to the user.
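The inconsistency described above (curly and straight marks mixed in one passage) is easy to inventory by counting code points. This is a minimal sketch; as the caveats above note, a mix is only a weak signal, since smart-quote tools and copy-pasted citations produce the same pattern.

```python
def quote_inventory(text: str) -> dict[str, int]:
    """Count straight vs. curly quote characters in a passage.
    A mix of both kinds is one (weak) signal worth checking."""
    chars = {
        "straight_double": '"',
        "straight_single": "'",
        "curly_double_open": "\u201c",
        "curly_double_close": "\u201d",
        "curly_single_or_apostrophe": "\u2019",
    }
    return {name: text.count(c) for name, c in chars.items()}

inv = quote_inventory('He said \u201cfine\u201d but added "it\u2019s odd".')
```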

Subject lines

User messages and unblock requests generated by AI chatbots sometimes begin with text that is intended to be pasted into the Subject field on an email form.

Examples

Subject: Request for Permission to Edit Wikipedia Article - "Dog"

— From this revision to Talk:Dog

Subject: Request for Review and Clarification Regarding Draft Article

— From this revision to Wikipedia:WikiProject Articles for creation/Help desk

Collaborative communication

Words to watch: I hope this helps, Of course!, Certainly!, You're absolutely right!, Would you like..., is there anything else, let me know, more detailed breakdown, here is a...

In some cases, editors will paste text from an AI chatbot that was meant as correspondence, prewriting, or advice from the chatbot, rather than article content. AI chatbots may also explicitly indicate that the text is for a Wikipedia article if prompted to produce one, and may mention various policies and guidelines in their outputs—often explicitly specifying that they're Wikipedia's conventions.

Examples

This fictional article combines the tone of a Wikipedia article and the creative elements you requested, including the announcement date, release date, new cast, and crew for the sequel. Let me know if you'd like it expanded or tailored further!

— From Draft:A Knight's Tale: The Legend Continues

Certainly. Here's a draft Wikipedia-style article for Mark Biram, written in a neutral, encyclopedic tone and formatted according to Wikipedia conventions. This assumes notability is supported by independent sources (which would need to be cited for a real Wikipedia page):

— From this revision to Draft:Mark Biram

Final important tip: The ~~~~ at the very end is Wikipedia markup that automatically

— Adapted from this revision to Talk:Test automation management tools; the message also ends unexpectedly

In this section, we will discuss the background information related to the topic of the report. This will include a discussion of relevant literature, previous research, and any theoretical frameworks or concepts that underpin the study. The purpose is to provide a comprehensive understanding of the subject matter and to inform the reader about the existing knowledge and gaps in the field.

— From this revision to Metaphysics

Including photos of the forge (as above) and its tools would enrich the article’s section on culture or economy, giving readers a visual sense of Ronco’s industrial heritage. Visual resources can also highlight Ronco Canavese’s landscape and landmarks. For instance, a map of the Soana Valley or Ronco’s location in Piedmont could be added to orient readers geographically. The village’s scenery [...] could be illustrated with an image. Several such photographs are available (e.g., on Wikimedia Commons) that show Ronco’s panoramic view, [...] Historical images, if any exist (such as early 20th-century photos of villagers in traditional dress or of old alpine trades), would also add depth to the article. Additionally, the town’s notable buildings and sites can be visually presented: [...] Including an image of the Santuario di San Besso [...] could further engage readers. By leveraging these visual aids – maps, photographs of natural and cultural sites – the expanded article can provide a richer, more immersive picture of Ronco Canavese.

Knowledge-cutoff disclaimers

Words to watch: as of [date],[7] Up to my last training update, as of my last knowledge update, While specific details are limited/scarce..., not widely available/documented/disclosed, ...in the provided/available sources/search results..., based on available information...

A knowledge-cutoff disclaimer is a statement used by the AI chatbot to indicate that the information provided may be incomplete, inaccurate, or outdated.

If an LLM has a fixed knowledge cutoff (usually the model's last training update), it cannot provide information on events or developments past that time, and it will often output a disclaimer to remind the user of this cutoff, usually in the form of a statement that the information provided is accurate only up to a certain date.

If an LLM with retrieval-augmented generation (for example, an AI chatbot that can search the web) fails to find sources on a given topic, or if information is not included in sources provided to it in a prompt, it will often output a statement to that effect, which is similar to a knowledge-cutoff disclaimer. It may also pair it with text about what that information "likely" may be and why it is significant. This information is entirely speculative (including the very claim that it's "not documented") and may be based on loosely related topics or completely fabricated. It is also frequently combined with the tells above.

Examples

While specific information about the fauna of Studniční hora is limited in the provided search results, the mountain likely supports...

— From this revision to Studniční hora

Though the details of these resistance efforts aren't widely documented, they highlight her bravery...

— From this revision to Throwing Curves: Eva Zeisel

No significant public controversies or security incidents affecting Outpost24 have been documented as of June 2025.

— From Draft:Outpost24

As of my last knowledge update in January 2022, I don't have specific information about the current status or developments related to the "Chester Mental Health Center" in today's era.

— From this revision to Chester Mental Health Center

Below is a detailed overview based on available information:

Prompt refusal

Words to watch: as an AI language model, as a large language model, I'm sorry...

Occasionally, the AI chatbot will decline to answer a prompt as written, usually with an apology and a reminder that it is "an AI language model". Attempting to be helpful, it often gives suggestions or an answer to an alternative, similar request. Outright refusals have become increasingly rare.

Prompt refusals are obviously unacceptable for Wikipedia articles, so if a user includes one anyway, it may indicate that they did not review the text and/or may not be proficient in English. Remember to assume good faith, because that editor may genuinely want to improve our coverage of knowledge gaps.

Examples

As an AI language model, I can't directly add content to Wikipedia for you, but I can help you draft your bibliography.

— From this revision to Parmiter's Almshouse & Pension Charity

Links to searches

When results appear in these searches, they are almost always problematic – but remember that it would be okay for an article to include them if, for example, they were in a relevant, attributed quote.

Phrasal templates

AI chatbots may generate responses with fill-in-the-blank phrasal templates (as seen in the game Mad Libs), intended for the LLM user to replace with words and phrases pertaining to their use case. However, some LLM users forget to fill in those blanks. Note that non-LLM-generated templates exist for drafts and new articles, such as Wikipedia:Artist biography article template/Preload and pages in Category:Article creation templates.

Examples

Subject: Concerns about Inaccurate Information

Dear Wikipedia

I am writing to express my deep concern about the spread of misinformation on your platform. Specifically, I am referring to the article about ==[Entertainer's Name]==, which I believe contains inaccurate and harmful information.

— From this revision to Talk:Kjersti Flaa

Subject: Edit Request for Wikipedia Entry

Dear Wikipedia Editors,

I hope this message finds you well. I am writing to request an edit for the Wikipedia entry

I have identified an area within the article that requires updating/improvement. ==[Describe the specific section or content that needs editing and provide clear reasons why the edit is necessary, including reliable sources if applicable]==.

— From this revision to Talk:Spaghetti

[URL of source confirming birth, if available], [URL of reliable source]

— From this revision to Draft:Ansuman Satpathy

(Note: Actual Wikipedia articles require verifiable citations from independent sources. The following entries are placeholders to indicate where citations would go if sources were available.)

— From a speedily-deleted draft

Large language models may also insert placeholder dates like "2025-xx-xx" into citation fields, particularly the access-date parameter and, more rarely, the date parameter, producing errors.

{{cite web |title=Game Night Goes Bananas! … |url=https://www.prnewswire.com/news-releases/game-night-goes-bananas-mcmiller-entertainments-viral-smash-hit-party-game-its-bananas-now-available-on-amazon-301602448.html | ==access-date=2025-xx-xx== }}

— From this revision to Draft:McMiller draft 1

In 2025, Plot participated in the inauguration of Israel’s first grove honoring **Prisoners of Zion**, in Nof HaGalil.{{cite news |title=Israel's first grove honoring Prisoners of Zion inaugurated in Nof HaGalil |url=https://www.jns.org/wire/israels-first-grove-honoring-prisoners-of-zion-inaugurated-in-nof-hagalil/ | ==access-date=2025-xx-xx== }}

— From this revision to Draft:Ronen Plot

Use of Markdown

AI chatbots are not proficient in wikitext, the markup language used to instruct Wikipedia's MediaWiki software how to format an article. As wikitext is a niche markup language, found mostly on wikis running on MediaWiki and other MediaWiki-based platforms like Miraheze, LLMs tend to lack wikitext-formatted training data. While chatbots' training corpora did ingest millions of Wikipedia articles, these articles would not have been processed as text files containing wikitext syntax. This is compounded by the fact that most chatbots are factory-tuned to use another, conceptually similar but much more diversely applied markup language: Markdown. Their system-level instructions direct them to format outputs using it, and the chatbot apps render its syntax as formatted text on a user's screen, enabling the display of headings, bulleted and numbered lists, tables, etc., just as MediaWiki renders wikitext to make Wikipedia articles look like formatted documents.

When asked about its "formatting guidelines", a chatbot willing to reveal some of its system-level instructions will typically generate some variation of the following (this is Microsoft Copilot in mid-2025):

## Formatting Guidelines

- All output uses GitHub-flavored Markdown.  
- Use a single main title (`#`) and clear primary subheadings (`##`).  
- Keep paragraphs short (3–5 sentences, ≤150 words).  
- Break large topics into labeled subsections.  
- Present related items as bullet or numbered lists; number only when order matters.  
- Always leave a blank line before and after each paragraph.  
- Avoid bold or italic styling in body text unless explicitly requested.  
- Use horizontal dividers (`---`) between major sections.  
- Employ valid Markdown tables for structured comparisons or data summaries.  
- Refrain from complex Unicode symbols; stick to simple characters.  
- Reserve code blocks for code, poems, lyrics, or similarly formatted content.  
- For mathematical expressions, use LaTeX outside of code blocks.

As the above suggests, Markdown's syntax is completely different from wikitext's: Markdown uses asterisks (*) or underscores (_) instead of single-quotes (') for bold and italic formatting, hash symbols (#) instead of equals signs (=) for section headings, parentheses (()) instead of square brackets ([]) around URLs, and three symbols (---, ***, or ___) instead of four hyphens (----) for thematic breaks.

Even when they are told to do so explicitly, chatbots generally struggle to generate text using syntactically correct wikitext, as their training data lead to a drastically greater affinity for and fluency in Markdown. When told to "generate an article", a chatbot will typically default to using Markdown for the generated output, which is preserved in clipboard text by the copy functions on some chatbot platforms. If instructed to generate content for Wikipedia, the chatbot might "realize" the need to generate Wikipedia-compatible code, and might include a message like Would you like me to... turn this into actual Wikipedia markup format (`wikitext`)?[8] in its output. If the chatbot is told to proceed, the resulting syntax will often be rudimentary, syntactically incorrect, or both. The chatbot might put its attempted-wikitext content in a Markdown-style fenced code block (its syntax for WP:PRE) surrounded by Markdown-based syntax and content, which may also be preserved by platform-specific copy-to-clipboard functions, leading to a telling footprint of both markup languages' syntax. This might include the appearance of three backticks in the text, such as: ```wikitext.[9]

The presence of faulty wikitext syntax mixed with Markdown syntax is a strong indicator that content is LLM-generated, especially if in the form of a fenced Markdown code block. However, Markdown alone is not as strong an indicator. Software developers, researchers, technical writers, and experienced internet users frequently use Markdown in tools like Obsidian and GitHub, and on platforms like Reddit, Discord, and Slack. Some writing tools and apps, such as iOS Notes, Google Docs, and Windows Notepad, support Markdown editing or exporting. The increasing ubiquity of Markdown may also lead new editors to assume that Wikipedia supports Markdown by default.
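The mixed-markup footprint described above can be checked mechanically. The following is an illustrative Python sketch, not an official detector; the patterns are simplified assumptions covering only a few common constructs of each language:

```python
import re

# Markdown-specific constructs: ATX headings, **bold**, [label](url)
# links, and fenced code blocks.
MARKDOWN_SIGNS = [
    re.compile(r"^#{1,6} ", re.MULTILINE),
    re.compile(r"\*\*[^*\n]+\*\*"),
    re.compile(r"\[[^\]\n]+\]\(https?://"),
    re.compile(r"^```", re.MULTILINE),
]
# Wikitext-specific constructs: == Headings ==, [[wikilinks]],
# {{templates}}.
WIKITEXT_SIGNS = [
    re.compile(r"^={2,6}[^=\n]+={2,6}\s*$", re.MULTILINE),
    re.compile(r"\[\[[^\]\n]+\]\]"),
    re.compile(r"\{\{[^}\n]+\}\}"),
]

def mixed_markup(text: str) -> bool:
    """True if the text mixes Markdown and wikitext constructs."""
    has_md = any(p.search(text) for p in MARKDOWN_SIGNS)
    has_wt = any(p.search(text) for p in WIKITEXT_SIGNS)
    return has_md and has_wt

sample = "## History\nThe town {{citation needed}} grew **rapidly**."
print(mixed_markup(sample))  # -> True
```

A positive result from a sketch like this is only a starting point for review, not proof of LLM use, for the reasons given above.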

Examples

I believe this block has become procedurally and substantively unsound. Despite repeatedly raising clear, policy-based concerns, every unblock request has been met with **summary rejection** — not based on specific diffs or policy violations, but instead on **speculation about motive**, assertions of being “unhelpful”, and a general impression that I am "not here to build an encyclopedia". No one has meaningfully addressed the fact that I have **not made disruptive edits**, **not engaged in edit warring**, and have consistently tried to **collaborate through talk page discussion**, citing policy and inviting clarification. Instead, I have encountered a pattern of dismissiveness from several administrators, where reasoned concerns about **in-text attribution of partisan or interpretive claims** have been brushed aside. Rather than engaging with my concerns, some editors have chosen to mock, speculate about my motives, or label my arguments "AI-generated" — without explaining how they are substantively flawed.

— From this revision to a user talk page

- The Wikipedia entry does not explicitly mention the "Cyberhero League" being recognized as a winner of the World Future Society's BetaLaunch Technology competition, as detailed in the interview with THE FUTURIST ([1] (https://consciouscreativity.com/the-futurist-interview-with-dana-klisanin-creator-of-the-cyberhero-league/)). This recognition could be explicitly stated in the "Game design and media consulting" section.

— From this revision to Talk:Dana Klisanin

Here, LLMs incorrectly use ## to denote section headings, which MediaWiki interprets as a numbered list.

    1. Geography

Villers-Chief is situated in the Jura Mountains, in the eastern part of the Doubs department. [...]

    1. History

Like many communes in the region, Villers-Chief has an agricultural past. [...]

    1. Administration

Villers-Chief is part of the Canton of Valdahon and the Arrondissement of Pontarlier. [...]

    1. Population

The population of Villers-Chief has seen some fluctuations over the decades, [...]

— From this revision to Villers-Chief

Broken wikitext

As explained above, AI chatbots are not proficient in wikitext and Wikipedia templates, leading to faulty syntax. A noteworthy instance is garbled code related to Template:AfC submission, as new editors might ask a chatbot how to submit their Articles for Creation draft; see this discussion among AfC reviewers.

Examples

Note the badly malformed category link:

[[Category:AfC submissions by date/<0030Fri, 13 Jun 2025 08:18:00 +0000202568 2025-06-13T08:18:00+00:00Fridayam0000=error>EpFri, 13 Jun 2025 08:18:00 +0000UTC00001820256 UTCFri, 13 Jun 2025 08:18:00 +0000Fri, 13 Jun 2025 08:18:00 +00002025Fri, 13 Jun 2025 08:18:00 +0000: 17498026806Fri, 13 Jun 2025 08:18:00 +0000UTC2025-06-13T08:18:00+00:0020258618163UTC13 pu62025-06-13T08:18:00+00:0030uam301820256 2025-06-13T08:18:00+00:0008amFri, 13 Jun 2025 08:18:00 +0000am2025-06-13T08:18:00+00:0030UTCFri, 13 Jun 2025 08:18:00 +0000 &qu202530;:&qu202530;.</0030Fri, 13 Jun 2025 08:18:00 +0000202568>June 2025|sandbox]]

— From this revision to User:Dr. Omokhudu Idogho/sandbox

turn0search0

ChatGPT may include citeturn0search0 (surrounded by Unicode points in the Private Use Area) at the ends of sentences, with the number after "search" increasing as the text progresses. These are places where the chatbot links to an external site, but a human pasting the conversation into Wikipedia has that link converted into placeholder code. This was first observed in February 2025.

A set of images in a response may also render as iturn0image0turn0image1turn0image4turn0image5. Rarely, other markup of a similar style, such as citeturn0news0 (example), citeturn1file0 (example), or cite*generated-reference-identifier* (example), may appear.
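These artifacts follow a regular enough shape to be searched for. The sketch below is an illustration based on the variants described above; the pattern is an assumption, not a specification of ChatGPT's output format, and the Private Use Area wrapper characters may or may not survive a copy-paste:

```python
import re

# Match the placeholder markers described above: an optional "cite"
# prefix, then "turnN" followed by a channel name and index, e.g.
# citeturn0search1, turn0image0, citeturn0news0, citeturn1file0.
ARTIFACT = re.compile(
    r"(?:cite)?turn\d+(?:search|news|image|file)\d+"
)

text = "recognized as an International Fellowship Centre. citeturn0search1"
print(bool(ARTIFACT.search(text)))  # -> True
```

Ordinary prose containing the word "turn" does not match, since the pattern requires the digits and channel name immediately after it.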

Examples

The school is also a center for the US College Board examinations, SAT I & SAT II, and has been recognized as an International Fellowship Centre by Cambridge International Examinations. citeturn0search1 For more information, you can visit their official website: citeturn0search0

— From this revision to List of English-medium schools in Bangladesh

Due to a bug, ChatGPT may add code in the form of :contentReference[oaicite:0]{index=0} in place of links to references in output text. Links to ChatGPT-generated references may be labeled with oai_citation.

Examples

:contentReference[oaicite:16]{index=16}

1. **Ethnicity clarification**

- :contentReference[oaicite:17]{index=17}
    * :contentReference[oaicite:18]{index=18} :contentReference[oaicite:19]{index=19}.
    * Denzil Ibbetson’s *Panjab Castes* classifies Sial as Rajputs :contentReference[oaicite:20]{index=20}.
    * Historian’s blog notes: "The Sial are a clan of Parmara Rajputs…” :contentReference[oaicite:21]{index=21}.

2.:contentReference[oaicite:22]{index=22}

- :contentReference[oaicite:23]{index=23}
    > :contentReference[oaicite:24]{index=24} :contentReference[oaicite:25]{index=25}.

— From this revision to Talk:Sial (tribe).

#### 📌 Key facts needing addition or correction:

1. **Group launch & meetings**

*Independent Together* launched a “Zero Rates Increase Roadshow” on 15 June, with events in Karori, Hataitai, Tawa, and Newtown  [oai_citation:0‡wellington.scoop.co.nz](https://wellington.scoop.co.nz/?p=171473&utm_source=chatgpt.com).

2. **Zero-rates pledge and platform**

The group pledges no rates increases for three years, then only match inflation—responding to Wellington’s 16.9% hike for 2024/25  [oai_citation:1‡en.wikipedia.org](https://en.wikipedia.org/wiki/Independent_Together?utm_source=chatgpt.com).

— From this revision to Talk:Independent Together

ChatGPT may add JSON-formatted code at the end of sentences in the form of ({"attribution":{"attributableIndex":"X-Y"}}), with X and Y being increasing numeric indices.

Examples

^[Evdokimova was born on 6 October 1939 in Osnova, Kharkov Oblast, Ukrainian SSR (now Kharkiv, Ukraine).]({"attribution":{"attributableIndex":"1009-1"}}) ^[She graduated from the Gerasimov Institute of Cinematography (VGIK) in 1963, where she studied under Mikhail Romm.]({"attribution":{"attributableIndex":"1009-2"}}) [oai_citation:0‡IMDb](https://www.imdb.com/name/nm0947835/?utm_source=chatgpt.com) [oai_citation:1‡maly.ru](https://www.maly.ru/en/people/EvdokimovaA?utm_source=chatgpt.com)

— From Draft:Aleftina Evdokimova

Patrick Denice & Jake Rosenfeld, Les syndicats et la rémunération non syndiquée aux États-Unis, 1977–2015, ‘‘Sociological Science’’ (2018).]({“attribution”:{“attributableIndex”:“3795-0”}})

— From this diff to fr:Syndicalisme aux États-Unis

LLMs sometimes hallucinate non-existent categories (which appear as red links) because their training set includes obsolete and renamed categories that they reproduce in new content. They may also treat ordinary references to topics as categories, thus generating non-existent categories. Note that this is also a common error made by new or returning editors.

Examples

[[Category:American hip hop musicians]]

— From this revision to Draft:Paytra

rather than

[[Category:American hip-hop musicians]]

If a new article or draft has multiple citations with external links, and most of them are broken (error 404 pages), this is a strong sign of an AI-generated page, particularly if the dead links are not found in website archiving sites like Internet Archive or Archive Today. Most links become broken (see link rot) over time, but those factors make it unlikely that the link was ever valid.

A checksum can be used to verify ISBNs. An invalid checksum is a very likely sign that an ISBN is incorrect, and citation templates will display a warning if so. Similarly, DOIs are more resistant to link rot than regular hyperlinks. Unresolvable DOIs and invalid ISBNs can be indicators of hallucinated references.
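The ISBN-13 check works as follows: the thirteen digits are weighted alternately 1 and 3, and the weighted sum must be divisible by 10. A minimal Python sketch of the same check that citation templates perform:

```python
# Verify an ISBN-13 checksum. Digits are weighted alternately 1 and 3;
# the weighted sum of all 13 digits must be divisible by 10.
def isbn13_valid(isbn: str) -> bool:
    digits = [int(c) for c in isbn if c.isdigit()]
    if len(digits) != 13:
        return False
    total = sum(d * (1 if i % 2 == 0 else 3) for i, d in enumerate(digits))
    return total % 10 == 0

print(isbn13_valid("978-0-470-52157-1"))  # a real ISBN -> True
print(isbn13_valid("978-0-470-52157-2"))  # altered check digit -> False
```

Note the limitation this implies: a hallucinated ISBN can still pass the checksum by chance (roughly one in ten random 13-digit strings does), so a valid checksum does not prove the reference is real, while an invalid one strongly suggests it is not.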

A related problem is DOIs that resolve to entirely different articles, and general book citations without page numbers. This passage, for example, was generated by ChatGPT.

Ohm's Law is a fundamental principle in the field of electrical engineering and physics that states the current passing through a conductor between two points is directly proportional to the voltage across the two points, provided the temperature remains constant. Mathematically, it is expressed as V=IR, where V is the voltage, I is the current, and R is the resistance. The law was formulated by German physicist Georg Simon Ohm in 1827, and it serves as a cornerstone in the analysis and design of electrical circuits [1]. Ohm’s Law applies to many materials and components that are "ohmic," meaning their resistance remains constant regardless of the applied voltage or current. However, it does not hold for non-linear devices like diodes or transistors [2][3].

References:

1. Dorf, R. C., & Svoboda, J. A. (2010). Introduction to Electric Circuits (8th ed.). Hoboken, NJ: John Wiley & Sons. ISBN 9780470521571.

2. M. E. Van Valkenburg, “The validity and limitations of Ohm’s law in non-linear circuits,” Proceedings of the IEEE, vol. 62, no. 6, pp. 769–770, Jun. 1974. doi:10.1109/PROC.1974.9547

3. C. L. Fortescue, “Ohm’s Law in alternating current circuits,” Proceedings of the IEEE, vol. 55, no. 11, pp. 1934–1936, Nov. 1967. doi:10.1109/PROC.1967.6033

The book reference appears valid – a book on electric circuits would likely have information about Ohm's law – but without a page number, the citation is not usable for verification of the claims in the prose. Worse, both Proceedings of the IEEE citations are completely made up. The DOIs lead to completely different citations and have other problems as well. For instance, C. L. Fortescue had been dead for more than 30 years at the purported time of writing, and Vol. 55, Issue 11 does not list any articles that match anything remotely close to the information given in reference 3. Note also the use of curly quotation marks and apostrophes in some, but not all, of the above text, another indicator that text may be LLM-generated.

AI tools may have been prompted to include references, and make an attempt to do so as Wikipedia expects, but fail with some key implementation details or stand out when compared with conventions.

In the example below, note the incorrect attempt at re-using references. The tool used here was not capable of searching for non-confabulated sources (the text was generated the day before Bing Deep Search launched), but it nonetheless found one real reference. The syntax for re-using the references was incorrect.

In this case, the Smith, R. J. source – the "third source", for which the tool presumably generated the link 'https://pubmed.ncbi.nlm.nih.gov/3' (which has a PMID of 3) – is also completely irrelevant to the body of the article. The user did not check the reference before converting it to a {{cite journal}} reference, even though the links resolve.

The LLM in this case has diligently included the incorrect re-use syntax after every single full stop.

For over thirty years, computers have been utilized in the rehabilitation of individuals with brain injuries. Initially, researchers delved into the potential of developing a "prosthetic memory."<ref>Fowler R, Hart J, Sheehan M. A prosthetic memory: an application of the prosthetic environment concept. ''Rehabil Counseling Bull''. 1972;15:80–85.</ref> However, by the early 1980s, the focus shifted towards addressing brain dysfunction through repetitive practice.<ref>{{Cite journal |last=Smith |first=R. J. |last2=Bryant |first2=R. G. |date=1975-10-27 |title=Metal substitutions incarbonic anhydrase: a halide ion probe study |url=https://pubmed.ncbi.nlm.nih.gov/3 |journal=Biochemical and Biophysical Research Communications |volume=66 |issue=4 |pages=1281–1286 |doi=10.1016/0006-291x(75)90498-2 |issn=0006-291X |pmid=3}}</ref> Only a few psychologists were developing rehabilitation software for individuals with Traumatic Brain Injury (TBI), resulting in a scarcity of available programs.<sup>[3]</sup> Cognitive rehabilitation specialists opted for commercially available computer games that were visually appealing, engaging, repetitive, and entertaining, theorizing their potential remedial effects on neuropsychological dysfunction.<sup>[3]</sup>

— From this revision to Cognitive orthotics

Some LLMs or chatbot interfaces use their own method of providing footnotes, typically using the character ↩:

References

Would you like help formatting and submitting this to Wikipedia, or do you plan to post it yourself? I can guide you step-by-step through that too.

Footnotes

  1. KLAS Research. (2024). Top Performing RCM Vendors 2024. https://klasresearch.com ↩ ↩ 2
  2. PR Newswire. (2025, February 18). CureMD AI Scribe Launch Announcement. https://www.prnewswire.com/news-releases/curemd-ai-scribe

— From this revision of Draft:CureMD

utm_source=

ChatGPT may add the UTM parameter utm_source=openai or, in edits prior to August 2025, utm_source=chatgpt.com to URLs that it is using as sources. This behavior is much less common with other LLMs such as Gemini or Claude.[10]
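Checking a cited URL for these parameters is a straightforward query-string inspection. An illustrative Python sketch using the standard library (the parameter values are the ones observed above; other values are possible):

```python
from urllib.parse import urlparse, parse_qs

# utm_source values observed in ChatGPT-sourced URLs, per the text above.
AI_UTM_SOURCES = {"chatgpt.com", "openai"}

def has_ai_utm(url: str) -> bool:
    """True if the URL carries a utm_source value associated with ChatGPT."""
    params = parse_qs(urlparse(url).query)
    return any(v in AI_UTM_SOURCES for v in params.get("utm_source", []))

url = ("https://www.theguardian.com/sport/2025/feb/11/"
       "sam-burgess-interview?utm_source=chatgpt.com")
print(has_ai_utm(url))  # -> True
```

A utm_source value of, say, twitter or a newsletter name is ordinary marketing tracking and indicates nothing about AI use.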

Examples

Following their marriage, Burgess and Graham settled in Cheshire, England, where Burgess serves as the head coach for the Warrington Wolves rugby league team. [https://www.theguardian.com/sport/2025/feb/11/sam-burgess-interview-warrington-rugby-league-luke-littler?utm\_source=chatgpt.com\]

— From this revision to Sam Burgess

Vertex AI documentation and blog posts describe watermarking, verification workflow, and configurable safety filters (for example, person‑generation controls and safety thresholds). ([cloud.google.com](https://cloud.google.com/vertex-ai/generative-ai/docs/image/generate-images?utm\_source=openai))

— From this revision to Draft:Nano Banana (Chatbot)

Examples

See these diffs for examples. The problematic references will appear as parser errors in the reflist.

AI tools may abruptly stop generating content, for example if they predict the end-of-text token (appearing as <|endoftext|>) next. Also, the number of tokens in a single response is usually limited, and further responses require the user to select "continue generating".

This method is not foolproof, as a malformed copy/paste from one's local computer can also cause this. It may also indicate a copyright violation rather than the use of an LLM.

Discrepancies in writing style and variety of English

A sudden shift in an editor's writing style, such as unexpectedly flawless grammar compared to their other communication, may indicate the use of AI tools.

Another discrepancy is a mismatch between the user's location, the topic's national ties to a variety of English, and the variety of English used. A human writer from India writing about an Indian university would probably not use American English; however, LLM outputs use American English by default, unless prompted otherwise.[5] Note that non-native English speakers tend to mix up English varieties, and such signs should only raise suspicion if there is a sudden and complete shift in an editor's English variety use.

ChatGPT was launched to the public on November 30, 2022. Although OpenAI had similarly powerful LLMs before then, they were paid services and not particularly accessible or known to lay people. ChatGPT experienced extreme growth immediately on launch.

It is very unlikely that any particular text added to Wikipedia prior to November 30, 2022 was generated by an LLM. If an edit to a page was made before this date, AI use can be safely ruled out for that revision. While some text added long ago (such as in 2006) may appear to match some of the AI signs given in this list, and even convincingly appear to have been AI generated, the vastness of Wikipedia allows for these rare coincidences.

Edit summaries

AI-generated edit summaries are often unusually long, written as formal, first-person paragraphs without abbreviations, and may conspicuously itemize Wikipedia's conventions.

Most editors using AI do not ask for summaries to be generated.

Refined the language of the article for a neutral, encyclopedic tone consistent with Wikipedia's content guidelines. Removed promotional wording, ensured factual accuracy, and maintained a clear, well-structured presentation. Updated sections on history, coverage, challenges, and recognition for clarity and relevance. Added proper formatting and categorized the entry accordingly

— Edit summary from this revision to Khaama Press

I formalized the tone, clarified technical content, ensured neutrality, and indicated citation needs. Historical narratives were streamlined, allocation details specified with regulatory references, propagation explanations made reader-friendly, and equipment discussions focused on availability and regulatory compliance, all while adhering to encyclopedic standards.

— Edit summary from this revision to 4-metre band

**Edit Summary:** Reorganized article for clarity and neutrality; refined phrasing to align with **WP:NPOV** and **WP:BLPCRIME**; standardized formatting and citation styles; improved flow by separating professional achievements from legal issues; updated infobox with complete details; fixed broken references and inconsistencies in date formatting.

— Edit summary from this revision to David Bitel

False accusations of AI use can drive away new editors and foster an atmosphere of suspicion. Before claiming AI was used, consider whether the Dunning–Kruger effect or confirmation bias is clouding your judgement. In particular, several somewhat commonly used indicators are ineffective (and may even indicate the opposite) in LLM detection.

  • Perfect grammar: While modern LLMs are known for their high grammatical proficiency, many editors are also skilled writers or come from professional writing backgrounds. (See also § Discrepancies in writing style and variety of English.)
  • "Bland" or "robotic" prose: By default, modern LLMs tend toward effusive and verbose prose, as detailed above; while this tendency is formulaic, it may not scan as "robotic" to those unfamiliar with AI writing.[11]
  • "Fancy," "academic," or unusual words: While LLMs disproportionately favor certain words and phrases, many of which are long and have difficult readability scores, the correlation does not extend to all "fancy," academic, or "advanced"-sounding prose.[1] "AI vocabulary" and academic vocabulary are not the same thing; indeed, the specific words overused by AI appeared far less frequently in research abstracts prior to 2023.[4] Low-frequency and "unusual" words are also less likely to show up in AI-generated writing as they are statistically less common, unless they are proper nouns directly related to the topic.
  • Letter-like writing (in isolation): Although many talk page messages written with salutations, valedictions and other formalities after 2023 tend to appear AI-generated, that is not guaranteed to be the case for all such messages. Letters and emails have conventionally been written in similar ways long before modern LLMs existed. An AI-generated message may start with a subject line, include a vertical list,[12] or one or more placeholders, or end abruptly. In addition, some human editors may mistakenly post emails, letters, petitions, or messages intended for the article's subject, frequently formatted as letters. While such edits are generally off-topic and may be removed per the guidelines at WP:NOTFORUM—particularly if they contain personal information—they are not necessarily LLM-generated.
  • Conjunctions (in isolation): While LLMs tend to overuse connecting words and phrases in a stilted, formulaic way that implies inappropriate synthesis of facts, such uses are typical of essay-like writing by humans and are not strong indicators by themselves.
  • Bizarre wikitext: While LLMs may hallucinate templates or generate wikitext code with invalid syntax for reasons explained in § Use of Markdown, they are not likely to generate content with certain random-seeming, "inexplicable" errors and artifacts (excluding the ones listed on this page in § Markup). Bizarrely placed HTML tags are more indicative of poorly programmed browser extensions or a known bug with Wikipedia's content translation tool (T113137). Misplaced syntax like ''Catch-22 i''s a satirical novel. (rendered as "Catch-22 is a satirical novel." with the italics ending after the "i") is more indicative of mistakes in VisualEditor, where such errors are harder to notice than in source editing.
See also

  • Wikipedia:Artificial intelligence

Footnotes

  1. Russell, Jenna; Karpinska, Marzena; Iyyer, Mohit (2025). People who frequently use ChatGPT for writing tasks are accurate and robust detectors of AI-generated text. Proceedings of the 63rd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers). Vienna, Austria: Association for Computational Linguistics. pp. 5342–5373. arXiv:2501.15654. doi:10.18653/v1/2025.acl-long.267. Retrieved 2025-09-05 – via ACL Anthology.

  2. This can be directly observed by examining images generated by text-to-image models; they look acceptable at first glance, but specific details tend to be blurry and malformed. This is especially true for background objects and text.

  3. "10 Ways AI Is Ruining Your Students' Writing". Chronicle of Higher Education. September 16, 2025.

  4. Juzek, Tom S.; Ward, Zina B. (2025). Why Does ChatGPT "Delve" So Much? Exploring the Sources of Lexical Overrepresentation in Large Language Models (PDF). Findings of the Association for Computational Linguistics: ACL 2025. Association for Computational Linguistics. arXiv:2412.11385. Retrieved October 13, 2025 – via ACL Anthology.

  5. Ju, Da; Blix, Hagen; Williams, Adina (2025). Domain Regeneration: How well do LLMs match syntactic properties of text domains?. Findings of the Association for Computational Linguistics: ACL 2025. Vienna, Austria: Association for Computational Linguistics. pp. 2367–2388. arXiv:2505.07784. doi:10.18653/v1/2025.findings-acl.120. Retrieved October 4, 2025 – via ACL Anthology.

  6. "About". Nick Ford. Retrieved 2025-06-25.

  7. Not unique to AI chatbots; this text is produced by the {{as of}} template.

  8. Example (deleted, administrators only)

  9. Example of ```wikitext on a draft.

  10. See T387903.

  11. Murray, Nathan; Tersigni, Elisa (2024). "Can instructors detect AI-generated papers? Postsecondary writing instructor knowledge and". Journal of Applied Learning & Teaching. 7 (2). ISSN 2591-801X. Retrieved 6 October 2025.

  12. Example of a vertical list in a deletion discussion
