@bignimbus
Last active March 13, 2026 16:28
RFC 454545 Human Em Dash Standard
Status: Informational March 2026
Authors: Janice Wilson, Jeff Auriemma
RFC 454545 — Human Em Dash Standard
Abstract
This document proposes the Human Em Dash (HED), a Unicode character
visually indistinguishable from the traditional em dash (—) but encoded
separately for the purpose of indicating probable human authorship.
Recent proliferation of automated text generation systems has produced a
measurable increase in the frequency and enthusiasm of em dash usage.
This trend has created ambiguity for human writers who have historically
relied upon the em dash as a stylistic device.
The Human Em Dash standard introduces a new Unicode code point and an
associated Human Attestation Mark (HAM) that allows writers to signal
that the dash in question originated from a human cognitive process
involving hesitation, revision, or mild frustration.
1. Status of This Memo
This memo provides information for the Internet community. Distribution
of this memo is unlimited.
2. Problem Statement
Historically, the em dash (—) has served as a flexible punctuation mark
used by human authors to indicate interruption, emphasis, or sudden
changes in thought.
Recent developments in large-scale automated text generation have
altered the punctuation ecosystem in several notable ways. Automated
systems frequently produce em dashes with suspicious regularity. The
dash is often used with unwavering grammatical confidence. Human
writers report increasing anxiety that their punctuation choices may
be interpreted as machine-generated. This phenomenon has produced
what researchers have termed Dash Authenticity Collapse (DAC).
As a result, a mechanism is required to allow human authors to
distinguish their punctuation from that of automated systems.
3. Terminology
The key words MUST, MUST NOT, SHALL, SHOULD, and MAY in this document
are to be interpreted as described in RFC 2119.
Additional terms defined for this document include:
HED (Human Em Dash)
A Unicode character representing a human-authenticated em dash.
HAM (Human Attestation Mark)
A Unicode mark that immediately precedes a Human Em Dash and asserts
that the dash was produced by a human.
Automated System
Any system incapable of genuine hesitation.
Hesitation Event
A pause, deletion, cursor movement, or other evidence of cognitive
uncertainty.
4. The Human Em Dash
4.1 Character Definition
The Human Em Dash (HED) SHALL be visually identical to the traditional
em dash but encoded as a distinct Unicode code point.
U+10EAD HUMAN EM DASH
Rendering engines MUST display the Human Em Dash identically to a
standard em dash in order to preserve typographic compatibility.
4.2 Human Attestation Mark (HAM)
To assert human authorship, implementations MUST precede the Human
Em Dash with a Human Attestation Mark (HAM).
Example proposed code point:
U+10EAC HUMAN ATTESTATION MARK
The resulting sequence is:
<HAM><HED>
The HAM character SHOULD render invisibly or with negligible
typographic impact.
Automated systems MUST NOT emit the Human Attestation Mark.
Non-compliant implementations SHOULD be considered adversarial.
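As an illustration of the sequence described in this section, the following is a minimal Python sketch that builds <HAM><HED> from the two proposed code points and checks whether a string contains an attested dash. The names `HAM`, `HED`, `attested_dash`, and `has_attested_dash` are hypothetical, and the code points are this document's proposed values only.

```python
# Proposed code points from sections 4.1 and 4.2 (assumptions of this sketch).
HAM = chr(0x10EAC)  # HUMAN ATTESTATION MARK
HED = chr(0x10EAD)  # HUMAN EM DASH


def attested_dash() -> str:
    """Return the two-character sequence <HAM><HED>."""
    return HAM + HED


def has_attested_dash(text: str) -> bool:
    """True if the text contains at least one HED directly preceded by a HAM."""
    return attested_dash() in text


sentence = "The committee reached a conclusion" + attested_dash() + "after some debate."
print(has_attested_dash(sentence))  # True
print([f"U+{ord(ch):04X}" for ch in attested_dash()])  # ['U+10EAC', 'U+10EAD']
```

Order matters in this sketch: a HAM that follows the dash, or a HED with no HAM in front of it, does not count as attested.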
4.3 Behavioral Verification
Conforming implementations SHOULD verify evidence of human authorship
prior to insertion of the Human Em Dash.
Evidence MAY include one or more of the following:
A pause exceeding 137 milliseconds
A backspace event
Cursor repositioning
A visible moment of indecision
Audible sighing
Systems incapable of hesitation MUST NOT emit the Human Em Dash.
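The verification step above might be sketched as follows. This is one possible reading, not a reference implementation: the `EditEvent` shape, the event kind strings, and the function names are hypothetical, and only the first three kinds of evidence are modeled. Visible indecision and audible sighing are left to implementations with the appropriate sensors.

```python
from dataclasses import dataclass

PAUSE_THRESHOLD_MS = 137  # threshold stated in section 4.3


@dataclass
class EditEvent:
    """One observed editing event (hypothetical shape for this sketch)."""
    kind: str                 # "key", "backspace", or "cursor_move"
    ms_since_previous: float  # time elapsed since the previous event


def shows_hesitation(events: list[EditEvent]) -> bool:
    """True if any event counts as evidence under section 4.3."""
    for event in events:
        if event.kind in ("backspace", "cursor_move"):
            return True
        if event.ms_since_previous > PAUSE_THRESHOLD_MS:
            return True
    return False


def may_emit_hed(events: list[EditEvent]) -> bool:
    """Gate insertion of the Human Em Dash on at least one Hesitation Event."""
    return shows_hesitation(events)


fluent = [EditEvent("key", 40.0), EditEvent("key", 55.0)]
wavering = [EditEvent("key", 40.0), EditEvent("key", 410.0)]
print(may_emit_hed(fluent))    # False
print(may_emit_hed(wavering))  # True
```

The pause check is strict: a pause of exactly 137 milliseconds does not qualify, since the text says "exceeding".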
5. Proof-of-Work Mechanisms
Some environments MAY require additional verification before allowing
insertion of the Human Em Dash.
Suggested mechanisms include:
Incongruous emoji usage
Neuroticism
Expression of personal values or accountability
These measures are collectively referred to as
Human Cognitive Proof-of-Work (HCPoW).
6. Backwards Compatibility
Legacy em dashes (—) remain valid punctuation.
However, in contexts where authorship authenticity is important,
legacy dashes MAY be interpreted as unverified punctuation artifacts.
Readers SHOULD avoid making harsh judgments when encountering
such characters.
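A reader-side tool could tally the dashes in a text without passing judgment on any of them. The sketch below is an assumption-laden example: the category names and `classify_dashes` are hypothetical, and treating a Human Em Dash with no preceding mark as its own category is this sketch's interpretation of section 4.2, not something the document specifies.

```python
from collections import Counter

EM_DASH = "\u2014"   # legacy em dash
HAM = chr(0x10EAC)   # proposed HUMAN ATTESTATION MARK
HED = chr(0x10EAD)   # proposed HUMAN EM DASH


def classify_dashes(text: str) -> Counter:
    """Count attested, unattested, and legacy dashes in the text."""
    counts = Counter()
    previous = ""
    for ch in text:
        if ch == HED:
            # A HED only counts as attested when a HAM comes directly before it.
            counts["attested" if previous == HAM else "unattested_hed"] += 1
        elif ch == EM_DASH:
            counts["legacy_unverified"] += 1
        previous = ch
    return counts


sample = "one" + EM_DASH + "two" + HAM + HED + "three" + HED + "four"
print(dict(classify_dashes(sample)))
# {'legacy_unverified': 1, 'attested': 1, 'unattested_hed': 1}
```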
7. Security Considerations
Adversaries may attempt to simulate human hesitation through randomized
delays or artificially inserted backspaces.
Advanced implementations SHOULD monitor for suspicious patterns such as:
Excessively consistent hesitation intervals
Statistically improbable grammar perfection
Uncanny servility
These patterns, when noticed alongside em dash usage, are indicative of
LLM-generated text.
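The first pattern in the list above, excessively consistent hesitation intervals, could be checked with a simple spread measure. This is a rough sketch under stated assumptions: `hesitation_looks_simulated` and the `min_cv` threshold of 0.1 are hypothetical choices, and the coefficient of variation is a swapped-in heuristic rather than anything this document prescribes. It would not catch the randomized delays mentioned earlier in this section.

```python
from statistics import mean, pstdev


def hesitation_looks_simulated(pauses_ms: list[float], min_cv: float = 0.1) -> bool:
    """Flag pause intervals whose spread is implausibly small.

    Uses the coefficient of variation (standard deviation divided by mean).
    """
    if len(pauses_ms) < 3:
        return False  # too little data to judge either way
    average = mean(pauses_ms)
    if average <= 0:
        return True  # all-zero pauses are perfectly consistent
    return pstdev(pauses_ms) / average < min_cv


print(hesitation_looks_simulated([200.0, 200.0, 200.0, 200.0]))   # True
print(hesitation_looks_simulated([150.0, 900.0, 320.0, 2400.0]))  # False
```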
8. Policy Considerations
Jurisdictions MAY regulate the use of the Human Em Dash by automated
systems.
Use of the Human Em Dash by non-human agents MAY constitute punctuation
impersonation.
Policy frameworks for such regulation remain under development and will
likely be debated at length.
9. IANA Considerations
IANA is requested to establish the Human Punctuation Registry,
including but not limited to:
Human Em Dash
Human Ellipsis
Authentic Parenthetical Aside
Allocation procedures SHOULD involve excessive documentation.
10. Non-Normative Examples
Traditional usage:
The committee reached a conclusion—after some debate.
Human-authenticated usage:
The committee reached a conclusion<HAM><HED>after some debate.
In compliant systems, both render identically.
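Rendering identically does not make the two strings equal at the code point level. The sketch below shows one way a system without Human Em Dash support might fall back to the legacy form. It assumes the proposed code points from section 4, and `to_legacy` is a hypothetical helper.

```python
EM_DASH = "\u2014"   # legacy em dash
HAM = chr(0x10EAC)   # proposed HUMAN ATTESTATION MARK
HED = chr(0x10EAD)   # proposed HUMAN EM DASH


def to_legacy(text: str) -> str:
    """Drop every HAM and map every HED to a legacy em dash."""
    return text.replace(HAM, "").replace(HED, EM_DASH)


traditional = "The committee reached a conclusion" + EM_DASH + "after some debate."
attested = "The committee reached a conclusion" + HAM + HED + "after some debate."

print(traditional == attested)             # False
print(to_legacy(attested) == traditional)  # True
```

The conversion is lossy: once the mark is dropped, the attestation cannot be recovered from the text.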
11. Acknowledgments
The authors would like to acknowledge human writers everywhere who now
find themselves nervously reconsidering their punctuation choices.
Special thanks to the em dash, which did nothing to deserve this.
12. References
RFC 2119: Bradner, S., "Key words for use in RFCs to Indicate Requirement Levels", BCP 14, March 1997.
Various style guides, all of which disagree about the proper usage of
the em dash.
@joryeugene
I have a claude code hook that prevents all en-dash, em-dash, etc... this is the way lololol.

@dollhaus
Fuck this proposal. We shouldn't have to delineate ourselves. Make ML output specify the AGED - Automatically Generated Em Dash. LLMs are the intruding party here, it is they who should specify their identity.

@infamousjoeg
Fuck this proposal. We shouldn't have to delineate ourselves. Make ML output specify the AGED - Automatically Generated Em Dash. LLMs are the intruding party here, it is they who should specify their identity.

+1 FOR THIS! MAKE THEM CONFORM!
