- The Illusion of Just Username and Password
- Hidden Layers of Authentication Complexity
- Standards and Protocols
- Why Real-World Implementations Are So Challenging
- Local Logins and the Age of Passwords
- Enterprise Identity: Directories, Federation, SSO
- Cloud Identity Providers: Rise of Managed Platforms
- Open Identity and the Ory Generation
- What Defines a Modern Identity System
- Comparing Open and Managed Identity Platforms
- Deep Dive: Ory’s Modular Architecture
- Design and Governance: Source Availability as Trust
- Evaluating Build vs. Buy: Open Systems
- Why Most Teams Fail Building Login Internally
- Compliance: NIST and OWASP
- Economic and Operational Analysis
- Migration Stories to Modern Platforms
- Continuous Identity and Zero Trust
- AI and Agentic Authentication
- Decentralized Identity and Verifiable Credentials
- Open Identity Ecosystems: Ory Architectural Blueprint
- The Convergence of Open Source and SaaS Identity
“Can’t we add login this sprint?” a PM asks. The team nods—two fields and a button seem harmless. Weeks later, a security review, a compliance questionnaire, and a token leak post‑mortem have everyone’s attention. The form was simple; the system never was. The difference between a demo and a durable implementation isn’t pixels; it’s state, time, and trust. Users want to get in. You need to know who they are, keep the wrong people out, remember that choice across devices, and change your mind quickly when something goes wrong.
Authentication looks simple from the outside: a form with two fields, a button, and a redirect. Users bring this mental model from countless websites and mobile apps. Developers, under pressure to ship, often adopt the same model—until login becomes the most failure‑prone, business‑critical component of the stack. This chapter reframes authentication as infrastructure: stateful, distributed, and regulated.
Two realities collide in login. First, identity is probabilistic in practice: the best systems combine strong authenticators (e.g., WebAuthn), contextual signals (device posture, location, history), and human‑centered recovery. Second, the browser is a hostile intermediary: scripts, extensions, referrers, and cross‑site rules conspire to leak or block your best‑laid plans unless you design defensively. Treating authentication as a product rather than a form changes priorities from “make it work” to “make it resilient and observable.”
Resilience begins with modeling the actors (user, client, identity provider, resource server, support), artifacts (credentials, factors, tokens, cookies, proofs), and states (pending challenges, active sessions, consent grants). With that model in hand, you can align UX and security: reduce friction where risk is low (remembered devices, session refresh), add friction where risk concentrates (elevations, new devices, unusual context), and make recovery at least as strong as login. Observability closes the loop: structured audit events for attempts, decisions, token issuance, and revocation turn anecdotes into data and enable fast incident response.
The industry’s hard‑won lesson is that the UI is the tip of the iceberg. Under the surface lie standards (OAuth 2.0, OIDC, SAML), platform realities (ITP/ETP cookie restrictions, mobile deep‑link quirks), legal obligations (GDPR, SOC 2, ISO 27001), and operations (rotation, invalidation, incident drills). The good news is that today’s open, modular architectures let you adopt battle‑tested components without losing control.
The login form is an orchestration surface over multiple subsystems:
- Credential verification: Passwords, WebAuthn, OTP, magic links.
- Session establishment: Cookies, device binding, refresh strategy.
- Consent and authorization bootstrap: Scopes, resource audiences.
- Recovery and elevation: Password reset, step‑up MFA.
- Audit, telemetry, and risk signals: IP reputation, device history, anomalies.
Each subsystem has independent failure modes. A password change that does not invalidate active sessions is a confidentiality flaw. A session that outlives its cookie (due to clock skew or SameSite interactions) becomes a reliability problem. A recovery flow that binds to email but not to an established authenticator invites account takeover. Engineering login as a system means designing explicit state transitions and invariants for each part, then testing those invariants under load and failure.
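One of the invariants above, "a credential change revokes every active session", can be made explicit in code. The following is a minimal sketch, assuming a hypothetical in-memory `SessionStore`; the type and function names are illustrative and are not taken from any particular library.

```go
package main

import "sync"

// SessionStore is a hypothetical in-memory store used only to
// illustrate the invariant; a real system would use a database.
type SessionStore struct {
	mu       sync.Mutex
	sessions map[string]string // session ID -> user ID
}

func NewSessionStore() *SessionStore {
	return &SessionStore{sessions: make(map[string]string)}
}

// Create registers an active session for a user.
func (s *SessionStore) Create(sessionID, userID string) {
	s.mu.Lock()
	defer s.mu.Unlock()
	s.sessions[sessionID] = userID
}

// Valid reports whether the session is still active.
func (s *SessionStore) Valid(sessionID string) bool {
	s.mu.Lock()
	defer s.mu.Unlock()
	_, ok := s.sessions[sessionID]
	return ok
}

// RevokeAll removes every session that belongs to the user and
// returns how many were removed.
func (s *SessionStore) RevokeAll(userID string) int {
	s.mu.Lock()
	defer s.mu.Unlock()
	n := 0
	for id, owner := range s.sessions {
		if owner == userID {
			delete(s.sessions, id)
			n++
		}
	}
	return n
}

// ChangePassword runs the caller-supplied credential update and then
// revokes all of the user's sessions, so the invariant holds by construction.
func ChangePassword(store *SessionStore, userID string, update func() error) error {
	if err := update(); err != nil {
		return err
	}
	store.RevokeAll(userID)
	return nil
}
```

Routing every credential change through one function like `ChangePassword` gives the invariant a single place to test under load and failure.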
Before writing code, write down what you are defending against:
- Network attacker: Controls the path between browser and server.
- Malicious client: Modified SPA/native app observes or tampers with state.
- Phishing site: Imitates your login to harvest credentials and tokens.
- Stolen device: Active session with weak local protections.
- Insider: Database read access but no key material.
Countermeasures map cleanly to threats: TLS plus HSTS and strict transport; PKCE and proof‑key binding for authorization codes; phishing‑resistant authenticators (WebAuthn); short‑lived access tokens with rotation and reuse detection for refresh tokens; encrypted, HttpOnly, SameSite‑aware cookies; key management with separation of duties and hardware‑backed storage where available.
Browser             App (RP)                    IdP (AS/OP)       API (RS)
| GET /login        |                           |                 |
|------------------>|                           |                 |
| 302 to /authorize with client_id, scope, redirect_uri, state, code_challenge
|<------------------|                           |                 |
| GET /authorize    |                           |                 |
|---------------------------------------------->|                 |
| Authenticate user (password/WebAuthn/MFA)     |                 |
| 302 to redirect_uri?code=...&state=...        |                 |
|<----------------------------------------------|                 |
| GET /callback?code=...&state=...              |                 |
|------------------>|                           |                 |
| POST /token (code, code_verifier)             |                 |
|---------------------------------------------->|                 |
| { access_token, id_token, refresh_token }     |                 |
|<----------------------------------------------|                 |
| Set session cookie; store refresh server-side or rotate tightly |
|                   |                           |                 |
| GET /resource with bearer access_token        |                 |
|---------------------------------------------------------------->|
|                   |                           | 200 OK          |
|<----------------------------------------------------------------|
The diagram hides important choices: whether the access token is an opaque handle validated via introspection or a JWT validated locally; whether the session is bound to a device via cookies with strong attributes; whether refresh tokens rotate with reuse detection; and whether the RP stores tokens at all or uses back‑channel calls.
Modern identity systems separate concerns cleanly:
- Identity Service (e.g., Ory Kratos): Manages accounts and flows.
- Authorization Server (e.g., Ory Hydra): Issues OAuth 2.0/OIDC tokens.
- Policy System (e.g., Ory Keto): Answers fine‑grained authorization.
- Edge Proxy (e.g., Ory Oathkeeper): Authenticates requests.
- The “username + password” UI hides a distributed, stateful, and regulated system.
- Session lifecycle, recovery, and risk controls dominate implementation complexity.
- Compliance frameworks turn implicit assumptions into explicit, auditable requirements.
- Treat authentication as infrastructure: modular, observable, and standards‑based.
- Design recovery with equal or higher assurance than login.
- Prefer phishing‑resistant factors and short‑lived tokens with rotation.
- Measure everything: issuance, reuse denials, revocations, error codes.
- NIST SP 800-63B: Digital Identity Guidelines — Authentication and Lifecycle Management
- OWASP Cheat Sheets: Authentication, Session Management
- RFC 8252: OAuth 2.0 for Native Apps
- RFC 7009: OAuth 2.0 Token Revocation
- RFC 7662: OAuth 2.0 Token Introspection
- RFC 7636: Proof Key for Code Exchange (PKCE)
- OWASP ASVS: Application Security Verification Standard
Chapter 02: Hidden Layers of Authentication Complexity
Customer Support forwards a report: “I reset my password and my old session still worked.” Another says, “I changed my phone number and lost access to my account.” Both are symptoms of a single cause: unexamined complexity. Each story touches a different subsystem—sessions, recovery, MFA, attributes—but the user experiences it as “login is broken.”
Authentication is commonly confused with authorization and identity management writ large. Untangling these concepts is the first step toward systems that are reliable, secure, and comprehensible. This chapter is a map of the terrain—UX trade‑offs, operations, compliance, and trust boundaries—so you can choose where to spend friction and where to automate.
A useful way to navigate the terrain is to treat friction like a budget. Spend it where it buys you the most risk reduction: on privilege elevations, on new or unusual devices, when signals suggest a break from the user’s norms, and around operations with outsized consequences (wire transfers, role changes, consent modifications). Conversely, save friction where confidence is already high: stable devices, short session renewals, and low‑risk reads.
Operationally, authentication is a system of queues, caches, and rate limits long before it is a UI. The login endpoint is just one of many hot spots—recovery endpoints, token exchanges, and introspection APIs are frequent targets for abuse and deserve separate controls and dashboards.
- Authentication: Establishing that the actor is who they claim to be, to a given assurance level.
- Authorization: Deciding whether an authenticated actor may perform an action on a resource.
- Identity lifecycle: Creating, updating, linking, suspending, and deleting identities and authenticators.
Separating these concerns prevents security work from being submerged by product changes. Treat identity lifecycle as its own product with APIs and events. Treat authorization policy as code with versioning and tests. Treat authentication as risk management at the edges of your system.
User Device ──(browser/app)──> Edge/Proxy ──> App Backend ──> Identity/Token Service
| |
| └──> Audit/Events
└──> Policy Decision Point
Each arrow crosses a trust boundary. At the outermost edge, assume manipulation of headers, origins, and timing. Between the app backend and identity service, assume honest‑but‑curious insiders and defend with least privilege, mTLS, and narrow scopes. Emit events at the boundary crossings so you can reconstruct state transitions when incidents happen.
- Password reset does not revoke sessions $\rightarrow$ Old devices retain access.
- MFA enrollment without re‑authentication $\rightarrow$ Attacker adds a factor.
- Token audiences misconfigured $\rightarrow$ Token valid at unintended resource.
- Long‑lived refresh tokens without rotation $\rightarrow$ Theft grants durable access.
- Cookie attributes mis‑set (`SameSite=None` without `Secure`) $\rightarrow$ Leakage on cross‑site requests.
- Consent not bound to client and scope $\rightarrow$ Grants reused by other clients.
Standards like NIST SP 800‑63 and frameworks like SOC 2 and ISO 27001 do not write your architecture, but they do define auditable outcomes: strong authenticators for high‑risk actions, evidence of consent, key management processes, and incident response. Use these requirements early to drive design.
- Separate authentication, authorization, and lifecycle to reduce complexity.
- Balance friction by adapting to risk and user context.
- Operational excellence (rate limits, rotation, observability) is core to security.
- Document and harden trust boundaries; federation shifts, not removes, risk.
- Treat recovery and elevation as first‑class flows with re‑authentication.
- Prefer back‑channel token exchanges; keep lifetimes short and scopes narrow.
- GDPR (EU): Official regulation text and guidance
- SOC 2 (AICPA) and ISO 27001: Control frameworks
- NIST SP 800-63 (A/B/C): Assurance levels and federation
- RFC 9700: Best Current Practice for OAuth 2.0 Security
- NIST SP 800-207: Zero Trust Architecture
An integration demo fails on stage: “redirect_uri mismatch.” The problem wasn’t the code; it was the contract. Standards prevent such gotchas, but only when implemented precisely. This chapter translates protocol language into engineering habits.
Standards exist because bespoke integrations do not scale. Identity standards form a stack:
- OAuth 2.0: Defines authorization delegation and token issuance.
- OpenID Connect (OIDC): Standardizes authentication on top of OAuth 2.0.
- JOSE (JWS/JWE/JWK): Specifies signing, encryption, and key representation.
- Discovery documents: Publish metadata and JWKs for dynamic configuration.
When reading a spec, translate nouns and verbs into interfaces and invariants: a “client” is your relying party app; an “authorization server” is your token service; a “resource server” is your API.
OAuth 2.0 is a delegation framework. The core specifications are RFC 6749 and RFC 6750. In modern deployments, the Authorization Code flow with PKCE (RFC 7636) is the default for browsers and native apps. PKCE mitigates code interception on untrusted channels.
Client AS/Authorization Server RS/Resource Server
| /authorize (client_id, redirect_uri, code_challenge, state, scope)
|----------------------------------------------------->|
| authenticate user (out of band) |
| 302 redirect_uri?code=...&state=... |
|<-----------------------------------------------------|
| /token (code, code_verifier, client_auth if conf.) |
|----------------------------------------------------->|
| { access_token, refresh_token?, expires_in } |
|<-----------------------------------------------------|
| GET /resource Authorization: Bearer access_token |
|--------------------------------------------------------------->|
| 200 OK |
|<---------------------------------------------------------------|
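The `code_verifier` and `code_challenge` exchanged in the flow above can be derived with the standard library alone. This is a minimal sketch of the S256 method from RFC 7636; the function names `NewVerifier` and `ChallengeS256` are illustrative.

```go
package main

import (
	"crypto/rand"
	"crypto/sha256"
	"encoding/base64"
)

// NewVerifier returns a random code_verifier. 32 random bytes encode to
// 43 base64url characters, the minimum length RFC 7636 allows.
func NewVerifier() (string, error) {
	b := make([]byte, 32)
	if _, err := rand.Read(b); err != nil {
		return "", err
	}
	return base64.RawURLEncoding.EncodeToString(b), nil
}

// ChallengeS256 derives the code_challenge sent to /authorize:
// BASE64URL(SHA256(ASCII(code_verifier))), without padding.
func ChallengeS256(verifier string) string {
	sum := sha256.Sum256([]byte(verifier))
	return base64.RawURLEncoding.EncodeToString(sum[:])
}
```

The client keeps the verifier private until the `/token` call, so an attacker who intercepts only the authorization code cannot redeem it.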
Implementation Tips:
- Register exact redirect URIs per environment; avoid wildcards.
- Keep access tokens short‑lived; rotate refresh tokens with reuse detection.
- Limit audiences explicitly (RFC 8707) and validate at the RS.
- Introspect opaque tokens. If using JWTs, pin algorithms and audiences and cache JWKs safely with short TTLs.
OIDC layers authentication on OAuth 2.0. The ID token communicates the subject (sub) and claims, usually including aud (intended audience), iss (issuer), exp (expiry), and a nonce.
RP/Client OP (AS/IdP)
| /authorize scope=openid ... state, nonce
|------------------------------------------->|
| (authenticate user) |
| 302 redirect?code=...&state=... |
|<-------------------------------------------|
| /token (code, code_verifier) |
|------------------------------------------->|
| { id_token (JWT), access_token } |
|<-------------------------------------------|
| /userinfo Authorization: Bearer ... |
|------------------------------------------->|
| { claims: email, name, ... } |
|<-------------------------------------------|
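After the ID token's signature has been verified, the relying party still has to check the claims against what it expected when it started the flow. The sketch below covers only those claim checks and leaves signature verification to a JOSE library. It assumes a single-string `aud` (OIDC also permits an array), and the type and function names are hypothetical.

```go
package main

import "errors"

// IDClaims holds the subset of ID token claims checked here.
// Assumption: "aud" is a single string; OIDC also allows an array.
type IDClaims struct {
	Iss   string
	Sub   string
	Aud   string
	Exp   int64 // expiry, seconds since the Unix epoch
	Nonce string
}

// Expected carries the values the relying party fixed before redirecting
// the user: its issuer, its own client ID, and the nonce it generated.
type Expected struct {
	Issuer   string
	ClientID string
	Nonce    string
}

// ValidateClaims checks issuer, audience, expiry, nonce, and subject.
// nowUnix is passed in so the check is deterministic and testable.
func ValidateClaims(c IDClaims, want Expected, nowUnix int64) error {
	switch {
	case c.Iss != want.Issuer:
		return errors.New("unexpected issuer")
	case c.Aud != want.ClientID:
		return errors.New("token was not issued for this client")
	case nowUnix >= c.Exp:
		return errors.New("token expired")
	case want.Nonce == "" || c.Nonce != want.Nonce:
		return errors.New("nonce mismatch")
	case c.Sub == "":
		return errors.New("missing subject")
	}
	return nil
}
```

A production check would usually also allow a small clock-skew tolerance on `exp`; that is omitted here to keep the sketch short.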
Example ID Token:
{
  "iss": "https://auth.example.com",
  "sub": "pseu-3a9f...",
  "aud": "app-web",
  "exp": 1731422400,
  "iat": 1731421800,
  "nonce": "n-0S6_WzA2Mj",
  "acr": "aal2",
  "amr": ["pwd", "otp"],
  "azp": "app-web"
}

SAML 2.0 powers enterprise web SSO via XML assertions. It excels at browser‑centric federation. Implementation care is essential: validate XML signatures over the whole assertion (to guard against signature wrapping attacks) and favor POST binding to avoid leaking assertions in URLs.
XACML externalizes authorization decisions. A PDP (Policy Decision Point) evaluates policies and returns decisions to a PEP (Policy Enforcement Point). While powerful, XACML introduces operational complexity regarding latency and caching.
- Additional Flows: Device Authorization (RFC 8628) for input‑constrained devices. Proof‑of‑possession (DPoP/mTLS) to reduce replay.
- Cross-Protocol Guidance: Disable the `none` algorithm. Enforce PKCE. Automate metadata refresh.
- Logout: Distinguish between front‑channel (fragile, relies on browser) and back‑channel (reliable, server-to-server) logout.
- Interop: Build a compatibility matrix. Test against real providers with contract suites.
- OAuth 2.0 enables delegated access; OIDC adds authentication semantics.
- SAML powers enterprise web SSO but is browser‑centric and XML‑heavy.
- Interoperability falters at optionality; favor proven profiles (Auth Code + PKCE).
“Everything worked in staging.” Production has more browsers, more devices, more proxies, and more humans. Specs are necessary; battle‑tested defaults are sufficient. This chapter is a field guide to what breaks and how to make it boring.
Authorization servers must match registered redirect URIs exactly. Broad patterns (wildcards) create open redirects.
- Mobile: Prefer OS‑level association (iOS Universal Links, Android App Links) over custom schemes.
- Loopback: Bind to `127.0.0.1` with a random high port for native apps.
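Exact matching with a narrow loopback exception can be sketched as follows. The port exception follows RFC 8252, which requires authorization servers to accept any port on loopback IP redirects; the function names are illustrative, and only the IPv4 loopback address is handled.

```go
package main

import "net/url"

// RedirectAllowed reports whether candidate matches a registered redirect
// URI. Matching is exact string equality, with one exception: for plain
// http redirects to 127.0.0.1 the port may differ (RFC 8252, native apps).
func RedirectAllowed(registered []string, candidate string) bool {
	for _, r := range registered {
		if r == candidate || loopbackMatch(r, candidate) {
			return true
		}
	}
	return false
}

// loopbackMatch compares two loopback redirect URIs while ignoring the port.
func loopbackMatch(registered, candidate string) bool {
	ru, err1 := url.Parse(registered)
	cu, err2 := url.Parse(candidate)
	if err1 != nil || err2 != nil {
		return false
	}
	if ru.Scheme != "http" || cu.Scheme != "http" {
		return false
	}
	if ru.Hostname() != "127.0.0.1" || cu.Hostname() != "127.0.0.1" {
		return false
	}
	// Everything except the port must be identical; userinfo and
	// fragments are rejected outright.
	return ru.Path == cu.Path &&
		ru.RawQuery == cu.RawQuery &&
		cu.Fragment == "" &&
		cu.User == nil
}
```

There is deliberately no prefix, suffix, or wildcard comparison anywhere in the function, which is what keeps it from becoming an open redirect.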
Tokens leak through front‑channel exposure.
- Cookies: `Secure; HttpOnly; SameSite=Lax` is the sound default. Use `SameSite=None` only with TLS and strictly for cross-site flows.
- Storage: Avoid `localStorage` for high-value tokens (XSS risk).
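With Go's `net/http`, the cookie defaults above might be set as in the sketch below. The cookie name `__Host-session` and the helper name are assumptions for illustration; the `__Host-` prefix asks browsers to accept the cookie only when it is `Secure`, has `Path=/`, and carries no `Domain` attribute.

```go
package main

import "net/http"

// NewSessionCookie builds a session cookie with the attributes discussed
// above. The name "__Host-session" is an illustrative choice.
func NewSessionCookie(sessionID string, maxAgeSeconds int) *http.Cookie {
	return &http.Cookie{
		Name:     "__Host-session",
		Value:    sessionID,
		Path:     "/",
		MaxAge:   maxAgeSeconds,
		Secure:   true,                 // sent over TLS only
		HttpOnly: true,                 // not readable from JavaScript
		SameSite: http.SameSiteLaxMode, // withheld on most cross-site requests
	}
}
```

In a handler this would be applied with `http.SetCookie(w, NewSessionCookie(id, 3600))`, where `id` is an opaque, server-generated session identifier.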
Consent prompts often suffer from scope creep.
- Best Practice: Use purpose‑based, human‑readable scope descriptions.
- Audit: Persist the exact scope set and text presented at consent time.
Revocation is hard with stateless JWTs.
- Refresh Tokens: Use rotation (issue new RT on use) with reuse detection. If an old RT is used, revoke the entire token family (implies theft).
- JWTs: Use short lifetimes. For immediate revocation, use a lightweight denylist keyed by `jti`.
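Refresh token rotation with reuse detection can be sketched with a small in-memory store. Everything here is hypothetical (the `Rotator` type, its methods, and the error values); a real deployment would persist hashed tokens in a database rather than raw tokens in a map.

```go
package main

import (
	"crypto/rand"
	"encoding/hex"
	"errors"
	"sync"
)

var (
	ErrReuse   = errors.New("refresh token reuse detected; family revoked")
	ErrUnknown = errors.New("unknown or revoked refresh token")
)

type tokenState struct {
	family string // all tokens descended from one login share a family
	used   bool   // set once the token has been rotated
}

// Rotator is an illustrative in-memory store.
type Rotator struct {
	mu      sync.Mutex
	tokens  map[string]*tokenState
	revoked map[string]bool // family ID -> revoked
}

func NewRotator() *Rotator {
	return &Rotator{tokens: map[string]*tokenState{}, revoked: map[string]bool{}}
}

func randomID() (string, error) {
	b := make([]byte, 16)
	if _, err := rand.Read(b); err != nil {
		return "", err
	}
	return hex.EncodeToString(b), nil
}

// Issue starts a new token family, for example after an interactive login.
func (r *Rotator) Issue() (string, error) {
	family, err := randomID()
	if err != nil {
		return "", err
	}
	tok, err := randomID()
	if err != nil {
		return "", err
	}
	r.mu.Lock()
	defer r.mu.Unlock()
	r.tokens[tok] = &tokenState{family: family}
	return tok, nil
}

// Rotate exchanges a refresh token for a new one. Presenting a token that
// was already rotated revokes the whole family.
func (r *Rotator) Rotate(old string) (string, error) {
	r.mu.Lock()
	defer r.mu.Unlock()
	st, ok := r.tokens[old]
	if !ok || r.revoked[st.family] {
		return "", ErrUnknown
	}
	if st.used {
		r.revoked[st.family] = true
		return "", ErrReuse
	}
	next, err := randomID()
	if err != nil {
		return "", err
	}
	st.used = true
	r.tokens[next] = &tokenState{family: st.family}
	return next, nil
}
```

After a reuse is detected, even the newest token in the family is rejected, which forces both the legitimate user and the thief back through interactive login.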
OAuth 2.0 is a framework; providers fill gaps differently.
- Strategy: Create a “canary” test app that runs regularly against staging tenants to alert on provider drift (e.g., deprecated flows, parameter changes).
- Token Exfiltration: `localStorage` + XSS → Fix: HttpOnly cookies.
- Recovery Bypass: Long-lived, unscoped reset links → Fix: Short-lived, device-bound links.
- Stale JWKs: Reverse-proxy cache holding old keys → Fix: Cache headers aligned to provider TTLs.
- Small misconfigurations cascade into systemic failures.
- Ambiguity in OAuth 2.0 requires provider‑specific testing.
- Build with revocation, rotation, and strict redirect handling from day one.
In the early days of a university UNIX lab, a student sysadmin copies `/etc/passwd` to audit accounts from home. The next morning, the cluster hums with password‑cracking attempts. No firewall failed; the mistake was architectural: secrets were readable to everyone.
Early UNIX stored hashes in world-readable /etc/passwd. The move to /etc/shadow restricted access. Web apps repeated these mistakes: fast hashes (MD5), unsalted storage, and weak reset flows.
- Lesson: Keep verifiers secret, make offline guessing expensive, and treat recovery as a target.
Fast hashes allow billions of guesses per second. Use adaptive Key Derivation Functions (KDFs).
- Standard: Argon2id is the modern default (memory-hard). Fallback: bcrypt or scrypt.
- Tuning: Target ~50–200ms verification latency on your hardware.
- Salts: Unique, random salt per user.
Example (Go, Argon2id):
import (
	"crypto/rand"
	"encoding/base64"
	"fmt"
	"golang.org/x/crypto/argon2"
)

// Params holds the Argon2id cost settings (Memory is in KiB).
type Params struct{ Memory, Iterations, SaltLen, KeyLen uint32; Parallelism uint8 }

func Hash(password string, p Params) (string, error) {
	salt := make([]byte, p.SaltLen)
	if _, err := rand.Read(salt); err != nil { return "", err }
	key := argon2.IDKey([]byte(password), salt, p.Iterations, p.Memory, p.Parallelism, p.KeyLen)
	// Store as: $argon2id$v=19$m=...,t=...,p=...$<salt>$<key>
	return fmt.Sprintf("$argon2id$v=19$m=%d,t=%d,p=%d$%s$%s", p.Memory, p.Iterations, p.Parallelism,
		base64.RawStdEncoding.EncodeToString(salt), base64.RawStdEncoding.EncodeToString(key)), nil
}

Recovery pathways should meet or exceed the assurance of login.
- Design: Single‑use, short‑lived links.
- Security: Invalidate sessions on credential change.
- Guardrails: Rate‑limit initiation per user/IP. Require re‑authentication for factor removal.
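The "single-use, short-lived" properties can be enforced by the token store itself. Below is a minimal sketch, assuming a hypothetical in-memory `RecoveryStore`; it keeps only a SHA-256 digest of each token, so a leaked store does not reveal usable links.

```go
package main

import (
	"crypto/rand"
	"crypto/sha256"
	"encoding/base64"
	"sync"
)

type recoveryEntry struct {
	userID    string
	expiresAt int64 // Unix seconds
}

// RecoveryStore is an illustrative in-memory store keyed by token digest.
type RecoveryStore struct {
	mu      sync.Mutex
	entries map[[32]byte]recoveryEntry
}

func NewRecoveryStore() *RecoveryStore {
	return &RecoveryStore{entries: make(map[[32]byte]recoveryEntry)}
}

// Issue creates a token valid for ttlSeconds and returns it for embedding
// in the recovery link. Only its digest is stored.
func (s *RecoveryStore) Issue(userID string, now, ttlSeconds int64) (string, error) {
	raw := make([]byte, 32)
	if _, err := rand.Read(raw); err != nil {
		return "", err
	}
	token := base64.RawURLEncoding.EncodeToString(raw)
	s.mu.Lock()
	defer s.mu.Unlock()
	s.entries[sha256.Sum256([]byte(token))] = recoveryEntry{userID: userID, expiresAt: now + ttlSeconds}
	return token, nil
}

// Consume returns the user ID if the token exists and has not expired.
// The entry is deleted on any lookup hit, so a token works at most once.
func (s *RecoveryStore) Consume(token string, now int64) (string, bool) {
	key := sha256.Sum256([]byte(token))
	s.mu.Lock()
	defer s.mu.Unlock()
	e, ok := s.entries[key]
	if !ok {
		return "", false
	}
	delete(s.entries, key)
	if now >= e.expiresAt {
		return "", false
	}
	return e.userID, true
}
```

Rate limiting and session invalidation from the list above would sit around this store; they are not shown here.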
PAM (from the UNIX world) introduced the concept of modular stacks for auth. Modern identity systems (like Ory) apply this architecturally: composing flows from reusable, policy-driven modules rather than hard-coding logic.
- Store password verifiers with an adaptive KDF: Argon2id (memory‑hard) by default, bcrypt or scrypt as a fallback.
- Treat recovery as a high‑value target with strong verification.
- Prefer composition‑light, breach‑aware password policies (NIST).
A new employee logs into a Windows laptop on day one. No separate password for email, calendar, or intranet—everything “just works.” Behind the scenes, a domain controller issued Kerberos tickets and SSO stitched applications together.
Directories centralize identity data. LDAP provides the protocol; Active Directory (AD) adds Kerberos and Group Policy.
- Structure: Hierarchical (`inetOrgPerson`, `groupOfNames`).
- Risk: Nested group expansion can cause $O(N^2)$ lookup issues.
Kerberos authenticates clients to services using time‑bound tickets, avoiding password transmission.
- Flow: TGT (Ticket-Granting Ticket) $\rightarrow$ Service Ticket.
- Critical: Time synchronization (NTP) is mandatory.
SAML 2.0 moves authentication to a central IdP, using signed XML assertions.
- Bridging: Many organizations now bridge SAML to OIDC. An internal broker terminates SAML and issues OIDC tokens to modern apps.
- RBAC (Role-Based): Coarse, simple to audit.
- ABAC (Attribute-Based): Expressive, complex governance.
- ReBAC (Relationship-Based): Models graphs (e.g., "friend of owner").
- Directories and Kerberos centralized enterprise identity.
- Federation (SAML) enabled web SSO across boundaries.
- Modernization involves placing identity-aware proxies in front of legacy apps.
Your startup’s login system buckles under a viral launch. A single dashboard toggle in a managed identity provider adds rate limiting and adaptive MFA overnight. You ship features again the next morning—and start a spreadsheet tracking what you just delegated.
- Okta/Auth0: Enterprise focus, extensive catalogs, developer tools.
- Azure AD (Entra ID): Default for Microsoft ecosystems.
- AWS Cognito: AWS-native, tight IAM integration.
- Trade-offs: Convenience vs. limits (token size, rate limits, regional availability).
Managed flows accelerate delivery but create "one-way doors" (e.g., proprietary risk scores, proprietary hook logic).
- Exit Strategy: Ensure you can export password hashes (in standard formats) and factor seeds. Use standard OIDC claims where possible.
- Control Plane: Managed (config, keys).
- Data Plane: Self-hosted (PEPs, PDPs, Identity Stores).
- Example: Use a managed control plane (like Ory Network) to issue tokens, but run the enforcement proxy (Oathkeeper) in your VPC for latency and sovereignty.
- Managed platforms accelerate the shift to OAuth/OIDC.
- Convenience comes with trade‑offs in control and data governance.
- Hybrid adoption patterns reduce lock‑in.
An engineering team faces a security review. “How do we know your login system handles key rotation correctly?” With open components, they point auditors to code, tests, and release notes. The conversation shifts from “trust us” to “verify with us.”
Open identity emphasizes transparency (source availability), composability (modular components), and portability (run anywhere). This allows for audits, public fixes, and reducing single-vendor risk.
Ory represents a modular, API-first architecture:
- Kratos: Identity management (login, registration, recovery). Headless, secure defaults (Argon2id).
- Hydra: OAuth 2.0/OIDC server. Issues tokens, delegates consent.
- Keto: Authorization (ReBAC/Zanzibar-style). High-speed permission checks.
- Oathkeeper: Identity & Access Proxy (IAP). Authenticates requests at the edge.
- Ory: Modular, open-core, API parity between SaaS and self-hosted.
- Keycloak: All-in-one, Java-based, widely deployed for self-hosting.
- SaaS (Auth0/Okta): Full-featured, closed source logic.
A minimal Kratos config defines endpoints, schemas, and flows.
dsn: postgres://kratos:secret@db/kratos
selfservice:
  flows:
    login:
      ui_url: https://id.example.com/ui/login
hashers:
  argon2:
    config:
      memory: 65536
      iterations: 2

- Open identity systems deliver composability and auditability.
- Ory’s modular design separates identity, token service, policy, and edge enforcement.
- Self‑hosting is feasible; SaaS options with matching APIs enable hybrid patterns.
Modern systems explicitly define assurance levels (IAL/AAL). They use phishing-resistant authenticators (WebAuthn) and secure token flows (Code + PKCE).
Real interop goes beyond checkboxes. It requires conformance tests, algorithm agility, and correct handling of metadata (Discovery, JWKS).
Identity is on the hot path.
- Caching: Cache JWKS and introspection results carefully.
- Architecture: Token issuance is CPU-bound (signing).
- Hooks: Webhooks pre/post registration or login.
- Policy: Externalized policy engines (Keto/OPA).
Governance builds trust via structured audit logs, proper role separation (RBAC for admins), and rigorous key management (rotation, break-glass procedures).
- Define assurance explicitly (IAL/AAL/FAL).
- Favor Authorization Code + PKCE and sender constraints.
- Externalize policy and emit rich audits.
| Dimension | Ory (OSS/SaaS) | Okta (SaaS) | Auth0 (SaaS) | Keycloak (OSS) |
|---|---|---|---|---|
| OIDC | Yes | Yes | Yes | Yes |
| Deployment | Hybrid | SaaS | SaaS | Self-host |
| Portability | High | Medium | Medium | High |
| Extensibility | High (APIs) | High | High (Actions) | Medium (SPIs) |
Evaluate not just features, but the health of the ecosystem. Look for transparent deprecation policies, active security advisories, and SDK maintenance.
Run a time-boxed Proof-of-Concept (POC). Integrate one SPA and one API. Measure integration effort, latency, and audit coverage.
- Evaluate on security, interop, performance, extensibility, and governance.
- Plan an exit strategy on day one.
- Open‑core enables hybrid control/data planes without lock‑in.
Hydra is designed to be "headless." It handles the crypto and protocol complexity but delegates the User Interface (Login/Consent) to your application.
- Hardening: Enforce PKCE for public clients. Use `client_secret_basic` or `private_key_jwt` for confidential clients.
Kratos manages the user lifecycle as a state machine.
- Schema: JSON Schema definitions for user traits (`email`, `name`).
- Flows: Resumable flows for registration, login, and recovery.
Keto implements Zanzibar-style relation tuples (ReBAC).
- Model: `namespace:object#relation@subject`.
- Use Case: "Is User X a viewer of Document Y?" (Checks graph reachability).
A proxy that sits at the edge. It takes incoming requests, authenticates them (validates tokens/sessions), authorizes them (calls Keto), and mutates headers (injects X-User-ID) for downstream services.
- Stateless: Run services behind load balancers.
- State: Store durable state in SQL (Postgres/MySQL).
- Caching: Cache JWKS and introspection at the edge.
- Separate identity flows, token service, policy, and PEP.
- Keep components stateless where possible.
- Push identity to the edge via Oathkeeper.
Code transparency allows independent review of cryptography and error paths. It changes trust from a black-box promise into a verifiable property.
Adopt modern practices:
- SLSA: Provenance for build pipelines.
- SBOMs: Software Bill of Materials (SPDX/CycloneDX).
- Signing: Signed releases and container images.
Implement Admin RBAC. Ensure separation of duties (e.g., the person approving key rotation cannot execute it).
- Source availability improves auditability and speed of fixes.
- Publish threat models and event schemas for external review.
- Make evidence generation routine (logs, exports).
Budget for:
- People: Engineering, SRE, Security.
- Infra: Compute, database, egress.
- Hidden: Email deliverability, fraud tooling, compliance audits.
- Build (Custom): High flexibility, huge maintenance tail, risk of spec drift.
- Buy (SaaS): Fast time to value, recurring platform fees, potential lock-in.
- Open Core / Hybrid: Retains control and portability, externalizes protocol burden.
- Requirements: Users, protocols, compliance.
- Risks: Threat model, vendor concentration.
- Portability: Can we export hashes and factors?
- Governance: Audit trails, admin RBAC.
- Identity is infrastructure; model TCO including people and risk.
- Managed platforms require exit plans and contracts.
- Hybrids balance sovereignty with velocity.
- Browser-accessible tokens: Storing tokens in `localStorage` (XSS risk).
- Missing rotation: Infinite-life tokens.
- Weak recovery: Email-only reset without device binding.
- Wildcards: Broad redirect URIs (`*.example.com`).
Common patterns: A browser update tightens SameSite rules, breaking logout. A third-party widget introduces XSS, stealing tokens from local storage.
A two-week shortcut (skipping rotation) costs months later in incident response and re-engineering.
- Authorization Code + PKCE.
- Secure, HttpOnly Cookies.
- Short-lived tokens with rotation.
- Structured audits.
- Design invariants and telemetry up front.
- Re‑authentication and elevation must be explicit.
- Avoid implicit trust in browsers.
- IAL: Identity Assurance (Proofing).
- AAL: Authenticator Assurance (Login strength).
- FAL: Federation Assurance (Assertion protection).
- Guidance: Use phishing-resistant authenticators (AAL2/3).
Follow the ASVS (Application Security Verification Standard) for session management, credential storage, and logging.
Map technical capabilities to controls.
- Requirement: "Audit all auth events." $\rightarrow$ Capability: Structured event streams (JSON).
- Requirement: "Revoke on breach." $\rightarrow$ Capability: API-driven session invalidation.
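A structured event stream starts with a stable event shape. The following sketch shows one possible record; the field names and the event type string are assumptions for illustration, not a schema defined by any standard or product.

```go
package main

import "encoding/json"

// AuditEvent is an illustrative shape for an authentication audit record.
// Subject carries a pseudonymous identifier rather than an email address,
// which keeps the log useful as evidence without adding personal data.
type AuditEvent struct {
	Type      string `json:"type"`
	Subject   string `json:"subject,omitempty"`
	ClientID  string `json:"client_id,omitempty"`
	Outcome   string `json:"outcome"`
	Timestamp int64  `json:"ts"` // Unix seconds
}

// Encode renders the event as a single JSON line for an event stream.
func Encode(e AuditEvent) (string, error) {
	b, err := json.Marshal(e)
	if err != nil {
		return "", err
	}
	return string(b), nil
}
```

Because the fields are typed and named, an auditor's question such as "show all denied logins for this client" becomes a query instead of a log-grepping exercise.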
- Map NIST and OWASP controls to concrete platform features.
- Treat logs as evidence: structured and privacy-aware.
- Tie logout/revocation to assurance policies.
- JWTs: Good for edge/API caching. Harder to revoke immediately.
- Introspection: Centralized control, higher latency.
- Sessions: Simple for browsers, stateful.
Keep enforcement local (cached keys/policies). Centralize writes.
- Targets: Login availability $\ge$ 99.95%. P95 issuance < 200 ms.
Quantify the cost of downtime (revenue/min) vs. the cost of a security breach (reputation, fines).
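As a worked example of what a 99.95% availability target permits, assume a 30-day month (an illustrative simplification). The downtime budget is:

$$
(1 - 0.9995) \times 30 \times 24 \times 60\ \text{min} = 0.0005 \times 43{,}200\ \text{min} = 21.6\ \text{min}
$$

If login downtime costs $r$ in revenue per minute, exhausting that budget costs roughly $21.6\,r$ per month, a figure that can be set against the cost of additional redundancy.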
- Budget identity like payments.
- Prefer back‑channel flows and short‑lived tokens.
- Quantify outage costs to justify resilience.
A product-led team migrates to Ory to retain control.
- Strategy: "Verify-on-login" re-hashing (migrate users lazily as they log in).
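Lazy migration can be sketched as a login function that verifies against the legacy scheme and, on success, re-hashes with the current one. The `Hasher`, `User`, and `Login` names are hypothetical; the two hashers are injected as functions so the sketch stays independent of any specific algorithm or library.

```go
package main

import "errors"

// Hasher bundles the two operations of a password hashing scheme.
// Both functions are supplied by the caller.
type Hasher struct {
	Hash   func(password string) (string, error)
	Verify func(password, encoded string) bool
}

// User holds the stored verifier and whether it still uses the old scheme.
type User struct {
	PasswordHash string
	Legacy       bool
}

var ErrInvalid = errors.New("invalid credentials")

// Login verifies the password. If the user still has a legacy hash and the
// password is correct, the hash is upgraded in place.
func Login(u *User, password string, legacy, current Hasher) error {
	if !u.Legacy {
		if current.Verify(password, u.PasswordHash) {
			return nil
		}
		return ErrInvalid
	}
	if !legacy.Verify(password, u.PasswordHash) {
		return ErrInvalid
	}
	upgraded, err := current.Hash(password)
	if err != nil {
		// The login itself was valid; the upgrade is retried next time.
		return nil
	}
	u.PasswordHash, u.Legacy = upgraded, false
	return nil
}
```

Users who never log in again keep their legacy hash, so a migration plan typically also sets a date after which remaining legacy accounts must go through recovery.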
A team moves to Auth0/Okta for enterprise SSO features.
- Key: Mapping directory attributes to OIDC claims strictly.
- Discovery: Inventory data and flows.
- POC: Prove one app/API integration.
- Import: Migrate hashes and factors.
- Cut-over: Use cohort flags (canary users) and have a rollback plan.
- Phase migrations with cohort flags.
- Normalize schemas early.
- Maintain dual‑write/dual‑read toggles until stabilization.
Zero Trust replaces implicit network trust with continuous evaluation. Identity and device are the new perimeter.
Adapt friction based on risk.
- Policy: If `action == transfer_funds`, require `aal2` (MFA).
Client → PEP (Oathkeeper) → App → PDP (Keto) → Decision
↘ (step‑up) IdP ↗
Trigger step-up authentication via the IdP when the PDP denies a request due to low assurance.
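The step-up rule can be expressed as a three-way decision rather than a plain allow/deny. This is a minimal sketch under stated assumptions: the `requiredAAL` table, the action names, and the convention that unlisted actions need `aal1` are all illustrative.

```go
package main

// Decision is the outcome the enforcement point acts on.
type Decision int

const (
	Allow  Decision = iota // proceed with the request
	StepUp                 // send the user to the IdP for stronger authentication
	Deny                   // no usable authentication context
)

// requiredAAL maps actions to the minimum assurance level they need.
// The entries are hypothetical; unlisted actions default to level 1.
var requiredAAL = map[string]int{
	"transfer_funds": 2,
}

// aalRank converts an acr value such as "aal2" to a comparable rank.
// Unknown or empty values rank 0.
func aalRank(acr string) int {
	switch acr {
	case "aal1":
		return 1
	case "aal2":
		return 2
	case "aal3":
		return 3
	}
	return 0
}

// Decide compares the session's assurance level with the action's
// requirement and asks for step-up when the session falls short.
func Decide(action, acr string) Decision {
	have := aalRank(acr)
	if have == 0 {
		return Deny
	}
	need, ok := requiredAAL[action]
	if !ok {
		need = 1
	}
	if have >= need {
		return Allow
	}
	return StepUp
}
```

Returning `StepUp` as a distinct outcome lets the enforcement point redirect to the IdP instead of showing an error, which keeps the added friction limited to the risky action.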
- Evaluate continuously.
- Keep PEP/PDP decisions fast and auditable.
- Emit AAL/AMR claims at the edge.
AI agents and workloads are first-class identities. They need credentials, lifecycles, and least privilege.
- Flow: Client Credentials.
- Token Exchange (RFC 8693): For "on-behalf-of" flows.
- Constraints: Use RAR (Rich Authorization Requests) and DPoP (Proof of Possession).
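For the delegation case, a Token Exchange request is an ordinary form-encoded POST to the token endpoint. The sketch below builds the form body using the parameter names and URNs from RFC 8693; the helper name and its arguments are illustrative, and it assumes both input tokens are access tokens.

```go
package main

import "net/url"

// TokenExchangeForm builds the body of an RFC 8693 token exchange request
// in which an agent (the actor) asks for a token to act on behalf of a
// user (the subject).
func TokenExchangeForm(userToken, agentToken, audience, scope string) url.Values {
	v := url.Values{}
	v.Set("grant_type", "urn:ietf:params:oauth:grant-type:token-exchange")
	v.Set("subject_token", userToken)
	v.Set("subject_token_type", "urn:ietf:params:oauth:token-type:access_token")
	v.Set("actor_token", agentToken)
	v.Set("actor_token_type", "urn:ietf:params:oauth:token-type:access_token")
	v.Set("audience", audience) // the service the new token is meant for
	v.Set("scope", scope)       // request no more than the task needs
	return v
}
```

The result of `TokenExchangeForm(...).Encode()` would be sent with `Content-Type: application/x-www-form-urlencoded`, together with the client's own authentication. Requesting a narrow `audience` and `scope` is what keeps the agent's token least-privileged.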
Model delegation in your authorization graph: `agent:tool-42#acts_on_behalf_of@user:alice`.
- Human-in-the-loop: Require approval for destructive scopes.
- Budgets: Rate limits and cost ceilings enforced at the PEP.
- Treat agents as first‑class identities.
- Use Token Exchange for delegation.
- Prefer attested workload IDs (SPIFFE) over static secrets.
- DID (Decentralized Identifier): Resolves to a document with keys.
- VC (Verifiable Credential): Issuer-signed claims.
Use OIDC for the core session backbone. Use VCs for specific high-value attribute proofs (e.g., "Over 21", "Alumni").
- Bridge: Validate the VC at the edge and inject standard claims into the request context for the PDP.
- DIDs/VCs enable selective disclosure.
- Use OIDC4VCI/4VP to bridge into OIDC ecosystems.
- Start with narrow, high‑value proofs.
Open IAM is modular, standards-aligned, source-available, and API-first.
Clients (Web/Mobile/CLI)
↓
Edge PEP (Oathkeeper) → PDP (Keto) → Decision
↓
Services ← identity headers/claims
↓
Identity (Kratos) Token Service (Hydra)
This layering ensures components can be swapped or scaled independently.
- Modularity and standards define open IAM.
- Treat observability as a first‑class layer.
The future is hybrid: Managed control planes (for ease of use) driving self-hosted data planes (for sovereignty and latency).
Core identity is commoditizing (like Linux). Differentiation is in UX, analytics, and fraud protection.
- Requirements: Data export guarantees, BYOK/HSM support, and clear deprecation timelines.
- Strategy: Enforce API compatibility tests between self-hosted and SaaS modes to ensure portability.
- Value shifts to UX and analytics.
- Hybrid control/data planes balance sovereignty and velocity.
- Contract for portability.
(Selected)
- AAL: Authenticator Assurance Level.
- Grant: Method by which a client obtains tokens.
- Introspection: Endpoint for validating opaque tokens.
- PEP: Policy Enforcement Point.
- PKCE: Proof Key for Code Exchange.
- ReBAC: Relationship-Based Access Control.
- Subject (`sub`): Token claim identifying the principal.
(See Chapter text for ASCII representations of flows like OAuth Code, OIDC Logout, and Zero Trust Evaluation).
A matrix to score providers (Ory, Okta, Auth0, etc.) on Authentication, OAuth/OIDC compliance, Federation, Sessions, Authorization, and Portability.
A phased guide:
- Discovery
- Proof of Concept
- Integration
- Data Import (Hashing/Factors)
- Cut-over (Cohort flags)
- Stabilization
A structure for modeling auth flows: Scope, Assets, Actors, Entry Points, Abuse Cases (STRIDE), Invariants, and Controls.
A guide to SameSite behavior, ITP/ETP restrictions, and Deep Link handling across Chrome, Safari, Firefox, and Mobile OSs.
curl commands for validating OIDC discovery, PKCE flows, Introspection, and Revocation.
# Example: Discovery
curl -s $ISSUER/.well-known/openid-configuration | jq .