Skip to content

Instantly share code, notes, and snippets.

@aw-junaid
Created February 28, 2026 12:13
Show Gist options
  • Select an option

  • Save aw-junaid/10727a2574c460649c05f9e7daa00259 to your computer and use it in GitHub Desktop.

Select an option

Save aw-junaid/10727a2574c460649c05f9e7daa00259 to your computer and use it in GitHub Desktop.

MODERN CRYPTOGRAPHY: THEORY, ALGORITHMS, PROTOCOLS, AND PRACTICE

Foundations, Post-Quantum Security, Blockchain, Secure Systems & Applied Cryptanalysis

PART I — FOUNDATIONS OF CRYPTOGRAPHY


CHAPTER 1: INTRODUCTION TO CRYPTOGRAPHY

1.1 The Essence of Cryptography

Cryptography, derived from the Greek words kryptos (hidden) and graphein (to write), is the science and art of secure communication in the presence of adversaries. At its core, cryptography provides mechanisms to protect information from unauthorized access, modification, or fabrication. In the modern digital age, cryptography has evolved from a specialized military tool to an indispensable component of our technological infrastructure, securing everything from financial transactions and medical records to social media communications and national security systems.

The fundamental problem that cryptography addresses can be stated simply: How can two parties communicate securely over an insecure channel where an eavesdropper may intercept their messages? This problem, known as secure communication, extends beyond mere confidentiality to encompass integrity (ensuring messages haven't been tampered with), authentication (verifying the identities of communicating parties), and non-repudiation (preventing denial of sent messages).

1.2 A Journey Through Cryptographic History

1.2.1 Ancient Cryptography

The earliest known cryptographic techniques date back to ancient civilizations. The Egyptians used non-standard hieroglyphs in inscriptions dating to 1900 BCE, representing one of the earliest attempts at obscuring written communication. However, these were more about creating an air of mystery than true security.

The Spartan Scytale (c. 500 BCE) : The Spartans, known for their military prowess, developed the scytale (pronounced "skit-uh-lee"), a simple transposition cipher. The system consisted of a cylindrical staff of specific diameter around which a strip of parchment was wrapped. The message was written across the wrapped strip, then unwound, leaving seemingly random characters. The recipient would rewrap the strip around a cylinder of identical diameter to read the message. This represents one of the earliest known military cryptographic devices.

The Caesar Cipher (c. 50 BCE) : Julius Caesar used a simple substitution cipher where each letter in his military messages was shifted three positions down the alphabet. Thus, "A" became "D," "B" became "E," and so on. While trivially breakable by modern standards, it was effective against illiterate adversaries and those unfamiliar with the technique. Suetonius, a Roman historian, documented that "If he had anything confidential to say, he wrote it in cipher, that is, by changing the order of the letters of the alphabet, so that not a word could be made out."

1.2.2 The Golden Age of Cryptography: The Arab World

Between the 8th and 14th centuries, the Islamic Golden Age witnessed remarkable advances in cryptography. The Arabs were pioneers in both cryptanalysis (code-breaking) and cryptography.

Al-Kindi (801-873 CE) : Often called the "philosopher of the Arabs," Al-Kindi wrote the influential manuscript "A Manuscript on Deciphering Cryptographic Messages." In this work, he described the first known use of frequency analysis—a technique that would dominate cryptanalysis for the next millennium. He wrote: "One way to solve an encrypted message, if we know its language, is to find a different plaintext of the same language long enough to fill one sheet or so, and then we count the occurrences of each letter. ... Then we look at the ciphertext we want to solve and mark its symbols. Then we replace each symbol with the letter whose frequency count matches that of the cipher symbol."

Ibn Adlan (1187-1268) : Made significant contributions by developing methods for determining the language of a ciphertext and providing tables of letter frequencies for Arabic words.

1.2.3 Renaissance and Early Modern Cryptography

Leon Battista Alberti (1404-1472) : Often called the "Father of Western Cryptology," Alberti invented the polyalphabetic cipher and the cipher wheel. His 1466 treatise "De Cifris" introduced the concept of using multiple alphabets for encryption, a revolutionary idea that would resist cryptanalysis for centuries.

Johannes Trithemius (1462-1516) : Wrote the first printed book on cryptography, "Polygraphiae," and invented the tabula recta, a square table of alphabets used in polyalphabetic ciphers. He also introduced steganography—hiding messages within other messages.

Blaise de Vigenère (1523-1596) : Developed what became known as the Vigenère cipher, though it was actually based on the work of Alberti and Trithemius. The Vigenère cipher uses a keyword to select different shift amounts for each letter, creating a polyalphabetic substitution that was considered unbreakable for over 300 years—earning it the nickname "le chiffre indéchiffrable" (the indecipherable cipher).

1.2.4 The Mechanical Age: World War II Cryptography

The Enigma Machine : Perhaps the most famous cryptographic device in history, the Enigma machine was used by Nazi Germany during World War II. Invented by Arthur Scherbius at the end of World War I, it was an electro-mechanical rotor cipher machine. Each key press would illuminate a letter on a lampboard, but the path would change based on the rotors' positions. With 3-5 rotors, each offering 26 positions, and a plugboard providing additional substitutions, the Enigma had an astronomical number of possible settings (approximately 10^114 possible keys).

The breaking of Enigma at Bletchley Park, led by Alan Turing, Gordon Welchman, and others, represents one of the greatest intellectual achievements of the 20th century. Turing's Bombe machines exploited operational weaknesses and known plaintext attacks (weather reports, stereotyped greetings) to reduce the search space dramatically. Some historians argue that breaking Enigma shortened the war by two years and saved millions of lives.

The SIGABA : The American counterpart to Enigma, the SIGABA (also called ECM Mark II) was used by the US Army and Navy throughout WWII. Unlike Enigma, SIGABA remained unbroken throughout the war due to its more complex rotor movement mechanism.

The Lorenz Cipher : The Germans' high-level teleprinter cipher, broken by British cryptanalysts including Bill Tutte and Tommy Flowers, who built Colossus—the world's first programmable electronic digital computer—to assist in the cryptanalysis.

1.2.5 The Cold War and Information Theory

Claude Shannon (1916-2001) : Shannon's 1949 paper "Communication Theory of Secrecy Systems" laid the mathematical foundations of modern cryptography. Working at Bell Labs, Shannon introduced fundamental concepts including confusion (making the relationship between key and ciphertext complex) and diffusion (spreading plaintext redundancy throughout ciphertext). His work established cryptography as a rigorous mathematical discipline.

The Data Encryption Standard (DES) : In 1973, the National Bureau of Standards (now NIST) issued a public request for a cryptographic algorithm to become a national standard. IBM submitted an algorithm based on their Lucifer cipher, developed by Horst Feistel. After modifications by the NSA, DES was adopted as a federal standard in 1977. DES used a 56-bit key, which was controversial—critics argued the NSA had weakened it intentionally. For over two decades, DES served as the global standard for symmetric encryption, though by the late 1990s, its key length became vulnerable to brute-force attacks.

1.2.6 The Public Key Revolution

The Problem of Key Distribution : Traditional (symmetric) cryptography required communicating parties to share a secret key through a secure channel. This created a chicken-and-egg problem: how do you establish a secure channel without already having one?

Whitfield Diffie, Martin Hellman, and Ralph Merkle : In 1976, Diffie and Hellman published "New Directions in Cryptography," introducing the revolutionary concept of public-key cryptography. They proposed a system where encryption and decryption keys were different, allowing users to publish their encryption key while keeping their decryption key private. The Diffie-Hellman key exchange protocol solved the key distribution problem by allowing two parties to establish a shared secret over an insecure channel.

RSA (Rivest-Shamir-Adleman) : In 1977, Ron Rivest, Adi Shamir, and Leonard Adleman at MIT developed the first practical public-key cryptosystem, RSA, based on the difficulty of factoring large numbers. RSA enabled both encryption and digital signatures, becoming the foundation of modern internet security.

1.3 Cryptography vs. Information Security

While cryptography is a critical component of information security, the two terms are not synonymous. Information security encompasses all measures taken to protect information, including:

  • Physical security: Locks, guards, secure facilities
  • Administrative controls: Policies, procedures, training
  • Technical controls: Firewalls, access control systems, intrusion detection
  • Cryptography: Mathematical techniques for protecting information

Cryptography provides the mathematical underpinnings for many security mechanisms, but it must be implemented within a comprehensive security framework. A mathematically perfect cryptographic algorithm can be rendered useless by poor key management, implementation flaws, or social engineering attacks.

1.4 The Five Pillars of Information Security

Modern information security rests on five fundamental goals, often remembered by the acronym CIANA:

1. Confidentiality: Ensuring that information is accessible only to authorized parties. Confidentiality is perhaps the most intuitive security goal—when you send a message, you want only the intended recipient to read it. Cryptography achieves confidentiality through encryption algorithms that transform plaintext into ciphertext unintelligible to eavesdroppers.

2. Integrity: Guaranteeing that information has not been modified in transit or storage. Even if an adversary cannot read a message, they might be able to alter it. Cryptographic hash functions and message authentication codes provide integrity protection by creating a "fingerprint" of the data that changes if the data is modified.

3. Authentication: Verifying the identity of communicating parties. Before sharing sensitive information, you need to know who you're talking to. Authentication mechanisms, including digital signatures and challenge-response protocols, prevent impersonation attacks.

4. Non-repudiation: Preventing an entity from denying previous actions or commitments. In digital transactions, it's crucial that parties cannot later claim they didn't send a message or authorize a payment. Digital signatures provide non-repudiation by creating unforgeable evidence of a sender's identity.

5. Availability: Ensuring that information and services are accessible when needed. While often overlooked in cryptography discussions, availability is critical—denial-of-service attacks can render systems useless even if they're perfectly secure in other respects.

1.5 Threat Models and Adversary Types

Understanding potential attackers is essential for designing appropriate cryptographic protections. Threat models categorize adversaries based on their capabilities and objectives.

1.5.1 Adversary Capabilities

Passive Attacks: The adversary merely observes communications without interfering. Passive attacks threaten confidentiality but not integrity or availability.

  • Eavesdropping on network traffic
  • Traffic analysis (observing communication patterns)
  • Monitoring electromagnetic emissions

Active Attacks: The adversary can intercept, modify, or inject messages. Active attacks threaten all security goals.

  • Man-in-the-middle attacks
  • Replay attacks
  • Message modification
  • Denial of service

1.5.2 Adversary Categories

Casual Attackers: Individuals with limited resources and motivation, such as curious hackers or disgruntled employees. They typically use publicly available tools and exploit known vulnerabilities.

Criminal Organizations: Well-funded groups seeking financial gain through theft, fraud, or ransomware. They may employ sophisticated techniques and invest significant resources in attacks.

State-Sponsored Actors: The most powerful adversaries, with virtually unlimited resources and access to advanced capabilities. They conduct espionage, sabotage, and influence operations. Protecting against such adversaries requires state-of-the-art cryptography and careful operational security.

Insiders: Authorized users who may misuse their access, either maliciously or accidentally. Insider threats are particularly dangerous because they bypass perimeter defenses.

1.5.3 Attack Models in Cryptography

Ciphertext-Only Attack (COA) : The adversary has access only to intercepted ciphertexts. This is the weakest attack model, but the easiest to execute.

Known-Plaintext Attack (KPA) : The adversary has pairs of plaintext and corresponding ciphertext. This might occur when an encrypted message's plaintext can be guessed (e.g., standardized greetings).

Chosen-Plaintext Attack (CPA) : The adversary can obtain ciphertexts for arbitrary plaintexts of their choice. This models scenarios where the attacker can influence what gets encrypted.

Chosen-Ciphertext Attack (CCA) : The adversary can obtain plaintexts for arbitrary ciphertexts of their choice (except the target ciphertext). This is the strongest attack model and represents scenarios where the attacker has temporary access to a decryption oracle.

1.6 Kerckhoffs's Principle

Auguste Kerckhoffs, a Dutch linguist and cryptographer, published two important articles in 1883 titled "La Cryptographie Militaire." In these works, he laid out six design principles for military ciphers. The second principle has become fundamental to modern cryptography:

"The system must not require secrecy and can be stolen by the enemy without causing trouble."

This principle, known as Kerckhoffs's Principle, states that a cryptographic system should be secure even if everything about the system—except the key—is public knowledge. This contrasts with "security through obscurity," which relies on keeping the algorithm secret.

Why Kerckhoffs's Principle Matters:

  1. Scalability: If thousands of users need to communicate securely, they cannot all keep the algorithm secret. Only the keys need protection.

  2. Public Scrutiny: Making algorithms public allows cryptanalysts to examine them for weaknesses. Algorithms that survive years of public analysis inspire greater confidence than secret ones.

  3. Standardization: Open algorithms can be standardized, allowing interoperable implementations.

  4. Contingency: If a user's key is compromised, they can change it without changing the entire system.

1.7 Security by Design: A Proactive Approach

Security by design means incorporating security considerations throughout the development lifecycle, not as an afterthought. This principle applies to cryptographic systems as well as general software development.

Key Aspects of Security by Design:

1. Threat Modeling: Identify potential threats early in the design process. What assets need protection? Who might attack them? What attack vectors exist?

2. Defense in Depth: Layer multiple security controls so that if one fails, others still provide protection. For example, encrypt data at rest, in transit, and in use.

3. Least Privilege: Grant only the minimum necessary access rights. A process that needs to read a file shouldn't have write access.

4. Fail Secure: When failures occur, the system should fail into a secure state, not an insecure one. If authentication fails, deny access rather than granting it.

5. Secure Defaults: Default configurations should be secure. Users should have to deliberately weaken security if they need to.

6. Open Design: As per Kerckhoffs, security shouldn't depend on secrecy of design.

7. Psychological Acceptability: Security mechanisms should be easy to use correctly; otherwise, users will bypass them.

1.8 The Cryptographic Ecosystem

Modern cryptography involves multiple stakeholders:

Cryptographers: Design and analyze cryptographic algorithms and protocols. They publish their work in academic venues, subjecting it to peer review.

Cryptanalysts: Attempt to break cryptographic systems, identifying weaknesses and helping improve designs.

Standards Organizations: NIST, ISO, IETF, and others develop and publish cryptographic standards to ensure interoperability and promote best practices.

Implementers: Translate mathematical specifications into working code. This crucial step must balance security, performance, and usability.

Security Engineers: Integrate cryptographic components into larger systems, considering factors like key management, error handling, and user experience.

End Users: Ultimately rely on cryptography to protect their communications, transactions, and data.

1.9 The Scope of Modern Cryptography

While classical cryptography focused primarily on confidentiality, modern cryptography encompasses a vast range of functionalities:

  • Encryption: Symmetric and asymmetric
  • Hashing: Creating fixed-length fingerprints of arbitrary data
  • Digital Signatures: Providing authentication and non-repudiation
  • Key Exchange: Establishing shared secrets over insecure channels
  • Zero-Knowledge Proofs: Proving knowledge without revealing it
  • Secure Multiparty Computation: Joint computation without revealing inputs
  • Homomorphic Encryption: Computing on encrypted data
  • Post-Quantum Cryptography: Algorithms resistant to quantum attacks

1.10 Organization of This Book

This comprehensive treatment of modern cryptography is organized into twelve parts:

Part I (this section) provides foundational concepts and mathematical preliminaries.

Part II covers classical cryptography, providing historical context and introducing cryptanalytic techniques.

Parts III-VI explore the core areas of modern cryptography: symmetric encryption, public-key cryptography, elliptic curve cryptography, and digital signatures.

Part VII examines cryptographic protocols that enable secure communication and authentication.

Part VIII addresses the emerging field of post-quantum cryptography.

Parts IX-X cover applied cryptography and cryptanalysis.

Parts XI-XII delve into formal security proofs and advanced topics.

Each chapter includes theoretical foundations, practical implementations, security considerations, and real-world applications.


CHAPTER 2: MATHEMATICAL FOUNDATIONS

2.1 The Language of Cryptography

Cryptography is built on a foundation of mathematics. Understanding this foundation is essential for grasping how cryptographic algorithms work, why they're secure, and how they can be attacked. This chapter provides the mathematical background required for the rest of the book.

2.2 Set Theory and Functions

2.2.1 Sets

A set is a collection of distinct objects, considered as an object in its own right. Sets are fundamental to cryptography because they define the spaces we work with.

Basic Notation:

  • Elements are written in curly braces: {1, 2, 3}
  • Set membership: a ∈ A means "a is an element of set A"
  • Subset: A ⊆ B means every element of A is also in B
  • Union: A ∪ B contains all elements in either A or B
  • Intersection: A ∩ B contains elements in both A and B
  • Complement: Ā contains all elements not in A (relative to some universal set)

Important Sets in Cryptography:

  • ℕ = {1, 2, 3, ...} (natural numbers)
  • ℤ = {..., -2, -1, 0, 1, 2, ...} (integers)
  • ℚ (rational numbers)
  • ℝ (real numbers)
  • ℂ (complex numbers)
  • ℤ_n = {0, 1, 2, ..., n-1} (integers modulo n)
  • {0,1}^n (binary strings of length n)
  • {0,1}^* (binary strings of any finite length)

2.2.2 Functions

A function f: A → B maps each element of set A (the domain) to exactly one element of set B (the codomain).

Properties of Functions Important in Cryptography:

Injective (One-to-one) : Each element of the codomain is mapped to by at most one element of the domain. If f(a₁) = f(a₂), then a₁ = a₂.

Surjective (Onto) : Every element of the codomain is mapped to by at least one element of the domain. For every b ∈ B, there exists a ∈ A such that f(a) = b.

Bijective: Both injective and surjective. Bijective functions have inverses f⁻¹: B → A such that f⁻¹(f(a)) = a for all a ∈ A.

Trapdoor Functions: Functions that are easy to compute in one direction but difficult to invert without special information (the trapdoor). Public-key cryptography relies on trapdoor functions.

One-Way Functions: Functions that are easy to compute but difficult to invert (no trapdoor needed). Cryptographic hash functions should be one-way.

2.2.3 Permutations

A permutation is a bijective function from a set to itself. Permutations are fundamental to block ciphers, which essentially apply complex permutations to blocks of plaintext.

2.3 Number Theory

Number theory, the study of integers and their properties, forms the mathematical backbone of public-key cryptography.

2.3.1 Divisibility and Primes

Divisibility: For integers a and b, a divides b (written a|b) if there exists an integer k such that b = a·k.

Prime Numbers: A prime p is an integer greater than 1 whose only positive divisors are 1 and itself. Primes are the building blocks of number theory.

Composite Numbers: Integers greater than 1 that are not prime.

The Fundamental Theorem of Arithmetic: Every integer greater than 1 can be expressed uniquely as a product of primes (up to ordering). For example, 84 = 2² × 3 × 7.

Distribution of Primes: The prime number theorem states that the number of primes less than x, denoted π(x), is approximately x/ln(x). For cryptographic purposes, we need very large primes (typically 2048 bits or more for RSA, 256 bits for ECC).

2.3.2 Greatest Common Divisor

The greatest common divisor of integers a and b, written gcd(a,b), is the largest positive integer that divides both a and b.

Properties:

  • gcd(a,b) = gcd(b,a)
  • gcd(a,b) = gcd(|a|,|b|)
  • gcd(a,0) = |a|
  • gcd(a,b) divides any integer combination of a and b

Relatively Prime (Coprime) : Numbers a and b are relatively prime if gcd(a,b) = 1.

2.3.3 The Euclidean Algorithm

The Euclidean algorithm efficiently computes the greatest common divisor of two numbers without requiring factorization.

Algorithm: Given integers a ≥ b > 0:

  1. While b ≠ 0:
    • Set r = a mod b
    • Set a = b
    • Set b = r
  2. Return |a|

Example: Compute gcd(1071, 462):

  • 1071 = 2 × 462 + 147
  • 462 = 3 × 147 + 21
  • 147 = 7 × 21 + 0
  • Therefore, gcd(1071, 462) = 21

Extended Euclidean Algorithm: Finds integers x and y such that ax + by = gcd(a,b). This is crucial for computing modular inverses.

2.3.4 Modular Arithmetic

Modular arithmetic, sometimes called "clock arithmetic," deals with integers and a modulus, where numbers "wrap around" upon reaching the modulus.

Definition: For a positive integer n, we say a ≡ b (mod n) if n divides (a - b). The set of equivalence classes under this relation is ℤ_n.

Operations in ℤ_n:

  • Addition: (a + b) mod n
  • Subtraction: (a - b) mod n
  • Multiplication: (a × b) mod n
  • Division: a/b mod n exists only if b has a multiplicative inverse modulo n

Properties: ℤ_n with addition forms a group. ℤ_n with addition and multiplication forms a ring.

2.3.5 Modular Inverses

An element a ∈ ℤ_n has a multiplicative inverse a⁻¹ if a·a⁻¹ ≡ 1 (mod n). The inverse exists if and only if gcd(a,n) = 1.

Computing Modular Inverses: Use the extended Euclidean algorithm to find x and y such that ax + ny = 1. Then x mod n is the inverse of a modulo n.

2.3.6 Euler's Totient Function

Euler's totient function φ(n) counts the positive integers less than n that are relatively prime to n.

Properties:

  • If p is prime, φ(p) = p - 1
  • If p and q are distinct primes, φ(pq) = (p-1)(q-1)
  • φ is multiplicative: if gcd(m,n) = 1, then φ(mn) = φ(m)φ(n)

Euler's Theorem: If gcd(a,n) = 1, then a^φ(n) ≡ 1 (mod n).

Fermat's Little Theorem (special case of Euler's theorem when n is prime): If p is prime and a is not divisible by p, then a^(p-1) ≡ 1 (mod p).

These theorems are fundamental to RSA encryption and primality testing.

2.3.7 The Chinese Remainder Theorem

The Chinese Remainder Theorem (CRT) states that a system of congruences with pairwise coprime moduli has a unique solution modulo the product of the moduli.

Theorem: Let n₁, n₂, ..., nₖ be pairwise coprime positive integers. For any integers a₁, a₂, ..., aₖ, the system: x ≡ a₁ (mod n₁) x ≡ a₂ (mod n₂) ... x ≡ aₖ (mod nₖ) has a unique solution modulo N = n₁n₂...nₖ.

CRT in Cryptography:

  • Speeding up RSA decryption (by a factor of approximately 4)
  • Secret sharing schemes
  • Some cryptographic attacks

2.4 Abstract Algebra

Abstract algebra studies algebraic structures like groups, rings, and fields—sets equipped with operations satisfying specific axioms. These structures provide the framework for cryptographic algorithms.

2.4.1 Groups

A group (G, ∗) is a set G with a binary operation ∗ satisfying:

  1. Closure: For all a,b ∈ G, a∗b ∈ G
  2. Associativity: (a∗b)∗c = a∗(b∗c) for all a,b,c ∈ G
  3. Identity: There exists e ∈ G such that e∗a = a∗e = a for all a ∈ G
  4. Inverse: For each a ∈ G, there exists a⁻¹ ∈ G such that a∗a⁻¹ = a⁻¹∗a = e

Abelian Group: A group where the operation is commutative (a∗b = b∗a for all a,b).

Examples in Cryptography:

  • (ℤ_n, +) is an abelian group
  • (ℤ_n^×, ×) (the multiplicative group of units modulo n) is an abelian group
  • Elliptic curve points form an abelian group under addition

Group Order: The number of elements in a group, denoted |G|.

Subgroups: A subset H ⊆ G that is itself a group under the same operation.

Cyclic Groups: A group G is cyclic if there exists an element g ∈ G such that every element of G can be written as gⁿ for some integer n. Such an element g is called a generator.

Lagrange's Theorem: The order of any subgroup H of a finite group G divides the order of G.

2.4.2 Rings

A ring (R, +, ×) is a set with two operations satisfying:

  1. (R, +) is an abelian group (identity 0)
  2. Multiplication is associative
  3. Multiplication distributes over addition: a×(b+c) = a×b + a×c and (a+b)×c = a×c + b×c

Additional Properties:

  • Commutative ring: multiplication is commutative
  • Ring with unity: has multiplicative identity 1 (1 ≠ 0)
  • Integral domain: commutative ring with unity where if a×b = 0 then either a = 0 or b = 0

Example: ℤ_n with addition and multiplication modulo n is a commutative ring with unity.

2.4.3 Fields

A field (F, +, ×) is a commutative ring with unity where every non-zero element has a multiplicative inverse. Essentially, fields allow addition, subtraction, multiplication, and division (except by zero).

Examples:

  • ℚ (rational numbers)
  • ℝ (real numbers)
  • ℂ (complex numbers)
  • ℤ_p where p is prime (finite fields)
  • GF(2ⁿ) (extension fields)

Finite Fields: Fields with a finite number of elements are called Galois fields, denoted GF(q) where q = pⁿ for some prime p and positive integer n. Finite fields are fundamental to many cryptographic algorithms, including AES and elliptic curve cryptography.

2.4.4 Finite Fields GF(p) and GF(2ⁿ)

Prime Fields GF(p) : When the modulus is a prime p, ℤ_p forms a field. All arithmetic is performed modulo p. Used in RSA, Diffie-Hellman, and many other cryptosystems.

Binary Fields GF(2ⁿ) : Fields with 2ⁿ elements. Elements can be represented as polynomials of degree less than n with coefficients in {0,1}. Arithmetic is performed modulo an irreducible polynomial of degree n. Used in AES (n=8), elliptic curve cryptography, and some post-quantum schemes.

Advantages of GF(2ⁿ) : Addition is simply XOR, making hardware implementation efficient.

2.5 Polynomial Arithmetic

Polynomials over fields are important for several cryptographic applications, including finite field arithmetic and some post-quantum cryptosystems.

Polynomial Representation: A polynomial over field F is an expression of the form aₙxⁿ + aₙ₋₁xⁿ⁻¹ + ... + a₁x + a₀, where aᵢ ∈ F.

Polynomial Operations:

  • Addition: coefficient-wise addition
  • Multiplication: standard polynomial multiplication, possibly modulo another polynomial
  • Division: polynomial long division

Irreducible Polynomials: The polynomial equivalent of primes—cannot be factored into lower-degree polynomials over the field. Used to construct extension fields.

2.6 Linear Algebra in Cryptography

Linear algebra plays a role in classical ciphers (Hill cipher), cryptanalysis (linear cryptanalysis), and some modern cryptosystems.

Vectors and Matrices: Represented over various fields (ℝ, ℤ, GF(2), etc.).

Matrix Operations:

  • Addition and multiplication
  • Determinant and inverse

Linear Transformations: Functions that preserve vector addition and scalar multiplication.

The Hill Cipher: Uses a matrix as the encryption key. Each block of plaintext is treated as a vector, multiplied by the key matrix modulo 26 (or another modulus).

Applications in Cryptanalysis:

  • Solving systems of linear equations in cryptanalysis
  • Linear approximations in linear cryptanalysis

2.7 Probability Theory

Probability is essential for understanding randomness, analyzing cryptographic security, and evaluating attacks.

2.7.1 Basic Probability Concepts

Sample Space: The set of all possible outcomes. Event: A subset of the sample space. Probability Function: A function P assigning a number between 0 and 1 to each event, with P(sample space) = 1.

Conditional Probability: P(A|B) = P(A ∩ B) / P(B), assuming P(B) > 0.

Independence: Events A and B are independent if P(A ∩ B) = P(A)P(B).

2.7.2 Random Variables

A random variable is a function from the sample space to real numbers.

Discrete Random Variables: Take on countable values (e.g., integers). Continuous Random Variables: Take on uncountable values (e.g., real numbers).

Expectation: E[X] = Σ x·P(X = x) for discrete variables; ∫ x·f(x) dx for continuous variables.

Variance: Var(X) = E[(X - E[X])²] measures spread around the mean.

2.7.3 Entropy and Information Theory

Claude Shannon's information theory provides a mathematical framework for quantifying information and randomness.

Entropy H(X) : Measures the uncertainty associated with a random variable X. For a discrete random variable taking values in alphabet χ:

H(X) = -Σ_{x∈χ} P(x) log₂ P(x) bits

Properties:

  • H(X) ≥ 0, with equality if and only if X is deterministic
  • For a uniform distribution over n outcomes, H(X) = log₂ n (maximum entropy)
  • Entropy measures the average number of bits needed to represent X optimally

Conditional Entropy: H(X|Y) measures the remaining uncertainty about X given Y.

Mutual Information: I(X;Y) = H(X) - H(X|Y) measures how much information Y provides about X.

Applications in Cryptography:

  • Quantifying key strength (key entropy)
  • Analyzing cryptographic randomness
  • Side-channel attacks (information leakage)

2.7.4 Probability in Cryptanalysis

Many cryptanalytic techniques rely on probabilistic reasoning:

Differential Cryptanalysis: Analyzes how differences in plaintext affect differences in ciphertext, using probability distributions of differences.

Linear Cryptanalysis: Finds linear approximations of cipher components with probability biased away from 1/2.

Birthday Paradox: The probability that in a set of n randomly chosen elements from a set of size N, two are the same, becomes significant when n ≈ √N. This underlies birthday attacks on hash functions.

2.8 Computational Complexity

Computational complexity theory classifies problems by how difficult they are to solve, providing the foundation for cryptographic security.

2.8.1 Complexity Classes

P (Polynomial Time) : Problems solvable by a deterministic Turing machine in time O(nᵏ) for some constant k. Considered "efficiently solvable."

NP (Nondeterministic Polynomial Time) : Problems whose solutions can be verified in polynomial time. Most cryptographic problems are in NP (solutions are easy to verify but hard to find).

NP-Complete: The hardest problems in NP—if any NP-complete problem could be solved in polynomial time, then P = NP.

BPP (Bounded-error Probabilistic Polynomial Time) : Problems solvable by probabilistic algorithms with error probability ≤ 1/3.

Cryptographic Relevance: For cryptography to be possible, we need problems that are hard on average, not just in the worst case. We also need problems that remain hard even for adversaries with probabilistic polynomial-time algorithms.

2.8.2 Hardness Assumptions

Modern public-key cryptography rests on unproven assumptions about the computational difficulty of certain problems:

Integer Factorization Assumption: Given a composite integer N that is the product of two large primes, it is computationally infeasible to find those primes. (Underlies RSA security)

Discrete Logarithm Assumption: Given a cyclic group G, a generator g, and an element h = g^x, it is computationally infeasible to find x. (Underlies Diffie-Hellman, ElGamal)

Elliptic Curve Discrete Logarithm Assumption: The same assumption but with elliptic curve groups, where the problem appears harder for a given key size.

Learning With Errors (LWE) Assumption: The basis for many post-quantum cryptosystems, believed to be hard even for quantum computers.

2.8.3 Reductions

A reduction transforms one problem into another, showing that if the second problem can be solved, so can the first. Reductions are used to prove security:

Security Reduction: Show that if an adversary can break a cryptosystem, then they can solve a hard mathematical problem. This provides evidence that breaking the cryptosystem is at least as hard as solving the underlying problem.

Example: Breaking RSA signature generation without the private key would allow factoring the RSA modulus (or vice versa).

2.9 Asymptotic Notation

Cryptographers use asymptotic notation to describe algorithm efficiency and problem difficulty:

  • O(f(n)) : Upper bound—grows no faster than f(n) up to constant factors
  • Ω(f(n)) : Lower bound—grows at least as fast as f(n) up to constant factors
  • Θ(f(n)) : Tight bound—grows exactly like f(n) up to constant factors
  • o(f(n)) : Grows strictly slower than f(n)
  • ω(f(n)) : Grows strictly faster than f(n)

Polynomial Time: O(nᵏ) for some constant k. Considered efficient. Subexponential Time: e^{o(n)}—faster than exponential but slower than polynomial. Exponential Time: O(cⁿ) for c > 1. Considered infeasible for large n.

2.10 Concrete Security Parameters

While asymptotic analysis provides theoretical understanding, practical cryptography requires concrete parameters:

Security Level: Measured in bits. An algorithm with k-bit security requires approximately 2ᵏ operations to break.

Key Length Recommendations (as of 2024):

  • Symmetric encryption: 128-256 bits
  • RSA: 2048-4096 bits
  • Elliptic curve: 256-512 bits
  • Hash functions: 256-512 bits output

Moore's Law and Security: As computing power increases, security parameters must increase. The transition to post-quantum cryptography will require even larger parameters for some schemes.


PART II — CLASSICAL CRYPTOGRAPHY


CHAPTER 3: SUBSTITUTION CIPHERS

3.1 The Dawn of Secret Writing

Substitution ciphers, the oldest cryptographic technique, replace plaintext elements (letters, characters, or symbols) with ciphertext elements according to a fixed rule. Despite their historical significance and conceptual simplicity, understanding substitution ciphers provides crucial insights into cryptographic principles and vulnerabilities that persist in modern systems.

3.2 The Caesar Cipher

3.2.1 Description

The Caesar cipher, attributed to Julius Caesar, is the simplest and most famous substitution cipher. Each letter in the plaintext is shifted a fixed number of positions down the alphabet.

Mathematically: For a shift of k (where 1 ≤ k ≤ 25), encryption and decryption functions are:

Eₖ(p) = (p + k) mod 26 Dₖ(c) = (c - k) mod 26

where p and c are numerical representations of letters (A=0, B=1, ..., Z=25).

Example with k=3 (the traditional Caesar shift): Plaintext: ATTACK AT DAWN Ciphertext: DWWDFN DW GDZQ

3.2.2 Cryptanalysis

The Caesar cipher is trivially breakable:

Brute Force: Only 25 possible keys (shifts). Even manual testing is quick.

Frequency Analysis: Letter frequencies in the ciphertext will mirror those in English, making the shift obvious. The most common letter in English is 'E' (about 12.7% frequency). If the most common ciphertext letter is 'K', the shift is likely 6 (E → K).

3.2.3 Historical Note

Suetonius wrote: "If he had anything confidential to say, he wrote it in cipher, that is, by changing the order of the letters of the alphabet, so that not a word could be made out. If anyone wishes to decipher these, and get at their meaning, he must substitute the fourth letter of the alphabet, namely D, for A, and so with the others."

3.3 Monoalphabetic Substitution Cipher

3.3.1 General Description

A monoalphabetic substitution cipher replaces each letter with another according to an arbitrary permutation of the alphabet. The key is a complete mapping from plaintext letters to ciphertext letters.

Key Space: 26! ≈ 4 × 10²⁶ possible keys (approximately 88 bits). This is enormous compared to the Caesar cipher's 25 keys.

Example Mapping: Plain: A B C D E F G H I J K L M N O P Q R S T U V W X Y Z Cipher: Z Y X W V U T S R Q P O N M L K J I H G F E D C B A (This is the Atbash cipher, originally used for Hebrew)

3.3.2 Encryption Process

To encrypt "HELLO" with the above mapping: H → S E → V L → O L → O O → L Result: "SVOOL"

3.3.3 Cryptanalysis — Frequency Analysis

Despite the huge key space, monoalphabetic ciphers are vulnerable to frequency analysis. The technique, perfected by Al-Kindi in the 9th century, exploits statistical properties of language:

Letter Frequencies in English (approximate percentages):

  • E: 12.7%
  • T: 9.1%
  • A: 8.2%
  • O: 7.5%
  • I: 7.0%
  • N: 6.7%
  • S: 6.3%
  • H: 6.1%
  • R: 6.0%

Pattern Analysis:

  • Common digrams (letter pairs): TH, HE, IN, ER, AN, RE, ED, ON, ES, ST, EN, AT, TO, NT, HA, ND, OU, EA, NG, AL, OR, TI, AS, TE
  • Common trigrams: THE, AND, THA, ENT, ION, TIO, FOR, NDE, HAS, NCE

Method:

  1. Count frequencies of each ciphertext letter
  2. Match most frequent ciphertext letters to most frequent English letters
  3. Look for common words (especially short ones like "the", "and", "of")
  4. Use context to refine the mapping

3.3.4 The Kryptos Sculpture

The Kryptos sculpture at CIA headquarters contains four encrypted messages. The first three have been solved, but the fourth remains unbroken. Part of the first message used a monoalphabetic substitution, demonstrating that even simple ciphers can be challenging without context.

3.4 Playfair Cipher

3.4.1 Historical Context

Invented in 1854 by Charles Wheatstone, but promoted by Lord Playfair, the Playfair cipher was the first practical digraph substitution cipher. It was used by British forces in World War I and by the Australians in World War II.

3.4.2 Description

The Playfair cipher encrypts pairs of letters (digraphs) using a 5×5 grid of letters constructed from a keyword.

Key Square Construction:

  1. Choose a keyword (e.g., "PLAYFAIR")
  2. Remove duplicate letters: PLAYFIR
  3. Fill the grid row by row with the keyword letters, then fill remaining spaces with the alphabet (combining I and J into one cell)

For keyword "PLAYFAIR":

P L A Y F I R B C D E G H K M N O Q S T U V W X Z

Encryption Rules:

  1. Break plaintext into digraphs (pairs of letters)

  2. If both letters are the same, insert an 'X' between them (e.g., "LL" becomes "LX" "LX")

  3. For each digraph (let's call them A and B):

    a. Same row: Replace each letter with the letter to its right (wrapping around) b. Same column: Replace each letter with the letter below (wrapping around) c. Rectangle: Replace with letters in the same rows but swapped columns

Example: Encrypt "HELLO" with Playfair:

  • Break: HE LX LO
  • HE: H in row 3 col 3, E in row 3 col 1 — same row, so replace with letters to right: H→K (right of H), E→G (right of E) → KG
  • LX: L in row 1 col 2, X in row 5 col 4 — rectangle, so swap columns: L (col2) with X's col4 gives Z (row1 col4), X (col4) with L's col2 gives U (row5 col2) → ZU
  • LO: L in row 1 col 2, O in row 4 col 3 — rectangle: L→Q (row1 col3), O→N (row4 col2) → QN Ciphertext: KGZUQN

3.4.3 Cryptanalysis

The Playfair cipher is more secure than simple monoalphabetic substitution but still vulnerable:

Weaknesses:

  • Frequency analysis of digraphs reveals patterns
  • Known plaintext attacks are devastating
  • The structure of the key square imposes constraints
  • Modern computers can break Playfair automatically with hill climbing

3.5 Vigenère Cipher

3.5.1 The Indecipherable Cipher

The Vigenère cipher, misattributed to Blaise de Vigenère but actually invented by Giovan Battista Bellaso, was considered unbreakable for over 300 years—earning the nickname "le chiffre indéchiffrable."

3.5.2 Description

The Vigenère cipher uses a keyword to select different Caesar shifts for each letter, creating a polyalphabetic substitution.

Tabula Recta: A 26×26 grid where row i is the alphabet shifted by i positions.

Encryption: Cᵢ = (Pᵢ + Kᵢ) mod 26 where Kᵢ is the numeric value of the i-th letter of the keyword (repeated as needed)

Example: Plaintext: ATTACK AT DAWN Keyword: LEMON L EMONL Numeric: A(0) + L(11) = 11 → L T(19) + E(4) = 23 → X T(19) + M(12) = 31 mod 26 = 5 → F A(0) + O(14) = 14 → O C(2) + N(13) = 15 → P K(10) + L(11) = 21 → V A(0) + E(4) = 4 → E T(19) + M(12) = 31 mod 26 = 5 → F D(3) + O(14) = 17 → R A(0) + N(13) = 13 → N W(22) + L(11) = 33 mod 26 = 7 → H N(13) + E(4) = 17 → R

Ciphertext: LXFO PV EFRNHR

3.5.3 Security Features

The Vigenère cipher resists simple frequency analysis because the same plaintext letter can encrypt to different ciphertext letters depending on position. The frequency distribution of ciphertext is flatter than in monoalphabetic ciphers.

3.5.4 Cryptanalysis — Kasiski Examination

Friedrich Kasiski published a method for breaking the Vigenère cipher in 1863 (though Charles Babbage had discovered it earlier but kept it secret for military reasons).

Kasiski Examination:

  1. Find repeated patterns: Look for sequences of letters that appear multiple times in the ciphertext. These likely correspond to the same plaintext sequence encrypted with the same keyword segment.

  2. Measure distances: Calculate distances between repeated sequences. The keyword length likely divides these distances.

  3. Factor distances: Find the greatest common divisor of the distances between repeats—this suggests the keyword length.

Example: Ciphertext: ...XYZ... (distance 12) ...XYZ... (distance 18) GCD of 12 and 18 is 6 → keyword likely 6 letters

Once keyword length (n) is known:

  • Separate ciphertext into n columns, each encrypted with a single Caesar shift
  • Perform frequency analysis on each column independently
  • Recover the keyword letter for each column

3.5.5 Index of Coincidence

William Friedman developed the index of coincidence (IC) as a more sophisticated method for determining keyword length:

IC = Σ (fᵢ × (fᵢ - 1)) / (N × (N - 1)) where fᵢ is frequency of letter i, N is total letters

Interpretation:

  • Random text: IC ≈ 0.038
  • English text: IC ≈ 0.066
  • Vigenère ciphertext's IC falls between these values depending on keyword length

3.6 Hill Cipher

3.6.1 Description

Invented by Lester S. Hill in 1929, the Hill cipher was the first polygraphic cipher to use linear algebra. It encrypts blocks of letters simultaneously using matrix multiplication.

Mathematical Formulation: Let m be the block size. Choose an m×m invertible matrix K (the key) modulo 26.

Encryption: C = K × P mod 26 Decryption: P = K⁻¹ × C mod 26

where P and C are column vectors of length m representing plaintext and ciphertext blocks.

Example (m=2): Let K = [3 3; 2 5] (first row: 3,3; second row: 2,5) det(K) = 3×5 - 3×2 = 15 - 6 = 9 9 mod 26 has inverse 3 (since 9×3 = 27 ≡ 1 mod 26) K⁻¹ = 3 × [5 -3; -2 3] mod 26 = [15 -9; -6 9] mod 26 = [15 17; 20 9]

Encrypt "HELP": H=7, E=4 → block [7;4] C = K × [7;4] = [3×7+3×4; 2×7+5×4] = [21+12; 14+20] = [33;34] mod 26 = [7;8] → "HI" L=11, P=15 → block [11;15] C = [3×11+3×15; 2×11+5×15] = [33+45; 22+75] = [78;97] mod 26 = [0;19] → "AT" Ciphertext: "HIAT"

3.6.2 Security Considerations

Strengths:

  • Hides letter frequencies (each ciphertext letter depends on multiple plaintext letters)
  • Can use large block sizes

Weaknesses:

  • Linear — vulnerable to known-plaintext attacks
  • Requires invertible key matrix (not all matrices are invertible modulo 26)
  • Deterministic — same plaintext block always produces same ciphertext

3.6.3 Cryptanalysis

Known-Plaintext Attack: With m² plaintext-ciphertext letter pairs (for block size m), we can solve for the key matrix:

C = K × P mod 26 If P is invertible, K = C × P⁻¹ mod 26

Ciphertext-Only Attack: Possible using frequency analysis of n-grams (groups of n letters), though more difficult than for simple substitution.

3.7 Lessons from Classical Ciphers

Classical substitution ciphers teach fundamental lessons that remain relevant:

1. Key Space Is Not Enough: Monoalphabetic ciphers have huge key spaces but are easily broken. Security requires more than just many possible keys.

2. Statistical Patterns Are Deadly: Languages have statistical regularities that leak through encryption. Modern ciphers must eliminate these patterns.

3. Confusion and Diffusion Matter: Claude Shannon identified two essential properties:

  • Confusion: The relationship between key and ciphertext should be complex
  • Diffusion: Changes in plaintext should spread throughout ciphertext

4. Kerckhoffs's Principle Holds: All these ciphers assume algorithm secrecy. Once the method is known (as it inevitably becomes), only key secrecy matters.

5. Automation Changes Everything: Ciphers that resisted human cryptanalysts for centuries fall quickly to computer analysis.


CHAPTER 4: CLASSICAL CRYPTANALYSIS

4.1 The Art of Codebreaking

Cryptanalysis, the science of breaking cryptographic systems, is as old as cryptography itself. Understanding historical cryptanalysis provides insight into the weaknesses that modern cryptography must avoid and introduces analytical techniques still relevant today.

4.2 Frequency Analysis

4.2.1 Theoretical Foundation

Frequency analysis, first documented by Al-Kindi in the 9th century, exploits the non-uniform distribution of letters in natural languages. In any language, some letters appear more frequently than others, and these patterns persist through simple substitution.

4.2.2 Letter Frequencies

English Letter Frequencies (detailed percentages):

Letter Frequency Letter Frequency Letter Frequency
E 12.70% D 4.25% P 1.93%
T 9.06% L 4.03% B 1.49%
A 8.17% C 3.78% V 0.98%
O 7.51% U 3.61% K 0.77%
I 6.97% M 3.01% J 0.15%
N 6.75% W 2.56% X 0.15%
S 6.33% F 2.43% Q 0.10%
H 6.09% G 2.00% Z 0.07%
R 5.99% Y 2.00%

4.2.3 Digram and Trigram Frequencies

Letter pairs and triples provide additional statistical leverage:

Most Common Digrams: TH (3.88%), HE (3.68%), IN (2.28%), ER (2.17%), AN (2.14%), RE (1.75%), ED (1.72%), ON (1.72%), ES (1.67%), ST (1.60%), EN (1.56%), AT (1.51%), TO (1.49%), NT (1.45%), HA (1.42%), ND (1.41%), OU (1.39%), EA (1.32%), NG (1.32%), AL (1.22%)

Most Common Trigrams: THE (3.51%), AND (1.59%), THA (0.88%), ENT (0.79%), ION (0.79%), TIO (0.75%), FOR (0.71%), NDE (0.70%), HAS (0.66%), NCE (0.65%)

4.2.4 Practical Application

Breaking a monoalphabetic substitution:

  1. Count frequencies in ciphertext
  2. Hypothesize matches between frequent ciphertext letters and common English letters (especially 'E')
  3. Look for common words — if you see a three-letter ciphertext group repeating often, try "THE"
  4. Test hypotheses by substituting guessed letters
  5. Refine using context — words must make sense

Example: Ciphertext: "XJY XJY DWWF XJY VJQ HTRFXY" Assuming "XJY" is "THE", then X→T, J→H, Y→E Pattern emerges: "THE THE ... THE ..." → likely "THE THE CAT THE ..."

4.2.5 Limitations

Frequency analysis requires sufficient ciphertext length. Very short messages may not provide reliable statistics, and messages deliberately constructed to have flat frequency distributions can resist simple analysis.

4.3 Kasiski Examination

4.3.1 Breaking Polyalphabetic Ciphers

Friedrich Kasiski's 1863 method for breaking Vigenère cipher was a major breakthrough. The key insight: while polyalphabetic ciphers flatten letter frequencies, they create patterns at the level of repeated sequences.

4.3.2 The Method

Step 1: Find Repeated Sequences Scan ciphertext for sequences of 3 or more characters that appear multiple times. Record positions of each repeat.

Step 2: Calculate Distances For each repeated sequence, compute the distance between occurrences: Distance = Position₂ - Position₁

Step 3: Find GCD The greatest common divisor of all distances (or most of them) likely equals the keyword length.

Example: Ciphertext: "VHVSSPQUCEMRVBVBBBVHVSURQGIBDUGRNICJQUCERVUAXSSR"

Repeated sequence "VHVS" appears at positions 1 and 25 → distance 24 Repeated sequence "QUCE" appears at positions 6 and 32 → distance 26 GCD(24,26) = 2 → keyword length likely 2

4.3.3 Handling Noise

In practice, coincidental repetitions occur, and actual distances may be multiples of the keyword length. Statistical approaches (taking the most common factor) improve accuracy.

4.4 Index of Coincidence

4.4.1 Definition

William Friedman's index of coincidence (IC) measures how likely it is that two randomly selected letters from a text are the same:

IC = Σ (nᵢ × (nᵢ - 1)) / (N × (N - 1))

where nᵢ is the count of letter i, and N is total letters.

4.4.2 Interpretation

  • Random text (uniform distribution): IC ≈ 0.038 (1/26)
  • English text: IC ≈ 0.066
  • Vigenère ciphertext: IC depends on keyword length:
    • Long keyword → closer to random (0.038)
    • Short keyword → closer to English (0.066)

4.4.3 Determining Keyword Length

For a candidate keyword length L, separate ciphertext into L columns and compute average IC. The correct L will produce columns with IC closer to 0.066.

Procedure:

  1. For L from 1 to some maximum (e.g., 20):
    • Split text into L columns
    • Compute IC for each column
    • Average the L IC values
  2. Choose L with highest average IC

4.5 Known Plaintext Attack

4.5.1 Description

A known plaintext attack occurs when the cryptanalyst has access to both the plaintext and its corresponding ciphertext. This is a powerful attack scenario.

Historical Examples:

  • Breaking Enigma using known plaintext from standardized weather reports and stereotyped greetings
  • Breaking Lorenz cipher using known plaintext from messages containing "Heil Hitler"

4.5.2 Application to Classical Ciphers

Simple Substitution: Even a few known plaintext-ciphertext letter pairs can reveal the substitution mapping for those letters.

Vigenère: Known plaintext immediately reveals the keyword at the corresponding positions: Kᵢ = (Cᵢ - Pᵢ) mod 26.

Hill Cipher: With m blocks of m letters (or m² total letters), the key can be solved: C = K × P mod 26 → K = C × P⁻¹ mod 26

4.5.3 Modern Relevance

Known plaintext attacks remain important in modern cryptanalysis. For example:

  • Analyzing encrypted network traffic where protocol headers are predictable
  • Breaking poorly designed encryption modes
  • Side-channel attacks that capture partial plaintext information

4.6 Ciphertext-Only Attack

4.6.1 The Classic Challenge

Ciphertext-only attacks are the most difficult but also the most realistic—the cryptanalyst has only intercepted ciphertext.

4.6.2 Techniques

Frequency Analysis: As discussed, effective against simple substitution.

Pattern Analysis: Looking for repeated patterns, word boundaries, or structural clues.

Traffic Analysis: Even if content is encrypted, metadata like message lengths, timing, and communication patterns can reveal information.

Brute Force: When key space is small enough, try all possible keys.

4.6.3 Success Factors

Ciphertext-only attacks succeed when:

  • Key space is small enough to exhaust
  • Ciphertext is long enough for statistical analysis
  • Cipher has exploitable mathematical weaknesses
  • Plaintext has predictable structure

4.7 The Friedman Test

William Friedman developed systematic statistical tests for cryptanalysis:

The Friedman Test for determining whether a cipher is monoalphabetic or polyalphabetic: Compute the index of coincidence:

  • If IC > 0.06 → likely monoalphabetic
  • If IC < 0.05 → likely polyalphabetic

Estimating Keyword Length: Keyword length ≈ (0.027 × N) / ((N-1) × IC - 0.038 × N + 0.065)

4.8 Automated Cryptanalysis

4.8.1 Hill Climbing

Modern cryptanalysis often uses hill climbing algorithms to search the key space:

  1. Start with a random key
  2. Make small modifications
  3. Keep changes that improve a fitness metric (e.g., matching expected letter frequencies)
  4. Repeat until no improvement

4.8.2 Genetic Algorithms

Evolutionary approaches can solve substitution ciphers by:

  • Maintaining a population of candidate keys
  • "Breeding" keys through crossover and mutation
  • Selecting fittest candidates based on n-gram statistics

4.8.3 Simulated Annealing

Probability-based search that occasionally accepts worse solutions to escape local optima.

4.9 Lessons Learned

Classical cryptanalysis teaches enduring principles:

1. Statistical Leakage is Fundamental: Any non-uniformity in plaintext or ciphertext provides information to attackers.

2. Known Plaintext is Powerful: Designers should assume attackers will obtain some plaintext-ciphertext pairs.

3. Automation Amplifies Attacks: Computers can perform statistical analysis at scales impossible for humans.

4. Kerckhoffs's Principle is Crucial: Security must not depend on algorithm secrecy.

5. Historical Attacks Inform Modern Design: Understanding how classical ciphers were broken helps design stronger modern ones.


PART III — SYMMETRIC KEY CRYPTOGRAPHY


CHAPTER 5: BLOCK CIPHERS

5.1 Introduction to Block Ciphers

Block ciphers are fundamental building blocks of modern cryptography. A block cipher is a deterministic algorithm operating on fixed-length groups of bits, called blocks, using a symmetric key. It transforms a plaintext block into a ciphertext block of the same length. The same key is used for both encryption and decryption.

Formal Definition: A block cipher is a function E: {0,1}ᵏ × {0,1}ⁿ → {0,1}ⁿ, where:

  • k is the key length
  • n is the block length
  • For each key K, E(K, ·) is a permutation on {0,1}ⁿ
  • Decryption D is the inverse permutation: D(K, E(K, P)) = P

5.2 Design Principles

5.2.1 Confusion and Diffusion

Claude Shannon identified two fundamental properties for secure ciphers:

Confusion: The relationship between the key and the ciphertext should be as complex as possible. Each ciphertext bit should depend on multiple parts of the key, ideally in a nonlinear way. This prevents attackers from deducing key bits from statistical relationships.

Diffusion: Changes in the plaintext should propagate throughout the ciphertext. Flipping one plaintext bit should change approximately half the ciphertext bits (the avalanche effect). This hides statistical relationships between plaintext and ciphertext.

5.2.2 Substitution-Permutation Networks (SPN)

SPNs implement confusion and diffusion through alternating layers:

Substitution Layer (S-boxes) : Provide confusion by mapping small groups of bits (typically 4-8 bits) to output bits through a nonlinear lookup table. Good S-boxes are carefully designed to resist linear and differential cryptanalysis.

Permutation Layer (P-boxes) : Provide diffusion by permuting bits, spreading the output of one S-box to inputs of multiple S-boxes in the next round.

Round Keys: Each round uses a different subkey derived from the main key through a key schedule.

AES is the most prominent example of an SPN cipher.

5.2.3 Feistel Networks

Feistel networks, invented by Horst Feistel at IBM, offer a different approach where encryption and decryption share the same structure:

Round Function for a Feistel network with block size 2n (split into left half L and right half R):

Lᵢ = Rᵢ₋₁ Rᵢ = Lᵢ₋₁ ⊕ F(Rᵢ₋₁, Kᵢ)

Decryption simply reverses the process, using subkeys in reverse order. This works because XOR is its own inverse:

Rᵢ₋₁ = Lᵢ Lᵢ₋₁ = Rᢀᵢ ⊕ F(Lᵢ, Kᵢ)

Advantages:

  • Encryption and decryption are essentially identical
  • The round function F need not be invertible
  • Well-understood security properties

DES and Blowfish are classic Feistel ciphers.

5.3 Data Encryption Standard (DES)

5.3.1 Historical Context

DES, adopted as a federal standard in 1977, dominated symmetric cryptography for over two decades. Its development marked the first time a cryptographic algorithm was publicly standardized and scrutinized.

Origin: IBM developed DES from their Lucifer cipher, designed by Horst Feistel. The NSA participated in modifications, including:

  • Reducing key size from 128 bits to 56 bits
  • Modifying S-box design criteria (kept secret until rediscovered in the 1990s)

5.3.2 DES Structure

DES is a Feistel network with:

  • Block size: 64 bits
  • Key size: 56 bits (plus 8 parity bits, often ignored)
  • Rounds: 16

Initial and Final Permutations: DES begins and ends with fixed bit permutations (IP and IP⁻¹). These have no cryptographic significance but were designed for efficient hardware implementation in the 1970s.

Round Function F:

  1. Expansion: 32-bit right half expanded to 48 bits via expansion permutation
  2. Key Mixing: XOR with 48-bit round subkey
  3. S-box Substitution: 48 bits divided into 8 groups of 6 bits. Each group enters an S-box producing 4 bits. S-boxes are the heart of DES security—nonlinear lookup tables carefully designed to resist cryptanalysis.
  4. Permutation: 32 bits permuted via fixed P-box

Key Schedule:

  1. 56-bit key split into two 28-bit halves
  2. Each half rotated left by 1 or 2 bits per round
  3. 48 bits selected from rotated halves via compression permutation

5.3.3 DES S-box Design

The eight S-boxes of DES were mysterious for years. After the public rediscovery of differential cryptanalysis in the 1990s, it became clear they were designed to resist this attack:

  • No S-box is a linear or affine function
  • Changing one input bit changes at least two output bits
  • S-boxes are not too close to a linear function
  • Maximum probability of any input difference producing a given output difference is limited

5.3.4 Security of DES

Key Length Problem: The 56-bit key was controversial from the start. By the late 1990s, brute-force attacks became feasible:

  • 1997: DESCHALL project broke a DES key in 96 days
  • 1998: EFF's Deep Crack broke a DES key in 56 hours ($250,000 machine)
  • 1999: Deep Crack and distributed.net broke a DES key in 22 hours 15 minutes

Complement Property: DES has a weakness: E(K, P) = complement of E(complement(K), complement(P)). This reduces brute-force search by a factor of 2.

Weak Keys: DES has four keys that are self-inverse (encrypting twice returns original plaintext) and six pairs of semi-weak keys.

5.4 Triple DES (3DES)

5.4.1 Extending DES Lifespan

As DES's 56-bit key became inadequate, Triple DES provided a way to reuse existing DES implementations with stronger security.

Triple-DES Encryption: C = Eₖ₃(Dₖ₂(Eₖ₁(P)))

Decryption: P = Dₖ₁(Eₖ₂(Dₖ₃(C)))

Common Variants:

  • 3DES-EEE: Three independent encryptions
  • 3DES-EDE: Encrypt-Decrypt-Encrypt pattern (allows backward compatibility with single DES when k₁ = k₂ = k₃)

Keying Options:

  • Option 1: Three independent keys (168-bit key) — strongest
  • Option 2: k₁ and k₂ independent, k₃ = k₁ (112-bit key)
  • Option 3: All keys equal (56-bit key) — single DES compatibility

5.4.2 Security Considerations

  • Effective security: about 112 bits for 3-key 3DES (due to meet-in-the-middle attacks)
  • Block size remains 64 bits, making it vulnerable to birthday attacks on large volumes of data
  • Relatively slow in software
  • Officially deprecated by NIST in 2017, disallowed after 2023

5.5 Advanced Encryption Standard (AES)

5.5.1 The AES Competition

In 1997, NIST announced a competition to replace DES. Requirements:

  • Block size: 128 bits
  • Key sizes: 128, 192, and 256 bits
  • Public, fully specified algorithm
  • Available royalty-free worldwide

Finalists:

  • MARS (IBM) — complex, high security margin
  • RC6 (RSA Laboratories) — simple, based on RC5
  • Rijndael (Daemen and Rijmen) — elegant, efficient
  • Serpent (Anderson, Biham, Knudsen) — ultra-conservative security
  • Twofish (Schneier et al.) — highly secure, flexible

Winner: Rijndael, announced in October 2000, became AES.

5.5.2 AES Structure

AES is a Substitution-Permutation Network, not a Feistel network. It operates on a 4×4 array of bytes called the state (128 bits = 16 bytes).

Number of Rounds:

  • 10 rounds for 128-bit keys
  • 12 rounds for 192-bit keys
  • 14 rounds for 256-bit keys

Each Round (except last) performs:

  1. SubBytes: Nonlinear byte substitution using a single S-box applied to each byte
  2. ShiftRows: Cyclic shift of rows (row 0 unchanged, row 1 shifted left 1, row 2 shifted left 2, row 3 shifted left 3)
  3. MixColumns: Linear transformation mixing columns (omitted in final round)
  4. AddRoundKey: XOR with round subkey

5.5.3 AES S-box

The AES S-box is constructed mathematically, not as a random lookup table:

  1. Take multiplicative inverse in GF(2⁸) (using irreducible polynomial x⁸ + x⁴ + x³ + x + 1), with 0 mapped to 0
  2. Apply affine transformation over GF(2): bᵢ = aᵢ ⊕ aᵢ₊₄ mod 8 ⊕ aᵢ₊₅ mod 8 ⊕ aᵢ₊₆ mod 8 ⊕ aᵢ₊₇ mod 8 ⊕ cᵢ where c = 0x63

This construction ensures:

  • No fixed points (S-box(a) = a)
  • No opposite fixed points (S-box(a) = complement(a))
  • High nonlinearity
  • Algebraic simplicity for implementation

5.5.4 Key Schedule

AES expands the key into an array of round keys (Nb × (Nr + 1) words, where Nb=4, Nr is rounds):

For 128-bit key (4 words):

  • First 4 words are the key itself
  • Each subsequent word is XOR of previous word and word 4 positions back
  • Every 4 words (at word indices multiple of 4), apply RotWord (byte rotation), SubWord (S-box application), and XOR with Rcon (round constant)

This ensures:

  • Each round key differs significantly
  • Knowledge of some round keys doesn't easily reveal others
  • Efficient on various platforms

5.5.5 Security of AES

Known Attacks (as of 2024):

  • Related-key attacks on AES-192 and AES-256 (but not on full rounds)
  • Side-channel attacks on implementations
  • No practical attacks on full AES

Security Margin: Even the best attacks on full AES require complexity far above 2¹²⁸.

Quantum Resistance: Grover's algorithm could theoretically speed up key search, but doubling key size (using AES-256) provides adequate quantum resistance.

5.6 Other Notable Block Ciphers

5.6.1 Blowfish

Designed by Bruce Schneier in 1993:

  • Feistel network with 64-bit blocks
  • Variable key length (32-448 bits)
  • 16 rounds
  • Key-dependent S-boxes
  • Fast on 32-bit processors

Strengths: Simple, well-studied, no practical attacks on full rounds Weakness: 64-bit block size limits use in large applications

5.6.2 Twofish

AES finalist by Schneier et al.:

  • Feistel network with 128-bit blocks
  • Key sizes: 128, 192, 256 bits
  • 16 rounds
  • Bijective S-boxes derived from key
  • Maximum distance separable (MDS) matrices for diffusion

Features: Highly secure, flexible, efficient in hardware and software

5.6.3 Serpent

AES finalist by Anderson, Biham, Knudsen:

  • SPN with 128-bit blocks
  • 32 rounds (conservative design)
  • 8 S-boxes used in parallel
  • Designed for maximum security margin

Philosophy: Security over speed; even reduced-round versions remain secure

5.7 Block Cipher Cryptanalysis

5.7.1 Brute Force Search

The simplest attack: try all possible keys until the correct one is found. With k-bit keys, expected work is 2ᵏ⁻¹.

Defense: Use sufficiently large keys (128+ bits). Even with enormous computing power, 2¹²⁸ operations is infeasible.

5.7.2 Meet-in-the-Middle Attacks

Applicable to multiple encryption schemes (like 2DES or 3DES):

Given C = Eₖ₂(Eₖ₁(P)):

  1. Compute Eₖ₁(P) for all possible k₁, store in table
  2. Compute Dₖ₂(C) for all possible k₂, check against table

For 2DES (112-bit key), this requires 2⁵⁶ time and space, not 2¹¹².

Defense: Use sufficient rounds and avoid cascading ciphers without triple encryption.

5.7.3 Side-Channel Attacks

These exploit physical leakage during computation:

Timing Attacks: Measure time variations based on key bits Power Analysis: Measure power consumption patterns Cache Attacks: Observe cache access patterns Electromagnetic Analysis: Measure EM emissions

Defenses: Constant-time implementations, masking, hiding

5.8 Future Directions

Authenticated Encryption: Combining encryption with authentication in one algorithm (AES-GCM, AES-OCB, ChaCha20-Poly1305)

Lightweight Cryptography: Optimized for constrained devices (IoT)

  • PRESENT, SPECK, SIMON, ASCON

Beyond 128-bit Security: Some applications require higher security margins (AES-256, ChaCha20)


CHAPTER 6: BLOCK CIPHER MODES OF OPERATION

6.1 The Need for Modes

A block cipher encrypts only a single fixed-length block. Real-world messages vary in length and may require special handling for the first and last blocks. Modes of operation define how to apply a block cipher to messages of arbitrary length while ensuring security properties.

6.2 Electronic Codebook (ECB) Mode

6.2.1 Description

ECB is the simplest mode: each plaintext block is encrypted independently with the same key.

Encryption: Cᵢ = Eₖ(Pᵢ) Decryption: Pᵢ = Dₖ(Cᵢ)

6.2.2 Problems with ECB

Deterministic Encryption: Identical plaintext blocks produce identical ciphertext blocks, revealing patterns in the data. This is catastrophic for many real-world data types:

  • Images: Patterns remain visible in ciphertext
  • Database records: Repeated fields (like "Yes"/"No") are obvious
  • Protocol headers: Predictable values create recognizable patterns

No Integrity Protection: An attacker can reorder, duplicate, or remove blocks without detection.

Padding Required: Last block may need padding to reach block size.

6.2.3 When ECB Is Acceptable

ECB is suitable only for encrypting single blocks of data (like encrypting a single key) where the weaknesses don't apply.

6.3 Cipher Block Chaining (CBC) Mode

6.3.1 Description

CBC mode chains blocks together using XOR with the previous ciphertext block. An initialization vector (IV) is used for the first block.

Encryption: C₀ = IV Cᵢ = Eₖ(Pᵢ ⊕ Cᵢ₋₁)

Decryption: Pᵢ = Dₖ(Cᵢ) ⊕ Cᵢ₋₁

6.3.2 Properties

Advantages:

  • Identical plaintext blocks produce different ciphertext (due to chaining)
  • Widely used and well-understood
  • Parallelizable decryption (but not encryption)

Requirements:

  • IV must be unpredictable (typically random) for CPA security
  • IV need not be secret but must be available to decrypt
  • Padding required for arbitrary-length messages

6.3.3 Padding Schemes

Various padding methods ensure messages are multiples of block size:

PKCS#7 Padding: Add n bytes each with value n, where n is the number of padding bytes needed (1 to blocksize). For example, if 3 bytes needed, add 0x03 0x03 0x03.

Zero Padding: Add null bytes, but requires unambiguous way to distinguish padding from data.

ANSI X.923: Add zeros, with last byte indicating padding length.

6.3.4 Padding Oracle Attacks

CBC with improper error handling is vulnerable to padding oracle attacks:

  1. Attacker modifies ciphertext and observes whether decryption produces valid padding
  2. This oracle leaks information about plaintext
  3. Full plaintext can be recovered with about 256 guesses per byte

Defense: Always use authenticated encryption (see GCM) or implement constant-time padding validation.

6.4 Cipher Feedback (CFB) Mode

6.4.1 Description

CFB mode turns a block cipher into a self-synchronizing stream cipher:

Encryption: Cᵢ = Pᵢ ⊕ Eₖ(Cᵢ₋₁) (with C₀ = IV)

Decryption: Pᵢ = Cᵢ ⊕ Eₖ(Cᵢ₋₁)

6.4.2 Properties

  • No padding needed (works on any bit length)
  • Self-synchronizing: receiver can recover from lost ciphertext bits after one block
  • Encryption and decryption both use block cipher encryption (not decryption)
  • Not parallelizable

6.5 Output Feedback (OFB) Mode

6.5.1 Description

OFB mode also creates a stream cipher but with a different feedback mechanism:

O₀ = IV Oᵢ = Eₖ(Oᵢ₋₁) Cᵢ = Pᵢ ⊕ Oᵢ

Decryption: Pᵢ = Cᵢ ⊕ Oᵢ

6.5.2 Properties

  • No padding needed
  • Keystream can be precomputed
  • Bit errors in ciphertext only affect corresponding plaintext bit (no error propagation)
  • Synchronous stream cipher — loss of synchronization is catastrophic
  • Not parallelizable (keystream generation is sequential)

6.6 Counter (CTR) Mode

6.6.1 Description

CTR mode turns a block cipher into a synchronous stream cipher by encrypting incrementing counters:

Cᵢ = Pᵢ ⊕ Eₖ(counter + i) where counter + i means counter incremented by i (mod 2ⁿ, where n is block size)

Decryption: Pᵢ = Cᵢ ⊕ Eₖ(counter + i)

6.6.2 Properties

Advantages:

  • Fully parallelizable (both encryption and decryption)
  • Random access — can decrypt any block without processing previous ones
  • No padding needed
  • Simple and elegant

Requirements:

  • Counter must never repeat for a given key (otherwise XOR two ciphertexts reveals information)
  • Typically use a nonce + counter approach

6.6.3 Security Considerations

CTR mode is provably secure if:

  • Block cipher is a pseudorandom permutation
  • Counters don't repeat
  • Up to about 2ⁿ/² blocks encrypted (birthday bound)

6.7 Galois/Counter Mode (GCM)

6.7.1 Description

GCM combines CTR mode encryption with a Galois field-based authenticator, providing authenticated encryption:

  • Encryption: CTR mode with a 96-bit nonce (typically) and 32-bit counter
  • Authentication: GHASH function over ciphertext and additional authenticated data (AAD)
  • Output: Ciphertext plus authentication tag (typically 128 bits)

6.7.2 GCM Authentication

GHASH multiplies blocks in GF(2¹²⁸) by a key-derived hash key H:

Let X₁, X₂, ..., Xₘ be blocks of AAD and ciphertext Compute Y₀ = 0 Yᵢ = (Yᵢ₋₁ ⊕ Xᵢ) · H Tag = (Yₘ ⊕ (len(AAD) || len(C)) · H) ⊕ Eₖ(nonce||1)

6.7.3 Properties

Advantages:

  • Efficient in hardware and software
  • Parallelizable
  • Authenticates both ciphertext and associated data
  • Widely supported (TLS, IPsec, IEEE 802.1AE)

Security Considerations:

  • Nonce reuse is catastrophic (reveals authentication key)
  • Short nonces (96 bits) are recommended; longer nonces require hashing
  • Side-channel attacks on GHASH multiplication exist

6.8 XTS-AES Mode

6.8.1 Description

XTS (XEX-based Tweaked CodeBook mode with CipherText Stealing) is designed for disk encryption:

Encryption of block i at location j: T = Eₖ₂(j) ⊗ αʲ C = Eₖ₁(P ⊕ T) ⊕ T

Where α is a primitive element of GF(2¹²⁸) and ⊗ is multiplication in that field.

6.8.2 Properties

  • Each disk block (typically 512 bytes) is encrypted independently
  • Different tweaks for different positions prevent block movement attacks
  • Allows random access read/write
  • Ciphertext stealing avoids expansion

Standardized: IEEE 1619-2007, used in disk encryption systems

6.9 OCB (Offset CodeBook) Mode

6.9.1 Description

OCB is an authenticated encryption mode designed by Rogaway et al. that's both efficient and provably secure:

  • Uses a single block cipher key
  • Computes offsets based on block index and key
  • Provides encryption and authentication in one pass
  • No padding needed (uses ciphertext stealing for partial blocks)

6.9.2 Properties

Advantages:

  • Fast (about as fast as CTR alone)
  • Parallelizable
  • Provably secure
  • Minimal overhead

Drawbacks:

  • Patented (though free for open-source use)
  • Less widely supported than GCM

6.10 Choosing a Mode

General Recommendations:

  • Use authenticated encryption (GCM, OCB, ChaCha20-Poly1305)
  • Never use ECB for multiple blocks
  • Use random or counter-based nonces
  • Ensure proper IV/nonce management

Application-Specific:

  • Network protocols: GCM, ChaCha20-Poly1305
  • Disk encryption: XTS
  • Database encryption: CTR with authentication separate
  • High-performance parallel systems: CTR, GCM

6.11 Common Pitfalls

Nonce Reuse: Using the same nonce with the same key in counter-based modes is catastrophic. Always ensure nonce uniqueness.

Improper Authentication: Encryption without authentication allows tampering. Always authenticate ciphertexts.

Padding Oracle Vulnerabilities: CBC padding validation must be constant-time.

Weak Randomness: Predictable IVs in CBC mode compromise security.


CHAPTER 7: STREAM CIPHERS

7.1 Introduction to Stream Ciphers

Stream ciphers encrypt data one bit or byte at a time by XORing plaintext with a pseudorandom keystream. Unlike block ciphers, which operate on fixed-size blocks, stream ciphers can process messages of arbitrary length without padding.

Basic Operation: Keystream: k₁, k₂, k₃, ... Plaintext: p₁, p₂, p₃, ... Ciphertext: cᵢ = pᵢ ⊕ kᵢ

Security Requirement: Keystream must be indistinguishable from random to anyone not knowing the key.

7.2 Synchronous vs. Asynchronous Stream Ciphers

7.2.1 Synchronous Stream Ciphers

Keystream depends only on key (and possibly an IV), not on plaintext or ciphertext.

Properties:

  • Encryption and decryption are identical (both XOR with keystream)
  • Loss of synchronization (dropped or inserted bits) is catastrophic
  • No error propagation (bit errors affect only corresponding plaintext bits)

Examples: RC4, Salsa20, ChaCha20, A5/1 (GSM)

7.2.2 Asynchronous (Self-Synchronizing) Stream Ciphers

Keystream depends on key and a fixed number of previous ciphertext bits.

Properties:

  • Self-synchronizing after loss of ciphertext bits
  • Limited error propagation (affects finite window)
  • More complex design

Examples: CFB mode (block cipher as stream cipher)

7.3 RC4

7.3.1 History and Design

RC4 was designed by Ron Rivest in 1987 for RSA Security. It remained a trade secret until anonymously posted to Cypherpunks mailing list in 1994. It became widely used due to its simplicity and speed.

Key Scheduling Algorithm (KSA) :

  • Initialize S[0..255] with 0..255
  • For i from 0 to 255:
    • j = (j + S[i] + key[i mod keylength]) mod 256
    • Swap S[i] and S[j]

Pseudo-Random Generation Algorithm (PRGA) :

  • i = 0, j = 0
  • For each keystream byte:
    • i = (i + 1) mod 256
    • j = (j + S[i]) mod 256
    • Swap S[i] and S[j]
    • Output S[(S[i] + S[j]) mod 256]

7.3.2 Security Weaknesses

Extensive cryptanalysis has revealed numerous weaknesses:

1. Biased Output: The second output byte is biased toward zero with probability 1/128 (twice expected). Many other biases exist.

2. Related Key Attacks: The KSA has properties that allow key recovery from related keys.

3. IV Weaknesses: Used in WEP with concatenated IV+key, allowed practical attacks (Fluhrer, Mantin, Shamir attack).

4. Invariance Weakness: Some weak keys produce identical initial state bytes.

5. Practical Attacks: Modern attacks can recover plaintext from RC4-encrypted traffic with moderate amounts of data.

7.3.3 Deprecation

Due to these weaknesses, RC4 has been deprecated:

  • RFC 7465 (2015) prohibits RC4 in TLS
  • Microsoft, Google, Mozilla removed RC4 support by 2016

7.4 Salsa20

7.4.1 Design Philosophy

Daniel Bernstein designed Salsa20 as a candidate for the eSTREAM project (2004-2008). It emphasizes simplicity, security, and performance.

Core Design:

  • 20 rounds (reduced variants: Salsa20/8, Salsa20/12)
  • 256-bit key (128-bit variant also defined)
  • 64-bit nonce, 64-bit counter (allows 2⁶⁴ blocks per key)
  • 512-bit block size

7.4.2 The Salsa20 Quarter Round

The fundamental operation operates on 4 32-bit words (a, b, c, d):

b ⊕= (a + d) <<< 7 c ⊕= (b + a) <<< 9 d ⊕= (c + b) <<< 13 a ⊕= (d + c) <<< 18

This quarter round is applied in a specific pattern across the 4×4 matrix of 32-bit words.

7.4.3 The Salsa20 Hash Function

Salsa20 constructs a 512-bit output from a 512-bit input (key, nonce, counter, constants) using 20 rounds of the quarter round operations. The final output is input plus the result of the rounds (to ensure invertibility).

Input matrix (4×4 of 32-bit words):

[ "expa" Key Key Key ] [ Key "nd 3" Nonce Nonce ] [ Counter Counter "2-by" Key ] [ Key Key Key "te k" ]

Where constants are ASCII strings: "expand 32-byte k" and "expand 16-byte k" for 256-bit and 128-bit keys.

7.4.4 Security of Salsa20

Salsa20 has received extensive cryptanalysis:

  • No significant weaknesses in full 20-round version
  • Best attacks only on reduced-round variants (8 rounds)
  • Provable security bounds against differential cryptanalysis

7.5 ChaCha20

7.5.1 Improvements over Salsa20

Daniel Bernstein designed ChaCha20 as an improvement to Salsa20:

Key Changes:

  • Modified quarter round increases diffusion per round
  • Different round structure (adds instead of XORs for some operations)
  • Better resistance to cryptanalysis

ChaCha Quarter Round: a += b; d ⊕= a; d <<<= 16 c += d; b ⊕= c; b <<<= 12 a += b; d ⊕= a; d <<<= 8 c += d; b ⊕= c; b <<<= 7

7.5.2 ChaCha20 Structure

ChaCha20 uses the same 4×4 matrix of 32-bit words but with a different initial arrangement:

[ "expa" "nd 3" "2-by" "te k" ] [ Key Key Key Key ] [ Key Key Key Key ] [ Nonce Nonce Counter Counter ]

7.5.3 Advantages of ChaCha20

  • Speed: Very fast in software, especially on platforms without AES hardware acceleration
  • Security: Strong security margin, no practical attacks
  • Simplicity: Easy to implement correctly and constant-time
  • Parallelism: Can generate multiple blocks in parallel

7.5.4 ChaCha20-Poly1305

The combination of ChaCha20 encryption with Poly1305 authentication has become widely adopted:

  • ChaCha20: Stream cipher for encryption
  • Poly1305: Fast polynomial-based MAC for authentication
  • Used in TLS 1.3, SSH, IPsec, WireGuard

7.6 Other Notable Stream Ciphers

7.6.1 A5/1 and A5/2

Used in GSM mobile communications:

  • A5/1: Stronger version for some regions
  • A5/2: Weaker export version Both are broken and should not be used.

7.6.2 E0

Used in Bluetooth, now considered weak.

7.6.3 SNOW 3G and ZUC

Used in 3GPP LTE (4G) and 5G communications:

  • Designed for cellular standards
  • Based on linear feedback shift registers (LFSRs)
  • Considered secure for their intended use

7.7 Stream Cipher Security Considerations

7.7.1 Key Reuse

Never reuse a (key, nonce) pair. If keystream is reused: c₁ = p₁ ⊕ k c₂ = p₂ ⊕ k Then c₁ ⊕ c₂ = p₁ ⊕ p₂, which reveals information about both plaintexts.

7.7.2 Integrity

Stream ciphers provide no integrity protection. An attacker can flip bits in ciphertext, causing corresponding flips in plaintext. Always combine with a MAC.

7.7.3 Synchronization

Synchronous stream ciphers require perfect synchronization. Lost bits make decryption impossible. Use framing and error detection.

7.8 Block Ciphers as Stream Ciphers

Any block cipher can be used as a stream cipher in CTR, OFB, or CFB mode. This often provides better security analysis than dedicated stream ciphers.

7.9 Choosing a Stream Cipher

Modern Recommendations:

  • ChaCha20-Poly1305: Excellent for general use
  • AES-CTR + HMAC: When hardware AES is available
  • AES-GCM: Combined mode with authentication

Avoid:

  • RC4 (completely broken)
  • Software implementations of A5/1
  • Any cipher without public analysis

CHAPTER 8: CRYPTOGRAPHIC HASH FUNCTIONS

8.1 Introduction to Hash Functions

A cryptographic hash function H maps arbitrary-length input to a fixed-length output (the hash or digest). Unlike encryption, hash functions are deterministic but not invertible—they're one-way functions.

Formal Requirements: H: {0,1}* → {0,1}ⁿ (n typically 128-512 bits)

8.2 Essential Properties

8.2.1 Preimage Resistance (One-Way)

Given a hash value h, it should be computationally infeasible to find any message m such that H(m) = h.

Why Important: Prevents attacker from finding a message that hashes to a given value (e.g., recovering password from stored hash).

8.2.2 Second Preimage Resistance

Given a message m₁, it should be computationally infeasible to find a different message m₂ such that H(m₁) = H(m₂).

Why Important: Prevents attacker from substituting one message for another with same hash.

8.2.3 Collision Resistance

It should be computationally infeasible to find any two distinct messages m₁ ≠ m₂ such that H(m₁) = H(m₂).

Why Important: Stronger than second preimage resistance—attacker can choose both messages. Required for digital signatures (attacker could create two contracts with same hash).

8.2.4 Other Desirable Properties

Avalanche Effect: Changing any single bit in input should change approximately half the output bits.

Non-correlation: Input bits and output bits should not be correlated.

Near-collision Resistance: Shouldn't be possible to find inputs that produce almost identical hashes.

8.3 Applications of Hash Functions

8.3.1 Digital Signatures

Hash-then-sign paradigm: sign the hash of a message rather than the message itself. This is more efficient and handles arbitrary-length messages.

8.3.2 Password Storage

Store H(password) rather than password itself. On login, compute hash of entered password and compare.

Important: Always use salt (random per-user value) to prevent precomputation attacks (rainbow tables).

8.3.3 Data Integrity

Compare hash of received data with expected hash to detect corruption or tampering.

8.3.4 Message Authentication Codes (HMAC)

Construct MAC from hash function (see Chapter 9).

8.3.5 Key Derivation

Generate cryptographic keys from passwords or other source material (PBKDF2, scrypt, Argon2).

8.3.6 Commitment Schemes

Commit to a value without revealing it: send H(value), later reveal value for verification.

8.3.7 Blockchain and Proof-of-Work

Hash chains, Merkle trees, and mining all rely on hash functions (see Chapter 27).

8.4 MD5 (Message Digest 5)

8.4.1 Design

Designed by Ron Rivest in 1991, MD5 produces 128-bit hashes. It processes messages in 512-bit blocks, producing 4 32-bit words as output.

Structure: Merkle-Damgård construction with a compression function based on modular arithmetic and bitwise operations.

8.4.2 Security Status

MD5 is completely broken for collision resistance:

  • 2004: Wang et al. demonstrated practical collisions
  • 2007: Chosen-prefix collisions possible
  • 2012: Flame malware used MD5 collision to fake Microsoft digital signature

Do Not Use MD5 for any security-critical application.

8.5 SHA-1 (Secure Hash Algorithm 1)

8.5.1 Design

Designed by NSA in 1995, SHA-1 produces 160-bit hashes. Similar structure to MD5 but with more conservative design.

8.5.2 Security Status

SHA-1 is deprecated and practically broken:

  • 2005: Theoretical collision attacks (2⁶⁹ operations)
  • 2017: Google and CWI demonstrated actual collision (SHAttered attack)
  • 2020: Chosen-prefix collisions possible

Do Not Use SHA-1 for security. Many browsers and CAs have removed support.

8.6 SHA-2 Family

8.6.1 Overview

Designed by NSA in 2001, SHA-2 includes several variants:

Algorithm Output Size Internal State Security Level
SHA-224 224 bits 256 bits 112 bits
SHA-256 256 bits 256 bits 128 bits
SHA-384 384 bits 512 bits 192 bits
SHA-512 512 bits 512 bits 256 bits
SHA-512/224 224 bits 512 bits 112 bits
SHA-512/256 256 bits 512 bits 128 bits

8.6.2 SHA-256 Design

SHA-256 processes 512-bit blocks, maintaining 8 32-bit words of state (A-H). Each block goes through 64 rounds of operations including:

  • Bitwise operations: AND, OR, XOR, NOT
  • Rotations and shifts
  • Modular addition
  • Constants (cube roots of first 80 primes)

8.6.3 Security of SHA-2

  • No practical attacks on full SHA-2
  • Theoretical attacks on reduced rounds
  • Considered secure for all current applications
  • Length extension attack possible (mitigated by using HMAC)

8.7 SHA-3 (Keccak)

8.7.1 The SHA-3 Competition

NIST announced a competition in 2007 for a new hash standard, due to concerns about SHA-2's similarity to SHA-1. Winner announced in 2012: Keccak, designed by Bertoni, Daemen, Peeters, and Van Assche.

8.7.2 Sponge Construction

Unlike previous hash functions using Merkle-Damgård, SHA-3 uses a sponge construction:

Sponge Function:

  • State: b = r + c bits (r = rate, c = capacity)
  • Absorbing phase: XOR input blocks with first r bits, apply permutation f
  • Squeezing phase: Output first r bits, apply f, repeat

For SHA-3, b = 1600 bits, with different r/c pairs for different output sizes:

  • SHA3-224: r = 1152, c = 448
  • SHA3-256: r = 1088, c = 512
  • SHA3-384: r = 832, c = 768
  • SHA3-512: r = 576, c = 1024

8.7.3 The Keccak-f Permutation

Keccak-f[1600] operates on a 5×5×64 3-dimensional state (total 1600 bits). Each round consists of 5 steps:

  1. θ (Theta): XOR each bit with parity of two columns
  2. ρ (Rho): Rotate each lane by different offsets
  3. π (Pi): Permute lanes
  4. χ (Chi): Nonlinear step (only nonlinear operation)
  5. ι (Iota): XOR round constant

Number of rounds: 24 for 1600-bit version.

8.7.4 Advantages of SHA-3

  • Different structure from SHA-2 (diverse cryptographic toolbox)
  • Resistant to length extension attacks natively
  • Efficient in hardware
  • Can be used for PRNG, stream cipher, authenticated encryption (Keccak-based schemes like Keyak, Ketje)

8.7.5 SHA-3 Variants

SHA3-224, 256, 384, 512: Standard hash functions SHAKE128, SHAKE256: Extendable-output functions (XOFs) that can produce arbitrary-length output

8.8 BLAKE2 and BLAKE3

8.8.1 BLAKE2

BLAKE was a SHA-3 finalist. BLAKE2 is an optimized version designed for high performance:

Features:

  • Faster than MD5 but secure
  • BLAKE2b (64-bit platforms) and BLAKE2s (32-bit/embedded)
  • Keyed hashing (PRF) and salt support
  • Built-in personalization

8.8.2 BLAKE3

BLAKE3, released in 2020, further improves performance:

  • Merkle tree structure with parallel hashing
  • Up to 5-10x faster than BLAKE2
  • Full tree hashing, incremental updates, and keyed modes
  • 256-bit security

8.9 Merkle Trees

8.9.1 Structure

A Merkle tree (or hash tree) allows efficient verification of large data structures:

  • Leaf nodes: Hashes of data blocks
  • Internal nodes: Hash of concatenated child hashes
  • Root: Top hash representing entire structure

8.9.2 Applications

Blockchain: Bitcoin uses Merkle trees to summarize transactions in a block Git: Uses hash trees for version control Certificate transparency: Logs certificates in Merkle trees Distributed systems: Efficiently verify data consistency

Proof of inclusion: Can prove a block is in the tree with O(log n) hashes by providing sibling hashes along the path to root.

8.10 Hash Function Security Considerations

8.10.1 Birthday Attacks

The birthday paradox applies to hash functions: to find a collision with probability > 1/2, you need about 2ⁿ/² hash computations (for n-bit output). Thus, a 256-bit hash provides only 128 bits of collision resistance.

8.10.2 Length Extension Attacks

Merkle-Damgård hashes (MD5, SHA-1, SHA-2) are vulnerable: given H(m), you can compute H(m || padding || extra) without knowing m. This affects some applications but is mitigated by using HMAC.

8.10.3 Choosing Hash Length

  • 128 bits: Minimum for preimage resistance (but weak against collisions)
  • 160 bits: Old standard (SHA-1) now too weak
  • 256 bits: Current standard for most applications
  • 384+ bits: High-security applications, post-quantum margin

8.11 Future Directions

Resistance to Quantum Attacks: Grover's algorithm halves effective security. Use 384-bit hashes for 192-bit quantum security.

Diversity: Having multiple hash families (SHA-2, SHA-3, BLAKE3) provides options if weaknesses found in one.

Specialized Designs: Lightweight hashes for IoT, very fast hashes for high-performance computing.


CHAPTER 9: MESSAGE AUTHENTICATION CODES

9.1 Introduction to MACs

A Message Authentication Code (MAC) provides integrity and authenticity for messages. It's a symmetric-key technique where sender and receiver share a secret key.

Definition: A MAC algorithm takes a key K and message M, producing a tag T = MAC(K, M). Verification checks whether a given tag is valid for the message.

Security Requirements:

  • Unforgeability: Without knowing K, it should be infeasible to produce a valid tag for any new message
  • Even given many (message, tag) pairs, attacker cannot forge tags

9.2 MAC vs. Hash vs. Signature

Property Hash MAC Digital Signature
Key required? No Yes (symmetric) Yes (asymmetric)
Provides non-repudiation? No No Yes
Verification speed Fast Fast Slow
Public verifiability Yes No Yes

9.3 CBC-MAC

9.3.1 Basic CBC-MAC

Construct MAC from a block cipher in CBC mode:

  1. Pad message to multiple of block size
  2. Encrypt in CBC mode with IV = 0
  3. Use final ciphertext block as tag

Security Issue: Basic CBC-MAC is only secure for fixed-length messages. With variable length, existential forgery is possible.

9.3.2 CMAC (Cipher-based MAC)

CMAC (NIST SP 800-38B) fixes CBC-MAC's limitations:

  • Uses two derived subkeys K₁ and K₂
  • For full blocks: XOR with K₁ before final encryption
  • For partial blocks: pad with 10...0, XOR with K₂

Algorithm:

  1. Derive L = Eₖ(0)

  2. K₁ = (L << 1) if msb(L)=0 else (L << 1) ⊕ R

  3. K₂ = (K₁ << 1) if msb(K₁)=0 else (K₁ << 1) ⊕ R (where R depends on block size: for 128-bit, R = 0x87)

  4. Process all but last block normally

  5. For last block Mₙ:

    • If full: Mₙ' = Mₙ ⊕ K₁
    • If partial: Mₙ' = pad(Mₙ) ⊕ K₂
  6. Encrypt Mₙ' to get tag

9.4 HMAC (Hash-based MAC)

9.4.1 Design

HMAC (RFC 2104, FIPS 198-1) constructs a MAC from a cryptographic hash function:

HMAC(K, m) = H((K' ⊕ opad) || H((K' ⊕ ipad) || m))

Where:

  • ipad = 0x36 repeated block size
  • opad = 0x5C repeated block size
  • K' = K padded to block size (or hashed if longer than block)

9.4.2 Security Properties

  • Provable security: if compression function is secure, HMAC is secure
  • Resists length extension attacks (even with Merkle-Damgård hashes)
  • No known practical attacks on HMAC with secure hash

9.4.3 HMAC with Different Hash Functions

Common instantiations:

  • HMAC-SHA256: Used in TLS, IPsec, many protocols
  • HMAC-SHA1: Still considered secure for MAC (unlike SHA1 hash), but migration ongoing
  • HMAC-MD5: Broken for some applications, avoid

9.5 GMAC (Galois MAC)

9.5.1 Description

GMAC is the authentication component of GCM mode. It authenticates using multiplication in GF(2¹²⁸):

  1. Derive hash key H = Eₖ(0¹²⁸)
  2. Process blocks: Xᵢ (AAD and ciphertext blocks) Yᵢ = (Yᵢ₋₁ ⊕ Xᵢ) · H (in GF(2¹²⁸))
  3. Final tag = (Yₙ ⊕ (len(AAD) || len(C)) · H) ⊕ Eₖ(IV||1)

9.5.2 Properties

  • Fast in hardware with carry-less multiplication
  • Parallelizable
  • Security depends on uniqueness of IV/nonce
  • Side-channel attacks possible on multiplication

9.6 Poly1305

9.6.1 Design

Poly1305, designed by Daniel Bernstein, is a fast polynomial-based MAC:

  1. Break message into 16-byte blocks
  2. Interpret each block as a 129-bit number (little-endian, with top bit set for last partial block)
  3. Evaluate polynomial modulo 2¹³⁰ - 5: tag = (r × (r × ( ... ) + s) mod 2¹³⁰ - 5 where r and s are key-derived values

9.6.2 Security

  • Provable security based on polynomial evaluation
  • Used with ChaCha20 in ChaCha20-Poly1305
  • Fast in software, constant-time implementations possible
  • 128-bit security level

9.7 Authenticated Encryption Modes

Modern protocols combine encryption and authentication in single algorithms:

9.7.1 GCM (Galois/Counter Mode)

CTR mode encryption + GMAC authentication

9.7.2 CCM (Counter with CBC-MAC)

CTR mode encryption + CBC-MAC authentication Used in IEEE 802.11i (WiFi), ZigBee

9.7.3 ChaCha20-Poly1305

ChaCha20 stream cipher + Poly1305 MAC Used in TLS 1.3, SSH, WireGuard

9.7.4 OCB (Offset CodeBook)

Single-pass authenticated encryption (patented)

9.8 MAC Security Considerations

9.8.1 Verification Timing

MAC verification must be constant-time to prevent timing attacks. Comparing tags with standard string comparison (which returns early on mismatch) leaks information.

Correct: Use constant-time comparison that examines all bytes.

9.8.2 Key Reuse

MAC keys should not be reused for encryption (unless in combined modes designed for it). Separate keys for different purposes.

9.8.3 Tag Length

Shorter tags are more efficient but easier to forge. Common lengths: 64-128 bits. For high-security applications, use full tag length.

9.8.4 Replay Attacks

MACs alone don't prevent replay attacks. Include sequence numbers or timestamps in authenticated data.

9.9 Choosing a MAC

Recommendations:

  • HMAC-SHA256: Widely supported, good security
  • Poly1305: Fast with ChaCha20
  • AES-GCM/CMAC: When AES hardware acceleration available

Avoid:

  • CBC-MAC without proper padding
  • Short tags (< 64 bits)
  • Non-constant-time verification

PART IV — PUBLIC KEY CRYPTOGRAPHY


CHAPTER 10: PUBLIC KEY CONCEPTS

10.1 The Key Distribution Problem

Before 1976, all cryptography was symmetric—sender and receiver had to share a secret key. This created a fundamental problem: how to securely establish that shared key in the first place?

Key Distribution Challenges:

  • Scalability: n parties require O(n²) keys
  • Initial key exchange: Need secure channel to establish keys
  • Dynamic groups: Adding new users requires new key distribution

10.2 The Revolutionary Idea

Diffie, Hellman, and Merkle proposed a radical solution in 1976: use different keys for encryption and decryption. The encryption key could be public, while the decryption key remained private.

Public Key Cryptosystem Requirements:

  1. Key generation produces a pair (pk, sk) — public key and private key
  2. Encryption: c = E(pk, m) should be easy
  3. Decryption: m = D(sk, c) should be easy for key owner
  4. Given pk, it should be computationally infeasible to find sk or decrypt messages

10.3 Trapdoor Functions

A trapdoor function is a one-way function with a secret "trapdoor" that allows inversion:

  • Easy direction: y = f(x) is easy to compute
  • Hard direction: Given y, find x is hard without trapdoor
  • With trapdoor: Given y and trapdoor, find x is easy

Examples:

  • RSA: f(x) = xᵉ mod N — easy to compute, hard to invert without d (trapdoor)
  • Discrete log: f(x) = gˣ mod p — easy, hard to invert without knowing discrete log (no trapdoor for encryption, but used in key exchange)

10.4 Number-Theoretic Hard Problems

10.4.1 Integer Factorization Problem

Given a composite integer N that is the product of two large primes p and q, find p and q.

Security Basis: No efficient classical algorithm known for large numbers (2048+ bits).

Used in: RSA, Rabin cryptosystem

10.4.2 Discrete Logarithm Problem (DLP)

Given a cyclic group G of order n, a generator g, and an element h = gˣ, find x.

Security Basis: Hard in well-chosen groups (large prime order subgroups of ℤ_p*, elliptic curve groups)

Used in: Diffie-Hellman, ElGamal, DSA

10.4.3 Computational Diffie-Hellman (CDH)

Given g, gᵃ, gᵇ, compute gᵃᵇ.

Relation to DLP: If DLP is easy, CDH is easy. Converse not proven.

10.4.4 Decisional Diffie-Hellman (DDH)

Given g, gᵃ, gᵇ, gᶜ, decide whether c ≡ ab mod n.

Stronger assumption: CDH may be hard while DDH is easy in some groups.

10.5 Key Generation Principles

Randomness: Keys must be generated from high-quality random sources. Weak randomness leads to predictable keys.

Key Size Selection: Based on security level needed and expected attack capabilities.

Parameter Validation: Ensure generated parameters satisfy necessary mathematical properties (primality, subgroup order, etc.).

10.6 Public Key Infrastructure (PKI)

Public keys need authentication—how do you know a public key belongs to who it claims?

10.6.1 Certificate Authorities

A Certificate Authority (CA) issues digital certificates binding an identity to a public key:

  1. User generates key pair, sends public key to CA with proof of identity
  2. CA verifies identity, creates certificate containing public key and identity
  3. CA signs certificate with its own private key
  4. Others can verify certificate using CA's public key

X.509 Certificates: Standard format containing:

  • Subject (identity)
  • Public key
  • Issuer (CA)
  • Validity period
  • Digital signature
  • Extensions (key usage, etc.)

10.6.2 Web of Trust

Alternative to hierarchical CA model, used in PGP:

  • Users sign each other's keys
  • Trust is transitive based on introducers
  • No central authority

10.6.3 Certificate Validation

Steps to validate a certificate:

  1. Check signature using issuer's public key
  2. Verify certificate not expired
  3. Check revocation status (CRL, OCSP)
  4. Validate certificate chain to trusted root
  5. Check that certificate is appropriate for intended use

10.6.4 Challenges with PKI

  • Trusting CAs: Many CAs, any can issue certificates for any domain
  • Revocation: CRLs and OCSP have scalability and privacy issues
  • Key management: Users struggle with certificate lifecycle

10.7 Hybrid Encryption

Public key cryptography is too slow for bulk data encryption. Solution: hybrid encryption.

  1. Generate random symmetric key K
  2. Encrypt message M with symmetric cipher using K
  3. Encrypt K with recipient's public key
  4. Send (E_pk(K), E_K(M))

Advantages: Speed of symmetric encryption, convenience of public key

Examples: TLS, PGP, S/MIME

10.8 Security Notions for Public Key Encryption

IND-CPA (Indistinguishability under Chosen Plaintext Attack) : Attacker chooses two plaintexts, gets encryption of one, can't tell which was encrypted. Equivalent to semantic security.

IND-CCA (Indistinguishability under Chosen Ciphertext Attack) : Attacker can obtain decryptions of chosen ciphertexts (except challenge) and still can't determine which plaintext was encrypted.

IND-CCA2 (Adaptive Chosen Ciphertext Security) : Strongest notion—attacker can continue queries after seeing challenge ciphertext.

10.9 Practical Considerations

10.9.1 Key Lengths and Security Levels

Security Level Symmetric Key RSA Modulus ECC Key Size
80 bits (legacy) 80 bits 1024 bits 160 bits
112 bits 112 bits 2048 bits 224 bits
128 bits 128 bits 3072 bits 256 bits
192 bits 192 bits 7680 bits 384 bits
256 bits 256 bits 15360 bits 512 bits

10.9.2 Performance Comparison

Operations relative speed (RSA decrypt = 1):

Operation RSA 2048 ECC 256
Key generation ~10 ~2
Sign ~1 ~3
Verify ~0.1 ~1
Encrypt (public) ~0.5 N/A
Decrypt (private) ~1 N/A
Key exchange ~0.3 ~1

ECC provides equivalent security with much smaller keys and faster operations.


CHAPTER 11: RSA CRYPTOSYSTEM

11.1 Historical Context

RSA, named after Rivest, Shamir, and Adleman, was published in 1977, making it the first practical public-key cryptosystem. The trio won the Turing Award in 2002 for this work.

11.2 Mathematical Foundation

RSA relies on the practical difficulty of factoring the product of two large primes.

11.2.1 Key Generation

  1. Choose two distinct large primes p and q (keep secret)
  2. Compute n = p × q (modulus, part of public key)
  3. Compute φ(n) = (p-1)(q-1) (Euler's totient)
  4. Choose public exponent e such that:
    • 1 < e < φ(n)
    • gcd(e, φ(n)) = 1
    • Typically e = 65537 (2¹⁶+1) for efficiency
  5. Compute private exponent d ≡ e⁻¹ mod φ(n) (d is modular inverse of e modulo φ(n))

Public key: (n, e) Private key: d (and optionally p, q for efficiency)

11.2.2 Encryption and Decryption

Encryption: c = mᵉ mod n, where m is plaintext (interpreted as integer < n)

Decryption: m = cᵈ mod n

Why it works: By Euler's theorem, mᵉᵈ ≡ m mod n when m is coprime to n (and holds for all m with careful handling).

11.2.3 Simple Example

Choose p = 61, q = 53: n = 61 × 53 = 3233 φ(n) = 60 × 52 = 3120 Choose e = 17 (gcd(17,3120) = 1) Compute d = 17⁻¹ mod 3120 = 2753 (since 17 × 2753 = 46801 = 1 + 15×3120)

Encrypt m = 123: c = 123¹⁷ mod 3233 = 855

Decrypt: m = 855²⁷⁵³ mod 3233 = 123

11.3 Security Proofs and Reductions

Security Reduction: Breaking RSA (finding d from (n,e) or decrypting arbitrary ciphertexts) is not harder than factoring n.

Important: The converse (factoring reduces to breaking RSA) is not proven. It's possible that breaking RSA is easier than factoring.

RSA Assumption: Given n, e, and random c ∈ ℤ_n*, it's hard to find m such that mᵉ ≡ c mod n.

11.4 Implementation Issues

11.4.1 Padding

Textbook RSA (direct encryption of message) is insecure:

  • Deterministic → not IND-CPA secure
  • Small messages vulnerable to attack
  • Multiplicative property: E(m₁)E(m₂) = E(m₁m₂) mod n

Solution: Use padding schemes

PKCS#1 v1.5: Add random padding to message. Vulnerable to padding oracle attacks.

OAEP (Optimal Asymmetric Encryption Padding) : Provably secure padding for RSA:

  1. Pad message with zeros, add random nonce
  2. Apply Feistel network with hash functions
  3. Results in message < n, indistinguishable from random

11.4.2 Chinese Remainder Theorem Optimization

Decryption can be sped up using CRT:

Compute: mₚ = cᵈ mod p = c^{d mod (p-1)} mod p m_q = cᵈ mod q = c^{d mod (q-1)} mod q

Then combine using CRT: m = m_q + q × ((mₚ - m_q) × q⁻¹ mod p)

This is about 4 times faster than direct modular exponentiation.

Security Note: Must verify CRT results to prevent fault attacks.

11.5 Attacks on RSA

11.5.1 Small Exponent Attacks

Small e: If same message sent to multiple recipients with different moduli but same small e, Chinese Remainder Theorem can recover message.

Small d: Wiener's attack can recover d if d < (1/3)n¹/⁴. Use large d or add random padding.

11.5.2 Common Modulus Attack

If same n used with different e₁, e₂ that are coprime, an attacker with both public keys can decrypt messages intended for either user.

Defense: Never share moduli between users.

11.5.3 Timing Attacks

Modular exponentiation time can leak information about private key. Kocher's 1996 attack showed key recovery possible by measuring decryption times.

Defenses:

  • Constant-time exponentiation
  • RSA blinding: multiply ciphertext by random rᵉ before decryption, then divide result by r

11.5.4 Padding Oracle Attacks

If implementation reveals whether decrypted padding is valid, attacker can decrypt messages adaptively. Bleichenbacher's 1998 attack against PKCS#1 v1.5.

Defense: Use OAEP, constant-time validation.

11.5.5 Fault Attacks

Inducing errors during CRT computation can reveal private key. If faulty signature compared with correct one, factors can be recovered.

Defense: Verify signature after generation.

11.5.6 Side-Channel Attacks

Power analysis, electromagnetic analysis, acoustic analysis can all leak key bits.

Defense: Implementations must be resistant to physical attacks.

11.6 RSA Variants

11.6.1 Multi-prime RSA

Use more than two primes: n = p₁ × p₂ × ... × p_k

Advantages: Faster decryption with CRT Disadvantages: Smaller primes make factoring slightly easier for given n size

11.6.2 Rabin Cryptosystem

Based on squaring modulo n: c = m² mod n

Advantages: Provably as hard as factoring Disadvantages: Decryption produces four possible plaintexts; need to identify correct one

11.6.3 Paillier Cryptosystem

Homomorphic encryption: E(m₁) × E(m₂) = E(m₁ + m₂) Used in electronic voting, private information retrieval

11.7 RSA in Practice

Key Generation:

  • Use strong random number generator
  • p and q should be similar size but not too close
  • p-1 and q-1 should have large prime factors
  • Use e = 65537 for efficiency and security

Parameter Sizes:

  • 2048 bits: Minimum for current applications
  • 3072 bits: Recommended for long-term security
  • 4096 bits: High security, post-quantum margin

Standards:

  • PKCS#1: RSA Cryptography Standard
  • FIPS 186-4: Digital Signature Standard (RSA)
  • RFC 8017: PKCS #1 v2.2

CHAPTER 12: DIFFIE–HELLMAN KEY EXCHANGE

12.1 The Problem

How can two parties establish a shared secret key over an insecure channel without any prior shared secrets?

12.2 The Diffie-Hellman Protocol

12.2.1 Basic Protocol

Public parameters:

  • Large prime p
  • Generator g of a subgroup of ℤ_p* (preferably of large prime order)

Protocol:

  1. Alice chooses random secret a, computes A = gᵃ mod p, sends A to Bob
  2. Bob chooses random secret b, computes B = gᵇ mod p, sends B to Alice
  3. Alice computes shared secret s = Bᵃ mod p = (gᵇ)ᵃ = gᵃᵇ mod p
  4. Bob computes shared secret s = Aᵇ mod p = (gᵃ)ᵇ = gᵃᵇ mod p

Result: Alice and Bob share secret gᵃᵇ mod p. Eavesdropper sees p, g, A, B but cannot compute gᵃᵇ (presumed hard—CDH problem).

12.2.2 Example

p = 23, g = 5 Alice chooses a = 6: A = 5⁶ mod 23 = 8 Bob chooses b = 15: B = 5¹⁵ mod 23 = 19 Alice computes s = 19⁶ mod 23 = 2 Bob computes s = 8¹⁵ mod 23 = 2 Shared secret: 2

12.3 Security Assumptions

Discrete Log Problem: Given g, p, and A = gᵃ, find a. (Hard for large p)

Computational Diffie-Hellman: Given g, gᵃ, gᵇ, compute gᵃᵇ. (Hard—at least as hard as DLP? Not proven)

Decisional Diffie-Hellman: Given g, gᵃ, gᵇ, and a candidate c, decide if c = gᵃᵇ. (Sometimes weaker than CDH)

12.4 Man-in-the-Middle Attack

Classic DH is vulnerable to active attack:

  1. Mallory intercepts A from Alice, sends A' = gᵃ' to Bob
  2. Mallory intercepts B from Bob, sends B' = gᵇ' to Alice
  3. Alice computes s₁ = (B')ᵃ = gᵃᵇ'
  4. Bob computes s₂ = (A')ᵇ = gᵃ'ᵇ
  5. Mallory computes s₁ = (A)ᵇ' = gᵃᵇ' and s₂ = (B)ᵃ' = gᵃ'ᵇ

Result: Mallory shares keys with both parties, can decrypt and re-encrypt all traffic.

Defense: Authenticated Diffie-Hellman (use digital signatures, certificates)

12.5 Ephemeral Diffie-Hellman (DHE)

Use fresh (ephemeral) Diffie-Hellman keys for each session:

  • Generate new a, b for each connection
  • Provides forward secrecy: if long-term keys compromised, past sessions remain secure
  • Used in TLS DHE cipher suites

12.6 Authenticated Diffie-Hellman

12.6.1 Station-to-Station Protocol

Adds authentication using signatures:

  1. Alice → Bob: gᵃ
  2. Bob → Alice: gᵇ, Eₖ(Sign_Bob(gᵃ, gᵇ))
  3. Alice → Bob: Eₖ(Sign_Alice(gᵃ, gᵇ))

Where k is derived from gᵃᵇ.

12.6.2 MQV and HMQV

More efficient authenticated key agreement:

  • Each party has long-term and ephemeral keys
  • Combined to produce authenticated session key
  • HMQV (Hashed MQV) provides security proofs

12.7 Safe Parameters

Choosing p: Must be large enough (2048+ bits) and safe prime (p = 2q+1 with q prime)

Choosing g: Generator of the subgroup of order q, where q is large prime dividing p-1

RFC 3526 defines standard groups:

  • 2048-bit MODP Group
  • 3072-bit MODP Group
  • 4096-bit MODP Group

12.8 Implementation Considerations

12.8.1 Subgroup Confinement

Ensure received values are in the correct subgroup (order q). Otherwise, small subgroup attacks possible.

Defense: Check that 1 < A < p-1 and A^q ≡ 1 mod p

12.8.2 Side-Channel Protection

Exponentiation must be constant-time to prevent timing attacks.

12.8.3 Randomness Quality

Secret exponents a, b must be truly random and unpredictable.

12.9 Elliptic Curve Diffie-Hellman (ECDH)

Same protocol but using elliptic curve groups (see Chapter 15):

  • Smaller keys (256 bits vs 3072 bits RSA)
  • Faster computation
  • Equivalent security

CHAPTER 13: ELGAMAL CRYPTOSYSTEM

13.1 Overview

Taher Elgamal described this cryptosystem in 1985, based on the Diffie-Hellman key exchange. It provides both encryption and digital signatures.

13.2 ElGamal Encryption

13.2.1 Key Generation

Choose:

  • Large prime p
  • Generator g of a subgroup of ℤ_p*
  • Private key x (random, 1 < x < p-1)
  • Public key y = gˣ mod p

Public key: (p, g, y) Private key: x

13.2.2 Encryption

To encrypt message m (represented as element of group):

  1. Choose random ephemeral key k (1 < k < p-1)
  2. Compute a = gᵏ mod p
  3. Compute b = m × yᵏ mod p
  4. Ciphertext: (a, b)

13.2.3 Decryption

  1. Compute s = aˣ mod p (this equals gᵏˣ = yᵏ)
  2. Compute m = b × s⁻¹ mod p (since b = m × yᵏ)

13.2.4 Example

p = 23, g = 5 x = 7 (private) y = 5⁷ mod 23 = 17

Encrypt m = 10: Choose k = 3 a = 5³ mod 23 = 10 b = 10 × 17³ mod 23 = 10 × 15 = 150 mod 23 = 12 Ciphertext: (10, 12)

Decrypt: s = 10⁷ mod 23 = 15 s⁻¹ mod 23 = 17 (since 15×17=255≡1 mod 23) m = 12 × 17 mod 23 = 204 mod 23 = 10

13.3 Properties

Probabilistic: Different random k produces different ciphertexts for same message (IND-CPA secure).

Expansion: Ciphertext is twice size of plaintext.

Homomorphic: Multiplication of ciphertexts corresponds to multiplication of plaintexts: E(m₁) × E(m₂) = (a₁a₂, b₁b₂) = E(m₁m₂)

13.4 Security

Based on Diffie-Hellman problem:

  • Finding x from y is discrete log
  • Finding s = yᵏ from a and y is CDH
  • Without x or k, can't recover m from (a,b)

IND-CPA Security: Under DDH assumption, ElGamal is IND-CPA secure.

IND-CCA Security: Basic ElGamal is not IND-CCA secure (malleable). Use with padding or in hybrid encryption.

13.5 ElGamal Digital Signature

13.5.1 Signing

To sign message m:

  1. Choose random k with gcd(k, p-1) = 1
  2. Compute r = gᵏ mod p
  3. Compute s = (m - xr) × k⁻¹ mod (p-1)
  4. Signature: (r, s)

13.5.2 Verification

Verify: gᵐ ≡ yʳ × rˢ mod p

13.5.3 Security Notes

  • k must be random and never reused (else private key revealed)
  • Original ElGamal signature has been superseded by DSA

13.6 Variants and Applications

Integrated Encryption Scheme (IES) : Hybrid encryption combining Diffie-Hellman key agreement with symmetric encryption and MAC.

Cramer-Shoup: IND-CCA2 secure variant of ElGamal using additional elements.

Used in: Pretty Good Privacy (PGP), GNU Privacy Guard (GnuPG)


PART V — ELLIPTIC CURVE CRYPTOGRAPHY


CHAPTER 14: ELLIPTIC CURVE MATHEMATICS

14.1 Introduction to Elliptic Curves

Elliptic curves have been studied by mathematicians for centuries, but their application to cryptography began independently with Neal Koblitz and Victor Miller in 1985. ECC offers equivalent security with much smaller key sizes than RSA.

14.2 Elliptic Curves over Real Numbers

14.2.1 Definition

An elliptic curve over real numbers is defined by the equation:

y² = x³ + ax + b

with the discriminant condition 4a³ + 27b² ≠ 0 (to ensure no singular points).

14.2.2 Graph Characteristics

  • Symmetric about x-axis
  • Different shapes depending on a and b
  • Includes point at infinity (O) as identity element

14.2.3 Group Law

Points on an elliptic curve form an abelian group with addition defined geometrically:

Point Addition (P + Q) :

  1. Draw line through P and Q
  2. Find third intersection point with curve (R')
  3. Reflect across x-axis to get R = P + Q

Point Doubling (P + P) :

  1. Draw tangent line at P
  2. Find second intersection point (R')
  3. Reflect to get 2P

Identity: Point at infinity O (P + O = P)

Inverse: -P is reflection across x-axis

14.3 Elliptic Curves over Finite Fields

For cryptography, we use elliptic curves over finite fields, typically GF(p) (prime fields) or GF(2ᵐ) (binary fields).

14.3.1 Curves over GF(p)

Equation: y² ≡ x³ + ax + b (mod p)

All arithmetic modulo prime p. The curve has finitely many points (including O).

14.3.2 Curves over GF(2ᵐ)

Equation: y² + xy = x³ + ax² + b (for non-supersingular curves)

Arithmetic in binary field. Efficient for hardware implementation.

14.4 Group Law over Finite Fields

14.4.1 Point Addition (P ≠ Q)

Given P = (x₁, y₁), Q = (x₂, y₂), P ≠ Q, P ≠ -Q:

λ = (y₂ - y₁) / (x₂ - x₁) mod p x₃ = λ² - x₁ - x₂ mod p y₃ = λ(x₁ - x₃) - y₁ mod p

14.4.2 Point Doubling (P = Q)

When y₁ ≠ 0:

λ = (3x₁² + a) / (2y₁) mod p x₃ = λ² - 2x₁ mod p y₃ = λ(x₁ - x₃) - y₁ mod p

14.4.3 Point at Infinity

O serves as identity: P + O = P O + O = O

14.5 Elliptic Curve Discrete Logarithm Problem (ECDLP)

Given points P and Q = kP on an elliptic curve, find k.

Security Basis: ECDLP appears harder than DLP in finite fields for equivalent parameter sizes. Best known algorithms for ECDLP are exponential (except for special curves), while DLP has subexponential algorithms.

14.6 Choosing Curves for Cryptography

14.6.1 Security Requirements

  • Prime order: Curve order should be prime or have large prime factor to resist Pohlig-Hellman attack
  • MOV condition: Curve must not allow reduction of ECDLP to DLP in small extension field
  • Anomalous condition: Curve order ≠ p (to resist Smart's attack)
  • Twist security: Related curves must also be secure

14.6.2 Standard Curves

NIST Curves (FIPS 186-4):

  • P-192, P-224, P-256, P-384, P-521
  • K-163, K-233, K-283, K-409, K-571 (Koblitz curves)
  • B-163, B-233, B-283, B-409, B-571 (binary curves)

Curve25519 (Daniel Bernstein):

  • y² = x³ + 486662x² + x over GF(2²⁵⁵ - 19)
  • Designed for high security and efficient implementation
  • Montgomery form enables fast, constant-time scalar multiplication

Edwards Curves:

  • x² + y² = 1 + dx²y²
  • Complete addition formulas (no exceptional cases)
  • Ed25519 uses twisted Edwards form

14.7 Point Representation

14.7.1 Affine Coordinates

Standard (x, y) representation. Addition requires modular inverses (expensive).

14.7.2 Projective Coordinates

Represent point as (X, Y, Z) with x = X/Z, y = Y/Z. Avoids inverses during addition chain.

Jacobian coordinates: (X, Y, Z) with x = X/Z², y = Y/Z³. Even more efficient.

14.8 Scalar Multiplication

Computing kP (point multiplied by integer) is the core operation in ECC.

Double-and-add algorithm: Initialize Q = O For each bit of k from MSB to LSB: Q = 2Q (double) if bit = 1: Q = Q + P (add) Return Q

Optimizations:

  • Window methods (precompute multiples of P)
  • Signed-digit representations (non-adjacent form)
  • Montgomery ladder (constant-time)

CHAPTER 15: ECC ALGORITHMS

15.1 ECDH (Elliptic Curve Diffie-Hellman)

15.1.1 Protocol

Public parameters: Curve with base point G of prime order n

  1. Alice chooses private key d_A (random 1 < d_A < n) Computes public key Q_A = d_A × G
  2. Bob chooses private key d_B, computes Q_B = d_B × G
  3. Alice computes shared secret S = d_A × Q_B
  4. Bob computes shared secret S = d_B × Q_A

Both compute the same point: S = d_A d_B × G

15.1.2 Key Derivation

The shared secret point S is not used directly as key. Instead:

  • Use x-coordinate of S
  • Apply key derivation function (KDF) to produce symmetric keys

15.1.3 Security Considerations

  • Validate received public keys (point is on curve, has correct order)
  • Use ephemeral keys for forward secrecy (ECDHE)
  • Combine with authentication to prevent MitM

15.2 ECDSA (Elliptic Curve Digital Signature Algorithm)

15.2.1 Key Generation

  • Choose curve with base point G of order n
  • Choose private key d (random 1 < d < n)
  • Compute public key Q = d × G

15.2.2 Signing

To sign message m:

  1. Compute e = H(m) (hash of message)
  2. Choose random k (1 < k < n)
  3. Compute (x₁, y₁) = k × G
  4. Compute r = x₁ mod n (if r = 0, choose new k)
  5. Compute s = k⁻¹(e + dr) mod n (if s = 0, choose new k)
  6. Signature: (r, s)

15.2.3 Verification

  1. Verify r, s in [1, n-1]
  2. Compute e = H(m)
  3. Compute w = s⁻¹ mod n
  4. Compute u₁ = e × w mod n
  5. Compute u₂ = r × w mod n
  6. Compute (x₁, y₁) = u₁ × G + u₂ × Q
  7. Signature valid if r ≡ x₁ mod n

15.2.4 Security Notes

  • k must be truly random and never reused
  • Reusing k reveals private key (as happened with Sony PS3)
  • Use deterministic k (RFC 6979) to avoid RNG failures

15.3 EdDSA (Edwards-curve Digital Signature Algorithm)

15.3.1 Design Principles

EdDSA, designed by Daniel Bernstein, improves on ECDSA:

  • Uses twisted Edwards curves (Ed25519, Ed448)
  • Deterministic (no RNG needed for signing)
  • Resistant to side-channel attacks
  • Faster and simpler to implement correctly

15.3.2 Ed25519

Parameters:

  • Curve: edwards25519 (birationally equivalent to Curve25519)
  • Prime: 2²⁵⁵ - 19
  • Base point order: 2²⁵² + 27742317777372353535851937790883648493

Signing:

  1. Compute r = H(hash of private key || message)
  2. Compute R = r × B
  3. Compute h = H(R || public key || message)
  4. Compute s = r + h × private key mod L
  5. Signature: (R, s)

Advantages:

  • Small signatures (64 bytes for Ed25519)
  • Fast verification (can be batched)
  • No branching on secret data
  • Deterministic prevents RNG failures

15.4 Curve25519 and Ed25519

15.4.1 Design Choices

  • Montgomery form: y² = x³ + 486662x² + x
  • x-coordinate only: Diffie-Hellman uses only x-coordinate (faster, simpler)
  • Constant-time: Designed to enable constant-time implementations
  • Prime: 2²⁵⁵ - 19 (fits in 5 51-bit limbs on 64-bit machines)

15.4.2 Security Features

  • Twist security: The quadratic twist is also secure
  • Ladder: Montgomery ladder for scalar multiplication
  • Clamping: Private key bits set appropriately to avoid small subgroup attacks

15.4.3 Adoption

  • TLS 1.3 requires support for X25519 and Ed25519
  • Signal Protocol, WireGuard, OpenSSH use them
  • Increasingly preferred over NIST curves

15.5 ECC vs. RSA

Aspect RSA ECC
Key size (128-bit security) 3072 bits 256 bits
Signature size 3072 bits 512 bits (Ed25519: 512 bits)
Key generation speed Slow Fast
Encryption speed Fast (public) Not directly (requires ECIES)
Decryption/signing speed Slow Fast
Mathematical basis Factoring ECDLP

15.6 Implementation Pitfalls

15.6.1 Invalid Curve Attacks

Attacker supplies point not on curve, forcing computations in different group that may be easier to break.

Defense: Always verify received points are on curve and have correct order.

15.6.2 Side-Channel Attacks

  • Timing: Scalar multiplication must be constant-time
  • Power: Differential power analysis can reveal bits

Defense: Use Montgomery ladder, complete addition formulas, randomize coordinates.

15.6.3 Small Subgroup Attacks

If curve order has small factors, attacker can force private key into small subgroup.

Defense: Use prime-order curves, or validate received points have correct order.


CHAPTER 16: PAIRING-BASED CRYPTOGRAPHY

16.1 Introduction to Pairings

A bilinear pairing is a map e: G₁ × G₂ → Gₜ with special properties. Pairings enable cryptographic constructions not possible with other tools.

16.2 Bilinear Pairings

16.2.1 Definition

Let G₁, G₂, Gₜ be cyclic groups of prime order n. A bilinear pairing is a map e: G₁ × G₂ → Gₜ satisfying:

  1. Bilinearity: e(aP, bQ) = e(P, Q)ᵃᵇ for all a, b ∈ ℤ_n
  2. Non-degeneracy: e(P, Q) ≠ 1 for some P ∈ G₁, Q ∈ G₂
  3. Computability: e can be efficiently computed

If G₁ = G₂, it's called symmetric pairing.

16.2.2 Types of Pairings

  • Weil pairing: First cryptographic pairing, but inefficient
  • Tate pairing: More efficient than Weil
  • Optimal Ate pairing: Most efficient for modern curves

16.2.3 Suitable Curves

  • Supersingular curves: Allow efficient pairings but may have MOV attack (reducing ECDLP to DLP)
  • MNT curves: Miyaji-Nakabayashi-Takano curves
  • BN curves: Barreto-Naehrig curves (prime order, optimal pairing)
  • BLS curves: Barreto-Lynn-Scott curves

16.3 Identity-Based Encryption (IBE)

16.3.1 Concept

In IBE, any string (email address, domain name) can serve as public key. A trusted authority with master secret generates private keys for identities.

16.3.2 Boneh-Franklin IBE

The first practical IBE scheme (2001):

Setup:

  • Choose bilinear groups G₁, G₂, Gₜ of order n
  • Choose generators P ∈ G₁, Q ∈ G₂
  • Choose master secret s ∈ ℤ_n
  • Compute P_pub = sP (public parameters)
  • Hash functions H₁: {0,1}* → G₁, H₂: Gₜ → {0,1}*

Extract (private key for identity ID):

  • Q_ID = H₁(ID)
  • d_ID = sQ_ID

Encrypt to ID with message M:

  • Choose random r
  • Compute g_ID = e(Q_ID, P_pub)
  • Ciphertext: (rP, M ⊕ H₂(g_IDʳ))

Decrypt with private key d_ID:

  • Compute g_IDʳ = e(d_ID, rP)
  • Recover M

16.3.3 Applications

  • Email encryption without certificates
  • Key distribution in sensor networks
  • Simplified PKI

16.4 Short Signatures

16.4.1 Boneh-Lynn-Shacham (BLS) Signatures

BLS signatures are very short (one group element):

Key generation: Private key x, public key X = xP

Sign: For message m, compute σ = xH(m) (where H maps to G₁)

Verify: Check e(σ, P) = e(H(m), X)

Properties:

  • Very short signatures (33 bytes at 128-bit security)
  • Signature aggregation possible
  • Used in blockchain (Ethereum 2.0)

16.5 Other Pairing-Based Protocols

16.5.1 Joux's Tripartite Diffie-Hellman

Three parties can establish shared secret in one round:

  • Parties have keys a, b, c
  • Exchange aP, bP, cP
  • Each computes e(bP, cP)ᵃ = e(P, P)ᵃᵇᶜ

16.5.2 Attribute-Based Encryption (ABE)

Encryption based on attributes; decryption possible if user's attributes satisfy policy.

16.5.3 Searchable Encryption

Encrypt data with keywords; generate trapdoors to search without revealing keywords.

16.6 Implementation Considerations

Performance: Pairings are computationally expensive (but improving)

Security Parameters: Need careful selection of curves and group sizes

Side-Channel Protection: Pairing implementations must be constant-time


PART VI — DIGITAL SIGNATURES & AUTHENTICATION


CHAPTER 17: DIGITAL SIGNATURES

17.1 Introduction to Digital Signatures

Digital signatures provide authentication, integrity, and non-repudiation for digital messages. They are the digital equivalent of handwritten signatures.

Properties:

  • Authenticity: Verifies identity of signer
  • Integrity: Detects any modification after signing
  • Non-repudiation: Signer cannot deny having signed

17.2 General Framework

A digital signature scheme consists of three algorithms:

  1. KeyGen(1^λ) → (sk, pk): Generate signing and verification keys
  2. Sign(sk, m) → σ: Produce signature on message m
  3. Verify(pk, m, σ) → {0,1}: Verify signature validity

Security Requirement: Existential unforgeability under chosen message attack (EUF-CMA). Adversary who can request signatures on chosen messages cannot produce valid signature on new message.

17.3 RSA Signatures

17.3.1 Basic RSA Signatures

Sign: σ = mᵈ mod n Verify: m ≡ σᵉ mod n

Problems:

  • Multiplicative property: σ(m₁) × σ(m₂) = σ(m₁m₂)
  • Existential forgery possible
  • No hashing of message

17.3.2 RSA with Hashing

Sign: σ = H(m)ᵈ mod n Verify: H(m) ≡ σᵉ mod n

PKCS#1 v1.5: Add padding structure before hashing

PSS (Probabilistic Signature Scheme) : Provably secure RSA signature padding

17.4 DSA (Digital Signature Algorithm)

17.4.1 Parameters

  • Prime p (1024-3072 bits)
  • Prime q (160-256 bits) dividing p-1
  • Generator g of subgroup of order q
  • Private key x (random 1 < x < q)
  • Public key y = gˣ mod p

17.4.2 Signing

For message m:

  1. Compute hash h = H(m)
  2. Choose random k (1 < k < q)
  3. Compute r = (gᵏ mod p) mod q
  4. Compute s = k⁻¹(h + xr) mod q
  5. Signature: (r, s)

17.4.3 Verification

  1. Verify r, s in [1, q-1]
  2. Compute w = s⁻¹ mod q
  3. Compute u₁ = h × w mod q
  4. Compute u₂ = r × w mod q
  5. Compute v = ((gᵘ¹ × yᵘ²) mod p) mod q
  6. Accept if v = r

17.4.4 Security Notes

  • k must be random and secret (reuse reveals private key)
  • 1994: Boneh discovered attack with biased k

17.5 ECDSA (Elliptic Curve DSA)

(Already covered in Chapter 15)

17.6 Schnorr Signatures

17.6.1 Description

Schnorr signatures offer advantages over DSA/ECDSA:

Parameters:

  • Group G of prime order q with generator g
  • Hash function H

Key generation: Private key x, public key y = gˣ

Signing:

  1. Choose random k
  2. Compute r = gᵏ
  3. Compute e = H(r || m)
  4. Compute s = k - xe mod q
  5. Signature: (e, s)

Verification: Compute r_v = gˢ × yᵉ Accept if e = H(r_v || m)

17.6.2 Advantages

  • Provably secure in random oracle model
  • Linear: Signatures can be aggregated
  • Batch verification possible
  • Patent-free (since 2008)

17.6.3 EdDSA

EdDSA is a variant of Schnorr signatures using Edwards curves (see Chapter 15).

17.7 Comparison of Signature Schemes

Scheme Signature Size Verification Speed Security Proof
RSA-PSS 256-512 bytes Slow (exponentiation) Yes (ROM)
DSA ~40-64 bytes Moderate No
ECDSA ~64 bytes Fast No
Schnorr ~64 bytes Fast Yes (ROM)
Ed25519 64 bytes Very fast Yes

17.8 Signature Standards

  • FIPS 186-5: DSA, RSA, ECDSA, EdDSA
  • RFC 8032: EdDSA (Ed25519, Ed448)
  • PKCS#1: RSA signatures

CHAPTER 18: AUTHENTICATION PROTOCOLS

18.1 The Authentication Problem

Authentication verifies that a party is who they claim to be. This can be based on:

  • Something you know (password)
  • Something you have (token, smart card)
  • Something you are (biometrics)

18.2 Challenge-Response Protocols

18.2.1 Basic Concept

Verifier sends random challenge; claimant responds in a way that proves knowledge of secret.

Symmetric Key Version:

  1. Verifier → Claimant: random nonce N
  2. Claimant → Verifier: MAC(K, N) or E_K(N)

Public Key Version:

  1. Verifier → Claimant: random N
  2. Claimant → Verifier: Sign(sk, N)

18.2.2 Advantages

  • Prevents replay attacks (fresh challenge each time)
  • No need to store passwords

18.3 Kerberos

18.3.1 Overview

Kerberos, developed at MIT, provides authentication in distributed environments using symmetric cryptography and a trusted third party (Key Distribution Center).

18.3.2 Components

  • KDC (Key Distribution Center) :
    • Authentication Server (AS)
    • Ticket Granting Server (TGS)
  • Clients: Users requesting services
  • Servers: Services clients want to use

18.3.3 Protocol Flow

Phase 1: Authentication Service Exchange

  1. Client → AS: Request ticket-granting ticket (authenticates)
  2. AS → Client: TGT encrypted with client's password-derived key (Contains session key and TGT for TGS)

Phase 2: Ticket-Granting Service Exchange

  1. Client → TGS: TGT + authenticator + request for service ticket
  2. TGS → Client: Service ticket (encrypted with service's key)

Phase 3: Client-Server Exchange

  1. Client → Server: Service ticket + authenticator
  2. Server → Client: Optional server authentication

18.3.4 Security Features

  • Mutual authentication: Both parties verify each other
  • Replay protection: Timestamps in authenticators
  • Session keys: Separate keys per session

18.3.5 Limitations

  • Single point of failure (KDC)
  • Requires synchronized clocks
  • Limited to symmetric cryptography (no non-repudiation)

18.4 OAuth 2.0 and OpenID Connect

18.4.1 OAuth 2.0

OAuth 2.0 is an authorization framework, not authentication:

Roles:

  • Resource Owner: User who owns data
  • Client: Application requesting access
  • Authorization Server: Issues tokens
  • Resource Server: Hosts protected resources

Flow:

  1. Client requests authorization from resource owner
  2. Resource owner grants authorization
  3. Client receives authorization grant
  4. Client exchanges grant for access token
  5. Client uses token to access resources

18.4.2 OpenID Connect

Adds authentication layer on top of OAuth 2.0:

  • ID Token: JWT containing user identity claims
  • UserInfo endpoint: Additional user information

18.4.3 Security Considerations

  • Must use TLS for all communications
  • Proper validation of redirect URIs
  • Protect client secrets
  • Use PKCE for public clients

18.5 Zero-Knowledge Authentication

18.5.1 Concept

Zero-knowledge proofs allow prover to convince verifier of knowledge without revealing the secret.

18.5.2 Fiat-Shamir Protocol

  1. Prover chooses random r, sends t = r² mod n
  2. Verifier sends random challenge c ∈ {0,1}
  3. If c=0, prover sends r; verifier checks r² ≡ t
  4. If c=1, prover sends s = r × secret; verifier checks s² ≡ t × public_key

Repeated many times for security.

18.5.3 Applications

  • Smart cards (prove identity without revealing key)
  • Cryptographic protocols
  • Authentication without password transmission

18.6 Multi-Factor Authentication

Combine multiple factors for stronger security:

Common factors:

  • Password (knowledge)
  • SMS code, authenticator app (possession)
  • Fingerprint, face (inherence)
  • Location, behavior (context)

Protocols: TOTP (RFC 6238), U2F/WebAuthn


PART VII — CRYPTOGRAPHIC PROTOCOLS


CHAPTER 19: SECURE COMMUNICATION PROTOCOLS

19.1 SSL/TLS

19.1.1 History

  • SSL 1.0: Never released (1994)
  • SSL 2.0: 1995 (flawed)
  • SSL 3.0: 1996 (obsoleted 2015 due to POODLE)
  • TLS 1.0: 1999 (RFC 2246, deprecated)
  • TLS 1.1: 2006 (RFC 4346, deprecated)
  • TLS 1.2: 2008 (RFC 5246, current)
  • TLS 1.3: 2018 (RFC 8446, current)

19.1.2 TLS 1.2 Handshake

  1. Client Hello: Supported versions, cipher suites, random nonce
  2. Server Hello: Chosen version, cipher suite, random nonce, session ID
  3. Server Certificate: X.509 certificate chain
  4. Server Key Exchange: (if needed) Server's DH parameters
  5. Server Hello Done
  6. Client Key Exchange: Client's DH parameters, optionally encrypted pre-master secret
  7. Change Cipher Spec: Switch to negotiated algorithms
  8. Finished: Verified handshake messages
  9. Change Cipher Spec (server)
  10. Finished (server)

19.1.3 TLS 1.3 Improvements

  • Reduced handshake latency (1-RTT, 0-RTT)
  • Removed weak algorithms (MD5, SHA-1, RC4)
  • Encrypted handshake (most messages encrypted)
  • Perfect forward secrecy by default (no static RSA)
  • Simplified cipher suites (just AEAD and HKDF)

Handshake:

  1. Client Hello (with key share)
  2. Server Hello (with key share), Certificate, Finished
  3. Client Finished
  4. Application Data

19.1.4 TLS Security Considerations

  • Certificate validation critical
  • Protocol downgrade attacks (must enforce minimum versions)
  • Side-channel attacks (Lucky 13, etc.)
  • Implementation complexity leads to vulnerabilities

19.2 IPsec

19.2.1 Overview

IPsec provides security at IP layer, protecting all upper-layer protocols. Used in VPNs.

Components:

  • AH (Authentication Header) : Provides integrity and authentication
  • ESP (Encapsulating Security Payload) : Provides confidentiality (with optional auth)
  • IKE (Internet Key Exchange) : Key management protocol

19.2.2 Modes

Transport Mode: Protects payload only (for host-to-host) Tunnel Mode: Protects entire IP packet (for VPN gateways)

19.2.3 IKEv2

  • Authenticated Diffie-Hellman key exchange
  • Establishes Security Associations (SAs)
  • Built on reliable transport (usually UDP 500)

19.3 SSH (Secure Shell)

19.3.1 Purpose

SSH provides secure remote login and command execution, replacing telnet, rlogin, etc.

19.3.2 Architecture

SSH Transport Layer:

  • Key exchange (Diffie-Hellman)
  • Server authentication (host keys)
  • Encryption, integrity, compression

SSH User Authentication:

  • Password
  • Public key
  • Keyboard-interactive
  • GSSAPI (Kerberos)

SSH Connection Protocol:

  • Multiplexes multiple channels over one connection
  • Port forwarding, X11 forwarding, etc.

19.3.3 Security Features

  • Host key verification (TOFU or PKI)
  • Perfect forward secrecy
  • Strong crypto (ChaCha20-Poly1305, Ed25519)

19.4 Signal Protocol

19.4.1 Overview

The Signal Protocol (formerly Axolotl) provides end-to-end encryption for messaging, used in Signal, WhatsApp, Skype.

19.4.2 Key Features

Double Ratchet:

  • Symmetric ratchet: Derive new keys from previous for forward secrecy
  • Diffie-Hellman ratchet: Incorporate new DH keys for future secrecy

X3DH (Extended Triple Diffie-Hellman) :

  • Asynchronous key exchange
  • Uses prekeys published to server
  • Establishes initial shared secret

Sealed Sender:

  • Encrypts sender identity
  • Prevents server from knowing who's messaging whom

19.4.3 Security Properties

  • Forward secrecy: Compromise of current keys doesn't reveal past messages
  • Future secrecy (self-healing): Compromise doesn't affect future messages after repair
  • Deniability: No cryptographic proof of who sent message
  • Asynchronous operation: Parties need not be online simultaneously

CHAPTER 20: KEY MANAGEMENT

20.1 The Key Management Problem

Keys are the secrets that protect everything else. Poor key management undermines even the strongest cryptography.

Key Management Lifecycle:

  • Generation
  • Distribution
  • Storage
  • Usage
  • Rotation
  • Backup
  • Revocation
  • Destruction

20.2 Key Distribution

20.2.1 Symmetric Key Distribution

Manual Pre-sharing: Physically deliver keys (secure but not scalable)

Key Distribution Center (KDC) : As in Kerberos, trusted third party distributes session keys

Public Key Techniques: Use asymmetric crypto to exchange symmetric keys (hybrid encryption)

20.2.2 Asymmetric Key Distribution

Certificates: Bind identity to public key via CA signature

Web of Trust: Users sign each other's keys (PGP)

Trust on First Use (TOFU) : Accept key on first connection, warn if changes (SSH)

DNS-based Authentication: DANE (RFC 6698) uses DNSSEC to publish keys

20.3 Key Storage

20.3.1 Software Storage

  • Encrypted key files (password-protected)
  • Key derivation from passwords (PBKDF2, scrypt, Argon2)
  • Risks: Memory scraping, file system access

20.3.2 Hardware Security Modules (HSMs)

Physical devices designed to protect keys:

Features:

  • Keys never leave HSM in plaintext
  • Tamper-resistant/evident
  • Cryptographic operations performed inside
  • Audit logging
  • High-performance crypto acceleration

Use Cases:

  • Certificate Authorities (root keys)
  • Payment processing (PCI DSS)
  • Database encryption

20.3.3 Trusted Platform Module (TPM)

TPM is a low-cost HSM-like chip on many computers:

  • Secure storage of platform measurements
  • Key generation and storage
  • Platform authentication
  • Disk encryption key protection (BitLocker)

20.3.4 Secure Enclaves

Hardware-isolated execution environments (Intel SGX, ARM TrustZone):

  • Code and data protected even from OS
  • Remote attestation possible
  • Use for confidential computing

20.4 Key Rotation

Why rotate keys:

  • Limit exposure if key compromised
  • Reduce crypto-analysis material
  • Compliance requirements
  • Personnel changes

Rotation strategies:

  • Time-based: New key every X days/months
  • Volume-based: After encrypting certain amount of data
  • Event-based: After personnel changes, security incidents

20.5 Key Escrow and Recovery

Key Escrow: Third party holds copy of key for authorized access

  • Corporate: Employee keys accessible to organization
  • Government: Law enforcement access (controversial)

Key Recovery: Mechanisms to regain access when key lost

  • Secret sharing: Split key among multiple parties
  • Smart cards with PIN recovery
  • Encrypted backups

20.6 Key Derivation Functions (KDFs)

Convert source key material (password, shared secret) into cryptographic keys:

Password-Based KDFs:

  • PBKDF2 (RFC 8018): Iterated HMAC
  • scrypt: Memory-hard (resists GPU/ASIC attacks)
  • Argon2: Winner of Password Hashing Competition

HKDF (RFC 5869) : Extract-then-expand for Diffie-Hellman output


CHAPTER 21: ZERO-KNOWLEDGE PROOFS

21.1 Introduction

A zero-knowledge proof allows a prover to convince a verifier of a statement's truth without revealing any information beyond the statement's validity.

Properties:

  • Completeness: Honest prover can convince verifier
  • Soundness: Dishonest prover cannot convince verifier of false statement
  • Zero-knowledge: Verifier learns nothing except statement validity

21.2 Interactive Proofs

21.2.1 The Ali Baba Cave Example

Classic analogy: Peggy wants to prove to Victor she knows secret password to open door in circular cave, without revealing password.

  1. Victor stands outside, Peggy enters cave
  2. Victor calls out which path Peggy should exit from
  3. Peggy exits from correct path (using secret door if needed)
  4. Repeat many times

Probability of cheating: 1/2 per round → after n rounds, cheating probability 2⁻ⁿ

21.2.2 Graph Isomorphism

Prover knows isomorphism between graphs G₀ and G₁:

  1. Prover generates random H isomorphic to G₀, sends H
  2. Verifier chooses random bit b
  3. If b=0, prover shows isomorphism G₀→H
  4. If b=1, prover shows isomorphism G₁→H (requires knowing G₀→G₁)

21.3 Non-Interactive Zero-Knowledge (NIZK)

Fiat-Shamir heuristic converts interactive proofs to non-interactive using hash function as random oracle:

  • Replace verifier's challenges with hash of transcript so far
  • Prover generates proof without interaction

21.4 zk-SNARKs

Zero-Knowledge Succinct Non-interactive ARguments of Knowledge

21.4.1 Properties

  • Succinct: Proof size small (hundreds of bytes)
  • Fast verification: Millisecond time
  • Non-interactive: Single message
  • Zero-knowledge: Reveals nothing

21.4.2 How They Work

  1. Statement converted to arithmetic circuit
  2. Circuit expressed as quadratic arithmetic program (QAP)
  3. Prover computes polynomials satisfying constraints
  4. Proof consists of elliptic curve points (commitments)
  5. Verifier checks pairing equations

21.4.3 Applications

  • Zcash: Private transactions
  • Blockchain scaling: Verify many transactions off-chain
  • Privacy-preserving credentials

21.4.4 Limitations

  • Trusted setup required (ceremony)
  • Quantum-vulnerable (elliptic curves)
  • Complex to implement

21.5 zk-STARKs

Zero-Knowledge Scalable Transparent ARguments of Knowledge

21.5.1 Advantages over SNARKs

  • Transparent: No trusted setup
  • Quantum-resistant: Based on hash functions
  • Scalable: Prover time O(n log n), verification polylog(n)

21.5.2 How They Work

  • Use hash functions and Merkle trees
  • IOP (Interactive Oracle Proof) + Fiat-Shamir
  • Arithmetization using algebraic intermediate representation (AIR)

21.5.3 Trade-offs

  • Larger proofs than SNARKs (tens to hundreds of KB)
  • Newer technology, less battle-tested

21.6 Applications of Zero-Knowledge Proofs

  • Authentication: Prove knowledge of password without sending it
  • Privacy-preserving transactions: Prove balance sufficient without revealing amount
  • Verifiable computation: Prove computation done correctly
  • Anonymous credentials: Prove attributes without revealing identity
  • Voting systems: Prove vote valid without revealing choice

PART VIII — POST-QUANTUM CRYPTOGRAPHY


CHAPTER 22: QUANTUM COMPUTING BASICS

22.1 The Quantum Threat

Quantum computers, if built at sufficient scale, will break much of current cryptography:

  • Shor's algorithm: Solves factoring and discrete log in polynomial time (breaks RSA, DSA, ECDSA, DH)
  • Grover's algorithm: Quadratically speeds up brute-force search (weakens symmetric crypto, but doubling key size suffices)

22.2 Qubits and Quantum Gates

Qubit: Quantum bit that can be in superposition of |0⟩ and |1⟩: |ψ⟩ = α|0⟩ + β|1⟩, where |α|² + |β|² = 1

Entanglement: Multiple qubits can be correlated in ways impossible classically

Quantum Gates: Unitary transformations on qubits (Hadamard, Pauli, CNOT)

22.3 Shor's Algorithm

Idea: Reduce factoring to period-finding, which quantum computers can solve efficiently:

  1. Choose random a < N
  2. Find period r of f(x) = aˣ mod N (quantum part)
  3. If r even and aʳ/² ≠ ±1 mod N, compute gcd(aʳ/² ± 1, N) to get factors

Impact: RSA, DSA, ECDH, ECDSA broken in polynomial time

22.4 Grover's Algorithm

Idea: Search unsorted database of N items in O(√N) time (classical O(N))

Impact on symmetric crypto:

  • Key search: 2ⁿ → 2ⁿ/² (so 128-bit keys → 64-bit security)
  • Hash collisions: 2ⁿ/² → 2ⁿ/³ (so 256-bit hash → 85-bit security)

Mitigation: Double key sizes (AES-256, SHA-384)

22.5 Post-Quantum Cryptography (PQC)

NIST PQC competition (2016-present) aims to standardize quantum-resistant algorithms:

Categories:

  • Lattice-based
  • Code-based
  • Multivariate
  • Hash-based
  • Isogeny-based

CHAPTER 23: LATTICE-BASED CRYPTOGRAPHY

23.1 Introduction to Lattices

A lattice is a discrete subgroup of ℝⁿ: all integer combinations of basis vectors.

L = { Σ aᵢ bᵢ | aᵢ ∈ ℤ } for basis vectors b₁,...,bₙ

23.2 Hard Lattice Problems

Shortest Vector Problem (SVP) : Find shortest non-zero vector in lattice Closest Vector Problem (CVP) : Find lattice point closest to given target Learning With Errors (LWE) : Recover secret s given noisy equations

These problems are believed hard even for quantum computers.

23.3 Learning With Errors (LWE)

Definition: Given samples (a, b = ⟨a, s⟩ + e mod q) where a random, e small error, recover s.

Decision LWE: Distinguish (a, b) from random.

Security: Reductions from worst-case lattice problems to average-case LWE.

23.4 Ring-LWE

Improves efficiency by working in polynomial rings:

  • Elements are polynomials of degree n-1
  • Multiplication using FFT
  • Smaller keys than plain LWE

23.5 NTRU

One of the oldest lattice-based cryptosystems (1996):

Parameters: Polynomial ring R = ℤ[X]/(Xⁿ-1), moduli p, q

Key generation:

  • Choose random small polynomials f, g
  • Compute h = f⁻¹ × g mod q (public key)
  • Private key: f (and f⁻¹ mod p)

Encryption: c = p × h × r + m mod q (r random small polynomial)

Decryption: a = f × c mod q, then m = a × f⁻¹ mod p

23.6 NIST PQC Finalists

Kyber: IND-CCA2 secure KEM based on Module-LWE Dilithium: Lattice-based digital signature Falcon: Fast Fourier lattice-based signatures

23.7 Advantages and Challenges

Advantages:

  • Strong security proofs
  • Efficient operations
  • Versatile (encryption, signatures, FHE)

Challenges:

  • Larger keys than ECC (but smaller than code-based)
  • Complex parameter selection
  • Potential side-channel attacks

CHAPTER 24: CODE-BASED CRYPTOGRAPHY

24.1 Introduction to Error-Correcting Codes

Error-correcting codes add redundancy to detect and fix transmission errors. Code-based cryptography uses the difficulty of decoding random linear codes.

24.2 McEliece Cryptosystem

24.2.1 Overview

Proposed by Robert McEliece in 1978 (oldest PQC scheme). Based on Goppa codes.

Key generation:

  • Choose Goppa code with generator matrix G (can correct t errors)
  • Choose random invertible matrix S and permutation matrix P
  • Public key: G' = S × G × P
  • Private key: S, G, P

Encryption:

  • Encode message m as codeword m × G'
  • Add random t-bit error vector e
  • Ciphertext: c = m × G' + e

Decryption:

  • Compute P⁻¹ × c
  • Use decoding algorithm for Goppa code to correct errors
  • Recover m × S, then multiply by S⁻¹

24.2.2 Security

Based on difficulty of decoding random linear codes (NP-hard). Unbroken for >40 years.

24.2.3 Drawbacks

  • Large public keys (hundreds of KB to MB)
  • Encryption expands message length

24.3 Modern Code-Based Schemes

Classic McEliece: NIST PQC finalist with conservative parameters

BIKE: Bit-flipping key encapsulation (smaller keys, newer)

HQC: Hamming Quasi-Cyclic (balance of size and security)


CHAPTER 25: HASH-BASED SIGNATURES

25.1 Introduction

Hash-based signatures rely only on security of hash functions (well-understood, quantum-resistant with larger outputs).

25.2 Lamport Signatures

One-time signature scheme:

Key generation:

  • For each bit of message, generate two random values (xᵢ,₀, xᵢ,₁)
  • Public key: hashes of all xᵢ,ⱼ

Sign:

  • For each bit b of message, reveal xᵢ,ᵦ

Verify:

  • Check hashes match public key

Problem: Huge keys, one-time use only.

25.3 Merkle Signature Scheme

Combine many one-time keys using Merkle tree:

  1. Generate N one-time key pairs
  2. Build Merkle tree of public keys
  3. Root is overall public key
  4. Sign each message with one-time key, provide authentication path

25.4 XMSS (eXtended Merkle Signature Scheme)

Features:

  • Stateful (must track used keys)
  • Forward security
  • Parameterized for different security levels
  • RFC 8391

25.5 SPHINCS+

Stateless hash-based signatures (no state to track):

  • Uses hypertree of Merkle trees
  • Few-time signature scheme (FORS) for leaves
  • NIST PQC finalist

Properties:

  • Stateless (easier to use)
  • Large signatures (~tens of KB)
  • Conservative security

CHAPTER 26: MULTIVARIATE CRYPTOGRAPHY

26.1 Introduction

Multivariate cryptography bases security on solving systems of multivariate quadratic equations over finite fields (MQ problem), which is NP-hard.

26.2 Oil and Vinegar Schemes

Unbalanced Oil and Vinegar (UOV) :

  • Variables divided into "oil" (o) and "vinegar" (v)
  • Quadratic equations: no oil×oil terms
  • With vinegar variables fixed, system linear in oil variables

Signature:

  1. Choose random vinegar values
  2. Solve linear system for oil values
  3. Signature is complete solution

Verification: Check equations satisfied

26.3 Rainbow

Multilayer UOV with improved efficiency and smaller keys. NIST PQC finalist.

26.4 Security Considerations

  • Cryptanalysis advances have broken many multivariate schemes
  • Rainbow parameters chosen conservatively
  • Keys relatively small, signatures moderate, verification fast

PART IX — APPLIED CRYPTOGRAPHY


CHAPTER 27: BLOCKCHAIN CRYPTOGRAPHY

27.1 Introduction to Blockchain

A blockchain is a distributed, immutable ledger of transactions, secured by cryptography and consensus mechanisms.

27.2 Hash Chains

Each block contains hash of previous block, creating chain:

Blockᵢ = (dataᵢ, H(Blockᵢ₋₁))

Properties:

  • Tamper-evident: Changing any block changes all subsequent hashes
  • Efficient verification: Only need to check chain of hashes

27.3 Merkle Trees in Blockchain

Each block's transactions organized in Merkle tree:

  • Leaf: transaction hash
  • Root: Merkle root included in block header
  • Allows efficient proof of inclusion (SPV verification)

27.4 Proof of Work (PoW)

27.4.1 Concept

Require computational work to create new block:

Find nonce such that H(block header || nonce) < target

Properties:

  • Difficult to find, easy to verify
  • Adjustable difficulty (target)
  • Secures chain: attacker needs >50% hashing power

27.4.2 Mining

Miners compete to find valid nonce. Winner gets:

  • Block reward (new coins)
  • Transaction fees

27.4.3 Energy Concerns

Bitcoin PoW consumes significant energy. Alternatives seek more efficient consensus.

27.5 Proof of Stake (PoS)

27.5.1 Concept

Validators stake coins as collateral; chosen to create blocks based on stake (and other factors).

Advantages:

  • Energy efficient
  • Economic security (slashing for misbehavior)

Examples: Ethereum 2.0, Cardano, Solana

27.5.2 Challenges

  • Nothing at stake problem
  • Long-range attacks
  • Centralization risks

27.6 Bitcoin Cryptography

27.6.1 Keys and Addresses

  • Private key: 256-bit random number
  • Public key: secp256k1 elliptic curve point
  • Address: HASH160(public key) with checksum (Base58Check)

27.6.2 Transactions

  • Inputs: References to previous UTXOs (unspent transaction outputs)
  • Outputs: Amount + locking script (usually pay-to-pubkey-hash)
  • Signatures: ECDSA over secp256k1

27.6.3 Script

Bitcoin's stack-based scripting language:

  • Not Turing-complete (no loops)
  • Signature verification
  • Time locks
  • Multi-signature

27.7 Ethereum Cryptography

27.7.1 Accounts

  • Externally owned accounts (controlled by private keys)
  • Contract accounts (controlled by code)

27.7.2 Transactions

  • Signed with ECDSA (secp256k1)
  • Contains nonce (replay protection)
  • Gas limits and prices

27.7.3 Smart Contracts

Turing-complete code running on blockchain:

  • Compiled to EVM bytecode
  • State transitions validated by all nodes
  • Can hold and transfer ether

27.7.4 Future: Ethereum 2.0

  • Moving to Proof of Stake
  • BLS signatures for aggregation
  • Sharding for scalability

27.8 Privacy in Blockchain

Challenges: Public ledgers reveal transaction graph

Solutions:

  • Mixing: CoinJoin, Tumblers
  • Confidential transactions: Hide amounts (Pedersen commitments)
  • Ring signatures: Monero's ring CT
  • zk-SNARKs: Zcash's shielded transactions

CHAPTER 28: SECURE MULTIPARTY COMPUTATION

28.1 Introduction

Secure Multiparty Computation (MPC) allows parties to jointly compute a function on their private inputs without revealing those inputs to each other.

Goal: Compute f(x₁, ..., xₙ) where party i knows xᵢ, learning only output (and what can be inferred from it).

28.2 Secret Sharing

28.2.1 Concept

Split secret s into n shares such that:

  • Any t shares reconstruct s (t-out-of-n threshold)
  • Fewer than t shares reveal nothing about s

28.2.2 Shamir's Secret Sharing

Based on polynomial interpolation:

  1. Choose random polynomial P of degree t-1 with P(0) = s
  2. Share for party i: (i, P(i))
  3. Any t shares allow reconstructing P (and thus s) via Lagrange interpolation

Perfect secrecy: With fewer than t shares, any secret equally likely.

28.3 Garbled Circuits

28.3.1 Yao's Protocol

Two-party computation where one party (garbler) encrypts circuit, other (evaluator) computes:

  1. Garbler creates garbled circuit:
    • For each wire, generate two random labels (0 and 1)
    • For each gate, encrypt output labels with input labels
  2. Garbler sends garbled circuit to evaluator
  3. For evaluator's inputs, garbler sends corresponding labels (via oblivious transfer)
  4. Evaluator evaluates gate by gate, decrypting with labels
  5. Only output wire values learned

28.3.2 Optimizations

  • Free XOR: XOR gates with no encryption
  • Half-gates: Reduce AND gate size
  • Fixed-key AES for encryption

28.4 Homomorphic Encryption

28.4.1 Partially Homomorphic

Support one operation (addition or multiplication):

  • RSA: Multiplicative: E(m₁) × E(m₂) = E(m₁m₂)
  • Paillier: Additive: E(m₁) × E(m₂) = E(m₁ + m₂)
  • ElGamal: Multiplicative

Applications: Electronic voting (Paillier), private information retrieval

28.4.2 Somewhat Homomorphic

Support limited number of both operations.

28.4.3 Fully Homomorphic Encryption (FHE)

The Holy Grail: Support arbitrary computations on encrypted data.

History:

  • 2009: Gentry's first FHE scheme (using ideal lattices)
  • Subsequent schemes: BGV, BFV, CKKS (for approximate numbers)
  • TFHE: Fast bootstrapping

How it works:

  • Encryption adds noise
  • Multiplication increases noise
  • Bootstrapping refreshes ciphertext (removes noise)
  • Allows unlimited operations

Applications:

  • Private outsourced computation
  • Secure machine learning on encrypted data
  • Private database queries

Challenges:

  • Performance (1000x slower than plaintext)
  • Large ciphertexts
  • Complex parameter selection

28.5 MPC in Practice

  • SPDZ protocol: Preprocessing + online phase
  • MP-SPDZ: Framework for various MPC protocols
  • Applications: Key management, auctions, privacy-preserving analytics

CHAPTER 29: SECURE HARDWARE

29.1 Trusted Platform Module (TPM)

29.1.1 Overview

TPM is a dedicated microcontroller designed to secure hardware through integrated cryptographic keys.

Specifications:

  • TPM 1.2 (2003)
  • TPM 2.0 (2014, current)

29.1.2 Capabilities

  • Secure generation and storage of keys (never leave TPM)
  • Platform integrity measurement: Record boot process measurements in PCRs
  • Remote attestation: Prove platform state to remote party
  • Sealing: Bind data to specific platform state
  • Random number generation
  • Limited cryptographic operations (RSA, ECC, SHA, AES)

29.1.3 Applications

  • BitLocker: Full disk encryption keys protected by TPM
  • Secure Boot: Verify bootloader signatures
  • TPM-based smart cards
  • VPN authentication

29.2 Smart Cards

29.2.1 Description

Credit-card sized devices with embedded chip:

  • Secure microcontroller
  • Tamper-resistant
  • Runs Java Card or native OS

29.2.2 Security Features

  • Physical tamper protection (mesh sensors, gluing)
  • Logical security (access conditions, secure channels)
  • Limited interface (reduces attack surface)
  • PIN verification on-card

29.2.3 Applications

  • EMV payment cards
  • SIM cards (mobile authentication)
  • PIV/CAC (US government IDs)
  • ePassports

29.3 Hardware Security Modules (HSMs)

29.3.1 Overview

High-end secure devices for enterprise key management:

  • Network-attached (or PCIe card)
  • FIPS 140-2/3 Level 3 or 4 certified
  • Tamper-responsive (zeroize keys on tamper)
  • High performance (hardware acceleration)

29.3.2 Capabilities

  • Generate, store, manage keys throughout lifecycle
  • Perform crypto operations inside HSM
  • Audit logging
  • Key backup and replication
  • Multi-party authorization for sensitive operations

29.3.3 Use Cases

  • Certificate Authorities: Protect root and intermediate keys
  • Payment HSMs: PIN processing, transaction signing (PCI HSM)
  • Database encryption: Master key protection
  • Code signing: Protect software signing keys

29.4 Side-Channel Resistance

Hardware implementations must resist physical attacks:

Power Analysis:

  • Simple Power Analysis (SPA): Direct key extraction from single trace
  • Differential Power Analysis (DPA): Statistical analysis of many traces

Countermeasures:

  • Masking: Randomize intermediate values
  • Hiding: Balance power consumption, add noise
  • Shuffling operations
  • Dual-rail logic

Electromagnetic Analysis:

  • Measure EM emissions instead of power
  • Similar countermeasures apply

Fault Injection:

  • Voltage/clock glitches
  • Laser/light injection
  • Electromagnetic pulses

Countermeasures:

  • Error detection
  • Redundant computation
  • Sensors

PART X — CRYPTANALYSIS


CHAPTER 30: ATTACKS ON SYMMETRIC CIPHERS

30.1 Differential Cryptanalysis

30.1.1 Basic Concept

Analyze how differences in plaintext propagate through cipher to cause differences in ciphertext with non-random probability.

Process:

  1. Choose input difference ΔP
  2. Track propagation through rounds
  3. Find output difference ΔC with high probability
  4. Use this to recover key bits

30.1.2 Differential Characteristics

For each round, probability that given input difference leads to specific output difference. Multiply probabilities for independent rounds.

30.1.3 Attack Procedure

  1. Collect many (P, P⊕ΔP, C, C') pairs
  2. Count how many produce expected ΔC
  3. When count significantly above random, suggests correct key hypothesis

30.1.4 Historical Impact

  • Discovered by Biham and Shamir (late 1980s)
  • Could break DES faster than brute force
  • DES S-boxes designed to resist differential cryptanalysis (classified at the time)

30.2 Linear Cryptanalysis

30.2.1 Concept

Find linear approximations for cipher components:

P[α·P ⊕ β·C = γ·K] ≠ 1/2

30.2.2 Matsui's Algorithm

  1. Find linear expression with high bias ε
  2. Collect N plaintext-ciphertext pairs
  3. For each key candidate, count how many satisfy expression
  4. Correct key candidate will have count near N/2 ± Nε

30.2.3 Application to DES

  • First practical attack on DES (2⁴³ known plaintexts)
  • Later improved with multiple linear approximations

30.3 Integral Cryptanalysis

Idea: Track propagation of sums (integral) of values through cipher. Works well against byte-oriented ciphers like AES.

Square attack: For 4-round AES:

  • Choose set of 256 plaintexts varying in one byte
  • After 3 rounds, all bytes become balanced (each value appears same number of times)
  • Fourth round allows key recovery

30.4 Meet-in-the-Middle Attacks

Applicable to ciphers where key bits affect rounds independently:

  1. Compute forward from plaintext with partial key guesses
  2. Compute backward from ciphertext with partial key guesses
  3. Match in middle to identify correct keys

Applications:

  • 2DES (as earlier)
  • Some lightweight block ciphers

30.5 Related-Key Attacks

Attacker can obtain encryptions under different keys with known relationships.

Examples:

  • AES-192 and AES-256 have related-key attacks (not on full rounds)
  • Some designs compromised

30.6 Side-Channel Attacks on Symmetric Ciphers

Cache Attacks:

  • Measure cache access patterns during AES T-table lookups
  • Can recover key bits

Timing Attacks:

  • Some implementations have variable-time operations
  • Measure timing to infer key bits

Power Analysis:

  • DPA on AES implementations
  • Can recover key from power traces

CHAPTER 31: ATTACKS ON PUBLIC KEY SYSTEMS

31.1 Lattice Attacks

31.1.1 Coppersmith's Attack

Find small roots of polynomial equations using lattice basis reduction (LLL):

  • Used to attack RSA with small exponent or partial key exposure
  • Recover message when e small and message padded with known pattern

31.1.2 Lattice Attacks on DSA/ECDSA

With biased or partially known nonces, lattice techniques recover private key.

31.1.3 Lattice Basis Reduction

LLL algorithm finds relatively short vectors in lattice:

  • Applications to many cryptosystems
  • Basis for many attacks

31.2 Fault Attacks

31.2.1 RSA Fault Attack

Induce error during CRT computation:

  • Get correct signature S
  • Get faulty signature S' (error in one factor)
  • Compute gcd(S - S', n) to recover factor

31.2.2 Defense

Always verify signature after generation (or use fault detection).

31.3 Side-Channel Attacks

31.3.1 Timing Attacks on RSA

  • Kocher's attack (1996) measured decryption time variations
  • Montgomery multiplication timing leaks exponent bits
  • Defenses: constant-time, blinding

31.3.2 Power Analysis on ECC

  • Scalar multiplication leaks bits through power consumption
  • Double-and-add algorithm especially vulnerable
  • Defense: Montgomery ladder, complete addition formulas

31.4 Padding Oracle Attacks

31.4.1 Bleichenbacher's Attack

Against RSA PKCS#1 v1.5 padding (1998):

  • Server reveals whether decrypted padding is valid
  • Attacker can iteratively decrypt any ciphertext
  • ~1 million queries for full decryption

31.4.2 CBC Padding Oracle

Against CBC mode with padding validation:

  • Modify ciphertext, observe padding errors
  • Recover plaintext byte by byte
  • Defense: constant-time validation, authenticated encryption

31.5 Coppersmith's Theorem

Statement: Given monic polynomial f(x) of degree d modulo N, and N has unknown factorization, can find all small roots |x₀| < N^(1/d) efficiently.

Applications:

  • Boneh-Durfee attack on small RSA private exponent
  • Stereotyped messages with small padding
  • Factoring with high bits known

CHAPTER 32: SIDE-CHANNEL ATTACKS

32.1 Classification

Non-invasive: Observe behavior without modifying device (timing, power) Semi-invasive: Modify environment but not device (fault injection) Invasive: Physically probe/modify chip

Passive: Just observe Active: Inject faults

32.2 Timing Attacks

32.2.1 Principle

Operations take different time depending on secret data:

  • Conditional branches (if secret bit then do extra work)
  • Cache behavior (access time depends on whether data cached)
  • Multiplication time variations

32.2.2 Examples

  • RSA: Exponentiation time reveals exponent bits
  • AES: Cache-timing attacks on T-table implementations
  • ECC: Scalar multiplication timing

32.2.3 Countermeasures

  • Constant-time code (no branches on secrets)
  • Avoid secret-dependent array indices
  • Use hardware with constant-time instructions
  • Blinding (randomize inputs)

32.3 Power Analysis

32.3.1 Simple Power Analysis (SPA)

Directly observe power trace to identify operations:

  • Square vs. multiply in exponentiation
  • Conditional jumps
  • Key schedule operations

Countermeasures:

  • Make all operations indistinguishable (Montgomery ladder)
  • Add noise

32.3.2 Differential Power Analysis (DPA)

Statistical analysis of many traces:

  1. Guess key bits
  2. Partition traces based on intermediate value
  3. Compute difference of averages
  4. Correct guess shows spikes

Countermeasures:

  • Masking: Randomize intermediate values
  • Hiding: Balance power, add random delays
  • Dual-rail logic

32.4 Cache Attacks

32.4.1 Concept

Measure whether attacker's own cache lines have been evicted by victim:

  • Prime+Probe: Fill cache, wait, measure reload time
  • Flush+Reload: Flush shared line, wait, measure reload time

32.4.2 Attacks on AES

  • T-table lookups indexed by key-dependent values
  • Cache misses reveal key bits
  • Practical attacks from JavaScript in browser

32.4.3 Countermeasures

  • Use hardware AES instructions (constant-time)
  • Bitslicing implementations
  • Cache partitioning

32.5 Spectre and Meltdown

32.5.1 Overview

Transient execution attacks that leak information across security boundaries:

Meltdown (2018):

  • Bypass kernel/user isolation
  • Out-of-order execution allows accessing kernel memory

Spectre (2018):

  • Bypass bounds checking
  • Speculative execution executes illegal paths

32.5.2 Implications for Cryptography

  • Can read cryptographic keys from memory
  • Affects many cloud environments
  • Software mitigations impact performance

32.5.3 Countermeasures

  • Kernel page table isolation
  • Serializing instructions
  • Microcode updates
  • Constant-time not enough (transient execution)

PART XI — FORMAL SECURITY & PROOFS


CHAPTER 33: PROVABLE SECURITY

33.1 The Need for Proofs

Intuition about cryptographic security is often wrong. Formal proofs provide mathematical assurance that schemes meet security goals under well-defined assumptions.

33.2 Security Models

33.2.1 Adversary Goals

  • IND (Indistinguishability) : Can't distinguish encryption of two messages
  • NM (Non-malleability) : Can't transform ciphertext into encryption of related message
  • EUF (Existential Unforgeability) : Can't forge signatures

33.2.2 Adversary Capabilities

  • CPA (Chosen Plaintext Attack) : Can obtain encryptions of chosen plaintexts
  • CCA1 (Lunchtime Attack) : Can obtain decryptions before seeing challenge
  • CCA2 (Adaptive Chosen Ciphertext) : Can obtain decryptions after challenge (except target)

33.3 IND-CPA Security

Game:

  1. Challenger generates key pair
  2. Adversary chooses two messages m₀, m₁ of equal length
  3. Challenger chooses random b, returns c* = Enc(m_b)
  4. Adversary outputs guess b'
  5. Adversary wins if b' = b

Advantage: |Pr[b'=b] - 1/2| Scheme is secure if advantage negligible for all efficient adversaries.

33.4 IND-CCA Security

Same as IND-CPA, but adversary also has access to decryption oracle before and after challenge (except cannot query c*).

CCA2: Can continue queries after challenge (except c*)

Important: Many schemes IND-CPA secure (ElGamal) are not IND-CCA secure.

33.5 EUF-CMA (Signatures)

Game:

  1. Challenger generates key pair, gives adversary public key
  2. Adversary can request signatures on chosen messages
  3. Adversary outputs (m*, σ*) where m* not previously queried
  4. Wins if signature verifies

Scheme secure if no efficient adversary can win with non-negligible probability.


CHAPTER 34: RANDOM ORACLE MODEL

34.1 The Random Oracle Model (ROM)

Concept: Model hash function as truly random function accessible to all parties via oracle queries.

In proofs:

  • Hash queries answered consistently with random output
  • Adversary cannot compute hash without querying
  • Proof uses ability to "program" responses

34.2 ROM Proofs

Method:

  1. Simulator runs adversary
  2. Simulator controls random oracle
  3. When adversary queries hash, simulator responds adaptively
  4. Simulator extracts information from adversary's queries
  5. Reduces to solving hard problem

34.3 Examples

  • OAEP for RSA (proved secure in ROM)
  • Schnorr signatures
  • Fiat-Shamir transform

34.4 Controversy and Alternatives

Concerns:

  • Random oracle doesn't exist in reality
  • Some schemes secure in ROM broken in practice (though rare)
  • Proof doesn't guarantee security with real hash

Alternatives:

  • Standard model (no random oracles)
  • Generic group model
  • Ideal cipher model

CHAPTER 35: GAME-BASED PROOFS

35.1 Structure

Security proof as sequence of games:

Game₀: Original security game Game₁: Slight modification ... Gameₙ: Game where adversary's advantage is clearly bounded

Transitions:

  • Indistinguishable: Games identical from adversary view
  • Bridging step: Change that adversary can't detect
  • Failure event: Games identical unless event occurs (bound probability)

35.2 Example: IND-CPA Security of ElGamal

Game₀: IND-CPA game with real ElGamal Game₁: Replace yᵏ with random group element (by DDH assumption) Game₂: Challenge ciphertext independent of message → advantage 0

35.3 Advantages

  • Modular: Add/remove components systematically
  • Clear: Each step precisely described
  • Quantitative: Can compute exact advantage bounds

PART XII — ADVANCED TOPICS


CHAPTER 36: HOMOMORPHIC ENCRYPTION

36.1 Definition

Homomorphic encryption allows computation on encrypted data without decryption:

Enc(m₁) ⊗ Enc(m₂) = Enc(m₁ ⊕ m₂)

Types:

  • Partially homomorphic (one operation)
  • Somewhat homomorphic (limited operations)
  • Fully homomorphic (unlimited)

36.2 Applications

  • Privacy-preserving cloud computing: Process encrypted data
  • Private information retrieval: Query database without revealing query
  • Secure voting: Tally encrypted votes
  • Machine learning on encrypted data

36.3 FHE Schemes

  • Gentry's scheme (2009) : Bootstrapping from somewhat homomorphic
  • BGV: Ring-LWE based, leveled FHE
  • BFV: Integer arithmetic
  • CKKS: Approximate arithmetic for ML
  • TFHE: Fast bootstrapping, Boolean circuits

36.4 Challenges

  • Performance: 4-6 orders of magnitude slower than plaintext
  • Ciphertext expansion
  • Parameter selection complex
  • Noise management

CHAPTER 37: SECURE ENCLAVES

37.1 Concept

Hardware-protected execution environments that isolate code and data even from privileged software:

  • Intel SGX (Software Guard Extensions)
  • AMD SEV (Secure Encrypted Virtualization)
  • ARM TrustZone

37.2 Intel SGX

Features:

  • Enclaves: Protected memory regions
  • Attestation: Prove enclave identity to remote party
  • Sealing: Encrypt data to specific enclave
  • Memory encryption

Programming model:

  • Application partitioned into trusted/untrusted
  • ECALLs enter enclave, OCALLs exit

37.3 Security Properties

  • Isolated execution (even from OS, hypervisor)
  • Confidentiality: Code/data encrypted in memory
  • Integrity: Memory tampering detected
  • Remote attestation: Verify enclave before sending secrets

37.4 Attacks and Limitations

  • Side-channel attacks (cache timing, page faults)
  • Microarchitectural attacks (Foreshadow, ZombieLoad)
  • Platform compromise (SMM, ME)
  • Limited memory (128MB EPC)

37.5 Applications

  • Confidential computing (protect data in use)
  • Digital rights management
  • Secure multiparty computation
  • Blockchain (private smart contracts)

CHAPTER 38: CRYPTOGRAPHIC ENGINEERING

38.1 Implementation Principles

Correctness: Implement specification exactly Security: Constant-time, no side channels Performance: Optimize for target platform Usability: Safe APIs that are hard to misuse

38.2 Common Implementation Flaws

  • Insufficient randomness: Weak PRNG, predictable seeds
  • Timing leaks: Branches on secrets, cache-timing
  • Padding oracle: Leaking padding validity
  • Protocol downgrade: Allowing weaker parameters
  • Certificate validation: Skipping checks
  • Error handling: Revealing too much

38.3 Cryptographic Libraries

High-level (harder to misuse):

  • libsodium (C)
  • Google Tink (multi-language)
  • NaCl (C)

Low-level (more flexibility):

  • OpenSSL/LibreSSL
  • BoringSSL
  • mbedTLS

Language-specific:

  • Java JCE
  • .NET Cryptography
  • Python cryptography, PyCryptodome

38.4 Testing and Verification

  • Known-answer tests: Verify against standard test vectors
  • Statistical tests: Validate RNG output
  • Fuzzing: Test with random inputs
  • Formal verification: Prove implementation matches spec (HACL*, EverCrypt)
  • Side-channel analysis: Test for timing leaks

38.5 Constant-Time Programming

Avoid:

  • Conditional branches based on secrets
  • Table lookups with secret indices
  • Variable-time instructions (division on some CPUs)

Use:

  • Bitwise operations (select via masks)
  • Conditional moves (cmov)
  • Fixed execution path

CHAPTER 39: PRIVACY ENHANCING TECHNOLOGIES

39.1 Tor (The Onion Router)

39.1.1 Concept

Anonymous communication by routing through multiple relays with layered encryption:

  • Onion encryption: Each layer encrypted for specific relay
  • Circuit: Path through 3 relays (entry, middle, exit)
  • Perfect forward secrecy: Ephemeral keys per circuit

39.1.2 How It Works

  1. Client builds circuit incrementally
  2. Each relay only knows previous and next hop
  3. Data wrapped in multiple encryption layers
  4. Each relay removes one layer
  5. Exit relay sends plaintext to destination

39.1.3 Limitations

  • Exit node can see traffic (use TLS)
  • Traffic analysis possible
  • Slow (latency from multiple hops)
  • DoS attacks on directory authorities

39.2 Mix Networks

39.2.1 Concept

Messages batched, shuffled, and re-ordered to defeat traffic analysis:

  • Mix node: Collects messages, delays, reorders, re-encrypts, forwards
  • Mix cascade: Multiple mixes in sequence

39.2.2 Properties

  • Anonymity set: All messages in batch
  • Unlinkability: Can't correlate input to output
  • Resistance to traffic analysis

39.2.3 Examples

  • Chaum mixes: Original design
  • Mixminion: Type III anonymous remailer
  • Nym: Loopix-based mixnet

39.3 Differential Privacy

39.3.1 Definition

A randomized mechanism M satisfies ε-differential privacy if for all datasets D, D' differing in one record, and all outputs S:

Pr[M(D) ∈ S] ≤ e^ε × Pr[M(D') ∈ S]

Interpretation: Presence or absence of any single record doesn't significantly affect output distribution.

39.3.2 Mechanisms

  • Laplace mechanism: Add Laplace noise to query result
  • Exponential mechanism: Choose output with probability proportional to utility
  • Gaussian mechanism: For (ε,δ)-differential privacy

39.3.3 Applications

  • US Census: 2020 Census used differential privacy
  • Apple: Collects usage statistics with privacy
  • Google: RAPPOR for Chrome telemetry
  • Machine learning: Differentially private training

39.4 Other Privacy Technologies

  • Private Information Retrieval (PIR) : Retrieve database record without revealing which
  • Anonymous credentials: Prove attributes without revealing identity
  • Zero-knowledge proofs: Prove statements without revealing secrets
  • Privacy-preserving ML: Federated learning, split learning

CHAPTER 40: FUTURE OF CRYPTOGRAPHY

40.1 Quantum Computing Impact

  • Short-term: Grover's algorithm halves symmetric security (double key sizes)
  • Medium-term: Shor's algorithm breaks current PKC
  • Long-term: Fault-tolerant quantum computers may exist

Timeline uncertainty: No one knows when large-scale quantum computers will be built.

40.2 Post-Quantum Transition

Challenges:

  • Replace existing protocols and infrastructure
  • Larger keys and signatures affect performance
  • Standards needed for interoperability
  • Hybrid schemes during transition

NIST Timeline:

  • 2024: Final standards expected
  • 2025+: Agencies begin transition
  • 2030+: Deprecation of current algorithms

40.3 Emerging Trends

Homomorphic Encryption:

  • Improving performance
  • Practical applications emerging
  • Hardware acceleration

Multiparty Computation:

  • More efficient protocols
  • Real-world deployments (key management, auctions)
  • Combining with blockchains

Blockchain Privacy:

  • ZK-rollups for scaling
  • Private smart contracts
  • Regulatory compliance challenges

Threshold Cryptography:

  • Distributed key generation and signing
  • Eliminate single points of failure
  • Used in custody solutions

Formal Verification:

  • Verified implementations (HACL*, EverCrypt)
  • Machine-checked proofs of protocols
  • Reducing implementation vulnerabilities

40.4 The Human Element

Cryptography protects systems, but humans remain weakest link:

  • Social engineering: Bypass technical controls
  • Poor key management: Users choose weak passwords
  • Misconfiguration: Default credentials, outdated protocols
  • Insider threats: Authorized users with malicious intent

Future work:

  • Usable security: Make crypto easier to use correctly
  • Automated configuration: Reduce human error
  • Better training: Security awareness

40.5 Cryptographic Research Directions

New Hard Problems:

  • Isogeny-based cryptography (SIDH, CSIDH)
  • Code-based cryptography improvements
  • Lattice-based optimizations

Advanced Protocols:

  • Functional encryption
  • Obfuscation (still theoretical)
  • Quantum cryptography (QKD)

Implementation Security:

  • Side-channel resistance
  • Fault attack countermeasures
  • Formal verification of implementations

Policy and Standards:

  • International harmonization
  • Balancing security and privacy
  • Law enforcement access debates

APPENDICES


APPENDIX A: MATHEMATICAL REFERENCE

A.1 Number Theory

Prime number theorem: π(x) ~ x/ln(x)

Euler's theorem: a^φ(n) ≡ 1 (mod n) if gcd(a,n)=1

Fermat's little theorem: a^(p-1) ≡ 1 (mod p) for prime p, a not divisible by p

Chinese Remainder Theorem: System of congruences with coprime moduli has unique solution modulo product.

A.2 Group Theory

Lagrange's theorem: Order of subgroup divides order of group

Cyclic groups: Exist generator g such that every element is gⁱ

Order of element: Smallest positive k such that gᵏ = 1

A.3 Finite Fields

GF(p): Integers modulo prime p

GF(2ⁿ): Binary field, elements as polynomials

Irreducible polynomial: Polynomial that cannot be factored (used to construct fields)

A.4 Complexity Classes

P: Polynomial time

NP: Verifiable in polynomial time

BPP: Probabilistic polynomial time with bounded error

#P: Counting solutions to NP problems


APPENDIX B: STANDARDS

B.1 NIST Standards

  • FIPS 197: AES
  • FIPS 180-4: SHA-2
  • FIPS 202: SHA-3
  • FIPS 186-5: Digital signatures (DSA, RSA, ECDSA, EdDSA)
  • SP 800-38 series: Block cipher modes
  • SP 800-56 series: Key agreement
  • SP 800-90 series: Random number generation
  • SP 800-108: Key derivation

B.2 ISO/IEC Standards

  • ISO/IEC 18033: Encryption algorithms
  • ISO/IEC 9796: Digital signature schemes giving message recovery
  • ISO/IEC 14888: Digital signatures with appendix
  • ISO/IEC 11770: Key management

B.3 IETF RFCs

  • RFC 8446: TLS 1.3
  • RFC 8032: Ed25519 and Ed448
  • RFC 7748: X25519 and X448
  • RFC 5869: HKDF
  • RFC 6234: US Secure Hash Algorithms

B.4 Other Standards

  • PKCS #1: RSA Cryptography Standard
  • PKCS #11: Cryptographic Token Interface
  • ANSI X9.62: ECDSA
  • IEEE 1363: Standard Specifications for Public-Key Cryptography

APPENDIX C: CRYPTOGRAPHIC LIBRARIES

C.1 C/C++ Libraries

  • OpenSSL: Comprehensive, widely used, complex API
  • LibreSSL: OpenSSL fork with cleaner code
  • BoringSSL: Google's OpenSSL fork
  • libsodium: Easy-to-use, modern API, based on NaCl
  • mbedTLS: Small footprint, embedded systems
  • Botan: Broad algorithm support, modern C++

C.2 Java Libraries

  • Java Cryptography Extension (JCE) : Built-in
  • Bouncy Castle: Extensive algorithm support
  • Google Tink: High-level, safe API

C.3 Python Libraries

  • cryptography: High-level recipes, low-level interfaces
  • PyCryptodome: Self-contained, fork of PyCrypto
  • python-gnupg: GnuPG wrapper

C.4 Go Libraries

  • crypto/ : Standard library packages
  • go.crypto: Additional algorithms

C.5 Rust Libraries

  • RustCrypto: Collection of crates
  • ring: Performance-focused, based on BoringSSL

APPENDIX D: IMPLEMENTATION GUIDELINES

D.1 Random Number Generation

  • Use cryptographically secure PRNG (not rand())
  • Seed with sufficient entropy
  • On Unix: /dev/urandom or getrandom()
  • On Windows: CryptGenRandom or BCryptGenRandom

D.2 Key Management

  • Generate keys in secure environment
  • Store keys encrypted or in HSM
  • Rotate keys periodically
  • Destroy keys securely when no longer needed

D.3 Constant-Time Code

// Constant-time comparison
int ct_compare(const uint8_t *a, const uint8_t *b, size_t len) {
    int result = 0;
    for (size_t i = 0; i < len; i++) {
        result |= a[i] ^ b[i];
    }
    return result; // 0 if equal
}

// Constant-time select
uint8_t ct_select(uint8_t a, uint8_t b, int bit) {
    uint8_t mask = -bit; // 0 or 0xFF
    return (a & mask) | (b & ~mask);
}

D.4 Error Handling

  • Return generic errors (don't leak why)
  • Zeroize sensitive data from memory
  • Use secure memory (mlock, VirtualLock)

D.5 Testing

  • Known-answer tests for all algorithms
  • Negative tests (invalid inputs)
  • Fuzzing for robustness
  • Side-channel leakage testing

APPENDIX E: OPEN RESEARCH PROBLEMS

E.1 Theoretical Problems

  • P vs. NP and implications for cryptography
  • Constructing cryptographic primitives from weaker assumptions
  • Proofs of security without random oracles for practical schemes

E.2 Practical Problems

  • Efficient FHE for real-world applications
  • Post-quantum cryptography with smaller keys
  • Side-channel resistant implementations
  • Usable cryptography for non-experts

E.3 Emerging Areas

  • Quantum cryptography beyond QKD
  • Cryptographic obfuscation (efficient)
  • Privacy-preserving machine learning
  • Secure multi-party computation at scale

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment