This document proposes adding support for the Key Management Interoperability Protocol (KMIP) as a backend for RGW's Server-Side Encryption with S3-Managed Keys (SSE-S3).
This feature will mirror the functionality of the existing HashiCorp Vault Transit backend, allowing a KMIP server to manage bucket-level Key Encryption Keys (KEKs) while RGW manages the creation and lifecycle of per-object Data Encryption Keys (DEKs).
The core of the work involves extending libkmip and RGW's libkmip wrapper to support Encrypt and Decrypt operations.
We are seeking feedback on our proposed implementation strategy, particularly regarding DEK generation and our approach to contributing to the libkmip fork.
As an S3 storage provider using Ceph, I want to encrypt data on the server side using the SSE-S3 method, with keys and cryptographic operations managed by a standards-compliant KMIP server.
This will allow organizations to leverage their existing KMIP infrastructure for transparent, S3-managed data-at-rest encryption. In this model, the KMIP server will store and manage bucket-level Key Encryption Keys (KEKs), while RGW will handle the creation of per-object Data Encryption Keys (DEKs) and store them in a wrapped (encrypted) format within the object's metadata.
Related Tracker: sse-s3: support kms backends other than vault-transit (https://tracker.ceph.com/issues/72251)
- KMIP: Key Management Interoperability Protocol
- DEK: Data Encryption Key (a unique key per object used for encrypting the object's data)
- KEK: Key Encryption Key (a key stored in the KMS used to wrap DEKs)
- SSE: Server Side Encryption
- KMS: Key Management System
- SSE-KMS: Server-Side Encryption with keys managed in a Key Management System (KMS), where the user specifies which key to use.
- SSE-S3: Server-Side Encryption with keys managed by S3 (RGW). The process is transparent to the end-user.
- Wrapped DEK: A DEK that has been encrypted with the KEK
- RGW (x)attrs: Per-object RGW metadata for an S3 object, stored as RADOS xattrs
RGW supports two key-management modes for server-side encryption: SSE-S3, where the keys are managed by the storage system, and SSE-KMS, where the user specifies the ID of a key they manage in an external KMS.
RGW's SSE-S3 backend support is currently limited to HashiCorp Vault Transit mode. RGW's SSE-KMS support includes Barbican, HashiCorp Vault, and the Key Management Interoperability Protocol (KMIP).
This proposal implements the SSE-S3 behavior using a KMIP server as the secure backend for storing the S3-managed keys. This is analogous to how the current Vault Transit backend works.
References:
- https://docs.aws.amazon.com/AmazonS3/latest/userguide/UsingServerSideEncryption.html
- https://docs.aws.amazon.com/AmazonS3/latest/userguide/UsingKMSEncryption.html#encryption-context
It's important to distinguish this proposal from RGW's existing KMIP support for SSE-KMS. In the SSE-KMS model, the client must explicitly provide a Key ID from the KMS to use for encryption on a per-request basis.
Our proposal for SSE-S3 provides a more transparent model where encryption is automatic once enabled on a bucket. RGW manages the key lifecycle (creating a KEK per bucket) and uses the configured KMIP server as its root of trust. This user experience is identical to the current SSE-S3 implementation that uses HashiCorp Vault.
KMIP is an OASIS standard with many implementations.
Selected Open Source implementations:
- libkmip, Ceph's libkmip fork - Used by RGW's SSE-KMS KMIP backend
- PyKMIP - Python implementation, including a test server
- OVH kmip-go
- ThalesGroup kmip-go
KMIP specifies a managed object model with 51 client-to-server operations. The wire protocol is a binary "simplified TTLV (tag, type, length, value)" format, carried over TLS. Authentication is built around TLS client certificates; the KMIP protocol additionally supports password authentication.
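To make the TTLV framing concrete, here is a minimal sketch of an encoder for three item types. It assumes the layout from the KMIP specification (3-byte tag, 1-byte type, 4-byte big-endian length, value padded to a multiple of 8 bytes). The helper names are illustrative and are not part of libkmip.

```python
import struct

# Item type codes from the KMIP TTLV encoding.
TYPE_STRUCTURE = 0x01
TYPE_INTEGER = 0x02
TYPE_TEXT_STRING = 0x07


def _ttlv(tag: int, item_type: int, value: bytes) -> bytes:
    """Frame a value: 3-byte tag, 1-byte type, 4-byte length, padded value."""
    header = tag.to_bytes(3, "big") + bytes([item_type]) + struct.pack(">I", len(value))
    padding = b"\x00" * (-len(value) % 8)  # the length field excludes this padding
    return header + value + padding


def encode_integer(tag: int, number: int) -> bytes:
    """Encode a signed 32-bit integer (4 value bytes plus 4 padding bytes)."""
    return _ttlv(tag, TYPE_INTEGER, struct.pack(">i", number))


def encode_text(tag: int, text: str) -> bytes:
    """Encode a UTF-8 text string."""
    return _ttlv(tag, TYPE_TEXT_STRING, text.encode("utf-8"))


def encode_structure(tag: int, *children: bytes) -> bytes:
    """Encode a structure whose value is the concatenation of encoded children."""
    return _ttlv(tag, TYPE_STRUCTURE, b"".join(children))


if __name__ == "__main__":
    # Tag 0x420020 is the tag used in the specification's encoding examples.
    print(encode_integer(0x420020, 8).hex())
    print(encode_text(0x420020, "Hello World").hex())
```

Each encoded item is a multiple of 8 bytes long, so a structure's value never needs extra padding.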
Vault Transit stores a per-bucket Key Encryption Key (KEK) that never leaves Vault. DEK creation and decryption are done by Vault. Wrapped DEKs are stored with additional metadata in the RGW object attrs, so each object carries its wrapped DEK, metadata, and context.
Key life cycles are tied to buckets and objects, respectively. Removing the bucket key renders all objects in that bucket unreadable. Object DEKs are part of the object metadata and therefore follow the object life cycle.
The implementation follows the established pattern of the Vault Transit backend, using a KEK/DEK encryption scheme.
When SSE-S3 with the KMIP backend is enabled on a bucket, RGW will call the Create operation on the KMIP server to generate a new symmetric Key Encryption Key (KEK). The unique ID (UUID) of this KEK is then stored in the RGW bucket metadata. This KEK will never leave the KMIP server.
flowchart TB
subgraph RGW
req[Enable SSE-S3 on Bucket] --> |1. Generate KEK| kmip_wrapper
kmip_wrapper[KMIP Wrapper] --> |3. KEK ID| bucket_meta[Store KEK ID in
Bucket Metadata]@{shape: docs}
end
subgraph KMIP Server
kmip_wrapper -- 2. Create() --> key_create["Create()"]
key_create --> |KEK| bucket_keys[KEK Storage]@{shape: cyl}
key_create --> |KEK ID| kmip_wrapper
end
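The bucket-enable flow above can be sketched as follows. This is an illustration under assumptions, not the proposed RGW code: the AES-256 key parameters, the `sse_s3_kek_id` metadata field, and the `fake_kmip_create` stand-in are all hypothetical. The usage mask bit values are the ones defined by the KMIP specification.

```python
import uuid

# KMIP Cryptographic Usage Mask bits (values from the KMIP specification).
USAGE_ENCRYPT = 0x00000004
USAGE_DECRYPT = 0x00000008


def kek_create_attributes() -> dict:
    """Attributes a Create request for a bucket KEK might carry (assumed values)."""
    return {
        "Cryptographic Algorithm": "AES",  # assumption: AES KEK
        "Cryptographic Length": 256,       # assumption: 256-bit KEK
        "Cryptographic Usage Mask": USAGE_ENCRYPT | USAGE_DECRYPT,
    }


def enable_sse_s3(bucket_meta: dict, kmip_create) -> str:
    """Create a KEK on the KMIP server and record only its ID in bucket metadata."""
    kek_id = kmip_create(kek_create_attributes())
    bucket_meta["sse_s3_kek_id"] = kek_id  # hypothetical metadata field name
    return kek_id


def fake_kmip_create(attributes: dict) -> str:
    """Stand-in for the KMIP round trip: returns an identifier, never key bytes."""
    return str(uuid.uuid4())


if __name__ == "__main__":
    meta = {}
    enable_sse_s3(meta, fake_kmip_create)
    print(meta)
```

Only the identifier crosses back into RGW; the key material stays on the server side of `kmip_create`.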
For each object upload to an encrypted bucket, RGW will generate a new, random DEK locally. It then asks the KMIP server to wrap this DEK using the bucket's KEK. The resulting wrapped DEK and encryption context are stored in the object's RADOS attributes (xattrs). The plaintext DEK is used to encrypt the object data and is then immediately discarded.
flowchart TB
subgraph RGW
direction LR
req[S3 PUT Request] --> gen_dek(1. Generate DEK)
gen_dek --> |Plaintext DEK| encrypt_data(4. Encrypt Object Data)
gen_dek --> |Plaintext DEK, KEK ID, Context| kmip_wrapper
kmip_wrapper[KMIP Wrapper] --> |3. Wrapped DEK| store_attrs(5. Store in RGW Attrs)@{shape: docs}
end
encrypt_data --> |Encrypted Data| rados[RADOS]
store_attrs -->|Xattrs| rados@{shape: cyl}
subgraph KMIP Server
kmip_wrapper -- 2. Encrypt(KEK_ID, DEK, Context) --> key_wrap
key_wrap --> |Wrapped DEK| kmip_wrapper
key_wrap["Encrypt()"]
bucket_keys[KEK Store]@{shape: cyl}
bucket_keys -->|KEK| key_wrap
key_wrap -->|KEK ID|bucket_keys
end
For each read request, RGW retrieves the wrapped DEK from the object's RGW Attrs. It then asks the KMIP server to unwrap the DEK using the same KEK that wrapped it. The KMIP server returns the plaintext DEK, which RGW uses to decrypt the object data stream for the client. The plaintext DEK is discarded after the operation.
flowchart TB
rados[RADOS]@{shape: cyl} -->|Xattrs| get_attrs
rados -->|Encrypted Data| decrypt_data
subgraph RGW
direction LR
req[S3 GET Request] --> get_attrs(1. Read RGW Attrs)@{shape: docs}
get_attrs --> |Wrapped DEK, KEK ID, Context| kmip_wrapper
kmip_wrapper --> |3. Plaintext DEK| decrypt_data(4. Decrypt Object Data)
end
decrypt_data --> |Plaintext Data| client[S3 Client]
subgraph KMIP Server
kmip_wrapper[KMIP Wrapper] -- 2. Decrypt(KEK_ID, Wrapped DEK, Context) --> key_unwrap[KMIP: Unwrap Key]
key_unwrap --> |Plaintext DEK| kmip_wrapper
key_unwrap["Decrypt()"]
bucket_keys[KEK Store]@{shape: cyl}
bucket_keys -->|KEK| key_unwrap
key_unwrap -->|KEK ID|bucket_keys
end
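The write and read flows can be exercised end to end with an in-memory stand-in for the KMIP server. Everything below is a hypothetical sketch: `FakeKmipServer` is not KMIP, its HMAC-based wrapping is a toy used only to keep the example within the standard library, and the attribute names are invented for illustration.

```python
import hashlib
import hmac
import secrets
import uuid


class FakeKmipServer:
    """In-memory stand-in for a KMIP server. Toy cryptography, illustration only."""

    def __init__(self):
        self._keks = {}  # KEK ID -> KEK bytes; never returned to the caller

    def create(self) -> str:
        kek_id = str(uuid.uuid4())
        self._keks[kek_id] = secrets.token_bytes(32)
        return kek_id

    def destroy(self, kek_id: str) -> None:
        del self._keks[kek_id]

    @staticmethod
    def _keystream(kek: bytes, nonce: bytes, length: int) -> bytes:
        out, counter = b"", 0
        while len(out) < length:
            block = b"enc" + nonce + counter.to_bytes(4, "big")
            out += hmac.new(kek, block, hashlib.sha256).digest()
            counter += 1
        return out[:length]

    def encrypt(self, kek_id: str, plaintext: bytes, context: bytes) -> bytes:
        kek = self._keks[kek_id]
        nonce = secrets.token_bytes(16)
        stream = self._keystream(kek, nonce, len(plaintext))
        ciphertext = bytes(a ^ b for a, b in zip(plaintext, stream))
        # Bind the ciphertext to the encryption context.
        tag = hmac.new(kek, b"mac" + nonce + ciphertext + context, hashlib.sha256).digest()
        return nonce + ciphertext + tag

    def decrypt(self, kek_id: str, blob: bytes, context: bytes) -> bytes:
        kek = self._keks[kek_id]  # KeyError once the KEK has been destroyed
        if len(blob) < 48:
            raise ValueError("wrapped DEK is too short")
        nonce, ciphertext, tag = blob[:16], blob[16:-32], blob[-32:]
        expected = hmac.new(kek, b"mac" + nonce + ciphertext + context, hashlib.sha256).digest()
        if not hmac.compare_digest(tag, expected):
            raise ValueError("wrapped DEK or context does not match")
        stream = self._keystream(kek, nonce, len(ciphertext))
        return bytes(a ^ b for a, b in zip(ciphertext, stream))


def wrap_new_dek(server, kek_id: str, context: bytes):
    """PUT path: generate a DEK locally, wrap it, and build the object attrs."""
    dek = secrets.token_bytes(32)
    attrs = {
        "kek_id": kek_id,
        "wrapped_dek": server.encrypt(kek_id, dek, context),
        "context": context,
    }
    return dek, attrs


def unwrap_dek(server, attrs: dict) -> bytes:
    """GET path: recover the plaintext DEK from the stored attrs."""
    return server.decrypt(attrs["kek_id"], attrs["wrapped_dek"], attrs["context"])


if __name__ == "__main__":
    server = FakeKmipServer()
    kek_id = server.create()
    dek, attrs = wrap_new_dek(server, kek_id, b'{"bucket": "demo"}')
    print(unwrap_dek(server, attrs) == dek)
```

Unwrapping with a different context fails, and destroying the KEK makes every wrapped DEK for that bucket unrecoverable, which matches the life-cycle behavior described for the Vault Transit backend.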
A key difference from the Vault Transit backend is our approach to DEK generation. The Vault API has a datakey function that can generate and wrap a key in one atomic operation.
The KMIP standard includes a RNG Retrieve operation and support for batching operations, but server-side support for random number generation is not widely implemented.
To ensure maximum compatibility with various KMIP servers, we will generate the DEK using a secure random source within the RGW process. This plaintext DEK will then be sent to the KMIP server via an Encrypt call to be wrapped by the appropriate KEK. With this approach, the plaintext DEK exists only briefly in memory on the RGW node: it is generated, wrapped, used to encrypt the object, and then discarded.
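As a rough illustration of local DEK generation, the sketch below draws the key from the operating system's CSPRNG and overwrites the buffer when done. The 32-byte length (AES-256) and the function names are assumptions. The wipe is best effort in Python, and RGW itself is C++, so this only shows the idea.

```python
import secrets

DEK_LEN = 32  # assumption: 256-bit data encryption key


def generate_dek() -> bytearray:
    """Draw a fresh DEK from the OS CSPRNG into a mutable buffer."""
    # Note: the intermediate bytes object from token_bytes is left to the
    # garbage collector, so this does not guarantee a single in-memory copy.
    return bytearray(secrets.token_bytes(DEK_LEN))


def wipe(buffer: bytearray) -> None:
    """Overwrite the key material in place once it is no longer needed."""
    for i in range(len(buffer)):
        buffer[i] = 0


if __name__ == "__main__":
    dek = generate_dek()
    try:
        pass  # wrap via KMIP Encrypt, then encrypt the object data
    finally:
        wipe(dek)
    print(dek == bytearray(DEK_LEN))
```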
The core of the implementation involves extending RGW's libkmip wrapper and its underlying forked library (ceph/libkmip).
Missing Operations: The library currently supports the Create operation but lacks the Encrypt and Decrypt functions necessary for our DEK/KEK model. The primary technical task is to implement these operations.
DEK Generation Approach: We propose generating the DEK within RGW and using the KMIP Encrypt operation, rather than relying on the KMS to generate the key. This is because Create and Encrypt are fundamental KMIP features, whereas remote random number generation (RNG Retrieve) is less commonly supported (examples: ovh/kmip-go, PyKMIP). This approach maximizes compatibility with various KMIP servers.
The libkmip Fork: RGW uses a fork of libkmip. To ensure our contributions are effective, we would like to understand its history and the best path forward for contributing these new features.
Workqueue Abstraction: The existing wrapper uses a workqueue system. To ensure our additions are idiomatic, we would appreciate some background on its design and purpose (e.g., connection management, async processing).
We plan to test this against OVH OKMS and PyKMIP during development and add a Teuthology suite that uses PyKMIP.
Configuration will be analogous to the existing SSE-KMS KMIP support (see: https://docs.ceph.com/en/latest/radosgw/kmip/).
This design introduces network calls to the KMIP server for Encrypt and Decrypt operations, placing them directly in the hot path. Every PUT to and GET from an encrypted bucket will therefore incur the latency of one KMIP round trip.
To address this, we are aware of a separate, in-progress effort (PR #61256) to implement a cache for unwrapped SSE keys within RGW. This will significantly mitigate the performance impact for read-heavy workloads. The PUT path for new objects, however, will still incur the latency of one Encrypt call.
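To illustrate why such a cache helps reads but not writes, here is a hypothetical sketch of a small TTL cache keyed by the wrapped DEK. It does not reflect the design of the referenced PR; the class name, parameters, and eviction policy are assumptions made for the example.

```python
import time
from collections import OrderedDict


class DekCache:
    """TTL plus LRU cache for unwrapped DEKs (illustrative only)."""

    def __init__(self, unwrap, ttl=60.0, max_entries=1024, clock=time.monotonic):
        self._unwrap = unwrap    # callable(kek_id, wrapped_dek, context) -> bytes
        self._ttl = ttl
        self._max = max_entries
        self._clock = clock
        self._entries = OrderedDict()  # key -> (expiry time, plaintext DEK)

    def __len__(self):
        return len(self._entries)

    def get(self, kek_id: str, wrapped_dek: bytes, context: bytes) -> bytes:
        key = (kek_id, wrapped_dek, context)
        now = self._clock()
        hit = self._entries.get(key)
        if hit is not None and hit[0] > now:
            self._entries.move_to_end(key)  # mark as recently used
            return hit[1]
        # Miss or expired entry: one KMIP Decrypt round trip.
        dek = self._unwrap(kek_id, wrapped_dek, context)
        self._entries[key] = (now + self._ttl, dek)
        self._entries.move_to_end(key)
        while len(self._entries) > self._max:
            self._entries.popitem(last=False)  # evict least recently used
        return dek
```

A PUT always creates a new DEK, so there is nothing to look up and the Encrypt call cannot be avoided this way. The trade-off of any such cache is that plaintext DEKs live longer in RGW memory.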
Instead of extending the binary TTLV-based libkmip, we considered implementing a new client for KMIP's JSON or XML encodings. This would be simpler and wouldn't require modifying a C library. However, the KMIP specification does not mandate support for JSON or XML, and these formats are not widely available in KMIP servers. Using libkmip and the standard binary protocol remains the most compatible and robust choice.
We are seeking feedback from the maintainers on the following key points:
- DEK Generation in RGW: Do you agree with the proposed approach of generating DEKs locally within RGW?
- libkmip Contribution Strategy: What is the preferred way to contribute the new Encrypt/Decrypt functionality? Should we target the ceph/libkmip fork directly, or contribute to OpenKMIP/libkmip upstream and then update the Ceph fork?
- libkmip Wrapper Insights: Can you provide any history on the libkmip fork and the rationale behind the workqueue abstraction in the RGW wrapper? This will help us make our additions idiomatic.
- Testing Plan: Does our proposed testing strategy seem sufficient, or are there additional scenarios or environments we should consider?
@mdw-at-linuxbox, following Adams's suggestion on Slack, I wanted to bring this KMIP backend proposal to your attention. We'd value your feedback, especially on the DEK generation and libkmip strategy.