@harshavardhana
Created March 10, 2026 22:01
Storj S3 vs MinIO AIStor: Technical Comparison

The Elephant in the Room: Storj's S3 Gateway IS MinIO

storj/gateway-st/miniogw implements MinIO's ObjectLayer interface, routing S3 API calls to the Storj network instead of local disks. Storj's entire S3-compatible surface area is a MinIO fork — they are not building an S3 implementation; they are building a storage backend adapter for MinIO.

gateway-mt (the hosted, multi-tenant variant) is the same pattern wrapped in a MultiTenancyLayer. When you connect to Storj's S3 endpoint, you are talking to a MinIO process.

This is not incidental. It means Storj's S3 compatibility ceiling is bounded by whatever version of MinIO they have forked and are maintaining — and their own issue tracker (edge#27: "Replace Minio fork with most recent Apache2 version") shows they are perpetually behind upstream.
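The gateway pattern described above can be sketched as a small adapter. The following is an illustrative Python analogue, not Storj's actual Go code: the class and method names (`ObjectLayer`, `put_object`, `get_object`) are simplifications of MinIO's real interface, and `StorjNetwork` is a hypothetical stand-in for the satellite-plus-nodes data path.

```python
from abc import ABC, abstractmethod

# Hypothetical minimal analogue of MinIO's ObjectLayer interface: the S3
# front end calls these methods; a backend decides where bytes actually live.
class ObjectLayer(ABC):
    @abstractmethod
    def put_object(self, bucket: str, key: str, data: bytes) -> None: ...

    @abstractmethod
    def get_object(self, bucket: str, key: str) -> bytes: ...

class LocalDiskLayer(ObjectLayer):
    """What MinIO itself does: metadata and data together on local drives."""
    def __init__(self):
        self.store = {}

    def put_object(self, bucket, key, data):
        self.store[(bucket, key)] = data

    def get_object(self, bucket, key):
        return self.store[(bucket, key)]

class StorjGatewayLayer(ObjectLayer):
    """What miniogw does: forward every call to a remote network
    (satellite for auth/metadata, storage nodes for the pieces)."""
    def __init__(self, network):
        self.network = network  # hypothetical client for the Storj network

    def put_object(self, bucket, key, data):
        self.network.upload(bucket, key, data)

    def get_object(self, bucket, key):
        return self.network.download(bucket, key)
```

The point of the sketch: the S3 front end is identical in both cases; only the backend behind the interface changes, which is why Storj's S3 behavior tracks whatever MinIO version they forked.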


Architecture: Fundamentally Different Design Goals

| Dimension | Storj | MinIO AIStor |
|---|---|---|
| Storage model | Decentralized: ~20,000 volunteer nodes globally | Centralized: operator-controlled drives/nodes |
| Data path | Client → Satellite (auth) → 80 volunteer nodes | Client → MinIO server → local drives |
| Metadata | Satellite DB (hosted, centralized) | Embedded metadata on same drives |
| Erasure code | RS(29,80) across random internet nodes | RS(k,n) within operator's erasure set |
| Encryption | Client-side mandatory (before distribution) | Server-side (SSE-S3, SSE-KMS, SSE-C) |
| S3 implementation | MinIO fork (ObjectLayer interface) | Native, first-class implementation |

The Satellite Is the Real Single Point of Failure

Storj markets itself as decentralized, but every operation — every PUT, GET, DELETE, LIST — requires a round-trip to the satellite for:

  • Authentication and authorization
  • Node address lookup (where are my 80 pieces?)
  • Bandwidth allocation signing
  • Audit scheduling

If the satellite is down, zero bytes are accessible, regardless of how many storage nodes are online. The data is "distributed"; the access control is not. Storj's own docs acknowledge this: the satellite is the authoritative source and they have no near-term plan to decentralize it.

MinIO AIStor has no such external dependency. A MinIO cluster is self-contained — metadata and data co-located, no external coordinator required for I/O.


Scalability Failure Modes

1. Upload Amplification Is Pathological for Small Objects

For every object, Storj must:

  • Establish TCP/TLS connections to 110 storage nodes
  • Successfully upload to 80 of them
  • Record metadata on the satellite

For a 1 KB object, each of the 80 uploaded pieces carries roughly 1 KB / 29 ≈ 36 bytes (the 80 pieces already include the Reed-Solomon parity), so about 2.8 KB crosses the wire, yet 110 connections are opened to move it. The connection overhead dwarfs the actual I/O. MinIO writes to local drives with a single network hop.
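The amplification arithmetic can be checked back-of-envelope. Assumptions: piece size is `ceil(object_size / k)`; real Storj segments add framing and metadata overhead not modeled here.

```python
import math

# Back-of-envelope upload amplification for a 1 KB object under RS(29,80).
object_size = 1024          # bytes
k, n = 29, 80               # RS data shares / total pieces stored
dialed = 110                # connections opened (long-tail cancellation)

piece = math.ceil(object_size / k)   # bytes per piece (~36)
wire_bytes = n * piece               # total bytes sent/stored (~2.8 KB)

print(f"piece size:  {piece} B")
print(f"wire bytes:  {wire_bytes} B ({wire_bytes / object_size:.2f}x expansion)")
print(f"connections: {dialed} TCP/TLS handshakes for {object_size} B of payload")
```

Even ignoring per-connection handshake latency, the bytes moved per connection are tiny, which is the core of the small-object problem.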

2. TTFB Is Structurally High

Time-to-first-byte requires: satellite auth round-trip → node address resolution → 39 parallel TCP/TLS connections → wait for 29 responses. For latency-sensitive small-object workloads (API responses, thumbnails, configuration files), this is a non-starter. MinIO AIStor's TTFB is bounded only by network RTT to the server and disk seek time.
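A rough latency budget makes the structural gap concrete. All of the RTT values below are illustrative assumptions, not measurements: a 50 ms satellite round-trip, an 80 ms RTT for the 29th-fastest of the 39 dialed nodes, and ~2 RTTs for a fresh TCP + TLS 1.3 handshake.

```python
# Illustrative TTFB floor for a Storj-style GET vs a same-DC single hop.
# All latencies are assumptions chosen for illustration, not measurements.
satellite_rtt = 0.050    # auth + node address resolution (one round-trip)
node_rtt_p29 = 0.080     # RTT of the 29th-fastest of 39 dialed nodes
handshake_rtts = 2       # TCP SYN/ACK + TLS 1.3 (1-RTT)

storj_ttfb = satellite_rtt + handshake_rtts * node_rtt_p29 + node_rtt_p29
single_hop = 0.002 + handshake_rtts * 0.002   # MinIO in the same DC, ~2 ms RTT

print(f"Storj-style TTFB floor: {storj_ttfb * 1000:.0f} ms")
print(f"Same-DC single hop:     {single_hop * 1000:.0f} ms")
```

The exact numbers vary, but the structure does not: the satellite round-trip and the wait for the slowest needed node are serialized into every GET.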

3. 2.76× Storage Overhead Is Not Competitive

RS(29,80) stores 80 pieces where 29 suffice — 2.76× expansion factor. MinIO AIStor with RS(8,12) stores 12 pieces where 8 suffice — 1.5× expansion factor — while delivering equivalent or stronger durability guarantees on operator-controlled hardware. Storj's overhead exists solely to compensate for the unreliability of volunteer nodes.
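The trade can be modeled directly: an object survives if at least k of its n pieces remain, so durability under independent per-node availability p follows a binomial tail. This is a simplification (real durability is maintained by active repair, and the availability figures below are assumptions), but it shows why unreliable nodes force a large n/k.

```python
from math import comb

def survival(k: int, n: int, p: float) -> float:
    """P(at least k of n pieces survive), nodes independent with availability p."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))

# RS(29,80) over flaky volunteer nodes vs RS(8,12) over managed hardware.
# p values are illustrative assumptions.
print(f"RS(29,80), p=0.90/node: survival {survival(29, 80, 0.90):.12f}")
print(f"RS(8,12),  p=0.99/node: survival {survival(8, 12, 0.99):.12f}")
print(f"expansion: {80 / 29:.2f}x vs {12 / 8:.2f}x")
```

Both schemes reach high durability; the difference is that the managed-hardware scheme gets there at 1.5x overhead because it does not have to absorb volunteer-node unreliability.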

4. Node Churn Drives Continuous Repair Cost

Storj's own white paper models per-node MTTF at 9 months. With ~20,000 nodes and a 9-month MTTF, ~2,200 nodes fail per month. Every failed node triggers repair workers on the satellite to re-fetch remaining pieces from surviving nodes and re-distribute to new nodes. At petabyte scale, this repair traffic is continuous, significant, and competes with user I/O for satellite and storage node capacity. There is no published data on repair system throughput at scale.
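The churn figures follow from the white paper's own MTTF. The repair-traffic estimate below additionally assumes each failed node held a proportional share of a hypothetical 1 PB of raw pieces; that fleet size is an illustration, not a published number.

```python
# Node-churn repair load from a 9-month per-node MTTF (per the white paper).
nodes = 20_000
mttf_months = 9
stored_pb = 1.0   # ASSUMPTION: 1 PB of raw pieces across the network

failures_per_month = nodes / mttf_months          # ~2,222
data_per_node_tb = stored_pb * 1000 / nodes       # ~0.05 TB per node
repair_tb_month = failures_per_month * data_per_node_tb

print(f"failures/month: {failures_per_month:.0f}")
print(f"repair traffic: ~{repair_tb_month:.0f} TB re-fetched and re-uploaded per month")
```

Note the estimate understates the fetch side: rebuilding a lost piece requires downloading k surviving pieces, so repair reads exceed the raw bytes lost.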

5. The Satellite Metadata DB Is a Hard Ceiling

Every object — regardless of size — requires a metadata record on the satellite. For workloads with billions of small objects (logs, metrics, telemetry), the satellite's database becomes the bottleneck. Storj has acknowledged scaling the satellite as a long-term engineering challenge with no shipped solution.

6. LIST Is Broken for Most S3 Applications

Storj does not use order-preserving encryption, so ListObjects cannot return keys in lexicographic order per the S3 spec. A large portion of the S3 ecosystem — pagination logic, prefix-based directory emulation, parallel listing tools, anything that assumes "a/b" < "a/c" — breaks silently.
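The breakage is easy to demonstrate. The toy substitution cipher below is a stand-in for Storj's actual deterministic path encryption: any key transformation that is not order-preserving means sorting ciphertexts no longer sorts the plaintext keys.

```python
# Toy stand-in for deterministic path encryption (NOT Storj's real cipher):
# a substitution that scrambles byte order, like real encryption does.
CIPHER = str.maketrans("abcd", "dcba")

def encrypt_key(key: str) -> str:
    return key.translate(CIPHER)

plaintext = ["a/b", "a/c", "a/d"]
# What a gateway listing by ciphertext order would return:
listing = sorted(plaintext, key=encrypt_key)

print("S3-spec order:", sorted(plaintext))   # ['a/b', 'a/c', 'a/d']
print("gateway order:", listing)             # ['a/d', 'a/c', 'a/b']
```

Any client that paginates with a start-after marker, or assumes `"a/b" < "a/c"` in the response stream, silently gets wrong results against such a listing.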


Enterprise Feature Gaps

| Feature | Storj | MinIO AIStor |
|---|---|---|
| AWS IAM-compatible policies | No (macaroon/access grant model) | Full IAM: users, groups, roles, policy docs |
| Bucket policies (JSON) | No | Yes |
| Cross-region replication | No | Yes (site replication, bucket replication) |
| Event notifications | No SNS/SQS/Lambda targets | Yes: Kafka, NATS, Redis, webhooks, etc. |
| Object Lock (Governance) | Not supported | Yes |
| Object Lock (Legal Hold) | Not supported | Yes |
| S3 Select | No | Yes |
| Storage class tiering | No | Yes (warm, cold, transition rules) |
| SSE-KMS (external KMS) | No (client-side only) | Yes: Vault, KES, AWS KMS |
| Lexicographic LIST order | Broken | Correct per S3 spec |
| Bucket logging | Manual request only | Self-service |
| CopyObject size | 671 GB limit | 5 TB (S3-spec compliant) |
| VPC / private networking | No | Yes |
| LDAP / AD integration | No | Yes |
| OpenID Connect | No | Yes |
| Audit logging | No | Yes |

The Gateway-MT Centralization Trap

When you use Storj's hosted S3-compatible endpoint (gateway.storjshare.io), your data flows:

Client → gateway-mt (Equinix PoP) → Storj satellite → storage nodes

The hosted gateway is deployed at a fixed set of Equinix locations. All traffic passes through this centralized bottleneck, which eliminates the distributed throughput advantage that is Storj's primary selling point. You're paying the latency and reliability costs of a decentralized network while not getting the bandwidth benefits.

MinIO AIStor deployed in your datacenter or VPC: zero intermediaries, full wire speed.


What Storj Is Actually Good For

Storj works well for:

  • Large media file distribution (>64 MB objects, parallelism-friendly)
  • Backup/archive where TTFB doesn't matter and durability is the priority
  • Egress-cost-sensitive workloads (Storj charges ~$0.007/GB egress vs S3's $0.09/GB)
  • Individual developers who want cheap cold storage and don't need IAM, replication, or events

For anything requiring enterprise S3 semantics, sub-100ms TTFB, IAM, replication, event-driven architectures, or consistent LIST semantics — Storj cannot deliver. These are structural limitations, not current gaps on a roadmap.


Summary

Storj's core insight — erasure coding across geographically diverse nodes for durability and distributed bandwidth — is sound for narrow use cases. But calling it an enterprise S3 replacement is misleading:

  • The S3 gateway is MinIO — Storj depends on MinIO for S3 compatibility
  • The satellite is a hidden centralized dependency that contradicts the "decentralized" marketing
  • The architecture is optimized for large sequential reads; it degrades badly for the workloads enterprises actually run
  • The feature surface is a fraction of what AWS S3 or MinIO AIStor provides
  • The 2.76× storage overhead and continuous repair cost make it economically uncompetitive at scale for primary storage