@gaborgsomogyi
Created December 7, 2025 13:13
Hadoop S3 vs S3A: Deep Technical Analysis

Overview

This document provides a comprehensive analysis of the differences between Hadoop's original S3 filesystem client and the S3A implementation, based on direct examination of the Hadoop codebase.

Implementation Status and History

S3 (s3native) - DEPRECATED

  • Location: ~/hadoop/hadoop-tools/hadoop-aws/src/main/java/org/apache/hadoop/fs/s3native/
  • Status: DEPRECATED AND REPLACED
  • Remaining Files: Only S3xLoginHelper.java and package.html - minimal codebase
  • Package Documentation: Explicitly states: "It has been replaced by the S3A client"
  • URI Scheme: s3n:// (the still older block-based store used s3://; both schemes are retired)
  • Maintenance: No active development

S3A - ACTIVE

  • Location: ~/hadoop/hadoop-tools/hadoop-aws/src/main/java/org/apache/hadoop/fs/s3a/
  • Status: ACTIVELY MAINTAINED - primary S3 implementation
  • Scale: 72 subdirectories with comprehensive implementation
  • Core Files:
    • S3AFileSystem.java (5,749 lines) - main implementation
    • S3ABlockOutputStream.java (1,594 lines) - advanced output handling
  • URI Scheme: s3a://
  • Configuration Prefix: fs.s3a.* (not fs.s3.*)

Key Architectural Differences

Output/Write Operations

S3 (Original)

  • Buffer-all approach: All data buffered to disk before upload
  • Single upload: Per close() operation
  • Performance: Slow for large files; consumes local disk space equal to the file size
  • Pattern: Simple but inefficient write pattern

S3A (Advanced)

  • Fast Upload: Modern multipart upload architecture via S3ABlockOutputStream
  • Parallel Block Uploads: Blocks uploaded in background threads while data is still being written
  • Buffering Strategies: Three configurable options:
    • disk: Uses local directories (default, configurable via fs.s3a.buffer.dir)
    • array: In-heap memory arrays
    • bytebuffer: Off-heap direct ByteBuffers
  • Key Configuration:
    fs.s3a.multipart.size = 64M                    # partition size
    fs.s3a.multipart.threshold = 128M              # when to start multipart
    fs.s3a.fast.upload.active.blocks = 4           # max concurrent uploads
    fs.s3a.fast.upload.buffer = disk               # buffering strategy
    
  • Streaming Upload: Begins uploading as soon as buffered data exceeds partition size
  • Optimized Close: Time limited to remaining data upload, not total file size
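The parallel block-upload pattern described above can be sketched as follows. This is purely illustrative Python, not Hadoop's actual code: full blocks are handed to a background executor as soon as they fill, so close() only waits for the final partial part.

```python
# Illustrative sketch (not Hadoop code): buffer writes, upload each full
# block in the background while the caller keeps writing, and upload only
# the final short block at close() time.
from concurrent.futures import ThreadPoolExecutor

class BlockOutputStream:
    def __init__(self, block_size, max_active_blocks):
        self.block_size = block_size  # analogue of fs.s3a.multipart.size
        # analogue of fs.s3a.fast.upload.active.blocks:
        self.executor = ThreadPoolExecutor(max_active_blocks)
        self.buffer = bytearray()
        self.futures = []

    def _upload_part(self, data, part_number):
        # Placeholder for an S3 UploadPart request.
        return (part_number, len(data))

    def write(self, data):
        self.buffer.extend(data)
        while len(self.buffer) >= self.block_size:
            block = bytes(self.buffer[:self.block_size])
            self.buffer = self.buffer[self.block_size:]
            part = len(self.futures) + 1
            # Upload starts immediately, in a background thread.
            self.futures.append(self.executor.submit(self._upload_part, block, part))

    def close(self):
        if self.buffer:  # final, possibly short, part
            self.futures.append(self.executor.submit(
                self._upload_part, bytes(self.buffer), len(self.futures) + 1))
        parts = [f.result() for f in self.futures]
        self.executor.shutdown()
        return parts

stream = BlockOutputStream(block_size=4, max_active_blocks=2)
stream.write(b"abcdefghij")   # two 4-byte parts begin uploading right away
parts = stream.close()         # close() only waits for the trailing 2 bytes
```

Because the earlier parts are already in flight when close() is called, the close time is bounded by the remaining data, matching the "Optimized Close" point above.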

Read/Input Operations

S3A Enhancements

  • Vectored IO: Multiple file ranges read with batching and merging
  • Prefetching: Optional S3ACachingInputStream for optimized reads
  • Input Policies (fadvise): Sequential, random, whole-file modes
  • Seek Optimization: Configurable readahead range to minimize reconnections
  • Block Size: Configurable via fs.s3a.block.size = 32M
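The range batching and merging behind vectored IO can be illustrated with a small sketch. This is not Hadoop's implementation; the gap threshold stands in for a coalescing tunable such as fs.s3a.vectored.read.min.seek.size (an assumption for illustration):

```python
# Illustrative sketch: coalesce requested byte ranges when the gap between
# them is small, so fewer GET requests are issued; distant ranges stay
# separate reads.
def merge_ranges(ranges, min_seek):
    """ranges: list of (offset, length); returns merged (offset, length) list."""
    merged = []
    for offset, length in sorted(ranges):
        if merged:
            prev_off, prev_len = merged[-1]
            gap = offset - (prev_off + prev_len)
            if gap <= min_seek:  # cheap enough to read through the gap
                merged[-1] = (prev_off, max(prev_len, offset + length - prev_off))
                continue
        merged.append((offset, length))
    return merged

# Two nearby ranges collapse into one request; the far range stays separate.
print(merge_ranges([(0, 100), (150, 50), (10_000, 100)], min_seek=4096))
```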

Directory/Listing Operations

S3A Improvements

  • List API Versions: Supports both ListV1 and ListV2
  • Default: fs.s3a.list.version = 2
  • Pagination: fs.s3a.paging.maximum = 5000 keys per request
  • Performance: Significantly improved listing performance (HADOOP-17400)
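The paginated listing loop can be sketched like this, with page_size playing the role of fs.s3a.paging.maximum and a continuation token driving the loop, in the style of a ListObjectsV2 request cycle (the fake client is a hypothetical stand-in):

```python
# Illustrative sketch: a ListObjectsV2-style pagination loop. Each call
# returns at most page_size keys plus a continuation token; the client
# keeps requesting pages until the token is exhausted.
def fake_list_objects_v2(all_keys, page_size, token=0):
    """Stand-in for an S3 ListObjectsV2 call against a sorted key space."""
    page = all_keys[token:token + page_size]
    next_token = token + page_size if token + page_size < len(all_keys) else None
    return page, next_token

def list_all(all_keys, page_size):
    keys, token = [], 0
    while token is not None:
        page, token = fake_list_objects_v2(all_keys, page_size, token)
        keys.extend(page)
    return keys

keys = [f"data/part-{i:04d}" for i in range(12)]
listed = list_all(keys, page_size=5)  # three requests: 5 + 5 + 2 keys
```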

Consistency Model Evolution

S3 (Original)

  • Relied on S3's eventual-consistency semantics (problematic until AWS S3 became strongly consistent in December 2020)

S3A (Advanced)

  • Historical Challenge: AWS S3 was eventually consistent until December 2020

    • Newly created objects excluded from directory listings
    • Newly deleted objects retained in listings
    • Deleted objects still visible in HEAD/GET probes
    • 404 caching issues
  • S3Guard Solution (2016-2020): Used DynamoDB as metadata cache for consistency

  • Current State (Post-2020): AWS S3 provides read-after-write consistency

    • S3Guard deprecated and removed (HADOOP-17409)
    • Change detection via ETag or version IDs
    • Configuration: fs.s3a.change.detection.source and fs.s3a.change.detection.mode

Multipart Upload Architecture

S3A Specifications

  • Minimum Part Size: 5MB (AWS S3 requirement)
  • Default Part Size: 64MB (fs.s3a.multipart.size)
  • Copy Operations: Also use multipart for large files
  • Cleanup Management:
    • Automatic purge of orphaned uploads (configurable age)
    • Configuration: fs.s3a.multipart.purge, fs.s3a.multipart.purge.age
  • Multi-Object Delete:
    • Enabled by default (fs.s3a.multiobjectdelete.enable = true)
    • Bulk delete in single request (vs. individual DELETEs)
    • Page size: 250 entries (not 1000) to avoid throttling
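The bulk-delete paging described above amounts to chunking the key list, one DeleteObjects request per page instead of one DELETE per key. A minimal sketch:

```python
# Illustrative sketch: batch object deletions into pages of at most 250
# keys, mirroring the reduced bulk-delete page size described above.
def delete_pages(keys, page_size=250):
    """Split keys into delete batches; each batch would be one bulk request."""
    return [keys[i:i + page_size] for i in range(0, len(keys), page_size)]

batches = delete_pages([f"key-{i}" for i in range(600)])
print([len(b) for b in batches])  # 600 keys -> three requests
```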

Authentication & Security

S3 (Original)

  • Used old JetS3t-based libraries
  • No modern credential provider chain
  • Limited authentication options

S3A (Advanced)

  • Credential Interface: software.amazon.awssdk.auth.credentials.AwsCredentialsProvider
  • Provider Chain:
    1. Temporary credentials (TemporaryAWSCredentialsProvider)
    2. Simple credentials (SimpleAWSCredentialsProvider)
    3. Environment variables
    4. EC2/Container Metadata Service (IAMInstanceCredentialsProvider)
  • Advanced Features:
    • Assumed Role support with STS
    • Delegation tokens
    • Per-bucket credential configuration
    • Credential provider mapping for V1→V2 migration
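The chain semantics are simple: consult each provider in order and take the first non-empty answer. The sketch below shows that control flow only; the provider functions are hypothetical stand-ins, not the real AwsCredentialsProvider API:

```python
# Illustrative sketch: walk an ordered chain of credential providers and
# return the first credentials found, raising if every provider declines.
def resolve_credentials(providers):
    for provider in providers:
        creds = provider()          # a provider returns credentials or None
        if creds is not None:
            return creds
    raise RuntimeError("No AWS credentials found in any provider")

def env_provider(env):
    """Provider reading a dict shaped like the process environment."""
    def provider():
        key = env.get("AWS_ACCESS_KEY_ID")
        secret = env.get("AWS_SECRET_ACCESS_KEY")
        return (key, secret) if key and secret else None
    return provider

static = lambda: None               # nothing configured in this example
env = env_provider({"AWS_ACCESS_KEY_ID": "AKIA...",
                    "AWS_SECRET_ACCESS_KEY": "s3cr3t"})
metadata = lambda: ("role-key", "role-secret")  # pretend IMDS answered

# Chain order mirrors the list above: static config, environment, metadata.
creds = resolve_credentials([static, env, metadata])
```

Here the environment provider wins because the static provider returned nothing; the metadata provider is never consulted.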

Encryption Support (S3A)

  • SSE-S3: AES256 server-side encryption
  • SSE-KMS: Key Management Service encryption
  • SSE-C: Customer-provided keys
  • CSE-KMS: Client-side encryption
  • Configuration: fs.s3a.encryption.algorithm, fs.s3a.encryption.key

Configuration Differences

S3A Configuration Matrix

| Aspect | Configuration Key | Default Value |
| --- | --- | --- |
| Authentication | | |
| Access Key | fs.s3a.access.key | - |
| Secret Key | fs.s3a.secret.key | - |
| Session Token | fs.s3a.session.token | - |
| Connection Management | | |
| Connection Pool | fs.s3a.connection.maximum | 96 |
| Thread Pool | fs.s3a.threads.max | (tunable) |
| Retry & Resilience | | |
| Retry Limit | fs.s3a.retry.limit | 7 |
| Retry Interval | fs.s3a.retry.interval | 500ms |
| Throttle Retry Limit | fs.s3a.retry.throttle.limit | 20 |
| Performance Tuning | | |
| Fast Upload Buffer | fs.s3a.fast.upload.buffer | disk |
| Socket Send Buffer | fs.s3a.socket.send.buffer | 8192 bytes |
| Socket Recv Buffer | fs.s3a.socket.recv.buffer | 8192 bytes |
| Storage Options | | |
| Checksum Algorithm | fs.s3a.create.checksum.algorithm | (none) |
| Storage Class | fs.s3a.create.storage.class | Standard |

Per-Bucket Configuration

  • All fs.s3a options (except fs.s3a.impl) can be overridden per bucket
  • Pattern: fs.s3a.bucket.BUCKETNAME.OPTION_NAME
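Per-bucket resolution is a key-rewriting step: every fs.s3a.bucket.BUCKETNAME.OPTION entry is copied over the corresponding base fs.s3a.OPTION. A minimal sketch of that rewriting (not the actual Hadoop Configuration code):

```python
# Illustrative sketch: apply fs.s3a.bucket.BUCKET.OPTION overrides on top
# of the base fs.s3a.OPTION keys, skipping fs.s3a.impl which cannot be
# overridden per bucket.
def resolve_bucket_config(conf, bucket):
    prefix = f"fs.s3a.bucket.{bucket}."
    # Start from the non-bucket-specific keys.
    resolved = {k: v for k, v in conf.items()
                if not k.startswith("fs.s3a.bucket.")}
    for key, value in conf.items():
        if key.startswith(prefix):
            option = key[len(prefix):]
            if option != "impl":
                resolved["fs.s3a." + option] = value
    return resolved

conf = {
    "fs.s3a.endpoint": "s3.amazonaws.com",
    "fs.s3a.bucket.eu-data.endpoint": "s3.eu-west-1.amazonaws.com",
}
print(resolve_bucket_config(conf, "eu-data")["fs.s3a.endpoint"])
```

For any other bucket the base endpoint stays in effect; only eu-data picks up the override.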

Performance Characteristics

S3A Optimizations

| Operation | Characteristic | Configuration |
| --- | --- | --- |
| Upload | Parallel background blocks | fs.s3a.threads.max |
| Buffering | Three strategies | fs.s3a.fast.upload.buffer |
| Queuing | Task queue limit | fs.s3a.max.total.tasks |
| Seek | Readahead support | fs.s3a.readahead.range = 64K |
| Listing | Batch pagination | fs.s3a.paging.maximum = 5000 |
| Throttling | Exponential backoff | fs.s3a.retry.throttle.interval = 100ms |
| Vectored I/O | Range batching | fs.s3a.vectored.read.* |
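The throttling behaviour amounts to an exponential backoff schedule seeded by the base interval. The sketch below computes the delays rather than sleeping them; a real client would typically also add jitter, which is omitted here for clarity:

```python
# Illustrative sketch: exponential backoff delays for throttled requests,
# doubling from a base interval (in the spirit of
# fs.s3a.retry.throttle.interval) up to a retry cap (like
# fs.s3a.retry.throttle.limit).
def throttle_backoff_schedule(base_interval_ms=100, limit=5):
    """Return the delay in ms before each retry attempt."""
    return [base_interval_ms * (2 ** attempt) for attempt in range(limit)]

print(throttle_backoff_schedule())  # [100, 200, 400, 800, 1600]
```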

Directory Operations (Non-atomic)

  • Rename: Not atomic - copies all files, then deletes originals
  • Delete (dir): Not atomic - can fail partway through
  • Time Complexity: Proportional to number of files and data size
  • Network: Copy operations execute inside S3, so they do not depend on client bandwidth
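Why rename is not atomic is easy to demonstrate: it is a copy of every key followed by a delete of every original, so a failure partway through leaves a mix of old and new keys visible. A toy simulation over a dict-as-object-store:

```python
# Illustrative sketch: object-store "rename" as copy-then-delete. A crash
# mid-copy leaves both source keys and a partial set of destination keys.
def rename_dir(store, src_prefix, dst_prefix, fail_after=None):
    keys = [k for k in sorted(store) if k.startswith(src_prefix)]
    for i, key in enumerate(keys):
        if fail_after is not None and i == fail_after:
            raise IOError("simulated failure mid-rename")
        store[dst_prefix + key[len(src_prefix):]] = store[key]
    for key in keys:  # only reached if every copy succeeded
        del store[key]

store = {"in/a": 1, "in/b": 2, "in/c": 3}
try:
    rename_dir(store, "in/", "out/", fail_after=2)  # dies after 2 copies
except IOError:
    pass
print(sorted(store))  # all originals plus two partial copies remain
```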

Advanced Features (S3A Only)

Committer Architecture

  • Magic Committer: Special support for task-parallel writes
  • Configuration: fs.s3a.committer.name (file, directory, partitioned, magic)
  • Staging: Staging committers use a consistent cluster filesystem (e.g. HDFS) to hold intermediate data and commit metadata
  • Abort Handler: Cleanup of pending multipart uploads on job failure

Enterprise Features

  1. Auditing: Track S3 operations by job, user, filesystem operation
  2. S3Select: Server-side filtering support
  3. Delegation Tokens: Support for Kerberos environments
  4. Assumed Roles: Support for cross-account access
  5. S3 Express One Zone: Latest AWS storage option support
  6. Directory Markers: Configurable behavior for pseudo-directory tracking

Migration Path

From S3 to S3A

  1. URI Change: s3://bucket → s3a://bucket
  2. Configuration Prefix: fs.s3.* → fs.s3a.*
  3. Credential Provider: Interface changes (V1 → V2 SDK)
  4. Mapping Support: fs.s3a.aws.credentials.provider.mapping

Current State

  • S3 implementation files are minimal (only login helper remains)
  • S3 is NOT actively maintained
  • All new features and optimizations go to S3A
  • AWS SDK upgraded from V1 to V2 in Hadoop 3.4.0

Why S3A Was Created

Based on code analysis and documentation:

  1. Performance: S3 buffered entire files before upload; S3A uploads in parallel blocks
  2. Scalability: S3A supports modern multipart upload with background threads
  3. SDK Evolution: Move to modern AWS SDK versions with better features
  4. Consistency: S3A built-in support for consistency mechanisms (S3Guard, change detection)
  5. Enterprise Features: Committers, auditing, delegation tokens, S3Select
  6. Reliability: Better error handling, retry policies, instrumentation
  7. Efficiency: Configurable buffering strategies (disk, array, bytebuffer)

File Structure Comparison

/hadoop/hadoop-tools/hadoop-aws/src/main/java/org/apache/hadoop/fs/

s3native/           (DEPRECATED - original S3)
├── S3xLoginHelper.java
└── package.html    (states "It has been replaced by the S3A client")

s3a/                (ACTIVE - S3A Advanced)
├── S3AFileSystem.java (5,749 lines - core)
├── S3ABlockOutputStream.java (1,594 lines)
├── S3AInputStream.java
├── S3AInstrumentation.java (metrics & statistics)
├── Constants.java (configuration)
├── impl/           (implementation details)
├── statistics/     (performance metrics)
├── s3guard/        (consistency - deprecated)
├── commit/         (committer architecture)
└── [68+ more files and subdirectories]

Summary

S3 (Original)

  • Deprecated and unmaintained
  • Poor performance for large files
  • Limited authentication options
  • No modern AWS SDK features
  • No consistency guarantees
  • Single-threaded uploads

S3A (Advanced)

  • Actively maintained and feature-rich
  • High-performance parallel uploads
  • Modern AWS SDK v2 support
  • Comprehensive authentication options
  • Enterprise features (auditing, delegation tokens)
  • Multiple buffering strategies
  • Consistency management
  • Advanced committer support

Recommendation: Use S3A for all new deployments. The original S3 implementation should only be encountered in legacy systems requiring migration to S3A.
