Gabor Somogyi gaborgsomogyi

Hadoop S3 vs S3A: Deep Technical Analysis

Overview

This document provides a comprehensive analysis of the differences between Hadoop's original S3 and S3A (S3Advanced) filesystem implementations based on direct examination of the Hadoop codebase.

Implementation Status and History

S3 (s3native) - DEPRECATED

  • Location: ~/hadoop/hadoop-tools/hadoop-aws/src/main/java/org/apache/hadoop/fs/s3native/

Tasks

  • Tokens must be tracked per service; only updated tokens must be sent to task managers
  • Dynamic token initialization from user code must be added
  package org.apache.flink.api.common.security;

  /**
   * Manager for obtaining delegation tokens dynamically at runtime.
   * This allows user code to request tokens for services not known at cluster startup.

Which version affected?

$ gpg --version
gpg (GnuPG) 2.4.8

What kind of issue is one facing?

It simply hangs!

from itertools import product

def generate_paintings():
    colors = ['A', 'B', 'C', 'D']
    hexagons = ['H1', 'H2', 'H3', 'H4', 'H5', 'H6', 'H7']  # 1 in the middle, 6 around
    
    # Generate all possible combinations of colors for the hexagons
    all_paintings = product(colors, repeat=7)
    

Dependency management in python

Python cannot handle two different versions of the same package.
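One way to see why this holds is the interpreter's module cache: imported modules are stored in `sys.modules` keyed by name, so a second import of the same name returns the already loaded object instead of loading another copy. The snippet below is a minimal sketch of that mechanism using the standard library `json` module as a stand-in for any package; it illustrates the cache only and is not meant as a description of how installers lay out files on disk.

```python
import importlib
import sys

import json  # stand-in for any package; chosen only because it is always available

# Importing the same name again does not load a second copy:
# the import system returns the object already cached in sys.modules.
again = importlib.import_module("json")

same_object = again is json
cached = sys.modules["json"] is json

print(same_object, cached)
```

Because the cache is keyed by the module name alone, there is no slot for a second version of the same package within one interpreter.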

Dependency handling issue

When pip <20.3 finds an unresolvable dependency (one that resolves to 2+ versions), it does not halt the installation process. It continues successfully by installing the first matching dependency in the list of conflicts.
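The difference between "pick the first match and continue" and "halt on conflict" can be sketched with a toy model. This is not pip's code: the function names, the `ResolutionConflict` exception, and the package names `libfoo` and `app` are all hypothetical, and requirements are simplified to exact `(name, version)` pins.

```python
from collections import defaultdict


class ResolutionConflict(Exception):
    """Hypothetical error: a package was pinned to 2+ different versions."""


def group_pins(requirements):
    """Group (name, version) pins by package, keeping first-seen order."""
    pins = defaultdict(list)
    for name, version in requirements:
        if version not in pins[name]:
            pins[name].append(version)
    return dict(pins)


def resolve_first_match(requirements):
    """Toy model of the behavior described above: keep the first pin, stay silent."""
    return {name: versions[0] for name, versions in group_pins(requirements).items()}


def resolve_strict(requirements):
    """Toy model of a resolver that halts when a package has conflicting pins."""
    pins = group_pins(requirements)
    conflicts = {name: versions for name, versions in pins.items() if len(versions) > 1}
    if conflicts:
        raise ResolutionConflict(conflicts)
    return {name: versions[0] for name, versions in pins.items()}


# Hypothetical requirement list: libfoo is pinned to two different versions.
requirements = [("libfoo", "1.0"), ("app", "2.0"), ("libfoo", "1.1")]

first_match = resolve_first_match(requirements)
print(first_match)

try:
    resolve_strict(requirements)
    strict_failed = False
except ResolutionConflict as exc:
    strict_failed = True
    print("conflict:", exc)
```

In the first-match model the second `libfoo` pin is dropped without any error, which is how an environment can end up with a version that one of its dependents did not ask for.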

    @SuppressWarnings("unchecked")
    public static void setEnv(String key, String value) {
        try {
            Map<String, String> env = System.getenv();
            Class<?> cl = env.getClass();
            Field field = cl.getDeclaredField("m");
            field.setAccessible(true);
            Map<String, String> writableEnv = (Map<String, String>) field.get(env);
            writableEnv.put(key, value);