Lawrence Wang lwang

lwang / ctds.md
Created October 25, 2023 05:25
Using Python libraries with shared library dependencies on AWS Glue and Lambda

Some Python libraries, such as ctds, depend on external libraries written in C/C++, like FreeTDS. Typically the dependency is installed through the system's package manager, which places its shared libraries in a location such as /usr/lib64 where the Python library can find them on import. However, AWS Glue and Lambda do not allow the installation of system packages. In some environments we can copy the shared object files to another location and point LD_LIBRARY_PATH at the new library directory, but Glue and Lambda also do not allow developers to configure the run command.
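For context, here is a minimal sketch of the LD_LIBRARY_PATH approach in an environment where you do control how a process is launched. The function name `env_with_library_path` and the `/tmp/libs` directory are hypothetical and only serve as illustration. The dynamic linker reads LD_LIBRARY_PATH when a process starts, so the sketch builds an environment for a child process rather than changing the current one.

```python
import os


def env_with_library_path(lib_dir, base_env=None):
    """Return a copy of base_env with lib_dir prepended to LD_LIBRARY_PATH.

    lib_dir is a hypothetical directory holding copied .so files.
    """
    env = dict(os.environ if base_env is None else base_env)
    existing = env.get("LD_LIBRARY_PATH", "")
    # Prepend so the copied libraries are searched before the system ones.
    # ":" is the separator the Linux dynamic linker expects.
    env["LD_LIBRARY_PATH"] = lib_dir if not existing else lib_dir + ":" + existing
    return env
```

The returned mapping could be passed as `env=` to `subprocess.run` when launching a child Python process. This is exactly the kind of launch-time control that Glue and Lambda do not expose.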

Package the Python library and dependencies into a wheel/layer using Docker

  • Attempting to add the desired Python library to an AWS Glue Python Shell job through the --additional-python-modules option will cause an error, as pip will try to build a wheel for the library but will not have the t
lwang / msodbcsql17.md
Created October 23, 2023 01:34
Adding the Microsoft ODBC Driver for SQL Server to an AWS Glue Python Shell Job for use with pyodbc

AWS Glue does not provide an easy way of adding the Microsoft ODBC Driver to your Python Shell Glue job. To allow pyodbc to recognize our driver, we need to upload the driver's shared library files to a location the Glue job can access.
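As a rough sketch of what "recognizing the driver" can look like on the pyodbc side: with unixODBC, the DRIVER keyword in a connection string can, as far as I know, be given as a full path to the driver's shared object instead of a name registered in odbcinst.ini. The helper name `build_connection_string`, the driver path, and the server and credential values below are all hypothetical placeholders, not values from this gist.

```python
def build_connection_string(driver_path, server, database, user, password):
    """Build an ODBC connection string that names the driver by file path.

    Assumes the values contain no ';' or '}' characters, which would
    need extra escaping.
    """
    parts = [
        ("DRIVER", "{" + driver_path + "}"),  # path to the uploaded .so file
        ("SERVER", server),
        ("DATABASE", database),
        ("UID", user),
        ("PWD", password),
    ]
    return ";".join(f"{key}={value}" for key, value in parts)


# Hypothetical usage inside the Glue job, once the driver files are in place:
# import pyodbc
# conn = pyodbc.connect(build_connection_string(
#     "/tmp/driver/libmsodbcsql-17.so", "example-host", "example_db", "user", "secret"))
```

The driver's own shared library dependencies still have to be resolvable at load time, which is why the library files are built and packaged together.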

Step 1

  • Compile the Microsoft ODBC Driver for SQL Server. I am using the Docker image for AWS Lambda since the environment for Glue is likely also Amazon Linux 2 based.
    FROM public.ecr.aws/lambda/python:3.9 AS builder
    
    RUN yum update -y