Using pyodbc in AWS Lambda functions
This week I was working on an AWS Lambda function that needed to read and write from a legacy Microsoft SQL database. It’s written using the AWS Chalice framework and in local testing everything looked great. Not so much when we needed to deploy it to AWS for testing.
Most of the time that you include a python package for use in a lambda function, Chalice is able to package that into the deployment, and you’re good to go. However, that’s only true if the package doesn’t rely on native libraries that aren’t included in the Amazon Linux environment that lambda functions run on. Microsoft ODBC drivers are one of those things.
Thankfully you can compile these libraries yourself and include them in the package being deployed into lambda. With thanks to this gist for the starting point, here is what I did to compile the drivers.
First we start a docker container using an image that mimics the Amazon Linux environment your function will be running in. In my case I’m using python 3.7.
docker run -it --rm --entrypoint bash -e ODBCINI=/var/task -e ODBCSYSINI=/var/task \ -v "$PWD":/var/task lambci/lambda:build-python3.7
This sets the current working directory to be a volume within the docker container where all the compilation artifacts will end up. We then install the MS SQL ODBC drivers from Microsoft’s rpm repository, copy them into our target directory and set some environment variables, so the system is looking in our target directory for any libraries or include files.
curl https://packages.microsoft.com/config/rhel/6/prod.repo > /etc/yum.repos.d/mssql-release.repo # Disable the Amazon repos as the version provided isn't compatible with msodbcsql yum -y install unixODBC --disablerepo=amzn* ACCEPT_EULA=Y yum -y install msodbcsql cp -r /opt/microsoft/msodbcsql . export CFLAGS="-I/var/task/include" export LDFLAGS="-L/var/task/lib"
When we install
msodbsql it installs a number of dependency packages including
unixODBC. However, we need to include the
unixODBC drivers the libraries in our lambda function package, so we download the source, compile it and install it into our target volume. Note the version number installed by the previous step and match it to the source package.
curl ftp://ftp.unixodbc.org/pub/unixODBC/unixODBC-2.3.7.tar.gz -O tar xvzf unixODBC-2.3.7.tar.gz cd unixODBC-2.3.7 ./configure --sysconfdir=/var/task --disable-gui --disable-drivers --enable-iconv --with-iconv-char-enc=UTF8 \ --with-iconv-ucode-enc=UTF16LE --prefix=/var/task make install cd .. rm -rf unixODBC-2.3.7 unixODBC-2.3.7.tar.gz
pyodbc package into the current directory. We use the
-t option to specify the target directory.
pip install pyodbc -t .
Create updated configuration files to point to the new location for the drives. You need to ensure that your
pyodbc code is referencing the same driver version. Originally our code was referencing
ODBC Driver 17 for SQL Server which was fine locally, but not so here.
cat <<EOF > odbcinst.ini [ODBC Driver 13 for SQL Server] Description=Microsoft ODBC Driver 13 for SQL Server Driver=/var/task/msodbcsql/lib64/libmsodbcsql-13.1.so.9.2 UsageCount=1 EOF cat <<EOF > odbc.ini [ODBC Driver 13 for SQL Server] Driver = ODBC Driver 13 for SQL Server Description = My ODBC Driver 13 for SQL Server Trace = No EOF
You now have a directory containing all the libraries, drivers and packages needed to connect to an MS SQL database from an AWS Lambda function. If you are using AWS Chalice, you copy all this into the
vendors directory and Chalice will include it into the deployment package.